From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.1
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 2B0F4208E9;
	Fri, 20 Jul 2018 06:11:07 +0000 (UTC)
Date: Fri, 20 Jul 2018 06:11:07 +0000
From: Eric Wong
To: "Eric W. Biederman"
Cc: meta@public-inbox.org
Subject: Re: Searching via git grep?
Message-ID: <20180720061106.4f2u2zpdxnsilrxt@dcvr>
References: <87in5bdkbv.fsf@xmission.com>
	<20180719211216.GA1984@dcvr>
	<87601adfo7.fsf@xmission.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <87601adfo7.fsf@xmission.com>
List-Id:

"Eric W. Biederman" wrote:
> My current goal is to make it pleasant to read linux-kernel and possibly
> other large archives on my personal machine. Right now the git
> trees for linux-kernel are about 6.8G. Small enough to fit in RAM.
>
> The Xapian indexes are about 63G. Not small enough to fit in ram.
> They are also not fast to update when I pull in a new batch of messages
> from linux-kernel.

Interesting, how long does an incremental index take for you at the
medium or full indexlevel? Setting XAPIAN_FLUSH_THRESHOLD after my
patch yesterday should help noticeably, especially if you're on HDD.

> So I am looking at using git grep as a stand-in for the Xapian indexes
> when indexlevel eq 'basic'.
>
> Given my personal ratio of searches to indexing I think I will save
> time in doing that. I don't have it all wired up yet to know if it will
> work well, but I suspect it will.

Totally understandable, and yes, if you can fit the LKML repos into
RAM it should be usable enough for a single user. "git grep" also
has the advantage of being able to use regexps, which isn't possible
with Xapian at the moment.

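To make the regexp point concrete, a search over a bare repository
could be wrapped roughly as in the sketch below. This is only an
illustration under stated assumptions: the helper names
(git_grep_argv, git_grep) and the use of Python are made up for the
example and are not part of public-inbox. It searches the tree of a
single revision; how to cover every message depends on the repository
layout, which the sketch does not address.

```python
import subprocess


def git_grep_argv(git_dir, pattern, rev="HEAD", ignore_case=True):
    """Build the argv for an extended-regexp search of one revision.

    Hypothetical helper, not public-inbox code.  A bare repository has
    no worktree, so a tree-ish (rev) must be passed to "git grep".
    """
    argv = ["git", "--git-dir=" + git_dir, "grep",
            "-I",   # skip binary blobs
            "-l",   # print matching paths only
            "-E"]   # extended regexps
    if ignore_case:
        argv.append("-i")
    argv += ["-e", pattern, rev]
    return argv


def git_grep(git_dir, pattern, rev="HEAD", ignore_case=True):
    """Run the search and return matching entries as "<rev>:<path>"."""
    proc = subprocess.run(
        git_grep_argv(git_dir, pattern, rev, ignore_case),
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    # "git grep" exits with 1 when nothing matched; that is not an error.
    if proc.returncode not in (0, 1):
        raise RuntimeError("git grep failed with status %d" % proc.returncode)
    return proc.stdout.splitlines()
```

Something like git_grep("/path/to/inbox.git", "foo|bar") would then
return one entry per matching blob in that revision (the path here is
a placeholder). Whether this is fast enough in practice depends on
the repos staying cached in RAM, as discussed above.
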
> Is it only the web interface where the advanced search functionality is
> available?

Yes. I don't think there's a good way to implement search for NNTP
on the server side... IMAP has specs for implementing search; but I
don't know how much overlap there is with what our web UI currently
offers.