user/dev discussion of public-inbox itself
From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Eric Wong <e@80x24.org>
Cc: meta@public-inbox.org
Subject: Re: Git-only operation mode
Date: Wed, 25 Sep 2019 15:58:38 -0400
Message-ID: <20190925195838.GB4628@chatter.i7.local>
In-Reply-To: <20190925194503.GA21501@dcvr>

On Wed, Sep 25, 2019 at 07:45:03PM +0000, Eric Wong wrote:
>> Is there a way to run just the archiver component of public-inbox -- just
>> writing to git repos without any of the indexing/frontend bits? One of the
>> idle conversations I had with vger.kernel.org folks was to see if we can
>> shift the source of truth archive generation to happen at their end. We
>> would then clone repositories from them and provide the frontend/search bits
>> on lore.kernel.org. From my cursory looking, it would seem that the
>> watch/delivery tools always expect to be taking care of xapian/indexing, but
>> I think being able to decouple git bits from search/frontend bits would be a
>> useful mode of operation.
>
>v1 was git-only (that led to scalability problems from big trees).
>v2 needs SQLite to do dedupe with indexlevel=basic, but not Xapian,
>anymore.  We could get rid of dedupe for v2, but I'm not sure it's
>worth it...

Needing sqlite is not a big deal -- compared to the size of the repos,
its footprint is reasonably small (e.g. all of the lkml git trees add up
to 8.2GB, while msgmap.sqlite3 is 600MB).

Is there an easy way to exclude xapian indexes from being generated 
during watch/mda runs then?

A follow-up to that -- is running "public-inbox-index" on the repository
after it has been updated enough to bring the xapian db up to date? That
would be easy to do as part of the grok-pull post-update hook.
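
If that turns out to be enough, the hook could be a rough sketch along
these lines. It assumes the hook is handed the full path of the updated
git repository as its first argument, and that a v2 inbox keeps its
epochs under INBOX_DIR/git/N.git; the helper name inbox_dir_for is made
up for illustration.

```shell
#!/bin/sh
# Hypothetical post-update hook sketch (not a tested deployment).
# Assumption: $1 is the full path of the git repository that was updated.

# Map a repository path to the inbox directory that contains it.
inbox_dir_for() {
    case "$1" in
        */git/[0-9]*.git)
            # Assumed v2 layout: strip the trailing /git/N.git
            printf '%s\n' "${1%/git/*}"
            ;;
        *)
            # Otherwise assume v1, where the repository is the inbox itself
            printf '%s\n' "$1"
            ;;
    esac
}

# Do nothing unless a repository path was passed in.
if [ "$#" -gt 0 ]; then
    public-inbox-index "$(inbox_dir_for "$1")"
fi
```

Under those assumptions, a call with a path such as
/srv/lkml/git/0.git (a made-up location) would index /srv/lkml.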

Best,
-K


Thread overview: 9+ messages
2019-09-25 18:24 Git-only operation mode Konstantin Ryabitsev
2019-09-25 19:45 ` Eric Wong
2019-09-25 19:58   ` Konstantin Ryabitsev [this message]
2019-09-25 22:45     ` Eric Wong
2019-09-26  0:23       ` Eric W. Biederman
2019-09-26 20:52       ` Konstantin Ryabitsev
2019-09-26 21:10         ` Eric Wong
2019-09-26 21:44           ` Konstantin Ryabitsev
2019-10-07  0:07         ` Eric Wong
