user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: meta@public-inbox.org
Subject: Re: Git-only operation mode
Date: Wed, 25 Sep 2019 22:45:00 +0000
Message-ID: <20190925224500.GA28628@dcvr>
In-Reply-To: <20190925195838.GB4628@chatter.i7.local>

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Wed, Sep 25, 2019 at 07:45:03PM +0000, Eric Wong wrote:
> > > Is there a way to run just the archiver component of public-inbox --
> > > just writing to git repos without any of the indexing/frontend bits?
> > > One of the idle conversations I had with vger.kernel.org folks was to
> > > see if we can shift the source of truth archive generation to happen
> > > at their end. We would then clone repositories from them and provide
> > > the frontend/search bits on lore.kernel.org. From my cursory looking,
> > > it would seem that the watch/delivery tools always expect to be taking
> > > care of xapian/indexing, but I think being able to decouple git bits
> > > from search/frontend bits would be a useful mode of operation.
> > 
> > v1 was git-only (that led to scalability problems from big trees).
> > v2 needs SQLite to do dedupe with indexlevel=basic, but not Xapian,
> > anymore.  We could get rid of dedupe for v2, but I'm not sure it's
> > worth it...
> 
> Needing sqlite is not a big deal -- compared to the size of the repos,
> that's reasonably small (e.g. all of lkml git trees are 8.2GB, while
> msgmap.sqlite3 is 600MB).

Right, it'll also need xap15/over.sqlite* but that's still not too big.

> Is there an easy way to exclude xapian indexes from being generated during
> watch/mda runs then?

public-inbox-init --indexlevel=basic <usual args>

Or set publicinbox.$INBOX_NAME.indexlevel=basic in the
config file after the fact.  If you already generated Xapian
indexes, you should also be able to remove any non-SQLite files
from xap15 (but I haven't tested that).
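
The config file uses git-config syntax, so the setting can be applied
with git-config(1).  A minimal sketch, assuming the default config path
(or $PI_CONFIG if set) and a placeholder inbox name "meta" that is not
taken from this thread:

```shell
# Assumptions: default config location unless PI_CONFIG is set,
# and "meta" as a placeholder inbox name.
PI_CONFIG="${PI_CONFIG:-$HOME/.public-inbox/config}"
INBOX_NAME=meta

# git config -f does not create missing parent directories
mkdir -p "$(dirname "$PI_CONFIG")"

# Equivalent to editing the [publicinbox "meta"] section by hand
git config -f "$PI_CONFIG" "publicinbox.$INBOX_NAME.indexlevel" basic

# Read the value back to confirm what was stored
git config -f "$PI_CONFIG" --get "publicinbox.$INBOX_NAME.indexlevel"
```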

I started working on a public-inbox-init manpage the other day,
but still need to finish that...

> A follow-up to that -- is running "public-inbox-index" on the repository
> after it's been updated enough to update the xapian db? It would be easy to
> do so as part of the grok-pull post-update hook.

Yes, on a fresh clone.  You'll need to change indexlevel to
medium or full if it was set up using basic.
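
Roughly, the two steps could be wrapped up as below.  This is only a
sketch: the function name and the PI_INDEX override are hypothetical
(the override just lets the indexer be swapped out for a dry run), and
the inbox name and directory are whatever the caller passes in.

```shell
# Sketch: switch an inbox from indexlevel=basic to medium, then index it.
# $1 = inbox name as used in the config, $2 = inbox directory.
# PI_INDEX is a hypothetical override for the indexer command.
upgrade_indexlevel() {
    name=$1
    inbox_dir=$2
    config=${PI_CONFIG:-$HOME/.public-inbox/config}
    # Only run the indexer if the config update succeeded
    git config -f "$config" "publicinbox.$name.indexlevel" medium &&
        "${PI_INDEX:-public-inbox-index}" "$inbox_dir"
}
```

Calling "PI_INDEX=echo upgrade_indexlevel NAME DIR" updates the config
and prints the directory that would have been indexed, without touching
Xapian.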

I haven't figured out how to use a grok-pull post-update hook to
run index on my clone of erol, since there's multiple epochs
per-inbox to deal with.
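
One possible starting point for the epoch problem, not something that
has been tested against grok-pull: v2 inboxes keep their epochs under
$INBOX_DIR/git/N.git, so a hook handed an epoch path could strip that
suffix to find the inbox to index.  The function name is hypothetical,
and it assumes the mirror preserves the v2 directory layout.

```shell
# Map a repository path to the inbox directory that should be indexed.
# .../INBOX/git/7.git -> .../INBOX  (v2 epoch)
# any other path is printed unchanged (e.g. a v1 inbox)
inbox_dir_for_repo() {
    repo=${1%/}    # drop one trailing slash, if any
    case $repo in
        */git/[0-9]*.git) printf '%s\n' "${repo%/git/*}" ;;
        *) printf '%s\n' "$repo" ;;
    esac
}
```

A hook script could then run
public-inbox-index "$(inbox_dir_for_repo "$1")", assuming the hook
receives the repository path as its first argument.  Several epochs of
one inbox updating in the same pull would still trigger several index
runs, so some form of locking or batching would likely be needed on top.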


Thread overview: 9+ messages
2019-09-25 18:24 Git-only operation mode Konstantin Ryabitsev
2019-09-25 19:45 ` Eric Wong
2019-09-25 19:58   ` Konstantin Ryabitsev
2019-09-25 22:45     ` Eric Wong [this message]
2019-09-26  0:23       ` Eric W. Biederman
2019-09-26 20:52       ` Konstantin Ryabitsev
2019-09-26 21:10         ` Eric Wong
2019-09-26 21:44           ` Konstantin Ryabitsev
2019-10-07  0:07         ` Eric Wong
