user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: "W. Trevor King" <wking@tremily.us>
Cc: notmuch@notmuchmail.org, David Bremner <david@tethera.net>,
	Steven Allen <steven@stebalien.com>,
	Tomi Ollila <tomi.ollila@iki.fi>, Carl Worth <cworth@cworth.org>,
	meta@public-inbox.org
Subject: Re: Mail archives in Git using ssoma (Docker image)
Date: Sun, 21 Aug 2016 12:08:52 +0000
Message-ID: <20160821120852.GA12964@dcvr>
In-Reply-To: <20160821094833.GB2338@odin.tremily.us>

+Cc meta@public-inbox.org

"W. Trevor King" <wking@tremily.us> wrote:
> On Sat, Aug 20, 2016 at 09:36:31PM -0700, W. Trevor King wrote:
> > [2]: git://tremily.us/notmuch-archives.git

Cool!

> This is the ssoma archive (with the data in it).  I just set up a
> basic HTTP archive (following [1]) based on a Docker image [2] (Gentoo
> doesn't package all the Perl dependencies public-inbox needs).

Ugh, that sucks (sorry, not a fan of Docker).

What's missing from Gentoo?

It should be easy to copy just the necessary .pm files and use
PERL5LIB environment to point to the correct path (man perlrun).
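
A minimal sketch of the PERL5LIB approach.  The directory name and
the Example::Hello module are hypothetical stand-ins for whichever
pure-Perl .pm files are actually missing:

```shell
# Hypothetical private module directory; any readable path works.
libdir="${TMPDIR:-/tmp}/perl5lib-demo"
mkdir -p "$libdir/Example"

# Stand-in for a pure-Perl module copied from an unpacked distribution.
cat >"$libdir/Example/Hello.pm" <<'EOF'
package Example::Hello;
use strict;
use warnings;
sub greet { return "hello from PERL5LIB" }
1;
EOF

# PERL5LIB adds the directory to @INC for this one command only.
out=$(PERL5LIB="$libdir" perl -MExample::Hello -e 'print Example::Hello::greet(), "\n"')
echo "$out"
```

Setting the variable per command (or in the service's environment)
keeps the copied modules out of the system Perl directories.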

I'm consciously avoiding XS (compiled) extensions to make
installation/distribution easier.

> Dockerfile for rebuilding the image is in [2].  I'm currently hosting
> the archives (HTTP only) at [3].  Spinning that up from the Docker
> image looks like:
> 
>   $ mkdir srv
>   $ git clone --bare git://tremily.us/notmuch-archives.git srv/notmuch
>   $ echo 'Notmuch -- Just an email system' >srv/notmuch.git/description
>   $ git config -f srv/notmuch.git/config publicinbox.http http://tremily.us
>   $ git config -f srv/notmuch.git/config publicinbox.email notmuch@notmuchmail.org

That should probably be:

	# based on your [3]
	git config -f srv/notmuch.git/config \
		publicinbox.notmuch.url http://tremily.us/notmuch

	git config -f srv/notmuch.git/config \
		publicinbox.notmuch.address notmuch@notmuchmail.org

	# this is crucial for all the public-inbox-* tools
	git config -f srv/notmuch.git/config \
		publicinbox.notmuch.mainrepo /path/to/notmuch.git
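
To check what ends up stored, the same three keys can be written to
a scratch file and listed back.  This is only a sketch: the temporary
file stands in for srv/notmuch.git/config, and /path/to/notmuch.git
remains a placeholder:

```shell
# Throwaway config file standing in for srv/notmuch.git/config.
cfg=$(mktemp)

git config -f "$cfg" publicinbox.notmuch.url http://tremily.us/notmuch
git config -f "$cfg" publicinbox.notmuch.address notmuch@notmuchmail.org
git config -f "$cfg" publicinbox.notmuch.mainrepo /path/to/notmuch.git

# List what was stored; each key is printed as section.subsection.name.
listing=$(git config -f "$cfg" --get-regexp '^publicinbox\.')
echo "$listing"

# Raw file contents: one [publicinbox "notmuch"] section with three keys.
cat "$cfg"
```

All three keys share the "notmuch" subsection, so git groups them
under a single [publicinbox "notmuch"] heading in the file.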

I'm sorry that most of this is still undocumented at the moment,
but it's my first priority once I'm done sorting out some
non-computing-related stuff.

>   $ docker run --name notmuch-archives -d -p 80:8080 -v ${PWD}/srv/:/srv/ wking/public-inbox
> 
> (although I'm using -p ###:8080 and have an Nginx reverse-proxy in
> front).  It's not updating automatically yet, but that will probably
> look like:
> 
> 1. Pull new mbox [4].
> 2. Import into notmuch-archives [5].
> 3. Re-run public-inbox-index (this could probably be via ‘docker exec …’).
> 
> But I'll have to test that to confirm.  And ideally we'd be using
> ssoma-mda or similar directly, instead of going through mbox, but I'd
> rather get the official headers on the stored mail than be efficient
> ;).

For mirroring existing lists, I started using public-inbox-watch
which currently watches Maildirs.  The config knobs are sorta
documented from my announcement to git@vger:

https://public-inbox.org/git/20160710004813.GA20210@dcvr.yhbt.net/
http://hjrcffqmbrq6wope.onion/git/20160710004813.GA20210@dcvr.yhbt.net/

Initial import (w/o spamassassin) was done with
scripts/import_vger_from_mbox in the source (any one of these
clones will do):

        torsocks git clone http://hjrcffqmbrq6wope.onion/public-inbox
        git clone https://public-inbox.org/ public-inbox
        git clone git://repo.or.cz/public-inbox

I recommend public-inbox-watch for mirroring existing lists
(such as what I did with git@vger) but public-inbox-mda for
self-hosted lists (such as meta@public-inbox.org).

> One shift from Gmane's mid.gmane.org/… is that the public-inbox UI
> Message-ID lookup is per-bucket, and public-inbox seems to be
> encouraging per-list buckets.

The public-inbox-nntpd interface supports mid lookups across all
inboxes in that instance; so it should be doable in the WWW
interface, too.  Either way, I think it has to be O(n) where (n)
is the number of Xapian DBs, though.
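
The linear cost can be pictured with a toy model.  This is not how
public-inbox does it: each "inbox" below is just a text file of
Message-IDs standing in for a Xapian DB, and find_mid is a
hypothetical helper name:

```shell
# Toy model: each "inbox" is a text file listing one Message-ID per line.
# A real instance would query one Xapian DB per inbox instead.
dir=$(mktemp -d)
printf '%s\n' 'a@example' 'b@example' >"$dir/git"
printf '%s\n' 'c@example' >"$dir/meta"

# Try each inbox in turn; the cost grows with the number of inboxes.
find_mid() {
	for inbox in "$dir"/*; do
		if grep -Fxq -- "$1" "$inbox"; then
			basename "$inbox"
			return 0
		fi
	done
	return 1
}

found=$(find_mid 'c@example')
echo "$found"
```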

I already have news.public-inbox.org hooked up to both
NNTP and HTTP(*), so I plan on making

	http://news.public-inbox.org/<Message-ID>

to work like:

	nntp://news.public-inbox.org/<Message-ID>

(*) Right now, it just redirects $GROUP to the HTTP interface:
    http://news.public-inbox.org/$NEWSGROUP -> http://...


And the WWW interface already has fallbacks to scan + link
across inboxes, so s/git/meta/ the above URLs and you'll get
a link to the message on /git/ instead of /meta/:

http://hjrcffqmbrq6wope.onion/meta/20160710004813.GA20210@dcvr.yhbt.net/
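
Spelled out with sed, the s/git/meta/ rewrite of the HTTPS URL looks
like this (git_url and meta_url are names chosen for this sketch):

```shell
# URL of the announcement as posted under /git/.
git_url='https://public-inbox.org/git/20160710004813.GA20210@dcvr.yhbt.net/'

# Swap only the inbox component of the path.
meta_url=$(printf '%s\n' "$git_url" | sed 's|/git/|/meta/|')
echo "$meta_url"
```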

> And while I feel like I had a good grasp of the ssoma format two years
> ago, I know very little about Perl and public-inbox.  I'm sure you
> could set up a public-inbox host that is more efficient than what's
> currently in my Docker image.

Feel free to ask me + meta@public-inbox.org if you have any
questions or need help.  Writing documentation doesn't come
naturally to me, so it's easier for me to answer emails.

I try to make it not very Perly.  I don't think I'll bother with
CPAN, for example (I don't think I successfully got my PAUSE
account activated; not a fan of registrations, either).

But there will definitely be tarball releases for distros
soonish (mainly targeting Debian at the moment, but FreeBSD is
on the table).

> Cheers,
> Trevor
> 
> [1]: http://public-inbox.org/INSTALL
> [2]: https://hub.docker.com/r/wking/public-inbox/
> [3]: http://tremily.us/notmuch/
> [4]: https://notmuchmail.org/archives/notmuch.mbox
> [5]: id:20160821043631.GA2338@odin.tremily.us

