user/dev discussion of public-inbox itself
* [PATCH 00/14] purging support, v1 conversions, cleanups + more
@ 2018-03-29 10:28 Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 01/14] www: remove unnecessary ghost checks Eric Wong (Contractor, The Linux Foundation)
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

There's now a public-inbox-convert tool which makes a new v2 repo
from a v1 git repo via git-fast-export/import.  Using those tools,
purge support also works internally, but there's no command-line
support for purging, yet.
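
As a rough illustration of the fast-export/import rewriting mentioned
above, the sketch below maps v1 tree entries onto the v2 layout:
additions of "xx/xxxx..." paths become writes to a single file "m",
and deletions become writes to "_/D".  It reuses the regular
expressions from script/public-inbox-convert in this series; the
rewrite_line helper name and the sample lines are assumptions made
for the example, and nothing here talks to git.

```perl
use strict;
use warnings;

my $h = '[0-9a-f]';
my %D; # v1 path => fast-export mark of the blob stored there

# Hypothetical helper (not in the series): rewrite one fast-export
# line from a v1 commit into the form fed to fast-import for v2.
sub rewrite_line {
	my ($line) = @_;
	if ($line =~ m{^M 100644 :(\d+) (${h}{2}/${h}{38})}) {
		my ($mark, $path) = ($1, $2);
		$D{$path} = $mark; # remember the mark for a later delete
		return "M 100644 :$mark m\n";
	}
	if ($line =~ m{^D (${h}{2}/${h}{38})}) {
		my $mark = delete $D{$1};
		defined $mark or die "undeleted path: $1\n";
		return "M 100644 :$mark _/D\n";
	}
	$line; # everything else passes through unchanged
}

my $path = 'ab/' . ('c' x 38);
my @out = map { rewrite_line($_) } (
	"M 100644 :1 $path\n",
	"D $path\n",
	"committer x <x\@example.com> 0 +0000\n",
);
print @out;
```

The %D map is what lets a v1 delete, which only names a path, be
turned into a v2 write that names the deleted blob by its mark.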

Rescanning when public-inbox-watch restarts can still cause
removed/purged messages to reappear in the repository.  This
wasn't a problem with my IMAP setup since deletes happen in my
Maildir/IMAP folder first (triggering inotify wakeup in -watch)
but that may not always be the case for other deployments.

Eric Wong (Contractor, The Linux Foundation) (14):
  www: remove unnecessary ghost checks
  v2writable: append, instead of prepending generated Message-ID
  lookup by Message-ID favors the "primary" one
  www: fix attachment downloads for conflicted Message-IDs
  searchmsg: document why we store To: and Cc: for NNTP
  public-inbox-convert: tool for converting old to new inboxes
  v2writable: support purging messages from git entirely
  search: cleanup uniqueness checking
  search: get rid of most lookup_* subroutines
  search: move find_doc_ids to searchidx
  v2writable: cleanup: get rid of unused fields
  mbox: avoid extracting Message-ID for linkification
  www: cleanup expensive fallback for legacy URLs
  view: get rid of some unnecessary imports

 Documentation/public-inbox-config.pod  |   2 +-
 Documentation/public-inbox-convert.pod |  45 ++++++++++
 MANIFEST                               |   2 +
 lib/PublicInbox/Import.pm              |  98 +++++++++++++++++++++-
 lib/PublicInbox/Inbox.pm               |  31 ++++---
 lib/PublicInbox/Mbox.pm                |   9 +-
 lib/PublicInbox/Search.pm              |  88 +++-----------------
 lib/PublicInbox/SearchIdx.pm           |   8 ++
 lib/PublicInbox/SearchMsg.pm           |   4 +
 lib/PublicInbox/SearchThread.pm        |  14 ++--
 lib/PublicInbox/SearchView.pm          |   2 +-
 lib/PublicInbox/V2Writable.pm          |  43 ++++++++--
 lib/PublicInbox/View.pm                |  31 +++----
 lib/PublicInbox/WWW.pm                 |  23 ++----
 script/public-inbox-convert            | 109 +++++++++++++++++++++++++
 t/plack.t                              |  18 ++++
 t/psgi_v2.t                            |  45 +++++++++-
 t/search-thr-index.t                   |   3 +-
 t/search.t                             |   6 +-
 t/v2writable.t                         |  15 +++-
 20 files changed, 443 insertions(+), 153 deletions(-)
 create mode 100644 Documentation/public-inbox-convert.pod
 create mode 100755 script/public-inbox-convert

-- 
EW


* [PATCH 01/14] www: remove unnecessary ghost checks
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 02/14] v2writable: append, instead of prepending generated Message-ID Eric Wong (Contractor, The Linux Foundation)
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

We do not need to care about ghosts at multiple call sites; they
cannot have a {blob} field and we've stored the blob field in
Xapian since SCHEMA_VERSION=13.
---
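
A minimal sketch of the reasoning above, assuming a ghost is either
an undefined smsg or one without a {blob} field: a single check in
the lookup routine covers every caller.  The %blobs hash and the
msg_by_smsg_sketch name are hypothetical stand-ins for git cat-file
and the real msg_by_smsg.

```perl
use strict;
use warnings;

# Hypothetical stand-in for git cat-file: blob ID => raw message
my %blobs = ('d00d' => "Subject: hi\n\nbody\n");

# Returns a reference to the raw message, or nothing for ghosts.
sub msg_by_smsg_sketch {
	my ($smsg) = @_;
	return unless defined $smsg;                 # ghost thread node
	defined(my $blob = $smsg->{blob}) or return; # ghost: no blob
	exists $blobs{$blob} ? \$blobs{$blob} : undef;
}

# Callers no longer need their own "is this mail?" test:
my @found = grep { defined $_ } map { msg_by_smsg_sketch($_) }
	(undef, { mid => 'ghost@example' }, { mid => 'a@b', blob => 'd00d' });
print scalar(@found), " message(s) loaded\n";
```
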
 lib/PublicInbox/Inbox.pm | 10 ++++------
 lib/PublicInbox/Mbox.pm  |  2 --
 lib/PublicInbox/View.pm  |  2 --
 3 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 3097751..47b8630 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -270,12 +270,10 @@ sub msg_by_path ($$;$) {
 sub msg_by_smsg ($$;$) {
 	my ($self, $smsg, $ref) = @_;
 
-	return unless defined $smsg; # ghost
-
-	# backwards compat to fallback to msg_by_mid
-	# TODO: remove if we bump SCHEMA_VERSION in Search.pm:
-	defined(my $blob = $smsg->{blob}) or
-			return msg_by_path($self, mid2path($smsg->mid), $ref);
+	# ghosts may have undef smsg (from SearchThread.node) or
+	# no {blob} field (from each_smsg_by_mid)
+	return unless defined $smsg;
+	defined(my $blob = $smsg->{blob}) or return;
 
 	my $str = git($self)->cat_file($blob, $ref);
 	$$str =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s if $str;
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index c14037f..381bcad 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -41,7 +41,6 @@ sub getline {
 	}
 	for (; !defined($cur) && $head != $tail; $head++) {
 		my $smsg = PublicInbox::SearchMsg->get($head, $db, $ctx->{mid});
-		next if $smsg->type ne 'mail';
 		my $mref = $ctx->{-inbox}->msg_by_smsg($smsg) or next;
 		$cur = Email::Simple->new($mref);
 		$cur = msg_str($ctx, $cur);
@@ -66,7 +65,6 @@ sub emit_raw {
 			for (; !defined($first) && $head != $tail; $head++) {
 				my @args = ($head, $db, $mid);
 				my $smsg = PublicInbox::SearchMsg->get(@args);
-				next if $smsg->type ne 'mail';
 				my $mref = $ibx->msg_by_smsg($smsg) or next;
 				$first = Email::Simple->new($mref);
 			}
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 5fb2b31..133c30a 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -63,7 +63,6 @@ sub msg_page {
 			for (; !defined($first) && $head != $tail; $head++) {
 				my @args = ($head, $db, $mid);
 				my $smsg = PublicInbox::SearchMsg->get(@args);
-				next if $smsg->type ne 'mail';
 				$first = $ibx->msg_by_smsg($smsg);
 			}
 			if ($head != $tail) {
@@ -85,7 +84,6 @@ sub msg_html_more {
 		my $mid = $ctx->{mid};
 		for (; !defined($smsg) && $head != $tail; $head++) {
 			my $m = PublicInbox::SearchMsg->get($head, $db, $mid);
-			next if $m->type ne 'mail';
 			$smsg = $ctx->{-inbox}->smsg_mime($m);
 		}
 		if ($head == $tail) { # done
-- 
EW



* [PATCH 02/14] v2writable: append, instead of prepending generated Message-ID
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 01/14] www: remove unnecessary ghost checks Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 03/14] lookup by Message-ID favors the "primary" one Eric Wong (Contractor, The Linux Foundation)
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

The original Message-ID is still the most important when
discussing with other recipients who do not rely on a message
flowing through public-inbox.  So whatever Message-ID we use
to deduplicate internally will be secondary and less important.

All of our front-end v2 code is order-independent, so we won't
let the message count against us, that way.
---
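
To illustrate the ordering described above, here is a small sketch
that models the Message-ID header as a plain list of raw values
instead of a real header object.  append_mid_sketch and the sample
IDs are assumptions for the example, not code from this patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the header object: just the list of raw
# Message-ID values, in the order they appear in the message.
sub append_mid_sketch {
	my ($cur, $mid0) = @_;
	# existing values (even unparseable ones) stay first;
	# the generated ID goes last
	return (@$cur, "<$mid0>");
}

my @mids = append_mid_sketch([ '<a-mid@b>' ], 'generated@localhost');
print "$_\n" for @mids;
```

The original ID stays at index 0, matching the updated expectations
in t/v2writable.t below.
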
 lib/PublicInbox/Import.pm     | 8 ++++----
 lib/PublicInbox/V2Writable.pm | 2 +-
 t/psgi_v2.t                   | 9 +++++----
 t/v2writable.t                | 8 ++++----
 4 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index 6824fac..e07edda 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -297,12 +297,12 @@ sub drop_unwanted_headers ($) {
 }
 
 # used by V2Writable, too
-sub prepend_mid ($$) {
+sub append_mid ($$) {
 	my ($hdr, $mid0) = @_;
 	# @cur is likely empty if we need to call this sub, but it could
 	# have random unparseable crap which we'll preserve, too.
-	my @cur = $hdr->header_raw('Message-Id');
-	$hdr->header_set('Message-Id', "<$mid0>", @cur);
+	my @cur = $hdr->header_raw('Message-ID');
+	$hdr->header_set('Message-ID', @cur, "<$mid0>");
 }
 
 sub v1_mid0 ($) {
@@ -312,7 +312,7 @@ sub v1_mid0 ($) {
 
 	if (!scalar(@$mids)) { # spam often has no Message-Id
 		my $mid0 = digest2mid(content_digest($mime));
-		prepend_mid($hdr, $mid0);
+		append_mid($hdr, $mid0);
 		return $mid0;
 	}
 	$mids->[0];
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 01ec98a..9b280c6 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -173,7 +173,7 @@ sub num_for_harder {
 			$num = $self->{skel}->{mm}->mid_insert($$mid0);
 		}
 	}
-	PublicInbox::Import::prepend_mid($hdr, $$mid0);
+	PublicInbox::Import::append_mid($hdr, $$mid0);
 	$num;
 }
 
diff --git a/t/psgi_v2.t b/t/psgi_v2.t
index 9964b47..11b2c79 100644
--- a/t/psgi_v2.t
+++ b/t/psgi_v2.t
@@ -7,6 +7,7 @@ use File::Temp qw/tempdir/;
 use PublicInbox::MIME;
 use PublicInbox::Config;
 use PublicInbox::WWW;
+use PublicInbox::MID qw(mids);
 my @mods = qw(DBD::SQLite Search::Xapian HTTP::Request::Common Plack::Test
 		URI::Escape Plack::Builder);
 foreach my $mod (@mods) {
@@ -46,8 +47,8 @@ local $SIG{__WARN__} = sub { push @warn, @_ };
 $mime->header_set(Date => 'Fri, 02 Oct 1993 00:01:00 +0000');
 ok($im->add($mime), 'added duplicate-but-different message');
 is(scalar(@warn), 1, 'got one warning');
-my @mids = $mime->header_obj->header_raw('Message-Id');
-$new_mid = PublicInbox::MID::mid_clean($mids[0]);
+my $mids = mids($mime->header_obj);
+$new_mid = $mids->[1];
 $im->done;
 
 my $cfgpfx = "publicinbox.v2test";
@@ -93,8 +94,8 @@ is(scalar(@warn), 2, 'got another warning');
 like($warn[0], qr/mismatched/, 'warned about mismatched messages');
 is($warn[0], $warn[1], 'both warnings are the same');
 
-@mids = $mime->header_obj->header_raw('Message-Id');
-my $third = PublicInbox::MID::mid_clean($mids[0]);
+$mids = mids($mime->header_obj);
+my $third = $mids->[-1];
 $im->done;
 
 test_psgi(sub { $www->call(@_) }, sub {
diff --git a/t/v2writable.t b/t/v2writable.t
index 6cabf0d..c48f060 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -79,8 +79,8 @@ if ('ensure git configs are correct') {
 	ok($im->add($mime), 'reused mid ok');
 	like(join(' ', @warn), qr/reused/, 'warned about reused MID');
 	my @mids = $mime->header_obj->header_raw('Message-Id');
-	is($mids[1], '<a-mid@b>', 'original mid not changed');
-	like($mids[0], $sane_mid, 'new MID added');
+	is($mids[0], '<a-mid@b>', 'original mid not changed');
+	like($mids[1], $sane_mid, 'new MID added');
 	is(scalar(@mids), 2, 'only one new MID added');
 
 	@warn = ();
@@ -95,8 +95,8 @@ if ('ensure git configs are correct') {
 	ok($im->add($mime), 'random MID made');
 	like(join(' ', @warn), qr/using random/, 'warned about using random');
 	@mids = $mime->header_obj->header_raw('Message-Id');
-	is($mids[1], '<a-mid@b>', 'original mid not changed');
-	like($mids[0], $sane_mid, 'new MID added');
+	is($mids[0], '<a-mid@b>', 'original mid not changed');
+	like($mids[1], $sane_mid, 'new MID added');
 	is(scalar(@mids), 2, 'only one new MID added');
 
 	@warn = ();
-- 
EW



* [PATCH 03/14] lookup by Message-ID favors the "primary" one
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 01/14] www: remove unnecessary ghost checks Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 02/14] v2writable: append, instead of prepending generated Message-ID Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 04/14] www: fix attachment downloads for conflicted Message-IDs Eric Wong (Contractor, The Linux Foundation)
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

The Message-ID mapped to an NNTP article number is stronger,
so we will favor that for attachment lookups.
---
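
A sketch of the lookup order described above, using in-memory hashes
as hypothetical stand-ins for the msgmap (Message-ID to NNTP article
number) and the search index (article number to message).  The
names and data are assumptions for illustration only.

```perl
use strict;
use warnings;

# Two messages share <a@dup>; only the first owns it in the msgmap.
my %num_for = ('a@dup' => 1, 'generated@localhost' => 2);
my %by_article = (
	1 => { num => 1, mids => [ 'a@dup' ], blob => 'aaaa' },
	2 => { num => 2, mids => [ 'a@dup', 'generated@localhost' ],
		blob => 'bbbb' },
);

# Resolve a Message-ID through the article number it maps to, so a
# conflicted ID always yields the message which "owns" it.
sub lookup_primary {
	my ($mid) = @_;
	my $num = $num_for{$mid};
	defined $num ? $by_article{$num} : undef;
}

my $smsg = lookup_primary('a@dup');
print "article $smsg->{num} has blob $smsg->{blob}\n";
```
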
 lib/PublicInbox/Inbox.pm | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 47b8630..4c7305f 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -298,9 +298,15 @@ sub msg_by_mid ($$;$) {
 	my $srch = search($self) or
 			return msg_by_path($self, mid2path($mid), $ref);
 	my $smsg;
-	$srch->retry_reopen(sub {
-		$smsg = $srch->lookup_skeleton($mid) and $smsg->load_expand;
-	});
+	# favor the Message-ID we used for the NNTP article number:
+	if (my $mm = mm($self)) {
+		my $num = $mm->num_for($mid);
+		$smsg = $srch->lookup_article($num);
+	} else {
+		$smsg = $srch->retry_reopen(sub {
+			$srch->lookup_skeleton($mid) and $smsg->load_expand;
+		});
+	}
 	$smsg ? msg_by_smsg($self, $smsg, $ref) : undef;
 }
 
-- 
EW



* [PATCH 04/14] www: fix attachment downloads for conflicted Message-IDs
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (2 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 03/14] lookup by Message-ID favors the "primary" one Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 05/14] searchmsg: document why we store To: and Cc: for NNTP Eric Wong (Contractor, The Linux Foundation)
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

By using the "primary" Message-ID in WwwAttach, we can avoid
conflicts in the links we use for downloading attachments.
---
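
The link construction can be sketched as follows, assuming each
message has one Message-ID that is unique to it.  mid_escape_sketch
is a simplified, hypothetical stand-in for the real mid_escape, and
attach_href is a name invented for this example.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for public-inbox's mid_escape:
# percent-encode everything outside a conservative safe set.
sub mid_escape_sketch {
	my ($mid) = @_;
	$mid =~ s/([^A-Za-z0-9\-\._~\@])/sprintf('%%%02X', ord($1))/ge;
	$mid;
}

# Build the link for the attachment at MIME part $idx, prefixed with
# the Message-ID of the message which actually contains it.
sub attach_href {
	my ($mid, $idx, $fn) = @_;
	'../' . mid_escape_sketch($mid) . "/${idx}-${fn}";
}

print attach_href('a@dup', 2, 'attach.txt'), "\n";
print attach_href('generated/id@localhost', 2, 'attach.txt'), "\n";
```

Two messages shown on the same page then get distinct attachment
links, which is what the new test in t/psgi_v2.t checks.
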
 lib/PublicInbox/View.pm | 14 +++++++++-----
 t/psgi_v2.t             | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 133c30a..aad860e 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -24,7 +24,7 @@ sub th_pfx ($) { $_[0] == 0 ? '' : TCHILD };
 # public functions: (unstable)
 
 sub msg_html {
-	my ($ctx, $mime, $more) = @_;
+	my ($ctx, $mime, $more, $smsg) = @_;
 	my $hdr = $mime->header_obj;
 	my $ibx = $ctx->{-inbox};
 	my $obfs_ibx = $ctx->{-obfs_ibx} = $ibx->{obfuscate} ? $ibx : undef;
@@ -33,7 +33,9 @@ sub msg_html {
 	PublicInbox::WwwStream->response($ctx, 200, sub {
 		my ($nr, undef) = @_;
 		if ($nr == 1) {
-			$tip . multipart_text_as_html($mime, '', $obfs_ibx) .
+			# $more cannot be true w/o $smsg being defined:
+			my $upfx = $more ? '../'.mid_escape($smsg->mid).'/' : '';
+			$tip . multipart_text_as_html($mime, $upfx, $obfs_ibx) .
 				'</pre><hr>'
 		} elsif ($more && @$more) {
 			++$end;
@@ -57,12 +59,13 @@ sub msg_page {
 	my $mid = $ctx->{mid};
 	my $ibx = $ctx->{-inbox};
 	my ($first, $more, $head, $tail, $db);
+	my $smsg;
 	if (my $srch = $ibx->search) {
 		$srch->retry_reopen(sub {
 			($head, $tail, $db) = $srch->each_smsg_by_mid($mid);
 			for (; !defined($first) && $head != $tail; $head++) {
 				my @args = ($head, $db, $mid);
-				my $smsg = PublicInbox::SearchMsg->get(@args);
+				$smsg = PublicInbox::SearchMsg->get(@args);
 				$first = $ibx->msg_by_smsg($smsg);
 			}
 			if ($head != $tail) {
@@ -73,7 +76,7 @@ sub msg_page {
 	} else {
 		$first = $ibx->msg_by_mid($mid) or return;
 	}
-	msg_html($ctx, PublicInbox::MIME->new($first), $more);
+	msg_html($ctx, PublicInbox::MIME->new($first), $more, $smsg);
 }
 
 sub msg_html_more {
@@ -93,8 +96,9 @@ sub msg_html_more {
 		}
 		if ($smsg) {
 			my $mime = $smsg->{mime};
+			my $upfx = '../' . mid_escape($smsg->mid) . '/';
 			_msg_html_prepare($mime->header_obj, $ctx, $more, $nr) .
-				multipart_text_as_html($mime, '',
+				multipart_text_as_html($mime, $upfx,
 							$ctx->{-obfs_ibx}) .
 				'</pre><hr>'
 		} else {
diff --git a/t/psgi_v2.t b/t/psgi_v2.t
index 11b2c79..31c4178 100644
--- a/t/psgi_v2.t
+++ b/t/psgi_v2.t
@@ -171,6 +171,42 @@ test_psgi(sub { $www->call(@_) }, sub {
 	is($res->code, 200, 'got info refs for dumb clones');
 	$res = $cb->(GET('/v2test/info/refs'));
 	is($res->code, 404, 'unpartitioned git URL fails');
+
+	# ensure conflicted attachments can be resolved
+	foreach my $body (qw(old new)) {
+		my $parts = [
+			PublicInbox::MIME->create(
+				attributes => { content_type => 'text/plain' },
+				body => 'blah',
+			),
+			PublicInbox::MIME->create(
+				attributes => {
+					filename => 'attach.txt',
+					content_type => 'text/plain',
+				},
+				body => $body
+			)
+		];
+		$mime = PublicInbox::MIME->create(
+			parts => $parts,
+			header_str => [ From => 'root@z',
+				'Message-ID' => '<a@dup>',
+				Subject => 'hi']
+		);
+		ok($im->add($mime), "added attachment $body");
+	}
+	$im->done;
+	$config->each_inbox(sub { $_[0]->search->reopen });
+	$res = $cb->(GET('/v2test/a@dup/'));
+	my @links = ($res->content =~ m!"\.\./([^/]+/2-attach\.txt)\"!g);
+	is(scalar(@links), 2, 'both attachment links exist');
+	isnt($links[0], $links[1], 'attachment links are different');
+	{
+		my $old = $cb->(GET('/v2test/' . $links[0]));
+		my $new = $cb->(GET('/v2test/' . $links[1]));
+		is($old->content, 'old', 'got expected old content');
+		is($new->content, 'new', 'got expected new content');
+	}
 });
 
 done_testing();
-- 
EW



* [PATCH 05/14] searchmsg: document why we store To: and Cc: for NNTP
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (3 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 04/14] www: fix attachment downloads for conflicted Message-IDs Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 06/14] public-inbox-convert: tool for converting old to new inboxes Eric Wong (Contractor, The Linux Foundation)
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

Otherwise I would forget and be tempted to remove them.
---
 lib/PublicInbox/SearchMsg.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/PublicInbox/SearchMsg.pm b/lib/PublicInbox/SearchMsg.pm
index e314fed..e55d401 100644
--- a/lib/PublicInbox/SearchMsg.pm
+++ b/lib/PublicInbox/SearchMsg.pm
@@ -41,8 +41,12 @@ sub load_from_data ($$) {
 	$self->{subject} = $subj;
 	$self->{from} = $from;
 	$self->{references} = $refs;
+
+	# To: and Cc: are stored to optimize HDR/XHDR in NNTP since
+	# some NNTP clients will use that for message displays.
 	$self->{to} = $to;
 	$self->{cc} = $cc;
+
 	$self->{blob} = $blob;
 	$self->{mid} = $mid0;
 }
-- 
EW



* [PATCH 06/14] public-inbox-convert: tool for converting old to new inboxes
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (4 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 05/14] searchmsg: document why we store To: and Cc: for NNTP Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 07/14] v2writable: support purging messages from git entirely Eric Wong (Contractor, The Linux Foundation)
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

This should make it easier to let users perform comparisons and
migrate to v2 if needed.
---
 Documentation/public-inbox-config.pod  |   2 +-
 Documentation/public-inbox-convert.pod |  45 ++++++++++
 MANIFEST                               |   2 +
 script/public-inbox-convert            | 109 +++++++++++++++++++++++++
 4 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/public-inbox-convert.pod
 create mode 100755 script/public-inbox-convert

diff --git a/Documentation/public-inbox-config.pod b/Documentation/public-inbox-config.pod
index 8250b45..22ee909 100644
--- a/Documentation/public-inbox-config.pod
+++ b/Documentation/public-inbox-config.pod
@@ -40,7 +40,7 @@ Default: none, required
 
 =item publicinbox.<name>.mainrepo
 
-The absolute path to the git repository which hosts the
+The absolute path to the directory which hosts the
 public-inbox.  This must be specified once.
 
 Default: none, required
diff --git a/Documentation/public-inbox-convert.pod b/Documentation/public-inbox-convert.pod
new file mode 100644
index 0000000..1e16ea4
--- /dev/null
+++ b/Documentation/public-inbox-convert.pod
@@ -0,0 +1,45 @@
+=head1 NAME
+
+public-inbox-convert - convert v1 inboxes to v2
+
+=head1 SYNOPSIS
+
+	public-inbox-convert OLD_DIR NEW_DIR
+
+=head1 DESCRIPTION
+
+public-inbox-convert copies the contents of an old "v1" inbox
+into a new "v2" inbox.  It makes no changes to the old inbox
+and users are expected to update the "mainrepo" path in
+L<public-inbox-config(5)> to point to the path of NEW_DIR
+once they are satisfied with the conversion.
+
+=head1 ENVIRONMENT
+
+=over 8
+
+=item PI_CONFIG
+
+The default config file, normally "~/.public-inbox/config".
+See L<public-inbox-config(5)>
+
+=back
+
+=head1 UPGRADING
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2013-2018 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<public-inbox-init(1)>, L<public-inbox-index(1)>
diff --git a/MANIFEST b/MANIFEST
index 8b2b10b..1e48d3a 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -8,6 +8,7 @@ Documentation/design_www.txt
 Documentation/hosted.txt
 Documentation/include.mk
 Documentation/public-inbox-config.pod
+Documentation/public-inbox-convert.pod
 Documentation/public-inbox-daemon.pod
 Documentation/public-inbox-httpd.pod
 Documentation/public-inbox-index.pod
@@ -109,6 +110,7 @@ sa_config/Makefile
 sa_config/README
 sa_config/root/etc/spamassassin/public-inbox.pre
 sa_config/user/.spamassassin/user_prefs
+script/public-inbox-convert
 script/public-inbox-httpd
 script/public-inbox-index
 script/public-inbox-init
diff --git a/script/public-inbox-convert b/script/public-inbox-convert
new file mode 100755
index 0000000..2b0a385
--- /dev/null
+++ b/script/public-inbox-convert
@@ -0,0 +1,109 @@
+#!/usr/bin/perl -w
+# Copyright (C) 2018 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <http://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use warnings;
+use Getopt::Long qw(:config gnu_getopt no_ignore_case auto_abbrev);
+use PublicInbox::MIME;
+use PublicInbox::Inbox;
+use PublicInbox::Config;
+use PublicInbox::V2Writable;
+use PublicInbox::Spawn qw(spawn);
+use Cwd 'abs_path';
+my $usage = "Usage: public-inbox-convert OLD NEW\n";
+my $jobs;
+my $index = 1;
+my %opts = (
+	'--jobs|j=i' => \$jobs,
+	'--index!' => \$index,
+);
+GetOptions(%opts) or die "bad command-line args\n$usage";
+GetOptions(%opts) or die "bad command-line args\n$usage";
+my $old_dir = shift or die $usage;
+my $new_dir = shift or die $usage;
+die "$new_dir exists\n" if -d $new_dir;
+die "$old_dir not a directory\n" unless -d $old_dir;
+my $config = PublicInbox::Config->new;
+$old_dir = abs_path($old_dir);
+my $old;
+$config->each_inbox(sub {
+	$old = $_[0] if abs_path($_[0]->{mainrepo}) eq $old_dir;
+});
+unless ($old) {
+	warn "W: $old_dir not configured in " .
+		PublicInbox::Config::default_file() . "\n";
+	$old = {
+		mainrepo => $old_dir,
+		name => 'ignored',
+		address => [ 'old@example.com' ],
+	};
+	$old = PublicInbox::Inbox->new($old);
+}
+if (($old->{version} || 1) >= 2) {
+	die "Only conversion from v1 inboxes is supported\n";
+}
+my $new = { %$old };
+delete $new->{altid}; # TODO: support altid for v2
+$new->{mainrepo} = $new_dir;
+$new->{version} = 2;
+$new = PublicInbox::Inbox->new($new);
+my $v2w = PublicInbox::V2Writable->new($new, 1);
+$v2w->init_inbox($jobs);
+my $state = '';
+my ($prev, $from);
+my $head = $old->{ref_head} || 'HEAD';
+my ($rd, $pid) = $old->git->popen(qw(fast-export --use-done-feature), $head);
+$v2w->idx_init;
+my $im = $v2w->importer;
+my ($r, $w) = $im->gfi_start;
+my $h = '[0-9a-f]';
+my %D;
+while (<$rd>) {
+	if ($_ eq "blob\n") {
+		$state = 'blob';
+	} elsif (/^commit /) {
+		$state = 'commit';
+	} elsif (/^data (\d+)/) {
+		my $len = $1;
+		$w->print($_) or $im->wfail;
+		while ($len) {
+			my $n = read($rd, my $tmp, $len) or die "read: $!";
+			warn "$n != $len\n" if $n != $len;
+			$len -= $n;
+			$w->print($tmp) or $im->wfail;
+		}
+		next;
+	} elsif ($state eq 'commit') {
+		if (m{^M 100644 :(\d+) (${h}{2}/${h}{38})}o) {
+			my ($mark, $path) = ($1, $2);
+			$D{$path} = $mark;
+			$w->print("M 100644 :$mark m\n") or $im->wfail;
+			next;
+		}
+		if (m{^D (${h}{2}/${h}{38})}o) {
+			my $mark = delete $D{$1};
+			defined $mark or die "undeleted path: $1\n";
+			$w->print("M 100644 :$mark _/D\n") or $im->wfail;
+			next;
+		}
+		if (m{^from (:\d+)}) {
+			$prev = $from;
+			$from = $1;
+			# no next
+		}
+	} elsif ($_ eq "done\n") {
+		last;
+	}
+	$w->print($_) or $im->wfail;
+}
+$w = $r = undef;
+close $rd or die "close fast-export: $!\n";
+waitpid($pid, 0) or die "waitpid failed: $!\n";
+$? == 0 or die "fast-export failed: $?\n";
+my $mm = $old->mm;
+$mm->{dbh}->sqlite_backup_to_file("$new_dir/msgmap.sqlite3") if $mm;
+$v2w->done;
+if ($index) {
+	$v2w->reindex;
+	$v2w->done;
+}
-- 
EW



* [PATCH 07/14] v2writable: support purging messages from git entirely
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (5 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 06/14] public-inbox-convert: tool for converting old to new inboxes Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 08/14] search: cleanup uniqueness checking Eric Wong (Contractor, The Linux Foundation)
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

Purging existing messages is fairly straightforward since we can
take advantage of Xapian and lookup the git object_id with it.

Unfortunately, purging an already "removed" message (which is
no longer in Xapian) is not as easy and we'll need to expose
->purge_oids to purge by the git object_id (currently SHA-1).

Furthermore, we expire reflogs and prune in hopes a dumb HTTP
client won't get the object.
---
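
To show what the history rewrite does to a commit that introduced a
purged blob, the sketch below runs clean_purge_buffer (copied from
the diff that follows so the example is self-contained) on a
made-up fast-export commit header.  The sample author, timestamp and
object ID are assumptions for illustration.

```perl
use strict;
use warnings;

# Copied from the patch: blank the author and replace the commit
# message of a buffered fast-export commit.
sub clean_purge_buffer {
	my ($oid, $buf) = @_;
	my $cmt_msg = "purged $oid\n";

	foreach my $i (0..$#$buf) {
		my $l = $buf->[$i];
		if ($l =~ /^author .* (\d+ [\+-]?\d+)$/) {
			$buf->[$i] = "author <> $1\n";
		} elsif ($l =~ /^data (\d+)/) {
			$buf->[$i++] = "data " . length($cmt_msg) . "\n";
			$buf->[$i] = $cmt_msg;
			last;
		}
	}
}

# Made-up commit header as buffered by purge_oids:
my @buf = (
	"commit refs/heads/purge-deadbeef\n",
	"mark :2\n",
	"author A U Thor <author\@example.com> 1522319280 +0000\n",
	"committer A U Thor <author\@example.com> 1522319280 +0000\n",
	"data 12\n",
	"secret mail\n",
);
clean_purge_buffer('deadbeef', \@buf);
print @buf;
```

In this sketch only the author line and the commit message change;
the committer line is passed through as-is.
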
 lib/PublicInbox/Import.pm     | 90 +++++++++++++++++++++++++++++++++++
 lib/PublicInbox/V2Writable.pm | 39 +++++++++++++--
 t/v2writable.t                |  7 +++
 3 files changed, 131 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index e07edda..f00c260 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -430,6 +430,96 @@ sub digest2mid ($) {
 	$b64 . '@localhost';
 }
 
+sub clean_purge_buffer {
+	my ($oid, $buf) = @_;
+	my $cmt_msg = "purged $oid\n";
+
+	foreach my $i (0..$#$buf) {
+		my $l = $buf->[$i];
+		if ($l =~ /^author .* (\d+ [\+-]?\d+)$/) {
+			$buf->[$i] = "author <> $1\n";
+		} elsif ($l =~ /^data (\d+)/) {
+			$buf->[$i++] = "data " . length($cmt_msg) . "\n";
+			$buf->[$i] = $cmt_msg;
+			last;
+		}
+	}
+}
+
+sub purge_oids {
+	my ($self, $purge) = @_;
+	my $tmp = "refs/heads/purge-".((keys %$purge)[0]);
+	my $old = $self->{'ref'};
+	my $git = $self->{git};
+	my @export = (qw(fast-export --no-data --use-done-feature), $old);
+	my ($rd, $pid) = $git->popen(@export);
+	my ($r, $w) = $self->gfi_start;
+	my @buf;
+	my $npurge = 0;
+	while (<$rd>) {
+		if (/^reset (?:.+)/) {
+			push @buf, "reset $tmp\n";
+		} elsif (/^commit (?:.+)/) {
+			if (@buf) {
+				$w->print(@buf) or wfail;
+				@buf = ();
+			}
+			push @buf, "commit $tmp\n";
+		} elsif (/^data (\d+)/) {
+			# only commit message, so $len is small:
+			my $len = $1; # + 1 for trailing "\n"
+			push @buf, $_;
+			my $n = read($rd, my $buf, $len) or die "read: $!";
+			$len == $n or die "short read ($n < $len)";
+			push @buf, $buf;
+		} elsif (/^M 100644 ([a-f0-9]+) /) {
+			my $oid = $1;
+			if ($purge->{$oid}) {
+				my $lf = <$rd>;
+				if ($lf eq "\n") {
+					my $out = join('', @buf);
+					$out =~ s/^/# /sgm;
+					warn "purge rewriting\n", $out, "\n";
+					clean_purge_buffer($oid, \@buf);
+					$out = join('', @buf);
+					$w->print(@buf, "\n") or wfail;
+					@buf = ();
+					$npurge++;
+				} else {
+					die "expected LF: $lf\n";
+				}
+			} else {
+				push @buf, $_;
+			}
+		} else {
+			push @buf, $_;
+		}
+	}
+	if (@buf) {
+		$w->print(@buf) or wfail;
+	}
+	$w = $r = undef;
+	$self->done;
+	my @git = ('git', "--git-dir=$git->{git_dir}");
+
+	run_die([@git, qw(update-ref), $old, $tmp]) if $npurge;
+
+	run_die([@git, qw(update-ref -d), $tmp]);
+
+	return if $npurge == 0;
+
+	run_die([@git, qw(-c gc.reflogExpire=now gc --prune=all)]);
+	my $err = 0;
+	foreach my $oid (keys %$purge) {
+		my @info = $git->check($oid);
+		if (@info) {
+			warn "$oid not purged\n";
+			$err++;
+		}
+	}
+	die "Failed to purge $err object(s)\n" if $err;
+}
+
 1;
 __END__
 =pod
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 9b280c6..ef9867d 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -209,11 +209,22 @@ sub idx_init {
 	$skel->_msgmap_init->{dbh}->begin_work;
 }
 
-sub remove {
-	my ($self, $mime, $cmt_msg) = @_;
+sub purge_oids {
+	my ($self, $purge) = @_; # $purge = { $object_id => 1, ... }
+	$self->done;
+	my $pfx = "$self->{-inbox}->{mainrepo}/git";
+	foreach my $i (0..$self->{max_git}) {
+		my $git = PublicInbox::Git->new("$pfx/$i.git");
+		my $im = $self->import_init($git, 0);
+		$im->purge_oids($purge);
+	}
+}
+
+sub remove_internal {
+	my ($self, $mime, $cmt_msg, $purge) = @_;
 	$self->barrier;
 	$self->idx_init;
-	my $im = $self->importer;
+	my $im = $self->importer unless $purge;
 	my $ibx = $self->{-inbox};
 	my $srch = $ibx->search;
 	my $cid = content_id($mime);
@@ -245,11 +256,15 @@ sub remove {
 				# no bugs in our deduplication code:
 				$removed = $smsg;
 				$removed->{mime} = $cur;
-				$im->remove(\$orig, $cmt_msg);
+				my $oid = $smsg->{blob};
+				if ($purge) {
+					$purge->{$oid} = 1;
+				} else {
+					$im->remove(\$orig, $cmt_msg);
+				}
 				$orig = undef;
 				$removed->num; # memoize this for callers
 
-				my $oid = $smsg->{blob};
 				foreach my $idx (@$parts, $skel) {
 					$idx->remote_remove($oid, $mid);
 				}
@@ -258,9 +273,23 @@ sub remove {
 		});
 		$self->barrier;
 	}
+	if ($purge && scalar keys %$purge) {
+		purge_oids($self, $purge);
+	}
 	$removed;
 }
 
+sub remove {
+	my ($self, $mime, $cmt_msg) = @_;
+	remove_internal($self, $mime, $cmt_msg);
+}
+
+sub purge {
+	my ($self, $mime) = @_;
+	remove_internal($self, $mime, undef, {});
+}
+
+
 sub done {
 	my ($self) = @_;
 	my $locked = defined $self->{idx_parts};
diff --git a/t/v2writable.t b/t/v2writable.t
index c48f060..0eda432 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -231,4 +231,11 @@ EOF
 	ok(!$@, '->done is idempotent');
 }
 
+{
+	ok($im->add($mime), 'add message to be purged');
+	local $SIG{__WARN__} = sub {};
+	ok($im->purge($mime), 'purged message');
+	$im->done;
+}
+
 done_testing();
-- 
EW



* [PATCH 08/14] search: cleanup uniqueness checking
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (6 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 07/14] v2writable: support purging messages from git entirely Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 09/14] search: get rid of most lookup_* subroutines Eric Wong (Contractor, The Linux Foundation)
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

The only Xapian term which should be unique is the NNTP article
number, so we no longer need find_unique_doc_id.  lookup_article
now warns if an article number maps to more than one document.
---
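(Not part of the commit message.)  A rough sketch of the
behavior change, assuming a plain array of doc IDs as a stand-in
for the Xapian posting list; first_docid_warn_dup is a
hypothetical name, and the real code walks PostingIterator
objects instead.  The old helper died on a duplicate, whereas
the lookup now keeps the first document and only warns:

```perl
use strict;
use warnings;

# Hypothetical stand-in: $postings is an arrayref of doc IDs for one term.
# Returns the first doc ID (undef if there are none) and warns, rather
# than dies, when the term maps to more than one document.
sub first_docid_warn_dup {
	my ($postings, $label) = @_;
	return unless @$postings;
	warn "$label is not unique\n" if @$postings > 1;
	$postings->[0];
}
```
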
 lib/PublicInbox/Search.pm | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index a4e2498..584a508 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -396,9 +396,16 @@ sub lookup_article {
 		retry_reopen($self, sub {
 			my $db = $self->{skel} || $self->{xdb};
 			my $head = $db->postlist_begin($term);
-			return if $head == $db->postlist_end($term);
+			my $tail = $db->postlist_end($term);
+			return if $head->equal($tail);
 			my $doc_id = $head->get_docid;
 			return unless defined $doc_id;
+			$head->inc;
+			if ($head->nequal($tail)) {
+				my $loc= $self->{mainrepo} .
+					($self->{skel} ? 'skel' : 'xdb');
+				warn "article #$num is not unique in $loc\n";
+			}
 			# raises on error:
 			my $doc = $db->get_document($doc_id);
 			$smsg = PublicInbox::SearchMsg->wrap($doc);
@@ -432,21 +439,6 @@ sub each_smsg_by_mid {
 	}
 }
 
-sub find_unique_doc_id {
-	my ($self, $termval) = @_;
-
-	my ($begin, $end) = $self->find_doc_ids($termval);
-
-	return undef if $begin->equal($end); # not found
-
-	my $rv = $begin->get_docid;
-
-	# sanity check
-	$begin->inc;
-	$begin->equal($end) or die "Term '$termval' is not unique\n";
-	$rv;
-}
-
 # returns begin and end PostingIterator
 sub find_doc_ids {
 	my ($self, $termval) = @_;
-- 
EW



* [PATCH 09/14] search: get rid of most lookup_* subroutines
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (7 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 08/14] search: cleanup uniqueness checking Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 10/14] search: move find_doc_ids to searchidx Eric Wong (Contractor, The Linux Foundation)
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

Having too many similar functions doing the same basic thing
was redundant and misleading, especially since Message-ID is
no longer treated as a truly unique identifier.

For displaying threads in the HTML, this makes it clear
that we favor the primary Message-ID mapped to an NNTP
article number if a message cannot be found.
---
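(Not part of the commit message.)  A minimal sketch of the
lookup order, assuming two plain hashes as stand-ins for the
msgmap and the Xapian article lookup; smsg_by_mid_sketch and
both hash names are hypothetical.  The defined check is an
addition for the sketch: the version in this patch passes the
number straight to lookup_article, and 13/14 adds the check.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real stores:
#   %$mid2num  - Message-ID => NNTP article number (the msgmap)
#   %$num2smsg - article number => message record (the XNUM lookup)
sub smsg_by_mid_sketch {
	my ($mid2num, $num2smsg, $mid) = @_;
	# favor the Message-ID which was used for the article number
	my $num = $mid2num->{$mid};
	# an unknown Message-ID has no article number: nothing to load
	defined $num ? $num2smsg->{$num} : undef;
}

my %mid2num = ('root@s' => 1);
my %num2smsg = (1 => { mid => 'root@s', num => 1 });
my $smsg = smsg_by_mid_sketch(\%mid2num, \%num2smsg, 'root@s');
```

Going through the article number means a conflicting
Message-ID always resolves to the same message.
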
 lib/PublicInbox/Inbox.pm        | 22 ++++++-------
 lib/PublicInbox/Search.pm       | 56 +++------------------------------
 lib/PublicInbox/SearchThread.pm | 14 ++++-----
 lib/PublicInbox/SearchView.pm   |  2 +-
 lib/PublicInbox/View.pm         | 12 +++----
 t/search-thr-index.t            |  3 +-
 t/search.t                      |  6 ++--
 7 files changed, 35 insertions(+), 80 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 4c7305f..01aa500 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -293,20 +293,20 @@ sub path_check {
 	git($self)->check('HEAD:'.$path);
 }
 
+sub smsg_by_mid ($$) {
+	my ($self, $mid) = @_;
+	my $srch = search($self) or return;
+	# favor the Message-ID we used for the NNTP article number:
+	my $mm = mm($self) or return;
+	my $num = $mm->num_for($mid);
+	$srch->lookup_article($num);
+}
+
 sub msg_by_mid ($$;$) {
 	my ($self, $mid, $ref) = @_;
 	my $srch = search($self) or
-			return msg_by_path($self, mid2path($mid), $ref);
-	my $smsg;
-	# favor the Message-ID we used for the NNTP article number:
-	if (my $mm = mm($self)) {
-		my $num = $mm->num_for($mid);
-		$smsg = $srch->lookup_article($num);
-	} else {
-		$smsg = $srch->retry_reopen(sub {
-			$srch->lookup_skeleton($mid) and $smsg->load_expand;
-		});
-	}
+		return msg_by_path($self, mid2path($mid), $ref);
+	my $smsg = smsg_by_mid($self, $mid);
 	$smsg ? msg_by_smsg($self, $smsg, $ref) : undef;
 }
 
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 584a508..7d42aaa 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -18,7 +18,7 @@ use constant YYYYMMDD => 5; # for searching in the WWW UI
 use Search::Xapian qw/:standard/;
 use PublicInbox::SearchMsg;
 use PublicInbox::MIME;
-use PublicInbox::MID qw/mid_clean id_compress/;
+use PublicInbox::MID qw/id_compress/;
 
 # This is English-only, everything else is non-standard and may be confused as
 # a prefix common in patch emails
@@ -193,9 +193,8 @@ sub query {
 
 sub get_thread {
 	my ($self, $mid, $opts) = @_;
-	my $smsg = retry_reopen($self, sub { lookup_skeleton($self, $mid) });
-
-	return { total => 0, msgs => [] } unless $smsg;
+	my $smsg = first_smsg_by_mid($self, $mid) or
+			return { total => 0, msgs => [] };
 	my $qtid = Search::Xapian::Query->new('G' . $smsg->thread_id);
 	my $path = $smsg->path;
 	if (defined $path && $path ne '') {
@@ -346,48 +345,13 @@ sub query_ts {
 	_do_enquire($self, $query, $opts);
 }
 
-sub lookup_skeleton {
+sub first_smsg_by_mid {
 	my ($self, $mid) = @_;
-	my $skel = $self->{skel} or return lookup_message($self, $mid);
-	$mid = mid_clean($mid);
-	my $term = 'Q' . $mid;
 	my $smsg;
-	my $beg = $skel->postlist_begin($term);
-	if ($beg != $skel->postlist_end($term)) {
-		my $doc_id = $beg->get_docid;
-		if (defined $doc_id) {
-			# raises on error:
-			my $doc = $skel->get_document($doc_id);
-			$smsg = PublicInbox::SearchMsg->wrap($doc, $mid);
-			$smsg->{doc_id} = $doc_id;
-		}
-	}
+	each_smsg_by_mid($self, $mid, sub { $smsg = $_[0]; undef });
 	$smsg;
 }
 
-sub lookup_message {
-	my ($self, $mid) = @_;
-	$mid = mid_clean($mid);
-
-	my $doc_id = $self->find_first_doc_id('Q' . $mid);
-	my $smsg;
-	if (defined $doc_id) {
-		# raises on error:
-		my $doc = $self->{xdb}->get_document($doc_id);
-		$smsg = PublicInbox::SearchMsg->wrap($doc, $mid);
-		$smsg->{doc_id} = $doc_id;
-	}
-	$smsg;
-}
-
-sub lookup_mail { # no ghosts!
-	my ($self, $mid) = @_;
-	retry_reopen($self, sub {
-		my $smsg = lookup_skeleton($self, $mid) or return;
-		$smsg->load_expand;
-	});
-}
-
 sub lookup_article {
 	my ($self, $num) = @_;
 	my $term = 'XNUM'.$num;
@@ -447,16 +411,6 @@ sub find_doc_ids {
 	($db->postlist_begin($termval), $db->postlist_end($termval));
 }
 
-sub find_first_doc_id {
-	my ($self, $termval) = @_;
-
-	my ($begin, $end) = $self->find_doc_ids($termval);
-
-	return undef if $begin->equal($end); # not found
-
-	$begin->get_docid;
-}
-
 # normalize subjects so they are suitable as pathnames for URLs
 # XXX: consider for removal
 sub subject_path {
diff --git a/lib/PublicInbox/SearchThread.pm b/lib/PublicInbox/SearchThread.pm
index 6fbce15..1d250b4 100644
--- a/lib/PublicInbox/SearchThread.pm
+++ b/lib/PublicInbox/SearchThread.pm
@@ -22,15 +22,15 @@ use strict;
 use warnings;
 
 sub thread {
-	my ($messages, $ordersub, $srch) = @_;
+	my ($messages, $ordersub, $ibx) = @_;
 	my $id_table = {};
 	_add_message($id_table, $_) foreach @$messages;
 	my $rootset = [ grep {
-			!delete($_->{parent}) && $_->visible($srch)
+			!delete($_->{parent}) && $_->visible($ibx)
 		} values %$id_table ];
 	$id_table = undef;
 	$rootset = $ordersub->($rootset);
-	$_->order_children($ordersub, $srch) for @$rootset;
+	$_->order_children($ordersub, $ibx) for @$rootset;
 	$rootset;
 }
 
@@ -131,20 +131,20 @@ sub has_descendent {
 # a ghost Message-ID is the result of a long header line
 # being folded/mangled by a MUA, and not a missing message.
 sub visible ($$) {
-	my ($self, $srch) = @_;
-	($self->{smsg} ||= eval { $srch->lookup_mail($self->{id}) }) ||
+	my ($self, $ibx) = @_;
+	($self->{smsg} ||= eval { $ibx->smsg_by_mid($self->{id}) }) ||
 	 (scalar values %{$self->{children}});
 }
 
 sub order_children {
-	my ($cur, $ordersub, $srch) = @_;
+	my ($cur, $ordersub, $ibx) = @_;
 
 	my %seen = ($cur => 1); # self-referential loop prevention
 	my @q = ($cur);
 	while (defined($cur = shift @q)) {
 		my $c = $cur->{children}; # The hashref here...
 
-		$c = [ grep { !$seen{$_}++ && visible($_, $srch) } values %$c ];
+		$c = [ grep { !$seen{$_}++ && visible($_, $ibx) } values %$c ];
 		$c = $ordersub->($c) if scalar @$c > 1;
 		$cur->{children} = $c; # ...becomes an arrayref
 		push @q, @$c;
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index 1a8fe7f..c789795 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -228,7 +228,7 @@ sub mset_thread {
 	my $r = $q->{r};
 	my $rootset = PublicInbox::SearchThread::thread($msgs,
 		$r ? sort_relevance(\%pct) : *PublicInbox::View::sort_ds,
-		$srch);
+		$ctx);
 	my $skel = search_nav_bot($mset, $q). "<pre>";
 	my $inbox = $ctx->{-inbox};
 	$ctx->{-upfx} = '';
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index aad860e..f5b278c 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -430,7 +430,7 @@ sub thread_html {
 	$ctx->{mapping} = {};
 	$ctx->{s_nr} = "$nr+ messages in thread";
 
-	my $rootset = thread_results($msgs, $srch);
+	my $rootset = thread_results($ctx, $msgs);
 
 	# reduce hash lookups in pre_thread->skel_dump
 	my $inbox = $ctx->{-inbox};
@@ -686,7 +686,7 @@ sub thread_skel {
 	# reduce hash lookups in skel_dump
 	my $ibx = $ctx->{-inbox};
 	$ctx->{-obfs_ibx} = $ibx->{obfuscate} ? $ibx : undef;
-	walk_thread(thread_results($sres, $srch), $ctx, *skel_dump);
+	walk_thread(thread_results($ctx, $sres), $ctx, *skel_dump);
 
 	$ctx->{parent_msg} = $parent;
 }
@@ -809,9 +809,9 @@ sub load_results {
 }
 
 sub thread_results {
-	my ($msgs, $srch) = @_;
+	my ($ctx, $msgs) = @_;
 	require PublicInbox::SearchThread;
-	PublicInbox::SearchThread::thread($msgs, *sort_ds, $srch);
+	PublicInbox::SearchThread::thread($msgs, *sort_ds, $ctx->{-inbox});
 }
 
 sub missing_thread {
@@ -952,7 +952,7 @@ sub acc_topic {
 	my ($ctx, $level, $node) = @_;
 	my $srch = $ctx->{srch};
 	my $mid = $node->{id};
-	my $x = $node->{smsg} || $srch->lookup_mail($mid);
+	my $x = $node->{smsg} || $ctx->{-inbox}->smsg_by_mid($mid);
 	my ($subj, $ds);
 	my $topic;
 	if ($x) {
@@ -1078,7 +1078,7 @@ sub index_topics {
 	my $nr = scalar @{$sres->{msgs}};
 	if ($nr) {
 		$sres = load_results($srch, $sres);
-		walk_thread(thread_results($sres, $srch), $ctx, *acc_topic);
+		walk_thread(thread_results($ctx, $sres), $ctx, *acc_topic);
 	}
 	$ctx->{-next_o} = $off+ $nr;
 	$ctx->{-cur_o} = $off;
diff --git a/t/search-thr-index.t b/t/search-thr-index.t
index 6c6e4c5..9549976 100644
--- a/t/search-thr-index.t
+++ b/t/search-thr-index.t
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use Test::More;
 use File::Temp qw/tempdir/;
+use PublicInbox::MID qw(mids);
 use Email::MIME;
 eval { require PublicInbox::SearchIdx; };
 plan skip_all => "Xapian missing for search" if $@;
@@ -41,7 +42,7 @@ foreach (reverse split(/\n\n/, $data)) {
 	$mime->header_set('From' => 'bw@g');
 	$mime->header_set('To' => 'git@vger.kernel.org');
 	my $bytes = bytes::length($mime->as_string);
-	my $mid = $mime->header('Message-Id');
+	my $mid = mids($mime->header_obj)->[0];
 	my $doc_id = $rw->add_message($mime, $bytes, ++$num, 'ignored', $mid);
 	push @mids, $mid;
 	ok($doc_id, 'message added: '. $mid);
diff --git a/t/search.t b/t/search.t
index 6b1aa2a..ccf0f74 100644
--- a/t/search.t
+++ b/t/search.t
@@ -89,7 +89,7 @@ sub filter_mids {
 {
 	$rw_commit->();
 	$ro->reopen;
-	my $found = $ro->lookup_message('<root@s>');
+	my $found = $ro->first_smsg_by_mid('root@s');
 	ok($found, "message found");
 	is($root_id, $found->{doc_id}, 'doc_id set correctly');
 	is($found->mid, 'root@s', 'mid set correctly');
@@ -264,7 +264,7 @@ sub filter_mids {
 		],
 		body => "LOOP!\n"));
 	ok($doc_id > 0, "doc_id defined with circular reference");
-	my $smsg = $rw->lookup_message('circle@a');
+	my $smsg = $rw->first_smsg_by_mid('circle@a');
 	is($smsg->references, '', "no references created");
 	my $msg = PublicInbox::SearchMsg->load_doc($smsg->{doc});
 	is($s, $msg->subject, 'long subject not rewritten');
@@ -281,7 +281,7 @@ sub filter_mids {
 	my $mime = Email::MIME->new($str);
 	my $doc_id = $rw->add_message($mime);
 	ok($doc_id > 0, 'message indexed doc_id with UTF-8');
-	my $smsg = $rw->lookup_message('testmessage@example.com');
+	my $smsg = $rw->first_smsg_by_mid('testmessage@example.com');
 	my $msg = PublicInbox::SearchMsg->load_doc($smsg->{doc});
 
 	is($mime->header('Subject'), $msg->subject, 'UTF-8 subject preserved');
-- 
EW



* [PATCH 10/14] search: move find_doc_ids to searchidx
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (8 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 09/14] search: get rid of most lookup_* subroutines Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 11/14] v2writable: cleanup: get rid of unused fields Eric Wong (Contractor, The Linux Foundation)
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

We do not need this subroutine for read-only use in Search.pm.
---
 lib/PublicInbox/Search.pm    | 8 --------
 lib/PublicInbox/SearchIdx.pm | 8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 7d42aaa..6f5e062 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -403,14 +403,6 @@ sub each_smsg_by_mid {
 	}
 }
 
-# returns begin and end PostingIterator
-sub find_doc_ids {
-	my ($self, $termval) = @_;
-	my $db = $self->{xdb};
-
-	($db->postlist_begin($termval), $db->postlist_end($termval));
-}
-
 # normalize subjects so they are suitable as pathnames for URLs
 # XXX: consider for removal
 sub subject_path {
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 446cfb0..a234c8c 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -389,6 +389,14 @@ sub add_message {
 	$doc_id;
 }
 
+# returns begin and end PostingIterator
+sub find_doc_ids {
+	my ($self, $termval) = @_;
+	my $db = $self->{xdb};
+
+	($db->postlist_begin($termval), $db->postlist_end($termval));
+}
+
 sub batch_do {
 	my ($self, $termval, $cb) = @_;
 	my $batch_size = 1000; # don't let @ids grow too large to avoid OOM
-- 
EW



* [PATCH 11/14] v2writable: cleanup: get rid of unused fields
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (9 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 10/14] search: move find_doc_ids to searchidx Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 12/14] mbox: avoid extracting Message-ID for linkification Eric Wong (Contractor, The Linux Foundation)
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

The layout of this structure ended up being a bit different
and the read-only access is handled through the ::Inbox class,
instead.
---
 lib/PublicInbox/V2Writable.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index ef9867d..b516278 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -53,8 +53,6 @@ sub new {
 	my $self = {
 		-inbox => $v2ibx,
 		im => undef, #  PublicInbox::Import
-		xap_rw => undef, # PublicInbox::V2SearchIdx
-		xap_ro => undef,
 		partitions => $nparts,
 		parallel => 1,
 		transact_bytes => 0,
-- 
EW



* [PATCH 12/14] mbox: avoid extracting Message-ID for linkification
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (10 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 11/14] v2writable: cleanup: get rid of unused fields Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 13/14] www: cleanup expensive fallback for legacy URLs Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 14/14] view: get rid of some unnecessary imports Eric Wong (Contractor, The Linux Foundation)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

We can avoid a small amount of overhead and use the "preferred"
Message-ID based on what is in the SearchMsg object.
---
 lib/PublicInbox/Mbox.pm | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 381bcad..1b68f02 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -92,7 +92,7 @@ sub emit_raw {
 }
 
 sub msg_str {
-	my ($ctx, $simple) = @_; # Email::Simple object
+	my ($ctx, $simple, $mid) = @_; # Email::Simple object
 	my $header_obj = $simple->header_obj;
 
 	# drop potentially confusing headers, ssoma already should've dropped
@@ -102,7 +102,7 @@ sub msg_str {
 	}
 	my $ibx = $ctx->{-inbox};
 	my $base = $ibx->base_url($ctx->{env});
-	my $mid = mid_clean($header_obj->header('Message-ID'));
+	$mid = $ctx->{mid} unless defined $mid;
 	$mid = mid_escape($mid);
 	my @append = (
 		'Archived-At', "<$base$mid/>",
@@ -225,7 +225,8 @@ sub getline {
 		while (defined(my $smsg = shift @{$self->{msgs}})) {
 			my $msg = eval { $ibx->msg_by_smsg($smsg) } or next;
 			$msg = Email::Simple->new($msg);
-			$gz->write(PublicInbox::Mbox::msg_str($ctx, $msg));
+			$gz->write(PublicInbox::Mbox::msg_str($ctx, $msg,
+								$smsg->mid));
 
 			# use subject of first message as subject
 			if (my $hdr = delete $self->{hdr}) {
-- 
EW



* [PATCH 13/14] www: cleanup expensive fallback for legacy URLs
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (11 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 12/14] mbox: avoid extracting Message-ID for linkification Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-29 10:28 ` [PATCH 14/14] view: get rid of some unnecessary imports Eric Wong (Contractor, The Linux Foundation)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

Back in the day, we compressed long Message-IDs to SHA-1
hexdigests for the URL.  Requests for such URLs now get a 301
redirect to the full Message-ID URL, in the hope that we can
remove these checks some day to reduce overhead.
---
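(Not part of the commit message.)  A small sketch of how a
legacy 40-hex URL component maps to a path in the git tree,
mirroring the regex in the diff below.  legacy_path is a
hypothetical helper, and hashing the bare Message-ID string is
an assumption about how the old URLs were generated:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Split a 40-character hex string into the "2-hex/38-hex" path form;
# returns undef when the input does not look like a legacy-style ID.
sub legacy_path {
	my ($mid) = @_;
	$mid =~ /\A([a-f0-9]{2})([a-f0-9]{38})\z/ or return;
	"$1/$2";
}

my $x40 = sha1_hex('blah@example.com'); # example Message-ID
print legacy_path($x40), "\n";
```

Anything which is not exactly 40 lowercase hex characters skips
the expensive path lookup entirely.
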
 lib/PublicInbox/Inbox.pm | 11 ++++++++---
 lib/PublicInbox/WWW.pm   | 23 +++++++++--------------
 t/plack.t                | 18 ++++++++++++++++++
 3 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 01aa500..265360d 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -293,13 +293,18 @@ sub path_check {
 	git($self)->check('HEAD:'.$path);
 }
 
+sub mid2num($$) {
+	my ($self, $mid) = @_;
+	my $mm = mm($self) or return;
+	$mm->num_for($mid);
+}
+
 sub smsg_by_mid ($$) {
 	my ($self, $mid) = @_;
 	my $srch = search($self) or return;
 	# favor the Message-ID we used for the NNTP article number:
-	my $mm = mm($self) or return;
-	my $num = $mm->num_for($mid);
-	$srch->lookup_article($num);
+	my $num = mid2num($self, $mid);
+	defined $num ? $srch->lookup_article($num) : undef;
 }
 
 sub msg_by_mid ($$;$) {
diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 7bd2973..24e24f1 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -169,14 +169,15 @@ sub invalid_inbox_mid {
 	return $ret if $ret;
 
 	$ctx->{mid} = $mid;
-	if ($mid =~ /\A[a-f0-9]{40}\z/) {
-		# this is horiffically wasteful for legacy URLs:
-		if ($mid = mid2blob($ctx)) {
-			require Email::Simple;
-			use PublicInbox::MID qw/mid_clean/;
-			my $s = Email::Simple->new($mid);
-			$ctx->{mid} = mid_clean($s->header('Message-ID'));
-		}
+	my $ibx = $ctx->{-inbox};
+	if ($mid =~ m!\A([a-f0-9]{2})([a-f0-9]{38})\z!) {
+		my ($x2, $x38) = ($1, $2);
+		# this is horrifically wasteful for legacy URLs:
+		my $str = $ctx->{-inbox}->msg_by_path("$x2/$x38") or return;
+		require Email::Simple;
+		my $s = Email::Simple->new($str);
+		$mid = PublicInbox::MID::mid_clean($s->header('Message-ID'));
+		return r301($ctx, $inbox, $mid);
 	}
 	undef;
 }
@@ -208,12 +209,6 @@ sub get_index {
 	}
 }
 
-# just returns a string ref for the blob in the current ctx
-sub mid2blob {
-	my ($ctx) = @_;
-	$ctx->{-inbox}->msg_by_mid($ctx->{mid});
-}
-
 # /$INBOX/$MESSAGE_ID/raw                    -> raw mbox
 sub get_mid_txt {
 	my ($ctx) = @_;
diff --git a/t/plack.t b/t/plack.t
index 26b0366..7eb7d7f 100644
--- a/t/plack.t
+++ b/t/plack.t
@@ -18,6 +18,7 @@ foreach my $mod (@mods) {
 }
 use_ok 'PublicInbox::Import';
 use_ok 'PublicInbox::Git';
+my @ls;
 
 foreach my $mod (@mods) { use_ok $mod; }
 {
@@ -55,6 +56,8 @@ EOF
 		$im->done;
 		my $rev = `git --git-dir="$maindir" rev-list HEAD`;
 		like($rev, qr/\A[a-f0-9]{40}/, "good revision committed");
+		@ls = `git --git-dir="$maindir" ls-tree -r --name-only HEAD`;
+		chomp @ls;
 	}
 	my $app = eval {
 		local $ENV{PI_CONFIG} = $pi_config;
@@ -198,6 +201,21 @@ EOF
 			     "$sfx redirected to /mbox.gz");
 		});
 	}
+	test_psgi($app, sub {
+		my ($cb) = @_;
+		# for a while, we used to support /$INBOX/$X40/
+		# when we "compressed" long Message-IDs to SHA-1
+		# Now we're stuck supporting them forever :<
+		foreach my $path (@ls) {
+			$path =~ tr!/!!d;
+			my $from = "http://example.com/test/$path/";
+			my $res = $cb->(GET($from));
+			is(301, $res->code, 'is permanent redirect');
+			like($res->header('Location'),
+				qr!/test/blah\@example\.com/!,
+				'redirect from x40 MIDs works');
+		}
+	});
 }
 
 done_testing();
-- 
EW



* [PATCH 14/14] view: get rid of some unnecessary imports
  2018-03-29 10:28 [PATCH 00/14] purging support, v1 conversions, cleanups + more Eric Wong (Contractor, The Linux Foundation)
                   ` (12 preceding siblings ...)
  2018-03-29 10:28 ` [PATCH 13/14] www: cleanup expensive fallback for legacy URLs Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-29 10:28 ` Eric Wong (Contractor, The Linux Foundation)
  13 siblings, 0 replies; 15+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-29 10:28 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong (Contractor, The Linux Foundation)

We no longer need some of these old subroutines which
assumed a single Message-ID for each message.
---
 lib/PublicInbox/View.pm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index f5b278c..ec04343 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -9,8 +9,7 @@ use warnings;
 use PublicInbox::MsgTime qw(msg_datestamp);
 use PublicInbox::Hval qw/ascii_html obfuscate_addrs/;
 use PublicInbox::Linkify;
-use PublicInbox::MID qw/mid_clean id_compress mid_mime mid_escape mids
-			references/;
+use PublicInbox::MID qw/id_compress mid_escape mids references/;
 use PublicInbox::MsgIter;
 use PublicInbox::Address;
 use PublicInbox::WwwStream;
-- 
EW




Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
