From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 6EF5E1F462;
	Mon, 10 Jun 2019 22:03:20 +0000 (UTC)
Date: Mon, 10 Jun 2019 22:03:20 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org
Subject: [WIP] v2writable: support INBOX_DEBUG=replace
Message-ID: <20190610220320.nssvqjseswo2ujl2@dcvr>
References: <20190609025147.24966-1-e@80x24.org>
	<20190610150647.GA16418@chatter.i7.local>
	<20190610154058.jqaawkktvb5u2itj@dcvr>
	<20190610185730.GC16418@chatter.i7.local>
	<20190610192905.55xb737jl7qnbh23@dcvr>
	<20190610194039.GD16418@chatter.i7.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20190610194039.GD16418@chatter.i7.local>
List-Id:

Konstantin Ryabitsev wrote:
> On Mon, Jun 10, 2019 at 07:29:05PM +0000, Eric Wong wrote:
> > > I did a few successful tests on small trial lists, but I'm
> > > running into a problem when I try to actually edit something
> > > in (a copy of) LKML:
> > >
> > > $ perl5lib/bin/public-inbox-edit -m messageid /mnt/fastio/lkml
> > > (mutt opens here)
> > > 1 kept, 0 deleted.
> > > Exception: Expected block 102325 to be level 2, not 0
> >
> > That's an exception from Xapian I haven't seen that in years.
> > Which version of Xapian and are you using chert or glass?
>
> EL7 has 1.2.25.

Oh, I just realized that doesn't use OFD locks at all because
it's Linux 3.10 and OFD locks appeared in 3.15 (unless RH
backported).

Maybe PATCH 14/11 fixes it:

  https://public-inbox.org/meta/20190610215811.untkksidetf3erf6@dcvr/

> > > I can provide more debug info if that helps.
> >
> > Yes, I could probably add some debug messages to make it easier
>
> Sounds good -- I had suspected it was coming from Xapian and I know EL7 is
> lagging behind quite a bit. If this is too much divergence to work with, I
> can probably build a xapian14 package, but I'm afraid of the rabbit hole I
> may have to go down to make that work.

Xapian 1.4 has huge performance improvements in worst-case
scenarios with the glass backend; so it might be worth
trying anyways.

But that won't get you Linux >=3.15 for OFD locks; so
Xapian is probably still using the nasty fork()-based lock
in older releases.

Maybe this dirty patch can dump more info:

---------8<--------
Subject: [WIP] v2writable: support INBOX_DEBUG=replace

Dirty patch to enable ->replace debugging via
INBOX_DEBUG=replace environment.

cf. <20190610192905.55xb737jl7qnbh23@dcvr>
---
 lib/PublicInbox/V2Writable.pm | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 3484807..164a032 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -20,6 +20,9 @@ use PublicInbox::Spawn qw(spawn);
 use PublicInbox::SearchIdx;
 use IO::Handle;
 
+# unstable interface
+use constant DBG_REPLACE => !!(($ENV{INBOX_DEBUG}//'') =~ /\breplace\b/);
+
 # an estimate of the post-packed size to the raw uncompressed size
 my $PACKING_FACTOR = 0.4;
 
@@ -312,14 +315,23 @@ sub _replace_oids ($$$) {
 		$self->{epoch_max} = $max;
 	}
 
+	if (DBG_REPLACE) {
+		warn "Replacing OIDs\n";
+		warn "\t", $_, "\n" for (keys %$replace_map);
+	}
+
 	foreach my $i (0..$max) {
 		my $git_dir = "$pfx/$i.git";
 		-d $git_dir or next;
+
+		warn "In $git_dir ... " if DBG_REPLACE;
 		my $git = PublicInbox::Git->new($git_dir);
 		my $im = $self->import_init($git, 0, 1);
 		$rewrites->[$i] = $im->replace_oids($mime, $replace_map);
 		$im->done;
+		warn "Done: $git_dir" if DBG_REPLACE;
 	}
+	warn "done replacing in git repos" if DBG_REPLACE;
 	$rewrites;
 }
 
@@ -369,6 +381,7 @@ sub rewrite_internal ($$;$$$) {
 	foreach my $mid (@$mids) {
 		my %gone; # num => [ smsg, raw ]
 		my ($id, $prev);
+		warn "looking for <$mid>" if DBG_REPLACE;
 		while (my $smsg = $over->next_by_mid($mid, \$id, \$prev)) {
 			my $msg = get_blob($self, $smsg);
 			if (!defined($msg)) {
@@ -380,6 +393,10 @@ sub rewrite_internal ($$;$$$) {
 			if (content_matches($cids, $cur)) {
 				$smsg->{mime} = $cur;
 				$gone{$smsg->{num}} = [ $smsg, \$orig ];
+				DBG_REPLACE and
+					warn "matched <$mid> => $smsg->{num}";
+			} else {
+				DBG_REPLACE and warn "no match <$mid>";
 			}
 		}
 		my $n = scalar keys %gone;
@@ -387,6 +404,7 @@ sub rewrite_internal ($$;$$$) {
 		if ($n > 1) {
 			warn "BUG: multiple articles linked to <$mid>\n",
 				join(',', sort keys %gone), "\n";
+			warn "Replacing all of them\n";
 		}
 		foreach my $num (keys %gone) {
 			my ($smsg, $orig) = @{$gone{$num}};
@@ -513,6 +531,7 @@ sub replace ($$$) {
 
 	my $raw = $new_mime->as_string;
 	my $expect_oid = git_hash_raw($self, \$raw);
+	warn "expect_oid: $expect_oid" if DBG_REPLACE;
 	my $rewritten = _replace($self, $old_mime, $new_mime, \$raw) or return;
 	my $need_reindex = $rewritten->{need_reindex};
 
@@ -537,6 +556,7 @@ W: $list
 	for my $smsg (@$need_reindex) {
 		my $num = $smsg->{num};
 		my $mid0 = $smsg->{mid};
+		warn "Reindexing article $num <$mid0>" if DBG_REPLACE;
 		do_idx($self, \$raw, $new_mime, $len, $num, $oid, $mid0);
 	}
 	$rewritten->{rewrites};
-- 
EW