From: Eric Wong <e@80x24.org>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: meta@public-inbox.org
Subject: [WIP] v2writable: support INBOX_DEBUG=replace
Date: Mon, 10 Jun 2019 22:03:20 +0000
Message-ID: <20190610220320.nssvqjseswo2ujl2@dcvr>
In-Reply-To: <20190610194039.GD16418@chatter.i7.local>
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Mon, Jun 10, 2019 at 07:29:05PM +0000, Eric Wong wrote:
> > > I did a few successful tests on small trial lists, but I'm running
> > > into a problem when I try to actually edit something in (a copy of)
> > > LKML:
> > >
> > > $ perl5lib/bin/public-inbox-edit -m messageid /mnt/fastio/lkml
> > > (mutt opens here)
> > > 1 kept, 0 deleted.
> > > Exception: Expected block 102325 to be level 2, not 0
> >
> > That's an exception from Xapian that I haven't seen in years.
> > Which version of Xapian is it, and are you using chert or glass?
>
> EL7 has 1.2.25.
Oh, I just realized that doesn't use OFD locks at all because
it's Linux 3.10 and OFD locks appeared in 3.15 (unless RH backported).
Maybe PATCH 14/11 fixes it:
https://public-inbox.org/meta/20190610215811.untkksidetf3erf6@dcvr/
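To check which side of that cutoff a host is on, something like the
following sketch could help. It only compares the `uname -r` version
against 3.15 (the threshold mentioned above); `kernel_has_ofd` is a
hypothetical helper name, and a vendor kernel with backports can have
the feature despite an older version number, so a negative result is
only a hint.

```shell
#!/bin/sh
# kernel_has_ofd VERSION (hypothetical helper): succeed if VERSION,
# e.g. "3.10.0-957.el7.x86_64", is at least 3.15.
kernel_has_ofd() {
    ver=${1%%-*}        # drop any "-957.el7..." style suffix
    major=${ver%%.*}    # text before the first dot
    rest=${ver#*.}      # text after the first dot
    minor=${rest%%.*}   # second version component
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 15 ]; }
}

if kernel_has_ofd "$(uname -r)"; then
    echo "kernel >= 3.15: OFD locks should be available"
else
    echo "kernel < 3.15: no OFD locks unless backported"
fi
```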
> > > I can provide more debug info if that helps.
> >
> > Yes, I could probably add some debug messages to make it easier
>
> Sounds good -- I had suspected it was coming from Xapian and I know EL7 is
> lagging behind quite a bit. If this is too much divergence to work with, I
> can probably build a xapian14 package, but I'm afraid of the rabbit hole I
> may have to go down to make that work.
Xapian 1.4 has huge performance improvements in worst-case
scenarios with the glass backend; so it might be worth trying
anyways.
But that won't get you Linux >=3.15 for OFD locks; so Xapian
is probably still using the nasty fork()-based lock in older
releases.
Maybe this dirty patch can dump more info:
---------8<--------
Subject: [WIP] v2writable: support INBOX_DEBUG=replace
Dirty patch to enable ->replace debugging via
INBOX_DEBUG=replace environment.
cf. <20190610192905.55xb737jl7qnbh23@dcvr>
---
lib/PublicInbox/V2Writable.pm | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 3484807..164a032 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -20,6 +20,9 @@ use PublicInbox::Spawn qw(spawn);
use PublicInbox::SearchIdx;
use IO::Handle;
+# unstable interface
+use constant DBG_REPLACE => !!(($ENV{INBOX_DEBUG}//'') =~ /\breplace\b/);
+
# an estimate of the post-packed size to the raw uncompressed size
my $PACKING_FACTOR = 0.4;
@@ -312,14 +315,23 @@ sub _replace_oids ($$$) {
$self->{epoch_max} = $max;
}
+ if (DBG_REPLACE) {
+ warn "Replacing OIDs\n";
+ warn "\t", $_, "\n" for (keys %$replace_map);
+ }
+
foreach my $i (0..$max) {
my $git_dir = "$pfx/$i.git";
-d $git_dir or next;
+
+ warn "In $git_dir ... " if DBG_REPLACE;
my $git = PublicInbox::Git->new($git_dir);
my $im = $self->import_init($git, 0, 1);
$rewrites->[$i] = $im->replace_oids($mime, $replace_map);
$im->done;
+ warn "Done: $git_dir" if DBG_REPLACE;
}
+ warn "done replacing in git repos" if DBG_REPLACE;
$rewrites;
}
@@ -369,6 +381,7 @@ sub rewrite_internal ($$;$$$) {
foreach my $mid (@$mids) {
my %gone; # num => [ smsg, raw ]
my ($id, $prev);
+ warn "looking for <$mid>" if DBG_REPLACE;
while (my $smsg = $over->next_by_mid($mid, \$id, \$prev)) {
my $msg = get_blob($self, $smsg);
if (!defined($msg)) {
@@ -380,6 +393,10 @@ sub rewrite_internal ($$;$$$) {
if (content_matches($cids, $cur)) {
$smsg->{mime} = $cur;
$gone{$smsg->{num}} = [ $smsg, \$orig ];
+ DBG_REPLACE and
+ warn "matched <$mid> => $smsg->{num}";
+ } else {
+ DBG_REPLACE and warn "no match <$mid>";
}
}
my $n = scalar keys %gone;
@@ -387,6 +404,7 @@ sub rewrite_internal ($$;$$$) {
if ($n > 1) {
warn "BUG: multiple articles linked to <$mid>\n",
join(',', sort keys %gone), "\n";
+ warn "Replacing all of them\n";
}
foreach my $num (keys %gone) {
my ($smsg, $orig) = @{$gone{$num}};
@@ -513,6 +531,7 @@ sub replace ($$$) {
my $raw = $new_mime->as_string;
my $expect_oid = git_hash_raw($self, \$raw);
+ warn "expect_oid: $expect_oid" if DBG_REPLACE;
my $rewritten = _replace($self, $old_mime, $new_mime, \$raw) or return;
my $need_reindex = $rewritten->{need_reindex};
@@ -537,6 +556,7 @@ W: $list
for my $smsg (@$need_reindex) {
my $num = $smsg->{num};
my $mid0 = $smsg->{mid};
+ warn "Reindexing article $num <$mid0>" if DBG_REPLACE;
do_idx($self, \$raw, $new_mime, $len, $num, $oid, $mid0);
}
$rewritten->{rewrites};
--
EW
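
For reference, the `DBG_REPLACE` constant in the patch is true whenever
`replace` appears as a whole word in `INBOX_DEBUG`, so the variable can
carry other words too. A rough shell approximation of that check,
assuming a `grep` that supports `-w` (`dbg_enabled` is a hypothetical
helper and `other` is a made-up second word; neither is part of
public-inbox):

```shell
#!/bin/sh
# dbg_enabled NAME (hypothetical helper): succeed if NAME appears as a
# whole word in $INBOX_DEBUG, approximating the patch's /\bNAME\b/ test.
dbg_enabled() {
    printf '%s\n' "${INBOX_DEBUG:-}" | grep -qw -- "$1"
}

INBOX_DEBUG=replace,other   # "other" is only here for illustration
if dbg_enabled replace; then
    echo "replace debugging on" >&2
fi
```

With the patch applied, the failing command above would presumably be
rerun as `INBOX_DEBUG=replace perl5lib/bin/public-inbox-edit -m messageid /mnt/fastio/lkml`,
with the extra messages going to stderr since they are emitted via warn.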
Thread overview: 31+ messages
2019-06-09 2:51 [PATCH 00/11] v2: implement message editing Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 01/11] v2writable: consolidate overview and indexing call Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 02/11] import: extract_author_info becomes extract_commit_info Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 03/11] import: switch to "replace_oids" interface for purge Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 04/11] v2writable: implement ->replace call Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 05/11] admin: remove warning arg for unconfigured inboxes Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 06/11] purge: start moving common options to AdminEdit module Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 07/11] admin: beef up resolve_inboxes to handle purge options Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 08/11] AdminEdit: move editability checks from -purge Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 09/11] admin: expose ->config Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 10/11] doc: document the --prune option for -index Eric Wong (Contractor, The Linux Foundation)
2019-06-09 2:51 ` [PATCH 11/11] edit: new tool to perform edits Eric Wong (Contractor, The Linux Foundation)
2019-06-10 16:06 ` Konstantin Ryabitsev
2019-06-10 18:02 ` Eric Wong
2019-06-13 8:07 ` Eric Wong
2019-06-10 15:06 ` [PATCH 00/11] v2: implement message editing Konstantin Ryabitsev
2019-06-10 15:40 ` Eric Wong
2019-06-10 17:56 ` [PATCH 12/11] edit|purge: improve output on rewrites Eric Wong
2019-06-10 18:57 ` [PATCH 00/11] v2: implement message editing Konstantin Ryabitsev
2019-06-10 19:29 ` Eric Wong
2019-06-10 19:40 ` Konstantin Ryabitsev
2019-06-10 22:03 ` Eric Wong [this message]
2019-06-10 22:13 ` [WIP] v2writable: support INBOX_DEBUG=replace Konstantin Ryabitsev
2019-06-10 23:12 ` [WIP] add more debug tracing around idx_init Eric Wong
2019-06-11 15:33 ` Konstantin Ryabitsev
2019-06-11 18:43 ` [WIP] v2writable: support INBOX_DEBUG=replace Eric Wong
2019-06-11 21:06 ` [PATCH 00/11] v2: implement message editing Konstantin Ryabitsev
2019-06-12 0:18 ` [PATCH] searchidx: improve error message when Xapian fails Eric Wong
2019-06-10 18:17 ` [PATCH 13/11] edit: drop unwanted headers before noop check Eric Wong (Contractor, The Linux Foundation)
2019-06-10 21:58 ` [PATCH 14/11] v2writable: replace: kill git processes before reindexing Eric Wong (Contractor, The Linux Foundation)
2019-06-12 0:25 ` [PATCH 15/11] edit: unlink temporary file when done Eric Wong (Contractor, The Linux Foundation)