* Re: [PATCH 2/4] *index: avoid per-epoch --batch-check processes
2020-11-13 12:38 7% ` Kyle Meyer
@ 2020-11-15 3:03 7% ` Eric Wong
0 siblings, 0 replies; 4+ results
From: Eric Wong @ 2020-11-15 3:03 UTC (permalink / raw)
To: Kyle Meyer; +Cc: meta
Kyle Meyer <kyle@kyleam.com> wrote:
> [ superficial typo comments from a reader who is out of his depth :) ]
They're greatly appreciated. My brain is working worse than
usual these days, so extra eyes anywhere are appreciated. And
trying to get my mind off crap by trying something new caused me
to hit my head, so maybe I'm concussed, too :x
> Eric Wong writes:
>
> > Since all.git (v2) and ALL.git (extindex) encompass every single
> > epoch or indexed inbox; and we is_ancestor() only uses
>
> s/we is_ancestor/is_ancestor/?
yup
> > hexadecimal OIDs, there is no good reason to to use $unit->{git}
> > for an epoch-local $git->check avoids redundant long-lived
> > processes.
>
> s/to to/to/
>
> I'm having trouble parsing that last part. Perhaps the "avoids ..." is
> covered by the next paragraph and should be dropped?
Agreed, dropped that bit.
> s/an several/several/
Thanks
^ permalink raw reply [relevance 7%]
* Re: [PATCH 2/4] *index: avoid per-epoch --batch-check processes
2020-11-13 11:11 6% ` [PATCH 2/4] *index: avoid per-epoch --batch-check processes Eric Wong
@ 2020-11-13 12:38 7% ` Kyle Meyer
2020-11-15 3:03 7% ` Eric Wong
0 siblings, 1 reply; 4+ results
From: Kyle Meyer @ 2020-11-13 12:38 UTC (permalink / raw)
To: Eric Wong; +Cc: meta
[ superficial typo comments from a reader who is out of his depth :) ]
Eric Wong writes:
> Since all.git (v2) and ALL.git (extindex) encompass every single
> epoch or indexed inbox; and we is_ancestor() only uses
s/we is_ancestor/is_ancestor/?
> hexadecimal OIDs, there is no good reason to to use $unit->{git}
> for an epoch-local $git->check avoids redundant long-lived
> processes.
s/to to/to/
I'm having trouble parsing that last part. Perhaps the "avoids ..." is
covered by the next paragraph and should be dropped?
> This prevents dozens/hundreds of --batch-check processes from
> being left running after indexing and can improve locality
> if size checks are being done (since that uses --batch-check,
> too).
>
> Theoretically an several epochs may have conflicting OIDs, but
s/an several/several/
> we're screwed in those cases, anyways, so we might as well
> detect it earlier (though I'm not sure what the behavior would
> be :x).
^ permalink raw reply [relevance 7%]
* [PATCH 2/4] *index: avoid per-epoch --batch-check processes
2020-11-13 11:11 5% [PATCH 0/4] extindex: checkpoints, graceful shutdown, cleanups Eric Wong
@ 2020-11-13 11:11 6% ` Eric Wong
2020-11-13 12:38 7% ` Kyle Meyer
0 siblings, 1 reply; 4+ results
From: Eric Wong @ 2020-11-13 11:11 UTC (permalink / raw)
To: meta
Since all.git (v2) and ALL.git (extindex) encompass every single
epoch or indexed inbox; and we is_ancestor() only uses
hexadecimal OIDs, there is no good reason to to use $unit->{git}
for an epoch-local $git->check avoids redundant long-lived
processes.
This prevents dozens/hundreds of --batch-check processes from
being left running after indexing and can improve locality
if size checks are being done (since that uses --batch-check,
too).
Theoretically an several epochs may have conflicting OIDs, but
we're screwed in those cases, anyways, so we might as well
detect it earlier (though I'm not sure what the behavior would
be :x).
---
lib/PublicInbox/ExtSearchIdx.pm | 2 +-
lib/PublicInbox/V2Writable.pm | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 14ffdadb..2d230dc1 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -361,7 +361,7 @@ sub update_last_commit { # overrides V2Writable
die "Unsupported inbox version: $v";
}
my $last = $self->{oidx}->eidx_meta($meta_key);
- if (defined $last && is_ancestor($unit->{git}, $last, $latest_cmt)) {
+ if (defined $last && is_ancestor($self->git, $last, $latest_cmt)) {
my @cmd = (qw(rev-list --count), "$last..$latest_cmt");
chomp(my $n = $unit->{git}->qx(@cmd));
return if $n ne '' && $n == 0;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 87b76501..cf44c95b 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -966,7 +966,7 @@ sub update_last_commit {
my $latest_cmt = $stk ? $stk->{latest_cmt} : ${$sync->{latest_cmt}};
defined($latest_cmt) or return;
my $last = last_epoch_commit($self, $unit->{epoch});
- if (defined $last && is_ancestor($unit->{git}, $last, $latest_cmt)) {
+ if (defined $last && is_ancestor($self->git, $last, $latest_cmt)) {
my @cmd = (qw(rev-list --count), "$last..$latest_cmt");
chomp(my $n = $unit->{git}->qx(@cmd));
return if $n ne '' && $n == 0;
@@ -1003,7 +1003,7 @@ sub log_range ($$$) {
my $range = "$cur..$tip";
$pr->("$i.git checking contiguity... ") if $pr;
my $git = $unit->{git};
- if (is_ancestor($git, $cur, $tip)) { # common case
+ if (is_ancestor($sync->{self}->git, $cur, $tip)) { # common case
$pr->("OK\n") if $pr;
my $n = $git->qx(qw(rev-list --count), $range);
chomp($n);
^ permalink raw reply related [relevance 6%]
* [PATCH 0/4] extindex: checkpoints, graceful shutdown, cleanups
@ 2020-11-13 11:11 5% Eric Wong
2020-11-13 11:11 6% ` [PATCH 2/4] *index: avoid per-epoch --batch-check processes Eric Wong
0 siblings, 1 reply; 4+ results
From: Eric Wong @ 2020-11-13 11:11 UTC (permalink / raw)
To: meta
Patches 1 and 4 should make long indexing runs more
user-friendly by being interrupt-friendly (via SIGINT, SIGQUIT,
or SIGTERM, just like read-only daemons).
I would've found this feature useful when dealing with unplanned
emergency shutdowns due to power outages. I may continue to
find it useful in the future since the power grid falling to
pieces and see more power outages.
Stealing UI ideas from git, SIGUSR1 also triggers a checkpoint
during indexing.
2 and 3 are just cleanups I've noticed along the way.
Eric Wong (4):
*index: checkpoints write last_commit metadata
*index: avoid per-epoch --batch-check processes
*index: discard sync->{todo} on iteration
extindex: support graceful shutdown via QUIT/INT/TERM
lib/PublicInbox/ExtSearchIdx.pm | 23 +++++++++----
lib/PublicInbox/IdxStack.pm | 18 +++++++---
lib/PublicInbox/SearchIdx.pm | 56 +++++++++++++++++--------------
lib/PublicInbox/SearchIdxShard.pm | 6 ++++
lib/PublicInbox/V2Writable.pm | 49 +++++++++++++++++++++------
t/idx_stack.t | 20 ++++++-----
6 files changed, 115 insertions(+), 57 deletions(-)
^ permalink raw reply [relevance 5%]
Results 1-4 of 4 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2020-11-13 11:11 5% [PATCH 0/4] extindex: checkpoints, graceful shutdown, cleanups Eric Wong
2020-11-13 11:11 6% ` [PATCH 2/4] *index: avoid per-epoch --batch-check processes Eric Wong
2020-11-13 12:38 7% ` Kyle Meyer
2020-11-15 3:03 7% ` Eric Wong
Code repositories for project(s) associated with this public inbox
https://80x24.org/public-inbox.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).