user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* Re: [PATCH 2/4] *index: avoid per-epoch --batch-check processes
  2020-11-13 12:38  7%   ` Kyle Meyer
@ 2020-11-15  3:03  7%     ` Eric Wong
  0 siblings, 0 replies; 4+ results
From: Eric Wong @ 2020-11-15  3:03 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> [ superficial typo comments from a reader who is out of his depth :) ]

They're greatly appreciated.  My brain is working worse than
usual these days, so extra eyes anywhere are appreciated.  And
trying to get my mind off crap by trying something new caused me
to hit my head, so maybe I'm concussed, too :x

> Eric Wong writes:
> 
> > Since all.git (v2) and ALL.git (extindex) encompass every single
> > epoch or indexed inbox; and we is_ancestor() only uses
> 
> s/we is_ancestor/is_ancestor/?

yup

> > hexadecimal OIDs, there is no good reason to to use $unit->{git}
> > for an epoch-local $git->check avoids redundant long-lived
> > processes.
> 
> s/to to/to/
> 
> I'm having trouble parsing that last part.  Perhaps the "avoids ..." is
> covered by the next paragraph and should be dropped?

Agreed, dropped that bit.

> s/an several/several/

Thanks

^ permalink raw reply	[relevance 7%]

* Re: [PATCH 2/4] *index: avoid per-epoch --batch-check processes
  2020-11-13 11:11  6% ` [PATCH 2/4] *index: avoid per-epoch --batch-check processes Eric Wong
@ 2020-11-13 12:38  7%   ` Kyle Meyer
  2020-11-15  3:03  7%     ` Eric Wong
  0 siblings, 1 reply; 4+ results
From: Kyle Meyer @ 2020-11-13 12:38 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

[ superficial typo comments from a reader who is out of his depth :) ]

Eric Wong writes:

> Since all.git (v2) and ALL.git (extindex) encompass every single
> epoch or indexed inbox; and we is_ancestor() only uses

s/we is_ancestor/is_ancestor/?

> hexadecimal OIDs, there is no good reason to to use $unit->{git}
> for an epoch-local $git->check avoids redundant long-lived
> processes.

s/to to/to/

I'm having trouble parsing that last part.  Perhaps the "avoids ..." is
covered by the next paragraph and should be dropped?

> This prevents dozens/hundreds of --batch-check processes from
> being left running after indexing and can improve locality
> if size checks are being done (since that uses --batch-check,
> too).
>
> Theoretically an several epochs may have conflicting OIDs, but

s/an several/several/

> we're screwed in those cases, anyways, so we might as well
> detect it earlier (though I'm not sure what the behavior would
> be :x).

^ permalink raw reply	[relevance 7%]

* [PATCH 2/4] *index: avoid per-epoch --batch-check processes
  2020-11-13 11:11  5% [PATCH 0/4] extindex: checkpoints, graceful shutdown, cleanups Eric Wong
@ 2020-11-13 11:11  6% ` Eric Wong
  2020-11-13 12:38  7%   ` Kyle Meyer
  0 siblings, 1 reply; 4+ results
From: Eric Wong @ 2020-11-13 11:11 UTC (permalink / raw)
  To: meta

Since all.git (v2) and ALL.git (extindex) encompass every single
epoch or indexed inbox; and we is_ancestor() only uses
hexadecimal OIDs, there is no good reason to to use $unit->{git}
for an epoch-local $git->check avoids redundant long-lived
processes.

This prevents dozens/hundreds of --batch-check processes from
being left running after indexing and can improve locality
if size checks are being done (since that uses --batch-check,
too).

Theoretically an several epochs may have conflicting OIDs, but
we're screwed in those cases, anyways, so we might as well
detect it earlier (though I'm not sure what the behavior would
be :x).
---
 lib/PublicInbox/ExtSearchIdx.pm | 2 +-
 lib/PublicInbox/V2Writable.pm   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 14ffdadb..2d230dc1 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -361,7 +361,7 @@ sub update_last_commit { # overrides V2Writable
 		die "Unsupported inbox version: $v";
 	}
 	my $last = $self->{oidx}->eidx_meta($meta_key);
-	if (defined $last && is_ancestor($unit->{git}, $last, $latest_cmt)) {
+	if (defined $last && is_ancestor($self->git, $last, $latest_cmt)) {
 		my @cmd = (qw(rev-list --count), "$last..$latest_cmt");
 		chomp(my $n = $unit->{git}->qx(@cmd));
 		return if $n ne '' && $n == 0;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 87b76501..cf44c95b 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -966,7 +966,7 @@ sub update_last_commit {
 	my $latest_cmt = $stk ? $stk->{latest_cmt} : ${$sync->{latest_cmt}};
 	defined($latest_cmt) or return;
 	my $last = last_epoch_commit($self, $unit->{epoch});
-	if (defined $last && is_ancestor($unit->{git}, $last, $latest_cmt)) {
+	if (defined $last && is_ancestor($self->git, $last, $latest_cmt)) {
 		my @cmd = (qw(rev-list --count), "$last..$latest_cmt");
 		chomp(my $n = $unit->{git}->qx(@cmd));
 		return if $n ne '' && $n == 0;
@@ -1003,7 +1003,7 @@ sub log_range ($$$) {
 	my $range = "$cur..$tip";
 	$pr->("$i.git checking contiguity... ") if $pr;
 	my $git = $unit->{git};
-	if (is_ancestor($git, $cur, $tip)) { # common case
+	if (is_ancestor($sync->{self}->git, $cur, $tip)) { # common case
 		$pr->("OK\n") if $pr;
 		my $n = $git->qx(qw(rev-list --count), $range);
 		chomp($n);

^ permalink raw reply related	[relevance 6%]

* [PATCH 0/4] extindex: checkpoints, graceful shutdown, cleanups
@ 2020-11-13 11:11  5% Eric Wong
  2020-11-13 11:11  6% ` [PATCH 2/4] *index: avoid per-epoch --batch-check processes Eric Wong
  0 siblings, 1 reply; 4+ results
From: Eric Wong @ 2020-11-13 11:11 UTC (permalink / raw)
  To: meta

Patches 1 and 4 should make long indexing runs more
user-friendly by being interrupt-friendly (via SIGINT, SIGQUIT,
or SIGTERM, just like read-only daemons).

I would've found this feature useful when dealing with unplanned
emergency shutdowns due to power outages.   I may continue to
find it useful in the future since the power grid falling to
pieces and see more power outages.

Stealing UI ideas from git, SIGUSR1 also triggers a checkpoint
during indexing.

2 and 3 are just cleanups I've noticed along the way.

Eric Wong (4):
  *index: checkpoints write last_commit metadata
  *index: avoid per-epoch --batch-check processes
  *index: discard sync->{todo} on iteration
  extindex: support graceful shutdown via QUIT/INT/TERM

 lib/PublicInbox/ExtSearchIdx.pm   | 23 +++++++++----
 lib/PublicInbox/IdxStack.pm       | 18 +++++++---
 lib/PublicInbox/SearchIdx.pm      | 56 +++++++++++++++++--------------
 lib/PublicInbox/SearchIdxShard.pm |  6 ++++
 lib/PublicInbox/V2Writable.pm     | 49 +++++++++++++++++++++------
 t/idx_stack.t                     | 20 ++++++-----
 6 files changed, 115 insertions(+), 57 deletions(-)

^ permalink raw reply	[relevance 5%]

Results 1-4 of 4 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2020-11-13 11:11  5% [PATCH 0/4] extindex: checkpoints, graceful shutdown, cleanups Eric Wong
2020-11-13 11:11  6% ` [PATCH 2/4] *index: avoid per-epoch --batch-check processes Eric Wong
2020-11-13 12:38  7%   ` Kyle Meyer
2020-11-15  3:03  7%     ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).