user/dev discussion of public-inbox itself
From: "Eric Wong (Contractor, The Linux Foundation)" <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 2/4] import: rewrite less history during purge
Date: Wed,  4 Apr 2018 21:24:58 +0000
Message-ID: <20180404212500.1859-3-e@80x24.org>
In-Reply-To: <20180404212500.1859-1-e@80x24.org>

We do not need to rewrite old commits unaffected by the object_id
purge, only newer commits.  This was a state management bug :x

We will also return the new commit ID of rewritten history to
aid in incremental indexing of mirrors for the next change.
---
 lib/PublicInbox/Import.pm     | 25 ++++++++++++++++++-------
 lib/PublicInbox/V2Writable.pm |  6 ++++--
 t/v2writable.t                |  3 ++-
 3 files changed, 24 insertions(+), 10 deletions(-)
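
Editorial sketch (not part of the patch): the stream-filtering idea in
the Import.pm hunks below can be illustrated in isolation. This is a
simplified, hypothetical example that assumes the input resembles
`git fast-export` output held in a string. `filter_stream` and the
sample object IDs are made up for illustration; the real code reads
from a pipe and calls clean_purge_buffer instead of dropping lines.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of public-inbox.
# $purge = { $object_id => 1, ... }
sub filter_stream {
	my ($stream, $purge) = @_;
	my (@buf, @out, $done, $mark);
	my $npurge = 0;
	my $hits = 0; # purged blobs seen in the current commit block
	foreach (split(/^/m, $stream)) {
		if (/^M \d+ ([a-f0-9]+) /) {
			if ($purge->{$1}) {
				$hits++; # drop the line, flag this block
			} else {
				push @buf, $_;
			}
		} elsif ($_ eq "\n") { # blank line ends a commit block
			$npurge++ if $hits; # count only affected blocks
			push @out, @buf, "\n";
			@buf = ();
			$hits = 0;
		} elsif ($_ eq "done\n") {
			$done = 1; # consumed, not forwarded
		} elsif (/^mark :(\d+)$/) {
			push @buf, $_;
			$mark = $1; # last mark seen is the newest commit
		} else {
			push @buf, $_;
		}
	}
	push @out, @buf;
	die "done not seen\n" unless $done;
	(join('', @out), $npurge, $mark);
}

my $stream = join('',
	"commit refs/heads/master\n", "mark :1\n",
	"M 100644 aaaa m/keep\n", "\n",
	"commit refs/heads/master\n", "mark :2\n",
	"M 100644 bbbb m/spam\n", "\n",
	"done\n");
my ($out, $npurge, $mark) = filter_stream($stream, { bbbb => 1 });
print "rewrote $npurge commit(s); newest mark is :$mark\n";
# prints "rewrote 1 commit(s); newest mark is :2"
```

Only the second block is counted because only it references a purged
object ID. The last mark identifies the newest commit, which is what
the patch resolves to a commit ID via get_mark when $npurge is nonzero.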

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index b2aae9a..73290ee 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -476,6 +476,7 @@ sub purge_oids {
 	my @buf;
 	my $npurge = 0;
 	my @oids;
+	my ($done, $mark);
 	my $tree = $self->{-tree};
 	while (<$rd>) {
 		if (/^reset (?:.+)/) {
@@ -506,14 +507,20 @@ sub purge_oids {
 			my $path = $1;
 			push @buf, $_ if $tree->{$path};
 		} elsif ($_ eq "\n") {
-			my $out = join('', @buf);
-			$out =~ s/^/# /sgm;
-			warn "purge rewriting\n", $out, "\n";
-			clean_purge_buffer(\@oids, \@buf);
-			$out = join('', @buf);
+			if (@oids) {
+				my $out = join('', @buf);
+				$out =~ s/^/# /sgm;
+				warn "purge rewriting\n", $out, "\n";
+				clean_purge_buffer(\@oids, \@buf);
+				$npurge++;
+			}
 			$w->print(@buf, "\n") or wfail;
 			@buf = ();
-			$npurge++;
+		} elsif ($_ eq "done\n") {
+			$done = 1;
+		} elsif (/^mark :(\d+)$/) {
+			push @buf, $_;
+			$mark = $1;
 		} else {
 			push @buf, $_;
 		}
@@ -521,7 +528,9 @@ sub purge_oids {
 	if (@buf) {
 		$w->print(@buf) or wfail;
 	}
-	$w = $r = undef;
+	die 'done\n not seen from fast-export' unless $done;
+	chomp(my $cmt = $self->get_mark(":$mark")) if $npurge;
+	$self->{nchg} = 0; # prevent _update_git_info until update-ref:
 	$self->done;
 	my @git = ('git', "--git-dir=$git->{git_dir}");
 
@@ -540,7 +549,9 @@ sub purge_oids {
 			$err++;
 		}
 	}
+	_update_git_info($self, 0);
 	die "Failed to purge $err object(s)\n" if $err;
+	$cmt;
 }
 
 1;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 479e2b5..b6532ac 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -224,11 +224,13 @@ sub purge_oids {
 	my ($self, $purge) = @_; # $purge = { $object_id => 1, ... }
 	$self->done;
 	my $pfx = "$self->{-inbox}->{mainrepo}/git";
+	my $purges = [];
 	foreach my $i (0..$self->{max_git}) {
 		my $git = PublicInbox::Git->new("$pfx/$i.git");
 		my $im = $self->import_init($git, 0);
-		$im->purge_oids($purge);
+		$purges->[$i] = $im->purge_oids($purge);
 	}
+	$purges;
 }
 
 sub remove_internal {
@@ -285,7 +287,7 @@ sub remove_internal {
 		$self->barrier;
 	}
 	if ($purge && scalar keys %$purge) {
-		purge_oids($self, $purge);
+		return purge_oids($self, $purge);
 	}
 	$removed;
 }
diff --git a/t/v2writable.t b/t/v2writable.t
index 2f83977..e49c06b 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -248,7 +248,8 @@ EOF
 {
 	ok($im->add($mime), 'add message to be purged');
 	local $SIG{__WARN__} = sub {};
-	ok($im->purge($mime), 'purged message');
+	ok(my $cmts = $im->purge($mime), 'purged message');
+	like($cmts->[0], qr/\A[a-f0-9]{40}\z/, 'purge returned current commit');
 	$im->done;
 }
 
-- 
EW


Thread overview:
2018-04-04 21:24 [PATCH 0/4] incremental indexing support for mirrors Eric Wong (Contractor, The Linux Foundation)
2018-04-04 21:24 ` [PATCH 1/4] init: s/GIT_DIR/REPO_DIR/ in usage Eric Wong (Contractor, The Linux Foundation)
2018-04-04 21:24 ` Eric Wong (Contractor, The Linux Foundation) [this message]
2018-04-04 21:24 ` [PATCH 3/4] v2: support incremental indexing + purge Eric Wong (Contractor, The Linux Foundation)
2018-04-04 21:25 ` [PATCH 4/4] v2writable: do not modify DBs while iterating for ->remove Eric Wong (Contractor, The Linux Foundation)

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git