user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 5/5] fetch: support v2 w/o manifest on old WWW
Date: Fri, 24 Sep 2021 10:56:45 +0000
Message-ID: <20210924105645.8627-6-e@80x24.org>
In-Reply-To: <20210924105645.8627-1-e@80x24.org>

There may still be pre-manifest.js.gz versions of
PublicInbox::WWW running and serving v2 inboxes.

While -clone and "add-external --mirror" were working, -fetch
was failing because such instances answer with a 301 redirect
to $INBOX_URL/manifest.js.gz/ instead of the expected 404.
Update the code to treat a JSON decode error (caused by the 301)
like a 404, and ensure v2 epoch detection is correct by no
longer assigning to a shadowing "my @epochs".
---
 lib/PublicInbox/Fetch.pm | 12 +++++++-----
 t/v2mirror.t             |  8 ++++++++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/Fetch.pm b/lib/PublicInbox/Fetch.pm
index 7f60b619..7881b402 100644
--- a/lib/PublicInbox/Fetch.pm
+++ b/lib/PublicInbox/Fetch.pm
@@ -60,11 +60,13 @@ sub do_manifest ($$$) {
 	$opt->{$_} = $lei->{$_} for (0..2);
 	my $cerr = PublicInbox::LeiMirror::run_reap($lei, $curl_cmd, $opt);
 	if ($cerr) {
-		return [ 404 ] if ($cerr >> 8) == 22; # 404 Missing
+		return [ 404, $muri ] if ($cerr >> 8) == 22; # 404 Missing
 		$lei->child_error($cerr, "@$curl_cmd failed");
 		return;
 	}
-	my $m1 = PublicInbox::LeiMirror::decode_manifest($ft, $fn, $muri);
+	my $m1 = eval {
+		PublicInbox::LeiMirror::decode_manifest($ft, $fn, $muri);
+	} or return [ 404, $muri ];
 	my $mdiff = { %$m1 };
 
 	# filter out unchanged entries.  We check modified, too, since
@@ -83,7 +85,7 @@ sub do_manifest ($$$) {
 	}
 	my (undef, $v1_path, @v2_epochs) =
 		PublicInbox::LeiMirror::deduce_epochs($mdiff, $ibx_uri->path);
-	[ 200, $v1_path, \@v2_epochs, $muri, $ft, $mf, $m1 ];
+	[ 200, $muri, $v1_path, \@v2_epochs, $ft, $mf, $m1 ];
 }
 
 sub get_fingerprint2 {
@@ -106,7 +108,7 @@ sub do_fetch { # main entry point
 	} else { # v2:
 		require PublicInbox::MultiGit;
 		$mg = PublicInbox::MultiGit->new($dir, 'all.git', 'git');
-		my @epochs = $mg->git_epochs;
+		@epochs = $mg->git_epochs;
 		my ($git_url, $epoch);
 		for my $nr (@epochs) { # try newest epoch, first
 			my $edir = "$dir/git/$nr.git";
@@ -135,7 +137,7 @@ EOM
 	PublicInbox::LeiMirror::write_makefile($dir, $ibx_ver);
 	$lei->qerr("# inbox URL: $ibx_uri/");
 	my $res = do_manifest($lei, $dir, $ibx_uri) or return;
-	my ($code, $v1_path, $v2_epochs, $muri, $ft, $mf, $m1) = @$res;
+	my ($code, $muri, $v1_path, $v2_epochs, $ft, $mf, $m1) = @$res;
 	if ($code == 404) {
 		# any pre-manifest.js.gz instances running? Just fetch all
 		# existing ones and unconditionally try cloning the next
diff --git a/t/v2mirror.t b/t/v2mirror.t
index fa4a717d..a625646d 100644
--- a/t/v2mirror.t
+++ b/t/v2mirror.t
@@ -376,6 +376,14 @@ EOM
 	my @g_last = grep { -w $_ } glob("$dst/git/*.git");
 	is_deeply(\@g_last, [ $g_all[-1] ], 'partial clone of ~0 worked');
 
+	chmod(0755, $g_all[0]) or xbail "chmod $!";
+	my @before = glob("$g_all[0]/objects/*/*");
+	run_script([qw(-fetch -v)], undef, { -C => $dst, 2 => \($err = '') });
+	is($?, 0, 'scraping fetch on old PublicInbox::WWW') or diag $err;
+	my @after = glob("$g_all[0]/objects/*/*");
+	ok(scalar(@before) < scalar(@after),
+		'fetched 0.git after enabling write-bit');
+
 	$td->join('TERM');
 }
 

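The decode fallback above can be illustrated in isolation. The following is only a minimal sketch, not the code from Fetch.pm: decode_body and classify_manifest are hypothetical names, the body is assumed to be plain (already uncompressed) JSON, and JSON::PP stands in for whatever decoder the real decode_manifest uses. It shows how wrapping a decoder that dies in eval lets a non-JSON response (such as the HTML body of a 301) be handled the same way as a 404.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP; # in core since perl 5.14

# Hypothetical stand-in for a manifest decoder: dies on anything
# that is not a JSON object.
sub decode_body {
	my ($body) = @_;
	my $m = JSON::PP->new->decode($body); # croaks on malformed JSON
	ref($m) eq 'HASH' or die "manifest is not a JSON object\n";
	$m;
}

# Returns [ 200, $uri, $manifest ] on success, or [ 404, $uri ]
# when the body cannot be decoded as a manifest.
sub classify_manifest {
	my ($uri, $body) = @_;
	# eval returns undef if decode_body dies, triggering the fallback
	my $m = eval { decode_body($body) } or return [ 404, $uri ];
	[ 200, $uri, $m ];
}

# Example URI and bodies are made up for illustration.
my $uri = 'https://example.com/inbox/manifest.js.gz';
my $new = classify_manifest($uri, '{"/inbox/git/0.git":{"modified":1}}');
my $old = classify_manifest($uri, '<html>301 Moved Permanently</html>');
print "$new->[0] $old->[0]\n"; # prints "200 404"
```

Keeping the URI in the second slot for both result shapes mirrors the reordering in this patch, so a caller can unpack the code and URI the same way regardless of which branch was taken.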