* [PATCH 5/5] fetch: support v2 w/o manifest on old WWW
From: Eric Wong @ 2021-09-24 10:56 UTC (permalink / raw)
To: meta
There may still be pre-manifest.js.gz versions of
PublicInbox::WWW running and serving v2 inboxes.

While -clone and "add-external --mirror" were working, -fetch
was failing because such servers respond with a 301 redirect to
$INBOX_URL/manifest.js.gz/ instead of the expected 404. Update
the code to deal with the JSON decode error (resulting from the
301) and ensure v2 epoch detection is correct (and not using a
shadowed variable).
---
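For readers following the decode change: below is a minimal sketch of the eval-or-fallback pattern described above, not the code in Fetch.pm. The names decode_manifest_sketch and manifest_result_sketch are hypothetical, JSON::PP stands in for the real decoder, and gunzipping is skipped. It assumes the decoder dies when handed a non-JSON body, such as the body of a redirect response.

```perl
use strict;
use warnings;
use JSON::PP (); # core module; a stand-in for the real manifest decoder

# Hypothetical stand-in for PublicInbox::LeiMirror::decode_manifest:
# dies unless the body decodes to a JSON object.
sub decode_manifest_sketch {
	my ($body) = @_;
	my $m = JSON::PP->new->decode($body); # croaks on malformed JSON
	ref($m) eq 'HASH' or die "manifest is not a JSON object\n";
	$m;
}

# Returns [ $code, $muri, ... ]; $muri sits in slot 1 for both outcomes,
# so the caller can unpack the result the same way in either case.
sub manifest_result_sketch {
	my ($body, $muri) = @_;
	my $m1 = eval { decode_manifest_sketch($body) }
		or return [ 404, $muri ]; # treat a decode failure like a 404
	[ 200, $muri, $m1 ];
}
```

Keeping $muri directly after the status code is what lets the 404 path return a short list without the caller needing a different unpacking order.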
lib/PublicInbox/Fetch.pm | 12 +++++++-----
t/v2mirror.t | 8 ++++++++
2 files changed, 15 insertions(+), 5 deletions(-)
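The shadowed variable mentioned in the commit message can be shown in isolation. This is a sketch under the assumption that @epochs is declared earlier in do_fetch, outside the hunk shown in this patch; git_epochs_sketch is a hypothetical stand-in for $mg->git_epochs and its return values are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $mg->git_epochs
sub git_epochs_sketch { (2, 1, 0) }

sub epochs_shadowed {
	my @epochs; # outer variable, read after the block
	{
		# "my" declares a new inner variable; the outer one stays empty
		my @epochs = git_epochs_sketch();
	}
	scalar(@epochs);
}

sub epochs_assigned {
	my @epochs;
	{
		# without "my", this assigns to the outer variable
		@epochs = git_epochs_sketch();
	}
	scalar(@epochs);
}
```

Perl does not warn about a "my" that masks a variable from an enclosing scope, only one in the same scope, so this kind of mistake passes "use warnings" silently.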
diff --git a/lib/PublicInbox/Fetch.pm b/lib/PublicInbox/Fetch.pm
index 7f60b619..7881b402 100644
--- a/lib/PublicInbox/Fetch.pm
+++ b/lib/PublicInbox/Fetch.pm
@@ -60,11 +60,13 @@ sub do_manifest ($$$) {
$opt->{$_} = $lei->{$_} for (0..2);
my $cerr = PublicInbox::LeiMirror::run_reap($lei, $curl_cmd, $opt);
if ($cerr) {
- return [ 404 ] if ($cerr >> 8) == 22; # 404 Missing
+ return [ 404, $muri ] if ($cerr >> 8) == 22; # 404 Missing
$lei->child_error($cerr, "@$curl_cmd failed");
return;
}
- my $m1 = PublicInbox::LeiMirror::decode_manifest($ft, $fn, $muri);
+ my $m1 = eval {
+ PublicInbox::LeiMirror::decode_manifest($ft, $fn, $muri);
+ } or return [ 404, $muri ];
my $mdiff = { %$m1 };
# filter out unchanged entries. We check modified, too, since
@@ -83,7 +85,7 @@ sub do_manifest ($$$) {
}
my (undef, $v1_path, @v2_epochs) =
PublicInbox::LeiMirror::deduce_epochs($mdiff, $ibx_uri->path);
- [ 200, $v1_path, \@v2_epochs, $muri, $ft, $mf, $m1 ];
+ [ 200, $muri, $v1_path, \@v2_epochs, $ft, $mf, $m1 ];
}
sub get_fingerprint2 {
@@ -106,7 +108,7 @@ sub do_fetch { # main entry point
} else { # v2:
require PublicInbox::MultiGit;
$mg = PublicInbox::MultiGit->new($dir, 'all.git', 'git');
- my @epochs = $mg->git_epochs;
+ @epochs = $mg->git_epochs;
my ($git_url, $epoch);
for my $nr (@epochs) { # try newest epoch, first
my $edir = "$dir/git/$nr.git";
@@ -135,7 +137,7 @@ EOM
PublicInbox::LeiMirror::write_makefile($dir, $ibx_ver);
$lei->qerr("# inbox URL: $ibx_uri/");
my $res = do_manifest($lei, $dir, $ibx_uri) or return;
- my ($code, $v1_path, $v2_epochs, $muri, $ft, $mf, $m1) = @$res;
+ my ($code, $muri, $v1_path, $v2_epochs, $ft, $mf, $m1) = @$res;
if ($code == 404) {
# any pre-manifest.js.gz instances running? Just fetch all
# existing ones and unconditionally try cloning the next
diff --git a/t/v2mirror.t b/t/v2mirror.t
index fa4a717d..a625646d 100644
--- a/t/v2mirror.t
+++ b/t/v2mirror.t
@@ -376,6 +376,14 @@ EOM
my @g_last = grep { -w $_ } glob("$dst/git/*.git");
is_deeply(\@g_last, [ $g_all[-1] ], 'partial clone of ~0 worked');
+ chmod(0755, $g_all[0]) or xbail "chmod $!";
+ my @before = glob("$g_all[0]/objects/*/*");
+ run_script([qw(-fetch -v)], undef, { -C => $dst, 2 => \($err = '') });
+ is($?, 0, 'scraping fetch on old PublicInbox::WWW') or diag $err;
+ my @after = glob("$g_all[0]/objects/*/*");
+ ok(scalar(@before) < scalar(@after),
+ 'fetched 0.git after enabling write-bit');
+
$td->join('TERM');
}
* [PATCH 0/5] clone|fetch: flesh out partial mirror support
From: Eric Wong @ 2021-09-24 10:56 UTC (permalink / raw)
To: meta
The --epoch=RANGE feature discussed last week[1] is implemented.
There are also a number of fixes and improvements for handling
partial fetches, continuing work started last week.

A significant amount of work also went into ensuring the
client-side code works against servers running old, pre-manifest
versions of public-inbox.

I'm not sure if there are pre-manifest.js.gz versions of
public-inbox out there, but manifest.js.gz support is only
~2 years old and I can understand if some admins have been
preoccupied with the pandemic and unable to upgrade :/

[1] https://public-inbox.org/meta/20210917002204.GA13112@dcvr/T/#u
Eric Wong (5):
  clone|--mirror: support --epoch=RANGE for partial clones
  fetch: fix skipping with multi-epoch inboxes
  clone|--mirror: fix and test against pre-manifest WWW
  clone|fetch|--mirror: cull manifest in partial mirrors
  fetch: support v2 w/o manifest on old WWW

Documentation/lei-add-external.pod | 15 +++
Documentation/public-inbox-clone.pod | 15 +++
lib/PublicInbox/Fetch.pm | 27 ++++--
lib/PublicInbox/LEI.pm | 2 +-
lib/PublicInbox/LeiMirror.pm | 130 +++++++++++++++++++++++---
lib/PublicInbox/TestCommon.pm | 1 +
script/public-inbox-clone | 3 +-
t/lei-mirror.t | 8 ++
t/v2mirror.t | 135 +++++++++++++++++++++++++--
9 files changed, 306 insertions(+), 30 deletions(-)