user/dev discussion of public-inbox itself
* [PATCH 0/4] grokmirror-compatible manifests
@ 2019-06-09  4:31 Eric Wong (Contractor, The Linux Foundation)
  2019-06-09  4:31 ` [PATCH 1/4] wwwlisting: allow hiding entries from manifest Eric Wong (Contractor, The Linux Foundation)
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2019-06-09  4:31 UTC
  To: meta

Maintaining mirrors is a pain, especially for v2 repos
and multiple epochs.  So support both per-domain matching
and per-inbox manifests which can be fed to grok-pull(1).

https://git.kernel.org/pub/scm/utils/grokmirror/grokmirror.git
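
As a rough illustration of the per-domain matching mentioned above,
here is a minimal sketch modeled on the pattern in list_match_domain()
from patch 1/4.  The host_re helper and the example.com/example.org
URLs are hypothetical and not part of this series:

```perl
use strict;
use warnings;

# Build the same kind of pattern list_match_domain() uses: the scheme is
# optional, any port is ignored, and the host compares case-insensitively.
sub host_re {
	my ($host) = @_;
	$host =~ s/:[0-9]+\z//; # drop any port from the Host: header
	qr!\A(?:https?:)?//\Q$host\E(?::[0-9]+)?/!i;
}

my $re = host_re('example.com:8080'); # hypothetical request host
my @urls = qw(
	http://example.com/meta
	https://EXAMPLE.com:443/test
	//example.com/v2
	http://example.org/other
);
# keep only the inbox URLs which belong to the requested host
my @matched = grep { $_ =~ $re } @urls;
print "$_\n" for @matched;
```

Only inboxes whose configured URL passes this kind of check would end
up in the manifest served for that hostname.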

Eric Wong (Contractor, The Linux Foundation) (4):
  wwwlisting: allow hiding entries from manifest
  wwwlisting: generate grokmirror-compatible manifest.js.gz
  www: wire up /$INBOX/manifest.js.gz, too
  www: support $INBOX/git/$EPOCH.git for v2 cloning

 MANIFEST                      |   1 +
 lib/PublicInbox/WWW.pm        |  17 +++-
 lib/PublicInbox/WwwListing.pm | 174 +++++++++++++++++++++++++++++-----
 t/psgi_v2.t                   |   2 +
 t/v2mirror.t                  |   8 +-
 t/www_listing.t               | 158 ++++++++++++++++++++++++++++++
 6 files changed, 330 insertions(+), 30 deletions(-)
 create mode 100644 t/www_listing.t

-- 
EW

* [PATCH 1/4] wwwlisting: allow hiding entries from manifest
  2019-06-09  4:31 [PATCH 0/4] grokmirror-compatible manifests Eric Wong (Contractor, The Linux Foundation)
@ 2019-06-09  4:31 ` Eric Wong (Contractor, The Linux Foundation)
  2019-06-09  4:31 ` [PATCH 2/4] wwwlisting: generate grokmirror-compatible manifest.js.gz Eric Wong (Contractor, The Linux Foundation)
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2019-06-09  4:31 UTC
  To: meta

Since we already have a mechanism for hiding repositories from
the WWW listing, we might as well support another one for hiding
repositories from the upcoming manifest.js.gz generation.
---
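A minimal sketch of the filtering idea, assuming each inbox is a plain
hash with a "-hide" sub-hash as in the diff below.  The inbox names and
the visible_for helper are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical inboxes; only the -hide layout mirrors the patch.
my @inboxes = (
	{ name => 'meta', -hide => {} },
	{ name => 'private-ish', -hide => { www => 1 } },
	{ name => 'no-mirror', -hide => { manifest => 1 } },
);

# Return the names of inboxes which are not hidden under $hide_key.
sub visible_for {
	my ($hide_key, @ibx) = @_;
	map { $_->{name} } grep { !$_->{-hide}->{$hide_key} } @ibx;
}

my @www = visible_for('www', @inboxes);
my @manifest = visible_for('manifest', @inboxes);
print "www: @www\n";
print "manifest: @manifest\n";
```

The same list callback can then serve both the HTML listing and the
manifest, differing only in which key it checks.
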
 lib/PublicInbox/WwwListing.pm | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/WwwListing.pm b/lib/PublicInbox/WwwListing.pm
index e1473b3..6d6d301 100644
--- a/lib/PublicInbox/WwwListing.pm
+++ b/lib/PublicInbox/WwwListing.pm
@@ -10,25 +10,27 @@ use PublicInbox::Hval qw(ascii_html);
 use PublicInbox::Linkify;
 use PublicInbox::View;
 
-sub list_all ($$) {
-	my ($self, undef) = @_;
+sub list_all ($$$) {
+	my ($self, $env, $hide_key) = @_;
 	my @list;
 	$self->{pi_config}->each_inbox(sub {
 		my ($ibx) = @_;
-		push @list, $ibx unless $ibx->{-hide}->{www};
+		push @list, $ibx unless $ibx->{-hide}->{$hide_key};
 	});
 	\@list;
 }
 
-sub list_match_domain ($$) {
-	my ($self, $env) = @_;
+sub list_match_domain ($$$) {
+	my ($self, $env, $hide_key) = @_;
 	my @list;
 	my $host = $env->{HTTP_HOST} // $env->{SERVER_NAME};
 	$host =~ s/:[0-9]+\z//;
 	my $re = qr!\A(?:https?:)?//\Q$host\E(?::[0-9]+)?/!i;
 	$self->{pi_config}->each_inbox(sub {
 		my ($ibx) = @_;
-		push @list, $ibx if !$ibx->{-hide}->{www} && $ibx->{url} =~ $re;
+		if (!$ibx->{-hide}->{$hide_key} && $ibx->{url} =~ $re) {
+			push @list, $ibx;
+		}
 	});
 	\@list;
 }
@@ -78,7 +80,11 @@ sub ibx_entry {
 sub call {
 	my ($self, $env) = @_;
 	my $h = [ 'Content-Type', 'text/html; charset=UTF-8' ];
-	my $list = $self->{list_cb}->($self, $env);
+	my $hide_key = 'www';
+	if ($env->{PATH_INFO} =~ m!/manifest\.js(?:\.gz)\z/!) {
+		$hide_key = 'manifest';
+	}
+	my $list = $self->{list_cb}->($self, $env, $hide_key);
 	my $code = 404;
 	my $title = 'public-inbox';
 	my $out = '';
-- 
EW


* [PATCH 2/4] wwwlisting: generate grokmirror-compatible manifest.js.gz
  2019-06-09  4:31 [PATCH 0/4] grokmirror-compatible manifests Eric Wong (Contractor, The Linux Foundation)
  2019-06-09  4:31 ` [PATCH 1/4] wwwlisting: allow hiding entries from manifest Eric Wong (Contractor, The Linux Foundation)
@ 2019-06-09  4:31 ` Eric Wong (Contractor, The Linux Foundation)
  2019-06-09  4:31 ` [PATCH 3/4] www: wire up /$INBOX/manifest.js.gz, too Eric Wong (Contractor, The Linux Foundation)
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2019-06-09  4:31 UTC
  To: meta

Support on-demand generation of "/manifest.js.gz" for inboxes.
By default, this matches inboxes whose URLs match the
request hostname.

This makes it easier to create full mirrors of several inboxes
without needing to configure static file serving.

cf. https://git.kernel.org/pub/scm/utils/grokmirror/grokmirror.git
---
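To make the manifest layout concrete, here is a minimal sketch of one
entry being built, encoded and decoded again.  The field names follow
manifest_add() in the diff below; the show-ref text, the "/meta" path
and the timestamp are invented.  It uses JSON::PP directly instead of
probing several JSON modules, and ->canonical is an addition here for
stable output:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use JSON::PP ();
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Stand-in for `git show-ref` output; the real code reads it from git.
my $show_ref = "0123456789abcdef0123456789abcdef01234567 refs/heads/master\n";

# Keys are URL paths relative to the site root.  All values are invented.
my $manifest = {
	'/meta' => {
		owner => undef,
		reference => undef,
		description => 'Unnamed repository',
		modified => 1560054660,
		fingerprint => sha1_hex($show_ref), # SHA-1 of the ref listing
	},
};

# ->ascii escapes non-ASCII as \uXXXX, ->canonical sorts hash keys
my $json = JSON::PP->new->ascii(1)->canonical(1);
my $js = $json->encode($manifest);
my ($gz, $plain);
gzip(\$js => \$gz) or die "gzip failed: $GzipError";

# What a client such as grok-pull would do on the other end:
gunzip(\$gz => \$plain) or die "gunzip failed: $GunzipError";
my $decoded = $json->decode($plain);
print "$_ => $decoded->{$_}->{fingerprint}\n" for sort keys %$decoded;
```

Since the fingerprint is a digest of the ref listing, two repositories
with identical refs get the same fingerprint, which the new test relies
on for the shared clone.
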
 MANIFEST                      |   1 +
 lib/PublicInbox/WWW.pm        |   2 +-
 lib/PublicInbox/WwwListing.pm | 164 +++++++++++++++++++++++++++++-----
 t/www_listing.t               | 138 ++++++++++++++++++++++++++++
 4 files changed, 282 insertions(+), 23 deletions(-)
 create mode 100644 t/www_listing.t

diff --git a/MANIFEST b/MANIFEST
index 5085bff..9a88f13 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -258,3 +258,4 @@ t/view.t
 t/watch_filter_rubylang.t
 t/watch_maildir.t
 t/watch_maildir_v2.t
+t/www_listing.t
diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 7ea9820..614adad 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -88,7 +88,7 @@ sub call {
 	}
 
 	# top-level indices and feeds
-	if ($path_info eq '/') {
+	if ($path_info eq '/' || $path_info eq '/manifest.js.gz') {
 		www_listing($self)->call($env);
 	} elsif ($path_info =~ m!$INBOX_RE\z!o) {
 		invalid_inbox($ctx, $1) || r301($ctx, $1);
diff --git a/lib/PublicInbox/WwwListing.pm b/lib/PublicInbox/WwwListing.pm
index 6d6d301..690976a 100644
--- a/lib/PublicInbox/WwwListing.pm
+++ b/lib/PublicInbox/WwwListing.pm
@@ -9,6 +9,11 @@ use warnings;
 use PublicInbox::Hval qw(ascii_html);
 use PublicInbox::Linkify;
 use PublicInbox::View;
+use bytes ();
+use HTTP::Date qw(time2str);
+require Digest::SHA;
+require File::Spec;
+{ no warnings 'once'; *try_cat = *PublicInbox::Inbox::try_cat };
 
 sub list_all ($$$) {
 	my ($self, $env, $hide_key) = @_;
@@ -44,21 +49,27 @@ my %VALID = (
 	404 => *list_404,
 );
 
+sub set_cb ($$$) {
+	my ($pi_config, $k, $default) = @_;
+	my $v = $pi_config->{lc $k} // $default;
+	$VALID{$v} || do {
+		warn <<"";
+`$v' is not a valid value for `$k'
+$k must be one of `all', `match=domain', or `404'
+
+		$VALID{$default};
+	};
+}
+
 sub new {
 	my ($class, $www) = @_;
-	my $k = 'publicinbox.wwwListing';
 	my $pi_config = $www->{pi_config};
-	my $v = $pi_config->{lc($k)} // 404;
 	bless {
 		pi_config => $pi_config,
 		style => $www->style("\0"),
-		list_cb => $VALID{$v} || do {
-			warn <<"";
-`$v' is not a valid value for `$k'
-$k be one of `all', `match=domain', or `404'
-
-			*list_404;
-		},
+		www_cb => set_cb($pi_config, 'publicInbox.wwwListing', 404),
+		manifest_cb => set_cb($pi_config, 'publicInbox.grokManifest',
+					'match=domain'),
 	}, $class;
 }
 
@@ -76,26 +87,20 @@ sub ibx_entry {
 	$tmp;
 }
 
-# not really a stand-alone PSGI app, but maybe it could be...
-sub call {
-	my ($self, $env) = @_;
-	my $h = [ 'Content-Type', 'text/html; charset=UTF-8' ];
-	my $hide_key = 'www';
-	if ($env->{PATH_INFO} =~ m!/manifest\.js(?:\.gz)\z/!) {
-		$hide_key = 'manifest';
-	}
-	my $list = $self->{list_cb}->($self, $env, $hide_key);
-	my $code = 404;
+sub html ($$) {
+	my ($env, $list) = @_;
 	my $title = 'public-inbox';
 	my $out = '';
+	my $code = 404;
 	if (@$list) {
+		$title .= ' - listing';
+		$code = 200;
+
 		# Swartzian transform since ->modified is expensive
 		@$list = sort {
 			$b->[0] <=> $a->[0]
 		} map { [ $_->modified, $_ ] } @$list;
 
-		$code = 200;
-		$title .= ' - listing';
 		my $tmp = join("\n", map { ibx_entry(@$_, $env) } @$list);
 		my $l = PublicInbox::Linkify->new;
 		$l->linkify_1($tmp);
@@ -104,7 +109,122 @@ sub call {
 	$out = "<html><head><title>$title</title></head><body>" . $out;
 	$out .= '<pre>'. PublicInbox::WwwStream::code_footer($env) .
 		'</pre></body></html>';
-	[ $code, $h, [ $out ] ]
+
+	my $h = [ 'Content-Type', 'text/html; charset=UTF-8' ];
+	[ $code, $h, [ $out ] ];
+}
+
+my $json;
+sub _json () {
+	for my $mod (qw(JSON::MaybeXS JSON JSON::PP)) {
+		eval "require $mod" or next;
+		# ->ascii encodes non-ASCII to "\uXXXX"
+		return $mod->new->ascii(1);
+	}
+	die;
+}
+
+sub fingerprint ($) {
+	my ($git) = @_;
+	my $fh = $git->popen('show-ref') or
+		die "popen($git->{git_dir} show-ref) failed: $!";
+
+	my $dig = Digest::SHA->new(1);
+	while (read($fh, my $buf, 65536)) {
+		$dig->add($buf);
+	}
+	close $fh;
+	return if $?; # empty, uninitialized git repo
+	$dig->hexdigest;
+}
+
+sub manifest_add ($$;$) {
+	my ($manifest, $ibx, $epoch) = @_;
+	my $url_path = "/$ibx->{name}";
+	my $git_dir = $ibx->{mainrepo};
+	if (defined $epoch) {
+		$git_dir .= "/git/$epoch.git";
+		$url_path .= "/$epoch";
+	}
+	return unless -d $git_dir;
+	my $git = PublicInbox::Git->new($git_dir);
+	my $fingerprint = fingerprint($git) or return; # no empty repos
+
+	chomp(my $owner = $git->qx('config', 'gitweb.owner'));
+	chomp(my $desc = try_cat("$git_dir/description"));
+	$owner = undef if $owner eq '';
+	$desc = 'Unnamed repository' if $desc eq '';
+
+	my $reference;
+	chomp(my $alt = try_cat("$git_dir/objects/info/alternates"));
+	if ($alt) {
+		# n.b.: GitPython doesn't seem to handle comments or C-quoted
+		# strings like native git does; and we don't for now, either.
+		my @alt = split(/\n+/, $alt);
+
+		# grokmirror only supports 1 alternate for "reference"
+		if (scalar(@alt) == 1) {
+			my $objdir = "$git_dir/objects";
+			$reference = File::Spec->rel2abs($alt[0], $objdir);
+			$reference =~ s!/[^/]+/?\z!!; # basename
+		}
+	}
+	$manifest->{-abs2urlpath}->{$git_dir} = $url_path;
+	my $modified = $git->modified;
+	if ($modified > $manifest->{-mtime}) {
+		$manifest->{-mtime} = $modified;
+	}
+	$manifest->{$url_path} = {
+		owner => $owner,
+		reference => $reference,
+		description => $desc,
+		modified => $modified,
+		fingerprint => $fingerprint,
+	};
+}
+
+# manifest.js.gz
+sub js ($$) {
+	my ($env, $list) = @_;
+	eval { require IO::Compress::Gzip } or return [ 404, [], [] ];
+
+	my $manifest = { -abs2urlpath => {}, -mtime => 0 };
+	for my $ibx (@$list) {
+		if (defined(my $max = $ibx->max_git_part)) {
+			for my $epoch (0..$max) {
+				manifest_add($manifest, $ibx, $epoch);
+			}
+		} else {
+			manifest_add($manifest, $ibx);
+		}
+	}
+	my $abs2urlpath = delete $manifest->{-abs2urlpath};
+	my $mtime = delete $manifest->{-mtime};
+	while (my ($url_path, $repo) = each %$manifest) {
+		defined(my $abs = $repo->{reference}) or next;
+		$repo->{reference} = $abs2urlpath->{$abs};
+	}
+	my $out;
+	IO::Compress::Gzip::gzip(\(($json ||= _json())->encode($manifest)) =>
+				 \$out);
+	$manifest = undef;
+	[ 200, [ qw(Content-Type application/gzip),
+		 'Last-Modified', time2str($mtime),
+		 'Content-Length', bytes::length($out) ], [ $out ] ];
+}
+
+# not really a stand-alone PSGI app, but maybe it could be...
+sub call {
+	my ($self, $env) = @_;
+
+	if ($env->{PATH_INFO} eq '/manifest.js.gz') {
+		# grokmirror uses relative paths, so it's domain-dependent
+		my $list = $self->{manifest_cb}->($self, $env, 'manifest');
+		js($env, $list);
+	} else { # /
+		my $list = $self->{www_cb}->($self, $env, 'www');
+		html($env, $list);
+	}
 }
 
 1;
diff --git a/t/www_listing.t b/t/www_listing.t
new file mode 100644
index 0000000..f9d543e
--- /dev/null
+++ b/t/www_listing.t
@@ -0,0 +1,138 @@
+# Copyright (C) 2019 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# manifest.js.gz generation and grok-pull integration test
+use strict;
+use warnings;
+use Test::More;
+use PublicInbox::Spawn qw(which);
+use File::Temp qw/tempdir/;
+require './t/common.perl';
+my @mods = qw(URI::Escape Plack::Builder IPC::Run Digest::SHA HTTP::Tiny
+		IO::Compress::Gzip IO::Uncompress::Gunzip Net::HTTP);
+foreach my $mod (@mods) {
+	eval("require $mod") or plan skip_all => "$mod missing for $0";
+}
+use_ok 'PublicInbox::WwwListing';
+use_ok 'PublicInbox::Git';
+
+my $fi_data = './t/git.fast-import-data';
+my $tmpdir = tempdir('www_listing-tmp-XXXXXX', TMPDIR => 1, CLEANUP => 1);
+my $bare = PublicInbox::Git->new("$tmpdir/bare.git");
+is(system(qw(git init -q --bare), $bare->{git_dir}), 0, 'git init --bare');
+is(PublicInbox::WwwListing::fingerprint($bare), undef,
+	'empty repo has no fingerprint');
+
+my $cmd = [ 'git', "--git-dir=$bare->{git_dir}", qw(fast-import --quiet) ];
+ok(IPC::Run::run($cmd, '<', $fi_data), 'fast-import');
+
+like(PublicInbox::WwwListing::fingerprint($bare), qr/\A[a-f0-9]{40}\z/,
+	'got fingerprint with non-empty repo');
+
+my $pid;
+END { kill 'TERM', $pid if defined $pid };
+SKIP: {
+	my $json = eval { PublicInbox::WwwListing::_json() };
+	skip "JSON module missing: $@", 1 if $@;
+	my $err = "$tmpdir/stderr.log";
+	my $out = "$tmpdir/stdout.log";
+	my $alt = "$tmpdir/alt.git";
+	my $cfgfile = "$tmpdir/config";
+	my $v2 = "$tmpdir/v2";
+	my $httpd = 'blib/script/public-inbox-httpd';
+	use IO::Socket::INET;
+	my %opts = (
+		LocalAddr => '127.0.0.1',
+		ReuseAddr => 1,
+		Proto => 'tcp',
+		Type => SOCK_STREAM,
+		Listen => 1024,
+	);
+	my $sock = IO::Socket::INET->new(%opts);
+	ok($sock, 'sock created');
+	my ($host, $port) = ($sock->sockhost, $sock->sockport);
+	my @clone = qw(git clone -q -s --bare);
+	is(system(@clone, $bare->{git_dir}, $alt), 0, 'clone shared repo');
+
+	for my $i (0..2) {
+		is(system(@clone, $alt, "$v2/git/$i.git"), 0, "clone epoch $i");
+	}
+	ok(open(my $fh, '>', "$v2/inbox.lock"), 'mock a v2 inbox');
+	open $fh, '>', "$alt/description" or die;
+	print $fh "we're all clones\n" or die;
+	close $fh or die;
+	is(system('git', "--git-dir=$alt", qw(config gitweb.owner lorelei)), 0,
+		'set gitweb user');
+	ok(unlink("$bare->{git_dir}/description"), 'removed bare/description');
+	open $fh, '>', $cfgfile or die;
+	print $fh <<"" or die;
+[publicinbox "bare"]
+	mainrepo = $bare->{git_dir}
+	url = http://$host/bare
+	address = bare\@example.com
+[publicinbox "alt"]
+	mainrepo = $alt
+	url = http://$host/alt
+	address = alt\@example.com
+[publicinbox "v2"]
+	mainrepo = $v2
+	url = http://$host/v2
+	address = v2\@example.com
+
+	close $fh or die;
+	my $env = { PI_CONFIG => $cfgfile };
+	my $cmd = [ $httpd, "--stdout=$out", "--stderr=$err" ];
+	$pid = spawn_listener($env, $cmd, [$sock]);
+	$sock = undef;
+	my $http = Net::HTTP->new(Host => "$host:$port");
+	$http->write_request(GET => '/manifest.js.gz');
+	my ($code, undef, %h) = $http->read_response_headers;
+	is($code, 200, 'got manifest');
+	my $tmp;
+	my $body = '';
+	while (1) {
+		my $n = $http->read_entity_body(my $buf, 65536);
+		die unless defined $n;
+		last if $n == 0;
+		$body .= $buf;
+	}
+	IO::Uncompress::Gunzip::gunzip(\$body => \$tmp);
+	my $manifest = $json->decode($tmp);
+	ok(my $clone = $manifest->{'/alt'}, '/alt in manifest');
+	is($clone->{owner}, 'lorelei', 'owner set');
+	is($clone->{reference}, '/bare', 'reference detected');
+	is($clone->{description}, "we're all clones", 'description read');
+	ok(my $bare = $manifest->{'/bare'}, '/bare in manifest');
+	is($bare->{description}, 'Unnamed repository',
+		'missing $GIT_DIR/description fallback');
+
+	like($bare->{fingerprint}, qr/\A[a-f0-9]{40}\z/, 'fingerprint');
+	is($clone->{fingerprint}, $bare->{fingerprint}, 'fingerprint matches');
+
+	is(HTTP::Date::time2str($bare->{modified}), $h{'Last-Modified'},
+		'modified field and Last-Modified header match');
+
+	ok($manifest->{'/v2/0'}, 'v2 epoch appeared');
+
+	skip 'skipping grok-pull integration test', 2 if !which('grok-pull');
+
+	ok(mkdir("$tmpdir/mirror"), 'prepare grok mirror dest');
+	open $fh, '>', "$tmpdir/repos.conf" or die;
+	print $fh <<"" or die;
+# You can pull from multiple grok mirrors, just create
+# a separate section for each mirror. The name can be anything.
+[test]
+site = http://$host:$port
+manifest = http://$host:$port/manifest.js.gz
+toplevel = $tmpdir/mirror
+mymanifest = $tmpdir/local-manifest.js.gz
+
+	close $fh or die;
+
+	system(qw(grok-pull -c), "$tmpdir/repos.conf");
+	is($? >> 8, 127, 'grok-pull exit code as expected');
+	for (qw(alt bare v2/0 v2/1 v2/2)) {
+		ok(-d "$tmpdir/mirror/$_", "grok-pull created $_");
+	}
+}
+
+done_testing();
-- 
EW


* [PATCH 3/4] www: wire up /$INBOX/manifest.js.gz, too
  2019-06-09  4:31 [PATCH 0/4] grokmirror-compatible manifests Eric Wong (Contractor, The Linux Foundation)
  2019-06-09  4:31 ` [PATCH 1/4] wwwlisting: allow hiding entries from manifest Eric Wong (Contractor, The Linux Foundation)
  2019-06-09  4:31 ` [PATCH 2/4] wwwlisting: generate grokmirror-compatible manifest.js.gz Eric Wong (Contractor, The Linux Foundation)
@ 2019-06-09  4:31 ` Eric Wong (Contractor, The Linux Foundation)
  2019-06-09  4:31 ` [PATCH 4/4] www: support $INBOX/git/$EPOCH.git for v2 cloning Eric Wong (Contractor, The Linux Foundation)
  2019-06-10  6:21 ` [PATCH 5/4] git: ensure ->modified returns an integer Eric Wong (Contractor, The Linux Foundation)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2019-06-09  4:31 UTC
  To: meta

I can imagine myself just wanting to clone a single v2 inbox
and all its epochs without thinking about include/exclude
rules in a grokmirror config file.
---
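Roughly speaking, the per-inbox endpoint serves the part of the
site-wide manifest which belongs to one inbox.  A minimal sketch of
that relationship, assuming the "/$INBOX/$EPOCH" keys used at this
point in the series; the only_inbox helper and the manifest contents
are invented for illustration:

```perl
use strict;
use warnings;

# Invented site-wide manifest; only the keys matter for this sketch.
my %site = map { ($_ => { modified => 0 }) }
	qw(/alt /bare /v2/0 /v2/1 /v2/2);

# Hypothetical helper: keep the entries at or below "/$name".
sub only_inbox {
	my ($manifest, $name) = @_;
	my %out = map { ($_ => $manifest->{$_}) }
		grep { m!\A/\Q$name\E(?:/|\z)! } keys %$manifest;
	\%out;
}

my $v2 = only_inbox(\%site, 'v2');
print join(' ', sort keys %$v2), "\n";
```

With the endpoint from this patch, the server does the selection, so a
grokmirror config only needs to point at "/v2/manifest.js.gz".
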
 lib/PublicInbox/WWW.pm | 11 +++++++++++
 t/www_listing.t        | 20 ++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 614adad..a546698 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -126,6 +126,8 @@ sub call {
 		get_text($ctx, $1, $2);
 	} elsif ($path_info =~ m!$INBOX_RE/([a-zA-Z0-9_\-\.]+)\.css\z!o) {
 		get_css($ctx, $1, $2);
+	} elsif ($path_info =~ m!$INBOX_RE/manifest\.js\.gz\z!o) {
+		get_inbox_manifest($ctx, $1, $2);
 	} elsif ($path_info =~ m!$INBOX_RE/($OID_RE)/s/\z!o) {
 		get_vcs_object($ctx, $1, $2);
 	} elsif ($path_info =~ m!$INBOX_RE/($OID_RE)/s/
@@ -490,6 +492,15 @@ sub www_listing {
 	}
 }
 
+# GET $INBOX/manifest.js.gz
+sub get_inbox_manifest ($$$) {
+	my ($ctx, $inbox, $key) = @_;
+	my $r404 = invalid_inbox($ctx, $inbox);
+	return $r404 if $r404;
+	require PublicInbox::WwwListing;
+	PublicInbox::WwwListing::js($ctx->{env}, [$ctx->{-inbox}]);
+}
+
 sub get_attach {
 	my ($ctx, $idx, $fn) = @_;
 	require PublicInbox::WwwAttach;
diff --git a/t/www_listing.t b/t/www_listing.t
index f9d543e..546c2f8 100644
--- a/t/www_listing.t
+++ b/t/www_listing.t
@@ -133,6 +133,26 @@ mymanifest = $tmpdir/local-manifest.js.gz
 	for (qw(alt bare v2/0 v2/1 v2/2)) {
 		ok(-d "$tmpdir/mirror/$_", "grok-pull created $_");
 	}
+
+	# support per-inbox manifests, handy for v2:
+	# /$INBOX/manifest.js.gz
+	open $fh, '>', "$tmpdir/per-inbox.conf" or die;
+	print $fh <<"" or die;
+# You can pull from multiple grok mirrors, just create
+# a separate section for each mirror. The name can be anything.
+[v2]
+site = http://$host:$port
+manifest = http://$host:$port/v2/manifest.js.gz
+toplevel = $tmpdir/per-inbox
+mymanifest = $tmpdir/per-inbox-manifest.js.gz
+
+	close $fh or die;
+	ok(mkdir("$tmpdir/per-inbox"), 'prepare single-v2-inbox mirror');
+	system(qw(grok-pull -c), "$tmpdir/per-inbox.conf");
+	is($? >> 8, 127, 'grok-pull exit code as expected');
+	for (qw(v2/0 v2/1 v2/2)) {
+		ok(-d "$tmpdir/per-inbox/$_", "grok-pull created $_");
+	}
 }
 
 done_testing();
-- 
EW


* [PATCH 4/4] www: support $INBOX/git/$EPOCH.git for v2 cloning
  2019-06-09  4:31 [PATCH 0/4] grokmirror-compatible manifests Eric Wong (Contractor, The Linux Foundation)
                   ` (2 preceding siblings ...)
  2019-06-09  4:31 ` [PATCH 3/4] www: wire up /$INBOX/manifest.js.gz, too Eric Wong (Contractor, The Linux Foundation)
@ 2019-06-09  4:31 ` Eric Wong (Contractor, The Linux Foundation)
  2019-06-10  6:21 ` [PATCH 5/4] git: ensure ->modified returns an integer Eric Wong (Contractor, The Linux Foundation)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2019-06-09  4:31 UTC
  To: meta

And use it in manifest.js.gz.

To ease maintaining mirrors with grokmirror(1), we can accept
a "git/" directory prefix before the epoch, and ".git" suffix
after the epoch number.

We maintain compatibility with "$INBOX/$EPOCH" cloning, of
course, and it's still easier to type on the command line.
---
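A minimal sketch of which paths the extended pattern accepts.  This is
a simplified stand-in for the route in WWW.pm: $INBOX_RE is replaced by
a hypothetical inbox-name pattern and only "info/refs" is recognized:

```perl
use strict;
use warnings;

# Optional "git/" prefix and optional ".git" suffix around the epoch
# number, as in the diff below; the inbox-name part is an assumption.
my $route = qr!\A/([\w\-\.]+)/(?:(?:git/)?([0-9]+)(?:\.git)?/)?(info/refs)\z!;

my %seen;
for my $path (qw(/v2/0/info/refs /v2/0.git/info/refs
		/v2/git/0.git/info/refs /v2/info/refs)) {
	if ($path =~ $route) {
		# $2 is only defined when an epoch component was present
		$seen{$path} = defined $2 ? "epoch $2" : 'no epoch';
	}
}
print "$_: $seen{$_}\n" for sort keys %seen;
```

All three epoch spellings resolve to the same epoch number, which is
what allows the manifest to advertise "/$INBOX/git/$EPOCH.git" while
the shorter URLs keep working.
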
 lib/PublicInbox/WWW.pm        | 4 ++--
 lib/PublicInbox/WwwListing.pm | 2 +-
 t/psgi_v2.t                   | 2 ++
 t/v2mirror.t                  | 8 +++++---
 t/www_listing.t               | 6 +++---
 5 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index a546698..e468263 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -74,7 +74,7 @@ sub call {
 	my $method = $env->{REQUEST_METHOD};
 
 	if ($method eq 'POST') {
-		if ($path_info =~ m!$INBOX_RE/(?:([0-9]+)/)?
+		if ($path_info =~ m!$INBOX_RE/(?:(?:git/)?([0-9]+)(?:\.git)?/)?
 					(git-upload-pack)\z!x) {
 			my ($part, $path) = ($2, $3);
 			return invalid_inbox($ctx, $1) ||
@@ -98,7 +98,7 @@ sub call {
 		invalid_inbox($ctx, $1) || get_atom($ctx);
 	} elsif ($path_info =~ m!$INBOX_RE/new\.html\z!o) {
 		invalid_inbox($ctx, $1) || get_new($ctx);
-	} elsif ($path_info =~ m!$INBOX_RE/(?:([0-9]+)/)?
+	} elsif ($path_info =~ m!$INBOX_RE/(?:(?:git/)?([0-9]+)(?:\.git)?/)?
 				($PublicInbox::GitHTTPBackend::ANY)\z!ox) {
 		my ($part, $path) = ($2, $3);
 		invalid_inbox($ctx, $1) || serve_git($ctx, $part, $path);
diff --git a/lib/PublicInbox/WwwListing.pm b/lib/PublicInbox/WwwListing.pm
index 690976a..e2724cc 100644
--- a/lib/PublicInbox/WwwListing.pm
+++ b/lib/PublicInbox/WwwListing.pm
@@ -144,7 +144,7 @@ sub manifest_add ($$;$) {
 	my $git_dir = $ibx->{mainrepo};
 	if (defined $epoch) {
 		$git_dir .= "/git/$epoch.git";
-		$url_path .= "/$epoch";
+		$url_path .= "/git/$epoch.git";
 	}
 	return unless -d $git_dir;
 	my $git = PublicInbox::Git->new($git_dir);
diff --git a/t/psgi_v2.t b/t/psgi_v2.t
index 9811249..5c358cd 100644
--- a/t/psgi_v2.t
+++ b/t/psgi_v2.t
@@ -202,6 +202,8 @@ test_psgi(sub { $www->call(@_) }, sub {
 
 	$res = $cb->(GET('/v2test/0/info/refs'));
 	is($res->code, 200, 'got info refs for dumb clones');
+	$res = $cb->(GET('/v2test/0.git/info/refs'));
+	is($res->code, 200, 'got info refs for dumb clones w/ .git suffix');
 	$res = $cb->(GET('/v2test/info/refs'));
 	is($res->code, 404, 'unpartitioned git URL fails');
 
diff --git a/t/v2mirror.t b/t/v2mirror.t
index fe05ec4..c31dcd5 100644
--- a/t/v2mirror.t
+++ b/t/v2mirror.t
@@ -80,11 +80,13 @@ $sock = undef;
 
 my @cmd;
 foreach my $i (0..$epoch_max) {
-	@cmd = (qw(git clone --mirror -q), "http://$host:$port/v2/$i",
+	my $sfx = $i == 0 ? '.git' : '';
+	@cmd = (qw(git clone --mirror -q),
+		"http://$host:$port/v2/$i$sfx",
 		"$tmpdir/m/git/$i.git");
 
-	is(system(@cmd), 0, 'cloned OK');
-	ok(-d "$tmpdir/m/git/$i.git", 'mirror OK');
+	is(system(@cmd), 0, "cloned $i.git");
+	ok(-d "$tmpdir/m/git/$i.git", "mirror $i OK");
 }
 
 @cmd = ("$script-init", '-V2', 'm', "$tmpdir/m", 'http://example.com/m',
diff --git a/t/www_listing.t b/t/www_listing.t
index 546c2f8..2741e1b 100644
--- a/t/www_listing.t
+++ b/t/www_listing.t
@@ -111,7 +111,7 @@ SKIP: {
 	is(HTTP::Date::time2str($bare->{modified}), $h{'Last-Modified'},
 		'modified field and Last-Modified header match');
 
-	ok($manifest->{'/v2/0'}, 'v2 epoch appeared');
+	ok($manifest->{'/v2/git/0.git'}, 'v2 epoch appeared');
 
 	skip 'skipping grok-pull integration test', 2 if !which('grok-pull');
 
@@ -130,7 +130,7 @@ mymanifest = $tmpdir/local-manifest.js.gz
 
 	system(qw(grok-pull -c), "$tmpdir/repos.conf");
 	is($? >> 8, 127, 'grok-pull exit code as expected');
-	for (qw(alt bare v2/0 v2/1 v2/2)) {
+	for (qw(alt bare v2/git/0.git v2/git/1.git v2/git/2.git)) {
 		ok(-d "$tmpdir/mirror/$_", "grok-pull created $_");
 	}
 
@@ -150,7 +150,7 @@ mymanifest = $tmpdir/per-inbox-manifest.js.gz
 	ok(mkdir("$tmpdir/per-inbox"), 'prepare single-v2-inbox mirror');
 	system(qw(grok-pull -c), "$tmpdir/per-inbox.conf");
 	is($? >> 8, 127, 'grok-pull exit code as expected');
-	for (qw(v2/0 v2/1 v2/2)) {
+	for (qw(v2/git/0.git v2/git/1.git v2/git/2.git)) {
 		ok(-d "$tmpdir/per-inbox/$_", "grok-pull created $_");
 	}
 }
-- 
EW


* [PATCH 5/4] git: ensure ->modified returns an integer
  2019-06-09  4:31 [PATCH 0/4] grokmirror-compatible manifests Eric Wong (Contractor, The Linux Foundation)
                   ` (3 preceding siblings ...)
  2019-06-09  4:31 ` [PATCH 4/4] www: support $INBOX/git/$EPOCH.git for v2 cloning Eric Wong (Contractor, The Linux Foundation)
@ 2019-06-10  6:21 ` Eric Wong (Contractor, The Linux Foundation)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2019-06-10  6:21 UTC
  To: meta

We don't want to serialize timestamps as strings to JSON.
I only noticed this bug on a 32-bit system.
---
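A minimal sketch of the difference, using JSON::PP (one of the modules
_json() may pick; other encoders can differ in how they detect
numbers).  The committer line is invented but shaped like the ones
Git.pm parses:

```perl
use strict;
use warnings;
use JSON::PP ();

# A made-up committer line in the format of a git commit object.
my $buf = "committer A U Thor <a\@example.com> 1560061260 +0000\n";
$buf =~ /^committer .*?> ([0-9]+) [\+\-]?[0-9]+/sm or die 'no match';
my $captured = $1;	# a regexp capture is a string to the encoder
my $numified = $1 + 0;	# adding zero yields a plain number, as in the patch

my $json = JSON::PP->new;
my $as_string = $json->encode({ modified => $captured });
my $as_int = $json->encode({ modified => $numified });
print "$as_string\n$as_int\n";
```

The first line of output has the timestamp in double quotes and the
second does not, which is what the new unlike() check in the test
guards against.
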
 lib/PublicInbox/Git.pm | 2 +-
 t/www_listing.t        | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 68445b3..82510b9 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -320,7 +320,7 @@ sub modified ($) {
 		chomp $oid;
 		my $buf = cat_file($self, $oid) or next;
 		$$buf =~ /^committer .*?> ([0-9]+) [\+\-]?[0-9]+/sm or next;
-		my $cmt_time = $1;
+		my $cmt_time = $1 + 0;
 		$modified = $cmt_time if $cmt_time > $modified;
 	}
 	$modified || time;
diff --git a/t/www_listing.t b/t/www_listing.t
index 2741e1b..1f29298 100644
--- a/t/www_listing.t
+++ b/t/www_listing.t
@@ -96,6 +96,7 @@ SKIP: {
 		$body .= $buf;
 	}
 	IO::Uncompress::Gunzip::gunzip(\$body => \$tmp);
+	unlike($tmp, qr/"modified":\s*"/, 'modified is an integer');
 	my $manifest = $json->decode($tmp);
 	ok(my $clone = $manifest->{'/alt'}, '/alt in manifest');
 	is($clone->{owner}, 'lorelei', 'owner set');
-- 
EW


Archives are clonable:
	git clone --mirror http://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.org/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox