user/dev discussion of public-inbox itself
* [PATCH 0/3] wwwaltid: add pointers for usability
@ 2020-03-26  8:21 Eric Wong
  2020-03-26  8:21 ` [PATCH 1/3] inbox: altid_map becomes a method Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Eric Wong @ 2020-03-26  8:21 UTC
  To: meta

Provide helpful hints and pointers in the existing config example
so that altid DBs can be reproduced when mirroring.

Eric Wong (3):
  inbox: altid_map becomes a method
  wwwtext: show altid instructions in config
  wwwaltid: inform users to use POST instead of GET

 lib/PublicInbox/Inbox.pm    | 15 +++++++++++++++
 lib/PublicInbox/WWW.pm      |  3 +++
 lib/PublicInbox/WwwAltId.pm | 32 ++++++++++++++++++--------------
 lib/PublicInbox/WwwText.pm  | 20 ++++++++++++++++++--
 4 files changed, 54 insertions(+), 16 deletions(-)



* [PATCH 1/3] inbox: altid_map becomes a method
  2020-03-26  8:21 [PATCH 0/3] wwwaltid: add pointers for usability Eric Wong
@ 2020-03-26  8:21 ` Eric Wong
  2020-03-26  8:21 ` [PATCH 2/3] wwwtext: show altid instructions in config Eric Wong
  2020-03-26  8:21 ` [PATCH 3/3] wwwaltid: inform users to use POST instead of GET Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2020-03-26  8:21 UTC
  To: meta

We want to be able to preload the altid map, and to access it
from WwwText for a comment in the config example.
---
 lib/PublicInbox/Inbox.pm    | 15 +++++++++++++++
 lib/PublicInbox/WWW.pm      |  1 +
 lib/PublicInbox/WwwAltId.pm | 14 +-------------
 3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 4f27d1bb..95ffd039 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -376,4 +376,19 @@ sub modified {
 	git($self)->modified; # v1
 }
 
+# returns prefix => pathname mapping
+# (pathname is NOT public, but prefix is used for Xapian queries)
+sub altid_map ($) {
+	my ($self) = @_;
+	$self->{-altid_map} //= eval {
+		require PublicInbox::AltId;
+		my $altid = $self->{altid} or return {};
+		my %h = map {;
+			my $x = PublicInbox::AltId->new($self, $_);
+			"$x->{prefix}" => $x->{filename}
+		} @$altid;
+		\%h;
+	} // {};
+}
+
 1;
diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 5017f572..56d2c42a 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -170,6 +170,7 @@ sub preload {
 
 sub preload_inbox {
 	my $ibx = shift;
+	$ibx->altid_map;
 	$ibx->cloneurl;
 	$ibx->description;
 	$ibx->base_url;
diff --git a/lib/PublicInbox/WwwAltId.pm b/lib/PublicInbox/WwwAltId.pm
index 34641a92..a45d8061 100644
--- a/lib/PublicInbox/WwwAltId.pm
+++ b/lib/PublicInbox/WwwAltId.pm
@@ -10,18 +10,6 @@ use PublicInbox::AltId;
 use PublicInbox::Spawn qw(which);
 our $sqlite3 = $ENV{SQLITE3};
 
-# returns prefix => pathname mapping
-# (pathname is NOT public, but prefix is used for Xapian queries)
-sub altid_map ($) {
-	my ($ibx) = @_;
-	my $altid = $ibx->{altid} or return {};
-	my %h = map {;
-		my $x = PublicInbox::AltId->new($ibx, $_);
-		"$x->{prefix}" => $x->{filename}
-	} @$altid;
-	\%h;
-}
-
 sub sqlite3_missing ($) {
 	PublicInbox::WwwResponse::oneshot($_[0], 501, \<<EOF);
 <pre>sqlite3 not available
@@ -51,7 +39,7 @@ sub check_output {
 sub sqldump ($$) {
 	my ($ctx, $altid_pfx) = @_;
 	my $ibx = $ctx->{-inbox};
-	my $altid_map = $ibx->{-altid_map} //= altid_map($ibx);
+	my $altid_map = $ibx->altid_map;
 	my $fn = $altid_map->{$altid_pfx};
 	unless (defined $fn) {
 		return PublicInbox::WwwStream::oneshot($ctx, 404, \<<EOF);
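
The caching idiom that altid_map uses above can be tried in isolation.
What follows is only a minimal sketch: DemoInbox, the $builds counter
and the "gmane" prefix are hypothetical stand-ins, and the split-based
parsing replaces PublicInbox::AltId->new, which is not reproduced here.
It assumes the "serial:PREFIX:file=FILENAME" form shown in the config
example of this series.

```perl
use strict;
use warnings;

# Hypothetical stand-in for PublicInbox::Inbox; only the caching idiom
# mirrors the patch, the spec parsing is a simplification.
package DemoInbox;

our $builds = 0; # counts how often the map is actually built

sub new {
	my ($class, %opt) = @_;
	bless { %opt }, $class;
}

sub altid_map {
	my ($self) = @_;
	# "//=" assigns only while the slot is undefined, so the eval
	# block runs at most once per object.  If the block dies, eval
	# yields undef and the trailing "// {}" caches an empty map.
	$self->{-altid_map} //= eval {
		my $altid = $self->{altid} or return {};
		$builds++;
		my %h = map {
			# assumed spec form: serial:PREFIX:file=FILENAME
			my (undef, $prefix, $file) = split(/:/, $_, 3);
			$file =~ s/\Afile=//;
			($prefix => $file);
		} @$altid;
		\%h;
	} // {};
}

package main;

my $ibx = DemoInbox->new(altid => [ 'serial:gmane:file=gmane.sqlite3' ]);
my $first = $ibx->altid_map;
my $second = $ibx->altid_map; # served from the cached slot
print "gmane => $first->{gmane}\n";
print "built $DemoInbox::builds time(s)\n";
```

Because the result lives in the object, calling the method once from
preload_inbox is enough for later requests to reuse the same hashref.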


* [PATCH 2/3] wwwtext: show altid instructions in config
  2020-03-26  8:21 [PATCH 0/3] wwwaltid: add pointers for usability Eric Wong
  2020-03-26  8:21 ` [PATCH 1/3] inbox: altid_map becomes a method Eric Wong
@ 2020-03-26  8:21 ` Eric Wong
  2020-03-26  8:21 ` [PATCH 3/3] wwwaltid: inform users to use POST instead of GET Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2020-03-26  8:21 UTC
  To: meta

Exposing altid dumps will help ensure total reproducibility
of existing instances.

AFAIK, sqlite3(1) can't execute arbitrary code, so it's not
quite as fashionable as the "curl | bash" stuff the cool people
are doing, these days :P
---
 lib/PublicInbox/WwwText.pm | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index cbe82b73..2008ba09 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -138,15 +138,16 @@ sub inbox_config ($$$) {
 	my $ibx = $ctx->{-inbox};
 	push @$hdr, 'Content-Disposition', 'inline; filename=inbox.config';
 	my $name = dq_escape($ibx->{name});
+	my $inboxdir = '/path/to/top-level-inbox';
 	$$txt .= <<EOS;
 ; example public-inbox config snippet for "$name"
 ; see public-inbox-config(5) manpage for more details:
 ; https://public-inbox.org/public-inbox-config.html
 [publicinbox "$name"]
-	inboxdir = /path/to/top-level-inbox
+	inboxdir = $inboxdir
 	; note: public-inbox before v1.2.0 used "mainrepo"
 	; instead of "inboxdir", both remain supported after 1.2
-	mainrepo = /path/to/top-level-inbox
+	mainrepo = $inboxdir
 	url = https://example.com/$name/
 	url = http://example.onion/$name/
 EOS
@@ -154,6 +155,21 @@ EOS
 		defined(my $v = $ibx->{$k}) or next;
 		$$txt .= "\t$k = $_\n" for @$v;
 	}
+	if (my $altid = $ibx->{altid}) {
+		my $base_url = $ibx->base_url($ctx->{env});
+		my $altid_map = $ibx->altid_map;
+		$$txt .= <<EOF;
+	; altid DBs may be used to provide numeric article ID lookup from
+	; old, pre-existing sources.  You can recreate them via curl(1),
+	; gzip(1), and sqlite3(1) as documented:
+EOF
+		for (sort keys %$altid_map) {
+			$$txt .= "\t;\tcurl -XPOST $base_url$_.sql.gz | \\\n" .
+				"\t;\tgzip -dc | \\\n" .
+				"\t;\tsqlite3 $inboxdir/$_.sqlite3\n";
+			$$txt .= "\taltid = serial:$_:file=$_.sqlite3\n";
+		}
+	}
 
 	for my $k (qw(filter newsgroup obfuscate replyto watchheader)) {
 		defined(my $v = $ibx->{$k}) or next;
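
To see what the new loop emits, the text generation can be pulled into
a standalone function.  This is a sketch only: altid_hints is a
hypothetical helper (the patch builds the text inline in inbox_config),
and the URL, directory and "gmane" prefix are made-up example values.

```perl
use strict;
use warnings;

# Hypothetical helper; the patch appends this text inline to $$txt.
sub altid_hints {
	my ($base_url, $inboxdir, $altid_map) = @_;
	my $txt = '';
	for my $pfx (sort keys %$altid_map) {
		# commented-out shell pipeline to recreate the SQLite DB
		$txt .= "\t;\tcurl -XPOST $base_url$pfx.sql.gz | \\\n" .
			"\t;\tgzip -dc | \\\n" .
			"\t;\tsqlite3 $inboxdir/$pfx.sqlite3\n";
		# the live config line pointing at the recreated DB
		$txt .= "\taltid = serial:$pfx:file=$pfx.sqlite3\n";
	}
	$txt;
}

my $out = altid_hints('https://example.com/test/',
		'/path/to/top-level-inbox',
		{ gmane => 'gmane.sqlite3' }); # example values only
print $out;
```

The pipeline lines start with ";" so they stay comments in the
git-config syntax, while the "altid = ..." line remains active config.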


* [PATCH 3/3] wwwaltid: inform users to use POST instead of GET
  2020-03-26  8:21 [PATCH 0/3] wwwaltid: add pointers for usability Eric Wong
  2020-03-26  8:21 ` [PATCH 1/3] inbox: altid_map becomes a method Eric Wong
  2020-03-26  8:21 ` [PATCH 2/3] wwwtext: show altid instructions in config Eric Wong
@ 2020-03-26  8:21 ` Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2020-03-26  8:21 UTC
  To: meta

Seeing the example config linkified, some users will inevitably
try to follow it in a browser with a GET request.  Provide
a helpful message telling users to use POST, instead of
attempting to treat /$INBOX/$ALTID.sql.gz as a Message-Id.
---
 lib/PublicInbox/WWW.pm      |  2 ++
 lib/PublicInbox/WwwAltId.pm | 18 +++++++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index 56d2c42a..275e509f 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -125,6 +125,8 @@ sub call {
 		get_vcs_object($ctx, $1, $2, $3);
 	} elsif ($path_info =~ m!$INBOX_RE/($OID_RE)/s\z!o) {
 		r301($ctx, $1, $2, 's/');
+	} elsif ($path_info =~ m!$INBOX_RE/(\w+)\.sql\.gz\z!o) {
+		get_altid_dump($ctx, $1, $2);
 	# convenience redirects order matters
 	} elsif ($path_info =~ m!$INBOX_RE/([^/]{2,})\z!o) {
 		r301($ctx, $1, $2);
diff --git a/lib/PublicInbox/WwwAltId.pm b/lib/PublicInbox/WwwAltId.pm
index a45d8061..263e884a 100644
--- a/lib/PublicInbox/WwwAltId.pm
+++ b/lib/PublicInbox/WwwAltId.pm
@@ -38,6 +38,7 @@ sub check_output {
 # and thus not usable from DBD::SQLite.
 sub sqldump ($$) {
 	my ($ctx, $altid_pfx) = @_;
+	my $env = $ctx->{env};
 	my $ibx = $ctx->{-inbox};
 	my $altid_map = $ibx->altid_map;
 	my $fn = $altid_map->{$altid_pfx};
@@ -47,6 +48,22 @@ sub sqldump ($$) {
 EOF
 	}
 
+	if ($env->{REQUEST_METHOD} ne 'POST') {
+		my $url = $ibx->base_url($ctx->{env}) . "$altid_pfx.sql.gz";
+		return PublicInbox::WwwStream::oneshot($ctx, 405, \<<EOF);
+<pre>A POST request required to retrieve $altid_pfx.sql.gz
+
+	curl -XPOST -O $url
+
+or
+
+	curl -XPOST $url | \\
+		gzip -dc | \\
+		sqlite3 /path/to/$altid_pfx.sqlite3
+</pre>
+EOF
+	}
+
 	eval { require PublicInbox::GzipFilter } or
 		return PublicInbox::WwwStream::oneshot($ctx, 501, \<<EOF);
 <pre>gzip output not available
@@ -73,7 +90,6 @@ EOF
 
 	# TODO: use -readonly if available with newer sqlite3(1)
 	my $qsp = PublicInbox::Qspawn->new([$sqlite3, $fn], undef, { 0 => $r });
-	my $env = $ctx->{env};
 	$ctx->{altid_pfx} = $altid_pfx;
 	$env->{'qspawn.filter'} = PublicInbox::GzipFilter->new;
 	$qsp->psgi_return($env, undef, \&check_output, $ctx);
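
The two pieces of this patch, a route that matches *.sql.gz before the
generic redirect and a method check that answers 405, can be sketched
with plain PSGI-style data structures.  Everything here is assumed for
illustration: route() is hypothetical, $INBOX_RE is a simplified
pattern rather than necessarily the one in WWW.pm, and the response
bodies are placeholders.

```perl
use strict;
use warnings;

# Assumed, simplified inbox-name pattern for this sketch only.
my $INBOX_RE = qr!\A/([\w\-][\w\.\-]*)!;

# Hypothetical dispatcher returning a PSGI-style
# [ status, headers, body ] arrayref.
sub route {
	my ($env) = @_;
	my $path = $env->{PATH_INFO};
	# This must be tested before the catch-all below, otherwise
	# "gmane.sql.gz" would be treated like a Message-Id.
	if ($path =~ m!$INBOX_RE/(\w+)\.sql\.gz\z!) {
		my ($inbox, $pfx) = ($1, $2);
		if ($env->{REQUEST_METHOD} ne 'POST') {
			return [ 405, [ 'Content-Type', 'text/plain' ],
				[ "POST required for $pfx.sql.gz\n" ] ];
		}
		return [ 200, [ 'Content-Type', 'text/plain' ],
			[ "placeholder dump of $pfx for $inbox\n" ] ];
	} elsif ($path =~ m!$INBOX_RE/([^/]{2,})\z!) {
		# convenience redirect: add the trailing slash
		return [ 301, [ 'Location', "$2/" ], [] ];
	}
	[ 404, [ 'Content-Type', 'text/plain' ], [ "not found\n" ] ];
}

for my $m (qw(GET POST)) {
	my $res = route({ PATH_INFO => '/meta/gmane.sql.gz',
			REQUEST_METHOD => $m });
	print "$m => $res->[0]\n";
}
```

Swapping the two branches would make the second pattern win for
"/meta/gmane.sql.gz", which is why the ordering comment in the hunk
above matters.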


Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
