user/dev discussion of public-inbox itself
 help / color / mirror / Atom feed
* [PATCH 0/3] mostly NNTP stuff
@ 2020-09-11  7:32 Eric Wong
  2020-09-11  7:32 ` [PATCH 1/3] treewide: avoid `goto &NAME' for tail recursion Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Eric Wong @ 2020-09-11  7:32 UTC (permalink / raw)
  To: meta

Just a couple of cleanups and small tweaks while I figure out
how to get the NNTP code to support tens of thousands of
inboxes; since it's hampered by the most existing code for
dealing with mere dozens of inboxes...  I think configuring
detached index support will be a requirement.

Eric Wong (3):
  treewide: avoid `goto &NAME' for tail recursion
  t/nntpd: add test for the XPATH command
  nntp: share more code between art_lookup callers

 lib/PublicInbox/DS.pm         |  4 ++--
 lib/PublicInbox/Eml.pm        |  2 +-
 lib/PublicInbox/ExtMsg.pm     |  6 ++---
 lib/PublicInbox/NNTP.pm       | 44 ++++++++++++++---------------------
 lib/PublicInbox/SolverGit.pm  | 32 ++++++++++++-------------
 lib/PublicInbox/V2Writable.pm |  4 ++--
 lib/PublicInbox/View.pm       |  2 +-
 lib/PublicInbox/Watch.pm      |  4 ++--
 t/nntpd.t                     |  2 ++
 9 files changed, 47 insertions(+), 53 deletions(-)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] treewide: avoid `goto &NAME' for tail recursion
  2020-09-11  7:32 [PATCH 0/3] mostly NNTP stuff Eric Wong
@ 2020-09-11  7:32 ` Eric Wong
  2020-09-11  7:32 ` [PATCH 2/3] t/nntpd: add test for the XPATH command Eric Wong
  2020-09-11  7:32 ` [PATCH 3/3] nntp: share more code between art_lookup callers Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2020-09-11  7:32 UTC (permalink / raw)
  To: meta

While Perl implements tail recursion via `goto' which allows
avoiding warnings on deep recursion.  It doesn't (as of 5.28)
optimize the speed of such dispatches, though it may reduce
ephemeral memory usage.

Make the code less alien to hackers coming from other languages
by using normal subroutine dispatch.  It's actually slightly
faster in micro benchmarks due to the complexity of `goto &NAME'.
---
 lib/PublicInbox/DS.pm         |  4 ++--
 lib/PublicInbox/Eml.pm        |  2 +-
 lib/PublicInbox/ExtMsg.pm     |  6 +++---
 lib/PublicInbox/SolverGit.pm  | 32 ++++++++++++++++----------------
 lib/PublicInbox/V2Writable.pm |  4 ++--
 lib/PublicInbox/View.pm       |  2 +-
 lib/PublicInbox/Watch.pm      |  4 ++--
 7 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 661be1fd..9c278307 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -244,7 +244,7 @@ sub reap_pids {
 }
 
 # reentrant SIGCHLD handler (since reap_pids is not reentrant)
-sub enqueue_reap { $reap_armed //= requeue(\&reap_pids) }
+sub enqueue_reap () { $reap_armed //= requeue(\&reap_pids) }
 
 sub in_loop () { $in_loop }
 
@@ -629,7 +629,7 @@ sub dwaitpid ($$$) {
 	push @$wait_pids, [ @_ ]; # [ $pid, $cb, $arg ]
 
 	# We could've just missed our SIGCHLD, cover it, here:
-	goto &enqueue_reap; # tail recursion
+	enqueue_reap();
 }
 
 sub _run_later () {
diff --git a/lib/PublicInbox/Eml.pm b/lib/PublicInbox/Eml.pm
index 1fecd41b..571edc5c 100644
--- a/lib/PublicInbox/Eml.pm
+++ b/lib/PublicInbox/Eml.pm
@@ -129,7 +129,7 @@ sub new {
 sub new_sub {
 	my (undef, $ref) = @_;
 	# special case for messages like <85k5su9k59.fsf_-_@lola.goethe.zz>
-	$$ref =~ /\A(\r?\n)/s or goto &new;
+	$$ref =~ /\A(\r?\n)/s or return new(undef, $ref);
 	my $hdr = substr($$ref, 0, $+[0], ''); # sv_chop on $$ref
 	bless { hdr => \$hdr, crlf => $1, bdy => $ref }, __PACKAGE__;
 }
diff --git a/lib/PublicInbox/ExtMsg.pm b/lib/PublicInbox/ExtMsg.pm
index ce1a47bb..03faf3a1 100644
--- a/lib/PublicInbox/ExtMsg.pm
+++ b/lib/PublicInbox/ExtMsg.pm
@@ -125,12 +125,12 @@ sub ext_msg {
 sub event_step {
 	my ($ctx, $sync) = @_;
 	# can't find a partial match in current inbox, try the others:
-	my $ibx = shift @{$ctx->{again}} or goto \&finalize_partial;
+	my $ibx = shift @{$ctx->{again}} or return finalize_partial($ctx);
 	my $mids = search_partial($ibx, $ctx->{mid}) or
 			return ($sync ? undef : PublicInbox::DS::requeue($ctx));
 	$ctx->{n_partial} += scalar(@$mids);
 	push @{$ctx->{partial}}, [ $ibx, $mids ];
-	$ctx->{n_partial} >= PARTIAL_MAX ? goto(\&finalize_partial)
+	$ctx->{n_partial} >= PARTIAL_MAX ? finalize_partial($ctx)
 			: ($sync ? undef : PublicInbox::DS::requeue($ctx));
 }
 
@@ -156,7 +156,7 @@ sub finalize_exact {
 		# synchronous fall-through
 		$ctx->event_step while @{$ctx->{again}};
 	}
-	goto \&finalize_partial;
+	finalize_partial($ctx);
 }
 
 sub finalize_partial {
diff --git a/lib/PublicInbox/SolverGit.pm b/lib/PublicInbox/SolverGit.pm
index ae3997ca..83f7a4ee 100644
--- a/lib/PublicInbox/SolverGit.pm
+++ b/lib/PublicInbox/SolverGit.pm
@@ -517,16 +517,16 @@ sub di_url ($$) {
 }
 
 sub retry_current {
-	# my ($self, $want) = @_;
-	push @{$_[0]->{todo}}, $_[1];
-	goto \&next_step # retry solve_existing
+	my ($self, $want) = @_;
+	push @{$self->{todo}}, $want;
+	next_step($self); # retry solve_existing
 }
 
-sub try_harder {
+sub try_harder ($$) {
 	my ($self, $want) = @_;
 
 	# do we have more inboxes to try?
-	goto \&retry_current if scalar @{$want->{try_ibxs}};
+	return retry_current($self, $want) if scalar @{$want->{try_ibxs}};
 
 	my $cur_want = $want->{oid_b};
 	if (length($cur_want) > $OID_MIN) { # maybe a shorter OID will work
@@ -534,7 +534,7 @@ sub try_harder {
 		chop($cur_want);
 		dbg($self, "retrying $want->{oid_b} as $cur_want");
 		$want->{oid_b} = $cur_want;
-		goto \&retry_current; # retry with shorter abbrev
+		return retry_current($self, $want); # retry with shorter abbrev
 	}
 
 	dbg($self, "could not find $cur_want");
@@ -564,9 +564,9 @@ sub extract_diffs_done {
 			my $job = { oid_b => $src, path_b => $di->{path_a} };
 			push @{$self->{todo}}, $job;
 		}
-		goto \&next_step; # onto the next todo item
+		return next_step($self); # onto the next todo item
 	}
-	goto \&try_harder;
+	try_harder($self, $want);
 }
 
 sub extract_diff_async {
@@ -578,9 +578,8 @@ sub extract_diff_async {
 		PublicInbox::Eml->new($bref)->each_part(\&extract_diff, $x, 1);
 	}
 
-	scalar(@{$want->{try_smsgs}}) ?
-		retry_current($self, $want) :
-		extract_diffs_done($self, $want);
+	scalar(@{$want->{try_smsgs}}) ? retry_current($self, $want)
+					: extract_diffs_done($self, $want);
 }
 
 sub resolve_patch ($$) {
@@ -605,7 +604,8 @@ sub resolve_patch ($$) {
 			}
 		}
 
-		goto(scalar @$msgs ? \&retry_current : \&extract_diffs_done);
+		return scalar(@$msgs) ? retry_current($self, $want)
+					: extract_diffs_done($self, $want);
 	}
 
 	# see if we can find the blob in an existing git repo:
@@ -626,9 +626,9 @@ sub resolve_patch ($$) {
 			return;
 		}
 		mark_found($self, $cur_want, $existing);
-		goto \&next_step; # onto patch application
+		return next_step($self); # onto patch application
 	} elsif ($existing > 0) {
-		goto \&retry_current;
+		return retry_current($self, $want);
 	} else { # $existing == 0: we may retry if inbox scan (below) fails
 		delete $want->{try_gits};
 	}
@@ -640,9 +640,9 @@ sub resolve_patch ($$) {
 		$want->{try_smsgs} = $msgs;
 		$want->{cur_ibx} = $ibx;
 		$self->{tmp_diffs} = [];
-		goto \&retry_current;
+		return retry_current($self, $want);
 	}
-	goto \&try_harder;
+	try_harder($self, $want);
 }
 
 # this API is designed to avoid creating self-referential structures;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index a1f6048f..b8abfa94 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -1267,8 +1267,8 @@ sub xapian_only {
 # public, called by public-inbox-index
 sub index_sync {
 	my ($self, $opt) = @_;
-	$opt //= $_[1] //= {};
-	goto \&xapian_only if $opt->{xapian_only};
+	$opt //= {};
+	return xapian_only($self, $opt) if $opt->{xapian_only};
 
 	my $pr = $opt->{-progress};
 	my $epoch_max;
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 3055da20..1d5119cd 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -386,7 +386,7 @@ sub next_in_queue ($$) {
 
 sub stream_thread_i { # PublicInbox::WwwStream::getline callback
 	my ($ctx, $eml) = @_;
-	goto &thread_eml_entry if $eml; # tail recursion
+	return thread_eml_entry($ctx, $eml) if $eml;
 	return unless exists($ctx->{skel});
 	my $ghost_ok = $ctx->{nr}++;
 	while (1) {
diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm
index 0f41dff2..8bbce929 100644
--- a/lib/PublicInbox/Watch.pm
+++ b/lib/PublicInbox/Watch.pm
@@ -651,7 +651,7 @@ sub event_step {
 		PublicInbox::Sigfd::sig_setmask($oldset);
 		die $@ if $@;
 	}
-	goto(&fs_scan_step) if $self->{mdre};
+	fs_scan_step($self) if $self->{mdre};
 }
 
 sub watch_imap_fetch_all ($$) {
@@ -1066,7 +1066,7 @@ sub fs_scan_step {
 sub scan {
 	my ($self, $op) = @_;
 	push @{$self->{ops}}, $op;
-	goto &fs_scan_step;
+	fs_scan_step($self);
 }
 
 sub _importer_for {

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 2/3] t/nntpd: add test for the XPATH command
  2020-09-11  7:32 [PATCH 0/3] mostly NNTP stuff Eric Wong
  2020-09-11  7:32 ` [PATCH 1/3] treewide: avoid `goto &NAME' for tail recursion Eric Wong
@ 2020-09-11  7:32 ` Eric Wong
  2020-09-11  7:32 ` [PATCH 3/3] nntp: share more code between art_lookup callers Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2020-09-11  7:32 UTC (permalink / raw)
  To: meta

It's only in RFC 2980 (not 977 or 3977), but Net::NNTP has
supported it since 2001, at least.  We'll be making changes
to avoid pathological behavior, so test it, first.
---
 t/nntpd.t | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/nntpd.t b/t/nntpd.t
index a3d974cf..14db1a93 100644
--- a/t/nntpd.t
+++ b/t/nntpd.t
@@ -295,6 +295,8 @@ Date: Fri, 02 Oct 1993 00:00:00 +0000
 		'cross newsgroup BODY by Message-ID');
 	ok($n->head('<testmessage@example.com>'),
 		'cross newsgroup HEAD by Message-ID');
+	is($n->xpath('<testmessage@example.com>'), 'x.y.z/1', 'xpath hit');
+	is($n->xpath('<non-existent@example.com>'), undef, 'xpath miss');
 
 	# pipelined requests:
 	{

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 3/3] nntp: share more code between art_lookup callers
  2020-09-11  7:32 [PATCH 0/3] mostly NNTP stuff Eric Wong
  2020-09-11  7:32 ` [PATCH 1/3] treewide: avoid `goto &NAME' for tail recursion Eric Wong
  2020-09-11  7:32 ` [PATCH 2/3] t/nntpd: add test for the XPATH command Eric Wong
@ 2020-09-11  7:32 ` Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2020-09-11  7:32 UTC (permalink / raw)
  To: meta

This prepares us for future changes to improve scalability to
many inboxes.
---
 lib/PublicInbox/NNTP.pm | 44 +++++++++++++++++------------------------
 1 file changed, 18 insertions(+), 26 deletions(-)

diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm
index 46398cd4..88fe2bb0 100644
--- a/lib/PublicInbox/NNTP.pm
+++ b/lib/PublicInbox/NNTP.pm
@@ -446,8 +446,8 @@ sub set_nntp_headers ($$) {
 	}
 }
 
-sub art_lookup ($$) {
-	my ($self, $art) = @_;
+sub art_lookup ($$$) {
+	my ($self, $art, $code) = @_;
 	my $ng = $self->{ng};
 	my ($n, $mid);
 	my $err;
@@ -484,7 +484,17 @@ find_mid:
 found:
 	my $smsg = $ng->over->get_art($n) or return $err;
 	$smsg->{-ibx} = $ng;
-	$smsg;
+	if ($code == 223) { # STAT
+		set_art($self, $n);
+		"223 $n <$smsg->{mid}> article retrieved - " .
+			"request text separately";
+	} else { # HEAD | BODY | ARTICLE
+		$smsg->{nntp} = $self;
+		$smsg->{nntp_code} = $code;
+		set_art($self, $art);
+		# this dereferences to `undef'
+		${git_async_cat($ng->git, $smsg->{blob}, \&blob_cb, $smsg)};
+	}
 }
 
 sub msg_body_write ($$) {
@@ -520,7 +530,7 @@ sub msg_hdr_write ($$) {
 sub blob_cb { # called by git->cat_async via git_async_cat
 	my ($bref, $oid, $type, $size, $smsg) = @_;
 	my $self = $smsg->{nntp};
-	my $code = $smsg->{nntp_code} // 220;
+	my $code = $smsg->{nntp_code};
 	if (!defined($oid)) {
 		# it's possible to have TOCTOU if an admin runs
 		# public-inbox-(edit|purge), just move onto the next message
@@ -553,40 +563,22 @@ sub blob_cb { # called by git->cat_async via git_async_cat
 
 sub cmd_article ($;$) {
 	my ($self, $art) = @_;
-	my $smsg = art_lookup($self, $art);
-	return $smsg unless ref $smsg;
-	set_art($self, $art);
-	$smsg->{nntp} = $self;
-	${git_async_cat($smsg->{-ibx}->git, $smsg->{blob}, \&blob_cb, $smsg)};
+	art_lookup($self, $art, 220);
 }
 
 sub cmd_head ($;$) {
 	my ($self, $art) = @_;
-	my $smsg = art_lookup($self, $art);
-	return $smsg unless ref $smsg;
-	set_art($self, $art);
-	$smsg->{nntp} = $self;
-	$smsg->{nntp_code} = 221;
-	${git_async_cat($smsg->{-ibx}->git, $smsg->{blob}, \&blob_cb, $smsg)};
+	art_lookup($self, $art, 221);
 }
 
 sub cmd_body ($;$) {
 	my ($self, $art) = @_;
-	my $smsg = art_lookup($self, $art);
-	return $smsg unless ref $smsg;
-	set_art($self, $art);
-	$smsg->{nntp} = $self;
-	$smsg->{nntp_code} = 222;
-	${git_async_cat($smsg->{-ibx}->git, $smsg->{blob}, \&blob_cb, $smsg)};
+	art_lookup($self, $art, 222);
 }
 
 sub cmd_stat ($;$) {
 	my ($self, $art) = @_;
-	my $smsg = art_lookup($self, $art); # art may be msgid
-	return $smsg unless ref $smsg;
-	$art = $smsg->{num};
-	set_art($self, $art);
-	"223 $art <$smsg->{mid}> article retrieved - request text separately";
+	art_lookup($self, $art, 223); # art may be msgid
 }
 
 sub cmd_ihave ($) { '435 article not wanted - do not send it' }

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-09-11  7:32 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-11  7:32 [PATCH 0/3] mostly NNTP stuff Eric Wong
2020-09-11  7:32 ` [PATCH 1/3] treewide: avoid `goto &NAME' for tail recursion Eric Wong
2020-09-11  7:32 ` [PATCH 2/3] t/nntpd: add test for the XPATH command Eric Wong
2020-09-11  7:32 ` [PATCH 3/3] nntp: share more code between art_lookup callers Eric Wong

user/dev discussion of public-inbox itself

This inbox may be cloned and mirrored by anyone:

	git clone --mirror http://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 meta meta/ http://public-inbox.org/meta \
		meta@public-inbox.org
	public-inbox-index meta

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for the project(s) associated with this inbox:

	https://80x24.org/public-inbox.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git