user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Maildir fixups
@ 2021-02-19  0:58 Eric Wong
  2021-02-19  0:58 ` [PATCH 1/2] lei_to_mail: Maildir: ensure link(2) succeeds Eric Wong
  2021-02-19  0:58 ` [PATCH 2/2] emergency: modernize and reduce syscalls Eric Wong
  0 siblings, 2 replies; 3+ messages in thread
From: Eric Wong @ 2021-02-19  0:58 UTC (permalink / raw)
  To: meta

Emergency only writes to "new", and LeiToMail only writes to
"cur", but otherwise try to make them more similar.  LeiToMail
remains more performance critical since it's for bulk dumps;
and Emergency is close to the original djb specification with
hostname and PID in the filename.

Eric Wong (2):
  lei_to_mail: Maildir: ensure link(2) succeeds
  emergency: modernize and reduce syscalls

 lib/PublicInbox/Emergency.pm | 59 +++++++++++++++---------------------
 lib/PublicInbox/LeiToMail.pm |  3 +-
 2 files changed, 26 insertions(+), 36 deletions(-)

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/2] lei_to_mail: Maildir: ensure link(2) succeeds
  2021-02-19  0:58 [PATCH 0/2] Maildir fixups Eric Wong
@ 2021-02-19  0:58 ` Eric Wong
  2021-02-19  0:58 ` [PATCH 2/2] emergency: modernize and reduce syscalls Eric Wong
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Wong @ 2021-02-19  0:58 UTC (permalink / raw)
  To: meta

link(2) may fail with errors other than EEXIST; just bail out
since something is likely seriously wrong.
---
 lib/PublicInbox/LeiToMail.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index f0adc44f..8a2d9471 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -302,8 +302,9 @@ sub _buf2maildir {
 		$rand = '';
 		do {
 			$final = $dst.$rand.$common.':2,'.$sfx;
-		} while (!link($tmp, $final) && $! == EEXIST &&
+		} while (!($ok = link($tmp, $final)) && $! == EEXIST &&
 			($rand = _rand.','));
+		die "link($tmp, $final): $!" unless $ok;
 		unlink($tmp) or warn "W: failed to unlink $tmp: $!\n";
 	} else {
 		my $err = "Error writing $smsg->{blob} to $dst: $!\n";

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 2/2] emergency: modernize and reduce syscalls
  2021-02-19  0:58 [PATCH 0/2] Maildir fixups Eric Wong
  2021-02-19  0:58 ` [PATCH 1/2] lei_to_mail: Maildir: ensure link(2) succeeds Eric Wong
@ 2021-02-19  0:58 ` Eric Wong
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Wong @ 2021-02-19  0:58 UTC (permalink / raw)
  To: meta

As with LeiToMail, we'll exclusively rely on O_EXCL and EEXIST
instead of "-f" (stat(2)) for file name collision checking.
Furthermore, we can rely on link(2) error handling instead of
using stat(2) to check the result of link(2).

We'll still keep the hostname in these filenames, but memoize it
on a per-instance basis since hostname changes are rare and we
can assume it won't change between "tmp" and "cur".

We'll also start embedding the PID as {"tmp.$$"} into the fiel
name to guard against accidental deletion in child processes,
instead of requiring an extra hash lookup.

Finally, avoid multiple getpid(2) syscalls in internal subs
since glibc no longer caches in getpid(3).

We'll also favor constant comparison of $! against EEXIST for
inlining. and stop doing ->autoflush when we only have a single
print + flush.
---
 lib/PublicInbox/Emergency.pm | 59 +++++++++++++++---------------------
 1 file changed, 24 insertions(+), 35 deletions(-)

diff --git a/lib/PublicInbox/Emergency.pm b/lib/PublicInbox/Emergency.pm
index 67f27bc7..5a1ed1d7 100644
--- a/lib/PublicInbox/Emergency.pm
+++ b/lib/PublicInbox/Emergency.pm
@@ -4,10 +4,11 @@
 # Emergency Maildir delivery for MDA
 package PublicInbox::Emergency;
 use strict;
-use warnings;
+use v5.10.1;
 use Fcntl qw(:DEFAULT SEEK_SET);
 use Sys::Hostname qw(hostname);
-use IO::Handle; # ->flush, ->autoflush
+use IO::Handle; # ->flush
+use Errno qw(EEXIST);
 
 sub new {
 	my ($class, $dir) = @_;
@@ -20,49 +21,43 @@ sub new {
 			die "failed to mkpath($d): $!\n";
 		}
 	}
-	bless { dir => $dir, files => {}, t => 0, cnt => 0, pid => $$ }, $class;
+	bless { dir => $dir, t => 0 }, $class;
 }
 
 sub _fn_in {
-	my ($self, $dir) = @_;
-	my @host = split(/\./, hostname);
+	my ($self, $pid, $dir) = @_;
+	my $host = $self->{short_host} //= (split(/\./, hostname))[0];
 	my $now = time;
+	my $n;
 	if ($self->{t} != $now) {
 		$self->{t} = $now;
-		$self->{cnt} = 0;
+		$n = $self->{cnt} = 0;
 	} else {
-		$self->{cnt}++;
+		$n = ++$self->{cnt};
 	}
-
-	my $f;
-	do {
-		$f = "$self->{dir}/$dir/$self->{t}.$$"."_$self->{cnt}.$host[0]";
-		$self->{cnt}++;
-	} while (-e $f);
-	$f;
+	"$self->{dir}/$dir/$self->{t}.$pid"."_$n.$host";
 }
 
 sub prepare {
 	my ($self, $strref) = @_;
-
-	die "already in transaction: $self->{tmp}" if $self->{tmp};
+	my $pid = $$;
+	my $tmp_key = "tmp.$pid";
+	die "already in transaction: $self->{$tmp_key}" if $self->{$tmp_key};
 	my ($tmp, $fh);
 	do {
-		$tmp = _fn_in($self, 'tmp');
+		$tmp = _fn_in($self, $pid, 'tmp');
 		$! = undef;
-	} while (!sysopen($fh, $tmp, O_CREAT|O_EXCL|O_RDWR) && $!{EEXIST});
+	} while (!sysopen($fh, $tmp, O_CREAT|O_EXCL|O_RDWR) and $! == EEXIST);
 	print $fh $$strref or die "write failed: $!";
 	$fh->flush or die "flush failed: $!";
-	$fh->autoflush(1);
 	$self->{fh} = $fh;
-	$self->{tmp} = $tmp;
+	$self->{$tmp_key} = $tmp;
 }
 
 sub abort {
 	my ($self) = @_;
 	delete $self->{fh};
-	my $tmp = delete $self->{tmp} or return;
-
+	my $tmp = delete $self->{"tmp.$$"} or return;
 	unlink($tmp) or warn "Failed to unlink $tmp: $!";
 	undef;
 }
@@ -77,21 +72,15 @@ sub fh {
 
 sub commit {
 	my ($self) = @_;
-	$$ == $self->{pid} or return; # no-op in forked child
-
+	my $pid = $$;
+	my $tmp = delete $self->{"tmp.$pid"} or return;
 	delete $self->{fh};
-	my $tmp = delete $self->{tmp} or return;
-	my $new;
+	my ($new, $ok);
 	do {
-		$new = _fn_in($self, 'new');
-	} while (!link($tmp, $new) && $!{EEXIST});
-	my @sn = stat($new) or die "stat $new failed: $!";
-	my @st = stat($tmp) or die "stat $tmp failed: $!";
-	if ($st[0] == $sn[0] && $st[1] == $sn[1]) {
-		unlink($tmp) or warn "Failed to unlink $tmp: $!";
-	} else {
-		warn "stat($new) and stat($tmp) differ";
-	}
+		$new = _fn_in($self, $pid, 'new');
+	} while (!($ok = link($tmp, $new)) && $! == EEXIST);
+	die "link($tmp, $new): $!" unless $ok;
+	unlink($tmp) or warn "Failed to unlink $tmp: $!";
 }
 
 sub DESTROY { commit($_[0]) }

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-02-19  0:58 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-19  0:58 [PATCH 0/2] Maildir fixups Eric Wong
2021-02-19  0:58 ` [PATCH 1/2] lei_to_mail: Maildir: ensure link(2) succeeds Eric Wong
2021-02-19  0:58 ` [PATCH 2/2] emergency: modernize and reduce syscalls Eric Wong

user/dev discussion of public-inbox itself

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 meta meta/ https://public-inbox.org/meta \
		meta@public-inbox.org
	public-inbox-index meta

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/inbox.comp.mail.public-inbox.meta
	nntp://ie5yzdi7fg72h7s4sdcztq5evakq23rdt33mfyfcddc5u3ndnw24ogqd.onion/inbox.comp.mail.public-inbox.meta
	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git