user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* [PATCH 0/2] Maildir fixups
@ 2021-02-19  0:58  7% Eric Wong
  2021-02-19  0:58  6% ` [PATCH 2/2] emergency: modernize and reduce syscalls Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2021-02-19  0:58 UTC (permalink / raw)
  To: meta

Emergency only writes to "new", and LeiToMail only writes to
"cur", but otherwise try to make them more similar.  LeiToMail
remains more performance critical since it's for bulk dumps;
and Emergency is close to the original djb specification with
hostname and PID in the filename.

Eric Wong (2):
  lei_to_mail: Maildir: ensure link(2) succeeds
  emergency: modernize and reduce syscalls

 lib/PublicInbox/Emergency.pm | 59 +++++++++++++++---------------------
 lib/PublicInbox/LeiToMail.pm |  3 +-
 2 files changed, 26 insertions(+), 36 deletions(-)

^ permalink raw reply	[relevance 7%]

* [PATCH 2/2] emergency: modernize and reduce syscalls
  2021-02-19  0:58  7% [PATCH 0/2] Maildir fixups Eric Wong
@ 2021-02-19  0:58  6% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2021-02-19  0:58 UTC (permalink / raw)
  To: meta

As with LeiToMail, we'll exclusively rely on O_EXCL and EEXIST
instead of "-f" (stat(2)) for file name collision checking.
Furthermore, we can rely on link(2) error handling instead of
using stat(2) to check the result of link(2).

We'll still keep the hostname in these filenames, but memoize it
on a per-instance basis since hostname changes are rare and we
can assume it won't change between "tmp" and "cur".

We'll also start embedding the PID as {"tmp.$$"} into the fiel
name to guard against accidental deletion in child processes,
instead of requiring an extra hash lookup.

Finally, avoid multiple getpid(2) syscalls in internal subs
since glibc no longer caches in getpid(3).

We'll also favor constant comparison of $! against EEXIST for
inlining. and stop doing ->autoflush when we only have a single
print + flush.
---
 lib/PublicInbox/Emergency.pm | 59 +++++++++++++++---------------------
 1 file changed, 24 insertions(+), 35 deletions(-)

diff --git a/lib/PublicInbox/Emergency.pm b/lib/PublicInbox/Emergency.pm
index 67f27bc7..5a1ed1d7 100644
--- a/lib/PublicInbox/Emergency.pm
+++ b/lib/PublicInbox/Emergency.pm
@@ -4,10 +4,11 @@
 # Emergency Maildir delivery for MDA
 package PublicInbox::Emergency;
 use strict;
-use warnings;
+use v5.10.1;
 use Fcntl qw(:DEFAULT SEEK_SET);
 use Sys::Hostname qw(hostname);
-use IO::Handle; # ->flush, ->autoflush
+use IO::Handle; # ->flush
+use Errno qw(EEXIST);
 
 sub new {
 	my ($class, $dir) = @_;
@@ -20,49 +21,43 @@ sub new {
 			die "failed to mkpath($d): $!\n";
 		}
 	}
-	bless { dir => $dir, files => {}, t => 0, cnt => 0, pid => $$ }, $class;
+	bless { dir => $dir, t => 0 }, $class;
 }
 
 sub _fn_in {
-	my ($self, $dir) = @_;
-	my @host = split(/\./, hostname);
+	my ($self, $pid, $dir) = @_;
+	my $host = $self->{short_host} //= (split(/\./, hostname))[0];
 	my $now = time;
+	my $n;
 	if ($self->{t} != $now) {
 		$self->{t} = $now;
-		$self->{cnt} = 0;
+		$n = $self->{cnt} = 0;
 	} else {
-		$self->{cnt}++;
+		$n = ++$self->{cnt};
 	}
-
-	my $f;
-	do {
-		$f = "$self->{dir}/$dir/$self->{t}.$$"."_$self->{cnt}.$host[0]";
-		$self->{cnt}++;
-	} while (-e $f);
-	$f;
+	"$self->{dir}/$dir/$self->{t}.$pid"."_$n.$host";
 }
 
 sub prepare {
 	my ($self, $strref) = @_;
-
-	die "already in transaction: $self->{tmp}" if $self->{tmp};
+	my $pid = $$;
+	my $tmp_key = "tmp.$pid";
+	die "already in transaction: $self->{$tmp_key}" if $self->{$tmp_key};
 	my ($tmp, $fh);
 	do {
-		$tmp = _fn_in($self, 'tmp');
+		$tmp = _fn_in($self, $pid, 'tmp');
 		$! = undef;
-	} while (!sysopen($fh, $tmp, O_CREAT|O_EXCL|O_RDWR) && $!{EEXIST});
+	} while (!sysopen($fh, $tmp, O_CREAT|O_EXCL|O_RDWR) and $! == EEXIST);
 	print $fh $$strref or die "write failed: $!";
 	$fh->flush or die "flush failed: $!";
-	$fh->autoflush(1);
 	$self->{fh} = $fh;
-	$self->{tmp} = $tmp;
+	$self->{$tmp_key} = $tmp;
 }
 
 sub abort {
 	my ($self) = @_;
 	delete $self->{fh};
-	my $tmp = delete $self->{tmp} or return;
-
+	my $tmp = delete $self->{"tmp.$$"} or return;
 	unlink($tmp) or warn "Failed to unlink $tmp: $!";
 	undef;
 }
@@ -77,21 +72,15 @@ sub fh {
 
 sub commit {
 	my ($self) = @_;
-	$$ == $self->{pid} or return; # no-op in forked child
-
+	my $pid = $$;
+	my $tmp = delete $self->{"tmp.$pid"} or return;
 	delete $self->{fh};
-	my $tmp = delete $self->{tmp} or return;
-	my $new;
+	my ($new, $ok);
 	do {
-		$new = _fn_in($self, 'new');
-	} while (!link($tmp, $new) && $!{EEXIST});
-	my @sn = stat($new) or die "stat $new failed: $!";
-	my @st = stat($tmp) or die "stat $tmp failed: $!";
-	if ($st[0] == $sn[0] && $st[1] == $sn[1]) {
-		unlink($tmp) or warn "Failed to unlink $tmp: $!";
-	} else {
-		warn "stat($new) and stat($tmp) differ";
-	}
+		$new = _fn_in($self, $pid, 'new');
+	} while (!($ok = link($tmp, $new)) && $! == EEXIST);
+	die "link($tmp, $new): $!" unless $ok;
+	unlink($tmp) or warn "Failed to unlink $tmp: $!";
 }
 
 sub DESTROY { commit($_[0]) }

^ permalink raw reply related	[relevance 6%]

Results 1-2 of 2 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2021-02-19  0:58  7% [PATCH 0/2] Maildir fixups Eric Wong
2021-02-19  0:58  6% ` [PATCH 2/2] emergency: modernize and reduce syscalls Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).