user/dev discussion of public-inbox itself
From: Dominique Martinet <asmadeus@codewreck.org>
To: Julien Moutinho <julm+public-inbox@sourcephile.fr>
Cc: Eric Wong <e@80x24.org>, meta@public-inbox.org
Subject: Re: Test failures with 1.7.0
Date: Thu, 9 Dec 2021 11:53:49 +0900
Message-ID: <YbFvvR4slPwUD8Gh@codewreck.org>
In-Reply-To: <20211209013743.okzgim7bbrpahks7@sourcephile.fr>

Julien Moutinho wrote on Thu, Dec 09, 2021 at 02:37:43AM +0100:
> I can also reproduce Infinisil's test failure with:
> $ (cd public-inbox-1.7.0; TMPDIR=/var/tmp perl -I$out/lib/perl5/site_perl t/lei_to_mail.t )
> > ok 96 - got Maildir callback
> > Use of uninitialized value in open at t/lei_to_mail.t line 263. 
> > Bail out!  No such file or directory 

I got curious about this one.
strace tells me:
----
2813384 renameat2(AT_FDCWD, "/tank/pi-lei_to_mail-2813384-n7sk/maildir/tmp/badc0ffee", AT_FDCWD, "/tank/pi-lei_to_mail-2813384-n7sk/maildir/cur/badc0ffee:2,", RENAME_NOREPLACE) = -1 EINVAL (Invalid argument)
2813384 openat(AT_FDCWD, "/tank/pi-lei_to_mail-2813384-n7sk/maildir/new/", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 4
2813384 newfstatat(4, "", {st_mode=S_IFDIR|0755, st_size=2, ...}, AT_EMPTY_PATH) = 0
2813384 brk(0x44f4000)                  = 0x44f4000
2813384 getdents64(4, 0x44b3e40 /* 2 entries */, 131072) = 48
2813384 getdents64(4, 0x44b3e40 /* 0 entries */, 131072) = 0
2813384 close(4)                        = 0
2813384 openat(AT_FDCWD, "/tank/pi-lei_to_mail-2813384-n7sk/maildir/cur/", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 4
2813384 newfstatat(4, "", {st_mode=S_IFDIR|0755, st_size=2, ...}, AT_EMPTY_PATH) = 0
2813384 getdents64(4, 0x44b3e40 /* 2 entries */, 131072) = 48
2813384 getdents64(4, 0x44b3e40 /* 0 entries */, 131072) = 0
2813384 close(4)                        = 0
2813384 write(2, "Use of uninitialized value in op"..., 64) = 64
2813384 openat(AT_FDCWD, "", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
2813384 getpid()                        = 2813384
2813384 getpid()                        = 2813384
2813384 getpid()                        = 2813384
2813384 write(5, "Bail out!  No such file or direc"..., 37) = 37
----

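So the renameat2 call fails with EINVAL, nothing ever lands in cur/, and
the test ends up calling open with an undefined filename.
This one is a real bug; the patch below appears to fix it: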
----
From 50a63628d505ca1c8d36f94ab5703f87a2c5e415 Mon Sep 17 00:00:00 2001
From: Dominique Martinet <asmadeus@codewreck.org>
Date: Thu, 9 Dec 2021 11:50:51 +0900
Subject: [PATCH] syscall: fallback to rename on renameat2 EINVAL

ZFS returns EINVAL on renameat2 when the operation is not supported
(renameat2(2) documents EINVAL for flags the filesystem does not support):
renameat2(AT_FDCWD, "...", AT_FDCWD, "...", RENAME_NOREPLACE) = -1 EINVAL

Fall back to the racy rename in this case as well:

diff --git a/lib/PublicInbox/Syscall.pm b/lib/PublicInbox/Syscall.pm
index c00385b94db8..78f926ac38f0 100644
--- a/lib/PublicInbox/Syscall.pm
+++ b/lib/PublicInbox/Syscall.pm
@@ -15,7 +15,7 @@ package PublicInbox::Syscall;
 use strict;
 use v5.10.1;
 use parent qw(Exporter);
-use POSIX qw(ENOENT EEXIST ENOSYS O_NONBLOCK);
+use POSIX qw(ENOENT EEXIST ENOSYS EINVAL O_NONBLOCK);
 use Config;
 
 # $VERSION = '0.25'; # Sys::Syscall version
@@ -312,7 +312,7 @@ sub rename_noreplace ($$) {
 		my $ret = syscall($SYS_renameat2, -100, $old, -100, $new, 1);
 		if ($ret == 0) {
 			1; # like rename() perlop
-		} elsif ($! == ENOSYS) {
+		} elsif ($! == ENOSYS || $! == EINVAL) {
 			undef $SYS_renameat2;
 			_rename_noreplace_racy($old, $new);
 		} else {
----
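
For anyone who wants to play with the fallback logic outside of Perl, here
is a minimal sketch of the same idea in Python. It is not the project's
code: the function names are made up for illustration, and the
link()+unlink() fallback is only my assumption of what a "racy" rename
without replace can look like. It assumes Linux (AT_FDCWD is -100 there,
as in the syscall() call in the patch).

```python
import ctypes
import errno
import os

AT_FDCWD = -100        # Linux value, same constant the patch passes to syscall()
RENAME_NOREPLACE = 1   # flag value from <linux/fs.h>

# Look up the libc wrapper; it may be missing (older glibc, non-Linux systems).
try:
    _renameat2 = ctypes.CDLL(None, use_errno=True).renameat2
except (AttributeError, OSError):
    _renameat2 = None


def _rename_noreplace_racy(old, new):
    """Hypothetical fallback: link() fails with EEXIST if `new` exists."""
    os.link(old, new)
    os.unlink(old)  # both names exist for a moment, hence "racy"


def rename_noreplace(old, new):
    """Rename old to new, failing with EEXIST instead of overwriting new."""
    global _renameat2
    if _renameat2 is not None:
        ret = _renameat2(AT_FDCWD, os.fsencode(old),
                         AT_FDCWD, os.fsencode(new), RENAME_NOREPLACE)
        if ret == 0:
            return
        err = ctypes.get_errno()
        if err not in (errno.ENOSYS, errno.EINVAL):
            raise OSError(err, os.strerror(err), new)
        # ENOSYS: no such syscall; EINVAL: the filesystem rejects the flag.
        # Either way, stop trying renameat2 from now on.
        _renameat2 = None
    _rename_noreplace_racy(old, new)
```

Calling rename_noreplace("maildir/tmp/x", "maildir/cur/x:2,") should then
behave the same on a filesystem that rejects RENAME_NOREPLACE as on one that
supports it, apart from the short window where both names exist.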

> This test does succeed outside Nix's sandbox:
> $ (cd public-inbox-1.7.0; export PERL_INLINE_DIRECTORY=$PWD/inline-c; rm -rf $PERL_INLINE_DIRECTORY; mkdir $PERL_INLINE_DIRECTORY; prove -bvw t/lei-sigpipe.t )
> > t/lei-sigpipe.t ..               
> > ok 1 - lei import $TMPDIR/lei-daemon/big.eml
> > ok 2 - read one byte             
> > ok 3 - signaled                  
> > ok 4 - got SIGPIPE               
> > ok 5 - quiet after sigpipe 
> > ok 6 - read one byte
> > ok 7 - signaled -f mboxcl2       
> > ok 8 - got SIGPIPE -f mboxcl2    
> > ok 9 - quiet after sigpipe -f mboxcl2
> > ok 10 - read one byte
> > ok 11 - signaled -f text         
> > ok 12 - got SIGPIPE -f text
> > ok 13 - quiet after sigpipe -f text
> > ok 14 - lei daemon-pid (daemon-pid after t/lei-sigpipe.t:44)
> > ok 15 - daemon running after t/lei-sigpipe.t:44
> > ok 16 - lei daemon-kill (daemon-kill after t/lei-sigpipe.t:44)
> > ok 17 - t/lei-sigpipe.t:44 daemon stopped
> > ok 18 - t/lei-sigpipe.t:44 daemon XDG_RUNTIME_DIR/lei/errors.log empty
> > 1..18
> > ok
> > All tests successful.
> > Files=1, Tests=18,  7 wallclock secs ( 0.06 usr  0.06 sys +  3.44 cusr  2.73 csys =  6.29 CPU)
> > Result: PASS
> 
> More surprisingly, it even succeeds when run manually
> inside the hanging Nix sandbox:
> $ sudo nsenter --target 3137110 --all -S 1000 -G 100 $(readlink -e $(which bash))
> $ . /build/env-vars
> $ cd /build
> $ export HOME=$(mktemp -d)
> $ mkdir -p $HOME/.cache/public-inbox/inline-c
> $ LANG=C prove -bvw t/lei-sigpipe.t
> > t/lei-sigpipe.t .. 
> > ok 1 - lei import $TMPDIR/lei-daemon/big.eml
> > ok 2 - read one byte
> > ok 3 - signaled 
> > ok 4 - got SIGPIPE 
> > ok 5 - quiet after sigpipe 
> > ok 6 - read one byte
> > ok 7 - signaled -f mboxcl2
> > ok 8 - got SIGPIPE -f mboxcl2
> > ok 9 - quiet after sigpipe -f mboxcl2
> > ok 10 - read one byte
> > ok 11 - signaled -f text
> > ok 12 - got SIGPIPE -f text
> > ok 13 - quiet after sigpipe -f text
> > ok 14 - lei daemon-pid (daemon-pid after t/lei-sigpipe.t:44)
> > ok 15 - daemon running after t/lei-sigpipe.t:44
> > ok 16 - lei daemon-kill (daemon-kill after t/lei-sigpipe.t:44)
> > ok 17 - t/lei-sigpipe.t:44 daemon stopped
> > ok 18 - t/lei-sigpipe.t:44 daemon XDG_RUNTIME_DIR/lei/errors.log empty
> > 1..18
> > ok
> > All tests successful.
> > Files=1, Tests=18,  4 wallclock secs ( 0.06 usr  0.06 sys +  1.23 cusr  1.48 csys =  2.83 CPU)
> > Result: PASS
> 
> Even more strange, Dominique was able to reproduce
> the hang this morning, but no longer tonight..

Yes, I don't get it: it hung once but no longer hangs, so as much as
I'd have liked to investigate, I'm a bit stuck.

With this patch, I can confirm that running with inline-c also makes the
tests that failed on the btrfs chattr call pass.
So all that's left is to fix the proc mounts parsing there :)

-- 
Dominique
