git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* RE: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
@ 2021-06-02 17:52 Randall S. Becker
  2021-06-02 19:32 ` Taylor Blau
  0 siblings, 1 reply; 22+ messages in thread
From: Randall S. Becker @ 2021-06-02 17:52 UTC (permalink / raw)
  To: 'Junio C Hamano', git

On June 2, 2021 4:30 AM, Junio C Hamano wrote:
>Subject: [ANNOUNCE] Git v2.32.0-rc3
>
>A release candidate Git v2.32.0-rc3 is now available for testing at
>the usual places.  It is comprised of 589 non-merge commits since
>v2.31.0, contributed by 84 people, 31 of which are new faces [*].

Here is the current -x output from t5300. That last known working version was 2.31.1.

<snip>
ok 1 - setup

expecting success of 5300.2 'pack without delta':
        packname_1=$(git pack-objects --progress --window=0 test-1 \
                        <obj-list 2>stderr) &&
        check_deltas stderr = 0

+ + git pack-objects --progress --window=0 test-1
+ 0< obj-list 2> stderr
packname_1=
error: last command exited with $?=128
not ok 2 - pack without delta
#
#               packname_1=$(git pack-objects --progress --window=0 test-1 \
#                               <obj-list 2>stderr) &&
#               check_deltas stderr = 0
#

Working area is:
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS    4096 Jun  2 12:47 a
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS 2097152 Jun  2 12:47 a_big
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS    4096 Jun  2 12:47 b
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS 2097152 Jun  2 12:47 b_big
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS    4096 Jun  2 12:47 c
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS    4100 Jun  2 12:47 d
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS 4228179 Jun  2 12:48 expect
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS     328 Jun  2 12:48 obj-list
-rw-rw-r-- 1 JENKINS.JENKINS JENKINS     605 Jun  2 12:48 stderr

Stderr contains:
Enumerating objects: 8, done.
Counting objects: 100% (8/8), done.
fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN': Interrupted system call

The maximum platform I/O size is 56Kb.

I'm happy to help figure this out but need some direction. I don't know the pack-object code.

Sincerely,
Randall


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 17:52 [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86 Randall S. Becker
@ 2021-06-02 19:32 ` Taylor Blau
  2021-06-02 19:49   ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Taylor Blau @ 2021-06-02 19:32 UTC (permalink / raw)
  To: Randall S. Becker; +Cc: 'Junio C Hamano', git

On Wed, Jun 02, 2021 at 01:52:44PM -0400, Randall S. Becker wrote:
> I'm happy to help figure this out but need some direction. I don't
> know the pack-object code.

Is the failure consistent, i.e., that it occurs every time you run the
test? Not knowing much about your platform, it would be helpful to have
a bisection showing where this breakage first occurs.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 19:32 ` Taylor Blau
@ 2021-06-02 19:49   ` Jeff King
  2021-06-02 20:11     ` Taylor Blau
  2021-06-02 20:11     ` Eric Wong
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff King @ 2021-06-02 19:49 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Randall S. Becker, 'Junio C Hamano', git

On Wed, Jun 02, 2021 at 03:32:43PM -0400, Taylor Blau wrote:

> On Wed, Jun 02, 2021 at 01:52:44PM -0400, Randall S. Becker wrote:
> > I'm happy to help figure this out but need some direction. I don't
> > know the pack-object code.
> 
> Is the failure consistent, i.e., that it occurs every time you run the
> test? Not knowing much about your platform, it would be helpful to have
> a bisection showing where this breakage first occurs.

I suspect the symptom comes from the test Randall noted, but that the
actual issue has been there all along. The test uses "--progress"
explicitly, so we'll be sending SIGALRM (whereas most tests will disable
the progress mechanism because their output isn't going to a tty).

And so when he gets this error:

  fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN': Interrupted system call

presumably we were in fsync() when the signal arrived, and unlike most
other platforms, the call needs to be restarted manually (even though we
set up the signal with SA_RESTART). I'm not sure if this violates POSIX
or not (I couldn't find a definitive answer to the set of interruptible
functions in the standard). But either way, the workaround is probably
something like:


  #ifdef FSYNC_NEEDS_RESTART
  #undef fsync /* we'd define to git_fsync() in a header file */
  static int git_fsync(int fd)
  {
	int ret;
	while ((ret = fsync(fd)) < 0 && errno == EINTR)
		; /* try again */
	return ret;
  }
  #endif

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 19:49   ` Jeff King
@ 2021-06-02 20:11     ` Taylor Blau
  2021-06-02 20:15       ` Jeff King
  2021-06-02 20:11     ` Eric Wong
  1 sibling, 1 reply; 22+ messages in thread
From: Taylor Blau @ 2021-06-02 20:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Randall S. Becker, 'Junio C Hamano', git

On Wed, Jun 02, 2021 at 03:49:31PM -0400, Jeff King wrote:
> And so when he gets this error:
>
>   fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN': Interrupted system call
>
> presumably we were in fsync() when the signal arrived, and unlike most
> other platforms, the call needs to be restarted manually (even though we
> set up the signal with SA_RESTART).

Aha, thanks for explaining. That makes sense to me.

> I'm not sure if this violates POSIX or not (I couldn't find a
> definitive answer to the set of interruptible functions in the
> standard).

I couldn't find anything either, but I believe you that what you
outlined is the right solution.

> But either way, the workaround is probably something like:

Seems very reasonable. Here's a patch:

--- >8 ---

Subject: [PATCH] compat: introduce git_fsync_with_restart()

Some platforms, like NonStop do not automatically restart fsync() when
interrupted by a signal, even when that signal is setup with SA_RESTART.

This can lead to test breakage, e.g., where "--progress" is used, thus
SIGALRM is sent often, and can interrupt an fsync() syscall.

Add a Makefile knob FSYNC_NEEDS_RESTART to replace fsync() with one that
gracefully handles getting EINTR.

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Makefile          |  8 ++++++++
 compat/fsync.c    | 10 ++++++++++
 git-compat-util.h |  6 ++++++
 3 files changed, 24 insertions(+)
 create mode 100644 compat/fsync.c

diff --git a/Makefile b/Makefile
index c3565fc0f8..ebea693a2c 100644
--- a/Makefile
+++ b/Makefile
@@ -423,6 +423,9 @@ all::
 # Define NEED_ACCESS_ROOT_HANDLER if access() under root may success for X_OK
 # even if execution permission isn't granted for any user.
 #
+# Define FSYNC_NEEDS_RESTART if your platform's fsync() needs to be restarted
+# manually when interrupted by a signal.
+#
 # Define PAGER_ENV to a SP separated VAR=VAL pairs to define
 # default environment variables to be passed when a pager is spawned, e.g.
 #
@@ -1926,6 +1929,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
 	COMPAT_OBJS += compat/access.o
 endif

+ifdef FSYNC_NEEDS_RESTART
+	COMPAT_CFLAGS += -DFSYNC_NEEDS_RESTART
+	COMPAT_OBJS += compat/fsync.o
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
diff --git a/compat/fsync.c b/compat/fsync.c
new file mode 100644
index 0000000000..f340fc628b
--- /dev/null
+++ b/compat/fsync.c
@@ -0,0 +1,10 @@
+#include "git-compat-util.h"
+
+#undef fsync
+int git_fsync_with_restart(int fd)
+{
+	int ret;
+	while ((ret = fsync(fd)) < 0 && errno == EINTR)
+		; /* try again */
+	return ret;
+}
diff --git a/git-compat-util.h b/git-compat-util.h
index a508dbe5a3..972fdb609f 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -833,6 +833,12 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 int git_atexit(void (*handler)(void));
 #endif

+#ifdef FSYNC_NEEDS_RESTART
+#undef fsync
+#define fsync git_fsync_with_restart
+int git_fsync_with_restart(int fd);
+#endif
+
 static inline size_t st_add(size_t a, size_t b)
 {
 	if (unsigned_add_overflows(a, b))
--
2.31.1.163.ga65ce7f831


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 19:49   ` Jeff King
  2021-06-02 20:11     ` Taylor Blau
@ 2021-06-02 20:11     ` Eric Wong
  2021-06-02 20:14       ` Jeff King
  1 sibling, 1 reply; 22+ messages in thread
From: Eric Wong @ 2021-06-02 20:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Randall S. Becker, 'Junio C Hamano', git

Jeff King <peff@peff.net> wrote:
> And so when he gets this error:
> 
>   fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN': Interrupted system call
> 
> presumably we were in fsync() when the signal arrived, and unlike most
> other platforms, the call needs to be restarted manually (even though we
> set up the signal with SA_RESTART). I'm not sure if this violates POSIX
> or not (I couldn't find a definitive answer to the set of interruptible
> functions in the standard). But either way, the workaround is probably
> something like:

"man 3posix fsync" says EINTR is allowed ("manpages-posix-dev"
package in Debian non-free).

>   #ifdef FSYNC_NEEDS_RESTART

The wrapper should apply to all platforms.  NFS (and presumably
other network FSes) can be mounted with interrupts enabled.

>   #undef fsync /* we'd define to git_fsync() in a header file */
>   static int git_fsync(int fd)
>   {
> 	int ret;
> 	while ((ret = fsync(fd)) < 0 && errno == EINTR)
> 		; /* try again */
> 	return ret;
>   }

Looks fine.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:11     ` Eric Wong
@ 2021-06-02 20:14       ` Jeff King
  2021-06-02 20:18         ` Taylor Blau
                           ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Jeff King @ 2021-06-02 20:14 UTC (permalink / raw)
  To: Eric Wong; +Cc: Taylor Blau, Randall S. Becker, 'Junio C Hamano', git

On Wed, Jun 02, 2021 at 08:11:50PM +0000, Eric Wong wrote:

> Jeff King <peff@peff.net> wrote:
> > And so when he gets this error:
> > 
> >   fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN': Interrupted system call
> > 
> > presumably we were in fsync() when the signal arrived, and unlike most
> > other platforms, the call needs to be restarted manually (even though we
> > set up the signal with SA_RESTART). I'm not sure if this violates POSIX
> > or not (I couldn't find a definitive answer to the set of interruptible
> > functions in the standard). But either way, the workaround is probably
> > something like:
> 
> "man 3posix fsync" says EINTR is allowed ("manpages-posix-dev"
> package in Debian non-free).

Ah, thanks. Linux's fsync(3) doesn't mention it, and nor does it appear
in the discussion of interruptible calls in signals(7). So I was looking
for a POSIX equivalent of that signals manpage but couldn't find one. :)

> >   #ifdef FSYNC_NEEDS_RESTART
> 
> The wrapper should apply to all platforms.  NFS (and presumably
> other network FSes) can be mounted with interrupts enabled.

I don't mind that, as the wrapper is pretty low-cost (and one less
Makefile knob is nice). If it's widespread, though, I find it curious
that nobody has run into it before now.

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:11     ` Taylor Blau
@ 2021-06-02 20:15       ` Jeff King
  2021-06-02 20:36         ` Randall S. Becker
  2021-06-04  1:36         ` Junio C Hamano
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff King @ 2021-06-02 20:15 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Randall S. Becker, 'Junio C Hamano', git

On Wed, Jun 02, 2021 at 04:11:14PM -0400, Taylor Blau wrote:

> Subject: [PATCH] compat: introduce git_fsync_with_restart()
> 
> Some platforms, like NonStop do not automatically restart fsync() when
> interrupted by a signal, even when that signal is setup with SA_RESTART.
> 
> This can lead to test breakage, e.g., where "--progress" is used, thus
> SIGALRM is sent often, and can interrupt an fsync() syscall.
> 
> Add a Makefile knob FSYNC_NEEDS_RESTART to replace fsync() with one that
> gracefully handles getting EINTR.
> 
> Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
> Signed-off-by: Jeff King <peff@peff.net>

Probably Helped-by might be more appropriate. But regardless, I
definitely give my S-o-B for anything I contributed.

>  Makefile          |  8 ++++++++
>  compat/fsync.c    | 10 ++++++++++
>  git-compat-util.h |  6 ++++++
>  3 files changed, 24 insertions(+)
>  create mode 100644 compat/fsync.c

This looks as I'd expect. But after seeing Eric's response, we perhaps
want to do away with the knob entirely.

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:14       ` Jeff King
@ 2021-06-02 20:18         ` Taylor Blau
  2021-06-02 20:34         ` Randall S. Becker
  2021-06-03 20:21         ` Bryan Turner
  2 siblings, 0 replies; 22+ messages in thread
From: Taylor Blau @ 2021-06-02 20:18 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Taylor Blau, Randall S. Becker,
	'Junio C Hamano', git

On Wed, Jun 02, 2021 at 04:14:27PM -0400, Jeff King wrote:
> > The wrapper should apply to all platforms.  NFS (and presumably
> > other network FSes) can be mounted with interrupts enabled.
>
> I don't mind that, as the wrapper is pretty low-cost (and one less
> Makefile knob is nice). If it's widespread, though, I find it curious
> that nobody has run into it before now.

Yeah, I find it curious as well. I agree that the wrapper is relatively
cheap (especially relative to fsync), but it seems unnecessary despite
what's in POSIX's fsync(3). So the more conservative choice to me would
be to hide it behind a Makefile knob, but I don't mind much either way.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:14       ` Jeff King
  2021-06-02 20:18         ` Taylor Blau
@ 2021-06-02 20:34         ` Randall S. Becker
  2021-06-03 19:31           ` Jeff King
  2021-06-03 20:21         ` Bryan Turner
  2 siblings, 1 reply; 22+ messages in thread
From: Randall S. Becker @ 2021-06-02 20:34 UTC (permalink / raw)
  To: 'Jeff King', 'Eric Wong'
  Cc: 'Taylor Blau', 'Junio C Hamano', git

On June 2, 2021 4:14 PM, Peff wrote:
>Subject: Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
>On Wed, Jun 02, 2021 at 08:11:50PM +0000, Eric Wong wrote:
>
>> Jeff King <peff@peff.net> wrote:
>> > And so when he gets this error:
>> >
>> >   fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN':
>> > Interrupted system call
>> >
>> > presumably we were in fsync() when the signal arrived, and unlike
>> > most other platforms, the call needs to be restarted manually (even
>> > though we set up the signal with SA_RESTART). I'm not sure if this
>> > violates POSIX or not (I couldn't find a definitive answer to the
>> > set of interruptible functions in the standard). But either way, the
>> > workaround is probably something like:
>>
>> "man 3posix fsync" says EINTR is allowed ("manpages-posix-dev"
>> package in Debian non-free).
>
>Ah, thanks. Linux's fsync(3) doesn't mention it, and nor does it appear in the discussion of interruptible calls in signals(7). So I was looking
>for a POSIX equivalent of that signals manpage but couldn't find one. :)
>
>> >   #ifdef FSYNC_NEEDS_RESTART
>>
>> The wrapper should apply to all platforms.  NFS (and presumably other
>> network FSes) can be mounted with interrupts enabled.
>
>I don't mind that, as the wrapper is pretty low-cost (and one less Makefile knob is nice). If it's widespread, though, I find it curious that
>nobody has run into it before now.

I suspect this is because of the way the file system on NonStop behaves. It is a multi-processor platform, with multi-cores, so anything can happen. If the file system is delayed for any reason, like a signal coming from a different core (EINTR has high priority), then fsync() will be interrupted. EINTR is allowed on NonStop for fsync(). So it would be really great if the patch included a modification to config.mak.uname to include that. This would be a timing-only issue on most other systems, probably something that would hit NFS.

The patch for the config is:
diff --git a/config.mak.uname b/config.mak.uname
index cb443b4e02..ac3e3ca2c5 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -566,6 +566,7 @@ ifeq ($(uname_S),NONSTOP_KERNEL)
        NO_REGEX = NeedsStartEnd
        NO_PTHREADS = UnfortunatelyYes
        FREAD_READS_DIRECTORIES = UnfortunatelyYes
+       FSYNC_NEEDS_RESTART = YesPlease

        # Not detected (nor checked for) by './configure'.




^ permalink raw reply related	[flat|nested] 22+ messages in thread

* RE: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:15       ` Jeff King
@ 2021-06-02 20:36         ` Randall S. Becker
  2021-06-04  1:36         ` Junio C Hamano
  1 sibling, 0 replies; 22+ messages in thread
From: Randall S. Becker @ 2021-06-02 20:36 UTC (permalink / raw)
  To: 'Jeff King', 'Taylor Blau'; +Cc: 'Junio C Hamano', git

On June 2, 2021 4:16 PM, Peff wrote:
>To: Taylor Blau <me@ttaylorr.com>
>Cc: Randall S. Becker <rsbecker@nexbridge.com>; 'Junio C Hamano' <gitster@pobox.com>; git@vger.kernel.org
>Subject: Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
>
>On Wed, Jun 02, 2021 at 04:11:14PM -0400, Taylor Blau wrote:
>
>> Subject: [PATCH] compat: introduce git_fsync_with_restart()
>>
>> Some platforms, like NonStop do not automatically restart fsync() when
>> interrupted by a signal, even when that signal is setup with SA_RESTART.
>>
>> This can lead to test breakage, e.g., where "--progress" is used, thus
>> SIGALRM is sent often, and can interrupt an fsync() syscall.
>>
>> Add a Makefile knob FSYNC_NEEDS_RESTART to replace fsync() with one
>> that gracefully handles getting EINTR.
>>
>> Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
>> Signed-off-by: Jeff King <peff@peff.net>
>
>Probably Helped-by might be more appropriate. But regardless, I definitely give my S-o-B for anything I contributed.
>
>>  Makefile          |  8 ++++++++
>>  compat/fsync.c    | 10 ++++++++++
>>  git-compat-util.h |  6 ++++++
>>  3 files changed, 24 insertions(+)
>>  create mode 100644 compat/fsync.c
>
>This looks as I'd expect. But after seeing Eric's response, we perhaps want to do away with the knob entirely.

I'm for doing away with the knob altogether, providing you don't mean the platform itself 😉

Regards,
Randall


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:34         ` Randall S. Becker
@ 2021-06-03 19:31           ` Jeff King
  2021-06-03 20:07             ` Randall S. Becker
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2021-06-03 19:31 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: 'Eric Wong', 'Taylor Blau',
	'Junio C Hamano', git

On Wed, Jun 02, 2021 at 04:34:51PM -0400, Randall S. Becker wrote:

> >> The wrapper should apply to all platforms.  NFS (and presumably other
> >> network FSes) can be mounted with interrupts enabled.
> >
> >I don't mind that, as the wrapper is pretty low-cost (and one less Makefile knob is nice). If it's widespread, though, I find it curious that
> >nobody has run into it before now.
> 
> I suspect this is because of the way the file system on NonStop behaves. It is a multi-processor platform, with multi-cores, so anything can happen. If the file system is delayed for any reason, like a signal coming from a different core (EINTR has high priority), then fsync() will be interrupted. EINTR is allowed on NonStop for fsync(). So it would be really great if the patch included a modification to config.mak.uname to include that. This would be a timing-only issue on most other systems, probably something that would hit NFS.
> 
> The patch for the config is:
> diff --git a/config.mak.uname b/config.mak.uname
> index cb443b4e02..ac3e3ca2c5 100644
> --- a/config.mak.uname
> +++ b/config.mak.uname
> @@ -566,6 +566,7 @@ ifeq ($(uname_S),NONSTOP_KERNEL)
>         NO_REGEX = NeedsStartEnd
>         NO_PTHREADS = UnfortunatelyYes
>         FREAD_READS_DIRECTORIES = UnfortunatelyYes
> +       FSYNC_NEEDS_RESTART = YesPlease
> 
>         # Not detected (nor checked for) by './configure'.

Yeah, if we don't make it unconditional, then this is the obvious next
step. But the more important question is: did you test this out and did
it fix the test breakage you saw on NonStop?

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-03 19:31           ` Jeff King
@ 2021-06-03 20:07             ` Randall S. Becker
  0 siblings, 0 replies; 22+ messages in thread
From: Randall S. Becker @ 2021-06-03 20:07 UTC (permalink / raw)
  To: 'Jeff King'
  Cc: 'Eric Wong', 'Taylor Blau',
	'Junio C Hamano', git

On June 3, 2021 3:31 PM, Peff wrote:
>On Wed, Jun 02, 2021 at 04:34:51PM -0400, Randall S. Becker wrote:
>
>> >> The wrapper should apply to all platforms.  NFS (and presumably
>> >> other network FSes) can be mounted with interrupts enabled.
>> >
>> >I don't mind that, as the wrapper is pretty low-cost (and one less
>> >Makefile knob is nice). If it's widespread, though, I find it curious that nobody has run into it before now.
>>
>> I suspect this is because of the way the file system on NonStop behaves. It is a multi-processor platform, with multi-cores, so anything
>can happen. If the file system is delayed for any reason, like a signal coming from a different core (EINTR has high priority), then fsync()
>will be interrupted. EINTR is allowed on NonStop for fsync(). So it would be really great if the patch included a modification to
>config.mak.uname to include that. This would be a timing-only issue on most other systems, probably something that would hit NFS.
>>
>> The patch for the config is:
>> diff --git a/config.mak.uname b/config.mak.uname index
>> cb443b4e02..ac3e3ca2c5 100644
>> --- a/config.mak.uname
>> +++ b/config.mak.uname
>> @@ -566,6 +566,7 @@ ifeq ($(uname_S),NONSTOP_KERNEL)
>>         NO_REGEX = NeedsStartEnd
>>         NO_PTHREADS = UnfortunatelyYes
>>         FREAD_READS_DIRECTORIES = UnfortunatelyYes
>> +       FSYNC_NEEDS_RESTART = YesPlease
>>
>>         # Not detected (nor checked for) by './configure'.
>
>Yeah, if we don't make it unconditional, then this is the obvious next step. But the more important question is: did you test this out and did
>it fix the test breakage you saw on NonStop?

The fix works for me and t5300 passes. I tested it without the conditional approach. While the test was running, I noticed this:

+ mkdir -p /home/git/git/t/trash directory.t5300-pack-object/prereq-test-dir-FAIL_PREREQS
+ cd /home/git/git/t/trash directory.t5300-pack-object/prereq-test-dir-FAIL_PREREQS
+ test_bool_env GIT_TEST_FAIL_PREREQS false
error: last command exited with $?=1
prerequisite FAIL_PREREQS not satisfied
expecting success of 5300.32 'index-pack --threads=N or pack.threads=N warns when no pthreads':

This may be intended, but the error line showed in red.

Regards,
Randall



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:14       ` Jeff King
  2021-06-02 20:18         ` Taylor Blau
  2021-06-02 20:34         ` Randall S. Becker
@ 2021-06-03 20:21         ` Bryan Turner
  2021-06-03 20:32           ` Randall S. Becker
  2 siblings, 1 reply; 22+ messages in thread
From: Bryan Turner @ 2021-06-03 20:21 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Taylor Blau, Randall S. Becker, Junio C Hamano,
	Git Users

On Wed, Jun 2, 2021 at 1:14 PM Jeff King <peff@peff.net> wrote:
>
> On Wed, Jun 02, 2021 at 08:11:50PM +0000, Eric Wong wrote:
>
> > Jeff King <peff@peff.net> wrote:
> > > And so when he gets this error:
> > >
> > >   fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN': Interrupted system call
> > >
> > > presumably we were in fsync() when the signal arrived, and unlike most
> > > other platforms, the call needs to be restarted manually (even though we
> > > set up the signal with SA_RESTART). I'm not sure if this violates POSIX
> > > or not (I couldn't find a definitive answer to the set of interruptible
> > > functions in the standard). But either way, the workaround is probably
> > > something like:
> >
> > "man 3posix fsync" says EINTR is allowed ("manpages-posix-dev"
> > package in Debian non-free).
>
> Ah, thanks. Linux's fsync(3) doesn't mention it, and nor does it appear
> in the discussion of interruptible calls in signals(7). So I was looking
> for a POSIX equivalent of that signals manpage but couldn't find one. :)
>
> > >   #ifdef FSYNC_NEEDS_RESTART
> >
> > The wrapper should apply to all platforms.  NFS (and presumably
> > other network FSes) can be mounted with interrupts enabled.
>
> I don't mind that, as the wrapper is pretty low-cost (and one less
> Makefile knob is nice). If it's widespread, though, I find it curious
> that nobody has run into it before now.

I was dealing with a similar issue[1] recently, albeit not in the Git
codebase but rather with Java. My issue was with epoll_wait, rather
than fsync, which is documented on signal(7) as not restartable even
with SA_RESTART. That led me to this[2] little bit of code inside the
JVM:
#define RESTARTABLE(_cmd, _result) do { \
  do { \
    _result = _cmd; \
  } while((_result == -1) && (errno == EINTR)); \
} while(0)

which they use like this[3]:
RESTARTABLE(epoll_wait(epfd, events, numfds, -1), res);

Not sure what the Git maintainers' view on macros is, but if there
wasn't going to be a Makefile knob perhaps something similar might
make sense as a reusable construct. Of course, it's unclear how often
Git might _need_ such a thing; given this doesn't seem to come up
much, perhaps that's a sign such a macro would end up a waste of
effort. Anyway, just thought I'd share because I was looking at
something similar.

[1] https://github.com/brettwooldridge/NuProcess/issues/124
[2] https://github.com/JetBrains/jdk8u_jdk/blob/94318f9185757cc33d2b8d527d36be26ac6b7582/src/solaris/native/sun/nio/ch/nio_util.h#L33-L37
[3] https://github.com/JetBrains/jdk8u_jdk/blob/94318f9185757cc33d2b8d527d36be26ac6b7582/src/solaris/native/sun/nio/ch/EPoll.c#L92

>
> -Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-03 20:21         ` Bryan Turner
@ 2021-06-03 20:32           ` Randall S. Becker
  0 siblings, 0 replies; 22+ messages in thread
From: Randall S. Becker @ 2021-06-03 20:32 UTC (permalink / raw)
  To: 'Bryan Turner', 'Jeff King'
  Cc: 'Eric Wong', 'Taylor Blau',
	'Junio C Hamano', 'Git Users'

On June 3, 2021 4:22 PM, Bryan Turner wrote:
>On Wed, Jun 2, 2021 at 1:14 PM Jeff King <peff@peff.net> wrote:
>>
>> On Wed, Jun 02, 2021 at 08:11:50PM +0000, Eric Wong wrote:
>>
>> > Jeff King <peff@peff.net> wrote:
>> > > And so when he gets this error:
>> > >
>> > >   fatal: fsync error on '.git/objects/pack/tmp_pack_NkPgqN':
>> > > Interrupted system call
>> > >
>> > > presumably we were in fsync() when the signal arrived, and unlike
>> > > most other platforms, the call needs to be restarted manually
>> > > (even though we set up the signal with SA_RESTART). I'm not sure
>> > > if this violates POSIX or not (I couldn't find a definitive answer
>> > > to the set of interruptible functions in the standard). But either
>> > > way, the workaround is probably something like:
>> >
>> > "man 3posix fsync" says EINTR is allowed ("manpages-posix-dev"
>> > package in Debian non-free).
>>
>> Ah, thanks. Linux's fsync(3) doesn't mention it, and nor does it
>> appear in the discussion of interruptible calls in signals(7). So I
>> was looking for a POSIX equivalent of that signals manpage but
>> couldn't find one. :)
>>
>> > >   #ifdef FSYNC_NEEDS_RESTART
>> >
>> > The wrapper should apply to all platforms.  NFS (and presumably
>> > other network FSes) can be mounted with interrupts enabled.
>>
>> I don't mind that, as the wrapper is pretty low-cost (and one less
>> Makefile knob is nice). If it's widespread, though, I find it curious
>> that nobody has run into it before now.
>
>I was dealing with a similar issue[1] recently, albeit not in the Git codebase but rather with Java. My issue was with epoll_wait, rather
>than fsync, which is documented on signal(7) as not restartable even with SA_RESTART. That led me to this[2] little bit of code inside the
>JVM:
>#define RESTARTABLE(_cmd, _result) do { \
>  do { \
>    _result = _cmd; \
>  } while((_result == -1) && (errno == EINTR)); \ } while(0)
>
>which they use like this[3]:
>RESTARTABLE(epoll_wait(epfd, events, numfds, -1), res);
>
>Not sure what the Git maintainers' view on macros is, but if there wasn't going to be a Makefile knob perhaps something similar might
>make sense as a reusable construct. Of course, it's unclear how often Git might _need_ such a thing; given this doesn't seem to come up
>much, perhaps that's a sign such a macro would end up a waste of effort. Anyway, just thought I'd share because I was looking at
>something similar.
>
>[1] https://github.com/brettwooldridge/NuProcess/issues/124
>[2]
>https://github.com/JetBrains/jdk8u_jdk/blob/94318f9185757cc33d2b8d527d36be26ac6b7582/src/solaris/native/sun/nio/ch/nio_util.h#L33
>-L37
>[3]
>https://github.com/JetBrains/jdk8u_jdk/blob/94318f9185757cc33d2b8d527d36be26ac6b7582/src/solaris/native/sun/nio/ch/EPoll.c#L92

I can only suggest, from other OpenSource projects I'm on, that make extensive use of macros, that they are very difficult to debug and sometimes harder to integrate with platform-specific compatibility layers.  I would rather have something explicit in compat.c that gdb could make sense of during a debug session if necessary. Trying to debug macros is harsh, rather like debugging -O2 without -g.

I used to be a big fan of macros, but grew out of it 😉. Just my take on it.

-Randall


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-02 20:15       ` Jeff King
  2021-06-02 20:36         ` Randall S. Becker
@ 2021-06-04  1:36         ` Junio C Hamano
  2021-06-04  2:17           ` Taylor Blau
                             ` (2 more replies)
  1 sibling, 3 replies; 22+ messages in thread
From: Junio C Hamano @ 2021-06-04  1:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Randall S. Becker, git

Jeff King <peff@peff.net> writes:

> This looks as I'd expect. But after seeing Eric's response, we perhaps
> want to do away with the knob entirely.

Thanks.  I was hoping somebody in the thread would tie the loose
ends, but upon inspection of the output from

    $ git grep -e fsync\( maint seen -- \*.[ch]

it turns out that fsync_or_die() is the only place that calls
fsync(), so perhaps doing it in a way that is quite different from
what has been discussed may be even a better alternative.

If any new callers care about the return value of fsync(), I'd
expect that they would be calling this wrapper, and the "best
effort" callers that do not check the returned value by definition
do not care if fsync() does not complete due to an interrupt, so I
am hoping that the current "we only call it from this wrapper" is
not just "the code currently happens to be this way", but it is
sensible that the code will stay that way in the future.

Obviously I appreciate reviews and possibly tests, but sanity
checking my observation that fsync() is called only from here is a
good thing to have.

-- >8 --
Subject: fsync(): be prepared to see EINTR

Some platforms, like NonStop do not automatically restart fsync()
when interrupted by a signal, even when that signal is setup with
SA_RESTART.

This can lead to test breakage, e.g., where "--progress" is used,
thus SIGALRM is sent often, and can interrupt an fsync() syscall.

Make sure we deal with such a case by retrying the syscall
ourselves.  Luckily, we call fsync() fron a single wrapper,
fsync_or_die(), so the fix is fairly isolated.
    
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
[jc: the above two did most of the work---I just tied the loose end]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 write-or-die.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git i/write-or-die.c w/write-or-die.c
index eab8c8d0b9..534b2f5cba 100644
--- i/write-or-die.c
+++ w/write-or-die.c
@@ -57,7 +57,11 @@ void fprintf_or_die(FILE *f, const char *fmt, ...)
 
 void fsync_or_die(int fd, const char *msg)
 {
-	if (fsync(fd) < 0) {
+	int status;
+
+	while ((status = fsync(fd)) < 0 && errno == EINTR)
+		; /* try again */
+	if (status < 0) {
 		die_errno("fsync error on '%s'", msg);
 	}
 }

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-04  1:36         ` Junio C Hamano
@ 2021-06-04  2:17           ` Taylor Blau
  2021-06-04  3:55           ` Jeff King
  2021-06-05  7:04           ` René Scharfe
  2 siblings, 0 replies; 22+ messages in thread
From: Taylor Blau @ 2021-06-04  2:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Taylor Blau, Randall S. Becker, git

On Fri, Jun 04, 2021 at 10:36:11AM +0900, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > This looks as I'd expect. But after seeing Eric's response, we perhaps
> > want to do away with the knob entirely.
>
> Thanks.  I was hoping somebody in the thread would tie the loose
> ends, but upon inspection of the output from
>
>     $ git grep -e fsync\( maint seen -- \*.[ch]
>
> it turns out that fsync_or_die() is the only place that calls
> fsync(), so perhaps doing it in a way that is quite different from
> what has been discussed may be even a better alternative.
>
> If any new callers care about the return value of fsync(), I'd
> expect that they would be calling this wrapper, and the "best
> effort" callers that do not check the returned value by definition
> do not care if fsync() does not complete due to an interrupt, so I
> am hoping that the current "we only call it from this wrapper" is
> not just "the code currently happens to be this way", but it is
> sensible that the code will stay that way in the future.

That makes total sense to me; I can't imagine a scenario where you
would want to call fsync() over fsync_or_die(). But if you did, then you
probably don't care about whether or not fsync() was interrupted, or can
check the return value yourself.

If it became more common, then I wouldn't mind #undef-ing fsync() and
replacing it with our own hacked up version.

> Obviously I appreciate reviews and possibly tests, but sanity
> checking my observation that fsync() is called only from here is a
> good thing to have.

Your patch looks good to me (and Randall already tested my version with
the additional knob on NonStop and reported it working as expected[1]),
so this has my:

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

[1]: https://lore.kernel.org/git/009901d758b4$12016d80$36044880$@nexbridge.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-04  1:36         ` Junio C Hamano
  2021-06-04  2:17           ` Taylor Blau
@ 2021-06-04  3:55           ` Jeff King
  2021-06-04  5:12             ` Junio C Hamano
  2021-06-05  7:04           ` René Scharfe
  2 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2021-06-04  3:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Randall S. Becker, git

On Fri, Jun 04, 2021 at 10:36:11AM +0900, Junio C Hamano wrote:

> Thanks.  I was hoping somebody in the thread would tie the loose
> ends, but upon inspection of the output from
> 
>     $ git grep -e fsync\( maint seen -- \*.[ch]
> 
> it turns out that fsync_or_die() is the only place that calls
> fsync(), so perhaps doing it in a way that is quite different from
> what has been discussed may be even a better alternative.
> 
> If any new callers care about the return value of fsync(), I'd
> expect that they would be calling this wrapper, and the "best
> effort" callers that do not check the returned value by definition
> do not care if fsync() does not complete due to an interrupt, so I
> am hoping that the current "we only call it from this wrapper" is
> not just "the code currently happens to be this way", but it is
> sensible that the code will stay that way in the future.
> 
> Obviously I appreciate reviews and possibly tests, but sanity
> checking my observation that fsync() is called only from here is a
> good thing to have.

Thanks for digging further. I didn't even think to look at how many
calls there were, but I agree there is only the one. And moreover, I
agree that it is unlikely we'd ever have more than one, for the reasons
you listed. So I think your patch is a nice and simple solution, and we
don't need worry about magic macro wrappers at all.

One brief aside: I'm still not entirely convinced that NonStop isn't
violating POSIX. Yes, as Eric noted, fsync() is allowed to return EINTR.
But should it do so when the signal it got was set up with SA_RESTART?

The sigaction(3posix) page says:

     SA_RESTART   This flag affects the behavior of interruptible functions;
		  that is, those specified to fail with errno set to
		  [EINTR]. If set, and a function specified as
		  interruptible is interrupted by this signal, the
		  function shall restart and shall not fail with [EINTR]
		  unless otherwise specified. [...]

and I could not find anywhere that it is "otherwise specified" for
fsync(). Of course, whatever POSIX says, if NonStop needs this
workaround, we should provide it. But this may explain why we never saw
it on other systems.

It also means it's less important for this workaround to kick in
everywhere. But given how low-cost it is, I'm just as happy to avoid
having a separate knob to enable it.

> -- >8 --
> Subject: fsync(): be prepared to see EINTR

The patch itself looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-04  3:55           ` Jeff King
@ 2021-06-04  5:12             ` Junio C Hamano
  2021-06-06 19:06               ` Randall S. Becker
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2021-06-04  5:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Randall S. Becker, git

Jeff King <peff@peff.net> writes:

> One brief aside: I'm still not entirely convinced that NonStop isn't
> violating POSIX. Yes, as Eric noted, fsync() is allowed to return EINTR.
> But should it do so when the signal it got was set up with SA_RESTART?
>
> The sigaction(3posix) page says:
>
>      SA_RESTART   This flag affects the behavior of interruptible functions;
> 		  that is, those specified to fail with errno set to
> 		  [EINTR]. If set, and a function specified as
> 		  interruptible is interrupted by this signal, the
> 		  function shall restart and shall not fail with [EINTR]
> 		  unless otherwise specified. [...]
>
> and I could not find anywhere that it is "otherwise specified" for
> fsync(). Of course, whatever POSIX says, if NonStop needs this
> workaround, we should provide it. But this may explain why we never saw
> it on other systems.

Yeah, I think all of the above makes sense.

> It also means it's less important for this workaround to kick in
> everywhere. But given how low-cost it is, I'm just as happy to avoid
> having a separate knob to enable it.

Yes, any caller who cares about the result of fsync() is willing to
wait, and on a well benaved system, we would never see EINTR so the
loop won't iterate.  Having to carry just a handful of extra bytes
in ICache in a codepath that is not performance critical is probably
an acceptable cost for simpler code.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-04  1:36         ` Junio C Hamano
  2021-06-04  2:17           ` Taylor Blau
  2021-06-04  3:55           ` Jeff King
@ 2021-06-05  7:04           ` René Scharfe
  2021-06-05 13:15             ` Junio C Hamano
  2 siblings, 1 reply; 22+ messages in thread
From: René Scharfe @ 2021-06-05  7:04 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: Taylor Blau, Randall S. Becker, git

Am 04.06.21 um 03:36 schrieb Junio C Hamano:
> ---
>
>  write-or-die.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git i/write-or-die.c w/write-or-die.c
> index eab8c8d0b9..534b2f5cba 100644
> --- i/write-or-die.c
> +++ w/write-or-die.c
> @@ -57,7 +57,11 @@ void fprintf_or_die(FILE *f, const char *fmt, ...)
>
>  void fsync_or_die(int fd, const char *msg)
>  {
> -	if (fsync(fd) < 0) {
> +	int status;
> +
> +	while ((status = fsync(fd)) < 0 && errno == EINTR)
> +		; /* try again */
> +	if (status < 0) {
>  		die_errno("fsync error on '%s'", msg);
>  	}
>  }

The function body can be written like this:

	while (fsync(fd) < 0) {
		if (errno != EINTR)
			die_errno("fsync error on '%s'", msg);
	}

I find it easier to read because it has less syntax elements
and is shorter.  Bikeshedding, of course. :-/

René

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-05  7:04           ` René Scharfe
@ 2021-06-05 13:15             ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2021-06-05 13:15 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, Taylor Blau, Randall S. Becker, git

René Scharfe <l.s.r@web.de> writes:

> The function body can be written like this:
>
> 	while (fsync(fd) < 0) {
> 		if (errno != EINTR)
> 			die_errno("fsync error on '%s'", msg);
> 	}
>
> I find it easier to read because it has less syntax elements
> and is shorter.  Bikeshedding, of course. :-/

It indeed is easier to follow.  Thanks.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-04  5:12             ` Junio C Hamano
@ 2021-06-06 19:06               ` Randall S. Becker
  2021-06-08  6:40                 ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Randall S. Becker @ 2021-06-06 19:06 UTC (permalink / raw)
  To: 'Junio C Hamano', 'Jeff King'; +Cc: 'Taylor Blau', git

On June 4, 2021 1:12 AM, Junio C Hamano wrote:
>To: Jeff King <peff@peff.net>
>Cc: Taylor Blau <me@ttaylorr.com>; Randall S. Becker <rsbecker@nexbridge.com>; git@vger.kernel.org
>Subject: Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
>
>Jeff King <peff@peff.net> writes:
>
>> One brief aside: I'm still not entirely convinced that NonStop isn't
>> violating POSIX. Yes, as Eric noted, fsync() is allowed to return EINTR.
>> But should it do so when the signal it got was set up with SA_RESTART?
>>
>> The sigaction(3posix) page says:
>>
>>      SA_RESTART   This flag affects the behavior of interruptible functions;
>> 		  that is, those specified to fail with errno set to
>> 		  [EINTR]. If set, and a function specified as
>> 		  interruptible is interrupted by this signal, the
>> 		  function shall restart and shall not fail with [EINTR]
>> 		  unless otherwise specified. [...]
>>
>> and I could not find anywhere that it is "otherwise specified" for
>> fsync(). Of course, whatever POSIX says, if NonStop needs this
>> workaround, we should provide it. But this may explain why we never
>> saw it on other systems.
>
>Yeah, I think all of the above makes sense.
>
>> It also means it's less important for this workaround to kick in
>> everywhere. But given how low-cost it is, I'm just as happy to avoid
>> having a separate knob to enable it.
>
>Yes, any caller who cares about the result of fsync() is willing to wait, and on a well benaved system, we would never see EINTR so
the
>loop won't iterate.  Having to carry just a handful of extra bytes in ICache in a codepath that is not performance critical is
probably an
>acceptable cost for simpler code.

Just something to note: I was running git gc --aggressive on OpenSSL today and got the same error as in this case. I'm assuming that
the fix will apply to that too. In this situation, the gc ran for over 2 hours until failing. This may be a more widely available
test condition for fsync() EINTR, possibly.

-Randall


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86
  2021-06-06 19:06               ` Randall S. Becker
@ 2021-06-08  6:40                 ` Jeff King
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff King @ 2021-06-08  6:40 UTC (permalink / raw)
  To: Randall S. Becker; +Cc: 'Junio C Hamano', 'Taylor Blau', git

On Sun, Jun 06, 2021 at 03:06:31PM -0400, Randall S. Becker wrote:

> Just something to note: I was running git gc --aggressive on OpenSSL
> today and got the same error as in this case. I'm assuming that the
> fix will apply to that too. In this situation, the gc ran for over 2
> hours until failing. This may be a more widely available test
> condition for fsync() EINTR, possibly.

Thanks. One of my original questions was that we'd expect to see this in
the wild, and not just the test suite. But it sounds like now we have. :)
So I feel better that we have a full understanding of the issue.

-Peff

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-06-08  6:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-02 17:52 [ANNOUNCE] Git v2.32.0-rc3 - t5300 Still Broken on NonStop ia64/x86 Randall S. Becker
2021-06-02 19:32 ` Taylor Blau
2021-06-02 19:49   ` Jeff King
2021-06-02 20:11     ` Taylor Blau
2021-06-02 20:15       ` Jeff King
2021-06-02 20:36         ` Randall S. Becker
2021-06-04  1:36         ` Junio C Hamano
2021-06-04  2:17           ` Taylor Blau
2021-06-04  3:55           ` Jeff King
2021-06-04  5:12             ` Junio C Hamano
2021-06-06 19:06               ` Randall S. Becker
2021-06-08  6:40                 ` Jeff King
2021-06-05  7:04           ` René Scharfe
2021-06-05 13:15             ` Junio C Hamano
2021-06-02 20:11     ` Eric Wong
2021-06-02 20:14       ` Jeff King
2021-06-02 20:18         ` Taylor Blau
2021-06-02 20:34         ` Randall S. Becker
2021-06-03 19:31           ` Jeff King
2021-06-03 20:07             ` Randall S. Becker
2021-06-03 20:21         ` Bryan Turner
2021-06-03 20:32           ` Randall S. Becker

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).