git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* t5400 failure on Windows
@ 2017-05-15 20:05 Johannes Sixt
  2017-05-15 22:24 ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Sixt @ 2017-05-15 20:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

I observe the following failure on Windws with origin/next, in
t5400-send-pack.sh

+++ diff -u expect refs
--- expect      Mon May 15 06:54:59 2017
+++ refs        Mon May 15 06:54:59 2017
@@ -1,4 +1,3 @@
 5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master
-5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/HEAD
 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master
 53d9066ca10f2e103b33caf3a16a723553c33b00 .have
error: last command exited with $?=1
not ok 16 - receive-pack de-dupes .have lines


The trace file looks like this:

-------- trace --------
packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master\0report-status delete-refs side-band-64k quiet atomic ofs-delta agent=git/2.13.0.497.g5af12421b0
packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/HEAD
packet:         push< 5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master\0report-status delete-refs side-band-64k quiet atomic ofs-delta agent=git/2.13.0.497.g5af12421b0
packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master
packet:         push< 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master
packet: receive-pack> 53d9066ca10f2e103b33caf3a16a723553c33b00 .have
packet:         push< 53d9066ca10f2e103b33caf3a16a723553c33b00 .have
packet: receive-pack> 0000
packet:         push< 0000
packet:         push> 0000000000000000000000000000000000000000 b1a1c97e94b6388c108e195d28a3e89f00c81698 refs/heads/foo\0 report-status agent=git/2.13.0.497.g5af12421b0
packet:         push> 0000
packet: receive-pack< 0000000000000000000000000000000000000000 b1a1c97e94b6388c108e195d28a3e89f00c81698 refs/heads/foo\0 report-status agent=git/2.13.0.497.g5af12421b0
packet: receive-pack< 0000
packet: receive-pack> unpack ok
packet: receive-pack> ok refs/heads/foo
packet: receive-pack> 0000
packet:         push< unpack ok
packet:         push< ok refs/heads/foo
packet:         push< 0000
-------- end of trace --------

Where should I start looking? On a Linux box, the test case does
produce an additional line

packet:         push< 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/HEAD

in the trace file. I also do not see that any tests would be skipped
on Windows. (But I forgot to check whether refs/remotes/origin/HEAD
is present in any of the repositories.)

-- Hannes

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t5400 failure on Windows
  2017-05-15 20:05 t5400 failure on Windows Johannes Sixt
@ 2017-05-15 22:24 ` Jeff King
  2017-05-16 18:35   ` Johannes Sixt
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2017-05-15 22:24 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Git Mailing List

On Mon, May 15, 2017 at 10:05:29PM +0200, Johannes Sixt wrote:

> I observe the following failure on Windws with origin/next, in
> t5400-send-pack.sh
> 
> +++ diff -u expect refs
> --- expect      Mon May 15 06:54:59 2017
> +++ refs        Mon May 15 06:54:59 2017
> @@ -1,4 +1,3 @@
>  5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master
> -5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/HEAD
>  5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master
>  53d9066ca10f2e103b33caf3a16a723553c33b00 .have
> error: last command exited with $?=1
> not ok 16 - receive-pack de-dupes .have lines

Interesting that it's the symref. I wonder if that is important (though
they should always be implemented as a real symref file and not a
symlink, so I don't think FS limitations would come into play).

For sanity, you might want to double-check that the "shared-fork"
repository has that symref on disk, and that for-each-ref reports it.
But...

> The trace file looks like this:
> 
> -------- trace --------
> packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master\0report-status delete-refs side-band-64k quiet atomic ofs-delta agent=git/2.13.0.497.g5af12421b0
> packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/HEAD

Wait, there it is in the receive-pack output. But if we look at what
"push" receives:

> packet:         push< 5285e1710002a06a379056b0d21357a999f3c642 refs/heads/master\0report-status delete-refs side-band-64k quiet atomic ofs-delta agent=git/2.13.0.497.g5af12421b0
> packet: receive-pack> 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master
> packet:         push< 5285e1710002a06a379056b0d21357a999f3c642 refs/remotes/origin/master

It's skipped! So either:

  1. receive-pack wrote the trace line without sending it (which seems
     an unlikely bug to differ between platforms)

  2. push some didn't receive it (which makes no sense; this is a stream
     socket and it was able to parse the next pktline)

  3. push did get it but for some reason didn't write the trace (also
     unlikely as with option 1)

  4. There is something racy and unportable about both programs writing
     to the same trace file. It's opened with O_APPEND, which means that
     each write should atomically position the pointer at the end of the
     file. Is it possible there's a problem with that in the way
     O_APPEND works on Windows?

     If that was the case, though, I'd generally expect a sheared write
     or a partial overwrite. The two origin/HEAD lines from the two
     programs are the exact same length, but I'd find it more likely for
     the overwrite to happen with one of the follow-on lines.

So all of those sound kind of implausible. If you run the test over and
over, does it happen consistently, and is it always the same line?

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t5400 failure on Windows
  2017-05-15 22:24 ` Jeff King
@ 2017-05-16 18:35   ` Johannes Sixt
  2017-05-17  5:44     ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Sixt @ 2017-05-16 18:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

Am 16.05.2017 um 00:24 schrieb Jeff King:
>    4. There is something racy and unportable about both programs writing
>       to the same trace file. It's opened with O_APPEND, which means that
>       each write should atomically position the pointer at the end of the
>       file. Is it possible there's a problem with that in the way
>       O_APPEND works on Windows?
> 
>       If that was the case, though, I'd generally expect a sheared write
>       or a partial overwrite. The two origin/HEAD lines from the two
>       programs are the exact same length, but I'd find it more likely for
>       the overwrite to happen with one of the follow-on lines.

Good guesswork! O_APPEND is not atomic on Windows, in particular, it is 
emulated as lseek(SEEK_END) followed by write().

I ran the test a few more times, and it fails in different ways 
(different missing lines) and also succeeds in a minority of the cases.

Windows is capable of writing atomically in the way O_APPEND requires 
(keyword: FILE_APPEND_DATA), but I do not see a way to wedge this into 
the emulation layer. For the time being, I think I have to skip the test 
case.

The question remains how endangered our uses of O_APPEND are when the 
POSIX semantics are not obeyed precisely:

* trace.c: It is about debugging. Not critical.

* sequencer.c: It writes the "done" file. No concurrent accesses are 
expected: Not critial.

* refs/files-backend.c: There are uses in functions 
open_or_create_logfile() and log_ref_setup(). Sounds like it is in 
reflogs. Sounds like this is about reflogs. If there are concurrent 
accesses, there is a danger that a reflog is not recorded correctly. 
Then reflogs may be missing and objects may be pruned earlier than 
expected. That's something to worry about, but I would not count it as 
mission critical.

-- Hannes

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t5400 failure on Windows
  2017-05-16 18:35   ` Johannes Sixt
@ 2017-05-17  5:44     ` Jeff King
  2017-05-17 18:41       ` Johannes Sixt
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2017-05-17  5:44 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Git Mailing List

On Tue, May 16, 2017 at 08:35:46PM +0200, Johannes Sixt wrote:

> Am 16.05.2017 um 00:24 schrieb Jeff King:
> >    4. There is something racy and unportable about both programs writing
> >       to the same trace file. It's opened with O_APPEND, which means that
> >       each write should atomically position the pointer at the end of the
> >       file. Is it possible there's a problem with that in the way
> >       O_APPEND works on Windows?
> > 
> >       If that was the case, though, I'd generally expect a sheared write
> >       or a partial overwrite. The two origin/HEAD lines from the two
> >       programs are the exact same length, but I'd find it more likely for
> >       the overwrite to happen with one of the follow-on lines.
> 
> Good guesswork! O_APPEND is not atomic on Windows, in particular, it is
> emulated as lseek(SEEK_END) followed by write().
> 
> I ran the test a few more times, and it fails in different ways (different
> missing lines) and also succeeds in a minority of the cases.

OK, that definitely explains it, then.

> Windows is capable of writing atomically in the way O_APPEND requires
> (keyword: FILE_APPEND_DATA), but I do not see a way to wedge this into the
> emulation layer. For the time being, I think I have to skip the test case.

I wonder if there's a way we could convince the trace for the two
programs to go to separate locations. We don't care about receive-pack's
trace at all. So maybe:

diff --git a/t/t5400-send-pack.sh b/t/t5400-send-pack.sh
index 3331e0f53..d375d7110 100755
--- a/t/t5400-send-pack.sh
+++ b/t/t5400-send-pack.sh
@@ -288,7 +288,10 @@ test_expect_success 'receive-pack de-dupes .have lines' '
 	$shared .have
 	EOF
 
-	GIT_TRACE_PACKET=$(pwd)/trace git push fork HEAD:foo &&
+	GIT_TRACE_PACKET=$(pwd)/trace \
+	    git push \
+		--receive-pack="unset GIT_TRACE_PACKET; git-receive-pack" \
+		fork HEAD:foo &&
 	extract_ref_advertisement <trace >refs &&
 	test_cmp expect refs
 '

> The question remains how endangered our uses of O_APPEND are when the POSIX
> semantics are not obeyed precisely:
> 
> * trace.c: It is about debugging. Not critical.

Agreed.

> * sequencer.c: It writes the "done" file. No concurrent accesses are
> expected: Not critial.

Agreed.

> * refs/files-backend.c: There are uses in functions open_or_create_logfile()
> and log_ref_setup(). Sounds like it is in reflogs. Sounds like this is about
> reflogs. If there are concurrent accesses, there is a danger that a reflog
> is not recorded correctly. Then reflogs may be missing and objects may be
> pruned earlier than expected. That's something to worry about, but I would
> not count it as mission critical.

We should always hold the matching ref lock already when modifying a
reflog. I seem to recall there was a case that missed this (I think
maybe modifying the HEAD reflog without holding HEAD.lock), but it was
fixed in the last few versions.

So I think we're probably OK, but I agree it's an interesting gotcha
that may bite us in the future.

-Peff

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: t5400 failure on Windows
  2017-05-17  5:44     ` Jeff King
@ 2017-05-17 18:41       ` Johannes Sixt
  2017-05-18  5:02         ` [PATCH] t5400: avoid concurrent writes into a trace file Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Sixt @ 2017-05-17 18:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

Am 17.05.2017 um 07:44 schrieb Jeff King:
> I wonder if there's a way we could convince the trace for the two
> programs to go to separate locations. We don't care about receive-pack's
> trace at all. So maybe:

This works. Below it is with a commit message. I'm unsure about the
sign-off procedure, though.

>> * refs/files-backend.c: There are uses in functions open_or_create_logfile()
>> and log_ref_setup(). Sounds like it is in reflogs. Sounds like this is about
>> reflogs. If there are concurrent accesses, there is a danger that a reflog
>> is not recorded correctly. Then reflogs may be missing and objects may be
>> pruned earlier than expected. That's something to worry about, but I would
>> not count it as mission critical.

Of course, the reflog can also be corrupted, but:

> We should always hold the matching ref lock already when modifying a
> reflog. I seem to recall there was a case that missed this (I think
> maybe modifying the HEAD reflog without holding HEAD.lock), but it was
> fixed in the last few versions.

we should be fairly safe then.

---- 8< ----
From: Jeff King <peff@peff.net>
Subject: [PATCH jk/alternate-ref-optim] t5400: avoid concurrent writes into a trace file

One test in t5400 examines the packet exchange between git-push and
git-receive-pack. The latter inherits the GIT_TRACE_PACKET environment
variable, so that both processes dump trace data into the same file
concurrently. This should not be a problem because the trace file is
opened with O_APPEND.

On Windows, however, O_APPEND is not atomic as it should be: it is
emulated as lseek(SEEK_END) followed by write(). For this reason, the
test is unreliable: it can happen that one process overwrites a line
that was just written by the other process. As a consequence, the test
sometimes does not find one or another line that is expected (and it is
also successful occasionally).

The test case is actually only interested in the output of git-push.
To ensure that only git-push writes to the trace file, override the
receive-pack command such that it does not even open the trace file.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 t/t5400-send-pack.sh | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t5400-send-pack.sh b/t/t5400-send-pack.sh
index 3331e0f534..d375d7110d 100755
--- a/t/t5400-send-pack.sh
+++ b/t/t5400-send-pack.sh
@@ -288,7 +288,10 @@ test_expect_success 'receive-pack de-dupes .have lines' '
 	$shared .have
 	EOF
 
-	GIT_TRACE_PACKET=$(pwd)/trace git push fork HEAD:foo &&
+	GIT_TRACE_PACKET=$(pwd)/trace \
+	    git push \
+		--receive-pack="unset GIT_TRACE_PACKET; git-receive-pack" \
+		fork HEAD:foo &&
 	extract_ref_advertisement <trace >refs &&
 	test_cmp expect refs
 '
-- 
2.13.0.55.g17b7d13330

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] t5400: avoid concurrent writes into a trace file
  2017-05-17 18:41       ` Johannes Sixt
@ 2017-05-18  5:02         ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2017-05-18  5:02 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Git Mailing List

On Wed, May 17, 2017 at 08:41:01PM +0200, Johannes Sixt wrote:

> Am 17.05.2017 um 07:44 schrieb Jeff King:
> > I wonder if there's a way we could convince the trace for the two
> > programs to go to separate locations. We don't care about receive-pack's
> > trace at all. So maybe:
> 
> This works. Below it is with a commit message. I'm unsure about the
> sign-off procedure, though.

Thanks for testing. I actually wrote one in preparation, but I like
yours a little better.

I'll repost here with my signoff and a cc to Junio, since the earlier
subject line was less likely to grab attention. This goes on top of
jk/alternate-ref-optim.

-- >8 --
Subject: t5400: avoid concurrent writes into a trace file

One test in t5400 examines the packet exchange between git-push and
git-receive-pack. The latter inherits the GIT_TRACE_PACKET environment
variable, so that both processes dump trace data into the same file
concurrently. This should not be a problem because the trace file is
opened with O_APPEND.

On Windows, however, O_APPEND is not atomic as it should be: it is
emulated as lseek(SEEK_END) followed by write(). For this reason, the
test is unreliable: it can happen that one process overwrites a line
that was just written by the other process. As a consequence, the test
sometimes does not find one or another line that is expected (and it is
also successful occasionally).

The test case is actually only interested in the output of git-push.
To ensure that only git-push writes to the trace file, override the
receive-pack command such that it does not even open the trace file.

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5400-send-pack.sh | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t5400-send-pack.sh b/t/t5400-send-pack.sh
index 3331e0f53..d375d7110 100755
--- a/t/t5400-send-pack.sh
+++ b/t/t5400-send-pack.sh
@@ -288,7 +288,10 @@ test_expect_success 'receive-pack de-dupes .have lines' '
 	$shared .have
 	EOF
 
-	GIT_TRACE_PACKET=$(pwd)/trace git push fork HEAD:foo &&
+	GIT_TRACE_PACKET=$(pwd)/trace \
+	    git push \
+		--receive-pack="unset GIT_TRACE_PACKET; git-receive-pack" \
+		fork HEAD:foo &&
 	extract_ref_advertisement <trace >refs &&
 	test_cmp expect refs
 '
-- 
2.13.0.219.g63f6bc368


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-05-18  5:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-15 20:05 t5400 failure on Windows Johannes Sixt
2017-05-15 22:24 ` Jeff King
2017-05-16 18:35   ` Johannes Sixt
2017-05-17  5:44     ` Jeff King
2017-05-17 18:41       ` Johannes Sixt
2017-05-18  5:02         ` [PATCH] t5400: avoid concurrent writes into a trace file Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).