git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] t6026-merge-attr: don't fail if sleep exits early
@ 2016-11-08 17:03 Andreas Schwab
  2016-11-08 20:05 ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2016-11-08 17:03 UTC (permalink / raw)
  To: git

Commit 5babb5bdb3 ("t6026-merge-attr: clean up background process at end
of test case") added a kill command to clean up after the test, but this
can fail if the sleep command exits before the cleanup is executed.
Ignore the error from the kill command.

Signed-off-by: Andreas Schwab <schwab@suse.de>
---
The failure can be simulated by adding a sleep after the last command to
delay the cleanup.
---
 t/t6026-merge-attr.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t6026-merge-attr.sh b/t/t6026-merge-attr.sh
index 7a6e33e673..2672b15aa3 100755
--- a/t/t6026-merge-attr.sh
+++ b/t/t6026-merge-attr.sh
@@ -187,7 +187,7 @@ test_expect_success 'custom merge does not lock index' '
 		sleep 1 &
 		echo $! >sleep.pid
 	EOF
-	test_when_finished "kill \$(cat sleep.pid)" &&
+	test_when_finished "kill \$(cat sleep.pid) || :" &&
 
 	test_write_lines >.gitattributes \
 		"* merge=ours" "text merge=sleep-one-second" &&
-- 
2.10.2


-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] t6026-merge-attr: don't fail if sleep exits early
  2016-11-08 17:03 [PATCH] t6026-merge-attr: don't fail if sleep exits early Andreas Schwab
@ 2016-11-08 20:05 ` Jeff King
  2016-11-09 13:47   ` Johannes Schindelin
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2016-11-08 20:05 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Johannes Sixt, git

On Tue, Nov 08, 2016 at 06:03:04PM +0100, Andreas Schwab wrote:

> Commit 5babb5bdb3 ("t6026-merge-attr: clean up background process at end
> of test case") added a kill command to clean up after the test, but this
> can fail if the sleep command exits before the cleanup is executed.
> Ignore the error from the kill command.
> 
> Signed-off-by: Andreas Schwab <schwab@suse.de>
> ---
> The failure can be simulated by adding a sleep after the last command to
> delay the cleanup.

Thanks for the reproduction hint. I sometimes run the test suite through
a "stress" script that sees if a test script racily fails under load,
but I wasn't able to trigger it here (I guess a full second is just too
long even under high load). But the extra sleep makes it obvious.

Looks like the original is in v2.10.1, so this should probably go to
maint, as well as the upcoming v2.11-rc1.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] t6026-merge-attr: don't fail if sleep exits early
  2016-11-08 20:05 ` Jeff King
@ 2016-11-09 13:47   ` Johannes Schindelin
  2016-11-09 14:36     ` Andreas Schwab
  0 siblings, 1 reply; 12+ messages in thread
From: Johannes Schindelin @ 2016-11-09 13:47 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Schwab, Johannes Sixt, git

Hi 

On Tue, 8 Nov 2016, Jeff King wrote:

> On Tue, Nov 08, 2016 at 06:03:04PM +0100, Andreas Schwab wrote:
> 
> > Commit 5babb5bdb3 ("t6026-merge-attr: clean up background process at end
> > of test case") added a kill command to clean up after the test, but this
> > can fail if the sleep command exits before the cleanup is executed.
> > Ignore the error from the kill command.
> > 
> > Signed-off-by: Andreas Schwab <schwab@suse.de>
> > ---
> > The failure can be simulated by adding a sleep after the last command to
> > delay the cleanup.
> 
> Thanks for the reproduction hint. I sometimes run the test suite through
> a "stress" script that sees if a test script racily fails under load,
> but I wasn't able to trigger it here (I guess a full second is just too
> long even under high load). But the extra sleep makes it obvious.
> 
> Looks like the original is in v2.10.1, so this should probably go to
> maint, as well as the upcoming v2.11-rc1.

The reason why we do not ignore kill errors is that we want to make sure
that the script *actually ran*. Otherwise, the thing we need to test here
does not necessarily get tested.

So I would rather go with the patch Hannes hinted at when he said that he
did not want to change the file name (as it would have made the diff less
obvious): increase the number of seconds.

Will send out a superseding patch in a minute,
Dscho

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] t6026-merge-attr: don't fail if sleep exits early
  2016-11-09 13:47   ` Johannes Schindelin
@ 2016-11-09 14:36     ` Andreas Schwab
  2016-11-09 15:31       ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2016-11-09 14:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, Johannes Sixt, git

On Nov 09 2016, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> The reason why we do not ignore kill errors is that we want to make sure
> that the script *actually ran*. Otherwise, the thing we need to test here
> does not necessarily get tested.

That can be tested by looking for the pid file.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] t6026-merge-attr: don't fail if sleep exits early
  2016-11-09 14:36     ` Andreas Schwab
@ 2016-11-09 15:31       ` Jeff King
  2016-11-10  8:31         ` [PATCH v2] " Andreas Schwab
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2016-11-09 15:31 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Johannes Schindelin, Johannes Sixt, git

On Wed, Nov 09, 2016 at 03:36:40PM +0100, Andreas Schwab wrote:

> On Nov 09 2016, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> 
> > The reason why we do not ignore kill errors is that we want to make sure
> > that the script *actually ran*. Otherwise, the thing we need to test here
> > does not necessarily get tested.
> 
> That can be tested by looking for the pid file.

I agree that makes the intent a lot more obvious. Having a necessary
condition of the test stuffed into a test_when_finished block seems
counter-intuitive.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] t6026-merge-attr: don't fail if sleep exits early
  2016-11-09 15:31       ` Jeff King
@ 2016-11-10  8:31         ` Andreas Schwab
  2016-11-10 19:20           ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2016-11-10  8:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, Johannes Sixt, git

Commit 5babb5bdb3 ("t6026-merge-attr: clean up background process at end
of test case") added a kill command to clean up after the test, but this
can fail if the sleep command exits before the cleanup is executed.
Ignore the error from the kill command.

Explicitly check for the existence of the pid file to test that the merge
driver was actually called.

Signed-off-by: Andreas Schwab <schwab@suse.de>
---
 t/t6026-merge-attr.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t6026-merge-attr.sh b/t/t6026-merge-attr.sh
index 7a6e33e673..03d13d00b5 100755
--- a/t/t6026-merge-attr.sh
+++ b/t/t6026-merge-attr.sh
@@ -187,13 +187,14 @@ test_expect_success 'custom merge does not lock index' '
 		sleep 1 &
 		echo $! >sleep.pid
 	EOF
-	test_when_finished "kill \$(cat sleep.pid)" &&
+	test_when_finished "kill \$(cat sleep.pid) || :" &&
 
 	test_write_lines >.gitattributes \
 		"* merge=ours" "text merge=sleep-one-second" &&
 	test_config merge.ours.driver true &&
 	test_config merge.sleep-one-second.driver ./sleep-one-second.sh &&
-	git merge master
+	git merge master &&
+	test -f sleep.pid
 '
 
 test_done
-- 
2.10.2


-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] t6026-merge-attr: don't fail if sleep exits early
  2016-11-10  8:31         ` [PATCH v2] " Andreas Schwab
@ 2016-11-10 19:20           ` Junio C Hamano
  2016-11-10 21:55             ` Johannes Schindelin
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2016-11-10 19:20 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Jeff King, Johannes Schindelin, Johannes Sixt, git

Andreas Schwab <schwab@suse.de> writes:

> Commit 5babb5bdb3 ("t6026-merge-attr: clean up background process at end
> of test case") added a kill command to clean up after the test, but this
> can fail if the sleep command exits before the cleanup is executed.
> Ignore the error from the kill command.
>
> Explicitly check for the existence of the pid file to test that the merge
> driver was actually called.
>
> Signed-off-by: Andreas Schwab <schwab@suse.de>
> ---

OK.  sleep.pid is a reasonable easy-to-access side effect we can
observe to make sure that the sleep-one-second merge driver was
indeed invoked, which was missing from the earlier round.

Thanks, will apply.

>  t/t6026-merge-attr.sh | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/t/t6026-merge-attr.sh b/t/t6026-merge-attr.sh
> index 7a6e33e673..03d13d00b5 100755
> --- a/t/t6026-merge-attr.sh
> +++ b/t/t6026-merge-attr.sh
> @@ -187,13 +187,14 @@ test_expect_success 'custom merge does not lock index' '
>  		sleep 1 &
>  		echo $! >sleep.pid
>  	EOF
> -	test_when_finished "kill \$(cat sleep.pid)" &&
> +	test_when_finished "kill \$(cat sleep.pid) || :" &&
>  
>  	test_write_lines >.gitattributes \
>  		"* merge=ours" "text merge=sleep-one-second" &&
>  	test_config merge.ours.driver true &&
>  	test_config merge.sleep-one-second.driver ./sleep-one-second.sh &&
> -	git merge master
> +	git merge master &&
> +	test -f sleep.pid
>  '
>  
>  test_done
> -- 
> 2.10.2

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] t6026-merge-attr: don't fail if sleep exits early
  2016-11-10 19:20           ` Junio C Hamano
@ 2016-11-10 21:55             ` Johannes Schindelin
  2016-11-10 22:30               ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Johannes Schindelin @ 2016-11-10 21:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andreas Schwab, Jeff King, Johannes Sixt, git

Hi all,

On Thu, 10 Nov 2016, Junio C Hamano wrote:

> Andreas Schwab <schwab@suse.de> writes:
> 
> > Commit 5babb5bdb3 ("t6026-merge-attr: clean up background process at end
> > of test case") added a kill command to clean up after the test, but this
> > can fail if the sleep command exits before the cleanup is executed.
> > Ignore the error from the kill command.
> >
> > Explicitly check for the existence of the pid file to test that the merge
> > driver was actually called.
> >
> > Signed-off-by: Andreas Schwab <schwab@suse.de>
> > ---
> 
> OK.  sleep.pid is a reasonable easy-to-access side effect we can
> observe to make sure that the sleep-one-second merge driver was
> indeed invoked, which was missing from the earlier round.

No, this is incorrect. The condition that we need to know applies is that
the script is still running, and blocking if the bug reappears.

It is not enough to test that the script *started* running. If it exits
too early, the problematic condition is never tested, and the test is
useless.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] t6026-merge-attr: don't fail if sleep exits early
  2016-11-10 21:55             ` Johannes Schindelin
@ 2016-11-10 22:30               ` Junio C Hamano
  2016-11-10 22:35                 ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2016-11-10 22:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Andreas Schwab, Jeff King, Johannes Sixt, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> OK.  sleep.pid is a reasonable easy-to-access side effect we can
>> observe to make sure that the sleep-one-second merge driver was
>> indeed invoked, which was missing from the earlier round.
>
> No, this is incorrect. The condition that we need to know applies is that
> the script is still running, and blocking if the bug reappears.

OK, I see what you are saying, and I see a few things wrong in here:

 * First, the test is titled in a misleading way.  In the context of
   a patch that was titled ad65f7e3b7 ("t6026-merge-attr: child
   processes must not inherit index.lock handles", 2016-08-18), it
   might have been clear enough to say "does not lock index", but
   the sleeping is to make sure that we would notice if the fd to
   the index.lock leaked to the child process by mistake, and the
   way to do so is that the child arranges the leaked fd to be kept
   open after it exits (by spawning "sleep").  The test was never
   about "does not lock index" (the driver does not take any lock by
   itself in the first place).

 * There are three possible outcome from this test:

   - 'git merge' fails.

     This is expected to happen only on Windows and if the code gets
     broken and starts leaking the fd.

   - 'git merge' finishes correctly, the sleep is still running when
     test_when_finished goes to cull it.

     In this case, we KNOW there wasn't any fd leak IF we are on
     Windows where a leaked FD would not allow 'git merge' to
     succeed.  But on other platforms, fd leak that may cause
     trouble for Windows friends will not be caught.

   - 'git merge' finishes correctly, the sleep is no longer
     running because the machine was heavily loaded; a workaround is
     to tolerate failure of culling it.

     In this case, we cannot tell anything from the test.  Even if
     the fd was leaked, 'git merge' may have succeeded even on
     Windows.

As everybody knows there is no appropriate timeout value that is
good for everybody.  I wonder if we can replace the sleep 1 with
something like

	( while sleep 3600; do :; done ) &

so that leaked fd will be kept even in any heavily loaded
environment instead?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] t6026-merge-attr: don't fail if sleep exits early
  2016-11-10 22:30               ` Junio C Hamano
@ 2016-11-10 22:35                 ` Jeff King
  2016-11-10 22:55                   ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2016-11-10 22:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Andreas Schwab, Johannes Sixt, git

On Thu, Nov 10, 2016 at 02:30:36PM -0800, Junio C Hamano wrote:

> As everybody knows there is no appropriate timeout value that is
> good for everybody.  I wonder if we can replace the sleep 1 with
> something like
> 
> 	( while sleep 3600; do :; done ) &
> 
> so that leaked fd will be kept even in any heavily loaded
> environment instead?

I think you may have missed:

  http://public-inbox.org/git/16dc9f159b214997f7501006a8d1d8be2ef858e8.1478699463.git.johannes.schindelin@gmx.de/

which does roughly that. It does not loop, but I suspect 3600 is plenty
in practice.

I do think the test would be a lot more obvious if it confirmed at the
end of the test that the process was still running, as opposed to
relying on test_when_finished to check it.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] t6026-merge-attr: don't fail if sleep exits early
  2016-11-10 22:35                 ` Jeff King
@ 2016-11-10 22:55                   ` Junio C Hamano
  2016-11-10 23:03                     ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2016-11-10 22:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, Andreas Schwab, Johannes Sixt, git

Jeff King <peff@peff.net> writes:

> I do think the test would be a lot more obvious if it confirmed at the
> end of the test that the process was still running, as opposed to
> relying on test_when_finished to check it.

I agree that "check that the process is still running" is a wrong
thing to do in the first place.  What would it mean if the process
is no longer running?  It is a timing-dependent bug in the test;
after all we failed to produce the condition that could trigger a
bug that we are guarding against.  And "let's do '|| :'" is sweeping
the bug (not in the code we are testing, but in the test that will
fail to notice a bug we are preparing against) under the rug.  So I
agree with Dscho that we should do that first.

If we ensure that the process is still running, then such a check is
a good belt-and-suspenders way to catch a breakage in the mechanism
we choose to ensure it.  So probably we can require that the kill in
the "when finished" part to actually send a signal to a process that
is still running.

Is there an equivalent to pause(2) available to shell scripts?  I
really hate a single "sleep 3600" or anything with a magic number.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] t6026-merge-attr: don't fail if sleep exits early
  2016-11-10 22:55                   ` Junio C Hamano
@ 2016-11-10 23:03                     ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2016-11-10 23:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Andreas Schwab, Johannes Sixt, git

On Thu, Nov 10, 2016 at 02:55:06PM -0800, Junio C Hamano wrote:

> If we ensure that the process is still running, then such a check is
> a good belt-and-suspenders way to catch a breakage in the mechanism
> we choose to ensure it.  So probably we can require that the kill in
> the "when finished" part to actually send a signal to a process that
> is still running.
> 
> Is there an equivalent to pause(2) available to shell scripts?  I
> really hate a single "sleep 3600" or anything with a magic number.

I think it is usually spelled "read <some-fifo", but we can't use FIFOs
here because Windows doesn't have them. You could probably do something
with "read <&9" and set up descriptor 9 in the test code. But frankly,
that gets complex pretty quickly, as you have to background things.

This minor issue isn't worth it.  Just bumping the sleep to 3600 makes
the raciness problem go away, and everything works in practice. That's
probably good enough for our purposes.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-11-10 23:04 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-08 17:03 [PATCH] t6026-merge-attr: don't fail if sleep exits early Andreas Schwab
2016-11-08 20:05 ` Jeff King
2016-11-09 13:47   ` Johannes Schindelin
2016-11-09 14:36     ` Andreas Schwab
2016-11-09 15:31       ` Jeff King
2016-11-10  8:31         ` [PATCH v2] " Andreas Schwab
2016-11-10 19:20           ` Junio C Hamano
2016-11-10 21:55             ` Johannes Schindelin
2016-11-10 22:30               ` Junio C Hamano
2016-11-10 22:35                 ` Jeff King
2016-11-10 22:55                   ` Junio C Hamano
2016-11-10 23:03                     ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).