git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] t6500: don't run detached auto gc at the end of the test script
@ 2017-04-10 12:59 SZEDER Gábor
  2017-04-10 13:58 ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-10 12:59 UTC (permalink / raw)
  To: Junio C Hamano, David Turner; +Cc: git, SZEDER Gábor

The last test in 't6500-gc', 'background auto gc does not run if
gc.log is present and recent but does if it is old', added in
a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically
trigger an error message from the test harness:

  rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty

The test in question ends with executing an auto gc in the backround,
which occasionally takes so long that it's still running when
'test_done' is about to remove the trash directory.  This 'rm -rf
$trash' in the foreground might race with the detached auto gc to
create and delete files and directories, and gc might (re-)create a
path that 'rm' already visited and removed, triggering the above error
when 'rm' attempts to remove its parent directory.

Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01)
fixed the same problem in a different test script by simply
disallowing background gc.  Unfortunately, what worked there is not
applicable here, because the purpose of this test is to check the
behavior of a detached auto gc.

Move the detached auto gc execution further away from the end of the
test script by splitting the test in two: run the 'background auto gc
runs if an old gc.log is present' part first, leaving only the
aborting auto gc executions in the last test.  Run the first test's
detached auto gc in a dedicated sub-repository to prevent it from
interfering with the subseqent test's gc.  Also add a comment at the
end of the test script to warn developers of future tests about this
issue.

While this change doesn't completely eliminate the possibility of
this race, it significantly and seemingly sufficiently reduces its
probability.  Running t6500 in a loop while my box was under heavy CPU
and I/O load usually triggered the error under 15 seconds, but with
this patch it run overnight without incident.

(Alternatively, we could check the presence of the gc.pid file after
starting the detached auto gc, and wait while it's there.  However,
this would create a different race: the auto gc process creates the
pid file after it goes to the background, so our check in the
foreground might racily look for the pid file before it's even
created, and thus would not wait for the background gc to finish.
Furthermore, this would open up the possibility of the test hanging if
something were to go wrong with the gc process and the pid file were
not removed.)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/t6500-gc.sh | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 08de2e8ab..2dc202c9b 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -67,7 +67,27 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre
 	test_line_count = 2 new # There is one new pack and its .idx
 '
 
-test_expect_success 'background auto gc does not run if gc.log is present and recent but does if it is old' '
+test_expect_success 'background auto gc runs if an old gc.log is present' '
+	# The detached auto gc process can run long enough in the
+	# background to racily interfere with the gc execution in the
+	# subsequent test, hence the dedicated repository.
+	test_create_repo other &&
+	(
+		cd other &&
+		test_commit foo &&
+		git repack &&
+		test_commit bar &&
+		git repack &&
+		git config gc.autopacklimit 1 &&
+		git config gc.autodetach true &&
+		git config gc.logexpiry 2.days &&
+		echo fleem >.git/gc.log &&
+		test-chmtime =-345600 .git/gc.log &&	# 4 days
+		git gc --auto
+	)
+'
+
+test_expect_success 'background auto gc does not run if gc.log is present' '
 	test_commit foo &&
 	test_commit bar &&
 	git repack &&
@@ -77,10 +97,12 @@ test_expect_success 'background auto gc does not run if gc.log is present and re
 	test_must_fail git gc --auto 2>err &&
 	test_i18ngrep "^error:" err &&
 	test_config gc.logexpiry 5.days &&
-	test-chmtime =-345600 .git/gc.log &&
-	test_must_fail git gc --auto &&
-	test_config gc.logexpiry 2.days &&
-	git gc --auto
+	test-chmtime =-345600 .git/gc.log &&	# 4 days
+	test_must_fail git gc --auto
 '
 
+# DO NOT leave a detached auto gc process running near the end of the
+# test script: it can run long enough in the background to racily
+# interfere with the cleanup in 'test_done'.
+
 test_done
-- 
2.12.2.613.g9c5b79913


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-10 12:59 [PATCH] t6500: don't run detached auto gc at the end of the test script SZEDER Gábor
@ 2017-04-10 13:58 ` Jeff King
  2017-04-10 16:31   ` SZEDER Gábor
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-10 13:58 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, David Turner, git

On Mon, Apr 10, 2017 at 02:59:11PM +0200, SZEDER Gábor wrote:

> While this change doesn't completely eliminate the possibility of
> this race, it significantly and seemingly sufficiently reduces its
> probability.  Running t6500 in a loop while my box was under heavy CPU
> and I/O load usually triggered the error under 15 seconds, but with
> this patch it run overnight without incident.
> 
> (Alternatively, we could check the presence of the gc.pid file after
> starting the detached auto gc, and wait while it's there.  However,
> this would create a different race: the auto gc process creates the
> pid file after it goes to the background, so our check in the
> foreground might racily look for the pid file before it's even
> created, and thus would not wait for the background gc to finish.
> Furthermore, this would open up the possibility of the test hanging if
> something were to go wrong with the gc process and the pid file were
> not removed.)

Could we just leave open a pipe descriptor that the child inherits, and
wait for it to close?

Something like:

  git gc --auto 9>&1 | read

should wait until the background gc process finishes. It depends on our
daemonize() not closing descriptors beyond 0/1/2, but that is certainly
the case now.

It also loses the exit status of the main "git gc", but that can be
fixed with shell hackery:

  code=$(sh -c 'git gc --auto; echo $?' 9>&1)
  test "$code" = 0

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-10 13:58 ` Jeff King
@ 2017-04-10 16:31   ` SZEDER Gábor
  2017-04-10 16:35     ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-10 16:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, David Turner, Git mailing list

On Mon, Apr 10, 2017 at 3:58 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Apr 10, 2017 at 02:59:11PM +0200, SZEDER Gábor wrote:
>
>> While this change doesn't completely eliminate the possibility of
>> this race, it significantly and seemingly sufficiently reduces its
>> probability.  Running t6500 in a loop while my box was under heavy CPU
>> and I/O load usually triggered the error under 15 seconds, but with
>> this patch it run overnight without incident.
>>
>> (Alternatively, we could check the presence of the gc.pid file after
>> starting the detached auto gc, and wait while it's there.  However,
>> this would create a different race: the auto gc process creates the
>> pid file after it goes to the background, so our check in the
>> foreground might racily look for the pid file before it's even
>> created, and thus would not wait for the background gc to finish.
>> Furthermore, this would open up the possibility of the test hanging if
>> something were to go wrong with the gc process and the pid file were
>> not removed.)
>
> Could we just leave open a pipe descriptor that the child inherits, and
> wait for it to close?
>
> Something like:
>
>   git gc --auto 9>&1 | read
>
> should wait until the background gc process finishes. It depends on our
> daemonize() not closing descriptors beyond 0/1/2, but that is certainly
> the case now.
>
> It also loses the exit status of the main "git gc", but that can be
> fixed with shell hackery:
>
>   code=$(sh -c 'git gc --auto; echo $?' 9>&1)
>   test "$code" = 0

Indeed this seems to work, and luckily we don't need that much
hackery.  When there is a single variable assignment and the expansion
of a command substitution is assigned to the variable, then the exit
status is that of the command inside the command substitution, i.e.

  $ v=$(false) ; echo $?
  1

This means we can write this simply as:

  doesnt_matter=$(git gc --auto 9>&1)

It's still hackery :)

OTOH, this makes it possible to continue the test reliably after the
gc finished in the background, so we could also check that there is
only a single pack file left, i.e. that the detached gc did what it
was supposed to do.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-10 16:31   ` SZEDER Gábor
@ 2017-04-10 16:35     ` Jeff King
  2017-04-10 16:56       ` SZEDER Gábor
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-10 16:35 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, David Turner, Git mailing list

On Mon, Apr 10, 2017 at 06:31:54PM +0200, SZEDER Gábor wrote:

> Indeed this seems to work, and luckily we don't need that much
> hackery.  When there is a single variable assignment and the expansion
> of a command substitution is assigned to the variable, then the exit
> status is that of the command inside the command substitution, i.e.
> 
>   $ v=$(false) ; echo $?
>   1
> 
> This means we can write this simply as:
> 
>   doesnt_matter=$(git gc --auto 9>&1)
> 
> It's still hackery :)

Heh. Yeah, I would call that _more_ hackery in that it's much more
clever. But it is shorter. :)

I think as long as the trickery is documented that's OK (and calling it
doesnt_matter and explaining in the commit message is fine by me;
hopefully that name would induce somebody to look in the history).

> OTOH, this makes it possible to continue the test reliably after the
> gc finished in the background, so we could also check that there is
> only a single pack file left, i.e. that the detached gc did what it
> was supposed to do.

Yes, I think the test can and should check the after-gc state.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-10 16:35     ` Jeff King
@ 2017-04-10 16:56       ` SZEDER Gábor
  2017-04-10 17:01         ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-10 16:56 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, David Turner, Git mailing list

On Mon, Apr 10, 2017 at 6:35 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Apr 10, 2017 at 06:31:54PM +0200, SZEDER Gábor wrote:

>> This means we can write this simply as:
>>
>>   doesnt_matter=$(git gc --auto 9>&1)
>>
>> It's still hackery :)
>
> Heh. Yeah, I would call that _more_ hackery in that it's much more
> clever. But it is shorter. :)
>
> I think as long as the trickery is documented that's OK (and calling it
> doesnt_matter and explaining in the commit message is fine by me;
> hopefully that name would induce somebody to look in the history).

For the sake of self documentation and potential future users, I will
put it into a helper function like run_and_wait_for_detached_auto_gc()
or something.  Jury is still out on the proper function name (it would
be a shame if the funcname were longer than the command ;), but time
is up for today...

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-10 16:56       ` SZEDER Gábor
@ 2017-04-10 17:01         ` Jeff King
  2017-04-11 21:32           ` Johannes Sixt
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-10 17:01 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, David Turner, Git mailing list

On Mon, Apr 10, 2017 at 06:56:30PM +0200, SZEDER Gábor wrote:

> On Mon, Apr 10, 2017 at 6:35 PM, Jeff King <peff@peff.net> wrote:
> > On Mon, Apr 10, 2017 at 06:31:54PM +0200, SZEDER Gábor wrote:
> 
> >> This means we can write this simply as:
> >>
> >>   doesnt_matter=$(git gc --auto 9>&1)
> >>
> >> It's still hackery :)
> >
> > Heh. Yeah, I would call that _more_ hackery in that it's much more
> > clever. But it is shorter. :)
> >
> > I think as long as the trickery is documented that's OK (and calling it
> > doesnt_matter and explaining in the commit message is fine by me;
> > hopefully that name would induce somebody to look in the history).
> 
> For the sake of self documentation and potential future users, I will
> put it into a helper function like run_and_wait_for_detached_auto_gc()
> or something.  Jury is still out on the proper function name (it would
> be a shame if the funcname were longer than the command ;), but time
> is up for today...

I wonder if you could make it a general test-lib function, like:

  run_and_wait () {
	# we read stdout from the child only for the side effect
	# of waiting until all child sub-processes exit, closing their
	# fd 9.
	does_not_matter=$("$@" 9>&1)
  }

You could make it even more general by doing an 'eval' in the middle,
but that is probably getting too insane. :)

I don't know if there are other spots that would benefit from this. Most
of the other racy stuff like this that I recall working on actually
wanted to run two co-processes (so background something, then keep
running the test, then ask the backgrounded task to die and wait for it
to finish).

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-10 17:01         ` Jeff King
@ 2017-04-11 21:32           ` Johannes Sixt
  2017-04-12  0:27             ` SZEDER Gábor
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Sixt @ 2017-04-11 21:32 UTC (permalink / raw)
  To: Jeff King, SZEDER Gábor
  Cc: Junio C Hamano, David Turner, Git mailing list

Am 10.04.2017 um 19:01 schrieb Jeff King:
> I wonder if you could make it a general test-lib function, like:
>
>   run_and_wait () {
> 	# we read stdout from the child only for the side effect
> 	# of waiting until all child sub-processes exit, closing their
> 	# fd 9.
> 	does_not_matter=$("$@" 9>&1)

I'm afraid this won't work on Windows when the invoked command is git. 
FD inheritance between MSYS (bash) and non-MSYS programs (git) is only 
implemented for FDs 0,1,2. That's a deficiency of MSYS, and I don't 
think that was improved in MSYS2.

>   }

-- Hannes


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-11 21:32           ` Johannes Sixt
@ 2017-04-12  0:27             ` SZEDER Gábor
  2017-04-12  0:50               ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-12  0:27 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jeff King, Junio C Hamano, David Turner, Git mailing list

On Tue, Apr 11, 2017 at 11:32 PM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 10.04.2017 um 19:01 schrieb Jeff King:
>>
>> I wonder if you could make it a general test-lib function, like:
>>
>>   run_and_wait () {
>>         # we read stdout from the child only for the side effect
>>         # of waiting until all child sub-processes exit, closing their
>>         # fd 9.
>>         does_not_matter=$("$@" 9>&1)
>
>
> I'm afraid this won't work on Windows when the invoked command is git. FD
> inheritance between MSYS (bash) and non-MSYS programs (git) is only
> implemented for FDs 0,1,2. That's a deficiency of MSYS, and I don't think
> that was improved in MSYS2.

Oh, that's a pity, I was almost ready with v2...

Unfortunately, this makes the general helper function unworkable, of
course.  Though in this particular case it wouldn't matter, because on
Windows daemonize() is basically a noop and 'git gc --auto' remains in
the foreground anyway.

I think we should stick with my initial patch, then.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-12  0:27             ` SZEDER Gábor
@ 2017-04-12  0:50               ` Jeff King
  2017-04-12 22:03                 ` SZEDER Gábor
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-12  0:50 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Johannes Sixt, Junio C Hamano, David Turner, Git mailing list

On Wed, Apr 12, 2017 at 02:27:05AM +0200, SZEDER Gábor wrote:

> >> I wonder if you could make it a general test-lib function, like:
> >>
> >>   run_and_wait () {
> >>         # we read stdout from the child only for the side effect
> >>         # of waiting until all child sub-processes exit, closing their
> >>         # fd 9.
> >>         does_not_matter=$("$@" 9>&1)
> >
> >
> > I'm afraid this won't work on Windows when the invoked command is git. FD
> > inheritance between MSYS (bash) and non-MSYS programs (git) is only
> > implemented for FDs 0,1,2. That's a deficiency of MSYS, and I don't think
> > that was improved in MSYS2.
> 
> Oh, that's a pity, I was almost ready with v2...
> 
> Unfortunately, this makes the general helper function unworkable, of
> course.  Though in this particular case it wouldn't matter, because on
> Windows daemonize() is basically a noop and 'git gc --auto' remains in
> the foreground anyway.

That makes it tempting to use in this scenario as a one-off with a
comment.

> I think we should stick with my initial patch, then.

I'm not entirely opposed, but my understanding was that it didn't
actually fix the race, it just made it a bit bigger. Which is sort of
unsatisfying.

I couldn't get the original to show a failure, though, even under heavy
load. So maybe widening the race is enough.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-12  0:50               ` Jeff King
@ 2017-04-12 22:03                 ` SZEDER Gábor
  2017-04-12 22:07                   ` [PATCHv2] " SZEDER Gábor
  2017-04-13 16:37                   ` [PATCH] t6500: don't run " Jeff King
  0 siblings, 2 replies; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-12 22:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Sixt, Junio C Hamano, David Turner, Git mailing list

On Wed, Apr 12, 2017 at 2:50 AM, Jeff King <peff@peff.net> wrote:
> On Wed, Apr 12, 2017 at 02:27:05AM +0200, SZEDER Gábor wrote:
>
>> >> I wonder if you could make it a general test-lib function, like:
>> >>
>> >>   run_and_wait () {
>> >>         # we read stdout from the child only for the side effect
>> >>         # of waiting until all child sub-processes exit, closing their
>> >>         # fd 9.
>> >>         does_not_matter=$("$@" 9>&1)
>> >
>> >
>> > I'm afraid this won't work on Windows when the invoked command is git. FD
>> > inheritance between MSYS (bash) and non-MSYS programs (git) is only
>> > implemented for FDs 0,1,2. That's a deficiency of MSYS, and I don't think
>> > that was improved in MSYS2.
>>
>> Oh, that's a pity, I was almost ready with v2...
>>
>> Unfortunately, this makes the general helper function unworkable, of
>> course.  Though in this particular case it wouldn't matter, because on
>> Windows daemonize() is basically a noop and 'git gc --auto' remains in
>> the foreground anyway.
>
> That makes it tempting to use in this scenario as a one-off with a
> comment.

Ok, I'll send it out in a minute, and Junio can then take his pick.

>> I think we should stick with my initial patch, then.
>
> I'm not entirely opposed, but my understanding was that it didn't
> actually fix the race, it just made it a bit bigger. Which is sort of
> unsatisfying.
>
> I couldn't get the original to show a failure, though, even under heavy
> load. So maybe widening the race is enough.

Just to be clear: it's only an occasionally appearing error message.
There is no failure in the sense of test failure, because 'rm -rf
$trash' erroring out during housekeeping does not fail the test suite.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCHv2] t6500: don't run detached auto gc at the end of the test script
  2017-04-12 22:03                 ` SZEDER Gábor
@ 2017-04-12 22:07                   ` SZEDER Gábor
  2017-04-13 10:31                     ` [PATCHv2.1] t6500: wait for " SZEDER Gábor
  2017-04-13 16:37                   ` [PATCH] t6500: don't run " Jeff King
  1 sibling, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-12 22:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johannes Sixt, David Turner, git, SZEDER Gábor

The last test in 't6500-gc', 'background auto gc does not run if
gc.log is present and recent but does if it is old', added in
a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically
trigger an error message from the test harness:

  rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty

The test in question ends with executing an auto gc in the backround,
which occasionally takes so long that it's still running when
'test_done' is about to remove the trash directory.  This 'rm -rf
$trash' in the foreground might race with the detached auto gc to
create and delete files and directories, and gc might (re-)create a
path that 'rm' already visited and removed, triggering the above error
message when 'rm' attempts to remove its parent directory.

Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01)
fixed the same problem in a different test script by simply
disallowing background gc.  Unfortunately, what worked there is not
applicable here, because the purpose of this test is to check the
behavior of a detached auto gc.

Make sure that the test doesn't continue before the gc is finished in
the background with a clever bit of shell trickery:

  - Open fd 9 in the shell, to be inherited by the background gc
    process, because our daemonize() only closes the standard fds 0,
    1 and 2.
  - Duplicate this fd 9 to stdout.
  - Read 'git gc's stdout, and thus fd 9, through a command
    substitution.  We don't actually care about gc's output, but this
    construct has two useful properties:
  - This read blocks until stdout or fd 9 are open.  While stdout is
    closed after the main gc process creates the background process
    and exits, fd 9 remains open until the backround process exits.
  - The variable assignment from the command substitution gets its
    exit status from the command executed within the command
    substitution, i.e. a failing main gc process will cause the test
    to fail.

Note, that this fd trickery doesn't work on Windows, because due to
MSYS limitations the git process only inherits the standard fds 0, 1
and 2 from the shell.  Luckily, it doesn't matter in this case,
because on Windows daemonize() is basically a noop, thus 'git gc
--auto' always runs in the foreground.

And since we can now continue the test reliably after the detached gc
finished, check that there is only a single packfile left at the end,
i.e. that the detached gc actually did what it was supposed to do.
Also add a comment at the end of the test script to warn developers of
future tests about this issue of long running detached gc processes.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/t6500-gc.sh | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 08de2e8ab..cc7acd101 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -67,6 +67,16 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre
 	test_line_count = 2 new # There is one new pack and its .idx
 '
 
+run_and_wait_for_auto_gc () {
+	# We read stdout from gc for the side effect of waiting until the
+	# background gc process exits, closing its fd 9.  Furthermore, the
+	# variable assignment from a command substitution preserves the
+	# exit status of the main gc process.
+	# Note: this fd trickery doesn't work on Windows, but there is no
+	# need to, because on Win the auto gc always runs in the foreground.
+	doesnt_matter=$(git gc --auto 9>&1)
+}
+
 test_expect_success 'background auto gc does not run if gc.log is present and recent but does if it is old' '
 	test_commit foo &&
 	test_commit bar &&
@@ -80,7 +90,13 @@ test_expect_success 'background auto gc does not run if gc.log is present and re
 	test-chmtime =-345600 .git/gc.log &&
 	test_must_fail git gc --auto &&
 	test_config gc.logexpiry 2.days &&
-	git gc --auto
+	run_and_wait_for_auto_gc &&
+	ls .git/objects/pack/pack-*.pack >packs &&
+	test_line_count = 1 packs
 '
 
+# DO NOT leave a detached auto gc process running near the end of the
+# test script: it can run long enough in the background to racily
+# interfere with the cleanup in 'test_done'.
+
 test_done
-- 
2.12.2.613.g9c5b79913


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCHv2.1] t6500: wait for detached auto gc at the end of the test script
  2017-04-12 22:07                   ` [PATCHv2] " SZEDER Gábor
@ 2017-04-13 10:31                     ` SZEDER Gábor
  2017-04-13 16:06                       ` David Turner
  2017-04-13 16:44                       ` Jeff King
  0 siblings, 2 replies; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-13 10:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johannes Sixt, David Turner, git, SZEDER Gábor

The last test in 't6500-gc', 'background auto gc does not run if
gc.log is present and recent but does if it is old', added in
a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically
trigger an error message from the test harness:

  rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty

The test in question ends with executing an auto gc in the backround,
which occasionally takes so long that it's still running when
'test_done' is about to remove the trash directory.  This 'rm -rf
$trash' in the foreground might race with the detached auto gc to
create and delete files and directories, and gc might (re-)create a
path that 'rm' already visited and removed, triggering the above error
message when 'rm' attempts to remove its parent directory.

Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01)
fixed the same problem in a different test script by simply
disallowing background gc.  Unfortunately, what worked there is not
applicable here, because the purpose of this test is to check the
behavior of a detached auto gc.

Make sure that the test doesn't continue before the gc is finished in
the background with a clever bit of shell trickery:

  - Open fd 9 in the shell, to be inherited by the background gc
    process, because our daemonize() only closes the standard fds 0,
    1 and 2.
  - Duplicate this fd 9 to stdout.
  - Read 'git gc's stdout, and thus fd 9, through a command
    substitution.  We don't actually care about gc's output, but this
    construct has two useful properties:
  - This read blocks until stdout or fd 9 are open.  While stdout is
    closed after the main gc process creates the background process
    and exits, fd 9 remains open until the backround process exits.
  - The variable assignment from the command substitution gets its
    exit status from the command executed within the command
    substitution, i.e. a failing main gc process will cause the test
    to fail.

Note, that this fd trickery doesn't work on Windows, because due to
MSYS limitations the git process only inherits the standard fds 0, 1
and 2 from the shell.  Luckily, it doesn't matter in this case,
because on Windows daemonize() is basically a noop, thus 'git gc
--auto' always runs in the foreground.

And since we can now continue the test reliably after the detached gc
finished, check that there is only a single packfile left at the end,
i.e. that the detached gc actually did what it was supposed to do.
Also add a comment at the end of the test script to warn developers of
future tests about this issue of long running detached gc processes.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---

Updated subject line, but otherwise the same as v2.

 t/t6500-gc.sh | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 08de2e8ab..cc7acd101 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -67,6 +67,16 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre
 	test_line_count = 2 new # There is one new pack and its .idx
 '
 
+run_and_wait_for_auto_gc () {
+	# We read stdout from gc for the side effect of waiting until the
+	# background gc process exits, closing its fd 9.  Furthermore, the
+	# variable assignment from a command substitution preserves the
+	# exit status of the main gc process.
+	# Note: this fd trickery doesn't work on Windows, but there is no
+	# need to, because on Win the auto gc always runs in the foreground.
+	doesnt_matter=$(git gc --auto 9>&1)
+}
+
 test_expect_success 'background auto gc does not run if gc.log is present and recent but does if it is old' '
 	test_commit foo &&
 	test_commit bar &&
@@ -80,7 +90,13 @@ test_expect_success 'background auto gc does not run if gc.log is present and re
 	test-chmtime =-345600 .git/gc.log &&
 	test_must_fail git gc --auto &&
 	test_config gc.logexpiry 2.days &&
-	git gc --auto
+	run_and_wait_for_auto_gc &&
+	ls .git/objects/pack/pack-*.pack >packs &&
+	test_line_count = 1 packs
 '
 
+# DO NOT leave a detached auto gc process running near the end of the
+# test script: it can run long enough in the background to racily
+# interfere with the cleanup in 'test_done'.
+
 test_done
-- 
2.12.2.613.g9c5b79913


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* RE: [PATCHv2.1] t6500: wait for detached auto gc at the end of the test script
  2017-04-13 10:31                     ` [PATCHv2.1] t6500: wait for " SZEDER Gábor
@ 2017-04-13 16:06                       ` David Turner
  2017-04-13 16:44                       ` Jeff King
  1 sibling, 0 replies; 41+ messages in thread
From: David Turner @ 2017-04-13 16:06 UTC (permalink / raw)
  To: 'SZEDER Gábor', Junio C Hamano
  Cc: Jeff King, Johannes Sixt, git@vger.kernel.org

Thanks for fixing this!

> -----Original Message-----
> From: SZEDER Gábor [mailto:szeder.dev@gmail.com]
> Sent: Thursday, April 13, 2017 6:32 AM
> To: Junio C Hamano <gitster@pobox.com>
> Cc: Jeff King <peff@peff.net>; Johannes Sixt <j6t@kdbg.org>; David Turner
> <David.Turner@twosigma.com>; git@vger.kernel.org; SZEDER Gábor
> <szeder.dev@gmail.com>
> Subject: [PATCHv2.1] t6500: wait for detached auto gc at the end of the test
> script
> 
> The last test in 't6500-gc', 'background auto gc does not run if gc.log is present
> and recent but does if it is old', added in
> a831c06a2 (gc: ignore old gc.log files, 2017-02-10), may sporadically trigger an
> error message from the test harness:
> 
>   rm: cannot remove 'trash directory.t6500-gc/.git/objects': Directory not empty
> 
> The test in question ends with executing an auto gc in the backround, which
> occasionally takes so long that it's still running when 'test_done' is about to
> remove the trash directory.  This 'rm -rf $trash' in the foreground might race
> with the detached auto gc to create and delete files and directories, and gc
> might (re-)create a path that 'rm' already visited and removed, triggering the
> above error message when 'rm' attempts to remove its parent directory.
> 
> Commit bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) fixed the
> same problem in a different test script by simply disallowing background gc.
> Unfortunately, what worked there is not applicable here, because the purpose
> of this test is to check the behavior of a detached auto gc.
> 
> Make sure that the test doesn't continue before the gc is finished in the
> background with a clever bit of shell trickery:
> 
>   - Open fd 9 in the shell, to be inherited by the background gc
>     process, because our daemonize() only closes the standard fds 0,
>     1 and 2.
>   - Duplicate this fd 9 to stdout.
>   - Read 'git gc's stdout, and thus fd 9, through a command
>     substitution.  We don't actually care about gc's output, but this
>     construct has two useful properties:
>   - This read blocks until stdout or fd 9 are open.  While stdout is
>     closed after the main gc process creates the background process
>     and exits, fd 9 remains open until the backround process exits.
>   - The variable assignment from the command substitution gets its
>     exit status from the command executed within the command
>     substitution, i.e. a failing main gc process will cause the test
>     to fail.
> 
> Note, that this fd trickery doesn't work on Windows, because due to MSYS
> limitations the git process only inherits the standard fds 0, 1 and 2 from the shell.
> Luckily, it doesn't matter in this case, because on Windows daemonize() is
> basically a noop, thus 'git gc --auto' always runs in the foreground.
> 
> And since we can now continue the test reliably after the detached gc finished,
> check that there is only a single packfile left at the end, i.e. that the detached gc
> actually did what it was supposed to do.
> Also add a comment at the end of the test script to warn developers of future
> tests about this issue of long running detached gc processes.
> 
> Helped-by: Jeff King <peff@peff.net>
> Helped-by: Johannes Sixt <j6t@kdbg.org>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
> 
> Updated subject line, but otherwise the same as v2.
> 
>  t/t6500-gc.sh | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 08de2e8ab..cc7acd101 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -67,6 +67,16 @@ test_expect_success 'auto gc with too many loose
> objects does not attempt to cre
>  	test_line_count = 2 new # There is one new pack and its .idx  '
> 
> +run_and_wait_for_auto_gc () {
> +	# We read stdout from gc for the side effect of waiting until the
> +	# background gc process exits, closing its fd 9.  Furthermore, the
> +	# variable assignment from a command substitution preserves the
> +	# exit status of the main gc process.
> +	# Note: this fd trickery doesn't work on Windows, but there is no
> +	# need to, because on Win the auto gc always runs in the foreground.
> +	doesnt_matter=$(git gc --auto 9>&1)
> +}
> +
>  test_expect_success 'background auto gc does not run if gc.log is present and
> recent but does if it is old' '
>  	test_commit foo &&
>  	test_commit bar &&
> @@ -80,7 +90,13 @@ test_expect_success 'background auto gc does not run if
> gc.log is present and re
>  	test-chmtime =-345600 .git/gc.log &&
>  	test_must_fail git gc --auto &&
>  	test_config gc.logexpiry 2.days &&
> -	git gc --auto
> +	run_and_wait_for_auto_gc &&
> +	ls .git/objects/pack/pack-*.pack >packs &&
> +	test_line_count = 1 packs
>  '
> 
> +# DO NOT leave a detached auto gc process running near the end of the #
> +test script: it can run long enough in the background to racily #
> +interfere with the cleanup in 'test_done'.
> +
>  test_done
> --
> 2.12.2.613.g9c5b79913


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-12 22:03                 ` SZEDER Gábor
  2017-04-12 22:07                   ` [PATCHv2] " SZEDER Gábor
@ 2017-04-13 16:37                   ` Jeff King
  2017-04-13 17:55                     ` Stefan Beller
  1 sibling, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-13 16:37 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Johannes Sixt, Junio C Hamano, David Turner, Git mailing list

On Thu, Apr 13, 2017 at 12:03:20AM +0200, SZEDER Gábor wrote:

> > I couldn't get the original to show a failure, though, even under heavy
> > load. So maybe widening the race is enough.
> 
> Just to be clear: it's only an occasionally appearing error message.
> There is no failure in the sense of test failure, because 'rm -rf
> $trash' erroring out during housekeeping does not fail the test suite.

Ah, OK, that makes more sense. I can detect it reliably by just checking

  ! test -d $root/trash*

in my stress-test after the test successfully completes.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCHv2.1] t6500: wait for detached auto gc at the end of the test script
  2017-04-13 10:31                     ` [PATCHv2.1] t6500: wait for " SZEDER Gábor
  2017-04-13 16:06                       ` David Turner
@ 2017-04-13 16:44                       ` Jeff King
  2017-04-13 18:08                         ` SZEDER Gábor
  1 sibling, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-13 16:44 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, Johannes Sixt, David Turner, git

On Thu, Apr 13, 2017 at 12:31:38PM +0200, SZEDER Gábor wrote:

> Note, that this fd trickery doesn't work on Windows, because due to
> MSYS limitations the git process only inherits the standard fds 0, 1
> and 2 from the shell.  Luckily, it doesn't matter in this case,
> because on Windows daemonize() is basically a noop, thus 'git gc
> --auto' always runs in the foreground.
> 
> And since we can now continue the test reliably after the detached gc
> finished, check that there is only a single packfile left at the end,
> i.e. that the detached gc actually did what it was supposed to do.
> Also add a comment at the end of the test script to warn developers of
> future tests about this issue of long running detached gc processes.

The whole thing looks nicely explained, and I'm happy that we're able to
reliably add this extra check at the end of the test.

I did wonder what will happen if Windows learns to daemonize() the
auto-gc. I don't think we'll get an immediate test failure, but this
test will become racy again. But this time we'll actually notice the
racy failure, because the "ls" will report extra packs if it runs before
the background gc does. At which point we can revisit this.

It would be nice if there were a non-racy way to detect whether we
daemonized or not, and complain on Windows when we do. Then we'll be
notified immediately when daemonize() changes by the test failure,
rather than waiting for a racy failure.

I guess we could probably grep for the "in the background" message from
the parent gc. OTOH, maybe it is not even worth it. The racy version
should fail reasonably promptly, I think, and the comments you've left
would point any investigator in the right direction.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-13 16:37                   ` [PATCH] t6500: don't run " Jeff King
@ 2017-04-13 17:55                     ` Stefan Beller
  2017-04-13 17:57                       ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: Stefan Beller @ 2017-04-13 17:55 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Thu, Apr 13, 2017 at 9:37 AM, Jeff King <peff@peff.net> wrote:
> Ah, OK, that makes more sense. I can detect it reliably by just checking
>
>   ! test -d $root/trash*
>
> in my stress-test after the test successfully completes.

Would we want to report such an error from the test suite itself?
(My line of thinking is that this is a similar issue to broken && chains,
which we do report).

Stefan

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-13 17:55                     ` Stefan Beller
@ 2017-04-13 17:57                       ` Jeff King
  2017-04-13 19:03                         ` SZEDER Gábor
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-13 17:57 UTC (permalink / raw)
  To: Stefan Beller
  Cc: SZEDER Gábor, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Thu, Apr 13, 2017 at 10:55:08AM -0700, Stefan Beller wrote:

> On Thu, Apr 13, 2017 at 9:37 AM, Jeff King <peff@peff.net> wrote:
> > Ah, OK, that makes more sense. I can detect it reliably by just checking
> >
> >   ! test -d $root/trash*
> >
> > in my stress-test after the test successfully completes.
> 
> Would we want to report such an error from the test suite itself?
> (My line of thinking is that this is a similar issue to broken && chains,
> which we do report).

Yeah, I had a similar thought. I can't think of any reason why it would
trigger a false positive, as long as we account for test-failure and the
--debug flag properly. I don't think we can just drop the "-f" from the
final "rm", because it may be overriding funny permissions left by
tests.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCHv2.1] t6500: wait for detached auto gc at the end of the test script
  2017-04-13 16:44                       ` Jeff King
@ 2017-04-13 18:08                         ` SZEDER Gábor
  2017-04-13 18:12                           ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-13 18:08 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Johannes Sixt, David Turner, Git mailing list

On Thu, Apr 13, 2017 at 6:44 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Apr 13, 2017 at 12:31:38PM +0200, SZEDER Gábor wrote:

> I did wonder what will happen if Windows learns to daemonize() the
> auto-gc. I don't think we'll get an immediate test failure, but this
> test will become racy again. But this time we'll actually notice the
> racy failure, because the "ls" will report extra packs if it runs before
> the background gc does. At which point we can revisit this.

Dscho said that it would take significant effort to make daemonize()
work on Windows, so I guess it will take a while before we'll have to
revisit this.

> It would be nice if there were a non-racy way to detect whether we
> daemonized or not, and complain on Windows when we do. Then we'll be
> notified immediately when daemonize() changes by the test failure,
> rather than waiting for a racy failure.
>
> I guess we could probably grep for the "in the background" message from
> the parent gc. OTOH, maybe it is not even worth it.

That wouldn't work at the moment, because auto gc says that it will go
to the background even on Windows.

As it is, auto gc first prints the "Auto packing the repo..." message,
and calls daemonize() after that.  Which message it prints, i.e. "in
background" or not, depends solely on the value of the 'gc.autoDetach'
config variable, which is true by default.  The only platform-specific
thing about all this is the #ifdef in daemonize(), but then it's
already too late, the misleading message has already been printed.

See the discussion the patch for the same issue in a different test
script (bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01)),
including a patch at the end that prevents auto gc on Windows from
lying about going to the background (which I, not being a Windows
user, didn't follow through).

  http://public-inbox.org/git/20160505171430.Horde.-GuvDpZBfS8VI1Zcfn4bJQI@webmail.informatik.kit.edu/T/#u

> The racy version
> should fail reasonably promptly, I think, and the comments you've left
> would point any investigator in the right direction.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCHv2.1] t6500: wait for detached auto gc at the end of the test script
  2017-04-13 18:08                         ` SZEDER Gábor
@ 2017-04-13 18:12                           ` Jeff King
  0 siblings, 0 replies; 41+ messages in thread
From: Jeff King @ 2017-04-13 18:12 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Junio C Hamano, Johannes Sixt, David Turner, Git mailing list

On Thu, Apr 13, 2017 at 08:08:12PM +0200, SZEDER Gábor wrote:

> On Thu, Apr 13, 2017 at 6:44 PM, Jeff King <peff@peff.net> wrote:
> > On Thu, Apr 13, 2017 at 12:31:38PM +0200, SZEDER Gábor wrote:
> 
> > I did wonder what will happen if Windows learns to daemonize() the
> > auto-gc. I don't think we'll get an immediate test failure, but this
> > test will become racy again. But this time we'll actually notice the
> > racy failure, because the "ls" will report extra packs if it runs before
> > the background gc does. At which point we can revisit this.
> 
> Dscho said that it would take significant effort to make daemonize()
> work on Windows, so I guess it will take a while before we'll have to
> revisit this.

Yeah, that's what I figured. I mostly just didn't want to leave a
time-bomb for future developers.

> > I guess we could probably grep for the "in the background" message from
> > the parent gc. OTOH, maybe it is not even worth it.
> 
> That wouldn't work at the moment, because auto gc says that it will go
> to the background even on Windows.

Ah, OK. Let's not worry about it, then. I think the way your test is
constructed we should get a racy failure not long after the change, and
your comments would lead people to realize what is going on.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-13 17:57                       ` Jeff King
@ 2017-04-13 19:03                         ` SZEDER Gábor
  2017-04-13 19:12                           ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-13 19:03 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Thu, Apr 13, 2017 at 7:57 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Apr 13, 2017 at 10:55:08AM -0700, Stefan Beller wrote:
>
>> On Thu, Apr 13, 2017 at 9:37 AM, Jeff King <peff@peff.net> wrote:
>> > Ah, OK, that makes more sense. I can detect it reliably by just checking
>> >
>> >   ! test -d $root/trash*
>> >
>> > in my stress-test after the test successfully completes.
>>
>> Would we want to report such an error from the test suite itself?
>> (My line of thinking is that this is a similar issue to broken && chains,
>> which we do report).

A broken &&-chain can harm the test's correctness by hiding errors.
The failing 'rm -rf $trash' during housekeeping, after all the tests
in the test script has been run and their results reported... not
really, I would think, though arguably it's a sign that something is
fishy.

> Yeah, I had a similar thought. I can't think of any reason why it would
> trigger a false positive, as long as we account for test-failure and the
> --debug flag properly. I don't think we can just drop the "-f" from the
> final "rm", because it may be overriding funny permissions left by
> tests.

FWIW, I used the following rather simple change during stress-testing
these patches (barring white-space damage):

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b569682..d7fa15a69 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -763,7 +763,7 @@ test_done () {

                test -d "$remove_trash" &&
                cd "$(dirname "$remove_trash")" &&
-               rm -rf "$(basename "$remove_trash")"
+               rm -rf "$(basename "$remove_trash")" || exit 1

                test_at_end_hook_

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-13 19:03                         ` SZEDER Gábor
@ 2017-04-13 19:12                           ` Jeff King
  2017-04-13 19:35                             ` SZEDER Gábor
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-13 19:12 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Stefan Beller, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Thu, Apr 13, 2017 at 09:03:26PM +0200, SZEDER Gábor wrote:

> > Yeah, I had a similar thought. I can't think of any reason why it would
> > trigger a false positive, as long as we account for test-failure and the
> > --debug flag properly. I don't think we can just drop the "-f" from the
> > final "rm", because it may be overriding funny permissions left by
> > tests.
> 
> FWIW, I used the following rather simple change during stress-testing
> these patches (barring white-space damage):
> 
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 13b569682..d7fa15a69 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -763,7 +763,7 @@ test_done () {
> 
>                 test -d "$remove_trash" &&
>                 cd "$(dirname "$remove_trash")" &&
> -               rm -rf "$(basename "$remove_trash")"
> +               rm -rf "$(basename "$remove_trash")" || exit 1

Oh, right. I thought "-f" would cause it to exit 0 even if some items
failed to be removed, but that only applies to ENOENT. So I think that
is probably a good change. I agree it's not a true error, but just a
sign of something fishy, but I don't think it hurts to have extra lint
checks.

Replacing it the "exit 1" with a "die" that has a message is probably a
good idea, though.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-13 19:12                           ` Jeff King
@ 2017-04-13 19:35                             ` SZEDER Gábor
  2017-04-14 20:08                               ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-13 19:35 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Thu, Apr 13, 2017 at 9:12 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Apr 13, 2017 at 09:03:26PM +0200, SZEDER Gábor wrote:
>
>> > Yeah, I had a similar thought. I can't think of any reason why it would
>> > trigger a false positive, as long as we account for test-failure and the
>> > --debug flag properly. I don't think we can just drop the "-f" from the
>> > final "rm", because it may be overriding funny permissions left by
>> > tests.
>>
>> FWIW, I used the following rather simple change during stress-testing
>> these patches (barring white-space damage):
>>
>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>> index 13b569682..d7fa15a69 100644
>> --- a/t/test-lib.sh
>> +++ b/t/test-lib.sh
>> @@ -763,7 +763,7 @@ test_done () {
>>
>>                 test -d "$remove_trash" &&

I'm not sure under what circumstances the trash dir could be missing at
this point...

>>                 cd "$(dirname "$remove_trash")" &&
>> -               rm -rf "$(basename "$remove_trash")"
>> +               rm -rf "$(basename "$remove_trash")" || exit 1

... but when it is already removed, then I think we should not exit
with error here.
Nothing that a pair of {} wouldn't handle.

> Oh, right. I thought "-f" would cause it to exit 0 even if some items
> failed to be removed, but that only applies to ENOENT. So I think that
> is probably a good change. I agree it's not a true error, but just a
> sign of something fishy, but I don't think it hurts to have extra lint
> checks.
>
> Replacing it the "exit 1" with a "die" that has a message is probably a
> good idea, though.

If we can't 'cd' or 'rm -rf', then they will tell us why they failed
anyway, most likely including the name of the trash directory.
Do we really need more error messages than that?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-13 19:35                             ` SZEDER Gábor
@ 2017-04-14 20:08                               ` Jeff King
  2017-04-20 16:42                                 ` SZEDER Gábor
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-14 20:08 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Stefan Beller, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Thu, Apr 13, 2017 at 09:35:08PM +0200, SZEDER Gábor wrote:

> >> diff --git a/t/test-lib.sh b/t/test-lib.sh
> >> index 13b569682..d7fa15a69 100644
> >> --- a/t/test-lib.sh
> >> +++ b/t/test-lib.sh
> >> @@ -763,7 +763,7 @@ test_done () {
> >>
> >>                 test -d "$remove_trash" &&
> 
> I'm not sure under what circumstances the trash dir could be missing at
> this point...

Yeah, I don't really see the point of that line. Our whole goal is to
remove it anyway.

> >>                 cd "$(dirname "$remove_trash")" &&
> >> -               rm -rf "$(basename "$remove_trash")"
> >> +               rm -rf "$(basename "$remove_trash")" || exit 1
> 
> ... but when it is already removed, then I think we should not exit
> with error here.
> Nothing that a pair of {} wouldn't handle.

I suppose so. It might be worth being picky just on the principle that
if it _is_ gone that's unexpected and we'd prefer somebody notice and
figure out why.

> > Replacing it the "exit 1" with a "die" that has a message is probably a
> > good idea, though.
> 
> If we can't 'cd' or 'rm -rf', then they will tell us why they failed
> anyway, most likely including the name of the trash directory.
> Do we really need more error messages than that?

I just meant something like "tests passed but test cleanup failed;
aborting" so that the user has a better idea what is going on. But since
it shouldn't happen, maybe that doesn't matter.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-14 20:08                               ` Jeff King
@ 2017-04-20 16:42                                 ` SZEDER Gábor
  2017-04-20 16:45                                   ` Jeff King
  2017-04-20 16:52                                   ` [PATCH] test-lib: abort when can't remove trash directory SZEDER Gábor
  0 siblings, 2 replies; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-20 16:42 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Fri, Apr 14, 2017 at 10:08 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Apr 13, 2017 at 09:35:08PM +0200, SZEDER Gábor wrote:
>
>> >> diff --git a/t/test-lib.sh b/t/test-lib.sh
>> >> index 13b569682..d7fa15a69 100644
>> >> --- a/t/test-lib.sh
>> >> +++ b/t/test-lib.sh
>> >> @@ -763,7 +763,7 @@ test_done () {
>> >>
>> >>                 test -d "$remove_trash" &&
>>
>> I'm not sure under what circumstances the trash dir could be missing at
>> this point...
>
> Yeah, I don't really see the point of that line. Our whole goal is to
> remove it anyway.

It turns out it's necessary, see below.

>> >>                 cd "$(dirname "$remove_trash")" &&
>> >> -               rm -rf "$(basename "$remove_trash")"
>> >> +               rm -rf "$(basename "$remove_trash")" || exit 1
>>
>> ... but when it is already removed, then I think we should not exit
>> with error here.
>> Nothing that a pair of {} wouldn't handle.
>
> I suppose so. It might be worth being picky just on the principle that
> if it _is_ gone that's unexpected and we'd prefer somebody notice and
> figure out why.

OK, agreed.

However, we do need the above 'test -d' for this to work, because 'rm
-rf' doesn't consider non-existing paths as errors, so we wouldn't
notice that the trash directory is already gone.

>> > Replacing it the "exit 1" with a "die" that has a message is probably a
>> > good idea, though.

test-lib.sh has a special 'die', different from git-sh-setup.sh's
'die'.  I'll use 'error "uh-oh"' instead.

>> If we can't 'cd' or 'rm -rf', then they will tell us why they failed
>> anyway, most likely including the name of the trash directory.
>> Do we really need more error messages than that?
>
> I just meant something like "tests passed but test cleanup failed;
> aborting" so that the user has a better idea what is going on. But since
> it shouldn't happen, maybe that doesn't matter.

And we do need an error message, because 'test -d' would remain silent
when the directory were already missing.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] t6500: don't run detached auto gc at the end of the test script
  2017-04-20 16:42                                 ` SZEDER Gábor
@ 2017-04-20 16:45                                   ` Jeff King
  2017-04-20 16:52                                   ` [PATCH] test-lib: abort when can't remove trash directory SZEDER Gábor
  1 sibling, 0 replies; 41+ messages in thread
From: Jeff King @ 2017-04-20 16:45 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Stefan Beller, Johannes Sixt, Junio C Hamano, David Turner,
	Git mailing list

On Thu, Apr 20, 2017 at 06:42:59PM +0200, SZEDER Gábor wrote:

> > I suppose so. It might be worth being picky just on the principle that
> > if it _is_ gone that's unexpected and we'd prefer somebody notice and
> > figure out why.
> 
> OK, agreed.
> 
> However, we do need the above 'test -d' for this to work, because 'rm
> -rf' doesn't consider non-existing paths as errors, so we wouldn't
> notice that the trash directory is already gone.

Right, that makes sense.

> >> > Replacing it the "exit 1" with a "die" that has a message is probably a
> >> > good idea, though.
> 
> test-lib.sh has a special 'die', different from git-sh-setup.sh's
> 'die'.  I'll use 'error "uh-oh"' instead.

Yeah, sorry, error() is the right function in this instance.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH] test-lib: abort when can't remove trash directory
  2017-04-20 16:42                                 ` SZEDER Gábor
  2017-04-20 16:45                                   ` Jeff King
@ 2017-04-20 16:52                                   ` SZEDER Gábor
  2017-04-20 19:06                                     ` Jeff King
                                                       ` (2 more replies)
  1 sibling, 3 replies; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-20 16:52 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Stefan Beller, Johannes Sixt, David Turner, git,
	SZEDER Gábor

We had two similar bugs in the tests sporadically triggering error
messages during the removal of the trash directory, see commits
bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and
ef09036cf (t6500: wait for detached auto gc at the end of the test
script, 2017-04-13).  The test script succeeded nonetheless, because
these errors are ignored during housekeeping in 'test_done'.

However, such an error is a sign that something is fishy in the test
script.  Print an error message and abort the test script when the
trash directory can't be removed successfully or is already removed,
because that's unexpected and we woud prefer somebody notice and
figure out why.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---

Note, that the commit message references ef09036cf (t6500: wait for
detached auto gc at the end of the test script, 2017-04-13), which
is still only in 'pu'.

 t/test-lib.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b569682..e9e6f677d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -761,9 +761,12 @@ test_done () {
 			say "1..$test_count$skip_all"
 		fi
 
-		test -d "$remove_trash" &&
+		test -d "$remove_trash" ||
+		error "Tests passed but trash directory already removed before test cleanup; aborting"
+
 		cd "$(dirname "$remove_trash")" &&
-		rm -rf "$(basename "$remove_trash")"
+		rm -rf "$(basename "$remove_trash")" ||
+		error "Tests passed but test cleanup failed; aborting"
 
 		test_at_end_hook_
 
-- 
2.12.2.613.g9c5b79913


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-20 16:52                                   ` [PATCH] test-lib: abort when can't remove trash directory SZEDER Gábor
@ 2017-04-20 19:06                                     ` Jeff King
  2017-04-21  0:48                                     ` Junio C Hamano
  2017-04-21 20:15                                     ` Jeff King
  2 siblings, 0 replies; 41+ messages in thread
From: Jeff King @ 2017-04-20 19:06 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Junio C Hamano, Stefan Beller, Johannes Sixt, David Turner, git

On Thu, Apr 20, 2017 at 06:52:30PM +0200, SZEDER Gábor wrote:

> We had two similar bugs in the tests sporadically triggering error
> messages during the removal of the trash directory, see commits
> bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and
> ef09036cf (t6500: wait for detached auto gc at the end of the test
> script, 2017-04-13).  The test script succeeded nonetheless, because
> these errors are ignored during housekeeping in 'test_done'.
> 
> However, such an error is a sign that something is fishy in the test
> script.  Print an error message and abort the test script when the
> trash directory can't be removed successfully or is already removed,
> because that's unexpected and we woud prefer somebody notice and
> figure out why.
> 
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>

I think this is a good idea, and the patch itself looks good. Thanks.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-20 16:52                                   ` [PATCH] test-lib: abort when can't remove trash directory SZEDER Gábor
  2017-04-20 19:06                                     ` Jeff King
@ 2017-04-21  0:48                                     ` Junio C Hamano
  2017-04-21 20:06                                       ` SZEDER Gábor
  2017-04-21 20:15                                     ` Jeff King
  2 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2017-04-21  0:48 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Jeff King, Stefan Beller, Johannes Sixt, David Turner, git

SZEDER Gábor <szeder.dev@gmail.com> writes:

> We had two similar bugs in the tests sporadically triggering error
> messages during the removal of the trash directory, see commits
> bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and
> ef09036cf (t6500: wait for detached auto gc at the end of the test
> script, 2017-04-13).  The test script succeeded nonetheless, because
> these errors are ignored during housekeeping in 'test_done'.
>
> However, such an error is a sign that something is fishy in the test
> script.  Print an error message and abort the test script when the
> trash directory can't be removed successfully or is already removed,
> because that's unexpected and we woud prefer somebody notice and
> figure out why.

Makes sense to me, too.

>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
>
> Note, that the commit message references ef09036cf (t6500: wait for
> detached auto gc at the end of the test script, 2017-04-13), which
> is still only in 'pu'.

I think that one is already part of 2.13-rc0 ;-)

>  t/test-lib.sh | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 13b569682..e9e6f677d 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -761,9 +761,12 @@ test_done () {
>  			say "1..$test_count$skip_all"
>  		fi
>  
> -		test -d "$remove_trash" &&
> +		test -d "$remove_trash" ||
> +		error "Tests passed but trash directory already removed before test cleanup; aborting"
> +
>  		cd "$(dirname "$remove_trash")" &&
> -		rm -rf "$(basename "$remove_trash")"
> +		rm -rf "$(basename "$remove_trash")" ||
> +		error "Tests passed but test cleanup failed; aborting"
>  
>  		test_at_end_hook_

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-21  0:48                                     ` Junio C Hamano
@ 2017-04-21 20:06                                       ` SZEDER Gábor
  0 siblings, 0 replies; 41+ messages in thread
From: SZEDER Gábor @ 2017-04-21 20:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Stefan Beller, Johannes Sixt, David Turner,
	Git mailing list

On Fri, Apr 21, 2017 at 2:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
>
>> We had two similar bugs in the tests sporadically triggering error
>> messages during the removal of the trash directory, see commits
>> bb05510e5 (t5510: run auto-gc in the foreground, 2016-05-01) and
>> ef09036cf (t6500: wait for detached auto gc at the end of the test
>> script, 2017-04-13).  The test script succeeded nonetheless, because
>> these errors are ignored during housekeeping in 'test_done'.
>>
>> However, such an error is a sign that something is fishy in the test
>> script.  Print an error message and abort the test script when the
>> trash directory can't be removed successfully or is already removed,
>> because that's unexpected and we woud prefer somebody notice and

s/woud/would/

>> figure out why.


> Note, that the commit message references ef09036cf (t6500: wait for
>> detached auto gc at the end of the test script, 2017-04-13), which
>> is still only in 'pu'.
>
> I think that one is already part of 2.13-rc0 ;-)

Yeah, I should have fetched before submitting.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-20 16:52                                   ` [PATCH] test-lib: abort when can't remove trash directory SZEDER Gábor
  2017-04-20 19:06                                     ` Jeff King
  2017-04-21  0:48                                     ` Junio C Hamano
@ 2017-04-21 20:15                                     ` Jeff King
  2017-04-24  0:14                                       ` Junio C Hamano
  2 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-21 20:15 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Junio C Hamano, Stefan Beller, Johannes Sixt, David Turner, git

On Thu, Apr 20, 2017 at 06:52:30PM +0200, SZEDER Gábor wrote:

> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 13b569682..e9e6f677d 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -761,9 +761,12 @@ test_done () {
>  			say "1..$test_count$skip_all"
>  		fi
>  
> -		test -d "$remove_trash" &&
> +		test -d "$remove_trash" ||
> +		error "Tests passed but trash directory already removed before test cleanup; aborting"

I think I found out why this "test -d" was here in the first place:

  $ ./t0000-basic.sh --debug
  [...]
  # passed all 77 test(s)
  1..77
  error: Tests passed but trash directory already removed before test cleanup; aborting

When --debug is in use, we do not set $remove_trash. The original was
relying on 'test -d ""' to return false.

I think this whole removal block should probably be moved inside a
conditional like:

  if test -n "$remove_trash"
  then
     ...
  fi

I also wonder if we should come up with a better name than
$remove_trash. A script which unknowingly overwrites that variable would
be disastrous.

Perhaps we should drop it entirely and just do:

  if test -z "$debug"
  then
     test -d "$TRASH_DIRECTORY" ||
     error "Tests passed but..."
  [and so forth...]
  fi

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-21 20:15                                     ` Jeff King
@ 2017-04-24  0:14                                       ` Junio C Hamano
  2017-04-24  1:43                                         ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2017-04-24  0:14 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

Jeff King <peff@peff.net> writes:

>> -		test -d "$remove_trash" &&
>> +		test -d "$remove_trash" ||
>> +		error "Tests passed but trash directory already removed before test cleanup; aborting"
>
> I think I found out why this "test -d" was here in the first place:
>
>   $ ./t0000-basic.sh --debug
>   [...]
>   # passed all 77 test(s)
>   1..77
>   error: Tests passed but trash directory already removed before test cleanup; aborting
>
> When --debug is in use, we do not set $remove_trash. The original was
> relying on 'test -d ""' to return false.
>
> I think this whole removal block should probably be moved inside a
> conditional like:
>
>   if test -n "$remove_trash"
>   then
>      ...
>   fi

Thanks for digging.  Yes, checking for non-empty string is
definitely better.

> I also wonder if we should come up with a better name than
> $remove_trash. A script which unknowingly overwrites that variable would
> be disastrous.
>
> Perhaps we should drop it entirely and just do:
>
>   if test -z "$debug"
>   then
>      test -d "$TRASH_DIRECTORY" ||
>      error "Tests passed but..."
>   [and so forth...]
>   fi

OK.  I am wondering why we do not do 

	rm -fr "$TRASH_DIRECTORY"

and do this instead:

	cd "$(dirname "$remove_trash")" &&
	rm -rf "$(basename "$remove_trash")"

in the original.  It feels somewhat unnatural.

	... goes and looks ...

Of course, back when abc5d372 ("Enable parallel tests", 2008-08-08)
was writen, we didn't even have TRASH_DIRECTORY variable; because
the way the test-lib.sh ensured that the trash directory is prestine
was to first do a 'rm -fr "$test"' before the first test_create_repo,
the above makes sort of matches how that initial removal is done.

So perhaps we can simplify and make it more robust by doing this?

 t/test-lib.sh | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index cde7fc7fcf..f1ab8f33d9 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -760,9 +760,14 @@ test_done () {
 			say "1..$test_count$skip_all"
 		fi
 
-		test -d "$remove_trash" &&
-		cd "$(dirname "$remove_trash")" &&
-		rm -rf "$(basename "$remove_trash")"
+		if test -z "$debug"
+		then
+			test -d "$TRASH_DIRECTORY" ||
+			error "Tests passed but trash directory already removed before test cleanup; aborting"
+
+			rm -fr "$TRASH_DIRECTORY" ||
+			error "Tests passed but test cleanup failed; aborting"
+		fi
 
 		test_at_end_hook_
 

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  0:14                                       ` Junio C Hamano
@ 2017-04-24  1:43                                         ` Jeff King
  2017-04-24  2:58                                           ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-24  1:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

On Sun, Apr 23, 2017 at 05:14:54PM -0700, Junio C Hamano wrote:

> OK.  I am wondering why we do not do 
> 
> 	rm -fr "$TRASH_DIRECTORY"
> 
> and do this instead:
> 
> 	cd "$(dirname "$remove_trash")" &&
> 	rm -rf "$(basename "$remove_trash")"
> 
> in the original.  It feels somewhat unnatural.

I assumed the "cd" was there because some systems may be unhappy
removing a directory which is our current working directory. That might
just be superstition, though.

The use of "basename" in the second does seem superfluous, since the
trash directory should be the full path (I suspect it wasn't in the
early days, though).

> So perhaps we can simplify and make it more robust by doing this?
> [...]
> +		if test -z "$debug"
> +		then
> +			test -d "$TRASH_DIRECTORY" ||
> +			error "Tests passed but trash directory already removed before test cleanup; aborting"
> +
> +			rm -fr "$TRASH_DIRECTORY" ||
> +			error "Tests passed but test cleanup failed; aborting"
> +		fi

That looks fine, assuming the answer to the "is the cwd important"
question is "no".

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  1:43                                         ` Jeff King
@ 2017-04-24  2:58                                           ` Junio C Hamano
  2017-04-24  4:02                                             ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2017-04-24  2:58 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

Jeff King <peff@peff.net> writes:

> On Sun, Apr 23, 2017 at 05:14:54PM -0700, Junio C Hamano wrote:
>
>> OK.  I am wondering why we do not do 
>> 
>> 	rm -fr "$TRASH_DIRECTORY"
>> 
>> and do this instead:
>> 
>> 	cd "$(dirname "$remove_trash")" &&
>> 	rm -rf "$(basename "$remove_trash")"
>> 
>> in the original.  It feels somewhat unnatural.
>
> I assumed the "cd" was there because some systems may be unhappy
> removing a directory which is our current working directory. That might
> just be superstition, though.

Ahh, OK, that makes sense.  I forgot about that.

> The use of "basename" in the second does seem superfluous, since the
> trash directory should be the full path (I suspect it wasn't in the
> early days, though).
>
>> So perhaps we can simplify and make it more robust by doing this?
>> [...]
>> +		if test -z "$debug"
>> +		then
>> +			test -d "$TRASH_DIRECTORY" ||
>> +			error "Tests passed but trash directory already removed before test cleanup; aborting"
>> +
>> +			rm -fr "$TRASH_DIRECTORY" ||
>> +			error "Tests passed but test cleanup failed; aborting"
>> +		fi
>
> That looks fine, assuming the answer to the "is the cwd important"
> question is "no".

And I do think the answer would be "yes", unfortunately.  There are
systems that do not even allow a file to be removed while it is
open, so...


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  2:58                                           ` Junio C Hamano
@ 2017-04-24  4:02                                             ` Junio C Hamano
  2017-04-24  7:52                                               ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2017-04-24  4:02 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

Junio C Hamano <gitster@pobox.com> writes:

>> That looks fine, assuming the answer to the "is the cwd important"
>> question is "no".
>
> And I do think the answer would be "yes", unfortunately.  There are
> systems that do not even allow a file to be removed while it is
> open, so...

In addition to "some platforms may not want cwd removed", this
directory would be where test_at_end_hook_ will be running in, so
we'd better be consistent with the current behaviour.

Second try that hopefully is much less damaging

 t/test-lib.sh | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index cb0766b9ee..4e8f511870 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -760,12 +760,15 @@ test_done () {
 			say "1..$test_count$skip_all"
 		fi
 
-		test -d "$remove_trash" ||
-		error "Tests passed but trash directory already removed before test cleanup; aborting"
+		if test -z "$debug"
+		then
+			test -d "$TRASH_DIRECTORY" ||
+			error "Tests passed but trash directory already removed before test cleanup; aborting"
 
-		cd "$(dirname "$remove_trash")" &&
-		rm -rf "$(basename "$remove_trash")" ||
-		error "Tests passed but test cleanup failed; aborting"
+			cd "$(dirname "$TRASH_DIRECTORY")" &&
+			rm -fr "$TRASH_DIRECTORY" ||
+			error "Tests passed but test cleanup failed; aborting"
+		fi
 
 		test_at_end_hook_
 




^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  4:02                                             ` Junio C Hamano
@ 2017-04-24  7:52                                               ` Jeff King
  2017-04-24  9:39                                                 ` Torsten Bögershausen
  2017-04-25  6:05                                                 ` Junio C Hamano
  0 siblings, 2 replies; 41+ messages in thread
From: Jeff King @ 2017-04-24  7:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

On Sun, Apr 23, 2017 at 09:02:41PM -0700, Junio C Hamano wrote:

> >> That looks fine, assuming the answer to the "is the cwd important"
> >> question is "no".
> >
> > And I do think the answer would be "yes", unfortunately.  There are
> > systems that do not even allow a file to be removed while it is
> > open, so...
> 
> In addition to "some platforms may not want cwd removed", this
> directory would be where test_at_end_hook_ will be running in, so
> we'd better be consistent with the current behaviour.

Good point. There's only one caller, but it does care about being in
that directory.

> Second try that hopefully is much less damaging
> 
>  t/test-lib.sh | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index cb0766b9ee..4e8f511870 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -760,12 +760,15 @@ test_done () {
>  			say "1..$test_count$skip_all"
>  		fi
>  
> -		test -d "$remove_trash" ||
> -		error "Tests passed but trash directory already removed before test cleanup; aborting"
> +		if test -z "$debug"
> +		then
> +			test -d "$TRASH_DIRECTORY" ||
> +			error "Tests passed but trash directory already removed before test cleanup; aborting"
>  
> -		cd "$(dirname "$remove_trash")" &&
> -		rm -rf "$(basename "$remove_trash")" ||
> -		error "Tests passed but test cleanup failed; aborting"
> +			cd "$(dirname "$TRASH_DIRECTORY")" &&
> +			rm -fr "$TRASH_DIRECTORY" ||
> +			error "Tests passed but test cleanup failed; aborting"
> +		fi

Yeah, that looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  7:52                                               ` Jeff King
@ 2017-04-24  9:39                                                 ` Torsten Bögershausen
  2017-04-24  9:46                                                   ` Jeff King
  2017-04-25  2:31                                                   ` Junio C Hamano
  2017-04-25  6:05                                                 ` Junio C Hamano
  1 sibling, 2 replies; 41+ messages in thread
From: Torsten Bögershausen @ 2017-04-24  9:39 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

[]
>>
>> -		cd "$(dirname "$remove_trash")" &&
>> -		rm -rf "$(basename "$remove_trash")" ||
>> -		error "Tests passed but test cleanup failed; aborting"
>> +			cd "$(dirname "$TRASH_DIRECTORY")" &&
>> +			rm -fr "$TRASH_DIRECTORY" ||
>> +			error "Tests passed but test cleanup failed; aborting"
>> +		fi
>
> Yeah, that looks good to me.

Does it ?
Checking the error code of "rm -f" doesn't work as expected from the script:
rm -rf DoesNotExist ; echo $?
0

I think it should be

>> +			cd "$(dirname "$TRASH_DIRECTORY")" &&
>> +			rm -r "$TRASH_DIRECTORY" ||
>> +			error "Tests passed but test cleanup failed; aborting"






^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  9:39                                                 ` Torsten Bögershausen
@ 2017-04-24  9:46                                                   ` Jeff King
  2017-04-25  2:31                                                   ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Jeff King @ 2017-04-24  9:46 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Junio C Hamano, SZEDER Gábor, Stefan Beller, Johannes Sixt,
	David Turner, git

On Mon, Apr 24, 2017 at 11:39:26AM +0200, Torsten Bögershausen wrote:

> []
> > > 
> > > -		cd "$(dirname "$remove_trash")" &&
> > > -		rm -rf "$(basename "$remove_trash")" ||
> > > -		error "Tests passed but test cleanup failed; aborting"
> > > +			cd "$(dirname "$TRASH_DIRECTORY")" &&
> > > +			rm -fr "$TRASH_DIRECTORY" ||
> > > +			error "Tests passed but test cleanup failed; aborting"
> > > +		fi
> > 
> > Yeah, that looks good to me.
> 
> Does it ?
> Checking the error code of "rm -f" doesn't work as expected from the script:
> rm -rf DoesNotExist ; echo $?
> 0

ENOENT is treated specially by "rm -f". We cover that case explicitly
with the "test -d" above (in the bit you didn't quote), and then rely on
"rm" to report other errors.  Like:

  $ rm -rf /etc/passwd ; echo $?
  rm: cannot remove '/etc/passwd': Permission denied
  1

> I think it should be
> 
> > > +			cd "$(dirname "$TRASH_DIRECTORY")" &&
> > > +			rm -r "$TRASH_DIRECTORY" ||
> > > +			error "Tests passed but test cleanup failed; aborting"

I think we need the "-f" to overcome rm's tendency to prompt in some
cases:

  $ echo content >foo
  $ chmod 0 foo
  $ rm foo
  rm: remove write-protected regular file 'foo'? ^C
  $ rm -f foo
  [no output; it just works]

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  9:39                                                 ` Torsten Bögershausen
  2017-04-24  9:46                                                   ` Jeff King
@ 2017-04-25  2:31                                                   ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2017-04-25  2:31 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Jeff King, SZEDER Gábor, Stefan Beller, Johannes Sixt,
	David Turner, git

Torsten Bögershausen <tboegi@web.de> writes:

> []
>>>
>>> -		cd "$(dirname "$remove_trash")" &&
>>> -		rm -rf "$(basename "$remove_trash")" ||
>>> -		error "Tests passed but test cleanup failed; aborting"
>>> +			cd "$(dirname "$TRASH_DIRECTORY")" &&
>>> +			rm -fr "$TRASH_DIRECTORY" ||
>>> +			error "Tests passed but test cleanup failed; aborting"
>>> +		fi
>>
>> Yeah, that looks good to me.
>
> Does it ?
> Checking the error code of "rm -f" doesn't work as expected from the script:
> rm -rf DoesNotExist ; echo $?
> 0
>
> I think it should be
>
>>> +			cd "$(dirname "$TRASH_DIRECTORY")" &&
>>> +			rm -r "$TRASH_DIRECTORY" ||
>>> +			error "Tests passed but test cleanup failed; aborting"

What Peff told you in his response is all correct, but besides, you
can see the patch and realize that the original has been using "rm
-rf" for ages.

This change is about not using $remove_trash variable, as it felt
brittle and error prone than detecting $debug case and removing
$TRASH_DIRECTORY.  So in that sense, I shouldn't have even rolled
the removal of basename into it.

Having said that, because people care about the number of subprocess
invocations, I am tempted to suggest doing

	cd "$TRASH_DIRECTORY/.." &&
	rm -fr "$TRASH_DIRECTORY" ||
	error ...

which should reduce another process ;-)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-24  7:52                                               ` Jeff King
  2017-04-24  9:39                                                 ` Torsten Bögershausen
@ 2017-04-25  6:05                                                 ` Junio C Hamano
  2017-04-25  6:07                                                   ` Jeff King
  1 sibling, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2017-04-25  6:05 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

Jeff King <peff@peff.net> writes:

> Good point. There's only one caller, but it does care about being in
> that directory.
>
>> Second try that hopefully is much less damaging

I've been carrying it as a SQUASH??? patch, but I think it is better
to split it as a separate pach, as removal of $remove_trash is an
optional thing.  The main thing, i.e. SZEDER's "abort when trash is
already gone or when we cannot remove" _can_ be (and is) correctly
done with his pach alone.

So here is what I queued on top.

-- >8 --
Subject: [PATCH] test-lib: retire $remove_trash variable

The convention "$remove_trash is set to the trash directory that is
used during the test, so that it will be removed at the end, but
under --debug option we set the varilable to empty string to
preserve the directory" made sense back when it was introduced, as
there was no $TRASH_DIRECTORY variable.  These days, since no tests
looks at the variable, it is obscure and even risks that by mistake
the variable gets used for something else (e.g. remove_trash=yes)
and cause us misbehave.

Rewrite the clean-up sequence in test_done helper to explicitly
check the $debug condition and remove the trash directory using
the $TRASH_DIRECTORY variable.

Note that "go to the directory one level above the trash and then
remove it" is kept and this is deliverate; test_at_end_hook_ will
keep running from the expected location, and also some platforms may
not like a directory that is serving as the $cwd of a still-active
process removed.

We _might_ want to rewrite

    cd "$(dirname "$TRASH_DIRECTORY")"

further to

    cd "$TRASH_DIRECTORY/.."

to lose one extra process, but let's leave it to a later patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/test-lib.sh | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index cb0766b9ee..976566047d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -760,12 +760,15 @@ test_done () {
 			say "1..$test_count$skip_all"
 		fi
 
-		test -d "$remove_trash" ||
-		error "Tests passed but trash directory already removed before test cleanup; aborting"
+		if test -z "$debug"
+		then
+			test -d "$TRASH_DIRECTORY" ||
+			error "Tests passed but trash directory already removed before test cleanup; aborting"
 
-		cd "$(dirname "$remove_trash")" &&
-		rm -rf "$(basename "$remove_trash")" ||
-		error "Tests passed but test cleanup failed; aborting"
+			cd "$(dirname "$TRASH_DIRECTORY")" &&
+			rm -fr "$TRASH_DIRECTORY" ||
+			error "Tests passed but test cleanup failed; aborting"
+		fi
 
 		test_at_end_hook_
 
@@ -921,7 +924,6 @@ case "$TRASH_DIRECTORY" in
 /*) ;; # absolute path is good
  *) TRASH_DIRECTORY="$TEST_OUTPUT_DIRECTORY/$TRASH_DIRECTORY" ;;
 esac
-test ! -z "$debug" || remove_trash=$TRASH_DIRECTORY
 rm -fr "$TRASH_DIRECTORY" || {
 	GIT_EXIT_OK=t
 	echo >&5 "FATAL: Cannot prepare test area"
-- 
2.13.0-rc0-308-g931de5db53


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-25  6:05                                                 ` Junio C Hamano
@ 2017-04-25  6:07                                                   ` Jeff King
  2017-04-25  6:31                                                     ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2017-04-25  6:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

On Mon, Apr 24, 2017 at 11:05:15PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Good point. There's only one caller, but it does care about being in
> > that directory.
> >
> >> Second try that hopefully is much less damaging
> 
> I've been carrying it as a SQUASH??? patch, but I think it is better
> to split it as a separate pach, as removal of $remove_trash is an
> optional thing.  The main thing, i.e. SZEDER's "abort when trash is
> already gone or when we cannot remove" _can_ be (and is) correctly
> done with his pach alone.

Sort of. It still has the bug that it dies with error() when "--debug"
is used. But that is relatively minor, and as long as yours gets applied
on top I do not mind the momentary breakage.

> So here is what I queued on top.
> 
> -- >8 --
> Subject: [PATCH] test-lib: retire $remove_trash variable

Yeah, this looks fine (though it could mention that it is fixing the
"test -d" bug).

-Peff

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] test-lib: abort when can't remove trash directory
  2017-04-25  6:07                                                   ` Jeff King
@ 2017-04-25  6:31                                                     ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2017-04-25  6:31 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Stefan Beller, Johannes Sixt, David Turner,
	git

Jeff King <peff@peff.net> writes:

> Sort of. It still has the bug that it dies with error() when "--debug"
> is used.

Ah, I forgot about that one.  Let me squash it in without the
remove_trash change then, perhaps, but not today.


^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2017-04-25  6:31 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-10 12:59 [PATCH] t6500: don't run detached auto gc at the end of the test script SZEDER Gábor
2017-04-10 13:58 ` Jeff King
2017-04-10 16:31   ` SZEDER Gábor
2017-04-10 16:35     ` Jeff King
2017-04-10 16:56       ` SZEDER Gábor
2017-04-10 17:01         ` Jeff King
2017-04-11 21:32           ` Johannes Sixt
2017-04-12  0:27             ` SZEDER Gábor
2017-04-12  0:50               ` Jeff King
2017-04-12 22:03                 ` SZEDER Gábor
2017-04-12 22:07                   ` [PATCHv2] " SZEDER Gábor
2017-04-13 10:31                     ` [PATCHv2.1] t6500: wait for " SZEDER Gábor
2017-04-13 16:06                       ` David Turner
2017-04-13 16:44                       ` Jeff King
2017-04-13 18:08                         ` SZEDER Gábor
2017-04-13 18:12                           ` Jeff King
2017-04-13 16:37                   ` [PATCH] t6500: don't run " Jeff King
2017-04-13 17:55                     ` Stefan Beller
2017-04-13 17:57                       ` Jeff King
2017-04-13 19:03                         ` SZEDER Gábor
2017-04-13 19:12                           ` Jeff King
2017-04-13 19:35                             ` SZEDER Gábor
2017-04-14 20:08                               ` Jeff King
2017-04-20 16:42                                 ` SZEDER Gábor
2017-04-20 16:45                                   ` Jeff King
2017-04-20 16:52                                   ` [PATCH] test-lib: abort when can't remove trash directory SZEDER Gábor
2017-04-20 19:06                                     ` Jeff King
2017-04-21  0:48                                     ` Junio C Hamano
2017-04-21 20:06                                       ` SZEDER Gábor
2017-04-21 20:15                                     ` Jeff King
2017-04-24  0:14                                       ` Junio C Hamano
2017-04-24  1:43                                         ` Jeff King
2017-04-24  2:58                                           ` Junio C Hamano
2017-04-24  4:02                                             ` Junio C Hamano
2017-04-24  7:52                                               ` Jeff King
2017-04-24  9:39                                                 ` Torsten Bögershausen
2017-04-24  9:46                                                   ` Jeff King
2017-04-25  2:31                                                   ` Junio C Hamano
2017-04-25  6:05                                                 ` Junio C Hamano
2017-04-25  6:07                                                   ` Jeff King
2017-04-25  6:31                                                     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).