git@vger.kernel.org list mirror (unofficial, one of many)
* [CI]: Is t7527 known to be flakey?
@ 2023-01-20  2:52 Junio C Hamano
  2023-01-20 15:23 ` Jeff Hostetler
  2023-01-21 10:23 ` SZEDER Gábor
  0 siblings, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2023-01-20  2:52 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: git

The said test failed its linux-musl job in its first attempt, but
re-running the failed job passed.

    https://github.com/git/git/actions/runs/3963948890/jobs/6792356234
    (seen@e096683 attempt #1 linux-musl)

    https://github.com/git/git/actions/runs/3963948890/jobs/6792850313
    (seen@e096683 attempt #2 linux-musl)


* Re: [CI]: Is t7527 known to be flakey?
  2023-01-20  2:52 [CI]: Is t7527 known to be flakey? Junio C Hamano
@ 2023-01-20 15:23 ` Jeff Hostetler
  2023-01-20 15:40   ` Junio C Hamano
  2023-01-21 10:23 ` SZEDER Gábor
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Hostetler @ 2023-01-20 15:23 UTC (permalink / raw)
  To: Junio C Hamano, Jeff Hostetler, edecosta; +Cc: git



On 1/19/23 9:52 PM, Junio C Hamano wrote:
> The said test failed its linux-musl job in its first attempt, but
> re-running the failed job passed.
> 
>      https://github.com/git/git/actions/runs/3963948890/jobs/6792356234
>      (seen@e096683 attempt #1 linux-musl)
> 
>      https://github.com/git/git/actions/runs/3963948890/jobs/6792850313
>      (seen@e096683 attempt #2 linux-musl)
> 

This is on Linux, so it would be using the Linux inotify backend.
Let me add Eric to the "To:" line for visibility, and to see whether
he has experienced this during his development of it.

I've not looked at the inotify code, so I can't say whether there are
races there.  Tests that move directories feel like good candidates
for race conditions, since the daemon doesn't get a recursive view of
the tree with inotify and must simulate one by managing the
individual directories itself.  But again, I don't want to assume
that.
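
To make the non-recursive point concrete: an inotify watch covers a
single directory, so covering a worktree means one watch per directory,
and a directory move changes the path of everything beneath it.  A rough
sketch (not taken from the daemon; the helper name is made up) that
lists the directories involved, using directory names that appear in
t7527:

```shell
# Hypothetical helper: list every directory under a root, i.e. every
# directory a non-recursive watcher would need its own watch for.
dirs_needing_watch () {
	(cd "$1" && find . -type d | sort)
}

# Demo in a scratch directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/tree/T1/T2/T3/T4"
dirs_needing_watch "$scratch/tree" >"$scratch/before"
# Same kind of move as the test's move_directory_up step.
mv "$scratch/tree/T1/T2/T3" "$scratch/tree/T1"
dirs_needing_watch "$scratch/tree" >"$scratch/after"
# Lines that differ are paths whose bookkeeping would have to be
# updated after the move.
diff "$scratch/before" "$scratch/after" || true
rm -rf "$scratch"
```

Whether the actual backend tracks directories this way is an assumption;
the sketch only shows why a move touches more than one watched path.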

Jeff


* Re: [CI]: Is t7527 known to be flakey?
  2023-01-20 15:23 ` Jeff Hostetler
@ 2023-01-20 15:40   ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2023-01-20 15:40 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: Jeff Hostetler, edecosta, git

Jeff Hostetler <git@jeffhostetler.com> writes:

> This is on Linux, so it would be using the linux inotify backend.
> Let me add Eric to the "To:" line for visibility.

Thanks for the redirection.

* Re: [CI]: Is t7527 known to be flakey?
  2023-01-20  2:52 [CI]: Is t7527 known to be flakey? Junio C Hamano
  2023-01-20 15:23 ` Jeff Hostetler
@ 2023-01-21 10:23 ` SZEDER Gábor
  2023-01-23 16:56   ` Jeff Hostetler
  1 sibling, 1 reply; 8+ messages in thread
From: SZEDER Gábor @ 2023-01-21 10:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff Hostetler, edecosta, git

On Thu, Jan 19, 2023 at 06:52:01PM -0800, Junio C Hamano wrote:
> The said test failed its linux-musl job in its first attempt, but
> re-running the failed job passed.
> 
>     https://github.com/git/git/actions/runs/3963948890/jobs/6792356234
>     (seen@e096683 attempt #1 linux-musl)
> 
>     https://github.com/git/git/actions/runs/3963948890/jobs/6792850313
>     (seen@e096683 attempt #2 linux-musl)

t7527 is quite slow, even with the right selection of test cases, but
this little tweak makes it much faster:

  diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
  index 0e497ba98d..4210ef644c 100755
  --- a/t/t7527-builtin-fsmonitor.sh
  +++ b/t/t7527-builtin-fsmonitor.sh
  @@ -676,7 +676,10 @@ test_expect_success 'cleanup worktrees' '
   # cause incorrect results when the untracked-cache is enabled.
   
   test_lazy_prereq UNTRACKED_CACHE '
  -	git update-index --test-untracked-cache
  +	# This check takes a very long time, but I know it works on
  +	# my system, so let's fake it.
  +	#git update-index --test-untracked-cache
  +	true
   '
   
   test_expect_success 'Matrix: setup for untracked-cache,fsmonitor matrix' '

This, together with the right selection of test cases, makes --stress
practical, and the test tends to fail after a handful of
repetitions:

  ./t7527-builtin-fsmonitor.sh --stress -r 8,23-58
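
For readers without a --stress capable checkout, the idea can be
approximated serially by rerunning the script until it fails.  A minimal
sketch, assuming a hypothetical helper name and an arbitrary run limit;
it is not how the test library implements --stress, which runs several
jobs in parallel:

```shell
# Hypothetical helper: rerun a command until it fails or a limit is hit,
# and report which run failed.
run_until_failure () {
	limit=$1
	shift
	i=1
	while test "$i" -le "$limit"
	do
		if ! "$@"
		then
			echo "failed on run $i"
			return 1
		fi
		i=$(($i + 1))
	done
	echo "no failure in $limit runs"
}

# Usage sketch, from the t/ directory:
#   run_until_failure 50 ./t7527-builtin-fsmonitor.sh -r 8,23-58
```

A serial loop puts less load on the machine than parallel jobs, so a
race may take far longer to show up this way.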

I saw different failures in multiple test cases, e.g.:

Unexpected output in case 55:

  expecting success of 7527.55 'Matrix[uc:true][fsm:true] move_directory_contents_deeper': 
  		matrix_clean_up_repo &&
  		$fn &&
  		if test $uc = false && test $fsm = false
  		then
  			git status --porcelain=v1 >.git/expect.$fn
  		else
  			git status --porcelain=v1 >.git/actual.$fn &&
  			test_cmp .git/expect.$fn .git/actual.$fn
  		fi
  	
  + matrix_clean_up_repo
  + git reset --hard HEAD
  HEAD is now at 1d1edcb initial
  + git clean -fd
  + move_directory_contents_deeper
  + mkdir T1/_new_
  + mv T1/F1 T1/F2 T1/T2 T1/_new_
  + test true = false
  + git status --porcelain=v1
  error: read error: Connection reset by peer
  error: could not read IPC response
  + test_cmp .git/expect.move_directory_contents_deeper .git/actual.move_directory_contents_deeper
  + test 2 -ne 2
  + eval diff -u "$@"
  + diff -u .git/expect.move_directory_contents_deeper .git/actual.move_directory_contents_deeper
  --- .git/expect.move_directory_contents_deeper	2023-01-21 09:47:12.677410349 +0000
  +++ .git/actual.move_directory_contents_deeper	2023-01-21 09:47:14.045448573 +0000
  @@ -7,3 +7,4 @@
    D T1/T2/T3/T4/F1
    D T1/T2/T3/T4/F2
   ?? T1/_new_/
  +?? dir1
  error: last command exited with $?=1
  not ok 55 - Matrix[uc:true][fsm:true] move_directory_contents_deeper

SIGPIPE in 'git status' in cases 42, 43 and 55:

  expecting success of 7527.42 'Matrix[uc:false][fsm:true] move_directory_contents_deeper':
                  matrix_clean_up_repo &&
                  $fn &&
                  if test $uc = false && test $fsm = false
                  then
                          git status --porcelain=v1 >.git/expect.$fn
                  else
                          git status --porcelain=v1 >.git/actual.$fn &&
                          test_cmp .git/expect.$fn .git/actual.$fn
                  fi
  
  + matrix_clean_up_repo
  + git reset --hard HEAD
  HEAD is now at 1d1edcb initial
  + git clean -fd
  + move_directory_contents_deeper
  + mkdir T1/_new_
  + mv T1/F1 T1/F2 T1/T2 T1/_new_
  + test false = false
  + test true = false
  + git status --porcelain=v1
  error: last command exited with $?=141
  not ok 42 - Matrix[uc:false][fsm:true] move_directory_contents_deeper

  expecting success of 7527.43 'Matrix[uc:false][fsm:true] move_directory_up': 
  		matrix_clean_up_repo &&
  		$fn &&
  		if test $uc = false && test $fsm = false
  		then
  			git status --porcelain=v1 >.git/expect.$fn
  		else
  			git status --porcelain=v1 >.git/actual.$fn &&
  			test_cmp .git/expect.$fn .git/actual.$fn
  		fi
  	
  + matrix_clean_up_repo
  + git reset --hard HEAD
  HEAD is now at 1d1edcb initial
  + git clean -fd
  Removing T1/_new_/
  + move_directory_up
  + mv T1/T2/T3 T1
  + test false = false
  + test true = false
  + git status --porcelain=v1
  error: last command exited with $?=141
  not ok 43 - Matrix[uc:false][fsm:true] move_directory_up

  expecting success of 7527.55 'Matrix[uc:true][fsm:true] move_directory_contents_deeper':
                  matrix_clean_up_repo &&
                  $fn &&
                  if test $uc = false && test $fsm = false
                  then
                          git status --porcelain=v1 >.git/expect.$fn
                  else
                          git status --porcelain=v1 >.git/actual.$fn &&
                          test_cmp .git/expect.$fn .git/actual.$fn
                  fi
  
  + matrix_clean_up_repo
  + git reset --hard HEAD
  HEAD is now at 1d1edcb initial
  + git clean -fd
  + move_directory_contents_deeper
  + mkdir T1/_new_
  + mv T1/F1 T1/F2 T1/T2 T1/_new_
  + test true = false
  + git status --porcelain=v1
  error: last command exited with $?=141

I find it interesting that the output of 'git status' is redirected to
a file in all these cases, and yet it gets a SIGPIPE.


I also saw the test hang a couple of times, e.g.:

  $ ./t7527-builtin-fsmonitor.sh --stress -r 8,23-58
  OK    6.1
  OK    7.1
  OK    1.1
  OK    2.1
  OK    3.1
  OK    5.1
  OK    4.1
  OK    0.1
  OK    6.2
  OK    1.2
  OK    2.2
  OK    7.2
  OK    5.2
  OK    0.2
  OK    4.2
  OK    6.3
  OK    7.3
  OK    2.3
  OK    0.3
  OK    4.3
  OK    6.4
  OK    7.4
  OK    2.4
  OK    0.4
  OK    4.4
  OK    6.5
  OK    7.5
  OK    2.5
  OK    0.5
  OK    4.5
  OK    6.6
  OK    7.6
  OK    2.6
  OK    0.6
  OK    4.6
  OK    6.7
  OK    7.7
  OK    2.7
  OK    0.7
  OK    4.7
  OK    6.8
  OK    7.8
  OK    2.8
  OK    0.8
  OK    4.8
  OK    6.9
  OK    7.9
  OK    2.9
  OK    0.9
  OK    4.9
  OK    6.10
  OK    7.10
  OK    2.10
  OK    0.10
  OK    4.10
  OK    6.11
  OK    7.11
  OK    2.11
  OK    0.11
  OK    4.11
  OK    6.12
  OK    7.12
  OK    2.12
  OK    0.12
  OK    4.12
  OK    6.13
  OK    7.13
  OK    2.13
  OK    0.13
  OK    6.14
  OK    4.13
  OK    7.14
  OK    2.14
  OK    0.14
  OK    6.15
  OK    4.14
  OK    2.15
  OK    7.15
  OK    0.15
  FAIL  7.16
  OK    6.16
  OK    2.16
  OK    4.15
  OK    0.16

At this point the test script should print the log of the failed job,
but it hangs instead, as there are a number of stuck fsmonitor--daemon
and status processes (notice how the stress test starts with 8 jobs,
but the last repetition only has 4):

  $ ps aux |grep git
  szeder   1857100  0.0  0.1  72272  4452 pts/2    Sl+  21:40   0:00 /home/szeder/src/git/git fsmonitor--daemon run --detach --ipc-threads=8
  szeder   1857779  0.0  0.1   6560  4152 pts/2    S+   21:40   0:00 /home/szeder/src/git/git status --porcelain=v1
  szeder   1860020  0.0  0.1  88664  4312 pts/2    Sl+  21:40   0:00 /home/szeder/src/git/git fsmonitor--daemon run --detach --ipc-threads=8
  szeder   1860668  0.0  0.1   6560  4040 pts/2    S+   21:40   0:00 /home/szeder/src/git/git status --porcelain=v1
  szeder   1860749  0.0  0.1  96860  4528 pts/2    Sl+  21:40   0:00 /home/szeder/src/git/git fsmonitor--daemon run --detach --ipc-threads=8
  szeder   1861281  0.0  0.1   6560  4272 pts/2    S+   21:40   0:00 /home/szeder/src/git/git status --porcelain=v1


* Re: [CI]: Is t7527 known to be flakey?
  2023-01-21 10:23 ` SZEDER Gábor
@ 2023-01-23 16:56   ` Jeff Hostetler
  2023-01-23 18:12     ` SZEDER Gábor
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Hostetler @ 2023-01-23 16:56 UTC (permalink / raw)
  To: SZEDER Gábor, Junio C Hamano; +Cc: Jeff Hostetler, edecosta, git



On 1/21/23 5:23 AM, SZEDER Gábor wrote:
> I find it interesting that the output of 'git status' is redirected to
> a file in all these cases, and yet it gets a SIGPIPE.

The `git status` command talks to the running `git fsmonitor--daemon`
on a Unix domain socket (or a Named Pipe on Windows), so a SIGPIPE is
possible.
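
For reference, the "$?=141" in the failure logs earlier in the thread
fits that explanation: shells commonly report a command killed by a
signal as 128 plus the signal number, and signal 13 is SIGPIPE on Linux.
A minimal sketch of that arithmetic (the helper name is made up for
illustration):

```shell
# Hypothetical helper: map a shell exit status to the signal that killed
# the command, if any.  Statuses above 128 are treated as
# "killed by signal (status - 128)".
signal_from_status () {
	status=$1
	if test "$status" -gt 128
	then
		# "kill -l <number>" prints the signal name without "SIG".
		echo "SIG$(kill -l $(($status - 128)))"
	else
		echo "none"
	fi
}

signal_from_status 141
```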

Was this on Linux or macOS?

Jeff





* Re: [CI]: Is t7527 known to be flakey?
  2023-01-23 16:56   ` Jeff Hostetler
@ 2023-01-23 18:12     ` SZEDER Gábor
  2023-01-25 19:02       ` Jeff Hostetler
  0 siblings, 1 reply; 8+ messages in thread
From: SZEDER Gábor @ 2023-01-23 18:12 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: Junio C Hamano, Jeff Hostetler, edecosta, git

On Mon, Jan 23, 2023 at 11:56:53AM -0500, Jeff Hostetler wrote:
> Was this on Linux or MacOS ?

On an average-ish Linux (an Ubuntu LTS variant).  So the issue is not
specific to musl.


* Re: [CI]: Is t7527 known to be flakey?
  2023-01-23 18:12     ` SZEDER Gábor
@ 2023-01-25 19:02       ` Jeff Hostetler
  2023-01-25 21:12         ` SZEDER Gábor
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Hostetler @ 2023-01-25 19:02 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, Jeff Hostetler, edecosta, git



On 1/23/23 1:12 PM, SZEDER Gábor wrote:
> On Mon, Jan 23, 2023 at 11:56:53AM -0500, Jeff Hostetler wrote:
>> Was this on Linux or MacOS ?
> 
> On an average-ish Linux (an Ubuntu LTS variant).  So the issue is not
> specific to musl.
> 

OK, thanks.  I wasn't worried about "musl", but rather whether
you were running the stress test on Linux or on a Mac.

Since they have different backends (inotify vs FSEvents) and all
the code that touches the filesystem is different, it would be best
to start on the correct OS when trying to repro it.


Can you tell from your stress test whether the fsmonitor--daemon
is crashing?  (It might be subtle since the daemon is auto-started
if necessary, so it might be crashing and silently getting restarted
by the next command.)

I ask because a SIGPIPE in the client would make me think that the
server suddenly closed the connection unexpectedly, like if it had
SIGSEGV'd or something.
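
One way to spot a silent restart would be to compare the daemon's PID
before and after a command.  A rough sketch, assuming pgrep(1) is
available and that only one test repo's daemon is running (under
--stress there are several, so the comparison would need narrowing);
both helper names are hypothetical:

```shell
# Hypothetical helper: print the PIDs of running fsmonitor daemons on
# one line, sorted, so two snapshots can be compared as strings.
snapshot_daemon_pids () {
	pgrep -f 'fsmonitor--daemon run' | sort | tr '\n' ' '
}

# Hypothetical helper: succeed if two snapshots differ, which would hint
# that a daemon exited and another one was started in between.
daemon_pids_changed () {
	test "$1" != "$2"
}

# Usage sketch inside a test repo:
#   before=$(snapshot_daemon_pids)
#   git status --porcelain=v1 >/dev/null
#   after=$(snapshot_daemon_pids)
#   daemon_pids_changed "$before" "$after" && echo "daemon restarted?"
```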

I won't have time to spin up a Linux VM until next week, so I
won't be able to investigate this for a bit.

Jeff

* Re: [CI]: Is t7527 known to be flakey?
  2023-01-25 19:02       ` Jeff Hostetler
@ 2023-01-25 21:12         ` SZEDER Gábor
  0 siblings, 0 replies; 8+ messages in thread
From: SZEDER Gábor @ 2023-01-25 21:12 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: Junio C Hamano, Jeff Hostetler, edecosta, git

On Wed, Jan 25, 2023 at 02:02:40PM -0500, Jeff Hostetler wrote:
> Can you tell from your stress test whether the fsmonitor--daemon
> is crashing?  (It might be subtle since the daemon is auto-started
> if necessary, so it might be crashing and silently getting restarted
> by the next command.)
> 
> I ask because a SIGPIPE in the client would make me think that the
> server suddenly closed the connection unexpectedly, like if it had
> SIGSEGV'd or something.

Last time around I only looked at the failing test case, and didn't
notice anything that might have indicated the cause of the SIGPIPE.
This time I chanced to look a bit further up in the test log, and:

  expecting success of 7527.55 'Matrix[uc:true][fsm:true] move_directory_contents_deeper':
                  matrix_clean_up_repo &&
                  $fn &&
                  if test $uc = false && test $fsm = false
                  then
                          git status --porcelain=v1 >.git/expect.$fn
                  else
                          git status --porcelain=v1 >.git/actual.$fn &&
                          test_cmp .git/expect.$fn .git/actual.$fn
                  fi
  
  + matrix_clean_up_repo
  + git reset --hard HEAD
  HEAD is now at 1d1edcb initial
  + git clean -fd
  + move_directory_contents_deeper
  + mkdir T1/_new_
  + mv T1/F1 T1/F2 T1/T2 T1/_new_
  + test true = false
  + git status --porcelain=v1
  error: read error: Connection reset by peer
  error: could not read IPC response
  + test_cmp .git/expect.move_directory_contents_deeper .git/actual.move_directory_contents_deeper
  + test 2 -ne 2
  + eval diff -u "$@"
  + diff -u .git/expect.move_directory_contents_deeper .git/actual.move_directory_contents_deeper
  ok 55 - Matrix[uc:true][fsm:true] move_directory_contents_deeper
  
  expecting success of 7527.56 'Matrix[uc:true][fsm:true] move_directory_up':
                  matrix_clean_up_repo &&
                  $fn &&
                  if test $uc = false && test $fsm = false
                  then
                          git status --porcelain=v1 >.git/expect.$fn
                  else
                          git status --porcelain=v1 >.git/actual.$fn &&
                          test_cmp .git/expect.$fn .git/actual.$fn
                  fi
  
  + matrix_clean_up_repo
  + git reset --hard HEAD
  HEAD is now at 1d1edcb initial
  + git clean -fd
  Removing T1/_new_/
  + move_directory_up
  + mv T1/T2/T3 T1
  + test true = false
  + git status --porcelain=v1
  error: last command exited with $?=141
  not ok 56 - Matrix[uc:true][fsm:true] move_directory_up

Notice the "error: read error: Connection reset by peer" in the
previous, still successful test case!  I ran it a couple of times, and
saw the same error message in the still successful '42 -
Matrix[uc:false][fsm:true] move_directory_contents_deeper', followed by
a SIGPIPE-caused failure in the next test case.  And now that I knew
what to look for, I noticed this error message in the very first test
failure I reported the other day.  That one didn't fail because of
SIGPIPE, and there the error message was printed in the failed test
case itself.
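
Since the message can hide in a test case that still passes, scanning
the whole verbose log for it may help.  A small sketch, assuming the log
format shown above (a line containing "expecting success of" introduces
each test case); the helper name is hypothetical:

```shell
# Hypothetical helper: read a verbose test log on stdin and print the
# title line of every test case in which the IPC read error shows up.
cases_with_ipc_error () {
	awk '
		/expecting success of/ { title = $0 }
		/Connection reset by peer/ { print title }
	'
}

# Usage sketch:
#   cases_with_ipc_error <path-to-verbose-test-log
```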

There were also a few cases that failed because of SIGPIPE without
any error message at all.

I can't say what caused these errors, but I doubt that anything
segfaulted, because segfaults are logged in syslog, and I haven't
found any such syslog entries coinciding with stress testing.
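
For anyone repeating that check, a sketch of one way to filter a kernel
log for such entries.  The log source and the exact message format vary
by system, so the pattern below (a "segfault" line naming a "git[pid]"
process) and the helper name are assumptions:

```shell
# Hypothetical filter: keep only log lines that look like a kernel
# segfault report for a process named "git".
git_segfault_lines () {
	grep 'segfault' | grep 'git\['
}

# Usage sketch (pick whichever log source the system provides):
#   dmesg | git_segfault_lines
#   journalctl -k | git_segfault_lines
```

An empty result only rules out crashes that the kernel logs; a daemon
that exits on its own would not show up here.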


