git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* t7527 intermittent failure on macOS APFS and possible fix
@ 2022-08-12 13:52 Đoàn Trần Công Danh
  2022-08-12 18:08 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Đoàn Trần Công Danh @ 2022-08-12 13:52 UTC (permalink / raw)
  To: git

Hello,

Running t7527 on macOS with encrypted APFS filesystem.
I observes intermittent failure, however, when I manually check the
test cases, they're all passed.

I suspected fileystem caching issue.
I added those sync-s into test steps and the test pass.
I'm not sure if this is the intending "fix" for the tests
since we're testing the fsmonitor with t7527.

Please advise!

P/S: When debugging, I also found out that:
"test-tool fsmonitor-client query" doesn't write the final newline
character, thus making the output harder to read. The diff also have
the final newline added.

----- 8< -------
diff --git a/t/helper/test-fsmonitor-client.c b/t/helper/test-fsmonitor-client.c
index 54a4856c48..98d6cf1440 100644
--- a/t/helper/test-fsmonitor-client.c
+++ b/t/helper/test-fsmonitor-client.c
@@ -55,6 +55,7 @@ static int do_send_query(const char *token)
 
 	write_in_full(1, answer.buf, answer.len);
 	strbuf_release(&answer);
+	write_in_full(1, "\n", 1);
 
 	return 0;
 }
@@ -77,6 +78,7 @@ static int do_send_flush(void)
 
 	write_in_full(1, answer.buf, answer.len);
 	strbuf_release(&answer);
+	write_in_full(1, "\n", 1);
 
 	return 0;
 }
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 56c0dfffea..3f2cef251d 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -430,6 +430,7 @@ test_expect_success 'edit some files' '
 	edit_files &&
 
 	test-tool fsmonitor-client query --token 0 &&
+	sync &&
 
 	grep "^event: dir1/modified$"  .git/trace &&
 	grep "^event: dir2/modified$"  .git/trace &&
@@ -445,6 +446,7 @@ test_expect_success 'create some files' '
 	create_files &&
 
 	test-tool fsmonitor-client query --token 0 &&
+	sync &&
 
 	grep "^event: dir1/new$" .git/trace &&
 	grep "^event: dir2/new$" .git/trace &&
@@ -459,6 +461,7 @@ test_expect_success 'delete some files' '
 	delete_files &&
 
 	test-tool fsmonitor-client query --token 0 &&
+	sync &&
 
 	grep "^event: dir1/delete$" .git/trace &&
 	grep "^event: dir2/delete$" .git/trace &&
@@ -473,6 +476,7 @@ test_expect_success 'rename some files' '
 	rename_files &&
 
 	test-tool fsmonitor-client query --token 0 &&
+	sync &&
 
 	grep "^event: dir1/rename$"  .git/trace &&
 	grep "^event: dir2/rename$"  .git/trace &&
@@ -490,6 +494,7 @@ test_expect_success 'rename directory' '
 	mv dirtorename dirrenamed &&
 
 	test-tool fsmonitor-client query --token 0 &&
+	sync &&
 
 	grep "^event: dirtorename/*$" .git/trace &&
 	grep "^event: dirrenamed/*$"  .git/trace
@@ -503,6 +508,7 @@ test_expect_success 'file changes to directory' '
 	file_to_directory &&
 
 	test-tool fsmonitor-client query --token 0 &&
+	sync &&
 
 	grep "^event: delete$"     .git/trace &&
 	grep "^event: delete/new$" .git/trace
@@ -516,6 +522,7 @@ test_expect_success 'directory changes to a file' '
 	directory_to_file &&
 
 	test-tool fsmonitor-client query --token 0 &&
+	sync &&
 
 	grep "^event: dir1$" .git/trace
 '
---- >8 --------------

-- 
Danh

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: t7527 intermittent failure on macOS APFS and possible fix
  2022-08-12 13:52 t7527 intermittent failure on macOS APFS and possible fix Đoàn Trần Công Danh
@ 2022-08-12 18:08 ` Junio C Hamano
  2022-08-13  0:16   ` Đoàn Trần Công Danh
  2022-08-14  6:52 ` Jeff King
  2022-08-17 20:23 ` Jeff Hostetler
  2 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2022-08-12 18:08 UTC (permalink / raw)
  To: Đoàn Trần Công Danh, Jeff Hostetler; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> Running t7527 on macOS with encrypted APFS filesystem.
> I observes intermittent failure, however, when I manually check the
> test cases, they're all passed.
>
> I suspected fileystem caching issue.
> I added those sync-s into test steps and the test pass.
> I'm not sure if this is the intending "fix" for the tests
> since we're testing the fsmonitor with t7527.
>
> Please advise!

fsmonitor experts, please do.

My gut feeling is that, unless the production code internally calls
the equivalent of "sync" done here and the failure in tests are
coming from the fact that the "sync" is missing in "test-tool
fsmonitor-client" (i.e. test-tool does not emulate the real world
closely enough and fails in cases where the machinery does not fail
in the real world), these "sync" calls only sweep the problem under
the rug.

> P/S: When debugging, I also found out that:
> "test-tool fsmonitor-client query" doesn't write the final newline
> character, thus making the output harder to read. The diff also have
> the final newline added.
>
> ----- 8< -------
> diff --git a/t/helper/test-fsmonitor-client.c b/t/helper/test-fsmonitor-client.c
> index 54a4856c48..98d6cf1440 100644
> --- a/t/helper/test-fsmonitor-client.c
> +++ b/t/helper/test-fsmonitor-client.c
> @@ -55,6 +55,7 @@ static int do_send_query(const char *token)
>  
>  	write_in_full(1, answer.buf, answer.len);
>  	strbuf_release(&answer);
> +	write_in_full(1, "\n", 1);
>  
>  	return 0;
>  }
> @@ -77,6 +78,7 @@ static int do_send_flush(void)
>  
>  	write_in_full(1, answer.buf, answer.len);
>  	strbuf_release(&answer);
> +	write_in_full(1, "\n", 1);
>  
>  	return 0;
>  }

Aren't these protocol drivers?  If the protocol is defined without
the trailing LF, would it make sense to update only the sending end
to do this?  Or does the protocol makes it clear that a trailing LF,
or lack of it, should be tolerated by all the implementations?

If we are absolutely sure that no implementation of the other side
will get upset by seeing an extra LF, It would be fine, but as the
original code wants to call write_in_full(), it would be more
preferrable to do it this way instead, I suspect.

+	strbuf_complete(&answer, '\n');
	write_in_full(1, answer.buf, answer.len);
	strbuf_release(&answer);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t7527 intermittent failure on macOS APFS and possible fix
  2022-08-12 18:08 ` Junio C Hamano
@ 2022-08-13  0:16   ` Đoàn Trần Công Danh
  2022-08-13  0:45     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Đoàn Trần Công Danh @ 2022-08-13  0:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff Hostetler, git

On 2022-08-12 11:08:30-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> > Running t7527 on macOS with encrypted APFS filesystem.
> > I observes intermittent failure, however, when I manually check the
> > test cases, they're all passed.
> >
> > I suspected fileystem caching issue.
> > I added those sync-s into test steps and the test pass.
> > I'm not sure if this is the intending "fix" for the tests
> > since we're testing the fsmonitor with t7527.
> >
> > Please advise!
> 
> fsmonitor experts, please do.
> 
> My gut feeling is that, unless the production code internally calls
> the equivalent of "sync" done here and the failure in tests are
> coming from the fact that the "sync" is missing in "test-tool
> fsmonitor-client" (i.e. test-tool does not emulate the real world
> closely enough and fails in cases where the machinery does not fail
> in the real world), these "sync" calls only sweep the problem under
> the rug.

It's my gut feeling, too.
Anyway, I'm running the test again, too confirm your suggestion on
t/helper/test-fsmonitor-client.c

t7527.63 (Unicode nfc/nfd) also failed intermittently, too.
Here is the content of unicode.trace:
---- 8< ----
$ cat unicode.trace
statfs('/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode') [type 0x0000001a][flags 0x04909080] 'apfs'
statfs('/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode') [type 0x0000001a][flags 0x04909080] 'apfs'
Watching: worktree '/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode'
statfs('/path/to/src/git/t/trash directory.t7527-builtin-fsmonitor/test_unicode') [type 0x0000001a][flags 0x04909080] 'apfs'
requested token: quit
---------- >8 ------------------

> 
> > P/S: When debugging, I also found out that:
> > "test-tool fsmonitor-client query" doesn't write the final newline
> > character, thus making the output harder to read. The diff also have
> > the final newline added.
> >
> > ----- 8< -------
> > diff --git a/t/helper/test-fsmonitor-client.c b/t/helper/test-fsmonitor-client.c
> > index 54a4856c48..98d6cf1440 100644
> > --- a/t/helper/test-fsmonitor-client.c
> > +++ b/t/helper/test-fsmonitor-client.c
> > @@ -55,6 +55,7 @@ static int do_send_query(const char *token)
> >  
> >  	write_in_full(1, answer.buf, answer.len);
> >  	strbuf_release(&answer);
> > +	write_in_full(1, "\n", 1);
> >  
> >  	return 0;
> >  }
> > @@ -77,6 +78,7 @@ static int do_send_flush(void)
> >  
> >  	write_in_full(1, answer.buf, answer.len);
> >  	strbuf_release(&answer);
> > +	write_in_full(1, "\n", 1);
> >  
> >  	return 0;
> >  }
> 
> Aren't these protocol drivers?

The "answer" strbuf is the response from fsmonitor daemon, I think.
The write_in_full to fd 1 is test-tool writes down the answer to
stdout.

> If the protocol is defined without
> the trailing LF, would it make sense to update only the sending end
> to do this?  Or does the protocol makes it clear that a trailing LF,
> or lack of it, should be tolerated by all the implementations?
> 
> If we are absolutely sure that no implementation of the other side
> will get upset by seeing an extra LF, It would be fine, but as the
> original code wants to call write_in_full(), it would be more
> preferrable to do it this way instead, I suspect.
> 
> +	strbuf_complete(&answer, '\n');
> 	write_in_full(1, answer.buf, answer.len);
> 	strbuf_release(&answer);

This could work, since we don't send "answer" back to
fsmonitor-daemon.

-- 
Danh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t7527 intermittent failure on macOS APFS and possible fix
  2022-08-13  0:16   ` Đoàn Trần Công Danh
@ 2022-08-13  0:45     ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2022-08-13  0:45 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: Jeff Hostetler, git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

>> Aren't these protocol drivers?
>
> The "answer" strbuf is the response from fsmonitor daemon, I think.
> The write_in_full to fd 1 is test-tool writes down the answer to
> stdout.

Ahh, then please scratch everything I said about this part.  Your
original implementation makes it more clear that we are "dumping"
what we got from the daemon, and then adding LF for our own
readability.  I somehow mistook the code to be preparing and sending
protocol message to the other end (iow, fd #1 was connected to
"them").

Sorry about the gotcha, and thanks for straightening me out.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t7527 intermittent failure on macOS APFS and possible fix
  2022-08-12 13:52 t7527 intermittent failure on macOS APFS and possible fix Đoàn Trần Công Danh
  2022-08-12 18:08 ` Junio C Hamano
@ 2022-08-14  6:52 ` Jeff King
  2022-08-17 20:23 ` Jeff Hostetler
  2 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2022-08-14  6:52 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

On Fri, Aug 12, 2022 at 08:52:24PM +0700, Đoàn Trần Công Danh wrote:

> Running t7527 on macOS with encrypted APFS filesystem.
> I observes intermittent failure, however, when I manually check the
> test cases, they're all passed.
> 
> I suspected fileystem caching issue.
> I added those sync-s into test steps and the test pass.
> I'm not sure if this is the intending "fix" for the tests
> since we're testing the fsmonitor with t7527.

I don't know anything about your problem area, but some tools in our
test suite that might help you get better results with an intermittent
problem (if you didn't already know about them):

  - add:

      GIT_TEST_OPTS = -x --verbose-log

    into config.mak. Then when a "make test" run fails, you can see the
    log in t/test-results/t7527-*.out.

  - try running:

      ./t7527-* --stress

    which will run a bunch of instances simultaneously, with the idea
    that the load will introduce timing issues that will trigger the
    bug. The failed instance will produce a detailed log on stdout.

That might help you get more output from the failing cases.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t7527 intermittent failure on macOS APFS and possible fix
  2022-08-12 13:52 t7527 intermittent failure on macOS APFS and possible fix Đoàn Trần Công Danh
  2022-08-12 18:08 ` Junio C Hamano
  2022-08-14  6:52 ` Jeff King
@ 2022-08-17 20:23 ` Jeff Hostetler
  2 siblings, 0 replies; 6+ messages in thread
From: Jeff Hostetler @ 2022-08-17 20:23 UTC (permalink / raw)
  To: Đoàn Trần Công Danh, git; +Cc: git



On 8/12/22 9:52 AM, Đoàn Trần Công Danh wrote:
> Hello,
> 
> Running t7527 on macOS with encrypted APFS filesystem.
> I observes intermittent failure, however, when I manually check the
> test cases, they're all passed.
> 
> I suspected fileystem caching issue.
> I added those sync-s into test steps and the test pass.
> I'm not sure if this is the intending "fix" for the tests
> since we're testing the fsmonitor with t7527.
> 
> Please advise!
> 
> P/S: When debugging, I also found out that:
> "test-tool fsmonitor-client query" doesn't write the final newline
> character, thus making the output harder to read. The diff also have
> the final newline added.

Thanks for the report.  Sorry for the delay, I'm still digging thru
my email backlog after a long vacation.  I'll take a look at this
and get back to you shortly.

Thanks,
Jeff


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-08-17 20:24 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-12 13:52 t7527 intermittent failure on macOS APFS and possible fix Đoàn Trần Công Danh
2022-08-12 18:08 ` Junio C Hamano
2022-08-13  0:16   ` Đoàn Trần Công Danh
2022-08-13  0:45     ` Junio C Hamano
2022-08-14  6:52 ` Jeff King
2022-08-17 20:23 ` Jeff Hostetler

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).