git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] ci: fix the FreeBSD build
@ 2022-07-29 12:28 Johannes Schindelin via GitGitGadget
  2022-07-29 12:28 ` [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported Johannes Schindelin via GitGitGadget
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-07-29 12:28 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

Since 3a251bac0d1a (trace2: only include "fsync" events if we git_fsync(),
2022-07-18), the FreeBSD builds are failing in t5351.6. See
https://cirrus-ci.com/task/4577761405698048 for an example. The run at
https://cirrus-ci.com/task/6004115347079168 shows that this patch fixes the
bug.

While verifying the fix on Windows, I noticed a recent, rather terrible
performance regression: t5351 all of a sudden takes almost half an hour
[https://github.com/git/git/runs/7398490747?check_suite_focus=true#step:5:171]
to run on Windows. I found a fix, and it now passes in less than half a
minute
[https://github.com/gitgitgadget/git/runs/7578071365?check_suite_focus=true#step:5:125]
again.

Johannes Schindelin (2):
  t5351: avoid relying on `core.fsyncMethod = batch` to be supported
  t5351: avoid using `test_cmp` for binary data

 bulk-checkin.c                  |  2 ++
 t/t5351-unpack-large-objects.sh | 12 +++++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)


base-commit: 23b219f8e3f2adfb0441e135f0a880e6124f766c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1308%2Fdscho%2Ffix-t5351-on-freebsd-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1308/dscho/fix-t5351-on-freebsd-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1308
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported
  2022-07-29 12:28 [PATCH 0/2] ci: fix the FreeBSD build Johannes Schindelin via GitGitGadget
@ 2022-07-29 12:28 ` Johannes Schindelin via GitGitGadget
  2022-07-29 16:07   ` Junio C Hamano
  2022-07-29 12:28 ` [PATCH 2/2] t5351: avoid using `test_cmp` for binary data Johannes Schindelin via GitGitGadget
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-07-29 12:28 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

On FreeBSD, this mode is not supported. But since 3a251bac0d1a (trace2:
only include "fsync" events if we git_fsync(), 2022-07-18) t5351 will
fail if this mode is unsupported.

Let's address this in the minimal fashion, by detecting that that mode
is unsupported and expecting a different count of hardware flushes in
that case.

This fixes the CI/PR builds on FreeBSD again.

Note: A better way would be to test only what is relevant in t5351.6
"unpack big object in stream (core.fsyncmethod=batch)" again instead of
blindly comparing the output against some exact text. But that would
pretty much revert the idea of above-mentioned commit, and that commit
was _just_ accepted into Git's main branch so one must assume that it
was intentional.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 bulk-checkin.c                  |  2 ++
 t/t5351-unpack-large-objects.sh | 10 ++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 98ec8938424..855b68ec23b 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -340,6 +340,8 @@ void fsync_loose_object_bulk_checkin(int fd, const char *filename)
 	 */
 	if (!bulk_fsync_objdir ||
 	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
+		if (errno == ENOSYS)
+			warning(_("core.fsyncMethod = batch is unsupported on this platform"));
 		fsync_or_die(fd, filename);
 	}
 }
diff --git a/t/t5351-unpack-large-objects.sh b/t/t5351-unpack-large-objects.sh
index f785cb06173..dd7ebc40072 100755
--- a/t/t5351-unpack-large-objects.sh
+++ b/t/t5351-unpack-large-objects.sh
@@ -70,9 +70,15 @@ test_expect_success 'unpack big object in stream (core.fsyncmethod=batch)' '
 	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
 	GIT_TEST_FSYNC=true \
 		git -C dest.git $BATCH_CONFIGURATION unpack-objects <pack-$PACK.pack &&
-	check_fsync_events trace2.txt <<-\EOF &&
+	if grep "core.fsyncMethod = batch is unsupported" trace2.txt
+	then
+		flush_count=7
+	else
+		flush_count=1
+	fi &&
+	check_fsync_events trace2.txt <<-EOF &&
 	"key":"fsync/writeout-only","value":"6"
-	"key":"fsync/hardware-flush","value":"1"
+	"key":"fsync/hardware-flush","value":"$flush_count"
 	EOF
 
 	test_dir_is_empty dest.git/objects/pack &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] t5351: avoid using `test_cmp` for binary data
  2022-07-29 12:28 [PATCH 0/2] ci: fix the FreeBSD build Johannes Schindelin via GitGitGadget
  2022-07-29 12:28 ` [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported Johannes Schindelin via GitGitGadget
@ 2022-07-29 12:28 ` Johannes Schindelin via GitGitGadget
  2022-07-29 12:58 ` [PATCH 0/2] ci: fix the FreeBSD build Derrick Stolee
  2022-07-29 16:02 ` Junio C Hamano
  3 siblings, 0 replies; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-07-29 12:28 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `test_cmp` function is meant to provide nicer output than `cmp` when
expected and actual output of Git commands disagree. The implicit
assumption is that the output is line-based and human readable.

However, aaf81223f48 (unpack-objects: use stream_loose_object() to
unpack large objects, 2022-06-11) introduced a call that compares the
contents of pack files, which are distinctly not line-based nor human
readable.

This causes problems because on Windows, we hand off to the Bash
function `mingw_test_cmp` that compares the lines while ignoring line
ending differences. And this Bash function spends an insane amount of
cycles trying to read in that binary pack file, so that it is almost
indistinguishable from an infinite loop.

For example, t5351 took 1486 seconds in the CI run at
https://github.com/git/git/runs/7398490747?check_suite_focus=true#step:5:171,
to complete. And yes, that is almost half an hour.

Since Git's tests already use `cmp` consistently when comparing pack
files, let's change this instance to use `cmp` instead of `test_cmp`,
too, and fix that performance problem.

Now t5351 takes all of 22 seconds.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5351-unpack-large-objects.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5351-unpack-large-objects.sh b/t/t5351-unpack-large-objects.sh
index dd7ebc40072..e936f91c3ba 100755
--- a/t/t5351-unpack-large-objects.sh
+++ b/t/t5351-unpack-large-objects.sh
@@ -93,7 +93,7 @@ test_expect_success 'do not unpack existing large objects' '
 
 	# The destination came up with the exact same pack...
 	DEST_PACK=$(echo dest.git/objects/pack/pack-*.pack) &&
-	test_cmp pack-$PACK.pack $DEST_PACK &&
+	cmp pack-$PACK.pack $DEST_PACK &&
 
 	# ...and wrote no loose objects
 	test_stdout_line_count = 0 find dest.git/objects -type f ! -name "pack-*"
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] ci: fix the FreeBSD build
  2022-07-29 12:28 [PATCH 0/2] ci: fix the FreeBSD build Johannes Schindelin via GitGitGadget
  2022-07-29 12:28 ` [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported Johannes Schindelin via GitGitGadget
  2022-07-29 12:28 ` [PATCH 2/2] t5351: avoid using `test_cmp` for binary data Johannes Schindelin via GitGitGadget
@ 2022-07-29 12:58 ` Derrick Stolee
  2022-07-29 17:51   ` Carlo Marcelo Arenas Belón
  2022-07-29 16:02 ` Junio C Hamano
  3 siblings, 1 reply; 8+ messages in thread
From: Derrick Stolee @ 2022-07-29 12:58 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget, git; +Cc: Johannes Schindelin

On 7/29/2022 8:28 AM, Johannes Schindelin via GitGitGadget wrote:
> Since 3a251bac0d1a (trace2: only include "fsync" events if we git_fsync(),
> 2022-07-18), the FreeBSD builds are failing in t5351.6. See
> https://cirrus-ci.com/task/4577761405698048 for an example. The run at
> https://cirrus-ci.com/task/6004115347079168 shows that this patch fixes the
> bug.

Thanks for noticing and fixing this bug. The FreeBSD build is slow
and flaky enough that I sometimes ignore its output before submitting
a series. Good that it will be green again.
 
> While verifying the fix on Windows, I noticed a recent, rather terrible
> performance regression: t5351 all of a sudden takes almost half an hour
> [https://github.com/git/git/runs/7398490747?check_suite_focus=true#step:5:171]
> to run on Windows. I found a fix, and it now passes in less than half a
> minute
> [https://github.com/gitgitgadget/git/runs/7578071365?check_suite_focus=true#step:5:125]
> again.

Thanks for speeding this up!

-Stolee

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] ci: fix the FreeBSD build
  2022-07-29 12:28 [PATCH 0/2] ci: fix the FreeBSD build Johannes Schindelin via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-07-29 12:58 ` [PATCH 0/2] ci: fix the FreeBSD build Derrick Stolee
@ 2022-07-29 16:02 ` Junio C Hamano
  3 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2022-07-29 16:02 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> performance regression: t5351 all of a sudden takes almost half an hour
> [https://github.com/git/git/runs/7398490747?check_suite_focus=true#step:5:171]
> to run on Windows. I found a fix, and it now passes in less than half a
> minute

;-)  That's impressive.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported
  2022-07-29 12:28 ` [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported Johannes Schindelin via GitGitGadget
@ 2022-07-29 16:07   ` Junio C Hamano
  2022-07-29 21:24     ` brian m. carlson
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2022-07-29 16:07 UTC (permalink / raw)
  To: Neeraj Singh, Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> On FreeBSD, this mode is not supported. But since 3a251bac0d1a (trace2:
> only include "fsync" events if we git_fsync(), 2022-07-18) t5351 will
> fail if this mode is unsupported.
>
> Let's address this in the minimal fashion, by detecting that that mode
> is unsupported and expecting a different count of hardware flushes in
> that case.
>
> This fixes the CI/PR builds on FreeBSD again.
>
> Note: A better way would be to test only what is relevant in t5351.6
> "unpack big object in stream (core.fsyncmethod=batch)" again instead of
> blindly comparing the output against some exact text. But that would
> pretty much revert the idea of above-mentioned commit, and that commit
> was _just_ accepted into Git's main branch so one must assume that it
> was intentional.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  bulk-checkin.c                  |  2 ++
>  t/t5351-unpack-large-objects.sh | 10 ++++++++--
>  2 files changed, 10 insertions(+), 2 deletions(-)

I am inclined to take this as-is for now.

But it inherits from 3a251bac0d1a the general "we should be able to
count the value" expectation, which may not be the best approach to
run this test; asking Acks from those originally involved in this
area and possibly ideas to test this in a more robust way.

Thanks.


> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index 98ec8938424..855b68ec23b 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -340,6 +340,8 @@ void fsync_loose_object_bulk_checkin(int fd, const char *filename)
>  	 */
>  	if (!bulk_fsync_objdir ||
>  	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) < 0) {
> +		if (errno == ENOSYS)
> +			warning(_("core.fsyncMethod = batch is unsupported on this platform"));
>  		fsync_or_die(fd, filename);
>  	}
>  }
> diff --git a/t/t5351-unpack-large-objects.sh b/t/t5351-unpack-large-objects.sh
> index f785cb06173..dd7ebc40072 100755
> --- a/t/t5351-unpack-large-objects.sh
> +++ b/t/t5351-unpack-large-objects.sh
> @@ -70,9 +70,15 @@ test_expect_success 'unpack big object in stream (core.fsyncmethod=batch)' '
>  	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
>  	GIT_TEST_FSYNC=true \
>  		git -C dest.git $BATCH_CONFIGURATION unpack-objects <pack-$PACK.pack &&
> -	check_fsync_events trace2.txt <<-\EOF &&
> +	if grep "core.fsyncMethod = batch is unsupported" trace2.txt
> +	then
> +		flush_count=7
> +	else
> +		flush_count=1
> +	fi &&
> +	check_fsync_events trace2.txt <<-EOF &&
>  	"key":"fsync/writeout-only","value":"6"
> -	"key":"fsync/hardware-flush","value":"1"
> +	"key":"fsync/hardware-flush","value":"$flush_count"
>  	EOF
>  
>  	test_dir_is_empty dest.git/objects/pack &&

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] ci: fix the FreeBSD build
  2022-07-29 12:58 ` [PATCH 0/2] ci: fix the FreeBSD build Derrick Stolee
@ 2022-07-29 17:51   ` Carlo Marcelo Arenas Belón
  0 siblings, 0 replies; 8+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2022-07-29 17:51 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin

On Fri, Jul 29, 2022 at 08:58:46AM -0400, Derrick Stolee wrote:
> On 7/29/2022 8:28 AM, Johannes Schindelin via GitGitGadget wrote:
> > Since 3a251bac0d1a (trace2: only include "fsync" events if we git_fsync(),
> > 2022-07-18), the FreeBSD builds are failing in t5351.6. See
> > https://cirrus-ci.com/task/4577761405698048 for an example. The run at
> > https://cirrus-ci.com/task/6004115347079168 shows that this patch fixes the
> > bug.
> 
> Thanks for noticing and fixing this bug. The FreeBSD build is slow

It usually takes a little more than 7 minutes for a full run, which is IMHO
less (at least wall time) than the whole CI does; could you elaborate on
why being "slow" would warrant ignoring its failures?

> and flaky enough that I sometimes ignore its output before submitting
> a series. Good that it will be green again.

I'd noticed that because it runs outside GitHub actions it sometimes has
synchronization(ex [1]) issues, but that might be some bug on the integration
with Cirrus which is easily avoided by looking instead directly to their
status page:

  https://cirrus-ci.com/github/git/git

Carlo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported
  2022-07-29 16:07   ` Junio C Hamano
@ 2022-07-29 21:24     ` brian m. carlson
  0 siblings, 0 replies; 8+ messages in thread
From: brian m. carlson @ 2022-07-29 21:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Neeraj Singh, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin via GitGitGadget, git, Johannes Schindelin

[-- Attachment #1: Type: text/plain, Size: 3085 bytes --]

On 2022-07-29 at 16:07:46, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
> 
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > On FreeBSD, this mode is not supported. But since 3a251bac0d1a (trace2:
> > only include "fsync" events if we git_fsync(), 2022-07-18) t5351 will
> > fail if this mode is unsupported.
> >
> > Let's address this in the minimal fashion, by detecting that that mode
> > is unsupported and expecting a different count of hardware flushes in
> > that case.
> >
> > This fixes the CI/PR builds on FreeBSD again.
> >
> > Note: A better way would be to test only what is relevant in t5351.6
> > "unpack big object in stream (core.fsyncmethod=batch)" again instead of
> > blindly comparing the output against some exact text. But that would
> > pretty much revert the idea of above-mentioned commit, and that commit
> > was _just_ accepted into Git's main branch so one must assume that it
> > was intentional.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  bulk-checkin.c                  |  2 ++
> >  t/t5351-unpack-large-objects.sh | 10 ++++++++--
> >  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> I am inclined to take this as-is for now.

I think this patch is a strict improvement over the status quo, despite
what I'm going to mention below.

> But it inherits from 3a251bac0d1a the general "we should be able to
> count the value" expectation, which may not be the best approach to
> run this test; asking Acks from those originally involved in this
> area and possibly ideas to test this in a more robust way.

While it may not matter in this test, I noticed that the metrics we use
need not be accurate in multi-threaded programs.  We use integers of
static intmax_t which isn't necessarily atomic across threads.  Even if
we assumed word-sized increments were atomic[0], this type might be 64
bits on a 32-bit system.

I am aware that we don't use threads in many places, but in general we
should be safely able to assume that we can perform an fsync on any
thread without thread safety problems because that's the assumption a
reasonable Unix programmer would make.  Even if it's not a problem now,
it could well be in the future, and we should either fix this (say, with
C11 atomic integers[1]) or put a note on the metrics functions that they
may be wrong and stop using them in tests.

I would, in general, prefer that if we add wrappers that wrap common
Unix functions we ensure that they provide the same guarantees as the
common Unix functions we're wrapping.  I realize I may sound fussy, but
I've been fixing giant thread-safety problems recently and it's not fun.

[0] This is not, in general, a safe assumption to make, since most RISC
architectures are load-store.
[1] This would necessitate moving to C11, which is fine with me (and
already required on Windows), but others may have objections to.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-07-29 21:25 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-29 12:28 [PATCH 0/2] ci: fix the FreeBSD build Johannes Schindelin via GitGitGadget
2022-07-29 12:28 ` [PATCH 1/2] t5351: avoid relying on `core.fsyncMethod = batch` to be supported Johannes Schindelin via GitGitGadget
2022-07-29 16:07   ` Junio C Hamano
2022-07-29 21:24     ` brian m. carlson
2022-07-29 12:28 ` [PATCH 2/2] t5351: avoid using `test_cmp` for binary data Johannes Schindelin via GitGitGadget
2022-07-29 12:58 ` [PATCH 0/2] ci: fix the FreeBSD build Derrick Stolee
2022-07-29 17:51   ` Carlo Marcelo Arenas Belón
2022-07-29 16:02 ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).