git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/7] progress: verify progress counters in the test suite
@ 2021-06-20 20:02 SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
                   ` (10 more replies)
  0 siblings, 11 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

Splitting off from:

  https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6

On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
> I wonder (only in a semi-curious way, though) if we can detect
> off-by-one errors by adding an assertion to display_progress() that
> requires the first update to have the value 0, and in stop_progress()
> one that requires the previous display_progress() call to have a value
> equal to the total number of work items.  Not sure it'd be worth the
> hassle..

I fixed and reported a number of bogus progress lines in the past, the
last one during v2.31.0-rc phase, so I've looked into whether progress
counters could be automatically validated in our tests, and came up
with these patches a few months ago.  It turned out that progress
counters can be checked easily and transparently in case of progress
lines that are shown in the tests, i.e. that are shown even when
stderr is not a terminal or are forced with '--progress'.  (In other
cases it's still fairly easy but not quite transparent, as I think we
need changes to the progress API; more on that later in a separate
series.)

These checks did uncover a couple of buggy progress lines which are
fixed in this series as well, but I'm not sure that the fix presented
in patch 6 is the right approach, hence the RFC.


SZEDER Gábor (7):
  progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress
    counters
  progress: catch nested/overlapping progresses with
    GIT_TEST_CHECK_PROGRESS
  progress: catch backwards counting with GIT_TEST_CHECK_PROGRESS
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line
  [RFC] entry: don't show "Filtering content: ... done." line in case of
    errors
  test-lib: enable GIT_TEST_CHECK_PROGRESS by default

 commit-graph.c              |  2 +-
 entry.c                     | 10 +++---
 progress.c                  | 29 ++++++++++++++--
 t/t0500-progress-display.sh | 69 +++++++++++++++++++++++++------------
 t/test-lib.sh               |  6 ++++
 5 files changed, 86 insertions(+), 30 deletions(-)

-- 
2.32.0.289.g44fbea0957



* [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
@ 2021-06-20 20:02 ` SZEDER Gábor
  2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
  2021-06-22 15:55   ` Taylor Blau
  2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

We had to fix a couple of buggy progress lines in the past, where the
progress counter's final value didn't match the expected total [1],
e.g.:

  Expanding reachable commits in commit graph: 138606% (824706/595), done.
  Writing out commit graph in 3 passes: 166% (4187845/2512707), done.

Let's do better, and, instead of waiting for someone to notice such
issues by mere chance, start verifying progress counters in the test
suite: introduce the GIT_TEST_CHECK_PROGRESS knob to automatically
check that the final value of each progress counter matches the
expected total upon calling stop_progress(), and trigger a BUG() if it
doesn't.

This check should cover progress lines that are too fast to be shown,
because the repositories used in our tests are tiny and most of our
progress lines are delayed.  However, in case of a delayed progress
line the variable holding the value of the progress counter
('progress->last_value') is only updated after that delay is up, and,
consequently, we can't compare the progress counter with the expected
total in stop_progress() in these cases.

So let's update 'progress->last_value' already during the initial
delay as well.  This doesn't affect the visible behavior of progress
lines, though it results in additional invocations of the internal
display() function during the initial delay, but those don't make any
difference, because display() returns early without displaying
anything until the delay is up anyway.
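
The effect of recording the counter before the delay check can be
sketched in standalone C as below.  This is only an illustration under
stated assumptions: 'struct mini_progress', mini_start(),
mini_display() and mini_stop() are hypothetical stand-ins, not the
actual progress.c code, the delay is reduced to a simple flag, and a
return value takes the place of BUG().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for 'struct progress'. */
struct mini_progress {
	uint64_t total;      /* expected total, 0 if unknown */
	uint64_t last_value; /* last counter seen, (uint64_t)-1 if none */
	int delayed;         /* non-zero while the initial delay is active */
	unsigned shown;      /* how many updates were actually displayed */
};

static void mini_start(struct mini_progress *p, uint64_t total, int delayed)
{
	p->total = total;
	p->last_value = (uint64_t)-1;
	p->delayed = delayed;
	p->shown = 0;
}

/* Record the counter first, then return early while still delayed. */
static void mini_display(struct mini_progress *p, uint64_t n)
{
	p->last_value = n;
	if (p->delayed)
		return; /* nothing is displayed during the delay */
	p->shown++;
}

/* Returns 0 if the final counter matches the total, -1 otherwise. */
static int mini_stop(const struct mini_progress *p)
{
	if (p->total && p->total != p->last_value)
		return -1; /* the real check would trigger BUG() here */
	return 0;
}
```

With this ordering a delayed progress that never displays anything
can still be verified when it is stopped, because 'last_value' holds
the last counter that was passed in.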

Note that this can only check progress lines that are actually
started, i.e. that are shown by default even when standard error is
not a terminal, or that are forced to show with the '--progress'
option of whichever Git command displaying them.

Nonetheless, running the test suite with this new knob enabled results
in failures in 't0021-conversion.sh' and 't5510-fetch.sh', revealing
two more progress lines whose counter doesn't reach the expected
total.  These will be fixed in later patches in this series, and after
that GIT_TEST_CHECK_PROGRESS will be enabled by default in the test
suite.

[1] c4ff24bbb3 (commit-graph.c: display correct number of chunks when
                writing, 2021-02-24)
    1cbdbf3bef (commit-graph: drop count_distinct_commits() function,
                2020-12-07), though this didn't actually fix a buggy
                progress line, but removed it instead.
    150cd3b61d (commit-graph: fix "Writing out commit graph" progress
                counter, 2020-07-09)
    67fa6aac5a (commit-graph: don't show progress percentages while
                expanding reachable commits, 2019-09-07)
    531e6daa03 (prune-packed: advanced progress even for non-existing
                fan-out directories, 2009-04-27)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 progress.c                  | 16 ++++++++++++++--
 t/t0500-progress-display.sh | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf9..255995406f 100644
--- a/progress.c
+++ b/progress.c
@@ -47,6 +47,8 @@ struct progress {
 
 static volatile sig_atomic_t progress_update;
 
+static int test_check_progress;
+
 /*
  * These are only intended for testing the progress output, i.e. exclusively
  * for 'test-tool progress'.
@@ -111,10 +113,11 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	int show_update = 0;
 	int last_count_len = counters_sb->len;
 
+	progress->last_value = n;
+
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
-	progress->last_value = n;
 	tp = (progress->throughput) ? progress->throughput->display.buf : "";
 	if (progress->total) {
 		unsigned percent = n * 100 / progress->total;
@@ -252,7 +255,11 @@ void display_progress(struct progress *progress, uint64_t n)
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-	struct progress *progress = xmalloc(sizeof(*progress));
+	struct progress *progress;
+
+	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
+
+	progress = xmalloc(sizeof(*progress));
 	progress->title = title;
 	progress->total = total;
 	progress->last_value = -1;
@@ -349,6 +356,11 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	progress = *p_progress;
 	if (!progress)
 		return;
+	if (test_check_progress && progress->total &&
+	    progress->total != progress->last_value)
+		BUG("total progress does not match for \"%s\": expected: %"PRIuMAX" got: %"PRIuMAX,
+		    progress->title, (uintmax_t)progress->total,
+		    (uintmax_t)progress->last_value);
 	*p_progress = NULL;
 	if (progress->last_value != -1) {
 		/* Force the last update */
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503a..641fa0964e 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -308,4 +308,38 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'GIT_TEST_CHECK_PROGRESS catches non-matching total' '
+	cat >in <<-\EOF &&
+	progress 1
+	progress 2
+	progress 4
+	EOF
+
+	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
+		test-tool progress --total=3 "Not enough" <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr &&
+
+	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
+		test-tool progress --total=5 "Too much" <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr
+'
+
+test_expect_success 'tolerate bogus progress without GIT_TEST_CHECK_PROGRESS' '
+	cat >expect <<-\EOF &&
+	Working hard:  33% (1/3)<CR>
+	Working hard:  33% (1/3), done.
+	EOF
+
+	cat >in <<-\EOF &&
+	progress 1
+	EOF
+	(
+		sane_unset GIT_TEST_CHECK_PROGRESS &&
+		test-tool progress --total=3 "Working hard" <in 2>stderr
+	) &&
+
+	show_cr <stderr >out &&
+	test_cmp expect out
+'
+
 test_done
-- 
2.32.0.289.g44fbea0957



* [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
@ 2021-06-20 20:02 ` SZEDER Gábor
  2021-06-22 16:00   ` Taylor Blau
  2021-06-20 20:02 ` [PATCH 3/7] progress: catch backwards counting " SZEDER Gábor
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

We had to fix two buggy progress lines in the past, where
stop_progress calls were added at the wrong place [1], resulting in
"done" progress lines appearing in the wrong order.

Extend GIT_TEST_CHECK_PROGRESS to catch these cases as well, i.e.
trigger a BUG() if a progress is still running when start_progress()
or one of its variants is called to start a new one.

Running the test suite with GIT_TEST_CHECK_PROGRESS enabled doesn't
reveal any new issues [2].

Note that this will trigger even in cases where the output is not
visibly wrong, e.g. consider this simplified sequence of calls:

  progress1 = start_delayed_progress();
  progress2 = start_delayed_progress();
  for (i = 0; ...)
      display_progress(progress2, i + 1);
  stop_progress(&progress2);
  for (j = 0; ...)
      display_progress(progress1, j + 1);
  stop_progress(&progress1);

This doesn't produce bogus output like what is shown in those two
fixes [1], because 'progress2' is already "done" before the first
display_progress(progress1, ...) call.  Btw, this is not just a
pathological example, we do have two progress lines arranged like
this, but they are only shown when standard error is a terminal, and
thus aren't caught by GIT_TEST_CHECK_PROGRESS in its current form.
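
The bookkeeping behind this check can be sketched in standalone C as
below.  The names 'struct nest_progress', 'nest_current', nest_start()
and nest_stop() are hypothetical and exist only for this illustration;
the sketch returns an error code where the real check would trigger
BUG().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in that carries only the progress title. */
struct nest_progress {
	const char *title;
};

/* The single progress that is currently active, if any. */
static struct nest_progress *nest_current;

/* Returns 0 on success, -1 if another progress is still active. */
static int nest_start(struct nest_progress *p, const char *title)
{
	if (nest_current)
		return -1; /* nested or overlapping progress */
	p->title = title;
	nest_current = p;
	return 0;
}

/* Mark the progress as finished so that a new one may be started. */
static void nest_stop(struct nest_progress *p)
{
	if (nest_current == p)
		nest_current = NULL;
}
```

Starting a second progress before stopping the first one is rejected
in this sketch even if the first one never displayed anything, which
mirrors the "not visibly wrong" case described above.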

[1] 6f9d5f2fda (commit-graph: fix progress of reachable commits,
                2020-07-09)
    862aead24e (commit-graph: fix "Collecting commits from input"
                progress line, 2020-07-10)

[2] This patch series applies with a minor conflict on top of
    6f9d5f2fda^, and makes 37 tests fail because of that bug.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 progress.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/progress.c b/progress.c
index 255995406f..549e8d1fe7 100644
--- a/progress.c
+++ b/progress.c
@@ -48,6 +48,8 @@ struct progress {
 static volatile sig_atomic_t progress_update;
 
 static int test_check_progress;
+/* Used to catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS. */
+static struct progress *current_progress = NULL;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -258,8 +260,12 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	struct progress *progress;
 
 	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
+	if (test_check_progress && current_progress)
+		BUG("progress \"%s\" is still active when starting new progress \"%s\"",
+		    current_progress->title, title);
 
 	progress = xmalloc(sizeof(*progress));
+	current_progress = progress;
 	progress->title = title;
 	progress->total = total;
 	progress->last_value = -1;
@@ -383,6 +389,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
+	current_progress = NULL;
 	free(progress->throughput);
 	free(progress);
 }
-- 
2.32.0.289.g44fbea0957



* [PATCH 3/7] progress: catch backwards counting with GIT_TEST_CHECK_PROGRESS
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
@ 2021-06-20 20:02 ` SZEDER Gábor
  2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

We had to fix a buggy progress line recently, where the progress
counter counted backwards, see 8e118e8490 (pack-objects: update
"nr_seen" progress based on pack-reused count, 2021-04-11).

Extend GIT_TEST_CHECK_PROGRESS to catch these cases as well, i.e.
trigger a BUG() when the counter passed to display_progress() is
smaller than the previous value.

Note that we allow subsequent display_progress() calls with the same
counter value, because:

  - Strictly speaking, it's not wrong to do so.

  - Forbidding it might make the code calling display_progress() more
    complex; I suspect that would be the case with e.g. the "Updating
    index flags" progress line in 'unpack-trees.c', where the counter
    is increased in recursive function calls.

  - We would need to special case the internal display() call in
    stop_progress_msg(), because it uses the same counter value as the
    last display_progress() call, which would trigger this BUG().
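
A minimal standalone sketch of this rule, assuming a hypothetical
'struct counter_check' with counter_init() and counter_update()
helpers (none of these exist in progress.c), and using a return value
in place of BUG():

```c
#include <assert.h>
#include <stdint.h>

/* Marks "no value seen yet", like the -1 initial value in progress.c. */
#define COUNTER_NO_VALUE ((uint64_t)-1)

/* Hypothetical helper that only tracks the last counter value. */
struct counter_check {
	uint64_t last_value;
};

static void counter_init(struct counter_check *c)
{
	c->last_value = COUNTER_NO_VALUE;
}

/*
 * Returns 0 if the new value is accepted, -1 if it is smaller than
 * the previous one.  Repeating the same value is allowed.
 */
static int counter_update(struct counter_check *c, uint64_t n)
{
	if (c->last_value != COUNTER_NO_VALUE && n < c->last_value)
		return -1; /* counting backwards */
	c->last_value = n;
	return 0;
}
```

Because only 'n < last_value' is rejected, a final update that repeats
the last counter value passes without any special casing.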

't0500-progress-display.sh' contains a few tests that check how
shortened progress lines are covered up, and one of them ('progress
shortens - crazy caller') shortens the progress line by counting
backwards.  From now on that test would trigger this BUG(), so remove
it; the other test cases cover shortening progress lines sufficiently.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 progress.c                  |  6 ++++++
 t/t0500-progress-display.sh | 35 +++++++++++++----------------------
 2 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/progress.c b/progress.c
index 549e8d1fe7..034d50cd6b 100644
--- a/progress.c
+++ b/progress.c
@@ -115,6 +115,12 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	int show_update = 0;
 	int last_count_len = counters_sb->len;
 
+	if (test_check_progress && progress->last_value != -1 &&
+	    n < progress->last_value)
+		BUG("progress \"%s\" counts backwards %"PRIuMAX" -> %"PRIuMAX,
+		    progress->title, (uintmax_t)progress->last_value,
+		    (uintmax_t)n);
+
 	progress->last_value = n;
 
 	if (progress->delay && (!progress_update || --progress->delay))
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 641fa0964e..a73dd45153 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -153,28 +153,6 @@ EOF
 	test_cmp expect out
 '
 
-# Progress counter goes backwards, this should not happen in practice.
-test_expect_success 'progress shortens - crazy caller' '
-	cat >expect <<-\EOF &&
-	Working hard:  10% (100/1000)<CR>
-	Working hard:  20% (200/1000)<CR>
-	Working hard:   0% (1/1000)  <CR>
-	Working hard: 100% (1000/1000)<CR>
-	Working hard: 100% (1000/1000), done.
-	EOF
-
-	cat >in <<-\EOF &&
-	progress 100
-	progress 200
-	progress 1
-	progress 1000
-	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
-
-	show_cr <stderr >out &&
-	test_cmp expect out
-'
-
 test_expect_success 'progress display with throughput' '
 	cat >expect <<-\EOF &&
 	Working hard: 10<CR>
@@ -324,13 +302,26 @@ test_expect_success 'GIT_TEST_CHECK_PROGRESS catches non-matching total' '
 	grep "BUG:.*total progress does not match" stderr
 '
 
+test_expect_success 'GIT_TEST_CHECK_PROGRESS catches backwards counting' '
+	cat >in <<-\EOF &&
+	progress 2
+	progress 1
+	EOF
+
+	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
+		test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	grep "BUG:.*counts backwards" stderr
+'
+
 test_expect_success 'tolerate bogus progress without GIT_TEST_CHECK_PROGRESS' '
 	cat >expect <<-\EOF &&
+	Working hard:  66% (2/3)<CR>
 	Working hard:  33% (1/3)<CR>
 	Working hard:  33% (1/3), done.
 	EOF
 
 	cat >in <<-\EOF &&
+	progress 2
 	progress 1
 	EOF
 	(
-- 
2.32.0.289.g44fbea0957



* [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (2 preceding siblings ...)
  2021-06-20 20:02 ` [PATCH 3/7] progress: catch backwards counting " SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
  2021-06-20 20:03 ` [PATCH 5/7] entry: show finer-grained counter in "Filtering content" " SZEDER Gábor
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N.  This causes the failures of the tests
'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

Fix this by passing 'i + 1' to display_progress(), like most other
callsites do.
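
The off-by-one can be reproduced in a standalone C sketch.  Here
show() is a hypothetical stand-in for display_progress() that merely
remembers the last counter it was given, and scan_zero_based() and
scan_one_based() are made-up names for the two loop variants:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Last counter value handed to the stand-in below. */
static uint64_t last_shown;

/* Hypothetical stand-in for display_progress(). */
static void show(uint64_t n)
{
	last_shown = n;
}

/* Passes the loop variable as-is, so the counter ends at nr - 1. */
static uint64_t scan_zero_based(size_t nr)
{
	size_t i;

	last_shown = 0;
	for (i = 0; i < nr; i++)
		show(i);
	return last_shown;
}

/* Passes 'i + 1', so the counter ends at nr. */
static uint64_t scan_one_based(size_t nr)
{
	size_t i;

	last_shown = 0;
	for (i = 0; i < nr; i++)
		show(i + 1);
	return last_shown;
}
```

For six items the first variant ends at 5 and the second at 6, which
corresponds to the "5/6" in the example output above.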

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 2bcb4e0f89..3181906368 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.32.0.289.g44fbea0957



* [PATCH 5/7] entry: show finer-grained counter in "Filtering content" progress line
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (3 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    This is, in part, responsible for the failures of the tests
    'missing file in delayed checkout' and 'invalid file in delayed
    checkout' in 't0021-conversion.sh' when run with
    GIT_TEST_CHECK_PROGRESS=1, because both use only one filter.  (The
    test 'delayed checkout in process filter' uses two filters but the
    first one does all the work, so that test already happens to
    succeed even with GIT_TEST_CHECK_PROGRESS=1.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.

After this change the 'invalid file in delayed checkout' test in
't0021-conversion.sh' will succeed with GIT_TEST_CHECK_PROGRESS=1, but
the 'missing file in delayed checkout' test will still fail, because
its purposefully buggy filter doesn't process any paths, so we won't
execute that inner loop at all (this will be fixed in the next patch).
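
The difference between the two counting schemes can be sketched in
standalone C.  This is a simplification under stated assumptions: each
filter is modeled only by the number of paths it processes, the outer
retry loop is left out, and final_count_per_filter() and
final_count_per_path() are hypothetical names that return the last
value a display_progress() call would have received.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Old scheme: the counter is updated once per filter, before its
 * paths are processed, as "total - paths still left".
 */
static unsigned final_count_per_filter(const unsigned *paths_per_filter,
				       size_t nr_filters)
{
	unsigned total = 0, left, last = 0;
	size_t f;

	for (f = 0; f < nr_filters; f++)
		total += paths_per_filter[f];
	left = total;
	for (f = 0; f < nr_filters; f++) {
		last = total - left;         /* update before the inner loop */
		left -= paths_per_filter[f]; /* inner loop handles the paths */
	}
	return last;
}

/* New scheme: a dedicated counter is incremented for each path. */
static unsigned final_count_per_path(const unsigned *paths_per_filter,
				     size_t nr_filters)
{
	unsigned processed = 0, p;
	size_t f;

	for (f = 0; f < nr_filters; f++)
		for (p = 0; p < paths_per_filter[f]; p++)
			processed++; /* update right next to the real work */
	return processed;
}
```

With a single filter handling three paths the old scheme ends at 0 and
the new one at 3.  With two filters where the first one does all the
work, the old scheme happens to end at the total as well, as noted
above for 'delayed checkout in process filter'.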

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 711ee0693c..bc4b8fcc98 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.32.0.289.g44fbea0957



* [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (4 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 5/7] entry: show finer-grained counter in "Filtering content" " SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-21 18:32   ` René Scharfe
  2021-06-20 20:03 ` [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default SZEDER Gábor
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

The test 'missing file in delayed checkout' in 't0021-conversion.sh'
fails when run with GIT_TEST_CHECK_PROGRESS=1, because the final value
of the "Filtering content" progress counter doesn't match the expected
total, triggering BUG().  This is not caused by a bug in how we count
progress, but because the test involves a purposefully buggy filter
process that doesn't process any paths, so the progress counter
doesn't have a chance to reach the expected total.

Arguably, it is wrong to show "done" at the end of the progress
line when not all work was done.

So let's check whether there were any errors while processing, or
whether there are still unprocessed paths at the end (which a few
lines later will in fact be considered an error), and if so, don't
show the final "done" line, i.e. don't call stop_progress().  And if we
don't call stop_progress(), then we won't verify that the progress
counter matches the expected total, won't trigger BUG() on mismatch,
and t0021 will succeed even with GIT_TEST_CHECK_PROGRESS=1.

After this change the test suite passes with
GIT_TEST_CHECK_PROGRESS=1.

RFC!!  Alas, not calling stop_progress() on error has drawbacks:

  - All memory allocated for the progress bar is leaked.
  - This progress line remains "active", in the sense that if we were
    to start a new progress later in the same git process, then with
    GIT_TEST_CHECK_PROGRESS it would trigger the other BUG() catching
    nested/overlapping progresses.

Do we care?!  TBH I don't :)
Anyway, if we do, then we might need some sort of an abort_progress()
function...
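
For the sake of discussion, such a function could look roughly like
the standalone C sketch below.  Everything in it is hypothetical:
'struct abort_demo_progress', 'abort_demo_active',
abort_demo_start() and abort_demo_abort() are invented for this
illustration and are not part of the progress API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'struct progress'. */
struct abort_demo_progress {
	uint64_t total;
	uint64_t last_value;
};

/* The progress that is currently considered active, if any. */
static struct abort_demo_progress *abort_demo_active;

static struct abort_demo_progress *abort_demo_start(uint64_t total)
{
	struct abort_demo_progress *p = malloc(sizeof(*p));

	if (!p)
		abort();
	p->total = total;
	p->last_value = 0;
	abort_demo_active = p;
	return p;
}

/*
 * Release the progress without printing a "done" line and without
 * comparing the counter with the expected total.
 */
static void abort_demo_abort(struct abort_demo_progress **p_progress)
{
	struct abort_demo_progress *p = *p_progress;

	if (!p)
		return;
	*p_progress = NULL;
	if (abort_demo_active == p)
		abort_demo_active = NULL;
	free(p);
}
```

A function along these lines would avoid both drawbacks listed above:
the memory is freed, and the progress no longer counts as active.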

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 entry.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/entry.c b/entry.c
index bc4b8fcc98..38baefe22a 100644
--- a/entry.c
+++ b/entry.c
@@ -232,7 +232,8 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		}
 		string_list_remove_empty_items(&dco->filters, 0);
 	}
-	stop_progress(&progress);
+	if (!errs && !dco->paths.nr)
+		stop_progress(&progress);
 	string_list_clear(&dco->filters, 0);
 
 	/* At this point we should not have any delayed paths anymore. */
-- 
2.32.0.289.g44fbea0957



* [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (5 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

Let's enable GIT_TEST_CHECK_PROGRESS by default, in the hope that it
will effectively prevent buggy progress line counters and nested
progress lines from entering our codebase in the future.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/test-lib.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index adaf03543e..ae2dd6d0d2 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1502,6 +1502,12 @@ then
 	export GIT_TEST_CHECK_CACHE_TREE
 fi
 
+if test -z "$GIT_TEST_CHECK_PROGRESS"
+then
+	GIT_TEST_CHECK_PROGRESS=true
+	export GIT_TEST_CHECK_PROGRESS
+fi
+
 test_lazy_prereq PIPE '
 	# test whether the filesystem supports FIFOs
 	test_have_prereq !MINGW,!CYGWIN &&
-- 
2.32.0.289.g44fbea0957



* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
@ 2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
  2021-06-21 18:32     ` René Scharfe
  0 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-20 22:13 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe


On Sun, Jun 20 2021, SZEDER Gábor wrote:

> The final value of the counter of the "Scanning merged commits"
> progress line is always one less than its expected total, e.g.:
>
>   Scanning merged commits:  83% (5/6), done.
>
> This happens because while iterating over an array the loop variable
> is passed to display_progress() as-is, but while C arrays (and thus
> the loop variable) start at 0 and end at N-1, the progress counter
> must end at N.  This causes the failures of the tests
> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>
> Fix this by passing 'i + 1' to display_progress(), like most other
> callsites do.
>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
>  commit-graph.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 2bcb4e0f89..3181906368 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>  
>  	ctx->num_extra_edges = 0;
>  	for (i = 0; i < ctx->commits.nr; i++) {
> -		display_progress(ctx->progress, i);
> +		display_progress(ctx->progress, i + 1);
>  
>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>  			  &ctx->commits.list[i]->object.oid)) {

I think this fix makes sense, but FWIW there's a large thread starting
at [1] where René disagrees with me, and thinks the fix for this sort of
thing would be to display_progress(..., i + 1) at the end of that
for-loop, or just before the stop_progress().

I don't agree, but just noting the disagreement, and that if that
argument wins then a patch like this would involve changing the other
20-some calls to display_progress() in commit-graph.c to work
differently (and to be more complex, we'd need to deal with loop
break/continue etc.).

1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/ 


* Re: [PATCH 0/7] progress: verify progress counters in the test suite
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (6 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default SZEDER Gábor
@ 2021-06-21  0:59 ` Ævar Arnfjörð Bjarmason
  2021-06-23  2:04   ` Taylor Blau
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-21  0:59 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe


On Sun, Jun 20 2021, SZEDER Gábor wrote:

> Splitting off from:
>
>   https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>
> On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>> I wonder (only in a semi-curious way, though) if we can detect
>> off-by-one errors by adding an assertion to display_progress() that
>> requires the first update to have the value 0, and in stop_progress()
>> one that requires the previous display_progress() call to have a value
>> equal to the total number of work items.  Not sure it'd be worth the
>> hassle..
>
> I fixed and reported a number of bogus progress lines in the past, the
> last one during v2.31.0-rc phase, so I've looked into whether progress
> counters could be automatically validated in our tests, and came up
> with these patches a few months ago.  It turned out that progress
> counters can be checked easily and transparently in case of progress
> lines that are shown in the tests, i.e. that are shown even when
> stderr is not a terminal or are forced with '--progress'.  (In other
> cases it's still fairly easy but not quite transparent, as I think we
> need changes to the progress API; more on that later in a separate
> series.)

I've also been working on some progress.[ch] patches that are mostly
finished, and I'm some 20 patches in at the moment. I wasn't sure about
whether to send an alternate 20-patch "let's do this (mostly) instead?"
series, hence this message.

Much of what you're doing here becomes easier after that series,
e.g. your global process struct in 2/7 is something I ended up
implementing as part of a general feature to allow progress to be driven
by either display_progress() *or* the signal handler itself.

Thus we can show a "stalled" message if we run start(), but hang before
we ever call display_progress(), as we do on e.g. git.git in gc's
"Enumerating Objects" phase (at least on my laptop).

So e.g. your 2/7 becomes a general hard assertion, not some test-only
mode.

After that I use the same facility to implement a mode where any signal
can update a new "spinner" part of the progress bar. So let's say you're
hanging on item 1/3 and not calling display_progress() at all, we'll
update a spinner on each signal to show the user that git itself isn't
hanging, just working.

I could also rebase on yours, but much of it would be rewriting the
test-only code to be more generalized, perhaps it's easier if we start
going for the more generalized solution first.

Per some of what I mentioned in the thread you linked to I'm a bit
uncomfortable with the direction in your 1/7. It seems it works in-tree
for now, but I'd like to take the progress.c API in the direction of a
more generally useful API, not just something that narrowly fits the
exact set of current use-cases.

There's a lot of potential uses in-tree where the total not matching at
the end is just something that happens due to real-world fuzziness,
e.g. the unlink() example here:
https://public-inbox.org/git/87lf7k2bem.fsf@evledraar.gmail.com/

Perhaps we can just have it BUG() for now as you're doing and cross that
bridge when we come to it. I just wonder if we can't catch potential
bugs in a more gentle way somehow.

> These checks did uncover a couple of buggy progress lines which are
> fixed in this series as well, but I'm not sure that the fix presented
> in patch 6 is the right approach, hence the RFC.

The approach in 6/7 will also have the effect of not balancing a trace2
start/stop region. Quoting a line from its commit message:

    > Arguably, it is wrong to show "done" at the end of the progress
    > line when not all work was done.

I think for a more general API it makes sense to think of "done" as a
different state than "we have reached == total". The target may change
as in the unlink() example, or we may simply decide to abort and "be
done early".

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
@ 2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
  2021-06-22 15:55   ` Taylor Blau
  1 sibling, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-21  7:09 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe


On Sun, Jun 20 2021, SZEDER Gábor wrote:

> @@ -252,7 +255,11 @@ void display_progress(struct progress *progress, uint64_t n)
>  static struct progress *start_progress_delay(const char *title, uint64_t total,
>  					     unsigned delay, unsigned sparse)
>  {
> -	struct progress *progress = xmalloc(sizeof(*progress));
> +	struct progress *progress;
> +
> +	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
> +
> +	progress = xmalloc(sizeof(*progress));

Is this simply an unrelated cleanup/refactoring? I don't see how this
re-arrangement is needed for adding the git_env_bool() call.

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
@ 2021-06-21 18:32     ` René Scharfe
  2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 138+ messages in thread
From: René Scharfe @ 2021-06-21 18:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, SZEDER Gábor; +Cc: git

Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>
>> The final value of the counter of the "Scanning merged commits"
>> progress line is always one less than its expected total, e.g.:
>>
>>   Scanning merged commits:  83% (5/6), done.
>>
>> This happens because while iterating over an array the loop variable
>> is passed to display_progress() as-is, but while C arrays (and thus
>> the loop variable) start at 0 and end at N-1, the progress counter
>> must end at N.  This causes the failures of the tests
>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>
>> Fix this by passing 'i + 1' to display_progress(), like most other
>> callsites do.
>>
>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>> ---
>>  commit-graph.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/commit-graph.c b/commit-graph.c
>> index 2bcb4e0f89..3181906368 100644
>> --- a/commit-graph.c
>> +++ b/commit-graph.c
>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>
>>  	ctx->num_extra_edges = 0;
>>  	for (i = 0; i < ctx->commits.nr; i++) {
>> -		display_progress(ctx->progress, i);
>> +		display_progress(ctx->progress, i + 1);
>>
>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>  			  &ctx->commits.list[i]->object.oid)) {
>
> I think this fix makes sense, but FWIW there's a large thread starting
> at [1] where René disagrees with me, and thinks the fix for this sort of
> thing would be to display_progress(..., i + 1) at the end of that
> for-loop, or just before the stop_progress().
>
> I don't agree, but just noting the disagreement, and that if that
> argument wins then a patch like this would involve changing the other
> 20-some calls to display_progress() in commit-graph.c to work
> differently (and to be more complex, we'd need to deal with loop
> break/continue etc.).
>
> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/

*sigh*  (And sorry, Ævar.)

Before an item is done, it should be reported as not done.  After an
item is done, it should be reported as done.  One loop iteration
finishes one item.  Thus the number of items to report at the bottom of
the loop is one higher than at the top.  i is the correct number to
report at the top of a zero-based loop, i+1 at the bottom.

There is another place: In the loop header.  It's a weird place for a
function call, but it gets triggered before, between and after all
items, just as we need it:

	for (i = 0; display_progress(ctx->progress, i), i < ctx->commits.nr; i++) {

We could hide this unseemly sight in a macro:

  #define progress_foreach(index, count, progress) \
  for (index = 0; display_progress(progress, index), index < count; index++)

Hmm?
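
To make the loop-header idea concrete, here is a minimal self-contained
sketch.  The struct and display_progress() below are stubs that only
record calls, not git's real progress.c code, and scan_items() is a
hypothetical helper added for illustration:

```c
#include <stdint.h>

/* Stub standing in for git's struct progress; it only records calls. */
struct progress {
	uint64_t calls;      /* how many times display_progress() ran */
	uint64_t last_value; /* value passed in the most recent call */
};

/* Stub standing in for git's display_progress(). */
static void display_progress(struct progress *progress, uint64_t n)
{
	progress->calls++;
	progress->last_value = n;
}

/* Report before, between and after all items: N items give N + 1 calls. */
#define progress_foreach(index, count, progress) \
	for (index = 0; display_progress(progress, index), index < count; index++)

/* Walk `count` dummy items and return the last value that was reported. */
static uint64_t scan_items(struct progress *progress, uint64_t count)
{
	uint64_t i;

	progress_foreach(i, count, progress)
		; /* real per-item work would go here */
	return progress->last_value;
}
```

With six items the stub sees the values 0 through 6, so the last report
equals the total.  Note that a "break" inside the loop body would skip
the final report, so such loops would still need separate handling.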

René

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors
  2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
@ 2021-06-21 18:32   ` René Scharfe
  2021-06-23  1:52     ` Taylor Blau
  0 siblings, 1 reply; 138+ messages in thread
From: René Scharfe @ 2021-06-21 18:32 UTC (permalink / raw)
  To: SZEDER Gábor, git; +Cc: Ævar Arnfjörð Bjarmason

Am 20.06.21 um 22:03 schrieb SZEDER Gábor:
> RFC!!  Alas, not calling stop_progress() on error has drawbacks:
>
>   - All memory allocated for the progress bar is leaked.
>   - This progress line remains "active", in the sense that if we were
>     to start a new progress later in the same git process, then with
>     GIT_TEST_CHECK_PROGRESS it would trigger the other BUG() catching
>     nested/overlapping progresses.
>
> Do we care?!  TBH I don't :)
> Anyway, if we do, then we might need some sort of an abort_progress()
> function...

I think the abort_progress() idea makes sense; to clean up allocations,
tell the user what happened and avoid the BUG().  Showing just
"aborted" instead of "done" should suffice here -- the explanation is
given a few lines later ("'foo' was not filtered properly").

It could be a cheesy stop_progress_msg() wrapper that temporarily sets
test_check_progress to zero..
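
A rough sketch of that wrapper, under stated assumptions:
abort_progress() is hypothetical, the struct and stop_progress_msg()
here are simplified stubs rather than the real progress.c code (the real
stop_progress_msg() returns void and writes to stderr), and a return
value of -1 stands in for the BUG():

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the flag that 1/7 sets from GIT_TEST_CHECK_PROGRESS. */
static int test_check_progress = 1;

/* Stub with only the fields this sketch needs. */
struct progress {
	const char *title;
	uint64_t total;
	uint64_t last_value;
};

static struct progress *start_progress(const char *title, uint64_t total)
{
	struct progress *progress = malloc(sizeof(*progress));

	if (!progress)
		abort();
	progress->title = title;
	progress->total = total;
	progress->last_value = 0;
	return progress;
}

/*
 * Simplified stop_progress_msg(): writes the final line into `out`, frees
 * the progress and returns -1 where the real check would BUG().
 */
static int stop_progress_msg(struct progress **p_progress, const char *msg,
			     char *out, size_t outlen)
{
	struct progress *progress = *p_progress;
	int ret = 0;

	if (!progress)
		return 0;
	*p_progress = NULL;
	if (test_check_progress && progress->total &&
	    progress->last_value != progress->total)
		ret = -1;
	snprintf(out, outlen, "%s: (%" PRIu64 "/%" PRIu64 "), %s.",
		 progress->title, progress->last_value, progress->total, msg);
	free(progress);
	return ret;
}

/* Hypothetical abort_progress(): same cleanup, but no total check. */
static int abort_progress(struct progress **p_progress, char *out,
			  size_t outlen)
{
	int saved = test_check_progress;
	int ret;

	test_check_progress = 0;
	ret = stop_progress_msg(p_progress, "aborted", out, outlen);
	test_check_progress = saved;
	return ret;
}
```

The memory is released and the final line ends in "aborted" instead of
"done", while a plain stop_progress_msg() on the same unfinished
progress would still be flagged.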

René

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-21 18:32     ` René Scharfe
@ 2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
  2021-06-26  8:27         ` René Scharfe
  0 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-21 20:08 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Mon, Jun 21 2021, René Scharfe wrote:

> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>
>>> The final value of the counter of the "Scanning merged commits"
>>> progress line is always one less than its expected total, e.g.:
>>>
>>>   Scanning merged commits:  83% (5/6), done.
>>>
>>> This happens because while iterating over an array the loop variable
>>> is passed to display_progress() as-is, but while C arrays (and thus
>>> the loop variable) start at 0 and end at N-1, the progress counter
>>> must end at N.  This causes the failures of the tests
>>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>>
>>> Fix this by passing 'i + 1' to display_progress(), like most other
>>> callsites do.
>>>
>>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>>> ---
>>>  commit-graph.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/commit-graph.c b/commit-graph.c
>>> index 2bcb4e0f89..3181906368 100644
>>> --- a/commit-graph.c
>>> +++ b/commit-graph.c
>>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>>
>>>  	ctx->num_extra_edges = 0;
>>>  	for (i = 0; i < ctx->commits.nr; i++) {
>>> -		display_progress(ctx->progress, i);
>>> +		display_progress(ctx->progress, i + 1);
>>>
>>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>>  			  &ctx->commits.list[i]->object.oid)) {
>>
>> I think this fix makes sense, but FWIW there's a large thread starting
>> at [1] where René disagrees with me, and thinks the fix for this sort of
>> thing would be to display_progress(..., i + 1) at the end of that
>> for-loop, or just before the stop_progress().
>>
>> I don't agree, but just noting the disagreement, and that if that
>> argument wins then a patch like this would involve changing the other
>> 20-some calls to display_progress() in commit-graph.c to work
>> differently (and to be more complex, we'd need to deal with loop
>> break/continue etc.).
>>
>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>
> *sigh*  (And sorry, Ævar.)
>
> Before an item is done, it should be reported as not done.  After an
> item is done, it should be reported as done.  One loop iteration
> finishes one item.  Thus the number of items to report at the bottom of
> the loop is one higher than at the top.  i is the correct number to
> report at the top of a zero-based loop, i+1 at the bottom.
>
> There is another place: In the loop header.  It's a weird place for a
> function call, but it gets triggered before, between and after all
> items, just as we need it:
>
> 	for (i = 0; display_progress(ctx->progress, i), i < ctx->commits.nr; i++) {
>
> We could hide this unseemly sight in a macro:
>
>   #define progress_foreach(index, count, progress) \
>   for (index = 0; display_progress(progress, index), index < count; index++)

Anyone with more time than sense can go and read over our linked back &
forth thread where we're disagreeing on that point :). I think the pattern
in commit-graph.c makes sense, you don't.

Anyway, aside from that. I think, and I really would be advocating this
too, even if our respective positions were reversed, that *in this case*
it makes sense to just take something like SZEDER's patch here
as-is. Because in that file there's some dozen occurrences of that exact
pattern.

Let's just bring this one case in line with the rest, if we then want to
argue that one or the other use of the progress.c API is wrong as a
general thing, I think it makes more sense to discuss that as some
follow-up series that changes these various API uses en-masse than
holding back isolated fixes that leave the progress bar at
!= 100%.
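
For reference, the "83% (5/6)" symptom falls straight out of the percent
computation in display_progress() (n * 100 / total).  A small sketch,
where final_percent() is a hypothetical helper and not git code:

```c
#include <stdint.h>

/*
 * Return the percentage shown after a loop over `nr` items that reports
 * progress at the top of each iteration, either as `i` (zero-based) or
 * as `i + 1` (one-based).
 */
static unsigned final_percent(uint64_t nr, int report_one_based)
{
	uint64_t i, last = 0;

	for (i = 0; i < nr; i++)
		last = report_one_based ? i + 1 : i;

	/* Same integer arithmetic as the percent shown on the progress line. */
	return nr ? (unsigned)(last * 100 / nr) : 0;
}
```

For six items the zero-based variant ends at 5 * 100 / 6 = 83, the
one-based variant at 100.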

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
  2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
@ 2021-06-22 15:55   ` Taylor Blau
  1 sibling, 0 replies; 138+ messages in thread
From: Taylor Blau @ 2021-06-22 15:55 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, Ævar Arnfjörð Bjarmason, René Scharfe

On Sun, Jun 20, 2021 at 10:02:57PM +0200, SZEDER Gábor wrote:
> +	progress->last_value = n;
> +
>  	if (progress->delay && (!progress_update || --progress->delay))
>  		return;
>
> -	progress->last_value = n;

Makes sense, and thanks for explaining it explicitly in the patch
message.

>  	tp = (progress->throughput) ? progress->throughput->display.buf : "";
>  	if (progress->total) {
>  		unsigned percent = n * 100 / progress->total;
> @@ -252,7 +255,11 @@ void display_progress(struct progress *progress, uint64_t n)
>  static struct progress *start_progress_delay(const char *title, uint64_t total,
>  					     unsigned delay, unsigned sparse)
>  {
> -	struct progress *progress = xmalloc(sizeof(*progress));
> +	struct progress *progress;
> +
> +	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
> +
> +	progress = xmalloc(sizeof(*progress));

Ævar noted this too, I think, but this cleanup to move the xmalloc() call
to after reading $GIT_TEST_CHECK_PROGRESS is unnecessary.

> +test_expect_success 'GIT_TEST_CHECK_PROGRESS catches non-matching total' '
> +	cat >in <<-\EOF &&
> +	progress 1
> +	progress 2
> +	progress 4
> +	EOF
> +
> +	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
> +		test-tool progress --total=3 "Not enough" <in 2>stderr &&
> +	grep "BUG:.*total progress does not match" stderr &&
> +
> +	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
> +		test-tool progress --total=5 "Too much" <in 2>stderr &&
> +	grep "BUG:.*total progress does not match" stderr
> +'

This and the below test are both good to see. I wondered briefly whether
or not it would be worth adding a test to check that the "progress does
not match" triggers even when we have a non-zero delay, like:

    test_must_fail env GIT_PROGRESS_DELAY=100 GIT_TEST_CHECK_PROGRESS=1 \
      test-tool progress --total=5 "Too much" <in 2>stderr &&
    grep "BUG:.*total progress does not match" stderr

But it's not helpful, because GIT_PROGRESS_DELAY is already 2 by
default, and we unset GIT_* environment variables (including
GIT_PROGRESS_DELAY) except a few which are left alone.

So we are already testing this case implicitly. It may be worth making
it explicit, and/or testing the case where GIT_PROGRESS_DELAY=0, but I
do not feel strongly about it. Besides, I would much rather err on the
side of testing cases we feel are legitimately interesting, rather than
filling in a grid of all possible combinations, including uninteresting
ones.
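
To illustrate why the delay does not matter for the check, here is a
simplified sketch.  The struct is a stub, the delay handling is reduced
to a plain early return (the real code counts it down on timer ticks),
and total_matches() is a hypothetical stand-in for the condition that
triggers the BUG():

```c
#include <stdint.h>

/* Stub with only the fields this sketch needs. */
struct progress {
	uint64_t total;
	uint64_t last_value;
	unsigned delay; /* non-zero: still inside the initial delay */
	unsigned shown; /* how many updates were actually drawn */
};

static void display_progress(struct progress *progress, uint64_t n)
{
	/* Recorded first, so it is tracked even while nothing is drawn. */
	progress->last_value = n;

	if (progress->delay)
		return;
	progress->shown++;
}

/* Assumption: a total of 0 means "unknown" and is never checked. */
static int total_matches(const struct progress *progress)
{
	return !progress->total || progress->last_value == progress->total;
}
```

Feeding the values 1, 2, 4 from the test above into a progress whose
delay never expires still leaves last_value at 4, so a total of 3 or 5
is caught even though nothing was ever displayed.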

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS
  2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
@ 2021-06-22 16:00   ` Taylor Blau
  2021-08-30 21:15     ` SZEDER Gábor
  0 siblings, 1 reply; 138+ messages in thread
From: Taylor Blau @ 2021-06-22 16:00 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, Ævar Arnfjörð Bjarmason, René Scharfe

On Sun, Jun 20, 2021 at 10:02:58PM +0200, SZEDER Gábor wrote:
> Note that this will trigger even in cases where the output is not
> visibly wrong, e.g. consider this simplified sequence of calls:
>
>   progress1 = start_delayed_progress();
>   progress2 = start_delayed_progress();
>   for (i = 0; ...)
>       display_progress(progress2, i + 1);
>   stop_progres(&progress2);
>   for (j = 0; ...)
>       display_progress(progress1, j + 1);
>   stop_progres(&progress1);

s/stop_progres/&s, but no big deal. Everything else here looks good.

> diff --git a/progress.c b/progress.c
> index 255995406f..549e8d1fe7 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -48,6 +48,8 @@ struct progress {
>  static volatile sig_atomic_t progress_update;
>
>  static int test_check_progress;
> +/* Used to catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS. */
> +static struct progress *current_progress = NULL;
>
>  /*
>   * These are only intended for testing the progress output, i.e. exclusively
> @@ -258,8 +260,12 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
>  	struct progress *progress;
>
>  	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
> +	if (test_check_progress && current_progress)
> +		BUG("progress \"%s\" is still active when starting new progress \"%s\"",
> +		    current_progress->title, title);
>
>  	progress = xmalloc(sizeof(*progress));

Ah. This is why you moved the allocation down further, since we don't
have to free anything up when calling BUG() if it wasn't allocated in
the first place (and we had no such conditional that would cause us to
abort early before).

For what it's worth, I probably would have preferred to see that change
from the previous patch included in this one rather than in the first of
the series, since it's much clearer here than it is in the first patch.
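
A stripped-down sketch of the pattern, for illustration only: the struct
is a stub, the GIT_TEST_CHECK_PROGRESS condition is left out, and
returning NULL stands in for the BUG():

```c
#include <stdint.h>
#include <stdlib.h>

/* Stub with only the fields this sketch needs. */
struct progress {
	const char *title;
	uint64_t total;
};

/* The one progress that is currently active, if any. */
static struct progress *current_progress;

/* Returns NULL where the real code would BUG() about an overlap. */
static struct progress *start_progress(const char *title, uint64_t total)
{
	struct progress *progress;

	/* Check first: nothing has been allocated yet, so nothing to free. */
	if (current_progress)
		return NULL;

	progress = malloc(sizeof(*progress));
	if (!progress)
		abort();
	progress->title = title;
	progress->total = total;
	current_progress = progress;
	return progress;
}

static void stop_progress(struct progress **p_progress)
{
	if (!*p_progress)
		return;
	if (current_progress == *p_progress)
		current_progress = NULL;
	free(*p_progress);
	*p_progress = NULL;
}
```

Starting a second progress before the first one is stopped is rejected
before any allocation happens; once the first is stopped, the second
start succeeds.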

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors
  2021-06-21 18:32   ` René Scharfe
@ 2021-06-23  1:52     ` Taylor Blau
  2021-08-30 21:17       ` SZEDER Gábor
  0 siblings, 1 reply; 138+ messages in thread
From: Taylor Blau @ 2021-06-23  1:52 UTC (permalink / raw)
  To: René Scharfe
  Cc: SZEDER Gábor, git, Ævar Arnfjörð Bjarmason

On Mon, Jun 21, 2021 at 08:32:56PM +0200, René Scharfe wrote:
> Am 20.06.21 um 22:03 schrieb SZEDER Gábor:
> > RFC!!  Alas, not calling stop_progress() on error has drawbacks:
> >
> >   - All memory allocated for the progress bar is leaked.
> >   - This progress line remains "active", in the sense that if we were
> >     to start a new progress later in the same git process, then with
> >     GIT_TEST_CHECK_PROGRESS it would trigger the other BUG() catching
> >     nested/overlapping progresses.
> >
> > Do we care?!  TBH I don't :)
> > Anyway, if we do, then we might need some sort of an abort_progress()
> > function...
>
> I think the abort_progress() idea makes sense; to clean up allocations,
> tell the user what happened and avoid the BUG().  Showing just
> "aborted" instead of "done" should suffice here -- the explanation is
> given a few lines later ("'foo' was not filtered properly").

Very well put. I concur that having an abort_progress() API makes sense
for all of the reasons that you suggest, but also because we shouldn't
encourage not using what seems like an appropriate API in order to not
fail tests when GIT_TEST_CHECK_PROGRESS is set.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 0/7] progress: verify progress counters in the test suite
  2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
@ 2021-06-23  2:04   ` Taylor Blau
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 138+ messages in thread
From: Taylor Blau @ 2021-06-23  2:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: SZEDER Gábor, git, René Scharfe

On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>
> > Splitting off from:
> >
> >   https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
> >
> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
> >> I wonder (only in a semi-curious way, though) if we can detect
> >> off-by-one errors by adding an assertion to display_progress() that
> >> requires the first update to have the value 0, and in stop_progress()
> >> one that requires the previous display_progress() call to have a value
> >> equal to the total number of work items.  Not sure it'd be worth the
> >> hassle..
> >
> > I fixed and reported a number of bogus progress lines in the past, the
> > last one during v2.31.0-rc phase, so I've looked into whether progress
> > counters could be automatically validated in our tests, and came up
> > with these patches a few months ago.  It turned out that progress
> > counters can be checked easily and transparently in case of progress
> > lines that are shown in the tests, i.e. that are shown even when
> > stderr is not a terminal or are forced with '--progress'.  (In other
> > cases it's still fairly easy but not quite transparent, as I think we
> > need changes to the progress API; more on that later in a separate
> > series.)
>
> I've also been working on some progress.[ch] patches that are mostly
> finished, and I'm some 20 patches in at the moment. I wasn't sure about
> whether to send an alternate 20-patch "let's do this (mostly) instead?"
> series, hence this message.
>
> Much of what you're doing here becomes easier after that series,
> e.g. your global process struct in 2/7 is something I ended up
> implementing as part of a general feature to allow progress to be driven
> by either display_progress() *or* the signal handler itself.

It's difficult to know who should rebase onto who without seeing one
half of the patches. I couldn't find a link to them anywhere (even if
they are only available in your fork in a pre-polished state) despite
looking, but my apologies if they are available and I'm just missing
them.

In general, I think that these patches are clear and are helpful in
pinning down issues with the progress API (which I have made a handful
of times in the past), so I would be happy to see them picked up.

> I could also rebase on yours, but much of it would be rewriting the
> test-only code to be more generalized, perhaps it's easier if we start
> going for the more generalized solution first.

Again, without knowing the substance of your patches it's hard to
comment for sure, but I don't have a problem with a simple and direct
approach here.

> Perhaps we can just have it BUG() for now as you're doing and cross that
> bridge when we come to it. I just wonder if we can't catch potential
> bugs in a more gentle way somehow.

I think there are compelling reasons to feel that the new mode should
only be enabled during tests, as well as compelling reasons to feel that
it should be enabled all of the time.

One way to think about it is that we do not want users to have a BUG()
abort their program just because a progress meter went rogue. So in that
sense, it makes sense that we would only see that happen during tests,
so that those tests could tell us where the bug is, and we could fix it.

On the other hand, since we make sure that our tests pass at each patch,
there's no point in having a separate mode (we could instead remove the
conditionals on GIT_TEST_CHECK_PROGRESS), since successfully running the
tests tells us that there are no rogue progress meters that we exercise
in our (hopefully) complete set of tests.

I could go either way, I think both lines of reasoning are quite
reasonable. But, I think we are generally more lax about having the
whole ci/run-build-and-tests.sh script pass at every commit, and that it
seems we care more about having the tip of each series pass CI when
integrated into 'seen'.

So I don't think that hiding this new mode behind an environment
variable is giving us as much confidence as we'd like, because it
doesn't add anything in "make test".

To me, I think a reasonable direction to take would be to *always*
export GIT_TEST_CHECK_PROGRESS when running tests, not just in
ci/run-build-and-tests.sh. That means we'll catch incorrect uses of the
progress API during tests, without worrying that incomplete coverage
will cause user-visible breakage.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23  2:04   ` Taylor Blau
@ 2021-06-23 17:48     ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
                         ` (25 more replies)
  0 siblings, 26 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>
>> > Splitting off from:
>> >
>> >   https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>> >
>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>> >> I wonder (only in a semi-curious way, though) if we can detect
>> >> off-by-one errors by adding an assertion to display_progress() that
>> >> requires the first update to have the value 0, and in stop_progress()
>> >> one that requires the previous display_progress() call to have a value
>> >> equal to the total number of work items.  Not sure it'd be worth the
>> >> hassle..
>> >
>> > I fixed and reported a number of bogus progress lines in the past, the
>> > last one during v2.31.0-rc phase, so I've looked into whether progress
>> > counters could be automatically validated in our tests, and came up
>> > with these patches a few months ago.  It turned out that progress
>> > counters can be checked easily and transparently in case of progress
>> > lines that are shown in the tests, i.e. that are shown even when
>> > stderr is not a terminal or are forced with '--progress'.  (In other
>> > cases it's still fairly easy but not quite transparent, as I think we
>> > need changes to the progress API; more on that later in a separate
>> > series.)
>>
>> I've also been working on some progress.[ch] patches that are mostly
>> finished, and I'm some 20 patches in at the moment. I wasn't sure about
>> whether to send an alternate 20-patch "let's do this (mostly) instead?"
>> series, hence this message.
>>
>> Much of what you're doing here becomes easier after that series,
>> e.g. your global process struct in 2/7 is something I ended up
>> implementing as part of a general feature to allow progress to be driven
>> by either display_progress() *or* the signal handler itself.
>
> It's difficult to know who should rebase onto who without seeing one
> half of the patches.

I was sort of hoping he'd take my word for it, but here it is. Don't
say I didn't warn you :)

> I couldn't find a link to them anywhere (even if
> they are only available in your fork in a pre-polished state) despite
> looking, but my apologies if they are available and I'm just missing
> them.

FWIW it's avar-szeder/progress-bar-assertions in
https://github.com/avar/git.git, that repo contains various
functioning and not-so-functioning code.

https://github.com/avar/git/tree/meta/ is my version of the crappy
scripts we probably all have some version of for building my own git;
the things that are uncommented in series.conf are what I build my own
git from.

> In general, I think that these patches are clear and are helpful in
> pinning down issues with the progress API (which I have made a handful
> of times in the past), so I would be happy to see them picked up.

Here's all 25 patches (well, around 20 before) that I had queued up
locally and fixed up a bit.

The 01/25 is something I submitted already as
https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
hoping to get this in incrementally.

The 12/25 is my own version of that "global progress struct", 11/25 is
the first of many bugs SZEDER missed in his :)

18/25 is the first step of the UI I was going for, the signal handler
can now drive the progress bar, so e.g. during "git gc" we show (at
least for me, on git.git), a "stalled" message just before we start
the actual count of "Enumerating Objects".

After that was in I was planning on adding config-driven support to
show a "spinner" when we stalled in that way, config-driven because
you could just scrape
e.g. https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
into your own config. See
https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)

19-23/25 is my grabbing of SZEDER's patches that I'm comfortable
labeling as "PATCH", I think they work, but no BUG() assertions yet. I
left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier works set
things up to do any BUG() we trust by default.

22/25 is what I think we should do instead of SZEDER's 6/7
(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
I don't think this "our total doesn't match at the end" is something
we should always BUG() on, for reasons explained there.

I am sympathetic to doing it by default though, hence the
stop_progress_early() API, that's there to allow select callers to
bypass his BUG(...) assertion.

24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
BUG(...) assertions.

His series passes the test suite, but actually severely breaks
things. It'll make e.g. "git commit-graph write" BUG(...) out. The
reason the tests don't catch it is because we have a blind spot in the
tests.

Namely, that most things that use the progress bar API use isatty() to
check if they should start_progress(). If you run the tests as
e.g. (better ways to do this, especially in parallel, most welcome):

    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done

You can discover various things that his series BUG()'s on, I fixed a
couple of those myself, it's an early part of this series.

But we'll still have various untested for BUG()'s even then, this is
because you *also* have to have the test actually emit a "naked"
progress bar on stderr, if the test itself e.g. pipes fd 2 to a file
it won't work.
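
A minimal model of the kind of guard that causes this blind spot
(want_progress() and its parameters are hypothetical names for
illustration, the real callers each spell this differently):

```c
/*
 * opt_progress: -1 if neither --progress nor --no-progress was given,
 *               otherwise 0 or 1.
 * stderr_is_tty: what isatty(2) would report.
 *
 * Returns nonzero if the caller would go on to call start_progress().
 */
static int want_progress(int opt_progress, int stderr_is_tty)
{
	if (opt_progress >= 0)
		return opt_progress; /* explicit option wins */
	return stderr_is_tty;        /* otherwise follow the terminal */
}
```

With stderr redirected to a file and no explicit option this returns
0, so under such a guard neither the progress code nor any assertion
in it is reached.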

I created a shitty-and-mostly-broken throwaway change to
search-replace all the guards of "start_progress(...)" to run
unconditionally, and convert all the "delayed" to the non-delayed
version. That'll find even more BUG()'s where SZEDER's series still
needs to be fixed (and also some unrelated segfaults, I gave up on it
soon after).

Even if we fix that I wouldn't trust it, because a lot of the progress
bars we have depend on the size and shape of the data we're
processing, e.g. the bug I fixed in 11/25. If people find this BUG()
approach worth pursuing I think it would be better to make it an
opt-in flag we convert one caller at a time to.

For some it's really clear that we could assert it, for others such as
the commit-graph it's much more subtle, we're in some callback after
setting a "total", that callback does a "break", "continue" etc. in
various places, all depending on repository data.

It's not easy to reason about that and be certain that we can hold to
the estimate. If we get it wrong someone's repo in the wild won't
fully GC because of the overly eager BUG().
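
A sketch of how a data-dependent "continue" can leave the counter
short of the total (scan_items() and the "skip" array are invented for
illustration, they do not correspond to any commit-graph code):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk "total" items. skip[i] != 0 models repository data that makes
 * the loop body bail out before it updates the counter. Returns the
 * last value that would have been passed to display_progress().
 */
static uint64_t scan_items(const int *skip, size_t total)
{
	uint64_t shown = 0;
	size_t i;

	for (i = 0; i < total; i++) {
		if (skip[i])
			continue; /* no counter update for this item */
		shown = i + 1;    /* stands in for display_progress() */
	}
	return shown;
}
```

Whether the returned value equals "total" depends only on whether the
last item happens to be skipped, i.e. on the data, which is why a
test suite with fixed data can pass while a repository in the wild
trips the assertion.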

If SZEDER wants to pursue it I think it'll be easier on top of this
series, but personally I really don't see the point of spending effort
on it.

We should really be going in the other direction, of having more fuzzy
ETAs, not less.

E.g. we often have enough data at the start of "Enumerating Objects"
to give a good-enough target value, that it's 5-10% off isn't really
the point, but that the user looking at it sees something better than
a dumb count-up, and can instead see that they'll probably be looking
at it for about a minute. Now our API is to give no ETA/target if
we're not 100% sure, it's not good UX.

So trying to get the current exact count/exact percentage right seems
like a distraction to me in the longer term. If anything we should
just be rounding those numbers, showing fuzzy ETAs instead of
percentages if we can etc.

SZEDER Gábor (4):
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line
  progress: assert last update in stop_progress()
  progress: assert counting upwards in display()

Ævar Arnfjörð Bjarmason (21):
  progress.c tests: fix breakage with COLUMNS != 80
  progress.c tests: make start/stop verbs on stdin
  progress.c tests: test some invalid usage
  progress.c tests: add a "signal" verb
  progress.c: move signal handler functions lower
  progress.c: call progress_interval() from progress_test_force_update()
  progress.c: stop eagerly fflush(stderr) when not a terminal
  progress.c: add temporary variable from progress struct
  midx perf: add a perf test for multi-pack-index
  progress.c: remove the "sparse" mode nano-optimization
  pack-bitmap-write.c: add a missing stop_progress()
  progress.c: add & assert a "global_progress" variable
  progress.[ch]: move the "struct progress" to the header
  progress.[ch]: move test-only code away from "extern" variables
  progress.c: pass "is done?" (again) to display()
  progress.[ch]: convert "title" to "struct strbuf"
  progress.c: refactor display() for less confusion, and fix bug
  progress.c: emit progress on first signal, show "stalled"
  midx: don't provide a total for QSORT() progress
  progress.c: add a stop_progress_early() function
  entry: deal with unexpected "Filtering content" total

 cache.h                          |   1 -
 commit-graph.c                   |   2 +-
 csum-file.h                      |   2 -
 entry.c                          |  12 +-
 midx.c                           |  25 +-
 pack-bitmap-write.c              |   1 +
 pack.h                           |   1 -
 parallel-checkout.h              |   1 -
 progress.c                       | 391 ++++++++++++++++++-------------
 progress.h                       |  50 +++-
 reachable.h                      |   1 -
 t/helper/test-progress.c         |  54 +++--
 t/perf/p5319-multi-pack-index.sh |  21 ++
 t/t0500-progress-display.sh      | 247 ++++++++++++++-----
 14 files changed, 537 insertions(+), 272 deletions(-)
 create mode 100755 t/perf/p5319-multi-pack-index.sh

-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 02/25] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
                         ` (24 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The tests added in 2bb74b53a49 (Test the progress display, 2019-09-16)
broke under anything except COLUMNS=80, i.e. when running them under
the "-v" mode under a differently sized terminal.

Let's set the expected number of COLUMNS at the start of the test to
fix that bug. It's handy not to do this in test-progress.c itself, in
case we'd like to test for a different number of COLUMNS, either
manually or in a future test.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503ac..66c092a0fe3 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -8,6 +8,11 @@ show_cr () {
 	tr '\015' Q | sed -e "s/Q/<CR>\\$LF/g"
 }
 
+test_expect_success 'setup COLUMNS' '
+	COLUMNS=80 &&
+	export COLUMNS
+'
+
 test_expect_success 'simple progress display' '
 	cat >expect <<-\EOF &&
 	Working hard: 1<CR>
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 02/25] progress.c tests: make start/stop verbs on stdin
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 03/25] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
                         ` (23 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Change the usage of the "test-tool progress" introduced in
2bb74b53a49 (Test the progress display, 2019-09-16) to take commands
like "start" and "stop" on stdin, instead of running them implicitly.

This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such stress tests.
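
For reference, the "start[ <total>[ <title>]]" grammar can be modeled
in isolation like this (parse_start() is a hypothetical standalone
helper, the actual change does the parsing inline in cmd__progress()
using skip_prefix()):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "start", "start <total>" or "start <total> <title>".
 * Returns 0 on success, -1 on any other input. On success *title
 * points either at default_title or into "line".
 */
static int parse_start(const char *line, const char *default_title,
		       uint64_t *total, const char **title)
{
	char *end;

	if (!strcmp(line, "start")) {
		*total = 0;
		*title = default_title;
		return 0;
	}
	if (strncmp(line, "start ", 6))
		return -1;

	*total = strtoull(line + 6, &end, 10);
	if (*end == '\0')
		*title = default_title; /* "start <total>" */
	else if (*end == ' ')
		*title = end + 1;       /* "start <total> <title>" */
	else
		return -1;              /* trailing garbage after number */
	return 0;
}
```

Since the title points into the input line, a caller that reuses its
line buffer has to keep that storage alive for as long as the
progress bar exists.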

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 45 ++++++++++++++++++++--------
 t/t0500-progress-display.sh | 59 +++++++++++++++++++++++--------------
 2 files changed, 69 insertions(+), 35 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe7894..eb925d591e1 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -3,6 +3,9 @@
  *
  * Reads instructions from standard input, one instruction per line:
  *
+ *   "start[ <total>[ <title>]]" - Call start_progress(title, total),
+ *                                 when "start" use a title of
+ *                                 "Working hard" with a total of 0.
  *   "progress <items>" - Call display_progress() with the given item count
  *                        as parameter.
  *   "throughput <bytes> <millis> - Call display_throughput() with the given
@@ -10,6 +13,7 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
  */
@@ -22,31 +26,42 @@
 
 int cmd__progress(int argc, const char **argv)
 {
-	int total = 0;
-	const char *title;
+	const char *default_title = "Working hard";
+	char *detached_title = NULL;
 	struct strbuf line = STRBUF_INIT;
-	struct progress *progress;
+	struct progress *progress = NULL;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress <stdin",
 		NULL
 	};
 	struct option options[] = {
-		OPT_INTEGER(0, "total", &total, "total number of items"),
 		OPT_END(),
 	};
 
 	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc != 1)
-		die("need a title for the progress output");
-	title = argv[0];
+	if (argc)
+		usage_with_options(usage, options);
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
-		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
+		if (!strcmp(line.buf, "start")) {
+			progress = start_progress(default_title, 0);
+		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
+			uint64_t total = strtoull(end, &end, 10);
+			if (*end == '\0') {
+				progress = start_progress(default_title, total);
+			} else if (*end == ' ') {
+				if (detached_title)
+					free(detached_title);
+				detached_title = strbuf_detach(&line, NULL);
+				progress = start_progress(end + 1, total);
+			} else {
+				die("invalid input: '%s'\n", line.buf);
+			}
+		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
 			uint64_t item_count = strtoull(end, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
@@ -63,12 +78,16 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update"))
+		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
-		else
+		} else if (!strcmp(line.buf, "stop")) {
+			stop_progress(&progress);
+		} else {
 			die("invalid input: '%s'\n", line.buf);
+		}
 	}
-	stop_progress(&progress);
+	if (detached_title)
+		free(detached_title);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 66c092a0fe3..ce6c3434673 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -22,6 +22,7 @@ test_expect_success 'simple progress display' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	update
 	progress 1
 	update
@@ -30,8 +31,9 @@ test_expect_success 'simple progress display' '
 	progress 4
 	update
 	progress 5
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -46,11 +48,13 @@ test_expect_success 'progress display with total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 3
 	progress 1
 	progress 2
 	progress 3
+	stop
 	EOF
-	test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -67,14 +71,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 100
 	progress 1000
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -93,16 +97,15 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
-	update
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 1
 	update
 	progress 2
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -121,14 +124,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -145,14 +148,14 @@ Working hard.......2.........3.........4.........5.........6.........7.........:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6.........7.........
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6.........7........." \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -169,12 +172,14 @@ test_expect_success 'progress shortens - crazy caller' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 1000
 	progress 100
 	progress 200
 	progress 1
 	progress 1000
+	stop
 	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -190,6 +195,7 @@ test_expect_success 'progress display with throughput' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 102400 1000
 	update
 	progress 10
@@ -202,8 +208,9 @@ test_expect_success 'progress display with throughput' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -219,6 +226,7 @@ test_expect_success 'progress display with throughput and total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	progress 10
 	throughput 204800 2000
@@ -227,8 +235,9 @@ test_expect_success 'progress display with throughput and total' '
 	progress 30
 	throughput 409600 4000
 	progress 40
+	stop
 	EOF
-	test-tool progress --total=40 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -244,6 +253,7 @@ test_expect_success 'cover up after throughput shortens' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 409600 1000
 	update
 	progress 1
@@ -256,8 +266,9 @@ test_expect_success 'cover up after throughput shortens' '
 	throughput 1638400 4000
 	update
 	progress 4
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -272,6 +283,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 1 1000
 	update
 	progress 1
@@ -281,8 +293,9 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	throughput 3145728 3000
 	update
 	progress 3
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -290,6 +303,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	update
 	progress 10
@@ -302,10 +316,11 @@ test_expect_success 'progress generates traces' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
-		"Working hard" <in 2>stderr &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress \
+		<in 2>stderr &&
 
 	# t0212/parse_events.perl intentionally omits regions and data.
 	test_region progress "Working hard" trace.event &&
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 03/25] progress.c tests: test some invalid usage
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 02/25] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 04/25] progress.c tests: add a "signal" verb Ævar Arnfjörð Bjarmason
                         ` (22 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Test what happens when we "stop" without a "start", omit the "stop"
after a "start", or try to start two concurrent progress bars. This
extends the trace2 tests added in 98a13647408 (trace2: log progress
time and throughput, 2020-05-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ce6c3434673..50eced31f03 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -328,4 +328,37 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces: stop / start' '
+	cat >in <<-\EOF &&
+	start
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-startstop.event" test-tool progress \
+		<in 2>stderr &&
+	test_region progress "Working hard" trace-startstop.event
+'
+
+test_expect_success 'progress generates traces: start without stop' '
+	cat >in <<-\EOF &&
+	start
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" test-tool progress \
+		<in 2>stderr &&
+	grep region_enter.*progress trace-start.event &&
+	! grep region_leave.*progress trace-start.event
+'
+
+test_expect_success 'progress generates traces: stop without start' '
+	cat >in <<-\EOF &&
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-stop.event" test-tool progress \
+		<in 2>stderr &&
+	! grep region_enter.*progress trace-stop.event &&
+	! grep region_leave.*progress trace-stop.event
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 04/25] progress.c tests: add a "signal" verb
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (2 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 03/25] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 05/25] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
                         ` (21 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add a "signal" synonym for "update". It is not typical of the
progress.c API to encounter a scenario where we do an update before
the first display_progress(), let's indicate this explicitly by
calling such instances "signal".

It's just a synonym for "update", but we can imagine that the
following "update" calls could elide many "progress" calls, and the
progress bar output will generally be of the same type, whereas the
output where we're asked to emit an update before we've received any
data is a special case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    |  6 +++++-
 t/t0500-progress-display.sh | 10 +++++-----
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index eb925d591e1..7ca58a3ee78 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -13,6 +13,9 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "signal" - Synonym for "update", used for self-documenting tests,
+ *              i.e. "expect signal here due to hanging ("signal")
+ *              v.s. it was time to update ("update").
  *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
@@ -78,7 +81,8 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update")) {
+		} else if (!strcmp(line.buf, "update") ||
+			   !strcmp(line.buf, "signal")) {
 			progress_test_force_update();
 		} else if (!strcmp(line.buf, "stop")) {
 			stop_progress(&progress);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 50eced31f03..66c1989b176 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -23,7 +23,7 @@ test_expect_success 'simple progress display' '
 
 	cat >in <<-\EOF &&
 	start 0
-	update
+	signal
 	progress 1
 	update
 	progress 2
@@ -197,7 +197,7 @@ test_expect_success 'progress display with throughput' '
 	cat >in <<-\EOF &&
 	start
 	throughput 102400 1000
-	update
+	signal
 	progress 10
 	throughput 204800 2000
 	update
@@ -255,7 +255,7 @@ test_expect_success 'cover up after throughput shortens' '
 	cat >in <<-\EOF &&
 	start
 	throughput 409600 1000
-	update
+	signal
 	progress 1
 	throughput 819200 2000
 	update
@@ -285,7 +285,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	cat >in <<-\EOF &&
 	start
 	throughput 1 1000
-	update
+	signal
 	progress 1
 	throughput 1024000 2000
 	update
@@ -305,7 +305,7 @@ test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
 	start 40
 	throughput 102400 1000
-	update
+	signal
 	progress 10
 	throughput 204800 2000
 	update
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 05/25] progress.c: move signal handler functions lower
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (3 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 04/25] progress.c tests: add a "signal" verb Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
                         ` (20 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Move the signal handler functions to just before the
start_progress_delay() where they'll be referenced, instead of having
them at the top of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 92 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 44 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf93..893cb0fe56f 100644
--- a/progress.c
+++ b/progress.c
@@ -53,50 +53,6 @@ static volatile sig_atomic_t progress_update;
  */
 int progress_testing;
 uint64_t progress_test_ns = 0;
-void progress_test_force_update(void)
-{
-	progress_update = 1;
-}
-
-
-static void progress_interval(int signum)
-{
-	progress_update = 1;
-}
-
-static void set_progress_signal(void)
-{
-	struct sigaction sa;
-	struct itimerval v;
-
-	if (progress_testing)
-		return;
-
-	progress_update = 0;
-
-	memset(&sa, 0, sizeof(sa));
-	sa.sa_handler = progress_interval;
-	sigemptyset(&sa.sa_mask);
-	sa.sa_flags = SA_RESTART;
-	sigaction(SIGALRM, &sa, NULL);
-
-	v.it_interval.tv_sec = 1;
-	v.it_interval.tv_usec = 0;
-	v.it_value = v.it_interval;
-	setitimer(ITIMER_REAL, &v, NULL);
-}
-
-static void clear_progress_signal(void)
-{
-	struct itimerval v = {{0,},};
-
-	if (progress_testing)
-		return;
-
-	setitimer(ITIMER_REAL, &v, NULL);
-	signal(SIGALRM, SIG_IGN);
-	progress_update = 0;
-}
 
 static int is_foreground_fd(int fd)
 {
@@ -249,6 +205,54 @@ void display_progress(struct progress *progress, uint64_t n)
 		display(progress, n, NULL);
 }
 
+static void progress_interval(int signum)
+{
+	progress_update = 1;
+}
+
+/*
+ * The progress_test_force_update() function is intended for testing
+ * the progress output, i.e. exclusively for 'test-tool progress'.
+ */
+void progress_test_force_update(void)
+{
+	progress_update = 1;
+}
+
+static void set_progress_signal(void)
+{
+	struct sigaction sa;
+	struct itimerval v;
+
+	if (progress_testing)
+		return;
+
+	progress_update = 0;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = progress_interval;
+	sigemptyset(&sa.sa_mask);
+	sa.sa_flags = SA_RESTART;
+	sigaction(SIGALRM, &sa, NULL);
+
+	v.it_interval.tv_sec = 1;
+	v.it_interval.tv_usec = 0;
+	v.it_value = v.it_interval;
+	setitimer(ITIMER_REAL, &v, NULL);
+}
+
+static void clear_progress_signal(void)
+{
+	struct itimerval v = {{0,},};
+
+	if (progress_testing)
+		return;
+
+	setitimer(ITIMER_REAL, &v, NULL);
+	signal(SIGALRM, SIG_IGN);
+	progress_update = 0;
+}
+
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (4 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 05/25] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
                         ` (19 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Define the progress_test_force_update() function in terms of
progress_interval(). For documentation purposes these two functions
have the same body, but different names. Let's just define the test
function by calling progress_interval() with SIGALRM ourselves.
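
Reduced to a standalone model, the resulting shape is roughly this
(update_flag, on_interval() and test_force_update() are stand-in
names for progress_update, progress_interval() and
progress_test_force_update()):

```c
#include <signal.h>

/* Set from the handler, read and cleared by the display code. */
static volatile sig_atomic_t update_flag;

/* SIGALRM handler: only flips the flag. */
static void on_interval(int signum)
{
	(void)signum;
	update_flag = 1;
}

/* Test-only entry point: behave as if the timer had just fired. */
static void test_force_update(void)
{
	on_interval(SIGALRM);
}
```

Routing the test helper through the handler means any later change
to what the handler does is picked up by the tests as well.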

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 893cb0fe56f..7fcc513717a 100644
--- a/progress.c
+++ b/progress.c
@@ -216,7 +216,7 @@ static void progress_interval(int signum)
  */
 void progress_test_force_update(void)
 {
-	progress_update = 1;
+	progress_interval(SIGALRM);
 }
 
 static void set_progress_signal(void)
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (5 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 08/25] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
                         ` (18 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

It's the clear intention of the combination of 137a0d0ef56 (Flush
progress message buffer in display()., 2007-11-19) and
85cb8906f0e (progress: no progress in background, 2015-04-13) to call
fflush(stderr) when we have a stderr in the foreground, but we ended
up always calling fflush(stderr) seemingly by omission. Let's not.
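
The decision this change ends up with can be written down as a small
pure function (struct emit_plan and plan_emit() are illustrative names
only, display() keeps this logic inline):

```c
struct emit_plan {
	int write; /* emit the progress line at all */
	int flush; /* call fflush(stderr) afterwards */
};

/*
 * stderr_is_foreground: result of the foreground check on stderr.
 * done: nonzero when this is the final "done" line.
 */
static struct emit_plan plan_emit(int stderr_is_foreground, int done)
{
	struct emit_plan plan;

	plan.write = stderr_is_foreground || done;
	plan.flush = plan.write && stderr_is_foreground;
	return plan;
}
```

The only combination whose behavior changes is "not in the foreground,
but done": the final line is still written, it is just no longer
flushed eagerly.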

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 7fcc513717a..1fade5808de 100644
--- a/progress.c
+++ b/progress.c
@@ -91,7 +91,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	}
 
 	if (show_update) {
-		if (is_foreground_fd(fileno(stderr)) || done) {
+		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
+		if (stderr_is_foreground_fd || done) {
 			const char *eol = done ? done : "\r";
 			size_t clear_len = counters_sb->len < last_count_len ?
 					last_count_len - counters_sb->len + 1 :
@@ -115,7 +116,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 				fprintf(stderr, "%s: %s%*s", progress->title,
 					counters_sb->buf, (int) clear_len, eol);
 			}
-			fflush(stderr);
+			if (stderr_is_foreground_fd)
+				fflush(stderr);
 		}
 		progress_update = 0;
 	}
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 08/25] progress.c: add temporary variable from progress struct
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (6 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 09/25] midx perf: add a perf test for multi-pack-index Ævar Arnfjörð Bjarmason
                         ` (17 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add a temporary "progress" variable for the dereferenced p_progress
pointer to a "struct progress *". Before 98a13647408 (trace2: log
progress time and throughput, 2020-05-12) we didn't dereference
"p_progress" in this function, now that we do it's easier to read the
code if we work with a "progress" struct pointer like everywhere else,
instead of a pointer to a pointer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 1fade5808de..1ab7d19deb8 100644
--- a/progress.c
+++ b/progress.c
@@ -331,15 +331,16 @@ void stop_progress(struct progress **p_progress)
 	finish_if_sparse(*p_progress);
 
 	if (*p_progress) {
+		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
 				   (*p_progress)->total);
 
 		if ((*p_progress)->throughput)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
-					   (*p_progress)->throughput->curr_total);
+					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+		trace2_region_leave("progress", progress->title, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 09/25] midx perf: add a perf test for multi-pack-index
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (7 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 08/25] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization Ævar Arnfjörð Bjarmason
                         ` (16 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add a basic write and verify performance test for the multi-pack-index
command.

The "write" is also done in a "test_expect_success" so that the
"write" perf test (which would run N times) can be skipped while we
are still guaranteed to have a midx to verify by the time we get to
the "verify" test.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/perf/p5319-multi-pack-index.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100755 t/perf/p5319-multi-pack-index.sh

diff --git a/t/perf/p5319-multi-pack-index.sh b/t/perf/p5319-multi-pack-index.sh
new file mode 100755
index 00000000000..39769602ab7
--- /dev/null
+++ b/t/perf/p5319-multi-pack-index.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+
+test_description='Test midx performance'
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'setup multi-pack-index' '
+	git multi-pack-index write
+'
+
+test_perf 'midx write' '
+	git multi-pack-index write
+'
+
+test_perf 'midx verify' '
+	git multi-pack-index verify
+'
+
+test_done
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (8 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 09/25] midx perf: add a perf test for multi-pack-index Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
                         ` (15 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Revert the code added in 9d81ecb52b5 (progress: add sparse mode to
force 100% complete message, 2019-03-21) for the "sparse" progress
mode, and change its only user added in 430efb8a74b (midx: add
progress indicators in multi-pack-index verify, 2019-03-21) to use the
normal non-sparse progress.c API instead.

The reason for only calling display_progress() once every
SPARSE_PROGRESS_INTERVAL (2^12) objects was to improve performance. It
does that, but only in an isolated and artificial benchmark. In the
case of the "verify_midx_file" user we're in a loop doing various
other OID/object work, so the cost of calling display_progress() is
entirely lost in the noise.
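
As an illustration only (not part of this patch), the sampling that
the removed macro performed can be sketched as below. The names
SAMPLE_INTERVAL, stub_display_progress(), run_sampled() and
run_unsampled() are hypothetical stand-ins, and the stub merely counts
calls instead of printing anything. The sketch also shows why "sparse"
mode had to force a final update: the last sampled value is usually
not the total.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for SPARSE_PROGRESS_INTERVAL (same value). */
#define SAMPLE_INTERVAL (1 << 12)

static uint64_t calls;      /* how often the stub was invoked */
static uint64_t last_value; /* last value the stub was given */

/* Stub: records the call instead of writing a progress line. */
static void stub_display_progress(uint64_t n)
{
	calls++;
	last_value = n;
}

/* Report only every SAMPLE_INTERVAL-th value, like the removed macro. */
static uint64_t run_sampled(uint64_t total)
{
	uint64_t i;

	/* The mask test below only works for a power-of-two interval. */
	assert((SAMPLE_INTERVAL & (SAMPLE_INTERVAL - 1)) == 0);
	calls = 0;
	last_value = 0;
	for (i = 0; i < total; i++) {
		uint64_t n = i + 1;

		if ((n & (SAMPLE_INTERVAL - 1)) == 0)
			stub_display_progress(n);
	}
	return calls;
}

/* Report every value, which is what the code does after this patch. */
static uint64_t run_unsampled(uint64_t total)
{
	uint64_t i;

	calls = 0;
	last_value = 0;
	for (i = 0; i < total; i++)
		stub_display_progress(i + 1);
	return calls;
}
```

With a total of 10000 the sampled loop reports only 4096 and 8192 and
never reaches the total, while the unsampled loop ends on 10000.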

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c     | 26 +++++++-------------------
 progress.c | 38 +++-----------------------------------
 2 files changed, 10 insertions(+), 54 deletions(-)

diff --git a/midx.c b/midx.c
index 21d6a05e887..d80e68998b8 100644
--- a/midx.c
+++ b/midx.c
@@ -1186,18 +1186,6 @@ static int compare_pair_pos_vs_id(const void *_a, const void *_b)
 	return b->pack_int_id - a->pack_int_id;
 }
 
-/*
- * Limit calls to display_progress() for performance reasons.
- * The interval here was arbitrarily chosen.
- */
-#define SPARSE_PROGRESS_INTERVAL (1 << 12)
-#define midx_display_sparse_progress(progress, n) \
-	do { \
-		uint64_t _n = (n); \
-		if ((_n & (SPARSE_PROGRESS_INTERVAL - 1)) == 0) \
-			display_progress(progress, _n); \
-	} while (0)
-
 int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags)
 {
 	struct pair_pos_vs_id *pairs = NULL;
@@ -1248,8 +1236,8 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	}
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"),
-						 m->num_objects - 1);
+		progress = start_progress(_("Verifying OID order in multi-pack-index"),
+					  m->num_objects - 1);
 	for (i = 0; i < m->num_objects - 1; i++) {
 		struct object_id oid1, oid2;
 
@@ -1260,7 +1248,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 			midx_report(_("oid lookup out of order: oid[%d] = %s >= %s = oid[%d]"),
 				    i, oid_to_hex(&oid1), oid_to_hex(&oid2), i + 1);
 
-		midx_display_sparse_progress(progress, i + 1);
+		display_progress(progress, i + 1);
 	}
 	stop_progress(&progress);
 
@@ -1277,14 +1265,14 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	}
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Sorting objects by packfile"),
-						 m->num_objects);
+		progress = start_progress(_("Sorting objects by packfile"),
+					  m->num_objects);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects);
+		progress = start_progress(_("Verifying object offsets"), m->num_objects);
 	for (i = 0; i < m->num_objects; i++) {
 		struct object_id oid;
 		struct pack_entry e;
@@ -1318,7 +1306,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 			midx_report(_("incorrect object offset for oid[%d] = %s: %"PRIx64" != %"PRIx64),
 				    pairs[i].pos, oid_to_hex(&oid), m_offset, p_offset);
 
-		midx_display_sparse_progress(progress, i + 1);
+		display_progress(progress, i + 1);
 	}
 	stop_progress(&progress);
 
diff --git a/progress.c b/progress.c
index 1ab7d19deb8..912edd4c818 100644
--- a/progress.c
+++ b/progress.c
@@ -37,7 +37,6 @@ struct progress {
 	uint64_t total;
 	unsigned last_percent;
 	unsigned delay;
-	unsigned sparse;
 	struct throughput *throughput;
 	uint64_t start_ns;
 	struct strbuf counters_sb;
@@ -256,7 +255,7 @@ static void clear_progress_signal(void)
 }
 
 static struct progress *start_progress_delay(const char *title, uint64_t total,
-					     unsigned delay, unsigned sparse)
+					     unsigned delay)
 {
 	struct progress *progress = xmalloc(sizeof(*progress));
 	progress->title = title;
@@ -264,7 +263,6 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	progress->last_value = -1;
 	progress->last_percent = -1;
 	progress->delay = delay;
-	progress->sparse = sparse;
 	progress->throughput = NULL;
 	progress->start_ns = getnanotime();
 	strbuf_init(&progress->counters_sb, 0);
@@ -287,40 +285,12 @@ static int get_default_delay(void)
 
 struct progress *start_delayed_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, get_default_delay(), 0);
+	return start_progress_delay(title, total, get_default_delay());
 }
 
 struct progress *start_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, 0, 0);
-}
-
-/*
- * Here "sparse" means that the caller might use some sampling criteria to
- * decide when to call display_progress() rather than calling it for every
- * integer value in[0 .. total).  In particular, the caller might not call
- * display_progress() for the last value in the range.
- *
- * When "sparse" is set, stop_progress() will automatically force the done
- * message to show 100%.
- */
-struct progress *start_sparse_progress(const char *title, uint64_t total)
-{
-	return start_progress_delay(title, total, 0, 1);
-}
-
-struct progress *start_delayed_sparse_progress(const char *title,
-					       uint64_t total)
-{
-	return start_progress_delay(title, total, get_default_delay(), 1);
-}
-
-static void finish_if_sparse(struct progress *progress)
-{
-	if (progress &&
-	    progress->sparse &&
-	    progress->last_value != progress->total)
-		display_progress(progress, progress->total);
+	return start_progress_delay(title, total, 0);
 }
 
 void stop_progress(struct progress **p_progress)
@@ -328,8 +298,6 @@ void stop_progress(struct progress **p_progress)
 	if (!p_progress)
 		BUG("don't provide NULL to stop_progress");
 
-	finish_if_sparse(*p_progress);
-
 	if (*p_progress) {
 		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (9 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-09-17  5:14         ` SZEDER Gábor
  2021-06-23 17:48       ` [PATCH 12/25] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
                         ` (14 subsequent siblings)
  25 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
bitmap writing, 2013-12-21): we did not call stop_progress() if we
reached the early exit in this function. This will matter in a
subsequent commit where we BUG(...) out if this happens, and matters
now e.g. because we don't have a corresponding "region_leave" for the
progress trace2 event.
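
As an illustration only (not the pack-bitmap-write.c code), the
pattern being fixed can be sketched as a function in which every exit
path, including the early return, has to close what was opened. The
names demo_start(), demo_stop(), demo_select() and open_regions are
hypothetical.

```c
#include <assert.h>

/* Number of started-but-not-stopped regions; should end up at 0. */
static int open_regions;

static void demo_start(void)
{
	open_regions++;
}

static void demo_stop(void)
{
	assert(open_regions > 0);
	open_regions--;
}

/* Picks a subset of "nr" items; small inputs take an early exit. */
static int demo_select(int nr)
{
	demo_start();
	if (nr < 100) {
		/* The early exit must stop the region as well. */
		demo_stop();
		return nr;
	}
	/* ... the real selection work would go here ... */
	demo_stop();
	return nr / 10;
}
```

Dropping the demo_stop() call in the early-exit branch leaves
open_regions at 1 after demo_select(5), which is the kind of imbalance
a later assertion can detect.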

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack-bitmap-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 88d9e696a54..6e110e41ea4 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
+		stop_progress(&writer.progress);
 		return;
 	}
 
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 12/25] progress.c: add & assert a "global_progress" variable
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (10 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-09-16 18:31         ` SZEDER Gábor
  2021-06-23 17:48       ` [PATCH 13/25] progress.[ch]: move the "struct progress" to the header Ævar Arnfjörð Bjarmason
                         ` (13 subsequent siblings)
  25 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The progress.c code makes a hard assumption that only one progress bar
be active at a time (see [1] for a bug where this wasn't the case),
but nothing has asserted that that's the case. Let's add a BUG()
that'll trigger if two progress bars are active at the same time.

There's an alternate test-only approach to doing the same thing[2],
but by doing this for all progress bars we'll have a canary to check
if we have any unexpected interaction between the "sig_atomic_t
progress_update" variable and this global struct.

I am then planning on using this scaffolding in the future to fix a
limitation in the progress output, namely that any update must
pro-actively go through the likes of display_progress().

If we e.g. hang forever before the first display_progress(), or in the
middle of a loop that would call display_progress(), the user will only
see either no output, or output frozen at the last display_progress()
that would have done an update (e.g. in cases where progress_update
was "1" due to an earlier signal).

This change does not fix that, but sets up the structure for solving
that and other related problems by juggling this "global_progress"
struct. Later changes will make more use of the "global_progress" than
only using it for these assertions.
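
As an illustration only, the "at most one active progress bar" guard
can be sketched in isolation as follows. The struct and function names
are hypothetical, and where progress.c calls BUG() this sketch returns
-1 so that the behaviour can be checked without aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for "struct progress". */
struct demo_bar {
	const char *title;
};

/* The single bar that is currently active, or NULL. */
static struct demo_bar *active_bar;

/* Returns 0 on success, -1 if another bar is already active. */
static int demo_bar_start(struct demo_bar *bar)
{
	assert(bar != NULL);
	if (active_bar)
		return -1;
	active_bar = bar;
	return 0;
}

/* Returns 0 on success, -1 if there was no active bar to stop. */
static int demo_bar_stop(void)
{
	if (!active_bar)
		return -1;
	active_bar = NULL;
	return 0;
}
```

Starting "one" and then "two" without a stop in between makes the
second demo_bar_start() fail, which mirrors the "start 0 one" /
"start 0 two" input used in the new test.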

1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 17 +++++++++++++----
 t/t0500-progress-display.sh | 11 +++++++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 912edd4c818..e1b50ef7882 100644
--- a/progress.c
+++ b/progress.c
@@ -45,6 +45,7 @@ struct progress {
 };
 
 static volatile sig_atomic_t progress_update;
+static struct progress *global_progress;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -220,11 +221,15 @@ void progress_test_force_update(void)
 	progress_interval(SIGALRM);
 }
 
-static void set_progress_signal(void)
+static void set_progress_signal(struct progress *progress)
 {
 	struct sigaction sa;
 	struct itimerval v;
 
+	if (global_progress)
+		BUG("should have no global_progress in set_progress_signal()");
+	global_progress = progress;
+
 	if (progress_testing)
 		return;
 
@@ -242,10 +247,14 @@ static void set_progress_signal(void)
 	setitimer(ITIMER_REAL, &v, NULL);
 }
 
-static void clear_progress_signal(void)
+static void clear_progress_signal(struct progress *progress)
 {
 	struct itimerval v = {{0,},};
 
+	if (!global_progress)
+		BUG("should have a global_progress in clear_progress_signal()");
+	global_progress = NULL;
+
 	if (progress_testing)
 		return;
 
@@ -268,7 +277,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
-	set_progress_signal();
+	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
@@ -342,7 +351,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, buf);
 		free(buf);
 	}
-	clear_progress_signal();
+	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 66c1989b176..476a31222a3 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -361,4 +361,15 @@ test_expect_success 'progress generates traces: stop without start' '
 	! grep region_leave.*progress trace-stop.event
 '
 
+test_expect_success 'BUG: start two concurrent progress bars' '
+	cat >in <<-\EOF &&
+	start 0 one
+	start 0 two
+	EOF
+
+	test_must_fail test-tool progress \
+		<in 2>stderr &&
+	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 13/25] progress.[ch]: move the "struct progress" to the header
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (11 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 12/25] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-09-16 19:42         ` SZEDER Gábor
  2021-06-23 17:48       ` [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables Ævar Arnfjörð Bjarmason
                         ` (12 subsequent siblings)
  25 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Move the definition of the "struct progress" to the progress.h
header. Even though its contents are meant to be "private" this
pattern has resulted in forward declarations of it in various places,
as other functions have a need to pass it around.

Let's just define it in the header instead. It's part of our own
internal code, so we're not at much risk of someone tweaking the
internal fields manually. While doing that rename the "TP_IDX_MAX"
macro to the more clearly namespaced "PROGRESS_THROUGHPUT_IDX_MAX".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h             |  1 -
 csum-file.h         |  2 --
 pack.h              |  1 -
 parallel-checkout.h |  1 -
 progress.c          | 29 +----------------------------
 progress.h          | 28 +++++++++++++++++++++++++++-
 reachable.h         |  1 -
 7 files changed, 28 insertions(+), 35 deletions(-)

diff --git a/cache.h b/cache.h
index ba04ff8bd36..7e03a181f68 100644
--- a/cache.h
+++ b/cache.h
@@ -308,7 +308,6 @@ static inline unsigned int canon_mode(unsigned int mode)
 
 struct split_index;
 struct untracked_cache;
-struct progress;
 struct pattern_list;
 
 struct index_state {
diff --git a/csum-file.h b/csum-file.h
index 3044bd19ab6..3de0de653e8 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -3,8 +3,6 @@
 
 #include "hash.h"
 
-struct progress;
-
 /* A SHA1-protected file */
 struct hashfile {
 	int fd;
diff --git a/pack.h b/pack.h
index fa139545262..8df04f4937a 100644
--- a/pack.h
+++ b/pack.h
@@ -77,7 +77,6 @@ struct pack_idx_entry {
 };
 
 
-struct progress;
 /* Note, the data argument could be NULL if object type is blob */
 typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned long, void*, int*);
 
diff --git a/parallel-checkout.h b/parallel-checkout.h
index 80f539bcb77..193f76398d6 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -5,7 +5,6 @@
 
 struct cache_entry;
 struct checkout;
-struct progress;
 
 /****************************************************************
  * Users of parallel checkout
diff --git a/progress.c b/progress.c
index e1b50ef7882..aff9af9ee8b 100644
--- a/progress.c
+++ b/progress.c
@@ -17,33 +17,6 @@
 #include "utf8.h"
 #include "config.h"
 
-#define TP_IDX_MAX      8
-
-struct throughput {
-	off_t curr_total;
-	off_t prev_total;
-	uint64_t prev_ns;
-	unsigned int avg_bytes;
-	unsigned int avg_misecs;
-	unsigned int last_bytes[TP_IDX_MAX];
-	unsigned int last_misecs[TP_IDX_MAX];
-	unsigned int idx;
-	struct strbuf display;
-};
-
-struct progress {
-	const char *title;
-	uint64_t last_value;
-	uint64_t total;
-	unsigned last_percent;
-	unsigned delay;
-	struct throughput *throughput;
-	uint64_t start_ns;
-	struct strbuf counters_sb;
-	int title_len;
-	int split;
-};
-
 static volatile sig_atomic_t progress_update;
 static struct progress *global_progress;
 
@@ -194,7 +167,7 @@ void display_throughput(struct progress *progress, uint64_t total)
 	tp->avg_misecs -= tp->last_misecs[tp->idx];
 	tp->last_bytes[tp->idx] = count;
 	tp->last_misecs[tp->idx] = misecs;
-	tp->idx = (tp->idx + 1) % TP_IDX_MAX;
+	tp->idx = (tp->idx + 1) % PROGRESS_THROUGHPUT_IDX_MAX;
 
 	throughput_string(&tp->display, total, rate);
 	if (progress->last_value != -1 && progress_update)
diff --git a/progress.h b/progress.h
index f1913acf73f..4fb2b483d36 100644
--- a/progress.h
+++ b/progress.h
@@ -1,7 +1,33 @@
 #ifndef PROGRESS_H
 #define PROGRESS_H
+#include "strbuf.h"
 
-struct progress;
+#define PROGRESS_THROUGHPUT_IDX_MAX      8
+
+struct throughput {
+	off_t curr_total;
+	off_t prev_total;
+	uint64_t prev_ns;
+	unsigned int avg_bytes;
+	unsigned int avg_misecs;
+	unsigned int last_bytes[PROGRESS_THROUGHPUT_IDX_MAX];
+	unsigned int last_misecs[PROGRESS_THROUGHPUT_IDX_MAX];
+	unsigned int idx;
+	struct strbuf display;
+};
+
+struct progress {
+	const char *title;
+	uint64_t last_value;
+	uint64_t total;
+	unsigned last_percent;
+	unsigned delay;
+	struct throughput *throughput;
+	uint64_t start_ns;
+	struct strbuf counters_sb;
+	int title_len;
+	int split;
+};
 
 #ifdef GIT_TEST_PROGRESS_ONLY
 
diff --git a/reachable.h b/reachable.h
index 5df932ad8f5..7e1ddddbc63 100644
--- a/reachable.h
+++ b/reachable.h
@@ -1,7 +1,6 @@
 #ifndef REACHEABLE_H
 #define REACHEABLE_H
 
-struct progress;
 struct rev_info;
 
 int add_unseen_recent_objects_to_traversal(struct rev_info *revs,
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (12 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 13/25] progress.[ch]: move the "struct progress" to the header Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 15/25] progress.c: pass "is done?" (again) to display() Ævar Arnfjörð Bjarmason
                         ` (11 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Since the test-only support code was added in 2bb74b53a49 (Test the
progress display, 2019-09-16) we've had to define
GIT_TEST_PROGRESS_ONLY more widely as part of the bugfix in
3cacb9aaf46 (progress.c: silence cgcc suggestion about internal
linkage, 2020-04-27).

So the only thing we were getting out of this indirection was keeping
the test-only declarations out of progress.h unless
GIT_TEST_PROGRESS_ONLY was defined, i.e. so the likes of csum-file.h
wouldn't have access to them; we'd still compile them in progress.o.

Let's just always define and compile them without this needless sleight
of hand; the linking and strip step will take care of removing these
unused symbols, if needed.

We now expose a start_progress_testing() function instead, which'll
set a "test_mode" member, which the rest of the code can check.
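
As an illustration only, moving from file-scope test globals to
per-instance test members can be sketched like this. All names are
hypothetical, the sketch keys off "test_mode" (the patch itself checks
the "test_getnanotime" member), and the real clock is replaced by a
coarse time(NULL) based stand-in rather than git's getnanotime().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical, reduced stand-in for "struct progress". */
struct demo_progress {
	uint64_t start_ns;
	int test_mode;           /* set only by the test helper */
	uint64_t test_ns_offset; /* fake "elapsed time" in test mode */
};

/* Coarse stand-in for a real nanosecond clock. */
static uint64_t coarse_now_ns(void)
{
	return (uint64_t)time(NULL) * 1000000000u;
}

/* In test mode the time is fully determined by the struct itself. */
static uint64_t demo_getnanotime(const struct demo_progress *p)
{
	assert(p != NULL);
	if (p->test_mode)
		return p->start_ns + p->test_ns_offset;
	return coarse_now_ns();
}
```

Because the fake time lives in the struct, a test can drive it per
progress instance without touching any "extern" variable.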

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c               | 34 ++++++++++++++--------------------
 progress.h               | 21 ++++++++++++++-------
 t/helper/test-progress.c | 11 +++++------
 3 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/progress.c b/progress.c
index aff9af9ee8b..39d7f6bd86b 100644
--- a/progress.c
+++ b/progress.c
@@ -8,7 +8,6 @@
  * published by the Free Software Foundation.
  */
 
-#define GIT_TEST_PROGRESS_ONLY
 #include "cache.h"
 #include "gettext.h"
 #include "progress.h"
@@ -20,13 +19,6 @@
 static volatile sig_atomic_t progress_update;
 static struct progress *global_progress;
 
-/*
- * These are only intended for testing the progress output, i.e. exclusively
- * for 'test-tool progress'.
- */
-int progress_testing;
-uint64_t progress_test_ns = 0;
-
 static int is_foreground_fd(int fd)
 {
 	int tpgrp = tcgetpgrp(fd);
@@ -108,8 +100,8 @@ static void throughput_string(struct strbuf *buf, uint64_t total,
 
 static uint64_t progress_getnanotime(struct progress *progress)
 {
-	if (progress_testing)
-		return progress->start_ns + progress_test_ns;
+	if (progress->test_getnanotime)
+		return progress->start_ns + progress->test_getnanotime;
 	else
 		return getnanotime();
 }
@@ -185,11 +177,7 @@ static void progress_interval(int signum)
 	progress_update = 1;
 }
 
-/*
- * The progress_test_force_update() function is intended for testing
- * the progress output, i.e. exclusively for 'test-tool progress'.
- */
-void progress_test_force_update(void)
+void test_progress_force_update(void)
 {
 	progress_interval(SIGALRM);
 }
@@ -203,7 +191,7 @@ static void set_progress_signal(struct progress *progress)
 		BUG("should have no global_progress in set_progress_signal()");
 	global_progress = progress;
 
-	if (progress_testing)
+	if (progress->test_mode)
 		return;
 
 	progress_update = 0;
@@ -228,7 +216,7 @@ static void clear_progress_signal(struct progress *progress)
 		BUG("should have a global_progress in clear_progress_signal()");
 	global_progress = NULL;
 
-	if (progress_testing)
+	if (progress->test_mode)
 		return;
 
 	setitimer(ITIMER_REAL, &v, NULL);
@@ -237,7 +225,7 @@ static void clear_progress_signal(struct progress *progress)
 }
 
 static struct progress *start_progress_delay(const char *title, uint64_t total,
-					     unsigned delay)
+					     unsigned delay, int testing)
 {
 	struct progress *progress = xmalloc(sizeof(*progress));
 	progress->title = title;
@@ -250,11 +238,17 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
+	progress->test_mode = testing;
 	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
 
+struct progress *start_progress_testing(const char *title, uint64_t total)
+{
+	return start_progress_delay(title, total, 0, 1);
+}
+
 static int get_default_delay(void)
 {
 	static int delay_in_secs = -1;
@@ -267,12 +261,12 @@ static int get_default_delay(void)
 
 struct progress *start_delayed_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, get_default_delay());
+	return start_progress_delay(title, total, get_default_delay(), 0);
 }
 
 struct progress *start_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, 0);
+	return start_progress_delay(title, total, 0, 0);
 }
 
 void stop_progress(struct progress **p_progress)
diff --git a/progress.h b/progress.h
index 4fb2b483d36..4693dddb6c5 100644
--- a/progress.h
+++ b/progress.h
@@ -27,15 +27,22 @@ struct progress {
 	struct strbuf counters_sb;
 	int title_len;
 	int split;
-};
-
-#ifdef GIT_TEST_PROGRESS_ONLY
 
-extern int progress_testing;
-extern uint64_t progress_test_ns;
-void progress_test_force_update(void);
+	/*
+	 * The test_* members are only intended for testing the
+	 * progress output, i.e. exclusively for 'test-tool progress'.
+	 */
+	int test_mode;
+	uint64_t test_getnanotime;
+};
 
-#endif
+/*
+ * *_testing() functions are only for use in
+ * t/helper/test-progress.c. Do not use them elsewhere!
+ */
+void test_progress_force_update(void);
+struct progress *start_progress_testing(const char *title, uint64_t total);
+void test_progress_setnanotime(struct progress *progress, uint64_t time);
 
 void display_throughput(struct progress *progress, uint64_t total);
 void display_progress(struct progress *progress, uint64_t n);
diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 7ca58a3ee78..40dbacb0557 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -46,21 +46,20 @@ int cmd__progress(int argc, const char **argv)
 	if (argc)
 		usage_with_options(usage, options);
 
-	progress_testing = 1;
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
 		if (!strcmp(line.buf, "start")) {
-			progress = start_progress(default_title, 0);
+			progress = start_progress_testing(default_title, 0);
 		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
 			uint64_t total = strtoull(end, &end, 10);
 			if (*end == '\0') {
-				progress = start_progress(default_title, total);
+				progress = start_progress_testing(default_title, total);
 			} else if (*end == ' ') {
 				if (detached_title)
 					free(detached_title);
 				detached_title = strbuf_detach(&line, NULL);
-				progress = start_progress(end + 1, total);
+				progress = start_progress_testing(end + 1, total);
 			} else {
 				die("invalid input: '%s'\n", line.buf);
 			}
@@ -79,11 +78,11 @@ int cmd__progress(int argc, const char **argv)
 			test_ms = strtoull(end + 1, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
-			progress_test_ns = test_ms * 1000 * 1000;
+			progress->test_getnanotime = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
 		} else if (!strcmp(line.buf, "update") ||
 			   !strcmp(line.buf, "signal")) {
-			progress_test_force_update();
+			test_progress_force_update();
 		} else if (!strcmp(line.buf, "stop")) {
 			stop_progress(&progress);
 		} else {
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 15/25] progress.c: pass "is done?" (again) to display()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (13 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf" Ævar Arnfjörð Bjarmason
                         ` (10 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Go back to passing an "are we done?" state variable to the display()
function, instead of passing a string that happens to end in a newline
for the ", done\n" special-case in stop_progress().

This doesn't matter now, but is needed to display an arbitrary message
earlier in the progress display, not just at the very end.

In a984a06a07c (nicer display of thin pack completion, 2007-11-08)
this code worked like this, but later on in 42e18fbf5f9 (more compact
progress display, 2007-10-16) we ended up with the "const
char *done". Then in d53ba841d4f (progress: assemble percentage and
counters in a strbuf before printing, 2019-04-05) we ended up with the
current code structure around the "counters_sb" strbuf.

The "counters_sb" is needed because when we emit a line like:

    Title (1/10)<CR>

We need to know how many characters the " (1/10)" variable part is, so
that we'll emit the appropriate number of spaces to "clear" the line.

If we want to emit output like:

    Title (1/10), some message<CR>

We'll need to stick the whole " (1/10), some message" part into the
strbuf, so that if we want to clear the message we'll know to emit:

    Title (1/10), some message<CR>
    Title (2/10)              <CR>

This didn't matter for the ", done\n" case because we were ending the
process anyway, but in preparation for the above let's start treating
it like any other line, and pass an "int last_update" to decide
whether the line ends with a "\r" or a "\n".
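
As an illustration only, the "clear the previous line" arithmetic can
be sketched in isolation as below. It uses the same clear_len
expression and the same "%*s" padding of the end-of-line string as
display(), but the output format is simplified to match the examples
in this message, and demo_clear_len() and demo_format_line() are
hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Field width for the end-of-line string. When the new counters are
 * shorter than the previous ones, the width covers the difference
 * plus the end-of-line character itself.
 */
static size_t demo_clear_len(size_t cur_len, size_t last_len)
{
	return cur_len < last_len ? last_len - cur_len + 1 : 0;
}

/*
 * Format "<title><counters>", then the end-of-line string
 * right-aligned in a field of demo_clear_len() columns, so that any
 * padding spaces come before the "\r" or "\n".
 */
static int demo_format_line(char *out, size_t outsz, const char *title,
			    const char *counters, size_t last_len,
			    int last_update)
{
	const char *eol = last_update ? "\n" : "\r";
	size_t pad = demo_clear_len(strlen(counters), last_len);

	assert(out != NULL && outsz > 0);
	return snprintf(out, outsz, "%s%s%*s", title, counters,
			(int)pad, eol);
}
```

For the example above, going from " (1/10), some message" (21
characters) to " (2/10)" (7 characters) gives a field width of 15,
i.e. 14 spaces followed by the "\r", so the new line is exactly as
long as the one it overwrites.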

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/progress.c b/progress.c
index 39d7f6bd86b..44479f65921 100644
--- a/progress.c
+++ b/progress.c
@@ -25,7 +25,8 @@ static int is_foreground_fd(int fd)
 	return tpgrp < 0 || tpgrp == getpgid(0);
 }
 
-static void display(struct progress *progress, uint64_t n, const char *done)
+static void display(struct progress *progress, uint64_t n,
+		    const char *update_msg, int last_update)
 {
 	const char *tp;
 	struct strbuf *counters_sb = &progress->counters_sb;
@@ -55,10 +56,13 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 		show_update = 1;
 	}
 
+	if (show_update && update_msg)
+		strbuf_addf(counters_sb, ", %s.", update_msg);
+
 	if (show_update) {
 		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
-		if (stderr_is_foreground_fd || done) {
-			const char *eol = done ? done : "\r";
+		if (stderr_is_foreground_fd || update_msg) {
+			const char *eol = last_update ? "\n" : "\r";
 			size_t clear_len = counters_sb->len < last_count_len ?
 					last_count_len - counters_sb->len + 1 :
 					0;
@@ -70,7 +74,7 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 			if (progress->split) {
 				fprintf(stderr, "  %s%*s", counters_sb->buf,
 					(int) clear_len, eol);
-			} else if (!done && cols < progress_line_len) {
+			} else if (!update_msg && cols < progress_line_len) {
 				clear_len = progress->title_len + 1 < cols ?
 					    cols - progress->title_len - 1 : 0;
 				fprintf(stderr, "%s:%*s\n  %s%s",
@@ -163,13 +167,13 @@ void display_throughput(struct progress *progress, uint64_t total)
 
 	throughput_string(&tp->display, total, rate);
 	if (progress->last_value != -1 && progress_update)
-		display(progress, progress->last_value, NULL);
+		display(progress, progress->last_value, NULL, 0);
 }
 
 void display_progress(struct progress *progress, uint64_t n)
 {
 	if (progress)
-		display(progress, n, NULL);
+		display(progress, n, NULL, 0);
 }
 
 static void progress_interval(int signum)
@@ -303,7 +307,6 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	*p_progress = NULL;
 	if (progress->last_value != -1) {
 		/* Force the last update */
-		char *buf;
 		struct throughput *tp = progress->throughput;
 
 		if (tp) {
@@ -314,9 +317,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 			throughput_string(&tp->display, tp->curr_total, rate);
 		}
 		progress_update = 1;
-		buf = xstrfmt(", %s.\n", msg);
-		display(progress, progress->last_value, buf);
-		free(buf);
+		display(progress, progress->last_value, msg, 1);
 	}
 	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf"
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (14 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 15/25] progress.c: pass "is done?" (again) to display() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug Ævar Arnfjörð Bjarmason
                         ` (9 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Convert the "title" for the progress bar to a "struct strbuf", as with
the existing "counters_sb". Let's also rename the "counters_sb" to
merely "status", as we'll soon start using it not just to count, but
for any other arbitrary messaging after our fixed "title".

This makes emitting the output more consistent, and allows us to
have both a UTF-8 progress bar, and a "status" portion. We won't be
making use of the latter just yet, but let's not close the door to it
by relying on a strbuf with a len for one, and a char * for the other.
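
To illustrate why a byte length and a display width need to be
tracked separately, here is a minimal sketch. Both functions are
hypothetical stand-ins, not git's utf8_strwidth(); they assume valid
UTF-8 and exactly one terminal column per code point, which does not
hold for wide or combining characters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a display-width function: count UTF-8
 * code points by skipping continuation bytes (10xxxxxx).
 */
static size_t codepoint_count(const char *s)
{
	size_t n = 0;

	assert(s);
	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}

/*
 * Number of columns by which padding computed from the byte length
 * would be off for the string "s".
 */
static size_t byte_len_padding_error(const char *s)
{
	return strlen(s) - codepoint_count(s);
}
```

For an ASCII-only title the two lengths agree; every multi-byte
character in the title or status makes byte-based padding one or more
columns too long.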

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 63 ++++++++++++++++++++++++++++++++----------------------
 progress.h |  9 +++++---
 2 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/progress.c b/progress.c
index 44479f65921..e17490964c4 100644
--- a/progress.c
+++ b/progress.c
@@ -29,9 +29,8 @@ static void display(struct progress *progress, uint64_t n,
 		    const char *update_msg, int last_update)
 {
 	const char *tp;
-	struct strbuf *counters_sb = &progress->counters_sb;
 	int show_update = 0;
-	int last_count_len = counters_sb->len;
+	size_t last_count_len = progress->status_len_utf8;
 
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
@@ -43,47 +42,57 @@ static void display(struct progress *progress, uint64_t n,
 		if (percent != progress->last_percent || progress_update) {
 			progress->last_percent = percent;
 
-			strbuf_reset(counters_sb);
-			strbuf_addf(counters_sb,
+			strbuf_reset(&progress->status);
+			strbuf_addf(&progress->status,
 				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
 				    (uintmax_t)n, (uintmax_t)progress->total,
 				    tp);
 			show_update = 1;
 		}
 	} else if (progress_update) {
-		strbuf_reset(counters_sb);
-		strbuf_addf(counters_sb, "%"PRIuMAX"%s", (uintmax_t)n, tp);
+		strbuf_reset(&progress->status);
+		strbuf_addf(&progress->status, "%"PRIuMAX"%s", (uintmax_t)n, tp);
 		show_update = 1;
 	}
 
 	if (show_update && update_msg)
-		strbuf_addf(counters_sb, ", %s.", update_msg);
+		strbuf_addf(&progress->status, ", %s.", update_msg);
 
 	if (show_update) {
 		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
 		if (stderr_is_foreground_fd || update_msg) {
 			const char *eol = last_update ? "\n" : "\r";
-			size_t clear_len = counters_sb->len < last_count_len ?
-					last_count_len - counters_sb->len + 1 :
+			size_t clear_len = progress->status.len < last_count_len ?
+					last_count_len - progress->status.len + 1 :
 					0;
 			/* The "+ 2" accounts for the ": ". */
-			size_t progress_line_len = progress->title_len +
-						counters_sb->len + 2;
+			size_t progress_line_len = progress->title_len_utf8 +
+						progress->status.len + 2;
 			int cols = term_columns();
+			progress->status_len_utf8 = utf8_strwidth(progress->status.buf);
 
 			if (progress->split) {
-				fprintf(stderr, "  %s%*s", counters_sb->buf,
-					(int) clear_len, eol);
+				fprintf(stderr, "  %*s%*s",
+					(int)progress->status_len_utf8,
+					progress->status.buf,
+					(int)clear_len, eol);
 			} else if (!update_msg && cols < progress_line_len) {
-				clear_len = progress->title_len + 1 < cols ?
-					    cols - progress->title_len - 1 : 0;
-				fprintf(stderr, "%s:%*s\n  %s%s",
-					progress->title, (int) clear_len, "",
-					counters_sb->buf, eol);
+				clear_len = progress->title_len_utf8 + 1 < cols ?
+					    cols - progress->title_len_utf8 - 1 : 0;
+				fprintf(stderr, "%*s:%*s\n  %*s%s",
+					(int)progress->title_len_utf8,
+					progress->title.buf,
+					(int)clear_len, "",
+					(int)progress->status_len_utf8,
+					progress->status.buf, eol);
 				progress->split = 1;
 			} else {
-				fprintf(stderr, "%s: %s%*s", progress->title,
-					counters_sb->buf, (int) clear_len, eol);
+				fprintf(stderr, "%*s: %*s%*s",
+					(int)progress->title_len_utf8,
+					progress->title.buf,
+					(int)progress->status_len_utf8,
+					progress->status.buf,
+					(int)clear_len, eol);
 			}
 			if (stderr_is_foreground_fd)
 				fflush(stderr);
@@ -232,15 +241,18 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, int testing)
 {
 	struct progress *progress = xmalloc(sizeof(*progress));
-	progress->title = title;
+	strbuf_init(&progress->title, 0);
+	strbuf_addstr(&progress->title, title);
+	progress->title_len_utf8 = utf8_strwidth(title);
+	strbuf_init(&progress->status, 0);
+	progress->status_len_utf8 = 0;
+
 	progress->total = total;
 	progress->last_value = -1;
 	progress->last_percent = -1;
 	progress->delay = delay;
 	progress->throughput = NULL;
 	progress->start_ns = getnanotime();
-	strbuf_init(&progress->counters_sb, 0);
-	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
 	progress->test_mode = testing;
 	set_progress_signal(progress);
@@ -288,7 +300,7 @@ void stop_progress(struct progress **p_progress)
 					   "total_bytes",
 					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", progress->title, the_repository);
+		trace2_region_leave("progress", progress->title.buf, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
@@ -320,7 +332,8 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, msg, 1);
 	}
 	clear_progress_signal(progress);
-	strbuf_release(&progress->counters_sb);
+	strbuf_release(&progress->title);
+	strbuf_release(&progress->status);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
 	free(progress->throughput);
diff --git a/progress.h b/progress.h
index 4693dddb6c5..ba38447d104 100644
--- a/progress.h
+++ b/progress.h
@@ -17,15 +17,18 @@ struct throughput {
 };
 
 struct progress {
-	const char *title;
+	struct strbuf title;
+	size_t title_len_utf8;
+
+	struct strbuf status;
+	size_t status_len_utf8;
+
 	uint64_t last_value;
 	uint64_t total;
 	unsigned last_percent;
 	unsigned delay;
 	struct throughput *throughput;
 	uint64_t start_ns;
-	struct strbuf counters_sb;
-	int title_len;
 	int split;
 
 	/*
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (15 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf" Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 18/25] progress.c: emit progress on first signal, show "stalled" Ævar Arnfjörð Bjarmason
                         ` (8 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

As tested for in 2bb74b53a49 (Test the progress display, 2019-09-16)
we would redundantly emit extra spaces to clear output we never
emitted under the split mode. Now we'll always clear precisely as many
columns as we need, and no more.

The root cause of that issue is that since the progress code was
originally written we've grown support for various new features, and
ended up with a function where we didn't build the output we were
about to emit once, and then emitted it.

We thus couldn't easily track the length of the output we really did
emit, with everything going downhill from there.

The alternative approach is longer (largely due to added comments),
but I think much clearer.

We no longer rely on magic constants like "2" for ": " or "  "
(although we do still rely on the two separators being the same
length, but now have a related BUG(...) assertion).

We don't update "status_len_utf8" (or rather, the now-gone
"last_count_len") or "progress->last_value" until after we've emitted
all the output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 137 +++++++++++++++++++++++++++---------
 t/t0500-progress-display.sh |   8 +--
 2 files changed, 104 insertions(+), 41 deletions(-)

diff --git a/progress.c b/progress.c
index e17490964c4..6c4038df791 100644
--- a/progress.c
+++ b/progress.c
@@ -25,17 +25,24 @@ static int is_foreground_fd(int fd)
 	return tpgrp < 0 || tpgrp == getpgid(0);
 }
 
+static const char *counter_prefix(int split)
+{
+	switch (split) {
+	case 1: return "  ";
+	case 0: return ": ";
+	default: BUG("unknown split value");
+	}
+}
+
 static void display(struct progress *progress, uint64_t n,
 		    const char *update_msg, int last_update)
 {
 	const char *tp;
 	int show_update = 0;
-	size_t last_count_len = progress->status_len_utf8;
 
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
-	progress->last_value = n;
 	tp = (progress->throughput) ? progress->throughput->display.buf : "";
 	if (progress->total) {
 		unsigned percent = n * 100 / progress->total;
@@ -44,61 +51,121 @@ static void display(struct progress *progress, uint64_t n,
 
 			strbuf_reset(&progress->status);
 			strbuf_addf(&progress->status,
-				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
+				    "%s%3u%% (%"PRIuMAX"/%"PRIuMAX")%s",
+				    counter_prefix(progress->split), percent,
 				    (uintmax_t)n, (uintmax_t)progress->total,
 				    tp);
 			show_update = 1;
 		}
 	} else if (progress_update) {
 		strbuf_reset(&progress->status);
-		strbuf_addf(&progress->status, "%"PRIuMAX"%s", (uintmax_t)n, tp);
+		strbuf_addf(&progress->status, "%s%"PRIuMAX"%s", counter_prefix(progress->split),
+			    (uintmax_t)n, tp);
 		show_update = 1;
 	}
 
 	if (show_update && update_msg)
-		strbuf_addf(&progress->status, ", %s.", update_msg);
+		strbuf_addstr(&progress->status, update_msg);
 
 	if (show_update) {
 		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
 		if (stderr_is_foreground_fd || update_msg) {
 			const char *eol = last_update ? "\n" : "\r";
-			size_t clear_len = progress->status.len < last_count_len ?
-					last_count_len - progress->status.len + 1 :
-					0;
-			/* The "+ 2" accounts for the ": ". */
-			size_t progress_line_len = progress->title_len_utf8 +
-						progress->status.len + 2;
-			int cols = term_columns();
-			progress->status_len_utf8 = utf8_strwidth(progress->status.buf);
-
-			if (progress->split) {
-				fprintf(stderr, "  %*s%*s",
-					(int)progress->status_len_utf8,
-					progress->status.buf,
-					(int)clear_len, eol);
-			} else if (!update_msg && cols < progress_line_len) {
-				clear_len = progress->title_len_utf8 + 1 < cols ?
-					    cols - progress->title_len_utf8 - 1 : 0;
-				fprintf(stderr, "%*s:%*s\n  %*s%s",
-					(int)progress->title_len_utf8,
-					progress->title.buf,
-					(int)clear_len, "",
-					(int)progress->status_len_utf8,
-					progress->status.buf, eol);
+			size_t status_len_utf8 = utf8_strwidth(progress->status.buf);
+			size_t progress_line_len = progress->title_len_utf8 + status_len_utf8;
+
+			/*
+			 * We're back at the beginning, so we'll
+			 * always print out the title, unless we're
+			 * already split, then the title is on an
+			 * earlier line.
+			 */
+			if (!progress->split)
+				fprintf(stderr, "%*s",
+					(int)(progress->title_len_utf8),
+					progress->title.buf);
+
+			/*
+			 * Did the user resize the terminal and we're
+			 * splitting this progress bar? Clear previous
+			 * ": (X/Y) [msg]"
+			 */
+			if (!progress->split &&
+			    term_columns() < progress_line_len) {
+				const char *split_prefix = counter_prefix(0);
+				const char *unsplit_prefix = counter_prefix(1);
+				const char *split_colon = ":";
 				progress->split = 1;
+
+				if (progress->last_value == -1) {
+					/*
+					 * We've got no previous
+					 * output whatsoever, so we
+					 * were "always split". No
+					 * previous status output to
+					 * erase.
+					 */
+					fprintf(stderr, "%s\n", split_colon);
+				} else {
+					const char *split_colon = ":";
+					const size_t split_colon_len = strlen(split_colon);
+
+					/*
+					 * Erase whatever we had, adding a
+					 * trailing ":" (not ": ") to indicate
+					 * the progress on the next line.
+					 */
+					fprintf(stderr, "%s%*s\n", split_colon,
+						(int)(progress->status_len_utf8 - split_colon_len),
+						"");
+				}
+
+				/*
+				 * For the one-off switching from
+				 * "!progress->split" to
+				 * "progress->split" fake up the
+				 * expected strbuf and replace the ":
+				 * " with a " ".
+				 *
+				 * The length of the two delimiters
+				 * must be the same for this trick to
+				 * work.
+				 */
+				if (!starts_with(progress->status.buf, split_prefix))
+					BUG("switching from already true split mode to split mode?");
+
+				strbuf_splice(&progress->status, 0,
+					      strlen(split_prefix),
+					      unsplit_prefix,
+					      strlen(unsplit_prefix));
+
+				fprintf(stderr, "%*s%s", (int)status_len_utf8,
+					progress->status.buf, eol);
 			} else {
-				fprintf(stderr, "%*s: %*s%*s",
-					(int)progress->title_len_utf8,
-					progress->title.buf,
-					(int)progress->status_len_utf8,
-					progress->status.buf,
-					(int)clear_len, eol);
+				/*
+				 * Our current
+				 * message may be larger or smaller than the
+				 * last one. Either the progress bar went
+				 * backards (smaller numbers), or we went back
+				 * and forth with a status message.
+				 */
+				size_t clear_len = progress->status_len_utf8 > status_len_utf8
+					? progress->status_len_utf8 - status_len_utf8
+					: 0;
+				fprintf(stderr, "%*s%*s%s",
+					(int) status_len_utf8, progress->status.buf,
+					(int) clear_len, "",
+					eol);
 			}
+			progress->status_len_utf8 = status_len_utf8;
+
 			if (stderr_is_foreground_fd)
 				fflush(stderr);
 		}
 		progress_update = 0;
 	}
+	progress->last_value = n;
+
 }
 
 static void throughput_string(struct strbuf *buf, uint64_t total,
@@ -303,7 +370,7 @@ void stop_progress(struct progress **p_progress)
 		trace2_region_leave("progress", progress->title.buf, the_repository);
 	}
 
-	stop_progress_msg(p_progress, _("done"));
+	stop_progress_msg(p_progress, _(", done."));
 }
 
 void stop_progress_msg(struct progress **p_progress, const char *msg)
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 476a31222a3..883e044fe64 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -85,12 +85,10 @@ EOF
 '
 
 test_expect_success 'progress display breaks long lines #2' '
-	# Note: we do not need that many spaces after the title to cover up
-	# the last line before breaking the progress line.
 	sed -e "s/Z$//" >expect <<\EOF &&
 Working hard.......2.........3.........4.........5.........6:   0% (1/100000)<CR>
 Working hard.......2.........3.........4.........5.........6:   0% (2/100000)<CR>
-Working hard.......2.........3.........4.........5.........6:                   Z
+Working hard.......2.........3.........4.........5.........6:                Z
    10% (10000/100000)<CR>
   100% (100000/100000)<CR>
   100% (100000/100000), done.
@@ -112,10 +110,8 @@ EOF
 '
 
 test_expect_success 'progress display breaks long lines #3 - even the first is too long' '
-	# Note: we do not actually need any spaces at the end of the title
-	# line, because there is no previous progress line to cover up.
 	sed -e "s/Z$//" >expect <<\EOF &&
-Working hard.......2.........3.........4.........5.........6:                   Z
+Working hard.......2.........3.........4.........5.........6:
    25% (25000/100000)<CR>
    50% (50000/100000)<CR>
    75% (75000/100000)<CR>
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 18/25] progress.c: emit progress on first signal, show "stalled"
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (16 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-09-16 18:37         ` SZEDER Gábor
  2021-06-23 17:48       ` [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
                         ` (7 subsequent siblings)
  25 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Ever since the progress.c code was added in 96a02f8f6d2 (common
progress display support, 2007-04-18) we have been driven purely by
calls to the display() function (via the public display_progress()),
or via stop_progress(). Even though we got a signal and invoked
progress_interval() that function would not actually emit progress
output for us.

Thus in cases like "git gc" we didn't emit any "Enumerating Objects"
output until we got past the setup code and started enumerating
objects. We'll now (at least on my laptop) show output earlier, and
emit a "stalled" message before we start the count.

But more generally, this is a first step towards never showing a
hanging progress bar from the user's perspective. If we're truly
taking a very long time with one item we can show some spinner that we
update every time we get a signal. We don't right now, and only
special-case the most common case of hanging before we get to the
first item.
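
The special case can be sketched as a pure formatting function. This
is an illustration only: format_first_signal() and NO_VALUE_YET are
invented names, and it covers only the case without a total:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Sentinel meaning "display_progress() has not been called yet". */
#define NO_VALUE_YET ((uint64_t)-1)

/*
 * Hypothetical helper: decide what to emit when the timer signal
 * fires. If no progress value has been displayed so far, write a
 * "0, stalled." line into "out" and return 1. Otherwise leave "out"
 * alone and return 0, since the regular updates already cover it.
 */
static int format_first_signal(char *out, size_t out_size,
			       const char *title, uint64_t last_value)
{
	int n;

	if (last_value != NO_VALUE_YET)
		return 0;

	n = snprintf(out, out_size, "%s: 0, stalled.\r", title);
	assert(n >= 0 && (size_t)n < out_size);
	return 1;
}
```

The "\r" terminator matters here: the next regular update starts at
column 0 again and has to clear the longer "stalled" text, as the
updated test expectations show.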

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  |  7 +++++
 t/t0500-progress-display.sh | 63 ++++++++++++++++++++++++++++++++++---
 2 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 6c4038df791..35847d3a7f2 100644
--- a/progress.c
+++ b/progress.c
@@ -255,6 +255,13 @@ void display_progress(struct progress *progress, uint64_t n)
 static void progress_interval(int signum)
 {
 	progress_update = 1;
+
+	if (global_progress->last_value != -1)
+		return;
+
+	display(global_progress, 0, _(", stalled."), 0);
+	progress_update = 1;
+	return;
 }
 
 void test_progress_force_update(void)
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 883e044fe64..bc458cfc28b 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -15,7 +15,8 @@ test_expect_success 'setup COLUMNS' '
 
 test_expect_success 'simple progress display' '
 	cat >expect <<-\EOF &&
-	Working hard: 1<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 1          <CR>
 	Working hard: 2<CR>
 	Working hard: 5<CR>
 	Working hard: 5, done.
@@ -60,6 +61,57 @@ test_expect_success 'progress display with total' '
 	test_cmp expect out
 '
 
+test_expect_success 'stalled progress display' '
+	cat >expect <<-\EOF &&
+	Working hard:   0% (0/3), stalled.<CR>
+	Working hard:  33% (1/3)          <CR>
+	Working hard:  66% (2/3)<CR>
+	Working hard: 100% (3/3)<CR>
+	Working hard: 100% (3/3), done.
+	EOF
+
+	cat >in <<-\EOF &&
+	start 3
+	signal
+	signal
+	signal
+	progress 1
+	signal
+	update
+	signal
+	progress 2
+	update
+	progress 3
+	stop
+	EOF
+	STALLED=1 test-tool progress <in 2>stderr &&
+
+	show_cr <stderr >out &&
+	test_cmp expect out
+'
+
+test_expect_success 'progress display breaks long lines #0, stalled' '
+	sed -e "s/Z$//" >expect <<\EOF &&
+Working hard.......2.........3.........4.........5.........6.........7:
+    0% (0/100), stalled.<CR>
+    1% (1/100)          <CR>
+   50% (50/100)<CR>
+   50% (50/100), done.
+EOF
+
+	cat >in <<-\EOF &&
+	start 100 Working hard.......2.........3.........4.........5.........6.........7
+	signal
+	progress 1
+	progress 50
+	stop
+	EOF
+	test-tool progress <in 2>stderr &&
+
+	show_cr <stderr >out &&
+	test_cmp expect out
+'
+
 test_expect_success 'progress display breaks long lines #1' '
 	sed -e "s/Z$//" >expect <<\EOF &&
 Working hard.......2.........3.........4.........5.........6:   0% (100/100000)<CR>
@@ -183,7 +235,8 @@ test_expect_success 'progress shortens - crazy caller' '
 
 test_expect_success 'progress display with throughput' '
 	cat >expect <<-\EOF &&
-	Working hard: 10<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 10         <CR>
 	Working hard: 20, 200.00 KiB | 100.00 KiB/s<CR>
 	Working hard: 30, 300.00 KiB | 100.00 KiB/s<CR>
 	Working hard: 40, 400.00 KiB | 100.00 KiB/s<CR>
@@ -241,7 +294,8 @@ test_expect_success 'progress display with throughput and total' '
 
 test_expect_success 'cover up after throughput shortens' '
 	cat >expect <<-\EOF &&
-	Working hard: 1<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 1          <CR>
 	Working hard: 2, 800.00 KiB | 400.00 KiB/s<CR>
 	Working hard: 3, 1.17 MiB | 400.00 KiB/s  <CR>
 	Working hard: 4, 1.56 MiB | 400.00 KiB/s<CR>
@@ -272,7 +326,8 @@ test_expect_success 'cover up after throughput shortens' '
 
 test_expect_success 'cover up after throughput shortens a lot' '
 	cat >expect <<-\EOF &&
-	Working hard: 1<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 1          <CR>
 	Working hard: 2, 1000.00 KiB | 1000.00 KiB/s<CR>
 	Working hard: 3, 3.00 MiB | 1.50 MiB/s      <CR>
 	Working hard: 3, 3.00 MiB | 1024.00 KiB/s, done.
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (17 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 18/25] progress.c: emit progress on first signal, show "stalled" Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 20/25] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
                         ` (6 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N.  This causes the failures of the tests
'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

Fix this by passing 'i + 1' to display_progress(), like most other
callsites do.
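
The off-by-one is easy to reproduce in isolation. The sketch below
uses a made-up "struct fake_progress" in place of the real progress
API and only records the last reported value:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct progress": remembers the last value. */
struct fake_progress {
	size_t total;
	size_t last;
};

static void fake_display_progress(struct fake_progress *p, size_t n)
{
	assert(n <= p->total);
	p->last = n;
}

/*
 * Iterate over "total" items and return the final counter value.
 * With "one_based" unset the loop variable is passed as-is (the bug),
 * otherwise "i + 1" is passed (the fix).
 */
static size_t scan_items(struct fake_progress *p, int one_based)
{
	size_t i;

	for (i = 0; i < p->total; i++)
		fake_display_progress(p, one_based ? i + 1 : i);
	return p->last;
}
```

With a total of 6, passing the loop variable as-is ends at 5, which is
the "(5/6), done." line quoted above; passing "i + 1" ends at 6.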

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 2bcb4e0f89e..3181906368d 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 20/25] midx: don't provide a total for QSORT() progress
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (18 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
                         ` (5 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The quicksort algorithm can be anywhere between O(n) and O(n^2), so
providing a "num objects" as a total means that in some cases we're
going to go past 100%.

This fixes a logic error in 5ae18df9d8e (midx: during verify group
objects by packfile to speed verification, 2019-03-21), which in turn
seems to have been diligently copied from my own logic error in the
commit-graph.c code, see 890226ccb57 (commit-graph write: add
itermediate progress, 2019-01-19).

That commit-graph code of mine was removed in
1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
2020-12-07), so we don't need to fix that too.
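
To see why the number of objects is not a usable total, one can count
how often the comparison function is called. The sketch below uses the
C library's qsort() as a stand-in for QSORT(); count_sort_compares()
and cmp_int() are made-up names:

```c
#include <assert.h>
#include <stdlib.h>

/* Number of comparator calls made by the last sort. */
static size_t nr_compares;

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	nr_compares++;
	return (x > y) - (x < y);
}

/* Sort "v" in place and return how many comparisons it took. */
static size_t count_sort_compares(int *v, size_t n)
{
	assert(v || !n);
	nr_compares = 0;
	qsort(v, n, sizeof(*v), cmp_int);
	return nr_compares;
}
```

The exact count depends on the input and on the libc's sort
implementation, so it cannot be predicted from "n" alone. A progress
counter driven by comparisons would need a total that is only known
once the sort has finished.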

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/midx.c b/midx.c
index d80e68998b8..9f1b4018c1c 100644
--- a/midx.c
+++ b/midx.c
@@ -1265,8 +1265,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	}
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_progress(_("Sorting objects by packfile"),
-					  m->num_objects);
+		progress = start_progress(_("Sorting objects by packfile"), 0);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (19 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 20/25] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
                         ` (4 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    This would cause a BUG() in an upcoming change that adds an
    assertion checking if the "total" at the end matches the last
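    progress bar update.

    The tests 'invalid file in delayed checkout' and 'missing file in
    delayed checkout' in 't0021-conversion.sh' would trigger that
    BUG(), because both use only one filter.  (The test 'delayed
    checkout in process filter' uses two filters but the first one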
    does all the work, so that test already happens to succeed even
    with such an assertion.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
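
The difference between the two counting schemes can be sketched
without any of the checkout machinery. Both functions below are
illustrative only; they take a made-up array with the number of paths
each filter handles and return the last value that would have been
passed to display_progress():

```c
#include <assert.h>
#include <stddef.h>

/*
 * Old scheme: one update per filter, made before its paths are
 * processed, so the work of the last filter is never counted.
 */
static size_t final_count_per_filter(const size_t *paths_per_filter,
				     size_t nr_filters)
{
	size_t done = 0, shown = 0, i;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++) {
		shown = done;
		done += paths_per_filter[i];
	}
	return shown;
}

/* New scheme: a dedicated counter, incremented for each processed path. */
static size_t final_count_per_path(const size_t *paths_per_filter,
				   size_t nr_filters)
{
	size_t shown = 0, i, j;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++)
		for (j = 0; j < paths_per_filter[i]; j++)
			shown++;
	return shown;
}
```

With a single filter handling 4 paths the old scheme ends at 0 and the
new one at 4. With two filters where the first one does all the work
(4 and 0 paths) both end at 4, which matches the note above about the
'delayed checkout in process filter' test.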

After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the future BUG() assertion
discussed above but the 'missing file in delayed checkout' test would
still fail, because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all (this will be fixed
in a subsequent commit).
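
To illustrate the difference, here is a minimal, self-contained C
sketch.  It is not the code in entry.c: the function names and the
"paths per filter" array are made up for the example, and each
function merely returns the last value that would have been handed to
display_progress().

```c
#include <stddef.h>

/*
 * Hypothetical model, not git's code: filters[i] holds the number of
 * paths handled by filter i.
 */

/* Old scheme: one update per filter, made before that filter's work. */
unsigned last_count_per_filter(const unsigned *filters, size_t nr)
{
	unsigned done = 0, last = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		last = done;		/* display_progress(p, done) */
		done += filters[i];	/* paths are checked out afterwards */
	}
	return last;
}

/* New scheme: one update per processed path. */
unsigned last_count_per_path(const unsigned *filters, size_t nr)
{
	unsigned processed = 0, last = 0;
	size_t i;
	unsigned j;

	for (i = 0; i < nr; i++)
		for (j = 0; j < filters[i]; j++)
			last = ++processed;	/* display_progress(p, ++processed) */
	return last;
}
```

With a single filter that handles 4 paths, the first function ends at
0 and the second at 4; with two filters handling 3 and 2 paths, they
end at 3 and 5 respectively.  A filter that processes no paths leaves
both at 0, which is the 'missing file in delayed checkout' situation.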

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 711ee0693c7..bc4b8fcc980 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.32.0.599.g3967b4fa4ac


* [PATCH 22/25] progress.c: add a stop_progress_early() function
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (20 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
  2021-06-25  1:24         ` Andrei Rybak
  2021-06-23 17:48       ` [PATCH 23/25] entry: deal with unexpected "Filtering content" total Ævar Arnfjörð Bjarmason
                         ` (3 subsequent siblings)
  25 siblings, 2 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

In cases where we error out during processing or otherwise miss our
initial "total" estimate we'll still want to show a "done" message and
end our trace2 region, but it won't be true that our total ==
last_update at the end.

So let's add a "last_update" field and this stop_progress_early()
function to handle that edge case.  This will be used in a subsequent
commit.

We could also use a total=0 in such cases, but that would make the
progress output worse for the common non-erroring case. Let's instead
note that we didn't reach the total count, and snap the progress bar
to "100%, done" at the end.
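
As a rough sketch of the intended behavior, assuming the message is
meant to read "count reached, then count expected": the struct and
function below are hypothetical stand-ins written for this example,
not the progress.c API, and they only model the message formatting
and the snapping of "total".

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical stand-in for the two fields of "struct progress" used here. */
struct progress_model {
	uint64_t total;		/* estimate given when the progress was started */
	uint64_t last_update;	/* last value passed to display_progress() */
};

/*
 * Model of stopping early: format a note with the count that was
 * reached and the count that was expected, then snap "total" to the
 * reached count so that the final line reads as 100%.
 * Returns the snprintf() result.
 */
int stop_early_model(struct progress_model *p, char *buf, size_t len)
{
	int n = snprintf(buf, len,
			 ", done at %" PRIuMAX " items, expected %" PRIuMAX ".",
			 (uintmax_t)p->last_update, (uintmax_t)p->total);

	p->total = p->last_update;
	return n;
}
```

For a progress started with a total of 10 that only got to 7, the
buffer would hold ", done at 7 items, expected 10." and the total
would be 7 afterwards.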

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 20 ++++++++++++++++++++
 progress.h |  2 ++
 2 files changed, 22 insertions(+)

diff --git a/progress.c b/progress.c
index 35847d3a7f2..c1cb01ba975 100644
--- a/progress.c
+++ b/progress.c
@@ -40,6 +40,8 @@ static void display(struct progress *progress, uint64_t n,
 	const char *tp;
 	int show_update = 0;
 
+	progress->last_update = n;
+
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
@@ -413,3 +415,21 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	free(progress->throughput);
 	free(progress);
 }
+
+void stop_progress_early(struct progress **p_progress)
+{
+	struct progress *progress;
+	struct strbuf sb = STRBUF_INIT;
+
+	if (!p_progress)
+		BUG("don't provide NULL to stop_progress_early");
+	progress = *p_progress;
+	if (!progress)
+		return;
+
+	strbuf_addf(&sb, _(", done at %"PRIuMAX" items, expected %"PRIuMAX"."),
+		    (uintmax_t)progress->last_update, (uintmax_t)progress->total);
+	progress->total = progress->last_update;
+	stop_progress_msg(p_progress, sb.buf);
+	strbuf_release(&sb);
+}
diff --git a/progress.h b/progress.h
index ba38447d104..5c5d027d1a0 100644
--- a/progress.h
+++ b/progress.h
@@ -23,6 +23,7 @@ struct progress {
 	struct strbuf status;
 	size_t status_len_utf8;
 
+	uint64_t last_update;
 	uint64_t last_value;
 	uint64_t total;
 	unsigned last_percent;
@@ -56,5 +57,6 @@ struct progress *start_delayed_sparse_progress(const char *title,
 					       uint64_t total);
 void stop_progress(struct progress **progress);
 void stop_progress_msg(struct progress **progress, const char *msg);
+void stop_progress_early(struct progress **p_progress);
 
 #endif
-- 
2.32.0.599.g3967b4fa4ac


* [PATCH 23/25] entry: deal with unexpected "Filtering content" total
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (21 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [RFC/PATCH 24/25] progress: assert last update in stop_progress() Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The "Filtering content" end total does not match the expected total in
cases such as the 'missing file in delayed checkout' test in
't0021-conversion.sh'.

If we encounter errors we can't accurately estimate the end state of
the progress bar. This is because the test involves a purposefully
buggy filter process that doesn't process any paths, so the progress
counter doesn't have a chance to reach the expected total.

See the preceding commit for why we'd want a stop_progress_early() in
this case, as opposed to leaking memory here, or not providing a
"total" estimate to begin with.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/entry.c b/entry.c
index bc4b8fcc980..e79a13daa51 100644
--- a/entry.c
+++ b/entry.c
@@ -232,7 +232,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		}
 		string_list_remove_empty_items(&dco->filters, 0);
 	}
-	stop_progress(&progress);
+	if (!errs && !dco->paths.nr)
+		stop_progress(&progress);
+	else
+		stop_progress_early(&progress);
 	string_list_clear(&dco->filters, 0);
 
 	/* At this point we should not have any delayed paths anymore. */
-- 
2.32.0.599.g3967b4fa4ac


* [RFC/PATCH 24/25] progress: assert last update in stop_progress()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (22 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 23/25] entry: deal with unexpected "Filtering content" total Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [RFC/PATCH 25/25] progress: assert counting upwards in display() Ævar Arnfjörð Bjarmason
  2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

We had to fix a couple of buggy progress lines in the past, where the
progress counter's final value didn't match the expected total [1],
e.g.:

  Expanding reachable commits in commit graph: 138606% (824706/595), done.
  Writing out commit graph in 3 passes: 166% (4187845/2512707), done.

Let's do better, and, instead of waiting for someone to notice such
issues by mere chance, start verifying progress counters in the test
suite. Let's track what the last display_progress() value was, and if
it doesn't match the total at the end invoke BUG().

We need to introduce a "last_update" distinct from "last_value" for
this, since the "last_value" really means "last displayed value", and
the logic in display() relies on it having those semantics.

Using the "last_value" would also leave us with a subtle case where
this assertion wouldn't catch broken API uses, as an earlier version
of this change did.

Even if that was not the case we couldn't rely on it for the purposes
of this assertion. In the case of a delayed progress the variable
holding the value of the progress counter
('progress->last_value') is only updated after that delay is up, and,
consequently, we can't compare the progress counter with the expected
total in stop_progress() in these cases. Thus this check will cover
progress lines that are too fast to be shown, because the repositories
used in our tests are tiny and most of our progress lines are delayed.

What it can't cover is code that doesn't start the progress bar at
all, e.g. due to its own isatty() check, so progress that is only
started and shown when standard error is a terminal won't be
covered by our tests.
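
The distinction between the two fields can be sketched in a few lines
of C.  This is only a model under simplifying assumptions: the struct,
the "delayed" flag and the function names are invented for the
example, and the check returns 1 where the patch would call BUG()
instead of aborting.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant parts of "struct progress". */
struct progress_model {
	uint64_t total;
	uint64_t last_value;	/* last value actually displayed */
	uint64_t last_update;	/* last value passed in, displayed or not */
	int delayed;		/* non-zero while the delay has not expired */
};

void start_model(struct progress_model *p, uint64_t total, int delayed)
{
	p->total = total;
	p->last_value = (uint64_t)-1;
	p->last_update = (uint64_t)-1;
	p->delayed = delayed;
}

void display_model(struct progress_model *p, uint64_t n)
{
	p->last_update = n;	/* recorded unconditionally */
	if (p->delayed)
		return;		/* nothing shown, last_value untouched */
	p->last_value = n;
}

/* Returns 1 where the patch would call BUG(), 0 otherwise. */
int stop_would_bug(const struct progress_model *p)
{
	return p->total && p->total != p->last_update;
}
```

For a delayed progress started with a total of 3 and updated with 1, 2
and 3, last_value stays at its initial -1 while last_update ends at 3,
so the check passes; with a total of 5 and updates 1, 2 and 4 it
fires.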

[1] c4ff24bbb3 (commit-graph.c: display correct number of chunks when
                writing, 2021-02-24)
    1cbdbf3bef (commit-graph: drop count_distinct_commits() function,
                2020-12-07), though this didn't actually fix, but
                instead removed, a buggy progress line.
    150cd3b61d (commit-graph: fix "Writing out commit graph" progress
                counter, 2020-07-09)
    67fa6aac5a (commit-graph: don't show progress percentages while
                expanding reachable commits, 2019-09-07)
    531e6daa03 (prune-packed: advanced progress even for non-existing
                fan-out directories, 2009-04-27)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

WARNING: I believe this is subtly buggy, see the discussion in the
cover letter. It needs more fixes of the progress.c API usage in
various places before being ready.

 progress.c                  |  8 ++++++++
 t/t0500-progress-display.sh | 30 +++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index c1cb01ba975..40043bf6601 100644
--- a/progress.c
+++ b/progress.c
@@ -325,6 +325,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 
 	progress->total = total;
 	progress->last_value = -1;
+	progress->last_update = -1;
 	progress->last_percent = -1;
 	progress->delay = delay;
 	progress->throughput = NULL;
@@ -393,6 +394,13 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	if (!progress)
 		return;
 	*p_progress = NULL;
+
+	if (progress->total &&
+	    progress->total != progress->last_update)
+		BUG("total progress does not match for \"%*s\": expected: %"PRIuMAX" got: %"PRIuMAX,
+		    (int)(progress->status_len_utf8), progress->title.buf,
+		    (uintmax_t)progress->total,
+		    (uintmax_t)progress->last_update);
 	if (progress->last_value != -1) {
 		/* Force the last update */
 		struct throughput *tp = progress->throughput;
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index bc458cfc28b..3f00e52ce46 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -96,7 +96,8 @@ Working hard.......2.........3.........4.........5.........6.........7:
     0% (0/100), stalled.<CR>
     1% (1/100)          <CR>
    50% (50/100)<CR>
-   50% (50/100), done.
+  100% (100/100)<CR>
+  100% (100/100), done.
 EOF
 
 	cat >in <<-\EOF &&
@@ -104,6 +105,7 @@ EOF
 	signal
 	progress 1
 	progress 50
+	progress 100
 	stop
 	EOF
 	test-tool progress <in 2>stderr &&
@@ -423,4 +425,30 @@ test_expect_success 'BUG: start two concurrent progress bars' '
 	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
 '
 
+test_expect_success 'BUG: display_progress() goes past declared "total"' '
+	cat >in <<-\EOF &&
+	start 3
+	progress 1
+	progress 2
+	progress 4
+	stop
+	EOF
+
+	test_must_fail test-tool progress <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr
+'
+
+test_expect_success 'BUG: display_progress() does not reach declared "total"' '
+	cat >in <<-\EOF &&
+	start 5
+	progress 1
+	progress 2
+	progress 4
+	stop
+	EOF
+
+	test_must_fail test-tool progress <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac


* [RFC/PATCH 25/25] progress: assert counting upwards in display()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (23 preceding siblings ...)
  2021-06-23 17:48       ` [RFC/PATCH 24/25] progress: assert last update in stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
  25 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

We had to fix a buggy progress line recently, where the progress
counter counted backwards, see 8e118e8490 (pack-objects: update
"nr_seen" progress based on pack-reused count, 2021-04-11).

Let's add a BUG(...) assertion that makes use of the "last_update"
value to make sure this doesn't happen again, i.e.  trigger a BUG()
when the counter passed to display_progress() is smaller than the
previous value.

Note that we allow subsequent display_progress() calls with the same
counter value, because:

  - Strictly speaking, it's not wrong to do so.

  - Forbidding it might make the code calling display_progress() more
    complex; I suspect that would be the case with e.g. the "Updating
    index flags" progress line in 'unpack-trees.c', where the counter
    is increased in recursive function calls.

  - We would need to special case the internal display() call in
    stop_progress_msg(), because it uses the same counter value as the
    last display_progress() call, which would trigger this BUG().
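
The rule described above (smaller values are rejected, equal values
are allowed) could be modelled like this.  It is a sketch only: the
function name and the NO_UPDATE_YET macro are made up for the example,
and it returns 1 where the patch would call BUG() rather than
aborting.

```c
#include <stdint.h>

/* Same role as the initial -1 stored in "last_update". */
#define NO_UPDATE_YET ((uint64_t)-1)

/*
 * Returns 1 where the patch would call BUG(), i.e. when n is smaller
 * than the previously recorded value.  Otherwise records n and
 * returns 0; repeating the previous value is accepted.
 */
int display_would_bug(uint64_t *last_update, uint64_t n)
{
	if (*last_update != NO_UPDATE_YET && n < *last_update)
		return 1;
	*last_update = n;
	return 0;
}
```

Feeding it 1, 2, 2 is accepted at every step, while a following 1 is
flagged and leaves the recorded value at 2.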

't0500-progress-display.sh' contains a few tests that check how
shortened progress lines are covered up, and one of them ('progress
shortens - crazy caller') shortens the progress line by counting
backwards.  From now on that test would trigger this BUG(), so remove
it; the other test cases cover shortening progress lines sufficiently.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

WARNING: I believe this is subtly buggy, see the discussion in the
cover letter. It needs more fixes of the progress.c API usage in
various places before being ready.

 progress.c                  |  2 ++
 t/t0500-progress-display.sh | 36 ++++++++++++------------------------
 2 files changed, 14 insertions(+), 24 deletions(-)

diff --git a/progress.c b/progress.c
index 40043bf6601..7b59006c7c4 100644
--- a/progress.c
+++ b/progress.c
@@ -40,6 +40,8 @@ static void display(struct progress *progress, uint64_t n,
 	const char *tp;
 	int show_update = 0;
 
+	if (progress->last_update != -1 && n < progress->last_update)
+		BUG("counting backwards with display_progress()");
 	progress->last_update = n;
 
 	if (progress->delay && (!progress_update || --progress->delay))
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 3f00e52ce46..de59a757f86 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -211,30 +211,6 @@ EOF
 	test_cmp expect out
 '
 
-# Progress counter goes backwards, this should not happen in practice.
-test_expect_success 'progress shortens - crazy caller' '
-	cat >expect <<-\EOF &&
-	Working hard:  10% (100/1000)<CR>
-	Working hard:  20% (200/1000)<CR>
-	Working hard:   0% (1/1000)  <CR>
-	Working hard: 100% (1000/1000)<CR>
-	Working hard: 100% (1000/1000), done.
-	EOF
-
-	cat >in <<-\EOF &&
-	start 1000
-	progress 100
-	progress 200
-	progress 1
-	progress 1000
-	stop
-	EOF
-	test-tool progress <in 2>stderr &&
-
-	show_cr <stderr >out &&
-	test_cmp expect out
-'
-
 test_expect_success 'progress display with throughput' '
 	cat >expect <<-\EOF &&
 	Working hard: 0, stalled.<CR>
@@ -451,4 +427,16 @@ test_expect_success 'BUG: display_progress() does not reach declared "total"' '
 	grep "BUG:.*total progress does not match" stderr
 '
 
+test_expect_success 'BUG: display_progress() counting backwards' '
+	cat >in <<-\EOF &&
+	start 3
+	progress 1
+	progress 2
+	progress 1
+	EOF
+
+	test_must_fail test-tool progress <in 2>stderr &&
+	grep "BUG:.*counting backwards" stderr
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac


* RE: [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (24 preceding siblings ...)
  2021-06-23 17:48       ` [RFC/PATCH 25/25] progress: assert counting upwards in display() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:59       ` Randall S. Becker
  2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
  25 siblings, 1 reply; 138+ messages in thread
From: Randall S. Becker @ 2021-06-23 17:59 UTC (permalink / raw)
  To: 'Ævar Arnfjörð Bjarmason', git
  Cc: 'Junio C Hamano', 'SZEDER Gábor',
	'René Scharfe', 'Taylor Blau'

On June 23, 2021 1:48 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>
>>> > Splitting off from:
>>> >
>>> >
>>> > https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-
>>> > avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>>> >
>>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>>> >> I wonder (only in a semi-curious way, though) if we can detect
>>> >> off-by-one errors by adding an assertion to display_progress()
>>> >> that requires the first update to have the value 0, and in
>>> >> stop_progress() one that requires the previous display_progress()
>>> >> call to have a value equal to the total number of work items.  Not
>>> >> sure it'd be worth the hassle..
>>> >
>>> > I fixed and reported a number of bogus progress lines in the past,
>>> > the last one during v2.31.0-rc phase, so I've looked into whether
>>> > progress counters could be automatically validated in our tests,
>>> > and came up with these patches a few months ago.  It turned out
>>> > that progress counters can be checked easily and transparently in
>>> > case of progress lines that are shown in the tests, i.e. that are
>>> > shown even when stderr is not a terminal or are forced with
>>> > '--progress'.  (In other cases it's still fairly easy but not quite
>>> > transparent, as I think we need changes to the progress API; more
>>> > on that later in a separate
>>> > series.)
>>>
>>> I've also been working on some progress.[ch] patches that are mostly
>>> finished, and I'm some 20 patches in at the moment. I wasn't sure
>>> about whether to send an alternate 20-patch "let's do this (mostly) instead?"
>>> series, hence this message.
>>>
>>> Much of what you're doing here becomes easier after that series, e.g.
>>> your global process struct in 2/7 is something I ended up
>>> implementing as part of a general feature to allow progress to be
>>> driven by either display_progress() *or* the signal handler itself.
>>
>> It's difficult to know who should rebase onto who without seeing one
>> half of the patches.
>
>I was sort of hoping he'd take my word for it, but here it is. Don't say I didn't warn you :)
>
>> I couldn't find a link to them anywhere (even if they are only
>> available in your fork in a pre-polished state) despite looking, but
>> my apologies if they are available and I'm just missing them.
>
>FWIW it's avar-szeder/progress-bar-assertions in https://github.com/avar/git.git, that repo contains various functioning and not-so-
>functioning code.
>
>https://github.com/avar/git/tree/meta/ is my version of the crappy scripts we probably all have some version of for building my own git,
>things that are uncommented in series.conf is what I build my own git from.
>
>> In general, I think that these patches are clear and are helpful in
>> pinning down issues with the progress API (which I have made a handful
>> of times in the past), so I would be happy to see them picked up.
>
>Here's all 25 patches (well, around 20 before) that I had queued up locally and fixed up a bit.
>
>The 01/25 is something I submitted already as https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
>hoping to get this in incrementally.
>
>The 12/25 is my own version of that "global progress struct", 11/25 is the first of many bugs SZEDER missed in his :)
>
>18/25 is the first step of the UI I was going for, the signal handler can now drive the progress bar, so e.g. during "git gc" we show (at least
>for me, on git.git), a "stalled" message just before we start the actual count of "Enumerating Objects".
>
>After that was in I was planning on adding config-driven support to show a "spinner" when we stalled in that way, config-driven because
>you could just scrape e.g. https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
>into your own config. See
>https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)
>
>19-23/25 is my grabbing of SZEDER's patches that I'm comfortable labeling as "PATCH", I think they work, but no BUG() assertions yet. I
>left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier works set things up to do any BUG() we trust by default.
>
>22/25 is what I think we should do instead of SZEDER's 6/7
>(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
>I don't think this "our total doesn't match at the end" is something we should always BUG() on, for reasons explained there.
>
>I am sympathetic to doing it by default though, hence the
>stop_progress_early() API, that's there to allow select callers to bypass his BUG(...) assertion.
>
>24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
>BUG(...) assertions.
>
>His series passes the test suite, but actually severely breaks things. It'll make e.g. "git commit-graph write" BUG(...) out. The reason
>the tests don't catch it is because we have a blind spot in the tests.
>
>Namely, that most things that use the progress bar API use isatty() to check if they should start_progress(). If you run the tests as e.g.
>(better ways to do this, especially in parallel, most welcome):
>
>    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done
>
>You can discover various things that his series BUG()'s on, I fixed a couple of those myself, it's an early part of this series.
>
>But we'll still have various untested for BUG()'s even then, this is because you *also* have to have the test actually emit a "naked"
>progress bar on stderr, if the test itself e.g. pipes fd 2 to a file it won't work.
>
>I created a shitty-and-mostly-broken throwaway change to search-replace all the guards of "start_progress(...)" to run unconditionally, and
>convert all the "delayed" to the non-delayed version. That'll find even more BUG()'s where SZEDER's series still needs to be fixed (and also
>some unrelated segfaults, I gave up on it soon after).
>
>Even if we fix that I wouldn't trust it, because a lot of the progress bars we have depend on the size and shape of the data we're
>processing, e.g. the bug I fixed in 11/25. If people find this BUG() approach worth pursuing I think it would be better to make it an opt-in
>flag we convert one caller at a time to.
>
>For some it's really clear that we could assert it, for others such as the commit-graph it's much more subtle, we're in some callback after
>setting a "total", that callback does a "break", "continue" etc. in various places, all depending on repository data.
>
>It's not easy to reason about that and be certain that we can hold to the estimate. If we get it wrong someone's repo in the wild won't fully
>GC because of the overly eager BUG().
>
>If SZEDER wants to pursue it I think it'll be easier on top of this series, but personally I really don't see the point of spending effort on it.
>
>We should really be going in the other direction, of having more fuzzy ETAs, not less.
>
>E.g. we often have enough data at the start of "Enumerating Objects"
>to give a good-enough target value, that it's 5-10% off isn't really the point, but that the user looking at it sees something better than a
>dumb count-up, and can instead see that they'll probably be looking at it for about a minute. Now our API is to give no ETA/target if we're
>not 100% sure, it's not good UX.
>
>So trying to get the current exact count/exact percentage right seems like a distraction to me in the longer term. If anything we should
>just be rounding those numbers, showing fuzzy ETAs instead of percentages if we can etc.
>
>SZEDER Gábor (4):
>  commit-graph: fix bogus counter in "Scanning merged commits" progress
>    line
>  entry: show finer-grained counter in "Filtering content" progress line
>  progress: assert last update in stop_progress()
>  progress: assert counting upwards in display()
>
>Ævar Arnfjörð Bjarmason (21):
>  progress.c tests: fix breakage with COLUMNS != 80
>  progress.c tests: make start/stop verbs on stdin
>  progress.c tests: test some invalid usage
>  progress.c tests: add a "signal" verb
>  progress.c: move signal handler functions lower
>  progress.c: call progress_interval() from progress_test_force_update()
>  progress.c: stop eagerly fflush(stderr) when not a terminal
>  progress.c: add temporary variable from progress struct
>  midx perf: add a perf test for multi-pack-index
>  progress.c: remove the "sparse" mode nano-optimization
>  pack-bitmap-write.c: add a missing stop_progress()
>  progress.c: add & assert a "global_progress" variable
>  progress.[ch]: move the "struct progress" to the header
>  progress.[ch]: move test-only code away from "extern" variables
>  progress.c: pass "is done?" (again) to display()
>  progress.[ch]: convert "title" to "struct strbuf"
>  progress.c: refactor display() for less confusion, and fix bug
>  progress.c: emit progress on first signal, show "stalled"
>  midx: don't provide a total for QSORT() progress
>  progress.c: add a stop_progress_early() function
>  entry: deal with unexpected "Filtering content" total
>
> cache.h                          |   1 -
> commit-graph.c                   |   2 +-
> csum-file.h                      |   2 -
> entry.c                          |  12 +-
> midx.c                           |  25 +-
> pack-bitmap-write.c              |   1 +
> pack.h                           |   1 -
> parallel-checkout.h              |   1 -
> progress.c                       | 391 ++++++++++++++++++-------------
> progress.h                       |  50 +++-
> reachable.h                      |   1 -
> t/helper/test-progress.c         |  54 +++--
> t/perf/p5319-multi-pack-index.sh |  21 ++
> t/t0500-progress-display.sh      | 247 ++++++++++++++-----
> 14 files changed, 537 insertions(+), 272 deletions(-)
> create mode 100755 t/perf/p5319-multi-pack-index.sh

Is there provision for disabling progress on a per-command basis? My use case is specifically in a CI/CD script, being able to suppress progress handling. The current Jenkins plugin does not appear to have provision for hooking into a mechanism, which makes things get a bit wonky when a job runs with a pseudo-tty (as provided by Jenkins through SSH/RMI).
-Randall


* Re: [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
@ 2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
  2021-06-23 20:25           ` Randall S. Becker
  0 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 20:01 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: git, 'Junio C Hamano', 'SZEDER Gábor',
	'René Scharfe', 'Taylor Blau'


On Wed, Jun 23 2021, Randall S. Becker wrote:

> On June 23, 2021 1:48 PM, Ævar Arnfjörð Bjarmason wrote:
>>> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>>
>>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>>
>>>> > Splitting off from:
>>>> >
>>>> >
>>>> > https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-
>>>> > avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>>>> >
>>>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>>>> >> I wonder (only in a semi-curious way, though) if we can detect
>>>> >> off-by-one errors by adding an assertion to display_progress()
>>>> >> that requires the first update to have the value 0, and in
>>>> >> stop_progress() one that requires the previous display_progress()
>>>> >> call to have a value equal to the total number of work items.  Not
>>>> >> sure it'd be worth the hassle..
>>>> >
>>>> > I fixed and reported a number of bogus progress lines in the past,
>>>> > the last one during v2.31.0-rc phase, so I've looked into whether
>>>> > progress counters could be automatically validated in our tests,
>>>> > and came up with these patches a few months ago.  It turned out
>>>> > that progress counters can be checked easily and transparently in
>>>> > case of progress lines that are shown in the tests, i.e. that are
>>>> > shown even when stderr is not a terminal or are forced with
>>>> > '--progress'.  (In other cases it's still fairly easy but not quite
>>>> > transparent, as I think we need changes to the progress API; more
>>>> > on that later in a separate
>>>> > series.)
>>>>
>>>> I've also been working on some progress.[ch] patches that are mostly
>>>> finished, and I'm some 20 patches in at the moment. I wasn't sure
>>>> about whether to send an alternate 20-patch "let's do this (mostly) instead?"
>>>> series, hence this message.
>>>>
>>>> Much of what you're doing here becomes easier after that series, e.g.
>>>> your global process struct in 2/7 is something I ended up
>>>> implementing as part of a general feature to allow progress to be
>>>> driven by either display_progress() *or* the signal handler itself.
>>>
>>> It's difficult to know who should rebase onto who without seeing one
>>> half of the patches.
>>
>>I was sort of hoping he'd take my word for it, but here it is. Don't say I didn't warn you :)
>>
>>> I couldn't find a link to them anywhere (even if they are only
>>> available in your fork in a pre-polished state) despite looking, but
>>> my apologies if they are available and I'm just missing them.
>>
>>FWIW it's avar-szeder/progress-bar-assertions in https://github.com/avar/git.git, that repo contains various functioning and not-so-
>>functioning code.
>>
>>https://github.com/avar/git/tree/meta/ is my version of the crappy scripts we probably all have some version of for building my own git,
>>things that are uncommented in series.conf is what I build my own git from.
>>
>>> In general, I think that these patches are clear and are helpful in
>>> pinning down issues with the progress API (which I have made a handful
>>> of times in the past), so I would be happy to see them picked up.
>>
>>Here's all 25 patches (well, around 20 before) that I had queued up locally and fixed up a bit.
>>
>>The 01/25 is something I submitted already as https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
>>hoping to get this in incrementally.
>>
>>The 12/25 is my own version of that "global progress struct", 11/25 is the first of many bugs SZEDER missed in his :)
>>
>>18/25 is the first step of the UI I was going for, the signal handler can now drive the progress bar, so e.g. during "git gc" we show (at least
>>for me, on git.git), a "stalled" message just before we start the actual count of "Enumerating Objects".
>>
>>After that was in I was planning on adding config-driven support to show a "spinner" when we stalled in that way, config-driven because
>>you could just scrape e.g. https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
>>into your own config. See
>>https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)
>>
>>19-23/25 is my grabbing of SZEDER's patches that I'm comfortable labeling as "PATCH", I think they work, but no BUG() assertions yet. I
>>left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier works set things up to do any BUG() we trust by default.
>>
>>22/25 is what I think we should do instead of SZEDER's 6/7
>>(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
>>I don't think this "our total doesn't match at the end" is something we should always BUG() on, for reasons explained there.
>>
>>I am sympathetic to doing it by default though, hence the
>>stop_progress_early() API, that's there to allow select callers to bypass his BUG(...) assertion.
>>
>>24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
>>BUG(...) assertions.
>>
>>His series passes the test suite, but actually severely breaks things. It'll make e.g. "git commit-graph write" BUG(...) out. The reason
>>the tests don't catch it is because we have a blind spot in the tests.
>>
>>Namely, that most things that use the progress bar API use isatty() to check if they should start_progress(). If you run the tests as e.g.
>>(better ways to do this, especially in parallel, most welcome):
>>
>>    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done
>>
>>You can discover various things that his series BUG()'s on, I fixed a couple of those myself, it's an early part of this series.
>>
>>But we'll still have various untested for BUG()'s even then, this is because you *also* have to have the test actually emit a "naked"
>>progress bar on stderr, if the test itself e.g. pipes fd 2 to a file it won't work.
>>
>>I created a shitty-and-mostly-broken throwaway change to search-replace all the guards of "start_progress(...)" to run unconditionally, and
>>convert all the "delayed" to the non-delayed version. That'll find even more BUG()'s where SZEDER's series still needs to be fixed (and also
>>some unrelated segfaults, I gave up on it soon after).
>>
>>Even if we fix that I wouldn't trust it, because a lot of the progress bars we have depend on the size and shape of the data we're
>>processing, e.g. the bug I fixed in 11/25. If people find this BUG() approach worth pursuing I think it would be better to make it an opt-in
>>flag we convert one caller at a time to.
>>
>>For some it's really clear that we could assert it, for others such as the commit-graph it's much more subtle, we're in some callback after
>>setting a "total", that callback does a "break", "continue" etc. in various places, all depending on repository data.
>>
>>It's not easy to reason about that and be certain that we can hold to the estimate. If we get it wrong someone's repo in the wild won't fully
>>GC because of the overly eager BUG().
>>
>>If SZEDER wants to pursue it I think it'll be easier on top of this series, but personally I really don't see the point of spending effort on it.
>>
>>We should really be going in the other direction, of having more fuzzy ETAs, not less.
>>
>>E.g. we often have enough data at the start of "Enumerating Objects"
>>to give a good-enough target value, that it's 5-10% off isn't really the point, but that the user looking at it sees something better than a
>>dumb count-up, and can instead see that they'll probably be looking at it for about a minute. Now our API is to give no ETA/target if we're
>>not 100% sure, it's not good UX.
>>
>>So trying to get the current exact count/exact percentage right seems like a distraction to me in the longer term. If anything we should
>>just be rounding those numbers, showing fuzzy ETAs instead of percentages if we can etc.
>>
>>SZEDER Gábor (4):
>>  commit-graph: fix bogus counter in "Scanning merged commits" progress
>>    line
>>  entry: show finer-grained counter in "Filtering content" progress line
>>  progress: assert last update in stop_progress()
>>  progress: assert counting upwards in display()
>>
>>Ævar Arnfjörð Bjarmason (21):
>>  progress.c tests: fix breakage with COLUMNS != 80
>>  progress.c tests: make start/stop verbs on stdin
>>  progress.c tests: test some invalid usage
>>  progress.c tests: add a "signal" verb
>>  progress.c: move signal handler functions lower
>>  progress.c: call progress_interval() from progress_test_force_update()
>>  progress.c: stop eagerly fflush(stderr) when not a terminal
>>  progress.c: add temporary variable from progress struct
>>  midx perf: add a perf test for multi-pack-index
>>  progress.c: remove the "sparse" mode nano-optimization
>>  pack-bitmap-write.c: add a missing stop_progress()
>>  progress.c: add & assert a "global_progress" variable
>>  progress.[ch]: move the "struct progress" to the header
>>  progress.[ch]: move test-only code away from "extern" variables
>>  progress.c: pass "is done?" (again) to display()
>>  progress.[ch]: convert "title" to "struct strbuf"
>>  progress.c: refactor display() for less confusion, and fix bug
>>  progress.c: emit progress on first signal, show "stalled"
>>  midx: don't provide a total for QSORT() progress
>>  progress.c: add a stop_progress_early() function
>>  entry: deal with unexpected "Filtering content" total
>>
>> cache.h                          |   1 -
>> commit-graph.c                   |   2 +-
>> csum-file.h                      |   2 -
>> entry.c                          |  12 +-
>> midx.c                           |  25 +-
>> pack-bitmap-write.c              |   1 +
>> pack.h                           |   1 -
>> parallel-checkout.h              |   1 -
>> progress.c                       | 391 ++++++++++++++++++-------------
>> progress.h                       |  50 +++-
>> reachable.h                      |   1 -
>> t/helper/test-progress.c         |  54 +++--
>> t/perf/p5319-multi-pack-index.sh |  21 ++
>> t/t0500-progress-display.sh      | 247 ++++++++++++++-----
>> 14 files changed, 537 insertions(+), 272 deletions(-)
>> create mode 100755 t/perf/p5319-multi-pack-index.sh
>
> Is there provision for disabling progress on a per-command basis? My
> use case is specifically in a CI/CD script, being able to suppress
> progress handling. The current Jenkins plugin does not appear to have
> provision for hooking into a mechanism, which makes things get a bit
> wonky when a job runs with a pseudo-tty (as provided by Jenkins
> through SSH/RMI).
> -Randall

There isn't; some commands support --no-progress, but it's hit and miss.

You can then set the undocumented GIT_PROGRESS_DELAY=99999999 (or some
really big number) to suppress more of them.

We could just add it as a top-level "git --no-progress" option I
suppose...

Probably better would be to detect such not-a-terminals somehow, I think
at some point our own gc.log was a victim of this.

^ permalink raw reply	[flat|nested] 138+ messages in thread

* RE: [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
@ 2021-06-23 20:25           ` Randall S. Becker
  0 siblings, 0 replies; 138+ messages in thread
From: Randall S. Becker @ 2021-06-23 20:25 UTC (permalink / raw)
  To: 'Ævar Arnfjörð Bjarmason'
  Cc: git, 'Junio C Hamano', 'SZEDER Gábor',
	'René Scharfe', 'Taylor Blau'

On June 23, 2021 4:02 PM, Ævar Arnfjörð Bjarmason wrote:
>On Wed, Jun 23 2021, Randall S. Becker wrote:
>> On June 23, 2021 1:48 PM, Ævar Arnfjörð Bjarmason wrote:
>>>> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>>>
>>>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>>>
>>>>> > Splitting off from:
>>>>> >
>>>>> >
>>>>> > https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>>>>> >
>>>>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>>>>> >> I wonder (only in a semi-curious way, though) if we can detect
>>>>> >> off-by-one errors by adding an assertion to display_progress()
>>>>> >> that requires the first update to have the value 0, and in
>>>>> >> stop_progress() one that requires the previous
>>>>> >> display_progress() call to have a value equal to the total
>>>>> >> number of work items.  Not sure it'd be worth the hassle..
>>>>> >
>>>>> > I fixed and reported a number of bogus progress lines in the
>>>>> > past, the last one during v2.31.0-rc phase, so I've looked into
>>>>> > whether progress counters could be automatically validated in our
>>>>> > tests, and came up with these patches a few months ago.  It
>>>>> > turned out that progress counters can be checked easily and
>>>>> > transparently in case of progress lines that are shown in the
>>>>> > tests, i.e. that are shown even when stderr is not a terminal or
>>>>> > are forced with '--progress'.  (In other cases it's still fairly
>>>>> > easy but not quite transparent, as I think we need changes to the
>>>>> > progress API; more on that later in a separate
>>>>> > series.)
>>>>>
>>>>> I've also been working on some progress.[ch] patches that are
>>>>> mostly finished, and I'm some 20 patches in at the moment. I wasn't
>>>>> sure about whether to send an alternate 20-patch "let's do this (mostly) instead?"
>>>>> series, hence this message.
>>>>>
>>>>> Much of what you're doing here becomes easier after that series, e.g.
>>>>> your global process struct in 2/7 is something I ended up
>>>>> implementing as part of a general feature to allow progress to be
>>>>> driven by either display_progress() *or* the signal handler itself.
>>>>
>>>> It's difficult to know who should rebase onto who without seeing one
>>>> half of the patches.
>>>
>>>I was sort of hoping he'd take my word for it, but here it is. Don't
>>>say I didn't warn you :)
>>>
>>>> I couldn't find a link to them anywhere (even if they are only
>>>> available in your fork in a pre-polished state) despite looking, but
>>>> my apologies if they are available and I'm just missing them.
>>>
>>>FWIW it's avar-szeder/progress-bar-assertions in
>>>https://github.com/avar/git.git, that repo contains various functioning and not-so-functioning code.
>>>
>>>https://github.com/avar/git/tree/meta/ is my version of the crappy
>>>scripts we probably all have some version of for building my own git, things that are uncommented in series.conf is what I build my own
>>>git from.
>>>
>>>> In general, I think that these patches are clear and are helpful in
>>>> pinning down issues with the progress API (which I have made a
>>>> handful of times in the past), so I would be happy to see them picked up.
>>>
>>>Here's all 25 patches (well, around 20 before) that I had queued up locally and fixed up a bit.
>>>
>>>The 01/25 is something I submitted already as
>>>https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
>>>hoping to get this in incrementally.
>>>
>>>The 12/25 is my own version of that "global progress struct", 11/25 is
>>>the first of many bugs SZEDER missed in his :)
>>>
>>>18/25 is the first step of the UI I was going for, the signal handler
>>>can now drive the progress bar, so e.g. during "git gc" we show (at least for me, on git.git), a "stalled" message just before we start the
>>>actual count of "Enumerating Objects".
>>>
>>>After that was in I was planning on adding config-driven support to
>>>show a "spinner" when we stalled in that way, config-driven because
>>>you could just scrape e.g.
>>>https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
>>>into your own config. See
>>>https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)
>>>
>>>19-23/25 is my grabbing of SZEDER's patches that I'm comfortable
>>>labeling as "PATCH", I think they work, but no BUG() assertions yet. I left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier
>>>works set things up to do any BUG() we trust by default.
>>>
>>>22/25 is what I think we should do instead of SZEDER's 6/7
>>>(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
>>>I don't think this "our total doesn't match at the end" is
>>>something we should always BUG() on, for reasons explained there.
>>>
>>>I am sympathetic to doing it by default though, hence the
>>>stop_progress_early() API, that's there to allow select callers to bypass his BUG(...) assertion.
>>>
>>>24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
>>>BUG(...) assertions.
>>>
>>>His series passes the test suite, but actually severely breaks things.
>>>It'll make e.g. "git commit-graph write" BUG(...) out. The reason the tests don't catch it is because we have a blind spot in the
>>>tests.
>>>
>>>Namely, that most things that use the progress bar API use isatty() to check if they should start_progress(). If you run the tests as e.g.
>>>(better ways to do this, especially in parallel, most welcome):
>>>
>>>    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done
>>>
>>>You can discover various things that his series BUG()'s on, I fixed a couple of those myself, it's an early part of this series.
>>>
>>>But we'll still have various untested for BUG()'s even then, this is because you *also* have to have the test actually emit a "naked"
>>>progress bar on stderr, if the test itself e.g. pipes fd 2 to a file it won't work.
>>>
>>>I created a shitty-and-mostly-broken throwaway change to
>>>search-replace all the guards of "start_progress(...)" to run
>>>unconditionally, and convert all the "delayed" to the non-delayed version. That'll find even more BUG()'s where SZEDER's series still
>>>needs to be fixed (and also some unrelated segfaults, I gave up on it soon after).
>>>
>>>Even if we fix that I wouldn't trust it, because a lot of the progress
>>>bars we have depend on the size and shape of the data we're
>>>processing, e.g. the bug I fixed in 11/25. If people find this BUG() approach worth pursuing I think it would be better to make it an opt-in
>>>flag we convert one caller at a time to.
>>>
>>>For some it's really clear that we could assert it, for others such as
>>>the commit-graph it's much more subtle, we're in some callback after setting a "total", that callback does a "break", "continue" etc. in
>>>various places, all depending on repository data.
>>>
>>>It's not easy to reason about that and be certain that we can hold to
>>>the estimate. If we get it wrong someone's repo in the wild won't fully GC because of the overly eager BUG().
>>>
>>>If SZEDER wants to pursue it I think it'll be easier on top of this series, but personally I really don't see the point of spending effort on it.
>>>
>>>We should really be going in the other direction, of having more fuzzy ETAs, not less.
>>>
>>>E.g. we often have enough data at the start of "Enumerating Objects"
>>>to give a good-enough target value, that it's 5-10% off isn't really
>>>the point, but that the user looking at it sees something better than
>>>a dumb count-up, and can instead see that they'll probably be looking at it for about a minute. Now our API is to give no ETA/target if
>>>we're not 100% sure, it's not good UX.
>>>
>>>So trying to get the current exact count/exact percentage right seems
>>>like a distraction to me in the longer term. If anything we should just be rounding those numbers, showing fuzzy ETAs instead of
>>>percentages if we can etc.
>>>
>>>SZEDER Gábor (4):
>>>  commit-graph: fix bogus counter in "Scanning merged commits" progress
>>>    line
>>>  entry: show finer-grained counter in "Filtering content" progress
>>>    line
>>>  progress: assert last update in stop_progress()
>>>  progress: assert counting upwards in display()
>>>
>>>Ævar Arnfjörð Bjarmason (21):
>>>  progress.c tests: fix breakage with COLUMNS != 80
>>>  progress.c tests: make start/stop verbs on stdin
>>>  progress.c tests: test some invalid usage
>>>  progress.c tests: add a "signal" verb
>>>  progress.c: move signal handler functions lower
>>>  progress.c: call progress_interval() from progress_test_force_update()
>>>  progress.c: stop eagerly fflush(stderr) when not a terminal
>>>  progress.c: add temporary variable from progress struct
>>>  midx perf: add a perf test for multi-pack-index
>>>  progress.c: remove the "sparse" mode nano-optimization
>>>  pack-bitmap-write.c: add a missing stop_progress()
>>>  progress.c: add & assert a "global_progress" variable
>>>  progress.[ch]: move the "struct progress" to the header
>>>  progress.[ch]: move test-only code away from "extern" variables
>>>  progress.c: pass "is done?" (again) to display()
>>>  progress.[ch]: convert "title" to "struct strbuf"
>>>  progress.c: refactor display() for less confusion, and fix bug
>>>  progress.c: emit progress on first signal, show "stalled"
>>>  midx: don't provide a total for QSORT() progress
>>>  progress.c: add a stop_progress_early() function
>>>  entry: deal with unexpected "Filtering content" total
>>>
>>> cache.h                          |   1 -
>>> commit-graph.c                   |   2 +-
>>> csum-file.h                      |   2 -
>>> entry.c                          |  12 +-
>>> midx.c                           |  25 +-
>>> pack-bitmap-write.c              |   1 +
>>> pack.h                           |   1 -
>>> parallel-checkout.h              |   1 -
>>> progress.c                       | 391 ++++++++++++++++++-------------
>>> progress.h                       |  50 +++-
>>> reachable.h                      |   1 -
>>> t/helper/test-progress.c         |  54 +++--
>>> t/perf/p5319-multi-pack-index.sh |  21 ++
>>> t/t0500-progress-display.sh      | 247 ++++++++++++++-----
>>> 14 files changed, 537 insertions(+), 272 deletions(-)
>>> create mode 100755 t/perf/p5319-multi-pack-index.sh
>>
>> Is there provision for disabling progress on a per-command basis? My
>> use case is specifically in a CI/CD script, being able to suppress
>> progress handling. The current Jenkins plugin does not appear to have
>> provision for hooking into a mechanism, which makes things get a bit
>> wonky when a job runs with a pseudo-tty (as provided by Jenkins
>> through SSH/RMI).
>> -Randall
>
>There isn't, some commands support --no-progress, but it's hit and miss.
>
>You can then set the undocumented GIT_PROGRESS_DELAY=99999999 (or some really big number) to suppress more of them.
>
>We could just add it as a top-level "git --no-progress" option I suppose...
>
>Probably better would be to detect such not-a-terminals somehow, I think at some point our own gc.log was a victim of this.

I think a global not-a-terminal would be best here. It does not make a lot of sense to dump progress on a device that does not handle Control-M. I think I recall someone recently saying that we should be detecting this.


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (7 preceding siblings ...)
  2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
@ 2021-06-23 21:57 ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lines for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
                     ` (5 more replies)
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  10 siblings, 6 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

On Sun, Jun 20, 2021 at 10:02:56PM +0200, SZEDER Gábor wrote:
> It turned out that progress
> counters can be checked easily and transparently in case of progress
> lines that are shown in the tests, i.e. that are shown even when
> stderr is not a terminal or are forced with '--progress'.  (In other
> cases it's still fairly easy but not quite transparent, as I think we
> need changes to the progress API; more on that later in a separate
> series.)

So, the first patch in this WIP/POC series is my attempt at checking
even those progress counters that are not shown in our test suite,
either because stderr is not a terminal or because of an explicit
'--no-progress' option.  There are no usable commit messages yet, I
just wanted to see whether it's possible to check all progress lines
and whether it uncovers any more bugs; and the answer is yes to both.

Anyway, the basic idea is that instead of checking isatty(2) in the
caller, let's perform that check in start_progress() and let callers
override it through an extra function parameter (e.g. when
'--(no-)progress', '-v' or '--quiet' was given).  This way
start_progress() will always be called and it would then return NULL
if the progress line should not be shown.  Or, if
GIT_TEST_CHECK_PROGRESS=1, then it would return a valid non-NULL
progress instance even when the progress line should not be shown, but
with the new 'progress->hidden' flag set, so subsequent
display_progress() and stop_progress() calls won't print anything but
will be able to perform all the checks and trigger BUG() if one is
violated.

However, after Ævar pointed out upthread that progress also generates
trace2 regions, I think that it would be better if start_progress()
always returned a valid progress instance, even without
GIT_TEST_CHECK_PROGRESS but with 'progress->hidden' set as necessary,
because that way we would always get that trace2 output, even with
'--no-progress' or 'git cmd 2>log'.

The first patch also converts a good couple of progress lines to this
new approach, and the subsequent patches fix most of the uncovered
buggy progress lines.
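
To make the three-way 'show' parameter concrete, here is a minimal
standalone sketch of the decision described above.  It is not the code
from the patch: 'struct sketch_progress', sketch_start_progress() and
the 'check_mode' argument (standing in for GIT_TEST_CHECK_PROGRESS=1)
are made-up names for illustration only.  Just the -1/0/1 convention
and the 'hidden' flag are taken from this series.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for git's "struct progress"; only 'hidden' matters here. */
struct sketch_progress {
	const char *title;
	int hidden;	/* set: counters can be checked, nothing is printed */
};

/*
 * show:  1 = forced on  (e.g. '--progress')
 *        0 = forced off (e.g. '--no-progress', '--quiet')
 *       -1 = not specified, fall back to isatty(2)
 * check_mode: assumption, models GIT_TEST_CHECK_PROGRESS=1
 */
static struct sketch_progress *sketch_start_progress(const char *title,
						     int show, int check_mode)
{
	struct sketch_progress *p;

	if (show == -1)
		show = isatty(STDERR_FILENO);
	if (!show && !check_mode)
		return NULL;	/* nothing to show and nothing to check */

	p = malloc(sizeof(*p));
	if (!p)
		abort();
	p->title = title;
	p->hidden = !show;
	return p;
}
```

With 'check_mode' set, a caller passing show=0 still gets an instance
back, with 'hidden' set, so later display and stop calls could run
their checks without printing anything.  Without it the same call
returns NULL.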


SZEDER Gábor (4):
  WIP progress, isatty(2), hidden progress lines for
    GIT_TEST_CHECK_PROGRESS
  blame: fix progress total with line ranges
  read-cache: avoid overlapping progress lines
  preload-index: fix "Refreshing index" progress line

 builtin/blame.c          |  8 ++++----
 builtin/fsck.c           | 10 +++-------
 builtin/index-pack.c     | 18 +++++++++---------
 builtin/log.c            |  4 ++--
 builtin/prune.c          |  5 +----
 builtin/unpack-objects.c |  6 +++---
 preload-index.c          | 10 +++++-----
 progress.c               | 26 +++++++++++++++++++-------
 progress.h               |  6 ++++--
 read-cache.c             |  9 +++++----
 10 files changed, 55 insertions(+), 47 deletions(-)

-- 
2.32.0.289.g44fbea0957


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 1/4] WIP progress, isatty(2), hidden progress lines for GIT_TEST_CHECK_PROGRESS
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 2/4] blame: fix progress total with line ranges SZEDER Gábor
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

---
 builtin/blame.c          |  6 ++----
 builtin/fsck.c           | 10 +++-------
 builtin/index-pack.c     | 18 +++++++++---------
 builtin/log.c            |  4 ++--
 builtin/prune.c          |  5 +----
 builtin/unpack-objects.c |  6 +++---
 preload-index.c          |  7 +++----
 progress.c               | 26 +++++++++++++++++++-------
 progress.h               |  6 ++++--
 read-cache.c             |  6 +++---
 10 files changed, 49 insertions(+), 45 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 641523ff9a..5efb920dd4 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -944,8 +944,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 		if (show_progress > 0)
 			die(_("--progress can't be used with --incremental or porcelain formats"));
 		show_progress = 0;
-	} else if (show_progress < 0)
-		show_progress = isatty(2);
+	}
 
 	if (0 < abbrev && abbrev < hexsz)
 		/* one more abbrev length is needed for the boundary commit */
@@ -1153,8 +1152,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 
 	sb.found_guilty_entry = &found_guilty_entry;
 	sb.found_guilty_entry_data = &pi;
-	if (show_progress)
-		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
+	pi.progress = start_delayed_progress_if_tty(_("Blaming lines"), sb.num_lines, show_progress);
 
 	assign_blame(&sb, opt);
 
diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..78e799f748 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -185,8 +185,7 @@ static int traverse_reachable(void)
 	struct progress *progress = NULL;
 	unsigned int nr = 0;
 	int result = 0;
-	if (show_progress)
-		progress = start_delayed_progress(_("Checking connectivity"), 0);
+	progress = start_delayed_progress_if_tty(_("Checking connectivity"), 0, show_progress);
 	while (pending.nr) {
 		result |= traverse_one_object(object_array_pop(&pending));
 		display_progress(progress, ++nr);
@@ -653,8 +652,7 @@ static void fsck_object_dir(const char *path)
 	if (verbose)
 		fprintf_ln(stderr, _("Checking object directory"));
 
-	if (show_progress)
-		progress = start_progress(_("Checking object directories"), 256);
+	progress = start_progress_if_tty(_("Checking object directories"), 256, show_progress);
 
 	for_each_loose_file_in_objdir(path, fsck_loose, fsck_cruft, fsck_subdir,
 				      progress);
@@ -789,8 +787,6 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	if (check_strict)
 		fsck_obj_options.strict = 1;
 
-	if (show_progress == -1)
-		show_progress = isatty(2);
 	if (verbose)
 		show_progress = 0;
 
@@ -825,7 +821,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 					total += p->num_objects;
 				}
 
-				progress = start_progress(_("Checking objects"), total);
+				progress = start_progress_if_tty(_("Checking objects"), total, show_progress);
 			}
 			for (p = get_all_packs(the_repository); p;
 			     p = p->next) {
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 3fbc5d7077..0caabe237e 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -258,8 +258,8 @@ static unsigned check_objects(void)
 
 	max = get_max_object_index();
 
-	if (verbose)
-		progress = start_delayed_progress(_("Checking objects"), max);
+	progress = start_delayed_progress_if_tty(_("Checking objects"), max,
+						 verbose ? 1 : 0);
 
 	for (i = 0; i < max; i++) {
 		foreign_nr += check_object(get_indexed_object(i));
@@ -1157,10 +1157,9 @@ static void parse_pack_objects(unsigned char *hash)
 	struct object_id ref_delta_oid;
 	struct stat st;
 
-	if (verbose)
-		progress = start_progress(
-				from_stdin ? _("Receiving objects") : _("Indexing objects"),
-				nr_objects);
+	progress = start_progress_if_tty(
+			from_stdin ? _("Receiving objects") : _("Indexing objects"),
+			nr_objects, verbose ? 1 : 0);
 	for (i = 0; i < nr_objects; i++) {
 		struct object_entry *obj = &objects[i];
 		void *data = unpack_raw_entry(obj, &ofs_delta->offset,
@@ -1235,9 +1234,10 @@ static void resolve_deltas(void)
 	QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry);
 	QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry);
 
-	if (verbose || show_resolving_progress)
-		progress = start_progress(_("Resolving deltas"),
-					  nr_ref_deltas + nr_ofs_deltas);
+	/* TODO: breaks 5309.3 and .4 */
+	progress = start_progress_if_tty(_("Resolving deltas"),
+					 nr_ref_deltas + nr_ofs_deltas,
+					 verbose || show_resolving_progress ? 1 : 0);
 
 	nr_dispatched = 0;
 	base_cache_limit = delta_base_cache_limit * nr_threads;
diff --git a/builtin/log.c b/builtin/log.c
index 6102893fcc..41bcd4d0fb 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -2154,8 +2154,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	}
 	rev.add_signoff = do_signoff;
 
-	if (show_progress)
-		progress = start_delayed_progress(_("Generating patches"), total);
+	progress = start_delayed_progress_if_tty(_("Generating patches"), total,
+						 show_progress);
 	while (0 <= --nr) {
 		int shown;
 		display_progress(progress, total - nr);
diff --git a/builtin/prune.c b/builtin/prune.c
index 02c6ab7cba..2ee1baf40d 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -41,8 +41,7 @@ static void perform_reachability_traversal(struct rev_info *revs)
 	if (initialized)
 		return;
 
-	if (show_progress)
-		progress = start_delayed_progress(_("Checking connectivity"), 0);
+	progress = start_delayed_progress_if_tty(_("Checking connectivity"), 0, show_progress);
 	mark_reachable_objects(revs, 1, expire, progress);
 	stop_progress(&progress);
 	initialized = 1;
@@ -164,8 +163,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 			die("unrecognized argument: %s", name);
 	}
 
-	if (show_progress == -1)
-		show_progress = isatty(2);
 	if (exclude_promisor_objects) {
 		fetch_if_missing = 0;
 		revs.exclude_promisor_objects = 1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 4a9466295b..8517522a31 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -14,7 +14,7 @@
 #include "decorate.h"
 #include "fsck.h"
 
-static int dry_run, quiet, recover, has_errors, strict;
+static int dry_run, quiet = -1, recover, has_errors, strict;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
 
 /* We always read in 4kB chunks. */
@@ -500,8 +500,8 @@ static void unpack_all(void)
 			ntohl(hdr->hdr_version));
 	use(sizeof(struct pack_header));
 
-	if (!quiet)
-		progress = start_progress(_("Unpacking objects"), nr_objects);
+	progress = start_progress_if_tty(_("Unpacking objects"), nr_objects,
+					 quiet ? 0 : -1);
 	CALLOC_ARRAY(obj_list, nr_objects);
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
diff --git a/preload-index.c b/preload-index.c
index e5529a5863..aae6e4a042 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -121,10 +121,9 @@ void preload_index(struct index_state *index,
 	memset(&data, 0, sizeof(data));
 
 	memset(&pd, 0, sizeof(pd));
-	if (refresh_flags & REFRESH_PROGRESS && isatty(2)) {
-		pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr);
-		pthread_mutex_init(&pd.mutex, NULL);
-	}
+	pd.progress = start_delayed_progress_if_tty(_("Refreshing index"),index->cache_nr,
+						   refresh_flags & REFRESH_PROGRESS ? -1 : 0);
+	pthread_mutex_init(&pd.mutex, NULL);
 
 	for (i = 0; i < threads; i++) {
 		struct thread_data *p = data+i;
diff --git a/progress.c b/progress.c
index 034d50cd6b..99e130f1eb 100644
--- a/progress.c
+++ b/progress.c
@@ -43,6 +43,7 @@ struct progress {
 	struct strbuf counters_sb;
 	int title_len;
 	int split;
+	int hidden;
 };
 
 static volatile sig_atomic_t progress_update;
@@ -123,6 +124,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 
 	progress->last_value = n;
 
+	if (progress->hidden)
+		return;
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
@@ -261,15 +264,23 @@ void display_progress(struct progress *progress, uint64_t n)
 }
 
 static struct progress *start_progress_delay(const char *title, uint64_t total,
-					     unsigned delay, unsigned sparse)
+					     unsigned delay, unsigned sparse,
+					     int show)
 {
 	struct progress *progress;
 
 	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
+
+	if (show == -1)
+		show = isatty(STDERR_FILENO);
+
 	if (test_check_progress && current_progress)
 		BUG("progress \"%s\" is still active when starting new progress \"%s\"",
 		    current_progress->title, title);
 
+	if (!show && !test_check_progress)
+		return NULL;
+
 	progress = xmalloc(sizeof(*progress));
 	current_progress = progress;
 	progress->title = title;
@@ -283,6 +294,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
+	progress->hidden = !show;
 	set_progress_signal();
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
@@ -298,14 +310,14 @@ static int get_default_delay(void)
 	return delay_in_secs;
 }
 
-struct progress *start_delayed_progress(const char *title, uint64_t total)
+struct progress *start_delayed_progress_if_tty(const char *title, uint64_t total, int show)
 {
-	return start_progress_delay(title, total, get_default_delay(), 0);
+	return start_progress_delay(title, total, get_default_delay(), 0, show);
 }
 
-struct progress *start_progress(const char *title, uint64_t total)
+struct progress *start_progress_if_tty(const char *title, uint64_t total, int show)
 {
-	return start_progress_delay(title, total, 0, 0);
+	return start_progress_delay(title, total, 0, 0, show);
 }
 
 /*
@@ -319,13 +331,13 @@ struct progress *start_progress(const char *title, uint64_t total)
  */
 struct progress *start_sparse_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, 0, 1);
+	return start_progress_delay(title, total, 0, 1, 1);
 }
 
 struct progress *start_delayed_sparse_progress(const char *title,
 					       uint64_t total)
 {
-	return start_progress_delay(title, total, get_default_delay(), 1);
+	return start_progress_delay(title, total, get_default_delay(), 1, 1);
 }
 
 static void finish_if_sparse(struct progress *progress)
diff --git a/progress.h b/progress.h
index f1913acf73..7c3bdd3d63 100644
--- a/progress.h
+++ b/progress.h
@@ -13,9 +13,11 @@ void progress_test_force_update(void);
 
 void display_throughput(struct progress *progress, uint64_t total);
 void display_progress(struct progress *progress, uint64_t n);
-struct progress *start_progress(const char *title, uint64_t total);
+#define start_progress(title, total) start_progress_if_tty((title), (total), 1)
+struct progress *start_progress_if_tty(const char *title, uint64_t total, int show);
 struct progress *start_sparse_progress(const char *title, uint64_t total);
-struct progress *start_delayed_progress(const char *title, uint64_t total);
+#define start_delayed_progress(title, total) start_delayed_progress_if_tty((title), (total), 1)
+struct progress *start_delayed_progress_if_tty(const char *title, uint64_t total, int show);
 struct progress *start_delayed_sparse_progress(const char *title,
 					       uint64_t total);
 void stop_progress(struct progress **progress);
diff --git a/read-cache.c b/read-cache.c
index 1b3c2eb408..c3fc797639 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1567,9 +1567,9 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	int t2_sum_lstat = 0;
 	int t2_sum_scan = 0;
 
-	if (flags & REFRESH_PROGRESS && isatty(2))
-		progress = start_delayed_progress(_("Refresh index"),
-						  istate->cache_nr);
+	progress = start_delayed_progress_if_tty(_("Refresh index"),
+						 istate->cache_nr,
+						 flags & REFRESH_PROGRESS ? -1 : 0);
 
 	trace_performance_enter();
 	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";
-- 
2.32.0.289.g44fbea0957
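
The tri-state 'show' argument threaded through the diff above can be
sketched in isolation. This is a simplified model, not the actual
progress.c code: 'struct sketch_progress', sketch_start_progress() and
the explicit 'stderr_is_tty' / 'check_progress' parameters are
hypothetical names made up for illustration (the patch itself calls
isatty(STDERR_FILENO) and reads GIT_TEST_CHECK_PROGRESS).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's struct progress. */
struct sketch_progress {
	const char *title;
	uint64_t total;
	int hidden;	/* counters are tracked, but nothing is printed */
};

/*
 * show: 1 = always show, 0 = never show, -1 = show only on a terminal.
 * stderr_is_tty and check_progress are passed in explicitly (an
 * assumption of this sketch) to keep it deterministic.
 * Returns NULL when there is nothing to show and nothing to check.
 */
static struct sketch_progress *sketch_start_progress(const char *title,
						      uint64_t total, int show,
						      int stderr_is_tty,
						      int check_progress)
{
	struct sketch_progress *progress;

	if (show == -1)
		show = stderr_is_tty;
	if (!show && !check_progress)
		return NULL;

	progress = malloc(sizeof(*progress));
	if (!progress)
		abort();
	progress->title = title;
	progress->total = total;
	progress->hidden = !show;
	return progress;
}
```

With 'check_progress' set, a caller that would normally get NULL gets
a hidden instance instead, so the counter checks can still run.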



* [PATCH 2/4] blame: fix progress total with line ranges
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lnies for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 3/4] read-cache: avoid overlapping progress lines SZEDER Gábor
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

When not blaming a whole file but only a subset of its lines using the
'-L<start>,<end>' option, then the "Blaming lines" progress counter
can be way off, because the counter only counts the actually processed
lines in the line range(s) while the expected total wrongly shows the
number of lines in the given file:

  $ wc -l git.c
  932 git.c
  $ GIT_PROGRESS_DELAY=0 git blame -L10,20 git.c
  Blaming lines:   1% (11/932), done.
  <...>

Let's sum up the number of lines in all (sorted and merged) line
ranges and specify the resulting number as expected total.  (Note:
when blaming the whole file, then we (implicitly) have one line range
encompassing all its lines, so this approach works even when no line
range was given as option.)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/blame.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 5efb920dd4..7d29f5dc61 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1121,9 +1121,11 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 	}
 	sort_and_merge_range_set(&ranges);
 
+	lno = 0;
 	for (range_i = ranges.nr; range_i > 0; --range_i) {
 		const struct range *r = &ranges.ranges[range_i - 1];
 		ent = blame_entry_prepend(ent, r->start, r->end, o);
+		lno += r->end - r->start;
 	}
 
 	o->suspects = ent;
@@ -1152,7 +1154,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 
 	sb.found_guilty_entry = &found_guilty_entry;
 	sb.found_guilty_entry_data = &pi;
-	pi.progress = start_delayed_progress_if_tty(_("Blaming lines"), sb.num_lines, show_progress);
+	pi.progress = start_delayed_progress_if_tty(_("Blaming lines"), lno, show_progress);
 
 	assign_blame(&sb, opt);
 
-- 
2.32.0.289.g44fbea0957
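
The summation described in the commit message above can be sketched
on its own. 'struct sketch_range' and sketch_count_lines() are
hypothetical names; the sketch assumes half-open, 0-based ranges,
which is what 'r->end - r->start' in the diff suggests, so that
'-L10,20' corresponds to { 9, 20 }.

```c
#include <stddef.h>

/* Hypothetical half-open line range [start, end), 0-based. */
struct sketch_range {
	long start, end;
};

/* Sum the number of lines covered by sorted, non-overlapping ranges. */
static long sketch_count_lines(const struct sketch_range *ranges, size_t nr)
{
	long total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		total += ranges[i].end - ranges[i].start;
	return total;
}
```

Under that assumption a single range { 9, 20 } yields 11, matching the
11 processed lines in the example output, and a whole-file range
{ 0, 932 } yields 932.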



* [PATCH 3/4] read-cache: avoid overlapping progress lines
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lnies for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 2/4] blame: fix progress total with line ranges SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 4/4] preload-index: fix "Refreshing index" progress line SZEDER Gábor
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

"Refresh index" in refresh_index() in 'read-cache.c' vs. "Refreshing
index" in preload_index() in 'preload-index.c'.
---
 read-cache.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index c3fc797639..692a69f2db 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1567,10 +1567,6 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	int t2_sum_lstat = 0;
 	int t2_sum_scan = 0;
 
-	progress = start_delayed_progress_if_tty(_("Refresh index"),
-						 istate->cache_nr,
-						 flags & REFRESH_PROGRESS ? -1 : 0);
-
 	trace_performance_enter();
 	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";
 	deleted_fmt    = in_porcelain ? "D\t%s\n" : "%s: needs update\n";
@@ -1583,6 +1579,11 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	 * we only have to do the special cases that are left.
 	 */
 	preload_index(istate, pathspec, 0);
+
+	progress = start_delayed_progress_if_tty(_("Refresh index"),
+						 istate->cache_nr,
+						 flags & REFRESH_PROGRESS ? -1 : 0);
+
 	trace2_region_enter("index", "refresh", NULL);
 	/* TODO: audit for interaction with sparse-index. */
 	ensure_full_index(istate);
-- 
2.32.0.289.g44fbea0957



* [PATCH 4/4] preload-index: fix "Refreshing index" progress line
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                     ` (2 preceding siblings ...)
  2021-06-23 21:57   ` [PATCH 3/4] read-cache: avoid overlapping progress lines SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
  2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

---
 preload-index.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/preload-index.c b/preload-index.c
index aae6e4a042..757dbeced6 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -86,7 +86,8 @@ static void *preload_thread(void *_data)
 		struct progress_data *pd = p->progress;
 
 		pthread_mutex_lock(&pd->mutex);
-		display_progress(pd->progress, pd->n + last_nr);
+		pd->n += last_nr;
+		display_progress(pd->progress, pd->n);
 		pthread_mutex_unlock(&pd->mutex);
 	}
 	cache_def_clear(&cache);
-- 
2.32.0.289.g44fbea0957
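
The patch above carries no commit message, so the following is only a
reading of the diff: before the change, the final report of a thread
added its remaining count to the shared counter for display only,
without storing it. The sketch below models that with hypothetical
names and without threads or locking; the real code holds a mutex
around the update.

```c
#include <stdint.h>

/* Hypothetical shared state; the real code guards it with a mutex. */
struct sketch_shared {
	uint64_t n;		/* items accounted for so far */
	uint64_t last_shown;	/* last value handed to the progress line */
};

/* Before the fix: the remainder is displayed but never stored. */
static void sketch_report_without_store(struct sketch_shared *s,
					 uint64_t remaining)
{
	s->last_shown = s->n + remaining;
}

/* After the fix: the remainder is accumulated, then displayed. */
static void sketch_report_with_store(struct sketch_shared *s,
				      uint64_t remaining)
{
	s->n += remaining;
	s->last_shown = s->n;
}
```

If two workers finish with 5 and 7 remaining items, the first variant
ends up showing 7, while the second one shows 12.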



* Re: [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                     ` (3 preceding siblings ...)
  2021-06-23 21:57   ` [PATCH 4/4] preload-index: fix "Refreshing index" progress line SZEDER Gábor
@ 2021-06-23 22:11   ` SZEDER Gábor
  2021-06-24 10:43     ` Ævar Arnfjörð Bjarmason
  2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
  5 siblings, 1 reply; 138+ messages in thread
From: SZEDER Gábor @ 2021-06-23 22:11 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, Derrick Stolee

On Wed, Jun 23, 2021 at 11:57:32PM +0200, SZEDER Gábor wrote:
> I just wanted to see whether it's possible to check all progress lines
> and whether it uncovers any more bugs; and the answer is yes to both.

Oh, and there is another one:

test_expect_success 'test' '
	git commit --allow-empty -m 1 &&
	git commit --allow-empty -m 2 &&
	git commit --allow-empty -m 3 &&
	GIT_PROGRESS_DELAY=0 \
	git commit-graph write --progress --reachable --split &&
	git commit --allow-empty -m 4 &&
	GIT_PROGRESS_DELAY=0 \
	git commit-graph write --progress --reachable --split
'

The last command's progress output ends with:

  Writing out commit graph in 5 passes:  80% (4/5), done.

This is because since 53035c4f0b (commit-graph write: add "Writing
out" progress output, 2019-01-19) we have assumed that the work done
while writing each chunk is proportional to the number of commits in
the graph, but with the arrival of split commit graphs and the BASE
chunk in 118bd57002 (commit-graph: add base graphs chunk, 2019-06-18)
that's no longer the case.



* Re: [PATCH 22/25] progress.c: add a stop_progress_early() function
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
@ 2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
  2021-06-25  1:24         ` Andrei Rybak
  1 sibling, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-24 10:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason


On Wed, Jun 23 2021, Ævar Arnfjörð Bjarmason wrote:

> +	strbuf_addf(&sb, _(", done at %"PRIuMAX" items, expected %"PRIuMAX"."),
> +		    progress->total, progress->last_update);

These two need a (uintmax_t) cast like the rest of such sprintfs in the
file, as I discovered with the OSX CI.
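
A minimal sketch of the cast in question: PRIuMAX expects a uintmax_t
argument, and uint64_t is not guaranteed to be the same type on every
platform. sketch_format_early_stop() is a hypothetical helper using
snprintf() rather than git's strbuf_addf(), and it leaves out the _()
translation wrapper.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the trailing message into buf; returns snprintf()'s result.
 * The arguments are passed in the order the message words them:
 * the count reached first, the expected total second.
 */
static int sketch_format_early_stop(char *buf, size_t len,
				    uint64_t done, uint64_t expected)
{
	/* Explicit casts make the arguments match the PRIuMAX conversions. */
	return snprintf(buf, len,
			", done at %"PRIuMAX" items, expected %"PRIuMAX".",
			(uintmax_t)done, (uintmax_t)expected);
}
```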


* Re: [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-06-24 10:43     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-24 10:43 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe, Taylor Blau, Derrick Stolee


On Thu, Jun 24 2021, SZEDER Gábor wrote:

> On Wed, Jun 23, 2021 at 11:57:32PM +0200, SZEDER Gábor wrote:
>> I just wanted to see whether it's possible to check all progress lines
>> and whether it uncovers any more bugs; and the answer is yes to both.
>
> Oh, and there is another one:
>
> test_expect_success 'test' '
> 	git commit --allow-empty -m 1 &&
> 	git commit --allow-empty -m 2 &&
> 	git commit --allow-empty -m 3 &&
> 	GIT_PROGRESS_DELAY=0 \
> 	git commit-graph write --progress --reachable --split &&
> 	git commit --allow-empty -m 4 &&
> 	GIT_PROGRESS_DELAY=0 \
> 	git commit-graph write --progress --reachable --split
> '
>
> The last command's progress output ends with:
>
>   Writing out commit graph in 5 passes:  80% (4/5), done.
>
> This is because since 53035c4f0b (commit-graph write: add "Writing
> out" progress output, 2019-01-19) we have assumed that the work done
> while writing each chunk is proportional to the number of commits in
> the graph, but with the arrival of split commit graphs and the BASE
> chunk in 118bd57002 (commit-graph: add base graphs chunk, 2019-06-18)
> that's no longer the case.

Ah, I encountered the off-by-something in that "writing in N passes" but
didn't find the root cause, thanks.


* Re: [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                     ` (4 preceding siblings ...)
  2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-24 10:45 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe, Taylor Blau


On Wed, Jun 23 2021, SZEDER Gábor wrote:

> On Sun, Jun 20, 2021 at 10:02:56PM +0200, SZEDER Gábor wrote:
>> It turned out that progress
>> counters can be checked easily and transparently in case of progress
>> lines that are shown in the tests, i.e. that are shown even when
>> stderr is not a terminal or are forced with '--progress'.  (In other
>> cases it's still fairly easy but not quite transparent, as I think we
>> need changes to the progress API; more on that later in a separate
>> series.)
>
> So, the first patch in this WIP/POC series is my attempt at checking
> even those progress counters that are not shown in our test suite,
> either because stderr is not a terminal or because of an explicit
> '--no-progress' option.  There are no usable commit messages yet, I
> just wanted to see whether it's possible to check all progress lines
> and whether it uncovers any more bugs; and the answer is yes to both.
>
> Anyway, the basic idea is that instead of checking isatty(2) in the
> caller, let's perform that check in start_progress() and let callers
> override it through an extra function parameter (e.g. when
> '--(no-)progress', '-v' or '--quiet' was given).  This way
> start_progress() will always be called and it would then return NULL
> if the progress line should not be shown.  Or, if
> GIT_TEST_CHECK_PROGRESS=1, then it would return a valid non-NULL
> progress instance even when the progress line should not be shown, but
> with the new 'progress->hidden' flag set, so subsequent
> display_progress() and stop_progress() calls won't print anything but
> will be able to perform all the checks and trigger BUG() if one is
> violated.
>
> However, after Ævar pointed out upthread that progress also generates
> trace2 regions, I think that it would be better if start_progress()
> always returned a valid progress instance, even without
> GIT_TEST_CHECK_PROGRESS but with 'progress->hidden' set as necessary,
> because that way we would always get that trace2 output, even with
> '--no-progress' or 'git cmd 2>log'.
>
> The first patch also converts a good couple of progress lines to this
> new approach, and the subsequent patches fix most of the uncovered
> buggy progress lines.

Thanks, I skimmed over it and this sort of approach is definitely what
we'll need to address my "But we'll still have various untested for
BUG()[...]" in
https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/

And as you point out, we'll get the benefit of consistent trace2 regions.
On the one hand it's a bit weird to have this UI code drive a trace2
region when we don't have a TTY, but I think it's useful. We could
e.g. eventually record some stats about min/max/avg/percentile
per-item processing while we're at it. That's unlikely to be worth it if
we need another API like display_progress(), but since we have that one
we can piggy-back on it quite easily.

Just some implementation nits: I for one would prefer "static inline"
wrappers instead of macros in progress.h, makes it easier to
consistently set breakpoints in gdb.
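
A sketch of what such a wrapper could look like, for illustration
only. The start_progress_if_tty() body here is a stub that merely
records its 'show' argument ('sketch_last_show' is a made-up
variable); in the series the real function lives in progress.c.

```c
#include <stddef.h>
#include <stdint.h>

struct progress;	/* opaque to callers */

/* Made-up bookkeeping so the stub's effect can be observed. */
static int sketch_last_show = -2;

/* Stub standing in for the function added by the patch under review. */
static struct progress *start_progress_if_tty(const char *title,
					       uint64_t total, int show)
{
	(void)title;
	(void)total;
	sketch_last_show = show;
	return NULL;
}

/*
 * A real function instead of a macro: it has a symbol name of its own
 * that a debugger can refer to, and its arguments are type-checked.
 */
static inline struct progress *start_progress(const char *title,
					      uint64_t total)
{
	return start_progress_if_tty(title, total, 1);
}
```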

It's more work up-front, but re Randall's question in
https://lore.kernel.org/git/00fb01d76859$8a6ebc50$9f4c34f0$@nexbridge.com
I think that instead of s/start_delayed_progress/start_delayed_progress_if_tty/
it would be better to just keep "start_delayed_progress", and have
it do the TTY check by default, and also check for --progress and/or
--verbose/--quiet etc. itself.

We'd probably have some special-cases left even then, but I think most
of them can be handled with an isatty() check and the "standard" options
of --progress etc.

I.e. we have OPT__VERBOSE now, but no OPT__PROGRESS (we just use
OPT_BOOL). If we made the various common parse-options flags that impact
it callbacks that would munge a global variable we could then pick that
up in progress.c, and handle the common case of "git some-command
--no-progress" directly.

It would also make it easy to just move that over to git.c, so we could
have "git --no-progress some-command". I think --progress,
--object-format and other "global-y" options should be passed to
"git" directly, not per-command, especially with us hopefully soon
moving 100% away from dashed built-ins.

Isn't the most common general rule just:

    int want_progress = progress ? 1 : verbose ? 1 : quiet ? 0 : isatty(2);

Well, that and a version that handles --no-progress distinct from "did
not provide it", so we need some "-1" checks in there. Maybe:

    /* Earlier */
    if (quiet != -1 && verbose != -1)
        die("--quiet and --verbose?");

    /* In progress.c after getopt */
    int enable = -1;
    if (opt_progress != -1) enable = opt_progress;
    if (enable == -1 && opt_verbose != -1) enable = opt_verbose;
    if (enable == -1 && opt_quiet != -1) enable = !opt_quiet;
    if (enable == -1) enable = isatty(2);

In any case, I think moving that to one place so it's consistently
checked would make sense.
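
The precedence rule sketched above can be written as one small
function. sketch_want_progress() is a hypothetical name, and
'stderr_is_tty' stands in for the isatty(2) call so that the logic can
be exercised without a terminal.

```c
/*
 * Each option is -1 when it was not given on the command line,
 * otherwise 0 or 1. An explicit --progress / --no-progress wins,
 * then --verbose, then --quiet, and only then the TTY check.
 */
static int sketch_want_progress(int opt_progress, int opt_verbose,
				int opt_quiet, int stderr_is_tty)
{
	if (opt_progress != -1)
		return opt_progress;
	if (opt_verbose != -1)
		return opt_verbose;
	if (opt_quiet != -1)
		return !opt_quiet;
	return stderr_is_tty;
}
```

For example, sketch_want_progress(0, 1, -1, 1) returns 0: an explicit
--no-progress overrides both --verbose and the terminal check.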

Some things like builtin/multi-pack-index.c set "progress" as a bitflag
for IMO (this was discussed on-list before) no good reason. I.e. the
builtin should handle it with a bool, maybe the library wants a flag,
but in any case if we can do what I proposed above such libraries won't
need a flag at all.


* Re: [PATCH 22/25] progress.c: add a stop_progress_early() function
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
  2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
@ 2021-06-25  1:24         ` Andrei Rybak
  1 sibling, 0 replies; 138+ messages in thread
From: Andrei Rybak @ 2021-06-25  1:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe, Taylor Blau

On 23/06/2021 19:48, Ævar Arnfjörð Bjarmason wrote:
> In cases where we error out during processing or otherwise miss
> initial "total" estimate we'll still want to show a "done" message and
> end our trace2 region, but it won't be true that our total ==
> last_update at the end.
> 
> So let's add a "last_update" and this stop_progress_early() function
> to handle that edge case, this will be used in a subsequent commit.
> 
> We could also use a total=0 in such cases, but that would make the
> progress output worse for the common non-erroring case. Let's instead
> note that we didn't reach the total count, and snap the progress bar
> to "100%, done" at the end.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   progress.c | 20 ++++++++++++++++++++
>   progress.h |  2 ++
>   2 files changed, 22 insertions(+)
> 
> diff --git a/progress.c b/progress.c
> index 35847d3a7f2..c1cb01ba975 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -40,6 +40,8 @@ static void display(struct progress *progress, uint64_t n,
>   	const char *tp;
>   	int show_update = 0;
>   
> +	progress->last_update = n;
> +
>   	if (progress->delay && (!progress_update || --progress->delay))
>   		return;
>   
> @@ -413,3 +415,21 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
>   	free(progress->throughput);
>   	free(progress);
>   }
> +
> +void stop_progress_early(struct progress **p_progress)
> +{
> +	struct progress *progress;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	if (!p_progress)
> +		BUG("don't provide NULL to stop_progress_early");
> +	progress = *p_progress;
> +	if (!progress)
> +		return;
> +
> +	strbuf_addf(&sb, _(", done at %"PRIuMAX" items, expected %"PRIuMAX"."),
> +		    progress->total, progress->last_update);

It seems that these two arguments to strbuf_addf should be swapped
around.  Done at progress->last_update, expected progress->total.

> +	progress->total = progress->last_update;
> +	stop_progress_msg(p_progress, sb.buf);
> +	strbuf_release(&sb);
> +}
> diff --git a/progress.h b/progress.h
> index ba38447d104..5c5d027d1a0 100644
> --- a/progress.h
> +++ b/progress.h
> @@ -23,6 +23,7 @@ struct progress {
>   	struct strbuf status;
>   	size_t status_len_utf8;
>   
> +	uint64_t last_update;
>   	uint64_t last_value;
>   	uint64_t total;
>   	unsigned last_percent;
> @@ -56,5 +57,6 @@ struct progress *start_delayed_sparse_progress(const char *title,
>   					       uint64_t total);
>   void stop_progress(struct progress **progress);
>   void stop_progress_msg(struct progress **progress, const char *msg);
> +void stop_progress_early(struct progress **p_progress);
>   
>   #endif
> 



* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
@ 2021-06-26  8:27         ` René Scharfe
  2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 138+ messages in thread
From: René Scharfe @ 2021-06-26  8:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>
> On Mon, Jun 21 2021, René Scharfe wrote:
>
>> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>
>>>> The final value of the counter of the "Scanning merged commits"
>>>> progress line is always one less than its expected total, e.g.:
>>>>
>>>>   Scanning merged commits:  83% (5/6), done.
>>>>
>>>> This happens because while iterating over an array the loop variable
>>>> is passed to display_progress() as-is, but while C arrays (and thus
>>>> the loop variable) start at 0 and end at N-1, the progress counter
>>>> must end at N.  This causes the failures of the tests
>>>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>>>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>>>
>>>> Fix this by passing 'i + 1' to display_progress(), like most other
>>>> callsites do.
>>>>
>>>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>>>> ---
>>>>  commit-graph.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/commit-graph.c b/commit-graph.c
>>>> index 2bcb4e0f89..3181906368 100644
>>>> --- a/commit-graph.c
>>>> +++ b/commit-graph.c
>>>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>>>
>>>>  	ctx->num_extra_edges = 0;
>>>>  	for (i = 0; i < ctx->commits.nr; i++) {
>>>> -		display_progress(ctx->progress, i);
>>>> +		display_progress(ctx->progress, i + 1);
>>>>
>>>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>>>  			  &ctx->commits.list[i]->object.oid)) {
>>>
>>> I think this fix makes sense, but FWIW there's a large thread starting
>>> at [1] where René disagrees with me, and thinks the fix for this sort of
>>> thing would be to display_progress(..., i + 1) at the end of that
>>> for-loop, or just before the stop_progress().
>>>
>>> I don't agree, but just noting the disagreement, and that if that
>>> argument wins then a patch like this would involve changing the other
>>> 20-some calls to display_progress() in commit-graph.c to work
>>> differently (and to be more complex, we'd need to deal with loop
>>> break/continue etc.).
>>>
>>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>>
>> *sigh*  (And sorry, Ævar.)
>>
>> Before an item is done, it should be reported as not done.  After an
>> item is done, it should be reported as done.  One loop iteration
>> finishes one item.  Thus the number of items to report at the bottom of
>> the loop is one higher than at the top.  i is the correct number to
>> report at the top of a zero-based loop, i+1 at the bottom.

> Anyone with more time than sense can go and read over our linked back &
> forth thread where we're disagreeing on that point :). I think the pattern
> in commit-graph.c makes sense, you don't.

Thanks for this comment, I think I got it now: Work doesn't count in the
commit-graph.c model of measuring progress, literally.  I.e. progress is
the same before and after one item of work.  Instead it counts the
number of loop iterations.  The model I describe above counts finished
work items instead.  The results of the two models differ by at most one
despite their inverted axiom regarding the value of work.

Phew, that took me a while.

> Anyway, aside from that. I think, and I really would be advocating this
> too, even if our respective positions were reversed, that *in this case*
> it makes sense to just take something like SZEDER's patch here
> as-is. Because in that file there's some dozen occurrences of that exact
> pattern.

The code without the patch either forgets to report the last work item
in the count-work-items model or is one short in the count-iterations
model, so a fix is needed either way.

The number of the other occurrences wouldn't matter if they were
buggy, but in this case they indicate that Stolee consistently used
the count-iterations model.  Thus using it in the patch as well makes
sense.
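
The two models can be put side by side in a small sketch. All names
are hypothetical; sketch_display() only records the values that would
otherwise be handed to display_progress().

```c
#include <stddef.h>
#include <stdint.h>

/* Records every value that would be passed to display_progress(). */
struct sketch_log {
	uint64_t values[16];
	size_t nr;
};

static void sketch_display(struct sketch_log *log, uint64_t n)
{
	if (log->nr < sizeof(log->values) / sizeof(log->values[0]))
		log->values[log->nr++] = n;
}

/* count-iterations: report i + 1 at the top, before the work is done. */
static void sketch_count_iterations(struct sketch_log *log, size_t nr_items)
{
	size_t i;

	for (i = 0; i < nr_items; i++) {
		sketch_display(log, i + 1);
		/* ... work on item i ... */
	}
}

/* count-work-items: report 0 first, then i + 1 once item i is finished. */
static void sketch_count_work_items(struct sketch_log *log, size_t nr_items)
{
	size_t i;

	sketch_display(log, 0);
	for (i = 0; i < nr_items; i++) {
		/* ... work on item i ... */
		sketch_display(log, i + 1);
	}
}
```

For two items the first variant records 1, 2 and the second records
0, 1, 2. Both end at the total, which is the property the unpatched
code lacked.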

> Let's just bring this one case in line with the rest, if we then want to
> argue that one or the other use of the progress.c API is wrong as a
> general thing, I think it makes more sense to discuss that as some
> follow-up series that changes these various API uses en-masse than
> holding back isolated fixes that leave the state of the progress bar at
> != 100%.

Agreed.

René


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26  8:27         ` René Scharfe
@ 2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
  2021-06-26 20:22             ` René Scharfe
  0 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-26 14:11 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Sat, Jun 26 2021, René Scharfe wrote:

> Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Mon, Jun 21 2021, René Scharfe wrote:
>>
>>> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>>>
>>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>>
>>>>> The final value of the counter of the "Scanning merged commits"
>>>>> progress line is always one less than its expected total, e.g.:
>>>>>
>>>>>   Scanning merged commits:  83% (5/6), done.
>>>>>
>>>>> This happens because while iterating over an array the loop variable
>>>>> is passed to display_progress() as-is, but while C arrays (and thus
>>>>> the loop variable) start at 0 and end at N-1, the progress counter
>>>>> must end at N.  This causes the failures of the tests
>>>>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>>>>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>>>>
>>>>> Fix this by passing 'i + 1' to display_progress(), like most other
>>>>> callsites do.
>>>>>
>>>>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>>>>> ---
>>>>>  commit-graph.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/commit-graph.c b/commit-graph.c
>>>>> index 2bcb4e0f89..3181906368 100644
>>>>> --- a/commit-graph.c
>>>>> +++ b/commit-graph.c
>>>>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>>>>
>>>>>  	ctx->num_extra_edges = 0;
>>>>>  	for (i = 0; i < ctx->commits.nr; i++) {
>>>>> -		display_progress(ctx->progress, i);
>>>>> +		display_progress(ctx->progress, i + 1);
>>>>>
>>>>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>>>>  			  &ctx->commits.list[i]->object.oid)) {
>>>>
>>>> I think this fix makes sense, but FWIW there's a large thread starting
>>>> at [1] where René disagrees with me, and thinks the fix for this sort of
>>>> thing would be to display_progress(..., i + 1) at the end of that
>>>> for-loop, or just before the stop_progress().
>>>>
>>>> I don't agree, but just noting the disagreement, and that if that
>>>> argument wins then a patch like this would involve changing the other
>>>> 20-some calls to display_progress() in commit-graph.c to work
>>>> differently (and to be more complex, we'd need to deal with loop
>>>> break/continue etc.).
>>>>
>>>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>>>
>>> *sigh*  (And sorry, Ævar.)
>>>
>>> Before an item is done, it should be reported as not done.  After an
>>> item is done, it should be reported as done.  One loop iteration
>>> finishes one item.  Thus the number of items to report at the bottom of
>>> the loop is one higher than at the top.  i is the correct number to
>>> report at the top of a zero-based loop, i+1 at the bottom.
>
>> Anyone with more time than sense can go and read over our linked back &
>> forth thread where we're disagreeing on that point :). I think the pattern
>> in commit-graph.c makes sense, you don't.
>
> Thanks for this comment, I think I got it now: Work doesn't count in the
> commit-graph.c model of measuring progress, literally.  I.e. progress is
> the same before and after one item of work.

The progress isn't the same, we update the count. Or do you mean in the
time it takes us to go from the end of the for-loop & jump to the start
of it and update the count?

> Instead it counts the number of loop iterations.  The model I describe
> above counts finished work items instead.  The results of the two
> models differ by at most one despite their inverted axiom regarding
> the value of work.
>
> Phew, that took me a while.

For what it's worth I had some extensive examples in our initial
thread[1][2] (search for "apple" and "throughput", respectively), that
you cut out when replying to the relevant E-Mails. I'd think we could
probably have gotten here earlier :)

I'm a bit confused about this "value of work" comment.

If you pick up a copy of say a video game like Mario Kart you'll find
that for a 3-lap race you start at 1/3, and still have an entire lap to
go when the count is at 3/3.

So it's just a question of whether you report progress on item N or work
finished on item N, not whether laps in a race have more or less
value.

To reference my earlier E-Mail[1] are you eating the first apple or the
zeroeth apple? I don't think one is more or less right in the
mathematical sense, I just think for UX aimed at people counting "laps"
makes more sense than counting completed items.

>> Anyway, aside from that. I think, and I really would be advocating this
>> too, even if our respective positions were reversed, that *in this case*
>> it makes sense to just take something like SZEDER's patch here
>> as-is. Because in that file there's some dozen occurrences of that exact
>> pattern.
>
> The code without the patch either forgets to report the last work item
> in the count-work-items model or is one short in the count-iterations
> model, so a fix is needed either way.

It won't be one short, for a loop of 2 items we'll go from:

     0/2
     1/2
     1/2, done

To:

     1/2
     2/2
     2/2, done

Just like the rest of the uses of the progress API in that file.

Which is one of the two reasons I prefer this pattern, i.e. this is less
verbose:

    start_progress()
    for i in (0..X-1):
        display_progress(i+1)
        work()
    stop_progress()

Than one of these, which AFAICT would be your recommendation:

    # Simplest, but stalls on work()
    start_progress()
    for i in (0..X-1):
        work()
        display_progress(i+1)
    stop_progress()

    # More verbose, but doesn't:
    start_progress()
    for i in (0..X-1):
        display_progress(i)
        work()
        display_progress(i+1)
    stop_progress()

    # Ditto:
    start_progress()
    display_progress(0)
    for i in (0..X-1):
        work()
        display_progress(i+1)
    stop_progress()

And of course if your loop continues or whatever you'll need a last
"display_progress(X)" before the "stop_progress()".

The other is that if you count laps you can have your progress bar
optionally show progress on that item. E.g. we could if we stall show
seconds spent that we're hung on that item, or "3/3 ETA 40s". I have a
patch[3] that takes an initial step towards that, with some more queued
locally.

> The number of the other occurrences wouldn't matter if they were
> buggy, but in this case they indicate that Stolee consistently used
> the count-iterations model.  Thus using it in the patch as well makes
> sense.

>> Let's just bring this one case in line with the rest, if we then want to
>> argue that one or the other use of the progress.c API is wrong as a
>> general thing, I think it makes more sense to discuss that as some
>> follow-up series that changes these various API uses en-masse than
>> holding back isolated fixes that leave the state of the progress bar at
>> != 100%.
>
> Agreed.

Sorry to go on about this again :)

1. https://lore.kernel.org/git/87lf7k2bem.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87o8c8z105.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
@ 2021-06-26 20:22             ` René Scharfe
  2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
  2021-06-27 17:31               ` Felipe Contreras
  0 siblings, 2 replies; 138+ messages in thread
From: René Scharfe @ 2021-06-26 20:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 26 2021, René Scharfe wrote:
>
>> Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Mon, Jun 21 2021, René Scharfe wrote:
>>>
>>>> Before an item is done, it should be reported as not done.  After an
>>>> item is done, it should be reported as done.  One loop iteration
>>>> finishes one item.  Thus the number of items to report at the bottom of
>>>> the loop is one higher than at the top.  i is the correct number to
>>>> report at the top of a zero-based loop, i+1 at the bottom.
>>
>>> Anyone with more time than sense can go and read over our linked back &
>>> forth thread where we're disagreeing on that point :). I think the pattern
>>> in commit-graph.c makes sense, you don't.
>>
>> Thanks for this comment, I think I got it now: Work doesn't count in the
>> commit-graph.c model of measuring progress, literally.  I.e. progress is
>> the same before and after one item of work.
>
> The progress isn't the same, we update the count. Or do you mean in the
> time it takes us to go from the end of the for-loop & jump to the start
> of it and update the count?
>
>> Instead it counts the number of loop iterations.  The model I describe
>> above counts finished work items instead.  The results of the two
>> models differ by at most one despite their inverted axiom regarding
>> the value of work.
>>
>> Phew, that took me a while.
>
> For what it's worth I had some extensive examples in our initial
> thread[1][2] (search for "apple" and "throughput", respectively), that
> you cut out when replying to the relevant E-Mails. I'd think we could
> probably have gotten here earlier :)

Perhaps, but the key point for me was to invert my basic assumption that
a work item has value, and for that I had to realize and state it first
(done above).  A mathematician would have done that in an instant, I
guess ("Invert, always invert").

> I'm a bit confused about this "value of work" comment.

Progress is a counter.  The difference of the counter before and after
a work item is done is one in the count-work model, but zero in the
count-iterations model.

> If you pick up a copy of say a video game like Mario Kart you'll find
> that for a 3-lap race you start at 1/3, and still have an entire lap to
> go when the count is at 3/3.
>
> So it's just a question of whether you report progress on item N or work
> finished on item N, not whether laps in a race have more or less
> value.

These are linked.  If you want to know which lap you are in, the answer
won't change until you start a new lap:

	for (i = 0; i < 3; i++) {
		display_progress(p, i + 1);
		drive_one_lap();
		display_progress(p, i + 1);
	}

If you want to know how many laps you finished, the answer will
increase after a lap is done:

	for (i = 0; i < 3; i++) {
		display_progress(p, i);
		drive_one_lap();
		display_progress(p, i + 1);
	}

> To reference my earlier E-Mail[1] are you eating the first apple or the
> zeroeth apple? I don't think one is more or less right in the
> mathematical sense, I just think for UX aimed at people counting "laps"
> makes more sense than counting completed items.

The difference between counting iterations and work items vanishes as
their numbers increase.  The most pronounced difference is observed when
there is only a single item of work.  The count-iterations model shows
1/1 from start to finish.  The count-work model shows 0/1 initially and
1/1 after the work is done.

As a user I prefer the second one.  If presented with just a number and
a percentage then I assume 100% means all work is done and would cancel
the program if that status is shown for too long.  With Git I have
learned that only the final ", done" really means done in some cases,
but that's an unnecessary lesson and still surprising to me.

>>> Anyway, aside from that. I think, and I really would be advocating this
>>> too, even if our respective positions were reversed, that *in this case*
>>> it makes sense to just take something like SZEDER's patch here
>>> as-is. Because in that file there's some dozen occurrences of that exact
>>> pattern.
>>
>> The code without the patch either forgets to report the last work item
>> in the count-work-items model or is one short in the count-iterations
>> model, so a fix is needed either way.
>
> It won't be one short, for a loop of 2 items we'll go from:
>
>      0/2
>      1/2
>      1/2, done
>
> To:
>
>      1/2
>      2/2
>      2/2, done
>
> Just like the rest of the uses of the progress API in that file.

Yes, just like I wrote -- the old code is one short compared to the
correct output of the count-iterations method.

For completeness' sake, the correct output of the count-work method
would be:

	0/2
	1/2
	2/2
	2/2, done
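
A minimal, self-contained sketch of the two models, assuming a
hypothetical report() that merely records each value where
display_progress() would be called (struct trace and both loop helpers
are illustrative names, not part of progress.c):

```c
#include <stddef.h>

#define TRACE_MAX 16

/* Hypothetical stand-in for display_progress(): records each value. */
struct trace {
	unsigned values[TRACE_MAX];
	size_t nr;
};

static void report(struct trace *t, unsigned n)
{
	if (t->nr < TRACE_MAX)
		t->values[t->nr++] = n;
}

/* count-iterations: report i + 1 at the top of every iteration. */
static void count_iterations(struct trace *t, unsigned total)
{
	unsigned i;

	for (i = 0; i < total; i++) {
		report(t, i + 1);
		/* work on item i would happen here */
	}
}

/* count-work: report 0 before any work, then i + 1 after each item. */
static void count_work(struct trace *t, unsigned total)
{
	unsigned i;

	report(t, 0);
	for (i = 0; i < total; i++) {
		/* work on item i would happen here */
		report(t, i + 1);
	}
}
```

For total = 2 the first helper records 1, 2 and the second records
0, 1, 2, i.e. the count-work variant makes exactly one more call.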

> Which is one of the two reasons I prefer this pattern, i.e. this is less
> verbose:
>
>     start_progress()
>     for i in (0..X-1):
>         display_progress(i+1)
>         work()
>     stop_progress()
>
> Than one of these, which AFAICT would be your recommendation:
>
>     # Simplest, but stalls on work()
>     start_progress()
>     for i in (0..X-1):
>         work()
>         display_progress(i+1)
>     stop_progress()
>
>     # More verbose, but doesn't:
>     start_progress()
>     for i in (0..X-1):
>         display_progress(i)
>         work()
>         display_progress(i+1)
>     stop_progress()
>
>     # Ditto:
>     start_progress()
>     display_progress(0)
>     for i in (0..X-1):
>         work()
>         display_progress(i+1)
>     stop_progress()
>
> And of course if your loop continues or whatever you'll need a last
> "display_progress(X)" before the "stop_progress()".

The count-work model needs one more progress update than the
count-iteration model.  We could do all updates in the loop header,
which is evaluated just the right number of times.  But I think that we
rather should choose between the models based on their results.

If each work item finishes within a progress display update period
(half a second) then there won't be any user-visible difference and
both models would do.

> The other is that if you count laps you can have your progress bar
> optionally show progress on that item. E.g. we could if we stall show
> seconds spent that we're hung on that item, or "3/3 ETA 40s". I have a
> patch[3] that takes an initial step towards that, with some more queued
> locally.

A time estimate for the whole operation (until ", done") would be nice.
It can help with the decision to go for a break or to keep staring at
the screen.  I guess we just need to remember when start_progress() was
called and can then estimate the remaining time once the first item is
done.  Stalling items would push the estimate further into the future.
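
A rough sketch of such a whole-operation estimate, assuming a plain
linear extrapolation from the time elapsed since start_progress() and
the number of finished items (estimate_remaining() is a hypothetical
helper, not an existing progress.c function):

```c
/*
 * Estimate the seconds remaining until all items are done.
 * "done" must be the number of *finished* items (the count-work
 * value).  Returns -1 when no estimate is possible yet.
 */
static double estimate_remaining(double elapsed_seconds,
				 unsigned done, unsigned total)
{
	if (!done || done > total)
		return -1.0;
	/* average time per finished item, times the items left */
	return elapsed_seconds * (double)(total - done) / (double)done;
}
```

With one of three items finished after 10 seconds this yields 20
seconds; if that first item had instead taken 20 seconds, the estimate
would grow to 40, which is the "stalling pushes the estimate out"
behaviour described above.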

A time estimate per item wouldn't help me much.  I'd have to subtract
to get the number of unfinished items, catch the maximum estimated
duration and multiply those values.  OK, by the time I manage that Git
is probably done -- but I'd rather like to leave arithmetic tasks to
the computer..

Seconds spent for the current item can be shown with both models.  The
progress value is not sufficient to identify the problem case in most
cases.  An ID of some kind (e.g. a file name or hash) would have to be
shown as well for that.  But how would I use that information?

René


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 20:22             ` René Scharfe
@ 2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
  2021-07-04 12:15                 ` René Scharfe
  2021-06-27 17:31               ` Felipe Contreras
  1 sibling, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-26 21:38 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Sat, Jun 26 2021, René Scharfe wrote:

> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>> [...]
>> To reference my earlier E-Mail[1] are you eating the first apple or the
>> zeroeth apple? I don't think one is more or less right in the
>> mathematical sense, I just think for UX aimed at people counting "laps"
>> makes more sense than counting completed items.
>
> The difference between counting iterations and work items vanishes as
> their numbers increase.  The most pronounced difference is observed when
> there is only a single item of work.  The count-iterations model shows
> 1/1 from start to finish.  The count-work model shows 0/1 initially and
> 1/1 after the work is done.
>
> As a user I prefer the second one.  If presented with just a number and
> a percentage then I assume 100% means all work is done and would cancel
> the program if that status is shown for too long.  With Git I have
> learned that only the final ", done" really means done in some cases,
> but that's an unnecessary lesson and still surprising to me.

What progress bar of ours goes slow enough that the difference matters
for you in either case?

The only one I know of is "Enumerating objects", which notably stalls at
the start, and which I'm proposing changing the output of in:
https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/

>> [...]
>> Which is one of the two reasons I prefer this pattern, i.e. this is less
>> verbose:
>>
>>     start_progress()
>>     for i in (0..X-1):
>>         display_progress(i+1)
>>         work()
>>     stop_progress()
>>
>> Than one of these, which AFAICT would be your recommendation:
>>
>>     # Simplest, but stalls on work()
>>     start_progress()
>>     for i in (0..X-1):
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>>     # More verbose, but doesn't:
>>     start_progress()
>>     for i in (0..X-1):
>>         display_progress(i)
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>>     # Ditto:
>>     start_progress()
>>     display_progress(0)
>>     for i in (0..X-1):
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>> And of course if your loop continues or whatever you'll need a last
>> "display_progress(X)" before the "stop_progress()".
>
> The count-work model needs one more progress update than the
> count-iteration model.  We could do all updates in the loop header,
> which is evaluated just the right number of times.  But I think that we
> rather should choose between the models based on their results.

I think we should be more biased towards API convenience here than
anything else, because for most of these they'll go so fast that users
won't see the difference. I just also happen to think that the easy way
to do it is also more correct.

Also, in the cases you're focusing on we count up to exactly 100%, and
we therefore expect N calls to display_progress() (ignoring the rare but
allowed duplicate calls with the same number, which most callers don't
use), so we could just have a convenience API of:

    start_progress()
    for i in (0..X-1):
        progress_update() /* passing "i" not needed, we increment internally */
        work()
    stop_progress()

Then we could even make showing 0/N or 1/N the first time configurable,
but we could only do both if we use the API as I'm suggesting, not as
you want to use it.
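
As a sketch only, such an auto-incrementing counter could look roughly
like this. struct auto_progress, auto_progress_start() and the
first_is_zero switch are assumptions made for illustration; only the
name progress_update() comes from the pseudocode above, and nothing
here exists in progress.c today:

```c
/* Hypothetical progress state that owns its own counter. */
struct auto_progress {
	unsigned next;	/* value the next progress_update() will show */
	unsigned total;
};

static void auto_progress_start(struct auto_progress *p, unsigned total,
				int first_is_zero)
{
	/* the configurable part: show 0/N or 1/N on the first update */
	p->next = first_is_zero ? 0 : 1;
	p->total = total;
}

/* Returns the value that would be displayed, then advances the counter. */
static unsigned progress_update(struct auto_progress *p)
{
	return p->next++;
}
```

Called once at the top of each of three iterations this shows 1, 2, 3
by default and 0, 1, 2 with first_is_zero set; in the latter case the
final 3/3 would still have to come from somewhere else.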

You can also sort of get me what I want with what you're
suggesting, but you'd conflate "setup" work with the first item, which
matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
more verbose at the code level, and complex (need to deal with "break",
"continue"), so why would you?

Which I think is the main point here: it's not so much a disagreement
as a bit of talking past one another.

I.e. I think you're narrowly focused on what I think of as a display
issue of the current progress bars we show, I'm mainly interested in how
we use the API, and we should pick a way to use it that allows us to do
more with displaying progress better in the future.

> If each work item finishes within a progress display update period
> (half a second) then there won't be any user-visible difference and
> both models would do.

A trivial point, but don't you mean a second? AFAICT for "delayed" we
display after 2 seconds, then update every 1 second; it's only if we
have display_throughput() that we do every 0.5s.

>> The other is that if you count laps you can have your progress bar
>> optionally show progress on that item. E.g. we could if we stall show
>> seconds spent that we're hung on that item, or "3/3 ETA 40s". I have a
>> patch[3] that takes an initial step towards that, with some more queued
>> locally.
>
> A time estimate for the whole operation (until ", done") would be nice.
> It can help with the decision to go for a break or to keep staring at
> the screen.  I guess we just need to remember when start_progress() was
> called and can then estimate the remaining time once the first item is
> done.  Stalling items would push the estimate further into the future.
>
> A time estimate per item wouldn't help me much.  I'd have to subtract
> to get the number of unfinished items, catch the maximum estimated
> duration and multiply those values.  OK, by the time I manage that Git
> is probably done -- but I'd rather like to leave arithmetic tasks to
> the computer..
>
> Seconds spent for the current item can be shown with both models.  The
> progress value is not sufficient to identify the problem case in most
> cases.  An ID of some kind (e.g. a file name or hash) would have to be
> shown as well for that.  But how would I use that information?

If we're spending enough time on one item to update progress for it N
times we probably want to show throughput/progress/ETA mainly for that
item, not the work as a whole.

If we do run into those cases and want to convert them to show some
intra-item progress we'd need to first migrate them over to my suggested
way of using the API if we picked yours first; with my suggested use we
only need to add new API calls (display_throughput(), and similar future
calls/implicit display).

Consider e.g. using the packfile-uri response to ask the user to
download N number of URLs, just because we grab one at 1MB/s that
probably won't do much to inform our estimate of the next one (which may
be on a different CDN etc.).

The throughput API was intended (and mainly used) for the estimate for
the whole batch, I just wonder if as we use it more widely whether that
use-case won't be the exception.


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 20:22             ` René Scharfe
  2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
@ 2021-06-27 17:31               ` Felipe Contreras
  1 sibling, 0 replies; 138+ messages in thread
From: Felipe Contreras @ 2021-06-27 17:31 UTC (permalink / raw)
  To: René Scharfe, Ævar Arnfjörð Bjarmason
  Cc: SZEDER Gábor, git

René Scharfe wrote:
> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:

> > For what it's worth I had some extensive examples in our initial
> > thread[1][2] (search for "apple" and "throughput", respectively), that
> > you cut out when replying to the relevant E-Mails. I'd think we could
> > probably have gotten here earlier :)
> 
> Perhaps, but the key point for me was to invert my basic assumption that
> a work item has value, and for that I had to realize and state it first
> (done above).  A mathematician would have done that in an instant, I
> guess ("Invert, always invert").

When you get down to it, numbers almost never mean what most people
think they mean.

If work is a continuum, the probability that you would land exactly at
1/3 is zero: P(X=1/3) = 0. What you want is the probability of at most
1/3, P(X<=1/3), and that includes 0.

So, anything from 0 to 1/3 is part of the first chunk of work.

-- 
Felipe Contreras


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
@ 2021-07-04 12:15                 ` René Scharfe
  2021-07-05 14:09                   ` Junio C Hamano
  2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 138+ messages in thread
From: René Scharfe @ 2021-07-04 12:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 26 2021, René Scharfe wrote:
>
>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>> [...]
>>> To reference my earlier E-Mail[1] are you eating the first apple or the
>>> zeroeth apple? I don't think one is more or less right in the
>>> mathematical sense, I just think for UX aimed at people counting "laps"
>>> makes more sense than counting completed items.
>>
>> The difference between counting iterations and work items vanishes as
>> their numbers increase.  The most pronounced difference is observed when
>> there is only a single item of work.  The count-iterations model shows
>> 1/1 from start to finish.  The count-work model shows 0/1 initially and
>> 1/1 after the work is done.
>>
>> As a user I prefer the second one.  If presented with just a number and
>> a percentage then I assume 100% means all work is done and would cancel
>> the program if that status is shown for too long.  With Git I have
>> learned that only the final ", done" really means done in some cases,
>> but that's an unnecessary lesson and still surprising to me.
>
> What progress bar of ours goes slow enough that the difference matters
> for you in either case?

I don't have an example -- Git, network and SSD are quick enough for my
small use cases.

The advantage of the count-work method is that the question doesn't even
come up.

> The only one I know of is "Enumerating objects", which notably stalls at
> the start, and which I'm proposing changing the output of in:
> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/

That's annoying, but the first number I see there has five or six digits,
so it's not an example of the issue mentioned above for me.

Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
this helps.  Perhaps there are cases when it gets stuck, but it's hard to
determine by the clock alone.  When I run gc, it just needs a few seconds
to prepare something and then starts visibly counting objects.  A more
fine-grained report of the preparation steps would help, but seeing
"stalled" would just scare me.

>> The count-work model needs one more progress update than the
>> count-iteration model.  We could do all updates in the loop header,
>> which is evaluated just the right number of times.  But I think that we
>> rather should choose between the models based on their results.
>
> I think we should be more biased towards API convenience here than
> anything else, because for most of these they'll go so fast that users
> won't see the difference. I just also happen to think that the easy way
> to do it is also more correct.

The convenience of having one less display_progress() call is only a
slight advantage.

Correctness is a matter of definitions.  Recently I learned that in Arabic
a person's age is given using the count-iterations model.  I.e. on the day
of your birth your age is one.  That causes trouble if you deal with
state officials that use the count-work, err, count-completed-years model,
where your age is one only after living through a full year.

The solution around here is to avoid ambiguity by not using terms like
"age" in laws, regulations and forms, but to state explicitly "full years
since birth" or so.

"2/3 (33%)" means something else to me than to you by default.  So a
solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
latter case.  Thoughts?
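
To illustrate the two labelled variants, here is a small sketch of a
formatter. format_progress() and enum progress_model are hypothetical
names, and the percentage is simply the integer n * 100 / total, which
is assumed here rather than taken from progress.c; it is left out of
the "working on" form because it doesn't fit there:

```c
#include <stdio.h>

enum progress_model { COUNT_WORK, COUNT_ITERATIONS };

/* Write an explicitly labelled progress string into buf. */
static int format_progress(char *buf, size_t len, unsigned n,
			   unsigned total, enum progress_model model)
{
	if (model == COUNT_WORK)
		/* n items are finished, so a percentage is meaningful */
		return snprintf(buf, len, "%u/%u (%u%%) done", n, total,
				total ? n * 100 / total : 0);
	/* item n is in flight; no percentage is shown */
	return snprintf(buf, len, "working on %u/%u", n, total);
}
```

For n = 2 and total = 3 this produces "2/3 (66%) done" in the
count-work case and "working on 2/3" in the count-iterations case.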

> Also, because for these cases that you're focusing on where we count up
> to exactly 100% and we therefore expect N calls to display_progress()
> (ignoring the rare but allowed duplicate calls with the same number,
> which most callers don't use). We could just have a convenience API of:
>
>     start_progress()
>     for i in (0..X-1):
>         progress_update() /* passing "i" not needed, we increment internally */
>         work()
>     stop_progress()
>
> Then we could even make showing 0/N or 1/N the first time configurable,
> but we could only do both if we use the API as I'm suggesting, not as
> you want to use it.

A function that increments the progress number relatively can be used
with both models.  It's more useful for the count-iterations model,
though, as in the count-work model you can piggy-back on the loop
counter check:

	for (i = 0; display_progress(p, i), i < X; i++)
		work();
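
A self-contained sketch of how that loop behaves, assuming a
hypothetical display_progress_stub() that only counts its calls and
remembers the last value (it stands in for display_progress() and
takes no progress struct):

```c
static unsigned calls, last;

/* Hypothetical stand-in for display_progress(). */
static void display_progress_stub(unsigned n)
{
	calls++;
	last = n;
}

/* Run the loop for x items and return how many updates were issued. */
static unsigned run(unsigned x)
{
	unsigned i;

	calls = 0;
	last = 0;
	/* the comma operator reports i, then evaluates the loop test */
	for (i = 0; display_progress_stub(i), i < x; i++) {
		if (i % 2)
			continue; /* still passes through the loop test */
		/* work on item i would happen here */
	}
	return calls;
}
```

For x = 3 the stub is called four times with 0, 1, 2, 3, so the final
value equals the total even though one iteration used "continue". A
"break" would skip the last update, so that case would still need an
explicit call after the loop.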

> You also sort of can get me what I want with what you're
> suggesting, but you'd conflate "setup" work with the first item, which
> matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
> more verbose at the code level, and complex (need to deal with "break",
> "continue"), so why would you?

It's not complicated, just slightly odd, because function calls are
seldom put into the loop counter check.

>> If each work item finishes within a progress display update period
>> (half a second) then there won't be any user-visible difference and
>> both models would do.
>
> A trivial point, but don't you mean a second? AFAICT for "delayed" we
> display after 2 seconds, then update every 1 second, it's only if we
> have display_throughput() that we do every 0.5s.

Right, I mixed those up.

>>> The other is that if you count laps you can have your progress bar
>>> optionally show progress on that item. E.g. we could if we stall show
>>> seconds spent that we're hung on that item, or "3/3 ETA 40s". I have a
>>> patch[3] that takes an initial step towards that, with some more queued
>>> locally.
>>
>> A time estimate for the whole operation (until ", done") would be nice.
>> It can help with the decision to go for a break or to keep staring at
>> the screen.  I guess we just need to remember when start_progress() was
>> called and can then estimate the remaining time once the first item is
>> done.  Stalling items would push the estimate further into the future.
>>
>> A time estimate per item wouldn't help me much.  I'd have to subtract
>> to get the number of unfinished items, catch the maximum estimated
>> duration and multiply those values.  OK, by the time I manage that Git
>> is probably done -- but I'd rather like to leave arithmetic tasks to
>> the computer..
>>
>> Seconds spent for the current item can be shown with both models.  The
>> progress value is not sufficient to identify the problem case in most
>> cases.  An ID of some kind (e.g. a file name or hash) would have to be
>> shown as well for that.  But how would I use that information?
>
> If we're spending enough time on one item to update progress for it N
> times we probably want to show throughput/progress/ETA mainly for that
> item, not the work as a whole.

Throughput is shown for the last time period.  It is independent of the
item or items being worked on during that period.  If one item takes
multiple periods to finish then indeed only its current progress is
shown automatically, as you want.

Showing intra-item progress requires some kind of hierarchical API to
keep track of both parent and child progress and show them in some
readable way.  Perhaps appending another progress display would suffice?
"Files 1/3 (33%) Bytes 17kB/9GB (0%)".  Not sure.

Calculating the ETA of a single item seems hard.  It does require intra-
item progress to be reported by the work code.

> If we do run into those cases and want to convert them to show some
> intra-item progress we'd need to first migrate them over to suggested
> way of using the API if we picked yours first, with my suggested use we
> only need to add new API calls (display_throughput(), and similar future
> calls/implicit display).

I don't see why.  The intra-item progress numbers need to be reported in
any case if they are to be shown somehow.  If the model is clear then we
can show unambiguous output.

> Consider e.g. using the packfile-uri response to ask the user to
> download N number of URLs, just because we grab one at 1MB/s that
> probably won't do much to inform our estimate of the next one (which may
> be on a different CDN etc.).

Sure, if the speed of work items varies wildly then estimates will be
unreliable.

I can vaguely imagine that it would be kind of useful to know the
throughput of different data sources, to allow e.g. using a different
mirror next time.  The current API doesn't distinguish work items in a
meaningful way, though.  They only have numbers.  I'd need a name (e.g.
the URL) for intra-item progress numbers to mean something.

René


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-04 12:15                 ` René Scharfe
@ 2021-07-05 14:09                   ` Junio C Hamano
  2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 138+ messages in thread
From: Junio C Hamano @ 2021-07-05 14:09 UTC (permalink / raw)
  To: René Scharfe
  Cc: Ævar Arnfjörð Bjarmason, SZEDER Gábor, git

René Scharfe <l.s.r@web.de> writes:

> ...  A more
> fine-grained report of the preparation steps would help, but seeing
> "stalled" would just scare me.

True.

> The convenience of having one less display_progress() call is only a
> slight advantage.

True, too.

> "2/3 (33%)" means something else to me than to you by default.  So a
> solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
> or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
> latter case.  Thoughts?


I still think "2/3 done" is how we should look at it, but either way,
that's a good way to view the problem.

Thanks.


[Unrelated Tangent]

> ...  Recently I learned that in Arabic a person's age is given
> using the count-iterations model.  I.e. on the day of your birth
> your age is one.

East Asian age reckoning is shared among East Asian countries and works
the same way, so it's not just Arabic.


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-04 12:15                 ` René Scharfe
  2021-07-05 14:09                   ` Junio C Hamano
@ 2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
  2021-07-06 16:02                     ` René Scharfe
  1 sibling, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-05 23:28 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Sun, Jul 04 2021, René Scharfe wrote:

> Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sat, Jun 26 2021, René Scharfe wrote:
>>
>>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>>> [...]
>>>> To reference my earlier E-Mail[1] are you eating the first apple or the
>>>> zeroeth apple? I don't think one is more or less right in the
>>>> mathematical sense, I just think for UX aimed at people counting "laps"
>>>> makes more sense than counting completed items.
>>>
>>> The difference between counting iterations and work items vanishes as
>>> their numbers increase.  The most pronounced difference is observed when
>>> there is only a single item of work.  The count-iterations model shows
>>> 1/1 from start to finish.  The count-work model shows 0/1 initially and
>>> 1/1 after the work is done.
>>>
>>> As a user I prefer the second one.  If presented with just a number and
>>> a percentage then I assume 100% means all work is done and would cancel
>>> the program if that status is shown for too long.  With Git I have
>>> learned that only the final ", done" really means done in some cases,
>>> but that's an unnecessary lesson and still surprising to me.
>>
>> What progress bar of ours goes slow enough that the difference matters
>> for you in either case?
>
> I don't have an example -- Git, network and SSD are quick enough for my
> small use cases.
>
> The advantage of the count-work method is that the question doesn't even
> come up.
>
>> The only one I know of is "Enumerating objects", which notably stalls at
>> the start, and which I'm proposing changing the output of in:
>> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
>
> That's annoying, but the first number I see there has five or six digits,
> so it's not an example of the issue mentioned above for me.

Because it stalls and shows nothing, but with my patch it'll show
something while stalling. FWIW, on linux.git from a cold cache it took
5-10s before showing anything.

> Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
> this helps.  Perhaps there are cases when it gets stuck, but it's hard to
> determine by the clock alone.  When I run gc, it just needs a few seconds
> to prepare something and then starts visibly counting objects.  A more
> fine-grained report of the preparation steps would help, but seeing
> "stalled" would just scare me.

Fair enough, I have other patches to have it show a spinner. Again, API
v.s. UI. The idea is that we show something before we start the loop.

>>> The count-work model needs one more progress update than the
>>> count-iteration model.  We could do all updates in the loop header,
>>> which is evaluated just the right number of times.  But I think that we
>>> rather should choose between the models based on their results.
>>
>> I think we should be more biased towards API convenience here than
>> anything else, because for most of these they'll go so fast that users
>> won't see the difference. I just also happen to think that the easy way
>> to do it is also more correct.
>
> The convenience of having one less display_progress() call is only a
> slight advantage.
>
> Correctness is a matter of definitions.  Recently I learned that in Arabic
> a person's age is given using the count-iterations model.  I.e. on the day
> of your birth your age is one.  That causes trouble if you deal with
> state officials that use the count-work, err, count-completed-years model,
> where your age is one only after living through a full year.
>
> The solution around here is to avoid ambiguity by not using terms like
> "age" in laws, regulations and forms, but to state explicitly "full years
> since birth" or so.
>
> "2/3 (33%)" means something else to me than to you by default.  So a
> solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
> or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
> latter case.  Thoughts?

OK, UI again.

>> Also, because for these cases that you're focusing on where we count up
>> to exactly 100% and we therefore expect N calls to display_progress()
>> (igroning the rare but allowed duplicate calls with the same number,
>> which most callers don't use). We could just have a convenience API of:
>>
>>     start_progress()
>>     for i in (0..X-1):
>>         progress_update() /* passing "i" not needed, we increment internally */
>>         work()
>>     stop_progress()
>>
>> Then we could even make showing 0/N or 1/N the first time configuable,
>> but we could only do both if we use the API as I'm suggesting, not as
>> you want to use it.
>
> A function that increments the progress number relatively can be used
> with both models.  It's more useful for the count-iterations model,
> though, as in the count-work model you can piggy-back on the loop
> counter check:
>
> 	for (i = 0; display_progress(p, i), i < X; i++)
> 		work();

Aside from this whole progress API discussion I find sticking stuff like
that in the for-loop body to be less readable.

But no, that can't be used with both models, because it conflates the 0
of the 1st iteration with 0 of doing prep work. I.e.:

    p = start_progress();
    display_progress(p, 0);
    prep_work();
    for (i = 0; i < 100; i++)
        display_progress(p, i + 1);

Which is implicitly how that "stalled" patch views the world, i.e. our
count is -1 at start_progress() (that's already the case in
progress.c).

If you set it to 0 you're not working on the 1st item yet, but
explicitly doing setup. 

Then at n=1 you're starting work on the 1st item.
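
The two loop shapes discussed above can be compared with a small
stand-alone sketch.  It is only an illustration, not code from
progress.c: show() and work() are hypothetical stand-ins for
display_progress() and the per-item work, and the struct only records
which number would be on screen while each item is processed.

```c
#include <assert.h>

#define MAX_ITEMS 16

/* Hypothetical recorder, not the real "struct progress". */
struct trace {
	unsigned shown_during[MAX_ITEMS]; /* value on screen while item i runs */
	unsigned last;                    /* most recently reported value */
	unsigned calls;                   /* number of show() calls */
};

/* Stand-in for display_progress(): remember the reported value. */
void show(struct trace *t, unsigned n)
{
	t->last = n;
	t->calls++;
}

/* Stand-in for the per-item work: note what the user sees meanwhile. */
void work(struct trace *t, unsigned i)
{
	assert(i < MAX_ITEMS);
	t->shown_during[i] = t->last;
}

/* count-work model, using the loop-condition trick quoted above */
void run_loop_condition_trick(struct trace *t, unsigned total)
{
	unsigned i;

	for (i = 0; show(t, i), i < total; i++)
		work(t, i);
}

/* explicit 0 for setup, then "i + 1" when starting each item */
void run_setup_then_items(struct trace *t, unsigned total)
{
	unsigned i;

	show(t, 0);
	/* prep_work() would run here, with 0 on screen */
	for (i = 0; i < total; i++) {
		show(t, i + 1);
		work(t, i);
	}
}
```

In this sketch both variants make total + 1 calls and end at "total";
what differs is the number on screen while the first item is being
processed (0 in the first variant, 1 in the second).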

>> You also sort of can get me what I want with with what you're
>> suggesting, but you'd conflate "setup" work with the first item, which
>> matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
>> more verbose at the code level, and complex (need to deal with "break",
>> "continue"), so why would you?
>
> It's not complicated, just slightly odd, because function calls are
> seldomly put into the loop counter check.

FWIW the "complicated" here was referring to dealing with break/continue.

Yes, I'll grant you that there are cases where the ugliness/oddity of that
for-loop trick is going to be better than dealing with that, but there
are also while loops doing progress, callbacks etc.

Picking an API pattern that works with all of that makes sense, since
the UI can render the count one way or the other.

>>> If each work item finishes within a progress display update period
>>> (half a second) then there won't be any user-visible difference and
>>> both models would do.
>>
>> A trivial point, but don't you mean a second? AFAICT for "delayed" we
>> display after 2 seconds, then update every 1 seconds, it's only if we
>> have display_throughput() that we do every 0.5s.
>
> Right, I mixed those up.
>
>>>> The other is that if you count laps you can have your progress bar
>>>> optionally show progress on that item. E.g. we could if we stall show
>>>> seconds spend that we're hung on that item, or '3/3 ETA 40s". I have a
>>>> patch[3] that takes an initial step towards that, with some more queued
>>>> locally.
>>>
>>> A time estimate for the whole operation (until ", done") would be nice.
>>> It can help with the decision to go for a break or to keep staring at
>>> the screen.  I guess we just need to remember when start_progress() was
>>> called and can then estimate the remaining time once the first item is
>>> done.  Stalling items would push the estimate further into the future.
>>>
>>> A time estimate per item wouldn't help me much.  I'd have to subtract
>>> to get the number of unfinished items, catch the maximum estimated
>>> duration and multiply those values.  OK, by the time I manage that Git
>>> is probably done -- but I'd rather like to leave arithmetic tasks to
>>> the computer..
>>>
>>> Seconds spent for the current item can be shown with both models.  The
>>> progress value is not sufficient to identify the problem case in most
>>> cases.  An ID of some kind (e.g. a file name or hash) would have to be
>>> shown as well for that.  But how would I use that information?
>>
>> If we're spending enough time on one item to update progress for it N
>> times we probably want to show throughput/progress/ETA mainly for that
>> item, not the work as a whole.
>
> Throughput is shown for the last time period.  It is independent of the
> item or items being worked on during that period.  If one item takes
> multiple periods to finish then indeed only its current progress is
> shown automatically, as you want.
>
> Showing intra-item progress requires some kind of hierarchical API to
> keep track of both parent and child progress and show them in some
> readable way.  Perhaps appending another progress display would suffice?
> "Files 1/3 (33%) Bytes 17kB/9GB (0%)".  Not sure.

Yes, this is another thing I'm heading for with the patches I posted.

For now I just fixed bugs in the state machine of how many characters we
erase; now we always reset exactly as much as we need to, and pass
things like ", done" around, not ", done\n" or ", done\r" (i.e. the
output we're emitting isn't conflated with whether we're clearing the
line or creating a new line).

It's a relatively straightforward change from there to have N progress
structs that each track/emit their part of a larger progress bar,
e.g. something like the progress prove(1) shows you (test status for
each concurrent test you're running).

You just need a "parent" progress struct that has the "title" (or none),
and receives the signal, and to have N registered sub-progress structs.

> Calculating the ETA of a single item seems hard.  It does require intra-
> item progress to be reported by the work code.
>
>> If we do run into those cases and want to convert them to show some
>> intra-item progress we'd need to first migrate them over to suggested
>> way of using the API if we picked yours first, with my suggested use we
>> only need to add new API calls (display_throughput(), and similar future
>> calls/implicit display).
>
> I don't see why.  The intra-item progress numbers need to be reported in
> any case if they are to be shown somehow.  If the model is clear then we
> can show unambiguous output.

Because you want to show:

    Files 1/3 (33%) Bytes 17kB/9GB (0%)

Not:

    Files 0/3 (33%) Bytes 17kB/9GB (0%)

You're downloading the 1st file, not the 0th file, so the code is a
for-loop (or equivalent) with a display_progress(p, i + 1) for that
file, not display_progress(p, i).

This is the main reason I prefer the API and UI of reporting "what item
am I on?" v.s. "how many items are done?", because it's easy to add
intra-item state to the former.

>> Consider e.g. using the packfile-uri response to ask the user to
>> download N number of URLs, just because we grab one at 1MB/s that
>> probably won't do much to inform our estimate of the next one (which may
>> be on a different CDN etc.).
>
> Sure, if the speed of work items varies wildly then estimates will be
> unreliable.
>
> I can vaguely imagine that it would be kind of useful to know the
> throughput of different data sources, to allow e.g. use a different
> mirror next time.  The current API doesn't distinguish work items in a
> meaningful way, though.  They only have numbers.  I'd need a name (e.g.
> the URL) for intra-item progress numbers to mean something.

Sure, anyway, let's assume all those numbers are magically known and
constant. The point was that as noted above you're downloading the 1st
file, not the 0th file, and want to show throughput/ETA etc. for that
file.

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
@ 2021-07-06 16:02                     ` René Scharfe
  0 siblings, 0 replies; 138+ messages in thread
From: René Scharfe @ 2021-07-06 16:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 06.07.21 um 01:28 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Jul 04 2021, René Scharfe wrote:
>
>> Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Sat, Jun 26 2021, René Scharfe wrote:
>>>
>>>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>> The only one I know of is "Enumerating objects", which notably stalls at
>>> the start, and which I'm proposing changing the output of in:
>>> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
>>
>> That's annoying, but the first number I see there has five or six digits,
>> so it's not an example of the issue mentioned above for me.
>
> Because it stalls and shows nothing, but with my patch it'll show
> something while stalling, FWIW on linux.git from a cold cache it took
> 5-10s before showing anything.
>
>> Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
>> this helps.  Perhaps there are cases when it gets stuck, but it's hard to
>> determine by the clock alone.  When I run gc, it just needs a few seconds
>> to prepare something and then starts visibly counting objects.  A more
>> fine-grained report of the preparation steps would help, but seeing
>> "stalled" would just scare me.
>
> Fair enough, I have other patches to have it show a spinner. Again, API
> v.s. UI. The idea is that we show something before we start the loop.

A spinner would be nicer, but I would be more interested to see what it is
actually spending all that time on.  A separate progress line might be
justified here.

>>> Also, because for these cases that you're focusing on where we count up
>>> to exactly 100% and we therefore expect N calls to display_progress()
>>> (igroning the rare but allowed duplicate calls with the same number,
>>> which most callers don't use). We could just have a convenience API of:
>>>
>>>     start_progress()
>>>     for i in (0..X-1):
>>>         progress_update() /* passing "i" not needed, we increment internally */
>>>         work()
>>>     stop_progress()
>>>
>>> Then we could even make showing 0/N or 1/N the first time configuable,
>>> but we could only do both if we use the API as I'm suggesting, not as
>>> you want to use it.
>>
>> A function that increments the progress number relatively can be used
>> with both models.  It's more useful for the count-iterations model,
>> though, as in the count-work model you can piggy-back on the loop
>> counter check:
>>
>> 	for (i = 0; display_progress(p, i), i < X; i++)
>> 		work();
>
> Aside from this whole progress API discussion I find sticking stuff like
> that in the for-loop body to be less readable.
>
> But no, that can't be used with both models, because it conflates the 0
> of the 1st iteration with 0 of doing prep work. I.e.:
>
>     p = start_progress();
>     display_progress(p, 0);
>     prep_work();
>     for (i = 0; i < 100; i++)
>         display_progress(p, i + 1);
>
> Which is implicitly how that "stalled" patch views the world, i.e. our
> count is -1 is at start_progress() (that's already the case in
> progress.c).
>
> If you set it to 0 you're not working on the 1st item yet, but
> explicitly doing setup.
>
> Then at n=1 you're starting work on the 1st item.

A distinct preparation phase feels like an extension to the progress
API.  A symmetric cleanup phase at the end may make sense as well then.

I assume that preparations would be done between the start_progress call
and the first display_progress (no matter what number it reports).  And
cleanup would be done between the last display_progress call and the
stop_progress call.

In the count-iterations model this might report the time taken for the
first or last item as preparation or cleanup depending on the placement
of the display_progress call.  That shouldn't be much of a problem,
though, as the value of one work item is zero in that model.

> FWIW the "complicated" here was referring to dealing with break/continue.
>
> Yes I'll grant you that there's cases where the uglyness/oddity of that
> for-loop trick is going to be better than dealing with that, but there's
> also while loops doing progress, callbacks etc.

while loops can easily be converted to for loops, of course.

Callbacks are a different matter.  I think we should use them less in
general (they force different operations to use the same set of
parameters, which is worked around with context structs).  A function
to increment progress would help them because then they wouldn't need
to keep track of the item/iteration count themselves in a context
variable.

However, in some cases display_progress calls are rate-limited, e.g.
midx_display_sparse_progress does that for performance reasons.  I
wonder why, and whether this is a problem that needs to be addressed
for all callers.  We don't want the progress API to delay the actual
progress significantly!  Currently display_progress avoids updating
the progress counter; an increment function would need to write an
updated value at each call.
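
A rough sketch of what such an increment function might look like,
to make the cost concern concrete.  Everything here is hypothetical:
the struct, its fields and sketch_increment() are made up for
illustration and are not the layout or API of progress.c.  The
"update_pending" field merely plays the role of a flag that a timer
would set when it is time to redraw.

```c
#include <assert.h>

/* Hypothetical state, not the real "struct progress". */
struct sketch_progress {
	unsigned long long value;  /* counter kept by the API, not the caller */
	unsigned long long total;
	int update_pending;        /* a timer would set this in real code */
	unsigned redraws;          /* how often output would be produced */
};

/* Relative update: the caller no longer passes the item count. */
void sketch_increment(struct sketch_progress *p)
{
	assert(p->value < p->total);
	p->value++;              /* this write happens on every call */
	if (!p->update_pending)
		return;          /* cheap path: no output this time */
	p->update_pending = 0;
	p->redraws++;            /* real code would print the line here */
}
```

The point of the sketch is that the counter write cannot be skipped
the way the output can, so callers in very hot loops would pay for at
least one store per item.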

> Picking an API pattern that works with all of that makes sense, since
> the UI can render the count one way or the other.

Right.

>>> If we do run into those cases and want to convert them to show some
>>> intra-item progress we'd need to first migrate them over to suggested
>>> way of using the API if we picked yours first, with my suggested use we
>>> only need to add new API calls (display_throughput(), and similar future
>>> calls/implicit display).
>>
>> I don't see why.  The intra-item progress numbers need to be reported in
>> any case if they are to be shown somehow.  If the model is clear then we
>> can show unambiguous output.
>
> Because you want to show:
>
>     Files 1/3 (33%) Bytes 17kB/9GB (0%)
>
> Not:
>
>     Files 0/3 (33%) Bytes 17kB/9GB (0%)
>
> You're downloading the 1st file, not the 0th file, so the code is a
> for-loop (or equivalent) with a display_progress(p, i + 1) for that
> file, not display_progress(p, i).
>
> This is the main reason I prefer the API and UI of reporting "what item
> am I on?" v.s. "how many items are done?", because it's easy to add
> intra-item state to the former.

Both look confusing.  If I cared enough about one of the files or each
of them to want to know their individual progress, then I'd certainly
want to see their names instead of some random number.

And as you write above: The display part can easily add or subtract one
to convert the number between models.

> Sure, anyway, let's assume all those numbers are magically known and
> constant. The point was that as noted above you're downloading the 1st
> file, not the 0th file, and want to show throughput/ETA etc. for that
> file.

OK, but still some kind of indication would have to be given that the
Bytes relate to a particular File instead of being the total for this
activity.  Perhaps like this, but it's a bit cluttered:

   File 1 (Bytes 17kB/9GB, 0% done) of 3 (0% done in total)
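
A line of that shape could be produced by something like the sketch
below.  It is purely illustrative: pct() and format_nested() are
made-up names, bytes are printed as raw numbers instead of
human-readable units, and the overall percentage is assumed to count
only completely finished files.

```c
#include <assert.h>
#include <stdio.h>

/* Integer percentage, rounded down. */
unsigned pct(unsigned long long done, unsigned long long total)
{
	assert(total > 0);
	return (unsigned)(done * 100 / total);
}

/*
 * Format "File <item> (...) of <nr_items> (...)" into buf.
 * "item" is 1-based: the file currently being worked on.
 * Returns what snprintf() returns.
 */
int format_nested(char *buf, size_t len, unsigned item, unsigned nr_items,
		  unsigned long long bytes_done,
		  unsigned long long bytes_total)
{
	assert(item >= 1 && item <= nr_items);
	return snprintf(buf, len,
			"File %u (Bytes %llu/%llu, %u%% done) of %u (%u%% done in total)",
			item, bytes_done, bytes_total,
			pct(bytes_done, bytes_total),
			nr_items,
			pct(item - 1, nr_items)); /* finished files only */
}
```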

René

^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 0/3] progress.c API users: fix bogus counting
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (8 preceding siblings ...)
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-07-22 12:20 ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
                     ` (3 more replies)
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  10 siblings, 4 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

These patches are a split-off from the larger topic they were submitted
as part of [1], which didn't get picked up. As I pointed out in [2], that
larger topic had some hidden untested-for flaws.

But these patches are just fixes to bogus progress bar output from
that topic. Let's consider them in isolation...

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/

SZEDER Gábor (2):
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line

Ævar Arnfjörð Bjarmason (1):
  midx: don't provide a total for QSORT() progress

 commit-graph.c | 2 +-
 entry.c        | 7 +++----
 midx.c         | 2 +-
 3 files changed, 5 insertions(+), 6 deletions(-)

-- 
2.32.0.957.gd9e39d72fe6


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:20   ` Ævar Arnfjörð Bjarmason
  2021-07-23 21:55     ` Junio C Hamano
  2021-08-02 21:07     ` SZEDER Gábor
  2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N.  This causes the failures of the tests
'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

Fix this by passing 'i + 1' to display_progress(), like most other
callsites do.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 1a2602da61..918061f207 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.32.0.957.gd9e39d72fe6
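
The off-by-one described in the commit message above can be reproduced
outside of git with a short sketch.  The helper names are hypothetical;
the percentage is computed with integer division, which is an
assumption that happens to match the "83% (5/6)" example line.

```c
#include <assert.h>

/*
 * Last value a loop over "nr" array elements hands to the progress
 * API when it passes "i + offset" on each iteration
 * (offset 0: the old code, offset 1: the fixed code).
 */
unsigned last_reported(unsigned nr, unsigned offset)
{
	unsigned i, last = 0;

	assert(nr > 0);
	for (i = 0; i < nr; i++)
		last = i + offset;
	return last;
}

/* Integer percentage, rounded down. */
unsigned percent(unsigned n, unsigned total)
{
	assert(total > 0);
	return n * 100 / total;
}
```

With six commits, the old call ends at 5 (83%), the fixed one at 6
(100%).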


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 2/3] midx: don't provide a total for QSORT() progress
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:20   ` Ævar Arnfjörð Bjarmason
  2021-07-23 21:56     ` Junio C Hamano
  2021-08-05 15:07     ` Phillip Wood
  2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
  2021-08-05 11:01   ` [PATCH v2 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  3 siblings, 2 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

The quicksort algorithm can be anywhere between O(n) and O(n^2), so
providing a "num objects" as a total means that in some cases we're
going to go past 100%.

This fixes a logic error in 5ae18df9d8e (midx: during verify group
objects by packfile to speed verification, 2019-03-21), which in turn
seems to have been diligently copied from my own logic error in the
commit-graph.c code, see 890226ccb57 (commit-graph write: add
itermediate progress, 2019-01-19).

That commit-graph code of mine was removed in
1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
2020-12-07), so we don't need to fix that too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 9a35b0255d..eaae75ab19 100644
--- a/midx.c
+++ b/midx.c
@@ -1291,7 +1291,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 
 	if (flags & MIDX_PROGRESS)
 		progress = start_sparse_progress(_("Sorting objects by packfile"),
-						 m->num_objects);
+						 0);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
-- 
2.32.0.957.gd9e39d72fe6
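
To see why the number of objects is not a usable "total" for the sort
itself, one can count comparator calls.  The sketch below uses plain
qsort() from the C library as a stand-in for git's QSORT() macro; the
helper names are made up, and the exact count depends on the input and
on the libc implementation, so no particular value is claimed.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of comparator calls made during the last sort. */
unsigned long comparisons;

/* Ordinary int comparator that also counts how often it is called. */
int counting_cmp(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	comparisons++;
	return (x > y) - (x < y);
}

/* Sort "v" in place and return how many comparisons were needed. */
unsigned long sort_and_count(int *v, size_t n)
{
	assert(v || !n);
	comparisons = 0;
	qsort(v, n, sizeof(*v), counting_cmp);
	return comparisons;
}
```

Since the amount of work is only known after the fact, a progress line
for the sort has no meaningful total to count up to.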


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:20   ` Ævar Arnfjörð Bjarmason
  2021-07-23 22:01     ` Junio C Hamano
  2021-08-02 21:48     ` SZEDER Gábor
  2021-08-05 11:01   ` [PATCH v2 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  3 siblings, 2 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    This would cause a BUG() in an upcoming change that adds an
    assertion checking if the "total" at the end matches the last
    progress bar update.

    The 'invalid file in delayed checkout' and 'missing file in
    delayed checkout' tests in 't0021-conversion.sh' would trigger
    it, because both use only one filter.  (The test 'delayed
    checkout in process filter' uses two filters but the first one
    does all the work, so that test already happens to succeed even
    with such an assertion.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.

After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the future BUG() assertion
discussed above but the 'missing file in delayed checkout' test would
still fail, because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all (this will be fixed
in a subsequent commit).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 125fabdbd5..d92dd020b3 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.32.0.957.gd9e39d72fe6
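
The first issue listed in the commit message above can be modelled
with a simplified sketch.  It ignores the outer retry loop and error
handling of finish_delayed_checkout(); the function names and the
"paths_per_filter" array are hypothetical and only describe how many
paths each filter processes.

```c
#include <assert.h>
#include <stddef.h>

/* Old scheme: one update per filter, before its paths are processed. */
unsigned last_shown_per_filter(const unsigned *paths_per_filter,
			       size_t nr_filters)
{
	unsigned total = 0, remaining, shown = 0;
	size_t f;

	for (f = 0; f < nr_filters; f++)
		total += paths_per_filter[f];
	remaining = total;
	for (f = 0; f < nr_filters; f++) {
		shown = total - remaining;        /* the progress update */
		remaining -= paths_per_filter[f]; /* this filter's paths */
	}
	return shown;
}

/* New scheme: one update per processed path, with its own counter. */
unsigned last_shown_per_path(const unsigned *paths_per_filter,
			     size_t nr_filters)
{
	unsigned processed = 0, shown = 0, k;
	size_t f;

	for (f = 0; f < nr_filters; f++)
		for (k = 0; k < paths_per_filter[f]; k++)
			shown = ++processed;      /* the progress update */
	return shown;
}
```

With a single filter handling four paths, the old scheme ends at 0 and
the new one at 4.  With two filters where the first does all the work,
both end at the total, which is the case the commit message describes
as already succeeding.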


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (9 preceding siblings ...)
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:54 ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
                     ` (9 more replies)
  10 siblings, 10 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

These patches were originally submitted as part of a much larger topic
at [1]. They add a "global_progress" "struct progress *" which we
assign/clear as we start/stop progress bars.

This will become important for some new progress features I have
planned, but right now is just used to assert that we don't start two
progress bars at the same time. 7/8 fixes an existing bug where we did
that.

To get there I fixed up the test helper to be able to test this, moved
some code around, and fixed a couple of existing nits in 5/8 and 6/8.

See also [2] which is a re-submission of that larger topic, but the
two can proceed independently.

1. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
2. https://lore.kernel.org/git/cover-0.3-0000000000-20210722T121801Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (8):
  progress.c tests: make start/stop verbs on stdin
  progress.c tests: test some invalid usage
  progress.c: move signal handler functions lower
  progress.c: call progress_interval() from progress_test_force_update()
  progress.c: stop eagerly fflush(stderr) when not a terminal
  progress.c: add temporary variable from progress struct
  pack-bitmap-write.c: add a missing stop_progress()
  progress.c: add & assert a "global_progress" variable

 pack-bitmap-write.c         |   1 +
 progress.c                  | 116 ++++++++++++++++++++----------------
 t/helper/test-progress.c    |  43 +++++++++----
 t/t0500-progress-display.sh | 103 +++++++++++++++++++++++++-------
 4 files changed, 178 insertions(+), 85 deletions(-)

-- 
2.32.0.957.gd9e39d72fe6


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH 1/8] progress.c tests: make start/stop verbs on stdin
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:54   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Change the usage of the "test-tool progress" helper introduced in
2bb74b53a49 (Test the progress display, 2019-09-16) to take commands
like "start" and "stop" on stdin, instead of running them implicitly.

This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such stress tests.
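
(Illustration only, not part of the patch: a minimal, self-contained
sketch of how a "start[ <total>[ <title>]]" line could be parsed. The
"struct start_cmd" type and the parse_start() name are hypothetical;
the real helper uses skip_prefix() and calls start_progress() directly.)

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of git. */
struct start_cmd {
	uint64_t total;
	const char *title; /* points into the input line or at the default */
};

/*
 * Parse "start[ <total>[ <title>]]".
 * Returns 0 on success, -1 if the line is not a well-formed "start".
 */
static int parse_start(const char *line, struct start_cmd *out)
{
	static const char default_title[] = "Working hard";
	char *end;

	out->total = 0;
	out->title = default_title;

	/* A bare "start" keeps the default title and a total of 0. */
	if (!strcmp(line, "start"))
		return 0;
	if (strncmp(line, "start ", 6))
		return -1;

	out->total = strtoull(line + 6, &end, 10);
	if (*end == '\0')
		return 0;		/* "start <total>" */
	if (*end == ' ') {
		out->title = end + 1;	/* "start <total> <title>" */
		return 0;
	}
	return -1;			/* trailing garbage after the number */
}
```

With this shape, "start 3" yields a total of 3 with the default title,
while "start 3x" is rejected, much like the die() in the helper.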

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 43 +++++++++++++++++++--------
 t/t0500-progress-display.sh | 59 +++++++++++++++++++++++--------------
 2 files changed, 67 insertions(+), 35 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe789..685c0a7c49 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -3,6 +3,9 @@
  *
  * Reads instructions from standard input, one instruction per line:
  *
+ *   "start[ <total>[ <title>]]" - Call start_progress(title, total),
+ *                                 a bare "start" uses a title of
+ *                                 "Working hard" with a total of 0.
  *   "progress <items>" - Call display_progress() with the given item count
  *                        as parameter.
  *   "throughput <bytes> <millis> - Call display_throughput() with the given
@@ -10,6 +13,7 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
  */
@@ -22,31 +26,41 @@
 
 int cmd__progress(int argc, const char **argv)
 {
-	int total = 0;
-	const char *title;
+	const char *default_title = "Working hard";
+	char *detached_title = NULL;
 	struct strbuf line = STRBUF_INIT;
-	struct progress *progress;
+	struct progress *progress = NULL;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress <stdin",
 		NULL
 	};
 	struct option options[] = {
-		OPT_INTEGER(0, "total", &total, "total number of items"),
 		OPT_END(),
 	};
 
 	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc != 1)
-		die("need a title for the progress output");
-	title = argv[0];
+	if (argc)
+		usage_with_options(usage, options);
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
-		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
+		if (!strcmp(line.buf, "start")) {
+			progress = start_progress(default_title, 0);
+		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
+			uint64_t total = strtoull(end, &end, 10);
+			if (*end == '\0') {
+				progress = start_progress(default_title, total);
+			} else if (*end == ' ') {
+				free(detached_title);
+				detached_title = strbuf_detach(&line, NULL);
+				progress = start_progress(end + 1, total);
+			} else {
+				die("invalid input: '%s'\n", line.buf);
+			}
+		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
 			uint64_t item_count = strtoull(end, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
@@ -63,12 +77,15 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update"))
+		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
-		else
+		} else if (!strcmp(line.buf, "stop")) {
+			stop_progress(&progress);
+		} else {
 			die("invalid input: '%s'\n", line.buf);
+		}
 	}
-	stop_progress(&progress);
+	free(detached_title);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503a..ca96ac1fa5 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -17,6 +17,7 @@ test_expect_success 'simple progress display' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	update
 	progress 1
 	update
@@ -25,8 +26,9 @@ test_expect_success 'simple progress display' '
 	progress 4
 	update
 	progress 5
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -41,11 +43,13 @@ test_expect_success 'progress display with total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 3
 	progress 1
 	progress 2
 	progress 3
+	stop
 	EOF
-	test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -62,14 +66,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 100
 	progress 1000
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -88,16 +92,15 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
-	update
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 1
 	update
 	progress 2
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -116,14 +119,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -140,14 +143,14 @@ Working hard.......2.........3.........4.........5.........6.........7.........:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6.........7.........
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6.........7........." \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -164,12 +167,14 @@ test_expect_success 'progress shortens - crazy caller' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 1000
 	progress 100
 	progress 200
 	progress 1
 	progress 1000
+	stop
 	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -185,6 +190,7 @@ test_expect_success 'progress display with throughput' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 102400 1000
 	update
 	progress 10
@@ -197,8 +203,9 @@ test_expect_success 'progress display with throughput' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -214,6 +221,7 @@ test_expect_success 'progress display with throughput and total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	progress 10
 	throughput 204800 2000
@@ -222,8 +230,9 @@ test_expect_success 'progress display with throughput and total' '
 	progress 30
 	throughput 409600 4000
 	progress 40
+	stop
 	EOF
-	test-tool progress --total=40 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -239,6 +248,7 @@ test_expect_success 'cover up after throughput shortens' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 409600 1000
 	update
 	progress 1
@@ -251,8 +261,9 @@ test_expect_success 'cover up after throughput shortens' '
 	throughput 1638400 4000
 	update
 	progress 4
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -267,6 +278,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 1 1000
 	update
 	progress 1
@@ -276,8 +288,9 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	throughput 3145728 3000
 	update
 	progress 3
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -285,6 +298,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	update
 	progress 10
@@ -297,10 +311,11 @@ test_expect_success 'progress generates traces' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
-		"Working hard" <in 2>stderr &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress \
+		<in 2>stderr &&
 
 	# t0212/parse_events.perl intentionally omits regions and data.
 	test_region progress "Working hard" trace.event &&
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 2/8] progress.c tests: test some invalid usage
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Test what happens when we "start" and then immediately "stop", omit
the "stop" after a "start", or "stop" without a "start". A test for
starting two concurrent progress bars is added later in this series,
together with the assertion it exercises. This extends the trace2
tests added in 98a13647408 (trace2: log progress time and throughput,
2020-05-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ca96ac1fa5..ffa819ca1d 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -323,4 +323,37 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces: stop / start' '
+	cat >in <<-\EOF &&
+	start
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-startstop.event" test-tool progress \
+		<in 2>stderr &&
+	test_region progress "Working hard" trace-startstop.event
+'
+
+test_expect_success 'progress generates traces: start without stop' '
+	cat >in <<-\EOF &&
+	start
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" test-tool progress \
+		<in 2>stderr &&
+	grep region_enter.*progress trace-start.event &&
+	! grep region_leave.*progress trace-start.event
+'
+
+test_expect_success 'progress generates traces: stop without start' '
+	cat >in <<-\EOF &&
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-stop.event" test-tool progress \
+		<in 2>stderr &&
+	! grep region_enter.*progress trace-stop.event &&
+	! grep region_leave.*progress trace-stop.event
+'
+
 test_done
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 3/8] progress.c: move signal handler functions lower
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Move the signal handler functions to just before the
start_progress_delay() where they'll be referenced, instead of having
them at the top of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 92 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 44 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf9..893cb0fe56 100644
--- a/progress.c
+++ b/progress.c
@@ -53,50 +53,6 @@ static volatile sig_atomic_t progress_update;
  */
 int progress_testing;
 uint64_t progress_test_ns = 0;
-void progress_test_force_update(void)
-{
-	progress_update = 1;
-}
-
-
-static void progress_interval(int signum)
-{
-	progress_update = 1;
-}
-
-static void set_progress_signal(void)
-{
-	struct sigaction sa;
-	struct itimerval v;
-
-	if (progress_testing)
-		return;
-
-	progress_update = 0;
-
-	memset(&sa, 0, sizeof(sa));
-	sa.sa_handler = progress_interval;
-	sigemptyset(&sa.sa_mask);
-	sa.sa_flags = SA_RESTART;
-	sigaction(SIGALRM, &sa, NULL);
-
-	v.it_interval.tv_sec = 1;
-	v.it_interval.tv_usec = 0;
-	v.it_value = v.it_interval;
-	setitimer(ITIMER_REAL, &v, NULL);
-}
-
-static void clear_progress_signal(void)
-{
-	struct itimerval v = {{0,},};
-
-	if (progress_testing)
-		return;
-
-	setitimer(ITIMER_REAL, &v, NULL);
-	signal(SIGALRM, SIG_IGN);
-	progress_update = 0;
-}
 
 static int is_foreground_fd(int fd)
 {
@@ -249,6 +205,54 @@ void display_progress(struct progress *progress, uint64_t n)
 		display(progress, n, NULL);
 }
 
+static void progress_interval(int signum)
+{
+	progress_update = 1;
+}
+
+/*
+ * The progress_test_force_update() function is intended for testing
+ * the progress output, i.e. exclusively for 'test-tool progress'.
+ */
+void progress_test_force_update(void)
+{
+	progress_update = 1;
+}
+
+static void set_progress_signal(void)
+{
+	struct sigaction sa;
+	struct itimerval v;
+
+	if (progress_testing)
+		return;
+
+	progress_update = 0;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = progress_interval;
+	sigemptyset(&sa.sa_mask);
+	sa.sa_flags = SA_RESTART;
+	sigaction(SIGALRM, &sa, NULL);
+
+	v.it_interval.tv_sec = 1;
+	v.it_interval.tv_usec = 0;
+	v.it_value = v.it_interval;
+	setitimer(ITIMER_REAL, &v, NULL);
+}
+
+static void clear_progress_signal(void)
+{
+	struct itimerval v = {{0,},};
+
+	if (progress_testing)
+		return;
+
+	setitimer(ITIMER_REAL, &v, NULL);
+	signal(SIGALRM, SIG_IGN);
+	progress_update = 0;
+}
+
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update()
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (2 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Define the progress_test_force_update() function in terms of
progress_interval(). For documentation purposes these two functions
have the same body, but different names. Let's just define the test
function by calling progress_interval() with SIGALRM ourselves.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 893cb0fe56..7fcc513717 100644
--- a/progress.c
+++ b/progress.c
@@ -216,7 +216,7 @@ static void progress_interval(int signum)
  */
 void progress_test_force_update(void)
 {
-	progress_update = 1;
+	progress_interval(SIGALRM);
 }
 
 static void set_progress_signal(void)
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (3 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

It's the clear intention of the combination of 137a0d0ef56 (Flush
progress message buffer in display()., 2007-11-19) and
85cb8906f0e (progress: no progress in background, 2015-04-13) to call
fflush(stderr) only when stderr is in the foreground, but we ended
up always calling fflush(stderr), seemingly by oversight. Let's not.
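
(Illustration only, not part of the patch: a minimal sketch of the
decision this change makes in display(). The "struct emit_plan" type
and the plan_emit() name are hypothetical; progress.c inlines this
logic rather than using a helper.)

```c
/* Hypothetical helper type for this sketch; not part of git. */
struct emit_plan {
	int write_line;	/* print the progress line at all? */
	int flush;	/* call fflush(stderr) afterwards? */
};

/*
 * The line is written when stderr is in the foreground or when this
 * is the final ("done") update; it is flushed only in the former case.
 */
static struct emit_plan plan_emit(int stderr_is_foreground, int is_done)
{
	struct emit_plan plan;

	plan.write_line = stderr_is_foreground || is_done;
	plan.flush = plan.write_line && stderr_is_foreground;
	return plan;
}
```

The only case that changes is the final update while stderr is not in
the foreground: the line is still written, but no longer flushed
eagerly.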

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 7fcc513717..1fade5808d 100644
--- a/progress.c
+++ b/progress.c
@@ -91,7 +91,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	}
 
 	if (show_update) {
-		if (is_foreground_fd(fileno(stderr)) || done) {
+		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
+		if (stderr_is_foreground_fd || done) {
 			const char *eol = done ? done : "\r";
 			size_t clear_len = counters_sb->len < last_count_len ?
 					last_count_len - counters_sb->len + 1 :
@@ -115,7 +116,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 				fprintf(stderr, "%s: %s%*s", progress->title,
 					counters_sb->buf, (int) clear_len, eol);
 			}
-			fflush(stderr);
+			if (stderr_is_foreground_fd)
+				fflush(stderr);
 		}
 		progress_update = 0;
 	}
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 6/8] progress.c: add temporary variable from progress struct
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (4 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Add a temporary "progress" variable for the dereferenced p_progress
pointer to a "struct progress *". Before 98a13647408 (trace2: log
progress time and throughput, 2020-05-12) we didn't dereference
"p_progress" in this function. Now that we do, it's easier to read the
code if we work with a "progress" struct pointer like everywhere else,
instead of a pointer to a pointer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 1fade5808d..1ab7d19deb 100644
--- a/progress.c
+++ b/progress.c
@@ -331,15 +331,16 @@ void stop_progress(struct progress **p_progress)
 	finish_if_sparse(*p_progress);
 
 	if (*p_progress) {
+		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
 				   (*p_progress)->total);
 
 		if ((*p_progress)->throughput)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
-					   (*p_progress)->throughput->curr_total);
+					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+		trace2_region_leave("progress", progress->title, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress()
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (5 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
bitmap writing, 2013-12-21): we did not call stop_progress() if we
reached the early exit in this function. This will matter in a
subsequent commit where we BUG(...) out if this happens, and matters
now e.g. because we don't emit a corresponding "region_leave" for the
progress trace2 event.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack-bitmap-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 88d9e696a5..6e110e41ea 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
+		stop_progress(&writer.progress);
 		return;
 	}
 
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 8/8] progress.c: add & assert a "global_progress" variable
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (6 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-09-16 21:34     ` [PATCH 12/25] " Ævar Arnfjörð Bjarmason
  2021-07-23 22:02   ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Junio C Hamano
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
  9 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

The progress.c code makes a hard assumption that only one progress bar
is active at a time (see [1] for a bug where this wasn't the case),
but nothing asserts that this is the case. Let's add a BUG()
that'll trigger if two progress bars are active at the same time.

There's an alternate test-only approach to doing the same thing[2],
but by doing this for all progress bars we'll have a canary to check
if we have any unexpected interaction between the "sig_atomic_t
progress_update" variable and this global struct.

I am then planning on using this scaffolding in the future to fix a
limitation in the progress output: currently any update must
proactively go through the likes of display_progress().

If we e.g. hang forever before the first display_progress(), or in the
middle of a loop that would call display_progress(), the user will only
see either no output, or output frozen at the last display_progress()
that would have done an update (e.g. in cases where progress_update
was "1" due to an earlier signal).

This change does not fix that, but sets up the structure for solving
that and other related problems by juggling this "global_progress"
struct. Later changes will make more use of the "global_progress" than
only using it for these assertions.
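
(Illustration only: a minimal, self-contained sketch of the invariant
this patch asserts. "struct progress" is reduced to a stand-in, and
the track_start() / track_stop() names are hypothetical; they return
-1 where progress.c would call BUG().)

```c
#include <stddef.h>

/* Reduced stand-in for the real struct in progress.c. */
struct progress {
	const char *title;
};

/* At most one progress bar may be active at any time. */
static struct progress *global_progress;

/* Returns 0 on success, -1 where the patch would call BUG(). */
static int track_start(struct progress *progress)
{
	if (global_progress)
		return -1;	/* a second bar started before the first stopped */
	global_progress = progress;
	return 0;
}

/* Returns 0 on success, -1 where the patch would call BUG(). */
static int track_stop(void)
{
	if (!global_progress)
		return -1;	/* stop without a matching start */
	global_progress = NULL;
	return 0;
}
```

Under these assumptions, two track_start() calls in a row fail on the
second call, which is the situation the new "start 0 one" / "start 0
two" test provokes through set_progress_signal().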

1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 17 +++++++++++++----
 t/t0500-progress-display.sh | 11 +++++++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 1ab7d19deb..14a023f4b4 100644
--- a/progress.c
+++ b/progress.c
@@ -46,6 +46,7 @@ struct progress {
 };
 
 static volatile sig_atomic_t progress_update;
+static struct progress *global_progress;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -221,11 +222,15 @@ void progress_test_force_update(void)
 	progress_interval(SIGALRM);
 }
 
-static void set_progress_signal(void)
+static void set_progress_signal(struct progress *progress)
 {
 	struct sigaction sa;
 	struct itimerval v;
 
+	if (global_progress)
+		BUG("should have no global_progress in set_progress_signal()");
+	global_progress = progress;
+
 	if (progress_testing)
 		return;
 
@@ -243,10 +248,14 @@ static void set_progress_signal(void)
 	setitimer(ITIMER_REAL, &v, NULL);
 }
 
-static void clear_progress_signal(void)
+static void clear_progress_signal(struct progress *progress)
 {
 	struct itimerval v = {{0,},};
 
+	if (!global_progress)
+		BUG("should have a global_progress in clear_progress_signal()");
+	global_progress = NULL;
+
 	if (progress_testing)
 		return;
 
@@ -270,7 +279,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
-	set_progress_signal();
+	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
@@ -374,7 +383,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, buf);
 		free(buf);
 	}
-	clear_progress_signal();
+	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ffa819ca1d..124d33c96b 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -296,6 +296,17 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	test_cmp expect out
 '
 
+test_expect_success 'BUG: start two concurrent progress bars' '
+	cat >in <<-\EOF &&
+	start 0 one
+	start 0 two
+	EOF
+
+	test_must_fail test-tool progress \
+		<in 2>stderr &&
+	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
+'
+
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
 	start 40
-- 
2.32.0.957.gd9e39d72fe6



* Re: [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-07-23 21:55     ` Junio C Hamano
  2021-08-02 21:07     ` SZEDER Gábor
  1 sibling, 0 replies; 138+ messages in thread
From: Junio C Hamano @ 2021-07-23 21:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> From: SZEDER Gábor <szeder.dev@gmail.com>
>
> The final value of the counter of the "Scanning merged commits"
> progress line is always one less than its expected total, e.g.:
>
>   Scanning merged commits:  83% (5/6), done.
>
> This happens because while iterating over an array the loop variable
> is passed to display_progress() as-is, but while C arrays (and thus
> the loop variable) start at 0 and end at N-1, the progress counter
> must end at N.  This causes the failures of the tests
> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>
> Fix this by passing 'i + 1' to display_progress(), like most other
> callsites do.

Sensible, I guess.

>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  commit-graph.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 1a2602da61..918061f207 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>  
>  	ctx->num_extra_edges = 0;
>  	for (i = 0; i < ctx->commits.nr; i++) {
> -		display_progress(ctx->progress, i);
> +		display_progress(ctx->progress, i + 1);
>  
>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>  			  &ctx->commits.list[i]->object.oid)) {



* Re: [PATCH 2/3] midx: don't provide a total for QSORT() progress
  2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
@ 2021-07-23 21:56     ` Junio C Hamano
  2021-08-05 15:07     ` Phillip Wood
  1 sibling, 0 replies; 138+ messages in thread
From: Junio C Hamano @ 2021-07-23 21:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> The quicksort algorithm can be anywhere between O(n) and O(n^2), so
> providing a "num objects" as a total means that in some cases we're
> going to go past 100%.
>
> This fixes a logic error in 5ae18df9d8e (midx: during verify group
> objects by packfile to speed verification, 2019-03-21), which in turn
> seems to have been diligently copied from my own logic error in the
> commit-graph.c code, see 890226ccb57 (commit-graph write: add
> itermediate progress, 2019-01-19).

Interesting.

>
> That commit-graph code of mine was removed in
> 1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
> 2020-12-07), so we don't need to fix that too.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  midx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/midx.c b/midx.c
> index 9a35b0255d..eaae75ab19 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -1291,7 +1291,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>  
>  	if (flags & MIDX_PROGRESS)
>  		progress = start_sparse_progress(_("Sorting objects by packfile"),
> -						 m->num_objects);
> +						 0);
>  	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
>  	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
>  	stop_progress(&progress);

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line
  2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
@ 2021-07-23 22:01     ` Junio C Hamano
  2021-08-02 22:05       ` SZEDER Gábor
  2021-08-02 21:48     ` SZEDER Gábor
  1 sibling, 1 reply; 138+ messages in thread
From: Junio C Hamano @ 2021-07-23 22:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> From: SZEDER Gábor <szeder.dev@gmail.com>
>
> The "Filtering content" progress in entry.c:finish_delayed_checkout()
> is unusual because of how it calculates the progress count and because
> it shows the progress of a nested loop.  It works basically like this:
>
>   start_delayed_progress(p, nr_of_paths_to_filter)
>   for_each_filter {
>       display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
>       for_each_path_handled_by_the_current_filter {
>           checkout_entry()
>       }
>   }
>   stop_progress(p)
>
> There are two issues with this approach:
>
>   - The work done by the last filter (or the only filter if there is
>     only one) is never counted, so if the last filter still has some
>     paths to process, then the counter shown in the "done" progress
>     line will not match the expected total.
>
>     This would cause a BUG() in an upcoming change that adds an
>     assertion checking if the "total" at the end matches the last
>     progress bar update.

So the other series will semantically depend on this 3-patch series?
Just making sure that is the intended topic structure.

> diff --git a/entry.c b/entry.c
> index 125fabdbd5..d92dd020b3 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
>  int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  {
>  	int errs = 0;
> -	unsigned delayed_object_count;
> +	unsigned processed_paths = 0;
>  	off_t filtered_bytes = 0;
>  	struct string_list_item *filter, *path;
>  	struct progress *progress;
> @@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  		return errs;
>  
>  	dco->state = CE_RETRY;
> -	delayed_object_count = dco->paths.nr;
> -	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
> +	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
>  	while (dco->filters.nr > 0) {
>  		for_each_string_list_item(filter, &dco->filters) {
>  			struct string_list available_paths = STRING_LIST_INIT_NODUP;
> -			display_progress(progress, delayed_object_count - dco->paths.nr);
>  
>  			if (!async_query_available_blobs(filter->string, &available_paths)) {
>  				/* Filter reported an error */
> @@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  				ce = index_file_exists(state->istate, path->string,
>  						       strlen(path->string), 0);
>  				if (ce) {
> +					display_progress(progress, ++processed_paths);
>  					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
>  					filtered_bytes += ce->ce_stat_data.sd_size;
>  					display_throughput(progress, filtered_bytes);

Hmph.  A missing cache entry will not increment processed_paths; would
that cause stop_progress() to see at the end a counter that is
smaller than the dco->paths.nr we saw at the beginning?


^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (7 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-07-23 22:02   ` Junio C Hamano
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
  9 siblings, 0 replies; 138+ messages in thread
From: Junio C Hamano @ 2021-07-23 22:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> These patches were originally submitted as part of a much larger topic
> at [1]. They add a "global_progress" "struct progress *" which we
> assign to and clear as we start/stop progress bars.
>
> This will become important for some new progress features I have
> planned, but right now is just used to assert that we don't start two
> progress bars at the same time. 7/8 fixes an existing bug where we did
> that.
>
> To get there I fixed up the test helper to be able to test this, moved
> some code around, and fixed a couple of existing nits in 5/8 and 6/8.
>
> See also [2] which is a re-submission of that larger topic, but the
> two can proceed independently.

OK.

>
> 1. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
> 2. https://lore.kernel.org/git/cover-0.3-0000000000-20210722T121801Z-avarab@gmail.com/
>
> Ævar Arnfjörð Bjarmason (8):
>   progress.c tests: make start/stop verbs on stdin
>   progress.c tests: test some invalid usage
>   progress.c: move signal handler functions lower
>   progress.c: call progress_interval() from progress_test_force_update()
>   progress.c: stop eagerly fflush(stderr) when not a terminal
>   progress.c: add temporary variable from progress struct
>   pack-bitmap-write.c: add a missing stop_progress()
>   progress.c: add & assert a "global_progress" variable
>
>  pack-bitmap-write.c         |   1 +
>  progress.c                  | 116 ++++++++++++++++++++----------------
>  t/helper/test-progress.c    |  43 +++++++++----
>  t/t0500-progress-display.sh | 103 +++++++++++++++++++++++++-------
>  4 files changed, 178 insertions(+), 85 deletions(-)

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
  2021-07-23 21:55     ` Junio C Hamano
@ 2021-08-02 21:07     ` SZEDER Gábor
  1 sibling, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-08-02 21:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe

On Thu, Jul 22, 2021 at 02:20:15PM +0200, Ævar Arnfjörð Bjarmason wrote:
> From: SZEDER Gábor <szeder.dev@gmail.com>
> 
> The final value of the counter of the "Scanning merged commits"
> progress line is always one less than its expected total, e.g.:
> 
>   Scanning merged commits:  83% (5/6), done.
> 
> This happens because while iterating over an array the loop variable
> is passed to display_progress() as-is, but while C arrays (and thus
> the loop variable) start at 0 and end at N-1, the progress counter
> must end at N.  This causes the failures of the tests
> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

There is no GIT_TEST_CHECK_PROGRESS in this patch series.

> Fix this by passing 'i + 1' to display_progress(), like most other
> callsites do.
> 
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  commit-graph.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 1a2602da61..918061f207 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>  
>  	ctx->num_extra_edges = 0;
>  	for (i = 0; i < ctx->commits.nr; i++) {
> -		display_progress(ctx->progress, i);
> +		display_progress(ctx->progress, i + 1);
>  
>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>  			  &ctx->commits.list[i]->object.oid)) {
> -- 
> 2.32.0.957.gd9e39d72fe6
> 

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line
  2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
  2021-07-23 22:01     ` Junio C Hamano
@ 2021-08-02 21:48     ` SZEDER Gábor
  1 sibling, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-08-02 21:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe

On Thu, Jul 22, 2021 at 02:20:17PM +0200, Ævar Arnfjörð Bjarmason wrote:
> From: SZEDER Gábor <szeder.dev@gmail.com>
> 
> The "Filtering content" progress in entry.c:finish_delayed_checkout()
> is unusual because of how it calculates the progress count and because
> it shows the progress of a nested loop.  It works basically like this:
> 
>   start_delayed_progress(p, nr_of_paths_to_filter)
>   for_each_filter {
>       display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
>       for_each_path_handled_by_the_current_filter {
>           checkout_entry()
>       }
>   }
>   stop_progress(p)
> 
> There are two issues with this approach:
> 
>   - The work done by the last filter (or the only filter if there is
>     only one) is never counted, so if the last filter still has some
>     paths to process, then the counter shown in the "done" progress
>     line will not match the expected total.
> 
>     This would cause a BUG() in an upcoming change that adds an
>     assertion checking if the "total" at the end matches the last
>     progress bar update.
> 
>     This is because both

"Both" what?

>     use only one filter.  (The test 'delayed
>     checkout in process filter' uses two filters but the first one
>     does all the work, so that test already happens to succeed even
>     with such an assertion.)
> 
>   - The progress counter is updated only once per filter, not once per
>     processed path, so if a filter has a lot of paths to process, then
>     the counter might stay unchanged for a long while and then make a
>     big jump (though the user still gets a sense of progress, because
>     we call display_throughput() after each processed path to show the
>     amount of processed data).
> 
> Move the display_progress() call to the inner loop, right next to that
> checkout_entry() call that does the hard work for each path, and use a
> dedicated counter variable that is incremented upon processing each
> path.
> 
> After this change the 'invalid file in delayed checkout' in
> 't0021-conversion.sh' would succeed with the future BUG() assertion
> discussed above but the 'missing file in delayed checkout' test would
> still fail, because its purposefully buggy filter doesn't process any
> paths, so we won't execute that inner loop at all (this will be fixed
> in a subsequent commit).

I don't like how the updates to the commit message keep referring to
some future BUG().

A benefit of my original submission is that all those checks are added
at the beginning of the patch series, so when looking at these later
bugfix patches reviewers can easily run the problematic tests with
GIT_TEST_CHECK_PROGRESS=1 to see the failure themselves and to confirm
that the fix indeed works.  Without those checks the next best thing
is applying the patch below and then looking at the verbose log of
those tests:


diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index b5749f327d..93a67f2f1f 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -955,7 +955,8 @@ test_expect_success PERL 'missing file in delayed checkout' '
 	) &&
 
 	rm -rf repo-cloned &&
-	test_must_fail git clone repo repo-cloned 2>git-stderr.log &&
+	test_must_fail env GIT_PROGRESS_DELAY=0 git clone repo repo-cloned 2>git-stderr.log &&
+	cat git-stderr.log &&
 	grep "error: .missing-delay\.a. was not filtered properly" git-stderr.log
 '
 
@@ -976,7 +977,8 @@ test_expect_success PERL 'invalid file in delayed checkout' '
 	) &&
 
 	rm -rf repo-cloned &&
-	test_must_fail git clone repo repo-cloned 2>git-stderr.log &&
+	test_must_fail env GIT_PROGRESS_DELAY=0 git clone repo repo-cloned 2>git-stderr.log &&
+	cat git-stderr.log &&
 	grep "error: external filter .* signaled that .unfiltered. is now available although it has not been delayed earlier" git-stderr.log
 '
 

> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  entry.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/entry.c b/entry.c
> index 125fabdbd5..d92dd020b3 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
>  int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  {
>  	int errs = 0;
> -	unsigned delayed_object_count;
> +	unsigned processed_paths = 0;
>  	off_t filtered_bytes = 0;
>  	struct string_list_item *filter, *path;
>  	struct progress *progress;
> @@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  		return errs;
>  
>  	dco->state = CE_RETRY;
> -	delayed_object_count = dco->paths.nr;
> -	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
> +	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
>  	while (dco->filters.nr > 0) {
>  		for_each_string_list_item(filter, &dco->filters) {
>  			struct string_list available_paths = STRING_LIST_INIT_NODUP;
> -			display_progress(progress, delayed_object_count - dco->paths.nr);
>  
>  			if (!async_query_available_blobs(filter->string, &available_paths)) {
>  				/* Filter reported an error */
> @@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  				ce = index_file_exists(state->istate, path->string,
>  						       strlen(path->string), 0);
>  				if (ce) {
> +					display_progress(progress, ++processed_paths);
>  					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
>  					filtered_bytes += ce->ce_stat_data.sd_size;
>  					display_throughput(progress, filtered_bytes);
> -- 
> 2.32.0.957.gd9e39d72fe6
> 

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line
  2021-07-23 22:01     ` Junio C Hamano
@ 2021-08-02 22:05       ` SZEDER Gábor
  0 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-08-02 22:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git, René Scharfe

On Fri, Jul 23, 2021 at 03:01:48PM -0700, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> 
> > From: SZEDER Gábor <szeder.dev@gmail.com>
> >
> > The "Filtering content" progress in entry.c:finish_delayed_checkout()
> > is unusual because of how it calculates the progress count and because
> > it shows the progress of a nested loop.  It works basically like this:
> >
> >   start_delayed_progress(p, nr_of_paths_to_filter)
> >   for_each_filter {
> >       display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
> >       for_each_path_handled_by_the_current_filter {
> >           checkout_entry()
> >       }
> >   }
> >   stop_progress(p)
> >
> > There are two issues with this approach:
> >
> >   - The work done by the last filter (or the only filter if there is
> >     only one) is never counted, so if the last filter still has some
> >     paths to process, then the counter shown in the "done" progress
> >     line will not match the expected total.
> >
> >     This would cause a BUG() in an upcoming change that adds an
> >     assertion checking if the "total" at the end matches the last
> >     progress bar update.
> 
> So the other series will semantically depend on this 3-patch series?
> Just making sure that is the intended topic structure.
> 
> > diff --git a/entry.c b/entry.c
> > index 125fabdbd5..d92dd020b3 100644
> > --- a/entry.c
> > +++ b/entry.c
> > @@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
> >  int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
> >  {
> >  	int errs = 0;
> > -	unsigned delayed_object_count;
> > +	unsigned processed_paths = 0;
> >  	off_t filtered_bytes = 0;
> >  	struct string_list_item *filter, *path;
> >  	struct progress *progress;
> > @@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
> >  		return errs;
> >  
> >  	dco->state = CE_RETRY;
> > -	delayed_object_count = dco->paths.nr;
> > -	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
> > +	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
> >  	while (dco->filters.nr > 0) {
> >  		for_each_string_list_item(filter, &dco->filters) {
> >  			struct string_list available_paths = STRING_LIST_INIT_NODUP;
> > -			display_progress(progress, delayed_object_count - dco->paths.nr);
> >  
> >  			if (!async_query_available_blobs(filter->string, &available_paths)) {
> >  				/* Filter reported an error */
> > @@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
> >  				ce = index_file_exists(state->istate, path->string,
> >  						       strlen(path->string), 0);
> >  				if (ce) {
> > +					display_progress(progress, ++processed_paths);
> >  					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
> >  					filtered_bytes += ce->ce_stat_data.sd_size;
> >  					display_throughput(progress, filtered_bytes);
> 
> Hmph.  A missing cache entry will not increment processed_paths; would
> that cause stop_progress() to see at the end a counter that is
> smaller than the dco->paths.nr we saw at the beginning?

Yes, but this 'if (ce)' condition has an 'else errs = 1;' branch as
well, i.e. a missing cache entry is considered an error.  This patch
fixes issues with this progress bar in the case when all went well,
i.e. when there were no errors.  In my original submission it is
followed up by another patch that attempts to fix this progress line
when there are errors, arguing that it's wrong to show "done" at the
end when not all work was done because of said errors...  although
"fix" is not quite the right word for the approach taken in that
patch, it's more like "papering over" ;)

  https://public-inbox.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 0/3] progress.c API users: fix bogus counting
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
                     ` (2 preceding siblings ...)
  2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
@ 2021-08-05 11:01   ` Ævar Arnfjörð Bjarmason
  2021-08-05 11:01     ` [PATCH v2 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
                       ` (3 more replies)
  3 siblings, 4 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-05 11:01 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

This is a split-off from the larger topic [1] that these patches were
submitted as part of, and which didn't get picked up. As I pointed out
in [2], that larger topic had some hidden, untested-for flaws.

But these patches are just fixes to bogus progress bar output from
that topic. Let's consider them in isolation...

Since v1 the only changes are to the commit messages, in response to
SZEDER's feedback at
https://lore.kernel.org/git/20210802210759.GD23408@szeder.dev/ and
https://lore.kernel.org/git/20210802214827.GE23408@szeder.dev/.
Hopefully this update addresses all of those outstanding comments.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/

SZEDER Gábor (2):
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line

Ævar Arnfjörð Bjarmason (1):
  midx: don't provide a total for QSORT() progress

 commit-graph.c | 2 +-
 entry.c        | 7 +++----
 midx.c         | 2 +-
 3 files changed, 5 insertions(+), 6 deletions(-)

Range-diff against v1:
1:  832a6c1f78 ! 1:  bcb13be500 commit-graph: fix bogus counter in "Scanning merged commits" progress line
    @@ Commit message
         This happens because while iterating over an array the loop variable
         is passed to display_progress() as-is, but while C arrays (and thus
         the loop variable) start at 0 and end at N-1, the progress counter
    -    must end at N.  This causes the failures of the tests
    -    'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
    -    in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
    +    must end at N. Fix this by passing 'i + 1' to display_progress(), like
    +    most other callsites do.
     
    -    Fix this by passing 'i + 1' to display_progress(), like most other
    -    callsites do.
    +    There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which
    +    catches this issue in the 'fetch.writeCommitGraph' and
    +    'fetch.writeCommitGraph with submodules' tests in
    +    't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of
    +    this series, but future changes to progress.c may add it or similar
    +    assertions to catch this and similar bugs elsewhere.
    +
    +    1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
     
         Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
2:  3411fe0515 = 2:  8e67712c48 midx: don't provide a total for QSORT() progress
3:  f65001627a ! 3:  c70e554e46 entry: show finer-grained counter in "Filtering content" progress line
    @@ Commit message
             paths to process, then the counter shown in the "done" progress
             line will not match the expected total.
     
    -        This would cause a BUG() in an upcoming change that adds an
    -        assertion checking if the "total" at the end matches the last
    -        progress bar update..
    -
    -        This is because both use only one filter.  (The test 'delayed
    -        checkout in process filter' uses two filters but the first one
    -        does all the work, so that test already happens to succeed even
    -        with such an assertion.)
    +        The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
    +        mode[1] helps spot this issue. Under it the 'missing file in
    +        delayed checkout' and 'invalid file in delayed checkout' tests in
    +        't0021-conversion.sh' fail, because both use only one
    +        filter.  (The test 'delayed checkout in process filter' uses two
    +        filters but the first one does all the work, so that test already
    +        happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)
     
           - The progress counter is updated only once per filter, not once per
             processed path, so if a filter has a lot of paths to process, then
    @@ Commit message
         path.
     
         After this change the 'invalid file in delayed checkout' in
    -    't0021-conversion.sh' would succeed with the future BUG() assertion
    -    discussed above but the 'missing file in delayed checkout' test would
    -    still fail, because its purposefully buggy filter doesn't process any
    -    paths, so we won't execute that inner loop at all (this will be fixed
    -    in a subsequent commit).
    +    't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
    +    assertion discussed above, but the 'missing file in delayed checkout'
    +    test would still fail.
    +
    +    It'll fail because its purposefully buggy filter doesn't process any
    +    paths, so we won't execute that inner loop at all, see [2] for how to
    +    spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
    +    straightforward to fix it with the current progress.c library (see [3]
    +    for an attempt), so let's leave it for now.
    +
    +    1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
    +    2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
    +    3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/
     
         Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
-- 
2.33.0.rc0.635.g0ab9d6d3b5a


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-08-05 11:01   ` [PATCH v2 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
@ 2021-08-05 11:01     ` Ævar Arnfjörð Bjarmason
  2021-08-05 11:01     ` [PATCH v2 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-05 11:01 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N. Fix this by passing 'i + 1' to display_progress(), like
most other callsites do.

There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which
catches this issue in the 'fetch.writeCommitGraph' and
'fetch.writeCommitGraph with submodules' tests in
't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of
this series, but future changes to progress.c may add it or similar
assertions to catch this and similar bugs elsewhere.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 3860a0d847..9d18c1d87d 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.33.0.rc0.635.g0ab9d6d3b5a


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 2/3] midx: don't provide a total for QSORT() progress
  2021-08-05 11:01   ` [PATCH v2 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-08-05 11:01     ` [PATCH v2 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-08-05 11:01     ` Ævar Arnfjörð Bjarmason
  2021-08-05 11:01     ` [PATCH v2 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
  2021-08-23 10:29     ` [PATCH v3 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  3 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-05 11:01 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

The quicksort algorithm can be anywhere between O(n) and O(n^2), so
providing a "num objects" as a total means that in some cases we're
going to go past 100%.

This fixes a logic error in 5ae18df9d8e (midx: during verify group
objects by packfile to speed verification, 2019-03-21), which in turn
seems to have been diligently copied from my own logic error in the
commit-graph.c code, see 890226ccb57 (commit-graph write: add
itermediate progress, 2019-01-19).

That commit-graph code of mine was removed in
1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
2020-12-07), so we don't need to fix that too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 321c6fdd2f..cad78d71fc 100644
--- a/midx.c
+++ b/midx.c
@@ -1292,7 +1292,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 
 	if (flags & MIDX_PROGRESS)
 		progress = start_sparse_progress(_("Sorting objects by packfile"),
-						 m->num_objects);
+						 0);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
-- 
2.33.0.rc0.635.g0ab9d6d3b5a


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 3/3] entry: show finer-grained counter in "Filtering content" progress line
  2021-08-05 11:01   ` [PATCH v2 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-08-05 11:01     ` [PATCH v2 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
  2021-08-05 11:01     ` [PATCH v2 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
@ 2021-08-05 11:01     ` Ævar Arnfjörð Bjarmason
  2021-08-23 10:29     ` [PATCH v3 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  3 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-05 11:01 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
    mode[1] helps spot this issue. Under it the 'missing file in
    delayed checkout' and 'invalid file in delayed checkout' tests in
    't0021-conversion.sh' fail, because both use only one
    filter.  (The test 'delayed checkout in process filter' uses two
    filters but the first one does all the work, so that test already
    happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
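
The difference between the two counting schemes can be modelled without
any of git's data structures. The following is only a minimal sketch:
the function names and the paths_per_filter array are hypothetical, and
the model visits each filter once, whereas the real code keeps looping
until no filters are left.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: paths_per_filter[i] is the number of paths that
 * filter i processes. Neither function exists in git.
 */

/* Old scheme: the counter is refreshed once per filter, before its paths. */
static unsigned last_count_per_filter(const unsigned *paths_per_filter,
				      size_t nr_filters)
{
	unsigned done = 0, shown = 0;
	size_t i;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++) {
		shown = done;	/* display_progress() before the inner loop */
		done += paths_per_filter[i];
	}
	return shown;
}

/* New scheme: the counter is bumped once for every processed path. */
static unsigned last_count_per_path(const unsigned *paths_per_filter,
				    size_t nr_filters)
{
	unsigned shown = 0, j;
	size_t i;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++)
		for (j = 0; j < paths_per_filter[i]; j++)
			shown++;	/* display_progress(p, ++processed_paths) */
	return shown;
}
```

With a single filter that handles three paths, the old scheme's last
displayed value is 0 while the new one ends at 3. With two filters
where the first does all the work, both end at 3, which is consistent
with the 'delayed checkout in process filter' test passing already.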

After this change the 'invalid file in delayed checkout' test in
't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
assertion discussed above, but the 'missing file in delayed checkout'
test would still fail.

It'll fail because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all; see [2] for how to
spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
straightforward to fix it with the current progress.c library (see [3]
for an attempt), so let's leave it for now.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 125fabdbd5..d92dd020b3 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.33.0.rc0.635.g0ab9d6d3b5a



* Re: [PATCH 2/3] midx: don't provide a total for QSORT() progress
  2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
  2021-07-23 21:56     ` Junio C Hamano
@ 2021-08-05 15:07     ` Phillip Wood
  2021-08-05 19:07       ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 138+ messages in thread
From: Phillip Wood @ 2021-08-05 15:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe

On 22/07/2021 13:20, Ævar Arnfjörð Bjarmason wrote:
> The quicksort algorithm can be anywhere between O(n) and O(n^2), so
> providing a "num objects" as a total means that in some cases we're
> going to go past 100%.

Being pedantic, QSORT() is not necessarily a quicksort; for example,
compat/qsort_s.c implements a merge sort.

I'm confused as to how we go past 100% when we only call 
display_progress(progress, 0) and then stop_progress(). If my reading of 
progress.c is correct then this change means we'll stop displaying a 
percentage as progress->total will be zero in

static void display(struct progress *progress, uint64_t n, const char *done)
{
	/* ... */
	if (progress->total) {
		unsigned percent = n * 100 / progress->total;
		if (percent != progress->last_percent || progress_update) {
			progress->last_percent = percent;

			strbuf_reset(counters_sb);
			strbuf_addf(counters_sb,
				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
				    (uintmax_t)n, (uintmax_t)progress->total,
				    tp);
			show_update = 1;
		}
	} else if (progress_update) {
		strbuf_reset(counters_sb);
		strbuf_addf(counters_sb, "%"PRIuMAX"%s", (uintmax_t)n, tp);
		show_update = 1;
	}
	/* ... */
}
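
To try the two branches in isolation, here is a minimal sketch of just
the counter formatting. format_counters() is a hypothetical stand-in,
not a function in progress.c; it leaves out the throughput suffix and
the update throttling of the real display().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the counter formatting in display().
 * Returns what snprintf() returns: the length of the formatted text.
 */
static int format_counters(char *buf, size_t len, uint64_t n, uint64_t total)
{
	assert(buf && len);
	if (total) {
		/* A known total gives "percent (n/total)". */
		unsigned percent = (unsigned)(n * 100 / total);
		return snprintf(buf, len, "%3u%% (%" PRIuMAX "/%" PRIuMAX ")",
				percent, (uintmax_t)n, (uintmax_t)total);
	}
	/* A total of zero gives the bare counter, with no percentage. */
	return snprintf(buf, len, "%" PRIuMAX, (uintmax_t)n);
}
```

In this sketch a percentage above 100 can only show up when n itself
exceeds a non-zero total; with a total of zero only the bare counter is
formatted.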

Best Wishes

Phillip

> This fixes a logic error in 5ae18df9d8e (midx: during verify group
> objects by packfile to speed verification, 2019-03-21), which in turn
> seems to have been diligently copied from my own logic error in the
> commit-graph.c code, see 890226ccb57 (commit-graph write: add
> itermediate progress, 2019-01-19).
> 
> That commit-graph code of mine was removed in
> 1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
> 2020-12-07), so we don't need to fix that too.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   midx.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/midx.c b/midx.c
> index 9a35b0255d..eaae75ab19 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -1291,7 +1291,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>   
>   	if (flags & MIDX_PROGRESS)
>   		progress = start_sparse_progress(_("Sorting objects by packfile"),
> -						 m->num_objects);
> +						 0);
>   	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
>   	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
>   	stop_progress(&progress);
> 


* Re: [PATCH 2/3] midx: don't provide a total for QSORT() progress
  2021-08-05 15:07     ` Phillip Wood
@ 2021-08-05 19:07       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-05 19:07 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe


On Thu, Aug 05 2021, Phillip Wood wrote:

> On 22/07/2021 13:20, Ævar Arnfjörð Bjarmason wrote:
>> The quicksort algorithm can be anywhere between O(n) and O(n^2), so
>> providing a "num objects" as a total means that in some cases we're
>> going to go past 100%.
>
> Being pedantic, QSORT() is not necessarily a quicksort; for example,
> compat/qsort_s.c implements a merge sort.
>
> I'm confused as to how we go past 100% when we only call
> display_progress(progress, 0) and then stop_progress(). If my reading
> of progress.c is correct then this change means we'll stop displaying
> a percentage as progress->total will be zero in
>
> static void display(struct progress *progress, uint64_t n, const char *done)
> {
> 	/* ... */
> 	if (progress->total) {
> 		unsigned percent = n * 100 / progress->total;
> 		if (percent != progress->last_percent || progress_update) {
> 			progress->last_percent = percent;
>
> 			strbuf_reset(counters_sb);
> 			strbuf_addf(counters_sb,
> 				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
> 				    (uintmax_t)n, (uintmax_t)progress->total,
> 				    tp);
> 			show_update = 1;
> 		}
> 	} else if (progress_update) {
> 		strbuf_reset(counters_sb);
> 		strbuf_addf(counters_sb, "%"PRIuMAX"%s", (uintmax_t)n, tp);
> 		show_update = 1;
> 	}
> 	/* ... */
> }

Hrm, now I'm also confused. Yes you're right. This whole patch doesn't
make sense.

I.e. it's perfectly fine to provide a target of <num> for QSORT and then
use display_progress() just to bump the progress bar's display like
this, and we have other code in commit-graph.c that does the same.

I think I might have (badly) extracted this patch from some WIP work
where I was giving qsort() calls progress of some sort, or maybe I was
just being stupid...

(I have a few local experiments in progress bar output, one of those is
that you can give the API a <num> target, and not be sure if it'll be
O(n), O(log(n)), O(n^2) etc, and we'll show progress output
appropriately, i.e. "still working, at X items of work for N, so north
of O(N)...".)


* [PATCH v3 0/2] progress.c API users: fix bogus counting
  2021-08-05 11:01   ` [PATCH v2 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2021-08-05 11:01     ` [PATCH v2 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
@ 2021-08-23 10:29     ` Ævar Arnfjörð Bjarmason
  2021-08-23 10:29       ` [PATCH v3 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
                         ` (2 more replies)
  3 siblings, 3 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-23 10:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Phillip Wood, Ævar Arnfjörð Bjarmason

Fixes up some users of the progress.c API. See
https://lore.kernel.org/git/cover-v2-0.3-0000000000-20210805T105720Z-avarab@gmail.com
for v2.

In v3 the old 2/3 is ejected per feedback from Phillip Wood, see:
https://lore.kernel.org/git/87v94jzoxj.fsf@evledraar.gmail.com/

SZEDER Gábor (2):
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line

 commit-graph.c | 2 +-
 entry.c        | 7 +++----
 2 files changed, 4 insertions(+), 5 deletions(-)

Range-diff against v2:
1:  bcb13be5006 = 1:  443374551ad commit-graph: fix bogus counter in "Scanning merged commits" progress line
2:  8e67712c480 < -:  ----------- midx: don't provide a total for QSORT() progress
3:  c70e554e461 = 2:  71c93f624ec entry: show finer-grained counter in "Filtering content" progress line
-- 
2.33.0.632.g78310755cd0



* [PATCH v3 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-08-23 10:29     ` [PATCH v3 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
@ 2021-08-23 10:29       ` Ævar Arnfjörð Bjarmason
  2021-08-23 10:29       ` [PATCH v3 2/2] entry: show finer-grained counter in "Filtering content" " Ævar Arnfjörð Bjarmason
  2021-09-09  1:10       ` [PATCH v4 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-23 10:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Phillip Wood, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N. Fix this by passing 'i + 1' to display_progress(), like
most other callsites do.
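
The arithmetic behind the example line can be reproduced with a small
sketch. final_percent() is a hypothetical helper used only for
illustration; it mimics a loop that reports either 'i' or 'i + 1' and
returns the percentage derived from the last reported value.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of git: loop over nr items, remember the
 * last value that would be handed to display_progress(), and turn it into
 * the percentage shown on the final progress line.
 */
static unsigned final_percent(size_t nr, int one_based)
{
	size_t i, last = 0;

	assert(nr > 0);	/* avoid dividing by zero below */
	for (i = 0; i < nr; i++)
		last = one_based ? i + 1 : i;
	return (unsigned)(last * 100 / nr);
}
```

For six items, reporting 'i' stops at 5, i.e. 83%, while reporting
'i + 1' reaches 6, i.e. 100%.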

There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which
catches this issue in the 'fetch.writeCommitGraph' and
'fetch.writeCommitGraph with submodules' tests in
't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of
this series, but future changes to progress.c may add it or similar
assertions to catch this and similar bugs elsewhere.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 3860a0d8477..9d18c1d87d9 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.33.0.632.g78310755cd0



* [PATCH v3 2/2] entry: show finer-grained counter in "Filtering content" progress line
  2021-08-23 10:29     ` [PATCH v3 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-08-23 10:29       ` [PATCH v3 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-08-23 10:29       ` Ævar Arnfjörð Bjarmason
  2021-09-09  1:10       ` [PATCH v4 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-23 10:29 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Phillip Wood, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
    mode[1] helps spot this issue. Under it the 'missing file in
    delayed checkout' and 'invalid file in delayed checkout' tests in
    't0021-conversion.sh' fail, because both use only one
    filter.  (The test 'delayed checkout in process filter' uses two
    filters but the first one does all the work, so that test already
    happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.

After this change the 'invalid file in delayed checkout' test in
't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
assertion discussed above, but the 'missing file in delayed checkout'
test would still fail.

It'll fail because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all; see [2] for how to
spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
straightforward to fix it with the current progress.c library (see [3]
for an attempt), so let's leave it for now.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 125fabdbd52..d92dd020b3d 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.33.0.632.g78310755cd0



* Re: [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS
  2021-06-22 16:00   ` Taylor Blau
@ 2021-08-30 21:15     ` SZEDER Gábor
  0 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-08-30 21:15 UTC (permalink / raw)
  To: Taylor Blau
  Cc: git, Ævar Arnfjörð Bjarmason, René Scharfe

On Tue, Jun 22, 2021 at 12:00:07PM -0400, Taylor Blau wrote:
> On Sun, Jun 20, 2021 at 10:02:58PM +0200, SZEDER Gábor wrote:
> > Note that this will trigger even in cases where the output is not
> > visibly wrong, e.g. consider this simplified sequence of calls:
> >
> >   progress1 = start_delayed_progress();
> >   progress2 = start_delayed_progress();
> >   for (i = 0; ...)
> >       display_progress(progress2, i + 1);
> >   stop_progres(&progress2);
> >   for (j = 0; ...)
> >       display_progress(progress1, j + 1);
> >   stop_progres(&progress1);
> 
> s/stop_progres/&s, but no big deal. Everything else here looks good.

Well, at least I was consistent :)

> > diff --git a/progress.c b/progress.c
> > index 255995406f..549e8d1fe7 100644
> > --- a/progress.c
> > +++ b/progress.c
> > @@ -48,6 +48,8 @@ struct progress {
> >  static volatile sig_atomic_t progress_update;
> >
> >  static int test_check_progress;
> > +/* Used to catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS. */
> > +static struct progress *current_progress = NULL;
> >
> >  /*
> >   * These are only intended for testing the progress output, i.e. exclusively
> > @@ -258,8 +260,12 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
> >  	struct progress *progress;
> >
> >  	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
> > +	if (test_check_progress && current_progress)
> > +		BUG("progress \"%s\" is still active when starting new progress \"%s\"",
> > +		    current_progress->title, title);
> >
> >  	progress = xmalloc(sizeof(*progress));
> 
> Ah. This is why you moved the allocation down further, since we don't
> have to free anything up when calling BUG() if it wasn't allocated in
> the first place (and we had no such conditional that would cause us to
> abort early before).
> 
> For what it's worth, I probably would have preferred to see that change
> from the previous patch included in this one rather than in the first of
> the series, since it's much clearer here than it is in the first patch.

Yeah.  It must have been a rebase mishap.  (I started working on this
after I reported yet another commit-graph related progress bug around
v2.31.0-rc0, and I had the first two checks on the same evening.  But
then some time later Peff came along and found a backwards counting
progress line, so I decided to add a check for that as well, which
necessitated a bit of refactoring in the other two checks, and then a
hunk somehow ended up in the wrong patch.)



* Re: [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors
  2021-06-23  1:52     ` Taylor Blau
@ 2021-08-30 21:17       ` SZEDER Gábor
  0 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-08-30 21:17 UTC (permalink / raw)
  To: Taylor Blau
  Cc: René Scharfe, git, Ævar Arnfjörð Bjarmason

On Tue, Jun 22, 2021 at 09:52:32PM -0400, Taylor Blau wrote:
> On Mon, Jun 21, 2021 at 08:32:56PM +0200, René Scharfe wrote:
> > Am 20.06.21 um 22:03 schrieb SZEDER Gábor:
> > > RFC!!  Alas, not calling stop_progress() on error has drawbacks:
> > >
> > >   - All memory allocated for the progress bar is leaked.
> > >   - This progress line remains "active", in the sense that if we were
> > >     to start a new progress later in the same git process, then with
> > >     GIT_TEST_CHECK_PROGRESS it would trigger the other BUG() catching
> > >     nested/overlapping progresses.
> > >
> > > Do we care?!  TBH I don't :)
> > > Anyway, if we do, then we might need some sort of an abort_progress()
> > > function...
> >
> > I think the abort_progress() idea makes sense; to clean up allocations,
> > tell the user what happened and avoid the BUG().  Showing just
> > "aborted" instead of "done" should suffice here -- the explanation is
> > given a few lines later ("'foo' was not filtered properly").
> 
> Very well put. I concur that having an abort_progress() API makes sense
> for all of the reasons that you suggest, but also because we shouldn't
> encourage not using what seems like an appropriate API in order to not
> fail tests when GIT_TEST_CHECK_PROGRESS is set.

Ah, damn, I was hoping that I could get away with it :)



* [PATCH v4 0/2] progress.c API users: fix bogus counting
  2021-08-23 10:29     ` [PATCH v3 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-08-23 10:29       ` [PATCH v3 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
  2021-08-23 10:29       ` [PATCH v3 2/2] entry: show finer-grained counter in "Filtering content" " Ævar Arnfjörð Bjarmason
@ 2021-09-09  1:10       ` Ævar Arnfjörð Bjarmason
  2021-09-09  1:10         ` [PATCH v4 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
                           ` (2 more replies)
  2 siblings, 3 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-09  1:10 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Phillip Wood, Matheus Tavares,
	Ævar Arnfjörð Bjarmason

Fix bad uses of the progress.c API. See
https://lore.kernel.org/git/cover-v3-0.2-00000000000-20210823T102722Z-avarab@gmail.com
for the v3.

This re-roll resolves a merge conflict between v3 and 7a132c628e5
(checkout: make delayed checkout respect --quiet and --no-progress,
2021-08-26), i.e. the mt/quiet-with-delayed-checkout topic.

SZEDER Gábor (2):
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line

 commit-graph.c |  2 +-
 entry.c        | 12 +++++-------
 2 files changed, 6 insertions(+), 8 deletions(-)

Range-diff against v3:
1:  443374551ad = 1:  4cc3923089d commit-graph: fix bogus counter in "Scanning merged commits" progress line
2:  71c93f624ec ! 2:  54a09b5b883 entry: show finer-grained counter in "Filtering content" progress line
    @@ Commit message
         straightforward to fix it with the current progress.c library (see [3]
         for an attempt), so let's leave it for now.
     
    +    Let's also initialize the *progress to "NULL" while we're at it. Since
    +    7a132c628e5 (checkout: make delayed checkout respect --quiet and
    +    --no-progress, 2021-08-26) we have had progress conditional on
    +    "show_progress", usually we use the idiom of a "NULL" initialization
    +    of the "*progress", rather than the more verbose ternary added in
    +    7a132c628e5.
    +
         1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
         2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
         3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/
    @@ Commit message
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## entry.c ##
    -@@ entry.c: static int remove_available_paths(struct string_list_item *item, void *cb_data)
    - int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
    +@@ entry.c: int finish_delayed_checkout(struct checkout *state, int *nr_checkouts,
    + 			    int show_progress)
      {
      	int errs = 0;
     -	unsigned delayed_object_count;
     +	unsigned processed_paths = 0;
      	off_t filtered_bytes = 0;
      	struct string_list_item *filter, *path;
    - 	struct progress *progress;
    -@@ entry.c: int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
    +-	struct progress *progress;
    ++	struct progress *progress = NULL;
    + 	struct delayed_checkout *dco = state->delayed_checkout;
    + 
    + 	if (!state->delayed_checkout)
      		return errs;
      
      	dco->state = CE_RETRY;
     -	delayed_object_count = dco->paths.nr;
    --	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
    -+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
    +-	progress = show_progress
    +-		? start_delayed_progress(_("Filtering content"), delayed_object_count)
    +-		: NULL;
    ++	if (show_progress)
    ++		progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
      	while (dco->filters.nr > 0) {
      		for_each_string_list_item(filter, &dco->filters) {
      			struct string_list available_paths = STRING_LIST_INIT_NODUP;
    @@ entry.c: int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
      
      			if (!async_query_available_blobs(filter->string, &available_paths)) {
      				/* Filter reported an error */
    -@@ entry.c: int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
    +@@ entry.c: int finish_delayed_checkout(struct checkout *state, int *nr_checkouts,
      				ce = index_file_exists(state->istate, path->string,
      						       strlen(path->string), 0);
      				if (ce) {
-- 
2.33.0.825.gdc3f7a2a6c7



* [PATCH v4 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-09-09  1:10       ` [PATCH v4 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
@ 2021-09-09  1:10         ` Ævar Arnfjörð Bjarmason
  2021-09-09  1:10         ` [PATCH v4 2/2] entry: show finer-grained counter in "Filtering content" " Ævar Arnfjörð Bjarmason
  2021-09-09 20:02         ` [PATCH v4 0/2] progress.c API users: fix bogus counting Junio C Hamano
  2 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-09  1:10 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Phillip Wood, Matheus Tavares,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N. Fix this by passing 'i + 1' to display_progress(), like
most other callsites do.

There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which
catches this issue in the 'fetch.writeCommitGraph' and
'fetch.writeCommitGraph with submodules' tests in
't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of
this series, but future changes to progress.c may add it or similar
assertions to catch this and similar bugs elsewhere.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 00614acd65d..46170592204 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2125,7 +2125,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.33.0.825.gdc3f7a2a6c7



* [PATCH v4 2/2] entry: show finer-grained counter in "Filtering content" progress line
  2021-09-09  1:10       ` [PATCH v4 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-09-09  1:10         ` [PATCH v4 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-09-09  1:10         ` Ævar Arnfjörð Bjarmason
  2021-09-09 20:02         ` [PATCH v4 0/2] progress.c API users: fix bogus counting Junio C Hamano
  2 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-09  1:10 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Phillip Wood, Matheus Tavares,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1
    mode[1] helps spot this issue. Under it the 'missing file in
    delayed checkout' and 'invalid file in delayed checkout' tests in
    't0021-conversion.sh' fail, because both use only one
    filter.  (The test 'delayed checkout in process filter' uses two
    filters but the first one does all the work, so that test already
    happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
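
As a rough illustration of the difference (a simplified model, not the
code in entry.c; both function names and the "paths_per_filter" array
are made up for this sketch, and the outer retry loop is left out), the
last value handed to display_progress() under each scheme can be
computed like this:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: paths_per_filter[i] is the number of paths that
 * filter i ends up processing. Neither function exists in git.
 */

/* Old scheme: one update per filter, made before that filter's paths. */
unsigned last_count_per_filter(const unsigned *paths_per_filter,
			       size_t nr_filters)
{
	unsigned done = 0, last_shown = 0;
	size_t i;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++) {
		last_shown = done;		/* display_progress(p, done) */
		done += paths_per_filter[i];	/* inner loop over the paths */
	}
	return last_shown;
}

/* New scheme: one update per processed path, inside the inner loop. */
unsigned last_count_per_path(const unsigned *paths_per_filter,
			     size_t nr_filters)
{
	unsigned processed_paths = 0, last_shown = 0;
	size_t i;
	unsigned j;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++)
		for (j = 0; j < paths_per_filter[i]; j++)
			last_shown = ++processed_paths;
	return last_shown;
}
```

With a single filter handling three paths the old scheme never gets
past 0 while the new one reaches 3. With two filters where the first
does all the work, both reach 3, which matches the tests described
above.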

After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1
assertion discussed above, but the 'missing file in delayed checkout'
test would still fail.

It'll fail because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all; see [2] for how to
spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not
straightforward to fix it with the current progress.c library (see [3]
for an attempt), so let's leave it for now.

Let's also initialize the *progress to "NULL" while we're at it. Since
7a132c628e5 (checkout: make delayed checkout respect --quiet and
--no-progress, 2021-08-26) we have had progress conditional on
"show_progress". Usually we use the idiom of a "NULL" initialization
of the "*progress", rather than the more verbose ternary added in
7a132c628e5.

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev
3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/entry.c b/entry.c
index 044e8ec92c6..9b0f968a70c 100644
--- a/entry.c
+++ b/entry.c
@@ -163,24 +163,21 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts,
 			    int show_progress)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
-	struct progress *progress;
+	struct progress *progress = NULL;
 	struct delayed_checkout *dco = state->delayed_checkout;
 
 	if (!state->delayed_checkout)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = show_progress
-		? start_delayed_progress(_("Filtering content"), delayed_object_count)
-		: NULL;
+	if (show_progress)
+		progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -227,6 +224,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts,
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.33.0.825.gdc3f7a2a6c7


* Re: [PATCH v4 0/2] progress.c API users: fix bogus counting
  2021-09-09  1:10       ` [PATCH v4 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-09-09  1:10         ` [PATCH v4 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
  2021-09-09  1:10         ` [PATCH v4 2/2] entry: show finer-grained counter in "Filtering content" " Ævar Arnfjörð Bjarmason
@ 2021-09-09 20:02         ` Junio C Hamano
  2 siblings, 0 replies; 138+ messages in thread
From: Junio C Hamano @ 2021-09-09 20:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe, Phillip Wood, Matheus Tavares

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Fix bad uses of the progress.c API. See
> https://lore.kernel.org/git/cover-v3-0.2-00000000000-20210823T102722Z-avarab@gmail.com
> for the v3.
>
> This re-roll is on top of a merge conflict in v3 with 7a132c628e5
> (checkout: make delayed checkout respect --quiet and --no-progress,
> 2021-08-26), i.e. the mt/quiet-with-delayed-checkout topic.

Thanks, as that commit makes the call to progress code conditional,
with a new variable involved in the decision, it is understandable
that this needs to be adjusted for that newer codebase.

Very much appreciated.

* Re: [PATCH 12/25] progress.c: add & assert a "global_progress" variable
  2021-06-23 17:48       ` [PATCH 12/25] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-09-16 18:31         ` SZEDER Gábor
  0 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-09-16 18:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe, Taylor Blau

On Wed, Jun 23, 2021 at 07:48:12PM +0200, Ævar Arnfjörð Bjarmason wrote:
> The progress.c code makes a hard assumption that only one progress bar
> be active at a time (see [1] for a bug where this wasn't the case),
> but nothing has asserted that that's the case. Let's add a BUG()
> that'll trigger if two progress bars are active at the same time.

I very much dislike the idea of any BUG() in the progress code that
can trigger outside of the test suite.

As the number of progress-related fixes clearly shows, we often misuse
the progress API, and, arguably, a bug is a bug is a bug, so strictly
speaking a BUG() is not wrong here.

However, the progress line is merely a UI gimmick, not a crucial part
of Git, and none of those progress bugs affected the correctness of
the operation itself.  Worse, calling BUG() during some operations
(e.g. 'git commit-graph write', the worst offender when it comes to
progress bugs) can leave a lockfile behind, resulting in scary errors
and requiring manual cleanup in the .git directory, which is a much
worse UX than showing some bogus values or out of order progress
lines.


* Re: [PATCH 18/25] progress.c: emit progress on first signal, show "stalled"
  2021-06-23 17:48       ` [PATCH 18/25] progress.c: emit progress on first signal, show "stalled" Ævar Arnfjörð Bjarmason
@ 2021-09-16 18:37         ` SZEDER Gábor
  0 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-09-16 18:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe, Taylor Blau

On Wed, Jun 23, 2021 at 07:48:18PM +0200, Ævar Arnfjörð Bjarmason wrote:
> Ever since the progress.c code was added in 96a02f8f6d2 (common
> progress display support, 2007-04-18) we have been driven purely by
> calls to the display() function (via the public display_progress()),
> or via stop_progress(). Even though we got a signal and invoked
> progress_interval() that function would not actually emit progress
> output for us.
> 
> Thus in cases like "git gc" we don't emit any "Enumerating Objects"
> output until we get past the setup code, and start enumerating
> objects, we'll now (at least on my laptop) show output earlier, and
> emit a "stalled" message before we start the count.
> 
> But more generally, this is a first step towards never showing a
> hanging progress bar from the user's perspective. If we're truly
> taking a very long time with one item we can show some spinner that we
> update every time we get a signal. We don't right now, and only
> special-case the most common case of hanging before we get to the
> first item.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  progress.c                  |  7 +++++
>  t/t0500-progress-display.sh | 63 ++++++++++++++++++++++++++++++++++---
>  2 files changed, 66 insertions(+), 4 deletions(-)
> 
> diff --git a/progress.c b/progress.c
> index 6c4038df791..35847d3a7f2 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -255,6 +255,13 @@ void display_progress(struct progress *progress, uint64_t n)
>  static void progress_interval(int signum)
>  {
>  	progress_update = 1;
> +
> +	if (global_progress->last_value != -1)
> +		return;
> +
> +	display(global_progress, 0, _(", stalled."), 0);

We have a few progress lines that are updated from multiple threads,
and to prevent concurrency issues those threads call display() while
holding a mutex.  This call without synchronization causes undefined
behavior.
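
For comparison, here is a minimal sketch of the flag-only pattern, in
which the handler does nothing but set a flag and the actual output
happens later from regular code (which threaded callers would run with
their mutex held). It is an illustration under assumptions, not
progress.c itself: display_if_due(), simulate_ticks() and
"lines_emitted" are hypothetical names, and raise() stands in for the
interval timer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t progress_update;
static unsigned lines_emitted;	/* hypothetical stand-in for real output */

/* Handler: only sets a flag, which is async-signal-safe. */
static void progress_interval(int signum)
{
	(void)signum;
	progress_update = 1;
}

/* Runs in normal context; a threaded caller would hold its lock here. */
static void display_if_due(void)
{
	if (!progress_update)
		return;
	progress_update = 0;
	lines_emitted++;
}

/* Deliver nr_ticks signals and report how many updates were "shown". */
unsigned simulate_ticks(unsigned nr_ticks)
{
	struct sigaction sa, old;
	unsigned i;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = progress_interval;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, &old))
		return 0;

	lines_emitted = 0;
	progress_update = 0;
	for (i = 0; i < nr_ticks; i++) {
		raise(SIGALRM);		/* returns after the handler ran */
		display_if_due();
	}
	assert(!progress_update);	/* every pending tick was consumed */
	sigaction(SIGALRM, &old, NULL);
	return lines_emitted;
}
```

Nothing that touches shared progress state runs inside the handler in
this sketch, so no locking is needed there.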


* Re: [PATCH 13/25] progress.[ch]: move the "struct progress" to the header
  2021-06-23 17:48       ` [PATCH 13/25] progress.[ch]: move the "struct progress" to the header Ævar Arnfjörð Bjarmason
@ 2021-09-16 19:42         ` SZEDER Gábor
  0 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-09-16 19:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe, Taylor Blau

On Wed, Jun 23, 2021 at 07:48:13PM +0200, Ævar Arnfjörð Bjarmason wrote:
> Move the definition of the "struct progress" to the progress.h
> header. Even though its contents are meant to be "private" this
> pattern has resulted in forward declarations of it in various places,
> as other functions have a need to pass it around.
> 
> Let's just define it in the header instead. 

This is not a good excuse to move the definition of 'struct progress'
to the header file.  Defining a struct in a C source file and
declaring it in header files is C's well-established way to create
an opaque data type and to hide implementation details, so there is
nothing wrong with those forward declarations, and keeping 'struct
progress' private to 'progress.c' is a good thing.

Having said that, we can simply remove all those forward declarations
without moving the definition of 'struct progress' to 'progress.h',
and still successfully build git.  The reason is that in 'cache.h':

  struct index_state {
    [...]
    struct progress *progress;
    [...]
  };

does count as a forward declaration of 'struct progress', and
'cache.h' is the first header included in just about all our C source
files, rendering the other forward declaration unnecessary.
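
A small single-file sketch of that idiom follows. The "_sketch" names
and the two struct fields are made up for illustration and are not
git's; the point is only that a pointer member is enough to introduce
the incomplete type at file scope, while the full definition stays in
one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* "Header" part: the member below is the first mention of the tag and
 * declares the incomplete type 'struct progress' at file scope. */
struct index_state_sketch {
	struct progress *progress;
};

/* These prototypes need no separate 'struct progress;' line anymore. */
struct progress *start_progress_sketch(unsigned long total);
void display_progress_sketch(struct progress *p, unsigned long n);
unsigned long stop_progress_sketch(struct progress **p);

/* "Source file" part: the only place that knows the layout. */
struct progress {
	unsigned long total;
	unsigned long last_value;
};

struct progress *start_progress_sketch(unsigned long total)
{
	struct progress *p = calloc(1, sizeof(*p));

	if (p)
		p->total = total;
	return p;
}

void display_progress_sketch(struct progress *p, unsigned long n)
{
	if (p)
		p->last_value = n;
}

/* Frees the progress, clears the caller's pointer, returns last value. */
unsigned long stop_progress_sketch(struct progress **p)
{
	unsigned long last = 0;

	assert(p);
	if (*p) {
		last = (*p)->last_value;
		free(*p);
		*p = NULL;
	}
	return last;
}
```

Callers can hold and pass around the pointer, but cannot read or
modify the fields without the definition.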


> It's part of our own
> internal code, so we're not at much risk of someone tweaking the
> internal fields manually. While doing that rename the "TP_IDX_MAX"
> macro to the more clearly namespaced "PROGRESS_THROUGHPUT_IDX_MAX".
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  cache.h             |  1 -
>  csum-file.h         |  2 --
>  pack.h              |  1 -
>  parallel-checkout.h |  1 -
>  progress.c          | 29 +----------------------------
>  progress.h          | 28 +++++++++++++++++++++++++++-
>  reachable.h         |  1 -
>  7 files changed, 28 insertions(+), 35 deletions(-)
> 
> diff --git a/cache.h b/cache.h
> index ba04ff8bd36..7e03a181f68 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -308,7 +308,6 @@ static inline unsigned int canon_mode(unsigned int mode)
>  
>  struct split_index;
>  struct untracked_cache;
> -struct progress;
>  struct pattern_list;
>  
>  struct index_state {
> diff --git a/csum-file.h b/csum-file.h
> index 3044bd19ab6..3de0de653e8 100644
> --- a/csum-file.h
> +++ b/csum-file.h
> @@ -3,8 +3,6 @@
>  
>  #include "hash.h"
>  
> -struct progress;
> -
>  /* A SHA1-protected file */
>  struct hashfile {
>  	int fd;
> diff --git a/pack.h b/pack.h
> index fa139545262..8df04f4937a 100644
> --- a/pack.h
> +++ b/pack.h
> @@ -77,7 +77,6 @@ struct pack_idx_entry {
>  };
>  
>  
> -struct progress;
>  /* Note, the data argument could be NULL if object type is blob */
>  typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned long, void*, int*);
>  
> diff --git a/parallel-checkout.h b/parallel-checkout.h
> index 80f539bcb77..193f76398d6 100644
> --- a/parallel-checkout.h
> +++ b/parallel-checkout.h
> @@ -5,7 +5,6 @@
>  
>  struct cache_entry;
>  struct checkout;
> -struct progress;
>  
>  /****************************************************************
>   * Users of parallel checkout
> diff --git a/progress.c b/progress.c
> index e1b50ef7882..aff9af9ee8b 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -17,33 +17,6 @@
>  #include "utf8.h"
>  #include "config.h"
>  
> -#define TP_IDX_MAX      8
> -
> -struct throughput {
> -	off_t curr_total;
> -	off_t prev_total;
> -	uint64_t prev_ns;
> -	unsigned int avg_bytes;
> -	unsigned int avg_misecs;
> -	unsigned int last_bytes[TP_IDX_MAX];
> -	unsigned int last_misecs[TP_IDX_MAX];
> -	unsigned int idx;
> -	struct strbuf display;
> -};
> -
> -struct progress {
> -	const char *title;
> -	uint64_t last_value;
> -	uint64_t total;
> -	unsigned last_percent;
> -	unsigned delay;
> -	struct throughput *throughput;
> -	uint64_t start_ns;
> -	struct strbuf counters_sb;
> -	int title_len;
> -	int split;
> -};
> -
>  static volatile sig_atomic_t progress_update;
>  static struct progress *global_progress;
>  
> @@ -194,7 +167,7 @@ void display_throughput(struct progress *progress, uint64_t total)
>  	tp->avg_misecs -= tp->last_misecs[tp->idx];
>  	tp->last_bytes[tp->idx] = count;
>  	tp->last_misecs[tp->idx] = misecs;
> -	tp->idx = (tp->idx + 1) % TP_IDX_MAX;
> +	tp->idx = (tp->idx + 1) % PROGRESS_THROUGHPUT_IDX_MAX;
>  
>  	throughput_string(&tp->display, total, rate);
>  	if (progress->last_value != -1 && progress_update)
> diff --git a/progress.h b/progress.h
> index f1913acf73f..4fb2b483d36 100644
> --- a/progress.h
> +++ b/progress.h
> @@ -1,7 +1,33 @@
>  #ifndef PROGRESS_H
>  #define PROGRESS_H
> +#include "strbuf.h"
>  
> -struct progress;
> +#define PROGRESS_THROUGHPUT_IDX_MAX      8
> +
> +struct throughput {
> +	off_t curr_total;
> +	off_t prev_total;
> +	uint64_t prev_ns;
> +	unsigned int avg_bytes;
> +	unsigned int avg_misecs;
> +	unsigned int last_bytes[PROGRESS_THROUGHPUT_IDX_MAX];
> +	unsigned int last_misecs[PROGRESS_THROUGHPUT_IDX_MAX];
> +	unsigned int idx;
> +	struct strbuf display;
> +};
> +
> +struct progress {
> +	const char *title;
> +	uint64_t last_value;
> +	uint64_t total;
> +	unsigned last_percent;
> +	unsigned delay;
> +	struct throughput *throughput;
> +	uint64_t start_ns;
> +	struct strbuf counters_sb;
> +	int title_len;
> +	int split;
> +};
>  
>  #ifdef GIT_TEST_PROGRESS_ONLY
>  
> diff --git a/reachable.h b/reachable.h
> index 5df932ad8f5..7e1ddddbc63 100644
> --- a/reachable.h
> +++ b/reachable.h
> @@ -1,7 +1,6 @@
>  #ifndef REACHEABLE_H
>  #define REACHEABLE_H
>  
> -struct progress;
>  struct rev_info;
>  
>  int add_unseen_recent_objects_to_traversal(struct rev_info *revs,
> -- 
> 2.32.0.599.g3967b4fa4ac
> 

* Re: [PATCH 12/25] progress.c: add & assert a "global_progress" variable
  2021-07-22 12:55   ` [PATCH 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-09-16 21:34     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-16 21:34 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Junio C Hamano, René Scharfe, Taylor Blau


I've adjusted the In-reply-to header here.

This is really in reply to <20210916183137.GD76263@szeder.dev>, but
that's a reply to a previous & longer RFC-ish iteration of the
series. https://lore.kernel.org/git/cover-0.8-0000000000-20210722T125012Z-avarab@gmail.com/
is what's currently queued as ab/only-single-progress-at-once in "seen".

Let's continue the discussion relevant to the currently proposed patches
in this thread.

On Thu, Sep 16 2021, SZEDER Gábor wrote:

> On Wed, Jun 23, 2021 at 07:48:12PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> The progress.c code makes a hard assumption that only one progress bar
>> be active at a time (see [1] for a bug where this wasn't the case),
>> but nothing has asserted that that's the case. Let's add a BUG()
>> that'll trigger if two progress bars are active at the same time.
>
> I very much dislike the idea of any BUG() in the progress code that
> can trigger outside of the test suite.

First regarding the state of this series. I'd understood your:

    Please don't advance this to next yet.  I've found some issues with
    it, but not the time to raise them.

in https://lore.kernel.org/git/20210901050406.GB76263@szeder.dev/ to
mean (perhaps by reading too much between the lines out of paranoia)
that you'd found some case where we'd hit the BUG() here.

Am I correct that you haven't, but are concerned that we've left some
case undiscovered where we might?

I'm wondering given your replies there if you thought that the series to
be merged down was that 25 patch one. I agree (e.g. because of your
https://lore.kernel.org/git/20210916183711.GE76263@szeder.dev/) that
merging that one down wouldn't be a good idea in its current state.

Of course you may still have valid concerns etc., just trying to clarify
if there's some specific known outstanding issue or not with the current
8-patch series in "seen".

> As the number of progress-related fixes clearly shows, we often misuse
> the progress API, and, arguably, a bug is a bug is a bug, so strictly
> speaking a BUG() is not wrong here.
>
> However, the progress line is merely a UI gimmick, not a crucial part
> of Git, and none of those progress bugs affected the correctness of
> the operation itself.  Worse, calling BUG() during some operations
> (e.g. 'git commit-graph write', the worst offender when it comes to
> progress bugs) can leave a lockfile behind, resulting in scary errors
> and requiring manual cleanup in the .git directory, which is a much
> worse UX than showing some bogus values or out of order progress
> lines.

Yes, I agree that actually hitting this BUG() would absolutely suck, and
that we shouldn't consider this patch if we weren't certain, or near
enough, that we wouldn't hit it.

As covered in the cover letter of the earlier series I sent (at
https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/)
I share your concerns that it's hard to uncover if we've reached
sufficient coverage to be certain that we should add certain assertions,
i.e. some of the ones your initial series added around the actual
progress counting that we do.

In this case though, I do think we can safely add this. Maybe you have
run into the BUG() and I'm about to get some egg on my face, but here's
why I think we won't hit it.

This is a *much* narrower case than the general thread quagmire around
display_progress() etc., since this only covers the start/stop progress
calls. All of the multi-threaded code we have does the equivalent of:

    p = start_progress();
    start_threads();
    /* do stuff in threads, including with "p" */
    stop_threads();
    stop_progress(p);

Do you agree that if this is the pattern everywhere that this patch
would be safe in its current form?

To assert that that's the case (I'd read the code before) I instrumented
the tests to BUG() out if we ever start() or stop() where getpid() !=
gettid(), which on Linux means you're inside a pthread. The diff-on-top
is at the end of this E-Mail.

With that and running the tests with:

    GIT_TEST_BUG_START=1 \
    GIT_TEST_BUG_STOP=1

We pass all tests, i.e. there are no current callers that call
start_progress() or stop_progress() and do so in anything but the main
program thread.

Well, there could be, per the concerns I had in the CL linked above,
i.e. sometimes that start_progress() is guarded by a preceding isatty()
or whatever, but having looked at those / grepped the union of pthread +
progress in the source I think there are no such cases. That's *much*
easier to eyeball than "does all this control flow around
display_progress() make sense?".

Getting back on topic, we pass all tests with that, but we'll fail with:

    GIT_TEST_BUG_DISPLAY=1 \
    GIT_TEST_BUG_DISPLAY_HARDER=1

Why the second variable? Because I marked up the two callers that call
display_progress() within threads, it's just index-pack and
pack-objects. If you don't set GIT_TEST_BUG_DISPLAY_HARDER=1 we'll
whitelist those two, and all tests pass.

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 6cc48902170..05c82dc6e6d 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1229,6 +1229,7 @@ static void resolve_deltas(void)
 	QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry);
 	QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry);
 
+	no_progress_bug();
 	if (verbose || show_resolving_progress)
 		progress = start_progress(_("Resolving deltas"),
 					  nr_ref_deltas + nr_ofs_deltas);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ec8503563a6..97ad321f67c 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3072,6 +3072,7 @@ static void prepare_pack(int window, int depth)
 	if (nr_deltas && n > 1) {
 		unsigned nr_done = 0;
 
+		no_progress_bug();
 		if (progress)
 			progress_state = start_progress(_("Compressing objects"),
 							nr_deltas);
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index f3aba5d6cbb..efa1e5e5bdf 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -5,6 +5,9 @@
 
 . ${0%/*}/lib.sh
 
+export GIT_TEST_BUG_START=true
+export GIT_TEST_BUG_STOP=true
+
 case "$CI_OS_NAME" in
 windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
diff --git a/progress.c b/progress.c
index 14a023f4b43..bdebf100ad8 100644
--- a/progress.c
+++ b/progress.c
@@ -16,6 +16,7 @@
 #include "trace.h"
 #include "utf8.h"
 #include "config.h"
+#include <unistd.h>
 
 #define TP_IDX_MAX      8
 
@@ -202,8 +203,18 @@ void display_throughput(struct progress *progress, uint64_t total)
 		display(progress, progress->last_value, NULL);
 }
 
+static int no_bug_please;
+void no_progress_bug(void)
+{
+	no_bug_please = 1;
+}
+
 void display_progress(struct progress *progress, uint64_t n)
 {
+	if (getpid() != gettid() && getenv("GIT_TEST_BUG_DISPLAY") &&
+	    (getenv("GIT_TEST_BUG_DISPLAY_HARDER") || !no_bug_please))
+		BUG("display: pid = %d, tid = %d: %s\n", getpid(), gettid(),
+		    progress ? progress->title : "N/A");
 	if (progress)
 		display(progress, n, NULL);
 }
@@ -281,6 +292,11 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	progress->split = 0;
 	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
+
+	if (getpid() != gettid() && getenv("GIT_TEST_BUG_START"))
+		BUG("start: pid = %d, tid = %d: %s\n", getpid(), gettid(),
+		    title ? title : "N/A");
+
 	return progress;
 }
 
@@ -334,6 +350,11 @@ static void finish_if_sparse(struct progress *progress)
 
 void stop_progress(struct progress **p_progress)
 {
+	no_bug_please = 0;
+	if (getpid() != gettid() && getenv("GIT_TEST_BUG_STOP"))
+		BUG("stop: pid = %d, tid = %d: %s\n", getpid(), gettid(),
+		    ((p_progress && *p_progress) ? (*p_progress)->title : "N/A"));
+
 	if (!p_progress)
 		BUG("don't provide NULL to stop_progress");
 
diff --git a/progress.h b/progress.h
index f1913acf73f..2ebb1da2666 100644
--- a/progress.h
+++ b/progress.h
@@ -20,5 +20,6 @@ struct progress *start_delayed_sparse_progress(const char *title,
 					       uint64_t total);
 void stop_progress(struct progress **progress);
 void stop_progress_msg(struct progress **progress, const char *msg);
+void no_progress_bug(void);
 
 #endif

* Re: [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress()
  2021-06-23 17:48       ` [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-09-17  5:14         ` SZEDER Gábor
  2021-09-17  5:56           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 138+ messages in thread
From: SZEDER Gábor @ 2021-09-17  5:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe, Taylor Blau

On Wed, Jun 23, 2021 at 07:48:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
> Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
> bitmap writing, 2013-12-21), we did not call stop_progress() if we
> reached the early exit in this function. This will matter in a
> subsequent commit where we BUG(...) out if this happens, and matters
> now e.g. because we don't have a corresponding "region_end" for the
> progress trace2 event.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  pack-bitmap-write.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index 88d9e696a54..6e110e41ea4 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
>  	if (indexed_commits_nr < 100) {
>  		for (i = 0; i < indexed_commits_nr; ++i)
>  			push_bitmapped_commit(indexed_commits[i]);
> +		stop_progress(&writer.progress);
>  		return;
>  	}

When I found this bug I fixed it differently: with your patch there
are no display() calls at all between start_progress() and this new
stop_progress(), indicating that a stop_progress() is not missing but
rather the start_progress is in the wrong place:

  ---  >8  ---

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 88d9e696a5..f0b4044e2b 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -544,15 +544,15 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 
 	QSORT(indexed_commits, indexed_commits_nr, date_compare);
 
-	if (writer.show_progress)
-		writer.progress = start_progress("Selecting bitmap commits", 0);
-
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
 		return;
 	}
 
+	if (writer.show_progress)
+		writer.progress = start_progress("Selecting bitmap commits", 0);
+
 	for (;;) {
 		struct commit *chosen = NULL;
 
  ---  8<  ---

And I don't think it's worth adding display() calls to that loop,
because it has so few iterations and it does barely anything per
iteration.
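
The shape of that fix can be sketched in isolation like this. It is a
toy model with hypothetical names, not pack-bitmap-write.c: a counter
stands in for the progress bar, and the function returns how many bars
are still open when it exits.

```c
#include <assert.h>

static int active_progress;	/* hypothetical: number of open bars */

static void start_sketch(void)
{
	assert(!active_progress);	/* only one bar at a time */
	active_progress = 1;
}

static void stop_sketch(void)
{
	active_progress = 0;
}

/*
 * Start the bar only after the early exit, so the short path never
 * opens one and needs no matching stop.
 */
int select_commits_sketch(unsigned indexed_commits_nr)
{
	if (indexed_commits_nr < 100)
		return active_progress;	/* early exit, nothing started */

	start_sketch();
	/* the selection loop would run here */
	stop_sketch();
	return active_progress;
}
```

Both paths leave zero bars open, without an extra stop call on the
early exit.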


* Re: [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress()
  2021-09-17  5:14         ` SZEDER Gábor
@ 2021-09-17  5:56           ` Ævar Arnfjörð Bjarmason
  2021-09-17 21:38             ` SZEDER Gábor
  0 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-17  5:56 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Junio C Hamano, René Scharfe, Taylor Blau


On Fri, Sep 17 2021, SZEDER Gábor wrote:

> On Wed, Jun 23, 2021 at 07:48:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
>> bitmap writing, 2013-12-21), we did not call stop_progress() if we
>> reached the early exit in this function. This will matter in a
>> subsequent commit where we BUG(...) out if this happens, and matters
>> now e.g. because we don't have a corresponding "region_end" for the
>> progress trace2 event.
>> 
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  pack-bitmap-write.c | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
>> index 88d9e696a54..6e110e41ea4 100644
>> --- a/pack-bitmap-write.c
>> +++ b/pack-bitmap-write.c
>> @@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
>>  	if (indexed_commits_nr < 100) {
>>  		for (i = 0; i < indexed_commits_nr; ++i)
>>  			push_bitmapped_commit(indexed_commits[i]);
>> +		stop_progress(&writer.progress);
>>  		return;
>>  	}
>
> When I found this bug I fixed it differently: with your patch there

Is that patch on-list somewhere or something you have locally?

> are no display() calls at all between start_progress() and this new
> stop_progress(), indicating that a stop_progress() is not missing but
> rather the start_progress is in the wrong place:

*Nod*, I'll see about fixing it differently depending on the above / any
other comments.

Note that while this comment is current to
https://lore.kernel.org/git/patch-7.8-eb63b4ba6a-20210722T125012Z-avarab@gmail.com/
as well, as noted in
https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/ you've
had several comments on the 25 patch series not currently queued in
"seen".

Still very useful as I'd had some of it planned for after that 8-patch
series, just noting it for context.

* Re: [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress()
  2021-09-17  5:56           ` Ævar Arnfjörð Bjarmason
@ 2021-09-17 21:38             ` SZEDER Gábor
  0 siblings, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-09-17 21:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe, Taylor Blau

On Fri, Sep 17, 2021 at 07:56:48AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Sep 17 2021, SZEDER Gábor wrote:
> 
> > On Wed, Jun 23, 2021 at 07:48:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
> >> bitmap writing, 2013-12-21), we did not call stop_progress() if we
> >> reached the early exit in this function. This will matter in a
> >> subsequent commit where we BUG(...) out if this happens, and matters
> >> now e.g. because we don't have a corresponding "region_end" for the
> >> progress trace2 event.
> >> 
> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> >> ---
> >>  pack-bitmap-write.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >> 
> >> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> >> index 88d9e696a54..6e110e41ea4 100644
> >> --- a/pack-bitmap-write.c
> >> +++ b/pack-bitmap-write.c
> >> @@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
> >>  	if (indexed_commits_nr < 100) {
> >>  		for (i = 0; i < indexed_commits_nr; ++i)
> >>  			push_bitmapped_commit(indexed_commits[i]);
> >> +		stop_progress(&writer.progress);
> >>  		return;
> >>  	}
> >
> > When I found this bug I fixed it differently: with your patch there
> 
> Is that patch on-list somewhere or something you have locally?

Yes, it is between the scissors lines in the email you responded to :)
It was another fallout from my isatty(2) vs. progress PoC starting at

  https://public-inbox.org/git/20210623215736.8279-1-szeder.dev@gmail.com/

but it was not included there, because I fixed it some days after
sending those patches.

> > are no display() calls at all between start_progress() and this new
> > stop_progress(), indicating that a stop_progress() is not missing but
> > rather the start_progress is in the wrong place:
> 
> *Nod*, I'll see about fixing it differently depending on the above / any
> other comments.
> 
> Note that while this comment is current to
> https://lore.kernel.org/git/patch-7.8-eb63b4ba6a-20210722T125012Z-avarab@gmail.com/
> as well, as noted in
> https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/ you've
> had several comments on the 25 patch series not currently queued in
> "seen".

Oh, right, I commented on the wrong patch series.  Gaah.

> Still very useful as I'd had some of it planned for after that 8-patch
> series, just noting it for context.

* [PATCH v2 0/8] progress: assert "global_progress" + test fixes / cleanup
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (8 preceding siblings ...)
  2021-07-23 22:02   ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Junio C Hamano
@ 2021-09-20 23:09   ` Ævar Arnfjörð Bjarmason
  2021-09-20 23:09     ` [PATCH v2 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
                       ` (8 more replies)
  9 siblings, 9 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

These patches improve the progress.c tests and fix a couple of
miscellaneous nits in 5/8 and 6/8. Then 7/8 and 8/8 fix what I
believe is the last in a class of bug that 8/8 adds a new BUG() assert
for: We should not be starting two progress bars at the same time.
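
As a rough illustration of the kind of check described for 8/8, here is
a minimal sketch. It is not the code from the patch: the struct, the
helper names and the error-code convention are all hypothetical, and
the series describes a BUG() where this sketch merely returns -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for git's "struct progress"; only a title is kept. */
struct progress {
	const char *title;
};

/* The single active progress bar, or NULL when none is active. */
static struct progress *global_progress;

/*
 * Register "progress" as the active bar. Returns 0 on success and -1 if
 * another bar is already active (the series describes a BUG() here).
 */
int set_global_progress(struct progress *progress)
{
	assert(progress != NULL);
	if (global_progress)
		return -1;
	global_progress = progress;
	return 0;
}

/*
 * Unregister the active bar. Returns 0 on success and -1 if no bar was
 * active, i.e. a "stop" without a matching "start".
 */
int unset_global_progress(void)
{
	if (!global_progress)
		return -1;
	global_progress = NULL;
	return 0;
}
```

With a single global pointer, both a second "start" and a stray "stop"
become detectable at the call site instead of silently corrupting the
display state.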


This series has been held off since September 1st on SZEDER's comment
that he "found some issues with it" in[1].

He's been relatively inactive on list recently, but I believe based on
a recent discussion in the thread-at-large that those comments didn't
refer to any unreported bug, but a general concern that we might be
getting things wrong and had cases where the BUG() might trigger that
we haven't thought of.

I think my [2] and the updated commit message of 8/8 cover in detail
why I think this is safe to do. SZEDER didn't reply to my [2] yet, so
perhaps there really is some specific issue I'm not aware of (i.e. the
BUG() being hit), but I don't think there is based on the information
I have now.

SZEDER also had comments rightly pointing out some issues[3] in the
earlier 25-patch series I'd submitted[2]. Those will need to be
addressed or fixed if I re-submit those, but they're not part of this
series.

1. https://lore.kernel.org/git/20210901050406.GB76263@szeder.dev/
2. https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/20210916183711.GE76263@szeder.dev/

Ævar Arnfjörð Bjarmason (8):
  progress.c tests: make start/stop verbs on stdin
  progress.c tests: test some invalid usage
  progress.c: move signal handler functions lower
  progress.c: call progress_interval() from progress_test_force_update()
  progress.c: stop eagerly fflush(stderr) when not a terminal
  progress.c: add temporary variable from progress struct
  pack-bitmap-write.c: add a missing stop_progress()
  progress.c: add & assert a "global_progress" variable

 pack-bitmap-write.c         |   1 +
 progress.c                  | 116 ++++++++++++++++++++----------------
 t/helper/test-progress.c    |  43 +++++++++----
 t/t0500-progress-display.sh | 103 +++++++++++++++++++++++++-------
 4 files changed, 178 insertions(+), 85 deletions(-)

Range-diff against v1:
1:  e0a294eb479 = 1:  e0a294eb479 progress.c tests: make start/stop verbs on stdin
2:  7b1220b641e = 2:  7b1220b641e progress.c tests: test some invalid usage
3:  f1b8bf1dbde = 3:  f1b8bf1dbde progress.c: move signal handler functions lower
4:  74057b0046a = 4:  74057b0046a progress.c: call progress_interval() from progress_test_force_update()
5:  250e50667c2 = 5:  250e50667c2 progress.c: stop eagerly fflush(stderr) when not a terminal
6:  d4e9ff1de73 = 6:  d4e9ff1de73 progress.c: add temporary variable from progress struct
7:  a3f133ca7ad = 7:  a3f133ca7ad pack-bitmap-write.c: add a missing stop_progress()
8:  4fd2754caeb ! 8:  1bd285eba0d progress.c: add & assert a "global_progress" variable
    @@ Commit message
         progress.c: add & assert a "global_progress" variable
     
         The progress.c code makes a hard assumption that only one progress bar
    -    be active at a time (see [1] for a bug where this wasn't the case),
    -    but nothing has asserted that that's the case. Let's add a BUG()
    -    that'll trigger if two progress bars are active at the same time.
    -
    -    There's an alternate test-only approach to doing the same thing[2],
    -    but by doing this for all progress bars we'll have a canary to check
    -    if we have any unexpected interaction between the "sig_atomic_t
    -    progress_update" variable and this global struct.
    -
    -    I am then planning on using this scaffolding in the future to fix a
    -    limitation in the progress output, namely the current limitation of
    -    the progress.c bar code that any update must pro-actively go through
    -    the likes of display_progress().
    -
    -    If we e.g. hang forever before the first display_progress(), or in the
    -    middle of a loop that would call display_progress() the user will only
    -    see either no output, or output frozen at the last display_progress()
    -    that would have done an update (e.g. in cases where progress_update
    -    was "1" due to an earlier signal).
    -
    -    This change does not fix that, but sets up the structure for solving
    -    that and other related problems by juggling this "global_progress"
    -    struct. Later changes will make more use of the "global_progress" than
    -    only using it for these assertions.
    +    be active at a time (see [1] for a bug where this wasn't the
    +    case). Add a BUG() that'll trigger if we ever regress on that promise
    +    and have two progress bars active at the same time.
    +
    +    There was an alternative test-only approach to doing the same
    +    thing[2], but by doing this outside of a GIT_TEST_* mode we'll know
    +    we've put a hard stop to this particular API misuse.
    +
    +    It will also establish scaffolding to address current fundamental
    +    limitations in the progress output: The current output must be
    +    "driven" by calls to the likes of display_progress(). Once we have a
    +    global current progress object we'll be able to update that object via
    +    SIGALRM. See [3] for early code to do that.
    +
    +    It's conceivable that this change will hit the BUG() condition in some
    +    scenario that we don't currently have tests for, which would be very
    +    bad: if that happened we'd die just because we couldn't emit some
    +    pretty output.
    +
    +    See [4] for a discussion of why our test coverage is lacking; our
    +    progress display is hidden behind isatty(2) checks in many cases, so
    +    the test suite doesn't cover it unless individual tests are run in
    +    "--verbose" mode, we might also have multi-threaded use of the API, so
    +    two progress bars stopping and starting would only be visible due to a
    +    race condition.
    +
    +    Despite that, I think that this change won't introduce such
    +    regressions, because:
    +
    +     1. I've read all the code using the progress API (and have modified a
    +        large part of it in some WIP code I have). Almost all of it is really
    +        simple, the parts that aren't[5] are complex in the display_progress() part,
    +        not in starting or stopping the progress bar.
    +
    +     2. The entire test suite passes when instrumented with an ad-hoc
    +        Linux-specific mode (it uses gettid()) to die if progress bars are
    +        ever started or stopped on anything but the main thread[6].
    +
    +        Extending that to die if display_progress() is called in a thread
    +        reveals that we have exactly two users of the progress bar under
    +        threaded conditions, "git index-pack" and "git pack-objects". Both
    +        uses are straightforward, and they don't start/stop the progress
    +        bar when threads are active.
    +
    +     3. I've likewise done an ad-hoc test to force progress bars to be
    +        displayed with:
    +
    +            perl -pi -e 's[isatty\((?:STDERR_FILENO|2)\)][1]g' $(git grep -l 'isatty\((STDERR_FILENO|2)\)')
    +
    +        I.e. to replace all checks (not just for progress) of checking
    +        whether STDERR is connected to a TTY, and then monkeypatching
    +        is_foreground_fd() in progress.c to always "return 1". Running the
    +        tests with those applied, interactively and under -V reveals via:
    +
    +            $ grep -e set_progress_signal -e clear_progress_signal test-results/*out
    +
    +        That nothing our tests cover hits the BUG conditions added here,
    +        except the expected "BUG: start two concurrent progress bars" test
    +        being added here.
    +
    +        That isn't entirely true since we won't be getting 100% coverage
    +        due to cascading failures from tests that expected no progress
    +        output on stderr. To make sure I covered 100% I also tried making
    +        the display() function in progress.c a NOOP on top of that (it's
    +        the calls to start_progress_delay() and stop_progress() that
    +        matter).
    +
    +        That doesn't hit the BUG() either. Some tests fail in that mode
    +        due to a combination of the overzealous isatty(2) munging noted
    +        above, and the tests that are testing that the progress output
    +        itself is present (but for testing I'd made display() a NOOP).
    +
    +    Between those three points I think it's safe to go ahead with this
    +    change.
     
         1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
         2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com
    +    3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
    +    4. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
    +    5. b50c37aa44d (Merge branch 'ab/progress-users-adjust-counters' into
    +       next, 2021-09-10)
    +    6. https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 1/8] progress.c tests: make start/stop verbs on stdin
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-10-08  3:43       ` Emily Shaffer
  2021-09-20 23:09     ` [PATCH v2 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
                       ` (7 subsequent siblings)
  8 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Change the usage of the "test-tool progress" introduced in
2bb74b53a49 (Test the progress display, 2019-09-16) to take commands
like "start" and "stop" on stdin, instead of running them implicitly.

This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such stress tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
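The "start[ <total>[ <title>]]" grammar described above can be sketched
as a standalone parser. This is only an illustration under stated
assumptions: parse_start() and its return convention are hypothetical,
and the helper in the diff below uses skip_prefix() and die() rather
than strncmp() and an error code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a "start[ <total>[ <title>]]" line (hypothetical helper).
 * On success store the total and a pointer to the title (pointing into
 * "line", or at "default_title") and return 0; return -1 otherwise.
 */
int parse_start(const char *line, const char *default_title,
		uint64_t *total, const char **title)
{
	char *end;

	assert(line && default_title && total && title);

	/* Bare "start": default title, unknown total. */
	if (!strcmp(line, "start")) {
		*total = 0;
		*title = default_title;
		return 0;
	}
	if (strncmp(line, "start ", 6))
		return -1;

	*total = strtoull(line + 6, &end, 10);
	if (*end == '\0')
		*title = default_title;	/* "start <total>" */
	else if (*end == ' ')
		*title = end + 1;	/* "start <total> <title>" */
	else
		return -1;		/* trailing garbage after the number */
	return 0;
}
```

For example, "start 3" yields a total of 3 with the default title, while
"start 3x" is rejected.
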
 t/helper/test-progress.c    | 43 +++++++++++++++++++--------
 t/t0500-progress-display.sh | 59 +++++++++++++++++++++++--------------
 2 files changed, 67 insertions(+), 35 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe7894..685c0a7c49a 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -3,6 +3,9 @@
  *
  * Reads instructions from standard input, one instruction per line:
  *
+ *   "start[ <total>[ <title>]]" - Call start_progress(title, total),
+ *                                 when "start" use a title of
+ *                                 "Working hard" with a total of 0.
  *   "progress <items>" - Call display_progress() with the given item count
  *                        as parameter.
  *   "throughput <bytes> <millis> - Call display_throughput() with the given
@@ -10,6 +13,7 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
  */
@@ -22,31 +26,41 @@
 
 int cmd__progress(int argc, const char **argv)
 {
-	int total = 0;
-	const char *title;
+	const char *default_title = "Working hard";
+	char *detached_title = NULL;
 	struct strbuf line = STRBUF_INIT;
-	struct progress *progress;
+	struct progress *progress = NULL;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress <stdin",
 		NULL
 	};
 	struct option options[] = {
-		OPT_INTEGER(0, "total", &total, "total number of items"),
 		OPT_END(),
 	};
 
 	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc != 1)
-		die("need a title for the progress output");
-	title = argv[0];
+	if (argc)
+		usage_with_options(usage, options);
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
-		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
+		if (!strcmp(line.buf, "start")) {
+			progress = start_progress(default_title, 0);
+		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
+			uint64_t total = strtoull(end, &end, 10);
+			if (*end == '\0') {
+				progress = start_progress(default_title, total);
+			} else if (*end == ' ') {
+				free(detached_title);
+				detached_title = strbuf_detach(&line, NULL);
+				progress = start_progress(end + 1, total);
+			} else {
+				die("invalid input: '%s'\n", line.buf);
+			}
+		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
 			uint64_t item_count = strtoull(end, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
@@ -63,12 +77,15 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update"))
+		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
-		else
+		} else if (!strcmp(line.buf, "stop")) {
+			stop_progress(&progress);
+		} else {
 			die("invalid input: '%s'\n", line.buf);
+		}
 	}
-	stop_progress(&progress);
+	free(detached_title);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503ac..ca96ac1fa55 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -17,6 +17,7 @@ test_expect_success 'simple progress display' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	update
 	progress 1
 	update
@@ -25,8 +26,9 @@ test_expect_success 'simple progress display' '
 	progress 4
 	update
 	progress 5
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -41,11 +43,13 @@ test_expect_success 'progress display with total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 3
 	progress 1
 	progress 2
 	progress 3
+	stop
 	EOF
-	test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -62,14 +66,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 100
 	progress 1000
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -88,16 +92,15 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
-	update
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 1
 	update
 	progress 2
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -116,14 +119,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -140,14 +143,14 @@ Working hard.......2.........3.........4.........5.........6.........7.........:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6.........7.........
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6.........7........." \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -164,12 +167,14 @@ test_expect_success 'progress shortens - crazy caller' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 1000
 	progress 100
 	progress 200
 	progress 1
 	progress 1000
+	stop
 	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -185,6 +190,7 @@ test_expect_success 'progress display with throughput' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 102400 1000
 	update
 	progress 10
@@ -197,8 +203,9 @@ test_expect_success 'progress display with throughput' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -214,6 +221,7 @@ test_expect_success 'progress display with throughput and total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	progress 10
 	throughput 204800 2000
@@ -222,8 +230,9 @@ test_expect_success 'progress display with throughput and total' '
 	progress 30
 	throughput 409600 4000
 	progress 40
+	stop
 	EOF
-	test-tool progress --total=40 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -239,6 +248,7 @@ test_expect_success 'cover up after throughput shortens' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 409600 1000
 	update
 	progress 1
@@ -251,8 +261,9 @@ test_expect_success 'cover up after throughput shortens' '
 	throughput 1638400 4000
 	update
 	progress 4
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -267,6 +278,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 1 1000
 	update
 	progress 1
@@ -276,8 +288,9 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	throughput 3145728 3000
 	update
 	progress 3
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -285,6 +298,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	update
 	progress 10
@@ -297,10 +311,11 @@ test_expect_success 'progress generates traces' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
-		"Working hard" <in 2>stderr &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress \
+		<in 2>stderr &&
 
 	# t0212/parse_events.perl intentionally omits regions and data.
 	test_region progress "Working hard" trace.event &&
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 2/8] progress.c tests: test some invalid usage
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
  2021-09-20 23:09     ` [PATCH v2 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-10-08  3:53       ` Emily Shaffer
  2021-09-20 23:09     ` [PATCH v2 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
                       ` (6 subsequent siblings)
  8 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Test what happens when we "stop" without a "start", omit the "stop"
after a "start", or try to start two concurrent progress bars. This
extends the trace2 tests added in 98a13647408 (trace2: log progress
time and throughput, 2020-05-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ca96ac1fa55..ffa819ca1db 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -323,4 +323,37 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces: stop / start' '
+	cat >in <<-\EOF &&
+	start
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-startstop.event" test-tool progress \
+		<in 2>stderr &&
+	test_region progress "Working hard" trace-startstop.event
+'
+
+test_expect_success 'progress generates traces: start without stop' '
+	cat >in <<-\EOF &&
+	start
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" test-tool progress \
+		<in 2>stderr &&
+	grep region_enter.*progress trace-start.event &&
+	! grep region_leave.*progress trace-start.event
+'
+
+test_expect_success 'progress generates traces: stop without start' '
+	cat >in <<-\EOF &&
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-stop.event" test-tool progress \
+		<in 2>stderr &&
+	! grep region_enter.*progress trace-stop.event &&
+	! grep region_leave.*progress trace-stop.event
+'
+
 test_done
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 3/8] progress.c: move signal handler functions lower
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
  2021-09-20 23:09     ` [PATCH v2 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
  2021-09-20 23:09     ` [PATCH v2 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-09-20 23:09     ` [PATCH v2 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
                       ` (5 subsequent siblings)
  8 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Move the signal handler functions to just before the
start_progress_delay() where they'll be referenced, instead of having
them at the top of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 92 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 44 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf93..893cb0fe56f 100644
--- a/progress.c
+++ b/progress.c
@@ -53,50 +53,6 @@ static volatile sig_atomic_t progress_update;
  */
 int progress_testing;
 uint64_t progress_test_ns = 0;
-void progress_test_force_update(void)
-{
-	progress_update = 1;
-}
-
-
-static void progress_interval(int signum)
-{
-	progress_update = 1;
-}
-
-static void set_progress_signal(void)
-{
-	struct sigaction sa;
-	struct itimerval v;
-
-	if (progress_testing)
-		return;
-
-	progress_update = 0;
-
-	memset(&sa, 0, sizeof(sa));
-	sa.sa_handler = progress_interval;
-	sigemptyset(&sa.sa_mask);
-	sa.sa_flags = SA_RESTART;
-	sigaction(SIGALRM, &sa, NULL);
-
-	v.it_interval.tv_sec = 1;
-	v.it_interval.tv_usec = 0;
-	v.it_value = v.it_interval;
-	setitimer(ITIMER_REAL, &v, NULL);
-}
-
-static void clear_progress_signal(void)
-{
-	struct itimerval v = {{0,},};
-
-	if (progress_testing)
-		return;
-
-	setitimer(ITIMER_REAL, &v, NULL);
-	signal(SIGALRM, SIG_IGN);
-	progress_update = 0;
-}
 
 static int is_foreground_fd(int fd)
 {
@@ -249,6 +205,54 @@ void display_progress(struct progress *progress, uint64_t n)
 		display(progress, n, NULL);
 }
 
+static void progress_interval(int signum)
+{
+	progress_update = 1;
+}
+
+/*
+ * The progress_test_force_update() function is intended for testing
+ * the progress output, i.e. exclusively for 'test-tool progress'.
+ */
+void progress_test_force_update(void)
+{
+	progress_update = 1;
+}
+
+static void set_progress_signal(void)
+{
+	struct sigaction sa;
+	struct itimerval v;
+
+	if (progress_testing)
+		return;
+
+	progress_update = 0;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = progress_interval;
+	sigemptyset(&sa.sa_mask);
+	sa.sa_flags = SA_RESTART;
+	sigaction(SIGALRM, &sa, NULL);
+
+	v.it_interval.tv_sec = 1;
+	v.it_interval.tv_usec = 0;
+	v.it_value = v.it_interval;
+	setitimer(ITIMER_REAL, &v, NULL);
+}
+
+static void clear_progress_signal(void)
+{
+	struct itimerval v = {{0,},};
+
+	if (progress_testing)
+		return;
+
+	setitimer(ITIMER_REAL, &v, NULL);
+	signal(SIGALRM, SIG_IGN);
+	progress_update = 0;
+}
+
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 4/8] progress.c: call progress_interval() from progress_test_force_update()
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2021-09-20 23:09     ` [PATCH v2 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-09-20 23:09     ` [PATCH v2 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
                       ` (4 subsequent siblings)
  8 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Define the progress_test_force_update() function in terms of
progress_interval(). For documentation purposes these two functions
have the same body, but different names. Let's just define the test
function by calling progress_interval() with SIGALRM ourselves.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
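A self-contained sketch of the idea, for readers without progress.c at
hand: the test hook simply invokes the timer handler as if SIGALRM had
fired. consume_progress_update() and demo() are hypothetical helpers
added only so the flag can be observed; they are not part of progress.c.

```c
#include <assert.h>
#include <signal.h>

/* Set from the signal handler, read and cleared by the display code. */
static volatile sig_atomic_t progress_update;

/* Timer handler: only records that a redraw is due. */
void progress_interval(int signum)
{
	(void)signum;
	progress_update = 1;
}

/* Test-only entry point: behave as if SIGALRM had been delivered. */
void progress_test_force_update(void)
{
	progress_interval(SIGALRM);
}

/* Hypothetical helper: report and clear the flag, as a display step might. */
int consume_progress_update(void)
{
	int was_set = progress_update;

	progress_update = 0;
	return was_set;
}

/* Usage: the flag stays clear until the test hook (or the timer) sets it. */
void demo(void)
{
	assert(!consume_progress_update());
	progress_test_force_update();
	assert(consume_progress_update());
}
```
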
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 893cb0fe56f..7fcc513717a 100644
--- a/progress.c
+++ b/progress.c
@@ -216,7 +216,7 @@ static void progress_interval(int signum)
  */
 void progress_test_force_update(void)
 {
-	progress_update = 1;
+	progress_interval(SIGALRM);
 }
 
 static void set_progress_signal(void)
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
                       ` (3 preceding siblings ...)
  2021-09-20 23:09     ` [PATCH v2 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-10-08  3:59       ` Emily Shaffer
  2021-09-20 23:09     ` [PATCH v2 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
                       ` (3 subsequent siblings)
  8 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

It's the clear intention of the combination of 137a0d0ef56 (Flush
progress message buffer in display()., 2007-11-19) and
85cb8906f0e (progress: no progress in background, 2015-04-13) to call
fflush(stderr) when we have a stderr in the foreground, but we ended
up always calling fflush(stderr) seemingly by omission. Let's not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 7fcc513717a..1fade5808de 100644
--- a/progress.c
+++ b/progress.c
@@ -91,7 +91,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	}
 
 	if (show_update) {
-		if (is_foreground_fd(fileno(stderr)) || done) {
+		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
+		if (stderr_is_foreground_fd || done) {
 			const char *eol = done ? done : "\r";
 			size_t clear_len = counters_sb->len < last_count_len ?
 					last_count_len - counters_sb->len + 1 :
@@ -115,7 +116,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 				fprintf(stderr, "%s: %s%*s", progress->title,
 					counters_sb->buf, (int) clear_len, eol);
 			}
-			fflush(stderr);
+			if (stderr_is_foreground_fd)
+				fflush(stderr);
 		}
 		progress_update = 0;
 	}
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 6/8] progress.c: add temporary variable from progress struct
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
                       ` (4 preceding siblings ...)
  2021-09-20 23:09     ` [PATCH v2 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-10-08  4:02       ` Emily Shaffer
  2021-09-20 23:09     ` [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  8 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Add a temporary "progress" variable for the dereferenced p_progress
pointer to a "struct progress *". Before 98a13647408 (trace2: log
progress time and throughput, 2020-05-12) we didn't dereference
"p_progress" in this function. Now that we do, it's easier to read the
code if we work with a "progress" struct pointer like everywhere else,
instead of a pointer to a pointer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 1fade5808de..1ab7d19deb8 100644
--- a/progress.c
+++ b/progress.c
@@ -331,15 +331,16 @@ void stop_progress(struct progress **p_progress)
 	finish_if_sparse(*p_progress);
 
 	if (*p_progress) {
+		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
 				   (*p_progress)->total);
 
 		if ((*p_progress)->throughput)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
-					   (*p_progress)->throughput->curr_total);
+					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+		trace2_region_leave("progress", progress->title, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress()
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
                       ` (5 preceding siblings ...)
  2021-09-20 23:09     ` [PATCH v2 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-10-08  4:04       ` Emily Shaffer
  2021-10-10 21:29       ` SZEDER Gábor
  2021-09-20 23:09     ` [PATCH v2 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  8 siblings, 2 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
bitmap writing, 2013-12-21): we did not call stop_progress() if we
reached the early exit in this function. This will matter in a
subsequent commit where we BUG(...) out if this happens, and matters
now e.g. because we don't have a corresponding "region_leave" for the
progress trace2 event.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack-bitmap-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 88d9e696a54..6e110e41ea4 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
+		stop_progress(&writer.progress);
 		return;
 	}
 
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* [PATCH v2 8/8] progress.c: add & assert a "global_progress" variable
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
                       ` (6 preceding siblings ...)
  2021-09-20 23:09     ` [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-09-20 23:09     ` Ævar Arnfjörð Bjarmason
  2021-10-08  4:18       ` Emily Shaffer
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  8 siblings, 1 reply; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-20 23:09 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

The progress.c code makes a hard assumption that only one progress bar
be active at a time (see [1] for a bug where this wasn't the
case). Add a BUG() that'll trigger if we ever regress on that promise
and have two progress bars active at the same time.

There was an alternative test-only approach to doing the same
thing[2], but by doing this outside of a GIT_TEST_* mode we'll know
we've put a hard stop to this particular API misuse.

It will also establish scaffolding to address current fundamental
limitations in the progress output: The current output must be
"driven" by calls to the likes of display_progress(). Once we have a
global current progress object we'll be able to update that object via
SIGALRM. See [3] for early code to do that.

It's conceivable that this change will hit the BUG() condition in some
scenario that we don't currently have tests for, which would be very
bad: we'd die just because we couldn't emit some pretty output.

See [4] for a discussion of why our test coverage is lacking: our
progress display is hidden behind isatty(2) checks in many cases, so
the test suite doesn't cover it unless individual tests are run in
"--verbose" mode. We might also have multi-threaded use of the API, in
which case two progress bars stopping and starting would only be
visible due to a race condition.

Despite that, I think that this change won't introduce such
regressions, because:

 1. I've read all the code using the progress API (and have modified a
    large part of it in some WIP code I have). Almost all of it is really
    simple, the parts that aren't[5] are complex in the display_progress() part,
    not in starting or stopping the progress bar.

 2. The entire test suite passes when instrumented with an ad-hoc
    Linux-specific mode (it uses gettid()) to die if progress bars are
    ever started or stopped on anything but the main thread[6].

    Extending that to die if display_progress() is called in a thread
    reveals that we have exactly two users of the progress bar under
    threaded conditions, "git index-pack" and "git pack-objects". Both
    uses are straightforward, and they don't start/stop the progress
    bar when threads are active.

 3. I've likewise done an ad-hoc test to force progress bars to be
    displayed with:

        perl -pi -e 's[isatty\((?:STDERR_FILENO|2)\)][1]g' $(git grep -l 'isatty\((STDERR_FILENO|2)\)')

    I.e. to replace all checks (not just those for progress) of
    whether STDERR is connected to a TTY, and then monkeypatching
    is_foreground_fd() in progress.c to always "return 1". Running the
    tests with those applied, interactively and under -V, reveals via:

        $ grep -e set_progress_signal -e clear_progress_signal test-results/*out

    That nothing our tests cover hits the BUG conditions added here,
    except the expected "BUG: start two concurrent progress bars" test
    being added here.

    That isn't entirely true since we won't be getting 100% coverage
    due to cascading failures from tests that expected no progress
    output on stderr. To make sure I covered 100% I also tried making
    the display() function in progress.c a NOOP on top of that (it's
    the calls to start_progress_delay() and stop_progress() that
    matter).

    That doesn't hit the BUG() either. Some tests fail in that mode
    due to a combination of the overzealous isatty(2) munging noted
    above, and the tests that are testing that the progress output
    itself is present (but for testing I'd made display() a NOOP).

Between those three points I think it's safe to go ahead with this
change.

1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com
3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
4. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
5. b50c37aa44d (Merge branch 'ab/progress-users-adjust-counters' into
   next, 2021-09-10)
6. https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 17 +++++++++++++----
 t/t0500-progress-display.sh | 11 +++++++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 1ab7d19deb8..14a023f4b43 100644
--- a/progress.c
+++ b/progress.c
@@ -46,6 +46,7 @@ struct progress {
 };
 
 static volatile sig_atomic_t progress_update;
+static struct progress *global_progress;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -221,11 +222,15 @@ void progress_test_force_update(void)
 	progress_interval(SIGALRM);
 }
 
-static void set_progress_signal(void)
+static void set_progress_signal(struct progress *progress)
 {
 	struct sigaction sa;
 	struct itimerval v;
 
+	if (global_progress)
+		BUG("should have no global_progress in set_progress_signal()");
+	global_progress = progress;
+
 	if (progress_testing)
 		return;
 
@@ -243,10 +248,14 @@ static void set_progress_signal(void)
 	setitimer(ITIMER_REAL, &v, NULL);
 }
 
-static void clear_progress_signal(void)
+static void clear_progress_signal(struct progress *progress)
 {
 	struct itimerval v = {{0,},};
 
+	if (!global_progress)
+		BUG("should have a global_progress in clear_progress_signal()");
+	global_progress = NULL;
+
 	if (progress_testing)
 		return;
 
@@ -270,7 +279,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
-	set_progress_signal();
+	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
@@ -374,7 +383,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, buf);
 		free(buf);
 	}
-	clear_progress_signal();
+	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ffa819ca1db..124d33c96b3 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -296,6 +296,17 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	test_cmp expect out
 '
 
+test_expect_success 'BUG: start two concurrent progress bars' '
+	cat >in <<-\EOF &&
+	start 0 one
+	start 0 two
+	EOF
+
+	test_must_fail test-tool progress \
+		<in 2>stderr &&
+	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
+'
+
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
 	start 40
-- 
2.33.0.1098.gf02a64c1a2d


^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH v2 1/8] progress.c tests: make start/stop verbs on stdin
  2021-09-20 23:09     ` [PATCH v2 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
@ 2021-10-08  3:43       ` Emily Shaffer
  0 siblings, 0 replies; 138+ messages in thread
From: Emily Shaffer @ 2021-10-08  3:43 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Sep 21, 2021 at 01:09:22AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> Change the usage of the "test-tool progress" introduced in
> 2bb74b53a49 (Test the progress display, 2019-09-16) to take command
> like "start" and "stop" on stdin, instead of running them implicitly.
> 
> This makes for tests that are easier to read, since the recipe will
> mirror the API usage, and allows for easily testing invalid usage that
> would yield (or should yield) a BUG(), e.g. providing two "start"
> calls in a row. A subsequent commit will add such stress tests.

Ok. So this is just a readability change, and not a functional change to
the helper, for now.

Or so I thought, but I was surprised to see the usage changing and the
total count moving to stdin. I don't think that's a bad change, but the
commit message doesn't describe it in a way that led me to expect it in
the diff.

> diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
> index 5d05cbe7894..685c0a7c49a 100644
> --- a/t/helper/test-progress.c
> +++ b/t/helper/test-progress.c
> @@ -22,31 +26,41 @@
>  
>  int cmd__progress(int argc, const char **argv)
>  {
> -	int total = 0;
> -	const char *title;
> +	const char *default_title = "Working hard";
> +	char *detached_title = NULL;
>  	struct strbuf line = STRBUF_INIT;
> -	struct progress *progress;
> +	struct progress *progress = NULL;
>  
>  	const char *usage[] = {
> -		"test-tool progress [--total=<n>] <progress-title>",
> +		"test-tool progress <stdin",
>  		NULL
>  	};
>  	struct option options[] = {
> -		OPT_INTEGER(0, "total", &total, "total number of items"),
>  		OPT_END(),
>  	};
>  
>  	argc = parse_options(argc, argv, NULL, options, usage, 0);
> -	if (argc != 1)
> -		die("need a title for the progress output");
> -	title = argv[0];
> +	if (argc)
> +		usage_with_options(usage, options);

Ok. We lose the args entirely, moving them to stdin lines, and that
cleans up the usage() check. Nice.

>  	progress_testing = 1;
> -	progress = start_progress(title, total);

Getting rid of the implied start. Ok.

>  	while (strbuf_getline(&line, stdin) != EOF) {
>  		char *end;
>  
> -		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
> +		if (!strcmp(line.buf, "start")) {
> +			progress = start_progress(default_title, 0);
'start' with no args...
> +		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
'start 1234'...

Would it be more readable to use strbuf_split_buf() here instead? Maybe
it doesn't fix the need for strtoull() but it could make the parsing
clearer. I did have to think about this one for a bit.

> +			uint64_t total = strtoull(end, &end, 10);
> +			if (*end == '\0') {
> +				progress = start_progress(default_title, total);
> +			} else if (*end == ' ') {
'start 1234 lorem ipsum dolor'. Ok.
> +				free(detached_title);
> +				detached_title = strbuf_detach(&line, NULL);
> +				progress = start_progress(end + 1, total);
> +			} else {
> +				die("invalid input: '%s'\n", line.buf);
> +			}

I wondered why we had to do all this title parsing from scratch now when
we didn't before, but I guess it's because we don't get a nicely
allocated argv[0]. Ok.
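
To make the three accepted forms easier to see at a glance, here is a
rough standalone sketch of the same parsing, pulled out into a helper.
parse_start() is a hypothetical name of mine and not something in the
patch; it is also a bit stricter than the quoted code in that it
requires a digit right after "start ", and it returns an error code
where the helper would die().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): parse "start",
 * "start <total>" or "start <total> <title>".
 *
 * Returns 0 on success, -1 on malformed input. On success *title
 * points into "line", or is NULL when no title was given.
 */
static int parse_start(const char *line, uint64_t *total, const char **title)
{
	char *end;

	*total = 0;
	*title = NULL;

	/* Bare "start": default title, total of 0. */
	if (!strcmp(line, "start"))
		return 0;
	if (strncmp(line, "start ", 6))
		return -1;
	line += 6;

	/* Assumption of this sketch: insist on a leading digit. */
	if (*line < '0' || *line > '9')
		return -1;

	*total = strtoull(line, &end, 10);
	if (*end == '\0')
		return 0;	/* "start <total>" */
	if (*end != ' ')
		return -1;	/* trailing garbage such as "12x" */

	*title = end + 1;	/* "start <total> <title>" */
	return 0;
}
```

A caller would fall back to the default title when *title comes back
NULL, which is what the first two branches in the quoted code do.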

> +		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
>  			uint64_t item_count = strtoull(end, &end, 10);
>  			if (*end != '\0')
>  				die("invalid input: '%s'\n", line.buf);
> @@ -63,12 +77,15 @@ int cmd__progress(int argc, const char **argv)
>  				die("invalid input: '%s'\n", line.buf);
>  			progress_test_ns = test_ms * 1000 * 1000;
>  			display_throughput(progress, byte_count);
> -		} else if (!strcmp(line.buf, "update"))
> +		} else if (!strcmp(line.buf, "update")) {
>  			progress_test_force_update();
> -		else
> +		} else if (!strcmp(line.buf, "stop")) {
> +			stop_progress(&progress);
> +		} else {

And 'stop' doesn't take any args. Ok. Do you need the {}?

>  			die("invalid input: '%s'\n", line.buf);
> +		}
>  	}
> -	stop_progress(&progress);
> +	free(detached_title);
>  
>  	return 0;
>  }
> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> index 22058b503ac..ca96ac1fa55 100755
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -17,6 +17,7 @@ test_expect_success 'simple progress display' '
>  	EOF
>  
>  	cat >in <<-\EOF &&
> +	start 0

Does it need the total arg?

>  	update
>  	progress 1
>  	update
> @@ -88,16 +92,15 @@ Working hard.......2.........3.........4.........5.........6:
>  EOF
>  
>  	cat >in <<-\EOF &&
> -	update
Was it intended to drop the 'update' line here? Does this not change the
content of the test?
> +	start 100000 Working hard.......2.........3.........4.........5.........6
>  	progress 1
>  	update
>  	progress 2
>  	progress 10000
>  	progress 100000
> +	stop
>  	EOF
> -	test-tool progress --total=100000 \
> -		"Working hard.......2.........3.........4.........5.........6" \
> -		<in 2>stderr &&
> +	test-tool progress <in 2>stderr &&
>  
>  	show_cr <stderr >out &&
>  	test_cmp expect out

With whichever nits seem appropriate,

Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH v2 2/8] progress.c tests: test some invalid usage
  2021-09-20 23:09     ` [PATCH v2 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-10-08  3:53       ` Emily Shaffer
  0 siblings, 0 replies; 138+ messages in thread
From: Emily Shaffer @ 2021-10-08  3:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Sep 21, 2021 at 01:09:23AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> Test what happens when we "stop" without a "start", omit the "stop"
> after a "start", or try to start two concurrent progress bars. This
> extends the trace2 tests added in 98a13647408 (trace2: log progress
> time and throughput, 2020-05-12).

I wondered whether these tests were more testing the helper, rather than
testing the API, but I think this is a good change - you're correct that
having the helper assume correct usage by automatically
start_progress()ing and stop_progress()ing was an oversight. Thanks.

Diff is pretty straightforward.

Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH v2 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal
  2021-09-20 23:09     ` [PATCH v2 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
@ 2021-10-08  3:59       ` Emily Shaffer
  2021-10-08  7:01         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 138+ messages in thread
From: Emily Shaffer @ 2021-10-08  3:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Sep 21, 2021 at 01:09:26AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> It's the clear intention of the combination of 137a0d0ef56 (Flush
> progress message buffer in display()., 2007-11-19) and
> 85cb8906f0e (progress: no progress in background, 2015-04-13) to call
> fflush(stderr) when we have a stderr in the foreground, but we ended
> up always calling fflush(stderr) seemingly by omission. Let's not.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  progress.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/progress.c b/progress.c
> index 7fcc513717a..1fade5808de 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -91,7 +91,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
>  	}
>  
>  	if (show_update) {
> -		if (is_foreground_fd(fileno(stderr)) || done) {
> +		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
> +		if (stderr_is_foreground_fd || done) {
>  			const char *eol = done ? done : "\r";
>  			size_t clear_len = counters_sb->len < last_count_len ?
>  					last_count_len - counters_sb->len + 1 :
> @@ -115,7 +116,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
>  				fprintf(stderr, "%s: %s%*s", progress->title,
>  					counters_sb->buf, (int) clear_len, eol);
>  			}
> -			fflush(stderr);
> +			if (stderr_is_foreground_fd)
> +				fflush(stderr);

Looks like a straightforward refactor, although I wonder what's the
difference between is_foreground_fd(fileno(stderr)) and isatty() in
practice.

Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

>  		}
>  		progress_update = 0;
>  	}
> -- 
> 2.33.0.1098.gf02a64c1a2d
> 

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH v2 6/8] progress.c: add temporary variable from progress struct
  2021-09-20 23:09     ` [PATCH v2 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-10-08  4:02       ` Emily Shaffer
  0 siblings, 0 replies; 138+ messages in thread
From: Emily Shaffer @ 2021-10-08  4:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Sep 21, 2021 at 01:09:27AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> Add a temporary "progress" variable for the dereferenced p_progress
> pointer to a "struct progress *". Before 98a13647408 (trace2: log
> progress time and throughput, 2020-05-12) we didn't dereference
> "p_progress" in this function, now that we do it's easier to read the
> code if we work with a "progress" struct pointer like everywhere else,
> instead of a pointer to a pointer.

Thanks, this looks much nicer :)

> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress()
  2021-09-20 23:09     ` [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-10-08  4:04       ` Emily Shaffer
  2021-10-08 12:14         ` Ævar Arnfjörð Bjarmason
  2021-10-10 21:29       ` SZEDER Gábor
  1 sibling, 1 reply; 138+ messages in thread
From: Emily Shaffer @ 2021-10-08  4:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Sep 21, 2021 at 01:09:28AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
> bitmap writing, 2013-12-21), we did not call stop_progress() if we
> reached the early exit in this function. This will matter in a
> subsequent commit where we BUG(...) out if this happens, and matters
> now e.g. because we don't have a corresponding "region_end" for the
> progress trace2 event.

Sounds like this was the only place we were calling start_progress()
without a stop_progress(), then? Or at least the only place that is
exercised by the test suite. Wow. I'm proud of the Git contributor base :)
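
For anyone skimming, the shape of the fix is roughly the following. This
is only a toy sketch: open_regions, sketch_start(), sketch_stop() and
select_items() are made-up names standing in for the progress API and
the real function, and the counter stands in for "progress bars started
but not yet stopped".

```c
/* Stand-in for "progress bars started but not yet stopped". */
static int open_regions;

static void sketch_start(void)
{
	open_regions++;
}

static void sketch_stop(void)
{
	open_regions--;
}

/*
 * Same shape as the fixed function: the early-exit path has to stop
 * what was started at the top, just like the normal path does.
 */
static int select_items(int nr)
{
	sketch_start();

	if (nr < 100) {
		/* Small input: take everything, then stop before returning. */
		sketch_stop();
		return nr;
	}

	/* Large input: some selection heuristic would go here. */
	sketch_stop();
	return 100;
}
```

Without the sketch_stop() in the early-exit branch the counter would
stay at 1 after a small input, which is the imbalance the later BUG()
is meant to catch.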

Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  pack-bitmap-write.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index 88d9e696a54..6e110e41ea4 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
>  	if (indexed_commits_nr < 100) {
>  		for (i = 0; i < indexed_commits_nr; ++i)
>  			push_bitmapped_commit(indexed_commits[i]);
> +		stop_progress(&writer.progress);
>  		return;
>  	}
>  
> -- 
> 2.33.0.1098.gf02a64c1a2d
> 

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH v2 8/8] progress.c: add & assert a "global_progress" variable
  2021-09-20 23:09     ` [PATCH v2 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-10-08  4:18       ` Emily Shaffer
  2021-10-08  7:15         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 138+ messages in thread
From: Emily Shaffer @ 2021-10-08  4:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Sep 21, 2021 at 01:09:29AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> The progress.c code makes a hard assumption that only one progress bar
> be active at a time (see [1] for a bug where this wasn't the
> case). Add a BUG() that'll trigger if we ever regress on that promise
> and have two progress bars active at the same time.
> 
> There was an alternative test-only approach to doing the same
> thing[2], but by doing this outside of a GIT_TEST_* mode we'll know
> we've put a hard stop to this particular API misuse.
> 
> It will also establish scaffolding to address current fundamental
> limitations in the progress output: The current output must be
> "driven" by calls to the likes of display_progress(). Once we have a
> global current progress object we'll be able to update that object via
> SIGALRM. See [3] for early code to do that.
> 
> It's conceivable that this change will hit the BUG() condition in some
> scenario that we don't currently have tests for, this would be very
> bad. If that happened we'd die just because we couldn't emit some
> pretty output.
> 
> See [4] for a discussion of why our test coverage is lacking; our
> progress display is hidden behind isatty(2) checks in many cases, so
> the test suite doesn't cover it unless individual tests are run in
> "--verbose" mode, we might also have multi-threaded use of the API, so
> two progress bars stopping and starting would only be visible due to a
> race condition.
> 
> Despite that, I think that this change won't introduce such
> regressions, because:
> 
>  1. I've read all the code using the progress API (and have modified a
>     large part of it in some WIP code I have). Almost all of it is really
>     simple, the parts that aren't[5] are complex in the display_progress() part,
>     not in starting or stopping the progress bar.
> 
>  2. The entire test suite passes when instrumented with an ad-hoc
>     Linux-specific mode (it uses gettid()) to die if progress bars are
>     ever started or stopped on anything but the main thread[6].
> 
>     Extending that to die if display_progress() is called in a thread
>     reveals that we have exactly two users of the progress bar under
>     threaded conditions, "git index-pack" and "git pack-objects". Both
>     uses are straightforward, and they don't start/stop the progress
>     bar when threads are active.
> 
>  3. I've likewise done an ad-hoc test to force progress bars to be
>     displayed with:
> 
>         perl -pi -e 's[isatty\((?:STDERR_FILENO|2)\)][1]g' $(git grep -l 'isatty\((STDERR_FILENO|2)\)')

I think your ad-hoc test might be a little more compelling if it was
easier to understand, which is to say, maybe if your Perl oneliner was
on more than one line, or had comments, or was in a different language.
Although you explain it right after, we kind of have to take your word
for it.

> 
>     I.e. to replace all checks (not just for progress) of checking
>     whether STDERR is connected to a TTY, and then monkeypatching
>     is_foreground_fd() in progress.c to always "return 1". Running the
>     tests with those applied, interactively and under -V reveals via:
> 
>         $ grep -e set_progress_signal -e clear_progress_signal test-results/*out
> 
>     That nothing our tests cover hits the BUG conditions added here,
>     except the expected "BUG: start two concurrent progress bars" test
>     being added here.
> 
>     That isn't entirely true since we won't be getting 100% coverage
>     due to cascading failures from tests that expected no progress
>     output on stderr. To make sure I covered 100% I also tried making
>     the display() function in progress.c a NOOP on top of that (it's
>     the calls to start_progress_delay() and stop_progress()) that
>     matter.
> 
>     That doesn't hit the BUG() either. Some tests fail in that mode
>     due to a combination of the overzealous isatty(2) munging noted
>     above, and the tests that are testing that the progress output
>     itself is present (but for testing I'd made display() a NOOP).
> 
> Between those three points I think it's safe to go ahead with this
> change.

One worry I had was that we might be painting ourselves into a corner
here if we did want to support the ability to do multiple progress bars
simultaneously (for example if we want to pull from multiple CDNs at the
same time when we're using promisor packfiles, and we expect those packs
to be large enough that we'd need to show a progress bar for each one).
However, I think the pattern - hang onto a pointer to the progress
objects, and complain if we get a signal and there are any still valid -
still holds well enough, so I'm ok with this change.

There are a couple patches in the middle which I didn't reply to, but I
did read them, and they were so tiny and mechanical that I did not have
useful comments to add.

Thanks, it's nice to see progress here (ha ha ha).

Preferably with the BUG() message nit below,
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

> 
> 1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
> 2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com
> 3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
> 4. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
> 5. b50c37aa44d (Merge branch 'ab/progress-users-adjust-counters' into
>    next, 2021-09-10)
> 6. https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  progress.c                  | 17 +++++++++++++----
>  t/t0500-progress-display.sh | 11 +++++++++++
>  2 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/progress.c b/progress.c
> index 1ab7d19deb8..14a023f4b43 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -46,6 +46,7 @@ struct progress {
>  };
>  
>  static volatile sig_atomic_t progress_update;
> +static struct progress *global_progress;
>  
>  /*
>   * These are only intended for testing the progress output, i.e. exclusively
> @@ -221,11 +222,15 @@ void progress_test_force_update(void)
>  	progress_interval(SIGALRM);
>  }
>  
> -static void set_progress_signal(void)
> +static void set_progress_signal(struct progress *progress)
>  {
>  	struct sigaction sa;
>  	struct itimerval v;
>  
> +	if (global_progress)
> +		BUG("should have no global_progress in set_progress_signal()");
> +	global_progress = progress;

Can we make the BUG() message a little clearer? Even in the context of
the code, it's not clear that what this BUG() really means is "hey, you
forgot to call stop_progress on something" or "hey, you can't have two
progress bars at the same time". Even if you were to change the name of
'global_progress' to 'existing_progress_bar' or something, I think that
would make the message more understandable.
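
Something along these lines is what I have in mind, as a standalone
sketch rather than a patch against progress.c. The struct, the
active_progress name, the function names and the message wording are
all just suggestions of mine, and the functions return the message
where the real code would call BUG().

```c
#include <stddef.h>

/* Toy stand-in for "struct progress"; only the title matters here. */
struct progress_sketch {
	const char *title;
};

/* The single progress bar that is currently active, if any. */
static struct progress_sketch *active_progress;

/* Returns NULL on success, or a message describing the API misuse. */
static const char *progress_sketch_start(struct progress_sketch *p)
{
	if (active_progress)
		return "start_progress() called while another progress bar "
		       "is active (missing stop_progress()?)";
	active_progress = p;
	return NULL;
}

/* Returns NULL on success, or a message describing the API misuse. */
static const char *progress_sketch_stop(void)
{
	if (!active_progress)
		return "stop_progress() called without an active progress bar";
	active_progress = NULL;
	return NULL;
}
```

The point is only that the message names the caller-visible API and the
likely mistake, instead of the internal variable.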

> +
>  	if (progress_testing)
>  		return;
>  
> @@ -243,10 +248,14 @@ static void set_progress_signal(void)
>  	setitimer(ITIMER_REAL, &v, NULL);
>  }
>  
> -static void clear_progress_signal(void)
> +static void clear_progress_signal(struct progress *progress)
>  {
>  	struct itimerval v = {{0,},};
>  
> +	if (!global_progress)
> +		BUG("should have a global_progress in clear_progress_signal()");
> +	global_progress = NULL;
> +
>  	if (progress_testing)
>  		return;
>  
> @@ -270,7 +279,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
>  	strbuf_init(&progress->counters_sb, 0);
>  	progress->title_len = utf8_strwidth(title);
>  	progress->split = 0;
> -	set_progress_signal();
> +	set_progress_signal(progress);
>  	trace2_region_enter("progress", title, the_repository);
>  	return progress;
>  }
> @@ -374,7 +383,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
>  		display(progress, progress->last_value, buf);
>  		free(buf);
>  	}
> -	clear_progress_signal();
> +	clear_progress_signal(progress);
>  	strbuf_release(&progress->counters_sb);
>  	if (progress->throughput)
>  		strbuf_release(&progress->throughput->display);
> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> index ffa819ca1db..124d33c96b3 100755
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -296,6 +296,17 @@ test_expect_success 'cover up after throughput shortens a lot' '
>  	test_cmp expect out
>  '
>  
> +test_expect_success 'BUG: start two concurrent progress bars' '
> +	cat >in <<-\EOF &&
> +	start 0 one
> +	start 0 two
> +	EOF
> +
> +	test_must_fail test-tool progress \
> +		<in 2>stderr &&
> +	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
> +'
> +
>  test_expect_success 'progress generates traces' '
>  	cat >in <<-\EOF &&
>  	start 40
> -- 
> 2.33.0.1098.gf02a64c1a2d
> 

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [PATCH v2 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal
  2021-10-08  3:59       ` Emily Shaffer
@ 2021-10-08  7:01         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08  7:01 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe


On Thu, Oct 07 2021, Emily Shaffer wrote:

> On Tue, Sep 21, 2021 at 01:09:26AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> 
>> It's the clear intention of the combination of 137a0d0ef56 (Flush
>> progress message buffer in display()., 2007-11-19) and
>> 85cb8906f0e (progress: no progress in background, 2015-04-13) to call
>> fflush(stderr) when we have a stderr in the foreground, but we ended
>> up always calling fflush(stderr) seemingly by omission. Let's not.
>> 
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  progress.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>> 
>> diff --git a/progress.c b/progress.c
>> index 7fcc513717a..1fade5808de 100644
>> --- a/progress.c
>> +++ b/progress.c
>> @@ -91,7 +91,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
>>  	}
>>  
>>  	if (show_update) {
>> -		if (is_foreground_fd(fileno(stderr)) || done) {
>> +		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
>> +		if (stderr_is_foreground_fd || done) {
>>  			const char *eol = done ? done : "\r";
>>  			size_t clear_len = counters_sb->len < last_count_len ?
>>  					last_count_len - counters_sb->len + 1 :
>> @@ -115,7 +116,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
>>  				fprintf(stderr, "%s: %s%*s", progress->title,
>>  					counters_sb->buf, (int) clear_len, eol);
>>  			}
>> -			fflush(stderr);
>> +			if (stderr_is_foreground_fd)
>> +				fflush(stderr);
>
> Looks like a straightforward refactor, although I wonder what's the
> difference between is_foreground_fd(fileno(stderr)) and isatty() in
> practice.

Good question. Whether a file descriptor is a TTY is a different question
from whether the process is in the foreground. In this case we don't want
progress bars to display their full output if they're not in the
foreground, just the summary line.

I.e.:
    
    perl -MPOSIX=tcgetpgrp,isatty,getpgrp -wE '
            say "TTY: ", isatty(1) ? "yes" : "no";
            open my $tty, "/dev/tty";
            my $tpgrp = tcgetpgrp(fileno($tty));
            my $pgrp = getpgrp();
            say "Foreground?: ",  $tpgrp == $pgrp ? "yes" : "no"
    '
    
Then:
    
    $ <that>
    TTY: yes
    Foreground?: yes
    $ <that> &
    TTY: yes
    Foreground?: no
    $ <that> >f && cat f
    TTY: no
    Foreground?: yes
    $ (<that> >f &); sleep 1; cat f;
    TTY: no
    Foreground?: no

But having written that, I can see that this commit of mine is buggy,
because when I wrote it I conflated the two. In that "&" case we don't
want to give up the eager flush, i.e. we don't want our summary line to
sit in the stdio buffer until something else flushes it.

> Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
>
>>  		}
>>  		progress_update = 0;
>>  	}
>> -- 
>> 2.33.0.1098.gf02a64c1a2d
>> 



* Re: [PATCH v2 8/8] progress.c: add & assert a "global_progress" variable
  2021-10-08  4:18       ` Emily Shaffer
@ 2021-10-08  7:15         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08  7:15 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe


On Thu, Oct 07 2021, Emily Shaffer wrote:

> On Tue, Sep 21, 2021 at 01:09:29AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> 
>> The progress.c code makes a hard assumption that only one progress bar
>> be active at a time (see [1] for a bug where this wasn't the
>> case). Add a BUG() that'll trigger if we ever regress on that promise
>> and have two progress bars active at the same time.
>> 
>> There was an alternative test-only approach to doing the same
>> thing[2], but by doing this outside of a GIT_TEST_* mode we'll know
>> we've put a hard stop to this particular API misuse.
>> 
>> It will also establish scaffolding to address current fundamental
>> limitations in the progress output: The current output must be
>> "driven" by calls to the likes of display_progress(). Once we have a
>> global current progress object we'll be able to update that object via
>> SIGALRM. See [3] for early code to do that.
>> 
>> It's conceivable that this change will hit the BUG() condition in some
>> scenario that we don't currently have tests for, this would be very
>> bad. If that happened we'd die just because we couldn't emit some
>> pretty output.
>> 
>> See [4] for a discussion of why our test coverage is lacking; our
>> progress display is hidden behind isatty(2) checks in many cases, so
>> the test suite doesn't cover it unless individual tests are run in
>> "--verbose" mode, we might also have multi-threaded use of the API, so
>> two progress bars stopping and starting would only be visible due to a
>> race condition.
>> 
>> Despite that, I think that this change won't introduce such
>> regressions, because:
>> 
>>  1. I've read all the code using the progress API (and have modified a
>>     large part of it in some WIP code I have). Almost all of it is really
>>     simple, the parts that aren't[5] are complex in the display_progress() part,
>>     not in starting or stopping the progress bar.
>> 
>>  2. The entire test suite passes when instrumented with an ad-hoc
>>     Linux-specific mode (it uses gettid()) to die if progress bars are
>>     ever started or stopped on anything but the main thread[6].
>> 
>>     Extending that to die if display_progress() is called in a thread
>>     reveals that we have exactly two users of the progress bar under
>>     threaded conditions, "git index-pack" and "git pack-objects". Both
>>     uses are straightforward, and they don't start/stop the progress
>>     bar when threads are active.
>> 
>>  3. I've likewise done an ad-hoc test to force progress bars to be
>>     displayed with:
>> 
>>         perl -pi -e 's[isatty\((?:STDERR_FILENO|2)\)][1]g' $(git grep -l 'isatty\((STDERR_FILENO|2)\)')
>
> I think your ad-hoc test might be a little more compelling if it was
> easier to understand, which is to say, maybe if your Perl oneliner was
> on more than one line, or had comments, or was in a different language.
> Although you explain it right after, we kind of have to take your word
> for it.

I'll see if I can use sed or something, which is easy enough in this
case. I just write Perl out of habit for this sort of thing
(e.g. balanced braces & Perl-regexes make it much nicer).

>> 
>>     I.e. to replace all checks (not just for progress) of checking
>>     whether STDERR is connected to a TTY, and then monkeypatching
>>     is_foreground_fd() in progress.c to always "return 1". Running the
>>     tests with those applied, interactively and under -V reveals via:
>> 
>>         $ grep -e set_progress_signal -e clear_progress_signal test-results/*out
>> 
>>     That nothing our tests cover hits the BUG conditions added here,
>>     except the expected "BUG: start two concurrent progress bars" test
>>     being added here.
>> 
>>     That isn't entirely true since we won't be getting 100% coverage
>>     due to cascading failures from tests that expected no progress
>>     output on stderr. To make sure I covered 100% I also tried making
>>     the display() function in progress.c a NOOP on top of that (it's
>>     the calls to start_progress_delay() and stop_progress()) that
>>     matter.
>> 
>>     That doesn't hit the BUG() either. Some tests fail in that mode
>>     due to a combination of the overzealous isatty(2) munging noted
>>     above, and the tests that are testing that the progress output
>>     itself is present (but for testing I'd made display() a NOOP).
>> 
>> Between those three points I think it's safe to go ahead with this
>> change.
>
> One worry I had was that we might be painting ourselves into a corner
> here if we did want to support the ability to do multiple progress bars
> simultaneously (for example if we want to pull from multiple CDNs at the
> same time when we're using promisor packfiles, and we expect those packs
> to be large enough that we'd need to show a progress bar for each one).
> However, I think the pattern - hang onto a pointer to the progress
> objects, and complain if we get a signal and there are any still valid -
> still holds well enough, so I'm ok with this change.

Yes, and that's a thing I'd really like the progress code to be able to
do too, and I've got some follow-up patches, but (somewhat
paradoxically) in order to display multiple progress bars you need to
first have a step like this to ensure that there is only ever one
progress bar.

The user only has one terminal, so we'll need to serialize our N
progress bars to one "emitter". To do that we'll need to teach the
progress accounting either to have N parallel progress lines, or simply
to have N "slave" "struct progress *" hang off it. I'm leaning towards
the latter.

> There are a couple patches in the middle which I didn't reply to, but I
> did read them, and they were so tiny and mechanical that I did not have
> useful comments to add.
>
> Thanks, it's nice to see progress here (ha ha ha).

:)

> Preferably with the BUG() message nit below,
> Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
>
>> 
>> 1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
>> 2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com
>> 3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
>> 4. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
>> 5. b50c37aa44d (Merge branch 'ab/progress-users-adjust-counters' into
>>    next, 2021-09-10)
>> 6. https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/
>> 
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  progress.c                  | 17 +++++++++++++----
>>  t/t0500-progress-display.sh | 11 +++++++++++
>>  2 files changed, 24 insertions(+), 4 deletions(-)
>> 
>> diff --git a/progress.c b/progress.c
>> index 1ab7d19deb8..14a023f4b43 100644
>> --- a/progress.c
>> +++ b/progress.c
>> @@ -46,6 +46,7 @@ struct progress {
>>  };
>>  
>>  static volatile sig_atomic_t progress_update;
>> +static struct progress *global_progress;
>>  
>>  /*
>>   * These are only intended for testing the progress output, i.e. exclusively
>> @@ -221,11 +222,15 @@ void progress_test_force_update(void)
>>  	progress_interval(SIGALRM);
>>  }
>>  
>> -static void set_progress_signal(void)
>> +static void set_progress_signal(struct progress *progress)
>>  {
>>  	struct sigaction sa;
>>  	struct itimerval v;
>>  
>> +	if (global_progress)
>> +		BUG("should have no global_progress in set_progress_signal()");
>> +	global_progress = progress;
>
> Can we make the BUG() message a little clearer? Even in the context of
> the code, it's not clear that what this BUG() really means is "hey, you
> forgot to call stop_progress on something" or "hey, you can't have two
> progress bars at the same time". Even if you were to change the name of
> 'global_progress' to 'existing_progress_bar' or something, I think that
> would make the message more understandable.

Will do, thanks.

>> +
>>  	if (progress_testing)
>>  		return;
>>  
>> @@ -243,10 +248,14 @@ static void set_progress_signal(void)
>>  	setitimer(ITIMER_REAL, &v, NULL);
>>  }
>>  
>> -static void clear_progress_signal(void)
>> +static void clear_progress_signal(struct progress *progress)
>>  {
>>  	struct itimerval v = {{0,},};
>>  
>> +	if (!global_progress)
>> +		BUG("should have a global_progress in clear_progress_signal()");
>> +	global_progress = NULL;
>> +
>>  	if (progress_testing)
>>  		return;
>>  
>> @@ -270,7 +279,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
>>  	strbuf_init(&progress->counters_sb, 0);
>>  	progress->title_len = utf8_strwidth(title);
>>  	progress->split = 0;
>> -	set_progress_signal();
>> +	set_progress_signal(progress);
>>  	trace2_region_enter("progress", title, the_repository);
>>  	return progress;
>>  }
>> @@ -374,7 +383,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
>>  		display(progress, progress->last_value, buf);
>>  		free(buf);
>>  	}
>> -	clear_progress_signal();
>> +	clear_progress_signal(progress);
>>  	strbuf_release(&progress->counters_sb);
>>  	if (progress->throughput)
>>  		strbuf_release(&progress->throughput->display);
>> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
>> index ffa819ca1db..124d33c96b3 100755
>> --- a/t/t0500-progress-display.sh
>> +++ b/t/t0500-progress-display.sh
>> @@ -296,6 +296,17 @@ test_expect_success 'cover up after throughput shortens a lot' '
>>  	test_cmp expect out
>>  '
>>  
>> +test_expect_success 'BUG: start two concurrent progress bars' '
>> +	cat >in <<-\EOF &&
>> +	start 0 one
>> +	start 0 two
>> +	EOF
>> +
>> +	test_must_fail test-tool progress \
>> +		<in 2>stderr &&
>> +	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
>> +'
>> +
>>  test_expect_success 'progress generates traces' '
>>  	cat >in <<-\EOF &&
>>  	start 40
>> -- 
>> 2.33.0.1098.gf02a64c1a2d
>> 



* Re: [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress()
  2021-10-08  4:04       ` Emily Shaffer
@ 2021-10-08 12:14         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 12:14 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe


On Thu, Oct 07 2021, Emily Shaffer wrote:

> On Tue, Sep 21, 2021 at 01:09:28AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> 
>> Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
>> bitmap writing, 2013-12-21), we did not call stop_progress() if we
>> reached the early exit in this function. This will matter in a
>> subsequent commit where we BUG(...) out if this happens, and matters
>> now e.g. because we don't have a corresponding "region_end" for the
>> progress trace2 event.
>
> Sounds like this was the only place we were calling start_progress()
> without a stop_progress(), then? Or at least the only place that is
> exercised by the test suite. Wow. I'm proud of Git contributor base :)
>
> Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

Yes! Will clarify that, there were some fixes to other bugs in the area,
but this one was the last one. Might squash it actually...

>> 
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  pack-bitmap-write.c | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
>> index 88d9e696a54..6e110e41ea4 100644
>> --- a/pack-bitmap-write.c
>> +++ b/pack-bitmap-write.c
>> @@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
>>  	if (indexed_commits_nr < 100) {
>>  		for (i = 0; i < indexed_commits_nr; ++i)
>>  			push_bitmapped_commit(indexed_commits[i]);
>> +		stop_progress(&writer.progress);
>>  		return;
>>  	}
>>  
>> -- 
>> 2.33.0.1098.gf02a64c1a2d
>> 



* Re: [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress()
  2021-09-20 23:09     ` [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
  2021-10-08  4:04       ` Emily Shaffer
@ 2021-10-10 21:29       ` SZEDER Gábor
  1 sibling, 0 replies; 138+ messages in thread
From: SZEDER Gábor @ 2021-10-10 21:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, René Scharfe

On Tue, Sep 21, 2021 at 01:09:28AM +0200, Ævar Arnfjörð Bjarmason wrote:
> Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
> bitmap writing, 2013-12-21), we did not call stop_progress() if we
> reached the early exit in this function. This will matter in a
> subsequent commit where we BUG(...) out if this happens, and matters
> now e.g. because we don't have a corresponding "region_end" for the
> progress trace2 event.

The stop_progress() is not missing, but rather the start_progress() is
in the wrong place.

  https://public-inbox.org/git/20210917051448.GB2118053@szeder.dev/

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  pack-bitmap-write.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index 88d9e696a54..6e110e41ea4 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
>  	if (indexed_commits_nr < 100) {
>  		for (i = 0; i < indexed_commits_nr; ++i)
>  			push_bitmapped_commit(indexed_commits[i]);
> +		stop_progress(&writer.progress);
>  		return;
>  	}
>  
> -- 
> 2.33.0.1098.gf02a64c1a2d
> 


* [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup
  2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
                       ` (7 preceding siblings ...)
  2021-09-20 23:09     ` [PATCH v2 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28     ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 01/10] leak tests: fix a memory leaks in "test-progress" helper Ævar Arnfjörð Bjarmason
                         ` (9 more replies)
  8 siblings, 10 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

This series fixes various issues in and related to progress.c, and
adds a BUG() assertion for us not starting two progress bars at the
same time. Those changes are needed for subsequent changes that do
more interesting things with this new global progress bar.

This v3 hopefully addresses all the feedback on the v2, thanks
all. Changes:

 * Fix a memory leak in 1/10, and make the progress tests use the
   SANITIZE=leak test mode.

 * Simplified some of the test-progress.c code (no more bare "start"
   handling; the "total" count is mandatory now).

 * Split out a formatting change into 2/10 to make 3/10 easier to
   read.

 * A new 9/10 makes an ad-hoc test recipe in 10/10 easier to explain
   (in response to Emily's comment).

 * The BUG() assertion in 10/10 now has a much better message: we dump
   the titles of the two progress bars in play if we have a bug where
   we started two at the same time.
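
To make the "only one active progress bar" invariant concrete, here is
a minimal standalone sketch of the pattern. It is not the code in
10/10: "struct bar", bar_start() and bar_stop() are hypothetical names,
and where progress.c calls BUG() this sketch returns -1 and fills in an
error buffer so that the behavior can be checked without aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for "struct progress"; only the title matters here. */
struct bar {
	const char *title;
};

/* At most one bar may be active at any time. */
static struct bar *active_bar;

/*
 * Start "bar". Returns 0 on success. If another bar is still active,
 * returns -1 and writes a message naming both titles into "err".
 */
int bar_start(struct bar *bar, const char *title, char *err, size_t errlen)
{
	assert(bar && title && err);
	if (active_bar) {
		snprintf(err, errlen,
			 "'%s' progress still active when trying to start '%s'",
			 active_bar->title, title);
		return -1;
	}
	bar->title = title;
	active_bar = bar;
	return 0;
}

/* Stop "bar". Returns -1 if "bar" is not the currently active bar. */
int bar_stop(struct bar *bar)
{
	if (active_bar != bar)
		return -1;
	active_bar = NULL;
	return 0;
}
```

Starting "one" and then "two" without a stop in between fails with a
message that names both bars, which is the shape of diagnostic the new
BUG() message aims for.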


Ævar Arnfjörð Bjarmason (10):
  leak tests: fix a memory leaks in "test-progress" helper
  progress.c test helper: add missing braces
  progress.c tests: make start/stop verbs on stdin
  progress.c tests: test some invalid usage
  progress.c: move signal handler functions lower
  progress.c: call progress_interval() from progress_test_force_update()
  progress.c: add temporary variable from progress struct
  pack-bitmap-write.c: don't return without stop_progress()
  various *.c: use isatty(1|2), not isatty(STDIN_FILENO|STDERR_FILENO)
  progress.c: add & assert a "global_progress" variable

 builtin/bisect--helper.c    |   2 +-
 builtin/bundle.c            |   2 +-
 compat/mingw.c              |   2 +-
 pack-bitmap-write.c         |   6 +-
 progress.c                  | 111 ++++++++++++++++++++----------------
 t/helper/test-progress.c    |  43 +++++++++-----
 t/t0500-progress-display.sh | 105 +++++++++++++++++++++++++++-------
 7 files changed, 183 insertions(+), 88 deletions(-)

Range-diff against v2:
 -:  ----------- >  1:  40f7c438a1e leak tests: fix a memory leaks in "test-progress" helper
 -:  ----------- >  2:  ee177d253a8 progress.c test helper: add missing braces
 1:  e0a294eb479 !  3:  045d58d8201 progress.c tests: make start/stop verbs on stdin
    @@ Commit message
         This makes for tests that are easier to read, since the recipe will
         mirror the API usage, and allows for easily testing invalid usage that
         would yield (or should yield) a BUG(), e.g. providing two "start"
    -    calls in a row. A subsequent commit will add such stress tests.
    +    calls in a row. A subsequent commit will add such tests.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ t/helper/test-progress.c
       *
       * Reads instructions from standard input, one instruction per line:
       *
    -+ *   "start[ <total>[ <title>]]" - Call start_progress(title, total),
    -+ *                                 when "start" use a title of
    -+ *                                 "Working hard" with a total of 0.
    ++ *   "start <total>[ <title>]" - Call start_progress(title, total),
    ++ *                               Uses the default title of "Working hard"
    ++ *                               if the " <title>" is omitted.
       *   "progress <items>" - Call display_progress() with the given item count
       *                        as parameter.
       *   "throughput <bytes> <millis> - Call display_throughput() with the given
    @@ t/helper/test-progress.c
       * See 't0500-progress-display.sh' for examples.
       */
     @@
    + #include "parse-options.h"
    + #include "progress.h"
    + #include "strbuf.h"
    ++#include "string-list.h"
      
      int cmd__progress(int argc, const char **argv)
      {
     -	int total = 0;
     -	const char *title;
    -+	const char *default_title = "Working hard";
    -+	char *detached_title = NULL;
    ++	const char *const default_title = "Working hard";
    ++	struct string_list list = STRING_LIST_INIT_DUP;
    ++	const struct string_list_item *item;
      	struct strbuf line = STRBUF_INIT;
     -	struct progress *progress;
     +	struct progress *progress = NULL;
    @@ t/helper/test-progress.c
      		char *end;
      
     -		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
    -+		if (!strcmp(line.buf, "start")) {
    -+			progress = start_progress(default_title, 0);
    -+		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
    ++		if (skip_prefix(line.buf, "start ", (const char **) &end)) {
     +			uint64_t total = strtoull(end, &end, 10);
     +			if (*end == '\0') {
     +				progress = start_progress(default_title, total);
     +			} else if (*end == ' ') {
    -+				free(detached_title);
    -+				detached_title = strbuf_detach(&line, NULL);
    -+				progress = start_progress(end + 1, total);
    ++				item = string_list_insert(&list, end + 1);
    ++				progress = start_progress(item->string, total);
     +			} else {
     +				die("invalid input: '%s'\n", line.buf);
     +			}
    @@ t/helper/test-progress.c
      			if (*end != '\0')
      				die("invalid input: '%s'\n", line.buf);
     @@ t/helper/test-progress.c: int cmd__progress(int argc, const char **argv)
    - 				die("invalid input: '%s'\n", line.buf);
    - 			progress_test_ns = test_ms * 1000 * 1000;
      			display_throughput(progress, byte_count);
    --		} else if (!strcmp(line.buf, "update"))
    -+		} else if (!strcmp(line.buf, "update")) {
    + 		} else if (!strcmp(line.buf, "update")) {
      			progress_test_force_update();
    --		else
     +		} else if (!strcmp(line.buf, "stop")) {
     +			stop_progress(&progress);
    -+		} else {
    + 		} else {
      			die("invalid input: '%s'\n", line.buf);
    -+		}
    + 		}
      	}
     -	stop_progress(&progress);
    -+	free(detached_title);
    + 	strbuf_release(&line);
    ++	string_list_clear(&list, 0);
      
      	return 0;
      }
    @@ t/t0500-progress-display.sh: Working hard.......2.........3.........4.........5.
      EOF
      
      	cat >in <<-\EOF &&
    --	update
     +	start 100000 Working hard.......2.........3.........4.........5.........6
    + 	update
      	progress 1
      	update
      	progress 2
    @@ t/t0500-progress-display.sh: test_expect_success 'progress display with throughp
      	EOF
      
      	cat >in <<-\EOF &&
    -+	start
    ++	start 0
      	throughput 102400 1000
      	update
      	progress 10
    @@ t/t0500-progress-display.sh: test_expect_success 'cover up after throughput shor
      	EOF
      
      	cat >in <<-\EOF &&
    -+	start
    ++	start 0
      	throughput 409600 1000
      	update
      	progress 1
    @@ t/t0500-progress-display.sh: test_expect_success 'cover up after throughput shor
      	EOF
      
      	cat >in <<-\EOF &&
    -+	start
    ++	start 0
      	throughput 1 1000
      	update
      	progress 1
 2:  7b1220b641e !  4:  efc0ec360cc progress.c tests: test some invalid usage
    @@ Commit message
         extends the trace2 tests added in 98a13647408 (trace2: log progress
         time and throughput, 2020-05-12).
     
    +    These tests are not merely testing the helper, but invalid API usage
    +    that can happen if the progress.c API is misused.
    +
    +    The "without stop" test will leak under SANITIZE=leak, since this
    +    buggy use of the API will leak memory. But let's not skip it entirely,
    +    or use the "!SANITIZE_LEAK" prerequisite check as we'd do with tests
    +    that we're skipping due to leaks we haven't fixed yet. Instead
    +    annotate the specific command that should skip leak checking with
    +    custom $LSAN_OPTIONS[1].
    +
    +    1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## t/t0500-progress-display.sh ##
    @@ t/t0500-progress-display.sh: test_expect_success 'progress generates traces' '
      
     +test_expect_success 'progress generates traces: stop / start' '
     +	cat >in <<-\EOF &&
    -+	start
    ++	start 0
     +	stop
     +	EOF
     +
    @@ t/t0500-progress-display.sh: test_expect_success 'progress generates traces' '
     +
     +test_expect_success 'progress generates traces: start without stop' '
     +	cat >in <<-\EOF &&
    -+	start
    ++	start 0
     +	EOF
     +
    -+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" test-tool progress \
    ++	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" \
    ++	LSAN_OPTIONS=detect_leaks=0 \
    ++	test-tool progress \
     +		<in 2>stderr &&
     +	grep region_enter.*progress trace-start.event &&
     +	! grep region_leave.*progress trace-start.event
 3:  f1b8bf1dbde =  5:  9e36f03de46 progress.c: move signal handler functions lower
 4:  74057b0046a =  6:  c7c3843564e progress.c: call progress_interval() from progress_test_force_update()
 5:  250e50667c2 <  -:  ----------- progress.c: stop eagerly fflush(stderr) when not a terminal
 6:  d4e9ff1de73 =  7:  cd2d27b1626 progress.c: add temporary variable from progress struct
 7:  a3f133ca7ad !  8:  e0a3510dd88 pack-bitmap-write.c: add a missing stop_progress()
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    pack-bitmap-write.c: add a missing stop_progress()
    +    pack-bitmap-write.c: don't return without stop_progress()
     
         Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
         bitmap writing, 2013-12-21), we did not call stop_progress() if we
    -    reached the early exit in this function. This will matter in a
    -    subsequent commit where we BUG(...) out if this happens, and matters
    -    now e.g. because we don't have a corresponding "region_end" for the
    -    progress trace2 event.
    +    reached the early exit in this function.
     
    +    We could call stop_progress() before we return, but better yet is to
    +    defer calling start_progress() until we need it.
    +
    +    This will matter in a subsequent commit where we BUG(...) out if this
    +    happens, and matters now e.g. because we don't have a corresponding
    +    "region_end" for the progress trace2 event.
    +
    +    Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## pack-bitmap-write.c ##
     @@ pack-bitmap-write.c: void bitmap_writer_select_commits(struct commit **indexed_commits,
    + 
    + 	QSORT(indexed_commits, indexed_commits_nr, date_compare);
    + 
    +-	if (writer.show_progress)
    +-		writer.progress = start_progress("Selecting bitmap commits", 0);
    +-
      	if (indexed_commits_nr < 100) {
      		for (i = 0; i < indexed_commits_nr; ++i)
      			push_bitmapped_commit(indexed_commits[i]);
    -+		stop_progress(&writer.progress);
      		return;
      	}
      
    ++	if (writer.show_progress)
    ++		writer.progress = start_progress("Selecting bitmap commits", 0);
    ++
    + 	for (;;) {
    + 		struct commit *chosen = NULL;
    + 
 -:  ----------- >  9:  2cf14881ecf various *.c: use isatty(1|2), not isatty(STDIN_FILENO|STDERR_FILENO)
 8:  1bd285eba0d ! 10:  01d5bbfce76 progress.c: add & assert a "global_progress" variable
    @@ Commit message
          3. I've likewise done an ad-hoc test to force progress bars to be
             displayed with:
     
    -            perl -pi -e 's[isatty\((?:STDERR_FILENO|2)\)][1]g' $(git grep -l 'isatty\((STDERR_FILENO|2)\)')
    +            perl -pi -e 's[isatty\(2\)][1]g' $(git grep -l -F 'isatty(2)')
     
             I.e. to replace all checks (not just for progress) of checking
             whether STDERR is connected to a TTY, and then monkeypatching
    @@ progress.c: void progress_test_force_update(void)
      	struct itimerval v;
      
     +	if (global_progress)
    -+		BUG("should have no global_progress in set_progress_signal()");
    ++		BUG("'%s' progress still active when trying to start '%s'",
    ++		    global_progress->title, progress->title);
     +	global_progress = progress;
     +
      	if (progress_testing)
    @@ progress.c: static void set_progress_signal(void)
      	struct itimerval v = {{0,},};
      
     +	if (!global_progress)
    -+		BUG("should have a global_progress in clear_progress_signal()");
    ++		BUG("should have active global_progress when cleaning up");
     +	global_progress = NULL;
     +
      	if (progress_testing)
    @@ t/t0500-progress-display.sh: test_expect_success 'cover up after throughput shor
     +
     +	test_must_fail test-tool progress \
     +		<in 2>stderr &&
    -+	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
    ++	grep "^BUG: .*'\''one'\'' progress still active when trying to start '\''two'\''$" stderr
     +'
     +
      test_expect_success 'progress generates traces' '
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 01/10] leak tests: fix a memory leaks in "test-progress" helper
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 02/10] progress.c test helper: add missing braces Ævar Arnfjörð Bjarmason
                         ` (8 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

Fix a memory leak in the test-progress helper, and mark the
corresponding "t0500-progress-display.sh" test as being leak-free
under SANITIZE=leak. This fixes a leak added in 2bb74b53a4 (Test the
progress display, 2019-09-16).

My 48f68715b14 (tr2: stop leaking "thread_name" memory, 2021-08-27)
had fixed another memory leak in this test (as it did some trace2
testing).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 1 +
 t/t0500-progress-display.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe7894..9265e6ab7cf 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -69,6 +69,7 @@ int cmd__progress(int argc, const char **argv)
 			die("invalid input: '%s'\n", line.buf);
 	}
 	stop_progress(&progress);
+	strbuf_release(&line);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503ac..f37cf2eb9c9 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -2,6 +2,7 @@
 
 test_description='progress display'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 show_cr () {
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 02/10] progress.c test helper: add missing braces
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 01/10] leak tests: fix a memory leaks in "test-progress" helper Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 03/10] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
                         ` (7 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

If we have braces on one arm of an if/else, all of them should have
them, per the CodingGuidelines' "When there are multiple arms to a
conditional[...]" advice. This formatting change makes a subsequent
commit smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 9265e6ab7cf..50fd3be3dad 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -63,10 +63,11 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update"))
+		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
-		else
+		} else {
 			die("invalid input: '%s'\n", line.buf);
+		}
 	}
 	stop_progress(&progress);
 	strbuf_release(&line);
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 03/10] progress.c tests: make start/stop verbs on stdin
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 01/10] leak tests: fix a memory leaks in "test-progress" helper Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 02/10] progress.c test helper: add missing braces Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 04/10] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
                         ` (6 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

Change the usage of the "test-tool progress" helper introduced in
2bb74b53a49 (Test the progress display, 2019-09-16) to take commands
like "start" and "stop" on stdin, instead of running them implicitly.

This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such tests.
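
As a rough illustration of the "start <total>[ <title>]" grammar, here
is a minimal standalone sketch using only the C standard library. The
parse_start_line() helper is hypothetical and is not part of
test-progress.c; the real helper uses skip_prefix() and die() instead
of returning an error code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse "start <total>[ <title>]".
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_start_line(const char *line, uint64_t *total,
			    const char **title)
{
	static const char prefix[] = "start ";
	char *end;

	assert(line && total && title);

	if (strncmp(line, prefix, strlen(prefix)))
		return -1;			/* not a "start" verb */
	line += strlen(prefix);

	*total = strtoull(line, &end, 10);
	if (*end == '\0')
		*title = "Working hard";	/* title omitted: use default */
	else if (*end == ' ')
		*title = end + 1;		/* rest of the line is the title */
	else
		return -1;			/* garbage after <total> */
	return 0;
}
```

With this sketch, "start 3" yields a total of 3 with the default
title, while "start 3x" is rejected.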

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 37 ++++++++++++++++-------
 t/t0500-progress-display.sh | 58 +++++++++++++++++++++++--------------
 2 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 50fd3be3dad..45ccbafa9da 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -3,6 +3,9 @@
  *
  * Reads instructions from standard input, one instruction per line:
  *
+ *   "start <total>[ <title>]" - Call start_progress(title, total),
+ *                               Uses the default title of "Working hard"
+ *                               if the " <title>" is omitted.
  *   "progress <items>" - Call display_progress() with the given item count
  *                        as parameter.
  *   "throughput <bytes> <millis> - Call display_throughput() with the given
@@ -10,6 +13,7 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
  */
@@ -19,34 +23,43 @@
 #include "parse-options.h"
 #include "progress.h"
 #include "strbuf.h"
+#include "string-list.h"
 
 int cmd__progress(int argc, const char **argv)
 {
-	int total = 0;
-	const char *title;
+	const char *const default_title = "Working hard";
+	struct string_list list = STRING_LIST_INIT_DUP;
+	const struct string_list_item *item;
 	struct strbuf line = STRBUF_INIT;
-	struct progress *progress;
+	struct progress *progress = NULL;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress <stdin",
 		NULL
 	};
 	struct option options[] = {
-		OPT_INTEGER(0, "total", &total, "total number of items"),
 		OPT_END(),
 	};
 
 	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc != 1)
-		die("need a title for the progress output");
-	title = argv[0];
+	if (argc)
+		usage_with_options(usage, options);
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
-		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
+		if (skip_prefix(line.buf, "start ", (const char **) &end)) {
+			uint64_t total = strtoull(end, &end, 10);
+			if (*end == '\0') {
+				progress = start_progress(default_title, total);
+			} else if (*end == ' ') {
+				item = string_list_insert(&list, end + 1);
+				progress = start_progress(item->string, total);
+			} else {
+				die("invalid input: '%s'\n", line.buf);
+			}
+		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
 			uint64_t item_count = strtoull(end, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
@@ -65,12 +78,14 @@ int cmd__progress(int argc, const char **argv)
 			display_throughput(progress, byte_count);
 		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
+		} else if (!strcmp(line.buf, "stop")) {
+			stop_progress(&progress);
 		} else {
 			die("invalid input: '%s'\n", line.buf);
 		}
 	}
-	stop_progress(&progress);
 	strbuf_release(&line);
+	string_list_clear(&list, 0);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index f37cf2eb9c9..27ab4218b01 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -18,6 +18,7 @@ test_expect_success 'simple progress display' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	update
 	progress 1
 	update
@@ -26,8 +27,9 @@ test_expect_success 'simple progress display' '
 	progress 4
 	update
 	progress 5
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -42,11 +44,13 @@ test_expect_success 'progress display with total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 3
 	progress 1
 	progress 2
 	progress 3
+	stop
 	EOF
-	test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -63,14 +67,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 100
 	progress 1000
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -89,16 +93,16 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	update
 	progress 1
 	update
 	progress 2
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -117,14 +121,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -141,14 +145,14 @@ Working hard.......2.........3.........4.........5.........6.........7.........:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6.........7.........
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6.........7........." \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -165,12 +169,14 @@ test_expect_success 'progress shortens - crazy caller' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 1000
 	progress 100
 	progress 200
 	progress 1
 	progress 1000
+	stop
 	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -186,6 +192,7 @@ test_expect_success 'progress display with throughput' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	throughput 102400 1000
 	update
 	progress 10
@@ -198,8 +205,9 @@ test_expect_success 'progress display with throughput' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -215,6 +223,7 @@ test_expect_success 'progress display with throughput and total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	progress 10
 	throughput 204800 2000
@@ -223,8 +232,9 @@ test_expect_success 'progress display with throughput and total' '
 	progress 30
 	throughput 409600 4000
 	progress 40
+	stop
 	EOF
-	test-tool progress --total=40 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -240,6 +250,7 @@ test_expect_success 'cover up after throughput shortens' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	throughput 409600 1000
 	update
 	progress 1
@@ -252,8 +263,9 @@ test_expect_success 'cover up after throughput shortens' '
 	throughput 1638400 4000
 	update
 	progress 4
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -268,6 +280,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	throughput 1 1000
 	update
 	progress 1
@@ -277,8 +290,9 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	throughput 3145728 3000
 	update
 	progress 3
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -286,6 +300,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	update
 	progress 10
@@ -298,10 +313,11 @@ test_expect_success 'progress generates traces' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
-		"Working hard" <in 2>stderr &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress \
+		<in 2>stderr &&
 
 	# t0212/parse_events.perl intentionally omits regions and data.
 	test_region progress "Working hard" trace.event &&
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 04/10] progress.c tests: test some invalid usage
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                         ` (2 preceding siblings ...)
  2021-10-13 22:28       ` [PATCH v3 03/10] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 05/10] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
                         ` (5 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

Test what happens when we "stop" without a "start", or omit the "stop"
after a "start". A test for starting two concurrent progress bars is
added in a subsequent commit, along with the BUG() it triggers. This
extends the trace2 tests added in 98a13647408 (trace2: log progress
time and throughput, 2020-05-12).

These tests are not merely testing the helper, but invalid API usage
that can happen if the progress.c API is misused.

The "without stop" test will leak under SANITIZE=leak, since this
buggy use of the API will leak memory. But let's not skip it entirely,
or use the "!SANITIZE_LEAK" prerequisite check as we'd do with tests
that we're skipping due to leaks we haven't fixed yet. Instead,
annotate the specific command that should skip leak checking with a
custom $LSAN_OPTIONS[1].

1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 27ab4218b01..59e9f226ea4 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -325,4 +325,39 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces: stop / start' '
+	cat >in <<-\EOF &&
+	start 0
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-startstop.event" test-tool progress \
+		<in 2>stderr &&
+	test_region progress "Working hard" trace-startstop.event
+'
+
+test_expect_success 'progress generates traces: start without stop' '
+	cat >in <<-\EOF &&
+	start 0
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" \
+	LSAN_OPTIONS=detect_leaks=0 \
+	test-tool progress \
+		<in 2>stderr &&
+	grep region_enter.*progress trace-start.event &&
+	! grep region_leave.*progress trace-start.event
+'
+
+test_expect_success 'progress generates traces: stop without start' '
+	cat >in <<-\EOF &&
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-stop.event" test-tool progress \
+		<in 2>stderr &&
+	! grep region_enter.*progress trace-stop.event &&
+	! grep region_leave.*progress trace-stop.event
+'
+
 test_done
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 05/10] progress.c: move signal handler functions lower
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                         ` (3 preceding siblings ...)
  2021-10-13 22:28       ` [PATCH v3 04/10] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 06/10] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
                         ` (4 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

Move the signal handler functions to just before the
start_progress_delay() where they'll be referenced, instead of having
them at the top of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 92 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 44 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf93..893cb0fe56f 100644
--- a/progress.c
+++ b/progress.c
@@ -53,50 +53,6 @@ static volatile sig_atomic_t progress_update;
  */
 int progress_testing;
 uint64_t progress_test_ns = 0;
-void progress_test_force_update(void)
-{
-	progress_update = 1;
-}
-
-
-static void progress_interval(int signum)
-{
-	progress_update = 1;
-}
-
-static void set_progress_signal(void)
-{
-	struct sigaction sa;
-	struct itimerval v;
-
-	if (progress_testing)
-		return;
-
-	progress_update = 0;
-
-	memset(&sa, 0, sizeof(sa));
-	sa.sa_handler = progress_interval;
-	sigemptyset(&sa.sa_mask);
-	sa.sa_flags = SA_RESTART;
-	sigaction(SIGALRM, &sa, NULL);
-
-	v.it_interval.tv_sec = 1;
-	v.it_interval.tv_usec = 0;
-	v.it_value = v.it_interval;
-	setitimer(ITIMER_REAL, &v, NULL);
-}
-
-static void clear_progress_signal(void)
-{
-	struct itimerval v = {{0,},};
-
-	if (progress_testing)
-		return;
-
-	setitimer(ITIMER_REAL, &v, NULL);
-	signal(SIGALRM, SIG_IGN);
-	progress_update = 0;
-}
 
 static int is_foreground_fd(int fd)
 {
@@ -249,6 +205,54 @@ void display_progress(struct progress *progress, uint64_t n)
 		display(progress, n, NULL);
 }
 
+static void progress_interval(int signum)
+{
+	progress_update = 1;
+}
+
+/*
+ * The progress_test_force_update() function is intended for testing
+ * the progress output, i.e. exclusively for 'test-tool progress'.
+ */
+void progress_test_force_update(void)
+{
+	progress_update = 1;
+}
+
+static void set_progress_signal(void)
+{
+	struct sigaction sa;
+	struct itimerval v;
+
+	if (progress_testing)
+		return;
+
+	progress_update = 0;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = progress_interval;
+	sigemptyset(&sa.sa_mask);
+	sa.sa_flags = SA_RESTART;
+	sigaction(SIGALRM, &sa, NULL);
+
+	v.it_interval.tv_sec = 1;
+	v.it_interval.tv_usec = 0;
+	v.it_value = v.it_interval;
+	setitimer(ITIMER_REAL, &v, NULL);
+}
+
+static void clear_progress_signal(void)
+{
+	struct itimerval v = {{0,},};
+
+	if (progress_testing)
+		return;
+
+	setitimer(ITIMER_REAL, &v, NULL);
+	signal(SIGALRM, SIG_IGN);
+	progress_update = 0;
+}
+
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 06/10] progress.c: call progress_interval() from progress_test_force_update()
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                         ` (4 preceding siblings ...)
  2021-10-13 22:28       ` [PATCH v3 05/10] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 07/10] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
                         ` (3 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

Define the progress_test_force_update() function in terms of
progress_interval(). The two functions have the same body and differ
only in name, which exists for documentation purposes. Let's just
define the test function by calling progress_interval() with SIGALRM
ourselves.
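
The pattern of a test hook that calls the signal handler directly can
be sketched in isolation as follows. All names here (update_flag,
on_interval(), force_update(), consume_update()) are made up for
illustration and only loosely mirror progress.c.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t update_flag;

/* Signal handler: only sets a flag, which is async-signal-safe. */
static void on_interval(int signum)
{
	(void)signum;
	update_flag = 1;
}

/* Test hook: same effect as a SIGALRM delivery, but without a timer. */
static void force_update(void)
{
	on_interval(SIGALRM);
	assert(update_flag == 1);
}

/* Returns 1 if an update was pending, and clears the flag. */
static int consume_update(void)
{
	int was_set = update_flag;

	update_flag = 0;
	return was_set;
}
```

Because the hook goes through the handler, any later change to what
the handler does is picked up by the test path as well.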

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 893cb0fe56f..7fcc513717a 100644
--- a/progress.c
+++ b/progress.c
@@ -216,7 +216,7 @@ static void progress_interval(int signum)
  */
 void progress_test_force_update(void)
 {
-	progress_update = 1;
+	progress_interval(SIGALRM);
 }
 
 static void set_progress_signal(void)
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 07/10] progress.c: add temporary variable from progress struct
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                         ` (5 preceding siblings ...)
  2021-10-13 22:28       ` [PATCH v3 06/10] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 08/10] pack-bitmap-write.c: don't return without stop_progress() Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

Add a temporary "progress" variable for the dereferenced p_progress
pointer to a "struct progress *". Before 98a13647408 (trace2: log
progress time and throughput, 2020-05-12) we didn't dereference
"p_progress" in this function. Now that we do, it's easier to read the
code if we work with a "progress" struct pointer like everywhere else,
instead of a pointer to a pointer.
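
A minimal sketch of the same readability idea, outside of git: the
"struct task" type and finish_task() function below are hypothetical
stand-ins for "struct progress" and stop_progress(), not code from
progress.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for "struct progress". */
struct task {
	const char *title;
	unsigned long total;
};

/*
 * Frees *p_task and sets it to NULL, returning the task's total.
 * Returns 0 when there is no task to finish.
 */
static unsigned long finish_task(struct task **p_task)
{
	struct task *task;
	unsigned long total;

	assert(p_task);
	task = *p_task;		/* dereference once, up front */
	if (!task)
		return 0;

	total = task->total;	/* rather than (*p_task)->total */
	free(task);
	*p_task = NULL;		/* the caller's pointer is cleared */
	return total;
}
```

The pointer-to-pointer is still needed so that the caller's variable
can be set to NULL, but every field access goes through the plain
"task" pointer.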

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 7fcc513717a..b9369e9a264 100644
--- a/progress.c
+++ b/progress.c
@@ -329,15 +329,16 @@ void stop_progress(struct progress **p_progress)
 	finish_if_sparse(*p_progress);
 
 	if (*p_progress) {
+		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
 				   (*p_progress)->total);
 
 		if ((*p_progress)->throughput)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
-					   (*p_progress)->throughput->curr_total);
+					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+		trace2_region_leave("progress", progress->title, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 08/10] pack-bitmap-write.c: don't return without stop_progress()
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                         ` (6 preceding siblings ...)
  2021-10-13 22:28       ` [PATCH v3 07/10] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 09/10] various *.c: use isatty(1|2), not isatty(STDIN_FILENO|STDERR_FILENO) Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 10/10] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
bitmap writing, 2013-12-21): we did not call stop_progress() if we
reached the early exit in this function.

We could call stop_progress() before we return, but better yet is to
defer calling start_progress() until we need it.
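
The shape of the fix can be sketched with a toy counter in place of
the real progress API. The names bar_start(), bar_stop() and
select_items() are hypothetical; only the ordering of the early return
and the start call is the point.

```c
#include <assert.h>

static int active_bars;		/* started-but-not-stopped bars */

static void bar_start(void) { active_bars++; }
static void bar_stop(void) { active_bars--; }

/* Returns the number of items handled on the slow path. */
static int select_items(int nr)
{
	int i, handled = 0;

	if (nr < 100)
		return 0;	/* early exit: no bar started, none to stop */

	bar_start();		/* deferred until we know we need it */
	for (i = 0; i < nr; i++)
		handled++;
	bar_stop();

	assert(!active_bars);	/* every start was paired with a stop */
	return handled;
}
```

Had bar_start() been called before the "nr < 100" check, the early
return would leave active_bars at 1, which is the kind of imbalance
this commit removes.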

This will matter in a subsequent commit where we BUG(...) out if this
happens, and matters now e.g. because we don't have a corresponding
"region_leave" for the progress trace2 event.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack-bitmap-write.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 9c55c1531e1..cab3eaa2acd 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -575,15 +575,15 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 
 	QSORT(indexed_commits, indexed_commits_nr, date_compare);
 
-	if (writer.show_progress)
-		writer.progress = start_progress("Selecting bitmap commits", 0);
-
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
 		return;
 	}
 
+	if (writer.show_progress)
+		writer.progress = start_progress("Selecting bitmap commits", 0);
+
 	for (;;) {
 		struct commit *chosen = NULL;
 
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 09/10] various *.c: use isatty(1|2), not isatty(STDIN_FILENO|STDERR_FILENO)
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                         ` (7 preceding siblings ...)
  2021-10-13 22:28       ` [PATCH v3 08/10] pack-bitmap-write.c: don't return without stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  2021-10-13 22:28       ` [PATCH v3 10/10] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

We have over 50 uses of "isatty(1)" and "isatty(2)" in the codebase,
but only these three call sites used the STDIN_FILENO and
STDERR_FILENO macros from unistd.h instead.

Let's change these for consistency. A subsequent commit's commit
message also outlines a recipe to change all of these for ad-hoc
testing, and not needing to match the macros in that ad-hoc regex will
make things easier to explain.
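
For reference when mapping the macros to literals: POSIX assigns 0 to
standard input, 1 to standard output and 2 to standard error, so the
literal matching STDIN_FILENO is 0. A small sketch follows;
fd_literals_match_macros() and fd_is_tty() are hypothetical helper
names, not functions in git.

```c
#include <assert.h>
#include <unistd.h>

/*
 * POSIX defines the standard descriptors as 0, 1 and 2, so the
 * literals and the <unistd.h> macros are interchangeable.
 */
static int fd_literals_match_macros(void)
{
	return STDIN_FILENO == 0 &&
	       STDOUT_FILENO == 1 &&
	       STDERR_FILENO == 2;
}

/* isatty() wrapper: 1 if "fd" refers to a terminal, 0 otherwise. */
static int fd_is_tty(int fd)
{
	assert(fd >= 0);
	return isatty(fd) ? 1 : 0;
}
```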

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/bisect--helper.c | 2 +-
 builtin/bundle.c         | 2 +-
 compat/mingw.c           | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 28a2e6a5750..1727cb051fb 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -830,7 +830,7 @@ static int bisect_autostart(struct bisect_terms *terms)
 	fprintf_ln(stderr, _("You need to start by \"git bisect "
 			  "start\"\n"));
 
-	if (!isatty(STDIN_FILENO))
+	if (!isatty(0))
 		return -1;
 
 	/*
diff --git a/builtin/bundle.c b/builtin/bundle.c
index 5a85d7cd0fe..df69c651753 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -56,7 +56,7 @@ static int parse_options_cmd_bundle(int argc,
 
 static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {
 	int all_progress_implied = 0;
-	int progress = isatty(STDERR_FILENO);
+	int progress = isatty(2);
 	struct strvec pack_opts;
 	int version = -1;
 	int ret;
diff --git a/compat/mingw.c b/compat/mingw.c
index 9e0cd1e097f..0f545c1a7d1 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2374,7 +2374,7 @@ int mingw_raise(int sig)
 	switch (sig) {
 	case SIGALRM:
 		if (timer_fn == SIG_DFL) {
-			if (isatty(STDERR_FILENO))
+			if (isatty(2))
 				fputs("Alarm clock\n", stderr);
 			exit(128 + SIGALRM);
 		} else if (timer_fn != SIG_IGN)
-- 
2.33.1.1346.g48288c3c089



* [PATCH v3 10/10] progress.c: add & assert a "global_progress" variable
  2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                         ` (8 preceding siblings ...)
  2021-10-13 22:28       ` [PATCH v3 09/10] various *.c: use isatty(1|2), not isatty(STDIN_FILENO|STDERR_FILENO) Ævar Arnfjörð Bjarmason
@ 2021-10-13 22:28       ` Ævar Arnfjörð Bjarmason
  9 siblings, 0 replies; 138+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-13 22:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Emily Shaffer, Ævar Arnfjörð Bjarmason

The progress.c code makes a hard assumption that only one progress bar
be active at a time (see [1] for a bug where this wasn't the
case). Add a BUG() that'll trigger if we ever regress on that promise
and have two progress bars active at the same time.
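
The "only one active at a time" invariant can be sketched on its own
like this. The "struct bar" type and the bar_begin()/bar_end()
functions are hypothetical, and the sketch returns -1 where the real
code would call BUG().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct progress". */
struct bar {
	const char *title;
};

static struct bar *active_bar;	/* at most one bar may be active */

/* Returns 0 on success, -1 if another bar is still active. */
static int bar_begin(struct bar *bar)
{
	assert(bar);
	if (active_bar)
		return -1;	/* the real code would BUG() here */
	active_bar = bar;
	return 0;
}

/* Returns 0 on success, -1 if no bar was active. */
static int bar_end(void)
{
	if (!active_bar)
		return -1;	/* the real code would BUG() here */
	active_bar = NULL;
	return 0;
}
```

Starting "one" and then "two" without an intervening bar_end() fails
on the second call, which corresponds to the misuse this commit turns
into a hard error.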

There was an alternative test-only approach to doing the same
thing[2], but by doing this outside of a GIT_TEST_* mode we'll know
we've put a hard stop to this particular API misuse.

It will also establish scaffolding to address current fundamental
limitations in the progress output: The current output must be
"driven" by calls to the likes of display_progress(). Once we have a
global current progress object we'll be able to update that object via
SIGALRM. See [3] for early code to do that.

It's conceivable that this change will hit the BUG() condition in some
scenario that we don't currently have tests for. That would be very
bad: we'd die just because we couldn't emit some pretty output.

See [4] for a discussion of why our test coverage is lacking. Our
progress display is hidden behind isatty(2) checks in many cases, so
the test suite doesn't cover it unless individual tests are run in
"--verbose" mode. We might also have multi-threaded use of the API, in
which case two progress bars stopping and starting would only be
visible due to a race condition.

Despite that, I think that this change won't introduce such
regressions, because:

 1. I've read all the code using the progress API (and have modified a
    large part of it in some WIP code I have). Almost all of it is really
    simple; the parts that aren't[5] are complex in the display_progress()
    part, not in starting or stopping the progress bar.

 2. The entire test suite passes when instrumented with an ad-hoc
    Linux-specific mode (it uses gettid()) to die if progress bars are
    ever started or stopped on anything but the main thread[6].

    Extending that to die if display_progress() is called in a thread
    reveals that we have exactly two users of the progress bar under
    threaded conditions, "git index-pack" and "git pack-objects". Both
    uses are straightforward, and they don't start/stop the progress
    bar when threads are active.

 3. I've likewise done an ad-hoc test to force progress bars to be
    displayed with:

        perl -pi -e 's[isatty\(2\)][1]g' $(git grep -l -F 'isatty(2)')

    I.e. replace all checks (not just those for progress) of whether
    stderr is connected to a TTY, and then monkeypatch
    is_foreground_fd() in progress.c to always "return 1". Running the
    tests with those changes applied, interactively and under -V,
    reveals via:

        $ grep -e set_progress_signal -e clear_progress_signal test-results/*out

    that nothing our tests cover hits the BUG() conditions added here,
    except the expected "BUG: start two concurrent progress bars" test
    being added here.

    That isn't entirely true, since we won't be getting 100% coverage
    due to cascading failures from tests that expected no progress
    output on stderr. To make sure I covered 100% I also tried making
    the display() function in progress.c a NOOP on top of that (it's
    the calls to start_progress_delay() and stop_progress() that
    matter).

    That doesn't hit the BUG() either. Some tests fail in that mode
    due to a combination of the overzealous isatty(2) munging noted
    above, and the tests that are testing that the progress output
    itself is present (but for testing I'd made display() a NOOP).

Between those three points I think it's safe to go ahead with this
change.

1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com
3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
4. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
5. b50c37aa44d (Merge branch 'ab/progress-users-adjust-counters' into
   next, 2021-09-10)
6. https://lore.kernel.org/git/877dffg37n.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 18 ++++++++++++++----
 t/t0500-progress-display.sh | 11 +++++++++++
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index b9369e9a264..a31500f8b2b 100644
--- a/progress.c
+++ b/progress.c
@@ -46,6 +46,7 @@ struct progress {
 };
 
 static volatile sig_atomic_t progress_update;
+static struct progress *global_progress;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -219,11 +220,16 @@ void progress_test_force_update(void)
 	progress_interval(SIGALRM);
 }
 
-static void set_progress_signal(void)
+static void set_progress_signal(struct progress *progress)
 {
 	struct sigaction sa;
 	struct itimerval v;
 
+	if (global_progress)
+		BUG("'%s' progress still active when trying to start '%s'",
+		    global_progress->title, progress->title);
+	global_progress = progress;
+
 	if (progress_testing)
 		return;
 
@@ -241,10 +247,14 @@ static void set_progress_signal(void)
 	setitimer(ITIMER_REAL, &v, NULL);
 }
 
-static void clear_progress_signal(void)
+static void clear_progress_signal(struct progress *progress)
 {
 	struct itimerval v = {{0,},};
 
+	if (!global_progress)
+		BUG("should have active global_progress when cleaning up");
+	global_progress = NULL;
+
 	if (progress_testing)
 		return;
 
@@ -268,7 +278,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
-	set_progress_signal();
+	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
@@ -372,7 +382,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, buf);
 		free(buf);
 	}
-	clear_progress_signal();
+	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 59e9f226ea4..867fdace3f2 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -298,6 +298,17 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	test_cmp expect out
 '
 
+test_expect_success 'BUG: start two concurrent progress bars' '
+	cat >in <<-\EOF &&
+	start 0 one
+	start 0 two
+	EOF
+
+	test_must_fail test-tool progress \
+		<in 2>stderr &&
+	grep "^BUG: .*'\''one'\'' progress still active when trying to start '\''two'\''$" stderr
+'
+
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
 	start 40
-- 
2.33.1.1346.g48288c3c089



end of thread, other threads:[~2021-10-13 22:28 UTC | newest]

Thread overview: 138+ messages
2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
2021-06-22 15:55   ` Taylor Blau
2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
2021-06-22 16:00   ` Taylor Blau
2021-08-30 21:15     ` SZEDER Gábor
2021-06-20 20:02 ` [PATCH 3/7] progress: catch backwards counting " SZEDER Gábor
2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
2021-06-21 18:32     ` René Scharfe
2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
2021-06-26  8:27         ` René Scharfe
2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
2021-06-26 20:22             ` René Scharfe
2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
2021-07-04 12:15                 ` René Scharfe
2021-07-05 14:09                   ` Junio C Hamano
2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
2021-07-06 16:02                     ` René Scharfe
2021-06-27 17:31               ` Felipe Contreras
2021-06-20 20:03 ` [PATCH 5/7] entry: show finer-grained counter in "Filtering content" " SZEDER Gábor
2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
2021-06-21 18:32   ` René Scharfe
2021-06-23  1:52     ` Taylor Blau
2021-08-30 21:17       ` SZEDER Gábor
2021-06-20 20:03 ` [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default SZEDER Gábor
2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
2021-06-23  2:04   ` Taylor Blau
2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 02/25] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 03/25] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 04/25] progress.c tests: add a "signal" verb Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 05/25] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 08/25] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 09/25] midx perf: add a perf test for multi-pack-index Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
2021-09-17  5:14         ` SZEDER Gábor
2021-09-17  5:56           ` Ævar Arnfjörð Bjarmason
2021-09-17 21:38             ` SZEDER Gábor
2021-06-23 17:48       ` [PATCH 12/25] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
2021-09-16 18:31         ` SZEDER Gábor
2021-06-23 17:48       ` [PATCH 13/25] progress.[ch]: move the "struct progress" to the header Ævar Arnfjörð Bjarmason
2021-09-16 19:42         ` SZEDER Gábor
2021-06-23 17:48       ` [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 15/25] progress.c: pass "is done?" (again) to display() Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf" Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 18/25] progress.c: emit progress on first signal, show "stalled" Ævar Arnfjörð Bjarmason
2021-09-16 18:37         ` SZEDER Gábor
2021-06-23 17:48       ` [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 20/25] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
2021-06-25  1:24         ` Andrei Rybak
2021-06-23 17:48       ` [PATCH 23/25] entry: deal with unexpected "Filtering content" total Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [RFC/PATCH 24/25] progress: assert last update in stop_progress() Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [RFC/PATCH 25/25] progress: assert counting upwards in display() Ævar Arnfjörð Bjarmason
2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
2021-06-23 20:25           ` Randall S. Becker
2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lnies for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
2021-06-23 21:57   ` [PATCH 2/4] blame: fix progress total with line ranges SZEDER Gábor
2021-06-23 21:57   ` [PATCH 3/4] read-cache: avoid overlapping progress lines SZEDER Gábor
2021-06-23 21:57   ` [PATCH 4/4] preload-index: fix "Refreshing index" progress line SZEDER Gábor
2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
2021-06-24 10:43     ` Ævar Arnfjörð Bjarmason
2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
2021-07-23 21:55     ` Junio C Hamano
2021-08-02 21:07     ` SZEDER Gábor
2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
2021-07-23 21:56     ` Junio C Hamano
2021-08-05 15:07     ` Phillip Wood
2021-08-05 19:07       ` Ævar Arnfjörð Bjarmason
2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
2021-07-23 22:01     ` Junio C Hamano
2021-08-02 22:05       ` SZEDER Gábor
2021-08-02 21:48     ` SZEDER Gábor
2021-08-05 11:01   ` [PATCH v2 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
2021-08-05 11:01     ` [PATCH v2 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
2021-08-05 11:01     ` [PATCH v2 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
2021-08-05 11:01     ` [PATCH v2 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
2021-08-23 10:29     ` [PATCH v3 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
2021-08-23 10:29       ` [PATCH v3 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
2021-08-23 10:29       ` [PATCH v3 2/2] entry: show finer-grained counter in "Filtering content" " Ævar Arnfjörð Bjarmason
2021-09-09  1:10       ` [PATCH v4 0/2] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
2021-09-09  1:10         ` [PATCH v4 1/2] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
2021-09-09  1:10         ` [PATCH v4 2/2] entry: show finer-grained counter in "Filtering content" " Ævar Arnfjörð Bjarmason
2021-09-09 20:02         ` [PATCH v4 0/2] progress.c API users: fix bogus counting Junio C Hamano
2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
2021-09-16 21:34     ` [PATCH 12/25] " Ævar Arnfjörð Bjarmason
2021-07-23 22:02   ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Junio C Hamano
2021-09-20 23:09   ` [PATCH v2 " Ævar Arnfjörð Bjarmason
2021-09-20 23:09     ` [PATCH v2 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
2021-10-08  3:43       ` Emily Shaffer
2021-09-20 23:09     ` [PATCH v2 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
2021-10-08  3:53       ` Emily Shaffer
2021-09-20 23:09     ` [PATCH v2 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
2021-09-20 23:09     ` [PATCH v2 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
2021-09-20 23:09     ` [PATCH v2 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
2021-10-08  3:59       ` Emily Shaffer
2021-10-08  7:01         ` Ævar Arnfjörð Bjarmason
2021-09-20 23:09     ` [PATCH v2 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
2021-10-08  4:02       ` Emily Shaffer
2021-09-20 23:09     ` [PATCH v2 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
2021-10-08  4:04       ` Emily Shaffer
2021-10-08 12:14         ` Ævar Arnfjörð Bjarmason
2021-10-10 21:29       ` SZEDER Gábor
2021-09-20 23:09     ` [PATCH v2 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
2021-10-08  4:18       ` Emily Shaffer
2021-10-08  7:15         ` Ævar Arnfjörð Bjarmason
2021-10-13 22:28     ` [PATCH v3 00/10] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 01/10] leak tests: fix a memory leaks in "test-progress" helper Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 02/10] progress.c test helper: add missing braces Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 03/10] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 04/10] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 05/10] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 06/10] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 07/10] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 08/10] pack-bitmap-write.c: don't return without stop_progress() Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 09/10] various *.c: use isatty(1|2), not isatty(STDIN_FILENO|STDERR_FILENO) Ævar Arnfjörð Bjarmason
2021-10-13 22:28       ` [PATCH v3 10/10] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git
