git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/7] progress: verify progress counters in the test suite
@ 2021-06-20 20:02 SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
                   ` (10 more replies)
  0 siblings, 11 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

Splitting off from:

  https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6

On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
> I wonder (only in a semi-curious way, though) if we can detect
> off-by-one errors by adding an assertion to display_progress() that
> requires the first update to have the value 0, and in stop_progress()
> one that requires the previous display_progress() call to have a value
> equal to the total number of work items.  Not sure it'd be worth the
> hassle..

I fixed and reported a number of bogus progress lines in the past, the
last one during the v2.31.0-rc phase, so I've looked into whether progress
counters could be automatically validated in our tests, and came up
with these patches a few months ago.  It turned out that progress
counters can be checked easily and transparently in case of progress
lines that are shown in the tests, i.e. that are shown even when
stderr is not a terminal or are forced with '--progress'.  (In other
cases it's still fairly easy but not quite transparent, as I think we
need changes to the progress API; more on that later in a separate
series.)

These checks did uncover a couple of buggy progress lines which are
fixed in this series as well, but I'm not sure that the fix presented
in patch 6 is the right approach, hence the RFC.


SZEDER Gábor (7):
  progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress
    counters
  progress: catch nested/overlapping progresses with
    GIT_TEST_CHECK_PROGRESS
  progress: catch backwards counting with GIT_TEST_CHECK_PROGRESS
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line
  [RFC] entry: don't show "Filtering content: ... done." line in case of
    errors
  test-lib: enable GIT_TEST_CHECK_PROGRESS by default

 commit-graph.c              |  2 +-
 entry.c                     | 10 +++---
 progress.c                  | 29 ++++++++++++++--
 t/t0500-progress-display.sh | 69 +++++++++++++++++++++++++------------
 t/test-lib.sh               |  6 ++++
 5 files changed, 86 insertions(+), 30 deletions(-)

-- 
2.32.0.289.g44fbea0957



* [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
@ 2021-06-20 20:02 ` SZEDER Gábor
  2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
  2021-06-22 15:55   ` Taylor Blau
  2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

We had to fix a couple of buggy progress lines in the past, where the
progress counter's final value didn't match the expected total [1],
e.g.:

  Expanding reachable commits in commit graph: 138606% (824706/595), done.
  Writing out commit graph in 3 passes: 166% (4187845/2512707), done.

Let's do better, and, instead of waiting for someone to notice such
issues by mere chance, start verifying progress counters in the test
suite: introduce the GIT_TEST_CHECK_PROGRESS knob to automatically
check that the final value of each progress counter matches the
expected total upon calling stop_progress(), and trigger a BUG() if it
doesn't.

This check should cover progress lines that are too fast to be shown,
because the repositories used in our tests are tiny and most of our
progress lines are delayed.  However, in case of a delayed progress
line the variable holding the value of the progress counter
('progress->last_value') is only updated after that delay is up, and,
consequently, we can't compare the progress counter with the expected
total in stop_progress() in these cases.

So let's update 'progress->last_value' already during the initial
delay as well.  This doesn't affect the visible behavior of progress
lines, though it results in additional invocations of the internal
display() function during the initial delay, but those don't make any
difference, because display() returns early without displaying
anything until the delay is up anyway.
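
To illustrate why the assignment has to come before the early return,
here is a minimal standalone sketch.  It is not the real progress.c
code: 'struct delay_progress', delay_display() and delay_check_total()
are hypothetical stand-ins, and the 'tick' parameter plays the role of
the 'progress_update' flag set by the timer signal.

```c
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for git's 'struct progress'. */
struct delay_progress {
	uint64_t total;       /* expected total, 0 if unknown */
	uint64_t last_value;  /* most recent counter value */
	unsigned delay;       /* timer ticks left before anything is shown */
	unsigned shown;       /* number of updates actually displayed */
};

static void delay_display(struct delay_progress *p, uint64_t n, int tick)
{
	/* Record the counter first, so it is tracked even while delayed. */
	p->last_value = n;

	/* Still within the initial delay: return without showing anything. */
	if (p->delay && (!tick || --p->delay))
		return;

	p->shown++;
}

/* Returns 1 if the final counter matches the expected total, else 0. */
static int delay_check_total(const struct delay_progress *p)
{
	return !p->total || p->total == p->last_value;
}
```

With a delay that never expires, three calls counting 1, 2, 3 against
a total of 3 display nothing, yet the final check can still succeed
because 'last_value' was recorded on every call.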

Note that this can only check progress lines that are actually
started, i.e. that are shown by default even when standard error is
not a terminal, or that are forced to show with the '--progress'
option of whichever Git command displays them.

Nonetheless, running the test suite with this new knob enabled results
in failures in 't0021-conversion.sh' and 't5510-fetch.sh', revealing
two more progress lines whose counter doesn't reach the expected
total.  These will be fixed in later patches in this series, and after
that GIT_TEST_CHECK_PROGRESS will be enabled by default in the test
suite.

[1] c4ff24bbb3 (commit-graph.c: display correct number of chunks when
                writing, 2021-02-24)
    1cbdbf3bef (commit-graph: drop count_distinct_commits() function,
                2020-12-07), though this didn't actually fix, but
                instead removed, a buggy progress line.
    150cd3b61d (commit-graph: fix "Writing out commit graph" progress
                counter, 2020-07-09)
    67fa6aac5a (commit-graph: don't show progress percentages while
                expanding reachable commits, 2019-09-07)
    531e6daa03 (prune-packed: advanced progress even for non-existing
                fan-out directories, 2009-04-27)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 progress.c                  | 16 ++++++++++++++--
 t/t0500-progress-display.sh | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf9..255995406f 100644
--- a/progress.c
+++ b/progress.c
@@ -47,6 +47,8 @@ struct progress {
 
 static volatile sig_atomic_t progress_update;
 
+static int test_check_progress;
+
 /*
  * These are only intended for testing the progress output, i.e. exclusively
  * for 'test-tool progress'.
@@ -111,10 +113,11 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	int show_update = 0;
 	int last_count_len = counters_sb->len;
 
+	progress->last_value = n;
+
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
-	progress->last_value = n;
 	tp = (progress->throughput) ? progress->throughput->display.buf : "";
 	if (progress->total) {
 		unsigned percent = n * 100 / progress->total;
@@ -252,7 +255,11 @@ void display_progress(struct progress *progress, uint64_t n)
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-	struct progress *progress = xmalloc(sizeof(*progress));
+	struct progress *progress;
+
+	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
+
+	progress = xmalloc(sizeof(*progress));
 	progress->title = title;
 	progress->total = total;
 	progress->last_value = -1;
@@ -349,6 +356,11 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	progress = *p_progress;
 	if (!progress)
 		return;
+	if (test_check_progress && progress->total &&
+	    progress->total != progress->last_value)
+		BUG("total progress does not match for \"%s\": expected: %"PRIuMAX" got: %"PRIuMAX,
+		    progress->title, (uintmax_t)progress->total,
+		    (uintmax_t)progress->last_value);
 	*p_progress = NULL;
 	if (progress->last_value != -1) {
 		/* Force the last update */
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503a..641fa0964e 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -308,4 +308,38 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'GIT_TEST_CHECK_PROGRESS catches non-matching total' '
+	cat >in <<-\EOF &&
+	progress 1
+	progress 2
+	progress 4
+	EOF
+
+	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
+		test-tool progress --total=3 "Not enough" <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr &&
+
+	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
+		test-tool progress --total=5 "Too much" <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr
+'
+
+test_expect_success 'tolerate bogus progress without GIT_TEST_CHECK_PROGRESS' '
+	cat >expect <<-\EOF &&
+	Working hard:  33% (1/3)<CR>
+	Working hard:  33% (1/3), done.
+	EOF
+
+	cat >in <<-\EOF &&
+	progress 1
+	EOF
+	(
+		sane_unset GIT_TEST_CHECK_PROGRESS &&
+		test-tool progress --total=3 "Working hard" <in 2>stderr
+	) &&
+
+	show_cr <stderr >out &&
+	test_cmp expect out
+'
+
 test_done
-- 
2.32.0.289.g44fbea0957



* [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
@ 2021-06-20 20:02 ` SZEDER Gábor
  2021-06-22 16:00   ` Taylor Blau
  2021-06-20 20:02 ` [PATCH 3/7] progress: catch backwards counting " SZEDER Gábor
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

We had to fix two buggy progress lines in the past, where
stop_progress calls were added at the wrong place [1], resulting in
"done" progress lines appearing in the wrong order.

Extend GIT_TEST_CHECK_PROGRESS to catch these cases as well, i.e.
trigger a BUG() if a progress is still running when
start_progress() or one of its variants is called to start a new one.

Running the test suite with GIT_TEST_CHECK_PROGRESS enabled doesn't
reveal any new issues [2].

Note that this will trigger even in cases where the output is not
visibly wrong, e.g. consider this simplified sequence of calls:

  progress1 = start_delayed_progress();
  progress2 = start_delayed_progress();
  for (i = 0; ...)
      display_progress(progress2, i + 1);
  stop_progress(&progress2);
  for (j = 0; ...)
      display_progress(progress1, j + 1);
  stop_progress(&progress1);

This doesn't produce bogus output like what is shown in those two
fixes [1], because 'progress2' is already "done" before the first
display_progress(progress1, ...) call.  Btw, this is not just a
pathological example, we do have two progress lines arranged like
this, but they are only shown when standard error is a terminal, and
thus aren't caught by GIT_TEST_CHECK_PROGRESS in its current form.
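
The detection itself amounts to remembering the one progress that is
currently active.  A minimal standalone sketch of that idea follows;
'struct nest_progress', nest_start() and nest_stop() are hypothetical
names, and a -1 return value stands in for the BUG() call.

```c
#include <stddef.h>

/* Hypothetical stand-in for git's 'struct progress'. */
struct nest_progress {
	const char *title;
};

/* The one progress that is currently active, if any. */
static struct nest_progress *nest_current;

/* Returns 0 on success, -1 where the real check would trigger BUG(). */
static int nest_start(struct nest_progress *p)
{
	if (nest_current)
		return -1;
	nest_current = p;
	return 0;
}

static void nest_stop(struct nest_progress *p)
{
	if (nest_current == p)
		nest_current = NULL;
}
```

Starting a second progress before stopping the first is reported
right at the second start, no matter in which order the two would
later be displayed or stopped.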

[1] 6f9d5f2fda (commit-graph: fix progress of reachable commits,
                2020-07-09)
    862aead24e (commit-graph: fix "Collecting commits from input"
                progress line, 2020-07-10)

[2] This patch series applies with a minor conflict on top of
    6f9d5f2fda^, and makes 37 tests fail because of that bug.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 progress.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/progress.c b/progress.c
index 255995406f..549e8d1fe7 100644
--- a/progress.c
+++ b/progress.c
@@ -48,6 +48,8 @@ struct progress {
 static volatile sig_atomic_t progress_update;
 
 static int test_check_progress;
+/* Used to catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS. */
+static struct progress *current_progress = NULL;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -258,8 +260,12 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	struct progress *progress;
 
 	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
+	if (test_check_progress && current_progress)
+		BUG("progress \"%s\" is still active when starting new progress \"%s\"",
+		    current_progress->title, title);
 
 	progress = xmalloc(sizeof(*progress));
+	current_progress = progress;
 	progress->title = title;
 	progress->total = total;
 	progress->last_value = -1;
@@ -383,6 +389,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
+	current_progress = NULL;
 	free(progress->throughput);
 	free(progress);
 }
-- 
2.32.0.289.g44fbea0957



* [PATCH 3/7] progress: catch backwards counting with GIT_TEST_CHECK_PROGRESS
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
  2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
@ 2021-06-20 20:02 ` SZEDER Gábor
  2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:02 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

We had to fix a buggy progress line recently, where the progress
counter counted backwards, see 8e118e8490 (pack-objects: update
"nr_seen" progress based on pack-reused count, 2021-04-11).

Extend GIT_TEST_CHECK_PROGRESS to catch these cases as well, i.e.
trigger a BUG() when the counter passed to display_progress() is
smaller than the previous value.

Note that we allow subsequent display_progress() calls with the same
counter value, because:

  - Strictly speaking, it's not wrong to do so.

  - Forbidding it might make the code calling display_progress() more
    complex; I suspect that would be the case with e.g. the "Updating
    index flags" progress line in 'unpack-trees.c', where the counter
    is increased in recursive function calls.

  - We would need to special case the internal display() call in
    stop_progress_msg(), because it uses the same counter value as the
    last display_progress() call, which would trigger this BUG().
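
As a standalone sketch, the rule can be written as a small predicate.
The helper name counter_update_ok() and the NO_VALUE macro are made up
for illustration; NO_VALUE mirrors the -1 that 'last_value' holds
before the first update.

```c
#include <stdint.h>

/* Marks "no update seen yet", like the initial last_value of -1. */
#define NO_VALUE ((uint64_t)-1)

/*
 * Returns 1 if going from 'last_value' to 'n' is acceptable, and 0
 * where the check would trigger BUG().  Equal values are allowed.
 */
static int counter_update_ok(uint64_t last_value, uint64_t n)
{
	if (last_value != NO_VALUE && n < last_value)
		return 0;
	return 1;
}
```

So 2 -> 2 and 2 -> 3 pass, any first update passes, and only a
strictly smaller value such as 2 -> 1 is rejected.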

't0500-progress-display.sh' contains a few tests that check how
shortened progress lines are covered up, and one of them ('progress
shortens - crazy caller') shortens the progress line by counting
backwards.  From now on that test would trigger this BUG(), so remove
it; the other test cases cover shortening progress lines sufficiently.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 progress.c                  |  6 ++++++
 t/t0500-progress-display.sh | 35 +++++++++++++----------------------
 2 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/progress.c b/progress.c
index 549e8d1fe7..034d50cd6b 100644
--- a/progress.c
+++ b/progress.c
@@ -115,6 +115,12 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	int show_update = 0;
 	int last_count_len = counters_sb->len;
 
+	if (test_check_progress && progress->last_value != -1 &&
+	    n < progress->last_value)
+		BUG("progress \"%s\" counts backwards %"PRIuMAX" -> %"PRIuMAX,
+		    progress->title, (uintmax_t)progress->last_value,
+		    (uintmax_t)n);
+
 	progress->last_value = n;
 
 	if (progress->delay && (!progress_update || --progress->delay))
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 641fa0964e..a73dd45153 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -153,28 +153,6 @@ EOF
 	test_cmp expect out
 '
 
-# Progress counter goes backwards, this should not happen in practice.
-test_expect_success 'progress shortens - crazy caller' '
-	cat >expect <<-\EOF &&
-	Working hard:  10% (100/1000)<CR>
-	Working hard:  20% (200/1000)<CR>
-	Working hard:   0% (1/1000)  <CR>
-	Working hard: 100% (1000/1000)<CR>
-	Working hard: 100% (1000/1000), done.
-	EOF
-
-	cat >in <<-\EOF &&
-	progress 100
-	progress 200
-	progress 1
-	progress 1000
-	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
-
-	show_cr <stderr >out &&
-	test_cmp expect out
-'
-
 test_expect_success 'progress display with throughput' '
 	cat >expect <<-\EOF &&
 	Working hard: 10<CR>
@@ -324,13 +302,26 @@ test_expect_success 'GIT_TEST_CHECK_PROGRESS catches non-matching total' '
 	grep "BUG:.*total progress does not match" stderr
 '
 
+test_expect_success 'GIT_TEST_CHECK_PROGRESS catches backwards counting' '
+	cat >in <<-\EOF &&
+	progress 2
+	progress 1
+	EOF
+
+	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
+		test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	grep "BUG:.*counts backwards" stderr
+'
+
 test_expect_success 'tolerate bogus progress without GIT_TEST_CHECK_PROGRESS' '
 	cat >expect <<-\EOF &&
+	Working hard:  66% (2/3)<CR>
 	Working hard:  33% (1/3)<CR>
 	Working hard:  33% (1/3), done.
 	EOF
 
 	cat >in <<-\EOF &&
+	progress 2
 	progress 1
 	EOF
 	(
-- 
2.32.0.289.g44fbea0957



* [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (2 preceding siblings ...)
  2021-06-20 20:02 ` [PATCH 3/7] progress: catch backwards counting " SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
  2021-06-20 20:03 ` [PATCH 5/7] entry: show finer-grained counter in "Filtering content" " SZEDER Gábor
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N.  This causes the failures of the tests
'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

Fix this by passing 'i + 1' to display_progress(), like most other
callsites do.
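
The off-by-one can be reproduced without any Git code.  In the sketch
below, last_reported() is a hypothetical helper that returns the last
value such a loop would hand to display_progress().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the last value a loop over 'nr' items would pass to
 * display_progress(), either as 'i' or as 'i + 1'.
 */
static uint64_t last_reported(size_t nr, int one_based)
{
	uint64_t last = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		last = one_based ? i + 1 : i;  /* stand-in for display_progress() */
	return last;
}
```

For six items the zero-based variant ends at 5, which is the "5/6"
shown above, while the 'i + 1' variant ends at 6.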

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 2bcb4e0f89..3181906368 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.32.0.289.g44fbea0957



* [PATCH 5/7] entry: show finer-grained counter in "Filtering content" progress line
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (3 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    This is, in part, responsible for the failures of the tests
    'missing file in delayed checkout' and 'invalid file in delayed
    checkout' in 't0021-conversion.sh' when run with
    GIT_TEST_CHECK_PROGRESS=1, because both use only one filter.  (The
    test 'delayed checkout in process filter' uses two filters but the
    first one does all the work, so that test already happens to
    succeed even with GIT_TEST_CHECK_PROGRESS=1.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
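
The difference between the two counting schemes can be modelled in a
few lines.  This sketch follows the simplified pseudocode above rather
than the actual loop in entry.c, and both function names are made up;
each function returns the last value that display_progress() would
have seen, given how many paths each filter handles.

```c
#include <stddef.h>

/* Old scheme: one update per filter, before its paths are processed. */
static unsigned final_count_per_filter(const unsigned *paths_per_filter,
				       size_t nr_filters)
{
	unsigned total = 0, left, shown = 0;
	size_t i;

	for (i = 0; i < nr_filters; i++)
		total += paths_per_filter[i];
	left = total;
	for (i = 0; i < nr_filters; i++) {
		shown = total - left;         /* display_progress() here */
		left -= paths_per_filter[i];  /* inner loop handles the paths */
	}
	return shown;
}

/* New scheme: one update per processed path. */
static unsigned final_count_per_path(const unsigned *paths_per_filter,
				     size_t nr_filters)
{
	unsigned processed = 0, j;
	size_t i;

	for (i = 0; i < nr_filters; i++)
		for (j = 0; j < paths_per_filter[i]; j++)
			processed++;  /* display_progress(p, ++processed) here */
	return processed;
}
```

With a single filter handling 3 paths, the old scheme ends at 0 and
the new one at 3.  With two filters where the first handles all 3
paths, both end at 3, which is why that arrangement already happened
to work.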

After this change the 'invalid file in delayed checkout' test in
't0021-conversion.sh' will succeed with GIT_TEST_CHECK_PROGRESS=1, but
the 'missing file in delayed checkout' test will still fail, because
its purposefully buggy filter doesn't process any paths, so we won't
execute that inner loop at all (this will be fixed in the next patch).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 711ee0693c..bc4b8fcc98 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.32.0.289.g44fbea0957



* [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (4 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 5/7] entry: show finer-grained counter in "Filtering content" " SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-21 18:32   ` René Scharfe
  2021-06-20 20:03 ` [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default SZEDER Gábor
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

The test 'missing file in delayed checkout' in 't0021-conversion.sh'
fails when run with GIT_TEST_CHECK_PROGRESS=1, because the final value
of the "Filtering content" progress counter doesn't match the expected
total, triggering BUG().  This is not caused by a bug in how we count
progress, but because the test involves a purposefully buggy filter
process that doesn't process any paths, so the progress counter
doesn't have a chance to reach the expected total.

Arguably, it is wrong to show "done" at the end of the progress
line when not all work was done.

So let's check whether there were any errors while processing, or
whether there are still unprocessed paths at the end (which a few
lines later will in fact be considered an error), and if so, don't
show the final "done" line, i.e. don't call stop_progress().  And if we
don't call stop_progress(), then we won't verify that the progress
counter matches the expected total, won't trigger BUG() on mismatch,
and t0021 will succeed even with GIT_TEST_CHECK_PROGRESS=1.

After this change the test suite passes with
GIT_TEST_CHECK_PROGRESS=1.

RFC!!  Alas, not calling stop_progress() on error has drawbacks:

  - All memory allocated for the progress bar is leaked.
  - This progress line remains "active", in the sense that if we were
    to start a new progress later in the same git process, then with
    GIT_TEST_CHECK_PROGRESS it would trigger the other BUG() catching
    nested/overlapping progresses.

Do we care?!  TBH I don't :)
Anyway, if we do, then we might need some sort of an abort_progress()
function...
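
For the sake of discussion, such a function could look roughly like
the sketch below.  Everything in it is hypothetical: no
abort_progress() exists in progress.c, and 'struct sketch_progress',
sketch_start() and sketch_abort() are standalone stand-ins.  The idea
is to release the memory and clear the "active" marker, while skipping
both the final "done" line and the check against the expected total.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's 'struct progress'. */
struct sketch_progress {
	uint64_t total;
	uint64_t last_value;
};

/* The progress considered active by the nested/overlapping check. */
static struct sketch_progress *sketch_active;

static struct sketch_progress *sketch_start(uint64_t total)
{
	struct sketch_progress *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->total = total;
	p->last_value = 0;
	sketch_active = p;
	return p;
}

/* Give up on a progress: no "done" line, no total check, no leak. */
static void sketch_abort(struct sketch_progress **pp)
{
	struct sketch_progress *p = *pp;

	if (!p)
		return;
	*pp = NULL;
	if (sketch_active == p)
		sketch_active = NULL;
	free(p);
}
```

That would address both drawbacks listed above, at the cost of one
more function in the progress API.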

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 entry.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/entry.c b/entry.c
index bc4b8fcc98..38baefe22a 100644
--- a/entry.c
+++ b/entry.c
@@ -232,7 +232,8 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		}
 		string_list_remove_empty_items(&dco->filters, 0);
 	}
-	stop_progress(&progress);
+	if (!errs && !dco->paths.nr)
+		stop_progress(&progress);
 	string_list_clear(&dco->filters, 0);
 
 	/* At this point we should not have any delayed paths anymore. */
-- 
2.32.0.289.g44fbea0957



* [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (5 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
@ 2021-06-20 20:03 ` SZEDER Gábor
  2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-20 20:03 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	SZEDER Gábor

Let's enable GIT_TEST_CHECK_PROGRESS by default, in the hope that it
will effectively prevent buggy progress line counters and nested
progress lines from entering our codebase in the future.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/test-lib.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index adaf03543e..ae2dd6d0d2 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1502,6 +1502,12 @@ then
 	export GIT_TEST_CHECK_CACHE_TREE
 fi
 
+if test -z "$GIT_TEST_CHECK_PROGRESS"
+then
+	GIT_TEST_CHECK_PROGRESS=true
+	export GIT_TEST_CHECK_PROGRESS
+fi
+
 test_lazy_prereq PIPE '
 	# test whether the filesystem supports FIFOs
 	test_have_prereq !MINGW,!CYGWIN &&
-- 
2.32.0.289.g44fbea0957



* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
@ 2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
  2021-06-21 18:32     ` René Scharfe
  0 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-20 22:13 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe


On Sun, Jun 20 2021, SZEDER Gábor wrote:

> The final value of the counter of the "Scanning merged commits"
> progress line is always one less than its expected total, e.g.:
>
>   Scanning merged commits:  83% (5/6), done.
>
> This happens because while iterating over an array the loop variable
> is passed to display_progress() as-is, but while C arrays (and thus
> the loop variable) start at 0 and end at N-1, the progress counter
> must end at N.  This causes the failures of the tests
> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>
> Fix this by passing 'i + 1' to display_progress(), like most other
> callsites do.
>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
>  commit-graph.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 2bcb4e0f89..3181906368 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>  
>  	ctx->num_extra_edges = 0;
>  	for (i = 0; i < ctx->commits.nr; i++) {
> -		display_progress(ctx->progress, i);
> +		display_progress(ctx->progress, i + 1);
>  
>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>  			  &ctx->commits.list[i]->object.oid)) {

I think this fix makes sense, but FWIW there's a large thread starting
at [1] where René disagrees with me, and thinks the fix for this sort of
thing would be to display_progress(..., i + 1) at the end of that
for-loop, or just before the stop_progress().

I don't agree, but just noting the disagreement, and that if that
argument wins then a patch like this would involve changing the other
20-some calls to display_progress() in commit-graph.c to work
differently (and to be more complex, we'd need to deal with loop
break/continue etc.).

1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/ 


* Re: [PATCH 0/7] progress: verify progress counters in the test suite
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (6 preceding siblings ...)
  2021-06-20 20:03 ` [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default SZEDER Gábor
@ 2021-06-21  0:59 ` Ævar Arnfjörð Bjarmason
  2021-06-23  2:04   ` Taylor Blau
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-21  0:59 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe


On Sun, Jun 20 2021, SZEDER Gábor wrote:

> Splitting off from:
>
>   https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>
> On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>> I wonder (only in a semi-curious way, though) if we can detect
>> off-by-one errors by adding an assertion to display_progress() that
>> requires the first update to have the value 0, and in stop_progress()
>> one that requires the previous display_progress() call to have a value
>> equal to the total number of work items.  Not sure it'd be worth the
>> hassle..
>
> I fixed and reported a number of bogus progress lines in the past, the
> last one during v2.31.0-rc phase, so I've looked into whether progress
> counters could be automatically validated in our tests, and came up
> with these patches a few months ago.  It turned out that progress
> counters can be checked easily and transparently in case of progress
> lines that are shown in the tests, i.e. that are shown even when
> stderr is not a terminal or are forced with '--progress'.  (In other
> cases it's still fairly easy but not quite transparent, as I think we
> need changes to the progress API; more on that later in a separate
> series.)

I've also been working on some progress.[ch] patches that are mostly
finished, and I'm some 20 patches in at the moment. I wasn't sure about
whether to send an alternate 20-patch "let's do this (mostly) instead?"
series, hence this message.

Much of what you're doing here becomes easier after that series,
e.g. your global progress struct in 2/7 is something I ended up
implementing as part of a general feature to allow progress to be driven
by either display_progress() *or* the signal handler itself.

Thus we can show a "stalled" message if we run start(), but hang before
we ever call display_progress(), as we do on e.g. git.git in gc's
"Enumerating Objects" phase (at least on my laptop).

So e.g. your 2/7 becomes a general hard assertion, not some test-only
mode.

After that I use the same facility to implement a mode where any signal
can update a new "spinner" part of the progress bar. So let's say you're
hanging on item 1/3 and not calling display_progress() at all, we'll
update a spinner on each signal to show the user that git itself isn't
hanging, just working.

I could also rebase on yours, but much of it would be rewriting the
test-only code to be more generalized, perhaps it's easier if we start
going for the more generalized solution first.

Per some of what I mentioned in the thread you linked to I'm a bit
uncomfortable with the direction in your 1/7. It seems to work in-tree
for now, but I'd like to take the progress.c API in the direction of a
more generally useful API, not just something that narrowly fits the
exact set of current use-cases.

There are a lot of potential uses in-tree where the total not matching
at the end is just something that happens due to real-world fuzziness,
e.g. the unlink() example here:
https://public-inbox.org/git/87lf7k2bem.fsf@evledraar.gmail.com/

Perhaps we can just have it BUG() for now as you're doing and cross that
bridge when we come to it. I just wonder if we can't catch potential
bugs in a more gentle way somehow.

> These checks did uncover a couple of buggy progress lines which are
> fixed in this series as well, but I'm not sure that the fix presented
> in patch 6 is the right approach, hence the RFC.

The approach in 6/7 will also have the effect of not balancing a trace2
start/stop region. Quoting a line from its commit message:

    > Arguably, it is wrong to show "done" at the end of the progress
    > line when not all work was done.

I think for a more general API it makes sense to think of "done" as a
different state than "we have reached == total". The target may change
as in the unlink() example, or we may simply decide to abort and "be
done early".

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
@ 2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
  2021-06-22 15:55   ` Taylor Blau
  1 sibling, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-21  7:09 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe


On Sun, Jun 20 2021, SZEDER Gábor wrote:

> @@ -252,7 +255,11 @@ void display_progress(struct progress *progress, uint64_t n)
>  static struct progress *start_progress_delay(const char *title, uint64_t total,
>  					     unsigned delay, unsigned sparse)
>  {
> -	struct progress *progress = xmalloc(sizeof(*progress));
> +	struct progress *progress;
> +
> +	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
> +
> +	progress = xmalloc(sizeof(*progress));

Is this simply an unrelated cleanup/refactoring? I don't see how this
re-arrangement is needed for adding the git_env_bool() call.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
@ 2021-06-21 18:32     ` René Scharfe
  2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 83+ messages in thread
From: René Scharfe @ 2021-06-21 18:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, SZEDER Gábor; +Cc: git

Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>
>> The final value of the counter of the "Scanning merged commits"
>> progress line is always one less than its expected total, e.g.:
>>
>>   Scanning merged commits:  83% (5/6), done.
>>
>> This happens because while iterating over an array the loop variable
>> is passed to display_progress() as-is, but while C arrays (and thus
>> the loop variable) start at 0 and end at N-1, the progress counter
>> must end at N.  This causes the failures of the tests
>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>
>> Fix this by passing 'i + 1' to display_progress(), like most other
>> callsites do.
>>
>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>> ---
>>  commit-graph.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/commit-graph.c b/commit-graph.c
>> index 2bcb4e0f89..3181906368 100644
>> --- a/commit-graph.c
>> +++ b/commit-graph.c
>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>
>>  	ctx->num_extra_edges = 0;
>>  	for (i = 0; i < ctx->commits.nr; i++) {
>> -		display_progress(ctx->progress, i);
>> +		display_progress(ctx->progress, i + 1);
>>
>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>  			  &ctx->commits.list[i]->object.oid)) {
>
> I think this fix makes sense, but FWIW there's a large thread starting
> at [1] where René disagrees with me, and thinks the fix for this sort of
> thing would be to display_progress(..., i + 1) at the end of that
> for-loop, or just before the stop_progress().
>
> I don't agree, but just noting the disagreement, and that if that
> argument wins then a patch like this would involve changing the other
> 20-some calls to display_progress() in commit-graph.c to work
> differently (and to be more complex, we'd need to deal with loop
> break/continue etc.).
>
> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/

*sigh*  (And sorry, Ævar.)

Before an item is done, it should be reported as not done.  After an
item is done, it should be reported as done.  One loop iteration
finishes one item.  Thus the number of items to report at the bottom of
the loop is one higher than at the top.  i is the correct number to
report at the top of a zero-based loop, i+1 at the bottom.

There is another place: In the loop header.  It's a weird place for a
function call, but it gets triggered before, between and after all
items, just as we need it:

	for (i = 0; display_progress(ctx->progress, i), i < ctx->commits.nr; i++) {

We could hide this unseemly sight in a macro:

  #define progress_foreach(index, count, progress) \
  for (index = 0; display_progress(progress, index), index < count; index++)

Hmm?
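
For illustration, here is a minimal standalone sketch of how such a
macro behaves.  It is not git's code: the struct progress and
display_progress() below are stand-ins that merely record their
argument, and sum_with_progress() is a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for git's progress machinery: it only records what it is told. */
struct progress {
	uint64_t last_value;
	unsigned calls;
};

static void display_progress(struct progress *progress, uint64_t n)
{
	progress->last_value = n;
	progress->calls++;
}

/* The macro from above: report before, between and after all items. */
#define progress_foreach(index, count, progress) \
	for (index = 0; display_progress(progress, index), index < count; index++)

/* Hypothetical caller: sums an array while reporting progress. */
static int sum_with_progress(const int *items, size_t nr,
			     struct progress *progress)
{
	size_t i;
	int sum = 0;

	progress_foreach(i, nr, progress)
		sum += items[i];

	/* The last report happens when the loop condition fails, i.e. at nr. */
	assert(progress->last_value == nr);
	return sum;
}
```

With three items the stand-in is called four times, with 0, 1, 2 and
3, so the last reported value equals the total.  One caveat of this
shape: a 'break' in the loop body skips the final call, so such loops
would still need an explicit display_progress() before stopping.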

René

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors
  2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
@ 2021-06-21 18:32   ` René Scharfe
  2021-06-23  1:52     ` Taylor Blau
  0 siblings, 1 reply; 83+ messages in thread
From: René Scharfe @ 2021-06-21 18:32 UTC (permalink / raw)
  To: SZEDER Gábor, git; +Cc: Ævar Arnfjörð Bjarmason

Am 20.06.21 um 22:03 schrieb SZEDER Gábor:
> RFC!!  Alas, not calling stop_progress() on error has drawbacks:
>
>   - All memory allocated for the progress bar is leaked.
>   - This progress line remains "active", in the sense that if we were
>     to start a new progress later in the same git process, then with
>     GIT_TEST_CHECK_PROGRESS it would trigger the other BUG() catching
>     nested/overlapping progresses.
>
> Do we care?!  TBH I don't :)
> Anyway, if we do, then we might need some sort of an abort_progress()
> function...

I think the abort_progress() idea makes sense; to clean up allocations,
tell the user what happened and avoid the BUG().  Showing just
"aborted" instead of "done" should suffice here -- the explanation is
given a few lines later ("'foo' was not filtered properly").

It could be a cheesy stop_progress_msg() wrapper that temporarily sets
test_check_progress to zero..
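
A rough sketch of that wrapper, against a toy model of progress.c
rather than the real thing: the struct, the check_failures counter
(standing in for BUG()) and the output format are all made up for
illustration; only the names stop_progress_msg(), test_check_progress
and abort_progress() come from this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy model of a progress line; not git's real struct progress. */
struct progress {
	const char *title;
	uint64_t total;
	uint64_t last_value;
};

static int test_check_progress = 1;	/* as if GIT_TEST_CHECK_PROGRESS=1 */
static int check_failures;		/* counted where git would BUG() */

static struct progress *start_progress(const char *title, uint64_t total)
{
	struct progress *progress = malloc(sizeof(*progress));

	if (!progress)
		abort();
	progress->title = title;
	progress->total = total;
	progress->last_value = 0;
	return progress;
}

static void display_progress(struct progress *progress, uint64_t n)
{
	progress->last_value = n;
}

static void stop_progress_msg(struct progress **p_progress, const char *msg)
{
	struct progress *progress;

	assert(p_progress);
	progress = *p_progress;
	if (!progress)
		return;
	*p_progress = NULL;

	/* The test-only check: the counter must have reached the total. */
	if (test_check_progress && progress->total &&
	    progress->last_value != progress->total)
		check_failures++;

	fprintf(stderr, "%s: %llu/%llu, %s.\n", progress->title,
		(unsigned long long)progress->last_value,
		(unsigned long long)progress->total, msg);
	free(progress);
}

/* The "cheesy" wrapper: stop without the total check, say "aborted". */
static void abort_progress(struct progress **p_progress)
{
	int saved = test_check_progress;

	test_check_progress = 0;
	stop_progress_msg(p_progress, "aborted");
	test_check_progress = saved;
}
```

In this model, aborting at 1/3 frees the struct and leaves
check_failures untouched, while a plain stop_progress_msg() at 2/3
still trips the check.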

René

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-21 18:32     ` René Scharfe
@ 2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
  2021-06-26  8:27         ` René Scharfe
  0 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-21 20:08 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Mon, Jun 21 2021, René Scharfe wrote:

> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>
>>> The final value of the counter of the "Scanning merged commits"
>>> progress line is always one less than its expected total, e.g.:
>>>
>>>   Scanning merged commits:  83% (5/6), done.
>>>
>>> This happens because while iterating over an array the loop variable
>>> is passed to display_progress() as-is, but while C arrays (and thus
>>> the loop variable) start at 0 and end at N-1, the progress counter
>>> must end at N.  This causes the failures of the tests
>>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>>
>>> Fix this by passing 'i + 1' to display_progress(), like most other
>>> callsites do.
>>>
>>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>>> ---
>>>  commit-graph.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/commit-graph.c b/commit-graph.c
>>> index 2bcb4e0f89..3181906368 100644
>>> --- a/commit-graph.c
>>> +++ b/commit-graph.c
>>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>>
>>>  	ctx->num_extra_edges = 0;
>>>  	for (i = 0; i < ctx->commits.nr; i++) {
>>> -		display_progress(ctx->progress, i);
>>> +		display_progress(ctx->progress, i + 1);
>>>
>>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>>  			  &ctx->commits.list[i]->object.oid)) {
>>
>> I think this fix makes sense, but FWIW there's a large thread starting
>> at [1] where René disagrees with me, and thinks the fix for this sort of
>> thing would be to display_progress(..., i + 1) at the end of that
>> for-loop, or just before the stop_progress().
>>
>> I don't agree, but just noting the disagreement, and that if that
>> argument wins then a patch like this would involve changing the other
>> 20-some calls to display_progress() in commit-graph.c to work
>> differently (and to be more complex, we'd need to deal with loop
>> break/continue etc.).
>>
>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>
> *sigh*  (And sorry, Ævar.)
>
> Before an item is done, it should be reported as not done.  After an
> item is done, it should be reported as done.  One loop iteration
> finishes one item.  Thus the number of items to report at the bottom of
> the loop is one higher than at the top.  i is the correct number to
> report at the top of a zero-based loop, i+1 at the bottom.
>
> There is another place: In the loop header.  It's a weird place for a
> function call, but it gets triggered before, between and after all
> items, just as we need it:
>
> 	for (i = 0; display_progress(ctx->progress, i), i < ctx->commits.nr; i++) {
>
> We could hide this unseemly sight in a macro:
>
>   #define progress_foreach(index, count, progress) \
>   for (index = 0; display_progress(progress, index), index < count; index++)

Anyone with more time than sense can go and read over our linked back &
forth thread where we're disagreeing on that point :). I think the pattern
in commit-graph.c makes sense, you don't.

Anyway, aside from that. I think, and I really would be advocating this
too, even if our respective positions were reversed, that *in this case*
it makes sense to just take something like SZEDER's patch here
as-is. Because in that file there's some dozen occurrences of that exact
pattern.

Let's just bring this one case in line with the rest. If we then want to
argue that one or the other use of the progress.c API is wrong in
general, I think it makes more sense to discuss that in a follow-up
series that changes these various API uses en masse than to hold back
isolated fixes for progress bars that currently end at != 100%.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters
  2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
  2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
@ 2021-06-22 15:55   ` Taylor Blau
  1 sibling, 0 replies; 83+ messages in thread
From: Taylor Blau @ 2021-06-22 15:55 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, Ævar Arnfjörð Bjarmason, René Scharfe

On Sun, Jun 20, 2021 at 10:02:57PM +0200, SZEDER Gábor wrote:
> +	progress->last_value = n;
> +
>  	if (progress->delay && (!progress_update || --progress->delay))
>  		return;
>
> -	progress->last_value = n;

Makes sense, and thanks for explaining it explicitly in the patch
message.

>  	tp = (progress->throughput) ? progress->throughput->display.buf : "";
>  	if (progress->total) {
>  		unsigned percent = n * 100 / progress->total;
> @@ -252,7 +255,11 @@ void display_progress(struct progress *progress, uint64_t n)
>  static struct progress *start_progress_delay(const char *title, uint64_t total,
>  					     unsigned delay, unsigned sparse)
>  {
> -	struct progress *progress = xmalloc(sizeof(*progress));
> +	struct progress *progress;
> +
> +	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
> +
> +	progress = xmalloc(sizeof(*progress));

As Ævar noted elsewhere in the thread, I think, this cleanup to move the
xmalloc() call to after reading $GIT_TEST_CHECK_PROGRESS is unnecessary.

> +test_expect_success 'GIT_TEST_CHECK_PROGRESS catches non-matching total' '
> +	cat >in <<-\EOF &&
> +	progress 1
> +	progress 2
> +	progress 4
> +	EOF
> +
> +	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
> +		test-tool progress --total=3 "Not enough" <in 2>stderr &&
> +	grep "BUG:.*total progress does not match" stderr &&
> +
> +	test_must_fail env GIT_TEST_CHECK_PROGRESS=1 \
> +		test-tool progress --total=5 "Too much" <in 2>stderr &&
> +	grep "BUG:.*total progress does not match" stderr
> +'

This and the below test are both good to see. I wondered briefly whether
or not it would be worth adding a test to check that the "progress does
not match" triggers even when we have a non-zero delay, like:

    test_must_fail env GIT_PROGRESS_DELAY=100 GIT_TEST_CHECK_PROGRESS=1 \
      test-tool progress --total=5 "Too much" <in 2>stderr &&
    grep "BUG:.*total progress does not match" stderr

But it's not helpful, because GIT_PROGRESS_DELAY is already 2 by
default, and we unset GIT_* environment variables (including
GIT_PROGRESS_DELAY) except a few which are left alone.

So we are already testing this case implicitly. It may be worth making
it explicit, and/or testing the case where GIT_PROGRESS_DELAY=0, but I
do not feel strongly about it. Besides, I would much rather err on the
side of testing cases we feel are legitimately interesting, rather than
filling in a grid of all possible combinations, including uninteresting
ones.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS
  2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
@ 2021-06-22 16:00   ` Taylor Blau
  0 siblings, 0 replies; 83+ messages in thread
From: Taylor Blau @ 2021-06-22 16:00 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, Ævar Arnfjörð Bjarmason, René Scharfe

On Sun, Jun 20, 2021 at 10:02:58PM +0200, SZEDER Gábor wrote:
> Note that this will trigger even in cases where the output is not
> visibly wrong, e.g. consider this simplified sequence of calls:
>
>   progress1 = start_delayed_progress();
>   progress2 = start_delayed_progress();
>   for (i = 0; ...)
>       display_progress(progress2, i + 1);
>   stop_progres(&progress2);
>   for (j = 0; ...)
>       display_progress(progress1, j + 1);
>   stop_progres(&progress1);

s/stop_progres/&s/, but no big deal. Everything else here looks good.

> diff --git a/progress.c b/progress.c
> index 255995406f..549e8d1fe7 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -48,6 +48,8 @@ struct progress {
>  static volatile sig_atomic_t progress_update;
>
>  static int test_check_progress;
> +/* Used to catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS. */
> +static struct progress *current_progress = NULL;
>
>  /*
>   * These are only intended for testing the progress output, i.e. exclusively
> @@ -258,8 +260,12 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
>  	struct progress *progress;
>
>  	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
> +	if (test_check_progress && current_progress)
> +		BUG("progress \"%s\" is still active when starting new progress \"%s\"",
> +		    current_progress->title, title);
>
>  	progress = xmalloc(sizeof(*progress));

Ah. This is why you moved the allocation down further, since we don't
have to free anything up when calling BUG() if it wasn't allocated in
the first place (and we had no such conditional that would cause us to
abort early before).

For what it's worth, I probably would have preferred to see that change
from the previous patch included in this one rather than in the first of
the series, since it's much clearer here than it is in the first patch.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors
  2021-06-21 18:32   ` René Scharfe
@ 2021-06-23  1:52     ` Taylor Blau
  0 siblings, 0 replies; 83+ messages in thread
From: Taylor Blau @ 2021-06-23  1:52 UTC (permalink / raw)
  To: René Scharfe
  Cc: SZEDER Gábor, git, Ævar Arnfjörð Bjarmason

On Mon, Jun 21, 2021 at 08:32:56PM +0200, René Scharfe wrote:
> Am 20.06.21 um 22:03 schrieb SZEDER Gábor:
> > RFC!!  Alas, not calling stop_progress() on error has drawbacks:
> >
> >   - All memory allocated for the progress bar is leaked.
> >   - This progress line remains "active", in the sense that if we were
> >     to start a new progress later in the same git process, then with
> >     GIT_TEST_CHECK_PROGRESS it would trigger the other BUG() catching
> >     nested/overlapping progresses.
> >
> > Do we care?!  TBH I don't :)
> > Anyway, if we do, then we might need some sort of an abort_progress()
> > function...
>
> I think the abort_progress() idea makes sense; to clean up allocations,
> tell the user what happened and avoid the BUG().  Showing just
> "aborted" instead of "done" should suffice here -- the explanation is
> given a few lines later ("'foo' was not filtered properly").

Very well put. I concur that having an abort_progress() API makes sense
for all of the reasons that you suggest, but also because we shouldn't
encourage callers to skip what seems like the appropriate API just to
avoid failing tests when GIT_TEST_CHECK_PROGRESS is set.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 0/7] progress: verify progress counters in the test suite
  2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
@ 2021-06-23  2:04   ` Taylor Blau
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 83+ messages in thread
From: Taylor Blau @ 2021-06-23  2:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: SZEDER Gábor, git, René Scharfe

On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>
> > Splitting off from:
> >
> >   https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
> >
> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
> >> I wonder (only in a semi-curious way, though) if we can detect
> >> off-by-one errors by adding an assertion to display_progress() that
> >> requires the first update to have the value 0, and in stop_progress()
> >> one that requires the previous display_progress() call to have a value
> >> equal to the total number of work items.  Not sure it'd be worth the
> >> hassle..
> >
> > I fixed and reported a number of bogus progress lines in the past, the
> > last one during v2.31.0-rc phase, so I've looked into whether progress
> > counters could be automatically validated in our tests, and came up
> > with these patches a few months ago.  It turned out that progress
> > counters can be checked easily and transparently in case of progress
> > lines that are shown in the tests, i.e. that are shown even when
> > stderr is not a terminal or are forced with '--progress'.  (In other
> > cases it's still fairly easy but not quite transparent, as I think we
> > need changes to the progress API; more on that later in a separate
> > series.)
>
> I've also been working on some progress.[ch] patches that are mostly
> finished, and I'm some 20 patches in at the moment. I wasn't sure about
> whether to send an alternate 20-patch "let's do this (mostly) instead?"
> series, hence this message.
>
> Much of what you're doing here becomes easier after that series,
> e.g. your global progress struct in 2/7 is something I ended up
> implementing as part of a general feature to allow progress to be driven
> by either display_progress() *or* the signal handler itself.

It's difficult to know who should rebase onto who without seeing one
half of the patches. I couldn't find a link to them anywhere (even if
they are only available in your fork in a pre-polished state) despite
looking, but my apologies if they are available and I'm just missing
them.

In general, I think that these patches are clear and are helpful in
pinning down issues with the progress API (which I have made a handful
of times in the past), so I would be happy to see them picked up.

> I could also rebase on yours, but much of it would be rewriting the
> test-only code to be more generalized, perhaps it's easier if we start
> going for the more generalized solution first.

Again, without knowing the substance of your patches it's hard to
comment for sure, but I don't have a problem with a simple and direct
approach here.

> Perhaps we can just have it BUG() for now as you're doing and cross that
> bridge when we come to it. I just wonder if we can't catch potential
> bugs in a more gentle way somehow.

I think there are compelling reasons to feel that the new mode should
only be enabled during tests, as well as compelling reasons to feel that
it should be enabled all of the time.

One way to think about it is that we do not want users to have a BUG()
abort their program just because a progress meter went rogue. So in that
sense, it makes sense that we would only see that happen during tests,
so that those tests could tell us where the bug is, and we could fix it.

On the other hand, since we make sure that our tests pass at each patch,
there's no point in having a separate mode; we could instead remove the
conditionals on GIT_TEST_CHECK_PROGRESS, since successfully running the
tests tells us that there are no rogue progress meters among those we
exercise in our (hopefully) complete set of tests.

I could go either way, I think both lines of reasoning are quite
reasonable. But, I think we are generally more lax about having the
whole ci/run-build-and-tests.sh script pass at every commit, and that it
seems we care more about having the tip of each series pass CI when
integrated into 'seen'.

So I don't think that hiding this new mode behind an environment
variable is giving us as much confidence as we'd like, because it
doesn't add anything in "make test".

To me, I think a reasonable direction to take would be to *always*
export GIT_TEST_CHECK_PROGRESS when running tests, not just in
ci/run-build-and-tests.sh. That means we'll catch incorrect uses of the
progress API during tests, without worrying that incomplete coverage
will cause user-visible breakage.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23  2:04   ` Taylor Blau
@ 2021-06-23 17:48     ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
                         ` (25 more replies)
  0 siblings, 26 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>
>> > Splitting off from:
>> >
>> >   https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>> >
>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>> >> I wonder (only in a semi-curious way, though) if we can detect
>> >> off-by-one errors by adding an assertion to display_progress() that
>> >> requires the first update to have the value 0, and in stop_progress()
>> >> one that requires the previous display_progress() call to have a value
>> >> equal to the total number of work items.  Not sure it'd be worth the
>> >> hassle..
>> >
>> > I fixed and reported a number of bogus progress lines in the past, the
>> > last one during v2.31.0-rc phase, so I've looked into whether progress
>> > counters could be automatically validated in our tests, and came up
>> > with these patches a few months ago.  It turned out that progress
>> > counters can be checked easily and transparently in case of progress
>> > lines that are shown in the tests, i.e. that are shown even when
>> > stderr is not a terminal or are forced with '--progress'.  (In other
>> > cases it's still fairly easy but not quite transparent, as I think we
>> > need changes to the progress API; more on that later in a separate
>> > series.)
>>
>> I've also been working on some progress.[ch] patches that are mostly
>> finished, and I'm some 20 patches in at the moment. I wasn't sure about
>> whether to send an alternate 20-patch "let's do this (mostly) instead?"
>> series, hence this message.
>>
>> Much of what you're doing here becomes easier after that series,
>> e.g. your global progress struct in 2/7 is something I ended up
>> implementing as part of a general feature to allow progress to be driven
>> by either display_progress() *or* the signal handler itself.
>
> It's difficult to know who should rebase onto who without seeing one
> half of the patches.

I was sort of hoping he'd take my word for it, but here it is. Don't
say I didn't warn you :)

> I couldn't find a link to them anywhere (even if
> they are only available in your fork in a pre-polished state) despite
> looking, but my apologies if they are available and I'm just missing
> them.

FWIW it's avar-szeder/progress-bar-assertions in
https://github.com/avar/git.git, that repo contains various
functioning and not-so-functioning code.

https://github.com/avar/git/tree/meta/ is my version of the crappy
scripts we probably all have some version of for building our own git;
the things that are uncommented in series.conf are what I build my own
git from.

> In general, I think that these patches are clear and are helpful in
> pinning down issues with the progress API (which I have made a handful
> of times in the past), so I would be happy to see them picked up.

Here are all 25 patches (well, around 20 before) that I had queued up
locally and fixed up a bit.

The 01/25 is something I submitted already as
https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
hoping to get this in incrementally.

The 12/25 is my own version of that "global progress struct"; 11/25 is
the first of many bugs SZEDER missed in his :)

18/25 is the first step of the UI I was going for, the signal handler
can now drive the progress bar, so e.g. during "git gc" we show (at
least for me, on git.git), a "stalled" message just before we start
the actual count of "Enumerating Objects".

After that was in I was planning on adding config-driven support to
show a "spinner" when we stalled in that way, config-driven because
you could just scrape
e.g. https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
into your own config. See
https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)

19-23/25 is my grabbing of SZEDER's patches that I'm comfortable
labeling as "PATCH", I think they work, but no BUG() assertions yet. I
left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier work sets
things up so that any BUG() we trust is on by default.

22/25 is what I think we should do instead of SZEDER's 6/7
(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
I don't think this "our total doesn't match at the end" is something
we should always BUG() on, for reasons explained there.

I am sympathetic to doing it by default though, hence the
stop_progress_early() API, that's there to allow select callers to
bypass his BUG(...) assertion.

24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
BUG(...) assertions.

His series passes the test suite, but actually severely breaks
things. It'll make e.g. "git commit-graph write" BUG(...) out. The
reason the tests don't catch it is that we have a blind spot in the
tests.

Namely, that most things that use the progress bar API use isatty() to
check if they should start_progress(). If you run the tests as
e.g. (better ways to do this, especially in parallel, most welcome):

    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done

You can discover various things that his series BUG()'s on, I fixed a
couple of those myself, it's an early part of this series.

But we'll still have various untested for BUG()'s even then, this is
because you *also* have to have the test actually emit a "naked"
progress bar on stderr, if the test itself e.g. pipes fd 2 to a file
it won't work.
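
As a rough illustration of that blind spot, here is a minimal sketch of
the kind of guard described above. It is not code from git.git:
want_progress() and its "force" parameter are hypothetical names, and
"force" merely models an explicit '--progress' option.

```c
#include <unistd.h>

/*
 * Hypothetical helper, not part of git.git: decide whether a caller
 * should bother calling start_progress() at all.
 *
 * "force" models an explicit --progress option; without it, progress
 * is only wanted when the descriptor refers to a terminal.
 */
int want_progress(int fd, int force)
{
	if (force)
		return 1;
	return isatty(fd) ? 1 : 0;
}
```

With a guard like this, a test whose stderr is redirected to a file
never reaches the progress code unless it forces progress, so any
assertion living in that code is never exercised.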

I created a shitty-and-mostly-broken throwaway change to
search-replace all the guards of "start_progress(...)" to run
unconditionally, and convert all the "delayed" to the non-delayed
version. That'll find even more BUG()'s where SZEDER's series still
needs to be fixed (and also some unrelated segfaults, I gave up on it
soon after).

Even if we fix that I wouldn't trust it, because a lot of the progress
bars we have depend on the size and shape of the data we're
processing, e.g. the bug I fixed in 11/25. If people find this BUG()
approach worth pursuing I think it would be better to make it an
opt-in flag we convert one caller at a time to.

For some it's really clear that we could assert it, for others such as
the commit-graph it's much more subtle, we're in some callback after
setting a "total", that callback does a "break", "continue" etc. in
various places, all depending on repository data.

It's not easy to reason about that and be certain that we can hold to
the estimate. If we get it wrong someone's repo in the wild won't
fully GC because of the overly eager BUG().
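
To make the failure mode concrete, here is a small model of such a
callback. It is only a sketch under assumptions: struct counter_model
and walk_items() are hypothetical and stand in for the real progress
struct and the commit-graph code, which are more involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the counters a progress bar tracks. */
struct counter_model {
	uint64_t total;      /* what the caller promised up front */
	uint64_t last_count; /* last value that would have been displayed */
};

/*
 * Walk "n" items, skipping the ones flagged in "skip". The total is
 * set to "n" before the loop, but because of the "continue" the
 * counter only advances for items that are actually processed.
 *
 * Returns 1 if the final count matches the promised total, 0 if a
 * strict "count == total" assertion at stop time would have fired.
 */
int walk_items(const int *skip, size_t n, struct counter_model *m)
{
	size_t i;

	m->total = n;
	m->last_count = 0;
	for (i = 0; i < n; i++) {
		if (skip[i])
			continue; /* data-dependent early exit */
		m->last_count++;
	}
	return m->last_count == m->total;
}
```

Whether the final count matches depends entirely on the contents of
"skip", i.e. on the data, which is why a test suite with a fixed set of
repositories cannot prove that the assertion is safe.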

If SZEDER wants to pursue it I think it'll be easier on top of this
series, but personally I really don't see the point of spending effort
on it.

We should really be going in the other direction, of having more fuzzy
ETAs, not less.

E.g. we often have enough data at the start of "Enumerating Objects"
to give a good-enough target value, that it's 5-10% off isn't really
the point, but that the user looking at it sees something better than
a dumb count-up, and can instead see that they'll probably be looking
at it for about a minute. Currently our API gives no ETA/target unless
we're 100% sure, which is not good UX.

So trying to get the current exact count/exact percentage right seems
like a distraction to me in the longer term. If anything we should
just be rounding those numbers, showing fuzzy ETAs instead of
percentages if we can etc.
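
A fuzzy ETA of that sort could be as simple as a linear extrapolation.
The following is a sketch only; fuzzy_eta_seconds() is a hypothetical
helper, nothing like it exists in progress.c, and a real version would
presumably smooth and round the result before showing it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate the remaining seconds from the work
 * done so far, an approximate total, and the elapsed time in seconds.
 * Returns -1 when no estimate can be made yet.
 */
double fuzzy_eta_seconds(uint64_t done, uint64_t est_total, double elapsed)
{
	if (!done || elapsed <= 0)
		return -1;
	if (done >= est_total)
		return 0; /* the estimate was too low, treat as "almost done" */
	return elapsed * (double)(est_total - done) / (double)done;
}
```

Note that an estimate that is 5-10% off only shifts the result by a
similar amount, and overshooting the estimated total degrades to "0
seconds left" rather than to an error.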

SZEDER Gábor (4):
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line
  progress: assert last update in stop_progress()
  progress: assert counting upwards in display()

Ævar Arnfjörð Bjarmason (21):
  progress.c tests: fix breakage with COLUMNS != 80
  progress.c tests: make start/stop verbs on stdin
  progress.c tests: test some invalid usage
  progress.c tests: add a "signal" verb
  progress.c: move signal handler functions lower
  progress.c: call progress_interval() from progress_test_force_update()
  progress.c: stop eagerly fflush(stderr) when not a terminal
  progress.c: add temporary variable from progress struct
  midx perf: add a perf test for multi-pack-index
  progress.c: remove the "sparse" mode nano-optimization
  pack-bitmap-write.c: add a missing stop_progress()
  progress.c: add & assert a "global_progress" variable
  progress.[ch]: move the "struct progress" to the header
  progress.[ch]: move test-only code away from "extern" variables
  progress.c: pass "is done?" (again) to display()
  progress.[ch]: convert "title" to "struct strbuf"
  progress.c: refactor display() for less confusion, and fix bug
  progress.c: emit progress on first signal, show "stalled"
  midx: don't provide a total for QSORT() progress
  progress.c: add a stop_progress_early() function
  entry: deal with unexpected "Filtering content" total

 cache.h                          |   1 -
 commit-graph.c                   |   2 +-
 csum-file.h                      |   2 -
 entry.c                          |  12 +-
 midx.c                           |  25 +-
 pack-bitmap-write.c              |   1 +
 pack.h                           |   1 -
 parallel-checkout.h              |   1 -
 progress.c                       | 391 ++++++++++++++++++-------------
 progress.h                       |  50 +++-
 reachable.h                      |   1 -
 t/helper/test-progress.c         |  54 +++--
 t/perf/p5319-multi-pack-index.sh |  21 ++
 t/t0500-progress-display.sh      | 247 ++++++++++++++-----
 14 files changed, 537 insertions(+), 272 deletions(-)
 create mode 100755 t/perf/p5319-multi-pack-index.sh

-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 02/25] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
                         ` (24 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The tests added in 2bb74b53a49 (Test the progress display, 2019-09-16)
broke under anything except COLUMNS=80, i.e. when running them under
the "-v" mode under a differently sized terminal.

Let's set the expected number of COLUMNS at the start of the test to
fix that bug. It's handy not to do this in test-progress.c itself, in
case we'd like to test for a different number of COLUMNS, either
manually or in a future test.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503ac..66c092a0fe3 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -8,6 +8,11 @@ show_cr () {
 	tr '\015' Q | sed -e "s/Q/<CR>\\$LF/g"
 }
 
+test_expect_success 'setup COLUMNS' '
+	COLUMNS=80 &&
+	export COLUMNS
+'
+
 test_expect_success 'simple progress display' '
 	cat >expect <<-\EOF &&
 	Working hard: 1<CR>
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 02/25] progress.c tests: make start/stop verbs on stdin
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 03/25] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
                         ` (23 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Change the usage of the "test-tool progress" introduced in
2bb74b53a49 (Test the progress display, 2019-09-16) to take commands
like "start" and "stop" on stdin, instead of running them implicitly.

This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such stress tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 45 ++++++++++++++++++++--------
 t/t0500-progress-display.sh | 59 +++++++++++++++++++++++--------------
 2 files changed, 69 insertions(+), 35 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe7894..eb925d591e1 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -3,6 +3,9 @@
  *
  * Reads instructions from standard input, one instruction per line:
  *
+ *   "start[ <total>[ <title>]]" - Call start_progress(title, total),
+ *                                 when "start" use a title of
+ *                                 "Working hard" with a total of 0.
  *   "progress <items>" - Call display_progress() with the given item count
  *                        as parameter.
  *   "throughput <bytes> <millis> - Call display_throughput() with the given
@@ -10,6 +13,7 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
  */
@@ -22,31 +26,42 @@
 
 int cmd__progress(int argc, const char **argv)
 {
-	int total = 0;
-	const char *title;
+	const char *default_title = "Working hard";
+	char *detached_title = NULL;
 	struct strbuf line = STRBUF_INIT;
-	struct progress *progress;
+	struct progress *progress = NULL;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress <stdin",
 		NULL
 	};
 	struct option options[] = {
-		OPT_INTEGER(0, "total", &total, "total number of items"),
 		OPT_END(),
 	};
 
 	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc != 1)
-		die("need a title for the progress output");
-	title = argv[0];
+	if (argc)
+		usage_with_options(usage, options);
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
-		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
+		if (!strcmp(line.buf, "start")) {
+			progress = start_progress(default_title, 0);
+		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
+			uint64_t total = strtoull(end, &end, 10);
+			if (*end == '\0') {
+				progress = start_progress(default_title, total);
+			} else if (*end == ' ') {
+				if (detached_title)
+					free(detached_title);
+				detached_title = strbuf_detach(&line, NULL);
+				progress = start_progress(end + 1, total);
+			} else {
+				die("invalid input: '%s'\n", line.buf);
+			}
+		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
 			uint64_t item_count = strtoull(end, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
@@ -63,12 +78,16 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update"))
+		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
-		else
+		} else if (!strcmp(line.buf, "stop")) {
+			stop_progress(&progress);
+		} else {
 			die("invalid input: '%s'\n", line.buf);
+		}
 	}
-	stop_progress(&progress);
+	if (detached_title)
+		free(detached_title);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 66c092a0fe3..ce6c3434673 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -22,6 +22,7 @@ test_expect_success 'simple progress display' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	update
 	progress 1
 	update
@@ -30,8 +31,9 @@ test_expect_success 'simple progress display' '
 	progress 4
 	update
 	progress 5
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -46,11 +48,13 @@ test_expect_success 'progress display with total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 3
 	progress 1
 	progress 2
 	progress 3
+	stop
 	EOF
-	test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -67,14 +71,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 100
 	progress 1000
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -93,16 +97,15 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
-	update
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 1
 	update
 	progress 2
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -121,14 +124,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -145,14 +148,14 @@ Working hard.......2.........3.........4.........5.........6.........7.........:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6.........7.........
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6.........7........." \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -169,12 +172,14 @@ test_expect_success 'progress shortens - crazy caller' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 1000
 	progress 100
 	progress 200
 	progress 1
 	progress 1000
+	stop
 	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -190,6 +195,7 @@ test_expect_success 'progress display with throughput' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 102400 1000
 	update
 	progress 10
@@ -202,8 +208,9 @@ test_expect_success 'progress display with throughput' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -219,6 +226,7 @@ test_expect_success 'progress display with throughput and total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	progress 10
 	throughput 204800 2000
@@ -227,8 +235,9 @@ test_expect_success 'progress display with throughput and total' '
 	progress 30
 	throughput 409600 4000
 	progress 40
+	stop
 	EOF
-	test-tool progress --total=40 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -244,6 +253,7 @@ test_expect_success 'cover up after throughput shortens' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 409600 1000
 	update
 	progress 1
@@ -256,8 +266,9 @@ test_expect_success 'cover up after throughput shortens' '
 	throughput 1638400 4000
 	update
 	progress 4
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -272,6 +283,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 1 1000
 	update
 	progress 1
@@ -281,8 +293,9 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	throughput 3145728 3000
 	update
 	progress 3
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -290,6 +303,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	update
 	progress 10
@@ -302,10 +316,11 @@ test_expect_success 'progress generates traces' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
-		"Working hard" <in 2>stderr &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress \
+		<in 2>stderr &&
 
 	# t0212/parse_events.perl intentionally omits regions and data.
 	test_region progress "Working hard" trace.event &&
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 03/25] progress.c tests: test some invalid usage
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 02/25] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 04/25] progress.c tests: add a "signal" verb Ævar Arnfjörð Bjarmason
                         ` (22 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Test what happens when we "stop" without a "start", omit the "stop"
after a "start", or try to start two concurrent progress bars. This
extends the trace2 tests added in 98a13647408 (trace2: log progress
time and throughput, 2020-05-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ce6c3434673..50eced31f03 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -328,4 +328,37 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces: stop / start' '
+	cat >in <<-\EOF &&
+	start
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-startstop.event" test-tool progress \
+		<in 2>stderr &&
+	test_region progress "Working hard" trace-startstop.event
+'
+
+test_expect_success 'progress generates traces: start without stop' '
+	cat >in <<-\EOF &&
+	start
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" test-tool progress \
+		<in 2>stderr &&
+	grep region_enter.*progress trace-start.event &&
+	! grep region_leave.*progress trace-start.event
+'
+
+test_expect_success 'progress generates traces: stop without start' '
+	cat >in <<-\EOF &&
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-stop.event" test-tool progress \
+		<in 2>stderr &&
+	! grep region_enter.*progress trace-stop.event &&
+	! grep region_leave.*progress trace-stop.event
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 04/25] progress.c tests: add a "signal" verb
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (2 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 03/25] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 05/25] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
                         ` (21 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add a "signal" synonym for "update". It is not typical of the
progress.c API to encounter a scenario where we do an update before
the first display_progress(), let's indicate this explicitly by
calling such instances "signal".

It's just a synonym for "update", but we can imagine that the
following "update" calls could elide many "progress" calls, and the
progress bar output will generally be of the same type, whereas the
output where we're asked to emit an update before we've received any
data is a special case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    |  6 +++++-
 t/t0500-progress-display.sh | 10 +++++-----
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index eb925d591e1..7ca58a3ee78 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -13,6 +13,9 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "signal" - Synonym for "update", used for self-documenting tests,
+ *              i.e. "expect signal here due to hanging ("signal")
+ *              v.s. it was time to update ("update").
  *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
@@ -78,7 +81,8 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update")) {
+		} else if (!strcmp(line.buf, "update") ||
+			   !strcmp(line.buf, "signal")) {
 			progress_test_force_update();
 		} else if (!strcmp(line.buf, "stop")) {
 			stop_progress(&progress);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 50eced31f03..66c1989b176 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -23,7 +23,7 @@ test_expect_success 'simple progress display' '
 
 	cat >in <<-\EOF &&
 	start 0
-	update
+	signal
 	progress 1
 	update
 	progress 2
@@ -197,7 +197,7 @@ test_expect_success 'progress display with throughput' '
 	cat >in <<-\EOF &&
 	start
 	throughput 102400 1000
-	update
+	signal
 	progress 10
 	throughput 204800 2000
 	update
@@ -255,7 +255,7 @@ test_expect_success 'cover up after throughput shortens' '
 	cat >in <<-\EOF &&
 	start
 	throughput 409600 1000
-	update
+	signal
 	progress 1
 	throughput 819200 2000
 	update
@@ -285,7 +285,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	cat >in <<-\EOF &&
 	start
 	throughput 1 1000
-	update
+	signal
 	progress 1
 	throughput 1024000 2000
 	update
@@ -305,7 +305,7 @@ test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
 	start 40
 	throughput 102400 1000
-	update
+	signal
 	progress 10
 	throughput 204800 2000
 	update
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 05/25] progress.c: move signal handler functions lower
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (3 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 04/25] progress.c tests: add a "signal" verb Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
                         ` (20 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Move the signal handler functions to just before the
start_progress_delay() where they'll be referenced, instead of having
them at the top of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 92 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 44 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf93..893cb0fe56f 100644
--- a/progress.c
+++ b/progress.c
@@ -53,50 +53,6 @@ static volatile sig_atomic_t progress_update;
  */
 int progress_testing;
 uint64_t progress_test_ns = 0;
-void progress_test_force_update(void)
-{
-	progress_update = 1;
-}
-
-
-static void progress_interval(int signum)
-{
-	progress_update = 1;
-}
-
-static void set_progress_signal(void)
-{
-	struct sigaction sa;
-	struct itimerval v;
-
-	if (progress_testing)
-		return;
-
-	progress_update = 0;
-
-	memset(&sa, 0, sizeof(sa));
-	sa.sa_handler = progress_interval;
-	sigemptyset(&sa.sa_mask);
-	sa.sa_flags = SA_RESTART;
-	sigaction(SIGALRM, &sa, NULL);
-
-	v.it_interval.tv_sec = 1;
-	v.it_interval.tv_usec = 0;
-	v.it_value = v.it_interval;
-	setitimer(ITIMER_REAL, &v, NULL);
-}
-
-static void clear_progress_signal(void)
-{
-	struct itimerval v = {{0,},};
-
-	if (progress_testing)
-		return;
-
-	setitimer(ITIMER_REAL, &v, NULL);
-	signal(SIGALRM, SIG_IGN);
-	progress_update = 0;
-}
 
 static int is_foreground_fd(int fd)
 {
@@ -249,6 +205,54 @@ void display_progress(struct progress *progress, uint64_t n)
 		display(progress, n, NULL);
 }
 
+static void progress_interval(int signum)
+{
+	progress_update = 1;
+}
+
+/*
+ * The progress_test_force_update() function is intended for testing
+ * the progress output, i.e. exclusively for 'test-tool progress'.
+ */
+void progress_test_force_update(void)
+{
+	progress_update = 1;
+}
+
+static void set_progress_signal(void)
+{
+	struct sigaction sa;
+	struct itimerval v;
+
+	if (progress_testing)
+		return;
+
+	progress_update = 0;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = progress_interval;
+	sigemptyset(&sa.sa_mask);
+	sa.sa_flags = SA_RESTART;
+	sigaction(SIGALRM, &sa, NULL);
+
+	v.it_interval.tv_sec = 1;
+	v.it_interval.tv_usec = 0;
+	v.it_value = v.it_interval;
+	setitimer(ITIMER_REAL, &v, NULL);
+}
+
+static void clear_progress_signal(void)
+{
+	struct itimerval v = {{0,},};
+
+	if (progress_testing)
+		return;
+
+	setitimer(ITIMER_REAL, &v, NULL);
+	signal(SIGALRM, SIG_IGN);
+	progress_update = 0;
+}
+
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (4 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 05/25] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
                         ` (19 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Define the progress_test_force_update() function in terms of
progress_interval(). For documentation purposes these two functions
have the same body, but different names. Let's just define the test
function by calling progress_interval() with SIGALRM ourselves.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 893cb0fe56f..7fcc513717a 100644
--- a/progress.c
+++ b/progress.c
@@ -216,7 +216,7 @@ static void progress_interval(int signum)
  */
 void progress_test_force_update(void)
 {
-	progress_update = 1;
+	progress_interval(SIGALRM);
 }
 
 static void set_progress_signal(void)
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (5 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 08/25] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
                         ` (18 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

It's the clear intention of the combination of 137a0d0ef56 (Flush
progress message buffer in display()., 2007-11-19) and
85cb8906f0e (progress: no progress in background, 2015-04-13) to call
fflush(stderr) when we have a stderr in the foreground, but we ended
up always calling fflush(stderr) seemingly by omission. Let's not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 7fcc513717a..1fade5808de 100644
--- a/progress.c
+++ b/progress.c
@@ -91,7 +91,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	}
 
 	if (show_update) {
-		if (is_foreground_fd(fileno(stderr)) || done) {
+		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
+		if (stderr_is_foreground_fd || done) {
 			const char *eol = done ? done : "\r";
 			size_t clear_len = counters_sb->len < last_count_len ?
 					last_count_len - counters_sb->len + 1 :
@@ -115,7 +116,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 				fprintf(stderr, "%s: %s%*s", progress->title,
 					counters_sb->buf, (int) clear_len, eol);
 			}
-			fflush(stderr);
+			if (stderr_is_foreground_fd)
+				fflush(stderr);
 		}
 		progress_update = 0;
 	}
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 08/25] progress.c: add temporary variable from progress struct
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (6 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 09/25] midx perf: add a perf test for multi-pack-index Ævar Arnfjörð Bjarmason
                         ` (17 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add a temporary "progress" variable for the dereferenced p_progress
pointer to a "struct progress *". Before 98a13647408 (trace2: log
progress time and throughput, 2020-05-12) we didn't dereference
"p_progress" in this function, now that we do it's easier to read the
code if we work with a "progress" struct pointer like everywhere else,
instead of a pointer to a pointer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 1fade5808de..1ab7d19deb8 100644
--- a/progress.c
+++ b/progress.c
@@ -331,15 +331,16 @@ void stop_progress(struct progress **p_progress)
 	finish_if_sparse(*p_progress);
 
 	if (*p_progress) {
+		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
 				   (*p_progress)->total);
 
 		if ((*p_progress)->throughput)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
-					   (*p_progress)->throughput->curr_total);
+					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+		trace2_region_leave("progress", progress->title, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 09/25] midx perf: add a perf test for multi-pack-index
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (7 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 08/25] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization Ævar Arnfjörð Bjarmason
                         ` (16 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add a basic write and verify performance test for the multi-pack-index
command.

The "write" is also done in a "test_expect_success" so that the
"write" perf test (which would run N times) can be skipped while
still guaranteeing that we have a midx to verify by the time we get
to the "verify" test.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/perf/p5319-multi-pack-index.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100755 t/perf/p5319-multi-pack-index.sh

diff --git a/t/perf/p5319-multi-pack-index.sh b/t/perf/p5319-multi-pack-index.sh
new file mode 100755
index 00000000000..39769602ab7
--- /dev/null
+++ b/t/perf/p5319-multi-pack-index.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+
+test_description='Test midx performance'
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'setup multi-pack-index' '
+	git multi-pack-index write
+'
+
+test_perf 'midx write' '
+	git multi-pack-index write
+'
+
+test_perf 'midx verify' '
+	git multi-pack-index verify
+'
+
+test_done
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (8 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 09/25] midx perf: add a perf test for multi-pack-index Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
                         ` (15 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Revert the code added in 9d81ecb52b5 (progress: add sparse mode to
force 100% complete message, 2019-03-21) for the "sparse" progress
mode, and change its only user added in 430efb8a74b (midx: add
progress indicators in multi-pack-index verify, 2019-03-21) to use the
normal non-sparse progress.c API instead.

The sparse mode only called display_progress() once every
SPARSE_PROGRESS_INTERVAL (2^12) objects in order to improve
performance. It does that, but only in an isolated and artificial
benchmark. In the "verify_midx_file" user we're in a loop doing
various other OID/object work, so the cost of calling
display_progress() is entirely lost in the noise.
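
As a rough illustration (not part of this patch), the sampling done by
the removed midx_display_sparse_progress() macro can be sketched on its
own. SPARSE_PROGRESS_INTERVAL and the mask test are taken from the code
being removed; the two helper functions are hypothetical names used only
for this sketch. It shows why the last value was usually never reported,
which is what finish_if_sparse() had to paper over:

```c
#include <assert.h>
#include <stdint.h>

#define SPARSE_PROGRESS_INTERVAL (1 << 12)

/*
 * Hypothetical helper: returns 1 when the removed macro would have
 * forwarded "n" to display_progress(), 0 when it would have skipped it.
 */
static int sparse_would_display(uint64_t n)
{
	return (n & (SPARSE_PROGRESS_INTERVAL - 1)) == 0;
}

/*
 * Hypothetical helper: counts how many updates get through for a loop
 * over "total" items that reports "i + 1" on every iteration.
 */
static uint64_t sparse_updates_for(uint64_t total)
{
	uint64_t i, shown = 0;

	for (i = 0; i < total; i++)
		shown += sparse_would_display(i + 1);
	return shown;
}
```

With 10000 items only the values 4096 and 8192 pass the mask, so the
final value of 10000 is never displayed unless something forces it.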

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c     | 26 +++++++-------------------
 progress.c | 38 +++-----------------------------------
 2 files changed, 10 insertions(+), 54 deletions(-)

diff --git a/midx.c b/midx.c
index 21d6a05e887..d80e68998b8 100644
--- a/midx.c
+++ b/midx.c
@@ -1186,18 +1186,6 @@ static int compare_pair_pos_vs_id(const void *_a, const void *_b)
 	return b->pack_int_id - a->pack_int_id;
 }
 
-/*
- * Limit calls to display_progress() for performance reasons.
- * The interval here was arbitrarily chosen.
- */
-#define SPARSE_PROGRESS_INTERVAL (1 << 12)
-#define midx_display_sparse_progress(progress, n) \
-	do { \
-		uint64_t _n = (n); \
-		if ((_n & (SPARSE_PROGRESS_INTERVAL - 1)) == 0) \
-			display_progress(progress, _n); \
-	} while (0)
-
 int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags)
 {
 	struct pair_pos_vs_id *pairs = NULL;
@@ -1248,8 +1236,8 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	}
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"),
-						 m->num_objects - 1);
+		progress = start_progress(_("Verifying OID order in multi-pack-index"),
+					  m->num_objects - 1);
 	for (i = 0; i < m->num_objects - 1; i++) {
 		struct object_id oid1, oid2;
 
@@ -1260,7 +1248,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 			midx_report(_("oid lookup out of order: oid[%d] = %s >= %s = oid[%d]"),
 				    i, oid_to_hex(&oid1), oid_to_hex(&oid2), i + 1);
 
-		midx_display_sparse_progress(progress, i + 1);
+		display_progress(progress, i + 1);
 	}
 	stop_progress(&progress);
 
@@ -1277,14 +1265,14 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	}
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Sorting objects by packfile"),
-						 m->num_objects);
+		progress = start_progress(_("Sorting objects by packfile"),
+					  m->num_objects);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects);
+		progress = start_progress(_("Verifying object offsets"), m->num_objects);
 	for (i = 0; i < m->num_objects; i++) {
 		struct object_id oid;
 		struct pack_entry e;
@@ -1318,7 +1306,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 			midx_report(_("incorrect object offset for oid[%d] = %s: %"PRIx64" != %"PRIx64),
 				    pairs[i].pos, oid_to_hex(&oid), m_offset, p_offset);
 
-		midx_display_sparse_progress(progress, i + 1);
+		display_progress(progress, i + 1);
 	}
 	stop_progress(&progress);
 
diff --git a/progress.c b/progress.c
index 1ab7d19deb8..912edd4c818 100644
--- a/progress.c
+++ b/progress.c
@@ -37,7 +37,6 @@ struct progress {
 	uint64_t total;
 	unsigned last_percent;
 	unsigned delay;
-	unsigned sparse;
 	struct throughput *throughput;
 	uint64_t start_ns;
 	struct strbuf counters_sb;
@@ -256,7 +255,7 @@ static void clear_progress_signal(void)
 }
 
 static struct progress *start_progress_delay(const char *title, uint64_t total,
-					     unsigned delay, unsigned sparse)
+					     unsigned delay)
 {
 	struct progress *progress = xmalloc(sizeof(*progress));
 	progress->title = title;
@@ -264,7 +263,6 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	progress->last_value = -1;
 	progress->last_percent = -1;
 	progress->delay = delay;
-	progress->sparse = sparse;
 	progress->throughput = NULL;
 	progress->start_ns = getnanotime();
 	strbuf_init(&progress->counters_sb, 0);
@@ -287,40 +285,12 @@ static int get_default_delay(void)
 
 struct progress *start_delayed_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, get_default_delay(), 0);
+	return start_progress_delay(title, total, get_default_delay());
 }
 
 struct progress *start_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, 0, 0);
-}
-
-/*
- * Here "sparse" means that the caller might use some sampling criteria to
- * decide when to call display_progress() rather than calling it for every
- * integer value in[0 .. total).  In particular, the caller might not call
- * display_progress() for the last value in the range.
- *
- * When "sparse" is set, stop_progress() will automatically force the done
- * message to show 100%.
- */
-struct progress *start_sparse_progress(const char *title, uint64_t total)
-{
-	return start_progress_delay(title, total, 0, 1);
-}
-
-struct progress *start_delayed_sparse_progress(const char *title,
-					       uint64_t total)
-{
-	return start_progress_delay(title, total, get_default_delay(), 1);
-}
-
-static void finish_if_sparse(struct progress *progress)
-{
-	if (progress &&
-	    progress->sparse &&
-	    progress->last_value != progress->total)
-		display_progress(progress, progress->total);
+	return start_progress_delay(title, total, 0);
 }
 
 void stop_progress(struct progress **p_progress)
@@ -328,8 +298,6 @@ void stop_progress(struct progress **p_progress)
 	if (!p_progress)
 		BUG("don't provide NULL to stop_progress");
 
-	finish_if_sparse(*p_progress);
-
 	if (*p_progress) {
 		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (9 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 12/25] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
                         ` (14 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
bitmap writing, 2013-12-21): we did not call stop_progress() if we
reached the early exit in this function. This will matter in a
subsequent commit where we BUG(...) out if this happens. It also
matters now, e.g. because we don't emit a corresponding
"region_leave" trace2 event for the progress region.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack-bitmap-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 88d9e696a54..6e110e41ea4 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
+		stop_progress(&writer.progress);
 		return;
 	}
 
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 12/25] progress.c: add & assert a "global_progress" variable
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (10 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 13/25] progress.[ch]: move the "struct progress" to the header Ævar Arnfjörð Bjarmason
                         ` (13 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The progress.c code makes a hard assumption that only one progress bar
is active at a time (see [1] for a bug where this wasn't the case),
but nothing has asserted that that's the case. Let's add a BUG()
that'll trigger if two progress bars are active at the same time.

There's an alternate test-only approach to doing the same thing[2],
but by doing this for all progress bars we'll have a canary to check
if we have any unexpected interaction between the "sig_atomic_t
progress_update" variable and this global struct.

I am then planning on using this scaffolding in the future to fix a
limitation in the progress output, namely the current limitation of
the progress.c bar code that any update must pro-actively go through
the likes of display_progress().

If we e.g. hang forever before the first display_progress(), or in the
middle of a loop that would call display_progress() the user will only
see either no output, or output frozen at the last display_progress()
that would have done an update (e.g. in cases where progress_update
was "1" due to an earlier signal).

This change does not fix that, but sets up the structure for solving
that and other related problems by juggling this "global_progress"
struct. Later changes will make more use of the "global_progress" than
only using it for these assertions.
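
As a minimal standalone sketch of the invariant (an illustration only,
not the code in this patch): all names below are hypothetical, and the
helpers return -1 where the real code would call BUG(), so that the
behavior can be checked without aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "struct progress". */
struct progress_sketch {
	const char *title;
};

/* At most one progress bar may be registered here at any time. */
static struct progress_sketch *global_progress_sketch;

/* Returns 0 on success, -1 where the patch would BUG() out. */
static int sketch_set(struct progress_sketch *progress)
{
	if (global_progress_sketch)
		return -1; /* a second bar was started before the first stopped */
	global_progress_sketch = progress;
	return 0;
}

/* Returns 0 on success, -1 where the patch would BUG() out. */
static int sketch_clear(void)
{
	if (!global_progress_sketch)
		return -1; /* stop without a matching start */
	global_progress_sketch = NULL;
	return 0;
}
```

Starting "one" and then "two" without a stop in between fails on the
second start, which mirrors the new t0500 test.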

1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 17 +++++++++++++----
 t/t0500-progress-display.sh | 11 +++++++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 912edd4c818..e1b50ef7882 100644
--- a/progress.c
+++ b/progress.c
@@ -45,6 +45,7 @@ struct progress {
 };
 
 static volatile sig_atomic_t progress_update;
+static struct progress *global_progress;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -220,11 +221,15 @@ void progress_test_force_update(void)
 	progress_interval(SIGALRM);
 }
 
-static void set_progress_signal(void)
+static void set_progress_signal(struct progress *progress)
 {
 	struct sigaction sa;
 	struct itimerval v;
 
+	if (global_progress)
+		BUG("should have no global_progress in set_progress_signal()");
+	global_progress = progress;
+
 	if (progress_testing)
 		return;
 
@@ -242,10 +247,14 @@ static void set_progress_signal(void)
 	setitimer(ITIMER_REAL, &v, NULL);
 }
 
-static void clear_progress_signal(void)
+static void clear_progress_signal(struct progress *progress)
 {
 	struct itimerval v = {{0,},};
 
+	if (!global_progress)
+		BUG("should have a global_progress in clear_progress_signal()");
+	global_progress = NULL;
+
 	if (progress_testing)
 		return;
 
@@ -268,7 +277,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
-	set_progress_signal();
+	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
@@ -342,7 +351,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, buf);
 		free(buf);
 	}
-	clear_progress_signal();
+	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 66c1989b176..476a31222a3 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -361,4 +361,15 @@ test_expect_success 'progress generates traces: stop without start' '
 	! grep region_leave.*progress trace-stop.event
 '
 
+test_expect_success 'BUG: start two concurrent progress bars' '
+	cat >in <<-\EOF &&
+	start 0 one
+	start 0 two
+	EOF
+
+	test_must_fail test-tool progress \
+		<in 2>stderr &&
+	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 13/25] progress.[ch]: move the "struct progress" to the header
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (11 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 12/25] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables Ævar Arnfjörð Bjarmason
                         ` (12 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Move the definition of the "struct progress" to the progress.h
header. Even though its contents are meant to be "private" this
pattern has resulted in forward declarations of it in various places,
as other functions have a need to pass it around.

Let's just define it in the header instead. It's part of our own
internal code, so we're not at much risk of someone tweaking the
internal fields manually. While doing that rename the "TP_IDX_MAX"
macro to the more clearly namespaced "PROGRESS_THROUGHPUT_IDX_MAX".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h             |  1 -
 csum-file.h         |  2 --
 pack.h              |  1 -
 parallel-checkout.h |  1 -
 progress.c          | 29 +----------------------------
 progress.h          | 28 +++++++++++++++++++++++++++-
 reachable.h         |  1 -
 7 files changed, 28 insertions(+), 35 deletions(-)

diff --git a/cache.h b/cache.h
index ba04ff8bd36..7e03a181f68 100644
--- a/cache.h
+++ b/cache.h
@@ -308,7 +308,6 @@ static inline unsigned int canon_mode(unsigned int mode)
 
 struct split_index;
 struct untracked_cache;
-struct progress;
 struct pattern_list;
 
 struct index_state {
diff --git a/csum-file.h b/csum-file.h
index 3044bd19ab6..3de0de653e8 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -3,8 +3,6 @@
 
 #include "hash.h"
 
-struct progress;
-
 /* A SHA1-protected file */
 struct hashfile {
 	int fd;
diff --git a/pack.h b/pack.h
index fa139545262..8df04f4937a 100644
--- a/pack.h
+++ b/pack.h
@@ -77,7 +77,6 @@ struct pack_idx_entry {
 };
 
 
-struct progress;
 /* Note, the data argument could be NULL if object type is blob */
 typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned long, void*, int*);
 
diff --git a/parallel-checkout.h b/parallel-checkout.h
index 80f539bcb77..193f76398d6 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -5,7 +5,6 @@
 
 struct cache_entry;
 struct checkout;
-struct progress;
 
 /****************************************************************
  * Users of parallel checkout
diff --git a/progress.c b/progress.c
index e1b50ef7882..aff9af9ee8b 100644
--- a/progress.c
+++ b/progress.c
@@ -17,33 +17,6 @@
 #include "utf8.h"
 #include "config.h"
 
-#define TP_IDX_MAX      8
-
-struct throughput {
-	off_t curr_total;
-	off_t prev_total;
-	uint64_t prev_ns;
-	unsigned int avg_bytes;
-	unsigned int avg_misecs;
-	unsigned int last_bytes[TP_IDX_MAX];
-	unsigned int last_misecs[TP_IDX_MAX];
-	unsigned int idx;
-	struct strbuf display;
-};
-
-struct progress {
-	const char *title;
-	uint64_t last_value;
-	uint64_t total;
-	unsigned last_percent;
-	unsigned delay;
-	struct throughput *throughput;
-	uint64_t start_ns;
-	struct strbuf counters_sb;
-	int title_len;
-	int split;
-};
-
 static volatile sig_atomic_t progress_update;
 static struct progress *global_progress;
 
@@ -194,7 +167,7 @@ void display_throughput(struct progress *progress, uint64_t total)
 	tp->avg_misecs -= tp->last_misecs[tp->idx];
 	tp->last_bytes[tp->idx] = count;
 	tp->last_misecs[tp->idx] = misecs;
-	tp->idx = (tp->idx + 1) % TP_IDX_MAX;
+	tp->idx = (tp->idx + 1) % PROGRESS_THROUGHPUT_IDX_MAX;
 
 	throughput_string(&tp->display, total, rate);
 	if (progress->last_value != -1 && progress_update)
diff --git a/progress.h b/progress.h
index f1913acf73f..4fb2b483d36 100644
--- a/progress.h
+++ b/progress.h
@@ -1,7 +1,33 @@
 #ifndef PROGRESS_H
 #define PROGRESS_H
+#include "strbuf.h"
 
-struct progress;
+#define PROGRESS_THROUGHPUT_IDX_MAX      8
+
+struct throughput {
+	off_t curr_total;
+	off_t prev_total;
+	uint64_t prev_ns;
+	unsigned int avg_bytes;
+	unsigned int avg_misecs;
+	unsigned int last_bytes[PROGRESS_THROUGHPUT_IDX_MAX];
+	unsigned int last_misecs[PROGRESS_THROUGHPUT_IDX_MAX];
+	unsigned int idx;
+	struct strbuf display;
+};
+
+struct progress {
+	const char *title;
+	uint64_t last_value;
+	uint64_t total;
+	unsigned last_percent;
+	unsigned delay;
+	struct throughput *throughput;
+	uint64_t start_ns;
+	struct strbuf counters_sb;
+	int title_len;
+	int split;
+};
 
 #ifdef GIT_TEST_PROGRESS_ONLY
 
diff --git a/reachable.h b/reachable.h
index 5df932ad8f5..7e1ddddbc63 100644
--- a/reachable.h
+++ b/reachable.h
@@ -1,7 +1,6 @@
 #ifndef REACHEABLE_H
 #define REACHEABLE_H
 
-struct progress;
 struct rev_info;
 
 int add_unseen_recent_objects_to_traversal(struct rev_info *revs,
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (12 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 13/25] progress.[ch]: move the "struct progress" to the header Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 15/25] progress.c: pass "is done?" (again) to display() Ævar Arnfjörð Bjarmason
                         ` (11 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Since the test-only support code was added in 2bb74b53a49 (Test the
progress display, 2019-09-16) we've had to define
GIT_TEST_PROGRESS_ONLY more widely as part of the bugfix in
3cacb9aaf46 (progress.c: silence cgcc suggestion about internal
linkage, 2020-04-27).

So the only thing we were getting out of this indirection was keeping
GIT_TEST_PROGRESS_ONLY from being defined in progress.h itself,
i.e. so the likes of csum-file.h wouldn't have access to them, we'd
still compile them in progress.o.

Let's just always define and compile them without this needless sleight
of hand. The linking and strip step will take care of removing these
unused symbols, if needed.

We now expose a start_progress_testing() function instead, which'll
set a "test_mode" member, which the rest of the code can check.
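
The idea of keeping test state in the struct rather than in "extern"
globals can be sketched roughly as follows. This is an illustration
under stated assumptions, not the code in this patch: the struct, the
field names and real_clock_stub() are all hypothetical, and the sketch
keys the fake clock off the "test mode" flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant "struct progress" members. */
struct clock_sketch {
	uint64_t start_ns;
	int test_mode;           /* set once, when the bar is started */
	uint64_t test_offset_ns; /* fake elapsed time, only used in test mode */
};

/* Hypothetical stand-in for a real monotonic clock. */
static uint64_t real_clock_stub(void)
{
	return 123456789;
}

/* Per-instance clock: fake in test mode, real otherwise. */
static uint64_t sketch_now(const struct clock_sketch *p)
{
	if (p->test_mode)
		return p->start_ns + p->test_offset_ns;
	return real_clock_stub();
}
```

Because the flag lives in the instance, a test helper can drive the
clock of its own progress bar without touching any global state.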

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c               | 34 ++++++++++++++--------------------
 progress.h               | 21 ++++++++++++++-------
 t/helper/test-progress.c | 11 +++++------
 3 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/progress.c b/progress.c
index aff9af9ee8b..39d7f6bd86b 100644
--- a/progress.c
+++ b/progress.c
@@ -8,7 +8,6 @@
  * published by the Free Software Foundation.
  */
 
-#define GIT_TEST_PROGRESS_ONLY
 #include "cache.h"
 #include "gettext.h"
 #include "progress.h"
@@ -20,13 +19,6 @@
 static volatile sig_atomic_t progress_update;
 static struct progress *global_progress;
 
-/*
- * These are only intended for testing the progress output, i.e. exclusively
- * for 'test-tool progress'.
- */
-int progress_testing;
-uint64_t progress_test_ns = 0;
-
 static int is_foreground_fd(int fd)
 {
 	int tpgrp = tcgetpgrp(fd);
@@ -108,8 +100,8 @@ static void throughput_string(struct strbuf *buf, uint64_t total,
 
 static uint64_t progress_getnanotime(struct progress *progress)
 {
-	if (progress_testing)
-		return progress->start_ns + progress_test_ns;
+	if (progress->test_getnanotime)
+		return progress->start_ns + progress->test_getnanotime;
 	else
 		return getnanotime();
 }
@@ -185,11 +177,7 @@ static void progress_interval(int signum)
 	progress_update = 1;
 }
 
-/*
- * The progress_test_force_update() function is intended for testing
- * the progress output, i.e. exclusively for 'test-tool progress'.
- */
-void progress_test_force_update(void)
+void test_progress_force_update(void)
 {
 	progress_interval(SIGALRM);
 }
@@ -203,7 +191,7 @@ static void set_progress_signal(struct progress *progress)
 		BUG("should have no global_progress in set_progress_signal()");
 	global_progress = progress;
 
-	if (progress_testing)
+	if (progress->test_mode)
 		return;
 
 	progress_update = 0;
@@ -228,7 +216,7 @@ static void clear_progress_signal(struct progress *progress)
 		BUG("should have a global_progress in clear_progress_signal()");
 	global_progress = NULL;
 
-	if (progress_testing)
+	if (progress->test_mode)
 		return;
 
 	setitimer(ITIMER_REAL, &v, NULL);
@@ -237,7 +225,7 @@ static void clear_progress_signal(struct progress *progress)
 }
 
 static struct progress *start_progress_delay(const char *title, uint64_t total,
-					     unsigned delay)
+					     unsigned delay, int testing)
 {
 	struct progress *progress = xmalloc(sizeof(*progress));
 	progress->title = title;
@@ -250,11 +238,17 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
+	progress->test_mode = testing;
 	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
 
+struct progress *start_progress_testing(const char *title, uint64_t total)
+{
+	return start_progress_delay(title, total, 0, 1);
+}
+
 static int get_default_delay(void)
 {
 	static int delay_in_secs = -1;
@@ -267,12 +261,12 @@ static int get_default_delay(void)
 
 struct progress *start_delayed_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, get_default_delay());
+	return start_progress_delay(title, total, get_default_delay(), 0);
 }
 
 struct progress *start_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, 0);
+	return start_progress_delay(title, total, 0, 0);
 }
 
 void stop_progress(struct progress **p_progress)
diff --git a/progress.h b/progress.h
index 4fb2b483d36..4693dddb6c5 100644
--- a/progress.h
+++ b/progress.h
@@ -27,15 +27,22 @@ struct progress {
 	struct strbuf counters_sb;
 	int title_len;
 	int split;
-};
-
-#ifdef GIT_TEST_PROGRESS_ONLY
 
-extern int progress_testing;
-extern uint64_t progress_test_ns;
-void progress_test_force_update(void);
+	/*
+	 * The test_* members are only intended for testing the
+	 * progress output, i.e. exclusively for 'test-tool progress'.
+	 */
+	int test_mode;
+	uint64_t test_getnanotime;
+};
 
-#endif
+/*
+ * *_testing() functions are only for use in
+ * t/helper/test-progress.c. Do not use them elsewhere!
+ */
+void test_progress_force_update(void);
+struct progress *start_progress_testing(const char *title, uint64_t total);
+void test_progress_setnanotime(struct progress *progress, uint64_t time);
 
 void display_throughput(struct progress *progress, uint64_t total);
 void display_progress(struct progress *progress, uint64_t n);
diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 7ca58a3ee78..40dbacb0557 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -46,21 +46,20 @@ int cmd__progress(int argc, const char **argv)
 	if (argc)
 		usage_with_options(usage, options);
 
-	progress_testing = 1;
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
 		if (!strcmp(line.buf, "start")) {
-			progress = start_progress(default_title, 0);
+			progress = start_progress_testing(default_title, 0);
 		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
 			uint64_t total = strtoull(end, &end, 10);
 			if (*end == '\0') {
-				progress = start_progress(default_title, total);
+				progress = start_progress_testing(default_title, total);
 			} else if (*end == ' ') {
 				if (detached_title)
 					free(detached_title);
 				detached_title = strbuf_detach(&line, NULL);
-				progress = start_progress(end + 1, total);
+				progress = start_progress_testing(end + 1, total);
 			} else {
 				die("invalid input: '%s'\n", line.buf);
 			}
@@ -79,11 +78,11 @@ int cmd__progress(int argc, const char **argv)
 			test_ms = strtoull(end + 1, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
-			progress_test_ns = test_ms * 1000 * 1000;
+			progress->test_getnanotime = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
 		} else if (!strcmp(line.buf, "update") ||
 			   !strcmp(line.buf, "signal")) {
-			progress_test_force_update();
+			test_progress_force_update();
 		} else if (!strcmp(line.buf, "stop")) {
 			stop_progress(&progress);
 		} else {
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 15/25] progress.c: pass "is done?" (again) to display()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (13 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf" Ævar Arnfjörð Bjarmason
                         ` (10 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Go back to passing an "are we done?" state variable to the display()
function, instead of passing a string that happens to end in a newline
for the ", done\n" special-case in stop_progress().

This doesn't matter now, but is needed to display an arbitrary message
earlier in the progress display, not just at the very end.

In a984a06a07c (nicer display of thin pack completion, 2007-11-08)
this code worked like this, but later on in 42e18fbf5f9 (more compact
progress display, 2007-10-16) we ended up with the "const
char *done". Then in d53ba841d4f (progress: assemble percentage and
counters in a strbuf before printing, 2019-04-05) we ended up with the
current code structure around the "counters_sb" strbuf.

The "counters_sb" is needed because when we emit a line like:

    Title (1/10)<CR>

We need to know how many characters the " (1/10)" variable part is, so
that we'll emit the appropriate number of spaces to "clear" the line.

If we want to emit output like:

    Title (1/10), some message<CR>

We'll need to stick the whole " (1/10), some message" part into the
strbuf, so that if we want to clear the message we'll know to emit:

    Title (1/10), some message<CR>
    Title (2/10)              <CR>

This didn't matter for the ", done\n" case because we were ending the
process anyway, but in preparation for the above let's start treating
it like any other line, and pass an "int last_update" to decide
whether the line ends with a "\r" or a "\n".
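
The padding arithmetic can be sketched in isolation as below. The
"clear_len" expression and the "%s: %s%*s" format mirror what display()
does in progress.c; the helper names and the example strings are
hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Field width for the end-of-line marker: when the new counters are
 * shorter than the previous ones, pad so the old tail is overwritten.
 */
static size_t clear_len_for(size_t prev_len, size_t cur_len)
{
	return cur_len < prev_len ? prev_len - cur_len + 1 : 0;
}

/*
 * Hypothetical helper: formats one progress line into "out". The "%*s"
 * right-justifies the one-character eol in a field of clear_len, i.e.
 * it emits (clear_len - 1) spaces followed by "\r" or "\n".
 */
static int format_line(char *out, size_t outsz, const char *title,
		       const char *counters, size_t prev_len, int last_update)
{
	const char *eol = last_update ? "\n" : "\r";
	size_t clear_len = clear_len_for(prev_len, strlen(counters));

	return snprintf(out, outsz, "%s: %s%*s", title, counters,
			(int)clear_len, eol);
}
```

Going from counters "1/10, some message" to "2/10" yields a line that
is exactly as wide as the previous one before its terminating "\r", so
no stale characters are left on the terminal.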

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/progress.c b/progress.c
index 39d7f6bd86b..44479f65921 100644
--- a/progress.c
+++ b/progress.c
@@ -25,7 +25,8 @@ static int is_foreground_fd(int fd)
 	return tpgrp < 0 || tpgrp == getpgid(0);
 }
 
-static void display(struct progress *progress, uint64_t n, const char *done)
+static void display(struct progress *progress, uint64_t n,
+		    const char *update_msg, int last_update)
 {
 	const char *tp;
 	struct strbuf *counters_sb = &progress->counters_sb;
@@ -55,10 +56,13 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 		show_update = 1;
 	}
 
+	if (show_update && update_msg)
+		strbuf_addf(counters_sb, ", %s.", update_msg);
+
 	if (show_update) {
 		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
-		if (stderr_is_foreground_fd || done) {
-			const char *eol = done ? done : "\r";
+		if (stderr_is_foreground_fd || update_msg) {
+			const char *eol = last_update ? "\n" : "\r";
 			size_t clear_len = counters_sb->len < last_count_len ?
 					last_count_len - counters_sb->len + 1 :
 					0;
@@ -70,7 +74,7 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 			if (progress->split) {
 				fprintf(stderr, "  %s%*s", counters_sb->buf,
 					(int) clear_len, eol);
-			} else if (!done && cols < progress_line_len) {
+			} else if (!update_msg && cols < progress_line_len) {
 				clear_len = progress->title_len + 1 < cols ?
 					    cols - progress->title_len - 1 : 0;
 				fprintf(stderr, "%s:%*s\n  %s%s",
@@ -163,13 +167,13 @@ void display_throughput(struct progress *progress, uint64_t total)
 
 	throughput_string(&tp->display, total, rate);
 	if (progress->last_value != -1 && progress_update)
-		display(progress, progress->last_value, NULL);
+		display(progress, progress->last_value, NULL, 0);
 }
 
 void display_progress(struct progress *progress, uint64_t n)
 {
 	if (progress)
-		display(progress, n, NULL);
+		display(progress, n, NULL, 0);
 }
 
 static void progress_interval(int signum)
@@ -303,7 +307,6 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	*p_progress = NULL;
 	if (progress->last_value != -1) {
 		/* Force the last update */
-		char *buf;
 		struct throughput *tp = progress->throughput;
 
 		if (tp) {
@@ -314,9 +317,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 			throughput_string(&tp->display, tp->curr_total, rate);
 		}
 		progress_update = 1;
-		buf = xstrfmt(", %s.\n", msg);
-		display(progress, progress->last_value, buf);
-		free(buf);
+		display(progress, progress->last_value, msg, 1);
 	}
 	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf"
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (14 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 15/25] progress.c: pass "is done?" (again) to display() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug Ævar Arnfjörð Bjarmason
                         ` (9 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Convert the "title" for the progress bar to a "struct strbuf", as with
the existing "counters_sb". Let's also rename the "counters_sb" to
merely "status", as we'll soon start using it not just to count, but
for any other arbitrary messaging after our fixed "title".

This makes emitting the output more consistent, and allows us to
have both a UTF-8 progress bar, and a "status" portion. We won't be
making use of the latter just yet, but let's not close the door to it
by relying on a strbuf with a len for one, and a char * for the other.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 63 ++++++++++++++++++++++++++++++++----------------------
 progress.h |  9 +++++---
 2 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/progress.c b/progress.c
index 44479f65921..e17490964c4 100644
--- a/progress.c
+++ b/progress.c
@@ -29,9 +29,8 @@ static void display(struct progress *progress, uint64_t n,
 		    const char *update_msg, int last_update)
 {
 	const char *tp;
-	struct strbuf *counters_sb = &progress->counters_sb;
 	int show_update = 0;
-	int last_count_len = counters_sb->len;
+	size_t last_count_len = progress->status_len_utf8;
 
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
@@ -43,47 +42,57 @@ static void display(struct progress *progress, uint64_t n,
 		if (percent != progress->last_percent || progress_update) {
 			progress->last_percent = percent;
 
-			strbuf_reset(counters_sb);
-			strbuf_addf(counters_sb,
+			strbuf_reset(&progress->status);
+			strbuf_addf(&progress->status,
 				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
 				    (uintmax_t)n, (uintmax_t)progress->total,
 				    tp);
 			show_update = 1;
 		}
 	} else if (progress_update) {
-		strbuf_reset(counters_sb);
-		strbuf_addf(counters_sb, "%"PRIuMAX"%s", (uintmax_t)n, tp);
+		strbuf_reset(&progress->status);
+		strbuf_addf(&progress->status, "%"PRIuMAX"%s", (uintmax_t)n, tp);
 		show_update = 1;
 	}
 
 	if (show_update && update_msg)
-		strbuf_addf(counters_sb, ", %s.", update_msg);
+		strbuf_addf(&progress->status, ", %s.", update_msg);
 
 	if (show_update) {
 		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
 		if (stderr_is_foreground_fd || update_msg) {
 			const char *eol = last_update ? "\n" : "\r";
-			size_t clear_len = counters_sb->len < last_count_len ?
-					last_count_len - counters_sb->len + 1 :
+			size_t clear_len = progress->status.len < last_count_len ?
+					last_count_len - progress->status.len + 1 :
 					0;
 			/* The "+ 2" accounts for the ": ". */
-			size_t progress_line_len = progress->title_len +
-						counters_sb->len + 2;
+			size_t progress_line_len = progress->title_len_utf8 +
+						progress->status.len + 2;
 			int cols = term_columns();
+			progress->status_len_utf8 = utf8_strwidth(progress->status.buf);
 
 			if (progress->split) {
-				fprintf(stderr, "  %s%*s", counters_sb->buf,
-					(int) clear_len, eol);
+				fprintf(stderr, "  %*s%*s",
+					(int)progress->status_len_utf8,
+					progress->status.buf,
+					(int)clear_len, eol);
 			} else if (!update_msg && cols < progress_line_len) {
-				clear_len = progress->title_len + 1 < cols ?
-					    cols - progress->title_len - 1 : 0;
-				fprintf(stderr, "%s:%*s\n  %s%s",
-					progress->title, (int) clear_len, "",
-					counters_sb->buf, eol);
+				clear_len = progress->title_len_utf8 + 1 < cols ?
+					    cols - progress->title_len_utf8 - 1 : 0;
+				fprintf(stderr, "%*s:%*s\n  %*s%s",
+					(int)progress->title_len_utf8,
+					progress->title.buf,
+					(int)clear_len, "",
+					(int)progress->status_len_utf8,
+					progress->status.buf, eol);
 				progress->split = 1;
 			} else {
-				fprintf(stderr, "%s: %s%*s", progress->title,
-					counters_sb->buf, (int) clear_len, eol);
+				fprintf(stderr, "%*s: %*s%*s",
+					(int)progress->title_len_utf8,
+					progress->title.buf,
+					(int)progress->status_len_utf8,
+					progress->status.buf,
+					(int)clear_len, eol);
 			}
 			if (stderr_is_foreground_fd)
 				fflush(stderr);
@@ -232,15 +241,18 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, int testing)
 {
 	struct progress *progress = xmalloc(sizeof(*progress));
-	progress->title = title;
+	strbuf_init(&progress->title, 0);
+	strbuf_addstr(&progress->title, title);
+	progress->title_len_utf8 = utf8_strwidth(title);
+	strbuf_init(&progress->status, 0);
+	progress->status_len_utf8 = 0;
+
 	progress->total = total;
 	progress->last_value = -1;
 	progress->last_percent = -1;
 	progress->delay = delay;
 	progress->throughput = NULL;
 	progress->start_ns = getnanotime();
-	strbuf_init(&progress->counters_sb, 0);
-	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
 	progress->test_mode = testing;
 	set_progress_signal(progress);
@@ -288,7 +300,7 @@ void stop_progress(struct progress **p_progress)
 					   "total_bytes",
 					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", progress->title, the_repository);
+		trace2_region_leave("progress", progress->title.buf, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
@@ -320,7 +332,8 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, msg, 1);
 	}
 	clear_progress_signal(progress);
-	strbuf_release(&progress->counters_sb);
+	strbuf_release(&progress->title);
+	strbuf_release(&progress->status);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
 	free(progress->throughput);
diff --git a/progress.h b/progress.h
index 4693dddb6c5..ba38447d104 100644
--- a/progress.h
+++ b/progress.h
@@ -17,15 +17,18 @@ struct throughput {
 };
 
 struct progress {
-	const char *title;
+	struct strbuf title;
+	size_t title_len_utf8;
+
+	struct strbuf status;
+	size_t status_len_utf8;
+
 	uint64_t last_value;
 	uint64_t total;
 	unsigned last_percent;
 	unsigned delay;
 	struct throughput *throughput;
 	uint64_t start_ns;
-	struct strbuf counters_sb;
-	int title_len;
 	int split;
 
 	/*
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (15 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf" Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 18/25] progress.c: emit progress on first signal, show "stalled" Ævar Arnfjörð Bjarmason
                         ` (8 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

As tested for in 2bb74b53a49 (Test the progress display, 2019-09-16)
we would redundantly emit extra spaces to clear output we never
emitted under the split mode. Now we'll always clear precisely as many
columns as we need, and no more.

The root cause of that issue is that since the progress code was
originally written we've grown support for various new features, and
ended up with a function where we didn't build the output we were
about to emit once, and then emitted it.

We thus couldn't easily track the length of the output we really did
emit, with everything going downhill from there.

The alternative approach is longer (largely due to added comments),
but I think much clearer.

We no longer rely on magic constants like "2" for ": " or "
" (although we do still rely on the two separators being the same
length, but now have a related BUG(...) assertion).

We don't update "status_len_utf8" (or rather, the now-gone
"last_count_len") or "progress->last_value" until after we've emitted
all the output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 137 +++++++++++++++++++++++++++---------
 t/t0500-progress-display.sh |   8 +--
 2 files changed, 104 insertions(+), 41 deletions(-)

diff --git a/progress.c b/progress.c
index e17490964c4..6c4038df791 100644
--- a/progress.c
+++ b/progress.c
@@ -25,17 +25,24 @@ static int is_foreground_fd(int fd)
 	return tpgrp < 0 || tpgrp == getpgid(0);
 }
 
+static const char *counter_prefix(int split)
+{
+	switch (split) {
+	case 1: return "  ";
+	case 0: return ": ";
+	default: BUG("unknown split value");
+	}
+}
+
 static void display(struct progress *progress, uint64_t n,
 		    const char *update_msg, int last_update)
 {
 	const char *tp;
 	int show_update = 0;
-	size_t last_count_len = progress->status_len_utf8;
 
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
-	progress->last_value = n;
 	tp = (progress->throughput) ? progress->throughput->display.buf : "";
 	if (progress->total) {
 		unsigned percent = n * 100 / progress->total;
@@ -44,61 +51,121 @@ static void display(struct progress *progress, uint64_t n,
 
 			strbuf_reset(&progress->status);
 			strbuf_addf(&progress->status,
-				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
+				    "%s%3u%% (%"PRIuMAX"/%"PRIuMAX")%s",
+				    counter_prefix(progress->split), percent,
 				    (uintmax_t)n, (uintmax_t)progress->total,
 				    tp);
 			show_update = 1;
 		}
 	} else if (progress_update) {
 		strbuf_reset(&progress->status);
-		strbuf_addf(&progress->status, "%"PRIuMAX"%s", (uintmax_t)n, tp);
+		strbuf_addf(&progress->status, "%s%"PRIuMAX"%s", counter_prefix(progress->split),
+			    (uintmax_t)n, tp);
 		show_update = 1;
 	}
 
 	if (show_update && update_msg)
-		strbuf_addf(&progress->status, ", %s.", update_msg);
+		strbuf_addstr(&progress->status, update_msg);
 
 	if (show_update) {
 		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
 		if (stderr_is_foreground_fd || update_msg) {
 			const char *eol = last_update ? "\n" : "\r";
-			size_t clear_len = progress->status.len < last_count_len ?
-					last_count_len - progress->status.len + 1 :
-					0;
-			/* The "+ 2" accounts for the ": ". */
-			size_t progress_line_len = progress->title_len_utf8 +
-						progress->status.len + 2;
-			int cols = term_columns();
-			progress->status_len_utf8 = utf8_strwidth(progress->status.buf);
-
-			if (progress->split) {
-				fprintf(stderr, "  %*s%*s",
-					(int)progress->status_len_utf8,
-					progress->status.buf,
-					(int)clear_len, eol);
-			} else if (!update_msg && cols < progress_line_len) {
-				clear_len = progress->title_len_utf8 + 1 < cols ?
-					    cols - progress->title_len_utf8 - 1 : 0;
-				fprintf(stderr, "%*s:%*s\n  %*s%s",
-					(int)progress->title_len_utf8,
-					progress->title.buf,
-					(int)clear_len, "",
-					(int)progress->status_len_utf8,
-					progress->status.buf, eol);
+			size_t status_len_utf8 = utf8_strwidth(progress->status.buf);
+			size_t progress_line_len = progress->title_len_utf8 + status_len_utf8;
+
+			/*
+			 * We're back at the beginning, so we'll
+			 * always print out the title, unless we're
+			 * already split, then the title is on an
+			 * earlier line.
+			 */
+			if (!progress->split)
+				fprintf(stderr, "%*s",
+					(int)(progress->title_len_utf8),
+					progress->title.buf);
+
+			/*
+			 * Did the user resize the terminal and we're
+			 * splitting this progress bar? Clear previous
+			 * ": (X/Y) [msg]"
+			 */
+			if (!progress->split &&
+			    term_columns() < progress_line_len) {
+				const char *split_prefix = counter_prefix(0);
+				const char *unsplit_prefix = counter_prefix(1);
+				const char *split_colon = ":";
 				progress->split = 1;
+
+				if (progress->last_value == -1) {
+					/*
+					 * We've got no previous
+					 * output whatsoever, so we
+					 * were "always split". No
+					 * previous status output to
+					 * erase.
+					 */
+					fprintf(stderr, "%s\n", split_colon);
+				} else {
+					const char *split_colon = ":";
+					const size_t split_colon_len = strlen(split_colon);
+
+					/*
+					 * Erase whatever we had, adding a
+					 * trailing ":" (not ": ") to indicate
+					 * the progress on the next line.
+					 */
+					fprintf(stderr, "%s%*s\n", split_colon,
+						(int)(progress->status_len_utf8 - split_colon_len),
+						"");
+				}
+
+				/*
+				 * For the one-off switching from
+				 * "!progress->split" to
+				 * "progress->split" fake up the
+				 * expected strbuf and replace the ":
+				 * " with a " ".
+				 *
+				 * The length of the two delimiters
+				 * must be the same for this trick to
+				 * work.
+				 */
+				if (!starts_with(progress->status.buf, split_prefix))
+					BUG("switching from already true split mode to split mode?");
+
+				strbuf_splice(&progress->status, 0,
+					      strlen(split_prefix),
+					      unsplit_prefix,
+					      strlen(unsplit_prefix));
+
+				fprintf(stderr, "%*s%s", (int)status_len_utf8,
+					progress->status.buf, eol);
 			} else {
-				fprintf(stderr, "%*s: %*s%*s",
-					(int)progress->title_len_utf8,
-					progress->title.buf,
-					(int)progress->status_len_utf8,
-					progress->status.buf,
-					(int)clear_len, eol);
+				/*
+				 * Our current
+				 * message may be larger or smaller than the
+				 * last one. Either the progress bar went
+				 * backwards (smaller numbers), or we went back
+				 * and forth with a status message.
+				 */
+				size_t clear_len = progress->status_len_utf8 > status_len_utf8
+					? progress->status_len_utf8 - status_len_utf8
+					: 0;
+				fprintf(stderr, "%*s%*s%s",
+					(int) status_len_utf8, progress->status.buf,
+					(int) clear_len, "",
+					eol);
 			}
+			progress->status_len_utf8 = status_len_utf8;
+
 			if (stderr_is_foreground_fd)
 				fflush(stderr);
 		}
 		progress_update = 0;
 	}
+	progress->last_value = n;
+
 }
 
 static void throughput_string(struct strbuf *buf, uint64_t total,
@@ -303,7 +370,7 @@ void stop_progress(struct progress **p_progress)
 		trace2_region_leave("progress", progress->title.buf, the_repository);
 	}
 
-	stop_progress_msg(p_progress, _("done"));
+	stop_progress_msg(p_progress, _(", done."));
 }
 
 void stop_progress_msg(struct progress **p_progress, const char *msg)
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 476a31222a3..883e044fe64 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -85,12 +85,10 @@ EOF
 '
 
 test_expect_success 'progress display breaks long lines #2' '
-	# Note: we do not need that many spaces after the title to cover up
-	# the last line before breaking the progress line.
 	sed -e "s/Z$//" >expect <<\EOF &&
 Working hard.......2.........3.........4.........5.........6:   0% (1/100000)<CR>
 Working hard.......2.........3.........4.........5.........6:   0% (2/100000)<CR>
-Working hard.......2.........3.........4.........5.........6:                   Z
+Working hard.......2.........3.........4.........5.........6:                Z
    10% (10000/100000)<CR>
   100% (100000/100000)<CR>
   100% (100000/100000), done.
@@ -112,10 +110,8 @@ EOF
 '
 
 test_expect_success 'progress display breaks long lines #3 - even the first is too long' '
-	# Note: we do not actually need any spaces at the end of the title
-	# line, because there is no previous progress line to cover up.
 	sed -e "s/Z$//" >expect <<\EOF &&
-Working hard.......2.........3.........4.........5.........6:                   Z
+Working hard.......2.........3.........4.........5.........6:
    25% (25000/100000)<CR>
    50% (50000/100000)<CR>
    75% (75000/100000)<CR>
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 18/25] progress.c: emit progress on first signal, show "stalled"
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (16 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
                         ` (7 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Ever since the progress.c code was added in 96a02f8f6d2 (common
progress display support, 2007-04-18) we have been driven purely by
calls to the display() function (via the public display_progress()),
or via stop_progress(). Even though we got a signal and invoked
progress_interval() that function would not actually emit progress
output for us.

Thus in cases like "git gc" we didn't emit any "Enumerating Objects"
output until we got past the setup code and started enumerating
objects. We'll now (at least on my laptop) show output earlier, and
emit a "stalled" message before we start the count.

But more generally, this is a first step towards never showing a
hanging progress bar from the user's perspective. If we're truly
taking a very long time with one item we can show some spinner that we
update every time we get a signal. We don't right now, and only
special-case the most common case of hanging before we get to the
first item.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  |  7 +++++
 t/t0500-progress-display.sh | 63 ++++++++++++++++++++++++++++++++++---
 2 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 6c4038df791..35847d3a7f2 100644
--- a/progress.c
+++ b/progress.c
@@ -255,6 +255,13 @@ void display_progress(struct progress *progress, uint64_t n)
 static void progress_interval(int signum)
 {
 	progress_update = 1;
+
+	if (global_progress->last_value != -1)
+		return;
+
+	display(global_progress, 0, _(", stalled."), 0);
+	progress_update = 1;
+	return;
 }
 
 void test_progress_force_update(void)
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 883e044fe64..bc458cfc28b 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -15,7 +15,8 @@ test_expect_success 'setup COLUMNS' '
 
 test_expect_success 'simple progress display' '
 	cat >expect <<-\EOF &&
-	Working hard: 1<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 1          <CR>
 	Working hard: 2<CR>
 	Working hard: 5<CR>
 	Working hard: 5, done.
@@ -60,6 +61,57 @@ test_expect_success 'progress display with total' '
 	test_cmp expect out
 '
 
+test_expect_success 'stalled progress display' '
+	cat >expect <<-\EOF &&
+	Working hard:   0% (0/3), stalled.<CR>
+	Working hard:  33% (1/3)          <CR>
+	Working hard:  66% (2/3)<CR>
+	Working hard: 100% (3/3)<CR>
+	Working hard: 100% (3/3), done.
+	EOF
+
+	cat >in <<-\EOF &&
+	start 3
+	signal
+	signal
+	signal
+	progress 1
+	signal
+	update
+	signal
+	progress 2
+	update
+	progress 3
+	stop
+	EOF
+	STALLED=1 test-tool progress <in 2>stderr &&
+
+	show_cr <stderr >out &&
+	test_cmp expect out
+'
+
+test_expect_success 'progress display breaks long lines #0, stalled' '
+	sed -e "s/Z$//" >expect <<\EOF &&
+Working hard.......2.........3.........4.........5.........6.........7:
+    0% (0/100), stalled.<CR>
+    1% (1/100)          <CR>
+   50% (50/100)<CR>
+   50% (50/100), done.
+EOF
+
+	cat >in <<-\EOF &&
+	start 100 Working hard.......2.........3.........4.........5.........6.........7
+	signal
+	progress 1
+	progress 50
+	stop
+	EOF
+	test-tool progress <in 2>stderr &&
+
+	show_cr <stderr >out &&
+	test_cmp expect out
+'
+
 test_expect_success 'progress display breaks long lines #1' '
 	sed -e "s/Z$//" >expect <<\EOF &&
 Working hard.......2.........3.........4.........5.........6:   0% (100/100000)<CR>
@@ -183,7 +235,8 @@ test_expect_success 'progress shortens - crazy caller' '
 
 test_expect_success 'progress display with throughput' '
 	cat >expect <<-\EOF &&
-	Working hard: 10<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 10         <CR>
 	Working hard: 20, 200.00 KiB | 100.00 KiB/s<CR>
 	Working hard: 30, 300.00 KiB | 100.00 KiB/s<CR>
 	Working hard: 40, 400.00 KiB | 100.00 KiB/s<CR>
@@ -241,7 +294,8 @@ test_expect_success 'progress display with throughput and total' '
 
 test_expect_success 'cover up after throughput shortens' '
 	cat >expect <<-\EOF &&
-	Working hard: 1<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 1          <CR>
 	Working hard: 2, 800.00 KiB | 400.00 KiB/s<CR>
 	Working hard: 3, 1.17 MiB | 400.00 KiB/s  <CR>
 	Working hard: 4, 1.56 MiB | 400.00 KiB/s<CR>
@@ -272,7 +326,8 @@ test_expect_success 'cover up after throughput shortens' '
 
 test_expect_success 'cover up after throughput shortens a lot' '
 	cat >expect <<-\EOF &&
-	Working hard: 1<CR>
+	Working hard: 0, stalled.<CR>
+	Working hard: 1          <CR>
 	Working hard: 2, 1000.00 KiB | 1000.00 KiB/s<CR>
 	Working hard: 3, 3.00 MiB | 1.50 MiB/s      <CR>
 	Working hard: 3, 3.00 MiB | 1024.00 KiB/s, done.
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (17 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 18/25] progress.c: emit progress on first signal, show "stalled" Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 20/25] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
                         ` (6 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N.  This causes the failures of the tests
'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

Fix this by passing 'i + 1' to display_progress(), like most other
callsites do.
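
A minimal model of the off-by-one, for illustration only: struct
fake_progress, fake_display_progress() and scan_items() are
hypothetical stand-ins, not the progress.c API or the commit-graph
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a progress counter. */
struct fake_progress {
	uint64_t total;
	uint64_t last_value;
};

/* Record the value that would be shown on the progress line. */
static void fake_display_progress(struct fake_progress *p, uint64_t n)
{
	/* A counter must never run past its total. */
	assert(n <= p->total);
	p->last_value = n;
}

/*
 * Iterate over "nr" items and return the final counter value.
 * With "off_by_one" set the loop variable is passed as-is (the bug),
 * otherwise "i + 1" is passed (the fix).
 */
static uint64_t scan_items(size_t nr, int off_by_one)
{
	struct fake_progress p = { nr, 0 };
	size_t i;

	for (i = 0; i < nr; i++)
		fake_display_progress(&p, off_by_one ? i : i + 1);
	return p.last_value;
}
```

With six items the buggy variant ends at 5, matching the "(5/6)" line
quoted above, while the fixed variant ends at 6.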

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 2bcb4e0f89e..3181906368d 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 20/25] midx: don't provide a total for QSORT() progress
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (18 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
                         ` (5 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The quicksort algorithm can be anywhere between O(n) and O(n^2), so
providing a "num objects" as a total means that in some cases we're
going to go past 100%.

This fixes a logic error in 5ae18df9d8e (midx: during verify group
objects by packfile to speed verification, 2019-03-21), which in turn
seems to have been diligently copied from my own logic error in the
commit-graph.c code, see 890226ccb57 (commit-graph write: add
itermediate progress, 2019-01-19).

That commit-graph code of mine was removed in
1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
2020-12-07), so we don't need to fix that too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/midx.c b/midx.c
index d80e68998b8..9f1b4018c1c 100644
--- a/midx.c
+++ b/midx.c
@@ -1265,8 +1265,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	}
 
 	if (flags & MIDX_PROGRESS)
-		progress = start_progress(_("Sorting objects by packfile"),
-					  m->num_objects);
+		progress = start_progress(_("Sorting objects by packfile"), 0);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
-- 
2.32.0.599.g3967b4fa4ac



* [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (19 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 20/25] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
                         ` (4 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    This would cause a BUG() in an upcoming change that adds an
    assertion checking if the "total" at the end matches the last
    progress bar update.

    The tests 'missing file in delayed checkout' and 'invalid file in
    delayed checkout' in 't0021-conversion.sh' would trigger it,
    because both use only one filter.  (The test 'delayed
    checkout in process filter' uses two filters but the first one
    does all the work, so that test already happens to succeed even
    with such an assertion.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
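
The difference between the two counting schemes can be modelled as
follows. This is a simplified sketch, not the code in entry.c: the
function names are hypothetical, and each filter is reduced to the
number of paths it handles.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Old scheme: the counter is updated once per filter, before that
 * filter's paths are processed. Returns the last value displayed.
 */
static unsigned last_count_per_filter(const unsigned *paths_per_filter,
				      size_t nr_filters)
{
	unsigned total = 0, left, last = 0;
	size_t i;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++)
		total += paths_per_filter[i];
	left = total;
	for (i = 0; i < nr_filters; i++) {
		last = total - left;	/* work of filter i not yet counted */
		left -= paths_per_filter[i];
	}
	return last;
}

/*
 * New scheme: a dedicated counter is incremented and displayed for
 * every processed path. Returns the last value displayed.
 */
static unsigned last_count_per_path(const unsigned *paths_per_filter,
				    size_t nr_filters)
{
	unsigned processed = 0, last = 0, j;
	size_t i;

	assert(paths_per_filter || !nr_filters);
	for (i = 0; i < nr_filters; i++)
		for (j = 0; j < paths_per_filter[i]; j++)
			last = ++processed;
	return last;
}
```

For a single filter handling three paths the old scheme ends at 0 and
the new one at 3. For two filters where the first does all the work
both end at 3, which is why such a test passes either way.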

After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the future BUG() assertion
discussed above but the 'missing file in delayed checkout' test would
still fail, because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all (this will be fixed
in a subsequent commit).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 711ee0693c7..bc4b8fcc980 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.32.0.599.g3967b4fa4ac


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 22/25] progress.c: add a stop_progress_early() function
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (20 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
  2021-06-25  1:24         ` Andrei Rybak
  2021-06-23 17:48       ` [PATCH 23/25] entry: deal with unexpected "Filtering content" total Ævar Arnfjörð Bjarmason
                         ` (3 subsequent siblings)
  25 siblings, 2 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

In cases where we error out during processing or otherwise miss the
initial "total" estimate we'll still want to show a "done" message and
end our trace2 region, but it won't be true that our total ==
last_update at the end.

So let's add a "last_update" field and this stop_progress_early()
function to handle that edge case. This will be used in a subsequent
commit.

We could also use a total=0 in such cases, but that would make the
progress output worse for the common non-erroring case. Let's instead
note that we didn't reach the total count, and snap the progress bar
to "100%, done" at the end.
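
The "snap to 100%" idea can be sketched in isolation like this. It is
only a model under stated assumptions: "struct early_model" and
early_model_stop() are hypothetical names, and snprintf() stands in
for the strbuf used in progress.c.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the two "struct progress" fields used here. */
struct early_model {
	uint64_t total;		/* estimate given when progress started */
	uint64_t last_update;	/* last value passed to display_progress() */
};

/*
 * Write the "done" suffix into buf, then snap the total to the value
 * that was actually reached so the final line reads 100%.
 * Returns what snprintf() returned.
 */
static int early_model_stop(struct early_model *p, char *buf, size_t len)
{
	int n = snprintf(buf, len,
			 ", done at %" PRIuMAX " items, expected %" PRIuMAX ".",
			 (uintmax_t)p->last_update, (uintmax_t)p->total);

	p->total = p->last_update;
	return n;
}
```

The suffix has to be formatted before the total is overwritten,
otherwise the "expected" number is lost.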

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 20 ++++++++++++++++++++
 progress.h |  2 ++
 2 files changed, 22 insertions(+)

diff --git a/progress.c b/progress.c
index 35847d3a7f2..c1cb01ba975 100644
--- a/progress.c
+++ b/progress.c
@@ -40,6 +40,8 @@ static void display(struct progress *progress, uint64_t n,
 	const char *tp;
 	int show_update = 0;
 
+	progress->last_update = n;
+
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
@@ -413,3 +415,21 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	free(progress->throughput);
 	free(progress);
 }
+
+void stop_progress_early(struct progress **p_progress)
+{
+	struct progress *progress;
+	struct strbuf sb = STRBUF_INIT;
+
+	if (!p_progress)
+		BUG("don't provide NULL to stop_progress_early");
+	progress = *p_progress;
+	if (!progress)
+		return;
+
+	strbuf_addf(&sb, _(", done at %"PRIuMAX" items, expected %"PRIuMAX"."),
+		    (uintmax_t)progress->last_update, (uintmax_t)progress->total);
+	progress->total = progress->last_update;
+	stop_progress_msg(p_progress, sb.buf);
+	strbuf_release(&sb);
+}
diff --git a/progress.h b/progress.h
index ba38447d104..5c5d027d1a0 100644
--- a/progress.h
+++ b/progress.h
@@ -23,6 +23,7 @@ struct progress {
 	struct strbuf status;
 	size_t status_len_utf8;
 
+	uint64_t last_update;
 	uint64_t last_value;
 	uint64_t total;
 	unsigned last_percent;
@@ -56,5 +57,6 @@ struct progress *start_delayed_sparse_progress(const char *title,
 					       uint64_t total);
 void stop_progress(struct progress **progress);
 void stop_progress_msg(struct progress **progress, const char *msg);
+void stop_progress_early(struct progress **p_progress);
 
 #endif
-- 
2.32.0.599.g3967b4fa4ac


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 23/25] entry: deal with unexpected "Filtering content" total
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (21 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [RFC/PATCH 24/25] progress: assert last update in stop_progress() Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The "Filtering content" end total does not match the expected total in
cases such as the 'missing file in delayed checkout' test in
't0021-conversion.sh'.

If we encounter errors we can't accurately estimate the end state of
the progress bar. This is because the test involves a purposefully
buggy filter process that doesn't process any paths, so the progress
counter doesn't have a chance to reach the expected total.

See the preceding commit for why we'd want a stop_progress_early() in
this case, as opposed to leaking memory here, or not providing a
"total" estimate to begin with.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/entry.c b/entry.c
index bc4b8fcc980..e79a13daa51 100644
--- a/entry.c
+++ b/entry.c
@@ -232,7 +232,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		}
 		string_list_remove_empty_items(&dco->filters, 0);
 	}
-	stop_progress(&progress);
+	if (!errs && !dco->paths.nr)
+		stop_progress(&progress);
+	else
+		stop_progress_early(&progress);
 	string_list_clear(&dco->filters, 0);
 
 	/* At this point we should not have any delayed paths anymore. */
-- 
2.32.0.599.g3967b4fa4ac


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [RFC/PATCH 24/25] progress: assert last update in stop_progress()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (22 preceding siblings ...)
  2021-06-23 17:48       ` [PATCH 23/25] entry: deal with unexpected "Filtering content" total Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:48       ` [RFC/PATCH 25/25] progress: assert counting upwards in display() Ævar Arnfjörð Bjarmason
  2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

We had to fix a couple of buggy progress lines in the past, where the
progress counter's final value didn't match the expected total [1],
e.g.:

  Expanding reachable commits in commit graph: 138606% (824706/595), done.
  Writing out commit graph in 3 passes: 166% (4187845/2512707), done.

Let's do better, and, instead of waiting for someone to notice such
issues by mere chance, start verifying progress counters in the test
suite. Let's track what the last display_progress() value was, and if
it doesn't match the total at the end invoke BUG().

We need to introduce a "last_update" distinct from "last_value" for
this, since the "last_value" really means "last displayed value", and
the logic in display() relies on it having those semantics.

Using the "last_value" would also leave us with a subtle case where
this assertion wouldn't catch broken API uses, as an earlier version
of this change did.

Even if that was not the case we couldn't rely on it for the purposes
of this assertion. In the case of a delayed progress the variable
holding the displayed value of the progress counter
('progress->last_value') is only updated after that delay is up, and,
consequently, we can't compare it with the expected total in
stop_progress() in these cases. The "last_update" is recorded on every
display_progress() call, so this check will also cover progress lines
that are too fast to be shown, which matters because the repositories
used in our tests are tiny and most of our progress lines are delayed.

What it can't cover is code that doesn't start the progress bar at
all, e.g. due to its own isatty() check, so progress that is only
started and shown when standard error is a terminal won't be
covered by our tests.
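
To make the "last_update" vs. "last_value" distinction concrete, here
is a simplified standalone model. The struct, the "delayed" flag and
the tracked_*() names are hypothetical; the real display() also deals
with signals, percentages and throughput.

```c
#include <stdint.h>

#define NO_VALUE ((uint64_t)-1)

/* Hypothetical, stripped-down model of "struct progress". */
struct tracked_progress {
	uint64_t total;
	uint64_t last_update;	/* every value seen, shown or not */
	uint64_t last_value;	/* only values that were displayed */
	int delayed;		/* non-zero while the delay has not expired */
};

static void tracked_start(struct tracked_progress *p, uint64_t total,
			  int delayed)
{
	p->total = total;
	p->last_update = NO_VALUE;
	p->last_value = NO_VALUE;
	p->delayed = delayed;
}

static void tracked_display(struct tracked_progress *p, uint64_t n)
{
	p->last_update = n;	/* recorded before the delay check */
	if (p->delayed)
		return;		/* nothing shown, last_value stays put */
	p->last_value = n;
}

/* Returns 1 if the stop-time check would pass, 0 where it would BUG(). */
static int tracked_stop_ok(const struct tracked_progress *p)
{
	return !p->total || p->total == p->last_update;
}
```

In this model a delayed progress that is never shown leaves
"last_value" untouched, yet the stop-time check still sees the final
counter through "last_update".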

[1] c4ff24bbb3 (commit-graph.c: display correct number of chunks when
                writing, 2021-02-24)
    1cbdbf3bef (commit-graph: drop count_distinct_commits() function,
                2020-12-07), though this didn't actually fix, but
                instead removed, a buggy progress line.
    150cd3b61d (commit-graph: fix "Writing out commit graph" progress
                counter, 2020-07-09)
    67fa6aac5a (commit-graph: don't show progress percentages while
                expanding reachable commits, 2019-09-07)
    531e6daa03 (prune-packed: advanced progress even for non-existing
                fan-out directories, 2009-04-27)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

WARNING: I believe this is subtly buggy, see the discussion in the
cover letter. It needs more fixes of the progress.c API usage in
various places before being ready.

 progress.c                  |  8 ++++++++
 t/t0500-progress-display.sh | 30 +++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index c1cb01ba975..40043bf6601 100644
--- a/progress.c
+++ b/progress.c
@@ -325,6 +325,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 
 	progress->total = total;
 	progress->last_value = -1;
+	progress->last_update = -1;
 	progress->last_percent = -1;
 	progress->delay = delay;
 	progress->throughput = NULL;
@@ -393,6 +394,13 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	if (!progress)
 		return;
 	*p_progress = NULL;
+
+	if (progress->total &&
+	    progress->total != progress->last_update)
+		BUG("total progress does not match for \"%*s\": expected: %"PRIuMAX" got: %"PRIuMAX,
+		    (int)(progress->status_len_utf8), progress->title.buf,
+		    (uintmax_t)progress->total,
+		    (uintmax_t)progress->last_update);
 	if (progress->last_value != -1) {
 		/* Force the last update */
 		struct throughput *tp = progress->throughput;
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index bc458cfc28b..3f00e52ce46 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -96,7 +96,8 @@ Working hard.......2.........3.........4.........5.........6.........7:
     0% (0/100), stalled.<CR>
     1% (1/100)          <CR>
    50% (50/100)<CR>
-   50% (50/100), done.
+  100% (100/100)<CR>
+  100% (100/100), done.
 EOF
 
 	cat >in <<-\EOF &&
@@ -104,6 +105,7 @@ EOF
 	signal
 	progress 1
 	progress 50
+	progress 100
 	stop
 	EOF
 	test-tool progress <in 2>stderr &&
@@ -423,4 +425,30 @@ test_expect_success 'BUG: start two concurrent progress bars' '
 	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
 '
 
+test_expect_success 'BUG: display_progress() goes past declared "total"' '
+	cat >in <<-\EOF &&
+	start 3
+	progress 1
+	progress 2
+	progress 4
+	stop
+	EOF
+
+	test_must_fail test-tool progress <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr
+'
+
+test_expect_success 'BUG: display_progress() does not reach declared "total"' '
+	cat >in <<-\EOF &&
+	start 5
+	progress 1
+	progress 2
+	progress 4
+	stop
+	EOF
+
+	test_must_fail test-tool progress <in 2>stderr &&
+	grep "BUG:.*total progress does not match" stderr
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [RFC/PATCH 25/25] progress: assert counting upwards in display()
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (23 preceding siblings ...)
  2021-06-23 17:48       ` [RFC/PATCH 24/25] progress: assert last update in stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:48       ` Ævar Arnfjörð Bjarmason
  2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
  25 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 17:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

We had to fix a buggy progress line recently, where the progress
counter counted backwards, see 8e118e8490 (pack-objects: update
"nr_seen" progress based on pack-reused count, 2021-04-11).

Let's add a BUG(...) assertion that makes use of the "last_update"
value to make sure this doesn't happen again, i.e.  trigger a BUG()
when the counter passed to display_progress() is smaller than the
previous value.

Note that we allow subsequent display_progress() calls with the same
counter value, because:

  - Strictly speaking, it's not wrong to do so.

  - Forbidding it might make the code calling display_progress() more
    complex; I suspect that would be the case with e.g. the "Updating
    index flags" progress line in 'unpack-trees.c', where the counter
    is increased in recursive function calls.

  - We would need to special case the internal display() call in
    stop_progress_msg(), because it uses the same counter value as the
    last display_progress() call, which would trigger this BUG().
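
The rule described above (smaller is a bug, equal is fine, the very
first update is always fine) boils down to a small predicate. This is
a sketch only; update_is_monotonic() and NO_UPDATE are names invented
for the example and are not part of progress.c.

```c
#include <stdint.h>

/* Sentinel meaning "display_progress() has not been called yet". */
#define NO_UPDATE ((uint64_t)-1)

/* Returns 1 if the update is acceptable, 0 where display() would BUG(). */
static int update_is_monotonic(uint64_t last_update, uint64_t n)
{
	return last_update == NO_UPDATE || n >= last_update;
}
```

Using ">=" rather than ">" is what keeps repeated values, including
the final internal display() call in stop_progress_msg(), from
tripping the check.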

't0500-progress-display.sh' contains a few tests that check how
shortened progress lines are covered up, and one of them ('progress
shortens - crazy caller') shortens the progress line by counting
backwards.  From now on that test would trigger this BUG(), so remove
it; the other test cases cover shortening progress lines sufficiently.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

WARNING: I believe this is subtly buggy, see the discussion in the
cover letter. It needs more fixes of the progress.c API usage in
various places before being ready.

 progress.c                  |  2 ++
 t/t0500-progress-display.sh | 36 ++++++++++++------------------------
 2 files changed, 14 insertions(+), 24 deletions(-)

diff --git a/progress.c b/progress.c
index 40043bf6601..7b59006c7c4 100644
--- a/progress.c
+++ b/progress.c
@@ -40,6 +40,8 @@ static void display(struct progress *progress, uint64_t n,
 	const char *tp;
 	int show_update = 0;
 
+	if (progress->last_update != -1 && n < progress->last_update)
+		BUG("counting backwards with display_progress()");
 	progress->last_update = n;
 
 	if (progress->delay && (!progress_update || --progress->delay))
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 3f00e52ce46..de59a757f86 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -211,30 +211,6 @@ EOF
 	test_cmp expect out
 '
 
-# Progress counter goes backwards, this should not happen in practice.
-test_expect_success 'progress shortens - crazy caller' '
-	cat >expect <<-\EOF &&
-	Working hard:  10% (100/1000)<CR>
-	Working hard:  20% (200/1000)<CR>
-	Working hard:   0% (1/1000)  <CR>
-	Working hard: 100% (1000/1000)<CR>
-	Working hard: 100% (1000/1000), done.
-	EOF
-
-	cat >in <<-\EOF &&
-	start 1000
-	progress 100
-	progress 200
-	progress 1
-	progress 1000
-	stop
-	EOF
-	test-tool progress <in 2>stderr &&
-
-	show_cr <stderr >out &&
-	test_cmp expect out
-'
-
 test_expect_success 'progress display with throughput' '
 	cat >expect <<-\EOF &&
 	Working hard: 0, stalled.<CR>
@@ -451,4 +427,16 @@ test_expect_success 'BUG: display_progress() does not reach declared "total"' '
 	grep "BUG:.*total progress does not match" stderr
 '
 
+test_expect_success 'BUG: display_progress() counting backwards' '
+	cat >in <<-\EOF &&
+	start 3
+	progress 1
+	progress 2
+	progress 1
+	EOF
+
+	test_must_fail test-tool progress <in 2>stderr &&
+	grep "BUG:.*counting backwards" stderr
+'
+
 test_done
-- 
2.32.0.599.g3967b4fa4ac


^ permalink raw reply	[flat|nested] 83+ messages in thread

* RE: [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
                         ` (24 preceding siblings ...)
  2021-06-23 17:48       ` [RFC/PATCH 25/25] progress: assert counting upwards in display() Ævar Arnfjörð Bjarmason
@ 2021-06-23 17:59       ` Randall S. Becker
  2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
  25 siblings, 1 reply; 83+ messages in thread
From: Randall S. Becker @ 2021-06-23 17:59 UTC (permalink / raw)
  To: 'Ævar Arnfjörð Bjarmason', git
  Cc: 'Junio C Hamano', 'SZEDER Gábor',
	'René Scharfe', 'Taylor Blau'

On June 23, 2021 1:48 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>
>>> > Splitting off from:
>>> >
>>> >
>>> > https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-
>>> > avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>>> >
>>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>>> >> I wonder (only in a semi-curious way, though) if we can detect
>>> >> off-by-one errors by adding an assertion to display_progress()
>>> >> that requires the first update to have the value 0, and in
>>> >> stop_progress() one that requires the previous display_progress()
>>> >> call to have a value equal to the total number of work items.  Not
>>> >> sure it'd be worth the hassle..
>>> >
>>> > I fixed and reported a number of bogus progress lines in the past,
>>> > the last one during v2.31.0-rc phase, so I've looked into whether
>>> > progress counters could be automatically validated in our tests,
>>> > and came up with these patches a few months ago.  It turned out
>>> > that progress counters can be checked easily and transparently in
>>> > case of progress lines that are shown in the tests, i.e. that are
>>> > shown even when stderr is not a terminal or are forced with
>>> > '--progress'.  (In other cases it's still fairly easy but not quite
>>> > transparent, as I think we need changes to the progress API; more
>>> > on that later in a separate
>>> > series.)
>>>
>>> I've also been working on some progress.[ch] patches that are mostly
>>> finished, and I'm some 20 patches in at the moment. I wasn't sure
>>> about whether to send an alternate 20-patch "let's do this (mostly) instead?"
>>> series, hence this message.
>>>
>>> Much of what you're doing here becomes easier after that series, e.g.
>>> your global process struct in 2/7 is something I ended up
>>> implementing as part of a general feature to allow progress to be
>>> driven by either display_progress() *or* the signal handler itself.
>>
>> It's difficult to know who should rebase onto who without seeing one
>> half of the patches.
>
>I was sort of hoping he'd take my word for it, but here it is. Don't say I didn't warn you :)
>
>> I couldn't find a link to them anywhere (even if they are only
>> available in your fork in a pre-polished state) despite looking, but
>> my apologies if they are available and I'm just missing them.
>
>FWIW it's avar-szeder/progress-bar-assertions in https://github.com/avar/git.git, that repo contains various functioning and not-so-
>functioning code.
>
>https://github.com/avar/git/tree/meta/ is my version of the crappy scripts we probably all have some version of for building my own git,
>things that are uncommented in series.conf is what I build my own git from.
>
>> In general, I think that these patches are clear and are helpful in
>> pinning down issues with the progress API (which I have made a handful
>> of times in the past), so I would be happy to see them picked up.
>
>Here's all 25 patches (well, around 20 before) that I had queued up locally and fixed up a bit.
>
>The 01/25 is something I submitted already as https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
>hoping to get this in incrementally.
>
>The 12/25 is my own version of that "global progress struct"; 11/25 is the first of many bugs SZEDER missed in his :)
>
>18/25 is the first step of the UI I was going for, the signal handler can now drive the progress bar, so e.g. during "git gc" we show (at least
>for me, on git.git), a "stalled" message just before we start the actual count of "Enumerating Objects".
>
>After that was in I was planning on adding config-driven support to show a "spinner" when we stalled in that way, config-driven because
>you could just scrape e.g. https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
>into your own config. See
>https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)
>
>19-23/25 is my grabbing of SZEDER's patches that I'm comfortable labeling as "PATCH", I think they work, but no BUG() assertions yet. I
>left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier works set things up to do any BUG() we trust by default.
>
>22/25 is what I think we should do instead of SZEDER's 6/7
>(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
>I don't think this "our total doesn't match at the end" is something we should always BUG() on, for reasons explained there.
>
>I am sympathetic to doing it by default though, hence the
>stop_progress_early() API, that's there to allow select callers to bypass his BUG(...) assertion.
>
>24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
>BUG(...) assertions.
>
>His series passes the test suite, but actually severely breaks things. It'll make e.g. "git commit-graph write" BUG(...) out. The reason
>the tests don't catch it is because we have a blind spot in the tests.
>
>Namely, that most things that use the progress bar API use isatty() to check if they should start_progress(). If you run the tests as e.g.
>(better ways to do this, especially in parallel, most welcome):
>
>    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done
>
>You can discover various things that his series BUG()'s on, I fixed a couple of those myself, it's an early part of this series.
>
>But we'll still have various untested for BUG()'s even then, this is because you *also* have to have the test actually emit a "naked"
>progress bar on stderr, if the test itself e.g. pipes fd 2 to a file it won't work.
>
>I created a shitty-and-mostly-broken throwaway change to search-replace all the guards of "start_progress(...)" to run unconditionally, and
>convert all the "delayed" to the non-delayed version. That'll find even more BUG()'s where SZEDER's series still needs to be fixed (and also
>some unrelated segfaults, I gave up on it soon after).
>
>Even if we fix that I wouldn't trust it, because a lot of the progress bars we have depend on the size and shape of the data we're
>processing, e.g. the bug I fixed in 11/25. If people find this BUG() approach worth pursuing I think it would be better to make it an opt-in
>flag we convert one caller at a time to.
>
>For some it's really clear that we could assert it, for others such as the commit-graph it's much more subtle, we're in some callback after
>setting a "total", that callback does a "break", "continue" etc. in various places, all depending on repository data.
>
>It's not easy to reason about that and be certain that we can hold to the estimate. If we get it wrong someone's repo in the wild won't fully
>GC because of the overly eager BUG().
>
>If SZEDER wants to pursue it I think it'll be easier on top of this series, but personally I really don't see the point of spending effort on it.
>
>We should really be going in the other direction, of having more fuzzy ETAs, not less.
>
>E.g. we often have enough data at the start of "Enumerating Objects"
>to give a good-enough target value, that it's 5-10% off isn't really the point, but that the user looking at it sees something better than a
>dumb count-up, and can instead see that they'll probably be looking at it for about a minute. Now our API is to give no ETA/target if we're
>not 100% sure, it's not good UX.
>
>So trying to get the current exact count/exact percentage right seems like a distraction to me in the longer term. If anything we should
>just be rounding those numbers, showing fuzzy ETAs instead of percentages if we can etc.
>
>SZEDER Gábor (4):
>  commit-graph: fix bogus counter in "Scanning merged commits" progress
>    line
>  entry: show finer-grained counter in "Filtering content" progress line
>  progress: assert last update in stop_progress()
>  progress: assert counting upwards in display()
>
>Ævar Arnfjörð Bjarmason (21):
>  progress.c tests: fix breakage with COLUMNS != 80
>  progress.c tests: make start/stop verbs on stdin
>  progress.c tests: test some invalid usage
>  progress.c tests: add a "signal" verb
>  progress.c: move signal handler functions lower
>  progress.c: call progress_interval() from progress_test_force_update()
>  progress.c: stop eagerly fflush(stderr) when not a terminal
>  progress.c: add temporary variable from progress struct
>  midx perf: add a perf test for multi-pack-index
>  progress.c: remove the "sparse" mode nano-optimization
>  pack-bitmap-write.c: add a missing stop_progress()
>  progress.c: add & assert a "global_progress" variable
>  progress.[ch]: move the "struct progress" to the header
>  progress.[ch]: move test-only code away from "extern" variables
>  progress.c: pass "is done?" (again) to display()
>  progress.[ch]: convert "title" to "struct strbuf"
>  progress.c: refactor display() for less confusion, and fix bug
>  progress.c: emit progress on first signal, show "stalled"
>  midx: don't provide a total for QSORT() progress
>  progress.c: add a stop_progress_early() function
>  entry: deal with unexpected "Filtering content" total
>
> cache.h                          |   1 -
> commit-graph.c                   |   2 +-
> csum-file.h                      |   2 -
> entry.c                          |  12 +-
> midx.c                           |  25 +-
> pack-bitmap-write.c              |   1 +
> pack.h                           |   1 -
> parallel-checkout.h              |   1 -
> progress.c                       | 391 ++++++++++++++++++-------------
> progress.h                       |  50 +++-
> reachable.h                      |   1 -
> t/helper/test-progress.c         |  54 +++--
> t/perf/p5319-multi-pack-index.sh |  21 ++
> t/t0500-progress-display.sh      | 247 ++++++++++++++-----
> 14 files changed, 537 insertions(+), 272 deletions(-)
> create mode 100755 t/perf/p5319-multi-pack-index.sh

Is there provision for disabling progress on a per-command basis? My use case is specifically in a CI/CD script, being able to suppress progress handling. The current Jenkins plugin does not appear to have provision for hooking into a mechanism, which makes things get a bit wonky when a job runs with a pseudo-tty (as provided by Jenkins through SSH/RMI).
-Randall


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
@ 2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
  2021-06-23 20:25           ` Randall S. Becker
  0 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-23 20:01 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: git, 'Junio C Hamano', 'SZEDER Gábor',
	'René Scharfe', 'Taylor Blau'


On Wed, Jun 23 2021, Randall S. Becker wrote:

> On June 23, 2021 1:48 PM, Ævar Arnfjörð Bjarmason wrote:
>>> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>>
>>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>>
>>>> > Splitting off from:
>>>> >
>>>> >
>>>> > https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-
>>>> > avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>>>> >
>>>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>>>> >> I wonder (only in a semi-curious way, though) if we can detect
>>>> >> off-by-one errors by adding an assertion to display_progress()
>>>> >> that requires the first update to have the value 0, and in
>>>> >> stop_progress() one that requires the previous display_progress()
>>>> >> call to have a value equal to the total number of work items.  Not
>>>> >> sure it'd be worth the hassle..
>>>> >
>>>> > I fixed and reported a number of bogus progress lines in the past,
>>>> > the last one during v2.31.0-rc phase, so I've looked into whether
>>>> > progress counters could be automatically validated in our tests,
>>>> > and came up with these patches a few months ago.  It turned out
>>>> > that progress counters can be checked easily and transparently in
>>>> > case of progress lines that are shown in the tests, i.e. that are
>>>> > shown even when stderr is not a terminal or are forced with
>>>> > '--progress'.  (In other cases it's still fairly easy but not quite
>>>> > transparent, as I think we need changes to the progress API; more
>>>> > on that later in a separate
>>>> > series.)
>>>>
>>>> I've also been working on some progress.[ch] patches that are mostly
>>>> finished, and I'm some 20 patches in at the moment. I wasn't sure
>>>> about whether to send an alternate 20-patch "let's do this (mostly) instead?"
>>>> series, hence this message.
>>>>
>>>> Much of what you're doing here becomes easier after that series, e.g.
>>>> your global process struct in 2/7 is something I ended up
>>>> implementing as part of a general feature to allow progress to be
>>>> driven by either display_progress() *or* the signal handler itself.
>>>
>>> It's difficult to know who should rebase onto who without seeing one
>>> half of the patches.
>>
>>I was sort of hoping he'd take my word for it, but here it is. Don't say I didn't warn you :)
>>
>>> I couldn't find a link to them anywhere (even if they are only
>>> available in your fork in a pre-polished state) despite looking, but
>>> my apologies if they are available and I'm just missing them.
>>
>>FWIW it's avar-szeder/progress-bar-assertions in https://github.com/avar/git.git, that repo contains various functioning and not-so-
>>functioning code.
>>
>>https://github.com/avar/git/tree/meta/ is my version of the crappy scripts we probably all have some version of for building my own git,
>>things that are uncommented in series.conf is what I build my own git from.
>>
>>> In general, I think that these patches are clear and are helpful in
>>> pinning down issues with the progress API (which I have made a handful
>>> of times in the past), so I would be happy to see them picked up.
>>
>>Here's all 25 patches (well, around 20 before) that I had queued up locally and fixed up a bit.
>>
>>The 01/25 is something I submitted already as https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
>>hoping to get this in incrementally.
>>
>>The 12/25 is my own version of that "global progress struct"; 11/25 is the first of many bugs SZEDER missed in his :)
>>
>>18/25 is the first step of the UI I was going for, the signal handler can now drive the progress bar, so e.g. during "git gc" we show (at least
>>for me, on git.git), a "stalled" message just before we start the actual count of "Enumerating Objects".
>>
>>After that was in I was planning on adding config-driven support to show a "spinner" when we stalled in that way, config-driven because
>>you could just scrape e.g. https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
>>into your own config. See
>>https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)
>>
>>19-23/25 is my grabbing of SZEDER's patches that I'm comfortable labeling as "PATCH", I think they work, but no BUG() assertions yet. I
>>left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier works set things up to do any BUG() we trust by default.
>>
>>22/25 is what I think we should do instead of SZEDER's 6/7
>>(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
>>I don't think this "our total doesn't match at the end" is something we should always BUG() on, for reasons explained there.
>>
>>I am sympathetic to doing it by default though, hence the
>>stop_progress_early() API, that's there to allow select callers to bypass his BUG(...) assertion.
>>
>>24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
>>BUG(...) assertions.
>>
>>His series passes the test suite, but actually severely breaks things. It'll make e.g. "git commit-graph write" BUG(...) out. The reason
>>the tests don't catch it is because we have a blind spot in the tests.
>>
>>Namely, that most things that use the progress bar API use isatty() to check if they should start_progress(). If you run the tests as e.g.
>>(better ways to do this, especially in parallel, most welcome):
>>
>>    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done
>>
>>You can discover various things that his series BUG()'s on, I fixed a couple of those myself, it's an early part of this series.
>>
>>But we'll still have various untested for BUG()'s even then, this is because you *also* have to have the test actually emit a "naked"
>>progress bar on stderr, if the test itself e.g. pipes fd 2 to a file it won't work.
>>
>>I created a shitty-and-mostly-broken throwaway change to search-replace all the guards of "start_progress(...)" to run unconditionally, and
>>convert all the "delayed" to the non-delayed version. That'll find even more BUG()'s where SZEDER's series still needs to be fixed (and also
>>some unrelated segfaults, I gave up on it soon after).
>>
>>Even if we fix that I wouldn't trust it, because a lot of the progress bars we have depend on the size and shape of the data we're
>>processing, e.g. the bug I fixed in 11/25. If people find this BUG() approach worth pursuing I think it would be better to make it an opt-in
>>flag we convert one caller at a time to.
>>
>>For some it's really clear that we could assert it, for others such as the commit-graph it's much more subtle, we're in some callback after
>>setting a "total", that callback does a "break", "continue" etc. in various places, all depending on repository data.
>>
>>It's not easy to reason about that and be certain that we can hold to the estimate. If we get it wrong someone's repo in the wild won't fully
>>GC because of the overly eager BUG().
>>
>>If SZEDER wants to pursue it I think it'll be easier on top of this series, but personally I really don't see the point of spending effort on it.
>>
>>We should really be going in the other direction, of having more fuzzy ETAs, not less.
>>
>>E.g. we often have enough data at the start of "Enumerating Objects"
>>to give a good-enough target value, that it's 5-10% off isn't really the point, but that the user looking at it sees something better than a
>>dumb count-up, and can instead see that they'll probably be looking at it for about a minute. Now our API is to give no ETA/target if we're
>>not 100% sure, it's not good UX.
>>
>>So trying to get the current exact count/exact percentage right seems like a distraction to me in the longer term. If anything we should
>>just be rounding those numbers, showing fuzzy ETAs instead of percentages if we can etc.
>>
>>SZEDER Gábor (4):
>>  commit-graph: fix bogus counter in "Scanning merged commits" progress
>>    line
>>  entry: show finer-grained counter in "Filtering content" progress line
>>  progress: assert last update in stop_progress()
>>  progress: assert counting upwards in display()
>>
>>Ævar Arnfjörð Bjarmason (21):
>>  progress.c tests: fix breakage with COLUMNS != 80
>>  progress.c tests: make start/stop verbs on stdin
>>  progress.c tests: test some invalid usage
>>  progress.c tests: add a "signal" verb
>>  progress.c: move signal handler functions lower
>>  progress.c: call progress_interval() from progress_test_force_update()
>>  progress.c: stop eagerly fflush(stderr) when not a terminal
>>  progress.c: add temporary variable from progress struct
>>  midx perf: add a perf test for multi-pack-index
>>  progress.c: remove the "sparse" mode nano-optimization
>>  pack-bitmap-write.c: add a missing stop_progress()
>>  progress.c: add & assert a "global_progress" variable
>>  progress.[ch]: move the "struct progress" to the header
>>  progress.[ch]: move test-only code away from "extern" variables
>>  progress.c: pass "is done?" (again) to display()
>>  progress.[ch]: convert "title" to "struct strbuf"
>>  progress.c: refactor display() for less confusion, and fix bug
>>  progress.c: emit progress on first signal, show "stalled"
>>  midx: don't provide a total for QSORT() progress
>>  progress.c: add a stop_progress_early() function
>>  entry: deal with unexpected "Filtering content" total
>>
>> cache.h                          |   1 -
>> commit-graph.c                   |   2 +-
>> csum-file.h                      |   2 -
>> entry.c                          |  12 +-
>> midx.c                           |  25 +-
>> pack-bitmap-write.c              |   1 +
>> pack.h                           |   1 -
>> parallel-checkout.h              |   1 -
>> progress.c                       | 391 ++++++++++++++++++-------------
>> progress.h                       |  50 +++-
>> reachable.h                      |   1 -
>> t/helper/test-progress.c         |  54 +++--
>> t/perf/p5319-multi-pack-index.sh |  21 ++
>> t/t0500-progress-display.sh      | 247 ++++++++++++++-----
>> 14 files changed, 537 insertions(+), 272 deletions(-)
>> create mode 100755 t/perf/p5319-multi-pack-index.sh
>
> Is there provision for disabling progress on a per-command basis? My
> use case is specifically in a CI/CD script, being able to suppress
> progress handling. The current Jenkins plugin does not appear to have
> provision for hooking into a mechanism, which makes things get a bit
> wonky when a job runs with a pseudo-tty (as provided by Jenkins
> through SSH/RMI).
> -Randall

There isn't; some commands support --no-progress, but it's hit and miss.

You can also set the undocumented GIT_PROGRESS_DELAY=99999999 (or some
really big number) to suppress more of them.

We could just add it as a top-level "git --no-progress" option I
suppose...

Probably better would be to detect such not-a-terminals somehow, I think
at some point our own gc.log was a victim of this.
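
For a CI wrapper that launches git itself, the GIT_PROGRESS_DELAY
workaround above could be applied before the child process is started.
The following is only a minimal sketch: the helper name
sketch_suppress_delayed_progress() is made up for illustration, and the
value is the one suggested above. It would only affect progress lines
that honour the delay, not those that are shown immediately.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not part of git): export a very large
 * GIT_PROGRESS_DELAY so that delayed progress lines in any git
 * process started afterwards are unlikely to ever be displayed.
 * Returns 0 on success, -1 on failure, like setenv(3).
 */
static int sketch_suppress_delayed_progress(void)
{
	return setenv("GIT_PROGRESS_DELAY", "99999999", 1);
}
```

The wrapper would call this once and then fork/exec git as usual, so the
variable is inherited by the child.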

^ permalink raw reply	[flat|nested] 83+ messages in thread

* RE: [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code
  2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
@ 2021-06-23 20:25           ` Randall S. Becker
  0 siblings, 0 replies; 83+ messages in thread
From: Randall S. Becker @ 2021-06-23 20:25 UTC (permalink / raw)
  To: 'Ævar Arnfjörð Bjarmason'
  Cc: git, 'Junio C Hamano', 'SZEDER Gábor',
	'René Scharfe', 'Taylor Blau'

On June 23, 2021 4:02 PM, Ævar Arnfjörð Bjarmason wrote:
>On Wed, Jun 23 2021, Randall S. Becker wrote:
>> On June 23, 2021 1:48 PM, Ævar Arnfjörð Bjarmason wrote:
>>>> On Mon, Jun 21, 2021 at 02:59:53AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>>>
>>>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>>>
>>>>> > Splitting off from:
>>>>> >
>>>>> >
>>>>> > https://public-inbox.org/git/cover-0.2-0000000000-20210607T144206Z-avarab@gmail.com/T/#me5d3176914d4268fd9f2a96fc63f4e41beb26bd6
>>>>> >
>>>>> > On Tue, Jun 08, 2021 at 06:14:42PM +0200, René Scharfe wrote:
>>>>> >> I wonder (only in a semi-curious way, though) if we can detect
>>>>> >> off-by-one errors by adding an assertion to display_progress()
>>>>> >> that requires the first update to have the value 0, and in
>>>>> >> stop_progress() one that requires the previous
>>>>> >> display_progress() call to have a value equal to the total
>>>>> >> number of work items.  Not sure it'd be worth the hassle..
>>>>> >
>>>>> > I fixed and reported a number of bogus progress lines in the
>>>>> > past, the last one during v2.31.0-rc phase, so I've looked into
>>>>> > whether progress counters could be automatically validated in our
>>>>> > tests, and came up with these patches a few months ago.  It
>>>>> > turned out that progress counters can be checked easily and
>>>>> > transparently in case of progress lines that are shown in the
>>>>> > tests, i.e. that are shown even when stderr is not a terminal or
>>>>> > are forced with '--progress'.  (In other cases it's still fairly
>>>>> > easy but not quite transparent, as I think we need changes to the
>>>>> > progress API; more on that later in a separate
>>>>> > series.)
>>>>>
>>>>> I've also been working on some progress.[ch] patches that are
>>>>> mostly finished, and I'm some 20 patches in at the moment. I wasn't
>>>>> sure about whether to send an alternate 20-patch "let's do this (mostly) instead?"
>>>>> series, hence this message.
>>>>>
>>>>> Much of what you're doing here becomes easier after that series, e.g.
>>>>> your global progress struct in 2/7 is something I ended up
>>>>> implementing as part of a general feature to allow progress to be
>>>>> driven by either display_progress() *or* the signal handler itself.
>>>>
>>>> It's difficult to know who should rebase onto who without seeing one
>>>> half of the patches.
>>>
>>>I was sort of hoping he'd take my word for it, but here it is. Don't
>>>say I didn't warn you :)
>>>
>>>> I couldn't find a link to them anywhere (even if they are only
>>>> available in your fork in a pre-polished state) despite looking, but
>>>> my apologies if they are available and I'm just missing them.
>>>
>>>FWIW it's avar-szeder/progress-bar-assertions in
>>>https://github.com/avar/git.git, that repo contains various functioning and not-so-functioning code.
>>>
>>>https://github.com/avar/git/tree/meta/ is my version of the crappy
>>>scripts we probably all have some version of for building my own git, things that are uncommented in series.conf is what I build my own
>>>git from.
>>>
>>>> In general, I think that these patches are clear and are helpful in
>>>> pinning down issues with the progress API (which I have made a
>>>> handful of times in the past), so I would be happy to see them picked up.
>>>
>>>Here's all 25 patches (well, around 20 before) that I had queued up locally and fixed up a bit.
>>>
>>>The 01/25 is something I submitted already as
>>>https://lore.kernel.org/git/patch-1.1-cba5d88ca35-20210621T070114Z-avarab@gmail.com;
>>>hoping to get this in incrementally.
>>>
>>>The 12/25 is my own version of that "global progress struct"; 11/25 is
>>>the first of many bugs SZEDER missed in his :)
>>>
>>>18/25 is the first step of the UI I was going for, the signal handler
>>>can now drive the progress bar, so e.g. during "git gc" we show (at least for me, on git.git), a "stalled" message just before we start the
>>>actual count of "Enumerating Objects".
>>>
>>>After that was in I was planning on adding config-driven support to
>>>show a "spinner" when we stalled in that way, config-driven because
>>>you could just scrape e.g.
>>>https://github.com/sindresorhus/cli-spinners/blob/main/spinners.json
>>>into your own config. See
>>>https://jsfiddle.net/sindresorhus/2eLtsbey/embedded/result/ :)
>>>
>>>19-23/25 is my grabbing of SZEDER's patches that I'm comfortable
>>>labeling as "PATCH", I think they work, but no BUG() assertions yet. I left out the GIT_TEST_CHECK_PROGRESS parts, since my earlier
>>>works set things up to do any BUG() we trust by default.
>>>
>>>22/25 is what I think we should do instead of SZEDER's 6/7
>>>(http://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com)
>>>I don't think this "our total doesn't match at the end" is
>>>something we should always BUG() on, for reasons explained there.
>>>
>>>I am sympathetic to doing it by default though, hence the
>>>stop_progress_early() API, that's there to allow select callers to bypass his BUG(...) assertion.
>>>
>>>24/25 and 25/25 are "RFC" and a rebased+modified version of SZEDER's
>>>BUG(...) assertions.
>>>
>>>His series passes the test suite, but actually severely breaks things.
>>>It'll make e.g. "git commit-graph write" BUG(...) out. The reason the tests don't catch it is because we have a blind spot in the
>>>tests.
>>>
>>>Namely, that most things that use the progress bar API use isatty() to check if they should start_progress(). If you run the tests as e.g.
>>>(better ways to do this, especially in parallel, most welcome):
>>>
>>>    for t in t[0-9]*.sh; do if ! ./$t -vixd; then echo $t bad; break; fi; done
>>>
>>>You can discover various things that his series BUG()'s on, I fixed a couple of those myself, it's an early part of this series.
>>>
>>>But we'll still have various untested for BUG()'s even then, this is because you *also* have to have the test actually emit a "naked"
>>>progress bar on stderr, if the test itself e.g. pipes fd 2 to a file it won't work.
>>>
>>>I created a shitty-and-mostly-broken throwaway change to
>>>search-replace all the guards of "start_progress(...)" to run
>>>unconditionally, and convert all the "delayed" to the non-delayed version. That'll find even more BUG()'s where SZEDER's series still
>>>needs to be fixed (and also some unrelated segfaults, I gave up on it soon after).
>>>
>>>Even if we fix that I wouldn't trust it, because a lot of the progress
>>>bars we have depend on the size and shape of the data we're
>>>processing, e.g. the bug I fixed in 11/25. If people find this BUG() approach worth pursuing I think it would be better to make it an opt-in
>>>flag we convert one caller at a time to.
>>>
>>>For some it's really clear that we could assert it, for others such as
>>>the commit-graph it's much more subtle, we're in some callback after setting a "total", that callback does a "break", "continue" etc. in
>>>various places, all depending on repository data.
>>>
>>>It's not easy to reason about that and be certain that we can hold to
>>>the estimate. If we get it wrong someone's repo in the wild won't fully GC because of the overly eager BUG().
>>>
>>>If SZEDER wants to pursue it I think it'll be easier on top of this series, but personally I really don't see the point of spending effort on it.
>>>
>>>We should really be going in the other direction, of having more fuzzy ETAs, not less.
>>>
>>>E.g. we often have enough data at the start of "Enumerating Objects"
>>>to give a good-enough target value, that it's 5-10% off isn't really
>>>the point, but that the user looking at it sees something better than
>>>a dumb count-up, and can instead see that they'll probably be looking at it for about a minute. Now our API is to give no ETA/target if
>>>we're not 100% sure, it's not good UX.
>>>
>>>So trying to get the current exact count/exact percentage right seems
>>>like a distraction to me in the longer term. If anything we should just be rounding those numbers, showing fuzzy ETAs instead of
>>>percentages if we can etc.
>>>
>>>SZEDER Gábor (4):
>>>  commit-graph: fix bogus counter in "Scanning merged commits" progress
>>>    line
>>>  entry: show finer-grained counter in "Filtering content" progress
>>>    line
>>>  progress: assert last update in stop_progress()
>>>  progress: assert counting upwards in display()
>>>
>>>Ævar Arnfjörð Bjarmason (21):
>>>  progress.c tests: fix breakage with COLUMNS != 80
>>>  progress.c tests: make start/stop verbs on stdin
>>>  progress.c tests: test some invalid usage
>>>  progress.c tests: add a "signal" verb
>>>  progress.c: move signal handler functions lower
>>>  progress.c: call progress_interval() from
>>>    progress_test_force_update()
>>>  progress.c: stop eagerly fflush(stderr) when not a terminal
>>>  progress.c: add temporary variable from progress struct
>>>  midx perf: add a perf test for multi-pack-index
>>>  progress.c: remove the "sparse" mode nano-optimization
>>>  pack-bitmap-write.c: add a missing stop_progress()
>>>  progress.c: add & assert a "global_progress" variable
>>>  progress.[ch]: move the "struct progress" to the header
>>>  progress.[ch]: move test-only code away from "extern" variables
>>>  progress.c: pass "is done?" (again) to display()
>>>  progress.[ch]: convert "title" to "struct strbuf"
>>>  progress.c: refactor display() for less confusion, and fix bug
>>>  progress.c: emit progress on first signal, show "stalled"
>>>  midx: don't provide a total for QSORT() progress
>>>  progress.c: add a stop_progress_early() function
>>>  entry: deal with unexpected "Filtering content" total
>>>
>>> cache.h                          |   1 -
>>> commit-graph.c                   |   2 +-
>>> csum-file.h                      |   2 -
>>> entry.c                          |  12 +-
>>> midx.c                           |  25 +-
>>> pack-bitmap-write.c              |   1 +
>>> pack.h                           |   1 -
>>> parallel-checkout.h              |   1 -
>>> progress.c                       | 391 ++++++++++++++++++-------------
>>> progress.h                       |  50 +++-
>>> reachable.h                      |   1 -
>>> t/helper/test-progress.c         |  54 +++--
>>> t/perf/p5319-multi-pack-index.sh |  21 ++
>>> t/t0500-progress-display.sh      | 247 ++++++++++++++-----
>>> 14 files changed, 537 insertions(+), 272 deletions(-)
>>> create mode 100755 t/perf/p5319-multi-pack-index.sh
>>
>> Is there provision for disabling progress on a per-command basis? My
>> use case is specifically in a CI/CD script, being able to suppress
>> progress handling. The current Jenkins plugin does not appear to have
>> provision for hooking into a mechanism, which makes things get a bit
>> wonky when a job runs with a pseudo-tty (as provided by Jenkins
>> through SSH/RMI).
>> -Randall
>
>There isn't, some commands support --no-progress, but it's hit and miss.
>
>You can then set the undocumented GIT_PROGRESS_DELAY=99999999 (or some really big number) to suppress more of them.
>
>We could just add it as a top-level "git --no-progress" option I suppose...
>
>Probably better would be to detect such not-a-terminals somehow, I think at some point our own gc.log was a victim of this.

I think a global not-a-terminal would be best here. It does not make a lot of sense to dump progress on a device that does not handle Control-M. I think I recall someone recently saying that we should be detecting this.
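
To make the idea concrete, here is a rough sketch of what such a
not-a-terminal check might look like. It is an illustration only:
sketch_can_redraw_line() is a made-up name, and treating TERM=dumb (or an
unset TERM) as "cannot handle carriage return" is an assumption of this
sketch. Whether a heuristic like this would catch the Jenkins pseudo-tty
case has not been established in this thread.

```c
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical check (not git's API): return 1 only if "fd" is a
 * terminal and "term" (the value of $TERM, may be NULL) does not
 * indicate a dumb terminal; return 0 otherwise.
 */
static int sketch_can_redraw_line(int fd, const char *term)
{
	if (!isatty(fd))
		return 0;	/* pipe, file, socket, ... */
	if (!term || !strcmp(term, "dumb"))
		return 0;	/* assumed unable to handle '\r' */
	return 1;
}
```

A caller would pass STDERR_FILENO and getenv("TERM") and skip progress
output entirely when the result is 0.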


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (7 preceding siblings ...)
  2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
@ 2021-06-23 21:57 ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lines for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
                     ` (5 more replies)
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  10 siblings, 6 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

On Sun, Jun 20, 2021 at 10:02:56PM +0200, SZEDER Gábor wrote:
> It turned out that progress
> counters can be checked easily and transparently in case of progress
> lines that are shown in the tests, i.e. that are shown even when
> stderr is not a terminal or are forced with '--progress'.  (In other
> cases it's still fairly easy but not quite transparent, as I think we
> need changes to the progress API; more on that later in a separate
> series.)

So, the first patch in this WIP/POC series is my attempt at checking
even those progress counters that are not shown in our test suite,
either because stderr is not a terminal or because of an explicit
'--no-progress' option.  There are no usable commit messages yet, I
just wanted to see whether it's possible to check all progress lines
and whether it uncovers any more bugs; and the answer is yes to both.

Anyway, the basic idea is that instead of checking isatty(2) in the
caller, let's perform that check in start_progress() and let callers
override it through an extra function parameter (e.g. when
'--(no-)progress', '-v' or '--quiet' was given).  This way
start_progress() will always be called and it would then return NULL
if the progress line should not be shown.  Or, if
GIT_TEST_CHECK_PROGRESS=1, then it would return a valid non-NULL
progress instance even when the progress line should not be shown, but
with the new 'progress->hidden' flag set, so subsequent
display_progress() and stop_progress() calls won't print anything but
will be able to perform all the checks and trigger BUG() if one is
violated.
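
Stripped of everything else, the decision described above can be modelled
roughly as follows. This is a standalone sketch, not the patch itself:
"struct sketch_progress" and the sketch_*() functions are stand-ins
invented for illustration, and "check" plays the role of
GIT_TEST_CHECK_PROGRESS=1.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Stand-in for "struct progress"; only the new flag is modelled. */
struct sketch_progress {
	int hidden;	/* counters are still checked, nothing is printed */
};

/*
 * Resolve the tri-state "show" argument: 1 forces the progress line,
 * 0 suppresses it, and -1 leaves the decision to isatty(2).
 */
static int sketch_resolve_show(int show)
{
	assert(show >= -1 && show <= 1);
	if (show == -1)
		show = isatty(STDERR_FILENO);
	return show;
}

/*
 * Return NULL when the line is not to be shown and checking is off;
 * otherwise return an instance, marked hidden if it must stay silent.
 */
static struct sketch_progress *sketch_start_progress(int show, int check)
{
	struct sketch_progress *p;

	show = sketch_resolve_show(show);
	if (!show && !check)
		return NULL;
	p = malloc(sizeof(*p));
	if (!p)
		abort();
	p->hidden = !show;
	return p;
}
```

Callers then no longer need their own isatty(2) guard; they pass -1 unless
an option such as '--progress' or '--quiet' decided the matter.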

However, after Ævar pointed out upthread that progress also generates
trace2 regions, I think that it would be better if start_progress()
always returned a valid progress instance, even without
GIT_TEST_CHECK_PROGRESS but with 'progress->hidden' set as necessary,
because that way we would always get that trace2 output, even with
'--no-progress' or 'git cmd 2>log'.

The first patch also converts a good couple of progress lines to this
new approach, and the subsequent patches fix most of the uncovered
buggy progress lines.


SZEDER Gábor (4):
  WIP progress, isatty(2), hidden progress lines for
    GIT_TEST_CHECK_PROGRESS
  blame: fix progress total with line ranges
  read-cache: avoid overlapping progress lines
  preload-index: fix "Refreshing index" progress line

 builtin/blame.c          |  8 ++++----
 builtin/fsck.c           | 10 +++-------
 builtin/index-pack.c     | 18 +++++++++---------
 builtin/log.c            |  4 ++--
 builtin/prune.c          |  5 +----
 builtin/unpack-objects.c |  6 +++---
 preload-index.c          | 10 +++++-----
 progress.c               | 26 +++++++++++++++++++-------
 progress.h               |  6 ++++--
 read-cache.c             |  9 +++++----
 10 files changed, 55 insertions(+), 47 deletions(-)

-- 
2.32.0.289.g44fbea0957


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 1/4] WIP progress, isatty(2), hidden progress lines for GIT_TEST_CHECK_PROGRESS
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 2/4] blame: fix progress total with line ranges SZEDER Gábor
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

---
 builtin/blame.c          |  6 ++----
 builtin/fsck.c           | 10 +++-------
 builtin/index-pack.c     | 18 +++++++++---------
 builtin/log.c            |  4 ++--
 builtin/prune.c          |  5 +----
 builtin/unpack-objects.c |  6 +++---
 preload-index.c          |  7 +++----
 progress.c               | 26 +++++++++++++++++++-------
 progress.h               |  6 ++++--
 read-cache.c             |  6 +++---
 10 files changed, 49 insertions(+), 45 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 641523ff9a..5efb920dd4 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -944,8 +944,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 		if (show_progress > 0)
 			die(_("--progress can't be used with --incremental or porcelain formats"));
 		show_progress = 0;
-	} else if (show_progress < 0)
-		show_progress = isatty(2);
+	}
 
 	if (0 < abbrev && abbrev < hexsz)
 		/* one more abbrev length is needed for the boundary commit */
@@ -1153,8 +1152,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 
 	sb.found_guilty_entry = &found_guilty_entry;
 	sb.found_guilty_entry_data = &pi;
-	if (show_progress)
-		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
+	pi.progress = start_delayed_progress_if_tty(_("Blaming lines"), sb.num_lines, show_progress);
 
 	assign_blame(&sb, opt);
 
diff --git a/builtin/fsck.c b/builtin/fsck.c
index b42b6fe21f..78e799f748 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -185,8 +185,7 @@ static int traverse_reachable(void)
 	struct progress *progress = NULL;
 	unsigned int nr = 0;
 	int result = 0;
-	if (show_progress)
-		progress = start_delayed_progress(_("Checking connectivity"), 0);
+	progress = start_delayed_progress_if_tty(_("Checking connectivity"), 0, show_progress);
 	while (pending.nr) {
 		result |= traverse_one_object(object_array_pop(&pending));
 		display_progress(progress, ++nr);
@@ -653,8 +652,7 @@ static void fsck_object_dir(const char *path)
 	if (verbose)
 		fprintf_ln(stderr, _("Checking object directory"));
 
-	if (show_progress)
-		progress = start_progress(_("Checking object directories"), 256);
+	progress = start_progress_if_tty(_("Checking object directories"), 256, show_progress);
 
 	for_each_loose_file_in_objdir(path, fsck_loose, fsck_cruft, fsck_subdir,
 				      progress);
@@ -789,8 +787,6 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	if (check_strict)
 		fsck_obj_options.strict = 1;
 
-	if (show_progress == -1)
-		show_progress = isatty(2);
 	if (verbose)
 		show_progress = 0;
 
@@ -825,7 +821,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 					total += p->num_objects;
 				}
 
-				progress = start_progress(_("Checking objects"), total);
+				progress = start_progress_if_tty(_("Checking objects"), total, show_progress);
 			}
 			for (p = get_all_packs(the_repository); p;
 			     p = p->next) {
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 3fbc5d7077..0caabe237e 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -258,8 +258,8 @@ static unsigned check_objects(void)
 
 	max = get_max_object_index();
 
-	if (verbose)
-		progress = start_delayed_progress(_("Checking objects"), max);
+	progress = start_delayed_progress_if_tty(_("Checking objects"), max,
+						 verbose ? 1 : 0);
 
 	for (i = 0; i < max; i++) {
 		foreign_nr += check_object(get_indexed_object(i));
@@ -1157,10 +1157,9 @@ static void parse_pack_objects(unsigned char *hash)
 	struct object_id ref_delta_oid;
 	struct stat st;
 
-	if (verbose)
-		progress = start_progress(
-				from_stdin ? _("Receiving objects") : _("Indexing objects"),
-				nr_objects);
+	progress = start_progress_if_tty(
+			from_stdin ? _("Receiving objects") : _("Indexing objects"),
+			nr_objects, verbose ? 1 : 0);
 	for (i = 0; i < nr_objects; i++) {
 		struct object_entry *obj = &objects[i];
 		void *data = unpack_raw_entry(obj, &ofs_delta->offset,
@@ -1235,9 +1234,10 @@ static void resolve_deltas(void)
 	QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry);
 	QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry);
 
-	if (verbose || show_resolving_progress)
-		progress = start_progress(_("Resolving deltas"),
-					  nr_ref_deltas + nr_ofs_deltas);
+	/* TODO: breaks 5309.3 and .4 */
+	progress = start_progress_if_tty(_("Resolving deltas"),
+					 nr_ref_deltas + nr_ofs_deltas,
+					 verbose || show_resolving_progress ? 1 : 0);
 
 	nr_dispatched = 0;
 	base_cache_limit = delta_base_cache_limit * nr_threads;
diff --git a/builtin/log.c b/builtin/log.c
index 6102893fcc..41bcd4d0fb 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -2154,8 +2154,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	}
 	rev.add_signoff = do_signoff;
 
-	if (show_progress)
-		progress = start_delayed_progress(_("Generating patches"), total);
+	progress = start_delayed_progress_if_tty(_("Generating patches"), total,
+						 show_progress);
 	while (0 <= --nr) {
 		int shown;
 		display_progress(progress, total - nr);
diff --git a/builtin/prune.c b/builtin/prune.c
index 02c6ab7cba..2ee1baf40d 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -41,8 +41,7 @@ static void perform_reachability_traversal(struct rev_info *revs)
 	if (initialized)
 		return;
 
-	if (show_progress)
-		progress = start_delayed_progress(_("Checking connectivity"), 0);
+	progress = start_delayed_progress_if_tty(_("Checking connectivity"), 0, show_progress);
 	mark_reachable_objects(revs, 1, expire, progress);
 	stop_progress(&progress);
 	initialized = 1;
@@ -164,8 +163,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 			die("unrecognized argument: %s", name);
 	}
 
-	if (show_progress == -1)
-		show_progress = isatty(2);
 	if (exclude_promisor_objects) {
 		fetch_if_missing = 0;
 		revs.exclude_promisor_objects = 1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 4a9466295b..8517522a31 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -14,7 +14,7 @@
 #include "decorate.h"
 #include "fsck.h"
 
-static int dry_run, quiet, recover, has_errors, strict;
+static int dry_run, quiet = -1, recover, has_errors, strict;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
 
 /* We always read in 4kB chunks. */
@@ -500,8 +500,8 @@ static void unpack_all(void)
 			ntohl(hdr->hdr_version));
 	use(sizeof(struct pack_header));
 
-	if (!quiet)
-		progress = start_progress(_("Unpacking objects"), nr_objects);
+	progress = start_progress_if_tty(_("Unpacking objects"), nr_objects,
+					 quiet ? 0 : -1);
 	CALLOC_ARRAY(obj_list, nr_objects);
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
diff --git a/preload-index.c b/preload-index.c
index e5529a5863..aae6e4a042 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -121,10 +121,9 @@ void preload_index(struct index_state *index,
 	memset(&data, 0, sizeof(data));
 
 	memset(&pd, 0, sizeof(pd));
-	if (refresh_flags & REFRESH_PROGRESS && isatty(2)) {
-		pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr);
-		pthread_mutex_init(&pd.mutex, NULL);
-	}
+	pd.progress = start_delayed_progress_if_tty(_("Refreshing index"),index->cache_nr,
+						   refresh_flags & REFRESH_PROGRESS ? -1 : 0);
+	pthread_mutex_init(&pd.mutex, NULL);
 
 	for (i = 0; i < threads; i++) {
 		struct thread_data *p = data+i;
diff --git a/progress.c b/progress.c
index 034d50cd6b..99e130f1eb 100644
--- a/progress.c
+++ b/progress.c
@@ -43,6 +43,7 @@ struct progress {
 	struct strbuf counters_sb;
 	int title_len;
 	int split;
+	int hidden;
 };
 
 static volatile sig_atomic_t progress_update;
@@ -123,6 +124,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 
 	progress->last_value = n;
 
+	if (progress->hidden)
+		return;
 	if (progress->delay && (!progress_update || --progress->delay))
 		return;
 
@@ -261,15 +264,23 @@ void display_progress(struct progress *progress, uint64_t n)
 }
 
 static struct progress *start_progress_delay(const char *title, uint64_t total,
-					     unsigned delay, unsigned sparse)
+					     unsigned delay, unsigned sparse,
+					     int show)
 {
 	struct progress *progress;
 
 	test_check_progress = git_env_bool("GIT_TEST_CHECK_PROGRESS", 0);
+
+	if (show == -1)
+		show = isatty(STDERR_FILENO);
+
 	if (test_check_progress && current_progress)
 		BUG("progress \"%s\" is still active when starting new progress \"%s\"",
 		    current_progress->title, title);
 
+	if (!show && !test_check_progress)
+		return NULL;
+
 	progress = xmalloc(sizeof(*progress));
 	current_progress = progress;
 	progress->title = title;
@@ -283,6 +294,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
+	progress->hidden = !show;
 	set_progress_signal();
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
@@ -298,14 +310,14 @@ static int get_default_delay(void)
 	return delay_in_secs;
 }
 
-struct progress *start_delayed_progress(const char *title, uint64_t total)
+struct progress *start_delayed_progress_if_tty(const char *title, uint64_t total, int show)
 {
-	return start_progress_delay(title, total, get_default_delay(), 0);
+	return start_progress_delay(title, total, get_default_delay(), 0, show);
 }
 
-struct progress *start_progress(const char *title, uint64_t total)
+struct progress *start_progress_if_tty(const char *title, uint64_t total, int show)
 {
-	return start_progress_delay(title, total, 0, 0);
+	return start_progress_delay(title, total, 0, 0, show);
 }
 
 /*
@@ -319,13 +331,13 @@ struct progress *start_progress(const char *title, uint64_t total)
  */
 struct progress *start_sparse_progress(const char *title, uint64_t total)
 {
-	return start_progress_delay(title, total, 0, 1);
+	return start_progress_delay(title, total, 0, 1, 1);
 }
 
 struct progress *start_delayed_sparse_progress(const char *title,
 					       uint64_t total)
 {
-	return start_progress_delay(title, total, get_default_delay(), 1);
+	return start_progress_delay(title, total, get_default_delay(), 1, 1);
 }
 
 static void finish_if_sparse(struct progress *progress)
diff --git a/progress.h b/progress.h
index f1913acf73..7c3bdd3d63 100644
--- a/progress.h
+++ b/progress.h
@@ -13,9 +13,11 @@ void progress_test_force_update(void);
 
 void display_throughput(struct progress *progress, uint64_t total);
 void display_progress(struct progress *progress, uint64_t n);
-struct progress *start_progress(const char *title, uint64_t total);
+#define start_progress(title, total) start_progress_if_tty((title), (total), 1)
+struct progress *start_progress_if_tty(const char *title, uint64_t total, int show);
 struct progress *start_sparse_progress(const char *title, uint64_t total);
-struct progress *start_delayed_progress(const char *title, uint64_t total);
+#define start_delayed_progress(title, total) start_delayed_progress_if_tty((title), (total), 1)
+struct progress *start_delayed_progress_if_tty(const char *title, uint64_t total, int show);
 struct progress *start_delayed_sparse_progress(const char *title,
 					       uint64_t total);
 void stop_progress(struct progress **progress);
diff --git a/read-cache.c b/read-cache.c
index 1b3c2eb408..c3fc797639 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1567,9 +1567,9 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	int t2_sum_lstat = 0;
 	int t2_sum_scan = 0;
 
-	if (flags & REFRESH_PROGRESS && isatty(2))
-		progress = start_delayed_progress(_("Refresh index"),
-						  istate->cache_nr);
+	progress = start_delayed_progress_if_tty(_("Refresh index"),
+						 istate->cache_nr,
+						 flags & REFRESH_PROGRESS ? -1 : 0);
 
 	trace_performance_enter();
 	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";
-- 
2.32.0.289.g44fbea0957



* [PATCH 2/4] blame: fix progress total with line ranges
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lnies for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 3/4] read-cache: avoid overlapping progress lines SZEDER Gábor
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

When not blaming a whole file but only a subset of its lines using the
'-L<start>,<end>' option, then the "Blaming lines" progress counter
can be way off, because the counter only counts the actually processed
lines in the line range(s) while the expected total wrongly shows the
number of lines in the given file:

  $ wc -l git.c
  932 git.c
  $ GIT_PROGRESS_DELAY=0 git blame -L10,20 git.c
  Blaming lines:   1% (11/932), done.
  <...>

Let's sum up the number of lines in all (sorted and merged) line
ranges and specify the resulting number as expected total.  (Note:
when blaming the whole file, then we (implicitly) have one line range
encompassing all its lines, so this approach works even when no line
range was given as option.)
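
As a rough, standalone illustration of that summation (not the code in
builtin/blame.c): the sketch below uses a hypothetical 'struct line_range'
and assumes that ranges are stored zero-based and half-open, so that
'-L10,20' becomes {9, 20}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a half-open line range [start, end). */
struct line_range {
	long start;
	long end;
};

/*
 * Sum the number of lines covered by the given ranges.  The ranges are
 * assumed to be already sorted and merged, i.e. they don't overlap, so
 * no line is counted twice.
 */
long count_lines_in_ranges(const struct line_range *ranges, size_t nr)
{
	long total = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		assert(ranges[i].end >= ranges[i].start);
		total += ranges[i].end - ranges[i].start;
	}
	return total;
}
```

Under these assumptions the single range {9, 20} from the example above
adds up to 11, matching the number of lines the counter actually
reaches, and a whole-file range {0, 932} adds up to 932.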

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/blame.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 5efb920dd4..7d29f5dc61 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1121,9 +1121,11 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 	}
 	sort_and_merge_range_set(&ranges);
 
+	lno = 0;
 	for (range_i = ranges.nr; range_i > 0; --range_i) {
 		const struct range *r = &ranges.ranges[range_i - 1];
 		ent = blame_entry_prepend(ent, r->start, r->end, o);
+		lno += r->end - r->start;
 	}
 
 	o->suspects = ent;
@@ -1152,7 +1154,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 
 	sb.found_guilty_entry = &found_guilty_entry;
 	sb.found_guilty_entry_data = &pi;
-	pi.progress = start_delayed_progress_if_tty(_("Blaming lines"), sb.num_lines, show_progress);
+	pi.progress = start_delayed_progress_if_tty(_("Blaming lines"), lno, show_progress);
 
 	assign_blame(&sb, opt);
 
-- 
2.32.0.289.g44fbea0957



* [PATCH 3/4] read-cache: avoid overlapping progress lines
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lnies for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 2/4] blame: fix progress total with line ranges SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 21:57   ` [PATCH 4/4] preload-index: fix "Refreshing index" progress line SZEDER Gábor
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

"Refresh index" in refresh_index() in 'read-cache.c' vs. "Refreshing
index" in preload_index() in 'preload-index.c'.
---
 read-cache.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index c3fc797639..692a69f2db 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1567,10 +1567,6 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	int t2_sum_lstat = 0;
 	int t2_sum_scan = 0;
 
-	progress = start_delayed_progress_if_tty(_("Refresh index"),
-						 istate->cache_nr,
-						 flags & REFRESH_PROGRESS ? -1 : 0);
-
 	trace_performance_enter();
 	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";
 	deleted_fmt    = in_porcelain ? "D\t%s\n" : "%s: needs update\n";
@@ -1583,6 +1579,11 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	 * we only have to do the special cases that are left.
 	 */
 	preload_index(istate, pathspec, 0);
+
+	progress = start_delayed_progress_if_tty(_("Refresh index"),
+						 istate->cache_nr,
+						 flags & REFRESH_PROGRESS ? -1 : 0);
+
 	trace2_region_enter("index", "refresh", NULL);
 	/* TODO: audit for interaction with sparse-index. */
 	ensure_full_index(istate);
-- 
2.32.0.289.g44fbea0957



* [PATCH 4/4] preload-index: fix "Refreshing index" progress line
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                     ` (2 preceding siblings ...)
  2021-06-23 21:57   ` [PATCH 3/4] read-cache: avoid overlapping progress lines SZEDER Gábor
@ 2021-06-23 21:57   ` SZEDER Gábor
  2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
  2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-23 21:57 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, SZEDER Gábor

---
 preload-index.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/preload-index.c b/preload-index.c
index aae6e4a042..757dbeced6 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -86,7 +86,8 @@ static void *preload_thread(void *_data)
 		struct progress_data *pd = p->progress;
 
 		pthread_mutex_lock(&pd->mutex);
-		display_progress(pd->progress, pd->n + last_nr);
+		pd->n += last_nr;
+		display_progress(pd->progress, pd->n);
 		pthread_mutex_unlock(&pd->mutex);
 	}
 	cache_def_clear(&cache);
-- 
2.32.0.289.g44fbea0957



* Re: [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                     ` (3 preceding siblings ...)
  2021-06-23 21:57   ` [PATCH 4/4] preload-index: fix "Refreshing index" progress line SZEDER Gábor
@ 2021-06-23 22:11   ` SZEDER Gábor
  2021-06-24 10:43     ` Ævar Arnfjörð Bjarmason
  2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
  5 siblings, 1 reply; 83+ messages in thread
From: SZEDER Gábor @ 2021-06-23 22:11 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, René Scharfe,
	Taylor Blau, Derrick Stolee

On Wed, Jun 23, 2021 at 11:57:32PM +0200, SZEDER Gábor wrote:
> I just wanted to see whether it's possible to check all progress lines
> and whether it uncovers any more bugs; and the answer is yes to both.

Oh, and there is another one:

test_expect_success 'test' '
	git commit --allow-empty -m 1 &&
	git commit --allow-empty -m 2 &&
	git commit --allow-empty -m 3 &&
	GIT_PROGRESS_DELAY=0 \
	git commit-graph write --progress --reachable --split &&
	git commit --allow-empty -m 4 &&
	GIT_PROGRESS_DELAY=0 \
	git commit-graph write --progress --reachable --split
'

The last command's progress output ends with:

  Writing out commit graph in 5 passes:  80% (4/5), done.

This is because since 53035c4f0b (commit-graph write: add "Writing
out" progress output, 2019-01-19) we have assumed that the work done
while writing each chunk is proportional to the number of commits in
the graph, but with the arrival of split commit graphs and the BASE
chunk in 118bd57002 (commit-graph: add base graphs chunk, 2019-06-18)
that's no longer the case.



* Re: [PATCH 22/25] progress.c: add a stop_progress_early() function
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
@ 2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
  2021-06-25  1:24         ` Andrei Rybak
  1 sibling, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-24 10:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Taylor Blau, Ævar Arnfjörð Bjarmason


On Wed, Jun 23 2021, Ævar Arnfjörð Bjarmason wrote:

> +	strbuf_addf(&sb, _(", done at %"PRIuMAX" items, expected %"PRIuMAX"."),
> +		    progress->total, progress->last_update);

These two need a (uintmax_t) cast like the rest of such sprintfs in the
file, as I discovered with the OSX CI.
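
For illustration, a minimal sketch of the cast in question.  It uses a
hypothetical helper, format_early_stop(), that is not part of
progress.c; the point is only that "%"PRIuMAX expects a uintmax_t
argument, and uint64_t is not guaranteed to be the same type as
uintmax_t on every platform.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format two uint64_t counters with PRIuMAX.
 * Without the explicit casts a compiler may warn about a format
 * mismatch on platforms where uint64_t and uintmax_t are distinct
 * types.
 */
int format_early_stop(char *buf, size_t len, uint64_t done, uint64_t expected)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			", done at %" PRIuMAX " items, expected %" PRIuMAX ".",
			(uintmax_t)done, (uintmax_t)expected);
}
```

The cast cannot lose information, because uintmax_t is at least as wide
as any other unsigned integer type, uint64_t included.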


* Re: [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-06-24 10:43     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-24 10:43 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe, Taylor Blau, Derrick Stolee


On Thu, Jun 24 2021, SZEDER Gábor wrote:

> On Wed, Jun 23, 2021 at 11:57:32PM +0200, SZEDER Gábor wrote:
>> I just wanted to see whether it's possible to check all progress lines
>> and whether it uncovers any more bugs; and the answer is yes to both.
>
> Oh, and there is another one:
>
> test_expect_success 'test' '
> 	git commit --allow-empty -m 1 &&
> 	git commit --allow-empty -m 2 &&
> 	git commit --allow-empty -m 3 &&
> 	GIT_PROGRESS_DELAY=0 \
> 	git commit-graph write --progress --reachable --split &&
> 	git commit --allow-empty -m 4 &&
> 	GIT_PROGRESS_DELAY=0 \
> 	git commit-graph write --progress --reachable --split
> '
>
> The last command's progress output ends with:
>
>   Writing out commit graph in 5 passes:  80% (4/5), done.
>
> This is because since 53035c4f0b (commit-graph write: add "Writing
> out" progress output, 2019-01-19) we have assumed that the work done
> while writing each chunk is proportional to the number of commits in
> the graph, but with the arrival of split commit graphs and the BASE
> chunk in 118bd57002 (commit-graph: add base graphs chunk, 2019-06-18)
> that's no longer the case.

Ah, I encountered the off-by-something in that "writing in N passes" but
didn't find the root cause, thanks.


* Re: [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
                     ` (4 preceding siblings ...)
  2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-24 10:45 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, René Scharfe, Taylor Blau


On Wed, Jun 23 2021, SZEDER Gábor wrote:

> On Sun, Jun 20, 2021 at 10:02:56PM +0200, SZEDER Gábor wrote:
>> It turned out that progress
>> counters can be checked easily and transparently in case of progress
>> lines that are shown in the tests, i.e. that are shown even when
>> stderr is not a terminal or are forced with '--progress'.  (In other
>> cases it's still fairly easy but not quite transparent, as I think we
>> need changes to the progress API; more on that later in a separate
>> series.)
>
> So, the first patch in this WIP/POC series is my attempt at checking
> even those progress counters that are not shown in our test suite,
> either because stderr is not a terminal or because of an explicit
> '--no-progress' option.  There are no usable commit messages yet, I
> just wanted to see whether it's possible to check all progress lines
> and whether it uncovers any more bugs; and the answer is yes to both.
>
> Anyway, the basic idea is that instead of checking isatty(2) in the
> caller, let's perform that check in start_progress() and let callers
> override it through an extra function parameter (e.g. when
> '--(no-)progress', '-v' or '--quiet' was given).  This way
> start_progress() will always be called and it would then return NULL
> if the progress line should not be shown.  Or, if
> GIT_TEST_CHECK_PROGRESS=1, then it would return a valid non-NULL
> progress instance even when the progress line should not be shown, but
> with the new 'progress->hidden' flag set, so subsequent
> display_progress() and stop_progress() calls won't print anything but
> will be able to perform all the checks and trigger BUG() if one is
> violated.
>
> However, after Ævar pointed out upthread that progress also generates
> trace2 regions, I think that it would be better if start_progress()
> always returned a valid progress instance, even without
> GIT_TEST_CHECK_PROGRESS but with 'progress->hidden' set as necessary,
> because that way we would always get that trace2 output, even with
> '--no-progress' or 'git cmd 2>log'.
>
> The first patch also converts a good couple of progress lines to this
> new approach, and the subsequent patches fix most of the uncovered
> buggy progress lines.

Thanks, I skimmed over it and this sort of approach is definitely what
we'll need to address my "But we'll still have various untested for
BUG()[...]" in
https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/

And as you point out we'll get the benefit of consistent trace2 regions.
On the one hand it's a bit weird to have this UI code drive a trace2
region when we don't have a TTY, but I think it's useful. We could
e.g. eventually record some stats about min/max/avg/percentile
processing time per item while we're at it. That's unlikely to be worth
it if we need another API like display_progress(), but since we have
that one we can piggy-back on it quite easily.

Just some implementation nits: I for one would prefer "static inline"
wrappers instead of macros in progress.h, makes it easier to
consistently set breakpoints in gdb.

It's more work up-front, but re Randall's question in
https://lore.kernel.org/git/00fb01d76859$8a6ebc50$9f4c34f0$@nexbridge.com
I think that instead of
s/start_delayed_progress/start_delayed_progress_if_tty/ it would be
better to just leave "start_delayed_progress" as-is, and have it do the
TTY check by default, and also check for --progress and/or
--verbose/--quiet etc. itself.

We'd probably have some special-cases left even then, but I think most
of them can be handled with an isatty() check and the "standard" options
of --progress etc.

I.e. we have OPT__VERBOSE now, but no OPT__PROGRESS (we just use
OPT_BOOL). If we made the various common parse-options flags that impact
it callbacks that would munge a global variable we could then pick that
up in progress.c, and handle the common case of "git some-command
--no-progress" directly.

It would also make it easy to just move that over to git.c, so we could
have "git --no-progress some-command". I think that --progress,
--object-format and other "global-y" options should be given to "git"
directly, not per-command, especially with us hopefully soon moving 100%
away from dashed built-ins.

Isn't the most common general rule just:

    int want_progress = progress ? 1 : verbose ? 1 : quiet ? 0 : isatty(2);

Well, that and a version that handles --no-progress distinct from "did
not provide it", so we need some "-1" checks in there. Maybe:

    /* Earlier */
    if (quiet != -1 && verbose != -1)
        die("--quiet and --verbose?");

    /* In progress.c after getopt */
    int enable = -1;
    if (opt_progress != -1) enable = opt_progress;
    if (enable == -1 && opt_verbose != -1) enable = opt_verbose;
    if (enable == -1 && opt_quiet != -1) enable = !opt_quiet;
    if (enable == -1) enable = isatty(2);

In any case, I think moving that to one place so it's consistently
checked would make sense.
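
Spelled out as a standalone function, that precedence could look
something like the sketch below. The 'struct progress_opts' and both
function names are made up for illustration and exist nowhere in git;
each field is -1 when the option was not given, 0 when it was negated
and 1 when it was given.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical tri-state option values: -1 unset, 0 off, 1 on. */
struct progress_opts {
	int progress;
	int verbose;
	int quiet;
};

/*
 * Decide whether progress should be shown.  An explicit --[no-]progress
 * wins, then --verbose, then --quiet; if none of them was given, fall
 * back to whether stderr is a terminal.
 */
int resolve_progress(const struct progress_opts *o, int stderr_is_tty)
{
	assert(o != NULL);
	if (o->progress != -1)
		return o->progress;
	if (o->verbose != -1)
		return o->verbose;
	if (o->quiet != -1)
		return !o->quiet;
	return stderr_is_tty;
}

/* Convenience wrapper doing the isatty() check itself. */
int want_progress(const struct progress_opts *o)
{
	return resolve_progress(o, isatty(STDERR_FILENO));
}
```

Taking the TTY state as a parameter keeps the decision itself testable
without a terminal.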

Some things like builtin/multi-pack-index.c set "progress" as a bitflag
for, IMO, no good reason (this was discussed on-list before). I.e. the
builtin should handle it with a bool. Maybe the library wants a flag,
but in any case if we can do what I proposed above such libraries won't
need a flag at all.


* Re: [PATCH 22/25] progress.c: add a stop_progress_early() function
  2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
  2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
@ 2021-06-25  1:24         ` Andrei Rybak
  1 sibling, 0 replies; 83+ messages in thread
From: Andrei Rybak @ 2021-06-25  1:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe, Taylor Blau

On 23/06/2021 19:48, Ævar Arnfjörð Bjarmason wrote:
> In cases where we error out during processing or otherwise miss
> initial "total" estimate we'll still want to show a "done" message and
> end our trace2 region, but it won't be true that our total ==
> last_update at the end.
> 
> So let's add a "last_update" and this stop_progress_early() function
> to handle that edge case, this will be used in a subsequent commit.
> 
> We could also use a total=0 in such cases, but that would make the
> progress output worse for the common non-erroring case. Let's instead
> note that we didn't reach the total count, and snap the progress bar
> to "100%, done" at the end.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   progress.c | 20 ++++++++++++++++++++
>   progress.h |  2 ++
>   2 files changed, 22 insertions(+)
> 
> diff --git a/progress.c b/progress.c
> index 35847d3a7f2..c1cb01ba975 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -40,6 +40,8 @@ static void display(struct progress *progress, uint64_t n,
>   	const char *tp;
>   	int show_update = 0;
>   
> +	progress->last_update = n;
> +
>   	if (progress->delay && (!progress_update || --progress->delay))
>   		return;
>   
> @@ -413,3 +415,21 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
>   	free(progress->throughput);
>   	free(progress);
>   }
> +
> +void stop_progress_early(struct progress **p_progress)
> +{
> +	struct progress *progress;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	if (!p_progress)
> +		BUG("don't provide NULL to stop_progress_early");
> +	progress = *p_progress;
> +	if (!progress)
> +		return;
> +
> +	strbuf_addf(&sb, _(", done at %"PRIuMAX" items, expected %"PRIuMAX"."),
> +		    progress->total, progress->last_update);

It seems that these two arguments to strbuf_addf should be swapped
around.  Done at progress->last_update, expected progress->total.

> +	progress->total = progress->last_update;
> +	stop_progress_msg(p_progress, sb.buf);
> +	strbuf_release(&sb);
> +}
> diff --git a/progress.h b/progress.h
> index ba38447d104..5c5d027d1a0 100644
> --- a/progress.h
> +++ b/progress.h
> @@ -23,6 +23,7 @@ struct progress {
>   	struct strbuf status;
>   	size_t status_len_utf8;
>   
> +	uint64_t last_update;
>   	uint64_t last_value;
>   	uint64_t total;
>   	unsigned last_percent;
> @@ -56,5 +57,6 @@ struct progress *start_delayed_sparse_progress(const char *title,
>   					       uint64_t total);
>   void stop_progress(struct progress **progress);
>   void stop_progress_msg(struct progress **progress, const char *msg);
> +void stop_progress_early(struct progress **p_progress);
>   
>   #endif
> 



* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
@ 2021-06-26  8:27         ` René Scharfe
  2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 83+ messages in thread
From: René Scharfe @ 2021-06-26  8:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>
> On Mon, Jun 21 2021, René Scharfe wrote:
>
>> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>
>>>> The final value of the counter of the "Scanning merged commits"
>>>> progress line is always one less than its expected total, e.g.:
>>>>
>>>>   Scanning merged commits:  83% (5/6), done.
>>>>
>>>> This happens because while iterating over an array the loop variable
>>>> is passed to display_progress() as-is, but while C arrays (and thus
>>>> the loop variable) start at 0 and end at N-1, the progress counter
>>>> must end at N.  This causes the failures of the tests
>>>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>>>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>>>
>>>> Fix this by passing 'i + 1' to display_progress(), like most other
>>>> callsites do.
>>>>
>>>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>>>> ---
>>>>  commit-graph.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/commit-graph.c b/commit-graph.c
>>>> index 2bcb4e0f89..3181906368 100644
>>>> --- a/commit-graph.c
>>>> +++ b/commit-graph.c
>>>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>>>
>>>>  	ctx->num_extra_edges = 0;
>>>>  	for (i = 0; i < ctx->commits.nr; i++) {
>>>> -		display_progress(ctx->progress, i);
>>>> +		display_progress(ctx->progress, i + 1);
>>>>
>>>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>>>  			  &ctx->commits.list[i]->object.oid)) {
>>>
>>> I think this fix makes sense, but FWIW there's a large thread starting
>>> at [1] where René disagrees with me, and thinks the fix for this sort of
>>> thing would be to display_progress(..., i + 1) at the end of that
>>> for-loop, or just before the stop_progress().
>>>
>>> I don't agree, but just noting the disagreement, and that if that
>>> argument wins then a patch like this would involve changing the other
>>> 20-some calls to display_progress() in commit-graph.c to work
>>> differently (and to be more complex, we'd need to deal with loop
>>> break/continue etc.).
>>>
>>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>>
>> *sigh*  (And sorry, Ævar.)
>>
>> Before an item is done, it should be reported as not done.  After an
>> item is done, it should be reported as done.  One loop iteration
>> finishes one item.  Thus the number of items to report at the bottom of
>> the loop is one higher than at the top.  i is the correct number to
>> report at the top of a zero-based loop, i+1 at the bottom.

> Anyone with more time than sense can go and read over our linked back &
> forth thread where we're disagreeing on that point :). I think the pattern
> in commit-graph.c makes sense, you don't.

Thanks for this comment, I think I got it now: Work doesn't count in the
commit-graph.c model of measuring progress, literally.  I.e. progress is
the same before and after one item of work.  Instead it counts the
number of loop iterations.  The model I describe above counts finished
work items instead.  The results of the two models differ by at most one
despite their inverted axiom regarding the value of work.

Phew, that took me a while.

> Anyway, aside from that. I think, and I really would be advocating this
> too, even if our respective positions were reversed, that *in this case*
> it makes sense to just take something like SZEDER's patch here
> as-is. Because in that file there's some dozen occurrences of that exact
> pattern.

The code without the patch either forgets to report the last work item
in the count-work-items model or is one short in the count-iterations
model, so a fix is needed either way.

The number of the other occurrences wouldn't matter if they were
buggy, but in this case they indicate that Stolee consistently used
the count-iterations model.  Thus using it in the patch as well makes
sense.
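
To make the comparison concrete, here is a small simulation of what
ends up as the last displayed counter value in each case.  It is only a
sketch: the enum and the function are made up for illustration, and the
count-work-items model is reduced to a single report at the bottom of
the loop.

```c
#include <assert.h>
#include <stddef.h>

enum progress_model {
	UNPATCHED,        /* display_progress(i) at the top of the loop */
	COUNT_ITERATIONS, /* display_progress(i + 1) at the top */
	COUNT_WORK_ITEMS  /* display_progress(i + 1) at the bottom */
};

/*
 * Return the counter value that was displayed last after a zero-based
 * loop over nr items, or 0 if nothing was displayed at all.
 */
size_t final_count(enum progress_model model, size_t nr)
{
	size_t i, shown = 0;

	for (i = 0; i < nr; i++) {
		if (model == UNPATCHED)
			shown = i;
		else if (model == COUNT_ITERATIONS)
			shown = i + 1;

		/* ... the work for item i would happen here ... */

		if (model == COUNT_WORK_ITEMS)
			shown = i + 1;
	}
	assert(shown <= nr);
	return shown;
}
```

For nr = 6 the unpatched variant ends at 5, which is the "5/6" from the
commit message, while both models end at 6.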

> Let's just bring this one case in line with the rest, if we then want to
> argue that one or the other use of the progress.c API is wrong as a
> general thing, I think it makes more sense to discuss that as some
> follow-up series that changes these various API uses en-masse than
> holding back isolated fixes that leave the state of the progress bar at
> != 100%.

Agreed.

René


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26  8:27         ` René Scharfe
@ 2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
  2021-06-26 20:22             ` René Scharfe
  0 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-26 14:11 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Sat, Jun 26 2021, René Scharfe wrote:

> Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Mon, Jun 21 2021, René Scharfe wrote:
>>
>>> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>>>
>>>> On Sun, Jun 20 2021, SZEDER Gábor wrote:
>>>>
>>>>> The final value of the counter of the "Scanning merged commits"
>>>>> progress line is always one less than its expected total, e.g.:
>>>>>
>>>>>   Scanning merged commits:  83% (5/6), done.
>>>>>
>>>>> This happens because while iterating over an array the loop variable
>>>>> is passed to display_progress() as-is, but while C arrays (and thus
>>>>> the loop variable) start at 0 and end at N-1, the progress counter
>>>>> must end at N.  This causes the failures of the tests
>>>>> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
>>>>> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>>>>>
>>>>> Fix this by passing 'i + 1' to display_progress(), like most other
>>>>> callsites do.
>>>>>
>>>>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>>>>> ---
>>>>>  commit-graph.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/commit-graph.c b/commit-graph.c
>>>>> index 2bcb4e0f89..3181906368 100644
>>>>> --- a/commit-graph.c
>>>>> +++ b/commit-graph.c
>>>>> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>>>>>
>>>>>  	ctx->num_extra_edges = 0;
>>>>>  	for (i = 0; i < ctx->commits.nr; i++) {
>>>>> -		display_progress(ctx->progress, i);
>>>>> +		display_progress(ctx->progress, i + 1);
>>>>>
>>>>>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>>>>>  			  &ctx->commits.list[i]->object.oid)) {
>>>>
>>>> I think this fix makes sense, but FWIW there's a large thread starting
>>>> at [1] where René disagrees with me, and thinks the fix for this sort of
>>>> thing would be to display_progress(..., i + 1) at the end of that
>>>> for-loop, or just before the stop_progress().
>>>>
>>>> I don't agree, but just noting the disagreement, and that if that
>>>> argument wins then a patch like this would involve changing the other
>>>> 20-some calls to display_progress() in commit-graph.c to work
>>>> differently (and to be more complex, we'd need to deal with loop
>>>> break/continue etc.).
>>>>
>>>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>>>
>>> *sigh*  (And sorry, Ævar.)
>>>
>>> Before an item is done, it should be reported as not done.  After an
>>> item is done, it should be reported as done.  One loop iteration
>>> finishes one item.  Thus the number of items to report at the bottom of
>>> the loop is one higher than at the top.  i is the correct number to
>>> report at the top of a zero-based loop, i+1 at the bottom.
>
>> Anyone with more time than sense can go and read over our linked back &
>> forth thread where we're disagreeing on that point :). I think the pattern
>> in commit-graph.c makes sense, you don't.
>
> Thanks for this comment, I think I got it now: Work doesn't count in the
> commit-graph.c model of measuring progress, literally.  I.e. progress is
> the same before and after one item of work.

The progress isn't the same, we update the count. Or do you mean in the
time it takes us to go from the end of the for-loop & jump to the start
of it and update the count?

> Instead it counts the number of loop iterations.  The model I describe
> above counts finished work items instead.  The results of the two
> models differ by at most one despite their inverted axiom regarding
> the value of work.
>
> Phew, that took me a while.

For what it's worth I had some extensive examples in our initial
thread[1][2] (search for "apple" and "throughput", respectively), that
you cut out when replying to the relevant E-Mails. I'd think we could
probably have gotten here earlier :)

I'm a bit confused about this "value of work" comment.

If you pick up a copy of say a video game like Mario Kart you'll find
that for a 3-lap race you start at 1/3, and still have an entire lap to
go when the count is at 3/3.

So it's just a question of whether you report progress on item N or work
finished on item N, not whether laps in a race have more or less
value.

To reference my earlier E-Mail[1]: are you eating the first apple or the
zeroth apple? I don't think one is more or less right in the
mathematical sense, I just think that for UX aimed at people, counting
"laps" makes more sense than counting completed items.

>> Anyway, aside from that. I think, and I really would be advocating this
>> too, even if our respective positions were reversed, that *in this case*
>> it makes sense to just take something like SZEDER's patch here
>> as-is. Because in that file there's some dozen occurrences of that exact
>> pattern.
>
> The code without the patch either forgets to report the last work item
> in the count-work-items model or is one short in the count-iterations
> model, so a fix is needed either way.

It won't be one short, for a loop of 2 items we'll go from:

     0/2
     1/2
     1/2, done

To:

     1/2
     2/2
     2/2, done

Just like the rest of the uses of the progress API in that file.

Which is one of the two reasons I prefer this pattern, i.e. this is less
verbose:

    start_progress()
    for i in (0..X-1):
        display_progress(i+1)
        work()
    stop_progress()

Than one of these, which AFAICT would be your recommendation:

    # Simplest, but stalls on work()
    start_progress()
    for i in (0..X-1):
        work()
        display_progress(i+1)
    stop_progress()

    # More verbose, but doesn't:
    start_progress()
    for i in (0..X-1):
        display_progress(i)
        work()
        display_progress(i+1)
    stop_progress()

    # Ditto:
    start_progress()
    display_progress(0)
    for i in (0..X-1):
        work()
        display_progress(i+1)
    stop_progress()

And of course if your loop continues or whatever you'll need a last
"display_progress(X)" before the "stop_progress()".

The other is that if you count laps you can have your progress bar
optionally show progress on that item. E.g. if we stall we could show
the seconds spent hung on that item, or "3/3 ETA 40s". I have a
patch[3] that takes an initial step towards that, with some more queued
locally.

> The number of the other occurrences wouldn't matter if they were
> buggy, but in this case they indicate that Stolee consistently used
> the count-iterations model.  Thus using it in the patch as well makes
> sense.

>> Let's just bring this one case in line with the rest, if we then want to
>> argue that one or the other use of the progress.c API is wrong as a
>> general thing, I think it makes more sense to discuss that as some
>> follow-up series that changes these various API uses en-masse than
>> holding back isolated fixes that leave the state of the progress bar at
>> != 100%.
>
> Agreed.

Sorry to go on about this again :)

1. https://lore.kernel.org/git/87lf7k2bem.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87o8c8z105.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
@ 2021-06-26 20:22             ` René Scharfe
  2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
  2021-06-27 17:31               ` Felipe Contreras
  0 siblings, 2 replies; 83+ messages in thread
From: René Scharfe @ 2021-06-26 20:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 26 2021, René Scharfe wrote:
>
>> Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Mon, Jun 21 2021, René Scharfe wrote:
>>>
>>>> Before an item is done, it should be reported as not done.  After an
>>>> item is done, it should be reported as done.  One loop iteration
>>>> finishes one item.  Thus the number of items to report at the bottom of
>>>> the loop is one higher than at the top.  i is the correct number to
>>>> report at the top of a zero-based loop, i+1 at the bottom.
>>
>>> Anyone with more time than sense can go and read over our linked back &
>>> forth thread where we're disagreeing on that point :). I think the pattern
>>> in commit-graph.c makes sense, you don't.
>>
>> Thanks for this comment, I think I got it now: Work doesn't count in the
>> commit-graph.c model of measuring progress, literally.  I.e. progress is
>> the same before and after one item of work.
>
> The progress isn't the same, we update the count. Or do you mean in the
> time it takes us to go from the end of the for-loop & jump to the start
> of it and update the count?
>
>> Instead it counts the number of loop iterations.  The model I describe
>> above counts finished work items instead.  The results of the two
>> models differ by at most one despite their inverted axiom regarding
>> the value of work.
>>
>> Phew, that took me a while.
>
> For what it's worth I had some extensive examples in our initial
> thread[1][2] (search for "apple" and "throughput", respectively), that
> you cut out when replying to the relevant E-Mails. I'd think we could
> probably have gotten here earlier :)

Perhaps, but the key point for me was to invert my basic assumption that
a work item has value, and for that I had to realize and state it first
(done above).  A mathematician would have done that in an instant, I
guess ("Invert, always invert").

> I'm a bit confused about this "value of work" comment.

Progress is a counter.  The difference of the counter before and after
a work item is done is one in the count-work model, but zero in the
count-iterations model.

> If you pick up a copy of say a video game like Mario Kart you'll find
> that for a 3-lap race you start at 1/3, and still have an entire lap to
> go when the count is at 3/3.
>
> So it's just a question of whether you report progress on item N or work
> finished on item N, not whether laps in a race have more or less
> value.

These are linked.  If you want to know which lap you are in, the answer
won't change until you start a new lap:

	for (i = 0; i < 3; i++) {
		display_progress(p, i + 1);
		drive_one_lap();
		display_progress(p, i + 1);
	}

If you want to know how many laps you finished, the answer will
increase after a lap is done:

	for (i = 0; i < 3; i++) {
		display_progress(p, i);
		drive_one_lap();
		display_progress(p, i + 1);
	}

> To reference my earlier E-Mail[1]: are you eating the first apple or the
> zeroth apple? I don't think one is more or less right in the
> mathematical sense, I just think that for UX aimed at people, counting
> "laps" makes more sense than counting completed items.

The difference between counting iterations and work items vanishes as
their numbers increase.  The most pronounced difference is observed when
there is only a single item of work.  The count-iterations model shows
1/1 from start to finish.  The count-work model shows 0/1 initially and
1/1 after the work is done.

As a user I prefer the second one.  If presented with just a number and
a percentage then I assume 100% means all work is done and would cancel
the program if that status is shown for too long.  With Git I have
learned that only the final ", done" really means done in some cases,
but that's an unnecessary lesson and still surprising to me.

>>> Anyway, aside from that. I think, and I really would be advocating this
>>> too, even if our respective positions were reversed, that *in this case*
>>> it makes sense to just take something like SZEDER's patch here
>>> as-is. Because in that file there's some dozen occurrences of that exact
>>> pattern.
>>
>> The code without the patch either forgets to report the last work item
>> in the count-work-items model or is one short in the count-iterations
>> model, so a fix is needed either way.
>
> It won't be one short, for a loop of 2 items we'll go from:
>
>      0/2
>      1/2
>      1/2, done
>
> To:
>
>      1/2
>      2/2
>      2/2, done
>
> Just like the rest of the uses of the progress API in that file.

Yes, just like I wrote -- the old code is one short compared to the
correct output of the count-iterations method.

For completeness' sake, the correct output of the count-work method
would be:

	0/2
	1/2
	2/2
	2/2, done

> Which is one of the two reasons I prefer this pattern, i.e. this is less
> verbose:
>
>     start_progress()
>     for i in (0..X-1):
>         display_progress(i+1)
>         work()
>     stop_progress()
>
> Than one of these, which AFAICT would be your recommendation:
>
>     # Simplest, but stalls on work()
>     start_progress()
>     for i in (0..X-1):
>         work()
>         display_progress(i+1)
>     stop_progress()
>
>     # More verbose, but doesn't:
>     start_progress()
>     for i in (0..X-1):
>         display_progress(i)
>         work()
>         display_progress(i+1)
>     stop_progress()
>
>     # Ditto:
>     start_progress()
>     display_progress(0)
>     for i in (0..X-1):
>         work()
>         display_progress(i+1)
>     stop_progress()
>
> And of course if your loop continues or whatever you'll need a last
> "display_progress(X)" before the "stop_progress()".

The count-work model needs one more progress update than the
count-iteration model.  We could do all updates in the loop header,
which is evaluated just the right number of times.  But I think that we
rather should choose between the models based on their results.

If each work item finishes within a progress display update period
(half a second) then there won't be any user-visible difference and
both models would do.

> The other is that if you count laps you can have your progress bar
> optionally show progress on that item. E.g. if we stall we could show
> the seconds spent hung on that item, or "3/3 ETA 40s". I have a
> patch[3] that takes an initial step towards that, with some more queued
> locally.

A time estimate for the whole operation (until ", done") would be nice.
It can help with the decision to go for a break or to keep staring at
the screen.  I guess we just need to remember when start_progress() was
called and can then estimate the remaining time once the first item is
done.  Stalling items would push the estimate further into the future.

A time estimate per item wouldn't help me much.  I'd have to subtract
to get the number of unfinished items, catch the maximum estimated
duration and multiply those values.  OK, by the time I manage that Git
is probably done -- but I'd rather like to leave arithmetic tasks to
the computer..

Seconds spent for the current item can be shown with both models.  The
progress value is not sufficient to identify the problem case in most
cases.  An ID of some kind (e.g. a file name or hash) would have to be
shown as well for that.  But how would I use that information?

René

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 20:22             ` René Scharfe
@ 2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
  2021-07-04 12:15                 ` René Scharfe
  2021-06-27 17:31               ` Felipe Contreras
  1 sibling, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-26 21:38 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Sat, Jun 26 2021, René Scharfe wrote:

> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>> [...]
>> To reference my earlier E-Mail[1]: are you eating the first apple or the
>> zeroth apple? I don't think one is more or less right in the
>> mathematical sense, I just think that for UX aimed at people, counting
>> "laps" makes more sense than counting completed items.
>
> The difference between counting iterations and work items vanishes as
> their numbers increase.  The most pronounced difference is observed when
> there is only a single item of work.  The count-iterations model shows
> 1/1 from start to finish.  The count-work model shows 0/1 initially and
> 1/1 after the work is done.
>
> As a user I prefer the second one.  If presented with just a number and
> a percentage then I assume 100% means all work is done and would cancel
> the program if that status is shown for too long.  With Git I have
> learned that only the final ", done" really means done in some cases,
> but that's an unnecessary lesson and still surprising to me.

What progress bar of ours goes slow enough that the difference matters
for you in either case?

The only one I know of is "Enumerating objects", which notably stalls at
the start, and which I'm proposing changing the output of in:
https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/

>> [...]
>> Which is one of the two reasons I prefer this pattern, i.e. this is less
>> verbose:
>>
>>     start_progress()
>>     for i in (0..X-1):
>>         display_progress(i+1)
>>         work()
>>     stop_progress()
>>
>> Than one of these, which AFAICT would be your recommendation:
>>
>>     # Simplest, but stalls on work()
>>     start_progress()
>>     for i in (0..X-1):
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>>     # More verbose, but doesn't:
>>     start_progress()
>>     for i in (0..X-1):
>>         display_progress(i)
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>>     # Ditto:
>>     start_progress()
>>     display_progress(0)
>>     for i in (0..X-1):
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>> And of course if your loop continues or whatever you'll need a last
>> "display_progress(X)" before the "stop_progress()".
>
> The count-work model needs one more progress update than the
> count-iteration model.  We could do all updates in the loop header,
> which is evaluated just the right number of times.  But I think that we
> rather should choose between the models based on their results.

I think we should be more biased towards API convenience here than
anything else, because for most of these they'll go so fast that users
won't see the difference. I just also happen to think that the easy way
to do it is also more correct.

Also, for these cases that you're focusing on, we count up to exactly
100% and therefore expect N calls to display_progress() (ignoring the
rare but allowed duplicate calls with the same number, which most
callers don't use). We could just have a convenience API of:

    start_progress()
    for i in (0..X-1):
        progress_update() /* passing "i" not needed, we increment internally */
        work()
    stop_progress()

Then we could even make showing 0/N or 1/N the first time configurable,
but we could only do both if we use the API as I'm suggesting, not as
you want to use it.

You also sort of can get me what I want with what you're
suggesting, but you'd conflate "setup" work with the first item, which
matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
more verbose at the code level, and complex (need to deal with "break",
"continue"), so why would you?

Which I think is the main point here: not so much a disagreement as a
bit of talking past one another.

I.e. I think you're narrowly focused on what I think of as a display
issue of the current progress bars we show, I'm mainly interested in how
we use the API, and we should pick a way to use it that allows us to do
more with displaying progress better in the future.

> If each work item finishes within a progress display update period
> (half a second) then there won't be any user-visible difference and
> both models would do.

A trivial point, but don't you mean a second? AFAICT for "delayed" we
display after 2 seconds, then update every second; it's only if we
have display_throughput() that we update every 0.5s.

>> The other is that if you count laps you can have your progress bar
>> optionally show progress on that item. E.g. if we stall we could show
>> the seconds spent hung on that item, or "3/3 ETA 40s". I have a
>> patch[3] that takes an initial step towards that, with some more queued
>> locally.
>
> A time estimate for the whole operation (until ", done") would be nice.
> It can help with the decision to go for a break or to keep staring at
> the screen.  I guess we just need to remember when start_progress() was
> called and can then estimate the remaining time once the first item is
> done.  Stalling items would push the estimate further into the future.
>
> A time estimate per item wouldn't help me much.  I'd have to subtract
> to get the number of unfinished items, catch the maximum estimated
> duration and multiply those values.  OK, by the time I manage that Git
> is probably done -- but I'd rather like to leave arithmetic tasks to
> the computer..
>
> Seconds spent for the current item can be shown with both models.  The
> progress value is not sufficient to identify the problem case in most
> cases.  An ID of some kind (e.g. a file name or hash) would have to be
> shown as well for that.  But how would I use that information?

If we're spending enough time on one item to update progress for it N
times we probably want to show throughput/progress/ETA mainly for that
item, not the work as a whole.

If we do run into those cases and want to convert them to show some
intra-item progress we'd need to first migrate them over to the suggested
way of using the API if we picked yours first; with my suggested use we
only need to add new API calls (display_throughput(), and similar future
calls/implicit display).

Consider e.g. using the packfile-uri response to ask the user to
download N number of URLs, just because we grab one at 1MB/s that
probably won't do much to inform our estimate of the next one (which may
be on a different CDN etc.).

The throughput API was intended (and mainly used) for the estimate for
the whole batch, I just wonder if as we use it more widely whether that
use-case won't be the exception.

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 20:22             ` René Scharfe
  2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
@ 2021-06-27 17:31               ` Felipe Contreras
  1 sibling, 0 replies; 83+ messages in thread
From: Felipe Contreras @ 2021-06-27 17:31 UTC (permalink / raw)
  To: René Scharfe, Ævar Arnfjörð Bjarmason
  Cc: SZEDER Gábor, git

René Scharfe wrote:
> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:

> > For what it's worth I had some extensive examples in our initial
> > thread[1][2] (search for "apple" and "throughput", respectively), that
> > you cut out when replying to the relevant E-Mails. I'd think we could
> > probably have gotten here earlier :)
> 
> Perhaps, but the key point for me was to invert my basic assumption that
> a work item has value, and for that I had to realize and state it first
> (done above).  A mathematician would have done that in an instant, I
> guess ("Invert, always invert").

When you get down to it, numbers almost never mean what most people
think they mean.

If work is a continuum, the probability that you would land exactly at
1/3 is 0: P(X=1/3) = 0. What you want is the probability of at most 1/3,
P(X<=1/3), and that includes 0.

So, anything from 0 to 1/3 is part of the first chunk of work.

-- 
Felipe Contreras

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
@ 2021-07-04 12:15                 ` René Scharfe
  2021-07-05 14:09                   ` Junio C Hamano
  2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 83+ messages in thread
From: René Scharfe @ 2021-07-04 12:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 26 2021, René Scharfe wrote:
>
>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>> [...]
>>> To reference my earlier E-Mail[1]: are you eating the first apple or the
>>> zeroth apple? I don't think one is more or less right in the
>>> mathematical sense, I just think that for UX aimed at people, counting
>>> "laps" makes more sense than counting completed items.
>>
>> The difference between counting iterations and work items vanishes as
>> their numbers increase.  The most pronounced difference is observed when
>> there is only a single item of work.  The count-iterations model shows
>> 1/1 from start to finish.  The count-work model shows 0/1 initially and
>> 1/1 after the work is done.
>>
>> As a user I prefer the second one.  If presented with just a number and
>> a percentage then I assume 100% means all work is done and would cancel
>> the program if that status is shown for too long.  With Git I have
>> learned that only the final ", done" really means done in some cases,
>> but that's an unnecessary lesson and still surprising to me.
>
> What progress bar of ours goes slow enough that the difference matters
> for you in either case?

I don't have an example -- Git, network and SSD are quick enough for my
small use cases.

The advantage of the count-work method is that the question doesn't even
come up.

> The only one I know of is "Enumerating objects", which notably stalls at
> the start, and which I'm proposing changing the output of in:
> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/

That's annoying, but the first number I see there has five or six digits,
so it's not an example of the issue mentioned above for me.

Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
this helps.  Perhaps there are cases when it gets stuck, but it's hard to
determine by the clock alone.  When I run gc, it just needs a few seconds
to prepare something and then starts visibly counting objects.  A more
fine-grained report of the preparation steps would help, but seeing
"stalled" would just scare me.

>> The count-work model needs one more progress update than the
>> count-iteration model.  We could do all updates in the loop header,
>> which is evaluated just the right number of times.  But I think that we
>> rather should choose between the models based on their results.
>
> I think we should be more biased towards API convenience here than
> anything else, because for most of these they'll go so fast that users
> won't see the difference. I just also happen to think that the easy way
> to do it is also more correct.

The convenience of having one less display_progress() call is only a
slight advantage.

Correctness is a matter of definitions.  Recently I learned that in Arabic
a person's age is given using the count-iterations model.  I.e. on the day
of your birth your age is one.  That causes trouble if you deal with
state officials that use the count-work, err, count-completed-years model,
where your age is one only after living through a full year.

The solution around here is to avoid ambiguity by not using terms like
"age" in laws, regulations and forms, but to state explicitly "full years
since birth" or so.

"2/3 (66%)" means something else to me than to you by default.  So a
solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
latter case.  Thoughts?

> Also, because for these cases that you're focusing on where we count up
> to exactly 100% and we therefore expect N calls to display_progress()
> (ignoring the rare but allowed duplicate calls with the same number,
> which most callers don't use). We could just have a convenience API of:
>
>     start_progress()
>     for i in (0..X-1):
>         progress_update() /* passing "i" not needed, we increment internally */
>         work()
>     stop_progress()
>
> Then we could even make showing 0/N or 1/N the first time configurable,
> but we could only do both if we use the API as I'm suggesting, not as
> you want to use it.

A function that increments the progress number relatively can be used
with both models.  It's more useful for the count-iterations model,
though, as in the count-work model you can piggy-back on the loop
counter check:

	for (i = 0; display_progress(p, i), i < X; i++)
		work();

> You also sort of can get me what I want with what you're
> suggesting, but you'd conflate "setup" work with the first item, which
> matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
> more verbose at the code level, and complex (need to deal with "break",
> "continue"), so why would you?

It's not complicated, just slightly odd, because function calls are
seldom put into the loop counter check.

>> If each work item finishes within a progress display update period
>> (half a second) then there won't be any user-visible difference and
>> both models would do.
>
> A trivial point, but don't you mean a second? AFAICT for "delayed" we
> display after 2 seconds, then update every 1 second, it's only if we
> have display_throughput() that we do every 0.5s.

Right, I mixed those up.

>>> The other is that if you count laps you can have your progress bar
>>> optionally show progress on that item. E.g. if we stall we could show
>>> the seconds spent hung on that item, or "3/3 ETA 40s". I have a
>>> patch[3] that takes an initial step towards that, with some more queued
>>> locally.
>>
>> A time estimate for the whole operation (until ", done") would be nice.
>> It can help with the decision to go for a break or to keep staring at
>> the screen.  I guess we just need to remember when start_progress() was
>> called and can then estimate the remaining time once the first item is
>> done.  Stalling items would push the estimate further into the future.
>>
>> A time estimate per item wouldn't help me much.  I'd have to subtract
>> to get the number of unfinished items, catch the maximum estimated
>> duration and multiply those values.  OK, by the time I manage that Git
>> is probably done -- but I'd rather like to leave arithmetic tasks to
>> the computer..
>>
>> Seconds spent for the current item can be shown with both models.  The
>> progress value is not sufficient to identify the problem case in most
>> cases.  An ID of some kind (e.g. a file name or hash) would have to be
>> shown as well for that.  But how would I use that information?
>
> If we're spending enough time on one item to update progress for it N
> times we probably want to show throughput/progress/ETA mainly for that
> item, not the work as a whole.

Throughput is shown for the last time period.  It is independent of the
item or items being worked on during that period.  If one item takes
multiple periods to finish then indeed only its current progress is
shown automatically, as you want.

Showing intra-item progress requires some kind of hierarchical API to
keep track of both parent and child progress and show them in some
readable way.  Perhaps appending another progress display would suffice?
"Files 1/3 (33%) Bytes 17kB/9GB (0%)".  Not sure.

Calculating the ETA of a single item seems hard.  It does require intra-
item progress to be reported by the work code.

> If we do run into those cases and want to convert them to show some
> intra-item progress we'd need to first migrate them over to the suggested
> way of using the API if we picked yours first, with my suggested use we
> only need to add new API calls (display_throughput(), and similar future
> calls/implicit display).

I don't see why.  The intra-item progress numbers need to be reported in
any case if they are to be shown somehow.  If the model is clear then we
can show unambiguous output.

> Consider e.g. using the packfile-uri response to ask the user to
> download N number of URLs, just because we grab one at 1MB/s that
> probably won't do much to inform our estimate of the next one (which may
> be on a different CDN etc.).

Sure, if the speed of work items varies wildly then estimates will be
unreliable.

I can vaguely imagine that it would be kind of useful to know the
throughput of different data sources, to allow e.g. using a different
mirror next time.  The current API doesn't distinguish work items in a
meaningful way, though.  They only have numbers.  I'd need a name (e.g.
the URL) for intra-item progress numbers to mean something.

René

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-04 12:15                 ` René Scharfe
@ 2021-07-05 14:09                   ` Junio C Hamano
  2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 83+ messages in thread
From: Junio C Hamano @ 2021-07-05 14:09 UTC (permalink / raw)
  To: René Scharfe
  Cc: Ævar Arnfjörð Bjarmason, SZEDER Gábor, git

René Scharfe <l.s.r@web.de> writes:

> ...  A more
> fine-grained report of the preparation steps would help, but seeing
> "stalled" would just scare me.

True.

> The convenience of having one less display_progress() call is only a
> slight advantage.

True, too.

> "2/3 (66%)" means something else to me than to you by default.  So a
> solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
> or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
> latter case.  Thoughts?


I still think "2/3 done" is how we should look at it, but either way,
that's a good way to look at the problem.

Thanks.


[Unrelated Tangent]

> ...  Recently I learned that in Arabic a person's age is given
> using the count-iterations model.  I.e. on the day of your birth
> your age is one.

East Asian age reckoning is shared among East Asian countries and works
the same way, so it's not just Arabic.

* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-04 12:15                 ` René Scharfe
  2021-07-05 14:09                   ` Junio C Hamano
@ 2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
  2021-07-06 16:02                     ` René Scharfe
  1 sibling, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-05 23:28 UTC (permalink / raw)
  To: René Scharfe; +Cc: SZEDER Gábor, git


On Sun, Jul 04 2021, René Scharfe wrote:

> Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sat, Jun 26 2021, René Scharfe wrote:
>>
>>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>>> [...]
>>>> To reference my earlier E-Mail[1]: are you eating the first apple or the
>>>> zeroth apple? I don't think one is more or less right in the
>>>> mathematical sense, I just think that for UX aimed at people, counting
>>>> "laps" makes more sense than counting completed items.
>>>
>>> The difference between counting iterations and work items vanishes as
>>> their numbers increase.  The most pronounced difference is observed when
>>> there is only a single item of work.  The count-iterations model shows
>>> 1/1 from start to finish.  The count-work model shows 0/1 initially and
>>> 1/1 after the work is done.
>>>
>>> As a user I prefer the second one.  If presented with just a number and
>>> a percentage then I assume 100% means all work is done and would cancel
>>> the program if that status is shown for too long.  With Git I have
>>> learned that only the final ", done" really means done in some cases,
>>> but that's an unnecessary lesson and still surprising to me.
>>
>> What progress bar of ours goes slow enough that the difference matters
>> for you in either case?
>
> I don't have an example -- Git, network and SSD are quick enough for my
> small use cases.
>
> The advantage of the count-work method is that the question doesn't even
> come up.
>
>> The only one I know of is "Enumerating objects", which notably stalls at
>> the start, and which I'm proposing changing the output of in:
>> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
>
> That's annoying, but the first number I see there has five or six digits,
> so it's not an example of the issue mentioned above for me.

Because it stalls and shows nothing, but with my patch it'll show
something while stalling, FWIW on linux.git from a cold cache it took
5-10s before showing anything.

> Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
> this helps.  Perhaps there are cases when it gets stuck, but it's hard to
> determine by the clock alone.  When I run gc, it just needs a few seconds
> to prepare something and then starts visibly counting objects.  A more
> fine-grained report of the preparation steps would help, but seeing
> "stalled" would just scare me.

Fair enough, I have other patches to have it show a spinner. Again, API
vs. UI. The idea is that we show something before we start the loop.

>>> The count-work model needs one more progress update than the
>>> count-iteration model.  We could do all updates in the loop header,
>>> which is evaluated just the right number of times.  But I think that we
>>> rather should choose between the models based on their results.
>>
>> I think we should be more biased towards API convenience here than
>> anything else, because for most of these they'll go so fast that users
>> won't see the difference. I just also happen to think that the easy way
>> to do it is also more correct.
>
> The convenience of having one less display_progress() call is only a
> slight advantage.
>
> Correctness is a matter of definitions.  Recently I learned that in Arabic
> a person's age is given using the count-iterations model.  I.e. on the day
> of your birth your age is one.  That causes trouble if you deal with
> state officials that use the count-work, err, count-completed-years model,
> where your age is one only after living through a full year.
>
> The solution around here is to avoid ambiguity by not using terms like
> "age" in laws, regulations and forms, but to state explicitly "full years
> since birth" or so.
>
> "2/3 (33%)" means something else to me than to you by default.  So a
> solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
> or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
> latter case.  Thoughts?

OK, UI again.

>> Also, because for these cases that you're focusing on where we count up
>> to exactly 100% and we therefore expect N calls to display_progress()
>> (ignoring the rare but allowed duplicate calls with the same number,
>> which most callers don't use). We could just have a convenience API of:
>>
>>     start_progress()
>>     for i in (0..X-1):
>>         progress_update() /* passing "i" not needed, we increment internally */
>>         work()
>>     stop_progress()
>>
>> Then we could even make showing 0/N or 1/N the first time configurable,
>> but we could only do both if we use the API as I'm suggesting, not as
>> you want to use it.
>
> A function that increments the progress number relatively can be used
> with both models.  It's more useful for the count-iterations model,
> though, as in the count-work model you can piggy-back on the loop
> counter check:
>
> 	for (i = 0; display_progress(p, i), i < X; i++)
> 		work();

Aside from this whole progress API discussion I find sticking stuff like
that in the for-loop header to be less readable.

But no, that can't be used with both models, because it conflates the 0
of the 1st iteration with 0 of doing prep work. I.e.:

    p = start_progress();
    display_progress(p, 0);
    prep_work();
    for (i = 0; i < 100; i++)
        display_progress(p, i + 1);

Which is implicitly how that "stalled" patch views the world, i.e. our
count is -1 at start_progress() (that's already the case in
progress.c).

If you set it to 0 you're not working on the 1st item yet, but
explicitly doing setup.

Then at n=1 you're starting work on the 1st item.
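
To make that model concrete, here is a minimal sketch. It assumes a
stand-in struct and helpers (everything prefixed "sketch_" is
hypothetical, not the real progress.c API), and only tracks the counter
values: -1 after start, 0 during setup, n while working on the n-th item.

```c
#include <assert.h>

/* Hypothetical stand-in for "struct progress"; not the real one. */
struct sketch_progress {
	long last_value; /* -1 until the first update */
	long total;
};

static void sketch_start(struct sketch_progress *p, long total)
{
	p->last_value = -1; /* nothing has been displayed yet */
	p->total = total;
}

static void sketch_display(struct sketch_progress *p, long n)
{
	assert(n >= 0 && n <= p->total);
	p->last_value = n;
}

/* Runs the pattern described above and returns the final counter. */
static long sketch_run(long nr_items)
{
	struct sketch_progress p;
	long i;

	sketch_start(&p, nr_items);
	sketch_display(&p, 0);             /* 0: prep work, no item started */
	for (i = 0; i < nr_items; i++)
		sketch_display(&p, i + 1); /* n: working on the n-th item */
	return p.last_value;
}
```

With this shape the counter reaches the total on the last iteration, so
nothing extra is needed after the loop.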

>> You also sort of can get me what I want with with what you're
>> suggesting, but you'd conflate "setup" work with the first item, which
>> matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
>> more verbose at the code level, and complex (need to deal with "break",
>> "continue"), so why would you?
>
> It's not complicated, just slightly odd, because function calls are
> seldom put into the loop counter check.

FWIW the "complicated" here was referring to dealing with break/continue.

Yes, I'll grant you that there are cases where the ugliness/oddity of that
for-loop trick is going to be better than dealing with that, but there
are also while loops doing progress, callbacks etc.

Picking an API pattern that works with all of that makes sense, since
the UI can render the count one way or the other.

>>> If each work item finishes within a progress display update period
>>> (half a second) then there won't be any user-visible difference and
>>> both models would do.
>>
>> A trivial point, but don't you mean a second? AFAICT for "delayed" we
>> display after 2 seconds, then update every second; it's only if we
>> have display_throughput() that we update every 0.5s.
>
> Right, I mixed those up.
>
>>>> The other is that if you count laps you can have your progress bar
>>>> optionally show progress on that item. E.g. if we stall we could show
>>>> the seconds spent hung on that item, or "3/3 ETA 40s". I have a
>>>> patch[3] that takes an initial step towards that, with some more queued
>>>> locally.
>>>
>>> A time estimate for the whole operation (until ", done") would be nice.
>>> It can help with the decision to go for a break or to keep staring at
>>> the screen.  I guess we just need to remember when start_progress() was
>>> called and can then estimate the remaining time once the first item is
>>> done.  Stalling items would push the estimate further into the future.
>>>
>>> A time estimate per item wouldn't help me much.  I'd have to subtract
>>> to get the number of unfinished items, catch the maximum estimated
>>> duration and multiply those values.  OK, by the time I manage that Git
>>> is probably done -- but I'd rather like to leave arithmetic tasks to
>>> the computer..
>>>
>>> Seconds spent for the current item can be shown with both models.  The
>>> progress value is not sufficient to identify the problem case in most
>>> cases.  An ID of some kind (e.g. a file name or hash) would have to be
>>> shown as well for that.  But how would I use that information?
>>
>> If we're spending enough time on one item to update progress for it N
>> times we probably want to show throughput/progress/ETA mainly for that
>> item, not the work as a whole.
>
> Throughput is shown for the last time period.  It is independent of the
> item or items being worked on during that period.  If one item takes
> multiple periods to finish then indeed only its current progress is
> shown automatically, as you want.
>
> Showing intra-item progress requires some kind of hierarchical API to
> keep track of both parent and child progress and show them in some
> readable way.  Perhaps appending another progress display would suffice?
> "Files 1/3 (33%) Bytes 17kB/9GB (0%)".  Not sure.

Yes, this is another thing I'm heading for with the patches I posted.

For now I just fixed bugs in the state machine of how many characters we
erase; now we always reset exactly as much as we need to, and pass
things like ", done" around, not ", done\n" or ", done\r" (i.e. the
output we're emitting isn't conflated with whether we're clearing the
line, or creating a new line).

It's a relatively straightforward change from there to have N progress
structs that each track/emit their part of a larger progress bar,
e.g. something like the progress prove(1) shows you (test status for
each concurrent test you're running).

You just need a "parent" progress struct that has the "title" (or none),
and receives the signal, and to have N registered sub-progress structs.
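
As a rough illustration of that parent/child layout, here is a sketch
under stated assumptions: the struct and function names are made up for
this example, the number of children is capped arbitrarily, and signal
handling is left out entirely. The parent only holds an optional title
and pointers to its registered children, and rendering concatenates each
child's part into one line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_CHILDREN 4 /* arbitrary cap for this sketch */

struct sketch_child {
	const char *label;
	unsigned long done, total;
};

struct sketch_parent {
	const char *title; /* may be NULL */
	struct sketch_child *children[SKETCH_MAX_CHILDREN];
	int nr;
};

static int sketch_register(struct sketch_parent *parent,
			   struct sketch_child *child)
{
	if (parent->nr >= SKETCH_MAX_CHILDREN)
		return -1;
	parent->children[parent->nr++] = child;
	return 0;
}

/* Renders one line into buf; returns its length (output may be truncated). */
static size_t sketch_render(const struct sketch_parent *parent,
			    char *buf, size_t len)
{
	size_t off = 0;
	int i;

	if (!len)
		return 0;
	buf[0] = '\0';
	if (parent->title)
		off += snprintf(buf + off, len - off, "%s:", parent->title);
	for (i = 0; i < parent->nr && off < len; i++) {
		const struct sketch_child *c = parent->children[i];
		off += snprintf(buf + off, len - off, "%s%s %lu/%lu",
				off ? " " : "", c->label, c->done, c->total);
	}
	return off < len ? off : len - 1;
}
```

Each child keeps updating its own counters; only the parent decides when
and how the combined line is emitted.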

> Calculating the ETA of a single item seems hard.  It does require intra-
> item progress to be reported by the work code.
>
>> If we do run into those cases and want to convert them to show some
>> intra-item progress we'd need to first migrate them over to suggested
>> way of using the API if we picked yours first, with my suggested use we
>> only need to add new API calls (display_throughput(), and similar future
>> calls/implicit display).
>
> I don't see why.  The intra-item progress numbers need to be reported in
> any case if they are to be shown somehow.  If the model is clear then we
> can show unambiguous output.

Because you want to show:

    Files 1/3 (33%) Bytes 17kB/9GB (0%)

Not:

    Files 0/3 (33%) Bytes 17kB/9GB (0%)

You're downloading the 1st file, not the 0th file, so the code is a
for-loop (or equivalent) with a display_progress(p, i + 1) for that
file, not display_progress(p, i).

This is the main reason I prefer the API and UI of reporting "what item
am I on?" vs. "how many items are done?", because it's easy to add
intra-item state to the former.

>> Consider e.g. using the packfile-uri response to ask the user to
>> download N number of URLs, just because we grab one at 1MB/s that
>> probably won't do much to inform our estimate of the next one (which may
>> be on a different CDN etc.).
>
> Sure, if the speed of work items varies wildly then estimates will be
> unreliable.
>
> I can vaguely imagine that it would be kind of useful to know the
> throughput of different data sources, to allow e.g. use a different
> mirror next time.  The current API doesn't distinguish work items in a
> meaningful way, though.  They only have numbers.  I'd need a name (e.g.
> the URL) for intra-item progress numbers to mean something.

Sure, anyway, let's assume all those numbers are magically known and
constant. The point was that as noted above you're downloading the 1st
file, not the 0th file, and want to show throughput/ETA etc. for that
file.


* Re: [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
@ 2021-07-06 16:02                     ` René Scharfe
  0 siblings, 0 replies; 83+ messages in thread
From: René Scharfe @ 2021-07-06 16:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: SZEDER Gábor, git

Am 06.07.21 um 01:28 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Jul 04 2021, René Scharfe wrote:
>
>> Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Sat, Jun 26 2021, René Scharfe wrote:
>>>
>>>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>> The only one I know of is "Enumerating objects", which notably stalls at
>>> the start, and which I'm proposing changing the output of in:
>>> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
>>
>> That's annoying, but the first number I see there has five or six digits,
>> so it's not an example of the issue mentioned above for me.
>
> Because it stalls and shows nothing, but with my patch it'll show
> something while stalling, FWIW on linux.git from a cold cache it took
> 5-10s before showing anything.
>
>> Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
>> this helps.  Perhaps there are cases when it gets stuck, but it's hard to
>> determine by the clock alone.  When I run gc, it just needs a few seconds
>> to prepare something and then starts visibly counting objects.  A more
>> fine-grained report of the preparation steps would help, but seeing
>> "stalled" would just scare me.
>
> Fair enough, I have other patches to have it show a spinner. Again, API
> vs. UI. The idea is that we show something before we start the loop.

A spinner would be nicer, but I would be more interested to see what it is
actually spending all that time on.  A separate progress line might be
justified here.

>>> Also, because for these cases that you're focusing on where we count up
>>> to exactly 100% and we therefore expect N calls to display_progress()
>>> (ignoring the rare but allowed duplicate calls with the same number,
>>> which most callers don't use). We could just have a convenience API of:
>>>
>>>     start_progress()
>>>     for i in (0..X-1):
>>>         progress_update() /* passing "i" not needed, we increment internally */
>>>         work()
>>>     stop_progress()
>>>
>>> Then we could even make showing 0/N or 1/N the first time configurable,
>>> but we could only do both if we use the API as I'm suggesting, not as
>>> you want to use it.
>>
>> A function that increments the progress number relatively can be used
>> with both models.  It's more useful for the count-iterations model,
>> though, as in the count-work model you can piggy-back on the loop
>> counter check:
>>
>> 	for (i = 0; display_progress(p, i), i < X; i++)
>> 		work();
>
> Aside from this whole progress API discussion I find sticking stuff like
> that in the for-loop header to be less readable.
>
> But no, that can't be used with both models, because it conflates the 0
> of the 1st iteration with 0 of doing prep work. I.e.:
>
>     p = start_progress();
>     display_progress(p, 0);
>     prep_work();
>     for (i = 0; i < 100; i++)
>         display_progress(p, i + 1);
>
> Which is implicitly how that "stalled" patch views the world, i.e. our
> count is -1 at start_progress() (that's already the case in
> progress.c).
>
> If you set it to 0 you're not working on the 1st item yet, but
> explicitly doing setup.
>
> Then at n=1 you're starting work on the 1st item.

A distinct preparation phase feels like an extension to the progress
API.  A symmetric cleanup phase at the end may make sense as well then.

I assume that preparations would be done between the start_progress call
and the first display_progress (no matter what number it reports).  And
cleanup would be done between the last display_progress call and the
stop_progress call.

In the count-iterations model this might report the time taken for the
first or last item as preparation or cleanup depending on the placement
of the display_progress call.  That shouldn't be much of a problem,
though, as the value of one work item is zero in that model.

> FWIW the "complicated" here was referring to dealing with break/continue.
>
> Yes, I'll grant you that there are cases where the ugliness/oddity of that
> for-loop trick is going to be better than dealing with that, but there
> are also while loops doing progress, callbacks etc.

while loops can easily be converted to for loops, of course.

Callbacks are a different matter.  I think we should use them less in
general (they force different operations to use the same set of
parameters, which is worked around with context structs).  A function
to increment progress would help them because then they wouldn't need
to keep track of the item/iteration count themselves in a context
variable.

However, in some cases display_progress calls are rate-limited, e.g.
midx_display_sparse_progress does that for performance reasons.  I
wonder why, and whether this is a problem that needs to be addressed
for all callers.  We don't want the progress API to delay the actual
progress significantly!  Currently display_progress avoids updating
the progress counter; an increment function would need to write an
updated value at each call.
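
A minimal sketch of that trade-off, assuming a made-up struct and a
hypothetical sketch_increment() helper (neither is part of the existing
progress API): the counter is written on every call, while output is
only produced when an update is due.

```c
#include <assert.h>

/* Hypothetical state for this sketch only. */
struct sketch_progress {
	unsigned long value;   /* items seen so far */
	unsigned long renders; /* how often output was actually produced */
	int update_due;        /* a timer would set this in a real program */
};

static void sketch_increment(struct sketch_progress *p)
{
	p->value++;          /* unconditional write on every call */
	if (p->update_due) { /* output only when an update is due */
		p->renders++;
		p->update_due = 0;
	}
}
```

The per-call cost is then one increment plus one flag check, which is
the part a caller cannot skip by rate-limiting its own calls.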

> Picking an API pattern that works with all of that makes sense, since
> the UI can render the count one way or the other.

Right.

>>> If we do run into those cases and want to convert them to show some
>>> intra-item progress we'd need to first migrate them over to suggested
>>> way of using the API if we picked yours first, with my suggested use we
>>> only need to add new API calls (display_throughput(), and similar future
>>> calls/implicit display).
>>
>> I don't see why.  The intra-item progress numbers need to be reported in
>> any case if they are to be shown somehow.  If the model is clear then we
>> can show unambiguous output.
>
> Because you want to show:
>
>     Files 1/3 (33%) Bytes 17kB/9GB (0%)
>
> Not:
>
>     Files 0/3 (33%) Bytes 17kB/9GB (0%)
>
> You're downloading the 1st file, not the 0th file, so the code is a
> for-loop (or equivalent) with a display_progress(p, i + 1) for that
> file, not display_progress(p, i).
>
> This is the main reason I prefer the API and UI of reporting "what item
> am I on?" vs. "how many items are done?", because it's easy to add
> intra-item state to the former.

Both look confusing.  If I cared enough about one of the files, or each
of them, to want to know their individual progress, then I'd certainly
want to see their names instead of some random number.

And as you write above: The display part can easily add or subtract one
to convert the number between models.
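
For illustration, such a conversion in the display layer could look
roughly like this. The enum and function are hypothetical names for this
sketch, and it assumes the caller always reports the number of completed
items.

```c
#include <assert.h>

/* Hypothetical display models; names are made up for this sketch. */
enum sketch_model {
	SKETCH_COUNT_DONE,   /* "N items done" */
	SKETCH_COUNT_CURRENT /* "working on item N" */
};

/* 'done' is the number of completed items, 0 <= done <= total. */
static unsigned long sketch_shown(unsigned long done, unsigned long total,
				  enum sketch_model model)
{
	assert(done <= total);
	if (model == SKETCH_COUNT_CURRENT && done < total)
		return done + 1; /* the item currently being worked on */
	return done;
}
```

The clamp at the end keeps the "current item" view from showing N+1 of N
once everything is finished.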

> Sure, anyway, let's assume all those numbers are magically known and
> constant. The point was that as noted above you're downloading the 1st
> file, not the 0th file, and want to show throughput/ETA etc. for that
> file.

OK, but still some kind of indication would have to be given that the
Bytes relate to a particular File instead of being the total for this
activity.  Perhaps like this, but it's a bit cluttered:

   File 1 (Bytes 17kB/9GB, 0% done) of 3 (0% done in total)

René


* [PATCH 0/3] progress.c API users: fix bogus counting
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (8 preceding siblings ...)
  2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
@ 2021-07-22 12:20 ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
                     ` (2 more replies)
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  10 siblings, 3 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

These patches are a split-off from the larger topic submitted in [1],
which didn't get picked up. As I pointed out in [2], that larger topic
had some hidden, untested-for flaws.

But these patches are just fixes to bogus progress bar output from
that topic. Let's consider them in isolation...

1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/
2. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/

SZEDER Gábor (2):
  commit-graph: fix bogus counter in "Scanning merged commits" progress
    line
  entry: show finer-grained counter in "Filtering content" progress line

Ævar Arnfjörð Bjarmason (1):
  midx: don't provide a total for QSORT() progress

 commit-graph.c | 2 +-
 entry.c        | 7 +++----
 midx.c         | 2 +-
 3 files changed, 5 insertions(+), 6 deletions(-)

-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:20   ` Ævar Arnfjörð Bjarmason
  2021-07-23 21:55     ` Junio C Hamano
  2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N.  This causes the failures of the tests
'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

Fix this by passing 'i + 1' to display_progress(), like most other
callsites do.
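
The off-by-one can be reproduced in isolation with a small sketch. The
helper below is hypothetical and merely stands in for a loop that calls
display_progress(); it returns the last value such a loop would report.

```c
#include <assert.h>

/* Returns the last value a loop over 'nr' items would report. */
static unsigned long sketch_last_reported(unsigned long nr, int add_one)
{
	unsigned long i, last = 0;

	for (i = 0; i < nr; i++)
		last = add_one ? i + 1 : i; /* stands in for display_progress() */
	return last;
}
```

For six items, passing the loop variable as-is ends at 5, matching the
"5/6" line above, while passing 'i + 1' ends at 6.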

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index 1a2602da61..918061f207 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 2/3] midx: don't provide a total for QSORT() progress
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:20   ` Ævar Arnfjörð Bjarmason
  2021-07-23 21:56     ` Junio C Hamano
  2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

The quicksort algorithm can be anywhere between O(n) and O(n^2), so
providing a "num objects" as a total means that in some cases we're
going to go past 100%.

This fixes a logic error in 5ae18df9d8e (midx: during verify group
objects by packfile to speed verification, 2019-03-21), which in turn
seems to have been diligently copied from my own logic error in the
commit-graph.c code, see 890226ccb57 (commit-graph write: add
itermediate progress, 2019-01-19).

That commit-graph code of mine was removed in
1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
2020-12-07), so we don't need to fix that too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 9a35b0255d..eaae75ab19 100644
--- a/midx.c
+++ b/midx.c
@@ -1291,7 +1291,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 
 	if (flags & MIDX_PROGRESS)
 		progress = start_sparse_progress(_("Sorting objects by packfile"),
-						 m->num_objects);
+						 0);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
  2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:20   ` Ævar Arnfjörð Bjarmason
  2021-07-23 22:01     ` Junio C Hamano
  2 siblings, 1 reply; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

From: SZEDER Gábor <szeder.dev@gmail.com>

The "Filtering content" progress in entry.c:finish_delayed_checkout()
is unusual because of how it calculates the progress count and because
it shows the progress of a nested loop.  It works basically like this:

  start_delayed_progress(p, nr_of_paths_to_filter)
  for_each_filter {
      display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
      for_each_path_handled_by_the_current_filter {
          checkout_entry()
      }
  }
  stop_progress(p)

There are two issues with this approach:

  - The work done by the last filter (or the only filter if there is
    only one) is never counted, so if the last filter still has some
    paths to process, then the counter shown in the "done" progress
    line will not match the expected total.

    This would cause a BUG() in an upcoming change that adds an
    assertion checking if the "total" at the end matches the last
    progress bar update.

    The tests 'missing file in delayed checkout' and 'invalid file in
    delayed checkout' in 't0021-conversion.sh' would trigger it,
    because both use only one filter.  (The test 'delayed
    checkout in process filter' uses two filters but the first one
    does all the work, so that test already happens to succeed even
    with such an assertion.)

  - The progress counter is updated only once per filter, not once per
    processed path, so if a filter has a lot of paths to process, then
    the counter might stay unchanged for a long while and then make a
    big jump (though the user still gets a sense of progress, because
    we call display_throughput() after each processed path to show the
    amount of processed data).

Move the display_progress() call to the inner loop, right next to that
checkout_entry() call that does the hard work for each path, and use a
dedicated counter variable that is incremented upon processing each
path.
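
The difference between the two schemes can be sketched as follows. This
is a simplified model, not the code in entry.c: the filters are reduced
to an array holding the number of paths each one handles, the retry loop
is omitted, and both function names are hypothetical. Each function
returns the last value that would be passed to display_progress().

```c
#include <assert.h>
#include <stddef.h>

/* Old scheme: one update per filter, before its paths are processed. */
static unsigned long sketch_per_filter(const unsigned long *paths_per_filter,
				       size_t nr_filters)
{
	unsigned long total = 0, left, last = 0;
	size_t f;

	for (f = 0; f < nr_filters; f++)
		total += paths_per_filter[f];
	left = total;
	for (f = 0; f < nr_filters; f++) {
		last = total - left;         /* display_progress() here */
		left -= paths_per_filter[f]; /* inner loop handles these */
	}
	return last;
}

/* New scheme: one update per processed path, with a dedicated counter. */
static unsigned long sketch_per_path(const unsigned long *paths_per_filter,
				     size_t nr_filters)
{
	unsigned long processed = 0, last = 0, i;
	size_t f;

	for (f = 0; f < nr_filters; f++)
		for (i = 0; i < paths_per_filter[f]; i++)
			last = ++processed;  /* display_progress() here */
	return last;
}
```

With a single filter handling five paths the old scheme never gets past
0, whereas the per-path counter ends at 5, the expected total.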

After this change the 'invalid file in delayed checkout' in
't0021-conversion.sh' would succeed with the future BUG() assertion
discussed above but the 'missing file in delayed checkout' test would
still fail, because its purposefully buggy filter doesn't process any
paths, so we won't execute that inner loop at all (this will be fixed
in a subsequent commit).

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 entry.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/entry.c b/entry.c
index 125fabdbd5..d92dd020b3 100644
--- a/entry.c
+++ b/entry.c
@@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
 int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 {
 	int errs = 0;
-	unsigned delayed_object_count;
+	unsigned processed_paths = 0;
 	off_t filtered_bytes = 0;
 	struct string_list_item *filter, *path;
 	struct progress *progress;
@@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 		return errs;
 
 	dco->state = CE_RETRY;
-	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
-			display_progress(progress, delayed_object_count - dco->paths.nr);
 
 			if (!async_query_available_blobs(filter->string, &available_paths)) {
 				/* Filter reported an error */
@@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 				ce = index_file_exists(state->istate, path->string,
 						       strlen(path->string), 0);
 				if (ce) {
+					display_progress(progress, ++processed_paths);
 					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
 					filtered_bytes += ce->ce_stat_data.sd_size;
 					display_throughput(progress, filtered_bytes);
-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup
  2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
                   ` (9 preceding siblings ...)
  2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:54 ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
                     ` (8 more replies)
  10 siblings, 9 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

These patches were originally submitted as part of a much larger topic
at [1]. They add a "global_progress" "struct progress *" which we
assign to and clear as we start/stop progress bars.

This will become important for some new progress features I have
planned, but right now it is just used to assert that we don't start two
progress bars at the same time. 7/8 fixes an existing bug where we did
that.

To get there I fixed up the test helper to be able to test this, moved
some code around, and fixed a couple of existing nits in 5/8 and 6/8.

See also [2] which is a re-submission of that larger topic, but the
two can proceed independently.

1. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
2. https://lore.kernel.org/git/cover-0.3-0000000000-20210722T121801Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (8):
  progress.c tests: make start/stop verbs on stdin
  progress.c tests: test some invalid usage
  progress.c: move signal handler functions lower
  progress.c: call progress_interval() from progress_test_force_update()
  progress.c: stop eagerly fflush(stderr) when not a terminal
  progress.c: add temporary variable from progress struct
  pack-bitmap-write.c: add a missing stop_progress()
  progress.c: add & assert a "global_progress" variable

 pack-bitmap-write.c         |   1 +
 progress.c                  | 116 ++++++++++++++++++++----------------
 t/helper/test-progress.c    |  43 +++++++++----
 t/t0500-progress-display.sh | 103 +++++++++++++++++++++++++-------
 4 files changed, 178 insertions(+), 85 deletions(-)

-- 
2.32.0.957.gd9e39d72fe6



* [PATCH 1/8] progress.c tests: make start/stop verbs on stdin
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:54   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:54 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Change the usage of the "test-tool progress" introduced in
2bb74b53a49 (Test the progress display, 2019-09-16) to take commands
like "start" and "stop" on stdin, instead of running them implicitly.

This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such stress tests.
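
As a rough illustration of the verb format (a sketch only, not the
helper's actual code), a "start[ <total>[ <title>]]" line could be
parsed along the lines below. The names parse_start_line() and
"struct start_cmd" are hypothetical and exist only for this example;
the real helper calls start_progress() directly.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for illustration; not part of test-progress.c. */
struct start_cmd {
	uint64_t total;
	const char *title;	/* points into the input line or at the default */
};

/*
 * Parse "start[ <total>[ <title>]]". Returns 0 on success, or -1 if
 * the line is not a well-formed "start" instruction.
 */
static int parse_start_line(const char *line, struct start_cmd *out)
{
	static const char default_title[] = "Working hard";
	char *end;

	if (!strcmp(line, "start")) {
		/* Bare "start": default title, unknown total. */
		out->total = 0;
		out->title = default_title;
		return 0;
	}
	if (strncmp(line, "start ", 6))
		return -1;

	out->total = strtoull(line + 6, &end, 10);
	if (*end == '\0')
		out->title = default_title;	/* "start <total>" */
	else if (*end == ' ')
		out->title = end + 1;		/* "start <total> <title>" */
	else
		return -1;			/* trailing garbage after the number */
	return 0;
}
```

With this sketch "start 3" yields a total of 3 with the default title,
while "start 3x" is rejected.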

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 43 +++++++++++++++++++--------
 t/t0500-progress-display.sh | 59 +++++++++++++++++++++++--------------
 2 files changed, 67 insertions(+), 35 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe789..685c0a7c49 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -3,6 +3,9 @@
  *
  * Reads instructions from standard input, one instruction per line:
  *
+ *   "start[ <total>[ <title>]]" - Call start_progress(title, total),
+ *                                 when "start" use a title of
+ *                                 "Working hard" with a total of 0.
  *   "progress <items>" - Call display_progress() with the given item count
  *                        as parameter.
  *   "throughput <bytes> <millis> - Call display_throughput() with the given
@@ -10,6 +13,7 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
  */
@@ -22,31 +26,41 @@
 
 int cmd__progress(int argc, const char **argv)
 {
-	int total = 0;
-	const char *title;
+	const char *default_title = "Working hard";
+	char *detached_title = NULL;
 	struct strbuf line = STRBUF_INIT;
-	struct progress *progress;
+	struct progress *progress = NULL;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress <stdin",
 		NULL
 	};
 	struct option options[] = {
-		OPT_INTEGER(0, "total", &total, "total number of items"),
 		OPT_END(),
 	};
 
 	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc != 1)
-		die("need a title for the progress output");
-	title = argv[0];
+	if (argc)
+		usage_with_options(usage, options);
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
-		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
+		if (!strcmp(line.buf, "start")) {
+			progress = start_progress(default_title, 0);
+		} else if (skip_prefix(line.buf, "start ", (const char **) &end)) {
+			uint64_t total = strtoull(end, &end, 10);
+			if (*end == '\0') {
+				progress = start_progress(default_title, total);
+			} else if (*end == ' ') {
+				free(detached_title);
+				detached_title = strbuf_detach(&line, NULL);
+				progress = start_progress(end + 1, total);
+			} else {
+				die("invalid input: '%s'\n", line.buf);
+			}
+		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
 			uint64_t item_count = strtoull(end, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
@@ -63,12 +77,15 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update"))
+		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
-		else
+		} else if (!strcmp(line.buf, "stop")) {
+			stop_progress(&progress);
+		} else {
 			die("invalid input: '%s'\n", line.buf);
+		}
 	}
-	stop_progress(&progress);
+	free(detached_title);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503a..ca96ac1fa5 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -17,6 +17,7 @@ test_expect_success 'simple progress display' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	update
 	progress 1
 	update
@@ -25,8 +26,9 @@ test_expect_success 'simple progress display' '
 	progress 4
 	update
 	progress 5
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -41,11 +43,13 @@ test_expect_success 'progress display with total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 3
 	progress 1
 	progress 2
 	progress 3
+	stop
 	EOF
-	test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -62,14 +66,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 100
 	progress 1000
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -88,16 +92,15 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
-	update
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 1
 	update
 	progress 2
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -116,14 +119,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -140,14 +143,14 @@ Working hard.......2.........3.........4.........5.........6.........7.........:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6.........7.........
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6.........7........." \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -164,12 +167,14 @@ test_expect_success 'progress shortens - crazy caller' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 1000
 	progress 100
 	progress 200
 	progress 1
 	progress 1000
+	stop
 	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -185,6 +190,7 @@ test_expect_success 'progress display with throughput' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 102400 1000
 	update
 	progress 10
@@ -197,8 +203,9 @@ test_expect_success 'progress display with throughput' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -214,6 +221,7 @@ test_expect_success 'progress display with throughput and total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	progress 10
 	throughput 204800 2000
@@ -222,8 +230,9 @@ test_expect_success 'progress display with throughput and total' '
 	progress 30
 	throughput 409600 4000
 	progress 40
+	stop
 	EOF
-	test-tool progress --total=40 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -239,6 +248,7 @@ test_expect_success 'cover up after throughput shortens' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 409600 1000
 	update
 	progress 1
@@ -251,8 +261,9 @@ test_expect_success 'cover up after throughput shortens' '
 	throughput 1638400 4000
 	update
 	progress 4
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -267,6 +278,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start
 	throughput 1 1000
 	update
 	progress 1
@@ -276,8 +288,9 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	throughput 3145728 3000
 	update
 	progress 3
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -285,6 +298,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	update
 	progress 10
@@ -297,10 +311,11 @@ test_expect_success 'progress generates traces' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
-		"Working hard" <in 2>stderr &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress \
+		<in 2>stderr &&
 
 	# t0212/parse_events.perl intentionally omits regions and data.
 	test_region progress "Working hard" trace.event &&
-- 
2.32.0.957.gd9e39d72fe6


* [PATCH 2/8] progress.c tests: test some invalid usage
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Test what happens when we "stop" without a "start", omit the "stop"
after a "start", or try to start two concurrent progress bars. This
extends the trace2 tests added in 98a13647408 (trace2: log progress
time and throughput, 2020-05-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ca96ac1fa5..ffa819ca1d 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -323,4 +323,37 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces: stop / start' '
+	cat >in <<-\EOF &&
+	start
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-startstop.event" test-tool progress \
+		<in 2>stderr &&
+	test_region progress "Working hard" trace-startstop.event
+'
+
+test_expect_success 'progress generates traces: start without stop' '
+	cat >in <<-\EOF &&
+	start
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" test-tool progress \
+		<in 2>stderr &&
+	grep region_enter.*progress trace-start.event &&
+	! grep region_leave.*progress trace-start.event
+'
+
+test_expect_success 'progress generates traces: stop without start' '
+	cat >in <<-\EOF &&
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-stop.event" test-tool progress \
+		<in 2>stderr &&
+	! grep region_enter.*progress trace-stop.event &&
+	! grep region_leave.*progress trace-stop.event
+'
+
 test_done
-- 
2.32.0.957.gd9e39d72fe6


* [PATCH 3/8] progress.c: move signal handler functions lower
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Move the signal handler functions to just before the
start_progress_delay() where they'll be referenced, instead of having
them at the top of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 92 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 44 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf9..893cb0fe56 100644
--- a/progress.c
+++ b/progress.c
@@ -53,50 +53,6 @@ static volatile sig_atomic_t progress_update;
  */
 int progress_testing;
 uint64_t progress_test_ns = 0;
-void progress_test_force_update(void)
-{
-	progress_update = 1;
-}
-
-
-static void progress_interval(int signum)
-{
-	progress_update = 1;
-}
-
-static void set_progress_signal(void)
-{
-	struct sigaction sa;
-	struct itimerval v;
-
-	if (progress_testing)
-		return;
-
-	progress_update = 0;
-
-	memset(&sa, 0, sizeof(sa));
-	sa.sa_handler = progress_interval;
-	sigemptyset(&sa.sa_mask);
-	sa.sa_flags = SA_RESTART;
-	sigaction(SIGALRM, &sa, NULL);
-
-	v.it_interval.tv_sec = 1;
-	v.it_interval.tv_usec = 0;
-	v.it_value = v.it_interval;
-	setitimer(ITIMER_REAL, &v, NULL);
-}
-
-static void clear_progress_signal(void)
-{
-	struct itimerval v = {{0,},};
-
-	if (progress_testing)
-		return;
-
-	setitimer(ITIMER_REAL, &v, NULL);
-	signal(SIGALRM, SIG_IGN);
-	progress_update = 0;
-}
 
 static int is_foreground_fd(int fd)
 {
@@ -249,6 +205,54 @@ void display_progress(struct progress *progress, uint64_t n)
 		display(progress, n, NULL);
 }
 
+static void progress_interval(int signum)
+{
+	progress_update = 1;
+}
+
+/*
+ * The progress_test_force_update() function is intended for testing
+ * the progress output, i.e. exclusively for 'test-tool progress'.
+ */
+void progress_test_force_update(void)
+{
+	progress_update = 1;
+}
+
+static void set_progress_signal(void)
+{
+	struct sigaction sa;
+	struct itimerval v;
+
+	if (progress_testing)
+		return;
+
+	progress_update = 0;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = progress_interval;
+	sigemptyset(&sa.sa_mask);
+	sa.sa_flags = SA_RESTART;
+	sigaction(SIGALRM, &sa, NULL);
+
+	v.it_interval.tv_sec = 1;
+	v.it_interval.tv_usec = 0;
+	v.it_value = v.it_interval;
+	setitimer(ITIMER_REAL, &v, NULL);
+}
+
+static void clear_progress_signal(void)
+{
+	struct itimerval v = {{0,},};
+
+	if (progress_testing)
+		return;
+
+	setitimer(ITIMER_REAL, &v, NULL);
+	signal(SIGALRM, SIG_IGN);
+	progress_update = 0;
+}
+
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay, unsigned sparse)
 {
-- 
2.32.0.957.gd9e39d72fe6


* [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update()
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (2 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Define the progress_test_force_update() function in terms of
progress_interval(). For documentation purposes these two functions
have the same body, but different names. Let's just define the test
function by calling progress_interval() with SIGALRM ourselves.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 893cb0fe56..7fcc513717 100644
--- a/progress.c
+++ b/progress.c
@@ -216,7 +216,7 @@ static void progress_interval(int signum)
  */
 void progress_test_force_update(void)
 {
-	progress_update = 1;
+	progress_interval(SIGALRM);
 }
 
 static void set_progress_signal(void)
-- 
2.32.0.957.gd9e39d72fe6


* [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (3 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

It's the clear intention of the combination of 137a0d0ef56 (Flush
progress message buffer in display()., 2007-11-19) and
85cb8906f0e (progress: no progress in background, 2015-04-13) to call
fflush(stderr) when we have a stderr in the foreground, but we ended
up always calling fflush(stderr) seemingly by omission. Let's not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 7fcc513717..1fade5808d 100644
--- a/progress.c
+++ b/progress.c
@@ -91,7 +91,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 	}
 
 	if (show_update) {
-		if (is_foreground_fd(fileno(stderr)) || done) {
+		int stderr_is_foreground_fd = is_foreground_fd(fileno(stderr));
+		if (stderr_is_foreground_fd || done) {
 			const char *eol = done ? done : "\r";
 			size_t clear_len = counters_sb->len < last_count_len ?
 					last_count_len - counters_sb->len + 1 :
@@ -115,7 +116,8 @@ static void display(struct progress *progress, uint64_t n, const char *done)
 				fprintf(stderr, "%s: %s%*s", progress->title,
 					counters_sb->buf, (int) clear_len, eol);
 			}
-			fflush(stderr);
+			if (stderr_is_foreground_fd)
+				fflush(stderr);
 		}
 		progress_update = 0;
 	}
-- 
2.32.0.957.gd9e39d72fe6


* [PATCH 6/8] progress.c: add temporary variable from progress struct
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (4 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Add a temporary "progress" variable for the dereferenced p_progress
pointer to a "struct progress *". Before 98a13647408 (trace2: log
progress time and throughput, 2020-05-12) we didn't dereference
"p_progress" in this function. Now that we do, it's easier to read the
code if we work with a "progress" struct pointer like everywhere else,
instead of a pointer to a pointer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/progress.c b/progress.c
index 1fade5808d..1ab7d19deb 100644
--- a/progress.c
+++ b/progress.c
@@ -331,15 +331,16 @@ void stop_progress(struct progress **p_progress)
 	finish_if_sparse(*p_progress);
 
 	if (*p_progress) {
+		struct progress *progress = *p_progress;
 		trace2_data_intmax("progress", the_repository, "total_objects",
 				   (*p_progress)->total);
 
 		if ((*p_progress)->throughput)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
-					   (*p_progress)->throughput->curr_total);
+					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+		trace2_region_leave("progress", progress->title, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
-- 
2.32.0.957.gd9e39d72fe6


* [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress()
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (5 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-22 12:55   ` [PATCH 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
  2021-07-23 22:02   ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Junio C Hamano
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
bitmap writing, 2013-12-21): we did not call stop_progress() if we
reached the early exit in this function. This will matter in a
subsequent commit where we BUG(...) out if this happens, and matters
now e.g. because we don't have a corresponding "region_leave" for the
progress trace2 event.
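
To illustrate the pattern in isolation, here is a toy sketch, not the
pack-bitmap-write.c code. toy_start(), toy_stop(), "active_bars" and
select_items() are hypothetical stand-ins; the point is only that
every return path after the start call must also reach the stop call.

```c
#include <stddef.h>

/* Toy stand-ins for start_progress()/stop_progress(); all hypothetical. */
static int active_bars;

static void toy_start(void) { active_bars++; }
static void toy_stop(void)  { active_bars--; }

/* Returns the number of items selected. */
static size_t select_items(size_t nr)
{
	size_t i, selected = 0;

	toy_start();
	if (nr < 100) {
		/* Small input: take everything, then leave early. */
		for (i = 0; i < nr; i++)
			selected++;
		toy_stop();	/* without this the bar is never stopped */
		return selected;
	}

	/* Larger input: take every tenth item. */
	for (i = 0; i < nr; i += 10)
		selected++;
	toy_stop();
	return selected;
}
```

After either path "active_bars" is back at 0. Dropping the toy_stop()
in the early-exit branch would leave it at 1.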

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack-bitmap-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 88d9e696a5..6e110e41ea 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -550,6 +550,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
+		stop_progress(&writer.progress);
 		return;
 	}
 
-- 
2.32.0.957.gd9e39d72fe6


* [PATCH 8/8] progress.c: add & assert a "global_progress" variable
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (6 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-07-22 12:55   ` Ævar Arnfjörð Bjarmason
  2021-07-23 22:02   ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Junio C Hamano
  8 siblings, 0 replies; 83+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22 12:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Ævar Arnfjörð Bjarmason

The progress.c code makes a hard assumption that only one progress bar
be active at a time (see [1] for a bug where this wasn't the case),
but nothing has asserted that that's the case. Let's add a BUG()
that'll trigger if two progress bars are active at the same time.

There's an alternate test-only approach to doing the same thing[2],
but by doing this for all progress bars we'll have a canary to check
if we have any unexpected interaction between the "sig_atomic_t
progress_update" variable and this global struct.

I am then planning on using this scaffolding in the future to fix a
limitation in the progress output, namely the current limitation of
the progress.c bar code that any update must pro-actively go through
the likes of display_progress().

If we e.g. hang forever before the first display_progress(), or in the
middle of a loop that would call display_progress(), the user will only
see either no output, or output frozen at the last display_progress()
that would have done an update (e.g. in cases where progress_update
was "1" due to an earlier signal).

This change does not fix that, but sets up the structure for solving
that and other related problems by juggling this "global_progress"
struct. Later changes will make more use of the "global_progress" than
only using it for these assertions.

1. 6f9d5f2fda1 (commit-graph: fix progress of reachable commits, 2020-07-09)
2. https://lore.kernel.org/git/20210620200303.2328957-3-szeder.dev@gmail.com
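
The invariant being asserted can be sketched on its own as follows.
This is a toy model under assumptions: toy_set() and toy_clear() are
hypothetical names, and they return -1 where the patch would call
BUG(), so that the behavior can be checked without aborting.

```c
#include <stddef.h>

/* Hypothetical miniature of "struct progress", for illustration only. */
struct toy_progress {
	const char *title;
};

/* At most one progress bar may be registered here at a time. */
static struct toy_progress *toy_global;

/* Returns 0 on success, -1 where the real code would BUG(). */
static int toy_set(struct toy_progress *p)
{
	if (toy_global)
		return -1;	/* a second bar was started concurrently */
	toy_global = p;
	return 0;
}

/* Returns 0 on success, -1 where the real code would BUG(). */
static int toy_clear(void)
{
	if (!toy_global)
		return -1;	/* "stop" without a matching "start" */
	toy_global = NULL;
	return 0;
}
```

Starting "one" and then "two" without a stop in between makes the
second toy_set() fail, which mirrors the "start 0 one" / "start 0 two"
test added below.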

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c                  | 17 +++++++++++++----
 t/t0500-progress-display.sh | 11 +++++++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 1ab7d19deb..14a023f4b4 100644
--- a/progress.c
+++ b/progress.c
@@ -46,6 +46,7 @@ struct progress {
 };
 
 static volatile sig_atomic_t progress_update;
+static struct progress *global_progress;
 
 /*
  * These are only intended for testing the progress output, i.e. exclusively
@@ -221,11 +222,15 @@ void progress_test_force_update(void)
 	progress_interval(SIGALRM);
 }
 
-static void set_progress_signal(void)
+static void set_progress_signal(struct progress *progress)
 {
 	struct sigaction sa;
 	struct itimerval v;
 
+	if (global_progress)
+		BUG("should have no global_progress in set_progress_signal()");
+	global_progress = progress;
+
 	if (progress_testing)
 		return;
 
@@ -243,10 +248,14 @@ static void set_progress_signal(void)
 	setitimer(ITIMER_REAL, &v, NULL);
 }
 
-static void clear_progress_signal(void)
+static void clear_progress_signal(struct progress *progress)
 {
 	struct itimerval v = {{0,},};
 
+	if (!global_progress)
+		BUG("should have a global_progress in clear_progress_signal()");
+	global_progress = NULL;
+
 	if (progress_testing)
 		return;
 
@@ -270,7 +279,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
-	set_progress_signal();
+	set_progress_signal(progress);
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
@@ -374,7 +383,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		display(progress, progress->last_value, buf);
 		free(buf);
 	}
-	clear_progress_signal();
+	clear_progress_signal(progress);
 	strbuf_release(&progress->counters_sb);
 	if (progress->throughput)
 		strbuf_release(&progress->throughput->display);
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index ffa819ca1d..124d33c96b 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -296,6 +296,17 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	test_cmp expect out
 '
 
+test_expect_success 'BUG: start two concurrent progress bars' '
+	cat >in <<-\EOF &&
+	start 0 one
+	start 0 two
+	EOF
+
+	test_must_fail test-tool progress \
+		<in 2>stderr &&
+	grep -E "^BUG: .*: should have no global_progress in set_progress_signal\(\)$" stderr
+'
+
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
 	start 40
-- 
2.32.0.957.gd9e39d72fe6


* Re: [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line
  2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
@ 2021-07-23 21:55     ` Junio C Hamano
  0 siblings, 0 replies; 83+ messages in thread
From: Junio C Hamano @ 2021-07-23 21:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> From: SZEDER Gábor <szeder.dev@gmail.com>
>
> The final value of the counter of the "Scanning merged commits"
> progress line is always one less than its expected total, e.g.:
>
>   Scanning merged commits:  83% (5/6), done.
>
> This happens because while iterating over an array the loop variable
> is passed to display_progress() as-is, but while C arrays (and thus
> the loop variable) start at 0 and end at N-1, the progress counter
> must end at N.  This causes the failures of the tests
> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>
> Fix this by passing 'i + 1' to display_progress(), like most other
> callsites do.

Sensible, I guess.
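
For anyone following along, the off-by-one can be shown with a toy
loop. This is a sketch only: toy_display_progress(), "last_reported"
and scan_items() are hypothetical stand-ins, not the commit-graph.c
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Remembers the last value a caller reported; hypothetical stand-in. */
static uint64_t last_reported;

static void toy_display_progress(uint64_t n)
{
	last_reported = n;
}

/*
 * Walk "nr" items. With use_fix == 0 the zero-based loop index is
 * reported as-is; with use_fix != 0 the count of items handled so
 * far (i + 1) is reported instead.
 */
static void scan_items(size_t nr, int use_fix)
{
	size_t i;

	for (i = 0; i < nr; i++)
		toy_display_progress(use_fix ? i + 1 : i);
}
```

With six items the unfixed variant ends at 5, matching the "(5/6)"
line quoted above, and the fixed variant ends at 6.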

>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  commit-graph.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 1a2602da61..918061f207 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>  
>  	ctx->num_extra_edges = 0;
>  	for (i = 0; i < ctx->commits.nr; i++) {
> -		display_progress(ctx->progress, i);
> +		display_progress(ctx->progress, i + 1);
>  
>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>  			  &ctx->commits.list[i]->object.oid)) {


* Re: [PATCH 2/3] midx: don't provide a total for QSORT() progress
  2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
@ 2021-07-23 21:56     ` Junio C Hamano
  0 siblings, 0 replies; 83+ messages in thread
From: Junio C Hamano @ 2021-07-23 21:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> The quicksort algorithm can be anywhere between O(n) and O(n^2), so
> providing a "num objects" as a total means that in some cases we're
> going to go past 100%.
>
> This fixes a logic error in 5ae18df9d8e (midx: during verify group
> objects by packfile to speed verification, 2019-03-21), which in turn
> seems to have been diligently copied from my own logic error in the
> commit-graph.c code, see 890226ccb57 (commit-graph write: add
> itermediate progress, 2019-01-19).

Interesting.

>
> That commit-graph code of mine was removed in
> 1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
> 2020-12-07), so we don't need to fix that too.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  midx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/midx.c b/midx.c
> index 9a35b0255d..eaae75ab19 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -1291,7 +1291,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>  
>  	if (flags & MIDX_PROGRESS)
>  		progress = start_sparse_progress(_("Sorting objects by packfile"),
> -						 m->num_objects);
> +						 0);
>  	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
>  	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
>  	stop_progress(&progress);


* Re: [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line
  2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
@ 2021-07-23 22:01     ` Junio C Hamano
  0 siblings, 0 replies; 83+ messages in thread
From: Junio C Hamano @ 2021-07-23 22:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> From: SZEDER Gábor <szeder.dev@gmail.com>
>
> The "Filtering content" progress in entry.c:finish_delayed_checkout()
> is unusual because of how it calculates the progress count and because
> it shows the progress of a nested loop.  It works basically like this:
>
>   start_delayed_progress(p, nr_of_paths_to_filter)
>   for_each_filter {
>       display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter)
>       for_each_path_handled_by_the_current_filter {
>           checkout_entry()
>       }
>   }
>   stop_progress(p)
>
> There are two issues with this approach:
>
>   - The work done by the last filter (or the only filter if there is
>     only one) is never counted, so if the last filter still has some
>     paths to process, then the counter shown in the "done" progress
>     line will not match the expected total.
>
>     This would cause a BUG() in an upcoming change that adds an
>     assertion checking if the "total" at the end matches the last
>     progress bar update.

So the other series will semantically depend on this 3-patch series?
Just making sure that is the intended topic structure.

> diff --git a/entry.c b/entry.c
> index 125fabdbd5..d92dd020b3 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -162,7 +162,7 @@ static int remove_available_paths(struct string_list_item *item, void *cb_data)
>  int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  {
>  	int errs = 0;
> -	unsigned delayed_object_count;
> +	unsigned processed_paths = 0;
>  	off_t filtered_bytes = 0;
>  	struct string_list_item *filter, *path;
>  	struct progress *progress;
> @@ -172,12 +172,10 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  		return errs;
>  
>  	dco->state = CE_RETRY;
> -	delayed_object_count = dco->paths.nr;
> -	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
> +	progress = start_delayed_progress(_("Filtering content"), dco->paths.nr);
>  	while (dco->filters.nr > 0) {
>  		for_each_string_list_item(filter, &dco->filters) {
>  			struct string_list available_paths = STRING_LIST_INIT_NODUP;
> -			display_progress(progress, delayed_object_count - dco->paths.nr);
>  
>  			if (!async_query_available_blobs(filter->string, &available_paths)) {
>  				/* Filter reported an error */
> @@ -224,6 +222,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
>  				ce = index_file_exists(state->istate, path->string,
>  						       strlen(path->string), 0);
>  				if (ce) {
> +					display_progress(progress, ++processed_paths);
>  					errs |= checkout_entry(ce, state, NULL, nr_checkouts);
>  					filtered_bytes += ce->ce_stat_data.sd_size;
>  					display_throughput(progress, filtered_bytes);

Hmph.  A missing cache entry will not increment processed_paths; would
that cause stop_progress() to see, at the end, a counter that is
smaller than the dco->paths.nr we saw at the beginning?
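
[Illustration, not part of the patch.]  The concern can be modelled
outside of git.  This is a sketch under the assumption that the counter
is bumped only inside the "if (ce)" branch shown in the hunk above;
count_processed_paths() and the found[] array are hypothetical stand-ins
for the loop and for index_file_exists():

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the inner loop: found[i] is non-zero when
 * path i still has an index entry.  Returns the last value the
 * progress counter would have been given.
 */
static unsigned count_processed_paths(const int *found, size_t nr_paths)
{
	unsigned processed_paths = 0;
	size_t i;

	assert(found || !nr_paths);
	for (i = 0; i < nr_paths; i++) {
		if (found[i])
			++processed_paths; /* display_progress() would go here */
		/* else: error path, counter is left untouched */
	}
	return processed_paths;
}
```

With three paths of which one is missing, the model returns 2 while the
total announced at the start was 3, which is the mismatch asked about.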



* Re: [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup
  2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
                     ` (7 preceding siblings ...)
  2021-07-22 12:55   ` [PATCH 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
@ 2021-07-23 22:02   ` Junio C Hamano
  8 siblings, 0 replies; 83+ messages in thread
From: Junio C Hamano @ 2021-07-23 22:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> These patches were originally submitted as part of a much larger topic
> at [1]. They add a "global_progress" "struct progress *" which we
> assign/clear as we start/stop progress bars.
>
> This will become important for some new progress features I have
> planned, but right now is just used to assert that we don't start two
> progress bars at the same time. 7/8 fixes an existing bug where we did
> that.
>
> To get there I fixed up the test helper to be able to test this, moved
> some code around, and fixed a couple of existing nits in 5/8 and 6/8.
>
> See also [2] which is a re-submission of that larger topic, but the
> two can proceed independently.

OK.
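
[Illustration, not the code from 8/8.]  The idea described in the cover
letter, a single pointer that is set when a progress bar starts and
cleared when it stops, can be sketched as follows.  The struct and the
*_checked() function names are hypothetical, and the sketch returns -1
where the real series presumably aborts, so that the check can be
exercised without killing the process:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for git's opaque "struct progress". */
struct progress {
	const char *title;
};

/* The one progress bar currently active, or NULL if there is none. */
static struct progress *global_progress;

/* Returns 0 on success, -1 if another progress bar is still active. */
static int start_progress_checked(struct progress *p)
{
	assert(p);
	if (global_progress)
		return -1;
	global_progress = p;
	return 0;
}

/* Clears the global only if "p" is the bar that set it. */
static void stop_progress_checked(struct progress *p)
{
	if (global_progress == p)
		global_progress = NULL;
}
```

Under this model a missing stop call, like the one 7/8 adds to
pack-bitmap-write.c, shows up as a failure the next time any progress
bar is started.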

>
> 1. https://lore.kernel.org/git/cover-00.25-00000000000-20210623T155626Z-avarab@gmail.com/
> 2. https://lore.kernel.org/git/cover-0.3-0000000000-20210722T121801Z-avarab@gmail.com/
>
> Ævar Arnfjörð Bjarmason (8):
>   progress.c tests: make start/stop verbs on stdin
>   progress.c tests: test some invalid usage
>   progress.c: move signal handler functions lower
>   progress.c: call progress_interval() from progress_test_force_update()
>   progress.c: stop eagerly fflush(stderr) when not a terminal
>   progress.c: add temporary variable from progress struct
>   pack-bitmap-write.c: add a missing stop_progress()
>   progress.c: add & assert a "global_progress" variable
>
>  pack-bitmap-write.c         |   1 +
>  progress.c                  | 116 ++++++++++++++++++++----------------
>  t/helper/test-progress.c    |  43 +++++++++----
>  t/t0500-progress-display.sh | 103 +++++++++++++++++++++++++-------
>  4 files changed, 178 insertions(+), 85 deletions(-)


end of thread, other threads:[~2021-07-23 22:03 UTC | newest]

Thread overview: 83+ messages
2021-06-20 20:02 [PATCH 0/7] progress: verify progress counters in the test suite SZEDER Gábor
2021-06-20 20:02 ` [PATCH 1/7] progress: introduce GIT_TEST_CHECK_PROGRESS to verify progress counters SZEDER Gábor
2021-06-21  7:09   ` Ævar Arnfjörð Bjarmason
2021-06-22 15:55   ` Taylor Blau
2021-06-20 20:02 ` [PATCH 2/7] progress: catch nested/overlapping progresses with GIT_TEST_CHECK_PROGRESS SZEDER Gábor
2021-06-22 16:00   ` Taylor Blau
2021-06-20 20:02 ` [PATCH 3/7] progress: catch backwards counting " SZEDER Gábor
2021-06-20 20:03 ` [PATCH 4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line SZEDER Gábor
2021-06-20 22:13   ` Ævar Arnfjörð Bjarmason
2021-06-21 18:32     ` René Scharfe
2021-06-21 20:08       ` Ævar Arnfjörð Bjarmason
2021-06-26  8:27         ` René Scharfe
2021-06-26 14:11           ` Ævar Arnfjörð Bjarmason
2021-06-26 20:22             ` René Scharfe
2021-06-26 21:38               ` Ævar Arnfjörð Bjarmason
2021-07-04 12:15                 ` René Scharfe
2021-07-05 14:09                   ` Junio C Hamano
2021-07-05 23:28                   ` Ævar Arnfjörð Bjarmason
2021-07-06 16:02                     ` René Scharfe
2021-06-27 17:31               ` Felipe Contreras
2021-06-20 20:03 ` [PATCH 5/7] entry: show finer-grained counter in "Filtering content" " SZEDER Gábor
2021-06-20 20:03 ` [PATCH 6/7] [RFC] entry: don't show "Filtering content: ... done." line in case of errors SZEDER Gábor
2021-06-21 18:32   ` René Scharfe
2021-06-23  1:52     ` Taylor Blau
2021-06-20 20:03 ` [PATCH 7/7] test-lib: enable GIT_TEST_CHECK_PROGRESS by default SZEDER Gábor
2021-06-21  0:59 ` [PATCH 0/7] progress: verify progress counters in the test suite Ævar Arnfjörð Bjarmason
2021-06-23  2:04   ` Taylor Blau
2021-06-23 17:48     ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 01/25] progress.c tests: fix breakage with COLUMNS != 80 Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 02/25] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 03/25] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 04/25] progress.c tests: add a "signal" verb Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 05/25] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 06/25] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 07/25] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 08/25] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 09/25] midx perf: add a perf test for multi-pack-index Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 10/25] progress.c: remove the "sparse" mode nano-optimization Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 11/25] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 12/25] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 13/25] progress.[ch]: move the "struct progress" to the header Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 14/25] progress.[ch]: move test-only code away from "extern" variables Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 15/25] progress.c: pass "is done?" (again) to display() Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 16/25] progress.[ch]: convert "title" to "struct strbuf" Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 17/25] progress.c: refactor display() for less confusion, and fix bug Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 18/25] progress.c: emit progress on first signal, show "stalled" Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 19/25] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 20/25] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 21/25] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [PATCH 22/25] progress.c: add a stop_progress_early() function Ævar Arnfjörð Bjarmason
2021-06-24 10:35         ` Ævar Arnfjörð Bjarmason
2021-06-25  1:24         ` Andrei Rybak
2021-06-23 17:48       ` [PATCH 23/25] entry: deal with unexpected "Filtering content" total Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [RFC/PATCH 24/25] progress: assert last update in stop_progress() Ævar Arnfjörð Bjarmason
2021-06-23 17:48       ` [RFC/PATCH 25/25] progress: assert counting upwards in display() Ævar Arnfjörð Bjarmason
2021-06-23 17:59       ` [PATCH 00/25] progress.c: various fixes + SZEDER's RFC code Randall S. Becker
2021-06-23 20:01         ` Ævar Arnfjörð Bjarmason
2021-06-23 20:25           ` Randall S. Becker
2021-06-23 21:57 ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
2021-06-23 21:57   ` [PATCH 1/4] WIP progress, isatty(2), hidden progress lnies for GIT_TEST_CHECK_PROGRESS SZEDER Gábor
2021-06-23 21:57   ` [PATCH 2/4] blame: fix progress total with line ranges SZEDER Gábor
2021-06-23 21:57   ` [PATCH 3/4] read-cache: avoid overlapping progress lines SZEDER Gábor
2021-06-23 21:57   ` [PATCH 4/4] preload-index: fix "Refreshing index" progress line SZEDER Gábor
2021-06-23 22:11   ` [PATCH 0/4] WIP/POC check isatty(2)-protected progress lines SZEDER Gábor
2021-06-24 10:43     ` Ævar Arnfjörð Bjarmason
2021-06-24 10:45   ` Ævar Arnfjörð Bjarmason
2021-07-22 12:20 ` [PATCH 0/3] progress.c API users: fix bogus counting Ævar Arnfjörð Bjarmason
2021-07-22 12:20   ` [PATCH 1/3] commit-graph: fix bogus counter in "Scanning merged commits" progress line Ævar Arnfjörð Bjarmason
2021-07-23 21:55     ` Junio C Hamano
2021-07-22 12:20   ` [PATCH 2/3] midx: don't provide a total for QSORT() progress Ævar Arnfjörð Bjarmason
2021-07-23 21:56     ` Junio C Hamano
2021-07-22 12:20   ` [PATCH 3/3] entry: show finer-grained counter in "Filtering content" progress line Ævar Arnfjörð Bjarmason
2021-07-23 22:01     ` Junio C Hamano
2021-07-22 12:54 ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Ævar Arnfjörð Bjarmason
2021-07-22 12:54   ` [PATCH 1/8] progress.c tests: make start/stop verbs on stdin Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 2/8] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 3/8] progress.c: move signal handler functions lower Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 4/8] progress.c: call progress_interval() from progress_test_force_update() Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 5/8] progress.c: stop eagerly fflush(stderr) when not a terminal Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 6/8] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 7/8] pack-bitmap-write.c: add a missing stop_progress() Ævar Arnfjörð Bjarmason
2021-07-22 12:55   ` [PATCH 8/8] progress.c: add & assert a "global_progress" variable Ævar Arnfjörð Bjarmason
2021-07-23 22:02   ` [PATCH 0/8] progress: assert "global_progress" + test fixes / cleanup Junio C Hamano
