git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] enable progress traces even when quiet
@ 2020-07-10  1:42 Emily Shaffer
  2020-07-10  1:42 ` [PATCH 1/2] progress: create progress struct in 'verbose' mode Emily Shaffer
  2020-07-10  1:42 ` [PATCH 2/2] progress: remove redundant null-checking Emily Shaffer
  0 siblings, 2 replies; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10  1:42 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

This is a simple but large-scale change, because it changes an API and
fixes all callers. The bulk of the change is in the first patch,
including justification in the commit message; the second patch is only
tidying I noticed while fixing compiler errors from the former. If that
churn is unwanted I'm happy to drop it.

I wanted to do this change because we received reports that 'git clone'
was only logging traces sometimes, not all the time. An alternative
approach would have been to move the 'if (verbose)' wrappers to surround
calls to 'display_progress()' - but this would have worked in
builtin/index-pack.c (where the reported bug was happening) only because
of that code's use of static global flags. Placing the onus on the
caller to ensure that trace collection occurs, even in quiet mode, is
flimsy and error-prone. Adding another flag to 'struct progress' ensures
that trace collection is consistent regardless of how the caller
organizes their code.

In a couple of cases, this does result in some additional work
calculating titles and object counts where previously there wasn't;
however, since those values will be used by the trace logger, I don't
consider it a waste of resources.

Finally, 'git rev-list' has one weird case where it takes the progress
title as an argument (i.e. 'git rev-list --progress "Working hard" ...')
and also null-checks that string to decide whether to calle
'start_progress()' - in this case, I added a placeholder text for the
traces to show. Bikeshedding welcome here.

 - Emily

Emily Shaffer (2):
  progress: create progress struct in 'verbose' mode
  progress: remove redundant null-checking

 builtin/blame.c             |   4 +-
 builtin/commit-graph.c      |   8 +-
 builtin/fsck.c              |  25 +++----
 builtin/index-pack.c        |  16 ++--
 builtin/log.c               |   4 +-
 builtin/pack-objects.c      |  18 ++---
 builtin/prune.c             |   4 +-
 builtin/rev-list.c          |   9 ++-
 builtin/unpack-objects.c    |   3 +-
 commit-graph.c              | 143 +++++++++++++++++-------------------
 delta-islands.c             |   4 +-
 diffcore-rename.c           |   9 +--
 entry.c                     |   2 +-
 midx.c                      |  43 +++++------
 pack-bitmap-write.c         |   8 +-
 pack-bitmap.c               |   5 +-
 preload-index.c             |   7 +-
 progress.c                  |  27 ++++---
 progress.h                  |  10 ++-
 prune-packed.c              |   4 +-
 read-cache.c                |  15 ++--
 t/helper/test-progress.c    |   6 +-
 t/t0500-progress-display.sh |  27 +++++++
 unpack-trees.c              |  14 ++--
 walker.c                    |   4 +-
 25 files changed, 217 insertions(+), 202 deletions(-)

-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  1:42 [PATCH 0/2] enable progress traces even when quiet Emily Shaffer
@ 2020-07-10  1:42 ` Emily Shaffer
  2020-07-10  2:00   ` Derrick Stolee
                     ` (3 more replies)
  2020-07-10  1:42 ` [PATCH 2/2] progress: remove redundant null-checking Emily Shaffer
  1 sibling, 4 replies; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10  1:42 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Before now, the progress API is used by conditionally calling
start_progress() or a similar call, and then unconditionally calling
display_progress() and stop_progress(), both of which are tolerant of
NULL or uninitialized inputs. However, in
98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
throughput), the progress library learned to log traces during expensive
operations. In cases where progress should not be displayed to the user
- such as when Git is called by a script - no traces will be logged,
because the progress object is never created.

Instead, to allow us to collect traces from scripted Git commands, teach
a progress->verbose flag, which is specified via a new argument to
start_progress() and friends. display_progress() also learns to filter
for that flag. With these changes, start_progress() can be called
unconditionally but with a conditional as an argument to determine
whether to report progress to the user.

Since this changes the API, also modify callers of start_progress() and
friends to drop their conditional and pass a new argument in instead.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/blame.c             |   4 +-
 builtin/commit-graph.c      |   5 +-
 builtin/fsck.c              |  25 ++++---
 builtin/index-pack.c        |  16 ++---
 builtin/log.c               |   4 +-
 builtin/pack-objects.c      |  18 +++--
 builtin/prune.c             |   4 +-
 builtin/rev-list.c          |   9 ++-
 builtin/unpack-objects.c    |   3 +-
 commit-graph.c              | 140 +++++++++++++++++-------------------
 delta-islands.c             |   4 +-
 diffcore-rename.c           |   9 ++-
 entry.c                     |   2 +-
 midx.c                      |  43 +++++------
 pack-bitmap-write.c         |   8 +--
 pack-bitmap.c               |   5 +-
 preload-index.c             |   7 +-
 progress.c                  |  27 ++++---
 progress.h                  |  10 +--
 prune-packed.c              |   4 +-
 read-cache.c                |   6 +-
 t/helper/test-progress.c    |   6 +-
 t/t0500-progress-display.sh |  27 +++++++
 unpack-trees.c              |  14 ++--
 walker.c                    |   4 +-
 25 files changed, 212 insertions(+), 192 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 94ef57c1cc..1d72b27fda 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1129,8 +1129,8 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 
 	sb.found_guilty_entry = &found_guilty_entry;
 	sb.found_guilty_entry_data = &pi;
-	if (show_progress)
-		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
+	pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines,
+					     show_progress);
 
 	assign_blame(&sb, opt);
 
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index f6797e2a9f..ae4dc2dfcd 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -240,9 +240,8 @@ static int graph_write(int argc, const char **argv)
 					   strbuf_detach(&buf, NULL));
 	} else if (opts.stdin_commits) {
 		oidset_init(&commits, 0);
-		if (opts.progress)
-			progress = start_delayed_progress(
-				_("Collecting commits from input"), 0);
+		progress = start_delayed_progress(
+			_("Collecting commits from input"), 0, opts.progress);
 
 		while (strbuf_getline(&buf, stdin) != EOF) {
 			if (read_one_commit(&commits, progress, buf.buf)) {
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 37aa07da78..3236ed91ac 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -205,8 +205,8 @@ static int traverse_reachable(void)
 	struct progress *progress = NULL;
 	unsigned int nr = 0;
 	int result = 0;
-	if (show_progress)
-		progress = start_delayed_progress(_("Checking connectivity"), 0);
+	progress = start_delayed_progress(_("Checking connectivity"), 0,
+					  show_progress);
 	while (pending.nr) {
 		result |= traverse_one_object(object_array_pop(&pending));
 		display_progress(progress, ++nr);
@@ -672,8 +672,8 @@ static void fsck_object_dir(const char *path)
 	if (verbose)
 		fprintf_ln(stderr, _("Checking object directory"));
 
-	if (show_progress)
-		progress = start_progress(_("Checking object directories"), 256);
+	progress = start_progress(_("Checking object directories"), 256,
+				  show_progress);
 
 	for_each_loose_file_in_objdir(path, fsck_loose, fsck_cruft, fsck_subdir,
 				      progress);
@@ -836,16 +836,15 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 			uint32_t total = 0, count = 0;
 			struct progress *progress = NULL;
 
-			if (show_progress) {
-				for (p = get_all_packs(the_repository); p;
-				     p = p->next) {
-					if (open_pack_index(p))
-						continue;
-					total += p->num_objects;
-				}
-
-				progress = start_progress(_("Checking objects"), total);
+			for (p = get_all_packs(the_repository); p;
+			     p = p->next) {
+				if (open_pack_index(p))
+					continue;
+				total += p->num_objects;
 			}
+
+			progress = start_progress(_("Checking objects"), total,
+						  show_progress);
 			for (p = get_all_packs(the_repository); p;
 			     p = p->next) {
 				/* verify gives error messages itself */
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f865666db9..945fc13ddd 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -221,8 +221,7 @@ static unsigned check_objects(void)
 
 	max = get_max_object_index();
 
-	if (verbose)
-		progress = start_delayed_progress(_("Checking objects"), max);
+	progress = start_delayed_progress(_("Checking objects"), max, verbose);
 
 	for (i = 0; i < max; i++) {
 		foreign_nr += check_object(get_indexed_object(i));
@@ -1116,10 +1115,9 @@ static void parse_pack_objects(unsigned char *hash)
 	struct object_id ref_delta_oid;
 	struct stat st;
 
-	if (verbose)
-		progress = start_progress(
-				from_stdin ? _("Receiving objects") : _("Indexing objects"),
-				nr_objects);
+	progress = start_progress(
+			from_stdin ? _("Receiving objects") : _("Indexing objects"),
+			nr_objects, verbose);
 	for (i = 0; i < nr_objects; i++) {
 		struct object_entry *obj = &objects[i];
 		void *data = unpack_raw_entry(obj, &ofs_delta->offset,
@@ -1194,9 +1192,9 @@ static void resolve_deltas(void)
 	QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry);
 	QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry);
 
-	if (verbose || show_resolving_progress)
-		progress = start_progress(_("Resolving deltas"),
-					  nr_ref_deltas + nr_ofs_deltas);
+	progress = start_progress(_("Resolving deltas"),
+				  nr_ref_deltas + nr_ofs_deltas,
+				  (verbose || show_resolving_progress));
 
 	nr_dispatched = 0;
 	if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) {
diff --git a/builtin/log.c b/builtin/log.c
index d104d5c688..71874df75c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -2045,8 +2045,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	}
 	rev.add_signoff = do_signoff;
 
-	if (show_progress)
-		progress = start_delayed_progress(_("Generating patches"), total);
+	progress = start_delayed_progress(_("Generating patches"), total,
+					  show_progress);
 	while (0 <= --nr) {
 		int shown;
 		display_progress(progress, total - nr);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7016b28485..1a97a9f0be 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1012,8 +1012,8 @@ static void write_pack_file(void)
 	time_t last_mtime = 0;
 	struct object_entry **write_order;
 
-	if (progress > pack_to_stdout)
-		progress_state = start_progress(_("Writing objects"), nr_result);
+	progress_state = start_progress(_("Writing objects"), nr_result,
+					(progress > pack_to_stdout));
 	ALLOC_ARRAY(written_list, to_pack.nr_objects);
 	write_order = compute_write_order();
 
@@ -2050,9 +2050,9 @@ static void get_object_details(void)
 	uint32_t i;
 	struct object_entry **sorted_by_offset;
 
-	if (progress)
-		progress_state = start_progress(_("Counting objects"),
-						to_pack.nr_objects);
+	progress_state = start_progress(_("Counting objects"),
+					to_pack.nr_objects,
+					progress);
 
 	sorted_by_offset = xcalloc(to_pack.nr_objects, sizeof(struct object_entry *));
 	for (i = 0; i < to_pack.nr_objects; i++)
@@ -2847,9 +2847,8 @@ static void prepare_pack(int window, int depth)
 	if (nr_deltas && n > 1) {
 		unsigned nr_done = 0;
 
-		if (progress)
-			progress_state = start_progress(_("Compressing objects"),
-							nr_deltas);
+		progress_state = start_progress(_("Compressing objects"),
+						nr_deltas, progress);
 		QSORT(delta_list, n, type_size_sort);
 		ll_find_deltas(delta_list, n, window+1, depth, &nr_done);
 		stop_progress(&progress_state);
@@ -3699,8 +3698,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			    the_repository);
 	prepare_packing_data(the_repository, &to_pack);
 
-	if (progress)
-		progress_state = start_progress(_("Enumerating objects"), 0);
+	progress_state = start_progress(_("Enumerating objects"), 0, progress);
 	if (!use_internal_rev_list)
 		read_object_list_from_stdin();
 	else {
diff --git a/builtin/prune.c b/builtin/prune.c
index 02c6ab7cba..004338f7e0 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -41,8 +41,8 @@ static void perform_reachability_traversal(struct rev_info *revs)
 	if (initialized)
 		return;
 
-	if (show_progress)
-		progress = start_delayed_progress(_("Checking connectivity"), 0);
+	progress = start_delayed_progress(_("Checking connectivity"), 0,
+					  show_progress);
 	mark_reachable_objects(revs, 1, expire, progress);
 	stop_progress(&progress);
 	initialized = 1;
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index f520111eda..f64cad8390 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (bisect_list)
 		revs.limited = 1;
 
-	if (show_progress)
-		progress = start_delayed_progress(show_progress, 0);
+	/*
+	 * When progress is not printed to the user, we still want to be able to
+	 * classify the progress during tracing. So, use a placeholder name.
+	 */
+	progress = start_delayed_progress(
+			show_progress ? show_progress : _("Quiet rev-list operation"),
+			0, show_progress != NULL);
 
 	if (use_bitmap_index) {
 		if (!try_bitmap_count(&revs, &filter_options))
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index dd4a75e030..719d446916 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -498,8 +498,7 @@ static void unpack_all(void)
 			ntohl(hdr->hdr_version));
 	use(sizeof(struct pack_header));
 
-	if (!quiet)
-		progress = start_progress(_("Unpacking objects"), nr_objects);
+	progress = start_progress(_("Unpacking objects"), nr_objects, !quiet);
 	obj_list = xcalloc(nr_objects, sizeof(*obj_list));
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
diff --git a/commit-graph.c b/commit-graph.c
index 328ab06fd4..b9a784fece 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1152,10 +1152,10 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f,
 	struct progress *progress = NULL;
 	int i = 0;
 
-	if (ctx->report_progress)
-		progress = start_delayed_progress(
-			_("Writing changed paths Bloom filters index"),
-			ctx->commits.nr);
+	progress = start_delayed_progress(
+		_("Writing changed paths Bloom filters index"),
+		ctx->commits.nr,
+		ctx->report_progress);
 
 	while (list < last) {
 		struct bloom_filter *filter = get_bloom_filter(ctx->r, *list, 0);
@@ -1177,10 +1177,10 @@ static void write_graph_chunk_bloom_data(struct hashfile *f,
 	struct progress *progress = NULL;
 	int i = 0;
 
-	if (ctx->report_progress)
-		progress = start_delayed_progress(
-			_("Writing changed paths Bloom filters data"),
-			ctx->commits.nr);
+	progress = start_delayed_progress(
+		_("Writing changed paths Bloom filters data"),
+		ctx->commits.nr,
+		ctx->report_progress);
 
 	hashwrite_be32(f, settings->hash_version);
 	hashwrite_be32(f, settings->num_hashes);
@@ -1213,8 +1213,7 @@ static int add_packed_commits(const struct object_id *oid,
 	off_t offset = nth_packed_object_offset(pack, pos);
 	struct object_info oi = OBJECT_INFO_INIT;
 
-	if (ctx->progress)
-		display_progress(ctx->progress, ++ctx->progress_done);
+	display_progress(ctx->progress, ++ctx->progress_done);
 
 	oi.typep = &type;
 	if (packed_object_info(ctx->r, pack, offset, &oi) < 0)
@@ -1252,10 +1251,10 @@ static void close_reachable(struct write_commit_graph_context *ctx)
 	enum commit_graph_split_flags flags = ctx->split_opts ?
 		ctx->split_opts->flags : COMMIT_GRAPH_SPLIT_UNSPECIFIED;
 
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-					_("Loading known commits in commit graph"),
-					ctx->oids.nr);
+	ctx->progress = start_delayed_progress(
+				_("Loading known commits in commit graph"),
+				ctx->oids.nr,
+				ctx->report_progress);
 	for (i = 0; i < ctx->oids.nr; i++) {
 		display_progress(ctx->progress, i + 1);
 		commit = lookup_commit(ctx->r, &ctx->oids.list[i]);
@@ -1269,10 +1268,9 @@ static void close_reachable(struct write_commit_graph_context *ctx)
 	 * than the number of missing commits in the reachable
 	 * closure.
 	 */
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-					_("Expanding reachable commits in commit graph"),
-					0);
+	ctx->progress = start_delayed_progress(
+				_("Expanding reachable commits in commit graph"),
+				0, ctx->report_progress);
 	for (i = 0; i < ctx->oids.nr; i++) {
 		display_progress(ctx->progress, i + 1);
 		commit = lookup_commit(ctx->r, &ctx->oids.list[i]);
@@ -1289,10 +1287,9 @@ static void close_reachable(struct write_commit_graph_context *ctx)
 	}
 	stop_progress(&ctx->progress);
 
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-					_("Clearing commit marks in commit graph"),
-					ctx->oids.nr);
+	ctx->progress = start_delayed_progress(
+				_("Clearing commit marks in commit graph"),
+				ctx->oids.nr, ctx->report_progress);
 	for (i = 0; i < ctx->oids.nr; i++) {
 		display_progress(ctx->progress, i + 1);
 		commit = lookup_commit(ctx->r, &ctx->oids.list[i]);
@@ -1308,10 +1305,10 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 	int i;
 	struct commit_list *list = NULL;
 
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-					_("Computing commit graph generation numbers"),
-					ctx->commits.nr);
+	ctx->progress = start_delayed_progress(
+				_("Computing commit graph generation numbers"),
+				ctx->commits.nr,
+				ctx->report_progress);
 	for (i = 0; i < ctx->commits.nr; i++) {
 		uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation;
 
@@ -1362,10 +1359,10 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
 
 	init_bloom_filters();
 
-	if (ctx->report_progress)
-		progress = start_delayed_progress(
-			_("Computing commit changed paths Bloom filters"),
-			ctx->commits.nr);
+	progress = start_delayed_progress(
+		_("Computing commit changed paths Bloom filters"),
+		ctx->commits.nr,
+		ctx->report_progress);
 
 	ALLOC_ARRAY(sorted_commits, ctx->commits.nr);
 	COPY_ARRAY(sorted_commits, ctx->commits.list, ctx->commits.nr);
@@ -1418,9 +1415,9 @@ int write_commit_graph_reachable(struct object_directory *odb,
 
 	memset(&data, 0, sizeof(data));
 	data.commits = &commits;
-	if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
-		data.progress = start_delayed_progress(
-			_("Collecting referenced commits"), 0);
+	data.progress = start_delayed_progress(
+		_("Collecting referenced commits"), 0,
+		(flags & COMMIT_GRAPH_WRITE_PROGRESS));
 
 	for_each_ref(add_ref_to_set, &data);
 	result = write_commit_graph(odb, NULL, &commits,
@@ -1442,15 +1439,14 @@ static int fill_oids_from_packs(struct write_commit_graph_context *ctx,
 
 	strbuf_addf(&packname, "%s/pack/", ctx->odb->path);
 	dirlen = packname.len;
-	if (ctx->report_progress) {
-		strbuf_addf(&progress_title,
-			    Q_("Finding commits for commit graph in %d pack",
-			       "Finding commits for commit graph in %d packs",
-			       pack_indexes->nr),
-			    pack_indexes->nr);
-		ctx->progress = start_delayed_progress(progress_title.buf, 0);
-		ctx->progress_done = 0;
-	}
+	strbuf_addf(&progress_title,
+		    Q_("Finding commits for commit graph in %d pack",
+		       "Finding commits for commit graph in %d packs",
+		       pack_indexes->nr),
+		    pack_indexes->nr);
+	ctx->progress = start_delayed_progress(progress_title.buf, 0,
+					       ctx->report_progress);
+	ctx->progress_done = 0;
 	for (i = 0; i < pack_indexes->nr; i++) {
 		struct packed_git *p;
 		strbuf_setlen(&packname, dirlen);
@@ -1498,10 +1494,10 @@ static int fill_oids_from_commits(struct write_commit_graph_context *ctx,
 
 static void fill_oids_from_all_packs(struct write_commit_graph_context *ctx)
 {
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-			_("Finding commits for commit graph among packed objects"),
-			ctx->approx_nr_objects);
+	ctx->progress = start_delayed_progress(
+		_("Finding commits for commit graph among packed objects"),
+		ctx->approx_nr_objects,
+		ctx->report_progress);
 	for_each_packed_object(add_packed_commits, ctx,
 			       FOR_EACH_OBJECT_PACK_ORDER);
 	if (ctx->progress_done < ctx->approx_nr_objects)
@@ -1513,10 +1509,9 @@ static uint32_t count_distinct_commits(struct write_commit_graph_context *ctx)
 {
 	uint32_t i, count_distinct = 1;
 
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-			_("Counting distinct commits in commit graph"),
-			ctx->oids.nr);
+	ctx->progress = start_delayed_progress(
+		_("Counting distinct commits in commit graph"),
+		ctx->oids.nr, ctx->report_progress);
 	display_progress(ctx->progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(ctx->oids.list, ctx->oids.nr, oid_compare);
 
@@ -1545,10 +1540,10 @@ static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
 		ctx->split_opts->flags : COMMIT_GRAPH_SPLIT_UNSPECIFIED;
 
 	ctx->num_extra_edges = 0;
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-			_("Finding extra edges in commit graph"),
-			ctx->oids.nr);
+	ctx->progress = start_delayed_progress(
+		_("Finding extra edges in commit graph"),
+		ctx->oids.nr,
+		ctx->report_progress);
 	for (i = 0; i < ctx->oids.nr; i++) {
 		unsigned int num_parents;
 
@@ -1723,16 +1718,15 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
 		hashwrite(f, chunk_write, 12);
 	}
 
-	if (ctx->report_progress) {
-		strbuf_addf(&progress_title,
-			    Q_("Writing out commit graph in %d pass",
-			       "Writing out commit graph in %d passes",
-			       num_chunks),
-			    num_chunks);
-		ctx->progress = start_delayed_progress(
-			progress_title.buf,
-			num_chunks * ctx->commits.nr);
-	}
+	strbuf_addf(&progress_title,
+		    Q_("Writing out commit graph in %d pass",
+		       "Writing out commit graph in %d passes",
+		       num_chunks),
+		    num_chunks);
+	ctx->progress = start_delayed_progress(
+		progress_title.buf,
+		num_chunks * ctx->commits.nr,
+		ctx->report_progress);
 	write_graph_chunk_fanout(f, ctx);
 	write_graph_chunk_oids(f, hashsz, ctx);
 	write_graph_chunk_data(f, hashsz, ctx);
@@ -1930,10 +1924,10 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 {
 	uint32_t i;
 
-	if (ctx->report_progress)
-		ctx->progress = start_delayed_progress(
-					_("Scanning merged commits"),
-					ctx->commits.nr);
+	ctx->progress = start_delayed_progress(
+				_("Scanning merged commits"),
+				ctx->commits.nr,
+				ctx->report_progress);
 
 	QSORT(ctx->commits.list, ctx->commits.nr, commit_compare);
 
@@ -1965,8 +1959,8 @@ static void merge_commit_graphs(struct write_commit_graph_context *ctx)
 	while (g && current_graph_number >= ctx->num_commit_graphs_after) {
 		current_graph_number--;
 
-		if (ctx->report_progress)
-			ctx->progress = start_delayed_progress(_("Merging commit-graph"), 0);
+		ctx->progress = start_delayed_progress(_("Merging commit-graph"), 0,
+						       ctx->report_progress);
 
 		merge_commit_graph(ctx, g);
 		stop_progress(&ctx->progress);
@@ -2301,9 +2295,9 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
 	if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH)
 		return verify_commit_graph_error;
 
-	if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
-		progress = start_progress(_("Verifying commits in commit graph"),
-					g->num_commits);
+	progress = start_progress(_("Verifying commits in commit graph"),
+				g->num_commits,
+				(flags & COMMIT_GRAPH_WRITE_PROGRESS));
 
 	for (i = 0; i < g->num_commits; i++) {
 		struct commit *graph_commit, *odb_commit;
diff --git a/delta-islands.c b/delta-islands.c
index aa98b2e541..102d4a5573 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -268,8 +268,8 @@ void resolve_tree_islands(struct repository *r,
 	}
 	QSORT(todo, nr, tree_depth_compare);
 
-	if (progress)
-		progress_state = start_progress(_("Propagating island marks"), nr);
+	progress_state = start_progress(_("Propagating island marks"), nr,
+					progress);
 
 	for (i = 0; i < nr; i++) {
 		struct object_entry *ent = todo[i].entry;
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 99e63e90f8..4fd5a7b9da 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -587,11 +587,10 @@ void diffcore_rename(struct diff_options *options)
 		break;
 	}
 
-	if (options->show_rename_progress) {
-		progress = start_delayed_progress(
-				_("Performing inexact rename detection"),
-				(uint64_t)rename_dst_nr * (uint64_t)rename_src_nr);
-	}
+	progress = start_delayed_progress(
+			_("Performing inexact rename detection"),
+			(uint64_t)rename_dst_nr * (uint64_t)rename_src_nr,
+			options->show_rename_progress);
 
 	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_create), sizeof(*mx));
 	for (dst_cnt = i = 0; i < rename_dst_nr; i++) {
diff --git a/entry.c b/entry.c
index 449bd32dee..e46e5b56ed 100644
--- a/entry.c
+++ b/entry.c
@@ -174,7 +174,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 
 	dco->state = CE_RETRY;
 	delayed_object_count = dco->paths.nr;
-	progress = start_delayed_progress(_("Filtering content"), delayed_object_count);
+	progress = start_delayed_progress(_("Filtering content"), delayed_object_count, 1);
 	while (dco->filters.nr > 0) {
 		for_each_string_list_item(filter, &dco->filters) {
 			struct string_list available_paths = STRING_LIST_INIT_NODUP;
diff --git a/midx.c b/midx.c
index 6d1584ca51..d9ab163340 100644
--- a/midx.c
+++ b/midx.c
@@ -836,10 +836,8 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 	}
 
 	packs.pack_paths_checked = 0;
-	if (flags & MIDX_PROGRESS)
-		packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0);
-	else
-		packs.progress = NULL;
+	packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0,
+						(flags & MIDX_PROGRESS));
 
 	for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &packs);
 	stop_progress(&packs.progress);
@@ -973,9 +971,8 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 		written += MIDX_CHUNKLOOKUP_WIDTH;
 	}
 
-	if (flags & MIDX_PROGRESS)
-		progress = start_progress(_("Writing chunks to multi-pack-index"),
-					  num_chunks);
+	progress = start_progress(_("Writing chunks to multi-pack-index"),
+				  num_chunks, (flags & MIDX_PROGRESS));
 	for (i = 0; i < num_chunks; i++) {
 		if (written != chunk_offsets[i])
 			BUG("incorrect chunk offset (%"PRIu64" != %"PRIu64") for chunk id %"PRIx32,
@@ -1108,9 +1105,8 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	if (!m)
 		return 0;
 
-	if (flags & MIDX_PROGRESS)
-		progress = start_progress(_("Looking for referenced packfiles"),
-					  m->num_packs);
+	progress = start_progress(_("Looking for referenced packfiles"),
+				  m->num_packs, (flags & MIDX_PROGRESS));
 	for (i = 0; i < m->num_packs; i++) {
 		if (prepare_midx_pack(r, m, i))
 			midx_report("failed to load pack in position %d", i);
@@ -1137,9 +1133,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		return verify_midx_error;
 	}
 
-	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"),
-						 m->num_objects - 1);
+	progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"),
+					 m->num_objects - 1,
+					 (flags & MIDX_PROGRESS));
 	for (i = 0; i < m->num_objects - 1; i++) {
 		struct object_id oid1, oid2;
 
@@ -1166,15 +1162,16 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i);
 	}
 
-	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Sorting objects by packfile"),
-						 m->num_objects);
+	progress = start_sparse_progress(_("Sorting objects by packfile"),
+					 m->num_objects,
+					 (flags & MIDX_PROGRESS));
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
 
-	if (flags & MIDX_PROGRESS)
-		progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects);
+	progress = start_sparse_progress(_("Verifying object offsets"),
+					 m->num_objects,
+					 (flags & MIDX_PROGRESS));
 	for (i = 0; i < m->num_objects; i++) {
 		struct object_id oid;
 		struct pack_entry e;
@@ -1229,9 +1226,8 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
 
 	count = xcalloc(m->num_packs, sizeof(uint32_t));
 
-	if (flags & MIDX_PROGRESS)
-		progress = start_progress(_("Counting referenced objects"),
-					  m->num_objects);
+	progress = start_progress(_("Counting referenced objects"),
+				  m->num_objects, (flags & MIDX_PROGRESS));
 	for (i = 0; i < m->num_objects; i++) {
 		int pack_int_id = nth_midxed_pack_int_id(m, i);
 		count[pack_int_id]++;
@@ -1239,9 +1235,8 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
 	}
 	stop_progress(&progress);
 
-	if (flags & MIDX_PROGRESS)
-		progress = start_progress(_("Finding and deleting unreferenced packfiles"),
-					  m->num_packs);
+	progress = start_progress(_("Finding and deleting unreferenced packfiles"),
+				  m->num_packs, (flags & MIDX_PROGRESS));
 	for (i = 0; i < m->num_packs; i++) {
 		char *pack_name;
 		display_progress(progress, i + 1);
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index a7a4964b50..9a8bfefeb5 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -259,8 +259,8 @@ void bitmap_writer_build(struct packing_data *to_pack)
 	writer.bitmaps = kh_init_oid_map();
 	writer.to_pack = to_pack;
 
-	if (writer.show_progress)
-		writer.progress = start_progress("Building bitmaps", writer.selected_nr);
+	writer.progress = start_progress("Building bitmaps", writer.selected_nr,
+					 writer.show_progress);
 
 	repo_init_revisions(to_pack->repo, &revs, NULL);
 	revs.tag_objects = 1;
@@ -397,8 +397,8 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 
 	QSORT(indexed_commits, indexed_commits_nr, date_compare);
 
-	if (writer.show_progress)
-		writer.progress = start_progress("Selecting bitmap commits", 0);
+	writer.progress = start_progress("Selecting bitmap commits", 0,
+					 writer.show_progress);
 
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 4077e731e8..73232d460f 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1313,7 +1313,7 @@ void test_bitmap_walk(struct rev_info *revs)
 
 	tdata.bitmap_git = bitmap_git;
 	tdata.base = bitmap_new();
-	tdata.prg = start_progress("Verifying bitmap entries", result_popcnt);
+	tdata.prg = start_progress("Verifying bitmap entries", result_popcnt, 1);
 	tdata.seen = 0;
 
 	traverse_commit_list(revs, &test_show_commit, &test_show_object, &tdata);
@@ -1392,8 +1392,7 @@ int rebuild_existing_bitmaps(struct bitmap_index *bitmap_git,
 	rebuild = bitmap_new();
 	i = 0;
 
-	if (show_progress)
-		progress = start_progress("Reusing bitmaps", 0);
+	progress = start_progress("Reusing bitmaps", 0, show_progress);
 
 	kh_foreach_value(bitmap_git->bitmaps, stored, {
 		if (stored->flags & BITMAP_FLAG_REUSE) {
diff --git a/preload-index.c b/preload-index.c
index ed6eaa4738..a924a84a14 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -115,10 +115,9 @@ void preload_index(struct index_state *index,
 	memset(&data, 0, sizeof(data));
 
 	memset(&pd, 0, sizeof(pd));
-	if (refresh_flags & REFRESH_PROGRESS && isatty(2)) {
-		pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr);
-		pthread_mutex_init(&pd.mutex, NULL);
-	}
+	pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr,
+					     (refresh_flags & REFRESH_PROGRESS && isatty(2)));
+	pthread_mutex_init(&pd.mutex, NULL);
 
 	for (i = 0; i < threads; i++) {
 		struct thread_data *p = data+i;
diff --git a/progress.c b/progress.c
index 3eda914518..f014908b05 100644
--- a/progress.c
+++ b/progress.c
@@ -43,6 +43,7 @@ struct progress {
 	struct strbuf counters_sb;
 	int title_len;
 	int split;
+	int verbose;
 };
 
 static volatile sig_atomic_t progress_update;
@@ -245,12 +246,13 @@ void display_throughput(struct progress *progress, uint64_t total)
 
 void display_progress(struct progress *progress, uint64_t n)
 {
-	if (progress)
+	if (progress && progress->verbose)
 		display(progress, n, NULL);
 }
 
 static struct progress *start_progress_delay(const char *title, uint64_t total,
-					     unsigned delay, unsigned sparse)
+					     unsigned delay, unsigned sparse,
+					     int verbose)
 {
 	struct progress *progress = xmalloc(sizeof(*progress));
 	progress->title = title;
@@ -264,6 +266,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
 	strbuf_init(&progress->counters_sb, 0);
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
+	progress->verbose = verbose;
 	set_progress_signal();
 	trace2_region_enter("progress", title, the_repository);
 	return progress;
@@ -279,14 +282,16 @@ static int get_default_delay(void)
 	return delay_in_secs;
 }
 
-struct progress *start_delayed_progress(const char *title, uint64_t total)
+struct progress *start_delayed_progress(const char *title, uint64_t total,
+					int verbose)
 {
-	return start_progress_delay(title, total, get_default_delay(), 0);
+	return start_progress_delay(title, total, get_default_delay(), 0,
+				    verbose);
 }
 
-struct progress *start_progress(const char *title, uint64_t total)
+struct progress *start_progress(const char *title, uint64_t total, int verbose)
 {
-	return start_progress_delay(title, total, 0, 0);
+	return start_progress_delay(title, total, 0, 0, verbose);
 }
 
 /*
@@ -298,15 +303,17 @@ struct progress *start_progress(const char *title, uint64_t total)
  * When "sparse" is set, stop_progress() will automatically force the done
  * message to show 100%.
  */
-struct progress *start_sparse_progress(const char *title, uint64_t total)
+struct progress *start_sparse_progress(const char *title, uint64_t total,
+				       int verbose)
 {
-	return start_progress_delay(title, total, 0, 1);
+	return start_progress_delay(title, total, 0, 1, verbose);
 }
 
 struct progress *start_delayed_sparse_progress(const char *title,
-					       uint64_t total)
+					       uint64_t total, int verbose)
 {
-	return start_progress_delay(title, total, get_default_delay(), 1);
+	return start_progress_delay(title, total, get_default_delay(), 1,
+				    verbose);
 }
 
 static void finish_if_sparse(struct progress *progress)
diff --git a/progress.h b/progress.h
index f1913acf73..0c64557231 100644
--- a/progress.h
+++ b/progress.h
@@ -13,11 +13,13 @@ void progress_test_force_update(void);
 
 void display_throughput(struct progress *progress, uint64_t total);
 void display_progress(struct progress *progress, uint64_t n);
-struct progress *start_progress(const char *title, uint64_t total);
-struct progress *start_sparse_progress(const char *title, uint64_t total);
-struct progress *start_delayed_progress(const char *title, uint64_t total);
+struct progress *start_progress(const char *title, uint64_t total, int verbose);
+struct progress *start_sparse_progress(const char *title, uint64_t total,
+				       int verbose);
+struct progress *start_delayed_progress(const char *title, uint64_t total,
+					int verbose);
 struct progress *start_delayed_sparse_progress(const char *title,
-					       uint64_t total);
+					       uint64_t total, int verbose);
 void stop_progress(struct progress **progress);
 void stop_progress_msg(struct progress **progress, const char *msg);
 
diff --git a/prune-packed.c b/prune-packed.c
index 261520b472..dcc982e369 100644
--- a/prune-packed.c
+++ b/prune-packed.c
@@ -31,8 +31,8 @@ static int prune_object(const struct object_id *oid, const char *path,
 
 void prune_packed_objects(int opts)
 {
-	if (opts & PRUNE_PACKED_VERBOSE)
-		progress = start_delayed_progress(_("Removing duplicate objects"), 256);
+	progress = start_delayed_progress(_("Removing duplicate objects"), 256,
+					  (opts & PRUNE_PACKED_VERBOSE));
 
 	for_each_loose_file_in_objdir(get_object_directory(),
 				      prune_object, NULL, prune_subdir, &opts);
diff --git a/read-cache.c b/read-cache.c
index aa427c5c17..2ddc422dbd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1532,9 +1532,9 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 	const char *unmerged_fmt;
 	struct progress *progress = NULL;
 
-	if (flags & REFRESH_PROGRESS && isatty(2))
-		progress = start_delayed_progress(_("Refresh index"),
-						  istate->cache_nr);
+	progress = start_delayed_progress(_("Refresh index"),
+					  istate->cache_nr,
+					  (flags & REFRESH_PROGRESS && isatty(2)));
 
 	trace_performance_enter();
 	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";
diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe789..19b874f9cd 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -23,16 +23,18 @@
 int cmd__progress(int argc, const char **argv)
 {
 	int total = 0;
+	int quiet = 0;
 	const char *title;
 	struct strbuf line = STRBUF_INIT;
 	struct progress *progress;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress [--total=<n>] [--quiet] <progress-title>",
 		NULL
 	};
 	struct option options[] = {
 		OPT_INTEGER(0, "total", &total, "total number of items"),
+		OPT_BOOL(0, "quiet", &quiet, "suppress stderr"),
 		OPT_END(),
 	};
 
@@ -42,7 +44,7 @@ int cmd__progress(int argc, const char **argv)
 	title = argv[0];
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
+	progress = start_progress(title, total, !quiet);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 1ed1df351c..9d6e6274ad 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -309,4 +309,31 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces even quietly' '
+	cat >in <<-\EOF &&
+	throughput 102400 1000
+	update
+	progress 10
+	throughput 204800 2000
+	update
+	progress 20
+	throughput 307200 3000
+	update
+	progress 30
+	throughput 409600 4000
+	update
+	progress 40
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
+		--quiet "Working hard" <in 2>stderr &&
+
+	# t0212/parse_events.perl intentionally omits regions and data.
+	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
+	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
+	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
+	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
+'
+
+
 test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index 4be5fc3075..fa84244466 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -338,16 +338,14 @@ static struct progress *get_progress(struct unpack_trees_options *o,
 {
 	unsigned cnt = 0, total = 0;
 
-	if (!o->update || !o->verbose_update)
-		return NULL;
-
 	for (; cnt < index->cache_nr; cnt++) {
 		const struct cache_entry *ce = index->cache[cnt];
 		if (ce->ce_flags & (CE_UPDATE | CE_WT_REMOVE))
 			total++;
 	}
 
-	return start_delayed_progress(_("Updating files"), total);
+	return start_delayed_progress(_("Updating files"), total,
+				      (o->update && o->verbose_update));
 }
 
 static void setup_collided_checkout_detection(struct checkout *state,
@@ -1493,10 +1491,10 @@ static int clear_ce_flags(struct index_state *istate,
 	int rval;
 
 	strbuf_reset(&prefix);
-	if (show_progress)
-		istate->progress = start_delayed_progress(
-					_("Updating index flags"),
-					istate->cache_nr);
+	istate->progress = start_delayed_progress(
+				_("Updating index flags"),
+				istate->cache_nr,
+				show_progress);
 
 	xsnprintf(label, sizeof(label), "clear_ce_flags(0x%08lx,0x%08lx)",
 		  (unsigned long)select_mask, (unsigned long)clear_mask);
diff --git a/walker.c b/walker.c
index 4984bf8b3d..4758bdde70 100644
--- a/walker.c
+++ b/walker.c
@@ -166,8 +166,8 @@ static int loop(struct walker *walker)
 	struct progress *progress = NULL;
 	uint64_t nr = 0;
 
-	if (walker->get_progress)
-		progress = start_delayed_progress(_("Fetching objects"), 0);
+	progress = start_delayed_progress(_("Fetching objects"), 0,
+					  walker->get_progress);
 
 	while (process_queue) {
 		struct object *obj = process_queue->item;
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10  1:42 [PATCH 0/2] enable progress traces even when quiet Emily Shaffer
  2020-07-10  1:42 ` [PATCH 1/2] progress: create progress struct in 'verbose' mode Emily Shaffer
@ 2020-07-10  1:42 ` Emily Shaffer
  2020-07-10  2:01   ` Derrick Stolee
  1 sibling, 1 reply; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10  1:42 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

display_progress() and stop_progress() are tolerant to NULL inputs.
Remove some places where redundant checks are performed before these
NULL-tolerant functions are called.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/commit-graph.c | 3 +--
 commit-graph.c         | 3 +--
 read-cache.c           | 9 +++------
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index ae4dc2dfcd..7964de3126 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -263,8 +263,7 @@ static int graph_write(int argc, const char **argv)
 cleanup:
 	string_list_clear(&pack_indexes, 0);
 	strbuf_release(&buf);
-	if (progress)
-		stop_progress(&progress);
+	stop_progress(&progress);
 	return result;
 }
 
diff --git a/commit-graph.c b/commit-graph.c
index b9a784fece..30d9a75932 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1424,8 +1424,7 @@ int write_commit_graph_reachable(struct object_directory *odb,
 				    flags, split_opts);
 
 	oidset_clear(&commits);
-	if (data.progress)
-		stop_progress(&data.progress);
+	stop_progress(&data.progress);
 	return result;
 }
 
diff --git a/read-cache.c b/read-cache.c
index 2ddc422dbd..feb7abe37a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1581,8 +1581,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 		new_entry = refresh_cache_ent(istate, ce, options, &cache_errno, &changed);
 		if (new_entry == ce)
 			continue;
-		if (progress)
-			display_progress(progress, i);
+		display_progress(progress, i);
 		if (!new_entry) {
 			const char *fmt;
 
@@ -1614,10 +1613,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 
 		replace_index_entry(istate, i, new_entry);
 	}
-	if (progress) {
-		display_progress(progress, istate->cache_nr);
-		stop_progress(&progress);
-	}
+	display_progress(progress, istate->cache_nr);
+	stop_progress(&progress);
 	trace_performance_leave("refresh index");
 	return has_errors;
 }
-- 
2.27.0.383.g050319c2ae-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  1:42 ` [PATCH 1/2] progress: create progress struct in 'verbose' mode Emily Shaffer
@ 2020-07-10  2:00   ` Derrick Stolee
  2020-07-10  2:17     ` Taylor Blau
  2020-07-10  2:14   ` brian m. carlson
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 27+ messages in thread
From: Derrick Stolee @ 2020-07-10  2:00 UTC (permalink / raw)
  To: Emily Shaffer, git, Taylor Blau

On 7/9/2020 9:42 PM, Emily Shaffer wrote:
> Before now, the progress API is used by conditionally calling
> start_progress() or a similar call, and then unconditionally calling
> display_progress() and stop_progress(), both of which are tolerant of
> NULL or uninitialized inputs. However, in
> 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> throughput), the progress library learned to log traces during expensive
> operations. In cases where progress should not be displayed to the user
> - such as when Git is called by a script - no traces will be logged,
> because the progress object is never created.
> 
> Instead, to allow us to collect traces from scripted Git commands, teach
> a progress->verbose flag, which is specified via a new argument to
> start_progress() and friends. display_progress() also learns to filter
> for that flag. With these changes, start_progress() can be called
> unconditionally but with a conditional as an argument to determine
> whether to report progress to the user.
> 
> Since this changes the API, also modify callers of start_progress() and
> friends to drop their conditional and pass a new argument in instead.

This is a worthwhile change. Thanks! I was hoping that we would
get some of these regions for free, which extends what we can get
out of trace2 events.

CC'ing Taylor because he had some thoughts on adding a possible
trace2 category to make it easier to reason about the regions,
when appropriate. Not sure if he's ready to apply that change
on top of this series.

> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index f520111eda..f64cad8390 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  	if (bisect_list)
>  		revs.limited = 1;
>  
> -	if (show_progress)
> -		progress = start_delayed_progress(show_progress, 0);
> +	/*
> +	 * When progress is not printed to the user, we still want to be able to
> +	 * classify the progress during tracing. So, use a placeholder name.
> +	 */
> +	progress = start_delayed_progress(
> +			show_progress ? show_progress : _("Quiet rev-list operation"),
> +			0, show_progress != NULL)

This is so strange, how we let the command-lines specify a progress
indicator. I guess it is necessary when we use rev-list as a
subcommand instead of in-process. One such case is check_connected()
in connected.c.

It's stranger still that "show_progress" is actually a string here,
as opposed to being an int in most other places.

Your transformation is correct, here, though. Thanks for calling it
out in the commit message.

>  
>  	if (use_bitmap_index) {
>  		if (!try_bitmap_count(&revs, &filter_options))
> diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> index dd4a75e030..719d446916 100644
> --- a/builtin/unpack-objects.c
> +++ b/builtin/unpack-objects.c
> @@ -498,8 +498,7 @@ static void unpack_all(void)
>  			ntohl(hdr->hdr_version));
>  	use(sizeof(struct pack_header));
>  
> -	if (!quiet)
> -		progress = start_progress(_("Unpacking objects"), nr_objects);
> +	progress = start_progress(_("Unpacking objects"), nr_objects, !quiet);
>  	obj_list = xcalloc(nr_objects, sizeof(*obj_list));
>  	for (i = 0; i < nr_objects; i++) {
>  		unpack_one(i);
> diff --git a/commit-graph.c b/commit-graph.c
> index 328ab06fd4..b9a784fece 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1152,10 +1152,10 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f,
>  	struct progress *progress = NULL;
>  	int i = 0;
>  
> -	if (ctx->report_progress)
> -		progress = start_delayed_progress(
> -			_("Writing changed paths Bloom filters index"),
> -			ctx->commits.nr);
> +	progress = start_delayed_progress(
> +		_("Writing changed paths Bloom filters index"),
> +		ctx->commits.nr,
> +		ctx->report_progress);

There are a lot of blocks like this, where the progress string is long enough to
require the first param to be after the method name. Since we are changing the
API and every caller, would the resulting code be cleaner if the string value
was the last parameter? That would allow this code pattern in most cases:

	progress = start_delayed_progress(count, show_progress,
					  _("My special string!"));

Just a thought. Not super-important.

The rest of the changes look to be correct.

> diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
> index 5d05cbe789..19b874f9cd 100644
> --- a/t/helper/test-progress.c
> +++ b/t/helper/test-progress.c
> @@ -23,16 +23,18 @@
>  int cmd__progress(int argc, const char **argv)
>  {
>  	int total = 0;
> +	int quiet = 0;
>  	const char *title;
>  	struct strbuf line = STRBUF_INIT;
>  	struct progress *progress;
>  
>  	const char *usage[] = {
> -		"test-tool progress [--total=<n>] <progress-title>",
> +		"test-tool progress [--total=<n>] [--quiet] <progress-title>",
>  		NULL
>  	};
>  	struct option options[] = {
>  		OPT_INTEGER(0, "total", &total, "total number of items"),
> +		OPT_BOOL(0, "quiet", &quiet, "suppress stderr"),
>  		OPT_END(),
>  	};
>  
> @@ -42,7 +44,7 @@ int cmd__progress(int argc, const char **argv)
>  	title = argv[0];
>  
>  	progress_testing = 1;
> -	progress = start_progress(title, total);
> +	progress = start_progress(title, total, !quiet);
>  	while (strbuf_getline(&line, stdin) != EOF) {
>  		char *end;
>  
> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> index 1ed1df351c..9d6e6274ad 100755
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -309,4 +309,31 @@ test_expect_success 'progress generates traces' '
>  	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
>  '
>  
> +test_expect_success 'progress generates traces even quietly' '
> +	cat >in <<-\EOF &&
> +	throughput 102400 1000
> +	update
> +	progress 10
> +	throughput 204800 2000
> +	update
> +	progress 20
> +	throughput 307200 3000
> +	update
> +	progress 30
> +	throughput 409600 4000
> +	update
> +	progress 40
> +	EOF
> +
> +	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
> +		--quiet "Working hard" <in 2>stderr &&
> +
> +	# t0212/parse_events.perl intentionally omits regions and data.
> +	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
> +	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
> +	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
> +	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> +'

Thanks for adding a test, including the resulting trace events!

The patch of Taylor's that I mentioned earlier changes the "category"
to something a bit more specific than "progress", when appropriate.

> +
> +
>  test_done

nit: extra empty line

Thanks!
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10  1:42 ` [PATCH 2/2] progress: remove redundant null-checking Emily Shaffer
@ 2020-07-10  2:01   ` Derrick Stolee
  2020-07-10  2:20     ` Taylor Blau
  0 siblings, 1 reply; 27+ messages in thread
From: Derrick Stolee @ 2020-07-10  2:01 UTC (permalink / raw)
  To: Emily Shaffer, git

On 7/9/2020 9:42 PM, Emily Shaffer wrote:
> display_progress() and stop_progress() are tolerant to NULL inputs.
> Remove some places where redundant checks are performed before these
> NULL-tolerant functions are called.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  builtin/commit-graph.c | 3 +--
>  commit-graph.c         | 3 +--
>  read-cache.c           | 9 +++------
>  3 files changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index ae4dc2dfcd..7964de3126 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -263,8 +263,7 @@ static int graph_write(int argc, const char **argv)
>  cleanup:
>  	string_list_clear(&pack_indexes, 0);
>  	strbuf_release(&buf);
> -	if (progress)
> -		stop_progress(&progress);
> +	stop_progress(&progress);
>  	return result;
>  }
>  
> diff --git a/commit-graph.c b/commit-graph.c
> index b9a784fece..30d9a75932 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1424,8 +1424,7 @@ int write_commit_graph_reachable(struct object_directory *odb,
>  				    flags, split_opts);
>  
>  	oidset_clear(&commits);
> -	if (data.progress)
> -		stop_progress(&data.progress);
> +	stop_progress(&data.progress);
>  	return result;
>  }
>  
> diff --git a/read-cache.c b/read-cache.c
> index 2ddc422dbd..feb7abe37a 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1581,8 +1581,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>  		new_entry = refresh_cache_ent(istate, ce, options, &cache_errno, &changed);
>  		if (new_entry == ce)
>  			continue;
> -		if (progress)
> -			display_progress(progress, i);
> +		display_progress(progress, i);
>  		if (!new_entry) {
>  			const char *fmt;
>  
> @@ -1614,10 +1613,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>  
>  		replace_index_entry(istate, i, new_entry);
>  	}
> -	if (progress) {
> -		display_progress(progress, istate->cache_nr);
> -		stop_progress(&progress);
> -	}
> +	display_progress(progress, istate->cache_nr);
> +	stop_progress(&progress);
>  	trace_performance_leave("refresh index");
>  	return has_errors;

Looks obviously correct to me.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  1:42 ` [PATCH 1/2] progress: create progress struct in 'verbose' mode Emily Shaffer
  2020-07-10  2:00   ` Derrick Stolee
@ 2020-07-10  2:14   ` brian m. carlson
  2020-07-10 19:24     ` Emily Shaffer
  2020-07-10 21:09     ` Junio C Hamano
  2020-07-10 22:40   ` Junio C Hamano
  2020-09-10  0:31   ` Jonathan Nieder
  3 siblings, 2 replies; 27+ messages in thread
From: brian m. carlson @ 2020-07-10  2:14 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1621 bytes --]

On 2020-07-10 at 01:42:41, Emily Shaffer wrote:
> Before now, the progress API is used by conditionally calling
> start_progress() or a similar call, and then unconditionally calling
> display_progress() and stop_progress(), both of which are tolerant of
> NULL or uninitialized inputs. However, in
> 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> throughput), the progress library learned to log traces during expensive
> operations. In cases where progress should not be displayed to the user
> - such as when Git is called by a script - no traces will be logged,
> because the progress object is never created.
> 
> Instead, to allow us to collect traces from scripted Git commands, teach
> a progress->verbose flag, which is specified via a new argument to
> start_progress() and friends. display_progress() also learns to filter
> for that flag. With these changes, start_progress() can be called
> unconditionally but with a conditional as an argument to determine
> whether to report progress to the user.

So to make sure I understand this right, we'll collect traces regardless
if it's enabled, but we'll still honor the --quiet flag if the user
doesn't want to see them?  If so, I'm definitely in favor of this
change.  I was worried when I read the cover letter that we'd display
them to the user regardless, but from reading the patch and the commit
message, it seems I misunderstood.

I think the making the verbose flag a parameter simplifies the code
nicely and puts the rendering decision in the right place.
-- 
brian m. carlson: Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  2:00   ` Derrick Stolee
@ 2020-07-10  2:17     ` Taylor Blau
  2020-07-10 19:21       ` Emily Shaffer
  0 siblings, 1 reply; 27+ messages in thread
From: Taylor Blau @ 2020-07-10  2:17 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Emily Shaffer, git, Taylor Blau

Hi both,

On Thu, Jul 09, 2020 at 10:00:23PM -0400, Derrick Stolee wrote:
> On 7/9/2020 9:42 PM, Emily Shaffer wrote:
> > Before now, the progress API is used by conditionally calling
> > start_progress() or a similar call, and then unconditionally calling
> > display_progress() and stop_progress(), both of which are tolerant of
> > NULL or uninitialized inputs. However, in
> > 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> > throughput), the progress library learned to log traces during expensive
> > operations. In cases where progress should not be displayed to the user
> > - such as when Git is called by a script - no traces will be logged,
> > because the progress object is never created.

This is such a fantastic idea. Just the other day, I was thinking of
getting your (for clarification, Emily, since I'm responding to
Stolee's mail) progress-emits-trace2-events work hooked into GitHub's
trace2 pipeline.

There were two unfortunate things that prevented this from working:

  1. GitHub filters which trace2 categories are of interest to us (these
     interesting ones get logged, and the uninteresting ones get
     discarded) using an environment variable of comma-separated
     categories. Since all of the trace2 metrics generated by the
     progress API don't have categories, taking in one interesting
     metric meant taking them all in, which is a non-starter for us.

  2. On top of that, we don't even _generate_ these progress events most
     of the time, since we're often running without a tty, and so we
     never end up hitting those 'if (progress) start_progress()'
     conditionals in the first place.

If we had something like this, it would reduce the problem to only (1),
which would make a lot of my headaches go away. (It would also give me a
good excuse to convert many of our custom trace2 regions into patches on
the list, and get rid of a non-trivial amount of code that generates
merge conflicts often).

> > Instead, to allow us to collect traces from scripted Git commands, teach
> > a progress->verbose flag, which is specified via a new argument to
> > start_progress() and friends. display_progress() also learns to filter
> > for that flag. With these changes, start_progress() can be called
> > unconditionally but with a conditional as an argument to determine
> > whether to report progress to the user.
> >
> > Since this changes the API, also modify callers of start_progress() and
> > friends to drop their conditional and pass a new argument in instead.

I don't think that this is why I was CC'd, but could you perhaps talk a
little bit about why this is all in the same patch? I don't think this
change needs to be broken out by the area affected per-se, but the
current form is a little unruly to review all at once.

> This is a worthwhile change. Thanks! I was hoping that we would
> get some of these regions for free, which extends what we can get
> out of trace2 events.
>
> CC'ing Taylor because he had some thoughts on adding a possible
> trace2 category to make it easier to reason about the regions,
> when appropriate. Not sure if he's ready to apply that change
> on top of this series.

This is what I was talking about above. It would be nice if we could
somehow teach the 'start_*_progress()' API about a trace2 category.

Unfortunately, this would need to be a new parameter, since we need to
know the category when we enter the region. So, the API changes might be
far-reaching. It would be nice if there was a way to limit the blast
radius (i.e., 'start_progress_trace2(..., category)'), but I haven't
thought deeply about it.

I don't want to delay this patch series with that. I'd be happy to build
it in myself on top after this graduates.

> > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > index f520111eda..f64cad8390 100644
> > --- a/builtin/rev-list.c
> > +++ b/builtin/rev-list.c
> > @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >  	if (bisect_list)
> >  		revs.limited = 1;
> >
> > -	if (show_progress)
> > -		progress = start_delayed_progress(show_progress, 0);
> > +	/*
> > +	 * When progress is not printed to the user, we still want to be able to
> > +	 * classify the progress during tracing. So, use a placeholder name.
> > +	 */
> > +	progress = start_delayed_progress(
> > +			show_progress ? show_progress : _("Quiet rev-list operation"),
> > +			0, show_progress != NULL)
>
> This is so strange, how we let the command-lines specify a progress
> indicator. I guess it is necessary when we use rev-list as a
> subcommand instead of in-process. One such case is check_connected()
> in connected.c.
>
> It's stranger still that "show_progress" is actually a string here,
> as opposed to being an int in most other places.
>
> Your transformation is correct, here, though. Thanks for calling it
> out in the commit message.
>
> >
> >  	if (use_bitmap_index) {
> >  		if (!try_bitmap_count(&revs, &filter_options))
> > diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> > index dd4a75e030..719d446916 100644
> > --- a/builtin/unpack-objects.c
> > +++ b/builtin/unpack-objects.c
> > @@ -498,8 +498,7 @@ static void unpack_all(void)
> >  			ntohl(hdr->hdr_version));
> >  	use(sizeof(struct pack_header));
> >
> > -	if (!quiet)
> > -		progress = start_progress(_("Unpacking objects"), nr_objects);
> > +	progress = start_progress(_("Unpacking objects"), nr_objects, !quiet);
> >  	obj_list = xcalloc(nr_objects, sizeof(*obj_list));
> >  	for (i = 0; i < nr_objects; i++) {
> >  		unpack_one(i);
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 328ab06fd4..b9a784fece 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -1152,10 +1152,10 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f,
> >  	struct progress *progress = NULL;
> >  	int i = 0;
> >
> > -	if (ctx->report_progress)
> > -		progress = start_delayed_progress(
> > -			_("Writing changed paths Bloom filters index"),
> > -			ctx->commits.nr);
> > +	progress = start_delayed_progress(
> > +		_("Writing changed paths Bloom filters index"),
> > +		ctx->commits.nr,
> > +		ctx->report_progress);
>
> There are a lot of blocks like this, where the progress string is long enough to
> require the first param to be after the method name. Since we are changing the
> API and every caller, would the resulting code be cleaner if the string value
> was the last parameter? That would allow this code pattern in most cases:
>
> 	progress = start_delayed_progress(count, show_progress,
> 					  _("My special string!"));
>
> Just a thought. Not super-important.
>
> The rest of the changes look to be correct.
>
> > diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
> > index 5d05cbe789..19b874f9cd 100644
> > --- a/t/helper/test-progress.c
> > +++ b/t/helper/test-progress.c
> > @@ -23,16 +23,18 @@
> >  int cmd__progress(int argc, const char **argv)
> >  {
> >  	int total = 0;
> > +	int quiet = 0;
> >  	const char *title;
> >  	struct strbuf line = STRBUF_INIT;
> >  	struct progress *progress;
> >
> >  	const char *usage[] = {
> > -		"test-tool progress [--total=<n>] <progress-title>",
> > +		"test-tool progress [--total=<n>] [--quiet] <progress-title>",
> >  		NULL
> >  	};
> >  	struct option options[] = {
> >  		OPT_INTEGER(0, "total", &total, "total number of items"),
> > +		OPT_BOOL(0, "quiet", &quiet, "suppress stderr"),
> >  		OPT_END(),
> >  	};
> >
> > @@ -42,7 +44,7 @@ int cmd__progress(int argc, const char **argv)
> >  	title = argv[0];
> >
> >  	progress_testing = 1;
> > -	progress = start_progress(title, total);
> > +	progress = start_progress(title, total, !quiet);
> >  	while (strbuf_getline(&line, stdin) != EOF) {
> >  		char *end;
> >
> > diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> > index 1ed1df351c..9d6e6274ad 100755
> > --- a/t/t0500-progress-display.sh
> > +++ b/t/t0500-progress-display.sh
> > @@ -309,4 +309,31 @@ test_expect_success 'progress generates traces' '
> >  	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> >  '
> >
> > +test_expect_success 'progress generates traces even quietly' '
> > +	cat >in <<-\EOF &&
> > +	throughput 102400 1000
> > +	update
> > +	progress 10
> > +	throughput 204800 2000
> > +	update
> > +	progress 20
> > +	throughput 307200 3000
> > +	update
> > +	progress 30
> > +	throughput 409600 4000
> > +	update
> > +	progress 40
> > +	EOF
> > +
> > +	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
> > +		--quiet "Working hard" <in 2>stderr &&
> > +
> > +	# t0212/parse_events.perl intentionally omits regions and data.
> > +	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
> > +	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
> > +	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
> > +	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> > +'
>
> Thanks for adding a test, including the resulting trace events!
>
> The patch of Taylor's that I mentioned earlier changes the "category"
> to something a bit more specific than "progress", when appropriate.
>
> > +
> > +
> >  test_done
>
> nit: extra empty line
>
> Thanks!
> -Stolee

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10  2:01   ` Derrick Stolee
@ 2020-07-10  2:20     ` Taylor Blau
  2020-07-10 18:50       ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: Taylor Blau @ 2020-07-10  2:20 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Emily Shaffer, git

On Thu, Jul 09, 2020 at 10:01:08PM -0400, Derrick Stolee wrote:
> On 7/9/2020 9:42 PM, Emily Shaffer wrote:
> > display_progress() and stop_progress() are tolerant to NULL inputs.
> > Remove some places where redundant checks are performed before these
> > NULL-tolerant functions are called.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  builtin/commit-graph.c | 3 +--
> >  commit-graph.c         | 3 +--
> >  read-cache.c           | 9 +++------
> >  3 files changed, 5 insertions(+), 10 deletions(-)
> >
> > diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> > index ae4dc2dfcd..7964de3126 100644
> > --- a/builtin/commit-graph.c
> > +++ b/builtin/commit-graph.c
> > @@ -263,8 +263,7 @@ static int graph_write(int argc, const char **argv)
> >  cleanup:
> >  	string_list_clear(&pack_indexes, 0);
> >  	strbuf_release(&buf);
> > -	if (progress)
> > -		stop_progress(&progress);
> > +	stop_progress(&progress);
> >  	return result;
> >  }
> >
> > diff --git a/commit-graph.c b/commit-graph.c
> > index b9a784fece..30d9a75932 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -1424,8 +1424,7 @@ int write_commit_graph_reachable(struct object_directory *odb,
> >  				    flags, split_opts);
> >
> >  	oidset_clear(&commits);
> > -	if (data.progress)
> > -		stop_progress(&data.progress);
> > +	stop_progress(&data.progress);
> >  	return result;
> >  }
> >
> > diff --git a/read-cache.c b/read-cache.c
> > index 2ddc422dbd..feb7abe37a 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -1581,8 +1581,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> >  		new_entry = refresh_cache_ent(istate, ce, options, &cache_errno, &changed);
> >  		if (new_entry == ce)
> >  			continue;
> > -		if (progress)
> > -			display_progress(progress, i);
> > +		display_progress(progress, i);
> >  		if (!new_entry) {
> >  			const char *fmt;
> >
> > @@ -1614,10 +1613,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> >
> >  		replace_index_entry(istate, i, new_entry);
> >  	}
> > -	if (progress) {
> > -		display_progress(progress, istate->cache_nr);
> > -		stop_progress(&progress);
> > -	}
> > +	display_progress(progress, istate->cache_nr);
> > +	stop_progress(&progress);
> >  	trace_performance_leave("refresh index");
> >  	return has_errors;
>
> Looks obviously correct to me.

To me too, although note that this will generate merge conflicts with
Szeder's patch from earlier today[1].

Unfortunately, the conflicts are a little deeper than "we both removed
an unnecessary 'if' statement", since Szeder's patch moves the
'stop_progress()' call earlier to avoid a bug that I introduced.

This is just something for Junio to look out for while queuing,
otherwise I think just removing the hunk in builtin/commit-graph.c would
be fine, too.

> Thanks,
> -Stolee

Thanks,
Taylor

[1]: https://lore.kernel.org/git/xmqq5zawy4x5.fsf@gitster.c.googlers.com/T/#mbd2d5236a7fecf1287be75ffa396f3776fc9b891

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10  2:20     ` Taylor Blau
@ 2020-07-10 18:50       ` Junio C Hamano
  2020-07-10 19:27         ` Emily Shaffer
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2020-07-10 18:50 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Derrick Stolee, Emily Shaffer, git

Taylor Blau <me@ttaylorr.com> writes:

> To me too, although note that this will generate merge conflicts with
> Szeder's patch from earlier today[1].
>
> Unfortunately, the conflicts are a little deeper than "we both removed
> an unnecessary 'if' statement", since Szeder's patch moves the
> 'stop_progress()' call earlier to avoid a bug that I introduced.
>
> This is just something for Junio to look out for while queuing,

Actually that is something contributors learn to play better with
each other ;-)

Thanks for a heads-up.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  2:17     ` Taylor Blau
@ 2020-07-10 19:21       ` Emily Shaffer
  0 siblings, 0 replies; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10 19:21 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Derrick Stolee, git

On Thu, Jul 09, 2020 at 10:17:40PM -0400, Taylor Blau wrote:
> 
> Hi both,
> 
> On Thu, Jul 09, 2020 at 10:00:23PM -0400, Derrick Stolee wrote:
> > On 7/9/2020 9:42 PM, Emily Shaffer wrote:
> > > Before now, the progress API is used by conditionally calling
> > > start_progress() or a similar call, and then unconditionally calling
> > > display_progress() and stop_progress(), both of which are tolerant of
> > > NULL or uninitialized inputs. However, in
> > > 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> > > throughput), the progress library learned to log traces during expensive
> > > operations. In cases where progress should not be displayed to the user
> > > - such as when Git is called by a script - no traces will be logged,
> > > because the progress object is never created.
> 
> This is such a fantastic idea. Just the other day, I was thinking of
> getting your (for clarification, Emily, since I'm responding to
> Stolee's mail) progress-emits-trace2-events work hooked into GitHub's
> trace2 pipeline.
> 
> There were two unfortunate things that prevented this from working:
> 
>   1. GitHub filters which trace2 categories are of interest to us (these
>      interesting ones get logged, and the uninteresting ones get
>      discarded) using an environment variable of comma-separated
>      categories. Since all of the trace2 metrics generated by the
>      progress API don't have categories, taking in one interesting
>      metric meant taking them all in, which is a non-starter for us.
> 
>   2. On top of that, we don't even _generate_ these progress events most
>      of the time, since we're often running without a tty, and so we
>      never end up hitting those 'if (progress) start_progress()'
>      conditionals in the first place.
> 
> If we had something like this, it would reduce the problem to only (1),
> which would make a lot of my headaches go away. (It would also give me a
> good excuse to convert many of our custom trace2 regions into patches on
> the list, and get rid of a non-trivial amount of code that generates
> merge conflicts often).
> 
> > > Instead, to allow us to collect traces from scripted Git commands, teach
> > > a progress->verbose flag, which is specified via a new argument to
> > > start_progress() and friends. display_progress() also learns to filter
> > > for that flag. With these changes, start_progress() can be called
> > > unconditionally but with a conditional as an argument to determine
> > > whether to report progress to the user.
> > >
> > > Since this changes the API, also modify callers of start_progress() and
> > > friends to drop their conditional and pass a new argument in instead.
> 
> I don't think that this is why I was CC'd, but could you perhaps talk a
> little bit about why this is all in the same patch? I don't think this
> change needs to be broken out by the area affected per-se, but the
> current form is a little unruly to review all at once.

I agree that it's a lot, but I didn't think very hard about how to split
it up for review. The problem is that changing the signature of course
breaks the build, and I didn't want a commit which wouldn't build. I
could split it into one commit per start_progress() variation, I
suppose, but is it really easier to review? With only one or two
exceptions these changes are all pretty rote.

 - Emily

> 
> > This is a worthwhile change. Thanks! I was hoping that we would
> > get some of these regions for free, which extends what we can get
> > out of trace2 events.
> >
> > CC'ing Taylor because he had some thoughts on adding a possible
> > trace2 category to make it easier to reason about the regions,
> > when appropriate. Not sure if he's ready to apply that change
> > on top of this series.
> 
> This is what I was talking about above. It would be nice if we could
> somehow teach the 'start_*_progress()' API about a trace2 category.

Yeah, we have been thinking about that too - right now I think we use
the title, which is pretty ugly for metrics as it's intended for human
eyes. One suggestion I heard was to teach an enum to the progress struct
for the region name (or category); but to get it into the region_enter
message we'd have to take it during start_progress(), as you say.

> 
> Unfortunately, this would need to be a new parameter, since we need to
> know the category when we enter the region. So, the API changes might be
> far-reaching. It would be nice if there was a way to limit the blast
> radius (i.e., 'start_progress_trace2(..., category)'), but I haven't
> thought deeply about it.

Yeah, this seems viable in the scenario you describe - those who aren't
using the *_trace2() constructors aren't any worse off than before.

> 
> I don't want to delay this patch series with that. I'd be happy to build
> it in myself on top after this graduates.
> 
> > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > > index f520111eda..f64cad8390 100644
> > > --- a/builtin/rev-list.c
> > > +++ b/builtin/rev-list.c
> > > @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> > >  	if (bisect_list)
> > >  		revs.limited = 1;
> > >
> > > -	if (show_progress)
> > > -		progress = start_delayed_progress(show_progress, 0);
> > > +	/*
> > > +	 * When progress is not printed to the user, we still want to be able to
> > > +	 * classify the progress during tracing. So, use a placeholder name.
> > > +	 */
> > > +	progress = start_delayed_progress(
> > > +			show_progress ? show_progress : _("Quiet rev-list operation"),
> > > +			0, show_progress != NULL)
> >
> > This is so strange, how we let the command-lines specify a progress
> > indicator. I guess it is necessary when we use rev-list as a
> > subcommand instead of in-process. One such case is check_connected()
> > in connected.c.
> >
> > It's stranger still that "show_progress" is actually a string here,
> > as opposed to being an int in most other places.
> >
> > Your transformation is correct, here, though. Thanks for calling it
> > out in the commit message.
> >
> > >
> > >  	if (use_bitmap_index) {
> > >  		if (!try_bitmap_count(&revs, &filter_options))
> > > diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> > > index dd4a75e030..719d446916 100644
> > > --- a/builtin/unpack-objects.c
> > > +++ b/builtin/unpack-objects.c
> > > @@ -498,8 +498,7 @@ static void unpack_all(void)
> > >  			ntohl(hdr->hdr_version));
> > >  	use(sizeof(struct pack_header));
> > >
> > > -	if (!quiet)
> > > -		progress = start_progress(_("Unpacking objects"), nr_objects);
> > > +	progress = start_progress(_("Unpacking objects"), nr_objects, !quiet);
> > >  	obj_list = xcalloc(nr_objects, sizeof(*obj_list));
> > >  	for (i = 0; i < nr_objects; i++) {
> > >  		unpack_one(i);
> > > diff --git a/commit-graph.c b/commit-graph.c
> > > index 328ab06fd4..b9a784fece 100644
> > > --- a/commit-graph.c
> > > +++ b/commit-graph.c
> > > @@ -1152,10 +1152,10 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f,
> > >  	struct progress *progress = NULL;
> > >  	int i = 0;
> > >
> > > -	if (ctx->report_progress)
> > > -		progress = start_delayed_progress(
> > > -			_("Writing changed paths Bloom filters index"),
> > > -			ctx->commits.nr);
> > > +	progress = start_delayed_progress(
> > > +		_("Writing changed paths Bloom filters index"),
> > > +		ctx->commits.nr,
> > > +		ctx->report_progress);
> >
> > There are a lot of blocks like this, where the progress string is long enough to
> > require the first param to be after the method name. Since we are changing the
> > API and every caller, would the resulting code be cleaner if the string value
> > was the last parameter? That would allow this code pattern in most cases:
> >
> > 	progress = start_delayed_progress(count, show_progress,
> > 					  _("My special string!"));
> >
> > Just a thought. Not super-important.
> >
> > The rest of the changes look to be correct.
> >
> > > diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
> > > index 5d05cbe789..19b874f9cd 100644
> > > --- a/t/helper/test-progress.c
> > > +++ b/t/helper/test-progress.c
> > > @@ -23,16 +23,18 @@
> > >  int cmd__progress(int argc, const char **argv)
> > >  {
> > >  	int total = 0;
> > > +	int quiet = 0;
> > >  	const char *title;
> > >  	struct strbuf line = STRBUF_INIT;
> > >  	struct progress *progress;
> > >
> > >  	const char *usage[] = {
> > > -		"test-tool progress [--total=<n>] <progress-title>",
> > > +		"test-tool progress [--total=<n>] [--quiet] <progress-title>",
> > >  		NULL
> > >  	};
> > >  	struct option options[] = {
> > >  		OPT_INTEGER(0, "total", &total, "total number of items"),
> > > +		OPT_BOOL(0, "quiet", &quiet, "suppress stderr"),
> > >  		OPT_END(),
> > >  	};
> > >
> > > @@ -42,7 +44,7 @@ int cmd__progress(int argc, const char **argv)
> > >  	title = argv[0];
> > >
> > >  	progress_testing = 1;
> > > -	progress = start_progress(title, total);
> > > +	progress = start_progress(title, total, !quiet);
> > >  	while (strbuf_getline(&line, stdin) != EOF) {
> > >  		char *end;
> > >
> > > diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> > > index 1ed1df351c..9d6e6274ad 100755
> > > --- a/t/t0500-progress-display.sh
> > > +++ b/t/t0500-progress-display.sh
> > > @@ -309,4 +309,31 @@ test_expect_success 'progress generates traces' '
> > >  	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> > >  '
> > >
> > > +test_expect_success 'progress generates traces even quietly' '
> > > +	cat >in <<-\EOF &&
> > > +	throughput 102400 1000
> > > +	update
> > > +	progress 10
> > > +	throughput 204800 2000
> > > +	update
> > > +	progress 20
> > > +	throughput 307200 3000
> > > +	update
> > > +	progress 30
> > > +	throughput 409600 4000
> > > +	update
> > > +	progress 40
> > > +	EOF
> > > +
> > > +	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
> > > +		--quiet "Working hard" <in 2>stderr &&
> > > +
> > > +	# t0212/parse_events.perl intentionally omits regions and data.
> > > +	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
> > > +	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
> > > +	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
> > > +	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> > > +'
> >
> > Thanks for adding a test, including the resulting trace events!
> >
> > The patch of Taylor's that I mentioned earlier changes the "category"
> > to something a bit more specific than "progress", when appropriate.
> >
> > > +
> > > +
> > >  test_done
> >
> > nit: extra empty line
> >
> > Thanks!
> > -Stolee
> 
> Thanks,
> Taylor

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  2:14   ` brian m. carlson
@ 2020-07-10 19:24     ` Emily Shaffer
  2020-07-10 21:09     ` Junio C Hamano
  1 sibling, 0 replies; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10 19:24 UTC (permalink / raw)
  To: brian m. carlson, git

On Fri, Jul 10, 2020 at 02:14:51AM +0000, brian m. carlson wrote:
> 
> On 2020-07-10 at 01:42:41, Emily Shaffer wrote:
> > Before now, the progress API is used by conditionally calling
> > start_progress() or a similar call, and then unconditionally calling
> > display_progress() and stop_progress(), both of which are tolerant of
> > NULL or uninitialized inputs. However, in
> > 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> > throughput), the progress library learned to log traces during expensive
> > operations. In cases where progress should not be displayed to the user
> > - such as when Git is called by a script - no traces will be logged,
> > because the progress object is never created.
> > 
> > Instead, to allow us to collect traces from scripted Git commands, teach
> > a progress->verbose flag, which is specified via a new argument to
> > start_progress() and friends. display_progress() also learns to filter
> > for that flag. With these changes, start_progress() can be called
> > unconditionally but with a conditional as an argument to determine
> > whether to report progress to the user.
> 
> So to make sure I understand this right, we'll collect traces regardless
> if it's enabled, but we'll still honor the --quiet flag if the user
> doesn't want to see them?

Yes, precisely. display_progress() used to check whether to display or
not based on whether the 'struct progress' was valid; now it also checks
whether 'progress->verbose' is set. display() (where the real logic
lives for printing to the user) still never gets called with this
change.

>
> If so, I'm definitely in favor of this
> change.  I was worried when I read the cover letter that we'd display
> them to the user regardless, but from reading the patch and the commit
> message, it seems I misunderstood.
> 
> I think the making the verbose flag a parameter simplifies the code
> nicely and puts the rendering decision in the right place.

Thanks!

 - Emily

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10 18:50       ` Junio C Hamano
@ 2020-07-10 19:27         ` Emily Shaffer
  2020-07-10 19:58           ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10 19:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Derrick Stolee, git

On Fri, Jul 10, 2020 at 11:50:08AM -0700, Junio C Hamano wrote:
> 
> Taylor Blau <me@ttaylorr.com> writes:
> 
> > To me too, although note that this will generate merge conflicts with
> > Szeder's patch from earlier today[1].
> >
> > Unfortunately, the conflicts are a little deeper than "we both removed
> > an unnecessary 'if' statement", since Szeder's patch moves the
> > 'stop_progress()' call earlier to avoid a bug that I introduced.
> >
> > This is just something for Junio to look out for while queuing,
> 
> Actually that is something contributors learn to play better with
> each other ;-)

Yeah, I was worried about the nature of this series in general. Sorry to
have missed Szeder's change.

> 
> Thanks for a heads-up.

Will it make your life easier if I base this series on the other topic?
Or would you rather I leave it to you?

 - Emily

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10 19:27         ` Emily Shaffer
@ 2020-07-10 19:58           ` Junio C Hamano
  2020-07-10 20:29             ` Emily Shaffer
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2020-07-10 19:58 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: Taylor Blau, Derrick Stolee, git

Emily Shaffer <emilyshaffer@google.com> writes:

> Will it make your life easier if I base this series on the other topic?
> Or would you rather I leave it to you?

Either is fine.  

Just don't expect any quick turnaround time for a new and involved
change like this while we are entering the pre-release freeze.  We
are supposed to be stablizing, not churning ;-).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10 19:58           ` Junio C Hamano
@ 2020-07-10 20:29             ` Emily Shaffer
  2020-07-10 23:03               ` Emily Shaffer
  0 siblings, 1 reply; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10 20:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Derrick Stolee, git

On Fri, Jul 10, 2020 at 12:58:54PM -0700, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > Will it make your life easier if I base this series on the other topic?
> > Or would you rather I leave it to you?
> 
> Either is fine.  
> 
> Just don't expect any quick turnaround time for a new and involved
> change like this while we are entering the pre-release freeze.  We
> are supposed to be stablizing, not churning ;-).

Understood.

It's got a nit or two from Stolee's review so I need to reroll it
anyway; I'll send it based on the other topic.

 - Emily

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  2:14   ` brian m. carlson
  2020-07-10 19:24     ` Emily Shaffer
@ 2020-07-10 21:09     ` Junio C Hamano
  2020-07-10 22:00       ` Emily Shaffer
  1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2020-07-10 21:09 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Emily Shaffer, git

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> So to make sure I understand this right, we'll collect traces regardless
> if it's enabled, but we'll still honor the --quiet flag if the user
> doesn't want to see them?  If so, I'm definitely in favor of this
> change.

Hmph, if we know we won't be emitting, why spend cycles to collect
traces in the first place?  Does it make the code simpler to follow
or something?  As long as it is done when the user requests a trace
to be taken, I do not care about the wasted cycles too much, I think ;-)

> I was worried when I read the cover letter that we'd display
> them to the user regardless, but from reading the patch and the commit
> message, it seems I misunderstood.
>
> I think the making the verbose flag a parameter simplifies the code
> nicely and puts the rendering decision in the right place.

OK.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10 21:09     ` Junio C Hamano
@ 2020-07-10 22:00       ` Emily Shaffer
  0 siblings, 0 replies; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10 22:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, git

On Fri, Jul 10, 2020 at 02:09:04PM -0700, Junio C Hamano wrote:
> 
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> > So to make sure I understand this right, we'll collect traces regardless
> > if it's enabled, but we'll still honor the --quiet flag if the user
> > doesn't want to see them?  If so, I'm definitely in favor of this
> > change.
> 
> Hmph, if we know we won't be emitting, why spend cycles to collect
> traces in the first place?  Does it make the code simpler to follow
> or something?  As long as it is done when the user requests a trace
> to be taken, I do not care about the wasted cycles too much, I think ;-)

Maybe then I did misunderstand brian.

With this series, traces will always be emitted to location where
GIT_TRACE2_EVENT indicates, or default location. Like before, if the
user asks for --quiet or equivalent, the user will not see "Resolving
deltas... (4/30)" in their stderr.

Before this series, when a user indicated --quiet or equivalent, the
traces were not emitted to GIT_TRACE2_EVENT's location, because the
progress struct was not created.

I figured brian was saying "we'll still honor the --quiet flag if the
user doesn't want to see ["Resolving deltas...blah blah" in their
stderr]", and that is true. The traces will be logged but the
user-facing output on stderr will be suppressed.

> 
> > I was worried when I read the cover letter that we'd display
> > them to the user regardless, but from reading the patch and the commit
> > message, it seems I misunderstood.
> >
> > I think the making the verbose flag a parameter simplifies the code
> > nicely and puts the rendering decision in the right place.
> 
> OK.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  1:42 ` [PATCH 1/2] progress: create progress struct in 'verbose' mode Emily Shaffer
  2020-07-10  2:00   ` Derrick Stolee
  2020-07-10  2:14   ` brian m. carlson
@ 2020-07-10 22:40   ` Junio C Hamano
  2020-07-14  0:15     ` Emily Shaffer
  2020-09-09 22:36     ` Jonathan Tan
  2020-09-10  0:31   ` Jonathan Nieder
  3 siblings, 2 replies; 27+ messages in thread
From: Junio C Hamano @ 2020-07-10 22:40 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer <emilyshaffer@google.com> writes:

> Before now, the progress API is used by conditionally calling
> start_progress() or a similar call, and then unconditionally calling
> display_progress() and stop_progress(), both of which are tolerant of
> NULL or uninitialized inputs. However, in
> 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> throughput), the progress library learned to log traces during expensive
> operations. In cases where progress should not be displayed to the user
> - such as when Git is called by a script - no traces will be logged,
> because the progress object is never created.
>
> Instead, to allow us to collect traces from scripted Git commands, teach
> a progress->verbose flag, which is specified via a new argument to
> start_progress() and friends. display_progress() also learns to filter
> for that flag. With these changes, start_progress() can be called
> unconditionally but with a conditional as an argument to determine
> whether to report progress to the user.
>
> Since this changes the API, also modify callers of start_progress() and
> friends to drop their conditional and pass a new argument in instead.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  builtin/blame.c             |   4 +-
>  builtin/commit-graph.c      |   5 +-
>  builtin/fsck.c              |  25 ++++---
>  builtin/index-pack.c        |  16 ++---
>  builtin/log.c               |   4 +-
>  builtin/pack-objects.c      |  18 +++--
>  builtin/prune.c             |   4 +-
>  builtin/rev-list.c          |   9 ++-
>  builtin/unpack-objects.c    |   3 +-
>  commit-graph.c              | 140 +++++++++++++++++-------------------
>  delta-islands.c             |   4 +-
>  diffcore-rename.c           |   9 ++-
>  entry.c                     |   2 +-
>  midx.c                      |  43 +++++------
>  pack-bitmap-write.c         |   8 +--
>  pack-bitmap.c               |   5 +-
>  preload-index.c             |   7 +-
>  progress.c                  |  27 ++++---
>  progress.h                  |  10 +--
>  prune-packed.c              |   4 +-
>  read-cache.c                |   6 +-
>  t/helper/test-progress.c    |   6 +-
>  t/t0500-progress-display.sh |  27 +++++++
>  unpack-trees.c              |  14 ++--
>  walker.c                    |   4 +-
>  25 files changed, 212 insertions(+), 192 deletions(-)

So, the gist of the change is these, where all functions in the
progress API that prepare a progress structure now take "verbose"
parameter, and the function that emits the output looks at the
flag.  Counters and timers are used no matter what as long as
progress is shown.

> diff --git a/progress.c b/progress.c
> index 3eda914518..f014908b05 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -43,6 +43,7 @@ struct progress {
>  	struct strbuf counters_sb;
>  	int title_len;
>  	int split;
> +	int verbose;
>  };
>  
>  static volatile sig_atomic_t progress_update;
> @@ -245,12 +246,13 @@ void display_throughput(struct progress *progress, uint64_t total)
>  
>  void display_progress(struct progress *progress, uint64_t n)
>  {
> -	if (progress)
> +	if (progress && progress->verbose)
>  		display(progress, n, NULL);
>  }
>  
>  static struct progress *start_progress_delay(const char *title, uint64_t total,
> -					     unsigned delay, unsigned sparse)
> +					     unsigned delay, unsigned sparse,
> +					     int verbose)
>  {
>  	struct progress *progress = xmalloc(sizeof(*progress));
>  	progress->title = title;
> @@ -264,6 +266,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
>  	strbuf_init(&progress->counters_sb, 0);
>  	progress->title_len = utf8_strwidth(title);
>  	progress->split = 0;
> +	progress->verbose = verbose;
>  	set_progress_signal();
>  	trace2_region_enter("progress", title, the_repository);
>  	return progress;
> @@ -279,14 +282,16 @@ static int get_default_delay(void)
>  	return delay_in_secs;
>  }
>  
> -struct progress *start_delayed_progress(const char *title, uint64_t total)
> +struct progress *start_delayed_progress(const char *title, uint64_t total,
> +					int verbose)
>  {
> -	return start_progress_delay(title, total, get_default_delay(), 0);
> +	return start_progress_delay(title, total, get_default_delay(), 0,
> +				    verbose);
>  }
>  
> -struct progress *start_progress(const char *title, uint64_t total)
> +struct progress *start_progress(const char *title, uint64_t total, int verbose)
>  {
> -	return start_progress_delay(title, total, 0, 0);
> +	return start_progress_delay(title, total, 0, 0, verbose);
>  }
>  
>  /*
> @@ -298,15 +303,17 @@ struct progress *start_progress(const char *title, uint64_t total)
>   * When "sparse" is set, stop_progress() will automatically force the done
>   * message to show 100%.
>   */
> -struct progress *start_sparse_progress(const char *title, uint64_t total)
> +struct progress *start_sparse_progress(const char *title, uint64_t total,
> +				       int verbose)
>  {
> -	return start_progress_delay(title, total, 0, 1);
> +	return start_progress_delay(title, total, 0, 1, verbose);
>  }
>  
>  struct progress *start_delayed_sparse_progress(const char *title,
> -					       uint64_t total)
> +					       uint64_t total, int verbose)
>  {
> -	return start_progress_delay(title, total, get_default_delay(), 1);
> +	return start_progress_delay(title, total, get_default_delay(), 1,
> +				    verbose);
>  }
>  
>  static void finish_if_sparse(struct progress *progress)
> diff --git a/progress.h b/progress.h
> index f1913acf73..0c64557231 100644
> --- a/progress.h
> +++ b/progress.h
> @@ -13,11 +13,13 @@ void progress_test_force_update(void);
>  
>  void display_throughput(struct progress *progress, uint64_t total);
>  void display_progress(struct progress *progress, uint64_t n);
> -struct progress *start_progress(const char *title, uint64_t total);
> -struct progress *start_sparse_progress(const char *title, uint64_t total);
> -struct progress *start_delayed_progress(const char *title, uint64_t total);
> +struct progress *start_progress(const char *title, uint64_t total, int verbose);
> +struct progress *start_sparse_progress(const char *title, uint64_t total,
> +				       int verbose);
> +struct progress *start_delayed_progress(const char *title, uint64_t total,
> +					int verbose);
>  struct progress *start_delayed_sparse_progress(const char *title,
> -					       uint64_t total);
> +					       uint64_t total, int verbose);
>  void stop_progress(struct progress **progress);
>  void stop_progress_msg(struct progress **progress, const char *msg);
>  


And then this is how a typical caller gets updated.

For this particular one, it is not _too_ bad, even though it does
feel a bit roundabout way to say "we are starting delayed progress
at this point" by having this call unconditionally, but then the
reader needs to realize "ah, not really, because show_progress is
not enabled" when the reading reaches the end of the call.

> diff --git a/builtin/blame.c b/builtin/blame.c
> index 94ef57c1cc..1d72b27fda 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -1129,8 +1129,8 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  
>  	sb.found_guilty_entry = &found_guilty_entry;
>  	sb.found_guilty_entry_data = &pi;
> -	if (show_progress)
> -		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
> +	pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines,
> +					     show_progress);
>  
>  	assign_blame(&sb, opt);

But there are others that look a bit problematic.  In this example
taken from fsck, we open all the pack index, only because it is
needed to show the progress, and the existing conditionals are ways
to avoid spending unneeded cycles.

> @@ -836,16 +836,15 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>  			uint32_t total = 0, count = 0;
>  			struct progress *progress = NULL;
>  
> -			if (show_progress) {
> -				for (p = get_all_packs(the_repository); p;
> -				     p = p->next) {
> -					if (open_pack_index(p))
> -						continue;
> -					total += p->num_objects;
> -				}
> -
> -				progress = start_progress(_("Checking objects"), total);
> +			for (p = get_all_packs(the_repository); p;
> +			     p = p->next) {
> +				if (open_pack_index(p))
> +					continue;
> +				total += p->num_objects;
>  			}
> +
> +			progress = start_progress(_("Checking objects"), total,
> +						  show_progress);
>  			for (p = get_all_packs(the_repository); p;
>  			     p = p->next) {
>  				/* verify gives error messages itself */

Likewise, we do not even have to be scanning the index entries
upfront if we are not showing the progress report (and more
importantly, the user likely has chosen the speed/cycles over eye
candy progress meter) while checking out paths from the index.

> diff --git a/unpack-trees.c b/unpack-trees.c
> index 4be5fc3075..fa84244466 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -338,16 +338,14 @@ static struct progress *get_progress(struct unpack_trees_options *o,
>  {
>  	unsigned cnt = 0, total = 0;
>  
> -	if (!o->update || !o->verbose_update)
> -		return NULL;
> -
>  	for (; cnt < index->cache_nr; cnt++) {
>  		const struct cache_entry *ce = index->cache[cnt];
>  		if (ce->ce_flags & (CE_UPDATE | CE_WT_REMOVE))
>  			total++;
>  	}
>  
> -	return start_delayed_progress(_("Updating files"), total);
> +	return start_delayed_progress(_("Updating files"), total,
> +				      (o->update && o->verbose_update));
>  }
>  
>  static void setup_collided_checkout_detection(struct checkout *state,


And there are cases where the "is progress enabled?" parameter (by
the way, I am not sure 'verbose' is quite the right name for the
parameter and the field name, as that word implies there can be
multiple verbosity levels---what I see you've done is that you added
an "is this enabled?" flag) is not a simple variable, like these.
Some are simple enough, but some are not.  Some of the worst appear
in preload-index.c and read-cache.c---they may become easier to read
if we introduced a variable to be passed as the "enable?"  parameter
and compute it before makng a call, and that might be enough, though.

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 7016b28485..1a97a9f0be 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1012,8 +1012,8 @@ static void write_pack_file(void)
>  	time_t last_mtime = 0;
>  	struct object_entry **write_order;
>  
> -	if (progress > pack_to_stdout)
> -		progress_state = start_progress(_("Writing objects"), nr_result);
> +	progress_state = start_progress(_("Writing objects"), nr_result,
> +					(progress > pack_to_stdout));
>  	ALLOC_ARRAY(written_list, to_pack.nr_objects);
>  	write_order = compute_write_order();
>  

> @@ -1418,9 +1415,9 @@ int write_commit_graph_reachable(struct object_directory *odb,
>  
>  	memset(&data, 0, sizeof(data));
>  	data.commits = &commits;
> -	if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
> -		data.progress = start_delayed_progress(
> -			_("Collecting referenced commits"), 0);
> +	data.progress = start_delayed_progress(
> +		_("Collecting referenced commits"), 0,
> +		(flags & COMMIT_GRAPH_WRITE_PROGRESS));
>  
>  	for_each_ref(add_ref_to_set, &data);
>  	result = write_commit_graph(odb, NULL, &commits,

> @@ -2301,9 +2295,9 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
>  	if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH)
>  		return verify_commit_graph_error;
>  
> -	if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
> -		progress = start_progress(_("Verifying commits in commit graph"),
> -					g->num_commits);
> +	progress = start_progress(_("Verifying commits in commit graph"),
> +				g->num_commits,
> +				(flags & COMMIT_GRAPH_WRITE_PROGRESS));
>  
>  	for (i = 0; i < g->num_commits; i++) {
>  		struct commit *graph_commit, *odb_commit;

> diff --git a/midx.c b/midx.c
> index 6d1584ca51..d9ab163340 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -836,10 +836,8 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
>  	}
>  
>  	packs.pack_paths_checked = 0;
> -	if (flags & MIDX_PROGRESS)
> -		packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0);
> -	else
> -		packs.progress = NULL;
> +	packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0,
> +						(flags & MIDX_PROGRESS));
>  
>  	for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &packs);
>  	stop_progress(&packs.progress);

> diff --git a/preload-index.c b/preload-index.c
> index ed6eaa4738..a924a84a14 100644
> --- a/preload-index.c
> +++ b/preload-index.c
> @@ -115,10 +115,9 @@ void preload_index(struct index_state *index,
>  	memset(&data, 0, sizeof(data));
>  
>  	memset(&pd, 0, sizeof(pd));
> -	if (refresh_flags & REFRESH_PROGRESS && isatty(2)) {
> -		pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr);
> -		pthread_mutex_init(&pd.mutex, NULL);
> -	}
> +	pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr,
> +					     (refresh_flags & REFRESH_PROGRESS && isatty(2)));
> +	pthread_mutex_init(&pd.mutex, NULL);
>  
>  	for (i = 0; i < threads; i++) {
>  		struct thread_data *p = data+i;

> diff --git a/read-cache.c b/read-cache.c
> index aa427c5c17..2ddc422dbd 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1532,9 +1532,9 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>  	const char *unmerged_fmt;
>  	struct progress *progress = NULL;
>  
> -	if (flags & REFRESH_PROGRESS && isatty(2))
> -		progress = start_delayed_progress(_("Refresh index"),
> -						  istate->cache_nr);
> +	progress = start_delayed_progress(_("Refresh index"),
> +					  istate->cache_nr,
> +					  (flags & REFRESH_PROGRESS && isatty(2)));
>  
>  	trace_performance_enter();
>  	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";


And here is the only one that may benefit from this change, which
cares that the codepath prepared in rev-list, primarily to support
the progress meter, is exercised because those pieces of code
happens to be used by the tracing output as well.  Perhaps we
shouldn't have tied the "when showing progress, we can also trace"
earlier and we didn't have to touch this many codepath only to spend
cycles when progress output is not desired?

> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index f520111eda..f64cad8390 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  	if (bisect_list)
>  		revs.limited = 1;
>  
> -	if (show_progress)
> -		progress = start_delayed_progress(show_progress, 0);
> +	/*
> +	 * When progress is not printed to the user, we still want to be able to
> +	 * classify the progress during tracing. So, use a placeholder name.
> +	 */
> +	progress = start_delayed_progress(
> +			show_progress ? show_progress : _("Quiet rev-list operation"),
> +			0, show_progress != NULL);

I've cut out the hunks that exhibits similar looking patterns as the
ones I covered in the above discussion.  You are right to say that
these are simple and similar changes all over the place.  I am not
yet convinced that the simplicity supports that this change is the
right thing to do, though.  The only saving grace might be that
there aren't that many hunks that explicitly do unnecessary things
like the ones in unpack-trees.c and builtin/fsck.c even when we are
not showing progress.  

But the other codepaths may be doing conditional computation not
based on "if (show_progress)" but on "if (progress)", in which case
with this patch, we may be paying a lot more overhead even if we
know progress meter won't be shown and the worse part from
reviewability point of view is that this patch does not explicitly
do anything to make it happen because start_delayed_progress() now
unconditionally give a non-NULL progress structure to enable them.

So I dunno.  It certainly is not for the upcoming release, but we do
want to make sure that trace requested even when the command is quiet
gets collected in the longer term (i.e. during the next cycle).

Thanks.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] progress: remove redundant null-checking
  2020-07-10 20:29             ` Emily Shaffer
@ 2020-07-10 23:03               ` Emily Shaffer
  0 siblings, 0 replies; 27+ messages in thread
From: Emily Shaffer @ 2020-07-10 23:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Derrick Stolee, git

On Fri, Jul 10, 2020 at 01:29:55PM -0700, Emily Shaffer wrote:
> 
> On Fri, Jul 10, 2020 at 12:58:54PM -0700, Junio C Hamano wrote:
> > 
> > Emily Shaffer <emilyshaffer@google.com> writes:
> > 
> > > Will it make your life easier if I base this series on the other topic?
> > > Or would you rather I leave it to you?
> > 
> > Either is fine.  
> > 
> > Just don't expect any quick turnaround time for a new and involved
> > change like this while we are entering the pre-release freeze.  We
> > are supposed to be stablizing, not churning ;-).
> 
> Understood.
> 
> It's got a nit or two from Stolee's review so I need to reroll it
> anyway; I'll send it based on the other topic.

Hum. It seems that, as of `git fetch github.com/gitster/git` in the past
hour, sg/commit-graph-progress-fix doesn't include
es/trace-log-progress, so I don't think it is actually a good idea for
me to submit this series based on sg/commit-graph-progress-fix instead.
I'll send the next version based on master, after all.

> 
>  - Emily

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10 22:40   ` Junio C Hamano
@ 2020-07-14  0:15     ` Emily Shaffer
  2020-08-17 22:19       ` Emily Shaffer
  2020-09-09 22:36     ` Jonathan Tan
  1 sibling, 1 reply; 27+ messages in thread
From: Emily Shaffer @ 2020-07-14  0:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Jul 10, 2020 at 03:40:25PM -0700, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > Before now, the progress API is used by conditionally calling
> > start_progress() or a similar call, and then unconditionally calling
> > display_progress() and stop_progress(), both of which are tolerant of
> > NULL or uninitialized inputs. However, in
> > 98a136474082cdc7228d7e0e45672c5274fab701 (trace: log progress time and
> > throughput), the progress library learned to log traces during expensive
> > operations. In cases where progress should not be displayed to the user
> > - such as when Git is called by a script - no traces will be logged,
> > because the progress object is never created.
> >
> > Instead, to allow us to collect traces from scripted Git commands, teach
> > a progress->verbose flag, which is specified via a new argument to
> > start_progress() and friends. display_progress() also learns to filter
> > for that flag. With these changes, start_progress() can be called
> > unconditionally but with a conditional as an argument to determine
> > whether to report progress to the user.
> >
> > Since this changes the API, also modify callers of start_progress() and
> > friends to drop their conditional and pass a new argument in instead.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  builtin/blame.c             |   4 +-
> >  builtin/commit-graph.c      |   5 +-
> >  builtin/fsck.c              |  25 ++++---
> >  builtin/index-pack.c        |  16 ++---
> >  builtin/log.c               |   4 +-
> >  builtin/pack-objects.c      |  18 +++--
> >  builtin/prune.c             |   4 +-
> >  builtin/rev-list.c          |   9 ++-
> >  builtin/unpack-objects.c    |   3 +-
> >  commit-graph.c              | 140 +++++++++++++++++-------------------
> >  delta-islands.c             |   4 +-
> >  diffcore-rename.c           |   9 ++-
> >  entry.c                     |   2 +-
> >  midx.c                      |  43 +++++------
> >  pack-bitmap-write.c         |   8 +--
> >  pack-bitmap.c               |   5 +-
> >  preload-index.c             |   7 +-
> >  progress.c                  |  27 ++++---
> >  progress.h                  |  10 +--
> >  prune-packed.c              |   4 +-
> >  read-cache.c                |   6 +-
> >  t/helper/test-progress.c    |   6 +-
> >  t/t0500-progress-display.sh |  27 +++++++
> >  unpack-trees.c              |  14 ++--
> >  walker.c                    |   4 +-
> >  25 files changed, 212 insertions(+), 192 deletions(-)
> 
> So, the gist of the change is these, where all functions in the
> progress API that prepare a progress structure now take "verbose"
> parameter, and the function that emits the output looks at the
> flag.  Counters and timers are used no matter what as long as
> progress is shown.
> 
> > diff --git a/progress.c b/progress.c
> > index 3eda914518..f014908b05 100644
> > --- a/progress.c
> > +++ b/progress.c
> > @@ -43,6 +43,7 @@ struct progress {
> >  	struct strbuf counters_sb;
> >  	int title_len;
> >  	int split;
> > +	int verbose;
> >  };
> >  
> >  static volatile sig_atomic_t progress_update;
> > @@ -245,12 +246,13 @@ void display_throughput(struct progress *progress, uint64_t total)
> >  
> >  void display_progress(struct progress *progress, uint64_t n)
> >  {
> > -	if (progress)
> > +	if (progress && progress->verbose)
> >  		display(progress, n, NULL);
> >  }
> >  
> >  static struct progress *start_progress_delay(const char *title, uint64_t total,
> > -					     unsigned delay, unsigned sparse)
> > +					     unsigned delay, unsigned sparse,
> > +					     int verbose)
> >  {
> >  	struct progress *progress = xmalloc(sizeof(*progress));
> >  	progress->title = title;
> > @@ -264,6 +266,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
> >  	strbuf_init(&progress->counters_sb, 0);
> >  	progress->title_len = utf8_strwidth(title);
> >  	progress->split = 0;
> > +	progress->verbose = verbose;
> >  	set_progress_signal();
> >  	trace2_region_enter("progress", title, the_repository);
> >  	return progress;
> > @@ -279,14 +282,16 @@ static int get_default_delay(void)
> >  	return delay_in_secs;
> >  }
> >  
> > -struct progress *start_delayed_progress(const char *title, uint64_t total)
> > +struct progress *start_delayed_progress(const char *title, uint64_t total,
> > +					int verbose)
> >  {
> > -	return start_progress_delay(title, total, get_default_delay(), 0);
> > +	return start_progress_delay(title, total, get_default_delay(), 0,
> > +				    verbose);
> >  }
> >  
> > -struct progress *start_progress(const char *title, uint64_t total)
> > +struct progress *start_progress(const char *title, uint64_t total, int verbose)
> >  {
> > -	return start_progress_delay(title, total, 0, 0);
> > +	return start_progress_delay(title, total, 0, 0, verbose);
> >  }
> >  
> >  /*
> > @@ -298,15 +303,17 @@ struct progress *start_progress(const char *title, uint64_t total)
> >   * When "sparse" is set, stop_progress() will automatically force the done
> >   * message to show 100%.
> >   */
> > -struct progress *start_sparse_progress(const char *title, uint64_t total)
> > +struct progress *start_sparse_progress(const char *title, uint64_t total,
> > +				       int verbose)
> >  {
> > -	return start_progress_delay(title, total, 0, 1);
> > +	return start_progress_delay(title, total, 0, 1, verbose);
> >  }
> >  
> >  struct progress *start_delayed_sparse_progress(const char *title,
> > -					       uint64_t total)
> > +					       uint64_t total, int verbose)
> >  {
> > -	return start_progress_delay(title, total, get_default_delay(), 1);
> > +	return start_progress_delay(title, total, get_default_delay(), 1,
> > +				    verbose);
> >  }
> >  
> >  static void finish_if_sparse(struct progress *progress)
> > diff --git a/progress.h b/progress.h
> > index f1913acf73..0c64557231 100644
> > --- a/progress.h
> > +++ b/progress.h
> > @@ -13,11 +13,13 @@ void progress_test_force_update(void);
> >  
> >  void display_throughput(struct progress *progress, uint64_t total);
> >  void display_progress(struct progress *progress, uint64_t n);
> > -struct progress *start_progress(const char *title, uint64_t total);
> > -struct progress *start_sparse_progress(const char *title, uint64_t total);
> > -struct progress *start_delayed_progress(const char *title, uint64_t total);
> > +struct progress *start_progress(const char *title, uint64_t total, int verbose);
> > +struct progress *start_sparse_progress(const char *title, uint64_t total,
> > +				       int verbose);
> > +struct progress *start_delayed_progress(const char *title, uint64_t total,
> > +					int verbose);
> >  struct progress *start_delayed_sparse_progress(const char *title,
> > -					       uint64_t total);
> > +					       uint64_t total, int verbose);
> >  void stop_progress(struct progress **progress);
> >  void stop_progress_msg(struct progress **progress, const char *msg);
> >  
> 
> 
> And then this is how a typical caller gets updated.
> 
> For this particular one, it is not _too_ bad, even though it does
> feel a bit roundabout way to say "we are starting delayed progress
> at this point" by having this call unconditionally, but then the
> reader needs to realize "ah, not really, because show_progress is
> not enabled" when the reading reaches the end of the call.
> 
> > diff --git a/builtin/blame.c b/builtin/blame.c
> > index 94ef57c1cc..1d72b27fda 100644
> > --- a/builtin/blame.c
> > +++ b/builtin/blame.c
> > @@ -1129,8 +1129,8 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
> >  
> >  	sb.found_guilty_entry = &found_guilty_entry;
> >  	sb.found_guilty_entry_data = &pi;
> > -	if (show_progress)
> > -		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
> > +	pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines,
> > +					     show_progress);
> >  
> >  	assign_blame(&sb, opt);
> 
> But there are others that look a bit problematic.  In this example
> taken from fsck, we open all the pack index, only because it is
> needed to show the progress, and the existing conditionals are ways
> to avoid spending unneeded cycles.

That info is still used, when performing trace log during
stop_progress():

  if (p_progress && *p_progress) {
          trace2_data_intmax("progress", the_repository, "total_objects",
                             (*p_progress)->total);

          if ((*p_progress)->throughput)
                  trace2_data_intmax("progress", the_repository,
                                     "total_bytes",
                                     (*p_progress)->throughput->curr_total);

          trace2_region_leave("progress", (*p_progress)->title, the_repository);
  }

Because the progress struct is no longer NULL, we can write down the
number of objects calculated to the trace2 log file. So the expense of
calculating it is not wasted, at least in scenarios where someone cares
about the output of the trace log.

But yes, lots of users don't care about trace log - they are using their
home computer and don't care about metrics like how many objects were
transferred in every single push. Now it becomes a little strange - is
it better to instead wrap each 'struct progress' constructor in "if
(should_i_log || should_i_trace)"? Have we got a simple way to look up
'should_i_trace'?

Some later comments have the same point - "why am I calculating the
totals if the user won't see and doesn't care?" - so I won't respond to
each other one; my reply is the same.

> 
> > @@ -836,16 +836,15 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
> >  			uint32_t total = 0, count = 0;
> >  			struct progress *progress = NULL;
> >  
> > -			if (show_progress) {
> > -				for (p = get_all_packs(the_repository); p;
> > -				     p = p->next) {
> > -					if (open_pack_index(p))
> > -						continue;
> > -					total += p->num_objects;
> > -				}
> > -
> > -				progress = start_progress(_("Checking objects"), total);
> > +			for (p = get_all_packs(the_repository); p;
> > +			     p = p->next) {
> > +				if (open_pack_index(p))
> > +					continue;
> > +				total += p->num_objects;
> >  			}
> > +
> > +			progress = start_progress(_("Checking objects"), total,
> > +						  show_progress);
> >  			for (p = get_all_packs(the_repository); p;
> >  			     p = p->next) {
> >  				/* verify gives error messages itself */
> 
> Likewise, we do not even have to be scanning the index entries
> upfront if we are not showing the progress report (and more
> importantly, the user likely has chosen the speed/cycles over eye
> candy progress meter) while checking out paths from the index.
> 
> > diff --git a/unpack-trees.c b/unpack-trees.c
> > index 4be5fc3075..fa84244466 100644
> > --- a/unpack-trees.c
> > +++ b/unpack-trees.c
> > @@ -338,16 +338,14 @@ static struct progress *get_progress(struct unpack_trees_options *o,
> >  {
> >  	unsigned cnt = 0, total = 0;
> >  
> > -	if (!o->update || !o->verbose_update)
> > -		return NULL;
> > -
> >  	for (; cnt < index->cache_nr; cnt++) {
> >  		const struct cache_entry *ce = index->cache[cnt];
> >  		if (ce->ce_flags & (CE_UPDATE | CE_WT_REMOVE))
> >  			total++;
> >  	}
> >  
> > -	return start_delayed_progress(_("Updating files"), total);
> > +	return start_delayed_progress(_("Updating files"), total,
> > +				      (o->update && o->verbose_update));
> >  }
> >  
> >  static void setup_collided_checkout_detection(struct checkout *state,
> 
> 
> And there are cases where the "is progress enabled?" parameter (by
> the way, I am not sure 'verbose' is quite the right name for the

I struggled some with this; I also considered using 'quiet' but it
required inverting the logic everywhere when all callers already were
asking "should I print?" instead of "should I not print?". Maybe "print"
is better?

> parameter and the field name, as that word implies there can be
> multiple verbosity levels---what I see you've done is that you added
> an "is this enabled?" flag) is not a simple variable, like these.
> Some are simple enough, but some are not.  Some of the worst appear
> in preload-index.c and read-cache.c---they may become easier to read
> if we introduced a variable to be passed as the "enable?"  parameter
> and compute it before makng a call, and that might be enough, though.

Sure, makes sense.

> 
> > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> > index 7016b28485..1a97a9f0be 100644
> > --- a/builtin/pack-objects.c
> > +++ b/builtin/pack-objects.c
> > @@ -1012,8 +1012,8 @@ static void write_pack_file(void)
> >  	time_t last_mtime = 0;
> >  	struct object_entry **write_order;
> >  
> > -	if (progress > pack_to_stdout)
> > -		progress_state = start_progress(_("Writing objects"), nr_result);
> > +	progress_state = start_progress(_("Writing objects"), nr_result,
> > +					(progress > pack_to_stdout));
> >  	ALLOC_ARRAY(written_list, to_pack.nr_objects);
> >  	write_order = compute_write_order();
> >  
> 
> > @@ -1418,9 +1415,9 @@ int write_commit_graph_reachable(struct object_directory *odb,
> >  
> >  	memset(&data, 0, sizeof(data));
> >  	data.commits = &commits;
> > -	if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
> > -		data.progress = start_delayed_progress(
> > -			_("Collecting referenced commits"), 0);
> > +	data.progress = start_delayed_progress(
> > +		_("Collecting referenced commits"), 0,
> > +		(flags & COMMIT_GRAPH_WRITE_PROGRESS));
> >  
> >  	for_each_ref(add_ref_to_set, &data);
> >  	result = write_commit_graph(odb, NULL, &commits,
> 
> > @@ -2301,9 +2295,9 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
> >  	if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH)
> >  		return verify_commit_graph_error;
> >  
> > -	if (flags & COMMIT_GRAPH_WRITE_PROGRESS)
> > -		progress = start_progress(_("Verifying commits in commit graph"),
> > -					g->num_commits);
> > +	progress = start_progress(_("Verifying commits in commit graph"),
> > +				g->num_commits,
> > +				(flags & COMMIT_GRAPH_WRITE_PROGRESS));
> >  
> >  	for (i = 0; i < g->num_commits; i++) {
> >  		struct commit *graph_commit, *odb_commit;
> 
> > diff --git a/midx.c b/midx.c
> > index 6d1584ca51..d9ab163340 100644
> > --- a/midx.c
> > +++ b/midx.c
> > @@ -836,10 +836,8 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
> >  	}
> >  
> >  	packs.pack_paths_checked = 0;
> > -	if (flags & MIDX_PROGRESS)
> > -		packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0);
> > -	else
> > -		packs.progress = NULL;
> > +	packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0,
> > +						(flags & MIDX_PROGRESS));
> >  
> >  	for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &packs);
> >  	stop_progress(&packs.progress);
> 
> > diff --git a/preload-index.c b/preload-index.c
> > index ed6eaa4738..a924a84a14 100644
> > --- a/preload-index.c
> > +++ b/preload-index.c
> > @@ -115,10 +115,9 @@ void preload_index(struct index_state *index,
> >  	memset(&data, 0, sizeof(data));
> >  
> >  	memset(&pd, 0, sizeof(pd));
> > -	if (refresh_flags & REFRESH_PROGRESS && isatty(2)) {
> > -		pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr);
> > -		pthread_mutex_init(&pd.mutex, NULL);
> > -	}
> > +	pd.progress = start_delayed_progress(_("Refreshing index"), index->cache_nr,
> > +					     (refresh_flags & REFRESH_PROGRESS && isatty(2)));
> > +	pthread_mutex_init(&pd.mutex, NULL);
> >  
> >  	for (i = 0; i < threads; i++) {
> >  		struct thread_data *p = data+i;
> 
> > diff --git a/read-cache.c b/read-cache.c
> > index aa427c5c17..2ddc422dbd 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -1532,9 +1532,9 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> >  	const char *unmerged_fmt;
> >  	struct progress *progress = NULL;
> >  
> > -	if (flags & REFRESH_PROGRESS && isatty(2))
> > -		progress = start_delayed_progress(_("Refresh index"),
> > -						  istate->cache_nr);
> > +	progress = start_delayed_progress(_("Refresh index"),
> > +					  istate->cache_nr,
> > +					  (flags & REFRESH_PROGRESS && isatty(2)));
> >  
> >  	trace_performance_enter();
> >  	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";
> 
> 
> And here is the only one that may benefit from this change, which
> cares that the codepath prepared in rev-list, primarily to support
> the progress meter, is exercised because those pieces of code
> happens to be used by the tracing output as well.  Perhaps we
> shouldn't have tied the "when showing progress, we can also trace"
> earlier and we didn't have to touch this many codepath only to spend
> cycles when progress output is not desired?

I guess I don't see how this is the only caller that benefits. The
tracing output should appear in all the places where I touched the code;
that was the point of the earlier change. Since the object total is provided in
start_progress() and friends, and the runtime is logged by
region_enter() in start_progress() and region_leave() in
stop_progress(), then every caller is now unconditionally leaving traces
to give some impression of how much work and how long it took. So I am
confused - can you explain how the other callers don't benefit?

> 
> > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > index f520111eda..f64cad8390 100644
> > --- a/builtin/rev-list.c
> > +++ b/builtin/rev-list.c
> > @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >  	if (bisect_list)
> >  		revs.limited = 1;
> >  
> > -	if (show_progress)
> > -		progress = start_delayed_progress(show_progress, 0);
> > +	/*
> > +	 * When progress is not printed to the user, we still want to be able to
> > +	 * classify the progress during tracing. So, use a placeholder name.
> > +	 */
> > +	progress = start_delayed_progress(
> > +			show_progress ? show_progress : _("Quiet rev-list operation"),
> > +			0, show_progress != NULL);
> 
> I've cut out the hunks that exhibits similar looking patterns as the
> ones I covered in the above discussion.  You are right to say that
> these are simple and similar changes all over the place.  I am not
> yet convinced that the simplicity supports that this change is the
> right thing to do, though.  The only saving grace might be that
> there aren't that many hunks that explicitly do unnecessary things
> like the ones in unpack-trees.c and builtin/fsck.c even when we are
> not showing progress.  
> 
> But the other codepaths may be doing conditional computation not
> based on "if (show_progress)" but on "if (progress)", in which case

Hm, good point. For what it's worth, I performed the changes manually:

  git grep "start_progress()"
  for each caller, open in vim
  /progress
  delete or move if necessary
  (repeat until all matches for /progress look right)

and I did not see cases where 'if (progress)' was being used. That is
hard to express in a diff; I can try to audit it better. (This process
is how I also noticed the things to change in 2/2 in this series, even
though they are unrelated to start_progress()).

> with this patch, we may be paying a lot more overhead even if we
> know progress meter won't be shown and the worse part from
> reviewability point of view is that this patch does not explicitly
> do anything to make it happen because start_delayed_progress() now
> unconditionally give a non-NULL progress structure to enable them.
> 
> So I dunno.  It certainly is not for the upcoming release, but we do
> want to make sure that trace requested even when the command is quiet
> gets collected in the longer term (i.e. during the next cycle).

Ok. It sounds like you are saying you want the spirit of the change,
even if not in this exact form - that is good to know.

 - Emily

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-14  0:15     ` Emily Shaffer
@ 2020-08-17 22:19       ` Emily Shaffer
  2020-08-17 22:44         ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: Emily Shaffer @ 2020-08-17 22:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Jul 13, 2020 at 05:15:04PM -0700, Emily Shaffer wrote:
> 
> On Fri, Jul 10, 2020 at 03:40:25PM -0700, Junio C Hamano wrote:
> > > diff --git a/builtin/blame.c b/builtin/blame.c
> > > index 94ef57c1cc..1d72b27fda 100644
> > > --- a/builtin/blame.c
> > > +++ b/builtin/blame.c
> > > @@ -1129,8 +1129,8 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
> > >  
> > >  	sb.found_guilty_entry = &found_guilty_entry;
> > >  	sb.found_guilty_entry_data = &pi;
> > > -	if (show_progress)
> > > -		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
> > > +	pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines,
> > > +					     show_progress);
> > >  
> > >  	assign_blame(&sb, opt);
> > 
> > But there are others that look a bit problematic.  In this example
> > taken from fsck, we open all the pack index, only because it is
> > needed to show the progress, and the existing conditionals are ways
> > to avoid spending unneeded cycles.
> 
> That info is still used, when performing trace log during
> stop_progress():
> 
>   if (p_progress && *p_progress) {
>           trace2_data_intmax("progress", the_repository, "total_objects",
>                              (*p_progress)->total);
> 
>           if ((*p_progress)->throughput)
>                   trace2_data_intmax("progress", the_repository,
>                                      "total_bytes",
>                                      (*p_progress)->throughput->curr_total);
> 
>           trace2_region_leave("progress", (*p_progress)->title, the_repository);
>   }
> 
> Because the progress struct is no longer NULL, we can write down the
> number of objects calculated to the trace2 log file. So the expense of
> calculating it is not wasted, at least in scenarios where someone cares
> about the output of the trace log.
> 
> But yes, lots of users don't care about trace log - they are using their
> home computer and don't care about metrics like how many objects were
> transferred in every single push. Now it becomes a little strange - is
> it better to instead wrap each 'struct progress' constructor in "if
> (should_i_log || should_i_trace)"? Have we got a simple way to look up
> 'should_i_trace'?
> 
> Some later comments have the same point - "why am I calculating the
> totals if the user won't see and doesn't care?" - so I won't respond to
> each other one; my reply is the same.
> 
> > 
> > > @@ -836,16 +836,15 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
> > >  			uint32_t total = 0, count = 0;
> > >  			struct progress *progress = NULL;
> > >  
> > > -			if (show_progress) {
> > > -				for (p = get_all_packs(the_repository); p;
> > > -				     p = p->next) {
> > > -					if (open_pack_index(p))
> > > -						continue;
> > > -					total += p->num_objects;
> > > -				}
> > > -
> > > -				progress = start_progress(_("Checking objects"), total);
> > > +			for (p = get_all_packs(the_repository); p;
> > > +			     p = p->next) {
> > > +				if (open_pack_index(p))
> > > +					continue;
> > > +				total += p->num_objects;
> > >  			}
> > > +
> > > +			progress = start_progress(_("Checking objects"), total,
> > > +						  show_progress);
> > >  			for (p = get_all_packs(the_repository); p;
> > >  			     p = p->next) {
> > >  				/* verify gives error messages itself */
> > 
> > Likewise, we do not even have to be scanning the index entries
> > upfront if we are not showing the progress report (and more
> > importantly, the user likely has chosen the speed/cycles over eye
> > candy progress meter) while checking out paths from the index.
> > 
> > > diff --git a/unpack-trees.c b/unpack-trees.c
> > > index 4be5fc3075..fa84244466 100644
> > > --- a/unpack-trees.c
> > > +++ b/unpack-trees.c
> > > @@ -338,16 +338,14 @@ static struct progress *get_progress(struct unpack_trees_options *o,
> > >  {
> > >  	unsigned cnt = 0, total = 0;
> > >  
> > > -	if (!o->update || !o->verbose_update)
> > > -		return NULL;
> > > -
> > >  	for (; cnt < index->cache_nr; cnt++) {
> > >  		const struct cache_entry *ce = index->cache[cnt];
> > >  		if (ce->ce_flags & (CE_UPDATE | CE_WT_REMOVE))
> > >  			total++;
> > >  	}
> > >  
> > > -	return start_delayed_progress(_("Updating files"), total);
> > > +	return start_delayed_progress(_("Updating files"), total,
> > > +				      (o->update && o->verbose_update));
> > >  }
> > >  
> > >  static void setup_collided_checkout_detection(struct checkout *state,
> > 
> > 
> > And there are cases where the "is progress enabled?" parameter (by
> > the way, I am not sure 'verbose' is quite the right name for the
> 
> I struggled some with this; I also considered using 'quiet' but it
> required inverting the logic everywhere when all callers already were
> asking "should I print?" instead of "should I not print?". Maybe "print"
> is better?
> 
> > parameter and the field name, as that word implies there can be
> > multiple verbosity levels---what I see you've done is that you added
> > an "is this enabled?" flag) is not a simple variable, like these.
> > Some are simple enough, but some are not.  Some of the worst appear
> > in preload-index.c and read-cache.c---they may become easier to read
> > if we introduced a variable to be passed as the "enable?"  parameter
> > and compute it before makng a call, and that might be enough, though.
> 
> Sure, makes sense.
> 
> > > diff --git a/read-cache.c b/read-cache.c
> > > index aa427c5c17..2ddc422dbd 100644
> > > --- a/read-cache.c
> > > +++ b/read-cache.c
> > > @@ -1532,9 +1532,9 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> > >  	const char *unmerged_fmt;
> > >  	struct progress *progress = NULL;
> > >  
> > > -	if (flags & REFRESH_PROGRESS && isatty(2))
> > > -		progress = start_delayed_progress(_("Refresh index"),
> > > -						  istate->cache_nr);
> > > +	progress = start_delayed_progress(_("Refresh index"),
> > > +					  istate->cache_nr,
> > > +					  (flags & REFRESH_PROGRESS && isatty(2)));
> > >  
> > >  	trace_performance_enter();
> > >  	modified_fmt   = in_porcelain ? "M\t%s\n" : "%s: needs update\n";
> > 
> > 
> > And here is the only one that may benefit from this change, which
> > cares that the codepath prepared in rev-list, primarily to support
> > the progress meter, is exercised because those pieces of code
> > happens to be used by the tracing output as well.  Perhaps we
> > shouldn't have tied the "when showing progress, we can also trace"
> > earlier and we didn't have to touch this many codepath only to spend
> > cycles when progress output is not desired?
> 
> I guess I don't see how this is the only caller that benefits. The
> tracing output should appear in all the places where I touched the code;
> that was the point of the earlier change. Since the object total is provided in
> start_progress() and friends, and the runtime is logged by
> region_enter() in start_progress() and region_leave() in
> stop_progress(), then every caller is now unconditionally leaving traces
> to give some impression of how much work and how long it took. So I am
> confused - can you explain how the other callers don't benefit?
> 
> > 
> > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > > index f520111eda..f64cad8390 100644
> > > --- a/builtin/rev-list.c
> > > +++ b/builtin/rev-list.c
> > > @@ -620,8 +620,13 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> > >  	if (bisect_list)
> > >  		revs.limited = 1;
> > >  
> > > -	if (show_progress)
> > > -		progress = start_delayed_progress(show_progress, 0);
> > > +	/*
> > > +	 * When progress is not printed to the user, we still want to be able to
> > > +	 * classify the progress during tracing. So, use a placeholder name.
> > > +	 */
> > > +	progress = start_delayed_progress(
> > > +			show_progress ? show_progress : _("Quiet rev-list operation"),
> > > +			0, show_progress != NULL);
> > 
> > I've cut out the hunks that exhibits similar looking patterns as the
> > ones I covered in the above discussion.  You are right to say that
> > these are simple and similar changes all over the place.  I am not
> > yet convinced that the simplicity supports that this change is the
> > right thing to do, though.  The only saving grace might be that
> > there aren't that many hunks that explicitly do unnecessary things
> > like the ones in unpack-trees.c and builtin/fsck.c even when we are
> > not showing progress.  
> > 
> > But the other codepaths may be doing conditional computation not
> > based on "if (show_progress)" but on "if (progress)", in which case
> 
> Hm, good point. For what it's worth, I performed the changes manually:
> 
>   git grep "start_progress()"
>   for each caller, open in vim
>   /progress
>   delete or move if necessary
>   (repeat until all matches for /progress look right)
> 
> and I did not see cases where 'if (progress)' was being used. That is
> hard to express in a diff; I can try to audit it better. (This process
> is how I also noticed the things to change in 2/2 in this series, even
> though they are unrelated to start_progress()).
> 
> > with this patch, we may be paying a lot more overhead even if we
> > know progress meter won't be shown and the worse part from
> > reviewability point of view is that this patch does not explicitly
> > do anything to make it happen because start_delayed_progress() now
> > unconditionally give a non-NULL progress structure to enable them.
> > 
> > So I dunno.  It certainly is not for the upcoming release, but we do
> > want to make sure that trace requested even when the command is quiet
> > gets collected in the longer term (i.e. during the next cycle).
> 
> Ok. It sounds like you are saying you want the spirit of the change,
> even if not in this exact form - that is good to know.

A month later, are we still interested in this change? I'd like to
pursue it - we are trying to use this progress tracing internally to
generate some metrics and not having this change (or some version of it)
means those metrics aren't accurate.

The main concern I saw here was "we are doing a lot of work that isn't
used if the user doesn't want to log traces" - should I approach a
reroll of this topic by trying to be smarter about whether to set
'quiet' or 'print' or 'verbose' or whatever it is renamed to, based on
whether there is a trace destination? Then for systems which are logging
traces the extra work is worth it, but for everyone else it can function
as before.

I don't love it from a design perspective - it feels a little like
progress module is looking a little too closely at trace module
internals - but from a performance perspective I see the appeal.

On the other hand, progress operations are showing progress because they
are slow - is a little more work on something that's already
human-noticeable slow such a big deal?

 - Emily

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-08-17 22:19       ` Emily Shaffer
@ 2020-08-17 22:44         ` Junio C Hamano
  2020-08-17 22:49           ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2020-08-17 22:44 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer <emilyshaffer@google.com> writes:

> The main concern I saw here was "we are doing a lot of work that isn't
> used if the user doesn't want to log traces" - should I approach a
> reroll of this topic by trying to be smarter about whether to set
> 'quiet' or 'print' or 'verbose' or whatever it is renamed to, based on
> whether there is a trace destination? Then for systems which are logging
> traces the extra work is worth it, but for everyone else it can function
> as before.
>
> I don't love it from a design perspective - it feels a little like
> progress module is looking a little too closely at trace module
> internals.

Isn't that primarily due to the decision to tie progress and trace
too closely?  If so, perhaps that needs to be revisited?

I dunno.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-08-17 22:44         ` Junio C Hamano
@ 2020-08-17 22:49           ` Junio C Hamano
  2020-09-09 22:42             ` Jonathan Tan
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2020-08-17 22:49 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Emily Shaffer <emilyshaffer@google.com> writes:
>
>> The main concern I saw here was "we are doing a lot of work that isn't
>> used if the user doesn't want to log traces" - should I approach a
>> reroll of this topic by trying to be smarter about whether to set
>> 'quiet' or 'print' or 'verbose' or whatever it is renamed to, based on
>> whether there is a trace destination? Then for systems which are logging
>> traces the extra work is worth it, but for everyone else it can function
>> as before.
>>
>> I don't love it from a design perspective - it feels a little like
>> progress module is looking a little too closely at trace module
>> internals.
>
> Isn't that primarily due to the decision to tie progress and trace
> too closely?  If so, perhaps that needs to be revisited?

Or the "too close coupling" needs to be accepted as the cost of
doing so (as "progress is often a good cue for an event worth
tracing" was a convenient way to cheat by programmers not to spend
too many braincycles to decide adding trace points---they
automatically got them when they decided to show progress output).

>
> I dunno.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10 22:40   ` Junio C Hamano
  2020-07-14  0:15     ` Emily Shaffer
@ 2020-09-09 22:36     ` Jonathan Tan
  2020-09-09 23:06       ` Junio C Hamano
  1 sibling, 1 reply; 27+ messages in thread
From: Jonathan Tan @ 2020-09-09 22:36 UTC (permalink / raw)
  To: gitster; +Cc: emilyshaffer, git, Jonathan Tan

> But there are others that look a bit problematic.  In this example
> taken from fsck, we open all the pack index, only because it is
> needed to show the progress, and the existing conditionals are ways
> to avoid spending unneeded cycles.
> 
> > @@ -836,16 +836,15 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
> >  			uint32_t total = 0, count = 0;
> >  			struct progress *progress = NULL;
> >  
> > -			if (show_progress) {
> > -				for (p = get_all_packs(the_repository); p;
> > -				     p = p->next) {
> > -					if (open_pack_index(p))
> > -						continue;
> > -					total += p->num_objects;
> > -				}
> > -
> > -				progress = start_progress(_("Checking objects"), total);
> > +			for (p = get_all_packs(the_repository); p;
> > +			     p = p->next) {
> > +				if (open_pack_index(p))
> > +					continue;
> > +				total += p->num_objects;
> >  			}
> > +
> > +			progress = start_progress(_("Checking objects"), total,
> > +						  show_progress);
> >  			for (p = get_all_packs(the_repository); p;
> >  			     p = p->next) {
> >  				/* verify gives error messages itself */
> 
> Likewise, we do not even have to be scanning the index entries
> upfront if we are not showing the progress report (and more
> importantly, the user likely has chosen the speed/cycles over eye
> candy progress meter) while checking out paths from the index.

This was the most problematic one I saw, and I don't think it's that
problematic - the loop at the bottom of the quotation above calls
verify_pack(), which also calls open_pack_index(), so (unless some of
the "struct packed_git" are freed in the meantime - I haven't looked at
this closely) the opening of the pack indexes are not being wasted.

I also saw some strbuf manipulation to generate the title, but I also
don't think that takes significant cycles compared to the task that
requires the progress display.

But if this is a problem, one thing we could do is pass the total as a
callback instead of as an int, and provide a generic callback that just
returns the dereferenced cb_data. Most invocations would just use that
generic callback. (Alternatively, as discussed in-office, we could allow
start_progress() to return NULL when no progress display is needed,
change start_progress() to not take a total, add a progress_set_total(),
and check in display_progress() that the total has been set before
proceeding.)

> But the other codepaths may be doing conditional computation not
> based on "if (show_progress)" but on "if (progress)", in which case
> with this patch, we may be paying a lot more overhead even if we
> know progress meter won't be shown and the worse part from
> reviewability point of view is that this patch does not explicitly
> do anything to make it happen because start_delayed_progress() now
> unconditionally give a non-NULL progress structure to enable them.

One way to enumerate this might be to get the LHS of all the assignments
from start_progress() and friends (e.g. "pi.progress" in
builtin/blame.c, "progress" in builtin/commit-graph.c) and then grepping
the respective files to see if "if (.*[LHS]" is done.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-08-17 22:49           ` Junio C Hamano
@ 2020-09-09 22:42             ` Jonathan Tan
  0 siblings, 0 replies; 27+ messages in thread
From: Jonathan Tan @ 2020-09-09 22:42 UTC (permalink / raw)
  To: gitster; +Cc: emilyshaffer, git, Jonathan Tan

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Emily Shaffer <emilyshaffer@google.com> writes:
> >
> >> The main concern I saw here was "we are doing a lot of work that isn't
> >> used if the user doesn't want to log traces" - should I approach a
> >> reroll of this topic by trying to be smarter about whether to set
> >> 'quiet' or 'print' or 'verbose' or whatever it is renamed to, based on
> >> whether there is a trace destination? Then for systems which are logging
> >> traces the extra work is worth it, but for everyone else it can function
> >> as before.
> >>
> >> I don't love it from a design perspective - it feels a little like
> >> progress module is looking a little too closely at trace module
> >> internals.
> >
> > Isn't that primarily due to the decision to tie progress and trace
> > too closely?  If so, perhaps that needs to be revisited?
> 
> Or the "too close coupling" needs to be accepted as the cost of
> doing so (as "progress is often a good cue for an event worth
> tracing" was a convenient way to cheat by programmers not to spend
> too many braincycles to decide adding trace points---they
> automatically got them when they decided to show progress output).

I wouldn't describe it as "cheat", but I agree with the general
sentiment - in general, I would think that if something is lengthy
enough to need to indicate progress to the user, we should trace its
performance.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-09-09 22:36     ` Jonathan Tan
@ 2020-09-09 23:06       ` Junio C Hamano
  0 siblings, 0 replies; 27+ messages in thread
From: Junio C Hamano @ 2020-09-09 23:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: emilyshaffer, git

Jonathan Tan <jonathantanmy@google.com> writes:

> I also saw some strbuf manipulation to generate the title, but I also
> don't think that takes significant cycles compared to the task that
> requires the progress display.

I do not think we care so much about cycles.  What I find a lot more
disturbing was that this loses conceptual clarity (e.g. "why are we
generating the title unconditionally when we know we are not going
to use it?"), and that is why in the message you are responding to I
wondered if this was a price we should accept to pay, as decided to
tie the progress and tracing closely earlier.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-07-10  1:42 ` [PATCH 1/2] progress: create progress struct in 'verbose' mode Emily Shaffer
                     ` (2 preceding siblings ...)
  2020-07-10 22:40   ` Junio C Hamano
@ 2020-09-10  0:31   ` Jonathan Nieder
  2020-09-10  5:09     ` Junio C Hamano
  3 siblings, 1 reply; 27+ messages in thread
From: Jonathan Nieder @ 2020-09-10  0:31 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi,

Emily Shaffer wrote:

> Before now, the progress API is used by conditionally calling
> start_progress() or a similar call, and then unconditionally calling
> display_progress() and stop_progress(), both of which are tolerant of
> NULL or uninitialized inputs.

Being super nitpicky for a moment:

- Git's commit messages describe the behavior without the patch in
  the present tense, as though the author were writing a bug report.
  so "Before now" can be "Today".

- with an uninitialized input, display_progress would produce undefined
  behavior and likely crash ;-)

[...]
> Since this changes the API, also modify callers of start_progress() and
> friends to drop their conditional and pass a new argument in instead.

If I understand correctly, the calling sequence before this patch is
something like

	struct progress *progress =
		want_progress ? start_progress(title, n) : NULL;

	for (int i = 0; i < n; i++) {
		... do some work ...
		display_progress(progress, i);
	}

	stop_progress(progress);

and this patch changes it to

	struct progress *progress = start_progress(title, n, want_progress);

	for (int i = 0; i < n; i++) {
		... do some work ...
		display_progress(progress, i);
	}

	stop_progress(progress);

A few consequences:

- it's a little briefer, which is nice

- progress is always non-NULL, so we can't express

	if (progress) {
		for ( ... ) {
			... do one chunk of work ...
			display_progress(...);
		}
	} else {
		... do work slightly more efficiently, all in one chunk ...
	}

- even if we don't want progress, we always spend the overhead of
  allocating a progress struct (not a big deal)

- if 'n' is a more complex expression (e.g. a function call), it gets
  computed even if we don't want progress.  For example, in "git fsck",
  as Junio noticed, this means accumulating the object counts from all
  idx files just to throw them away.

- the motivation: it means the progress API can be aware of whether
  the caller wants to write progress to the terminal and has control
  over what to do with that information.

  In particular this makes the function name display_progress even
  more of a misnomer --- before this patch, display_progress on a
  non-NULL progress struct would display some progress information and
  possibly also write something to traces, but after this patch it
  sometimes only writes something to traces.

Overall I think it's an improvement in the API.  It opens the door to
future changes like making the progress API handle isatty checks in
some cases, perhaps.

[...]
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
[...]
> @@ -836,16 +836,15 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>  			uint32_t total = 0, count = 0;
>  			struct progress *progress = NULL;
>  
> -			if (show_progress) {
> -				for (p = get_all_packs(the_repository); p;
> -				     p = p->next) {
> -					if (open_pack_index(p))
> -						continue;
> -					total += p->num_objects;
> -				}
> -
> -				progress = start_progress(_("Checking objects"), total);
> +			for (p = get_all_packs(the_repository); p;
> +			     p = p->next) {
> +				if (open_pack_index(p))
> +					continue;
> +				total += p->num_objects;
>  			}
> +
> +			progress = start_progress(_("Checking objects"), total,
> +						  show_progress);

As Jonathan mentions[1], this is nothing next to the work that fsck is
about to do on those objects, so although it's aesthetically
unappealing to have to compute this total just in order to throw it
away, it's not likely to make a big difference.

That said, what would an API look like that avoids that?

One possibility would be to make separate initialization and
start-of-progress calls:

	struct progress *progress = progress_new(show_progress, the_repository);

	if (progress_is_enabled(progress)) {
		for (...) {
			...
			total += ...
		}

		start_progress(progress, _("Checking objects"), total);
	}

Using a callback to supply the title and total like Jonathan suggests
is another possibility.  It seems a bit more fussy, though.

[...]
> --- a/progress.h
> +++ b/progress.h
> @@ -13,11 +13,13 @@ void progress_test_force_update(void);
>  
>  void display_throughput(struct progress *progress, uint64_t total);
>  void display_progress(struct progress *progress, uint64_t n);
> -struct progress *start_progress(const char *title, uint64_t total);
> -struct progress *start_sparse_progress(const char *title, uint64_t total);
> -struct progress *start_delayed_progress(const char *title, uint64_t total);
> +struct progress *start_progress(const char *title, uint64_t total, int verbose);
> +struct progress *start_sparse_progress(const char *title, uint64_t total,
> +				       int verbose);
> +struct progress *start_delayed_progress(const char *title, uint64_t total,
> +					int verbose);
>  struct progress *start_delayed_sparse_progress(const char *title,
> -					       uint64_t total);
> +					       uint64_t total, int verbose);
>  void stop_progress(struct progress **progress);
>  void stop_progress_msg(struct progress **progress, const char *msg);

This header contains no comments and there's no API docs for it in
Documentation/technical/ waiting to be copied into comments, so we end
up having to reverse engineer it.  Not about this patch, but it would
be nice to add an overview of the progress API to this file (e.g., to
show a typical calling sequence).

Since display_progress and display_throughput are no longer about
display, would it make sense to rename them (in a separate patch)?
For example,

	update_progress(progress, ++i);
	update_throughput(progress, bytes);

"update" is a bit vague but it may convey more clearly that this
affects traces too and not just what is written to the terminal.

> --- a/prune-packed.c
> +++ b/prune-packed.c
> @@ -31,8 +31,8 @@ static int prune_object(const struct object_id *oid, const char *path,
>  
>  void prune_packed_objects(int opts)
>  {
> -	if (opts & PRUNE_PACKED_VERBOSE)
> -		progress = start_delayed_progress(_("Removing duplicate objects"), 256);
> +	progress = start_delayed_progress(_("Removing duplicate objects"), 256,
> +					  (opts & PRUNE_PACKED_VERBOSE));

style nit: no need for parens around the function parameter (likewise
for most of these)

[...]
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -309,4 +309,31 @@ test_expect_success 'progress generates traces' '
>  	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
>  '
>  
> +test_expect_success 'progress generates traces even quietly' '

Nice.

With whatever subset of the changes discussed makes sense,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks.

[1] https://lore.kernel.org/git/20200909224253.866718-1-jonathantanmy@google.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] progress: create progress struct in 'verbose' mode
  2020-09-10  0:31   ` Jonathan Nieder
@ 2020-09-10  5:09     ` Junio C Hamano
  0 siblings, 0 replies; 27+ messages in thread
From: Junio C Hamano @ 2020-09-10  5:09 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Emily Shaffer, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> A few consequences:
>
> - it's a little briefer, which is nice

Also this makes start always pair with stop, which is cleaner,
especially if there is no extra work.

The current "if you do not want the overhead when not collecting
stats for and showing progress, just avoid calling start---stop
without calling start will be a no-op and safe anyway" arrangement
we have feels a bit kludgy.

> - progress is always non-NULL, so we can't express
>
> 	if (progress) {
> 		for ( ... ) {
> 			... do one chunk of work ...
> 			display_progress(...);
> 		}
> 	} else {
> 		... do work slightly more efficiently, all in one chunk ...
> 	}

Yes, this is the other side of the coin.  When there is significant
difference in the work between with and without progress codepath,
it is convenient to be able to switch on the "progress" pointer
itself.  The progress_is_enabled() helper you bring up later may be
a way to solve it.

> - even if we don't want progress, we always spend the overhead of
>   allocating a progress struct (not a big deal)

True.

> - if 'n' is a more complex expression (e.g. a function call), it gets
>   computed even if we don't want progress.  For example, in "git fsck",
>   as Junio noticed, this means accumulating the object counts from all
>   idx files just to throw them away.

Yes, I think the conceptual muddiness caused by this is what
disturbed me the most.  The cost of counting would likely to be
negligible; developers' time to understand why things are counted
in the first place however is the true waste.

> - the motivation: it means the progress API can be aware of whether
>   the caller wants to write progress to the terminal and has control
>   over what to do with that information.
>
>   In particular this makes the function name display_progress even
>   more of a misnomer --- before this patch, display_progress on a
>   non-NULL progress struct would display some progress information and
>   possibly also write something to traces, but after this patch it
>   sometimes only writes something to traces.

Yeah, this might show us a way to an acceptable solution to the
problem of conceptual uncleanliness, as naming may have a lot to
contribute to it.

> That said, what would an API look like that avoids that?
>
> One possibility would be to make separate initialization and
> start-of-progress calls:
>
> 	struct progress *progress = progress_new(show_progress, the_repository);
>
> 	if (progress_is_enabled(progress)) {
> 		for (...) {
> 			...
> 			total += ...
> 		}
>
> 		start_progress(progress, _("Checking objects"), total);
> 	}

OK.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-09-10  5:09 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-10  1:42 [PATCH 0/2] enable progress traces even when quiet Emily Shaffer
2020-07-10  1:42 ` [PATCH 1/2] progress: create progress struct in 'verbose' mode Emily Shaffer
2020-07-10  2:00   ` Derrick Stolee
2020-07-10  2:17     ` Taylor Blau
2020-07-10 19:21       ` Emily Shaffer
2020-07-10  2:14   ` brian m. carlson
2020-07-10 19:24     ` Emily Shaffer
2020-07-10 21:09     ` Junio C Hamano
2020-07-10 22:00       ` Emily Shaffer
2020-07-10 22:40   ` Junio C Hamano
2020-07-14  0:15     ` Emily Shaffer
2020-08-17 22:19       ` Emily Shaffer
2020-08-17 22:44         ` Junio C Hamano
2020-08-17 22:49           ` Junio C Hamano
2020-09-09 22:42             ` Jonathan Tan
2020-09-09 22:36     ` Jonathan Tan
2020-09-09 23:06       ` Junio C Hamano
2020-09-10  0:31   ` Jonathan Nieder
2020-09-10  5:09     ` Junio C Hamano
2020-07-10  1:42 ` [PATCH 2/2] progress: remove redundant null-checking Emily Shaffer
2020-07-10  2:01   ` Derrick Stolee
2020-07-10  2:20     ` Taylor Blau
2020-07-10 18:50       ` Junio C Hamano
2020-07-10 19:27         ` Emily Shaffer
2020-07-10 19:58           ` Junio C Hamano
2020-07-10 20:29             ` Emily Shaffer
2020-07-10 23:03               ` Emily Shaffer

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).