git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] reflog expire: add progress output
@ 2018-09-19 14:10 Ævar Arnfjörð Bjarmason
  2018-09-19 16:26 ` Duy Nguyen
  2018-09-21  2:17 ` Eric Sunshine
  0 siblings, 2 replies; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-19 14:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason

Before this change the "git reflog expire" command didn't report any
progress. This is the second command (after "pack-refs --all --prune")
that the "gc" command will run.

On small repositories like this command won't take long to run, my
test system it takes just under 1 second to run on git.git, but just
around 8 seconds on linux.git, and much longer on the
2015-04-03-1M-git.git[1] large test repository.

Taking so long means that "gc" will appear to hang at the beginning of
its run. It might still do so after this change if the earlier
"pack-refs" command takes a really long time to run, but that'll only
impact repositories with a really large set of refs to pack, and can
be addressed in some future change.

One thing that's bad about this change is that we might *in theory*
print a "Marking unreachable commits in reflog for expiry" message for
each ref with a reflog. This is because the abbreviated callstack
looks like this:

    0  mark_reachable at builtin/reflog.c:227
    1  in unreachable at builtin/reflog.c:290
    2  in should_expire_reflog_ent at builtin/reflog.c:317
    3  in expire_reflog_ent at refs/files-backend.c:2956
    4  in show_one_reflog_ent at refs/files-backend.c:1879
    # This is the last function that has the refname (e.g. "HEAD") available
    5  in files_for_each_reflog_ent at refs/files-backend.c:2025
    6  in refs_for_each_reflog_ent at refs.c:2066
    7  in files_reflog_expire at refs/files-backend.c:3043
    8  in refs_reflog_expire at refs.c:2117
    9  in reflog_expire at refs.c:2129
    # Here's where we collect reflogs to expire, and expire each one
    10 in cmd_reflog_expire  at builtin/reflog.c:595

I.e. this progress is being reported for each expired reflog. So if
start_progress() were used instead of start_delayed_progress() we'd
print (e.g. on my git.git) hundreds of these lines.

In practice I haven't been able to make it print anything except one
line. This is because validating the reflogs for these other
branches (not "HEAD") takes such a short amount of time.

That may just be some artifact of the repositories I've tested, but I
suspect It'll be true in general. As the callstack above shows, in
order to guarantee that we don't do that we'd need to pass some
"progress" variable through 10 levels of functions, many of which are
"for_each" callback functions with void* cb_data.

1. https://github.com/avar/2015-04-03-1M-git

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/reflog.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acef5a0ab..d3075ee75a 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -10,6 +10,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "reachable.h"
+#include "progress.h"
 
 /* NEEDSWORK: switch to using parse_options */
 static const char reflog_expire_usage[] =
@@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb)
 	struct commit_list *pending;
 	timestamp_t expire_limit = cb->mark_limit;
 	struct commit_list *leftover = NULL;
+	struct progress *progress = NULL;
+	int i = 0;
 
 	for (pending = cb->mark_list; pending; pending = pending->next)
 		pending->item->object.flags &= ~REACHABLE;
 
 	pending = cb->mark_list;
+	progress = start_delayed_progress(
+		_("Marking unreachable commits in reflog for expiry"), 0);
 	while (pending) {
 		struct commit_list *parent;
 		struct commit *commit = pop_commit(&pending);
+
+		display_progress(progress, ++i);
 		if (commit->object.flags & REACHABLE)
 			continue;
 		if (parse_commit(commit))
@@ -253,6 +260,7 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb)
 		}
 	}
 	cb->mark_list = leftover;
+	stop_progress(&progress);
 }
 
 static int unreachable(struct expire_reflog_policy_cb *cb, struct commit *commit, struct object_id *oid)
-- 
2.19.0.444.g18242da7ef


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] reflog expire: add progress output
  2018-09-19 14:10 [PATCH] reflog expire: add progress output Ævar Arnfjörð Bjarmason
@ 2018-09-19 16:26 ` Duy Nguyen
  2018-09-19 17:22   ` Ævar Arnfjörð Bjarmason
  2018-09-21  2:17 ` Eric Sunshine
  1 sibling, 1 reply; 7+ messages in thread
From: Duy Nguyen @ 2018-09-19 16:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List, Junio C Hamano

On Wed, Sep 19, 2018 at 4:23 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Before this change the "git reflog expire" command didn't report any
> progress.

I love these progress additions you've been pushing lately :)

> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 3acef5a0ab..d3075ee75a 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -10,6 +10,7 @@
>  #include "diff.h"
>  #include "revision.h"
>  #include "reachable.h"
> +#include "progress.h"
>
>  /* NEEDSWORK: switch to using parse_options */
>  static const char reflog_expire_usage[] =
> @@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb)
>         struct commit_list *pending;
>         timestamp_t expire_limit = cb->mark_limit;
>         struct commit_list *leftover = NULL;
> +       struct progress *progress = NULL;
> +       int i = 0;
>
>         for (pending = cb->mark_list; pending; pending = pending->next)
>                 pending->item->object.flags &= ~REACHABLE;
>
>         pending = cb->mark_list;
> +       progress = start_delayed_progress(
> +               _("Marking unreachable commits in reflog for expiry"), 0);

Maybe just "Searching expired reflog entries" or something like that.
It's not technically as accurate, but I think it's easier to
understand by by new people.

Do we have --quiet option or something that needs to completely
suppress this progress thing?

>         while (pending) {
>                 struct commit_list *parent;
>                 struct commit *commit = pop_commit(&pending);
> +
> +               display_progress(progress, ++i);

maybe rename it to commit_count or something and leave "i" for
temporary/short lived usage.

>                 if (commit->object.flags & REACHABLE)
>                         continue;
>                 if (parse_commit(commit))
> @@ -253,6 +260,7 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb)
>                 }
>         }
>         cb->mark_list = leftover;
> +       stop_progress(&progress);
>  }
>
>  static int unreachable(struct expire_reflog_policy_cb *cb, struct commit *commit, struct object_id *oid)
> --
> 2.19.0.444.g18242da7ef
>
-- 
Duy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] reflog expire: add progress output
  2018-09-19 16:26 ` Duy Nguyen
@ 2018-09-19 17:22   ` Ævar Arnfjörð Bjarmason
  2018-09-19 19:01     ` Jeff King
  2018-09-20 16:19     ` Duy Nguyen
  0 siblings, 2 replies; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-19 17:22 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Junio C Hamano


On Wed, Sep 19 2018, Duy Nguyen wrote:

> On Wed, Sep 19, 2018 at 4:23 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> Before this change the "git reflog expire" command didn't report any
>> progress.
>
> I love these progress additions you've been pushing lately :)
>
>> diff --git a/builtin/reflog.c b/builtin/reflog.c
>> index 3acef5a0ab..d3075ee75a 100644
>> --- a/builtin/reflog.c
>> +++ b/builtin/reflog.c
>> @@ -10,6 +10,7 @@
>>  #include "diff.h"
>>  #include "revision.h"
>>  #include "reachable.h"
>> +#include "progress.h"
>>
>>  /* NEEDSWORK: switch to using parse_options */
>>  static const char reflog_expire_usage[] =
>> @@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb)
>>         struct commit_list *pending;
>>         timestamp_t expire_limit = cb->mark_limit;
>>         struct commit_list *leftover = NULL;
>> +       struct progress *progress = NULL;
>> +       int i = 0;
>>
>>         for (pending = cb->mark_list; pending; pending = pending->next)
>>                 pending->item->object.flags &= ~REACHABLE;
>>
>>         pending = cb->mark_list;
>> +       progress = start_delayed_progress(
>> +               _("Marking unreachable commits in reflog for expiry"), 0);
>
> Maybe just "Searching expired reflog entries" or something like that.
> It's not technically as accurate, but I think it's easier to
> understand by by new people.

Or "Pruning reflog entries"? I was aiming for some combination of a)
making it clear what we're doing (pruning stuff) b) that we're doing it
on a subset of the counter of the (very large) values we're showing.

So the current one has "a" && "b", "Searching..." has neither, and
"Pruning..." has "a" but not "b".

Maybe making a && b clear isn't that important, but I'm currently
leaning towards keeping the current one because it's not *that* long and
makes things clearer to the user.

> Do we have --quiet option or something that needs to completely
> suppress this progress thing?

Yes. I also see my commit graph process patches sitting in "next" broke
the "git gc --quiet" mode, and I'll need to submit something on top
(which'll be easy), and submit a v2 on this (pending further
comments...).

Is there a better way to test that (fake up the file descriptor check)
in the tests other than adding getenv("GIT_TEST...") to the progress.c
logic?

>>         while (pending) {
>>                 struct commit_list *parent;
>>                 struct commit *commit = pop_commit(&pending);
>> +
>> +               display_progress(progress, ++i);
>
> maybe rename it to commit_count or something and leave "i" for
> temporary/short lived usage.

Good point. Willdo.

>>                 if (commit->object.flags & REACHABLE)
>>                         continue;
>>                 if (parse_commit(commit))
>> @@ -253,6 +260,7 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb)
>>                 }
>>         }
>>         cb->mark_list = leftover;
>> +       stop_progress(&progress);
>>  }
>>
>>  static int unreachable(struct expire_reflog_policy_cb *cb, struct commit *commit, struct object_id *oid)
>> --
>> 2.19.0.444.g18242da7ef
>>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] reflog expire: add progress output
  2018-09-19 17:22   ` Ævar Arnfjörð Bjarmason
@ 2018-09-19 19:01     ` Jeff King
  2018-09-19 20:16       ` Ævar Arnfjörð Bjarmason
  2018-09-20 16:19     ` Duy Nguyen
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff King @ 2018-09-19 19:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Duy Nguyen, Git Mailing List, Junio C Hamano

On Wed, Sep 19, 2018 at 07:22:44PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Do we have --quiet option or something that needs to completely
> > suppress this progress thing?
> 
> Yes. I also see my commit graph process patches sitting in "next" broke
> the "git gc --quiet" mode, and I'll need to submit something on top
> (which'll be easy), and submit a v2 on this (pending further
> comments...).
> 
> Is there a better way to test that (fake up the file descriptor check)
> in the tests other than adding getenv("GIT_TEST...") to the progress.c
> logic?

The progress code doesn't do the isatty() check at all. The caller has
to do it (and ideally would respect --progress/--no-progress to
override, along with having --quiet imply --no-progress if such an
option exists).

Once you have all that, then you can test --progress explicitly. If you
want to check the isatty() handling, you can use test_terminal().

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] reflog expire: add progress output
  2018-09-19 19:01     ` Jeff King
@ 2018-09-19 20:16       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-19 20:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Duy Nguyen, Git Mailing List, Junio C Hamano


On Wed, Sep 19 2018, Jeff King wrote:

> On Wed, Sep 19, 2018 at 07:22:44PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > Do we have --quiet option or something that needs to completely
>> > suppress this progress thing?
>>
>> Yes. I also see my commit graph process patches sitting in "next" broke
>> the "git gc --quiet" mode, and I'll need to submit something on top
>> (which'll be easy), and submit a v2 on this (pending further
>> comments...).
>>
>> Is there a better way to test that (fake up the file descriptor check)
>> in the tests other than adding getenv("GIT_TEST...") to the progress.c
>> logic?
>
> The progress code doesn't do the isatty() check at all. The caller has
> to do it (and ideally would respect --progress/--no-progress to
> override, along with having --quiet imply --no-progress if such an
> option exists).

Yeah, what I was confused by was testing this with "git gc", and until
my recent commit graph progress patches + this (which I wasn't testing
against) the progress output was all from pack-objects, which checks the
--progress option, and then proceeds to ignore all of that and just
check isatty().

> Once you have all that, then you can test --progress explicitly. If you
> want to check the isatty() handling, you can use test_terminal().

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] reflog expire: add progress output
  2018-09-19 17:22   ` Ævar Arnfjörð Bjarmason
  2018-09-19 19:01     ` Jeff King
@ 2018-09-20 16:19     ` Duy Nguyen
  1 sibling, 0 replies; 7+ messages in thread
From: Duy Nguyen @ 2018-09-20 16:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List, Junio C Hamano

On Wed, Sep 19, 2018 at 07:22:44PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> @@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb)
> >>         struct commit_list *pending;
> >>         timestamp_t expire_limit = cb->mark_limit;
> >>         struct commit_list *leftover = NULL;
> >> +       struct progress *progress = NULL;
> >> +       int i = 0;
> >>
> >>         for (pending = cb->mark_list; pending; pending = pending->next)
> >>                 pending->item->object.flags &= ~REACHABLE;
> >>
> >>         pending = cb->mark_list;
> >> +       progress = start_delayed_progress(
> >> +               _("Marking unreachable commits in reflog for expiry"), 0);
> >
> > Maybe just "Searching expired reflog entries" or something like that.
> > It's not technically as accurate, but I think it's easier to
> > understand by by new people.
> 
> Or "Pruning reflog entries"? I was aiming for some combination of a)
> making it clear what we're doing (pruning stuff) b) that we're doing it
> on a subset of the counter of the (very large) values we're showing.
> 
> So the current one has "a" && "b", "Searching..." has neither, and
> "Pruning..." has "a" but not "b".
> 
> Maybe making a && b clear isn't that important, but I'm currently
> leaning towards keeping the current one because it's not *that* long and
> makes things clearer to the user.

OK

> >>         while (pending) {
> >>                 struct commit_list *parent;
> >>                 struct commit *commit = pop_commit(&pending);
> >> +
> >> +               display_progress(progress, ++i);
> >
> > maybe rename it to commit_count or something and leave "i" for
> > temporary/short lived usage.
> 
> Good point. Willdo.

Actually I'm still not sure if it's valuable to show the actual commit
count here. Some patch like this could show "activity" without the
number. But this is tangent.

-- 8< --
Subject: [PATCH] progress: add api for displaying a throbber

---
 progress.c | 29 +++++++++++++++++++++++++----
 progress.h |  1 +
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/progress.c b/progress.c
index 5a99c9fbf0..fada556bf0 100644
--- a/progress.c
+++ b/progress.c
@@ -36,6 +36,7 @@ struct progress {
 	unsigned delay;
 	struct throughput *throughput;
 	uint64_t start_ns;
+	int show_throbber;
 };
 
 static volatile sig_atomic_t progress_update;
@@ -81,11 +82,13 @@ static int is_foreground_fd(int fd)
 static int display(struct progress *progress, uint64_t n, const char *done)
 {
 	const char *eol, *tp;
+	static char throbber[] = "\\|/-";
 
 	if (progress->delay && (!progress_update || --progress->delay))
 		return 0;
 
-	progress->last_value = n;
+	if (!progress->show_throbber)
+		progress->last_value = n;
 	tp = (progress->throughput) ? progress->throughput->display.buf : "";
 	eol = done ? done : "   \r";
 	if (progress->total) {
@@ -104,8 +107,15 @@ static int display(struct progress *progress, uint64_t n, const char *done)
 		}
 	} else if (progress_update) {
 		if (is_foreground_fd(fileno(stderr)) || done) {
-			fprintf(stderr, "%s: %"PRIuMAX"%s%s",
-				progress->title, (uintmax_t)n, tp, eol);
+			if (progress->show_throbber) {
+				progress->last_value++;
+				fprintf(stderr, "%s: %c%s%s",
+					progress->title,
+					throbber[progress->last_value % 4],
+					tp, eol);
+			} else
+				fprintf(stderr, "%s: %"PRIuMAX"%s%s",
+					progress->title, (uintmax_t)n, tp, eol);
 			fflush(stderr);
 		}
 		progress_update = 0;
@@ -193,6 +203,14 @@ int display_progress(struct progress *progress, uint64_t n)
 	return progress ? display(progress, n, NULL) : 0;
 }
 
+void display_throbber(struct progress *progress)
+{
+	if (progress) {
+		progress->show_throbber = 1;
+		display(progress, 0, NULL);
+	}
+}
+
 static struct progress *start_progress_delay(const char *title, uint64_t total,
 					     unsigned delay)
 {
@@ -248,7 +266,10 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 			throughput_string(&tp->display, tp->curr_total, rate);
 		}
 		progress_update = 1;
-		buf = xstrfmt(", %s.\n", msg);
+		if (progress->show_throbber)
+			buf = xstrfmt(" %s.\n", msg);
+		else
+			buf = xstrfmt(", %s.\n", msg);
 		display(progress, progress->last_value, buf);
 		free(buf);
 	}
diff --git a/progress.h b/progress.h
index 70a4d4a0d6..9d23ff242f 100644
--- a/progress.h
+++ b/progress.h
@@ -5,6 +5,7 @@ struct progress;
 
 void display_throughput(struct progress *progress, uint64_t total);
 int display_progress(struct progress *progress, uint64_t n);
+void display_throbber(struct progress *progress);
 struct progress *start_progress(const char *title, uint64_t total);
 struct progress *start_delayed_progress(const char *title, uint64_t total);
 void stop_progress(struct progress **progress);
-- 
2.19.0.rc0.337.ge906d732e7

-- 8< --

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] reflog expire: add progress output
  2018-09-19 14:10 [PATCH] reflog expire: add progress output Ævar Arnfjörð Bjarmason
  2018-09-19 16:26 ` Duy Nguyen
@ 2018-09-21  2:17 ` Eric Sunshine
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2018-09-21  2:17 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git List, Junio C Hamano

On Wed, Sep 19, 2018 at 10:10 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Before this change the "git reflog expire" command didn't report any
> progress. This is the second command (after "pack-refs --all --prune")
> that the "gc" command will run.
> [...]
> That may just be some artifact of the repositories I've tested, but I
> suspect It'll be true in general. As the callstack above shows, in

s/It'll/it'll/

> order to guarantee that we don't do that we'd need to pass some
> "progress" variable through 10 levels of functions, many of which are
> "for_each" callback functions with void* cb_data.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-09-21  2:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-19 14:10 [PATCH] reflog expire: add progress output Ævar Arnfjörð Bjarmason
2018-09-19 16:26 ` Duy Nguyen
2018-09-19 17:22   ` Ævar Arnfjörð Bjarmason
2018-09-19 19:01     ` Jeff King
2018-09-19 20:16       ` Ævar Arnfjörð Bjarmason
2018-09-20 16:19     ` Duy Nguyen
2018-09-21  2:17 ` Eric Sunshine

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).