From: Jeff Hostetler <git@jeffhostetler.com>
To: Emily Shaffer <emilyshaffer@google.com>, git@vger.kernel.org
Cc: Josh Steadmon <steadmon@google.com>
Subject: Re: [PATCH] trace2: log progress time and throughput
Date: Fri, 15 May 2020 15:37:20 -0400 [thread overview]
Message-ID: <90f800d5-c647-6750-52d1-da973f036bd0@jeffhostetler.com> (raw)
In-Reply-To: <20200512214420.36329-1-emilyshaffer@google.com>
On 5/12/20 5:44 PM, Emily Shaffer wrote:
> Rather than teaching only one operation, like 'git fetch', how to write
> down throughput to traces, we can learn about a wide range of user
> operations that may seem slow by adding tooling to the progress library
> itself. Operations which display progress are likely to be slow-running
> and the kind of thing we want to monitor for performance anyways. By
> showing object counts and data transfer size, we should be able to
> make some derived measurements to ensure operations are scaling the way
> we expect.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
We need to be careful here. The region_enter and _leave calls need
to always be properly nested. Having an implicit region within the
progress code means the code path between all of the different start_
and stop_progress calls need to not interleave with explicit regions.
How about putting something like this in stop_progress:
struct json_writer jw = JSON_WRITER_INIT;
jw_object_begin(&jw, 0);
jw_object_intmax(&jw, "total", p->total);
if (p->throughput)
jw_object_intmax(&jw, "throughput", p->throughput->curr_total);
/*
* and so on...
* and maybe include the elapsed time in since the start_progress.
*/
jw_end(&jw);
trace2_data_json("progress", NULL, p->title, &jw);
jw_release(&jw);
That will give you a single record summary of the progress meter
and you won't have to worry about any interleaving.
You could also add a bit to `struct progress` to let you opt-in
or opt-out of the message on a case-by-case basis.
Jeff
> ---
> One note: by putting trace collection into the progress library, we end
> up with data events which have titles like "Receiving objects" - not
> very machine-parseable. An alternative might be to ask for a
> machine-readable title in the progress struct, but I didn't think it was
> worth the code churn. However, I don't have experience with processing
> the trace data after it's been collected, so if this is a bigger problem
> than I think, please say so and I'll figure something out.
>
> CI run here, although it failed on the same error Junio noted today[1]:
> https://github.com/nasamuffin/git/runs/668457062
>
> - Emily
>
> [1]: https://lore.kernel.org/git/xmqqtv0kc2q1.fsf@gitster.c.googlers.com
>
> progress.c | 17 +++++++++++++++++
> t/t0500-progress-display.sh | 26 ++++++++++++++++++++++++++
> 2 files changed, 43 insertions(+)
>
> diff --git a/progress.c b/progress.c
> index 75633e9c5e..6d2dcff0b6 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -265,6 +265,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
> progress->title_len = utf8_strwidth(title);
> progress->split = 0;
> set_progress_signal();
> + trace2_region_enter("progress", title, the_repository);
> return progress;
> }
>
> @@ -320,6 +321,22 @@ void stop_progress(struct progress **p_progress)
> {
> finish_if_sparse(*p_progress);
>
> + if (p_progress && *p_progress) {
> + trace2_data_intmax("progress", the_repository, "total_objects",
> + (*p_progress)->total);
> +
> + if ((*p_progress)->throughput)
> + trace2_data_intmax("progress", the_repository,
> + "total_bytes",
> + (*p_progress)->throughput->curr_total);
> + }
> +
> + trace2_region_leave("progress",
> + p_progress && *p_progress
> + ? (*p_progress)->title
> + : NULL,
> + the_repository);
> +
> stop_progress_msg(p_progress, _("done"));
> }
>
> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> index d2d088d9a0..1ed1df351c 100755
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -283,4 +283,30 @@ test_expect_success 'cover up after throughput shortens a lot' '
> test_i18ncmp expect out
> '
>
> +test_expect_success 'progress generates traces' '
> + cat >in <<-\EOF &&
> + throughput 102400 1000
> + update
> + progress 10
> + throughput 204800 2000
> + update
> + progress 20
> + throughput 307200 3000
> + update
> + progress 30
> + throughput 409600 4000
> + update
> + progress 40
> + EOF
> +
> + GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
> + "Working hard" <in 2>stderr &&
> +
> + # t0212/parse_events.perl intentionally omits regions and data.
> + grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
> + grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
> + grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
> + grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> +'
> +
> test_done
>
next prev parent reply other threads:[~2020-05-15 19:37 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-12 21:44 [PATCH] trace2: log progress time and throughput Emily Shaffer
2020-05-12 22:14 ` Junio C Hamano
2020-05-13 19:46 ` Josh Steadmon
2020-05-15 10:59 ` Derrick Stolee
2020-05-15 15:27 ` Junio C Hamano
2020-05-15 16:09 ` Junio C Hamano
2020-05-15 16:49 ` Derrick Stolee
2020-05-15 17:00 ` Junio C Hamano
2020-05-15 19:37 ` Jeff Hostetler [this message]
2020-05-15 19:44 ` Jeff Hostetler
2021-06-21 1:24 ` Ævar Arnfjörð Bjarmason
2021-06-21 13:55 ` Elijah Newren
2021-06-21 14:51 ` Ævar Arnfjörð Bjarmason
2021-06-21 20:28 ` Elijah Newren
2021-06-23 2:55 ` Taylor Blau
2021-06-23 3:29 ` Chris Torek
2021-06-23 3:42 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=90f800d5-c647-6750-52d1-da973f036bd0@jeffhostetler.com \
--to=git@jeffhostetler.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=steadmon@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).