From: Junio C Hamano <gitster@pobox.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Neeraj Singh <neerajsi@microsoft.com>
Subject: Re: [PATCH] trace2: don't include "fsync" events in all trace2 logs
Date: Thu, 23 Jun 2022 12:34:10 -0700 [thread overview]
Message-ID: <xmqqh74byy19.fsf@gitster.g> (raw)
In-Reply-To: <patch-1.1-df87e515efd-20220623T154943Z-avarab@gmail.com> ("Ævar Arnfjörð Bjarmason"'s message of "Thu, 23 Jun 2022 17:51:46 +0200")
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> Fix the overly verbose trace2 logging added in 9a4987677d3 (trace2:
> add stats for fsync operations, 2022-03-30) (first released with
> v2.36.0).
>
> Since that change every single "git" command invocation has included
> these "data" events, even though we'll only make use of these with
> core.fsyncMethod=batch, and even then only have non-zero values if
> we're writing object data to disk. See c0f4752ed2f (core.fsyncmethod:
> batched disk flushes for loose-objects, 2022-04-04) for that feature.
>
> The result was that e.g. "git version" would previously emit a total
> of 6 trace2 events with the GIT_TRACE2_EVENT target (version, start,
> cmd_ancestry, cmd_name, exit, atexit), but afterwards would emit
> 8. We'd emit 2 "data" events before the "exit" event.
>
> The reason we didn't catch this was that the trace2 unit tests added
> in a15860dca3f (trace2: t/helper/test-trace2, t0210.sh, t0211.sh,
> t0212.sh, 2019-02-22) would omit any "data" events that weren't the
> ones it cared about. Before this change to the C code 6/7 of our
> "t/t0212-trace2-event.sh" tests would fail if this change was applied
> to "t/t0212/parse_events.perl".
>
> Let's make the trace2 testing more strict, and further append any new
> events types we don't know about in "t/t0212/parse_events.perl". Since
> we only invoke the "test-tool trace2" there's no guarantee that we'll
> catch other overly verbose events in the future, but we'll at least
> notice if we start emitting new events that are issues every time we
> log anything with trace2's JSON target.
>
> We exclude the "data_json" event type, we'd otherwise would fail on
> both "win test" and "win+VS test" CI due to the logging added in
> 353d3d77f4f (trace2: collect Windows-specific process information,
> 2019-02-22). It looks like that logging should really be using
> trace2_cmd_ancestry() instead, which was introduced later in
> 2f732bf15e6 (tr2: log parent process name, 2021-07-21), but let's
> leave it for now.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>
> This is a fix for a v2.36.0 regression/change, so perhaps it's
> categorically out for v2.37.0, but given how isolated the fix is & how
> verbose the new trace2 data is perhaps it's worth including anyway.
>
> Passing CI at: https://github.com/avar/git/actions/runs/2548732003
>
> t/t0212/parse_events.perl | 17 +++++++++++------
> wrapper.c | 3 +++
> 2 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl
> index b6408560c0c..389cf6f61c4 100644
> --- a/t/t0212/parse_events.perl
> +++ b/t/t0212/parse_events.perl
> @@ -216,12 +216,17 @@
>
> elsif ($event eq 'data') {
> my $cat = $line->{'category'};
> + my $key = $line->{'key'};
> + my $value = $line->{'value'};
> + $processes->{$sid}->{'data'}->{$cat}->{$key} = $value;
> + }
> +
> + elsif ($event eq 'data_json') {
> + # Ignore due to compat/win32/trace2_win32_process_info.c
> + }
> +
> + else {
> + push @{$processes->{$sid}->{$event}} => $line->{value};
> }
The special casing is unfortunate, but if it is "let's leave it for
now", probably we should leave a NEEDSWORK comment here to remind
us, instead of giving a false impresion that abusing data_json is
OK?
> diff --git a/wrapper.c b/wrapper.c
> index 1c3c970080b..eebd764131f 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -618,6 +618,9 @@ int git_fsync(int fd, enum fsync_action action)
>
> void trace_git_fsync_stats(void)
> {
> + if (!count_fsync_writeout_only && !count_fsync_hardware_flush)
> + return;
> +
OK, both counters are unconditionally incremented whenever the
corresponding action triggers, so this implementation is probably OK
for now, but I suspect that it would become a maintenance burden if
we were to add more fsync actions later.
> trace2_data_intmax("fsync", the_repository, "fsync/writeout-only", count_fsync_writeout_only);
> trace2_data_intmax("fsync", the_repository, "fsync/hardware-flush", count_fsync_hardware_flush);
If each of these is guarded with "show only when non-zero", that
would become less problematic, though.
A bigger issue is when somebody is debugging the callers of
git_fsync() wondering why they do not trigger. Getting log entries
that say "no git_fsync() call with these actions have been made"
may be useful (i.e. explicitly negative answer).
I am not sure if it is a good solution in the longer term, but this
implementation may be a good-enough way to squelch the immediate
loudness issue.
Thanks.
next prev parent reply other threads:[~2022-06-23 19:43 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-23 15:51 [PATCH] trace2: don't include "fsync" events in all trace2 logs Ævar Arnfjörð Bjarmason
2022-06-23 18:21 ` [EXTERNAL] " Neeraj Singh (WINDOWS-SFS)
2022-06-23 19:34 ` Junio C Hamano [this message]
2022-06-30 8:56 ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2022-06-30 20:45 ` Junio C Hamano
2022-07-18 10:31 ` [PATCH v3] trace2: only include "fsync" events if we git_fsync() Ævar Arnfjörð Bjarmason
2022-07-18 16:50 ` Junio C Hamano
2022-07-18 19:01 ` [EXTERNAL] " Neeraj Singh (WINDOWS-SFS)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqh74byy19.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=neerajsi@microsoft.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).