git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Neeraj Singh <neerajsi@microsoft.com>
Subject: Re: [PATCH] trace2: don't include "fsync" events in all trace2 logs
Date: Thu, 23 Jun 2022 12:34:10 -0700	[thread overview]
Message-ID: <xmqqh74byy19.fsf@gitster.g> (raw)
In-Reply-To: <patch-1.1-df87e515efd-20220623T154943Z-avarab@gmail.com> ("Ævar	Arnfjörð Bjarmason"'s message of "Thu, 23 Jun 2022 17:51:46 +0200")

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Fix the overly verbose trace2 logging added in 9a4987677d3 (trace2:
> add stats for fsync operations, 2022-03-30) (first released with
> v2.36.0).
>
> Since that change every single "git" command invocation has included
> these "data" events, even though we'll only make use of these with
> core.fsyncMethod=batch, and even then only have non-zero values if
> we're writing object data to disk. See c0f4752ed2f (core.fsyncmethod:
> batched disk flushes for loose-objects, 2022-04-04) for that feature.
>
> The result was that e.g. "git version" would previously emit a total
> of 6 trace2 events with the GIT_TRACE2_EVENT target (version, start,
> cmd_ancestry, cmd_name, exit, atexit), but afterwards would emit
> 8. We'd emit 2 "data" events before the "exit" event.
>
> The reason we didn't catch this was that the trace2 unit tests added
> in a15860dca3f (trace2: t/helper/test-trace2, t0210.sh, t0211.sh,
> t0212.sh, 2019-02-22) would omit any "data" events that weren't the
> ones it cared about. Before this change to the C code 6/7 of our
> "t/t0212-trace2-event.sh" tests would fail if this change was applied
> to "t/t0212/parse_events.perl".
>
> Let's make the trace2 testing more strict, and further append any new
> events types we don't know about in "t/t0212/parse_events.perl". Since
> we only invoke the "test-tool trace2" there's no guarantee that we'll
> catch other overly verbose events in the future, but we'll at least
> notice if we start emitting new events that are issues every time we
> log anything with trace2's JSON target.
>
> We exclude the "data_json" event type, we'd otherwise would fail on
> both "win test" and "win+VS test" CI due to the logging added in
> 353d3d77f4f (trace2: collect Windows-specific process information,
> 2019-02-22). It looks like that logging should really be using
> trace2_cmd_ancestry() instead, which was introduced later in
> 2f732bf15e6 (tr2: log parent process name, 2021-07-21), but let's
> leave it for now.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>
> This is a fix for a v2.36.0 regression/change, so perhaps it's
> categorically out for v2.37.0, but given how isolated the fix is & how
> verbose the new trace2 data is perhaps it's worth including anyway.
>
> Passing CI at: https://github.com/avar/git/actions/runs/2548732003
>
>  t/t0212/parse_events.perl | 17 +++++++++++------
>  wrapper.c                 |  3 +++
>  2 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl
> index b6408560c0c..389cf6f61c4 100644
> --- a/t/t0212/parse_events.perl
> +++ b/t/t0212/parse_events.perl
> @@ -216,12 +216,17 @@
>  
>      elsif ($event eq 'data') {
>  	my $cat = $line->{'category'};
> +	my $key = $line->{'key'};
> +	my $value = $line->{'value'};
> +	$processes->{$sid}->{'data'}->{$cat}->{$key} = $value;
> +    }
> +
> +    elsif ($event eq 'data_json') {
> +	# Ignore due to compat/win32/trace2_win32_process_info.c
> +    }
> +
> +    else {
> +	push @{$processes->{$sid}->{$event}} => $line->{value};
>      }

The special casing is unfortunate, but if it is "let's leave it for
now", probably we should leave a NEEDSWORK comment here to remind
us, instead of giving a false impresion that abusing data_json is
OK?

> diff --git a/wrapper.c b/wrapper.c
> index 1c3c970080b..eebd764131f 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -618,6 +618,9 @@ int git_fsync(int fd, enum fsync_action action)
>  
>  void trace_git_fsync_stats(void)
>  {
> +	if (!count_fsync_writeout_only && !count_fsync_hardware_flush)
> +		return;
> +

OK, both counters are unconditionally incremented whenever the
corresponding action triggers, so this implementation is probably OK
for now, but I suspect that it would become a maintenance burden if
we were to add more fsync actions later.

>  	trace2_data_intmax("fsync", the_repository, "fsync/writeout-only", count_fsync_writeout_only);
>  	trace2_data_intmax("fsync", the_repository, "fsync/hardware-flush", count_fsync_hardware_flush);

If each of these is guarded with "show only when non-zero", that
would become less problematic, though.

A bigger issue is when somebody is debugging the callers of
git_fsync() wondering why they do not trigger.  Getting log entries
that say "no git_fsync() call with these actions have been made"
may be useful (i.e. explicitly negative answer).

I am not sure if it is a good solution in the longer term, but this
implementation may be a good-enough way to squelch the immediate
loudness issue.

Thanks.


  parent reply	other threads:[~2022-06-23 19:43 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-23 15:51 [PATCH] trace2: don't include "fsync" events in all trace2 logs Ævar Arnfjörð Bjarmason
2022-06-23 18:21 ` [EXTERNAL] " Neeraj Singh (WINDOWS-SFS)
2022-06-23 19:34 ` Junio C Hamano [this message]
2022-06-30  8:56 ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2022-06-30 20:45   ` Junio C Hamano
2022-07-18 10:31   ` [PATCH v3] trace2: only include "fsync" events if we git_fsync() Ævar Arnfjörð Bjarmason
2022-07-18 16:50     ` Junio C Hamano
2022-07-18 19:01       ` [EXTERNAL] " Neeraj Singh (WINDOWS-SFS)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqh74byy19.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=neerajsi@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).