git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Patrick Steinhardt <ps@pks.im>, git@vger.kernel.org
Subject: commit-graph paranoia performance, was Re: [ANNOUNCE] Git v2.43.0-rc1
Date: Mon, 13 Nov 2023 15:55:38 -0500	[thread overview]
Message-ID: <20231113205538.GA2028092@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqq8r785ev1.fsf@gitster.g>

On Thu, Nov 09, 2023 at 02:33:54AM +0900, Junio C Hamano wrote:

>  * The codepath to traverse the commit-graph learned to notice that a
>    commit is missing (e.g., corrupt repository lost an object), even
>    though it knows something about the commit (like its parents) from
>    what is in commit-graph.
>    (merge 7a5d604443 ps/do-not-trust-commit-graph-blindly-for-existence later to maint).

I happened to be timing "rev-list" for an unrelated topic today, and I
noticed that this change had a rather large effect. The commit message
for 7a5d604443 claims a 30% performance regression. But that's when
using "--topo-order", and actually writing out the result.

Running "rev-list --count" on a copy of linux.git with a fully-built
commit-graph shows that the run-time doubles:

  Benchmark 1: git.v2.42.1 rev-list --count HEAD
    Time (mean ± σ):     658.0 ms ±   5.2 ms    [User: 613.5 ms, System: 44.4 ms]
    Range (min … max):   650.2 ms … 666.0 ms    10 runs
  
  Benchmark 2: git.v2.43.0-rc1 rev-list --count HEAD
    Time (mean ± σ):      1.333 s ±  0.019 s    [User: 1.263 s, System: 0.069 s]
    Range (min … max):    1.302 s …  1.361 s    10 runs
  
  Summary
    git.v2.42.1 rev-list --count HEAD ran
      2.03 ± 0.03 times faster than git.v2.43.0-rc1 rev-list --count HEAD

Now in defense of that patch, this particular command is going to be one
of the most sensitive in terms of percent change, simply because it
isn't doing much besides walking the commits. And 650ms isn't _that_ big
in an absolute sense. But it also doesn't quite feel like nothing, even
tacked onto a command that might otherwise take 1000ms to run.

Should we default GIT_COMMIT_GRAPH_PARANOIA to "0"? Yes, some operations
might miss a breakage, but that is true of so much of Git. For day to
day commands we generally assume that the repository is not corrupted,
and avoid looking at any data we can. Other commands (like "commit-graph
verify", but maybe others) would probably want to be more careful
(either by checking this case explicitly, or by enabling the paranoia
flag themselves).

-Peff


  parent reply	other threads:[~2023-11-13 20:55 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-08 17:33 [ANNOUNCE] Git v2.43.0-rc1 Junio C Hamano
2023-11-10  7:59 ` Andy Koppe
2023-11-13 20:55 ` Jeff King [this message]
2023-11-13 23:50   ` commit-graph paranoia performance, was " Junio C Hamano
2023-11-14  8:46   ` Patrick Steinhardt
2023-11-14  9:05     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231113205538.GA2028092@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).