From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Junio C Hamano <gitster@pobox.com>,
Derrick Stolee <stolee@gmail.com>,
Stefan Beller <sbeller@google.com>
Subject: Regression in: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
Date: Fri, 25 Jan 2019 16:33:48 +0100 [thread overview]
Message-ID: <20190125153348.GF6702@szeder.dev> (raw)
In-Reply-To: <20181204224238.50966-1-jonathantanmy@google.com>
On Tue, Dec 04, 2018 at 02:42:38PM -0800, Jonathan Tan wrote:
> When fetching into a repository, a connectivity check is first made by
> check_exist_and_connected() in builtin/fetch.c that runs:
>
> git rev-list --objects --stdin --not --all --quiet <(list of objects)
>
> If the client repository has many refs, this command can be slow,
> regardless of the nature of the server repository or what is being
> fetched. A profiler reveals that most of the time is spent in
> setup_revisions() (approx. 60/63), and of the time spent in
> setup_revisions(), most of it is spent in parse_object() (approx.
> 49/60). This is because setup_revisions() parses the target of every ref
> (from "--all"), and parse_object() reads the buffer of the object.
>
> Reading the buffer is unnecessary if the repository has a commit graph
> and if the ref points to a commit (which is typically the case). This
> patch uses the commit graph wherever possible; on my computer, when I
> run the above command with a list of 1 object on a many-ref repository,
> I get a speedup from 1.8s to 1.0s.
>
> Another way to accomplish this effect would be to modify parse_object()
> to use the commit graph if possible; however, I did not want to change
> parse_object()'s current behavior of always checking the object
> signature of the returned object.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> This is on sb/more-repo-in-api because I'm using the repo_parse_commit()
> function.
>
> A colleague noticed this issue when handling a mirror clone.
>
> Looking at the bigger picture, the speed of the connectivity check
> during a fetch might be further improved by passing only the negotiation
> tips (obtained through --negotiation-tip) instead of "--all". This patch
> just handles the low-hanging fruit first.
> ---
I stumbled upon a regression that bisects down to this commit
ec0c5798ee (revision: use commit graph in get_reference(),
2018-12-04):
$ ~/src/git/bin-wrappers/git version
git version 2.19.1.566.gec0c5798ee
$ ~/src/git/bin-wrappers/git commit-graph write --reachable
Computing commit graph generation numbers: 100% (58994/58994), done.
$ ~/src/git/bin-wrappers/git status
HEAD detached at origin/pu
nothing to commit, working tree clean
$ ~/src/git/bin-wrappers/git -c core.commitGraph=false describe --dirty
v2.20.1-833-gcb3b9e7ee3
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --dirty
v2.20.1-833-gcb3b9e7ee3
It's all good with only '--dirty', but watch this with '--all
--dirty':
$ ~/src/git/bin-wrappers/git -c core.commitGraph=false describe --all --dirty
remotes/origin/pu
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --all --dirty
remotes/origin/pu-dirty
IOW if the commit-graph is enabled, then my clean worktree is reported
as dirty.
And to add a cherry on top of my confusion:
$ git checkout v2.20.0
Previous HEAD position was cb3b9e7ee3 Merge branch 'jh/trace2' into pu
HEAD is now at 5d826e9729 Git 2.20
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --all --dirty
tags/v2.20.0
It's clean even with '--all' and commit-graph enabled, but watch this:
$ git branch this-will-screw-it-up
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --all --dirty
tags/v2.20.0-dirty
Have fun! :)
> revision.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/revision.c b/revision.c
> index b5108b75ab..e7da2c57ab 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -212,7 +212,20 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
> {
> struct object *object;
>
> - object = parse_object(revs->repo, oid);
> + /*
> + * If the repository has commit graphs, repo_parse_commit() avoids
> + * reading the object buffer, so use it whenever possible.
> + */
> + if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
> + struct commit *c = lookup_commit(revs->repo, oid);
> + if (!repo_parse_commit(revs->repo, c))
> + object = (struct object *) c;
> + else
> + object = NULL;
> + } else {
> + object = parse_object(revs->repo, oid);
> + }
> +
> if (!object) {
> if (revs->ignore_missing)
> return object;
> --
> 2.19.0.271.gfe8321ec05.dirty
>
next prev parent reply other threads:[~2019-01-25 15:33 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-04 22:42 [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference() Jonathan Tan
2018-12-04 23:12 ` Stefan Beller
2018-12-06 23:36 ` Jonathan Tan
2018-12-07 13:49 ` Derrick Stolee
2018-12-05 4:54 ` Jeff King
2018-12-06 23:54 ` Jonathan Tan
2018-12-07 8:53 ` Jeff King
2018-12-05 23:15 ` Junio C Hamano
2018-12-07 21:50 ` [PATCH on master v2] " Jonathan Tan
2018-12-09 0:51 ` Junio C Hamano
2018-12-09 1:49 ` Junio C Hamano
2018-12-11 10:54 ` Jeff King
2018-12-12 19:58 ` Jonathan Tan
2018-12-13 1:27 ` Jeff King
2018-12-13 16:20 ` Derrick Stolee
2018-12-13 18:54 ` [PATCH v3] " Jonathan Tan
2018-12-14 3:20 ` Junio C Hamano
2018-12-14 8:45 ` Jeff King
2019-01-25 15:33 ` SZEDER Gábor [this message]
2019-01-25 19:56 ` Regression in: [PATCH on sb/more-repo-in-api] " Stefan Beller
2019-01-25 22:01 ` Jonathan Tan
2019-01-25 22:14 ` SZEDER Gábor
2019-01-25 22:21 ` SZEDER Gábor
2019-01-27 13:08 ` [PATCH] object_as_type: initialize commit-graph-related fields of 'struct commit' SZEDER Gábor
2019-01-27 13:28 ` SZEDER Gábor
2019-01-27 18:40 ` Derrick Stolee
2019-01-28 16:15 ` Jeff King
2019-01-28 16:57 ` Jonathan Tan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190125153348.GF6702@szeder.dev \
--to=szeder.dev@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).