From: Derrick Stolee <stolee@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>, sbeller@google.com
Cc: git@vger.kernel.org
Subject: Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
Date: Fri, 7 Dec 2018 08:49:21 -0500
Message-ID: <aa0cd481-c135-47aa-2a69-e3dc71661caa@gmail.com>
In-Reply-To: <20181206233626.144072-1-jonathantanmy@google.com>
On 12/6/2018 6:36 PM, Jonathan Tan wrote:
>> AFAICT oid_object_info doesn't take advantage of the commit graph,
>> but just looks up the object header, which is still less than completely
>> parsing it. Then lookup_commit is overly strict, as it may return
>> NULL as when there still is a type mismatch (I don't think a mismatch
>> could happen here, as both rely on just the object store, and not the
>> commit graph.), so this would be just defensive programming for
>> the sake of it. I dunno.
>>
>>	struct commit *c;
>>
>>	if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
>>	    (c = lookup_commit(revs->repo, oid)) &&
>>	    !repo_parse_commit(revs->repo, c))
>>		object = (struct object *) c;
>>	else
>>		object = parse_object(revs->repo, oid);
> I like this way better - I'll do it in the next version.
If we do _not_ have a commit-graph or if the commit-graph does not have
that commit, this will have the same performance problem, right?
Should we instead create a direct dependence on the commit-graph, and try
to parse the oid from the graph directly? If it succeeds, then we learn
that the object is a commit, in addition to all of the parsing work. This
means we could avoid oid_object_info() loading data if we succeed. We
would fall back to parse_object() if it fails.
I was thinking this should be a simple API call to parse_commit_in_graph(),
but that requires a struct commit filled with an oid, which is not the
best idea if we don't actually know it is a commit yet.
The approach I recommend would then be more detailed:
1. Modify find_commit_in_graph() to take a struct object_id instead of a
struct commit. This helps find the integer position in the graph. That
position can be used in fill_commit_in_graph() to load the commit
contents. Keep find_commit_in_graph() static as it should not be a
public function.
2. Create a public function with prototype
	struct commit *try_parse_commit_from_graph(struct repository *r, struct object_id *oid)
that returns a fully parsed commit struct if and only if the repository's
commit-graph has that oid. It can call find_commit_in_graph(), then
lookup_commit() and fill_commit_in_graph() to create the commit and parse
the data.
3. In place of the snippet above, do:
	struct commit *c;

	if ((c = try_parse_commit_from_graph(revs->repo, oid)))
		object = (struct object *)c;
	else
		object = parse_object(revs->repo, oid);
A similar pattern _could_ be used in parse_object(), but I don't recommend
doing this pattern unless we have a reasonable suspicion that we are going
to parse commits more often than other objects. (It adds an
O(log(# commits)) binary search to each object.)
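To make that per-object cost concrete, here is a minimal sketch of a binary
search over a sorted oid table. It is not git's code: struct toy_oid and
toy_find_oid_pos() are hypothetical names, the table is assumed to be sorted
in memcmp() order, and the real commit-graph lookup differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_HASH_LEN 20 /* SHA-1 sized; an assumption for this sketch */

/* Hypothetical stand-in for struct object_id. */
struct toy_oid {
	unsigned char hash[TOY_HASH_LEN];
};

/*
 * Binary-search a table of oids sorted in memcmp() order.
 * Returns 1 and stores the index in *pos when the oid is present,
 * 0 otherwise. Cost is O(log nr) comparisons per call.
 */
static int toy_find_oid_pos(const struct toy_oid *table, size_t nr,
			    const struct toy_oid *needle, size_t *pos)
{
	size_t lo = 0, hi = nr;

	assert(needle && pos);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(needle->hash, table[mid].hash, TOY_HASH_LEN);

		if (!cmp) {
			*pos = mid;
			return 1;
		}
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return 0;
}
```

A miss costs the same O(log nr) comparisons as a hit, which is why paying it
for every non-commit object in parse_object() only makes sense when commits
dominate.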
A final thought: consider making this "try the commit graph first, but fall
back to parse_object()" a library function with a name like
	struct object *parse_probably_commit(struct repository *r, struct object_id *oid)
so other paths that are parsing a lot of commits (but also maybe tags) could
use the logic.
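The shape of such a helper can be sketched with a toy model. Everything here
is hypothetical and only illustrates the control flow: the toy_* names,
integer ids standing in for oids, and a slow_parses counter standing in for
the cost of parse_object(). None of it is git's actual API.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_NONE, TOY_COMMIT, TOY_TAG };

struct toy_object {
	unsigned id;        /* stands in for an oid */
	enum toy_type type;
	int parsed;
};

struct toy_repo {
	const unsigned *graph_ids;  /* sorted ids present in the toy "commit graph" */
	size_t graph_nr;
	struct toy_object *objects; /* the full toy "object store" */
	size_t objects_nr;
	unsigned slow_parses;       /* how many times the fallback ran */
};

/* Linear scan for brevity; this sketch does not model object lookup cost. */
static struct toy_object *toy_lookup(struct toy_repo *r, unsigned id)
{
	for (size_t i = 0; i < r->objects_nr; i++)
		if (r->objects[i].id == id)
			return &r->objects[i];
	return NULL;
}

/* Fast path: succeeds only when the id is listed in the graph. */
static struct toy_object *toy_try_parse_from_graph(struct toy_repo *r, unsigned id)
{
	size_t lo = 0, hi = r->graph_nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (r->graph_ids[mid] == id) {
			struct toy_object *o = toy_lookup(r, id);

			/* Being in the graph implies the object is a commit. */
			assert(o && o->type == TOY_COMMIT);
			o->parsed = 1;
			return o;
		}
		if (id < r->graph_ids[mid])
			hi = mid;
		else
			lo = mid + 1;
	}
	return NULL;
}

/* Slow path: handles any object type, counted so the cost is visible. */
static struct toy_object *toy_parse_object(struct toy_repo *r, unsigned id)
{
	struct toy_object *o = toy_lookup(r, id);

	r->slow_parses++;
	if (o)
		o->parsed = 1;
	return o;
}

/* Try the graph first, fall back to the generic parser on a miss. */
static struct toy_object *toy_parse_probably_commit(struct toy_repo *r, unsigned id)
{
	struct toy_object *o = toy_try_parse_from_graph(r, id);

	return o ? o : toy_parse_object(r, id);
}
```

In this model a commit that is in the graph never touches the slow path,
while tags and commits missing from the graph pay for the failed search plus
the full parse.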
Thanks!
-Stolee