git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
@ 2018-12-04 22:42 Jonathan Tan
  2018-12-04 23:12 ` Stefan Beller
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Jonathan Tan @ 2018-12-04 22:42 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

When fetching into a repository, a connectivity check is first made by
check_exist_and_connected() in builtin/fetch.c that runs:

  git rev-list --objects --stdin --not --all --quiet <(list of objects)

If the client repository has many refs, this command can be slow,
regardless of the nature of the server repository or what is being
fetched. A profiler reveals that most of the time is spent in
setup_revisions() (approx. 60/63), and of the time spent in
setup_revisions(), most of it is spent in parse_object() (approx.
49/60). This is because setup_revisions() parses the target of every ref
(from "--all"), and parse_object() reads the buffer of the object.

Reading the buffer is unnecessary if the repository has a commit graph
and if the ref points to a commit (which is typically the case). This
patch uses the commit graph wherever possible; on my computer, when I
run the above command with a list of 1 object on a many-ref repository,
I get a speedup from 1.8s to 1.0s.

Another way to accomplish this effect would be to modify parse_object()
to use the commit graph if possible; however, I did not want to change
parse_object()'s current behavior of always checking the object
signature of the returned object.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
This is on sb/more-repo-in-api because I'm using the repo_parse_commit()
function.

A colleague noticed this issue when handling a mirror clone.

Looking at the bigger picture, the speed of the connectivity check
during a fetch might be further improved by passing only the negotiation
tips (obtained through --negotiation-tip) instead of "--all". This patch
just handles the low-hanging fruit first.
---
 revision.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index b5108b75ab..e7da2c57ab 100644
--- a/revision.c
+++ b/revision.c
@@ -212,7 +212,20 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 {
 	struct object *object;
 
-	object = parse_object(revs->repo, oid);
+	/*
+	 * If the repository has commit graphs, repo_parse_commit() avoids
+	 * reading the object buffer, so use it whenever possible.
+	 */
+	if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
+		struct commit *c = lookup_commit(revs->repo, oid);
+		if (!repo_parse_commit(revs->repo, c))
+			object = (struct object *) c;
+		else
+			object = NULL;
+	} else {
+		object = parse_object(revs->repo, oid);
+	}
+
 	if (!object) {
 		if (revs->ignore_missing)
 			return object;
-- 
2.19.0.271.gfe8321ec05.dirty


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
  2018-12-04 22:42 [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference() Jonathan Tan
@ 2018-12-04 23:12 ` Stefan Beller
  2018-12-06 23:36   ` Jonathan Tan
  2018-12-05  4:54 ` Jeff King
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Stefan Beller @ 2018-12-04 23:12 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Tue, Dec 4, 2018 at 2:42 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> When fetching into a repository, a connectivity check is first made by
> check_exist_and_connected() in builtin/fetch.c that runs:
>
>   git rev-list --objects --stdin --not --all --quiet <(list of objects)
>
> If the client repository has many refs, this command can be slow,
> regardless of the nature of the server repository or what is being
> fetched. A profiler reveals that most of the time is spent in
> setup_revisions() (approx. 60/63), and of the time spent in
> setup_revisions(), most of it is spent in parse_object() (approx.
> 49/60). This is because setup_revisions() parses the target of every ref
> (from "--all"), and parse_object() reads the buffer of the object.
>
> Reading the buffer is unnecessary if the repository has a commit graph
> and if the ref points to a commit (which is typically the case). This
> patch uses the commit graph wherever possible; on my computer, when I
> run the above command with a list of 1 object on a many-ref repository,
> I get a speedup from 1.8s to 1.0s.
>
> Another way to accomplish this effect would be to modify parse_object()
> to use the commit graph if possible; however, I did not want to change
> parse_object()'s current behavior of always checking the object
> signature of the returned object.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> This is on sb/more-repo-in-api because I'm using the repo_parse_commit()
> function.

This is a mere nicety, not strictly required.
Before we had parse_commit(struct commit *) which would accomplish the
same, (and we'd still have that afterwards as a #define falling back onto
the_repository). As the function get_reference() is not the_repository safe
as it contains a call to is_promisor_object() that is repository
agnostic, I think
it would be fair game to not depend on that series. I am not
complaining, though.

> A colleague noticed this issue when handling a mirror clone.
>
> Looking at the bigger picture, the speed of the connectivity check
> during a fetch might be further improved by passing only the negotiation
> tips (obtained through --negotiation-tip) instead of "--all". This patch
> just handles the low-hanging fruit first.
> ---
>  revision.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/revision.c b/revision.c
> index b5108b75ab..e7da2c57ab 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -212,7 +212,20 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
>  {
>         struct object *object;
>
> -       object = parse_object(revs->repo, oid);
> +       /*
> +        * If the repository has commit graphs, repo_parse_commit() avoids
> +        * reading the object buffer, so use it whenever possible.
> +        */
> +       if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
> +               struct commit *c = lookup_commit(revs->repo, oid);
> +               if (!repo_parse_commit(revs->repo, c))
> +                       object = (struct object *) c;
> +               else
> +                       object = NULL;

Would it make sense in this case to rely on parse_object below
instead of assigning NULL? The reason for that would be that
when lookup_commit returns NULL, we would try more broadly.

AFAICT oid_object_info doesn't take advantage of the commit graph,
but just looks up the object header, which is still less than completely
parsing it. Then lookup_commit is overly strict, as it may return
NULL as when there still is a type mismatch (I don't think a mismatch
could happen here, as both rely on just the object store, and not the
commit graph.), so this would be just defensive programming for
the sake of it. I dunno.

    struct commit *c;

    if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
        (c = lookup_commit(revs->repo, oid)) &&
        !repo_parse_commit(revs->repo, c))
            object = (struct object *) c;
    else
        object = parse_object(revs->repo, oid);


So with all that said, I still think this is a good patch.

Thanks,
Stefan

> +       } else {
> +               object = parse_object(revs->repo, oid);
> +       }
> +
>         if (!object) {
>                 if (revs->ignore_missing)
>                         return object;
> --
> 2.19.0.271.gfe8321ec05.dirty
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
  2018-12-04 22:42 [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference() Jonathan Tan
  2018-12-04 23:12 ` Stefan Beller
@ 2018-12-05  4:54 ` Jeff King
  2018-12-06 23:54   ` Jonathan Tan
  2018-12-05 23:15 ` Junio C Hamano
  2018-12-07 21:50 ` [PATCH on master v2] " Jonathan Tan
  3 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2018-12-05  4:54 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Tue, Dec 04, 2018 at 02:42:38PM -0800, Jonathan Tan wrote:

> diff --git a/revision.c b/revision.c
> index b5108b75ab..e7da2c57ab 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -212,7 +212,20 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
>  {
>  	struct object *object;
>  
> -	object = parse_object(revs->repo, oid);
> +	/*
> +	 * If the repository has commit graphs, repo_parse_commit() avoids
> +	 * reading the object buffer, so use it whenever possible.
> +	 */
> +	if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
> +		struct commit *c = lookup_commit(revs->repo, oid);
> +		if (!repo_parse_commit(revs->repo, c))
> +			object = (struct object *) c;
> +		else
> +			object = NULL;
> +	} else {
> +		object = parse_object(revs->repo, oid);
> +	}

This seems like a reasonable thing to do, but I have sort of a
meta-comment. In several places we've started doing this kind of "if
it's this type of object, do X, otherwise do Y" optimization (e.g.,
handling large blobs for streaming).

And in the many cases we end up doubling the effort to do object
lookups: here we do one lookup to get the type, and then if it's not a
commit (or if we don't have a commit graph) we end up parsing it anyway.

I wonder if we could do better. In this instance, it might make sense
to first see if we actually have a commit graph available (it might not
have this object, of course, but at least we'd expect it to have most
commits). In general, it would be nice if we had a more incremental API
for accessing objects: open, get metadata, then read the data. That
would make these kinds of optimizations "free".

I don't have numbers for how much the extra lookups cost. The lookups
are probably dwarfed by parse_object() in general, so even if we save
only a few full object loads, it may be a win. It just seems a shame
that we may be making the "slow" paths (when our type-specific check
doesn't match) even slower.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
  2018-12-04 22:42 [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference() Jonathan Tan
  2018-12-04 23:12 ` Stefan Beller
  2018-12-05  4:54 ` Jeff King
@ 2018-12-05 23:15 ` Junio C Hamano
  2018-12-07 21:50 ` [PATCH on master v2] " Jonathan Tan
  3 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2018-12-05 23:15 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Looking at the bigger picture, the speed of the connectivity check
> during a fetch might be further improved by passing only the negotiation
> tips (obtained through --negotiation-tip) instead of "--all". This patch
> just handles the low-hanging fruit first.

That sounds like a good direction, when having to list excessive
number of refs is the primary problem.  When fetching their 'master'
into our 'remotes/origin/master' and doing nothing else, we may end
up showing only the latter, which will miss optimization opportunity
a lot if the latest change made over there is to merge in the change
we asked them to pull earlier (which would be greatly helped if we
let them know about the tip of the topic they earlier pulled from
us), but also avoids having to send irrelevant refs that point at
tags addded to months old states.  So there is a subtle trade-off
between sending more refs to reduce the resulting packfile, and
sending fewer refs to reduce the cost of the "have" exchange.

Changing the way to throw each object pointed at by a ref into the
queue to be emitted in the "have" exchange from regular object
parsing to peeking of precomputed data would reduce the local cost
of "have" exchange, but it does not reduce the network cost at all,
though.

As to the change being specific to get_reference() and not to
parse_object(), I think what we see here is probably better, simply
because parse_object() is not in the position to asssume that it is
likely to be asked to parse commits, but I think get_reference() is,
after looking at its callsites in revision.c.

I do share the meta-comment concern with Peff, though.

> ---
>  revision.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/revision.c b/revision.c
> index b5108b75ab..e7da2c57ab 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -212,7 +212,20 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
>  {
>  	struct object *object;
>  
> -	object = parse_object(revs->repo, oid);
> +	/*
> +	 * If the repository has commit graphs, repo_parse_commit() avoids
> +	 * reading the object buffer, so use it whenever possible.
> +	 */
> +	if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
> +		struct commit *c = lookup_commit(revs->repo, oid);
> +		if (!repo_parse_commit(revs->repo, c))
> +			object = (struct object *) c;
> +		else
> +			object = NULL;
> +	} else {
> +		object = parse_object(revs->repo, oid);
> +	}
> +
>  	if (!object) {
>  		if (revs->ignore_missing)
>  			return object;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
  2018-12-04 23:12 ` Stefan Beller
@ 2018-12-06 23:36   ` Jonathan Tan
  2018-12-07 13:49     ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Tan @ 2018-12-06 23:36 UTC (permalink / raw)
  To: sbeller; +Cc: jonathantanmy, git

> > This is on sb/more-repo-in-api because I'm using the repo_parse_commit()
> > function.
> 
> This is a mere nicety, not strictly required.
> Before we had parse_commit(struct commit *) which would accomplish the
> same, (and we'd still have that afterwards as a #define falling back onto
> the_repository). As the function get_reference() is not the_repository safe
> as it contains a call to is_promisor_object() that is repository
> agnostic, I think
> it would be fair game to not depend on that series. I am not
> complaining, though.

Good point - I'll base the next version on master (and add a TODO
explaining which functions are not yet converted).

> AFAICT oid_object_info doesn't take advantage of the commit graph,
> but just looks up the object header, which is still less than completely
> parsing it. Then lookup_commit is overly strict, as it may return
> NULL as when there still is a type mismatch (I don't think a mismatch
> could happen here, as both rely on just the object store, and not the
> commit graph.), so this would be just defensive programming for
> the sake of it. I dunno.
> 
>     struct commit *c;
> 
>     if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
>         (c = lookup_commit(revs->repo, oid)) &&
>         !repo_parse_commit(revs->repo, c))
>             object = (struct object *) c;
>     else
>         object = parse_object(revs->repo, oid);

I like this way better - I'll do it in the next version.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
  2018-12-05  4:54 ` Jeff King
@ 2018-12-06 23:54   ` Jonathan Tan
  2018-12-07  8:53     ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Tan @ 2018-12-06 23:54 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git, stolee

Also CC-ing Stolee since I mention multi-pack indices at the end.

> This seems like a reasonable thing to do, but I have sort of a
> meta-comment. In several places we've started doing this kind of "if
> it's this type of object, do X, otherwise do Y" optimization (e.g.,
> handling large blobs for streaming).
> 
> And in the many cases we end up doubling the effort to do object
> lookups: here we do one lookup to get the type, and then if it's not a
> commit (or if we don't have a commit graph) we end up parsing it anyway.
> 
> I wonder if we could do better. In this instance, it might make sense
> to first see if we actually have a commit graph available (it might not
> have this object, of course, but at least we'd expect it to have most
> commits).

This makes sense - I thought I shouldn't mention the commit graph in the
code since it seems like a layering violation, but I felt the need to
mention commit graph in a comment, so maybe the need to mention commit
graph in the code is there too. Subsequently, maybe the lookup-for-type
could be replaced by a lookup-in-commit-graph (maybe by using
parse_commit_in_graph() directly), which should be at least slightly
faster.

> In general, it would be nice if we had a more incremental API
> for accessing objects: open, get metadata, then read the data. That
> would make these kinds of optimizations "free".

Would this be assuming that to read the data, you would (1) first need to
read the metadata, and (2) there would be no redundancy in reading the
two? It seems to me that for loose objects, you would want to perform
all your reads at once, since any read requires opening the file, and
for commit graphs, you just want to read what you want, since the
metadata and the data are in separate places.

> I don't have numbers for how much the extra lookups cost. The lookups
> are probably dwarfed by parse_object() in general, so even if we save
> only a few full object loads, it may be a win. It just seems a shame
> that we may be making the "slow" paths (when our type-specific check
> doesn't match) even slower.

I agree. I think it will always remain a tradeoff when we have multiple
data sources of objects (loose, packed, commit graph - and we can't
unify them all, since they each have their uses). Unless the multi-pack
index can reference commit graphs as well...then it could be our first
point of reference without introducing any inefficiencies...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
  2018-12-06 23:54   ` Jonathan Tan
@ 2018-12-07  8:53     ` Jeff King
  0 siblings, 0 replies; 11+ messages in thread
From: Jeff King @ 2018-12-07  8:53 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee

On Thu, Dec 06, 2018 at 03:54:46PM -0800, Jonathan Tan wrote:

> This makes sense - I thought I shouldn't mention the commit graph in the
> code since it seems like a layering violation, but I felt the need to
> mention commit graph in a comment, so maybe the need to mention commit
> graph in the code is there too. Subsequently, maybe the lookup-for-type
> could be replaced by a lookup-in-commit-graph (maybe by using
> parse_commit_in_graph() directly), which should be at least slightly
> faster.

That makes more sense to me. If we don't have a commit graph at all,
it's a quick noop. If we do, we might binary search in the list of
commits for a non-commit. But that's strictly faster than finding the
object's type (which involves a binary search of a larger list, followed
by actually accessing the type info).

> > In general, it would be nice if we had a more incremental API
> > for accessing objects: open, get metadata, then read the data. That
> > would make these kinds of optimizations "free".
> 
> Would this be assuming that to read the data, you would (1) first need to
> read the metadata, and (2) there would be no redundancy in reading the
> two? It seems to me that for loose objects, you would want to perform
> all your reads at once, since any read requires opening the file, and
> for commit graphs, you just want to read what you want, since the
> metadata and the data are in separate places.

By metadata here, I don't mean the commit-graph data, but just the
object type and size. So I'm imagining an interface more like:

  - object_open() locates the object, and stores either the pack
    file/offset or a descriptor to a loose path in an opaque handle
    struct

  - object_size() and object_type() on that handle would do what you
    expect. For loose objects, these would parse the header (the
    equivalent of unpack_sha1_header()). For packed ones, they'd use the
    object header in the pack (and chase down the delta bits as needed).

  - object_contents() would return the full content

  - object_read() could sequentially read a subset of the file (this
    could replace the streaming interface we currently have)

We have most of the low-level bits for this already, if you poke into
what object_info_extended() is doing. We just don't have them packaged
in an interface which can persist across multiple calls.

With an interface like that, parse_object()'s large-blob check could be
something like the patch below.

But your case here is a bit more interesting. If we have a commit graph,
then we can avoid opening (or even finding!) the on-disk object at all.
So I actually think it makes sense to just check the commit-graph first,
as discussed above.

---
diff --git a/object.c b/object.c
index e54160550c..afce58c0bc 100644
--- a/object.c
+++ b/object.c
@@ -254,23 +254,31 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	const struct object_id *repl = lookup_replace_object(r, oid);
 	void *buffer;
 	struct object *obj;
+	struct object_handle oh;
 
 	obj = lookup_object(r, oid->hash);
 	if (obj && obj->parsed)
 		return obj;
 
-	if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
-	    (!obj && has_object_file(oid) &&
-	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
-		if (check_object_signature(repl, NULL, 0, NULL) < 0) {
+	if (object_open(&oh, oid) < 0)
+		return NULL; /* missing object */
+
+	if (object_type(&oh) == OBJ_BLOB) {
+		/* this will call object_read() on 4k chunks */
+		if (check_object_signature_stream(&oh, oid)) {
 			error(_("sha1 mismatch %s"), oid_to_hex(oid));
 			return NULL;
 		}
+		object_close(&oh); /* we don't care about contents */
 		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
 		return lookup_object(r, oid->hash);
 	}
 
-	buffer = read_object_file(oid, &type, &size);
+	type = object_type(&oh);
+	size = object_size(&oh);
+	buffer = object_contents(&oh);
+	object_close(&oh);
+
 	if (buffer) {
 		if (check_object_signature(repl, buffer, size, type_name(type)) < 0) {
 			free(buffer);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
  2018-12-06 23:36   ` Jonathan Tan
@ 2018-12-07 13:49     ` Derrick Stolee
  0 siblings, 0 replies; 11+ messages in thread
From: Derrick Stolee @ 2018-12-07 13:49 UTC (permalink / raw)
  To: Jonathan Tan, sbeller; +Cc: git

On 12/6/2018 6:36 PM, Jonathan Tan wrote:
>> AFAICT oid_object_info doesn't take advantage of the commit graph,
>> but just looks up the object header, which is still less than completely
>> parsing it. Then lookup_commit is overly strict, as it may return
>> NULL as when there still is a type mismatch (I don't think a mismatch
>> could happen here, as both rely on just the object store, and not the
>> commit graph.), so this would be just defensive programming for
>> the sake of it. I dunno.
>>
>>      struct commit *c;
>>
>>      if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
>>          (c = lookup_commit(revs->repo, oid)) &&
>>          !repo_parse_commit(revs->repo, c))
>>              object = (struct object *) c;
>>      else
>>          object = parse_object(revs->repo, oid);
> I like this way better - I'll do it in the next version.

If we do _not_ have a commit-graph or if the commit-graph does not have
that commit, this will have the same performance problem, right?

Should we instead create a direct dependence on the commit-graph, and try
to parse the oid from the graph directly? If it succeeds, then we learn
that the object is a commit, in addition to all of the parsing work. This
means we could avoid oid_object_info() loading data if we succeed. We
would fall back to parse_object() if it fails.

I was thinking this should be a simple API call to parse_commit_in_graph(),
but that requires a struct commit filled with an oid, which is not the
best idea if we don't actually know it is a commit yet.

The approach I recommend would then be more detailed:

1. Modify find_commit_in_graph() to take a struct object_id instead of a
    struct commit. This helps find the integer position in the graph. That
    position can be used in fill_commit_in_graph() to load the commit
    contents. Keep find_commit_in_graph() static as it should not be a
    public function.

2. Create a public function with prototype

struct commit *try_parse_commit_from_graph(struct repository *r, struct 
object_id *oid)

    that returns a commit struct fully parsed if and only if the repository
    has that oid. It can call find_commit_in_graph(), then 
lookup_commit() and
    fill_commit_in_graph() to create the commit and parse the data.

3. In replace of the snippet above, do:

     struct commit *c;

     if ((c = try_parse_commit_from_graph(revs->repo, oid))
         object = (struct object *)c;
     else
         object = parse_object(revs->repo, oid);

A similar pattern _could_ be used in parse_object(), but I don't recommend
doing this pattern unless we have a reasonable suspicion that we are going
to parse commits more often than other objects. (It adds an O(log(# 
commits))
binary search to each object.)

A final thought: consider making this "try the commit graph first, but fall
back to parse_object()" a library function with a name like

     struct object *parse_probably_commit(struct repository *r, struct 
object_id *oid)

so other paths that are parsing a lot of commits (but also maybe tags) could
use the logic.

Thanks!
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH on master v2] revision: use commit graph in get_reference()
  2018-12-04 22:42 [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference() Jonathan Tan
                   ` (2 preceding siblings ...)
  2018-12-05 23:15 ` Junio C Hamano
@ 2018-12-07 21:50 ` " Jonathan Tan
  2018-12-09  0:51   ` Junio C Hamano
  3 siblings, 1 reply; 11+ messages in thread
From: Jonathan Tan @ 2018-12-07 21:50 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, stolee, peff, gitster

When fetching into a repository, a connectivity check is first made by
check_exist_and_connected() in builtin/fetch.c that runs:

  git rev-list --objects --stdin --not --all --quiet <(list of objects)

If the client repository has many refs, this command can be slow,
regardless of the nature of the server repository or what is being
fetched. A profiler reveals that most of the time is spent in
setup_revisions() (approx. 60/63), and of the time spent in
setup_revisions(), most of it is spent in parse_object() (approx.
49/60). This is because setup_revisions() parses the target of every ref
(from "--all"), and parse_object() reads the buffer of the object.

Reading the buffer is unnecessary if the repository has a commit graph
and if the ref points to a commit (which is typically the case). This
patch uses the commit graph wherever possible; on my computer, when I
run the above command with a list of 1 object on a many-ref repository,
I get a speedup from 1.8s to 1.0s.

Another way to accomplish this effect would be to modify parse_object()
to use the commit graph if possible; however, I did not want to change
parse_object()'s current behavior of always checking the object
signature of the returned object.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
This patch is now on master.

v2 makes use of the optimization Stolee describes in [1], except that I
have arranged the functions slightly differently. In particular, I
didn't want to add even more ways to obtain objects, so I let
parse_commit_in_graph() be able to take in either a commit shell or an
OID, and did not create the parse_probably_commit() function he
suggested. But I'm not really attached to this design choice, and can
change it if requested.

[1] https://public-inbox.org/git/aa0cd481-c135-47aa-2a69-e3dc71661caa@gmail.com/
---
 commit-graph.c             | 38 ++++++++++++++++++++++++++++----------
 commit-graph.h             | 12 ++++++++----
 commit.c                   |  2 +-
 revision.c                 |  5 ++++-
 t/helper/test-repository.c |  4 ++--
 5 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..a571b523b7 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -286,7 +286,8 @@ void close_commit_graph(struct repository *r)
 	r->objects->commit_graph = NULL;
 }
 
-static int bsearch_graph(struct commit_graph *g, struct object_id *oid, uint32_t *pos)
+static int bsearch_graph(struct commit_graph *g, const struct object_id *oid,
+			 uint32_t *pos)
 {
 	return bsearch_hash(oid->hash, g->chunk_oid_fanout,
 			    g->chunk_oid_lookup, g->hash_len, pos);
@@ -374,24 +375,41 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
 	}
 }
 
-static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
+static struct commit *parse_commit_in_graph_one(struct repository *r,
+						struct commit_graph *g,
+						struct commit *shell,
+						const struct object_id *oid)
 {
 	uint32_t pos;
 
-	if (item->object.parsed)
-		return 1;
+	if (shell && shell->object.parsed)
+		return shell;
 
-	if (find_commit_in_graph(item, g, &pos))
-		return fill_commit_in_graph(item, g, pos);
+	if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
+		pos = shell->graph_pos;
+	} else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
+		/* bsearch_graph sets pos */
+	} else {
+		return NULL;
+	}
 
-	return 0;
+	if (!shell) {
+		shell = lookup_commit(r, oid);
+		if (!shell)
+			return NULL;
+	}
+
+	fill_commit_in_graph(shell, g, pos);
+	return shell;
 }
 
-int parse_commit_in_graph(struct repository *r, struct commit *item)
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid)
 {
 	if (!prepare_commit_graph(r))
 		return 0;
-	return parse_commit_in_graph_one(r->objects->commit_graph, item);
+	return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
+					 oid);
 }
 
 void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1025,7 +1043,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 		}
 
 		graph_commit = lookup_commit(r, &cur_oid);
-		if (!parse_commit_in_graph_one(g, graph_commit))
+		if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
 			graph_report("failed to parse %s from commit-graph",
 				     oid_to_hex(&cur_oid));
 	}
diff --git a/commit-graph.h b/commit-graph.h
index 9db40b4d3a..8b7b5985dc 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -13,16 +13,20 @@ struct commit;
 char *get_commit_graph_filename(const char *obj_dir);
 
 /*
- * Given a commit struct, try to fill the commit struct info, including:
+ * If the given commit (identified by shell->object.oid or oid) is in the
+ * commit graph, returns a commit struct (reusing shell if it is not NULL)
+ * including the following info:
  *  1. tree object
  *  2. date
  *  3. parents.
  *
- * Returns 1 if and only if the commit was found in the packed graph.
+ * If not, returns NULL. See parse_commit_buffer() for the fallback after this
+ * call.
  *
- * See parse_commit_buffer() for the fallback after this call.
+ * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
  */
-int parse_commit_in_graph(struct repository *r, struct commit *item);
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid);
 
 /*
  * It is possible that we loaded commit contents from the commit buffer,
diff --git a/commit.c b/commit.c
index d13a7bc374..88eb580c5a 100644
--- a/commit.c
+++ b/commit.c
@@ -456,7 +456,7 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	if (use_commit_graph && parse_commit_in_graph(the_repository, item))
+	if (use_commit_graph && parse_commit_in_graph(the_repository, item, NULL))
 		return 0;
 	buffer = read_object_file(&item->object.oid, &type, &size);
 	if (!buffer)
diff --git a/revision.c b/revision.c
index 13e0519c02..05fddb5880 100644
--- a/revision.c
+++ b/revision.c
@@ -213,7 +213,10 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 {
 	struct object *object;
 
-	object = parse_object(revs->repo, oid);
+	object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
+	if (!object)
+		object = parse_object(revs->repo, oid);
+
 	if (!object) {
 		if (revs->ignore_missing)
 			return object;
diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index 6a84a53efb..63b928a883 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -22,7 +22,7 @@ static void test_parse_commit_in_graph(const char *gitdir, const char *worktree,
 
 	c = lookup_commit(&r, commit_oid);
 
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 
 	printf("%"PRItime, c->date);
@@ -52,7 +52,7 @@ static void test_get_commit_tree_in_graph(const char *gitdir,
 	 * get_commit_tree_in_graph does not automatically parse the commit, so
 	 * parse it first.
 	 */
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 	tree = get_commit_tree_in_graph(&r, c);
 	if (!tree)
-- 
2.19.0.271.gfe8321ec05.dirty


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on master v2] revision: use commit graph in get_reference()
  2018-12-07 21:50 ` [PATCH on master v2] " Jonathan Tan
@ 2018-12-09  0:51   ` Junio C Hamano
  2018-12-09  1:49     ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2018-12-09  0:51 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> When fetching into a repository, a connectivity check is first made by
> check_exist_and_connected() in builtin/fetch.c that runs:
> ...
> Another way to accomplish this effect would be to modify parse_object()
> to use the commit graph if possible; however, I did not want to change
> parse_object()'s current behavior of always checking the object
> signature of the returned object.

Sounds good.

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> This patch is now on master.

OK.  

Obviously that won't apply to the base for v1 without conflicts, and
it of course applies cleanly on 'master', but the result of doing so
will cause the same conflicts when sb/more-repo-in-api is merged on
top, which means that the same conflicts need to be resolved if this
wants to be merged to 'next' (or 'pu', FWIW).

> diff --git a/commit-graph.c b/commit-graph.c
> index 40c855f185..a571b523b7 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -374,24 +375,41 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
>  	}
>  }
>  
> -static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
> +static struct commit *parse_commit_in_graph_one(struct repository *r,
> +						struct commit_graph *g,
> +						struct commit *shell,
> +						const struct object_id *oid)

Now the complexity of the behaviour of this function deserves to be
documented in a comment in front.  Let me see if I can get it
correctly without such a comment by explaining the function aloud.

The caller may or may not have already obtained an in-core commit
object for a given object name, so shell could be NULL but otherwise
it could be used for optimization.  When shell==NULL, the function
looks up the commit object using the oid parameter instead.  The
returned in-core commit has the parents etc. filled as if we ran
parse_commit() on it.  If the commit is not yet in the graph, the
caller may get a NULL even if the commit exists.

>  {
>  	uint32_t pos;
>  
> -	if (item->object.parsed)
> -		return 1;
> +	if (shell && shell->object.parsed)
> +		return shell;
>  
> -	if (find_commit_in_graph(item, g, &pos))
> -		return fill_commit_in_graph(item, g, pos);
> +	if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
> +		pos = shell->graph_pos;
> +	} else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
> +		/* bsearch_graph sets pos */

Please spell an empty statement like so:

		; /* comment */

> +	} else {
> +		return NULL;

We come here when the commit (either came from shell or from oid) is
not found by bsearch_graph().  "Is the caller prepared for it, and
how?" is a natural question a reader would have.  Let's read on.

> +	}
>  
> -	return 0;
> +	if (!shell) {
> +		shell = lookup_commit(r, oid);
> +		if (!shell)
> +			return NULL;
> +	}
> +
> +	fill_commit_in_graph(shell, g, pos);
> +	return shell;
>  }
>  
> -int parse_commit_in_graph(struct repository *r, struct commit *item)
> +struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
> +				     const struct object_id *oid)
>  {
>  	if (!prepare_commit_graph(r))
>  		return 0;
> -	return parse_commit_in_graph_one(r->objects->commit_graph, item);
> +	return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
> +					 oid);
>  }
>  
>  void load_commit_graph_info(struct repository *r, struct commit *item)
> @@ -1025,7 +1043,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
>  		}
>  
>  		graph_commit = lookup_commit(r, &cur_oid);
> -		if (!parse_commit_in_graph_one(g, graph_commit))
> +		if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
>  			graph_report("failed to parse %s from commit-graph",
>  				     oid_to_hex(&cur_oid));
>  	}
> diff --git a/commit-graph.h b/commit-graph.h
> index 9db40b4d3a..8b7b5985dc 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -13,16 +13,20 @@ struct commit;
>  char *get_commit_graph_filename(const char *obj_dir);
>  
>  /*
> - * Given a commit struct, try to fill the commit struct info, including:
> + * If the given commit (identified by shell->object.oid or oid) is in the
> + * commit graph, returns a commit struct (reusing shell if it is not NULL)
> + * including the following info:
>   *  1. tree object
>   *  2. date
>   *  3. parents.
>   *
> - * Returns 1 if and only if the commit was found in the packed graph.
> + * If not, returns NULL. See parse_commit_buffer() for the fallback after this
> + * call.
>   *
> - * See parse_commit_buffer() for the fallback after this call.
> + * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
>   */

OK, the eventual caller is the caller of this thing, which should
have been prepared to see NULL for a commit that is too new.  So
that should be OK.

> -int parse_commit_in_graph(struct repository *r, struct commit *item);
> +struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
> +				     const struct object_id *oid);
>  
>  /*
>   * It is possible that we loaded commit contents from the commit buffer,
> diff --git a/commit.c b/commit.c
> index d13a7bc374..88eb580c5a 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -456,7 +456,7 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
>  		return -1;
>  	if (item->object.parsed)
>  		return 0;
> -	if (use_commit_graph && parse_commit_in_graph(the_repository, item))
> +	if (use_commit_graph && parse_commit_in_graph(the_repository, item, NULL))
>  		return 0;
>  	buffer = read_object_file(&item->object.oid, &type, &size);
>  	if (!buffer)
> diff --git a/revision.c b/revision.c
> index 13e0519c02..05fddb5880 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -213,7 +213,10 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
>  {
>  	struct object *object;
>  
> -	object = parse_object(revs->repo, oid);
> +	object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
> +	if (!object)
> +		object = parse_object(revs->repo, oid);

OK and this is such a caller.  I think a general rule of thumb is
that we need to access recent history a lot more often than the
older part of the history, and having to fall back for more recent
commits feels a bit disturbing, but I do not see an easy way to
reverse the performance characteristics offhand.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH on master v2] revision: use commit graph in get_reference()
  2018-12-09  0:51   ` Junio C Hamano
@ 2018-12-09  1:49     ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2018-12-09  1:49 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee, peff

Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> When fetching into a repository, a connectivity check is first made by
>> check_exist_and_connected() in builtin/fetch.c that runs:
>> ...
>> Another way to accomplish this effect would be to modify parse_object()
>> to use the commit graph if possible; however, I did not want to change
>> parse_object()'s current behavior of always checking the object
>> signature of the returned object.
>
> Sounds good.
>
>> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
>> ---
>> This patch is now on master.
>
> OK.  
>
> Obviously that won't apply to the base for v1 without conflicts, and
> it of course applies cleanly on 'master', but the result of doing so
> will cause the same conflicts when sb/more-repo-in-api is merged on
> top, which means that the same conflicts need to be resolved if this
> wants to be merged to 'next' (or 'pu', FWIW).

So,... as I had to do the reverse rebase anyway, here is the
difference since the previous round, which I came up with by
comparing these two:

 (A) merge 'sb/more-repo-in-api' to 'master' and then merge v1 of
     this topic to the result.

 (B) apply your patch to 'master', and then merge
     'sb/more-repo-in-api' to the result.

diff --git a/commit-graph.c b/commit-graph.c
index f78a8e96b5..74a17789f8 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -286,7 +286,8 @@ void close_commit_graph(struct repository *r)
 	r->objects->commit_graph = NULL;
 }
 
-static int bsearch_graph(struct commit_graph *g, struct object_id *oid, uint32_t *pos)
+static int bsearch_graph(struct commit_graph *g, const struct object_id *oid,
+			 uint32_t *pos)
 {
 	return bsearch_hash(oid->hash, g->chunk_oid_fanout,
 			    g->chunk_oid_lookup, g->hash_len, pos);
@@ -377,26 +378,41 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
 	}
 }
 
-static int parse_commit_in_graph_one(struct repository *r,
-				     struct commit_graph *g,
-				     struct commit *item)
+static struct commit *parse_commit_in_graph_one(struct repository *r,
+						struct commit_graph *g,
+						struct commit *shell,
+						const struct object_id *oid)
 {
 	uint32_t pos;
 
-	if (item->object.parsed)
-		return 1;
+	if (shell && shell->object.parsed)
+		return shell;
 
-	if (find_commit_in_graph(item, g, &pos))
-		return fill_commit_in_graph(r, item, g, pos);
+	if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
+		pos = shell->graph_pos;
+	} else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
+		/* bsearch_graph sets pos */
+	} else {
+		return NULL;
+	}
 
-	return 0;
+	if (!shell) {
+		shell = lookup_commit(r, oid);
+		if (!shell)
+			return NULL;
+	}
+
+	fill_commit_in_graph(r, shell, g, pos);
+	return shell;
 }
 
-int parse_commit_in_graph(struct repository *r, struct commit *item)
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid)
 {
 	if (!prepare_commit_graph(r))
 		return 0;
-	return parse_commit_in_graph_one(r, r->objects->commit_graph, item);
+	return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
+					 oid);
 }
 
 void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1033,7 +1049,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 		}
 
 		graph_commit = lookup_commit(r, &cur_oid);
-		if (!parse_commit_in_graph_one(r, g, graph_commit))
+		if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
 			graph_report("failed to parse %s from commit-graph",
 				     oid_to_hex(&cur_oid));
 	}
diff --git a/commit-graph.h b/commit-graph.h
index 9db40b4d3a..8b7b5985dc 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -13,16 +13,20 @@ struct commit;
 char *get_commit_graph_filename(const char *obj_dir);
 
 /*
- * Given a commit struct, try to fill the commit struct info, including:
+ * If the given commit (identified by shell->object.oid or oid) is in the
+ * commit graph, returns a commit struct (reusing shell if it is not NULL)
+ * including the following info:
  *  1. tree object
  *  2. date
  *  3. parents.
  *
- * Returns 1 if and only if the commit was found in the packed graph.
+ * If not, returns NULL. See parse_commit_buffer() for the fallback after this
+ * call.
  *
- * See parse_commit_buffer() for the fallback after this call.
+ * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
  */
-int parse_commit_in_graph(struct repository *r, struct commit *item);
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid);
 
 /*
  * It is possible that we loaded commit contents from the commit buffer,
diff --git a/commit.c b/commit.c
index a5333c7ac6..da7a1d3262 100644
--- a/commit.c
+++ b/commit.c
@@ -462,7 +462,7 @@ int repo_parse_commit_internal(struct repository *r,
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	if (use_commit_graph && parse_commit_in_graph(r, item))
+	if (use_commit_graph && parse_commit_in_graph(r, item, NULL))
 		return 0;
 	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
 	if (!buffer)
diff --git a/revision.c b/revision.c
index 22aa109c14..05fddb5880 100644
--- a/revision.c
+++ b/revision.c
@@ -213,19 +213,9 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 {
 	struct object *object;
 
-	/*
-	 * If the repository has commit graphs, repo_parse_commit() avoids
-	 * reading the object buffer, so use it whenever possible.
-	 */
-	if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
-		struct commit *c = lookup_commit(revs->repo, oid);
-		if (!repo_parse_commit(revs->repo, c))
-			object = (struct object *) c;
-		else
-			object = NULL;
-	} else {
+	object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
+	if (!object)
 		object = parse_object(revs->repo, oid);
-	}
 
 	if (!object) {
 		if (revs->ignore_missing)
diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index f7f8618445..689a0b652e 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -27,7 +27,7 @@ static void test_parse_commit_in_graph(const char *gitdir, const char *worktree,
 
 	c = lookup_commit(&r, commit_oid);
 
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 
 	printf("%"PRItime, c->date);
@@ -62,7 +62,7 @@ static void test_get_commit_tree_in_graph(const char *gitdir,
 	 * get_commit_tree_in_graph does not automatically parse the commit, so
 	 * parse it first.
 	 */
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 	tree = get_commit_tree_in_graph(&r, c);
 	if (!tree)


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-04 22:42 [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference() Jonathan Tan
2018-12-04 23:12 ` Stefan Beller
2018-12-06 23:36   ` Jonathan Tan
2018-12-07 13:49     ` Derrick Stolee
2018-12-05  4:54 ` Jeff King
2018-12-06 23:54   ` Jonathan Tan
2018-12-07  8:53     ` Jeff King
2018-12-05 23:15 ` Junio C Hamano
2018-12-07 21:50 ` [PATCH on master v2] " Jonathan Tan
2018-12-09  0:51   ` Junio C Hamano
2018-12-09  1:49     ` Junio C Hamano

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox