git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] fetch: speed up mirror-fetches with many refs
@ 2022-01-28 10:15 Patrick Steinhardt
  2022-01-28 10:17 ` [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-01-28 10:15 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 544 bytes --]

Hi,

this patch series aims to speed up mirror-fetches in repositories with
huge amounts of refs. It contains two patches which together speed up
git-fetch(1) in a repository with about 2,1 million references by
roughly 30%.

Patrick

Patrick Steinhardt (2):
  fetch-pack: use commit-graph when computing cutoff
  fetch: skip computing output width when not printing anything

 builtin/fetch.c |  8 ++++++--
 fetch-pack.c    | 28 ++++++++++++++++------------
 2 files changed, 22 insertions(+), 14 deletions(-)

-- 
2.35.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff
  2022-01-28 10:15 [PATCH 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
@ 2022-01-28 10:17 ` Patrick Steinhardt
  2022-01-31 22:53   ` Taylor Blau
  2022-01-28 10:19 ` [PATCH 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
  2022-02-02 12:51 ` [PATCH v2 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2 siblings, 1 reply; 16+ messages in thread
From: Patrick Steinhardt @ 2022-01-28 10:17 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3888 bytes --]

During packfile negotiation we iterate over all refs announced by the
remote side to check whether their IDs refer to commits already known to
us. If a commit is known to us already, then its date is a potential
cutoff point for commits we have in common with the remote side.

There is potentially a lot of commits announced by the remote depending
on how many refs there are in the remote repository, and for every one
of them we need to search for it in our object database and, if found,
parse the corresponding object to find out whether it is a candidate for
the cutoff date. This can be sped up by trying to look up commits via
the commit-graph first, which is a lot more efficient.

One thing to keep in mind though is that the commit-graph corrects
committer dates:

    * A commit with at least one parent has corrected committer date
      equal to the maximum of its commiter date and one more than the
      largest corrected committer date among its parents.

As a result, it may be that the commit date we load via the commit graph
is more recent than it would have been when loaded via the ODB, and as a
result we may also choose a more recent cutoff point. But as the code
documents, this is only a heuristic and it is okay if we determine a
wrong cutoff date. The worst that can happen is that we report more
commits as HAVEs to the server when using corrected dates.

Loading commits via the commit-graph is typically much faster than
loading commits via the object database. Benchmarks in a repository with
about 2,1 million refs and an up-to-date commit-graph show a 20% speedup
when mirror-fetching:

    Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
      Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
      Range (min … max):   74.145 s … 76.862 s    5 runs

    Benchmark 2: git fetch --atomic +refs/*:refs/* (HEAD)
      Time (mean ± σ):     62.350 s ±  0.854 s    [User: 55.412 s, System: 9.976 s]
      Range (min … max):   61.224 s … 63.216 s    5 runs

    Summary
      'git fetch --atomic +refs/*:refs/* (HEAD)' ran
        1.21 ± 0.02 times faster than 'git fetch --atomic +refs/*:refs/* (v2.35.0)'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 fetch-pack.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index dd6ec449f2..c5967e228e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -696,26 +696,30 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 	for (ref = *refs; ref; ref = ref->next) {
-		struct object *o;
+		struct commit *commit;
 
-		if (!has_object_file_with_flags(&ref->old_oid,
+		commit = lookup_commit_in_graph(the_repository, &ref->old_oid);
+		if (!commit) {
+			struct object *o;
+
+			if (!has_object_file_with_flags(&ref->old_oid,
 						OBJECT_INFO_QUICK |
-							OBJECT_INFO_SKIP_FETCH_OBJECT))
-			continue;
-		o = parse_object(the_repository, &ref->old_oid);
-		if (!o)
-			continue;
+						OBJECT_INFO_SKIP_FETCH_OBJECT))
+				continue;
+			o = parse_object(the_repository, &ref->old_oid);
+			if (!o || o->type != OBJ_COMMIT)
+				continue;
+
+			commit = (struct commit *)o;
+		}
 
 		/*
 		 * We already have it -- which may mean that we were
 		 * in sync with the other side at some time after
 		 * that (it is OK if we guess wrong here).
 		 */
-		if (o->type == OBJ_COMMIT) {
-			struct commit *commit = (struct commit *)o;
-			if (!cutoff || cutoff < commit->date)
-				cutoff = commit->date;
-		}
+		if (!cutoff || cutoff < commit->date)
+			cutoff = commit->date;
 	}
 	trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 
-- 
2.35.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] fetch: skip computing output width when not printing anything
  2022-01-28 10:15 [PATCH 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2022-01-28 10:17 ` [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
@ 2022-01-28 10:19 ` Patrick Steinhardt
  2022-02-02 12:51 ` [PATCH v2 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2 siblings, 0 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-01-28 10:19 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2836 bytes --]

When updating references via git-fetch(1), then by default we report to
the user which references have been changed. This output is formatted in
a nice table such that the different columns are aligned. Because the
first column contains abbreviated object IDs we thus need to iterate
over all refs which have changed and compute the minimum length for
their respective abbreviated hashes. While this effort makes sense in
most cases, it is wasteful when the user passes the `--quiet` flag: we
don't print the summary, but still compute the length.

Skip computing the summary width when the user asked for us to be quiet.
This gives us a small speedup of nearly 10% when doing a dry-run
mirror-fetch in a repository with thousands of references being updated:

    Benchmark 1: git fetch --prune --dry-run +refs/*:refs/* (HEAD~)
      Time (mean ± σ):     34.048 s ±  0.233 s    [User: 30.739 s, System: 4.640 s]
      Range (min … max):   33.785 s … 34.296 s    5 runs

    Benchmark 2: git fetch --prune --dry-run +refs/*:refs/* (HEAD)
      Time (mean ± σ):     30.768 s ±  0.287 s    [User: 27.534 s, System: 4.565 s]
      Range (min … max):   30.432 s … 31.181 s    5 runs

    Summary
      'git fetch --prune --dry-run +refs/*:refs/* (HEAD)' ran
        1.11 ± 0.01 times faster than 'git fetch --prune --dry-run +refs/*:refs/* (HEAD~)'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---

[Resend with correct In-Reply-To header.]

 builtin/fetch.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5f06b21f8e..ebbde5d56d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1093,12 +1093,15 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	struct ref *rm;
 	char *url;
 	int want_status;
-	int summary_width = transport_summary_width(ref_map);
+	int summary_width = 0;
 
 	rc = open_fetch_head(&fetch_head);
 	if (rc)
 		return -1;
 
+	if (verbosity >= 0)
+		summary_width = transport_summary_width(ref_map);
+
 	if (raw_url)
 		url = transport_anonymize_url(raw_url);
 	else
@@ -1344,7 +1347,6 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
 	int url_len, i, result = 0;
 	struct ref *ref, *stale_refs = get_stale_heads(rs, ref_map);
 	char *url;
-	int summary_width = transport_summary_width(stale_refs);
 	const char *dangling_msg = dry_run
 		? _("   (%s will become dangling)")
 		: _("   (%s has become dangling)");
@@ -1373,6 +1375,8 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
 	}
 
 	if (verbosity >= 0) {
+		int summary_width = transport_summary_width(stale_refs);
+
 		for (ref = stale_refs; ref; ref = ref->next) {
 			struct strbuf sb = STRBUF_INIT;
 			if (!shown_url) {
-- 
2.35.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff
  2022-01-28 10:17 ` [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
@ 2022-01-31 22:53   ` Taylor Blau
  2022-02-02 12:46     ` Patrick Steinhardt
  0 siblings, 1 reply; 16+ messages in thread
From: Taylor Blau @ 2022-01-31 22:53 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Derrick Stolee, git

(+cc Stolee, in case anything I'm saying here is wrong)

On Fri, Jan 28, 2022 at 11:17:03AM +0100, Patrick Steinhardt wrote:
> One thing to keep in mind though is that the commit-graph corrects
> committer dates:
>
>     * A commit with at least one parent has corrected committer date
>       equal to the maximum of its commiter date and one more than the
>       largest corrected committer date among its parents.

This snippet refers to how correct committer dates are computed, not how
the commit dates themselves are stored.

Indeed, the corrected committer date is used to compute the corrected
commit date offset, which is the "v2" generation number scheme (as
opposed to topological levels, which make up "v1").

But that is entirely separate from the committer dates stored by the
commit-graph file, which are faithful representations of the exact
committer date attached to each commit.

Looking at the very last few lines of the main loop in
write_graph_chunk_data() (where the committer dates are stored):

    if (sizeof((*list)->date) > 4)
      packedDate[0] = htonl(((*list)->date >> 32) & 0x3);
    else
      packedDate[0] = 0;

    packedDate[0] |= htonl(*topo_level_slab_at(ctx->topo_levels, *list) << 2);
    packedDate[1] = htonl((*list)->date);
    hashwrite(f, packedDate, 8);

the low-order 34 bits are used to store the commit's `->date` field, and
the remaining high-order 30 bits are used to store the generation
number. (You can look in `fill_commit_graph_info()` to see that we only
use those 34 bits to write back the date field).

So I think this paragraph (and the ones related to it) about this being
an approximation and that being OK since this is a heuristic can all go
away.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff
  2022-01-31 22:53   ` Taylor Blau
@ 2022-02-02 12:46     ` Patrick Steinhardt
  0 siblings, 0 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-02 12:46 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Derrick Stolee, git

[-- Attachment #1: Type: text/plain, Size: 2090 bytes --]

On Mon, Jan 31, 2022 at 05:53:14PM -0500, Taylor Blau wrote:
> (+cc Stolee, in case anything I'm saying here is wrong)
> 
> On Fri, Jan 28, 2022 at 11:17:03AM +0100, Patrick Steinhardt wrote:
> > One thing to keep in mind though is that the commit-graph corrects
> > committer dates:
> >
> >     * A commit with at least one parent has corrected committer date
> >       equal to the maximum of its commiter date and one more than the
> >       largest corrected committer date among its parents.
> 
> This snippet refers to how correct committer dates are computed, not how
> the commit dates themselves are stored.
> 
> Indeed, the corrected committer date is used to compute the corrected
> commit date offset, which is the "v2" generation number scheme (as
> opposed to topological levels, which make up "v1").
> 
> But that is entirely separate from the committer dates stored by the
> commit-graph file, which are faithful representations of the exact
> committer date attached to each commit.
> 
> Looking at the very last few lines of the main loop in
> write_graph_chunk_data() (where the committer dates are stored):
> 
>     if (sizeof((*list)->date) > 4)
>       packedDate[0] = htonl(((*list)->date >> 32) & 0x3);
>     else
>       packedDate[0] = 0;
> 
>     packedDate[0] |= htonl(*topo_level_slab_at(ctx->topo_levels, *list) << 2);
>     packedDate[1] = htonl((*list)->date);
>     hashwrite(f, packedDate, 8);
> 
> the low-order 34 bits are used to store the commit's `->date` field, and
> the remaining high-order 30 bits are used to store the generation
> number. (You can look in `fill_commit_graph_info()` to see that we only
> use those 34 bits to write back the date field).
> 
> So I think this paragraph (and the ones related to it) about this being
> an approximation and that being OK since this is a heuristic can all go
> away.
> 
> Thanks,
> Taylor

Aha, that makes a lot of sense. Thanks a lot for correcting my
misconception! I'll send a v2 of this patch series with the corrected
commit message.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 0/2] fetch: speed up mirror-fetches with many refs
  2022-01-28 10:15 [PATCH 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2022-01-28 10:17 ` [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
  2022-01-28 10:19 ` [PATCH 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
@ 2022-02-02 12:51 ` Patrick Steinhardt
  2022-02-02 12:51   ` [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
                     ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-02 12:51 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Derrick Stolee

[-- Attachment #1: Type: text/plain, Size: 2458 bytes --]

Hi,

this is the second version of my patch series which aims to speed up
mirror-fetches in repos with huge amounts of refs. The only change
compared to v1 is a fixed up commit message: Taylor has pointed out to
me that commit dates retrieved from the commit-graph are not in fact the
corrected commit dates, which are stored separately.

Patrick

Patrick Steinhardt (2):
  fetch-pack: use commit-graph when computing cutoff
  fetch: skip computing output width when not printing anything

 builtin/fetch.c |  8 ++++++--
 fetch-pack.c    | 28 ++++++++++++++++------------
 2 files changed, 22 insertions(+), 14 deletions(-)

Range-diff against v1:
1:  31cf8f87a1 ! 1:  6fac914f0f fetch-pack: use commit-graph when computing cutoff
    @@ Commit message
         the cutoff date. This can be sped up by trying to look up commits via
         the commit-graph first, which is a lot more efficient.
     
    -    One thing to keep in mind though is that the commit-graph corrects
    -    committer dates:
    -
    -        * A commit with at least one parent has corrected committer date
    -          equal to the maximum of its commiter date and one more than the
    -          largest corrected committer date among its parents.
    -
    -    As a result, it may be that the commit date we load via the commit graph
    -    is more recent than it would have been when loaded via the ODB, and as a
    -    result we may also choose a more recent cutoff point. But as the code
    -    documents, this is only a heuristic and it is okay if we determine a
    -    wrong cutoff date. The worst that can happen is that we report more
    -    commits as HAVEs to the server when using corrected dates.
    -
    -    Loading commits via the commit-graph is typically much faster than
    -    loading commits via the object database. Benchmarks in a repository with
    -    about 2,1 million refs and an up-to-date commit-graph show a 20% speedup
    -    when mirror-fetching:
    +    Benchmarks in a repository with about 2,1 million refs and an up-to-date
    +    commit-graph show a 20% speedup when mirror-fetching:
     
             Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
               Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
2:  5a3fd3232f = 2:  4b9bbcf795 fetch: skip computing output width when not printing anything
-- 
2.35.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff
  2022-02-02 12:51 ` [PATCH v2 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
@ 2022-02-02 12:51   ` Patrick Steinhardt
  2022-02-09 18:01     ` Christian Couder
  2022-02-02 12:51   ` [PATCH v2 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
  2022-02-10 12:28   ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2 siblings, 1 reply; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-02 12:51 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Derrick Stolee

[-- Attachment #1: Type: text/plain, Size: 3068 bytes --]

During packfile negotiation we iterate over all refs announced by the
remote side to check whether their IDs refer to commits already known to
us. If a commit is known to us already, then its date is a potential
cutoff point for commits we have in common with the remote side.

There is potentially a lot of commits announced by the remote depending
on how many refs there are in the remote repository, and for every one
of them we need to search for it in our object database and, if found,
parse the corresponding object to find out whether it is a candidate for
the cutoff date. This can be sped up by trying to look up commits via
the commit-graph first, which is a lot more efficient.

Benchmarks in a repository with about 2,1 million refs and an up-to-date
commit-graph show a 20% speedup when mirror-fetching:

    Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
      Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
      Range (min … max):   74.145 s … 76.862 s    5 runs

    Benchmark 2: git fetch --atomic +refs/*:refs/* (HEAD)
      Time (mean ± σ):     62.350 s ±  0.854 s    [User: 55.412 s, System: 9.976 s]
      Range (min … max):   61.224 s … 63.216 s    5 runs

    Summary
      'git fetch --atomic +refs/*:refs/* (HEAD)' ran
        1.21 ± 0.02 times faster than 'git fetch --atomic +refs/*:refs/* (v2.35.0)'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 fetch-pack.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index dd6ec449f2..c5967e228e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -696,26 +696,30 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 	for (ref = *refs; ref; ref = ref->next) {
-		struct object *o;
+		struct commit *commit;
 
-		if (!has_object_file_with_flags(&ref->old_oid,
+		commit = lookup_commit_in_graph(the_repository, &ref->old_oid);
+		if (!commit) {
+			struct object *o;
+
+			if (!has_object_file_with_flags(&ref->old_oid,
 						OBJECT_INFO_QUICK |
-							OBJECT_INFO_SKIP_FETCH_OBJECT))
-			continue;
-		o = parse_object(the_repository, &ref->old_oid);
-		if (!o)
-			continue;
+						OBJECT_INFO_SKIP_FETCH_OBJECT))
+				continue;
+			o = parse_object(the_repository, &ref->old_oid);
+			if (!o || o->type != OBJ_COMMIT)
+				continue;
+
+			commit = (struct commit *)o;
+		}
 
 		/*
 		 * We already have it -- which may mean that we were
 		 * in sync with the other side at some time after
 		 * that (it is OK if we guess wrong here).
 		 */
-		if (o->type == OBJ_COMMIT) {
-			struct commit *commit = (struct commit *)o;
-			if (!cutoff || cutoff < commit->date)
-				cutoff = commit->date;
-		}
+		if (!cutoff || cutoff < commit->date)
+			cutoff = commit->date;
 	}
 	trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 
-- 
2.35.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/2] fetch: skip computing output width when not printing anything
  2022-02-02 12:51 ` [PATCH v2 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2022-02-02 12:51   ` [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
@ 2022-02-02 12:51   ` Patrick Steinhardt
  2022-02-09 18:10     ` Christian Couder
  2022-02-10 12:28   ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2 siblings, 1 reply; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-02 12:51 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Derrick Stolee

[-- Attachment #1: Type: text/plain, Size: 2789 bytes --]

When updating references via git-fetch(1), then by default we report to
the user which references have been changed. This output is formatted in
a nice table such that the different columns are aligned. Because the
first column contains abbreviated object IDs we thus need to iterate
over all refs which have changed and compute the minimum length for
their respective abbreviated hashes. While this effort makes sense in
most cases, it is wasteful when the user passes the `--quiet` flag: we
don't print the summary, but still compute the length.

Skip computing the summary width when the user asked for us to be quiet.
This gives us a small speedup of nearly 10% when doing a dry-run
mirror-fetch in a repository with thousands of references being updated:

    Benchmark 1: git fetch --prune --dry-run +refs/*:refs/* (HEAD~)
      Time (mean ± σ):     34.048 s ±  0.233 s    [User: 30.739 s, System: 4.640 s]
      Range (min … max):   33.785 s … 34.296 s    5 runs

    Benchmark 2: git fetch --prune --dry-run +refs/*:refs/* (HEAD)
      Time (mean ± σ):     30.768 s ±  0.287 s    [User: 27.534 s, System: 4.565 s]
      Range (min … max):   30.432 s … 31.181 s    5 runs

    Summary
      'git fetch --prune --dry-run +refs/*:refs/* (HEAD)' ran
        1.11 ± 0.01 times faster than 'git fetch --prune --dry-run +refs/*:refs/* (HEAD~)'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fetch.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5f06b21f8e..ebbde5d56d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1093,12 +1093,15 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	struct ref *rm;
 	char *url;
 	int want_status;
-	int summary_width = transport_summary_width(ref_map);
+	int summary_width = 0;
 
 	rc = open_fetch_head(&fetch_head);
 	if (rc)
 		return -1;
 
+	if (verbosity >= 0)
+		summary_width = transport_summary_width(ref_map);
+
 	if (raw_url)
 		url = transport_anonymize_url(raw_url);
 	else
@@ -1344,7 +1347,6 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
 	int url_len, i, result = 0;
 	struct ref *ref, *stale_refs = get_stale_heads(rs, ref_map);
 	char *url;
-	int summary_width = transport_summary_width(stale_refs);
 	const char *dangling_msg = dry_run
 		? _("   (%s will become dangling)")
 		: _("   (%s has become dangling)");
@@ -1373,6 +1375,8 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
 	}
 
 	if (verbosity >= 0) {
+		int summary_width = transport_summary_width(stale_refs);
+
 		for (ref = stale_refs; ref; ref = ref->next) {
 			struct strbuf sb = STRBUF_INIT;
 			if (!shown_url) {
-- 
2.35.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff
  2022-02-02 12:51   ` [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
@ 2022-02-09 18:01     ` Christian Couder
  2022-02-10 11:43       ` Patrick Steinhardt
  0 siblings, 1 reply; 16+ messages in thread
From: Christian Couder @ 2022-02-09 18:01 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Taylor Blau, Derrick Stolee

On Mon, Feb 7, 2022 at 7:03 AM Patrick Steinhardt <ps@pks.im> wrote:

> Benchmarks in a repository with about 2,1 million refs and an up-to-date
> commit-graph show a 20% speedup when mirror-fetching:
>
>     Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
>       Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
>       Range (min … max):   74.145 s … 76.862 s    5 runs
>
>     Benchmark 2: git fetch --atomic +refs/*:refs/* (HEAD)
>       Time (mean ± σ):     62.350 s ±  0.854 s    [User: 55.412 s, System: 9.976 s]
>       Range (min … max):   61.224 s … 63.216 s    5 runs
>
>     Summary
>       'git fetch --atomic +refs/*:refs/* (HEAD)' ran
>         1.21 ± 0.02 times faster than 'git fetch --atomic +refs/*:refs/* (v2.35.0)'

The commit message and code make sense to me, but I wonder if there is
a reason why --atomic is used when fetching.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/2] fetch: skip computing output width when not printing anything
  2022-02-02 12:51   ` [PATCH v2 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
@ 2022-02-09 18:10     ` Christian Couder
  2022-02-10 12:23       ` Patrick Steinhardt
  0 siblings, 1 reply; 16+ messages in thread
From: Christian Couder @ 2022-02-09 18:10 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Taylor Blau, Derrick Stolee

On Wed, Feb 2, 2022 at 5:13 PM Patrick Steinhardt <ps@pks.im> wrote:

> Skip computing the summary width when the user asked for us to be quiet.

There is a --quiet option for git fetch, so here we can expect that it
will be used to test this speedup...

> This gives us a small speedup of nearly 10% when doing a dry-run
> mirror-fetch in a repository with thousands of references being updated:
>
>     Benchmark 1: git fetch --prune --dry-run +refs/*:refs/* (HEAD~)
>       Time (mean ± σ):     34.048 s ±  0.233 s    [User: 30.739 s, System: 4.640 s]
>       Range (min … max):   33.785 s … 34.296 s    5 runs
>
>     Benchmark 2: git fetch --prune --dry-run +refs/*:refs/* (HEAD)
>       Time (mean ± σ):     30.768 s ±  0.287 s    [User: 27.534 s, System: 4.565 s]
>       Range (min … max):   30.432 s … 31.181 s    5 runs
>
>     Summary
>       'git fetch --prune --dry-run +refs/*:refs/* (HEAD)' ran
>         1.11 ± 0.01 times faster than 'git fetch --prune --dry-run +refs/*:refs/* (HEAD~)'

...but --prune and --dry-run are used for testing. It would be nice if
this discrepancy was explained a bit.

Otherwise the commit message and code look good to me.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff
  2022-02-09 18:01     ` Christian Couder
@ 2022-02-10 11:43       ` Patrick Steinhardt
  0 siblings, 0 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-10 11:43 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Taylor Blau, Derrick Stolee

[-- Attachment #1: Type: text/plain, Size: 2148 bytes --]

On Wed, Feb 09, 2022 at 07:01:54PM +0100, Christian Couder wrote:
> On Mon, Feb 7, 2022 at 7:03 AM Patrick Steinhardt <ps@pks.im> wrote:
> 
> > Benchmarks in a repository with about 2,1 million refs and an up-to-date
> > commit-graph show a 20% speedup when mirror-fetching:
> >
> >     Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
> >       Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
> >       Range (min … max):   74.145 s … 76.862 s    5 runs
> >
> >     Benchmark 2: git fetch --atomic +refs/*:refs/* (HEAD)
> >       Time (mean ± σ):     62.350 s ±  0.854 s    [User: 55.412 s, System: 9.976 s]
> >       Range (min … max):   61.224 s … 63.216 s    5 runs
> >
> >     Summary
> >       'git fetch --atomic +refs/*:refs/* (HEAD)' ran
> >         1.21 ± 0.02 times faster than 'git fetch --atomic +refs/*:refs/* (v2.35.0)'
> 
> The commit message and code make sense to me, but I wonder if there is
> a reason why --atomic is used when fetching.

The repository that I was mirror-fetching into needs to update a big
bunch of references, and doing that via `--atomic` is more efficient
than doing it without, and this shows in the benchmark. I did another
benchmarking run without `--atomic`, and it is indeed about 30 seconds
slower for both cases. But interestingly the relative performance
improvement is still roughly the same:

    Benchmark 1: git fetch +refs/*:refs/* (v2.35.0)
      Time (mean ± σ):     115.587 s ±  2.009 s    [User: 109.874 s, System: 11.305 s]
      Range (min … max):   113.584 s … 118.820 s    5 runs

    Benchmark 2: git fetch +refs/*:refs/* (pks-fetch-pack-optim-v1~)
      Time (mean ± σ):     96.859 s ±  0.624 s    [User: 91.948 s, System: 10.980 s]
      Range (min … max):   96.180 s … 97.875 s    5 runs

    Summary
      'git fetch +refs/*:refs/* (pks-fetch-pack-optim-v1~)' ran
        1.19 ± 0.02 times faster than 'git fetch +refs/*:refs/* (v2.35.0)'

I'll update the commit message to just use this new benchmark so that
the `--atomic` flag doesn't cause any questions.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/2] fetch: skip computing output width when not printing anything
  2022-02-09 18:10     ` Christian Couder
@ 2022-02-10 12:23       ` Patrick Steinhardt
  0 siblings, 0 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-10 12:23 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Taylor Blau, Derrick Stolee

[-- Attachment #1: Type: text/plain, Size: 2177 bytes --]

On Wed, Feb 09, 2022 at 07:10:38PM +0100, Christian Couder wrote:
> On Wed, Feb 2, 2022 at 5:13 PM Patrick Steinhardt <ps@pks.im> wrote:
> 
> > Skip computing the summary width when the user asked for us to be quiet.
> 
> There is a --quiet option for git fetch, so here we can expect that it
> will be used to test this speedup...
> 
> > This gives us a small speedup of nearly 10% when doing a dry-run
> > mirror-fetch in a repository with thousands of references being updated:
> >
> >     Benchmark 1: git fetch --prune --dry-run +refs/*:refs/* (HEAD~)
> >       Time (mean ± σ):     34.048 s ±  0.233 s    [User: 30.739 s, System: 4.640 s]
> >       Range (min … max):   33.785 s … 34.296 s    5 runs
> >
> >     Benchmark 2: git fetch --prune --dry-run +refs/*:refs/* (HEAD)
> >       Time (mean ± σ):     30.768 s ±  0.287 s    [User: 27.534 s, System: 4.565 s]
> >       Range (min … max):   30.432 s … 31.181 s    5 runs
> >
> >     Summary
> >       'git fetch --prune --dry-run +refs/*:refs/* (HEAD)' ran
> >         1.11 ± 0.01 times faster than 'git fetch --prune --dry-run +refs/*:refs/* (HEAD~)'
> 
> ...but --prune and --dry-run are used for testing. It would be nice if
> this discrepancy was explained a bit.
> 
> Otherwise the commit message and code look good to me.

Yeah, I was hiding away the `--quiet` flag here by accident. I used
`--prune` and `--dry-run` to trigger more lines to be printed to console
and to not take into account the time it takes to update local refs and
fetch objects. But doing so without these flags also demonstrates what I
want to:

    Benchmark 1: git fetch --quiet +refs/*:refs/* (pks-fetch-pack-optim-v1~)
      Time (mean ± σ):     96.078 s ±  0.508 s    [User: 91.378 s, System: 10.870 s]
      Range (min … max):   95.449 s … 96.760 s    5 runs

    Benchmark 2: git fetch --quiet +refs/*:refs/* (pks-fetch-pack-optim-v1)
      Time (mean ± σ):     88.214 s ±  0.192 s    [User: 83.274 s, System: 10.978 s]
      Range (min … max):   87.998 s … 88.446 s    5 runs

So again, I'll update the commit message.

Thanks for your feedback!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs
  2022-02-02 12:51 ` [PATCH v2 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2022-02-02 12:51   ` [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
  2022-02-02 12:51   ` [PATCH v2 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
@ 2022-02-10 12:28   ` Patrick Steinhardt
  2022-02-10 12:28     ` [PATCH v3 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
                       ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-10 12:28 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Derrick Stolee, Christian Couder

[-- Attachment #1: Type: text/plain, Size: 4413 bytes --]

Hi,

this is the third version of my patch series which aimn to speed up
mirror-fetches in repos with huge amounts of refs. Again, the only
change compared to v2 is a change in commit messages: Chris has rightly
pointed out that the benchmarks were a bit confusing, so I've updated
them to hopefully be less so.

Thanks for your feedback!

Patrick

Patrick Steinhardt (2):
  fetch-pack: use commit-graph when computing cutoff
  fetch: skip computing output width when not printing anything

 builtin/fetch.c |  8 ++++++--
 fetch-pack.c    | 28 ++++++++++++++++------------
 2 files changed, 22 insertions(+), 14 deletions(-)

Range-diff against v2:
1:  6fac914f0f ! 1:  077d06764c fetch-pack: use commit-graph when computing cutoff
    @@ Commit message
         the commit-graph first, which is a lot more efficient.
     
         Benchmarks in a repository with about 2,1 million refs and an up-to-date
    -    commit-graph show a 20% speedup when mirror-fetching:
    +    commit-graph show an almost 20% speedup when mirror-fetching:
     
    -        Benchmark 1: git fetch --atomic +refs/*:refs/* (v2.35.0)
    -          Time (mean ± σ):     75.264 s ±  1.115 s    [User: 68.199 s, System: 10.094 s]
    -          Range (min … max):   74.145 s … 76.862 s    5 runs
    +        Benchmark 1: git fetch +refs/*:refs/* (v2.35.0)
    +          Time (mean ± σ):     115.587 s ±  2.009 s    [User: 109.874 s, System: 11.305 s]
    +          Range (min … max):   113.584 s … 118.820 s    5 runs
     
    -        Benchmark 2: git fetch --atomic +refs/*:refs/* (HEAD)
    -          Time (mean ± σ):     62.350 s ±  0.854 s    [User: 55.412 s, System: 9.976 s]
    -          Range (min … max):   61.224 s … 63.216 s    5 runs
    +        Benchmark 2: git fetch +refs/*:refs/* (HEAD)
    +          Time (mean ± σ):     96.859 s ±  0.624 s    [User: 91.948 s, System: 10.980 s]
    +          Range (min … max):   96.180 s … 97.875 s    5 runs
     
             Summary
    -          'git fetch --atomic +refs/*:refs/* (HEAD)' ran
    -            1.21 ± 0.02 times faster than 'git fetch --atomic +refs/*:refs/* (v2.35.0)'
    +          'git fetch +refs/*:refs/* (HEAD)' ran
    +            1.19 ± 0.02 times faster than 'git fetch +refs/*:refs/* (v2.35.0)'
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
2:  4b9bbcf795 ! 2:  ef1fd07be5 fetch: skip computing output width when not printing anything
    @@ Commit message
         don't print the summary, but still compute the length.
     
         Skip computing the summary width when the user asked for us to be quiet.
    -    This gives us a small speedup of nearly 10% when doing a dry-run
    -    mirror-fetch in a repository with thousands of references being updated:
    +    This gives us a speedup of nearly 10% when doing a mirror-fetch in a
    +    repository with thousands of references being updated:
     
    -        Benchmark 1: git fetch --prune --dry-run +refs/*:refs/* (HEAD~)
    -          Time (mean ± σ):     34.048 s ±  0.233 s    [User: 30.739 s, System: 4.640 s]
    -          Range (min … max):   33.785 s … 34.296 s    5 runs
    +        Benchmark 1: git fetch --quiet +refs/*:refs/* (HEAD~)
    +          Time (mean ± σ):     96.078 s ±  0.508 s    [User: 91.378 s, System: 10.870 s]
    +          Range (min … max):   95.449 s … 96.760 s    5 runs
     
    -        Benchmark 2: git fetch --prune --dry-run +refs/*:refs/* (HEAD)
    -          Time (mean ± σ):     30.768 s ±  0.287 s    [User: 27.534 s, System: 4.565 s]
    -          Range (min … max):   30.432 s … 31.181 s    5 runs
    +        Benchmark 2: git fetch --quiet +refs/*:refs/* (HEAD)
    +          Time (mean ± σ):     88.214 s ±  0.192 s    [User: 83.274 s, System: 10.978 s]
    +          Range (min … max):   87.998 s … 88.446 s    5 runs
     
             Summary
    -          'git fetch --prune --dry-run +refs/*:refs/* (HEAD)' ran
    -            1.11 ± 0.01 times faster than 'git fetch --prune --dry-run +refs/*:refs/* (HEAD~)'
    +          'git fetch --quiet +refs/*:refs/* (HEAD)' ran
    +            1.09 ± 0.01 times faster than 'git fetch --quiet +refs/*:refs/* (HEAD~)'
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
-- 
2.35.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 1/2] fetch-pack: use commit-graph when computing cutoff
  2022-02-10 12:28   ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
@ 2022-02-10 12:28     ` Patrick Steinhardt
  2022-02-10 12:28     ` [PATCH v3 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
  2022-02-10 18:04     ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Junio C Hamano
  2 siblings, 0 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-10 12:28 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Derrick Stolee, Christian Couder

[-- Attachment #1: Type: text/plain, Size: 3045 bytes --]

During packfile negotiation we iterate over all refs announced by the
remote side to check whether their IDs refer to commits already known to
us. If a commit is known to us already, then its date is a potential
cutoff point for commits we have in common with the remote side.

There is potentially a lot of commits announced by the remote depending
on how many refs there are in the remote repository, and for every one
of them we need to search for it in our object database and, if found,
parse the corresponding object to find out whether it is a candidate for
the cutoff date. This can be sped up by trying to look up commits via
the commit-graph first, which is a lot more efficient.

Benchmarks in a repository with about 2,1 million refs and an up-to-date
commit-graph show an almost 20% speedup when mirror-fetching:

    Benchmark 1: git fetch +refs/*:refs/* (v2.35.0)
      Time (mean ± σ):     115.587 s ±  2.009 s    [User: 109.874 s, System: 11.305 s]
      Range (min … max):   113.584 s … 118.820 s    5 runs

    Benchmark 2: git fetch +refs/*:refs/* (HEAD)
      Time (mean ± σ):     96.859 s ±  0.624 s    [User: 91.948 s, System: 10.980 s]
      Range (min … max):   96.180 s … 97.875 s    5 runs

    Summary
      'git fetch +refs/*:refs/* (HEAD)' ran
        1.19 ± 0.02 times faster than 'git fetch +refs/*:refs/* (v2.35.0)'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 fetch-pack.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index dd6ec449f2..c5967e228e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -696,26 +696,30 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 	for (ref = *refs; ref; ref = ref->next) {
-		struct object *o;
+		struct commit *commit;
 
-		if (!has_object_file_with_flags(&ref->old_oid,
+		commit = lookup_commit_in_graph(the_repository, &ref->old_oid);
+		if (!commit) {
+			struct object *o;
+
+			if (!has_object_file_with_flags(&ref->old_oid,
 						OBJECT_INFO_QUICK |
-							OBJECT_INFO_SKIP_FETCH_OBJECT))
-			continue;
-		o = parse_object(the_repository, &ref->old_oid);
-		if (!o)
-			continue;
+						OBJECT_INFO_SKIP_FETCH_OBJECT))
+				continue;
+			o = parse_object(the_repository, &ref->old_oid);
+			if (!o || o->type != OBJ_COMMIT)
+				continue;
+
+			commit = (struct commit *)o;
+		}
 
 		/*
 		 * We already have it -- which may mean that we were
 		 * in sync with the other side at some time after
 		 * that (it is OK if we guess wrong here).
 		 */
-		if (o->type == OBJ_COMMIT) {
-			struct commit *commit = (struct commit *)o;
-			if (!cutoff || cutoff < commit->date)
-				cutoff = commit->date;
-		}
+		if (!cutoff || cutoff < commit->date)
+			cutoff = commit->date;
 	}
 	trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);
 
-- 
2.35.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 2/2] fetch: skip computing output width when not printing anything
  2022-02-10 12:28   ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2022-02-10 12:28     ` [PATCH v3 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
@ 2022-02-10 12:28     ` Patrick Steinhardt
  2022-02-10 18:04     ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Junio C Hamano
  2 siblings, 0 replies; 16+ messages in thread
From: Patrick Steinhardt @ 2022-02-10 12:28 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Derrick Stolee, Christian Couder

[-- Attachment #1: Type: text/plain, Size: 2737 bytes --]

When updating references via git-fetch(1), then by default we report to
the user which references have been changed. This output is formatted in
a nice table such that the different columns are aligned. Because the
first column contains abbreviated object IDs we thus need to iterate
over all refs which have changed and compute the minimum length for
their respective abbreviated hashes. While this effort makes sense in
most cases, it is wasteful when the user passes the `--quiet` flag: we
don't print the summary, but still compute the length.

Skip computing the summary width when the user asked for us to be quiet.
This gives us a speedup of nearly 10% when doing a mirror-fetch in a
repository with thousands of references being updated:

    Benchmark 1: git fetch --quiet +refs/*:refs/* (HEAD~)
      Time (mean ± σ):     96.078 s ±  0.508 s    [User: 91.378 s, System: 10.870 s]
      Range (min … max):   95.449 s … 96.760 s    5 runs

    Benchmark 2: git fetch --quiet +refs/*:refs/* (HEAD)
      Time (mean ± σ):     88.214 s ±  0.192 s    [User: 83.274 s, System: 10.978 s]
      Range (min … max):   87.998 s … 88.446 s    5 runs

    Summary
      'git fetch --quiet +refs/*:refs/* (HEAD)' ran
        1.09 ± 0.01 times faster than 'git fetch --quiet +refs/*:refs/* (HEAD~)'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fetch.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5b3b18a72f..7ef305c66d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1094,12 +1094,15 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	struct ref *rm;
 	char *url;
 	int want_status;
-	int summary_width = transport_summary_width(ref_map);
+	int summary_width = 0;
 
 	rc = open_fetch_head(&fetch_head);
 	if (rc)
 		return -1;
 
+	if (verbosity >= 0)
+		summary_width = transport_summary_width(ref_map);
+
 	if (raw_url)
 		url = transport_anonymize_url(raw_url);
 	else
@@ -1345,7 +1348,6 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
 	int url_len, i, result = 0;
 	struct ref *ref, *stale_refs = get_stale_heads(rs, ref_map);
 	char *url;
-	int summary_width = transport_summary_width(stale_refs);
 	const char *dangling_msg = dry_run
 		? _("   (%s will become dangling)")
 		: _("   (%s has become dangling)");
@@ -1374,6 +1376,8 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
 	}
 
 	if (verbosity >= 0) {
+		int summary_width = transport_summary_width(stale_refs);
+
 		for (ref = stale_refs; ref; ref = ref->next) {
 			struct strbuf sb = STRBUF_INIT;
 			if (!shown_url) {
-- 
2.35.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs
  2022-02-10 12:28   ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
  2022-02-10 12:28     ` [PATCH v3 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
  2022-02-10 12:28     ` [PATCH v3 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
@ 2022-02-10 18:04     ` Junio C Hamano
  2 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2022-02-10 18:04 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Taylor Blau, Derrick Stolee, Christian Couder

1Patrick Steinhardt <ps@pks.im> writes:

> this is the third version of my patch series which aimn to speed up
> mirror-fetches in repos with huge amounts of refs. Again, the only
> change compared to v2 is a change in commit messages: Chris has rightly
> pointed out that the benchmarks were a bit confusing, so I've updated
> them to hopefully be less so.
>
> Thanks for your feedback!

> Patrick
>
> Patrick Steinhardt (2):
>   fetch-pack: use commit-graph when computing cutoff
>   fetch: skip computing output width when not printing anything

Both changes are based on quite sensible idea.  If we have
precomputed dates for each commit, it makes sense to look it up
before parsing the commit.  If we are not preparing output, there is
no point in computing the output width.

Very simple and potentially effective.

Will queue.  Thanks.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-02-10 18:04 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-28 10:15 [PATCH 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
2022-01-28 10:17 ` [PATCH 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
2022-01-31 22:53   ` Taylor Blau
2022-02-02 12:46     ` Patrick Steinhardt
2022-01-28 10:19 ` [PATCH 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
2022-02-02 12:51 ` [PATCH v2 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
2022-02-02 12:51   ` [PATCH v2 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
2022-02-09 18:01     ` Christian Couder
2022-02-10 11:43       ` Patrick Steinhardt
2022-02-02 12:51   ` [PATCH v2 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
2022-02-09 18:10     ` Christian Couder
2022-02-10 12:23       ` Patrick Steinhardt
2022-02-10 12:28   ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Patrick Steinhardt
2022-02-10 12:28     ` [PATCH v3 1/2] fetch-pack: use commit-graph when computing cutoff Patrick Steinhardt
2022-02-10 12:28     ` [PATCH v3 2/2] fetch: skip computing output width when not printing anything Patrick Steinhardt
2022-02-10 18:04     ` [PATCH v3 0/2] fetch: speed up mirror-fetches with many refs Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).