git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Taylor Blau" <me@ttaylorr.com>, "Jeff King" <peff@peff.net>
Subject: [PATCH v5 7/7] receive-pack: only use visible refs for connectivity check
Date: Fri, 11 Nov 2022 07:50:18 +0100	[thread overview]
Message-ID: <f5f18f3939276e4dd63c66d5fb2a72231f9a7a06.1668149149.git.ps@pks.im> (raw)
In-Reply-To: <cover.1668149149.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4675 bytes --]

When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.

We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of objects reachable that have previously only ever been hidden,
whereas the common case is to push incremental changes that build on top
of the visible object graph.

This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs hidden. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 refs
as it is configured in Gitaly leads to a 4.5-fold speedup:

    Benchmark 1: main
      Time (mean ± σ):     30.977 s ±  0.157 s    [User: 30.226 s, System: 1.083 s]
      Range (min … max):   30.796 s … 31.071 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      6.799 s ±  0.063 s    [User: 6.803 s, System: 0.354 s]
      Range (min … max):    6.729 s …  6.850 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        4.56 ± 0.05 times faster than 'main'

As we mostly go through the same codepaths even in the case where there
are no hidden refs at all compared to the code before there is no change
in performance when no refs are hidden:

    Benchmark 1: main
      Time (mean ± σ):     48.188 s ±  0.432 s    [User: 49.326 s, System: 5.009 s]
      Range (min … max):   47.706 s … 48.539 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     48.027 s ±  0.500 s    [User: 48.934 s, System: 5.025 s]
      Range (min … max):   47.504 s … 48.500 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.00 ± 0.01 times faster than 'main'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c | 2 ++
 connected.c            | 3 +++
 connected.h            | 7 +++++++
 3 files changed, 12 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1e24b31a0a..a90af30363 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1929,6 +1929,8 @@ static void execute_commands(struct command *commands,
 	opt.err_fd = err_fd;
 	opt.progress = err_fd && !quiet;
 	opt.env = tmp_objdir_env(tmp_objdir);
+	opt.exclude_hidden_refs_section = "receive";
+
 	if (check_connected(iterate_receive_command_list, &data, &opt))
 		set_connectivity_errors(commands, si);
 
diff --git a/connected.c b/connected.c
index 74a20cb32e..4f6388eed7 100644
--- a/connected.c
+++ b/connected.c
@@ -100,6 +100,9 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		strvec_push(&rev_list.args, "--not");
+		if (opt->exclude_hidden_refs_section)
+			strvec_pushf(&rev_list.args, "--exclude-hidden=%s",
+				     opt->exclude_hidden_refs_section);
 		strvec_push(&rev_list.args, "--all");
 	}
 	strvec_push(&rev_list.args, "--quiet");
diff --git a/connected.h b/connected.h
index 6e59c92aa3..16b2c84f2e 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,13 @@ struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If not NULL, use `--exclude-hidden=$section` to exclude all refs
+	 * hidden via the `$section.hideRefs` config from the set of
+	 * already-reachable refs.
+	 */
+	const char *exclude_hidden_refs_section;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }
-- 
2.38.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2022-11-11  6:50 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
2022-10-28 14:54   ` Ævar Arnfjörð Bjarmason
2022-10-28 18:12   ` Junio C Hamano
2022-10-30 18:49     ` Taylor Blau
2022-10-31 13:10     ` Patrick Steinhardt
2022-11-01  1:16       ` Taylor Blau
2022-10-28 14:42 ` [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
2022-10-28 15:01   ` Ævar Arnfjörð Bjarmason
2022-10-31 14:21     ` Patrick Steinhardt
2022-10-31 15:36       ` Ævar Arnfjörð Bjarmason
2022-10-30 19:09   ` Taylor Blau
2022-10-31 14:45     ` Patrick Steinhardt
2022-11-01  1:28       ` Taylor Blau
2022-11-01  7:20         ` Patrick Steinhardt
2022-11-01 11:53           ` Patrick Steinhardt
2022-11-02  1:05             ` Taylor Blau
2022-11-01  8:28       ` Jeff King
2022-10-28 16:40 ` [PATCH 0/2] " Junio C Hamano
2022-11-01  1:30 ` Taylor Blau
2022-11-01  9:00 ` Jeff King
2022-11-01 11:49   ` Patrick Steinhardt
2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
2022-11-03 14:37   ` [PATCH v2 1/3] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-03 14:37   ` [PATCH v2 2/3] revision: add new parameter to specify all visible refs Patrick Steinhardt
2022-11-05 12:46     ` Jeff King
2022-11-07  8:20       ` Patrick Steinhardt
2022-11-08 14:32         ` Jeff King
2022-11-05 12:55     ` Jeff King
2022-11-03 14:37   ` [PATCH v2 3/3] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-05  0:40   ` [PATCH v2 0/3] " Taylor Blau
2022-11-05 12:55     ` Jeff King
2022-11-05 12:52   ` Jeff King
2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 2/6] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-07 12:51     ` Ævar Arnfjörð Bjarmason
2022-11-08  9:11       ` Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
2022-11-07 17:07       ` Ævar Arnfjörð Bjarmason
2022-11-08  9:48         ` Patrick Steinhardt
2022-11-08  9:22       ` Patrick Steinhardt
2022-11-08  0:57     ` Taylor Blau
2022-11-08  8:16       ` Patrick Steinhardt
2022-11-08 14:42         ` Jeff King
2022-11-07 12:16   ` [PATCH v3 5/6] revparse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-08 14:44     ` Jeff King
2022-11-07 12:16   ` [PATCH v3 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-08  0:59   ` [PATCH v3 0/6] " Taylor Blau
2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-08 13:36     ` Ævar Arnfjörð Bjarmason
2022-11-08 14:49       ` Patrick Steinhardt
2022-11-08 14:51     ` Jeff King
2022-11-08 10:03   ` [PATCH v4 2/6] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-08 15:07     ` Jeff King
2022-11-08 21:13       ` Taylor Blau
2022-11-11  5:48       ` Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 5/6] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-08 10:04   ` [PATCH v4 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
2022-11-11  6:49   ` [PATCH v5 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
2022-11-11  6:49   ` [PATCH v5 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 3/7] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-11  6:50   ` Patrick Steinhardt [this message]
2022-11-11 22:18   ` [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check Taylor Blau
2022-11-15 17:26     ` Jeff King
2022-11-16 21:22       ` Taylor Blau
2022-11-16 22:04         ` Jeff King
2022-11-16 22:33           ` Taylor Blau
2022-11-17  5:45             ` Patrick Steinhardt
2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 3/7] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-17  5:47   ` [PATCH v6 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-17  5:47   ` [PATCH v6 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-17 15:03   ` [PATCH v6 0/7] " Jeff King
2022-11-17 21:24     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f5f18f3939276e4dd63c66d5fb2a72231f9a7a06.1668149149.git.ps@pks.im \
    --to=ps@pks.im \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).