* [RFC] fetch: support hideRefs to speed up connectivity checks
From: Eric Wong @ 2023-02-09 12:28 UTC
  To: git; +Cc: Patrick Steinhardt

Not sure if this is the right way to go about this...
If it's close, maybe --exclude-hidden=fetch can be supported.
I'm using `receive' for now to minimize the change.

With roughly 800 remotes all fetching to their own
refs/remotes/$REMOTE/* island, the connectivity check[1] gets
expensive for each fetch.

To do a no-op fetch on one $REMOTE out of hundreds, hideRefs now
allows the no-op fetch to take ~30 seconds instead of ~20 minutes
on a noisy, RAM-constrained machine (localhost, so no network latency):

  git -c transfer.hideRefs=refs \
	-c transfer.hideRefs='!refs/remotes/$REMOTE/' \
	fetch $REMOTE

I initially considered passing --negotiation-tip OIDs, but this seems
like an easier solution as I'm not yet familiar with this code
and prefer to avoid writing too much C.

[1] `git rev-list --objects --stdin --not --all --quiet --alternate-refs'
    gets painful w/o enough RAM to cache the repo, even on a SATA-2 SSD.
---
 builtin/fetch.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 12978622d5..473d99fd26 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1131,6 +1131,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	if (!connectivity_checked) {
 		struct check_connected_options opt = CHECK_CONNECTED_INIT;
 
+		opt.exclude_hidden_refs_section = "receive";
 		rm = ref_map;
 		if (check_connected(iterate_ref_map, &rm, &opt)) {
 			rc = error(_("%s did not send all necessary objects\n"), url);
@@ -1324,6 +1325,7 @@ static int check_exist_and_connected(struct ref *ref_map)
 	}
 
 	opt.quiet = 1;
+	opt.exclude_hidden_refs_section = "receive";
 	return check_connected(iterate_ref_map, &rm, &opt);
 }
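
The two hideRefs values in the command above work as a pair: the first hides everything under refs, and the `!`-prefixed one exposes a single remote's refs again. A minimal shell sketch of that matching rule follows, assuming (as I read the git-config documentation) that entries match whole ref-name prefixes and that later entries override earlier ones. `ref_is_hidden` is a hypothetical helper that only models the behaviour; it is not git's implementation, and it ignores the `^` prefix and ref namespaces.

```shell
# ref_is_hidden REF PATTERN...
# Hypothetical model of hideRefs matching, not git's own code.
# Exit status 0 means "hidden", 1 means "visible".
ref_is_hidden () {
	ref=$1
	shift
	hidden=1	# visible unless some pattern hides the ref
	for pat in "$@"
	do
		negated=0
		case $pat in
		'!'*)
			negated=1
			pat=${pat#"!"}
			;;
		esac
		# Strip trailing slashes so "refs/tags/" behaves like "refs/tags".
		while :
		do
			case $pat in
			*/) pat=${pat%/} ;;
			*) break ;;
			esac
		done
		# Match the exact name or anything below it in the hierarchy.
		case $ref in
		"$pat"|"$pat"/*)
			# The last matching entry wins.
			if test "$negated" = 1
			then
				hidden=1
			else
				hidden=0
			fi
			;;
		esac
	done
	return $hidden
}
```

With the patterns `refs` and `!refs/remotes/origin/`, this model reports refs/heads/main and refs/remotes/other/main as hidden and refs/remotes/origin/main as visible, which is the "island" the connectivity check is meant to be limited to.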
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
From: Jonathan Tan @ 2023-02-10 21:49 UTC
  To: Eric Wong; +Cc: Jonathan Tan, git, Patrick Steinhardt

Eric Wong <e@80x24.org> writes:
>   git -c transfer.hideRefs=refs \
> 	-c transfer.hideRefs='!refs/remotes/$REMOTE/' \
> 	fetch $REMOTE
>
> I initially considered passing --negotiation-tip OIDs, but this seems
> like an easier solution as I'm not yet familiar with this code
> and prefer to avoid writing too much C.

--negotiation-tip supports ref name globs too. Would that be sufficient
for your purposes?
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
From: Eric Wong @ 2023-02-10 21:59 UTC
  To: Jonathan Tan; +Cc: git, Patrick Steinhardt

Jonathan Tan <jonathantanmy@google.com> wrote:
> Eric Wong <e@80x24.org> writes:
> >   git -c transfer.hideRefs=refs \
> > 	-c transfer.hideRefs='!refs/remotes/$REMOTE/' \
> > 	fetch $REMOTE
> >
> > I initially considered passing --negotiation-tip OIDs, but this seems
> > like an easier solution as I'm not yet familiar with this code
> > and prefer to avoid writing too much C.
>
> --negotiation-tip supports ref name globs too. Would that be sufficient
> for your purposes?

Yes, I tried using globs but didn't want to figure out how to
pass the resulting OIDs to rev-list.
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
From: Junio C Hamano @ 2023-02-10 22:56 UTC
  To: Eric Wong; +Cc: git, Patrick Steinhardt

Eric Wong <e@80x24.org> writes:

> Not sure if this is the right way to go about this...
> If it's close, maybe --exclude-hidden=fetch can be supported.

Yeah, why not.

I however notice error handling in the codepath that deals with
"--exclude-hidden" is a bit sloppy.

refs.c::parse_hide_refs_config() is nice enough to diagnose a
malformed transfer.hiderefs configuration as an error by returning
-1, and revision.c::hide_refs_config() propagates such an error up,
but revision.c::exclude_hidden_refs() ignores the error from
git_config(), and revision.c::handle_revision_pseudo_opt() ignores
any error from exclude_hidden_refs() anyway.

We may want to tighten it a bit before (ab)using the option in more
contexts.

Thanks.
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
From: Eric Wong @ 2023-02-11 7:53 UTC
  To: Junio C Hamano; +Cc: git, Patrick Steinhardt

Junio C Hamano <gitster@pobox.com> wrote:
> I however notice error handling in the codepath that deals with
> "--exclude-hidden" is a bit sloppy.
>
> refs.c::parse_hide_refs_config() is nice enough to diagnose a
> malformed transfer.hiderefs configuration as an error by returning
> -1, and revision.c::hide_refs_config() propagates such an error up,
> but revision.c::exclude_hidden_refs() ignores the error from
> git_config(), and revision.c::handle_revision_pseudo_opt() ignores
> any error from exclude_hidden_refs() anyway.

Not sure I follow. exclude_hidden_refs() either dies or calls
git_config(). git_config() calls repo_config(), then
configset_iter(). configset_iter() will git_die_config_linenr()
if `fn' (hide_refs_config() in this case) returns < 0.

> We may want to tighten it a bit before (ab)using the option in more
> contexts.
>
> Thanks.
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
From: Junio C Hamano @ 2023-02-11 19:24 UTC
  To: Eric Wong; +Cc: git, Patrick Steinhardt

Eric Wong <e@80x24.org> writes:

> Junio C Hamano <gitster@pobox.com> wrote:
>> I however notice error handling in the codepath that deals with
>> "--exclude-hidden" is a bit sloppy.
>>
>> refs.c::parse_hide_refs_config() is nice enough to diagnose a
>> malformed transfer.hiderefs configuration as an error by returning
>> -1, and revision.c::hide_refs_config() propagates such an error up,
>> but revision.c::exclude_hidden_refs() ignores the error from
>> git_config(), and revision.c::handle_revision_pseudo_opt() ignores
>> any error from exclude_hidden_refs() anyway.
>
> Not sure I follow. exclude_hidden_refs() either dies or calls
> git_config(). git_config() calls repo_config(), then
> configset_iter(). configset_iter() will git_die_config_linenr()
> if `fn' (hide_refs_config() in this case) returns < 0.

Somehow I had this wishful thinking that the return value from
git_config() can be checked and the caller can handle the error more
gracefully, but its return type is void.

We'll die when we see a bad configuration but only when we see
"--exclude-hidden", which is when we need a valid value from there.
That is how it should work, so I am now happier.

Thanks.
* [PATCH v2] fetch: support hideRefs to speed up connectivity checks
From: Eric Wong @ 2023-02-12 9:04 UTC
  To: git; +Cc: Patrick Steinhardt

With roughly 800 remotes all fetching into their own
refs/remotes/$REMOTE/* island, the connectivity check[1] gets
expensive for each fetch on systems which lack sufficient RAM to
cache objects.

To do a no-op fetch on one $REMOTE out of hundreds, hideRefs now
allows the no-op fetch to take ~30 seconds instead of ~20 minutes
on a noisy, RAM-constrained machine (localhost, so no network latency):

  git -c fetch.hideRefs=refs \
	-c fetch.hideRefs='!refs/remotes/$REMOTE/' \
	fetch $REMOTE

[1] `git rev-list --objects --stdin --not --all --quiet --alternate-refs'

Signed-off-by: Eric Wong <e@80x24.org>
---
  Sidenote: I'm curious about the reason $(pwd) is used in some
  places while $PWD seems fine in others, so it doesn't seem to be
  a portability problem. I chose $PWD since it's faster.

 Documentation/git-rev-parse.txt    | 9 +++++----
 Documentation/rev-list-options.txt | 9 +++++----
 builtin/fetch.c                    | 2 ++
 builtin/rev-list.c                 | 2 +-
 revision.c                         | 3 ++-
 t/t5510-fetch.sh                   | 9 +++++++++
 t/t6018-rev-list-glob.sh           | 2 +-
 t/t6021-rev-list-exclude-hidden.sh | 2 +-
 8 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index bcd80692870..f26a7591e37 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -197,10 +197,11 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
---exclude-hidden=[receive|uploadpack]::
-	Do not include refs that would be hidden by `git-receive-pack` or
-	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
-	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+--exclude-hidden=[fetch|receive|uploadpack]::
+	Do not include refs that would be hidden by `git-fetch`,
+	`git-receive-pack` or `git-upload-pack` by consulting the appropriate
+	`fetch.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs`
+	configuration along with `transfer.hideRefs` (see
 	linkgit:git-config[1]). This option affects the next pseudo-ref option
 	`--all` or `--glob` and is cleared after processing them.
 
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index ff68e484069..5e7f3c51792 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -195,10 +195,11 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
---exclude-hidden=[receive|uploadpack]::
-	Do not include refs that would be hidden by `git-receive-pack` or
-	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
-	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+--exclude-hidden=[fetch|receive|uploadpack]::
+	Do not include refs that would be hidden by `git-fetch`,
+	`git-receive-pack` or `git-upload-pack` by consulting the appropriate
+	`fetch.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs`
+	configuration along with `transfer.hideRefs` (see
 	linkgit:git-config[1]). This option affects the next pseudo-ref option
 	`--all` or `--glob` and is cleared after processing them.
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 12978622d51..2763dd969bb 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1131,6 +1131,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	if (!connectivity_checked) {
 		struct check_connected_options opt = CHECK_CONNECTED_INIT;
 
+		opt.exclude_hidden_refs_section = "fetch";
 		rm = ref_map;
 		if (check_connected(iterate_ref_map, &rm, &opt)) {
 			rc = error(_("%s did not send all necessary objects\n"), url);
@@ -1324,6 +1325,7 @@ static int check_exist_and_connected(struct ref *ref_map)
 	}
 
 	opt.quiet = 1;
+	opt.exclude_hidden_refs_section = "fetch";
 	return check_connected(iterate_ref_map, &rm, &opt);
 }
 
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index d42db0b0cc9..2ab3efd233b 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -38,7 +38,7 @@ static const char rev_list_usage[] =
 " --tags\n"
 " --remotes\n"
 " --stdin\n"
-" --exclude-hidden=[receive|uploadpack]\n"
+" --exclude-hidden=[fetch|receive|uploadpack]\n"
 " --quiet\n"
 " ordering output:\n"
 " --topo-order\n"
diff --git a/revision.c b/revision.c
index 21f5f572c22..50940699e4a 100644
--- a/revision.c
+++ b/revision.c
@@ -1574,7 +1574,8 @@ void exclude_hidden_refs(struct ref_exclusions *exclusions, const char *section)
 {
 	struct exclude_hidden_refs_cb cb;
 
-	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
+	if (strcmp(section, "fetch") && strcmp(section, "receive") &&
+	    strcmp(section, "uploadpack"))
 		die(_("unsupported section for hidden refs: %s"), section);
 
 	if (exclusions->hidden_refs_configured)
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index c0b745e33b8..287d6c3a8af 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -1163,6 +1163,15 @@ test_expect_success '--no-show-forced-updates' '
 	)
 '
 
+for section in fetch transfer
+do
+	test_expect_success "$section.hideRefs affects connectivity check" '
+		GIT_TRACE="$PWD"/trace git -c $section.hideRefs=refs -c \
+			$section.hideRefs="!refs/tags/" fetch &&
+		grep "git rev-list .*--exclude-hidden=fetch" trace
+	'
+done
+
 setup_negotiation_tip () {
 	SERVER="$1"
 	URL="$2"
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index aabf590dda6..67d523d4057 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -187,7 +187,7 @@ test_expect_success 'rev-parse --exclude=ref with --remotes=glob' '
 	compare rev-parse "--exclude=upstream/x --remotes=upstream/*" "upstream/one upstream/two"
 '
 
-for section in receive uploadpack
+for section in fetch receive uploadpack
 do
 	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
 		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags" "--exclude-hidden=$section --all"
diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
index 32b2b094138..e219ac86738 100755
--- a/t/t6021-rev-list-exclude-hidden.sh
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -21,7 +21,7 @@ test_expect_success 'invalid section' '
 	test_cmp expected err
 '
 
-for section in receive uploadpack
+for section in fetch receive uploadpack
 do
 	test_expect_success "$section: passed multiple times" '
 		echo "fatal: --exclude-hidden= passed more than once" >expected &&
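
One practical note on the command in the commit message: as written, the single quotes keep the shell from expanding $REMOTE, so it reads as a template rather than something to paste. A small sketch of filling it in per remote follows. `fetch_one_remote_cmd` is a hypothetical helper name, the `fetch.hideRefs` key only has an effect on a git that carries this patch, and the helper assumes remote names without whitespace or shell metacharacters.

```shell
# fetch_one_remote_cmd REMOTE
# Hypothetical helper: prints (does not run) the fetch command that
# hides every ref except those under refs/remotes/REMOTE/ from the
# connectivity check.
fetch_one_remote_cmd () {
	remote=$1
	printf 'git -c fetch.hideRefs=refs -c fetch.hideRefs=!refs/remotes/%s/ fetch %s\n' \
		"$remote" "$remote"
}
```

Printing the command keeps the sketch free of side effects. To run it directly in a script, the same two `-c` options can be passed to git with the second one double-quoted, e.g. `-c "fetch.hideRefs=!refs/remotes/$remote/"`.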
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
From: Jeff King @ 2023-02-13 20:53 UTC
  To: Eric Wong; +Cc: git, Patrick Steinhardt

On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:

> Sidenote: I'm curious about the reason $(pwd) is used in some
> places while $PWD seems fine in others, so it doesn't seem to be
> a portability problem. I chose $PWD since it's faster.

It sometimes matters; one is a Windows path (with "C:\", etc) and one is
a Unix-style path. Many spots are happy with either type, but it
sometimes bites us when doing string comparisons, or in a few specific
cases. See

  https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/

for an example.

-Peff
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
From: Philip Oakley @ 2023-02-13 23:30 UTC
  To: Jeff King, Eric Wong; +Cc: git, Patrick Steinhardt

On 13/02/2023 20:53, Jeff King wrote:
> On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:
>
>> Sidenote: I'm curious about the reason $(pwd) is used in some
>> places while $PWD seems fine in others, so it doesn't seem to be
>> a portability problem. I chose $PWD since it's faster.
> It sometimes matters; one is a Windows path (with "C:\", etc) and one is
> a Unix-style path. Many spots are happy with either type, but it
> sometimes bites us when doing string comparisons, or in a few specific
> cases. See
>
>   https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/
>
> for an example.
>
There is guidance in t/README L680-684, though it may not be that easy to
spot.

A more recent patch was
https://lore.kernel.org/git/4f5c5633-f5a2-3c99-329e-3057b8d447d2@kdbg.org/
with slightly more details.

Philip
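
To make the string-comparison hazard concrete, here is a small sketch that classifies a path string by its style. `path_style` is a hypothetical helper, not something from git's test suite. My understanding of t/README is that on Windows $PWD is the Unix-style form and $(pwd) the Windows-style one, but treat that mapping as an assumption; on Unix-like systems both normally yield the same string.

```shell
# path_style PATH
# Hypothetical helper: reports whether a path string looks like a
# Windows drive path (C:/foo, C:\foo), a Unix-style absolute path
# (/c/foo), or neither.
path_style () {
	case $1 in
	[A-Za-z]:*) echo windows ;;
	/*) echo unix ;;
	*) echo relative ;;
	esac
}
```

Running `path_style "$PWD"` and `path_style "$(pwd)"` side by side shows whether the two forms differ in a given environment. When they do, a plain string comparison between them fails even though both name the same directory.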
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
From: Jeff King @ 2023-02-14 1:40 UTC
  To: Philip Oakley; +Cc: Eric Wong, git, Patrick Steinhardt

On Mon, Feb 13, 2023 at 11:30:35PM +0000, Philip Oakley wrote:

> On 13/02/2023 20:53, Jeff King wrote:
> > On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:
> >
> >> Sidenote: I'm curious about the reason $(pwd) is used in some
> >> places while $PWD seems fine in others, so it doesn't seem to be
> >> a portability problem. I chose $PWD since it's faster.
> > It sometimes matters; one is a Windows path (with "C:\", etc) and one is
> > a Unix-style path. Many spots are happy with either type, but it
> > sometimes bites us when doing string comparisons, or in a few specific
> > cases. See
> >
> >   https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/
> >
> > for an example.
> >
> There is guidance in t/README L680-684, though it may not be that easy to
> spot.
>
> A more recent patch was
> https://lore.kernel.org/git/4f5c5633-f5a2-3c99-329e-3057b8d447d2@kdbg.org/
> with slightly more details.

Thanks, both explanations are much better than the one I found (my
digging in the archive consisted of "I know JSixt has corrected me on
this at least once...").

-Peff
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
From: Eric Wong @ 2023-02-16 1:32 UTC
  To: git; +Cc: Jeff King, Philip Oakley, Patrick Steinhardt

Jeff King <peff@peff.net> wrote:
> On Mon, Feb 13, 2023 at 11:30:35PM +0000, Philip Oakley wrote:
> > On 13/02/2023 20:53, Jeff King wrote:
> > > On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:
> > >
> > >> Sidenote: I'm curious about the reason $(pwd) is used in some
> > >> places while $PWD seems fine in others, so it doesn't seem to be
> > >> a portability problem. I chose $PWD since it's faster.
> > > It sometimes matters; one is a Windows path (with "C:\", etc) and one is
> > > a Unix-style path. Many spots are happy with either type, but it
> > > sometimes bites us when doing string comparisons, or in a few specific
> > > cases. See
> > >
> > >   https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/
> > >
> > > for an example.
> > >
> > There is guidance in t/README L680-684, though it may not be that easy to
> > spot.
> >
> > A more recent patch was
> > https://lore.kernel.org/git/4f5c5633-f5a2-3c99-329e-3057b8d447d2@kdbg.org/
> > with slightly more details.
>
> Thanks, both explanations are much better than the one I found (my
> digging in the archive consisted of "I know JSixt has corrected me on
> this at least once...").

Thanks both. Looks like my use of GIT_TRACE="$PWD"/trace is fine
and there are plenty of examples where $PWD is used for GIT_TRACE*
in our test suite (`git grep GIT_TRACE.*PWD')

Any comments on the actual change itself? Thanks again.
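
For anyone wanting to repeat the check from the new t5510 test by hand, the grep can be wrapped in a small function. This is only a sketch: `trace_shows_exclude_hidden` is a hypothetical name, and the pattern is the one the test in the patch uses, with the section made a parameter.

```shell
# trace_shows_exclude_hidden TRACE_FILE SECTION
# Hypothetical helper: succeeds if the GIT_TRACE output in TRACE_FILE
# contains a rev-list invocation carrying --exclude-hidden=SECTION.
trace_shows_exclude_hidden () {
	trace_file=$1
	section=$2
	grep -q "git rev-list .*--exclude-hidden=$section" "$trace_file"
}
```

A possible use, mirroring the test: run `GIT_TRACE="$PWD"/trace git -c fetch.hideRefs=refs fetch` in a repository, then `trace_shows_exclude_hidden trace fetch`.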