* [RFC] fetch: support hideRefs to speed up connectivity checks
@ 2023-02-09 12:28 Eric Wong
2023-02-10 21:49 ` Jonathan Tan
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Eric Wong @ 2023-02-09 12:28 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt
Not sure if this is the right way to go about this...
If it's close, maybe --exclude-hidden=fetch can be supported.
I'm using `receive' for now to minimize the change.

With roughly 800 remotes all fetching to their own refs/remotes/$REMOTE/*
island, the connectivity check[1] gets expensive for each fetch.

To do a no-op fetch on one $REMOTE out of hundreds, hideRefs now
allows the no-op fetch to take ~30 seconds instead of ~20 minutes
on a noisy, RAM-constrained machine (localhost, so no network latency):

  git -c transfer.hideRefs=refs \
	-c transfer.hideRefs='!refs/remotes/$REMOTE/' \
	fetch $REMOTE

I initially considered passing --negotiation-tip OIDs, but this seems
like an easier solution as I'm not yet familiar with this code
and prefer to avoid writing too much C.

[1] `git rev-list --objects --stdin --not --all --quiet --alternate-refs'
gets painful w/o enough RAM to cache the repo, even on a SATA-2 SSD.
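
A rough sketch of why the two hideRefs values above narrow the check to a single island, assuming the matching rules described in git-config(1) for transfer.hideRefs: a value hides every ref under that hierarchy, a leading `!` negates the value, and later values override earlier ones. `ref_is_hidden` is a hypothetical helper written for illustration, not Git's implementation; it ignores the `^` prefix and ref namespaces.

```shell
# Illustrative only: approximates hideRefs matching in plain shell.
# usage: ref_is_hidden <refname> <hideRefs value>...
# Returns 0 when the ref would be hidden, 1 otherwise.
ref_is_hidden () {
	refname=$1
	shift
	hidden=1
	for pat in "$@"
	do
		negated=
		case "$pat" in
		'!'*)
			negated=yes
			pat=${pat#?}	# drop the leading "!"
			;;
		esac
		pat=${pat%/}	# treat "refs/tags/" and "refs/tags" alike
		case "$refname" in
		"$pat"|"$pat"/*)
			# Later values override earlier ones, so the
			# verdict of the last matching value is kept.
			if test -n "$negated"
			then
				hidden=1
			else
				hidden=0
			fi
			;;
		esac
	done
	return $hidden
}

REMOTE=origin
for ref in refs/heads/main refs/remotes/origin/main refs/remotes/other/main
do
	if ref_is_hidden "$ref" refs "!refs/remotes/$REMOTE/"
	then
		echo "hidden  $ref"
	else
		echo "visible $ref"
	fi
done
```

Run as a script, this should report refs/heads/main and refs/remotes/other/main as hidden and refs/remotes/origin/main as visible. The double quotes let $REMOTE expand; inside the single quotes used in the message, $REMOTE stays literal and has to be read as a placeholder.
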
---
builtin/fetch.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 12978622d5..473d99fd26 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1131,6 +1131,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	if (!connectivity_checked) {
 		struct check_connected_options opt = CHECK_CONNECTED_INIT;
 
+		opt.exclude_hidden_refs_section = "receive";
 		rm = ref_map;
 		if (check_connected(iterate_ref_map, &rm, &opt)) {
 			rc = error(_("%s did not send all necessary objects\n"), url);
@@ -1324,6 +1325,7 @@ static int check_exist_and_connected(struct ref *ref_map)
 	}
 
 	opt.quiet = 1;
+	opt.exclude_hidden_refs_section = "receive";
 	return check_connected(iterate_ref_map, &rm, &opt);
 }
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
2023-02-09 12:28 [RFC] fetch: support hideRefs to speed up connectivity checks Eric Wong
@ 2023-02-10 21:49 ` Jonathan Tan
2023-02-10 21:59 ` Eric Wong
2023-02-10 22:56 ` Junio C Hamano
2023-02-12 9:04 ` [PATCH v2] " Eric Wong
2 siblings, 1 reply; 11+ messages in thread
From: Jonathan Tan @ 2023-02-10 21:49 UTC (permalink / raw)
To: Eric Wong; +Cc: Jonathan Tan, git, Patrick Steinhardt
Eric Wong <e@80x24.org> writes:
> git -c transfer.hideRefs=refs \
> -c transfer.hideRefs='!refs/remotes/$REMOTE/' \
> fetch $REMOTE
>
> I initially considered passing --negotiation-tip OIDs, but this seems
> like an easier solution as I'm not yet familiar with this code
> and prefer to avoid writing too much C.
--negotiation-tip supports ref name globs too. Would that be sufficient
for your purposes?
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
2023-02-10 21:49 ` Jonathan Tan
@ 2023-02-10 21:59 ` Eric Wong
0 siblings, 0 replies; 11+ messages in thread
From: Eric Wong @ 2023-02-10 21:59 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, Patrick Steinhardt
Jonathan Tan <jonathantanmy@google.com> wrote:
> Eric Wong <e@80x24.org> writes:
> > git -c transfer.hideRefs=refs \
> > -c transfer.hideRefs='!refs/remotes/$REMOTE/' \
> > fetch $REMOTE
> >
> > I initially considered passing --negotiation-tip OIDs, but this seems
> > like an easier solution as I'm not yet familiar with this code
> > and prefer to avoid writing too much C.
>
> --negotiation-tip supports ref name globs too. Would that be sufficient
> for your purposes?
Yes, I tried using globs but didn't want to figure out how to
pass the resulting OIDs to rev-list.
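
For reference, a minimal sketch of the glob form mentioned above. `--negotiation-tip` is an existing `git fetch` option that accepts a glob on ref names; the wrapper `fetch_with_tip_glob` and the `DRY_RUN` variable are hypothetical, added only so the command can be inspected without contacting a remote. As this reply suggests, the option narrows what is used during negotiation and does not by itself change which refs the local rev-list connectivity check walks.

```shell
# Hypothetical wrapper; only `git fetch --negotiation-tip=<glob>` is real.
fetch_with_tip_glob () {
	remote=$1
	# Quote the glob so the shell hands it to git unexpanded.
	set -- git fetch "--negotiation-tip=refs/remotes/$remote/*" "$remote"
	if test -n "${DRY_RUN:-}"
	then
		printf '%s\n' "$*"	# show the command instead of running it
	else
		"$@"
	fi
}

DRY_RUN=1 fetch_with_tip_glob origin
```

Without DRY_RUN set, the function runs the fetch itself.
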
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
2023-02-09 12:28 [RFC] fetch: support hideRefs to speed up connectivity checks Eric Wong
2023-02-10 21:49 ` Jonathan Tan
@ 2023-02-10 22:56 ` Junio C Hamano
2023-02-11 7:53 ` Eric Wong
2023-02-12 9:04 ` [PATCH v2] " Eric Wong
2 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2023-02-10 22:56 UTC (permalink / raw)
To: Eric Wong; +Cc: git, Patrick Steinhardt
Eric Wong <e@80x24.org> writes:
> Not sure if this is the right way to go about this...
> If it's close, maybe --exclude-hidden=fetch can be supported.
Yeah, why not.
I however notice error handling in the codepath that deals with
"--exclude-hidden" is a bit sloppy.
refs.c::parse_hide_refs_config() is nice enough to diagnose a
malformed transfer.hiderefs configuration as an error by returning
-1, and revision.c::hide_refs_config() propagates such an error up,
but revision.c::exclude_hidden_refs() ignores the error from
git_config(), and revision.c::handle_revision_pseudo_opt() ignores
any error from exclude_hidden_refs() anyway.
We may want to tighten it a bit before (ab)using the option in more
contexts.
Thanks.
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
2023-02-10 22:56 ` Junio C Hamano
@ 2023-02-11 7:53 ` Eric Wong
2023-02-11 19:24 ` Junio C Hamano
0 siblings, 1 reply; 11+ messages in thread
From: Eric Wong @ 2023-02-11 7:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Patrick Steinhardt
Junio C Hamano <gitster@pobox.com> wrote:
> I however notice error handling in the codepath that deals with
> "--exclude-hidden" is a bit sloppy.
>
> refs.c::parse_hide_refs_config() is nice enough to diagnose a
> malformed transfer.hiderefs configuration as an error by returning
> -1, and revision.c::hide_refs_config() propagates such an error up,
> but revision.c::exclude_hidden_refs() ignores the error from
> git_config(), and revision.c::handle_revision_pseudo_opt() ignores
> any error from exclude_hidden_refs() anyway.
Not sure I follow. exclude_hidden_refs() either dies or calls
git_config(). git_config() calls repo_config(), then
configset_iter(). configset_iter() will git_die_config_linenr()
if `fn' (hide_refs_config() in this case) returns < 0.
> We may want to tighten it a bit before (ab)using the option in more
> contexts.
>
> Thanks.
* Re: [RFC] fetch: support hideRefs to speed up connectivity checks
2023-02-11 7:53 ` Eric Wong
@ 2023-02-11 19:24 ` Junio C Hamano
0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2023-02-11 19:24 UTC (permalink / raw)
To: Eric Wong; +Cc: git, Patrick Steinhardt
Eric Wong <e@80x24.org> writes:
> Junio C Hamano <gitster@pobox.com> wrote:
>> I however notice error handling in the codepath that deals with
>> "--exclude-hidden" is a bit sloppy.
>>
>> refs.c::parse_hide_refs_config() is nice enough to diagnose a
>> malformed transfer.hiderefs configuration as an error by returning
>> -1, and revision.c::hide_refs_config() propagates such an error up,
>> but revision.c::exclude_hidden_refs() ignores the error from
>> git_config(), and revision.c::handle_revision_pseudo_opt() ignores
>> any error from exclude_hidden_refs() anyway.
>
> Not sure I follow. exclude_hidden_refs() either dies or calls
> git_config(). git_config() calls repo_config(), then
> configset_iter(). configset_iter() will git_die_config_linenr()
> if `fn' (hide_refs_config() in this case) returns < 0.
Somehow I had this wishful thinking that the return value from
git_config() can be checked and the caller can handle the error more
gracefully, but its return type is void. We'll die when we see a
bad configuration but only when we see "--exclude-hidden", which is
when we need a valid value from there. That is how it should work,
so I am now happier.
Thanks.
* [PATCH v2] fetch: support hideRefs to speed up connectivity checks
2023-02-09 12:28 [RFC] fetch: support hideRefs to speed up connectivity checks Eric Wong
2023-02-10 21:49 ` Jonathan Tan
2023-02-10 22:56 ` Junio C Hamano
@ 2023-02-12 9:04 ` Eric Wong
2023-02-13 20:53 ` Jeff King
2 siblings, 1 reply; 11+ messages in thread
From: Eric Wong @ 2023-02-12 9:04 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt
With roughly 800 remotes all fetching into their own
refs/remotes/$REMOTE/* island, the connectivity check[1] gets
expensive for each fetch on systems which lack sufficient RAM to
cache objects.

To do a no-op fetch on one $REMOTE out of hundreds, hideRefs now
allows the no-op fetch to take ~30 seconds instead of ~20 minutes
on a noisy, RAM-constrained machine (localhost, so no network latency):

  git -c fetch.hideRefs=refs \
	-c fetch.hideRefs='!refs/remotes/$REMOTE/' \
	fetch $REMOTE

[1] `git rev-list --objects --stdin --not --all --quiet --alternate-refs'

Signed-off-by: Eric Wong <e@80x24.org>
---
Sidenote: I'm curious about the reason $(pwd) is used in some
places while $PWD seems fine in others, so it doesn't seem to be
a portability problem. I chose $PWD since it's faster.
Documentation/git-rev-parse.txt | 9 +++++----
Documentation/rev-list-options.txt | 9 +++++----
builtin/fetch.c | 2 ++
builtin/rev-list.c | 2 +-
revision.c | 3 ++-
t/t5510-fetch.sh | 9 +++++++++
t/t6018-rev-list-glob.sh | 2 +-
t/t6021-rev-list-exclude-hidden.sh | 2 +-
8 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index bcd80692870..f26a7591e37 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -197,10 +197,11 @@ respectively, and they must begin with `refs/` when applied to `--glob`
or `--all`. If a trailing '/{asterisk}' is intended, it must be given
explicitly.
---exclude-hidden=[receive|uploadpack]::
- Do not include refs that would be hidden by `git-receive-pack` or
- `git-upload-pack` by consulting the appropriate `receive.hideRefs` or
- `uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+--exclude-hidden=[fetch|receive|uploadpack]::
+ Do not include refs that would be hidden by `git-fetch`,
+ `git-receive-pack` or `git-upload-pack` by consulting the appropriate
+ `fetch.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs`
+ configuration along with `transfer.hideRefs` (see
linkgit:git-config[1]). This option affects the next pseudo-ref option
`--all` or `--glob` and is cleared after processing them.
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index ff68e484069..5e7f3c51792 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -195,10 +195,11 @@ respectively, and they must begin with `refs/` when applied to `--glob`
or `--all`. If a trailing '/{asterisk}' is intended, it must be given
explicitly.
---exclude-hidden=[receive|uploadpack]::
- Do not include refs that would be hidden by `git-receive-pack` or
- `git-upload-pack` by consulting the appropriate `receive.hideRefs` or
- `uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+--exclude-hidden=[fetch|receive|uploadpack]::
+ Do not include refs that would be hidden by `git-fetch`,
+ `git-receive-pack` or `git-upload-pack` by consulting the appropriate
+ `fetch.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs`
+ configuration along with `transfer.hideRefs` (see
linkgit:git-config[1]). This option affects the next pseudo-ref option
`--all` or `--glob` and is cleared after processing them.
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 12978622d51..2763dd969bb 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1131,6 +1131,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	if (!connectivity_checked) {
 		struct check_connected_options opt = CHECK_CONNECTED_INIT;
 
+		opt.exclude_hidden_refs_section = "fetch";
 		rm = ref_map;
 		if (check_connected(iterate_ref_map, &rm, &opt)) {
 			rc = error(_("%s did not send all necessary objects\n"), url);
@@ -1324,6 +1325,7 @@ static int check_exist_and_connected(struct ref *ref_map)
 	}
 
 	opt.quiet = 1;
+	opt.exclude_hidden_refs_section = "fetch";
 	return check_connected(iterate_ref_map, &rm, &opt);
 }
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index d42db0b0cc9..2ab3efd233b 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -38,7 +38,7 @@ static const char rev_list_usage[] =
" --tags\n"
" --remotes\n"
" --stdin\n"
-" --exclude-hidden=[receive|uploadpack]\n"
+" --exclude-hidden=[fetch|receive|uploadpack]\n"
" --quiet\n"
" ordering output:\n"
" --topo-order\n"
diff --git a/revision.c b/revision.c
index 21f5f572c22..50940699e4a 100644
--- a/revision.c
+++ b/revision.c
@@ -1574,7 +1574,8 @@ void exclude_hidden_refs(struct ref_exclusions *exclusions, const char *section)
 {
 	struct exclude_hidden_refs_cb cb;
 
-	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
+	if (strcmp(section, "fetch") && strcmp(section, "receive") &&
+	    strcmp(section, "uploadpack"))
 		die(_("unsupported section for hidden refs: %s"), section);
 
 	if (exclusions->hidden_refs_configured)
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index c0b745e33b8..287d6c3a8af 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -1163,6 +1163,15 @@ test_expect_success '--no-show-forced-updates' '
 	)
 '
 
+for section in fetch transfer
+do
+	test_expect_success "$section.hideRefs affects connectivity check" '
+		GIT_TRACE="$PWD"/trace git -c $section.hideRefs=refs -c \
+			$section.hideRefs="!refs/tags/" fetch &&
+		grep "git rev-list .*--exclude-hidden=fetch" trace
+	'
+done
+
 setup_negotiation_tip () {
 	SERVER="$1"
 	URL="$2"
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index aabf590dda6..67d523d4057 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -187,7 +187,7 @@ test_expect_success 'rev-parse --exclude=ref with --remotes=glob' '
compare rev-parse "--exclude=upstream/x --remotes=upstream/*" "upstream/one upstream/two"
'
-for section in receive uploadpack
+for section in fetch receive uploadpack
do
test_expect_success "rev-parse --exclude-hidden=$section with --all" '
compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags" "--exclude-hidden=$section --all"
diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
index 32b2b094138..e219ac86738 100755
--- a/t/t6021-rev-list-exclude-hidden.sh
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -21,7 +21,7 @@ test_expect_success 'invalid section' '
test_cmp expected err
'
-for section in receive uploadpack
+for section in fetch receive uploadpack
do
test_expect_success "$section: passed multiple times" '
echo "fatal: --exclude-hidden= passed more than once" >expected &&
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
2023-02-12 9:04 ` [PATCH v2] " Eric Wong
@ 2023-02-13 20:53 ` Jeff King
2023-02-13 23:30 ` Philip Oakley
0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2023-02-13 20:53 UTC (permalink / raw)
To: Eric Wong; +Cc: git, Patrick Steinhardt
On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:
> Sidenote: I'm curious about the reason $(pwd) is used in some
> places while $PWD seems fine in others, so it doesn't seem to be
> a portability problem. I chose $PWD since it's faster.
It sometimes matters; one is a Windows path (with "C:\", etc) and one is
a Unix-style path. Many spots are happy with either type, but it
sometimes bites us when doing string comparisons, or in a few specific
cases. See
https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/
for an example.
-Peff
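
A small sketch for comparing the two spellings on a given platform. On most Unix-like systems both lines should show the same directory; the difference described above is specific to Windows shells. `show_pwd_forms` is a hypothetical helper, not part of Git's test library.

```shell
# Hypothetical helper: print the current directory both ways.
show_pwd_forms () {
	printf 'PWD=%s\n' "$PWD"	# shell variable, no command substitution
	printf 'pwd=%s\n' "$(pwd)"	# output of the pwd command
}

show_pwd_forms
```

If the two lines differ, string comparisons against paths printed by git are where the choice starts to matter.
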
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
2023-02-13 20:53 ` Jeff King
@ 2023-02-13 23:30 ` Philip Oakley
2023-02-14 1:40 ` Jeff King
0 siblings, 1 reply; 11+ messages in thread
From: Philip Oakley @ 2023-02-13 23:30 UTC (permalink / raw)
To: Jeff King, Eric Wong; +Cc: git, Patrick Steinhardt
On 13/02/2023 20:53, Jeff King wrote:
> On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:
>
>> Sidenote: I'm curious about the reason $(pwd) is used in some
>> places while $PWD seems fine in others, so it doesn't seem to be
>> a portability problem. I chose $PWD since it's faster.
> It sometimes matters; one is a Windows path (with "C:\", etc) and one is
> a Unix-style path. Many spots are happy with either type, but it
> sometimes bites us when doing string comparisons, or in a few specific
> cases. See
>
> https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/
>
> for an example.
>
There is guidance in t/README L680-684, though it may not be that easy to
spot.
A more recent patch was
https://lore.kernel.org/git/4f5c5633-f5a2-3c99-329e-3057b8d447d2@kdbg.org/
with slightly more details.
Philip
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
2023-02-13 23:30 ` Philip Oakley
@ 2023-02-14 1:40 ` Jeff King
2023-02-16 1:32 ` Eric Wong
0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2023-02-14 1:40 UTC (permalink / raw)
To: Philip Oakley; +Cc: Eric Wong, git, Patrick Steinhardt
On Mon, Feb 13, 2023 at 11:30:35PM +0000, Philip Oakley wrote:
> On 13/02/2023 20:53, Jeff King wrote:
> > On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:
> >
> >> Sidenote: I'm curious about the reason $(pwd) is used in some
> >> places while $PWD seems fine in others, so it doesn't seem to be
> >> a portability problem. I chose $PWD since it's faster.
> > It sometimes matters; one is a Windows path (with "C:\", etc) and one is
> > a Unix-style path. Many spots are happy with either type, but it
> > sometimes bites us when doing string comparisons, or in a few specific
> > cases. See
> >
> > https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/
> >
> > for an example.
> >
> There is guidance in t/README L680-684, though it may not be that easy to
> spot.
>
> A more recent patch was
> https://lore.kernel.org/git/4f5c5633-f5a2-3c99-329e-3057b8d447d2@kdbg.org/
> with slightly more details.
Thanks, both explanations are much better than the one I found (my
digging in the archive consisted of "I know JSixt has corrected me on
this at least once...").
-Peff
* Re: [PATCH v2] fetch: support hideRefs to speed up connectivity checks
2023-02-14 1:40 ` Jeff King
@ 2023-02-16 1:32 ` Eric Wong
0 siblings, 0 replies; 11+ messages in thread
From: Eric Wong @ 2023-02-16 1:32 UTC (permalink / raw)
To: git; +Cc: Jeff King, Philip Oakley, Patrick Steinhardt
Jeff King <peff@peff.net> wrote:
> On Mon, Feb 13, 2023 at 11:30:35PM +0000, Philip Oakley wrote:
> > On 13/02/2023 20:53, Jeff King wrote:
> > > On Sun, Feb 12, 2023 at 09:04:26AM +0000, Eric Wong wrote:
> > >
> > >> Sidenote: I'm curious about the reason $(pwd) is used in some
> > >> places while $PWD seems fine in others, so it doesn't seem to be
> > >> a portability problem. I chose $PWD since it's faster.
> > > It sometimes matters; one is a Windows path (with "C:\", etc) and one is
> > > a Unix-style path. Many spots are happy with either type, but it
> > > sometimes bites us when doing string comparisons, or in a few specific
> > > cases. See
> > >
> > > https://lore.kernel.org/git/d36d8b51-f2d7-a2f5-89ea-369f49556e10@kdbg.org/
> > >
> > > for an example.
> > >
> > There is guidance in t/README L680-684, though it may not be that easy to
> > spot.
> >
> > A more recent patch was
> > https://lore.kernel.org/git/4f5c5633-f5a2-3c99-329e-3057b8d447d2@kdbg.org/
> > with slightly more details.
>
> Thanks, both explanations are much better than the one I found (my
> digging in the archive consisted of "I know JSixt has corrected me on
> this at least once...").
Thanks, both. Looks like my use of GIT_TRACE="$PWD"/trace is
fine, and there are plenty of examples where $PWD is used for
GIT_TRACE* in our test suite (`git grep GIT_TRACE.*PWD').

Any comments on the actual change itself? Thanks again.