git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check
@ 2022-10-28 14:42 Patrick Steinhardt
  2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
                   ` (9 more replies)
  0 siblings, 10 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-10-28 14:42 UTC (permalink / raw)
  To: git


Hi,

this patch series improves the connectivity check done by stateful
git-receive-pack(1) to only consider references as reachable that have
been advertised to the client. This has two advantages:

    - A client shouldn't assume objects to exist that have not been part
      of the reference advertisement. But if it excluded an object from
      the packfile that is reachable via any ref that is excluded from
      the reference advertisement due to `transfer.hideRefs` we'd have
      accepted the push anyway. I'd argue that this is a bug in the
      current implementation.

    - Second, by using advertised refs as inputs instead of `git
      rev-list --not --all` we avoid looking up all refs that are
      irrelevant to the current push. This can be a huge performance
      improvement in repos that have a huge amount of internal, hidden
      refs. In one of our repos with 7m refs, of which 6.8m are hidden,
      this speeds up pushes from ~30s to ~4.5s.

One downside is that we need to pass in the object IDs that were part of
the reference advertisement via the standard input, which is seemingly
slower than reading them from the refdb. I'm discussing this in the
second commit.

Patrick

Patrick Steinhardt (2):
  connected: allow supplying different view of reachable objects
  receive-pack: use advertised reference tips to inform connectivity
    check

 builtin/receive-pack.c | 31 ++++++++++++++++++++++---------
 connected.c            |  9 ++++++++-
 connected.h            |  7 +++++++
 3 files changed, 37 insertions(+), 10 deletions(-)

-- 
2.38.1



* [PATCH 1/2] connected: allow supplying different view of reachable objects
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
@ 2022-10-28 14:42 ` Patrick Steinhardt
  2022-10-28 14:54   ` Ævar Arnfjörð Bjarmason
  2022-10-28 18:12   ` Junio C Hamano
  2022-10-28 14:42 ` [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-10-28 14:42 UTC (permalink / raw)
  To: git


The connectivity check is executed via git-receive-pack(1) to verify
that a client has provided all objects that are required to satisfy a
set of reference updates. What the connectivity check does is to walk
the object graph with all new reference tips as starting points while
all preexisting reference tips are marked as uninteresting.

Preexisting references are currently marked uninteresting by passing
`--not --all` to git-rev-list(1). Some users of the connectivity check
may have a better picture of which objects should be regarded as
uninteresting though, e.g. by reusing information from the reference
advertisement when serving a push.

Add a new field to `struct check_connected_options` that allows callers
to replace the `--not --all` logic with their own set of object IDs they
regard as uninteresting.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 connected.c | 9 ++++++++-
 connected.h | 7 +++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/connected.c b/connected.c
index 74a20cb32e..2a4c4e0025 100644
--- a/connected.c
+++ b/connected.c
@@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 	strvec_push(&rev_list.args, "--stdin");
 	if (has_promisor_remote())
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
-	if (!opt->is_deepening_fetch) {
+	if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) {
 		strvec_push(&rev_list.args, "--not");
 		strvec_push(&rev_list.args, "--all");
 	}
@@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 
 	rev_list_in = xfdopen(rev_list.in, "w");
 
+	if (opt->reachable_oids_fn) {
+		const struct object_id *reachable_oid;
+		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
+			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
+				break;
+	}
+
 	do {
 		/*
 		 * If index-pack already checked that:
diff --git a/connected.h b/connected.h
index 6e59c92aa3..f09c7d7884 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,13 @@ struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If non-NULL, use this iterator to determine the set of reachable
+	 * objects instead of marking all references as unreachable.
+	 */
+	oid_iterate_fn reachable_oids_fn;
+	void *reachable_oids_data;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }
-- 
2.38.1



* [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
  2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
@ 2022-10-28 14:42 ` Patrick Steinhardt
  2022-10-28 15:01   ` Ævar Arnfjörð Bjarmason
  2022-10-30 19:09   ` Taylor Blau
  2022-10-28 16:40 ` [PATCH 0/2] " Junio C Hamano
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-10-28 14:42 UTC (permalink / raw)
  To: git


When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

This strategy has the major downside that it will not require any object
to be sent by the client that is reachable by any of the repositories'
references. While that sounds like it would be indeed what we are after
with the connectivity check, it is arguably not. The administrator that
manages the server-side Git repository may have configured certain refs
to be hidden during the reference advertisement via `transfer.hideRefs`
or `receivepack.hideRefs`. Whatever the reason, the result is that the
client shouldn't expect that any of those hidden references exists on
the remote side, and neither should they assume any of the pointed-to
objects to exist except if referenced by any visible reference. But
because we treat _all_ local refs as uninteresting in the connectivity
check, a client is free to send a packfile that references objects that
are only reachable via a hidden reference on the server-side, and we
will gladly accept it.

Besides the correctness issue there is also a performance issue. Git
forges tend to do internal bookkeeping to keep alive sets of objects for
internal use or make them easy to find via certain references. These
references are typically hidden away from the user so that they are
neither advertised nor writeable. At GitLab, we have one particular
repository that contains a total of 7 million references, of which 6.8
million are indeed internal references. With the current connectivity
check we are forced to load all these references in order to mark them
as uninteresting, and this alone takes around 15 seconds to compute.

We can fix both of these issues by changing the logic for stateful
invocations of git-receive-pack(1) where the reference advertisement and
packfile negotiation are served by the same process. Instead of marking
all preexisting references as unreachable, we will only mark those that
we have announced to the client.

Besides the stated fix to correctness this also provides a huge boost to
performance in the repository mentioned above. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 million
refs, as it is configured in Gitaly, leads to an almost 7.5-fold speedup:

    Benchmark 1: main
      Time (mean ± σ):     29.902 s ±  0.105 s    [User: 29.176 s, System: 1.052 s]
      Range (min … max):   29.781 s … 29.969 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      4.033 s ±  0.088 s    [User: 4.071 s, System: 0.374 s]
      Range (min … max):    3.953 s …  4.128 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        7.42 ± 0.16 times faster than 'main'

Unfortunately, this change comes with a performance hit when refs are
not hidden. Executed in the same repository:

    Benchmark 1: main
      Time (mean ± σ):     45.780 s ±  0.507 s    [User: 46.908 s, System: 4.838 s]
      Range (min … max):   45.453 s … 46.364 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     49.886 s ±  0.282 s    [User: 51.168 s, System: 5.015 s]
      Range (min … max):   49.589 s … 50.149 s    3 runs

    Summary
      'main' ran
        1.09 ± 0.01 times faster than 'pks-connectivity-check-hide-refs'

This is probably caused by the overhead of reachable tips being passed
in via git-rev-list(1)'s standard input, which seems to be slower than
reading the references from disk.

It is debatable what to do about this. If this were only about improving
performance then it would be trivial to make the new logic depend on
whether or not `transfer.hideRefs` has been configured in the repo. But
as explained this is also about correctness, even though this can be
considered an edge case. Furthermore, this slowdown is really only
noticeable in outliers like the above repository with an unreasonable
amount of refs. The same benchmark in linux-stable.git with about
4500 references shows no measurable difference:

    Benchmark 1: main
      Time (mean ± σ):     375.4 ms ±  25.4 ms    [User: 312.2 ms, System: 155.7 ms]
      Range (min … max):   324.2 ms … 492.9 ms    50 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     374.9 ms ±  36.9 ms    [User: 311.6 ms, System: 158.2 ms]
      Range (min … max):   319.2 ms … 583.1 ms    50 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.00 ± 0.12 times faster than 'main'

Let's keep this as-is for the time being and accept the performance hit.
It is arguably extremely noticeable to a user if a push now performs 7.5
times faster than before, but a lot less so in case an already-slow push
becomes about 10% slower.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 44bcea3a5b..50794539c6 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -326,13 +326,10 @@ static void show_one_alternate_ref(const struct object_id *oid,
 	show_ref(".have", oid);
 }
 
-static void write_head_info(void)
+static void write_head_info(struct oidset *announced_objects)
 {
-	static struct oidset seen = OIDSET_INIT;
-
-	for_each_ref(show_ref_cb, &seen);
-	for_each_alternate_ref(show_one_alternate_ref, &seen);
-	oidset_clear(&seen);
+	for_each_ref(show_ref_cb, announced_objects);
+	for_each_alternate_ref(show_one_alternate_ref, announced_objects);
 	if (!sent_capabilities)
 		show_ref("capabilities^{}", null_oid());
 
@@ -1896,12 +1893,20 @@ static void execute_commands_atomic(struct command *commands,
 	strbuf_release(&err);
 }
 
+static const struct object_id *iterate_announced_oids(void *cb_data)
+{
+	struct oidset_iter *iter = cb_data;
+	return oidset_iter_next(iter);
+}
+
 static void execute_commands(struct command *commands,
 			     const char *unpacker_error,
 			     struct shallow_info *si,
-			     const struct string_list *push_options)
+			     const struct string_list *push_options,
+			     struct oidset *announced_oids)
 {
 	struct check_connected_options opt = CHECK_CONNECTED_INIT;
+	struct oidset_iter announced_oids_iter;
 	struct command *cmd;
 	struct iterate_data data;
 	struct async muxer;
@@ -1928,6 +1933,12 @@ static void execute_commands(struct command *commands,
 	opt.err_fd = err_fd;
 	opt.progress = err_fd && !quiet;
 	opt.env = tmp_objdir_env(tmp_objdir);
+	if (oidset_size(announced_oids) != 0) {
+		oidset_iter_init(announced_oids, &announced_oids_iter);
+		opt.reachable_oids_fn = iterate_announced_oids;
+		opt.reachable_oids_data = &announced_oids_iter;
+	}
+
 	if (check_connected(iterate_receive_command_list, &data, &opt))
 		set_connectivity_errors(commands, si);
 
@@ -2462,6 +2473,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 {
 	int advertise_refs = 0;
 	struct command *commands;
+	struct oidset announced_oids = OIDSET_INIT;
 	struct oid_array shallow = OID_ARRAY_INIT;
 	struct oid_array ref = OID_ARRAY_INIT;
 	struct shallow_info si;
@@ -2524,7 +2536,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 	}
 
 	if (advertise_refs || !stateless_rpc) {
-		write_head_info();
+		write_head_info(&announced_oids);
 	}
 	if (advertise_refs)
 		return 0;
@@ -2554,7 +2566,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		}
 		use_keepalive = KEEPALIVE_ALWAYS;
 		execute_commands(commands, unpack_status, &si,
-				 &push_options);
+				 &push_options, &announced_oids);
 		if (pack_lockfile)
 			unlink_or_warn(pack_lockfile);
 		sigchain_push(SIGPIPE, SIG_IGN);
@@ -2591,6 +2603,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		packet_flush(1);
 	oid_array_clear(&shallow);
 	oid_array_clear(&ref);
+	oidset_clear(&announced_oids);
 	free((void *)push_cert_nonce);
 	return 0;
 }
-- 
2.38.1



* Re: [PATCH 1/2] connected: allow supplying different view of reachable objects
  2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
@ 2022-10-28 14:54   ` Ævar Arnfjörð Bjarmason
  2022-10-28 18:12   ` Junio C Hamano
  1 sibling, 0 replies; 88+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-28 14:54 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git


On Fri, Oct 28 2022, Patrick Steinhardt wrote:

> @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
>  
>  	rev_list_in = xfdopen(rev_list.in, "w");
>  
> +	if (opt->reachable_oids_fn) {
> +		const struct object_id *reachable_oid;
> +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> +				break;

Just a style nit, we tend to avoid != NULL, != 0 etc. comparisons. I see
connected.c has some of that already, but for new code let's just check
truthiness.

Also for such a small scope a shorter variable name helps us stay at the
usual column limits:

	if (opt->reachable_oids_fn) {
		const struct object_id *oid;
		while ((oid = opt->reachable_oids_fn(opt->reachable_oids_data)))
			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(oid)) < 0)
				break;

The fprintf() return value checking seemed a bit odd, not because we
shouldn't do it, but because we usually don't bother. For other
reviewers: We have that form already in connected.c, so at least locally
we're not being diligently careful, only to have it undone by adjacent
code...

Looks good!


* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-28 14:42 ` [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
@ 2022-10-28 15:01   ` Ævar Arnfjörð Bjarmason
  2022-10-31 14:21     ` Patrick Steinhardt
  2022-10-30 19:09   ` Taylor Blau
  1 sibling, 1 reply; 88+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-28 15:01 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git


On Fri, Oct 28 2022, Patrick Steinhardt wrote:

> When serving a push, git-receive-pack(1) needs to verify that the
> packfile sent by the client contains all objects that are required by
> the updated references. This connectivity check works by marking all
> preexisting references as uninteresting and using the new reference tips
> as starting point for a graph walk.
>
> This strategy has the major downside that it will not require any object
> to be sent by the client that is reachable by any of the repositories'
> references. While that sounds like it would be indeed what we are after
> with the connectivity check, it is arguably not. The administrator that
> manages the server-side Git repository may have configured certain refs
> to be hidden during the reference advertisement via `transfer.hideRefs`
> or `receivepack.hideRefs`. Whatever the reason, the result is that the
> client shouldn't expect that any of those hidden references exists on
> the remote side, and neither should they assume any of the pointed-to
> objects to exist except if referenced by any visible reference. But
> because we treat _all_ local refs as uninteresting in the connectivity
> check, a client is free to send a packfile that references objects that
> are only reachable via a hidden reference on the server-side, and we
> will gladly accept it.
>
> Besides the correctness issue there is also a performance issue. Git
> forges tend to do internal bookkeeping to keep alive sets of objects for
> internal use or make them easy to find via certain references. These
> references are typically hidden away from the user so that they are
> neither advertised nor writeable. At GitLab, we have one particular
> repository that contains a total of 7 million references, of which 6.8
> million are indeed internal references. With the current connectivity
> check we are forced to load all these references in order to mark them
> as uninteresting, and this alone takes around 15 seconds to compute.
>
> We can fix both of these issues by changing the logic for stateful
> invocations of git-receive-pack(1) where the reference advertisement and
> packfile negotiation are served by the same process. Instead of marking
> all preexisting references as unreachable, we will only mark those that
> we have announced to the client.
>
> Besides the stated fix to correctness this also provides a huge boost to
> performance in the repository mentioned above. Pushing a new commit into
> this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 million
> refs, as it is configured in Gitaly, leads to an almost 7.5-fold speedup:

Really well explained.

>     Benchmark 1: main
>       Time (mean ± σ):     29.902 s ±  0.105 s    [User: 29.176 s, System: 1.052 s]
>       Range (min … max):   29.781 s … 29.969 s    3 runs
>
>     Benchmark 2: pks-connectivity-check-hide-refs
>       Time (mean ± σ):      4.033 s ±  0.088 s    [User: 4.071 s, System: 0.374 s]
>       Range (min … max):    3.953 s …  4.128 s    3 runs
>
>     Summary
>       'pks-connectivity-check-hide-refs' ran
>         7.42 ± 0.16 times faster than 'main'

And impressive, thanks!

> Unfortunately, this change comes with a performance hit when refs are
> not hidden. Executed in the same repository:
>
>     Benchmark 1: main
>       Time (mean ± σ):     45.780 s ±  0.507 s    [User: 46.908 s, System: 4.838 s]
>       Range (min … max):   45.453 s … 46.364 s    3 runs
>
>     Benchmark 2: pks-connectivity-check-hide-refs
>       Time (mean ± σ):     49.886 s ±  0.282 s    [User: 51.168 s, System: 5.015 s]
>       Range (min … max):   49.589 s … 50.149 s    3 runs
>
>     Summary
>       'main' ran
>         1.09 ± 0.01 times faster than 'pks-connectivity-check-hide-refs'
>
> This is probably caused by the overhead of reachable tips being passed
> in via git-rev-list(1)'s standard input, which seems to be slower than
> reading the references from disk.
>
> It is debatable what to do about this. If this were only about improving
> performance then it would be trivial to make the new logic depend on
> whether or not `transfer.hideRefs` has been configured in the repo. But
> as explained this is also about correctness, even though this can be
> considered an edge case. Furthermore, this slowdown is really only
> noticeable in outliers like the above repository with an unreasonable
> amount of refs. The same benchmark in linux-stable.git with about
> 4500 references shows no measurable difference:

Do we have a test that would start failing if we changed the behavior?
Perhaps such a test is peeking too much behind the curtain, but if it's
easy to come up with one I think it would be most welcome to have it
alongside this.

> -static void write_head_info(void)
> +static void write_head_info(struct oidset *announced_objects)
>  {
> -	static struct oidset seen = OIDSET_INIT;
> -
> -	for_each_ref(show_ref_cb, &seen);
> -	for_each_alternate_ref(show_one_alternate_ref, &seen);
> -	oidset_clear(&seen);
> +	for_each_ref(show_ref_cb, announced_objects);
> +	for_each_alternate_ref(show_one_alternate_ref, announced_objects);
>  	if (!sent_capabilities)
>  		show_ref("capabilities^{}", null_oid());

Nit: The variable rename stands out slightly,
i.e. s/&seen/announced_objects/ not s/&seen/seen/, especially as:

>  static void execute_commands(struct command *commands,
>  			     const char *unpacker_error,
>  			     struct shallow_info *si,
> -			     const struct string_list *push_options)
> +			     const struct string_list *push_options,
> +			     struct oidset *announced_oids)

Here we have the same variable, but now it's *_oids, not *objects.

> +	if (oidset_size(announced_oids) != 0) {

Nit as before: The "!= 0" can go here.

> @@ -2462,6 +2473,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  {
>  	int advertise_refs = 0;
>  	struct command *commands;
> +	struct oidset announced_oids = OIDSET_INIT;
>  	struct oid_array shallow = OID_ARRAY_INIT;
>  	struct oid_array ref = OID_ARRAY_INIT;
>  	struct shallow_info si;
> @@ -2524,7 +2536,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  	}
>  
>  	if (advertise_refs || !stateless_rpc) {
> -		write_head_info();
> +		write_head_info(&announced_oids);
>  	}
>  	if (advertise_refs)
>  		return 0;

This introduces a memory leak to the function. We probably have other
ones in code it calls, but from a quick eyeballing not in the function
itself.

Squashing in / combining it with this should do it, as it never returns
non-zero (except for calling die()):
	
	diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
	index 44bcea3a5b3..8d5c2fbef1c 100644
	--- a/builtin/receive-pack.c
	+++ b/builtin/receive-pack.c
	@@ -2527,7 +2527,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
	 		write_head_info();
	 	}
	 	if (advertise_refs)
	-		return 0;
	+		goto cleanup;
	 
	 	packet_reader_init(&reader, 0, NULL, 0,
	 			   PACKET_READ_CHOMP_NEWLINE |
	@@ -2587,6 +2587,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
	 			update_server_info(0);
	 		clear_shallow_info(&si);
	 	}
	+cleanup:
	 	if (use_sideband)
	 		packet_flush(1);
	 	oid_array_clear(&shallow);

> @@ -2591,6 +2603,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  		packet_flush(1);
>  	oid_array_clear(&shallow);
>  	oid_array_clear(&ref);
> +	oidset_clear(&announced_oids);
>  	free((void *)push_cert_nonce);
>  	return 0;
>  }

We'll then properly reach this new oidset_clear(). The oid_array_clear()
calls are all for variables we're populating after we're past that "if
(advertise_refs)".

I think if you're re-rolling this, squashing 1/2 and 2/2 together would be
an improvement. The 1/2 is tiny, and it's an API that's not used until
this 2/2. I found myself going back & forth more than helped in
reviewing this.

Going back a bit, this:

> +static const struct object_id *iterate_announced_oids(void *cb_data)
> +{
> +	struct oidset_iter *iter = cb_data;
> +	return oidset_iter_next(iter);
> +}
> +

Is just used as (from 1/2):

> +	if (opt->reachable_oids_fn) {
> +		const struct object_id *reachable_oid;
> +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> +				break;
> +	}

After doing above:

> +	if (oidset_size(announced_oids) != 0) {
> +		oidset_iter_init(announced_oids, &announced_oids_iter);
> +		opt.reachable_oids_fn = iterate_announced_oids;
> +		opt.reachable_oids_data = &announced_oids_iter;
> +	}

I don't see the reason for the indirection, but maybe I'm missing
something obvious.

Why not just pass the oidset itself and have connected.c iterate through
it, rather than going through this callback / data indirection?
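
To make the two shapes concrete, here is a minimal standalone sketch of
both. It does not use Git's real oidset API: `struct toy_oidset`,
`struct toy_iter`, `toy_iter_next()`, `write_excluded_cb()` and
`write_excluded_set()` are all hypothetical names, and object IDs are
plain hex strings. The only thing taken from the patch is the "^<oid>"
line format written to the child's stdin.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an oidset: a fixed array of hex strings. */
struct toy_oidset {
	const char **hex;
	size_t nr;
};

/* Variant A: iterator callback plus opaque data, like the posted patch. */
typedef const char *(*toy_next_fn)(void *data);

struct toy_iter {
	const struct toy_oidset *set;
	size_t pos;
};

static const char *toy_iter_next(void *data)
{
	struct toy_iter *iter = data;

	if (iter->pos >= iter->set->nr)
		return NULL;
	return iter->set->hex[iter->pos++];
}

/* Writes one "^<hex>" line per ID; returns the count, or -1 on error. */
static int write_excluded_cb(FILE *out, toy_next_fn fn, void *data)
{
	const char *hex;
	int n = 0;

	assert(out && fn);
	while ((hex = fn(data))) {
		if (fprintf(out, "^%s\n", hex) < 0)
			return -1;
		n++;
	}
	return n;
}

/* Variant B: the consumer is handed the set and iterates it directly. */
static int write_excluded_set(FILE *out, const struct toy_oidset *set)
{
	size_t i;

	assert(out && set);
	for (i = 0; i < set->nr; i++)
		if (fprintf(out, "^%s\n", set->hex[i]) < 0)
			return -1;
	return (int)set->nr;
}
```

Both variants produce the same bytes. Variant A keeps the consumer
ignorant of the container type at the cost of an extra function and an
iterator that the caller has to keep alive; variant B is shorter but
ties the consumer to one container.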



* Re: [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
  2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
  2022-10-28 14:42 ` [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
@ 2022-10-28 16:40 ` Junio C Hamano
  2022-11-01  1:30 ` Taylor Blau
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 88+ messages in thread
From: Junio C Hamano @ 2022-10-28 16:40 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> this patch series improves the connectivity check done by stateful
> git-receive-pack(1) to only consider references as reachable that have
> been advertised to the client. This has two advantages:
>
>     - A client shouldn't assume objects to exist that have not been part
>       of the reference advertisement. But if it excluded an object from
>       the packfile that is reachable via any ref that is excluded from
>       the reference advertisement due to `transfer.hideRefs` we'd have
>       accepted the push anyway. I'd argue that this is a bug in the
>       current implementation.

I agree that it is bad to accept an incoming pack that depends on an
object that is *only* reachable by a hidden ref, but what you said
above is a bit stronger than that.  Use of transfer.hideRefs does not
have to be about hiding objects (e.g. I could hide the tips of the
branches that hold older maintenance tracks, just to unclutter, and
giving a pack that depends on older parts of history is just fine).

Let's let it pass, as the cover letter material won't become part of
the permanent history ;-)

>     - Second, by using advertised refs as inputs instead of `git
>       rev-list --not --all` we avoid looking up all refs that are
>       irrelevant to the current push. This can be a huge performance
>       improvement in repos that have a huge amount of internal, hidden
>       refs. In one of our repos with 7m refs, of which 6.8m are hidden,
>       this speeds up pushes from ~30s to ~4.5s.

Yes.

> One downside is that we need to pass in the object IDs that were part of
> the reference advertisement via the standard input, which is seemingly
> slower than reading them from the refdb. I'm discussing this in the
> second commit.

Interesting.


* Re: [PATCH 1/2] connected: allow supplying different view of reachable objects
  2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
  2022-10-28 14:54   ` Ævar Arnfjörð Bjarmason
@ 2022-10-28 18:12   ` Junio C Hamano
  2022-10-30 18:49     ` Taylor Blau
  2022-10-31 13:10     ` Patrick Steinhardt
  1 sibling, 2 replies; 88+ messages in thread
From: Junio C Hamano @ 2022-10-28 18:12 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/connected.c b/connected.c
> index 74a20cb32e..2a4c4e0025 100644
> --- a/connected.c
> +++ b/connected.c
> @@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
>  	strvec_push(&rev_list.args, "--stdin");
>  	if (has_promisor_remote())
>  		strvec_push(&rev_list.args, "--exclude-promisor-objects");
> -	if (!opt->is_deepening_fetch) {
> +	if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) {
>  		strvec_push(&rev_list.args, "--not");
>  		strvec_push(&rev_list.args, "--all");
>  	}
> @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
>  
>  	rev_list_in = xfdopen(rev_list.in, "w");
>  
> +	if (opt->reachable_oids_fn) {
> +		const struct object_id *reachable_oid;
> +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> +				break;
> +	}

It is good that these individual negative references are fed from
the standard input, not on the command line, as they can be many.

In the original code without the reachable_oids_fn, we refrain from
excluding when the is_deepening_fetch bit is set, but here we do not
pay attention to the bit at all.  Is that sensible, and if so why?
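
For reference, the interaction can be written down as a small truth
table. The sketch below only restates the two conditions visible in the
quoted hunks; `struct toy_opts` and the three helpers are hypothetical
names, and the callback is reduced to a boolean. Whether any caller can
actually set both fields at once is not something this sketch settles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the two option fields the patch consults. */
struct toy_opts {
	bool is_deepening_fetch;
	bool has_reachable_oids_fn; /* stands in for a non-NULL callback */
};

/* Mirrors the patched condition guarding the "--not --all" arguments. */
static bool passes_not_all(const struct toy_opts *opt)
{
	assert(opt);
	return !opt->is_deepening_fetch && !opt->has_reachable_oids_fn;
}

/* Mirrors the new block that writes "^<oid>" lines to rev-list's stdin. */
static bool feeds_negative_tips(const struct toy_opts *opt)
{
	assert(opt);
	return opt->has_reachable_oids_fn;
}

/* True if anything at all is marked uninteresting for the walk. */
static bool excludes_anything(const struct toy_opts *opt)
{
	return passes_not_all(opt) || feeds_negative_tips(opt);
}
```

With only `is_deepening_fetch` set, nothing is excluded, as before the
patch. With both fields set, the negative tips are still written, so
the walk does exclude objects; that is the combination being asked
about.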


* Re: [PATCH 1/2] connected: allow supplying different view of reachable objects
  2022-10-28 18:12   ` Junio C Hamano
@ 2022-10-30 18:49     ` Taylor Blau
  2022-10-31 13:10     ` Patrick Steinhardt
  1 sibling, 0 replies; 88+ messages in thread
From: Taylor Blau @ 2022-10-30 18:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Patrick Steinhardt, git

On Fri, Oct 28, 2022 at 11:12:33AM -0700, Junio C Hamano wrote:
> In the original code without the reachable_oids_fn, we refrain from
> excluding when the is_deepening_fetch bit is set, but here we do not
> pay attention to the bit at all.  Is that sensible, and if so why?

I was wondering the same thing. Thanks for asking.

Thanks,
Taylor


* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-28 14:42 ` [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
  2022-10-28 15:01   ` Ævar Arnfjörð Bjarmason
@ 2022-10-30 19:09   ` Taylor Blau
  2022-10-31 14:45     ` Patrick Steinhardt
  1 sibling, 1 reply; 88+ messages in thread
From: Taylor Blau @ 2022-10-30 19:09 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Fri, Oct 28, 2022 at 04:42:27PM +0200, Patrick Steinhardt wrote:
> This strategy has the major downside that it will not require any object
> to be sent by the client that is reachable by any of the repositories'
> references. While that sounds like it would be indeed what we are after
> with the connectivity check, it is arguably not. The administrator that
> manages the server-side Git repository may have configured certain refs
> to be hidden during the reference advertisement via `transfer.hideRefs`
> or `receivepack.hideRefs`. Whatever the reason, the result is that the
> client shouldn't expect that any of those hidden references exists on
> the remote side, and neither should they assume any of the pointed-to
> objects to exist except if referenced by any visible reference. But
> because we treat _all_ local refs as uninteresting in the connectivity
> check, a client is free to send a packfile that references objects that
> are only reachable via a hidden reference on the server-side, and we
> will gladly accept it.

You mention below that this is a correctness issue, but I am not sure
that I agree.

The existing behavior is a little strange, I agree, but your argument
relies on an assumption that the history on hidden refs is not part of
the reachable set, which is not the case. Any part of the repository
that is reachable from _any_ reference, hidden or not, is reachable by
definition.

So it's perfectly fine to consider objects on hidden refs to be in the
uninteresting set, because they are reachable. It's odd from the
client's perspective, but I do not see a path to repository corruption
with the existing behavior.

> Besides the stated fix to correctness this also provides a huge boost to
> performance in the repository mentioned above. Pushing a new commit into
> this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 million
> refs, as it is configured in Gitaly, leads to an almost 7.5-fold speedup:

Nice, here we expect a pretty good speed-up, and indeed...

>     Summary
>       'pks-connectivity-check-hide-refs' ran
>         7.42 ± 0.16 times faster than 'main'

...that's exactly what we get. Good.

> @@ -1928,6 +1933,12 @@ static void execute_commands(struct command *commands,
>  	opt.err_fd = err_fd;
>  	opt.progress = err_fd && !quiet;
>  	opt.env = tmp_objdir_env(tmp_objdir);
> +	if (oidset_size(announced_oids) != 0) {

I'm nitpicking, but this would be preferable as "if (oidset_size(announced_oids))"
without the "!= 0".

> +		oidset_iter_init(announced_oids, &announced_oids_iter);
> +		opt.reachable_oids_fn = iterate_announced_oids;
> +		opt.reachable_oids_data = &announced_oids_iter;
> +	}

Why do we see a slowdown when there aren't any hidden references?
Or am I misunderstanding your patch message which instead means "we see
a slow-down when there are no hidden references [since we still must
store and enumerate all advertised references]"?

If the latter, could we avoid invoking the new machinery altogether? In
other words, shouldn't receive-pack only set the reachable_oids_fn() to
enumerate advertised references only when the set of advertised
references differs from the behavior of `--not --all`?

>  	if (check_connected(iterate_receive_command_list, &data, &opt))
>  		set_connectivity_errors(commands, si);
>
> @@ -2462,6 +2473,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  {
>  	int advertise_refs = 0;
>  	struct command *commands;
> +	struct oidset announced_oids = OIDSET_INIT;

This looks like trading one problem for another. In your above example,
we now need to store 20 bytes of OIDs 6.8M times, or ~130 MiB. Not the
end of the world, but it feels like an avoidable problem.

Could we enumerate the references in a callback to for_each_ref() and
only emit ones which aren't hidden? Storing these and then recalling
them after the fact is worth avoiding.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 1/2] connected: allow supplying different view of reachable objects
  2022-10-28 18:12   ` Junio C Hamano
  2022-10-30 18:49     ` Taylor Blau
@ 2022-10-31 13:10     ` Patrick Steinhardt
  2022-11-01  1:16       ` Taylor Blau
  1 sibling, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-10-31 13:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2195 bytes --]

On Fri, Oct 28, 2022 at 11:12:33AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > diff --git a/connected.c b/connected.c
> > index 74a20cb32e..2a4c4e0025 100644
> > --- a/connected.c
> > +++ b/connected.c
> > @@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
> >  	strvec_push(&rev_list.args, "--stdin");
> >  	if (has_promisor_remote())
> >  		strvec_push(&rev_list.args, "--exclude-promisor-objects");
> > -	if (!opt->is_deepening_fetch) {
> > +	if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) {
> >  		strvec_push(&rev_list.args, "--not");
> >  		strvec_push(&rev_list.args, "--all");
> >  	}
> > @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
> >  
> >  	rev_list_in = xfdopen(rev_list.in, "w");
> >  
> > +	if (opt->reachable_oids_fn) {
> > +		const struct object_id *reachable_oid;
> > +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> > +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> > +				break;
> > +	}
> 
> It is good that these individual negative references are fed from
> the standard input, not on the command line, as they can be many.
> 
> In the original code without the reachable_oids_fn, we refrain from
> excluding when the is_deepening_fetch bit is set, but here we do not
> pay attention to the bit at all.  Is that sensible, and if so why?

Hm, good point. On a deepening fetch the commits that were the previous
boundary will likely get replaced by new commits that are at a deeper
point in history, so they cannot be used as a well-defined boundary.
Instead, we do a complete graph-walk that doesn't stop at any previously
known commits at all. At least that's how I understand the code, the
explanation is likely a bit fuzzy.

I guess we should thus also pay attention to `is_deepening_fetch` here.
As this means that `is_deepening_fetch` and `reachable_oids_fn` are
mutually exclusive I'm inclined to go even further and `die()` if both
are set at the same time. We only adapt git-receive-pack(1) anyway, so
we should never run into this situation for now.
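
For illustration, the rule described above can be sketched in C as
follows. This is a minimal sketch, not the code in connected.c: the
struct, the enum and the helper names are hypothetical stand-ins, and
where this sketch returns BOUNDARY_INVALID the real code would call
`die()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of the connectivity-check options. */
struct connected_opts_sketch {
	int is_deepening_fetch;
	const void *(*reachable_oids_fn)(void *cb_data);
	void *reachable_oids_data;
};

/* How the uninteresting side of the rev-list walk is chosen (names are assumptions). */
enum boundary_mode {
	BOUNDARY_INVALID = -1,	/* both options set: caller bug */
	BOUNDARY_NONE,		/* deepening fetch: full walk, nothing excluded */
	BOUNDARY_ALL_REFS,	/* default: "--not --all" */
	BOUNDARY_SUPPLIED_OIDS	/* "^<oid>" lines written to rev-list's stdin */
};

/* Dummy iterator that yields no object IDs; only used to populate the option. */
static const void *no_oids(void *cb_data)
{
	(void)cb_data;
	return NULL;
}

static enum boundary_mode pick_boundary_mode(const struct connected_opts_sketch *opt)
{
	assert(opt != NULL);
	/* The two options are mutually exclusive. */
	if (opt->is_deepening_fetch && opt->reachable_oids_fn)
		return BOUNDARY_INVALID;
	if (opt->is_deepening_fetch)
		return BOUNDARY_NONE;
	if (opt->reachable_oids_fn)
		return BOUNDARY_SUPPLIED_OIDS;
	return BOUNDARY_ALL_REFS;
}
```

Rejecting the combination outright keeps the three valid modes easy to
reason about, at the cost of requiring future callers to pick one.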

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-28 15:01   ` Ævar Arnfjörð Bjarmason
@ 2022-10-31 14:21     ` Patrick Steinhardt
  2022-10-31 15:36       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-10-31 14:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 4559 bytes --]

On Fri, Oct 28, 2022 at 05:01:58PM +0200, Ævar Arnfjörð Bjarmason wrote:
> On Fri, Oct 28 2022, Patrick Steinhardt wrote:
[snip]
> > Unfortunately, this change comes with a performance hit when refs are
> > not hidden. Executed in the same repository:
> >
> >     Benchmark 1: main
> >       Time (mean ± σ):     45.780 s ±  0.507 s    [User: 46.908 s, System: 4.838 s]
> >       Range (min … max):   45.453 s … 46.364 s    3 runs
> >
> >     Benchmark 2: pks-connectivity-check-hide-refs
> >       Time (mean ± σ):     49.886 s ±  0.282 s    [User: 51.168 s, System: 5.015 s]
> >       Range (min … max):   49.589 s … 50.149 s    3 runs
> >
> >     Summary
> >       'main' ran
> >         1.09 ± 0.01 times faster than 'pks-connectivity-check-hide-refs'
> >
> > This is probably caused by the overhead of reachable tips being passed
> > in via git-rev-list(1)'s standard input, which seems to be slower than
> > reading the references from disk.
> >
> > It is debatable what to do about this. If this were only about improving
> > performance then it would be trivial to make the new logic depend on
> > whether or not `transfer.hideRefs` has been configured in the repo. But
> > as explained this is also about correctness, even though this can be
> > considered an edge case. Furthermore, this slowdown is really only
> > noticeable in outliers like the above repository with an unreasonable
> > amount of refs. The same benchmark in linux-stable.git with about
> > 4500 references shows no measurable difference:
> 
> Do we have a test that would start failing if we changed the behavior?
> Perhaps such a test is peeking too much behind the curtain, but if it's
> easy to come up with one I think it would be most welcome to have it
> alongside this.

We have tests that verify that we indeed detect missing objects in
t5504. But what we're lacking is tests that verify that we stop walking
at the boundary of preexisting objects, and I honestly wouldn't quite
know how to do that as there is no functional difference, but really
only a performance issue if we overwalked.

> > -static void write_head_info(void)
> > +static void write_head_info(struct oidset *announced_objects)
> >  {
> > -	static struct oidset seen = OIDSET_INIT;
> > -
> > -	for_each_ref(show_ref_cb, &seen);
> > -	for_each_alternate_ref(show_one_alternate_ref, &seen);
> > -	oidset_clear(&seen);
> > +	for_each_ref(show_ref_cb, announced_objects);
> > +	for_each_alternate_ref(show_one_alternate_ref, announced_objects);
> >  	if (!sent_capabilities)
> >  		show_ref("capabilities^{}", null_oid());
> 
> Nit: The variable rename stands out slightly,
> i.e. s/&seen/announced_objects/ not s/&seen/seen/, especially as:
> 
> >  static void execute_commands(struct command *commands,
> >  			     const char *unpacker_error,
> >  			     struct shallow_info *si,
> > -			     const struct string_list *push_options)
> > +			     const struct string_list *push_options,
> > +			     struct oidset *announced_oids)
> 
> Here we have the same variable, but now it's *_oids, not *objects.

Hm. I think that `announced_oids` is easier to understand compared to
`seen`, so I'd prefer to keep the rename. But I'll definitely make this
consistent so we use `announced_oids` in both places.

[snip]
> > +static const struct object_id *iterate_announced_oids(void *cb_data)
> > +{
> > +	struct oidset_iter *iter = cb_data;
> > +	return oidset_iter_next(iter);
> > +}
> > +
> 
> Is just used as (from 1/2):
> 
> > +	if (opt->reachable_oids_fn) {
> > +		const struct object_id *reachable_oid;
> > +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> > +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> > +				break;
> > +	}
> 
> After doing above:
> 
> > +	if (oidset_size(announced_oids) != 0) {
> > +		oidset_iter_init(announced_oids, &announced_oids_iter);
> > +		opt.reachable_oids_fn = iterate_announced_oids;
> > +		opt.reachable_oids_data = &announced_oids_iter;
> > +	}
> 
> But I don't see the reason for the indirection, but maybe I'm missing
> something obvious.
> 
> Why not just pass the oidset itself and have connected.c iterate through
> it, rather than going through this callback / data indirection?

This is done to stay consistent with the way new tips are passed in via
the `oid_iterate_fn`. I'm happy to change callers to just directly pass
a `struct oidset *` though.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-30 19:09   ` Taylor Blau
@ 2022-10-31 14:45     ` Patrick Steinhardt
  2022-11-01  1:28       ` Taylor Blau
  2022-11-01  8:28       ` Jeff King
  0 siblings, 2 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-10-31 14:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 5518 bytes --]

On Sun, Oct 30, 2022 at 03:09:04PM -0400, Taylor Blau wrote:
> On Fri, Oct 28, 2022 at 04:42:27PM +0200, Patrick Steinhardt wrote:
> > This strategy has the major downside that it will not require any object
> > to be sent by the client that is reachable by any of the repositories'
> > references. While that sounds like it would be indeed what we are after
> > with the connectivity check, it is arguably not. The administrator that
> > manages the server-side Git repository may have configured certain refs
> > to be hidden during the reference advertisement via `transfer.hideRefs`
> > or `receivepack.hideRefs`. Whatever the reason, the result is that the
> > client shouldn't expect that any of those hidden references exists on
> > the remote side, and neither should they assume any of the pointed-to
> > objects to exist except if referenced by any visible reference. But
> > because we treat _all_ local refs as uninteresting in the connectivity
> > check, a client is free to send a packfile that references objects that
> > are only reachable via a hidden reference on the server-side, and we
> > will gladly accept it.
> 
> You mention below that this is a correctness issue, but I am not sure
> that I agree.
> 
> The existing behavior is a little strange, I agree, but your argument
> relies on an assumption that the history on hidden refs is not part of
> the reachable set, which is not the case. Any part of the repository
> that is reachable from _any_ reference, hidden or not, is reachable by
> definition.
> 
> So it's perfectly fine to consider objects on hidden refs to be in the
> uninteresting set, because they are reachable. It's odd from the
> client's perspective, but I do not see a path to repository corruption
> with the existing behavior.

Indeed, I'm not trying to say that this can lead to repository
corruption. If anything, you could argue that this is security-related.
Suppose an object is not reachable from any public reference and that
`allowAnySHA1InWant=false`. Then you could make these hidden objects
reachable by sending a packfile with an object that references the
hidden object. It naturally requires you to somehow know about the
object ID, so I don't think this is a critical issue.

But security-related or not, I think it is safe to say that a client
which omits objects required for the updated reference from its packfile,
without being able to know that they exist on the server side, must be
running buggy code.

[snip]
> Why do we see a slowdown when there aren't any hidden references?
> Or am I misunderstanding your patch message which instead means "we see
> a slow-down when there are no hidden references [since we still must
> store and enumerate all advertised references]"?

I have tried to dig down into the code of `revision.c` but ultimately
returned empty-handed. I _think_ that this is because of the different
paths we use when reading revisions from stdin as we have to resolve the
revision to an OID first, which is more involved than taking the OIDs as
returned by the reference backend. I have tried to short-circuit this
logic in case the revision read from stdin is exactly `hash_algo->hexsz`
long so that we try to parse it as an OID directly instead of trying to
do any of the magic that is required to resolve a revision. But this
only sped things up by a small margin.
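
The kind of check meant here can be sketched as below. This is a
minimal sketch under stated assumptions, not the experiment that was
actually run: `is_full_hex_oid()` and `SHA1_HEXSZ_SKETCH` are
hypothetical names, and 40 is the hex length of a SHA-1 object ID (a
SHA-256 repository would use 64).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hex length of a SHA-1 object ID; the macro name is an assumption. */
#define SHA1_HEXSZ_SKETCH 40

/*
 * Return 1 if `rev` is exactly `hexsz` hexadecimal digits, i.e. it can
 * be parsed as an object ID directly without general revision
 * resolution. Return 0 for anything else (ref names, "HEAD", short
 * abbreviations, and so on).
 */
static int is_full_hex_oid(const char *rev, size_t hexsz)
{
	size_t i;

	assert(rev != NULL);
	if (strlen(rev) != hexsz)
		return 0;
	for (i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)rev[i]))
			return 0;
	return 1;
}
```

A caller would try the direct parse only when this returns 1 and fall
back to the general revision parser otherwise.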

Another assumption was that this is overhead caused by using stdin
instead of reading data from a file, but flame graphs didn't support
this theory, either.

> If the latter, could we avoid invoking the new machinery altogether? In
> other words, shouldn't receive-pack only set the reachable_oids_fn() to
> enumerate advertised references only when the set of advertised
> references differs from the behavior of `--not --all`?

Yeah, I was taking a "wait for feedback and see" stance on this. We can
easily make the logic conditional on whether there are any hidden refs
at all.
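
A minimal sketch of such a condition, assuming the number of hidden
refs is known at the time the connectivity check is set up. The enum
and function names are hypothetical and do not exist in
receive-pack.c.

```c
#include <assert.h>
#include <stddef.h>

/* Where the set of already-reachable tips comes from (names are assumptions). */
enum boundary_source {
	BOUNDARY_FROM_ALL_REFS,		/* rev-list reads the refs itself: --not --all */
	BOUNDARY_FROM_ADVERTISED_OIDS	/* advertised tips fed on stdin as ^<oid> */
};

/*
 * Only pay for feeding object IDs over stdin when hiding refs actually
 * makes the advertised set differ from "all refs".
 */
static enum boundary_source choose_boundary_source(size_t total_refs,
						    size_t hidden_refs)
{
	assert(hidden_refs <= total_refs);
	return hidden_refs ? BOUNDARY_FROM_ADVERTISED_OIDS
			   : BOUNDARY_FROM_ALL_REFS;
}
```

With the numbers from this thread, the 7m-ref repository with 6.8m
hidden refs would take the stdin path, while linux-stable.git without
hidden refs would keep using `--not --all`.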

> >  	if (check_connected(iterate_receive_command_list, &data, &opt))
> >  		set_connectivity_errors(commands, si);
> >
> > @@ -2462,6 +2473,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> >  {
> >  	int advertise_refs = 0;
> >  	struct command *commands;
> > +	struct oidset announced_oids = OIDSET_INIT;
> 
> This looks like trading one problem for another. In your above example,
> we now need to store 20 bytes of OIDs 6.8M times, or ~130 MiB. Not the
> end of the world, but it feels like an avoidable problem.

We store these references in an `oidset` before this patch set already,
but yes, the lifetime is longer now. But note that this set stores the
announced objects, not the hidden ones. So we don't store 6.8m OIDs, but
only the 250k announced ones.
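
To put numbers on this, here is the arithmetic as a small C sketch. It
counts only the raw 20-byte SHA-1 object IDs; the per-entry overhead of
the hash table behind an `oidset` is not included, so real memory use
would be somewhat higher. The helper and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Raw size of a SHA-1 object ID in bytes; the macro name is an assumption. */
#define SHA1_RAWSZ_SKETCH 20

/* Bytes needed for the raw object IDs alone, ignoring container overhead. */
static size_t raw_oid_bytes(size_t n_oids, size_t rawsz)
{
	assert(rawsz > 0);
	return n_oids * rawsz;
}
```

For 6.8m OIDs this gives 136,000,000 bytes, or about 130 MiB, which
matches the estimate quoted above. For the 250k announced OIDs it gives
5,000,000 bytes, or a little under 5 MiB.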

> Could we enumerate the references in a callback to for_each_ref() and
> only emit ones which aren't hidden? Storing these and then recalling
> them after the fact is worth avoiding.

Sorry, I don't quite get what you're proposing. `for_each_ref()` already
does exactly that: it stores every reference that is not hidden in the
above `oidset`. This is the exact set of advertised references, which in
my example repository would be about 250k. This information is used by
git-receive-pack(1) to avoid announcing the same object twice, and now
it's also used to inform the connectivity check to use these objects as
the set of already-reachable objects.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-31 14:21     ` Patrick Steinhardt
@ 2022-10-31 15:36       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 88+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-31 15:36 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git


On Mon, Oct 31 2022, Patrick Steinhardt wrote:

> On Fri, Oct 28, 2022 at 05:01:58PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> On Fri, Oct 28 2022, Patrick Steinhardt wrote:
> [snip]
>> > Unfortunately, this change comes with a performance hit when refs are
>> > not hidden. Executed in the same repository:
>> >
>> >     Benchmark 1: main
>> >       Time (mean ± σ):     45.780 s ±  0.507 s    [User: 46.908 s, System: 4.838 s]
>> >       Range (min … max):   45.453 s … 46.364 s    3 runs
>> >
>> >     Benchmark 2: pks-connectivity-check-hide-refs
>> >       Time (mean ± σ):     49.886 s ±  0.282 s    [User: 51.168 s, System: 5.015 s]
>> >       Range (min … max):   49.589 s … 50.149 s    3 runs
>> >
>> >     Summary
>> >       'main' ran
>> >         1.09 ± 0.01 times faster than 'pks-connectivity-check-hide-refs'
>> >
>> > This is probably caused by the overhead of reachable tips being passed
>> > in via git-rev-list(1)'s standard input, which seems to be slower than
>> > reading the references from disk.
>> >
>> > It is debatable what to do about this. If this were only about improving
>> > performance then it would be trivial to make the new logic depend on
>> > whether or not `transfer.hideRefs` has been configured in the repo. But
>> > as explained this is also about correctness, even though this can be
>> > considered an edge case. Furthermore, this slowdown is really only
>> > noticeable in outliers like the above repository with an unreasonable
>> > amount of refs. The same benchmark in linux-stable.git with about
>> > 4500 references shows no measurable difference:
>> 
>> Do we have a test that would start failing if we changed the behavior?
>> Perhaps such a test is peeking too much behind the curtain, but if it's
>> easy to come up with one I think it would be most welcome to have it
>> alongside this.
>
> We have tests that verify that we indeed detect missing objects in
> t5504. But what we're lacking is tests that verify that we stop walking
> at the boundary of preexisting objects, and I honestly wouldn't quite
> know how to do that as there is no functional difference, but really
> only a performance issue if we overwalked.
>
>> > -static void write_head_info(void)
>> > +static void write_head_info(struct oidset *announced_objects)
>> >  {
>> > -	static struct oidset seen = OIDSET_INIT;
>> > -
>> > -	for_each_ref(show_ref_cb, &seen);
>> > -	for_each_alternate_ref(show_one_alternate_ref, &seen);
>> > -	oidset_clear(&seen);
>> > +	for_each_ref(show_ref_cb, announced_objects);
>> > +	for_each_alternate_ref(show_one_alternate_ref, announced_objects);
>> >  	if (!sent_capabilities)
>> >  		show_ref("capabilities^{}", null_oid());
>> 
>> Nit: The variable rename stands out slightly,
>> i.e. s/&seen/announced_objects/ not s/&seen/seen/, especially as:
>> 
>> >  static void execute_commands(struct command *commands,
>> >  			     const char *unpacker_error,
>> >  			     struct shallow_info *si,
>> > -			     const struct string_list *push_options)
>> > +			     const struct string_list *push_options,
>> > +			     struct oidset *announced_oids)
>> 
>> Here we have the same variable, but now it's *_oids, not *objects.
>
> Hm. I think that `announced_oids` is easier to understand compared to
> `seen`, so I'd prefer to keep the rename. But I'll definitely make this
> consistent so we use `announced_oids` in both places.

Sounds good, we'll need to look at the diff lines in either case (as
we're converting it to a pointer), so changing the name while at it is
fine...

> [snip]
>> > +static const struct object_id *iterate_announced_oids(void *cb_data)
>> > +{
>> > +	struct oidset_iter *iter = cb_data;
>> > +	return oidset_iter_next(iter);
>> > +}
>> > +
>> 
>> Is just used as (from 1/2):
>> 
>> > +	if (opt->reachable_oids_fn) {
>> > +		const struct object_id *reachable_oid;
>> > +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
>> > +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
>> > +				break;
>> > +	}
>> 
>> After doing above:
>> 
>> > +	if (oidset_size(announced_oids) != 0) {
>> > +		oidset_iter_init(announced_oids, &announced_oids_iter);
>> > +		opt.reachable_oids_fn = iterate_announced_oids;
>> > +		opt.reachable_oids_data = &announced_oids_iter;
>> > +	}
>> 
>> But I don't see the reason for the indirection, but maybe I'm missing
>> something obvious.
>> 
>> Why not just pass the oidset itself and have connected.c iterate through
>> it, rather than going through this callback / data indirection?
>
> This is done to stay consistent with the way new tips are passed in via
> the `oid_iterate_fn`. I'm happy to change callers to just directly pass
> a `struct oidset *` though.

*nod*, makes sense, no need to change it. Just wondering...

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 1/2] connected: allow supplying different view of reachable objects
  2022-10-31 13:10     ` Patrick Steinhardt
@ 2022-11-01  1:16       ` Taylor Blau
  0 siblings, 0 replies; 88+ messages in thread
From: Taylor Blau @ 2022-11-01  1:16 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, git

On Mon, Oct 31, 2022 at 02:10:16PM +0100, Patrick Steinhardt wrote:
> I guess we should thus also pay attention to `is_deepening_fetch` here.
> As this means that `is_deepening_fetch` and `reachable_oids_fn` are
> mutually exclusive I'm inclined to go even further and `die()` if both
> are set at the same time. We only adapt git-receive-pack(1) anyway, so
> we should never run into this situation for now.

Yes, I agree.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-31 14:45     ` Patrick Steinhardt
@ 2022-11-01  1:28       ` Taylor Blau
  2022-11-01  7:20         ` Patrick Steinhardt
  2022-11-01  8:28       ` Jeff King
  1 sibling, 1 reply; 88+ messages in thread
From: Taylor Blau @ 2022-11-01  1:28 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Mon, Oct 31, 2022 at 03:45:47PM +0100, Patrick Steinhardt wrote:
> On Sun, Oct 30, 2022 at 03:09:04PM -0400, Taylor Blau wrote:
> > On Fri, Oct 28, 2022 at 04:42:27PM +0200, Patrick Steinhardt wrote:
> > > This strategy has the major downside that it will not require any object
> > > to be sent by the client that is reachable by any of the repositories'
> > > references. While that sounds like it would be indeed what we are after
> > > with the connectivity check, it is arguably not. The administrator that
> > > manages the server-side Git repository may have configured certain refs
> > > to be hidden during the reference advertisement via `transfer.hideRefs`
> > > or `receivepack.hideRefs`. Whatever the reason, the result is that the
> > > client shouldn't expect that any of those hidden references exists on
> > > the remote side, and neither should they assume any of the pointed-to
> > > objects to exist except if referenced by any visible reference. But
> > > because we treat _all_ local refs as uninteresting in the connectivity
> > > check, a client is free to send a packfile that references objects that
> > > are only reachable via a hidden reference on the server-side, and we
> > > will gladly accept it.
> >
> > You mention below that this is a correctness issue, but I am not sure
> > that I agree.
> >
> > The existing behavior is a little strange, I agree, but your argument
> > relies on an assumption that the history on hidden refs is not part of
> > the reachable set, which is not the case. Any part of the repository
> > that is reachable from _any_ reference, hidden or not, is reachable by
> > definition.
> >
> > So it's perfectly fine to consider objects on hidden refs to be in the
> > uninteresting set, because they are reachable. It's odd from the
> > client's perspective, but I do not see a path to repository corruption
> > with the existing behavior.
>
> Indeed, I'm not trying to say that this can lead to repository
> corruption.

I definitely agree with that. I have thought about this on-and-off since
you sent the topic, and I am pretty certain that there is no path to
repository corruption with the existing behavior. It would be worth
updating the commit message to make this clearer.

> But security-related or not, I think it is safe to say that any packfile
> sent by a client that does not contain objects required for the updated
> reference that the client cannot know to exist on the server-side must
> be generated by buggy code.

Maybe, though I think it's fine to let clients send us smaller packfiles
if they have some a-priori knowledge that the server has objects that it
isn't advertising. And that can all happen without buggy code. So it's
weird, but there isn't anything wrong with letting it happen.

> [snip]
> > Why do we see a slowdown when there aren't any hidden references?
> > Or am I misunderstanding your patch message which instead means "we see
> > a slow-down when there are no hidden references [since we still must
> > store and enumerate all advertised references]"?
>
> I have tried to dig down into the code of `revision.c` but ultimately
> returned empty-handed. I _think_ that this is because of the different
> paths we use when reading revisions from stdin as we have to resolve the
> revision to an OID first, which is more involved than taking the OIDs as
> returned by the reference backend. I have tried to short-circuit this
> logic in case the revision read from stdin is exactly `hash_algo->hexsz`
> long so that we try to parse it as an OID directly instead of trying to
> do any of the magic that is required to resolve a revision. But this
> only sped things up by a small margin.
>
> Another assumption was that this is overhead caused by using stdin
> instead of reading data from a file, but flame graphs didn't support
> this theory, either.
>
> > If the latter, could we avoid invoking the new machinery altogether? In
> > other words, shouldn't receive-pack only set the reachable_oids_fn() to
> > enumerate advertised references only when the set of advertised
> > references differs from the behavior of `--not --all`?
>
> Yeah, I was taking a "wait for feedback and see" stance on this. We can
> easily make the logic conditional on whether there are any hidden refs
> at all.

Yeah, I think that this would be preferable. I'm surprised that your
data doesn't support the idea that the slowdown is caused by reading
from stdin instead of parsing `--not --all`. I'd be curious to see what
you have tried so far.

I'm almost certain that forcing rev-list to chew through a bunch of data
on stdin that is basically equivalent to saying `--not --all` is going
to be the source of the slowdown.

> > >  	if (check_connected(iterate_receive_command_list, &data, &opt))
> > >  		set_connectivity_errors(commands, si);
> > >
> > > @@ -2462,6 +2473,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> > >  {
> > >  	int advertise_refs = 0;
> > >  	struct command *commands;
> > > +	struct oidset announced_oids = OIDSET_INIT;
> >
> > This looks like trading one problem for another. In your above example,
> > we now need to store 20 bytes of OIDs 6.8M times, or ~130 MiB. Not the
> > end of the world, but it feels like an avoidable problem.
>
> We store these references in an `oidset` before this patch set already,
> but yes, the lifetime is longer now. But note that this set stores the
> announced objects, not the hidden ones. So we don't store 6.8m OIDs, but
> only the 250k announced ones.

Hmm, OK. I wonder, could rev-list be given a `--refs=<namespace>`
argument which is equal to the advertised references? Or something that
implies "give me all of the references which aren't hidden?"

If we call that maybe `--visible-refs=receive` (to imply the references
*not* in receive.hideRefs), then our connectivity check would become:

    git rev-list --stdin --not --visible-refs=receive

or something like that. Not having to extend the lifetime of these OIDs
in an oidset would be worthwhile. Besides, having a new rev-list option
is cool, too ;-).

> > Could we enumerate the references in a callback to for_each_ref() and
> > only emit ones which aren't hidden? Storing these and then recalling
> > them after the fact is worth avoiding.
>
> Sorry, I don't quite get what you're proposing. `for_each_ref()` already
> does exactly that: it stores every reference that is not hidden in the
> above `oidset`. This is the exact set of advertised references, which in
> my example repository would be about 250k. This information is used by
> git-receive-pack(1) to avoid announcing the same object twice, and now
> it's also used to inform the connectivity check to use these objects as
> the set of already-reachable objects.

Yes, ignore me ;-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2022-10-28 16:40 ` [PATCH 0/2] " Junio C Hamano
@ 2022-11-01  1:30 ` Taylor Blau
  2022-11-01  9:00 ` Jeff King
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 88+ messages in thread
From: Taylor Blau @ 2022-11-01  1:30 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Fri, Oct 28, 2022 at 04:42:19PM +0200, Patrick Steinhardt wrote:
> this patch series improves the connectivity check done by stateful
> git-receive-pack(1) to only consider references as reachable that have
> been advertised to the client. This has two advantages:

A third advantage which I didn't see mentioned here is that we have a
better chance of being helped by reachability bitmaps here. Because we
aren't likely to cover hidden references with bitmaps, we would be
paying a significant cost to build up a complete picture of what's
reachable from --all.

That will inevitably involve a lot of traversing from unbitmapped
reference tips (i.e. the hidden ones) down to either a random bitmap
sprinkled throughout history, or the root of that line of history.

But if we limited the negated side of our connectivity check to just the
non-hidden references, then we will likely end up with more full bitmap
coverage of those tips, without having to do any additional traversal.

So that is definitely something worth experimenting with.

Thanks,
Taylor


* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-11-01  1:28       ` Taylor Blau
@ 2022-11-01  7:20         ` Patrick Steinhardt
  2022-11-01 11:53           ` Patrick Steinhardt
  0 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-01  7:20 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Mon, Oct 31, 2022 at 09:28:09PM -0400, Taylor Blau wrote:
> On Mon, Oct 31, 2022 at 03:45:47PM +0100, Patrick Steinhardt wrote:
> > On Sun, Oct 30, 2022 at 03:09:04PM -0400, Taylor Blau wrote:
> > > On Fri, Oct 28, 2022 at 04:42:27PM +0200, Patrick Steinhardt wrote:
> > > > This strategy has the major downside that it will not require any object
> > > > to be sent by the client that is reachable by any of the repositories'
> > > > references. While that sounds like it would be indeed what we are after
> > > > with the connectivity check, it is arguably not. The administrator that
> > > > manages the server-side Git repository may have configured certain refs
> > > > to be hidden during the reference advertisement via `transfer.hideRefs`
> > > > or `receive.hideRefs`. Whatever the reason, the result is that the
> > > > client shouldn't expect that any of those hidden references exists on
> > > > the remote side, and neither should they assume any of the pointed-to
> > > > objects to exist except if referenced by any visible reference. But
> > > > because we treat _all_ local refs as uninteresting in the connectivity
> > > > check, a client is free to send a packfile that references objects that
> > > > are only reachable via a hidden reference on the server-side, and we
> > > > will gladly accept it.
> > >
> > > You mention below that this is a correctness issue, but I am not sure
> > > that I agree.
> > >
> > > The existing behavior is a little strange, I agree, but your argument
> > > relies on an assumption that the history on hidden refs is not part of
> > > the reachable set, which is not the case. Any part of the repository
> > > that is reachable from _any_ reference, hidden or not, is reachable by
> > > definition.
> > >
> > > So it's perfectly fine to consider objects on hidden refs to be in the
> > > uninteresting set, because they are reachable. It's odd from the
> > > client's perspective, but I do not see a path to repository corruption
> > > > with the existing behavior.
> >
> > Indeed, I'm not trying to say that this can lead to repository
> > corruption.
> 
> I definitely agree with that. I have thought about this on-and-off since
> you sent the topic, and I am pretty certain that there is no path to
> repository corruption with the existing behavior. It would be worth
> updating the commit message to make this clearer.

Fair enough, I can try to do that.

> > But security-related or not, I think it is safe to say that any packfile
> > sent by a client that does not contain objects required for the updated
> > reference that the client cannot know to exist on the server-side must
> > be generated by buggy code.
> 
> Maybe, though I think it's fine to let clients send us smaller packfiles
> if they have some a-priori knowledge that the server has objects that it
> isn't advertising. And that can all happen without buggy code. So it's
> weird, but there isn't anything wrong with letting it happen.

Well, I don't see how to achieve both at the same time though: we can
either limit the set of uninteresting tips to what we have announced to
the client, or we allow clients to omit objects that have not been
announced. These are mutually exclusive.

So if we take the stance that it was fine to send packfiles that omit
hidden objects and that this is something we want to continue to support
then this patch series probably becomes moot. Doing the proposed
optimization means that we also tighten the rules here.

> > [snip]
> > > Why do we see a slowdown when there aren't any hidden references?
> > > Or am I misunderstanding your patch message which instead means "we see
> > > a slow-down when there are no hidden references [since we still must
> > > store and enumerate all advertised references]"?
> >
> > I have tried to dig down into the code of `revision.c` but ultimately
> > returned empty-handed. I _think_ that this is because of the different
> > paths we use when reading revisions from stdin as we have to resolve the
> > revision to an OID first, which is more involved than taking the OIDs as
> > returned by the reference backend. I have tried to short-circuit this
> > logic in case the revision read from stdin is exactly `hash_algo->hexsz`
> > long so that we try to parse it as an OID directly instead of trying to
> > do any of the magic that is required to resolve a revision. But this
> > only sped things up by a small margin.
> >
> > Another assumption was that this is overhead caused by using stdin
> > instead of reading data from a file, but flame graphs didn't support
> > this theory, either.
> >
> > > If the latter, could we avoid invoking the new machinery altogether? In
> > > other words, shouldn't receive-pack only set the reachable_oids_fn() to
> > > enumerate advertised references only when the set of advertised
> > > references differs from the behavior of `--not --all`?
> >
> > Yeah, I was taking a "wait for feedback and see" stance on this. We can
> > easily make the logic conditional on whether there are any hidden refs
> > at all.
> 
> Yeah, I think that this would be preferable. I'm surprised that your
> data doesn't support the idea that the slowdown is caused by reading
> from stdin instead of parsing `--not --all`. I'd be curious to see what
> you have tried so far.
> 
> I'm almost certain that forcing rev-list to chew through a bunch of data
> on stdin that is basically equivalent to saying `--not --all` is going
> to be the source of the slowdown.

I was basically benchmarking these two commands in the repository with
7m refs:

    # old-style connectivity check
    $ git commit-tree HEAD^{tree} -m commit -p HEAD >newtip
    $ git rev-list --objects --quiet --stdin --not --all <newtip

    # new-style connectivity check
    $ ( git for-each-ref --format='^%(objectname)' ; git commit-tree HEAD^{tree} -m commit -p HEAD ) >oldandnewtips
    $ git rev-list --objects --quiet --stdin <oldandnewtips

I may have another look today and maybe share some flame graphs if I
continue to not see anything obvious.

> > > >  	if (check_connected(iterate_receive_command_list, &data, &opt))
> > > >  		set_connectivity_errors(commands, si);
> > > >
> > > > @@ -2462,6 +2473,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> > > >  {
> > > >  	int advertise_refs = 0;
> > > >  	struct command *commands;
> > > > +	struct oidset announced_oids = OIDSET_INIT;
> > >
> > > This looks like trading one problem for another. In your above example,
> > > we now need to store 20 bytes of OIDs 6.8M times, or ~130 MiB. Not the
> > > end of the world, but it feels like an avoidable problem.
> >
> > We store these references in an `oidset` before this patch set already,
> > but yes, the lifetime is longer now. But note that this set stores the
> > announced objects, not the hidden ones. So we don't store 6.8m OIDs, but
> > only the 250k announced ones.
> 
> Hmm, OK. I wonder, could rev-list be given a `--refs=<namespace>`
> argument which is equal to the advertised references? Or something that
> implies "give me all of the references which aren't hidden?"
> 
> If we call that maybe `--visible-refs=receive` (to imply the references
> *not* in receive.hideRefs), then our connectivity check would become:
> 
>     git rev-list --stdin --not --visible-refs=receive
> 
> or something like that. Not having to extend the lifetime of these OIDs
> in an oidset would be worthwhile. Besides, having a new rev-list option
> is cool, too ;-).

That is indeed an interesting idea. It would likely both fix the issue
of extended memory lifetime of the announced OIDs as well as fixing the
slowdown caused by using `--stdin` for them.
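
For a sense of scale, here is a back-of-the-envelope sketch of the raw OID
storage discussed above. It assumes 20-byte SHA-1 object IDs and ignores any
hash-table overhead of the `oidset`, so the real footprint would be somewhat
higher.

```shell
# Raw bytes needed to hold N object IDs, assuming 20-byte SHA-1 OIDs.
oid_bytes=20
hidden_refs=6800000      # hidden refs in the example repository
advertised_refs=250000   # advertised refs in the example repository

# Integer arithmetic only, so the results are rounded down.
hidden_mib=$(( hidden_refs * oid_bytes / 1024 / 1024 ))
advertised_kib=$(( advertised_refs * oid_bytes / 1024 ))

echo "6.8M OIDs: ~${hidden_mib} MiB"
echo "250k OIDs: ~${advertised_kib} KiB"
```

So the advertised set costs a few MiB of raw OIDs, rather than the ~130 MiB
estimated for the hidden refs.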

Patrick


* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-31 14:45     ` Patrick Steinhardt
  2022-11-01  1:28       ` Taylor Blau
@ 2022-11-01  8:28       ` Jeff King
  1 sibling, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-01  8:28 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Taylor Blau, git

On Mon, Oct 31, 2022 at 03:45:47PM +0100, Patrick Steinhardt wrote:

> Indeed, I'm not trying to say that this can lead to repository
> corruption. If at all you can argue that this is more security-related.
> Suppose an object is not reachable from any public reference and that
> `allowAnySHA1InWant=false`. Then you could make these hidden objects
> reachable by sending a packfile with an object that references the
> hidden object. It naturally requires you to somehow know about the
> object ID, so I don't think this is a critical issue.

I'd have to double check, but isn't this all moot with the v2 protocol
anyway? I didn't think it even respected allowAnySHA1InWant.

Even that aside, there are other tricks you can do. E.g., pushing up an
object which claims to be a delta of X (though you have the trick then
of figuring out how to reference it), or pushing up a test object then
fetching it back while claiming to have X, treating the server as an
oracle which may give you a delta against X.

In short, I don't think it's worth treating unreachable objects in a
repository as secret.

> > Why do we see a slowdown when there aren't any hidden references?
> > Or am I misunderstanding your patch message which instead means "we see
> > a slow-down when there are no hidden references [since we still must
> > store and enumerate all advertised references]"?
> 
> I have tried to dig down into the code of `revision.c` but ultimately
> returned empty-handed. I _think_ that this is because of the different
> paths we use when reading revisions from stdin as we have to resolve the
> revision to an OID first, which is more involved than taking the OIDs as
> returned by the reference backend. I have tried to short-circuit this
> logic in case the revision read from stdin is exactly `hash_algo->hexsz`
> long so that we try to parse it as an OID directly instead of trying to
> do any of the magic that is required to resolve a revision. But this
> only sped things up by a small margin.
> 
> Another assumption was that this is overhead caused by using stdin
> instead of reading data from a file, but flame graphs didn't support
> this theory, either.

Certainly read_revisions_from_stdin() will allocate a bunch of extra
data per item it reads (via add_rev_cmdline() and add_pending), which is
going to be way more than parsing or stdin overhead.

Much worse is that it will call get_reference(), which is very keen to
actually open and parse the object (as you know, since you added
commit-graph handling there). That might have gotten a bit better in
v2.38.0 if you have any references to blobs (as we'd now skip the extra
hash check).

Of course the original "rev-list --not --all" would suffer from the same
thing, so some of that may not explain any difference between the two.

-Peff


* Re: [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2022-11-01  1:30 ` Taylor Blau
@ 2022-11-01  9:00 ` Jeff King
  2022-11-01 11:49   ` Patrick Steinhardt
  2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 88+ messages in thread
From: Jeff King @ 2022-11-01  9:00 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Taylor Blau, Junio C Hamano, git

On Fri, Oct 28, 2022 at 04:42:19PM +0200, Patrick Steinhardt wrote:

>     - A client shouldn't assume objects to exist that have not been part
>       of the reference advertisement. But if it excluded an object from
>       the packfile that is reachable via any ref that is excluded from
>       the reference advertisement due to `transfer.hideRefs` we'd have
>       accepted the push anyway. I'd argue that this is a bug in the
>       current implementation.

Like others, I don't think this is a bug exactly. We'd never introduce a
corruption. We're just more lenient with clients than we need to be.

But I don't think your scheme changes that. In a sense, the tips used by
"rev-list --not --all" are really an optimization. We will walk the
history from the to-be-updated ref tips all the way down to the roots if
we have to. So imagine that I have object X which is not referenced at
all (neither hidden nor visible ref). We obviously do not advertise it
to the client, but let's further imagine that a client sends us a pack
with X..Y, and a request to update some ref to Y.

Both before and after your code, if rev-list is able to walk down from Y
until either we hit all roots or all UNINTERESTING commits, it will be
satisfied. So as long as the receiving repo actually has all of the
history leading up to X, it will allow the push, regardless of your
patch.
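
A minimal sketch of this, assuming a throwaway repository with no refs at all
(the commit messages and the `demo` identity are made up for illustration).
The walk from Y succeeds whether or not X is supplied as a negative tip; the
negative tip only shortens the walk:

```shell
set -e
cd "$(mktemp -d)"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tree=$(git mktree </dev/null)               # write the empty tree
x=$(git commit-tree -m X "$tree")           # X: present, but no ref points at it
y=$(git commit-tree -m Y -p "$x" "$tree")   # Y: the tip the client wants to push

# No negative tips (the repository has no refs): walk from Y down to the root.
# With `set -e`, the script aborts here if the connectivity check fails.
echo "$y" | git rev-list --objects --quiet --stdin --not --all
without_tips=$(echo "$y" | git rev-list --objects --stdin --not --all | wc -l | tr -d ' ')

# X given as a negative tip: the walk stops as soon as it reaches X.
with_tips=$(printf '%s\n^%s\n' "$y" "$x" | git rev-list --objects --stdin | wc -l | tr -d ' ')

echo "objects walked without tips: $without_tips, with X as a tip: $with_tips"
```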

If we wanted to stop being lenient, we'd have to actually check that
every object we traverse is either reachable, or came from the
just-pushed pack.


There's also a subtle timing issue here. Our connectivity check happens
after we've finished receiving the pack. So not only are we including
hidden refs, but we are using the ref state at the end of the push
(after receiving and processing the incoming pack), rather than the
beginning.

From the same "leniency" lens this seems like the wrong thing. But as
above, it doesn't matter in practice, because these tips are really an
optimization to tell rev-list that it can stop traversing.

If you think of the connectivity check less as "did the client try to
cheat" and more as "is it OK to update these refs without introducing a
corruption", then it makes sense that you'd want to read the inputs
to the check as close to the ref update as possible, because it shrinks
the race window which could introduce corruption.

Imagine a situation like this:

  0. We advertise to client that we have commit X.

  1. Client starts pushing up a pack with X..Y and asks to update some
     branch to Y.

  2. Meanwhile, the branch with X is deleted, and X is pruned.

  3. Server finishes receiving the pack. All looks good, and then we
     start a connectivity check.

In the current code, that check starts with the current ref state (with
X deleted) as a given, and makes sure that we have the objects we need
to update the refs. After your patches, it would take X as a given, and
stop traversing when we see it.
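
A rough sketch of the "current code" half of this scenario, using a throwaway
repository. Deleting the loose object file by hand stands in for the prune in
step 2, and the sketch assumes the objects are still loose and that no
commit-graph has been written; both are properties of this toy setup, not of
a real server:

```shell
set -e
cd "$(mktemp -d)"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tree=$(git mktree </dev/null)               # write the empty tree
x=$(git commit-tree -m X "$tree")           # X: advertised in step 0
y=$(git commit-tree -m Y -p "$x" "$tree")   # Y: the only commit in the client's pack

# Step 2: the branch pointing at X is gone and X has been pruned.
prefix=${x%"${x#??}"}                       # first two hex digits of X
rm -f ".git/objects/$prefix/${x#??}"

# Step 3: connectivity check against the current ref state (no refs left).
if echo "$y" | git rev-list --objects --quiet --stdin --not --all 2>/dev/null
then
	result=accepted
else
	result=rejected
fi
echo "push would be $result"
```

Here the walk from Y runs into the missing X, so the check fails and the ref
update would be refused.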

That same race exists before your patch, but it's between the time of
"rev-list --not --all" running and the ref update. After your patch,
it's between the advertisement and the ref update, which can be a long
time (hours or even days, if the client is very slow).

In practice I'm not sure how big a deal this is. If we feed the
now-pruned X to rev-list, it may notice that X went away, though we've
been reducing the number of checks there in the name of efficiency
(e.g., if it's still in the commit graph, we'd say "OK, good enough"
these days, even if we don't have it on disk anymore).

But it feels like a wrong direction to make that race longer if there's
no need to.

So all that said...

>     - Second, by using advertised refs as inputs instead of `git
>       rev-list --not --all` we avoid looking up all refs that are
>       irrelevant to the current push. This can be a huge performance
>       improvement in repos that have a huge amount of internal, hidden
>       refs. In one of our repos with 7m refs, of which 6.8m are hidden,
>       this speeds up pushes from ~30s to ~4.5s.

I like the general direction here of avoiding the hidden refs. The
client _shouldn't_ have been using them, so we can optimistically assume
they're useless (and in the case of races or other weirdness, rev-list
just ends up traversing a bit further).

But we can split the two ideas in your series:

  1. Feed the advertised tips from receive-pack to rev-list.

  2. Avoid traversing from the hidden tips.

Doing (1) gets you (2) for free. But if we don't want to do (1), and I
don't think we do, we can get (2) by just teaching rev-list to narrow
the check.

I see some discussion in the other part of the thread, and we may need a
new rev-list option to do this, as mentioned there. However, you _might_
be able to do it with the existing --exclude mechanism. I.e., something like:

  rev-list --stdin --not --exclude 'refs/hidden/*' --all

The gotchas are:

  - I'm not 100% sure that --exclude globbing and transfer.hideRefs
    syntax are compatible. You'd want to check.

  - these would have to come on the command line (at least with the
    current code). Probably nobody has enough hiderefs patterns for that
    to be a problem (and remember we are passing the glob pattern here,
    not the 6.8M refs themselves). But it could bite somebody in a
    pathological case.
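
A quick sketch of how `--exclude` narrows `--all`, using a throwaway
repository. The `refs/hidden/` namespace is just the placeholder from the
command above, not anything Git defines, and the ref names are made up:

```shell
set -e
cd "$(mktemp -d)"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tree=$(git mktree </dev/null)                    # write the empty tree
visible=$(git commit-tree -m visible "$tree")    # root commit on a visible ref
hidden=$(git commit-tree -m hidden "$tree")      # root commit on a "hidden" ref
git update-ref refs/heads/visible "$visible"
git update-ref refs/hidden/secret "$hidden"

# --exclude applies to the next --all, so only the visible tip is picked up.
tips=$(git rev-list --exclude='refs/hidden/*' --all)
echo "$tips"
```

Whether the `transfer.hideRefs` patterns can be translated into such globs
faithfully is exactly the first gotcha above; this only shows the mechanism.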

-Peff


* Re: [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-11-01  9:00 ` Jeff King
@ 2022-11-01 11:49   ` Patrick Steinhardt
  0 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-01 11:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Junio C Hamano, git

On Tue, Nov 01, 2022 at 05:00:22AM -0400, Jeff King wrote:
> On Fri, Oct 28, 2022 at 04:42:19PM +0200, Patrick Steinhardt wrote:
> 
> >     - A client shouldn't assume objects to exist that have not been part
> >       of the reference advertisement. But if it excluded an object from
> >       the packfile that is reachable via any ref that is excluded from
> >       the reference advertisement due to `transfer.hideRefs` we'd have
> >       accepted the push anyway. I'd argue that this is a bug in the
> >       current implementation.
> 
> Like others, I don't think this is a bug exactly. We'd never introduce a
> corruption. We're just more lenient with clients than we need to be.
> 
> But I don't think your scheme changes that. In a sense, the tips used by
> "rev-list --not --all" are really an optimization. We will walk the
> history from the to-be-updated ref tips all the way down to the roots if
> we have to. So imagine that I have object X which is not referenced at
> all (neither hidden nor visible ref). We obviously do not advertise it
> to the client, but let's further imagine that a client sends us a pack
> with X..Y, and a request to update some ref to Y.
> 
> Both before and after your code, if rev-list is able to walk down from Y
> until either we hit all roots or all UNINTERESTING commits, it will be
> satisfied. So as long as the receiving repo actually has all of the
> history leading up to X, it will allow the push, regardless of your
> patch.

Oh, right! Now I see where my thinko was, which means both you and
Taylor are correct. I somehow assumed that we'd fail the connectivity
check in that case, but all it means is that we now potentially walk
more objects than we'd have done if we used `--not --all`.

> If we wanted to stop being lenient, we'd have to actually check that
> every object we traverse is either reachable, or came from the
> just-pushed pack.

Yes, indeed.

> There's also a subtle timing issue here. Our connectivity check happens
> after we've finished receiving the pack. So not only are we including
> hidden refs, but we are using the ref state at the end of the push
> (after receiving and processing the incoming pack), rather than the
> beginning.
> 
> From the same "leniency" lens this seems like the wrong thing. But as
> above, it doesn't matter in practice, because these tips are really an
> optimization to tell rev-list that it can stop traversing.
> 
> If you think of the connectivity check less as "did the client try to
> cheat" and more as "is it OK to update these refs without introducing a
> corruption", then it makes sense that you'd want to read the inputs
> to the check as close to the ref update as possible, because it shrinks
> the race window which could introduce corruption.

Agreed.

> Imagine a situation like this:
> 
>   0. We advertise to client that we have commit X.
> 
>   1. Client starts pushing up a pack with X..Y and asks to update some
>      branch to Y.
> 
>   2. Meanwhile, the branch with X is deleted, and X is pruned.
> 
>   3. Server finishes receiving the pack. All looks good, and then we
>      start a connectivity check.
> 
> In the current code, that check starts with the current ref state (with
> X deleted) as a given, and makes sure that we have the objects we need
> to update the refs. After your patches, it would take X as a given, and
> stop traversing when we see it.
> 
> That same race exists before your patch, but it's between the time of
> "rev-list --not --all" running and the ref update. After your patch,
> it's between the advertisement and the ref update, which can be a long
> time (hours or even days, if the client is very slow).
> 
> In practice I'm not sure how big a deal this is. If we feed the
> now-pruned X to rev-list, it may notice that X went away, though we've
> been reducing the number of checks there in the name of efficiency
> (e.g., if it's still in the commit graph, we'd say "OK, good enough"
> these days, even if we don't have it on disk anymore).
> 
> But it feels like a wrong direction to make that race longer if there's
> no need to.

Good point.

> So all that said...
> 
> >     - Second, by using advertised refs as inputs instead of `git
> >       rev-list --not --all` we avoid looking up all refs that are
> >       irrelevant to the current push. This can be a huge performance
> >       improvement in repos that have a huge amount of internal, hidden
> >       refs. In one of our repos with 7m refs, of which 6.8m are hidden,
> >       this speeds up pushes from ~30s to ~4.5s.
> 
> I like the general direction here of avoiding the hidden refs. The
> client _shouldn't_ have been using them, so we can optimistically assume
> they're useless (and in the case of races or other weirdness, rev-list
> just ends up traversing a bit further).
> 
> But we can split the two ideas in your series:
> 
>   1. Feed the advertised tips from receive-pack to rev-list.
> 
>   2. Avoid traversing from the hidden tips.
> 
> Doing (1) gets you (2) for free. But if we don't want to do (1), and I
> don't think we do, we can get (2) by just teaching rev-list to narrow
> the check.
> 
> I see some discussion in the other part of the thread, and we may need a
> new rev-list option to do this, as mentioned there. However, you _might_
> be able to do it with the existing --exclude mechanism. I.e., something like:
> 
>   rev-list --stdin --not --exclude 'refs/hidden/*' --all

Yeah, Taylor proposed to add a new `--visible-refs=receive` option that
lets git-rev-list(1) automatically add all references that are visible
when paying attention to `receive.hideRefs`. I actually like this idea
and will likely have a look at how easy or hard it is to implement.

> The gotchas are:
> 
>   - I'm not 100% sure that --exclude globbing and transfer.hideRefs
>     syntax are compatible. You'd want to check.
> 
>   - these would have to come on the command line (at least with the
>     current code). Probably nobody has enough hiderefs patterns for that
>     to be a problem (and remember we are passing the glob pattern here,
>     not the 6.8M refs themselves). But it could bite somebody in a
>     pathological case.
> 
> -Peff

Well, we can avoid these gotchas if we used `--visible-refs`.

Patrick


* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-11-01  7:20         ` Patrick Steinhardt
@ 2022-11-01 11:53           ` Patrick Steinhardt
  2022-11-02  1:05             ` Taylor Blau
  0 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-01 11:53 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Tue, Nov 01, 2022 at 08:20:48AM +0100, Patrick Steinhardt wrote:
> On Mon, Oct 31, 2022 at 09:28:09PM -0400, Taylor Blau wrote:
> > On Mon, Oct 31, 2022 at 03:45:47PM +0100, Patrick Steinhardt wrote:
> > > On Sun, Oct 30, 2022 at 03:09:04PM -0400, Taylor Blau wrote:
> > > > On Fri, Oct 28, 2022 at 04:42:27PM +0200, Patrick Steinhardt wrote:
> > > > > This strategy has the major downside that it will not require any object
> > > > > to be sent by the client that is reachable by any of the repositories'
> > > > > references. While that sounds like it would be indeed what we are after
> > > > > with the connectivity check, it is arguably not. The administrator that
> > > > > manages the server-side Git repository may have configured certain refs
> > > > > to be hidden during the reference advertisement via `transfer.hideRefs`
> > > > > or `receive.hideRefs`. Whatever the reason, the result is that the
> > > > > client shouldn't expect that any of those hidden references exists on
> > > > > the remote side, and neither should they assume any of the pointed-to
> > > > > objects to exist except if referenced by any visible reference. But
> > > > > because we treat _all_ local refs as uninteresting in the connectivity
> > > > > check, a client is free to send a packfile that references objects that
> > > > > are only reachable via a hidden reference on the server-side, and we
> > > > > will gladly accept it.
> > > >
> > > > You mention below that this is a correctness issue, but I am not sure
> > > > that I agree.
> > > >
> > > > The existing behavior is a little strange, I agree, but your argument
> > > > relies on an assumption that the history on hidden refs is not part of
> > > > the reachable set, which is not the case. Any part of the repository
> > > > that is reachable from _any_ reference, hidden or not, is reachable by
> > > > definition.
> > > >
> > > > So it's perfectly fine to consider objects on hidden refs to be in the
> > > > uninteresting set, because they are reachable. It's odd from the
> > > > client's perspective, but I do not see a path to repository corruption
> > > > > with the existing behavior.
> > >
> > > Indeed, I'm not trying to say that this can lead to repository
> > > corruption.
> > 
> > I definitely agree with that. I have thought about this on-and-off since
> > you sent the topic, and I am pretty certain that there is no path to
> > repository corruption with the existing behavior. It would be worth
> > updating the commit message to make this clearer.
> 
> Fair enough, I can try to do that.
> 
> > > But security-related or not, I think it is safe to say that any packfile
> > > sent by a client that does not contain objects required for the updated
> > > reference that the client cannot know to exist on the server-side must
> > > be generated by buggy code.
> > 
> > Maybe, though I think it's fine to let clients send us smaller packfiles
> > if they have some a-priori knowledge that the server has objects that it
> > isn't advertising. And that can all happen without buggy code. So it's
> > weird, but there isn't anything wrong with letting it happen.
> 
> Well, I don't see how to achieve both at the same time though: we can
> either limit the set of uninteresting tips to what we have announced to
> the client, or we allow clients to omit objects that have not been
> announced. These are mutually exclusive.
> 
> So if we take the stance that it was fine to send packfiles that omit
> hidden objects and that this is something we want to continue to support
> then this patch series probably becomes moot. Doing the proposed
> optimization means that we also tighten the rules here.

I'm wrong and you're right: we can do the optimization to limit the refs
we use but still let clients send objects that are hidden. I didn't take
into account that this is merely an optimization that we stop walking at
reachable tips. I'll reword the commit message when having another go
and will likely do something along the lines of your proposed new
`--visible-refs` option in v2 of this series.

Patrick


* Re: [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check
  2022-11-01 11:53           ` Patrick Steinhardt
@ 2022-11-02  1:05             ` Taylor Blau
  0 siblings, 0 replies; 88+ messages in thread
From: Taylor Blau @ 2022-11-02  1:05 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Taylor Blau, git

On Tue, Nov 01, 2022 at 12:53:42PM +0100, Patrick Steinhardt wrote:
> > > Maybe, though I think it's fine to let clients send us smaller packfiles
> > > if they have some a-priori knowledge that the server has objects that it
> > > isn't advertising. And that can all happen without buggy code. So it's
> > > weird, but there isn't anything wrong with letting it happen.
> >
> > Well, I don't see how to achieve both at the same time though: we can
> > either limit the set of uninteresting tips to what we have announced to
> > the client, or we allow clients to omit objects that have not been
> > announced. These are mutually exclusive.
> >
> > So if we take the stance that it was fine to send packfiles that omit
> > hidden objects and that this is something we want to continue to support
> > then this patch series probably becomes moot. Doing the proposed
> > optimization means that we also tighten the rules here.
>
> I'm wrong and you're right: we can do the optimization to limit the refs
> we use but still let clients send objects that are hidden. I didn't take
> into account that this is merely an optimization that we stop walking at
> reachable tips. I'll reword the commit message when having another go
> and will likely do something along the lines of your proposed new
> `--visible-refs` option in v2 of this series.

I wasn't necessarily advocating for a behavior change in this series,
more pointing out that the situation you said can only happen with buggy
code doesn't actually require a bug in practice.

Thanks,
Taylor


* [PATCH v2 0/3] receive-pack: only use visible refs for connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2022-11-01  9:00 ` Jeff King
@ 2022-11-03 14:37 ` Patrick Steinhardt
  2022-11-03 14:37   ` [PATCH v2 1/3] refs: get rid of global list of hidden refs Patrick Steinhardt
                     ` (4 more replies)
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
                   ` (3 subsequent siblings)
  9 siblings, 5 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-03 14:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

Hi,

this is the second version of my patch series that tries to improve
performance of the connectivity check by only considering preexisting
refs as uninteresting that could actually have been advertised to the
client.

This version uses the same idea as v1, but it's basically rewritten
based on Taylor's idea of adding a `git rev-list --visible-refs=`
switch. This fixes two major concerns:

    1. The performance regression in repositories without hidden refs
       is now gone. Previously, there was a 10% regression in one
       specific benchmark that was caused by reading advertised tips
       via stdin in git-rev-list(1).

    2. It fixes Jeff's concerns around a TOCTOU-style race with very slow
       clients. It could in theory happen that a push takes multiple
       days. With the previous idea of reusing advertised refs for the
       connectivity check, the end result would be that we perform the
       check with a set of refs that was generated when the push
       started and may be stale by the time the check runs.

The series is structured as follows:

    - Patch 1 does some preliminary cleanups to de-globalize the set of
      parsed hidden refs so that we can use the existing functions
      easily for the new `--visible-refs=` option.

    - Patch 2 adds the new `--visible-refs=` option to git-rev-list(1).

    - Patch 3 converts git-receive-pack(1) to use `--visible-refs=`
      instead of `--all`.

Overall the performance improvement isn't quite as strong as before:
we're only 4.5x faster compared to 6.5x in our repo. But I guess that's
still a good-enough improvement, doubly so now that there are no
downsides anymore for repos without any hidden refs.

Patrick

Patrick Steinhardt (3):
  refs: get rid of global list of hidden refs
  revision: add new parameter to specify all visible refs
  receive-pack: only use visible refs for connectivity check

 Documentation/rev-list-options.txt |  15 +++--
 builtin/receive-pack.c             |  10 ++-
 builtin/rev-list.c                 |   1 +
 builtin/rev-parse.c                |   1 +
 connected.c                        |   5 +-
 connected.h                        |   6 ++
 ls-refs.c                          |  13 ++--
 refs.c                             |  14 ++--
 refs.h                             |   5 +-
 revision.c                         |  34 +++++++++-
 t/t6021-rev-list-visible-refs.sh   | 102 +++++++++++++++++++++++++++++
 upload-pack.c                      |  30 +++++----
 12 files changed, 196 insertions(+), 40 deletions(-)
 create mode 100755 t/t6021-rev-list-visible-refs.sh

-- 
2.38.1



* [PATCH v2 1/3] refs: get rid of global list of hidden refs
  2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
@ 2022-11-03 14:37   ` Patrick Steinhardt
  2022-11-03 14:37   ` [PATCH v2 2/3] revision: add new parameter to specify all visible refs Patrick Steinhardt
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-03 14:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.

Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
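Note for readers, not part of the commit message: the effect of this
refactoring can be illustrated with a small standalone model. The sketch
below is not git's code. `struct hide_list` and `model_ref_is_hidden()`
are hypothetical stand-ins for `struct string_list` and
`ref_is_hidden()`, and the matching rules are simplified (no namespace
handling, no `^` patterns). It only aims to show why passing the list as
a parameter lets two differently configured lists coexist in one
process.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's string_list: a plain array of patterns. */
struct hide_list {
	const char **patterns;
	size_t nr;
};

/*
 * Simplified model of a hidden-ref lookup that receives the list as a
 * parameter instead of reading a global. A pattern hides a ref when it
 * is a prefix that ends at a path-component boundary. A leading '!'
 * negates the pattern, and later patterns take precedence over earlier
 * ones, which is why the loop runs backwards.
 */
static int model_ref_is_hidden(const char *refname,
			       const struct hide_list *list)
{
	size_t i;

	for (i = list->nr; i > 0; i--) {
		const char *match = list->patterns[i - 1];
		int neg = 0;
		size_t len;

		if (*match == '!') {
			neg = 1;
			match++;
		}
		len = strlen(match);
		if (!strncmp(refname, match, len) &&
		    (refname[len] == '\0' || refname[len] == '/'))
			return !neg;
	}
	return 0;
}
```

With one list holding `refs/hidden` and another holding `refs/hidden`
followed by `!refs/hidden/public`, the ref `refs/hidden/public/y` is
hidden under the first list and visible under the second. A single
global list could not express both views at the same time.
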
 builtin/receive-pack.c |  8 +++++---
 ls-refs.c              | 13 +++++++++----
 refs.c                 | 14 ++++----------
 refs.h                 |  5 +++--
 upload-pack.c          | 30 ++++++++++++++++++------------
 5 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 44bcea3a5b..1f3efc58fb 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -80,6 +80,7 @@ static struct object_id push_cert_oid;
 static struct signature_check sigcheck;
 static const char *push_cert_nonce;
 static const char *cert_nonce_seed;
+static struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 
 static const char *NONCE_UNSOLICITED = "UNSOLICITED";
 static const char *NONCE_BAD = "BAD";
@@ -130,7 +131,7 @@ static enum deny_action parse_deny_action(const char *var, const char *value)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
-	int status = parse_hide_refs_config(var, value, "receive");
+	int status = parse_hide_refs_config(var, value, "receive", &hidden_refs);
 
 	if (status)
 		return status;
@@ -296,7 +297,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	struct oidset *seen = data;
 	const char *path = strip_namespace(path_full);
 
-	if (ref_is_hidden(path, path_full))
+	if (ref_is_hidden(path, path_full, &hidden_refs))
 		return 0;
 
 	/*
@@ -1794,7 +1795,7 @@ static void reject_updates_to_hidden(struct command *commands)
 		strbuf_setlen(&refname_full, prefix_len);
 		strbuf_addstr(&refname_full, cmd->ref_name);
 
-		if (!ref_is_hidden(cmd->ref_name, refname_full.buf))
+		if (!ref_is_hidden(cmd->ref_name, refname_full.buf, &hidden_refs))
 			continue;
 		if (is_null_oid(&cmd->new_oid))
 			cmd->error_string = "deny deleting a hidden ref";
@@ -2591,6 +2592,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		packet_flush(1);
 	oid_array_clear(&shallow);
 	oid_array_clear(&ref);
+	string_list_clear(&hidden_refs, 1);
 	free((void *)push_cert_nonce);
 	return 0;
 }
diff --git a/ls-refs.c b/ls-refs.c
index fa0d01b47c..ae89f850e9 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -6,6 +6,7 @@
 #include "ls-refs.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "string-list.h"
 
 static int config_read;
 static int advertise_unborn;
@@ -73,6 +74,7 @@ struct ls_refs_data {
 	unsigned symrefs;
 	struct strvec prefixes;
 	struct strbuf buf;
+	struct string_list hidden_refs;
 	unsigned unborn : 1;
 };
 
@@ -84,7 +86,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 
 	strbuf_reset(&data->buf);
 
-	if (ref_is_hidden(refname_nons, refname))
+	if (ref_is_hidden(refname_nons, refname, &data->hidden_refs))
 		return 0;
 
 	if (!ref_match(&data->prefixes, refname_nons))
@@ -137,14 +139,15 @@ static void send_possibly_unborn_head(struct ls_refs_data *data)
 }
 
 static int ls_refs_config(const char *var, const char *value,
-			  void *data UNUSED)
+			  void *cb_data)
 {
+	struct ls_refs_data *data = cb_data;
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
 	 * don't yet know how that information will be passed to ls-refs.
 	 */
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 int ls_refs(struct repository *r, struct packet_reader *request)
@@ -154,9 +157,10 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	memset(&data, 0, sizeof(data));
 	strvec_init(&data.prefixes);
 	strbuf_init(&data.buf, 0);
+	string_list_init_dup(&data.hidden_refs);
 
 	ensure_config_read();
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -195,6 +199,7 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	packet_fflush(stdout);
 	strvec_clear(&data.prefixes);
 	strbuf_release(&data.buf);
+	string_list_clear(&data.hidden_refs, 1);
 	return 0;
 }
 
diff --git a/refs.c b/refs.c
index 1491ae937e..f1711e2e9f 100644
--- a/refs.c
+++ b/refs.c
@@ -1414,9 +1414,8 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 					    refname, strict);
 }
 
-static struct string_list *hide_refs;
-
-int parse_hide_refs_config(const char *var, const char *value, const char *section)
+int parse_hide_refs_config(const char *var, const char *value, const char *section,
+			   struct string_list *hide_refs)
 {
 	const char *key;
 	if (!strcmp("transfer.hiderefs", var) ||
@@ -1431,21 +1430,16 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 		len = strlen(ref);
 		while (len && ref[len - 1] == '/')
 			ref[--len] = '\0';
-		if (!hide_refs) {
-			CALLOC_ARRAY(hide_refs, 1);
-			hide_refs->strdup_strings = 1;
-		}
 		string_list_append(hide_refs, ref);
 	}
 	return 0;
 }
 
-int ref_is_hidden(const char *refname, const char *refname_full)
+int ref_is_hidden(const char *refname, const char *refname_full,
+		  const struct string_list *hide_refs)
 {
 	int i;
 
-	if (!hide_refs)
-		return 0;
 	for (i = hide_refs->nr - 1; i >= 0; i--) {
 		const char *match = hide_refs->items[i].string;
 		const char *subject;
diff --git a/refs.h b/refs.h
index 8958717a17..3266fd8f57 100644
--- a/refs.h
+++ b/refs.h
@@ -808,7 +808,8 @@ int update_ref(const char *msg, const char *refname,
 	       const struct object_id *new_oid, const struct object_id *old_oid,
 	       unsigned int flags, enum action_on_err onerr);
 
-int parse_hide_refs_config(const char *var, const char *value, const char *);
+int parse_hide_refs_config(const char *var, const char *value, const char *,
+			   struct string_list *);
 
 /*
  * Check whether a ref is hidden. If no namespace is set, both the first and
@@ -818,7 +819,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *);
  * the ref is outside that namespace, the first parameter is NULL. The second
  * parameter always points to the full ref name.
  */
-int ref_is_hidden(const char *, const char *);
+int ref_is_hidden(const char *, const char *, const struct string_list *);
 
 /* Is this a per-worktree ref living in the refs/ namespace? */
 int is_per_worktree_ref(const char *refname);
diff --git a/upload-pack.c b/upload-pack.c
index 0b8311bd68..9db17f8787 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -62,6 +62,7 @@ struct upload_pack_data {
 	struct object_array have_obj;
 	struct oid_array haves;					/* v2 only */
 	struct string_list wanted_refs;				/* v2 only */
+	struct string_list hidden_refs;
 
 	struct object_array shallows;
 	struct string_list deepen_not;
@@ -118,6 +119,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct string_list symref = STRING_LIST_INIT_DUP;
 	struct string_list wanted_refs = STRING_LIST_INIT_DUP;
+	struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 	struct object_array want_obj = OBJECT_ARRAY_INIT;
 	struct object_array have_obj = OBJECT_ARRAY_INIT;
 	struct oid_array haves = OID_ARRAY_INIT;
@@ -130,6 +132,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 	memset(data, 0, sizeof(*data));
 	data->symref = symref;
 	data->wanted_refs = wanted_refs;
+	data->hidden_refs = hidden_refs;
 	data->want_obj = want_obj;
 	data->have_obj = have_obj;
 	data->haves = haves;
@@ -151,6 +154,7 @@ static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	string_list_clear(&data->symref, 1);
 	string_list_clear(&data->wanted_refs, 1);
+	string_list_clear(&data->hidden_refs, 1);
 	object_array_clear(&data->want_obj);
 	object_array_clear(&data->have_obj);
 	oid_array_clear(&data->haves);
@@ -842,8 +846,8 @@ static void deepen(struct upload_pack_data *data, int depth)
 		 * Checking for reachable shallows requires that our refs be
 		 * marked with OUR_REF.
 		 */
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, data);
+		for_each_namespaced_ref(check_ref, data);
 
 		get_reachable_list(data, &reachable_shallows);
 		result = get_shallow_commits(&reachable_shallows,
@@ -1158,11 +1162,11 @@ static void receive_needs(struct upload_pack_data *data,
 
 /* return non-zero if the ref is hidden, otherwise 0 */
 static int mark_our_ref(const char *refname, const char *refname_full,
-			const struct object_id *oid)
+			const struct object_id *oid, const struct string_list *hidden_refs)
 {
 	struct object *o = lookup_unknown_object(the_repository, oid);
 
-	if (ref_is_hidden(refname, refname_full)) {
+	if (ref_is_hidden(refname, refname_full, hidden_refs)) {
 		o->flags |= HIDDEN_REF;
 		return 1;
 	}
@@ -1171,11 +1175,12 @@ static int mark_our_ref(const char *refname, const char *refname_full,
 }
 
 static int check_ref(const char *refname_full, const struct object_id *oid,
-		     int flag UNUSED, void *cb_data UNUSED)
+		     int flag UNUSED, void *cb_data)
 {
 	const char *refname = strip_namespace(refname_full);
+	struct upload_pack_data *data = cb_data;
 
-	mark_our_ref(refname, refname_full, oid);
+	mark_our_ref(refname, refname_full, oid, &data->hidden_refs);
 	return 0;
 }
 
@@ -1204,7 +1209,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	struct object_id peeled;
 	struct upload_pack_data *data = cb_data;
 
-	if (mark_our_ref(refname_nons, refname, oid))
+	if (mark_our_ref(refname_nons, refname, oid, &data->hidden_refs))
 		return 0;
 
 	if (capabilities) {
@@ -1327,7 +1332,7 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
@@ -1375,8 +1380,8 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 		advertise_shallow_grafts(1);
 		packet_flush(1);
 	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, &data);
+		for_each_namespaced_ref(check_ref, &data);
 	}
 
 	if (!advertise_refs) {
@@ -1441,6 +1446,7 @@ static int parse_want(struct packet_writer *writer, const char *line,
 
 static int parse_want_ref(struct packet_writer *writer, const char *line,
 			  struct string_list *wanted_refs,
+			  struct string_list *hidden_refs,
 			  struct object_array *want_obj)
 {
 	const char *refname_nons;
@@ -1451,7 +1457,7 @@ static int parse_want_ref(struct packet_writer *writer, const char *line,
 		struct strbuf refname = STRBUF_INIT;
 
 		strbuf_addf(&refname, "%s%s", get_git_namespace(), refname_nons);
-		if (ref_is_hidden(refname_nons, refname.buf) ||
+		if (ref_is_hidden(refname_nons, refname.buf, hidden_refs) ||
 		    read_ref(refname.buf, &oid)) {
 			packet_writer_error(writer, "unknown ref %s", refname_nons);
 			die("unknown ref %s", refname_nons);
@@ -1508,7 +1514,7 @@ static void process_args(struct packet_reader *request,
 			continue;
 		if (data->allow_ref_in_want &&
 		    parse_want_ref(&data->writer, arg, &data->wanted_refs,
-				   &data->want_obj))
+				   &data->hidden_refs, &data->want_obj))
 			continue;
 		/* process have line */
 		if (parse_have(arg, &data->haves))
-- 
2.38.1



* [PATCH v2 2/3] revision: add new parameter to specify all visible refs
  2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
  2022-11-03 14:37   ` [PATCH v2 1/3] refs: get rid of global list of hidden refs Patrick Steinhardt
@ 2022-11-03 14:37   ` Patrick Steinhardt
  2022-11-05 12:46     ` Jeff King
  2022-11-05 12:55     ` Jeff King
  2022-11-03 14:37   ` [PATCH v2 3/3] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
                     ` (2 subsequent siblings)
  4 siblings, 2 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-03 14:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
not an easy way to obtain the list of all visible or hidden refs right
now. We'll require just that though for a performance improvement in our
connectivity check.

Add a new pseudo-ref `--visible-refs=` that pretends as if all refs have
been added to the command line that are not hidden. The pseudo-ref
requires either one of "transfer", "uploadpack" or "receive" as argument
to pay attention to `transfer.hideRefs`, `uploadpack.hideRefs` or
`receive.hideRefs`, respectively.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
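Note for readers, not part of the commit message: the section argument
decides which configuration keys feed the hidden-ref list. The sketch
below is a standalone model under stated assumptions, not the code in
this patch. `model_is_valid_section()` and `model_key_applies()` are
hypothetical names, and the keys are assumed to arrive lowercased, as
git's config callbacks see them. It mirrors the rule exercised by the
tests in this patch: `transfer.hiderefs` always applies, while
`<section>.hiderefs` applies only for the matching section.

```c
#include <stddef.h>
#include <string.h>

/* Model: only these three sections are accepted by the proposed option. */
static int model_is_valid_section(const char *section)
{
	return !strcmp(section, "transfer") ||
	       !strcmp(section, "receive") ||
	       !strcmp(section, "uploadpack");
}

/*
 * Model: does the config key `var` contribute hidden-ref patterns when
 * the option is given `section`? `var` is assumed to be lowercased.
 */
static int model_key_applies(const char *var, const char *section)
{
	size_t len = strlen(section);

	if (!strcmp(var, "transfer.hiderefs"))
		return 1;
	/* var + len is only evaluated once the first len bytes matched. */
	return !strncmp(var, section, len) &&
	       !strcmp(var + len, ".hiderefs");
}
```

Under this model, `--visible-refs=receive` honors both
`transfer.hideRefs` and `receive.hideRefs`, whereas
`--visible-refs=transfer` ignores the section-specific keys.
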
 Documentation/rev-list-options.txt |  15 +++--
 builtin/rev-list.c                 |   1 +
 builtin/rev-parse.c                |   1 +
 revision.c                         |  34 +++++++++-
 t/t6021-rev-list-visible-refs.sh   | 102 +++++++++++++++++++++++++++++
 5 files changed, 145 insertions(+), 8 deletions(-)
 create mode 100755 t/t6021-rev-list-visible-refs.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 1837509566..a0e34b0e2b 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -180,14 +180,19 @@ endif::git-log[]
 	is automatically prepended if missing. If pattern lacks '?', '{asterisk}',
 	or '[', '/{asterisk}' at the end is implied.
 
+--visible-refs=[transfer|receive|uploadpack]::
+	Pretend as if all the refs that have not been hidden via either one of
+	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` are
+	listed on the command line.
+
 --exclude=<glob-pattern>::
 
 	Do not include refs matching '<glob-pattern>' that the next `--all`,
-	`--branches`, `--tags`, `--remotes`, or `--glob` would otherwise
-	consider. Repetitions of this option accumulate exclusion patterns
-	up to the next `--all`, `--branches`, `--tags`, `--remotes`, or
-	`--glob` option (other options or arguments do not clear
-	accumulated patterns).
+	`--branches`, `--tags`, `--remotes`, `--glob` or `--visible-refs` would
+	otherwise consider. Repetitions of this option accumulate exclusion
+	patterns up to the next `--all`, `--branches`, `--tags`, `--remotes`,
+	`--glob` or `--visible-refs` option (other options or arguments do not
+	clear accumulated patterns).
 +
 The patterns given should not begin with `refs/heads`, `refs/tags`, or
 `refs/remotes` when applied to `--branches`, `--tags`, or `--remotes`,
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 3acd93f71e..f719286cf8 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -38,6 +38,7 @@ static const char rev_list_usage[] =
 "    --tags\n"
 "    --remotes\n"
 "    --stdin\n"
+"    --visible-refs=[transfer|receive|uploadpack]\n"
 "    --quiet\n"
 "  ordering output:\n"
 "    --topo-order\n"
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 8f61050bde..31617bf3d5 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -77,6 +77,7 @@ static int is_rev_argument(const char *arg)
 		"--topo-order",
 		"--date-order",
 		"--unpacked",
+		"--visible-refs=",
 		NULL
 	};
 	const char **p = rev_args;
diff --git a/revision.c b/revision.c
index 0760e78936..ef9e2947af 100644
--- a/revision.c
+++ b/revision.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "config.h"
 #include "object-store.h"
 #include "tag.h"
 #include "blob.h"
@@ -1523,6 +1524,8 @@ struct all_refs_cb {
 	struct rev_info *all_revs;
 	const char *name_for_errormsg;
 	struct worktree *wt;
+	struct string_list hidden_refs;
+	const char *hidden_refs_section;
 };
 
 int ref_excluded(struct string_list *ref_excludes, const char *path)
@@ -1542,11 +1545,13 @@ static int handle_one_ref(const char *path, const struct object_id *oid,
 			  int flag UNUSED,
 			  void *cb_data)
 {
+	const char *stripped_path = strip_namespace(path);
 	struct all_refs_cb *cb = cb_data;
 	struct object *object;
 
-	if (ref_excluded(cb->all_revs->ref_excludes, path))
-	    return 0;
+	if (ref_excluded(cb->all_revs->ref_excludes, path) ||
+	    ref_is_hidden(stripped_path, path, &cb->hidden_refs))
+		return 0;
 
 	object = get_reference(cb->all_revs, path, oid, cb->all_flags);
 	add_rev_cmdline(cb->all_revs, object, path, REV_CMD_REF, cb->all_flags);
@@ -1561,6 +1566,7 @@ static void init_all_refs_cb(struct all_refs_cb *cb, struct rev_info *revs,
 	cb->all_flags = flags;
 	revs->rev_input_given = 1;
 	cb->wt = NULL;
+	string_list_init_dup(&cb->hidden_refs);
 }
 
 void clear_ref_exclusion(struct string_list **ref_excludes_p)
@@ -1596,6 +1602,13 @@ static void handle_refs(struct ref_store *refs,
 	for_each(refs, handle_one_ref, &cb);
 }
 
+static int hide_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct all_refs_cb *cb = cb_data;
+	return parse_hide_refs_config(var, value, cb->hidden_refs_section,
+				      &cb->hidden_refs);
+}
+
 static void handle_one_reflog_commit(struct object_id *oid, void *cb_data)
 {
 	struct all_refs_cb *cb = cb_data;
@@ -2225,7 +2238,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	    !strcmp(arg, "--bisect") || starts_with(arg, "--glob=") ||
 	    !strcmp(arg, "--indexed-objects") ||
 	    !strcmp(arg, "--alternate-refs") ||
-	    starts_with(arg, "--exclude=") ||
+	    starts_with(arg, "--exclude=") || starts_with(arg, "--visible-refs=") ||
 	    starts_with(arg, "--branches=") || starts_with(arg, "--tags=") ||
 	    starts_with(arg, "--remotes=") || starts_with(arg, "--no-walk="))
 	{
@@ -2759,6 +2772,21 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		parse_list_objects_filter(&revs->filter, arg);
 	} else if (!strcmp(arg, ("--no-filter"))) {
 		list_objects_filter_set_no_filter(&revs->filter);
+	} else if (skip_prefix(arg, "--visible-refs=", &arg)) {
+		struct all_refs_cb cb;
+
+		if (strcmp(arg, "transfer") && strcmp(arg, "receive") &&
+		    strcmp(arg, "uploadpack"))
+			die(_("unsupported section for --visible-refs: %s"), arg);
+
+		init_all_refs_cb(&cb, revs, *flags);
+		cb.hidden_refs_section = arg;
+		git_config(hide_refs_config, &cb);
+
+		refs_for_each_ref(refs, handle_one_ref, &cb);
+
+		string_list_clear(&cb.hidden_refs, 1);
+		clear_ref_exclusion(&revs->ref_excludes);
 	} else {
 		return 0;
 	}
diff --git a/t/t6021-rev-list-visible-refs.sh b/t/t6021-rev-list-visible-refs.sh
new file mode 100755
index 0000000000..9e12384dcf
--- /dev/null
+++ b/t/t6021-rev-list-visible-refs.sh
@@ -0,0 +1,102 @@
+#!/bin/sh
+
+test_description='git rev-list --visible-refs test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit_bulk --id=commit --ref=refs/heads/main 1 &&
+	COMMIT=$(git rev-parse refs/heads/main) &&
+	test_commit_bulk --id=tag --ref=refs/tags/lightweight 1 &&
+	TAG=$(git rev-parse refs/tags/lightweight) &&
+	test_commit_bulk --id=hidden --ref=refs/hidden/commit 1 &&
+	HIDDEN=$(git rev-parse refs/hidden/commit)
+'
+
+test_expect_success 'invalid section' '
+	echo "fatal: unsupported section for --visible-refs: unsupported" >expected &&
+	test_must_fail git rev-list --visible-refs=unsupported 2>err &&
+	test_cmp expected err
+'
+
+test_expect_success '--visible-refs without hiddenRefs' '
+	git rev-list --visible-refs=transfer >out &&
+	cat >expected <<-EOF &&
+	$HIDDEN
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success 'hidden via transfer.hideRefs' '
+	git -c transfer.hideRefs=refs/hidden/ rev-list --visible-refs=transfer >out &&
+	cat >expected <<-EOF &&
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--all --not --visible-refs=transfer without hidden refs' '
+	git rev-list --all --not --visible-refs=transfer >out &&
+	test_must_be_empty out
+'
+
+test_expect_success '--all --not --visible-refs=transfer with hidden ref' '
+	git -c transfer.hideRefs=refs/hidden/ rev-list --all --not --visible-refs=transfer >out &&
+	cat >expected <<-EOF &&
+	$HIDDEN
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--visible-refs with --exclude' '
+	git -c transfer.hideRefs=refs/hidden/ rev-list --exclude=refs/tags/* --visible-refs=transfer >out &&
+	cat >expected <<-EOF &&
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+for section in receive uploadpack
+do
+	test_expect_success "hidden via $section.hideRefs" '
+		git -c $section.hideRefs=refs/hidden/ rev-list --visible-refs=$section >out &&
+		cat >expected <<-EOF &&
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "--visible-refs=$section respects transfer.hideRefs" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --visible-refs=$section >out &&
+		cat >expected <<-EOF &&
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "--visible-refs=transfer ignores $section.hideRefs" '
+		git -c $section.hideRefs=refs/hidden/ rev-list --visible-refs=transfer >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "--visible-refs=$section respects both transfer.hideRefs and $section.hideRefs" '
+		git -c transfer.hideRefs=refs/tags/ -c $section.hideRefs=refs/hidden/ rev-list --visible-refs=$section >out &&
+		cat >expected <<-EOF &&
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+done
+
+test_done
-- 
2.38.1



* [PATCH v2 3/3] receive-pack: only use visible refs for connectivity check
  2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
  2022-11-03 14:37   ` [PATCH v2 1/3] refs: get rid of global list of hidden refs Patrick Steinhardt
  2022-11-03 14:37   ` [PATCH v2 2/3] revision: add new parameter to specify all visible refs Patrick Steinhardt
@ 2022-11-03 14:37   ` Patrick Steinhardt
  2022-11-05  0:40   ` [PATCH v2 0/3] " Taylor Blau
  2022-11-05 12:52   ` Jeff King
  4 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-03 14:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.

We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of the previously hidden object graph reachable, whereas the
common case is to push incremental changes that build on top of the
visible object graph.

This provides a huge boost to performance in the mentioned repository,
where the vast majority of refs are hidden. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7
million refs, as it is configured in Gitaly, leads to an almost 4.5-fold
speedup:

    Benchmark 1: main
      Time (mean ± σ):     29.475 s ±  0.248 s    [User: 28.812 s, System: 1.006 s]
      Range (min … max):   29.189 s … 29.636 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      6.657 s ±  0.027 s    [User: 6.664 s, System: 0.355 s]
      Range (min … max):    6.629 s …  6.682 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        4.43 ± 0.04 times faster than 'main'

As we mostly go through the same code paths as before when there are no
hidden refs at all, there is no change in performance in that case:

    Benchmark 1: main
      Time (mean ± σ):     48.688 s ±  1.535 s    [User: 49.579 s, System: 5.058 s]
      Range (min … max):   47.518 s … 50.425 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     48.147 s ±  0.716 s    [User: 48.800 s, System: 5.297 s]
      Range (min … max):   47.694 s … 48.973 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.01 ± 0.04 times faster than 'main'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
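Note for readers, not part of the commit message: the change to the
connectivity check boils down to choosing which argument follows
`--not` on the git-rev-list(1) command line. The sketch below is a
standalone model, not the code in this patch. `model_not_argument()` is
a hypothetical helper that uses `snprintf()` where the patch uses git's
own `strvec_pushf()`.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Model: return the argument that follows "--not". Without a section we
 * keep the old behavior and exclude everything reachable from any ref.
 * With a section we exclude only what is reachable from visible refs.
 * The result points either at a string literal or at `buf`.
 */
static const char *model_not_argument(const char *visible_refs_section,
				      char *buf, size_t buflen)
{
	if (!visible_refs_section)
		return "--all";
	snprintf(buf, buflen, "--visible-refs=%s", visible_refs_section);
	return buf;
}
```

Callers that leave the section unset keep passing `--not --all`, so
only git-receive-pack(1), which sets the section to "receive", changes
behavior in this model.
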
 builtin/receive-pack.c | 2 ++
 connected.c            | 5 ++++-
 connected.h            | 6 ++++++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1f3efc58fb..be1c1d2702 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1929,6 +1929,8 @@ static void execute_commands(struct command *commands,
 	opt.err_fd = err_fd;
 	opt.progress = err_fd && !quiet;
 	opt.env = tmp_objdir_env(tmp_objdir);
+	opt.visible_refs_section = "receive";
+
 	if (check_connected(iterate_receive_command_list, &data, &opt))
 		set_connectivity_errors(commands, si);
 
diff --git a/connected.c b/connected.c
index 74a20cb32e..c64501f755 100644
--- a/connected.c
+++ b/connected.c
@@ -100,7 +100,10 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		strvec_push(&rev_list.args, "--not");
-		strvec_push(&rev_list.args, "--all");
+		if (opt->visible_refs_section)
+			strvec_pushf(&rev_list.args, "--visible-refs=%s", opt->visible_refs_section);
+		else
+			strvec_push(&rev_list.args, "--all");
 	}
 	strvec_push(&rev_list.args, "--quiet");
 	strvec_push(&rev_list.args, "--alternate-refs");
diff --git a/connected.h b/connected.h
index 6e59c92aa3..d8396e5d55 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,12 @@ struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If not NULL, use `--not --visible-refs=$section` instead of
+	 * `--not --all` as the set of already-reachable references.
+	 */
+	const char *visible_refs_section;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }
-- 
2.38.1



* Re: [PATCH v2 0/3] receive-pack: only use visible refs for connectivity check
  2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2022-11-03 14:37   ` [PATCH v2 3/3] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
@ 2022-11-05  0:40   ` Taylor Blau
  2022-11-05 12:55     ` Jeff King
  2022-11-05 12:52   ` Jeff King
  4 siblings, 1 reply; 88+ messages in thread
From: Taylor Blau @ 2022-11-05  0:40 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

On Thu, Nov 03, 2022 at 03:37:25PM +0100, Patrick Steinhardt wrote:
> Hi,
>
> this is the second version of my patch series that tries to improve
> performance of the connectivity check by only considering preexisting
> refs as uninteresting that could actually have been advertised to the
> client.

This version was delightful to read, and I don't have any concerns with
the approach or implementation.

I would appreciate another set of reviewer eyes on the topic, but if
not, I am comfortable starting to merge this down as-is.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 2/3] revision: add new parameter to specify all visible refs
  2022-11-03 14:37   ` [PATCH v2 2/3] revision: add new parameter to specify all visible refs Patrick Steinhardt
@ 2022-11-05 12:46     ` Jeff King
  2022-11-07  8:20       ` Patrick Steinhardt
  2022-11-05 12:55     ` Jeff King
  1 sibling, 1 reply; 88+ messages in thread
From: Jeff King @ 2022-11-05 12:46 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Thu, Nov 03, 2022 at 03:37:32PM +0100, Patrick Steinhardt wrote:

> Users can optionally hide refs from remote users in git-upload-pack(1),
> git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
> not an easy way to obtain the list of all visible or hidden refs right
> now. We'll require just that though for a performance improvement in our
> connectivity check.
> 
> Add a new pseudo-ref `--visible-refs=` that pretends as if all refs have
> been added to the command line that are not hidden. The pseudo-ref
> requires either one of "transfer", "uploadpack" or "receive" as argument
> to pay attention to `transfer.hideRefs`, `uploadpack.hideRefs` or
> `receive.hideRefs`, respectively.

Thanks for re-working this. I think it's a sensible path forward for the
problem you're facing.

There were two parts of the implementation that surprised me a bit.
These might just be nits, but because this is a new user-facing plumbing
option that will be hard to change later, we should make sure it fits in
with the adjacent features.

The two things I saw were:

  1. The mutual-exclusion selection between "transfer", "uploadpack",
     and "receive" is not how those options work in their respective
     programs. The "transfer.hideRefs" variable is respected for either
     program. So whichever program you are running, it will always look
     at both "transfer" and itself ("uploadpack" or "receive"). Another
     way to think of it is that the "section" argument to
     parse_hide_refs_config() is not a config section so much as a
     profile. And the profiles are:

       - uploadpack: respect transfer.hideRefs and uploadpack.hideRefs
       - receive: respect transfer.hideRefs and receive.hideRefs

     So it does not make sense to ask for "transfer" as a section; each
     of the modes is already looking at transfer.hideRefs.

     In theory if this option was "look just at $section.hideRefs", it
     could be more flexible to separate out the two. But that makes it
     more of a pain to use (for normal use, you have to specify both
     "transfer" and "receive"). And that is not what your patch
     implements anyway; because it relies on parse_hide_refs_config(),
     it is always adding in "transfer" under the hood (which is why your
     final patch is correct to just say "--visible-refs=receive" without
     specifying "transfer" as well).

  2. Your "--visible-refs" implies "--all", because it's really "all
     refs minus the hidden ones". That's convenient for the intended
     caller, but not as flexible as it could be. If it were instead
     treated the way "--exclude" is, as a modifier for the next
     iteration option, then you do a few extra things:

       a. Combine multiple exclusions in a single iteration:

            git rev-list --exclude-hidden=receive \
	                 --exclude-hidden=uploadpack \
			 --all

          That excludes both types in a single iteration. Whereas if you
	  did:

	    git rev-list --visible-refs=receive \
	                 --visible-refs=uploadpack

	  that will do _two_ iterations, and end up with the union of
	  the sets. Equivalent to:

	    git rev-list --exclude-hidden=receive --all \
	                 --exclude-hidden=uploadpack --all

       b. Do exclusions on smaller sets than --all:

            git rev-list --exclude-hidden=receive \
	                 --branches

	  which would show just the branches that we'd advertise.

     Now I don't have a particular use case for either of those things.
     But they're plausible things to want in the long run, and they fit
     in nicely with the existing ref-selection scheme of rev-list. They
     do make your call from check_connected() slightly longer, but it is
     pretty negligible. It's "--exclude-hidden=receive --all" instead of
     "--visible-refs=receive".

So looking at the patch itself, if you wanted to take my suggestions:

> +--visible-refs=[transfer|receive|uploadpack]::
> +	Pretend as if all the refs that have not been hidden via either one of
> +	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` are
> +	listed on the command line.

This would drop "transfer" as a mode, and explain that the argument is
"hide the refs that receive-pack would use", etc.

Likewise, the name would switch and pick up explanation similar to
--exclude below about how it affects the next --all, etc.

> @@ -1542,11 +1545,13 @@ static int handle_one_ref(const char *path, const struct object_id *oid,
>  			  int flag UNUSED,
>  			  void *cb_data)
>  {
> +	const char *stripped_path = strip_namespace(path);
>  	struct all_refs_cb *cb = cb_data;
>  	struct object *object;
>  
> -	if (ref_excluded(cb->all_revs->ref_excludes, path))
> -	    return 0;
> +	if (ref_excluded(cb->all_revs->ref_excludes, path) ||
> +	    ref_is_hidden(stripped_path, path, &cb->hidden_refs))
> +		return 0;

This would stay the same. We'd still exclude hidden refs during the
iteration.

> @@ -2759,6 +2772,21 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
>  		parse_list_objects_filter(&revs->filter, arg);
>  	} else if (!strcmp(arg, ("--no-filter"))) {
>  		list_objects_filter_set_no_filter(&revs->filter);
> +	} else if (skip_prefix(arg, "--visible-refs=", &arg)) {
> +		struct all_refs_cb cb;
> +
> +		if (strcmp(arg, "transfer") && strcmp(arg, "receive") &&
> +		    strcmp(arg, "uploadpack"))
> +			die(_("unsupported section for --visible-refs: %s"), arg);
> +
> +		init_all_refs_cb(&cb, revs, *flags);
> +		cb.hidden_refs_section = arg;
> +		git_config(hide_refs_config, &cb);
> +
> +		refs_for_each_ref(refs, handle_one_ref, &cb);
> +
> +		string_list_clear(&cb.hidden_refs, 1);
> +		clear_ref_exclusion(&revs->ref_excludes);

And here we'd do the same git_config() call, but drop the
refs_for_each_ref() call. We'd clear the hidden_refs field in all the
places that call clear_ref_exclusion() now.

In fact, you could argue that all of this should just be folded into
clear_ref_exclusion() and ref_excluded(), since from the perspective of
the iterating code, they are all part of the same feature. I don't mind
leaving it separate from the perspective of rev-list, though I think
if you did so, it would all Just Work for "rev-parse", too (I doubt
anybody cares in practice, but it's probably better to keep it
consistent).
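
To make the "modifier for the next iteration option" idea concrete, here
is a rough, self-contained model of the argument handling. It is only a
sketch: `struct ref_walk_model` and `handle_arg()` are made-up names and
not part of revision.c, and the real code would of course iterate refs
rather than count.

```c
#include <string.h>

#define MAX_PENDING 8

/* Hypothetical model of the parser state; not git's struct rev_info. */
struct ref_walk_model {
	const char *pending[MAX_PENDING]; /* profiles waiting for the next iteration */
	int nr_pending;
	int iterations;                   /* how many ref iterations ran */
	int last_nr_excluded;             /* profiles applied to the latest iteration */
};

static void handle_arg(struct ref_walk_model *m, const char *arg)
{
	const char *prefix = "--exclude-hidden=";
	size_t plen = strlen(prefix);

	if (!strncmp(arg, prefix, plen)) {
		/* Modifier: remember the profile, do not iterate yet. */
		if (m->nr_pending < MAX_PENDING)
			m->pending[m->nr_pending++] = arg + plen;
	} else if (!strcmp(arg, "--all") || !strcmp(arg, "--branches")) {
		/* Iteration option: apply all pending modifiers, then reset. */
		m->iterations++;
		m->last_nr_excluded = m->nr_pending;
		m->nr_pending = 0;
	}
}
```

With this model, two modifiers followed by a single "--all" give one
iteration with both profiles applied, while interleaving them with two
"--all" options gives two iterations with one profile each.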

-Peff

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 0/3] receive-pack: only use visible refs for connectivity check
  2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2022-11-05  0:40   ` [PATCH v2 0/3] " Taylor Blau
@ 2022-11-05 12:52   ` Jeff King
  4 siblings, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-05 12:52 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Thu, Nov 03, 2022 at 03:37:25PM +0100, Patrick Steinhardt wrote:

> Overall the performance improvement isn't quite as strong as before:
> we're only 4.5x faster compared to 6.5x in our repo. But I guess that's
> still a good-enough improvement, doubly so that there are no downsides
> for repos anymore that ain't got any hidden refs.

Just a guess, but the extra time is probably spent because "rev-list"
has to iterate over your kajillion refs saying "nope, not this one".

I wonder if there is a way to carve up the ref namespace early, in a
trie-like way, similar to how for_each_fullref_in_prefixes() skips
straight to the interesting part. It's a harder problem, because we are
excluding rather than including entries based on our patterns. But it
seems like you'd be able to notice that we are on "refs/foo/bar", that
all of "refs/foo" is excluded, and jump ahead to the end of the
"refs/foo" section by binary-searching for the end within the
packed-refs file.

And that would speed up both this rev-list, but also the initial
advertisement that receive-pack does, because it's also iterating all of
these saying "nope, it's hidden". And assuming these are all internal
refs you hide from users, upload-pack would benefit, too.

I don't think that should be part of this series, but it may be an
interesting future direction.
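
A minimal sketch of the "jump ahead" step, assuming a bytewise-sorted
in-memory array stands in for the packed-refs file. The function name
`skip_excluded_prefix()` is hypothetical and nothing like it exists in
refs.c today:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first entry in refs[from..nr) that does not
 * start with `prefix`, assuming refs[] is sorted bytewise (as with
 * strcmp) and refs[from] itself starts with `prefix`.
 */
static size_t skip_excluded_prefix(const char **refs, size_t nr,
				   size_t from, const char *prefix)
{
	size_t plen = strlen(prefix);
	size_t lo = from, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strncmp(refs[mid], prefix, plen) <= 0)
			lo = mid + 1;   /* still inside the excluded section */
		else
			hi = mid;       /* already past the excluded section */
	}
	return lo;
}
```

Once the iterator sees that "refs/foo/bar" falls under an excluded
"refs/foo/", it could call something like this and resume at the
returned index instead of visiting every hidden ref in between.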

-Peff

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 2/3] revision: add new parameter to specify all visible refs
  2022-11-03 14:37   ` [PATCH v2 2/3] revision: add new parameter to specify all visible refs Patrick Steinhardt
  2022-11-05 12:46     ` Jeff King
@ 2022-11-05 12:55     ` Jeff King
  1 sibling, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-05 12:55 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Thu, Nov 03, 2022 at 03:37:32PM +0100, Patrick Steinhardt wrote:

> diff --git a/t/t6021-rev-list-visible-refs.sh b/t/t6021-rev-list-visible-refs.sh
> new file mode 100755
> index 0000000000..9e12384dcf

Oh, I forgot to mention in my earlier response: these tests mostly look
good to me, but it may be worth having a few that cover the namespaces
feature. I _think_ your code is doing the right thing (because it calls
strip_namespace() as appropriate), but I have to admit I am generally
confused about how the namespace feature works in the first place. So
it's probably good to figure out how it is supposed to interact here and
make sure it is covered by tests.

-Peff

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 0/3] receive-pack: only use visible refs for connectivity check
  2022-11-05  0:40   ` [PATCH v2 0/3] " Taylor Blau
@ 2022-11-05 12:55     ` Jeff King
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-05 12:55 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Patrick Steinhardt, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason

On Fri, Nov 04, 2022 at 08:40:35PM -0400, Taylor Blau wrote:

> On Thu, Nov 03, 2022 at 03:37:25PM +0100, Patrick Steinhardt wrote:
> > Hi,
> >
> > this is the second version of my patch series that tries to improve
> > performance of the connectivity check by only considering preexisting
> > refs as uninteresting that could actually have been advertised to the
> > client.
> 
> This version was delightful to read, and I don't have any concerns with
> the approach or implementation.
> 
> I would appreciate another set of reviewer eyes on the topic, but if
> not, I am comfortable starting to merge this down as-is.

I like the overall direction, but I had some suggestions for the
interface in patch 2. Hopefully Patrick agrees, and we'll see a v3.

-Peff

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 2/3] revision: add new parameter to specify all visible refs
  2022-11-05 12:46     ` Jeff King
@ 2022-11-07  8:20       ` Patrick Steinhardt
  2022-11-08 14:32         ` Jeff King
  0 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07  8:20 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

[-- Attachment #1: Type: text/plain, Size: 5567 bytes --]

On Sat, Nov 05, 2022 at 08:46:05AM -0400, Jeff King wrote:
> On Thu, Nov 03, 2022 at 03:37:32PM +0100, Patrick Steinhardt wrote:
> 
> > Users can optionally hide refs from remote users in git-upload-pack(1),
> > git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
> > not an easy way to obtain the list of all visible or hidden refs right
> > now. We'll require just that though for a performance improvement in our
> > connectivity check.
> > 
> > Add a new pseudo-ref `--visible-refs=` that pretends as if all refs have
> > been added to the command line that are not hidden. The pseudo-ref
> > requiers either one of "transfer", "uploadpack" or "receive" as argument
> > to pay attention to `transfer.hideRefs`, `uploadpack.hideRefs` or
> > `receive.hideRefs`, respectively.
> 
> Thanks for re-working this. I think it's a sensible path forward for the
> problem you're facing.
> 
> There were two parts of the implementation that surprised me a bit.
> These might just be nits, but because this is a new user-facing plumbing
> option that will be hard to change later, we should make sure it fits in
> with the adjacent features.
> 
> The two things I saw were:
> 
>   1. The mutual-exclusion selection between "transfer", "uploadpack",
>      and "receive" is not how those options work in their respective
>      programs. The "transfer.hideRefs" variable is respected for either
>      program. So whichever program you are running, it will always look
>      at both "transfer" and itself ("uploadpack" or "receive"). Another
>      way to think of it is that the "section" argument to
>      parse_hide_refs_config() is not a config section so much as a
>      profile. And the profiles are:
> 
>        - uploadpack: respect transfer.hideRefs and uploadpack.hideRefs
>        - receive: respect transfer.hideRefs and receive.hideRefs
> 
>      So it does not make sense to ask for "transfer" as a section; each
>      of the modes is already looking at transfer.hideRefs.
> 
>      In theory if this option was "look just at $section.hideRefs", it
>      could be more flexible to separate out the two. But that makes it
>      more of a pain to use (for normal use, you have to specify both
>      "transfer" and "receive"). And that is not what your patch
>      implements anyway; because it relies on parse_hide_refs_config(),
>      it is always adding in "transfer" under the hood (which is why your
>      final patch is correct to just say "--visible-refs=receive" without
>      specifying "transfer" as well).

Yup, I'm aware of this. And as you say, the current implementation
already handles this alright for both `receive` and `uploadpack` as we
rely on `parse_hide_refs_config()`, which knows to look at both
`transfer.hideRefs` and `$section.hideRefs`. But I don't see a reason
why we shouldn't allow users to ask "What is the set of hidden refs that
are shared by `uploadpack` and `receive`?", which is exactly what
`--visible-refs=transfer` does.

The implementation is not really explicit about this as we cheat a
little bit here by passing "transfer" as a section to the parsing
function. So what it does right now is to basically check for the same
section twice, once via the hard-coded "transfer.hideRefs" and once for
the "$section.hideRefs" with `section == "transfer"`. But I didn't see
much of a point in making this more explicit.
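
As a rough illustration of the check described above (not the actual
code in refs.c), the decision whether a config key feeds a given
"section" could be modelled like this. `hide_refs_var_applies()` is a
made-up name, and the sketch assumes keys arrive lowercased, as the
existing comparison against "transfer.hiderefs" suggests:

```c
#include <string.h>

/*
 * Hypothetical helper, not git's API: return 1 if config key `var`
 * contributes hidden refs for the given section/profile.
 */
static int hide_refs_var_applies(const char *var, const char *section)
{
	size_t len = strlen(section);

	/* transfer.hideRefs is shared by every profile. */
	if (!strcmp(var, "transfer.hiderefs"))
		return 1;
	/* Otherwise the key must be exactly "<section>.hiderefs". */
	return !strncmp(var, section, len) &&
	       !strcmp(var + len, ".hiderefs");
}
```

With section set to "transfer", both tests look at the same key, so
only `transfer.hideRefs` is ever picked up.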

I might update the commit message and/or documentation though to point
this out.

>   2. Your "--visible-refs" implies "--all", because it's really "all
>      refs minus the hidden ones". That's convenient for the intended
>      caller, but not as flexible as it could be. If it were instead
>      treated the way "--exclude" is, as a modifier for the next
>      iteration option, then you do a few extra things:
> 
>        a. Combine multiple exclusions in a single iteration:
> 
>             git rev-list --exclude-hidden=receive \
> 	                 --exclude-hidden=uploadpack \
> 			 --all
> 
>           That excludes both types in a single iteration. Whereas if you
> 	  did:
> 
> 	    git rev-list --visible-refs=receive \
> 	                 --visible-refs=uploadpack
> 
> 	  that will do _two_ iterations, and end up with the union of
> 	  the sets. Equivalent to:
> 
> 	    git rev-list --exclude-hidden=receive --all \
> 	                 --exclude-hidden=uploadpack --all
> 
>        b. Do exclusions on smaller sets than --all:
> 
>             git rev-list --exclude-hidden=receive \
> 	                 --branches
> 
> 	  which would show just the branches that we'd advertise.
> 
>      Now I don't have a particular use case for either of those things.
>      But they're plausible things to want in the long run, and they fit
>      in nicely with the existing ref-selection scheme of rev-list. They
>      do make your call from check_connected() slightly longer, but it is
>      pretty negligible. It's "--exclude-hidden=receive --all" instead of
>      "--visible-refs=receive".

Fair enough. I guess that the use case where you e.g. only hide a subset
of branches via `hideRefs` is going to be rare, so in most cases you
don't gain much by modelling this so that you can `--exclude-hidden
--branches`. But as you rightfully point out, modelling it that way fits
neatly with the existing `--exclude` switch and is overall more
flexible. So there's not much of a reason to not do so.

I'll send a v3 and include your suggestion, thanks.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH v3 0/6] receive-pack: only use visible refs for connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
@ 2022-11-07 12:16 ` Patrick Steinhardt
  2022-11-07 12:16   ` [PATCH v3 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
                     ` (6 more replies)
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
                   ` (2 subsequent siblings)
  9 siblings, 7 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07 12:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2218 bytes --]

Hi,

this is the third version of my patch series that tries to improve
performance of the connectivity check by only considering preexisting
refs as uninteresting that could actually have been advertised to the
client.

In v2 of this series, we introduced a new option `--visible-refs=` that
mostly acted as if `--all` was given, but its output was restricted to
only visible refs. As Peff rightly points out though, this is less
flexible than it needs to be. This new version instead introduces a new
option `--exclude-hidden=` that can be combined with `--all`, `--glob`,
`--branches` and so on to provide a more flexible interface.

The patch series is structured as follows:

    - Patches 1-3 refactor multiple different parts of refs.c and
      revision.c so that they are more readily reusable.

    - Patches 4-5 implement `--exclude-hidden=` for git-rev-list(1) and
      git-rev-parse(1).

    - Patch 6 starts using that option in the connectivity check.

Patrick


Patrick Steinhardt (6):
  refs: get rid of global list of hidden refs
  revision: move together exclusion-related functions
  revision: introduce struct to handle exclusions
  revision: add new parameter to exclude hidden refs
  revparse: add `--exclude-hidden=` option
  receive-pack: only use visible refs for connectivity check

 Documentation/git-rev-parse.txt    |   7 ++
 Documentation/rev-list-options.txt |   7 ++
 builtin/receive-pack.c             |  10 +-
 builtin/rev-list.c                 |   1 +
 builtin/rev-parse.c                |  12 ++-
 connected.c                        |   3 +
 connected.h                        |   7 ++
 ls-refs.c                          |  13 ++-
 refs.c                             |  14 +--
 refs.h                             |   5 +-
 revision.c                         | 118 +++++++++++++--------
 revision.h                         |  29 ++++--
 t/t6018-rev-list-glob.sh           |   8 ++
 t/t6021-rev-list-exclude-hidden.sh | 159 +++++++++++++++++++++++++++++
 upload-pack.c                      |  30 +++---
 15 files changed, 341 insertions(+), 82 deletions(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh

-- 
2.38.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH v3 1/6] refs: get rid of global list of hidden refs
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
@ 2022-11-07 12:16   ` Patrick Steinhardt
  2022-11-07 12:16   ` [PATCH v3 2/6] revision: move together exclusion-related functions Patrick Steinhardt
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07 12:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 12498 bytes --]

We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.

Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.
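
For illustration only (this is not part of the patch), the caller-owned
pattern can be sketched in a self-contained form. `struct pattern_list`
and `is_hidden()` are hypothetical stand-ins for `struct string_list`
and `ref_is_hidden()`, and namespace handling (the '^' prefix and the
second refname argument) is left out:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's `struct string_list`. */
struct pattern_list {
	const char **items;
	size_t nr;
};

/*
 * Return 1 if `refname` is hidden by `list`. The last matching pattern
 * wins, and a leading '!' negates a pattern.
 */
static int is_hidden(const char *refname, const struct pattern_list *list)
{
	size_t i;

	for (i = list->nr; i > 0; i--) {
		const char *match = list->items[i - 1];
		int neg = 0;
		size_t len;

		if (*match == '!') {
			neg = 1;
			match++;
		}
		len = strlen(match);
		/* Match the whole name or a leading path component. */
		if (!strncmp(refname, match, len) &&
		    (refname[len] == '\0' || refname[len] == '/'))
			return !neg;
	}
	return 0;
}
```

Because the list is an argument rather than a global, a caller can keep
one list per profile and query them independently in the same process.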

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c |  8 +++++---
 ls-refs.c              | 13 +++++++++----
 refs.c                 | 14 ++++----------
 refs.h                 |  5 +++--
 upload-pack.c          | 30 ++++++++++++++++++------------
 5 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 44bcea3a5b..1f3efc58fb 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -80,6 +80,7 @@ static struct object_id push_cert_oid;
 static struct signature_check sigcheck;
 static const char *push_cert_nonce;
 static const char *cert_nonce_seed;
+static struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 
 static const char *NONCE_UNSOLICITED = "UNSOLICITED";
 static const char *NONCE_BAD = "BAD";
@@ -130,7 +131,7 @@ static enum deny_action parse_deny_action(const char *var, const char *value)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
-	int status = parse_hide_refs_config(var, value, "receive");
+	int status = parse_hide_refs_config(var, value, "receive", &hidden_refs);
 
 	if (status)
 		return status;
@@ -296,7 +297,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	struct oidset *seen = data;
 	const char *path = strip_namespace(path_full);
 
-	if (ref_is_hidden(path, path_full))
+	if (ref_is_hidden(path, path_full, &hidden_refs))
 		return 0;
 
 	/*
@@ -1794,7 +1795,7 @@ static void reject_updates_to_hidden(struct command *commands)
 		strbuf_setlen(&refname_full, prefix_len);
 		strbuf_addstr(&refname_full, cmd->ref_name);
 
-		if (!ref_is_hidden(cmd->ref_name, refname_full.buf))
+		if (!ref_is_hidden(cmd->ref_name, refname_full.buf, &hidden_refs))
 			continue;
 		if (is_null_oid(&cmd->new_oid))
 			cmd->error_string = "deny deleting a hidden ref";
@@ -2591,6 +2592,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		packet_flush(1);
 	oid_array_clear(&shallow);
 	oid_array_clear(&ref);
+	string_list_clear(&hidden_refs, 1);
 	free((void *)push_cert_nonce);
 	return 0;
 }
diff --git a/ls-refs.c b/ls-refs.c
index fa0d01b47c..ae89f850e9 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -6,6 +6,7 @@
 #include "ls-refs.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "string-list.h"
 
 static int config_read;
 static int advertise_unborn;
@@ -73,6 +74,7 @@ struct ls_refs_data {
 	unsigned symrefs;
 	struct strvec prefixes;
 	struct strbuf buf;
+	struct string_list hidden_refs;
 	unsigned unborn : 1;
 };
 
@@ -84,7 +86,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 
 	strbuf_reset(&data->buf);
 
-	if (ref_is_hidden(refname_nons, refname))
+	if (ref_is_hidden(refname_nons, refname, &data->hidden_refs))
 		return 0;
 
 	if (!ref_match(&data->prefixes, refname_nons))
@@ -137,14 +139,15 @@ static void send_possibly_unborn_head(struct ls_refs_data *data)
 }
 
 static int ls_refs_config(const char *var, const char *value,
-			  void *data UNUSED)
+			  void *cb_data)
 {
+	struct ls_refs_data *data = cb_data;
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
 	 * don't yet know how that information will be passed to ls-refs.
 	 */
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 int ls_refs(struct repository *r, struct packet_reader *request)
@@ -154,9 +157,10 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	memset(&data, 0, sizeof(data));
 	strvec_init(&data.prefixes);
 	strbuf_init(&data.buf, 0);
+	string_list_init_dup(&data.hidden_refs);
 
 	ensure_config_read();
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -195,6 +199,7 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	packet_fflush(stdout);
 	strvec_clear(&data.prefixes);
 	strbuf_release(&data.buf);
+	string_list_clear(&data.hidden_refs, 1);
 	return 0;
 }
 
diff --git a/refs.c b/refs.c
index 1491ae937e..f1711e2e9f 100644
--- a/refs.c
+++ b/refs.c
@@ -1414,9 +1414,8 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 					    refname, strict);
 }
 
-static struct string_list *hide_refs;
-
-int parse_hide_refs_config(const char *var, const char *value, const char *section)
+int parse_hide_refs_config(const char *var, const char *value, const char *section,
+			   struct string_list *hide_refs)
 {
 	const char *key;
 	if (!strcmp("transfer.hiderefs", var) ||
@@ -1431,21 +1430,16 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 		len = strlen(ref);
 		while (len && ref[len - 1] == '/')
 			ref[--len] = '\0';
-		if (!hide_refs) {
-			CALLOC_ARRAY(hide_refs, 1);
-			hide_refs->strdup_strings = 1;
-		}
 		string_list_append(hide_refs, ref);
 	}
 	return 0;
 }
 
-int ref_is_hidden(const char *refname, const char *refname_full)
+int ref_is_hidden(const char *refname, const char *refname_full,
+		  const struct string_list *hide_refs)
 {
 	int i;
 
-	if (!hide_refs)
-		return 0;
 	for (i = hide_refs->nr - 1; i >= 0; i--) {
 		const char *match = hide_refs->items[i].string;
 		const char *subject;
diff --git a/refs.h b/refs.h
index 8958717a17..3266fd8f57 100644
--- a/refs.h
+++ b/refs.h
@@ -808,7 +808,8 @@ int update_ref(const char *msg, const char *refname,
 	       const struct object_id *new_oid, const struct object_id *old_oid,
 	       unsigned int flags, enum action_on_err onerr);
 
-int parse_hide_refs_config(const char *var, const char *value, const char *);
+int parse_hide_refs_config(const char *var, const char *value, const char *,
+			   struct string_list *);
 
 /*
  * Check whether a ref is hidden. If no namespace is set, both the first and
@@ -818,7 +819,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *);
  * the ref is outside that namespace, the first parameter is NULL. The second
  * parameter always points to the full ref name.
  */
-int ref_is_hidden(const char *, const char *);
+int ref_is_hidden(const char *, const char *, const struct string_list *);
 
 /* Is this a per-worktree ref living in the refs/ namespace? */
 int is_per_worktree_ref(const char *refname);
diff --git a/upload-pack.c b/upload-pack.c
index 0b8311bd68..9db17f8787 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -62,6 +62,7 @@ struct upload_pack_data {
 	struct object_array have_obj;
 	struct oid_array haves;					/* v2 only */
 	struct string_list wanted_refs;				/* v2 only */
+	struct string_list hidden_refs;
 
 	struct object_array shallows;
 	struct string_list deepen_not;
@@ -118,6 +119,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct string_list symref = STRING_LIST_INIT_DUP;
 	struct string_list wanted_refs = STRING_LIST_INIT_DUP;
+	struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 	struct object_array want_obj = OBJECT_ARRAY_INIT;
 	struct object_array have_obj = OBJECT_ARRAY_INIT;
 	struct oid_array haves = OID_ARRAY_INIT;
@@ -130,6 +132,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 	memset(data, 0, sizeof(*data));
 	data->symref = symref;
 	data->wanted_refs = wanted_refs;
+	data->hidden_refs = hidden_refs;
 	data->want_obj = want_obj;
 	data->have_obj = have_obj;
 	data->haves = haves;
@@ -151,6 +154,7 @@ static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	string_list_clear(&data->symref, 1);
 	string_list_clear(&data->wanted_refs, 1);
+	string_list_clear(&data->hidden_refs, 1);
 	object_array_clear(&data->want_obj);
 	object_array_clear(&data->have_obj);
 	oid_array_clear(&data->haves);
@@ -842,8 +846,8 @@ static void deepen(struct upload_pack_data *data, int depth)
 		 * Checking for reachable shallows requires that our refs be
 		 * marked with OUR_REF.
 		 */
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, data);
+		for_each_namespaced_ref(check_ref, data);
 
 		get_reachable_list(data, &reachable_shallows);
 		result = get_shallow_commits(&reachable_shallows,
@@ -1158,11 +1162,11 @@ static void receive_needs(struct upload_pack_data *data,
 
 /* return non-zero if the ref is hidden, otherwise 0 */
 static int mark_our_ref(const char *refname, const char *refname_full,
-			const struct object_id *oid)
+			const struct object_id *oid, const struct string_list *hidden_refs)
 {
 	struct object *o = lookup_unknown_object(the_repository, oid);
 
-	if (ref_is_hidden(refname, refname_full)) {
+	if (ref_is_hidden(refname, refname_full, hidden_refs)) {
 		o->flags |= HIDDEN_REF;
 		return 1;
 	}
@@ -1171,11 +1175,12 @@ static int mark_our_ref(const char *refname, const char *refname_full,
 }
 
 static int check_ref(const char *refname_full, const struct object_id *oid,
-		     int flag UNUSED, void *cb_data UNUSED)
+		     int flag UNUSED, void *cb_data)
 {
 	const char *refname = strip_namespace(refname_full);
+	struct upload_pack_data *data = cb_data;
 
-	mark_our_ref(refname, refname_full, oid);
+	mark_our_ref(refname, refname_full, oid, &data->hidden_refs);
 	return 0;
 }
 
@@ -1204,7 +1209,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	struct object_id peeled;
 	struct upload_pack_data *data = cb_data;
 
-	if (mark_our_ref(refname_nons, refname, oid))
+	if (mark_our_ref(refname_nons, refname, oid, &data->hidden_refs))
 		return 0;
 
 	if (capabilities) {
@@ -1327,7 +1332,7 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
@@ -1375,8 +1380,8 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 		advertise_shallow_grafts(1);
 		packet_flush(1);
 	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, &data);
+		for_each_namespaced_ref(check_ref, &data);
 	}
 
 	if (!advertise_refs) {
@@ -1441,6 +1446,7 @@ static int parse_want(struct packet_writer *writer, const char *line,
 
 static int parse_want_ref(struct packet_writer *writer, const char *line,
 			  struct string_list *wanted_refs,
+			  struct string_list *hidden_refs,
 			  struct object_array *want_obj)
 {
 	const char *refname_nons;
@@ -1451,7 +1457,7 @@ static int parse_want_ref(struct packet_writer *writer, const char *line,
 		struct strbuf refname = STRBUF_INIT;
 
 		strbuf_addf(&refname, "%s%s", get_git_namespace(), refname_nons);
-		if (ref_is_hidden(refname_nons, refname.buf) ||
+		if (ref_is_hidden(refname_nons, refname.buf, hidden_refs) ||
 		    read_ref(refname.buf, &oid)) {
 			packet_writer_error(writer, "unknown ref %s", refname_nons);
 			die("unknown ref %s", refname_nons);
@@ -1508,7 +1514,7 @@ static void process_args(struct packet_reader *request,
 			continue;
 		if (data->allow_ref_in_want &&
 		    parse_want_ref(&data->writer, arg, &data->wanted_refs,
-				   &data->want_obj))
+				   &data->hidden_refs, &data->want_obj))
 			continue;
 		/* process have line */
 		if (parse_have(arg, &data->haves))
-- 
2.38.1



* [PATCH v3 2/6] revision: move together exclusion-related functions
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
  2022-11-07 12:16   ` [PATCH v3 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
@ 2022-11-07 12:16   ` Patrick Steinhardt
  2022-11-07 12:16   ` [PATCH v3 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07 12:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2402 bytes --]

Move together the definitions of functions that handle exclusions of
refs so that related functionality sits in a single place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 revision.c | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/revision.c b/revision.c
index 0760e78936..be755670e2 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,14 +1517,6 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-struct all_refs_cb {
-	int all_flags;
-	int warned_bad_reflog;
-	struct rev_info *all_revs;
-	const char *name_for_errormsg;
-	struct worktree *wt;
-};
-
 int ref_excluded(struct string_list *ref_excludes, const char *path)
 {
 	struct string_list_item *item;
@@ -1538,6 +1530,32 @@ int ref_excluded(struct string_list *ref_excludes, const char *path)
 	return 0;
 }
 
+void clear_ref_exclusion(struct string_list **ref_excludes_p)
+{
+	if (*ref_excludes_p) {
+		string_list_clear(*ref_excludes_p, 0);
+		free(*ref_excludes_p);
+	}
+	*ref_excludes_p = NULL;
+}
+
+void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+{
+	if (!*ref_excludes_p) {
+		CALLOC_ARRAY(*ref_excludes_p, 1);
+		(*ref_excludes_p)->strdup_strings = 1;
+	}
+	string_list_append(*ref_excludes_p, exclude);
+}
+
+struct all_refs_cb {
+	int all_flags;
+	int warned_bad_reflog;
+	struct rev_info *all_revs;
+	const char *name_for_errormsg;
+	struct worktree *wt;
+};
+
 static int handle_one_ref(const char *path, const struct object_id *oid,
 			  int flag UNUSED,
 			  void *cb_data)
@@ -1563,24 +1581,6 @@ static void init_all_refs_cb(struct all_refs_cb *cb, struct rev_info *revs,
 	cb->wt = NULL;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
-{
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
-}
-
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
-{
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
-}
-
 static void handle_refs(struct ref_store *refs,
 			struct rev_info *revs, unsigned flags,
 			int (*for_each)(struct ref_store *, each_ref_fn, void *))
-- 
2.38.1



* [PATCH v3 3/6] revision: introduce struct to handle exclusions
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
  2022-11-07 12:16   ` [PATCH v3 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
  2022-11-07 12:16   ` [PATCH v3 2/6] revision: move together exclusion-related functions Patrick Steinhardt
@ 2022-11-07 12:16   ` Patrick Steinhardt
  2022-11-07 12:51     ` Ævar Arnfjörð Bjarmason
  2022-11-07 12:16   ` [PATCH v3 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07 12:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 8348 bytes --]

The functions that handle exclusion of refs work on a single string
list. We're about to add a second mechanism for excluding refs though,
and it makes sense to reuse much of the same architecture for both kinds
of exclusion.

Introduce a new `struct ref_exclusions` that encapsulates all the logic
related to excluding refs.
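
As a rough illustration of the by-value lifecycle this struct enables
(initialize, add patterns, query, clear), here is a minimal,
self-contained sketch. It is not Git's code: the `toy_` names and the
fixed-size pattern array are hypothetical stand-ins for `struct
string_list`, and fnmatch(3) is assumed as a substitute for Git's
internal wildmatch().

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical stand-in for Git's string_list: fixed capacity, borrowed strings. */
#define TOY_MAX_PATTERNS 16

struct toy_ref_exclusions {
	const char *patterns[TOY_MAX_PATTERNS];
	size_t nr;
};

/* Static initializer, so callers never have to handle a NULL pointer. */
#define TOY_REF_EXCLUSIONS_INIT { .nr = 0 }

static void toy_add_ref_exclusion(struct toy_ref_exclusions *e,
				  const char *pattern)
{
	assert(e->nr < TOY_MAX_PATTERNS);
	e->patterns[e->nr++] = pattern;
}

/* Returns 1 if any pattern matches; fnmatch(3) stands in for wildmatch(). */
static int toy_ref_excluded(const struct toy_ref_exclusions *e,
			    const char *refname)
{
	size_t i;

	for (i = 0; i < e->nr; i++)
		if (!fnmatch(e->patterns[i], refname, 0))
			return 1;
	return 0;
}

/* Resets the struct so that it can be reused for the next pseudo-ref. */
static void toy_clear_ref_exclusions(struct toy_ref_exclusions *e)
{
	e->nr = 0;
}
```

Because the struct is embedded by value, an empty exclusion list is
simply a list with zero entries, which is why the `if (!ref_excludes)`
special case can go away in the real code.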

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/rev-parse.c |  8 ++++----
 revision.c          | 47 ++++++++++++++++++++-------------------------
 revision.h          | 22 +++++++++++++++------
 3 files changed, 41 insertions(+), 36 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 8f61050bde..7fa5b6991b 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -39,7 +39,7 @@ static int abbrev_ref_strict;
 static int output_sq;
 
 static int stuck_long;
-static struct string_list *ref_excludes;
+static struct ref_exclusions ref_excludes = REF_EXCLUSIONS_INIT;
 
 /*
  * Some arguments are relevant "revision" arguments,
@@ -198,7 +198,7 @@ static int show_default(void)
 static int show_reference(const char *refname, const struct object_id *oid,
 			  int flag UNUSED, void *cb_data UNUSED)
 {
-	if (ref_excluded(ref_excludes, refname))
+	if (ref_excluded(&ref_excludes, refname))
 		return 0;
 	show_rev(NORMAL, oid, refname);
 	return 0;
@@ -585,7 +585,7 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
 		for_each_glob_ref_in(show_reference, pattern, prefix, NULL);
 	else
 		for_each_ref_in(prefix, show_reference, NULL);
-	clear_ref_exclusion(&ref_excludes);
+	clear_ref_exclusions(&ref_excludes);
 }
 
 enum format_type {
@@ -863,7 +863,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "--all")) {
 				for_each_ref(show_reference, NULL);
-				clear_ref_exclusion(&ref_excludes);
+				clear_ref_exclusions(&ref_excludes);
 				continue;
 			}
 			if (skip_prefix(arg, "--disambiguate=", &arg)) {
diff --git a/revision.c b/revision.c
index be755670e2..e5eaaa24ba 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,35 +1517,29 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-int ref_excluded(struct string_list *ref_excludes, const char *path)
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
 	struct string_list_item *item;
-
-	if (!ref_excludes)
-		return 0;
-	for_each_string_list_item(item, ref_excludes) {
+	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
 	return 0;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
+void init_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
+	string_list_init_dup(&exclusions->excluded_refs);
 }
 
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
+	string_list_clear(&exclusions->excluded_refs, 0);
+}
+
+void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
+{
+	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
 struct all_refs_cb {
@@ -1563,7 +1557,7 @@ static int handle_one_ref(const char *path, const struct object_id *oid,
 	struct all_refs_cb *cb = cb_data;
 	struct object *object;
 
-	if (ref_excluded(cb->all_revs->ref_excludes, path))
+	if (ref_excluded(&cb->all_revs->ref_excludes, path))
 	    return 0;
 
 	object = get_reference(cb->all_revs, path, oid, cb->all_flags);
@@ -1901,6 +1895,7 @@ void repo_init_revisions(struct repository *r,
 
 	init_display_notes(&revs->notes_opt);
 	list_objects_filter_init(&revs->filter);
+	init_ref_exclusions(&revs->ref_excludes);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -2689,10 +2684,10 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 			init_all_refs_cb(&cb, revs, *flags);
 			other_head_refs(handle_one_ref, &cb);
 		}
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--branches")) {
 		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--bisect")) {
 		read_bisect_terms(&term_bad, &term_good);
 		handle_refs(refs, revs, *flags, for_each_bad_bisect_ref);
@@ -2701,15 +2696,15 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		revs->bisect = 1;
 	} else if (!strcmp(arg, "--tags")) {
 		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--remotes")) {
 		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref(handle_one_ref, optarg, &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 		return argcount;
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
@@ -2718,17 +2713,17 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--tags=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--reflog")) {
 		add_reflogs_to_pending(revs, *flags);
 	} else if (!strcmp(arg, "--indexed-objects")) {
diff --git a/revision.h b/revision.h
index afe1b77985..87d6824c55 100644
--- a/revision.h
+++ b/revision.h
@@ -81,6 +81,14 @@ struct rev_cmdline_info {
 	} *rev;
 };
 
+struct ref_exclusions {
+	/*
+	 * Excluded refs is a list of wildmatch patterns. If any of the
+	 * patterns matches, the reference will be excluded.
+	 */
+	struct string_list excluded_refs;
+};
+
 struct oidset;
 struct topo_walk_info;
 
@@ -103,7 +111,7 @@ struct rev_info {
 	struct list_objects_filter_options filter;
 
 	/* excluding from --branches, --refs, etc. expansion */
-	struct string_list *ref_excludes;
+	struct ref_exclusions ref_excludes;
 
 	/* Basic information */
 	const char *prefix;
@@ -439,12 +447,14 @@ void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees)
 void show_object_with_name(FILE *, struct object *, const char *);
 
 /**
- * Helpers to check if a "struct string_list" item matches with
- * wildmatch().
+ * Helpers to check if a reference should be excluded.
  */
-int ref_excluded(struct string_list *, const char *path);
-void clear_ref_exclusion(struct string_list **);
-void add_ref_exclusion(struct string_list **, const char *exclude);
+#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP }
+
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
+void init_ref_exclusions(struct ref_exclusions *);
+void clear_ref_exclusions(struct ref_exclusions *);
+void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
 
 /**
  * This function can be used if you want to add commit objects as revision
-- 
2.38.1



* [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2022-11-07 12:16   ` [PATCH v3 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
@ 2022-11-07 12:16   ` Patrick Steinhardt
  2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
  2022-11-08  0:57     ` Taylor Blau
  2022-11-07 12:16   ` [PATCH v3 5/6] revparse: add `--exclude-hidden=` option Patrick Steinhardt
                     ` (2 subsequent siblings)
  6 siblings, 2 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07 12:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 11516 bytes --]

Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs` configuration,
but there is currently no easy way to obtain the list of all visible or
hidden refs. We need exactly that for a performance improvement in our
connectivity check.

Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--branches`.
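
For orientation, the matching applied to `hideRefs` patterns can be
modelled roughly as in the sketch below. This is a simplified model
under stated assumptions, not the actual `ref_is_hidden()`: the
`toy_ref_is_hidden()` name and its signature are hypothetical, and the
rules (prefix match at a `/` boundary, leading `!` negates, leading `^`
matches the full namespaced name, later patterns win) are assumed from
the documented behaviour of `transfer.hideRefs`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of hideRefs matching, NOT Git's implementation.
 * `stripped` is the refname with the namespace removed (NULL if the ref
 * lies outside the current namespace), `full` is the unstripped name.
 */
static int toy_ref_is_hidden(const char *stripped, const char *full,
			     const char *const *patterns, size_t nr)
{
	size_t i = nr;

	assert(patterns || !nr);

	/* Iterate from the last pattern so that later entries take precedence. */
	while (i-- > 0) {
		const char *match = patterns[i];
		const char *subject = stripped;
		int negated = 0;
		size_t len;

		if (*match == '!') {
			negated = 1;
			match++;
		}
		if (*match == '^') {
			subject = full;
			match++;
		}

		/* Ignore trailing slashes so "refs/hidden/" acts as a prefix. */
		len = strlen(match);
		while (len && match[len - 1] == '/')
			len--;

		/* The prefix must end at a path component boundary. */
		if (subject && !strncmp(subject, match, len) &&
		    (subject[len] == '\0' || subject[len] == '/'))
			return !negated;
	}
	return 0;
}
```

With a namespace in effect, a plain pattern is compared against the
stripped name only, so a pattern naming the namespace itself hides
nothing unless it is prefixed with `^`.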

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/rev-list-options.txt |   7 ++
 builtin/rev-list.c                 |   1 +
 revision.c                         |  43 +++++++-
 revision.h                         |   9 +-
 t/t6021-rev-list-exclude-hidden.sh | 159 +++++++++++++++++++++++++++++
 5 files changed, 217 insertions(+), 2 deletions(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 1837509566..a178956613 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[transfer|receive|uploadpack]::
+	Do not include refs that have been hidden via either one of
+	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that
+	the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob` would
+	otherwise consider.  This option is cleared when seeing one of these
+	pseudo-refs.
+
 --reflog::
 	Pretend as if all objects mentioned by reflogs are listed on the
 	command line as `<commit>`.
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 3acd93f71e..9eace06385 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -38,6 +38,7 @@ static const char rev_list_usage[] =
 "    --tags\n"
 "    --remotes\n"
 "    --stdin\n"
+"    --exclude-hidden=[transfer|receive|uploadpack]\n"
 "    --quiet\n"
 "  ordering output:\n"
 "    --topo-order\n"
diff --git a/revision.c b/revision.c
index e5eaaa24ba..45652f9b0b 100644
--- a/revision.c
+++ b/revision.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "config.h"
 #include "object-store.h"
 #include "tag.h"
 #include "blob.h"
@@ -1519,22 +1520,30 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
+	const char *stripped_path = strip_namespace(path);
 	struct string_list_item *item;
+
 	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
+
+	if (ref_is_hidden(stripped_path, path, &exclusions->hidden_refs))
+		return 1;
+
 	return 0;
 }
 
 void init_ref_exclusions(struct ref_exclusions *exclusions)
 {
 	string_list_init_dup(&exclusions->excluded_refs);
+	string_list_init_dup(&exclusions->hidden_refs);
 }
 
 void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
 	string_list_clear(&exclusions->excluded_refs, 0);
+	string_list_clear(&exclusions->hidden_refs, 1);
 }
 
 void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
@@ -1542,6 +1551,35 @@ void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
 	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
+struct exclude_hidden_refs_cb {
+	struct ref_exclusions *exclusions;
+	const char *section;
+};
+
+static int hide_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct exclude_hidden_refs_cb *cb = cb_data;
+	return parse_hide_refs_config(var, value, cb->section,
+				      &cb->exclusions->hidden_refs);
+}
+
+void exclude_hidden_refs(struct ref_exclusions *exclusions, const char *section)
+{
+	struct exclude_hidden_refs_cb cb;
+
+	if (strcmp(section, "transfer") && strcmp(section, "receive") &&
+	    strcmp(section, "uploadpack"))
+		die(_("unsupported section for hidden refs: %s"), section);
+
+	if (exclusions->hidden_refs.nr)
+		die(_("--exclude-hidden= passed more than once"));
+
+	cb.exclusions = exclusions;
+	cb.section = section;
+
+	git_config(hide_refs_config, &cb);
+}
+
 struct all_refs_cb {
 	int all_flags;
 	int warned_bad_reflog;
@@ -2220,7 +2258,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	    !strcmp(arg, "--bisect") || starts_with(arg, "--glob=") ||
 	    !strcmp(arg, "--indexed-objects") ||
 	    !strcmp(arg, "--alternate-refs") ||
-	    starts_with(arg, "--exclude=") ||
+	    starts_with(arg, "--exclude=") || starts_with(arg, "--exclude-hidden=") ||
 	    starts_with(arg, "--branches=") || starts_with(arg, "--tags=") ||
 	    starts_with(arg, "--remotes=") || starts_with(arg, "--no-walk="))
 	{
@@ -2709,6 +2747,9 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("exclude-hidden", argv, &optarg))) {
+		exclude_hidden_refs(&revs->ref_excludes, optarg);
+		return argcount;
 	} else if (skip_prefix(arg, "--branches=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
diff --git a/revision.h b/revision.h
index 87d6824c55..fef5e063d1 100644
--- a/revision.h
+++ b/revision.h
@@ -87,6 +87,12 @@ struct ref_exclusions {
 	 * patterns matches, the reference will be excluded.
 	 */
 	struct string_list excluded_refs;
+
+	/*
+	 * Hidden refs is a list of patterns that is to be hidden via
+	 * `ref_is_hidden()`.
+	 */
+	struct string_list hidden_refs;
 };
 
 struct oidset;
@@ -449,12 +455,13 @@ void show_object_with_name(FILE *, struct object *, const char *);
 /**
  * Helpers to check if a reference should be excluded.
  */
-#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP }
+#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP, .hidden_refs = STRING_LIST_INIT_DUP }
 
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
 void init_ref_exclusions(struct ref_exclusions *);
 void clear_ref_exclusions(struct ref_exclusions *);
 void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
+void exclude_hidden_refs(struct ref_exclusions *, const char *section);
 
 /**
  * This function can be used if you want to add commit objects as revision
diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
new file mode 100755
index 0000000000..d08fc2da93
--- /dev/null
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -0,0 +1,159 @@
+#!/bin/sh
+
+test_description='git rev-list --exclude-hidden test'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit_bulk --id=commit --ref=refs/heads/main 1 &&
+	COMMIT=$(git rev-parse refs/heads/main) &&
+	test_commit_bulk --id=tag --ref=refs/tags/lightweight 1 &&
+	TAG=$(git rev-parse refs/tags/lightweight) &&
+	test_commit_bulk --id=hidden --ref=refs/hidden/commit 1 &&
+	HIDDEN=$(git rev-parse refs/hidden/commit) &&
+	test_commit_bulk --id=namespace --ref=refs/namespaces/namespace/refs/namespaced/commit 1 &&
+	NAMESPACE=$(git rev-parse refs/namespaces/namespace/refs/namespaced/commit)
+'
+
+test_expect_success 'invalid section' '
+	echo "fatal: unsupported section for hidden refs: unsupported" >expected &&
+	test_must_fail git rev-list --exclude-hidden=unsupported 2>err &&
+	test_cmp expected err
+'
+
+test_expect_success 'passed multiple times' '
+	echo "fatal: --exclude-hidden= passed more than once" >expected &&
+	test_must_fail git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=transfer --exclude-hidden=transfer 2>err &&
+	test_cmp expected err
+'
+
+test_expect_success '--exclude-hidden without hiddenRefs' '
+	git rev-list --exclude-hidden=transfer --all >out &&
+	cat >expected <<-EOF &&
+	$NAMESPACE
+	$HIDDEN
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success 'hidden via transfer.hideRefs' '
+	git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=transfer --all >out &&
+	cat >expected <<-EOF &&
+	$NAMESPACE
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--all --exclude-hidden=transfer --not --all without hidden refs' '
+	git rev-list --all --exclude-hidden=transfer --not --all >out &&
+	test_must_be_empty out
+'
+
+test_expect_success '--all --exclude-hidden=transfer --not --all with hidden ref' '
+	git -c transfer.hideRefs=refs/hidden/ rev-list --all --exclude-hidden=transfer --not --all >out &&
+	cat >expected <<-EOF &&
+	$HIDDEN
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--exclude-hidden with --exclude' '
+	git -c transfer.hideRefs=refs/hidden/ rev-list --exclude=refs/tags/* --exclude-hidden=transfer --all >out &&
+	cat >expected <<-EOF &&
+	$NAMESPACE
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--exclude-hidden is reset' '
+	git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=transfer --all --all >out &&
+	cat >expected <<-EOF &&
+	$NAMESPACE
+	$HIDDEN
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--exclude-hidden operates on stripped refs by default' '
+	GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaced/ rev-list --exclude-hidden=transfer --all >out &&
+	cat >expected <<-EOF &&
+	$HIDDEN
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--exclude-hidden does not hide namespace by default' '
+	GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaces/namespace/ rev-list --exclude-hidden=transfer --all >out &&
+	cat >expected <<-EOF &&
+	$NAMESPACE
+	$HIDDEN
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+test_expect_success '--exclude-hidden= may operate on unstripped refs' '
+	GIT_NAMESPACE=namespace git -c transfer.hideRefs=^refs/namespaces/namespace/ rev-list --exclude-hidden=transfer --all >out &&
+	cat >expected <<-EOF &&
+	$HIDDEN
+	$TAG
+	$COMMIT
+	EOF
+	test_cmp expected out
+'
+
+for section in receive uploadpack
+do
+	test_expect_success "hidden via $section.hideRefs" '
+		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "--exclude-hidden=$section respects transfer.hideRefs" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "--exclude-hidden=transfer ignores $section.hideRefs" '
+		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=transfer --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "--exclude-hidden=$section respects both transfer.hideRefs and $section.hideRefs" '
+		git -c transfer.hideRefs=refs/tags/ -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+done
+
+test_done
-- 
2.38.1



* [PATCH v3 5/6] revparse: add `--exclude-hidden=` option
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2022-11-07 12:16   ` [PATCH v3 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
@ 2022-11-07 12:16   ` Patrick Steinhardt
  2022-11-08 14:44     ` Jeff King
  2022-11-07 12:16   ` [PATCH v3 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
  2022-11-08  0:59   ` [PATCH v3 0/6] " Taylor Blau
  6 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07 12:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2873 bytes --]

Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `transfer`, `uploadpack`
or `receive` as argument, it causes us to exclude all references that
would be hidden by the respective `$section.hideRefs` configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-rev-parse.txt | 7 +++++++
 builtin/rev-parse.c             | 4 ++++
 t/t6018-rev-list-glob.sh        | 8 ++++++++
 3 files changed, 19 insertions(+)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 6b8ca085aa..a016cb5abe 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -197,6 +197,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[transfer|receive|uploadpack]::
+	Do not include refs that have been hidden via either one of
+	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that
+	the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob` would
+	otherwise consider.  This option is cleared when seeing one of these
+	pseudo-refs.
+
 --disambiguate=<prefix>::
 	Show every object whose name begins with the given prefix.
 	The <prefix> must be at least 4 hexadecimal digits long to
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 7fa5b6991b..49730c7a23 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -895,6 +895,10 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				add_ref_exclusion(&ref_excludes, arg);
 				continue;
 			}
+			if (skip_prefix(arg, "--exclude-hidden=", &arg)) {
+				exclude_hidden_refs(&ref_excludes, arg);
+				continue;
+			}
 			if (!strcmp(arg, "--show-toplevel")) {
 				const char *work_tree = get_git_work_tree();
 				if (work_tree)
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index e1abc5c2b3..f92616de12 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -187,6 +187,14 @@ test_expect_success 'rev-parse --exclude=ref with --remotes=glob' '
 	compare rev-parse "--exclude=upstream/x --remotes=upstream/*" "upstream/one upstream/two"
 '
 
+test_expect_success 'rev-parse --exclude-hidden= with --all' '
+	compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--exclude-hidden=transfer --all" "--branches --tags"
+'
+
+test_expect_success 'rev-parse --exclude-hidden= with --all' '
+	compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude-hidden=transfer --all" "--exclude=refs/heads/subspace/* --all"
+'
+
 test_expect_success 'rev-list --exclude=glob with --branches=glob' '
 	compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one subspace/two"
 '
-- 
2.38.1



* [PATCH v3 6/6] receive-pack: only use visible refs for connectivity check
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2022-11-07 12:16   ` [PATCH v3 5/6] revparse: add `--exclude-hidden=` option Patrick Steinhardt
@ 2022-11-07 12:16   ` Patrick Steinhardt
  2022-11-08  0:59   ` [PATCH v3 0/6] " Taylor Blau
  6 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-07 12:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 4675 bytes --]

When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.

We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of objects reachable that have previously only ever been hidden,
whereas the common case is to push incremental changes that build on top
of the visible object graph.
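
To illustrate the trade-off described above, here is a minimal sketch
that is not part of the patch. It models the object graph as a small
array; `struct toy_object`, `toy_mark_uninteresting()` and
`toy_check_connected()` are hypothetical names made up for this example,
and the real walk is of course done by git-rev-list(1):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PARENTS 2

/* Toy object graph: every object is a commit with up to two parents. */
struct toy_object {
	int parents[TOY_MAX_PARENTS]; /* graph indices, -1 for "no parent" */
	int present;                  /* object exists in the object database */
	int uninteresting;            /* reachable from a ref passed after --not */
	int seen;                     /* already visited by the walk */
};

/* Mark a ref tip and everything reachable from it as uninteresting. */
static void toy_mark_uninteresting(struct toy_object *graph, int id)
{
	if (id < 0 || graph[id].uninteresting)
		return;
	graph[id].uninteresting = 1;
	for (size_t i = 0; i < TOY_MAX_PARENTS; i++)
		toy_mark_uninteresting(graph, graph[id].parents[i]);
}

/*
 * Walk from a new reference tip until uninteresting objects are hit.
 * Returns 0 if everything needed is present and -1 if an object is
 * missing. `*walked` counts the objects the walk had to visit.
 */
static int toy_check_connected(struct toy_object *graph, int id, int *walked)
{
	assert(graph && walked);
	if (id < 0 || graph[id].uninteresting || graph[id].seen)
		return 0;
	if (!graph[id].present)
		return -1;
	graph[id].seen = 1;
	(*walked)++;
	for (size_t i = 0; i < TOY_MAX_PARENTS; i++)
		if (toy_check_connected(graph, graph[id].parents[i], walked) < 0)
			return -1;
	return 0;
}
```

With a linear history where a hidden ref points at the parent of the
pushed tip, marking only the visible refs makes the walk visit one more
object than before, but the verdict stays the same. A missing object is
still detected as long as it is not reachable from a visible ref.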

This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs are hidden. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 million
refs as it is configured in Gitaly leads to a 4.5-fold speedup:

    Benchmark 1: main
      Time (mean ± σ):     30.977 s ±  0.157 s    [User: 30.226 s, System: 1.083 s]
      Range (min … max):   30.796 s … 31.071 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      6.799 s ±  0.063 s    [User: 6.803 s, System: 0.354 s]
      Range (min … max):    6.729 s …  6.850 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        4.56 ± 0.05 times faster than 'main'

As we mostly go through the same codepaths as before even when there
are no hidden refs at all, there is no change in performance when no
refs are hidden:

    Benchmark 1: main
      Time (mean ± σ):     48.188 s ±  0.432 s    [User: 49.326 s, System: 5.009 s]
      Range (min … max):   47.706 s … 48.539 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     48.027 s ±  0.500 s    [User: 48.934 s, System: 5.025 s]
      Range (min … max):   47.504 s … 48.500 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.00 ± 0.01 times faster than 'main'
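
For reference, the "times faster" figures in both summaries are simply
the ratio of the two mean wall-clock times. A tiny sketch of that
arithmetic, with `toy_speedup()` being a hypothetical helper that exists
only for this example:

```c
#include <assert.h>

/*
 * Ratio of two mean runtimes. A value above 1 means the candidate is
 * faster than the baseline.
 */
static double toy_speedup(double baseline_mean, double candidate_mean)
{
	assert(baseline_mean > 0 && candidate_mean > 0);
	return baseline_mean / candidate_mean;
}
```

Plugging in the means from above, 30.977 s over 6.799 s gives roughly
4.56 for the repository with hidden refs, and 48.188 s over 48.027 s
gives roughly 1.00 for the case without hidden refs.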

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c | 2 ++
 connected.c            | 3 +++
 connected.h            | 7 +++++++
 3 files changed, 12 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1f3efc58fb..77ab40f123 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1929,6 +1929,8 @@ static void execute_commands(struct command *commands,
 	opt.err_fd = err_fd;
 	opt.progress = err_fd && !quiet;
 	opt.env = tmp_objdir_env(tmp_objdir);
+	opt.exclude_hidden_refs_section = "receive";
+
 	if (check_connected(iterate_receive_command_list, &data, &opt))
 		set_connectivity_errors(commands, si);
 
diff --git a/connected.c b/connected.c
index 74a20cb32e..4f6388eed7 100644
--- a/connected.c
+++ b/connected.c
@@ -100,6 +100,9 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		strvec_push(&rev_list.args, "--not");
+		if (opt->exclude_hidden_refs_section)
+			strvec_pushf(&rev_list.args, "--exclude-hidden=%s",
+				     opt->exclude_hidden_refs_section);
 		strvec_push(&rev_list.args, "--all");
 	}
 	strvec_push(&rev_list.args, "--quiet");
diff --git a/connected.h b/connected.h
index 6e59c92aa3..16b2c84f2e 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,13 @@ struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If not NULL, use `--exclude-hidden=$section` to exclude all refs
+	 * hidden via the `$section.hideRefs` config from the set of
+	 * already-reachable refs.
+	 */
+	const char *exclude_hidden_refs_section;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }
-- 
2.38.1



* Re: [PATCH v3 3/6] revision: introduce struct to handle exclusions
  2022-11-07 12:16   ` [PATCH v3 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
@ 2022-11-07 12:51     ` Ævar Arnfjörð Bjarmason
  2022-11-08  9:11       ` Patrick Steinhardt
  0 siblings, 1 reply; 88+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-07 12:51 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Taylor Blau, Jeff King


On Mon, Nov 07 2022, Patrick Steinhardt wrote:

> The functions that handle exclusion of refs work on a single string
> list. We're about to add a second mechanism for excluding refs though,
> and it makes sense to reuse much of the same architecture for both kinds
> of exclusion.
>
> Introduce a new `struct ref_exclusions` that encapsulates all the logic
> related to excluding refs.

I think it's a good change, but probably worth mentioning that we're
moving the "excluded_refs" from being malloc'd to the "struct
string_list" being embedded in this new "struct ref_exclusions".

That change isn't necessary for hoisting it into a container struct, but
does make things nicer down the line.

>  	struct string_list_item *item;
> -
> -	if (!ref_excludes)
> -		return 0;
> -	for_each_string_list_item(item, ref_excludes) {
> +	for_each_string_list_item(item, &exclusions->excluded_refs) {
>  		if (!wildmatch(item->string, path, 0))
>  			return 1;

E.g. because here we don't care about the distinction between NULL and
!list->nr anymore, it *does* matter in some cases, but it's always nice
to be able to clearly distinguish the cases where we don't, such as this
one....
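
To spell out the NULL vs. empty distinction, a minimal standalone
sketch. `struct toy_list` and friends are hypothetical stand-ins for
"struct string_list" and the new container, and plain strcmp() is used
where the real code uses wildmatch():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_list {
	const char **items;
	size_t nr;
};

/* Old shape: the list is allocated on demand, so NULL means "no list". */
static int toy_excluded_via_pointer(const struct toy_list *excludes,
				    const char *path)
{
	assert(path);
	if (!excludes)
		return 0;
	for (size_t i = 0; i < excludes->nr; i++)
		if (!strcmp(excludes->items[i], path))
			return 1;
	return 0;
}

/* New shape: the list is embedded, so "empty" is the only "nothing" state. */
struct toy_exclusions {
	struct toy_list excluded_refs;
};

static int toy_excluded_via_struct(const struct toy_exclusions *exclusions,
				   const char *path)
{
	const struct toy_list *list;

	assert(exclusions && path);
	list = &exclusions->excluded_refs;
	for (size_t i = 0; i < list->nr; i++)
		if (!strcmp(list->items[i], path))
			return 1;
	return 0;
}
```

The second variant needs no special case at all: with "nr" being zero
the loop body never runs.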

> -void clear_ref_exclusion(struct string_list **ref_excludes_p)
> +void init_ref_exclusions(struct ref_exclusions *exclusions)
>  {
> -	if (*ref_excludes_p) {
> -		string_list_clear(*ref_excludes_p, 0);
> -		free(*ref_excludes_p);
> -	}
> -	*ref_excludes_p = NULL;

...and this becomes much nicer.

Aside: There's some churn, and this diff is worse for the
rename-while-at-it of "clear_ref_exclusion" to "add_ref_exclusion", but
that's probably worth it to have the macro match the struct name etc.

> +	string_list_init_dup(&exclusions->excluded_refs);

Okay, so this is partly my fault for not following up on f196c1e908d
(revisions API users: use release_revisions() needing REV_INFO_INIT,
2022-04-13) :); But here:

If we keep this *_init() function don't duplicate what you're adding to
the macro, just init this in terms of the macro. See the two-line
examples in:

	git grep -W memcpy.*blank

But here (and this is the part that's mostly me) as we don't malloc this
anymore you're only needing to keep this init function for
repo_init_revisions().

So, probably too big a digression for a "while at it", but FWIW this on
top of your topic would do:
	
	 revision.c | 10 ++--------
	 revision.h | 10 +++++++---
	 2 files changed, 9 insertions(+), 11 deletions(-)
	
	diff --git a/revision.c b/revision.c
	index 45652f9b0bb..cf352d1fa43 100644
	--- a/revision.c
	+++ b/revision.c
	@@ -1534,12 +1534,6 @@ int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
	 	return 0;
	 }
	 
	-void init_ref_exclusions(struct ref_exclusions *exclusions)
	-{
	-	string_list_init_dup(&exclusions->excluded_refs);
	-	string_list_init_dup(&exclusions->hidden_refs);
	-}
	-
	 void clear_ref_exclusions(struct ref_exclusions *exclusions)
	 {
	 	string_list_clear(&exclusions->excluded_refs, 0);
	@@ -1897,7 +1891,8 @@ void repo_init_revisions(struct repository *r,
	 			 struct rev_info *revs,
	 			 const char *prefix)
	 {
	-	memset(revs, 0, sizeof(*revs));
	+	struct rev_info blank = REV_INFO_INIT;
	+	memcpy(revs, &blank, sizeof(*revs));
	 
	 	revs->repo = r;
	 	revs->abbrev = DEFAULT_ABBREV;
	@@ -1933,7 +1928,6 @@ void repo_init_revisions(struct repository *r,
	 
	 	init_display_notes(&revs->notes_opt);
	 	list_objects_filter_init(&revs->filter);
	-	init_ref_exclusions(&revs->ref_excludes);
	 }
	 
	 static void add_pending_commit_list(struct rev_info *revs,
	diff --git a/revision.h b/revision.h
	index fef5e063d16..75b8ecc307b 100644
	--- a/revision.h
	+++ b/revision.h
	@@ -94,6 +94,10 @@ struct ref_exclusions {
	 	 */
	 	struct string_list hidden_refs;
	 };
	+#define REF_EXCLUSIONS_INIT { \
	+	.excluded_refs = STRING_LIST_INIT_DUP, \
	+	.hidden_refs = STRING_LIST_INIT_DUP, \
	+}
	 
	 struct oidset;
	 struct topo_walk_info;
	@@ -371,7 +375,9 @@ struct rev_info {
	  * called before release_revisions() the "struct rev_info" can be left
	  * uninitialized.
	  */
	-#define REV_INFO_INIT { 0 }
	+#define REV_INFO_INIT { \
	+	.ref_excludes = REF_EXCLUSIONS_INIT, \
	+}
	 
	 /**
	  * Initialize a rev_info structure with default values. The third parameter may
	@@ -455,10 +461,8 @@ void show_object_with_name(FILE *, struct object *, const char *);
	 /**
	  * Helpers to check if a reference should be excluded.
	  */
	-#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP, .hidden_refs = STRING_LIST_INIT_DUP }
	 
	 int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
	-void init_ref_exclusions(struct ref_exclusions *);
	 void clear_ref_exclusions(struct ref_exclusions *);
	 void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
	 void exclude_hidden_refs(struct ref_exclusions *, const char *section);

I'll submit that cleanup separately, but for now let's not duplicate
your REF_EXCLUSIONS_INIT macro here in init_ref_exclusions(), just have
the function do what the macro is doing, now that we don't need the
malloc.
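
In isolation, the "init function in terms of the macro" pattern looks
roughly like the following sketch. All `toy_*` and `TOY_*` names are
hypothetical and only mirror the shape of "struct string_list" and
"struct ref_exclusions"; this is not the actual revision.c code:

```c
#include <assert.h>
#include <string.h>

struct toy_list {
	char **items;
	size_t nr;
	unsigned strdup_strings : 1;
};
#define TOY_LIST_INIT_DUP { .strdup_strings = 1 }

struct toy_exclusions {
	struct toy_list excluded_refs;
	struct toy_list hidden_refs;
};
#define TOY_EXCLUSIONS_INIT { \
	.excluded_refs = TOY_LIST_INIT_DUP, \
	.hidden_refs = TOY_LIST_INIT_DUP, \
}

/*
 * The macro is the single source of truth. The function copies a blank
 * instance instead of initializing every member by hand, so adding a
 * member only requires touching the macro.
 */
static void toy_exclusions_init(struct toy_exclusions *exclusions)
{
	struct toy_exclusions blank = TOY_EXCLUSIONS_INIT;

	assert(exclusions);
	memcpy(exclusions, &blank, sizeof(*exclusions));
}
```

Stack variables can then use the macro directly, while callers that are
handed a pointer to uninitialized memory go through the function and
end up with exactly the same state.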

> -void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
> +void clear_ref_exclusions(struct ref_exclusions *exclusions)
>  {
> -	if (!*ref_excludes_p) {
> -		CALLOC_ARRAY(*ref_excludes_p, 1);
> -		(*ref_excludes_p)->strdup_strings = 1;
> -	}
> -	string_list_append(*ref_excludes_p, exclude);
> +	string_list_clear(&exclusions->excluded_refs, 0);

Also nicer.

>  static void add_pending_commit_list(struct rev_info *revs,
> @@ -2689,10 +2684,10 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
>  			init_all_refs_cb(&cb, revs, *flags);
>  			other_head_refs(handle_one_ref, &cb);
>  		}
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);
>  	} else if (!strcmp(arg, "--branches")) {
>  		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);
>  	} else if (!strcmp(arg, "--bisect")) {
>  		read_bisect_terms(&term_bad, &term_good);
>  		handle_refs(refs, revs, *flags, for_each_bad_bisect_ref);
> @@ -2701,15 +2696,15 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
>  		revs->bisect = 1;
>  	} else if (!strcmp(arg, "--tags")) {
>  		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);
>  	} else if (!strcmp(arg, "--remotes")) {
>  		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);
>  	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
>  		struct all_refs_cb cb;
>  		init_all_refs_cb(&cb, revs, *flags);
>  		for_each_glob_ref(handle_one_ref, optarg, &cb);
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);
>  		return argcount;
>  	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
>  		add_ref_exclusion(&revs->ref_excludes, optarg);
> @@ -2718,17 +2713,17 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
>  		struct all_refs_cb cb;
>  		init_all_refs_cb(&cb, revs, *flags);
>  		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);
>  	} else if (skip_prefix(arg, "--tags=", &optarg)) {
>  		struct all_refs_cb cb;
>  		init_all_refs_cb(&cb, revs, *flags);
>  		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);
>  	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
>  		struct all_refs_cb cb;
>  		init_all_refs_cb(&cb, revs, *flags);
>  		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
> -		clear_ref_exclusion(&revs->ref_excludes);
> +		clear_ref_exclusions(&revs->ref_excludes);

The churn I mentioned with the renaming, so maybe worth doing that as a
"prep" commit?

> +struct ref_exclusions {
> +	/*
> +	 * Excluded refs is a list of wildmatch patterns. If any of the
> +	 * patterns matches, the reference will be excluded.
> +	 */
> +	struct string_list excluded_refs;
> +};

Per the above POC diff though, please move...

>  struct oidset;
>  struct topo_walk_info;
>  
> @@ -103,7 +111,7 @@ struct rev_info {
>  	struct list_objects_filter_options filter;
>  
>  	/* excluding from --branches, --refs, etc. expansion */
> -	struct string_list *ref_excludes;
> +	struct ref_exclusions ref_excludes;
>  
>  	/* Basic information */
>  	const char *prefix;
> @@ -439,12 +447,14 @@ void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees)
>  void show_object_with_name(FILE *, struct object *, const char *);
>  
>  /**
> - * Helpers to check if a "struct string_list" item matches with
> - * wildmatch().
> + * Helpers to check if a reference should be excluded.
>   */
> -int ref_excluded(struct string_list *, const char *path);
> -void clear_ref_exclusion(struct string_list **);
> -void add_ref_exclusion(struct string_list **, const char *exclude);
> +#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP }

...this macro to right after declaring the struct, which is what we
usually do, and will help in adding it to "REV_INFO_INIT" sooner than
later.

Also, at the end of your series this ends up being overly long, so per
the diff above (which is to the end of the series), let's start by
line-wrapping it:

	#define ..._INIT { \
        	.member = ..._INIT, \
	}

* Re: [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-07 12:16   ` [PATCH v3 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
@ 2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
  2022-11-07 17:07       ` Ævar Arnfjörð Bjarmason
  2022-11-08  9:22       ` Patrick Steinhardt
  2022-11-08  0:57     ` Taylor Blau
  1 sibling, 2 replies; 88+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-07 13:34 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Taylor Blau, Jeff King


On Mon, Nov 07 2022, Patrick Steinhardt wrote:

> +--exclude-hidden=[transfer|receive|uploadpack]::
> +	Do not include refs that have been hidden via either one of
> +	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that

Maybe worth adding "(see linkgit:git-config[1])" after listing the config
variables.

>  int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
>  {
> +	const char *stripped_path = strip_namespace(path);
>  	struct string_list_item *item;
> +

nit: stray whitespace in otherwise "clean" commit, but the post-image looks nicer, so...

>  void init_ref_exclusions(struct ref_exclusions *exclusions)
>  {
>  	string_list_init_dup(&exclusions->excluded_refs);
> +	string_list_init_dup(&exclusions->hidden_refs);
>  }

Per my comment on 3/6 we wouldn't need this when using the macro as a
source of truth.

>  void clear_ref_exclusions(struct ref_exclusions *exclusions)
>  {
>  	string_list_clear(&exclusions->excluded_refs, 0);
> +	string_list_clear(&exclusions->hidden_refs, 1);
>  }

Hrm, I'll read on, but I don't see any use of "util" here at a glance,
should the "1" here be "0", or maybe I've just missed how it's used...

> +	if (strcmp(section, "transfer") && strcmp(section, "receive") &&
> +	    strcmp(section, "uploadpack"))
> +		die(_("unsupported section for hidden refs: %s"), section);
> +
> +	if (exclusions->hidden_refs.nr)
> +		die(_("--exclude-hidden= passed more than once"));

We usually just ignore the first of --foo=bar --foo=baz and take "baz"
in our CLI use. Is it better to die here than just clear the previous
one & continue?


> -#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP }
> +#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP, .hidden_refs = STRING_LIST_INIT_DUP }

...the getting overly long line I mentioned in 3/6...

> +TEST_PASSES_SANITIZE_LEAK=true

Thanks for adding this! :)

* Re: [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
@ 2022-11-07 17:07       ` Ævar Arnfjörð Bjarmason
  2022-11-08  9:48         ` Patrick Steinhardt
  2022-11-08  9:22       ` Patrick Steinhardt
  1 sibling, 1 reply; 88+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-07 17:07 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Taylor Blau, Jeff King


On Mon, Nov 07 2022, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Nov 07 2022, Patrick Steinhardt wrote:

>> +TEST_PASSES_SANITIZE_LEAK=true
>
> Thanks for adding this! :)

Hrm, I spoke too soon :) This series adds new leaks, so it'll fail with
the linux-leaks job. I have the following local monkeypatch on top,
which obviously doesn't address the root cause. The t6018 leak is new
due to the new tests you added.

diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index f92616de12d..54221588dd0 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -5,7 +5,6 @@ test_description='rev-list/rev-parse --glob'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
-TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 commit () {
diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
index d08fc2da93d..908b9dba611 100755
--- a/t/t6021-rev-list-exclude-hidden.sh
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -2,7 +2,6 @@
 
 test_description='git rev-list --exclude-hidden test'
 
-TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '

* Re: [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-07 12:16   ` [PATCH v3 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
  2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
@ 2022-11-08  0:57     ` Taylor Blau
  2022-11-08  8:16       ` Patrick Steinhardt
  1 sibling, 1 reply; 88+ messages in thread
From: Taylor Blau @ 2022-11-08  0:57 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jeff King

On Mon, Nov 07, 2022 at 01:16:35PM +0100, Patrick Steinhardt wrote:
> @@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
>  or `--all`. If a trailing '/{asterisk}' is intended, it must be given
>  explicitly.
>
> +--exclude-hidden=[transfer|receive|uploadpack]::
> +	Do not include refs that have been hidden via either one of
> +	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that
> +	the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob` would
> +	otherwise consider.  This option is cleared when seeing one of these
> +	pseudo-refs.
> +

Hmm. I thought that part of the motivation behind this round was to drop
the 'transfer' group, since it's implied by '--exclude-hidden=receive
--exclude-hidden=uploadpack', no?

Thanks,
Taylor

* Re: [PATCH v3 0/6] receive-pack: only use visible refs for connectivity check
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2022-11-07 12:16   ` [PATCH v3 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
@ 2022-11-08  0:59   ` Taylor Blau
  6 siblings, 0 replies; 88+ messages in thread
From: Taylor Blau @ 2022-11-08  0:59 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jeff King

On Mon, Nov 07, 2022 at 01:16:19PM +0100, Patrick Steinhardt wrote:
> Patrick Steinhardt (6):
>   refs: get rid of global list of hidden refs
>   revision: move together exclusion-related functions
>   revision: introduce struct to handle exclusions
>   revision: add new parameter to exclude hidden refs
>   revparse: add `--exclude-hidden=` option
>   receive-pack: only use visible refs for connectivity check

Thanks for the updated round. This version is looking pretty good,
though it looks like Ævar and I had a few minor recommendations. I am
definitely puzzled by seeing '--exclude-hidden=transfer', though.

Thanks,
Taylor

* Re: [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-08  0:57     ` Taylor Blau
@ 2022-11-08  8:16       ` Patrick Steinhardt
  2022-11-08 14:42         ` Jeff King
  0 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08  8:16 UTC (permalink / raw)
  To: Taylor Blau
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jeff King

On Mon, Nov 07, 2022 at 07:57:29PM -0500, Taylor Blau wrote:
> On Mon, Nov 07, 2022 at 01:16:35PM +0100, Patrick Steinhardt wrote:
> > @@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
> >  or `--all`. If a trailing '/{asterisk}' is intended, it must be given
> >  explicitly.
> >
> > +--exclude-hidden=[transfer|receive|uploadpack]::
> > +	Do not include refs that have been hidden via either one of
> > +	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that
> > +	the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob` would
> > +	otherwise consider.  This option is cleared when seeing one of these
> > +	pseudo-refs.
> > +
> 
> Hmm. I thought that part of the motivation behind this round was to drop
> the 'transfer' group, since it's implied by '--exclude-hidden=receive
> --exclude-hidden=uploadpack', no?
> 
> Thanks,
> Taylor

I didn't quite see the point in not providing the `transfer` group so
that users can ask for only the set of refs that are hidden by both
`uploadpack` and `receive`. But given that you're the second person
asking for it to be dropped now and given that I don't really have a
plausible usecase for this I'll drop it in the next version.

Patrick


* Re: [PATCH v3 3/6] revision: introduce struct to handle exclusions
  2022-11-07 12:51     ` Ævar Arnfjörð Bjarmason
@ 2022-11-08  9:11       ` Patrick Steinhardt
  0 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08  9:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Taylor Blau, Jeff King

On Mon, Nov 07, 2022 at 01:51:51PM +0100, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Nov 07 2022, Patrick Steinhardt wrote:
[snip]
> > +	string_list_init_dup(&exclusions->excluded_refs);
> 
> Okay, so this is partly my fault for not following up on f196c1e908d
> (revisions API users: use release_revisions() needing REV_INFO_INIT,
> 2022-04-13) :); But here:
> 
> If we keep this *_init() function don't duplicate what you're adding to
> the macro, just init this in terms of the macro. See the two-line
> examples in:
> 
> 	git grep -W memcpy.*blank

Makes sense.

> But here (and this is the part that's mostly me) as we don't malloc this
> anymore you're only needing to keep this init function for
> repo_init_revisions().
> 
> So, probably too big a digression for a "while at it", but FWIW this on
> top of your topic would do:
> 	
> 	 revision.c | 10 ++--------
> 	 revision.h | 10 +++++++---
> 	 2 files changed, 9 insertions(+), 11 deletions(-)
> 	
> 	diff --git a/revision.c b/revision.c
> 	index 45652f9b0bb..cf352d1fa43 100644
> 	--- a/revision.c
> 	+++ b/revision.c
> 	@@ -1534,12 +1534,6 @@ int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
> 	 	return 0;
> 	 }
> 	 
> 	-void init_ref_exclusions(struct ref_exclusions *exclusions)
> 	-{
> 	-	string_list_init_dup(&exclusions->excluded_refs);
> 	-	string_list_init_dup(&exclusions->hidden_refs);
> 	-}
> 	-
> 	 void clear_ref_exclusions(struct ref_exclusions *exclusions)
> 	 {
> 	 	string_list_clear(&exclusions->excluded_refs, 0);
> 	@@ -1897,7 +1891,8 @@ void repo_init_revisions(struct repository *r,
> 	 			 struct rev_info *revs,
> 	 			 const char *prefix)
> 	 {
> 	-	memset(revs, 0, sizeof(*revs));
> 	+	struct rev_info blank = REV_INFO_INIT;
> 	+	memcpy(revs, &blank, sizeof(*revs));
> 	 
> 	 	revs->repo = r;
> 	 	revs->abbrev = DEFAULT_ABBREV;
> 	@@ -1933,7 +1928,6 @@ void repo_init_revisions(struct repository *r,
> 	 
> 	 	init_display_notes(&revs->notes_opt);
> 	 	list_objects_filter_init(&revs->filter);
> 	-	init_ref_exclusions(&revs->ref_excludes);
> 	 }
> 	 
> 	 static void add_pending_commit_list(struct rev_info *revs,
> 	diff --git a/revision.h b/revision.h
> 	index fef5e063d16..75b8ecc307b 100644
> 	--- a/revision.h
> 	+++ b/revision.h
> 	@@ -94,6 +94,10 @@ struct ref_exclusions {
> 	 	 */
> 	 	struct string_list hidden_refs;
> 	 };
> 	+#define REF_EXCLUSIONS_INIT { \
> 	+	.excluded_refs = STRING_LIST_INIT_DUP, \
> 	+	.hidden_refs = STRING_LIST_INIT_DUP, \
> 	+}
> 	 
> 	 struct oidset;
> 	 struct topo_walk_info;
> 	@@ -371,7 +375,9 @@ struct rev_info {
> 	  * called before release_revisions() the "struct rev_info" can be left
> 	  * uninitialized.
> 	  */
> 	-#define REV_INFO_INIT { 0 }
> 	+#define REV_INFO_INIT { \
> 	+	.ref_excludes = REF_EXCLUSIONS_INIT, \
> 	+}
> 	 
> 	 /**
> 	  * Initialize a rev_info structure with default values. The third parameter may
> 	@@ -455,10 +461,8 @@ void show_object_with_name(FILE *, struct object *, const char *);
> 	 /**
> 	  * Helpers to check if a reference should be excluded.
> 	  */
> 	-#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP, .hidden_refs = STRING_LIST_INIT_DUP }
> 	 
> 	 int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
> 	-void init_ref_exclusions(struct ref_exclusions *);
> 	 void clear_ref_exclusions(struct ref_exclusions *);
> 	 void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
> 	 void exclude_hidden_refs(struct ref_exclusions *, const char *section);
> 
> I'll submit that cleanup separately, but for now let's not duplicate
> your REF_EXCLUSIONS_INIT macro here in init_ref_exclusions(), just have
> the function do what the macro is doing, now that we don't need the
> malloc.

Great, thanks.

[snip]
> >  static void add_pending_commit_list(struct rev_info *revs,
> > @@ -2689,10 +2684,10 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
> >  			init_all_refs_cb(&cb, revs, *flags);
> >  			other_head_refs(handle_one_ref, &cb);
> >  		}
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> >  	} else if (!strcmp(arg, "--branches")) {
> >  		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> >  	} else if (!strcmp(arg, "--bisect")) {
> >  		read_bisect_terms(&term_bad, &term_good);
> >  		handle_refs(refs, revs, *flags, for_each_bad_bisect_ref);
> > @@ -2701,15 +2696,15 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
> >  		revs->bisect = 1;
> >  	} else if (!strcmp(arg, "--tags")) {
> >  		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> >  	} else if (!strcmp(arg, "--remotes")) {
> >  		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> >  	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
> >  		struct all_refs_cb cb;
> >  		init_all_refs_cb(&cb, revs, *flags);
> >  		for_each_glob_ref(handle_one_ref, optarg, &cb);
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> >  		return argcount;
> >  	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
> >  		add_ref_exclusion(&revs->ref_excludes, optarg);
> > @@ -2718,17 +2713,17 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
> >  		struct all_refs_cb cb;
> >  		init_all_refs_cb(&cb, revs, *flags);
> >  		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> >  	} else if (skip_prefix(arg, "--tags=", &optarg)) {
> >  		struct all_refs_cb cb;
> >  		init_all_refs_cb(&cb, revs, *flags);
> >  		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> >  	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
> >  		struct all_refs_cb cb;
> >  		init_all_refs_cb(&cb, revs, *flags);
> >  		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
> > -		clear_ref_exclusion(&revs->ref_excludes);
> > +		clear_ref_exclusions(&revs->ref_excludes);
> 
> The churn I mentioned with the renaming, so maybe worth doing that as a
> "prep" commit?

Hm. I don't think it's too bad, and it would be weird to rename things
up front without any justification that is visible from the diff. Like,
there is no `struct ref_exclusions` yet, why rename it?

I'll retain this in a single commit if you don't mind, but amend the
commit message to explicitly mention the rename.

> > +struct ref_exclusions {
> > +	/*
> > +	 * Excluded refs is a list of wildmatch patterns. If any of the
> > +	 * patterns matches, the reference will be excluded.
> > +	 */
> > +	struct string_list excluded_refs;
> > +};
> 
> Per the above POC diff though, please move...
> 
> >  struct oidset;
> >  struct topo_walk_info;
> >  
> > @@ -103,7 +111,7 @@ struct rev_info {
> >  	struct list_objects_filter_options filter;
> >  
> >  	/* excluding from --branches, --refs, etc. expansion */
> > -	struct string_list *ref_excludes;
> > +	struct ref_exclusions ref_excludes;
> >  
> >  	/* Basic information */
> >  	const char *prefix;
> > @@ -439,12 +447,14 @@ void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees)
> >  void show_object_with_name(FILE *, struct object *, const char *);
> >  
> >  /**
> > - * Helpers to check if a "struct string_list" item matches with
> > - * wildmatch().
> > + * Helpers to check if a reference should be excluded.
> >   */
> > -int ref_excluded(struct string_list *, const char *path);
> > -void clear_ref_exclusion(struct string_list **);
> > -void add_ref_exclusion(struct string_list **, const char *exclude);
> > +#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP }
> 
> ...this macro to right after declaring the struct, which is what we
> usually do, and will help in adding it to "REV_INFO_INIT" sooner than
> later.

Fair, will change.

> Also, at the end of your series this ends up being overly long, so per
> the diff above (which is to the end of the series), let's start by
> line-wrapping it:
> 
> 	#define ..._INIT { \
>         	.member = ..._INIT, \
> 	}

Makes sense.

Patrick


* Re: [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
  2022-11-07 17:07       ` Ævar Arnfjörð Bjarmason
@ 2022-11-08  9:22       ` Patrick Steinhardt
  1 sibling, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08  9:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Taylor Blau, Jeff King

On Mon, Nov 07, 2022 at 02:34:45PM +0100, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Nov 07 2022, Patrick Steinhardt wrote:
[snip]
> > +	if (strcmp(section, "transfer") && strcmp(section, "receive") &&
> > +	    strcmp(section, "uploadpack"))
> > +		die(_("unsupported section for hidden refs: %s"), section);
> > +
> > +	if (exclusions->hidden_refs.nr)
> > +		die(_("--exclude-hidden= passed more than once"));
> 
> We usually just ignore the first of --foo=bar --foo=baz and take "baz"
> in our CLI use. Is it better to die here than just clear the previous
> one & continue?

It's something I was torn on. I ultimately chose to die though because
of the difference between `--exclude` and `--exclude-hidden`: the former
one will happily add additional patterns, all of which will ultimately
be ignored. So as a user you might rightfully expect that the latter
will work the same: if both `--exclude-hidden=uploadpack` and
`--exclude-hidden=receive` are specified, you might want to have both be
ignored.

To me it wasn't quite clear how to support multiple instances of
`transfer.hideRefs` though as there is also the concept of un-hiding
already-hidden refs. So I wanted to avoid going into this discussion to
make the patch series a little bit smaller.

By dying instead of silently overriding the previous argument, we retain
the ability to iterate on this at a later point and implement the above
behaviour if the use case ever arises.
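
To illustrate the design choice, here is a minimal sketch of rejecting a
repeated option instead of letting the last one win. The names
`struct hidden_opt` and `set_exclude_hidden()` are hypothetical and not
part of the series; the quoted code checks `exclusions->hidden_refs.nr`
and calls `die()`, whereas this sketch returns error codes so that it can
be exercised standalone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant state in `struct ref_exclusions`. */
struct hidden_opt {
	const char *section;	/* NULL until --exclude-hidden= has been seen */
};

/*
 * Returns 0 on success, -1 for an unsupported section and -2 when the
 * option is passed a second time. The patch calls die() in both cases.
 */
int set_exclude_hidden(struct hidden_opt *opt, const char *section)
{
	assert(opt && section);

	if (strcmp(section, "transfer") && strcmp(section, "receive") &&
	    strcmp(section, "uploadpack"))
		return -1;

	/* Refuse to silently override: keeps room for "merge" semantics later. */
	if (opt->section)
		return -2;

	opt->section = section;
	return 0;
}
```

Because a second occurrence is an error today, giving it a meaning later
(for example merging both sections) would not change the behaviour of any
invocation that currently succeeds.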

Patrick

* Re: [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-07 17:07       ` Ævar Arnfjörð Bjarmason
@ 2022-11-08  9:48         ` Patrick Steinhardt
  0 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08  9:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 1253 bytes --]

On Mon, Nov 07, 2022 at 06:07:01PM +0100, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Nov 07 2022, Ævar Arnfjörð Bjarmason wrote:
> 
> > On Mon, Nov 07 2022, Patrick Steinhardt wrote:
> 
> >> +TEST_PASSES_SANITIZE_LEAK=true
> >
> > Thanks for adding this! :)
> 
> Hrm, I spoke too soon :) This series adds new leaks, so it'll fail with
> the linux-leaks job. I have the following local monkeypatch on top,
> which obviously doesn't address the root cause. The t6018 leak is new
> due to the new tests you added.

Right, I didn't know it was as easy to run tests with leak checking as
just executing `make test SANITIZE=leak`. Anyway, I did that now and the
issue is in fact in how the hidden refs are parsed: we already
`xstrdup()` the config value because we need to modify it anyway, and
then append it to a string list that duplicates it a second time, so
the first copy is never freed.

The following patch fixes the issue:

diff --git a/refs.c b/refs.c
index f1711e2e9f..2c7e88b190 100644
--- a/refs.c
+++ b/refs.c
@@ -1430,7 +1430,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 		len = strlen(ref);
 		while (len && ref[len - 1] == '/')
 			ref[--len] = '\0';
-		string_list_append(hide_refs, ref);
+		string_list_append_nodup(hide_refs, ref);
 	}
 	return 0;
 }
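
The ownership issue behind that one-line fix can be sketched in isolation.
This is a toy model and not Git's string-list API: `struct toy_list`,
`toy_append()`, `toy_append_nodup()`, `toy_clear()` and `add_hidden_ref()`
are all hypothetical names that only mirror the idea of a list which
either copies a string or takes ownership of it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy list that owns and eventually frees every string it holds. */
struct toy_list {
	char **items;
	size_t nr;
};

/* Append without copying: the list takes ownership of `s`. */
void toy_append_nodup(struct toy_list *l, char *s)
{
	char **grown = realloc(l->items, (l->nr + 1) * sizeof(char *));
	assert(grown);
	l->items = grown;
	l->items[l->nr++] = s;
}

/* Append a private copy of `s`; the caller keeps ownership of `s`. */
void toy_append(struct toy_list *l, const char *s)
{
	size_t len = strlen(s);
	char *copy = malloc(len + 1);
	assert(copy);
	memcpy(copy, s, len + 1);
	toy_append_nodup(l, copy);
}

/* Free the list and all strings it owns. */
void toy_clear(struct toy_list *l)
{
	size_t i;
	for (i = 0; i < l->nr; i++)
		free(l->items[i]);
	free(l->items);
	l->items = NULL;
	l->nr = 0;
}

/*
 * Copy `value` so trailing slashes can be stripped, then hand the copy
 * over. Calling toy_append(l, ref) here instead would copy a second
 * time and leak `ref`, which is the shape of the leak fixed above.
 */
void add_hidden_ref(struct toy_list *l, const char *value)
{
	size_t len = strlen(value);
	char *ref = malloc(len + 1);
	assert(ref);
	memcpy(ref, value, len + 1);
	while (len && ref[len - 1] == '/')
		ref[--len] = '\0';
	toy_append_nodup(l, ref);
}
```

With the no-copy append, the single allocation made for the modified
value is the one the list later frees.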

Patrick

* [PATCH v4 0/6] receive-pack: only use visible refs for connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
@ 2022-11-08 10:03 ` Patrick Steinhardt
  2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
                     ` (5 more replies)
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
  9 siblings, 6 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 10:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 19538 bytes --]

Hi,

this is the fourth version of my patch series that tries to improve the
performance of the connectivity check by only treating those preexisting
refs as uninteresting that could actually have been advertised to the
client.

This time around there are only incremental changes compared to v3; the
overall implementation stays the same. Changes:

    - Fixed a pre-existing memory leak in how hidden refs are parsed so
      that tests now pass with TEST_PASSES_SANITIZE_LEAK=true.

    - Improved initialization of `struct ref_exclusions` to reuse the
      `REF_EXCLUSIONS_INIT` macro so we don't have to repeat the logic.

    - Dropped the `--exclude-hidden=transfer` option. Only `receive` and
      `uploadpack` are supported now.
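
The second item can be pictured with a small standalone sketch: the
initializer macro is the single source of truth, and the runtime init
function copies a blank instance built from it. All `toy_*` and `TOY_*`
names are hypothetical stand-ins for `struct ref_exclusions`,
`REF_EXCLUSIONS_INIT` and `init_ref_exclusions()`; only the pattern is
taken from the range-diff below.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for `struct string_list`. */
struct toy_list {
	char **items;
	size_t nr;
	int dup_strings;
};
#define TOY_LIST_INIT_DUP { NULL, 0, 1 }

/* Hypothetical stand-in for `struct ref_exclusions`. */
struct toy_exclusions {
	struct toy_list excluded_refs;
	struct toy_list hidden_refs;
};

/* Line-wrapped so that adding a member is a one-line diff. */
#define TOY_EXCLUSIONS_INIT { \
	.excluded_refs = TOY_LIST_INIT_DUP, \
	.hidden_refs = TOY_LIST_INIT_DUP, \
}

/* Runtime initialization reuses the macro instead of repeating the logic. */
void toy_exclusions_init(struct toy_exclusions *e)
{
	struct toy_exclusions blank = TOY_EXCLUSIONS_INIT;
	assert(e);
	memcpy(e, &blank, sizeof(*e));
}
```

A new member then only needs to be added to the macro; the init function
picks it up without further changes.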

Patrick

Patrick Steinhardt (6):
  refs: get rid of global list of hidden refs
  revision: move together exclusion-related functions
  revision: introduce struct to handle exclusions
  revision: add new parameter to exclude hidden refs
  rev-parse: add `--exclude-hidden=` option
  receive-pack: only use visible refs for connectivity check

 Documentation/git-rev-parse.txt    |   7 ++
 Documentation/rev-list-options.txt |   7 ++
 builtin/receive-pack.c             |  10 ++-
 builtin/rev-list.c                 |   1 +
 builtin/rev-parse.c                |  12 ++-
 connected.c                        |   3 +
 connected.h                        |   7 ++
 ls-refs.c                          |  13 ++-
 refs.c                             |  16 ++--
 refs.h                             |   5 +-
 revision.c                         | 117 +++++++++++++++---------
 revision.h                         |  36 ++++++--
 t/t6018-rev-list-glob.sh           |  11 +++
 t/t6021-rev-list-exclude-hidden.sh | 137 +++++++++++++++++++++++++++++
 upload-pack.c                      |  30 ++++---
 15 files changed, 329 insertions(+), 83 deletions(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh

Range-diff against v3:
1:  3741f0a389 ! 1:  34afe30d60 refs: get rid of global list of hidden refs
    @@ refs.c: int parse_hide_refs_config(const char *var, const char *value, const cha
     -			CALLOC_ARRAY(hide_refs, 1);
     -			hide_refs->strdup_strings = 1;
     -		}
    - 		string_list_append(hide_refs, ref);
    +-		string_list_append(hide_refs, ref);
    ++		string_list_append_nodup(hide_refs, ref);
      	}
      	return 0;
      }
2:  a6dcc99ca9 = 2:  b4f21d0a80 revision: move together exclusion-related functions
3:  2a6a67df1d ! 3:  265b292ed5 revision: introduce struct to handle exclusions
    @@ Commit message
         of exclusion.
     
         Introduce a new `struct ref_exclusions` that encapsulates all the logic
    -    related to excluding refs.
    +    related to excluding refs and move the `struct string_list` that holds
    +    all wildmatch patterns of excluded refs into it. Rename functions that
    +    operate on this struct to match its name.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ revision.c: static void add_rev_cmdline_list(struct rev_info *revs,
     -		free(*ref_excludes_p);
     -	}
     -	*ref_excludes_p = NULL;
    -+	string_list_init_dup(&exclusions->excluded_refs);
    ++	struct ref_exclusions blank = REF_EXCLUSIONS_INIT;
    ++	memcpy(exclusions, &blank, sizeof(*exclusions));
      }
      
     -void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
    @@ revision.h: struct rev_cmdline_info {
     +	 */
     +	struct string_list excluded_refs;
     +};
    ++
    ++/**
    ++ * Initialize a `struct ref_exclusions` with a macro.
    ++ */
    ++#define REF_EXCLUSIONS_INIT { \
    ++	.excluded_refs = STRING_LIST_INIT_DUP, \
    ++}
     +
      struct oidset;
      struct topo_walk_info;
    @@ revision.h: void mark_trees_uninteresting_sparse(struct repository *r, struct oi
     -int ref_excluded(struct string_list *, const char *path);
     -void clear_ref_exclusion(struct string_list **);
     -void add_ref_exclusion(struct string_list **, const char *exclude);
    -+#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP }
    -+
     +int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
     +void init_ref_exclusions(struct ref_exclusions *);
     +void clear_ref_exclusions(struct ref_exclusions *);
4:  de7c1aa210 ! 4:  c7fa6698db revision: add new parameter to exclude hidden refs
    @@ Documentation/rev-list-options.txt: respectively, and they must begin with `refs
      or `--all`. If a trailing '/{asterisk}' is intended, it must be given
      explicitly.
      
    -+--exclude-hidden=[transfer|receive|uploadpack]::
    ++--exclude-hidden=[receive|uploadpack]::
     +	Do not include refs that have been hidden via either one of
    -+	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that
    -+	the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob` would
    -+	otherwise consider.  This option is cleared when seeing one of these
    -+	pseudo-refs.
    ++	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
    ++	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
    ++	would otherwise consider. This option is cleared when seeing one of
    ++	these pseudo-refs.
     +
      --reflog::
      	Pretend as if all objects mentioned by reflogs are listed on the
    @@ builtin/rev-list.c: static const char rev_list_usage[] =
      "    --tags\n"
      "    --remotes\n"
      "    --stdin\n"
    -+"    --exclude-hidden=[transfer|receive|uploadpack]\n"
    ++"    --exclude-hidden=[receive|uploadpack]\n"
      "    --quiet\n"
      "  ordering output:\n"
      "    --topo-order\n"
    @@ revision.c: static void add_rev_cmdline_list(struct rev_info *revs,
      	return 0;
      }
      
    - void init_ref_exclusions(struct ref_exclusions *exclusions)
    - {
    - 	string_list_init_dup(&exclusions->excluded_refs);
    -+	string_list_init_dup(&exclusions->hidden_refs);
    - }
    - 
    +@@ revision.c: void init_ref_exclusions(struct ref_exclusions *exclusions)
      void clear_ref_exclusions(struct ref_exclusions *exclusions)
      {
      	string_list_clear(&exclusions->excluded_refs, 0);
    -+	string_list_clear(&exclusions->hidden_refs, 1);
    ++	string_list_clear(&exclusions->hidden_refs, 0);
      }
      
      void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
    @@ revision.c: void add_ref_exclusion(struct ref_exclusions *exclusions, const char
     +{
     +	struct exclude_hidden_refs_cb cb;
     +
    -+	if (strcmp(section, "transfer") && strcmp(section, "receive") &&
    -+	    strcmp(section, "uploadpack"))
    ++	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
     +		die(_("unsupported section for hidden refs: %s"), section);
     +
     +	if (exclusions->hidden_refs.nr)
    @@ revision.h: struct ref_exclusions {
     +	struct string_list hidden_refs;
      };
      
    + /**
    +@@ revision.h: struct ref_exclusions {
    +  */
    + #define REF_EXCLUSIONS_INIT { \
    + 	.excluded_refs = STRING_LIST_INIT_DUP, \
    ++	.hidden_refs = STRING_LIST_INIT_DUP, \
    + }
    + 
      struct oidset;
     @@ revision.h: void show_object_with_name(FILE *, struct object *, const char *);
      /**
       * Helpers to check if a reference should be excluded.
       */
    --#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP }
    -+#define REF_EXCLUSIONS_INIT { .excluded_refs = STRING_LIST_INIT_DUP, .hidden_refs = STRING_LIST_INIT_DUP }
    - 
    ++
      int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
      void init_ref_exclusions(struct ref_exclusions *);
      void clear_ref_exclusions(struct ref_exclusions *);
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +
     +test_description='git rev-list --exclude-hidden test'
     +
    -+TEST_PASSES_SANITIZE_LEAK=true
     +. ./test-lib.sh
     +
     +test_expect_success 'setup' '
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +	test_cmp expected err
     +'
     +
    -+test_expect_success 'passed multiple times' '
    -+	echo "fatal: --exclude-hidden= passed more than once" >expected &&
    -+	test_must_fail git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=transfer --exclude-hidden=transfer 2>err &&
    -+	test_cmp expected err
    -+'
    -+
    -+test_expect_success '--exclude-hidden without hiddenRefs' '
    -+	git rev-list --exclude-hidden=transfer --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$NAMESPACE
    -+	$HIDDEN
    -+	$TAG
    -+	$COMMIT
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
    -+test_expect_success 'hidden via transfer.hideRefs' '
    -+	git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=transfer --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$NAMESPACE
    -+	$TAG
    -+	$COMMIT
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
    -+test_expect_success '--all --exclude-hidden=transfer --not --all without hidden refs' '
    -+	git rev-list --all --exclude-hidden=transfer --not --all >out &&
    -+	test_must_be_empty out
    -+'
    -+
    -+test_expect_success '--all --exclude-hidden=transfer --not --all with hidden ref' '
    -+	git -c transfer.hideRefs=refs/hidden/ rev-list --all --exclude-hidden=transfer --not --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$HIDDEN
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
    -+test_expect_success '--exclude-hidden with --exclude' '
    -+	git -c transfer.hideRefs=refs/hidden/ rev-list --exclude=refs/tags/* --exclude-hidden=transfer --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$NAMESPACE
    -+	$COMMIT
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
    -+test_expect_success '--exclude-hidden is reset' '
    -+	git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=transfer --all --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$NAMESPACE
    -+	$HIDDEN
    -+	$TAG
    -+	$COMMIT
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
    -+test_expect_success '--exclude-hidden operates on stripped refs by default' '
    -+	GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaced/ rev-list --exclude-hidden=transfer --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$HIDDEN
    -+	$TAG
    -+	$COMMIT
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
    -+test_expect_success '--exclude-hidden does not hide namespace by default' '
    -+	GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaces/namespace/ rev-list --exclude-hidden=transfer --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$NAMESPACE
    -+	$HIDDEN
    -+	$TAG
    -+	$COMMIT
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
    -+test_expect_success '--exclude-hidden= may operate on unstripped refs' '
    -+	GIT_NAMESPACE=namespace git -c transfer.hideRefs=^refs/namespaces/namespace/ rev-list --exclude-hidden=transfer --all >out &&
    -+	cat >expected <<-EOF &&
    -+	$HIDDEN
    -+	$TAG
    -+	$COMMIT
    -+	EOF
    -+	test_cmp expected out
    -+'
    -+
     +for section in receive uploadpack
     +do
    -+	test_expect_success "hidden via $section.hideRefs" '
    -+		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
    -+		cat >expected <<-EOF &&
    -+		$NAMESPACE
    -+		$TAG
    -+		$COMMIT
    -+		EOF
    -+		test_cmp expected out
    ++	test_expect_success "$section: passed multiple times" '
    ++		echo "fatal: --exclude-hidden= passed more than once" >expected &&
    ++		test_must_fail git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --exclude-hidden=$section 2>err &&
    ++		test_cmp expected err
     +	'
     +
    -+	test_expect_success "--exclude-hidden=$section respects transfer.hideRefs" '
    -+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
    -+		cat >expected <<-EOF &&
    -+		$NAMESPACE
    -+		$TAG
    -+		$COMMIT
    -+		EOF
    -+		test_cmp expected out
    -+	'
    -+
    -+	test_expect_success "--exclude-hidden=transfer ignores $section.hideRefs" '
    -+		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=transfer --all >out &&
    ++	test_expect_success "$section: without hiddenRefs" '
    ++		git rev-list --exclude-hidden=$section --all >out &&
     +		cat >expected <<-EOF &&
     +		$NAMESPACE
     +		$HIDDEN
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +		test_cmp expected out
     +	'
     +
    -+	test_expect_success "--exclude-hidden=$section respects both transfer.hideRefs and $section.hideRefs" '
    ++	test_expect_success "$section: hidden via transfer.hideRefs" '
    ++		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$NAMESPACE
    ++		$TAG
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
    ++	test_expect_success "$section: hidden via $section.hideRefs" '
    ++		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$NAMESPACE
    ++		$TAG
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
    ++	test_expect_success "$section: respects both transfer.hideRefs and $section.hideRefs" '
     +		git -c transfer.hideRefs=refs/tags/ -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
     +		cat >expected <<-EOF &&
     +		$NAMESPACE
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +		EOF
     +		test_cmp expected out
     +	'
    ++
    ++	test_expect_success "$section: negation without hidden refs marks everything as uninteresting" '
    ++		git rev-list --all --exclude-hidden=$section --not --all >out &&
    ++		test_must_be_empty out
    ++	'
    ++
    ++	test_expect_success "$section: negation with hidden refs marks them as interesting" '
    ++		git -c transfer.hideRefs=refs/hidden/ rev-list --all --exclude-hidden=$section --not --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$HIDDEN
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
    ++	test_expect_success "$section: hidden refs and excludes work together" '
    ++		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude=refs/tags/* --exclude-hidden=$section --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$NAMESPACE
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
    ++	test_expect_success "$section: excluded hidden refs get reset" '
    ++		git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$NAMESPACE
    ++		$HIDDEN
    ++		$TAG
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
    ++	test_expect_success "$section: operates on stripped refs by default" '
    ++		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaced/ rev-list --exclude-hidden=$section --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$HIDDEN
    ++		$TAG
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
    ++	test_expect_success "$section: does not hide namespace by default" '
    ++		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$NAMESPACE
    ++		$HIDDEN
    ++		$TAG
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
    ++	test_expect_success "$section: can operate on unstripped refs" '
    ++		GIT_NAMESPACE=namespace git -c transfer.hideRefs=^refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
    ++		cat >expected <<-EOF &&
    ++		$HIDDEN
    ++		$TAG
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
     +done
     +
     +test_done
5:  68a5e56304 ! 5:  79c5c64a80 revparse: add `--exclude-hidden=` option
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    revparse: add `--exclude-hidden=` option
    +    rev-parse: add `--exclude-hidden=` option
     
         Add a new `--exclude-hidden=` option that is similar to the one we just
    -    added to git-rev-list(1). Given a seciton name `transfer`, `uploadpack`
    -    or `receive` as argument, it causes us to exclude all references that
    -    would be hidden by the respective `$seciton.hideRefs` configuration.
    +    added to git-rev-list(1). Given a seciton name `uploadpack` or `receive`
    +    as argument, it causes us to exclude all references that would be hidden
    +    by the respective `$section.hideRefs` configuration.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ Documentation/git-rev-parse.txt: respectively, and they must begin with `refs/`
      or `--all`. If a trailing '/{asterisk}' is intended, it must be given
      explicitly.
      
    -+--exclude-hidden=[transfer|receive|uploadpack]::
    ++--exclude-hidden=[receive|uploadpack]::
     +	Do not include refs that have been hidden via either one of
    -+	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that
    -+	the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob` would
    -+	otherwise consider.  This option is cleared when seeing one of these
    -+	pseudo-refs.
    ++	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
    ++	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
    ++	would otherwise consider. This option is cleared when seeing one of
    ++	these pseudo-refs.
     +
      --disambiguate=<prefix>::
      	Show every object whose name begins with the given prefix.
    @@ t/t6018-rev-list-glob.sh: test_expect_success 'rev-parse --exclude=ref with --re
      	compare rev-parse "--exclude=upstream/x --remotes=upstream/*" "upstream/one upstream/two"
      '
      
    -+test_expect_success 'rev-parse --exclude-hidden= with --all' '
    -+	compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--exclude-hidden=transfer --all" "--branches --tags"
    -+'
    ++for section in receive uploadpack
    ++do
    ++	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
    ++		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--exclude-hidden=$section --all" "--branches --tags"
    ++	'
     +
    -+test_expect_success 'rev-parse --exclude-hidden= with --all' '
    -+	compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude-hidden=transfer --all" "--exclude=refs/heads/subspace/* --all"
    -+'
    ++	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
    ++		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude-hidden=$section --all" "--exclude=refs/heads/subspace/* --all"
    ++	'
    ++done
     +
      test_expect_success 'rev-list --exclude=glob with --branches=glob' '
      	compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one subspace/two"
6:  9d15449559 = 6:  39b4741734 receive-pack: only use visible refs for connectivity check
-- 
2.38.1


* [PATCH v4 1/6] refs: get rid of global list of hidden refs
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
@ 2022-11-08 10:03   ` Patrick Steinhardt
  2022-11-08 13:36     ` Ævar Arnfjörð Bjarmason
  2022-11-08 14:51     ` Jeff King
  2022-11-08 10:03   ` [PATCH v4 2/6] revision: move together exclusion-related functions Patrick Steinhardt
                     ` (4 subsequent siblings)
  5 siblings, 2 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 10:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 12546 bytes --]

We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.

Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.
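
As a rough illustration of what passing the list explicitly enables, the
following standalone sketch checks a ref against a caller-supplied set of
patterns. It is a simplification under stated assumptions and not the
code in refs.c: `toy_ref_is_hidden()` is a hypothetical name, patterns are
a plain array instead of a `struct string_list`, and the handling of a
leading `!` (un-hide) and `^` (match the full, unstripped name) follows
my reading of the documented `transfer.hideRefs` behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the ref is hidden by `patterns`, 0 otherwise. Patterns are
 * checked from last to first so that later entries override earlier
 * ones. `refname` may be NULL when the ref lies outside the namespace.
 */
int toy_ref_is_hidden(const char *refname, const char *refname_full,
		      const char *const *patterns, size_t nr)
{
	size_t i;

	assert(refname_full);
	for (i = nr; i-- > 0; ) {
		const char *match = patterns[i];
		const char *subject;
		int neg = 0;
		size_t len;

		if (*match == '!') {
			neg = 1;
			match++;
		}
		if (*match == '^') {
			subject = refname_full;
			match++;
		} else {
			subject = refname;
		}
		if (!subject)
			continue;

		/* Prefix match that must end at a path component boundary. */
		len = strlen(match);
		if (!strncmp(subject, match, len) &&
		    (subject[len] == '\0' || subject[len] == '/'))
			return !neg;
	}
	return 0;
}
```

Since the function no longer consults global state, two callers can hold
different pattern sets in the same process, which is what the later
patches in this series rely on.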

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c |  8 +++++---
 ls-refs.c              | 13 +++++++++----
 refs.c                 | 16 +++++-----------
 refs.h                 |  5 +++--
 upload-pack.c          | 30 ++++++++++++++++++------------
 5 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 44bcea3a5b..1f3efc58fb 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -80,6 +80,7 @@ static struct object_id push_cert_oid;
 static struct signature_check sigcheck;
 static const char *push_cert_nonce;
 static const char *cert_nonce_seed;
+static struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 
 static const char *NONCE_UNSOLICITED = "UNSOLICITED";
 static const char *NONCE_BAD = "BAD";
@@ -130,7 +131,7 @@ static enum deny_action parse_deny_action(const char *var, const char *value)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
-	int status = parse_hide_refs_config(var, value, "receive");
+	int status = parse_hide_refs_config(var, value, "receive", &hidden_refs);
 
 	if (status)
 		return status;
@@ -296,7 +297,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	struct oidset *seen = data;
 	const char *path = strip_namespace(path_full);
 
-	if (ref_is_hidden(path, path_full))
+	if (ref_is_hidden(path, path_full, &hidden_refs))
 		return 0;
 
 	/*
@@ -1794,7 +1795,7 @@ static void reject_updates_to_hidden(struct command *commands)
 		strbuf_setlen(&refname_full, prefix_len);
 		strbuf_addstr(&refname_full, cmd->ref_name);
 
-		if (!ref_is_hidden(cmd->ref_name, refname_full.buf))
+		if (!ref_is_hidden(cmd->ref_name, refname_full.buf, &hidden_refs))
 			continue;
 		if (is_null_oid(&cmd->new_oid))
 			cmd->error_string = "deny deleting a hidden ref";
@@ -2591,6 +2592,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		packet_flush(1);
 	oid_array_clear(&shallow);
 	oid_array_clear(&ref);
+	string_list_clear(&hidden_refs, 1);
 	free((void *)push_cert_nonce);
 	return 0;
 }
diff --git a/ls-refs.c b/ls-refs.c
index fa0d01b47c..ae89f850e9 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -6,6 +6,7 @@
 #include "ls-refs.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "string-list.h"
 
 static int config_read;
 static int advertise_unborn;
@@ -73,6 +74,7 @@ struct ls_refs_data {
 	unsigned symrefs;
 	struct strvec prefixes;
 	struct strbuf buf;
+	struct string_list hidden_refs;
 	unsigned unborn : 1;
 };
 
@@ -84,7 +86,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 
 	strbuf_reset(&data->buf);
 
-	if (ref_is_hidden(refname_nons, refname))
+	if (ref_is_hidden(refname_nons, refname, &data->hidden_refs))
 		return 0;
 
 	if (!ref_match(&data->prefixes, refname_nons))
@@ -137,14 +139,15 @@ static void send_possibly_unborn_head(struct ls_refs_data *data)
 }
 
 static int ls_refs_config(const char *var, const char *value,
-			  void *data UNUSED)
+			  void *cb_data)
 {
+	struct ls_refs_data *data = cb_data;
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
 	 * don't yet know how that information will be passed to ls-refs.
 	 */
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 int ls_refs(struct repository *r, struct packet_reader *request)
@@ -154,9 +157,10 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	memset(&data, 0, sizeof(data));
 	strvec_init(&data.prefixes);
 	strbuf_init(&data.buf, 0);
+	string_list_init_dup(&data.hidden_refs);
 
 	ensure_config_read();
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -195,6 +199,7 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	packet_fflush(stdout);
 	strvec_clear(&data.prefixes);
 	strbuf_release(&data.buf);
+	string_list_clear(&data.hidden_refs, 1);
 	return 0;
 }
 
diff --git a/refs.c b/refs.c
index 1491ae937e..2c7e88b190 100644
--- a/refs.c
+++ b/refs.c
@@ -1414,9 +1414,8 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 					    refname, strict);
 }
 
-static struct string_list *hide_refs;
-
-int parse_hide_refs_config(const char *var, const char *value, const char *section)
+int parse_hide_refs_config(const char *var, const char *value, const char *section,
+			   struct string_list *hide_refs)
 {
 	const char *key;
 	if (!strcmp("transfer.hiderefs", var) ||
@@ -1431,21 +1430,16 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 		len = strlen(ref);
 		while (len && ref[len - 1] == '/')
 			ref[--len] = '\0';
-		if (!hide_refs) {
-			CALLOC_ARRAY(hide_refs, 1);
-			hide_refs->strdup_strings = 1;
-		}
-		string_list_append(hide_refs, ref);
+		string_list_append_nodup(hide_refs, ref);
 	}
 	return 0;
 }
 
-int ref_is_hidden(const char *refname, const char *refname_full)
+int ref_is_hidden(const char *refname, const char *refname_full,
+		  const struct string_list *hide_refs)
 {
 	int i;
 
-	if (!hide_refs)
-		return 0;
 	for (i = hide_refs->nr - 1; i >= 0; i--) {
 		const char *match = hide_refs->items[i].string;
 		const char *subject;
diff --git a/refs.h b/refs.h
index 8958717a17..3266fd8f57 100644
--- a/refs.h
+++ b/refs.h
@@ -808,7 +808,8 @@ int update_ref(const char *msg, const char *refname,
 	       const struct object_id *new_oid, const struct object_id *old_oid,
 	       unsigned int flags, enum action_on_err onerr);
 
-int parse_hide_refs_config(const char *var, const char *value, const char *);
+int parse_hide_refs_config(const char *var, const char *value, const char *,
+			   struct string_list *);
 
 /*
  * Check whether a ref is hidden. If no namespace is set, both the first and
@@ -818,7 +819,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *);
  * the ref is outside that namespace, the first parameter is NULL. The second
  * parameter always points to the full ref name.
  */
-int ref_is_hidden(const char *, const char *);
+int ref_is_hidden(const char *, const char *, const struct string_list *);
 
 /* Is this a per-worktree ref living in the refs/ namespace? */
 int is_per_worktree_ref(const char *refname);
diff --git a/upload-pack.c b/upload-pack.c
index 0b8311bd68..9db17f8787 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -62,6 +62,7 @@ struct upload_pack_data {
 	struct object_array have_obj;
 	struct oid_array haves;					/* v2 only */
 	struct string_list wanted_refs;				/* v2 only */
+	struct string_list hidden_refs;
 
 	struct object_array shallows;
 	struct string_list deepen_not;
@@ -118,6 +119,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct string_list symref = STRING_LIST_INIT_DUP;
 	struct string_list wanted_refs = STRING_LIST_INIT_DUP;
+	struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 	struct object_array want_obj = OBJECT_ARRAY_INIT;
 	struct object_array have_obj = OBJECT_ARRAY_INIT;
 	struct oid_array haves = OID_ARRAY_INIT;
@@ -130,6 +132,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 	memset(data, 0, sizeof(*data));
 	data->symref = symref;
 	data->wanted_refs = wanted_refs;
+	data->hidden_refs = hidden_refs;
 	data->want_obj = want_obj;
 	data->have_obj = have_obj;
 	data->haves = haves;
@@ -151,6 +154,7 @@ static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	string_list_clear(&data->symref, 1);
 	string_list_clear(&data->wanted_refs, 1);
+	string_list_clear(&data->hidden_refs, 1);
 	object_array_clear(&data->want_obj);
 	object_array_clear(&data->have_obj);
 	oid_array_clear(&data->haves);
@@ -842,8 +846,8 @@ static void deepen(struct upload_pack_data *data, int depth)
 		 * Checking for reachable shallows requires that our refs be
 		 * marked with OUR_REF.
 		 */
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, data);
+		for_each_namespaced_ref(check_ref, data);
 
 		get_reachable_list(data, &reachable_shallows);
 		result = get_shallow_commits(&reachable_shallows,
@@ -1158,11 +1162,11 @@ static void receive_needs(struct upload_pack_data *data,
 
 /* return non-zero if the ref is hidden, otherwise 0 */
 static int mark_our_ref(const char *refname, const char *refname_full,
-			const struct object_id *oid)
+			const struct object_id *oid, const struct string_list *hidden_refs)
 {
 	struct object *o = lookup_unknown_object(the_repository, oid);
 
-	if (ref_is_hidden(refname, refname_full)) {
+	if (ref_is_hidden(refname, refname_full, hidden_refs)) {
 		o->flags |= HIDDEN_REF;
 		return 1;
 	}
@@ -1171,11 +1175,12 @@ static int mark_our_ref(const char *refname, const char *refname_full,
 }
 
 static int check_ref(const char *refname_full, const struct object_id *oid,
-		     int flag UNUSED, void *cb_data UNUSED)
+		     int flag UNUSED, void *cb_data)
 {
 	const char *refname = strip_namespace(refname_full);
+	struct upload_pack_data *data = cb_data;
 
-	mark_our_ref(refname, refname_full, oid);
+	mark_our_ref(refname, refname_full, oid, &data->hidden_refs);
 	return 0;
 }
 
@@ -1204,7 +1209,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	struct object_id peeled;
 	struct upload_pack_data *data = cb_data;
 
-	if (mark_our_ref(refname_nons, refname, oid))
+	if (mark_our_ref(refname_nons, refname, oid, &data->hidden_refs))
 		return 0;
 
 	if (capabilities) {
@@ -1327,7 +1332,7 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
@@ -1375,8 +1380,8 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 		advertise_shallow_grafts(1);
 		packet_flush(1);
 	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, &data);
+		for_each_namespaced_ref(check_ref, &data);
 	}
 
 	if (!advertise_refs) {
@@ -1441,6 +1446,7 @@ static int parse_want(struct packet_writer *writer, const char *line,
 
 static int parse_want_ref(struct packet_writer *writer, const char *line,
 			  struct string_list *wanted_refs,
+			  struct string_list *hidden_refs,
 			  struct object_array *want_obj)
 {
 	const char *refname_nons;
@@ -1451,7 +1457,7 @@ static int parse_want_ref(struct packet_writer *writer, const char *line,
 		struct strbuf refname = STRBUF_INIT;
 
 		strbuf_addf(&refname, "%s%s", get_git_namespace(), refname_nons);
-		if (ref_is_hidden(refname_nons, refname.buf) ||
+		if (ref_is_hidden(refname_nons, refname.buf, hidden_refs) ||
 		    read_ref(refname.buf, &oid)) {
 			packet_writer_error(writer, "unknown ref %s", refname_nons);
 			die("unknown ref %s", refname_nons);
@@ -1508,7 +1514,7 @@ static void process_args(struct packet_reader *request,
 			continue;
 		if (data->allow_ref_in_want &&
 		    parse_want_ref(&data->writer, arg, &data->wanted_refs,
-				   &data->want_obj))
+				   &data->hidden_refs, &data->want_obj))
 			continue;
 		/* process have line */
 		if (parse_have(arg, &data->haves))
-- 
2.38.1
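
The hunks above stop relying on a file-global list of hidden refs and
instead hand a caller-owned list to the ref callbacks via `cb_data`. A
minimal, self-contained sketch of that pattern follows. The names
`struct server_state`, `for_each_demo_ref` and `count_visible` are
hypothetical and not part of Git, and the plain prefix test is a
simplification of what `ref_is_hidden()` does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-request state, loosely modelled on upload_pack_data. */
struct server_state {
	const char **hidden_prefixes;	/* caller-owned list, no global */
	size_t nr_hidden;
	size_t nr_visible;		/* filled in by the callback */
};

typedef int (*demo_ref_fn)(const char *refname, void *cb_data);

/* Simplified: a ref counts as hidden if it starts with any listed prefix. */
static int demo_prefix_hidden(const char *refname,
			      const char **prefixes, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strncmp(refname, prefixes[i], strlen(prefixes[i])))
			return 1;
	return 0;
}

/* Callback: reads its configuration from cb_data instead of a global. */
static int count_visible(const char *refname, void *cb_data)
{
	struct server_state *state = cb_data;

	if (!demo_prefix_hidden(refname, state->hidden_prefixes,
				state->nr_hidden))
		state->nr_visible++;
	return 0;
}

/* Hypothetical iterator; stops at the first nonzero callback return. */
static int for_each_demo_ref(const char **refs, size_t nr,
			     demo_ref_fn fn, void *cb_data)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int ret = fn(refs[i], cb_data);
		if (ret)
			return ret;
	}
	return 0;
}
```

Because the list travels with the state struct, two callers with
different hide-refs configuration can coexist in one process, which a
single global list would not allow.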


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH v4 2/6] revision: move together exclusion-related functions
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
  2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
@ 2022-11-08 10:03   ` Patrick Steinhardt
  2022-11-08 10:03   ` [PATCH v4 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 10:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2402 bytes --]

Move the definitions of functions that handle exclusions of refs
together so that related functionality sits in a single place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 revision.c | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/revision.c b/revision.c
index 0760e78936..be755670e2 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,14 +1517,6 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-struct all_refs_cb {
-	int all_flags;
-	int warned_bad_reflog;
-	struct rev_info *all_revs;
-	const char *name_for_errormsg;
-	struct worktree *wt;
-};
-
 int ref_excluded(struct string_list *ref_excludes, const char *path)
 {
 	struct string_list_item *item;
@@ -1538,6 +1530,32 @@ int ref_excluded(struct string_list *ref_excludes, const char *path)
 	return 0;
 }
 
+void clear_ref_exclusion(struct string_list **ref_excludes_p)
+{
+	if (*ref_excludes_p) {
+		string_list_clear(*ref_excludes_p, 0);
+		free(*ref_excludes_p);
+	}
+	*ref_excludes_p = NULL;
+}
+
+void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+{
+	if (!*ref_excludes_p) {
+		CALLOC_ARRAY(*ref_excludes_p, 1);
+		(*ref_excludes_p)->strdup_strings = 1;
+	}
+	string_list_append(*ref_excludes_p, exclude);
+}
+
+struct all_refs_cb {
+	int all_flags;
+	int warned_bad_reflog;
+	struct rev_info *all_revs;
+	const char *name_for_errormsg;
+	struct worktree *wt;
+};
+
 static int handle_one_ref(const char *path, const struct object_id *oid,
 			  int flag UNUSED,
 			  void *cb_data)
@@ -1563,24 +1581,6 @@ static void init_all_refs_cb(struct all_refs_cb *cb, struct rev_info *revs,
 	cb->wt = NULL;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
-{
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
-}
-
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
-{
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
-}
-
 static void handle_refs(struct ref_store *refs,
 			struct rev_info *revs, unsigned flags,
 			int (*for_each)(struct ref_store *, each_ref_fn, void *))
-- 
2.38.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH v4 3/6] revision: introduce struct to handle exclusions
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
  2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
  2022-11-08 10:03   ` [PATCH v4 2/6] revision: move together exclusion-related functions Patrick Steinhardt
@ 2022-11-08 10:03   ` Patrick Steinhardt
  2022-11-08 10:03   ` [PATCH v4 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 10:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 8642 bytes --]

The functions that handle exclusion of refs work on a single string
list. We're about to add a second mechanism for excluding refs though,
and it makes sense to reuse much of the same architecture for both kinds
of exclusion.

Introduce a new `struct ref_exclusions` that encapsulates all the logic
related to excluding refs and move the `struct string_list` that holds
all wildmatch patterns of excluded refs into it. Rename functions that
operate on this struct to match its name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/rev-parse.c |  8 ++++----
 revision.c          | 48 +++++++++++++++++++++------------------------
 revision.h          | 27 +++++++++++++++++++------
 3 files changed, 47 insertions(+), 36 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 8f61050bde..7fa5b6991b 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -39,7 +39,7 @@ static int abbrev_ref_strict;
 static int output_sq;
 
 static int stuck_long;
-static struct string_list *ref_excludes;
+static struct ref_exclusions ref_excludes = REF_EXCLUSIONS_INIT;
 
 /*
  * Some arguments are relevant "revision" arguments,
@@ -198,7 +198,7 @@ static int show_default(void)
 static int show_reference(const char *refname, const struct object_id *oid,
 			  int flag UNUSED, void *cb_data UNUSED)
 {
-	if (ref_excluded(ref_excludes, refname))
+	if (ref_excluded(&ref_excludes, refname))
 		return 0;
 	show_rev(NORMAL, oid, refname);
 	return 0;
@@ -585,7 +585,7 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
 		for_each_glob_ref_in(show_reference, pattern, prefix, NULL);
 	else
 		for_each_ref_in(prefix, show_reference, NULL);
-	clear_ref_exclusion(&ref_excludes);
+	clear_ref_exclusions(&ref_excludes);
 }
 
 enum format_type {
@@ -863,7 +863,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "--all")) {
 				for_each_ref(show_reference, NULL);
-				clear_ref_exclusion(&ref_excludes);
+				clear_ref_exclusions(&ref_excludes);
 				continue;
 			}
 			if (skip_prefix(arg, "--disambiguate=", &arg)) {
diff --git a/revision.c b/revision.c
index be755670e2..fe3ec98f46 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,35 +1517,30 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-int ref_excluded(struct string_list *ref_excludes, const char *path)
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
 	struct string_list_item *item;
-
-	if (!ref_excludes)
-		return 0;
-	for_each_string_list_item(item, ref_excludes) {
+	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
 	return 0;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
+void init_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
+	struct ref_exclusions blank = REF_EXCLUSIONS_INIT;
+	memcpy(exclusions, &blank, sizeof(*exclusions));
 }
 
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
+	string_list_clear(&exclusions->excluded_refs, 0);
+}
+
+void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
+{
+	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
 struct all_refs_cb {
@@ -1563,7 +1558,7 @@ static int handle_one_ref(const char *path, const struct object_id *oid,
 	struct all_refs_cb *cb = cb_data;
 	struct object *object;
 
-	if (ref_excluded(cb->all_revs->ref_excludes, path))
+	if (ref_excluded(&cb->all_revs->ref_excludes, path))
 	    return 0;
 
 	object = get_reference(cb->all_revs, path, oid, cb->all_flags);
@@ -1901,6 +1896,7 @@ void repo_init_revisions(struct repository *r,
 
 	init_display_notes(&revs->notes_opt);
 	list_objects_filter_init(&revs->filter);
+	init_ref_exclusions(&revs->ref_excludes);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -2689,10 +2685,10 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 			init_all_refs_cb(&cb, revs, *flags);
 			other_head_refs(handle_one_ref, &cb);
 		}
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--branches")) {
 		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--bisect")) {
 		read_bisect_terms(&term_bad, &term_good);
 		handle_refs(refs, revs, *flags, for_each_bad_bisect_ref);
@@ -2701,15 +2697,15 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		revs->bisect = 1;
 	} else if (!strcmp(arg, "--tags")) {
 		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--remotes")) {
 		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref(handle_one_ref, optarg, &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 		return argcount;
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
@@ -2718,17 +2714,17 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--tags=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--reflog")) {
 		add_reflogs_to_pending(revs, *flags);
 	} else if (!strcmp(arg, "--indexed-objects")) {
diff --git a/revision.h b/revision.h
index afe1b77985..5c8ab16047 100644
--- a/revision.h
+++ b/revision.h
@@ -81,6 +81,21 @@ struct rev_cmdline_info {
 	} *rev;
 };
 
+struct ref_exclusions {
+	/*
+	 * Excluded refs is a list of wildmatch patterns. If any of the
+	 * patterns matches, the reference will be excluded.
+	 */
+	struct string_list excluded_refs;
+};
+
+/**
+ * Initialize a `struct ref_exclusions` with a macro.
+ */
+#define REF_EXCLUSIONS_INIT { \
+	.excluded_refs = STRING_LIST_INIT_DUP, \
+}
+
 struct oidset;
 struct topo_walk_info;
 
@@ -103,7 +118,7 @@ struct rev_info {
 	struct list_objects_filter_options filter;
 
 	/* excluding from --branches, --refs, etc. expansion */
-	struct string_list *ref_excludes;
+	struct ref_exclusions ref_excludes;
 
 	/* Basic information */
 	const char *prefix;
@@ -439,12 +454,12 @@ void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees)
 void show_object_with_name(FILE *, struct object *, const char *);
 
 /**
- * Helpers to check if a "struct string_list" item matches with
- * wildmatch().
+ * Helpers to check if a reference should be excluded.
  */
-int ref_excluded(struct string_list *, const char *path);
-void clear_ref_exclusion(struct string_list **);
-void add_ref_exclusion(struct string_list **, const char *exclude);
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
+void init_ref_exclusions(struct ref_exclusions *);
+void clear_ref_exclusions(struct ref_exclusions *);
+void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
 
 /**
  * This function can be used if you want to add commit objects as revision
-- 
2.38.1
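
To illustrate the shape of the new interface, here is a minimal sketch
of a value-type exclusion list with init, add, match and clear helpers.
It is not Git's implementation: all `demo_` names are hypothetical, a
plain growable array stands in for `struct string_list`, and POSIX
fnmatch(3) with flags 0 stands in for `wildmatch()`.

```c
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct ref_exclusions. */
struct demo_ref_exclusions {
	char **patterns;	/* owned copies of the patterns */
	size_t nr, alloc;
};

#define DEMO_REF_EXCLUSIONS_INIT { NULL, 0, 0 }

/* Appends a private copy of pattern to the list. */
static void demo_add_ref_exclusion(struct demo_ref_exclusions *ex,
				   const char *pattern)
{
	size_t len = strlen(pattern) + 1;
	char *copy;

	if (ex->nr == ex->alloc) {
		size_t alloc = ex->alloc ? 2 * ex->alloc : 4;
		char **grown = realloc(ex->patterns, alloc * sizeof(*grown));

		if (!grown)
			abort();
		ex->patterns = grown;
		ex->alloc = alloc;
	}
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, pattern, len);
	ex->patterns[ex->nr++] = copy;
}

/* Returns 1 if any pattern matches path, 0 otherwise. */
static int demo_ref_excluded(const struct demo_ref_exclusions *ex,
			     const char *path)
{
	size_t i;

	for (i = 0; i < ex->nr; i++)
		if (!fnmatch(ex->patterns[i], path, 0))
			return 1;
	return 0;
}

/* Frees all patterns and returns the struct to its initial state. */
static void demo_clear_ref_exclusions(struct demo_ref_exclusions *ex)
{
	size_t i;

	for (i = 0; i < ex->nr; i++)
		free(ex->patterns[i]);
	free(ex->patterns);
	ex->patterns = NULL;
	ex->nr = ex->alloc = 0;
}
```

Since an empty list is a valid state of the struct, the match helper
needs no NULL check, which mirrors the removal of the
`if (!ref_excludes)` test in the patch.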


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH v4 4/6] revision: add new parameter to exclude hidden refs
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2022-11-08 10:03   ` [PATCH v4 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
@ 2022-11-08 10:03   ` Patrick Steinhardt
  2022-11-08 15:07     ` Jeff King
  2022-11-08 10:03   ` [PATCH v4 5/6] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
  2022-11-08 10:04   ` [PATCH v4 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
  5 siblings, 1 reply; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 10:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 10885 bytes --]

Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs` configuration,
but there is currently no easy way to obtain the list of all visible or
hidden refs. We will need exactly that for a performance improvement in
our connectivity check.

Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--branches`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/rev-list-options.txt |   7 ++
 builtin/rev-list.c                 |   1 +
 revision.c                         |  41 ++++++++-
 revision.h                         |   9 ++
 t/t6021-rev-list-exclude-hidden.sh | 137 +++++++++++++++++++++++++++++
 5 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 1837509566..5b46781b35 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[receive|uploadpack]::
+	Do not include refs that have been hidden via either one of
+	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
+	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
+	would otherwise consider. This option is cleared when seeing one of
+	these pseudo-refs.
+
 --reflog::
 	Pretend as if all objects mentioned by reflogs are listed on the
 	command line as `<commit>`.
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 3acd93f71e..d42db0b0cc 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -38,6 +38,7 @@ static const char rev_list_usage[] =
 "    --tags\n"
 "    --remotes\n"
 "    --stdin\n"
+"    --exclude-hidden=[receive|uploadpack]\n"
 "    --quiet\n"
 "  ordering output:\n"
 "    --topo-order\n"
diff --git a/revision.c b/revision.c
index fe3ec98f46..b726cfd255 100644
--- a/revision.c
+++ b/revision.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "config.h"
 #include "object-store.h"
 #include "tag.h"
 #include "blob.h"
@@ -1519,11 +1520,17 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
+	const char *stripped_path = strip_namespace(path);
 	struct string_list_item *item;
+
 	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
+
+	if (ref_is_hidden(stripped_path, path, &exclusions->hidden_refs))
+		return 1;
+
 	return 0;
 }
 
@@ -1536,6 +1543,7 @@ void init_ref_exclusions(struct ref_exclusions *exclusions)
 void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
 	string_list_clear(&exclusions->excluded_refs, 0);
+	string_list_clear(&exclusions->hidden_refs, 0);
 }
 
 void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
@@ -1543,6 +1551,34 @@ void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
 	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
+struct exclude_hidden_refs_cb {
+	struct ref_exclusions *exclusions;
+	const char *section;
+};
+
+static int hide_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct exclude_hidden_refs_cb *cb = cb_data;
+	return parse_hide_refs_config(var, value, cb->section,
+				      &cb->exclusions->hidden_refs);
+}
+
+void exclude_hidden_refs(struct ref_exclusions *exclusions, const char *section)
+{
+	struct exclude_hidden_refs_cb cb;
+
+	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
+		die(_("unsupported section for hidden refs: %s"), section);
+
+	if (exclusions->hidden_refs.nr)
+		die(_("--exclude-hidden= passed more than once"));
+
+	cb.exclusions = exclusions;
+	cb.section = section;
+
+	git_config(hide_refs_config, &cb);
+}
+
 struct all_refs_cb {
 	int all_flags;
 	int warned_bad_reflog;
@@ -2221,7 +2257,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	    !strcmp(arg, "--bisect") || starts_with(arg, "--glob=") ||
 	    !strcmp(arg, "--indexed-objects") ||
 	    !strcmp(arg, "--alternate-refs") ||
-	    starts_with(arg, "--exclude=") ||
+	    starts_with(arg, "--exclude=") || starts_with(arg, "--exclude-hidden=") ||
 	    starts_with(arg, "--branches=") || starts_with(arg, "--tags=") ||
 	    starts_with(arg, "--remotes=") || starts_with(arg, "--no-walk="))
 	{
@@ -2710,6 +2746,9 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("exclude-hidden", argv, &optarg))) {
+		exclude_hidden_refs(&revs->ref_excludes, optarg);
+		return argcount;
 	} else if (skip_prefix(arg, "--branches=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
diff --git a/revision.h b/revision.h
index 5c8ab16047..a96fefebf1 100644
--- a/revision.h
+++ b/revision.h
@@ -87,6 +87,12 @@ struct ref_exclusions {
 	 * patterns matches, the reference will be excluded.
 	 */
 	struct string_list excluded_refs;
+
+	/*
+	 * Hidden refs is a list of patterns that is to be hidden via
+	 * `ref_is_hidden()`.
+	 */
+	struct string_list hidden_refs;
 };
 
 /**
@@ -94,6 +100,7 @@ struct ref_exclusions {
  */
 #define REF_EXCLUSIONS_INIT { \
 	.excluded_refs = STRING_LIST_INIT_DUP, \
+	.hidden_refs = STRING_LIST_INIT_DUP, \
 }
 
 struct oidset;
@@ -456,10 +463,12 @@ void show_object_with_name(FILE *, struct object *, const char *);
 /**
  * Helpers to check if a reference should be excluded.
  */
+
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
 void init_ref_exclusions(struct ref_exclusions *);
 void clear_ref_exclusions(struct ref_exclusions *);
 void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
+void exclude_hidden_refs(struct ref_exclusions *, const char *section);
 
 /**
  * This function can be used if you want to add commit objects as revision
diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
new file mode 100755
index 0000000000..4ab50e7f4f
--- /dev/null
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -0,0 +1,137 @@
+#!/bin/sh
+
+test_description='git rev-list --exclude-hidden test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit_bulk --id=commit --ref=refs/heads/main 1 &&
+	COMMIT=$(git rev-parse refs/heads/main) &&
+	test_commit_bulk --id=tag --ref=refs/tags/lightweight 1 &&
+	TAG=$(git rev-parse refs/tags/lightweight) &&
+	test_commit_bulk --id=hidden --ref=refs/hidden/commit 1 &&
+	HIDDEN=$(git rev-parse refs/hidden/commit) &&
+	test_commit_bulk --id=namespace --ref=refs/namespaces/namespace/refs/namespaced/commit 1 &&
+	NAMESPACE=$(git rev-parse refs/namespaces/namespace/refs/namespaced/commit)
+'
+
+test_expect_success 'invalid section' '
+	echo "fatal: unsupported section for hidden refs: unsupported" >expected &&
+	test_must_fail git rev-list --exclude-hidden=unsupported 2>err &&
+	test_cmp expected err
+'
+
+for section in receive uploadpack
+do
+	test_expect_success "$section: passed multiple times" '
+		echo "fatal: --exclude-hidden= passed more than once" >expected &&
+		test_must_fail git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --exclude-hidden=$section 2>err &&
+		test_cmp expected err
+	'
+
+	test_expect_success "$section: without hiddenRefs" '
+		git rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden via transfer.hideRefs" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden via $section.hideRefs" '
+		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: respects both transfer.hideRefs and $section.hideRefs" '
+		git -c transfer.hideRefs=refs/tags/ -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: negation without hidden refs marks everything as uninteresting" '
+		git rev-list --all --exclude-hidden=$section --not --all >out &&
+		test_must_be_empty out
+	'
+
+	test_expect_success "$section: negation with hidden refs marks them as interesting" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --all --exclude-hidden=$section --not --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden refs and excludes work together" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude=refs/tags/* --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: excluded hidden refs get reset" '
+		git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: operates on stripped refs by default" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaced/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: does not hide namespace by default" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: can operate on unstripped refs" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=^refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+done
+
+test_done
-- 
2.38.1
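
The namespace tests above depend on which name a hide pattern is
compared against: the namespace-stripped name by default, or the full
name when the pattern starts with `^`. The following is a simplified
model of that rule, not Git's `ref_is_hidden()`. All `demo_` names are
hypothetical, `!` negation and multiple patterns are left out, and the
trailing-slash handling and the whole-component boundary check are
assumptions about the real behavior.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of hidden-ref matching for a single pattern.
 * `stripped` is the ref name with the namespace prefix removed (NULL if
 * the ref lies outside the namespace), `full` is the complete ref name.
 */
static int demo_ref_is_hidden(const char *stripped, const char *full,
			      const char *pattern)
{
	const char *subject = stripped;
	size_t len;

	if (*pattern == '^') {		/* compare against the full name */
		subject = full;
		pattern++;
	}
	if (!subject)
		return 0;

	len = strlen(pattern);
	while (len && pattern[len - 1] == '/')	/* ignore trailing slashes */
		len--;

	if (strncmp(subject, pattern, len))
		return 0;
	/* Only match whole path components. */
	return subject[len] == '\0' || subject[len] == '/';
}

/* Hypothetical helper: returns the name below ns_prefix, or NULL. */
static const char *demo_strip_namespace(const char *full,
					const char *ns_prefix)
{
	size_t len = strlen(ns_prefix);

	if (strncmp(full, ns_prefix, len))
		return NULL;
	return full + len;
}
```

With the namespace prefix `refs/namespaces/namespace/`, the pattern
`refs/namespaced/` hides `refs/namespaced/commit` inside the namespace,
the pattern `refs/namespaces/namespace/` hides nothing, and
`^refs/namespaces/namespace/` hides everything below the namespace,
which is the behavior the last three tests expect.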


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

* [PATCH v4 5/6] rev-parse: add `--exclude-hidden=` option
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2022-11-08 10:03   ` [PATCH v4 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
@ 2022-11-08 10:03   ` Patrick Steinhardt
  2022-11-08 10:04   ` [PATCH v4 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
  5 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 10:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2934 bytes --]

Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-rev-parse.txt |  7 +++++++
 builtin/rev-parse.c             |  4 ++++
 t/t6018-rev-list-glob.sh        | 11 +++++++++++
 3 files changed, 22 insertions(+)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 6b8ca085aa..393aa6e982 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -197,6 +197,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[receive|uploadpack]::
+	Do not include refs that have been hidden via either one of
+	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
+	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
+	would otherwise consider. This option is cleared when seeing one of
+	these pseudo-refs.
+
 --disambiguate=<prefix>::
 	Show every object whose name begins with the given prefix.
 	The <prefix> must be at least 4 hexadecimal digits long to
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 7fa5b6991b..49730c7a23 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -895,6 +895,10 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				add_ref_exclusion(&ref_excludes, arg);
 				continue;
 			}
+			if (skip_prefix(arg, "--exclude-hidden=", &arg)) {
+				exclude_hidden_refs(&ref_excludes, arg);
+				continue;
+			}
 			if (!strcmp(arg, "--show-toplevel")) {
 				const char *work_tree = get_git_work_tree();
 				if (work_tree)
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index e1abc5c2b3..af0c55cbe7 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -187,6 +187,17 @@ test_expect_success 'rev-parse --exclude=ref with --remotes=glob' '
 	compare rev-parse "--exclude=upstream/x --remotes=upstream/*" "upstream/one upstream/two"
 '
 
+for section in receive uploadpack
+do
+	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
+		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--exclude-hidden=$section --all" "--branches --tags"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
+		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude-hidden=$section --all" "--exclude=refs/heads/subspace/* --all"
+	'
+done
+
 test_expect_success 'rev-list --exclude=glob with --branches=glob' '
 	compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one subspace/two"
 '
-- 
2.38.1


* [PATCH v4 6/6] receive-pack: only use visible refs for connectivity check
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2022-11-08 10:03   ` [PATCH v4 5/6] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
@ 2022-11-08 10:04   ` Patrick Steinhardt
  5 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 10:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.

We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that clients send objects that make large
parts of objects reachable that have previously only ever been hidden,
whereas the common case is to push incremental changes that build on top
of the visible object graph.

This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs are hidden. Pushing a new commit
into this repo with `transfer.hideRefs` set up to hide 6.8 million of
7 million refs, as it is configured in Gitaly, leads to a 4.5-fold
speedup:

    Benchmark 1: main
      Time (mean ± σ):     30.977 s ±  0.157 s    [User: 30.226 s, System: 1.083 s]
      Range (min … max):   30.796 s … 31.071 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      6.799 s ±  0.063 s    [User: 6.803 s, System: 0.354 s]
      Range (min … max):    6.729 s …  6.850 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        4.56 ± 0.05 times faster than 'main'

As we mostly go through the same code paths as before even when there
are no hidden refs at all, there is no change in performance in that
case:

    Benchmark 1: main
      Time (mean ± σ):     48.188 s ±  0.432 s    [User: 49.326 s, System: 5.009 s]
      Range (min … max):   47.706 s … 48.539 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     48.027 s ±  0.500 s    [User: 48.934 s, System: 5.025 s]
      Range (min … max):   47.504 s … 48.500 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.00 ± 0.01 times faster than 'main'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c | 2 ++
 connected.c            | 3 +++
 connected.h            | 7 +++++++
 3 files changed, 12 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1f3efc58fb..77ab40f123 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1929,6 +1929,8 @@ static void execute_commands(struct command *commands,
 	opt.err_fd = err_fd;
 	opt.progress = err_fd && !quiet;
 	opt.env = tmp_objdir_env(tmp_objdir);
+	opt.exclude_hidden_refs_section = "receive";
+
 	if (check_connected(iterate_receive_command_list, &data, &opt))
 		set_connectivity_errors(commands, si);
 
diff --git a/connected.c b/connected.c
index 74a20cb32e..4f6388eed7 100644
--- a/connected.c
+++ b/connected.c
@@ -100,6 +100,9 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		strvec_push(&rev_list.args, "--not");
+		if (opt->exclude_hidden_refs_section)
+			strvec_pushf(&rev_list.args, "--exclude-hidden=%s",
+				     opt->exclude_hidden_refs_section);
 		strvec_push(&rev_list.args, "--all");
 	}
 	strvec_push(&rev_list.args, "--quiet");
diff --git a/connected.h b/connected.h
index 6e59c92aa3..16b2c84f2e 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,13 @@ struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If not NULL, use `--exclude-hidden=$section` to exclude all refs
+	 * hidden via the `$section.hideRefs` config from the set of
+	 * already-reachable refs.
+	 */
+	const char *exclude_hidden_refs_section;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }
-- 
2.38.1


* Re: [PATCH v4 1/6] refs: get rid of global list of hidden refs
  2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
@ 2022-11-08 13:36     ` Ævar Arnfjörð Bjarmason
  2022-11-08 14:49       ` Patrick Steinhardt
  2022-11-08 14:51     ` Jeff King
  1 sibling, 1 reply; 88+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-08 13:36 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Taylor Blau, Jeff King


On Tue, Nov 08 2022, Patrick Steinhardt wrote:

> @@ -2591,6 +2592,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  		packet_flush(1);
>  	oid_array_clear(&shallow);
>  	oid_array_clear(&ref);
> +	string_list_clear(&hidden_refs, 1);

In the v4 re-roll you got rid of the "1" for some other string_lists,
but is this one still needed, i.e. does it use "util"? At a glance it
doesn't seem so. There's another "hidden_refs" (maybe just semi-related)
in 4/6 that doesn't use it when clearing.

> diff --git a/refs.c b/refs.c
> index 1491ae937e..2c7e88b190 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1414,9 +1414,8 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
>  					    refname, strict);
>  }
>  
> -static struct string_list *hide_refs;
> -
> -int parse_hide_refs_config(const char *var, const char *value, const char *section)
> +int parse_hide_refs_config(const char *var, const char *value, const char *section,
> +			   struct string_list *hide_refs)
>  {
>  	const char *key;
>  	if (!strcmp("transfer.hiderefs", var) ||
> @@ -1431,21 +1430,16 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
>  		len = strlen(ref);
>  		while (len && ref[len - 1] == '/')
>  			ref[--len] = '\0';
> -		if (!hide_refs) {
> -			CALLOC_ARRAY(hide_refs, 1);
> -			hide_refs->strdup_strings = 1;
> -		}
> -		string_list_append(hide_refs, ref);
> +		string_list_append_nodup(hide_refs, ref);
>  	}
>  	return 0;
>  }
>  

As before, this is all much nicer, thanks.

* Re: [PATCH v2 2/3] revision: add new parameter to specify all visible refs
  2022-11-07  8:20       ` Patrick Steinhardt
@ 2022-11-08 14:32         ` Jeff King
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-08 14:32 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Mon, Nov 07, 2022 at 09:20:08AM +0100, Patrick Steinhardt wrote:

> >   1. The mutual-exclusion selection between "transfer", "uploadpack",
> >      and "receive" is not how those options work in their respective
> > [...]
> 
> Yup, I'm aware of this. And as you say, the current implementation
> already handles this alright for both `receive` and `uploadpack` as we
> rely on `parse_hide_refs_config()`, which knows to look at both
> `transfer.hideRefs` and `$section.hideRefs`. But I don't see a reason
> why we shouldn't allow users to ask "What is the set of hidden refs that
> are shared by `uploadpack` and `receive`?", which is exactly what
> `--visible-refs=transfer` does.

OK. I don't have a real problem with having that mode, as long as it is
documented accurately. I do think, though, that it's not that likely to
be useful on its own (even less so than the hypotheticals I gave! ;) ),
and it would be easy to add later in a backwards-compatible way. So my
instinct would be to leave it out for now, but I don't think it's
hurting too much as-is (again, with a correct explanation in the
documentation).

> The implementation is not really explicit about this as we cheat a
> little bit here by passing "transfer" as a section to the parsing
> function. So what it does right now is to basically check for the same
> section twice, once via the hard-coded "transfer.hideRefs" and once for
> the "$section.hideRefs" with `section == "transfer"`. But I didn't see
> much of a point in making this more explicit.

Yeah, I agree the implementation works OK here. It does a duplicate
string-comparison for the section, but the important thing is that it
doesn't add each entry to the list twice.
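
For illustration only, a minimal sketch of why passing "transfer" as the
section still matches each entry just once. hide_refs_key_matches() is a
hypothetical stand-in, not the real parse_hide_refs_config(), and it
assumes `var` has already been lowercased by the config machinery:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not git's API): returns 1 if the config key
 * `var` is a hideRefs key that applies to `section`, 0 otherwise.
 */
static int hide_refs_key_matches(const char *var, const char *section)
{
	size_t len;

	assert(var && section);

	/* transfer.hiderefs applies regardless of the section asked for. */
	if (!strcmp(var, "transfer.hiderefs"))
		return 1;

	/* Otherwise the key must be exactly "<section>.hiderefs". */
	len = strlen(section);
	return !strncmp(var, section, len) && !strcmp(var + len, ".hiderefs");
}
```

With section == "transfer" both tests look at the same key, but the
function returns after the first one, so a caller would append the
value only once.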

> >      Now I don't have a particular use case for either of those things.
> >      But they're plausible things to want in the long run, and they fit
> >      in nicely with the existing ref-selection scheme of rev-list. They
> >      do make your call from check_connected() slightly longer, but it is
> >      pretty negligible. It's "--exclude-hidden=receive --all" instead of
> >      "--visible-refs=receive".
> 
> Fair enough. I guess that the usecase where you e.g. only hide a subset
> of branches via `hideRefs` is going to be rare, so in most cases you
> don't gain much by modelling this so that you can `--exclude-hidden
> --branches`. But as you rightfully point out, modelling it that way fits
> neatly with the existing `--exclude` switch and is overall more
> flexible. So there's not much of a reason to not do so.

Thanks. I agree these aren't incredibly likely cases to come up in
practice. But unlike the "transfer" thing, it would be very hard to
switch techniques later without adding an awkward almost-the-same
option.

-Peff

* Re: [PATCH v3 4/6] revision: add new parameter to exclude hidden refs
  2022-11-08  8:16       ` Patrick Steinhardt
@ 2022-11-08 14:42         ` Jeff King
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-08 14:42 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Taylor Blau, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason

On Tue, Nov 08, 2022 at 09:16:47AM +0100, Patrick Steinhardt wrote:

> > > +--exclude-hidden=[transfer|receive|uploadpack]::
> > > +	Do not include refs that have been hidden via either one of
> > > +	`transfer.hideRefs`, `receive.hideRefs` or `uploadpack.hideRefs` that
> > > +	the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob` would
> > > +	otherwise consider.  This option is cleared when seeing one of these
> > > +	pseudo-refs.
> > > +
> > 
> > Hmm. I thought that part of the motivation behind this round was to drop
> > the 'transfer' group, since it's implied by '--exclude-hidden=receive
> > --exclude-hidden=uploadpack', no?
> > 
> > Thanks,
> > Taylor
> 
> I didn't quite see the point in not providing the `transfer` group so
> that users can ask for only the set of refs that are hidden by both
> `uploadpack` and `receive`. But given that you're the second person
> asking for it to be dropped now and given that I don't really have a
> plausible usecase for this I'll drop it in the next version.

Sorry, I'm a little slow on the review. I just left a message in
response to v2 saying that I'm OK with it _if_ it's explained. But the
explanation above still seems misleading. Saying "either one of" implies
that they are mutually exclusive, but "receive" is really pulling from
"receive.hideRefs" and "transfer.hideRefs".

I think you'd need to lay out the rules. But if we just drop "transfer"
then that is simpler still (but the explanation probably still ought to
become "refs hidden by receive-pack" and not mention receive.hideRefs
explicitly).

-Peff

* Re: [PATCH v3 5/6] revparse: add `--exclude-hidden=` option
  2022-11-07 12:16   ` [PATCH v3 5/6] revparse: add `--exclude-hidden=` option Patrick Steinhardt
@ 2022-11-08 14:44     ` Jeff King
  0 siblings, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-08 14:44 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Mon, Nov 07, 2022 at 01:16:39PM +0100, Patrick Steinhardt wrote:

> Add a new `--exclude-hidden=` option that is similar to the one we just
> added to git-rev-list(1). Given a section name `transfer`, `uploadpack`
> or `receive` as argument, it causes us to exclude all references that
> would be hidden by the respective `$section.hideRefs` configuration.

Thanks for adding this one in. I feel like rev-parse isn't used all that
much these days, and in a sense, we could just let it fall behind what
rev-list could do and probably nobody would care. But since it's only a
few lines, keeping parity with rev-list is nice to have.

-Peff

* Re: [PATCH v4 1/6] refs: get rid of global list of hidden refs
  2022-11-08 13:36     ` Ævar Arnfjörð Bjarmason
@ 2022-11-08 14:49       ` Patrick Steinhardt
  0 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-08 14:49 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Taylor Blau, Jeff King

On Tue, Nov 08, 2022 at 02:36:04PM +0100, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Nov 08 2022, Patrick Steinhardt wrote:
> 
> > @@ -2591,6 +2592,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> >  		packet_flush(1);
> >  	oid_array_clear(&shallow);
> >  	oid_array_clear(&ref);
> > +	string_list_clear(&hidden_refs, 1);
> 
> In the v4 re-roll you got rid of the "1" for some other string_lists,
> but is this one still needed, i.e. does it use "util"? At a glance it
> doesn't seem so. There's another "hidden_refs" (maybe just semi-related)
> in 4/6 that doesn't use it when clearing.

Oh, right, I missed this one. Will wait a bit though for other feedback
to come in before sending a v5 only with this one-line change.

Patrick

* Re: [PATCH v4 1/6] refs: get rid of global list of hidden refs
  2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
  2022-11-08 13:36     ` Ævar Arnfjörð Bjarmason
@ 2022-11-08 14:51     ` Jeff King
  1 sibling, 0 replies; 88+ messages in thread
From: Jeff King @ 2022-11-08 14:51 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Tue, Nov 08, 2022 at 11:03:39AM +0100, Patrick Steinhardt wrote:

> -static struct string_list *hide_refs;
> -
> -int parse_hide_refs_config(const char *var, const char *value, const char *section)
> +int parse_hide_refs_config(const char *var, const char *value, const char *section,
> +			   struct string_list *hide_refs)
>  {
>  	const char *key;
>  	if (!strcmp("transfer.hiderefs", var) ||
> @@ -1431,21 +1430,16 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
>  		len = strlen(ref);
>  		while (len && ref[len - 1] == '/')
>  			ref[--len] = '\0';
> -		if (!hide_refs) {
> -			CALLOC_ARRAY(hide_refs, 1);
> -			hide_refs->strdup_strings = 1;
> -		}
> -		string_list_append(hide_refs, ref);
> +		string_list_append_nodup(hide_refs, ref);
>  	}
>  	return 0;
>  }

This nodup is definitely the right thing to be doing, but it's kind of
hidden in here. AFAICT it is fixing an existing leak, because the
previous code always set strdup_strings, and we always made our own copy
of "ref".
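
A small sketch of the ownership difference, using a hypothetical
mini_list rather than git's string_list. Here append_dup() mimics
appending with strdup_strings set, and append_nodup() mimics
string_list_append_nodup():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified list of owned strings. */
struct mini_list {
	char **items;
	size_t nr;
};

/* Takes ownership of `s`; the list stores the pointer as-is. */
static int append_nodup(struct mini_list *list, char *s)
{
	char **grown;

	assert(list && s);
	grown = realloc(list->items, (list->nr + 1) * sizeof(*grown));
	if (!grown)
		return -1;
	list->items = grown;
	list->items[list->nr++] = s;
	return 0;
}

/* Stores a private copy of `s`; the caller keeps owning `s`. */
static int append_dup(struct mini_list *list, const char *s)
{
	char *copy;

	assert(list && s);
	copy = malloc(strlen(s) + 1);
	if (!copy)
		return -1;
	strcpy(copy, s);
	if (append_nodup(list, copy) < 0) {
		free(copy);
		return -1;
	}
	return 0;
}
```

If the caller has already allocated `ref` and then uses the duplicating
variant, the list holds a second copy and nobody frees the caller's
allocation, which is the kind of leak described above.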

Probably not worth a re-roll on its own, but I'd probably have pulled
that into its own commit.

The rest of the commit looks OK to me. Like Ævar, I'm confused by the
"free_util" arguments to string_list_clear().

-Peff

* Re: [PATCH v4 4/6] revision: add new parameter to exclude hidden refs
  2022-11-08 10:03   ` [PATCH v4 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
@ 2022-11-08 15:07     ` Jeff King
  2022-11-08 21:13       ` Taylor Blau
  2022-11-11  5:48       ` Patrick Steinhardt
  0 siblings, 2 replies; 88+ messages in thread
From: Jeff King @ 2022-11-08 15:07 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Tue, Nov 08, 2022 at 11:03:51AM +0100, Patrick Steinhardt wrote:

> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
>  or `--all`. If a trailing '/{asterisk}' is intended, it must be given
>  explicitly.
>  
> +--exclude-hidden=[receive|uploadpack]::
> +	Do not include refs that have been hidden via either one of
> +	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
> +	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
> +	would otherwise consider. This option is cleared when seeing one of
> +	these pseudo-refs.

OK, so this one drops "transfer", which I think is good. But again, I
think the explanation here is subtly misleading, because it's not just
"hidden via those sections", but "hidden as receive-pack or upload-pack
would, respecting both those sections _and_ transfer".

I also wondered if we could simplify the explanation a bit by tying it
into the existing --exclude, and then we don't need that mouthful of
pseudo-ref options. So together something like:

  Do not include refs that would be hidden by `receive-pack` or
  `upload-pack`, by consulting the appropriate `receive.hideRefs` or
  `uploadpack.hideRefs`, along with `transfer.hideRefs`. Like
  `--exclude`, this option affects the next pseudo-ref option (`--all`,
  `--glob`, etc), and is cleared after processing them.

But then I read the next section in the --exclude docs, which says:

  The patterns given should not begin with refs/heads, refs/tags, or
  refs/remotes when applied to --branches, --tags, or --remotes,
  respectively, and they must begin with refs/ when applied to --glob or
  --all. If a trailing /* is intended, it must be given explicitly.

Yikes. So --all is going to process "refs/heads/foo", but --branches
will see just "foo". And that means that:

  git rev-list --exclude-hidden=receive --branches

will not work! Because receive.hideRefs is naming fully qualified refs,
but we'll pass unqualified ones to ref_is_hidden(). Even though that
function takes a "full" refname argument, I think that has to do with
namespaces.
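
To make the mismatch concrete, here is a simplified sketch of the prefix
test, assuming hide patterns are plain prefixes. hidden_by_prefix() is a
hypothetical helper; the real ref_is_hidden() additionally deals with
negated and namespaced patterns:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `refname` is hidden by the
 * pattern `hide_prefix`, matching whole path components only.
 */
static int hidden_by_prefix(const char *refname, const char *hide_prefix)
{
	size_t len;

	assert(refname && hide_prefix);
	len = strlen(hide_prefix);
	if (strncmp(refname, hide_prefix, len))
		return 0;
	/* "refs/heads/foo" hides "refs/heads/foo/bar", not "refs/heads/foobar". */
	return refname[len] == '\0' || refname[len] == '/';
}
```

Under this rule a pattern of "refs/heads/foo" hides the full name
"refs/heads/foo", but it never matches the short name "foo".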

I'm sure this _could_ be made to work, but I wonder if it is worth the
trouble. If it's not going to work, though, I think we'd want to detect
the situation and complain, at least for now. And likewise the
documentation needs to make clear it only works with --all and --glob.

Sorry to have misled in my initial suggestion to turn --visible-refs
into --exclude-hidden. However, I do still stand by that suggestion.
Even if we don't make it work with "--branches" now, the user-visible
framework is still there, so it becomes a matter of extending the
implementation later, rather than re-designing the options.

-Peff

* Re: [PATCH v4 4/6] revision: add new parameter to exclude hidden refs
  2022-11-08 15:07     ` Jeff King
@ 2022-11-08 21:13       ` Taylor Blau
  2022-11-11  5:48       ` Patrick Steinhardt
  1 sibling, 0 replies; 88+ messages in thread
From: Taylor Blau @ 2022-11-08 21:13 UTC (permalink / raw)
  To: Jeff King
  Cc: Patrick Steinhardt, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Taylor Blau

On Tue, Nov 08, 2022 at 10:07:35AM -0500, Jeff King wrote:
> I'm sure this _could_ be made to work, but I wonder if it is worth the
> trouble. If it's not going to work, though, I think we'd want to detect
> the situation and complain, at least for now. And likewise the
> documentation needs to make clear it only works with --all and --glob.
>
> Sorry to have misled in my initial suggestion to turn --visible-refs
> into --exclude-hidden. However, I do still stand by that suggestion.
> Even if we don't make it work with "--branches" now, the user-visible
> framework is still there, so it becomes a matter of extending the
> implementation later, rather than re-designing the options.

Good catch. I agree completely, so hopefully we will see something to
this effect in the forthcoming v5.

I'll hold off on merging this down until we see that version.


Thanks,
Taylor

* Re: [PATCH v4 4/6] revision: add new parameter to exclude hidden refs
  2022-11-08 15:07     ` Jeff King
  2022-11-08 21:13       ` Taylor Blau
@ 2022-11-11  5:48       ` Patrick Steinhardt
  1 sibling, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  5:48 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Tue, Nov 08, 2022 at 10:07:35AM -0500, Jeff King wrote:
> On Tue, Nov 08, 2022 at 11:03:51AM +0100, Patrick Steinhardt wrote:
> 
> > --- a/Documentation/rev-list-options.txt
> > +++ b/Documentation/rev-list-options.txt
> > @@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
> >  or `--all`. If a trailing '/{asterisk}' is intended, it must be given
> >  explicitly.
> >  
> > +--exclude-hidden=[receive|uploadpack]::
> > +	Do not include refs that have been hidden via either one of
> > +	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
> > +	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
> > +	would otherwise consider. This option is cleared when seeing one of
> > +	these pseudo-refs.
> 
> OK, so this one drops "transfer", which I think is good. But again, I
> think the explanation here is subtly misleading, because it's not just
> "hidden via those sections", but "hidden as receive-pack or upload-pack
> would, respecting both those sections _and_ transfer".
> 
> I also wondered if we could simplify the explanation a bit by tying it
> into the existing --exclude, and then we don't need that mouthful of
> pseudo-ref options. So together something like:
> 
>   Do not include refs that would be hidden by `receive-pack` or
>   `upload-pack`, by consulting the appropriate `receive.hideRefs` or
>   `uploadpack.hideRefs`, along with `transfer.hideRefs`. Like
>   `--exclude`, this option affects the next pseudo-ref option (`--all`,
>   `--glob`, etc), and is cleared after processing them.

I'll try to come up with something, thanks.

> But then I read the next section in the --exclude docs, which says:
> 
>   The patterns given should not begin with refs/heads, refs/tags, or
>   refs/remotes when applied to --branches, --tags, or --remotes,
>   respectively, and they must begin with refs/ when applied to --glob or
>   --all. If a trailing /* is intended, it must be given explicitly.
> 
> Yikes. So --all is going to process "refs/heads/foo", but --branches
> will see just "foo". And that means that:
> 
>   git rev-list --exclude-hidden=receive --branches
> 
> will not work! Because receive.hideRefs is naming fully qualified refs,
> but we'll pass unqualified ones to ref_is_hidden(). Even though that
> function takes a "full" refname argument, I think that has to do with
> namespaces.

Oh, good catch. I noticed that paragraph before, but somehow it didn't
click.

> I'm sure this _could_ be made to work, but I wonder if it is worth the
> trouble. If it's not going to work, though, I think we'd want to detect
> the situation and complain, at least for now. And likewise the
> documentation needs to make clear it only works with --all and --glob.
> 
> Sorry to have misled in my initial suggestion to turn --visible-refs
> into --exclude-hidden. However, I do still stand by that suggestion.
> Even if we don't make it work with "--branches" now, the user-visible
> framework is still there, so it becomes a matter of extending the
> implementation later, rather than re-designing the options.

No worries, and I agree that it's ultimately still the better route to
use `--exclude-hidden=`. The things that don't work right now just go to
show that there is still some potential here and that it's overall the
more flexible design. And as you say, this is something we can build on
at a later point if any such usecases come up.

Patrick

* [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
@ 2022-11-11  6:49 ` Patrick Steinhardt
  2022-11-11  6:49   ` [PATCH v5 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
                     ` (7 more replies)
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
  9 siblings, 8 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:49 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

Hi,

this is the fifth version of my patch series that tries to improve
performance of the connectivity check by only considering preexisting
refs as uninteresting that could actually have been advertised to the
client.

Changes compared to v4:

    - Split out the memory leak fix when parsing hidden refs into a
      separate commit 1/7.

    - Fixed calls to `string_list_clear()` to not free the `util` field
      of `hidden_refs`.

    - Updated the documentation of the new `--exclude-hidden=` option to
      hopefully be easier to understand.

    - We now return an error when `--exclude-hidden=` is used with
      either one of `--branches`, `--tags` or `--remotes`.

    - Fixed a bug where we didn't bail when `--exclude-hidden=` was
      passed multiple times when there are no hidden refs.

    - Extended test coverage.
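
The "passed multiple times" fix can be sketched as follows. This is a
simplified model with hypothetical names (exclusions_sketch and both
functions are made up for illustration); it only shows why counting
loaded patterns cannot detect a repeated option when no refs are hidden,
whereas a dedicated flag can:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the exclusion state. */
struct exclusions_sketch {
	size_t hidden_nr;      /* number of hideRefs patterns loaded */
	int hidden_configured; /* set once the option has been seen */
};

/* Detects a repeat via the pattern count; blind if nothing was loaded. */
static int exclude_hidden_by_count(struct exclusions_sketch *ex, size_t patterns)
{
	assert(ex);
	if (ex->hidden_nr)
		return -1; /* option passed more than once */
	ex->hidden_nr += patterns;
	return 0;
}

/* Detects a repeat via an explicit flag, independent of the count. */
static int exclude_hidden_by_flag(struct exclusions_sketch *ex, size_t patterns)
{
	assert(ex);
	if (ex->hidden_configured)
		return -1; /* option passed more than once */
	ex->hidden_nr += patterns;
	ex->hidden_configured = 1;
	return 0;
}
```

With zero configured patterns, the count-based variant accepts a second
call while the flag-based variant rejects it.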

Patrick

Patrick Steinhardt (7):
  refs: fix memory leak when parsing hideRefs config
  refs: get rid of global list of hidden refs
  revision: move together exclusion-related functions
  revision: introduce struct to handle exclusions
  revision: add new parameter to exclude hidden refs
  rev-parse: add `--exclude-hidden=` option
  receive-pack: only use visible refs for connectivity check

 Documentation/git-rev-parse.txt    |   7 ++
 Documentation/rev-list-options.txt |   7 ++
 builtin/receive-pack.c             |  10 +-
 builtin/rev-list.c                 |   1 +
 builtin/rev-parse.c                |  18 +++-
 connected.c                        |   3 +
 connected.h                        |   7 ++
 ls-refs.c                          |  13 ++-
 refs.c                             |  16 +--
 refs.h                             |   5 +-
 revision.c                         | 131 +++++++++++++++--------
 revision.h                         |  43 ++++++--
 t/t6018-rev-list-glob.sh           |  40 +++++++
 t/t6021-rev-list-exclude-hidden.sh | 163 +++++++++++++++++++++++++++++
 upload-pack.c                      |  30 +++---
 15 files changed, 411 insertions(+), 83 deletions(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh

Range-diff against v4:
-:  ---------- > 1:  cfab8ba1a2 refs: fix memory leak when parsing hideRefs config
1:  34afe30d60 ! 2:  d8118c6dd8 refs: get rid of global list of hidden refs
    @@ builtin/receive-pack.c: int cmd_receive_pack(int argc, const char **argv, const
      		packet_flush(1);
      	oid_array_clear(&shallow);
      	oid_array_clear(&ref);
    -+	string_list_clear(&hidden_refs, 1);
    ++	string_list_clear(&hidden_refs, 0);
      	free((void *)push_cert_nonce);
      	return 0;
      }
    @@ ls-refs.c: int ls_refs(struct repository *r, struct packet_reader *request)
      	packet_fflush(stdout);
      	strvec_clear(&data.prefixes);
      	strbuf_release(&data.buf);
    -+	string_list_clear(&data.hidden_refs, 1);
    ++	string_list_clear(&data.hidden_refs, 0);
      	return 0;
      }
      
    @@ refs.c: int parse_hide_refs_config(const char *var, const char *value, const cha
     -			CALLOC_ARRAY(hide_refs, 1);
     -			hide_refs->strdup_strings = 1;
     -		}
    --		string_list_append(hide_refs, ref);
    -+		string_list_append_nodup(hide_refs, ref);
    + 		string_list_append_nodup(hide_refs, ref);
      	}
      	return 0;
      }
    @@ upload-pack.c: static void upload_pack_data_clear(struct upload_pack_data *data)
      {
      	string_list_clear(&data->symref, 1);
      	string_list_clear(&data->wanted_refs, 1);
    -+	string_list_clear(&data->hidden_refs, 1);
    ++	string_list_clear(&data->hidden_refs, 0);
      	object_array_clear(&data->want_obj);
      	object_array_clear(&data->have_obj);
      	oid_array_clear(&data->haves);
2:  b4f21d0a80 = 3:  93a627fb7f revision: move together exclusion-related functions
3:  265b292ed5 = 4:  ad41ade332 revision: introduce struct to handle exclusions
4:  c7fa6698db ! 5:  b5a4ce432a revision: add new parameter to exclude hidden refs
    @@ Documentation/rev-list-options.txt: respectively, and they must begin with `refs
      explicitly.
      
     +--exclude-hidden=[receive|uploadpack]::
    -+	Do not include refs that have been hidden via either one of
    -+	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
    -+	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
    -+	would otherwise consider. This option is cleared when seeing one of
    -+	these pseudo-refs.
    ++	Do not include refs that would be hidden by `git-receive-pack` or
    ++	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
    ++	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
    ++	linkgit:git-config[1]). This option affects the next pseudo-ref option
    ++	`--all` or `--glob` and is cleared after processing them.
     +
      --reflog::
      	Pretend as if all objects mentioned by reflogs are listed on the
    @@ revision.c: void init_ref_exclusions(struct ref_exclusions *exclusions)
      {
      	string_list_clear(&exclusions->excluded_refs, 0);
     +	string_list_clear(&exclusions->hidden_refs, 0);
    ++	exclusions->hidden_refs_configured = 0;
      }
      
      void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
    @@ revision.c: void add_ref_exclusion(struct ref_exclusions *exclusions, const char
     +static int hide_refs_config(const char *var, const char *value, void *cb_data)
     +{
     +	struct exclude_hidden_refs_cb *cb = cb_data;
    ++	cb->exclusions->hidden_refs_configured = 1;
     +	return parse_hide_refs_config(var, value, cb->section,
     +				      &cb->exclusions->hidden_refs);
     +}
    @@ revision.c: void add_ref_exclusion(struct ref_exclusions *exclusions, const char
     +	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
     +		die(_("unsupported section for hidden refs: %s"), section);
     +
    -+	if (exclusions->hidden_refs.nr)
    ++	if (exclusions->hidden_refs_configured)
     +		die(_("--exclude-hidden= passed more than once"));
     +
     +	cb.exclusions = exclusions;
    @@ revision.c: static int handle_revision_opt(struct rev_info *revs, int argc, cons
      	    starts_with(arg, "--branches=") || starts_with(arg, "--tags=") ||
      	    starts_with(arg, "--remotes=") || starts_with(arg, "--no-walk="))
      	{
    +@@ revision.c: static int handle_revision_pseudo_opt(struct rev_info *revs,
    + 		}
    + 		clear_ref_exclusions(&revs->ref_excludes);
    + 	} else if (!strcmp(arg, "--branches")) {
    ++		if (revs->ref_excludes.hidden_refs_configured)
    ++			return error(_("--exclude-hidden cannot be used together with --branches"));
    + 		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
    + 		clear_ref_exclusions(&revs->ref_excludes);
    + 	} else if (!strcmp(arg, "--bisect")) {
    +@@ revision.c: static int handle_revision_pseudo_opt(struct rev_info *revs,
    + 			    for_each_good_bisect_ref);
    + 		revs->bisect = 1;
    + 	} else if (!strcmp(arg, "--tags")) {
    ++		if (revs->ref_excludes.hidden_refs_configured)
    ++			return error(_("--exclude-hidden cannot be used together with --tags"));
    + 		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
    + 		clear_ref_exclusions(&revs->ref_excludes);
    + 	} else if (!strcmp(arg, "--remotes")) {
    ++		if (revs->ref_excludes.hidden_refs_configured)
    ++			return error(_("--exclude-hidden cannot be used together with --remotes"));
    + 		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
    + 		clear_ref_exclusions(&revs->ref_excludes);
    + 	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
     @@ revision.c: static int handle_revision_pseudo_opt(struct rev_info *revs,
      	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
      		add_ref_exclusion(&revs->ref_excludes, optarg);
    @@ revision.c: static int handle_revision_pseudo_opt(struct rev_info *revs,
     +		return argcount;
      	} else if (skip_prefix(arg, "--branches=", &optarg)) {
      		struct all_refs_cb cb;
    ++		if (revs->ref_excludes.hidden_refs_configured)
    ++			return error(_("--exclude-hidden cannot be used together with --branches"));
      		init_all_refs_cb(&cb, revs, *flags);
    + 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
    + 		clear_ref_exclusions(&revs->ref_excludes);
    + 	} else if (skip_prefix(arg, "--tags=", &optarg)) {
    + 		struct all_refs_cb cb;
    ++		if (revs->ref_excludes.hidden_refs_configured)
    ++			return error(_("--exclude-hidden cannot be used together with --tags"));
    + 		init_all_refs_cb(&cb, revs, *flags);
    + 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
    + 		clear_ref_exclusions(&revs->ref_excludes);
    + 	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
    + 		struct all_refs_cb cb;
    ++		if (revs->ref_excludes.hidden_refs_configured)
    ++			return error(_("--exclude-hidden cannot be used together with --remotes"));
    + 		init_all_refs_cb(&cb, revs, *flags);
    + 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
    + 		clear_ref_exclusions(&revs->ref_excludes);
     
      ## revision.h ##
     @@ revision.h: struct ref_exclusions {
    @@ revision.h: struct ref_exclusions {
     +	 * `ref_is_hidden()`.
     +	 */
     +	struct string_list hidden_refs;
    ++
    ++	/*
    ++	 * Indicates whether hidden refs have been configured. This is to
    ++	 * distinguish between no hidden refs existing and hidden refs not
    ++	 * being parsed.
    ++	 */
    ++	char hidden_refs_configured;
      };
      
      /**
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +do
     +	test_expect_success "$section: passed multiple times" '
     +		echo "fatal: --exclude-hidden= passed more than once" >expected &&
    -+		test_must_fail git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --exclude-hidden=$section 2>err &&
    ++		test_must_fail git rev-list --exclude-hidden=$section --exclude-hidden=$section 2>err &&
     +		test_cmp expected err
     +	'
     +
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +		test_cmp expected out
     +	'
     +
    ++	test_expect_success "$section: excluded hidden refs can be used with multiple pseudo-refs" '
    ++		git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --exclude-hidden=$section --all >out &&
    ++		test_must_be_empty out
    ++	'
    ++
    ++	test_expect_success "$section: works with --glob" '
    ++		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --glob=refs/h* >out &&
    ++		cat >expected <<-EOF &&
    ++		$COMMIT
    ++		EOF
    ++		test_cmp expected out
    ++	'
    ++
     +	test_expect_success "$section: operates on stripped refs by default" '
     +		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaced/ rev-list --exclude-hidden=$section --all >out &&
     +		cat >expected <<-EOF &&
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +		EOF
     +		test_cmp expected out
     +	'
    ++
    ++	for pseudoopt in remotes branches tags
    ++	do
    ++		test_expect_success "$section: fails with --$pseudoopt" '
    ++			test_must_fail git rev-list --exclude-hidden=$section --$pseudoopt 2>err &&
    ++			test_i18ngrep "error: --exclude-hidden cannot be used together with --$pseudoopt" err
    ++		'
    ++
    ++		test_expect_success "$section: fails with --$pseudoopt=pattern" '
    ++			test_must_fail git rev-list --exclude-hidden=$section --$pseudoopt=pattern 2>err &&
    ++			test_i18ngrep "error: --exclude-hidden cannot be used together with --$pseudoopt" err
    ++		'
    ++	done
     +done
     +
     +test_done
5:  79c5c64a80 ! 6:  2eeb25eef0 rev-parse: add `--exclude-hidden=` option
    @@ Documentation/git-rev-parse.txt: respectively, and they must begin with `refs/`
      explicitly.
      
     +--exclude-hidden=[receive|uploadpack]::
    -+	Do not include refs that have been hidden via either one of
    -+	`receive.hideRefs` or `uploadpack.hideRefs` (see linkgit:git-config[1])
    -+	that the next `--all`, `--branches`, `--tags`, `--remotes` or `--glob`
    -+	would otherwise consider. This option is cleared when seeing one of
    -+	these pseudo-refs.
    ++	Do not include refs that would be hidden by `git-receive-pack` or
    ++	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
    ++	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
    ++	linkgit:git-config[1]). This option affects the next pseudo-ref option
    ++	`--all` or `--glob` and is cleared after processing them.
     +
      --disambiguate=<prefix>::
      	Show every object whose name begins with the given prefix.
      	The <prefix> must be at least 4 hexadecimal digits long to
     
      ## builtin/rev-parse.c ##
    +@@ builtin/rev-parse.c: int cmd_rev_parse(int argc, const char **argv, const char *prefix)
    + 				continue;
    + 			}
    + 			if (opt_with_value(arg, "--branches", &arg)) {
    ++				if (ref_excludes.hidden_refs_configured)
    ++					return error(_("--exclude-hidden cannot be used together with --branches"));
    + 				handle_ref_opt(arg, "refs/heads/");
    + 				continue;
    + 			}
    + 			if (opt_with_value(arg, "--tags", &arg)) {
    ++				if (ref_excludes.hidden_refs_configured)
    ++					return error(_("--exclude-hidden cannot be used together with --tags"));
    + 				handle_ref_opt(arg, "refs/tags/");
    + 				continue;
    + 			}
    +@@ builtin/rev-parse.c: int cmd_rev_parse(int argc, const char **argv, const char *prefix)
    + 				continue;
    + 			}
    + 			if (opt_with_value(arg, "--remotes", &arg)) {
    ++				if (ref_excludes.hidden_refs_configured)
    ++					return error(_("--exclude-hidden cannot be used together with --remotes"));
    + 				handle_ref_opt(arg, "refs/remotes/");
    + 				continue;
    + 			}
     @@ builtin/rev-parse.c: int cmd_rev_parse(int argc, const char **argv, const char *prefix)
      				add_ref_exclusion(&ref_excludes, arg);
      				continue;
    @@ t/t6018-rev-list-glob.sh: test_expect_success 'rev-parse --exclude=ref with --re
     +for section in receive uploadpack
     +do
     +	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
    -+		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--exclude-hidden=$section --all" "--branches --tags"
    ++		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags" "--exclude-hidden=$section --all"
     +	'
     +
     +	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
    -+		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude-hidden=$section --all" "--exclude=refs/heads/subspace/* --all"
    ++		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude=refs/heads/subspace/* --all" "--exclude-hidden=$section --all"
     +	'
    ++
    ++	test_expect_success "rev-parse --exclude-hidden=$section with --glob" '
    ++		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude=refs/heads/subspace/* --glob=refs/heads/*" "--exclude-hidden=$section --glob=refs/heads/*"
    ++	'
    ++
    ++	test_expect_success "rev-parse --exclude-hidden=$section can be passed once per pseudo-ref" '
    ++		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags --branches --tags" "--exclude-hidden=$section --all --exclude-hidden=$section --all"
    ++	'
    ++
    ++	test_expect_success "rev-parse --exclude-hidden=$section can only be passed once per pseudo-ref" '
    ++		echo "fatal: --exclude-hidden= passed more than once" >expected &&
    ++		test_must_fail git rev-parse --exclude-hidden=$section --exclude-hidden=$section 2>err &&
    ++		test_cmp expected err
    ++	'
    ++
    ++	for pseudoopt in branches tags remotes
    ++	do
    ++		test_expect_success "rev-parse --exclude-hidden=$section fails with --$pseudoopt" '
    ++			echo "error: --exclude-hidden cannot be used together with --$pseudoopt" >expected &&
    ++			test_must_fail git rev-parse --exclude-hidden=$section --$pseudoopt 2>err &&
    ++			test_cmp expected err
    ++		'
    ++
    ++		test_expect_success "rev-parse --exclude-hidden=$section fails with --$pseudoopt=pattern" '
    ++			echo "error: --exclude-hidden cannot be used together with --$pseudoopt" >expected &&
    ++			test_must_fail git rev-parse --exclude-hidden=$section --$pseudoopt=pattern 2>err &&
    ++			test_cmp expected err
    ++		'
    ++	done
     +done
     +
      test_expect_success 'rev-list --exclude=glob with --branches=glob' '
6:  39b4741734 = 7:  f5f18f3939 receive-pack: only use visible refs for connectivity check
-- 
2.38.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* [PATCH v5 1/7] refs: fix memory leak when parsing hideRefs config
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
@ 2022-11-11  6:49   ` Patrick Steinhardt
  2022-11-11  6:49   ` [PATCH v5 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:49 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 1242 bytes --]

When parsing the hideRefs configuration, we first duplicate the config
value so that we can modify it. We then append it to the `hide_refs`
string list, which is initialized with `strdup_strings` enabled. As a
consequence, the string gets duplicated a second time, but we never free
the first duplicate and thus have a memory leak.

While we never clean up the static `hide_refs` variable anyway, this is
no excuse to make the leak worse by leaking every value twice. We are
also about to change the way this variable will be handled so that we do
indeed start to clean it up. So let's fix the memory leak by using
`string_list_append_nodup()` so that we pass ownership of the allocated
string to `hide_refs`.
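
To illustrate the ownership difference described above, here is a minimal
sketch. It does not use Git's actual `struct string_list`; the `toy_*`
names are hypothetical stand-ins that only model the `strdup_strings`
flag and the two append flavors.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Git's `struct string_list`. */
struct toy_list {
	char **items;
	size_t nr, alloc;
	int strdup_strings;
};

/* Portable helper so the sketch does not depend on POSIX strdup(). */
static char *toy_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* Stores `s` as-is; the list becomes the owner of the allocation. */
static void toy_append_nodup(struct toy_list *list, char *s)
{
	assert(list && s);
	if (list->nr == list->alloc) {
		list->alloc = list->alloc ? 2 * list->alloc : 4;
		list->items = realloc(list->items,
				      list->alloc * sizeof(*list->items));
		if (!list->items)
			abort();
	}
	list->items[list->nr++] = s;
}

/* Copies `s` first when `strdup_strings` is set; the caller keeps `s`. */
static void toy_append(struct toy_list *list, const char *s)
{
	toy_append_nodup(list, list->strdup_strings ? toy_strdup(s) : (char *)s);
}

/* Frees the strings only if the list owns them, then empties the list. */
static void toy_clear(struct toy_list *list)
{
	size_t i;
	if (list->strdup_strings)
		for (i = 0; i < list->nr; i++)
			free(list->items[i]);
	free(list->items);
	list->items = NULL;
	list->nr = list->alloc = 0;
}
```

In this model, a caller that duplicates a value and then calls
`toy_append()` would still own (and have to free) its duplicate, whereas
handing the duplicate to `toy_append_nodup()` leaves exactly one
allocation, which `toy_clear()` releases.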

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index 1491ae937e..a4ab264d74 100644
--- a/refs.c
+++ b/refs.c
@@ -1435,7 +1435,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 			CALLOC_ARRAY(hide_refs, 1);
 			hide_refs->strdup_strings = 1;
 		}
-		string_list_append(hide_refs, ref);
+		string_list_append_nodup(hide_refs, ref);
 	}
 	return 0;
 }
-- 
2.38.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* [PATCH v5 2/7] refs: get rid of global list of hidden refs
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
  2022-11-11  6:49   ` [PATCH v5 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
@ 2022-11-11  6:49   ` Patrick Steinhardt
  2022-11-11  6:50   ` [PATCH v5 3/7] revision: move together exclusion-related functions Patrick Steinhardt
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:49 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 12504 bytes --]

We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.

Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input, and adjust callers to keep
a local list instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.
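
As a rough sketch of why passing the list explicitly helps, the following
simplified check takes its pattern list as a parameter, so two
independent lists can be queried side by side. It is not the real
`ref_is_hidden()`: the `pattern_list` type and `toy_ref_is_hidden()` are
hypothetical, and only plain prefix patterns and `!` negation are
modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal list type; Git uses `struct string_list`. */
struct pattern_list {
	const char **items;
	size_t nr;
};

/*
 * Simplified check: later entries take precedence, a leading '!'
 * negates, and a pattern matches a ref if it is a prefix of the ref
 * that ends at the end of the name or at a '/' boundary.
 */
static int toy_ref_is_hidden(const char *refname,
			     const struct pattern_list *hidden)
{
	size_t i;

	assert(refname && hidden);
	for (i = hidden->nr; i > 0; i--) {
		const char *match = hidden->items[i - 1];
		int negated = 0;
		size_t len;

		if (*match == '!') {
			negated = 1;
			match++;
		}
		len = strlen(match);
		if (!strncmp(refname, match, len) &&
		    (!refname[len] || refname[len] == '/'))
			return !negated;
	}
	return 0;
}
```

An empty list simply yields "not hidden", which is why no NULL check on a
global pointer is needed once callers own their list.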

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c |  8 +++++---
 ls-refs.c              | 13 +++++++++----
 refs.c                 | 14 ++++----------
 refs.h                 |  5 +++--
 upload-pack.c          | 30 ++++++++++++++++++------------
 5 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 44bcea3a5b..1e24b31a0a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -80,6 +80,7 @@ static struct object_id push_cert_oid;
 static struct signature_check sigcheck;
 static const char *push_cert_nonce;
 static const char *cert_nonce_seed;
+static struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 
 static const char *NONCE_UNSOLICITED = "UNSOLICITED";
 static const char *NONCE_BAD = "BAD";
@@ -130,7 +131,7 @@ static enum deny_action parse_deny_action(const char *var, const char *value)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
-	int status = parse_hide_refs_config(var, value, "receive");
+	int status = parse_hide_refs_config(var, value, "receive", &hidden_refs);
 
 	if (status)
 		return status;
@@ -296,7 +297,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	struct oidset *seen = data;
 	const char *path = strip_namespace(path_full);
 
-	if (ref_is_hidden(path, path_full))
+	if (ref_is_hidden(path, path_full, &hidden_refs))
 		return 0;
 
 	/*
@@ -1794,7 +1795,7 @@ static void reject_updates_to_hidden(struct command *commands)
 		strbuf_setlen(&refname_full, prefix_len);
 		strbuf_addstr(&refname_full, cmd->ref_name);
 
-		if (!ref_is_hidden(cmd->ref_name, refname_full.buf))
+		if (!ref_is_hidden(cmd->ref_name, refname_full.buf, &hidden_refs))
 			continue;
 		if (is_null_oid(&cmd->new_oid))
 			cmd->error_string = "deny deleting a hidden ref";
@@ -2591,6 +2592,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		packet_flush(1);
 	oid_array_clear(&shallow);
 	oid_array_clear(&ref);
+	string_list_clear(&hidden_refs, 0);
 	free((void *)push_cert_nonce);
 	return 0;
 }
diff --git a/ls-refs.c b/ls-refs.c
index fa0d01b47c..fb6769742c 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -6,6 +6,7 @@
 #include "ls-refs.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "string-list.h"
 
 static int config_read;
 static int advertise_unborn;
@@ -73,6 +74,7 @@ struct ls_refs_data {
 	unsigned symrefs;
 	struct strvec prefixes;
 	struct strbuf buf;
+	struct string_list hidden_refs;
 	unsigned unborn : 1;
 };
 
@@ -84,7 +86,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 
 	strbuf_reset(&data->buf);
 
-	if (ref_is_hidden(refname_nons, refname))
+	if (ref_is_hidden(refname_nons, refname, &data->hidden_refs))
 		return 0;
 
 	if (!ref_match(&data->prefixes, refname_nons))
@@ -137,14 +139,15 @@ static void send_possibly_unborn_head(struct ls_refs_data *data)
 }
 
 static int ls_refs_config(const char *var, const char *value,
-			  void *data UNUSED)
+			  void *cb_data)
 {
+	struct ls_refs_data *data = cb_data;
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
 	 * don't yet know how that information will be passed to ls-refs.
 	 */
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 int ls_refs(struct repository *r, struct packet_reader *request)
@@ -154,9 +157,10 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	memset(&data, 0, sizeof(data));
 	strvec_init(&data.prefixes);
 	strbuf_init(&data.buf, 0);
+	string_list_init_dup(&data.hidden_refs);
 
 	ensure_config_read();
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -195,6 +199,7 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	packet_fflush(stdout);
 	strvec_clear(&data.prefixes);
 	strbuf_release(&data.buf);
+	string_list_clear(&data.hidden_refs, 0);
 	return 0;
 }
 
diff --git a/refs.c b/refs.c
index a4ab264d74..2c7e88b190 100644
--- a/refs.c
+++ b/refs.c
@@ -1414,9 +1414,8 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 					    refname, strict);
 }
 
-static struct string_list *hide_refs;
-
-int parse_hide_refs_config(const char *var, const char *value, const char *section)
+int parse_hide_refs_config(const char *var, const char *value, const char *section,
+			   struct string_list *hide_refs)
 {
 	const char *key;
 	if (!strcmp("transfer.hiderefs", var) ||
@@ -1431,21 +1430,16 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 		len = strlen(ref);
 		while (len && ref[len - 1] == '/')
 			ref[--len] = '\0';
-		if (!hide_refs) {
-			CALLOC_ARRAY(hide_refs, 1);
-			hide_refs->strdup_strings = 1;
-		}
 		string_list_append_nodup(hide_refs, ref);
 	}
 	return 0;
 }
 
-int ref_is_hidden(const char *refname, const char *refname_full)
+int ref_is_hidden(const char *refname, const char *refname_full,
+		  const struct string_list *hide_refs)
 {
 	int i;
 
-	if (!hide_refs)
-		return 0;
 	for (i = hide_refs->nr - 1; i >= 0; i--) {
 		const char *match = hide_refs->items[i].string;
 		const char *subject;
diff --git a/refs.h b/refs.h
index 8958717a17..3266fd8f57 100644
--- a/refs.h
+++ b/refs.h
@@ -808,7 +808,8 @@ int update_ref(const char *msg, const char *refname,
 	       const struct object_id *new_oid, const struct object_id *old_oid,
 	       unsigned int flags, enum action_on_err onerr);
 
-int parse_hide_refs_config(const char *var, const char *value, const char *);
+int parse_hide_refs_config(const char *var, const char *value, const char *,
+			   struct string_list *);
 
 /*
  * Check whether a ref is hidden. If no namespace is set, both the first and
@@ -818,7 +819,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *);
  * the ref is outside that namespace, the first parameter is NULL. The second
  * parameter always points to the full ref name.
  */
-int ref_is_hidden(const char *, const char *);
+int ref_is_hidden(const char *, const char *, const struct string_list *);
 
 /* Is this a per-worktree ref living in the refs/ namespace? */
 int is_per_worktree_ref(const char *refname);
diff --git a/upload-pack.c b/upload-pack.c
index 0b8311bd68..551f22ffa5 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -62,6 +62,7 @@ struct upload_pack_data {
 	struct object_array have_obj;
 	struct oid_array haves;					/* v2 only */
 	struct string_list wanted_refs;				/* v2 only */
+	struct string_list hidden_refs;
 
 	struct object_array shallows;
 	struct string_list deepen_not;
@@ -118,6 +119,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct string_list symref = STRING_LIST_INIT_DUP;
 	struct string_list wanted_refs = STRING_LIST_INIT_DUP;
+	struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 	struct object_array want_obj = OBJECT_ARRAY_INIT;
 	struct object_array have_obj = OBJECT_ARRAY_INIT;
 	struct oid_array haves = OID_ARRAY_INIT;
@@ -130,6 +132,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 	memset(data, 0, sizeof(*data));
 	data->symref = symref;
 	data->wanted_refs = wanted_refs;
+	data->hidden_refs = hidden_refs;
 	data->want_obj = want_obj;
 	data->have_obj = have_obj;
 	data->haves = haves;
@@ -151,6 +154,7 @@ static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	string_list_clear(&data->symref, 1);
 	string_list_clear(&data->wanted_refs, 1);
+	string_list_clear(&data->hidden_refs, 0);
 	object_array_clear(&data->want_obj);
 	object_array_clear(&data->have_obj);
 	oid_array_clear(&data->haves);
@@ -842,8 +846,8 @@ static void deepen(struct upload_pack_data *data, int depth)
 		 * Checking for reachable shallows requires that our refs be
 		 * marked with OUR_REF.
 		 */
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, data);
+		for_each_namespaced_ref(check_ref, data);
 
 		get_reachable_list(data, &reachable_shallows);
 		result = get_shallow_commits(&reachable_shallows,
@@ -1158,11 +1162,11 @@ static void receive_needs(struct upload_pack_data *data,
 
 /* return non-zero if the ref is hidden, otherwise 0 */
 static int mark_our_ref(const char *refname, const char *refname_full,
-			const struct object_id *oid)
+			const struct object_id *oid, const struct string_list *hidden_refs)
 {
 	struct object *o = lookup_unknown_object(the_repository, oid);
 
-	if (ref_is_hidden(refname, refname_full)) {
+	if (ref_is_hidden(refname, refname_full, hidden_refs)) {
 		o->flags |= HIDDEN_REF;
 		return 1;
 	}
@@ -1171,11 +1175,12 @@ static int mark_our_ref(const char *refname, const char *refname_full,
 }
 
 static int check_ref(const char *refname_full, const struct object_id *oid,
-		     int flag UNUSED, void *cb_data UNUSED)
+		     int flag UNUSED, void *cb_data)
 {
 	const char *refname = strip_namespace(refname_full);
+	struct upload_pack_data *data = cb_data;
 
-	mark_our_ref(refname, refname_full, oid);
+	mark_our_ref(refname, refname_full, oid, &data->hidden_refs);
 	return 0;
 }
 
@@ -1204,7 +1209,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	struct object_id peeled;
 	struct upload_pack_data *data = cb_data;
 
-	if (mark_our_ref(refname_nons, refname, oid))
+	if (mark_our_ref(refname_nons, refname, oid, &data->hidden_refs))
 		return 0;
 
 	if (capabilities) {
@@ -1327,7 +1332,7 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
@@ -1375,8 +1380,8 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 		advertise_shallow_grafts(1);
 		packet_flush(1);
 	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, &data);
+		for_each_namespaced_ref(check_ref, &data);
 	}
 
 	if (!advertise_refs) {
@@ -1441,6 +1446,7 @@ static int parse_want(struct packet_writer *writer, const char *line,
 
 static int parse_want_ref(struct packet_writer *writer, const char *line,
 			  struct string_list *wanted_refs,
+			  struct string_list *hidden_refs,
 			  struct object_array *want_obj)
 {
 	const char *refname_nons;
@@ -1451,7 +1457,7 @@ static int parse_want_ref(struct packet_writer *writer, const char *line,
 		struct strbuf refname = STRBUF_INIT;
 
 		strbuf_addf(&refname, "%s%s", get_git_namespace(), refname_nons);
-		if (ref_is_hidden(refname_nons, refname.buf) ||
+		if (ref_is_hidden(refname_nons, refname.buf, hidden_refs) ||
 		    read_ref(refname.buf, &oid)) {
 			packet_writer_error(writer, "unknown ref %s", refname_nons);
 			die("unknown ref %s", refname_nons);
@@ -1508,7 +1514,7 @@ static void process_args(struct packet_reader *request,
 			continue;
 		if (data->allow_ref_in_want &&
 		    parse_want_ref(&data->writer, arg, &data->wanted_refs,
-				   &data->want_obj))
+				   &data->hidden_refs, &data->want_obj))
 			continue;
 		/* process have line */
 		if (parse_have(arg, &data->haves))
-- 
2.38.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* [PATCH v5 3/7] revision: move together exclusion-related functions
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
  2022-11-11  6:49   ` [PATCH v5 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
  2022-11-11  6:49   ` [PATCH v5 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
@ 2022-11-11  6:50   ` Patrick Steinhardt
  2022-11-11  6:50   ` [PATCH v5 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2402 bytes --]

Move together the definitions of functions that handle exclusions of
refs so that related functionality lives in a single place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 revision.c | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/revision.c b/revision.c
index 0760e78936..be755670e2 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,14 +1517,6 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-struct all_refs_cb {
-	int all_flags;
-	int warned_bad_reflog;
-	struct rev_info *all_revs;
-	const char *name_for_errormsg;
-	struct worktree *wt;
-};
-
 int ref_excluded(struct string_list *ref_excludes, const char *path)
 {
 	struct string_list_item *item;
@@ -1538,6 +1530,32 @@ int ref_excluded(struct string_list *ref_excludes, const char *path)
 	return 0;
 }
 
+void clear_ref_exclusion(struct string_list **ref_excludes_p)
+{
+	if (*ref_excludes_p) {
+		string_list_clear(*ref_excludes_p, 0);
+		free(*ref_excludes_p);
+	}
+	*ref_excludes_p = NULL;
+}
+
+void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+{
+	if (!*ref_excludes_p) {
+		CALLOC_ARRAY(*ref_excludes_p, 1);
+		(*ref_excludes_p)->strdup_strings = 1;
+	}
+	string_list_append(*ref_excludes_p, exclude);
+}
+
+struct all_refs_cb {
+	int all_flags;
+	int warned_bad_reflog;
+	struct rev_info *all_revs;
+	const char *name_for_errormsg;
+	struct worktree *wt;
+};
+
 static int handle_one_ref(const char *path, const struct object_id *oid,
 			  int flag UNUSED,
 			  void *cb_data)
@@ -1563,24 +1581,6 @@ static void init_all_refs_cb(struct all_refs_cb *cb, struct rev_info *revs,
 	cb->wt = NULL;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
-{
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
-}
-
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
-{
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
-}
-
 static void handle_refs(struct ref_store *refs,
 			struct rev_info *revs, unsigned flags,
 			int (*for_each)(struct ref_store *, each_ref_fn, void *))
-- 
2.38.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* [PATCH v5 4/7] revision: introduce struct to handle exclusions
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2022-11-11  6:50   ` [PATCH v5 3/7] revision: move together exclusion-related functions Patrick Steinhardt
@ 2022-11-11  6:50   ` Patrick Steinhardt
  2022-11-11  6:50   ` [PATCH v5 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 8642 bytes --]

The functions that handle exclusion of refs work on a single string
list. We're about to add a second mechanism for excluding refs though,
and it makes sense to reuse much of the same architecture for both kinds
of exclusion.

Introduce a new `struct ref_exclusions` that encapsulates all the logic
related to excluding refs and move the `struct string_list` that holds
all wildmatch patterns of excluded refs into it. Rename functions that
operate on this struct to match its name.
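
The shape of such a wrapper can be sketched as follows. This is an
illustration only: the `toy_*` names and the fixed-size array are
hypothetical, and POSIX `fnmatch()` is used as a stand-in for Git's
`wildmatch()`, whose pattern semantics are not identical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

#define TOY_MAX_PATTERNS 16

/* Hypothetical container; the real struct wraps a `struct string_list`. */
struct toy_ref_exclusions {
	const char *excluded_refs[TOY_MAX_PATTERNS];
	size_t nr;
};

#define TOY_REF_EXCLUSIONS_INIT { { NULL }, 0 }

/* Remember a pattern; the caller keeps ownership of the string. */
static void toy_add_ref_exclusion(struct toy_ref_exclusions *e,
				  const char *pattern)
{
	assert(e->nr < TOY_MAX_PATTERNS);
	e->excluded_refs[e->nr++] = pattern;
}

/* Return 1 if any stored pattern matches `path`, 0 otherwise. */
static int toy_ref_excluded(const struct toy_ref_exclusions *e,
			    const char *path)
{
	size_t i;

	for (i = 0; i < e->nr; i++)
		if (!fnmatch(e->excluded_refs[i], path, 0))
			return 1;
	return 0;
}

/* Forget all patterns so the next pseudo-ref option starts fresh. */
static void toy_clear_ref_exclusions(struct toy_ref_exclusions *e)
{
	e->nr = 0;
}
```

Because the struct is always initialized, callers no longer need to
handle a NULL list, and a second kind of exclusion could later be added
as another member without touching the call sites.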

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/rev-parse.c |  8 ++++----
 revision.c          | 48 +++++++++++++++++++++------------------------
 revision.h          | 27 +++++++++++++++++++------
 3 files changed, 47 insertions(+), 36 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 8f61050bde..7fa5b6991b 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -39,7 +39,7 @@ static int abbrev_ref_strict;
 static int output_sq;
 
 static int stuck_long;
-static struct string_list *ref_excludes;
+static struct ref_exclusions ref_excludes = REF_EXCLUSIONS_INIT;
 
 /*
  * Some arguments are relevant "revision" arguments,
@@ -198,7 +198,7 @@ static int show_default(void)
 static int show_reference(const char *refname, const struct object_id *oid,
 			  int flag UNUSED, void *cb_data UNUSED)
 {
-	if (ref_excluded(ref_excludes, refname))
+	if (ref_excluded(&ref_excludes, refname))
 		return 0;
 	show_rev(NORMAL, oid, refname);
 	return 0;
@@ -585,7 +585,7 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
 		for_each_glob_ref_in(show_reference, pattern, prefix, NULL);
 	else
 		for_each_ref_in(prefix, show_reference, NULL);
-	clear_ref_exclusion(&ref_excludes);
+	clear_ref_exclusions(&ref_excludes);
 }
 
 enum format_type {
@@ -863,7 +863,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "--all")) {
 				for_each_ref(show_reference, NULL);
-				clear_ref_exclusion(&ref_excludes);
+				clear_ref_exclusions(&ref_excludes);
 				continue;
 			}
 			if (skip_prefix(arg, "--disambiguate=", &arg)) {
diff --git a/revision.c b/revision.c
index be755670e2..fe3ec98f46 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,35 +1517,30 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-int ref_excluded(struct string_list *ref_excludes, const char *path)
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
 	struct string_list_item *item;
-
-	if (!ref_excludes)
-		return 0;
-	for_each_string_list_item(item, ref_excludes) {
+	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
 	return 0;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
+void init_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
+	struct ref_exclusions blank = REF_EXCLUSIONS_INIT;
+	memcpy(exclusions, &blank, sizeof(*exclusions));
 }
 
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
+	string_list_clear(&exclusions->excluded_refs, 0);
+}
+
+void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
+{
+	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
 struct all_refs_cb {
@@ -1563,7 +1558,7 @@ static int handle_one_ref(const char *path, const struct object_id *oid,
 	struct all_refs_cb *cb = cb_data;
 	struct object *object;
 
-	if (ref_excluded(cb->all_revs->ref_excludes, path))
+	if (ref_excluded(&cb->all_revs->ref_excludes, path))
 	    return 0;
 
 	object = get_reference(cb->all_revs, path, oid, cb->all_flags);
@@ -1901,6 +1896,7 @@ void repo_init_revisions(struct repository *r,
 
 	init_display_notes(&revs->notes_opt);
 	list_objects_filter_init(&revs->filter);
+	init_ref_exclusions(&revs->ref_excludes);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -2689,10 +2685,10 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 			init_all_refs_cb(&cb, revs, *flags);
 			other_head_refs(handle_one_ref, &cb);
 		}
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--branches")) {
 		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--bisect")) {
 		read_bisect_terms(&term_bad, &term_good);
 		handle_refs(refs, revs, *flags, for_each_bad_bisect_ref);
@@ -2701,15 +2697,15 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		revs->bisect = 1;
 	} else if (!strcmp(arg, "--tags")) {
 		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--remotes")) {
 		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref(handle_one_ref, optarg, &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 		return argcount;
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
@@ -2718,17 +2714,17 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--tags=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--reflog")) {
 		add_reflogs_to_pending(revs, *flags);
 	} else if (!strcmp(arg, "--indexed-objects")) {
diff --git a/revision.h b/revision.h
index afe1b77985..5c8ab16047 100644
--- a/revision.h
+++ b/revision.h
@@ -81,6 +81,21 @@ struct rev_cmdline_info {
 	} *rev;
 };
 
+struct ref_exclusions {
+	/*
+	 * Excluded refs is a list of wildmatch patterns. If any of the
+	 * patterns matches, the reference will be excluded.
+	 */
+	struct string_list excluded_refs;
+};
+
+/**
+ * Initialize a `struct ref_exclusions` with a macro.
+ */
+#define REF_EXCLUSIONS_INIT { \
+	.excluded_refs = STRING_LIST_INIT_DUP, \
+}
+
 struct oidset;
 struct topo_walk_info;
 
@@ -103,7 +118,7 @@ struct rev_info {
 	struct list_objects_filter_options filter;
 
 	/* excluding from --branches, --refs, etc. expansion */
-	struct string_list *ref_excludes;
+	struct ref_exclusions ref_excludes;
 
 	/* Basic information */
 	const char *prefix;
@@ -439,12 +454,12 @@ void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees)
 void show_object_with_name(FILE *, struct object *, const char *);
 
 /**
- * Helpers to check if a "struct string_list" item matches with
- * wildmatch().
+ * Helpers to check if a reference should be excluded.
  */
-int ref_excluded(struct string_list *, const char *path);
-void clear_ref_exclusion(struct string_list **);
-void add_ref_exclusion(struct string_list **, const char *exclude);
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
+void init_ref_exclusions(struct ref_exclusions *);
+void clear_ref_exclusions(struct ref_exclusions *);
+void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
 
 /**
  * This function can be used if you want to add commit objects as revision
-- 
2.38.1



* [PATCH v5 5/7] revision: add new parameter to exclude hidden refs
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2022-11-11  6:50   ` [PATCH v5 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
@ 2022-11-11  6:50   ` Patrick Steinhardt
  2022-11-11  6:50   ` [PATCH v5 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King


Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs` config, but
there is currently no easy way to obtain the list of all visible or
hidden refs. We need exactly that for a performance improvement in our
connectivity check.

Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--glob`.
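
For illustration only (this is not part of the patch): a simplified
model of the "set once, consumed by the next pseudo-ref" behaviour. It
assumes a single hidden prefix and a plain prefix comparison, whereas
the real `ref_is_hidden()` also handles negated and `^`-prefixed
patterns. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified model of the hidden-ref state. */
struct sketch_hidden {
	const char *prefix;	/* single hidden prefix, NULL if none */
	int configured;		/* set once --exclude-hidden= was seen */
};

/*
 * Mirrors the "passed more than once" check, but returns -1 on a
 * repeated call instead of dying. Returns 0 on success.
 */
int sketch_exclude_hidden(struct sketch_hidden *h, const char *prefix)
{
	assert(h);
	if (h->configured)
		return -1;
	h->prefix = prefix;
	h->configured = 1;
	return 0;
}

/* A ref counts as hidden when it starts with the configured prefix. */
int sketch_is_hidden(const struct sketch_hidden *h, const char *refname)
{
	if (!h->configured || !h->prefix)
		return 0;
	return !strncmp(refname, h->prefix, strlen(h->prefix));
}

/* What a pseudo-ref such as --all does after expanding the refs. */
void sketch_consume(struct sketch_hidden *h)
{
	h->prefix = NULL;
	h->configured = 0;
}
```

The separate `configured` flag is what lets the code tell "no hidden
refs exist" apart from "the option was never given", and resetting it in
the consume step is why the option may be passed again before a later
`--all`.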

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/rev-list-options.txt |   7 ++
 builtin/rev-list.c                 |   1 +
 revision.c                         |  55 +++++++++-
 revision.h                         |  16 +++
 t/t6021-rev-list-exclude-hidden.sh | 163 +++++++++++++++++++++++++++++
 5 files changed, 241 insertions(+), 1 deletion(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 1837509566..ff68e48406 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[receive|uploadpack]::
+	Do not include refs that would be hidden by `git-receive-pack` or
+	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
+	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+	linkgit:git-config[1]). This option affects the next pseudo-ref option
+	`--all` or `--glob` and is cleared after processing them.
+
 --reflog::
 	Pretend as if all objects mentioned by reflogs are listed on the
 	command line as `<commit>`.
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 3acd93f71e..d42db0b0cc 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -38,6 +38,7 @@ static const char rev_list_usage[] =
 "    --tags\n"
 "    --remotes\n"
 "    --stdin\n"
+"    --exclude-hidden=[receive|uploadpack]\n"
 "    --quiet\n"
 "  ordering output:\n"
 "    --topo-order\n"
diff --git a/revision.c b/revision.c
index fe3ec98f46..bc32fb819a 100644
--- a/revision.c
+++ b/revision.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "config.h"
 #include "object-store.h"
 #include "tag.h"
 #include "blob.h"
@@ -1519,11 +1520,17 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
+	const char *stripped_path = strip_namespace(path);
 	struct string_list_item *item;
+
 	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
+
+	if (ref_is_hidden(stripped_path, path, &exclusions->hidden_refs))
+		return 1;
+
 	return 0;
 }
 
@@ -1536,6 +1543,8 @@ void init_ref_exclusions(struct ref_exclusions *exclusions)
 void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
 	string_list_clear(&exclusions->excluded_refs, 0);
+	string_list_clear(&exclusions->hidden_refs, 0);
+	exclusions->hidden_refs_configured = 0;
 }
 
 void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
@@ -1543,6 +1552,35 @@ void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
 	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
+struct exclude_hidden_refs_cb {
+	struct ref_exclusions *exclusions;
+	const char *section;
+};
+
+static int hide_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct exclude_hidden_refs_cb *cb = cb_data;
+	cb->exclusions->hidden_refs_configured = 1;
+	return parse_hide_refs_config(var, value, cb->section,
+				      &cb->exclusions->hidden_refs);
+}
+
+void exclude_hidden_refs(struct ref_exclusions *exclusions, const char *section)
+{
+	struct exclude_hidden_refs_cb cb;
+
+	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
+		die(_("unsupported section for hidden refs: %s"), section);
+
+	if (exclusions->hidden_refs_configured)
+		die(_("--exclude-hidden= passed more than once"));
+
+	cb.exclusions = exclusions;
+	cb.section = section;
+
+	git_config(hide_refs_config, &cb);
+}
+
 struct all_refs_cb {
 	int all_flags;
 	int warned_bad_reflog;
@@ -2221,7 +2259,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	    !strcmp(arg, "--bisect") || starts_with(arg, "--glob=") ||
 	    !strcmp(arg, "--indexed-objects") ||
 	    !strcmp(arg, "--alternate-refs") ||
-	    starts_with(arg, "--exclude=") ||
+	    starts_with(arg, "--exclude=") || starts_with(arg, "--exclude-hidden=") ||
 	    starts_with(arg, "--branches=") || starts_with(arg, "--tags=") ||
 	    starts_with(arg, "--remotes=") || starts_with(arg, "--no-walk="))
 	{
@@ -2687,6 +2725,8 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		}
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--branches")) {
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --branches"));
 		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--bisect")) {
@@ -2696,9 +2736,13 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 			    for_each_good_bisect_ref);
 		revs->bisect = 1;
 	} else if (!strcmp(arg, "--tags")) {
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --tags"));
 		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--remotes")) {
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --remotes"));
 		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
@@ -2710,18 +2754,27 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("exclude-hidden", argv, &optarg))) {
+		exclude_hidden_refs(&revs->ref_excludes, optarg);
+		return argcount;
 	} else if (skip_prefix(arg, "--branches=", &optarg)) {
 		struct all_refs_cb cb;
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --branches"));
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--tags=", &optarg)) {
 		struct all_refs_cb cb;
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --tags"));
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
 		struct all_refs_cb cb;
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --remotes"));
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
 		clear_ref_exclusions(&revs->ref_excludes);
diff --git a/revision.h b/revision.h
index 5c8ab16047..adb810c2f8 100644
--- a/revision.h
+++ b/revision.h
@@ -87,6 +87,19 @@ struct ref_exclusions {
 	 * patterns matches, the reference will be excluded.
 	 */
 	struct string_list excluded_refs;
+
+	/*
+	 * Hidden refs is a list of patterns that is to be hidden via
+	 * `ref_is_hidden()`.
+	 */
+	struct string_list hidden_refs;
+
+	/*
+	 * Indicates whether hidden refs have been configured. This is to
+	 * distinguish between no hidden refs existing and hidden refs not
+	 * being parsed.
+	 */
+	char hidden_refs_configured;
 };
 
 /**
@@ -94,6 +107,7 @@ struct ref_exclusions {
  */
 #define REF_EXCLUSIONS_INIT { \
 	.excluded_refs = STRING_LIST_INIT_DUP, \
+	.hidden_refs = STRING_LIST_INIT_DUP, \
 }
 
 struct oidset;
@@ -456,10 +470,12 @@ void show_object_with_name(FILE *, struct object *, const char *);
 /**
  * Helpers to check if a reference should be excluded.
  */
+
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
 void init_ref_exclusions(struct ref_exclusions *);
 void clear_ref_exclusions(struct ref_exclusions *);
 void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
+void exclude_hidden_refs(struct ref_exclusions *, const char *section);
 
 /**
  * This function can be used if you want to add commit objects as revision
diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
new file mode 100755
index 0000000000..018796d41c
--- /dev/null
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -0,0 +1,163 @@
+#!/bin/sh
+
+test_description='git rev-list --exclude-hidden test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit_bulk --id=commit --ref=refs/heads/main 1 &&
+	COMMIT=$(git rev-parse refs/heads/main) &&
+	test_commit_bulk --id=tag --ref=refs/tags/lightweight 1 &&
+	TAG=$(git rev-parse refs/tags/lightweight) &&
+	test_commit_bulk --id=hidden --ref=refs/hidden/commit 1 &&
+	HIDDEN=$(git rev-parse refs/hidden/commit) &&
+	test_commit_bulk --id=namespace --ref=refs/namespaces/namespace/refs/namespaced/commit 1 &&
+	NAMESPACE=$(git rev-parse refs/namespaces/namespace/refs/namespaced/commit)
+'
+
+test_expect_success 'invalid section' '
+	echo "fatal: unsupported section for hidden refs: unsupported" >expected &&
+	test_must_fail git rev-list --exclude-hidden=unsupported 2>err &&
+	test_cmp expected err
+'
+
+for section in receive uploadpack
+do
+	test_expect_success "$section: passed multiple times" '
+		echo "fatal: --exclude-hidden= passed more than once" >expected &&
+		test_must_fail git rev-list --exclude-hidden=$section --exclude-hidden=$section 2>err &&
+		test_cmp expected err
+	'
+
+	test_expect_success "$section: without hiddenRefs" '
+		git rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden via transfer.hideRefs" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden via $section.hideRefs" '
+		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: respects both transfer.hideRefs and $section.hideRefs" '
+		git -c transfer.hideRefs=refs/tags/ -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: negation without hidden refs marks everything as uninteresting" '
+		git rev-list --all --exclude-hidden=$section --not --all >out &&
+		test_must_be_empty out
+	'
+
+	test_expect_success "$section: negation with hidden refs marks them as interesting" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --all --exclude-hidden=$section --not --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden refs and excludes work together" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude=refs/tags/* --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: excluded hidden refs get reset" '
+		git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: excluded hidden refs can be used with multiple pseudo-refs" '
+		git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --exclude-hidden=$section --all >out &&
+		test_must_be_empty out
+	'
+
+	test_expect_success "$section: works with --glob" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --glob=refs/h* >out &&
+		cat >expected <<-EOF &&
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: operates on stripped refs by default" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaced/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: does not hide namespace by default" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: can operate on unstripped refs" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=^refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	for pseudoopt in remotes branches tags
+	do
+		test_expect_success "$section: fails with --$pseudoopt" '
+			test_must_fail git rev-list --exclude-hidden=$section --$pseudoopt 2>err &&
+			test_i18ngrep "error: --exclude-hidden cannot be used together with --$pseudoopt" err
+		'
+
+		test_expect_success "$section: fails with --$pseudoopt=pattern" '
+			test_must_fail git rev-list --exclude-hidden=$section --$pseudoopt=pattern 2>err &&
+			test_i18ngrep "error: --exclude-hidden cannot be used together with --$pseudoopt" err
+		'
+	done
+done
+
+test_done
-- 
2.38.1



* [PATCH v5 6/7] rev-parse: add `--exclude-hidden=` option
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2022-11-11  6:50   ` [PATCH v5 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
@ 2022-11-11  6:50   ` Patrick Steinhardt
  2022-11-11  6:50   ` [PATCH v5 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
  2022-11-11 22:18   ` [PATCH v5 0/7] " Taylor Blau
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King


Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-rev-parse.txt |  7 ++++++
 builtin/rev-parse.c             | 10 +++++++++
 t/t6018-rev-list-glob.sh        | 40 +++++++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 6b8ca085aa..bcd8069287 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -197,6 +197,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[receive|uploadpack]::
+	Do not include refs that would be hidden by `git-receive-pack` or
+	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
+	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+	linkgit:git-config[1]). This option affects the next pseudo-ref option
+	`--all` or `--glob` and is cleared after processing them.
+
 --disambiguate=<prefix>::
 	Show every object whose name begins with the given prefix.
 	The <prefix> must be at least 4 hexadecimal digits long to
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 7fa5b6991b..b5666a03bd 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -876,10 +876,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (opt_with_value(arg, "--branches", &arg)) {
+				if (ref_excludes.hidden_refs_configured)
+					return error(_("--exclude-hidden cannot be used together with --branches"));
 				handle_ref_opt(arg, "refs/heads/");
 				continue;
 			}
 			if (opt_with_value(arg, "--tags", &arg)) {
+				if (ref_excludes.hidden_refs_configured)
+					return error(_("--exclude-hidden cannot be used together with --tags"));
 				handle_ref_opt(arg, "refs/tags/");
 				continue;
 			}
@@ -888,6 +892,8 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (opt_with_value(arg, "--remotes", &arg)) {
+				if (ref_excludes.hidden_refs_configured)
+					return error(_("--exclude-hidden cannot be used together with --remotes"));
 				handle_ref_opt(arg, "refs/remotes/");
 				continue;
 			}
@@ -895,6 +901,10 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				add_ref_exclusion(&ref_excludes, arg);
 				continue;
 			}
+			if (skip_prefix(arg, "--exclude-hidden=", &arg)) {
+				exclude_hidden_refs(&ref_excludes, arg);
+				continue;
+			}
 			if (!strcmp(arg, "--show-toplevel")) {
 				const char *work_tree = get_git_work_tree();
 				if (work_tree)
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index e1abc5c2b3..aabf590dda 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -187,6 +187,46 @@ test_expect_success 'rev-parse --exclude=ref with --remotes=glob' '
 	compare rev-parse "--exclude=upstream/x --remotes=upstream/*" "upstream/one upstream/two"
 '
 
+for section in receive uploadpack
+do
+	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
+		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags" "--exclude-hidden=$section --all"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
+		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude=refs/heads/subspace/* --all" "--exclude-hidden=$section --all"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section with --glob" '
+		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude=refs/heads/subspace/* --glob=refs/heads/*" "--exclude-hidden=$section --glob=refs/heads/*"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section can be passed once per pseudo-ref" '
+		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags --branches --tags" "--exclude-hidden=$section --all --exclude-hidden=$section --all"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section can only be passed once per pseudo-ref" '
+		echo "fatal: --exclude-hidden= passed more than once" >expected &&
+		test_must_fail git rev-parse --exclude-hidden=$section --exclude-hidden=$section 2>err &&
+		test_cmp expected err
+	'
+
+	for pseudoopt in branches tags remotes
+	do
+		test_expect_success "rev-parse --exclude-hidden=$section fails with --$pseudoopt" '
+			echo "error: --exclude-hidden cannot be used together with --$pseudoopt" >expected &&
+			test_must_fail git rev-parse --exclude-hidden=$section --$pseudoopt 2>err &&
+			test_cmp expected err
+		'
+
+		test_expect_success "rev-parse --exclude-hidden=$section fails with --$pseudoopt=pattern" '
+			echo "error: --exclude-hidden cannot be used together with --$pseudoopt" >expected &&
+			test_must_fail git rev-parse --exclude-hidden=$section --$pseudoopt=pattern 2>err &&
+			test_cmp expected err
+		'
+	done
+done
+
 test_expect_success 'rev-list --exclude=glob with --branches=glob' '
 	compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one subspace/two"
 '
-- 
2.38.1



* [PATCH v5 7/7] receive-pack: only use visible refs for connectivity check
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2022-11-11  6:50   ` [PATCH v5 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
@ 2022-11-11  6:50   ` Patrick Steinhardt
  2022-11-11 22:18   ` [PATCH v5 0/7] " Taylor Blau
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-11  6:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King


When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.

We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that a push makes reachable large parts of the
object graph that were previously reachable only via hidden refs. The
common case is to push incremental changes that build on top of the
visible object graph.

This provides a huge boost to performance in the mentioned repository,
where the vast majority of its refs are hidden. Pushing a new commit
into this repo with `transfer.hideRefs` set up to hide 6.8 million of
7 million refs, as it is configured in Gitaly, leads to a 4.5-fold
speedup:

    Benchmark 1: main
      Time (mean ± σ):     30.977 s ±  0.157 s    [User: 30.226 s, System: 1.083 s]
      Range (min … max):   30.796 s … 31.071 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      6.799 s ±  0.063 s    [User: 6.803 s, System: 0.354 s]
      Range (min … max):    6.729 s …  6.850 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        4.56 ± 0.05 times faster than 'main'

When no refs are hidden at all, we mostly go through the same code paths
as before, so there is no change in performance in that case:

    Benchmark 1: main
      Time (mean ± σ):     48.188 s ±  0.432 s    [User: 49.326 s, System: 5.009 s]
      Range (min … max):   47.706 s … 48.539 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     48.027 s ±  0.500 s    [User: 48.934 s, System: 5.025 s]
      Range (min … max):   47.504 s … 48.500 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.00 ± 0.01 times faster than 'main'
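
For illustration only (this is not part of the patch): a toy model of
the trade-off described above. It assumes a strictly linear history
where object i has parent i-1, and it marks ancestors of old tips
eagerly, which the real revision walk does not do. All `sketch_*` names
are hypothetical.

```c
#include <assert.h>

#define SKETCH_MAX 16

/* Hypothetical linear history: object i has parent i-1, 0 is the root. */
struct sketch_graph {
	int nr;				/* number of objects, <= SKETCH_MAX */
	int present[SKETCH_MAX];	/* 1 if the object exists */
	int uninteresting[SKETCH_MAX];	/* 1 if reachable from an old tip */
};

/* Mark a preexisting ref tip and all of its ancestors as uninteresting. */
void sketch_mark_uninteresting(struct sketch_graph *g, int tip)
{
	int i;

	assert(tip >= 0 && tip < g->nr);
	for (i = tip; i >= 0; i--)
		g->uninteresting[i] = 1;
}

/*
 * Walk from `new_tip` towards the root until an uninteresting object
 * is hit. Returns the number of objects visited, or -1 if a missing
 * object is encountered on the way.
 */
int sketch_check_connected(const struct sketch_graph *g, int new_tip)
{
	int i, walked = 0;

	assert(new_tip >= 0 && new_tip < g->nr);
	for (i = new_tip; i >= 0 && !g->uninteresting[i]; i--) {
		if (!g->present[i])
			return -1;
		walked++;
	}
	return walked;
}
```

With a visible ref at object 5, a hidden ref at object 8 and a pushed
tip at object 9, marking both refs visits one object, while marking only
the visible ref visits four. The walk is longer, but the hidden ref
never has to be loaded.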

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c | 2 ++
 connected.c            | 3 +++
 connected.h            | 7 +++++++
 3 files changed, 12 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1e24b31a0a..a90af30363 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1929,6 +1929,8 @@ static void execute_commands(struct command *commands,
 	opt.err_fd = err_fd;
 	opt.progress = err_fd && !quiet;
 	opt.env = tmp_objdir_env(tmp_objdir);
+	opt.exclude_hidden_refs_section = "receive";
+
 	if (check_connected(iterate_receive_command_list, &data, &opt))
 		set_connectivity_errors(commands, si);
 
diff --git a/connected.c b/connected.c
index 74a20cb32e..4f6388eed7 100644
--- a/connected.c
+++ b/connected.c
@@ -100,6 +100,9 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		strvec_push(&rev_list.args, "--not");
+		if (opt->exclude_hidden_refs_section)
+			strvec_pushf(&rev_list.args, "--exclude-hidden=%s",
+				     opt->exclude_hidden_refs_section);
 		strvec_push(&rev_list.args, "--all");
 	}
 	strvec_push(&rev_list.args, "--quiet");
diff --git a/connected.h b/connected.h
index 6e59c92aa3..16b2c84f2e 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,13 @@ struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If not NULL, use `--exclude-hidden=$section` to exclude all refs
+	 * hidden via the `$section.hideRefs` config from the set of
+	 * already-reachable refs.
+	 */
+	const char *exclude_hidden_refs_section;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }
-- 
2.38.1



* Re: [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2022-11-11  6:50   ` [PATCH v5 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
@ 2022-11-11 22:18   ` Taylor Blau
  2022-11-15 17:26     ` Jeff King
  7 siblings, 1 reply; 88+ messages in thread
From: Taylor Blau @ 2022-11-11 22:18 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

On Fri, Nov 11, 2022 at 07:49:49AM +0100, Patrick Steinhardt wrote:
> Patrick Steinhardt (7):
>   refs: fix memory leak when parsing hideRefs config
>   refs: get rid of global list of hidden refs
>   revision: move together exclusion-related functions
>   revision: introduce struct to handle exclusions
>   revision: add new parameter to exclude hidden refs
>   rev-parse: add `--exclude-hidden=` option
>   receive-pack: only use visible refs for connectivity check

The new version is looking pretty good to my eyes, though I would like
to hear from Peff, too, since he caught a few things that I missed in
the previous rounds.

I have to say, the final patch is *really* nice ;-).

Thanks,
Taylor


* Re: [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-11 22:18   ` [PATCH v5 0/7] " Taylor Blau
@ 2022-11-15 17:26     ` Jeff King
  2022-11-16 21:22       ` Taylor Blau
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff King @ 2022-11-15 17:26 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Patrick Steinhardt, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason

On Fri, Nov 11, 2022 at 05:18:23PM -0500, Taylor Blau wrote:

> On Fri, Nov 11, 2022 at 07:49:49AM +0100, Patrick Steinhardt wrote:
> > Patrick Steinhardt (7):
> >   refs: fix memory leak when parsing hideRefs config
> >   refs: get rid of global list of hidden refs
> >   revision: move together exclusion-related functions
> >   revision: introduce struct to handle exclusions
> >   revision: add new parameter to exclude hidden refs
> >   rev-parse: add `--exclude-hidden=` option
> >   receive-pack: only use visible refs for connectivity check
> 
> The new version is looking pretty good to my eyes, though I would like
> to hear from Peff, too, since he caught a few things that I missed in
> the previous rounds.

This looks good to me, too. There's a typo (s/seciton/section/) in the
commit message of patch 6, but definitely not worth a re-roll. :)

I admit I didn't think _too_ deeply about the interaction with
namespaces, and whether there are any corner cases. I was happy to see
the tests there, and I assume in writing them that Patrick matched how
receive-pack, etc, behave.

-Peff


* Re: [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-15 17:26     ` Jeff King
@ 2022-11-16 21:22       ` Taylor Blau
  2022-11-16 22:04         ` Jeff King
  0 siblings, 1 reply; 88+ messages in thread
From: Taylor Blau @ 2022-11-16 21:22 UTC (permalink / raw)
  To: Jeff King
  Cc: Patrick Steinhardt, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason

On Tue, Nov 15, 2022 at 12:26:25PM -0500, Jeff King wrote:
> On Fri, Nov 11, 2022 at 05:18:23PM -0500, Taylor Blau wrote:
>
> > On Fri, Nov 11, 2022 at 07:49:49AM +0100, Patrick Steinhardt wrote:
> > > Patrick Steinhardt (7):
> > >   refs: fix memory leak when parsing hideRefs config
> > >   refs: get rid of global list of hidden refs
> > >   revision: move together exclusion-related functions
> > >   revision: introduce struct to handle exclusions
> > >   revision: add new parameter to exclude hidden refs
> > >   rev-parse: add `--exclude-hidden=` option
> > >   receive-pack: only use visible refs for connectivity check
> >
> > The new version is looking pretty good to my eyes, though I would like
> > to hear from Peff, too, since he caught a few things that I missed in
> > the previous rounds.
>
> This looks good to me, too. There's a typo (s/seciton/section/) in the
> commit message of patch 6, but definitely not worth a re-roll. :)

Hmm. It looks like this is broken in CI when the default initial branch
name is something other than "master":

    $ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main ./t6021-rev-list-exclude-hidden.sh -i --verbose-only=12 -x
    [...]
    expecting success of 6021.12 'receive: excluded hidden refs can be used with multiple pseudo-refs':
        git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --exclude-hidden=$section --all >out &&
        test_must_be_empty out

    + git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=receive --all --exclude-hidden=receive --all
    + test_must_be_empty out
    + test 1 -ne 1
    + test_path_is_file out
    + test 1 -ne 1
    + test -f out
    + test -s out
    + echo 'out' is not empty, it contains:
    'out' is not empty, it contains:
    + cat out
    d2e88f5a45c63e4ec7e1fd303542944487abe89a
    + return 1
    error: last command exited with $?=1
    not ok 12 - receive: excluded hidden refs can be used with multiple pseudo-refs
    #
    #			git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --exclude-hidden=$section --all >out &&
    #			test_must_be_empty out
    #
    1..12

I haven't looked too deeply at what is going on here, but let's make
sure to resolve this before graduating the topic down (which I would
otherwise like to do in the next push-out, probably tomorrow or the next
day).

Thanks,
Taylor


* Re: [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-16 21:22       ` Taylor Blau
@ 2022-11-16 22:04         ` Jeff King
  2022-11-16 22:33           ` Taylor Blau
  0 siblings, 1 reply; 88+ messages in thread
From: Jeff King @ 2022-11-16 22:04 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Patrick Steinhardt, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason

On Wed, Nov 16, 2022 at 04:22:24PM -0500, Taylor Blau wrote:

> > This looks good to me, too. There's a typo (s/seciton/section/) in the
> > commit message of patch 6, but definitely not worth a re-roll. :)
> 
> Hmm. It looks like this is broken in CI when the default initial branch
> name is something other than "master":
> 
>     $ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main ./t6021-rev-list-exclude-hidden.sh -i --verbose-only=12 -x
>     [...]
>     expecting success of 6021.12 'receive: excluded hidden refs can be used with multiple pseudo-refs':
>         git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --exclude-hidden=$section --all >out &&
>         test_must_be_empty out
> 
>     + git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=receive --all --exclude-hidden=receive --all
>     + test_must_be_empty out
>     + test 1 -ne 1
>     + test_path_is_file out
>     + test 1 -ne 1
>     + test -f out
>     + test -s out
>     + echo 'out' is not empty, it contains:
>     'out' is not empty, it contains:
>     + cat out
>     d2e88f5a45c63e4ec7e1fd303542944487abe89a
>     + return 1
>     error: last command exited with $?=1
>     not ok 12 - receive: excluded hidden refs can be used with multiple pseudo-refs
>     #
>     #			git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --exclude-hidden=$section --all >out &&
>     #			test_must_be_empty out
>     #
>     1..12
> 
> I haven't looked too deeply at what is going on here, but let's make
> sure to resolve this before graduating the topic down (which I would
> otherwise like to do in the next push-out, probably tomorrow or the next
> day).

The issue is that some of the tests assume that hiding "refs/" should
produce no output from "--exclude-hidden=receive --all". But it will
also show HEAD, even if it points to a hidden ref (which I think is OK,
and matches what receive-pack would do).

But because the setup uses "main" as one of the sample refs, HEAD may or
may not be valid, depending on what it points to (without setting
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME it points to master, which is
unborn).

So the fix is just:

diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
index 018796d41c..1543a93fe0 100755
--- a/t/t6021-rev-list-exclude-hidden.sh
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -5,8 +5,8 @@ test_description='git rev-list --exclude-hidden test'
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-	test_commit_bulk --id=commit --ref=refs/heads/main 1 &&
-	COMMIT=$(git rev-parse refs/heads/main) &&
+	test_commit_bulk --id=commit --ref=refs/heads/foo 1 &&
+	COMMIT=$(git rev-parse refs/heads/foo) &&
 	test_commit_bulk --id=tag --ref=refs/tags/lightweight 1 &&
 	TAG=$(git rev-parse refs/tags/lightweight) &&
 	test_commit_bulk --id=hidden --ref=refs/hidden/commit 1 &&

but Patrick may want to select a more meaningful name. :)

Notably I don't think we want to do the usual

  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME

at the top of the script. We really don't mean to depend on having a
specific branch that HEAD points to here.

-Peff


* Re: [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-16 22:04         ` Jeff King
@ 2022-11-16 22:33           ` Taylor Blau
  2022-11-17  5:45             ` Patrick Steinhardt
  0 siblings, 1 reply; 88+ messages in thread
From: Taylor Blau @ 2022-11-16 22:33 UTC (permalink / raw)
  To: Jeff King
  Cc: Taylor Blau, Patrick Steinhardt, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason

On Wed, Nov 16, 2022 at 05:04:10PM -0500, Jeff King wrote:
> > I haven't looked too deeply at what is going on here, but let's make
> > sure to resolve this before graduating the topic down (which I would
> > otherwise like to do in the next push-out, probably tomorrow or the next
> > day).
>
> The issue is that some of the tests assume that hiding "refs/" should
> produce no output from "--exclude-hidden=receive --all". But it will
> also show HEAD, even if it points to a hidden ref (which I think is OK,
> and matches what receive-pack would do).
>
> But because the setup uses "main" as one of the sample refs, HEAD may or
> may not be valid, depending on what it points to (without setting
> GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME it points to master, which is
> unborn).
>
> So the fix is just:
>
> [...]

Makes perfect sense, and thanks for looking into it.

Patrick: it sounds like there was one typo in the earlier round which
you may want to pick up also, assuming you reroll this.

Thanks,
Taylor


* Re: [PATCH v5 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-16 22:33           ` Taylor Blau
@ 2022-11-17  5:45             ` Patrick Steinhardt
  0 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:45 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Jeff King, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason

On Wed, Nov 16, 2022 at 05:33:12PM -0500, Taylor Blau wrote:
> On Wed, Nov 16, 2022 at 05:04:10PM -0500, Jeff King wrote:
> > > I haven't looked too deeply at what is going on here, but let's make
> > > sure to resolve this before graduating the topic down (which I would
> > > otherwise like to do in the next push-out, probably tomorrow or the next
> > > day).
> >
> > The issue is that some of the tests assume that hiding "refs/" should
> > produce no output from "--exclude-hidden=receive --all". But it will
> > also show HEAD, even if it points to a hidden ref (which I think is OK,
> > and matches what receive-pack would do).
> >
> > But because the setup uses "main" as one of the sample refs, HEAD may or
> > may not be valid, depending on what it points to (without setting
> > GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME it points to master, which is
> > unborn).
> >
> > So the fix is just:
> >
> > [...]
> 
> Makes perfect sense, and thanks for looking into it.
> 
> Patrick: it sounds like there was one typo in the earlier round which
> you may want to pick up also, assuming you reroll this.
> 
> Thanks,
> Taylor

Thanks to both of you, I'll send out v6 in a bit.

Patrick


* [PATCH v6 0/7] receive-pack: only use visible refs for connectivity check
  2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
                   ` (8 preceding siblings ...)
  2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
@ 2022-11-17  5:46 ` Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
                     ` (7 more replies)
  9 siblings, 8 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:46 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

Hi,

this is the sixth version of my patch series that tries to improve
performance of the connectivity check by only treating those preexisting
refs as uninteresting that could actually have been advertised to the
client.

There are only two changes in this version compared to v5:

    - A fix to the test setup in commit 5/7 so that tests pass when
      GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main.

    - A typo fix in the commit message of patch 6/7.

Patrick

Patrick Steinhardt (7):
  refs: fix memory leak when parsing hideRefs config
  refs: get rid of global list of hidden refs
  revision: move together exclusion-related functions
  revision: introduce struct to handle exclusions
  revision: add new parameter to exclude hidden refs
  rev-parse: add `--exclude-hidden=` option
  receive-pack: only use visible refs for connectivity check

 Documentation/git-rev-parse.txt    |   7 ++
 Documentation/rev-list-options.txt |   7 ++
 builtin/receive-pack.c             |  10 +-
 builtin/rev-list.c                 |   1 +
 builtin/rev-parse.c                |  18 +++-
 connected.c                        |   3 +
 connected.h                        |   7 ++
 ls-refs.c                          |  13 ++-
 refs.c                             |  16 +--
 refs.h                             |   5 +-
 revision.c                         | 131 +++++++++++++++--------
 revision.h                         |  43 ++++++--
 t/t6018-rev-list-glob.sh           |  40 +++++++
 t/t6021-rev-list-exclude-hidden.sh | 163 +++++++++++++++++++++++++++++
 upload-pack.c                      |  30 +++---
 15 files changed, 411 insertions(+), 83 deletions(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh

Range-diff against v5:
1:  cfab8ba1a2 = 1:  ef182e4330 refs: fix memory leak when parsing hideRefs config
2:  d8118c6dd8 = 2:  48913c1493 refs: get rid of global list of hidden refs
3:  93a627fb7f = 3:  3827d6a2fc revision: move together exclusion-related functions
4:  ad41ade332 = 4:  805de80e64 revision: introduce struct to handle exclusions
5:  b5a4ce432a ! 5:  d86a3342f6 revision: add new parameter to exclude hidden refs
    @@ t/t6021-rev-list-exclude-hidden.sh (new)
     +. ./test-lib.sh
     +
     +test_expect_success 'setup' '
    -+	test_commit_bulk --id=commit --ref=refs/heads/main 1 &&
    -+	COMMIT=$(git rev-parse refs/heads/main) &&
    ++	test_commit_bulk --id=commit --ref=refs/heads/branch 1 &&
    ++	COMMIT=$(git rev-parse refs/heads/branch) &&
     +	test_commit_bulk --id=tag --ref=refs/tags/lightweight 1 &&
     +	TAG=$(git rev-parse refs/tags/lightweight) &&
     +	test_commit_bulk --id=hidden --ref=refs/hidden/commit 1 &&
6:  2eeb25eef0 ! 6:  f8b5eb5a7e rev-parse: add `--exclude-hidden=` option
    @@ Commit message
         rev-parse: add `--exclude-hidden=` option
     
         Add a new `--exclude-hidden=` option that is similar to the one we just
    -    added to git-rev-list(1). Given a seciton name `uploadpack` or `receive`
    +    added to git-rev-list(1). Given a section name `uploadpack` or `receive`
         as argument, it causes us to exclude all references that would be hidden
         by the respective `$section.hideRefs` configuration.
     
7:  f5f18f3939 = 7:  a7eae80ff3 receive-pack: only use visible refs for connectivity check
-- 
2.38.1



* [PATCH v6 1/7] refs: fix memory leak when parsing hideRefs config
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
@ 2022-11-17  5:46   ` Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:46 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

When parsing the hideRefs configuration, we first duplicate the config
value so that we can modify it. We then append it to the `hide_refs`
string list, which is initialized with `strdup_strings` enabled. As a
consequence the string is duplicated a second time, but the first
duplicate is never freed, and thus we have a memory leak.

While we never clean up the static `hide_refs` variable anyway, this is
no excuse to make the leak worse by leaking every value twice. We are
also about to change the way this variable will be handled so that we do
indeed start to clean it up. So let's fix the memory leak by using
`string_list_append_nodup()` so that we pass ownership of the allocated
string to `hide_refs`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
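To make the ownership issue described above a bit more tangible, here is
a small standalone sketch. It does not use Git's string-list API; the
`struct mini_list` type and all `mini_list_*()` and `dup_string()`
helpers are hypothetical miniatures written for this note. The assumption
is only that the copying append behaves like `string_list_append()` on a
list with `strdup_strings` set, and the non-copying one like
`string_list_append_nodup()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a string list; not Git's actual struct. */
struct mini_list {
	char **items;
	size_t nr, alloc;
	int strdup_strings;
};

/* Portable replacement for strdup(); aborts on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

static void mini_list_push(struct mini_list *list, char *s)
{
	if (list->nr == list->alloc) {
		size_t alloc = list->alloc ? 2 * list->alloc : 4;
		char **items = realloc(list->items, alloc * sizeof(*items));

		if (!items)
			abort();
		list->items = items;
		list->alloc = alloc;
	}
	list->items[list->nr++] = s;
}

/*
 * Copying append: with strdup_strings set the list stores its own copy,
 * so a caller that passes an already allocated string still owns (and
 * must free) the original. Forgetting that is the leak being fixed.
 */
static char *mini_list_append(struct mini_list *list, const char *s)
{
	char *stored = list->strdup_strings ? dup_string(s) : (char *)s;

	mini_list_push(list, stored);
	return stored;
}

/* Non-copying append: the list takes over ownership of `s`. */
static char *mini_list_append_nodup(struct mini_list *list, char *s)
{
	assert(s);
	mini_list_push(list, s);
	return s;
}

/* Free the list; the strings are freed only if the list owns them. */
static void mini_list_clear(struct mini_list *list)
{
	size_t i;

	if (list->strdup_strings)
		for (i = 0; i < list->nr; i++)
			free(list->items[i]);
	free(list->items);
	list->items = NULL;
	list->nr = list->alloc = 0;
}
```

With the copying variant, a caller has to free its own duplicate after
appending; with the nodup variant, the single allocation is handed to the
list and released by `mini_list_clear()`.
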
 refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index 1491ae937e..a4ab264d74 100644
--- a/refs.c
+++ b/refs.c
@@ -1435,7 +1435,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 			CALLOC_ARRAY(hide_refs, 1);
 			hide_refs->strdup_strings = 1;
 		}
-		string_list_append(hide_refs, ref);
+		string_list_append_nodup(hide_refs, ref);
 	}
 	return 0;
 }
-- 
2.38.1



* [PATCH v6 2/7] refs: get rid of global list of hidden refs
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
@ 2022-11-17  5:46   ` Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 3/7] revision: move together exclusion-related functions Patrick Steinhardt
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:46 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.

Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
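As a rough illustration of why passing the list explicitly helps, the
sketch below checks the same ref name against two independent sets of
hidden-ref patterns. It is a simplified stand-in, not the function from
refs.c: `struct hidden_ref_list` and `ref_is_hidden_in()` are
hypothetical names, namespaces and the `^` prefix are left out, and
patterns are assumed to have their trailing slashes stripped already.
What it keeps is walking the entries from last to first, so that later
entries (including `!` negations) take precedence over earlier ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified list of hideRefs patterns for one section. */
struct hidden_ref_list {
	const char **patterns;
	size_t nr;
};

/*
 * Return 1 if `refname` is hidden according to `hidden`, 0 otherwise.
 * A pattern matches the ref itself or anything below it in the ref
 * hierarchy; a leading '!' exposes the ref again.
 */
static int ref_is_hidden_in(const char *refname,
			    const struct hidden_ref_list *hidden)
{
	size_t i;

	assert(refname && hidden);

	for (i = hidden->nr; i > 0; i--) {
		const char *match = hidden->patterns[i - 1];
		int neg = 0;
		size_t len;

		if (*match == '!') {
			neg = 1;
			match++;
		}

		len = strlen(match);
		if (!strncmp(refname, match, len) &&
		    (refname[len] == '\0' || refname[len] == '/'))
			return !neg;
	}
	return 0;
}
```

Because the list is an argument rather than a global, one process can
hold, say, a "receive" list and an "uploadpack" list side by side and ask
each of them about the same ref.
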
 builtin/receive-pack.c |  8 +++++---
 ls-refs.c              | 13 +++++++++----
 refs.c                 | 14 ++++----------
 refs.h                 |  5 +++--
 upload-pack.c          | 30 ++++++++++++++++++------------
 5 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 44bcea3a5b..1e24b31a0a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -80,6 +80,7 @@ static struct object_id push_cert_oid;
 static struct signature_check sigcheck;
 static const char *push_cert_nonce;
 static const char *cert_nonce_seed;
+static struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 
 static const char *NONCE_UNSOLICITED = "UNSOLICITED";
 static const char *NONCE_BAD = "BAD";
@@ -130,7 +131,7 @@ static enum deny_action parse_deny_action(const char *var, const char *value)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
-	int status = parse_hide_refs_config(var, value, "receive");
+	int status = parse_hide_refs_config(var, value, "receive", &hidden_refs);
 
 	if (status)
 		return status;
@@ -296,7 +297,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	struct oidset *seen = data;
 	const char *path = strip_namespace(path_full);
 
-	if (ref_is_hidden(path, path_full))
+	if (ref_is_hidden(path, path_full, &hidden_refs))
 		return 0;
 
 	/*
@@ -1794,7 +1795,7 @@ static void reject_updates_to_hidden(struct command *commands)
 		strbuf_setlen(&refname_full, prefix_len);
 		strbuf_addstr(&refname_full, cmd->ref_name);
 
-		if (!ref_is_hidden(cmd->ref_name, refname_full.buf))
+		if (!ref_is_hidden(cmd->ref_name, refname_full.buf, &hidden_refs))
 			continue;
 		if (is_null_oid(&cmd->new_oid))
 			cmd->error_string = "deny deleting a hidden ref";
@@ -2591,6 +2592,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		packet_flush(1);
 	oid_array_clear(&shallow);
 	oid_array_clear(&ref);
+	string_list_clear(&hidden_refs, 0);
 	free((void *)push_cert_nonce);
 	return 0;
 }
diff --git a/ls-refs.c b/ls-refs.c
index fa0d01b47c..fb6769742c 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -6,6 +6,7 @@
 #include "ls-refs.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "string-list.h"
 
 static int config_read;
 static int advertise_unborn;
@@ -73,6 +74,7 @@ struct ls_refs_data {
 	unsigned symrefs;
 	struct strvec prefixes;
 	struct strbuf buf;
+	struct string_list hidden_refs;
 	unsigned unborn : 1;
 };
 
@@ -84,7 +86,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 
 	strbuf_reset(&data->buf);
 
-	if (ref_is_hidden(refname_nons, refname))
+	if (ref_is_hidden(refname_nons, refname, &data->hidden_refs))
 		return 0;
 
 	if (!ref_match(&data->prefixes, refname_nons))
@@ -137,14 +139,15 @@ static void send_possibly_unborn_head(struct ls_refs_data *data)
 }
 
 static int ls_refs_config(const char *var, const char *value,
-			  void *data UNUSED)
+			  void *cb_data)
 {
+	struct ls_refs_data *data = cb_data;
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
 	 * don't yet know how that information will be passed to ls-refs.
 	 */
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 int ls_refs(struct repository *r, struct packet_reader *request)
@@ -154,9 +157,10 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	memset(&data, 0, sizeof(data));
 	strvec_init(&data.prefixes);
 	strbuf_init(&data.buf, 0);
+	string_list_init_dup(&data.hidden_refs);
 
 	ensure_config_read();
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -195,6 +199,7 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 	packet_fflush(stdout);
 	strvec_clear(&data.prefixes);
 	strbuf_release(&data.buf);
+	string_list_clear(&data.hidden_refs, 0);
 	return 0;
 }
 
diff --git a/refs.c b/refs.c
index a4ab264d74..2c7e88b190 100644
--- a/refs.c
+++ b/refs.c
@@ -1414,9 +1414,8 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 					    refname, strict);
 }
 
-static struct string_list *hide_refs;
-
-int parse_hide_refs_config(const char *var, const char *value, const char *section)
+int parse_hide_refs_config(const char *var, const char *value, const char *section,
+			   struct string_list *hide_refs)
 {
 	const char *key;
 	if (!strcmp("transfer.hiderefs", var) ||
@@ -1431,21 +1430,16 @@ int parse_hide_refs_config(const char *var, const char *value, const char *secti
 		len = strlen(ref);
 		while (len && ref[len - 1] == '/')
 			ref[--len] = '\0';
-		if (!hide_refs) {
-			CALLOC_ARRAY(hide_refs, 1);
-			hide_refs->strdup_strings = 1;
-		}
 		string_list_append_nodup(hide_refs, ref);
 	}
 	return 0;
 }
 
-int ref_is_hidden(const char *refname, const char *refname_full)
+int ref_is_hidden(const char *refname, const char *refname_full,
+		  const struct string_list *hide_refs)
 {
 	int i;
 
-	if (!hide_refs)
-		return 0;
 	for (i = hide_refs->nr - 1; i >= 0; i--) {
 		const char *match = hide_refs->items[i].string;
 		const char *subject;
diff --git a/refs.h b/refs.h
index 8958717a17..3266fd8f57 100644
--- a/refs.h
+++ b/refs.h
@@ -808,7 +808,8 @@ int update_ref(const char *msg, const char *refname,
 	       const struct object_id *new_oid, const struct object_id *old_oid,
 	       unsigned int flags, enum action_on_err onerr);
 
-int parse_hide_refs_config(const char *var, const char *value, const char *);
+int parse_hide_refs_config(const char *var, const char *value, const char *,
+			   struct string_list *);
 
 /*
  * Check whether a ref is hidden. If no namespace is set, both the first and
@@ -818,7 +819,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *);
  * the ref is outside that namespace, the first parameter is NULL. The second
  * parameter always points to the full ref name.
  */
-int ref_is_hidden(const char *, const char *);
+int ref_is_hidden(const char *, const char *, const struct string_list *);
 
 /* Is this a per-worktree ref living in the refs/ namespace? */
 int is_per_worktree_ref(const char *refname);
diff --git a/upload-pack.c b/upload-pack.c
index 0b8311bd68..551f22ffa5 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -62,6 +62,7 @@ struct upload_pack_data {
 	struct object_array have_obj;
 	struct oid_array haves;					/* v2 only */
 	struct string_list wanted_refs;				/* v2 only */
+	struct string_list hidden_refs;
 
 	struct object_array shallows;
 	struct string_list deepen_not;
@@ -118,6 +119,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct string_list symref = STRING_LIST_INIT_DUP;
 	struct string_list wanted_refs = STRING_LIST_INIT_DUP;
+	struct string_list hidden_refs = STRING_LIST_INIT_DUP;
 	struct object_array want_obj = OBJECT_ARRAY_INIT;
 	struct object_array have_obj = OBJECT_ARRAY_INIT;
 	struct oid_array haves = OID_ARRAY_INIT;
@@ -130,6 +132,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 	memset(data, 0, sizeof(*data));
 	data->symref = symref;
 	data->wanted_refs = wanted_refs;
+	data->hidden_refs = hidden_refs;
 	data->want_obj = want_obj;
 	data->have_obj = have_obj;
 	data->haves = haves;
@@ -151,6 +154,7 @@ static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	string_list_clear(&data->symref, 1);
 	string_list_clear(&data->wanted_refs, 1);
+	string_list_clear(&data->hidden_refs, 0);
 	object_array_clear(&data->want_obj);
 	object_array_clear(&data->have_obj);
 	oid_array_clear(&data->haves);
@@ -842,8 +846,8 @@ static void deepen(struct upload_pack_data *data, int depth)
 		 * Checking for reachable shallows requires that our refs be
 		 * marked with OUR_REF.
 		 */
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, data);
+		for_each_namespaced_ref(check_ref, data);
 
 		get_reachable_list(data, &reachable_shallows);
 		result = get_shallow_commits(&reachable_shallows,
@@ -1158,11 +1162,11 @@ static void receive_needs(struct upload_pack_data *data,
 
 /* return non-zero if the ref is hidden, otherwise 0 */
 static int mark_our_ref(const char *refname, const char *refname_full,
-			const struct object_id *oid)
+			const struct object_id *oid, const struct string_list *hidden_refs)
 {
 	struct object *o = lookup_unknown_object(the_repository, oid);
 
-	if (ref_is_hidden(refname, refname_full)) {
+	if (ref_is_hidden(refname, refname_full, hidden_refs)) {
 		o->flags |= HIDDEN_REF;
 		return 1;
 	}
@@ -1171,11 +1175,12 @@ static int mark_our_ref(const char *refname, const char *refname_full,
 }
 
 static int check_ref(const char *refname_full, const struct object_id *oid,
-		     int flag UNUSED, void *cb_data UNUSED)
+		     int flag UNUSED, void *cb_data)
 {
 	const char *refname = strip_namespace(refname_full);
+	struct upload_pack_data *data = cb_data;
 
-	mark_our_ref(refname, refname_full, oid);
+	mark_our_ref(refname, refname_full, oid, &data->hidden_refs);
 	return 0;
 }
 
@@ -1204,7 +1209,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	struct object_id peeled;
 	struct upload_pack_data *data = cb_data;
 
-	if (mark_our_ref(refname_nons, refname, oid))
+	if (mark_our_ref(refname_nons, refname, oid, &data->hidden_refs))
 		return 0;
 
 	if (capabilities) {
@@ -1327,7 +1332,7 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
-	return parse_hide_refs_config(var, value, "uploadpack");
+	return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
 }
 
 static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
@@ -1375,8 +1380,8 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 		advertise_shallow_grafts(1);
 		packet_flush(1);
 	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
+		head_ref_namespaced(check_ref, &data);
+		for_each_namespaced_ref(check_ref, &data);
 	}
 
 	if (!advertise_refs) {
@@ -1441,6 +1446,7 @@ static int parse_want(struct packet_writer *writer, const char *line,
 
 static int parse_want_ref(struct packet_writer *writer, const char *line,
 			  struct string_list *wanted_refs,
+			  struct string_list *hidden_refs,
 			  struct object_array *want_obj)
 {
 	const char *refname_nons;
@@ -1451,7 +1457,7 @@ static int parse_want_ref(struct packet_writer *writer, const char *line,
 		struct strbuf refname = STRBUF_INIT;
 
 		strbuf_addf(&refname, "%s%s", get_git_namespace(), refname_nons);
-		if (ref_is_hidden(refname_nons, refname.buf) ||
+		if (ref_is_hidden(refname_nons, refname.buf, hidden_refs) ||
 		    read_ref(refname.buf, &oid)) {
 			packet_writer_error(writer, "unknown ref %s", refname_nons);
 			die("unknown ref %s", refname_nons);
@@ -1508,7 +1514,7 @@ static void process_args(struct packet_reader *request,
 			continue;
 		if (data->allow_ref_in_want &&
 		    parse_want_ref(&data->writer, arg, &data->wanted_refs,
-				   &data->want_obj))
+				   &data->hidden_refs, &data->want_obj))
 			continue;
 		/* process have line */
 		if (parse_have(arg, &data->haves))
-- 
2.38.1



* [PATCH v6 3/7] revision: move together exclusion-related functions
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
@ 2022-11-17  5:46   ` Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:46 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2402 bytes --]

Move the definitions of the functions that handle ref exclusions next to
each other so that related functionality sits in a single place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 revision.c | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/revision.c b/revision.c
index 0760e78936..be755670e2 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,14 +1517,6 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-struct all_refs_cb {
-	int all_flags;
-	int warned_bad_reflog;
-	struct rev_info *all_revs;
-	const char *name_for_errormsg;
-	struct worktree *wt;
-};
-
 int ref_excluded(struct string_list *ref_excludes, const char *path)
 {
 	struct string_list_item *item;
@@ -1538,6 +1530,32 @@ int ref_excluded(struct string_list *ref_excludes, const char *path)
 	return 0;
 }
 
+void clear_ref_exclusion(struct string_list **ref_excludes_p)
+{
+	if (*ref_excludes_p) {
+		string_list_clear(*ref_excludes_p, 0);
+		free(*ref_excludes_p);
+	}
+	*ref_excludes_p = NULL;
+}
+
+void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+{
+	if (!*ref_excludes_p) {
+		CALLOC_ARRAY(*ref_excludes_p, 1);
+		(*ref_excludes_p)->strdup_strings = 1;
+	}
+	string_list_append(*ref_excludes_p, exclude);
+}
+
+struct all_refs_cb {
+	int all_flags;
+	int warned_bad_reflog;
+	struct rev_info *all_revs;
+	const char *name_for_errormsg;
+	struct worktree *wt;
+};
+
 static int handle_one_ref(const char *path, const struct object_id *oid,
 			  int flag UNUSED,
 			  void *cb_data)
@@ -1563,24 +1581,6 @@ static void init_all_refs_cb(struct all_refs_cb *cb, struct rev_info *revs,
 	cb->wt = NULL;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
-{
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
-}
-
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
-{
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
-}
-
 static void handle_refs(struct ref_store *refs,
 			struct rev_info *revs, unsigned flags,
 			int (*for_each)(struct ref_store *, each_ref_fn, void *))
-- 
2.38.1



* [PATCH v6 4/7] revision: introduce struct to handle exclusions
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2022-11-17  5:46   ` [PATCH v6 3/7] revision: move together exclusion-related functions Patrick Steinhardt
@ 2022-11-17  5:46   ` Patrick Steinhardt
  2022-11-17  5:46   ` [PATCH v6 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:46 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 8642 bytes --]

The functions that handle exclusion of refs work on a single string
list. We're about to add a second mechanism for excluding refs though,
and it makes sense to reuse much of the same architecture for both kinds
of exclusion.

Introduce a new `struct ref_exclusions` that encapsulates all the logic
related to excluding refs and move the `struct string_list` that holds
all wildmatch patterns of excluded refs into it. Rename functions that
operate on this struct to match its name.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/rev-parse.c |  8 ++++----
 revision.c          | 48 +++++++++++++++++++++------------------------
 revision.h          | 27 +++++++++++++++++++------
 3 files changed, 47 insertions(+), 36 deletions(-)
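
For readers skimming the diff below, the shape of the refactoring can be
sketched in isolation. This is only an illustration and not code from
the patch: every `sketch_`-prefixed name is hypothetical, a fixed-size
array stands in for `struct string_list`, and POSIX fnmatch(3) stands in
for wildmatch() (the two may differ in matching details). The point is
that an embedded struct with an initializer macro is valid while empty,
so the NULL checks and lazy allocation of the old pointer-based
interface go away.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical fixed-capacity stand-in for git's `struct string_list`. */
#define SKETCH_MAX_PATTERNS 16

struct sketch_ref_exclusions {
	const char *excluded_refs[SKETCH_MAX_PATTERNS];
	size_t nr;
};

/* Plays the role of REF_EXCLUSIONS_INIT: an empty value is already valid. */
#define SKETCH_REF_EXCLUSIONS_INIT { .nr = 0 }

/* Record one more exclusion pattern; the caller keeps ownership of it. */
static void sketch_add_ref_exclusion(struct sketch_ref_exclusions *ex,
				     const char *pattern)
{
	assert(ex->nr < SKETCH_MAX_PATTERNS);
	ex->excluded_refs[ex->nr++] = pattern;
}

/* Return 1 if any pattern matches `path`; fnmatch() returns 0 on a match. */
static int sketch_ref_excluded(const struct sketch_ref_exclusions *ex,
			       const char *path)
{
	size_t i;

	for (i = 0; i < ex->nr; i++)
		if (!fnmatch(ex->excluded_refs[i], path, 0))
			return 1;
	return 0;
}

/* Forget all patterns; the struct stays usable afterwards. */
static void sketch_clear_ref_exclusions(struct sketch_ref_exclusions *ex)
{
	ex->nr = 0;
}
```

A caller would declare `struct sketch_ref_exclusions ex =
SKETCH_REF_EXCLUSIONS_INIT;`, add patterns, query, and clear, without
ever testing the container itself for NULL.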

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 8f61050bde..7fa5b6991b 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -39,7 +39,7 @@ static int abbrev_ref_strict;
 static int output_sq;
 
 static int stuck_long;
-static struct string_list *ref_excludes;
+static struct ref_exclusions ref_excludes = REF_EXCLUSIONS_INIT;
 
 /*
  * Some arguments are relevant "revision" arguments,
@@ -198,7 +198,7 @@ static int show_default(void)
 static int show_reference(const char *refname, const struct object_id *oid,
 			  int flag UNUSED, void *cb_data UNUSED)
 {
-	if (ref_excluded(ref_excludes, refname))
+	if (ref_excluded(&ref_excludes, refname))
 		return 0;
 	show_rev(NORMAL, oid, refname);
 	return 0;
@@ -585,7 +585,7 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
 		for_each_glob_ref_in(show_reference, pattern, prefix, NULL);
 	else
 		for_each_ref_in(prefix, show_reference, NULL);
-	clear_ref_exclusion(&ref_excludes);
+	clear_ref_exclusions(&ref_excludes);
 }
 
 enum format_type {
@@ -863,7 +863,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "--all")) {
 				for_each_ref(show_reference, NULL);
-				clear_ref_exclusion(&ref_excludes);
+				clear_ref_exclusions(&ref_excludes);
 				continue;
 			}
 			if (skip_prefix(arg, "--disambiguate=", &arg)) {
diff --git a/revision.c b/revision.c
index be755670e2..fe3ec98f46 100644
--- a/revision.c
+++ b/revision.c
@@ -1517,35 +1517,30 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 	}
 }
 
-int ref_excluded(struct string_list *ref_excludes, const char *path)
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
 	struct string_list_item *item;
-
-	if (!ref_excludes)
-		return 0;
-	for_each_string_list_item(item, ref_excludes) {
+	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
 	return 0;
 }
 
-void clear_ref_exclusion(struct string_list **ref_excludes_p)
+void init_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (*ref_excludes_p) {
-		string_list_clear(*ref_excludes_p, 0);
-		free(*ref_excludes_p);
-	}
-	*ref_excludes_p = NULL;
+	struct ref_exclusions blank = REF_EXCLUSIONS_INIT;
+	memcpy(exclusions, &blank, sizeof(*exclusions));
 }
 
-void add_ref_exclusion(struct string_list **ref_excludes_p, const char *exclude)
+void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
-	if (!*ref_excludes_p) {
-		CALLOC_ARRAY(*ref_excludes_p, 1);
-		(*ref_excludes_p)->strdup_strings = 1;
-	}
-	string_list_append(*ref_excludes_p, exclude);
+	string_list_clear(&exclusions->excluded_refs, 0);
+}
+
+void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
+{
+	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
 struct all_refs_cb {
@@ -1563,7 +1558,7 @@ static int handle_one_ref(const char *path, const struct object_id *oid,
 	struct all_refs_cb *cb = cb_data;
 	struct object *object;
 
-	if (ref_excluded(cb->all_revs->ref_excludes, path))
+	if (ref_excluded(&cb->all_revs->ref_excludes, path))
 	    return 0;
 
 	object = get_reference(cb->all_revs, path, oid, cb->all_flags);
@@ -1901,6 +1896,7 @@ void repo_init_revisions(struct repository *r,
 
 	init_display_notes(&revs->notes_opt);
 	list_objects_filter_init(&revs->filter);
+	init_ref_exclusions(&revs->ref_excludes);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -2689,10 +2685,10 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 			init_all_refs_cb(&cb, revs, *flags);
 			other_head_refs(handle_one_ref, &cb);
 		}
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--branches")) {
 		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--bisect")) {
 		read_bisect_terms(&term_bad, &term_good);
 		handle_refs(refs, revs, *flags, for_each_bad_bisect_ref);
@@ -2701,15 +2697,15 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		revs->bisect = 1;
 	} else if (!strcmp(arg, "--tags")) {
 		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--remotes")) {
 		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref(handle_one_ref, optarg, &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 		return argcount;
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
@@ -2718,17 +2714,17 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--tags=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
 		struct all_refs_cb cb;
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
-		clear_ref_exclusion(&revs->ref_excludes);
+		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--reflog")) {
 		add_reflogs_to_pending(revs, *flags);
 	} else if (!strcmp(arg, "--indexed-objects")) {
diff --git a/revision.h b/revision.h
index afe1b77985..5c8ab16047 100644
--- a/revision.h
+++ b/revision.h
@@ -81,6 +81,21 @@ struct rev_cmdline_info {
 	} *rev;
 };
 
+struct ref_exclusions {
+	/*
+	 * Excluded refs is a list of wildmatch patterns. If any of the
+	 * patterns matches, the reference will be excluded.
+	 */
+	struct string_list excluded_refs;
+};
+
+/**
+ * Initialize a `struct ref_exclusions` with a macro.
+ */
+#define REF_EXCLUSIONS_INIT { \
+	.excluded_refs = STRING_LIST_INIT_DUP, \
+}
+
 struct oidset;
 struct topo_walk_info;
 
@@ -103,7 +118,7 @@ struct rev_info {
 	struct list_objects_filter_options filter;
 
 	/* excluding from --branches, --refs, etc. expansion */
-	struct string_list *ref_excludes;
+	struct ref_exclusions ref_excludes;
 
 	/* Basic information */
 	const char *prefix;
@@ -439,12 +454,12 @@ void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees)
 void show_object_with_name(FILE *, struct object *, const char *);
 
 /**
- * Helpers to check if a "struct string_list" item matches with
- * wildmatch().
+ * Helpers to check if a reference should be excluded.
  */
-int ref_excluded(struct string_list *, const char *path);
-void clear_ref_exclusion(struct string_list **);
-void add_ref_exclusion(struct string_list **, const char *exclude);
+int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
+void init_ref_exclusions(struct ref_exclusions *);
+void clear_ref_exclusions(struct ref_exclusions *);
+void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
 
 /**
  * This function can be used if you want to add commit objects as revision
-- 
2.38.1



* [PATCH v6 5/7] revision: add new parameter to exclude hidden refs
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2022-11-17  5:46   ` [PATCH v6 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
@ 2022-11-17  5:46   ` Patrick Steinhardt
  2022-11-17  5:47   ` [PATCH v6 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:46 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 14484 bytes --]

Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs` configuration,
but there is currently no easy way to obtain the list of all visible or
hidden refs. We need exactly that for a performance improvement in our
connectivity check.

Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--glob`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/rev-list-options.txt |   7 ++
 builtin/rev-list.c                 |   1 +
 revision.c                         |  55 +++++++++-
 revision.h                         |  16 +++
 t/t6021-rev-list-exclude-hidden.sh | 163 +++++++++++++++++++++++++++++
 5 files changed, 241 insertions(+), 1 deletion(-)
 create mode 100755 t/t6021-rev-list-exclude-hidden.sh
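
The lifecycle of the option state is easier to follow outside of the
diff, so here is a rough standalone sketch. It is not the patch's code:
all `sketch_` names are hypothetical, a single prefix replaces the list
of hideRefs patterns, the prefix is assumed to be given without a
trailing slash, and a -1 return stands in for the places where the patch
calls die(). The flag is also set directly here rather than from a
config callback. What it tries to show is the one-shot behavior: the
option may be armed once, is rejected when armed twice, and is reset
once a pseudo-ref such as `--all` has consumed it.

```c
#include <assert.h>
#include <string.h>

struct sketch_exclusions {
	const char *hidden_prefix;	/* hypothetical: one prefix only */
	char hidden_refs_configured;
};

/* Arm the exclusion. Returns 0 on success, -1 where the patch would die(). */
static int sketch_exclude_hidden(struct sketch_exclusions *ex,
				 const char *section, const char *prefix)
{
	assert(section && prefix);
	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
		return -1;	/* unsupported section */
	if (ex->hidden_refs_configured)
		return -1;	/* passed more than once */
	ex->hidden_prefix = prefix;
	ex->hidden_refs_configured = 1;
	return 0;
}

/* Simplified prefix check: match the whole name or stop at a '/' boundary. */
static int sketch_ref_hidden(const struct sketch_exclusions *ex,
			     const char *refname)
{
	size_t len;

	if (!ex->hidden_prefix)
		return 0;
	len = strlen(ex->hidden_prefix);
	return !strncmp(refname, ex->hidden_prefix, len) &&
	       (refname[len] == '\0' || refname[len] == '/');
}

/* Reset after a pseudo-ref option has been expanded. */
static void sketch_clear(struct sketch_exclusions *ex)
{
	ex->hidden_prefix = NULL;
	ex->hidden_refs_configured = 0;
}
```

Because the clear step also resets the flag, a sequence like
`--exclude-hidden=receive --all --exclude-hidden=receive --all` is
accepted, while two consecutive `--exclude-hidden=` options are not.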

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 1837509566..ff68e48406 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -195,6 +195,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[receive|uploadpack]::
+	Do not include refs that would be hidden by `git-receive-pack` or
+	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
+	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+	linkgit:git-config[1]). This option affects the next pseudo-ref option
+	`--all` or `--glob` and is cleared after processing them.
+
 --reflog::
 	Pretend as if all objects mentioned by reflogs are listed on the
 	command line as `<commit>`.
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 3acd93f71e..d42db0b0cc 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -38,6 +38,7 @@ static const char rev_list_usage[] =
 "    --tags\n"
 "    --remotes\n"
 "    --stdin\n"
+"    --exclude-hidden=[receive|uploadpack]\n"
 "    --quiet\n"
 "  ordering output:\n"
 "    --topo-order\n"
diff --git a/revision.c b/revision.c
index fe3ec98f46..bc32fb819a 100644
--- a/revision.c
+++ b/revision.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "config.h"
 #include "object-store.h"
 #include "tag.h"
 #include "blob.h"
@@ -1519,11 +1520,17 @@ static void add_rev_cmdline_list(struct rev_info *revs,
 
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path)
 {
+	const char *stripped_path = strip_namespace(path);
 	struct string_list_item *item;
+
 	for_each_string_list_item(item, &exclusions->excluded_refs) {
 		if (!wildmatch(item->string, path, 0))
 			return 1;
 	}
+
+	if (ref_is_hidden(stripped_path, path, &exclusions->hidden_refs))
+		return 1;
+
 	return 0;
 }
 
@@ -1536,6 +1543,8 @@ void init_ref_exclusions(struct ref_exclusions *exclusions)
 void clear_ref_exclusions(struct ref_exclusions *exclusions)
 {
 	string_list_clear(&exclusions->excluded_refs, 0);
+	string_list_clear(&exclusions->hidden_refs, 0);
+	exclusions->hidden_refs_configured = 0;
 }
 
 void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
@@ -1543,6 +1552,35 @@ void add_ref_exclusion(struct ref_exclusions *exclusions, const char *exclude)
 	string_list_append(&exclusions->excluded_refs, exclude);
 }
 
+struct exclude_hidden_refs_cb {
+	struct ref_exclusions *exclusions;
+	const char *section;
+};
+
+static int hide_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct exclude_hidden_refs_cb *cb = cb_data;
+	cb->exclusions->hidden_refs_configured = 1;
+	return parse_hide_refs_config(var, value, cb->section,
+				      &cb->exclusions->hidden_refs);
+}
+
+void exclude_hidden_refs(struct ref_exclusions *exclusions, const char *section)
+{
+	struct exclude_hidden_refs_cb cb;
+
+	if (strcmp(section, "receive") && strcmp(section, "uploadpack"))
+		die(_("unsupported section for hidden refs: %s"), section);
+
+	if (exclusions->hidden_refs_configured)
+		die(_("--exclude-hidden= passed more than once"));
+
+	cb.exclusions = exclusions;
+	cb.section = section;
+
+	git_config(hide_refs_config, &cb);
+}
+
 struct all_refs_cb {
 	int all_flags;
 	int warned_bad_reflog;
@@ -2221,7 +2259,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	    !strcmp(arg, "--bisect") || starts_with(arg, "--glob=") ||
 	    !strcmp(arg, "--indexed-objects") ||
 	    !strcmp(arg, "--alternate-refs") ||
-	    starts_with(arg, "--exclude=") ||
+	    starts_with(arg, "--exclude=") || starts_with(arg, "--exclude-hidden=") ||
 	    starts_with(arg, "--branches=") || starts_with(arg, "--tags=") ||
 	    starts_with(arg, "--remotes=") || starts_with(arg, "--no-walk="))
 	{
@@ -2687,6 +2725,8 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 		}
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--branches")) {
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --branches"));
 		handle_refs(refs, revs, *flags, refs_for_each_branch_ref);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--bisect")) {
@@ -2696,9 +2736,13 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 			    for_each_good_bisect_ref);
 		revs->bisect = 1;
 	} else if (!strcmp(arg, "--tags")) {
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --tags"));
 		handle_refs(refs, revs, *flags, refs_for_each_tag_ref);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (!strcmp(arg, "--remotes")) {
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --remotes"));
 		handle_refs(refs, revs, *flags, refs_for_each_remote_ref);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if ((argcount = parse_long_opt("glob", argv, &optarg))) {
@@ -2710,18 +2754,27 @@ static int handle_revision_pseudo_opt(struct rev_info *revs,
 	} else if ((argcount = parse_long_opt("exclude", argv, &optarg))) {
 		add_ref_exclusion(&revs->ref_excludes, optarg);
 		return argcount;
+	} else if ((argcount = parse_long_opt("exclude-hidden", argv, &optarg))) {
+		exclude_hidden_refs(&revs->ref_excludes, optarg);
+		return argcount;
 	} else if (skip_prefix(arg, "--branches=", &optarg)) {
 		struct all_refs_cb cb;
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --branches"));
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/heads/", &cb);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--tags=", &optarg)) {
 		struct all_refs_cb cb;
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --tags"));
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/tags/", &cb);
 		clear_ref_exclusions(&revs->ref_excludes);
 	} else if (skip_prefix(arg, "--remotes=", &optarg)) {
 		struct all_refs_cb cb;
+		if (revs->ref_excludes.hidden_refs_configured)
+			return error(_("--exclude-hidden cannot be used together with --remotes"));
 		init_all_refs_cb(&cb, revs, *flags);
 		for_each_glob_ref_in(handle_one_ref, optarg, "refs/remotes/", &cb);
 		clear_ref_exclusions(&revs->ref_excludes);
diff --git a/revision.h b/revision.h
index 5c8ab16047..adb810c2f8 100644
--- a/revision.h
+++ b/revision.h
@@ -87,6 +87,19 @@ struct ref_exclusions {
 	 * patterns matches, the reference will be excluded.
 	 */
 	struct string_list excluded_refs;
+
+	/*
+	 * Hidden refs is a list of patterns that is to be hidden via
+	 * `ref_is_hidden()`.
+	 */
+	struct string_list hidden_refs;
+
+	/*
+	 * Indicates whether hidden refs have been configured. This is to
+	 * distinguish between no hidden refs existing and hidden refs not
+	 * being parsed.
+	 */
+	char hidden_refs_configured;
 };
 
 /**
@@ -94,6 +107,7 @@ struct ref_exclusions {
  */
 #define REF_EXCLUSIONS_INIT { \
 	.excluded_refs = STRING_LIST_INIT_DUP, \
+	.hidden_refs = STRING_LIST_INIT_DUP, \
 }
 
 struct oidset;
@@ -456,10 +470,12 @@ void show_object_with_name(FILE *, struct object *, const char *);
 /**
  * Helpers to check if a reference should be excluded.
  */
+
 int ref_excluded(const struct ref_exclusions *exclusions, const char *path);
 void init_ref_exclusions(struct ref_exclusions *);
 void clear_ref_exclusions(struct ref_exclusions *);
 void add_ref_exclusion(struct ref_exclusions *, const char *exclude);
+void exclude_hidden_refs(struct ref_exclusions *, const char *section);
 
 /**
  * This function can be used if you want to add commit objects as revision
diff --git a/t/t6021-rev-list-exclude-hidden.sh b/t/t6021-rev-list-exclude-hidden.sh
new file mode 100755
index 0000000000..32b2b09413
--- /dev/null
+++ b/t/t6021-rev-list-exclude-hidden.sh
@@ -0,0 +1,163 @@
+#!/bin/sh
+
+test_description='git rev-list --exclude-hidden test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit_bulk --id=commit --ref=refs/heads/branch 1 &&
+	COMMIT=$(git rev-parse refs/heads/branch) &&
+	test_commit_bulk --id=tag --ref=refs/tags/lightweight 1 &&
+	TAG=$(git rev-parse refs/tags/lightweight) &&
+	test_commit_bulk --id=hidden --ref=refs/hidden/commit 1 &&
+	HIDDEN=$(git rev-parse refs/hidden/commit) &&
+	test_commit_bulk --id=namespace --ref=refs/namespaces/namespace/refs/namespaced/commit 1 &&
+	NAMESPACE=$(git rev-parse refs/namespaces/namespace/refs/namespaced/commit)
+'
+
+test_expect_success 'invalid section' '
+	echo "fatal: unsupported section for hidden refs: unsupported" >expected &&
+	test_must_fail git rev-list --exclude-hidden=unsupported 2>err &&
+	test_cmp expected err
+'
+
+for section in receive uploadpack
+do
+	test_expect_success "$section: passed multiple times" '
+		echo "fatal: --exclude-hidden= passed more than once" >expected &&
+		test_must_fail git rev-list --exclude-hidden=$section --exclude-hidden=$section 2>err &&
+		test_cmp expected err
+	'
+
+	test_expect_success "$section: without hiddenRefs" '
+		git rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden via transfer.hideRefs" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden via $section.hideRefs" '
+		git -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: respects both transfer.hideRefs and $section.hideRefs" '
+		git -c transfer.hideRefs=refs/tags/ -c $section.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: negation without hidden refs marks everything as uninteresting" '
+		git rev-list --all --exclude-hidden=$section --not --all >out &&
+		test_must_be_empty out
+	'
+
+	test_expect_success "$section: negation with hidden refs marks them as interesting" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --all --exclude-hidden=$section --not --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: hidden refs and excludes work together" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude=refs/tags/* --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: excluded hidden refs get reset" '
+		git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: excluded hidden refs can be used with multiple pseudo-refs" '
+		git -c transfer.hideRefs=refs/ rev-list --exclude-hidden=$section --all --exclude-hidden=$section --all >out &&
+		test_must_be_empty out
+	'
+
+	test_expect_success "$section: works with --glob" '
+		git -c transfer.hideRefs=refs/hidden/ rev-list --exclude-hidden=$section --glob=refs/h* >out &&
+		cat >expected <<-EOF &&
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: operates on stripped refs by default" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaced/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: does not hide namespace by default" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$NAMESPACE
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	test_expect_success "$section: can operate on unstripped refs" '
+		GIT_NAMESPACE=namespace git -c transfer.hideRefs=^refs/namespaces/namespace/ rev-list --exclude-hidden=$section --all >out &&
+		cat >expected <<-EOF &&
+		$HIDDEN
+		$TAG
+		$COMMIT
+		EOF
+		test_cmp expected out
+	'
+
+	for pseudoopt in remotes branches tags
+	do
+		test_expect_success "$section: fails with --$pseudoopt" '
+			test_must_fail git rev-list --exclude-hidden=$section --$pseudoopt 2>err &&
+			test_i18ngrep "error: --exclude-hidden cannot be used together with --$pseudoopt" err
+		'
+
+		test_expect_success "$section: fails with --$pseudoopt=pattern" '
+			test_must_fail git rev-list --exclude-hidden=$section --$pseudoopt=pattern 2>err &&
+			test_i18ngrep "error: --exclude-hidden cannot be used together with --$pseudoopt" err
+		'
+	done
+done
+
+test_done
-- 
2.38.1



* [PATCH v6 6/7] rev-parse: add `--exclude-hidden=` option
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2022-11-17  5:46   ` [PATCH v6 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
@ 2022-11-17  5:47   ` Patrick Steinhardt
  2022-11-17  5:47   ` [PATCH v6 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
  2022-11-17 15:03   ` [PATCH v6 0/7] " Jeff King
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 5452 bytes --]

Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration together with
`transfer.hideRefs`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-rev-parse.txt |  7 ++++++
 builtin/rev-parse.c             | 10 +++++++++
 t/t6018-rev-list-glob.sh        | 40 +++++++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)
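
As a small aside on the argument handling in builtin/rev-parse.c: the
option is recognized with a prefix match, so only the stuck form
`--exclude-hidden=<section>` is accepted there. The following sketch
illustrates that idea on its own. The `sketch_` functions are
hypothetical; `sketch_skip_prefix()` is written to behave like git's
skip_prefix() helper as I understand it, and the -1 return stands in for
the die() in exclude_hidden_refs().

```c
#include <assert.h>
#include <string.h>

/*
 * If `str` starts with `prefix`, store the remainder in `*out` and
 * return 1; otherwise leave `*out` untouched and return 0.
 */
static int sketch_skip_prefix(const char *str, const char *prefix,
			      const char **out)
{
	do {
		if (!*prefix) {
			*out = str;
			return 1;
		}
	} while (*str++ == *prefix++);
	return 0;
}

/*
 * Returns 1 and sets `*section` if `arg` is a valid --exclude-hidden=
 * option, 0 if `arg` is some other option, and -1 if the section is
 * not one of the two supported names.
 */
static int sketch_parse_exclude_hidden(const char *arg, const char **section)
{
	const char *value;

	assert(arg && section);
	if (!sketch_skip_prefix(arg, "--exclude-hidden=", &value))
		return 0;
	if (strcmp(value, "receive") && strcmp(value, "uploadpack"))
		return -1;
	*section = value;
	return 1;
}
```

Note that `--exclude=...` does not match the longer prefix, so the two
options cannot be confused with each other regardless of the order in
which they are checked.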

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 6b8ca085aa..bcd8069287 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -197,6 +197,13 @@ respectively, and they must begin with `refs/` when applied to `--glob`
 or `--all`. If a trailing '/{asterisk}' is intended, it must be given
 explicitly.
 
+--exclude-hidden=[receive|uploadpack]::
+	Do not include refs that would be hidden by `git-receive-pack` or
+	`git-upload-pack` by consulting the appropriate `receive.hideRefs` or
+	`uploadpack.hideRefs` configuration along with `transfer.hideRefs` (see
+	linkgit:git-config[1]). This option affects the next pseudo-ref option
+	`--all` or `--glob` and is cleared after processing them.
+
 --disambiguate=<prefix>::
 	Show every object whose name begins with the given prefix.
 	The <prefix> must be at least 4 hexadecimal digits long to
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 7fa5b6991b..b5666a03bd 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -876,10 +876,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (opt_with_value(arg, "--branches", &arg)) {
+				if (ref_excludes.hidden_refs_configured)
+					return error(_("--exclude-hidden cannot be used together with --branches"));
 				handle_ref_opt(arg, "refs/heads/");
 				continue;
 			}
 			if (opt_with_value(arg, "--tags", &arg)) {
+				if (ref_excludes.hidden_refs_configured)
+					return error(_("--exclude-hidden cannot be used together with --tags"));
 				handle_ref_opt(arg, "refs/tags/");
 				continue;
 			}
@@ -888,6 +892,8 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (opt_with_value(arg, "--remotes", &arg)) {
+				if (ref_excludes.hidden_refs_configured)
+					return error(_("--exclude-hidden cannot be used together with --remotes"));
 				handle_ref_opt(arg, "refs/remotes/");
 				continue;
 			}
@@ -895,6 +901,10 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				add_ref_exclusion(&ref_excludes, arg);
 				continue;
 			}
+			if (skip_prefix(arg, "--exclude-hidden=", &arg)) {
+				exclude_hidden_refs(&ref_excludes, arg);
+				continue;
+			}
 			if (!strcmp(arg, "--show-toplevel")) {
 				const char *work_tree = get_git_work_tree();
 				if (work_tree)
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index e1abc5c2b3..aabf590dda 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -187,6 +187,46 @@ test_expect_success 'rev-parse --exclude=ref with --remotes=glob' '
 	compare rev-parse "--exclude=upstream/x --remotes=upstream/*" "upstream/one upstream/two"
 '
 
+for section in receive uploadpack
+do
+	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
+		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags" "--exclude-hidden=$section --all"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section with --all" '
+		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude=refs/heads/subspace/* --all" "--exclude-hidden=$section --all"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section with --glob" '
+		compare "-c transfer.hideRefs=refs/heads/subspace/ rev-parse" "--exclude=refs/heads/subspace/* --glob=refs/heads/*" "--exclude-hidden=$section --glob=refs/heads/*"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section can be passed once per pseudo-ref" '
+		compare "-c transfer.hideRefs=refs/remotes/ rev-parse" "--branches --tags --branches --tags" "--exclude-hidden=$section --all --exclude-hidden=$section --all"
+	'
+
+	test_expect_success "rev-parse --exclude-hidden=$section can only be passed once per pseudo-ref" '
+		echo "fatal: --exclude-hidden= passed more than once" >expected &&
+		test_must_fail git rev-parse --exclude-hidden=$section --exclude-hidden=$section 2>err &&
+		test_cmp expected err
+	'
+
+	for pseudoopt in branches tags remotes
+	do
+		test_expect_success "rev-parse --exclude-hidden=$section fails with --$pseudoopt" '
+			echo "error: --exclude-hidden cannot be used together with --$pseudoopt" >expected &&
+			test_must_fail git rev-parse --exclude-hidden=$section --$pseudoopt 2>err &&
+			test_cmp expected err
+		'
+
+		test_expect_success "rev-parse --exclude-hidden=$section fails with --$pseudoopt=pattern" '
+			echo "error: --exclude-hidden cannot be used together with --$pseudoopt" >expected &&
+			test_must_fail git rev-parse --exclude-hidden=$section --$pseudoopt=pattern 2>err &&
+			test_cmp expected err
+		'
+	done
+done
+
 test_expect_success 'rev-list --exclude=glob with --branches=glob' '
 	compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one subspace/two"
 '
-- 
2.38.1
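
As a reading aid for the tests above, here is a minimal sketch of the
one-shot behaviour they exercise: `--exclude-hidden=` may be given once,
is rejected together with `--branches`, `--tags` and `--remotes`, and is
cleared again by the next `--all` or `--glob`. This is not the code from
revision.c or builtin/rev-parse.c. `struct exclusion_state`,
`handle_arg()` and the `PARSE_*` values are hypothetical names made up
for illustration, and validation of the section name is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the parser's exclusion state. */
struct exclusion_state {
	const char *hidden_section; /* NULL when no --exclude-hidden= is pending */
};

enum parse_result {
	PARSE_OK,
	PARSE_ERR_TWICE,       /* "--exclude-hidden= passed more than once" */
	PARSE_ERR_UNSUPPORTED  /* "... cannot be used together with --tags" etc. */
};

static int starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* True if `arg` is exactly `opt` or has the form `opt=<pattern>`. */
static int is_opt(const char *arg, const char *opt)
{
	size_t len = strlen(opt);
	return !strncmp(arg, opt, len) && (arg[len] == '\0' || arg[len] == '=');
}

static enum parse_result handle_arg(struct exclusion_state *st, const char *arg)
{
	const char *prefix = "--exclude-hidden=";

	assert(st && arg);

	if (starts_with(arg, prefix)) {
		/* Only one pending exclusion is allowed per pseudo-ref option. */
		if (st->hidden_section)
			return PARSE_ERR_TWICE;
		st->hidden_section = arg + strlen(prefix);
		return PARSE_OK;
	}
	if (is_opt(arg, "--branches") || is_opt(arg, "--tags") ||
	    is_opt(arg, "--remotes"))
		return st->hidden_section ? PARSE_ERR_UNSUPPORTED : PARSE_OK;
	if (!strcmp(arg, "--all") || starts_with(arg, "--glob=")) {
		/* The pending exclusion applies to this option, then resets. */
		st->hidden_section = NULL;
		return PARSE_OK;
	}
	return PARSE_OK;
}
```

Feeding `--exclude-hidden=receive --all --exclude-hidden=receive --all`
through `handle_arg()` yields PARSE_OK for every argument, which matches
the "can be passed once per pseudo-ref" test above.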



* [PATCH v6 7/7] receive-pack: only use visible refs for connectivity check
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2022-11-17  5:47   ` [PATCH v6 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
@ 2022-11-17  5:47   ` Patrick Steinhardt
  2022-11-17 15:03   ` [PATCH v6 0/7] " Jeff King
  7 siblings, 0 replies; 88+ messages in thread
From: Patrick Steinhardt @ 2022-11-17  5:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 4675 bytes --]

When serving a push, git-receive-pack(1) needs to verify that the
packfile sent by the client contains all objects that are required by
the updated references. This connectivity check works by marking all
preexisting references as uninteresting and using the new reference tips
as starting point for a graph walk.

Marking all preexisting references as uninteresting can be a problem
when it comes to performance. Git forges tend to do internal bookkeeping
to keep alive sets of objects for internal use or make them easy to find
via certain references. These references are typically hidden away from
the user so that they are neither advertised nor writeable. At GitLab,
we have one particular repository that contains a total of 7 million
references, of which 6.8 million are indeed internal references. With
the current connectivity check we are forced to load all these
references in order to mark them as uninteresting, and this alone takes
around 15 seconds to compute.

We can optimize this by only taking into account the set of visible refs
when marking objects as uninteresting. This means that we may now walk
more objects until we hit any object that is marked as uninteresting.
But it is rather unlikely that a client pushes objects that make large
parts of the previously hidden object graph reachable, whereas the
common case is to push incremental changes that build on top of the
visible object graph.

This provides a huge boost to performance in the mentioned repository,
where the vast majority of refs are hidden. Pushing a new commit into
this repo with `transfer.hideRefs` set up to hide 6.8 million of 7
million refs, as it is configured in Gitaly, leads to a 4.5-fold speedup:

    Benchmark 1: main
      Time (mean ± σ):     30.977 s ±  0.157 s    [User: 30.226 s, System: 1.083 s]
      Range (min … max):   30.796 s … 31.071 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):      6.799 s ±  0.063 s    [User: 6.803 s, System: 0.354 s]
      Range (min … max):    6.729 s …  6.850 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        4.56 ± 0.05 times faster than 'main'

Because we mostly go through the same code paths as before even when
there are no hidden refs at all, performance is unchanged when no refs
are hidden:

    Benchmark 1: main
      Time (mean ± σ):     48.188 s ±  0.432 s    [User: 49.326 s, System: 5.009 s]
      Range (min … max):   47.706 s … 48.539 s    3 runs

    Benchmark 2: pks-connectivity-check-hide-refs
      Time (mean ± σ):     48.027 s ±  0.500 s    [User: 48.934 s, System: 5.025 s]
      Range (min … max):   47.504 s … 48.500 s    3 runs

    Summary
      'pks-connectivity-check-hide-refs' ran
        1.00 ± 0.01 times faster than 'main'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c | 2 ++
 connected.c            | 3 +++
 connected.h            | 7 +++++++
 3 files changed, 12 insertions(+)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1e24b31a0a..a90af30363 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1929,6 +1929,8 @@ static void execute_commands(struct command *commands,
 	opt.err_fd = err_fd;
 	opt.progress = err_fd && !quiet;
 	opt.env = tmp_objdir_env(tmp_objdir);
+	opt.exclude_hidden_refs_section = "receive";
+
 	if (check_connected(iterate_receive_command_list, &data, &opt))
 		set_connectivity_errors(commands, si);
 
diff --git a/connected.c b/connected.c
index 74a20cb32e..4f6388eed7 100644
--- a/connected.c
+++ b/connected.c
@@ -100,6 +100,9 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
 	if (!opt->is_deepening_fetch) {
 		strvec_push(&rev_list.args, "--not");
+		if (opt->exclude_hidden_refs_section)
+			strvec_pushf(&rev_list.args, "--exclude-hidden=%s",
+				     opt->exclude_hidden_refs_section);
 		strvec_push(&rev_list.args, "--all");
 	}
 	strvec_push(&rev_list.args, "--quiet");
diff --git a/connected.h b/connected.h
index 6e59c92aa3..16b2c84f2e 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,13 @@ struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If not NULL, use `--exclude-hidden=$section` to exclude all refs
+	 * hidden via the `$section.hideRefs` config from the set of
+	 * already-reachable refs.
+	 */
+	const char *exclude_hidden_refs_section;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }
-- 
2.38.1
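
To make the connected.c hunk above easier to follow, here is a small
standalone sketch of how the "already reachable" part of the rev-list
command line ends up looking. It is an approximation under stated
assumptions, not the actual implementation: `struct arg_list`,
`arg_push()` and `push_uninteresting_refs()` are hypothetical stand-ins
for git's strvec and for the relevant lines of check_connected().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8
#define MAX_ARG_LEN 64

/* Hypothetical stand-in for git's strvec: a fixed list of short strings. */
struct arg_list {
	char v[MAX_ARGS][MAX_ARG_LEN];
	size_t nr;
};

static void arg_push(struct arg_list *args, const char *s)
{
	assert(args->nr < MAX_ARGS);
	assert(strlen(s) < MAX_ARG_LEN);
	snprintf(args->v[args->nr++], MAX_ARG_LEN, "%s", s);
}

/*
 * Append the options that mark preexisting refs as uninteresting.
 * `section` is NULL (consider all refs) or a hideRefs section such as
 * "receive" (skip refs hidden for that section).
 */
static void push_uninteresting_refs(struct arg_list *args,
				    const char *section,
				    int is_deepening_fetch)
{
	char buf[MAX_ARG_LEN];

	if (is_deepening_fetch)
		return;

	arg_push(args, "--not");
	if (section) {
		int len = snprintf(buf, sizeof(buf),
				   "--exclude-hidden=%s", section);
		assert(len > 0 && (size_t)len < sizeof(buf));
		/* Must come before --all, which consumes the exclusion. */
		arg_push(args, buf);
	}
	arg_push(args, "--all");
}
```

With `section` set to "receive", as builtin/receive-pack.c does above,
the list becomes `--not --exclude-hidden=receive --all`. With a NULL
section it stays `--not --all`, so callers that do not set the new
field see no change.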



* Re: [PATCH v6 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2022-11-17  5:47   ` [PATCH v6 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
@ 2022-11-17 15:03   ` Jeff King
  2022-11-17 21:24     ` Taylor Blau
  7 siblings, 1 reply; 88+ messages in thread
From: Jeff King @ 2022-11-17 15:03 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Taylor Blau

On Thu, Nov 17, 2022 at 06:46:35AM +0100, Patrick Steinhardt wrote:

> this is the sixth version of my patch series that tries to improve
> performance of the connectivity check by only considering preexisting
> refs as uninteresting that could actually have been advertised to the
> client.
> 
> There are only two changes in this version compared to v5:
> 
>     - A fix to the test setup in commit 5/7 so that tests pass when
>       GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main.
> 
>     - A typo fix in the commit message of patch 6/7.

Thanks. Looking at the range diff, this seems good to me!

-Peff

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v6 0/7] receive-pack: only use visible refs for connectivity check
  2022-11-17 15:03   ` [PATCH v6 0/7] " Jeff King
@ 2022-11-17 21:24     ` Taylor Blau
  0 siblings, 0 replies; 88+ messages in thread
From: Taylor Blau @ 2022-11-17 21:24 UTC (permalink / raw)
  To: Jeff King
  Cc: Patrick Steinhardt, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Taylor Blau

On Thu, Nov 17, 2022 at 10:03:46AM -0500, Jeff King wrote:
> On Thu, Nov 17, 2022 at 06:46:35AM +0100, Patrick Steinhardt wrote:
>
> > this is the sixth version of my patch series that tries to improve
> > performance of the connectivity check by only considering preexisting
> > refs as uninteresting that could actually have been advertised to the
> > client.
> >
> > There are only two changes in this version compared to v5:
> >
> >     - A fix to the test setup in commit 5/7 so that tests pass when
> >       GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main.
> >
> >     - A typo fix in the commit message of patch 6/7.
>
> Thanks. Looking at the range diff, this seems good to me!

Yep, I concur. Let's make sure that it passes CI, too, and then start
merging this down. Thanks, both.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 88+ messages in thread

end of thread, other threads:[~2022-11-17 21:26 UTC | newest]

Thread overview: 88+ messages
2022-10-28 14:42 [PATCH 0/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
2022-10-28 14:42 ` [PATCH 1/2] connected: allow supplying different view of reachable objects Patrick Steinhardt
2022-10-28 14:54   ` Ævar Arnfjörð Bjarmason
2022-10-28 18:12   ` Junio C Hamano
2022-10-30 18:49     ` Taylor Blau
2022-10-31 13:10     ` Patrick Steinhardt
2022-11-01  1:16       ` Taylor Blau
2022-10-28 14:42 ` [PATCH 2/2] receive-pack: use advertised reference tips to inform connectivity check Patrick Steinhardt
2022-10-28 15:01   ` Ævar Arnfjörð Bjarmason
2022-10-31 14:21     ` Patrick Steinhardt
2022-10-31 15:36       ` Ævar Arnfjörð Bjarmason
2022-10-30 19:09   ` Taylor Blau
2022-10-31 14:45     ` Patrick Steinhardt
2022-11-01  1:28       ` Taylor Blau
2022-11-01  7:20         ` Patrick Steinhardt
2022-11-01 11:53           ` Patrick Steinhardt
2022-11-02  1:05             ` Taylor Blau
2022-11-01  8:28       ` Jeff King
2022-10-28 16:40 ` [PATCH 0/2] " Junio C Hamano
2022-11-01  1:30 ` Taylor Blau
2022-11-01  9:00 ` Jeff King
2022-11-01 11:49   ` Patrick Steinhardt
2022-11-03 14:37 ` [PATCH v2 0/3] receive-pack: only use visible refs for " Patrick Steinhardt
2022-11-03 14:37   ` [PATCH v2 1/3] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-03 14:37   ` [PATCH v2 2/3] revision: add new parameter to specify all visible refs Patrick Steinhardt
2022-11-05 12:46     ` Jeff King
2022-11-07  8:20       ` Patrick Steinhardt
2022-11-08 14:32         ` Jeff King
2022-11-05 12:55     ` Jeff King
2022-11-03 14:37   ` [PATCH v2 3/3] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-05  0:40   ` [PATCH v2 0/3] " Taylor Blau
2022-11-05 12:55     ` Jeff King
2022-11-05 12:52   ` Jeff King
2022-11-07 12:16 ` [PATCH v3 0/6] " Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 2/6] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-07 12:51     ` Ævar Arnfjörð Bjarmason
2022-11-08  9:11       ` Patrick Steinhardt
2022-11-07 12:16   ` [PATCH v3 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-07 13:34     ` Ævar Arnfjörð Bjarmason
2022-11-07 17:07       ` Ævar Arnfjörð Bjarmason
2022-11-08  9:48         ` Patrick Steinhardt
2022-11-08  9:22       ` Patrick Steinhardt
2022-11-08  0:57     ` Taylor Blau
2022-11-08  8:16       ` Patrick Steinhardt
2022-11-08 14:42         ` Jeff King
2022-11-07 12:16   ` [PATCH v3 5/6] revparse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-08 14:44     ` Jeff King
2022-11-07 12:16   ` [PATCH v3 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-08  0:59   ` [PATCH v3 0/6] " Taylor Blau
2022-11-08 10:03 ` [PATCH v4 " Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 1/6] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-08 13:36     ` Ævar Arnfjörð Bjarmason
2022-11-08 14:49       ` Patrick Steinhardt
2022-11-08 14:51     ` Jeff King
2022-11-08 10:03   ` [PATCH v4 2/6] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 3/6] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 4/6] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-08 15:07     ` Jeff King
2022-11-08 21:13       ` Taylor Blau
2022-11-11  5:48       ` Patrick Steinhardt
2022-11-08 10:03   ` [PATCH v4 5/6] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-08 10:04   ` [PATCH v4 6/6] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-11  6:49 ` [PATCH v5 0/7] " Patrick Steinhardt
2022-11-11  6:49   ` [PATCH v5 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
2022-11-11  6:49   ` [PATCH v5 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 3/7] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-11  6:50   ` [PATCH v5 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-11 22:18   ` [PATCH v5 0/7] " Taylor Blau
2022-11-15 17:26     ` Jeff King
2022-11-16 21:22       ` Taylor Blau
2022-11-16 22:04         ` Jeff King
2022-11-16 22:33           ` Taylor Blau
2022-11-17  5:45             ` Patrick Steinhardt
2022-11-17  5:46 ` [PATCH v6 " Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 1/7] refs: fix memory leak when parsing hideRefs config Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 2/7] refs: get rid of global list of hidden refs Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 3/7] revision: move together exclusion-related functions Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 4/7] revision: introduce struct to handle exclusions Patrick Steinhardt
2022-11-17  5:46   ` [PATCH v6 5/7] revision: add new parameter to exclude hidden refs Patrick Steinhardt
2022-11-17  5:47   ` [PATCH v6 6/7] rev-parse: add `--exclude-hidden=` option Patrick Steinhardt
2022-11-17  5:47   ` [PATCH v6 7/7] receive-pack: only use visible refs for connectivity check Patrick Steinhardt
2022-11-17 15:03   ` [PATCH v6 0/7] " Jeff King
2022-11-17 21:24     ` Taylor Blau
