git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/3] Filter alternate references
@ 2018-09-20 18:04 Taylor Blau
  2018-09-20 18:04 ` [PATCH 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
                   ` (8 more replies)
  0 siblings, 9 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-20 18:04 UTC (permalink / raw)
  To: git; +Cc: peff

Hi,

This is a series to customize Git's behavior when listing references
from an alternate repository. It is motivated by the following example:

Consider an upstream repository, a fork of it, and a local copy of that
fork. Ideally, running "git pull upstream" from the local copy followed
by a "git push fork" should be a lightweight operation, because
the fork already "knows" about the new objects introduced upstream.

Today, we do this by means of the special ".have" references advertised
by 'git receive-pack'. This special part of the advertisement is
designed to tell the pusher about tips that it might want to know about,
to avoid sending them again.

This optimization is a good one and works well, particularly when the
upstream repository has a relatively normal number of references. When
the upstream has a pathologically _large_ number of references, the
advertisement alone can be so time-consuming that it's faster to send
redundant objects to the fork.

To make the reference advertisement manageable even with a large number
of references, let's allow the fork to select which ones it thinks might
be "interesting", and only advertise those. This makes the advertisement
much smaller, and lets us take advantage of the ".have" references, even
when the upstream contains more references than we're advertising.

This series implements the above functionality by means of
"core.alternateRefsCommand", and "core.alternateRefsPrefixes", either a
command to run in place of "git for-each-ref", or arguments to be
appended to "git for-each-ref".

The order of precedence when listing references from an alternate is as
follows:

  1. If the fork configures "core.alternateRefsCommand", run that.

  2. If the fork configures "core.alternateRefsPrefixes", run 'git
     for-each-ref', limiting results to references that have any of the
     given values as a prefix.

  3. Otherwise, run 'git for-each-ref' in the alternate.
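
As a rough sketch of that precedence (this is not the code in the
series): the hypothetical helper below prints the command line that
would be used for an alternate. The function name and its three
positional parameters are made up for illustration, and it treats an
empty value as "unset", which simplifies how Git actually reads
configuration.

```shell
# Hypothetical helper, for illustration only.
#   $1: path of the alternate
#   $2: value of core.alternateRefsCommand (empty means unset here)
#   $3: value of core.alternateRefsPrefixes (empty means unset here)
describe_alternate_refs_command () {
    format='--format=%(objectname) %(refname)'
    if test -n "$2"
    then
        # 1. The configured command wins; it receives the alternate's path.
        printf '%s %s\n' "$2" "$1"
    elif test -n "$3"
    then
        # 2. Limit 'git for-each-ref' to the configured prefixes.
        printf 'git --git-dir=%s for-each-ref %s -- %s\n' "$1" "$format" "$3"
    else
        # 3. List every reference in the alternate.
        printf 'git --git-dir=%s for-each-ref %s\n' "$1" "$format"
    fi
}

# Example: only prefixes are configured, so case 2 applies.
describe_alternate_refs_command /path/to/alternate.git "" "refs/heads refs/tags"
```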

In a previous version of this series, I taught the configuration
property to the alternate, as in "these are the references that _I_
think _you_ will find interesting," rather than the other way around. I
ultimately decided on what is attached here so that the fork does not
have to trust the upstream to run arbitrary shell commands.

Thanks,
Taylor

Taylor Blau (3):
  transport.c: extract 'fill_alternate_refs_command'
  transport.c: introduce core.alternateRefsCommand
  transport.c: introduce core.alternateRefsPrefixes

 Documentation/config.txt | 12 +++++++++
 t/t5410-receive-pack.sh  | 58 ++++++++++++++++++++++++++++++++++++++++
 transport.c              | 34 ++++++++++++++++++-----
 3 files changed, 98 insertions(+), 6 deletions(-)
 create mode 100755 t/t5410-receive-pack.sh

--
2.19.0


* [PATCH 1/3] transport.c: extract 'fill_alternate_refs_command'
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
@ 2018-09-20 18:04 ` Taylor Blau
  2018-09-20 18:04 ` [PATCH 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-20 18:04 UTC (permalink / raw)
  To: git; +Cc: peff

To list alternate references, 'read_alternate_refs' creates a child
process running 'git for-each-ref' in the alternate's Git directory.

Prepare to run other commands besides 'git for-each-ref' by introducing
and moving the relevant code from 'read_alternate_refs' to
'fill_alternate_refs_command'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 transport.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/transport.c b/transport.c
index 1c76d64aba..24ae3f375d 100644
--- a/transport.c
+++ b/transport.c
@@ -1325,6 +1325,17 @@ char *transport_anonymize_url(const char *url)
 	return xstrdup(url);
 }
 
+static void fill_alternate_refs_command(struct child_process *cmd,
+					const char *repo_path)
+{
+	cmd->git_cmd = 1;
+	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+	argv_array_push(&cmd->args, "for-each-ref");
+	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+	cmd->env = local_repo_env;
+	cmd->out = -1;
+}
+
 static void read_alternate_refs(const char *path,
 				alternate_ref_fn *cb,
 				void *data)
@@ -1333,12 +1344,7 @@ static void read_alternate_refs(const char *path,
 	struct strbuf line = STRBUF_INIT;
 	FILE *fh;
 
-	cmd.git_cmd = 1;
-	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
-	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
-	cmd.env = local_repo_env;
-	cmd.out = -1;
+	fill_alternate_refs_command(&cmd, path);
 
 	if (start_command(&cmd))
 		return;
-- 
2.19.0



* [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
  2018-09-20 18:04 ` [PATCH 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
@ 2018-09-20 18:04 ` Taylor Blau
  2018-09-20 19:37   ` Jeff King
  2018-09-21 16:39   ` Junio C Hamano
  2018-09-20 18:04 ` [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-20 18:04 UTC (permalink / raw)
  To: git; +Cc: peff

When in a repository containing one or more alternates, Git would
sometimes like to list references from its alternates. For example, 'git
receive-pack' lists the objects pointed to by alternate references as
special ".have" references.

Listing ".have" references is designed to make pushing changes from
upstream to a fork a lightweight operation, by advertising to the pusher
that the fork already has the objects (via its alternate). Thus, the
client can avoid sending them.

However, when the alternate has a pathologically large number of
references, the initial advertisement is too expensive. In fact, it can
dominate any such optimization where the pusher avoids sending certain
objects.

Introduce "core.alternateRefsCommand" in order to provide a facility to
limit or filter alternate references. This can be used, for example, to
filter out "uninteresting" references from the initial advertisement in
the above scenario.

Let the repository that has alternates configure this command to avoid
trusting the alternate to provide us a safe command to run in the shell.
To behave differently on each alternate (e.g., only list tags from
alternate A, only heads from B) provide the path of the alternate as the
first argument.
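
For illustration only, such a command could be a small script that
switches on its first argument. The following is a sketch: the function
name and the '*/A.git' and '*/B.git' path patterns are hypothetical and
are not part of this patch.

```shell
# Hypothetical body for a core.alternateRefsCommand script, wrapped in a
# function so that it can be read on its own; "$1" is the alternate's path.
list_alternate_refs () {
    format='--format=%(objectname) %(refname)'
    case "$1" in
    */A.git)
        # Only tags from alternate A.
        git --git-dir="$1" for-each-ref "$format" refs/tags
        ;;
    */B.git)
        # Only heads from alternate B.
        git --git-dir="$1" for-each-ref "$format" refs/heads
        ;;
    *)
        # Any other alternate: list everything, as Git does by default.
        git --git-dir="$1" for-each-ref "$format"
        ;;
    esac
}
```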

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt |  6 +++++
 t/t5410-receive-pack.sh  | 47 ++++++++++++++++++++++++++++++++++++++++
 transport.c              | 19 ++++++++++++----
 3 files changed, 68 insertions(+), 4 deletions(-)
 create mode 100755 t/t5410-receive-pack.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 112041f407..b908bc5825 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -616,6 +616,12 @@ core.preferSymlinkRefs::
 	This is sometimes needed to work with old scripts that
 	expect HEAD to be a symbolic link.
 
+core.alternateRefsCommand::
+	When listing references from an alternate (e.g., in the case of ".have"), use
+	the shell to execute the specified command instead of
+	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
+	Output must be of the form: `%(objectname) SPC %(refname)`.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
new file mode 100755
index 0000000000..09fb3f39a1
--- /dev/null
+++ b/t/t5410-receive-pack.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+test_description='git receive-pack test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit one &&
+	git update-ref refs/heads/a HEAD &&
+	test_commit two &&
+	git update-ref refs/heads/b HEAD &&
+	test_commit three &&
+	git update-ref refs/heads/c HEAD &&
+	git clone --bare . fork &&
+	git clone fork pusher &&
+	(
+		cd fork &&
+		git config receive.advertisealternates true &&
+		git update-ref -d refs/heads/a &&
+		git update-ref -d refs/heads/b &&
+		git update-ref -d refs/heads/c &&
+		git update-ref -d refs/heads/master &&
+		git update-ref -d refs/tags/one &&
+		git update-ref -d refs/tags/two &&
+		git update-ref -d refs/tags/three &&
+		printf "../../.git/objects" >objects/info/alternates
+	)
+'
+
+extract_haves () {
+	depacketize - | grep -o '^.* \.have'
+}
+
+test_expect_success 'with core.alternateRefsCommand' '
+	test_config -C fork core.alternateRefsCommand \
+		"git --git-dir=\"\$1\" for-each-ref \
+		--format=\"%(objectname) %(refname)\" \
+		refs/heads/a refs/heads/c;:" &&
+	cat >expect <<-EOF &&
+	$(git rev-parse a) .have
+	$(git rev-parse c) .have
+	EOF
+	printf "0000" | git receive-pack fork | extract_haves >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 24ae3f375d..e7d2cdf00b 100644
--- a/transport.c
+++ b/transport.c
@@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
 static void fill_alternate_refs_command(struct child_process *cmd,
 					const char *repo_path)
 {
-	cmd->git_cmd = 1;
-	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
-	argv_array_push(&cmd->args, "for-each-ref");
-	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+	const char *value;
+
+	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
+		cmd->use_shell = 1;
+
+		argv_array_push(&cmd->args, value);
+		argv_array_push(&cmd->args, repo_path);
+	} else {
+		cmd->git_cmd = 1;
+
+		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+		argv_array_push(&cmd->args, "for-each-ref");
+		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+	}
+
 	cmd->env = local_repo_env;
 	cmd->out = -1;
 }
-- 
2.19.0



* [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
  2018-09-20 18:04 ` [PATCH 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
  2018-09-20 18:04 ` [PATCH 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-09-20 18:04 ` Taylor Blau
  2018-09-20 19:47   ` Jeff King
  2018-09-21  7:19   ` Eric Sunshine
  2018-09-20 18:35 ` [PATCH 0/3] Filter alternate references Stefan Beller
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-20 18:04 UTC (permalink / raw)
  To: git; +Cc: peff

The recently-introduced "core.alternateRefsCommand" allows callers to
specify with high flexibility the tips that they wish to advertise from
alternates. This flexibility comes at the cost of some inconvenience
when the caller only wishes to limit the advertisement to one or more
prefixes.

For example, to advertise only tags, a caller using
'core.alternateRefsCommand' would have to do:

  $ git config core.alternateRefsCommand ' \
      git -C "$1" for-each-ref refs/tags \
      --format="%(objectname) %(refname)" \
    '

The above is cumbersome to write, so let's introduce a
"core.alternateRefsPrefixes" to address this common case. Instead, the
caller can run:

  $ git config core.alternateRefsPrefixes 'refs/tags'

Which will behave identically to the longer example using
"core.alternateRefsCommand".

Since the value of "core.alternateRefsPrefixes" is appended to 'git
for-each-ref' and then executed, include a "--" before taking the
configured value to avoid misinterpreting arguments as flags to 'git
for-each-ref'.

In the case that the caller wishes to specify multiple prefixes, they
may separate them by whitespace. If "core.alternateRefsCommand" is set,
it will take precedence over "core.alternateRefsPrefixes".
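
To illustrate the splitting and the role of "--", here is a sketch that
builds the argument list in plain shell. It only mimics the behavior
described above; it is not how transport.c does it, and the function
name and the example value are made up.

```shell
# Hypothetical helper: "$1" is a core.alternateRefsPrefixes value.
alternate_refs_args () {
    # $1 is left unquoted on purpose so that the shell splits it on
    # whitespace; everything after "--" is a pattern, never an option.
    set -- for-each-ref '--format=%(objectname) %(refname)' -- $1
    # Print one argument per line to show the resulting argument list.
    printf '%s\n' "$@"
}

alternate_refs_args 'refs/heads refs/tags'
```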

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt |  6 ++++++
 t/t5410-receive-pack.sh  | 11 +++++++++++
 transport.c              |  5 +++++
 3 files changed, 22 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b908bc5825..d768c57310 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -622,6 +622,12 @@ core.alternateRefsCommand::
 	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
 	Output must be of the form: `%(objectname) SPC %(refname)`.
 
+core.alternateRefsPrefixes::
+	When listing references from an alternate, list only references that begin
+	with the given prefix. To list multiple prefixes, separate them with a
+	whitespace character. If `core.alternateRefsCommand` is set, setting
+	`core.alternateRefsPrefixes` has no effect.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
index 09fb3f39a1..df2830e9f6 100755
--- a/t/t5410-receive-pack.sh
+++ b/t/t5410-receive-pack.sh
@@ -44,4 +44,15 @@ test_expect_success 'with core.alternateRefsCommand' '
 	test_cmp expect actual
 '
 
+test_expect_success 'with core.alternateRefsPrefixes' '
+	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
+	cat >expect <<-EOF &&
+	$(git rev-parse one) .have
+	$(git rev-parse three) .have
+	$(git rev-parse two) .have
+	EOF
+	printf "0000" | git receive-pack fork | extract_haves >actual &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/transport.c b/transport.c
index e7d2cdf00b..9323e5c3cd 100644
--- a/transport.c
+++ b/transport.c
@@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
 		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
 		argv_array_push(&cmd->args, "for-each-ref");
 		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+
+		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
+			argv_array_push(&cmd->args, "--");
+			argv_array_split(&cmd->args, value);
+		}
 	}
 
 	cmd->env = local_repo_env;
-- 
2.19.0


* Re: [PATCH 0/3] Filter alternate references
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
                   ` (2 preceding siblings ...)
  2018-09-20 18:04 ` [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
@ 2018-09-20 18:35 ` Stefan Beller
  2018-09-20 18:56   ` Taylor Blau
  2018-09-20 19:27   ` Jeff King
  2018-09-20 19:21 ` Jeff King
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 94+ messages in thread
From: Stefan Beller @ 2018-09-20 18:35 UTC (permalink / raw)
  To: ttaylorr; +Cc: git, Jeff King

On Thu, Sep 20, 2018 at 11:04 AM Taylor Blau <ttaylorr@github.com> wrote:
>
> Hi,
>
> This is a series to customize Git's behavior when listing references
> from an alternate repository. It is motivated by the following example:
>
> Consider an upstream repository, a fork of it, and a local copy of that
> fork. Ideally, running "git pull upstream" from the local copy followed
> by a "git push fork" should be a lightweight operation, because
> the fork already "knows" about the new objects introduced upstream.
>
> Today, we do this by means of the special ".have" references advertised
> by 'git receive-pack'. This special part of the advertisement is
> designed to tell the pusher about tips that it might want to know about,
> to avoid sending them again.
>
> This optimization is a good one and works well, particularly when the
> upstream repository has a relatively normal number of references. When
> the upstream has a pathologically _large_ number of references, the
> advertisement alone can be so time-consuming that it's faster to send
> redundant objects to the fork.

(tangent:)
The current fetch protocol consists of 2 parts:
negotiation + sending the packfile, and the negotiation only tries
to trim down the size of the packfile to send, without taking its own
cost (in terms of time and bandwidth) into account, just to produce
a perfect pack to send to the client.

When talking about designing protocol v2 for push (which has not
landed yet[1]), we had some in-office discussions whether we
want to have a proper negotiation on push, as it would help
pushing to remotes that have non-ff pushes, but not necessarily
regular pushes, as they should be fine with just the refs advertisement.

[1] https://github.com/bmwill/git/commit/57a4e6e5d18a2d4d806fc8dec644b89affd50853
bmwill@ no longer works on it though.


>
> To make the reference advertisement manageable even with a large number
> of references, let's allow the fork to select which ones it thinks might
> be "interesting", and only advertise those. This makes the advertisement
> much smaller, and lets us take advantage of the ".have" references, even
> when the upstream contains more references than we're advertising.
>
> This series implements the above functionality by means of
> "core.alternateRefsCommand", and "core.alternateRefsPrefixes", either a
> command to run in place of "git for-each-ref", or arguments to be
> appended to "git for-each-ref".
>
> The order of precedence when listing references from an alternate is as
> follows:
>
>   1. If the fork configures "core.alternateRefsCommand", run that.
>
>   2. If the fork configures "core.alternateRefsPrefixes", run 'git
>      for-each-ref', limiting results to references that have any of the
>      given values as a prefix.
>
>   3. Otherwise, run 'git for-each-ref' in the alternate.
>
> In a previous version of this series, I taught the configuration
> property to the alternate, as in "these are the references that _I_
> think _you_ will find interesting," rather than the other way around. I
> ultimately decided on what is attached here so that the fork does not
> have to trust the upstream to run arbitrary shell commands.

Would it make sense to estimate the value of each .have before
advertising them and then advertise only the <n> most valuable
.haves ?
(e.g. if a .have is only one small commit ahead of origin/master,
it may not bring a lot of value as the potential savings are small,
but if that .have contains history between master..TIP that has lots
of big blobs or objects in general, this may be valuable to know)

Stefan


* Re: [PATCH 0/3] Filter alternate references
  2018-09-20 18:35 ` [PATCH 0/3] Filter alternate references Stefan Beller
@ 2018-09-20 18:56   ` Taylor Blau
  2018-09-20 19:27   ` Jeff King
  1 sibling, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-20 18:56 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jeff King

Hi Stefan,

On Thu, Sep 20, 2018 at 11:35:23AM -0700, Stefan Beller wrote:
> > To make the reference advertisement manageable even with a large number
> > of references, let's allow the fork to select which ones it thinks might
> > be "interesting", and only advertise those. This makes the advertisement
> > much smaller, and lets us take advantage of the ".have" references, even
> > when the upstream contains more references than we're advertising.
> >
> > This series implements the above functionality by means of
> > "core.alternateRefsCommand", and "core.alternateRefsPrefixes", either a
> > command to run in place of "git for-each-ref", or arguments to be
> > appended to "git for-each-ref".
> >
> > The order of precedence when listing references from an alternate is as
> > follows:
> >
> >   1. If the fork configures "core.alternateRefsCommand", run that.
> >
> >   2. If the fork configures "core.alternateRefsPrefixes", run 'git
> >      for-each-ref', limiting results to references that have any of the
> >      given values as a prefix.
> >
> >   3. Otherwise, run 'git for-each-ref' in the alternate.
> >
> > In a previous version of this series, I taught the configuration
> > property to the alternate, as in "these are the references that _I_
> > think _you_ will find interesting," rather than the other way around. I
> > ultimately decided on what is attached here so that the fork does not
> > have to trust the upstream to run arbitrary shell commands.
>
> Would it make sense to estimate the value of each .have before
> advertising them and then advertise only the <n> most valuable
> .haves ?
> (e.g. if a .have is only one small commit ahead of origin/master,
> it may not bring a lot of value as the potential savings are small,
> but if that .have contains history between master..TIP that has lots
> of big blobs or objects in general, this may be valuable to know)

I think that this sort of filtering should be theoretically possible
by configuring "core.alternateRefsCommand", perhaps to execute a script
like:

  # "$1" is the path of the alternate; $minimum_size is assumed to be
  # set by the caller (in bytes).
  cd "$1" &&
  git for-each-ref --format="%(objectname) %(refname)" |
  while read objectname refname; do
    # Sum the sizes of the objects listed for master...$objectname.
    total_size="$(git rev-list --objects master...$objectname \
      | awk '{ print $1 }' \
      | git cat-file --batch-check='%(objectsize)' \
      | awk '{ sum+=$1 } END { print sum + 0 }')"

    if [ "$total_size" -gt "$minimum_size" ]; then
      echo "$objectname $refname"
    fi
  done

But that's quite inefficient to compute, since you're walking the same
parts of the graph over and over again.

Perhaps we could teach Git to do something better? I suppose that just
"core.alternateRefsPrefixes" could do this by default (or with another
knob) to further optimize the simpler case. But I think that we'd be
equally OK without it, since push over V2 obviates the need for this
sort of optimization (as you noted in the unquoted part of this
response).

My inclination is to avoid teaching this to Git, and let callers
script it into their "core.alternateRefsCommand" if they really desire
it.

Does that seem OK?


Thanks,
Taylor


* Re: [PATCH 0/3] Filter alternate references
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
                   ` (3 preceding siblings ...)
  2018-09-20 18:35 ` [PATCH 0/3] Filter alternate references Stefan Beller
@ 2018-09-20 19:21 ` Jeff King
  2018-09-21 18:47 ` [PATCH v2 " Taylor Blau
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-20 19:21 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Thu, Sep 20, 2018 at 02:04:05PM -0400, Taylor Blau wrote:

> This is a series to customize Git's behavior when listing references
> from an alternate repository. It is motivated by the following example:
> 
> Consider an upstream repository, a fork of it, and a local copy of that
> fork. Ideally, running "git pull upstream" from the local copy followed
> by a "git push fork" should be a lightweight operation, because
> the fork already "knows" about the new objects introduced upstream.
> 
> Today, we do this by means of the special ".have" references advertised
> by 'git receive-pack'. This special part of the advertisement is
> designed to tell the pusher about tips that it might want to know about,
> to avoid sending them again.

I think it's important to note that this is just one place where this
optimization is useful. A few others are:

  1. On fetching, the client similarly advertises the extra tips (not in
     a ref advertisement, but as part of the negotiation).

  2. We don't do it now, but we ought to use those for checking the
     connectivity of incoming objects. Otherwise we end up walking over
     history that we already know we have. Since this is purely local,
     it's not usually as big a deal, but it can matter a lot in large
     repositories, because it makes what should be O(nr_changes)
     fetches into O(size_of_repo). E.g., imagine making a fork of
     linux.git backed by the same shared-object alternate. The initial
     "fetch" should be a noop as we realize that we have everything
     already, but we spend 45s of CPU walking the whole graph.

     I have patches for this, but haven't sent them, since without the
     optimization you've done here, we'd never be able to turn it on at
     GitHub.

  3. Other scripts may want us to expose this. The patches I have for
     (2) actually implement "rev-list --alternate-refs" (since we
     implement the connectivity check there). I don't have other
     particular uses in mind, but it lets you ask questions like "which
     objects are reachable here versus in the alternate".

Your patches would affect all of those sites, and I think that's a
good thing. It's giving a consistent view of "what can I assume is
reachable from the alternate?", which is OK to be a subset of the whole
(and already is, really, since we don't peek into the alternate's
reflogs).

> In a previous version of this series, I taught the configuration
> property to the alternate, as in "these are the references that _I_
> think _you_ will find interesting," rather than the other way around. I
> ultimately decided on what is attached here so that the fork does not
> have to trust the upstream to run arbitrary shell commands.

Right, we had a lot of discussion here (which I'm repeating not for you
but for the benefit of the list). It might seem conceptually simpler
for the alternate itself to say "what are my important refs?". And that
nicely generalizes if you have multiple alternates. But in our use case,
"important" here is in the eye of the beholder. If a bunch of repos are
sharing object storage, and repo Y is derived from repo X, then refs
related to X are going to be most important when you're doing an
operation in Y. But in some repo Q derived from R, that wouldn't be the
case.

So I think you could make an argument either way there. But simplifying
the security boundary around core.alternateRefsCommand pushes it in
favor of having all of this decided by the repo doing the looking,
rather than the one it's looking at.

-Peff


* Re: [PATCH 0/3] Filter alternate references
  2018-09-20 18:35 ` [PATCH 0/3] Filter alternate references Stefan Beller
  2018-09-20 18:56   ` Taylor Blau
@ 2018-09-20 19:27   ` Jeff King
  1 sibling, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-20 19:27 UTC (permalink / raw)
  To: Stefan Beller; +Cc: ttaylorr, git

On Thu, Sep 20, 2018 at 11:35:23AM -0700, Stefan Beller wrote:

> > This optimization is a good one and works well, particularly when the
> > upstream repository has a relatively normal number of references. When
> > the upstream has a pathologically _large_ number of references, the
> > advertisement alone can be so time-consuming that it's faster to send
> > redundant objects to the fork.
> 
> (tangent:)
> The current fetch protocol consists of 2 parts:
> negotiation + sending the packfile, and the negotiation only tries
> to trim down the size of the packfile to send, without taking its own
> cost (in terms of time and bandwidth) into account, just to produce
> a perfect pack to send to the client.
> 
> When talking about designing protocol v2 for push (which has not
> landed yet[1]), we had some in-office discussions whether we
> want to have a proper negotiation on push, as it would help
> pushing to remotes that have non-ff pushes, but not necessarily
> regular pushes, as they should be fine with just the refs advertisement.

I don't think that materially changes anything. We already do this same
trick on fetch (but just with the client advertising the extra haves,
since it's the receiver). So if push started doing a real negotiation,
we'd still want to feed those haves in the same way.

> Would it make sense to estimate the value of each .have before
> advertising them and then advertise only the <n> most valuable
> .haves ?
> (e.g. if a .have is only one small commit ahead of origin/master,
> it may not bring a lot of value as the potential savings are small,
> but if that .have contains history between master..TIP that has lots
> of big blobs or objects in general, this may be valuable to know)

That sounds neat, but I think it is mostly orthogonal here. We're primarily
interested in just narrowing down the initial set of possibilities, so
you could cull it further.

And I see Taylor just responded with the idea that you could do this in
your hook. Which is neat, but definitely not something we are planning
on doing with it immediately. ;)

-Peff


* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-20 18:04 ` [PATCH 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-09-20 19:37   ` Jeff King
  2018-09-20 20:00     ` Taylor Blau
  2018-09-21 16:39   ` Junio C Hamano
  1 sibling, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-20 19:37 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Thu, Sep 20, 2018 at 02:04:11PM -0400, Taylor Blau wrote:

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 112041f407..b908bc5825 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -616,6 +616,12 @@ core.preferSymlinkRefs::
>  	This is sometimes needed to work with old scripts that
>  	expect HEAD to be a symbolic link.
>  
> +core.alternateRefsCommand::
> +	When listing references from an alternate (e.g., in the case of ".have"), use
> +	the shell to execute the specified command instead of
> +	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
> +	Output must be of the form: `%(objectname) SPC %(refname)`.

We discussed off-list the notion that this could just be the objectname,
since the ".have" mechanism doesn't care about the actual refnames.

There's a little prior discussion from the list:

  https://public-inbox.org/git/xmqqefzraqbu.fsf@gitster.mtv.corp.google.com/

My "rev-list --alternate-refs" patches _do_ use the refnames, since you
could do something like "--source" that cares about them. But there's
some awkwardness there, because the names are in a different namespace
than the rest of the refs. If we were to just say "nope, you do not get
to see the names of the alternates" then that awkwardness goes away. But
it also loses some information that could _possibly_ be of use to a
caller.

Back in that earlier discussion I did not have a strong opinion, but
here we are cementing that decision into a user-visible interface. So it
probably makes sense to revisit and decide once and for all.

> +test_description='git receive-pack test'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	test_commit one &&
> +	git update-ref refs/heads/a HEAD &&
> +	test_commit two &&
> +	git update-ref refs/heads/b HEAD &&
> +	test_commit three &&
> +	git update-ref refs/heads/c HEAD &&
> +	git clone --bare . fork &&
> +	git clone fork pusher &&
> +	(
> +		cd fork &&
> +		git config receive.advertisealternates true &&
> +		git update-ref -d refs/heads/a &&
> +		git update-ref -d refs/heads/b &&
> +		git update-ref -d refs/heads/c &&
> +		git update-ref -d refs/heads/master &&
> +		git update-ref -d refs/tags/one &&
> +		git update-ref -d refs/tags/two &&
> +		git update-ref -d refs/tags/three &&

Probably not worth nit-picking process count, but this could be done with a
single "update-ref --stdin".
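
Something like this, perhaps. It is only a sketch: the function wrapper
and its name are hypothetical, there just so the snippet stands alone,
and it assumes it is run from inside "fork".

```shell
# Hypothetical helper: delete all of the fork's refs with one process.
# Each "delete <ref>" line is one command read by 'git update-ref --stdin'.
delete_fork_refs () {
    git update-ref --stdin <<\EOF
delete refs/heads/a
delete refs/heads/b
delete refs/heads/c
delete refs/heads/master
delete refs/tags/one
delete refs/tags/two
delete refs/tags/three
EOF
}
```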

> +		printf "../../.git/objects" >objects/info/alternates

Also a nitpick, but I think "echo" would be more usual here (we handle
the lack of a trailing newline just fine, but any use of printf makes me
wonder if something tricky is going on with line endings).

> +test_expect_success 'with core.alternateRefsCommand' '
> +	test_config -C fork core.alternateRefsCommand \
> +		"git --git-dir=\"\$1\" for-each-ref \
> +		--format=\"%(objectname) %(refname)\" \
> +		refs/heads/a refs/heads/c;:" &&

This is cute and all, but might it be more readable to use
write_script() to stick it into its own script?

> +	cat >expect <<-EOF &&
> +	$(git rev-parse a) .have
> +	$(git rev-parse c) .have
> +	EOF
> +	printf "0000" | git receive-pack fork | extract_haves >actual &&

There's been a push lately to avoid having git on the left-hand side of
a fork, since we might otherwise miss its exit code (including things
like asan/valgrind errors). So maybe:

   ... receive-pack fork >actual &&
   extract_haves <actual >actual.haves &&
   test_cmp expect actual.haves

or similar?
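
A small sketch of why the left-hand side of a pipeline is risky, assuming
the shell runs without "pipefail". Here "false" is only a stand-in for a
git invocation that fails:

```shell
# In a pipeline, the shell reports only the status of the last command,
# so the failure of "false" is lost.
pipe_status=0
false | cat >/dev/null || pipe_status=$?

# Run on its own, with output redirected, the failure is visible.
direct_status=0
false >/dev/null || direct_status=$?

echo "pipeline: $pipe_status, standalone: $direct_status"
```

Redirecting to a file and post-processing in a second step, as suggested
above, keeps the exit code in the &&-chain.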

> diff --git a/transport.c b/transport.c
> index 24ae3f375d..e7d2cdf00b 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
>  static void fill_alternate_refs_command(struct child_process *cmd,
>  					const char *repo_path)
>  {
> -	cmd->git_cmd = 1;
> -	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> -	argv_array_push(&cmd->args, "for-each-ref");
> -	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> +	const char *value;
> +
> +	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
> +		cmd->use_shell = 1;
> +
> +		argv_array_push(&cmd->args, value);
> +		argv_array_push(&cmd->args, repo_path);

Setting use_shell allows the shell trickery in your test, and matches
the modern way we run config-based commands. Good.
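
To illustrate the trickery: my understanding of the use_shell handling
(an assumption, it is not spelled out in this thread) is that the
configured string is handed to "sh -c" with ' "$@"' appended, so the
alternate's path arrives as $1 and the trailing ";:" in the test makes the
appended arguments feed a no-op. A sketch with "echo" standing in for the
real "git for-each-ref" call and a made-up path:

```shell
# Hypothetical stand-in for the configured value.
value='echo "listing refs in $1";:'

# Assumed shape of the invocation: value plus ' "$@"' as the script,
# the value again as $0, and the alternate's path as $1.
output=$(sh -c "$value \"\$@\"" "$value" /path/to/alternate)
echo "$output"
```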

> +	} else {
> +		cmd->git_cmd = 1;
> +
> +		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> +		argv_array_push(&cmd->args, "for-each-ref");
> +		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> +	}
> +
>  	cmd->env = local_repo_env;
>  	cmd->out = -1;

And we still clear local_repo_env for the custom command, which is good
to avoid confusion like $GIT_DIR being set when the custom command does
"cd $1 && git ...". Good.

-Peff

* Re: [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-20 18:04 ` [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
@ 2018-09-20 19:47   ` Jeff King
  2018-09-20 20:12     ` Taylor Blau
  2018-09-21  7:19   ` Eric Sunshine
  1 sibling, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-20 19:47 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Thu, Sep 20, 2018 at 02:04:13PM -0400, Taylor Blau wrote:

> The recently-introduced "core.alternateRefsCommand" allows callers to
> specify with high flexibility the tips that they wish to advertise from
> alternates. This flexibility comes at the cost of some inconvenience
> when the caller only wishes to limit the advertisement to one or more
> prefixes.

To be clear: this isn't something we plan to use at GitHub at all. It
just seemed like a nice "in between" the current inflexible state and
the "incredibly flexible but not trivial to use" command from patch 2.

Note that unlike core.alternateRefsCommand, there are no security issues
here with reading this from the alternate, although:

 - it's a little awkward to read the config from the alternate

 - since these are clearly related config, it probably makes sense for
   them to be consistent

> For example, to advertise only tags, a caller using
> 'core.alternateRefsCommand' would have to do:
> 
>   $ git config core.alternateRefsCommand ' \
>       git -C "$1" for-each-ref refs/tags \
>       --format="%(objectname) %(refname)" \
>     '

I think it's more likely that advertising only heads would make sense.
The pathological repos I see are usually a sane number of branches and
then an absurd number of tags.

Not that it's super important, but I wonder if we should give a
motivating example like this in the documentation. In which case we'd
probably want to give the most plausible one.

> Since the value of "core.alternateRefsPrefixes" is appended to 'git
> for-each-ref' and then executed, include a "--" before taking the
> configured value to avoid misinterpreting arguments as flags to 'git
> for-each-ref'.

Good idea.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index b908bc5825..d768c57310 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -622,6 +622,12 @@ core.alternateRefsCommand::
>  	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
>  	Output must be of the form: `%(objectname) SPC %(refname)`.
>  
> +core.alternateRefsPrefixes::
> +	When listing references from an alternate, list only references that begin
> +	with the given prefix. To list multiple prefixes, separate them with a
> +	whitespace character. If `core.alternateRefsCommand` is set, setting
> +	`core.alternateRefsPrefixes` has no effect.

I can't remember all of the rules for how for-each-ref matches prefixes,
but I remember that it's subtly different than git-branch (and that's
why ref-filter.c has two matching modes). Do we need to spell out the
rules here (or at least say "it matches like for-each-ref")?

Also, a minor nit, but I think the argv_array_split() helper you're
using soaks up arbitrary amounts of whitespace. So maybe "separate them
with whitespace" instead of "a whitespace character". Or maybe we should
be strict in what we suggest and liberal in what we parse. ;)
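
As a rough shell analogy only (this is not argv_array_split() itself),
unquoted expansion with the default IFS also collapses runs of whitespace.
The config value below is hypothetical:

```shell
# Hypothetical value with uneven whitespace (spaces and a tab) between prefixes.
value=$(printf 'refs/heads   refs/tags\trefs/notes')

# Unquoted expansion splits on runs of whitespace, yielding three words.
set -- $value
count=$#
first=$1
printf '%s\n' "$@"
```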

> +test_expect_success 'with core.alternateRefsPrefixes' '
> +	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> +	cat >expect <<-EOF &&
> +	$(git rev-parse one) .have
> +	$(git rev-parse three) .have
> +	$(git rev-parse two) .have
> +	EOF
> +	printf "0000" | git receive-pack fork | extract_haves >actual &&
> +	test_cmp expect actual

Looks sane, though the same pipe comment applies as before.

>  test_done
> diff --git a/transport.c b/transport.c
> index e7d2cdf00b..9323e5c3cd 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
>  		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
>  		argv_array_push(&cmd->args, "for-each-ref");
>  		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> +
> +		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
> +			argv_array_push(&cmd->args, "--");
> +			argv_array_split(&cmd->args, value);
> +		}
>  	}

The implementation ended up delightfully simple.

-Peff

* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-20 19:37   ` Jeff King
@ 2018-09-20 20:00     ` Taylor Blau
  2018-09-20 20:06       ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-20 20:00 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Thu, Sep 20, 2018 at 03:37:51PM -0400, Jeff King wrote:
> On Thu, Sep 20, 2018 at 02:04:11PM -0400, Taylor Blau wrote:
>
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index 112041f407..b908bc5825 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -616,6 +616,12 @@ core.preferSymlinkRefs::
> >  	This is sometimes needed to work with old scripts that
> >  	expect HEAD to be a symbolic link.
> >
> > +core.alternateRefsCommand::
> > +	When listing references from an alternate (e.g., in the case of ".have"), use
> > +	the shell to execute the specified command instead of
> > +	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
> > +	Output must be of the form: `%(objectname) SPC %(refname)`.
>
> We discussed off-list the notion that this could just be the objectname,
> since the ".have" mechanism doesn't care about the actual refnames.
>
> There's a little prior discussion from the list:
>
>   https://public-inbox.org/git/xmqqefzraqbu.fsf@gitster.mtv.corp.google.com/
>
> My "rev-list --alternate-refs" patches _do_ use the refnames, since you
> could do something like "--source" that cares about them. But there's
> some awkwardness there, because the names are in a different namespace
> than the rest of the refs. If we were to just say "nope, you do not get
> to see the names of the alternates" then that awkwardness goes away. But
> it also loses some information that could _possibly_ be of use to a
> caller.
>
> Back in that earlier discussion I did not have a strong opinion, but
> here we are cementing that decision into a user-visible interface. So it
> probably makes sense to revisit and decide once and for all.

Interesting, and thanks for the link to the prior discussion. I think
that I agree mostly with your rationale in [1], which boils down (for
me) to:

  - Other callers (like 'rev-list --alternate-refs') might care about
    them. Even if we don't have those patches in Git today, it's worth
    keeping their use case(s) in mind.

  - I didn't measure either, but I can't imagine that we're paying a
    huge price for this. So, it might be easy enough to keep saying,
    "please write output as '%(objectname) SP %(refname)'", even if we
    end up throwing out the refname, anyway.

> > +test_description='git receive-pack test'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup' '
> > +	test_commit one &&
> > +	git update-ref refs/heads/a HEAD &&
> > +	test_commit two &&
> > +	git update-ref refs/heads/b HEAD &&
> > +	test_commit three &&
> > +	git update-ref refs/heads/c HEAD &&
> > +	git clone --bare . fork &&
> > +	git clone fork pusher &&
> > +	(
> > +		cd fork &&
> > +		git config receive.advertisealternates true &&
> > +		git update-ref -d refs/heads/a &&
> > +		git update-ref -d refs/heads/b &&
> > +		git update-ref -d refs/heads/c &&
> > +		git update-ref -d refs/heads/master &&
> > +		git update-ref -d refs/tags/one &&
> > +		git update-ref -d refs/tags/two &&
> > +		git update-ref -d refs/tags/three &&
>
> > Probably not worth nit-picking process count, but this could be done with a
> single "update-ref --stdin".

Sure, I don't think that 7 `update-ref`'s vs 2 (`cat` + `git update-ref
--stdin`) will make or break the series, but I can happily shorten it as
you suggest ;-).

> > +		printf "../../.git/objects" >objects/info/alternates
>
> Also a nitpick, but I think "echo" would be more usual here (we handle
> the lack of a trailing newline just fine, but any use of printf makes me
> wonder if something tricky is going on with line endings).

'echo' indeed seems to be the way to go. This 'printf' preference is a
Git LFS-ism ;-).

> > +test_expect_success 'with core.alternateRefsCommand' '
> > +	test_config -C fork core.alternateRefsCommand \
> > +		"git --git-dir=\"\$1\" for-each-ref \
> > +		--format=\"%(objectname) %(refname)\" \
> > +		refs/heads/a refs/heads/c;:" &&
>
> This is cute and all, but might it be more readable to use
> write_script() to stick it into its own script?

Good idea, I'll do that.

> > +	cat >expect <<-EOF &&
> > +	$(git rev-parse a) .have
> > +	$(git rev-parse c) .have
> > +	EOF
> > +	printf "0000" | git receive-pack fork | extract_haves >actual &&
>
> There's been a push lately to avoid having git on the left-hand side of
> a fork, since we might otherwise miss its exit code (including things
> like asan/valgrind errors). So maybe:
>
>    ... receive-pack fork >actual &&
>    extract_haves <actual >actual.haves &&
>    test_cmp expect actual.haves
>
> or similar?

Sure, I agree that it's a good idea to not miss the exit code (since we
don't have pipefail on), etc. I adopted your suggestion into my local
copy.

> > diff --git a/transport.c b/transport.c
> > index 24ae3f375d..e7d2cdf00b 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
> >  static void fill_alternate_refs_command(struct child_process *cmd,
> >  					const char *repo_path)
> >  {
> > -	cmd->git_cmd = 1;
> > -	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> > -	argv_array_push(&cmd->args, "for-each-ref");
> > -	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> > +	const char *value;
> > +
> > +	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
> > +		cmd->use_shell = 1;
> > +
> > +		argv_array_push(&cmd->args, value);
> > +		argv_array_push(&cmd->args, repo_path);
>
> Setting use_shell allows the shell trickery in your test, and matches
> the modern way we run config-based commands. Good.
>
> > +	} else {
> > +		cmd->git_cmd = 1;
> > +
> > +		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> > +		argv_array_push(&cmd->args, "for-each-ref");
> > +		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> > +	}
> > +
> >  	cmd->env = local_repo_env;
> >  	cmd->out = -1;
>
> And we still clear local_repo_env for the custom command, which is good
> to avoid confusion like $GIT_DIR being set when the custom command does
> "cd $1 && git ...". Good.

Thanks,
Taylor

[1]: https://public-inbox.org/git/20170125195425.q4fpvc4ten5mfjgl@sigill.intra.peff.net/

* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-20 20:00     ` Taylor Blau
@ 2018-09-20 20:06       ` Jeff King
  0 siblings, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-20 20:06 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Thu, Sep 20, 2018 at 04:00:34PM -0400, Taylor Blau wrote:

> > My "rev-list --alternate-refs" patches _do_ use the refnames, since you
> > could do something like "--source" that cares about them. But there's
> > some awkwardness there, because the names are in a different namespace
> > than the rest of the refs. If we were to just say "nope, you do not get
> > to see the names of the alternates" then that awkwardness goes away. But
> > it also loses some information that could _possibly_ be of use to a
> > caller.
> >
> > Back in that earlier discussion I did not have a strong opinion, but
> > here we are cementing that decision into a user-visible interface. So it
> > probably makes sense to revisit and decide once and for all.
> 
> Interesting, and thanks for the link to the prior discussion. I think
> that I agree mostly with your rationale in [1], which boils down (for
> me) to:
> 
>   - Other callers (like 'rev-list --alternate-refs') might care about
>     them. Even if we don't have those patches in Git today, it's worth
>     keeping their use case(s) in mind.
> 
>   - I didn't measure either, but I can't imagine that we're paying a
>     huge price for this. So, it might be easy enough to keep saying,
>     "please write output as '%(objectname) SP %(refname)'", even if we
>     end up throwing out the refname, anyway.

TBH, the main advantage to me is that it makes the user-visible
interface way simpler. We just say "give us a list of object ids, one
per line". I guess the current spec is not too bad, especially given
that we can just provide a for-each-ref format that generates it.

> > Probably not worth nit-picking process count, but this could be done with a
> > single "update-ref --stdin".
> 
> Sure, I don't think that 7 `update-ref`'s vs 2 (`cat` + `git update-ref
> --stdin`) will make or break the series, but I can happily shorten it as
> you suggest ;-).

Yeah, in retrospect I should not have even mentioned it.
test_commit() already adds a bunch of extra processes you may or may not
care about (e.g., by making tags, or using "git add" when "commit -a"
might do).

> > > +	cat >expect <<-EOF &&
> > > +	$(git rev-parse a) .have
> > > +	$(git rev-parse c) .have
> > > +	EOF
> > > +	printf "0000" | git receive-pack fork | extract_haves >actual &&
> >
> > There's been a push lately to avoid having git on the left-hand side of
> > a fork, since we might otherwise miss its exit code (including things

Heh, I meant to say "left-hand side of a pipe", but you obviously
figured out what I meant. :)

-Peff

* Re: [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-20 19:47   ` Jeff King
@ 2018-09-20 20:12     ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-20 20:12 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Thu, Sep 20, 2018 at 03:47:34PM -0400, Jeff King wrote:
> On Thu, Sep 20, 2018 at 02:04:13PM -0400, Taylor Blau wrote:
>
> > The recently-introduced "core.alternateRefsCommand" allows callers to
> > specify with high flexibility the tips that they wish to advertise from
> > alternates. This flexibility comes at the cost of some inconvenience
> > when the caller only wishes to limit the advertisement to one or more
> > prefixes.
>
> To be clear: this isn't something we plan to use at GitHub at all. It
> just seemed like a nice "in between" the current inflexible state and
> the "incredibly flexible but not trivial to use" command from patch 2.
>
> Note that unlike core.alternateRefsCommand, there are no security issues
> here with reading this from the alternate, although:
>
>  - it's a little awkward to read the config from the alternate
>
>  - since these are clearly related config, it probably makes sense for
>    them to be consistent

Another note is that the thing we are planning on using
("core.alternateRefsCommand") could also be implemented as a hook,
e.g., .git/hooks/gather-alternate-refs.

That said, I think that this makes more sense when the alternate is
doing the configuring, not the other way around.

> > For example, to advertise only tags, a caller using
> > 'core.alternateRefsCommand' would have to do:
> >
> >   $ git config core.alternateRefsCommand ' \
> >       git -C "$1" for-each-ref refs/tags \
> >       --format="%(objectname) %(refname)" \
> >     '
>
> I think it's more likely that advertising only heads would make sense.
> The pathological repos I see are usually a sane number of branches and
> then an absurd number of tags.

I agree with you. I used "refs/tags" as the prefix here since I'd like
different output than when "core.alternateRefsPrefixes" isn't configured
at all. Since we have a tag for each commit (we use test_commit to do
so), and refs/heads/{a,b,c,master}, we'd get the same output whether we
configured the prefix to be refs/heads, or didn't configure it at all.

Since using 'git for-each-ref' sorts in order of refname, a prefix of
"refs/tags" sorts in order of tagname, so we'll get different output
because of it.

That said, I think that this test is a little fragile as-is, since it'll
break if we change the ordering of 'git for-each-ref'. Maybe we should
`| sort >actual.haves`?

> Not that it's super important, but I wonder if we should give a
> motivating example like this in the documentation. In which case we'd
> probably want to give the most plausible one.

Maybe. I don't feel strongly about it, though.

> > Since the value of "core.alternateRefsPrefixes" is appended to 'git
> > for-each-ref' and then executed, include a "--" before taking the
> > configured value to avoid misinterpreting arguments as flags to 'git
> > for-each-ref'.
>
> Good idea.
>
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index b908bc5825..d768c57310 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -622,6 +622,12 @@ core.alternateRefsCommand::
> >  	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
> >  	Output must be of the form: `%(objectname) SPC %(refname)`.
> >
> > +core.alternateRefsPrefixes::
> > +	When listing references from an alternate, list only references that begin
> > +	with the given prefix. To list multiple prefixes, separate them with a
> > +	whitespace character. If `core.alternateRefsCommand` is set, setting
> > +	`core.alternateRefsPrefixes` has no effect.
>
> I can't remember all of the rules for how for-each-ref matches prefixes,
> but I remember that it's subtly different than git-branch (and that's
> why ref-filter.c has two matching modes). Do we need to spell out the
> rules here (or at least say "it matches like for-each-ref")?

Good idea. I'll do that.

> Also, a minor nit, but I think the argv_array_split() helper you're
> using soaks up arbitrary amounts of whitespace. So maybe "separate them
> with whitespace" instead of "a whitespace character". Or maybe we should
> be strict in what we suggest and liberal in what we parse. ;)

Yeah, I think that changing "a whitespace character" -> "with
whitespace" is the easier thing to do ;-).

> > +test_expect_success 'with core.alternateRefsPrefixes' '
> > +	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> > +	cat >expect <<-EOF &&
> > +	$(git rev-parse one) .have
> > +	$(git rev-parse three) .have
> > +	$(git rev-parse two) .have
> > +	EOF
> > +	printf "0000" | git receive-pack fork | extract_haves >actual &&
> > +	test_cmp expect actual
>
> Looks sane, though the same pipe comment applies as before.

Thanks. I applied that suggestion in both locations when reading your
last mail.

> >  test_done
> > diff --git a/transport.c b/transport.c
> > index e7d2cdf00b..9323e5c3cd 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
> >  		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> >  		argv_array_push(&cmd->args, "for-each-ref");
> >  		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> > +
> > +		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
> > +			argv_array_push(&cmd->args, "--");
> > +			argv_array_split(&cmd->args, value);
> > +		}
> >  	}
>
> The implementation ended up delightfully simple.

Thanks :-). It made me quite happy, too.

Thanks,
Taylor

* Re: [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-20 18:04 ` [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  2018-09-20 19:47   ` Jeff King
@ 2018-09-21  7:19   ` Eric Sunshine
  2018-09-21 14:07     ` Taylor Blau
  2018-09-21 16:40     ` Junio C Hamano
  1 sibling, 2 replies; 94+ messages in thread
From: Eric Sunshine @ 2018-09-21  7:19 UTC (permalink / raw)
  To: ttaylorr; +Cc: Git List, Jeff King

On Thu, Sep 20, 2018 at 2:04 PM Taylor Blau <ttaylorr@github.com> wrote:
> The recently-introduced "core.alternateRefsCommand" allows callers to
> specify with high flexibility the tips that they wish to advertise from
> alternates. This flexibility comes at the cost of some inconvenience
> when the caller only wishes to limit the advertisement to one or more
> prefixes.
> [...]
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> @@ -44,4 +44,15 @@ test_expect_success 'with core.alternateRefsCommand' '
> +test_expect_success 'with core.alternateRefsPrefixes' '
> +       test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> +       cat >expect <<-EOF &&
> +       $(git rev-parse one) .have
> +       $(git rev-parse three) .have
> +       $(git rev-parse two) .have
> +       EOF

It's probably a matter of taste as to which is more readable, but this
entire "cat <<EOF" block could be replaced with a simple one-liner:

    printf "%s .have\n" $(git rev-parse one three two) >expect &&

Same comment applies to previous patch, as well.
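
The one-liner relies on printf reusing its format string for each extra
argument. A sketch with placeholder object names in place of the
"git rev-parse" output:

```shell
# Placeholder object names; in the test they come from "git rev-parse".
expect=$(printf "%s .have\n" 1111 2222 3333)

# One "<oid> .have" line per argument.
echo "$expect"
```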

> +       printf "0000" | git receive-pack fork | extract_haves >actual &&
> +       test_cmp expect actual
> +'

* Re: [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21  7:19   ` Eric Sunshine
@ 2018-09-21 14:07     ` Taylor Blau
  2018-09-21 16:45       ` Junio C Hamano
  2018-09-21 16:40     ` Junio C Hamano
  1 sibling, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 14:07 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Jeff King

On Fri, Sep 21, 2018 at 03:19:20AM -0400, Eric Sunshine wrote:
> On Thu, Sep 20, 2018 at 2:04 PM Taylor Blau <ttaylorr@github.com> wrote:
> > The recently-introduced "core.alternateRefsCommand" allows callers to
> > specify with high flexibility the tips that they wish to advertise from
> > alternates. This flexibility comes at the cost of some inconvenience
> > when the caller only wishes to limit the advertisement to one or more
> > prefixes.
> > [...]
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
> > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > @@ -44,4 +44,15 @@ test_expect_success 'with core.alternateRefsCommand' '
> > +test_expect_success 'with core.alternateRefsPrefixes' '
> > +       test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> > +       cat >expect <<-EOF &&
> > +       $(git rev-parse one) .have
> > +       $(git rev-parse three) .have
> > +       $(git rev-parse two) .have
> > +       EOF
>
> It's probably a matter of taste as to which is more readable, but this
> entire "cat <<EOF" block could be replaced with a simple one-liner:
>
>     printf "%s .have\n" $(git rev-parse one three two) >expect &&
>
> Same comment applies to previous patch, as well.

That's a good idea. I amended both patches to replace the 'cat <<-EOF
...' block with your suggestion above. It's tempting to introduce it as:

  expect_haves() {
    printf "%s .have\n" $(git rev-parse -- $@)
  }

And call it as:

  expect_haves one three two >expect

But I'm not sure whether I think that this is better or worse than
writing it twice inline. I think that the test is small enough that it
doesn't really matter either way, but I think that I've convinced myself
while composing this email that expect_haves() is an OK idea.

If you feel strongly that it isn't, please let me know, and I'll write
them inline before sending v2.

Thanks,
Taylor

* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-20 18:04 ` [PATCH 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
  2018-09-20 19:37   ` Jeff King
@ 2018-09-21 16:39   ` Junio C Hamano
  2018-09-21 17:48     ` Taylor Blau
  1 sibling, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 16:39 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff

Taylor Blau <ttaylorr@github.com> writes:

> +extract_haves () {
> +	depacketize - | grep -o '^.* \.have'

Not portable, isn't it?

cf. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html


* Re: [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21  7:19   ` Eric Sunshine
  2018-09-21 14:07     ` Taylor Blau
@ 2018-09-21 16:40     ` Junio C Hamano
  1 sibling, 0 replies; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 16:40 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: ttaylorr, Git List, Jeff King

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Thu, Sep 20, 2018 at 2:04 PM Taylor Blau <ttaylorr@github.com> wrote:
>> The recently-introduced "core.alternateRefsCommand" allows callers to
>> specify with high flexibility the tips that they wish to advertise from
>> alternates. This flexibility comes at the cost of some inconvenience
>> when the caller only wishes to limit the advertisement to one or more
>> prefixes.
>> [...]
>> Signed-off-by: Taylor Blau <me@ttaylorr.com>
>> ---
>> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
>> @@ -44,4 +44,15 @@ test_expect_success 'with core.alternateRefsCommand' '
>> +test_expect_success 'with core.alternateRefsPrefixes' '
>> +       test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
>> +       cat >expect <<-EOF &&
>> +       $(git rev-parse one) .have
>> +       $(git rev-parse three) .have
>> +       $(git rev-parse two) .have
>> +       EOF
>
> It's probably a matter of taste as to which is more readable, but this
> entire "cat <<EOF" block could be replaced with a simple one-liner:
>
>     printf "%s .have\n" $(git rev-parse one three two) >expect &&
>
> Same comment applies to previous patch, as well.

If the expected pattern is expected to stay to be just a sequence of
"<oid> .have" and nothing else for the foreseeable future, I think
it is a good idea.

>
>> +       printf "0000" | git receive-pack fork | extract_haves >actual &&
>> +       test_cmp expect actual
>> +'

* Re: [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 14:07     ` Taylor Blau
@ 2018-09-21 16:45       ` Junio C Hamano
  2018-09-21 17:49         ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 16:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Eric Sunshine, Git List, Jeff King

Taylor Blau <me@ttaylorr.com> writes:

> ...' block with your suggestion above. It's tempting to introduce it as:
>
>   expect_haves() {
>     printf "%s .have\n" $(git rev-parse -- $@)
>   }
>
> And call it as:
>
>   expect_haves one three two >expect
>
> But I'm not sure whether I think that this is better or worse than
> writing it twice inline.

If the expected pattern is expected to stay to be just a sequence of
"<oid> .have" and nothing else for the foreseeable future, I think
it is a good idea to introduce such a helper function.  Spelling it
out at the use site, e.g.

	printf "%s .have\n" $(git rev-parse a b c) >expect

will become cumbersome once the set of objects you need to show
starts growing.

	expect_haves a b c >expect

would be shorter, of course.  And as long as we expect to have ONLY
"<oid> .have" lines and nothing else, there is no downside that the
details of the format are hidden away inside the helper.


* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 16:39   ` Junio C Hamano
@ 2018-09-21 17:48     ` Taylor Blau
  2018-09-21 17:57       ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 17:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff

On Fri, Sep 21, 2018 at 09:39:14AM -0700, Junio C Hamano wrote:
> Taylor Blau <ttaylorr@github.com> writes:
>
> > +extract_haves () {
> > +	depacketize - | grep -o '^.* \.have'
>
> Not portable, isn't it?
>
> cf. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html

Good catch. Definitely not portable, per the link that you shared above.

Since 'depacketize()' will give us a "\0", we can pull it and anything
after it out with 'sed', instead. Any lines that don't contain a "\0"
only contain an OID and the literal, ".have", and are fine as-is.

Something like this:

  extract_haves () {
    depacketize - | grep '^.* \.have' | sed -e 's/\\0.*$//g'
  }

Harder to read--at least for me--but infinitely more portable.
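
A quick way to sanity-check the pipeline without a repository, using
made-up input in place of real depacketize output. The function names, the
object names, and the capability list are placeholders:

```shell
# Made-up stand-in for depacketize output: the first line carries a literal
# backslash-zero followed by capabilities.
sample () {
	printf '%s\n' \
		'1111 .have\0report-status delete-refs' \
		'2222 .have' \
		'3333 refs/heads/master'
}

# Same grep + sed pair as above, reading from stdin.
extract_haves_sketch () {
	grep '^.* \.have' | sed -e 's/\\0.*$//g'
}

haves=$(sample | extract_haves_sketch)
echo "$haves"
```

Only the two ".have" lines survive, and the capabilities after the "\0"
are stripped from the first one.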

I'll wait until a little later today, and then send you v2. Thanks for
reviewing :-).

Thanks,
Taylor

* Re: [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 16:45       ` Junio C Hamano
@ 2018-09-21 17:49         ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 17:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Eric Sunshine, Git List, Jeff King

On Fri, Sep 21, 2018 at 09:45:11AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > ...' block with your suggestion above. It's tempting to introduce it as:
> >
> >   expect_haves() {
> >     printf "%s .have\n" $(git rev-parse -- $@)
> >   }
> >
> > And call it as:
> >
> >   expect_haves one three two >expect
> >
> > But I'm not sure whether I think that this is better or worse than
> > writing it twice inline.
>
> If the expected pattern is expected to stay to be just a sequence of
> "<oid> .have" and nothing else for the foreseeable future, I think
> it is a good idea to introduce such a helper function.  Spelling it
> out at the use site, e.g.
>
> 	printf "%s .have\n" $(git rev-parse a b c) >expect
>
> will become cumbersome once the set of objects you need to show
> starts growing.

That's a good reason, and I hadn't thought of it.

> 	expect_haves a b c >expect
>
> would be shorter, of course.  And as long as we expect to have ONLY
> "<oid> .have" lines and nothing else, there is no downside that the
> details of the format are hidden away inside the helper.

Yeah, I don't expect this to change much at all, so I think that
'expect_haves()' is good.

Thanks,
Taylor


* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 17:48     ` Taylor Blau
@ 2018-09-21 17:57       ` Taylor Blau
  2018-09-21 19:59         ` Junio C Hamano
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 17:57 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, git, peff

On Fri, Sep 21, 2018 at 01:48:25PM -0400, Taylor Blau wrote:
> On Fri, Sep 21, 2018 at 09:39:14AM -0700, Junio C Hamano wrote:
> > Taylor Blau <ttaylorr@github.com> writes:
> >
> > > +extract_haves () {
> > > +	depacketize - | grep -o '^.* \.have'
> >
> > Not portable, isn't it?
> >
> > cf. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html
>
> Good catch. Definitely not portable, per the link that you shared above.
>
> Since 'depacketize()' will give us a "\0", we can pull it and anything
> after it out with 'sed', instead. Any lines that don't contain a "\0"
> only contain an OID and the literal, ".have", and are fine as-is.
>
> Something like this:
>
>   extract_haves () {
>     depacketize - | grep '^.* \.have' | sed -e 's/\\0.*$//g'
>   }
>
> Harder to read--at least for me--but infinitely more portable.

In fact, I think that we can go even further: since we don't need to
catch the beginning '^.*' (without -o), we can instead:

  extract_haves () {
    depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
  }

Thanks,
Taylor


* [PATCH v2 0/3] Filter alternate references
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
                   ` (4 preceding siblings ...)
  2018-09-20 19:21 ` Jeff King
@ 2018-09-21 18:47 ` Taylor Blau
  2018-09-21 18:47   ` [PATCH v2 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
                     ` (2 more replies)
  2018-09-28  4:25 ` [PATCH v3 0/4] Filter alternate references Taylor Blau
                   ` (2 subsequent siblings)
  8 siblings, 3 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 18:47 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

Hi,

Attached is the second re-roll of my series to teach
"core.alternateRefsCommand" and "core.alternateRefsPrefixes".

I have included a range-diff below (which I have taught my scripts to do
by default now), but will summarize the changes as usual:

  * Clean up t5410 according to Peff's suggestions in [1]:

    * Simplify many `git update-ref -d`'s into one `git update-ref
      --stdin`.

    * Use `echo >`, instead of `printf >` to write an alternate
      repository.

    * Avoid placing Git on the left-hand side of a pipe.

    * Use 'write_script', instead of embedding the same code in a
      lengthy 'test_config'.

  * Add a motivating example in Documentation/config.txt, per Peff's
    suggestion in [1].

  * Use `printf "%s .have\n"` with many arguments instead of another
    `cat <<-EOF` block and extract it into `expect_haves`, per [2].

  * Do not use `grep -o` in `extract_haves`, thus making it portable.
    Per [3].

[1]: https://public-inbox.org/git/20180920193751.GC29603@sigill.intra.peff.net/
[2]: https://public-inbox.org/git/CAPig+cT7WTyBCQZ75WSjmBqiui383YrKqoHqbLASQkOaGVTfVA@mail.gmail.com/
[3]: https://public-inbox.org/git/xmqqlg7ux0st.fsf@gitster-ct.c.googlers.com/

Taylor Blau (3):
  transport.c: extract 'fill_alternate_refs_command'
  transport.c: introduce core.alternateRefsCommand
  transport.c: introduce core.alternateRefsPrefixes

 Documentation/config.txt | 18 ++++++++++++
 t/t5410-receive-pack.sh  | 62 ++++++++++++++++++++++++++++++++++++++++
 transport.c              | 34 ++++++++++++++++++----
 3 files changed, 108 insertions(+), 6 deletions(-)
 create mode 100755 t/t5410-receive-pack.sh

Range-diff against v1:
1:  6e3a58afe7 = 1:  6e3a58afe7 transport.c: extract 'fill_alternate_refs_command'
2:  4c4900722c ! 2:  9797f52551 transport.c: introduce core.alternateRefsCommand
    @@ -42,6 +42,11 @@
     +	the shell to execute the specified command instead of
     +	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
     +	Output must be of the form: `%(objectname) SPC %(refname)`.
    +++
    ++This is useful when a repository only wishes to advertise some of its
    ++alternate's references as ".have"'s. For example, to only advertise branch
    ++heads, configure `core.alternateRefsCommand` to the path of a script which runs
    ++`git --git-dir="$1" for-each-ref refs/heads`.
     +
      core.bare::
      	If true this repository is assumed to be 'bare' and has no
    @@ -70,32 +75,39 @@
     +	(
     +		cd fork &&
     +		git config receive.advertisealternates true &&
    -+		git update-ref -d refs/heads/a &&
    -+		git update-ref -d refs/heads/b &&
    -+		git update-ref -d refs/heads/c &&
    -+		git update-ref -d refs/heads/master &&
    -+		git update-ref -d refs/tags/one &&
    -+		git update-ref -d refs/tags/two &&
    -+		git update-ref -d refs/tags/three &&
    -+		printf "../../.git/objects" >objects/info/alternates
    ++		cat <<-EOF | git update-ref --stdin &&
    ++		delete refs/heads/a
    ++		delete refs/heads/b
    ++		delete refs/heads/c
    ++		delete refs/heads/master
    ++		delete refs/tags/one
    ++		delete refs/tags/two
    ++		delete refs/tags/three
    ++		EOF
    ++		echo "../../.git/objects" >objects/info/alternates
     +	)
     +'
     +
    ++expect_haves () {
    ++	printf "%s .have\n" $(git rev-parse $@) >expect
    ++}
    ++
     +extract_haves () {
    -+	depacketize - | grep -o '^.* \.have'
    ++	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
     +}
     +
     +test_expect_success 'with core.alternateRefsCommand' '
    -+	test_config -C fork core.alternateRefsCommand \
    -+		"git --git-dir=\"\$1\" for-each-ref \
    -+		--format=\"%(objectname) %(refname)\" \
    -+		refs/heads/a refs/heads/c;:" &&
    -+	cat >expect <<-EOF &&
    -+	$(git rev-parse a) .have
    -+	$(git rev-parse c) .have
    ++	write_script fork/alternate-refs <<-\EOF &&
    ++		git --git-dir="$1" for-each-ref \
    ++			--format="%(objectname) %(refname)" \
    ++			refs/heads/a \
    ++			refs/heads/c
     +	EOF
    -+	printf "0000" | git receive-pack fork | extract_haves >actual &&
    -+	test_cmp expect actual
    ++	test_config -C fork core.alternateRefsCommand alternate-refs &&
    ++	expect_haves a c >expect &&
    ++	printf "0000" | git receive-pack fork >actual &&
    ++	extract_haves <actual >actual.haves &&
    ++	test_cmp expect actual.haves
     +'
     +
     +test_done
3:  3639e90588 ! 3:  6e8f65a16d transport.c: introduce core.alternateRefsPrefixes
    @@ -40,13 +40,14 @@
      --- a/Documentation/config.txt
      +++ b/Documentation/config.txt
     @@
    - 	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
    - 	Output must be of the form: `%(objectname) SPC %(refname)`.
    + heads, configure `core.alternateRefsCommand` to the path of a script which runs
    + `git --git-dir="$1" for-each-ref refs/heads`.

     +core.alternateRefsPrefixes::
     +	When listing references from an alternate, list only references that begin
    -+	with the given prefix. To list multiple prefixes, separate them with a
    -+	whitespace character. If `core.alternateRefsCommand` is set, setting
    ++	with the given prefix. Prefixes match as if they were given as arguments to
    ++	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
    ++	whitespace. If `core.alternateRefsCommand` is set, setting
     +	`core.alternateRefsPrefixes` has no effect.
     +
      core.bare::
    @@ -57,18 +58,15 @@
      --- a/t/t5410-receive-pack.sh
      +++ b/t/t5410-receive-pack.sh
     @@
    - 	test_cmp expect actual
    + 	test_cmp expect actual.haves
      '

     +test_expect_success 'with core.alternateRefsPrefixes' '
     +	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
    -+	cat >expect <<-EOF &&
    -+	$(git rev-parse one) .have
    -+	$(git rev-parse three) .have
    -+	$(git rev-parse two) .have
    -+	EOF
    -+	printf "0000" | git receive-pack fork | extract_haves >actual &&
    -+	test_cmp expect actual
    ++	expect_haves one three two >expect &&
    ++	printf "0000" | git receive-pack fork >actual &&
    ++	extract_haves <actual >actual.haves &&
    ++	test_cmp expect actual.haves
     +'
     +
      test_done
--
2.19.0.221.g150f307af


* [PATCH v2 1/3] transport.c: extract 'fill_alternate_refs_command'
  2018-09-21 18:47 ` [PATCH v2 " Taylor Blau
@ 2018-09-21 18:47   ` Taylor Blau
  2018-09-21 18:47   ` [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
  2018-09-21 18:47   ` [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  2 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 18:47 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

To list alternate references, 'read_alternate_refs' creates a child
process running 'git for-each-ref' in the alternate's Git directory.

Prepare to run other commands besides 'git for-each-ref' by introducing
and moving the relevant code from 'read_alternate_refs' to
'fill_alternate_refs_command'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 transport.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/transport.c b/transport.c
index 1c76d64aba..24ae3f375d 100644
--- a/transport.c
+++ b/transport.c
@@ -1325,6 +1325,17 @@ char *transport_anonymize_url(const char *url)
 	return xstrdup(url);
 }
 
+static void fill_alternate_refs_command(struct child_process *cmd,
+					const char *repo_path)
+{
+	cmd->git_cmd = 1;
+	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+	argv_array_push(&cmd->args, "for-each-ref");
+	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+	cmd->env = local_repo_env;
+	cmd->out = -1;
+}
+
 static void read_alternate_refs(const char *path,
 				alternate_ref_fn *cb,
 				void *data)
@@ -1333,12 +1344,7 @@ static void read_alternate_refs(const char *path,
 	struct strbuf line = STRBUF_INIT;
 	FILE *fh;
 
-	cmd.git_cmd = 1;
-	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
-	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
-	cmd.env = local_repo_env;
-	cmd.out = -1;
+	fill_alternate_refs_command(&cmd, path);
 
 	if (start_command(&cmd))
 		return;
-- 
2.19.0.221.g150f307af



* [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 18:47 ` [PATCH v2 " Taylor Blau
  2018-09-21 18:47   ` [PATCH v2 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
@ 2018-09-21 18:47   ` Taylor Blau
  2018-09-21 20:18     ` Eric Sunshine
                       ` (3 more replies)
  2018-09-21 18:47   ` [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  2 siblings, 4 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 18:47 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

When in a repository containing one or more alternates, Git would
sometimes like to list references from its alternates. For example, 'git
receive-pack' lists the objects pointed to by alternate references as
special ".have" references.

Listing ".have" references is designed to make pushing changes from
upstream to a fork a lightweight operation, by advertising to the pusher
that the fork already has the objects (via its alternate). Thus, the
client can avoid sending them.

However, when the alternate has a pathologically large number of
references, the initial advertisement is too expensive. In fact, it can
dominate any such optimization where the pusher avoids sending certain
objects.

Introduce "core.alternateRefsCommand" in order to provide a facility to
limit or filter alternate references. This can be used, for example, to
filter out "uninteresting" references from the initial advertisement in
the above scenario.

Let the repository that has alternates configure this command to avoid
trusting the alternate to provide us a safe command to run in the shell.
To behave differently on each alternate (e.g., only list tags from
alternate A, only heads from B) provide the path of the alternate as the
first argument.
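
As a rough illustration of that last point (not part of this patch; the
alternate names and the 'list_alternate_refs' function are made up for
the example), a command that lists a different set of references per
alternate might be sketched as:

```shell
# Hypothetical body for a core.alternateRefsCommand script.
# $1 is the path of the alternate, as passed by Git.
list_alternate_refs () {
	case "$1" in
	*alternate-a*)
		# assumed path fragment: only tags from alternate A
		git --git-dir="$1" for-each-ref \
			--format="%(objectname) %(refname)" refs/tags
		;;
	*alternate-b*)
		# assumed path fragment: only branch heads from alternate B
		git --git-dir="$1" for-each-ref \
			--format="%(objectname) %(refname)" refs/heads
		;;
	*)
		# any other alternate: list everything
		git --git-dir="$1" for-each-ref \
			--format="%(objectname) %(refname)"
		;;
	esac
}
# As a standalone script, finish with: list_alternate_refs "$1"
```

Matching on the path with 'case' is only one option; the point is that
the command can tell alternates apart by its first argument.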

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt | 11 ++++++++
 t/t5410-receive-pack.sh  | 54 ++++++++++++++++++++++++++++++++++++++++
 transport.c              | 19 +++++++++++---
 3 files changed, 80 insertions(+), 4 deletions(-)
 create mode 100755 t/t5410-receive-pack.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 112041f407..526557e494 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -616,6 +616,17 @@ core.preferSymlinkRefs::
 	This is sometimes needed to work with old scripts that
 	expect HEAD to be a symbolic link.
 
+core.alternateRefsCommand::
+	When listing references from an alternate (e.g., in the case of ".have"), use
+	the shell to execute the specified command instead of
+	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
+	Output must be of the form: `%(objectname) SPC %(refname)`.
++
+This is useful when a repository only wishes to advertise some of its
+alternate's references as ".have"'s. For example, to only advertise branch
+heads, configure `core.alternateRefsCommand` to the path of a script which runs
+`git --git-dir="$1" for-each-ref refs/heads`.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
new file mode 100755
index 0000000000..2f21f1cb8f
--- /dev/null
+++ b/t/t5410-receive-pack.sh
@@ -0,0 +1,54 @@
+#!/bin/sh
+
+test_description='git receive-pack test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit one &&
+	git update-ref refs/heads/a HEAD &&
+	test_commit two &&
+	git update-ref refs/heads/b HEAD &&
+	test_commit three &&
+	git update-ref refs/heads/c HEAD &&
+	git clone --bare . fork &&
+	git clone fork pusher &&
+	(
+		cd fork &&
+		git config receive.advertisealternates true &&
+		cat <<-EOF | git update-ref --stdin &&
+		delete refs/heads/a
+		delete refs/heads/b
+		delete refs/heads/c
+		delete refs/heads/master
+		delete refs/tags/one
+		delete refs/tags/two
+		delete refs/tags/three
+		EOF
+		echo "../../.git/objects" >objects/info/alternates
+	)
+'
+
+expect_haves () {
+	printf "%s .have\n" $(git rev-parse $@) >expect
+}
+
+extract_haves () {
+	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
+}
+
+test_expect_success 'with core.alternateRefsCommand' '
+	write_script fork/alternate-refs <<-\EOF &&
+		git --git-dir="$1" for-each-ref \
+			--format="%(objectname) %(refname)" \
+			refs/heads/a \
+			refs/heads/c
+	EOF
+	test_config -C fork core.alternateRefsCommand alternate-refs &&
+	expect_haves a c >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 24ae3f375d..e7d2cdf00b 100644
--- a/transport.c
+++ b/transport.c
@@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
 static void fill_alternate_refs_command(struct child_process *cmd,
 					const char *repo_path)
 {
-	cmd->git_cmd = 1;
-	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
-	argv_array_push(&cmd->args, "for-each-ref");
-	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+	const char *value;
+
+	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
+		cmd->use_shell = 1;
+
+		argv_array_push(&cmd->args, value);
+		argv_array_push(&cmd->args, repo_path);
+	} else {
+		cmd->git_cmd = 1;
+
+		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+		argv_array_push(&cmd->args, "for-each-ref");
+		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+	}
+
 	cmd->env = local_repo_env;
 	cmd->out = -1;
 }
-- 
2.19.0.221.g150f307af



* [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 18:47 ` [PATCH v2 " Taylor Blau
  2018-09-21 18:47   ` [PATCH v2 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
  2018-09-21 18:47   ` [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-09-21 18:47   ` Taylor Blau
  2018-09-21 21:14     ` Junio C Hamano
  2 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-21 18:47 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

The recently-introduced "core.alternateRefsCommand" allows callers to
specify with high flexibility the tips that they wish to advertise from
alternates. This flexibility comes at the cost of some inconvenience
when the caller only wishes to limit the advertisement to one or more
prefixes.

For example, to advertise only tags, a caller using
'core.alternateRefsCommand' would have to do:

  $ git config core.alternateRefsCommand ' \
      git -C "$1" for-each-ref refs/tags \
      --format="%(objectname) %(refname)" \
    '

The above is cumbersome to write, so let's introduce a
"core.alternateRefsPrefixes" to address this common case. Instead, the
caller can run:

  $ git config core.alternateRefsPrefixes 'refs/tags'

Which will behave identically to the longer example using
"core.alternateRefsCommand".

Since the value of "core.alternateRefsPrefixes" is appended to 'git
for-each-ref' and then executed, include a "--" before taking the
configured value to avoid misinterpreting arguments as flags to 'git
for-each-ref'.

In the case that the caller wishes to specify multiple prefixes, they
may separate them by whitespace. If "core.alternateRefsCommand" is set,
it will take precedence over "core.alternateRefsPrefixes".
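
To make the resulting command line concrete, here is a small shell
sketch of how the arguments are put together. It is illustrative only:
'alternate_refs_argv' is a made-up helper, the real logic lives in C in
'fill_alternate_refs_command', and the sketch skips the "--" when the
value is empty, which is a simplification.

```shell
# Hypothetical helper: print, one per line, the arguments that would
# follow "git" for a given alternate path and prefix list.
alternate_refs_argv () (
	alt=$1
	prefixes=$2
	# disable globbing so the unquoted expansion below only splits
	set -f
	set -- "--git-dir=$alt" for-each-ref "--format=%(objectname) %(refname)"
	if test -n "$prefixes"
	then
		# "--" guards against a prefix being read as an option;
		# $prefixes is unquoted on purpose to split on whitespace
		set -- "$@" -- $prefixes
	fi
	printf '%s\n' "$@"
)

alternate_refs_argv /path/to/alternate.git "refs/heads refs/tags"
# prints:
# --git-dir=/path/to/alternate.git
# for-each-ref
# --format=%(objectname) %(refname)
# --
# refs/heads
# refs/tags
```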

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt | 7 +++++++
 t/t5410-receive-pack.sh  | 8 ++++++++
 transport.c              | 5 +++++
 3 files changed, 20 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 526557e494..7df6c22925 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -627,6 +627,13 @@ alternate's references as ".have"'s. For example, to only advertise branch
 heads, configure `core.alternateRefsCommand` to the path of a script which runs
 `git --git-dir="$1" for-each-ref refs/heads`.
 
+core.alternateRefsPrefixes::
+	When listing references from an alternate, list only references that begin
+	with the given prefix. Prefixes match as if they were given as arguments to
+	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
+	whitespace. If `core.alternateRefsCommand` is set, setting
+	`core.alternateRefsPrefixes` has no effect.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
index 2f21f1cb8f..b656c9b30c 100755
--- a/t/t5410-receive-pack.sh
+++ b/t/t5410-receive-pack.sh
@@ -51,4 +51,12 @@ test_expect_success 'with core.alternateRefsCommand' '
 	test_cmp expect actual.haves
 '
 
+test_expect_success 'with core.alternateRefsPrefixes' '
+	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
+	expect_haves one three two >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
 test_done
diff --git a/transport.c b/transport.c
index e7d2cdf00b..9323e5c3cd 100644
--- a/transport.c
+++ b/transport.c
@@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
 		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
 		argv_array_push(&cmd->args, "for-each-ref");
 		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
+
+		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
+			argv_array_push(&cmd->args, "--");
+			argv_array_split(&cmd->args, value);
+		}
 	}
 
 	cmd->env = local_repo_env;
-- 
2.19.0.221.g150f307af


* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 17:57       ` Taylor Blau
@ 2018-09-21 19:59         ` Junio C Hamano
  2018-09-26  0:56           ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 19:59 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff

Taylor Blau <me@ttaylorr.com> writes:

> In fact, I think that we can go even further: since we don't need to
> catch the beginning '^.*' (without -o), we can instead:
>
>   extract_haves () {
>     depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
>   }

Do not pipe grep into sed, unless you have an overly elaborate set
of patterns to filter with, e.g. something along the lines of...

	sed -ne '/\.have/s/...//p'



* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 18:47   ` [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-09-21 20:18     ` Eric Sunshine
  2018-09-26  0:59       ` Taylor Blau
  2018-09-21 21:09     ` Junio C Hamano
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 94+ messages in thread
From: Eric Sunshine @ 2018-09-21 20:18 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git List, Jeff King, Junio C Hamano, Stefan Beller

On Fri, Sep 21, 2018 at 2:47 PM Taylor Blau <me@ttaylorr.com> wrote:
> When in a repository containing one or more alternates, Git would
> sometimes like to list references from its alternates. For example, 'git
> receive-pack' lists the objects pointed to by alternate references as
> special ".have" references.
> [...]
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> @@ -0,0 +1,54 @@
> +expect_haves () {
> +       printf "%s .have\n" $(git rev-parse $@) >expect
> +}

Magic quoting behavior only kicks in when $@ is itself quoted, so this
should be:

    printf "%s .have\n" $(git rev-parse "$@") >expect

However, as it's unlikely that you need magic quoting in this case,
you might get by with plain $* (unquoted).
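
For anyone following along, the difference can be seen with a tiny
throwaway script (the function names here are invented for the
demonstration and have nothing to do with the test suite):

```shell
# Print how many arguments were received.
count_args () { echo $#; }

# Forward arguments three different ways.
unquoted () { count_args $@; }
quoted () { count_args "$@"; }
starred () { count_args $*; }

unquoted "a b" c	# prints 3: "a b" is split into two words
quoted "a b" c		# prints 2: each original argument is preserved
starred "a b" c		# prints 3: same splitting as unquoted $@
```

Since revision names such as "one" or "a" contain no whitespace, all
three forms behave the same in this particular test.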


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 18:47   ` [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
  2018-09-21 20:18     ` Eric Sunshine
@ 2018-09-21 21:09     ` Junio C Hamano
  2018-09-21 22:13       ` Jeff King
  2018-09-26  1:06       ` Taylor Blau
  2018-09-21 21:10     ` Eric Sunshine
  2018-09-22 18:02     ` brian m. carlson
  3 siblings, 2 replies; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 21:09 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, sunshine, sbeller

Taylor Blau <me@ttaylorr.com> writes:

> +core.alternateRefsCommand::
> +	When listing references from an alternate (e.g., in the case of ".have"), use

It is not clear how (e.g.,...) connects to what is said in the
sentence.  "When advertising tips of available history from an
alternate, use ..." without saying ".have" may be less cryptic.  

I dunno.

> +	the shell to execute the specified command instead of
> +	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.

"The path" meaning the absolute path?  Relative to the original
object store?  Something else?

> +	Output must be of the form: `%(objectname) SPC %(refname)`.
> ++
> +This is useful when a repository only wishes to advertise some of its
> +alternate's references as ".have"'s. For example, to only advertise branch
> +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> +`git --git-dir="$1" for-each-ref refs/heads`.
> +
>  core.bare::
>  	If true this repository is assumed to be 'bare' and has no
>  	working directory associated with it.  If this is the case a
> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> new file mode 100755
> index 0000000000..2f21f1cb8f
> --- /dev/null
> +++ b/t/t5410-receive-pack.sh
> @@ -0,0 +1,54 @@
> +#!/bin/sh
> +
> +test_description='git receive-pack test'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> +	test_commit one &&
> +	git update-ref refs/heads/a HEAD &&
> +	test_commit two &&
> +	git update-ref refs/heads/b HEAD &&
> +	test_commit three &&
> +	git update-ref refs/heads/c HEAD &&
> +	git clone --bare . fork &&
> +	git clone fork pusher &&
> +	(
> +		cd fork &&
> +		git config receive.advertisealternates true &&

Hmph.  Do we have code to support this configuration variable?

> +		cat <<-EOF | git update-ref --stdin &&

Style: writing "<<-\EOF" instead would allow readers' eyes to
coast over without having to look for $variable_references in
the here-doc.

> +		delete refs/heads/a
> +		delete refs/heads/b
> +		delete refs/heads/c
> +		delete refs/heads/master
> +		delete refs/tags/one
> +		delete refs/tags/two
> +		delete refs/tags/three

So, the original created one/two/three/a/b/c/master, fork is a bare
clone of it and has all these things, and then you deleted all of
these?  What does fork have after this is done?  HEAD that is
dangling?

> +		EOF
> +		echo "../../.git/objects" >objects/info/alternates

When viewed from fork/objects, ../../.git is the GIT_DIR of the
primary test repository, so that is where we borrow objects from.

If we pruned the objects from fork's object store before this echo,
we would have an almost empty repository that borrows from its
alternates everything, which may make a more realistic sample case,
but because you are only focusing on the ref advertisement, it does
not matter that your fork is full of duplicate objects that are
available from the alternates.

> +expect_haves () {
> +	printf "%s .have\n" $(git rev-parse $@) >expect

Quote $@ inside dq pair, like $(git rev-parse "$@").

> +extract_haves () {
> +	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
> +}


Don't pipe grep into sed, especially when both the pattern to filter
and the operation to perform are simple.

I am not sure what you are trying to achieve with 'g' in
s/pattern$//g; The anchor at the rightmost end of the pattern makes
sure that the pattern matches only once per line at the end anyway,
so "do this howmanyever times as we have match on each line" would
not make any difference, no?
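
One possible shape for a single-sed version is sketched below. This is
only a sketch under assumptions: 'extract_haves_sketch' is a made-up
name, and the sample input imitates what depacketize is described as
producing (a literal "\0" before the capabilities). It uses a
'{ ...; p; }' group rather than the 'p' flag of 's', because that flag
prints only when a substitution actually happened, which would drop
".have" lines that carry no "\0".

```shell
# Sketch of a single-sed extract_haves; reads depacketized lines on stdin.
extract_haves_sketch () {
	# For lines mentioning ".have": strip a literal "\0" and everything
	# after it (if present), then print the line. No 'g' flag is needed
	# because the pattern is anchored at the end of the line.
	sed -n -e '/\.have/{' -e 's/\\0.*$//' -e 'p' -e '}'
}

printf '%s\n' \
	'1111 .have\0report-status delete-refs' \
	'2222 .have' \
	'3333 refs/heads/master' |
extract_haves_sketch
# prints:
# 1111 .have
# 2222 .have
```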

> diff --git a/transport.c b/transport.c
> index 24ae3f375d..e7d2cdf00b 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
>  static void fill_alternate_refs_command(struct child_process *cmd,
>  					const char *repo_path)
>  {
> -	cmd->git_cmd = 1;
> -	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> -	argv_array_push(&cmd->args, "for-each-ref");
> -	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> +	const char *value;
> +
> +	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
> +		cmd->use_shell = 1;
> +
> +		argv_array_push(&cmd->args, value);
> +		argv_array_push(&cmd->args, repo_path);
> +	} else {
> +		cmd->git_cmd = 1;
> +
> +		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> +		argv_array_push(&cmd->args, "for-each-ref");
> +		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> +	}
> +
>  	cmd->env = local_repo_env;
>  	cmd->out = -1;
>  }


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 18:47   ` [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
  2018-09-21 20:18     ` Eric Sunshine
  2018-09-21 21:09     ` Junio C Hamano
@ 2018-09-21 21:10     ` Eric Sunshine
  2018-09-22 18:02     ` brian m. carlson
  3 siblings, 0 replies; 94+ messages in thread
From: Eric Sunshine @ 2018-09-21 21:10 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git List, Jeff King, Junio C Hamano, Stefan Beller

On Fri, Sep 21, 2018 at 2:47 PM Taylor Blau <me@ttaylorr.com> wrote:
> When in a repository containing one or more alternates, Git would
> sometimes like to list references from its alternates. For example, 'git
> receive-pack' lists the objects pointed to by alternate references as
> special ".have" references.
> [...]
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> @@ -0,0 +1,54 @@
> +expect_haves () {
> +       printf "%s .have\n" $(git rev-parse $@) >expect
> +}
> +
> +test_expect_success 'with core.alternateRefsCommand' '
> +       [...]
> +       expect_haves a c >expect &&

This is not great. Both the caller of expect_haves() and
expect_haves() itself redirect to a file named "expect". This works,
but only by accident.

Better would be to make expect_haves() simply a generator to stdout
and let the caller redirect to the file rather than hardcoding the
filename in the function itself (much as extract_haves() takes its
input on stdin rather than hardcoding a filename). If you take this
approach, then you'd probably want to rename the function, as well;
perhaps call it emit_haves() or something.
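
A minimal sketch of that idea, assuming the name emit_haves() and
keeping the printf from the patch (with "$@" quoted):

```shell
# Write one "<oid> .have" line per revision to stdout; the caller
# decides where the output goes.
emit_haves () {
	printf "%s .have\n" $(git rev-parse "$@")
}
```

A test would then call it as 'emit_haves a c >expect', so the filename
appears only at the call site.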

> +       printf "0000" | git receive-pack fork >actual &&
> +       extract_haves <actual >actual.haves &&
> +       test_cmp expect actual.haves
> +'


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 18:47   ` [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
@ 2018-09-21 21:14     ` Junio C Hamano
  2018-09-21 21:37       ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 21:14 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, sunshine, sbeller

Taylor Blau <me@ttaylorr.com> writes:

> +core.alternateRefsPrefixes::
> +	When listing references from an alternate, list only references that begin
> +	with the given prefix. Prefixes match as if they were given as arguments to
> +	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
> +	whitespace. If `core.alternateRefsCommand` is set, setting
> +	`core.alternateRefsPrefixes` has no effect.

We do not allow anything elaborate like "refs/tags/release-*" but we
still allow "refs/tags/" and "refs/heads/" by listing them together,
and because these are only prefixes, whitespace is a reasonable list
separator as they cannot appear anywhere in a refname.  OK.

Why is this "core"?  I thought this was more about receive-pack;
even if this is going to be extended to upload-pack's negotiation,
"core" is way too wide a hierarchy.  We have "transport.*" for
things like this, no?

The exact same comment applies to 2/3, of course.


> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> index 2f21f1cb8f..b656c9b30c 100755
> --- a/t/t5410-receive-pack.sh
> +++ b/t/t5410-receive-pack.sh
> @@ -51,4 +51,12 @@ test_expect_success 'with core.alternateRefsCommand' '
>  	test_cmp expect actual.haves
>  '
>  
> +test_expect_success 'with core.alternateRefsPrefixes' '
> +	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> +	expect_haves one three two >expect &&
> +	printf "0000" | git receive-pack fork >actual &&
> +	extract_haves <actual >actual.haves &&
> +	test_cmp expect actual.haves
> +'
> +
>  test_done
> diff --git a/transport.c b/transport.c
> index e7d2cdf00b..9323e5c3cd 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
>  		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
>  		argv_array_push(&cmd->args, "for-each-ref");
>  		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
> +
> +		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
> +			argv_array_push(&cmd->args, "--");
> +			argv_array_split(&cmd->args, value);
> +		}
>  	}
>  
>  	cmd->env = local_repo_env;


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 21:14     ` Junio C Hamano
@ 2018-09-21 21:37       ` Jeff King
  2018-09-21 22:06         ` Junio C Hamano
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-21 21:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Fri, Sep 21, 2018 at 02:14:17PM -0700, Junio C Hamano wrote:

> Taylor Blau <me@ttaylorr.com> writes:
> 
> > +core.alternateRefsPrefixes::
> > +	When listing references from an alternate, list only references that begin
> > +	with the given prefix. Prefixes match as if they were given as arguments to
> > +	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
> > +	whitespace. If `core.alternateRefsCommand` is set, setting
> > +	`core.alternateRefsPrefixes` has no effect.
> 
> We do not allow anything elaborate like "refs/tags/release-*" but we
> still allow "refs/tags/" and "refs/heads/" by listing them together,
> and because these are only prefixes, whitespace is a reasonable list
> separator as they cannot appear anywhere in a refname.  OK.
> 
> Why is this "core"?  I thought this was more about receive-pack;
> even if this is going to be extended to upload-pack's negotiation,
> "core" is way too wide a hierarchy.  We have "transport.*" for
> things like this, no?

There's no extension necessary; these should already affect upload-pack
as well. I agree transport.* would cover both upload-pack and
receive-pack. If we extend it to check_everything_connected(), would it
make sense as part of transport.*, too?

I dunno. I guess I could see an argument either way.

If we do add "rev-list --alternate-refs", that pushes it even further
away from transport.*, though.

-Peff


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 21:37       ` Jeff King
@ 2018-09-21 22:06         ` Junio C Hamano
  2018-09-21 22:18           ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 22:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, sunshine, sbeller

Jeff King <peff@peff.net> writes:

> There's no extension necessary; these should already affect upload-pack
> as well. I agree transport.* would cover both upload-pack and
> receive-pack. If we extend it to check_everything_connected(), would it
> make sense as part of transport.*, too?
>
> I dunno. I guess I could see an argument either way.

Sorry but I do not quite follow.  Are you saying that something that
covers check-everything-connected would be too wide to fit inside
transport.*?  Or that something that does not cover
check-everything-connected falls short of transport.*?  Or something
else?  Either way, core.* is way too wide for what this hook does, I
would think.


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 21:09     ` Junio C Hamano
@ 2018-09-21 22:13       ` Jeff King
  2018-09-21 22:23         ` Junio C Hamano
  2018-09-26  1:06       ` Taylor Blau
  1 sibling, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-21 22:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Fri, Sep 21, 2018 at 02:09:08PM -0700, Junio C Hamano wrote:

> > +test_expect_success 'setup' '
> > +	test_commit one &&
> > +	git update-ref refs/heads/a HEAD &&
> > +	test_commit two &&
> > +	git update-ref refs/heads/b HEAD &&
> > +	test_commit three &&
> > +	git update-ref refs/heads/c HEAD &&
> > +	git clone --bare . fork &&
> > +	git clone fork pusher &&
> > +	(
> > +		cd fork &&
> > +		git config receive.advertisealternates true &&
> 
> Hmph.  Do we have code to support this configuration variable?

Sorry, I should have caught that. Our existing solution is to disable
alternates in the advertisement entirely (since the optimization
backfires for us). So this line is a leftover from testing it against
our fork, and should be dropped.

If anybody is interested, we can share those patches, though they're
unsurprisingly trivial. I suspect we may end up discarding them if this
custom-command thing works, but it's possible we'll still need to be
able to shut them off completely for some truly pathological cases.

> > +		cat <<-EOF | git update-ref --stdin &&
> 
> Style: writing "<<-\EOF" instead would allow readers' eyes to
> coast over without having to look for $variable_references in
> the here-doc.

Also, useless-use-of-cat in the original, which could be:

  git update-ref --stdin <<-\EOF
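
For anyone unsure what the backslash buys, a small sketch of the
difference.  It uses plain "<<" rather than "<<-" so that it does not
depend on tab indentation; the variable names are only for
illustration:

```shell
ref=refs/heads/a

# Unquoted delimiter: the shell expands $ref inside the here-doc body.
read -r expanded <<EOF
delete $ref
EOF

# Quoted delimiter: the body is passed through literally, so a reader
# knows there is nothing to expand without scanning it.
read -r literal <<\EOF
delete $ref
EOF

printf '%s\n' "$expanded" "$literal"
```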

> [...]

Yeah, I second all the other bits you mentioned.

-Peff


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 22:06         ` Junio C Hamano
@ 2018-09-21 22:18           ` Jeff King
  2018-09-21 22:23             ` Stefan Beller
  2018-09-24 15:17             ` Junio C Hamano
  0 siblings, 2 replies; 94+ messages in thread
From: Jeff King @ 2018-09-21 22:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Fri, Sep 21, 2018 at 03:06:43PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > There's no extension necessary; these should already affect upload-pack
> > as well. I agree transport.* would cover both upload-pack and
> > receive-pack. If we extend it to check_everything_connected(), would it
> > make sense as part of transport.*, too?
> >
> > I dunno. I guess I could see an argument either way.
> 
> Sorry but I do not quite follow.  Are you saying that something that
> covers check-everything-connected would be too wide to fit inside
> transport.*?  Or that something that does not cover
> check-everything-connected falls short of transport.*?  Or something
> else?  Either way, core.* is way too wide for what this hook does, I
> would think.

I was suggesting that check_everything_connected() is not strictly
transport-related, so would be inappropriate for transport.*, and we'd
need a more generic name. And my "either way" was that I could see
an argument that it _is_ transport related, since we only call it now
when receiving a pack. But that doesn't have to be the case, and
certainly implementing it with "rev-list --alternate-refs" muddies that
considerably.

I agree that core.* is kind of a kitchen sink, but I'm not sure that's
all that bad. Is "here is how Git finds refs in an alternate" any more
or less core than "here is how Git invokes ssh"?

-Peff


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 22:18           ` Jeff King
@ 2018-09-21 22:23             ` Stefan Beller
  2018-09-24 15:17             ` Junio C Hamano
  1 sibling, 0 replies; 94+ messages in thread
From: Stefan Beller @ 2018-09-21 22:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Taylor Blau, git, Eric Sunshine

On Fri, Sep 21, 2018 at 3:18 PM Jeff King <peff@peff.net> wrote:

> I agree that core.* is kind of a kitchen sink, but I'm not sure that's
> all that bad. Is "here is how Git finds refs in an alternate" any more

This touches both "refs" and "alternates", which are Git concepts
whereas ssh is not.

> or less core than "here is how Git invokes ssh"?

Arguably core.sshCommand should be deprecated and re-introduced
as transport."ssh".command. :-P


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 22:13       ` Jeff King
@ 2018-09-21 22:23         ` Junio C Hamano
  2018-09-21 22:27           ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-21 22:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, sunshine, sbeller

Jeff King <peff@peff.net> writes:

> On Fri, Sep 21, 2018 at 02:09:08PM -0700, Junio C Hamano wrote:
>
>> > +test_expect_success 'setup' '
>> > +	test_commit one &&
>> > +	git update-ref refs/heads/a HEAD &&
>> > +	test_commit two &&
>> > +	git update-ref refs/heads/b HEAD &&
>> > +	test_commit three &&
>> > +	git update-ref refs/heads/c HEAD &&
>> > +	git clone --bare . fork &&
>> > +	git clone fork pusher &&
>> > +	(
>> > +		cd fork &&
>> > +		git config receive.advertisealternates true &&
>> 
>> Hmph.  Do we have code to support this configuration variable?
>
> Sorry, I should have caught that. Our existing solution is to disable
> alternates in the advertisement entirely (since the optimization
> backfires for us). So this line is a leftover from testing it against
> our fork, and should be dropped.
>
> If anybody is interested, we can share those patches, though they're
> unsurprisingly trivial.

Heh, I guessed correctly what is going on ;-)

Even though there may not be much interest in the "all-or-none"
boolean configuration, in order to upstream this custom thing, it
may be the cleanest to upstream that all-or-none thing as well.
Otherwise, you'd need to keep a patch to this test script that is
private for your "all-or-none" feature.  That's your maintenance
burden so it ultimately is your call ;-)

> Also, useless-use-of-cat in the original, which could be:
>
>   git update-ref --stdin <<-\EOF

Yup.

Thanks.


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 22:23         ` Junio C Hamano
@ 2018-09-21 22:27           ` Jeff King
  0 siblings, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-21 22:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Fri, Sep 21, 2018 at 03:23:40PM -0700, Junio C Hamano wrote:

> >> > +		git config receive.advertisealternates true &&
> >> 
> >> Hmph.  Do we have code to support this configuration variable?
> >
> > Sorry, I should have caught that. Our existing solution is to disable
> > alternates in the advertisement entirely (since the optimization
> > backfires for us). So this line is a leftover from testing it against
> > our fork, and should be dropped.
> >
> > If anybody is interested, we can share those patches, though they're
> > unsurprisingly trivial.
> 
> Heh, I guessed correctly what is going on ;-)
> 
> Even though there may not be much interest in the "all-or-none"
> boolean configuration, in order to upstream this custom thing, it
> may be the cleanest to upstream that all-or-none thing as well.
> Otherwise, you'd need to keep a patch to this test script that is
> private for your "all-or-none" feature.  That's your maintenance
> burden so it ultimately is your call ;-)

Easy one-liners in test scripts are the least of my ongoing maintenance
burden. ;)

I think in this case, though, the line is not even necessary, as our
patches leave the default as "true" (which is certainly what we would
want upstream, as well, for compatibility).

-Peff


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 18:47   ` [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
                       ` (2 preceding siblings ...)
  2018-09-21 21:10     ` Eric Sunshine
@ 2018-09-22 18:02     ` brian m. carlson
  2018-09-22 19:52       ` Jeff King
  3 siblings, 1 reply; 94+ messages in thread
From: brian m. carlson @ 2018-09-22 18:02 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, gitster, sunshine, sbeller

On Fri, Sep 21, 2018 at 02:47:43PM -0400, Taylor Blau wrote:
> +expect_haves () {
> +	printf "%s .have\n" $(git rev-parse $@) >expect
> +}
> +
> +extract_haves () {
> +	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'

It looks like you're trying to match a NUL here in the sed expression,
but from my reading of it, POSIX doesn't permit BREs to match NUL.

Perhaps someone can come up with a better solution, but I'd write this
as the following:

  depacketize - | perl -ne 'next unless /\.have/; s/\0.*$//g; print'
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204



* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-22 18:02     ` brian m. carlson
@ 2018-09-22 19:52       ` Jeff King
  2018-09-23 14:53         ` brian m. carlson
  2018-09-26  1:09         ` Taylor Blau
  0 siblings, 2 replies; 94+ messages in thread
From: Jeff King @ 2018-09-22 19:52 UTC (permalink / raw)
  To: brian m. carlson, Taylor Blau, git, gitster, sunshine, sbeller

On Sat, Sep 22, 2018 at 06:02:31PM +0000, brian m. carlson wrote:

> On Fri, Sep 21, 2018 at 02:47:43PM -0400, Taylor Blau wrote:
> > +expect_haves () {
> > +	printf "%s .have\n" $(git rev-parse $@) >expect
> > +}
> > +
> > +extract_haves () {
> > +	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
> 
> It looks like you're trying to match a NUL here in the sed expression,
> but from my reading of it, POSIX doesn't permit BREs to match NUL.

No, it's trying to literally match backslash followed by 0. The
depacketize() script will have undone the NUL already. In perl, no less,
making it more or less equivalent to your suggestion. ;)

So I think this is fine (modulo that the grep and sed can be combined).
Yet another option would be to simply strip away everything except the
object id (which is all we care about), like:

  depacketize | perl -lne '/^(\S+) \.have/ and print $1'

Or the equivalent in sed. I am happy with any solution that does the
correct thing.
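
For reference, a sed version of that idea might look roughly like the
following.  The helper name extract_have_oids is made up here, and the
input is assumed to be what depacketize already printed, i.e. lines of
the form "<oid> .have" optionally followed by more text:

```shell
# Hypothetical variant of extract_haves(): print only the object id of
# each ".have" line read from stdin and drop every other line.
extract_have_oids () {
	sed -n 's/^\([^ ][^ ]*\) \.have.*$/\1/p'
}

# Intended use would be something like:
#	depacketize <actual | extract_have_oids >actual.haves
```

The expected side of the test would then have to list bare object ids
as well.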

-Peff


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-22 19:52       ` Jeff King
@ 2018-09-23 14:53         ` brian m. carlson
  2018-09-26  1:09         ` Taylor Blau
  1 sibling, 0 replies; 94+ messages in thread
From: brian m. carlson @ 2018-09-23 14:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller

On Sat, Sep 22, 2018 at 03:52:58PM -0400, Jeff King wrote:
> On Sat, Sep 22, 2018 at 06:02:31PM +0000, brian m. carlson wrote:
> 
> > On Fri, Sep 21, 2018 at 02:47:43PM -0400, Taylor Blau wrote:
> > > +expect_haves () {
> > > +	printf "%s .have\n" $(git rev-parse $@) >expect
> > > +}
> > > +
> > > +extract_haves () {
> > > +	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
> > 
> > It looks like you're trying to match a NUL here in the sed expression,
> > but from my reading of it, POSIX doesn't permit BREs to match NUL.
> 
> No, it's trying to literally match backslash followed by 0. The
> depacketize() script will have undone the NUL already. In perl, no less,
> making it more or less equivalent to your suggestion. ;)

Ah, okay.  That makes more sense.

> So I think this is fine (modulo that the grep and sed can be combined).
> Yet another option would be to simply strip away everything except the
> object id (which is all we care about), like:
> 
>   depacketize | perl -lne '/^(\S+) \.have/ and print $1'
> 
> Or the equivalent in sed. I am happy with any solution that does the
> correct thing.

Yeah, I agree that with that context, no change is needed.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204



* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-21 22:18           ` Jeff King
  2018-09-21 22:23             ` Stefan Beller
@ 2018-09-24 15:17             ` Junio C Hamano
  2018-09-24 18:10               ` Jeff King
  1 sibling, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-24 15:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, sunshine, sbeller

Jeff King <peff@peff.net> writes:

> I was suggesting that check_everything_connected() is not strictly
> transport-related, so would be inappropriate for transport.*, and we'd
> need a more generic name. And my "either way" was that I could see
> an argument that it _is_ transport related, since we only call it now
> when receiving a pack. But that doesn't have to be the case, and
> certainly implementing it with "rev-list --alternate-refs" muddies that
> considerably.

Even after 7043c707 ("check_everything_connected: use a struct with
named options", 2016-07-15) unified many into check_connected(),
there still are different reasons why we call it to find out about
connectivity, and I doubt we can afford to have a single knob that
is shared both for transport and other kinds of connectivity checks
(like fsck or repack).  Do we want to be affected by "we pretend
that these are the only refs exported from that alternate object
store" when repacking and pruning only local objects while we keep
relying on the alternate, for example?

In any case it is good that these configuration variables are
defined on _our_ side, not in the alternate---it means that we do
not have to worry about the case where the alternateRefsCommand lies
and tells us that an object that the alternate does not actually
have exists at a tip of a ref in an attempt to confuse us, etc.


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-24 15:17             ` Junio C Hamano
@ 2018-09-24 18:10               ` Jeff King
  2018-09-24 20:32                 ` Junio C Hamano
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-24 18:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Mon, Sep 24, 2018 at 08:17:14AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I was suggesting that check_everything_connected() is not strictly
> > transport-related, so would be inappropriate for transport.*, and we'd
> > need a more generic name. And my "either way" was that I could see
> > an argument that it _is_ transport related, since we only call it now
> > when receiving a pack. But that doesn't have to be the case, and
> > certainly implementing it with "rev-list --alternate-refs" muddies that
> > considerably.
> 
> Even after 7043c707 ("check_everything_connected: use a struct with
> named options", 2016-07-15) unified many into check_connected(),
> there still are different reasons why we call it to find out about
> connectivity, and I doubt we can afford to have a single knob that
> is shared both for transport and other kinds of connectivity checks
> (like fsck or repack).  Do we want to be affected by "we pretend
> that these are the only refs exported from that alternate object
> store" when repacking and pruning only local objects while we keep
> relying on the alternate, for example?

Actually, yes, I think there is value in a single knob. At least that's
what I'd want for our (GitHub's) use case.

Remember that these alternate refs might not exist at all (the
alternates mechanism can work with just a bare "objects" directory,
unconnected from a real git repo). So I think anything using them has to
view it as a "best effort" optimization: we might or might not know
about some ref tips that might or might not cover the whole set of
objects in the alternate. They're the things we _guarantee_ that the
alternate has full connectivity for, and it might have more.

So I think it's conceptually consistent to always show a subset. I did
qualify with "for our use case" because some people might be primarily
concerned with the bandwidth of sending .haves across the network.
Whereas at our scale, even enumerating them at all is prohibitively
expensive.

One thing we could do is add a "core" config now (whether it's in core.*
or wherever). And then if later somebody wants receive-pack to behave
differently, we have an out: we can add transfer.alternateRefsCommand or
even receive.alternateRefsCommand that take precedence in those
situations.
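
As a rough illustration of that precedence, written as shell for
brevity: only core.alternateRefsCommand is proposed by this series, the
transfer.* and receive.* keys are the possible later additions
mentioned above, and the helper name is made up:

```shell
# Hypothetical lookup order: the most specific key that is set wins,
# and core.alternateRefsCommand is the fallback.
#   $1: calling context, e.g. "receive"
alternate_refs_command_for () {
	context=$1
	for key in "$context.alternateRefsCommand" \
		transfer.alternateRefsCommand \
		core.alternateRefsCommand
	do
		# "git config --get" exits non-zero when the key is unset.
		if value=$(git config --get "$key")
		then
			printf '%s\n' "$value"
			return 0
		fi
	done
	return 1
}
```

Callers that never grow a specific key simply keep getting the core.*
value.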

Of course we could add the more restricted ones now, and add the "core"
one later as new uses grow. But that's more work now, since we'd have to
plumb through that context to the for_each_alternate_ref() interface.
I'd rather punt on that work until later (because I suspect that "later"
will never actually come).

> In any case it is good that these configuration variables are
> defined on _our_ side, not in the alternate---it means that we do
> not have to worry about the case where the alternateRefsCommand lies
> and tells us that an object that the alternate does not actually
> have exists at a tip of a ref in an attempt to confuse us, etc.

Yes. It also makes it easy to use "git -c" to override the scheme if you
want to (as opposed to mucking with on-disk files in the alternate).

-Peff


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-24 18:10               ` Jeff King
@ 2018-09-24 20:32                 ` Junio C Hamano
  2018-09-24 20:50                   ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-24 20:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, sunshine, sbeller

Jeff King <peff@peff.net> writes:

> So I think it's conceptually consistent to always show a subset.

OK.  Then I agree with you that it is a good approach to first adopt
core.* knobs that universally apply, and add specialized ones as
they are needed later.

Thanks.


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-24 20:32                 ` Junio C Hamano
@ 2018-09-24 20:50                   ` Jeff King
  2018-09-24 21:01                     ` Jeff King
  2018-09-24 21:55                     ` Junio C Hamano
  0 siblings, 2 replies; 94+ messages in thread
From: Jeff King @ 2018-09-24 20:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Mon, Sep 24, 2018 at 01:32:26PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > So I think it's conceptually consistent to always show a subset.
> 
> OK.  Then I agree with you that it is a good approach to first adopt
> core.* knobs that universally apply, and add specialized ones as
> they are needed later.

Thanks. There's one other major decision for this series, I think.

Do you have an opinion on whether the for_each_alternate_ref() interface
should stop passing back refnames? By the "they may not even exist"
rationale in this sub-thread, I think it's probably foolish for any
caller to actually depend on the names being meaningful.

We need to decide now because the idea of which data is relevant is
getting baked into the documented alternateRefsCommand output format.

-Peff


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-24 20:50                   ` Jeff King
@ 2018-09-24 21:01                     ` Jeff King
  2018-09-24 21:55                     ` Junio C Hamano
  1 sibling, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-24 21:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Mon, Sep 24, 2018 at 04:50:22PM -0400, Jeff King wrote:

> On Mon, Sep 24, 2018 at 01:32:26PM -0700, Junio C Hamano wrote:
> 
> > Jeff King <peff@peff.net> writes:
> > 
> > > So I think it's conceptually consistent to always show a subset.
> > 
> > OK.  Then I agree with you that it is a good approach to first adopt
> > core.* knobs that universally apply, and add specialized ones as
> > they are needed later.
> 
> Thanks. There's one other major decision for this series, I think.
> 
> Do you have an opinion on whether the for_each_alternate_ref() interface
> should stop passing back refnames? By the "they may not even exist"
> rationale in this sub-thread, I think it's probably foolish for any
> caller to actually depend on the names being meaningful.
> 
> We need to decide now because the idea of which data is relevant is
> getting baked into the documented alternateRefsCommand output format.

Just to sketch it out further, I was thinking that we'd do something
like this at the front of Taylor's series (with the rest rebased as
appropriate on top).

-- >8 --
Subject: [PATCH] transport: drop refnames from for_each_alternate_ref

None of the current callers use the refname parameter we pass to their
callbacks. In theory somebody _could_ do so, but it's actually quite
weird if you think about it: it's a ref in somebody else's repository.
So the name has no meaning locally, and in fact there may be duplicates
if there are multiple alternates.

The users of this interface really only care about seeing some ref tips,
since that promises that the alternate has the full commit graph
reachable from there. So let's keep the information we pass back to the
bare minimum.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c | 3 +--
 fetch-pack.c           | 3 +--
 transport.c            | 6 +++---
 transport.h            | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a3bb13af10..39993f2bcf 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -281,8 +281,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	return 0;
 }
 
-static void show_one_alternate_ref(const char *refname,
-				   const struct object_id *oid,
+static void show_one_alternate_ref(const struct object_id *oid,
 				   void *data)
 {
 	struct oidset *seen = data;
diff --git a/fetch-pack.c b/fetch-pack.c
index 75047a4b2a..b643de143b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -76,8 +76,7 @@ struct alternate_object_cache {
 	size_t nr, alloc;
 };
 
-static void cache_one_alternate(const char *refname,
-				const struct object_id *oid,
+static void cache_one_alternate(const struct object_id *oid,
 				void *vcache)
 {
 	struct alternate_object_cache *cache = vcache;
diff --git a/transport.c b/transport.c
index 1c76d64aba..2e0bc414d0 100644
--- a/transport.c
+++ b/transport.c
@@ -1336,7 +1336,7 @@ static void read_alternate_refs(const char *path,
 	cmd.git_cmd = 1;
 	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
 	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
+	argv_array_push(&cmd.args, "--format=%(objectname)");
 	cmd.env = local_repo_env;
 	cmd.out = -1;
 
@@ -1348,13 +1348,13 @@ static void read_alternate_refs(const char *path,
 		struct object_id oid;
 
 		if (get_oid_hex(line.buf, &oid) ||
-		    line.buf[GIT_SHA1_HEXSZ] != ' ') {
+		    line.buf[GIT_SHA1_HEXSZ]) {
 			warning(_("invalid line while parsing alternate refs: %s"),
 				line.buf);
 			break;
 		}
 
-		cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data);
+		cb(&oid, data);
 	}
 
 	fclose(fh);
diff --git a/transport.h b/transport.h
index 01e717c29e..9baeca2d7a 100644
--- a/transport.h
+++ b/transport.h
@@ -261,6 +261,6 @@ int transport_refs_pushed(struct ref *ref);
 void transport_print_push_status(const char *dest, struct ref *refs,
 		  int verbose, int porcelain, unsigned int *reject_reasons);
 
-typedef void alternate_ref_fn(const char *refname, const struct object_id *oid, void *);
+typedef void alternate_ref_fn(const struct object_id *oid, void *);
 extern void for_each_alternate_ref(alternate_ref_fn, void *);
 #endif
-- 
2.19.0.764.g0a058409ab



* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-24 20:50                   ` Jeff King
  2018-09-24 21:01                     ` Jeff King
@ 2018-09-24 21:55                     ` Junio C Hamano
  2018-09-24 23:14                       ` Jeff King
  1 sibling, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-24 21:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, sunshine, sbeller

Jeff King <peff@peff.net> writes:

> Do you have an opinion on whether the for_each_alternate_ref() interface
> should stop passing back refnames? By the "they may not even exist"
> rationale in this sub-thread, I think it's probably foolish for any
> caller to actually depend on the names being meaningful.

I personally do not mind if they were all ".have" or unnamed.

The primary motivation behind for-each-alternate-refs was that we
wanted to find more anchoring points to help the common ancestry
negotiation and for-each-*-ref was the obvious way to do so; the
user did not care anything about names.


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-24 21:55                     ` Junio C Hamano
@ 2018-09-24 23:14                       ` Jeff King
  2018-09-25 17:41                         ` Junio C Hamano
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-24 23:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Mon, Sep 24, 2018 at 02:55:57PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Do you have an opinion on whether the for_each_alternate_ref() interface
> > should stop passing back refnames? By the "they may not even exist"
> > rationale in this sub-thread, I think it's probably foolish for any
> > caller to actually depend on the names being meaningful.
> 
> I personally do not mind if they were all ".have" or unnamed.
> 
> The primary motivation behind for-each-alternate-refs was that we
> wanted to find more anchoring points to help the common ancestry
> negotiation and for-each-*-ref was the obvious way to do so; the
> user did not care anything about names.

Right, I think that is totally fine for the current uses. I guess my
question was: do you envision cutting the interface down to only the
oids to bite us in the future?

I was on the fence during past discussions, but I think I've come over
to the idea that the refnames actively confuse things.

-Peff


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-24 23:14                       ` Jeff King
@ 2018-09-25 17:41                         ` Junio C Hamano
  2018-09-25 22:46                           ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Junio C Hamano @ 2018-09-25 17:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, sunshine, sbeller

Jeff King <peff@peff.net> writes:

> Right, I think that is totally fine for the current uses. I guess my
> question was: do you envision cutting the interface down to only the
> oids to bite us in the future?
>
> I was on the fence during past discussions, but I think I've come over
> to the idea that the refnames actively confuse things.

Alternates are sort-of repositories that you interact with via more
normal transports like fetch or push, and at the object store level
(i.e. the one that helps you build your local history) you do not
really care what refnames other people use in their repository.
E.g. it does not matter if a pull request to you asks you to pull
their 'frotz' branch or 'nitfol' branch, as long as the work they
did on that branch is what you expected them to do.  And I think
"I am aware that I can get to the objects that are reachable from
these objects I can borrow from that alternate when I need them" is
quite similar in spirit; the borrower has even less need to be aware
of the refnames as there isn't even a need to "git pull" from it (a
pull is the one single point at which you would care what name they
used in their pull request).

So, I think we probably are better off without names.


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-25 17:41                         ` Junio C Hamano
@ 2018-09-25 22:46                           ` Taylor Blau
  2018-09-25 23:56                             ` Junio C Hamano
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-25 22:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Taylor Blau, git, sunshine, sbeller

On Tue, Sep 25, 2018 at 10:41:18AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Right, I think that is totally fine for the current uses. I guess my
> > question was: do you envision cutting the interface down to only the
> > oids to bite us in the future?
> >
> > I was on the fence during past discussions, but I think I've come over
> > to the idea that the refnames actively confuse things.
>
> [ ... ]
>
> So, I think we probably are better off without names.

Sorry for re-entering the thread a little later. I was travelling
yesterday, and was surprised when I discovered that our "grep | sed" vs.
"sed" discussion had grown so much ;-).

My reading of this is threefold:

  1. There are some cosmetic changes that need to occur in t5410 and
     documentation, which are mentioned above. Those seem self
     explanatory, and I've applied the necessary bits already on my
     local version of this topic.

  2. The core.alternateRefsCommand vs transport.* discussion was
     resolved in [1] as "let's use core.alternateRefsCommand and
     core.alternateRefsPrefixes" for now, and other contributors can
     change this as needed.

  3. We can apply Peff's patch to remove the refname requirement before
     mine, as well as any relevant changes in my series as have been
     affected by Peff's patch (e.g., documentation mentioning
     '%(refname)', etc).

Does this all sound sane to you (and match your recollection/reading of
the thread)? If so, I'll send v3 hopefully tomorrow.

Sorry for repeating what's already been said in this thread, but I felt
it was important to ensure that we had matching understandings of one
another.

Thanks,
Taylor

[1]: https://public-inbox.org/git/xmqqa7o6skkl.fsf@gitster-ct.c.googlers.com/


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-25 22:46                           ` Taylor Blau
@ 2018-09-25 23:56                             ` Junio C Hamano
  2018-09-26  1:18                               ` Taylor Blau
  2018-09-26  3:16                               ` Jeff King
  0 siblings, 2 replies; 94+ messages in thread
From: Junio C Hamano @ 2018-09-25 23:56 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, git, sunshine, sbeller

Taylor Blau <me@ttaylorr.com> writes:

> My reading of this is threefold:
>
>   1. There are some cosmetic changes that need to occur in t5410 and
>      documentation, which are mentioned above. Those seem self
>      explanatory, and I've applied the necessary bits already on my
>      local version of this topic.
>
>   2. The core.alternateRefsCommand vs transport.* discussion was
>      resolved in [1] as "let's use core.alternateRefsCommand and
>      core.alternateRefsPrefixes" for now, and other contributors can
>      change this as needed.
>
>   3. We can apply Peff's patch to remove the refname requirement before
>      mine, as well as any relevant changes in my series as have been
>      affected by Peff's patch (e.g., documentation mentioning
>      '%(refname)', etc).

I do think it makes sense to allow alternateRefsCommand to output
just the object names without adding any refnames, and to keep the
parser simple, we should not even make the refname optional
(i.e. "allow" above becomes "require"), and make the default one
done via an invocation of for-each-ref also do the same.

I do not think there was a strong consensus that we need to change
the internal C API signature, though.  If the function signature for
the callback between each_ref_fn and alternate_ref_fn were the same,
I would have opposed the change, but because they are already
different, I do not think it is necessary to keep the dummy refname
parameter that is always passed a meaningless value.

The final series would be

 1/4: peff's "refnames in alternates do not matter"

 2/4: your "hardcoded for-each-ref becomes just a default"

 3/4: your "config can affect what command enumerates alternate's tips"

 4/4: your "with prefix config, you don't need a fully custom command"

I guess?
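
For illustration only, a command configured via core.alternateRefsCommand
under the names-free format proposed above might look roughly like the
sketch below. The function name list_alternate_tips and the choice of
refs/heads are made up for the example; only the for-each-ref options
are existing ones, and the exact output format the final series requires
may differ.

```shell
# Hypothetical helper: print one object name per line for the branch tips
# of the alternate whose git directory is passed as $1. No refnames are
# emitted, matching the "object names only" format proposed in this thread.
list_alternate_tips () {
    git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads
}
```

A parser reading this output only has to handle one object name per
line, which is the simplification argued for above.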


* Re: [PATCH 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 19:59         ` Junio C Hamano
@ 2018-09-26  0:56           ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-26  0:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, peff

On Fri, Sep 21, 2018 at 12:59:16PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > In fact, I think that we can go even further: since we don't need to
> > catch the beginning '^.*' (without -o), we can instead:
> >
> >   extract_haves () {
> >     depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
> >   }
>
> Do not pipe grep into sed, unless you have an overly elaborate set
> of patterns to filter with, e.g. something along the lines of...
>
> 	sed -ne '/\.have/s/...//p'

Thanks, I'm not sure why I thought that this was a good idea to send
(even after discussing it to myself twice publicly on the list
beforehand).

Anyway, in my local copy, I adopted Peff's suggestion below in the
thread, which is:

  extract_haves () {
    depacketize - | perl -lne '/^(\S+) \.have/ and print $1'
  }

I think that that should be OK, but I sent it here to double check
before sending you real patches.

Thanks,
Taylor


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 20:18     ` Eric Sunshine
@ 2018-09-26  0:59       ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-26  0:59 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Taylor Blau, Git List, Jeff King, Junio C Hamano, Stefan Beller

On Fri, Sep 21, 2018 at 04:18:03PM -0400, Eric Sunshine wrote:
> On Fri, Sep 21, 2018 at 2:47 PM Taylor Blau <me@ttaylorr.com> wrote:
> > When in a repository containing one or more alternates, Git would
> > sometimes like to list references from its alternates. For example, 'git
> > receive-pack' lists the objects pointed to by alternate references as
> > special ".have" references.
> > [...]
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
> > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > @@ -0,0 +1,54 @@
> > +expect_haves () {
> > +       printf "%s .have\n" $(git rev-parse $@) >expect
> > +}
>
> Magic quoting behavior only kicks in when $@ is itself quoted, so this
> should be:
>
>     printf "%s .have\n" $(git rev-parse "$@") >expect
>
> However, as it's unlikely that you need magic quoting in this case,
> you might get by with plain $* (unquoted).

Yep, thanks for catching my mistake. I rewrote my local copy with "$@"
(instead of $@), and also applied your suggestion of not redirecting to
`>expect`, and renaming the function.

These both ended up becoming moot points, though, because of the
Perl-ism that Peff suggested and I adopted throughout this thread.

The Perl Peff wrote does not capture the " .have" suffix at all, and
instead only the object identifiers. Hence, all we really need is a call
to 'git-rev-parse(1)'. I doubt that this will ever change, so I removed
the function entirely.

Thanks,
Taylor
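
To make the quoting point above concrete, here is a minimal sketch of
how $@ and "$@" differ once an argument contains whitespace. The helper
names are invented for the example, and a default IFS is assumed.

```shell
# Print the number of arguments received.
count_args () {
    echo $#
}

# Unquoted: each argument is split again on whitespace.
pass_unquoted () {
    count_args $@
}

# Quoted: each original argument is passed through intact.
pass_quoted () {
    count_args "$@"
}

pass_unquoted "a b" c    # prints 3
pass_quoted "a b" c      # prints 2
```

With revision names that contain no whitespace, both forms hand the
same words to 'git rev-parse', which is why the unquoted version
appeared to work in the test.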


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-21 21:09     ` Junio C Hamano
  2018-09-21 22:13       ` Jeff King
@ 2018-09-26  1:06       ` Taylor Blau
  2018-09-26  3:21         ` Jeff King
  1 sibling, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-26  1:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, peff, sunshine, sbeller

On Fri, Sep 21, 2018 at 02:09:08PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > +core.alternateRefsCommand::
> > +	When listing references from an alternate (e.g., in the case of ".have"), use
>
> It is not clear how (e.g.,...) connects to what is said in the
> sentence.  "When advertising tips of available history from an
> alternate, use ..." without saying ".have" may be less cryptic.
>
> I dunno.

Thanks, I think that I tend to overuse both "e.g.," and "i.e.,". I took
your suggestion as above, which I think looks better than my original
prose.

> > +	the shell to execute the specified command instead of
> > +	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
>
> "The path" meaning the absolute path?  Relative to the original
> object store?  Something else?

It's the absolute path, and I've updated the documentation to clarify it
as such.

> > +	Output must be of the form: `%(objectname) SPC %(refname)`.
> > ++
> > +This is useful when a repository only wishes to advertise some of its
> > +alternate's references as ".have"'s. For example, to only advertise branch
> > +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> > +`git --git-dir="$1" for-each-ref refs/heads`.
> > +
> >  core.bare::
> >  	If true this repository is assumed to be 'bare' and has no
> >  	working directory associated with it.  If this is the case a
> > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > new file mode 100755
> > index 0000000000..2f21f1cb8f
> > --- /dev/null
> > +++ b/t/t5410-receive-pack.sh
> > @@ -0,0 +1,54 @@
> > +#!/bin/sh
> > +
> > +test_description='git receive-pack test'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup' '
> > +	test_commit one &&
> > +	git update-ref refs/heads/a HEAD &&
> > +	test_commit two &&
> > +	git update-ref refs/heads/b HEAD &&
> > +	test_commit three &&
> > +	git update-ref refs/heads/c HEAD &&
> > +	git clone --bare . fork &&
> > +	git clone fork pusher &&
> > +	(
> > +		cd fork &&
> > +		git config receive.advertisealternates true &&
>
> Hmph.  Do we have code to support this configuration variable?

We don't ;-). Peff's explanation of why is accurate, and the mistake is
mine.

> > +		cat <<-EOF | git update-ref --stdin &&
>
> Style: writing "<<-\EOF" instead would allow readers' eyes to
> coast over without having to look for $variable_references in
> the here-doc.
>
> > +		delete refs/heads/a
> > +		delete refs/heads/b
> > +		delete refs/heads/c
> > +		delete refs/heads/master
> > +		delete refs/tags/one
> > +		delete refs/tags/two
> > +		delete refs/tags/three

Thanks, it ended up being much cleaner to write <<-\EOF, and avoid the
unnecessary cat(1) entirely.

> So, the original created one/two/three/a/b/c/master, fork is a bare
> clone of it and has all these things, and then you deleted all of
> these?  What does fork have after this is done?  HEAD that is
> dangling?
>
> > +		EOF
> > +		echo "../../.git/objects" >objects/info/alternates
>
> When viewed from fork/objects, ../../.git is the GIT_DIR of the
> primary test repository, so that is where we borrow objects from.
>
> If we pruned the objects from fork's object store before this echo,
> we would have an almost empty repository that borrows from its
> alternates everything, which may make a more realistic sample case,
> but because you are only focusing on the ref advertisement, it does
> not matter that your fork is full of duplicate objects that are
> available from the alternates.

I could go either way. You're right in that we have only a dangling HEAD
reference in the fork, and that all of the objects are still there. I
suppose that we could gc the objects that are there, but I think (as you
note above) that it doesn't make a huge difference either way.

> > +expect_haves () {
> > +	printf "%s .have\n" $(git rev-parse $@) >expect
>
> Quote $@ inside dq pair, like $(git rev-parse "$@").

Thanks, I fixed this (per your and Eric's suggestion), but ended up
removing the function entirely anyway.

> > +extract_haves () {
> > +	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
> > +}
>
> Don't pipe grep into sed, especially when both the pattern to filter
> and the operation to perform are simple.
>
> I am not sure what you are trying to achieve with 'g' in
> s/pattern$//g; The anchor at the rightmost end of the pattern makes
> sure that the pattern matches only once per line at the end anyway,
> so "do this howmanyever times as we have match on each line" would
> not make any difference, no?

I admit to not fully understanding when the trailing `/g` is and is not
useful. Anyway, I took Peff's suggestion below to convert this 'grep |
sed' pipeline into a Perl invocation, which I think ended up much
cleaner.

Thanks,
Taylor
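
As a small sketch of the here-doc style point above: quoting the
delimiter (<<-\EOF) turns off expansion in the body, so a reader does
not need to scan it for variable references. The variable names are made
up for the example. The "-" only strips leading tabs, so it has no
effect on the unindented bodies used here.

```shell
name=world

# Unquoted delimiter: $name is expanded inside the here-doc.
expanded=$(cat <<-EOF
hello $name
EOF
)

# Quoted delimiter: the body is taken literally.
literal=$(cat <<-\EOF
hello $name
EOF
)

echo "$expanded"    # prints: hello world
echo "$literal"     # prints: hello $name
```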


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-22 19:52       ` Jeff King
  2018-09-23 14:53         ` brian m. carlson
@ 2018-09-26  1:09         ` Taylor Blau
  2018-09-26  3:33           ` Jeff King
  1 sibling, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-26  1:09 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Taylor Blau, git, gitster, sunshine, sbeller

On Sat, Sep 22, 2018 at 03:52:58PM -0400, Jeff King wrote:
> On Sat, Sep 22, 2018 at 06:02:31PM +0000, brian m. carlson wrote:
>
> > On Fri, Sep 21, 2018 at 02:47:43PM -0400, Taylor Blau wrote:
> > > +expect_haves () {
> > > +	printf "%s .have\n" $(git rev-parse $@) >expect
> > > +}
> > > +
> > > +extract_haves () {
> > > +	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
> >
> > It looks like you're trying to match a NUL here in the sed expression,
> > but from my reading of it, POSIX doesn't permit BREs to match NUL.
>
> No, it's trying to literally match backslash followed by 0. The
> depacketize() script will have undone the NUL already. In perl, no less,
> making it more or less equivalent to your suggestion. ;)
>
> So I think this is fine (modulo that the grep and sed can be combined).
> Yet another option would be to simply strip away everything except the
> object id (which is all we care about), like:
>
>   depacketize | perl -lne '/^(\S+) \.have/ and print $1'

Thanks for this. This is the suggestion I ended up taking (modulo taking
'-' as the first argument to 'depacketize').

The 'print $1' part of this makes things a lot nicer, actually, having
removed the " .have" suffix. We can get rid of the expect_haves()
function above, and instead call 'git rev-parse' inline and get the
right results.

> Or the equivalent in sed. I am happy with any solution that does the
> correct thing.

Me too :-). Thanks again.

Thanks,
Taylor
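
For comparison, one possible sed spelling of the Perl one-liner quoted
above, offered as a sketch rather than as what the series uses. The
function name extract_haves_sed and the sample object names are made up,
and the input is assumed to be already depacketized, one packet per
line.

```shell
# Hypothetical sed counterpart of:
#   perl -lne '/^(\S+) \.have/ and print $1'
# Prints the leading run of non-whitespace characters on each line where
# it is followed by " .have"; all other lines are dropped.
extract_haves_sed () {
    sed -ne 's/^\([^[:space:]]\{1,\}\) \.have.*$/\1/p'
}

# Made-up, already-depacketized input for illustration.
printf '%s\n' \
    '1111111111111111111111111111111111111111 .have' \
    '2222222222222222222222222222222222222222 refs/heads/master' |
extract_haves_sed
# prints 1111111111111111111111111111111111111111
```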


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-25 23:56                             ` Junio C Hamano
@ 2018-09-26  1:18                               ` Taylor Blau
  2018-09-26  3:16                               ` Jeff King
  1 sibling, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-26  1:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Jeff King, git, sunshine, sbeller

On Tue, Sep 25, 2018 at 04:56:11PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > My reading of this is threefold:
> >
> >   1. There are some cosmetic changes that need to occur in t5410 and
> >      documentation, which are mentioned above. Those seem self
> >      explanatory, and I've applied the necessary bits already on my
> >      local version of this topic.
> >
> >   2. The core.alternateRefsCommand vs transport.* discussion was
> >      resolved in [1] as "let's use core.alternateRefsCommand and
> >      core.alternateRefsPrefixes" for now, and other contributors can
> >      change this as needed.
> >
> >   3. We can apply Peff's patch to remove the refname requirement before
> >      mine, as well as any relevant changes in my series as have been
> >      affected by Peff's patch (e.g., documentation mentioning
> >      '%(refname)', etc).
>
> I do think it makes sense to allow alternateRefsCommand to output
> just the object names without adding any refnames, and to keep the
> parser simple, we should not even make the refname optional
> (i.e. "allow" above becomes "require"), and make the default one
> done via an invocation of for-each-ref also do the same.
>
> I do not think there was a strong consensus that we need to change
> the internal C API signature, though.  If the function signature for
> the callback between each_ref_fn and alternate_ref_fn were the same,
> I would have opposed the change, but because they are already
> different, I do not think it is necessary to keep the dummy refname
> parameter that is always passed a meaningless value.
>
> The final series would be
>
>  1/4: peff's "refnames in alternates do not matter"
>
>  2/4: your "hardcoded for-each-ref becomes just a default"
>
>  3/4: your "config can affect what command enumerates alternate's tips"
>
>  4/4: your "with prefix config, you don't need a fully custom command"
>
> I guess?

Perfect -- we are in agreement on how the rerolled series should be
organized. I don't anticipate much further comment on v2 in this thread,
but I'll let it sit overnight to make sure that the dust has all settled
after my new mail.

I have a version of what will likely become 'v3', pushed here: [1].

Thanks,
Taylor

[1]: https://github.com/ttaylorr/git/tree/tb/alternate-refs-cmd


* Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes
  2018-09-25 23:56                             ` Junio C Hamano
  2018-09-26  1:18                               ` Taylor Blau
@ 2018-09-26  3:16                               ` Jeff King
  1 sibling, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-26  3:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, sunshine, sbeller

On Tue, Sep 25, 2018 at 04:56:11PM -0700, Junio C Hamano wrote:

> Taylor Blau <me@ttaylorr.com> writes:
> 
> > My reading of this is threefold:
> >
> >   1. There are some cosmetic changes that need to occur in t5410 and
> >      documentation, which are mentioned above. Those seem self
> >      explanatory, and I've applied the necessary bits already on my
> >      local version of this topic.
> >
> >   2. The core.alternateRefsCommand vs transport.* discussion was
> >      resolved in [1] as "let's use core.alternateRefsCommand and
> >      core.alternateRefsPrefixes" for now, and other contributors can
> >      change this as needed.
> >
> >   3. We can apply Peff's patch to remove the refname requirement before
> >      mine, as well as any relevant changes in my series as have been
> >      affected by Peff's patch (e.g., documentation mentioning
> >      '%(refname)', etc).

Yeah, these three sound right to me.

> I do think it makes sense to allow alternateRefsCommand to output
> just the object names without adding any refnames, and to keep the
> parser simple, we should not even make the refname optional
> (i.e. "allow" above becomes "require"), and make the default one
> done via an invocation of for-each-ref also do the same.

Yeah, making it optional is just the worst of both worlds, IMHO. Then
callers sometimes get a real value and sometimes just whatever garbage
we fill in, and can't rely on it.

> I do not think there was a strong consensus that we need to change
> the internal C API signature, though.  If the function signature for
> the callback between each_ref_fn and alternate_ref_fn were the same,
> I would have opposed the change, but because they are already
> different, I do not think it is necessary to keep the dummy refname
> parameter that is always passed a meaningless value.

Agreed. I adjusted my "rev-list --alternate-refs" patch for the proposed
new world order (just because it's the likely user of the refname
field). Since the function signatures aren't the same, I already had a
custom callback. It did chain to the existing each_ref_fn one, so I had
to adjust it like so:

diff --git a/revision.c b/revision.c
index 3988275fde..8dfe2fd4c0 100644
--- a/revision.c
+++ b/revision.c
@@ -1396,11 +1396,10 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
 	free_worktrees(worktrees);
 }
 
-static void handle_one_alternate_ref(const char *refname,
-				     const struct object_id *oid,
+static void handle_one_alternate_ref(const struct object_id *oid,
 				     void *data)
 {
-	handle_one_ref(refname, oid, 0, data);
+	handle_one_ref(".have", oid, 0, data);
 }
 
 static int add_parents_only(struct rev_info *revs, const char *arg_, int flags,

But I think that's fine. We have to handle the lack of name _somewhere_
in the call stack, so I'd just as soon it be here in the callback, where
we know what it will be used for (or not used at all).

> The final series would be
> 
>  1/4: peff's "refnames in alternates do not matter"
> 
>  2/4: your "hardcoded for-each-ref becomes just a default"
> 
>  3/4: your "config can affect what command enumerates alternate's tips"
> 
>  4/4: your "with prefix config, you don't need a fully custom command"

Yep, that's what I'd expect from the new series.

-Peff


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-26  1:06       ` Taylor Blau
@ 2018-09-26  3:21         ` Jeff King
  0 siblings, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-26  3:21 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, git, sunshine, sbeller

On Tue, Sep 25, 2018 at 06:06:06PM -0700, Taylor Blau wrote:

> > > +extract_haves () {
> > > +	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
> > > +}
> >
> > Don't pipe grep into sed, especially when both the pattern to filter
> > and the operation to perform are simple.
> >
> > I am not sure what you are trying to achieve with 'g' in
> > s/pattern$//g; The anchor at the rightmost end of the pattern makes
> > sure that the pattern matches only once per line at the end anyway,
> > so "do this howmanyever times as we have match on each line" would
> > not make any difference, no?
> 
> I admit to not fully understanding when the trailing `/g` is and is not
> useful. Anyway, I took Peff's suggestion below to convert this 'grep |
> sed' pipeline into a Perl invocation, which I think ended up much
> cleaner.

It makes the replacement global in the line. Without it, we substitute only
the first match. So try:

  echo foo | sed s/o/X/

versus:

  echo foo | sed s/o/X/g

-Peff
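
Putting the two observations from this sub-thread side by side, as a
sketch: "g" matters when a pattern can match more than once per line,
and makes no difference once the pattern is anchored with "$". The
variable names and the sample line are made up for the example.

```shell
# Without g, only the first match on each line is replaced.
first_only=$(echo foo | sed 's/o/X/')     # fXo

# With g, every match on the line is replaced.
every=$(echo foo | sed 's/o/X/g')         # fXX

# When the pattern is anchored with $, it can match at most once per
# line, so g changes nothing. The sample line holds a literal backslash
# followed by 0, as depacketized output would.
line='abc\0capabilities'
anchored=$(printf '%s\n' "$line" | sed -e 's/\\0.*$//')      # abc
anchored_g=$(printf '%s\n' "$line" | sed -e 's/\\0.*$//g')   # abc
```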


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-26  1:09         ` Taylor Blau
@ 2018-09-26  3:33           ` Jeff King
  2018-09-26 13:39             ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-26  3:33 UTC (permalink / raw)
  To: Taylor Blau; +Cc: brian m. carlson, git, gitster, sunshine, sbeller

On Tue, Sep 25, 2018 at 06:09:35PM -0700, Taylor Blau wrote:

> > So I think this is fine (modulo that the grep and sed can be combined).
> > Yet another option would be to simply strip away everything except the
> > object id (which is all we care about), like:
> >
> >   depacketize | perl -lne '/^(\S+) \.have/ and print $1'
> 
> Thanks for this. This is the suggestion I ended up taking (modulo taking
> '-' as the first argument to 'depacketize').

I don't think depacketize takes any arguments. It always reads from
stdin directly, doesn't it? Your "-" is not hurting anything, but it is
totally ignored.

A perl tangent if you're interested:

  Normally for shell functions like this that are just wrappers around
  perl snippets, I would suggest to pass "$@" from the function's
  arguments to perl. So for example if we had:

    haves_from_packets () {
	perl -lne '/^(\S+) \.have/ and print $1' "$@"
    }

  then you could call it with a filename:

    haves_from_packets packets

  or input on stdin:

    haves_from_packets <packets

  and either works (this is magic from perl's "-n" loop, but you get the
  same if you write "while (<>)" explicitly in your program).

  But because depacketize() has to use byte-wise read() calls, it
  doesn't get that magic for free. And it did not seem worth the effort
  to implement, when shell redirections are so easy. ;)

  Just skimming through test-lib-functions.sh, though, it does seem that
  we often deviate from that pattern (e.g., all of the q_to_nul family).
  And nobody has seemed to mind.

> The 'print $1' part of this makes things a lot nicer, actually, having
> removed the " .have" suffix. We can get rid of the expect_haves()
> function above, and instead call 'git rev-parse' inline and get the
> right results.

Yes. You can even do it all in a single rev-parse call.

-Peff


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-26  3:33           ` Jeff King
@ 2018-09-26 13:39             ` Taylor Blau
  2018-09-26 18:38               ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-26 13:39 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, brian m. carlson, git, gitster, sunshine, sbeller

On Tue, Sep 25, 2018 at 11:33:37PM -0400, Jeff King wrote:
> On Tue, Sep 25, 2018 at 06:09:35PM -0700, Taylor Blau wrote:
>
> > > So I think this is fine (modulo that the grep and sed can be combined).
> > > Yet another option would be to simply strip away everything except the
> > > object id (which is all we care about), like:
> > >
> > >   depacketize | perl -lne '/^(\S+) \.have/ and print $1'
> >
> > Thanks for this. This is the suggestion I ended up taking (modulo taking
> > '-' as the first argument to 'depacketize').
>
> I don't think depacketize takes any arguments. It always reads from
> stdin directly, doesn't it? Your "-" is not hurting anything, but it is
> totally ignored.

Yep, certainly. I think that I was drawn to this claim because I watched
t5410 fail after applying the above recommendation, and thus assumed
that it was my fault for not passing `-` to 'depacketize()'.

In the end, I'm not sure why the test failed originally (it's likely
that I hadn't removed the ".have" part of 'expect_haves()', yet). But, I
removed the `-` in my local copy of v3, and the tests pass on all
revisions of this series that have it.

> A perl tangent if you're interested:
>
>   Normally for shell functions like this that are just wrappers around
>   perl snippets, I would suggest to pass "$@" from the function's
>   arguments to perl. So for example if we had:
>
>     haves_from_packets () {
> 	perl -lne '/^(\S+) \.have/ and print $1' "$@"
>     }
>
>   then you could call it with a filename:
>
>     haves_from_packets packets
>
>   or input on stdin:
>
>     haves_from_packets <packets
>
>   and either works (this is magic from perl's "-n" loop, but you get the
>   same if you write "while (<>)" explicitly in your program).
>
>   But because depacketize() has to use byte-wise read() calls, it
>   doesn't get that magic for free. And it did not seem worth the effort
>   to implement, when shell redirections are so easy. ;)

To be clear, we ought to leave this function as:

  extract_haves () {
    depacketize | perl -lne '/^(\S+) \.have/ and print $1'
  }

Or are you suggesting that we change it to:

  extract_haves () {
    perl -lne '/^(\S+) \.have/ and print $1'
  }

And call it as:

  printf "0000" | git receive-pack fork >actual &&
  depacketize <actual >actual.packets &&
  extract_haves <actual.packets >actual.haves &&

Frankly, (and I think that this is what you're getting at in your reply
above), I think that the former (i.e., calling 'depacketize()' in
'extract_haves()') is cleaner. This approach leaves us with "actual" and
"actual.haves", and obviates the need for another intermediary,
"actual.packets".

> > The 'print $1' part of this makes things a lot nicer, actually, having
> > removed the " .have" suffix. We can get rid of the expect_haves()
> > function above, and instead call 'git rev-parse' inline and get the
> > right results.
>
> Yes. You can even do it all in a single rev-parse call.

Indeed.

Thanks,
Taylor


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-26 13:39             ` Taylor Blau
@ 2018-09-26 18:38               ` Jeff King
  2018-09-28  2:39                 ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-26 18:38 UTC (permalink / raw)
  To: Taylor Blau; +Cc: brian m. carlson, git, gitster, sunshine, sbeller

On Wed, Sep 26, 2018 at 06:39:56AM -0700, Taylor Blau wrote:

> > A perl tangent if you're interested:
> [...]
> 
> To be clear, we ought to leave this function as:
> 
>   extract_haves () {
>     depacketize | perl -lne '/^(\S+) \.have/ and print $1'
>   }

Yes, I agree. You cannot do the "$@" there because it relies on
depacketize, which only handles stdin.

> Or are you suggesting that we change it to:
> 
>   extract_haves () {
>     perl -lne '/^(\S+) \.have/ and print $1'
>   }

No, sorry. I just used the ".have" snippet as filler text, but I see
that muddied my meaning considerably. This really was just a tangent for
the future. What you've written above is the best thing for this case.

> And call it as:
> 
>   printf "0000" | git receive-pack fork >actual &&
>   depacketize <actual >actual.packets &&
>   extract_haves <actual.packets >actual.haves &&
> 
> Frankly, (and I think that this is what you're getting at in your reply
> above), I think that the former (i.e., calling 'depacketize()' in
> 'extract_haves()') is cleaner. This approach leaves us with "actual" and
> "actual.haves", and obviates the need for another intermediary,
> "actual.packets".

Yeah. I have no problem with the three-liner you wrote above, but I do
not see any particular reason for it.

-Peff


* Re: [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand
  2018-09-26 18:38               ` Jeff King
@ 2018-09-28  2:39                 ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-28  2:39 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, brian m. carlson, git, gitster, sunshine, sbeller

On Wed, Sep 26, 2018 at 02:38:53PM -0400, Jeff King wrote:
> On Wed, Sep 26, 2018 at 06:39:56AM -0700, Taylor Blau wrote:
>
> > > A perl tangent if you're interested:
> > [...]
> >
> > To be clear, we ought to leave this function as:
> >
> >   extract_haves () {
> >     depacketize | perl -lne '/^(\S+) \.have/ and print $1'
> >   }
>
> Yes, I agree. You cannot do the "$@" there because it relies on
> depacketize, which only handles stdin.
>
> > Or are you suggesting that we change it to:
> >
> >   extract_haves () {
> >     perl -lne '/^(\S+) \.have/ and print $1'
> >   }
>
> No, sorry. I just used the ".have" snippet as filler text, but I see
> that muddied my meaning considerably. This really was just a tangent for
> the future. What you've written above is the best thing for this case.

I see, and I had assumed that you meant the latter, not that including
" .have" was a good way to go forward. So I think that we're in
agreement here.

> > And call it as:
> >
> >   printf "0000" | git receive-pack fork >actual &&
> >   depacketize <actual >actual.packets
> >   extract_haves <actual.packets >actual.haves &&
> >
> > Frankly, (and I think that this is what you're getting at in your reply
> > above), I think that the former (e.g., calling 'depacketize()' in
> > 'extract_haves()') is cleaner. This approach leaves us with "actual" and
> > "actual.haves", and obviates the need for another intermediary,
> > "actual.packets".
>
> Yeah. I have no problem with the three-liner you wrote above, but I do
> not see any particular reason for it.

Good. That's the version that I'll send shortly, then.

Thanks,
Taylor


* [PATCH v3 0/4] Filter alternate references
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
                   ` (5 preceding siblings ...)
  2018-09-21 18:47 ` [PATCH v2 " Taylor Blau
@ 2018-09-28  4:25 ` Taylor Blau
  2018-09-28  4:25   ` [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref Jeff King
                     ` (3 more replies)
  2018-10-02  2:23 ` [PATCH v4 0/4] Filter alternate references Taylor Blau
  2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
  8 siblings, 4 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-28  4:25 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

Hi,

Attached is the third re-roll of the series from Peff and me to introduce
'core.alternateRefsCommand' and 'core.alternateRefsPrefixes' to filter
the initial ".have" advertisement when an alternate has a pathologically
large number of references.

A range-diff against v2 is included below, but the major changes between
the two revisions are as follows:

  1. Documentation and testing clean-up, per helpful input from Junio,
     Peff, and brian carlson.

  2. Included also is a preparatory patch from Peff, to drop the
     requirement that we provide refnames for alternate references. We
     no longer pass refnames at all, and the first commit in the series
     makes that change.

I imagine that we may hit one more re-roll, depending on the outcome of
this review. The series has not fundamentally changed since v2, so I
think that we are at a point of stasis there. Anything that is left
outstanding from v3 should hopefully be similarly-not-earth-shattering
;-).

Thanks in advance for your review.

Thanks,
Taylor

Jeff King (1):
  transport: drop refnames from for_each_alternate_ref

Taylor Blau (3):
  transport.c: extract 'fill_alternate_refs_command'
  transport.c: introduce core.alternateRefsCommand
  transport.c: introduce core.alternateRefsPrefixes

 Documentation/config.txt | 18 +++++++++++++
 builtin/receive-pack.c   |  3 +--
 fetch-pack.c             |  3 +--
 t/t5410-receive-pack.sh  | 57 ++++++++++++++++++++++++++++++++++++++++
 transport.c              | 38 +++++++++++++++++++++------
 transport.h              |  2 +-
 6 files changed, 108 insertions(+), 13 deletions(-)
 create mode 100755 t/t5410-receive-pack.sh

Range-diff against v2:
-:  ---------- > 1:  037273dab0 transport: drop refnames from for_each_alternate_ref
1:  6e3a58afe7 ! 2:  9479470cb1 transport.c: extract 'fill_alternate_refs_command'
    @@ -24,7 +24,7 @@
     +	cmd->git_cmd = 1;
     +	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
     +	argv_array_push(&cmd->args, "for-each-ref");
    -+	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
    ++	argv_array_push(&cmd->args, "--format=%(objectname)");
     +	cmd->env = local_repo_env;
     +	cmd->out = -1;
     +}
    @@ -39,7 +39,7 @@
     -	cmd.git_cmd = 1;
     -	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
     -	argv_array_push(&cmd.args, "for-each-ref");
    --	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
    +-	argv_array_push(&cmd.args, "--format=%(objectname)");
     -	cmd.env = local_repo_env;
     -	cmd.out = -1;
     +	fill_alternate_refs_command(&cmd, path);
2:  9797f52551 ! 3:  2dbcd54190 transport.c: introduce core.alternateRefsCommand
    @@ -3,24 +3,24 @@
         transport.c: introduce core.alternateRefsCommand

         When in a repository containing one or more alternates, Git would
    -    sometimes like to list references from its alternates. For example, 'git
    -    receive-pack' list the objects pointed to by alternate references as
    -    special ".have" references.
    +    sometimes like to list references from those alternates. For example,
    +    'git receive-pack' lists the "tips" pointed to by references in those
    +    alternates as special ".have" references.

         Listing ".have" references is designed to make pushing changes from
         upstream to a fork a lightweight operation, by advertising to the pusher
         that the fork already has the objects (via its alternate). Thus, the
         client can avoid sending them.

    -    However, when the alternate has a pathologically large number of
    -    references, the initial advertisement is too expensive. In fact, it can
    -    dominate any such optimization where the pusher avoids sending certain
    -    objects.
    +    However, when the alternate (upstream, in the previous example) has a
    +    pathologically large number of references, the initial advertisement is
    +    too expensive. In fact, it can dominate any such optimization where the
    +    pusher avoids sending certain objects.

         Introduce "core.alternateRefsCommand" in order to provide a facility to
         limit or filter alternate references. This can be used, for example, to
    -    filter out "uninteresting" references from the initial advertisement in
    -    the above scenario.
    +    filter out references the alternate does not wish to send (for space
    +    concerns, or otherwise) during the initial advertisement.

         Let the repository that has alternates configure this command to avoid
         trusting the alternate to provide us a safe command to run in the shell.
    @@ -38,15 +38,15 @@
      	expect HEAD to be a symbolic link.

     +core.alternateRefsCommand::
    -+	When listing references from an alternate (e.g., in the case of ".have"), use
    -+	the shell to execute the specified command instead of
    -+	linkgit:git-for-each-ref[1]. The first argument is the path of the alternate.
    -+	Output must be of the form: `%(objectname) SPC %(refname)`.
    ++	When advertising tips of available history from an alternate, use the shell to
    ++	execute the specified command instead of linkgit:git-for-each-ref[1]. The
    ++	first argument is the absolute path of the alternate. Output must be of the
    ++	form: `%(objectname)`, where multiple tips are separated by newlines.
     ++
     +This is useful when a repository only wishes to advertise some of its
     +alternate's references as ".have"'s. For example, to only advertise branch
     +heads, configure `core.alternateRefsCommand` to the path of a script which runs
    -+`git --git-dir="$1" for-each-ref refs/heads`.
    ++`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
     +
      core.bare::
      	If true this repository is assumed to be 'bare' and has no
    @@ -74,8 +74,7 @@
     +	git clone fork pusher &&
     +	(
     +		cd fork &&
    -+		git config receive.advertisealternates true &&
    -+		cat <<-EOF | git update-ref --stdin &&
    ++		git update-ref --stdin <<-\EOF &&
     +		delete refs/heads/a
     +		delete refs/heads/b
     +		delete refs/heads/c
    @@ -88,23 +87,19 @@
     +	)
     +'
     +
    -+expect_haves () {
    -+	printf "%s .have\n" $(git rev-parse $@) >expect
    -+}
    -+
     +extract_haves () {
    -+	depacketize - | grep '\.have' | sed -e 's/\\0.*$//g'
    ++	depacketize | perl -lne '/^(\S+) \.have/ and print $1'
     +}
     +
     +test_expect_success 'with core.alternateRefsCommand' '
     +	write_script fork/alternate-refs <<-\EOF &&
     +		git --git-dir="$1" for-each-ref \
    -+			--format="%(objectname) %(refname)" \
    ++			--format="%(objectname)" \
     +			refs/heads/a \
     +			refs/heads/c
     +	EOF
     +	test_config -C fork core.alternateRefsCommand alternate-refs &&
    -+	expect_haves a c >expect &&
    ++	git rev-parse a c >expect &&
     +	printf "0000" | git receive-pack fork >actual &&
     +	extract_haves <actual >actual.haves &&
     +	test_cmp expect actual.haves
    @@ -122,7 +117,7 @@
     -	cmd->git_cmd = 1;
     -	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
     -	argv_array_push(&cmd->args, "for-each-ref");
    --	argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
    +-	argv_array_push(&cmd->args, "--format=%(objectname)");
     +	const char *value;
     +
     +	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
    @@ -135,7 +130,7 @@
     +
     +		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
     +		argv_array_push(&cmd->args, "for-each-ref");
    -+		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
    ++		argv_array_push(&cmd->args, "--format=%(objectname)");
     +	}
     +
      	cmd->env = local_repo_env;
3:  6e8f65a16d ! 4:  48eb774c9e transport.c: introduce core.alternateRefsPrefixes
    @@ -12,9 +12,7 @@
         'core.alternateRefsCommand' would have to do:

           $ git config core.alternateRefsCommand ' \
    -          git -C "$1" for-each-ref refs/tags \
    -          --format="%(objectname) %(refname)" \
    -        '
    +          git -C "$1" for-each-ref refs/tags --format="%(objectname)"'

         The above is cumbersome to write, so let's introduce a
         "core.alternateRefsPrefixes" to address this common case. Instead, the
    @@ -41,7 +39,7 @@
      +++ b/Documentation/config.txt
     @@
      heads, configure `core.alternateRefsCommand` to the path of a script which runs
    - `git --git-dir="$1" for-each-ref refs/heads`.
    + `git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.

     +core.alternateRefsPrefixes::
     +	When listing references from an alternate, list only references that begin
    @@ -63,7 +61,7 @@

     +test_expect_success 'with core.alternateRefsPrefixes' '
     +	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
    -+	expect_haves one three two >expect &&
    ++	git rev-parse one three two >expect &&
     +	printf "0000" | git receive-pack fork >actual &&
     +	extract_haves <actual >actual.haves &&
     +	test_cmp expect actual.haves
    @@ -77,7 +75,7 @@
     @@
      		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
      		argv_array_push(&cmd->args, "for-each-ref");
    - 		argv_array_push(&cmd->args, "--format=%(objectname) %(refname)");
    + 		argv_array_push(&cmd->args, "--format=%(objectname)");
     +
     +		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
     +			argv_array_push(&cmd->args, "--");
--
2.19.0.221.g150f307af


* [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref
  2018-09-28  4:25 ` [PATCH v3 0/4] Filter alternate references Taylor Blau
@ 2018-09-28  4:25   ` Jeff King
  2018-09-28  4:58     ` Jeff King
  2018-09-28  4:25   ` [PATCH v3 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-28  4:25 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

None of the current callers use the refname parameter we pass to their
callbacks. In theory somebody _could_ do so, but it's actually quite
weird if you think about it: it's a ref in somebody else's repository.
So the name has no meaning locally, and in fact there may be duplicates
if there are multiple alternates.

The users of this interface really only care about seeing some ref tips,
since that promises that the alternate has the full commit graph
reachable from there. So let's keep the information we pass back to the
bare minimum.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c | 3 +--
 fetch-pack.c           | 3 +--
 transport.c            | 6 +++---
 transport.h            | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 4d30001950..6792291f5e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -281,8 +281,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	return 0;
 }
 
-static void show_one_alternate_ref(const char *refname,
-				   const struct object_id *oid,
+static void show_one_alternate_ref(const struct object_id *oid,
 				   void *data)
 {
 	struct oidset *seen = data;
diff --git a/fetch-pack.c b/fetch-pack.c
index 75047a4b2a..b643de143b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -76,8 +76,7 @@ struct alternate_object_cache {
 	size_t nr, alloc;
 };
 
-static void cache_one_alternate(const char *refname,
-				const struct object_id *oid,
+static void cache_one_alternate(const struct object_id *oid,
 				void *vcache)
 {
 	struct alternate_object_cache *cache = vcache;
diff --git a/transport.c b/transport.c
index 1c76d64aba..2e0bc414d0 100644
--- a/transport.c
+++ b/transport.c
@@ -1336,7 +1336,7 @@ static void read_alternate_refs(const char *path,
 	cmd.git_cmd = 1;
 	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
 	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
+	argv_array_push(&cmd.args, "--format=%(objectname)");
 	cmd.env = local_repo_env;
 	cmd.out = -1;
 
@@ -1348,13 +1348,13 @@ static void read_alternate_refs(const char *path,
 		struct object_id oid;
 
 		if (get_oid_hex(line.buf, &oid) ||
-		    line.buf[GIT_SHA1_HEXSZ] != ' ') {
+		    line.buf[GIT_SHA1_HEXSZ]) {
 			warning(_("invalid line while parsing alternate refs: %s"),
 				line.buf);
 			break;
 		}
 
-		cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data);
+		cb(&oid, data);
 	}
 
 	fclose(fh);
diff --git a/transport.h b/transport.h
index 01e717c29e..9baeca2d7a 100644
--- a/transport.h
+++ b/transport.h
@@ -261,6 +261,6 @@ int transport_refs_pushed(struct ref *ref);
 void transport_print_push_status(const char *dest, struct ref *refs,
 		  int verbose, int porcelain, unsigned int *reject_reasons);
 
-typedef void alternate_ref_fn(const char *refname, const struct object_id *oid, void *);
+typedef void alternate_ref_fn(const struct object_id *oid, void *);
 extern void for_each_alternate_ref(alternate_ref_fn, void *);
 #endif
-- 
2.19.0.221.g150f307af



* [PATCH v3 2/4] transport.c: extract 'fill_alternate_refs_command'
  2018-09-28  4:25 ` [PATCH v3 0/4] Filter alternate references Taylor Blau
  2018-09-28  4:25   ` [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref Jeff King
@ 2018-09-28  4:25   ` Taylor Blau
  2018-09-28  4:59     ` Jeff King
  2018-09-28  4:25   ` [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
  2018-09-28  4:25   ` [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  3 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-28  4:25 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

To list alternate references, 'read_alternate_refs' creates a child
process running 'git for-each-ref' in the alternate's Git directory.

Prepare to run other commands besides 'git for-each-ref' by introducing
and moving the relevant code from 'read_alternate_refs' to
'fill_alternate_refs_command'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 transport.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/transport.c b/transport.c
index 2e0bc414d0..2825debac5 100644
--- a/transport.c
+++ b/transport.c
@@ -1325,6 +1325,17 @@ char *transport_anonymize_url(const char *url)
 	return xstrdup(url);
 }
 
+static void fill_alternate_refs_command(struct child_process *cmd,
+					const char *repo_path)
+{
+	cmd->git_cmd = 1;
+	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+	argv_array_push(&cmd->args, "for-each-ref");
+	argv_array_push(&cmd->args, "--format=%(objectname)");
+	cmd->env = local_repo_env;
+	cmd->out = -1;
+}
+
 static void read_alternate_refs(const char *path,
 				alternate_ref_fn *cb,
 				void *data)
@@ -1333,12 +1344,7 @@ static void read_alternate_refs(const char *path,
 	struct strbuf line = STRBUF_INIT;
 	FILE *fh;
 
-	cmd.git_cmd = 1;
-	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
-	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname)");
-	cmd.env = local_repo_env;
-	cmd.out = -1;
+	fill_alternate_refs_command(&cmd, path);
 
 	if (start_command(&cmd))
 		return;
-- 
2.19.0.221.g150f307af



* [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand
  2018-09-28  4:25 ` [PATCH v3 0/4] Filter alternate references Taylor Blau
  2018-09-28  4:25   ` [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref Jeff King
  2018-09-28  4:25   ` [PATCH v3 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
@ 2018-09-28  4:25   ` Taylor Blau
  2018-09-28  5:26     ` Jeff King
  2018-09-28  4:25   ` [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  3 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-28  4:25 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

When in a repository containing one or more alternates, Git would
sometimes like to list references from those alternates. For example,
'git receive-pack' lists the "tips" pointed to by references in those
alternates as special ".have" references.

Listing ".have" references is designed to make pushing changes from
upstream to a fork a lightweight operation, by advertising to the pusher
that the fork already has the objects (via its alternate). Thus, the
client can avoid sending them.

However, when the alternate (upstream, in the previous example) has a
pathologically large number of references, the initial advertisement is
too expensive. In fact, it can dominate any such optimization where the
pusher avoids sending certain objects.

Introduce "core.alternateRefsCommand" in order to provide a facility to
limit or filter alternate references. This can be used, for example, to
filter out references the alternate does not wish to send (for space
concerns, or otherwise) during the initial advertisement.

Let the repository that has alternates configure this command to avoid
trusting the alternate to provide us a safe command to run in the shell.
To behave differently on each alternate (e.g., only list tags from
alternate A, only heads from B) provide the path of the alternate as the
first argument.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt | 11 +++++++++
 t/t5410-receive-pack.sh  | 49 ++++++++++++++++++++++++++++++++++++++++
 transport.c              | 19 ++++++++++++----
 3 files changed, 75 insertions(+), 4 deletions(-)
 create mode 100755 t/t5410-receive-pack.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ad0f4510c3..afcb18331a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -616,6 +616,17 @@ core.preferSymlinkRefs::
 	This is sometimes needed to work with old scripts that
 	expect HEAD to be a symbolic link.
 
+core.alternateRefsCommand::
+	When advertising tips of available history from an alternate, use the shell to
+	execute the specified command instead of linkgit:git-for-each-ref[1]. The
+	first argument is the absolute path of the alternate. Output must be of the
+	form: `%(objectname)`, where multiple tips are separated by newlines.
++
+This is useful when a repository only wishes to advertise some of its
+alternate's references as ".have"'s. For example, to only advertise branch
+heads, configure `core.alternateRefsCommand` to the path of a script which runs
+`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
new file mode 100755
index 0000000000..503dde35a4
--- /dev/null
+++ b/t/t5410-receive-pack.sh
@@ -0,0 +1,49 @@
+#!/bin/sh
+
+test_description='git receive-pack test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit one &&
+	git update-ref refs/heads/a HEAD &&
+	test_commit two &&
+	git update-ref refs/heads/b HEAD &&
+	test_commit three &&
+	git update-ref refs/heads/c HEAD &&
+	git clone --bare . fork &&
+	git clone fork pusher &&
+	(
+		cd fork &&
+		git update-ref --stdin <<-\EOF &&
+		delete refs/heads/a
+		delete refs/heads/b
+		delete refs/heads/c
+		delete refs/heads/master
+		delete refs/tags/one
+		delete refs/tags/two
+		delete refs/tags/three
+		EOF
+		echo "../../.git/objects" >objects/info/alternates
+	)
+'
+
+extract_haves () {
+	depacketize | perl -lne '/^(\S+) \.have/ and print $1'
+}
+
+test_expect_success 'with core.alternateRefsCommand' '
+	write_script fork/alternate-refs <<-\EOF &&
+		git --git-dir="$1" for-each-ref \
+			--format="%(objectname)" \
+			refs/heads/a \
+			refs/heads/c
+	EOF
+	test_config -C fork core.alternateRefsCommand alternate-refs &&
+	git rev-parse a c >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 2825debac5..e271b66603 100644
--- a/transport.c
+++ b/transport.c
@@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
 static void fill_alternate_refs_command(struct child_process *cmd,
 					const char *repo_path)
 {
-	cmd->git_cmd = 1;
-	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
-	argv_array_push(&cmd->args, "for-each-ref");
-	argv_array_push(&cmd->args, "--format=%(objectname)");
+	const char *value;
+
+	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
+		cmd->use_shell = 1;
+
+		argv_array_push(&cmd->args, value);
+		argv_array_push(&cmd->args, repo_path);
+	} else {
+		cmd->git_cmd = 1;
+
+		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+		argv_array_push(&cmd->args, "for-each-ref");
+		argv_array_push(&cmd->args, "--format=%(objectname)");
+	}
+
 	cmd->env = local_repo_env;
 	cmd->out = -1;
 }
-- 
2.19.0.221.g150f307af



* [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-09-28  4:25 ` [PATCH v3 0/4] Filter alternate references Taylor Blau
                     ` (2 preceding siblings ...)
  2018-09-28  4:25   ` [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-09-28  4:25   ` Taylor Blau
  2018-09-28  5:30     ` Jeff King
  3 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-28  4:25 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

The recently-introduced "core.alternateRefsCommand" allows callers to
specify with high flexibility the tips that they wish to advertise from
alternates. This flexibility comes at the cost of some inconvenience
when the caller only wishes to limit the advertisement to one or more
prefixes.

For example, to advertise only tags, a caller using
'core.alternateRefsCommand' would have to do:

  $ git config core.alternateRefsCommand ' \
      git -C "$1" for-each-ref refs/tags --format="%(objectname)"'

The above is cumbersome to write, so let's introduce a
"core.alternateRefsPrefixes" to address this common case. Instead, the
caller can run:

  $ git config core.alternateRefsPrefixes 'refs/tags'

Which will behave identically to the longer example using
"core.alternateRefsCommand".

Since the value of "core.alternateRefsPrefixes" is appended to 'git
for-each-ref' and then executed, include a "--" before taking the
configured value to avoid misinterpreting arguments as flags to 'git
for-each-ref'.

In the case that the caller wishes to specify multiple prefixes, they
may separate them by whitespace. If "core.alternateRefsCommand" is set,
it will take precedence over "core.alternateRefsPrefixes".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt | 7 +++++++
 t/t5410-receive-pack.sh  | 8 ++++++++
 transport.c              | 5 +++++
 3 files changed, 20 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index afcb18331a..9ef792ef0d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -627,6 +627,13 @@ alternate's references as ".have"'s. For example, to only advertise branch
 heads, configure `core.alternateRefsCommand` to the path of a script which runs
 `git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
 
+core.alternateRefsPrefixes::
+	When listing references from an alternate, list only references that begin
+	with the given prefix. Prefixes match as if they were given as arguments to
+	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
+	whitespace. If `core.alternateRefsCommand` is set, setting
+	`core.alternateRefsPrefixes` has no effect.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
index 503dde35a4..3449967cc7 100755
--- a/t/t5410-receive-pack.sh
+++ b/t/t5410-receive-pack.sh
@@ -46,4 +46,12 @@ test_expect_success 'with core.alternateRefsCommand' '
 	test_cmp expect actual.haves
 '
 
+test_expect_success 'with core.alternateRefsPrefixes' '
+	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
+	git rev-parse one three two >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
 test_done
diff --git a/transport.c b/transport.c
index e271b66603..83474add28 100644
--- a/transport.c
+++ b/transport.c
@@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
 		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
 		argv_array_push(&cmd->args, "for-each-ref");
 		argv_array_push(&cmd->args, "--format=%(objectname)");
+
+		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
+			argv_array_push(&cmd->args, "--");
+			argv_array_split(&cmd->args, value);
+		}
 	}
 
 	cmd->env = local_repo_env;
-- 
2.19.0.221.g150f307af
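
As a rough illustration of the branch structure that
fill_alternate_refs_command() has after this patch, the sketch below builds
the argument list in shell and prints one element per line instead of
spawning anything. The helper name and its three parameters are hypothetical,
and "unset" is approximated by an empty string, which is not exactly how the
C code distinguishes the two cases.

```shell
# Hypothetical helper mirroring the if/else in fill_alternate_refs_command();
# prints the argv it would build, one element per line.
alternate_refs_argv () {
	refs_command=$1 refs_prefixes=$2 repo_path=$3
	if test -n "$refs_command"
	then
		# The command setting wins; the alternate's path is its argument.
		printf '%s\n' "$refs_command" "$repo_path"
	else
		set -- git "--git-dir=$repo_path" for-each-ref '--format=%(objectname)'
		if test -n "$refs_prefixes"
		then
			# Unquoted on purpose: split the prefixes on whitespace.
			# The "--" keeps a prefix from being read as an option.
			set -- "$@" -- $refs_prefixes
		fi
		printf '%s\n' "$@"
	fi
}

alternate_refs_argv "" "refs/heads refs/tags" /path/to/alternate
```

With both values given, as in
alternate_refs_argv ./my-script refs/tags /path/to/alternate, only the
command and the path are printed, matching the stated precedence.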


* Re: [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref
  2018-09-28  4:25   ` [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref Jeff King
@ 2018-09-28  4:58     ` Jeff King
  2018-09-28 14:21       ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-28  4:58 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller

> From: Jeff King <me@ttaylorr.com>

Pretty sure that isn't right. :)

The preferred way to send a patch with a different author is to have
the actual email be "From:" you, but then include a:

  From: Jeff King <peff@peff.net>

as the first line of the body (which git-am will then pick up).
git-send-email will do this for you automatically. Other scripts (like
say, if you're sending the output of format-patch into mutt) used to
have to implement it themselves, but these days we have "format-patch
--from", which should directly output what you want.

> ---
>  builtin/receive-pack.c | 3 +--
>  fetch-pack.c           | 3 +--
>  transport.c            | 6 +++---
>  transport.h            | 2 +-
>  4 files changed, 6 insertions(+), 8 deletions(-)

The patch itself is flawless, of course. ;)

-Peff


* Re: [PATCH v3 2/4] transport.c: extract 'fill_alternate_refs_command'
  2018-09-28  4:25   ` [PATCH v3 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
@ 2018-09-28  4:59     ` Jeff King
  0 siblings, 0 replies; 94+ messages in thread
From: Jeff King @ 2018-09-28  4:59 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller

On Thu, Sep 27, 2018 at 09:25:39PM -0700, Taylor Blau wrote:

> To list alternate references, 'read_alternate_refs' creates a child
> process running 'git for-each-ref' in the alternate's Git directory.
> 
> Prepare to run other commands besides 'git for-each-ref' by introducing
> and moving the relevant code from 'read_alternate_refs' to
> 'fill_alternate_refs_command'.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  transport.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)

Same as before, but moving the slightly modified code. Makes sense.

-Peff


* Re: [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand
  2018-09-28  4:25   ` [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-09-28  5:26     ` Jeff King
  2018-09-28 22:04       ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-28  5:26 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller

On Thu, Sep 27, 2018 at 09:25:42PM -0700, Taylor Blau wrote:

> Let the repository that has alternates configure this command to avoid
> trusting the alternate to provide us a safe command to run in the shell.
> To behave differently on each alternate (e.g., only list tags from
> alternate A, only heads from B) provide the path of the alternate as the
> first argument.

Well, you also need to pass the path so it knows which repo to look at.
Which I think is the primary reason we do it, but behaving differently
for each alternate is another option.

> +core.alternateRefsCommand::
> +	When advertising tips of available history from an alternate, use the shell to
> +	execute the specified command instead of linkgit:git-for-each-ref[1]. The
> +	first argument is the absolute path of the alternate. Output must be of the
> +	form: `%(objectname)`, where multiple tips are separated by newlines.

I wonder if people may be confused about the %(objectname) syntax, since
it's specific to for-each-ref.  Now that we've simplified the output
format to a single value, perhaps we should define it more directly.
E.g., like:

  The output should contain one hex object id per line (i.e., the same
  as produced by `git for-each-ref --format='%(objectname)'`).

Now that we've dropped the refname requirement from the output, it is
more clear that this really does not have to be about refs at all.  In
the most technical sense, what we really allow in the output is any
object id X for which the alternate promises it has all objects
reachable from X. Ref tips are a convenient and efficient way of
providing that, but they are not the only possibility (and likewise, it
is fine to omit duplicates or even tips that are ancestors of other
tips).

I think that's probably getting _too_ technical, though. It probably
makes sense to just keep thinking of these as "what are the ref tips".
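As a rough illustration of the "one hex object id per line" rule, a command's output could be sanity-checked with something like the sketch below. The helper name check_tips is made up for this example, and it assumes 40-character lowercase SHA-1 ids; nothing in the patch performs this check in shell.

```shell
# check_tips (hypothetical helper): succeeds only if every line on stdin
# is a full 40-character lowercase hex object id (SHA-1 assumed).
check_tips () {
    # grep -v selects lines that do NOT look like an object id;
    # the leading "!" turns "found a bad line" into failure.
    ! grep -Evq '^[0-9a-f]{40}$'
}

# Usage: pipe the output of a candidate command through the check.
printf '%s\n' 4b825dc642cb6eb9a060e54bf8d69288fbee4904 | check_tips &&
    echo "looks like a list of object ids"
```

Lines that still carry a refname, such as for-each-ref's default output, would be rejected by this check.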

> +This is useful when a repository only wishes to advertise some of its
> +alternate's references as ".have"'s. For example, to only advertise branch

Maybe put ".have" into backticks for formatting?

> +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.

Does that script actually work? Because of the way we invoke shell
commands with arguments, I think we'd end up with:

  git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads "$@"

Possibly for-each-ref would ignore the extra path argument (thinking
it's a ref pattern that just doesn't match), but it's definitely not
what you intended. You'd have to write:

  f() { git --git-dir=$1 ...etc; }; f

in the usual way. That's a minor pain, but it's what makes the more
direct:

  /my/script

work.
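To see the effect without involving Git at all, here is a small sketch that mimics the invocation. It assumes the shell is run roughly as `sh -c '<command> "$@"' <command> <path>`; run_like_git is a hypothetical helper for this illustration, and printf stands in for the real for-each-ref call so that the arguments are visible.

```shell
# run_like_git (hypothetical): run a command string the way a
# shell-invoked config command with one extra argument would be run,
# i.e. with "$@" appended to the command text.
run_like_git () {
    cmd=$1; shift
    sh -c "$cmd \"\$@\"" "$cmd" "$@"
}

# printf stands in for "git ... for-each-ref"; it prints one <arg> per line.
naive='printf "<%s>\n" --git-dir="$1" refs/heads'
wrapped='f() { printf "<%s>\n" --git-dir="$1" refs/heads; }; f'

# The naive form sees the alternate path a second time, as a trailing
# argument; the wrapped form consumes it inside f.
run_like_git "$naive" /path/to/alternate
run_like_git "$wrapped" /path/to/alternate
```

The first call prints three arguments (the path shows up again at the end), the second prints only the two that were intended.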

The other alternative is to pass $GIT_DIR in the environment on behalf
of the program. Then writing:

  git for-each-ref --format='%(objectname)' refs/heads

would Just Work. But it's a bit subtle, since it is not immediately
obvious that the command is meant to run in a different repository.

> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> new file mode 100755
> index 0000000000..503dde35a4
> --- /dev/null
> +++ b/t/t5410-receive-pack.sh
> @@ -0,0 +1,49 @@
> +#!/bin/sh
> +
> +test_description='git receive-pack test'

The name of this test file and the description are pretty vague. Can we
say something like "test handling of receive-pack with alternate-refs
config"?

> +test_expect_success 'setup' '
> +	test_commit one &&
> +	git update-ref refs/heads/a HEAD &&
> +	test_commit two &&
> +	git update-ref refs/heads/b HEAD &&
> +	test_commit three &&
> +	git update-ref refs/heads/c HEAD &&
> +	git clone --bare . fork &&
> +	git clone fork pusher &&
> +	(
> +		cd fork &&
> +		git update-ref --stdin <<-\EOF &&
> +		delete refs/heads/a
> +		delete refs/heads/b
> +		delete refs/heads/c
> +		delete refs/heads/master
> +		delete refs/tags/one
> +		delete refs/tags/two
> +		delete refs/tags/three
> +		EOF
> +		echo "../../.git/objects" >objects/info/alternates
> +	)
> +'

This setup is kind of convoluted. You're deleting those refs in the
fork, I think, because we don't want them to suppress the duplicate
.have lines from the alternate. Might it be easier to just create the
.have lines we're interested in after the fact?

I think we can also use "clone -s" to make the setup of the alternate a
little simpler.

I don't see the "pusher" repo being used for anything here. Leftover
cruft from when you were using "git push" to test?

So all together, perhaps something like:

  # we have a fork which points back to us as an alternate
  test_commit base &&
  git clone -s . fork &&

  # the alternate has two refs with new tips, in two separate hierarchies
  git checkout -b public/branch master &&
  test_commit public &&
  git checkout -b private/branch master &&
  test_commit private

And then...

> +test_expect_success 'with core.alternateRefsCommand' '
> +	write_script fork/alternate-refs <<-\EOF &&
> +		git --git-dir="$1" for-each-ref \
> +			--format="%(objectname)" \
> +			refs/heads/a \
> +			refs/heads/c
> +	EOF

...this can just look for refs/heads/public/, and...

> +	test_config -C fork core.alternateRefsCommand alternate-refs &&
> +	git rev-parse a c >expect &&

...we verify that we saw public/branch but not private/branch.

It's not that much shorter, but I had trouble understanding from the
setup why we needed to delete all those refs (and why we cared about
those tags in the first place).

> diff --git a/transport.c b/transport.c
> index 2825debac5..e271b66603 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
>  static void fill_alternate_refs_command(struct child_process *cmd,
>  					const char *repo_path)

The code change itself looks good to me.

-Peff


* Re: [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-09-28  4:25   ` [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
@ 2018-09-28  5:30     ` Jeff King
  2018-09-28 22:05       ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-28  5:30 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller

On Thu, Sep 27, 2018 at 09:25:45PM -0700, Taylor Blau wrote:

> The recently-introduced "core.alternateRefsCommand" allows callers to
> specify with high flexibility the tips that they wish to advertise from
> alternates. This flexibility comes at the cost of some inconvenience
> when the caller only wishes to limit the advertisement to one or more
> prefixes.
> 
> For example, to advertise only tags, a caller using
> 'core.alternateRefsCommand' would have to do:
> 
>   $ git config core.alternateRefsCommand ' \
>       git -C "$1" for-each-ref refs/tags --format="%(objectname)"'

This has the same "$@" issue as the previous one, I think (which only
makes your point about it being cumbersome more true!).

> In the case that the caller wishes to specify multiple prefixes, they
> may separate them by whitespace. If "core.alternateRefsCommand" is set,
> it will take precedence over "core.alternateRefsPrefixes".

Just a meta-comment: I don't particularly mind this discussion in the
commit message, but since these points ought to be in the documentation
anyway, it may make sense to omit them here in the name of brevity.

> +core.alternateRefsPrefixes::
> +	When listing references from an alternate, list only references that begin
> +	with the given prefix. Prefixes match as if they were given as arguments to
> +	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
> +	whitespace. If `core.alternateRefsCommand` is set, setting
> +	`core.alternateRefsPrefixes` has no effect.

Looks good.

> diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> index 503dde35a4..3449967cc7 100755
> --- a/t/t5410-receive-pack.sh
> +++ b/t/t5410-receive-pack.sh
> @@ -46,4 +46,12 @@ test_expect_success 'with core.alternateRefsCommand' '
>  	test_cmp expect actual.haves
>  '
>  
> +test_expect_success 'with core.alternateRefsPrefixes' '
> +	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> +	git rev-parse one three two >expect &&
> +	printf "0000" | git receive-pack fork >actual &&
> +	extract_haves <actual >actual.haves &&
> +	test_cmp expect actual.haves
> +'

If you follow my suggestion on the test setup from the last patch, it
would make sense to just put "refs/heads/public/" here. Although neither
that nor what you have here tests the whitespace separation. Possibly
there should be a third hierarchy.

> diff --git a/transport.c b/transport.c
> index e271b66603..83474add28 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
>  		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
>  		argv_array_push(&cmd->args, "for-each-ref");
>  		argv_array_push(&cmd->args, "--format=%(objectname)");
> +
> +		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
> +			argv_array_push(&cmd->args, "--");
> +			argv_array_split(&cmd->args, value);
> +		}

And this part looks good.
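For anyone following along, the argument list this hunk builds can be approximated in shell as below. This is only a sketch: alternate_refs_argv is a made-up name, it treats an empty value as "unset" (the C code checks whether the config key exists), and it relies on the shell's default whitespace splitting as a stand-in for argv_array_split().

```shell
# alternate_refs_argv (hypothetical): print, one per line, the arguments
# that would follow "git" for a given alternate path ($1) and a
# core.alternateRefsPrefixes value ($2, may be empty).
alternate_refs_argv () {
    repo_path=$1 prefixes=$2
    printf '%s\n' "--git-dir=$repo_path" for-each-ref '--format=%(objectname)'
    if test -n "$prefixes"
    then
        set -f              # no globbing while splitting on whitespace
        set -- $prefixes    # one positional parameter per prefix
        set +f
        printf '%s\n' -- "$@"
    fi
}

alternate_refs_argv /path/to/alternate "refs/heads/public/ refs/tags/"
```

Each whitespace-separated prefix ends up as its own pattern after the "--", which is the case the tests do not yet cover.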

-Peff


* Re: [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref
  2018-09-28  4:58     ` Jeff King
@ 2018-09-28 14:21       ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-09-28 14:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller

On Fri, Sep 28, 2018 at 12:58:58AM -0400, Jeff King wrote:
> > From: Jeff King <me@ttaylorr.com>
>
> Pretty sure that isn't right. :)

Indeed that isn't right :-). I try my best to review my patches
diligently before submitting them, but here's an interesting side-story
if you're interested:

I use a script 'git mail' which is essentially doing:

  git format-patch --stdout >mbox && mutt -f mbox

So, by the time that I've reviewed the diff via:

  $ git format-patch --stdout | less

I assume that the patches are ready to send (since, after all, running
'git format-patch' more than once shouldn't change anything.) So, I open
mutt with 'git mail', write my cover letter, and send each of the
patches to the list.

It was during that last phase that I ignored the From: Jeff King
<me@ttaylorr.com>, which I agree with you is certainly incorrect :-).

I was going to ask Junio to fix this up when queuing, but it seems (from
a quick skim of the rest of your review), that we will reach v4, so I'll
see if I can't teach 'git mail' to do the right thing for me.

> The patch itself is flawless, of course. ;)

Obviously ;-).

Thanks,
Taylor


* Re: [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand
  2018-09-28  5:26     ` Jeff King
@ 2018-09-28 22:04       ` Taylor Blau
  2018-09-29  7:31         ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-28 22:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller

On Fri, Sep 28, 2018 at 01:26:13AM -0400, Jeff King wrote:
> On Thu, Sep 27, 2018 at 09:25:42PM -0700, Taylor Blau wrote:
>
> > Let the repository that has alternates configure this command to avoid
> > trusting the alternate to provide us a safe command to run in the shell.
> > To behave differently on each alternate (e.g., only list tags from
> > alternate A, only heads from B) provide the path of the alternate as the
> > first argument.
>
> Well, you also need to pass the path so it knows which repo to look at.
> Which I think is the primary reason we do it, but behaving differently
> for each alternate is another option.

Yeah. I think that the clearer argument is yours, so I'll amend my copy.
I am thinking of:

  To find the alternate, pass its absolute path as the first argument.

How does that sound?

> > +core.alternateRefsCommand::
> > +   When advertising tips of available history from an alternate, use the shell to
> > +   execute the specified command instead of linkgit:git-for-each-ref[1]. The
> > +   first argument is the absolute path of the alternate. Output must be of the
> > +   form: `%(objectname)`, where multiple tips are separated by newlines.
>
> I wonder if people may be confused about the %(objectname) syntax, since
> it's specific to for-each-ref.  Now that we've simplified the output
> format to a single value, perhaps we should define it more directly.
> E.g., like:
>
>   The output should contain one hex object id per line (i.e., the same
>   as produced by `git for-each-ref --format='%(objectname)'`).

I think that that's clearer, thanks. I applied it pretty much as you
suggested, but changed 'should' to 'must' and dropped the leading 'the'.

> Now that we've dropped the refname requirement from the output, it is
> more clear that this really does not have to be about refs at all.  In
> the most technical sense, what we really allow in the output is any
> object id X for which the alternate promises it has all objects
> reachable from X. Ref tips are a convenient and efficient way of
> providing that, but they are not the only possibility (and likewise, it
> is fine to omit duplicates or even tips that are ancestors of other
> tips).
>
> I think that's probably getting _too_ technical, though. It probably
> makes sense to just keep thinking of these as "what are the ref tips".

Yep, I agree completely.

> > +This is useful when a repository only wishes to advertise some of its
> > +alternate's references as ".have"'s. For example, to only advertise branch
>
> Maybe put ".have" into backticks for formatting?

Good idea, thanks. I took this locally as suggested.

> > +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> > +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
>
> Does that script actually work? Because of the way we invoke shell
> commands with arguments, I think we'd end up with:
>
>   git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads "$@"

I think that you're right...

> Possibly for-each-ref would ignore the extra path argument (thinking
> it's a ref pattern that just doesn't match), but it's definitely not
> what you intended. You'd have to write:
>
>   f() { git --git-dir=$1 ...etc; }; f
>
> in the usual way. That's a minor pain, but it's what makes the more
> direct:
>
>   /my/script
>
> work.

...but this was what I was trying to get across with saying "...to the
path of a script which runs...", such that we would get the implicit
scoping that you make explicit in your example with "f() { ... }; f".

Does that seem OK as-is after the additional context? I think that after
reading your response, it seems to be confusing, so perhaps it should be
changed...

> The other alternative is to pass $GIT_DIR in the environment on behalf
> of the program. Then writing:
>
>   git for-each-ref --format='%(objectname)' refs/heads
>
> would Just Work. But it's a bit subtle, since it is not immediately
> obvious that the command is meant to run in a different repository.

I think that we discussed this approach a bit off-list, and I had the
idea that it was too fragile to work in practice, and that it would be
too surprising for callers to suddenly be in a different world.

I say this not because it wouldn't make this particular scenario more
convenient, which it undoubtedly would, but because it would make other
scenarios _more_ complicated.

For example, if a caller uses an alternate reference backend built on,
perhaps, MySQL (or anything that _isn't_ Git), they're not going to want
to have these GIT_ environment variables set.

So, I think that the greatest common denominator between the two is to
pass the alternate's absolute path as the first argument.

> > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > new file mode 100755
> > index 0000000000..503dde35a4
> > --- /dev/null
> > +++ b/t/t5410-receive-pack.sh
> > @@ -0,0 +1,49 @@
> > +#!/bin/sh
> > +
> > +test_description='git receive-pack test'
>
> The name of this test file and the description are pretty vague. Can we
> say something like "test handling of receive-pack with alternate-refs
> config"?

I left it intentionally vague, since I'd like for it to contain more
tests about 'git receive-pack'-specific things in the future.

I'm happy to change the name, though I wonder if we should change the
filename accordingly, and if so, to what.

> > +test_expect_success 'setup' '
> > +   test_commit one &&
> > +   git update-ref refs/heads/a HEAD &&
> > +   test_commit two &&
> > +   git update-ref refs/heads/b HEAD &&
> > +   test_commit three &&
> > +   git update-ref refs/heads/c HEAD &&
> > +   git clone --bare . fork &&
> > +   git clone fork pusher &&
> > +   (
> > +           cd fork &&
> > +           git update-ref --stdin <<-\EOF &&
> > +           delete refs/heads/a
> > +           delete refs/heads/b
> > +           delete refs/heads/c
> > +           delete refs/heads/master
> > +           delete refs/tags/one
> > +           delete refs/tags/two
> > +           delete refs/tags/three
> > +           EOF
> > +           echo "../../.git/objects" >objects/info/alternates
> > +   )
> > +'
>
> This setup is kind of convoluted. You're deleting those refs in the
> fork, I think, because we don't want them to suppress the duplicate
> .have lines from the alternate. Might it be easier to just create the
> .have lines we're interested in after the fact?
>
> I think we can also use "clone -s" to make the setup of the alternate a
> little simpler.
>
> I don't see the "pusher" repo being used for anything here. Leftover
> cruft from when you were using "git push" to test?
>
> So all together, perhaps something like:
>
>   # we have a fork which points back to us as an alternate
>   test_commit base &&
>   git clone -s . fork &&
>
>   # the alternate has two refs with new tips, in two separate hierarchies
>   git checkout -b public/branch master &&
>   test_commit public &&
>   git checkout -b private/branch master &&
>   test_commit private
>
> And then...
>
> > +test_expect_success 'with core.alternateRefsCommand' '
> > +   write_script fork/alternate-refs <<-\EOF &&
> > +           git --git-dir="$1" for-each-ref \
> > +                   --format="%(objectname)" \
> > +                   refs/heads/a \
> > +                   refs/heads/c
> > +   EOF
>
> ...this can just look for refs/heads/public/, and...
>
> > +   test_config -C fork core.alternateRefsCommand alternate-refs &&
> > +   git rev-parse a c >expect &&
>
> ...we verify that we saw public/branch but not private/branch.
>
> It's not that much shorter, but I had trouble understanding from the
> setup why we needed to delete all those refs (and why we cared about
> those tags in the first place).

I agree with all of this. It's certainly roughly the same length, but I
think that it makes it much easier to grok, and it addresses a comment
that Junio made in an earlier response to this thread. So, two wins for
the price of one :-).

I had to make a couple of other changes that you didn't recommend:

  - Since we used to create fork with 'git clone --bare', the path of
    `core.alternateRefsCommand` grew an extra `../`, since we have to
    also traverse _out_ of the .git directory in a non-bare repository.

    Instead of this, I opted for both, with 'git clone -s --bare .
    fork', which means we don't have to check out a working copy, and we
    can avoid changing the line mentioned above.

  - I also had to choose a prefix for the test exercising
    'core.alternateRefsPrefixes'. I went with 'refs/heads/private',
    which makes sure that we're seeing something different than
    'core.alternateRefsCommand'.

The diff is kind of long (so I'm avoiding sending it here), but I think
that it's mostly self-explanatory from what you recommended to me and
what I said above.

> > diff --git a/transport.c b/transport.c
> > index 2825debac5..e271b66603 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
> >  static void fill_alternate_refs_command(struct child_process *cmd,
> >                                     const char *repo_path)
>
> The code change itself looks good to me.

Thanks for your review, as always.

I'll wait until Monday to re-roll, just to make sure that there isn't
any new feedback between now and then.

Thanks,
Taylor


* Re: [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-09-28  5:30     ` Jeff King
@ 2018-09-28 22:05       ` Taylor Blau
  2018-09-29  7:34         ` Jeff King
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-09-28 22:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller

On Fri, Sep 28, 2018 at 01:30:57AM -0400, Jeff King wrote:
> On Thu, Sep 27, 2018 at 09:25:45PM -0700, Taylor Blau wrote:
>
> > The recently-introduced "core.alternateRefsCommand" allows callers to
> > specify with high flexibility the tips that they wish to advertise from
> > alternates. This flexibility comes at the cost of some inconvenience
> > when the caller only wishes to limit the advertisement to one or more
> > prefixes.
> >
> > For example, to advertise only tags, a caller using
> > 'core.alternateRefsCommand' would have to do:
> >
> >   $ git config core.alternateRefsCommand ' \
> >       git -C "$1" for-each-ref refs/tags --format="%(objectname)"'
>
> This has the same "$@" issue as the previous one, I think (which only
> makes your point about it being cumbersome more true!).

Hmm. I'll be curious to see how you respond to my other message about the
same topic. I feel that whatever the outcome there is will affect both
locations in the same way.

> > In the case that the caller wishes to specify multiple prefixes, they
> > may separate them by whitespace. If "core.alternateRefsCommand" is set,
> > it will take precedence over "core.alternateRefsPrefixes".
>
> Just a meta-comment: I don't particularly mind this discussion in the
> commit message, but since these points ought to be in the documentation
> anyway, it may make sense to omit them here in the name of brevity.

Sure, that makes sense.

> > +core.alternateRefsPrefixes::
> > +	When listing references from an alternate, list only references that begin
> > +	with the given prefix. Prefixes match as if they were given as arguments to
> > +	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
> > +	whitespace. If `core.alternateRefsCommand` is set, setting
> > +	`core.alternateRefsPrefixes` has no effect.
>
> Looks good.
>
> > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > index 503dde35a4..3449967cc7 100755
> > --- a/t/t5410-receive-pack.sh
> > +++ b/t/t5410-receive-pack.sh
> > @@ -46,4 +46,12 @@ test_expect_success 'with core.alternateRefsCommand' '
> >  	test_cmp expect actual.haves
> >  '
> >
> > +test_expect_success 'with core.alternateRefsPrefixes' '
> > +	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
> > +	git rev-parse one three two >expect &&
> > +	printf "0000" | git receive-pack fork >actual &&
> > +	extract_haves <actual >actual.haves &&
> > +	test_cmp expect actual.haves
> > +'
>
> If you follow my suggestion on the test setup from the last patch, it
> would make sense to just put "refs/heads/public/" here. Although neither
> that nor what you have here tests the whitespace separation. Possibly
> there should be a third hierarchy.

Sounds good; that's what I did.

> > diff --git a/transport.c b/transport.c
> > index e271b66603..83474add28 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
> >  		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
> >  		argv_array_push(&cmd->args, "for-each-ref");
> >  		argv_array_push(&cmd->args, "--format=%(objectname)");
> > +
> > +		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
> > +			argv_array_push(&cmd->args, "--");
> > +			argv_array_split(&cmd->args, value);
> > +		}
>
> And this part looks good.

Thanks for the review of this patch, too.

Thanks,
Taylor


* Re: [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand
  2018-09-28 22:04       ` Taylor Blau
@ 2018-09-29  7:31         ` Jeff King
  2018-10-02  1:56           ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-29  7:31 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller

On Fri, Sep 28, 2018 at 03:04:10PM -0700, Taylor Blau wrote:

> > Well, you also need to pass the path so it knows which repo to look at.
> > Which I think is the primary reason we do it, but behaving differently
> > for each alternate is another option.
> 
> Yeah. I think that the clearer argument is yours, so I'll amend my copy.
> I am thinking of:
> 
>   To find the alternate, pass its absolute path as the first argument.
> 
> How does that sound?

Sounds good.

> > > +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> > > +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
> >
> > Does that script actually work? Because of the way we invoke shell
> > commands with arguments, I think we'd end up with:
> >
> >   git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads "$@"
> [...]
> ...but this was what I was trying to get across with saying "...to the
> path of a script which runs...", such that we would get the implicit
> scoping that you make explicit in your example with "f() { ... }; f".
>
> Does that seem OK as-is after the additional context? I think that after
> reading your response, it seems to be confusing, so perhaps it should be
> changed...

Ah, OK. I totally missed that "path of a script" part. What you have is
correct, then, but I do wonder if we could make it less subtle.

Maybe something like:

  For example, if `/path/to/script` runs `git --git-dir="$1"
  for-each-ref --format='%(objectname)' refs/heads/`, then putting
  `/path/to/script` in `core.alternateRefsCommand` will show only the
  branch heads from the alternate.

I dunno. It's certainly clunkier. I wonder if it would be less awkward
to show the sample script in a fenced block, with the `#!/bin/sh` and
everything.

Or maybe just keep the text you have and add a note at the end like:

  Note that writing that `for-each-ref` command directly in the config
  option doesn't quite work, as it has to handle the path argument
  specially.

I don't think we need to hand-hold a user through the f() shell-snippet
trickery. I just don't want somebody thinking they can blindly paste
that into their config.
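To make the "fenced block" idea concrete, such a sample script might look roughly like the sketch below. The file name alternate-refs and where it lives are placeholders for illustration, not something the patch defines.

```shell
# Write a small helper script; "alternate-refs" is a placeholder name.
cat >alternate-refs <<'EOF'
#!/bin/sh
# $1 is the absolute path of the alternate.
# Print one object id per line, for branch heads only.
exec git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads/
EOF
chmod +x alternate-refs

# Then, in the repository that has the alternate, point the config at the
# script's absolute path, e.g.:
#   git config core.alternateRefsCommand /absolute/path/to/alternate-refs
```

Because the config value is just a path, the alternate's path arrives as "$1" inside the script and there is no stray trailing argument to worry about.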

> > The other alternative is to pass $GIT_DIR in the environment on behalf
> > of the program. Then writing:
> >
> >   git for-each-ref --format='%(objectname)' refs/heads
> >
> > would Just Work. But it's a bit subtle, since it is not immediately
> > obvious that the command is meant to run in a different repository.
> 
> I think that we discussed this approach a bit off-list, and I had the
> idea that it was too fragile to work in practice, and that it would be
> too surprising for callers to suddenly be in a different world.
> 
> > I say this not because it wouldn't make this particular scenario more
> > convenient, which it undoubtedly would, but because it would make other
> > scenarios _more_ complicated.
> > 
> > For example, if a caller uses an alternate reference backend built on,
> > perhaps, MySQL (or anything that _isn't_ Git), they're not going to want
> > to have these GIT_ environment variables set.

If they're not using Git under the hood, then GIT_* probably isn't
hurting anything. But it is still pretty subtle. Let's forget I
mentioned it.  Just chaining for-each-ref with a prefix is pretty
awkward, but that's why we have the next patch with
alternateRefsPrefixes.

Your response did make me think of one other thing, though. The
alternate file points to a directory with objects, and the
for_each_alternate_ref() code checks to see if that looks vaguely like
the objects/ directory of a git repo. But would anybody want to run
something like alternateRefsCommand on _just_ the object directory?
I.e., you don't have a real git repo there, but your script can
"somehow" come up with a list of valid tips.

That isn't inconceivable to me for the kind of multi-fork storage we do
at GitHub. E.g., imagine a shared object directory with no refs, and
then a script that goes out to the other related forks to look at their
ref tips. I don't think we have any immediate plans for it, though (and
there are a lot of subtle bits that I won't go into here that make it
non-trivial). So I'm OK to punt on it for now. I also think in a pinch
that you could easily fool the alternates code by just having a dummy
"refs/" directory.

> > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> [...]
> > > +test_description='git receive-pack test'
> >
> > The name of this test file and the description are pretty vague. Can we
> > say something like "test handling of receive-pack with alternate-refs
> > config"?
> 
> I left it intentionally vague, since I'd like for it to contain more
> tests about 'git receive-pack'-specific things in the future.
> 
> I'm happy to change the name, though I wonder if we should change the
> filename accordingly, and if so, to what.

I think we'd want to have a separate script for other receive-pack tests
that aren't related to alternates. There's some startup overhead to each
script so we don't want to make them _too_ small, but there are benefits
to having small test scripts:

 - they're our unit of parallelism, so we want to be able to keep a
   reasonable number of processors full

 - each test script starts with a clean slate, so there's less chance
   for unexpected interactions between individual tests (e.g., when
   modifying or adding a test in the middle of the script)

 - it's less annoying when you're debugging a failing test near the end
   of a script ;)

I actually think we'd benefit from splitting up a few of the longer
scripts. On my quad-core laptop, running the tests in slow-to-fast order
keeps the processors pretty busy, and the slowest test takes less time
than the whole suite. But I've also tried running on a 40-core box. It
burns through the short tests quickly, but you can never get faster than
the slowest single test, which takes something like 35 seconds. So
instead of being 10 times faster, it's more like two times faster, as
most of the processors idle waiting for that one script to finish.

But that's all pretty tangential here. My point is just that this
probably ought to remain its own script. :)

I'd probably name it "t5410-receive-pack-alternates" or similar.

> I'll wait until Monday to re-roll, just to make sure that there isn't
> any new feedback between now and then.

Sounds good. Thanks for working on this.

-Peff


* Re: [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-09-28 22:05       ` Taylor Blau
@ 2018-09-29  7:34         ` Jeff King
  2018-10-02  1:57           ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-09-29  7:34 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller

On Fri, Sep 28, 2018 at 03:05:57PM -0700, Taylor Blau wrote:

> > > For example, to advertise only tags, a caller using
> > > 'core.alternateRefsCommand' would have to do:
> > >
> > >   $ git config core.alternateRefsCommand ' \
> > >       git -C "$1" for-each-ref refs/tags --format="%(objectname)"'
> >
> > This has the same "$@" issue as the previous one, I think (which only
> > makes your point about it being cumbersome more true!).
> 
> Hmm. I'll be curious to see how you respond to my other message about the
> same topic. I feel that whatever the outcome there is will affect both
> locations in the same way.

I think they're separate issues, right? I was just confused on the
earlier patch, but the "git config" command you show above is the actual
broken case, isn't it?

I'm not overly concerned since this isn't recommending the technique to
end users (and in fact the whole point is to give an alternative), but
it may be worth showing a working command in case anybody runs across
it.
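One form that should work, using the same f() idiom discussed for the previous patch, is sketched here. The throwaway repository created with mktemp is only there so the snippet can be run anywhere; in practice the config would be set in the repository that has the alternate.

```shell
# Scratch repository, purely so this sketch is self-contained.
repo=$(mktemp -d) &&
git init -q "$repo" &&

# Wrap the command in a function so the alternate's path is consumed as
# "$1" instead of being appended after refs/tags.
git -C "$repo" config core.alternateRefsCommand \
    'f() { git -C "$1" for-each-ref refs/tags --format="%(objectname)"; }; f' &&

# Show what was stored.
git -C "$repo" config core.alternateRefsCommand
```

This is still clunkier than a prefix list, which is the point the commit message is making.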

-Peff


* Re: [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand
  2018-09-29  7:31         ` Jeff King
@ 2018-10-02  1:56           ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  1:56 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller

On Sat, Sep 29, 2018 at 03:31:38AM -0400, Jeff King wrote:
> On Fri, Sep 28, 2018 at 03:04:10PM -0700, Taylor Blau wrote:
>
> > > Well, you also need to pass the path so it knows which repo to look at.
> > > Which I think is the primary reason we do it, but behaving differently
> > > for each alternate is another option.
> >
> > Yeah. I think that the clearer argument is yours, so I'll amend my copy.
> > I am thinking of:
> >
> >   To find the alternate, pass its absolute path as the first argument.
> >
> > How does that sound?
>
> Sounds good.
>
> > > > +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> > > > +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
> > >
> > > Does that script actually work? Because of the way we invoke shell
> > > commands with arguments, I think we'd end up with:
> > >
> > >   git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads "$@"
> > [...]
> > ...but this was what I was trying to get across with saying "...to the
> > path of a script which runs...", such that we would get the implicit
> > scoping that you make explicit in your example with "f() { ... }; f".
> >
> > Does that seem OK as-is after the additional context? I think that after
> > reading your response, it seems to be confusing, so perhaps it should be
> > changed...
>
> Ah, OK. I totally missed that "path of a script" part. What you have is
> correct, then, but I do wonder if we could make it less subtle.
>
> Maybe something like:
>
>   For example, if `/path/to/script` runs `git --git-dir="$1"
>   for-each-ref --format='%(objectname)' refs/heads/`, then putting
>   `/path/to/script` in `core.alternateRefsCommand` will show only the
>   branch heads from the alternate.
>
> I dunno. It's certainly clunkier. I wonder if it would be less awkward
> to show the sample script in a fenced block, with the `#!/bin/sh` and
> everything.
>
> Or maybe just keep the text you have and add a note at the end like:
>
>   Note that writing that `for-each-ref` command directly in the config
>   option doesn't quite work, as it has to handle the path argument
>   specially.
>
> I don't think we need to hand-hold a user through the f() shell-snippet
> trickery. I just don't want somebody thinking they can blindly paste
> that into their config.

Yeah, I agree with your later suggestion, and I'm glad that we're on the
same page. I certainly don't think that we need to do an extra amount of
hand holding through the 'f() { ... }; f' pattern, but I added an extra
bit to say that 'git for-each-ref' by itself doesn't work, since you
have to handle the path argument.
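
A minimal sketch of such a script, for illustration only. The file name
and its location under a temporary directory are placeholders, not
something this series installs:

```shell
# Hypothetical helper; the name and location are placeholders.
dir=$(mktemp -d)
script="$dir/alternate-refs-heads"

cat >"$script" <<\EOF
#!/bin/sh
# Git passes the absolute path of the alternate as "$1".
exec git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads
EOF
chmod +x "$script"

# Then, in the repository that has the alternate:
#   git config core.alternateRefsCommand "$script"
```

Because the 'for-each-ref' invocation lives in its own file, "$1" is
scoped to the script and the path argument is consumed exactly once.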

> > > The other alternative is to pass $GIT_DIR in the environment on behalf
> > > of the program. Then writing:
> > >
> > >   git for-each-ref --format='%(objectname)' refs/heads
> > >
> > > would Just Work. But it's a bit subtle, since it is not immediately
> > > obvious that the command is meant to run in a different repository.
> >
> > I think that we discussed this approach a bit off-list, and I had the
> > idea that it was too fragile to work in practice, and that it would be
> > too surprising for callers to suddenly be in a different world.
> >
> > I say this not because it wouldn't make this particular scenario more
> > convenient, which it undoubtedly would, but because it would make other
> > scenarios _more_ complicated.
> >
> > For example, if a caller uses alternate references backed by, perhaps,
> > MySQL (or anything that _isn't_ Git), they're not going to want to have
> > these GIT_ environment variables set.
>
> If they're not using Git under the hood, then GIT_* probably isn't
> hurting anything. But it is still pretty subtle. Let's forget I
> mentioned it.  Just chaining for-each-ref with a prefix is pretty
> awkward, but that's why we have the next patch with
> alternateRefsPrefixes.
>
> Your response did make me think of one other thing, though. The
> alternate file points to a directory with objects, and the
> for_each_alternate_ref() code checks to see if that looks vaguely like
> the objects/ directory of a git repo. But would anybody want to run
> something like alternateRefsCommand on _just_ the object directory?
> I.e., you don't have a real git repo there, but your script can
> "somehow" come up with a list of valid tips.
>
> That isn't inconceivable to me for the kind of multi-fork storage we do
> at GitHub. E.g., imagine a shared object directory with no refs, and
> then a script that goes out to the other related forks to look at their
> ref tips. I don't think we have any immediate plans for it, though (and
> there are a lot of subtle bits that I won't go into here that make it
> non-trivial). So I'm OK to punt on it for now. I also think in a pinch
> that you could easily fool the alternates code by just having a dummy
> "refs/" directory.

I'm not opposed to the idea in general, and I think that it's a good
one, but I am opposed to it in this series. I think that the series
as-is is concise, and unlocks a path towards implementing this feature
at GitHub, and for other users, too.

Certainly we can invent more complicated examples, and I think that many
of them (yours included) are worth building the extra support for. But
in this initial version, I think that we'd be fine to leave it off.

> > > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> > [...]
> > > > +test_description='git receive-pack test'
> > >
> > > The name of this test file and the description are pretty vague. Can we
> > > say something like "test handling of receive-pack with alternate-refs
> > > config"?
> >
> > I left it intentionally vague, since I'd like for it to contain more
> > tests about 'git receive-pack'-specific things in the future.
> >
> > I'm happy to change the name, though I wonder if we should change the
> > filename accordingly, and if so, to what.
>
> I think we'd want to have a separate script for other receive-pack tests
> that aren't related to alternates. There's some startup overhead to each
> script so we don't want to make them _too_ small, but there are benefits
> to having small test scripts:
>
>  - they're our unit of parallelism, so we want to be able to keep a
>    reasonable number of processors full
>
>  - each test script starts with a clean slate, so there's less chance
>    for unexpected interactions between individual tests (e.g., when
>    modifying or adding a test in the middle of the script)
>
>  - it's less annoying when you're debugging a failing test near the end
>    of a script ;)

All good points, so I'm convinced ;-).

> I actually think we'd benefit from splitting up a few of the longer
> scripts. On my quad-core laptop, running the tests in slow-to-fast order
> keeps the processors pretty busy, and the slowest test takes less time
> than the whole suite. But I've also tried running on a 40-core box. It
> burns through the short tests quickly, but you can never get faster than
> the slowest single test, which takes something like 35 seconds. So
> instead of being 10 times faster, it's more like two times faster, as
> most of the processors idle waiting for that one script to finish.
>
> But that's all pretty tangential here. My point is just that this
> probably ought to remain its own script. :)
>
> I'd probably name it "t5410-receive-pack-alternates" or similar.

Sounds good, I'll do that and update the name of the test to be
'receive-pack with alternate ref filtering'.

> > I'll wait until Monday to re-roll, just to make sure that there isn't
> > any new feedback between now and then.
>
> Sounds good. Thanks for working on this.

It's been my pleasure. Thanks for all of your help.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-09-29  7:34         ` Jeff King
@ 2018-10-02  1:57           ` Taylor Blau
  2018-10-02  2:00             ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  1:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller

On Sat, Sep 29, 2018 at 03:34:26AM -0400, Jeff King wrote:
> On Fri, Sep 28, 2018 at 03:05:57PM -0700, Taylor Blau wrote:
>
> > > > For example, to advertise only tags, a caller using
> > > > 'core.alternateRefsCommand' would have to do:
> > > >
> > > >   $ git config core.alternateRefsCommand ' \
> > > >       git -C "$1" for-each-ref refs/tags --format="%(objectname)"'
> > >
> > > This has the same "$@" issue as the previous one, I think (which only
> > > makes your point about it being cumbersome more true!).
> >
> > Hmm. I'll be curious to see how you respond to my other message about the
> > same topic. I feel that whatever the outcome there is will affect both
> > locations in the same way.
>
> I think they're separate issues, right? I was just confused on the
> earlier patch, but the "git config" command you show above is the actual
> broken case isn't it?

Ah, I certainly had these mixed up on Saturday when I wrote what is
quoted here. As I understand it now, you were talking about the
difference between $@ and "$@", which I did fix (by rewriting the former
to the latter).

> I'm not overly concerned since this isn't recommending the technique to
> end users (and in fact the whole point is to give an alternative), but
> it may be worth showing a working command in case anybody runs across
> it.

Completely agree, and thanks for your review.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-10-02  1:57           ` Taylor Blau
@ 2018-10-02  2:00             ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  2:00 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jeff King, git, gitster, sunshine, sbeller

On Mon, Oct 01, 2018 at 06:57:37PM -0700, Taylor Blau wrote:
> On Sat, Sep 29, 2018 at 03:34:26AM -0400, Jeff King wrote:
> > On Fri, Sep 28, 2018 at 03:05:57PM -0700, Taylor Blau wrote:
> >
> > > > > For example, to advertise only tags, a caller using
> > > > > 'core.alternateRefsCommand' would have to do:
> > > > >
> > > > >   $ git config core.alternateRefsCommand ' \
> > > > >       git -C "$1" for-each-ref refs/tags --format="%(objectname)"'
> > > >
> > > > This has the same "$@" issue as the previous one, I think (which only
> > > > makes your point about it being cumbersome more true!).
> > >
> > > Hmm. I'll be curious to see how you respond to my other message about the
> > > same topic. I feel that whatever the outcome there is will affect both
> > > locations in the same way.
> >
> > I think they're separate issues, right? I was just confused on the
> > earlier patch, but the "git config" command you show above is the actual
> > broken case isn't it?
>
> Ah, I certainly had these mixed up on Saturday when I wrote what is
> quoted here. As I understand it now, you were talking about the
> difference between $@ and "$@", which I did fix (by rewriting the former
> to the latter).

Double "ah!". You were talking about getting the path to the repository
stuck on the end, which _is_ a problem here. I'll fix that.
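
To make the failure concrete, here is a small sketch of how the path
ends up on the end. It assumes the configured value is run roughly as
`sh -c '<value> "$@"' <value> <path>`, as described earlier in the
thread; `echo git` stands in for `git` so that it runs without a
repository, and the alternate path is made up:

```shell
alt=/path/to/alternate   # made-up alternate path

# Value written directly in the config: the path is used as "$1" and
# then appended a second time by the trailing "$@".
naive='echo git -C "$1" for-each-ref refs/tags'
naive_out=$(sh -c "$naive \"\$@\"" "$naive" "$alt")
echo "$naive_out"
# prints: git -C /path/to/alternate for-each-ref refs/tags /path/to/alternate

# Wrapping the command in a function swallows the appended arguments.
wrapped='f() { echo git -C "$1" for-each-ref refs/tags; }; f'
wrapped_out=$(sh -c "$wrapped \"\$@\"" "$wrapped" "$alt")
echo "$wrapped_out"
# prints: git -C /path/to/alternate for-each-ref refs/tags
```

In the first form the stray path would be taken by 'git for-each-ref' as
an extra pattern, which is why the unwrapped command does not behave as
intended.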

> > I'm not overly concerned since this isn't recommending the technique to
> > end users (and in fact the whole point is to give an alternative), but
> > it may be worth showing a working command in case anybody runs across
> > it.
>
> Completely agree, and thanks for your review.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v4 0/4] Filter alternate references
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
                   ` (6 preceding siblings ...)
  2018-09-28  4:25 ` [PATCH v3 0/4] Filter alternate references Taylor Blau
@ 2018-10-02  2:23 ` Taylor Blau
  2018-10-02  2:23   ` [PATCH v4 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
                     ` (3 more replies)
  2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
  8 siblings, 4 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  2:23 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

Hi,

Attached is the fourth re-roll of a series to introduce
'core.alternateRefsCommand' and 'core.alternateRefsPrefixes', which
limit the refs from an alternate that are visible to the fork. This is
done in order to optimize a case described in patch [3/4].

As always, a range-diff is included below, showing that not much has
changed of significance since last round. I mostly focused my efforts on
taking Peff's suggestion towards a more straightforward implementation
of the test setup.

Some extra documentation was written and a couple of commit messages
amended, but no C code has changed since v2.

Thanks again for all of your review.

Thanks,
Taylor

Jeff King (1):
  transport: drop refnames from for_each_alternate_ref

Taylor Blau (3):
  transport.c: extract 'fill_alternate_refs_command'
  transport.c: introduce core.alternateRefsCommand
  transport.c: introduce core.alternateRefsPrefixes

 Documentation/config.txt           | 23 +++++++++++++++++
 builtin/receive-pack.c             |  3 +--
 fetch-pack.c                       |  3 +--
 t/t5410-receive-pack-alternates.sh | 41 ++++++++++++++++++++++++++++++
 transport.c                        | 38 +++++++++++++++++++++------
 transport.h                        |  2 +-
 6 files changed, 97 insertions(+), 13 deletions(-)
 create mode 100755 t/t5410-receive-pack-alternates.sh

Range-diff against v3:
1:  037273dab0 ! 1:  491f258f50 transport: drop refnames from for_each_alternate_ref
    @@ -14,6 +14,7 @@
         bare minimum.

         Signed-off-by: Jeff King <peff@peff.net>
    +    Signed-off-by: Taylor Blau <me@ttaylorr.com>

      diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
      --- a/builtin/receive-pack.c
2:  9479470cb1 = 2:  6119de15f2 transport.c: extract 'fill_alternate_refs_command'
3:  2dbcd54190 ! 3:  aadb27c010 transport.c: introduce core.alternateRefsCommand
    @@ -24,9 +24,7 @@

         Let the repository that has alternates configure this command to avoid
         trusting the alternate to provide us a safe command to run in the shell.
    -    To behave differently on each alternate (e.g., only list tags from
    -    alternate A, only heads from B) provide the path of the alternate as the
    -    first argument.
    +    To find the alternate, pass its absolute path as the first argument.

         Signed-off-by: Taylor Blau <me@ttaylorr.com>

    @@ -40,51 +38,41 @@
     +core.alternateRefsCommand::
     +	When advertising tips of available history from an alternate, use the shell to
     +	execute the specified command instead of linkgit:git-for-each-ref[1]. The
    -+	first argument is the absolute path of the alternate. Output must be of the
    -+	form: `%(objectname)`, where multiple tips are separated by newlines.
    ++	first argument is the absolute path of the alternate. Output must contain one
    ++	hex object id per line (i.e., the same as produced by `git for-each-ref
    ++	--format='%(objectname)'`).
     ++
     +This is useful when a repository only wishes to advertise some of its
    -+alternate's references as ".have"'s. For example, to only advertise branch
    ++alternate's references as `.have`'s. For example, to only advertise branch
     +heads, configure `core.alternateRefsCommand` to the path of a script which runs
     +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
    +++
    ++Note that the configured value is executed in a shell, and thus
    ++linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
    ++the path argument specially.
     +
      core.bare::
      	If true this repository is assumed to be 'bare' and has no
      	working directory associated with it.  If this is the case a

    - diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
    + diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
      new file mode 100755
      --- /dev/null
    - +++ b/t/t5410-receive-pack.sh
    + +++ b/t/t5410-receive-pack-alternates.sh
     @@
     +#!/bin/sh
     +
    -+test_description='git receive-pack test'
    ++test_description='git receive-pack with alternate ref filtering'
     +
     +. ./test-lib.sh
     +
     +test_expect_success 'setup' '
    -+	test_commit one &&
    -+	git update-ref refs/heads/a HEAD &&
    -+	test_commit two &&
    -+	git update-ref refs/heads/b HEAD &&
    -+	test_commit three &&
    -+	git update-ref refs/heads/c HEAD &&
    -+	git clone --bare . fork &&
    -+	git clone fork pusher &&
    -+	(
    -+		cd fork &&
    -+		git update-ref --stdin <<-\EOF &&
    -+		delete refs/heads/a
    -+		delete refs/heads/b
    -+		delete refs/heads/c
    -+		delete refs/heads/master
    -+		delete refs/tags/one
    -+		delete refs/tags/two
    -+		delete refs/tags/three
    -+		EOF
    -+		echo "../../.git/objects" >objects/info/alternates
    -+	)
    ++	test_commit base &&
    ++	git clone -s --bare . fork &&
    ++	git checkout -b public/branch master &&
    ++	test_commit public &&
    ++	git checkout -b private/branch master &&
    ++	test_commit private
     +'
     +
     +extract_haves () {
    @@ -95,11 +83,10 @@
     +	write_script fork/alternate-refs <<-\EOF &&
     +		git --git-dir="$1" for-each-ref \
     +			--format="%(objectname)" \
    -+			refs/heads/a \
    -+			refs/heads/c
    ++			refs/heads/public/
     +	EOF
     +	test_config -C fork core.alternateRefsCommand alternate-refs &&
    -+	git rev-parse a c >expect &&
    ++	git rev-parse public/branch >expect &&
     +	printf "0000" | git receive-pack fork >actual &&
     +	extract_haves <actual >actual.haves &&
     +	test_cmp expect actual.haves
4:  48eb774c9e ! 4:  0d3521e92a transport.c: introduce core.alternateRefsPrefixes
    @@ -12,7 +12,8 @@
         'core.alternateRefsCommand' would have to do:

           $ git config core.alternateRefsCommand ' \
    -          git -C "$1" for-each-ref refs/tags --format="%(objectname)"'
    +          f() { git -C "$1" for-each-ref \
    +                  refs/tags --format="%(objectname)"; }; f "$@"'

         The above is cumbersome to write, so let's introduce a
         "core.alternateRefsPrefixes" to address this common case. Instead, the
    @@ -38,8 +39,8 @@
      --- a/Documentation/config.txt
      +++ b/Documentation/config.txt
     @@
    - heads, configure `core.alternateRefsCommand` to the path of a script which runs
    - `git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
    + linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
    + the path argument specially.

     +core.alternateRefsPrefixes::
     +	When listing references from an alternate, list only references that begin
    @@ -52,16 +53,16 @@
      	If true this repository is assumed to be 'bare' and has no
      	working directory associated with it.  If this is the case a

    - diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
    - --- a/t/t5410-receive-pack.sh
    - +++ b/t/t5410-receive-pack.sh
    + diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
    + --- a/t/t5410-receive-pack-alternates.sh
    + +++ b/t/t5410-receive-pack-alternates.sh
     @@
      	test_cmp expect actual.haves
      '

     +test_expect_success 'with core.alternateRefsPrefixes' '
    -+	test_config -C fork core.alternateRefsPrefixes "refs/tags" &&
    -+	git rev-parse one three two >expect &&
    ++	test_config -C fork core.alternateRefsPrefixes "refs/heads/private" &&
    ++	git rev-parse private/branch >expect &&
     +	printf "0000" | git receive-pack fork >actual &&
     +	extract_haves <actual >actual.haves &&
     +	test_cmp expect actual.haves
--
2.19.0.221.g150f307af

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v4 1/4] transport: drop refnames from for_each_alternate_ref
  2018-10-02  2:23 ` [PATCH v4 0/4] Filter alternate references Taylor Blau
@ 2018-10-02  2:23   ` Taylor Blau
  2018-10-02  2:23   ` [PATCH v4 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  2:23 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

From: Jeff King <peff@peff.net>

None of the current callers use the refname parameter we pass to their
callbacks. In theory somebody _could_ do so, but it's actually quite
weird if you think about it: it's a ref in somebody else's repository.
So the name has no meaning locally, and in fact there may be duplicates
if there are multiple alternates.

The users of this interface really only care about seeing some ref tips,
since that promises that the alternate has the full commit graph
reachable from there. So let's keep the information we pass back to the
bare minimum.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/receive-pack.c | 3 +--
 fetch-pack.c           | 3 +--
 transport.c            | 6 +++---
 transport.h            | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 4d30001950..6792291f5e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -281,8 +281,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	return 0;
 }
 
-static void show_one_alternate_ref(const char *refname,
-				   const struct object_id *oid,
+static void show_one_alternate_ref(const struct object_id *oid,
 				   void *data)
 {
 	struct oidset *seen = data;
diff --git a/fetch-pack.c b/fetch-pack.c
index 75047a4b2a..b643de143b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -76,8 +76,7 @@ struct alternate_object_cache {
 	size_t nr, alloc;
 };
 
-static void cache_one_alternate(const char *refname,
-				const struct object_id *oid,
+static void cache_one_alternate(const struct object_id *oid,
 				void *vcache)
 {
 	struct alternate_object_cache *cache = vcache;
diff --git a/transport.c b/transport.c
index 1c76d64aba..2e0bc414d0 100644
--- a/transport.c
+++ b/transport.c
@@ -1336,7 +1336,7 @@ static void read_alternate_refs(const char *path,
 	cmd.git_cmd = 1;
 	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
 	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
+	argv_array_push(&cmd.args, "--format=%(objectname)");
 	cmd.env = local_repo_env;
 	cmd.out = -1;
 
@@ -1348,13 +1348,13 @@ static void read_alternate_refs(const char *path,
 		struct object_id oid;
 
 		if (get_oid_hex(line.buf, &oid) ||
-		    line.buf[GIT_SHA1_HEXSZ] != ' ') {
+		    line.buf[GIT_SHA1_HEXSZ]) {
 			warning(_("invalid line while parsing alternate refs: %s"),
 				line.buf);
 			break;
 		}
 
-		cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data);
+		cb(&oid, data);
 	}
 
 	fclose(fh);
diff --git a/transport.h b/transport.h
index 01e717c29e..9baeca2d7a 100644
--- a/transport.h
+++ b/transport.h
@@ -261,6 +261,6 @@ int transport_refs_pushed(struct ref *ref);
 void transport_print_push_status(const char *dest, struct ref *refs,
 		  int verbose, int porcelain, unsigned int *reject_reasons);
 
-typedef void alternate_ref_fn(const char *refname, const struct object_id *oid, void *);
+typedef void alternate_ref_fn(const struct object_id *oid, void *);
 extern void for_each_alternate_ref(alternate_ref_fn, void *);
 #endif
-- 
2.19.0.221.g150f307af


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v4 2/4] transport.c: extract 'fill_alternate_refs_command'
  2018-10-02  2:23 ` [PATCH v4 0/4] Filter alternate references Taylor Blau
  2018-10-02  2:23   ` [PATCH v4 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
@ 2018-10-02  2:23   ` Taylor Blau
  2018-10-02  2:23   ` [PATCH v4 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
  2018-10-02  2:24   ` [PATCH v4 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  3 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  2:23 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

To list alternate references, 'read_alternate_refs' creates a child
process running 'git for-each-ref' in the alternate's Git directory.

Prepare to run other commands besides 'git for-each-ref' by introducing
and moving the relevant code from 'read_alternate_refs' to
'fill_alternate_refs_command'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 transport.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/transport.c b/transport.c
index 2e0bc414d0..2825debac5 100644
--- a/transport.c
+++ b/transport.c
@@ -1325,6 +1325,17 @@ char *transport_anonymize_url(const char *url)
 	return xstrdup(url);
 }
 
+static void fill_alternate_refs_command(struct child_process *cmd,
+					const char *repo_path)
+{
+	cmd->git_cmd = 1;
+	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+	argv_array_push(&cmd->args, "for-each-ref");
+	argv_array_push(&cmd->args, "--format=%(objectname)");
+	cmd->env = local_repo_env;
+	cmd->out = -1;
+}
+
 static void read_alternate_refs(const char *path,
 				alternate_ref_fn *cb,
 				void *data)
@@ -1333,12 +1344,7 @@ static void read_alternate_refs(const char *path,
 	struct strbuf line = STRBUF_INIT;
 	FILE *fh;
 
-	cmd.git_cmd = 1;
-	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
-	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname)");
-	cmd.env = local_repo_env;
-	cmd.out = -1;
+	fill_alternate_refs_command(&cmd, path);
 
 	if (start_command(&cmd))
 		return;
-- 
2.19.0.221.g150f307af


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v4 3/4] transport.c: introduce core.alternateRefsCommand
  2018-10-02  2:23 ` [PATCH v4 0/4] Filter alternate references Taylor Blau
  2018-10-02  2:23   ` [PATCH v4 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
  2018-10-02  2:23   ` [PATCH v4 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
@ 2018-10-02  2:23   ` Taylor Blau
  2018-10-02 23:40     ` Jeff King
  2018-10-02  2:24   ` [PATCH v4 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  3 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  2:23 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

When in a repository containing one or more alternates, Git would
sometimes like to list references from those alternates. For example,
'git receive-pack' lists the "tips" pointed to by references in those
alternates as special ".have" references.

Listing ".have" references is designed to make pushing changes from
upstream to a fork a lightweight operation, by advertising to the pusher
that the fork already has the objects (via its alternate). Thus, the
client can avoid sending them.

However, when the alternate (upstream, in the previous example) has a
pathologically large number of references, the initial advertisement is
too expensive. In fact, it can dominate any such optimization where the
pusher avoids sending certain objects.

Introduce "core.alternateRefsCommand" in order to provide a facility to
limit or filter alternate references. This can be used, for example, to
filter out references the alternate does not wish to send (for space
concerns, or otherwise) during the initial advertisement.

Let the repository that has alternates configure this command to avoid
trusting the alternate to provide us a safe command to run in the shell.
To find the alternate, pass its absolute path as the first argument.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt           | 16 +++++++++++++++
 t/t5410-receive-pack-alternates.sh | 33 ++++++++++++++++++++++++++++++
 transport.c                        | 19 +++++++++++++----
 3 files changed, 64 insertions(+), 4 deletions(-)
 create mode 100755 t/t5410-receive-pack-alternates.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ad0f4510c3..ac0577d288 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -616,6 +616,22 @@ core.preferSymlinkRefs::
 	This is sometimes needed to work with old scripts that
 	expect HEAD to be a symbolic link.
 
+core.alternateRefsCommand::
+	When advertising tips of available history from an alternate, use the shell to
+	execute the specified command instead of linkgit:git-for-each-ref[1]. The
+	first argument is the absolute path of the alternate. Output must contain one
+	hex object id per line (i.e., the same as produced by `git for-each-ref
+	--format='%(objectname)'`).
++
+This is useful when a repository only wishes to advertise some of its
+alternate's references as `.have`'s. For example, to only advertise branch
+heads, configure `core.alternateRefsCommand` to the path of a script which runs
+`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
++
+Note that the configured value is executed in a shell, and thus
+linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
+the path argument specially.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
new file mode 100755
index 0000000000..49d0fe44fb
--- /dev/null
+++ b/t/t5410-receive-pack-alternates.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+test_description='git receive-pack with alternate ref filtering'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit base &&
+	git clone -s --bare . fork &&
+	git checkout -b public/branch master &&
+	test_commit public &&
+	git checkout -b private/branch master &&
+	test_commit private
+'
+
+extract_haves () {
+	depacketize | perl -lne '/^(\S+) \.have/ and print $1'
+}
+
+test_expect_success 'with core.alternateRefsCommand' '
+	write_script fork/alternate-refs <<-\EOF &&
+		git --git-dir="$1" for-each-ref \
+			--format="%(objectname)" \
+			refs/heads/public/
+	EOF
+	test_config -C fork core.alternateRefsCommand alternate-refs &&
+	git rev-parse public/branch >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 2825debac5..e271b66603 100644
--- a/transport.c
+++ b/transport.c
@@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
 static void fill_alternate_refs_command(struct child_process *cmd,
 					const char *repo_path)
 {
-	cmd->git_cmd = 1;
-	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
-	argv_array_push(&cmd->args, "for-each-ref");
-	argv_array_push(&cmd->args, "--format=%(objectname)");
+	const char *value;
+
+	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
+		cmd->use_shell = 1;
+
+		argv_array_push(&cmd->args, value);
+		argv_array_push(&cmd->args, repo_path);
+	} else {
+		cmd->git_cmd = 1;
+
+		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+		argv_array_push(&cmd->args, "for-each-ref");
+		argv_array_push(&cmd->args, "--format=%(objectname)");
+	}
+
 	cmd->env = local_repo_env;
 	cmd->out = -1;
 }
-- 
2.19.0.221.g150f307af


^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v4 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-10-02  2:23 ` [PATCH v4 0/4] Filter alternate references Taylor Blau
                     ` (2 preceding siblings ...)
  2018-10-02  2:23   ` [PATCH v4 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-10-02  2:24   ` Taylor Blau
  2018-10-02 15:13     ` Ramsay Jones
  3 siblings, 1 reply; 94+ messages in thread
From: Taylor Blau @ 2018-10-02  2:24 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller

The recently-introduced "core.alternateRefsCommand" allows callers to
specify with high flexibility the tips that they wish to advertise from
alternates. This flexibility comes at the cost of some inconvenience
when the caller only wishes to limit the advertisement to one or more
prefixes.

For example, to advertise only tags, a caller using
'core.alternateRefsCommand' would have to do:

  $ git config core.alternateRefsCommand ' \
      f() { git -C "$1" for-each-ref \
              refs/tags --format="%(objectname)"; }; f "$@"'

The above is cumbersome to write, so let's introduce a
"core.alternateRefsPrefixes" to address this common case. Instead, the
caller can run:

  $ git config core.alternateRefsPrefixes 'refs/tags'

Which will behave identically to the longer example using
"core.alternateRefsCommand".

Since the value of "core.alternateRefsPrefixes" is appended to 'git
for-each-ref' and then executed, include a "--" before taking the
configured value to avoid misinterpreting arguments as flags to 'git
for-each-ref'.

In the case that the caller wishes to specify multiple prefixes, they
may separate them by whitespace. If "core.alternateRefsCommand" is set,
it will take precedence over "core.alternateRefsPrefixes".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt           | 7 +++++++
 t/t5410-receive-pack-alternates.sh | 8 ++++++++
 transport.c                        | 5 +++++
 3 files changed, 20 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ac0577d288..1dc5eb3cfa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -632,6 +632,13 @@ Note that the configured value is executed in a shell, and thus
 linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
 the path argument specially.
 
+core.alternateRefsPrefixes::
+	When listing references from an alternate, list only references that begin
+	with the given prefix. Prefixes match as if they were given as arguments to
+	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
+	whitespace. If `core.alternateRefsCommand` is set, setting
+	`core.alternateRefsPrefixes` has no effect.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
index 49d0fe44fb..94794c35da 100755
--- a/t/t5410-receive-pack-alternates.sh
+++ b/t/t5410-receive-pack-alternates.sh
@@ -30,4 +30,12 @@ test_expect_success 'with core.alternateRefsCommand' '
 	test_cmp expect actual.haves
 '
 
+test_expect_success 'with core.alternateRefsPrefixes' '
+	test_config -C fork core.alternateRefsPrefixes "refs/heads/private" &&
+	git rev-parse private/branch expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
 test_done
diff --git a/transport.c b/transport.c
index e271b66603..83474add28 100644
--- a/transport.c
+++ b/transport.c
@@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
 		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
 		argv_array_push(&cmd->args, "for-each-ref");
 		argv_array_push(&cmd->args, "--format=%(objectname)");
+
+		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
+			argv_array_push(&cmd->args, "--");
+			argv_array_split(&cmd->args, value);
+		}
 	}
 
 	cmd->env = local_repo_env;
-- 
2.19.0.221.g150f307af


* Re: [PATCH v4 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-10-02  2:24   ` [PATCH v4 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
@ 2018-10-02 15:13     ` Ramsay Jones
  2018-10-02 23:28       ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Ramsay Jones @ 2018-10-02 15:13 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: peff, gitster, sunshine, sbeller



On 02/10/18 03:24, Taylor Blau wrote:
[snip]
> diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
> index 49d0fe44fb..94794c35da 100755
> --- a/t/t5410-receive-pack-alternates.sh
> +++ b/t/t5410-receive-pack-alternates.sh
> @@ -30,4 +30,12 @@ test_expect_success 'with core.alternateRefsCommand' '
>  	test_cmp expect actual.haves
>  '
>  
> +test_expect_success 'with core.alternateRefsPrefixes' '
> +	test_config -C fork core.alternateRefsPrefixes "refs/heads/private" &&
> +	git rev-parse private/branch expect &&

s/expect/>expect/ ?

ATB,
Ramsay Jones

> +	printf "0000" | git receive-pack fork >actual &&
> +	extract_haves <actual >actual.haves &&
> +	test_cmp expect actual.haves
> +'
> +
>  test_done
> diff --git a/transport.c b/transport.c
> index e271b66603..83474add28 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
>  		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
>  		argv_array_push(&cmd->args, "for-each-ref");
>  		argv_array_push(&cmd->args, "--format=%(objectname)");
> +
> +		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
> +			argv_array_push(&cmd->args, "--");
> +			argv_array_split(&cmd->args, value);
> +		}
>  	}
>  
>  	cmd->env = local_repo_env;
> 


* [PATCH v4 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-10-02 15:13     ` Ramsay Jones
@ 2018-10-02 23:28       ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-02 23:28 UTC (permalink / raw)
  To: git, ramsay; +Cc: peff, gitster, sunshine, sbeller

On Tue, Oct 02, 2018 at 04:13:13PM +0100, Ramsay Jones wrote:
>
> On 02/10/18 03:24, Taylor Blau wrote:
> [snip]
> > diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
> > index 49d0fe44fb..94794c35da 100755
> > --- a/t/t5410-receive-pack-alternates.sh
> > +++ b/t/t5410-receive-pack-alternates.sh
> > @@ -30,4 +30,12 @@ test_expect_success 'with core.alternateRefsCommand' '
> >  	test_cmp expect actual.haves
> >  '
> >
> > +test_expect_success 'with core.alternateRefsPrefixes' '
> > +	test_config -C fork core.alternateRefsPrefixes "refs/heads/private" &&
> > +	git rev-parse private/branch expect &&
>
> s/expect/>expect/ ?

Ah, certainly. Thanks for catching my mistake. I've resent 4/4 as below.

Junio -- if you find this re-roll to be acceptable, please queue this
patch instead of the one that it is in reply to.

-- >8 --

The recently-introduced "core.alternateRefsCommand" allows callers to
specify with high flexibility the tips that they wish to advertise from
alternates. This flexibility comes at the cost of some inconvenience
when the caller only wishes to limit the advertisement to one or more
prefixes.

For example, to advertise only tags, a caller using
'core.alternateRefsCommand' would have to do:

  $ git config core.alternateRefsCommand ' \
      f() { git -C "$1" for-each-ref \
              refs/tags --format="%(objectname)"; }; f "$@"'

The above is cumbersome to write, so let's introduce a
"core.alternateRefsPrefixes" to address this common case. Instead, the
caller can run:

  $ git config core.alternateRefsPrefixes 'refs/tags'

Which will behave identically to the longer example using
"core.alternateRefsCommand".

Since the value of "core.alternateRefsPrefixes" is appended to 'git
for-each-ref' and then executed, include a "--" before taking the
configured value to avoid misinterpreting arguments as flags to 'git
for-each-ref'.

In the case that the caller wishes to specify multiple prefixes, they
may separate them by whitespace. If "core.alternateRefsCommand" is set,
it will take precedence over "core.alternateRefsPrefixes".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt           | 7 +++++++
 t/t5410-receive-pack-alternates.sh | 8 ++++++++
 transport.c                        | 5 +++++
 3 files changed, 20 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ac0577d288..1dc5eb3cfa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -632,6 +632,13 @@ Note that the configured value is executed in a shell, and thus
 linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
 the path argument specially.

+core.alternateRefsPrefixes::
+	When listing references from an alternate, list only references that begin
+	with the given prefix. Prefixes match as if they were given as arguments to
+	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
+	whitespace. If `core.alternateRefsCommand` is set, setting
+	`core.alternateRefsPrefixes` has no effect.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
index 49d0fe44fb..457c20c2a5 100755
--- a/t/t5410-receive-pack-alternates.sh
+++ b/t/t5410-receive-pack-alternates.sh
@@ -30,4 +30,12 @@ test_expect_success 'with core.alternateRefsCommand' '
 	test_cmp expect actual.haves
 '

+test_expect_success 'with core.alternateRefsPrefixes' '
+	test_config -C fork core.alternateRefsPrefixes "refs/heads/private" &&
+	git rev-parse private/branch >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
 test_done
diff --git a/transport.c b/transport.c
index e271b66603..83474add28 100644
--- a/transport.c
+++ b/transport.c
@@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
 		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
 		argv_array_push(&cmd->args, "for-each-ref");
 		argv_array_push(&cmd->args, "--format=%(objectname)");
+
+		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
+			argv_array_push(&cmd->args, "--");
+			argv_array_split(&cmd->args, value);
+		}
 	}

 	cmd->env = local_repo_env;
--
2.19.0.221.g150f307af


* Re: [PATCH v4 3/4] transport.c: introduce core.alternateRefsCommand
  2018-10-02  2:23   ` [PATCH v4 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-10-02 23:40     ` Jeff King
  2018-10-04  2:17       ` Taylor Blau
  0 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-10-02 23:40 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller

On Mon, Oct 01, 2018 at 07:23:58PM -0700, Taylor Blau wrote:

> +core.alternateRefsCommand::
> +	When advertising tips of available history from an alternate, use the shell to
> +	execute the specified command instead of linkgit:git-for-each-ref[1]. The
> +	first argument is the absolute path of the alternate. Output must contain one
> +	hex object id per line (i.e., the same as produce by `git for-each-ref
> +	--format='%(objectname)'`).
> ++
> +This is useful when a repository only wishes to advertise some of its
> +alternate's references as `.have`'s. For example, to only advertise branch
> +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
> ++
> +Note that the configured value is executed in a shell, and thus
> +linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
> +the path argument specially.

This last paragraph is trying to fix the wrong-impression that we
discussed in the last round. But I'm not sure it doesn't make things
more confusing. ;)

Specifically, the problem isn't the shell. The issue is that we pass the
repo path as an argument to the command. So either:

  - it's a real command that we run, in which case git-for-each-ref does
    not take a repo path argument and so doesn't work; or

  - it's a shell snippet, in which case the argument is appended to the
    snippet (and here's where you can get into a rabbit hole of
    explaining how our shell invocation works, and we should avoid that)

Can we just say:

  Note that you cannot generally put `git for-each-ref` directly into
  the config value, as it does not take a repository path as an argument
  (but you can wrap the command above in a shell script).

> [...]

The rest of the patch looks good to me, along with the other three
(modulo the "expect" fixup you already sent).

-Peff


* Re: [PATCH v4 3/4] transport.c: introduce core.alternateRefsCommand
  2018-10-02 23:40     ` Jeff King
@ 2018-10-04  2:17       ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-04  2:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller

On Tue, Oct 02, 2018 at 07:40:56PM -0400, Jeff King wrote:
> On Mon, Oct 01, 2018 at 07:23:58PM -0700, Taylor Blau wrote:
>
> > +core.alternateRefsCommand::
> > +	When advertising tips of available history from an alternate, use the shell to
> > +	execute the specified command instead of linkgit:git-for-each-ref[1]. The
> > +	first argument is the absolute path of the alternate. Output must contain one
> > +	hex object id per line (i.e., the same as produce by `git for-each-ref
> > +	--format='%(objectname)'`).
> > ++
> > +This is useful when a repository only wishes to advertise some of its
> > +alternate's references as `.have`'s. For example, to only advertise branch
> > +heads, configure `core.alternateRefsCommand` to the path of a script which runs
> > +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
> > ++
> > +Note that the configured value is executed in a shell, and thus
> > +linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
> > +the path argument specially.
>
> This last paragraph is trying to fix the wrong-impression that we
> discussed in the last round. But I'm not sure it doesn't make things
> more confusing. ;)

Heh, point taken. I suppose that I won't try to ignore your feedback
here!

> Specifically, the problem isn't the shell. The issue is that we pass the
> repo path as an argument to the command. So either:
>
>   - it's a real command that we run, in which case git-for-each-ref does
>     not take a repo path argument and so doesn't work; or
>
>   - it's a shell snippet, in which case the argument is appended to the
>     snippet (and here's where you can get into a rabbit hole of
>     explaining how our shell invocation works, and we should avoid that)
>
> Can we just say:
>
>   Note that you cannot generally put `git for-each-ref` directly into
>   the config value, as it does not take a repository path as an argument
>   (but you can wrap the command above in a shell script).
>
> > [...]

Yeah, I think that this is certainly the way to go. I took your
suggestion as-is, which I think is much clearer than what I wrote.
Thanks!

> The rest of the patch looks good to me, along with the other three
> (modulo the "expect" fixup you already sent).

Thanks for your review, as always :-). I certainly appreciate your
patience with the word-smithing and whatnot.

Junio, I applied Peff's suggestion directly into my local copy, so I'm
happy to do either of a couple things, depending on which would be
easiest for you to pick up. I could either:

  1. Re-send this patch (in addition to 4/4), or

  2. Re-roll the entire series (with this and 4/4 amended to reflect the
     two bits of feedback I've gotten since sending v4).

I imagine that the latter will be easier for you to deal with, instead of
manually picking up patch amendments, but if you'd like fewer emails,
that works too :-).

Thanks,
Taylor


* [PATCH v5 0/4] Filter alternate references
  2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
                   ` (7 preceding siblings ...)
  2018-10-02  2:23 ` [PATCH v4 0/4] Filter alternate references Taylor Blau
@ 2018-10-08 18:09 ` Taylor Blau
  2018-10-08 18:09   ` [PATCH v5 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
                     ` (4 more replies)
  8 siblings, 5 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-08 18:09 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller, ramsay

Hi,

Attached is (what I anticipate to be) the final re-roll of my series to
introduce 'core.alternateRefsCommand' and 'core.alternateRefsPrefixes'
in order to limit the ".have" advertisement when pushing over protocol
v1 to a repository with configured alternates.

Not much has changed from last time, except for:

  - Taking a documentation suggestion from Peff (in 3/4), and

  - Fixing a typo pointed out by Ramsay (in 4/4).

I believe that this series is otherwise ready for queueing, if everyone
else feels sufficiently OK about the changes.

Thanks in advance for your review.

Thanks,
Taylor

Jeff King (1):
  transport: drop refnames from for_each_alternate_ref

Taylor Blau (3):
  transport.c: extract 'fill_alternate_refs_command'
  transport.c: introduce core.alternateRefsCommand
  transport.c: introduce core.alternateRefsPrefixes

 Documentation/config.txt           | 18 +++++++++++++
 builtin/receive-pack.c             |  3 +--
 fetch-pack.c                       |  3 +--
 t/t5410-receive-pack-alternates.sh | 41 ++++++++++++++++++++++++++++++
 transport.c                        | 38 +++++++++++++++++++++------
 transport.h                        |  2 +-
 6 files changed, 92 insertions(+), 13 deletions(-)
 create mode 100755 t/t5410-receive-pack-alternates.sh

Range-diff against v4:
1:  76482a7eba = 1:  e4947f557b transport: drop refnames from for_each_alternate_ref
2:  120df009df = 2:  3d77a46c61 transport.c: extract 'fill_alternate_refs_command'
3:  c63864c89a ! 3:  7451b4872a transport.c: introduce core.alternateRefsCommand
    @@ -42,14 +42,9 @@
    +	hex object id per line (i.e., the same as produced by `git for-each-ref
     +	--format='%(objectname)'`).
     ++
    -+This is useful when a repository only wishes to advertise some of its
    -+alternate's references as `.have`'s. For example, to only advertise branch
    -+heads, configure `core.alternateRefsCommand` to the path of a script which runs
    -+`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
    -++
    -+Note that the configured value is executed in a shell, and thus
    -+linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
    -+the path argument specially.
    ++Note that you cannot generally put `git for-each-ref` directly into the config
    ++value, as it does not take a repository path as an argument (but you can wrap
    ++the command above in a shell script).
     +
      core.bare::
      	If true this repository is assumed to be 'bare' and has no
4:  0f6cdc7ea4 ! 4:  28cbbe63f7 transport.c: introduce core.alternateRefsPrefixes
    @@ -39,8 +39,8 @@
      --- a/Documentation/config.txt
      +++ b/Documentation/config.txt
     @@
    - linkgit:git-for-each-ref[1] by itself does not work, as scripts have to handle
    - the path argument specially.
    + value, as it does not take a repository path as an argument (but you can wrap
    + the command above in a shell script).

     +core.alternateRefsPrefixes::
     +	When listing references from an alternate, list only references that begin
    @@ -62,7 +62,7 @@

     +test_expect_success 'with core.alternateRefsPrefixes' '
     +	test_config -C fork core.alternateRefsPrefixes "refs/heads/private" &&
    -+	git rev-parse private/branch expect &&
    ++	git rev-parse private/branch >expect &&
     +	printf "0000" | git receive-pack fork >actual &&
     +	extract_haves <actual >actual.haves &&
     +	test_cmp expect actual.haves
--
2.19.0.221.g150f307af


* [PATCH v5 1/4] transport: drop refnames from for_each_alternate_ref
  2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
@ 2018-10-08 18:09   ` Taylor Blau
  2018-10-08 18:09   ` [PATCH v5 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-08 18:09 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller, ramsay

From: Jeff King <peff@peff.net>

None of the current callers use the refname parameter we pass to their
callbacks. In theory somebody _could_ do so, but it's actually quite
weird if you think about it: it's a ref in somebody else's repository.
So the name has no meaning locally, and in fact there may be duplicates
if there are multiple alternates.

The users of this interface really only care about seeing some ref tips,
since that promises that the alternate has the full commit graph
reachable from there. So let's keep the information we pass back to the
bare minimum.
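
As a rough illustration of what the reader now expects (a sketch in
shell, not the C code in this patch; the function name `only_oid_lines`
is made up here, and it assumes 40-character SHA-1 hex ids): every line
from the alternate must be a bare hex object id with nothing after it,
and parsing stops with a warning at the first line that is not.

```shell
# Hypothetical helper: copy lines that are a bare 40-character hex
# object id, and stop with a warning at the first line that is not.
only_oid_lines () {
	while IFS= read -r line
	do
		case "$line" in
		*[!0-9a-fA-F]*)
			# a non-hex character, e.g. a trailing " refs/heads/x"
			echo "invalid line while parsing alternate refs: $line" >&2
			return 0
			;;
		esac
		if test ${#line} -ne 40
		then
			echo "invalid line while parsing alternate refs: $line" >&2
			return 0
		fi
		printf '%s\n' "$line"
	done
}
```

A line in the old "<oid> <refname>" format would be rejected by this
check, which matches the tightened condition in the hunk below.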

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/receive-pack.c | 3 +--
 fetch-pack.c           | 3 +--
 transport.c            | 6 +++---
 transport.h            | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 4d30001950..6792291f5e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -281,8 +281,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	return 0;
 }
 
-static void show_one_alternate_ref(const char *refname,
-				   const struct object_id *oid,
+static void show_one_alternate_ref(const struct object_id *oid,
 				   void *data)
 {
 	struct oidset *seen = data;
diff --git a/fetch-pack.c b/fetch-pack.c
index 75047a4b2a..b643de143b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -76,8 +76,7 @@ struct alternate_object_cache {
 	size_t nr, alloc;
 };
 
-static void cache_one_alternate(const char *refname,
-				const struct object_id *oid,
+static void cache_one_alternate(const struct object_id *oid,
 				void *vcache)
 {
 	struct alternate_object_cache *cache = vcache;
diff --git a/transport.c b/transport.c
index 1c76d64aba..2e0bc414d0 100644
--- a/transport.c
+++ b/transport.c
@@ -1336,7 +1336,7 @@ static void read_alternate_refs(const char *path,
 	cmd.git_cmd = 1;
 	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
 	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
+	argv_array_push(&cmd.args, "--format=%(objectname)");
 	cmd.env = local_repo_env;
 	cmd.out = -1;
 
@@ -1348,13 +1348,13 @@ static void read_alternate_refs(const char *path,
 		struct object_id oid;
 
 		if (get_oid_hex(line.buf, &oid) ||
-		    line.buf[GIT_SHA1_HEXSZ] != ' ') {
+		    line.buf[GIT_SHA1_HEXSZ]) {
 			warning(_("invalid line while parsing alternate refs: %s"),
 				line.buf);
 			break;
 		}
 
-		cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data);
+		cb(&oid, data);
 	}
 
 	fclose(fh);
diff --git a/transport.h b/transport.h
index 01e717c29e..9baeca2d7a 100644
--- a/transport.h
+++ b/transport.h
@@ -261,6 +261,6 @@ int transport_refs_pushed(struct ref *ref);
 void transport_print_push_status(const char *dest, struct ref *refs,
 		  int verbose, int porcelain, unsigned int *reject_reasons);
 
-typedef void alternate_ref_fn(const char *refname, const struct object_id *oid, void *);
+typedef void alternate_ref_fn(const struct object_id *oid, void *);
 extern void for_each_alternate_ref(alternate_ref_fn, void *);
 #endif
-- 
2.19.0.221.g150f307af



* [PATCH v5 2/4] transport.c: extract 'fill_alternate_refs_command'
  2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
  2018-10-08 18:09   ` [PATCH v5 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
@ 2018-10-08 18:09   ` Taylor Blau
  2018-10-08 18:09   ` [PATCH v5 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-08 18:09 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller, ramsay

To list alternate references, 'read_alternate_refs' creates a child
process running 'git for-each-ref' in the alternate's Git directory.

Prepare to run other commands besides 'git for-each-ref' by introducing
and moving the relevant code from 'read_alternate_refs' to
'fill_alternate_refs_command'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 transport.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/transport.c b/transport.c
index 2e0bc414d0..2825debac5 100644
--- a/transport.c
+++ b/transport.c
@@ -1325,6 +1325,17 @@ char *transport_anonymize_url(const char *url)
 	return xstrdup(url);
 }
 
+static void fill_alternate_refs_command(struct child_process *cmd,
+					const char *repo_path)
+{
+	cmd->git_cmd = 1;
+	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+	argv_array_push(&cmd->args, "for-each-ref");
+	argv_array_push(&cmd->args, "--format=%(objectname)");
+	cmd->env = local_repo_env;
+	cmd->out = -1;
+}
+
 static void read_alternate_refs(const char *path,
 				alternate_ref_fn *cb,
 				void *data)
@@ -1333,12 +1344,7 @@ static void read_alternate_refs(const char *path,
 	struct strbuf line = STRBUF_INIT;
 	FILE *fh;
 
-	cmd.git_cmd = 1;
-	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
-	argv_array_push(&cmd.args, "for-each-ref");
-	argv_array_push(&cmd.args, "--format=%(objectname)");
-	cmd.env = local_repo_env;
-	cmd.out = -1;
+	fill_alternate_refs_command(&cmd, path);
 
 	if (start_command(&cmd))
 		return;
-- 
2.19.0.221.g150f307af



* [PATCH v5 3/4] transport.c: introduce core.alternateRefsCommand
  2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
  2018-10-08 18:09   ` [PATCH v5 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
  2018-10-08 18:09   ` [PATCH v5 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
@ 2018-10-08 18:09   ` Taylor Blau
  2018-10-08 18:09   ` [PATCH v5 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
  2018-10-09  3:09   ` [PATCH v5 0/4] Filter alternate references Jeff King
  4 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-08 18:09 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller, ramsay

When in a repository containing one or more alternates, Git would
sometimes like to list references from those alternates. For example,
'git receive-pack' lists the "tips" pointed to by references in those
alternates as special ".have" references.

Listing ".have" references is designed to make pushing changes from
upstream to a fork a lightweight operation, by advertising to the pusher
that the fork already has the objects (via its alternate). Thus, the
client can avoid sending them.
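
For illustration, once the pkt-line framing has been stripped (the test
script in this patch uses a `depacketize` helper for that), the ".have"
tips in an advertisement can be pulled out with a small filter. This is
only a sketch that uses sed in place of the perl one-liner in the test;
the function name `list_haves` is made up.

```shell
# Hypothetical filter: print the object id of every ".have" line read
# from standard input (one advertised ref per line, already unpacked).
list_haves () {
	sed -n 's/^\([^ ][^ ]*\) \.have.*$/\1/p'
}
```

Lines that advertise ordinary refs are dropped, so the output lists only
the tips that come from alternates.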

However, when the alternate (upstream, in the previous example) has a
pathologically large number of references, the initial advertisement is
too expensive. In fact, it can dominate any such optimization where the
pusher avoids sending certain objects.

Introduce "core.alternateRefsCommand" in order to provide a facility to
limit or filter alternate references. This can be used, for example, to
filter out references the alternate does not wish to send (for space
concerns, or otherwise) during the initial advertisement.

Let the repository that has alternates configure this command to avoid
trusting the alternate to provide us a safe command to run in the shell.
To find the alternate, pass its absolute path as the first argument.
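
A minimal sketch of such a command, written as a shell function so that
it is easy to try out; in practice the body would live in a script whose
path is given as the config value. The name `alternate_refs` is made up,
and the `for-each-ref` arguments here limit the advertisement to branch
heads.

```shell
# "$1" is the absolute path of the alternate, as passed by Git.
# Output is one hex object id per line.
alternate_refs () {
	git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads
}
```

The wrapper is needed because `git for-each-ref` on its own would treat
the appended path as a ref pattern rather than as a repository.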

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt           | 11 ++++++++++
 t/t5410-receive-pack-alternates.sh | 33 ++++++++++++++++++++++++++++++
 transport.c                        | 19 +++++++++++++----
 3 files changed, 59 insertions(+), 4 deletions(-)
 create mode 100755 t/t5410-receive-pack-alternates.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ad0f4510c3..c51e82d8a5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -616,6 +616,17 @@ core.preferSymlinkRefs::
 	This is sometimes needed to work with old scripts that
 	expect HEAD to be a symbolic link.
 
+core.alternateRefsCommand::
+	When advertising tips of available history from an alternate, use the shell to
+	execute the specified command instead of linkgit:git-for-each-ref[1]. The
+	first argument is the absolute path of the alternate. Output must contain one
+	hex object id per line (i.e., the same as produced by `git for-each-ref
+	--format='%(objectname)'`).
++
+Note that you cannot generally put `git for-each-ref` directly into the config
+value, as it does not take a repository path as an argument (but you can wrap
+the command above in a shell script).
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
new file mode 100755
index 0000000000..49d0fe44fb
--- /dev/null
+++ b/t/t5410-receive-pack-alternates.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+test_description='git receive-pack with alternate ref filtering'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit base &&
+	git clone -s --bare . fork &&
+	git checkout -b public/branch master &&
+	test_commit public &&
+	git checkout -b private/branch master &&
+	test_commit private
+'
+
+extract_haves () {
+	depacketize | perl -lne '/^(\S+) \.have/ and print $1'
+}
+
+test_expect_success 'with core.alternateRefsCommand' '
+	write_script fork/alternate-refs <<-\EOF &&
+		git --git-dir="$1" for-each-ref \
+			--format="%(objectname)" \
+			refs/heads/public/
+	EOF
+	test_config -C fork core.alternateRefsCommand alternate-refs &&
+	git rev-parse public/branch >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 2825debac5..e271b66603 100644
--- a/transport.c
+++ b/transport.c
@@ -1328,10 +1328,21 @@ char *transport_anonymize_url(const char *url)
 static void fill_alternate_refs_command(struct child_process *cmd,
 					const char *repo_path)
 {
-	cmd->git_cmd = 1;
-	argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
-	argv_array_push(&cmd->args, "for-each-ref");
-	argv_array_push(&cmd->args, "--format=%(objectname)");
+	const char *value;
+
+	if (!git_config_get_value("core.alternateRefsCommand", &value)) {
+		cmd->use_shell = 1;
+
+		argv_array_push(&cmd->args, value);
+		argv_array_push(&cmd->args, repo_path);
+	} else {
+		cmd->git_cmd = 1;
+
+		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
+		argv_array_push(&cmd->args, "for-each-ref");
+		argv_array_push(&cmd->args, "--format=%(objectname)");
+	}
+
 	cmd->env = local_repo_env;
 	cmd->out = -1;
 }
-- 
2.19.0.221.g150f307af



* [PATCH v5 4/4] transport.c: introduce core.alternateRefsPrefixes
  2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
                     ` (2 preceding siblings ...)
  2018-10-08 18:09   ` [PATCH v5 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
@ 2018-10-08 18:09   ` Taylor Blau
  2018-10-09  3:09   ` [PATCH v5 0/4] Filter alternate references Jeff King
  4 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-08 18:09 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, sunshine, sbeller, ramsay

The recently-introduced "core.alternateRefsCommand" allows callers to
specify with high flexibility the tips that they wish to advertise from
alternates. This flexibility comes at the cost of some inconvenience
when the caller only wishes to limit the advertisement to one or more
prefixes.

For example, to advertise only tags, a caller using
'core.alternateRefsCommand' would have to do:

  $ git config core.alternateRefsCommand ' \
      f() { git -C "$1" for-each-ref \
              refs/tags --format="%(objectname)"; }; f "$@"'

The above is cumbersome to write, so let's introduce
"core.alternateRefsPrefixes" to address this common case. With it, the
caller can instead run:

  $ git config core.alternateRefsPrefixes 'refs/tags'

This behaves identically to the longer example using
"core.alternateRefsCommand".

Since the value of "core.alternateRefsPrefixes" is appended to 'git
for-each-ref' and then executed, include a "--" before taking the
configured value to avoid misinterpreting arguments as flags to 'git
for-each-ref'.

In the case that the caller wishes to specify multiple prefixes, they
may separate them by whitespace. If "core.alternateRefsCommand" is set,
it will take precedence over "core.alternateRefsPrefixes".
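
As a rough illustration of the rules above (not Git's actual code), the
following shell sketch prints, one argument per line, the command that
would be assembled for an alternate. The function name
"alternate_refs_argv" and its three positional parameters are made up
for this example, and it treats an empty value as "unset", which is a
simplification of what the config lookup does.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print, one argument per line,
# the command that would be run to list an alternate's tips.
#   $1: path to the alternate repository
#   $2: value of core.alternateRefsCommand  ("" is treated as unset)
#   $3: value of core.alternateRefsPrefixes ("" is treated as unset)
alternate_refs_argv () {
	repo_path=$1
	refs_command=$2
	refs_prefixes=$3

	if test -n "$refs_command"
	then
		# The command takes precedence; the prefixes are ignored.
		printf '%s\n' "$refs_command" "$repo_path"
		return 0
	fi

	printf '%s\n' git "--git-dir=$repo_path" for-each-ref \
		'--format=%(objectname)'

	if test -n "$refs_prefixes"
	then
		# "--" ends option parsing, so a configured value that
		# starts with a dash is read as a pattern, not as a flag.
		printf '%s\n' --

		# Unquoted expansion splits the value on whitespace;
		# "set -f" keeps glob characters from being expanded.
		set -f
		for prefix in $refs_prefixes
		do
			printf '%s\n' "$prefix"
		done
		set +f
	fi
}
```

For instance, calling alternate_refs_argv /path/to/alt.git "" "refs/heads
refs/tags" prints the for-each-ref invocation followed by "--" and the
two prefixes, each on its own line.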

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt           | 7 +++++++
 t/t5410-receive-pack-alternates.sh | 8 ++++++++
 transport.c                        | 5 +++++
 3 files changed, 20 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index c51e82d8a5..a133a709f3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -627,6 +627,13 @@ Note that you cannot generally put `git for-each-ref` directly into the config
 value, as it does not take a repository path as an argument (but you can wrap
 the command above in a shell script).
 
+core.alternateRefsPrefixes::
+	When listing references from an alternate, list only references that begin
+	with the given prefix. Prefixes match as if they were given as arguments to
+	linkgit:git-for-each-ref[1]. To list multiple prefixes, separate them with
+	whitespace. If `core.alternateRefsCommand` is set, setting
+	`core.alternateRefsPrefixes` has no effect.
+
 core.bare::
 	If true this repository is assumed to be 'bare' and has no
 	working directory associated with it.  If this is the case a
diff --git a/t/t5410-receive-pack-alternates.sh b/t/t5410-receive-pack-alternates.sh
index 49d0fe44fb..457c20c2a5 100755
--- a/t/t5410-receive-pack-alternates.sh
+++ b/t/t5410-receive-pack-alternates.sh
@@ -30,4 +30,12 @@ test_expect_success 'with core.alternateRefsCommand' '
 	test_cmp expect actual.haves
 '
 
+test_expect_success 'with core.alternateRefsPrefixes' '
+	test_config -C fork core.alternateRefsPrefixes "refs/heads/private" &&
+	git rev-parse private/branch >expect &&
+	printf "0000" | git receive-pack fork >actual &&
+	extract_haves <actual >actual.haves &&
+	test_cmp expect actual.haves
+'
+
 test_done
diff --git a/transport.c b/transport.c
index e271b66603..83474add28 100644
--- a/transport.c
+++ b/transport.c
@@ -1341,6 +1341,11 @@ static void fill_alternate_refs_command(struct child_process *cmd,
 		argv_array_pushf(&cmd->args, "--git-dir=%s", repo_path);
 		argv_array_push(&cmd->args, "for-each-ref");
 		argv_array_push(&cmd->args, "--format=%(objectname)");
+
+		if (!git_config_get_value("core.alternateRefsPrefixes", &value)) {
+			argv_array_push(&cmd->args, "--");
+			argv_array_split(&cmd->args, value);
+		}
 	}
 
 	cmd->env = local_repo_env;
-- 
2.19.0.221.g150f307af


* Re: [PATCH v5 0/4] Filter alternate references
  2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
                     ` (3 preceding siblings ...)
  2018-10-08 18:09   ` [PATCH v5 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
@ 2018-10-09  3:09   ` Jeff King
  2018-10-09 14:49     ` Taylor Blau
  4 siblings, 1 reply; 94+ messages in thread
From: Jeff King @ 2018-10-09  3:09 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, sunshine, sbeller, ramsay

On Mon, Oct 08, 2018 at 11:09:20AM -0700, Taylor Blau wrote:

> Attached is (what I anticipate to be) the final re-roll of my series to
> introduce 'core.alternateRefsCommand' and 'core.alternateRefsPrefixes'
> in order to limit the ".have" advertisement when pushing over protocol
> v1 to a repository with configured alternates.

Thanks, this looks good to me!

-Peff


* Re: [PATCH v5 0/4] Filter alternate references
  2018-10-09  3:09   ` [PATCH v5 0/4] Filter alternate references Jeff King
@ 2018-10-09 14:49     ` Taylor Blau
  0 siblings, 0 replies; 94+ messages in thread
From: Taylor Blau @ 2018-10-09 14:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, gitster, sunshine, sbeller, ramsay

On Mon, Oct 08, 2018 at 11:09:18PM -0400, Jeff King wrote:
> On Mon, Oct 08, 2018 at 11:09:20AM -0700, Taylor Blau wrote:
>
> > Attached is (what I anticipate to be) the final re-roll of my series to
> > introduce 'core.alternateRefsCommand' and 'core.alternateRefsPrefixes'
> > in order to limit the ".have" advertisement when pushing over protocol
> > v1 to a repository with configured alternates.
>
> Thanks, this looks good to me!

Thanks again for all of your thoughtful review; it is much appreciated
:-).

Thanks,
Taylor



Thread overview: 94+ messages
2018-09-20 18:04 [PATCH 0/3] Filter alternate references Taylor Blau
2018-09-20 18:04 ` [PATCH 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
2018-09-20 18:04 ` [PATCH 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
2018-09-20 19:37   ` Jeff King
2018-09-20 20:00     ` Taylor Blau
2018-09-20 20:06       ` Jeff King
2018-09-21 16:39   ` Junio C Hamano
2018-09-21 17:48     ` Taylor Blau
2018-09-21 17:57       ` Taylor Blau
2018-09-21 19:59         ` Junio C Hamano
2018-09-26  0:56           ` Taylor Blau
2018-09-20 18:04 ` [PATCH 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
2018-09-20 19:47   ` Jeff King
2018-09-20 20:12     ` Taylor Blau
2018-09-21  7:19   ` Eric Sunshine
2018-09-21 14:07     ` Taylor Blau
2018-09-21 16:45       ` Junio C Hamano
2018-09-21 17:49         ` Taylor Blau
2018-09-21 16:40     ` Junio C Hamano
2018-09-20 18:35 ` [PATCH 0/3] Filter alternate references Stefan Beller
2018-09-20 18:56   ` Taylor Blau
2018-09-20 19:27   ` Jeff King
2018-09-20 19:21 ` Jeff King
2018-09-21 18:47 ` [PATCH v2 " Taylor Blau
2018-09-21 18:47   ` [PATCH v2 1/3] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
2018-09-21 18:47   ` [PATCH v2 2/3] transport.c: introduce core.alternateRefsCommand Taylor Blau
2018-09-21 20:18     ` Eric Sunshine
2018-09-26  0:59       ` Taylor Blau
2018-09-21 21:09     ` Junio C Hamano
2018-09-21 22:13       ` Jeff King
2018-09-21 22:23         ` Junio C Hamano
2018-09-21 22:27           ` Jeff King
2018-09-26  1:06       ` Taylor Blau
2018-09-26  3:21         ` Jeff King
2018-09-21 21:10     ` Eric Sunshine
2018-09-22 18:02     ` brian m. carlson
2018-09-22 19:52       ` Jeff King
2018-09-23 14:53         ` brian m. carlson
2018-09-26  1:09         ` Taylor Blau
2018-09-26  3:33           ` Jeff King
2018-09-26 13:39             ` Taylor Blau
2018-09-26 18:38               ` Jeff King
2018-09-28  2:39                 ` Taylor Blau
2018-09-21 18:47   ` [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
2018-09-21 21:14     ` Junio C Hamano
2018-09-21 21:37       ` Jeff King
2018-09-21 22:06         ` Junio C Hamano
2018-09-21 22:18           ` Jeff King
2018-09-21 22:23             ` Stefan Beller
2018-09-24 15:17             ` Junio C Hamano
2018-09-24 18:10               ` Jeff King
2018-09-24 20:32                 ` Junio C Hamano
2018-09-24 20:50                   ` Jeff King
2018-09-24 21:01                     ` Jeff King
2018-09-24 21:55                     ` Junio C Hamano
2018-09-24 23:14                       ` Jeff King
2018-09-25 17:41                         ` Junio C Hamano
2018-09-25 22:46                           ` Taylor Blau
2018-09-25 23:56                             ` Junio C Hamano
2018-09-26  1:18                               ` Taylor Blau
2018-09-26  3:16                               ` Jeff King
2018-09-28  4:25 ` [PATCH v3 0/4] Filter alternate references Taylor Blau
2018-09-28  4:25   ` [PATCH v3 1/4] transport: drop refnames from for_each_alternate_ref Jeff King
2018-09-28  4:58     ` Jeff King
2018-09-28 14:21       ` Taylor Blau
2018-09-28  4:25   ` [PATCH v3 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
2018-09-28  4:59     ` Jeff King
2018-09-28  4:25   ` [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
2018-09-28  5:26     ` Jeff King
2018-09-28 22:04       ` Taylor Blau
2018-09-29  7:31         ` Jeff King
2018-10-02  1:56           ` Taylor Blau
2018-09-28  4:25   ` [PATCH v3 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
2018-09-28  5:30     ` Jeff King
2018-09-28 22:05       ` Taylor Blau
2018-09-29  7:34         ` Jeff King
2018-10-02  1:57           ` Taylor Blau
2018-10-02  2:00             ` Taylor Blau
2018-10-02  2:23 ` [PATCH v4 0/4] Filter alternate references Taylor Blau
2018-10-02  2:23   ` [PATCH v4 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
2018-10-02  2:23   ` [PATCH v4 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
2018-10-02  2:23   ` [PATCH v4 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
2018-10-02 23:40     ` Jeff King
2018-10-04  2:17       ` Taylor Blau
2018-10-02  2:24   ` [PATCH v4 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
2018-10-02 15:13     ` Ramsay Jones
2018-10-02 23:28       ` Taylor Blau
2018-10-08 18:09 ` [PATCH v5 0/4] Filter alternate references Taylor Blau
2018-10-08 18:09   ` [PATCH v5 1/4] transport: drop refnames from for_each_alternate_ref Taylor Blau
2018-10-08 18:09   ` [PATCH v5 2/4] transport.c: extract 'fill_alternate_refs_command' Taylor Blau
2018-10-08 18:09   ` [PATCH v5 3/4] transport.c: introduce core.alternateRefsCommand Taylor Blau
2018-10-08 18:09   ` [PATCH v5 4/4] transport.c: introduce core.alternateRefsPrefixes Taylor Blau
2018-10-09  3:09   ` [PATCH v5 0/4] Filter alternate references Jeff King
2018-10-09 14:49     ` Taylor Blau
