git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH] refs: implement reference transaction hooks
@ 2020-06-02  8:25 Patrick Steinhardt
  2020-06-02 17:47 ` Junio C Hamano
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-02  8:25 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 10275 bytes --]

The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.

One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.

The above scenario is the motivation for a set of three new hooks that
reach directly into Git's reference transaction. Each of the following
new hooks (currently) doesn't accept any parameters and receives the set
of queued reference updates via stdin:

    - ref-transaction-prepared gets called when all reference updates
      have been queued. At this stage, the hook may decide to abort the
      transaction prematurely by returning a non-zero status code.

    - ref-transaction-committed gets called when a reference transaction
      was transmitted and all queued updates have been persisted.

    - ref-transaction-aborted gets called when a reference transaction
      was aborted and all queued updates have been rolled back.

Given the usecase described above, a voting mechanism can now be
implemented as a "ref-transaction-prepared" hook: as soon as it gets
called, it will take all of stdin and use it to cast a vote to a central
service. When all replicas of the repository agree, the hook will exit
with zero, otherwise it will abort the transaction by returning
non-zero. The most important upside is that this will catch _all_
commands writing references at once, allowing to implement strong
consistency for reference updates via a single mechanism.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/githooks.txt       | 51 ++++++++++++++++++
 refs.c                           | 67 +++++++++++++++++++++++-
 t/t1416-ref-transaction-hooks.sh | 88 ++++++++++++++++++++++++++++++++
 3 files changed, 204 insertions(+), 2 deletions(-)
 create mode 100755 t/t1416-ref-transaction-hooks.sh

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 81f2a87e88..48f8446943 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -404,6 +404,57 @@ Both standard output and standard error output are forwarded to
 `git send-pack` on the other end, so you can simply `echo` messages
 for the user.
 
+ref-transaction-prepared
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+This hook is invoked by any Git command that performs reference
+updates. It executes as soon as all reference updates were queued to
+the transaction and locked on disk. This hook executes for every
+reference transaction that is being prepared and may thus get called
+multiple times.
+
+It takes no arguments, but for each ref to be updated it receives on
+standard input a line of the format:
+
+  <old-value> SP <new-value> SP <ref-name> LF
+
+If the hook exits with a non-zero status, the transaction is aborted
+and the command exits immediately. The
+<<ref-transaction-aborted,'ref-transaction-aborted'>> hook is not
+executed in that case.
+
+[[ref-transaction-aborted]]
+ref-transaction-aborted
+~~~~~~~~~~~~~~~~~~~~~~~
+
+This hook is invoked by any Git command that performs reference
+updates. It executes as soon as a reference transaction is aborted and
+after all reference locks were released and any changes made to
+references were rolled back. The hook may get called multiple times or
+never in case no transaction was aborted.
+
+The hook takes no arguments, but for each ref to be updated it
+receives on standard input a line of the format:
+
+  <old-value> SP <new-value> SP <ref-name> LF
+
+The hook's exit code is discarded by Git.
+
+ref-transaction-committed
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This hook is invoked by any Git command that performs reference
+updates. It executes as soon as a reference transaction is committed,
+persisting all changes to disk and releasing any locks. The hook may
+get called multiple times or never in case no transaction was aborted.
+
+The hook takes no arguments, but for each ref to be updated it
+receives on standard input a line of the format:
+
+  <old-value> SP <new-value> SP <ref-name> LF
+
+The hook's exit code is discarded by Git.
+
 push-to-checkout
 ~~~~~~~~~~~~~~~~
 
diff --git a/refs.c b/refs.c
index 224ff66c7b..e41fa7ea55 100644
--- a/refs.c
+++ b/refs.c
@@ -9,6 +9,7 @@
 #include "iterator.h"
 #include "refs.h"
 #include "refs/refs-internal.h"
+#include "run-command.h"
 #include "object-store.h"
 #include "object.h"
 #include "tag.h"
@@ -16,6 +17,7 @@
 #include "worktree.h"
 #include "argv-array.h"
 #include "repository.h"
+#include "sigchain.h"
 
 /*
  * List of all available backends
@@ -1986,10 +1988,56 @@ int ref_update_reject_duplicates(struct string_list *refnames,
 	return 0;
 }
 
+static int run_transaction_hook(struct ref_transaction *transaction,
+				const char *hook_name)
+{
+	struct child_process proc = CHILD_PROCESS_INIT;
+	struct strbuf buf = STRBUF_INIT;
+	const char *argv[2];
+	int code, i;
+
+	argv[0] = find_hook(hook_name);
+	if (!argv[0])
+		return 0;
+
+	argv[1] = NULL;
+
+	proc.argv = argv;
+	proc.in = -1;
+	proc.stdout_to_stderr = 1;
+	proc.trace2_hook_name = hook_name;
+
+	code = start_command(&proc);
+	if (code)
+		return code;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "%s %s %s\n",
+			    oid_to_hex(&update->old_oid),
+			    oid_to_hex(&update->new_oid),
+			    update->refname);
+
+		if (write_in_full(proc.in, buf.buf, buf.len) < 0)
+			break;
+	}
+
+	close(proc.in);
+	sigchain_pop(SIGPIPE);
+
+	strbuf_release(&buf);
+	return finish_command(&proc);
+}
+
 int ref_transaction_prepare(struct ref_transaction *transaction,
 			    struct strbuf *err)
 {
 	struct ref_store *refs = transaction->ref_store;
+	int ret;
 
 	switch (transaction->state) {
 	case REF_TRANSACTION_OPEN:
@@ -2012,7 +2060,17 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
 		return -1;
 	}
 
-	return refs->be->transaction_prepare(refs, transaction, err);
+	ret = refs->be->transaction_prepare(refs, transaction, err);
+	if (ret)
+		return ret;
+
+	ret = run_transaction_hook(transaction, "ref-transaction-prepared");
+	if (ret) {
+		ref_transaction_abort(transaction, err);
+		die(_("ref updates aborted by hook"));
+	}
+
+	return 0;
 }
 
 int ref_transaction_abort(struct ref_transaction *transaction,
@@ -2036,6 +2094,8 @@ int ref_transaction_abort(struct ref_transaction *transaction,
 		break;
 	}
 
+	run_transaction_hook(transaction, "ref-transaction-aborted");
+
 	ref_transaction_free(transaction);
 	return ret;
 }
@@ -2064,7 +2124,10 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		break;
 	}
 
-	return refs->be->transaction_finish(refs, transaction, err);
+	ret = refs->be->transaction_finish(refs, transaction, err);
+	if (!ret)
+		run_transaction_hook(transaction, "ref-transaction-committed");
+	return ret;
 }
 
 int refs_verify_refname_available(struct ref_store *refs,
diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
new file mode 100755
index 0000000000..b6df5fc883
--- /dev/null
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -0,0 +1,88 @@
+#!/bin/sh
+
+test_description='reference transaction hooks'
+
+. ./test-lib.sh
+
+create_commit ()
+{
+	test_tick &&
+	T=$(git write-tree) &&
+	sha1=$(echo message | git commit-tree $T) &&
+	echo $sha1
+}
+
+test_expect_success setup '
+	mkdir -p .git/hooks
+'
+
+test_expect_success 'prepared hook allows updating ref' '
+	test_when_finished "rm .git/hooks/ref-transaction-prepared" &&
+	write_script .git/hooks/ref-transaction-prepared <<-\EOF &&
+		exit 0
+	EOF
+	C=$(create_commit) &&
+	git update-ref HEAD $C
+'
+
+test_expect_success 'prepared hook aborts updating ref' '
+	test_when_finished "rm .git/hooks/ref-transaction-prepared" &&
+	write_script .git/hooks/ref-transaction-prepared <<-\EOF &&
+		exit 1
+	EOF
+	C=$(create_commit) &&
+	test_must_fail git update-ref HEAD $C 2>err &&
+	grep "ref updates aborted by hook" err
+'
+
+test_expect_success 'prepared hook gets all queued updates' '
+	test_when_finished "rm .git/hooks/ref-transaction-prepared actual" &&
+	write_script .git/hooks/ref-transaction-prepared <<-\EOF &&
+		while read -r line; do printf "%s\n" "$line"; done >actual
+	EOF
+	C=$(create_commit) &&
+	cat >expect <<-EOF &&
+		$ZERO_OID $C HEAD
+		$ZERO_OID $C refs/heads/master
+	EOF
+	git update-ref HEAD $C <<-EOF &&
+		update HEAD $ZERO_OID $C
+		update refs/heads/master $ZERO_OID $C
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'committed hook gets all queued updates' '
+	test_when_finished "rm .git/hooks/ref-transaction-committed actual" &&
+	write_script .git/hooks/ref-transaction-committed <<-\EOF &&
+		while read -r line; do printf "%s\n" "$line"; done >actual
+	EOF
+	C=$(create_commit) &&
+	cat >expect <<-EOF &&
+		$ZERO_OID $C HEAD
+		$ZERO_OID $C refs/heads/master
+	EOF
+	git update-ref HEAD $C &&
+	test_cmp expect actual
+'
+
+test_expect_success 'aborted hook gets all queued updates' '
+	test_when_finished "rm .git/hooks/ref-transaction-aborted actual" &&
+	write_script .git/hooks/ref-transaction-aborted <<-\EOF &&
+		while read -r line; do printf "%s\n" "$line"; done >actual
+	EOF
+	C=$(create_commit) &&
+	cat >expect <<-EOF &&
+		$ZERO_OID $C HEAD
+		$ZERO_OID $C refs/heads/master
+	EOF
+	git update-ref --stdin <<-EOF &&
+		start
+		update HEAD $C $ZERO_OID
+		update refs/heads/master $C $ZERO_OID
+		abort
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
2.26.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] refs: implement reference transaction hooks
  2020-06-02  8:25 [PATCH] refs: implement reference transaction hooks Patrick Steinhardt
@ 2020-06-02 17:47 ` Junio C Hamano
  2020-06-03 11:26   ` Patrick Steinhardt
  2020-06-02 18:09 ` SZEDER Gábor
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2020-06-02 17:47 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> The above scenario is the motivation for a set of three new hooks that
> reach directly into Git's reference transaction. Each of the following
> new hooks (currently) doesn't accept any parameters and receives the set
> of queued reference updates via stdin:

Do we have something (e.g. performance measurement) to convince
ourselves that this won't incur unacceptable levels of overhead in
null cases where there is no hook installed in the repository?

> +static int run_transaction_hook(struct ref_transaction *transaction,
> +				const char *hook_name)
> +{
> +	struct child_process proc = CHILD_PROCESS_INIT;
> +	struct strbuf buf = STRBUF_INIT;
> +	const char *argv[2];
> +	int code, i;
> +
> +	argv[0] = find_hook(hook_name);
> +	if (!argv[0])
> +		return 0;
> +
> +	argv[1] = NULL;
> +
> +	proc.argv = argv;

Let's use proc.args and argv_push() API in newly introduced code
instead.

> +	proc.in = -1;
> +	proc.stdout_to_stderr = 1;
> +	proc.trace2_hook_name = hook_name;
> +
> +	code = start_command(&proc);
> +	if (code)
> +		return code;
> +
> +	sigchain_push(SIGPIPE, SIG_IGN);
> +
> +	for (i = 0; i < transaction->nr; i++) {
> +		struct ref_update *update = transaction->updates[i];
> +
> +		strbuf_reset(&buf);
> +		strbuf_addf(&buf, "%s %s %s\n",
> +			    oid_to_hex(&update->old_oid),
> +			    oid_to_hex(&update->new_oid),
> +			    update->refname);
> +
> +		if (write_in_full(proc.in, buf.buf, buf.len) < 0)
> +			break;

We leave the loop early when we detect a write failure here...

> +	}
> +
> +	close(proc.in);
> +	sigchain_pop(SIGPIPE);
> +
> +	strbuf_release(&buf);
> +	return finish_command(&proc);

... but the caller does not get notified.  Intended?

> +}
> +
>  int ref_transaction_prepare(struct ref_transaction *transaction,
>  			    struct strbuf *err)
>  {
>  	struct ref_store *refs = transaction->ref_store;
> +	int ret;
>  
>  	switch (transaction->state) {
>  	case REF_TRANSACTION_OPEN:
> @@ -2012,7 +2060,17 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
>  		return -1;
>  	}
>  
> -	return refs->be->transaction_prepare(refs, transaction, err);
> +	ret = refs->be->transaction_prepare(refs, transaction, err);
> +	if (ret)
> +		return ret;
> +
> +	ret = run_transaction_hook(transaction, "ref-transaction-prepared");

This caller does care about it, no?

> +	if (ret) {
> +		ref_transaction_abort(transaction, err);
> +		die(_("ref updates aborted by hook"));
> +	}
> +
> +	return 0;
>  }
>  
>  int ref_transaction_abort(struct ref_transaction *transaction,
> @@ -2036,6 +2094,8 @@ int ref_transaction_abort(struct ref_transaction *transaction,
>  		break;
>  	}
>  
> +	run_transaction_hook(transaction, "ref-transaction-aborted");

And I presume that the callers of "ref_xn_abort()" would, too, but
the value returned here does not get folded into 'ret'.

>  	ref_transaction_free(transaction);
>  	return ret;
>  }
> @@ -2064,7 +2124,10 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  		break;
>  	}
>  
> -	return refs->be->transaction_finish(refs, transaction, err);
> +	ret = refs->be->transaction_finish(refs, transaction, err);
> +	if (!ret)
> +		run_transaction_hook(transaction, "ref-transaction-committed");
> +	return ret;
>  }
>  
>  int refs_verify_refname_available(struct ref_store *refs,
> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> new file mode 100755
> index 0000000000..b6df5fc883
> --- /dev/null
> +++ b/t/t1416-ref-transaction-hooks.sh
> @@ -0,0 +1,88 @@
> +#!/bin/sh
> +
> +test_description='reference transaction hooks'
> +
> +. ./test-lib.sh
> +
> +create_commit ()
> +{

Style (Documentation/CodingGuidelines):

 - We prefer a space between the function name and the parentheses,
   and no space inside the parentheses. The opening "{" should also
   be on the same line.

> +	test_tick &&
> +	T=$(git write-tree) &&
> +	sha1=$(echo message | git commit-tree $T) &&
> +	echo $sha1

Calling this helper in a subprocess does not have the intended
effect of calling test_tick, which increments the committer
timestamp by 1 second to prepare for the next call...

> +}
> +
> +test_expect_success setup '
> +	mkdir -p .git/hooks
> +'
> +
> +test_expect_success 'prepared hook allows updating ref' '
> +	test_when_finished "rm .git/hooks/ref-transaction-prepared" &&
> +	write_script .git/hooks/ref-transaction-prepared <<-\EOF &&
> +		exit 0
> +	EOF
> +	C=$(create_commit) &&

... like here.

Instead of creating test commits left and right at each test, how
about preparing a pair of them (PRE, POST) upfront in the "setup"
step, reset the refs involved to PRE at the very beginning of each
test, and use POST to operations that would update the refs?  We can
use a couple of calls to test_commit helper to do so, without having
to bother with the low level porcelain calls if we go that route.

> +	git update-ref HEAD $C
> +'
> +
> +test_expect_success 'prepared hook aborts updating ref' '
> +	test_when_finished "rm .git/hooks/ref-transaction-prepared" &&
> +	write_script .git/hooks/ref-transaction-prepared <<-\EOF &&
> +		exit 1
> +	EOF
> +	C=$(create_commit) &&
> +	test_must_fail git update-ref HEAD $C 2>err &&
> +	grep "ref updates aborted by hook" err

Running "make GIT_TEST_GETTEXT_POISON=Yes test" would probably break
this line.  Use test_i18ngrep instead.

> +'
> +
> +test_expect_success 'prepared hook gets all queued updates' '
> +	test_when_finished "rm .git/hooks/ref-transaction-prepared actual" &&
> +	write_script .git/hooks/ref-transaction-prepared <<-\EOF &&
> +		while read -r line; do printf "%s\n" "$line"; done >actual

Style (used in the generated script)?

> +	EOF
> +	C=$(create_commit) &&
> +	cat >expect <<-EOF &&
> +		$ZERO_OID $C HEAD
> +		$ZERO_OID $C refs/heads/master
> +	EOF
> +	git update-ref HEAD $C <<-EOF &&
> +		update HEAD $ZERO_OID $C
> +		update refs/heads/master $ZERO_OID $C
> +	EOF
> +	test_cmp expect actual

OK, good futureproofing for the hash algorithm update ;-).

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] refs: implement reference transaction hooks
  2020-06-02  8:25 [PATCH] refs: implement reference transaction hooks Patrick Steinhardt
  2020-06-02 17:47 ` Junio C Hamano
@ 2020-06-02 18:09 ` SZEDER Gábor
  2020-06-03  9:46   ` Patrick Steinhardt
  2020-06-03 12:27 ` [PATCH v2] refs: implement reference transaction hook Patrick Steinhardt
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: SZEDER Gábor @ 2020-06-02 18:09 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Tue, Jun 02, 2020 at 10:25:44AM +0200, Patrick Steinhardt wrote:
> The low-level reference transactions used to update references are
> currently completely opaque to the user. While certainly desirable in
> most usecases, there are some which might want to hook into the
> transaction to observe all queued reference updates as well as observing
> the abortion or commit of a prepared transaction.
> 
> One such usecase would be to have a set of replicas of a given Git
> repository, where we perform Git operations on all of the repositories
> at once and expect the outcome to be the same in all of them. While
> there exist hooks already for a certain subset of Git commands that
> could be used to implement a voting mechanism for this, many others
> currently don't have any mechanism for this.
> 
> The above scenario is the motivation for a set of three new hooks that
> reach directly into Git's reference transaction. Each of the following
> new hooks (currently) doesn't accept any parameters and receives the set
> of queued reference updates via stdin:
> 
>     - ref-transaction-prepared gets called when all reference updates
>       have been queued. At this stage, the hook may decide to abort the
>       transaction prematurely by returning a non-zero status code.
> 
>     - ref-transaction-committed gets called when a reference transaction
>       was transmitted and all queued updates have been persisted.
> 
>     - ref-transaction-aborted gets called when a reference transaction
>       was aborted and all queued updates have been rolled back.

The point of reference transactions is that they are atomic, and these
hooks must work together to ensure that.  This raises the question how
these hooks can be updated in an actively used repository.

Having multiple hooks means that they can't be updated atomically, and
git might invoke the new abort hook after the transaction was prepared
with the old hook.  Now, if there were a single 'ref-transaction' hook
(which gets the phase of the transaction ('prepared', 'committed' or
'aborted') as a parameter), then it could be updated atomically by
mv-ing it to place, but even that update can happen in between git
invokes 'ref-transaction prepared' and 'ref-transaction aborted'.

I suppose this issue could be addressed by a single hook which runs
during the whole transaction and some back-and-forth communication
through stdin/out between git and the hook.  However, this would, I'm
afraid, complicate both Git's handling of this hook and the hook as
well, so let's take a step back first: is this something we should
worry about in the first place?

> Given the usecase described above, a voting mechanism can now be
> implemented as a "ref-transaction-prepared" hook: as soon as it gets
> called, it will take all of stdin and use it to cast a vote to a central
> service. When all replicas of the repository agree, the hook will exit
> with zero, otherwise it will abort the transaction by returning
> non-zero. The most important upside is that this will catch _all_
> commands writing references at once, allowing to implement strong
> consistency for reference updates via a single mechanism.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/githooks.txt       | 51 ++++++++++++++++++
>  refs.c                           | 67 +++++++++++++++++++++++-
>  t/t1416-ref-transaction-hooks.sh | 88 ++++++++++++++++++++++++++++++++
>  3 files changed, 204 insertions(+), 2 deletions(-)
>  create mode 100755 t/t1416-ref-transaction-hooks.sh
> 
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index 81f2a87e88..48f8446943 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -404,6 +404,57 @@ Both standard output and standard error output are forwarded to
>  `git send-pack` on the other end, so you can simply `echo` messages
>  for the user.
>  
> +ref-transaction-prepared
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +This hook is invoked by any Git command that performs reference
> +updates. It executes as soon as all reference updates were queued to
> +the transaction and locked on disk. This hook executes for every
> +reference transaction that is being prepared and may thus get called
> +multiple times.
> +
> +It takes no arguments, but for each ref to be updated it receives on
> +standard input a line of the format:
> +
> +  <old-value> SP <new-value> SP <ref-name> LF
> +
> +If the hook exits with a non-zero status, the transaction is aborted
> +and the command exits immediately. The
> +<<ref-transaction-aborted,'ref-transaction-aborted'>> hook is not
> +executed in that case.
> +
> +[[ref-transaction-aborted]]
> +ref-transaction-aborted
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +This hook is invoked by any Git command that performs reference
> +updates. It executes as soon as a reference transaction is aborted and
> +after all reference locks were released and any changes made to
> +references were rolled back. The hook may get called multiple times or
> +never in case no transaction was aborted.
> +
> +The hook takes no arguments, but for each ref to be updated it

Nit: I found it a bit surprising to read about refs "to be updated" in
the description of the 'aborted' hook, because by the time this hook
is called the update has already been refused.  The same applies to
the 'committed' hook below as well.

> +receives on standard input a line of the format:
> +
> +  <old-value> SP <new-value> SP <ref-name> LF
> +
> +The hook's exit code is discarded by Git.
> +
> +ref-transaction-committed
> +~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +This hook is invoked by any Git command that performs reference
> +updates. It executes as soon as a reference transaction is committed,
> +persisting all changes to disk and releasing any locks. The hook may
> +get called multiple times or never in case no transaction was aborted.
> +
> +The hook takes no arguments, but for each ref to be updated it
> +receives on standard input a line of the format:
> +
> +  <old-value> SP <new-value> SP <ref-name> LF
> +
> +The hook's exit code is discarded by Git.
> +
>  push-to-checkout
>  ~~~~~~~~~~~~~~~~
>  

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] refs: implement reference transaction hooks
  2020-06-02 18:09 ` SZEDER Gábor
@ 2020-06-03  9:46   ` Patrick Steinhardt
  0 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-03  9:46 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 6846 bytes --]

On Tue, Jun 02, 2020 at 08:09:00PM +0200, SZEDER Gábor wrote:
> On Tue, Jun 02, 2020 at 10:25:44AM +0200, Patrick Steinhardt wrote:
> > The low-level reference transactions used to update references are
> > currently completely opaque to the user. While certainly desirable in
> > most usecases, there are some which might want to hook into the
> > transaction to observe all queued reference updates as well as observing
> > the abortion or commit of a prepared transaction.
> > 
> > One such usecase would be to have a set of replicas of a given Git
> > repository, where we perform Git operations on all of the repositories
> > at once and expect the outcome to be the same in all of them. While
> > there exist hooks already for a certain subset of Git commands that
> > could be used to implement a voting mechanism for this, many others
> > currently don't have any mechanism for this.
> > 
> > The above scenario is the motivation for a set of three new hooks that
> > reach directly into Git's reference transaction. Each of the following
> > new hooks (currently) doesn't accept any parameters and receives the set
> > of queued reference updates via stdin:
> > 
> >     - ref-transaction-prepared gets called when all reference updates
> >       have been queued. At this stage, the hook may decide to abort the
> >       transaction prematurely by returning a non-zero status code.
> > 
> >     - ref-transaction-committed gets called when a reference transaction
> >       was transmitted and all queued updates have been persisted.
> > 
> >     - ref-transaction-aborted gets called when a reference transaction
> >       was aborted and all queued updates have been rolled back.
> 
> The point of reference transactions is that they are atomic, and these
> hooks must work together to ensure that.  This raises the question how
> these hooks can be updated in an actively used repository.
> 
> Having multiple hooks means that they can't be updated atomically, and
> git might invoke the new abort hook after the transaction was prepared
> with the old hook.  Now, if there were a single 'ref-transaction' hook
> (which gets the phase of the transaction ('prepared', 'committed' or
> 'aborted') as a parameter), then it could be updated atomically by
> mv-ing it to place, but even that update can happen in between git
> invokes 'ref-transaction prepared' and 'ref-transaction aborted'.
> 
> I suppose this issue could be addressed by a single hook which runs
> during the whole transaction and some back-and-forth communication
> through stdin/out between git and the hook.  However, this would, I'm
> afraid, complicate both Git's handling of this hook and the hook as
> well, so let's take a step back first: is this something we should
> worry about in the first place?

Very good point about which I didn't previously think, thanks a lot for
raising it!

I agree that using a single long-lived hook would complicate the logic
by quite a bit. Given that the ref-transaction mechanism is such a
central piece to Git, I'd be wary of introducing such complexity. But
merging the current three hooks into a single hook accepting a parameter
sounds like a fair compromise to me that would at least allow users to
replace them atomically, even though it doesn't mean all stages were run
with the same version of the hook.

If we want to go further and also ensure the same script's run across
all hook invocations, we could also open the hook's file descriptor on
first invocation and then use `fexecve` on all subsequent invocations.
As long as the user doesn't do inline rewrites of the file, we'd thus
always use the same file. I'm not sure how portable that syscall is,
though, but it's at least part of POSIX-2008.

> > Given the usecase described above, a voting mechanism can now be
> > implemented as a "ref-transaction-prepared" hook: as soon as it gets
> > called, it will take all of stdin and use it to cast a vote to a central
> > service. When all replicas of the repository agree, the hook will exit
> > with zero, otherwise it will abort the transaction by returning
> > non-zero. The most important upside is that this will catch _all_
> > commands writing references at once, allowing to implement strong
> > consistency for reference updates via a single mechanism.
> > 
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  Documentation/githooks.txt       | 51 ++++++++++++++++++
> >  refs.c                           | 67 +++++++++++++++++++++++-
> >  t/t1416-ref-transaction-hooks.sh | 88 ++++++++++++++++++++++++++++++++
> >  3 files changed, 204 insertions(+), 2 deletions(-)
> >  create mode 100755 t/t1416-ref-transaction-hooks.sh
> > 
> > diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> > index 81f2a87e88..48f8446943 100644
> > --- a/Documentation/githooks.txt
> > +++ b/Documentation/githooks.txt
> > @@ -404,6 +404,57 @@ Both standard output and standard error output are forwarded to
> >  `git send-pack` on the other end, so you can simply `echo` messages
> >  for the user.
> >  
> > +ref-transaction-prepared
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +This hook is invoked by any Git command that performs reference
> > +updates. It executes as soon as all reference updates were queued to
> > +the transaction and locked on disk. This hook executes for every
> > +reference transaction that is being prepared and may thus get called
> > +multiple times.
> > +
> > +It takes no arguments, but for each ref to be updated it receives on
> > +standard input a line of the format:
> > +
> > +  <old-value> SP <new-value> SP <ref-name> LF
> > +
> > +If the hook exits with a non-zero status, the transaction is aborted
> > +and the command exits immediately. The
> > +<<ref-transaction-aborted,'ref-transaction-aborted'>> hook is not
> > +executed in that case.
> > +
> > +[[ref-transaction-aborted]]
> > +ref-transaction-aborted
> > +~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +This hook is invoked by any Git command that performs reference
> > +updates. It executes as soon as a reference transaction is aborted and
> > +after all reference locks were released and any changes made to
> > +references were rolled back. The hook may get called multiple times or
> > +never in case no transaction was aborted.
> > +
> > +The hook takes no arguments, but for each ref to be updated it
> 
> Nit: I found it a bit surprising to read about refs "to be updated" in
> the description of the 'aborted' hook, because by the time this hook
> is called the update has already been refused.  The same applies to
> the 'committed' hook below as well.

Fair. This should rather read "that would have been updated" and "that
were updated", respectively.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] refs: implement reference transaction hooks
  2020-06-02 17:47 ` Junio C Hamano
@ 2020-06-03 11:26   ` Patrick Steinhardt
  2020-06-07 20:12     ` SZEDER Gábor
  0 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-03 11:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 3738 bytes --]

On Tue, Jun 02, 2020 at 10:47:55AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The above scenario is the motivation for a set of three new hooks that
> > reach directly into Git's reference transaction. Each of the following
> > new hooks (currently) doesn't accept any parameters and receives the set
> > of queued reference updates via stdin:
> 
> Do we have something (e.g. performance measurement) to convince
> ourselves that this won't incur unacceptable levels of overhead in
> null cases where there is no hook installed in the repository?

Not yet, but I'll try to come up with a benchmark in the next iteration.
I guess the best way to test is to directly exercise git-update-refs, as
it's nearly a direct wrapper around reference transactions.

> > +	proc.in = -1;
> > +	proc.stdout_to_stderr = 1;
> > +	proc.trace2_hook_name = hook_name;
> > +
> > +	code = start_command(&proc);
> > +	if (code)
> > +		return code;
> > +
> > +	sigchain_push(SIGPIPE, SIG_IGN);
> > +
> > +	for (i = 0; i < transaction->nr; i++) {
> > +		struct ref_update *update = transaction->updates[i];
> > +
> > +		strbuf_reset(&buf);
> > +		strbuf_addf(&buf, "%s %s %s\n",
> > +			    oid_to_hex(&update->old_oid),
> > +			    oid_to_hex(&update->new_oid),
> > +			    update->refname);
> > +
> > +		if (write_in_full(proc.in, buf.buf, buf.len) < 0)
> > +			break;
> 
> We leave the loop early when we detect a write failure here...
> 
> > +	}
> > +
> > +	close(proc.in);
> > +	sigchain_pop(SIGPIPE);
> > +
> > +	strbuf_release(&buf);
> > +	return finish_command(&proc);
> 
> ... but the caller does not get notified.  Intended?

This is semi-intended. In case the hook doesn't fully consume stdin and
exits early, writing to its stdin would fail as we ignore SIGPIPE. We
don't want to force the hook to care about consuming all of stdin,
though.

We could improve error handling here by ignoring EPIPE, but making every
other write error fatal. If there's any other abnormal error condition
then we certainly don't want the hook to act on incomplete data and
pretend everything's fine.

> > +}
> > +
> >  int ref_transaction_prepare(struct ref_transaction *transaction,
> >  			    struct strbuf *err)
> >  {
> >  	struct ref_store *refs = transaction->ref_store;
> > +	int ret;
> >  
> >  	switch (transaction->state) {
> >  	case REF_TRANSACTION_OPEN:
> > @@ -2012,7 +2060,17 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
> >  		return -1;
> >  	}
> >  
> > -	return refs->be->transaction_prepare(refs, transaction, err);
> > +	ret = refs->be->transaction_prepare(refs, transaction, err);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = run_transaction_hook(transaction, "ref-transaction-prepared");
> 
> This caller does care about it, no?

This caller does as it may abort the transaction, but...

> > +	if (ret) {
> > +		ref_transaction_abort(transaction, err);
> > +		die(_("ref updates aborted by hook"));
> > +	}
> > +
> > +	return 0;
> >  }
> >  
> >  int ref_transaction_abort(struct ref_transaction *transaction,
> > @@ -2036,6 +2094,8 @@ int ref_transaction_abort(struct ref_transaction *transaction,
> >  		break;
> >  	}
> >  
> > +	run_transaction_hook(transaction, "ref-transaction-aborted");
> 
> And I presume that the callers of "ref_xn_abort()" would, too, but
> the value returned here does not get folded into 'ret'.

... this one doesn't. The thing is that the reference transaction hook
for the "aborted" case can't do anything about an aborted transaction
after the fact anyway. That's why I chose to ignore errors here, same
for the "committed" case.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2] refs: implement reference transaction hook
  2020-06-02  8:25 [PATCH] refs: implement reference transaction hooks Patrick Steinhardt
  2020-06-02 17:47 ` Junio C Hamano
  2020-06-02 18:09 ` SZEDER Gábor
@ 2020-06-03 12:27 ` Patrick Steinhardt
  2020-06-03 16:51   ` Taylor Blau
  2020-06-03 17:44   ` Han-Wen Nienhuys
  2020-06-18 10:27 ` [PATCH v3] " Patrick Steinhardt
  2020-06-19  6:56 ` [PATCH v4] " Patrick Steinhardt
  4 siblings, 2 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-03 12:27 UTC (permalink / raw)
  To: git; +Cc: SZEDER Gábor, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 12103 bytes --]

The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.

One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.

The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.

Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.

In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduces a new performance test for git-update-refs(1). Run against an
empty repository, it produces the following results:

  Test                                   HEAD~             HEAD
  ------------------------------------------------------------------------------
  1400.2: update existing reference      2.05(1.58+0.54)   2.08(1.58+0.57) +1.5%
  1400.3: create and destroy reference   1.79(1.38+0.49)   1.82(1.39+0.51) +1.7%

So the overhead is around ~1.5%. Given that git-update-refs(1) is a
near-direct wrapper around reference transactions, there likely is no
other command that is impacted much worse than this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---

The main adjustment in this second version is that I merged the
previously three hooks into a single one that gets the transaction state
as its first parameter, as proposed by Gábor. This is mainly to enable
atomic replacement of all three scripts, even though it could still be
that the hook gets replaced during a session. But I think it makes sense
regardless to merge these hooks into a single one.

I've also made changes based on Junio's feedback and added a benchmark
that exercises git-update-refs(1) as a proxy for this change. I guess
the ~1.5% penalty isn't too bad. It might be improved by caching hook
existence, but I don't think it necessary right now.

Thanks to both of you for your feedback!

Patrick

 Documentation/githooks.txt       |  29 ++++++++
 refs.c                           |  72 +++++++++++++++++++-
 t/perf/p1400-update-ref.sh       |  31 +++++++++
 t/t1416-ref-transaction-hooks.sh | 109 +++++++++++++++++++++++++++++++
 4 files changed, 239 insertions(+), 2 deletions(-)
 create mode 100755 t/perf/p1400-update-ref.sh
 create mode 100755 t/t1416-ref-transaction-hooks.sh

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 81f2a87e88..642471109f 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -404,6 +404,35 @@ Both standard output and standard error output are forwarded to
 `git send-pack` on the other end, so you can simply `echo` messages
 for the user.
 
+ref-transaction
+~~~~~~~~~~~~~~~
+
+This hook is invoked by any Git command that performs reference
+updates. It executes whenever a reference transaction is prepared,
+committed or aborted and may thus get called multiple times.
+
+The hook takes exactly one argument, which is the current state the
+given reference transaction is in:
+
+    - "prepared": All reference updates have been queued to the
+      transaction and references were locked on disk.
+
+    - "committed": The reference transaction was committed and all
+      references now have their respective new value.
+
+    - "aborted": The reference transaction was aborted, no changes
+      were performed and the locks have been released.
+
+For each reference update that was added to the transaction, the hook
+receives on standard input a line of the format:
+
+  <old-value> SP <new-value> SP <ref-name> LF
+
+The exit status of the hook is ignored for any state except for the
+"prepared" state. In the "prepared" state, a non-zero exit status will
+cause the transaction to be aborted. The hook will not be called with
+"aborted" state in that case.
+
 push-to-checkout
 ~~~~~~~~~~~~~~~~
 
diff --git a/refs.c b/refs.c
index 224ff66c7b..af752e1759 100644
--- a/refs.c
+++ b/refs.c
@@ -9,6 +9,7 @@
 #include "iterator.h"
 #include "refs.h"
 #include "refs/refs-internal.h"
+#include "run-command.h"
 #include "object-store.h"
 #include "object.h"
 #include "tag.h"
@@ -16,6 +17,7 @@
 #include "worktree.h"
 #include "argv-array.h"
 #include "repository.h"
+#include "sigchain.h"
 
 /*
  * List of all available backends
@@ -1986,10 +1988,61 @@ int ref_update_reject_duplicates(struct string_list *refnames,
 	return 0;
 }
 
+static int run_transaction_hook(struct ref_transaction *transaction,
+				const char *state)
+{
+	struct child_process proc = CHILD_PROCESS_INIT;
+	struct strbuf buf = STRBUF_INIT;
+	int saved_errno = 0, ret, i;
+	const char *hook;
+
+	hook = find_hook("reference-transaction");
+	if (!hook)
+		return 0;
+
+	argv_array_pushl(&proc.args, hook, state, NULL);
+	proc.in = -1;
+	proc.stdout_to_stderr = 1;
+	proc.trace2_hook_name = hook;
+
+	ret = start_command(&proc);
+	if (ret)
+		return ret;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "%s %s %s\n",
+			    oid_to_hex(&update->old_oid),
+			    oid_to_hex(&update->new_oid),
+			    update->refname);
+
+		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
+			if (errno != EPIPE)
+				saved_errno = errno;
+			break;
+		}
+	}
+
+	close(proc.in);
+	sigchain_pop(SIGPIPE);
+	strbuf_release(&buf);
+
+	ret = finish_command(&proc);
+	if (ret)
+		return ret;
+
+	return saved_errno;
+}
+
 int ref_transaction_prepare(struct ref_transaction *transaction,
 			    struct strbuf *err)
 {
 	struct ref_store *refs = transaction->ref_store;
+	int ret;
 
 	switch (transaction->state) {
 	case REF_TRANSACTION_OPEN:
@@ -2012,7 +2065,17 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
 		return -1;
 	}
 
-	return refs->be->transaction_prepare(refs, transaction, err);
+	ret = refs->be->transaction_prepare(refs, transaction, err);
+	if (ret)
+		return ret;
+
+	ret = run_transaction_hook(transaction, "prepared");
+	if (ret) {
+		ref_transaction_abort(transaction, err);
+		die(_("ref updates aborted by hook"));
+	}
+
+	return 0;
 }
 
 int ref_transaction_abort(struct ref_transaction *transaction,
@@ -2036,6 +2099,8 @@ int ref_transaction_abort(struct ref_transaction *transaction,
 		break;
 	}
 
+	run_transaction_hook(transaction, "aborted");
+
 	ref_transaction_free(transaction);
 	return ret;
 }
@@ -2064,7 +2129,10 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		break;
 	}
 
-	return refs->be->transaction_finish(refs, transaction, err);
+	ret = refs->be->transaction_finish(refs, transaction, err);
+	if (!ret)
+		run_transaction_hook(transaction, "committed");
+	return ret;
 }
 
 int refs_verify_refname_available(struct ref_store *refs,
diff --git a/t/perf/p1400-update-ref.sh b/t/perf/p1400-update-ref.sh
new file mode 100755
index 0000000000..4f4519529e
--- /dev/null
+++ b/t/perf/p1400-update-ref.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+
+test_description="Tests performance of update-ref"
+
+. ./perf-lib.sh
+
+test_perf_fresh_repo
+
+test_expect_success "setup" '
+	test_commit PRE &&
+	test_commit POST &&
+	git branch update-branch
+'
+
+test_perf "update existing reference" '
+	for i in $(test_seq 1000)
+	do
+		git update-ref refs/heads/update-branch PRE POST &&
+		git update-ref refs/heads/update-branch POST PRE
+	done
+'
+
+test_perf "create and destroy reference" '
+	for i in $(test_seq 1000)
+	do
+		git update-ref refs/heads/new POST
+		git update-ref -d refs/heads/new
+	done
+'
+
+test_done
diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
new file mode 100755
index 0000000000..da58d867a5
--- /dev/null
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -0,0 +1,109 @@
+#!/bin/sh
+
+test_description='reference transaction hooks'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	mkdir -p .git/hooks &&
+	test_commit PRE &&
+	test_commit POST &&
+	POST_OID=$(git rev-parse POST)
+'
+
+test_expect_success 'hook allows updating ref if successful' '
+	test_when_finished "rm .git/hooks/reference-transaction" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		echo "$*" >>actual
+	EOF
+	cat >expect <<-EOF &&
+		prepared
+		committed
+	EOF
+	git update-ref HEAD POST &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hook aborts updating ref in prepared state' '
+	test_when_finished "rm .git/hooks/reference-transaction" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = prepared
+		then
+			exit 1
+		fi
+	EOF
+	test_must_fail git update-ref HEAD POST 2>err &&
+	test_i18ngrep "ref updates aborted by hook" err
+'
+
+test_expect_success 'hook gets all queued updates in prepared state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = prepared
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref HEAD POST <<-EOF &&
+		update HEAD $ZERO_OID $POST_OID
+		update refs/heads/master $ZERO_OID $POST_OID
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'hook gets all queued updates in committed state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = committed
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref HEAD POST &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hook gets all queued updates in aborted state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = aborted
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref --stdin <<-EOF &&
+		start
+		update HEAD POST $ZERO_OID
+		update refs/heads/master POST $ZERO_OID
+		abort
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
2.27.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] refs: implement reference transaction hook
  2020-06-03 12:27 ` [PATCH v2] refs: implement reference transaction hook Patrick Steinhardt
@ 2020-06-03 16:51   ` Taylor Blau
  2020-06-04  7:36     ` Patrick Steinhardt
  2020-06-03 17:44   ` Han-Wen Nienhuys
  1 sibling, 1 reply; 26+ messages in thread
From: Taylor Blau @ 2020-06-03 16:51 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, SZEDER Gábor, Junio C Hamano

Hi Patrick,

On Wed, Jun 03, 2020 at 02:27:50PM +0200, Patrick Steinhardt wrote:
> In order to test the impact on the case where we don't have any
> "reference-transaction" hook installed in the repository, this commit
> introduces a new performance test for git-update-refs(1). Run against an
> empty repository, it produces the following results:
>
>   Test                                   HEAD~             HEAD
>   ------------------------------------------------------------------------------
>   1400.2: update existing reference      2.05(1.58+0.54)   2.08(1.58+0.57) +1.5%
>   1400.3: create and destroy reference   1.79(1.38+0.49)   1.82(1.39+0.51) +1.7%
>
> So the overhead is around ~1.5%. Given that git-update-refs(1) is a
> near-direct wrapper around reference transactions, there likely is no
> other command that is impacted much worse than this.

This is a serious performance regression that is worth considering. For
example, a 1.5% slow-down on average in reference transactions would
cause be enough for me to bisect the regression down to see what
changed.

What are ways that this could be avoided? I bet that this would work
quite well with Emily's work on hooks, where we could check in the
configuration first whether a hook is even configured.

Could this be integrated with that? If not, could you cache the result
of whether or not the hook exists, and/or implement some mechanism to
say something like, for eg., "only run reference transaction hooks
core.blah = 1" is true?

I haven't myself looked seriously at your patch (although I do plan on
doing so, but haven't yet had time), but I think that this should be
considered thoroughly before moving forward.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] refs: implement reference transaction hook
  2020-06-03 12:27 ` [PATCH v2] refs: implement reference transaction hook Patrick Steinhardt
  2020-06-03 16:51   ` Taylor Blau
@ 2020-06-03 17:44   ` Han-Wen Nienhuys
  2020-06-03 18:03     ` Junio C Hamano
  1 sibling, 1 reply; 26+ messages in thread
From: Han-Wen Nienhuys @ 2020-06-03 17:44 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, SZEDER Gábor, Junio C Hamano

On Wed, Jun 3, 2020 at 2:27 PM Patrick Steinhardt <ps@pks.im> wrote:
> The low-level reference transactions used to update references are
> currently completely opaque to the user. While certainly desirable in
> most usecases, there are some which might want to hook into the
> transaction to observe all queued reference updates as well as observing
> the abortion or commit of a prepared transaction.

I have an alternate suggestion, but it would depend on having
reftable.  In reftable, the transaction is

  0) create tables.list.lock
  1) write out new reftable(s),
  2) renaming tables.list.lock to tables.list.

If you insert a hook between 1) and 2), the hook would not need to be
supplied with any data, so it would have minimal performance impact.
If the hook runs, it can compare tables.list and tables.list.lock,
read out the new reftables, and make decisions based on that. Or
alternatively, the transaction could be passed a list of reftables,
that the hook could then process at its leisure.

> +For each reference update that was added to the transaction, the hook
> +receives on standard input a line of the format:
> +
> +  <old-value> SP <new-value> SP <ref-name> LF

Does this work for symrefs as well? Should it?

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] refs: implement reference transaction hook
  2020-06-03 17:44   ` Han-Wen Nienhuys
@ 2020-06-03 18:03     ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2020-06-03 18:03 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Patrick Steinhardt, git, SZEDER Gábor

Han-Wen Nienhuys <hanwen@google.com> writes:

>> +For each reference update that was added to the transaction, the hook
>> +receives on standard input a line of the format:
>> +
>> +  <old-value> SP <new-value> SP <ref-name> LF
>
> Does this work for symrefs as well? Should it?

That's a good question.

I am assuming that you are asking about updating the value of the
underlying ref the symbolic ref points at (i.e. <ref-name> would be
"HEAD" and old- and new-value are commits at the tip of the branch
that is currently checked out), and not about repointing the
symbolic ref to a different underlying ref.  I suspect we do not
want to.  In such a case, we'd have a separate "refs/heads/master
moved from old to new" record anyway, so we probably should not even
talk about symbolic refs when we report "a ref got updated to point
a different object" in the above format.

And obviously the above format cannot talk about repointing HEAD to
a different branch (well, it can in a sense that old and new branch
both would have some object name that can be filled in the old- and
new-value fields, but the record misses the most interesting part of
the "branch switching" without the actual old and new refnames), so
there must be another format that allows us to report such an event.

    "Sym" SP <old-refname> SP <new-refname> SP <symbolic-ref-name> LF

or something like that, so that

	$ git checkout next

may yield

    "Sym" SP "refs/heads/master" SP "refs/heads/next" SP "HEAD"

if we originally were on the 'master' branch.  Obviously we need a
way to represent different kinds of transition, like (1) a new
symbolic ref has been created, pointing at some underlying ref
(which may or may not exist), (2) an existing ref that pointed
directly at an object now has become a symbolic ref, (3) a symbolic
ref stopped pointing at a ref and now points directly at an object,
and (4) a symbolic ref has been removed.

Both (1) and (4) would be a trivial variation of the above format, I
guess, using some special token to denote void (say, we'd use an
empty string for that):

    "Sym" "" SP <new-refname> SP <symbolic-ref-name> LF
    "Sym" <old-refname> SP "" SP <symbolic-ref-name> LF

(2) and (3) are interesting.  We could represent them as two
separate operations inside the same transaction, e.g.

    "Sym" "refs/heads/master" SP "" SP "HEAD" LF
    "0{40}" SP "b3d7a52fac39193503a0b6728771d1bf6a161464" SP "HEADD" LF

or we'd need to invent different type of record to represent such
"detaching" and "reattaching" events.


    




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] refs: implement reference transaction hook
  2020-06-03 16:51   ` Taylor Blau
@ 2020-06-04  7:36     ` Patrick Steinhardt
  2020-06-15 16:46       ` Taylor Blau
  0 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-04  7:36 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, SZEDER Gábor, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 5142 bytes --]

On Wed, Jun 03, 2020 at 10:51:42AM -0600, Taylor Blau wrote:
> Hi Patrick,
> 
> On Wed, Jun 03, 2020 at 02:27:50PM +0200, Patrick Steinhardt wrote:
> > In order to test the impact on the case where we don't have any
> > "reference-transaction" hook installed in the repository, this commit
> > introduces a new performance test for git-update-refs(1). Run against an
> > empty repository, it produces the following results:
> >
> >   Test                                   HEAD~             HEAD
> >   ------------------------------------------------------------------------------
> >   1400.2: update existing reference      2.05(1.58+0.54)   2.08(1.58+0.57) +1.5%
> >   1400.3: create and destroy reference   1.79(1.38+0.49)   1.82(1.39+0.51) +1.7%
> >
> > So the overhead is around ~1.5%. Given that git-update-refs(1) is a
> > near-direct wrapper around reference transactions, there likely is no
> > other command that is impacted much worse than this.
> 
> This is a serious performance regression that is worth considering. For
> example, a 1.5% slow-down on average in reference transactions would
> cause be enough for me to bisect the regression down to see what
> changed.
> 
> What are ways that this could be avoided? I bet that this would work
> quite well with Emily's work on hooks, where we could check in the
> configuration first whether a hook is even configured.
> 
> Could this be integrated with that? If not, could you cache the result
> of whether or not the hook exists, and/or implement some mechanism to
> say something like, for eg., "only run reference transaction hooks
> core.blah = 1" is true?

I wasn't aware of her work until now, so I'll take a look.

Meanwhile, I toyed around with performance and tried out two different
caching mechanisms:

    - map-cache: `find_hook()` uses a map of hook names mapped to their
      resolved hook path (or `NULL` if none was found). This is a
      generic mechanism working across all hooks, but has some overhead
      in maintaining the map.

    - reftx-cache: `run_transaction_hook()` caches the path returned by
      `find_hook()`. It's got less overhead as it only caches the path,
      but obviously only applies to the reference-transaction hook.

In theory, we could go even further and cache the hook's file
descriptor, executing it via fexecve(3P) whenever it's required, but I
didn't go down that route yet. This would also solve the atomicity
issue, though, if the administrator replaces the reference-transactions
hook halfway through the transaction.

Benchmarking results are mixed, mostly in the sense that I can choose
which run of the benchmarks I take in order to have my own work look
better or worse, as timings vary quite a lot between runs. Which
probably hints at the fact that the benchmarks themselves aren't really
reliable. The issue is that a single git-update-ref(1) run finishes so
quick that it's hard to measure with our benchmarks, but spawning
thousands of them to get something different than 0.0s very much depends
on the operating system and thus fluctuates. On the other hand, if we
only spawn a single git-update-refs and have it perform a few thousand
ref updates, overhead of the hook will not show at all as it is only
executed once per prepare/commit of the transaction.

The following timings are for the case where we execute

    git update-ref refs/heads/update-branch PRE POST &&
    git update-ref refs/heads/update-branch POST PRE

respectively

    git update-ref refs/heads/new POST &&
    git update-ref -d refs/heads/new

a thousand times:

Test                                   master            ref-hook-no-cache       ref-hook-map-cache      ref-hook-reftx-cache
------------------------------------------------------------------------------------------------------------------------------
1400.2: update existing reference      1.96(1.50+0.53)   2.00(1.54+0.53) +2.0%   2.02(1.54+0.55) +3.1%   1.98(1.52+0.52) +1.0%
1400.3: create and destroy reference   1.74(1.33+0.49)   1.77(1.38+0.47) +1.7%   1.77(1.36+0.48) +1.7%   1.76(1.35+0.49) +1.1%

For such a short-lived program like git-update-refs(1), one can see that
the overhead of using a map negatively impacts performance compared to
the no-cache case. But using the reftx-cache roughly cuts the overhead
in half as expected, as we only need to look up the hook once instead of
twice.

And here's the results if we use a single `git update-ref --stdin` with a
thousand reference updates at once:

Test                             master            ref-hook-no-cache       ref-hook-map-cache      ref-hook-reftx-cache
------------------------------------------------------------------------------------------------------------------------
1400.2: git update-ref --stdin   0.21(0.09+0.12)   0.21(0.07+0.14) +0.0%   0.21(0.07+0.13) +0.0%   0.21(0.07+0.13) +0.0%

As expected, there's nothing much to be seen here because there's only a
single transaction for all ref updates and, as a result, at most two
calls to `access(refhook_path, X_OK)`.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] refs: implement reference transaction hooks
  2020-06-03 11:26   ` Patrick Steinhardt
@ 2020-06-07 20:12     ` SZEDER Gábor
  2020-06-08  5:36       ` Patrick Steinhardt
  0 siblings, 1 reply; 26+ messages in thread
From: SZEDER Gábor @ 2020-06-07 20:12 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, git

On Wed, Jun 03, 2020 at 01:26:04PM +0200, Patrick Steinhardt wrote:
> On Tue, Jun 02, 2020 at 10:47:55AM -0700, Junio C Hamano wrote:
> > Patrick Steinhardt <ps@pks.im> writes:
> > 
> > > The above scenario is the motivation for a set of three new hooks that
> > > reach directly into Git's reference transaction. Each of the following
> > > new hooks (currently) doesn't accept any parameters and receives the set
> > > of queued reference updates via stdin:
> > 
> > Do we have something (e.g. performance measurement) to convince
> > ourselves that this won't incur unacceptable levels of overhead in
> > null cases where there is no hook installed in the repository?
> 
> Not yet, but I'll try to come up with a benchmark in the next iteration.
> I guess the best way to test is to directly exercise git-update-refs, as
> it's nearly a direct wrapper around reference transactions.
> 
> > > +	proc.in = -1;
> > > +	proc.stdout_to_stderr = 1;
> > > +	proc.trace2_hook_name = hook_name;
> > > +
> > > +	code = start_command(&proc);
> > > +	if (code)
> > > +		return code;
> > > +
> > > +	sigchain_push(SIGPIPE, SIG_IGN);
> > > +
> > > +	for (i = 0; i < transaction->nr; i++) {
> > > +		struct ref_update *update = transaction->updates[i];
> > > +
> > > +		strbuf_reset(&buf);
> > > +		strbuf_addf(&buf, "%s %s %s\n",
> > > +			    oid_to_hex(&update->old_oid),
> > > +			    oid_to_hex(&update->new_oid),
> > > +			    update->refname);
> > > +
> > > +		if (write_in_full(proc.in, buf.buf, buf.len) < 0)
> > > +			break;
> > 
> > We leave the loop early when we detect a write failure here...
> > 
> > > +	}
> > > +
> > > +	close(proc.in);
> > > +	sigchain_pop(SIGPIPE);
> > > +
> > > +	strbuf_release(&buf);
> > > +	return finish_command(&proc);
> > 
> > ... but the caller does not get notified.  Intended?
> 
> This is semi-intended. In case the hook doesn't fully consume stdin and
> exits early, writing to its stdin would fail as we ignore SIGPIPE. We
> don't want to force the hook to care about consuming all of stdin,
> though.

Why?  How could the prepared hook properly initialize the voting
mechanism for the transaction without reading all the refs to be
updated?

> We could improve error handling here by ignoring EPIPE, but making every
> other write error fatal. If there's any other abnormal error condition
> then we certainly don't want the hook to act on incomplete data and
> pretend everything's fine.

As I read v2 of this patch, a prepared hook can exit(0) early without
reading all the refs to be updated, cause EPIPE in the git process
invoking the hook, and that process would interpret that as success.
I haven't though it through how such a voting mechanism would work,
but I have a gut feeling that this can't be good.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] refs: implement reference transaction hooks
  2020-06-07 20:12     ` SZEDER Gábor
@ 2020-06-08  5:36       ` Patrick Steinhardt
  0 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-08  5:36 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 4210 bytes --]

On Sun, Jun 07, 2020 at 10:12:33PM +0200, SZEDER Gábor wrote:
> On Wed, Jun 03, 2020 at 01:26:04PM +0200, Patrick Steinhardt wrote:
> > On Tue, Jun 02, 2020 at 10:47:55AM -0700, Junio C Hamano wrote:
> > > Patrick Steinhardt <ps@pks.im> writes:
> > > 
> > > > The above scenario is the motivation for a set of three new hooks that
> > > > reach directly into Git's reference transaction. Each of the following
> > > > new hooks (currently) doesn't accept any parameters and receives the set
> > > > of queued reference updates via stdin:
> > > 
> > > Do we have something (e.g. performance measurement) to convince
> > > ourselves that this won't incur unacceptable levels of overhead in
> > > null cases where there is no hook installed in the repository?
> > 
> > Not yet, but I'll try to come up with a benchmark in the next iteration.
> > I guess the best way to test is to directly exercise git-update-refs, as
> > it's nearly a direct wrapper around reference transactions.
> > 
> > > > +	proc.in = -1;
> > > > +	proc.stdout_to_stderr = 1;
> > > > +	proc.trace2_hook_name = hook_name;
> > > > +
> > > > +	code = start_command(&proc);
> > > > +	if (code)
> > > > +		return code;
> > > > +
> > > > +	sigchain_push(SIGPIPE, SIG_IGN);
> > > > +
> > > > +	for (i = 0; i < transaction->nr; i++) {
> > > > +		struct ref_update *update = transaction->updates[i];
> > > > +
> > > > +		strbuf_reset(&buf);
> > > > +		strbuf_addf(&buf, "%s %s %s\n",
> > > > +			    oid_to_hex(&update->old_oid),
> > > > +			    oid_to_hex(&update->new_oid),
> > > > +			    update->refname);
> > > > +
> > > > +		if (write_in_full(proc.in, buf.buf, buf.len) < 0)
> > > > +			break;
> > > 
> > > We leave the loop early when we detect a write failure here...
> > > 
> > > > +	}
> > > > +
> > > > +	close(proc.in);
> > > > +	sigchain_pop(SIGPIPE);
> > > > +
> > > > +	strbuf_release(&buf);
> > > > +	return finish_command(&proc);
> > > 
> > > ... but the caller does not get notified.  Intended?
> > 
> > This is semi-intended. In case the hook doesn't fully consume stdin and
> > exits early, writing to its stdin would fail as we ignore SIGPIPE. We
> > don't want to force the hook to care about consuming all of stdin,
> > though.
> 
> Why?  How could the prepared hook properly initialize the voting
> mechanism for the transaction without reading all the refs to be
> updated?

Because the hook might not want to implement a voting mechanism after
all but something entirely different which we're currently not
foreseeing as a valid usecase. We don't enforce this anywhere else
either, like e.g. for the pre-receive hook. If that one exits early
without consuming its stdin then that's totally fine.

> > We could improve error handling here by ignoring EPIPE, but making every
> > other write error fatal. If there's any other abnormal error condition
> > then we certainly don't want the hook to act on incomplete data and
> > pretend everything's fine.
> 
> As I read v2 of this patch, a prepared hook can exit(0) early without
> reading all the refs to be updated, cause EPIPE in the git process
> invoking the hook, and that process would interpret that as success.
> I haven't though it through how such a voting mechanism would work,
> but I have a gut feeling that this can't be good.

As said, I lean towards allowing more flexibility for the hook
implementation to also cater for other usecases. But I agree that in a
voting implementation, not reading all of stdin is a bad thing and may
point to a buggy hook implementation. Aborting the transaction if the
hook didn't read all of stdin would be a nice safeguard in that case.

With the current implementation of using a single hook for "prepared",
"committed" and "aborted", it'd also force the hook implementation to do
something in cases it doesn't care about. E.g.

    #!/bin/sh
    case "$1" in
        prepared)
            VOTE=$(sha1sum <&0)
            cast $VOTE
            ;;
        aborted|committed)
            cat <&0 >/dev/null
            ;;
    esac

That being said, I'm not opposed to enforce this and not treat EPIPE
differently.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] refs: implement reference transaction hook
  2020-06-04  7:36     ` Patrick Steinhardt
@ 2020-06-15 16:46       ` Taylor Blau
  2020-06-16  5:45         ` Patrick Steinhardt
  0 siblings, 1 reply; 26+ messages in thread
From: Taylor Blau @ 2020-06-15 16:46 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Taylor Blau, git, SZEDER Gábor, Junio C Hamano

Hi Patrick,

Sorry for the slow response. I was out of the office last week and am
only just now getting a chance to catch up on the backlog of emails that
I missed.

On Thu, Jun 04, 2020 at 09:36:32AM +0200, Patrick Steinhardt wrote:
> On Wed, Jun 03, 2020 at 10:51:42AM -0600, Taylor Blau wrote:
> > Hi Patrick,
> >
> > On Wed, Jun 03, 2020 at 02:27:50PM +0200, Patrick Steinhardt wrote:
> > > In order to test the impact on the case where we don't have any
> > > "reference-transaction" hook installed in the repository, this commit
> > > introduces a new performance test for git-update-refs(1). Run against an
> > > empty repository, it produces the following results:
> > >
> > >   Test                                   HEAD~             HEAD
> > >   ------------------------------------------------------------------------------
> > >   1400.2: update existing reference      2.05(1.58+0.54)   2.08(1.58+0.57) +1.5%
> > >   1400.3: create and destroy reference   1.79(1.38+0.49)   1.82(1.39+0.51) +1.7%
> > >
> > > So the overhead is around ~1.5%. Given that git-update-refs(1) is a
> > > near-direct wrapper around reference transactions, there likely is no
> > > other command that is impacted much worse than this.
> >
> > This is a serious performance regression that is worth considering. For
> > example, a 1.5% slow-down on average in reference transactions would
> > cause be enough for me to bisect the regression down to see what
> > changed.
> >
> > What are ways that this could be avoided? I bet that this would work
> > quite well with Emily's work on hooks, where we could check in the
> > configuration first whether a hook is even configured.
> >
> > Could this be integrated with that? If not, could you cache the result
> > of whether or not the hook exists, and/or implement some mechanism to
> > say something like, for eg., "only run reference transaction hooks
> > core.blah = 1" is true?
>
> I wasn't aware of her work until now, so I'll take a look.
>
> Meanwhile, I toyed around with performance and tried out two different
> caching mechanisms:
>
>     - map-cache: `find_hook()` uses a map of hook names mapped to their
>       resolved hook path (or `NULL` if none was found). This is a
>       generic mechanism working across all hooks, but has some overhead
>       in maintaining the map.
>
>     - reftx-cache: `run_transaction_hook()` caches the path returned by
>       `find_hook()`. It's got less overhead as it only caches the path,
>       but obviously only applies to the reference-transaction hook.
>
> In theory, we could go even further and cache the hook's file
> descriptor, executing it via fexecve(3P) whenever it's required, but I
> didn't go down that route yet. This would also solve the atomicity
> issue, though, if the administrator replaces the reference-transactions
> hook halfway through the transaction.
>
> Benchmarking results are mixed, mostly in the sense that I can choose
> which run of the benchmarks I take in order to have my own work look
> better or worse, as timings vary quite a lot between runs. Which
> probably hints at the fact that the benchmarks themselves aren't really
> reliable. The issue is that a single git-update-ref(1) run finishes so
> quick that it's hard to measure with our benchmarks, but spawning
> thousands of them to get something different than 0.0s very much depends
> on the operating system and thus fluctuates. On the other hand, if we
> only spawn a single git-update-refs and have it perform a few thousand
> ref updates, overhead of the hook will not show at all as it is only
> executed once per prepare/commit of the transaction.
>
> The following timings are for the case where we execute
>
>     git update-ref refs/heads/update-branch PRE POST &&
>     git update-ref refs/heads/update-branch POST PRE
>
> respectively
>
>     git update-ref refs/heads/new POST &&
>     git update-ref -d refs/heads/new
>
> a thousand times:
>
> Test                                   master            ref-hook-no-cache       ref-hook-map-cache      ref-hook-reftx-cache
> ------------------------------------------------------------------------------------------------------------------------------
> 1400.2: update existing reference      1.96(1.50+0.53)   2.00(1.54+0.53) +2.0%   2.02(1.54+0.55) +3.1%   1.98(1.52+0.52) +1.0%
> 1400.3: create and destroy reference   1.74(1.33+0.49)   1.77(1.38+0.47) +1.7%   1.77(1.36+0.48) +1.7%   1.76(1.35+0.49) +1.1%

Huh. It is super interesting (to me, at least) that caching the set of
hooks that are on disk and should be run makes this *slower* than
without the cache. What's going on there? In p1400.2, I'd expect to see
'ref-hook-map-cache' attain at most a 2.0% slow-down, plus a little bit
to process the hook when it is there, but not much more than that.

I think that this just seems a little contrived to me. I can understand
why server administrators may want this feature, but the general
user-base of Git doesn't seem to stand to benefit much from this change
(in my own mind, at least).

So... I'm not sure where this leaves us. Maybe a 2.0% slow-down on an
already fast 'git update-ref' invocation for the average user won't be
noticeable. It certainly *will* be noticeable to administrators who
processes a much higher volume of such transactions.

> For such a short-lived program like git-update-refs(1), one can see that
> the overhead of using a map negatively impacts performance compared to
> the no-cache case. But using the reftx-cache roughly cuts the overhead
> in half as expected, as we only need to look up the hook once instead of
> twice.
>
> And here's the results if we use a single `git update-ref --stdin` with a
> thousand reference updates at once:
>
> Test                             master            ref-hook-no-cache       ref-hook-map-cache      ref-hook-reftx-cache
> ------------------------------------------------------------------------------------------------------------------------
> 1400.2: git update-ref --stdin   0.21(0.09+0.12)   0.21(0.07+0.14) +0.0%   0.21(0.07+0.13) +0.0%   0.21(0.07+0.13) +0.0%
>
> As expected, there's nothing much to be seen here because there's only a
> single transaction for all ref updates and, as a result, at most two
> calls to `access(refhook_path, X_OK)`.

Better, but I have to imagine that real-world usage will look much more
like a thousand tiny transactions than one large one.

> Patrick

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] refs: implement reference transaction hook
  2020-06-15 16:46       ` Taylor Blau
@ 2020-06-16  5:45         ` Patrick Steinhardt
  0 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-16  5:45 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, SZEDER Gábor, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 8236 bytes --]

On Mon, Jun 15, 2020 at 10:46:39AM -0600, Taylor Blau wrote:
> Hi Patrick,
> 
> Sorry for the slow response. I was out of the office last week and am
> only just now getting a chance to catch up on the backlog of emails that
> I missed.

No worries!

> On Thu, Jun 04, 2020 at 09:36:32AM +0200, Patrick Steinhardt wrote:
> > On Wed, Jun 03, 2020 at 10:51:42AM -0600, Taylor Blau wrote:
> > > Hi Patrick,
> > >
> > > On Wed, Jun 03, 2020 at 02:27:50PM +0200, Patrick Steinhardt wrote:
> > > > In order to test the impact on the case where we don't have any
> > > > "reference-transaction" hook installed in the repository, this commit
> > > > introduces a new performance test for git-update-refs(1). Run against an
> > > > empty repository, it produces the following results:
> > > >
> > > >   Test                                   HEAD~             HEAD
> > > >   ------------------------------------------------------------------------------
> > > >   1400.2: update existing reference      2.05(1.58+0.54)   2.08(1.58+0.57) +1.5%
> > > >   1400.3: create and destroy reference   1.79(1.38+0.49)   1.82(1.39+0.51) +1.7%
> > > >
> > > > So the overhead is around ~1.5%. Given that git-update-refs(1) is a
> > > > near-direct wrapper around reference transactions, there likely is no
> > > > other command that is impacted much worse than this.
> > >
> > > This is a serious performance regression that is worth considering. For
> > > example, a 1.5% slow-down on average in reference transactions would
> > > cause be enough for me to bisect the regression down to see what
> > > changed.
> > >
> > > What are ways that this could be avoided? I bet that this would work
> > > quite well with Emily's work on hooks, where we could check in the
> > > configuration first whether a hook is even configured.
> > >
> > > Could this be integrated with that? If not, could you cache the result
> > > of whether or not the hook exists, and/or implement some mechanism to
> > > say something like, for eg., "only run reference transaction hooks
> > > core.blah = 1" is true?
> >
> > I wasn't aware of her work until now, so I'll take a look.
> >
> > Meanwhile, I toyed around with performance and tried out two different
> > caching mechanisms:
> >
> >     - map-cache: `find_hook()` uses a map of hook names mapped to their
> >       resolved hook path (or `NULL` if none was found). This is a
> >       generic mechanism working across all hooks, but has some overhead
> >       in maintaining the map.
> >
> >     - reftx-cache: `run_transaction_hook()` caches the path returned by
> >       `find_hook()`. It's got less overhead as it only caches the path,
> >       but obviously only applies to the reference-transaction hook.
> >
> > In theory, we could go even further and cache the hook's file
> > descriptor, executing it via fexecve(3P) whenever it's required, but I
> > didn't go down that route yet. This would also solve the atomicity
> > issue, though, if the administrator replaces the reference-transactions
> > hook halfway through the transaction.
> >
> > Benchmarking results are mixed, mostly in the sense that I can choose
> > which run of the benchmarks I take in order to have my own work look
> > better or worse, as timings vary quite a lot between runs. Which
> > probably hints at the fact that the benchmarks themselves aren't really
> > reliable. The issue is that a single git-update-ref(1) run finishes so
> > quick that it's hard to measure with our benchmarks, but spawning
> > thousands of them to get something different than 0.0s very much depends
> > on the operating system and thus fluctuates. On the other hand, if we
> > only spawn a single git-update-refs and have it perform a few thousand
> > ref updates, overhead of the hook will not show at all as it is only
> > executed once per prepare/commit of the transaction.
> >
> > The following timings are for the case where we execute
> >
> >     git update-ref refs/heads/update-branch PRE POST &&
> >     git update-ref refs/heads/update-branch POST PRE
> >
> > respectively
> >
> >     git update-ref refs/heads/new POST &&
> >     git update-ref -d refs/heads/new
> >
> > a thousand times:
> >
> > Test                                   master            ref-hook-no-cache       ref-hook-map-cache      ref-hook-reftx-cache
> > ------------------------------------------------------------------------------------------------------------------------------
> > 1400.2: update existing reference      1.96(1.50+0.53)   2.00(1.54+0.53) +2.0%   2.02(1.54+0.55) +3.1%   1.98(1.52+0.52) +1.0%
> > 1400.3: create and destroy reference   1.74(1.33+0.49)   1.77(1.38+0.47) +1.7%   1.77(1.36+0.48) +1.7%   1.76(1.35+0.49) +1.1%
> 
> Huh. It is super interesting (to me, at least) that caching the set of
> hooks that are on disk and should be run makes this *slower* than
> without the cache. What's going on there? In p1400.2, I'd expect to see
> 'ref-hook-map-cache' attain at most a 2.0% slow-down, plus a little bit
> to process the hook when it is there, but not much more than that.

I think the issue is that a single git-update-ref(1) invocation does so
little work that allocating the hashmap and inserting the hook already
has noticeable impact on the program's runtime. E.g. in above
benchmark, a single call to git-update-ref in p1400.2 takes roughly
0.002s on my machine. You also see this by the fact that doing a single
stat(3P) call as introduced by my patch adds a 1% performance penalty
already, and with the map cache we still have to do this single stat(3P)
call in addition to the dynamic memory allocations for the map and
insertion of the hook.

> I think that this just seems a little contrived to me. I can understand
> why server administrators may want this feature, but the general
> user-base of Git doesn't seem to stand to benefit much from this change
> (in my own mind, at least).

That's true for several hooks we have, though.

> So... I'm not sure where this leaves us. Maybe a 2.0% slow-down on an
> already fast 'git update-ref' invocation for the average user won't be
> noticeable. It certainly *will* be noticeable to administrators who
> processes a much higher volume of such transactions.

I think we should keep in mind that it's git-update-ref(1) we're talking
about, which is a nearly direct wrapper around reference transactions.
The 1% perfomance hit is thus the worst case that can happen, as there
is no other tool that does as little work around the reftx as this one.
For any other tool, I imagine the performance hit to be at worst the
same (e.g. git-branch(1)) or not noticeable at all because a single
stat(3P) call will be drowned out by other things (e.g. git-clone(1)).

That's not to say that nobody will be impacted by this change, I bet
there are setups that make heavy use of git-update-ref(1).

> > For such a short-lived program like git-update-refs(1), one can see that
> > the overhead of using a map negatively impacts performance compared to
> > the no-cache case. But using the reftx-cache roughly cuts the overhead
> > in half as expected, as we only need to look up the hook once instead of
> > twice.
> >
> > And here's the results if we use a single `git update-ref --stdin` with a
> > thousand reference updates at once:
> >
> > Test                             master            ref-hook-no-cache       ref-hook-map-cache      ref-hook-reftx-cache
> > ------------------------------------------------------------------------------------------------------------------------
> > 1400.2: git update-ref --stdin   0.21(0.09+0.12)   0.21(0.07+0.14) +0.0%   0.21(0.07+0.13) +0.0%   0.21(0.07+0.13) +0.0%
> >
> > As expected, there's nothing much to be seen here because there's only a
> > single transaction for all ref updates and, as a result, at most two
> > calls to `access(refhook_path, X_OK)`.
> 
> Better, but I have to imagine that real-world usage will look much more
> like a thousand tiny transactions than one large one.

Most likely, yes. Doesn't happen too often that one updates multiple
references at once.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3] refs: implement reference transaction hook
  2020-06-02  8:25 [PATCH] refs: implement reference transaction hooks Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2020-06-03 12:27 ` [PATCH v2] refs: implement reference transaction hook Patrick Steinhardt
@ 2020-06-18 10:27 ` Patrick Steinhardt
  2020-06-18 22:23   ` Junio C Hamano
  2020-06-19  6:56 ` [PATCH v4] " Patrick Steinhardt
  4 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-18 10:27 UTC (permalink / raw)
  To: git; +Cc: SZEDER Gábor, Junio C Hamano, Taylor Blau

[-- Attachment #1: Type: text/plain, Size: 13586 bytes --]

The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.

One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.

The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.

Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.

In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduce two new performance tests for git-update-refs(1). Run against
an empty repository, it produces the following results:

  Test                         origin/master     HEAD
  --------------------------------------------------------------------
  1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
  1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%

The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-refs over 3000
invocations. p1400.3 instead calls `git-update-refs --stdin` three times
and queues a thousand creations, updates and deletes respectively.

As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---

There's two changes compared to v2:

    1. I've added lookup cache for the hook that both caches the
       positive as well as the negative lookup of this hook. It's
       specific to the reference-transaction hook, only, and will simply
       store the path to the found hook or a sentinel value in case it
       wasn't found. Like this, we avoid one of the two calls to
       access(3P) when executing git-update-ref(1). This improved
       performance into the range where variance between runs drowns out
       the impact of the single access(3P) call.

    2. I've amend a second benchmark to p1400, where one tests single
       invocations of git-update-ref and the second one uses batched
       invocations via its --stdin switch. The latter doesn't show any
       impact, while the former one ranged from an overhead of -1.5% to
       +1.1%.

I did have a look at integrating this work with Emily's work, but I
don't really think it necessary given that benchmarks show that the
overhead of the access(3P) call is drowned out by the OS anyway. If you
feel differently, I may revisit it and perform some benchmarks on top of
her work.

Patrick

 Documentation/githooks.txt       |  29 ++++++++
 refs.c                           |  79 +++++++++++++++++++++-
 t/perf/p1400-update-ref.sh       |  32 +++++++++
 t/t1416-ref-transaction-hooks.sh | 109 +++++++++++++++++++++++++++++++
 4 files changed, 247 insertions(+), 2 deletions(-)
 create mode 100755 t/perf/p1400-update-ref.sh
 create mode 100755 t/t1416-ref-transaction-hooks.sh

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 81f2a87e88..642471109f 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -404,6 +404,35 @@ Both standard output and standard error output are forwarded to
 `git send-pack` on the other end, so you can simply `echo` messages
 for the user.
 
+ref-transaction
+~~~~~~~~~~~~~~~
+
+This hook is invoked by any Git command that performs reference
+updates. It executes whenever a reference transaction is prepared,
+committed or aborted and may thus get called multiple times.
+
+The hook takes exactly one argument, which is the current state the
+given reference transaction is in:
+
+    - "prepared": All reference updates have been queued to the
+      transaction and references were locked on disk.
+
+    - "committed": The reference transaction was committed and all
+      references now have their respective new value.
+
+    - "aborted": The reference transaction was aborted, no changes
+      were performed and the locks have been released.
+
+For each reference update that was added to the transaction, the hook
+receives on standard input a line of the format:
+
+  <old-value> SP <new-value> SP <ref-name> LF
+
+The exit status of the hook is ignored for any state except for the
+"prepared" state. In the "prepared" state, a non-zero exit status will
+cause the transaction to be aborted. The hook will not be called with
+"aborted" state in that case.
+
 push-to-checkout
 ~~~~~~~~~~~~~~~~
 
diff --git a/refs.c b/refs.c
index 224ff66c7b..c4961f4ad9 100644
--- a/refs.c
+++ b/refs.c
@@ -9,6 +9,7 @@
 #include "iterator.h"
 #include "refs.h"
 #include "refs/refs-internal.h"
+#include "run-command.h"
 #include "object-store.h"
 #include "object.h"
 #include "tag.h"
@@ -16,6 +17,7 @@
 #include "worktree.h"
 #include "argv-array.h"
 #include "repository.h"
+#include "sigchain.h"
 
 /*
  * List of all available backends
@@ -1986,10 +1988,68 @@ int ref_update_reject_duplicates(struct string_list *refnames,
 	return 0;
 }
 
+static const char hook_not_found;
+static const char *hook;
+
+static int run_transaction_hook(struct ref_transaction *transaction,
+				const char *state)
+{
+	struct child_process proc = CHILD_PROCESS_INIT;
+	struct strbuf buf = STRBUF_INIT;
+	int saved_errno = 0, ret, i;
+
+	if (hook == &hook_not_found)
+		return 0;
+	if (!hook)
+		hook = find_hook("reference-transaction");
+	if (!hook) {
+		hook = &hook_not_found;
+		return 0;
+	}
+
+	argv_array_pushl(&proc.args, hook, state, NULL);
+	proc.in = -1;
+	proc.stdout_to_stderr = 1;
+	proc.trace2_hook_name = "reference-transaction";
+
+	ret = start_command(&proc);
+	if (ret)
+		return ret;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "%s %s %s\n",
+			    oid_to_hex(&update->old_oid),
+			    oid_to_hex(&update->new_oid),
+			    update->refname);
+
+		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
+			if (errno != EPIPE)
+				saved_errno = errno;
+			break;
+		}
+	}
+
+	close(proc.in);
+	sigchain_pop(SIGPIPE);
+	strbuf_release(&buf);
+
+	ret = finish_command(&proc);
+	if (ret)
+		return ret;
+
+	return saved_errno;
+}
+
 int ref_transaction_prepare(struct ref_transaction *transaction,
 			    struct strbuf *err)
 {
 	struct ref_store *refs = transaction->ref_store;
+	int ret;
 
 	switch (transaction->state) {
 	case REF_TRANSACTION_OPEN:
@@ -2012,7 +2072,17 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
 		return -1;
 	}
 
-	return refs->be->transaction_prepare(refs, transaction, err);
+	ret = refs->be->transaction_prepare(refs, transaction, err);
+	if (ret)
+		return ret;
+
+	ret = run_transaction_hook(transaction, "prepared");
+	if (ret) {
+		ref_transaction_abort(transaction, err);
+		die(_("ref updates aborted by hook"));
+	}
+
+	return 0;
 }
 
 int ref_transaction_abort(struct ref_transaction *transaction,
@@ -2036,6 +2106,8 @@ int ref_transaction_abort(struct ref_transaction *transaction,
 		break;
 	}
 
+	run_transaction_hook(transaction, "aborted");
+
 	ref_transaction_free(transaction);
 	return ret;
 }
@@ -2064,7 +2136,10 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		break;
 	}
 
-	return refs->be->transaction_finish(refs, transaction, err);
+	ret = refs->be->transaction_finish(refs, transaction, err);
+	if (!ret)
+		run_transaction_hook(transaction, "committed");
+	return ret;
 }
 
 int refs_verify_refname_available(struct ref_store *refs,
diff --git a/t/perf/p1400-update-ref.sh b/t/perf/p1400-update-ref.sh
new file mode 100755
index 0000000000..d275a81248
--- /dev/null
+++ b/t/perf/p1400-update-ref.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description="Tests performance of update-ref"
+
+. ./perf-lib.sh
+
+test_perf_fresh_repo
+
+test_expect_success "setup" '
+	test_commit PRE &&
+	test_commit POST &&
+	printf "create refs/heads/%d PRE\n" $(test_seq 1000) >create &&
+	printf "update refs/heads/%d POST PRE\n" $(test_seq 1000) >update &&
+	printf "delete refs/heads/%d POST\n" $(test_seq 1000) >delete
+'
+
+test_perf "update-ref" '
+	for i in $(test_seq 1000)
+	do
+		git update-ref refs/heads/branch PRE &&
+		git update-ref refs/heads/branch POST PRE &&
+		git update-ref -d refs/heads/branch
+	done
+'
+
+test_perf "update-ref --stdin" '
+	git update-ref --stdin <create &&
+	git update-ref --stdin <update &&
+	git update-ref --stdin <delete
+'
+
+test_done
diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
new file mode 100755
index 0000000000..da58d867a5
--- /dev/null
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -0,0 +1,109 @@
+#!/bin/sh
+
+test_description='reference transaction hooks'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	mkdir -p .git/hooks &&
+	test_commit PRE &&
+	test_commit POST &&
+	POST_OID=$(git rev-parse POST)
+'
+
+test_expect_success 'hook allows updating ref if successful' '
+	test_when_finished "rm .git/hooks/reference-transaction" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		echo "$*" >>actual
+	EOF
+	cat >expect <<-EOF &&
+		prepared
+		committed
+	EOF
+	git update-ref HEAD POST &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hook aborts updating ref in prepared state' '
+	test_when_finished "rm .git/hooks/reference-transaction" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = prepared
+		then
+			exit 1
+		fi
+	EOF
+	test_must_fail git update-ref HEAD POST 2>err &&
+	test_i18ngrep "ref updates aborted by hook" err
+'
+
+test_expect_success 'hook gets all queued updates in prepared state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = prepared
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref HEAD POST <<-EOF &&
+		update HEAD $ZERO_OID $POST_OID
+		update refs/heads/master $ZERO_OID $POST_OID
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'hook gets all queued updates in committed state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = committed
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref HEAD POST &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hook gets all queued updates in aborted state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = aborted
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref --stdin <<-EOF &&
+		start
+		update HEAD POST $ZERO_OID
+		update refs/heads/master POST $ZERO_OID
+		abort
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
2.27.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3] refs: implement reference transaction hook
  2020-06-18 10:27 ` [PATCH v3] " Patrick Steinhardt
@ 2020-06-18 22:23   ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2020-06-18 22:23 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, SZEDER Gábor, Taylor Blau

Patrick Steinhardt <ps@pks.im> writes:

> +static const char hook_not_found;
> +static const char *hook;

;-)  Nice.

> +static int run_transaction_hook(struct ref_transaction *transaction,
> +				const char *state)
> +{
> +	struct child_process proc = CHILD_PROCESS_INIT;
> +	struct strbuf buf = STRBUF_INIT;
> +	int saved_errno = 0, ret, i;
> +...
> +	ret = start_command(&proc);
> +	if (ret)
> +		return ret;
> +
> +	sigchain_push(SIGPIPE, SIG_IGN);
> +
> +	for (i = 0; i < transaction->nr; i++) {
> +		struct ref_update *update = transaction->updates[i];
> + ...
> +		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
> +			if (errno != EPIPE)
> +				saved_errno = errno;
> +			break;
> +		}
> +	}
> +
> +	close(proc.in);
> +	sigchain_pop(SIGPIPE);
> +	strbuf_release(&buf);
> +
> +	ret = finish_command(&proc);
> +	if (ret)
> +		return ret;
> +
> +	return saved_errno;
> +}

OK, the only thing that looked a bit tricky was the "saved_errno"
that is used in an unusual (relative to its name) way.  The use
isn't incorrect per-se, but it rubs readers' expectation the wrong
way to use the variable named saved_errno for any purpose other than
the established pattern:

	saved_errno = errno;
	if (some_libcall_that_may_update_errno()) {
		... deal with an error and perform
		... some fallback action
	}
	errno = saved_errno;

that allows the caller to be oblivious to the library call that is
made as a mere implementation detail whose failure does not matter
to the caller.

In any case, the idea of the code in the patch is to make sure we
remember the fact that we failed to write (or caught any other
error, if we added more calls in the future) in a variable, and make
sure we return an error even if we manage to cleanly call
"finish_command()".  For that purpose, in order to avoid overwriting
the "ret" variable with the returned value from finish_command(), a
separate variable is needed, and "saved_errno" was picked for the
name of the variable.

But I do not think it is a good idea to return the errno in one
codepath when the function can return an error status that is not an
errno that is received from start_command() and finish_command().
The caller of this function cannot (and probably do not want to)
tell what the failed syscall was and would be checking if the return
value was success (=0) or failure (<0).

So I'd rather simplify the error handling to

 - Remove "saved_errno"; instead initialize ret to 0 at the beginning;

 - Return "ret" even if we return hardcoded 0 in the earlier part
   for consistency;

 - Update "ret" in the loop to -1 (the same error return status that
   is returned by start_command() and finish_command()) when we
   found a write error that we are not ignoring before breaking out
   of the loop.

 - We need to call finish_command() even if we earlier saw an error
   once we successfully called start_command().  So do something
   like this:

	ret |= finish_command(&proc);
	return ret;

   to make sure we retain an earlier error in "ret", we
   unconditionally call finish_command() when the control reaches
   there, and we mark the result a failure when finish_command()
   fails.

if I were writing this function.

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4] refs: implement reference transaction hook
  2020-06-02  8:25 [PATCH] refs: implement reference transaction hooks Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2020-06-18 10:27 ` [PATCH v3] " Patrick Steinhardt
@ 2020-06-19  6:56 ` Patrick Steinhardt
  2020-10-23  1:03   ` Jeff King
  4 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2020-06-19  6:56 UTC (permalink / raw)
  To: git; +Cc: SZEDER Gábor, Junio C Hamano, Taylor Blau

[-- Attachment #1: Type: text/plain, Size: 13835 bytes --]

The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.

One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.

The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.

Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.

In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduce two new performance tests for git-update-refs(1). Run against
an empty repository, it produces the following results:

  Test                         origin/master     HEAD
  --------------------------------------------------------------------
  1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
  1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%

The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-refs over 3000
invocations. p1400.3 instead calls `git-update-refs --stdin` three times
and queues a thousand creations, updates and deletes respectively.

As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---

The only change was simplified error handling as proposed by Junio.
Thanks for your feedback, the result definitely looks easier to read to
me.

Range-diff against v3:
1:  1de96b96e3 ! 1:  55c58e9b09 refs: implement reference transaction hook
    @@ refs.c: int ref_update_reject_duplicates(struct string_list *refnames,
     +{
     +	struct child_process proc = CHILD_PROCESS_INIT;
     +	struct strbuf buf = STRBUF_INIT;
    -+	int saved_errno = 0, ret, i;
    ++	int ret = 0, i;
     +
     +	if (hook == &hook_not_found)
    -+		return 0;
    ++		return ret;
     +	if (!hook)
     +		hook = find_hook("reference-transaction");
     +	if (!hook) {
     +		hook = &hook_not_found;
    -+		return 0;
    ++		return ret;
     +	}
     +
     +	argv_array_pushl(&proc.args, hook, state, NULL);
    @@ refs.c: int ref_update_reject_duplicates(struct string_list *refnames,
     +
     +		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
     +			if (errno != EPIPE)
    -+				saved_errno = errno;
    ++				ret = -1;
     +			break;
     +		}
     +	}
    @@ refs.c: int ref_update_reject_duplicates(struct string_list *refnames,
     +	sigchain_pop(SIGPIPE);
     +	strbuf_release(&buf);
     +
    -+	ret = finish_command(&proc);
    -+	if (ret)
    -+		return ret;
    -+
    -+	return saved_errno;
    ++	ret |= finish_command(&proc);
    ++	return ret;
     +}
     +
      int ref_transaction_prepare(struct ref_transaction *transaction,

 Documentation/githooks.txt       |  29 ++++++++
 refs.c                           |  76 ++++++++++++++++++++-
 t/perf/p1400-update-ref.sh       |  32 +++++++++
 t/t1416-ref-transaction-hooks.sh | 109 +++++++++++++++++++++++++++++++
 4 files changed, 244 insertions(+), 2 deletions(-)
 create mode 100755 t/perf/p1400-update-ref.sh
 create mode 100755 t/t1416-ref-transaction-hooks.sh

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 81f2a87e88..642471109f 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -404,6 +404,35 @@ Both standard output and standard error output are forwarded to
 `git send-pack` on the other end, so you can simply `echo` messages
 for the user.
 
+ref-transaction
+~~~~~~~~~~~~~~~
+
+This hook is invoked by any Git command that performs reference
+updates. It executes whenever a reference transaction is prepared,
+committed or aborted and may thus get called multiple times.
+
+The hook takes exactly one argument, which is the current state the
+given reference transaction is in:
+
+    - "prepared": All reference updates have been queued to the
+      transaction and references were locked on disk.
+
+    - "committed": The reference transaction was committed and all
+      references now have their respective new value.
+
+    - "aborted": The reference transaction was aborted, no changes
+      were performed and the locks have been released.
+
+For each reference update that was added to the transaction, the hook
+receives on standard input a line of the format:
+
+  <old-value> SP <new-value> SP <ref-name> LF
+
+The exit status of the hook is ignored for any state except for the
+"prepared" state. In the "prepared" state, a non-zero exit status will
+cause the transaction to be aborted. The hook will not be called with
+"aborted" state in that case.
+
 push-to-checkout
 ~~~~~~~~~~~~~~~~
 
diff --git a/refs.c b/refs.c
index 224ff66c7b..f05295f503 100644
--- a/refs.c
+++ b/refs.c
@@ -9,6 +9,7 @@
 #include "iterator.h"
 #include "refs.h"
 #include "refs/refs-internal.h"
+#include "run-command.h"
 #include "object-store.h"
 #include "object.h"
 #include "tag.h"
@@ -16,6 +17,7 @@
 #include "worktree.h"
 #include "argv-array.h"
 #include "repository.h"
+#include "sigchain.h"
 
 /*
  * List of all available backends
@@ -1986,10 +1988,65 @@ int ref_update_reject_duplicates(struct string_list *refnames,
 	return 0;
 }
 
+static const char hook_not_found;
+static const char *hook;
+
+static int run_transaction_hook(struct ref_transaction *transaction,
+				const char *state)
+{
+	struct child_process proc = CHILD_PROCESS_INIT;
+	struct strbuf buf = STRBUF_INIT;
+	int ret = 0, i;
+
+	if (hook == &hook_not_found)
+		return ret;
+	if (!hook)
+		hook = find_hook("reference-transaction");
+	if (!hook) {
+		hook = &hook_not_found;
+		return ret;
+	}
+
+	argv_array_pushl(&proc.args, hook, state, NULL);
+	proc.in = -1;
+	proc.stdout_to_stderr = 1;
+	proc.trace2_hook_name = "reference-transaction";
+
+	ret = start_command(&proc);
+	if (ret)
+		return ret;
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	for (i = 0; i < transaction->nr; i++) {
+		struct ref_update *update = transaction->updates[i];
+
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, "%s %s %s\n",
+			    oid_to_hex(&update->old_oid),
+			    oid_to_hex(&update->new_oid),
+			    update->refname);
+
+		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
+			if (errno != EPIPE)
+				ret = -1;
+			break;
+		}
+	}
+
+	close(proc.in);
+	sigchain_pop(SIGPIPE);
+	strbuf_release(&buf);
+
+	ret |= finish_command(&proc);
+	return ret;
+}
+
 int ref_transaction_prepare(struct ref_transaction *transaction,
 			    struct strbuf *err)
 {
 	struct ref_store *refs = transaction->ref_store;
+	int ret;
 
 	switch (transaction->state) {
 	case REF_TRANSACTION_OPEN:
@@ -2012,7 +2069,17 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
 		return -1;
 	}
 
-	return refs->be->transaction_prepare(refs, transaction, err);
+	ret = refs->be->transaction_prepare(refs, transaction, err);
+	if (ret)
+		return ret;
+
+	ret = run_transaction_hook(transaction, "prepared");
+	if (ret) {
+		ref_transaction_abort(transaction, err);
+		die(_("ref updates aborted by hook"));
+	}
+
+	return 0;
 }
 
 int ref_transaction_abort(struct ref_transaction *transaction,
@@ -2036,6 +2103,8 @@ int ref_transaction_abort(struct ref_transaction *transaction,
 		break;
 	}
 
+	run_transaction_hook(transaction, "aborted");
+
 	ref_transaction_free(transaction);
 	return ret;
 }
@@ -2064,7 +2133,10 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		break;
 	}
 
-	return refs->be->transaction_finish(refs, transaction, err);
+	ret = refs->be->transaction_finish(refs, transaction, err);
+	if (!ret)
+		run_transaction_hook(transaction, "committed");
+	return ret;
 }
 
 int refs_verify_refname_available(struct ref_store *refs,
diff --git a/t/perf/p1400-update-ref.sh b/t/perf/p1400-update-ref.sh
new file mode 100755
index 0000000000..d275a81248
--- /dev/null
+++ b/t/perf/p1400-update-ref.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description="Tests performance of update-ref"
+
+. ./perf-lib.sh
+
+test_perf_fresh_repo
+
+test_expect_success "setup" '
+	test_commit PRE &&
+	test_commit POST &&
+	printf "create refs/heads/%d PRE\n" $(test_seq 1000) >create &&
+	printf "update refs/heads/%d POST PRE\n" $(test_seq 1000) >update &&
+	printf "delete refs/heads/%d POST\n" $(test_seq 1000) >delete
+'
+
+test_perf "update-ref" '
+	for i in $(test_seq 1000)
+	do
+		git update-ref refs/heads/branch PRE &&
+		git update-ref refs/heads/branch POST PRE &&
+		git update-ref -d refs/heads/branch
+	done
+'
+
+test_perf "update-ref --stdin" '
+	git update-ref --stdin <create &&
+	git update-ref --stdin <update &&
+	git update-ref --stdin <delete
+'
+
+test_done
diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
new file mode 100755
index 0000000000..da58d867a5
--- /dev/null
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -0,0 +1,109 @@
+#!/bin/sh
+
+test_description='reference transaction hooks'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	mkdir -p .git/hooks &&
+	test_commit PRE &&
+	test_commit POST &&
+	POST_OID=$(git rev-parse POST)
+'
+
+test_expect_success 'hook allows updating ref if successful' '
+	test_when_finished "rm .git/hooks/reference-transaction" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		echo "$*" >>actual
+	EOF
+	cat >expect <<-EOF &&
+		prepared
+		committed
+	EOF
+	git update-ref HEAD POST &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hook aborts updating ref in prepared state' '
+	test_when_finished "rm .git/hooks/reference-transaction" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = prepared
+		then
+			exit 1
+		fi
+	EOF
+	test_must_fail git update-ref HEAD POST 2>err &&
+	test_i18ngrep "ref updates aborted by hook" err
+'
+
+test_expect_success 'hook gets all queued updates in prepared state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = prepared
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref HEAD POST <<-EOF &&
+		update HEAD $ZERO_OID $POST_OID
+		update refs/heads/master $ZERO_OID $POST_OID
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'hook gets all queued updates in committed state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = committed
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref HEAD POST &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hook gets all queued updates in aborted state' '
+	test_when_finished "rm .git/hooks/reference-transaction actual" &&
+	git reset --hard PRE &&
+	write_script .git/hooks/reference-transaction <<-\EOF &&
+		if test "$1" = aborted
+		then
+			while read -r line
+			do
+				printf "%s\n" "$line"
+			done >actual
+		fi
+	EOF
+	cat >expect <<-EOF &&
+		$ZERO_OID $POST_OID HEAD
+		$ZERO_OID $POST_OID refs/heads/master
+	EOF
+	git update-ref --stdin <<-EOF &&
+		start
+		update HEAD POST $ZERO_OID
+		update refs/heads/master POST $ZERO_OID
+		abort
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
2.27.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-06-19  6:56 ` [PATCH v4] " Patrick Steinhardt
@ 2020-10-23  1:03   ` Jeff King
  2020-10-23  3:59     ` Junio C Hamano
  2020-10-23 20:00     ` Taylor Blau
  0 siblings, 2 replies; 26+ messages in thread
From: Jeff King @ 2020-10-23  1:03 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, SZEDER Gábor, Junio C Hamano, Taylor Blau

On Fri, Jun 19, 2020 at 08:56:14AM +0200, Patrick Steinhardt wrote:

> The above scenario is the motivation for the new "reference-transaction"
> hook that reaches directly into Git's reference transaction mechanism.
> The hook receives as parameter the current state the transaction was
> moved to ("prepared", "committed" or "aborted") and gets via its
> standard input all queued reference updates. While the exit code gets
> ignored in the "committed" and "aborted" states, a non-zero exit code in
> the "prepared" state will cause the transaction to be aborted
> prematurely.

I know this has been merged for a while, but Taylor and I were looking
at it today and came across a question. The docs say:

> +For each reference update that was added to the transaction, the hook
> +receives on standard input a line of the format:
> +
> +  <old-value> SP <new-value> SP <ref-name> LF

but that doesn't define <old-value>. I take it to mean the current value
of the ref before the proposed update. But in the tests:

> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> new file mode 100755
> index 0000000000..da58d867a5
> --- /dev/null
> +++ b/t/t1416-ref-transaction-hooks.sh
> [...]
> +test_expect_success 'hook gets all queued updates in prepared state' '
> +	test_when_finished "rm .git/hooks/reference-transaction actual" &&
> +	git reset --hard PRE &&
> +	write_script .git/hooks/reference-transaction <<-\EOF &&
> +		if test "$1" = prepared
> +		then
> +			while read -r line
> +			do
> +				printf "%s\n" "$line"
> +			done >actual
> +		fi
> +	EOF
> +	cat >expect <<-EOF &&
> +		$ZERO_OID $POST_OID HEAD
> +		$ZERO_OID $POST_OID refs/heads/master
> +	EOF

We are expecting to see $ZERO_OID in that slot, even though the current
value of the ref is "PRE" due to our reset above.

Should this be $PRE_OID (we don't have that variable, but it would be
the result of "git rev-parse PRE")?

I could alternatively see an argument that <old-value> is the old-value
that the caller asked for. So seeing $ZERO_OID is saying that the caller
wants to move from _anything_ to $POST_OID, and we're not willing to
tell the hook what the current value actually is.

We could actually fill in the current value for zero cost. The reason we
found this is that we have a custom patch at GitHub that fills in these
values when we open the ref after locking.

In real usage, I'm not sure how much the distinction would matter,
because any careful caller would provide a non-zero "old" value. And if
that doesn't match the current value, we'd reject the transaction before
we even hit the hook, I think. It's only the fact that the update-ref
calls are sloppy and do not provide an expected old value that it even
matters.

So I wonder if:

diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
index f6e741c6c0..8155522a1a 100755
--- a/t/t1416-ref-transaction-hooks.sh
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -9,6 +9,7 @@ test_expect_success setup '
 	test_commit PRE &&
 	PRE_OID=$(git rev-parse PRE) &&
 	test_commit POST &&
+	PRE_OID=$(git rev-parse PRE) &&
 	POST_OID=$(git rev-parse POST)
 '
 
@@ -52,10 +53,10 @@ test_expect_success 'hook gets all queued updates in prepared state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
-	git update-ref HEAD POST <<-EOF &&
+	git update-ref HEAD POST POST <<-EOF &&
 		update HEAD $ZERO_OID $POST_OID
 		update refs/heads/master $ZERO_OID $POST_OID
 	EOF
@@ -75,10 +76,10 @@ test_expect_success 'hook gets all queued updates in committed state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
-	git update-ref HEAD POST &&
+	git update-ref HEAD POST PRE &&
 	test_cmp expect actual
 '
 

would be a step forward. This isn't changing the actual behavior,
obviously. It's just tweaking the test so that it tests the more likely
real-world case.  But we'd possibly want to actually change the behavior
to always send the actual ref value to the hook.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-23  1:03   ` Jeff King
@ 2020-10-23  3:59     ` Junio C Hamano
  2020-10-23 19:57       ` Taylor Blau
  2020-10-26  7:43       ` Patrick Steinhardt
  2020-10-23 20:00     ` Taylor Blau
  1 sibling, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2020-10-23  3:59 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, git, SZEDER Gábor, Taylor Blau

Jeff King <peff@peff.net> writes:

> So I wonder if:
>
> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> index f6e741c6c0..8155522a1a 100755
> --- a/t/t1416-ref-transaction-hooks.sh
> +++ b/t/t1416-ref-transaction-hooks.sh
> @@ -9,6 +9,7 @@ test_expect_success setup '
>  	test_commit PRE &&
>  	PRE_OID=$(git rev-parse PRE) &&
>  	test_commit POST &&
> +	PRE_OID=$(git rev-parse PRE) &&
>  	POST_OID=$(git rev-parse POST)
>  '
>  
> @@ -52,10 +53,10 @@ test_expect_success 'hook gets all queued updates in prepared state' '
>  		fi
>  	EOF
>  	cat >expect <<-EOF &&
> -		$ZERO_OID $POST_OID HEAD
> -		$ZERO_OID $POST_OID refs/heads/master
> +		$PRE_OID $POST_OID HEAD
> +		$PRE_OID $POST_OID refs/heads/master
>  	EOF
> -	git update-ref HEAD POST <<-EOF &&
> +	git update-ref HEAD POST POST <<-EOF &&
>  		update HEAD $ZERO_OID $POST_OID
>  		update refs/heads/master $ZERO_OID $POST_OID
>  	EOF
> @@ -75,10 +76,10 @@ test_expect_success 'hook gets all queued updates in committed state' '
>  		fi
>  	EOF
>  	cat >expect <<-EOF &&
> -		$ZERO_OID $POST_OID HEAD
> -		$ZERO_OID $POST_OID refs/heads/master
> +		$PRE_OID $POST_OID HEAD
> +		$PRE_OID $POST_OID refs/heads/master
>  	EOF
> -	git update-ref HEAD POST &&
> +	git update-ref HEAD POST PRE &&
>  	test_cmp expect actual
>  '

I think that makes a lot of sense.  In addition, 

> ...  But we'd possibly want to actually change the behavior
> to always send the actual ref value to the hook.

I offhand do not see a reason why we shouldn't do that.

Thanks for carefully thinking things through.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-23  3:59     ` Junio C Hamano
@ 2020-10-23 19:57       ` Taylor Blau
  2020-10-23 22:07         ` Taylor Blau
  2020-10-26  7:43       ` Patrick Steinhardt
  1 sibling, 1 reply; 26+ messages in thread
From: Taylor Blau @ 2020-10-23 19:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Patrick Steinhardt, git, SZEDER Gábor, Taylor Blau

On Thu, Oct 22, 2020 at 08:59:23PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> [...]
> I think that makes a lot of sense.  In addition,
>
> > ...  But we'd possibly want to actually change the behavior
> > to always send the actual ref value to the hook.
>
> I offhand do not see a reason why we shouldn't do that.

I can't see a reason either, but teaching the new ref transaction hook
about these updates gets a little tricky. In particular, we update the
old_oid for ref updates that were split off with something like:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 04e85e7002..9c105a376b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2484,9 +2484,20 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		     parent_update = parent_update->parent_update) {
 			struct ref_lock *parent_lock = parent_update->backend_data;
 			oidcpy(&parent_lock->old_oid, &lock->old_oid);
+			/*
+			 * Now the parent knows its old OID too;
+			 * record that fact for logging.
+			 */
+			parent_update->flags |= REF_HAVE_OLD;
 		}
 	}

+	/* Now we know the old value. Record that fact for logging. */
+	if (!(update->flags & REF_HAVE_OLD)) {
+		oidcpy(&update->old_oid, &lock->old_oid);
+		update->flags |= REF_HAVE_OLD;
+	}
+
 	if ((update->flags & REF_HAVE_NEW) &&
 	    !(update->flags & REF_DELETING) &&
 	    !(update->flags & REF_LOG_ONLY)) {

...but by that time we have already recorded via an oidcpy the old and
new state of HEAD. So, Peff's patch passes in isolation, but applying
this on top produces failures in t1416 like:

--- expect	2020-10-23 19:56:15.649101024 +0000
+++ actual	2020-10-23 19:56:15.657100941 +0000
@@ -1,2 +1,2 @@
-63ac8e7bcdb882293465435909f54a96de17d4f7 99d53161c3a0a903b6561b9f6c0c665b3a476401 HEAD
+0000000000000000000000000000000000000000 99d53161c3a0a903b6561b9f6c0c665b3a476401 HEAD
 63ac8e7bcdb882293465435909f54a96de17d4f7 99d53161c3a0a903b6561b9f6c0c665b3a476401 refs/heads/master

It should be possible to keep track of the old and new OIDs via a
non-copied pointer, but I have to untangle this code a little bit more
before I can be sure.

> Thanks for carefully thinking things through.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-23  1:03   ` Jeff King
  2020-10-23  3:59     ` Junio C Hamano
@ 2020-10-23 20:00     ` Taylor Blau
  1 sibling, 0 replies; 26+ messages in thread
From: Taylor Blau @ 2020-10-23 20:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, git, SZEDER Gábor, Junio C Hamano

On Thu, Oct 22, 2020 at 09:03:15PM -0400, Jeff King wrote:
> We are expecting to see $ZERO_OID in that slot, even though the current
> value of the ref is "PRE" due to our reset above.
>
> Should this be $PRE_OID (we don't have that variable, but it would be
> the result of "git rev-parse PRE")?

I also skimmed past it, but we do already have $PRE_OID, which is
convenient.

> I could alternatively see an argument that <old-value> is the old-value
> that the caller asked for. So seeing $ZERO_OID is saying that the caller
> wants to move from _anything_ to $POST_OID, and we're not willing to
> tell the hook what the current value actually is.
>
> We could actually fill in the current value for zero cost. The reason we
> found this is that we have a custom patch at GitHub that fills in these
> values when we open the ref after locking.

Yup, modulo being easy for symrefs (which I talk about in [1]), but it
shouldn't be impossible.

[1]: https://lore.kernel.org/git/X5M1oe4lfkUy9lAh@nand.local

> In real usage, I'm not sure how much the distinction would matter,
> because any careful caller would provide a non-zero "old" value. And if
> that doesn't match the current value, we'd reject the transaction before
> we even hit the hook, I think. It's only the fact that the update-ref
> calls are sloppy and do not provide an expected old value that it even
> matters.
>
> So I wonder if:
>
> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> index f6e741c6c0..8155522a1a 100755
> --- a/t/t1416-ref-transaction-hooks.sh
> +++ b/t/t1416-ref-transaction-hooks.sh
> @@ -9,6 +9,7 @@ test_expect_success setup '
>  	test_commit PRE &&
>  	PRE_OID=$(git rev-parse PRE) &&
>  	test_commit POST &&
> +	PRE_OID=$(git rev-parse PRE) &&
>  	POST_OID=$(git rev-parse POST)
>  '
>
> @@ -52,10 +53,10 @@ test_expect_success 'hook gets all queued updates in prepared state' '
>  		fi
>  	EOF
>  	cat >expect <<-EOF &&
> -		$ZERO_OID $POST_OID HEAD
> -		$ZERO_OID $POST_OID refs/heads/master
> +		$PRE_OID $POST_OID HEAD
> +		$PRE_OID $POST_OID refs/heads/master
>  	EOF
> -	git update-ref HEAD POST <<-EOF &&
> +	git update-ref HEAD POST POST <<-EOF &&

This should be "git update-ref HEAD POST PRE", since PRE is the before
state.

> would be a step forward. This isn't changing the actual behavior,
> obviously. It's just tweaking the test so that it tests the more likely
> real-world case.  But we'd possibly want to actually change the behavior
> to always send the actual ref value to the hook.

I have to look at the issue in [1] a little bit more to determine
whether or not it requires major surgery. If it does, then I'd be fine
going forward with just your patch to the tests. If it doesn't, then
updating the refs machinery to invoke the hook with the before OIDs
correctly even for symrefs seems sensible to me.

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-23 19:57       ` Taylor Blau
@ 2020-10-23 22:07         ` Taylor Blau
  0 siblings, 0 replies; 26+ messages in thread
From: Taylor Blau @ 2020-10-23 22:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Patrick Steinhardt, git, SZEDER Gábor

On Fri, Oct 23, 2020 at 03:57:21PM -0400, Taylor Blau wrote:
> It should be possible to keep track of the old and new OIDs via a
> non-copied pointer, but I have to untangle this code a little bit more
> before I can be sure.

This may be both easier and harder than I was imagining ;-). In the
easier direction, the following patch is sufficient to get the tests
passing again:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 04e85e7002..744c93b7ff 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2311,12 +2311,11 @@ static int split_symref_update(struct ref_update *update,
 	new_update->parent_update = update;

 	/*
-	 * Change the symbolic ref update to log only. Also, it
-	 * doesn't need to check its old OID value, as that will be
-	 * done when new_update is processed.
+	 * Change the symbolic ref update to log only. Though we don't
+	 * need to check its old OID value, leave REF_HAVE_OLD alone so
+	 * we can propagate it to the ref transaction hook.
 	 */
 	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-	update->flags &= ~REF_HAVE_OLD;

 	/*
 	 * Add the referent. This insertion is O(N) in the transaction

But, it also means that we're now needlessly re-verifying the before
state of symrefs (or so I think, I haven't yet proved it one way or the
other).

So, I need to look into that before deciding if this is a good direction
to go.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-23  3:59     ` Junio C Hamano
  2020-10-23 19:57       ` Taylor Blau
@ 2020-10-26  7:43       ` Patrick Steinhardt
  2020-10-26 23:52         ` Taylor Blau
  1 sibling, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2020-10-26  7:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git, SZEDER Gábor, Taylor Blau

[-- Attachment #1: Type: text/plain, Size: 2381 bytes --]

On Thu, Oct 22, 2020 at 08:59:23PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > So I wonder if:
> >
> > diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> > index f6e741c6c0..8155522a1a 100755
> > --- a/t/t1416-ref-transaction-hooks.sh
> > +++ b/t/t1416-ref-transaction-hooks.sh
> > @@ -9,6 +9,7 @@ test_expect_success setup '
> >  	test_commit PRE &&
> >  	PRE_OID=$(git rev-parse PRE) &&
> >  	test_commit POST &&
> > +	PRE_OID=$(git rev-parse PRE) &&
> >  	POST_OID=$(git rev-parse POST)
> >  '
> >  
> > @@ -52,10 +53,10 @@ test_expect_success 'hook gets all queued updates in prepared state' '
> >  		fi
> >  	EOF
> >  	cat >expect <<-EOF &&
> > -		$ZERO_OID $POST_OID HEAD
> > -		$ZERO_OID $POST_OID refs/heads/master
> > +		$PRE_OID $POST_OID HEAD
> > +		$PRE_OID $POST_OID refs/heads/master
> >  	EOF
> > -	git update-ref HEAD POST <<-EOF &&
> > +	git update-ref HEAD POST POST <<-EOF &&
> >  		update HEAD $ZERO_OID $POST_OID
> >  		update refs/heads/master $ZERO_OID $POST_OID
> >  	EOF
> > @@ -75,10 +76,10 @@ test_expect_success 'hook gets all queued updates in committed state' '
> >  		fi
> >  	EOF
> >  	cat >expect <<-EOF &&
> > -		$ZERO_OID $POST_OID HEAD
> > -		$ZERO_OID $POST_OID refs/heads/master
> > +		$PRE_OID $POST_OID HEAD
> > +		$PRE_OID $POST_OID refs/heads/master
> >  	EOF
> > -	git update-ref HEAD POST &&
> > +	git update-ref HEAD POST PRE &&
> >  	test_cmp expect actual
> >  '
> 
> I think that makes a lot of sense.  In addition, 
> 
> > ...  But we'd possibly want to actually change the behavior
> > to always send the actual ref value to the hook.
> 
> I offhand do not see a reason why we shouldn't do that.
> 
> Thanks for carefully thinking things through.

Thanks for having a look! I agree, changing this seems sensible to me.
In the end, the intention of the hook is to have a single script which
is able to verify all reference updates, no matter where they come from.
And behaving differently based on whether the user passed the zero OID
or not doesn't seem useful to me in that context.

We should also a better job than I did and describe what the old OID
actually is in the documentation.

@Taylor, given that you've already dug into the code: do you already
have plans to post a patch for this?

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-26  7:43       ` Patrick Steinhardt
@ 2020-10-26 23:52         ` Taylor Blau
  2020-10-27  5:37           ` Jeff King
  2020-10-28 18:22           ` Patrick Steinhardt
  0 siblings, 2 replies; 26+ messages in thread
From: Taylor Blau @ 2020-10-26 23:52 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Junio C Hamano, Jeff King, git, SZEDER Gábor, Taylor Blau

On Mon, Oct 26, 2020 at 08:43:03AM +0100, Patrick Steinhardt wrote:
> @Taylor, given that you've already dug into the code: do you already
> have plans to post a patch for this?

You are likely in a better position to do that than I am. I am
unfamiliar enough with the refs.c code to feel confident that my change
is correct, let alone working. The combination of REF_HAVE_OLD, the lock
OID, the update OID, and so on is very puzzling.

Ordinarily, I'd be happy to post a patch after familiarizing myself, but
right now I don't have the time. So, I may come back to this in six
months, but I certainly won't feel bad if you beat me to it ;-).

In the meantime, I'd be fine to apply Peff's patch with some fix-ups,
maybe something like what's below the scissors line.

Taylor

--- >8 ---

Subject: [PATCH] t1416: specify pre-state for ref-updates

The ref-transaction hook documentation says that the expected format for
each line is:

  <old-value> SP <new-value> SP <ref-name> LF

without defining what <old-value> is. It could either be the current
state of the refs (after locking, but before committing the
transaction), or the optional caller-provided pre-state.

If <old-value> is to mean the caller-provided pre-state, than $ZERO_OID
could be taken to mean that the update is allowed to take place without
requiring the ref to be at some state. On the other hand, if <old-value>
is taken to mean "the current value of the reference", then that
requires a behavior change.

But that may only be semi-realistic, since any careful callers are
likely to pass a pre-state around anyway, and failing to meet that
pre-state means the hook will not even be invoked.

So, tweak the tests to more closely match how callers will actually
invoke this hook by providing a pre-state explicitly and then asserting
that it made its way down to the ref-transaction hook.

If we do decide to go further and implement a behavior change, it would
make sense to modify the tests to instead look something like:

  for before in "$PRE" ""
  do
    cat >expect <<-EOF &&
      $ZERO_OID $POST_OID HEAD
      $ZERO_OID $POST_OID refs/heads/master
      $PRE_OID $POST_OID HEAD
      $PRE_OID $POST_OID refs/heads/master
    EOF
    git reset --hard $PRE &&
    git update-ref HEAD POST $before &&
    test_cmp expect actual
  done

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t1416-ref-transaction-hooks.sh | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
index f6e741c6c0..74f94e293c 100755
--- a/t/t1416-ref-transaction-hooks.sh
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -52,10 +52,10 @@ test_expect_success 'hook gets all queued updates in prepared state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
-	git update-ref HEAD POST <<-EOF &&
+	git update-ref HEAD POST PRE <<-EOF &&
 		update HEAD $ZERO_OID $POST_OID
 		update refs/heads/master $ZERO_OID $POST_OID
 	EOF
@@ -75,10 +75,10 @@ test_expect_success 'hook gets all queued updates in committed state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
-	git update-ref HEAD POST &&
+	git update-ref HEAD POST PRE &&
 	test_cmp expect actual
 '

--
2.29.1.9.g605042ee00


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-26 23:52         ` Taylor Blau
@ 2020-10-27  5:37           ` Jeff King
  2020-10-28 18:22           ` Patrick Steinhardt
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff King @ 2020-10-27  5:37 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Patrick Steinhardt, Junio C Hamano, git, SZEDER Gábor

On Mon, Oct 26, 2020 at 07:52:23PM -0400, Taylor Blau wrote:

> On Mon, Oct 26, 2020 at 08:43:03AM +0100, Patrick Steinhardt wrote:
> > @Taylor, given that you've already dug into the code: do you already
> > have plans to post a patch for this?
> 
> You are likely in a better position to do that than I am. I am
> unfamiliar enough with the refs.c code to feel confident that my change
> is correct, let alone working. The combination of REF_HAVE_OLD, the lock
> OID, the update OID, and so on is very puzzling.
> 
> Ordinarily, I'd be happy to post a patch after familiarizing myself, but
> right now I don't have the time. So, I may come back to this in six
> months, but I certainly won't feel bad if you beat me to it ;-).
> 
> In the meantime, I'd be fine to apply Peff's patch with some fix-ups,
> maybe something like what's below the scissors line.

Thanks for moving this forward. I'm definitely OK with leaving the code
for now and just tightening the test. But there is one curiosity, still.
Your patch tweaks two tests:

> @@ -52,10 +52,10 @@ test_expect_success 'hook gets all queued updates in prepared state' '
> [...]
> @@ -75,10 +75,10 @@ test_expect_success 'hook gets all queued updates in committed state' '

but there's a third one, which checks the hook's view of the "aborted"
state. And there we _do_ pass in an expected state which is not
$PRE_OID. But it's $ZERO_OID, so we can't tell if if the hook is getting
what we passed in, or some sensible value.

If we instead do this:

diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
index 74f94e293c..a7ed983d3c 100755
--- a/t/t1416-ref-transaction-hooks.sh
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -94,14 +94,15 @@ test_expect_success 'hook gets all queued updates in aborted state' '
 			done >actual
 		fi
 	EOF
+	nonsense_oid=$(echo $ZERO_OID | tr 0 a) &&
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$nonsense_oid $POST_OID HEAD
+		$nonsense_oid $POST_OID refs/heads/master
 	EOF
 	git update-ref --stdin <<-EOF &&
 		start
-		update HEAD POST $ZERO_OID
-		update refs/heads/master POST $ZERO_OID
+		update HEAD POST $nonsense_oid
+		update refs/heads/master POST $nonsense_oid
 		abort
 	EOF
 	test_cmp expect actual

Then test still passes, because it's passing along the value we
provided. And I think that's the only sensible thing we can do: show the
hook the proposed update. There's little point in telling it what the
actual ref values were, I would think.

So I think it's worth adding in to the test, but:

  - probably that meaning of old-value needs to be discussed in more
    detail in the hook documentation, because it feels like it differs a
    bit in the "aborted" case versus the "committed" case

  - we'd want to make sure this keeps passing if further changes to the
    code are made to support the case without an expected state
    specified (and not, say, accidentally passing $PRE_OID to the hook)

  - we'd likewise want eventually a test for the case without an
    expected state (which I guess would actually pass $ZERO_OID to the
    hook). Of course we're using a mismatch of the expected value as the
    reason to abort, so we'd have to find another reliable way to make
    the transaction abort. :) Perhaps a refname with illegal characters,
    or trying to create "foo/bar" when "foo" exists (or vice versa).

Most of that can be bumped to later, but I think it's worth squashing
something like the hunk above into the patch you posted.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4] refs: implement reference transaction hook
  2020-10-26 23:52         ` Taylor Blau
  2020-10-27  5:37           ` Jeff King
@ 2020-10-28 18:22           ` Patrick Steinhardt
  1 sibling, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2020-10-28 18:22 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, Jeff King, git, SZEDER Gábor

[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]

On Mon, Oct 26, 2020 at 07:52:23PM -0400, Taylor Blau wrote:
> On Mon, Oct 26, 2020 at 08:43:03AM +0100, Patrick Steinhardt wrote:
> > @Taylor, given that you've already dug into the code: do you already
> > have plans to post a patch for this?
> 
> You are likely in a better position to do that than I am. I am
> unfamiliar enough with the refs.c code to feel confident that my change
> is correct, let alone working. The combination of REF_HAVE_OLD, the lock
> OID, the update OID, and so on is very puzzling.
> 
> Ordinarily, I'd be happy to post a patch after familiarizing myself, but
> right now I don't have the time. So, I may come back to this in six
> months, but I certainly won't feel bad if you beat me to it ;-).
> 
> In the meantime, I'd be fine to apply Peff's patch with some fix-ups,
> maybe something like what's below the scissors line.

Fair enough, let's do it like this and submit the test change first.
I'll try to squeeze in doing the hook change soonish, but I'm currently
lacking time myself. So no promise I'll get to it soonish,
unfortunately.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-10-28 21:51 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-02  8:25 [PATCH] refs: implement reference transaction hooks Patrick Steinhardt
2020-06-02 17:47 ` Junio C Hamano
2020-06-03 11:26   ` Patrick Steinhardt
2020-06-07 20:12     ` SZEDER Gábor
2020-06-08  5:36       ` Patrick Steinhardt
2020-06-02 18:09 ` SZEDER Gábor
2020-06-03  9:46   ` Patrick Steinhardt
2020-06-03 12:27 ` [PATCH v2] refs: implement reference transaction hook Patrick Steinhardt
2020-06-03 16:51   ` Taylor Blau
2020-06-04  7:36     ` Patrick Steinhardt
2020-06-15 16:46       ` Taylor Blau
2020-06-16  5:45         ` Patrick Steinhardt
2020-06-03 17:44   ` Han-Wen Nienhuys
2020-06-03 18:03     ` Junio C Hamano
2020-06-18 10:27 ` [PATCH v3] " Patrick Steinhardt
2020-06-18 22:23   ` Junio C Hamano
2020-06-19  6:56 ` [PATCH v4] " Patrick Steinhardt
2020-10-23  1:03   ` Jeff King
2020-10-23  3:59     ` Junio C Hamano
2020-10-23 19:57       ` Taylor Blau
2020-10-23 22:07         ` Taylor Blau
2020-10-26  7:43       ` Patrick Steinhardt
2020-10-26 23:52         ` Taylor Blau
2020-10-27  5:37           ` Jeff King
2020-10-28 18:22           ` Patrick Steinhardt
2020-10-23 20:00     ` Taylor Blau

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/inbox.comp.version-control.git
	nntp://ie5yzdi7fg72h7s4sdcztq5evakq23rdt33mfyfcddc5u3ndnw24ogqd.onion/inbox.comp.version-control.git
	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git