git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] refs: Always pass old object name to reftx hook
@ 2020-11-04 14:58 Patrick Steinhardt
  2020-12-04  8:37 ` Patrick Steinhardt
  2021-01-18 12:49 ` [PATCH RESEND] " Patrick Steinhardt
  0 siblings, 2 replies; 13+ messages in thread
From: Patrick Steinhardt @ 2020-11-04 14:58 UTC (permalink / raw)
  To: git; +Cc: peff, me, gitster

[-- Attachment #1: Type: text/plain, Size: 4954 bytes --]

Inputs of the reference-transaction hook currently depends on the
command which is being run. For example if the command `git update-ref
$REF $A $B` is executed, it will receive "$B $A $REF" as input, but if
the command `git update-ref $REF $A` is executed without providing the
old value, then it will receive "0*40 $A $REF" as input. This is due to
the fact that we directly write queued transaction updates into the
hook's standard input, which will not contain the old object value in
case it wasn't provided.

While this behaviour reflects what is happening as part of the
repository, it doesn't feel like it is useful. The main intent of the
reference-transaction hook is to be able to completely audit all
reference updates, no matter where they come from. As such, it makes a
lot more sense to always provide actual values instead of what the user
wanted. Furthermore, it's impossible for the hook to distinguish whether
this is intended to be a branch creation or a branch update without
doing additional digging with the current format.

Fix the issue by storing the old object value into the queued
transaction update operation if it wasn't provided by the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/githooks.txt       |  6 ++++++
 refs/files-backend.c             |  8 ++++++++
 refs/packed-backend.c            |  2 ++
 t/t1416-ref-transaction-hooks.sh | 12 ++++++------
 4 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 4e097dc4e9..8f3540e2f6 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -492,6 +492,12 @@ receives on standard input a line of the format:
 
   <old-value> SP <new-value> SP <ref-name> LF
 
+where `<old-value>` is the old object name stored in the ref,
+`<new-value>` is the new object name to be stored in the ref and
+`<ref-name>` is the full name of the ref.
+When creating a new ref, `<old-value>` is 40 `0`.
+When deleting an old ref, `<new-value>` is 40 `0`.
+
 The exit status of the hook is ignored for any state except for the
 "prepared" state. In the "prepared" state, a non-zero exit status will
 cause the transaction to be aborted. The hook will not be called with
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 04e85e7002..5b10d3822b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2452,6 +2452,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 				ret = TRANSACTION_GENERIC_ERROR;
 				goto out;
 			}
+
+			if (!(update->flags & REF_HAVE_OLD))
+				oidcpy(&update->old_oid, &lock->old_oid);
 		} else {
 			/*
 			 * Create a new update for the reference this
@@ -2474,6 +2477,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 			goto out;
 		}
 
+		if (!(update->flags & REF_HAVE_OLD))
+			oidcpy(&update->old_oid, &lock->old_oid);
+
 		/*
 		 * If this update is happening indirectly because of a
 		 * symref update, record the old OID in the parent
@@ -2484,6 +2490,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		     parent_update = parent_update->parent_update) {
 			struct ref_lock *parent_lock = parent_update->backend_data;
 			oidcpy(&parent_lock->old_oid, &lock->old_oid);
+			if (!(parent_update->flags & REF_HAVE_OLD))
+				oidcpy(&parent_update->old_oid, &lock->old_oid);
 		}
 	}
 
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index b912f2505f..08f0feee3d 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1178,6 +1178,8 @@ static int write_with_updates(struct packed_ref_store *refs,
 						    oid_to_hex(&update->old_oid));
 					goto error;
 				}
+			} else {
+				oidcpy(&update->old_oid, iter->oid);
 			}
 
 			/* Now figure out what to use for the new value: */
diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
index f6e741c6c0..111533682a 100755
--- a/t/t1416-ref-transaction-hooks.sh
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -52,12 +52,12 @@ test_expect_success 'hook gets all queued updates in prepared state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
 	git update-ref HEAD POST <<-EOF &&
-		update HEAD $ZERO_OID $POST_OID
-		update refs/heads/master $ZERO_OID $POST_OID
+		update HEAD $PRE_OID $POST_OID
+		update refs/heads/master $PRE_OID $POST_OID
 	EOF
 	test_cmp expect actual
 '
@@ -75,8 +75,8 @@ test_expect_success 'hook gets all queued updates in committed state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
 	git update-ref HEAD POST &&
 	test_cmp expect actual
-- 
2.29.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] refs: Always pass old object name to reftx hook
  2020-11-04 14:58 [PATCH] refs: Always pass old object name to reftx hook Patrick Steinhardt
@ 2020-12-04  8:37 ` Patrick Steinhardt
  2021-01-12 21:07   ` Taylor Blau
  2021-01-18 12:49 ` [PATCH RESEND] " Patrick Steinhardt
  1 sibling, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2020-12-04  8:37 UTC (permalink / raw)
  To: git; +Cc: peff, me, gitster

[-- Attachment #1: Type: text/plain, Size: 5363 bytes --]

On Wed, Nov 04, 2020 at 03:58:40PM +0100, Patrick Steinhardt wrote:
> Inputs of the reference-transaction hook currently depends on the
> command which is being run. For example if the command `git update-ref
> $REF $A $B` is executed, it will receive "$B $A $REF" as input, but if
> the command `git update-ref $REF $A` is executed without providing the
> old value, then it will receive "0*40 $A $REF" as input. This is due to
> the fact that we directly write queued transaction updates into the
> hook's standard input, which will not contain the old object value in
> case it wasn't provided.
> 
> While this behaviour reflects what is happening as part of the
> repository, it doesn't feel like it is useful. The main intent of the
> reference-transaction hook is to be able to completely audit all
> reference updates, no matter where they come from. As such, it makes a
> lot more sense to always provide actual values instead of what the user
> wanted. Furthermore, it's impossible for the hook to distinguish whether
> this is intended to be a branch creation or a branch update without
> doing additional digging with the current format.
> 
> Fix the issue by storing the old object value into the queued
> transaction update operation if it wasn't provided by the caller.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>

Quick ping on this patch. Is there any interest or shall I just drop it?

Patrick

>  Documentation/githooks.txt       |  6 ++++++
>  refs/files-backend.c             |  8 ++++++++
>  refs/packed-backend.c            |  2 ++
>  t/t1416-ref-transaction-hooks.sh | 12 ++++++------
>  4 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index 4e097dc4e9..8f3540e2f6 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -492,6 +492,12 @@ receives on standard input a line of the format:
>  
>    <old-value> SP <new-value> SP <ref-name> LF
>  
> +where `<old-value>` is the old object name stored in the ref,
> +`<new-value>` is the new object name to be stored in the ref and
> +`<ref-name>` is the full name of the ref.
> +When creating a new ref, `<old-value>` is 40 `0`.
> +When deleting an old ref, `<new-value>` is 40 `0`.
> +
>  The exit status of the hook is ignored for any state except for the
>  "prepared" state. In the "prepared" state, a non-zero exit status will
>  cause the transaction to be aborted. The hook will not be called with
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 04e85e7002..5b10d3822b 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2452,6 +2452,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  				ret = TRANSACTION_GENERIC_ERROR;
>  				goto out;
>  			}
> +
> +			if (!(update->flags & REF_HAVE_OLD))
> +				oidcpy(&update->old_oid, &lock->old_oid);
>  		} else {
>  			/*
>  			 * Create a new update for the reference this
> @@ -2474,6 +2477,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  			goto out;
>  		}
>  
> +		if (!(update->flags & REF_HAVE_OLD))
> +			oidcpy(&update->old_oid, &lock->old_oid);
> +
>  		/*
>  		 * If this update is happening indirectly because of a
>  		 * symref update, record the old OID in the parent
> @@ -2484,6 +2490,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  		     parent_update = parent_update->parent_update) {
>  			struct ref_lock *parent_lock = parent_update->backend_data;
>  			oidcpy(&parent_lock->old_oid, &lock->old_oid);
> +			if (!(parent_update->flags & REF_HAVE_OLD))
> +				oidcpy(&parent_update->old_oid, &lock->old_oid);
>  		}
>  	}
>  
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index b912f2505f..08f0feee3d 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1178,6 +1178,8 @@ static int write_with_updates(struct packed_ref_store *refs,
>  						    oid_to_hex(&update->old_oid));
>  					goto error;
>  				}
> +			} else {
> +				oidcpy(&update->old_oid, iter->oid);
>  			}
>  
>  			/* Now figure out what to use for the new value: */
> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> index f6e741c6c0..111533682a 100755
> --- a/t/t1416-ref-transaction-hooks.sh
> +++ b/t/t1416-ref-transaction-hooks.sh
> @@ -52,12 +52,12 @@ test_expect_success 'hook gets all queued updates in prepared state' '
>  		fi
>  	EOF
>  	cat >expect <<-EOF &&
> -		$ZERO_OID $POST_OID HEAD
> -		$ZERO_OID $POST_OID refs/heads/master
> +		$PRE_OID $POST_OID HEAD
> +		$PRE_OID $POST_OID refs/heads/master
>  	EOF
>  	git update-ref HEAD POST <<-EOF &&
> -		update HEAD $ZERO_OID $POST_OID
> -		update refs/heads/master $ZERO_OID $POST_OID
> +		update HEAD $PRE_OID $POST_OID
> +		update refs/heads/master $PRE_OID $POST_OID
>  	EOF
>  	test_cmp expect actual
>  '
> @@ -75,8 +75,8 @@ test_expect_success 'hook gets all queued updates in committed state' '
>  		fi
>  	EOF
>  	cat >expect <<-EOF &&
> -		$ZERO_OID $POST_OID HEAD
> -		$ZERO_OID $POST_OID refs/heads/master
> +		$PRE_OID $POST_OID HEAD
> +		$PRE_OID $POST_OID refs/heads/master
>  	EOF
>  	git update-ref HEAD POST &&
>  	test_cmp expect actual
> -- 
> 2.29.2
> 



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] refs: Always pass old object name to reftx hook
  2020-12-04  8:37 ` Patrick Steinhardt
@ 2021-01-12 21:07   ` Taylor Blau
  2021-01-13 11:22     ` Patrick Steinhardt
  0 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2021-01-12 21:07 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, peff, gitster

On Fri, Dec 04, 2020 at 09:37:21AM +0100, Patrick Steinhardt wrote:
> Quick ping on this patch. Is there any interest or shall I just drop it?

Apologies for completely dropping this from inbox. I am interested in
it, and the patch looks good to me. I suspect the reason that this never
got queued was that nobody reviewed it.

Would you consider resubmitting this patch if you are still interested
in pushing it forwards?


Thanks,
Taylor

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] refs: Always pass old object name to reftx hook
  2021-01-12 21:07   ` Taylor Blau
@ 2021-01-13 11:22     ` Patrick Steinhardt
  2021-01-13 15:09       ` Taylor Blau
  2021-01-13 20:11       ` Junio C Hamano
  0 siblings, 2 replies; 13+ messages in thread
From: Patrick Steinhardt @ 2021-01-13 11:22 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, gitster

[-- Attachment #1: Type: text/plain, Size: 646 bytes --]

On Tue, Jan 12, 2021 at 04:07:22PM -0500, Taylor Blau wrote:
> On Fri, Dec 04, 2020 at 09:37:21AM +0100, Patrick Steinhardt wrote:
> > Quick ping on this patch. Is there any interest or shall I just drop it?
> 
> Apologies for completely dropping this from inbox. I am interested in
> it, and the patch looks good to me. I suspect the reason that this never
> got queued was that nobody reviewed it.

No worries.

> Would you consider resubmitting this patch if you are still interested
> in pushing it forwards?

I can, but does it help to resubmit the same patch? Your response bumped
the thread up to the top anyway.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] refs: Always pass old object name to reftx hook
  2021-01-13 11:22     ` Patrick Steinhardt
@ 2021-01-13 15:09       ` Taylor Blau
  2021-01-13 20:11       ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2021-01-13 15:09 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Taylor Blau, git, peff, gitster

On Wed, Jan 13, 2021 at 12:22:14PM +0100, Patrick Steinhardt wrote:
> > Would you consider resubmitting this patch if you are still interested
> > in pushing it forwards?
>
> I can, but does it help to resubmit the same patch? Your response bumped
> the thread up to the top anyway.

Let's see if Junio picks it up in the next cycle. If he doesn't, it may
help to re-submit the thread.

> Patrick

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] refs: Always pass old object name to reftx hook
  2021-01-13 11:22     ` Patrick Steinhardt
  2021-01-13 15:09       ` Taylor Blau
@ 2021-01-13 20:11       ` Junio C Hamano
  2021-01-18 12:44         ` Patrick Steinhardt
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2021-01-13 20:11 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Taylor Blau, git, peff

Patrick Steinhardt <ps@pks.im> writes:

> On Tue, Jan 12, 2021 at 04:07:22PM -0500, Taylor Blau wrote:
>> On Fri, Dec 04, 2020 at 09:37:21AM +0100, Patrick Steinhardt wrote:
>> > Quick ping on this patch. Is there any interest or shall I just drop it?
>> 
>> Apologies for completely dropping this from inbox. I am interested in
>> it, and the patch looks good to me. I suspect the reason that this never
>> got queued was that nobody reviewed it.
>
> No worries.
>
>> Would you consider resubmitting this patch if you are still interested
>> in pushing it forwards?
>
> I can, but does it help to resubmit the same patch? Your response bumped
> the thread up to the top anyway.

Bumping without resending would often not help people to see the
patch at all.

For example, already-read-and-old messages are not even shown to me
unless I ask my newsreader "I told you to show only the latest 200
messages, and I see this recent 'bump' message, but it responds to a
way old message so you need to show me the latest 3000 messages to
cover that era in order to see the patch message(s) it bumps."

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] refs: Always pass old object name to reftx hook
  2021-01-13 20:11       ` Junio C Hamano
@ 2021-01-18 12:44         ` Patrick Steinhardt
  0 siblings, 0 replies; 13+ messages in thread
From: Patrick Steinhardt @ 2021-01-18 12:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, peff

[-- Attachment #1: Type: text/plain, Size: 1294 bytes --]

On Wed, Jan 13, 2021 at 12:11:28PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > On Tue, Jan 12, 2021 at 04:07:22PM -0500, Taylor Blau wrote:
> >> On Fri, Dec 04, 2020 at 09:37:21AM +0100, Patrick Steinhardt wrote:
> >> > Quick ping on this patch. Is there any interest or shall I just drop it?
> >> 
> >> Apologies for completely dropping this from inbox. I am interested in
> >> it, and the patch looks good to me. I suspect the reason that this never
> >> got queued was that nobody reviewed it.
> >
> > No worries.
> >
> >> Would you consider resubmitting this patch if you are still interested
> >> in pushing it forwards?
> >
> > I can, but does it help to resubmit the same patch? Your response bumped
> > the thread up to the top anyway.
> 
> Bumping without resending would often not help people to see the
> patch at all.
> 
> For example, already-read-and-old messages are not even shown to me
> unless I ask my newsreader "I told you to show only the latest 200
> messages, and I see this recent 'bump' message, but it responds to a
> way old message so you need to show me the latest 3000 messages to
> cover that era in order to see the patch message(s) it bumps."

Fair point, I'll resend the patch. Thanks!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH RESEND] refs: Always pass old object name to reftx hook
  2020-11-04 14:58 [PATCH] refs: Always pass old object name to reftx hook Patrick Steinhardt
  2020-12-04  8:37 ` Patrick Steinhardt
@ 2021-01-18 12:49 ` Patrick Steinhardt
  2021-01-18 22:45   ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2021-01-18 12:49 UTC (permalink / raw)
  To: git; +Cc: peff, me, gitster

[-- Attachment #1: Type: text/plain, Size: 4954 bytes --]

Inputs of the reference-transaction hook currently depends on the
command which is being run. For example if the command `git update-ref
$REF $A $B` is executed, it will receive "$B $A $REF" as input, but if
the command `git update-ref $REF $A` is executed without providing the
old value, then it will receive "0*40 $A $REF" as input. This is due to
the fact that we directly write queued transaction updates into the
hook's standard input, which will not contain the old object value in
case it wasn't provided.

While this behaviour reflects what is happening as part of the
repository, it doesn't feel like it is useful. The main intent of the
reference-transaction hook is to be able to completely audit all
reference updates, no matter where they come from. As such, it makes a
lot more sense to always provide actual values instead of what the user
wanted. Furthermore, it's impossible for the hook to distinguish whether
this is intended to be a branch creation or a branch update without
doing additional digging with the current format.

Fix the issue by storing the old object value into the queued
transaction update operation if it wasn't provided by the caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/githooks.txt       |  6 ++++++
 refs/files-backend.c             |  8 ++++++++
 refs/packed-backend.c            |  2 ++
 t/t1416-ref-transaction-hooks.sh | 12 ++++++------
 4 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 1f3b57d04d..e7baf0e2a0 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -492,6 +492,12 @@ receives on standard input a line of the format:
 
   <old-value> SP <new-value> SP <ref-name> LF
 
+where `<old-value>` is the old object name stored in the ref,
+`<new-value>` is the new object name to be stored in the ref and
+`<ref-name>` is the full name of the ref.
+When creating a new ref, `<old-value>` is 40 `0`.
+When deleting an old ref, `<new-value>` is 40 `0`.
+
 The exit status of the hook is ignored for any state except for the
 "prepared" state. In the "prepared" state, a non-zero exit status will
 cause the transaction to be aborted. The hook will not be called with
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 04e85e7002..5b10d3822b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2452,6 +2452,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 				ret = TRANSACTION_GENERIC_ERROR;
 				goto out;
 			}
+
+			if (!(update->flags & REF_HAVE_OLD))
+				oidcpy(&update->old_oid, &lock->old_oid);
 		} else {
 			/*
 			 * Create a new update for the reference this
@@ -2474,6 +2477,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 			goto out;
 		}
 
+		if (!(update->flags & REF_HAVE_OLD))
+			oidcpy(&update->old_oid, &lock->old_oid);
+
 		/*
 		 * If this update is happening indirectly because of a
 		 * symref update, record the old OID in the parent
@@ -2484,6 +2490,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		     parent_update = parent_update->parent_update) {
 			struct ref_lock *parent_lock = parent_update->backend_data;
 			oidcpy(&parent_lock->old_oid, &lock->old_oid);
+			if (!(parent_update->flags & REF_HAVE_OLD))
+				oidcpy(&parent_update->old_oid, &lock->old_oid);
 		}
 	}
 
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index b912f2505f..08f0feee3d 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1178,6 +1178,8 @@ static int write_with_updates(struct packed_ref_store *refs,
 						    oid_to_hex(&update->old_oid));
 					goto error;
 				}
+			} else {
+				oidcpy(&update->old_oid, iter->oid);
 			}
 
 			/* Now figure out what to use for the new value: */
diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
index f6e741c6c0..111533682a 100755
--- a/t/t1416-ref-transaction-hooks.sh
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -52,12 +52,12 @@ test_expect_success 'hook gets all queued updates in prepared state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
 	git update-ref HEAD POST <<-EOF &&
-		update HEAD $ZERO_OID $POST_OID
-		update refs/heads/master $ZERO_OID $POST_OID
+		update HEAD $PRE_OID $POST_OID
+		update refs/heads/master $PRE_OID $POST_OID
 	EOF
 	test_cmp expect actual
 '
@@ -75,8 +75,8 @@ test_expect_success 'hook gets all queued updates in committed state' '
 		fi
 	EOF
 	cat >expect <<-EOF &&
-		$ZERO_OID $POST_OID HEAD
-		$ZERO_OID $POST_OID refs/heads/master
+		$PRE_OID $POST_OID HEAD
+		$PRE_OID $POST_OID refs/heads/master
 	EOF
 	git update-ref HEAD POST &&
 	test_cmp expect actual
-- 
2.30.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH RESEND] refs: Always pass old object name to reftx hook
  2021-01-18 12:49 ` [PATCH RESEND] " Patrick Steinhardt
@ 2021-01-18 22:45   ` Junio C Hamano
  2021-01-20  6:28     ` Patrick Steinhardt
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2021-01-18 22:45 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, peff, me

Patrick Steinhardt <ps@pks.im> writes:

> Inputs of the reference-transaction hook currently depends on the
> command which is being run. For example if the command `git update-ref
> $REF $A $B` is executed, it will receive "$B $A $REF" as input, but if
> the command `git update-ref $REF $A` is executed without providing the
> old value, then it will receive "0*40 $A $REF" as input. This is due to
> the fact that we directly write queued transaction updates into the
> hook's standard input, which will not contain the old object value in
> case it wasn't provided.

In effect, the user says "I do not care if this update races with
somebody else and it is perfectly OK if it overwrites their update"
by not giving $B.

> While this behaviour reflects what is happening as part of the
> repository, it doesn't feel like it is useful. The main intent of the
> reference-transaction hook is to be able to completely audit all
> reference updates, no matter where they come from. As such, it makes a
> lot more sense to always provide actual values instead of what the user
> wanted. Furthermore, it's impossible for the hook to distinguish whether
> this is intended to be a branch creation or a branch update without
> doing additional digging with the current format.

But shouldn't the transaction hook script be allowed to learn the
end-user intention and behave differently?  If we replace the
missing old object before calling the script, wouldn't it lose
information?

The above is not an objection posed as two rhetoric questions.  I am
purely curious why losing information is OK in this case, or why it
may not be so OK but should still be acceptable because it is lessor
evil than giving 0{40} to the hooks.

Even without this change, the current value the hook can learn by
looking the ref up itself if it really wanted to, no?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RESEND] refs: Always pass old object name to reftx hook
  2021-01-18 22:45   ` Junio C Hamano
@ 2021-01-20  6:28     ` Patrick Steinhardt
  2021-01-20  7:06       ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2021-01-20  6:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, me

[-- Attachment #1: Type: text/plain, Size: 2772 bytes --]

On Mon, Jan 18, 2021 at 02:45:30PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Inputs of the reference-transaction hook currently depends on the
> > command which is being run. For example if the command `git update-ref
> > $REF $A $B` is executed, it will receive "$B $A $REF" as input, but if
> > the command `git update-ref $REF $A` is executed without providing the
> > old value, then it will receive "0*40 $A $REF" as input. This is due to
> > the fact that we directly write queued transaction updates into the
> > hook's standard input, which will not contain the old object value in
> > case it wasn't provided.
> 
> In effect, the user says "I do not care if this update races with
> somebody else and it is perfectly OK if it overwrites their update"
> by not giving $B.
> 
> > While this behaviour reflects what is happening as part of the
> > repository, it doesn't feel like it is useful. The main intent of the
> > reference-transaction hook is to be able to completely audit all
> > reference updates, no matter where they come from. As such, it makes a
> > lot more sense to always provide actual values instead of what the user
> > wanted. Furthermore, it's impossible for the hook to distinguish whether
> > this is intended to be a branch creation or a branch update without
> > doing additional digging with the current format.
> 
> But shouldn't the transaction hook script be allowed to learn the
> end-user intention and behave differently?  If we replace the
> missing old object before calling the script, wouldn't it lose
> information?
> 
> The above is not an objection posed as two rhetoric questions.  I am
> purely curious why losing information is OK in this case, or why it
> may not be so OK but should still be acceptable because it is lessor
> evil than giving 0{40} to the hooks.
> 
> Even without this change, the current value the hook can learn by
> looking the ref up itself if it really wanted to, no?

I think the biggest problem is that right now, you cannot discern the
actual intention of the user because the information provided to the
hook is ambiguous in the branch creation case: "$ZERO_OID $NEW_OID $REF"
could mean the user intends to create a new branch where it shouldn't
have existed previously. BUT it could also mean that the user just
doesn't care what the reference previously pointed to.

The user could now try to derive the intention by manually looking up
the current state of the reference. But that does feel kind of awkward
to me.

To me, having clearly defined semantics ("The script always gets old and
new value of the branch regardless of what the user did") is preferable
to having ambiguous semantics.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RESEND] refs: Always pass old object name to reftx hook
  2021-01-20  6:28     ` Patrick Steinhardt
@ 2021-01-20  7:06       ` Junio C Hamano
  2021-01-22  6:44         ` Patrick Steinhardt
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2021-01-20  7:06 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, peff, me

Patrick Steinhardt <ps@pks.im> writes:

>> Even without this change, the current value the hook can learn by
>> looking the ref up itself if it really wanted to, no?
>
> I think the biggest problem is that right now, you cannot discern the
> actual intention of the user because the information provided to the
> hook is ambiguous in the branch creation case: "$ZERO_OID $NEW_OID $REF"
> could mean the user intends to create a new branch where it shouldn't
> have existed previously. BUT it could also mean that the user just
> doesn't care what the reference previously pointed to.

Yes, it can mean both, but when you pretend to be that hook,
wouldn't you check if the ref exists?  If not, the user is trying to
create it, and otherwise, the user does not know or care what the
original value is, no?

> The user could now try to derive the intention by manually looking up
> the current state of the reference. But that does feel kind of awkward
> to me.

So in short, with respect to the OLD slot, there are three kind of
end-user intention that could be conveyed to the hook:

 (1) the user does not care, so 0{40} appears in the OLD slot here,
 (2) the user is creating, so 0{40} apears in the OLD slot here, and
 (3) the user does care, and this is the OID in the OLD slot,

And (1) and (2) cannot be separated without looking at the ref (in
other words, if the hook really cares, it can find it out).

But if you replace 0{40} with the current OID, then you are making
it impossible to tell (1) and (3) apart.  The hook cannot tell the
distinction even if it is willing to go the extra mile.

So that sounds like a strict disimprovement to me.

If you can invent a way to help the hook to tell all three apart, I
am very much interested.  It would earn you a bonus point if you can
do so without breaking backward compatibility (but I doubt that it
is possible).

> To me, having clearly defined semantics ("The script always gets old and
> new value of the branch regardless of what the user did") is preferable
> to having ambiguous semantics.

But "The script always gets old that was given by the user and the
new value to be stored" is very clearly defined semantics already.

On the other hand, "The script gets a non-NULL object name in both
cases, either when the user says s/he does not care, or when the
user insists that the old value must be that, and it is not just
ambiguous but is impossible to tell apart" is worse than just being
ambiguous.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RESEND] refs: Always pass old object name to reftx hook
  2021-01-20  7:06       ` Junio C Hamano
@ 2021-01-22  6:44         ` Patrick Steinhardt
  2021-01-22 18:33           ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2021-01-22  6:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, me

[-- Attachment #1: Type: text/plain, Size: 3771 bytes --]

On Tue, Jan 19, 2021 at 11:06:15PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> Even without this change, the current value the hook can learn by
> >> looking the ref up itself if it really wanted to, no?
> >
> > I think the biggest problem is that right now, you cannot discern the
> > actual intention of the user because the information provided to the
> > hook is ambiguous in the branch creation case: "$ZERO_OID $NEW_OID $REF"
> > could mean the user intends to create a new branch where it shouldn't
> > have existed previously. BUT it could also mean that the user just
> > doesn't care what the reference previously pointed to.
> 
> Yes, it can mean both, but when you pretend to be that hook,
> wouldn't you check if the ref exists?  If not, the user is trying to
> create it, and otherwise, the user does not know or care what the
> original value is, no?

As long as you're aware as the script author, yes.

There is one gotcha though: you can verify the state when the
reference-transaction hook gets invoked in the "prepared" state, as it
means that all references have been locked and thus cannot be changed by
any other well-behaved process accessing the git repository. In
"committed" or "aborted" that's not true anymore, given that the state
has changed already, so any locks have been released and it's impossible
to find out what happened now.

> > The user could now try to derive the intention by manually looking up
> > the current state of the reference. But that does feel kind of awkward
> > to me.
> 
> So in short, with respect to the OLD slot, there are three kind of
> end-user intention that could be conveyed to the hook:
> 
>  (1) the user does not care, so 0{40} appears in the OLD slot here,
>  (2) the user is creating, so 0{40} apears in the OLD slot here, and
>  (3) the user does care, and this is the OID in the OLD slot,
> 
> And (1) and (2) cannot be separated without looking at the ref (in
> other words, if the hook really cares, it can find it out).
> 
> But if you replace 0{40} with the current OID, then you are making
> it impossible to tell (1) and (3) apart.  The hook cannot tell the
> distinction even if it is willing to go the extra mile.
> 
> So that sounds like a strict disimprovement to me.

True.

> If you can invent a way to help the hook to tell all three apart, I
> am very much interested.  It would earn you a bonus point if you can
> do so without breaking backward compatibility (but I doubt that it
> is possible).

I did think about any way to do this, but wasn't yet able to find one.
And doing it in a backwards-compatible way is probably going to be
impossible. One idea I had is to use something similar to the
peeled-format we use in packed refs in case the actual change is
different from the user-provided change. E.g.

    0{40} <new> <ref>
    ^<old>

or

    0{40}^<old> <new> <ref>

That can be considered as backwards-incompatible though.

> > To me, having clearly defined semantics ("The script always gets old and
> > new value of the branch regardless of what the user did") is preferable
> > to having ambiguous semantics.
> 
> But "The script always gets old that was given by the user and the
> new value to be stored" is very clearly defined semantics already.
> 
> On the other hand, "The script gets a non-NULL object name in both
> cases, either when the user says s/he does not care, or when the
> user insists that the old value must be that, and it is not just
> ambiguous but is impossible to tell apart" is worse than just being
> ambiguous.

Yup. Whatever we agree on, what is clear is that the documentation needs
to be more specific here.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RESEND] refs: Always pass old object name to reftx hook
  2021-01-22  6:44         ` Patrick Steinhardt
@ 2021-01-22 18:33           ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2021-01-22 18:33 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, peff, me

Patrick Steinhardt <ps@pks.im> writes:

>> Yes, it can mean both, but when you pretend to be that hook,
>> wouldn't you check if the ref exists?  If not, the user is trying to
>> create it, and otherwise, the user does not know or care what the
>> original value is, no?
>
> As long as you're aware as the script author, yes.

As you said downbelow, I agree that clear documentation may be
necessary.

> There is one gotcha though: you can verify the state when the
> reference-transaction hook gets invoked in the "prepared" state, as it
> means that all references have been locked and thus cannot be changed by
> any other well-behaved process accessing the git repository. In
> "committed" or "aborted" that's not true anymore, given that the state
> has changed already, so any locks have been released and it's impossible
> to find out what happened now.

True, but isn't the situation the same if we replaced the 0{40} old
side with (one version of) original value of the ref?

> different from the user-provided change. E.g.
>
>     0{40} <new> <ref>
>     ^<old>
>
> or
>
>     0{40}^<old> <new> <ref>
>
> That can be considered as backwards-incompatible though.

Yes, it is an incompatible change.  I thought of somehow annotating
the old side, e.g. "<old> <new> <ref>" vs "<OLD> <new> <ref>", to
show the distinction between "this is the original value of ref the
user wanted to see when updating <ref>" and "the user does not care
what value the <ref> gets updated from, but by the way, here is the
original value of the ref as Git sees it" [*], but I cannot think of
a way to do so without breaking existing readers.

    Side note: here, I am exploring the approach to replace 0{40}
    that is given when "do not care" into an actual original object
    name taken from the current state, like your patch did, but
    trying to find a way to make non-NULL object name distinguishable
    between the two cases (i.e. user-supplied vs system-filled).

That raises another question.  How much trust should the hook place
on the value of the <old> given to it?  When a non-NULL <old> value
is given by the end-user, does the hook get the value as-is, or do
we read the current value of the ref and send that as <old>?  Does
the transaction get rejected if the two are different and such a
record is not even given to the hook?

> Yup. Whatever we agree on, what is clear is that the documentation needs
> to be more specific here.

Yes, agreed.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-01-22 19:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-04 14:58 [PATCH] refs: Always pass old object name to reftx hook Patrick Steinhardt
2020-12-04  8:37 ` Patrick Steinhardt
2021-01-12 21:07   ` Taylor Blau
2021-01-13 11:22     ` Patrick Steinhardt
2021-01-13 15:09       ` Taylor Blau
2021-01-13 20:11       ` Junio C Hamano
2021-01-18 12:44         ` Patrick Steinhardt
2021-01-18 12:49 ` [PATCH RESEND] " Patrick Steinhardt
2021-01-18 22:45   ` Junio C Hamano
2021-01-20  6:28     ` Patrick Steinhardt
2021-01-20  7:06       ` Junio C Hamano
2021-01-22  6:44         ` Patrick Steinhardt
2021-01-22 18:33           ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).