git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] clone: ignore invalid local refs in remote
@ 2022-04-18 17:12 Derrick Stolee via GitGitGadget
  2022-04-18 17:30 ` Ævar Arnfjörð Bjarmason
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-04-18 17:12 UTC (permalink / raw)
  To: git; +Cc: gitster, me, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

When cloning directly from a local repository, we load a list of refs
based on scanning the $GIT_DIR/refs/ directory of the "server"
repository. If files exist in that directory that do not parse as
hexadecimal hashes, then the ref array used by write_remote_refs()
ends up with some entries with null OIDs. This causes us to hit a BUG()
statement in ref_transaction_create():

  BUG: create called without valid new_oid

This BUG() call used to be a die() until 033abf97f (Replace all
die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
valid, 2015-02-17).

The original report for this bug [1] mentioned that this problem did not
exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
able to successfully compile and test the Git codebase.

[1] https://github.com/git-for-windows/git/issues/3781

There are two approaches to consider here. One would be to remove this
BUG() statement in favor of returning with an error. There are only two
callers to ref_transaction_create(), so this would have a limited
impact.

The other approach would be to add special casing in 'git clone' to
avoid this faulty input to the method.

While I originally started with changing 'git clone', I decided that
modifying ref_transaction_create() was a more complete solution. This
prevents failing with a BUG() statement when we already have a good way
to report an error (including a reason for that error) within the
method. Both callers properly check the return value and die() with the
error message, so this is an appropriate direction.

The added test helps check against a regression, but does check that our
intended error message is handled correctly.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
    clone: ignore invalid local refs in remote

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1214%2Fderrickstolee%2Frefs-bug-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1214/derrickstolee/refs-bug-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1214

 refs.c                 | 6 ++++--
 t/t5605-clone-local.sh | 9 +++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 1a964505f92..f300f83e4d4 100644
--- a/refs.c
+++ b/refs.c
@@ -1111,8 +1111,10 @@ int ref_transaction_create(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err)
 {
-	if (!new_oid || is_null_oid(new_oid))
-		BUG("create called without valid new_oid");
+	if (!new_oid || is_null_oid(new_oid)) {
+		strbuf_addf(err, "'%s' has a null OID", refname);
+		return 1;
+	}
 	return ref_transaction_update(transaction, refname, new_oid,
 				      null_oid(), flags, msg, err);
 }
diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh
index 7d63365f93a..21ab6192839 100755
--- a/t/t5605-clone-local.sh
+++ b/t/t5605-clone-local.sh
@@ -141,4 +141,13 @@ test_expect_success 'cloning locally respects "-u" for fetching refs' '
 	test_must_fail git clone --bare -u false a should_not_work.git
 '
 
+test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
+	git init corrupt &&
+	test_commit -C corrupt one &&
+	echo a >corrupt/.git/refs/heads/topic &&
+
+	test_must_fail git clone corrupt working 2>err &&
+	grep "has a null OID" err
+'
+
 test_done

base-commit: 11cfe552610386954886543f5de87dcc49ad5735
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] clone: ignore invalid local refs in remote
  2022-04-18 17:12 [PATCH] clone: ignore invalid local refs in remote Derrick Stolee via GitGitGadget
@ 2022-04-18 17:30 ` Ævar Arnfjörð Bjarmason
  2022-04-20 20:53 ` Junio C Hamano
  2022-04-25 13:47 ` [PATCH v2] clone: die() instead of BUG() on bad refs Derrick Stolee via GitGitGadget
  2 siblings, 0 replies; 6+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-18 17:30 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, gitster, me, Derrick Stolee


On Mon, Apr 18 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> When cloning directly from a local repository, we load a list of refs
> based on scanning the $GIT_DIR/refs/ directory of the "server"
> repository. If files exist in that directory that do not parse as
> hexadecimal hashes, then the ref array used by write_remote_refs()
> ends up with some entries with null OIDs. This causes us to hit a BUG()
> statement in ref_transaction_create():
>
>   BUG: create called without valid new_oid
>
> This BUG() call used to be a die() until 033abf97f (Replace all
> die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
> was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
> valid, 2015-02-17).
>
> The original report for this bug [1] mentioned that this problem did not
> exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
> (refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
> GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
> able to successfully compile and test the Git codebase.
>
> [1] https://github.com/git-for-windows/git/issues/3781
>
> There are two approaches to consider here. One would be to remove this
> BUG() statement in favor of returning with an error. There are only two
> callers to ref_transaction_create(), so this would have a limited
> impact.
>
> The other approach would be to add special casing in 'git clone' to
> avoid this faulty input to the method.
>
> While I originally started with changing 'git clone', I decided that
> modifying ref_transaction_create() was a more complete solution. This
> prevents failing with a BUG() statement when we already have a good way
> to report an error (including a reason for that error) within the
> method. Both callers properly check the return value and die() with the
> error message, so this is an appropriate direction.
>
> The added test helps check against a regression, but does check that our
> intended error message is handled correctly.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>     clone: ignore invalid local refs in remote
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1214%2Fderrickstolee%2Frefs-bug-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1214/derrickstolee/refs-bug-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1214
>
>  refs.c                 | 6 ++++--
>  t/t5605-clone-local.sh | 9 +++++++++
>  2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 1a964505f92..f300f83e4d4 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1111,8 +1111,10 @@ int ref_transaction_create(struct ref_transaction *transaction,
>  			   unsigned int flags, const char *msg,
>  			   struct strbuf *err)
>  {
> -	if (!new_oid || is_null_oid(new_oid))
> -		BUG("create called without valid new_oid");
> +	if (!new_oid || is_null_oid(new_oid)) {
> +		strbuf_addf(err, "'%s' has a null OID", refname);
> +		return 1;
> +	}
>  	return ref_transaction_update(transaction, refname, new_oid,
>  				      null_oid(), flags, msg, err);
>  }
> diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh
> index 7d63365f93a..21ab6192839 100755
> --- a/t/t5605-clone-local.sh
> +++ b/t/t5605-clone-local.sh
> @@ -141,4 +141,13 @@ test_expect_success 'cloning locally respects "-u" for fetching refs' '
>  	test_must_fail git clone --bare -u false a should_not_work.git
>  '
>  
> +test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
> +	git init corrupt &&
> +	test_commit -C corrupt one &&
> +	echo a >corrupt/.git/refs/heads/topic &&
> +
> +	test_must_fail git clone corrupt working 2>err &&
> +	grep "has a null OID" err
> +'
> +
>  test_done
>
> base-commit: 11cfe552610386954886543f5de87dcc49ad5735

If I change this test to clone from "$PWD/corrupt" instead we get as far
as dying in "not our ref" in upload-pack", after the "client" said "want
0000....".

I don't see anything wrong with this narrow fix, but I wonder how close
we are to just doing something useful here for the user. I.e. if we
can't parse_object() what we're about to "copy" to e.g. error() on that,
but carry on cloning the rest if possible.

We should probably always return a non-zero at the end, suggesting the
user omit the bad ref with a refspec. But if we did all that we could
usefully recover partially corrupt-repos in this state...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] clone: ignore invalid local refs in remote
  2022-04-18 17:12 [PATCH] clone: ignore invalid local refs in remote Derrick Stolee via GitGitGadget
  2022-04-18 17:30 ` Ævar Arnfjörð Bjarmason
@ 2022-04-20 20:53 ` Junio C Hamano
  2022-04-21 13:29   ` Derrick Stolee
  2022-04-25 13:47 ` [PATCH v2] clone: die() instead of BUG() on bad refs Derrick Stolee via GitGitGadget
  2 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2022-04-20 20:53 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
> Subject: Re: [PATCH] clone: ignore invalid local refs in remote

After seeing the title, I expected that cloning from such a
repository with cruft in .git/refs/ directory would issue a warning
and succeed without these non-ref files.

But that is not what is happening here?

> +test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
> +	git init corrupt &&
> +	test_commit -C corrupt one &&
> +	echo a >corrupt/.git/refs/heads/topic &&
> +
> +	test_must_fail git clone corrupt working 2>err &&
> +	grep "has a null OID" err
> +'
> +

We keep expecting that clone _will_ fail.

So the net change is that we still do not tolerate a corrupt
repository and do not let corruption to propagate through cloning,
but we diagnose this breakage as an error by calling die(), which 
is appropriate for dealing with runtime data error, instead of
hitting a BUG(), which is reserved for program errors.

I agree with the fixed behaviour and implementation.  It just is
that "ignore" on the title seems misleading.  Other than that,
thanks for a good finding and a clean fix.

> -	if (!new_oid || is_null_oid(new_oid))
> -		BUG("create called without valid new_oid");
> +	if (!new_oid || is_null_oid(new_oid)) {
> +		strbuf_addf(err, "'%s' has a null OID", refname);
> +		return 1;
> +	}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] clone: ignore invalid local refs in remote
  2022-04-20 20:53 ` Junio C Hamano
@ 2022-04-21 13:29   ` Derrick Stolee
  2022-04-21 18:00     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Derrick Stolee @ 2022-04-21 13:29 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget; +Cc: git, me

On 4/20/2022 4:53 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <derrickstolee@github.com>
>> Subject: Re: [PATCH] clone: ignore invalid local refs in remote
> 
> After seeing the title, I expected that cloning from such a
> repository with cruft in .git/refs/ directory would issue a warning
> and succeed without these non-ref files.
> 
> But that is not what is happening here?

Sorry, this title of the commit message is stale from a version
where I started making the clones succeed (but without these
bad refs). I changed my mind to only switch BUG() to die() to
avoid giving the impression that we have a "matching" repo after
the clone.
 
>> +test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
>> +	git init corrupt &&
>> +	test_commit -C corrupt one &&
>> +	echo a >corrupt/.git/refs/heads/topic &&
>> +
>> +	test_must_fail git clone corrupt working 2>err &&
>> +	grep "has a null OID" err
>> +'
>> +
> 
> We keep expecting that clone _will_ fail.
> 
> So the net change is that we still do not tolerate a corrupt
> repository and do not let corruption to propagate through cloning,
> but we diagnose this breakage as an error by calling die(), which 
> is appropriate for dealing with runtime data error, instead of
> hitting a BUG(), which is reserved for program errors.
> 
> I agree with the fixed behaviour and implementation.  It just is
> that "ignore" on the title seems misleading.  Other than that,
> thanks for a good finding and a clean fix.

Perhaps the commit could instead start with

	clone: die() instead of BUG() on bad refs

?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] clone: ignore invalid local refs in remote
  2022-04-21 13:29   ` Derrick Stolee
@ 2022-04-21 18:00     ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2022-04-21 18:00 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Derrick Stolee via GitGitGadget, git, me

Derrick Stolee <derrickstolee@github.com> writes:

> Sorry, this title of the commit message is stale from a version
> where I started making the clones succeed (but without these
> bad refs). I changed my mind to only switch BUG() to die() to
> avoid giving the impression that we have a "matching" repo after
> the clone.

Ah, that figures ;-)

> Perhaps the commit could instead start with
>
> 	clone: die() instead of BUG() on bad refs
>
> ?

Yeah, it would be the title I would have been happy with.




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] clone: die() instead of BUG() on bad refs
  2022-04-18 17:12 [PATCH] clone: ignore invalid local refs in remote Derrick Stolee via GitGitGadget
  2022-04-18 17:30 ` Ævar Arnfjörð Bjarmason
  2022-04-20 20:53 ` Junio C Hamano
@ 2022-04-25 13:47 ` Derrick Stolee via GitGitGadget
  2 siblings, 0 replies; 6+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-04-25 13:47 UTC (permalink / raw)
  To: git
  Cc: gitster, me, Ævar Arnfjörð Bjarmason,
	Derrick Stolee, Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

When cloning directly from a local repository, we load a list of refs
based on scanning the $GIT_DIR/refs/ directory of the "server"
repository. If files exist in that directory that do not parse as
hexadecimal hashes, then the ref array used by write_remote_refs()
ends up with some entries with null OIDs. This causes us to hit a BUG()
statement in ref_transaction_create():

  BUG: create called without valid new_oid

This BUG() call used to be a die() until 033abf97f (Replace all
die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
valid, 2015-02-17).

The original report for this bug [1] mentioned that this problem did not
exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
able to successfully compile and test the Git codebase.

[1] https://github.com/git-for-windows/git/issues/3781

There are two approaches to consider here. One would be to remove this
BUG() statement in favor of returning with an error. There are only two
callers to ref_transaction_create(), so this would have a limited
impact.

The other approach would be to add special casing in 'git clone' to
avoid this faulty input to the method.

While I originally started with changing 'git clone', I decided that
modifying ref_transaction_create() was a more complete solution. This
prevents failing with a BUG() statement when we already have a good way
to report an error (including a reason for that error) within the
method. Both callers properly check the return value and die() with the
error message, so this is an appropriate direction.

The added test helps check against a regression, but does check that our
intended error message is handled correctly.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
    clone: ignore invalid local refs in remote
    
    Update in v2:
    
     * Subject line is improved to better match the message and fix.
     * I did not explore Ævar's idea to optimistically continue the clone as
       much as possible. That seems like a bigger change that could be
       investigated independently.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1214%2Fderrickstolee%2Frefs-bug-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1214/derrickstolee/refs-bug-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1214

Range-diff vs v1:

 1:  6c9e85c5a98 ! 1:  9801123c534 clone: ignore invalid local refs in remote
     @@ Metadata
      Author: Derrick Stolee <derrickstolee@github.com>
      
       ## Commit message ##
     -    clone: ignore invalid local refs in remote
     +    clone: die() instead of BUG() on bad refs
      
          When cloning directly from a local repository, we load a list of refs
          based on scanning the $GIT_DIR/refs/ directory of the "server"


 refs.c                 | 6 ++++--
 t/t5605-clone-local.sh | 9 +++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 1a964505f92..f300f83e4d4 100644
--- a/refs.c
+++ b/refs.c
@@ -1111,8 +1111,10 @@ int ref_transaction_create(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err)
 {
-	if (!new_oid || is_null_oid(new_oid))
-		BUG("create called without valid new_oid");
+	if (!new_oid || is_null_oid(new_oid)) {
+		strbuf_addf(err, "'%s' has a null OID", refname);
+		return 1;
+	}
 	return ref_transaction_update(transaction, refname, new_oid,
 				      null_oid(), flags, msg, err);
 }
diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh
index 7d63365f93a..21ab6192839 100755
--- a/t/t5605-clone-local.sh
+++ b/t/t5605-clone-local.sh
@@ -141,4 +141,13 @@ test_expect_success 'cloning locally respects "-u" for fetching refs' '
 	test_must_fail git clone --bare -u false a should_not_work.git
 '
 
+test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
+	git init corrupt &&
+	test_commit -C corrupt one &&
+	echo a >corrupt/.git/refs/heads/topic &&
+
+	test_must_fail git clone corrupt working 2>err &&
+	grep "has a null OID" err
+'
+
 test_done

base-commit: 11cfe552610386954886543f5de87dcc49ad5735
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-04-25 13:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-18 17:12 [PATCH] clone: ignore invalid local refs in remote Derrick Stolee via GitGitGadget
2022-04-18 17:30 ` Ævar Arnfjörð Bjarmason
2022-04-20 20:53 ` Junio C Hamano
2022-04-21 13:29   ` Derrick Stolee
2022-04-21 18:00     ` Junio C Hamano
2022-04-25 13:47 ` [PATCH v2] clone: die() instead of BUG() on bad refs Derrick Stolee via GitGitGadget

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).