git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH] merge-tree: fix segmentation fault in read-only repositories
@ 2022-09-21 15:30 Johannes Schindelin via GitGitGadget
  2022-09-21 15:42 ` Taylor Blau
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-21 15:30 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Independent of the question whether we want `git merge-tree` to report a
tree name even when it failed to write the tree objects in a read-only
repository, there is no question that we should avoid a segmentation
fault.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    merge-tree: fix segmentation fault in read-only repositories
    
    Turns out that the segmentation fault reported by Taylor
    [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
    while testing merge-ort in a read-only repository, and that the upstream
    version of git merge-tree is as affected as GitHub's internal version.
    
    Note: I briefly considered using the OID of the_hash_algo->empty_tree
    instead of null_oid() when no tree object could be constructed. However,
    I have come to the conclusion that this could potentially cause its own
    set of problems because it would relate to a valid tree object even if
    we do not have any valid tree object to play with.
    
    Also note: The question I hinted at in the commit message, namely
    whether or not to try harder to construct a tree object even if we
    cannot write it out, maybe merits a longer discussion, one that I think
    we should have after v2.38.0 is released, so as not to distract from
    focusing on v2.38.0.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

 builtin/merge-tree.c             | 4 +++-
 t/t4301-merge-tree-write-tree.sh | 8 ++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index ae5782917b9..25c7142a882 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
 	struct commit_list *merge_bases = NULL;
 	struct merge_options opt;
 	struct merge_result result = { 0 };
+	const struct object_id *tree_oid;
 
 	parent1 = get_merge_parent(branch1);
 	if (!parent1)
@@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
 	if (o->show_messages == -1)
 		o->show_messages = !result.clean;
 
-	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
+	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
+	printf("%s%c", oid_to_hex(tree_oid), line_termination);
 	if (!result.clean) {
 		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
 		const char *last = NULL;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..e56b1ba6e50 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
@@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
@ 2022-09-21 15:42 ` Taylor Blau
  2022-09-21 20:12   ` Johannes Schindelin
  2022-09-21 18:12 ` Junio C Hamano
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 16+ messages in thread
From: Taylor Blau @ 2022-09-21 15:42 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Johannes Schindelin

On Wed, Sep 21, 2022 at 03:30:48PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index ae5782917b9..25c7142a882 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
>  	struct commit_list *merge_bases = NULL;
>  	struct merge_options opt;
>  	struct merge_result result = { 0 };
> +	const struct object_id *tree_oid;
>
>  	parent1 = get_merge_parent(branch1);
>  	if (!parent1)
> @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
>  	if (o->show_messages == -1)
>  		o->show_messages = !result.clean;
>
> -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> +	printf("%s%c", oid_to_hex(tree_oid), line_termination);

My understanding is that we can get a clean result from
merge_incore_recursive(), but still have failed to write the physical
tree object out?

In other words, the two sides *could* have been merged, but the actual
result of doing that merge couldn't be written to disk (e.g., because
the repository is read-only or some such)?

If so, then this approach makes sense, and I agree with your idea to
use the all-zeros OID instead of the empty tree one. It would be
nice(r?) if we could just abort this command earlier, since `merge-tree`
promises that we'll write the result out as an object. So I don't think
a non-zero exit before we have to print the resulting tree object would
be unexpected.

But I don't have a strong feeling about it either way. So, if you want
to proceed here and just emit the all-zeros OID, I think that is a fine
approach.

> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 28ca5c38bb5..e56b1ba6e50 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
>  	test_cmp expect actual
>  '
>
> +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> +	git init --bare read-only &&
> +	git push read-only side1 side2 &&
> +	test_when_finished "chmod -R u+w read-only" &&

Do we care about keeping this read-only repository around after the
test is done? It seems odd to have a directory called "read-only" be
user-writable. I'd probably just as soon replace this with:

  test_when_finished "rm -fr read-only" &&

Otherwise, this patch looks good. Thanks for working on it!

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
  2022-09-21 15:42 ` Taylor Blau
@ 2022-09-21 18:12 ` Junio C Hamano
  2022-09-21 20:13   ` Johannes Schindelin
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2022-09-21 18:12 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> +	git init --bare read-only &&
> +	git push read-only side1 side2 &&
> +	test_when_finished "chmod -R u+w read-only" &&
> +	chmod -R a-w read-only &&
> +	test_must_fail git -C read-only merge-tree side1 side2
> +'

Doesn't this test need a prerequisite to guard against those whose
filesystem lacks SANITY?



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:42 ` Taylor Blau
@ 2022-09-21 20:12   ` Johannes Schindelin
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Schindelin @ 2022-09-21 20:12 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren

Hi Taylor,

On Wed, 21 Sep 2022, Taylor Blau wrote:

> On Wed, Sep 21, 2022 at 03:30:48PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> > index ae5782917b9..25c7142a882 100644
> > --- a/builtin/merge-tree.c
> > +++ b/builtin/merge-tree.c
> > @@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
> >  	struct commit_list *merge_bases = NULL;
> >  	struct merge_options opt;
> >  	struct merge_result result = { 0 };
> > +	const struct object_id *tree_oid;
> >
> >  	parent1 = get_merge_parent(branch1);
> >  	if (!parent1)
> > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> >  	if (o->show_messages == -1)
> >  		o->show_messages = !result.clean;
> >
> > -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > +	printf("%s%c", oid_to_hex(tree_oid), line_termination);
>
> My understanding is that we can get a clean result from
> merge_incore_recursive(), but still have failed to write the physical
> tree object out?

No, we do not get a clean result _because_ we have failed to write the
tree object.

> In other words, the two sides *could* have been merged, but the actual
> result of doing that merge couldn't be written to disk (e.g., because
> the repository is read-only or some such)?

Since we could not write the tree object, we cannot say whether the
overall merge could have been merged or not.

All we can say is whether the merge has been clean _so far_ or not. Which
is not very helpful.

Now, taking that thought a bit further, I did get suspicious that my patch
still contained a bug, and lo and behold, this bug really was there: when
the merge _has_ been clean so far, we report success (via exit code 0),
even if the tree could not be written, and therefore the `git merge-tree`
command should have failed!

I fixed the bug in the upcoming v2 and extended the test case to verify
that this bug no longer exists.

> If so, then this approach makes sense, and I agree with your idea to
> use the all-zeros OID instead of the empty tree one. It would be
> nice(r?) if we could just abort this command earlier, since `merge-tree`
> promises that we'll write the result out as an object. So I don't think
> a non-zero exit before we have to print the resulting tree object would
> be unexpected.

The idea of aborting the command earlier might make sense in isolation,
and when you're familiar with Git's built-ins taking the easy `die()` way
out upon failure without having to bother to release resources.

However, when you keep in mind that the `merge-tree` command is far from
its full potential and that really interesting ideas are still to be
implemented, some even well on their way like batched merges (see
https://github.com/gitgitgadget/git/pull/1361), it starts to become a much
better idea to keep the code as libified as possible, without any aborts.
Such an abort in the middle of the code path would interfere horribly with
the idea of batched merges.

> But I don't have a strong feeling about it either way. So, if you want
> to proceed here and just emit the all-zeros OID, I think that is a fine
> approach.
>
> > diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> > index 28ca5c38bb5..e56b1ba6e50 100755
> > --- a/t/t4301-merge-tree-write-tree.sh
> > +++ b/t/t4301-merge-tree-write-tree.sh
> > @@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
> >  	test_cmp expect actual
> >  '
> >
> > +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> > +	git init --bare read-only &&
> > +	git push read-only side1 side2 &&
> > +	test_when_finished "chmod -R u+w read-only" &&
>
> Do we care about keeping this read-only repository around after the
> test is done? It seems odd to have a directory called "read-only" be
> user-writable. I'd probably just as soon replace this with:
>
>   test_when_finished "rm -fr read-only" &&

Speaking from way more experience debugging CI failures than I care to
have, it makes a lot of sense to keep this directory until the complete
test script finishes with utter success. Because if the test case fails,
and the CI tars up the "trash" directory, and the crucial files to
diagnose what might have gone wrong are missing from that tar file, my own
future life will be a lot harder than I like.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 18:12 ` Junio C Hamano
@ 2022-09-21 20:13   ` Johannes Schindelin
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Schindelin @ 2022-09-21 20:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren

Hi Junio,

On Wed, 21 Sep 2022, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> > +	git init --bare read-only &&
> > +	git push read-only side1 side2 &&
> > +	test_when_finished "chmod -R u+w read-only" &&
> > +	chmod -R a-w read-only &&
> > +	test_must_fail git -C read-only merge-tree side1 side2
> > +'
>
> Doesn't this test need a prerequisite to guard against those whose
> filesystem lacks SANITY?

Yes, of course!

Thank you for the sanity check (pun intended!),
Dscho

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
  2022-09-21 15:42 ` Taylor Blau
  2022-09-21 18:12 ` Junio C Hamano
@ 2022-09-21 22:08 ` Johannes Schindelin via GitGitGadget
  2022-09-21 22:24   ` Junio C Hamano
                     ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-21 22:08 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Independent of the question whether we want `git merge-tree` to report
the tree name even when it failed to write the tree objects in a
read-only repository, there is no question that we should avoid a
segmentation fault.

And when we report an invalid tree name (because the tree could not be
written), we need the exit status to be non-zero.

Let's make it so.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    merge-tree: fix segmentation fault in read-only repositories
    
    Turns out that the segmentation fault reported by Taylor
    [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
    while testing merge-ort in a read-only repository, and that the upstream
    version of git merge-tree is as affected as GitHub's internal version.
    
    Note: I briefly considered using the OID of the_hash_algo->empty_tree
    instead of null_oid() when no tree object could be constructed. However,
    I have come to the conclusion that this could potentially cause its own
    set of problems because it would relate to a valid tree object even if
    we do not have any valid tree object to play with.
    
    Also note: The question I hinted at in the commit message, namely
    whether or not to try harder to construct a tree object even if we
    cannot write it out, maybe merits a longer discussion, one that I think
    we should have after v2.38.0 is released, so as not to distract from
    focusing on v2.38.0.
    
    Changes since v1:
    
     * Using the SANITY prereq now
     * If the merge was aborted due to a write error, exit with a non-zero
       code even if the (potentially partial) merge is clean
     * The test case now also verifies that the git merge-tree command fails
       in a read-only repository even if the merge would have succeeded

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

Range-diff vs v1:

 1:  beba3fd9a95 ! 1:  1de4df3f471 merge-tree: fix segmentation fault in read-only repositories
     @@ Metadata
       ## Commit message ##
          merge-tree: fix segmentation fault in read-only repositories
      
     -    Independent of the question whether we want `git merge-tree` to report a
     -    tree name even when it failed to write the tree objects in a read-only
     -    repository, there is no question that we should avoid a segmentation
     -    fault.
     +    Independent of the question whether we want `git merge-tree` to report
     +    the tree name even when it failed to write the tree objects in a
     +    read-only repository, there is no question that we should avoid a
     +    segmentation fault.
     +
     +    And when we report an invalid tree name (because the tree could not be
     +    written), we need the exit status to be non-zero.
     +
     +    Let's make it so.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     @@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
       	if (!result.clean) {
       		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
       		const char *last = NULL;
     +@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     + 					      &result);
     + 	}
     + 	merge_finalize(&opt, &result);
     +-	return !result.clean; /* result.clean < 0 handled above */
     ++	return !result.tree || !result.clean; /* result.clean < 0 handled above */
     + }
     + 
     + int cmd_merge_tree(int argc, const char **argv, const char *prefix)
      
       ## t/t4301-merge-tree-write-tree.sh ##
      @@ t/t4301-merge-tree-write-tree.sh: test_expect_success 'can override merge of unrelated histories' '
       	test_cmp expect actual
       '
       
     -+test_expect_success 'merge-ort fails gracefully in a read-only repository' '
     ++test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
      +	git init --bare read-only &&
     -+	git push read-only side1 side2 &&
     ++	git push read-only side1 side2 side3 &&
      +	test_when_finished "chmod -R u+w read-only" &&
      +	chmod -R a-w read-only &&
     ++	test_must_fail git -C read-only merge-tree side1 side3 &&
      +	test_must_fail git -C read-only merge-tree side1 side2
      +'
      +


 builtin/merge-tree.c             | 6 ++++--
 t/t4301-merge-tree-write-tree.sh | 9 +++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index ae5782917b9..0df24eb82d4 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
 	struct commit_list *merge_bases = NULL;
 	struct merge_options opt;
 	struct merge_result result = { 0 };
+	const struct object_id *tree_oid;
 
 	parent1 = get_merge_parent(branch1);
 	if (!parent1)
@@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
 	if (o->show_messages == -1)
 		o->show_messages = !result.clean;
 
-	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
+	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
+	printf("%s%c", oid_to_hex(tree_oid), line_termination);
 	if (!result.clean) {
 		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
 		const char *last = NULL;
@@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
 					      &result);
 	}
 	merge_finalize(&opt, &result);
-	return !result.clean; /* result.clean < 0 handled above */
+	return !result.tree || !result.clean; /* result.clean < 0 handled above */
 }
 
 int cmd_merge_tree(int argc, const char **argv, const char *prefix)
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..013b77144bd 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
@ 2022-09-21 22:24   ` Junio C Hamano
  2022-09-22 19:52     ` Johannes Schindelin
  2022-09-21 22:56   ` Elijah Newren
  2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2022-09-21 22:24 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Taylor Blau, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +	const struct object_id *tree_oid;
>  
>  	parent1 = get_merge_parent(branch1);
>  	if (!parent1)
> @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
>  	if (o->show_messages == -1)
>  		o->show_messages = !result.clean;
>  
> -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> +	printf("%s%c", oid_to_hex(tree_oid), line_termination);

As we are not using tree_oid anywhere else, it might be cleaner to
switch on result.tree more explicitly, especially given that the
return statement also uses result.tree to decide what exit status to
use, e.g.

	if (result.tree)
		fputs(oid_to_hex(&result.tree->object.oid), stdout);
	else
		fputs(oid_to_hex(null_oid()), stdout);
	fputc(line_termination, stdout);
	
but that is merely "might be cleaner".

> -	return !result.clean; /* result.clean < 0 handled above */
> +	return !result.tree || !result.clean; /* result.clean < 0 handled above */

The original returned 1 when we failed to cleanly merge and 0 when
we cleanly merged.  Now, if we failed to come up with a merged tree,
result.tree would be NULL and we return 1 from the function, because
we failed to cleanly merge.

OK.  These negations make my head spin X-<.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2022-09-21 22:24   ` Junio C Hamano
@ 2022-09-21 22:56   ` Elijah Newren
  2022-09-21 23:11     ` Junio C Hamano
                       ` (2 more replies)
  2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
  2 siblings, 3 replies; 16+ messages in thread
From: Elijah Newren @ 2022-09-21 22:56 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: Git Mailing List, Taylor Blau, Johannes Schindelin

On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Independent of the question whether we want `git merge-tree` to report
> the tree name even when it failed to write the tree objects in a
> read-only repository, there is no question that we should avoid a
> segmentation fault.

Ah, a read-only repo.  Looks like we didn't bother to check the return
status of write_object_file() in one of the two cases; oops.  (And it
looks like the other case does not correctly propagate the error
upwards far enough...)

> And when we report an invalid tree name (because the tree could not be
> written), we need the exit status to be non-zero.
>
> Let's make it so.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     merge-tree: fix segmentation fault in read-only repositories
>
>     Turns out that the segmentation fault reported by Taylor
>     [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
>     while testing merge-ort in a read-only repository, and that the upstream
>     version of git merge-tree is as affected as GitHub's internal version.
>
>     Note: I briefly considered using the OID of the_hash_algo->empty_tree
>     instead of null_oid() when no tree object could be constructed. However,
>     I have come to the conclusion that this could potentially cause its own
>     set of problems because it would relate to a valid tree object even if
>     we do not have any valid tree object to play with.
>
>     Also note: The question I hinted at in the commit message, namely
>     whether or not to try harder to construct a tree object even if we
>     cannot write it out, maybe merits a longer discussion, one that I think
>     we should have after v2.38.0 is released, so as not to distract from
>     focusing on v2.38.0.

That's fair, but you know I'm going to have a hard time refraining
from commenting on it anyway...  :-)

[...]
> @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
>         if (o->show_messages == -1)
>                 o->show_messages = !result.clean;
>
> -       printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> +       tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> +       printf("%s%c", oid_to_hex(tree_oid), line_termination);

Perhaps also print a warning to stderr when result.tree is NULL?

>         if (!result.clean) {
>                 struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
>                 const char *last = NULL;
> @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
>                                               &result);
>         }
>         merge_finalize(&opt, &result);
> -       return !result.clean; /* result.clean < 0 handled above */
> +       return !result.tree || !result.clean; /* result.clean < 0 handled above */

Thinking out loud, should this logic be at the merge-ort.c level,
perhaps something like this:

@@ -4940,6 +4941,9 @@ static void
merge_ort_nonrecursive_internal(struct merge_options *opt,
        result->tree = parse_tree_indirect(&working_tree_oid);
        /* existence of conflicted entries implies unclean */
        result->clean &= strmap_empty(&opt->priv->conflicted);
+       if (!result->tree)
+               /* This shouldn't happen, because if we did fail to
write a tree we should have returned early before getting here.  But
just in case we have some bugs... */
+               result->clean = -1;
        if (!opt->priv->call_depth) {
                result->priv = opt->priv;
                result->_properly_initialized = RESULT_INITIALIZED;

That might benefit callers other than merge-tree, though maybe it
makes it harder to print a helpful error message (unless we're fine
with the library always throwing one?)

> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 28ca5c38bb5..013b77144bd 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
>         test_cmp expect actual
>  '
>
> +test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
> +       git init --bare read-only &&
> +       git push read-only side1 side2 side3 &&
> +       test_when_finished "chmod -R u+w read-only" &&
> +       chmod -R a-w read-only &&
> +       test_must_fail git -C read-only merge-tree side1 side3 &&
> +       test_must_fail git -C read-only merge-tree side1 side2
> +'
> +
>  test_done
>
> base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
> --
> gitgitgadget

This is a reasonable workaround from the merge-tree side, if we expect
merge-ort to not correctly notify us of issues (which may be fair
given that a quick glance suggests we have more than one problematic
codepath, which I'll discuss more below).  However, I do think
merge-ort should be returning a negative value for the "clean" status
in such a case.  If merge-ort did that, then we wouldn't even get to
your new code because merge-tree.c already checks for that:

    merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
    if (result.clean < 0)
        die(_("failure to merge"));

I have a hack above which sets a negative return value, but it's only
a workaround since it comes from a late detect-after-the-fact
location.  Here's another ugly hack, but one which highlights exactly
where the real problem occurs:

diff --git a/merge-ort.c b/merge-ort.c
index 99dcee2db8..c2144e9220 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3605,7 +3605,8 @@ static void write_tree(struct object_id *result_oid,
        }

        /* Write this object file out, and record in result_oid */
-       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
+       if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
+               die(_("Unable to add new tree to database"));
        strbuf_release(&buf);
 }

die()ing, of course, is not a good choice, we should really return -1
instead.  Unfortunately, the callers aren't currently prepared to
propagate such a value upwards (as reflected in the return type of the
function) so any fix in this area is a little more involved than just
modifying these few lines.  Also, there is one other call to
write_object_file() in merge-ort.c which does check and return a value
of -1, though looking around it appears the callers of that other site
do not always correctly check and propagate a return value of -1
further upwards as they should, meaning we are losing track of the
fact that the merge failed to function.

Given the lack of correct propagation of -1 return values in the other
codepath plus the error here, keeping some form of workaround sounds
fair (though it may be nice to print an error that something fishy
happened).

And then, possibly post-v2.38.0 though we may be able to get it in
before, getting correct propagation of a -1 return value from the
source of the error would be good.  Would you like to look into that,
or would you prefer I did?

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:56   ` Elijah Newren
@ 2022-09-21 23:11     ` Junio C Hamano
  2022-09-22 17:24     ` Johannes Schindelin
  2022-09-22 19:50     ` Johannes Schindelin
  2 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2022-09-21 23:11 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List,
	Taylor Blau, Johannes Schindelin

Elijah Newren <newren@gmail.com> writes:

>> @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
>>                                               &result);
>>         }
>>         merge_finalize(&opt, &result);
>> -       return !result.clean; /* result.clean < 0 handled above */
>> +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> Thinking out loud, should this logic be at the merge-ort.c level,

True.

> ...  However, I do think
> merge-ort should be returning a negative value for the "clean" status
> in such a case.

True, too.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:56   ` Elijah Newren
  2022-09-21 23:11     ` Junio C Hamano
@ 2022-09-22 17:24     ` Johannes Schindelin
  2022-09-22 19:50     ` Johannes Schindelin
  2 siblings, 0 replies; 16+ messages in thread
From: Johannes Schindelin @ 2022-09-22 17:24 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

Hi Elijah,

just quickly...

On Wed, 21 Sep 2022, Elijah Newren wrote:

> On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
> >                                               &result);
> >         }
> >         merge_finalize(&opt, &result);
> > -       return !result.clean; /* result.clean < 0 handled above */
> > +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> Thinking out loud, should this logic be at the merge-ort.c level,

You're right, of course. I have pushed up a tentative v3 that does it as
you proposed, and it looks not half bad. I'll look a bit more deeply,
still, but chances are that I'll submit this later today as v3 as-is.

And then I will reply a bit more verbosely, either ;-)

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3] merge-ort: fix segmentation fault in read-only repositories
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2022-09-21 22:24   ` Junio C Hamano
  2022-09-21 22:56   ` Elijah Newren
@ 2022-09-22 19:46   ` Johannes Schindelin via GitGitGadget
  2022-09-23  1:38     ` Elijah Newren
  2 siblings, 1 reply; 16+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-22 19:46 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If the blob/tree objects cannot be written, we really need the merge
operations to fail, and not to continue (and then try to access the tree
object which is however still set to `NULL`).

Let's stop ignoring the return value of `write_object_file()` and
`write_tree()` and set `clean = -1` in the error case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    merge-tree: fix segmentation fault in read-only repositories
    
    Turns out that the segmentation fault reported by Taylor
    [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
    while testing merge-ort in a read-only repository, and that the upstream
    version of git merge-tree is as affected as GitHub's internal version.
    
    Changes since v2:
    
     * Completely changed the approach by no longer touching
       builtin/merge-tree.c but instead fixing the underlying merge-ort
       layer: we no longer ignore the return value of write_tree() and
       write_object_file().
    
    Changes since v1:
    
     * Using the SANITY prereq now
     * If the merge was aborted due to a write error, exit with a non-zero
       code even if the (potentially partial) merge is clean
     * The test case now also verifies that the git merge-tree command fails
       in a read-only repository even if the merge would have succeeded

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

Range-diff vs v2:

 1:  1de4df3f471 ! 1:  198ff455f90 merge-tree: fix segmentation fault in read-only repositories
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    merge-tree: fix segmentation fault in read-only repositories
     +    merge-ort: fix segmentation fault in read-only repositories
      
     -    Independent of the question whether we want `git merge-tree` to report
     -    the tree name even when it failed to write the tree objects in a
     -    read-only repository, there is no question that we should avoid a
     -    segmentation fault.
     +    If the blob/tree objects cannot be written, we really need the merge
     +    operations to fail, and not to continue (and then try to access the tree
     +    object which is however still set to `NULL`).
      
     -    And when we report an invalid tree name (because the tree could not be
     -    written), we need the exit status to be non-zero.
     -
     -    Let's make it so.
     +    Let's stop ignoring the return value of `write_object_file()` and
     +    `write_tree()` and set `clean = -1` in the error case.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     - ## builtin/merge-tree.c ##
     -@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     - 	struct commit_list *merge_bases = NULL;
     - 	struct merge_options opt;
     - 	struct merge_result result = { 0 };
     -+	const struct object_id *tree_oid;
     - 
     - 	parent1 = get_merge_parent(branch1);
     - 	if (!parent1)
     -@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     - 	if (o->show_messages == -1)
     - 		o->show_messages = !result.clean;
     - 
     --	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
     -+	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
     -+	printf("%s%c", oid_to_hex(tree_oid), line_termination);
     - 	if (!result.clean) {
     - 		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
     - 		const char *last = NULL;
     -@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     - 					      &result);
     + ## merge-ort.c ##
     +@@ merge-ort.c: static int tree_entry_order(const void *a_, const void *b_)
     + 				 b->string, strlen(b->string), bmi->result.mode);
     + }
     + 
     +-static void write_tree(struct object_id *result_oid,
     +-		       struct string_list *versions,
     +-		       unsigned int offset,
     +-		       size_t hash_size)
     ++static int write_tree(struct object_id *result_oid,
     ++		      struct string_list *versions,
     ++		      unsigned int offset,
     ++		      size_t hash_size)
     + {
     + 	size_t maxlen = 0, extra;
     + 	unsigned int nr;
     + 	struct strbuf buf = STRBUF_INIT;
     +-	int i;
     ++	int i, ret;
     + 
     + 	assert(offset <= versions->nr);
     + 	nr = versions->nr - offset;
     +@@ merge-ort.c: static void write_tree(struct object_id *result_oid,
     + 	}
     + 
     + 	/* Write this object file out, and record in result_oid */
     +-	write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
     ++	ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
     + 	strbuf_release(&buf);
     ++	return ret;
     + }
     + 
     + static void record_entry_for_tree(struct directory_versions *dir_metadata,
     +@@ merge-ort.c: static void record_entry_for_tree(struct directory_versions *dir_metadata,
     + 			   basename)->util = &mi->result;
     + }
     + 
     +-static void write_completed_directory(struct merge_options *opt,
     +-				      const char *new_directory_name,
     +-				      struct directory_versions *info)
     ++static int write_completed_directory(struct merge_options *opt,
     ++				     const char *new_directory_name,
     ++				     struct directory_versions *info)
     + {
     + 	const char *prev_dir;
     + 	struct merged_info *dir_info = NULL;
     +-	unsigned int offset;
     ++	unsigned int offset, ret = 0;
     + 
     + 	/*
     + 	 * Some explanation of info->versions and info->offsets...
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 	 * strcmp here.)
     + 	 */
     + 	if (new_directory_name == info->last_directory)
     +-		return;
     ++		return 0;
     + 
     + 	/*
     + 	 * If we are just starting (last_directory is NULL), or last_directory
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 		 */
     + 		string_list_append(&info->offsets,
     + 				   info->last_directory)->util = (void*)offset;
     +-		return;
     ++		return 0;
     + 	}
     + 
     + 	/*
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 		 */
     + 		dir_info->is_null = 0;
     + 		dir_info->result.mode = S_IFDIR;
     +-		write_tree(&dir_info->result.oid, &info->versions, offset,
     +-			   opt->repo->hash_algo->rawsz);
     ++		if (write_tree(&dir_info->result.oid, &info->versions, offset,
     ++			       opt->repo->hash_algo->rawsz) < 0)
     ++			ret = -1;
     + 	}
     + 
     + 	/*
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 	/* And, of course, we need to update last_directory to match. */
     + 	info->last_directory = new_directory_name;
     + 	info->last_directory_len = strlen(info->last_directory);
     ++
     ++	return ret;
     + }
     + 
     + /* Per entry merge function */
     +@@ merge-ort.c: static void prefetch_for_content_merges(struct merge_options *opt,
     + 	oid_array_clear(&to_fetch);
     + }
     + 
     +-static void process_entries(struct merge_options *opt,
     +-			    struct object_id *result_oid)
     ++static int process_entries(struct merge_options *opt,
     ++			   struct object_id *result_oid)
     + {
     + 	struct hashmap_iter iter;
     + 	struct strmap_entry *e;
     +@@ merge-ort.c: static void process_entries(struct merge_options *opt,
     + 	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
     + 						   STRING_LIST_INIT_NODUP,
     + 						   NULL, 0 };
     ++	int ret;
     + 
     + 	trace2_region_enter("merge", "process_entries setup", opt->repo);
     + 	if (strmap_empty(&opt->priv->paths)) {
     + 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
     +-		return;
     ++		return 0;
       	}
     - 	merge_finalize(&opt, &result);
     --	return !result.clean; /* result.clean < 0 handled above */
     -+	return !result.tree || !result.clean; /* result.clean < 0 handled above */
     + 
     + 	/* Hack to pre-allocate plist to the desired size */
     +@@ merge-ort.c: static void process_entries(struct merge_options *opt,
     + 		 */
     + 		struct merged_info *mi = entry->util;
     + 
     +-		write_completed_directory(opt, mi->directory_name,
     +-					  &dir_metadata);
     ++		if (write_completed_directory(opt, mi->directory_name,
     ++					      &dir_metadata) < 0) {
     ++			ret = -1;
     ++			goto cleanup;
     ++		}
     + 		if (mi->clean)
     + 			record_entry_for_tree(&dir_metadata, path, mi);
     + 		else {
     +@@ merge-ort.c: static void process_entries(struct merge_options *opt,
     + 		fflush(stdout);
     + 		BUG("dir_metadata accounting completely off; shouldn't happen");
     + 	}
     +-	write_tree(result_oid, &dir_metadata.versions, 0,
     +-		   opt->repo->hash_algo->rawsz);
     ++	ret = write_tree(result_oid, &dir_metadata.versions, 0,
     ++			 opt->repo->hash_algo->rawsz);
     ++cleanup:
     + 	string_list_clear(&plist, 0);
     + 	string_list_clear(&dir_metadata.versions, 0);
     + 	string_list_clear(&dir_metadata.offsets, 0);
     + 	trace2_region_leave("merge", "process_entries cleanup", opt->repo);
     ++
     ++	return ret;
       }
       
     - int cmd_merge_tree(int argc, const char **argv, const char *prefix)
     + /*** Function Grouping: functions related to merge_switch_to_result() ***/
     +@@ merge-ort.c: redo:
     + 	}
     + 
     + 	trace2_region_enter("merge", "process_entries", opt->repo);
     +-	process_entries(opt, &working_tree_oid);
     ++	if (process_entries(opt, &working_tree_oid) < 0)
     ++		result->clean = -1;
     + 	trace2_region_leave("merge", "process_entries", opt->repo);
     + 
     + 	/* Set return values */
     + 	result->path_messages = &opt->priv->conflicts;
     + 
     +-	result->tree = parse_tree_indirect(&working_tree_oid);
     +-	/* existence of conflicted entries implies unclean */
     +-	result->clean &= strmap_empty(&opt->priv->conflicted);
     ++	if (result->clean >= 0) {
     ++		result->tree = parse_tree_indirect(&working_tree_oid);
     ++		/* existence of conflicted entries implies unclean */
     ++		result->clean &= strmap_empty(&opt->priv->conflicted);
     ++	}
     + 	if (!opt->priv->call_depth) {
     + 		result->priv = opt->priv;
     + 		result->_properly_initialized = RESULT_INITIALIZED;
      
       ## t/t4301-merge-tree-write-tree.sh ##
      @@ t/t4301-merge-tree-write-tree.sh: test_expect_success 'can override merge of unrelated histories' '


 merge-ort.c                      | 64 +++++++++++++++++++-------------
 t/t4301-merge-tree-write-tree.sh |  9 +++++
 2 files changed, 48 insertions(+), 25 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 99dcee2db8a..f654296220e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
 				 b->string, strlen(b->string), bmi->result.mode);
 }
 
-static void write_tree(struct object_id *result_oid,
-		       struct string_list *versions,
-		       unsigned int offset,
-		       size_t hash_size)
+static int write_tree(struct object_id *result_oid,
+		      struct string_list *versions,
+		      unsigned int offset,
+		      size_t hash_size)
 {
 	size_t maxlen = 0, extra;
 	unsigned int nr;
 	struct strbuf buf = STRBUF_INIT;
-	int i;
+	int i, ret;
 
 	assert(offset <= versions->nr);
 	nr = versions->nr - offset;
@@ -3605,8 +3605,9 @@ static void write_tree(struct object_id *result_oid,
 	}
 
 	/* Write this object file out, and record in result_oid */
-	write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
+	ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
 	strbuf_release(&buf);
+	return ret;
 }
 
 static void record_entry_for_tree(struct directory_versions *dir_metadata,
@@ -3625,13 +3626,13 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata,
 			   basename)->util = &mi->result;
 }
 
-static void write_completed_directory(struct merge_options *opt,
-				      const char *new_directory_name,
-				      struct directory_versions *info)
+static int write_completed_directory(struct merge_options *opt,
+				     const char *new_directory_name,
+				     struct directory_versions *info)
 {
 	const char *prev_dir;
 	struct merged_info *dir_info = NULL;
-	unsigned int offset;
+	unsigned int offset, ret = 0;
 
 	/*
 	 * Some explanation of info->versions and info->offsets...
@@ -3717,7 +3718,7 @@ static void write_completed_directory(struct merge_options *opt,
 	 * strcmp here.)
 	 */
 	if (new_directory_name == info->last_directory)
-		return;
+		return 0;
 
 	/*
 	 * If we are just starting (last_directory is NULL), or last_directory
@@ -3739,7 +3740,7 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		string_list_append(&info->offsets,
 				   info->last_directory)->util = (void*)offset;
-		return;
+		return 0;
 	}
 
 	/*
@@ -3769,8 +3770,9 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		dir_info->is_null = 0;
 		dir_info->result.mode = S_IFDIR;
-		write_tree(&dir_info->result.oid, &info->versions, offset,
-			   opt->repo->hash_algo->rawsz);
+		if (write_tree(&dir_info->result.oid, &info->versions, offset,
+			       opt->repo->hash_algo->rawsz) < 0)
+			ret = -1;
 	}
 
 	/*
@@ -3798,6 +3800,8 @@ static void write_completed_directory(struct merge_options *opt,
 	/* And, of course, we need to update last_directory to match. */
 	info->last_directory = new_directory_name;
 	info->last_directory_len = strlen(info->last_directory);
+
+	return ret;
 }
 
 /* Per entry merge function */
@@ -4214,8 +4218,8 @@ static void prefetch_for_content_merges(struct merge_options *opt,
 	oid_array_clear(&to_fetch);
 }
 
-static void process_entries(struct merge_options *opt,
-			    struct object_id *result_oid)
+static int process_entries(struct merge_options *opt,
+			   struct object_id *result_oid)
 {
 	struct hashmap_iter iter;
 	struct strmap_entry *e;
@@ -4224,11 +4228,12 @@ static void process_entries(struct merge_options *opt,
 	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
 						   STRING_LIST_INIT_NODUP,
 						   NULL, 0 };
+	int ret;
 
 	trace2_region_enter("merge", "process_entries setup", opt->repo);
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
-		return;
+		return 0;
 	}
 
 	/* Hack to pre-allocate plist to the desired size */
@@ -4270,8 +4275,11 @@ static void process_entries(struct merge_options *opt,
 		 */
 		struct merged_info *mi = entry->util;
 
-		write_completed_directory(opt, mi->directory_name,
-					  &dir_metadata);
+		if (write_completed_directory(opt, mi->directory_name,
+					      &dir_metadata) < 0) {
+			ret = -1;
+			goto cleanup;
+		}
 		if (mi->clean)
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
@@ -4291,12 +4299,15 @@ static void process_entries(struct merge_options *opt,
 		fflush(stdout);
 		BUG("dir_metadata accounting completely off; shouldn't happen");
 	}
-	write_tree(result_oid, &dir_metadata.versions, 0,
-		   opt->repo->hash_algo->rawsz);
+	ret = write_tree(result_oid, &dir_metadata.versions, 0,
+			 opt->repo->hash_algo->rawsz);
+cleanup:
 	string_list_clear(&plist, 0);
 	string_list_clear(&dir_metadata.versions, 0);
 	string_list_clear(&dir_metadata.offsets, 0);
 	trace2_region_leave("merge", "process_entries cleanup", opt->repo);
+
+	return ret;
 }
 
 /*** Function Grouping: functions related to merge_switch_to_result() ***/
@@ -4928,15 +4939,18 @@ redo:
 	}
 
 	trace2_region_enter("merge", "process_entries", opt->repo);
-	process_entries(opt, &working_tree_oid);
+	if (process_entries(opt, &working_tree_oid) < 0)
+		result->clean = -1;
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
 	result->path_messages = &opt->priv->conflicts;
 
-	result->tree = parse_tree_indirect(&working_tree_oid);
-	/* existence of conflicted entries implies unclean */
-	result->clean &= strmap_empty(&opt->priv->conflicted);
+	if (result->clean >= 0) {
+		result->tree = parse_tree_indirect(&working_tree_oid);
+		/* existence of conflicted entries implies unclean */
+		result->clean &= strmap_empty(&opt->priv->conflicted);
+	}
 	if (!opt->priv->call_depth) {
 		result->priv = opt->priv;
 		result->_properly_initialized = RESULT_INITIALIZED;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..013b77144bd 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:56   ` Elijah Newren
  2022-09-21 23:11     ` Junio C Hamano
  2022-09-22 17:24     ` Johannes Schindelin
@ 2022-09-22 19:50     ` Johannes Schindelin
  2022-09-23  1:47       ` Elijah Newren
  2 siblings, 1 reply; 16+ messages in thread
From: Johannes Schindelin @ 2022-09-22 19:50 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

Hi Elijah,

now that I have studied more of the code, hunted down Coverity reports
about `merge-ort.c` and `builtin/merge-tree.c` and essentially have grown
a lot more confidence about my patch, I'll take the time to respond in a
bit more detail.

On Wed, 21 Sep 2022, Elijah Newren wrote:

> On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > Independent of the question whether we want `git merge-tree` to report
> > the tree name even when it failed to write the tree objects in a
> > read-only repository, there is no question that we should avoid a
> > segmentation fault.
>
> Ah, a read-only repo.  Looks like we didn't bother to check the return
> status of write_object_file() in one of the two cases; oops.  (And it
> looks like the other case does not correctly propagate the error
> upwards far enough...)

Indeed, that was the actual root cause, but I had failed to even look
whether the return values of `write_tree()` and `write_object_file()` were
respected. Narrator's voice: they weren't.

> > And when we report an invalid tree name (because the tree could not be
> > written), we need the exit status to be non-zero.
> >
> > Let's make it so.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >     merge-tree: fix segmentation fault in read-only repositories
> >
> >     Turns out that the segmentation fault reported by Taylor
> >     [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
> >     while testing merge-ort in a read-only repository, and that the upstream
> >     version of git merge-tree is as affected as GitHub's internal version.
> >
> >     Note: I briefly considered using the OID of the_hash_algo->empty_tree
> >     instead of null_oid() when no tree object could be constructed. However,
> >     I have come to the conclusion that this could potentially cause its own
> >     set of problems because it would relate to a valid tree object even if
> >     we do not have any valid tree object to play with.
> >
> >     Also note: The question I hinted at in the commit message, namely
> >     whether or not to try harder to construct a tree object even if we
> >     cannot write it out, maybe merits a longer discussion, one that I think
> >     we should have after v2.38.0 is released, so as not to distract from
> >     focusing on v2.38.0.
>
> That's fair, but you know I'm going to have a hard time refraining
> from commenting on it anyway...  :-)

Hahaha, of course you would ;-)

> [...]
> > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> >         if (o->show_messages == -1)
> >                 o->show_messages = !result.clean;
> >
> > -       printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > +       tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > +       printf("%s%c", oid_to_hex(tree_oid), line_termination);
>
> Perhaps also print a warning to stderr when result.tree is NULL?

I ended up not even touching `builtin/merge-tree.c` but instead ensuring
that `result.clean` is negative if we fail to write an object (which could
happen even in read/write repositories, think "disk full").

As a consequence, we do not even reach this `printf()` anymore, as you
pointed out, a negative `result.clean` is handled much earlier.

It is handled via a `die()` in `real_merge()`, and that will need to
change, I think, if we want to continue on the batched merges.

> >         if (!result.clean) {
> >                 struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
> >                 const char *last = NULL;
> > @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
> >                                               &result);
> >         }
> >         merge_finalize(&opt, &result);
> > -       return !result.clean; /* result.clean < 0 handled above */
> > +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> Thinking out loud, should this logic be at the merge-ort.c level,
> perhaps something like this:
>
> @@ -4940,6 +4941,9 @@ static void
> merge_ort_nonrecursive_internal(struct merge_options *opt,
>         result->tree = parse_tree_indirect(&working_tree_oid);
>         /* existence of conflicted entries implies unclean */
>         result->clean &= strmap_empty(&opt->priv->conflicted);
> +       if (!result->tree)
> +               /* This shouldn't happen, because if we did fail to
> write a tree we should have returned early before getting here.  But
> just in case we have some bugs... */
> +               result->clean = -1;
>         if (!opt->priv->call_depth) {
>                 result->priv = opt->priv;
>                 result->_properly_initialized = RESULT_INITIALIZED;
>
> That might benefit callers other than merge-tree, though maybe it
> makes it harder to print a helpful error message (unless we're fine
> with the library always throwing one?)

The error messages are already thrown about (this is how it looks like
with v3 of this patch):

	[...]
	+ git -C read-only merge-tree side1 side2
	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add greeting to database
	error: insufficient permission for adding an object to repository database ./objects
	fatal: failure to merge
	+ exit_code=128
	[...]

> > diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> > index 28ca5c38bb5..013b77144bd 100755
> > --- a/t/t4301-merge-tree-write-tree.sh
> > +++ b/t/t4301-merge-tree-write-tree.sh
> > @@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
> >         test_cmp expect actual
> >  '
> >
> > +test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
> > +       git init --bare read-only &&
> > +       git push read-only side1 side2 side3 &&
> > +       test_when_finished "chmod -R u+w read-only" &&
> > +       chmod -R a-w read-only &&
> > +       test_must_fail git -C read-only merge-tree side1 side3 &&
> > +       test_must_fail git -C read-only merge-tree side1 side2
> > +'
> > +
> >  test_done
> >
> > base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
> > --
> > gitgitgadget
>
> This is a reasonable workaround from the merge-tree side, if we expect
> merge-ort to not correctly notify us of issues (which may be fair
> given that a quick glance suggests we have more than one problematic
> codepath, which I'll discuss more below).  However, I do think
> merge-ort should be returning a negative value for the "clean" status
> in such a case.

You're absolutely right. That's what v3 now does.

> If merge-ort did that, then we wouldn't even get to your new code
> because merge-tree.c already checks for that:
>
>     merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
>     if (result.clean < 0)
>         die(_("failure to merge"));
>
> I have a hack above which sets a negative return value, but it's only
> a workaround since it comes from a late detect-after-the-fact
> location.  Here's another ugly hack, but one which highlights exactly
> where the real problem occurs:
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 99dcee2db8..c2144e9220 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -3605,7 +3605,8 @@ static void write_tree(struct object_id *result_oid,
>         }
>
>         /* Write this object file out, and record in result_oid */
> -       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
> +       if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
> +               die(_("Unable to add new tree to database"));
>         strbuf_release(&buf);
>  }
>
> die()ing, of course, is not a good choice, we should really return -1
> instead.  Unfortunately, the callers aren't currently prepared to
> propagate such a value upwards (as reflected in the return type of the
> function) so any fix in this area is a little more involved than just
> modifying these few lines.  Also, there is one other call to
> write_object_file() in merge-ort.c which does check and return a value
> of -1, though looking around it appears the callers of that other site
> do not always correctly check and propagate a return value of -1
> further upwards as they should, meaning we are losing track of the
> fact that the merge failed to function.
>
> Given the lack of correct propagation of -1 return values in the other
> codepath plus the error here, keeping some form of workaround sounds
> fair (though it may be nice to print an error that something fishy
> happened).
>
> And then, possibly post-v2.38.0 though we may be able to get it in
> before, getting correct propagation of a -1 return value from the
> source of the error would be good.  Would you like to look into that,
> or would you prefer I did?

It turned out not to be too bad.

Essentially, to propagate `write_object_file()` failures I only had to
change a couple of signatures (with the corresponding `return`s):
`write_tree()`, `write_completed_directory()`, and `process_entries()`.
And then, of course, I needed to change
`merge_ort_nonrecursive_internal()` to respect the propagated error by
setting `result->clean = -1`.

Pretty straight-forward, really, much less involved than I had expected.

Thank you!
Dscho

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:24   ` Junio C Hamano
@ 2022-09-22 19:52     ` Johannes Schindelin
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Schindelin @ 2022-09-22 19:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren, Taylor Blau

Hi Junio,

On Wed, 21 Sep 2022, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > +	const struct object_id *tree_oid;
> >
> >  	parent1 = get_merge_parent(branch1);
> >  	if (!parent1)
> > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> >  	if (o->show_messages == -1)
> >  		o->show_messages = !result.clean;
> >
> > -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > +	printf("%s%c", oid_to_hex(tree_oid), line_termination);
>
> As we are not using tree_oid anywhere else, it might be cleaner to
> switch on result.tree more explicitly, especially given that the
> return statement also uses result.tree to decide what exit status to
> use, e.g.
>
> 	if (result.tree)
> 		fputs(oid_to_hex(&result.tree->object.oid), stdout);
> 	else
> 		fputs(oid_to_hex(null_oid()), stdout);
> 	fputc(line_termination, stdout);
>
> but that is merely "might be cleaner".

I would call it "more readable", even, not just "cleaner" :-)

> > -	return !result.clean; /* result.clean < 0 handled above */
> > +	return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> The original returned 1 when we failed to cleanly merge and 0 when
> we cleanly merged.  Now, if we failed to come up with a merged tree,
> result.tree would be NULL and we return 1 from the function, because
> we failed to cleanly merge.
>
> OK.  These negations make my head spin X-<.

And here I thought that you couldn't possibly not like if I didn't refrain
from not avoiding double negations...

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] merge-ort: fix segmentation fault in read-only repositories
  2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
@ 2022-09-23  1:38     ` Elijah Newren
  2022-09-23  9:55       ` Johannes Schindelin
  0 siblings, 1 reply; 16+ messages in thread
From: Elijah Newren @ 2022-09-23  1:38 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: Git Mailing List, Taylor Blau, Johannes Schindelin

On Thu, Sep 22, 2022 at 12:46 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> If the blob/tree objects cannot be written, we really need the merge
> operations to fail, and not to continue (and then try to access the tree
> object which is however still set to `NULL`).
>
> Let's stop ignoring the return value of `write_object_file()` and
> `write_tree()` and set `clean = -1` in the error case.

:-)

>  merge-ort.c                      | 64 +++++++++++++++++++-------------
>  t/t4301-merge-tree-write-tree.sh |  9 +++++
>  2 files changed, 48 insertions(+), 25 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 99dcee2db8a..f654296220e 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
>                                  b->string, strlen(b->string), bmi->result.mode);
>  }
>
> -static void write_tree(struct object_id *result_oid,
> -                      struct string_list *versions,
> -                      unsigned int offset,
> -                      size_t hash_size)
> +static int write_tree(struct object_id *result_oid,
> +                     struct string_list *versions,
> +                     unsigned int offset,
> +                     size_t hash_size)
>  {
>         size_t maxlen = 0, extra;
>         unsigned int nr;
>         struct strbuf buf = STRBUF_INIT;
> -       int i;
> +       int i, ret;
>
>         assert(offset <= versions->nr);
>         nr = versions->nr - offset;
> @@ -3605,8 +3605,9 @@ static void write_tree(struct object_id *result_oid,
>         }
>
>         /* Write this object file out, and record in result_oid */
> -       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
> +       ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);

This sent me into a bit of a hunt to try to figure out what the return
value of write_object_file() is.  I know it's non-zero on failure, but
don't know if it's always positive or always negative or possibly a
mix and how that maps to the idea of "clean_merge".  I gave up after a
few indirections, to be honest.

Anyway, regardless of my inability to find the answer to the above
question, would this be a bit easier to read if we initialized ret to
0 and then did something like

    if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
        ret = -1;

?

>         strbuf_release(&buf);
> +       return ret;
>  }
>
>  static void record_entry_for_tree(struct directory_versions *dir_metadata,
> @@ -3625,13 +3626,13 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata,
>                            basename)->util = &mi->result;
>  }
>
> -static void write_completed_directory(struct merge_options *opt,
> -                                     const char *new_directory_name,
> -                                     struct directory_versions *info)
> +static int write_completed_directory(struct merge_options *opt,
> +                                    const char *new_directory_name,
> +                                    struct directory_versions *info)
>  {
>         const char *prev_dir;
>         struct merged_info *dir_info = NULL;
> -       unsigned int offset;
> +       unsigned int offset, ret = 0;
>
>         /*
>          * Some explanation of info->versions and info->offsets...
> @@ -3717,7 +3718,7 @@ static void write_completed_directory(struct merge_options *opt,
>          * strcmp here.)
>          */
>         if (new_directory_name == info->last_directory)
> -               return;
> +               return 0;
>
>         /*
>          * If we are just starting (last_directory is NULL), or last_directory
> @@ -3739,7 +3740,7 @@ static void write_completed_directory(struct merge_options *opt,
>                  */
>                 string_list_append(&info->offsets,
>                                    info->last_directory)->util = (void*)offset;
> -               return;
> +               return 0;
>         }
>
>         /*
> @@ -3769,8 +3770,9 @@ static void write_completed_directory(struct merge_options *opt,
>                  */
>                 dir_info->is_null = 0;
>                 dir_info->result.mode = S_IFDIR;
> -               write_tree(&dir_info->result.oid, &info->versions, offset,
> -                          opt->repo->hash_algo->rawsz);
> +               if (write_tree(&dir_info->result.oid, &info->versions, offset,
> +                              opt->repo->hash_algo->rawsz) < 0)
> +                       ret = -1;

This makes me wonder about write_tree() again.  What if its call to
write_object_file() returns a value greater than 0?  Is that possible?
 If it is, are those error cases or not?  About half the callers in
the code base that check the return value of write_object_file() check
for a non-zero value, the other half check for a value less than 0.
And I can't find any documentation.  And it seemed like too much time
for me to figure out what its range of return values was.

>         }
>
>         /*
> @@ -3798,6 +3800,8 @@ static void write_completed_directory(struct merge_options *opt,
>         /* And, of course, we need to update last_directory to match. */
>         info->last_directory = new_directory_name;
>         info->last_directory_len = strlen(info->last_directory);
> +
> +       return ret;
>  }
>
>  /* Per entry merge function */
> @@ -4214,8 +4218,8 @@ static void prefetch_for_content_merges(struct merge_options *opt,
>         oid_array_clear(&to_fetch);
>  }
>
> -static void process_entries(struct merge_options *opt,
> -                           struct object_id *result_oid)
> +static int process_entries(struct merge_options *opt,
> +                          struct object_id *result_oid)
>  {
>         struct hashmap_iter iter;
>         struct strmap_entry *e;
> @@ -4224,11 +4228,12 @@ static void process_entries(struct merge_options *opt,
>         struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
>                                                    STRING_LIST_INIT_NODUP,
>                                                    NULL, 0 };
> +       int ret;
>
>         trace2_region_enter("merge", "process_entries setup", opt->repo);
>         if (strmap_empty(&opt->priv->paths)) {
>                 oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
> -               return;
> +               return 0;
>         }
>
>         /* Hack to pre-allocate plist to the desired size */
> @@ -4270,8 +4275,11 @@ static void process_entries(struct merge_options *opt,
>                  */
>                 struct merged_info *mi = entry->util;
>
> -               write_completed_directory(opt, mi->directory_name,
> -                                         &dir_metadata);
> +               if (write_completed_directory(opt, mi->directory_name,
> +                                             &dir_metadata) < 0) {
> +                       ret = -1;
> +                       goto cleanup;
> +               }
>                 if (mi->clean)
>                         record_entry_for_tree(&dir_metadata, path, mi);
>                 else {
> @@ -4291,12 +4299,15 @@ static void process_entries(struct merge_options *opt,
>                 fflush(stdout);
>                 BUG("dir_metadata accounting completely off; shouldn't happen");
>         }
> -       write_tree(result_oid, &dir_metadata.versions, 0,
> -                  opt->repo->hash_algo->rawsz);
> +       ret = write_tree(result_oid, &dir_metadata.versions, 0,
> +                        opt->repo->hash_algo->rawsz);
> +cleanup:
>         string_list_clear(&plist, 0);
>         string_list_clear(&dir_metadata.versions, 0);
>         string_list_clear(&dir_metadata.offsets, 0);
>         trace2_region_leave("merge", "process_entries cleanup", opt->repo);
> +
> +       return ret;
>  }
>
>  /*** Function Grouping: functions related to merge_switch_to_result() ***/
> @@ -4928,15 +4939,18 @@ redo:
>         }
>
>         trace2_region_enter("merge", "process_entries", opt->repo);
> -       process_entries(opt, &working_tree_oid);
> +       if (process_entries(opt, &working_tree_oid) < 0)
> +               result->clean = -1;
>         trace2_region_leave("merge", "process_entries", opt->repo);
>
>         /* Set return values */
>         result->path_messages = &opt->priv->conflicts;
>
> -       result->tree = parse_tree_indirect(&working_tree_oid);
> -       /* existence of conflicted entries implies unclean */
> -       result->clean &= strmap_empty(&opt->priv->conflicted);
> +       if (result->clean >= 0) {
> +               result->tree = parse_tree_indirect(&working_tree_oid);
> +               /* existence of conflicted entries implies unclean */
> +               result->clean &= strmap_empty(&opt->priv->conflicted);
> +       }
>         if (!opt->priv->call_depth) {
>                 result->priv = opt->priv;
>                 result->_properly_initialized = RESULT_INITIALIZED;
> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 28ca5c38bb5..013b77144bd 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
>         test_cmp expect actual
>  '
>
> +test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
> +       git init --bare read-only &&
> +       git push read-only side1 side2 side3 &&
> +       test_when_finished "chmod -R u+w read-only" &&
> +       chmod -R a-w read-only &&
> +       test_must_fail git -C read-only merge-tree side1 side3 &&
> +       test_must_fail git -C read-only merge-tree side1 side2
> +'
> +
>  test_done
>
> base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
> --
> gitgitgadget

Nice to see that it didn't take too much work to propagate the -1
value from write_tree() up the hierarchy.  Nice work!

I think we've still got some separate problems related to propagating
the return value of the other write_object_file() call, from
handle_content_merge().  The direct call in that other case is okay,
but the higher levels at process_renames() and process_entry() seem to
fumble it.  Your fix for failed tree writes is still valid without
fixing the failed blob writes, and I'm happy to tackle the other half
if you'd prefer to hand it off.  But, if you'd like to tackle that
half too, you might see fewer error messages when it fails to write to
the read-only repository, since it'll fail early on the first blob
write instead of only failing when it gets around to trying to write a
new tree.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-22 19:50     ` Johannes Schindelin
@ 2022-09-23  1:47       ` Elijah Newren
  0 siblings, 0 replies; 16+ messages in thread
From: Elijah Newren @ 2022-09-23  1:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

On Thu, Sep 22, 2022 at 12:50 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> now that I have studied more of the code, hunted down Coverity reports
> about `merge-ort.c` and `builtin/merge-tree.c` and essentially have grown
> a lot more confidence about my patch, I'll take the time to respond in a
> bit more detail.
>
> On Wed, 21 Sep 2022, Elijah Newren wrote:
>
> > On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> > >
> > > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > >
[...]
> > > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> > >         if (o->show_messages == -1)
> > >                 o->show_messages = !result.clean;
> > >
> > > -       printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > > +       tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > > +       printf("%s%c", oid_to_hex(tree_oid), line_termination);
> >
> > Perhaps also print a warning to stderr when result.tree is NULL?
>
> I ended up not even touching `builtin/merge-tree.c` but instead ensuring
> that `result.clean` is negative if we fail to write an object (which could
> happen even in read/write repositories, think "disk full").
>
> As a consequence, we do not even reach this `printf()` anymore, as you
> pointed out, a negative `result.clean` is handled much earlier.

Works for me.  :-)

> It is handled via a `die()` in `real_merge()`, and that will need to
> change, I think, if we want to continue on the batched merges.

Indeed, good point.

> > >         if (!result.clean) {
> > >                 struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
> > >                 const char *last = NULL;
> > > @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
> > >                                               &result);
> > >         }
> > >         merge_finalize(&opt, &result);
> > > -       return !result.clean; /* result.clean < 0 handled above */
> > > +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
> >
> > Thinking out loud, should this logic be at the merge-ort.c level,
> > perhaps something like this:
> >
> > @@ -4940,6 +4941,9 @@ static void
> > merge_ort_nonrecursive_internal(struct merge_options *opt,
> >         result->tree = parse_tree_indirect(&working_tree_oid);
> >         /* existence of conflicted entries implies unclean */
> >         result->clean &= strmap_empty(&opt->priv->conflicted);
> > +       if (!result->tree)
> > +               /* This shouldn't happen, because if we did fail to
> > write a tree we should have returned early before getting here.  But
> > just in case we have some bugs... */
> > +               result->clean = -1;
> >         if (!opt->priv->call_depth) {
> >                 result->priv = opt->priv;
> >                 result->_properly_initialized = RESULT_INITIALIZED;
> >
> > That might benefit callers other than merge-tree, though maybe it
> > makes it harder to print a helpful error message (unless we're fine
> > with the library always throwing one?)
>
> The error messages are already thrown about (this is how it looks like
> with v3 of this patch):
>
>         [...]
>         + git -C read-only merge-tree side1 side2
>         error: insufficient permission for adding an object to repository database ./objects
>         error: error: Unable to add numbers to database
>         error: insufficient permission for adding an object to repository database ./objects
>         error: error: Unable to add greeting to database

Ah, this confirms my suspicions.  These lines look like two failures,
both for adding blob objects.  This shows that the return value from
the write_object_file() call within handle_content_merge() is not
being correctly propagated upwards, and the merge is (erroneously)
continuing on despite that.

>         error: insufficient permission for adding an object to repository database ./objects

And I'm guessing the lack of the "Unable to add %s to database"
following message means this was the attempt to write a new tree.

>         fatal: failure to merge

And this shows that with your patch, you are propagating and catching
the failure to write the tree object and aborting, so we at least
catch failures to write trees now.

[...]
> > And then, possibly post-v2.38.0 though we may be able to get it in
> > before, getting correct propagation of a -1 return value from the
> > source of the error would be good.  Would you like to look into that,
> > or would you prefer I did?
>
> It turned out not to be too bad.
>
> Essentially, to propagate `write_object_file()` failures I only had to
> change a couple of signatures (with the corresponding `return`s):
> `write_tree()`, `write_completed_directory()`, and `process_entries()`.
> And then, of course, I needed to change
> `merge_ort_nonrecursive_internal()` to respect the propagated error by
> setting `result->clean = -1`.
>
> Pretty straight-forward, really, much less involved than I had expected.

Nice!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] merge-ort: fix segmentation fault in read-only repositories
  2022-09-23  1:38     ` Elijah Newren
@ 2022-09-23  9:55       ` Johannes Schindelin
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Schindelin @ 2022-09-23  9:55 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

Hi Elijah,

On Thu, 22 Sep 2022, Elijah Newren wrote:

> On Thu, Sep 22, 2022 at 12:46 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > diff --git a/merge-ort.c b/merge-ort.c
> > index 99dcee2db8a..f654296220e 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
> >                                  b->string, strlen(b->string), bmi->result.mode);
> >  }
> >
> > -static void write_tree(struct object_id *result_oid,
> > -                      struct string_list *versions,
> > -                      unsigned int offset,
> > -                      size_t hash_size)
> > +static int write_tree(struct object_id *result_oid,
> > +                     struct string_list *versions,
> > +                     unsigned int offset,
> > +                     size_t hash_size)
> >  {
> >         size_t maxlen = 0, extra;
> >         unsigned int nr;
> >         struct strbuf buf = STRBUF_INIT;
> > -       int i;
> > +       int i, ret;
> >
> >         assert(offset <= versions->nr);
> >         nr = versions->nr - offset;
> > @@ -3605,8 +3605,9 @@ static void write_tree(struct object_id *result_oid,
> >         }
> >
> >         /* Write this object file out, and record in result_oid */
> > -       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
> > +       ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
>
> This sent me into a bit of a hunt to try to figure out what the return
> value of write_object_file() is.  I know it's non-zero on failure, but
> don't know if it's always positive or always negative or possibly a
> mix and how that maps to the idea of "clean_merge".  I gave up after a
> few indirections, to be honest.
>
> Anyway, regardless of my inability to find the answer to the above
> question, would this be a bit easier to read if we initialized ret to
> 0 and then did something like
>
>     if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
>         ret = -1;
>
> ?

You're right, this is a better pattern to follow because it makes it
easier to wrap your head around.

> > [...]
> > @@ -3769,8 +3770,9 @@ static void write_completed_directory(struct merge_options *opt,
> >                  */
> >                 dir_info->is_null = 0;
> >                 dir_info->result.mode = S_IFDIR;
> > -               write_tree(&dir_info->result.oid, &info->versions, offset,
> > -                          opt->repo->hash_algo->rawsz);
> > +               if (write_tree(&dir_info->result.oid, &info->versions, offset,
> > +                              opt->repo->hash_algo->rawsz) < 0)
> > +                       ret = -1;
>
> This makes me wonder about write_tree() again.  What if its call to
> write_object_file() returns a value greater than 0?  Is that possible?

No, `write_object_file()` returns either 0 or -1, always. I followed all
the code paths.

Here is a tangent (feel free to skip over this lengthy analysis):

These code paths include a _very, very_ misleading one. In
`write_loose_object()` (which is called via the call chain
`write_object_file()` -> `write_object_file_flags()` ->
`write_object_file_flags()`), we call `finalize_object_file()`, see
https://github.com/git/git/blob/v2.37.3/object-file.c#L2030). And in that
`finalize_object_file()` function, we set `ret = errno;` in two places
(https://github.com/git/git/blob/v2.37.3/object-file.c#L1833 and
https://github.com/git/git/blob/v2.37.3/object-file.c#L1850).

When I read that, I chased down what the conventions are for `errno`
values. Now, while
https://pubs.opengroup.org/onlinepubs/007908799/xsh/errno.h.html does not
say anything about `errno` values being non-negative,
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html
very much says:

	Values for errno are now required to be distinct positive values
	rather than non-zero values. This change is for alignment with the
	ISO/IEC 9899:1999 standard.

Terrible, right? If we set `ret = errno`, that means that we return
something positive!

I then even dug up the origins of the `ret = errno` statements. Turns out
that yours truly introduced the first of these `ret = errno` statements in
9e48b389990 (Work around missing hard links on FAT formatted media,
2005-10-26). :facepalm:

So why did I do such a thing? Well, back then, the `ret` value _already_
contained an `errno`, thanks to calling `link_temp_to_file()`, which _did_
return an `errno`. In fact, there was no code path in
`move_temp_to_file()` (as the function was called back then) that would
return -1, it always returned either 0 or an `errno`.

And after coming so far, after one hour of analyzing, I finally, finally
realized that there is no `return ret;` in that function. That function
has a `ret` variable that is never returned _from_ that function, but it
used to be returned _to_ that function, by the `link_temp_to_file()`
function that does not even exist any longer.

In my defense, this terribly misleading code was not even introduced by
myself. It was introduced in 230f13225df (Create object subdirectories on
demand, 2005-10-08).

tl;dr all is good, `finalize_object_file()` returns 0 upon success, -1
upon failure.

> If it is, are those error cases or not?  About half the callers in the
> code base that check the return value of write_object_file() check for a
> non-zero value, the other half check for a value less than 0.

And then there is `apply.c` where the return value isn't even checked at
all most of the time.

> And I can't find any documentation.  And it seemed like too much time
> for me to figure out what its range of return values was.
>
> [...]
>
> Nice to see that it didn't take too much work to propagate the -1 value
>from write_tree() up the hierarchy.  Nice work!

Thank you!

> I think we've still got some separate problems related to propagating
> the return value of the other write_object_file() call, from
> handle_content_merge().  The direct call in that other case is okay, but
> the higher levels at process_renames() and process_entry() seem to
> fumble it.  Your fix for failed tree writes is still valid without
> fixing the failed blob writes, and I'm happy to tackle the other half if
> you'd prefer to hand it off.  But, if you'd like to tackle that half
> too, you might see fewer error messages when it fails to write to the
> read-only repository, since it'll fail early on the first blob write
> instead of only failing when it gets around to trying to write a new
> tree.

It'll be part of v4 ;-)

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-09-23  9:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
2022-09-21 15:42 ` Taylor Blau
2022-09-21 20:12   ` Johannes Schindelin
2022-09-21 18:12 ` Junio C Hamano
2022-09-21 20:13   ` Johannes Schindelin
2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2022-09-21 22:24   ` Junio C Hamano
2022-09-22 19:52     ` Johannes Schindelin
2022-09-21 22:56   ` Elijah Newren
2022-09-21 23:11     ` Junio C Hamano
2022-09-22 17:24     ` Johannes Schindelin
2022-09-22 19:50     ` Johannes Schindelin
2022-09-23  1:47       ` Elijah Newren
2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
2022-09-23  1:38     ` Elijah Newren
2022-09-23  9:55       ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).