git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH] merge-tree: fix segmentation fault in read-only repositories
@ 2022-09-21 15:30 Johannes Schindelin via GitGitGadget
  2022-09-21 15:42 ` Taylor Blau
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-21 15:30 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Independent of the question whether we want `git merge-tree` to report a
tree name even when it failed to write the tree objects in a read-only
repository, there is no question that we should avoid a segmentation
fault.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    merge-tree: fix segmentation fault in read-only repositories
    
    Turns out that the segmentation fault reported by Taylor
    [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
    while testing merge-ort in a read-only repository, and that the upstream
    version of git merge-tree is as affected as GitHub's internal version.
    
    Note: I briefly considered using the OID of the_hash_algo->empty_tree
    instead of null_oid() when no tree object could be constructed. However,
    I have come to the conclusion that this could potentially cause its own
    set of problems because it would relate to a valid tree object even if
    we do not have any valid tree object to play with.
    
    Also note: The question I hinted at in the commit message, namely
    whether or not to try harder to construct a tree object even if we
    cannot write it out, maybe merits a longer discussion, one that I think
    we should have after v2.38.0 is released, so as not to distract from
    focusing on v2.38.0.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

 builtin/merge-tree.c             | 4 +++-
 t/t4301-merge-tree-write-tree.sh | 8 ++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index ae5782917b9..25c7142a882 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
 	struct commit_list *merge_bases = NULL;
 	struct merge_options opt;
 	struct merge_result result = { 0 };
+	const struct object_id *tree_oid;
 
 	parent1 = get_merge_parent(branch1);
 	if (!parent1)
@@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
 	if (o->show_messages == -1)
 		o->show_messages = !result.clean;
 
-	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
+	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
+	printf("%s%c", oid_to_hex(tree_oid), line_termination);
 	if (!result.clean) {
 		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
 		const char *last = NULL;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..e56b1ba6e50 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
@@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
@ 2022-09-21 15:42 ` Taylor Blau
  2022-09-21 20:12   ` Johannes Schindelin
  2022-09-21 18:12 ` Junio C Hamano
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 29+ messages in thread
From: Taylor Blau @ 2022-09-21 15:42 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Johannes Schindelin

On Wed, Sep 21, 2022 at 03:30:48PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index ae5782917b9..25c7142a882 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
>  	struct commit_list *merge_bases = NULL;
>  	struct merge_options opt;
>  	struct merge_result result = { 0 };
> +	const struct object_id *tree_oid;
>
>  	parent1 = get_merge_parent(branch1);
>  	if (!parent1)
> @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
>  	if (o->show_messages == -1)
>  		o->show_messages = !result.clean;
>
> -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> +	printf("%s%c", oid_to_hex(tree_oid), line_termination);

My understanding is that we can get a clean result from
merge_incore_recursive(), but still have failed to write the physical
tree object out?

In other words, the two sides *could* have been merged, but the actual
result of doing that merge couldn't be written to disk (e.g., because
the repository is read-only or some such)?

If so, then this approach makes sense, and I agree with your idea to
use the all-zeros OID instead of the empty tree one. It would be
nice(r?) if we could just abort this command earlier, since `merge-tree`
promises that we'll write the result out as an object. So I don't think
a non-zero exit before we have to print the resulting tree object would
be unexpected.

But I don't have a strong feeling about it either way. So, if you want
to proceed here and just emit the all-zeros OID, I think that is a fine
approach.

> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 28ca5c38bb5..e56b1ba6e50 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
>  	test_cmp expect actual
>  '
>
> +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> +	git init --bare read-only &&
> +	git push read-only side1 side2 &&
> +	test_when_finished "chmod -R u+w read-only" &&

Do we care about keeping this read-only repository around after the
test is done? It seems odd to have a directory called "read-only" be
user-writable. I'd probably just as soon replace this with:

  test_when_finished "rm -fr read-only" &&

Otherwise, this patch looks good. Thanks for working on it!

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
  2022-09-21 15:42 ` Taylor Blau
@ 2022-09-21 18:12 ` Junio C Hamano
  2022-09-21 20:13   ` Johannes Schindelin
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2022-09-21 18:12 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> +	git init --bare read-only &&
> +	git push read-only side1 side2 &&
> +	test_when_finished "chmod -R u+w read-only" &&
> +	chmod -R a-w read-only &&
> +	test_must_fail git -C read-only merge-tree side1 side2
> +'

Doesn't this test need a prerequisite to guard against those whose
filesystem lacks SANITY?



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:42 ` Taylor Blau
@ 2022-09-21 20:12   ` Johannes Schindelin
  0 siblings, 0 replies; 29+ messages in thread
From: Johannes Schindelin @ 2022-09-21 20:12 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren

Hi Taylor,

On Wed, 21 Sep 2022, Taylor Blau wrote:

> On Wed, Sep 21, 2022 at 03:30:48PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> > index ae5782917b9..25c7142a882 100644
> > --- a/builtin/merge-tree.c
> > +++ b/builtin/merge-tree.c
> > @@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
> >  	struct commit_list *merge_bases = NULL;
> >  	struct merge_options opt;
> >  	struct merge_result result = { 0 };
> > +	const struct object_id *tree_oid;
> >
> >  	parent1 = get_merge_parent(branch1);
> >  	if (!parent1)
> > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> >  	if (o->show_messages == -1)
> >  		o->show_messages = !result.clean;
> >
> > -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > +	printf("%s%c", oid_to_hex(tree_oid), line_termination);
>
> My understanding is that we can get a clean result from
> merge_incore_recursive(), but still have failed to write the physical
> tree object out?

No, we do not get a clean result _because_ we have failed to write the
tree object.

> In other words, the two sides *could* have been merged, but the actual
> result of doing that merge couldn't be written to disk (e.g., because
> the repository is read-only or some such)?

Since we could not write the tree object, we cannot say whether the
overall merge could have been merged or not.

All we can say is whether the merge has been clean _so far_ or not. Which
is not very helpful.

Now, taking that thought a bit further, I did get suspicious that my patch
still contained a bug, and lo and behold, this bug really was there: when
the merge _has_ been clean so far, we report success (via exit code 0),
even if the tree could not be written, and therefore the `git merge-tree`
command should have failed!

I fixed the bug in the upcoming v2 and extended the test case to verify
that this bug no longer exists.

> If so, then this approach makes sense, and I agree with your idea to
> use the all-zeros OID instead of the empty tree one. It would be
> nice(r?) if we could just abort this command earlier, since `merge-tree`
> promises that we'll write the result out as an object. So I don't think
> a non-zero exit before we have to print the resulting tree object would
> be unexpected.

The idea of aborting the command earlier might make sense in isolation,
and when you're familiar with Git's built-ins taking the easy `die()` way
out upon failure without having to bother to release resources.

However, when you keep in mind that the `merge-tree` command is far from
its full potential and that really interesting ideas are still to be
implemented, some even well on their way like batched merges (see
https://github.com/gitgitgadget/git/pull/1361), it starts to become a much
better idea to keep the code as libified as possible, without any aborts.
Such an abort in the middle of the code path would interfere horribly with
the idea of batched merges.

> But I don't have a strong feeling about it either way. So, if you want
> to proceed here and just emit the all-zeros OID, I think that is a fine
> approach.
>
> > diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> > index 28ca5c38bb5..e56b1ba6e50 100755
> > --- a/t/t4301-merge-tree-write-tree.sh
> > +++ b/t/t4301-merge-tree-write-tree.sh
> > @@ -810,4 +810,12 @@ test_expect_success 'can override merge of unrelated histories' '
> >  	test_cmp expect actual
> >  '
> >
> > +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> > +	git init --bare read-only &&
> > +	git push read-only side1 side2 &&
> > +	test_when_finished "chmod -R u+w read-only" &&
>
> Do we care about keeping this read-only repository around after the
> test is done? It seems odd to have a directory called "read-only" be
> user-writable. I'd probably just as soon replace this with:
>
>   test_when_finished "rm -fr read-only" &&

Speaking from way more experience debugging CI failures than I care to
have, it makes a lot of sense to keep this directory until the complete
test script finishes with utter success. Because if the test case fails,
and the CI tars up the "trash" directory, and the crucial files to
diagnose what might have gone wrong are missing from that tar file, my own
future life will be a lot harder than I like.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 18:12 ` Junio C Hamano
@ 2022-09-21 20:13   ` Johannes Schindelin
  0 siblings, 0 replies; 29+ messages in thread
From: Johannes Schindelin @ 2022-09-21 20:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren

Hi Junio,

On Wed, 21 Sep 2022, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > +test_expect_success 'merge-ort fails gracefully in a read-only repository' '
> > +	git init --bare read-only &&
> > +	git push read-only side1 side2 &&
> > +	test_when_finished "chmod -R u+w read-only" &&
> > +	chmod -R a-w read-only &&
> > +	test_must_fail git -C read-only merge-tree side1 side2
> > +'
>
> Doesn't this test need a prerequisite to guard against those whose
> filesystem lacks SANITY?

Yes, of course!

Thank you for the sanity check (pun intended!),
Dscho

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
  2022-09-21 15:42 ` Taylor Blau
  2022-09-21 18:12 ` Junio C Hamano
@ 2022-09-21 22:08 ` Johannes Schindelin via GitGitGadget
  2022-09-21 22:24   ` Junio C Hamano
                     ` (2 more replies)
  2 siblings, 3 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-21 22:08 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Independent of the question whether we want `git merge-tree` to report
the tree name even when it failed to write the tree objects in a
read-only repository, there is no question that we should avoid a
segmentation fault.

And when we report an invalid tree name (because the tree could not be
written), we need the exit status to be non-zero.

Let's make it so.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    merge-tree: fix segmentation fault in read-only repositories
    
    Turns out that the segmentation fault reported by Taylor
    [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
    while testing merge-ort in a read-only repository, and that the upstream
    version of git merge-tree is as affected as GitHub's internal version.
    
    Note: I briefly considered using the OID of the_hash_algo->empty_tree
    instead of null_oid() when no tree object could be constructed. However,
    I have come to the conclusion that this could potentially cause its own
    set of problems because it would relate to a valid tree object even if
    we do not have any valid tree object to play with.
    
    Also note: The question I hinted at in the commit message, namely
    whether or not to try harder to construct a tree object even if we
    cannot write it out, maybe merits a longer discussion, one that I think
    we should have after v2.38.0 is released, so as not to distract from
    focusing on v2.38.0.
    
    Changes since v1:
    
     * Using the SANITY prereq now
     * If the merge was aborted due to a write error, exit with a non-zero
       code even if the (potentially partial) merge is clean
     * The test case now also verifies that the git merge-tree command fails
       in a read-only repository even if the merge would have succeeded

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

Range-diff vs v1:

 1:  beba3fd9a95 ! 1:  1de4df3f471 merge-tree: fix segmentation fault in read-only repositories
     @@ Metadata
       ## Commit message ##
          merge-tree: fix segmentation fault in read-only repositories
      
     -    Independent of the question whether we want `git merge-tree` to report a
     -    tree name even when it failed to write the tree objects in a read-only
     -    repository, there is no question that we should avoid a segmentation
     -    fault.
     +    Independent of the question whether we want `git merge-tree` to report
     +    the tree name even when it failed to write the tree objects in a
     +    read-only repository, there is no question that we should avoid a
     +    segmentation fault.
     +
     +    And when we report an invalid tree name (because the tree could not be
     +    written), we need the exit status to be non-zero.
     +
     +    Let's make it so.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     @@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
       	if (!result.clean) {
       		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
       		const char *last = NULL;
     +@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     + 					      &result);
     + 	}
     + 	merge_finalize(&opt, &result);
     +-	return !result.clean; /* result.clean < 0 handled above */
     ++	return !result.tree || !result.clean; /* result.clean < 0 handled above */
     + }
     + 
     + int cmd_merge_tree(int argc, const char **argv, const char *prefix)
      
       ## t/t4301-merge-tree-write-tree.sh ##
      @@ t/t4301-merge-tree-write-tree.sh: test_expect_success 'can override merge of unrelated histories' '
       	test_cmp expect actual
       '
       
     -+test_expect_success 'merge-ort fails gracefully in a read-only repository' '
     ++test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
      +	git init --bare read-only &&
     -+	git push read-only side1 side2 &&
     ++	git push read-only side1 side2 side3 &&
      +	test_when_finished "chmod -R u+w read-only" &&
      +	chmod -R a-w read-only &&
     ++	test_must_fail git -C read-only merge-tree side1 side3 &&
      +	test_must_fail git -C read-only merge-tree side1 side2
      +'
      +


 builtin/merge-tree.c             | 6 ++++--
 t/t4301-merge-tree-write-tree.sh | 9 +++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index ae5782917b9..0df24eb82d4 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -412,6 +412,7 @@ static int real_merge(struct merge_tree_options *o,
 	struct commit_list *merge_bases = NULL;
 	struct merge_options opt;
 	struct merge_result result = { 0 };
+	const struct object_id *tree_oid;
 
 	parent1 = get_merge_parent(branch1);
 	if (!parent1)
@@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
 	if (o->show_messages == -1)
 		o->show_messages = !result.clean;
 
-	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
+	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
+	printf("%s%c", oid_to_hex(tree_oid), line_termination);
 	if (!result.clean) {
 		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
 		const char *last = NULL;
@@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
 					      &result);
 	}
 	merge_finalize(&opt, &result);
-	return !result.clean; /* result.clean < 0 handled above */
+	return !result.tree || !result.clean; /* result.clean < 0 handled above */
 }
 
 int cmd_merge_tree(int argc, const char **argv, const char *prefix)
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..013b77144bd 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
@ 2022-09-21 22:24   ` Junio C Hamano
  2022-09-22 19:52     ` Johannes Schindelin
  2022-09-21 22:56   ` Elijah Newren
  2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2022-09-21 22:24 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Taylor Blau, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +	const struct object_id *tree_oid;
>  
>  	parent1 = get_merge_parent(branch1);
>  	if (!parent1)
> @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
>  	if (o->show_messages == -1)
>  		o->show_messages = !result.clean;
>  
> -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> +	printf("%s%c", oid_to_hex(tree_oid), line_termination);

As we are not using tree_oid anywhere else, it might be cleaner to
switch on result.tree more explicitly, especially given that the
return statement also uses result.tree to decide what exit status to
use, e.g.

	if (result.tree)
		fputs(oid_to_hex(&result.tree->object.oid), stdout);
	else
		fputs(oid_to_hex(null_oid()), stdout);
	fputc(line_termination, stdout);
	
but that is merely "might be cleaner".

> -	return !result.clean; /* result.clean < 0 handled above */
> +	return !result.tree || !result.clean; /* result.clean < 0 handled above */

The original returned 1 when we failed to cleanly merge and 0 when
we cleanly merged.  Now, if we failed to come up with a merged tree,
result.tree would be NULL and we return 1 from the function, because
we failed to cleanly merge.

OK.  These negations make my head spin X-<.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2022-09-21 22:24   ` Junio C Hamano
@ 2022-09-21 22:56   ` Elijah Newren
  2022-09-21 23:11     ` Junio C Hamano
                       ` (2 more replies)
  2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
  2 siblings, 3 replies; 29+ messages in thread
From: Elijah Newren @ 2022-09-21 22:56 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: Git Mailing List, Taylor Blau, Johannes Schindelin

On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Independent of the question whether we want `git merge-tree` to report
> the tree name even when it failed to write the tree objects in a
> read-only repository, there is no question that we should avoid a
> segmentation fault.

Ah, a read-only repo.  Looks like we didn't bother to check the return
status of write_object_file() in one of the two cases; oops.  (And it
looks like the other case does not correctly propagate the error
upwards far enough...)

> And when we report an invalid tree name (because the tree could not be
> written), we need the exit status to be non-zero.
>
> Let's make it so.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     merge-tree: fix segmentation fault in read-only repositories
>
>     Turns out that the segmentation fault reported by Taylor
>     [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
>     while testing merge-ort in a read-only repository, and that the upstream
>     version of git merge-tree is as affected as GitHub's internal version.
>
>     Note: I briefly considered using the OID of the_hash_algo->empty_tree
>     instead of null_oid() when no tree object could be constructed. However,
>     I have come to the conclusion that this could potentially cause its own
>     set of problems because it would relate to a valid tree object even if
>     we do not have any valid tree object to play with.
>
>     Also note: The question I hinted at in the commit message, namely
>     whether or not to try harder to construct a tree object even if we
>     cannot write it out, maybe merits a longer discussion, one that I think
>     we should have after v2.38.0 is released, so as not to distract from
>     focusing on v2.38.0.

That's fair, but you know I'm going to have a hard time refraining
from commenting on it anyway...  :-)

[...]
> @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
>         if (o->show_messages == -1)
>                 o->show_messages = !result.clean;
>
> -       printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> +       tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> +       printf("%s%c", oid_to_hex(tree_oid), line_termination);

Perhaps also print a warning to stderr when result.tree is NULL?

>         if (!result.clean) {
>                 struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
>                 const char *last = NULL;
> @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
>                                               &result);
>         }
>         merge_finalize(&opt, &result);
> -       return !result.clean; /* result.clean < 0 handled above */
> +       return !result.tree || !result.clean; /* result.clean < 0 handled above */

Thinking out loud, should this logic be at the merge-ort.c level,
perhaps something like this:

@@ -4940,6 +4941,9 @@ static void
merge_ort_nonrecursive_internal(struct merge_options *opt,
        result->tree = parse_tree_indirect(&working_tree_oid);
        /* existence of conflicted entries implies unclean */
        result->clean &= strmap_empty(&opt->priv->conflicted);
+       if (!result->tree)
+               /* This shouldn't happen, because if we did fail to
write a tree we should have returned early before getting here.  But
just in case we have some bugs... */
+               result->clean = -1;
        if (!opt->priv->call_depth) {
                result->priv = opt->priv;
                result->_properly_initialized = RESULT_INITIALIZED;

That might benefit callers other than merge-tree, though maybe it
makes it harder to print a helpful error message (unless we're fine
with the library always throwing one?)

> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 28ca5c38bb5..013b77144bd 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
>         test_cmp expect actual
>  '
>
> +test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
> +       git init --bare read-only &&
> +       git push read-only side1 side2 side3 &&
> +       test_when_finished "chmod -R u+w read-only" &&
> +       chmod -R a-w read-only &&
> +       test_must_fail git -C read-only merge-tree side1 side3 &&
> +       test_must_fail git -C read-only merge-tree side1 side2
> +'
> +
>  test_done
>
> base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
> --
> gitgitgadget

This is a reasonable workaround from the merge-tree side, if we expect
merge-ort to not correctly notify us of issues (which may be fair
given that a quick glance suggests we have more than one problematic
codepath, which I'll discuss more below).  However, I do think
merge-ort should be returning a negative value for the "clean" status
in such a case.  If merge-ort did that, then we wouldn't even get to
your new code because merge-tree.c already checks for that:

    merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
    if (result.clean < 0)
        die(_("failure to merge"));

I have a hack above which sets a negative return value, but it's only
a workaround since it comes from a late detect-after-the-fact
location.  Here's another ugly hack, but one which highlights exactly
where the real problem occurs:

diff --git a/merge-ort.c b/merge-ort.c
index 99dcee2db8..c2144e9220 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3605,7 +3605,8 @@ static void write_tree(struct object_id *result_oid,
        }

        /* Write this object file out, and record in result_oid */
-       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
+       if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
+               die(_("Unable to add new tree to database"));
        strbuf_release(&buf);
 }

die()ing, of course, is not a good choice, we should really return -1
instead.  Unfortunately, the callers aren't currently prepared to
propagate such a value upwards (as reflected in the return type of the
function) so any fix in this area is a little more involved than just
modifying these few lines.  Also, there is one other call to
write_object_file() in merge-ort.c which does check and return a value
of -1, though looking around it appears the callers of that other site
do not always correctly check and propagate a return value of -1
further upwards as they should, meaning we are losing track of the
fact that the merge failed to function.

Given the lack of correct propagation of -1 return values in the other
codepath plus the error here, keeping some form of workaround sounds
fair (though it may be nice to print an error that something fishy
happened).

And then, possibly post-v2.38.0 though we may be able to get it in
before, getting correct propagation of a -1 return value from the
source of the error would be good.  Would you like to look into that,
or would you prefer I did?

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:56   ` Elijah Newren
@ 2022-09-21 23:11     ` Junio C Hamano
  2022-09-22 17:24     ` Johannes Schindelin
  2022-09-22 19:50     ` Johannes Schindelin
  2 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2022-09-21 23:11 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List,
	Taylor Blau, Johannes Schindelin

Elijah Newren <newren@gmail.com> writes:

>> @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
>>                                               &result);
>>         }
>>         merge_finalize(&opt, &result);
>> -       return !result.clean; /* result.clean < 0 handled above */
>> +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> Thinking out loud, should this logic be at the merge-ort.c level,

True.

> ...  However, I do think
> merge-ort should be returning a negative value for the "clean" status
> in such a case.

True, too.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:56   ` Elijah Newren
  2022-09-21 23:11     ` Junio C Hamano
@ 2022-09-22 17:24     ` Johannes Schindelin
  2022-09-22 19:50     ` Johannes Schindelin
  2 siblings, 0 replies; 29+ messages in thread
From: Johannes Schindelin @ 2022-09-22 17:24 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

Hi Elijah,

just quickly...

On Wed, 21 Sep 2022, Elijah Newren wrote:

> On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
> >                                               &result);
> >         }
> >         merge_finalize(&opt, &result);
> > -       return !result.clean; /* result.clean < 0 handled above */
> > +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> Thinking out loud, should this logic be at the merge-ort.c level,

You're right, of course. I have pushed up a tentative v3 that does it as
you proposed, and it looks not half bad. I'll look a bit more deeply,
still, but chances are that I'll submit this later today as v3 as-is.

And then I will reply a bit more verbosely, either ;-)

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3] merge-ort: fix segmentation fault in read-only repositories
  2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  2022-09-21 22:24   ` Junio C Hamano
  2022-09-21 22:56   ` Elijah Newren
@ 2022-09-22 19:46   ` Johannes Schindelin via GitGitGadget
  2022-09-23  1:38     ` Elijah Newren
  2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
  2 siblings, 2 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-22 19:46 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If the blob/tree objects cannot be written, we really need the merge
operations to fail, and not to continue (and then try to access the tree
object which is however still set to `NULL`).

Let's stop ignoring the return value of `write_object_file()` and
`write_tree()` and set `clean = -1` in the error case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    merge-tree: fix segmentation fault in read-only repositories
    
    Turns out that the segmentation fault reported by Taylor
    [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
    while testing merge-ort in a read-only repository, and that the upstream
    version of git merge-tree is as affected as GitHub's internal version.
    
    Changes since v2:
    
     * Completely changed the approach by no longer touching
       builtin/merge-tree.c but instead fixing the underlying merge-ort
       layer: we no longer ignore the return value of write_tree() and
       write_object_file().
    
    Changes since v1:
    
     * Using the SANITY prereq now
     * If the merge was aborted due to a write error, exit with a non-zero
       code even if the (potentially partial) merge is clean
     * The test case now also verifies that the git merge-tree command fails
       in a read-only repository even if the merge would have succeeded

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

Range-diff vs v2:

 1:  1de4df3f471 ! 1:  198ff455f90 merge-tree: fix segmentation fault in read-only repositories
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    merge-tree: fix segmentation fault in read-only repositories
     +    merge-ort: fix segmentation fault in read-only repositories
      
     -    Independent of the question whether we want `git merge-tree` to report
     -    the tree name even when it failed to write the tree objects in a
     -    read-only repository, there is no question that we should avoid a
     -    segmentation fault.
     +    If the blob/tree objects cannot be written, we really need the merge
     +    operations to fail, and not to continue (and then try to access the tree
     +    object which is however still set to `NULL`).
      
     -    And when we report an invalid tree name (because the tree could not be
     -    written), we need the exit status to be non-zero.
     -
     -    Let's make it so.
     +    Let's stop ignoring the return value of `write_object_file()` and
     +    `write_tree()` and set `clean = -1` in the error case.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     - ## builtin/merge-tree.c ##
     -@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     - 	struct commit_list *merge_bases = NULL;
     - 	struct merge_options opt;
     - 	struct merge_result result = { 0 };
     -+	const struct object_id *tree_oid;
     - 
     - 	parent1 = get_merge_parent(branch1);
     - 	if (!parent1)
     -@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     - 	if (o->show_messages == -1)
     - 		o->show_messages = !result.clean;
     - 
     --	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
     -+	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
     -+	printf("%s%c", oid_to_hex(tree_oid), line_termination);
     - 	if (!result.clean) {
     - 		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
     - 		const char *last = NULL;
     -@@ builtin/merge-tree.c: static int real_merge(struct merge_tree_options *o,
     - 					      &result);
     + ## merge-ort.c ##
     +@@ merge-ort.c: static int tree_entry_order(const void *a_, const void *b_)
     + 				 b->string, strlen(b->string), bmi->result.mode);
     + }
     + 
     +-static void write_tree(struct object_id *result_oid,
     +-		       struct string_list *versions,
     +-		       unsigned int offset,
     +-		       size_t hash_size)
     ++static int write_tree(struct object_id *result_oid,
     ++		      struct string_list *versions,
     ++		      unsigned int offset,
     ++		      size_t hash_size)
     + {
     + 	size_t maxlen = 0, extra;
     + 	unsigned int nr;
     + 	struct strbuf buf = STRBUF_INIT;
     +-	int i;
     ++	int i, ret;
     + 
     + 	assert(offset <= versions->nr);
     + 	nr = versions->nr - offset;
     +@@ merge-ort.c: static void write_tree(struct object_id *result_oid,
     + 	}
     + 
     + 	/* Write this object file out, and record in result_oid */
     +-	write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
     ++	ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
     + 	strbuf_release(&buf);
     ++	return ret;
     + }
     + 
     + static void record_entry_for_tree(struct directory_versions *dir_metadata,
     +@@ merge-ort.c: static void record_entry_for_tree(struct directory_versions *dir_metadata,
     + 			   basename)->util = &mi->result;
     + }
     + 
     +-static void write_completed_directory(struct merge_options *opt,
     +-				      const char *new_directory_name,
     +-				      struct directory_versions *info)
     ++static int write_completed_directory(struct merge_options *opt,
     ++				     const char *new_directory_name,
     ++				     struct directory_versions *info)
     + {
     + 	const char *prev_dir;
     + 	struct merged_info *dir_info = NULL;
     +-	unsigned int offset;
     ++	unsigned int offset, ret = 0;
     + 
     + 	/*
     + 	 * Some explanation of info->versions and info->offsets...
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 	 * strcmp here.)
     + 	 */
     + 	if (new_directory_name == info->last_directory)
     +-		return;
     ++		return 0;
     + 
     + 	/*
     + 	 * If we are just starting (last_directory is NULL), or last_directory
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 		 */
     + 		string_list_append(&info->offsets,
     + 				   info->last_directory)->util = (void*)offset;
     +-		return;
     ++		return 0;
     + 	}
     + 
     + 	/*
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 		 */
     + 		dir_info->is_null = 0;
     + 		dir_info->result.mode = S_IFDIR;
     +-		write_tree(&dir_info->result.oid, &info->versions, offset,
     +-			   opt->repo->hash_algo->rawsz);
     ++		if (write_tree(&dir_info->result.oid, &info->versions, offset,
     ++			       opt->repo->hash_algo->rawsz) < 0)
     ++			ret = -1;
     + 	}
     + 
     + 	/*
     +@@ merge-ort.c: static void write_completed_directory(struct merge_options *opt,
     + 	/* And, of course, we need to update last_directory to match. */
     + 	info->last_directory = new_directory_name;
     + 	info->last_directory_len = strlen(info->last_directory);
     ++
     ++	return ret;
     + }
     + 
     + /* Per entry merge function */
     +@@ merge-ort.c: static void prefetch_for_content_merges(struct merge_options *opt,
     + 	oid_array_clear(&to_fetch);
     + }
     + 
     +-static void process_entries(struct merge_options *opt,
     +-			    struct object_id *result_oid)
     ++static int process_entries(struct merge_options *opt,
     ++			   struct object_id *result_oid)
     + {
     + 	struct hashmap_iter iter;
     + 	struct strmap_entry *e;
     +@@ merge-ort.c: static void process_entries(struct merge_options *opt,
     + 	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
     + 						   STRING_LIST_INIT_NODUP,
     + 						   NULL, 0 };
     ++	int ret;
     + 
     + 	trace2_region_enter("merge", "process_entries setup", opt->repo);
     + 	if (strmap_empty(&opt->priv->paths)) {
     + 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
     +-		return;
     ++		return 0;
       	}
     - 	merge_finalize(&opt, &result);
     --	return !result.clean; /* result.clean < 0 handled above */
     -+	return !result.tree || !result.clean; /* result.clean < 0 handled above */
     + 
     + 	/* Hack to pre-allocate plist to the desired size */
     +@@ merge-ort.c: static void process_entries(struct merge_options *opt,
     + 		 */
     + 		struct merged_info *mi = entry->util;
     + 
     +-		write_completed_directory(opt, mi->directory_name,
     +-					  &dir_metadata);
     ++		if (write_completed_directory(opt, mi->directory_name,
     ++					      &dir_metadata) < 0) {
     ++			ret = -1;
     ++			goto cleanup;
     ++		}
     + 		if (mi->clean)
     + 			record_entry_for_tree(&dir_metadata, path, mi);
     + 		else {
     +@@ merge-ort.c: static void process_entries(struct merge_options *opt,
     + 		fflush(stdout);
     + 		BUG("dir_metadata accounting completely off; shouldn't happen");
     + 	}
     +-	write_tree(result_oid, &dir_metadata.versions, 0,
     +-		   opt->repo->hash_algo->rawsz);
     ++	ret = write_tree(result_oid, &dir_metadata.versions, 0,
     ++			 opt->repo->hash_algo->rawsz);
     ++cleanup:
     + 	string_list_clear(&plist, 0);
     + 	string_list_clear(&dir_metadata.versions, 0);
     + 	string_list_clear(&dir_metadata.offsets, 0);
     + 	trace2_region_leave("merge", "process_entries cleanup", opt->repo);
     ++
     ++	return ret;
       }
       
     - int cmd_merge_tree(int argc, const char **argv, const char *prefix)
     + /*** Function Grouping: functions related to merge_switch_to_result() ***/
     +@@ merge-ort.c: redo:
     + 	}
     + 
     + 	trace2_region_enter("merge", "process_entries", opt->repo);
     +-	process_entries(opt, &working_tree_oid);
     ++	if (process_entries(opt, &working_tree_oid) < 0)
     ++		result->clean = -1;
     + 	trace2_region_leave("merge", "process_entries", opt->repo);
     + 
     + 	/* Set return values */
     + 	result->path_messages = &opt->priv->conflicts;
     + 
     +-	result->tree = parse_tree_indirect(&working_tree_oid);
     +-	/* existence of conflicted entries implies unclean */
     +-	result->clean &= strmap_empty(&opt->priv->conflicted);
     ++	if (result->clean >= 0) {
     ++		result->tree = parse_tree_indirect(&working_tree_oid);
     ++		/* existence of conflicted entries implies unclean */
     ++		result->clean &= strmap_empty(&opt->priv->conflicted);
     ++	}
     + 	if (!opt->priv->call_depth) {
     + 		result->priv = opt->priv;
     + 		result->_properly_initialized = RESULT_INITIALIZED;
      
       ## t/t4301-merge-tree-write-tree.sh ##
      @@ t/t4301-merge-tree-write-tree.sh: test_expect_success 'can override merge of unrelated histories' '


 merge-ort.c                      | 64 +++++++++++++++++++-------------
 t/t4301-merge-tree-write-tree.sh |  9 +++++
 2 files changed, 48 insertions(+), 25 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 99dcee2db8a..f654296220e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
 				 b->string, strlen(b->string), bmi->result.mode);
 }
 
-static void write_tree(struct object_id *result_oid,
-		       struct string_list *versions,
-		       unsigned int offset,
-		       size_t hash_size)
+static int write_tree(struct object_id *result_oid,
+		      struct string_list *versions,
+		      unsigned int offset,
+		      size_t hash_size)
 {
 	size_t maxlen = 0, extra;
 	unsigned int nr;
 	struct strbuf buf = STRBUF_INIT;
-	int i;
+	int i, ret;
 
 	assert(offset <= versions->nr);
 	nr = versions->nr - offset;
@@ -3605,8 +3605,9 @@ static void write_tree(struct object_id *result_oid,
 	}
 
 	/* Write this object file out, and record in result_oid */
-	write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
+	ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
 	strbuf_release(&buf);
+	return ret;
 }
 
 static void record_entry_for_tree(struct directory_versions *dir_metadata,
@@ -3625,13 +3626,13 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata,
 			   basename)->util = &mi->result;
 }
 
-static void write_completed_directory(struct merge_options *opt,
-				      const char *new_directory_name,
-				      struct directory_versions *info)
+static int write_completed_directory(struct merge_options *opt,
+				     const char *new_directory_name,
+				     struct directory_versions *info)
 {
 	const char *prev_dir;
 	struct merged_info *dir_info = NULL;
-	unsigned int offset;
+	unsigned int offset, ret = 0;
 
 	/*
 	 * Some explanation of info->versions and info->offsets...
@@ -3717,7 +3718,7 @@ static void write_completed_directory(struct merge_options *opt,
 	 * strcmp here.)
 	 */
 	if (new_directory_name == info->last_directory)
-		return;
+		return 0;
 
 	/*
 	 * If we are just starting (last_directory is NULL), or last_directory
@@ -3739,7 +3740,7 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		string_list_append(&info->offsets,
 				   info->last_directory)->util = (void*)offset;
-		return;
+		return 0;
 	}
 
 	/*
@@ -3769,8 +3770,9 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		dir_info->is_null = 0;
 		dir_info->result.mode = S_IFDIR;
-		write_tree(&dir_info->result.oid, &info->versions, offset,
-			   opt->repo->hash_algo->rawsz);
+		if (write_tree(&dir_info->result.oid, &info->versions, offset,
+			       opt->repo->hash_algo->rawsz) < 0)
+			ret = -1;
 	}
 
 	/*
@@ -3798,6 +3800,8 @@ static void write_completed_directory(struct merge_options *opt,
 	/* And, of course, we need to update last_directory to match. */
 	info->last_directory = new_directory_name;
 	info->last_directory_len = strlen(info->last_directory);
+
+	return ret;
 }
 
 /* Per entry merge function */
@@ -4214,8 +4218,8 @@ static void prefetch_for_content_merges(struct merge_options *opt,
 	oid_array_clear(&to_fetch);
 }
 
-static void process_entries(struct merge_options *opt,
-			    struct object_id *result_oid)
+static int process_entries(struct merge_options *opt,
+			   struct object_id *result_oid)
 {
 	struct hashmap_iter iter;
 	struct strmap_entry *e;
@@ -4224,11 +4228,12 @@ static void process_entries(struct merge_options *opt,
 	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
 						   STRING_LIST_INIT_NODUP,
 						   NULL, 0 };
+	int ret;
 
 	trace2_region_enter("merge", "process_entries setup", opt->repo);
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
-		return;
+		return 0;
 	}
 
 	/* Hack to pre-allocate plist to the desired size */
@@ -4270,8 +4275,11 @@ static void process_entries(struct merge_options *opt,
 		 */
 		struct merged_info *mi = entry->util;
 
-		write_completed_directory(opt, mi->directory_name,
-					  &dir_metadata);
+		if (write_completed_directory(opt, mi->directory_name,
+					      &dir_metadata) < 0) {
+			ret = -1;
+			goto cleanup;
+		}
 		if (mi->clean)
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
@@ -4291,12 +4299,15 @@ static void process_entries(struct merge_options *opt,
 		fflush(stdout);
 		BUG("dir_metadata accounting completely off; shouldn't happen");
 	}
-	write_tree(result_oid, &dir_metadata.versions, 0,
-		   opt->repo->hash_algo->rawsz);
+	ret = write_tree(result_oid, &dir_metadata.versions, 0,
+			 opt->repo->hash_algo->rawsz);
+cleanup:
 	string_list_clear(&plist, 0);
 	string_list_clear(&dir_metadata.versions, 0);
 	string_list_clear(&dir_metadata.offsets, 0);
 	trace2_region_leave("merge", "process_entries cleanup", opt->repo);
+
+	return ret;
 }
 
 /*** Function Grouping: functions related to merge_switch_to_result() ***/
@@ -4928,15 +4939,18 @@ redo:
 	}
 
 	trace2_region_enter("merge", "process_entries", opt->repo);
-	process_entries(opt, &working_tree_oid);
+	if (process_entries(opt, &working_tree_oid) < 0)
+		result->clean = -1;
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
 	result->path_messages = &opt->priv->conflicts;
 
-	result->tree = parse_tree_indirect(&working_tree_oid);
-	/* existence of conflicted entries implies unclean */
-	result->clean &= strmap_empty(&opt->priv->conflicted);
+	if (result->clean >= 0) {
+		result->tree = parse_tree_indirect(&working_tree_oid);
+		/* existence of conflicted entries implies unclean */
+		result->clean &= strmap_empty(&opt->priv->conflicted);
+	}
 	if (!opt->priv->call_depth) {
 		result->priv = opt->priv;
 		result->_properly_initialized = RESULT_INITIALIZED;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..013b77144bd 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done

base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:56   ` Elijah Newren
  2022-09-21 23:11     ` Junio C Hamano
  2022-09-22 17:24     ` Johannes Schindelin
@ 2022-09-22 19:50     ` Johannes Schindelin
  2022-09-23  1:47       ` Elijah Newren
  2 siblings, 1 reply; 29+ messages in thread
From: Johannes Schindelin @ 2022-09-22 19:50 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

Hi Elijah,

now that I have studied more of the code, hunted down Coverity reports
about `merge-ort.c` and `builtin/merge-tree.c` and essentially have grown
a lot more confidence about my patch, I'll take the time to respond in a
bit more detail.

On Wed, 21 Sep 2022, Elijah Newren wrote:

> On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > Independent of the question whether we want `git merge-tree` to report
> > the tree name even when it failed to write the tree objects in a
> > read-only repository, there is no question that we should avoid a
> > segmentation fault.
>
> Ah, a read-only repo.  Looks like we didn't bother to check the return
> status of write_object_file() in one of the two cases; oops.  (And it
> looks like the other case does not correctly propagate the error
> upwards far enough...)

Indeed, that was the actual root cause, but I had failed to even look
whether the return values of `write_tree()` and `write_object_file()` were
respected. Narrator's voice: they weren't.

> > And when we report an invalid tree name (because the tree could not be
> > written), we need the exit status to be non-zero.
> >
> > Let's make it so.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >     merge-tree: fix segmentation fault in read-only repositories
> >
> >     Turns out that the segmentation fault reported by Taylor
> >     [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened
> >     while testing merge-ort in a read-only repository, and that the upstream
> >     version of git merge-tree is as affected as GitHub's internal version.
> >
> >     Note: I briefly considered using the OID of the_hash_algo->empty_tree
> >     instead of null_oid() when no tree object could be constructed. However,
> >     I have come to the conclusion that this could potentially cause its own
> >     set of problems because it would relate to a valid tree object even if
> >     we do not have any valid tree object to play with.
> >
> >     Also note: The question I hinted at in the commit message, namely
> >     whether or not to try harder to construct a tree object even if we
> >     cannot write it out, maybe merits a longer discussion, one that I think
> >     we should have after v2.38.0 is released, so as not to distract from
> >     focusing on v2.38.0.
>
> That's fair, but you know I'm going to have a hard time refraining
> from commenting on it anyway...  :-)

Hahaha, of course you would ;-)

> [...]
> > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> >         if (o->show_messages == -1)
> >                 o->show_messages = !result.clean;
> >
> > -       printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > +       tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > +       printf("%s%c", oid_to_hex(tree_oid), line_termination);
>
> Perhaps also print a warning to stderr when result.tree is NULL?

I ended up not even touching `builtin/merge-tree.c` but instead ensuring
that `result.clean` is negative if we fail to write an object (which could
happen even in read/write repositories, think "disk full").

As a consequence, we do not even reach this `printf()` anymore, as you
pointed out, a negative `result.clean` is handled much earlier.

It is handled via a `die()` in `real_merge()`, and that will need to
change, I think, if we want to continue on the batched merges.

> >         if (!result.clean) {
> >                 struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
> >                 const char *last = NULL;
> > @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
> >                                               &result);
> >         }
> >         merge_finalize(&opt, &result);
> > -       return !result.clean; /* result.clean < 0 handled above */
> > +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> Thinking out loud, should this logic be at the merge-ort.c level,
> perhaps something like this:
>
> @@ -4940,6 +4941,9 @@ static void
> merge_ort_nonrecursive_internal(struct merge_options *opt,
>         result->tree = parse_tree_indirect(&working_tree_oid);
>         /* existence of conflicted entries implies unclean */
>         result->clean &= strmap_empty(&opt->priv->conflicted);
> +       if (!result->tree)
> +               /* This shouldn't happen, because if we did fail to
> write a tree we should have returned early before getting here.  But
> just in case we have some bugs... */
> +               result->clean = -1;
>         if (!opt->priv->call_depth) {
>                 result->priv = opt->priv;
>                 result->_properly_initialized = RESULT_INITIALIZED;
>
> That might benefit callers other than merge-tree, though maybe it
> makes it harder to print a helpful error message (unless we're fine
> with the library always throwing one?)

The error messages are already thrown about (this is how it looks like
with v3 of this patch):

	[...]
	+ git -C read-only merge-tree side1 side2
	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add greeting to database
	error: insufficient permission for adding an object to repository database ./objects
	fatal: failure to merge
	+ exit_code=128
	[...]

> > diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> > index 28ca5c38bb5..013b77144bd 100755
> > --- a/t/t4301-merge-tree-write-tree.sh
> > +++ b/t/t4301-merge-tree-write-tree.sh
> > @@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
> >         test_cmp expect actual
> >  '
> >
> > +test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
> > +       git init --bare read-only &&
> > +       git push read-only side1 side2 side3 &&
> > +       test_when_finished "chmod -R u+w read-only" &&
> > +       chmod -R a-w read-only &&
> > +       test_must_fail git -C read-only merge-tree side1 side3 &&
> > +       test_must_fail git -C read-only merge-tree side1 side2
> > +'
> > +
> >  test_done
> >
> > base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
> > --
> > gitgitgadget
>
> This is a reasonable workaround from the merge-tree side, if we expect
> merge-ort to not correctly notify us of issues (which may be fair
> given that a quick glance suggests we have more than one problematic
> codepath, which I'll discuss more below).  However, I do think
> merge-ort should be returning a negative value for the "clean" status
> in such a case.

You're absolutely right. That's what v3 now does.

> If merge-ort did that, then we wouldn't even get to your new code
> because merge-tree.c already checks for that:
>
>     merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
>     if (result.clean < 0)
>         die(_("failure to merge"));
>
> I have a hack above which sets a negative return value, but it's only
> a workaround since it comes from a late detect-after-the-fact
> location.  Here's another ugly hack, but one which highlights exactly
> where the real problem occurs:
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 99dcee2db8..c2144e9220 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -3605,7 +3605,8 @@ static void write_tree(struct object_id *result_oid,
>         }
>
>         /* Write this object file out, and record in result_oid */
> -       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
> +       if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
> +               die(_("Unable to add new tree to database"));
>         strbuf_release(&buf);
>  }
>
> die()ing, of course, is not a good choice, we should really return -1
> instead.  Unfortunately, the callers aren't currently prepared to
> propagate such a value upwards (as reflected in the return type of the
> function) so any fix in this area is a little more involved than just
> modifying these few lines.  Also, there is one other call to
> write_object_file() in merge-ort.c which does check and return a value
> of -1, though looking around it appears the callers of that other site
> do not always correctly check and propagate a return value of -1
> further upwards as they should, meaning we are losing track of the
> fact that the merge failed to function.
>
> Given the lack of correct propagation of -1 return values in the other
> codepath plus the error here, keeping some form of workaround sounds
> fair (though it may be nice to print an error that something fishy
> happened).
>
> And then, possibly post-v2.38.0 though we may be able to get it in
> before, getting correct propagation of a -1 return value from the
> source of the error would be good.  Would you like to look into that,
> or would you prefer I did?

It turned out not to be too bad.

Essentially, to propagate `write_object_file()` failures I only had to
change a couple of signatures (with the corresponding `return`s):
`write_tree()`, `write_completed_directory()`, and `process_entries()`.
And then, of course, I needed to change
`merge_ort_nonrecursive_internal()` to respect the propagated error by
setting `result->clean = -1`.

Pretty straight-forward, really, much less involved than I had expected.

Thank you!
Dscho

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-21 22:24   ` Junio C Hamano
@ 2022-09-22 19:52     ` Johannes Schindelin
  0 siblings, 0 replies; 29+ messages in thread
From: Johannes Schindelin @ 2022-09-22 19:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Elijah Newren, Taylor Blau

Hi Junio,

On Wed, 21 Sep 2022, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > +	const struct object_id *tree_oid;
> >
> >  	parent1 = get_merge_parent(branch1);
> >  	if (!parent1)
> > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> >  	if (o->show_messages == -1)
> >  		o->show_messages = !result.clean;
> >
> > -	printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > +	tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > +	printf("%s%c", oid_to_hex(tree_oid), line_termination);
>
> As we are not using tree_oid anywhere else, it might be cleaner to
> switch on result.tree more explicitly, especially given that the
> return statement also uses result.tree to decide what exit status to
> use, e.g.
>
> 	if (result.tree)
> 		fputs(oid_to_hex(&result.tree->object.oid), stdout);
> 	else
> 		fputs(oid_to_hex(null_oid()), stdout);
> 	fputc(line_termination, stdout);
>
> but that is merely "might be cleaner".

I would call it "more readable", even, not just "cleaner" :-)

> > -	return !result.clean; /* result.clean < 0 handled above */
> > +	return !result.tree || !result.clean; /* result.clean < 0 handled above */
>
> The original returned 1 when we failed to cleanly merge and 0 when
> we cleanly merged.  Now, if we failed to come up with a merged tree,
> result.tree would be NULL and we return 1 from the function, because
> we failed to cleanly merge.
>
> OK.  These negations make my head spin X-<.

And here I thought that you couldn't possibly not like if I didn't refrain
from not avoiding double negations...

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3] merge-ort: fix segmentation fault in read-only repositories
  2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
@ 2022-09-23  1:38     ` Elijah Newren
  2022-09-23  9:55       ` Johannes Schindelin
  2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
  1 sibling, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2022-09-23  1:38 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: Git Mailing List, Taylor Blau, Johannes Schindelin

On Thu, Sep 22, 2022 at 12:46 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> If the blob/tree objects cannot be written, we really need the merge
> operations to fail, and not to continue (and then try to access the tree
> object which is however still set to `NULL`).
>
> Let's stop ignoring the return value of `write_object_file()` and
> `write_tree()` and set `clean = -1` in the error case.

:-)

>  merge-ort.c                      | 64 +++++++++++++++++++-------------
>  t/t4301-merge-tree-write-tree.sh |  9 +++++
>  2 files changed, 48 insertions(+), 25 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 99dcee2db8a..f654296220e 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
>                                  b->string, strlen(b->string), bmi->result.mode);
>  }
>
> -static void write_tree(struct object_id *result_oid,
> -                      struct string_list *versions,
> -                      unsigned int offset,
> -                      size_t hash_size)
> +static int write_tree(struct object_id *result_oid,
> +                     struct string_list *versions,
> +                     unsigned int offset,
> +                     size_t hash_size)
>  {
>         size_t maxlen = 0, extra;
>         unsigned int nr;
>         struct strbuf buf = STRBUF_INIT;
> -       int i;
> +       int i, ret;
>
>         assert(offset <= versions->nr);
>         nr = versions->nr - offset;
> @@ -3605,8 +3605,9 @@ static void write_tree(struct object_id *result_oid,
>         }
>
>         /* Write this object file out, and record in result_oid */
> -       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
> +       ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);

This sent me into a bit of a hunt to try to figure out what the return
value of write_object_file() is.  I know it's non-zero on failure, but
don't know if it's always positive or always negative or possibly a
mix and how that maps to the idea of "clean_merge".  I gave up after a
few indirections, to be honest.

Anyway, regardless of my inability to find the answer to the above
question, would this be a bit easier to read if we initialized ret to
0 and then did something like

    if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
        ret = -1;

?

>         strbuf_release(&buf);
> +       return ret;
>  }
>
>  static void record_entry_for_tree(struct directory_versions *dir_metadata,
> @@ -3625,13 +3626,13 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata,
>                            basename)->util = &mi->result;
>  }
>
> -static void write_completed_directory(struct merge_options *opt,
> -                                     const char *new_directory_name,
> -                                     struct directory_versions *info)
> +static int write_completed_directory(struct merge_options *opt,
> +                                    const char *new_directory_name,
> +                                    struct directory_versions *info)
>  {
>         const char *prev_dir;
>         struct merged_info *dir_info = NULL;
> -       unsigned int offset;
> +       unsigned int offset, ret = 0;
>
>         /*
>          * Some explanation of info->versions and info->offsets...
> @@ -3717,7 +3718,7 @@ static void write_completed_directory(struct merge_options *opt,
>          * strcmp here.)
>          */
>         if (new_directory_name == info->last_directory)
> -               return;
> +               return 0;
>
>         /*
>          * If we are just starting (last_directory is NULL), or last_directory
> @@ -3739,7 +3740,7 @@ static void write_completed_directory(struct merge_options *opt,
>                  */
>                 string_list_append(&info->offsets,
>                                    info->last_directory)->util = (void*)offset;
> -               return;
> +               return 0;
>         }
>
>         /*
> @@ -3769,8 +3770,9 @@ static void write_completed_directory(struct merge_options *opt,
>                  */
>                 dir_info->is_null = 0;
>                 dir_info->result.mode = S_IFDIR;
> -               write_tree(&dir_info->result.oid, &info->versions, offset,
> -                          opt->repo->hash_algo->rawsz);
> +               if (write_tree(&dir_info->result.oid, &info->versions, offset,
> +                              opt->repo->hash_algo->rawsz) < 0)
> +                       ret = -1;

This makes me wonder about write_tree() again.  What if its call to
write_object_file() returns a value greater than 0?  Is that possible?
 If it is, are those error cases or not?  About half the callers in
the code base that check the return value of write_object_file() check
for a non-zero value, the other half check for a value less than 0.
And I can't find any documentation.  And it seemed like too much time
for me to figure out what its range of return values was.

>         }
>
>         /*
> @@ -3798,6 +3800,8 @@ static void write_completed_directory(struct merge_options *opt,
>         /* And, of course, we need to update last_directory to match. */
>         info->last_directory = new_directory_name;
>         info->last_directory_len = strlen(info->last_directory);
> +
> +       return ret;
>  }
>
>  /* Per entry merge function */
> @@ -4214,8 +4218,8 @@ static void prefetch_for_content_merges(struct merge_options *opt,
>         oid_array_clear(&to_fetch);
>  }
>
> -static void process_entries(struct merge_options *opt,
> -                           struct object_id *result_oid)
> +static int process_entries(struct merge_options *opt,
> +                          struct object_id *result_oid)
>  {
>         struct hashmap_iter iter;
>         struct strmap_entry *e;
> @@ -4224,11 +4228,12 @@ static void process_entries(struct merge_options *opt,
>         struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
>                                                    STRING_LIST_INIT_NODUP,
>                                                    NULL, 0 };
> +       int ret;
>
>         trace2_region_enter("merge", "process_entries setup", opt->repo);
>         if (strmap_empty(&opt->priv->paths)) {
>                 oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
> -               return;
> +               return 0;
>         }
>
>         /* Hack to pre-allocate plist to the desired size */
> @@ -4270,8 +4275,11 @@ static void process_entries(struct merge_options *opt,
>                  */
>                 struct merged_info *mi = entry->util;
>
> -               write_completed_directory(opt, mi->directory_name,
> -                                         &dir_metadata);
> +               if (write_completed_directory(opt, mi->directory_name,
> +                                             &dir_metadata) < 0) {
> +                       ret = -1;
> +                       goto cleanup;
> +               }
>                 if (mi->clean)
>                         record_entry_for_tree(&dir_metadata, path, mi);
>                 else {
> @@ -4291,12 +4299,15 @@ static void process_entries(struct merge_options *opt,
>                 fflush(stdout);
>                 BUG("dir_metadata accounting completely off; shouldn't happen");
>         }
> -       write_tree(result_oid, &dir_metadata.versions, 0,
> -                  opt->repo->hash_algo->rawsz);
> +       ret = write_tree(result_oid, &dir_metadata.versions, 0,
> +                        opt->repo->hash_algo->rawsz);
> +cleanup:
>         string_list_clear(&plist, 0);
>         string_list_clear(&dir_metadata.versions, 0);
>         string_list_clear(&dir_metadata.offsets, 0);
>         trace2_region_leave("merge", "process_entries cleanup", opt->repo);
> +
> +       return ret;
>  }
>
>  /*** Function Grouping: functions related to merge_switch_to_result() ***/
> @@ -4928,15 +4939,18 @@ redo:
>         }
>
>         trace2_region_enter("merge", "process_entries", opt->repo);
> -       process_entries(opt, &working_tree_oid);
> +       if (process_entries(opt, &working_tree_oid) < 0)
> +               result->clean = -1;
>         trace2_region_leave("merge", "process_entries", opt->repo);
>
>         /* Set return values */
>         result->path_messages = &opt->priv->conflicts;
>
> -       result->tree = parse_tree_indirect(&working_tree_oid);
> -       /* existence of conflicted entries implies unclean */
> -       result->clean &= strmap_empty(&opt->priv->conflicted);
> +       if (result->clean >= 0) {
> +               result->tree = parse_tree_indirect(&working_tree_oid);
> +               /* existence of conflicted entries implies unclean */
> +               result->clean &= strmap_empty(&opt->priv->conflicted);
> +       }
>         if (!opt->priv->call_depth) {
>                 result->priv = opt->priv;
>                 result->_properly_initialized = RESULT_INITIALIZED;
> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 28ca5c38bb5..013b77144bd 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
>         test_cmp expect actual
>  '
>
> +test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
> +       git init --bare read-only &&
> +       git push read-only side1 side2 side3 &&
> +       test_when_finished "chmod -R u+w read-only" &&
> +       chmod -R a-w read-only &&
> +       test_must_fail git -C read-only merge-tree side1 side3 &&
> +       test_must_fail git -C read-only merge-tree side1 side2
> +'
> +
>  test_done
>
> base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
> --
> gitgitgadget

Nice to see that it didn't take too much work to propagate the -1
value from write_tree() up the hierarchy.  Nice work!

I think we've still got some separate problems related to propagating
the return value of the other write_object_file() call, from
handle_content_merge().  The direct call in that other case is okay,
but the higher levels at process_renames() and process_entry() seem to
fumble it.  Your fix for failed tree writes is still valid without
fixing the failed blob writes, and I'm happy to tackle the other half
if you'd prefer to hand it off.  But, if you'd like to tackle that
half too, you might see fewer error messages when it fails to write to
the read-only repository, since it'll fail early on the first blob
write instead of only failing when it gets around to trying to write a
new tree.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-22 19:50     ` Johannes Schindelin
@ 2022-09-23  1:47       ` Elijah Newren
  0 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren @ 2022-09-23  1:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

On Thu, Sep 22, 2022 at 12:50 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> now that I have studied more of the code, hunted down Coverity reports
> about `merge-ort.c` and `builtin/merge-tree.c` and essentially have grown
> a lot more confidence about my patch, I'll take the time to respond in a
> bit more detail.
>
> On Wed, 21 Sep 2022, Elijah Newren wrote:
>
> > On Wed, Sep 21, 2022 at 3:08 PM Johannes Schindelin via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> > >
> > > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > >
[...]
> > > @@ -446,7 +447,8 @@ static int real_merge(struct merge_tree_options *o,
> > >         if (o->show_messages == -1)
> > >                 o->show_messages = !result.clean;
> > >
> > > -       printf("%s%c", oid_to_hex(&result.tree->object.oid), line_termination);
> > > +       tree_oid = result.tree ? &result.tree->object.oid : null_oid();
> > > +       printf("%s%c", oid_to_hex(tree_oid), line_termination);
> >
> > Perhaps also print a warning to stderr when result.tree is NULL?
>
> I ended up not even touching `builtin/merge-tree.c` but instead ensuring
> that `result.clean` is negative if we fail to write an object (which could
> happen even in read/write repositories, think "disk full").
>
> As a consequence, we do not even reach this `printf()` anymore, as you
> pointed out, a negative `result.clean` is handled much earlier.

Works for me.  :-)

> It is handled via a `die()` in `real_merge()`, and that will need to
> change, I think, if we want to continue on the batched merges.

Indeed, good point.

> > >         if (!result.clean) {
> > >                 struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
> > >                 const char *last = NULL;
> > > @@ -473,7 +475,7 @@ static int real_merge(struct merge_tree_options *o,
> > >                                               &result);
> > >         }
> > >         merge_finalize(&opt, &result);
> > > -       return !result.clean; /* result.clean < 0 handled above */
> > > +       return !result.tree || !result.clean; /* result.clean < 0 handled above */
> >
> > Thinking out loud, should this logic be at the merge-ort.c level,
> > perhaps something like this:
> >
> > @@ -4940,6 +4941,9 @@ static void
> > merge_ort_nonrecursive_internal(struct merge_options *opt,
> >         result->tree = parse_tree_indirect(&working_tree_oid);
> >         /* existence of conflicted entries implies unclean */
> >         result->clean &= strmap_empty(&opt->priv->conflicted);
> > +       if (!result->tree)
> > +               /* This shouldn't happen, because if we did fail to
> > write a tree we should have returned early before getting here.  But
> > just in case we have some bugs... */
> > +               result->clean = -1;
> >         if (!opt->priv->call_depth) {
> >                 result->priv = opt->priv;
> >                 result->_properly_initialized = RESULT_INITIALIZED;
> >
> > That might benefit callers other than merge-tree, though maybe it
> > makes it harder to print a helpful error message (unless we're fine
> > with the library always throwing one?)
>
> The error messages are already thrown about (this is how it looks like
> with v3 of this patch):
>
>         [...]
>         + git -C read-only merge-tree side1 side2
>         error: insufficient permission for adding an object to repository database ./objects
>         error: error: Unable to add numbers to database
>         error: insufficient permission for adding an object to repository database ./objects
>         error: error: Unable to add greeting to database

Ah, this confirms my suspicions.  These lines look like two failures,
both for adding blob objects.  This shows that the return value from
the write_object_file() call within handle_content_merge() is not
being correctly propagated upwards, and the merge is (erroneously)
continuing on despite that.

>         error: insufficient permission for adding an object to repository database ./objects

And I'm guessing the lack of the "Unable to add %s to database"
following message means this was the attempt to write a new tree.

>         fatal: failure to merge

And this shows that with your patch, you are propagating and catching
the failure to write the tree object and aborting, so we at least
catch failures to write trees now.

[...]
> > And then, possibly post-v2.38.0 though we may be able to get it in
> > before, getting correct propagation of a -1 return value from the
> > source of the error would be good.  Would you like to look into that,
> > or would you prefer I did?
>
> It turned out not to be too bad.
>
> Essentially, to propagate `write_object_file()` failures I only had to
> change a couple of signatures (with the corresponding `return`s):
> `write_tree()`, `write_completed_directory()`, and `process_entries()`.
> And then, of course, I needed to change
> `merge_ort_nonrecursive_internal()` to respect the propagated error by
> setting `result->clean = -1`.
>
> Pretty straight-forward, really, much less involved than I had expected.

Nice!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3] merge-ort: fix segmentation fault in read-only repositories
  2022-09-23  1:38     ` Elijah Newren
@ 2022-09-23  9:55       ` Johannes Schindelin
  0 siblings, 0 replies; 29+ messages in thread
From: Johannes Schindelin @ 2022-09-23  9:55 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List, Taylor Blau

Hi Elijah,

On Thu, 22 Sep 2022, Elijah Newren wrote:

> On Thu, Sep 22, 2022 at 12:46 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > diff --git a/merge-ort.c b/merge-ort.c
> > index 99dcee2db8a..f654296220e 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
> >                                  b->string, strlen(b->string), bmi->result.mode);
> >  }
> >
> > -static void write_tree(struct object_id *result_oid,
> > -                      struct string_list *versions,
> > -                      unsigned int offset,
> > -                      size_t hash_size)
> > +static int write_tree(struct object_id *result_oid,
> > +                     struct string_list *versions,
> > +                     unsigned int offset,
> > +                     size_t hash_size)
> >  {
> >         size_t maxlen = 0, extra;
> >         unsigned int nr;
> >         struct strbuf buf = STRBUF_INIT;
> > -       int i;
> > +       int i, ret;
> >
> >         assert(offset <= versions->nr);
> >         nr = versions->nr - offset;
> > @@ -3605,8 +3605,9 @@ static void write_tree(struct object_id *result_oid,
> >         }
> >
> >         /* Write this object file out, and record in result_oid */
> > -       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
> > +       ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
>
> This sent me into a bit of a hunt to try to figure out what the return
> value of write_object_file() is.  I know it's non-zero on failure, but
> don't know if it's always positive or always negative or possibly a
> mix and how that maps to the idea of "clean_merge".  I gave up after a
> few indirections, to be honest.
>
> Anyway, regardless of my inability to find the answer to the above
> question, would this be a bit easier to read if we initialized ret to
> 0 and then did something like
>
>     if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
>         ret = -1;
>
> ?

You're right, this is a better pattern to follow because it makes it
easier to wrap your head around.

> > [...]
> > @@ -3769,8 +3770,9 @@ static void write_completed_directory(struct merge_options *opt,
> >                  */
> >                 dir_info->is_null = 0;
> >                 dir_info->result.mode = S_IFDIR;
> > -               write_tree(&dir_info->result.oid, &info->versions, offset,
> > -                          opt->repo->hash_algo->rawsz);
> > +               if (write_tree(&dir_info->result.oid, &info->versions, offset,
> > +                              opt->repo->hash_algo->rawsz) < 0)
> > +                       ret = -1;
>
> This makes me wonder about write_tree() again.  What if its call to
> write_object_file() returns a value greater than 0?  Is that possible?

No, `write_object_file()` returns either 0 or -1, always. I followed all
the code paths.

Here is a tangent (feel free to skip over this lengthy analysis):

These code paths include a _very, very_ misleading one. In
`write_loose_object()` (which is called via the call chain
`write_object_file()` -> `write_object_file_flags()` ->
`write_object_file_flags()`), we call `finalize_object_file()`, see
https://github.com/git/git/blob/v2.37.3/object-file.c#L2030). And in that
`finalize_object_file()` function, we set `ret = errno;` in two places
(https://github.com/git/git/blob/v2.37.3/object-file.c#L1833 and
https://github.com/git/git/blob/v2.37.3/object-file.c#L1850).

When I read that, I chased down what the conventions are for `errno`
values. Now, while
https://pubs.opengroup.org/onlinepubs/007908799/xsh/errno.h.html does not
say anything about `errno` values being non-negative,
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html
very much says:

	Values for errno are now required to be distinct positive values
	rather than non-zero values. This change is for alignment with the
	ISO/IEC 9899:1999 standard.

Terrible, right? If we set `ret = errno`, that means that we return
something positive!

I then even dug up the origins of the `ret = errno` statements. Turns out
that yours truly introduced the first of these `ret = errno` statements in
9e48b389990 (Work around missing hard links on FAT formatted media,
2005-10-26). :facepalm:

So why did I do such a thing? Well, back then, the `ret` value _already_
contained an `errno`, thanks to calling `link_temp_to_file()`, which _did_
return an `errno`. In fact, there was no code path in
`move_temp_to_file()` (as the function was called back then) that would
return -1, it always returned either 0 or an `errno`.

And after coming so far, after one hour of analyzing, I finally, finally
realized that there is no `return ret;` in that function. That function
has a `ret` variable that is never returned _from_ that function, but it
used to be returned _to_ that function, by the `link_temp_to_file()`
function that does not even exist any longer.

In my defense, this terribly misleading code was not even introduced by
myself. It was introduced in 230f13225df (Create object subdirectories on
demand, 2005-10-08).

tl;dr all is good, `finalize_object_file()` returns 0 upon success, -1
upon failure.

> If it is, are those error cases or not?  About half the callers in the
> code base that check the return value of write_object_file() check for a
> non-zero value, the other half check for a value less than 0.

And then there is `apply.c` where the return value isn't even checked at
all most of the time.

> And I can't find any documentation.  And it seemed like too much time
> for me to figure out what its range of return values was.
>
> [...]
>
> Nice to see that it didn't take too much work to propagate the -1 value
>from write_tree() up the hierarchy.  Nice work!

Thank you!

> I think we've still got some separate problems related to propagating
> the return value of the other write_object_file() call, from
> handle_content_merge().  The direct call in that other case is okay, but
> the higher levels at process_renames() and process_entry() seem to
> fumble it.  Your fix for failed tree writes is still valid without
> fixing the failed blob writes, and I'm happy to tackle the other half if
> you'd prefer to hand it off.  But, if you'd like to tackle that half
> too, you might see fewer error messages when it fails to write to the
> read-only repository, since it'll fail early on the first blob write
> instead of only failing when it gets around to trying to write a new
> tree.

It'll be part of v4 ;-)

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v4 0/2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
  2022-09-23  1:38     ` Elijah Newren
@ 2022-09-26 21:55     ` Johannes Schindelin via GitGitGadget
  2022-09-26 21:55       ` [PATCH v4 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
                         ` (3 more replies)
  1 sibling, 4 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-26 21:55 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin

Turns out that the segmentation fault reported by Taylor
[https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened while
testing merge-ort in a read-only repository, and that the upstream version
of git merge-tree is as affected as GitHub's internal version.

Changes since v3:

 * I now consistently use the pattern int ret = 0; ... if (...) ret = -1.
 * I added a commit to properly propagate write failures through the
   handle_content_merge() call path, even if it is not really critical (it
   just fails sooner, but the merge would have failed just the same without
   this patch).

Changes since v2:

 * Completely changed the approach by no longer touching
   builtin/merge-tree.c but instead fixing the underlying merge-ort layer:
   we no longer ignore the return value of write_tree() and
   write_object_file().

Changes since v1:

 * Using the SANITY prereq now
 * If the merge was aborted due to a write error, exit with a non-zero code
   even if the (potentially partial) merge is clean
 * The test case now also verifies that the git merge-tree command fails in
   a read-only repository even if the merge would have succeeded

Johannes Schindelin (2):
  merge-ort: fix segmentation fault in read-only repositories
  merge-ort: return early when failing to write a blob

 merge-ort.c                      | 94 ++++++++++++++++++++------------
 t/t4301-merge-tree-write-tree.sh |  9 +++
 2 files changed, 69 insertions(+), 34 deletions(-)


base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

Range-diff vs v3:

 1:  198ff455f90 ! 1:  ab6df092eba merge-ort: fix segmentation fault in read-only repositories
     @@ merge-ort.c: static int tree_entry_order(const void *a_, const void *b_)
       	unsigned int nr;
       	struct strbuf buf = STRBUF_INIT;
      -	int i;
     -+	int i, ret;
     ++	int i, ret = 0;
       
       	assert(offset <= versions->nr);
       	nr = versions->nr - offset;
     @@ merge-ort.c: static void write_tree(struct object_id *result_oid,
       
       	/* Write this object file out, and record in result_oid */
      -	write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
     -+	ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
     ++	if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
     ++		ret = -1;
       	strbuf_release(&buf);
      +	return ret;
       }
     @@ merge-ort.c: static void process_entries(struct merge_options *opt,
       	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
       						   STRING_LIST_INIT_NODUP,
       						   NULL, 0 };
     -+	int ret;
     ++	int ret = 0;
       
       	trace2_region_enter("merge", "process_entries setup", opt->repo);
       	if (strmap_empty(&opt->priv->paths)) {
     @@ merge-ort.c: static void process_entries(struct merge_options *opt,
       	}
      -	write_tree(result_oid, &dir_metadata.versions, 0,
      -		   opt->repo->hash_algo->rawsz);
     -+	ret = write_tree(result_oid, &dir_metadata.versions, 0,
     -+			 opt->repo->hash_algo->rawsz);
     ++	if (write_tree(result_oid, &dir_metadata.versions, 0,
     ++		       opt->repo->hash_algo->rawsz) < 0)
     ++		ret = -1;
      +cleanup:
       	string_list_clear(&plist, 0);
       	string_list_clear(&dir_metadata.versions, 0);
 -:  ----------- > 2:  087207ae0b0 merge-ort: return early when failing to write a blob

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v4 1/2] merge-ort: fix segmentation fault in read-only repositories
  2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
@ 2022-09-26 21:55       ` Johannes Schindelin via GitGitGadget
  2022-09-26 21:55       ` [PATCH v4 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-26 21:55 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If the blob/tree objects cannot be written, we really need the merge
operations to fail, and not to continue (and then try to access the tree
object which is however still set to `NULL`).

Let's stop ignoring the return value of `write_object_file()` and
`write_tree()` and set `clean = -1` in the error case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 merge-ort.c                      | 66 ++++++++++++++++++++------------
 t/t4301-merge-tree-write-tree.sh |  9 +++++
 2 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 99dcee2db8a..f3bdce1041a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
 				 b->string, strlen(b->string), bmi->result.mode);
 }
 
-static void write_tree(struct object_id *result_oid,
-		       struct string_list *versions,
-		       unsigned int offset,
-		       size_t hash_size)
+static int write_tree(struct object_id *result_oid,
+		      struct string_list *versions,
+		      unsigned int offset,
+		      size_t hash_size)
 {
 	size_t maxlen = 0, extra;
 	unsigned int nr;
 	struct strbuf buf = STRBUF_INIT;
-	int i;
+	int i, ret = 0;
 
 	assert(offset <= versions->nr);
 	nr = versions->nr - offset;
@@ -3605,8 +3605,10 @@ static void write_tree(struct object_id *result_oid,
 	}
 
 	/* Write this object file out, and record in result_oid */
-	write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
+	if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
+		ret = -1;
 	strbuf_release(&buf);
+	return ret;
 }
 
 static void record_entry_for_tree(struct directory_versions *dir_metadata,
@@ -3625,13 +3627,13 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata,
 			   basename)->util = &mi->result;
 }
 
-static void write_completed_directory(struct merge_options *opt,
-				      const char *new_directory_name,
-				      struct directory_versions *info)
+static int write_completed_directory(struct merge_options *opt,
+				     const char *new_directory_name,
+				     struct directory_versions *info)
 {
 	const char *prev_dir;
 	struct merged_info *dir_info = NULL;
-	unsigned int offset;
+	unsigned int offset, ret = 0;
 
 	/*
 	 * Some explanation of info->versions and info->offsets...
@@ -3717,7 +3719,7 @@ static void write_completed_directory(struct merge_options *opt,
 	 * strcmp here.)
 	 */
 	if (new_directory_name == info->last_directory)
-		return;
+		return 0;
 
 	/*
 	 * If we are just starting (last_directory is NULL), or last_directory
@@ -3739,7 +3741,7 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		string_list_append(&info->offsets,
 				   info->last_directory)->util = (void*)offset;
-		return;
+		return 0;
 	}
 
 	/*
@@ -3769,8 +3771,9 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		dir_info->is_null = 0;
 		dir_info->result.mode = S_IFDIR;
-		write_tree(&dir_info->result.oid, &info->versions, offset,
-			   opt->repo->hash_algo->rawsz);
+		if (write_tree(&dir_info->result.oid, &info->versions, offset,
+			       opt->repo->hash_algo->rawsz) < 0)
+			ret = -1;
 	}
 
 	/*
@@ -3798,6 +3801,8 @@ static void write_completed_directory(struct merge_options *opt,
 	/* And, of course, we need to update last_directory to match. */
 	info->last_directory = new_directory_name;
 	info->last_directory_len = strlen(info->last_directory);
+
+	return ret;
 }
 
 /* Per entry merge function */
@@ -4214,8 +4219,8 @@ static void prefetch_for_content_merges(struct merge_options *opt,
 	oid_array_clear(&to_fetch);
 }
 
-static void process_entries(struct merge_options *opt,
-			    struct object_id *result_oid)
+static int process_entries(struct merge_options *opt,
+			   struct object_id *result_oid)
 {
 	struct hashmap_iter iter;
 	struct strmap_entry *e;
@@ -4224,11 +4229,12 @@ static void process_entries(struct merge_options *opt,
 	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
 						   STRING_LIST_INIT_NODUP,
 						   NULL, 0 };
+	int ret = 0;
 
 	trace2_region_enter("merge", "process_entries setup", opt->repo);
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
-		return;
+		return 0;
 	}
 
 	/* Hack to pre-allocate plist to the desired size */
@@ -4270,8 +4276,11 @@ static void process_entries(struct merge_options *opt,
 		 */
 		struct merged_info *mi = entry->util;
 
-		write_completed_directory(opt, mi->directory_name,
-					  &dir_metadata);
+		if (write_completed_directory(opt, mi->directory_name,
+					      &dir_metadata) < 0) {
+			ret = -1;
+			goto cleanup;
+		}
 		if (mi->clean)
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
@@ -4291,12 +4300,16 @@ static void process_entries(struct merge_options *opt,
 		fflush(stdout);
 		BUG("dir_metadata accounting completely off; shouldn't happen");
 	}
-	write_tree(result_oid, &dir_metadata.versions, 0,
-		   opt->repo->hash_algo->rawsz);
+	if (write_tree(result_oid, &dir_metadata.versions, 0,
+		       opt->repo->hash_algo->rawsz) < 0)
+		ret = -1;
+cleanup:
 	string_list_clear(&plist, 0);
 	string_list_clear(&dir_metadata.versions, 0);
 	string_list_clear(&dir_metadata.offsets, 0);
 	trace2_region_leave("merge", "process_entries cleanup", opt->repo);
+
+	return ret;
 }
 
 /*** Function Grouping: functions related to merge_switch_to_result() ***/
@@ -4928,15 +4941,18 @@ redo:
 	}
 
 	trace2_region_enter("merge", "process_entries", opt->repo);
-	process_entries(opt, &working_tree_oid);
+	if (process_entries(opt, &working_tree_oid) < 0)
+		result->clean = -1;
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
 	result->path_messages = &opt->priv->conflicts;
 
-	result->tree = parse_tree_indirect(&working_tree_oid);
-	/* existence of conflicted entries implies unclean */
-	result->clean &= strmap_empty(&opt->priv->conflicted);
+	if (result->clean >= 0) {
+		result->tree = parse_tree_indirect(&working_tree_oid);
+		/* existence of conflicted entries implies unclean */
+		result->clean &= strmap_empty(&opt->priv->conflicted);
+	}
 	if (!opt->priv->call_depth) {
 		result->priv = opt->priv;
 		result->_properly_initialized = RESULT_INITIALIZED;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..013b77144bd 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done
-- 
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v4 2/2] merge-ort: return early when failing to write a blob
  2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
  2022-09-26 21:55       ` [PATCH v4 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
@ 2022-09-26 21:55       ` Johannes Schindelin via GitGitGadget
  2022-09-26 22:07         ` Junio C Hamano
  2022-09-27  8:05         ` Elijah Newren
  2022-09-27  8:11       ` [PATCH v4 0/2] merge-tree: fix segmentation fault in read-only repositories Elijah Newren
  2022-09-28  7:29       ` [PATCH v5 " Johannes Schindelin via GitGitGadget
  3 siblings, 2 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-26 21:55 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the previous commit, we fixed a segmentation fault when a tree object
could not be written.

However, before the tree object is written, `merge-ort` wants to write
out a blob object (except in cases where the merge results in a blob
that already exists in the database). And this can fail, too, but we
ignore that write failure so far.

Since we will always write out a new tree object in addition to the blob
(and if the blob did not exist in the database yet, we can be certain
that the tree object did not exist yet), the merge will _still_ fail at
that point, but it does unnecessary work by continuing after the blob
could not be written.

Let's pay close attention and error out early if the blob could not be
written. This reduces the error output of t4301.25 ("merge-ort fails
gracefully in a read-only repository") from:

	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add greeting to database
	error: insufficient permission for adding an object to repository database ./objects
	fatal: failure to merge

to:

	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	fatal: failure to merge

Note: This patch adjusts two variable declarations from `unsigned` to
`int` because their purpose is to hold the return value of
`handle_content_merge()`, which is of type `int`. The existing users of
those variables are only interested whether that variable is zero or
non-zero, therefore this type change does not affect the existing code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 merge-ort.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index f3bdce1041a..e5f41cce481 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2807,6 +2807,8 @@ static int process_renames(struct merge_options *opt,
 							   pathnames,
 							   1 + 2 * opt->priv->call_depth,
 							   &merged);
+			if (clean_merge < 0)
+				return -1;
 			if (!clean_merge &&
 			    merged.mode == side1->stages[1].mode &&
 			    oideq(&merged.oid, &side1->stages[1].oid))
@@ -2916,7 +2918,7 @@ static int process_renames(struct merge_options *opt,
 			struct version_info merged;
 
 			struct conflict_info *base, *side1, *side2;
-			unsigned clean;
+			int clean;
 
 			pathnames[0] = oldpath;
 			pathnames[other_source_index] = oldpath;
@@ -2937,6 +2939,8 @@ static int process_renames(struct merge_options *opt,
 						     pathnames,
 						     1 + 2 * opt->priv->call_depth,
 						     &merged);
+			if (clean < 0)
+				return -1;
 
 			memcpy(&newinfo->stages[target_index], &merged,
 			       sizeof(merged));
@@ -3806,10 +3810,10 @@ static int write_completed_directory(struct merge_options *opt,
 }
 
 /* Per entry merge function */
-static void process_entry(struct merge_options *opt,
-			  const char *path,
-			  struct conflict_info *ci,
-			  struct directory_versions *dir_metadata)
+static int process_entry(struct merge_options *opt,
+			 const char *path,
+			 struct conflict_info *ci,
+			 struct directory_versions *dir_metadata)
 {
 	int df_file_index = 0;
 
@@ -3823,7 +3827,7 @@ static void process_entry(struct merge_options *opt,
 		record_entry_for_tree(dir_metadata, path, &ci->merged);
 		if (ci->filemask == 0)
 			/* nothing else to handle */
-			return;
+			return 0;
 		assert(ci->df_conflict);
 	}
 
@@ -3870,7 +3874,7 @@ static void process_entry(struct merge_options *opt,
 		 */
 		if (ci->filemask == 1) {
 			ci->filemask = 0;
-			return;
+			return 0;
 		}
 
 		/*
@@ -4065,7 +4069,7 @@ static void process_entry(struct merge_options *opt,
 	} else if (ci->filemask >= 6) {
 		/* Need a two-way or three-way content merge */
 		struct version_info merged_file;
-		unsigned clean_merge;
+		int clean_merge;
 		struct version_info *o = &ci->stages[0];
 		struct version_info *a = &ci->stages[1];
 		struct version_info *b = &ci->stages[2];
@@ -4074,6 +4078,8 @@ static void process_entry(struct merge_options *opt,
 						   ci->pathnames,
 						   opt->priv->call_depth * 2,
 						   &merged_file);
+		if (clean_merge < 0)
+			return -1;
 		ci->merged.clean = clean_merge &&
 				   !ci->df_conflict && !ci->path_conflict;
 		ci->merged.result.mode = merged_file.mode;
@@ -4169,6 +4175,7 @@ static void process_entry(struct merge_options *opt,
 
 	/* Record metadata for ci->merged in dir_metadata */
 	record_entry_for_tree(dir_metadata, path, &ci->merged);
+	return 0;
 }
 
 static void prefetch_for_content_merges(struct merge_options *opt,
@@ -4285,7 +4292,10 @@ static int process_entries(struct merge_options *opt,
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
 			struct conflict_info *ci = (struct conflict_info *)mi;
-			process_entry(opt, path, ci, &dir_metadata);
+			if (process_entry(opt, path, ci, &dir_metadata) < 0) {
+				ret = -1;
+				goto cleanup;
+			};
 		}
 	}
 	trace2_region_leave("merge", "processing", opt->repo);
-- 
@@ -2807,6 +2807,8 @@ static int process_renames(struct merge_options *opt,
 							   pathnames,
 							   1 + 2 * opt->priv->call_depth,
 							   &merged);
+			if (clean_merge < 0)
+				return -1;
 			if (!clean_merge &&
 			    merged.mode == side1->stages[1].mode &&
 			    oideq(&merged.oid, &side1->stages[1].oid))
@@ -2916,7 +2918,7 @@ static int process_renames(struct merge_options *opt,
 			struct version_info merged;
 
 			struct conflict_info *base, *side1, *side2;
-			unsigned clean;
+			int clean;
 
 			pathnames[0] = oldpath;
 			pathnames[other_source_index] = oldpath;
@@ -2937,6 +2939,8 @@ static int process_renames(struct merge_options *opt,
 						     pathnames,
 						     1 + 2 * opt->priv->call_depth,
 						     &merged);
+			if (clean < 0)
+				return -1;
 
 			memcpy(&newinfo->stages[target_index], &merged,
 			       sizeof(merged));
@@ -3806,10 +3810,10 @@ static int write_completed_directory(struct merge_options *opt,
 }
 
 /* Per entry merge function */
-static void process_entry(struct merge_options *opt,
-			  const char *path,
-			  struct conflict_info *ci,
-			  struct directory_versions *dir_metadata)
+static int process_entry(struct merge_options *opt,
+			 const char *path,
+			 struct conflict_info *ci,
+			 struct directory_versions *dir_metadata)
 {
 	int df_file_index = 0;
 
@@ -3823,7 +3827,7 @@ static void process_entry(struct merge_options *opt,
 		record_entry_for_tree(dir_metadata, path, &ci->merged);
 		if (ci->filemask == 0)
 			/* nothing else to handle */
-			return;
+			return 0;
 		assert(ci->df_conflict);
 	}
 
@@ -3870,7 +3874,7 @@ static void process_entry(struct merge_options *opt,
 		 */
 		if (ci->filemask == 1) {
 			ci->filemask = 0;
-			return;
+			return 0;
 		}
 
 		/*
@@ -4065,7 +4069,7 @@ static void process_entry(struct merge_options *opt,
 	} else if (ci->filemask >= 6) {
 		/* Need a two-way or three-way content merge */
 		struct version_info merged_file;
-		unsigned clean_merge;
+		int clean_merge;
 		struct version_info *o = &ci->stages[0];
 		struct version_info *a = &ci->stages[1];
 		struct version_info *b = &ci->stages[2];
@@ -4074,6 +4078,8 @@ static void process_entry(struct merge_options *opt,
 						   ci->pathnames,
 						   opt->priv->call_depth * 2,
 						   &merged_file);
+		if (clean_merge < 0)
+			return -1;
 		ci->merged.clean = clean_merge &&
 				   !ci->df_conflict && !ci->path_conflict;
 		ci->merged.result.mode = merged_file.mode;
@@ -4169,6 +4175,7 @@ static void process_entry(struct merge_options *opt,
 
 	/* Record metadata for ci->merged in dir_metadata */
 	record_entry_for_tree(dir_metadata, path, &ci->merged);
+	return 0;
 }
 
 static void prefetch_for_content_merges(struct merge_options *opt,
@@ -4285,7 +4292,10 @@ static int process_entries(struct merge_options *opt,
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
 			struct conflict_info *ci = (struct conflict_info *)mi;
-			process_entry(opt, path, ci, &dir_metadata);
+			if (process_entry(opt, path, ci, &dir_metadata) < 0) {
+				ret = -1;
+				goto cleanup;
+			};
 		}
 	}
 	trace2_region_leave("merge", "processing", opt->repo);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 2/2] merge-ort: return early when failing to write a blob
  2022-09-26 21:55       ` [PATCH v4 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
@ 2022-09-26 22:07         ` Junio C Hamano
  2022-09-27  8:05         ` Elijah Newren
  1 sibling, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2022-09-26 22:07 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Taylor Blau, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In the previous commit, we fixed a segmentation fault when a tree object
> could not be written.
>
> However, before the tree object is written, `merge-ort` wants to write
> out a blob object (except in cases where the merge results in a blob
> that already exists in the database). And this can fail, too, but we
> ignore that write failure so far.

Nice find.

As long as we create at least one "new" blob the repository has not
seen, we have a chance to exit before we attempt to write a tree
object (and notice a failure).  Checking the return value of
process_entry() for failures is a good thing to do.

Will replace.  Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 2/2] merge-ort: return early when failing to write a blob
  2022-09-26 21:55       ` [PATCH v4 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
  2022-09-26 22:07         ` Junio C Hamano
@ 2022-09-27  8:05         ` Elijah Newren
  2022-09-27 16:45           ` Junio C Hamano
  1 sibling, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2022-09-27  8:05 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: Git Mailing List, Taylor Blau, Johannes Schindelin

On Mon, Sep 26, 2022 at 2:55 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In the previous commit, we fixed a segmentation fault when a tree object
> could not be written.
>
> However, before the tree object is written, `merge-ort` wants to write
> out a blob object (except in cases where the merge results in a blob
> that already exists in the database). And this can fail, too, but we
> ignore that write failure so far.
>
> Since we will always write out a new tree object in addition to the blob
> (and if the blob did not exist in the database yet, we can be certain
> that the tree object did not exist yet), the merge will _still_ fail at
> that point, but it does unnecessary work by continuing after the blob
> could not be written.

I don't think this is quite true.  I've had a number of users come to
me with "messed up git repositories" where I eventually discover that
users just randomly add "sudo" to some of their commands because when
things don't work, sometimes "sudo" fixes it.  That means they've
created stuff in their .git directory which may be owned by root,
perhaps even some of the .git/objects/XX directories.  However, they
may have other .git/objects/XX directories owned by their normal user.

If just some .git/objects/XX directories are owned by root and not
others, then it may be when users run git commands as themselves that
some things can be written to the object database but not others.  In
particular, it could be that writing blob objects fail, but writing a
tree object which references those blobs succeeds.

> Let's pay close attention and error out early if the blob could not be
> written. This reduces the error output of t4301.25 ("merge-ort fails
> gracefully in a read-only repository") from:
>
>         error: insufficient permission for adding an object to repository database ./objects
>         error: error: Unable to add numbers to database
>         error: insufficient permission for adding an object to repository database ./objects
>         error: error: Unable to add greeting to database
>         error: insufficient permission for adding an object to repository database ./objects
>         fatal: failure to merge
>
> to:
>
>         error: insufficient permission for adding an object to repository database ./objects
>         error: error: Unable to add numbers to database
>         fatal: failure to merge
>
> Note: This patch adjusts two variable declarations from `unsigned` to
> `int` because their purpose is to hold the return value of
> `handle_content_merge()`, which is of type `int`. The existing users of
> those variables are only interested whether that variable is zero or
> non-zero, therefore this type change does not affect the existing code.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  merge-ort.c | 28 +++++++++++++++++++---------
>  1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index f3bdce1041a..e5f41cce481 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -2807,6 +2807,8 @@ static int process_renames(struct merge_options *opt,
>                                                            pathnames,
>                                                            1 + 2 * opt->priv->call_depth,
>                                                            &merged);
> +                       if (clean_merge < 0)
> +                               return -1;
>                         if (!clean_merge &&
>                             merged.mode == side1->stages[1].mode &&
>                             oideq(&merged.oid, &side1->stages[1].oid))
> @@ -2916,7 +2918,7 @@ static int process_renames(struct merge_options *opt,
>                         struct version_info merged;
>
>                         struct conflict_info *base, *side1, *side2;
> -                       unsigned clean;
> +                       int clean;
>
>                         pathnames[0] = oldpath;
>                         pathnames[other_source_index] = oldpath;
> @@ -2937,6 +2939,8 @@ static int process_renames(struct merge_options *opt,
>                                                      pathnames,
>                                                      1 + 2 * opt->priv->call_depth,
>                                                      &merged);
> +                       if (clean < 0)
> +                               return -1;
>
>                         memcpy(&newinfo->stages[target_index], &merged,
>                                sizeof(merged));
> @@ -3806,10 +3810,10 @@ static int write_completed_directory(struct merge_options *opt,
>  }
>
>  /* Per entry merge function */
> -static void process_entry(struct merge_options *opt,
> -                         const char *path,
> -                         struct conflict_info *ci,
> -                         struct directory_versions *dir_metadata)
> +static int process_entry(struct merge_options *opt,
> +                        const char *path,
> +                        struct conflict_info *ci,
> +                        struct directory_versions *dir_metadata)
>  {
>         int df_file_index = 0;
>
> @@ -3823,7 +3827,7 @@ static void process_entry(struct merge_options *opt,
>                 record_entry_for_tree(dir_metadata, path, &ci->merged);
>                 if (ci->filemask == 0)
>                         /* nothing else to handle */
> -                       return;
> +                       return 0;
>                 assert(ci->df_conflict);
>         }
>
> @@ -3870,7 +3874,7 @@ static void process_entry(struct merge_options *opt,
>                  */
>                 if (ci->filemask == 1) {
>                         ci->filemask = 0;
> -                       return;
> +                       return 0;
>                 }
>
>                 /*
> @@ -4065,7 +4069,7 @@ static void process_entry(struct merge_options *opt,
>         } else if (ci->filemask >= 6) {
>                 /* Need a two-way or three-way content merge */
>                 struct version_info merged_file;
> -               unsigned clean_merge;
> +               int clean_merge;
>                 struct version_info *o = &ci->stages[0];
>                 struct version_info *a = &ci->stages[1];
>                 struct version_info *b = &ci->stages[2];
> @@ -4074,6 +4078,8 @@ static void process_entry(struct merge_options *opt,
>                                                    ci->pathnames,
>                                                    opt->priv->call_depth * 2,
>                                                    &merged_file);
> +               if (clean_merge < 0)
> +                       return -1;
>                 ci->merged.clean = clean_merge &&
>                                    !ci->df_conflict && !ci->path_conflict;
>                 ci->merged.result.mode = merged_file.mode;
> @@ -4169,6 +4175,7 @@ static void process_entry(struct merge_options *opt,
>
>         /* Record metadata for ci->merged in dir_metadata */
>         record_entry_for_tree(dir_metadata, path, &ci->merged);
> +       return 0;
>  }
>
>  static void prefetch_for_content_merges(struct merge_options *opt,
> @@ -4285,7 +4292,10 @@ static int process_entries(struct merge_options *opt,
>                         record_entry_for_tree(&dir_metadata, path, mi);
>                 else {
>                         struct conflict_info *ci = (struct conflict_info *)mi;
> -                       process_entry(opt, path, ci, &dir_metadata);
> +                       if (process_entry(opt, path, ci, &dir_metadata) < 0) {
> +                               ret = -1;
> +                               goto cleanup;
> +                       };
>                 }
>         }
>         trace2_region_leave("merge", "processing", opt->repo);
> --
> gitgitgadget

Patch looks good to me, though.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 0/2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
  2022-09-26 21:55       ` [PATCH v4 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
  2022-09-26 21:55       ` [PATCH v4 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
@ 2022-09-27  8:11       ` Elijah Newren
  2022-09-28  7:29       ` [PATCH v5 " Johannes Schindelin via GitGitGadget
  3 siblings, 0 replies; 29+ messages in thread
From: Elijah Newren @ 2022-09-27  8:11 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: Git Mailing List, Taylor Blau, Johannes Schindelin

On Mon, Sep 26, 2022 at 2:55 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Turns out that the segmentation fault reported by Taylor
> [https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened while
> testing merge-ort in a read-only repository, and that the upstream version
> of git merge-tree is as affected as GitHub's internal version.
>
> Changes since v3:
>
>  * I now consistently use the pattern int ret = 0; ... if (...) ret = -1.
>  * I added a commit to properly propagate write failures through the
>    handle_content_merge() call path, even if it is not really critical (it
>    just fails sooner, but the merge would have failed just the same without
>    this patch).

I'm a little unsure about the commit message of patch 2, as I
commented on, but the code all looks good to me.  I don't think the
comment in the commit message matters too much; whether your commit
message addresses the weird case I mentioned, I feel the code handles
it just fine.  So I'll leave it up to you to decide whether to tweak
the commit message or not.  Anyway, after you decide on that, feel
free to apply my

Reviewed-by: Elijah Newren <newren@gmail.com>

Thanks for investigating, for tracking down all the return codes and
how to propagate them, and getting it all fixed up!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 2/2] merge-ort: return early when failing to write a blob
  2022-09-27  8:05         ` Elijah Newren
@ 2022-09-27 16:45           ` Junio C Hamano
  0 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2022-09-27 16:45 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List,
	Taylor Blau, Johannes Schindelin

Elijah Newren <newren@gmail.com> writes:

>> Since we will always write out a new tree object in addition to the blob
>> (and if the blob did not exist in the database yet, we can be certain
>> that the tree object did not exist yet), the merge will _still_ fail at
>> that point, but it does unnecessary work by continuing after the blob
>> could not be written.
>
> I don't think this is quite true.  I've had a number of users come to
> me with "messed up git repositories" where I eventually discover that
> users just randomly add "sudo" to some of their commands because when
> things don't work, sometimes "sudo" fixes it.  That means they've
> created stuff in their .git directory which may be owned by root,
> perhaps even some of the .git/objects/XX directories.  However, they
> may have other .git/objects/XX directories owned by their normal user.
>
> If just some .git/objects/XX directories are owned by root and not
> others, then it may be when users run git commands as themselves that
> some things can be written to the object database but not others.  In
> particular, it could be that writing blob objects fail, but writing a
> tree object which references those blobs succeeds.

It is a good example scenario that argues even more strongly for
this change.  We may know the correct object contents and name for
the top-level tree and even manage to write it out, after computing
all the blobs and the trees contained within, some of which we may
know the object name but have failed to write out.  Letting such a
breakage unnoticed would be, eh, bad.

> Patch looks good to me, though.

Yup, thanks for a review.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v5 0/2] merge-tree: fix segmentation fault in read-only repositories
  2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-09-27  8:11       ` [PATCH v4 0/2] merge-tree: fix segmentation fault in read-only repositories Elijah Newren
@ 2022-09-28  7:29       ` Johannes Schindelin via GitGitGadget
  2022-09-28  7:29         ` [PATCH v5 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
  2022-09-28  7:29         ` [PATCH v5 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
  3 siblings, 2 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-28  7:29 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin

Turns out that the segmentation fault reported by Taylor
[https://lore.kernel.org/git/YyopQD+LvPucnz3w@nand.local/] happened while
testing merge-ort in a read-only repository, and that the upstream version
of git merge-tree is as affected as GitHub's internal version.

Changes since v4:

 * Corrected the commit message that claimed that the second patch was not
   fixing a real bug but just avoided unnecessary work. As pointed out by
   Elijah, there are scenarios where this second patch fixes a very real
   bug.
 * Added Elijah's "Reviewed-by" footer.

Changes since v3:

 * I now consistently use the pattern int ret = 0; ... if (...) ret = -1.
 * I added a commit to properly propagate write failures through the
   handle_content_merge() call path, even if it is not really critical (it
   just fails sooner, but the merge would have failed just the same without
   this patch).

Changes since v2:

 * Completely changed the approach by no longer touching
   builtin/merge-tree.c but instead fixing the underlying merge-ort layer:
   we no longer ignore the return value of write_tree() and
   write_object_file().

Changes since v1:

 * Using the SANITY prereq now
 * If the merge was aborted due to a write error, exit with a non-zero code
   even if the (potentially partial) merge is clean
 * The test case now also verifies that the git merge-tree command fails in
   a read-only repository even if the merge would have succeeded

Johannes Schindelin (2):
  merge-ort: fix segmentation fault in read-only repositories
  merge-ort: return early when failing to write a blob

 merge-ort.c                      | 94 ++++++++++++++++++++------------
 t/t4301-merge-tree-write-tree.sh |  9 +++
 2 files changed, 69 insertions(+), 34 deletions(-)


base-commit: dda7228a83e2e9ff584bf6adbf55910565b41e14
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1362%2Fdscho%2Fmerge-tree-in-read-only-repos-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1362/dscho/merge-tree-in-read-only-repos-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1362

Range-diff vs v4:

 1:  ab6df092eba ! 1:  63be9f9f717 merge-ort: fix segmentation fault in read-only repositories
     @@ Commit message
          Let's stop ignoring the return value of `write_object_file()` and
          `write_tree()` and set `clean = -1` in the error case.
      
     +    Reviewed-by: Elijah Newren <newren@gmail.com>
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## merge-ort.c ##
 2:  087207ae0b0 ! 2:  bfc71a2d8ad merge-ort: return early when failing to write a blob
     @@ Commit message
          that already exists in the database). And this can fail, too, but we
          ignore that write failure so far.
      
     -    Since we will always write out a new tree object in addition to the blob
     -    (and if the blob did not exist in the database yet, we can be certain
     -    that the tree object did not exist yet), the merge will _still_ fail at
     -    that point, but it does unnecessary work by continuing after the blob
     -    could not be written.
     -
          Let's pay close attention and error out early if the blob could not be
          written. This reduces the error output of t4301.25 ("merge-ort fails
          gracefully in a read-only repository") from:
     @@ Commit message
                  error: error: Unable to add numbers to database
                  fatal: failure to merge
      
     +    This is _not_ just a cosmetic change: Even though one might assume that
     +    the operation would have failed anyway at the point when the new tree
     +    object is written (and the corresponding tree object _will_ be new if it
     +    contains a blob that is new), but that is not so: As pointed out by
     +    Elijah Newren, when Git has previously been allowed to add loose objects
     +    via `sudo` calls, it is very possible that the blob object cannot be
     +    written (because the corresponding `.git/objects/??/` directory may be
     +    owned by `root`) but the tree object can be written (because the
     +    corresponding objects directory is owned by the current user). This
     +    would result in a corrupt repository because it is missing the blob
     +    object, and with this here patch we prevent that.
     +
          Note: This patch adjusts two variable declarations from `unsigned` to
          `int` because their purpose is to hold the return value of
          `handle_content_merge()`, which is of type `int`. The existing users of
          those variables are only interested whether that variable is zero or
          non-zero, therefore this type change does not affect the existing code.
      
     +    Reviewed-by: Elijah Newren <newren@gmail.com>
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## merge-ort.c ##

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v5 1/2] merge-ort: fix segmentation fault in read-only repositories
  2022-09-28  7:29       ` [PATCH v5 " Johannes Schindelin via GitGitGadget
@ 2022-09-28  7:29         ` Johannes Schindelin via GitGitGadget
  2022-09-28  7:29         ` [PATCH v5 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
  1 sibling, 0 replies; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-28  7:29 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If the blob/tree objects cannot be written, we really need the merge
operations to fail, and not to continue (and then try to access the tree
object which is however still set to `NULL`).

Let's stop ignoring the return value of `write_object_file()` and
`write_tree()` and set `clean = -1` in the error case.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 merge-ort.c                      | 66 ++++++++++++++++++++------------
 t/t4301-merge-tree-write-tree.sh |  9 +++++
 2 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 99dcee2db8a..f3bdce1041a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
 				 b->string, strlen(b->string), bmi->result.mode);
 }
 
-static void write_tree(struct object_id *result_oid,
-		       struct string_list *versions,
-		       unsigned int offset,
-		       size_t hash_size)
+static int write_tree(struct object_id *result_oid,
+		      struct string_list *versions,
+		      unsigned int offset,
+		      size_t hash_size)
 {
 	size_t maxlen = 0, extra;
 	unsigned int nr;
 	struct strbuf buf = STRBUF_INIT;
-	int i;
+	int i, ret = 0;
 
 	assert(offset <= versions->nr);
 	nr = versions->nr - offset;
@@ -3605,8 +3605,10 @@ static void write_tree(struct object_id *result_oid,
 	}
 
 	/* Write this object file out, and record in result_oid */
-	write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
+	if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
+		ret = -1;
 	strbuf_release(&buf);
+	return ret;
 }
 
 static void record_entry_for_tree(struct directory_versions *dir_metadata,
@@ -3625,13 +3627,13 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata,
 			   basename)->util = &mi->result;
 }
 
-static void write_completed_directory(struct merge_options *opt,
-				      const char *new_directory_name,
-				      struct directory_versions *info)
+static int write_completed_directory(struct merge_options *opt,
+				     const char *new_directory_name,
+				     struct directory_versions *info)
 {
 	const char *prev_dir;
 	struct merged_info *dir_info = NULL;
-	unsigned int offset;
+	unsigned int offset, ret = 0;
 
 	/*
 	 * Some explanation of info->versions and info->offsets...
@@ -3717,7 +3719,7 @@ static void write_completed_directory(struct merge_options *opt,
 	 * strcmp here.)
 	 */
 	if (new_directory_name == info->last_directory)
-		return;
+		return 0;
 
 	/*
 	 * If we are just starting (last_directory is NULL), or last_directory
@@ -3739,7 +3741,7 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		string_list_append(&info->offsets,
 				   info->last_directory)->util = (void*)offset;
-		return;
+		return 0;
 	}
 
 	/*
@@ -3769,8 +3771,9 @@ static void write_completed_directory(struct merge_options *opt,
 		 */
 		dir_info->is_null = 0;
 		dir_info->result.mode = S_IFDIR;
-		write_tree(&dir_info->result.oid, &info->versions, offset,
-			   opt->repo->hash_algo->rawsz);
+		if (write_tree(&dir_info->result.oid, &info->versions, offset,
+			       opt->repo->hash_algo->rawsz) < 0)
+			ret = -1;
 	}
 
 	/*
@@ -3798,6 +3801,8 @@ static void write_completed_directory(struct merge_options *opt,
 	/* And, of course, we need to update last_directory to match. */
 	info->last_directory = new_directory_name;
 	info->last_directory_len = strlen(info->last_directory);
+
+	return ret;
 }
 
 /* Per entry merge function */
@@ -4214,8 +4219,8 @@ static void prefetch_for_content_merges(struct merge_options *opt,
 	oid_array_clear(&to_fetch);
 }
 
-static void process_entries(struct merge_options *opt,
-			    struct object_id *result_oid)
+static int process_entries(struct merge_options *opt,
+			   struct object_id *result_oid)
 {
 	struct hashmap_iter iter;
 	struct strmap_entry *e;
@@ -4224,11 +4229,12 @@ static void process_entries(struct merge_options *opt,
 	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP,
 						   STRING_LIST_INIT_NODUP,
 						   NULL, 0 };
+	int ret = 0;
 
 	trace2_region_enter("merge", "process_entries setup", opt->repo);
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
-		return;
+		return 0;
 	}
 
 	/* Hack to pre-allocate plist to the desired size */
@@ -4270,8 +4276,11 @@ static void process_entries(struct merge_options *opt,
 		 */
 		struct merged_info *mi = entry->util;
 
-		write_completed_directory(opt, mi->directory_name,
-					  &dir_metadata);
+		if (write_completed_directory(opt, mi->directory_name,
+					      &dir_metadata) < 0) {
+			ret = -1;
+			goto cleanup;
+		}
 		if (mi->clean)
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
@@ -4291,12 +4300,16 @@ static void process_entries(struct merge_options *opt,
 		fflush(stdout);
 		BUG("dir_metadata accounting completely off; shouldn't happen");
 	}
-	write_tree(result_oid, &dir_metadata.versions, 0,
-		   opt->repo->hash_algo->rawsz);
+	if (write_tree(result_oid, &dir_metadata.versions, 0,
+		       opt->repo->hash_algo->rawsz) < 0)
+		ret = -1;
+cleanup:
 	string_list_clear(&plist, 0);
 	string_list_clear(&dir_metadata.versions, 0);
 	string_list_clear(&dir_metadata.offsets, 0);
 	trace2_region_leave("merge", "process_entries cleanup", opt->repo);
+
+	return ret;
 }
 
 /*** Function Grouping: functions related to merge_switch_to_result() ***/
@@ -4928,15 +4941,18 @@ redo:
 	}
 
 	trace2_region_enter("merge", "process_entries", opt->repo);
-	process_entries(opt, &working_tree_oid);
+	if (process_entries(opt, &working_tree_oid) < 0)
+		result->clean = -1;
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
 	result->path_messages = &opt->priv->conflicts;
 
-	result->tree = parse_tree_indirect(&working_tree_oid);
-	/* existence of conflicted entries implies unclean */
-	result->clean &= strmap_empty(&opt->priv->conflicted);
+	if (result->clean >= 0) {
+		result->tree = parse_tree_indirect(&working_tree_oid);
+		/* existence of conflicted entries implies unclean */
+		result->clean &= strmap_empty(&opt->priv->conflicted);
+	}
 	if (!opt->priv->call_depth) {
 		result->priv = opt->priv;
 		result->_properly_initialized = RESULT_INITIALIZED;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 28ca5c38bb5..013b77144bd 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done
-- 
@@ -810,4 +810,13 @@ test_expect_success 'can override merge of unrelated histories' '
 	test_cmp expect actual
 '
 
+test_expect_success SANITY 'merge-ort fails gracefully in a read-only repository' '
+	git init --bare read-only &&
+	git push read-only side1 side2 side3 &&
+	test_when_finished "chmod -R u+w read-only" &&
+	chmod -R a-w read-only &&
+	test_must_fail git -C read-only merge-tree side1 side3 &&
+	test_must_fail git -C read-only merge-tree side1 side2
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 2/2] merge-ort: return early when failing to write a blob
  2022-09-28  7:29       ` [PATCH v5 " Johannes Schindelin via GitGitGadget
  2022-09-28  7:29         ` [PATCH v5 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
@ 2022-09-28  7:29         ` Johannes Schindelin via GitGitGadget
  2022-09-28 15:53           ` Junio C Hamano
  1 sibling, 1 reply; 29+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-09-28  7:29 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Taylor Blau, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In the previous commit, we fixed a segmentation fault when a tree object
could not be written.

However, before the tree object is written, `merge-ort` wants to write
out a blob object (except in cases where the merge results in a blob
that already exists in the database). And this can fail, too, but we
ignore that write failure so far.

Let's pay close attention and error out early if the blob could not be
written. This reduces the error output of t4301.25 ("merge-ort fails
gracefully in a read-only repository") from:

	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add greeting to database
	error: insufficient permission for adding an object to repository database ./objects
	fatal: failure to merge

to:

	error: insufficient permission for adding an object to repository database ./objects
	error: error: Unable to add numbers to database
	fatal: failure to merge

This is _not_ just a cosmetic change: Even though one might assume that
the operation would have failed anyway at the point when the new tree
object is written (and the corresponding tree object _will_ be new if it
contains a blob that is new), but that is not so: As pointed out by
Elijah Newren, when Git has previously been allowed to add loose objects
via `sudo` calls, it is very possible that the blob object cannot be
written (because the corresponding `.git/objects/??/` directory may be
owned by `root`) but the tree object can be written (because the
corresponding objects directory is owned by the current user). This
would result in a corrupt repository because it is missing the blob
object, and with this here patch we prevent that.

Note: This patch adjusts two variable declarations from `unsigned` to
`int` because their purpose is to hold the return value of
`handle_content_merge()`, which is of type `int`. The existing users of
those variables are only interested whether that variable is zero or
non-zero, therefore this type change does not affect the existing code.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 merge-ort.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index f3bdce1041a..e5f41cce481 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2807,6 +2807,8 @@ static int process_renames(struct merge_options *opt,
 							   pathnames,
 							   1 + 2 * opt->priv->call_depth,
 							   &merged);
+			if (clean_merge < 0)
+				return -1;
 			if (!clean_merge &&
 			    merged.mode == side1->stages[1].mode &&
 			    oideq(&merged.oid, &side1->stages[1].oid))
@@ -2916,7 +2918,7 @@ static int process_renames(struct merge_options *opt,
 			struct version_info merged;
 
 			struct conflict_info *base, *side1, *side2;
-			unsigned clean;
+			int clean;
 
 			pathnames[0] = oldpath;
 			pathnames[other_source_index] = oldpath;
@@ -2937,6 +2939,8 @@ static int process_renames(struct merge_options *opt,
 						     pathnames,
 						     1 + 2 * opt->priv->call_depth,
 						     &merged);
+			if (clean < 0)
+				return -1;
 
 			memcpy(&newinfo->stages[target_index], &merged,
 			       sizeof(merged));
@@ -3806,10 +3810,10 @@ static int write_completed_directory(struct merge_options *opt,
 }
 
 /* Per entry merge function */
-static void process_entry(struct merge_options *opt,
-			  const char *path,
-			  struct conflict_info *ci,
-			  struct directory_versions *dir_metadata)
+static int process_entry(struct merge_options *opt,
+			 const char *path,
+			 struct conflict_info *ci,
+			 struct directory_versions *dir_metadata)
 {
 	int df_file_index = 0;
 
@@ -3823,7 +3827,7 @@ static void process_entry(struct merge_options *opt,
 		record_entry_for_tree(dir_metadata, path, &ci->merged);
 		if (ci->filemask == 0)
 			/* nothing else to handle */
-			return;
+			return 0;
 		assert(ci->df_conflict);
 	}
 
@@ -3870,7 +3874,7 @@ static void process_entry(struct merge_options *opt,
 		 */
 		if (ci->filemask == 1) {
 			ci->filemask = 0;
-			return;
+			return 0;
 		}
 
 		/*
@@ -4065,7 +4069,7 @@ static void process_entry(struct merge_options *opt,
 	} else if (ci->filemask >= 6) {
 		/* Need a two-way or three-way content merge */
 		struct version_info merged_file;
-		unsigned clean_merge;
+		int clean_merge;
 		struct version_info *o = &ci->stages[0];
 		struct version_info *a = &ci->stages[1];
 		struct version_info *b = &ci->stages[2];
@@ -4074,6 +4078,8 @@ static void process_entry(struct merge_options *opt,
 						   ci->pathnames,
 						   opt->priv->call_depth * 2,
 						   &merged_file);
+		if (clean_merge < 0)
+			return -1;
 		ci->merged.clean = clean_merge &&
 				   !ci->df_conflict && !ci->path_conflict;
 		ci->merged.result.mode = merged_file.mode;
@@ -4169,6 +4175,7 @@ static void process_entry(struct merge_options *opt,
 
 	/* Record metadata for ci->merged in dir_metadata */
 	record_entry_for_tree(dir_metadata, path, &ci->merged);
+	return 0;
 }
 
 static void prefetch_for_content_merges(struct merge_options *opt,
@@ -4285,7 +4292,10 @@ static int process_entries(struct merge_options *opt,
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
 			struct conflict_info *ci = (struct conflict_info *)mi;
-			process_entry(opt, path, ci, &dir_metadata);
+			if (process_entry(opt, path, ci, &dir_metadata) < 0) {
+				ret = -1;
+				goto cleanup;
+			};
 		}
 	}
 	trace2_region_leave("merge", "processing", opt->repo);
-- 
@@ -2807,6 +2807,8 @@ static int process_renames(struct merge_options *opt,
 							   pathnames,
 							   1 + 2 * opt->priv->call_depth,
 							   &merged);
+			if (clean_merge < 0)
+				return -1;
 			if (!clean_merge &&
 			    merged.mode == side1->stages[1].mode &&
 			    oideq(&merged.oid, &side1->stages[1].oid))
@@ -2916,7 +2918,7 @@ static int process_renames(struct merge_options *opt,
 			struct version_info merged;
 
 			struct conflict_info *base, *side1, *side2;
-			unsigned clean;
+			int clean;
 
 			pathnames[0] = oldpath;
 			pathnames[other_source_index] = oldpath;
@@ -2937,6 +2939,8 @@ static int process_renames(struct merge_options *opt,
 						     pathnames,
 						     1 + 2 * opt->priv->call_depth,
 						     &merged);
+			if (clean < 0)
+				return -1;
 
 			memcpy(&newinfo->stages[target_index], &merged,
 			       sizeof(merged));
@@ -3806,10 +3810,10 @@ static int write_completed_directory(struct merge_options *opt,
 }
 
 /* Per entry merge function */
-static void process_entry(struct merge_options *opt,
-			  const char *path,
-			  struct conflict_info *ci,
-			  struct directory_versions *dir_metadata)
+static int process_entry(struct merge_options *opt,
+			 const char *path,
+			 struct conflict_info *ci,
+			 struct directory_versions *dir_metadata)
 {
 	int df_file_index = 0;
 
@@ -3823,7 +3827,7 @@ static void process_entry(struct merge_options *opt,
 		record_entry_for_tree(dir_metadata, path, &ci->merged);
 		if (ci->filemask == 0)
 			/* nothing else to handle */
-			return;
+			return 0;
 		assert(ci->df_conflict);
 	}
 
@@ -3870,7 +3874,7 @@ static void process_entry(struct merge_options *opt,
 		 */
 		if (ci->filemask == 1) {
 			ci->filemask = 0;
-			return;
+			return 0;
 		}
 
 		/*
@@ -4065,7 +4069,7 @@ static void process_entry(struct merge_options *opt,
 	} else if (ci->filemask >= 6) {
 		/* Need a two-way or three-way content merge */
 		struct version_info merged_file;
-		unsigned clean_merge;
+		int clean_merge;
 		struct version_info *o = &ci->stages[0];
 		struct version_info *a = &ci->stages[1];
 		struct version_info *b = &ci->stages[2];
@@ -4074,6 +4078,8 @@ static void process_entry(struct merge_options *opt,
 						   ci->pathnames,
 						   opt->priv->call_depth * 2,
 						   &merged_file);
+		if (clean_merge < 0)
+			return -1;
 		ci->merged.clean = clean_merge &&
 				   !ci->df_conflict && !ci->path_conflict;
 		ci->merged.result.mode = merged_file.mode;
@@ -4169,6 +4175,7 @@ static void process_entry(struct merge_options *opt,
 
 	/* Record metadata for ci->merged in dir_metadata */
 	record_entry_for_tree(dir_metadata, path, &ci->merged);
+	return 0;
 }
 
 static void prefetch_for_content_merges(struct merge_options *opt,
@@ -4285,7 +4292,10 @@ static int process_entries(struct merge_options *opt,
 			record_entry_for_tree(&dir_metadata, path, mi);
 		else {
 			struct conflict_info *ci = (struct conflict_info *)mi;
-			process_entry(opt, path, ci, &dir_metadata);
+			if (process_entry(opt, path, ci, &dir_metadata) < 0) {
+				ret = -1;
+				goto cleanup;
+			};
 		}
 	}
 	trace2_region_leave("merge", "processing", opt->repo);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 2/2] merge-ort: return early when failing to write a blob
  2022-09-28  7:29         ` [PATCH v5 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
@ 2022-09-28 15:53           ` Junio C Hamano
  2022-09-29  1:36             ` Elijah Newren
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2022-09-28 15:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Elijah Newren, Taylor Blau, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> This is _not_ just a cosmetic change: Even though one might assume that
> the operation would have failed anyway at the point when the new tree
> object is written (and the corresponding tree object _will_ be new if it
> contains a blob that is new), but that is not so: As pointed out by
> Elijah Newren, when Git has previously been allowed to add loose objects
> via `sudo` calls, it is very possible that the blob object cannot be
> written (because the corresponding `.git/objects/??/` directory may be
> owned by `root`) but the tree object can be written (because the
> corresponding objects directory is owned by the current user). This
> would result in a corrupt repository because it is missing the blob
> object, and with this here patch we prevent that.
>
> Note: This patch adjusts two variable declarations from `unsigned` to
> `int` because their purpose is to hold the return value of
> `handle_content_merge()`, which is of type `int`. The existing users of
> those variables are only interested whether that variable is zero or
> non-zero, therefore this type change does not affect the existing code.
>
> Reviewed-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

Thanks.  The first paragraph in the quoted part above is new and not
exactly "reviewed" by Elijah yet, so let's ask ;-)  

To me the description of the issue looks reasonable to me.  Any
comments, Elijah?

The code is the same as the one in the previous round and Elijah
gave his Reviewed-by, and does look good to me, too.

>  merge-ort.c | 28 +++++++++++++++++++---------
>  1 file changed, 19 insertions(+), 9 deletions(-)

Thanks, all.  Queued.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 2/2] merge-ort: return early when failing to write a blob
  2022-09-28 15:53           ` Junio C Hamano
@ 2022-09-29  1:36             ` Elijah Newren
  2022-09-29  1:49               ` Junio C Hamano
  0 siblings, 1 reply; 29+ messages in thread
From: Elijah Newren @ 2022-09-29  1:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List,
	Taylor Blau, Johannes Schindelin

On Wed, Sep 28, 2022 at 8:53 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > This is _not_ just a cosmetic change: Even though one might assume that
> > the operation would have failed anyway at the point when the new tree
> > object is written (and the corresponding tree object _will_ be new if it
> > contains a blob that is new), but that is not so: As pointed out by
> > Elijah Newren, when Git has previously been allowed to add loose objects
> > via `sudo` calls, it is very possible that the blob object cannot be
> > written (because the corresponding `.git/objects/??/` directory may be
> > owned by `root`) but the tree object can be written (because the
> > corresponding objects directory is owned by the current user). This
> > would result in a corrupt repository because it is missing the blob
> > object, and with this here patch we prevent that.
> >
> > Note: This patch adjusts two variable declarations from `unsigned` to
> > `int` because their purpose is to hold the return value of
> > `handle_content_merge()`, which is of type `int`. The existing users of
> > those variables are only interested whether that variable is zero or
> > non-zero, therefore this type change does not affect the existing code.
> >
> > Reviewed-by: Elijah Newren <newren@gmail.com>
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
>
> Thanks.  The first paragraph in the quoted part above is new and not
> exactly "reviewed" by Elijah yet, so let's ask ;-)
>
> To me the description of the issue looks reasonable to me.  Any
> comments, Elijah?

Looks good to me!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 2/2] merge-ort: return early when failing to write a blob
  2022-09-29  1:36             ` Elijah Newren
@ 2022-09-29  1:49               ` Junio C Hamano
  0 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2022-09-29  1:49 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin via GitGitGadget, Git Mailing List,
	Taylor Blau, Johannes Schindelin

Elijah Newren <newren@gmail.com> writes:

>> To me the description of the issue looks reasonable to me.  Any
>> comments, Elijah?
>
> Looks good to me!

Excellent.  Thanks, both.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-09-29  1:51 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
2022-09-21 15:42 ` Taylor Blau
2022-09-21 20:12   ` Johannes Schindelin
2022-09-21 18:12 ` Junio C Hamano
2022-09-21 20:13   ` Johannes Schindelin
2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2022-09-21 22:24   ` Junio C Hamano
2022-09-22 19:52     ` Johannes Schindelin
2022-09-21 22:56   ` Elijah Newren
2022-09-21 23:11     ` Junio C Hamano
2022-09-22 17:24     ` Johannes Schindelin
2022-09-22 19:50     ` Johannes Schindelin
2022-09-23  1:47       ` Elijah Newren
2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
2022-09-23  1:38     ` Elijah Newren
2022-09-23  9:55       ` Johannes Schindelin
2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
2022-09-26 21:55       ` [PATCH v4 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
2022-09-26 21:55       ` [PATCH v4 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
2022-09-26 22:07         ` Junio C Hamano
2022-09-27  8:05         ` Elijah Newren
2022-09-27 16:45           ` Junio C Hamano
2022-09-27  8:11       ` [PATCH v4 0/2] merge-tree: fix segmentation fault in read-only repositories Elijah Newren
2022-09-28  7:29       ` [PATCH v5 " Johannes Schindelin via GitGitGadget
2022-09-28  7:29         ` [PATCH v5 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
2022-09-28  7:29         ` [PATCH v5 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
2022-09-28 15:53           ` Junio C Hamano
2022-09-29  1:36             ` Elijah Newren
2022-09-29  1:49               ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).