git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick
@ 2016-11-07 18:31 David Turner
  2016-11-07 19:13 ` Stefan Beller
  2016-11-18  4:47 ` Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: David Turner @ 2016-11-07 18:31 UTC (permalink / raw)
  To: git, sbeller; +Cc: David Turner

When a submodule is being merged or cherry-picked into a working
tree that already contains a corresponding empty directory, do not
record a conflict.

One situation where this bug appears is:

- Commit 1 adds a submodule
- Commit 2 removes that submodule and re-adds it into a subdirectory
       (sub1 to sub1/sub1).
- Commit 3 adds an unrelated file.

Now the user checks out commit 1 (first deinitializing the submodule),
and attempts to cherry-pick commit 3.  Previously, this would fail,
because the incoming submodule sub1/sub1 would falsely conflict with
the empty sub1 directory.

This patch ignores the empty sub1 directory, fixing the bug.  We only
ignore the empty directory if the object being emplaced is a
submodule, which expects an empty directory.

Signed-off-by: David Turner <dturner@twosigma.com>
---
 merge-recursive.c           | 21 +++++++++++++++------
 t/t3030-merge-recursive.sh  |  4 ++--
 t/t3426-rebase-submodule.sh |  3 ---
 3 files changed, 17 insertions(+), 11 deletions(-)

Note that there are four calls to dir_in_way, and only two of them
have changed their semantics.  This is because the merge code is quite
complicated, and I don't fully understand it.  So I did not have time
to analyze the remaining calls to see whether they, too, should be
changed.  For me, there are no test failures either way, indicating
that probably these cases are rare.

The reason behind the empty_ok parameter (as opposed to just always
allowing empy directories to be blown away) is found in t6022's 'pair
rename to parent of other (D/F conflicts) w/ untracked dir'.  This
test would fail with an unconditional rename, because it wouldn't
generate the conflict file.  It's not clear how important that
behavior is (I do not recall ever noticing the file~branch thing
before), but it seemed better to preserve it in case it was important.

diff --git a/merge-recursive.c b/merge-recursive.c
index 9041c2f..e64b48b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -664,7 +664,13 @@ static char *unique_path(struct merge_options *o, const char *path, const char *
 	return strbuf_detach(&newpath, NULL);
 }
 
-static int dir_in_way(const char *path, int check_working_copy)
+/**
+ * Check whether a directory in the index is in the way of an incoming
+ * file.  Return 1 if so.  If check_working_copy is non-zero, also
+ * check the working directory.  If empty_ok is non-zero, also return
+ * 0 in the case where the working-tree dir exists but is empty.
+ */
+static int dir_in_way(const char *path, int check_working_copy, int empty_ok)
 {
 	int pos;
 	struct strbuf dirpath = STRBUF_INIT;
@@ -684,7 +690,8 @@ static int dir_in_way(const char *path, int check_working_copy)
 	}
 
 	strbuf_release(&dirpath);
-	return check_working_copy && !lstat(path, &st) && S_ISDIR(st.st_mode);
+	return check_working_copy && !lstat(path, &st) && S_ISDIR(st.st_mode) &&
+		!(empty_ok && is_empty_dir(path));
 }
 
 static int was_tracked(const char *path)
@@ -1062,7 +1069,7 @@ static int handle_change_delete(struct merge_options *o,
 {
 	char *renamed = NULL;
 	int ret = 0;
-	if (dir_in_way(path, !o->call_depth)) {
+	if (dir_in_way(path, !o->call_depth, 0)) {
 		renamed = unique_path(o, path, a_oid ? o->branch1 : o->branch2);
 	}
 
@@ -1195,7 +1202,7 @@ static int handle_file(struct merge_options *o,
 		remove_file(o, 0, rename->path, 0);
 		dst_name = unique_path(o, rename->path, cur_branch);
 	} else {
-		if (dir_in_way(rename->path, !o->call_depth)) {
+		if (dir_in_way(rename->path, !o->call_depth, 0)) {
 			dst_name = unique_path(o, rename->path, cur_branch);
 			output(o, 1, _("%s is a directory in %s adding as %s instead"),
 			       rename->path, other_branch, dst_name);
@@ -1704,7 +1711,8 @@ static int merge_content(struct merge_options *o,
 			 o->branch2 == rename_conflict_info->branch1) ?
 			pair1->two->path : pair1->one->path;
 
-		if (dir_in_way(path, !o->call_depth))
+		if (dir_in_way(path, !o->call_depth,
+			       S_ISGITLINK(pair1->two->mode)))
 			df_conflict_remains = 1;
 	}
 	if (merge_file_special_markers(o, &one, &a, &b,
@@ -1862,7 +1870,8 @@ static int process_entry(struct merge_options *o,
 			oid = b_oid;
 			conf = _("directory/file");
 		}
-		if (dir_in_way(path, !o->call_depth)) {
+		if (dir_in_way(path, !o->call_depth,
+			       S_ISGITLINK(a_mode))) {
 			char *new_path = unique_path(o, path, add_branch);
 			clean_merge = 0;
 			output(o, 1, _("CONFLICT (%s): There is a directory with name %s in %s. "
diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh
index 470f334..be074a1 100755
--- a/t/t3030-merge-recursive.sh
+++ b/t/t3030-merge-recursive.sh
@@ -575,13 +575,13 @@ test_expect_success 'merge removes empty directories' '
 	test_must_fail test -d d
 '
 
-test_expect_failure 'merge-recursive simple w/submodule' '
+test_expect_success 'merge-recursive simple w/submodule' '
 
 	git checkout submod &&
 	git merge remove
 '
 
-test_expect_failure 'merge-recursive simple w/submodule result' '
+test_expect_sucess 'merge-recursive simple w/submodule result' '
 
 	git ls-files -s >actual &&
 	(
diff --git a/t/t3426-rebase-submodule.sh b/t/t3426-rebase-submodule.sh
index d5b896d..ebf4f5e 100755
--- a/t/t3426-rebase-submodule.sh
+++ b/t/t3426-rebase-submodule.sh
@@ -38,9 +38,6 @@ git_rebase_interactive () {
 	git rebase -i "$1"
 }
 
-KNOWN_FAILURE_NOFF_MERGE_DOESNT_CREATE_EMPTY_SUBMODULE_DIR=1
-# The real reason "replace directory with submodule" fails is because a
-# directory "sub1" exists, but we reuse the suppression added for merge here
 test_submodule_switch "git_rebase_interactive"
 
 test_done
-- 
2.8.0.rc4.22.g8ae061a


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick
  2016-11-07 18:31 [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick David Turner
@ 2016-11-07 19:13 ` Stefan Beller
  2016-11-07 20:38   ` David Turner
  2016-11-18  4:47 ` Junio C Hamano
  1 sibling, 1 reply; 5+ messages in thread
From: Stefan Beller @ 2016-11-07 19:13 UTC (permalink / raw)
  To: David Turner; +Cc: git@vger.kernel.org

On Mon, Nov 7, 2016 at 10:31 AM, David Turner <dturner@twosigma.com> wrote:
> When a submodule is being merged or cherry-picked into a working
> tree that already contains a corresponding empty directory, do not
> record a conflict.
>
> One situation where this bug appears is:
>
> - Commit 1 adds a submodule

"... at sub1" as inferred by text below.

> - Commit 2 removes that submodule and re-adds it into a subdirectory
>        (sub1 to sub1/sub1).
> - Commit 3 adds an unrelated file.
>
> Now the user checks out commit 1 (first deinitializing the submodule),
> and attempts to cherry-pick commit 3.  Previously, this would fail,
> because the incoming submodule sub1/sub1 would falsely conflict with
> the empty sub1 directory.

So you'd want to achieve:
  $ # on commit 3:
  git checkout <commit 1>
  git cherry-pick <commit 3>

which essentially moves the gitlink back to its original place
(from sub1/sub1 -> sub1).  This sounds reasonable.
But what if the submodule contains a (file/directory)
named sub1? We'd first remove the sub1/sub1 submodule
(and even delete the inner directory?), such that "sub1/"
becomes an empty dir, which is perfect for having a
submodule right there at "sub1/"

>
> This patch ignores the empty sub1 directory, fixing the bug.  We only
> ignore the empty directory if the object being emplaced is a
> submodule, which expects an empty directory.
>
> Signed-off-by: David Turner <dturner@twosigma.com>
> ---
>  merge-recursive.c           | 21 +++++++++++++++------
>  t/t3030-merge-recursive.sh  |  4 ++--
>  t/t3426-rebase-submodule.sh |  3 ---
>  3 files changed, 17 insertions(+), 11 deletions(-)
>
> Note that there are four calls to dir_in_way, and only two of them
> have changed their semantics.  This is because the merge code is quite
> complicated, and I don't fully understand it.

A good approach. I was trying to haggle with unpack-trees.c and
the merging code and put way more on my plate than I could eat
in one sitting. Trying to get the mess sorted now to prepare a patch
series this week.

> So I did not have time
> to analyze the remaining calls to see whether they, too, should be
> changed.

The call in line 1205 (in handle_file, which is only called from
conflict_rename_rename_1to2) may be relevant if we move around
submodules on the same level and modifying it in different branches.
However I think preserving current behavior is ok.

The other one in handle_change_delete also doesn't look obvious
one way or another, so I'd stick with current behavior.

>For me, there are no test failures either way, indicating
> that probably these cases are rare.

The tests have to be crafted for this specific code pattern,

>
> The reason behind the empty_ok parameter (as opposed to just always
> allowing empy directories to be blown away) is found in t6022's 'pair
> rename to parent of other (D/F conflicts) w/ untracked dir'.  This
> test would fail with an unconditional rename, because it wouldn't
> generate the conflict file.

Or the submodule from your commit message contains a "sub1/..." itself.

>  It's not clear how important that
> behavior is (I do not recall ever noticing the file~branch thing
> before), but it seemed better to preserve it in case it was important.
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 9041c2f..e64b48b 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -664,7 +664,13 @@ static char *unique_path(struct merge_options *o, const char *path, const char *
>         return strbuf_detach(&newpath, NULL);
>  }
>
> -static int dir_in_way(const char *path, int check_working_copy)
> +/**
> + * Check whether a directory in the index is in the way of an incoming
> + * file.  Return 1 if so.  If check_working_copy is non-zero, also
> + * check the working directory.  If empty_ok is non-zero, also return
> + * 0 in the case where the working-tree dir exists but is empty.
> + */

Thanks for the documenting comment! This is probably fine as is with just
two boolean parameters. If we'd add more, we might have thought about
adding a flags parameter with bits for each flag.

Looks good to me,
Thanks,
Stefan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick
  2016-11-07 19:13 ` Stefan Beller
@ 2016-11-07 20:38   ` David Turner
  2016-11-07 20:48     ` Stefan Beller
  0 siblings, 1 reply; 5+ messages in thread
From: David Turner @ 2016-11-07 20:38 UTC (permalink / raw)
  To: 'Stefan Beller'; +Cc: git@vger.kernel.org

> -----Original Message-----
> From: Stefan Beller [mailto:sbeller@google.com]
> Sent: Monday, November 07, 2016 2:14 PM
> To: David Turner
> Cc: git@vger.kernel.org
> Subject: Re: [PATCH] submodules: allow empty working-tree dirs in
> merge/cherry-pick
> 
> On Mon, Nov 7, 2016 at 10:31 AM, David Turner <dturner@twosigma.com>
> wrote:
> > When a submodule is being merged or cherry-picked into a working tree
> > that already contains a corresponding empty directory, do not record a
> > conflict.
> >
> > One situation where this bug appears is:
> >
> > - Commit 1 adds a submodule
> 
> "... at sub1" as inferred by text below.
> 
> > - Commit 2 removes that submodule and re-adds it into a subdirectory
> >        (sub1 to sub1/sub1).
> > - Commit 3 adds an unrelated file.
> >
> > Now the user checks out commit 1 (first deinitializing the submodule),
> > and attempts to cherry-pick commit 3.  Previously, this would fail,
> > because the incoming submodule sub1/sub1 would falsely conflict with
> > the empty sub1 directory.
> 
> So you'd want to achieve:
>   $ # on commit 3:
>   git checkout <commit 1>
>   git cherry-pick <commit 3>
> 
> which essentially moves the gitlink back to its original place (from
> sub1/sub1 -> sub1).  This sounds reasonable.
> But what if the submodule contains a (file/directory) named sub1? We'd
> first remove the sub1/sub1 submodule (and even delete the inner
> directory?), such that "sub1/"
> becomes an empty dir, which is perfect for having a submodule right there
> at "sub1/"

I'm confused about the "what if" here.

In our particular situation, the submodule in question was not initialized.  Basically, the submodule move by developer A messed up developer B's rebase, where developers A and B had been working on completely disjoint sets of submodules.  If it had been initialized, that might be a different story.  It would be somewhat less surprising, and thus probably OK.  The "first deinitializing the submodule" bit above, I think, describes the situation.

If the "what if" you are worried about is corruption caused the move of sub1/sub1 into sub1, don't worry about it.  sub1/ would still contain the .git file, and so would not be empty.  Even if this patch were really wacky, the worst it could do is delete already-empty directories.

> > This patch ignores the empty sub1 directory, fixing the bug.  We only
> > ignore the empty directory if the object being emplaced is a
> > submodule, which expects an empty directory.
> >
> > Signed-off-by: David Turner <dturner@twosigma.com>
> > ---
> >  merge-recursive.c           | 21 +++++++++++++++------
> >  t/t3030-merge-recursive.sh  |  4 ++--  t/t3426-rebase-submodule.sh |
> > 3 ---
> >  3 files changed, 17 insertions(+), 11 deletions(-)
> >
> > Note that there are four calls to dir_in_way, and only two of them
> > have changed their semantics.  This is because the merge code is quite
> > complicated, and I don't fully understand it.
> 
> A good approach. I was trying to haggle with unpack-trees.c and the
> merging code and put way more on my plate than I could eat in one sitting.
> Trying to get the mess sorted now to prepare a patch series this week.

If your approach also fixes the same tests that mine fixes, then I am happy to use your series over mine.  Please CC me so I can take a peek.

> > So I did not have time
> > to analyze the remaining calls to see whether they, too, should be
> > changed.
> 
> The call in line 1205 (in handle_file, which is only called from
> conflict_rename_rename_1to2) may be relevant if we move around submodules
> on the same level and modifying it in different branches.
> However I think preserving current behavior is ok.

So, the case there would be moving sub1 to sub2, where sub2 was previously a different submodule?  It appears that this works at least after my patch, if not before.  But I gather from the name rename_1to2 that I actually need to copy the submodule not move it?  This seems like such a rare case that I don't actually need to handle it; basically nobody needs two copies of one submodule in the same repo.  I think that case fails for other reasons anyway.

> The other one in handle_change_delete also doesn't look obvious one way or
> another, so I'd stick with current behavior.

This appears to be implicated in the t6022 test that I mentioned -- if I change empty_ok unconditionally to 1, the test fails.

> >For me, there are no test failures either way, indicating  that
> >probably these cases are rare.
> 
> The tests have to be crafted for this specific code pattern,
> 
> >
> > The reason behind the empty_ok parameter (as opposed to just always
> > allowing empy directories to be blown away) is found in t6022's 'pair
> > rename to parent of other (D/F conflicts) w/ untracked dir'.  This
> > test would fail with an unconditional rename, because it wouldn't
> > generate the conflict file.
> 
> Or the submodule from your commit message contains a "sub1/..." itself.

See above.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick
  2016-11-07 20:38   ` David Turner
@ 2016-11-07 20:48     ` Stefan Beller
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Beller @ 2016-11-07 20:48 UTC (permalink / raw)
  To: David Turner; +Cc: git@vger.kernel.org

On Mon, Nov 7, 2016 at 12:38 PM, David Turner <David.Turner@twosigma.com> wrote:
>> -----Original Message-----
>> From: Stefan Beller [mailto:sbeller@google.com]
>> Sent: Monday, November 07, 2016 2:14 PM
>> To: David Turner
>> Cc: git@vger.kernel.org
>> Subject: Re: [PATCH] submodules: allow empty working-tree dirs in
>> merge/cherry-pick
>>
>> On Mon, Nov 7, 2016 at 10:31 AM, David Turner <dturner@twosigma.com>
>> wrote:
>> > When a submodule is being merged or cherry-picked into a working tree
>> > that already contains a corresponding empty directory, do not record a
>> > conflict.
>> >
>> > One situation where this bug appears is:
>> >
>> > - Commit 1 adds a submodule
>>
>> "... at sub1" as inferred by text below.
>>
>> > - Commit 2 removes that submodule and re-adds it into a subdirectory
>> >        (sub1 to sub1/sub1).
>> > - Commit 3 adds an unrelated file.
>> >
>> > Now the user checks out commit 1 (first deinitializing the submodule),
>> > and attempts to cherry-pick commit 3.  Previously, this would fail,
>> > because the incoming submodule sub1/sub1 would falsely conflict with
>> > the empty sub1 directory.
>>
>> So you'd want to achieve:
>>   $ # on commit 3:
>>   git checkout <commit 1>
>>   git cherry-pick <commit 3>
>>
>> which essentially moves the gitlink back to its original place (from
>> sub1/sub1 -> sub1).  This sounds reasonable.
>> But what if the submodule contains a (file/directory) named sub1? We'd
>> first remove the sub1/sub1 submodule (and even delete the inner
>> directory?), such that "sub1/"
>> becomes an empty dir, which is perfect for having a submodule right there
>> at "sub1/"
>
> I'm confused about the "what if" here.
>
> In our particular situation, the submodule in question was not initialized.

oops. That explains it. I somehow assumed we were talking about
initialized submodules.

>
> If your approach also fixes the same tests that mine fixes, then I am happy to use your series over mine.  Please CC me so I can take a peek.

No, my series seems to be orthogonal to this one. I plan
on cc'ing you nevertheless as it is still nearby.

> basically nobody needs two copies of one submodule in the same repo.

IIRC this is how gitlinks were used in very early days :/
(kernel people were using gitlinks to track different kernel versions
and see if they were interoperable or working at all.
e.g. see d92a39590d1126e195f1bbccf182a2cdb79218e7, which
only makes sense (for the update command) if the referenced repository
contains references of all submodules, which either means a huge reference
pile that contains different projects at the same time, or the same project
at different versions.

>  I think that case fails for other reasons anyway.
>

Yes. I agree that the patch as-is is applicable. I did not try to oppose
your approach, but rather give some thoughts I had.

Stefan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick
  2016-11-07 18:31 [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick David Turner
  2016-11-07 19:13 ` Stefan Beller
@ 2016-11-18  4:47 ` Junio C Hamano
  1 sibling, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2016-11-18  4:47 UTC (permalink / raw)
  To: David Turner; +Cc: git, sbeller

David Turner <dturner@twosigma.com> writes:

> diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh
> index 470f334..be074a1 100755
> --- a/t/t3030-merge-recursive.sh
> +++ b/t/t3030-merge-recursive.sh
> @@ -575,13 +575,13 @@ test_expect_success 'merge removes empty directories' '
>  	test_must_fail test -d d
>  '
>  
> -test_expect_failure 'merge-recursive simple w/submodule' '
> +test_expect_success 'merge-recursive simple w/submodule' '
>  
>  	git checkout submod &&
>  	git merge remove
>  '
>  
> -test_expect_failure 'merge-recursive simple w/submodule result' '
> +test_expect_sucess 'merge-recursive simple w/submodule result' '

Here is a typo.  I wonder if we want to do "set -e" at the end of
test-lib.sh to catch a breakage like this.  I only caught it by
being lucky (I was staring "make test" output as it flew by).

I've already amended the copy I have, but in case you are going to
reroll in the future, please do not forget to update your copy.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-11-18  4:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-07 18:31 [PATCH] submodules: allow empty working-tree dirs in merge/cherry-pick David Turner
2016-11-07 19:13 ` Stefan Beller
2016-11-07 20:38   ` David Turner
2016-11-07 20:48     ` Stefan Beller
2016-11-18  4:47 ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).