git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Am able to delete a file with no trace in the log
@ 2009-06-02 19:33 Graham Perks
  2009-06-02 20:29 ` Tony Finch
  2009-06-02 21:34 ` Jeff King
  0 siblings, 2 replies; 26+ messages in thread
From: Graham Perks @ 2009-06-02 19:33 UTC (permalink / raw)
  To: Git List

We just had this come up. A file was accidentally removed during a  
merge operation, and we could find no clue in the gitk and git log  
output as to which change the deletion occurred in. We found the  
original file addition in a change; but no other mention of this file.

This seems like a bug. The log should show the file being deleted.

This is pretty easy to reproduce. Just run the script below. Here,  
pretend "c" is the central repo to which two users are pushing/ 
pulling. The users are using repositories r1 and r2.

# create and init repos
md r1
md r2
md c
cd c
git init --bare
cd ..
cd r1 && git init && cd ..
cd r2 && git init && cd ..

# create a file to conflict later
cd r1
echo hello > a.txt
git add .
git commit -m "add file"
git push ../c master

# sync other repo
cd ../r2
git pull ../c

# make conflicting change
echo helloworld > a.txt
git commit -a -m "change file in r2"

# and add a file
echo bye > b.txt
git add .
git commit -m "add file b"
git push ../c

# Meanwhile, r1 has been working
cd ../r1
echo hellogit > a.txt
git commit -a -m "change file in r1"
git pull ../c

# edit a.txt to resolve merge
vi a.txt

# User in r1 erroneously thinks "ooh, I didn't edit b.txt, I don't  
want that in my commit!"
# This is especially easy to do in git gui. That's what our user did.
# Or, the user could legitimately remove the file as part of the  
merge. Probably
# they should git rm, but our user didn't.
git reset HEAD b.txt

# But he thinks, "I did merge a.txt"
git add a.txt
git commit -m "Merged"
git push ../c


# Now user in r2 wants to pull r1's changes
cd ../r2
git pull ../c

# File b.txt deleted! Woah! How did that happen?
# gitk and git log show nothing about the deletion.
# There seems to be no evidence about who, how, why, or when the file  
got deleted.
# So it's hard to track down which user mis-used the system and  
educate them.

Cheers,
Graham Perks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-02 19:33 Am able to delete a file with no trace in the log Graham Perks
@ 2009-06-02 20:29 ` Tony Finch
  2009-06-02 21:34 ` Jeff King
  1 sibling, 0 replies; 26+ messages in thread
From: Tony Finch @ 2009-06-02 20:29 UTC (permalink / raw)
  To: Graham Perks; +Cc: Git List

On Tue, 2 Jun 2009, Graham Perks wrote:

> We just had this come up. A file was accidentally removed during a merge
> operation, and we could find no clue in the gitk and git log output as to
> which change the deletion occurred in. We found the original file addition in
> a change; but no other mention of this file.
>
> This seems like a bug. The log should show the file being deleted.

The manual for git log suggests the -c or --cc options ought to display
the information you want, but it doesn't seem to work in practice.

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/
GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS.
MODERATE OR GOOD.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-02 19:33 Am able to delete a file with no trace in the log Graham Perks
  2009-06-02 20:29 ` Tony Finch
@ 2009-06-02 21:34 ` Jeff King
  2009-06-02 21:55   ` Linus Torvalds
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff King @ 2009-06-02 21:34 UTC (permalink / raw)
  To: Graham Perks; +Cc: Git List

On Tue, Jun 02, 2009 at 02:33:14PM -0500, Graham Perks wrote:

> # File b.txt deleted! Woah! How did that happen?
> # gitk and git log show nothing about the deletion.
> # There seems to be no evidence about who, how, why, or when the file got 
> deleted.
> # So it's hard to track down which user mis-used the system and educate 
> them.

I think this is a funny interaction with the way the diffs for merge
commits are shown. The diff you are looking for is definitely available.
In your example:

  # compare the second parent of the merge to the merge result
  $ git diff-tree HEAD^2 HEAD
  :100644 100644 31e0fce560e96c8b357f5b8630c7d8fbeb0a3ec8 2dc6d98afe942589307d7c0166971b3a2ec8706d M      a.txt
  :100644 000000 b023018cabc396e7692c70bbf5784a93d3f738ab 0000000000000000000000000000000000000000 D      b.txt

But it doesn't show up in "git log". I believe this is because the rule
for what to show in a merge commit is "if content is exactly the same as
one of the parents, it's not interesting". That is, deleting "b.txt"
from the second parent ends up being exactly as it is in the first
parent -- nonexistent. So git has no idea that you deleted "b.txt"
accidentally, and it was not simply part of the conflict resolution.

So I think this is working as intended, and is not a bug exactly. But
certainly the behavior leaves something to be desired for actually
tracking down the source of the change later on. I don't think there is
a way to get "git show $merge" to show the deletion, and nor does it
show up under "git log -- b.txt". Even worse, the latter produces no
output at all for your example (you need "--full-history" to tell it to
follow both parents of a merge).

I wonder if we need some kind of "--verbose-merges" option to tell the
diff engine that we really are interested in all of the changes that
happened in a merge. But maybe we even have something and I don't know
about it.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-02 21:34 ` Jeff King
@ 2009-06-02 21:55   ` Linus Torvalds
  2009-06-03  0:47     ` Sitaram Chamarty
                       ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-02 21:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Graham Perks, Git List



On Tue, 2 Jun 2009, Jeff King wrote:
> 
> But it doesn't show up in "git log". I believe this is because the rule
> for what to show in a merge commit is "if content is exactly the same as
> one of the parents, it's not interesting".

Correct.

What happens is that "git log" with a filename will always simplify the 
history to the side that matches. And yes, "matching" can and does include 
"doesn't exist in child, doesn't exist in parent"

Now, I admit that in this case the matching heuristic is dubious, and 
maybe we should consider "does not exist in result" to not match any 
parent. We already think that "all new" is special ("REV_TREE_NEW" vs 
"REV_TREE_DIFFERENT"), so maybe we should think that "all deleted" is also 
special ("REV_TREE_DEL")

		Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-02 21:55   ` Linus Torvalds
@ 2009-06-03  0:47     ` Sitaram Chamarty
  2009-06-03  1:20       ` Linus Torvalds
  2009-06-03  1:34     ` Clean up and simplify rev_compare_tree() Linus Torvalds
  2009-06-03  1:57     ` Am able to delete a file with no trace in the log Junio C Hamano
  2 siblings, 1 reply; 26+ messages in thread
From: Sitaram Chamarty @ 2009-06-03  0:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Graham Perks, Git List

On Wed, Jun 3, 2009 at 3:25 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Tue, 2 Jun 2009, Jeff King wrote:
>>
>> But it doesn't show up in "git log". I believe this is because the rule
>> for what to show in a merge commit is "if content is exactly the same as
>> one of the parents, it's not interesting".
>
> Correct.
>
> What happens is that "git log" with a filename will always simplify the
> history to the side that matches. And yes, "matching" can and does include
> "doesn't exist in child, doesn't exist in parent"
>
> Now, I admit that in this case the matching heuristic is dubious, and
> maybe we should consider "does not exist in result" to not match any
> parent. We already think that "all new" is special ("REV_TREE_NEW" vs
> "REV_TREE_DIFFERENT"), so maybe we should think that "all deleted" is also
> special ("REV_TREE_DEL")

"git merge -s ours" would do precisely the same thing, wouldn't it?
That has happened to me before, and I noticed that git log does not
show the deletion, but rationalised it as being because I had
explicitly done a "-s ours".

Fixing this would fix that (maybe more common) case too, and show that
the merge commit removed the file.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03  0:47     ` Sitaram Chamarty
@ 2009-06-03  1:20       ` Linus Torvalds
  0 siblings, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03  1:20 UTC (permalink / raw)
  To: Sitaram Chamarty; +Cc: Jeff King, Graham Perks, Git List



On Wed, 3 Jun 2009, Sitaram Chamarty wrote:
> 
> "git merge -s ours" would do precisely the same thing, wouldn't it?

Yes. It doesn't matter which way it goes - if a file is seen as being 
identical (and "none" very much counts) in one parent, it's not judged 
"interesting" from a merge patch standpoint, or even from a "git log" 
(aka "revision history") standpoint.

> That has happened to me before, and I noticed that git log does not
> show the deletion, but rationalised it as being because I had
> explicitly done a "-s ours".
> 
> Fixing this would fix that (maybe more common) case too, and show that
> the merge commit removed the file.

The problem is that quite often, a merge that removes a file _is_ the 
correct thing when it was removed in one branch, and a merge that adds a 
file is even more common, and in no way special. We don't show the whole 
diff in a merge, because the whole diff is often nonsensical (ie so 
trivial that showing it all just hides the parts that are actually 
relevant).

So I'll have to think about it a bit more. We clearly don't generate good 
diffs for file deletion/creation in merges, and we should improve on it, 
but it's definitely not a trivial issue either.

		Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Clean up and simplify rev_compare_tree()
  2009-06-02 21:55   ` Linus Torvalds
  2009-06-03  0:47     ` Sitaram Chamarty
@ 2009-06-03  1:34     ` Linus Torvalds
  2009-06-03  1:57     ` Am able to delete a file with no trace in the log Junio C Hamano
  2 siblings, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03  1:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Graham Perks, Jeff King, Git List


This simplifies the logic of rev_compare_tree() by removing a special 
case. 

It does so by turning the special case of finding a diff to be "all new 
files" into a more generic case of "all new" vs "all removed" vs "mixed 
changes", so now the code is actually more powerful and more generic, and 
the added symmetry actually makes it simpler too.

This makes no changes to any existing behavior, but apart from the 
simplification it does make it possible to some day care about whether all 
changes were just deletions if we want to. Which we may well want to for 
merge handling.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

This is just a cleanup while I'm looking at the code. There's two things 
that are relevant to merges - the "TREESAME" logic that determines whether 
we should simplify the merge away and pick just one parent, and the actual 
diff creation logic that then creates diffs of the merges that we don't 
simplify away.

The two are totally independent, and this patch just cleans up the helper 
function that we use for the commit simplification logic. 

The only half-way subtle part here (and it really isn't that subtle) is 
that "REV_TREE_NEW | REV_TREE_OLD == REV_TREE_DIFFERENT", which makes 
sense and just simplifies the logic in general. It used to be that mixing 
REV_TREE_NEW state with REV_TREE_DIFFERENT was complicated. Now it's 
trivial, thanks to REV_TREE_OLD that is currently otherwise unused (we 
treat it the same way as REV_TREE_DIFFERENT, which is what that case used 
to result in).

There's an unrelated cleanup which is to just move the "can't happen" 
special case code of a missing commit tree up to the same point as the 
"can't happen" missing parent tree.

On Tue, 2 Jun 2009, Linus Torvalds wrote:
> 
> Now, I admit that in this case the matching heuristic is dubious, and 
> maybe we should consider "does not exist in result" to not match any 
> parent. We already think that "all new" is special ("REV_TREE_NEW" vs 
> "REV_TREE_DIFFERENT"), so maybe we should think that "all deleted" is also 
> special ("REV_TREE_DEL")
> 
> 		Linus
> 

 revision.c |   33 ++++++++++++---------------------
 revision.h |    5 +++--
 2 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/revision.c b/revision.c
index 18b7ebb..bf58448 100644
--- a/revision.c
+++ b/revision.c
@@ -256,10 +256,12 @@ static int everybody_uninteresting(struct commit_list *orig)
 
 /*
  * The goal is to get REV_TREE_NEW as the result only if the
- * diff consists of all '+' (and no other changes), and
- * REV_TREE_DIFFERENT otherwise (of course if the trees are
- * the same we want REV_TREE_SAME).  That means that once we
- * get to REV_TREE_DIFFERENT, we do not have to look any further.
+ * diff consists of all '+' (and no other changes), REV_TREE_OLD
+ * if the whole diff is removal of old data, and otherwise
+ * REV_TREE_DIFFERENT (of course if the trees are the same we
+ * want REV_TREE_SAME).
+ * That means that once we get to REV_TREE_DIFFERENT, we do not
+ * have to look any further.
  */
 static int tree_difference = REV_TREE_SAME;
 
@@ -268,22 +270,9 @@ static void file_add_remove(struct diff_options *options,
 		    const unsigned char *sha1,
 		    const char *fullpath)
 {
-	int diff = REV_TREE_DIFFERENT;
+	int diff = addremove == '+' ? REV_TREE_NEW : REV_TREE_OLD;
 
-	/*
-	 * Is it an add of a new file? It means that the old tree
-	 * didn't have it at all, so we will turn "REV_TREE_SAME" ->
-	 * "REV_TREE_NEW", but leave any "REV_TREE_DIFFERENT" alone
-	 * (and if it already was "REV_TREE_NEW", we'll keep it
-	 * "REV_TREE_NEW" of course).
-	 */
-	if (addremove == '+') {
-		diff = tree_difference;
-		if (diff != REV_TREE_SAME)
-			return;
-		diff = REV_TREE_NEW;
-	}
-	tree_difference = diff;
+	tree_difference |= diff;
 	if (tree_difference == REV_TREE_DIFFERENT)
 		DIFF_OPT_SET(options, HAS_CHANGES);
 }
@@ -305,6 +294,8 @@ static int rev_compare_tree(struct rev_info *revs, struct commit *parent, struct
 
 	if (!t1)
 		return REV_TREE_NEW;
+	if (!t2)
+		return REV_TREE_OLD;
 
 	if (revs->simplify_by_decoration) {
 		/*
@@ -323,8 +314,7 @@ static int rev_compare_tree(struct rev_info *revs, struct commit *parent, struct
 		if (!revs->prune_data)
 			return REV_TREE_SAME;
 	}
-	if (!t2)
-		return REV_TREE_DIFFERENT;
+
 	tree_difference = REV_TREE_SAME;
 	DIFF_OPT_CLR(&revs->pruning, HAS_CHANGES);
 	if (diff_tree_sha1(t1->object.sha1, t2->object.sha1, "",
@@ -429,6 +419,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 				p->parents = NULL;
 			}
 		/* fallthrough */
+		case REV_TREE_OLD:
 		case REV_TREE_DIFFERENT:
 			tree_changed = 1;
 			pp = &parent->next;
diff --git a/revision.h b/revision.h
index be39e7d..227164c 100644
--- a/revision.h
+++ b/revision.h
@@ -118,8 +118,9 @@ struct rev_info {
 };
 
 #define REV_TREE_SAME		0
-#define REV_TREE_NEW		1
-#define REV_TREE_DIFFERENT	2
+#define REV_TREE_NEW		1	/* Only new files */
+#define REV_TREE_OLD		2	/* Only files removed */
+#define REV_TREE_DIFFERENT	3	/* Mixed changes */
 
 /* revision.c */
 void read_revisions_from_stdin(struct rev_info *revs);

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-02 21:55   ` Linus Torvalds
  2009-06-03  0:47     ` Sitaram Chamarty
  2009-06-03  1:34     ` Clean up and simplify rev_compare_tree() Linus Torvalds
@ 2009-06-03  1:57     ` Junio C Hamano
  2009-06-03  3:03       ` Linus Torvalds
  2 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2009-06-03  1:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Graham Perks, Git List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> What happens is that "git log" with a filename will always simplify the 
> history to the side that matches. And yes, "matching" can and does include 
> "doesn't exist in child, doesn't exist in parent"
>
> Now, I admit that in this case the matching heuristic is dubious, and 
> maybe we should consider "does not exist in result" to not match any 
> parent. We already think that "all new" is special ("REV_TREE_NEW" vs 
> "REV_TREE_DIFFERENT"), so maybe we should think that "all deleted" is also 
> special ("REV_TREE_DEL")

Sorry, but I do not quite understand this comment.  REV_TREE_NEW can be
treated differently from REV_TREE_DIFFERENT but that only happens if you
know about --remove-empty option, and no scripted (and later converted to
C) Porcelain uses that option by default.

Also, and more on the topic of simplification, if one parent matches the
child (i.e. REV_TREE_SAME), the merge history is simplified by discarding
other parents, regardless of the nature of their differences, be they
REV_TREE_NEW or REV_TREE_DIFFERENT, so in that sense we do not even
special case REV_TREE_NEW at all for the purpose of merge simplification.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03  1:57     ` Am able to delete a file with no trace in the log Junio C Hamano
@ 2009-06-03  3:03       ` Linus Torvalds
  2009-06-03 21:18         ` Junio C Hamano
  2009-06-03 21:55         ` Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03  3:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Graham Perks, Git List



On Tue, 2 Jun 2009, Junio C Hamano wrote:
> 
> Sorry, but I do not quite understand this comment.  REV_TREE_NEW can be
> treated differently from REV_TREE_DIFFERENT but that only happens if you
> know about --remove-empty option, and no scripted (and later converted to
> C) Porcelain uses that option by default.

It's not REV_TREE_NEW, but the other way around, ie when the commit has no 
contents but the parent _does_ have contents, maybe we shouldn't then look 
at another parent and say "no content", and then match that other parent 
(resulting in no difference).

IOW, we are in the situation where one parent gets REV_TREE_SAME, but gets 
it for a totally pointless reason, namely that neither that parent nor the 
eventual merge has anything at all in that path. In that case, we simplify 
towards the parent that results in the smallest diff - which in this case 
is "nothing there at all".

Now, that is often the _right_ thing to do, since if it was meant to be 
deleted in that branch, then we'll eventually hit the delete commit, and 
see it as a nice removal. But it does make it really hard to see this case 
of "unintentional delete"

		Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03  3:03       ` Linus Torvalds
@ 2009-06-03 21:18         ` Junio C Hamano
  2009-06-03 21:26           ` Linus Torvalds
  2009-06-03 21:33           ` Linus Torvalds
  2009-06-03 21:55         ` Junio C Hamano
  1 sibling, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2009-06-03 21:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Graham Perks, Git List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, 2 Jun 2009, Junio C Hamano wrote:
>> 
>> Sorry, but I do not quite understand this comment.  REV_TREE_NEW can be
>> treated differently from REV_TREE_DIFFERENT but that only happens if you
>> know about --remove-empty option, and no scripted (and later converted to
>> C) Porcelain uses that option by default.
>
> It's not REV_TREE_NEW, but the other way around, ie when the commit has no 
> contents but the parent _does_ have contents, maybe we shouldn't then look 
> at another parent and say "no content", and then match that other parent 
> (resulting in no difference).
>
> IOW, we are in the situation where one parent gets REV_TREE_SAME, but gets 
> it for a totally pointless reason, namely that neither that parent nor the 
> eventual merge has anything at all in that path. In that case, we simplify 
> towards the parent that results in the smallest diff - which in this case 
> is "nothing there at all".
>
> Now, that is often the _right_ thing to do, since if it was meant to be 
> deleted in that branch, then we'll eventually hit the delete commit, and 
> see it as a nice removal. But it does make it really hard to see this case 
> of "unintentional delete"

I realize that "--simplify-merges" would show that kind of deletion.

E.g. "git log --graph --oneline -- git-clone.sh" shows that the scripted
version ceased to exist at 8434c2f (Build in clone, 2008-04-27), but with
the option, b84c343 (Merge branch 'db/clone-in-c', 2008-05-25) merged the
deletion to the mainline, but while doing so we lost two updates to the
scripted version since the "Build in clone" topic forked.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 21:18         ` Junio C Hamano
@ 2009-06-03 21:26           ` Linus Torvalds
  2009-06-03 21:33           ` Linus Torvalds
  1 sibling, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03 21:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Graham Perks, Git List



On Wed, 3 Jun 2009, Junio C Hamano wrote:
> 
> I realize that "--simplify-merges" would show that kind of deletion.

Yes. Or just "--full-history".

Of course, it would then not show any _diff_, so even then you're still 
kind of in the dark, if what you were looking for was "when did this file 
go away". Doing a "git log -c --stat" would show the creation, but not the 
deletion of the file.

		Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 21:18         ` Junio C Hamano
  2009-06-03 21:26           ` Linus Torvalds
@ 2009-06-03 21:33           ` Linus Torvalds
  2009-06-03 21:59             ` Avery Pennarun
  2009-06-03 22:02             ` Junio C Hamano
  1 sibling, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03 21:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Graham Perks, Git List



On Wed, 3 Jun 2009, Junio C Hamano wrote:
> 
> E.g. "git log --graph --oneline -- git-clone.sh" shows that the scripted
> version ceased to exist at 8434c2f

Btw, this example misses the whole point of the original problem.

The original problem was:

 - create new file 'x' in branch 'a'

 - merge branch 'a' into branch 'b', but because of a merge conflict and 
   confurion in the merge, the merge result doesn't contain 'x' any more.

 - try to find out what happened to 'x' after-the-fact.

Try it. Git really doesn't make it very easy, because git will notice that 
'x' didn't exist before the branch either (in branch 'b'), so there will 
be _no_ sign of 'x' actually going away.

			Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03  3:03       ` Linus Torvalds
  2009-06-03 21:18         ` Junio C Hamano
@ 2009-06-03 21:55         ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2009-06-03 21:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: eff King, Graham Perks, Git List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, 2 Jun 2009, Junio C Hamano wrote:
>> 
>> Sorry, but I do not quite understand this comment.  REV_TREE_NEW can be
>> treated differently from REV_TREE_DIFFERENT but that only happens if you
>> know about --remove-empty option, and no scripted (and later converted to
>> C) Porcelain uses that option by default.
>
> It's not REV_TREE_NEW, but the other way around, ie when the commit has no 
> contents but the parent _does_ have contents, maybe we shouldn't then look 
> at another parent and say "no content", and then match that other parent 
> (resulting in no difference).
>
> IOW, we are in the situation where one parent gets REV_TREE_SAME, but gets 
> it for a totally pointless reason, namely that neither that parent nor the 
> eventual merge has anything at all in that path. In that case, we simplify 
> towards the parent that results in the smallest diff - which in this case 
> is "nothing there at all".

Here is a crude attempt to do this.  It introduces an diff optflag that
records if we checked any changes at the path level (if left unset, that
means everything was pruned away by pathspec) and teaches the internal
diff-tree logic to set it, so that the caller can tell between "no
changes" and "there was no interesting path that matches the pathspec, so
comparison between parent and child yielded nothing".

	Side note. I didn't bother touching the diff-files and diff-index
	codepaths in this patch; if the distinction turns out to be useful
	we should teach the flag to them as well (but I'll say it is
	probably not very useful shortly).

After looking at its output, I have to say that I do not like it very
much.  Compared to "--simplify-merges", it shows too many uninteresting
merges.  E.g.

    $ git log --oneline -- git-clone.sh

starts like this.

    17d778e Merge branch 'dr/ceiling'
    159e639 Merge branch 'lt/racy-empty'
    bc9c3e0 Merge branch 'jc/maint-combine-diff-pre-context'
    ...

All of them are resolving a merge with a revision that has a scripted "git
clone" (i.e. "maint" track before built-in git-clone) into a newer
revision after "git clone" has become built-in.

After a score or so of such uninteresting merges, we finally see
what matters.

    b84c343 Merge branch 'db/clone-in-c'
    6d9878c clone: bsd shell portability fix
    c904bf3 Be more careful with objects directory permissions on clone
    8434c2f Build in clone
    e42251a Use "=" instead of "==" in condition as it is more portable
    ...

This part is actually somewhat interesting.

    $ git log --oneline --graph b84c343 -- git-clone.sh

looks like this:

    *   b84c343 Merge branch 'db/clone-in-c'
    |\
    | * 8434c2f Build in clone
    * | 6d9878c clone: bsd shell portability fix
    * | c904bf3 Be more careful with objects dir...
    |/
    * e42251a Use "=" instead of "==" in conditi...
    * a2b26ac clone: detect and fail on excess p...
    * c20711d Silence cpio's "N blocks" output w...
    ...

But then that was what "--simplify-merges" gave us without the patch
anyway.



 diff.h      |    1 +
 revision.c  |   32 ++++++++++++++++++++++++++++++--
 revision.h  |    1 +
 tree-diff.c |    1 +
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/diff.h b/diff.h
index 6616877..fcf1d52 100644
--- a/diff.h
+++ b/diff.h
@@ -66,6 +66,7 @@ typedef void (*diff_format_fn_t)(struct diff_queue_struct *q,
 #define DIFF_OPT_DIRSTAT_CUMULATIVE  (1 << 19)
 #define DIFF_OPT_DIRSTAT_BY_FILE     (1 << 20)
 #define DIFF_OPT_ALLOW_TEXTCONV      (1 << 21)
+#define DIFF_OPT_CHECKED_CHANGES     (1 << 22)
 #define DIFF_OPT_TST(opts, flag)    ((opts)->flags & DIFF_OPT_##flag)
 #define DIFF_OPT_SET(opts, flag)    ((opts)->flags |= DIFF_OPT_##flag)
 #define DIFF_OPT_CLR(opts, flag)    ((opts)->flags &= ~DIFF_OPT_##flag)
diff --git a/revision.c b/revision.c
index bf58448..7440f59 100644
--- a/revision.c
+++ b/revision.c
@@ -273,7 +273,7 @@ static void file_add_remove(struct diff_options *options,
 	int diff = addremove == '+' ? REV_TREE_NEW : REV_TREE_OLD;
 
 	tree_difference |= diff;
-	if (tree_difference == REV_TREE_DIFFERENT)
+	if ((tree_difference & REV_TREE_DIFFERENT) == REV_TREE_DIFFERENT)
 		DIFF_OPT_SET(options, HAS_CHANGES);
 }
 
@@ -317,9 +317,13 @@ static int rev_compare_tree(struct rev_info *revs, struct commit *parent, struct
 
 	tree_difference = REV_TREE_SAME;
 	DIFF_OPT_CLR(&revs->pruning, HAS_CHANGES);
+	DIFF_OPT_CLR(&revs->pruning, CHECKED_CHANGES);
 	if (diff_tree_sha1(t1->object.sha1, t2->object.sha1, "",
 			   &revs->pruning) < 0)
 		return REV_TREE_DIFFERENT;
+	if (tree_difference == REV_TREE_SAME &&
+	    !DIFF_OPT_TST(&revs->pruning, CHECKED_CHANGES))
+		tree_difference = REV_TREE_EMPTY;
 	return tree_difference;
 }
 
@@ -351,7 +355,7 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit)
 static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 {
 	struct commit_list **pp, *parent;
-	int tree_changed = 0, tree_same = 0;
+	int tree_changed = 0, tree_same = 0, all_empty = 1;
 
 	/*
 	 * If we don't do pruning, everything is interesting
@@ -384,7 +388,17 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 			    sha1_to_hex(commit->object.sha1),
 			    sha1_to_hex(p->object.sha1));
 		switch (rev_compare_tree(revs, p, commit)) {
+		case REV_TREE_EMPTY:
+			/*
+			 * This parent is the same as the child but that is
+			 * only because no path we are interested in appears
+			 * in either of them.  Do not cull other parents (yet).
+			 */
+			pp = &parent->next;
+			continue;
+
 		case REV_TREE_SAME:
+			all_empty = 0;
 			tree_same = 1;
 			if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) {
 				/* Even if a merge with an uninteresting
@@ -421,12 +435,26 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 		/* fallthrough */
 		case REV_TREE_OLD:
 		case REV_TREE_DIFFERENT:
+			all_empty = 0;
 			tree_changed = 1;
 			pp = &parent->next;
 			continue;
+
 		}
 		die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
 	}
+
+	if (all_empty && revs->simplify_history) {
+		/*
+		 * No path we are interested in appears in any of the
+		 * parents nor in this commit.  Just simplify the merge away.
+		 */
+		parent = commit->parents;
+		parent->next = NULL;
+		commit->object.flags |= TREESAME;
+		return;
+	}
+
 	if (tree_changed && !tree_same)
 		return;
 	commit->object.flags |= TREESAME;
diff --git a/revision.h b/revision.h
index 227164c..7064d35 100644
--- a/revision.h
+++ b/revision.h
@@ -121,6 +121,7 @@ struct rev_info {
 #define REV_TREE_NEW		1	/* Only new files */
 #define REV_TREE_OLD		2	/* Only files removed */
 #define REV_TREE_DIFFERENT	3	/* Mixed changes */
+#define REV_TREE_EMPTY		4	/* Same but only because both are empty */
 
 /* revision.c */
 void read_revisions_from_stdin(struct rev_info *revs);
diff --git a/tree-diff.c b/tree-diff.c
index edd8394..692d1cb 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -298,6 +298,7 @@ int diff_tree(struct tree_desc *t1, struct tree_desc *t2, const char *base, stru
 			update_tree_entry(t1);
 			continue;
 		}
+		DIFF_OPT_SET(opt, CHECKED_CHANGES);
 		switch (compare_tree_entry(t1, t2, base, baselen, opt)) {
 		case -1:
 			update_tree_entry(t1);

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 21:33           ` Linus Torvalds
@ 2009-06-03 21:59             ` Avery Pennarun
  2009-06-03 22:02             ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Avery Pennarun @ 2009-06-03 21:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Jeff King, Graham Perks, Git List

On Wed, Jun 3, 2009 at 5:33 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, 3 Jun 2009, Junio C Hamano wrote:
>> E.g. "git log --graph --oneline -- git-clone.sh" shows that the scripted
>> version ceased to exist at 8434c2f
>
> Btw, this example misses the whole point of the original problem.
>
> The original problem was:
>
>  - create new file 'x' in branch 'a'
>
>  - merge branch 'a' into branch 'b', but because of a merge conflict and
>   confurion in the merge, the merge result doesn't contain 'x' any more.
>
>  - try to find out what happened to 'x' after-the-fact.
>
> Try it. Git really doesn't make it very easy, because git will notice that
> 'x' didn't exist before the branch either (in branch 'b'), so there will
> be _no_ sign of 'x' actually going away.

Note that full-file deletion is only one particular case of a general
problem.  Consider this script:

===

#!/bin/bash
rm -rf testb
mkdir testb
cd testb
git init
echo 'initial a' >a
echo 'initial b' >b
git add a b
git commit -m 'initial commit'

git branch sub

echo 'first a change' >>a
git add a
git commit -m 'modify a'

git checkout sub
echo 'second a change' >>a
git add a
git commit -m 'modify a differently'

git checkout master
git merge -s ours sub

===

"git log -p" will show the addition of the line "second a change" to
a, but it doesn't show that, in fact, this line was later deleted
because of a botched merge.  There is no obvious way to find out in
which commit the line was lost.  If someone doing a code review is
reviewing all the patches from "git log -p", then they'll miss the
fact that this patch was lost.

('git log -p -- a" doesn't show that the line was added at all - which
is true, in some sense, but it's equally true that, in the example you
gave, 'x' was never added.)

Have fun,

Avery

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 21:33           ` Linus Torvalds
  2009-06-03 21:59             ` Avery Pennarun
@ 2009-06-03 22:02             ` Junio C Hamano
  2009-06-03 22:17               ` Linus Torvalds
  1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2009-06-03 22:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Jeff King, Graham Perks, Git List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> The original problem was:
>
>  - create new file 'x' in branch 'a'
>
>  - merge branch 'a' into branch 'b', but because of a merge conflict and 
>    confurion in the merge, the merge result doesn't contain 'x' any more.
>
>  - try to find out what happened to 'x' after-the-fact.
>
> Try it. Git really doesn't make it very easy, because git will notice that 
> 'x' didn't exist before the branch either (in branch 'b'), so there will 
> be _no_ sign of 'x' actually going away.

That is true.  The "crude attempt" patch I just sent actually catches
this, but it does not show the lossage of "new" in the "diff/diffstat"
part of the merge, when run with "git log --stat -- x".  Besides, it shows
too many other uninteresting "merged two branches, resolving to lossage of
the path the same way as all the previous merges" to be really useful.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:02             ` Junio C Hamano
@ 2009-06-03 22:17               ` Linus Torvalds
  2009-06-03 22:38                 ` Junio C Hamano
  2009-06-03 22:56                 ` Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03 22:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Graham Perks, Git List



On Wed, 3 Jun 2009, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > The original problem was:
> >
> >  - create new file 'x' in branch 'a'
> >
> >  - merge branch 'a' into branch 'b', but because of a merge conflict and 
> >    confurion in the merge, the merge result doesn't contain 'x' any more.
> >
> >  - try to find out what happened to 'x' after-the-fact.
> >
> > Try it. Git really doesn't make it very easy, because git will notice that 
> > 'x' didn't exist before the branch either (in branch 'b'), so there will 
> > be _no_ sign of 'x' actually going away.
> 
> That is true.  The "crude attempt" patch I just sent actually catches
> this, but it does not show the lossage of "new" in the "diff/diffstat"
> part of the merge, when run with "git log --stat -- x".  Besides, it shows
> too many other uninteresting "merged two branches, resolving to lossage of
> the path the same way as all the previous merges" to be really useful.

Yes.

Thinking more about it, we always did have fairly good workarounds for the 
"we optimized away the history too aggressively" (ie the original 
--full-history, and then the newer and nicer --simplify-merges).

So I'm starting to suspect that I was just wrong in looking at the 
revision history simplification. Yes, that can cause simplification that 
we don't want, but on the other hand, it's reasonably easy to work around.

Maybe what we want is a better model for showing diffs from merges.

For example, right now there is _no_ way to get even a "show diff relative 
to first parent". You can do "-m", which will show it relative to _both_ 
parents, but nobody ever wants that. And you can do "-c" or "--cc", but 
that simplifies away all the paths that match in one. 

So here's a challenge: in the git repository, get a nice view of what your 
merges looked like. The closest I can get is

	git log -c --stat --grep="Merge branch '"

which is actually very non-intuitive ("-c" on its own gives no useful 
output, but "-c --stat" gives nice diffstat against the first parent, 
which in this case is what we want).

But I can't actually get git to generate the _patch_ that the --stat 
describes. You'd have to do something like

	git rev-list --parents --grep="Merge branch '" HEAD |
		while read a b c; do git show -s $a ; git diff $b..$a; done | less -S

which is pretty ugly.

			Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:17               ` Linus Torvalds
@ 2009-06-03 22:38                 ` Junio C Hamano
  2009-06-03 22:44                   ` Jeff King
  2009-06-03 22:47                   ` Linus Torvalds
  2009-06-03 22:56                 ` Junio C Hamano
  1 sibling, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2009-06-03 22:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Graham Perks, Git List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> For example, right now there is _no_ way to get even a "show diff relative 
> to first parent". You can do "-m", which will show it relative to _both_ 
> parents, but nobody ever wants that. And you can do "-c" or "--cc", but 
> that simplifies away all the paths that match in one. 

Actually for that "Where did my file 'x' go across the merge chain", I was
going to suggest something like

	git log --simplify-merges -m --raw -- x

> So here's a challenge: in the git repository, get a nice view of what your 
> merges looked like. The closest I can get is
>
> 	git log -c --stat --grep="Merge branch '"
>
> which is actually very non-intuitive ("-c" on its own gives no useful 
> output, but "-c --stat" gives nice diffstat against the first parent, 
> which in this case is what we want).

I think the logical place to hook that into is the --first-parent option.

I actually have very hard resisted so far the temptation to do so because
your mantra has always been "in a merge, all parents are equal."  If you
treat the first parent specially too heavily, it would go against the "I
got a pull request from you, but the resulting conflicts are too much for
me; you know the area much better than I do, so could you do the merge for
me and I'll fast forward to you later" workflow.  The "first parent is the
mainline" and "I am important, so I'll merge with --no-ff to pee in the
snow" mentality problems will become worse.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:38                 ` Junio C Hamano
@ 2009-06-03 22:44                   ` Jeff King
  2009-06-03 22:56                     ` Linus Torvalds
  2009-06-03 22:47                   ` Linus Torvalds
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff King @ 2009-06-03 22:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Graham Perks, Git List

On Wed, Jun 03, 2009 at 03:38:00PM -0700, Junio C Hamano wrote:

> Actually for that "Where did my file 'x' go across the merge chain", I was
> going to suggest something like
> 
> 	git log --simplify-merges -m --raw -- x

But in the original example, the merge commit where 'x' is deleted isn't
shown _at all_ when path limiting is used. You end up either with
"git log -m" showing the two sides of the merge separately, "git log
--simplify-merges -- x" showing stuff that happened on the side branch
but _not_ the actual merge that made a change, or of course "git log --
x" showing nothing (because we don't traverse the changing side of the
merge).

Is there a way to say "show me everything that touched x, _including_
merges"?

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:38                 ` Junio C Hamano
  2009-06-03 22:44                   ` Jeff King
@ 2009-06-03 22:47                   ` Linus Torvalds
  1 sibling, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03 22:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Graham Perks, Git List



On Wed, 3 Jun 2009, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > For example, right now there is _no_ way to get even a "show diff relative 
> > to first parent". You can do "-m", which will show it relative to _both_ 
> > parents, but nobody ever wants that. And you can do "-c" or "--cc", but 
> > that simplifies away all the paths that match in one. 
> 
> Actually for that "Where did my file 'x' go across the merge chain", I was
> going to suggest something like
> 
> 	git log --simplify-merges -m --raw -- x

Ok. Not very readable, but it's certainly getting closer.

> > So here's a challenge: in the git repository, get a nice view of what your 
> > merges looked like. The closest I can get is
> >
> > 	git log -c --stat --grep="Merge branch '"
> >
> > which is actually very non-intuitive ("-c" on its own gives no useful 
> > output, but "-c --stat" gives nice diffstat against the first parent, 
> > which in this case is what we want).
> 
> I think the logical place to hook that into is the --first-parent option.
> 
> I actually have very hard resisted so far the temptation to do so because
> your mantra has always been "in a merge, all parents are equal." 

Oh, I agree. The challenge was just the first step - how to make it do 
merges in general sanely would be the issue.

Because if you do just --first-parent, then that won't even show the 
test-case that Graham actually had - because the missing file didn't come 
in from the first parent of a merge. 

The challenge was mainly as a way to point out that even some fairly 
simple cases aren't all that simple.

		Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:44                   ` Jeff King
@ 2009-06-03 22:56                     ` Linus Torvalds
  2009-06-03 23:06                       ` Jeff King
  2009-06-04  6:57                       ` Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03 22:56 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Graham Perks, Git List



On Wed, 3 Jun 2009, Jeff King wrote:
> 
> Is there a way to say "show me everything that touched x, _including_
> merges"?

Well, that's the "--simplify-merges" part. 

It's just that our diff generation isn't very smart. We do show the 
commit, we just don't show a meaningful diff in that case.

And doing good diffs for a merge is _hard_. The "--cc" thing is supremely 
useful - it's just that it's useful for data conflicts, not for metadata 
issues.

It's in fact somewhat dubious if you actually want to see the file removal 
as a _diff_ in a merge, exactly because it's so verbose and yet often so 
uninteresting (ie the removal may well be intentional). 

It might be that the right thing to do is to expand on "-c" and '--cc" to 
just give a summary of metadata changes.

Right now, "-c" and "--cc" ignore files that didn't change from one of the 
parents.

But maybe the right thing to do is to entirely ignore files only if they 
exist in all parents, but didn't change in one - and for things that have 
actual metadata changes, just say

 - exists in merge result, but not in parent 2:

   File 'x' was created in parent 1

 - does not exist in merge result, but exists in parent 1:

   File 'y' was deleted by parent 2

or similar (with perhaps even rename detection some day if -M is 
specified, although n-way rename detection is likely pretty painful).

IOW, do the whole "extended diff", but not actually show any diffs, just 
summary information, for new/deleted files in merges.

That would be enough of a hint to then use other tools to see the exact 
details of what happened..

			Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:17               ` Linus Torvalds
  2009-06-03 22:38                 ` Junio C Hamano
@ 2009-06-03 22:56                 ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2009-06-03 22:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Graham Perks, Git List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Thinking more about it, we always did have fairly good workarounds for the 
> "we optimized away the history too aggressively" (ie the original 
> --full-history, and then the newer and nicer --simplify-merges).

This is offtopic, but there is a rather funny hidden feature in that
"newer and nicer" option.

    $ git log --simplify-merges -- no-no-no-such-path-ever-existed

will lists the merges that do not have any common ancestors.

I do not know where it comes from, or if it is worth fixing.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:56                     ` Linus Torvalds
@ 2009-06-03 23:06                       ` Jeff King
  2009-06-03 23:27                         ` Linus Torvalds
  2009-06-04  6:57                       ` Junio C Hamano
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff King @ 2009-06-03 23:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Graham Perks, Git List

On Wed, Jun 03, 2009 at 03:56:01PM -0700, Linus Torvalds wrote:

> > Is there a way to say "show me everything that touched x, _including_
> > merges"?
> 
> Well, that's the "--simplify-merges" part. 
> 
> It's just that our diff generation isn't very smart. We do show the 
> commit, we just don't show a meaningful diff in that case.

No, --simplify-merges doesn't show the merge, unless I am doing
something very wrong. Try (and this is a simplified version of the
original example):

  mkdir repo && cd repo && git init &&
  echo content >base && git add base && git commit -m base &&
  echo context >a.txt && git add a.txt && git commit -m 'master 1' &&
  git checkout -b other HEAD^ &&
  echo content >b.txt && git add b.txt && git commit -m 'other 1' &&
  echo conflict >a.txt && git add a.txt && git commit -m 'other 2' &&
  git checkout master &&
  git merge other ;# conflicts

  rm b.txt && git add b.txt &&
  echo resolve >a.txt && git add a.txt &&
  git commit -m merged

Now try running git log on that. I can see the merge diff if I use "-m",
which is obviously too verbose, but at least works. But if I give
"b.txt" as a path limiter, I can't get the merge commit to display at
all. Doing "git log -m --simplify-merges --stat -- b.txt" yields only
the commit "other 1" in which b.txt was added.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 23:06                       ` Jeff King
@ 2009-06-03 23:27                         ` Linus Torvalds
  2009-06-03 23:37                           ` Jeff King
  0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2009-06-03 23:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Graham Perks, Git List



On Wed, 3 Jun 2009, Jeff King wrote:
>
> Try (and this is a simplified version of the original example):
> 
>   mkdir repo && cd repo && git init &&
>   echo content >base && git add base && git commit -m base &&
>   echo context >a.txt && git add a.txt && git commit -m 'master 1' &&
>   git checkout -b other HEAD^ &&
>   echo content >b.txt && git add b.txt && git commit -m 'other 1' &&
>   echo conflict >a.txt && git add a.txt && git commit -m 'other 2' &&
>   git checkout master &&
>   git merge other ;# conflicts
> 
>   rm b.txt && git add b.txt &&
>   echo resolve >a.txt && git add a.txt &&
>   git commit -m merged

This doesn't work at all for me.

Do

	git show HEAD:b.txt

and it still shows b.txt in the commit. You should have used

	git rm b.txt

rather than "git add b.txt" (or you use use "-u" or "-a" to git add).

That looks like a bug, btw, but whatever. It seems intentional (we do the 
whole "ADD_CACHE_IGNORE_REMOVAL" flag thing).

But you're right. Even when fixed, it does seem to need "--full-history" 
to stay around, and --simplify-merges is insufficient. Bug in merge 
simplification?

			Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 23:27                         ` Linus Torvalds
@ 2009-06-03 23:37                           ` Jeff King
  2009-06-04  1:58                             ` Linus Torvalds
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2009-06-03 23:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Graham Perks, Git List

On Wed, Jun 03, 2009 at 04:27:39PM -0700, Linus Torvalds wrote:

> >   rm b.txt && git add b.txt &&
> >   echo resolve >a.txt && git add a.txt &&
> >   git commit -m merged
> 
> This doesn't work at all for me.
> 
> Do
> 
> 	git show HEAD:b.txt
> 
> and it still shows b.txt in the commit. You should have used
> 
> 	git rm b.txt
> 
> rather than "git add b.txt" (or you use use "-u" or "-a" to git add).

Er, sorry, yeah, I botched the recipe (I initially used "git rm" by
itself, but it complains about "changes staged in the index", so I fixed
it up manually and then botched writing out the automated version).

But I see you figured out what I meant, so...

> But you're right. Even when fixed, it does seem to need "--full-history" 
> to stay around, and --simplify-merges is insufficient. Bug in merge 
> simplification?

I don't even see it with --full-history. I get:

  $ git log -m --stat --oneline | head
  b1a38ec (from 2671fa7) merged
   a.txt |    2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)
  b1a38ec (from d0bac65) merged
   a.txt |    2 +-
   b.txt |    1 -
   2 files changed, 1 insertions(+), 2 deletions(-)

  $ git log -m --stat --oneline -- b.txt
  (no output)

  $ git log --simplify-merges -m --stat --oneline -- b.txt
  912ac84 other 1
   b.txt |    1 +
   1 files changed, 1 insertions(+), 0 deletions(-)

  $ git log --full-history -m --stat --oneline -- b.txt
  912ac84 other 1
   b.txt |    1 +
   1 files changed, 1 insertions(+), 0 deletions(-)

Is there some trick to enabling both path limiting _and_ still showing
the merge commit? Or is this a bug?

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 23:37                           ` Jeff King
@ 2009-06-04  1:58                             ` Linus Torvalds
  0 siblings, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2009-06-04  1:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Graham Perks, Git List



On Wed, 3 Jun 2009, Jeff King wrote:
>
> I don't even see it with --full-history. 

Oh, that's because I used "gitk" rather than "git log", and that adds 
--parents, which in turn means that it actually keeps the merges. 

So with

	git log --full-history --parents --stat -- b.txt

you actually finally get a visible removal.

		Linus

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Am able to delete a file with no trace in the log
  2009-06-03 22:56                     ` Linus Torvalds
  2009-06-03 23:06                       ` Jeff King
@ 2009-06-04  6:57                       ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2009-06-04  6:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Graham Perks, Git List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> And doing good diffs for a merge is _hard_. The "--cc" thing is supremely 
> useful - it's just that it's useful for data conflicts, not for metadata 
> issues.
>
> It's in fact somewhat dubious if you actually want to see the file removal 
> as a _diff_ in a merge, exactly because it's so verbose and yet often so 
> uninteresting (ie the removal may well be intentional). 

Thinking about this a bit more, a "good" diff for merge that catches this
kind of "broken merge" would likely require redoing a merge when you ask
for a diff.

Earlier, in my "Here is a crude attempt" patch, I mentioned that the
merges in the beginning part of the output are uninteresting while the one
that merges "Build-in git-clone" is interesting (and "crude attempt" was
not useful because it shows both).

The reason?  The uninteresting ones match mechanical merge result, while
the interesting one does not.  For example, 17d778e (the first one in the
output --- an uninteresting one) is a merge between 5e97f46 and 450f437;
the former branch removes the path in question (git-clone.sh) compared to
their merge base, while the latter does not change it, hence the
mechanical resolution to remove the path matches what the final merge
records.  On the other hand, b84c343 (Merge branch 'db/clone-in-c') merges 
0dbaa5b (modifies "git-clone.sh" since the merge base) and b50c846
(removes) and the mechanical merge results in a conflict.

I think the original example can be handled in the same way.  A side
branch created a new file since the merge base, but a merge lost the file
by mistake.  The recorded result does not resemble the mechanical merge
result and we can flag it by detecting this condition (that is, if we
wanted to --- I do not think we want to spend cycles to recreate a merge
while traversing the history with "log --stat" by default).

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2009-06-04  6:57 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-02 19:33 Am able to delete a file with no trace in the log Graham Perks
2009-06-02 20:29 ` Tony Finch
2009-06-02 21:34 ` Jeff King
2009-06-02 21:55   ` Linus Torvalds
2009-06-03  0:47     ` Sitaram Chamarty
2009-06-03  1:20       ` Linus Torvalds
2009-06-03  1:34     ` Clean up and simplify rev_compare_tree() Linus Torvalds
2009-06-03  1:57     ` Am able to delete a file with no trace in the log Junio C Hamano
2009-06-03  3:03       ` Linus Torvalds
2009-06-03 21:18         ` Junio C Hamano
2009-06-03 21:26           ` Linus Torvalds
2009-06-03 21:33           ` Linus Torvalds
2009-06-03 21:59             ` Avery Pennarun
2009-06-03 22:02             ` Junio C Hamano
2009-06-03 22:17               ` Linus Torvalds
2009-06-03 22:38                 ` Junio C Hamano
2009-06-03 22:44                   ` Jeff King
2009-06-03 22:56                     ` Linus Torvalds
2009-06-03 23:06                       ` Jeff King
2009-06-03 23:27                         ` Linus Torvalds
2009-06-03 23:37                           ` Jeff King
2009-06-04  1:58                             ` Linus Torvalds
2009-06-04  6:57                       ` Junio C Hamano
2009-06-03 22:47                   ` Linus Torvalds
2009-06-03 22:56                 ` Junio C Hamano
2009-06-03 21:55         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).