git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Kevin Bracey <kevin@bracey.fi>
Cc: "SZEDER Gábor" <szeder.dev@gmail.com>,
	"Jeff King" <peff@peff.net>, "Junio C Hamano" <gitster@pobox.com>,
	"Git mailing list" <git@vger.kernel.org>
Subject: Re: Weird revision walk behaviour
Date: Tue, 29 May 2018 00:06:51 +0200	[thread overview]
Message-ID: <20180528220651.20287-1-szeder.dev@gmail.com> (raw)
In-Reply-To: <cb1d7c86-a989-300a-01d2-923e9c29e834@bracey.fi>


> On 24/05/2018 23:26, Kevin Bracey wrote:
> >
> >>> On Wed, May 23, 2018 at 07:10:58PM +0200, SZEDER Gábor wrote:
> >>>
> >>>>    $ git log --oneline master..ba95710a3b -- ci/
> >>>>    ea44c0a594 Merge branch 'bw/protocol-v2' into 
> >>>> jt/partial-clone-proto-v2
> >>>>
> > In this case, we're hitting a merge commit which is not on master, but 
> > it has two parents which both are.

Indeed, I didn't notice this important detail amidst the complexity of
the git history.  Thanks, with this info I could come up with a small
test case to demonstrate the issue, see below.

> Which, IIRC, means the merge commit 
> > is INTERESTING with two UNINTERESTING parents; and we are TREESAME to 
> > only one of them.
> >
> > The commit changing the logic of TREESAME you identified believes that 
> > those TREESAME changes for merges which were intended to improve 
> > fuller history modes shouldn't affect the simple history "because 
> > partially TREESAME merges are turned into normal commits". Clearly 
> > that didn't happen here.
> >
> Haven't currently got a development environment set up here, but I've 
> been looking at the code.Here's a proposal, untested, as a potential 
> starting point if anyone wants to consider a proper patch.
> 
> The simplify_history first-scan logic never actually turned merges into 
> simple commits unless they were TREESAME to a relevant/interesting 
> parent.  Anything where the TREESAME parent was UNINTERESTING was 
> retained as a merge, but had its TREESAME flag set, and that permitted 
> later simplification.
> 
> With the redefinition of the TREESAME flag, this merge commit is no 
> longer TREESAME, and as the decoration logic to refine TREESAME isn't 
> active for simplify_history, it doesn't get cleaned up (even if it would 
> be in full history?)
> 
> I think the answer may be to add an extra post-process step on the 
> initial loop to handle this special case. Something like:
> 
>          case REV_TREE_SAME:
>              if (!revs->simplify_history || !relevant_commit(p)) {
>                  /* Even if a merge with an uninteresting
>                   * side branch brought the entire change
>                   * we are interested in, we do not want
>                   * to lose the other branches of this
>                   * merge, so we just keep going.
>                   */
>                  if (ts)
>                      ts->treesame[nth_parent] = 1;
> +               /* But we note it for potential later simplification */
> +               if (!treesame_parent)
> +                    treesame_parent = p;
>                  continue;
>               }
> 
> ...
> 
> After loop:
> 
> +     if (relevant_parents == 0 && revs->simplify_history && 
> treesame_parent) {
> +           treesame_parent->next = NULL;// Repeats code from loop - 
> share somehow?
> +           commit->parents = treesame_parent;
> +           commit->object.flags |= TREESAME;
> +           return;
> +    }
> 
>       /*
>        * TREESAME is straightforward for single-parent commits. For merge

So, without investing nearly enough time to understand what is going
on, I massaged the above diffs into this:

  ---  >8 ---

diff --git a/revision.c b/revision.c
index 4e0e193e57..0ddd2c1e8a 100644
--- a/revision.c
+++ b/revision.c
@@ -605,7 +605,7 @@ static inline int limiting_can_increase_treesame(const struct rev_info *revs)
 
 static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 {
-	struct commit_list **pp, *parent;
+	struct commit_list **pp, *parent, *treesame_parents = NULL;
 	struct treesame_state *ts = NULL;
 	int relevant_change = 0, irrelevant_change = 0;
 	int relevant_parents, nth_parent;
@@ -672,6 +672,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 		switch (rev_compare_tree(revs, p, commit)) {
 		case REV_TREE_SAME:
 			if (!revs->simplify_history || !relevant_commit(p)) {
+				struct commit_list *tp;
 				/* Even if a merge with an uninteresting
 				 * side branch brought the entire change
 				 * we are interested in, we do not want
@@ -680,6 +681,13 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 				 */
 				if (ts)
 					ts->treesame[nth_parent] = 1;
+				/* But we note it for potential later
+				 * simplification
+				 */
+				tp = treesame_parents;
+				treesame_parents = xmalloc(sizeof(*treesame_parents));
+				treesame_parents->item = p;
+				treesame_parents->next = tp;
 				continue;
 			}
 			parent->next = NULL;
@@ -716,6 +724,14 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 		die("bad tree compare for commit %s", oid_to_hex(&commit->object.oid));
 	}
 
+	if (relevant_parents == 0 && revs->simplify_history &&
+	    treesame_parents) {
+		commit->parents = treesame_parents;
+		commit->object.flags |= TREESAME;
+		return;
+	} else
+		free_commit_list(treesame_parents);
+
 	/*
 	 * TREESAME is straightforward for single-parent commits. For merge
 	 * commits, it is most useful to define it so that "irrelevant"

  ---  >8 ---

FWIW, the test suite passes with the above patch applied.

And here is the small PoC test case to illustrate the issue, which
fails without but succeeds with the above patch.  Eventually it should
be part of 't6012-rev-list-simplify.sh', of course, but I haven't
looked into that yet.


  ---  >8 ---

diff --git a/t/t9999-weird-revision-walk-behaviour.sh b/t/t9999-weird-revision-walk-behaviour.sh
new file mode 100755
index 0000000000..22820f845b
--- /dev/null
+++ b/t/t9999-weird-revision-walk-behaviour.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='PoC weird revision walk behaviour test'
+
+. ./test-lib.sh
+
+# Create the following history, i.e. where both parents of merge 'M1'
+# are in 'master':
+#
+#   B---M2   master
+#  / \ /
+# A   X
+#  \ / \
+#   C---M1   b2
+#
+# and modify 'file' in commits 'A' and 'B', so one of 'M1's parents
+# ('B') is TREESAME wrt. 'file'.
+test_expect_success 'setup' '
+	test_commit initial file &&	# A
+	test_commit modified file &&	# B
+	git checkout -b b1 master^ &&
+	test_commit other-file &&	# C
+	git checkout -b b2 master &&
+	git merge --no-ff b1 &&		# M1
+	git checkout master &&
+	git merge --no-ff b1		# M2
+'
+
+test_expect_success 'debug' '
+	git log --oneline --graph --all
+'
+
+test_expect_success "\"Merge branch 'b1' into b2\" should not be shown" '
+	git log master..b2 -- file >actual &&
+	test_must_be_empty actual
+'
+
+test_done



  reply	other threads:[~2018-05-28 22:07 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-23 17:10 Weird revision walk behaviour SZEDER Gábor
2018-05-23 17:32 ` Jeff King
2018-05-23 17:35   ` Jeff King
2018-05-24 18:54     ` Kevin Bracey
2018-05-24 20:26     ` Kevin Bracey
2018-05-27 17:37       ` Kevin Bracey
2018-05-28 22:06         ` SZEDER Gábor [this message]
2018-05-29  6:11           ` Kevin Bracey
2018-05-29 21:04           ` Jeff King
2018-05-30  8:20             ` Kevin Bracey
2018-05-31  5:43               ` Jeff King
2018-05-31 14:54                 ` Kevin Bracey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180528220651.20287-1-szeder.dev@gmail.com \
    --to=szeder.dev@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=kevin@bracey.fi \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).