git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Are --first-parent and --ancestry-path compatible rev-list options?
@ 2016-08-26 21:23 Philip Oakley
  2016-08-26 22:53 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Philip Oakley @ 2016-08-26 21:23 UTC (permalink / raw)
  To: Git List; +Cc: self

While trying to answer a Stack Overflow question I thought I could 
contribute to, I've found a scenario that I don't understand that may be a 
bug.

In http://stackoverflow.com/questions/39144006/identify-merge-into-master 
MvG asked how to find the point at which a commit on a feature branch was 
later merged into the main-line.

After some discussion it appeared that
    `git log --oneline  --first-parent --merges --reverse  --ancestry-path 
:/j..  | head -1`
should find the point that 'j' was merged into master (when on master), 
however it appears that --first-parent and --ancestry-path interact badly to 
produce no output in the example, but if either is dropped, the expected 
commit is shown.

What am I missing?
--
Philip


The commit graph. We are looking for F based on knowing J.

. A - B - C - D -- E -- F -- G - H    <-first parent, --merges (C,F,H)
.  \  |  /    \        /    /
.   ----Z     |       /    /
.     |       |       |   /
.      \       \     /   /
.       I -[J]- K - L - M             <-since J, children of J
.        \         /
.         N - O - P

# Steps to reproduce the extended example from
# http://stackoverflow.com/q/39144006/1468366

git init .
echo a > txt
git add txt
git commit -m a
echo a > txt; git commit -a -m a
echo b > txt; git commit -a -m b
git checkout -b side :/a
echo z > txt; git commit -a -m z
git checkout master
git merge :/z; echo c > txt; git add -u; git commit -m c

#echo c > txt; git commit -a -m c
echo d > txt; git commit -a -m d
echo e > txt; git commit -a -m e
git checkout -b 2nd :/b
echo i > txt; git commit -a -m i
echo j > txt; git commit -a -m j
git merge :/d; echo k > txt; git add -u; git commit -m k
git checkout -b 3rd :/i
echo n > txt; git commit -a -m n
echo o > txt; git commit -a -m o
echo p > txt; git commit -a -m p
git checkout 2nd
git merge :/p; echo l > txt; git add -u; git commit -m l
echo m > txt; git commit -a -m m
git checkout master
git merge :/l; echo f > txt; git add -u; git commit -m f
git merge :/m; echo g > txt; git add -u; git commit -m g
echo h > txt; git commit -a -m h

git log --oneline  --first-parent --merges --reverse  --ancestry-path :/j.. 
| head -5

# why does this not work --ancestry-path and --first-parent appear to clash.

code available as 
https://gist.github.com/PhilipOakley/58f344f910e50b72f5a8a2bd55b6c175


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Are --first-parent and --ancestry-path compatible rev-list options?
  2016-08-26 21:23 Are --first-parent and --ancestry-path compatible rev-list options? Philip Oakley
@ 2016-08-26 22:53 ` Junio C Hamano
  2016-08-27 11:04   ` Philip Oakley
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2016-08-26 22:53 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Git List

"Philip Oakley" <philipoakley@iee.org> writes:

> The commit graph. We are looking for F based on knowing J.
>
> . A - B - C - D -- E -- F -- G - H    <-first parent, --merges (C,F,H)
> .  \  |  /    \        /    /
> .   ----Z     |       /    /
> .     |       |       |   /
> .      \       \     /   /
> .       I -[J]- K - L - M             <-since J, children of J
> .        \         /
> .         N - O - P

I think these two operations are fundamentally incompatible.

Because the first-parent traversal is what the name says, i.e.,
forbids the positive side of revision traversal to stray into side
branches, the positive side of a traversal that begins at H will not
see M, L and K.  The negative side would traverse in the normal way
to paint commits starting J and its ancestors as UNINTERESTING, so
the traversal over the artificial "first-parent only" history, i.e.
H, G, F, E, D, C, B, A would eventually stop before showing an
ancestor of J.

On the other hand, to compute the ancestry-path, you need to make a
full traversal of the real history to find a subgraph J..H and then
post-process it to cull paths that do not contain J.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Are --first-parent and --ancestry-path compatible rev-list options?
  2016-08-26 22:53 ` Junio C Hamano
@ 2016-08-27 11:04   ` Philip Oakley
  2016-09-01 17:32     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Philip Oakley @ 2016-08-27 11:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

From: "Junio C Hamano" <gitster@pobox.com>
> "Philip Oakley" <philipoakley@iee.org> writes:
>
>> The commit graph. We are looking for F based on knowing J.
>>
>> . A - B - C - D -- E -- F -- G - H    <-first parent, --merges (C,F,H)
>> .  \  |  /    \        /    /
>> .   ----Z     |       /    /
>> .     |       |       |   /
>> .      \       \     /   /
>> .       I -[J]- K - L - M             <-since J, children of J
>> .        \         /
>> .         N - O - P
>
> I think these two operations are fundamentally incompatible.

If I run them independently, they both find the desired INTERESTED commit, 
hence the expectation that together they will still find that commit as an 
intersection between the two sets.

>
> Because the first-parent traversal is what the name says, i.e.,
> forbids the positive side of revision traversal to stray into side
> branches, the positive side of a traversal that begins at H will not
> see M, L and K.

But it does see F the ultimately desired commit.
Philip@PhilipOakley MINGW32 /usr/src/test (master)

$ git log --oneline --first-parent --merges :/j..

d089efa g

ac811d4 f

1db59e5 c

Then we have:
--ancestry-path
    Limit the displayed commits to those directly on the ancestry chain 
between the "from" and "to" commits in the given commit range.

So one would expect, from a user viewepoint that this is commit selection, 
not internal algorithm implementation, that it would only list G and F, and 
the C would not be displayed.

>  The negative side would traverse in the normal way
> to paint commits starting J and its ancestors as UNINTERESTING, so
> the traversal over the artificial "first-parent only" history, i.e.
> H, G, F, E, D, C, B, A would eventually stop before showing an
> ancestor of J.
>

> On the other hand, to compute the ancestry-path, you need to make a
> full traversal of the real history to find a subgraph J..H and then
> post-process it to cull paths that do not contain J.

The culling order isn't clear. It's easy to expect that --first parent is a 
cull.
The documentation implies (to me) this different computation.


This explains why its happening, but it does feel rather contrary to user 
(rather than developer) expectation. It's the confusion between traversal 
limiting and marking a commit as Interesting (i.e. UNINTERSTING has two 
meanings which may not align when multiple options are given.)

It does seem odd that in the Git world, with its feature branch approach, 
that this hasn't come up before. I know that I haven't even bothered to try 
this 'search', and just use gitk to manually follow the breadcrumbs.

Perhaps a `--first-parent-ancestor` option to determine the places where a 
commit from a feature series was merged in to the mainline would be rather 
helpful.
> --
Philip 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Are --first-parent and --ancestry-path compatible rev-list options?
  2016-08-27 11:04   ` Philip Oakley
@ 2016-09-01 17:32     ` Junio C Hamano
  2016-09-01 20:48       ` Philip Oakley
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2016-09-01 17:32 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Git List

"Philip Oakley" <philipoakley@iee.org> writes:

> From: "Junio C Hamano" <gitster@pobox.com>
>> "Philip Oakley" <philipoakley@iee.org> writes:
>>
>>> The commit graph. We are looking for F based on knowing J.
>>>
>>> . A - B - C - D -- E -- F -- G - H    <-first parent, --merges (C,F,H)
>>> .  \  |  /    \        /    /
>>> .   ----Z     |       /    /
>>> .     |       |       |   /
>>> .      \       \     /   /
>>> .       I -[J]- K - L - M             <-since J, children of J
>>> .        \         /
>>> .         N - O - P
>>
>> I think these two operations are fundamentally incompatible.
>
> If I run them independently, they both find the desired INTERESTED
> commit, hence the expectation that together they will still find that
> commit as an intersection between the two sets.
>
>>
>> Because the first-parent traversal is what the name says, i.e.,
>> forbids the positive side of revision traversal to stray into side
>> branches, the positive side of a traversal that begins at H will not
>> see M, L and K.
>
> But it does see F the ultimately desired commit.

You are doing --merges --first-parent, right?  Traversing only the
first-parent chain on the positive side, while excluding J's
ancestor by traversing the negative side without being limited to
the first-parent chain, would paint B and its ancestors as
uninteresting on the first-parent chain, so among H, G, F, E, D and
C, which are the survivors on the first-parent chain that are still
not UNINTERESTIN, the last one you would find that is a merge is F.

So I do not see any room for "But" to come in here...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Are --first-parent and --ancestry-path compatible rev-list options?
  2016-09-01 17:32     ` Junio C Hamano
@ 2016-09-01 20:48       ` Philip Oakley
  2016-09-01 22:20         ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Philip Oakley @ 2016-09-01 20:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Hi Junio,

From: "Junio C Hamano" <gitster@pobox.com>
> "Philip Oakley" <philipoakley@iee.org> writes:
>
>> From: "Junio C Hamano" <gitster@pobox.com>
>>> "Philip Oakley" <philipoakley@iee.org> writes:
>>>
>>>> The commit graph. We are looking for F based on knowing J.
>>>>
>>>> . A - B - C - D -- E -- F -- G - H    <-first parent, --merges (C,F,H)
>>>> .  \  |  /    \        /    /
>>>> .   ----Z     |       /    /
>>>> .     |       |       |   /
>>>> .      \       \     /   /
>>>> .       I -[J]- K - L - M             <-since J, children of J
>>>> .        \         /
>>>> .         N - O - P
>>>
>>> I think these two operations are fundamentally incompatible.
>>
>> If I run them independently, they both find the desired INTERESTED
>> commit, hence the expectation that together they will still find that
>> commit as an intersection between the two sets.
>>
>>>
>>> Because the first-parent traversal is what the name says, i.e.,
>>> forbids the positive side of revision traversal to stray into side
>>> branches, the positive side of a traversal that begins at H will not
>>> see M, L and K.
>>
>> But it does see F the ultimately desired commit.
>
> You are doing --merges --first-parent, right?  Traversing only the
> first-parent chain on the positive side, while excluding J's
> ancestor by traversing the negative side without being limited to
> the first-parent chain, would paint B and its ancestors as
> uninteresting on the first-parent chain, so among H, G, F, E, D and
> C, which are the survivors on the first-parent chain that are still
> not UNINTERESTIN, the last one you would find that is a merge is F.
>
> So I do not see any room for "But" to come in here...
>

The confusion is between the "As required" and "as coded" viewpoints (this 
is regular dayjob problem of allegedly having 'requirement specifications').

You have rightly described the algorithm as currently implemented, while I 
was was trying to state a requirement (based on a user question).

The user question was, given a commit 'J', and a future commit 'H' 
(typically a branch tip such as 'master'), find those commits that are :
A) merges
B) on the first parent DAG chain of the future commit 'H'
C) children of the given commit 'J'

i.e. the points on master where the feature J (and it's children) could have 
brought in some effect to master.

Each of the three conditions match one of the revision list options, but the 
implemented logic does not produce the AND effect that could be expected. 
Essentially it (the user's desire vs the coding implementation) is a case 
where (depending on the viewpoint) we get the 'denying the antecedent' 
fallacy.

Just because a commit is not --first-parent does not mean that revison 
walking (in this case) should stop. The user is interested in the ancestry 
path, so from that perspective the the walk should carry on through the 
other parents till its reaches the TRULY_DISINTERESTED line, or reaches J.


With the current implementation one appears to need to script a search 
through all the first parent merges and then prune out those that aren't 
merge-base is-ancestor commits, which is what the OP ended up having to do.


I haven't had any time to look into the rev-walk code itself, but is this 
something that could be coded (new option) or is the double-duty problem 
with UNINTERESTING too hard baked into the code?
--
Philip


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Are --first-parent and --ancestry-path compatible rev-list options?
  2016-09-01 20:48       ` Philip Oakley
@ 2016-09-01 22:20         ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2016-09-01 22:20 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Git List

"Philip Oakley" <philipoakley@iee.org> writes:

> The user question was, given a commit 'J', and a future commit 'H'
> (typically a branch tip such as 'master'), find those commits that are
> :
> A) merges
> B) on the first parent DAG chain of the future commit 'H'
> C) children of the given commit 'J'

The answer then is that there is no such single step operation.

In the picture, if D or E were a merge from a side branch that does
not have anything to do with 'J', "log --first-parent --merges" will
not exclude it (i.e. C won't be fulfilled by --first-parent --merges).


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-09-01 22:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-26 21:23 Are --first-parent and --ancestry-path compatible rev-list options? Philip Oakley
2016-08-26 22:53 ` Junio C Hamano
2016-08-27 11:04   ` Philip Oakley
2016-09-01 17:32     ` Junio C Hamano
2016-09-01 20:48       ` Philip Oakley
2016-09-01 22:20         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).