git@vger.kernel.org mailing list mirror (one of many)
* Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
@ 2021-09-18 14:18 Philip Oakley
  2021-09-19 18:59 ` Jeff King
  2021-10-05 10:53 ` Johannes Schindelin
  0 siblings, 2 replies; 9+ messages in thread
From: Philip Oakley @ 2021-09-18 14:18 UTC (permalink / raw)
  To: Git List

Hi all,

Is there a method within `git rev-list` to trim side branch merges where
the merge's tree is identical to the first parent's commit-tree?

One back-story: In the Git-for-Windows repository, the previous releases
are 'deadheaded' by merging with the upstream git, and simply taking the
upstream's tree unconditionally. The Git-for-Windows `fixes` are then
rebased onto that merge.

This does mean that all the fixes keep repeating down the 2nd parent
line. So, for example, grep'ing for changes can be tricky with so much
repeated chaff, but at least all old versions are directly in the history.

Sometimes it's nice to 'pretend' (a simplified history) that there is
only the one latest series of this 'long lived feature branch' (along
with a similar desire for `gfw/next` and `gfw/seen`). (One method has
been to `git replace` that merge commit `{/"Start the"}` with its
parent on a temporary basis.)

From my reading of the `rev-list` manual this is similar to the <paths>
TREESAME capability, but without specifying any paths (maybe just `.` ?).

* Is there an existing method for specifying that simplified history?
* Is there a proper term for the treesame condition of the commit-tree
(as recorded in the commit object)?
* Thoughts on adding an option for `--follow-treesame`?

The desire also came up in my pondering about progressive/partial merges
(how to represent/hold current state/history) of a large tree, whereby
different authors take different 'bites at the melon' of merging a long
lasting feature branch (the 'ball of mud' type), whereby the result
could be an octopus merge of the main/feature/partial commits, which is
repeated until the partial becomes a finalised merge (the book-ending
and octo-merge is still wip, but would also benefit from the 'feature'
merge technique used by GfW).

--
Philip

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-09-18 14:18 Trimming 'deadheads' (TREESAME 2nd parent) from revision walks? Philip Oakley
@ 2021-09-19 18:59 ` Jeff King
  2021-09-19 23:44   ` Ævar Arnfjörð Bjarmason
  2021-10-05 10:53 ` Johannes Schindelin
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff King @ 2021-09-19 18:59 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Git List

On Sat, Sep 18, 2021 at 03:18:47PM +0100, Philip Oakley wrote:

> Is there a method within `git rev-list` to trim side branch merges where
> the merge's tree is identical to the first parent's commit-tree?
> [...]
> From my reading of the `rev-list` manual this is similar to the <paths>
> TREESAME capability, but without specifying any paths (maybe just `.` ?).

Yes, I'd just do "git log ." for this. I don't think there's another way
to trigger simplification. In try_to_simplify_commit(), we bail early
unless revs->prune is set, and that is set only by the presence of
pathspecs or by --simplify-by-decoration.
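
A minimal sketch of that pathspec trick, assuming a throw-away repository
whose branch names and commit messages are invented purely for
illustration. It builds a merge whose tree is identical to its first
parent (here via `git merge -s ours`) and compares a plain walk with a
walk limited by the pathspec `.`:

```shell
#!/bin/sh
# Toy repository only; every name below is hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
# Name the unborn branch "main" regardless of init.defaultBranch.
git symbolic-ref HEAD refs/heads/main

echo base >file; git add file; git commit -q -m base
echo main >file; git commit -q -am main

git checkout -q -b side HEAD^
echo side >file; git commit -q -am side

git checkout -q main
# The resulting merge is TREESAME to its first parent.
git merge -q -s ours -m "deadhead side" side

all=$(git rev-list --count HEAD)          # merge, main, side, base
pruned=$(git rev-list --count HEAD -- .)  # walk with history simplification
echo "all=$all pruned=$pruned"
git log --oneline -- .
```

With the pathspec, the default history simplification follows only the
TREESAME parent of the merge, so neither the merge itself nor the side
branch should be listed.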

> * Is there a proper term for the treesame condition of the commit-tree
> (as recorded in the commit object)?

In a one-parent commit, I'd just call it an empty commit. For a merge,
I'd probably call it an "ours" merge, since one obvious way
to get there is with "git merge -s ours" (of course you can also just
resolve all conflicts in favor of one parent). I don't know of another
name (besides treesame, of course, but that generally implies a
particular scope of interest given by a pathspec).

-Peff

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-09-19 18:59 ` Jeff King
@ 2021-09-19 23:44   ` Ævar Arnfjörð Bjarmason
  2021-09-20 11:40     ` Philip Oakley
  0 siblings, 1 reply; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-19 23:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Philip Oakley, Git List


On Sun, Sep 19 2021, Jeff King wrote:

> On Sat, Sep 18, 2021 at 03:18:47PM +0100, Philip Oakley wrote:
>
>> Is there a method within `git rev-list` to trim side branch merges where
>> the merge's tree is identical to the first parent's commit-tree?
>> [...]
>> From my reading of the `rev-list` manual this is similar to the <paths>
>> TREESAME capability, but without specifying any paths (maybe just `.` ?).
>
> Yes, I'd just do "git log ." for this. I don't think there's another way
> to trigger simplification. In try_to_simplify_commit(), we bail early
> unless revs->prune is set, and that is set only by the presence of
> pathspecs or by --simplify-by-decoration.
>
>> * Is there a proper term for the treesame condition of the commit-tree
>> (as recorded in the commit object)?
>
> In a one-parent commit, I'd just call it an empty commit. For a merge,
> I'd probably call it an "ours" merge, since one obvious way
> to get there is with "git merge -s ours" (of course you can also just
> resolve all conflicts in favor of one parent). I don't know of another
> name (besides treesame, of course, but that generally implies a
> particular scope of interest given by a pathspec).

Isn't it a "theirs" merge, not "ours"? Per the description Philip has:

    In the Git-for-Windows repository, the previous releases are
    'deadheaded' by merging with the upstream git, and simply taking the
    upstream's tree unconditionally[...]

I.e. if you're taking your tree unconditionally it's -s ours, but -s
theirs for theirs. Except of course for the small matter of us not
having a "-s theirs" yet.

I had a WIP patch a while ago for a "-s theirs -X N", for what sounds
like a similar use-case:
https://lore.kernel.org/git/87sh7sdtc1.fsf@evledraar.gmail.com/

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-09-19 23:44   ` Ævar Arnfjörð Bjarmason
@ 2021-09-20 11:40     ` Philip Oakley
  2021-09-20 20:50       ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: Philip Oakley @ 2021-09-20 11:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff King; +Cc: Git List

On 20/09/2021 00:44, Ævar Arnfjörð Bjarmason wrote:
> On Sun, Sep 19 2021, Jeff King wrote:
>
>> On Sat, Sep 18, 2021 at 03:18:47PM +0100, Philip Oakley wrote:
>>
>>> Is there a method within `git rev-list` to trim side branch merges where
>>> the merge's tree is identical to the first parent's commit-tree?
>>> [...]
>>> From my reading of the `rev-list` manual this is similar to the <paths>
>>> TREESAME capability, but without specifying any paths (maybe just `.` ?).
>> Yes, I'd just do "git log ." for this. I don't think there's another way
>> to trigger simplification. In try_to_simplify_commit(), we bail early
>> unless revs->prune is set, and that is set only by the presence of
>> pathspecs or by --simplify-by-decoration.
>>
>>> * Is there a proper term for the treesame condition of the commit-tree
>>> (as recorded in the commit object)?
>> In a one-parent commit, I'd just call it an empty commit. For a merge,
>> I'd probably call it an "ours" merge, since one obvious way
>> to get there is with "git merge -s ours" (of course you can also just
>> resolve all conflicts in favor of one parent). I don't know of another
>> name (besides treesame, of course, but that generally implies a
>> particular scope of interest given by a pathspec).
> Isn't it a "theirs" merge, not "ours"? Per the description Philip has:
>
>     In the Git-for-Windows repository, the previous releases are
>     'deadheaded' by merging with the upstream git, and simply taking the
>     upstream's tree unconditionally[...]

In that sense, yes. From the gfw branch's perspective, the merge is 100%
that of git/master. It provides an 'out-of-line' commit onto which the
gfw patches can be rebased.
It's almost identical to a ball-of-mud progressive (incremental) merge,
but neatly refactored.
>
> I.e. if you're taking your tree unconditionally it's -s ours, but -s
> theirs for theirs. Except of course for the small matter of us not
> having a "-s theirs" yet.

There used to be a `theirs` strategy but (IIRC) it was removed by Linus
years ago (before I discovered git and its ability to distribute
control..).

One thing that catches me, and I think others, is how the 'strategies'
work. IIUC a merge will look at each line in the diff, and accept any
change on either side that has no conflicts within the context zone.
It's only when there are changes from both sides that the selection
strategy kicks in. But it is difficult to describe, so it's easy to be
confused.

It doesn't look like this type of rebasing workflow for
multi-platform/product scenarios was considered at the time. [1-4]

Either way, having a few clues (where to look in the code) for adding
a `--deadhead` history simplification is useful.
>
> I had a WIP patch a while ago for a "-s theirs -X N", for what sounds
> like a similar use-case:
> https://lore.kernel.org/git/87sh7sdtc1.fsf@evledraar.gmail.com/

[1]
https://stackoverflow.com/questions/173919/is-there-a-theirs-version-of-git-merge-s-ours
[2]
https://lore.kernel.org/git/alpine.DEB.1.00.0807290123300.2725@eeepc-johanness/
[3] https://lore.kernel.org/git/7vtzen7bul.fsf@gitster.siamese.dyndns.org/
[4]
https://lore.kernel.org/git/xmqqmv5je412.fsf_-_@gitster.mtv.corp.google.com/

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-09-20 11:40     ` Philip Oakley
@ 2021-09-20 20:50       ` Jeff King
  2021-09-21 13:36         ` Philip Oakley
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff King @ 2021-09-20 20:50 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Ævar Arnfjörð Bjarmason, Git List

On Mon, Sep 20, 2021 at 12:40:21PM +0100, Philip Oakley wrote:

> One thing that catches me, and I think others, is how the 'strategies'
> work. IIUC a merge will look at each line in the diff, and accept any
> change on either side that has no conflicts within the context zone.
> It's only when there are changes from both sides that the selection
> strategy kicks in. But it is difficult to describe, so it's easy to be
> confused.

I think you might be confusing the "ours" strategy (which takes the
tree state of the first parent entirely) with the "ours" (and "theirs")
options of the merge-recursive (or ort) strategy.

You can see the difference with:

  git init repo
  cd repo
  
  echo base >file
  git add file
  git commit -m base
  
  echo main >file
  git add file
  git commit -m main
  
  git checkout -b side HEAD^
  echo side >file
  echo unrelated >another
  git add file another
  git commit -m side
  
  git checkout -b strategy-ours main
  git merge -s ours side
  
  git checkout -b option-ours main
  git merge -X ours side

The strategy-ours merge will drop "another", because it was not in the
first parent. Whereas option-ours will keep it, preferring the
first parent only for the conflict in "file".

You could construct a similar example where instead of a second file,
there's enough content in "file" that some of it does not conflict.
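
The difference can also be checked mechanically. The following is a
sketch that repeats the recipe above in a temporary directory and then
compares trees; the merge messages, the `-q`/`-m` flags and the variable
names are additions made only so that it runs unattended:

```shell
#!/bin/sh
# Same scenario as the recipe above, made non-interactive.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # independent of init.defaultBranch

echo base >file; git add file; git commit -q -m base
echo main >file; git commit -q -am main

git checkout -q -b side HEAD^
echo side >file
echo unrelated >another
git add file another; git commit -q -m side

git checkout -q -b strategy-ours main
git merge -q -s ours -m "strategy ours" side   # keep first parent's tree

git checkout -q -b option-ours main
git merge -q -X ours -m "option ours" side     # real merge, conflicts go to ours

main_tree=$(git rev-parse 'main^{tree}')
strategy_tree=$(git rev-parse 'strategy-ours^{tree}')
option_tree=$(git rev-parse 'option-ours^{tree}')

echo "strategy-ours files:"; git ls-tree --name-only strategy-ours
echo "option-ours files:";   git ls-tree --name-only option-ours
```

The strategy-ours tree should be identical to that of main, while
option-ours should additionally carry "another".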

-Peff

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-09-20 20:50       ` Jeff King
@ 2021-09-21 13:36         ` Philip Oakley
  2021-09-21 18:24           ` Philip Oakley
  0 siblings, 1 reply; 9+ messages in thread
From: Philip Oakley @ 2021-09-21 13:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git List

Hi Peff,
On 20/09/2021 21:50, Jeff King wrote:
> On Mon, Sep 20, 2021 at 12:40:21PM +0100, Philip Oakley wrote:
>
>> One thing that catches me, and I think others, is how the 'strategies'
>> work. IIUC a merge will look at each line in the diff, and accept any
>> change on either side that has no conflicts within the context zone.
>> It's only when there are changes from both sides that the selection
>> strategy kicks in. But it is difficult to describe, so it's easy to be
>> confused.
> I think you might be confusing the "ours" strategy (which takes the
> tree state of the first parent entirely) with the "ours" (and "theirs")
> options of the merge-recursive (or ort) strategy.
>
> You can see the difference with:
>
>   git init repo
>   cd repo
>   
>   echo base >file
>   git add file
>   git commit -m base
>   
>   echo main >file
>   git add file
>   git commit -m main
>   
>   git checkout -b side HEAD^
>   echo side >file
>   echo unrelated >another
>   git add file another
>   git commit -m side
>   
>   git checkout -b strategy-ours main
>   git merge -s ours side
>   
>   git checkout -b option-ours main
>   git merge -X ours side
>
> The strategy-ours merge will drop "another", because it was not in the
> first parent. Whereas option-ours will keep it, preferring the
> first parent only for the conflict in "file".
>
> You could construct a similar example where instead of a second file,
> there's enough content in "file" that some of it does not conflict.
>
> -Peff
Thanks for the clarification.
I was probably overthinking the problem, by starting at the default and
adding conditions that are extras to that, rather than reducing the
conditions!

The `theirs` strategy is really only suitable for maintainers, rather
than solo coders, as it needs to be 'old releases' that are kept, rather
than 'old cruft' (I've generated too much of that in my time).

Dscho's scripts (for anyone interested) for GfW are in
https://github.com/git-for-windows/build-extra/blob/main/shears.sh#L16-L18
and 
https://github.com/git-for-windows/build-extra/blob/main/ever-green.sh,
though from the script perspective it's an 'ours' strategy.

Dscho has to locate the start commit via its subject line, rather than
its topology.

--
Philip

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-09-21 13:36         ` Philip Oakley
@ 2021-09-21 18:24           ` Philip Oakley
  0 siblings, 0 replies; 9+ messages in thread
From: Philip Oakley @ 2021-09-21 18:24 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Git List,
	Johannes Schindelin

On 21/09/2021 14:36, Philip Oakley wrote:
> The `theirs` strategy is really only suitable for maintainers, rather
> than solo coders, as it needs to be 'old releases' that are kept, rather
> than 'old cruft' (I've generated too much of that in my time).
>
> Dscho's scripts (for anyone interested) for GfW are in
> https://github.com/git-for-windows/build-extra/blob/main/shears.sh#L16-L18
> and 
> https://github.com/git-for-windows/build-extra/blob/main/ever-green.sh,
> though from the script perspective it's an 'ours' strategy.
>
> Dscho has to locate the start commit via its subject line, rather than
> its topology.

It's taken me a while to realise why/how Dscho is using 'ours', for a
'theirs' merge.

He is inserting that merge into the start of the --merging-rebase's
instruction sheet, which means that the rebase itself will reverse the
meaning of 'ours' and 'theirs' as it checks out the 'theirs' branch
first before performing the actions in the instruction sheet.

Thus the 'ours' strategy now works in our favour, to effectively
deadhead the old head as a 'theirs' merge and then begin the rebasing of
the Git-for-Windows patch thicket on top of the latest Git.

Sneaky.

--
Philip
(added dscho as cc, just in case I've got it wrong again;-)

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-09-18 14:18 Trimming 'deadheads' (TREESAME 2nd parent) from revision walks? Philip Oakley
  2021-09-19 18:59 ` Jeff King
@ 2021-10-05 10:53 ` Johannes Schindelin
  2021-10-06 14:03   ` Philip Oakley
  1 sibling, 1 reply; 9+ messages in thread
From: Johannes Schindelin @ 2021-10-05 10:53 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Git List

Hi Philip,

On Sat, 18 Sep 2021, Philip Oakley wrote:

> Is there a method within `git rev-list` to trim side branch merges where
> the merge's tree is identical to the first parent's commit-tree?

Yes, there is, but it is not as easy as a command-line option: `git
replace`.

For example, to pretend that the most recent merging-rebase in Git for
Windows was (almost) a regular rebase, you replace the "Start the
merging-rebase" commit with a graft that only keeps its first parent:

	git replace --graft HEAD^{/^Start.the} HEAD^{/^Start.the}^

(Of course, you still have to find out the first-parent-treesame merge
commits that you want to replace.)
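
One possible way to find them, sketched on a toy history: list all merges
and keep those whose tree equals the tree of their first parent. The
branch names (`old-fixes`, `new-fixes`), the file names and the
`deadheads` variable are hypothetical; only the comparison loop matters.

```shell
#!/bin/sh
# Toy history imitating a merging-rebase; all names are made up.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base >file; git add file; git commit -q -m base

git checkout -q -b old-fixes
echo fix >fix.txt; git add fix.txt; git commit -q -m "old fix"

git checkout -q main
echo upstream >file; git commit -q -am "upstream release"

git checkout -q -b new-fixes main
git merge -q -s ours -m "Start the merging-rebase to upstream release" old-fixes
echo fix >fix.txt; git add fix.txt; git commit -q -m "rebased fix"

# Print every merge reachable from HEAD whose tree matches its first parent.
deadheads=$(
	git rev-list --merges HEAD |
	while read -r merge
	do
		if [ "$(git rev-parse "$merge^{tree}")" = \
		     "$(git rev-parse "$merge^1^{tree}")" ]
		then
			echo "$merge"
		fi
	done
)
echo "$deadheads"
```

In this toy history the only candidate is the "Start the merging-rebase"
commit; in a real repository the loop may list several.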

If you want to do that only temporarily, you can use a throw-away
namespace instead of the refs/replace/ one:

	export GIT_REPLACE_REF_BASE=refs/philipoakley/
	git replace --graft HEAD^{/^Start.the} HEAD^{/^Start.the}^

Before:

	[...]
	| > | | 23e09ef1080 Merge 'remote-hg-prerequisites' into HEAD
	|/| | |
	| > | | 0221569db1c Always auto-gc after calling a fast-import transport
	| > | | f189282dcfc remote-helper: check helper status after import/export
	| > | | 158907ceb87 transport-helper: add trailing --
	| > | | 6e34e54050c t9350: point out that refs are not updated correctly
	|/ / /
	> | |   7b2b910b080 Start the merging-rebase to v2.33.0
	|\ \ \
	| |_|/
	|/| |
	| > |   508bb26ff90 (tag: v2.33.0-rc2.windows.1) Merge pull request #3349 from vdye/feature/ci-subtree-tests
	[...]

After:

	[...]
	| > | | 23e09ef1080 Merge 'remote-hg-prerequisites' into HEAD
	|/| | |
	| > | | 0221569db1c Always auto-gc after calling a fast-import transport
	| > | | f189282dcfc remote-helper: check helper status after import/export
	| > | | 158907ceb87 transport-helper: add trailing --
	| > | | 6e34e54050c t9350: point out that refs are not updated correctly
	|/ / /
	> | / 7b2b910b080 (replaced) Start the merging-rebase to v2.33.0
	| |/
	|/|
	> | 225bc32a989 (tag: v2.33.0, upstream/maint, mirucam/maint, gitgitgadget/snap, gitgitgadget/maint) Git 2.33
	[...]

You can always clean up _all_ replace objects via `git replace -d $(git
replace -l)`.
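
A sketch of the whole round trip on a toy history (the branch names, the
file names and the `refs/scratch/` namespace are made up for the
example): graft the merge in a throw-away namespace, observe the shorter
walk, then confirm that the real history is untouched.

```shell
#!/bin/sh
# Toy history imitating a merging-rebase; all names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base >file; git add file; git commit -q -m base
git checkout -q -b old-fixes
echo fix >fix.txt; git add fix.txt; git commit -q -m "old fix"
git checkout -q main
echo upstream >file; git commit -q -am "upstream release"
git checkout -q -b new-fixes main
git merge -q -s ours -m "Start the merging-rebase to upstream release" old-fixes
echo fix >fix.txt; git add fix.txt; git commit -q -m "rebased fix"

start='HEAD^{/^Start.the}'
before=$(git rev-list --count HEAD)       # includes "old fix"

# Keep the replacement out of refs/replace/.
export GIT_REPLACE_REF_BASE=refs/scratch/
git replace --graft "$start" "$start^"
grafted=$(git rev-list --count HEAD)      # second parent is hidden

# Without the variable the graft is simply not seen.
unset GIT_REPLACE_REF_BASE
without_env=$(git rev-list --count HEAD)

# Clean up the throw-away namespace again.
export GIT_REPLACE_REF_BASE=refs/scratch/
git replace -d $(git replace -l) >/dev/null
cleaned=$(git rev-list --count HEAD)

echo "before=$before grafted=$grafted without_env=$without_env cleaned=$cleaned"
```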

Ciao,
Dscho

* Re: Trimming 'deadheads' (TREESAME 2nd parent) from revision walks?
  2021-10-05 10:53 ` Johannes Schindelin
@ 2021-10-06 14:03   ` Philip Oakley
  0 siblings, 0 replies; 9+ messages in thread
From: Philip Oakley @ 2021-10-06 14:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git List

Hi Dscho,
On 05/10/2021 11:53, Johannes Schindelin wrote:
> Hi Philip,
>
> On Sat, 18 Sep 2021, Philip Oakley wrote:
>
>> Is there a method within `git rev-list` to trim side branch merges where
>> the merge's tree is identical to the first parent's commit-tree?
> Yes, there is, but it is not as easy as a command-line option: `git
> replace`.
>
> For example, to pretend that the most recent merging-rebase in Git for
> Windows was (almost) a regular rebase, you replace the "Start the
> merging-rebase" commit with a graft that only keeps its first parent:
>
> 	git replace --graft HEAD^{/^Start.the} HEAD^{/^Start.the}^

If I remember rightly, the ever-green script needs to go to special
lengths to ensure that it gets the topological (DAG) sort order, rather
than the default chronological (time-ordered) commit search, which was
part of the deadheads question.

I need to learn the dot trick to avoid the quoting needed for the spaces
in the commit message title! It always slips my mind. I've added a
pseudo alias for the command style (I can never remember what I named
it.. but it's easy to look up all aliases..) I also never remember that
"Start.the" subject line..
>
> (Of course, you still have to find out the first-parent-treesame merge
> commits that you want to replace.)

I was thinking of cases beyond the current Git-for-Windows, that other
maintainers may start to use where keeping the 'deadheads' is a valid,
or even required, part of their real-world projects, hence the idea of a
`--deadheads` variant of 'first-parent'.
>
> If you want to do that only temporarily, you can use a throw-away
> namespace instead of the refs/replace/ one:
>
> 	export GIT_REPLACE_REF_BASE=refs/philipoakley/
> 	git replace --graft HEAD^{/^Start.the} HEAD^{/^Start.the}^

Useful.
>
> Before:
>
> 	[...]
> 	| > | | 23e09ef1080 Merge 'remote-hg-prerequisites' into HEAD
> 	|/| | |
> 	| > | | 0221569db1c Always auto-gc after calling a fast-import transport
> 	| > | | f189282dcfc remote-helper: check helper status after import/export
> 	| > | | 158907ceb87 transport-helper: add trailing --
> 	| > | | 6e34e54050c t9350: point out that refs are not updated correctly
> 	|/ / /
> 	> | |   7b2b910b080 Start the merging-rebase to v2.33.0
> 	|\ \ \
> 	| |_|/
> 	|/| |
> 	| > |   508bb26ff90 (tag: v2.33.0-rc2.windows.1) Merge pull request #3349 from vdye/feature/ci-subtree-tests
> 	[...]
>
> After:
>
> 	[...]
> 	| > | | 23e09ef1080 Merge 'remote-hg-prerequisites' into HEAD
> 	|/| | |
> 	| > | | 0221569db1c Always auto-gc after calling a fast-import transport
> 	| > | | f189282dcfc remote-helper: check helper status after import/export
> 	| > | | 158907ceb87 transport-helper: add trailing --
> 	| > | | 6e34e54050c t9350: point out that refs are not updated correctly
> 	|/ / /
> 	> | / 7b2b910b080 (replaced) Start the merging-rebase to v2.33.0
> 	| |/
> 	|/|
> 	> | 225bc32a989 (tag: v2.33.0, upstream/maint, mirucam/maint, gitgitgadget/snap, gitgitgadget/maint) Git 2.33
> 	[...]
>
> You can always clean up _all_ replace objects via `git replace -d $(git
> replace -l)`.

That's a useful clean up tip. Thanks!

>
> Ciao,
> Dscho
Thanks
Philip
