git@vger.kernel.org list mirror (unofficial, one of many)
* The case for two trees in a commit ("How to make rebase less modal")
@ 2018-02-28 23:30 Stefan Beller
  2018-03-01  1:15 ` Ramsay Jones
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Stefan Beller @ 2018-02-28 23:30 UTC (permalink / raw)
  To: git; +Cc: Sergey Organov, igor.d.djordjevic, Johannes Schindelin

$ git hash-object --stdin -w -t commit <<EOF
tree c70b4a33a0089f15eb3b38092832388d75293e86
parent 105d5b91138ced892765a84e771a061ede8d63b8
author Stefan Beller <sbeller@google.com> 1519859216 -0800
committer Stefan Beller <sbeller@google.com> 1519859216 -0800
tree 5495266479afc9a4bd9560e9feac465ed43fa63a
test commit
EOF
19abfc3bf1c5d782045acf23abdf7eed81e16669
$ git fsck |grep 19abfc3bf1c5d782045acf23abdf7eed81e16669
$

So it is technically possible to create a commit with two tree entries
and fsck is not complaining.

But why would I want to do that?

There are multiple abstraction levels in Git; I think of them as follows:
* data structures / object model
* plumbing
* porcelain commands to manipulate the repo "at small scale", e.g.
  create a commit/tag
* porcelain to modify the repo "at larger scale", such as rebase,
  cherry-picking, reverting, involving more than 1 commit.

These large-scale operations involving multiple commits, however,
are all modal in nature. Before doing anything else, you have to
finish or abort the rebase, or you need expert knowledge of how to
proceed otherwise.

During the rebase there might be a hard to resolve conflict, which
you may not want to resolve right now, but defer to later.  Deferring a
conflict is currently impossible, because precisely one tree is recorded.

If a commit could hold multiple trees, then all these large-scale
operations would stop being modal and you could just record the unresolved
merge conflict instead, then come back and fix it up later.

I'd advocate for allowing multiple trees in a commit
locally; it might be a bad idea to publish such trees.

Opinions or other use cases?

Thanks,
Stefan


* Re: The case for two trees in a commit ("How to make rebase less modal")
  2018-02-28 23:30 The case for two trees in a commit ("How to make rebase less modal") Stefan Beller
@ 2018-03-01  1:15 ` Ramsay Jones
  2018-03-01  1:42   ` Brandon Williams
  2018-03-01  4:26 ` Jeff King
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: Ramsay Jones @ 2018-03-01  1:15 UTC (permalink / raw)
  To: Stefan Beller, git; +Cc: Sergey Organov, igor.d.djordjevic, Johannes Schindelin



On 28/02/18 23:30, Stefan Beller wrote:
> $ git hash-object --stdin -w -t commit <<EOF
> tree c70b4a33a0089f15eb3b38092832388d75293e86
> parent 105d5b91138ced892765a84e771a061ede8d63b8
> author Stefan Beller <sbeller@google.com> 1519859216 -0800
> committer Stefan Beller <sbeller@google.com> 1519859216 -0800
> tree 5495266479afc9a4bd9560e9feac465ed43fa63a
> test commit
> EOF
> 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $ git fsck |grep 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $
> 
> So it is technically possible to create a commit with two tree entries
> and fsck is not complaining.

Hmm, it's a while since I looked at that code, but I don't think
you have a commit with two trees - the second 'tree <sha1>' line
is just part of the commit message, isn't it?

ATB,
Ramsay Jones



* Re: The case for two trees in a commit ("How to make rebase less modal")
  2018-03-01  1:15 ` Ramsay Jones
@ 2018-03-01  1:42   ` Brandon Williams
  0 siblings, 0 replies; 8+ messages in thread
From: Brandon Williams @ 2018-03-01  1:42 UTC (permalink / raw)
  To: Ramsay Jones
  Cc: Stefan Beller, git, Sergey Organov, igor.d.djordjevic,
	Johannes Schindelin

On 03/01, Ramsay Jones wrote:
>
>
> On 28/02/18 23:30, Stefan Beller wrote:
> > $ git hash-object --stdin -w -t commit <<EOF
> > tree c70b4a33a0089f15eb3b38092832388d75293e86
> > parent 105d5b91138ced892765a84e771a061ede8d63b8
> > author Stefan Beller <sbeller@google.com> 1519859216 -0800
> > committer Stefan Beller <sbeller@google.com> 1519859216 -0800
> > tree 5495266479afc9a4bd9560e9feac465ed43fa63a
> > test commit
> > EOF
> > 19abfc3bf1c5d782045acf23abdf7eed81e16669
> > $ git fsck |grep 19abfc3bf1c5d782045acf23abdf7eed81e16669
> > $
> >
> > So it is technically possible to create a commit with two tree entries
> > and fsck is not complaining.
>
> Hmm, it's a while since I looked at that code, but I don't think
> you have a commit with two trees - the second 'tree <sha1>' line
> is just part of the commit message, isn't it?
>
> ATB,
> Ramsay Jones
>

Actually it doesn't look like it.  The commit msg doesn't start until
after an empty line, so that commit has an empty commit msg.  Here's
one whose msg you can see when it is passed to show:

git hash-object --stdin -w -t commit <<EOF
tree 76d269b57d3c4283922216f84a2850e99f561ccc
parent fa0624f79f9d5765d09598b003124b3cf0b9acdb
author Brandon Williams <bmwill@google.com> 1519859216 -0800
committer Brandon Williams <bmwill@google.com> 1519859216 -0800
tree 76d269b57d3c4283922216f84a2850e99f561ccc

This is a test commit with multiple trees
EOF

Of course the extra tree is ignored, but fsck doesn't complain and show
happily shows what it knows about.
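
For anyone who wants to reproduce this in a scratch repository, here is
a minimal sketch. The temporary repository, the "A U Thor" identity and
the file name file.txt are made up for illustration; only the plumbing
commands are real. It builds two distinct trees, writes a commit object
whose header block names both, and asks Git which tree it resolves.

```shell
set -e

# Throwaway repository (hypothetical location chosen by mktemp).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Two distinct trees: the empty tree and a tree holding one file.
tree_a=$(git mktree </dev/null)
blob=$(echo hello | git hash-object -w --stdin)
tree_b=$(printf '100644 blob %s\tfile.txt\n' "$blob" | git mktree)

# Hand-written root commit with a second "tree" header, followed by the
# blank line that separates the headers from the message.
commit=$(git hash-object --stdin -w -t commit <<EOF
tree $tree_a
author A U Thor <author@example.com> 1519859216 -0800
committer A U Thor <author@example.com> 1519859216 -0800
tree $tree_b

test commit with two tree headers
EOF
)

# Ask Git which tree and which subject it sees for that commit.
seen_tree=$(git rev-parse "$commit^{tree}")
subject=$(git log -1 --format=%s "$commit")

echo "first tree header:  $tree_a"
echo "second tree header: $tree_b"
echo "tree git resolves:  $seen_tree"
echo "subject:            $subject"
```

In this sketch the resolved tree should be the one from the first
header, which is consistent with the extra line being ignored.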

--
Brandon Williams


* Re: The case for two trees in a commit ("How to make rebase less modal")
  2018-02-28 23:30 The case for two trees in a commit ("How to make rebase less modal") Stefan Beller
  2018-03-01  1:15 ` Ramsay Jones
@ 2018-03-01  4:26 ` Jeff King
  2018-03-01  7:25 ` Jacob Keller
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2018-03-01  4:26 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Sergey Organov, igor.d.djordjevic, Johannes Schindelin

On Wed, Feb 28, 2018 at 03:30:27PM -0800, Stefan Beller wrote:

> During the rebase there might be a hard to resolve conflict, which
> you may not want to resolve right now, but defer to later.  Deferring a
> conflict is currently impossible, because precisely one tree is recorded.
> 
> If we had multiple trees possible in a commit, then all these large scale
> operations would stop being modal and you could just record the unresolved
> merge conflict instead; to come back later and fix it up later.
> 
> I'd be advocating for having multiple trees in a commit
> possible locally; it might be a bad idea to publish such trees.
> 
> Opinions or other use cases?

What benefit does it have over adding a new header "unresolved-tree" or
similar? I do not think you are getting any backwards compatibility
here. For instance, "prune" will not traverse it with existing versions
of git, nor "pack-objects" include it in a pack (I didn't actually test
it, so I could be wrong; but those are all based around parse_commit,
which should look at only the first tree).

-Peff


* Re: The case for two trees in a commit ("How to make rebase less modal")
  2018-02-28 23:30 The case for two trees in a commit ("How to make rebase less modal") Stefan Beller
  2018-03-01  1:15 ` Ramsay Jones
  2018-03-01  4:26 ` Jeff King
@ 2018-03-01  7:25 ` Jacob Keller
  2018-03-01 18:50   ` Junio C Hamano
  2018-03-01 18:44 ` Junio C Hamano
  2018-03-01 19:05 ` Jonathan Nieder
  4 siblings, 1 reply; 8+ messages in thread
From: Jacob Keller @ 2018-03-01  7:25 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Sergey Organov, Igor Djordjevic, Johannes Schindelin

On Wed, Feb 28, 2018 at 3:30 PM, Stefan Beller <sbeller@google.com> wrote:
> $ git hash-object --stdin -w -t commit <<EOF
> tree c70b4a33a0089f15eb3b38092832388d75293e86
> parent 105d5b91138ced892765a84e771a061ede8d63b8
> author Stefan Beller <sbeller@google.com> 1519859216 -0800
> committer Stefan Beller <sbeller@google.com> 1519859216 -0800
> tree 5495266479afc9a4bd9560e9feac465ed43fa63a
> test commit
> EOF
> 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $ git fsck |grep 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $
>
> So it is technically possible to create a commit with two tree entries
> and fsck is not complaining.
>
> But why would I want to do that?
>
> There are multiple abstraction levels in Git, I think of them as follows:
> * data structures / object model
> * plumbing
> * porcelain commands to manipulate the repo "at small scale", e.g.
> create a commit/tag
> * porcelain to modify the repo "at larger scale", such as rebase,
> cherrypicking, reverting
>   involving more than 1 commit.
>
> These large scale operations involving multiple commits however
> are all modal in its nature. Before doing anything else, you have to
> finish or abort the rebase or you need expert knowledge how to
> go otherwise.
>
> During the rebase there might be a hard to resolve conflict, which
> you may not want to resolve right now, but defer to later.  Deferring a
> conflict is currently impossible, because precisely one tree is recorded.
>

How does this let you defer a conflict? A future commit which modified
blobs in that tree wouldn't know what version of the trees/blobs to
actually use? Clearly future commits could record their own trees, but
how would they generate the "correct" tree?

Maybe I am missing something here?

Thanks,
Jake

> If we had multiple trees possible in a commit, then all these large scale
> operations would stop being modal and you could just record the unresolved
> merge conflict instead; to come back later and fix it up later.
>
> I'd be advocating for having multiple trees in a commit
> possible locally; it might be a bad idea to publish such trees.
>
> Opinions or other use cases?
>
> Thanks,
> Stefan


* Re: The case for two trees in a commit ("How to make rebase less modal")
  2018-02-28 23:30 The case for two trees in a commit ("How to make rebase less modal") Stefan Beller
                   ` (2 preceding siblings ...)
  2018-03-01  7:25 ` Jacob Keller
@ 2018-03-01 18:44 ` Junio C Hamano
  2018-03-01 19:05 ` Jonathan Nieder
  4 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2018-03-01 18:44 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Sergey Organov, igor.d.djordjevic, Johannes Schindelin

Stefan Beller <sbeller@google.com> writes:

> $ git hash-object --stdin -w -t commit <<EOF
> tree c70b4a33a0089f15eb3b38092832388d75293e86
> parent 105d5b91138ced892765a84e771a061ede8d63b8
> author Stefan Beller <sbeller@google.com> 1519859216 -0800
> committer Stefan Beller <sbeller@google.com> 1519859216 -0800
> tree 5495266479afc9a4bd9560e9feac465ed43fa63a
> test commit
> EOF
> 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $ git fsck |grep 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $
>
> So it is technically possible to create a commit with two tree entries
> and fsck is not complaining.

The second one is merely a random unauthorized header that is not
interpreted in any way by Git.  It is merely being confusing by
starting with "tree " and having 40-hex after it, but the 40-hex
does not get interpreted as an object name, and does not participate
in reachability computation (i.e. packing, pruning and fsck).

There is not much difference between that and a line of trailer in
the commit log message (other than this one is less discoverable).
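
The reachability point can be checked with a small sketch along these
lines. Everything here is illustrative: the scratch repository, the
branch name "demo", the identity and the file names are assumptions,
not anything from the original report. It points a ref at a commit with
two "tree" headers, lists what is reachable, then prunes.

```shell
set -e

# Throwaway repository (hypothetical location chosen by mktemp).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Two non-empty trees, each holding a different file.
blob_a=$(echo first | git hash-object -w --stdin)
blob_b=$(echo second | git hash-object -w --stdin)
tree_a=$(printf '100644 blob %s\ta.txt\n' "$blob_a" | git mktree)
tree_b=$(printf '100644 blob %s\tb.txt\n' "$blob_b" | git mktree)

# Root commit whose header block names both trees.
commit=$(git hash-object --stdin -w -t commit <<EOF
tree $tree_a
author A U Thor <author@example.com> 1519859216 -0800
committer A U Thor <author@example.com> 1519859216 -0800
tree $tree_b

commit with an extra tree header
EOF
)

# Make the commit reachable from a ref so that prune keeps it.
git update-ref refs/heads/demo "$commit"

# Object names that the reachability walk finds from the commit.
reachable=$(git rev-list --objects "$commit" | cut -d' ' -f1)

# Drop every unreachable loose object.
git prune

if git cat-file -e "$tree_a" 2>/dev/null; then first_tree=kept; else first_tree=pruned; fi
if git cat-file -e "$tree_b" 2>/dev/null; then second_tree=kept; else second_tree=pruned; fi

echo "first tree:  $first_tree"
echo "second tree: $second_tree"
```

If the second header really is inert, only the first tree should
survive the prune, even though both names appear in the commit object.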


* Re: The case for two trees in a commit ("How to make rebase less modal")
  2018-03-01  7:25 ` Jacob Keller
@ 2018-03-01 18:50   ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2018-03-01 18:50 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Stefan Beller, git, Sergey Organov, Igor Djordjevic, Johannes Schindelin

Jacob Keller <jacob.keller@gmail.com> writes:

> How does this let you defer a conflict? A future commit which modified
> blobs in that tree wouldn't know what version of the trees/blobs to
> actually use? Clearly future commits could record their own trees, but
> how would they generate the "correct" tree?
>
> Maybe I am missing something here?

If you write four trees out of each stage in the index and record
them, you could in theory have a new command that reads them and
recreates the conflicted index.  Oh, and then you would need the
fifth tree that records what the working-tree files (with conflict
markers) looked like, in order to reproduce the state seen by the
person who ran "git merge", attempted to resolve and gave up halfway
in the middle.

As a local operation, extending "git stash" somehow so that it can
stash even in a conflicted working tree may be a better approach,
and it does not need cruft headers in commit objects, I would think.
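
As a rough sketch of the "one tree per stage" idea, assuming a scratch
repository with a deliberately conflicting merge: the helper name
stage_tree, the scratch index files, the branch names and the file
names are all hypothetical. It covers only stages 1 to 3 (merge base,
ours, theirs) plus the cleanly merged stage-0 entries, not the
working-tree files with conflict markers.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with a known initial branch name.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main

echo base >file.txt
echo untouched >other.txt
git add file.txt other.txt
git commit -q -m base

git checkout -q -b side
echo theirs >file.txt
git commit -q -am side

git checkout -q main
echo ours >file.txt
git commit -q -am main

# The merge stops with a conflict, so a non-zero exit is expected here.
git merge side >/dev/null 2>&1 || true

# Conflicted paths appear at stage 1 (base), 2 (ours) and 3 (theirs).
git ls-files -u

# Write a tree for one stage: take the stage-0 entries plus the entries
# of the requested stage, load them at stage 0 into a scratch index,
# and write that index out as a tree.
stage_tree() {
	scratch="$repo/.git/stage-$1.index"
	git ls-files -s |
	awk -F'\t' -v s="$1" '{
		split($1, f, " ")
		if (f[3] == 0 || f[3] == s)
			printf "%s %s 0\t%s\n", f[1], f[2], $2
	}' |
	GIT_INDEX_FILE="$scratch" git update-index --index-info
	GIT_INDEX_FILE="$scratch" git write-tree
}

base_tree=$(stage_tree 1)
ours_tree=$(stage_tree 2)
theirs_tree=$(stage_tree 3)

echo "base:   $base_tree"
echo "ours:   $ours_tree"
echo "theirs: $theirs_tree"
```

A command that reads such trees back and rebuilds the conflicted index
would be the other half of the idea; that part is not sketched here.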


* Re: The case for two trees in a commit ("How to make rebase less modal")
  2018-02-28 23:30 The case for two trees in a commit ("How to make rebase less modal") Stefan Beller
                   ` (3 preceding siblings ...)
  2018-03-01 18:44 ` Junio C Hamano
@ 2018-03-01 19:05 ` Jonathan Nieder
  4 siblings, 0 replies; 8+ messages in thread
From: Jonathan Nieder @ 2018-03-01 19:05 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Sergey Organov, igor.d.djordjevic, Johannes Schindelin

Hi,

Stefan Beller wrote:

> $ git hash-object --stdin -w -t commit <<EOF
> tree c70b4a33a0089f15eb3b38092832388d75293e86
> parent 105d5b91138ced892765a84e771a061ede8d63b8
> author Stefan Beller <sbeller@google.com> 1519859216 -0800
> committer Stefan Beller <sbeller@google.com> 1519859216 -0800
> tree 5495266479afc9a4bd9560e9feac465ed43fa63a
> test commit
> EOF
> 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $ git fsck |grep 19abfc3bf1c5d782045acf23abdf7eed81e16669
> $
>
> So it is technically possible to create a commit with two tree entries
> and fsck is not complaining.

As others mentioned, this is essentially a fancy way to experiment
with adding a new header (with the same name as an existing header) to
a commit.  It is kind of a scary thing to do because anyone trying to
parse commits, including old versions of git, is likely to get
confused by the multiple trees.  It doesn't affect the reachability
calculation in the way that it should so this ends up being something
that should be straightforward to do with a message in the commit body
instead.

To affect reachability, you could use multiple parent lines instead.
You'd need synthetic commits to hang the trees on.  This is similar to
how "git stash" stores the index state.
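
A minimal sketch of that layout, assuming a scratch repository: the
commit messages, identity and file names are invented for illustration,
and this is only an approximation of the idea, not a description of how
"git stash" is implemented. A synthetic "carrier" commit holds the
extra tree, and the WIP commit lists it as a second parent so the tree
becomes reachable.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository (hypothetical location chosen by mktemp).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Two non-empty trees, each holding a different file.
blob_a=$(echo first | git hash-object -w --stdin)
blob_b=$(echo second | git hash-object -w --stdin)
tree_a=$(printf '100644 blob %s\ta.txt\n' "$blob_a" | git mktree)
tree_b=$(printf '100644 blob %s\tb.txt\n' "$blob_b" | git mktree)

# Ordinary history: one root commit recording the first tree.
base=$(git commit-tree -m "base" "$tree_a")

# Synthetic commit whose only job is to carry the extra tree.
carrier=$(git commit-tree -p "$base" -m "carrier for extra tree" "$tree_b")

# WIP commit: first parent is the real history, second is the carrier.
wip=$(git commit-tree -p "$base" -p "$carrier" -m "WIP" "$tree_a")

# The extra tree is now found by the normal reachability walk.
reachable=$(git rev-list --objects "$wip" | cut -d' ' -f1)
extra_tree=$(git rev-parse "$wip^2^{tree}")

echo "wip commit: $wip"
echo "extra tree: $extra_tree"
```

Because the extra tree hangs off a real parent, existing versions of
git should pack and keep it without any change to the commit format.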

In other words, I think what you are trying to do is feasible, but not
in the exact way you described.

[...]
> * porcelain to modify the repo "at larger scale", such as rebase,
> cherrypicking, reverting
>   involving more than 1 commit.
>
> These large scale operations involving multiple commits however
> are all modal in its nature. Before doing anything else, you have to
> finish or abort the rebase or you need expert knowledge how to
> go otherwise.
>
> During the rebase there might be a hard to resolve conflict, which
> you may not want to resolve right now, but defer to later.  Deferring a
> conflict is currently impossible, because precisely one tree is recorded.

Junio mentions you'd want to record:
 - stages of the index, to re-create a conflicted index
 - working tree files, with conflict markers

In addition you may also want to record:
 - state (todo list) from .git/rebase-merge, to allow picking up where
   you left off in such a larger operation
 - similar state for other commands --- e.g. MERGE_MSG

Recording this work-in-progress state is in the spirit of what "git stash"
does.  People also sometimes like to record their state in progress with
a "wip commit" at the tip of a branch.  Both of those workflows would
benefit from something like this, I'd think.

So I kind of like this.  Maybe a "git save-wip" command that is like
"git stash" but records state to the current branch?  And similarly
improving "git stash" to record the same richer state.

And in the spirit of "git stash" I think it is possible without
even modifying the commit object format.

[...]
> I'd be advocating for having multiple trees in a commit
> possible locally; it might be a bad idea to publish such trees.

I think such "WIP state" may also be useful for publishing, to allow
collaborating on a thorny rebase or merge.

Thanks and hope that helps,
Jonathan
