git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: Stefan Xenos via GitGitGadget <gitgitgadget@gmail.com>
Cc: Jonathan Tan <jonathantanmy@google.com>,
	git@vger.kernel.org,
	Christophe Poucet <christophe.poucet@gmail.com>,
	Stefan Xenos <sxenos@google.com>
Subject: Re: [PATCH 01/10] technical doc: add a design doc for the evolve command
Date: Thu, 29 Sep 2022 12:57:24 -0700	[thread overview]
Message-ID: <20220929195725.1420647-1-jonathantanmy@google.com> (raw)
In-Reply-To: <a0cf68f8ba2adefae4fceeab0d438d05e355e695.1663959324.git.gitgitgadget@gmail.com>

"Stefan Xenos via GitGitGadget" <gitgitgadget@gmail.com> writes:
> +Background
> +==========
> +Imagine you have three sequential changes up for review and you receive feedback
> +that requires editing all three changes. We'll define the word "change"
> +formally later, but for the moment let's say that a change is a work-in-progress
> +whose final version will be submitted as a commit in the future.
[snip]
> +Part of making the "evolve" command work involves tracking the edits to a commit
> +over time, which is why we need an change graph. 

Reading later, I thought that a "change" is a connected subset of
elements in the set of metacommits, so a "change" is already a graph.
I'll mentally substitute "the concept of a change" for "an [sic] change
graph" for now - hopefully that's correct.

> +- It can be used as part of other high-level commands that combine or split
> +  changes.

Is the current concept of a change suitable for combining and splitting?
There is divergence, but that seems like a commit being modified in 2
different ways, not a commit being split into 2.

> +Goals
> +-----
[snip]
> +P0. A commit can be obsoleted by more than one replacement (called divergence).
> +P0. Users must be able to resolve divergence (convergence).

Is divergence important? It seems to me that both the internals and the
UX could be simplified if we don't allow divergence, and systems like
Gerrit don't have it either (as far as I know, in Gerrit, all commits
bearing the same Change-Id just form a sequence in the order that they
were pushed to the server, with no branching).

> +Overview
> +========
> +We introduce the notion of “meta-commits” which describe how one commit was
> +created from other commits. A branch of meta-commits is known as a change.

"Branch" here is confusing. Do you mean a connected set of meta-commits?

("Branch" has a specific meaning in Git - a ref of the form
refs/heads/??. If you mean that refs of the form refs/metas/?? point to
changes in a 1:1 manner, then the term "ref" is appropriate.)

> +Example usage
> +-------------
> +# First create three dependent changes
> +$ echo foo>bar.txt && git add .
> +$ git commit -m "This is a test"
> +created change metas/this_is_a_test
> +$ echo foo2>bar2.txt && git add .
> +$ git commit -m "This is also a test"
> +created change metas/this_is_also_a_test
> +$ echo foo3>bar3.txt && git add .
> +$ git commit -m "More testing"
> +created change metas/more_testing
> +
> +# List all our changes in progress
> +$ git change list
> +metas/this_is_a_test
> +metas/this_is_also_a_test
> +* metas/more_testing
> +metas/some_change_already_merged_upstream
> +
> +# Now modify the earliest change, using its stable name
> +$ git reset --hard metas/this_is_a_test
> +$ echo morefoo>>bar.txt && git add . && git commit --amend --no-edit

So up to here, I thought that we would have 2 refs metas/this_is_a_test2
and metas/this_is_a_test, with the latter's commit being one of the
parents of the former's commit. (This is because we presumably need to
be able to represent the situation in which the user checks out the
original "This is a test" commit and modifies it, so we still need to
hang on to the metas/this_is_a_test.)

> +# Use git-evolve to fix up any dependent changes
> +$ git evolve
> +rebasing metas/this_is_also_a_test onto metas/this_is_a_test
> +rebasing metas/more_testing onto metas/this_is_also_a_test
> +Done

So I'm surprised that there's no mention of this_is_a_test2 here. In
addition, linearizing a change like this doesn't seem to be described in
this document.

Having said that, I don't think that linearizing changes is important to
the goals of this "evolve" concept, so maybe one thing we can do is to
not support it at all.

Fast-forward...

> +# Fetch the latest code from origin/master and use git-evolve
> +# to rebase all dependent changes.
> +$ git fetch origin master
> +$ git evolve origin/master
> +deleting metas/some_change_already_merged_upstream
> +rebasing metas/this_is_a_test onto origin/master
> +rebasing metas/this_is_also_a_test onto metas/this_is_a_test
> +rebasing metas/more_testing onto metas/this_is_also_a_test
> +rebasing metas/unrelated_change onto origin/master
> +Conflict detected! Resolve it and then use git evolve --continue to resume.
> +
> +# Sort out the conflict
> +$ git mergetool
> +$ git evolve origin/master
> +Done

This is what I expected from the evolve mechanism, so that's great :-)

The conflict resolution needs to be discussed further, though. It is
superficially similar to rebase, but with rebase, the ref being rebased
is only rewritten at the end of the process, so it is always possible to
abort halfway. Here, multiple refs are written during the process, so it
is not as easy to abort.

> +Parent-type
> +-----------
> +The “parent-type” field in the commit header identifies a commit as a
> +meta-commit and indicates the meaning for each of its parents. It is never
> +present for normal commits. It contains a space-deliminated list of enum values
> +whose order matches the order of the parents. Possible parent types are:
> +
> +- c: (content) the content parent identifies the commit that this meta-commit is
> +  describing.
> +- r: (replaced) indicates that this parent is made obsolete by the content
> +  parent.
> +- o: (origin) indicates that the content parent was generated by cherry-picking
> +  this parent.
> +- a: (abandoned) used in place of a content parent for abandoned changes. Points
> +  to the final content commit for the change at the time it was abandoned.

How would the "o" parent be useful to the user?

> +Changes
> +-------
[snip]
> +Changes are also stored in the refs/hiddenmetas namespace. Hiddenmetas holds
> +metadata for historical changes that are not currently in progress by the user.
> +Commands like filter-branch and other bulk import commands create metadata in
> +this namespace.
> +
> +Note that the changes in hiddenmetas get special treatment in several ways:
> +
> +- They are not cleaned up automatically once merged, since it is expected that
> +  they refer to historical changes.
> +- User commands that modify changes don't append to these changes as they would
> +  to a change in refs/metas.
> +- They are not displayed when the user lists their local changes.

The presence of refs/hiddenmetas further muddies the already unclear
lifecycle of meta-commits and their refs. Non-hidden meta-commits get
cleaned up when their latest commit appears upstream, so they may get
deleted when the user doesn't expect it (especially if the user is
using, say, a prefetch mechanism that downloads refs at night). We're
adding to this a class of refs that don't get cleaned up at all.

Besides the disk space taken by the meta-commits, having more refs
typically reduces performance e.g. because all refs generally take part
in packfile negotiation during fetching. (And they probably should
continue with this behavior because sharing meta-commits is one of the
features we want.) So I think that having a clear cleanup strategy is a
good idea, and permanent archiving probably shouldn't be it.

> +Change creation
> +---------------
> +Changes are created automatically whenever the user runs a command like “commit”
> +that has the semantics of creating a new change. They also move forward
> +automatically even if they’re not checked out. For example, whenever the user
> +runs a command like “commit --amend” that modifies a commit, all branches in
> +refs/metas that pointed to the old commit move forward to point to its
> +replacement instead.

What happens in the following?

  $ echo "hello" >hello.txt
  $ git add hello.txt
  $ git commit -m "hello"
  $ git tag hello
  $ echo "one" >hello.txt
  $ git commit -a --amend # this updates refs/metas/hello
  $ git checkout hello
  $ echo "one" >hello.txt
  $ git commit -a --amend # does this update refs/metas/hello too?

> +Sharing changes
> +---------------
> +Change histories are shared by pushing or fetching meta-commits and change
> +branches. This provides users with a lot of control of what to share and
> +repository implementations with control over what to retain.
> +
> +Users that only want to share the content of a commit can do so by pushing the
> +commit itself as they currently would. Users that want to share an edit history
> +for the commit can push its change, which would point to a meta-commit rather
> +than the commit itself if there is any history to share. Note that multiple
> +changes can refer to the same commits, so it’s possible to construct and push a
> +different history for the same commit in order to remove sensitive or irrelevant
> +intermediate states.

It looks difficult to remove such intermediate states, but maybe that
doesn't have to be dealt with in the initial design.

> +Checkout
> +--------
> +Running checkout on a change by name has the same effect as checking out a
> +detached head pointing to the latest commit on that change-branch. There is no
> +need to ever have HEAD point to a change since changes always move forward when
> +necessary, no matter what branch the user has checked out
> +
> +Meta-commits themselves cannot be checked out by their hash.

This is the same behavior as for annotated tags, but I guess we can't
use them because those can only have one referent.

  parent reply	other threads:[~2022-09-29 19:57 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-23 18:55 [PATCH 00/10] Add the Git Change command Christophe Poucet via GitGitGadget
2022-09-23 18:55 ` [PATCH 01/10] technical doc: add a design doc for the evolve command Stefan Xenos via GitGitGadget
2022-09-23 19:59   ` Jerry Zhang
2022-09-28 21:26   ` Junio C Hamano
2022-09-28 22:20   ` Junio C Hamano
2022-09-29  9:17     ` Phillip Wood
2022-09-29 19:57   ` Jonathan Tan [this message]
2022-09-23 18:55 ` [PATCH 02/10] sha1-array: implement oid_array_readonly_contains Chris Poucet via GitGitGadget
2022-09-26 13:08   ` Phillip Wood
2022-09-23 18:55 ` [PATCH 03/10] ref-filter: add the metas namespace to ref-filter Chris Poucet via GitGitGadget
2022-09-26 13:13   ` Phillip Wood
2022-10-04  9:50     ` Chris P
2022-09-23 18:55 ` [PATCH 04/10] evolve: add support for parsing metacommits Stefan Xenos via GitGitGadget
2022-09-26 13:27   ` Phillip Wood
2022-10-04 11:21     ` Chris P
2022-10-04 14:10       ` Phillip Wood
2022-09-23 18:55 ` [PATCH 05/10] evolve: add the change-table structure Stefan Xenos via GitGitGadget
2022-09-27 13:27   ` Phillip Wood
2022-09-27 13:50     ` Ævar Arnfjörð Bjarmason
2022-09-27 14:13       ` Phillip Wood
2022-09-27 15:28         ` Ævar Arnfjörð Bjarmason
2022-09-28 14:33           ` Phillip Wood
2022-09-28 15:14             ` Ævar Arnfjörð Bjarmason
2022-09-28 15:59             ` Junio C Hamano
2022-09-27 14:18     ` Phillip Wood
2022-10-04 14:48     ` Chris P
2022-09-23 18:55 ` [PATCH 06/10] evolve: add support for writing metacommits Stefan Xenos via GitGitGadget
2022-09-28 14:27   ` Phillip Wood
2022-10-05  9:40     ` Chris P
2022-10-05 11:09       ` Phillip Wood
2022-09-23 18:55 ` [PATCH 07/10] evolve: implement the git change command Stefan Xenos via GitGitGadget
2022-09-25  9:10   ` Phillip Wood
2022-09-26  8:23     ` Ævar Arnfjörð Bjarmason
2022-09-26  8:25   ` Ævar Arnfjörð Bjarmason
2022-10-05 12:30     ` Chris P
2022-09-23 18:55 ` [PATCH 08/10] evolve: add the git change list command Stefan Xenos via GitGitGadget
2022-09-23 18:55 ` [PATCH 09/10] evolve: add delete command Chris Poucet via GitGitGadget
2022-09-26  8:38   ` Ævar Arnfjörð Bjarmason
2022-09-26  9:10     ` Chris Poucet
2022-09-23 18:55 ` [PATCH 10/10] evolve: add documentation for `git change` Chris Poucet via GitGitGadget
2022-09-25  8:41   ` Phillip Wood
2022-09-25  8:39 ` [PATCH 00/10] Add the Git Change command Phillip Wood
2022-10-04  9:33   ` Chris P
2022-10-04 14:24 ` Phillip Wood
2022-10-04 15:19   ` Chris P
2022-10-04 15:55     ` Chris P
2022-10-04 16:00       ` Phillip Wood
2022-10-04 15:57     ` Phillip Wood
2022-10-05 14:59 ` [PATCH v2 00/10] RFC: Git Evolve / Change Christophe Poucet via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 01/10] technical doc: add a design doc for the evolve command Stefan Xenos via GitGitGadget
2022-10-05 15:16     ` Chris Poucet
2022-10-06 20:53       ` Glen Choo
2022-10-10 19:35     ` Victoria Dye
2022-10-11  8:59       ` Phillip Wood
2022-10-11 16:59         ` Victoria Dye
2022-10-12 19:19           ` Phillip Wood
2022-10-05 14:59   ` [PATCH v2 02/10] sha1-array: implement oid_array_readonly_contains Chris Poucet via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 03/10] ref-filter: add the metas namespace to ref-filter Chris Poucet via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 04/10] evolve: add support for parsing metacommits Stefan Xenos via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 05/10] evolve: add the change-table structure Stefan Xenos via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 06/10] evolve: add support for writing metacommits Stefan Xenos via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 07/10] evolve: implement the git change command Stefan Xenos via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 08/10] evolve: add delete command Chris Poucet via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 09/10] evolve: add documentation for `git change` Chris Poucet via GitGitGadget
2022-10-05 14:59   ` [PATCH v2 10/10] evolve: add tests for the git-change command Chris Poucet via GitGitGadget
2022-10-10  9:23   ` [PATCH v2 00/10] RFC: Git Evolve / Change Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220929195725.1420647-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=christophe.poucet@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=sxenos@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).