git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: "Stephen P. Smith" <ischis2@cox.net>
Cc: git@vger.kernel.org
Subject: Re: How to keep a project's canonical history correct.
Date: Thu, 08 May 2014 11:37:48 -0700
Message-ID: <xmqqsiok2k1v.fsf@gitster.dls.corp.google.com>
In-Reply-To: <1399526252-28522-1-git-send-email-ischis2@cox.net> (Stephen P. Smith's message of "Wed, 7 May 2014 22:17:32 -0700")

"Stephen P. Smith" <ischis2@cox.net> writes:

> During the mail thread about "Pull is mostly evil" a user asked how
> the first parent could become reversed.
>
> This howto explains how the first parent can get reversed when viewed
> by the project and then explains a method to keep the history correct.
>
> Signed-off-by: Stephen P. Smith <ischis2@cox.net>
> ---

Thanks.  There are a few nitpicks, though; most of them are things I
should have done when I wrote the original before sending it out ;-)

>  Documentation/Makefile                             |   1 +
>  .../howto/keep-canonical-history-correct.txt       | 207 +++++++++++++++++++++
>  2 files changed, 208 insertions(+)
>  create mode 100644 Documentation/howto/keep-canonical-history-correct.txt
>
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index fc6b2cf..cea0e7a 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -59,6 +59,7 @@ SP_ARTICLES += howto/recover-corrupted-blob-object
>  SP_ARTICLES += howto/recover-corrupted-object-harder
>  SP_ARTICLES += howto/rebuild-from-update-hook
>  SP_ARTICLES += howto/rebase-from-internal-branch
> +SP_ARTICLES += howto/keep-canonical-history-correct
>  SP_ARTICLES += howto/maintain-git
>  API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))
>  SP_ARTICLES += $(API_DOCS)
> diff --git a/Documentation/howto/keep-canonical-history-correct.txt b/Documentation/howto/keep-canonical-history-correct.txt
> new file mode 100644
> index 0000000..dd310ea
> --- /dev/null
> +++ b/Documentation/howto/keep-canonical-history-correct.txt
> @@ -0,0 +1,207 @@
> +From: Junio C Hamano <gitster@pobox.com>
> +Date: Wed, 07 May 2014 13:15:39 -0700
> +Subject: Beginner question on "Pull is mostly evil"
> +Abstract: This how-to explains a method for keeping a project's history correct when using git pull.
> +Content-type: text/asciidoc

Please keep the Message-ID: from the original; people can find the
original discussion more easily that way.

Also, wrap long lines of Abstract (see revert-branch-rebase.txt for
an example).

> +Keep authoritative canonical history correct with git pull
> +==========================================================
> +

We may want to have an introductory sentence before this "Suppose"
to set the scene, so that readers would be prepared to read about
the workflow that uses a central repository for everybody to meet.

> +Suppose that that central repository has this history:
> +
> +------------
> +    ---o---o---A
> +------------
> +
> +which ends at commit A (time flows from left to right and each node in
> +the graph is a commit, lines between them indicating parent-child
> +relationship).
> +
> +Then you clone it and work on your own commits, which leads you to
> +have this in *your* repository:

s/this/this history/ perhaps.

> +
> +------------
> +    ---o---o---A---B---C
> +------------
> +
> +Imagine your coworker did the same and built on top of A in *his*
> +repository this history in the meantime, and then pushed it to the
> +central repository:
> +
> +------------
> +    ---o---o---A---X---Y---Z
> +------------
> +
> +Now, if you "git push" at this point, beause your history that leads
> +to C lack X, Y and Z, it will fail.  You need to somehow make the

s/lack/lacks/

As this is no longer plain text, you probably want `C`, `X`, etc.,
to typeset these commit markers in fixed width.

> +tip of your history a descendant of Z.
> +
> +One suggested way to solve the problem is "fetch and then merge".

s/\.$/, aka "git pull"./

> +If you fetch, your repository will have a history like this:

s/If/When/

> +------------
> +    ---o---o---A---B---C
> +                \
> +                 X---Y---Z
> +------------
> +
> +And then if you did merge after that, while still on *your* branch,
> +i.e. C, you will create a merge M and make the history look like
> +this:

s/did/run/ perhaps.  The past tense there feels wrong.

> +
> +------------
> +    ---o---o---A---B---C---M
> +                \         /
> +                 X---Y---Z
> +------------
> +
> +M is a descendant of Z, so you can push to update the central
> +repository.  Such a merge M does not lose any commit in both
> +histories, so in that sense it may not be wrong, but when people
> +would want to talk about "the authoritative canonical history that

s/would want/want/; I have a bad habit of overusing "would".

> +is shared among the project participants", i.e. "the trunk", the way
> +they often use is to do:
> +
> +------------
> +    $ git log --first-parent
> +------------
> +
> +For all other people who observed the central repository after your
> +coworker pushed Z but before you pushed M, the commit on the trunk
> +used to be "o-o-A-X-Y-Z".  But because you made M while you were on

As this is no longer plain text, you probably want `o-o-A-X-Y-Z`
(not dq, but bq) to typeset them in fixed width in AsciiDoc.  Same
for X-Y-Z below.

> +C, M's first parent is C, so by pushing M to advance the central
> +repository, you made X-Y-Z a side branch, not on the trunk.
> +
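
To make the reversal concrete, here is a minimal sketch that rebuilds
the picture above in a throwaway repository.  The `commit` helper, the
temporary directory and the local branch `theirs` (standing in for
what was fetched from the central repository) are assumptions made
for this illustration; they are not part of the howto.

```shell
set -e
# Throwaway identity so the commits do not depend on local configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: record one commit whose subject is "$1".
commit () {
    echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"
}

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo
commit A
git branch theirs                         # the coworker also starts at A
commit B; commit C                        # your work on master
git checkout -q theirs
commit X; commit Y; commit Z              # the coworker's work
git checkout -q master

# "fetch and then merge" while still on your own branch, i.e. at C.
git merge -q -m M theirs

# Walk the first-parent chain from M, newest commit first.
trunk=$(git log --first-parent --format=%s | paste -sd' ' -)
echo "$trunk"
```

The first-parent walk visits M, C, B and A only, so X, Y and Z no
longer appear on what the project would call the trunk.
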
> +You would rather want to have a history of this shape:
> +
> +------------
> +    ---o---o---A---X---Y---Z---M'
> +                \             /
> +                 B-----------C
> +------------
> +
> +so that in the first-parent chain, it is clear that the project
> +first did X and then Y and then Z and merged a change that consists
> +of two commits B and C that achieves a single goal.  You may have
> +worked on fixing the bug #12345 with these two patches, and the
> +merge M' with swapped parents can say in its log message "Merge
> +'fix-bug-12345'".

Add something like this at the end of that paragraph:

    Having a way to tell "git pull" to create a merge but record the
    parents in reverse order may be a way to do so.

It was obvious to the original questioner, who did read the recent
"pull is mostly evil" thread, which is why I did not say it in the
original, but it is necessary for readers of this document who
lack that context.  Otherwise it would not be apparent to them
what "swapping the merge order" below refers to.
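
For illustration only, the swapped-parent shape can already be
produced with existing commands by merging in the other direction.
This is a sketch, not the "git pull" mode mentioned above; the
`commit` helper, the temporary directory and the local branch
`theirs` (standing in for the fetched upstream tip) are assumptions
made for the demonstration.

```shell
set -e
# Throwaway identity so the commits do not depend on local configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: record one commit whose subject is "$1".
commit () {
    echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"
}

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo
commit A
git branch theirs
commit B; commit C                        # your work, built on A
git checkout -q theirs
commit X; commit Y; commit Z              # upstream work, built on A
git checkout -q master

# Park your work under a topic name, move master to the upstream tip,
# then merge the topic so that the upstream tip becomes the first parent.
git branch fix-bug-12345
git reset -q --hard theirs
git merge -q --no-ff -m "Merge 'fix-bug-12345'" fix-bug-12345

first=$(git log -1 --format=%s HEAD^1)
second=$(git log -1 --format=%s HEAD^2)
echo "first parent: $first, second parent: $second"
```

Here the first parent of the merge is Z and the second is C, which is
the M' shape in the picture.
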

> +Note that I said "achieves a single goal" above, because this is
> +important.  "swapping the merge order" only covers a special case
> +where the project does not care too much about having unrelated
> +things done on a single merge but cares a lot about first-parent
> +chain.
> +
> +There are multiple schools of thought about the "trunk" management.

I have not tried to format it myself, but does the following 1. 2. 3.
(all pre-indented already) format correctly when passed to AsciiDoc?
I suspect the text may all be shown in fixed width, which would not be
what we want.

> + 1. Some projects want to keep a completely linear history without
> +    any merges.  Obviously, swapping the merge order would not help
> +    their taste.  You would need to flatten your history on top of

s/help/match/ or something.

> +    the updated upstream to result in a history of this shape
> +    instead:
> ++
> +------------
> +    ---o---o---A---X---Y---Z---B---C
> +------------
> ++
> +    with "git pull --rebase" or something.

Use `git pull --rebase` (i.e. not dq but bq) here, too.
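
As a rough sketch of the flattening described in item 1, the script
below replays B and C on top of Z in a throwaway repository.  It uses
plain `git rebase` against a local branch `theirs`, which stands in
for the rebase half of `git pull --rebase`; that branch, the `commit`
helper and the temporary directory are assumptions for illustration.

```shell
set -e
# Throwaway identity so the commits do not depend on local configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: record one commit whose subject is "$1".
commit () {
    echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"
}

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo
commit A
git branch theirs
commit B; commit C                        # your work, built on A
git checkout -q theirs
commit X; commit Y; commit Z              # upstream work, built on A
git checkout -q master

# Replay B and C on top of the upstream tip instead of merging.
git rebase -q theirs

history=$(git log --format=%s | paste -sd' ' -)
merges=$(git rev-list --merges --count HEAD)
echo "$history (merge commits: $merges)"
```

The resulting history is linear, newest first, with no merge commit
in it.
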

> + 2. Some projects tolerate merges in their history, but do not worry
> +    too much about the first-parent order, and allows fast-forward
> +    merges.  To them, swapping the merge order does not hurt, but
> +    it is unnecessary.
> +
> + 3. Some projects want each commit on the "trunk" to do one single
> +    thing.  The output of "git log --first-parent" in such a project
> +    would show either a merge of a side branch that completes a
> +    single theme, or a single commit that completes a single theme
> +    by itself.  If your two commits B and C (or they may even be two
> +    groups of commits) were solving two independent issues, then the
> +    merge M' we made in the earlier example by swapping the merge
> +    order is still not up to the project standard.  It merges two
> +    unrelated efforts B and C at the same time.

Likewise for `git log --first-parent`, `B`, `C`, and `M'`; they all
want to be typeset in fixed width.

> +For projects in the last category (git itself is one of them),

s/git/Git/ as we seem to prefer that spelling these days when
talking about the project.

> +individual developers would want to prepare a history more like
> +this:
> +
> +------------
> +                 C0--C1--C2     topic-c
> +                /
> +    ---o---o---A                master
> +                \
> +                 B0--B1--B2     topic-b
> +------------
> +
> +That is, keeping separate topics on separate branches, perhaps like
> +so:
> +
> +------------
> +    $ git clone $URL work && cd work
> +    $ git checkout -b topic-b master
> +    $ ... work to create B0, B1 and B2 to complete one theme
> +    $ git checkout -b topic-c master
> +    $ ... same for the theme of topic-c
> +------------
> +
> +And then
> +
> +------------
> +    $ git checkout master
> +    $ git pull --ff-only
> +------------
> +
> +would grab X, Y and Z from the upstream and advance your master
> +branch:
> +
> +------------
> +                 C0--C1--C2
> +                /
> +    ---o---o---A---X---Y---Z
> +                \
> +                 B0--B1--B2
> +------------

We may want to have the topic-c/master/topic-b branch labels like
the previous picture here.

> +And then you would merge these two branches separately:
> +
> +------------
> +    $ git merge topic-b
> +    $ git merge topic-c
> +------------
> +
> +to result in
> +
> +------------
> +                 C0--C1---------C2
> +                /                 \
> +    ---o---o---A---X---Y---Z---M---N
> +                \             /
> +                 B0--B1-----B2
> +------------
> +
> +and push it back to the central repository.
> +
> +It is very much possible that while you are merging topic-b and
> +topic-c, somebody again advanced the history in the central
> +repository to put W on top of Z, and make your "git push" fail.
> +
> +In such a case, you would rewind to discard M and N, update the tip
> +of your 'master' again and redo the two merges:
> +
> +------------
> +    $ git reset --hard origin/master
> +    $ git pull --ff-only
> +    $ git merge topic-b
> +    $ git merge topic-c
> +------------
> +
> +------------
> +                 C0--C1--------------C2
> +                /                     \
> +    ---o---o---A---X---Y---Z---W---M'--N
> +                \                 /
> +                 B0--B1---------B2
> +------------

I failed to do so in the original, but the final merge should be
labeled as N' (with prime), just like M turned into M', to denote
that they are different commits from their counterparts in the
original history.
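
A minimal sketch of the whole topic-branch sequence, including the
rewind and the redone merges, can be run in a throwaway repository.
The `commit` helper, the temporary directory and the local branch
`theirs` (standing in for `origin/master`, so a single `git reset`
replaces the reset-then-pull pair in the howto) are assumptions made
for this illustration.

```shell
set -e
# Throwaway identity so the commits do not depend on local configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: record one commit whose subject is "$1".
commit () {
    echo "$1" >"$1.txt" && git add "$1.txt" && git commit -q -m "$1"
}

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo
commit A
git branch theirs

# Two independent topics, each forked from A.
git checkout -q -b topic-b master
commit B0; commit B1; commit B2
git checkout -q -b topic-c master
commit C0; commit C1; commit C2

# Upstream gains X, Y and Z; master fast-forwards, then merges each topic.
git checkout -q theirs
commit X; commit Y; commit Z
git checkout -q master
git merge -q --ff-only theirs
git merge -q -m M topic-b
git merge -q -m N topic-c

# Upstream gains W before the push, so rewind and redo both merges.
git checkout -q theirs
commit W
git checkout -q master
git reset -q --hard theirs                # discards M and N
git merge -q -m "M'" topic-b
git merge -q -m "N'" topic-c

trunk=$(git log --first-parent --format=%s | paste -sd' ' -)
echo "$trunk"
```

The first-parent chain then reads N', M', W, Z, Y, X, A, newest
first, with each topic recorded as one merge on the trunk.
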

Thanks.
