Could this be done simpler?

git@vger.kernel.org mailing list mirror (one of many)

* Could this be done simpler?
@ 2009-06-24 21:35 Linus Torvalds
  2009-06-25  1:04 ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2009-06-24 21:35 UTC (permalink / raw
  To: Junio C Hamano, Git Mailing List

Ok, so I have a practice of occasionally doing octopus merges when I have 
two branches with trivial fixes from the same person.

That all works fine when they use the "multiple branches in the same 
repository" approach (eg x86 "tip" tree), but other people tend to prefer 
to use multiple repositories for different features, rather than branches. 
And git generally lets you do things either way with no real difference.

But for the octopus case, it does make a difference. You can easily make 
octopus merges only from one repository.

Which is kind of sad. 

So I did kernel commit c6223048259006759237d826219f0fa4f312fb47 by
basically doing the "git pull" logic by hand, and while this was just a
trial and maybe I'll never feel the urge to do it again, I'm wondering if
maybe we should make it easier to do.

Right now the "git pull" syntax is

	git pull <repo> <branch>*

and you cannot specify multiple repositories, only multiple branches.

But at the same time, it should be pretty unambiguous whether an argument 
is a repository or a branch (':' in a remote repository, or "/" or ".." at 
the beginning of a local one - all invalid in branch names).

So it _should_ be syntactically unambiguous to allow

	git pull (<repo> <branch>*)+

for the octopus case. Hmm?
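
A minimal shell sketch of that disambiguation rule, assuming the heuristic
exactly as stated above (a ':' anywhere, or a leading "/" or ".."). The
function name is_repo_arg and the sample arguments are made up for
illustration; this is not how git-pull parses its command line, and it does
not handle refspecs of the form src:dst, which also contain a colon.

```shell
# Hypothetical helper: return 0 if the argument looks like a repository,
# 1 if it should be treated as a branch name.
is_repo_arg() {
    case "$1" in
        *:*|/*|..*) return 0 ;;   # remote URL, absolute path, or ../relative path
        *)          return 1 ;;   # anything else: assume a branch name
    esac
}

# Walk a made-up argument list and label each word.
for arg in git://example.org/a.git for-linus ../local-repo topic; do
    if is_repo_arg "$arg"; then
        echo "repo:   $arg"
    else
        echo "branch: $arg"
    fi
done
```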

		Linus


* Re: Could this be done simpler?
  2009-06-24 21:35 Could this be done simpler? Linus Torvalds
@ 2009-06-25  1:04 ` Junio C Hamano
  2009-06-25 14:33   ` Randal L. Schwartz
  2009-06-25 22:02   ` Christian Couder
  0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25  1:04 UTC (permalink / raw
  To: Linus Torvalds; +Cc: Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Ok, so I have a practice of occasionally doing octopus merges when I have 
> two branches with trivial fixes from the same person.
>
> That all works fine when they use the "multiple branches in the same 
> repository" approach (eg x86 "tip" tree), but other people tend to prefer 
> to use multiple repositories for different features, rather than branches. 
> And git generally lets you do things either way with no real difference.
>
> But for the octopus case, it does make a difference. You can easily make 
> octopus merges only from one repository.
>
> Which is kind of sad. 
>
> So I did kernel commit c6223048259006759237d826219f0fa4f312fb47 by 
> basically doing the 'git pull" logic by hand, and while this was just a 
> trial and maybe I'll never feel the urge to do it again, I'm wondering it 
> maybe we should make it easier to do.

Every once in a while I have this urge to see how it feels to be Linus
by pretending to be him, trying what he did.

(1) So where is he?

    $ git pull
    ...
    From git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
       f234012..28d0325  master     -> linus
     * [new tag]         v2.6.31-rc1 -> v2.6.31-rc1
    Updating f234012..28d0325
    Fast forward
     ...

(2) Let's pretend to be Linus, just before he made this merge.

    $ git checkout c62230^

(3) Let's see what he did with that thing.

    $ git show c62230
    commit c6223048259006759237d826219f0fa4f312fb47
    Merge: bd453cd d5bb68a 3a6a6c1
    Author: Linus Torvalds <torvalds@linux-foundation.org>
    Date:   Wed Jun 24 14:17:14 2009 -0700

        Merge branches 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/{vfs-2.6,audit-current}

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
          another race fix in jfs_check_acl()
          Get "no acls for this inode" right, fix shmem breakage
          inline functions left without protection of ifdef (acl)

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
          audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL

    Ah, so we know the two repositories and branches involved.

(4) Let's pretend to be Linus.  Fetch the first branch and drop the
    necessary information in FETCH_HEAD.

    $ git fetch \
      git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 \
      for-linus

(5) Continue pretending to be Linus, complete the octopus.  The key is to
    let the "fetch" phase of this append to FETCH_HEAD instead of
    replacing it.

    $ git pull --append \
      git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
      for-linus

(6) Did I succeed?  Let's see.

    $ git diff c62230

    Yay, identical tree.

(7) How does the log message look?

    $ git show
    commit cb1e4198421091ea5844d93624d5d5499537dbe0
    Merge: bd453cd d5bb68a 3a6a6c1
    Author: Junio C Hamano <gitster@pobox.com>
    Date:   Wed Jun 24 17:45:09 2009 -0700

        Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6; branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current into HEAD

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
          another race fix in jfs_check_acl()
          Get "no acls for this inode" right, fix shmem breakage
          inline functions left without protection of ifdef (acl)

        * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
          audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL

    Hmm, Linus's combined notation on the summary line that uses {} is
    much nicer.

> Right now the "git pull" syntax is
>
> 	git pull <repo> <branch>*
>
> and you cannot specify multiple repositories, only multiple branches.
>
> But at the same time, it should be pretty unambiguous whether an argument 
> is a repository or a branch (':' in a remote repository, or "/" or ".." at 
> the beginning of a local one - all invalid in branch names).
>
> So it _should_ be syntactically unambiguous to allow
>
> 	git pull (<repo> <branch>*)+
>
> for the octopus case. Hmm?

Strictly speaking, you are not quite correct.  Arguments after <repo> can
be storing refspecs, and those do contain a colon.

Conclusion.  git-fmt-merge-msg may need to learn the trick of using {}.
No other changes needed.
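
A rough shell sketch of what such a {} summary could look like, assuming
the simple case where every repository URL shares the same parent path.
The function name brace_join is hypothetical and this is not the
git-fmt-merge-msg implementation; it only illustrates the notation.

```shell
# Hypothetical helper: combine URLs that share a parent path into
# "parent/{name1,name2}".  Falls back to a plain list otherwise.
brace_join() {
    prefix=${1%/*}            # parent path of the first URL
    names=
    for url in "$@"; do
        if [ "${url%/*}" != "$prefix" ]; then
            # No shared parent: print the URLs unchanged.
            printf '%s\n' "$*"
            return
        fi
        # Append the last path component, comma-separated.
        names=${names:+$names,}${url##*/}
    done
    printf '%s/{%s}\n' "$prefix" "$names"
}

brace_join \
    git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 \
    git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
```

For the two repositories in this thread, this prints the same {}-style
URL that appears in the summary line of c62230.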

Side note.

People sometimes say, and I am certain I agreed with them on more than one
occasion, that Octopus merges hurt bisectability and do not have much value
in real life.  I've always thought this bisectability issue was a downside of
Octopus merges, but now that I think about it, perhaps "git bisect" can be
taught to dynamically decompose an Octopus merge into a sequence of
two-head virtual merges while bisecting.  We strongly discourage and do
not allow conflicting Octopus merges, so when you need to bisect a history
with an Octopus that looks like this:

    ---o---A
            \    
  ---o---B---M---o
            /    
    ---o---C

it should be able to mechanically decompose it, without conflicts, into


    ---o---A
            \    
  ---o---B---M1--M2--o
                /    
        ---o---C

where the tree of M and the tree of M2 are identical.
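
The claim that M and M2 end up with the same tree can be checked by hand in
a throwaway repository. The following is only a sketch: it assumes git is
installed, uses made-up file and branch names, and relies on the three
branches touching different files so that nothing conflicts.

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
# Identity used only inside this scratch repository.
git config user.name "Example"
git config user.email "example@example.invalid"

# Hypothetical helper: create one file and commit it.
commit_file() {
    echo "$1" >"$1"
    git add "$1"
    git commit -q -m "add $1"
}

commit_file base.txt
git branch A
git branch C
git checkout -q -b B
commit_file b.txt
git checkout -q A
commit_file a.txt
git checkout -q C
commit_file c.txt

# M: a real Octopus merge of A and C into B.
git checkout -q -b octopus B
git merge -q -m "M" A C

# M1 and M2: the same result as two ordinary two-head merges.
git checkout -q -b stepwise B
git merge -q -m "M1" A
git merge -q -m "M2" C

# Compare the resulting trees, not the commits.
m=$(git rev-parse 'octopus^{tree}')
m2=$(git rev-parse 'stepwise^{tree}')
if [ "$m" = "$m2" ]; then echo "trees identical"; fi
```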


* Re: Could this be done simpler?
  2009-06-25  1:04 ` Junio C Hamano
@ 2009-06-25 14:33   ` Randal L. Schwartz
  2009-06-25 16:32     ` Matthias Andree
  2009-06-25 17:19     ` Michael J Gruber
  2009-06-25 22:02   ` Christian Couder
  1 sibling, 2 replies; 15+ messages in thread
From: Randal L. Schwartz @ 2009-06-25 14:33 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

>>>>> "Junio" == Junio C Hamano <gitster@pobox.com> writes:

Junio> (5) Continue pretending to be Linus, complete the octopus.  The key is to
Junio>     let the "fetch" phase of this to append to the FETCH_HEAD, not
Junio>     replacing it.

Junio>     $ git pull --append \
Junio>       git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
Junio>       for-linus

The relatively current doc of "--append" looks like this:

       -a, --append
           Append ref names and object names of fetched refs to the existing
           contents of will be overwritten.

I read this three times, and still don't know what it means (and it doesn't
even scan well as English), so I would have never known to use this strategy.
Can you explain this more in detail, or point at something in the mailing list
that does?

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


* Re: Could this be done simpler?
  2009-06-25 14:33   ` Randal L. Schwartz
@ 2009-06-25 16:32     ` Matthias Andree
  2009-06-25 17:25       ` Junio C Hamano
  2009-06-25 18:32       ` Junio C Hamano
  2009-06-25 17:19     ` Michael J Gruber
  1 sibling, 2 replies; 15+ messages in thread
From: Matthias Andree @ 2009-06-25 16:32 UTC (permalink / raw
  To: Randal L. Schwartz; +Cc: Junio C Hamano, Linus Torvalds, Git Mailing List

Randal L. Schwartz schrieb:
>>>>>> "Junio" == Junio C Hamano <gitster@pobox.com> writes:
> 
> Junio> (5) Continue pretending to be Linus, complete the octopus.  The key is to
> Junio>     let the "fetch" phase of this to append to the FETCH_HEAD, not
> Junio>     replacing it.
> 
> Junio>     $ git pull --append \
> Junio>       git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
> Junio>       for-linus
> 
> The relatively current doc of "--append" looks like this:
> 
>        -a, --append
>            Append ref names and object names of fetched refs to the existing
>            contents of will be overwritten.
> 
> I read this three times, and still don't know what it means (and it doesn't
> even scan well as English), so I would have never known to use this strategy.
> Can you explain this more in detail, or point at something in the mailing list
> that does?

Greetings,

If I may: So the existing description is incomprehensible. I sort of believed I
understood it, but apparently I didn't understand enough of it.

Could we ditch the current git-pull --append description? Could somebody then
please rewrite this paragraph? This somebody must have completely understood

(1) what this feature is good for (practically speaking)

(2) how it works (technically speaking, to provide reference information)

That would be much more useful, and the use would last longer :-)

I don't dare ask Junio directly.

However, it appears to me that git-pull already does most of what Linus needs
and could just take some final cosmetic touch-ups WRT logs. So could somebody
please rewrite this?

And if I may be so bold: Please rewrite before somebody starts polishing the
bisect facilities WRT octopus merges. These seem unrelated, as in: you don't
need to make bisect more convenient to be able to fix the description of
git-pull --append...

Thanks for not slashing me to pieces. 8-)

Best regards
MA


* Re: Could this be done simpler?
  2009-06-25 14:33   ` Randal L. Schwartz
  2009-06-25 16:32     ` Matthias Andree
@ 2009-06-25 17:19     ` Michael J Gruber
  1 sibling, 0 replies; 15+ messages in thread
From: Michael J Gruber @ 2009-06-25 17:19 UTC (permalink / raw
  To: Randal L. Schwartz; +Cc: Junio C Hamano, Linus Torvalds, Git Mailing List

Randal L. Schwartz venit, vidit, dixit 25.06.2009 16:33:
>>>>>> "Junio" == Junio C Hamano <gitster@pobox.com> writes:
> 
> Junio> (5) Continue pretending to be Linus, complete the octopus.  The key is to
> Junio>     let the "fetch" phase of this to append to the FETCH_HEAD, not
> Junio>     replacing it.
> 
> Junio>     $ git pull --append \
> Junio>       git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current \
> Junio>       for-linus
> 
> The relatively current doc of "--append" looks like this:
> 
>        -a, --append
>            Append ref names and object names of fetched refs to the existing
>            contents of will be overwritten.
> 
> I read this three times, and still don't know what it means (and it doesn't
> even scan well as English), so I would have never known to use this strategy.
> Can you explain this more in detail, or point at something in the mailing list
> that does?

Uhm,
my version of git-fetch.1 has

       -a, --append
           Append ref names and object names of fetched refs to the
           existing contents of .git/FETCH_HEAD. Without this option
           old data in .git/FETCH_HEAD will be overwritten.

That at least scans better in English. It does not make it very clear
what the consequences are, though.

Michael


* Re: Could this be done simpler?
  2009-06-25 16:32     ` Matthias Andree
@ 2009-06-25 17:25       ` Junio C Hamano
  2009-06-25 21:54         ` Matthias Andree
  2009-06-25 18:32       ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 17:25 UTC (permalink / raw
  To: Matthias Andree
  Cc: Randal L. Schwartz, Junio C Hamano, Linus Torvalds,
	Git Mailing List

Matthias Andree <matthias.andree@gmx.de> writes:

> Could we ditch the current git-pull --append description? Can then please
> somebody rewrite this paragraph? This somebody must have completely understood

> (1) what this feature is good for (practically speaking)
>
> (2) how it works (technically speaking, to provide reference information)
>
> That would be much more useful, and the use would last longer :-)
>
> I don't dare ask Junio directly.

But if you run blame and mailing list archive search, you would discover
that "fetch --append" was my invention.  After all, the entire Octopus
idea originates from me at 211232b (Octopus merge of the following five
patches., 2005-05-05).  It is interesting to realize that it was actually
a Pentapus made on the day of 5/5/5 ;-)

I thought I was going to take the blame for the incomprehensible documentation
and put it down to my being a non-native speaker/writer of English, but the
situation is a bit funny.  Documentation/fetch-options.txt says this:

    -a::
    --append::
            Append ref names and object names of fetched refs to the
            existing contents of `.git/FETCH_HEAD`.  Without this
            option old data in `.git/FETCH_HEAD` will be overwritten.

Perhaps there was a cut&paste error?  I haven't looked.

Now answers to (1) and (2).

 (1) The feature was designed exactly for the use case Linus described.

 (2) "git fetch" leaves a list of <commit object, repo, branch, flag> for
     each ref fetched from the repository in .git/FETCH_HEAD, where the
     flag tells if it is meant for merging.  "git pull" runs "git fetch",
     then reads from this file to learn which ones to pass to "git merge".
     The information is also given to "git fmt-merge-msg" to come up with
     the message.

     Usually "git fetch" first empties the existing contents of the file
     and stores the list of refs it fetched.  With --append, it doesn't
     empty the file; refs fetched by the previous invocation of "git
     fetch" will be kept and the refs it fetched are appended.

     So:

	$ git fetch one a
	$ git fetch --append two b
	$ git pull --append three c

     will end up having all the three refs from different repositories in
     .git/FETCH_HEAD.  I.e.

	branch a, from repo one, to be merged
	branch b, from repo two, to be merged
	branch c, from repo three, to be merged

     when "git fetch" run by the last "git pull" returns.  "git pull"
     reads the file and learns what to give to "git fmt-merge-msg" (to come
     up with the message for the merge commit) and "git merge" (to create
     the merge commit).
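
The same pattern generalizes to any number of repositories: plain fetch
for the first, "fetch --append" for the ones in between, and "pull" for
the last. The sketch below only prints the commands instead of running
them; octopus_plan is a made-up name, not an existing git command, and
the repo/branch pairs are the placeholders used above.

```shell
# Hypothetical helper: given "repo branch" pairs, print the fetch/pull
# sequence that accumulates all of them in .git/FETCH_HEAD.
octopus_plan() {
    first=yes
    while [ "$#" -ge 2 ]; do
        repo=$1
        branch=$2
        shift 2
        # Every command after the first must append instead of overwrite.
        if [ "$first" = yes ]; then opt=""; else opt=" --append"; fi
        # The last pair uses "git pull", which fetches and then merges
        # everything recorded for merging in .git/FETCH_HEAD.
        if [ "$#" -lt 2 ]; then cmd=pull; else cmd=fetch; fi
        echo "git $cmd$opt $repo $branch"
        first=no
    done
}

octopus_plan one a two b three c
```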


* Re: Could this be done simpler?
  2009-06-25 16:32     ` Matthias Andree
  2009-06-25 17:25       ` Junio C Hamano
@ 2009-06-25 18:32       ` Junio C Hamano
  1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 18:32 UTC (permalink / raw
  To: Matthias Andree
  Cc: Randal L. Schwartz, Christian Couder, Linus Torvalds,
	Git Mailing List

Matthias Andree <matthias.andree@gmx.de> writes:

> And if I may be so bold: Please rewrite before somebody starts polishing the
> bisect facilities WRT octopus merges. These seem unrelated, as in: you don't
> need to make bisect more convenient to be able to fix the description of
> git-pull --append...

Let's have a refresher course of how bisection works with a history with
merges.

Assume that you have this history (time flows from left to right, recent
commits are known to be bad, old commits are known to be good).

                       o---o---o---o---A
                      /                 \
  ---o---o---o---o---F---o---o---o---B---M

In real life, you would start from a history with more commits on top of M
and only know that the tip of that sequence is bad, but for brevity, let's
assume we bisected and already know M is bad.

If B is good, the breakage was either introduced at M, or was on the side
branch leading to A, but not older than F where A and B forked from.

    Side note.  As in all other discussion in this message, remember
    that bisect is for finding a _single_ breakage that was left
    unfixed until the tip of the history being bisected.  "B is good"
    means "the _single_ breakage is not in a commit that would
    affect B, i.e. in B's ancestors".

If B is bad, on the other hand, the branch leading to A since the fork
point F is exonerated and we do not have to look at the side branch that
leads to A.

Which means that by seeing that the tip of one merged branch is good, you
can see that everything before the merge base is good and you only need to
look at _the other_ branch.

What happens if M is an Octopus?

                       o---o---o---o---A
                      /                 \
  ---o---o---o---o---F---o---o---o---B---M
                  \       \             /|
                   \       o---o---o---C |
                    \                    |
                     o---o---o---o---o---D

If B is good, you still need to look at histories leading to A, C, and D
individually.  Of course if B is bad, then you do not have to look at
the histories leading to A, C and D from their respective fork points,
but you still do have to look at the shared past.

But we could optimize further.  After knowing M, an Octopus merge, is bad,
when we are tempted to test one of the tips of the branches that were
merged (say B), we can instead give a tree that is a result of merging
only A and B (i.e. excluding C and D) for testing.  If it is good, then
the histories leading to both A and B are good, and we only need to check
side branches leading to C and D since they forked from the shared common
history.  If the combination of A and B is bad, on the other hand, then we
do not have to check branch histories leading to C or D.

Doing so essentially shifts the balance between what happens if a single
test turns out to be good or bad.  If we test the tip of the branch, and
if it is bad, we will eliminate other forks (but still need to test the
shared history).  If it is good, we only eliminate that particular branch
and shared history, but all the other forks remain suspect.  So it is a
tradeoff between:

 - the size of all the other side branches since they forked == number of
   commits we do not have to test if this round says "bad";

 - the size of this side branch and the shared history == number of
   commits we do not have to test if this round says "good";

The current bisect algorithm makes this tradeoff, by computing the above
two numbers and finding the point that makes them closest to each other.
It however does not let you test two commits at the same time (i.e.
testing the merge of A and B in the above example) which could make the
tradeoff even more efficient.
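
As a toy illustration of that balancing rule, the sketch below takes
candidates written as name:eliminated-if-good:eliminated-if-bad and picks
the one whose two counts are closest to each other. The function name
pick_bisect_point and the numbers are invented for the example; the actual
"git bisect" code is not structured like this.

```shell
# Hypothetical helper: choose the candidate whose "good" and "bad"
# elimination counts are closest to each other.
pick_bisect_point() {
    best=
    best_diff=
    for entry in "$@"; do
        name=${entry%%:*}          # text before the first colon
        counts=${entry#*:}         # "if_good:if_bad"
        if_good=${counts%%:*}
        if_bad=${counts#*:}
        diff=$((if_good - if_bad))
        if [ "$diff" -lt 0 ]; then diff=$((0 - diff)); fi
        # Keep the first candidate with the smallest absolute difference.
        if [ -z "$best" ] || [ "$diff" -lt "$best_diff" ]; then
            best=$name
            best_diff=$diff
        fi
    done
    echo "$best"
}

# Made-up counts for three candidate commits.
pick_bisect_point A:4:12 B:9:7 C:13:3
```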

I see there is another window for optimization we could make from the
above observation.  Making the number of commits eliminated when the test
is "good" and "bad" as close to equal as possible is the best strategy
when the tested commit has a 50-50 chance of being "good" or "bad".  If we
somehow know that the tested commit is likely to be "bad", we would want
to maximize the number of commits eliminated when the commit is indeed
"bad" (and vice versa).

I do not see an easy way to exploit this window offhand, though...


* Re: Could this be done simpler?
  2009-06-25 17:25       ` Junio C Hamano
@ 2009-06-25 21:54         ` Matthias Andree
  2009-06-27  0:26           ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Matthias Andree @ 2009-06-25 21:54 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Randal L. Schwartz, Linus Torvalds, Git Mailing List

Am 25.06.2009, 19:25 Uhr, schrieb Junio C Hamano <gitster@pobox.com>:

> Matthias Andree <matthias.andree@gmx.de> writes:
>
>> Could we ditch the current git-pull --append description? Can then  
>> please somebody rewrite this paragraph? This somebody must have  
>> completely understood
>
>> (1) what this feature is good for (practically speaking)
>>
>> (2) how it works (technically speaking, to provide reference  
>> information)
>>
>> That would be much more useful, and the use would last longer :-)
>>
>> I don't dare ask Junio directly.
>
> But if you run blame and mailing list archive search, you would discover
> that "fetch --append" was my invention.  After all, the entire Octopus
> idea originates from me at 211232b (Octopus merge of the following five
> patches., 2005-05-05).  It is interesting to realize that it was actually
> a Pentapus made on the day of 5/5/5 ;-)

Fair enough, but I hadn't looked at who wrote it because more people than
just the original author may be able to write it. In fact, I've drifted
into doing that. Suggestions are at the very end if you're not interested in
the rationale but only in the solution. :-)

I've seen your later message on the octopus merges, and therefore suggest  
that we get this --append stuff documented first.

> I thought I was going to take blame on the incomprehensive documentation
> and pass it on to me being non-native speaker/writer of English, but the
> situation is bit funny.  Documentation/fetch-options.txt says this:

Neither am I a native writer, but why bother… it's more important to write
things clearly than to polish things, and writing something correctly from
the beginning is a very hard problem. Writing something that works, and
polishing it later, are two simpler problems.
Language isn't the concern here, but understanding the feature is.

>     -a::
>     --append::
>             Append ref names and object names of fetched refs to the
>             existing contents of `.git/FETCH_HEAD`.  Without this
>             option old data in `.git/FETCH_HEAD` will be overwritten.
>
> Perhaps there has a cut&paste error?  I haven't looked.

Never mind, that's irrelevant. The key problem is that Linus could not tell
from this description that it was the feature he was looking for. So let's
fix that and make the documentation clearer. Particularly: let's fix the
git-fetch manpage, let's untangle it from git-pull, and let git-pull
reference the git-fetch description rather than copy it.

This quoted section "Append ref... overwritten." explains how the beast  
works technically. So what? What is it good for?  What can I do with it?   
You made FETCH_HEAD the focus point of the description, but that's not the  
point. (It may be the point of the implementation, but I don't care).

In order for a reader to understand this feature from the docs, he must
know what FETCH_HEAD is good for in the whole git context (as a
prerequisite), but that is just a diversion, not the key point.

The point is that you can mark several branches for merge, or in other  
words, accumulate other tips/heads that you want to merge, before doing  
the merge. It is useful for merges of more than one branch at a time,  
"octopus merges" or similar.

Preface: the next comments don't mean to criticize what you are  
presenting, but just to select which should and which shouldn't go in the  
reference manual, and if yes, where. I think we've got the whole  
description organized backwards, let's fix that, too.

>  (2) "git fetch" leaves list of <commit object, repo, branch, flag> for
>      each ref fetched from repository in .git/FETCH_HEAD, where flag  
> tells
>      if it is meant for merging.  "git pull" runs "git fetch", reads from
>      this file to learn which ones to pass to "git merge".  The
>      information also is given to "git fmt-merge-msg" to come up with the
>      message.

This is a technical detail - this belongs into a separate FETCH_HEAD  
document in section 5 (file formats).

>      Usually "git fetch" first empties the existing contents of the file
>      and stores the list of refs it fetched.  With --append, it doesn't
>      empty the file; refs fetched by the previous invocation of "git
>      fetch" will be kept and the refs it fetched are appenede.

OK. Also for later, so I know how it differs from regular behaviour.

So, this is technically more comprehensive, but that leaves the old  
question unanswered - what is FETCH_HEAD good for?

Let's change roles (or perspective) for a moment, for the sake of clarity  
and usability: I am just a Git user. I don't want to hack Git. I couldn't  
care less about implementation details such as FETCH_HEAD, I only need to  
know how I can tell Git to merge branches foo, bar, baz into master in one  
single merge.

>      So:
>
> 	$ git fetch one a
>       $ git fetch --append two b
>       $ git pull --apend three c
>
>      will end up having all the three refs from different repositories in
>      .git/FETCH_HEAD.  I.e.
>
> 	branch a, from repo one, to be merged
> 	branch b, from repo two, to be merged
> 	branch c, from repo three, to be merged
>
>      when "git fetch" run by the the last "git pull" returns.  "git pull"
>      reads the file and learn what to give to "git fmt-merge-msg" (to  
> come
>      up with the message for the merge commit) and "git merge" (to create
>      the merge commit).

Let's leave git pull out of this picture. If you mention it, you must  
explain the interaction between pull and fetch, but you don't want this  
here. You only want to explain the interaction between fetching more than  
one branch and merging all of them.

"git pull" (from a bird's-eye view) is just a short-cut for "git fetch
something" and "git merge with somehow configured branch" (somehow =
implicitly through setting up tracking branches or clone, or explicitly
through git branch or git remote; let's leave this aside).

So, here's my first stab at it (just content, not ASCIIDOC markup, as I'm
not fluent in ASCIIDOC and you can easily do that when merging later);
feel free to correct, edit, rewrite, or amend it...

I'm not sure

FETCH_HEAD(5)
-------------------------------------------------
This file in the git directory records which heads have been downloaded,
from where, and for what purpose. Each line in this file is one
TAB-delimited record with three fields. From left to right, these fields
contain:

1 - the commit of the remote head
2 - "not-for-merge" if the branch is not meant to be merged, otherwise,  
this field remains empty
3 - branch 'xxx' of UUU, where xxx and UUU are the remote repository's  
refname and base URL, respectively.

This file is written by git-fetch and used by git-merge.
-------------------------------------------------
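
To make the three-field layout concrete, here is a small sketch that
builds a sample file body and lists only the entries that would be merged.
The object names are placeholders rather than real commits, the repo and
branch names are invented, and the awk filter is just one possible way to
read the format as described in this draft.

```shell
# Sample FETCH_HEAD content: three TAB-delimited records.
# The 40-character object names are placeholders, not real commits.
fetch_head=$(printf '%s\t%s\t%s\n' \
    1111111111111111111111111111111111111111 "" "branch 'a' of one" \
    2222222222222222222222222222222222222222 "not-for-merge" "branch 'x' of one" \
    3333333333333333333333333333333333333333 "" "branch 'b' of two")

# Keep the records whose second field is empty, i.e. those meant for merging,
# and print the object name followed by the description.
for_merge=$(printf '%s\n' "$fetch_head" |
    awk -F '\t' '$2 == "" { print $1, $3 }')

printf '%s\n' "$for_merge"
```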

git-fetch(1)
-------------------------------------------------
...
      -a::
      --append::
	This option allows you to fetch and accumulate multiple remote refs for
future merging.  Normally, git-fetch records the latest fetch for a later
merge by writing the fetched heads to .git/FETCH_HEAD (there can be multiple
recorded heads in FETCH_HEAD although the name suggests there is just one).
The --append option lets git-fetch keep, rather than delete, prior contents of
the file.  This can be useful when consolidating multiple topic branches in
one single merge (a so-called octopus merge, see git-merge(1)). Example:

> 	$ git fetch one a
>       $ git fetch --append two b
>       $ git pull --append three c

(git pull first runs git fetch --append three c, and then git merge with  
all remotes that have been recorded for merging in .git/FETCH_HEAD).

...
You can use git-pull as short-cut for the all too common  
"git-fetch"-"git-merge" sequence.
-------------------------------------------------

NOTE: git-fetch accepts command lines without refspec. These mark fetched
heads as "not-for-merge". IOW, a refspec is needed for heads to be marked
as for-merge. I haven't found this documented in git-fetch.

-- 
Matthias Andree


* Re: Could this be done simpler?
  2009-06-25  1:04 ` Junio C Hamano
  2009-06-25 14:33   ` Randal L. Schwartz
@ 2009-06-25 22:02   ` Christian Couder
  2009-06-25 22:23     ` Christian Couder
  1 sibling, 1 reply; 15+ messages in thread
From: Christian Couder @ 2009-06-25 22:02 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

On Thursday 25 June 2009, Junio C Hamano wrote:
> Side note.
>
> People sometimes say, and I am certain I agreed to them on more than one
> occasions, that Octopus hurt bisectability and does not have much value
> in real life.  I've always thought this bisectability issue was a
> downside of Octopus merges, but now I think about it, perhaps "git
> bisect" can be taught to dynamically decompose an Octopus merges into a
> sequence of two-head virtual merges while bisecting.  We strongly
> discourage and do not allow conflicting Octopus merges, so when you need
> to bisect a history with an Octopus that looks like this:
>
>     ---o---A
>             \
>   ---o---B---M---o
>             /
>     ---o---C
>
> it should be able to mechanically decompose it, without conflicts, into
>
>
>     ---o---A
>             \
>   ---o---B---M1--M2--o
>                 /
>         ---o---C
>
> where the tree of M and the tree of M2 are identical.

If someone creates a "git decompose-octopus <commit>" command then you only 
need to do "git replace M M2" after that and you can bisect as usual. (Of 
course after that you can remove the replacement with "git replace -d M".)

Best regards,
Christian.


* Re: Could this be done simpler?
  2009-06-25 22:02   ` Christian Couder
@ 2009-06-25 22:23     ` Christian Couder
  2009-06-25 22:29       ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Christian Couder @ 2009-06-25 22:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

On Friday 26 June 2009, Christian Couder wrote:
> On Thursday 25 June 2009, Junio C Hamano wrote:
> > Side note.
> >
> > People sometimes say, and I am certain I agreed to them on more than
> > one occasions, that Octopus hurt bisectability and does not have much
> > value in real life.  I've always thought this bisectability issue was a
> > downside of Octopus merges, but now I think about it, perhaps "git
> > bisect" can be taught to dynamically decompose an Octopus merges into a
> > sequence of two-head virtual merges while bisecting.  We strongly
> > discourage and do not allow conflicting Octopus merges, so when you
> > need to bisect a history with an Octopus that looks like this:
> >
> >     ---o---A
> >             \
> >   ---o---B---M---o
> >             /
> >     ---o---C
> >
> > it should be able to mechanically decompose it, without conflicts, into
> >
> >
> >     ---o---A
> >             \
> >   ---o---B---M1--M2--o
> >                 /
> >         ---o---C
> >
> > where the tree of M and the tree of M2 are identical.
>
> If someone creates a "git decompose-octopus <commit>" command then you
> only need to do "git replace M M2" after that and you can bisect as
> usual. (Of course after that you can remove the replacement with "git
> replace -d M".)

(Or if we make the "refs/replace/bisect/" directory special so that it is 
only used when bisecting, and if the replace ref is created in this 
directory, then no need to remove the replacement ref. On the contrary it's 
better to leave it there so that people who fetch it benefit from it too.)

Best regards,
Christian.


* Re: Could this be done simpler?
  2009-06-25 22:23     ` Christian Couder
@ 2009-06-25 22:29       ` Junio C Hamano
  2009-06-25 22:50         ` Linus Torvalds
  2009-06-25 22:55         ` Christian Couder
  0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 22:29 UTC
  To: Christian Couder; +Cc: Junio C Hamano, Linus Torvalds, Git Mailing List

Christian Couder <chriscool@tuxfamily.org> writes:

>> If someone creates a "git decompose-octopus <commit>" command then ...

I am afraid that misses the entire point of my discussion.

Such a decomposed octopus would _only_ be necessary during bisection, only
when the user chooses to test two tips at once (instead of testing one by
one), _and_ only its tree is needed for that purpose.  In other words, we
should be able to do this _without_ creating an extra commit, let alone
replace mechanism.


* Re: Could this be done simpler?
  2009-06-25 22:29       ` Junio C Hamano
@ 2009-06-25 22:50         ` Linus Torvalds
  2009-06-25 23:17           ` Junio C Hamano
  2009-06-25 22:55         ` Christian Couder
  1 sibling, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2009-06-25 22:50 UTC
  To: Junio C Hamano; +Cc: Christian Couder, Git Mailing List

On Thu, 25 Jun 2009, Junio C Hamano wrote:
> 
> Such a decomposed octopus would _only_ be necessary during bisection, only
> when the user chooses to test two tips at once (instead of testing one by
> one), _and_ only its tree is needed for that purpose.  In other words, we
> should be able to do this _without_ creating an extra commit, let alone
> replace mechanism.

Keep in mind, though, that realistically, I don't think we've ever seen 
any bisection attempts that end at an octopus.

Sure, I suspect that being really clever about decomposing an octopus 
merge might allow us to bisect things _faster_ to one of the branches 
involved in the merge, but the amount of smarts to do that just for that 
reason seems pretty outlandish.

And if we ever do end up with an actual bug being bisected to the octopus 
merge itself, at that point I don't think it's unreasonable to take the 
same approach we do with any normal merge: just try to figure out what the 
conflict is all about (clearly it's not a data conflict, since the 
octopus wouldn't have succeeded in that case, but subtle merge errors can 
be due to two branches each introducing their own assumptions without 
actually ever clashing on a source file level).

With regular merges, if you really don't see what the conceptual conflict 
is, you could try to do a temporary rebase to try to figure it out, and I 
suspect that that is what you'd want to do with an octopus merge too - 
rather than try to decompose the octopus merge into multiple simpler 
merges, you'd like to try to linearize history and then re-do the 
bisection attempt on that totally modified/simplified history.

			Linus


* Re: Could this be done simpler?
  2009-06-25 22:29       ` Junio C Hamano
  2009-06-25 22:50         ` Linus Torvalds
@ 2009-06-25 22:55         ` Christian Couder
  1 sibling, 0 replies; 15+ messages in thread
From: Christian Couder @ 2009-06-25 22:55 UTC
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List

On Friday 26 June 2009, Junio C Hamano wrote:
> Christian Couder <chriscool@tuxfamily.org> writes:
> >> If someone creates a "git decompose-octopus <commit>" command then ...
>
> I am afraid that misses the entire point of my discussion.
>
> Such a decomposed octopus would _only_ be necessary during bisection,
> only when the user chooses to test two tips at once (instead of testing
> one by one), _and_ only its tree is needed for that purpose.  In other
> words, we should be able to do this _without_ creating an extra commit,
> let alone replace mechanism.

But suppose the result from the bisection tells that M1 is the first bad 
commit, then the user will need to look at M1, and perhaps check it out or 
use it in other ways after the bisection is finished. So why shouldn't it 
be a real commit?

It's not like a few more commits are a big problem as they will be reclaimed 
by garbage collection anyway if the replace ref is deleted.

Best regards,
Christian.


* Re: Could this be done simpler?
  2009-06-25 22:50         ` Linus Torvalds
@ 2009-06-25 23:17           ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-25 23:17 UTC
  To: Linus Torvalds; +Cc: Christian Couder, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Sure, I suspect that being really clever about decomposing an octopus 
> merge might allow us to bisect things _faster_ to one of the branches 
> involved in the merge, but the amount of smarts to do that just for that 
> reason seems pretty outlandish.
>
> And if we ever do end up with an actual bug being bisected to the octopus 
> merge itself, at that point I don't think it's unreasonable to take the 
> same approach we do with any normal merge: just try to figure out what the 
> conflict is all about (clearly it's not a data conflict, since the 
> octopus wouldn't have succeeded in that case, but subtle merge errors can 
> be due to two branches each introducing their own assumptions without 
> actually ever clashing on a source file level).
>
> With regular merges, if you really don't see what the conceptual conflict 
> is, you could try to do a temporary rebase to try to figure it out, and I 
> suspect that that is what you'd want to do with an octopus merge too - 
> rather than try to decompose the octopus merge into multiple simpler 
> merges, you'd like to try to linearize history and then re-do the 
> bisection attempt on that totally modified/simplified history.

All true.

Thanks for thoughts.


* Re: Could this be done simpler?
  2009-06-25 21:54         ` Matthias Andree
@ 2009-06-27  0:26           ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2009-06-27  0:26 UTC
  To: Matthias Andree; +Cc: Randal L. Schwartz, Linus Torvalds, Git Mailing List

"Matthias Andree" <matthias.andree@gmx.de> writes:

> Neither am I a native writer, why bother… it's more important to write
> things clearly than to polish things,...

Hey, calm down.

The current documentation was written back when everybody knew what git
fetch internally did (e.g. left state in .git/FETCH_HEAD) and describing
things from the perspective of what is done internally was "accepted" back
when the alternative was not describing anything in any form ;-)

I took your two questions literally as they were.  That is, 

 * You, like other people, realize that times have changed since then, and
   noticed that even with the correct rendition (it appears the problem
   Merlyn saw was primarily caused by Asciidoc toolchain), the bottom-up
   description based on what is done internally is not sufficient.

 * You are volunteering to make things better, but you first need input to
   make sure the result is not just readable but technically correct.

 * And I was among the few people who were around when .git/FETCH_HEAD and
   "git fetch --append" were invented to give precise answers to these
   questions.

In no way did I mean that these two answers should replace the current
documentation.

> Let's change roles (or perspective) for a moment, for the sake of
> clarity and usability: I am just a Git user. I don't want to hack
> Git. I couldn't care less about implementation details such as
> FETCH_HEAD, I only need to know how I can tell Git to merge branches
> foo, bar, baz into master in one single merge.

Yes, that is the good starting point.

> "git pull" (from a bird's-eye view) is just a shortcut for "git fetch
> something" and "git merge with somehow configured branch" (somehow =
> implicitly through setting up tracking branches, or clone)

Actually the latter is "with information somehow left by git-fetch".

> FETCH_HEAD(5)
> -------------------------------------------------
> This file in the git directory records which heads have been
> downloaded, from where, and for what purpose. Each line in this file
> is one TAB-delimited record with three fields. From left to right,
> these fields contain:
>
> 1 - the commit of the remote head
> 2 - "not-for-merge" if the branch is not meant to be merged;
>     otherwise, this field remains empty
> 3 - branch 'xxx' of UUU, where xxx and UUU are the remote repository's
>     refname and base URL, respectively.
>
> This file is written by git-fetch and used by git-merge.
> -------------------------------------------------

It is true that git-merge does use it, but not under its normal mode of
operation.  Unless the reader of this paragraph is hacking git, I do not
think s/he needs to (nor wants to) know about it.  IIRC, it only triggers
if you do

	$ git merge FETCH_HEAD

The more prominent user is git-pull.  git-fetch leaves instructions for
git-pull in this file so that the latter knows what to use when it drives
git-merge.
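
To make the record layout concrete, here is a minimal sketch that picks the for-merge heads out of FETCH_HEAD-style text, assuming the three TAB-delimited fields described in the quoted draft above. The object names and URLs in the sample are made up, and the awk filter is only an illustration, not what git-pull literally runs:

```shell
#!/bin/sh
# Hypothetical FETCH_HEAD contents: three TAB-delimited records.
# printf reuses the format string for each group of three arguments.
sample=$(printf '%s\t%s\t%s\n' \
	1111111111111111111111111111111111111111 "" "branch 'foo' of git://example.com/one.git" \
	2222222222222222222222222222222222222222 "not-for-merge" "branch 'bar' of git://example.com/one.git" \
	3333333333333333333333333333333333333333 "" "branch 'baz' of git://example.com/two.git")

# Keep only records whose second field is empty, i.e. those meant for merging,
# and print the commit name from the first field.
for_merge=$(printf '%s\n' "$sample" | awk -F'\t' '$2 == "" { print $1 }')
printf '%s\n' "$for_merge"
```

Only the 'foo' and 'baz' records survive the filter; the 'bar' record is dropped because its second field says not-for-merge.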

> git-fetch(1)
> -------------------------------------------------
> ...
>      -a::
>      --append::
> 	This option allows you to fetch and accumulate multiple remote
> refs for future merging.  Normally, git-fetch records the latest
> fetch for a later merge, by writing them to .git/FETCH_HEAD (there
> can be multiple recorded heads in FETCH_HEAD although the name
> suggests there were just one).

I personally find the parenthesized comment at the end just distracting
and confusing.  You are explicitly saying "by writing THEM" so it is clear
that the file can and does record more than one when the user instructs
the command to.

> ....  The --append option lets git-fetch
> keep, rather than delete, prior contents of the file.  This can be
> useful when consolidating multiple topic branches in one single merge
> (a so-called octopus merge, see git-merge(1)). Example:

The description lacks one important point.  It can be useful only when
consolidating multiple topic branches _that come from more than one remote
repository_.

Other than that, the above paragraph is perfect.
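
As a rough sketch of that multi-repository case, the script below sets up two throwaway local repositories standing in for two remotes (the paths and the branch names topic1 and topic2 are invented), fetches one branch from each with --append, and then feeds every FETCH_HEAD entry that is not marked not-for-merge to a single git merge. It extracts the heads by hand with awk rather than relying on any particular git-pull behaviour:

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any user configuration (assumption).
GIT_AUTHOR_NAME=demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

work=$(mktemp -d)
git init -q "$work/main"
cd "$work/main"
git commit -q --allow-empty -m root

# Two separate repositories, each carrying one topic branch.
for n in 1 2; do
	git clone -q "$work/main" "$work/repo$n"
	(
		cd "$work/repo$n"
		git checkout -q -b "topic$n"
		echo "$n" >"file$n"
		git add "file$n"
		git commit -q -m "topic$n"
	)
done

# Local work, so that the merge below has the current branch plus two heads.
echo main >file0
git add file0
git commit -q -m "local work"

git fetch -q "$work/repo1" topic1		# rewrites .git/FETCH_HEAD
git fetch -q --append "$work/repo2" topic2	# adds a second record to it

# Collect the commits that are not marked not-for-merge and merge them at once.
heads=$(awk -F'\t' '$2 == "" { print $1 }' .git/FETCH_HEAD)
git merge -q -m "Merge topic1 and topic2" $heads >/dev/null 2>&1
nparents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge commit has $nparents parents"
```

Without --append on the second fetch, the first record would be overwritten and only topic2 would be merged.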

> NOTE: git-fetch accepts command lines without refspec. These mark
> fetched heads as "not-for-merge". IOW, a refspec is needed so that heads
> are marked as for-merge. I haven't found this documented in
> git-fetch.

Sorry, I have no idea what you are talking about in these four lines.

Perhaps "DEFAULT BEHAVIOUR" section in Documentation/git-pull.txt, the
paragraph that begins with "The rule to determine which remote branch to
merge ..." may be what you are looking for?


Thread overview: 15+ messages
2009-06-24 21:35 Could this be done simpler? Linus Torvalds
2009-06-25  1:04 ` Junio C Hamano
2009-06-25 14:33   ` Randal L. Schwartz
2009-06-25 16:32     ` Matthias Andree
2009-06-25 17:25       ` Junio C Hamano
2009-06-25 21:54         ` Matthias Andree
2009-06-27  0:26           ` Junio C Hamano
2009-06-25 18:32       ` Junio C Hamano
2009-06-25 17:19     ` Michael J Gruber
2009-06-25 22:02   ` Christian Couder
2009-06-25 22:23     ` Christian Couder
2009-06-25 22:29       ` Junio C Hamano
2009-06-25 22:50         ` Linus Torvalds
2009-06-25 23:17           ` Junio C Hamano
2009-06-25 22:55         ` Christian Couder
