git-name-rev off-by-one bug

git@vger.kernel.org mailing list mirror (one of many)

* git-name-rev off-by-one bug
@ 2005-11-28 23:42 linux
  2005-11-29  5:54 ` Junio C Hamano
  2005-12-01 10:14 ` Junio C Hamano
  0 siblings, 2 replies; 64+ messages in thread
From: linux @ 2005-11-28 23:42 UTC
  To: git; +Cc: linux

I've been trying to wrap my head around git for a while now, and finding
things a bit confusing.  Basically, the reason that I'm scared to trust
it with my code is that all sharing is done via push and pull, and they
are done by merging, and merging isn't described very well anywhere.

There's lots of intimate *detail* of merge algorithms (hiding in, of all
places, the git-read-tree documentation, which is not the obvious place
for a beginner to look), but the important high-level questions like "what
happens to all my hard work if there's a merge conflict?" or "what if I
forget to git-update-index before doing the merge?" are not really clear.
I don't like to go ahead if I'm not confident I can get back.

(Being able to back up the object database is obviously simple, but what
happens if the index holds HEAD+1, the working directory holds HEAD+2,
and I try to merge the latest changes from origin?  Are either HEAD+1 or
HEAD+2 in danger of being lost, or will checking them in later overwrite
the merge, or what?)

Anyway, I'm doing some experiments and trying to understand it, and writing
what I learn as I go, which will hopefully be useful to someone.

Another very confusing thing is the ref syntax with all those ~12^3^22^2
suffixes.  The git tutorial uses "master^" and "master^2" syntax, but
doesn't actually explain it.

The meaning can be found on the second page of the git-rev-parse manual.
If, that is, you think to read that man page, and if you don't stop
reading after the first page tells you that it's a helper for scripts
not meant to be invoked directly by the end-user.

Trying to see if I understood what was going on, I picked a random rev out of
git-show-branch output and tried git-name-rev:

> $ git-name-rev 365a00a3f280f8697e4735e1ac5b42a1c50f7887
> 365a00a3f280f8697e4735e1ac5b42a1c50f7887 maint~404^1~7

(If you care, maint=93dcab2937624ebb97f91807576cddb242a55a46)

And was very confused when git-rev-parse didn't invert the operation:

> $ git-rev-parse maint~404^1~7
> f69714c38c6f3296a4bfba0d057e0f1605373f49

I spent a while verifying that I understood that ^1 == ^ == ~1, so
~404^1~7 = ~412, and that gave the same unwanted result:

> $ git-rev-parse maint~412
> f69714c38c6f3296a4bfba0d057e0f1605373f49
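
The suffix arithmetic above (each ^, ^1 and ~N is a step along first
parents, so the steps just add up) can be sketched in portable shell.
This is only an illustration: collapse_first_parent is a made-up helper,
not a git command, and it gives up on anything that is not a plain
first-parent step.

```shell
#!/bin/sh
# Collapse a suffix made only of first-parent steps (^, ^1, ~, ~N)
# into a single ~N.  Returns 1 if any other step (e.g. ^2) is present.
# collapse_first_parent is a hypothetical helper, not part of git.
collapse_first_parent() {
    rest=$1
    total=0
    while [ -n "$rest" ]; do
        op=${rest%"${rest#?}"}      # first character of what is left
        rest=${rest#?}
        digits=${rest%%[!0-9]*}     # leading run of digits, may be empty
        rest=${rest#"$digits"}
        case $op in
        '~') total=$((total + ${digits:-1})) ;;
        '^')
            case ${digits:-1} in
            1) total=$((total + 1)) ;;
            *) return 1 ;;          # ^2, ^3, ... leave the first-parent line
            esac ;;
        *) return 1 ;;
        esac
    done
    echo "~$total"
}

collapse_first_parent '~404^1~7'    # prints ~412
collapse_first_parent '~404^2~7' || echo "not a pure first-parent chain"
```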

After confusing myself for a while, I looked to see why git-name-rev
would output such a redundant name and found that it was simply
wrong.  Fixing the symbolic name worked:

> $ git-rev-parse maint~404^2~7
> 365a00a3f280f8697e4735e1ac5b42a1c50f7887

You can either go with a minimal fix:
diff --git a/name-rev.c b/name-rev.c
index 7d89401..f7fa18c 100644
--- a/name-rev.c
+++ b/name-rev.c
@@ -61,9 +61,10 @@ copy_data:

 			if (generation > 0)
 				sprintf(new_name, "%s~%d^%d", tip_name,
-						generation, parent_number);
+						generation, parent_number+1);
 			else
-				sprintf(new_name, "%s^%d", tip_name, parent_number);
+				sprintf(new_name, "%s^%d", tip_name,
+						parent_number+1);

 			name_rev(parents->item, new_name,
 				merge_traversals + 1 , 0, 0);

Or you can get a bit more ambitious and write ~1 as ^:

diff --git a/name-rev.c b/name-rev.c
index 7d89401..82053c8 100644
--- a/name-rev.c
+++ b/name-rev.c
@@ -57,13 +57,17 @@ copy_data:
 			parents;
 			parents = parents->next, parent_number++) {
 		if (parent_number > 0) {
-			char *new_name = xmalloc(strlen(tip_name)+8);
+			unsigned const len = strlen(tip_name);
+			char *new_name = xmalloc(len+8);

-			if (generation > 0)
-				sprintf(new_name, "%s~%d^%d", tip_name,
-						generation, parent_number);
-			else
-				sprintf(new_name, "%s^%d", tip_name, parent_number);
+			memcpy(new_name, tip_name, len);
+
+			if (generation == 1)
+				new_name[len++] = '^';
+			else if (generation > 1)
+				len += sprintf(new_name+len, "~%d", generation);
+
+			sprintf(new_name+len, "^%d", parent_number+1);

 			name_rev(parents->item, new_name,
 				merge_traversals + 1 , 0, 0);
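
For anyone who would rather not read C, the naming rule that the second
patch implements can be sketched as a shell function.  This is an
illustration only; format_parent_name is a name I made up, and the real
code is the C above.  The point is that the loop counter is zero-based
while the ^N suffix is one-based.

```shell
#!/bin/sh
# format_parent_name TIP GENERATION PARENT_INDEX
# PARENT_INDEX is zero-based, like parent_number in the C loop; the
# ^N suffix counts parents from 1, hence the +1.
# (Hypothetical helper for illustration, not part of git.)
format_parent_name() {
    tip=$1
    generation=$2
    parent_index=$3
    name=$tip
    if [ "$generation" -eq 1 ]; then
        name="$name^"                 # write ~1 as ^
    elif [ "$generation" -gt 1 ]; then
        name="$name~$generation"
    fi
    echo "$name^$((parent_index + 1))"
}

format_parent_name maint 404 1    # prints maint~404^2
format_parent_name maint 1 1      # prints maint^^2
format_parent_name maint 0 2      # prints maint^3
```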

While I'm at it, I notice some unnecessary invocations of expr in some
of the shell scripts.  You can do it far more simply using the ${var#pat}
and ${var%pat} expansions to strip off leading and trailing patterns.
For example:

diff --git a/git-cherry.sh b/git-cherry.sh
index 867522b..c653a6a 100755
--- a/git-cherry.sh
+++ b/git-cherry.sh
@@ -23,8 +23,7 @@ case "$1" in -v) verbose=t; shift ;; esa

 case "$#,$1" in
 1,*..*)
-    upstream=$(expr "$1" : '\(.*\)\.\.') ours=$(expr "$1" : '.*\.\.\(.*\)$')
-    set x "$upstream" "$ours"
+    set x "${1%..*}" "${1#*..}"
     shift ;;
 esac

This works in dash and is in the POSIX spec.  It doesn't work in some
very old /bin/sh implementations (such as Solaris still ships), but I'm
pretty sure it was introduced at the same time as $(), and the scripts
use *that* all over the place.

% sh
$ uname -s -r
SunOS 5.9
$ foo=bar
$ echo ${foo#b}
bad substitution
$ echo `echo $foo`
bar
$ echo $(echo $foo)
syntax error: `(' unexpected
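
For reference, here is what the two expansions in the git-cherry.sh
change do, as a small standalone sketch.  split_range is a made-up
function name; only the ${var%pat} and ${var#pat} forms come from
the patch.

```shell
#!/bin/sh
# Split "upstream..ours" using parameter expansion only (no expr, no fork).
# split_range is a hypothetical helper for illustration.
split_range() {
    range=$1
    upstream=${range%..*}    # remove the shortest suffix matching "..*"
    ours=${range#*..}        # remove the shortest prefix matching "*.."
    printf 'upstream=%s ours=%s\n' "$upstream" "$ours"
}

split_range origin..master    # prints: upstream=origin ours=master
split_range v1.0..            # prints: upstream=v1.0 ours=
```

One caveat, as far as I can tell: # and % remove the shortest match,
while the .* in the expr patterns is greedy.  The two versions agree
when ".." occurs once, but for an argument like "a...b" the right-hand
side comes out as ".b" here and as "b" with expr; ${1##*..} would be
the greedy equivalent.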

Anyway, if it's portable enough, it's faster.  Ah... I just found discussion
of this in late September, but it's not clear what the resolution was.
http://marc.theaimsgroup.com/?t=112746188000003

(Oh, yes: all of the above patches are released into the public domain.
Copyright abandoned.  Have fun.)


* Re: git-name-rev off-by-one bug
  2005-11-28 23:42 git-name-rev off-by-one bug linux
@ 2005-11-29  5:54 ` Junio C Hamano
  2005-11-29  8:05   ` linux
  2005-11-30 17:46   ` Daniel Barkalow
  2005-12-01 10:14 ` Junio C Hamano
  1 sibling, 2 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-29  5:54 UTC
  To: linux; +Cc: git

linux@horizon.com writes:

> (Being able to back up the object database is obviously simple, but what
> happens if the index holds HEAD+1, the working directory holds HEAD+2,
> and I try to merge the latest changes from origin?  Are either HEAD+1 or
> HEAD+2 in danger of being lost, or will checking them in later overwrite
> the merge, or what?)

Thanks for the complaints.  No sarcasm intended.  Yours is
exactly the kind of message we (people who've been around here
for too long) need to hear.

Although the technical details are hidden in the documentation
which needs reorganization to make them easier to find [*1*] as
you point out, the guiding principle for merge is quite simple.

To the git barebone Porcelain layer (things that start with
git-*, not with cg-*) [*2*], a merge is always between the
current HEAD and one or more remote branch heads, and the index
file must exactly match the tree of HEAD commit (i.e. the
contents of the last commit) when it happens.  In other words,
"git-diff --cached HEAD" must report no changes [*3*].  So
HEAD+1 must be HEAD in your above notation, or merge will refuse
to do any harm to your repository (that is, it may fetch the
objects from remote, and it may even update the local branch
used to keep track of the remote branch with "git pull remote
rbranch:lbranch", but your working tree, .git/HEAD pointer and
index file are left intact).

You may have local modifications in the working tree files.  In
other words, "git-diff" is allowed to report changes (the
difference between HEAD+2 and HEAD+1 in your notation).
However, the merge uses your working tree as the working area,
and in order to prevent the merge operation from losing such
changes, it makes sure that they do not interfere with the
merge. Those complex tables in read-tree documentation define
what it means for a path to "interfere with the merge".  And if
your local modifications interfere with the merge, again, it
stops before touching anything.

So in the above two "failed merge" cases, you do not have to
worry about lossage of data --- you simply were not ready to do
a merge, so no merge happened at all.  You may want to finish
whatever you were in the middle of doing, and retry the same
pull after you are done and ready.
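
The two conditions above can be checked by hand before pulling.  What
follows is a rough sketch, not what git-merge itself runs: the function
names are made up, it has to be run inside a repository, and it ignores
the special cases mentioned in footnote *3*.

```shell
#!/bin/sh
# Rough pre-merge check (hypothetical helpers, for illustration only).

# Succeeds if the index exactly matches the tree of HEAD,
# i.e. "git-diff --cached HEAD" would report nothing.
index_matches_head() {
    git diff-index --cached --quiet HEAD --
}

# Lists working tree files that differ from the index.
local_modifications() {
    git diff-files --name-only
}

ready_to_merge() {
    if ! index_matches_head; then
        echo "index differs from HEAD; the merge would refuse to run" >&2
        return 1
    fi
    echo "index matches HEAD; these local changes must not overlap the merge:"
    local_modifications
}

# Usage, inside a repository:  ready_to_merge && git pull <remote>
```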

When things cleanly merge, these things happen:

 (1) the results are updated both in the index file and in your
     working tree,
 (2) index file is written out as a tree,
 (3) the tree gets committed, and 
 (4) the HEAD pointer gets advanced.

Because of (2), we require the original state of the index
file to match exactly the current HEAD commit; otherwise we will
write out your local changes already registered in your index
file (the difference between HEAD+1 and HEAD in your notation)
along with the merge result, which is not good.  Because (1)
involves only the paths different between your branch and the
remote branch you are pulling from during the merge (which is
typically a fraction of the whole tree), you can have local
modifications in your working tree as long as they do not
overlap with what the merge updates.

When there are conflicts, these things happen:

 (0) HEAD stays the same.

 (1) Cleanly merged paths are updated both in the index file and
     in your working tree.

 (2) For conflicting paths, the index file records the version
     from HEAD. The working tree files have the result of
     "merge" program; i.e. 3-way merge result with familiar
     conflict markers <<< === >>>.

 (3) No other changes are done.  In particular, the local
     modifications you had before you started merge will stay the
     same and the index entries for them stay as they were,
     i.e. matching HEAD.

After seeing a conflict, you can do two things:

 * Decide not to merge.  The only clean-up you need is to reset
   the index file to the HEAD commit to reverse (1) and to clean
   up working tree changes made by (1) and (2); "git-reset" can
   be used for this.

 * Resolve the conflicts.  "git-diff" would report only the
   conflicting paths because of the above (1) and (2).  Edit the
   working tree files into a desirable shape, git-update-index
   them, to make the index file contain what the merge result
   should be, and run "git-commit" to commit the result.

[Footnotes]

*1* It is a shame that the most comprehensive definition of
3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test
script.

*2* Cogito (things that start with cg-*) seems to try to be
cleverer.  Pasky might want to brag about the rules in Cogito
land.

*3* This is a bit of a lie.  In certain special cases, your index
is allowed to be different from the tree of HEAD commit;
basically your index entries are allowed to match the result of
trivial merge already (e.g. you received the same patch from
external source to produce the same result as what you are
merging).  For example, if a path did not exist in the common
ancestor and your head commit but exists in the tree you are
merging into your repository, and if you already happen to have
that path exactly in your index, the merge does not have to
fail.  This is case #2 in the 3-way read-tree table in t/t1000.


* Re: git-name-rev off-by-one bug
  2005-11-29  5:54 ` Junio C Hamano
@ 2005-11-29  8:05   ` linux
  2005-11-29  9:29     ` Junio C Hamano
  2005-11-29 10:31     ` Petr Baudis
  2005-11-30 17:46   ` Daniel Barkalow
  1 sibling, 2 replies; 64+ messages in thread
From: linux @ 2005-11-29  8:05 UTC
  To: junkio; +Cc: git, linux

> Thanks for the complaints.  No sarcasm intended.  Yours is
> exactly the kind of message we (people who've been around here
> for too long) need to hear.

Thanks for taking it so well!  I'm trying to really *understand* git,
so I can predict its behaviour, but I've been coming to the conclusion
that the only way to do that is by re-reading the mailing list from day 1.

And to understand git at all, you have to understand merging, since
doing merging fast and well is the central reason for git's entire
existence.  Single-developer porcelain like "cvs annotate" is
noticeably lacking, but branching and merging is great.

(In particular, and unlike other SCMs, "push" and "pull" are
based on merging!  So I can't even understand what pulling from
Linus's tree does until I understand merging.)

I'm working on some notes to explain git to myself and people
I work with, which I'll post when they're vaguely complete.

> To the git barebone Porcelain layer (things that start with
> git-*, not with cg-*) [*2*], a merge is always between the
> current HEAD and one or more remote branch heads, and the index
> file must exactly match the tree of HEAD commit (i.e. the
> contents of the last commit) when it happens.  In other words,
> "git-diff --cached HEAD" must report no changes [*3*].  So
> HEAD+1 must be HEAD in your above notation, or merge will refuse
> to do any harm to your repository (that is, it may fetch the
> objects from remote, and it may even update the local branch
> used to keep track of the remote branch with "git pull remote
> rbranch:lbranch", but your working tree, .git/HEAD pointer and
> index file are left intact).

Right!  Since the object database is strictly append-only, it's easy to
see how such changes are quite harmless, and updating a tracking branch
is hardly a big nasty surprise.

It's the index and working directory that are volatile, and that I was
worried about.

BUT... what's the second argument to git-read-tree for, if it
always has to be HEAD?

BTW, I'd change the description of git-read-tree from

>     Reads the tree information  given  by  <tree-ish>  into  the directory
>     cache, but does not actually update any of the files it "caches". (see:
>     git-checkout-index)

to

      Reads the tree information given by <tree-ish> into the index.
      The working directory is not modified in any way (unless -u
      is used).  Use git-checkout-index to do that.

      In addition to the simple one-tree case, this can (with the
      -m flag) merge 2 or 3 trees into the index.  When used with
      -m, the -u flag causes it to also update the files in the
      working directory.

      Trivial merges are done by "git-read-tree" itself.  Conflicts
      are left in an unmerged state for git-merge-index to resolve.

> You may have local modifications in the working tree files.  In
> other words, "git-diff" is allowed to report changes (the
> difference between HEAD+2 and HEAD+1 in your notation).
> However, the merge uses your working tree as the working area,
> and in order to prevent the merge operation from losing such
> changes, it makes sure that they do not interfere with the
> merge.  Those complex tables in read-tree documentation define
> what it means for a path to "interfere with the merge".  And if
> your local modifications interfere with the merge, again, it
> stops before touching anything.

THANK YOU for finally making this clear!  I was wondering why the
hell a 2-way merge looked more complex than a 3-way.  (Although,
admittedly, I'm *still* not clear on what the difference is.  It
seems like a 2-way just picks the origin commit automatically.)
So it goes like this:

- git-merge will refuse to do anything if there are any uncommitted changes
  in the index.  [footnote about harmless exceptions]
- git-merge will refuse to do anything if there are changes in the
  working directory to a file that would be affected by the merge.
  You CAN, however, have unindexed changes to files that are unchanged
  by the merge.

The description of a 1-way merge in git-read-tree is quite confusing.
Here's a rephrasing; do I understand it?

READING

    Without -m, git-read-tree pulls a specified tree into the index.
    Any file whose underlying blob is changed by the update will have
    its cached stat data invalidated, so it will appear in the output
    of git-diff-files, and git-checkout-index -f will overwrite it.

    A single-tree merge is a slightly nicer variant on this.

MERGING
    ...

Single Tree Merge
    If only one tree is specified, any file whose blob changes as a
    result of the update is compared with the working directory file,
    and the cached stat data updated if the working directory file
    matches the new blob.  Thus, working directory files which happen to
    match the contents of the new tree will be excluded from the output
    of git-diff-files, and git-checkout-index -f will not update their
    file modification times.

    This is basically the effect of git-update-index --refresh.

    This is actually usually preferable to the non-merge case, but
    does do extra work.

I haven't found the code yet, but obviously if the working directory
file is clean relative to the previous blob, it's dirty relative to
the changed blob, so there's no need to actually read the file.

Um... I just tried it, and it appears that things do NOT work as
I just described.  Time to read the code...

BTW, an even cuter way of writing same() in read_tree.c would be:

static int same(struct cache_entry *a, struct cache_entry *b)
{
	if (!a || !b)
		return a == b;
        return a->ce_mode == b->ce_mode &&
                !memcmp(a->sha1, b->sha1, 20);
}

(I presume that
return a&&b ? a->ce_mode == b->ce_mode && !memcmp(a->sha1, b->sha1, 20) : a==b;
would be taking it too far...)

Oh, and the git-read-tree man page fails to mention --reset and --trivial.
And the usage message should have <sha> changed to <tree>.

> So in the above two "failed merge" cases, you do not have to
> worry about lossage of data --- you simply were not ready to do
> a merge, so no merge happened at all.  You may want to finish
> whatever you were in the middle of doing, and retry the same
> pull after you are done and ready.

I feel about a thousand percent better about git already.

> When things cleanly merge, these things happen:
>
> (1) the results are updated both in the index file and in your
>     working tree,
> (2) index file is written out as a tree,
> (3) the tree gets committed, and 
> (4) the HEAD pointer gets advanced.

This is git-merge, as opposed to the more primitive git-read-tree -m
plus git-merge-index, right?

(Aside, you should document that git-merge --no-commit saves the <msg>
in .git/MERGE_MSG, and git-commit uses it as the default message.
Otherwise, users wonder why the hell it asks for a commit message it
knows won't be used.)

> Because of (2), we require the original state of the index
> file to match exactly the current HEAD commit; otherwise we will
> write out your local changes already registered in your index
> file (the difference between HEAD+1 and HEAD in your notation)
> along with the merge result, which is not good.

> *3* This is a bit of a lie.  In certain special cases, your index
> is allowed to be different from the tree of HEAD commit;
> basically your index entries are allowed to match the result of
> trivial merge already (e.g. you received the same patch from
> external source to produce the same result as what you are
> merging).  For example, if a path did not exist in the common
> ancestor and your head commit but exists in the tree you are
> merging into your repository, and if you already happen to have
> that path exactly in your index, the merge does not have to
> fail.  This is case #2 in the 3-way read-tree table in t/t1000.

Ah, yes, the light dawns!  So it *is* okay if you have an uncommitted
change in the index which exactly matches what git-read-tree -m would
have done anyway.  It doesn't make a difference to the final state
of the index, and forcing the poor user to undo it just so the merge
can redo it is simply annoying.

But it's NOT okay if you have any other change in the index, even in a
file unaffected by the merge, because then the final state would not
be the result of the merge, and that could be confusing.

> Because (1)
> involves only the paths different between your branch and the
> remote branch you are pulling from during the merge (which is
> typically a fraction of the whole tree), you can have local
> modifications in your working tree as long as they do not
> overlap with what the merge updates.

Wonderfully clear.  Such changes must not exist at the git-read-tree
level, since there's not even a way to represent them to
git-merge-index.  Still, this can save considerable bother when
trying to track someone else's tree.

> When there are conflicts, these things happen:
>
>  (0) HEAD stays the same.
>
>  (1) Cleanly merged paths are updated both in the index file and
>      in your working tree.
> 
>  (2) For conflicting paths, the index file records the version
>      from HEAD. The working tree files have the result of
>      "merge" program; i.e. 3-way merge result with familiar
>      conflict markers <<< === >>>.
> 
>  (3) No other changes are done.  In particular, the local
>      modifications you had before you started merge will stay the
>      same and the index entries for them stay as they were,
>      i.e. matching HEAD.

So this is a lot like what CVS does, but neater because all the
merge sources are available unmodified in the index.  Excellent.

> After seeing a conflict, you can do two things:
>
> * Decide not to merge.  The only clean-up you need is to reset
>   the index file to the HEAD commit to reverse (1) and to clean
>   up working tree changes made by (1) and (2); "git-reset" can
>   be used for this.

Cool.  Two minor questions:

- Doesn't any non-trivial merge or invocation of git-update-index
  produce blob objects in the database that become garbage if you
  do this?  Or are they somehow kept separate until a tree object
  is created to point to them?

  (You could have an "unreferenced" bit in the index, indicating that
  the blobs in question could be found in .git/pending rather than
  .git/objects, until git-write-tree moved them into the database.
  But I don't see mention of any such scheme.)

- Is there any difference between "git-reset --hard" and "git-checkout -f"?

> * Resolve the conflicts.  "git-diff" would report only the
>   conflicting paths because of the above (1) and (2).  Edit the
>   working tree files into a desirable shape, git-update-index
>   them, to make the index file contain what the merge result
>   should be, and run "git-commit" to commit the result.

Okay, so git-update-index will overwrite a staged file with a
fresh stage-0 copy.  And git-commit will refuse to commit
(to be precise, it'll stop at the git-write-tree stage) if there
are unresolved conflicts.

If you want to see the unmodified input files, you can find their
IDs with "git-ls-files -u" and then get a copy with "git-cat-file blob"
or "git-unpack-file".  git-merge-index is basically a different way to
process the output of git-ls-files -u.
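
As a sketch of what I mean by processing that output: git-ls-files -u
prints one line per staged entry, in the form
"<mode> <blob id> <stage><TAB><path>", and a few lines of shell can pick
it apart.  list_stages is a made-up name, and the input below is canned
(the blob IDs are placeholders), so this runs without a repository.

```shell
#!/bin/sh
# Reformat "git ls-files -u" style lines read from stdin:
#   <mode> <blob id> <stage><TAB><path>
# list_stages is a hypothetical helper for illustration.
list_stages() {
    while read -r mode blob stage path; do
        echo "stage $stage of $path is blob $blob"
    done
}

# Canned input standing in for a real unmerged index.
printf '100644 %s 1\tfile.c\n100644 %s 2\tfile.c\n100644 %s 3\tfile.c\n' \
    aaaaaaa bbbbbbb ccccccc | list_stages
```

In a real repository the pipeline would be "git ls-files -u | list_stages",
followed by "git cat-file blob <id>" on whichever ID you want to look at.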

> *1* It is a shame that the most comprehensive definition of
> 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test
> script.

Thanks for the pointer; I'll go and read it!

> *2* Cogito (things that start with cg-*) seems to try to be
> cleverer.  Pasky might want to brag about the rules in Cogito
> land.

In fact, he might want to explain what the difference is between cogito
and git.  Most particularly, are there any restrictions on mixing cg-*
and git-* operations from within the same directory?

I've been assuming that cogito is just a series of friendlier
utilities, in much the same way that a text editor is friendlier
than "cat > kernel/sched.c", so I'll study it after I understand
core git.

One more question to be sure I understand merging... AFAICT, it would
be theoretically possible to stop distinguishing "stage 2" and "stage 0".
If there is only a stage 2 file, then in-index will immediately
collapse it to stage 0, and if there are any other stage files,
you know there's an incomplete merge.

(Alternatively, you could collapse "stage 3" and "stage 0", since
stages 2 and 3 are treated identically, but traditionally, stage 2 is
the "trunk" and stage 3 is the "branch" being merged in.)


* Re: git-name-rev off-by-one bug
  2005-11-29  8:05   ` linux
@ 2005-11-29  9:29     ` Junio C Hamano
  2005-11-30  8:37       ` Junio C Hamano
  2005-11-29 10:31     ` Petr Baudis
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-29  9:29 UTC
  To: linux; +Cc: junkio, git

linux@horizon.com writes:

> (In particular, and unlike other SCMs, "push" and "pull" are
> based on merging!  So I can't even understand what pulling from
> Linus's tree does until I understand merging.)

If you do "git pull git://.../linux-2.6/", it fetches the
objects and stores Linus "master" into .git/FETCH_HEAD, and then
merges that into whatever branch you happen to be on (but you
already know that).

> BUT... what's the second argument to git-read-tree for, if it
> always has to be HEAD?

git-read-tree is purely "a building block to be scripted" in
your second mail (private?).  In my message you are replying to,
I outlined what the end-user level tool, git-pull (and git-merge
it uses) does, which is built using git-read-tree, and we happen
to always pass HEAD as its second argument.  But it does not
have to be that way.  One thing planned in the future is to do a
merge in a temporary working directory, not in your primary
working tree, and when we implement that, the second argument
will be whatever branch head you are pulling into, not
necessarily your current HEAD.

> BTW, I'd change the description of git-read-tree from

Thanks.  I am a bit too tired tonight so I may send you a
correction later if what I am going to comment here turns out to
be incorrect, but a quick glance tells me this is a good
clarification.

> ...  I was wondering why the
> hell a 2-way merge looked more complex than a 3-way.

2-way is called merge but it is not about the merge at all
(git-merge is mostly about 3-way merge).  It is more about
checkout.  Suppose you have checked out branch A, and have local
modifications (both in index and in working tree).  You would
want to switch to branch B by "git checkout B" (without -f).
2-way "read-tree -m -u" is used to ensure that you take your
local modifications with you while checking out branch B into
your working tree, meaning:

 (1) local changes already registered in the index stay in the
     index file; obviously this can be done only if A and B are
     the same at such paths.

 (2) local changes not registered in the index stay in the
     working tree; similar restriction applies but the rules are
     more involved.

Similar to the 3-way case, 2-way will refuse to lose your local
modifications, and that is what the 2-way case table in
Documentation/git-read-tree.txt is about.

> This is git-merge, as opposed to the more primitive git-read-tree -m
> plus git-merge-index, right?

Everything I wrote in the previous message was about the
end-user tools git-pull/git-merge. 

> - Doesn't any non-trivial merge or invocation of git-update-index
>   produce blob objects in the database that become garbage if you
>   do this?

We produce garbage blobs all the time, and we do not care.  Even
the following sequence that does not involve any merge produces
a garbage blob for the first version of A that was faulty:

        $ git checkout
        $ edit A
        $ git update-index A
        $ make ;# oops, there is a mistake.
        $ edit A
        $ make ;# this time it is good.
        $ git commit -a -m 'Finally compiles.'

Occasional fsck-objects, prune and repack are your friends.
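
If you want to see this for yourself, the sketch below stages a file and
confirms that its blob is already in the object database before any
commit exists.  stage_and_show_blob is a made-up helper; it must be run
inside a repository, and it assumes no filters that would make
hash-object and update-index disagree about the contents.

```shell
#!/bin/sh
# Stage FILE and print the ID of the blob that staging wrote.
# Hypothetical helper for illustration; run it inside a git repository.
stage_and_show_blob() {
    file=$1
    git update-index --add -- "$file" || return 1
    blob=$(git hash-object -- "$file") || return 1
    # cat-file -e succeeds only if the object is already in the database.
    git cat-file -e "$blob" && echo "$blob"
}

# Usage:
#   edit A; stage_and_show_blob A    # blob for the faulty version
#   edit A; stage_and_show_blob A    # blob for the fixed version
# The first blob is still in the database afterwards, referenced by
# nothing, until it is pruned.
```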

> - Is there any difference between "git-reset --hard" and "git-checkout -f"?

"reset --hard" does a more thorough job of removing unwanted files
from your working tree.  It looks at your current HEAD, the
commit you are resetting to (when you say "reset --hard
<commit>"), and your index file, and paths mentioned by any of
these three that should not remain (that is, not in the commit
you are resetting to) are removed from your working tree.  In
addition, "reset --hard <commit>" updates the branch head and
can be used to rewind it.  On the other hand, "checkout -f"
tells git to *ignore* what is in the index, so any file in the
working tree that used to be in the index (or old branch you
were working on) that does not exist in the branch you are
checking out is not removed.

On a related topic of removing unwanted paths, earlier I said
2-way is used to make sure "git checkout" takes your changes
with you when you switch branches.  As a natural consequence of
this, if you do not have any local changes, "git checkout"
without "-f" does the right thing -- it removes unwanted paths
that existed in the original branch but not in the branch you
are switching to.

> Okay, so git-update-index will overwrite a staged file with a
> fresh stage-0 copy.  And git-commit will refuse to commit
> (to be precise, it'll stop at the git-write-tree stage) if there
> are unresolved conflicts.

Sorry, I was unclear that I was talking about end-user level
tool.  The update-index here is not about the conflict
resolution in the index file read-tree documentation talks
about.  That has already been done when "merge" ran in the
conflicting case.  In the conflicting case, the working tree
holds 3-way merge conflicting result, and the index holds HEAD
version at stage0 for such a path.  Hand resolving after
update-index is to record what you eventually want to commit
(i.e. you are not replacing higher stage entry in the index with
stage0 entry -- you are replacing stage0 entry with another).

> If you want to see the unmodified input files, you can find their
> IDs with "git-ls-files -u" and then get a copy with "git-cat-file blob"
> or "git-unpack-file".  git-merge-index is basically a different way to
> process the output of git-ls-files -u.

Yes, in principle.  But in practice you usually do not use these
low level tools yourself.  When git-merge returns with
conflicting paths, most of them have already been collapsed into
stage0 and git-ls-files --unmerged would not show.  The only
case I know of that you may still see higher stage entries in
the index these days is merging paths with different mode bits.
We used to leave higher stage entries when both sides added new
file at the same path, but even that we show as merge from
common these days.


* Re: git-name-rev off-by-one bug
  2005-11-29  8:05   ` linux
  2005-11-29  9:29     ` Junio C Hamano
@ 2005-11-29 10:31     ` Petr Baudis
  2005-11-29 18:46       ` Junio C Hamano
  2005-11-29 21:40       ` git-name-rev off-by-one bug linux
  1 sibling, 2 replies; 64+ messages in thread
From: Petr Baudis @ 2005-11-29 10:31 UTC
  To: linux; +Cc: junkio, git

Dear diary, on Tue, Nov 29, 2005 at 09:05:29AM CET, I got a letter
where linux@horizon.com said that...
> > *2* Cogito (things that start with cg-*) seems to try to be
> > cleverer.  Pasky might want to brag about the rules in Cogito
> > land.

> In fact, he might want to explain what the difference is between cogito
> and git.  Most particularly, are there any restrictions on mixing cg-*
> and git-* operations from within the same directory?

  Nope, except:

Cogito vs. other GIT tools
~~~~~~~~~~~~~~~~~~~~~~~~~~

You can *MOSTLY* use Cogito in parallel with other GIT frontends (e.g.
StGIT), as well as the GIT plumbing and core GIT tools - the tools only
need to keep HEAD in place and follow the standardized `refs/`
hierarchy. The only notable exception is that you should stick with a
single toolkit during a merge.

	(-- Cogito README)

  So exactly during a merge, things might not blend well, since Cogito
does things a bit differently. It knows of no MERGE_HEAD, MERGE_MSG and
such, and instead passes stuff over different channels or computes/asks
it at different times. Historically, cg-merge and git-merge evolution
has been almost entirely separate.

  From the user POV, the main difference between Cogito and GIT merging
is that:

  (i) Cogito tries to never leave the index "dirty" (i.e. containing
unmerged entries), and instead all conflicts should propagate to the
working tree, so that the user can resolve them without any further
special tools. (What is lacking here is that Cogito won't proofcheck
that you really resolved them all during a commit. That's a big TODO.
But core GIT won't warn you about committing the classical << >> ==
conflicts either.)

  (ii) Cogito will handle trees with some local modifications better -
basically any local modifications git-read-tree -m won't care about.
I didn't read the whole conversation, so to reiterate: git-read-tree
will complain when the index does not match the HEAD, but won't
complain about modified files in the working tree if the merge is not
going to touch them. Now, let's say you do this (the output resembles what
the real tools would tell you only roughly, if at all):

	$ ls
	a b c
	$ echo 'somelocalhack' >>a
	$ git merge "blah" HEAD remotehead
	File-level merge of 'b' and 'c'...
	Oops, 'b' contained local conflicts.
	Automatic merge aborted, fix up by hand.
	$ fixup b
	$ git commit
	Committed files 'a', 'b', 'c'.

Oops. It grabbed your local hack and committed it along with the merge.
Cogito won't do this, it will hold 'a' back when doing the merge commit
(if it works right; in the past, there were several bugs related to
this, but hopefully they are all fixed by now):

	$ ls
	a b c
	$ echo 'somelocalhack' >>a
	$ cg-merge remotehead
	... Merging c
	... Merging b
	Conflicts during merge of 'b'.

		Fix up the conflicts, then kindly do cg-commit.
	$ fixup b
	$ cg-commit -m"blah"
	Committed files 'b', 'c'.

Also note that the cg-merge usage is simpler and you give the "blah"
message only to cg-commit, when it's for sure you are going to use it.

  (iii) Cogito does not support the smart recursive merging strategy.
That means it won't follow renames, and in case of multiple merge bases,
it will not merge them recursively, but it will just ask you to choose
one manually, or suggest the most conservative merge base (where you
should get no false clean merges, but you will probably have to deal
with a lot of conflicts).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.


* Re: git-name-rev off-by-one bug
  2005-11-29 10:31     ` Petr Baudis
@ 2005-11-29 18:46       ` Junio C Hamano
  2005-12-04 21:34         ` Petr Baudis
  2005-11-29 21:40       ` git-name-rev off-by-one bug linux
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-29 18:46 UTC (permalink / raw
  To: Petr Baudis; +Cc: linux, junkio, git

Petr Baudis <pasky@suse.cz> writes:

>   (ii) Cogito will handle trees with some local modifications better -
> basically any local modifications git-read-tree -m won't care about.
> I didn't read the whole conversation, so to reiterate: git-read-tree
> will complain when the index does not match the HEAD, but won't
> complain about modified files in the working tree if the merge is not
> going to touch them. Now, let's say you do this (output is visually
> only roughly or not at all resembling what would real tools tell you):
>
> 	$ ls
> 	a b c
> 	$ echo 'somelocalhack' >>a
> 	$ git merge "blah" HEAD remotehead
> 	File-level merge of 'b' and 'c'...
> 	Oops, 'b' contained local conflicts.
> 	Automatic merge aborted, fix up by hand.
> 	$ fixup b
> 	$ git commit
> 	Committed files 'a', 'b', 'c'.
>
> Oops. It grabbed your local hack and committed it along the merge.

Are you sure about this?

In the above sequence, after you touch a with 'somelocalhack',
there is no 'git update-index a', until you say 'git commit'
there, so I do not think that mixup is possible.

The "fixup b" step is actually two commands, so after merge
command, you would do:

	$ edit b
	$ git update-index b ;# mark that you have dealt with it
	$ git commit ;# commits what is in index

After the above steps, "git diff" (that is working tree against
index) still reports your local change to "a", which was _not_
committed.

Maybe you were mistaken because Cogito tries to be nice to its
users and always does a moral equivalent of "git commit -a"
(unless the user tells you to commit only specific paths), but
you needed to special case merge resolution commit to make sure
that you exclude "a" in the above example?  "git commit" does
not do "-a" by default, and it will stay that way, so I do not
think we have the "Oops" you described above.

"Oops" would happen only if you did "git commit -a" instead at
the last step.
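
A rough way to check this on a scratch repository, assuming a reasonably
recent git is installed. The temporary directory, the file contents and
the "demo" identity are made up for the demonstration, and the merge
itself is left out to keep it short; 'b' simply stands in for the
hand-resolved file.

```shell
# Scratch repository; everything in it is throwaway demo data.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
# Made-up identity so that commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

echo one >a
echo one >b
git update-index --add a b
git commit -q -m 'initial'

echo 'somelocalhack' >>a    # local hack, never passed to update-index
echo 'fixed up' >>b         # stands in for the hand-resolved file
git update-index b          # mark that b has been dealt with
git commit -q -m 'commit what is in the index'

# Paths recorded in the last commit, and paths still differing
# between the working tree and the index.
committed=$(git diff-tree -r --name-only --no-commit-id HEAD)
pending=$(git diff-files --name-only)
echo "committed: $committed"
echo "still local: $pending"
```

Only 'b' should show up as committed, while 'a' remains a local change.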


* Re: git-name-rev off-by-one bug
  2005-11-29 10:31     ` Petr Baudis
  2005-11-29 18:46       ` Junio C Hamano
@ 2005-11-29 21:40       ` linux
  2005-11-29 23:14         ` Junio C Hamano
  1 sibling, 1 reply; 64+ messages in thread
From: linux @ 2005-11-29 21:40 UTC (permalink / raw
  To: junkio, pasky; +Cc: git, linux

I'm feeling slightly guilty about eliciting such a flood of help, but I'm
certainly learning a lot.  But there's one statement that, while I'm not
doubting its accuracy, seems at odds with the mental model I'm building.
I must be misunderstanding something.

junkio wrote:
>> Okay, so git-update-index will overwrite a staged file with a
>> fresh stage-0 copy.  And git-commit will refuse to commit
>> (to be precise, it'll stop at the git-write-tree stage) if there
>> are unresolved conflicts.
>
> Sorry, I was unclear that I was talking about end-user level
> tool.  The update-index here is not about the conflict
> resolution in the index file read-tree documentation talks
> about.  That has already been done when "merge" ran in the
> conflicting case.  In the conflicting case, the working tree
> holds 3-way merge conflicting result, and the index holds HEAD
> version at stage0 for such a path.  Hand resolving after
> update-index is to record what you eventually want to commit
> (i.e. you are not replacing higher stage entry in the index with
> stage0 entry -- you are replacing stage0 entry with another).
> 
>> If you want to see the unmodified input files, you can find their
>> IDs with "git-ls-files -u" and then get a copy with "git-cat-file blob"
>> or "git-unpack-file".  git-merge-index is basically a different way to
>> process the output of git-ls-files -u.
>
> Yes, in principle.  But in practice you usually do not use these
> low level tools yourself.  When git-merge returns with
> conflicting paths, most of them have already been collapsed into
> stage0 and git-ls-files --unmerged would not show.  The only
> case I know of that you may still see higher stage entries in
> the index these days is merging paths with different mode bits.
> We used to leave higher stage entries when both sides added new
> file at the same path, but even that we show as merge from
> common these days.

And pasky reiterated:
>   From the user POV, the main difference between Cogito and GIT merging
> is that:
>
>  (i) Cogito tries to never leave the index "dirty" (i.e. containing
> unmerged entries), and instead all conflicts should propagate to the
> working tree, so that the user can resolve them without any further
> special tools. (What is lacking here is that Cogito won't proofcheck
> that you really resolved them all during a commit. That's a big TODO.
> But core GIT won't warn you about committing the classical << >> ==
> conflicts either.)

This seems odd to me.  There's an alternate implementation that
I described that makes a lot more sense to me, based on my current
state of knowledge.  Can someone explain why my idea is silly?

I'd imagine you'd consider user editing to be a last-resort merge
algorithm, but treat it like the other merges, and leave the file
staged while it's in progress.

Either git-checkout-index or something similar would "check out"
the staged file with CVS-style merge markers.  And an eventual
git-update-index would replace the staged file with a stage-0,
just like git-merge-one-file does automatically.

"git-diff" could default to diffing against the stage-2 file to
produce the same results as now, but you could also have an
option to diff against a different stage, which might be useful.

(This is another reason for my earlier comment that I don't think
the distinction between stage-0 and stage-2 is actually necessary.)

And git-write-tree would naturally stop you from committing with
unresolved conflicts.  You could still commit the conflict
markers, but it would be a two-step process.

You'd have the simple principle that all merges start with git-read-tree
producing a staged file, and end with git-update-index collapsing
them when it's been resolved.  (Or something like git-reset throwing
everything away.)

Having said all this, there's presumably a good reason why this is a
bad idea.  Could someone enlighten me?

Thanks!


* Re: git-name-rev off-by-one bug
  2005-11-29 21:40       ` git-name-rev off-by-one bug linux
@ 2005-11-29 23:14         ` Junio C Hamano
  2005-11-30  0:15           ` linux
  0 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-29 23:14 UTC (permalink / raw
  To: linux; +Cc: junkio, pasky, git

linux@horizon.com writes:

> This seems odd to me.  There's an alternate implementation that
> I described that makes a lot more sense to me, based on my current
> state of knowledge.  Can someone explain why my idea is silly?

It is not silly.  Actually we have "been there, done that".

We used to leave the higher stages around in the index after
automerge failure.  Note that you would not just have stage2 in
such a case.  stage1 keeps the common ancestor, stage2 has what
you started with, and stage3 holds the version from other
branch.  diff-stages can be used to diff between these stages.
We _could_ have added a feature to either diff-stages or
diff-files to compare between stageN and working tree.

However, this turned out to be not so convenient as we wished
initially.  What you would do after inspecting diffs between
stage1 and stage3, between stage2 and stage3 and between stage1
and stage2 typically ends up doing what "merge" has tried (and
failed) manually anyway, and being able to find the conflict
markers by simply running "git diff" was just as good, except
that we risk getting still-unresolved files checked in if the
user is not careful.

If you want to be clever about an automated merge, you could
write a new merge strategy to take the three trees and produce a
better automerge result.  That is what Fredrik has done in his
git-merge-recursive (now default).  Or you could "improve"
git-merge-one-file to take three blob object names and leave
file~1 file~2 file~3 in the working tree, instead of (or in
addition to) leaving a "merge" result with conflict markers, to
give the user ready access to the version from each stage.
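
As a sketch of that last idea, and not what git-merge-one-file actually
does: the hypothetical helper below, dump_stages, writes each stage of an
unmerged path to file~1, file~2 and file~3. It assumes a reasonably recent
git, and the demo fakes the conflicted index entry with
"git update-index --index-info" instead of running a real merge; the blob
contents are made up.

```shell
# dump_stages PATH: write every unmerged stage of PATH to PATH~<stage>.
# (Hypothetical helper, not part of git.)
dump_stages() {
    # "git ls-files -u" prints one line per stage:
    #   <mode> <object> <stage><TAB><path>
    git ls-files -u -- "$1" |
    while read -r mode object stage path; do
        git cat-file blob "$object" >"$1~$stage"
    done
}

# Demo setup: fake a conflicted path without running a merge.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
s1=$(echo 'common ancestor' | git hash-object -w --stdin)
s2=$(echo 'our version' | git hash-object -w --stdin)
s3=$(echo 'their version' | git hash-object -w --stdin)
printf '100644 %s 1\tfile\n100644 %s 2\tfile\n100644 %s 3\tfile\n' \
    "$s1" "$s2" "$s3" | git update-index --index-info

dump_stages file
cat file~1 file~2 file~3
```

The three files can then be compared with any ordinary diff tool.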


* Re: git-name-rev off-by-one bug
  2005-11-29 23:14         ` Junio C Hamano
@ 2005-11-30  0:15           ` linux
  2005-11-30  0:53             ` Junio C Hamano
  2005-11-30  1:51             ` Linus Torvalds
  0 siblings, 2 replies; 64+ messages in thread
From: linux @ 2005-11-30  0:15 UTC (permalink / raw
  To: junkio; +Cc: git, linux, pasky

> It is not silly.  Actually we have "been there, done that".

Um, okay, but I don't see why you changed...

> We used to leave the higher stages around in the index after
> automerge failure.  Note that you would not just have stage2 in
> such a case.  stage1 keeps the common ancestor, stage2 has what
> you started with, and stage3 holds the version from other
> branch.  diff-stages can be used to diff between these stages.
> We _could_ have added feature to either diff-stages or
> diff-files to compare between stageN and working tree.

Yes, exactly.  This is what I expected.

> However, this turned out to be not so convenient as we wished
> initially.  What you would do after inspecting diffs between
> stage1 and stage3, between stage2 and stage3 and between stage1
> and stage2 typically ends up doing what "merge" has tried (and
> failed) manually anyway, and being able to find the conflict
> markers by simply running "git diff" was just as good, except
> that we risk getting still-unresolved files checked in if the
> user is not careful.

You seem to be saying that producing a merge with conflict markers is
what you (almost) always want, so it's the default.  No objections.

But why collapse the index and only keep stage2?  Why not leave all
stages in the index *and* the merge-with-conflict-markers in the working
directory?

Then you could, for example, try alternate single-file merge algorithms
on the conflict, or regenerate the conflict markers if you wanted.
By keeping all of the source material around until the user has decided
on a resolution, you achieve maximal flexibility.

This is no more effort for the user to use in the common case (edit the
conflicts and git-update-index), but lets you try various things in the
working directory and easily back out of them.  ("git-merge-index -s manual
-a" would regenerate all of the conflict markers.)  And it prevents a
checkin until the matter has been resolved.

I'm wondering if this isn't a philosophical issue.  One side says that,
since all automated merging is complete, the stages should be collapsed.
To me, it makes more sense to leave out the adjective "automated" and
consider the merge to be incomplete; we're just putting the user in the
loop when software fails.


* Re: git-name-rev off-by-one bug
  2005-11-30  0:15           ` linux
@ 2005-11-30  0:53             ` Junio C Hamano
  2005-11-30  1:27               ` Junio C Hamano
  2005-11-30  1:51             ` Linus Torvalds
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  0:53 UTC (permalink / raw
  To: linux; +Cc: junkio, git, pasky

linux@horizon.com writes:

> I'm wondering if this isn't a philosophical issue.

I do not think so.  I have to admit I did not exactly agree with
the current behaviour when it was changed from the previous one,
but at the same time I did not have anything concrete against
it, and I did not care too much about the details back then.  I
suspect it was primarily done to make things easier for the
end user without changing already existing tools (i.e.,
git-diff-files did not have to start taking --stage=2 flag to
tell it to compare stage2 and working tree).

This is the message from Linus that announced the current
behaviour:

	http://marc.theaimsgroup.com/?l=git&m=111826424425624


* Re: git-name-rev off-by-one bug
  2005-11-30  0:53             ` Junio C Hamano
@ 2005-11-30  1:27               ` Junio C Hamano
  0 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  1:27 UTC (permalink / raw
  To: linux; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> linux@horizon.com writes:
>
>> I'm wondering if this isn't a philosophical issue.
>
> I do not think so....
> ...
> This is the message from Linus that announced the current
> behaviour:
>
> 	http://marc.theaimsgroup.com/?l=git&m=111826424425624

Replying to myself.  In the message, Linus talks about being
able to do (diff-cache is an old name for diff-index):

	git-diff-files -p xyzzy ;# to compare with our version
	git-diff-cache -p MERGE_HEAD xyzzy ;# to compare with his

But because of the "index before merge has to match HEAD" rule,
the first one could have been written as:

	git-diff-index -p HEAD xyzzy ;# to compare with ours

So in that sense, I suspect it may not be too bad if we just
changed merge-one-file with the patch at the end.

However, git-diff-index HEAD without paths restriction would
show everything the merge brought in, not just the conflicting
path, so in that sense it may make things slightly harder for
the end user to use.

---

diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index c3eca8b..df6dd67 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -79,11 +79,12 @@ case "${1:-.}${2:-.}${3:-.}" in
 		;;
 	esac
 
-	# We reset the index to the first branch, making
-	# git-diff-file useful
-	git-update-index --add --cacheinfo "$6" "$2" "$4"
-		git-checkout-index -u -f -- "$4" &&
-		merge "$4" "$orig" "$src2"
+	# Leave the conflicts in stages; failed merge result can be
+	# seen by "git-diff-index HEAD" or "git-diff-index MERGE_HEAD"
+	rm -fr "$4" &&
+	    git-cat-file blob "$2" >"$4" &&
+	    case "$6" in *?7??) chmod +x "$4" ;; esac &&
+	    merge "$4" "$orig" "$src2"
 	ret=$?
 	rm -f -- "$orig" "$src2"
 


* Re: git-name-rev off-by-one bug
  2005-11-30  0:15           ` linux
  2005-11-30  0:53             ` Junio C Hamano
@ 2005-11-30  1:51             ` Linus Torvalds
  2005-11-30  2:06               ` Junio C Hamano
                                 ` (2 more replies)
  1 sibling, 3 replies; 64+ messages in thread
From: Linus Torvalds @ 2005-11-30  1:51 UTC (permalink / raw
  To: linux; +Cc: junkio, git, pasky

On Tue, 29 Nov 2005, linux@horizon.com wrote:
>
> You seem to be saying that producing a merge with conflict markers is
> what you (almost) always want, so it's the default.  No objections.
> 
> But why collapse the index and only keep stage2?  Why not leave all
> stages in the index *and* the merge-with-conflict-markers in the working
> directory?

That may actually work really well. It would also avoid one bug that we 
have right now: if you fix things up by hand, but forget to explicitly do 
a "git-update-index filename" or "git commit filename", a plain regular 
"git commit" will happily commit all the changes _except_ for the ones you 
have merged manually. It's happened once to me.

If we left things in the index in an unmerged state, we'd be guaranteed to 
_fail_ that git commit unless somebody has done the 
git-update-index (or names the files specifically on the commit command 
line, which will do it for you).
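
A minimal sketch of that guarantee, assuming a reasonably recent git. The
unmerged entries are injected by hand with "git update-index --index-info"
rather than produced by a real merge, the file name and contents are made
up, and "git add" is used for the resolution step where this thread would
say git-update-index.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .

# Three versions of one file, stored as blobs (contents are made up).
base=$(echo 'Hi there' | git hash-object -w --stdin)
ours=$(printf 'Hi there\nours\n' | git hash-object -w --stdin)
theirs=$(printf 'Hi there\ntheirs\n' | git hash-object -w --stdin)

# Record them as stage 1 (base), 2 (ours) and 3 (theirs) of "hello",
# the state an unresolved merge would leave in the index.
printf '100644 %s 1\thello\n100644 %s 2\thello\n100644 %s 3\thello\n' \
    "$base" "$ours" "$theirs" | git update-index --index-info

# git-write-tree is the step of a commit that builds the tree object.
if git write-tree >/dev/null 2>&1; then before=ok; else before=refused; fi

# Resolve by hand, then collapse the stages into one stage-0 entry.
printf 'Hi there\nmerged text\n' >hello
git add hello
if git write-tree >/dev/null 2>&1; then after=ok; else after=refused; fi

echo "unmerged: $before, resolved: $after"
```

The first write-tree should be refused and the second should succeed.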

So I think I agree. 

Junio?

The problem (I think) was that "git-diff-file" did bad things with 
unmerged entries. That's what the comment in git-merge-one-file implies. 
But otherwise this should just make it so..

Do you want to test this out?

		Linus

---
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index c3eca8b..739a072 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -79,11 +79,7 @@ case "${1:-.}${2:-.}${3:-.}" in
 		;;
 	esac

-	# We reset the index to the first branch, making
-	# git-diff-file useful
-	git-update-index --add --cacheinfo "$6" "$2" "$4"
-		git-checkout-index -u -f -- "$4" &&
-		merge "$4" "$orig" "$src2"
+	merge "$4" "$orig" "$src2"
 	ret=$?
 	rm -f -- "$orig" "$src2"


* Re: git-name-rev off-by-one bug
  2005-11-30  1:51             ` Linus Torvalds
@ 2005-11-30  2:06               ` Junio C Hamano
  2005-11-30  2:33               ` Junio C Hamano
  2005-11-30 18:11               ` Daniel Barkalow
  2 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  2:06 UTC (permalink / raw
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> If we left things in the index in an unmerged state, we'd be guaranteed to 
> _fail_ that git commit unless somebody has done the 
> git-update-index (or names the files specifically on the commit command 
> line, which will do it for you).
>
> So I think I agree. 

I suspect we are saying the same thing.  Funny.


* Re: git-name-rev off-by-one bug
  2005-11-30  1:51             ` Linus Torvalds
  2005-11-30  2:06               ` Junio C Hamano
@ 2005-11-30  2:33               ` Junio C Hamano
  2005-11-30  3:12                 ` Linus Torvalds
  2005-11-30  3:15                 ` linux
  2005-11-30 18:11               ` Daniel Barkalow
  2 siblings, 2 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  2:33 UTC (permalink / raw
  To: Linus Torvalds; +Cc: linux, junkio, git, pasky

Linus Torvalds <torvalds@osdl.org> writes:

> Junio?
>
> The problem (I think) was that "git-diff-file" did bad things with 
> unmerged entries. That's what the comment in git-merge-one-file implies. 
> But otherwise this should just make it so..
>
> Do you want to test this out?

I have actually resolved one conflicting merge with this and it
was OK, except that it was a bit unpleasant when I first did
"git-diff-index HEAD" without giving any path ;-), but the users
will get used to it.  Pushed out as a part of the proposed
updates collection.

Here is what I wrote as the commit log message for the
hand-resolved merge using this updated merge-one-file.

commit 6b48f6ff7ffff6ca0f9da53d9423a0474dd008fd
Merge: b4f40b90ed1d9e1f3c0557e1ba064d169ba03a1c 99e01692063cc48adee19e1f738472a579c14ca2
Author: Junio C Hamano <junkio@cox.net>
Date:   Tue Nov 29 18:25:29 2005 -0800

    Merge branch 'jc/subdir'
    
    This one is done with the updated merge-one-file, which leaves
    unmerged entries in the index file to prevent unresolved merge
    from getting committed by mistake.
    
    After "git pull ..." fails, earlier the user said:
    
    	$ git-diff
    
    to see half-merged state.  Now git-diff just says:
    
    	$ git-diff
    	* Unmerged path ls-tree.c
    
    In order to get the earlier "show me the failed merge relative
    to my HEAD", you can say:
    
    	$ git-diff HEAD ls-tree.c
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>


* Re: git-name-rev off-by-one bug
  2005-11-30  2:33               ` Junio C Hamano
@ 2005-11-30  3:12                 ` Linus Torvalds
  2005-11-30  5:06                   ` Linus Torvalds
  2005-11-30  7:18                   ` Junio C Hamano
  2005-11-30  3:15                 ` linux
  1 sibling, 2 replies; 64+ messages in thread
From: Linus Torvalds @ 2005-11-30  3:12 UTC (permalink / raw
  To: Junio C Hamano; +Cc: linux, git, pasky



On Tue, 29 Nov 2005, Junio C Hamano wrote:
> 
> I have actually resolved one conflicting merge with this and it
> was OK, except that it was a bit unpleasant when I first did
> "git-diff-index HEAD" without giving any path ;-),

What does "git-diff-files" do? Just output a lot of nasty "unmerged" 
messages?

The _nice_ thing to do would be to output one "unmerged" message, but 
then diff against stage2 if it exists (and it basically always should, 
since otherwise we wouldn't have gotten a merge error).

If it did that, then you'd have the best of both world: the old nice "git 
diff" behaviour _and_ being safe (and saying that it's unmerged).

Something like this (untested, of course).

It _should_ write out

	* Unmerged path <filename>

followed by a regular diff, exactly like you'd want.

[ This all assumes that merge-one-file leaves the stages right. I think my 
  patch to do that was just broken. Yours was probably not. ]

		Linus

---
diff --git a/diff-files.c b/diff-files.c
index 38599b5..8a78326 100644
--- a/diff-files.c
+++ b/diff-files.c
@@ -95,11 +95,23 @@ int main(int argc, const char **argv)
 
 		if (ce_stage(ce)) {
 			show_unmerge(ce->name);
-			while (i < entries &&
-			       !strcmp(ce->name, active_cache[i]->name))
+			while (i < entries) {
+				struct cache_entry *nce = active_cache[i];
+
+				if (strcmp(ce->name, nce->name))
+					break;
+				/* Prefer to diff against stage 2 (original branch) */
+				if (ce_stage(nce) == 2)
+					ce = nce;
 				i++;
-			i--; /* compensate for loop control increments */
-			continue;
+			}
+			/*
+			 * Compensate for loop update
+			 */
+			i--;
+			/*
+			 * Show the diff for the 'ce' we chose
+			 */
 		}
 
 		if (lstat(ce->name, &st) < 0) {


* Re: git-name-rev off-by-one bug
  2005-11-30  2:33               ` Junio C Hamano
  2005-11-30  3:12                 ` Linus Torvalds
@ 2005-11-30  3:15                 ` linux
  1 sibling, 0 replies; 64+ messages in thread
From: linux @ 2005-11-30  3:15 UTC (permalink / raw
  To: junkio, torvalds; +Cc: git, linux, pasky

>   This one is done with the updated merge-one-file, which leaves
>   unmerged entries in the index file to prevent unresolved merge
>   from getting committed by mistake.
>   
>   After "git pull ..." fails, earlier the user said:
>   
>   	$ git-diff
>   
>   to see half-merged state.  Now git-diff just says:
>   
>   	$ git-diff
>   	* Unmerged path ls-tree.c
>   
>   In order to get the earlier "show me the failed merge relative
>   to my HEAD", you can say:
>   
>   	$ git-diff HEAD ls-tree.c

Cool!  You all know I like this change, mostly because it makes git's
merging conceptually cleaner and easier to explain.

Looking at git, the difference between it and other SCMs is the emphasis
on merging over editing.  The tools for local development are a bit
primitive in core git, but that's a well-understood problem and the
tools can be implemented as needed.

What makes git special is the assumption that a patch is going to pass
through several people on its way from the text editor to the release,
so merging is actually more important than initial writing.

Rather than saying "Linus doesn't scale" and giving up, it's seen as an
Amdahl's law problem - the goal is to remove as much work from Linus as
possible and thus make him scale.  (The shorter and earthier way to say
this is that Linus is a lazy bastard, which is no surprise to anyone
who's seen him try to hide behind a podium. ;-) )

And an essential part of making that work is a good toolkit for dealing
with merging, and particularly in-progress merges.  By having the
concept of an unmerged index, git lets you develop merging algorithms
in a modular way, as opposed to "one big hairy pile of magic DWIMmery"
that people are afraid to touch.

For example, one thing I'm sure will arrive fairly soon is file-type
specific merge algorithms.  For something like a .po file where the
order of sections doesn't matter, merging ad->abd and ad->acd can be
fully automated.
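
A toy illustration of that ad->abd / ad->acd case, not a real .po merger:
it assumes each line is one independent section and that order carries no
meaning. The function name merge_unordered and the one-letter sample
files are made up.

```shell
# merge_unordered BASE OURS THEIRS: three-way merge of files whose lines
# form an unordered set. A line survives if both sides still have it or
# if either side added it; a line deleted by either side is dropped.
merge_unordered() {
    tmp=$(mktemp -d) || return 1
    LC_ALL=C sort -u "$1" >"$tmp/base"
    LC_ALL=C sort -u "$2" >"$tmp/ours"
    LC_ALL=C sort -u "$3" >"$tmp/theirs"
    {
        LC_ALL=C comm -12 "$tmp/ours" "$tmp/theirs"   # present on both sides
        LC_ALL=C comm -23 "$tmp/ours" "$tmp/base"     # added by ours
        LC_ALL=C comm -23 "$tmp/theirs" "$tmp/base"   # added by theirs
    } | LC_ALL=C sort -u
    rm -rf "$tmp"
}

# Demo: base "ad", ours "abd", theirs "acd", one letter per line.
dir=$(mktemp -d)
printf 'a\nd\n' >"$dir/base"
printf 'a\nb\nd\n' >"$dir/ours"
printf 'a\nc\nd\n' >"$dir/theirs"
merged=$(merge_unordered "$dir/base" "$dir/ours" "$dir/theirs")
echo "$merged"
```

Both additions come through without a conflict, giving a, b, c, d. Real
.po entries span several lines, so an actual merger would have to split
the file into entries first.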

There are a number of good ideas in git, but from what I've seen so far,
"git-read-tree -m" is the most important one.

Making git-diff Do The Right Thing is a relatively minor matter.


* Re: git-name-rev off-by-one bug
  2005-11-30  3:12                 ` Linus Torvalds
@ 2005-11-30  5:06                   ` Linus Torvalds
  2005-11-30  5:51                     ` Junio C Hamano
  2005-11-30  6:09                     ` git-name-rev off-by-one bug linux
  2005-11-30  7:18                   ` Junio C Hamano
  1 sibling, 2 replies; 64+ messages in thread
From: Linus Torvalds @ 2005-11-30  5:06 UTC (permalink / raw
  To: Junio C Hamano; +Cc: linux, git, pasky



On Tue, 29 Nov 2005, Linus Torvalds wrote:
> 
> Something like this (untested, of course).
> 
> It _should_ write out
> 
> 	* Unmerged path <filename>
> 
> followed by a regular diff, exactly like you'd want.

The more I think about this, the more I think this is a wonderful 
approach, but it would be even better to add a flag to let it choose 
between diffing against stage2 by default or diffing against stage3 (and 
hey, maybe even diffing against the original).

In fact, here's a patch that does that, and also makes the "resolve" merge 
create these kinds of merges. As usual, my python knowledge is useless, 
since the only thing I know about python is that thou shalt not count to 
four. As a result, the standard recursive merge doesn't do this yet ;(

The magic incantation is to just do

	git diff

and you'll get a diff against the first branch. If you want a diff against 
the second branch, just use the "-2" option, and if you want a diff 
against the common base (which is actually surprisingly useful, I noticed, 
when I tried this with a conflict), use "-0".
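
The selection rule can be modelled outside of C as a rough sketch: take
the first entry for a path, and switch to a later entry if it has the
requested stage. The helper names stage_for_flag and pick_stage are made
up, the input imitates "git-ls-files -u" output, and the object names
(aaa, bbb, ...) are placeholders rather than real hashes.

```shell
# stage_for_flag FLAG: map the proposed options to index stages.
stage_for_flag() {
    case "$1" in
        -0) echo 1 ;;   # common base
        -1) echo 2 ;;   # first branch
        -2) echo 3 ;;   # second branch
        *)  echo 2 ;;   # default: first branch
    esac
}

# pick_stage WANT: read "<mode> <object> <stage><TAB><path>" lines on
# stdin and print "<path> <stage> <object>" for the entry to diff
# against, one line per path.
pick_stage() {
    awk -v want="$1" '
        {
            split($0, half, "\t"); path = half[2]
            split(half[1], f, " ")
            if (!(path in obj)) {
                # First entry for this path: use it as the fallback.
                order[++n] = path; obj[path] = f[2]; stage[path] = f[3]
            } else if (f[3] == want) {
                # A later entry with the wanted stage takes over.
                obj[path] = f[2]; stage[path] = f[3]
            }
        }
        END { for (i = 1; i <= n; i++) print order[i], stage[order[i]], obj[order[i]] }
    '
}

# "hello" has all three stages; "gone" has no stage 2.
printf '100644 aaa 1\thello\n100644 bbb 2\thello\n100644 ccc 3\thello\n100644 ddd 1\tgone\n100644 eee 3\tgone\n' |
    pick_stage "$(stage_for_flag -1)"
```

With "-1" this picks stage 2 for "hello" and falls back to the first
entry, stage 1, for "gone". Released versions of git, as far as I know,
number these options by stage instead: -1/--base, -2/--ours and
-3/--theirs.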

I've also changed "git diff" to _not_ drop the "-M" and "-p" options just 
because you give some other diff option. That was always a mistake. If you 
really want the raw git diff format, use the raw "git-diff-xyz" programs 
directly.

Whaddaya think? I really like it. Here's an example, where I merged two 
branches that had the file "hello" in it, and the first branch had:

	Hi there
	This is the master branch

and the second one had

	Hi there
	This is the 'other' branch

and the base version had just the "Hi there", of course.

The default (or "-1" arg) behaviour is:

	[torvalds@g5 test-merge]$ git diff
	* Unmerged path hello
	diff --git a/hello b/hello
	index 7cebcf8..3fa4697 100644
	--- a/hello
	+++ b/hello
	@@ -1,2 +1,6 @@
	 Hi there
	+<<<<<<< hello
	 This is the master branch
	+=======
	+This is the 'other' branch
	+>>>>>>> .merge_file_fJWiNf

which is obvious enough. You see exactly the conflict, and you see the 
part of the first branch that is unchanged.

Diffing against the original gives you

	[torvalds@g5 test-merge]$ git diff -0
	* Unmerged path hello
	diff --git a/hello b/hello
	index 6530b63..3fa4697 100644
	--- a/hello
	+++ b/hello
	@@ -1 +1,6 @@
	 Hi there
	+<<<<<<< hello
	+This is the master branch
	+=======
	+This is the 'other' branch
	+>>>>>>> .merge_file_fJWiNf

which I actually found really readable. I realize that this is a really 
stupid example, but for a lot of trivial merges, this won't be _that_ far 
off, and it basically shows what happened in both branches, and ignores 
what neither side changed.

The "-2" in this case is just the same as "-1" except obviously the "+" 
characters are situated differently. Still useful (especially if the 
changes were more complex).

Me likee. Hope you guys do too.

(And this is quite independently of the advantage that you can't commit an 
unmerged state by mistake, which is perhaps an even bigger one).

		Linus

---
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 6b496ed..afc7334 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -18,6 +18,12 @@
 	object name of pre- and post-image blob on the "index"
 	line when generating a patch format output.	
 
+-0 -1 -2::
+	When an unmerged entry is seen, diff against the base version,
+	the "first branch" or the "second branch" respectively.
+
+	The default is to diff against the first branch.
+
 -B::
 	Break complete rewrite changes into pairs of delete and create.
 
diff --git a/diff-files.c b/diff-files.c
index 38599b5..d744636 100644
--- a/diff-files.c
+++ b/diff-files.c
@@ -13,6 +13,7 @@ COMMON_DIFF_OPTIONS_HELP;
 
 static struct diff_options diff_options;
 static int silent = 0;
+static int diff_unmerged_stage = 2;
 
 static void show_unmerge(const char *path)
 {
@@ -46,7 +47,13 @@ int main(int argc, const char **argv)
 			argc--;
 			break;
 		}
-		if (!strcmp(argv[1], "-q"))
+		if (!strcmp(argv[1], "-0"))
+			diff_unmerged_stage = 1;
+		else if (!strcmp(argv[1], "-1"))
+			diff_unmerged_stage = 2;
+		else if (!strcmp(argv[1], "-2"))
+			diff_unmerged_stage = 3;
+		else if (!strcmp(argv[1], "-q"))
 			silent = 1;
 		else if (!strcmp(argv[1], "-r"))
 			; /* no-op */
@@ -95,11 +102,23 @@ int main(int argc, const char **argv)
 
 		if (ce_stage(ce)) {
 			show_unmerge(ce->name);
-			while (i < entries &&
-			       !strcmp(ce->name, active_cache[i]->name))
+			while (i < entries) {
+				struct cache_entry *nce = active_cache[i];
+
+				if (strcmp(ce->name, nce->name))
+					break;
+				/* Prefer to diff against the proper unmerged stage */
+				if (ce_stage(nce) == diff_unmerged_stage)
+					ce = nce;
 				i++;
-			i--; /* compensate for loop control increments */
-			continue;
+			}
+			/*
+			 * Compensate for loop update
+			 */
+			i--;
+			/*
+			 * Show the diff for the 'ce' we chose
+			 */
 		}
 
 		if (lstat(ce->name, &st) < 0) {
diff --git a/git-diff.sh b/git-diff.sh
index b3ec84b..efe8f75 100755
--- a/git-diff.sh
+++ b/git-diff.sh
@@ -3,12 +3,13 @@
 # Copyright (c) 2005 Linus Torvalds
 # Copyright (c) 2005 Junio C Hamano
 
+# Some way to turn these off?
+default_flags="-M -p"
+
 rev=$(git-rev-parse --revs-only --no-flags --sq "$@") || exit
-flags=$(git-rev-parse --no-revs --flags --sq "$@")
+flags=$(git-rev-parse --no-revs --flags --sq $default_flags "$@")
 files=$(git-rev-parse --no-revs --no-flags --sq "$@")
 
-: ${flags:="'-M' '-p'"}
-
 # I often say 'git diff --cached -p' and get scolded by git-diff-files, but
 # obviously I mean 'git diff --cached -p HEAD' in that case.
 case "$rev" in
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index c3eca8b..739a072 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -79,11 +79,7 @@ case "${1:-.}${2:-.}${3:-.}" in
 		;;
 	esac
 
-	# We reset the index to the first branch, making
-	# git-diff-file useful
-	git-update-index --add --cacheinfo "$6" "$2" "$4"
-		git-checkout-index -u -f -- "$4" &&
-		merge "$4" "$orig" "$src2"
+	merge "$4" "$orig" "$src2"
 	ret=$?
 	rm -f -- "$orig" "$src2"
 


* Re: git-name-rev off-by-one bug
  2005-11-30  5:06                   ` Linus Torvalds
@ 2005-11-30  5:51                     ` Junio C Hamano
  2005-11-30  6:11                       ` Junio C Hamano
                                         ` (2 more replies)
  2005-11-30  6:09                     ` git-name-rev off-by-one bug linux
  1 sibling, 3 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  5:51 UTC (permalink / raw
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> Whaddaya think? I really like it.

Yes.  Maybe split this into 3 pieces.  I do not want to waste
your time with that, so will take the liberty to do so myself,
with appropriate commit log messages, if you do not mind.

 1. give diff-files -[012] flags.
 2. merge-one-file leaves unmerged index entries.
 3. always use -M -p in git-diff.

I do not have any issue against #1.

Regarding #2, in an earlier message you said something about
"patch to do that was just broken" which I did not understand; I
think your patch I am replying to is doing the right thing.  That
case arm is dealing with a path that exists in "our" branch and
the working tree blob should be the same as recorded in the
HEAD, so I did not have to do the unpack-cat-chmod like I did in
mine.  Am I simply confused?

About #3, I am not quite sure.  I often use --name-status and I
do _not_ want -p to kick in when I do so.  How about something
like this?

---

diff --git a/git-diff.sh b/git-diff.sh
index b3ec84b..8e0fe34 100755
--- a/git-diff.sh
+++ b/git-diff.sh
@@ -7,8 +7,6 @@ rev=$(git-rev-parse --revs-only --no-fla
 flags=$(git-rev-parse --no-revs --flags --sq "$@")
 files=$(git-rev-parse --no-revs --no-flags --sq "$@")

-: ${flags:="'-M' '-p'"}
-
 # I often say 'git diff --cached -p' and get scolded by git-diff-files, but
 # obviously I mean 'git diff --cached -p HEAD' in that case.
 case "$rev" in
@@ -20,6 +18,21 @@ case "$rev" in
 	esac
 esac

+# If we do not have --name-status, --name-only nor -r, default to -p.
+# If we do not have -B nor -C, default to -M.
+case " $flags " in
+*" '--name-status' "* | *" '--name-only' "* | *" '-r' "* )
+	;;
+*)
+	flags="$flags '-p'" ;;
+esac
+case " $flags " in
+*" '-"[BCM]* | *" '--find-copies-harder' "*)
+	;; # something like -M50.
+*)
+	flags="$flags '-M'" ;;
+esac
+
 case "$rev" in
 ?*' '?*' '?*)
 	echo >&2 "I don't understand"


* Re: git-name-rev off-by-one bug
  2005-11-30  5:06                   ` Linus Torvalds
  2005-11-30  5:51                     ` Junio C Hamano
@ 2005-11-30  6:09                     ` linux
  2005-11-30  6:39                       ` Junio C Hamano
  2005-11-30 16:12                       ` git-name-rev off-by-one bug Linus Torvalds
  1 sibling, 2 replies; 64+ messages in thread
From: linux @ 2005-11-30  6:09 UTC (permalink / raw
  To: junkio, torvalds; +Cc: git, linux, pasky

> +-0 -1 -2::
> +	When an unmerged entry is seen, diff against the base version,
> +	the "first branch" or the "second branch" respectively.
> +
> +	The default is to diff against the first branch.
> +

Er... why are these flags zero-based?

git-ls-files -s displays them as "1", "2" and "3".  All the docs talk
about "stage1", "stage2" and "stage3".

Change the nomenclature if you want, but this mixed messages business is
kind of weird...

(Heartened by the response to my previous question of "why do you do
this thing that makes no sense to me", I'm going to be bold and not ask
why this is a good idea.)


* Re: git-name-rev off-by-one bug
  2005-11-30  5:51                     ` Junio C Hamano
@ 2005-11-30  6:11                       ` Junio C Hamano
  2005-11-30 16:13                         ` Linus Torvalds
  2005-11-30 16:08                       ` Linus Torvalds
  2005-12-02  8:25                       ` Junio C Hamano
  2 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  6:11 UTC (permalink / raw
  To: Linus Torvalds; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Linus Torvalds <torvalds@osdl.org> writes:
>
>> Whaddaya think? I really like it.
>
> Yes.  Maybe split this into 3 pieces.  I do not want to waste
> your time with that, so will take the liberty to do so myself,
> with appropriate commit log messages, if you do not mind.
>
>  1. give diff-files -[012] flags.
>  2. merge-one-file leaves unmerged index entries.
>  3. always use -M -p in git-diff.
>
> I do not have any issue against #1.

Actually there is one.  If we are asked to do diff -1 and an
unmerged path does not have stage #2 but stage #1 entry exists,
we would end up showing that stage #1, without telling the user
that we are showing something different from what was asked.
How about doing something like this, on top of yours?

--- diff-files.c
+++ diff-files.c
@@ -117,8 +117,11 @@
 			 */
 			i--;
 			/*
-			 * Show the diff for the 'ce' we chose
+			 * Show the diff for the 'ce' if we found the one
+			 * from the desired stage.
 			 */
+			if (ce_stage(ce) != diff_unmerged_stage)
+				continue;
 		}
 
 		if (lstat(ce->name, &st) < 0) {


* Re: git-name-rev off-by-one bug
  2005-11-30  6:09                     ` git-name-rev off-by-one bug linux
@ 2005-11-30  6:39                       ` Junio C Hamano
  2005-11-30 13:10                         ` More merge questions linux
  2005-11-30 16:12                       ` git-name-rev off-by-one bug Linus Torvalds
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  6:39 UTC (permalink / raw
  To: linux; +Cc: junkio, torvalds, git, pasky

linux@horizon.com writes:

>> +-0 -1 -2::
>> +	When an unmerged entry is seen, diff against the base version,
>> +	the "first branch" or the "second branch" respectively.
>> +
>> +	The default is to diff against the first branch.
>> +
>
> Er... why are these flags zero-based?

Because -1 means "first branch" (usually "ours", aka HEAD), and
-2 means "second branch" ("theirs", aka MERGE_HEAD), and -0 is
for the base (aka merge base)?

But I think you are right.  The numeric parameters should match
stage number for consistency.

How about if I redo the patch to make diff-files accept -1/-2/-3
instead, and in addition accept "--base", "--ours", and
"--theirs" as synonyms?

Side note.  diff3 says MINE OLDER YOURS and the way to remember
the order is they are alphabetical.  We can say the same for
base, ours and theirs.
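
A minimal sketch of that proposal, for illustration only.  The helper
name parse_stage_flag() is hypothetical and is not a function in
diff-files.c; it only shows the numeric flags lining up with the stage
numbers, with the three long options as synonyms.

```c
#include <string.h>

/*
 * Hypothetical helper, for illustration only: map a command line
 * flag to the index stage it selects.  Returns 1, 2 or 3 for a
 * recognized flag and -1 for anything else.
 */
int parse_stage_flag(const char *arg)
{
	if (!strcmp(arg, "-1") || !strcmp(arg, "--base"))
		return 1;	/* stage 1: merge base */
	if (!strcmp(arg, "-2") || !strcmp(arg, "--ours"))
		return 2;	/* stage 2: first branch (HEAD) */
	if (!strcmp(arg, "-3") || !strcmp(arg, "--theirs"))
		return 3;	/* stage 3: second branch (MERGE_HEAD) */
	return -1;
}
```

With this mapping the number the user types is the same number that
git-ls-files -s prints for the entry.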


* Re: git-name-rev off-by-one bug
  2005-11-30  3:12                 ` Linus Torvalds
  2005-11-30  5:06                   ` Linus Torvalds
@ 2005-11-30  7:18                   ` Junio C Hamano
  2005-11-30  9:05                     ` Junio C Hamano
  2005-11-30  9:42                     ` Junio C Hamano
  1 sibling, 2 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  7:18 UTC (permalink / raw
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 29 Nov 2005, Junio C Hamano wrote:
>> 
>> I have actually resolved one conflicting merge with this and it
>> was OK, except that it was a bit unpleasant when I first did
>> "git-diff-index HEAD" without giving any path ;-),
>
> What does "git-diff-files" do? Just output a lot of nasty "unmerged" 
> messages?

That was not what was unpleasant.  What was unpleasant was that
those "unmerged" messages were buried under a heap of normal diffs,
showing the successfully merged entries as the result of the merge.

I am inclined to munge your patch to do this:

 * Change -0/-1/-2 to -1/-2/-3 to be consistent with stage
   numbers, for the technically minded.

 * Give --base, --ours, and --theirs as synonyms for -1/-2/-3,
   for end users.

 * Change it not to pick other unmerged stage (I sent a separate
   message about this already).

 * Change the diff_unmerged_stage default to 0.  While you are
   inspecting a conflicted merge, you need to give --ours (or
   -2) explicitly.  Alternatively we could first check if the
   whole index is unmerged and make it default to 2 without
   flags, but that would mean inspecting 19K entries first
   before starting the main loop for the kernel for normal case.
   With hot cache it is fine, so I'll try it first.

   This "with unmerged defaults to 2 otherwise defaults to 0"
   behaviour needs to be made overridable with an option, say -0
   (or --merged, but that is overkill); otherwise you cannot get
   diffs for merged paths until the index file is unmerged.

 * When diff_unmerged_stage is zero, keep the current behaviour.
   Show diff for only specified stage when diff_unmerged_stage
   is not zero.
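
The "do not pick other unmerged stage" and "only the specified stage"
points can be sketched roughly as below.  This is an illustration under
assumptions, not the code in diff-files.c: struct entry is a simplified
stand-in for git's cache entry, and pick_unmerged() is a hypothetical
helper name.

```c
#include <stddef.h>

/* Simplified stand-in for git's struct cache_entry (illustration only). */
struct entry {
	const char *name;
	int stage;	/* 0 = merged, 1 = base, 2 = ours, 3 = theirs */
};

/*
 * Hypothetical helper: among the n unmerged entries recorded for one
 * path, return the entry from the wanted stage.  Returns NULL when no
 * stage was asked for (wanted == 0) or when the wanted stage is missing;
 * an entry from a different stage is never substituted silently.
 */
const struct entry *pick_unmerged(const struct entry *e, size_t n, int wanted)
{
	size_t i;

	if (!wanted)
		return NULL;	/* caller only reports the path as unmerged */
	for (i = 0; i < n; i++)
		if (e[i].stage == wanted)
			return &e[i];
	return NULL;
}
```

A NULL return means the caller shows only the "unmerged" notice for the
path and does not produce a diff for it.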


* Re: git-name-rev off-by-one bug
  2005-11-29  9:29     ` Junio C Hamano
@ 2005-11-30  8:37       ` Junio C Hamano
  0 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  8:37 UTC (permalink / raw
  To: linux; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> On a related topic of removing unwanted paths, earlier I said
> 2-way is used to make sure "git checkout" takes your changes
> with you when you switch branches.  As a natural consequence of
> this, if you do not have any local changes, "git checkout"
> without "-f" does the right thing -- it removes unwanted paths
> that existed in the original branch but not in the branch you
> are switching to.

Here is some unsolicited advice ("tip of the day").

I was on a branch which had some local "throwaway" changes, and
I wanted to switch back to the master branch.  To be honest, I
even forgot I had local changes there.  So I ran "git checkout",
and here is what happened.

        junio@siamese:~/git$ git checkout master
        fatal: Entry 'Documentat...' not uptodate. Cannot merge.

The easiest is "git checkout -f master" at this point, but I
usually do not do that.  If that entry "git checkout" complains
about is something that is not in the master branch and I have
throwaway changes, "git checkout -f master" would leave that
file with throwaway changes behind.  So I did this first:

        junio@siamese:~/git$ git reset --hard

This would sync my working tree to the current branch.  Then

        junio@siamese:~/git$ git checkout master

would switch branches properly, removing that new file that
should not exist in the working tree.


* Re: git-name-rev off-by-one bug
  2005-11-30  7:18                   ` Junio C Hamano
@ 2005-11-30  9:05                     ` Junio C Hamano
  2005-11-30  9:42                     ` Junio C Hamano
  1 sibling, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  9:05 UTC (permalink / raw
  To: Linus Torvalds; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Linus Torvalds <torvalds@osdl.org> writes:
>
>> On Tue, 29 Nov 2005, Junio C Hamano wrote:
>>> 
>>> I have actually resolved one conflicting merge with this and it
>>> was OK, except that it was a bit unpleasant when I first did
>>> "git-diff-index HEAD" without giving any path ;-),
>>
>> What does "git-diff-files" do? Just output a lot of nasty "unmerged" 
>> messages?
>
> That was not what was unpleasant.  What was unpleasant was those
> "unmerged" messages were buried under heap of normal diffs,
> showing the successfully merged entries as the result of merge.
>
> I am inclined to munge your patch to do this:

This I have done, and pushed out to "pu" for tonight.  After
doing some more tests I'll have this graduate to "master"
sometime tomorrow along with other accumulated changes.


* Re: git-name-rev off-by-one bug
  2005-11-30  7:18                   ` Junio C Hamano
  2005-11-30  9:05                     ` Junio C Hamano
@ 2005-11-30  9:42                     ` Junio C Hamano
  1 sibling, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30  9:42 UTC (permalink / raw
  To: Linus Torvalds; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Linus Torvalds <torvalds@osdl.org> writes:
>
>> What does "git-diff-files" do? Just output a lot of nasty "unmerged" 
>> messages?
>
> That was not what was unpleasant.  What was unpleasant was those
> "unmerged" messages were buried under heap of normal diffs,
> showing the successfully merged entries as the result of merge.

Correction.  The above is a faulty memory, and does not happen,
with or without your "stage0 or stage2" patch.  Cleanly merged
paths are written out to working tree and collapsed to stage0 in
the index, so diff-files wouldn't have shown them at all.  

Sorry about the confused statement.  What I saw was my local
modifications on the paths unrelated to the merge.


* More merge questions
  2005-11-30  6:39                       ` Junio C Hamano
@ 2005-11-30 13:10                         ` linux
  2005-11-30 18:37                           ` Daniel Barkalow
  2005-11-30 20:23                           ` Junio C Hamano
  0 siblings, 2 replies; 64+ messages in thread
From: linux @ 2005-11-30 13:10 UTC (permalink / raw
  To: git; +Cc: junkio, linux

I'm working my way through a thorough understanding of merging.

First I got git-read-tree's 3-way merge down to 6 conditionals, where
a missing entry is considered equal to a missing entry, and a missing
index entry is considered clean.

a) If stage2 == stage3, use stage2
b) If stage1 == stage3, use stage2
c) If the index entry exists and is dirty (working dir changes), FAIL
d) If stage1 == stage2, use stage3
e) If trivial-only, FAIL
f) Return unmerged result for 3-way resolution by git-merge-index.

Case c is needed so you don't change the world out from under
your working directory changes.  You could move it earlier and
make things stricter, but that's the minimal restriction.
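
The six conditionals above can be written as a straight if-chain.  This
is only a sketch of the list as stated, not read-tree's actual code: the
names same_blob(), resolve() and enum outcome are hypothetical, each
stage is modelled as an object-name string, and NULL stands for a
missing entry.

```c
#include <string.h>

enum outcome { USE_STAGE2, USE_STAGE3, FAIL, UNMERGED };

/* Two stages are equal if both are missing or both name the same blob. */
int same_blob(const char *a, const char *b)
{
	if (!a || !b)
		return a == b;
	return !strcmp(a, b);
}

/*
 * s1, s2, s3: object names for stage1..stage3, NULL when missing.
 * index_dirty: non-zero if the index entry exists and the working
 *              directory has changes against it.
 * trivial_only: non-zero if only trivial merges are allowed.
 */
enum outcome resolve(const char *s1, const char *s2, const char *s3,
		     int index_dirty, int trivial_only)
{
	if (same_blob(s2, s3))
		return USE_STAGE2;	/* (a) both sides agree */
	if (same_blob(s1, s3))
		return USE_STAGE2;	/* (b) only our side changed */
	if (index_dirty)
		return FAIL;		/* (c) would clobber local changes */
	if (same_blob(s1, s2))
		return USE_STAGE3;	/* (d) only their side changed */
	if (trivial_only)
		return FAIL;		/* (e) real merge needed, not allowed */
	return UNMERGED;		/* (f) leave it for git-merge-index */
}
```

"Use stage2" with a missing stage2 means the path stays deleted; the
dirty check sits after (a) and (b) because those two cases leave the
path as it already is in stage2.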

Then I started thinking about 2-way merge, and how that differed
from a 3-way merge where stage2 was the previous index contents.

If you apply the same rules (with trivial-only true), the only differences
to the big 22-case table in the git-read-tree docs are:

3) This says that if stage1 and stage3 exist, use stage3.
   3-way says if they're equal, delete the file, while if they're
   unequal, it's fail.

If 3-way git-merge-index were allowed, then the conditions that would
change to do it are cases 8 and 12.

The full list of cases and the conditional that applies, is:

0) a
1) d
2) a
3) see above.  It's b or e by my logic, but d by the table.

4) b
5) b
6) a
7) a
8) e
9) c

10) d
11) c
12) e
13) c

14) a or b
15) a or b

16) e
17) c
18) a
19) a
20) d
21) c

Given that it all matches up so nicely, I'd like to honestly ask if
case 3 of the conditions is correct.  I'd think that if I deleted
a file from the index, and the file wasn't changed on the head I'm
tracking, the right resolution is to keep it deleted.  Why override
my deletion?

Sorry if this is a dumb question, but it's not obvious to me.


* Re: git-name-rev off-by-one bug
  2005-11-30  5:51                     ` Junio C Hamano
  2005-11-30  6:11                       ` Junio C Hamano
@ 2005-11-30 16:08                       ` Linus Torvalds
  2005-12-02  8:25                       ` Junio C Hamano
  2 siblings, 0 replies; 64+ messages in thread
From: Linus Torvalds @ 2005-11-30 16:08 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git



On Tue, 29 Nov 2005, Junio C Hamano wrote:
> 
> About #3, I am not quite sure.  I often use --name-status and I
> do _not_ want -p to kick in when I do so.  How about something
> like this?

Yes. I was thinking about something like that, but I decided it was too 
much work ;)

		Linus


* Re: git-name-rev off-by-one bug
  2005-11-30  6:09                     ` git-name-rev off-by-one bug linux
  2005-11-30  6:39                       ` Junio C Hamano
@ 2005-11-30 16:12                       ` Linus Torvalds
  1 sibling, 0 replies; 64+ messages in thread
From: Linus Torvalds @ 2005-11-30 16:12 UTC (permalink / raw
  To: linux; +Cc: junkio, git, pasky

On Tue, 30 Nov 2005, linux@horizon.com wrote:
>
> > +-0 -1 -2::
> > +	When an unmerged entry is seen, diff against the base version,
> > +	the "first branch" or the "second branch" respectively.
> > +
> > +	The default is to diff against the first branch.
> 
> Er... why are these flags zero-based?

Because it makes more sense from a "git diff" standpoint to do that.

The fact that _internally_, git puts the first branch into "stage 2", and 
the second one into "stage 3", that's very much an internal git 
implementation issue that makes no sense to expose to a regular user.

> git-ls-files -s displays them as "1", "2" and "3".  All the docs talk
> about "stage1", "stage2" and "stage3".

Yes, but those are _technical_ docs, not docs aimed toward a user. Nobody 
sane uses "git-ls-files --stage" outside of a script, or unless they 
really know git and are trying to debug something.

From a user standpoint, it makes a lot more sense to say "primary branch" 
and "other branch" , and then "-1" and "-2" make sense (and then the "base 
of the merge" makes sense as "-0").

		Linus


* Re: git-name-rev off-by-one bug
  2005-11-30  6:11                       ` Junio C Hamano
@ 2005-11-30 16:13                         ` Linus Torvalds
  0 siblings, 0 replies; 64+ messages in thread
From: Linus Torvalds @ 2005-11-30 16:13 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git



On Tue, 29 Nov 2005, Junio C Hamano wrote:
> 
> Actually there is one.  If we are asked to do diff -1 and an
> unmerged path does not have stage #2 but stage #1 entry exists,
> we would end up showing that stage #1, without telling the user
> that we are showing something different from what was asked.
> How about doing something like this, on top of yours?

Yes.

		Linus


* Re: git-name-rev off-by-one bug
  2005-11-29  5:54 ` Junio C Hamano
  2005-11-29  8:05   ` linux
@ 2005-11-30 17:46   ` Daniel Barkalow
  2005-11-30 20:05     ` Junio C Hamano
  1 sibling, 1 reply; 64+ messages in thread
From: Daniel Barkalow @ 2005-11-30 17:46 UTC (permalink / raw
  To: Junio C Hamano; +Cc: linux, git

On Mon, 28 Nov 2005, Junio C Hamano wrote:

> *1* It is a shame that the most comprehensive definition of
> 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test
> script.

Isn't Documentation/technical/trivial-merge.txt more comprehensive?

Probably the tables in various other places should be replaced with 
references to this document.

	-Daniel


* Re: git-name-rev off-by-one bug
  2005-11-30  1:51             ` Linus Torvalds
  2005-11-30  2:06               ` Junio C Hamano
  2005-11-30  2:33               ` Junio C Hamano
@ 2005-11-30 18:11               ` Daniel Barkalow
  2 siblings, 0 replies; 64+ messages in thread
From: Daniel Barkalow @ 2005-11-30 18:11 UTC (permalink / raw
  To: Linus Torvalds; +Cc: linux, junkio, git, pasky

On Tue, 29 Nov 2005, Linus Torvalds wrote:

> If we left things in the index in an unmerged state, we'd be guaranteed to 
> either _fail_ that git commit unless somebody has done the 
> git-update-index (or names the files specifically on the commit command 
> line, which will do it for you).

At this point, we could have a "git-merged-by-hand" script that would take 
filenames, check that they're unmerged now, and, if so, call 
git-update-index for them. And it could have a -a to do all of the 
unmerged entries (i.e., "I'm done merging by hand"), and maybe also have a 
flag to git-commit that does this, so you can say, "Commit the merge I did 
by hand, whatever filenames it used, but not any other changes I may have 
had beforehand."

The "merged-by-hand" script would probably be a sensible place to complain 
about leftover conflict markers (unless you force it).

	-Daniel
*This .sig left intentionally blank*


* Re: More merge questions
  2005-11-30 13:10                         ` More merge questions linux
@ 2005-11-30 18:37                           ` Daniel Barkalow
  2005-11-30 20:23                           ` Junio C Hamano
  1 sibling, 0 replies; 64+ messages in thread
From: Daniel Barkalow @ 2005-11-30 18:37 UTC (permalink / raw
  To: linux; +Cc: git, junkio

On Wed, 30 Nov 2005, linux@horizon.com wrote:

> Given that it all matches up so nicely, I'd like to honestly ask if
> case 3 of the conditions is correct.  I'd think that if I deleted
> a file from the index, and the file wasn't changed on the head I'm
> tracking, the right resolution is to keep it deleted.  Why override
> my deletion?

You're allowed to do the two-way merge with your index empty, and this 
means that you just hadn't read the ancestor, not that you want to remove 
everything. I'm not sure what this is useful for.

You're definitely allowed to do a three-way merge with your index empty, 
meaning that you don't have any local changes at all, which lets you do a 
merge in a temporary index that didn't exist before. (The two-way case is 
less interesting, because it's the same as just reading the new tree.)

	-Daniel
*This .sig left intentionally blank*


* Re: git-name-rev off-by-one bug
  2005-11-30 17:46   ` Daniel Barkalow
@ 2005-11-30 20:05     ` Junio C Hamano
  2005-11-30 21:06       ` Daniel Barkalow
  0 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30 20:05 UTC (permalink / raw
  To: Daniel Barkalow; +Cc: git

Daniel Barkalow <barkalow@iabervon.org> writes:

> On Mon, 28 Nov 2005, Junio C Hamano wrote:
>
>> *1* It is a shame that the most comprehensive definition of
>> 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test
>> script.
>
> Isn't Documentation/technical/trivial-merge.txt more comprehensive?

It describes the multi-base extension while the old one was done
before the multi-base, so content-wise it may be more up to date.

One thing I have most trouble with is that it is not obvious if
the table is covering all the cases.  You have to read from top
to bottom and consider the first match as its fate [*1*].  I was
about to write "with no match resulting in no merge", but it is
not even obvious if there are cases that would fall off at the
end from the table by just looking at it.  Even worse, if we add
"no match results in no merge" at the end, by definition it
covers all the cases, but it is not obvious what those fall-off
cases are (IOW, what kinds of conflict they are and why they are
not handled).

Another thing, perhaps more important, is that it does not seem
to talk about index and up-to-dateness requirements much; it
says something about what happens when "no merge" result is
taken, but it is not clear about other cases.  The table in
t1000 test marks the case with "must match X" when index and
tree X must agree at the path, and with "must match X and be
up-to-date" when in addition the file in the working tree must
match what is recorded in the index at the path (i.e. the former
can have local modification in the working tree as long as index
entry and tree match).

This is vital in making sure that read-tree 3-way merge does not
lose information from the working tree.  I am sure your updated
*code* is doing the right thing, but the documentation is not
clear about it.  E.g. case 3ALT in the table says "take our
branch if the path does not exist in one or more of common
ancestors and the other branch does not have it" without saying
anything about index nor up-to-dateness requirements.  In this
case, the index must match HEAD but the working tree file is
allowed to have local modification (t1000 table says "must match
A").  If somebody wants to audit if the current read-tree.c does
the right thing for this case, he needs the documentation to
tell him what should happen.  There may be a thinko in the design
(IOW, the index requirements the documentation places may not
make sense) that can be found during such an audit.  There may
be an implementation error where the code does not match what the
documentation says should happen.  Not having that information
in the case table makes this verification difficult.

> Probably the tables in various other places should be replaced with 
> references to this document.

I agree 100% that having them scattered all over is bad and the
trivial-merge.txt is the logical place to consolidate them, but
I do not think simply removing others and pointing at
trivial-merge.txt without updating it is good enough.

[Footnote]

*1* That is OK from an implementation point of view (i.e. we can
look at the table, and then go to C implementation and follow
its if-elif chain to see if the same checks are done in the same
order as specified in the document), but for somebody who wants
to understand the semantics, i.e. what the thing it does means,
by looking at the documentation it is harder to read.


* Re: More merge questions
  2005-11-30 13:10                         ` More merge questions linux
  2005-11-30 18:37                           ` Daniel Barkalow
@ 2005-11-30 20:23                           ` Junio C Hamano
  2005-12-02  9:19                             ` More merge questions (why doesn't this work?) linux
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30 20:23 UTC (permalink / raw
  To: linux; +Cc: git

linux@horizon.com writes:

> 3) This says that if stage1 and stage3 exist, use stage3.
>    3-way says if they're equal, delete the file, while if they're
>    unequal, it's fail.
>
> Given that it all matches up so nicely, I'd like to honestly ask if
> case 3 of the conditions is correct.  I'd think that if I deleted
> a file from the index, and the file wasn't changed on the head I'm
> tracking, the right resolution is to keep it deleted.  Why override
> my deletion?
>
> Sorry if this is a dumb question, but it's not obvious to me.

Funny that I asked exactly the same question when it was done
first:

	http://marc.theaimsgroup.com/?l=git&m=111804744926989

It was a question about then-current code, so other cases might
have been changed/corrected/enhanced since then, but I believe
the behaviour for the case in question here stays the same to
this day, and the response from Linus to that article still
applies.

	http://marc.theaimsgroup.com/?l=git&m=111807024201485

I'll quote only the punch line here, but the whole thing is
worth a read if you want to understand how this evolved and
what the design choices and decisions were:

  Right. We didn't lose anything hugely important. 

  In theory this could be a delete that we've missed, and we could add a 
  flag to actually reject this case. However, it's always easy to "recover" 
  deletes (just delete it again ;), so the loss of information is absolutely 
  minimal, and it allows starting from an empty index file.


* Re: git-name-rev off-by-one bug
  2005-11-30 20:05     ` Junio C Hamano
@ 2005-11-30 21:06       ` Daniel Barkalow
  2005-11-30 22:00         ` Junio C Hamano
  0 siblings, 1 reply; 64+ messages in thread
From: Daniel Barkalow @ 2005-11-30 21:06 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

On Wed, 30 Nov 2005, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > On Mon, 28 Nov 2005, Junio C Hamano wrote:
> >
> >> *1* It is a shame that the most comprehensive definition of
> >> 3-way read-tree semantics is in t/t1000-read-tree-m-3way.sh test
> >> script.
> >
> > Isn't Documentation/technical/trivial-merge.txt more comprehensive?
> 
> It describes the multi-base extension while the old one was done
> before the multi-base, so content-wise it may be more up to date.
> 
> One thing I have most trouble with is that it is not obvious if
> the table is covering all the cases.  You have to read from top
> to bottom and consider the first match as its fate [*1*]. 

I actually had that problem with the original tables; there isn't a 
canonical order in which to list a table of all of the possible matches 
and non-matches between items so as to be complete.

Perhaps it ought to list, on each line, which previous cases would match, 
so that you could see that case 2 is really the conditions of 2 minus the 
conditions for 2ALT, which is "all of the ancestors are empty, the head 
has a directory/file conflict, and remote exists."

It can't fall off the table, because 1, 2, 3, 4, 6, 7, 9, and 11 cover all 
of the possibilities with respect to inputs being empty, and do not care 
about matching between the inputs.

> I was about to write "with no match resulting in no merge", but it is
> not even obvious if there are cases that would fall off at the
> end from the table by just looking at it.  Even worse, if we add
> "no match results in no merge" at the end, by definition it
> covers all the cases, but it is not obvious what those fall-off
> cases are (IOW, what kinds of conflict they are and why they are
> not handled).
> 
> Another thing, perhaps more important, is that it does not seem
> to talk about index and up-to-dateness requirements much; it
> says something about what happens when "no merge" result is
> taken, but it is not clear about other cases.  The table in
> t1000 test marks the case with "must match X" when index and
> tree X must agree at the path, and with "must match X and be
> up-to-date" when in addition the file in the working tree must
> match what is recorded in the index at the path (i.e. the former
> can have local modification in the working tree as long as index
> entry and tree match).

But that is redundant information. I was actually confused by that part of 
the table for a long time, because it was not clear that the requirements 
followed a couple of simple rules (which I give above the table), and 
weren't actually chosen on a case-by-case basis.

The implementation I did is actually much easier to verify, because it 
doesn't go into each case for the index requirements, but checks the 
actual rules: the index must match either the head or (if there is one) 
the merge result, and the index must not be dirty if there is a "no merge" 
result. Therefore, you can't lose any work in the index (either you didn't 
have any, or you did the same thing), and you can't lose any work in the 
working tree (either you didn't have any, or we're not going to use the 
working tree).

Last time we discussed it ("Multi-ancestor read-tree notes"), you said:

  I like the second sentence in three-way merge description.  That
  is a very easy-to-understand description of what the index
  requirements are.

> This is vital in making sure that read-tree 3-way merge does not
> lose information from the working tree.  I am sure your updated
> *code* is doing the right thing, but the documentation is not
> clear about it.  E.g. case 3ALT in the table says "take our
> branch if the path does not exist in one or more of common
> ancestors and the other branch does not have it" without saying
> anything about index nor up-to-dateness requirements. 

"If the index exists, it is an error for it not to match either the
 head or (if the merge is trivial) the result."

"A result of "no merge" is an error if the index is not empty and not
 up-to-date."

So the index is permitted to not exist (you missed this case), but if it 
exists, it must match HEAD (or, well, HEAD, which is the result). The 
index need not be up-to-date (since the result is not "no merge"), so the 
working tree doesn't matter.
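
The two quoted rules can be sketched as a small per-path check. This is
only an illustration of the rules as worded above, not the read-tree
code; the function name index_ok, its argument order, and the
"no-merge" marker string are all made up for the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): index_ok INDEX HEAD RESULT UPTODATE
#   INDEX, HEAD  object names for one path; "" means the path is absent
#   RESULT       object name of the trivial merge result, or the literal
#                string "no-merge" when the merge is not trivial
#   UPTODATE     "yes" when the working tree file matches the index entry
# Returns 0 when the index state is acceptable, 1 when it is an error.
index_ok() {
	index=$1 head=$2 result=$3 uptodate=$4

	# An index entry that does not exist is always permitted.
	[ -n "$index" ] || return 0

	# Rule 1: the index must match the head or (if trivial) the result.
	if [ "$index" != "$head" ]; then
		[ "$result" != "no-merge" ] && [ "$index" = "$result" ] || return 1
	fi

	# Rule 2: "no merge" is an error if the index is not up-to-date.
	if [ "$result" = "no-merge" ] && [ "$uptodate" != "yes" ]; then
		return 1
	fi
	return 0
}

# Case 3ALT as discussed: the result is the head, so a dirty working
# tree does not matter as long as the index matches the head.
if index_ok aaa aaa aaa no; then
	echo "3ALT with index == HEAD: ok"
fi
```

Checking the rules directly, rather than per table row, is the design
choice argued for above: one function covers every case in the table.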

> > Probably the tables in various other places should be replaced with 
> > references to this document.
> 
> I agree 100% that having them scattered all over is bad and the
> trivial-merge.txt is the logical place to consolidate them, but
> I do not think simply removing others and pointing at
> trivial-merge.txt without updating it is good enough.

Certainly, your complaints about the table should be addressed first. I 
think I'd addressed all your complaints from last time, but at that point, 
we got sidetracked into a discussion of the details of case #16.

	-Daniel
*This .sig left intentionally blank*


* Re: git-name-rev off-by-one bug
  2005-11-30 21:06       ` Daniel Barkalow
@ 2005-11-30 22:00         ` Junio C Hamano
  2005-11-30 23:12           ` Daniel Barkalow
  0 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-11-30 22:00 UTC (permalink / raw
  To: Daniel Barkalow; +Cc: git

Daniel Barkalow <barkalow@iabervon.org> writes:

> I actually had that problem with the original tables; there isn't a 
> canonical order in which to list a table of all of the possible matches 
> and non-matches between items so as to be complete.
>
> Perhaps it ought to list, on each line, which previous cases would match, 
> so that you could see that case 2 is really the conditions of 2 minus the 
> conditions for 2ALT, which is "all of the ancestors are empty, the head 
> has a directory/file conflict, and remote exists."

Sorry, I was not clear about it when I did that table the first
time.  2ALT was "alternatives suggested to replace 2" and listed
in the same table for comparison purpose.

The original table was designed in a way that if you have a
match on case N, there would not be any other case M that matches
the case, either N<M or M<N.  IOW, the order to read the table
did not matter.  At least that was the intention.

If you read "missing" = 0, "exists" = 1, and take OAB as bit2,
bit1, and bit0, you can easily see the pattern in the table.  It
counts in binary, although bit1 has various subcases so the
table has more than 8 rows, and it is easy to see it covers all.
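
As a rough illustration of that counting, the sketch below maps the
exists/missing state of O, A and B to the three-bit group number and
lists all eight groups in order.  The helper name oab_bits is invented
for this example; subcases within a group are not modelled.

```shell
#!/bin/sh
# Illustrative only: oab_bits is a hypothetical helper, not part of git.
# Each argument is "missing" or "exists", for O (ancestor), A (head)
# and B (remote) in that order; O is bit2, A is bit1, B is bit0.
oab_bits() {
	bits=
	for state in "$1" "$2" "$3"; do
		case "$state" in
		missing) bits=${bits}0 ;;
		exists)  bits=${bits}1 ;;
		*) echo "unknown state: $state" >&2; return 1 ;;
		esac
	done
	echo "$bits"
}

# Enumerate the eight top-level groups, counting from 000 to 111.
for o in missing exists; do
	for a in missing exists; do
		for b in missing exists; do
			echo "$(oab_bits "$o" "$a" "$b")  O=$o A=$a B=$b"
		done
	done
done
```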

> "If the index exists, it is an error for it not to match either the
>  head or (if the merge is trivial) the result."
>
> "A result of "no merge" is an error if the index is not empty and not
>  up-to-date."

That is good.

> Certainly, your complaints about the table should be addressed first. I 
> think I'd addressed all your complaints from last time, but at that point, 
> we got sidetracked into a discussion of the details of case #16.

... which was a good thing to think about in itself.  I feel I
understand the new table a bit better.


* Re: git-name-rev off-by-one bug
  2005-11-30 22:00         ` Junio C Hamano
@ 2005-11-30 23:12           ` Daniel Barkalow
  2005-12-01  7:46             ` Junio C Hamano
  0 siblings, 1 reply; 64+ messages in thread
From: Daniel Barkalow @ 2005-11-30 23:12 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

On Wed, 30 Nov 2005, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > I actually had that problem with the original tables; there isn't a 
> > canonical order in which to list a table of all of the possible matches 
> > and non-matches between items so as to be complete.
> >
> > Perhaps it ought to list, on each line, which previous cases would match, 
> > so that you could see that case 2 is really the conditions of 2 minus the 
> > conditions for 2ALT, which is "all of the ancestors are empty, the head 
> > has a directory/file conflict, and remote exists."
> 
> Sorry, I was not clear about it when I did that table the first
> time.  2ALT was "alternatives suggested to replace 2" and listed
> in the same table for comparison purpose.

I understood that; actually, when I found it, a number of the ALT cases 
had been implemented, and some of them supplemented rather than replaced 
the originals.

> The original table was designed in a way that if you have a
> match on case N, there would not be any other case M that matches
> the case, either N<M or M<N.  IOW, the order to read the table
> did not matter.  At least that was the intention.
> 
> If you read "missing" = 0, "exists" = 1, and take OAB as bit2,
> bit1, and bit0, you can easily see the pattern in the table.  It
> counts in binary, although bit1 has various subcases so the
> table has more than 8 rows, and it is easy to see it covers all.

The hard thing is to verify that all the subcases are listed. I switched 
the orders of matching and non-matching, so that I could make it matching 
and need-not-match. Your table is actually missing a few cases: what 
happens if O is missing, 2 has a directory, and 3 has a file? You note 
that we have to be careful, but don't list the result (which is "no 
merge").

Perhaps the table would be clearer if the lines were grouped in 
exists/missing? (With 5ALT repeated in the 011 and 111 groups, since it 
applies to both) Then you would only need to look at 5 lines with 
cascading (in the most complex case), rather than having to read the whole 
top of the table.

(It is actually written like that, with the exception of 5ALT, 2ALT, and 
3ALT, but it's not visually obvious.)

So case 11 is really: All three exist. Head and remote don't match 
(-5ALT), no ancestor matches remote (-13), and no ancestor matches head 
(-14). Case 13 is really: All three exist. Head and remote don't match 
(-5ALT), there aren't different ancestors which match head and remote 
(-16), and an ancestor matches remote.

The tricky bit is really cases 2ALT and 3ALT, which can be used in cases 
where some but not all of the ancestors are empty, and can't be used if 
there's a directory/file conflict; neither of these conditions matters for 
anything else in the table, so it's hard to fit this in. My strategy is to 
have those as special cases, and have the rest of the table cover 
everything (rather than having case 2 require a directory/file conflict 
and case 7 require that no ancestor be empty, which would be accurate, but 
would make it harder to check for missing cases).

	-Daniel
*This .sig left intentionally blank*


* Re: git-name-rev off-by-one bug
  2005-11-30 23:12           ` Daniel Barkalow
@ 2005-12-01  7:46             ` Junio C Hamano
  0 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-01  7:46 UTC (permalink / raw
  To: Daniel Barkalow; +Cc: git

The delight of working on a free software project is that you have
so many good people with you that you get many "Aha, lightbulb!"
moments.  This is one of them for me.  I realized where the
trouble I felt when reading your table came from; it was that I
was focused too much on the old way the table was organized.

When one constructs a case table to make sure one covered
everything, one first lists the variables and the possible
values they can take, and makes an NxMxOxPx... grid.  In the
original table I did, I chose what is in O and A and B as my
variables (and that's where my comment about O being bit2 etc
comes from).  I did not realize the semantics and algorithm you
used can be better described by a different set of variables
(namely, how ancestors match the HEAD, and how the remote
matches the HEAD, if I understand correctly).  I had trouble
understanding your version only because I kept thinking in terms
of (O,A,B).

So after thinking about that...

    Daniel Barkalow <barkalow@iabervon.org> writes:

    > On Wed, 30 Nov 2005, Junio C Hamano wrote:
    >
    > Perhaps the table would be clearer if the lines were grouped
    > in exists/missing? (With 5ALT repeated in the 011 and 111
    > groups, since it applies to both) Then you would only need
    > to look at 5 lines with cascading (in the most complex
    > case), rather than having to read the whole top of the
    > table.

I think the current ordering of cases makes more sense.  If we
forget about the case labels from the original table (and the
way the original table classified cases), I suspect we could
reorganize the cases to describe the semantics even better and
clearer.  That is, not grouping by exists/missing, but grouping
by matching/unmatching.

> (It is actually written like that, with the exception of 5ALT, 2ALT, and 
> 3ALT, but it's not visually obvious.)

Yeah, I now realize that.

> The tricky bit is really cases 2ALT and 3ALT, which can be used in cases 
> where some but not all of the ancestors are empty, and can't be used if 
> there's a directory/file conflict; neither of these conditions matters for 
> anything else in the table, so it's hard to fit this in. My strategy is to 
> have those as special cases, and have the rest of the table cover 
> everything (rather than having case 2 require a directory/file conflict 
> and case 7 require that no ancestor be empty, which would be accurate, but 
> would make it harder to check for missing cases).

Makes sense.  Thanks for the clarification and lightbulb moment.


* Re: git-name-rev off-by-one bug
  2005-11-28 23:42 git-name-rev off-by-one bug linux
  2005-11-29  5:54 ` Junio C Hamano
@ 2005-12-01 10:14 ` Junio C Hamano
  2005-12-01 21:50   ` Petr Baudis
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-12-01 10:14 UTC (permalink / raw
  To: linux; +Cc: git, Petr Baudis

linux@horizon.com writes:

> Anyway, if it's portable enough, it's faster.  Ah... I just found discussion
> of this in late September, but it's not clear what the resolution was.
> http://marc.theaimsgroup.com/?t=112746188000003

Although updating our shell scripts to this century is lower on
my priority scale, ideally I'd want to see things work with
dash, not because I do not like bash/ksh, but because it seems
the smallest minimally POSIXy shell.

Speaking of shell gotchas, I do not know what the resolution was
on the problem Merlyn was having the other day in "lost again on
syntax change - local repository?" thread, where it seemed that the
failure described in <868xv86dam.fsf@blue.stonehenge.com> was
his bash mishandling an if..then..elif..else..fi chain, which
was sort of unexpected for me.  I was curious but do not
remember seeing the conclusion.  Pasky, what happened to that
thread?


* Re: git-name-rev off-by-one bug
  2005-12-01 10:14 ` Junio C Hamano
@ 2005-12-01 21:50   ` Petr Baudis
  2005-12-01 21:53     ` Randal L. Schwartz
  0 siblings, 1 reply; 64+ messages in thread
From: Petr Baudis @ 2005-12-01 21:50 UTC (permalink / raw
  To: Junio C Hamano; +Cc: linux, git, Randal L. Schwartz

Dear diary, on Thu, Dec 01, 2005 at 11:14:19AM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Speaking of shell gotchas, I do not know what the resolution was
> on the problem Merlyn was having the other day in "lost again on
> syntax change - local repository?" thread, where it seemed that the
> failure described in <868xv86dam.fsf@blue.stonehenge.com> was
> his bash mishandling an if..then..elif..else..fi chain, which
> was sort of unexpected for me.  I was curious but do not
> remember seeing the conclusion.  Pasky, what happened to that
> thread?

I'm still perplexed and curious about what _did_ git-send-pack actually
receive as URL, since it apparently decided it's ssh as well.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.


* Re: git-name-rev off-by-one bug
  2005-12-01 21:50   ` Petr Baudis
@ 2005-12-01 21:53     ` Randal L. Schwartz
  0 siblings, 0 replies; 64+ messages in thread
From: Randal L. Schwartz @ 2005-12-01 21:53 UTC (permalink / raw
  To: Petr Baudis; +Cc: Junio C Hamano, linux, git

>>>>> "Petr" == Petr Baudis <pasky@suse.cz> writes:

Petr> I'm still perplexed and curious about what _did_ git-send-pack actually
Petr> receive as URL, since it apparently decided it's ssh as well.

Sorry... $work is swallowing my time right now.  It's on my list
of "very important things to get back to sometime real soon".

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


* Re: git-name-rev off-by-one bug
  2005-11-30  5:51                     ` Junio C Hamano
  2005-11-30  6:11                       ` Junio C Hamano
  2005-11-30 16:08                       ` Linus Torvalds
@ 2005-12-02  8:25                       ` Junio C Hamano
  2005-12-02  9:14                         ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano
                                           ` (2 more replies)
  2 siblings, 3 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02  8:25 UTC (permalink / raw
  To: Linus Torvalds; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

>  2. merge-one-file leaves unmerged index entries.
>
> Regarding #2, in an earlier message you said something about
> "patch to do that was just broken" which I did not understand; I
> think your patch I am replying to is doing the right thing.  That
> case arm is dealing with a path that exists in "our" branch and
> the working tree blob should be the same as recorded in the
> HEAD, so I did not have to do the unpack-cat-chmod like I did in
> mine.  Am I simply confused?

The only difference is that, from the old tradition, we are
supposed to allow the merge to happen in an unchecked-out
working tree [*1*].  The version you did and I merged in the
master branch breaks that, while the patch I posted keeps that
premise.

I can throw in my change on top of what is already committed for
now to "fix" this, but do we still care about the "merge should
succeed in an unchecked-out working tree" rule, or does it not
matter anymore these days?

One thing is that the check with "git diff" to show diff between
half-merged and stage2 after a failed merge does not work very
well in a sparsely checked out working tree, because the real
change is buried among tons of deletes ("diff --diff-filter=UM"
helps, though [*2*]).

[Footnote]

*1* ... and that is why we special case a non-existent working
tree file as if it is clean with the index.  After a merge, you
would end up with a sparsely checked-out working tree that
contains only the files that were involved in the merge.

*2* Maybe --diff-filter should always include U in the output,
because it is rare and when an unmerged entry exists the user
would always want to see it.


* [PATCH] merge-one-file: make sure we create the merged file.
  2005-12-02  8:25                       ` Junio C Hamano
@ 2005-12-02  9:14                         ` Junio C Hamano
  2005-12-02  9:15                         ` [PATCH] merge-one-file: make sure we do not mismerge symbolic links Junio C Hamano
  2005-12-02  9:16                         ` [PATCH] git-merge documentation: conflicting merge leaves higher stages in index Junio C Hamano
  2 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02  9:14 UTC (permalink / raw
  To: git

The "update-index followed by checkout-index" chain served two
purposes -- to collapse the index to "our" version, and make
sure that file exists in the working tree.  In the recent update
to leave the index unmerged on conflicting path, we wanted to
stop doing the former, but we still need to do the latter (we
allow merging to work in an un-checked-out working tree).

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 git-merge-one-file.sh |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

7afd8d297cd0c24e51188181769b56e0fb0f4171
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index 9a049f4..906098d 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -79,7 +79,13 @@ case "${1:-.}${2:-.}${3:-.}" in
 		;;
 	esac

-	merge "$4" "$orig" "$src2"
+	# Create the working tree file, with the correct permission bits.
+	# we can not rely on the fact that our tree has the path, because
+	# we allow the merge to be done in an unchecked-out working tree.
+	rm -f "$4" &&
+		git-cat-file blob "$2" >"$4" &&
+		case "$6" in *7??) chmod +x "$4" ;; esac &&
+		merge "$4" "$orig" "$src2"
 	ret=$?
 	rm -f -- "$orig" "$src2"

-- 
0.99.9.GIT
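
The mode test in the patch above relies on a shell glob over the octal
mode string.  The standalone sketch below shows how that pattern
behaves on a few typical modes; the function name is hypothetical and
nothing here touches a repository.

```shell
#!/bin/sh
# Illustrative only: is_executable_mode is a made-up name.
# Same pattern as the patch: in *7?? the "7" lines up with the owner
# permission digit, so 100755 matches and 100644 does not.
is_executable_mode() {
	case "$1" in
	*7??) return 0 ;;
	*)    return 1 ;;
	esac
}

for mode in 100644 100755 120000; do
	if is_executable_mode "$mode"; then
		echo "$mode: would chmod +x"
	else
		echo "$mode: left without the executable bit"
	fi
done
```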


* [PATCH] merge-one-file: make sure we do not mismerge symbolic links.
  2005-12-02  8:25                       ` Junio C Hamano
  2005-12-02  9:14                         ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano
@ 2005-12-02  9:15                         ` Junio C Hamano
  2005-12-02  9:16                         ` [PATCH] git-merge documentation: conflicting merge leaves higher stages in index Junio C Hamano
  2 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02  9:15 UTC (permalink / raw
  To: git

We ran "merge" command on O->A, O->B, A!=B case without verifying
the path involved is not a symlink.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 git-merge-one-file.sh |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

01655e7c9a0b05d930aa7e27e74f75e086005bfc
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index 906098d..c262dc6 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -58,6 +58,14 @@ case "${1:-.}${2:-.}${3:-.}" in
 # Modified in both, but differently.
 #
 "$1$2$3" | ".$2$3")
+
+	case ",$6,$7," in
+	*,120000,*)
+		echo "ERROR: $4: Not merging symbolic link changes."
+		exit 1
+		;;
+	esac
+
 	src2=`git-unpack-file $3`
 	case "$1" in
 	'')
-- 
0.99.9.GIT
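
The comma-wrapped pattern in the patch above can be tried in isolation.
In this sketch the two arguments stand in for the "$6" and "$7" modes
used by the patch; the function name is hypothetical.

```shell
#!/bin/sh
# Illustrative only: has_symlink_mode is a made-up name.
# Wrapping both modes in commas lets one pattern test either side for
# an exact match on 120000 (the symbolic link mode), even when one of
# the modes is an empty string.
has_symlink_mode() {
	case ",$1,$2," in
	*,120000,*) return 0 ;;
	*)          return 1 ;;
	esac
}

for pair in "100644 100755" "120000 100644" "100644 120000"; do
	# Deliberately unquoted so the pair splits into two arguments.
	if has_symlink_mode $pair; then
		echo "$pair: refuse to merge (symbolic link involved)"
	else
		echo "$pair: safe to run merge"
	fi
done
```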


* [PATCH] git-merge documentation: conflicting merge leaves higher stages in index
  2005-12-02  8:25                       ` Junio C Hamano
  2005-12-02  9:14                         ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano
  2005-12-02  9:15                         ` [PATCH] merge-one-file: make sure we do not mismerge symbolic links Junio C Hamano
@ 2005-12-02  9:16                         ` Junio C Hamano
  2 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02  9:16 UTC (permalink / raw
  To: git

This hopefully concludes the latest updates that change the
behaviour of the merge on an unsuccessful automerge.  Instead of
collapsing the conflicted path in the index to show HEAD, we
leave it unmerged, now that diff-files can compare working tree
files with higher stages.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 Documentation/git-merge.txt |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

e8be26e9282e346b01aa41fd7ab0b5f7bf7dcfc3
diff --git a/Documentation/git-merge.txt b/Documentation/git-merge.txt
index c117404..0cac563 100644
--- a/Documentation/git-merge.txt
+++ b/Documentation/git-merge.txt
@@ -108,10 +108,12 @@ When there are conflicts, these things h
 2. Cleanly merged paths are updated both in the index file and
    in your working tree.
 
-3. For conflicting paths, the index file records the version
-   from `HEAD`. The working tree files have the result of
-   "merge" program; i.e. 3-way merge result with familiar
-   conflict markers `<<< === >>>`.
+3. For conflicting paths, the index file records up to three
+   versions; stage1 stores the version from the common ancestor,
+   stage2 from `HEAD`, and stage3 from the remote branch (you
+   can inspect the stages with `git-ls-files -u`).  The working
+   tree files have the result of "merge" program; i.e. 3-way
+   merge result with familiar conflict markers `<<< === >>>`.
 
 4. No other changes are done.  In particular, the local
    modifications you had before you started merge will stay the
-- 
0.99.9.GIT


* Re: More merge questions (why doesn't this work?)
  2005-11-30 20:23                           ` Junio C Hamano
@ 2005-12-02  9:19                             ` linux
  2005-12-02 10:12                               ` Junio C Hamano
  2005-12-02 11:37                               ` linux
  0 siblings, 2 replies; 64+ messages in thread
From: linux @ 2005-12-02  9:19 UTC (permalink / raw
  To: git; +Cc: junkio, linux

I was playing with the implications of the "deleted file in the
index is not a conflict" merge rule, and came up with the following
octopus test which fails to work.  Note line 2 when choosing a
directory to run it in!

#!/bin/bash -xe
rm -rf .git
git-init-db
echo "File A" > a
echo "File B" > b
echo "File C" > c
git-add a b c
git-commit -a -m "Octopus test repository"

git-checkout -b a
echo "Modifications to a" >> a
git-commit -a -m "Modified file a"

git-checkout -b b master
echo "Modifications to b" >> b
git-commit -a -m "Modified file b"

git-checkout -b c master
rm c
git-commit -a -m "Deleted file c"

git-checkout master
#git merge --no-commit "" master c b a
#git merge --no-commit "" master a b c
git-rev-parse a b c > .git/FETCH_HEAD
git-octopus

(Commented out are the first few things I tried.)
Can someone tell me why this doesn't work?  It should be a simple
in-index merge.

Right after the incomplete merge (I hacked this into the
git-octopus script), git-ls-files -s produces

100644 8fb437b77759c7709c122fbc8ba43f720e1fbc0a 0       a
100644 b3418f25da4393974aa205e2863f012e5b503369 0       b
100644 df78d3d51c369e1d2f1eadb73464aadd931d56b4 1       c
100644 df78d3d51c369e1d2f1eadb73464aadd931d56b4 2       c

Which should be case 10 of the t/t1000-read-tree-m-3way.sh
table and succeed.
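
For reference, the stage listing above can be summarized per path with
a short filter.  This is only a sketch: the function name is invented,
it reads text in the "mode sha stage path" layout shown above from
standard input, and it assumes paths contain no whitespace.

```shell
#!/bin/sh
# Illustrative only: unmerged_stages is a hypothetical helper.
# Prints "path: stage stage ..." for every path that has entries at a
# stage other than 0 (stage 1 = ancestor, 2 = head, 3 = remote).
unmerged_stages() {
	awk '$3 != 0 { stages[$4] = stages[$4] " " $3 }
	     END { for (p in stages) print p ":" stages[p] }'
}

# The listing quoted above, fed in verbatim.
unmerged_stages <<'EOF'
100644 8fb437b77759c7709c122fbc8ba43f720e1fbc0a 0       a
100644 b3418f25da4393974aa205e2863f012e5b503369 0       b
100644 df78d3d51c369e1d2f1eadb73464aadd931d56b4 1       c
100644 df78d3d51c369e1d2f1eadb73464aadd931d56b4 2       c
EOF
```

Path c has stages 1 and 2 with the same object name and no stage 3,
which is the "exists, O==A, missing" shape being discussed.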

Other things I've discovered...

1) The MAJOR difference between "git checkout" and "git reset --hard"
   is that git-checkout takes a *head* as an argument and changes the
   .git/HEAD *symlink* to point to that head (ln -sf refs/heads/<head>
   .git/HEAD).  "git reset" takes a *commit* (<rev>) as an argument and
   changes the head that .git/HEAD points to to have that commit as its
   new tip (git-rev-parse <rev> > .git/HEAD).

   All the other behavioural differences are relatively minor, and
   appropriate for this big difference.

2) Don't use "git branch" to create branches, unless you really
   *don't* want to switch to them.  Use "git checkout -b".

3) Dumb question: why does "git-commit-tree" need "-p" before the
   parent commit arguments?  Isn't just argv[2]..argv[argc-1]
   good enough?

4) In the "git-read-tree" docs for "--reset", does "ignored" mean
   "not overwritten" or "overwritten"?

5) The final "error" message on "git-merge --no-commit" is a bit
   alarming for a newbie who uses it because they don't quite trust
   git enough to enable auto-commit.  And it should be changed
   from "Automatic merge failed/prevented; fix up by hand" to
   "fix up and commit by hand".
   Or how about:
   "Automatic commit prevented; edit and commit by hand."
   which actually tells the truth.

6) The "pickaxe" options are a bit confusing, and the fact they're
   only documented in cvs-migration.txt doesn't help.

7) The git-tag man page could use a little better description of -a.


* Re: More merge questions (why doesn't this work?)
  2005-12-02  9:19                             ` More merge questions (why doesn't this work?) linux
@ 2005-12-02 10:12                               ` Junio C Hamano
  2005-12-02 13:09                                 ` Sven Verdoolaege
  2005-12-02 11:37                               ` linux
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02 10:12 UTC (permalink / raw
  To: linux; +Cc: git

linux@horizon.com writes:

> Which should be case 10 of the t/t1000-read-tree-m-3way.sh
> table and succeed.

Yes.  The reason is git-read-tree's behaviour was changed
underneath while octopus was looking elsewhere ;-).  See
Documentation/technical/trivial-merge.txt, last couple of
lines.

There are two schools of thought about "both sides remove"
(case #10) case.  Some people argued that "the branches might
have renamed that path to different paths and might indicate a
rename/rename conflict" (meaning read-tree should not consider
it trivial, and leave that to upper level "policy layer" to
decide).  merge-one-file policy simply says "no, they both
wanted to remove them".  If I recall correctly, read-tree itself
merged this case before multi-base rewrite happened (if you are
curious, run 'git whatchanged -p read-tree.c' and look for
"Rewrite read-tree").

> 1) The MAJOR difference between "git checkout" and "git reset --hard"

True.  "git reset --hard" should be used without <rev> by
novices and with <rev> after they understand what they are
doing (it is used for rewinding/warping heads).

> 2) Don't use "git branch" to create branches, unless you really
>    *don't* want to switch to them.  Use "git checkout -b".

Because...?  "git branch foo && git checkout foo" may be
suboptimal to type, but it is not _wrong_; it does not do
anything bad or incorrect.

> 3) Dumb question: why does "git-commit-tree" need "-p" before the
>    parent commit arguments?  Isn't just argv[2]..argv[argc-1]
>    good enough?

1. Why not?

2. I myself wondered about it a long time ago.

3. It does not matter; nobody types that command by hand.

4. It allows us to later add some other flags to commit-tree
   (none planned currently).

> 4) In the "git-read-tree" docs for "--reset", does "ignored" mean
>    "not overwritten" or "overwritten"?

That sentence is very poorly written; a better paraphrasing is
appreciated.

	$ git whatchanged -S--reset \
        	read-tree.c Documentation/git-read-tree.txt 

shows logs for 438195ccedce7270cf5ba167a940c90467cb72d7 commit
(run "git-cat-file commit 438195cc" to read it).  It ignores
existing unmerged entries when reconstructing the index from the
given tree ("git-read-tree -m", given an unmerged index, refuses
to operate, but "--reset" *ignores* the unmerged ones hence it
does not refuse to operate).

> 5) The final "error" message on "git-merge --no-commit" is a bit
>    alarming for a newbie who uses it...

First of all, --no-commit is not meant to be used by newbies,
but you are right.  Patches to make the failure message
conditional are welcome.  It should switch on these three cases:

 - "--no-commit" option is given, but a merge conflict would
   have prevented autocommit anyway;

 - "--no-commit" option is given, but automerge succeeded;

 - conflict prevented autocommit.
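
A sketch of what switching on those three cases could look like.  The
function name and the exact message wording are hypothetical (one
message borrows the suggestion quoted above); this is not the
git-merge script itself.

```shell
#!/bin/sh
# Illustrative only: merge_report is a made-up helper.
#   $1  "yes" if --no-commit was given, otherwise "no"
#   $2  "yes" if the automerge hit a conflict, otherwise "no"
# The message texts are placeholders for discussion.
merge_report() {
	case "$1,$2" in
	yes,yes)
		echo "Automatic merge failed (--no-commit was also given); fix up and commit by hand." ;;
	yes,no)
		echo "Automatic commit prevented; edit and commit by hand." ;;
	no,yes)
		echo "Automatic merge failed; fix up and commit by hand." ;;
	no,no)
		echo "Merge committed." ;;
	*)
		echo "usage: merge_report yes|no yes|no" >&2
		return 1 ;;
	esac
}

merge_report yes no
```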

> 6) The "pickaxe" options are a bit confusing, and the fact they're
>    only documented in cvs-migration.txt doesn't help.

The docs of the git-diff-* family have an OPTIONS section, the end of
which refers you to the diffcore documentation.  Suggestions for
a better organization, and a patch, would be appropriate here.

> 7) The git-tag man page could use a little better description of -a.

Please.  It should have the same "OPTIONS" section as others do.


* Re: More merge questions (why doesn't this work?)
  2005-12-02  9:19                             ` More merge questions (why doesn't this work?) linux
  2005-12-02 10:12                               ` Junio C Hamano
@ 2005-12-02 11:37                               ` linux
  2005-12-02 20:31                                 ` Junio C Hamano
  1 sibling, 1 reply; 64+ messages in thread
From: linux @ 2005-12-02 11:37 UTC (permalink / raw
  To: junkio; +Cc: git, linux

> Yes.  The reason is git-read-tree's behaviour was changed
> underneath while octopus was looking elsewhere ;-).  See
> Documentation/technical/trivial-merge.txt, last couple of
> lines.

> There are two schools of thought about "both sides remove"
> (case #10) case.

Um, I'm looking at the one-side remove case, which t/t1000 calls

     O       A       B         result      index requirements
-------------------------------------------------------------------
 10  exists  O==A    missing   remove      ditto
 ------------------------------------------------------------------

while trivial-merge.txt says:

case  ancest    head    remote    result
----------------------------------------
10    ancest^   ancest  (empty)   no merge

I assumed the test case was probably more accurate, given that it's coupled
to code which actually verifies the behaviour.

> Some people argued that "the branches might
> have renamed that path to different paths and might indicate a
> rename/rename conflict" (meaning read-tree should not consider
> it trivial, and leave that to upper level "policy layer" to
> decide).  merge-one-file policy simply says "no, they both
> wanted to remove them".  If I recall correctly, read-tree itself
> merged this case before multi-base rewrite happened (if you are
> curious, run 'git whatchanged -p read-tree.c' and look for
> "Rewrite read-tree").

Aren't you talking about case #6?

     O       A       B         result      index requirements
-------------------------------------------------------------------
  6  exists  missing missing   remove      must not exist.
 ------------------------------------------------------------------

case  ancest    head    remote    result
----------------------------------------
6     ancest+   (empty) (empty)   no merge

>> 1) The MAJOR difference between "git checkout" and "git reset --hard"

> True.  "git reset --hard" should be used without <rev> by
> novices and with <rev> after they understand what they are
> doing (it is used for rewinding/warping heads).

For the longest time I had been under the delusion that
"git-checkout <branch> *" and "git-reset --hard <branch>"
were very similar operations (modulo your comments about
deleting files): overwrite the index and working directory
files with the versions from that branch.

It's hard to say how much I managed to confuse myself by
damaging test repositories while I didn't understand what was
going on.
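
The difference can be pictured with plain files, using the symlink
layout described earlier in this thread.  The toy model below runs no
git commands at all; the directory is a scratch one, and the function
names and object names (aaaa, bbbb, cccc) are made-up placeholders.

```shell
#!/bin/sh
# Toy model only: HEAD is a symlink into refs/heads/, as described in
# the thread.  Nothing here is a real repository.
repo=$(mktemp -d) && cd "$repo" || exit 1
mkdir -p refs/heads
echo aaaa >refs/heads/master
echo bbbb >refs/heads/topic
ln -s refs/heads/master HEAD

# "checkout <head>": repoint the symlink; no branch tip changes.
toy_checkout() { ln -sf "refs/heads/$1" HEAD; }

# "reset <commit>": write through the symlink; the tip of whichever
# branch HEAD points at is moved.
toy_reset() { echo "$1" >HEAD; }

toy_checkout topic
toy_reset cccc
echo "master=$(cat refs/heads/master) topic=$(cat refs/heads/topic)"
```

After the two calls, master is untouched and topic has a new tip, which
is why a reset with a <rev> argument is the more dangerous of the two.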

>> 2) Don't use "git branch" to create branches, unless you really
>>    *don't* want to switch to them.  Use "git checkout -b".

> Because...?  "git branch foo && git checkout foo" may be
> suboptimal to type, but it is not _wrong_; it does not do
> anything bad or incorrect.

Yes, I know it works.  I suggest avoiding it because there's a much
more convenient alternative and I kept forgetting the second half and
checking my changes in to the wrong branch.

>> 3) Dumb question: why does "git-commit-tree" need "-p" before the
>>    parent commit arguments?  Isn't just argv[2]..argv[argc-1]
>>    good enough?

> 1. Why not?
> 3. It does not matter; nobody types that command by hand.

Because it's a real pain to get it properly quoted and set up
in a shell script.  "$@" is a lot simpler and easier, and
old /bin/sh only has the one array which provides that magic
quoting behaviour.

(Admittedly, you usually pass the arguments through git-rev-parse
first, and are then guaranteed no embedded whitespace.)
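
To make the quoting point concrete, here is a minimal sketch of what a
script has to do to put "-p" in front of every parent while keeping
embedded whitespace intact, using only the positional parameters.  The
function name add_parent_flags is made up for this example; it is not
part of git.

```sh
#!/bin/sh
# Turn "parent1 parent2 ..." into "-p parent1 -p parent2 ..." using only
# the positional parameters, the one array plain /bin/sh offers.
add_parent_flags() {
    for parent in "$@"; do
        # Append the flagged pair at the end, then drop the original
        # argument from the front.  The loop's word list was expanded
        # before the first iteration, so changing "$@" here is safe.
        set -- "$@" -p "$parent"
        shift
    done
    # Print one argument per line so embedded spaces stay visible.
    printf '%s\n' "$@"
}

add_parent_flags "parent one" parent-two
```

Without the "-p" requirement, the whole function collapses to passing
"$@" straight through.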

> 4. It allows us to later add some other flags to commit-tree
>    (none planned currently).

Making it disappear wouldn't preclude having more options, either,
any more than the variable number of arguments to cp(1) or mv(1)...

>> 4) If the "git-read-tree" docs for "--reset", does "ignored" mean
>>    "not overwritten" or "overwritten"?

> That sentence is very poorly written; a better paraphrasing is
> appreciated.

diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt
index 8b91847..47e2f93 100644
--- a/Documentation/git-read-tree.txt
+++ b/Documentation/git-read-tree.txt
@@ -31,8 +31,8 @@ OPTIONS
 	Perform a merge, not just a read.
 
 --reset::
-
-        Same as -m except that unmerged entries will be silently ignored.
+        Same as -m except that unmerged entries will be silently overwritten
+	(instead of failing).
 
 -u::
 	After a successful merge, update the files in the work
@@ -47,7 +47,6 @@ OPTIONS
 	trees that are not directly related to the current
 	working tree status into a temporary index file.
 
-
 <tree-ish#>::
 	The id of the tree object(s) to be read/merged.
 

>> 5) The final "error" message on "git-merge --no-commit" is a bit
>>    alarming for a newbie who uses it...

> First of all, --no-commit is not meant to be used by newbies,
> but you are right.

Well, I can tell you that it's very very attractive to newbies.
The first 5 or 10 times I tried git-merge, I used --no-commit.
(My surprise was mostly that there wasn't a one-letter -x form.)
"Do something really complicated and then commit it to the repository"
is a frightening concept.  "Do something really complicated and
then stop and wait for you to see if it was what you expected" is
a lot more comforting.

>> 6) The "pickaxe" options are being a bit confusing, and the fact they're
>>    only documented in cvs-migration.txt doesn't help.

> Docs of git-diff-* family have OPTIONS section, at the end of
> which refers you to the diffcore documentation.  Suggestions to
> a better organization and a patch is appropriate here.

That's a bigger job; I'll work on it when I've finished the docs I'm
writing right now. :-)

>> 7) The git-tag man page could use a little better description of -a.

> Please.  It should have the same "OPTIONS" section as others do.

I know NOTHING about asciidoc, and really wish I could fix its
lack-of-line-break problem:

GIT-BISECT(1)                                                    GIT-BISECT(1)

NAME
       git-bisect - Find the change that introduced a bug

SYNOPSIS
       git  bisect start git bisect bad <rev> git bisect good <rev> git bisect
       reset [<branch>] git bisect visualize git bisect replay  <logfile>  git
       bisect log

but emulating what I saw elsewhere...

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 95de436..7635b1e 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -10,6 +10,26 @@ SYNOPSIS
 --------
 'git-tag' [-a | -s | -u <key-id>] [-f | -d] [-m <msg>] <name> [<head>]
 
+OPTIONS
+-------
+-a::
+	Make an unsigned (anotation) tag object
+
+-s::
+	Make a GPG-signed tag, using the default e-mail address's key
+
+-u <key-id>::
+	Make a GPG-signed tag, using the given key
+
+-f::
+	Replace an existing tag with the given name (instead of failing)
+
+-d::
+	Delete an existing tag with the given name
+
+-m <msg>::
+	Use the given tag message (instead of prompting)
+
 DESCRIPTION
 -----------
 Adds a 'tag' reference in .git/refs/tags/
@@ -23,7 +43,7 @@ creates a 'tag' object, and requires the
 in the tag message.
 
 Otherwise just the SHA1 object name of the commit object is
-written (i.e. an lightweight tag).
+written (i.e. a lightweight tag).
 
 A GnuPG signed tag object will be created when `-s` or `-u
 <key-id>` is used.  When `-u <key-id>` is not used, the


* Re: More merge questions (why doesn't this work?)
  2005-12-02 10:12                               ` Junio C Hamano
@ 2005-12-02 13:09                                 ` Sven Verdoolaege
  2005-12-02 20:32                                   ` Junio C Hamano
  0 siblings, 1 reply; 64+ messages in thread
From: Sven Verdoolaege @ 2005-12-02 13:09 UTC (permalink / raw
  To: Junio C Hamano; +Cc: linux, git

On Fri, Dec 02, 2005 at 02:12:42AM -0800, Junio C Hamano wrote:
> linux@horizon.com writes:
> > 3) Dumb question: why does "git-commit-tree" need "-p" before the
> >    parent commit arguments?  Isn't just argv[2]..argv[argc-1]
> >    good enough?
> 
> 3. It does not matter; nobody types that command by hand.
> 

I do.  git commit won't let me commit an empty tree, or at
least I haven't figured out how to make it do that.

I also used it when, after resolving a merge initiated by
cg-merge, cogito (or at least the version I had installed
at the time) wouldn't let me commit it because a new file
I had pulled in contained non-ascii characters in its name.

skimo


* Re: More merge questions (why doesn't this work?)
  2005-12-02 11:37                               ` linux
@ 2005-12-02 20:31                                 ` Junio C Hamano
  2005-12-02 21:32                                   ` linux
  2005-12-02 21:56                                   ` More merge questions linux
  0 siblings, 2 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02 20:31 UTC (permalink / raw
  To: linux; +Cc: git

linux@horizon.com writes:

> Um, I'm looking at the one-side remove case, which t/t1000 calls
>
>      O       A       B         result      index requirements
> -------------------------------------------------------------------
>  10  exists  O==A    missing   remove      ditto
>  ------------------------------------------------------------------
>
> while trivial-merge.txt says is:
>
> case  ancest    head    remote    result
> ----------------------------------------
> 10    ancest^   ancest  (empty)   no merge
>
> I assumed the test case was probably more accurate, given that it's coupled
> to code which actually verifies the behaviour.

You are right.  And the test expects something different from
that table in t/t1000 test.  Relevant are the lines for ND (one
side No action the other Delete) in the "expected" file.  The
test expects the result to be unmerged.

Interesting is that it did so from the day one [*1*].  The very
original read-tree 3-way was quite conservative and left more
things unmerged for the policy script to handle, and it is not
surprising it started like this, but during the course of the
project I thought read-tree was made to collapse more cases in
index.  I am a bit surprised we did not loosen it ever since
[*2*].  Thanks for pointing out the discrepancy.

We earlier agreed that the table in t/t1000 test should go and be
superseded by trivial-merge.txt, so what the table says right
now is a non-issue, but we _might_ want to revisit the issue of
what should happen in case #8 and #10 sometime in the future, as
the last three lines of trivial-merge.txt mention.  I'd say we
should leave things as they are for now, though.

>  --reset::
> -
> -        Same as -m except that unmerged entries will be silently ignored.
> +        Same as -m except that unmerged entries will be silently overwritten
> +	(instead of failing).

Thanks.

> "Do something really complicated and then commit it to the repository"
> is a frightening concept.  "Do something really complicated and
> then stop and wait for you to see if it was what you expected" is
> a lot more comforting.

Fair enough.

>>> 7) The git-tag man page could use a little better description of -a.
>
>> Please.  It should have the same "OPTIONS" section as others do.
>
> I know NOTHING about asciidoc, and really wish I could fix its
> lack-of-line-break problem:

Thanks for pointing that one out.  I think Josef recently did
similar linebreak on git-mv page.  I'll try and see if I can
mimic what he did [*3*].

> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
> index 95de436..7635b1e 100644
> --- a/Documentation/git-tag.txt
> +++ b/Documentation/git-tag.txt

Thanks; applied.  

[Footnotes]

*1* A pickaxe example:

	$ git whatchanged -p -S'100644 1 ND
          100644 2 ND'

shows only two commits.  One is the first version of the test,
and the other is to adjust for the output format from 

*2* Further archaeology revealed that I did loosening primarily
for the 2-way side, and did not touch much about 3-way merge
other than what used to be marked with ALT.  There was no 10ALT
ever so it shows that my memory is simply faulty ;-).

*3* I did that, and it renders HTML side nicer, but it breaks
manpages X-<.  Inputs from asciidoc gurus are appreciated.


* Re: More merge questions (why doesn't this work?)
  2005-12-02 13:09                                 ` Sven Verdoolaege
@ 2005-12-02 20:32                                   ` Junio C Hamano
  2005-12-05 15:01                                     ` Sven Verdoolaege
  0 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02 20:32 UTC (permalink / raw
  To: Sven Verdoolaege; +Cc: git

Sven Verdoolaege <skimo@kotnet.org> writes:

>> 3. It does not matter; nobody types that command by hand.

I should have said "nobody should need to type that, otherwise
fix your Porcelain".

> I do.  git commit won't let me commit an empty tree, or at
> least I haven't figured out how to make it do that.

You are right, at least for the initial commit (for subsequent
commits it happily commits an empty tree).

Now why anybody would want to do it is a different matter.  Is it
because you would want to record that your project started from
scratch, as opposed to some import from an existing non
versioned (or versioned by another SCM) working tree?

> I also used it when, after resolving a merge initiated by
> cg-merge, cogito (or at least the version I had installed
> at the time) wouldn't let me commit it because a new file
> I had pulled in contained non-ascii characters in its name.

That sounds like a simple Porcelain bug, and I hope neither
Cogito nor git has that problem now.


* Re: More merge questions (why doesn't this work?)
  2005-12-02 20:31                                 ` Junio C Hamano
@ 2005-12-02 21:32                                   ` linux
  2005-12-02 22:00                                     ` Junio C Hamano
  2005-12-02 22:12                                     ` Linus Torvalds
  2005-12-02 21:56                                   ` More merge questions linux
  1 sibling, 2 replies; 64+ messages in thread
From: linux @ 2005-12-02 21:32 UTC (permalink / raw
  To: junkio, linux; +Cc: git

> We earlier agreed that the table in t/t1000 test should go and
> superseded by trivial-merge.txt, so what the table says right
> now is a non-issue, but we _might_ want to revisit the issue of
> what should happen in case #8 and #10 sometime in the future, as
> the last three lines of trivial-merge.txt mentions.  I'd say we
> should leave things as they are for now, though.

But back to my original problem... I don't much care whether it's
done as a trivial merge or a non-trivial merge, but why the #%@#$ can't
it be done as an automatic merge?

As I said, I'm trying to build (and write down) a mental model, so the
behaviour of git can be predicted.  My mental model says this should
work.  It doesn't.  Therefore my mental model is incorrect, and I
don't actually understand what it's doing.

#!/bin/bash -xe
rm -rf .git
git-init-db
echo "File A" > a
echo "File B" > b
echo "File C" > c
git-add a b c
git-commit -a -m "Octopus test repository"

git-checkout -b a
echo "Modifications to a" >> a
git-commit -a -m "Modified file a"

git-checkout -b b master
echo "Modifications to b" >> b
git-commit -a -m "Modified file b"

git-checkout -b c master
rm c
git-commit -a -m "Deleted file c"

git-checkout master
git merge "Merge a, b, c" master a b c

produces...

+ git merge 'Merge a, b, c' master a b c
Trying simple merge with a
Trying simple merge with b
Trying simple merge with c
Simple merge did not work, trying automatic merge.
Removing c
fatal: merge program failed
No merge strategy handled the merge.


> *3* I did that, and it renders HTML side nicer, but it breaks
> manpages X-<.  Inputs from asciidoc gurus are appreciated.

I tried adding " +" at end-of-line, which is supposed to force a
line break, but that didn't have any effect.


* Re: More merge questions
  2005-12-02 20:31                                 ` Junio C Hamano
  2005-12-02 21:32                                   ` linux
@ 2005-12-02 21:56                                   ` linux
  1 sibling, 0 replies; 64+ messages in thread
From: linux @ 2005-12-02 21:56 UTC (permalink / raw
  To: git; +Cc: junkio, linux

Just thinking about the difference between 2-way and 3-way merge...

*Mostly* a 2-way merge is just a 3-way merge where one of the ways
is taken from the index rather than from a tree.  But there
are some subtle differences.

This difference is what forces octopus merge to form intermediate tree
objects when doing its merges.  If there was a way to merge directly
into the index, octopus merge wouldn't have to make intermediate tree
objects that would have to be garbage-collected later.

(Indeed, I originally assumed that Octopus did all its merges in the
index; it's only when I traced the code that I saw it calls git-write-tree
multiple times.)

Is the time saved, and space not wasted, worth implementing a 2-way merge
that more exactly matches 3-way?  It should be fairly straightforward
to share the actual merging code.

Opinions solicited.


* Re: More merge questions (why doesn't this work?)
  2005-12-02 21:32                                   ` linux
@ 2005-12-02 22:00                                     ` Junio C Hamano
  2005-12-02 22:12                                     ` Linus Torvalds
  1 sibling, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2005-12-02 22:00 UTC (permalink / raw
  To: git

 <linux <at> horizon.com> writes:

> + git merge 'Merge a, b, c' master a b c
> Trying simple merge with a
> Trying simple merge with b
> Trying simple merge with c
> Simple merge did not work, trying automatic merge.
> Removing c
> fatal: merge program failed
> No merge strategy handled the merge.

I think this is the same problem I fixed yesterday after the breakage report
from Luben Tuikov.  You need the ce3ca275452cf069eb6451d6f5b0f424a6f046aa commit.
Sorry about that.

Could you try the latest and see if it still breaks?


* Re: More merge questions (why doesn't this work?)
  2005-12-02 21:32                                   ` linux
  2005-12-02 22:00                                     ` Junio C Hamano
@ 2005-12-02 22:12                                     ` Linus Torvalds
  2005-12-02 23:14                                       ` linux
  1 sibling, 1 reply; 64+ messages in thread
From: Linus Torvalds @ 2005-12-02 22:12 UTC (permalink / raw
  To: linux; +Cc: junkio, git



On Fri, 2 Dec 2005, linux@horizon.com wrote:
>
> produces...
> 
> + git merge 'Merge a, b, c' master a b c
> Trying simple merge with a
> Trying simple merge with b
> Trying simple merge with c
> Simple merge did not work, trying automatic merge.
> Removing c
> fatal: merge program failed
> No merge strategy handled the merge.

I'm getting

	...
	+ git merge 'Merge a, b, c' master a b c
	Trying simple merge with a
	Trying simple merge with b
	Trying simple merge with c
	Simple merge did not work, trying automatic merge.
	Removing c
	Merge 9ca217790c7e6581fe0b8b3b4baf026d03584c66, made by octopus.
	 a |    1 +
	 b |    1 +
	 c |    1 -
	 3 files changed, 2 insertions(+), 1 deletions(-)
	 delete mode 100644 c

and I don't see why you wouldn't get that too.

Do you have that broken version of git that had problems with "rmdir" and 
thought the unlink failed? 

		Linus


* Re: More merge questions (why doesn't this work?)
  2005-12-02 22:12                                     ` Linus Torvalds
@ 2005-12-02 23:14                                       ` linux
  0 siblings, 0 replies; 64+ messages in thread
From: linux @ 2005-12-02 23:14 UTC (permalink / raw
  To: linux, torvalds; +Cc: git, junkio

> and I don't see why you wouldn't get that too.
> 
> Do you have that broken version of git that had problems with "rmdir" and 
> thought the unlink failed? 

Quite possibly; my previous version of git was 27 November.
(I've been using the Debian package builder, which insists on a
full rebuild each time and is thus annoyingly slow... especially
the "xmlto man" part.  I think I'll switch to "make; make install")

Anyway, updated and it works as expected.  Sorry for the spurious
complaint.


* Re: git-name-rev off-by-one bug
  2005-11-29 18:46       ` Junio C Hamano
@ 2005-12-04 21:34         ` Petr Baudis
  2005-12-08  6:34           ` as promised, docs: git for the confused linux
  0 siblings, 1 reply; 64+ messages in thread
From: Petr Baudis @ 2005-12-04 21:34 UTC (permalink / raw
  To: Junio C Hamano; +Cc: linux, git

Dear diary, on Tue, Nov 29, 2005 at 07:46:20PM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@suse.cz> writes:
> 
> >   (ii) Cogito will handle trees with some local modifications better -
> > basically any local modifications git-read-tree -m won't care about.
> > I didn't read the whole conversation, so to reiterate: git-read-tree
> > will complain when the index does not match the HEAD, but won't
> > complain about modified files in the working tree if the merge is not
> > going to touch them. Now, let's say you do this (output is visually
> > only roughly or not at all resembling what would real tools tell you):
> >
> > 	$ ls
> > 	a b c
> > 	$ echo 'somelocalhack' >>a
> > 	$ git merge "blah" HEAD remotehead
> > 	File-level merge of 'b' and 'c'...
> > 	Oops, 'b' contained local conflicts.
> > 	Automatic merge aborted, fix up by hand.
> > 	$ fixup b
> > 	$ git commit
> > 	Committed files 'a', 'b', 'c'.
> >
> > Oops. It grabbed your local hack and committed it along the merge.
> 
> Are you sure about this?
> 
> In the above sequence, after you touch a with 'somelocalhack',
> there is no 'git update-index a', until you say 'git commit'
> there, so I do not think that mixup is possible.
> 
> The "fixup b" step is actually two commands, so after merge
> command, you would do:
> 
>         $ edit b
> 	$ git update-index b ;# mark that you are dealt with it
> 	$ git commit ;# commits what is in index
> 
> After the above steps, "git diff" (that is working tree against
> index) still reports your local change to "a", which were _not_
> committed.

Yes. I actually tried it out, but I was confused by the file list in the
commit message (I'm used to seeing just committed files there) and I
didn't check the status of the 'a' file after the commit.

Sorry about the confusion.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.


* Re: More merge questions (why doesn't this work?)
  2005-12-02 20:32                                   ` Junio C Hamano
@ 2005-12-05 15:01                                     ` Sven Verdoolaege
  0 siblings, 0 replies; 64+ messages in thread
From: Sven Verdoolaege @ 2005-12-05 15:01 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

On Fri, Dec 02, 2005 at 12:32:17PM -0800, Junio C Hamano wrote:
> Sven Verdoolaege <skimo@kotnet.org> writes:
> > I do.  git commit won't let me commit an empty tree, or at
> > least I haven't figured out how to make it do that.
> 
> You are right, at least for the initial commit (for subsequent
> commits it happily commits an empty tree).
> 
> Now why anybody would want to it is a different matter.  Is it
> because you would want to record that your project started from
> scratch, as opposed to some import from an existing non
> versioned (or versioned by another SCM) working tree?

Something like that, yes.

In the beginning there was nothing and git committed the nothingness.

skimo


* as promised, docs: git for the confused
  2005-12-04 21:34         ` Petr Baudis
@ 2005-12-08  6:34           ` linux
  2005-12-08 21:53             ` Junio C Hamano
                               ` (2 more replies)
  0 siblings, 3 replies; 64+ messages in thread
From: linux @ 2005-12-08  6:34 UTC (permalink / raw
  To: git; +Cc: linux

As I mentioned with all my questions, I was writing up the answers
I got.  Here's the current status.  If anyone would like to comment on
its accuracy or usefulness, feedback is appreciated.

I've tried to omit or skim very lightly over subjects I think
are adequately explained in existing docs, unless that would
leave an uncomfortable hole in the explanation.

TODO: Describe the config file.  It's a recent invention, and I
haven't found a good description of its contents.

		"I Don't Git It"
		Git for the confused

Git is hardly lacking in documentation, but coming at it fresh, I found
it somewhat confusing.

Git is a toolkit in the Unix tradition.  There are a number of primitives
written in C, which are made friendly by a layer of shell scripts.
These are known in git-speak as the "plumbing" and the "porcelain",
respectively.  The porcelain should work and look nice.  The plumbing
should just deal with lots of crap efficiently.

Much of git's documentation was first written to explain the plumbing to
the people writing the porcelain.  Since then, although the essentials
haven't changed, porcelain has been added and conventions have been
established that make it a lot more pleasant to deal with.  Some commands
have been changed or replaced, and it's not quite the same.

Using the original low-level commands is now most likely more difficult
than necessary, unless you want to do something not supported by the
existing porcelain.

This document retraces (with fewer false turns) how I learned my
way around git.  There are some concepts I didn't understand so well
the first time through, and an overview of all the git commands, grouped
by application.

A good rule of thumb is that the commands with one-word names (git-diff,
git-commit, git-merge, git-push, git-pull, git-status, git-tag, etc.) are
designed for end-user use.  Multi-word names (git-count-objects,
git-write-tree, git-cat-file) are generally designed for use from
a script.

This isn't ironclad.  The first command to start using git is git-init-db,
and git-show-branch is pure porcelain, while git-mktag is a primitive.
And you don't often run git-daemon by hand.  But still, it's a useful
guideline.

* Background material.

To start with, read "man git".  Or Documentation/git.txt in the git
source tree, which is the same thing.  Particularly note the description
of the index, which is where all the action in git happens.

One thing that's confusing is why git allows you to have one version of
a file in the current HEAD, a second version in the index, and possibly a
third in the working directory.  Why doesn't the index just contain a copy
of the current HEAD until you commit a new one?  The answer is merging,
which does all its work in the index.  Neither the object database nor
the working directory let you have multiple files with the same name.

The index is really very simple.  It's a series of structures, each
describing one file.  There's an object ID (SHA1) of the contents,
some file metadata to detect changes (time-stamps, inode number, size,
permissions, owner, etc.), and the path name relative to the root of
the working directory.  It's always stored sorted by path name, for
efficient merging.

At (almost) any time, you can take a snapshot of the index and write
it as a tree object.

The only interesting feature is that each entry has a 2-bit stage number.
Normally, this is always zero, but each path name is allowed up to three
different versions (object IDs) in the index at once.  This is used to
represent an incomplete merge, and an unmerged index entry (with more
than one version) prevents committing the index to the object database.
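
The stage numbers can be seen directly.  The following is a minimal
sketch, assuming a current git (where the commands are spelled "git init"
and so on rather than git-init-db); the file name and contents are made
up.  It plants three versions of one path at stages 1, 2 and 3, which is
what an incomplete merge leaves behind, and then shows that the index
can no longer be written out as a tree.

```sh
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Three different blobs standing in for the O, A and B versions of a file.
o=$(echo "original" | git hash-object -w --stdin)
a=$(echo "our change" | git hash-object -w --stdin)
b=$(echo "their change" | git hash-object -w --stdin)

# Put them into the index at stages 1, 2 and 3 under the same path.
# Each input line is "mode SP sha1 SP stage TAB path".
printf '100644 %s 1\tfile.txt\n100644 %s 2\tfile.txt\n100644 %s 3\tfile.txt\n' \
    "$o" "$a" "$b" | git update-index --index-info

# The third column of the listing is the stage number.
git ls-files --stage

unmerged=$(git ls-files --unmerged | wc -l | tr -d ' ')

# An unmerged index cannot be snapshotted as a tree object.
if git write-tree >/dev/null 2>&1; then
    snapshot=written
else
    snapshot=refused
fi
echo "unmerged entries: $unmerged, write-tree: $snapshot"
```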

* Terminology - heads, branches, refs, and revisions

(This is a supplement to what's already in "man git".)

The most common object needed by git primitives is a tree.  Since a
commit points to a tree and a tag points to a commit, both of these are
acceptable "tree-ish" objects and can be used interchangeably.  Likewise,
a tag is "commit-ish" and can be used where a commit is required.
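
As a small illustration (a sketch assuming a current git; the identity,
file and tag names are invented for the example), a tag object and the
commit it points to can both be handed to a command that wants a tree:

```sh
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Throwaway identity so the example does not depend on local config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

echo hello > greeting
git add greeting
git commit -q -m "first commit"
git tag -a -m "annotated tag" v1

# The two names refer to different kinds of object...
tag_type=$(git cat-file -t v1)
commit_type=$(git cat-file -t HEAD)
echo "v1 is a $tag_type, HEAD is a $commit_type"

# ...but both are "tree-ish", so git ls-tree accepts either and lists
# the same tree.
via_tag=$(git ls-tree v1)
via_commit=$(git ls-tree HEAD)
echo "$via_tag"
```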

As soon as you get to the porcelain, the most commonly used object is
a commit.  Also known as a revision, this is a tree plus a history.

While you can always use the full object ID, you can also use a reference.
A reference is a file that contains a 40-character hex SHA1 object ID
(and a trailing newline).  When you specify the name of a reference,
it is searched for in one of the directories:
	.git/			(or $GIT_DIR)
	.git/refs/		(or $GIT_DIR/refs/)
	.git/refs/heads/	(or $GIT_DIR/refs/heads/)
	.git/refs/tags/		(or $GIT_DIR/refs/tags/)

You may use subdirectories by including slashes in the reference name.
There is no search order; if searching the above four path prefixes
produces more than one match for the reference name (it's ambiguous),
then the name is not valid.
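
The lookup rule as described above can be modelled in a few lines of
shell.  This is only a sketch of the rule as I understand it, not git's
own code, and resolve_ref is a made-up name; it works on a scratch
directory laid out like a .git directory.

```sh
#!/bin/sh
# Print the object ID stored under a reference name, trying the four
# prefixes listed above.  Fails if the name matches nowhere or in more
# than one place (ambiguous).
resolve_ref() {
    dir=$1
    name=$2
    found=""
    count=0
    for prefix in "" refs/ refs/heads/ refs/tags/; do
        candidate=$dir/$prefix$name
        if [ -f "$candidate" ]; then
            found=$candidate
            count=$((count + 1))
        fi
    done
    [ "$count" -eq 1 ] || return 1
    cat "$found"
}

# A fake repository with one unambiguous and one ambiguous name.
git_dir=$(mktemp -d)
mkdir -p "$git_dir/refs/heads" "$git_dir/refs/tags"
printf '%040d\n' 1 > "$git_dir/refs/heads/master"
printf '%040d\n' 2 > "$git_dir/refs/heads/dup"
printf '%040d\n' 3 > "$git_dir/refs/tags/dup"

resolve_ref "$git_dir" master
resolve_ref "$git_dir" dup || echo "dup is ambiguous"
# Spelling out more of the path removes the ambiguity.
resolve_ref "$git_dir" refs/heads/dup
```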

There is additional syntax (which looks like "commit~3^2~17") for
specifying an ancestor of a given commit (or tag).  This is documented
in detail in the documentation for git-rev-parse.  Briefly, commit^
is the parent of the given commit.  commit^^ is the grandparent, etc.
If there is more than one ancestor (a merge), then they can be referenced
as commit^1 (a synonym for commit^), commit^2, commit^3, etc.  (commit^0
gives the commit object itself.  A no-op if you're starting from a commit,
but it lets you get the commit object from a tag object.)

As long strings of ^ can be annoying, they can be abbreviated using ~
syntax.  commit^^^ is the same as commit~3, ^^^^ is the same as ~4, etc.
You can see lots of examples in the output of "git-show-branch".
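
A quick way to convince yourself of the equivalences is to ask
git-rev-parse.  This sketch assumes a current git and builds a
throwaway four-commit history (the commit messages and identity are
invented):

```sh
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Throwaway identity so the example does not depend on local config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# A straight line of four commits; empty commits are enough here.
for n in 1 2 3 4; do
    git commit -q --allow-empty -m "commit $n"
done

# Three carets and ~3 name the same great-grandparent.
carets=$(git rev-parse 'HEAD^^^')
tilde=$(git rev-parse 'HEAD~3')

# ^1 is just a longer spelling of ^.
first_parent=$(git rev-parse 'HEAD^1')
plain_parent=$(git rev-parse 'HEAD^')

echo "HEAD^^^ = $carets"
echo "HEAD~3  = $tilde"
```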

Now, although the most primitive git tools don't care, a convention
among all the porcelain is that the current head of development is
.git/HEAD, a symbolic link to a reference under refs/heads/.

git-init-db creates HEAD pointing to refs/heads/master, and that is
traditionally the name used for the "main trunk" of development.
Note that initially refs/heads/master doesn't exist - HEAD is a
dangling symlink!  This is okay, and will cause the initial commit
to have zero parents.
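
The dangling state is easy to observe.  A minimal sketch, assuming a
current git (which may record HEAD as a small "symbolic ref" file rather
than a symlink, and may be configured to name the initial branch
something other than master):

```sh
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Where HEAD points, e.g. refs/heads/master.
head_target=$(git symbolic-ref HEAD)

# Before the first commit the target does not exist yet, so HEAD
# cannot be resolved to an object ID.
if git rev-parse --verify -q HEAD >/dev/null; then
    born=yes
else
    born=no
fi
echo "HEAD -> $head_target (has a commit: $born)"
```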

A "head" is mostly synonymous with a "branch", but the terms have
different emphasis.  The "head" is particularly the tip of the branch,
where future development will be appended.  A "branch" is the entire
development history leading to the head.  However, as far as git is
concerned, they're both references to commit objects, referred to from
refs/heads/.

When you actually create a new commit (with git-commit or git-merge),
the current HEAD reference is overwritten with the new commit's id, and
the old HEAD becomes HEAD^.  Since HEAD is a symlink, it's the file
in refs/heads/ that's actually overwritten.  (See the git-update-ref
documentation for further details.)

The git-checkout command actually changes the HEAD symlink.  git-checkout
enforces the rule that it will only check out a branch under refs/heads.
You can use refs/tags as a source for git-diff or any other command that
only examines the revision, but if you want to check it out, you have
to copy it to refs/heads.

* Resetting

The "undo" command for commits to the object database is git-reset.
Like all deletion-type commands, be careful or you'll hurt yourself.
Given a commit (using any of the syntaxes mentioned above), this
sets the current HEAD to refer to the given commit.

This does NOT alter the HEAD symlink (as git-checkout <branch> will
do), but actually changes the reference pointed to by HEAD
(e.g. refs/heads/master) to contain a new object ID.

The classic example: to undo an erroneous commit, use
"git-reset HEAD^".

There are actually three kinds of git-reset:
git-reset --soft: Only overwrite the reference.  If you can find the
	old object ID, you can put everything back with a second
	git-reset --soft OLD_HEAD.
git-reset --mixed: This is the default, which I always think of
	as "--medium".  Overwrite the reference, and (using
	git-read-tree) read the commit into the index.  The
	working directory is unchanged.
git-reset --hard: Do everything --mixed does, and also check out
	the index into the working directory.  This really erases
	all traces of the previous version.  (One caveat: this
	will not delete any files in the working directory that
	were added as part of the changes being undone.)

The space taken up by the abandoned commit won't actually be
reclaimed until you collect garbage with git-prune.
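
The difference between the three kinds is easiest to see by trying them
on a scratch repository.  The following is a sketch assuming a current
git; the file name "f" and its contents are invented.  Each step undoes
the second of two commits and then looks at what is left in the index
and the working directory.

```sh
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Throwaway identity so the example does not depend on local config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

echo one > f
git add f
git commit -q -m first
echo two > f
git commit -q -a -m second
second=$(git rev-parse HEAD)

# --soft: only the reference moves; the index still holds "two".
git reset -q --soft HEAD^
soft_index=$(git show :f)
git reset -q --soft "$second"      # put everything back

# --mixed: the index is reset to "one", the working file still says "two".
git reset -q --mixed HEAD^
mixed_index=$(git show :f)
mixed_work=$(cat f)
git reset -q --hard "$second"      # put everything back

# --hard: reference, index and working file all go back to "one".
git reset -q --hard HEAD^
hard_work=$(cat f)

echo "soft:  index=$soft_index"
echo "mixed: index=$mixed_index work=$mixed_work"
echo "hard:  work=$hard_work"
```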

git-reset with no commit specified is "git-reset HEAD", which is much
safer because the object reference is not actually changed.  This can
be used to undo changes in the index or working directory that you did
not intend.  Note, however, that it is not selective.  "git-commit"
has options for doing this selectively.

Like being sure what directory you're in when typing "rm -r", think
carefully about what branch you're on when typing "git-reset <commit>".

There is an undelete: git-reset stores the previous HEAD commit
in ORIG_HEAD.  And git-lost-found can find leftover commits
until you do a git-prune.
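
A sketch of the undelete, assuming a current git, where the saved
reference is named ORIG_HEAD (the commits and identity are invented for
the example):

```sh
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Throwaway identity so the example does not depend on local config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

git commit -q --allow-empty -m first
git commit -q --allow-empty -m second
before=$(git rev-parse HEAD)

# Throw away the second commit...
git reset -q --hard HEAD^
saved=$(git rev-parse ORIG_HEAD)

# ...and change our mind: the old head was remembered.
git reset -q --hard ORIG_HEAD
after=$(git rev-parse HEAD)

echo "before=$before"
echo "after =$after"
```

Note that the second reset overwrites ORIG_HEAD in turn, so this only
remembers one step.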

* Merging

Merging is central to git operations.  Indeed, a big difference between
git and other version control systems is that git assumes that a change
will be merged more often than it's written, as it's passed around
different developers' repositories.  Even "git checkout" is a merge.

The heart of merging is git-read-tree, but if you can understand it from
the man page, you're doing better than me.

As mentioned, the index and the working directory versions of a file
could both be different from the HEAD.  Git lets you merge "under" your
current working directory edits, as long as the merge doesn't change
the files you're editing.
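
Here is a sketch of merging "under" an uncommitted edit, assuming a
current git; the branch name "side", the files a, b, c and the identity
are invented.  File "a" carries a local hack that is never added to the
index, while the merge itself only touches "b" and "c".

```sh
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Throwaway identity so the example does not depend on local config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

echo "File A" > a
echo "File B" > b
echo "File C" > c
git add a b c
git commit -q -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# One branch changes b...
git checkout -q -b side
echo "Modifications to b" >> b
git commit -q -a -m "Modified file b"

# ...the other changes c, so a real merge commit is needed.
git checkout -q "$main_branch"
echo "Modifications to c" >> c
git commit -q -a -m "Modified file c"

# Uncommitted edit in the working directory only.
echo "somelocalhack" >> a

git merge -q --no-edit side

# The merge commit has two parents, and the local hack is still an
# uncommitted difference between the working directory and the index.
parents=$(git log -1 --pretty=%P | wc -w | tr -d ' ')
dirty=$(git diff --name-only)
echo "parents: $parents, still modified: $dirty"
```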

There are some special cases of merging, but let me start with the
procedure for the general 3-way merge case: merging branch B into branch A
(the current HEAD).

1) Given two commits, find a common ancestor O to serve as the origin
   of the merge.  The basic "resolve" algorithm uses git-merge-base for
   the task, but the "recursive" merge strategy gets more clever in the
   case where there are multiple candidates.  I won't go into what it
   does, but it does a pretty good job.

2) Add all three input trees (the Origin, A, and B) to the index by
   "git-read-tree -m O A B".  The index now contains up to three copies
   of every file.  (Four, including the original, but that is discarded
   before git-read-tree returns.)

   Then, for each file in the index, git-read-tree does the following:

   2a) For each file, git-read-tree tries to collapse the various
       versions into one using the "trivial in-index merge".  This just
       uses the file blob object names to see if the file contents
       are identical, and if two or more of the three trees contain an
       identical copy of this file, it is merged.  A missing (deleted)
       file matches another missing file.

       Note that this is NOT a majority vote.  If A and B agree on the
       contents of the file, that's what is used.  (Whether O agrees is
       irrelevant in this case.)  But if O and A agree, then the change
       made in B is taken as the final value.  Likewise, if O and B agree,
       then A is used.

   2b) If this is possible, then a check is made to see if the merge would
       conflict with any uncommitted work in the index or change the index
       out from under a modified working directory file.  If either of
       those cases happens, the entire merge is backed out and fails.

       (In the git source, the test "t/t1000-read-tree-m-3way.sh" has
       a particularly detailed description of the various cases.)

       If the merge is possible and safe, the versions are collapsed
       into one final result version.

   2c) If all three versions differ, the trivial in-index merge is
       not possible, and the three source versions are left in the
       index unmerged.  Again, if there was uncommitted work in the
       index or the working directory, the entire merge fails.

3) Use git-merge-index to iterate over the remaining unmerged files, and
   apply an intra-file merge.  The intra-file merge is usually done with
   git-merge-one-file, which does a standard RCS-style three-way merge
   (see "man merge").

4) Check out all the successfully merged files into the working directory.

5) If automatic merging was successful on every file, commit the merged
   version immediately and stop.

6) If automatic merging was not complete, then replace the working
   directory copies of any remaining unmerged files with a merged copy
   with conflict markers (again, just like RCS or CVS) in the working
   directory.  All three source versions are available in the index for
   diffing against.

   (We have not destroyed anything, because in step 2c), we checked to make
   sure the working directory file didn't have anything not in the
   repository.)

7) Manually edit the conflicts and resolve the merge.  As long as an
   unmerged, multi-version file exists in the index, committing the
   index is forbidden.  (You can use the options to git-diff to
   see the changes.)

8) Commit the final merged version of the conflicting file(s), replacing
   the unmerged versions with the single finished version.

Note that if the merge is simple, with no one file edited on both
branches, git never has to open a single file.  It reads three tree
objects (recursively) and stat(2)s some working directory files to
verify that they haven't changed.

Also note that this aborts and backs out rather than overwrite
anything not committed.  You can merge "under" uncommitted edits
only if those edits are to files not affected by the merge.
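
To make the rule in step 2a concrete, here is a rough sketch in Python.
It is my own illustration of the rule as described above, not git's
code: the function name trivial_merge and the use of None for a missing
file are assumptions made for the example.

```python
# Sketch of the "trivial in-index merge" rule from step 2a.
# Each argument is a blob object name (any string), or None if the
# file is missing from that tree.  All names here are hypothetical.

def trivial_merge(o, a, b):
    """Return ("merged", blob) or ("unmerged", None) for one file."""
    if a == b:
        # Both sides agree (possibly both deleted); O is irrelevant.
        return ("merged", a)
    if o == a:
        # Only B changed the file, so B's version is the result.
        return ("merged", b)
    if o == b:
        # Only A changed the file, so A's version is the result.
        return ("merged", a)
    # All three differ: left unmerged for the intra-file merge (step 3).
    return ("unmerged", None)
```

Note that a file deleted on one side and untouched on the other comes
out as ("merged", None), i.e. the deletion wins, just like any other
one-sided change.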

* 2-way merging

A "2-way merge" is basically a 3-way merge with the contents of the
index as the "current HEAD", and the original HEAD as the Origin.
However, this merge is designed only for simple cases and only supports
the "trivial merge" cases.  It does not fall back to an intra-file merge.

[[ I'm not sure why it couldn't, I confess.  For reversibility?  Or
just because it's likely to be too confusing.  ]]

This merge is used by git-checkout to switch between two branches,
while preserving any changes in the working directory and index.

Like the 3-way case, if a particular file hasn't changed between
the two heads, then git will preserve any uncommitted edits.
If the file has changed in any way, git doesn't try to perform
any sort of intra-file merge, it just fails.
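
As a rough sketch of the per-file decision just described (a
simplification of my own; the git-read-tree man page has a longer case
table, and the function name and return values here are hypothetical):

```python
# Simplified per-file rule for switching from old_head to new_head.
# Arguments are blob object names, or None for a missing file.
# This models only the behaviour described in the text above.

def two_way(old_head, new_head, index):
    """Return ("keep", blob), ("take", blob) or ("fail", None)."""
    if old_head == new_head:
        # File is the same in both heads: uncommitted edits survive.
        return ("keep", index)
    if index == old_head:
        # File differs between heads, but there are no local edits.
        return ("take", new_head)
    # File differs between heads AND has local edits: no intra-file
    # merge is attempted, the checkout just fails.
    return ("fail", None)
```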

* 1-way merging

This is not actually used by the git-core porcelain, and so is only
useful to someone writing more porcelain, but I'll describe it for
completeness.

Plain (non-merging) git-read-tree will overwrite the index entries with
those from the tree.  This invalidates the cached stat data, causing
git to think all the working directory files are "potentially changed"
until you do a git-update-index --refresh.

By specifying a 1-way merge, any index entry whose contents (object ID)
matches the incoming tree will have its cached stat data preserved.
Thus, git will know if the working directory file is not changed, and
will not overwrite it if you execute git-checkout-index.

This is purely an efficiency hack.

* Special merges - already up to date, and fast-forward

There are two cases of 3-way or 2-way merging that are special.
Recall that the basic merge pattern is

   B--------> A+B
  /        /
 /        /
O -----> A

The two special cases arise if one of A or B is a direct ancestor of
the other.  In that case, the common ancestor of both A and B is the
older of the two commits.  And the merged result is simply the
newer of the two, unchanged.

Recalling that we are merging B into A, if B is a direct ancestor of A,
then A already includes all of B.  A is "already up to date" and not
changed at all by the merge.

The other case you'll hear mentioned, because it happens a lot when
pulling, is when A is a direct ancestor of B.  In this case, the
result of the merge is a "fast-forward" to B.

Both of these cases are handled very efficiently by the in-index merge
done by git-read-tree.
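
The classification can be sketched on a toy history in Python.  This is
only an illustration under my own assumptions (history as a dict mapping
each commit name to its list of parents; HISTORY, ancestors and
classify_merge are made-up names), not how git implements it.

```python
# Toy history:  O -> A -> C   and   O -> B
HISTORY = {"O": [], "A": ["O"], "B": ["O"], "C": ["A"]}

def ancestors(parents, commit):
    """All commits reachable from 'commit', including itself."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def classify_merge(parents, a, b):
    """Classify merging b into a (a is the current HEAD)."""
    if b in ancestors(parents, a):
        return "already up to date"   # a already includes all of b
    if a in ancestors(parents, b):
        return "fast-forward"         # result is simply b
    return "true merge"               # needs a real 3-way merge
```

For example, classify_merge(HISTORY, "A", "C") is a fast-forward, while
classify_merge(HISTORY, "A", "B") needs a true merge with O as origin.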

* Deleted files during merges

There is one small wrinkle in git's merge algorithm that will probably
never bite you, but I ought to explain anyway, just because it's so rare
that it's difficult to discover it by experiment.

The index contains a list of all files that git is tracking.  If the
index file is empty or missing and you do a commit, you write an empty
tree with no files.

When merging, if git finds no pre-existing index entry for a path it is
trying to merge, it considers that to mean "status unknown" rather than
"modified by being deleted".  Thus, this is not uncommitted work
in the index file, and does not block the merge.  Instead, the
file will reappear in the merge.

This is because it is possible to blow away the index file (rm .git/index
will do it quite nicely), and if this was considered a modification to
be preserved, it would cause all sorts of conflicts.

So the one change to the index that will NOT be preserved by a merge is
the removal of a file.  A missing index entry is treated the same as an
unmodified index entry.  The index will be updated, and when you check
out the revision, the working directory file will be (re-)created.

Note that none of this affects you in the usual case where you make
changes in the working directory only, and leave the index equal to HEAD
until you're ready to commit.
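
Put as a tiny Python sketch (my own paraphrase of the rule, with a
hypothetical function name, and None standing for "no index entry"):

```python
# Does the index entry for a path count as uncommitted work that a
# merge must preserve?  Blob names are strings; None means the path
# has no entry in the index at all.

def is_uncommitted_index_work(head_blob, index_blob):
    if index_blob is None:
        # Missing entry means "status unknown", NOT "deleted on
        # purpose", so it is treated like an unmodified entry.
        return False
    return index_blob != head_blob
```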

* Packs

Originally, git stored every object in its own file, and used rsync
to share repositories.  It was quickly discovered that this brought
mighty servers to their knees.  It's great for retrieving a small
subset of the database the way git usually does, but rsync scans the
whole .git/objects tree every time.

So packs were developed.  A pack is a file, built all at once, which
contains many delta-compressed objects.  With each .pack file,
there's an accompanying .idx file that indexes the pack so that
individual objects can be retrieved quickly.

You can reduce the disk space used by your repositories by periodically
repacking them with git-repack.  Normally, this makes a new incremental
pack of everything not already packed.  With the -a flag, this repacks
everything for even greater compression (but takes longer).

The git wire protocol basically consists of negotiation over what
objects need to be transferred, followed by sending a custom-built pack.
The .idx file can be reconstructed from the .pack file, so it's
never transferred.

[[ Is once every few hundred commits a good rule of thumb for repacking?
When .git/objects/?? reaches X megabytes?  I think too many packs is
itself a bad thing, since they all have to be searched. ]]

* Raw diffs

A major source of git's speed is that it tries to avoid accessing files
unnecessarily.  In particular, files can be compared based on their
object IDs without needing to open and read them.  As part of this,
the responsibility for finding file differences (printing diffs) is
divided into finding what files have changed, and finding the changes
within those files.

This is all explained in the Documentation/diffcore.txt in the git
distribution, but the basic idea is that many of the primitives spit out a
line like this:
:100755 100755 68838f3fad1d22ab4f14977434e9ce73365fb304 0000000000000000000000000000000000000000 M	git-bisect.sh
when asked for a diff.  This is known as a "raw diff".  They can be
told to generate a human-readable diff with the "-p" (patch) flag.
The git-diff command includes this by default.
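
If you want to pick a raw diff line apart in a script, the fields are
two modes, two object IDs and a status letter, then a tab and the path.
A minimal Python sketch (RawDiff and parse_raw_diff are names I made up,
and this handles only the simple one-path form shown above):

```python
from collections import namedtuple

# Field names are my own labels for the columns of a raw diff line.
RawDiff = namedtuple("RawDiff",
                     "old_mode new_mode old_id new_id status path")

def parse_raw_diff(line):
    """Split one raw diff line into its six fields."""
    # Everything before the first tab is metadata; the rest is the path.
    meta, path = line.rstrip("\n").split("\t", 1)
    old_mode, new_mode, old_id, new_id, status = meta.lstrip(":").split(" ")
    return RawDiff(old_mode, new_mode, old_id, new_id, status, path)

EXAMPLE = (":100755 100755 68838f3fad1d22ab4f14977434e9ce73365fb304 "
           "0000000000000000000000000000000000000000 M\tgit-bisect.sh")
```

Here parse_raw_diff(EXAMPLE).status is "M" and .path is "git-bisect.sh".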

* Advice on using git

If you're used to CVS, where branches and merges are "advanced" features
that you can go a long time without using, you need to learn to use
branches in git a lot more.  Branch early and often.  Every time you
think about developing a feature or fixing a bug, create a branch to do
it on.

In fact, avoid doing any development on the master branch.  Merges only.

A branch is the git equivalent of a patch, and merging a branch is the
equivalent of applying that patch.  A branch gives the patch a name that
you can use to refer to it.  This is particularly useful if you're
sharing your changes.

Once you're done with a branch, you can delete it.  This is basically
just removing the refs/heads/<branch> file, but "git-branch -d" adds a
few extra safety checks.  Assuming you merged the branch in, you can
still find all the commits in the history, it's just the name that's
been deleted.

You can also rename a branch by renaming the refs/heads/branch file.
There's no git command to do this, but as long as you update
the HEAD symlink if necessary, you don't need one.

Periodically merge all of the branches you're working on into a testing
branch to see if everything works.  Blow away and re-create the
testing branch whenever you do this.  When you like the result,
merge them into the master.

* The .git directory

There are a number of files in the .git directory used by the
porcelain.  In case you're curious (I was), this is what they are:

index
- The actual index file.

objects/
- The object database.  Can be overridden by $GIT_OBJECT_DIRECTORY

hooks/
- where the hook scripts are kept.  The standard git template includes
  examples, but disabled by being marked non-executable.

info/exclude
- Default project-wide list of file patterns to exclude from notice.
  To this is added the per-directory list in .gitignore.
  See the git-ls-files docs for full details.

refs/
- References to development heads (branches) and tags.

remotes/
- Short names of remote repositories we pull from or push to.
  Details are in the "git-fetch" man page.

HEAD
- The current default development head.
- Created by git-init-db and never deleted
- Changed by git-checkout
- Used by git-commit and any other command that commits changes.
- May be a dangling pointer, in which case git-commit
  does an "initial checkin" with no parent.

COMMIT_EDITMSG
- Temp used by git-commit to edit a commit message.
COMMIT_MSG
- Temp used by git-commit to form a commit message,
  post-processed from COMMIT_EDITMSG.

FETCH_HEAD
- Just-fetched commits, to be merged into the local trunk.
- Created by git-fetch.
- Used by git-pull as the source of data to merge.

MERGE_HEAD
- Keeps track of what heads are currently being merged into HEAD.
- Created by git-merge --no-commit with the heads used 
- Deleted by git-checkout and git-reset (since you're abandoning
  the merge)
- Used by git-commit to supply additional parents to the current commit.
  (And deleted when done.)

MERGE_MSG
- Generated by git-merge --no-commit.
- Used by git-commit as the commit message for a merge
  (If present, git-commit doesn't prompt.)

MERGE_SAVE
- cpio archive of all locally modified files created by
  "git-merge" before starting to do anything, if multiple
  merge strategies are being attempted.
  Used to rewind the tree in case a merge fails.

ORIG_HEAD
- Previous HEAD commit prior to a merge or reset operation.

LAST_MERGE
- Set by the "resolve" strategy to the most recently merged-in branch.
  Basically, a copy of MERGE_HEAD.  Not used by the other merge strategies,
  and resolve is no longer the default, so its utility is very limited.

BISECT_LOG
- History of a git-bisect operation.
- Can be replayed (or, more usefully, a prefix can) by "git-bisect replay"
BISECT_NAMES
- The list of files to be modified by git-bisect.
- Set by "git-bisect start"

TMP_HEAD (used by git-fetch)
TMP_ALT (used by git-fetch)

* Git command summary

There are slightly over a hundred git commands.  This section tries to
classify them by purpose, so you can know which commands are intended to
be used for what.  You can always use the low-level plumbing directly,
but that's inconvenient and error-prone.

Helper programs (not for direct use) for a specific utility are shown
indented under the program they help.

Note that all of these can be invoked using the "git" wrapper by replacing
the leading "git-" with "git ".  The results are exactly the same.
There is a suggestion to reduce the clutter in /usr/bin and move all
the git binaries to their own directory, leaving just the git wrapper
in /usr/bin, so you'd have to use it or adjust your $PATH.  But that
hasn't happened yet.  In the meantime, including the hyphen makes
tab-completion work.

I include ".sh", ".perl", etc. suffixes to show what the programs are
written in, so you can read those scripts written in languages you're
familiar with.  These are the names in the git source tree, but the
suffix is not included in the /usr/bin copies.

+ Administrative commands
git-init-db

+ Object database maintenance:
git-convert-objects
git-fsck-objects
git-lost-found.sh
git-prune.sh
git-relink.perl

+ Pack maintenance
git-count-objects.sh
git-index-pack
git-pack-objects
git-pack-redundant
git-prune-packed
git-repack.sh
git-show-index
git-unpack-objects
git-verify-pack

+ Important primitives
git-commit-tree
git-rev-list
git-rev-parse

+ Useful primitives
git-ls-files
git-update-index

+ General script helpers (used only by scripts)
git-cat-file
git-check-ref-format
git-checkout-index
git-fmt-merge-msg.perl
git-hash-object
git-ls-tree
git-repo-config
git-unpack-file
git-update-ref
git-sh-setup.sh
git-stripspace
git-symbolic-ref
git-var
git-write-tree

+ Oddballs
git-mv.perl

+ Code browsing
git-diff.sh
  git-diff-files
  git-diff-index
  git-diff-tree
  git-diff-stages
git-grep.sh
git-log.sh
git-name-rev
git-shortlog.perl
git-show-branch
git-whatchanged.sh

+ Making local changes
git-add.sh
git-bisect.sh
git-branch.sh
git-checkout.sh
git-commit.sh
git-reset.sh
git-status.sh

+ Cherry-picking
git-cherry.sh
  git-patch-id
git-cherry-pick.sh
git-rebase.sh
git-revert.sh

+ Accepting changes by e-mail
git-apply
git-am.sh
  git-mailinfo
  git-mailsplit
  git-applypatch.sh
git-applymbox.sh

+ Publishing changes by e-mail
git-format-patch.sh
git-send-email.perl

+ Merging
git-merge.sh
  git-merge-base
  git-merge-index
    git-merge-one-file.sh
  git-merge-octopus.sh
  git-merge-ours.sh
  git-merge-recursive.py
  git-merge-resolve.sh
  git-merge-stupid.sh
git-read-tree
git-resolve.sh
git-octopus.sh

+ Making releases
git-get-tar-commit-id
git-tag.sh
  git-mktag
git-tar-tree
git-verify-tag.sh

+ Accepting changes by network
git-clone.sh
  git-clone-pack
git-fetch.sh
  git-fetch-pack
  git-local-fetch
  git-http-fetch
  git-ssh-fetch
git-ls-remote.sh
  git-peek-remote
git-parse-remote.sh
git-pull.sh
  git-ssh-pull
git-shell
  git-receive-pack

+ Publishing changes by network
git-daemon

git-push.sh
  git-http-push
  git-ssh-push
  git-ssh-upload
git-request-pull.sh
git-send-pack
git-update-server-info
git-upload-pack

All of the basic git commands are designed to be scripted.  When
scripting, use the "--" option to ensure that files beginning with
"-" won't be interpreted as options, and the "-z" option to output
NUL-terminated file names so embedded newlines won't break things.

(A person who'd do either of these on purpose is probably crazy, but
it's not actually illegal.)
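
The same two precautions apply if you script in something other than
shell.  A small Python sketch, assuming you have already captured the
"-z" output of some git command as bytes (split_z and
update_index_command are hypothetical helper names):

```python
def split_z(data):
    """Split NUL-terminated output (bytes) into a list of file names."""
    # A trailing NUL leaves one empty piece at the end; drop it.
    return [name for name in data.split(b"\0") if name]

def update_index_command(names):
    """Build an argument list; "--" keeps odd names from being options."""
    return [b"git-update-index", b"--add", b"--"] + list(names)

# Names with a leading "-" or an embedded newline survive intact.
SAMPLE = b"plain.c\0-rf\0two\nlines.txt\0"
```

Splitting on newlines instead would have broken "two\nlines.txt" into
two bogus names.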

Looking at existing shell scripts can be very informative.

* Detailed list

Here's a repeat, including descriptions.  I don't try to include
every detail you can find on the man page, but to explain when
you'd want to use a command.

+ Administrative commands
git-init-db
	This creates an empty git repository in ./.git (or $GIT_DIR if
	that is non-null) using a system-wide template.
	It won't hurt an existing repository.

+ Object database maintenance:
git-convert-objects
	You will *never* need to use this command.
	The git repository format has undergone some revision since its
	first release.  If you have an ancient and crufty git repository
	from the very very early days, this will convert it for you.
	But as you're new to git, it doesn't apply.
git-fsck-objects
	Validate the object database.  Checks that all references
	point somewhere, all the SHA1 hashes are correct, and that
	sort of thing.

	This walks the entire repository, uncompressing and hashing
	every object, so it takes a while.  Note that by default,
	it skips over packs, which can make it seem misleadingly fast.
git-lost-found.sh
	Find (using git-fsck-objects) any unreferenced commits and
	tags in the object database, and place them in a .git/lost-found
	directory.  This can be used to recover from accidentally
	deleting a tag or branch reference that you wanted to keep.

	This is the opposite of git-prune.
git-prune.sh
	Delete all unreachable objects from the object database.
	It deletes useless packs, but does not remove useless
	objects from the middle of partially useful packs.

	Git leaks objects in a number of cases, such as unsuccessful
	merges.  The leak rate is generally a small fraction of
	the rate at which the desired history grows, so it's not
	very alarming, but occasionally running git-prune will
	eliminate the leaked objects.

	If you deliberately throw away a development branch, you
	will need to run this command to fully reclaim the disk space.

	On something like the full Linux repository, this takes
	a while.
git-relink.perl
	Merge the object stores of multiple git repositories by
	making hard links between them.  Useful to save space if
	duplicate copies are accidentally created on one machine.

+ Pack maintenance
	The classic git format is to compress and store each object
	separately.  This is still used for all newly created changes.
	However, objects can also be stored en masse in "packs" which
	contain many objects and can take advantage of delta-compressing.
	Repacking your repositories periodically can save space.
	(Repacking is pretty quick but not quick enough to be
	comfortable doing every commit.)
git-count-objects.sh
	Print the number and total size of unpacked objects in the
	repository, to help you decide when is a good time to repack.
git-index-pack
	A pack file has an accompanying .idx file to allow rapid lookup.
	This regenerates the .idx file from the .pack.  This is almost never
	needed directly, but can be used after transferring a .pack file
	between machines.
git-pack-objects
	Given a list of objects on stdin, build a pack file.  This is
	a helper used by the various network communication scripts.
git-pack-redundant
	Produce a list of redundant packs, for feeding to "xargs rm".
	A helper for git-prune.
git-prune-packed
	Delete unpacked object files that are duplicated in packs.
	(With -n, only lists them.)  A helper for git-prune.
git-repack.sh
	Make a new pack with all the unpacked objects.
	With -a, include already-packed objects in the new pack.
	With -d as well, deletes all the old packs thereby made redundant.
git-show-index
	Dump the contents of a pack's .idx file.  Mostly for
	debugging git itself.
git-unpack-objects
	Unpack a .pack file, the opposite of git-pack-objects.  With -n,
	doesn't actually create the files.  With -q, suppresses the
	progress indicator.
git-verify-pack
	Validate a pack file.  Useful when debugging git, and when
	downloading from a remote source.  A helper for git-clone.

+ Important primitives
	Although these primitives are not used directly very frequently,
	understanding them will help you understand other git commands
	that wrap them.
git-commit-tree
	Create a new commit object from a tree and a list of parent
	commits.  This is the primitive that's the heart of git-commit.
	(It's also used by git-am, git-applypatch, git-merge, etc.)
git-rev-list
	Print a list of commits (revisions), in reverse chronological
	order.  This is the heart of git-log and other history examination
	commands, and the options for specifying parts of history are
	shared by all of them.

	In particular, it takes an arbitrary number of revisions as
	arguments, some of which may be prefixed with ^ to negate them.
	These make up "include" and "exclude" sets.  git-rev-list
	lists all revisions that are ancestors of the "include" set
	but not ancestors of the "exclude" set.

	For this purpose, a revision is considered an ancestor of itself.
	Thus, "git-rev-list ^v1.1 v1.2" will list all revisions from
	the v1.2 release back to (but not including) the v1.1 release.

	Because this is so convenient, a special syntax, "v1.1..v1.2"
	is allowed as an equivalent.  However, occasionally the general
	form is useful.  For example, adding ^branch will show the trunk
	(including merges from the branch), but exclude the branch itself.

	Similarly, "branch ^trunk", a.k.a. trunk..branch, will show
	all work on the branch that hasn't been merged to the trunk.
	This works even though trunk is not a direct ancestor of branch.

	Git-rev-list has a variety of flags to control its output format.
	The default is to just print the raw SHA1 object IDs of the
	commits, but --pretty produces a human-readable log.

	You can also specify a set of file names (or directories),
	in which case output will be limited to commits that modified
	those files.

	This command is used extensively by the git:// protocol to compute
	a set of objects to send to update a repository.
git-rev-parse
	This is a very widely used command line canonicalizer for git
	scripts.  It converts relative commit references (e.g. master~3)
	to absolute SHA1 hashes, and can also pass through arguments
	not recognizable as references, so the script can interpret them.

	It is important because it defines the <rev> syntax.

	This takes a variety of options to specify how to prepare the
	command line for the script's use.  --verify is a particularly
	important one.
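
The "include" and "exclude" sets of git-rev-list described above boil
down to a set difference over ancestors.  Here is a toy Python sketch
under my own assumptions (history as a dict from commit name to parent
list; it returns an unordered set, whereas the real command also sorts
its output):

```python
# Toy history: v1.0 -> v1.1 -> c1 -> v1.2 on the trunk,
# with a side branch v1.1 -> b1 -> b2.
PARENTS = {
    "v1.0": [], "v1.1": ["v1.0"], "c1": ["v1.1"], "v1.2": ["c1"],
    "b1": ["v1.1"], "b2": ["b1"],
}

def reachable(parents, starts):
    """Every commit that is an ancestor of (or equal to) one of 'starts'."""
    seen = set()
    stack = list(starts)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def rev_list(parents, include, exclude=()):
    """Ancestors of the include set minus ancestors of the exclude set."""
    return reachable(parents, include) - reachable(parents, exclude)
```

So rev_list(PARENTS, ["v1.2"], ["v1.1"]) plays the role of v1.1..v1.2,
and rev_list(PARENTS, ["b2"], ["v1.2"]) gives the unmerged branch work
even though v1.2 is not an ancestor of b2.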

+ Useful primitives
	These primitives are potentially useful directly.
git-ls-files
	List files in the index and/or working directory.  A variety of
	options control which files to list, based on whether they
	are the same in both places or have been modified.  This command
	is the start of most check-in scripts.
git-update-index
	Copy the given files from the working directory into the index.
	This creates the blob objects, but no trees yet.  (Note that
	editing a file and executing this multiple times without creating
	a commit will generate orphaned objects.  Harmless.)

	One common safe option is "git-update-index --refresh".  This
	looks for files whose metadata (modification time etc.) has
	changed, but not their contents, and updates the metadata in the
	index so the file contents won't have to be examined again.

+ General script helpers (used only by scripts)
	These are almost exclusively helpers for use in porcelain
	scripts and have little use by themselves from the command line.
git-cat-file
	Extract a file from the object database.  You can ask for
	an object's type or size given only an object ID, but
	to get its contents, you have to specify the type.  This
	is a deliberate safety measure.
git-check-ref-format
	Verify that the reference specified on the command line is
	syntactically valid for a new reference name under $GIT_DIR/refs.
	A number of characters (^, ~, :, and ..) are reserved; see the man
	page for the full list of rules.
git-checkout-index
	Copy files from the index to the working directory, or to a
	specified directory.  Most important as a helper for git-checkout,
	this is also used by git-merge and git-reset.
git-fmt-merge-msg.perl
	Generate a reasonable default commit message for a merge.
	Used by git-pull and git-octopus.
git-hash-object
	Very primitive helper to turn an arbitrary file into an object,
	returning just the ID or actually adding it to the database.
	Used by the cvs-to-git and svn-to-git import filters.
git-ls-tree
	List the contents of a tree object.  Will tell you all the files
	in a commit.  Used by the checkout scripts git-checkout and git-reset.
git-repo-config
	Get and set options in .git/config.  The .git/config format
	is designed to be human-readable.  This gives programmatic
	access to the settings.  This currently has a lot of overlap
	with the function of git-var.
git-unpack-file
	Write the contents of the given blob to a temporary file,
	and return the name of that temp file.  Used most often
	by merging scripts.
git-update-ref
	Rewrite a reference (in .git/refs/) to point to a new object.
	"echo $sha1 > $file" is mostly equivalent, but this adds locking
	so two people don't update the same reference at once.
git-sh-setup.sh
	This is a general prefix script that sets up
	$GIT_DIR and $GIT_OBJECT_DIRECTORY for a script,
	or errors out if the git control files can't be found.
git-stripspace
	Remove unnecessary whitespace.  Used mostly on commit messages
	received by e-mail.
git-symbolic-ref
	This queries or creates symlinks to references such as HEAD.
	Basically equivalent to readlink(1) or ln -s, this also works
	on platforms that don't have symlinks.  See the man page.
git-var
	Provide access to the GIT_AUTHOR_IDENT and/or GIT_COMMITTER_IDENT
	values, used in various commit scripts.  This currently has a
	lot of overlap with the function of git-repo-config.
git-write-tree
	Generate a tree object reflecting the current index.  The output
	is the tree object; if you don't remember it somewhere (usually,
	pass it to git-commit-tree), it'll be lost.

	This requires that the index be fully merged.  If any incomplete
	merges are present in the index (files in stages 1, 2 or 3),
	git-write-tree will fail.

+ Oddballs
git-mv.perl
	I have to admit, I'm not quite sure what advantages this is
	supposed to have over plain "mv" followed by "git-update-index",
	or why it's complex enough to need perl.

	Basically, this renames a file, deleting its old name and adding
	its new name to the index.  Otherwise, it's a two-step process
	to rename a file:
	- Rename the file
	- git-add the new name
	after which you must commit both the old and new names.

+ Code browsing
git-diff.sh
	Show changes between various trees.  Takes up to two tree
	specifications, and shows the difference between the versions.
	Zero arguments: index vs. working directory (git-diff-files)
	One: tree vs. working directory (git-diff-index)
	One, --cached: tree vs. index (git-diff-index)
	Two: tree vs. tree (git-diff-tree)

	This wrapper always produces human-readable patch output.
	The helpers all produce "diff-raw" format unless you supply
	the -p option.

	There are some interesting options.  Unfortunately, the git-diff
	man page is annoyingly sparse, and refers to the helper scripts'
	documentation rather than describing the many useful options
	they all have in common.  Please do read the man pages of the
	helpers to see what's available.

	In particular, although git does not explicitly record file
	renames, it has some pretty good heuristics to notice things.
	-M tries to detect renamed files by matching up deleted files
	with similar newly created files.  -C tries to detect copies
	as well.  By default, -C only looks among the modified files for
	the copy source.  For common cases like splitting a file in two,
	this works well.  The --find-copies-harder searches ALL files
	in the tree for the copy source.  This can be slow on large trees!

	See Documentation/diffcore.txt for an explanation of how all
	this works.
  git-diff-files
	Compare the index and the working directory.
  git-diff-index
	Compare the working directory and a given tree.  This is the
	git equivalent of the single-operand form of "cvs diff".
	If "--cached" is specified, uses the index rather than the working
	directory.
  git-diff-tree
	Compare two trees.  This is the git equivalent of the two-operand
	form of "cvs diff".  This command is sometimes useful by itself
	to see the changes made by a single commit.  If you give it
	only one commit on the command line, it shows the diff between
	that commit and its first parent.  If the commit specification
	is long and awkward to type, using "git-diff-tree -p <commit>"
	can be easier than "git-diff <commit>^ <commit>".
  git-diff-stages
	Although not called by git-diff, there is a fourth diff helper
	routine, used to compare the various versions of an unmerged
	file in the index.  It is intended for use by merging porcelain.
git-grep.sh
	A very simple wrapper that runs git-ls-files and greps the
	listed files for a pattern.  Does nothing fancy except
	save typing.
git-log.sh
	Wrapper around git-rev-list --pretty.  Shows a history
	of changes made to the repository.  Takes all of git-rev-list's
	options for specifying which revisions to list.
git-name-rev
	Find a symbolic name for the commit specified on the
	command line, and print it in a form like
	"maint~404^2~7".  Basically, this does a breadth-first search
	from all the heads in .git/refs looking for the given commit.
git-shortlog.perl
	This is a filter for the output of "git-log --pretty=short"
	to generate a one-line-per-change "shortlog" as Linus likes.
git-show-branch
	Visually show the merge history of the references given as
	arguments.  Prints one column per reference and one line per
	commit showing whether that commit is an ancestor of each
	reference.
git-whatchanged.sh
	A simple wrapper around git-rev-list and git-diff-tree,
	this shows the change history of a repository.  Specify a
	directory or file on the command line to limit the
	output to changes affecting those files.  This isn't
	the same as "cvs annotate", but it serves a similar purpose
	among git folks.

	You can add the -p option to include patches as well as log
	comments.  You can also add the -M or -C option to follow
	history back through file renames.

	-S is interesting: it's the "pickaxe" option.  Given a string,
	this limits the output to changes that make that string appear
	or disappear.  This is for "digging through history" to see when
	a piece of code was introduced.  The string may (and often does)
	contain embedded newlines.  See Documentation/cvs-migration.txt.
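
The idea behind the -S "pickaxe" option can be sketched in Python.
This is a deliberately simplified model of the description above (a
change matches when the string is present on exactly one side); the
function names and the list-of-versions input are my own invention.

```python
def pickaxe(old_text, new_text, needle):
    """True if this change makes 'needle' appear or disappear."""
    return (needle in old_text) != (needle in new_text)

def pickaxe_log(versions, needle):
    """versions: list of (commit_name, file_text), oldest first.

    Returns the names of the commits whose change matches.
    """
    hits = []
    previous = ""          # before the first commit the file is empty
    for name, text in versions:
        if pickaxe(previous, text, needle):
            hits.append(name)
        previous = text
    return hits

VERSIONS = [
    ("c1", "int a;\n"),
    ("c2", "int a;\nint b;\n"),
    ("c3", "int a;\n"),
]
```

With this toy history, searching for "int b;" reports c2 (where it was
introduced) and c3 (where it was removed).  The needle can span lines,
e.g. "a;\nint b".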

+ Making local changes
	All of these are examples of "porcelain" scripts.  Reading the
	scripts themselves can be informative; they're generally not
	too confusing.
git-add.sh
	A simple wrapper around "git-ls-files | git-update-index --add"
	to add new files to the index.   You may specify directories.
	You need to invoke this for every new file you want git to
	track.
git-bisect.sh
	Utility to do a binary search to find the change that broke something.
	The heart of this is in "git-rev-list --bisect".
	A very handy little utility!  Kernel developers love it
	when you tell them exactly which patch broke something.
	NOTE: this uses the head named "bisect", and will blow
	away any existing branch by that name.  Try not to
	create a branch with that name.

	There are three steps:
	git-bisect start [<files>]
		- Reset to start bisecting.  If any files are specified,
		  only they will be checked out as bisection proceeds.
	git-bisect good [<revision>]
		- Record the revision as "good".  The change being sought
		  must be after this revision.
	git-bisect bad [<revision>]
		- Record the revision as "bad".  The change being sought
		  must be before or equal to this revision.
	As soon as you have specified one good version and one bad version,
	git-bisect will find a halfway point and check out that
	revision.  Build and test it, then report it as good or bad,
	and git-bisect will narrow the search.  Finally, git-bisect
	will tell you exactly which change caused the problem.
	git-bisect log
		- Show a history of revisions.
	git-bisect replay
		- Replay (part of) a git-bisect log.  Generally used
		  to recover from a mistake, you can truncate the log
		  before the mistake and replay it to continue.
	If git-bisect chooses a version that cannot build, or you
	are otherwise unable to determine whether it is good or bad,
	you can change revisions with "git-reset --hard <revision>"
	to another checkout between the current good and bad limits, and
	continue from there.  "git-reset --hard <revision>" is generally
	dangerous, but you are on a scratch branch.

	This can, of course, be used to look for any change, even
	one for the better, if you can avoid being confused by the
	terms "good" and "bad".
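
	The narrowing process can be sketched as a plain binary search.
	This is only a sketch under a big assumption: history is treated
	as a linear, oldest-first list, whereas git-rev-list --bisect has
	to cope with merges.  The names below are invented for the example.

```python
def bisect(revisions, is_bad):
    """Find the first bad revision in a linear, oldest-first list.

    Assumes revisions[0] is known good, revisions[-1] is known bad, and
    every revision after the first bad one is also bad.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2        # the "halfway point" to check out
        if is_bad(revisions[mid]):
            bad = mid                  # the change is at or before mid
        else:
            good = mid                 # the change is after mid
    return revisions[bad]


revs = ["r%d" % i for i in range(8)]   # r0 (good) .. r7 (bad)
tested = []

def is_bad(rev):
    tested.append(rev)                 # stands in for "build and test it"
    return int(rev[1:]) >= 5           # pretend r5 broke something

print(bisect(revs, is_bad), tested)
```

	With eight revisions, only three builds are needed to pin down
	the culprit, which is the whole attraction.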
git-branch.sh
	Most commonly used bare, to show the available branches.
	Show, create, or delete a branch.  The current branches
	are simply the contents of .git/refs/heads/.
	Note that this does NOT switch to the created branch!
	For the common case of creating a branch and immediately
	switching to it, "git-checkout -b <branch>" is simpler.
git-checkout.sh
	This does two superficially similar but very different things
	depending on whether any files or paths are specified on the
	command line.

	git-checkout [-f] [-b <new-branch>] <branch>
		This switches (changes the HEAD symlink to) the specified
		branch, updating the index and working directory to
		reflect the change.  This preserves changes in the
		working directory unless -f is specified.
		If -b is specified, a new branch is started from the
		specified point and switched to.  If <branch> is omitted,
		it defaults to HEAD.  This is the usual way to start a
		new branch.

	git-checkout [<branch>] [--] <paths>...
		This replaces the files specified by the given paths with
		the versions from the index or the specified branch.
		It does NOT affect the HEAD symlink, just replaces the
		specified paths.  This form is like a selective form of
		"git-reset".  Normally, this can guess whether the first
		argument is a branch name or a path, but you can use
		"--" to force the latter interpretation.

	With no branch, this is used to revert a botched edit of a
	particular file.

	Both forms use git-read-tree internally, but the net effect is
	quite different.
git-commit.sh
	Commit changes to the revision history.  In terms of primitives,
	this does three things:
	1) Updates the index file with the working directory files
	   specified on the command line, or -a for all
	   (using git-diff-files --name-only | git-update-index),
	2) Prompts for or generates a commit message, and then
	3) Creates a commit object with the current index contents.

	This also executes the pre-commit, commit-msg, and
	post-commit hooks if present.

	This will remove deleted files from the index, but will not
	add new files to the index, even if explicitly specified on the
	command line; you must use git-add for that.
git-reset.sh
	Explained in detail in "resetting", above.  This modifies the
	current branch head reference (as pointed to by .git/HEAD)
	to refer to the given commit, so that future checkins will be
	relative to it.  It does not modify .git/HEAD itself.
	There are three variations:
	--soft: Just move the HEAD link.  The index is unchanged.
	--mixed (default): Move the HEAD link and update the index file.
		Any local changes will appear not checked in.
	--hard: Move the HEAD link, update the index file, and
		check out the index, overwriting the working
		directory.  Like "cvs update -C".
	In case of accidents, this copies the previous head
	object ID to ORIG_HEAD (which is NOT a symlink).
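
	The difference between the three variations may be easier to see
	in a toy model.  Everything here (the dict layout, the names) is
	hypothetical and has nothing to do with git's real on-disk format;
	it only mirrors the description above.

```python
def reset(repo, commit, mode="mixed"):
    """Toy model of the three reset modes (hypothetical data layout)."""
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError(mode)
    branch = repo["HEAD"]                        # HEAD itself is untouched
    repo["ORIG_HEAD"] = repo["heads"][branch]    # saved in case of accidents
    repo["heads"][branch] = commit               # every mode moves the head
    if mode in ("mixed", "hard"):
        repo["index"] = dict(repo["trees"][commit])
    if mode == "hard":
        repo["workdir"] = dict(repo["index"])    # overwrite working files
    return repo


def make_repo():
    return {
        "HEAD": "master",                        # names the current branch
        "heads": {"master": "c2"},
        "trees": {"c1": {"f": "old"}, "c2": {"f": "new"}},
        "index": {"f": "new"},
        "workdir": {"f": "edited"},
    }


for mode in ("soft", "mixed", "hard"):
    r = reset(make_repo(), "c1", mode)
    print(mode, r["heads"]["master"], r["index"]["f"], r["workdir"]["f"])
```

	Only --hard touches the working directory, which is why it is
	the dangerous one.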
git-status.sh
	Show all files in the directory not current with respect
	to the git HEAD.  The basic categories are:
	1) Changed in the index, will be included in the next commit.
	2) Changed in the working directory but NOT in the index; will
	   be committed only if added via git-update-index or the
	   git-commit command line.
	3) Not tracked by git.
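
	The three categories boil down to two comparisons: HEAD against
	the index, and the index against the working directory.  A small
	sketch, with each of the three represented as a hypothetical
	{path: contents} dict rather than anything git actually stores:

```python
def status(head, index, workdir):
    """Classify paths into the three categories (toy model)."""
    # 1) HEAD and index disagree: will be included in the next commit.
    staged = sorted(p for p in set(head) | set(index)
                    if head.get(p) != index.get(p))
    # 2) Index and working directory disagree: not yet in the index.
    unstaged = sorted(p for p in index if workdir.get(p) != index[p])
    # 3) In the working directory but unknown to the index.
    untracked = sorted(p for p in workdir if p not in index)
    return staged, unstaged, untracked


head = {"a": "1", "b": "1"}
index = {"a": "2", "b": "1"}
workdir = {"a": "2", "b": "3", "c": "x"}
print(status(head, index, workdir))
```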

+ Cherry-picking
	Cherry-picking is the process of taking part of the changes
	introduced on one tree and applying those changes to another.
	This doesn't produce a parent/descendant relationship in the
	commit history.

	To produce that relationship, there's a special type of merge you
	can do if you've taken everything you want off a branch and
	want to show it in the merge history without actually importing
	any changes from it: ours.  "git-merge -s ours" will generate a
	commit that shows some branches were merged in, but not
	actually alter the current HEAD source code in any way.

	One thing cherry-picking is sometimes used for is taking a
	development branch and re-organizing the changes into a patch
	series for submission to the Linux kernel.
git-cherry.sh
	This searches a branch for patches which have not been applied
	to another.  Basically, it finds the unpicked cherries.
	It searches back to the common ancestor of the named branch and
	the current head using git-patch-id to identify similarity
	in patches.
  git-patch-id
	Generate a hash of a patch, ignoring whitespace and line numbers
	so that "the same" patch, even relative to a different code base,
	probably has the same hash, and different patches probably have
	different ones.  git-cherry looks for patch hashes which 
	are present on the branch (source branch) that are not present
	on the trunk (destination branch).
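
	To make the idea concrete, here is a simplified stand-in for the
	patch hash and for the "unpicked cherries" search.  It is NOT
	git-patch-id's actual algorithm; it just drops hunk headers
	(the line numbers) and all whitespace before hashing, which is
	an assumption made for this sketch.

```python
import hashlib
import re


def patch_id(patch_text):
    """Hash a unified diff, ignoring whitespace and hunk line numbers."""
    h = hashlib.sha1()
    for line in patch_text.splitlines():
        if line.startswith("@@"):
            continue                          # hunk header: line numbers only
        h.update(re.sub(r"\s+", "", line).encode())
    return h.hexdigest()


def unpicked(branch_patches, trunk_patches):
    """Return branch patches whose hash does not occur on the trunk."""
    seen = {patch_id(p) for p in trunk_patches}
    return [p for p in branch_patches if patch_id(p) not in seen]


a = "@@ -1,2 +1,2 @@\n-foo  bar\n+foo baz\n"
b = "@@ -10,2 +12,2 @@\n-foo bar\n+foo   baz\n"    # same change, moved
c = "@@ -1,1 +1,1 @@\n-x\n+y\n"
print(patch_id(a) == patch_id(b), unpicked([a, c], [b]) == [c])
```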
git-cherry-pick.sh
	Given a commit (on a different branch), compute a diff between
	it and its immediate parent, and apply it to the current HEAD.  
	This is actually the same script as "git revert", but
	works forward.  git-cherry finds the patches, this merges them.
	Handles failures gracefully.
git-rebase.sh
	Move a branch to a more recent "base" release.  This just extracts
	all the patches applied on the local head since the last merge
	with upstream (using git-format-patch) and re-applies them
	relative to the current upstream with git-am (explained under
	"accepting changes by e-mail").  Finally, it deletes the old
	branch and gives its name to the new one, so your branch now
	contains all the same changes, but relative to a different base.
	Basically the same as cherry-picking an entire branch.
git-revert.sh
	Undo a commit.  Basically "patch -R" followed by a commit.
	This is actually the same script as "git-cherry-pick", just
	applies the patch in reverse, undoing a change that you don't
	wish to back up to using git-reset.  Handles failures gracefully
	by telling the user what to do.

+ Accepting changes by e-mail
git-apply
	Apply a (git-style extended) patch to the current index
	and working directory.
git-am.sh
	The new and improved "apply an mbox" script.  Takes an
	mbox-style concatenation of e-mails as input and batch-applies
	them, generating one commit per message.  Can resume after
	stopping on a patch problem.
	(Invoke it as "git-am --skip" or "git-am --resolved" to
	deal with the problematic patch and continue.)
  git-mailinfo
	Given a single mail message on stdin (in the Linux standard
	SubmittingPatches format), extract a commit message and
	the patch proper.
  git-mailsplit
	Split an mbox into separate files.
  git-applypatch.sh
	Tries simple git-apply, then tries a few other clever merge
	strategies to get a patch to apply.  Used in the main loop
	of git-am and git-applymbox.
git-applymbox.sh
	This is Linus's original apply-mbox script.  Mostly superseded by
	git-am (which is friendlier and has more features), but he still
	uses it, so it's maintained.  This is so old it was originally
	a test of the git core called "dotest", and that name is still
	lurking in the temp file names.
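
	For a feel of what git-mailinfo has to do, here is a rough sketch
	using Python's standard email module.  It assumes the simplest
	case only: a non-multipart message whose body separates the commit
	message from the patch with a "---" line.  The real tool handles
	far more; the function name and sample mail are invented.

```python
import email


def mailinfo(raw_message):
    """Split one mail into (title, commit message, patch).  Sketch only."""
    msg = email.message_from_string(raw_message)
    subject = msg["Subject"] or ""
    if subject.startswith("["):
        # Drop a leading "[PATCH ...]" tag, if there is one.
        subject = subject.split("]", 1)[-1].strip()
    body = msg.get_payload()                 # a str for non-multipart mail
    message, _, patch = body.partition("\n---\n")
    return subject, message.strip(), patch


raw = (
    "From: A U Thor <author@example.com>\n"
    "Subject: [PATCH] Fix typo\n"
    "\n"
    "Correct a spelling error.\n"
    "\n"
    "Signed-off-by: A U Thor <author@example.com>\n"
    "---\n"
    " README | 2 +-\n"
    "\n"
    "diff --git a/README b/README\n"
)
title, message, patch = mailinfo(raw)
print(title)
```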

+ Publishing changes by e-mail
git-format-patch.sh
	Generate a series of patches, in the preferred Linux kernel
	(Documentation/SubmittingPatches) format, for posting to lkml
	or the like.  This formats every commit on a branch as a separate
	patch.
git-send-email.perl
	Actually e-mail the output of git-format-patch.
	(This uses direct SMTP, a matter of some controversy.  Others feel
	that /bin/mail is the correct local mail-sending interface.)

+ Merging
git-merge.sh
	Merge one or more "remote" heads into the current head.
	Some changes, when there has been change only on one branch or
	the same change has been made to all branches, can be resolved
	by the "trivial in-index" merge done by git-read-tree.  For more
	complex cases, git provides a number of different merge strategies
	(with reasonable defaults).

	Note that merges are done on a filename basis.  While git tries
	to detect renames when generating diffs, most merge strategies
	don't track them by renaming.  (The "recursive" strategy, which
	recently became the default, is a notable exception.)
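
	The "trivial" cases mentioned above can be written down as a
	per-file rule over three versions: the common ancestor, ours, and
	theirs.  This is a sketch of that rule only, not of git-read-tree
	itself; None stands for "file absent", and the names are made up.

```python
def trivial_merge(base, ours, theirs):
    """Resolve one path from its contents in three trees (None = absent).

    Returns (resolved, contents).  resolved is False when both sides
    changed the path differently and a real merge strategy is needed.
    """
    if ours == theirs:
        return True, ours        # same change (or no change) on both sides
    if base == ours:
        return True, theirs      # only the other branch changed it
    if base == theirs:
        return True, ours        # only our branch changed it
    return False, None           # genuine conflict


print(trivial_merge("v1", "v1", "v2"))   # change on one branch only
print(trivial_merge("v1", "v2", "v3"))   # both changed it differently
```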
  git-merge-base
	Finds a common ancestor to use when comparing the changes made
	on two branches.  The simple case is straightforward, but if
	there have been cross-merges between the branches, it gets
	somewhat hairy.  The algorithm is not 100% final yet.
	(There's also --all, which lists all candidates.)
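
	A brute-force sketch of the idea, in the spirit of --all: collect
	the commits reachable from both heads, then drop any candidate
	that is itself an ancestor of another candidate.  This is not the
	algorithm git uses; the history is a hypothetical dict mapping
	each commit to its list of parents.

```python
def ancestors(parents, commit):
    """Return `commit` plus everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_bases(parents, a, b):
    """List the common ancestors that no other common ancestor descends from."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return sorted(c for c in common
                  if not any(c != d and c in ancestors(parents, d)
                             for d in common))


simple = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
crossed = {"A": [], "X": ["A"], "Y": ["A"], "P": ["X", "Y"], "Q": ["Y", "X"]}
print(merge_bases(simple, "C", "D"), merge_bases(crossed, "P", "Q"))
```

	The second history is the "cross-merge" case: there are two
	equally good candidates, which is where things get hairy.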
  git-merge-index
	This is the outer merging loop.  It takes the name of a one-file
	merge executable as an argument, and runs it for every incomplete
	merge.
    git-merge-one-file.sh
	This is the standard git-merge-index helper, that tries to
	resolve a 3-way merge.  A helper used by all the merge strategies.
	(Except "recursive" which has its own equivalent.)
  git-merge-octopus.sh
	Many-way merge.  Overlaps should be negligible.
  git-merge-ours.sh
	A "dummy" merge strategy helper.  Claims that we did the merge, but
	actually takes the current tree unmodified.  This is used to
	cleanly terminate side branches that have been cherry-picked in.
  git-merge-recursive.py
	A somewhat fancier 3-way merge.   This handles multiple cross-merges
	better by using multiple common ancestors.
  git-merge-resolve.sh
  git-merge-stupid.sh
	Not actually used by git-merge, this is a simple example
	merge strategy.
git-read-tree
	Read the given tree into the index.  This is the difference
	between the "--soft" and "--mixed" modes of git-reset, but the
	important thing this command does is simple merging.
	If -m is specified, this can take up to three trees as arguments.
git-resolve.sh
	OBSOLETE.  Perform a merge using the "resolve" strategy.
	Has been superseded by the "-s resolve" option to git-merge
	and git-pull.
git-octopus.sh
	OBSOLETE.   Perform a merge using the "octopus" strategy.
	Has been superseded by the "-s octopus" option to git-merge
	and git-pull.

+ Making releases
git-get-tar-commit-id
	Reads out the commit ID that git-tar-tree puts in its output.
	(Or fails if this isn't a git-generated tar file.)
git-tag.sh
	Create a tag in the refs/tags directory.  There are two kinds:
	"lightweight tags" are just references to commits.  More
	serious tags are GPG-signed tag objects, and people receiving
	the git tree can verify that it is the version that you released.
  git-mktag
	Creates a tag object.  Verifies syntactic correctness of its
	input.  (If you want to cheat, use git-hash-object.)
git-tar-tree
	Generate a tar archive of the named tree.  Because git does NOT
	track file timestamps, this uses the timestamp of the commit,
	or the current time if you specify a tree.
	Also stores the commit ID in an extended tar header.
git-verify-tag.sh
	Given a tag object, GPG-verify the embedded signature.

+ Accepting changes by network
	Pulling consists of two steps: retrieving the remote commit
	objects and everything they point to (including ancestors),
	then merging that into the desired tree.  There are still
	separate fetch and merge commands, but it's more commonly done
	with a single "git-pull" command.  git-fetch leaves the commit
	objects, one per line, in .git/FETCH_HEAD.  git-merge will
	merge those in if that file exists when it is run.

	References to remote repositories can be made with long URLs,
	or with files in the .git/remotes/ directory.  The latter
	also specifies the local branches to merge the fetched data into,
	making it very easy to track a remote repository.
git-clone.sh
	Create a new local clone of a remote repository.
	(Can do a couple of space-sharing hacks when "remote" is on
	a local machine.)

	You only do this once.
  git-clone-pack
	Runs git-upload-pack remotely and places the resultant pack
	into the local repository.  Supports a variety of network
	protocols, but "remote" can also be a different directory on
	the current machine.
git-fetch.sh
	Fetch the named refs and all linked objects from a remote repository.
	The resultant refs (tags and commits) are stored in .git/FETCH_HEAD,
	which is used by a later git-resolve or git-octopus.

	This is the first half of a "git pull" operation.
  git-fetch-pack
	Retrieve missing objects from a remote repository.
  git-local-fetch
	Duplicates a git repository from the local system.
	(Er... is this used anywhere???)
  git-http-fetch
	Do a fetch via http.  Http requires some kludgery on the
	server (see git-update-server-info), but it works.
  git-ssh-fetch
	Do a fetch via ssh.
git-ls-remote.sh
	Show the contents of the refs/heads/ and/or refs/tags/ directories
	of a remote repository.  Useful to see what's available.
  git-peek-remote
	Helper C program for the git-ls-remote script.  Implements the
	git protocol form of it.
git-parse-remote.sh
	Helper script to parse a .git/remotes/ file.  Used by a number
	of these programs.
git-pull.sh
	Fetches specific commits from the given remote repository,
	and merges everything into the current branch.  If a remote
	commit is named as src:dst, this merges the remote head "src"
	into the branch "dst" as well as the trunk.  Typically, the "dst"
	branch is not modified locally, but is kept as a pristine copy
	of the remote branch.

	One very standard example of this convention is that
	a repository that is tracking another specifies "master:origin"
	to provide a pristine local copy of the remote "master"
	branch in the local branch named "origin".
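
	The src:dst notation can be illustrated with a tiny model.  The
	dicts of branch heads and both function names are hypothetical;
	the sketch only shows which side of the colon refers to what.

```python
def parse_refspec(spec):
    """Split "src:dst" into (src, dst); dst is None when omitted."""
    src, sep, dst = spec.partition(":")
    return src, (dst if sep else None)


def fetch_into(local_heads, remote_heads, spec):
    """Toy fetch: look up the remote head `src`, record it under `dst`."""
    src, dst = parse_refspec(spec)
    commit = remote_heads[src]
    if dst is not None:
        local_heads[dst] = commit   # pristine local copy of the remote branch
    return commit                   # what would then be merged


local = {"master": "c1"}
remote = {"master": "c9"}
to_merge = fetch_into(local, remote, "master:origin")
print(to_merge, local)
```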
  git-ssh-pull
	A helper program that pulls over ssh.
git-shell
	A shell that can be used for git-only users.  Allows git
	push (git-receive-pack) and pull (git-upload-pack) only.
  git-receive-pack
	Receive a pack from git-send-pack, validate it, and add it to
	the repository.  Adding just the bare objects has no security
	implications, but this can also update branches and tags, which
	does have an effect.
	Runs pre-update and post-update hooks; the former may do
	permissions checking and disallow the upload.
	This is the command run remotely via ssh by git-push.

+ Publishing changes by network
git-daemon
	A daemon that serves up the git native protocol so anonymous
	clients can fetch data.  For it to allow export of a directory,
	the magic file name "git-daemon-export-ok" must exist in it.

	This does not accept (receive) data under any circumstances.
git-push.sh
	Git-pull, only backwards.  Send local changes to a remote
	repository.  The same .git/remotes/ short-cuts can be used,
	and the same src:dst syntax.  (But this time, the src is local
	and the dst is remote.)
  git-http-push
	A helper to git-push to implement the http: protocol.
  git-ssh-push
	A helper to git-push to push over ssh.
  git-ssh-upload
	Another helper.  This just does the equivalent of "fetch"
	("throw"?) and doesn't actually merge the result.  Obsolete?
git-request-pull.sh
	Generate an e-mail summarizing the changes between two commits,
	and request that the recipient pull them from your repository.
	Just a little helper to generate a consistent and informative
	format.
git-send-pack
	Pack together a pile of objects missing at the destination and
	send them.  This is the sending half that talks to a remote
	git-receive-pack.
git-update-server-info
	To run git over http, auxiliary info files are required that
	describe what objects are in the repository (since git-upload-pack
	can't generate this on the fly).  If you want to publish a
	repository via http, run this after every commit.  (Typically
	via the hooks/post-update script.)
git-upload-pack
	Like git-send-pack, but this is invoked by a remote git-fetch-pack.

* Re: as promised, docs: git for the confused
  2005-12-08  6:34           ` as promised, docs: git for the confused linux
@ 2005-12-08 21:53             ` Junio C Hamano
  2005-12-08 22:02               ` H. Peter Anvin
  2005-12-09  0:47             ` Alan Chandler
  2005-12-09  1:19             ` Josef Weidendorfer
  2 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2005-12-08 21:53 UTC
  To: linux; +Cc: git

linux@horizon.com writes:

> * Terminology - heads, branches, refs, and revisions
>
> (This is a supplement to what's already in "man git".)
>
> The most common object needed by git primitives is a tree.  Since a
> commit points to a tree and a tag points to a commit, both of these are
> acceptable "tree-ish" objects and can be used interchangeably.  Likewise,
> a tag is "commit-ish" and can be used where a commit is required.

I am unsure if we want to further confuse readers by saying
this, but technically, "Likewise, a tag which is commit-ish can
be used in place of commit".  Not all tags are necessarily
commit-ish.  v2.6.11 tag is tree-ish but not commit-ish for
example.  Typically, however, a tag is commit-ish.
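
The distinction can be sketched with a toy object table in Python.
The table layout and names are invented for illustration; the only
thing taken from the discussion above is that a tag may point at a
commit (the usual case) or directly at a tree.

```python
def peel_to_tree(objects, name):
    """Follow tag -> commit -> tree links until a tree is reached."""
    kind, target = objects[name]
    while kind != "tree":
        name = target
        kind, target = objects[name]
    return name


def is_commit_ish(objects, name):
    """True if `name` is a commit, or a tag that leads to a commit."""
    kind, target = objects[name]
    while kind == "tag":
        name = target
        kind, target = objects[name]
    return kind == "commit"


# Hypothetical object table: name -> (type, what it points to).
objects = {
    "tree1": ("tree", None),
    "commit1": ("commit", "tree1"),
    "tag-to-commit": ("tag", "commit1"),
    "tag-to-tree": ("tag", "tree1"),
}
for name in ("commit1", "tag-to-commit", "tag-to-tree"):
    print(name, peel_to_tree(objects, name), is_commit_ish(objects, name))
```

All three names are tree-ish, but only the first two are commit-ish.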

* Re: as promised, docs: git for the confused
  2005-12-08 21:53             ` Junio C Hamano
@ 2005-12-08 22:02               ` H. Peter Anvin
  0 siblings, 0 replies; 64+ messages in thread
From: H. Peter Anvin @ 2005-12-08 22:02 UTC
  To: Junio C Hamano; +Cc: linux, git

Junio C Hamano wrote:
> linux@horizon.com writes:
> 
> 
>>* Terminology - heads, branches, refs, and revisions
>>
>>(This is a supplement to what's already in "man git".)
>>
>>The most common object needed by git primitives is a tree.  Since a
>>commit points to a tree and a tag points to a commit, both of these are
>>acceptable "tree-ish" objects and can be used interchangeably.  Likewise,
>>a tag is "commit-ish" and can be used where a commit is required.
> 
> 
> I am unsure if we want to further confuse readers by saying
> this, but technically, "Likewise, a tag which is commit-ish can
> be used in place of commit".  Not all tags are necessarily
> commit-ish.  v2.6.11 tag is tree-ish but not commit-ish for
> example.  Typically, however, a tag is commit-ish.
> 

Saying they can be used interchangeably is just plain wrong, however.
It's not a bijective relation.

Something like:

 >> The most common object needed by git primitives is a tree.  Since
 >> commits and tags uniquely identify a tree, a commit or tag can
 >> be used anywhere a tree is expected.

 >> Likewise, most tags point to commits and can be used anywhere a
 >> commit is expected.

... might be better, and avoids the colloquialisms.

	-hpa

* Re: as promised, docs: git for the confused
  2005-12-08  6:34           ` as promised, docs: git for the confused linux
  2005-12-08 21:53             ` Junio C Hamano
@ 2005-12-09  0:47             ` Alan Chandler
  2005-12-09  1:45               ` Petr Baudis
  2005-12-09  1:19             ` Josef Weidendorfer
  2 siblings, 1 reply; 64+ messages in thread
From: Alan Chandler @ 2005-12-09  0:47 UTC
  To: git

On Thursday 08 Dec 2005 06:34, linux@horizon.com wrote:
> As I mentioned with all my questions, I was writing up the answers
> I got.  Here's the current status.  If anyone would like to comment on
> its accuracy or usefulness, feedback is appreciated.
...
> * Background material.
>
> To start with, read "man git".  Or Documentation/git.txt in the git
> source tree, which is the same thing.  Particularly note the description
> of the index, which is where all the action in git happens.
>
> One thing that's confusing is why git allows you to have one version of
> a file in the current HEAD, a second version in the index, and possibly a
> third in the working directory.  Why doesn't the index just contain a copy
> of the current HEAD until you commit a new one?  The answer is merging,
> which does all its work in the index.  Neither the object database nor
> the working directory let you have multiple files with the same name.


If I was a complete newbie, I would be lost right here.  You start referring to 
the term HEAD without any introduction to what it means and (as far as I 
could see on a quick glance - which is what a newbie would do - man git 
doesn't start out here either).

If your audience really is a complete newcomer, then as a minimum I think 
you need to describe the concept of a "branch of development" with a series 
of snapshots of the state, the current of which is called HEAD.  You might 
even at this stage hint about there being several such branches.  The next 
bit, which goes on about the index is great - just put it into context with a 
simple explanation first.
-- 
Alan Chandler
http://www.chandlerfamily.org.uk
Open Source. It's the difference between trust and antitrust.

* Re: as promised, docs: git for the confused
  2005-12-08  6:34           ` as promised, docs: git for the confused linux
  2005-12-08 21:53             ` Junio C Hamano
  2005-12-09  0:47             ` Alan Chandler
@ 2005-12-09  1:19             ` Josef Weidendorfer
  2 siblings, 0 replies; 64+ messages in thread
From: Josef Weidendorfer @ 2005-12-09  1:19 UTC
  To: linux; +Cc: git

On Thursday 08 December 2005 07:34, you wrote:
> As I mentioned with all my questions, I was writing up the answers
> I got.  Here's the current status.  If anyone would like to comment on
> its accuracy or usefulness, feedback is appreciated.
> ...

> + Oddballs
> git-mv.perl
> 	I have to admit, I'm not quite sure what advantages this is
> 	supposed to have over plain "mv" followed by "git-update-index",
> 	or why it's complex enough to need perl.
> 
> 	Basically, this renames a file, deleting its old name and adding
> 	its new name to the index.  Otherwise, it's a two-step process
> 	to rename a file:
> 	- Rename the file
> 	- git-add the new name
> 	Followed by which you must commit both the old and new names

The nice thing about it is that you can move huge directories around,
or multiple files/dirs at once, and it will do the right thing. E.g.
	git-mv -k foo* bar/
will only move files which are version controlled.

It is actually a 3-step process: rename, delete old, add new.
Perhaps it should be noted that this has nothing to do with any
explicit renaming feature like in other SCMs.
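
As a sketch of those three steps, and of what -k changes, here is a
toy model in Python.  The index and working directory are plain
{path: contents} dicts and the function is hypothetical; it is not
how git-mv.perl is written.

```python
def mv(index, workdir, sources, dest_dir, skip_untracked=False):
    """Move files into dest_dir, updating a toy index.  Sketch only."""
    moved = []
    for old in sources:
        if old not in index:
            if skip_untracked:
                continue                  # like -k: leave untracked files
            raise KeyError(f"{old} is not under version control")
        new = dest_dir.rstrip("/") + "/" + old.rsplit("/", 1)[-1]
        workdir[new] = workdir.pop(old)   # 1. rename the file
        del index[old]                    # 2. delete the old name
        index[new] = workdir[new]         # 3. add the new name
        moved.append((old, new))
    return moved


index = {"foo1": "a"}
workdir = {"foo1": "a", "foo2": "b"}      # foo2 is not tracked
moved = mv(index, workdir, ["foo1", "foo2"], "bar/", skip_untracked=True)
print(moved, index, workdir)
```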

Josef

* Re: as promised, docs: git for the confused
  2005-12-09  0:47             ` Alan Chandler
@ 2005-12-09  1:45               ` Petr Baudis
  0 siblings, 0 replies; 64+ messages in thread
From: Petr Baudis @ 2005-12-09  1:45 UTC
  To: Alan Chandler; +Cc: git

Dear diary, on Fri, Dec 09, 2005 at 01:47:56AM CET, I got a letter
where Alan Chandler <alan@chandlerfamily.org.uk> said that...
> On Thursday 08 Dec 2005 06:34, linux@horizon.com wrote:
> > As I mentioned with all my questions, I was writing up the answers
> > I got.  Here's the current status.  If anyone would like to comment on
> > its accuracy or usefulness, feedback is appreciated.
> ...
> > * Background material.
> >
> > To start with, read "man git".  Or Documentation/git.txt in the git
> > source tree, which is the same thing.  Particularly note the description
> > of the index, which is where all the action in git happens.
> >
> > One thing that's confusing is why git allows you to have one version of
> > a file in the current HEAD, a second version in the index, and possibly a
> > third in the working directory.  Why doesn't the index just contain a copy
> > of the current HEAD until you commit a new one?  The answer is merging,
> > which does all its work in the index.  Neither the object database nor
> > the working directory let you have multiple files with the same name.
> 
> 
> If I was a complete newbie, I would be lost right here.  You start referring to 
> the term HEAD without any introduction to what it means and (as far as I 
> could see on a quick glance - which is what a newbie would do - man git 
> doesn't start out here either).

I think that the first paragraph of the background material means
"insert Documentation/git.txt here", the second one is then "now what
might've been unclear there".

That said, the "git for the confused" contains a lot of nice points, but
I don't think it's a good approach to just have extra document for
clarifying this stuff. It would be much better if the stock
documentation itself would not be confusing in the first place. Same
goes for the "commands overview" (BOUND to get out-of-date over time
since it's detached from the normal per-command documentation; we have
troubles huge enough to keep usage strings in sync, let alone the
manpages).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

Thread overview: 64+ messages
2005-11-28 23:42 git-name-rev off-by-one bug linux
2005-11-29  5:54 ` Junio C Hamano
2005-11-29  8:05   ` linux
2005-11-29  9:29     ` Junio C Hamano
2005-11-30  8:37       ` Junio C Hamano
2005-11-29 10:31     ` Petr Baudis
2005-11-29 18:46       ` Junio C Hamano
2005-12-04 21:34         ` Petr Baudis
2005-12-08  6:34           ` as promised, docs: git for the confused linux
2005-12-08 21:53             ` Junio C Hamano
2005-12-08 22:02               ` H. Peter Anvin
2005-12-09  0:47             ` Alan Chandler
2005-12-09  1:45               ` Petr Baudis
2005-12-09  1:19             ` Josef Weidendorfer
2005-11-29 21:40       ` git-name-rev off-by-one bug linux
2005-11-29 23:14         ` Junio C Hamano
2005-11-30  0:15           ` linux
2005-11-30  0:53             ` Junio C Hamano
2005-11-30  1:27               ` Junio C Hamano
2005-11-30  1:51             ` Linus Torvalds
2005-11-30  2:06               ` Junio C Hamano
2005-11-30  2:33               ` Junio C Hamano
2005-11-30  3:12                 ` Linus Torvalds
2005-11-30  5:06                   ` Linus Torvalds
2005-11-30  5:51                     ` Junio C Hamano
2005-11-30  6:11                       ` Junio C Hamano
2005-11-30 16:13                         ` Linus Torvalds
2005-11-30 16:08                       ` Linus Torvalds
2005-12-02  8:25                       ` Junio C Hamano
2005-12-02  9:14                         ` [PATCH] merge-one-file: make sure we create the merged file Junio C Hamano
2005-12-02  9:15                         ` [PATCH] merge-one-file: make sure we do not mismerge symbolic links Junio C Hamano
2005-12-02  9:16                         ` [PATCH] git-merge documentation: conflicting merge leaves higher stages in index Junio C Hamano
2005-11-30  6:09                     ` git-name-rev off-by-one bug linux
2005-11-30  6:39                       ` Junio C Hamano
2005-11-30 13:10                         ` More merge questions linux
2005-11-30 18:37                           ` Daniel Barkalow
2005-11-30 20:23                           ` Junio C Hamano
2005-12-02  9:19                             ` More merge questions (why doesn't this work?) linux
2005-12-02 10:12                               ` Junio C Hamano
2005-12-02 13:09                                 ` Sven Verdoolaege
2005-12-02 20:32                                   ` Junio C Hamano
2005-12-05 15:01                                     ` Sven Verdoolaege
2005-12-02 11:37                               ` linux
2005-12-02 20:31                                 ` Junio C Hamano
2005-12-02 21:32                                   ` linux
2005-12-02 22:00                                     ` Junio C Hamano
2005-12-02 22:12                                     ` Linus Torvalds
2005-12-02 23:14                                       ` linux
2005-12-02 21:56                                   ` More merge questions linux
2005-11-30 16:12                       ` git-name-rev off-by-one bug Linus Torvalds
2005-11-30  7:18                   ` Junio C Hamano
2005-11-30  9:05                     ` Junio C Hamano
2005-11-30  9:42                     ` Junio C Hamano
2005-11-30  3:15                 ` linux
2005-11-30 18:11               ` Daniel Barkalow
2005-11-30 17:46   ` Daniel Barkalow
2005-11-30 20:05     ` Junio C Hamano
2005-11-30 21:06       ` Daniel Barkalow
2005-11-30 22:00         ` Junio C Hamano
2005-11-30 23:12           ` Daniel Barkalow
2005-12-01  7:46             ` Junio C Hamano
2005-12-01 10:14 ` Junio C Hamano
2005-12-01 21:50   ` Petr Baudis
2005-12-01 21:53     ` Randal L. Schwartz
