git@vger.kernel.org mailing list mirror (one of many)
* Git 1.0 Synopis (Draft v2)
@ 2005-07-27 10:01 Ryan Anderson
  2005-07-27 22:13 ` Junio C Hamano
  2005-07-29  8:29 ` Git 1.0 Synopis (Draft v3 Ryan Anderson
  0 siblings, 2 replies; 1752+ messages in thread
From: Ryan Anderson @ 2005-07-27 10:01 UTC (permalink / raw)
  To: git

[This is still a draft, but I think I incorporated the suggestions from
the last attempt.]

Source Code Management with Git

Git, sometimes called "global information tracker", is a "directory
content manager".  Git has been designed to handle absolutely massive
projects with speed and efficiency, and the release of the 2.6.12 and
(soon) the 2.6.13 version of the Linux kernel would indicate that it
does this task well.

Git falls into the category of distributed source code management tools,
similar to Arch or Darcs (or, in the commercial world, BitKeeper).  Every Git
working directory is a full-fledged repository with full revision tracking
capabilities, not dependent on network access to a central server.

Git uses the SHA1 hash algorithm to provide a content-addressable pseudo
filesystem, complete with its own version of fsck.
  o Speed of use, both for the project maintainer, and the end-users, is
    a key development principle.
  o The history is stored as a directed acyclic graph, making long-lived
    branches and repeated merging simple.
  o A collection of related projects is building on the core Git project,
    either to provide an easier-to-use interface on top (StGit, Cogito, qgit,
    gitk, gitweb), or to take some of the underlying concepts and reimplement
    them directly in another system (Arch 2.0, Darcs-git).
  o Two, interchangeable, on-disk formats are used:
    o An efficient, packed format that saves spaced and network
      bandwidth.
    o An unpacked format, optimized for fast writes and incremental
      work.
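
As a rough illustration of the content-addressable idea (a sketch, not code
taken from Git itself): an object's name is the SHA1 of a short header plus
the object's contents.  The snippet assumes the "blob <size>\0" header layout
that Git uses for file contents; the function name blob_id is made up for
this example.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA1 object name for a blob holding `content`.

    Assumption: the hashed data is the header b"blob <size>\\0" followed
    by the raw bytes, as Git does for file contents.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The same bytes always map to the same name, whatever the file is called.
print(blob_id(b"hello world\n"))
print(blob_id(b""))
```

Because the name depends only on the contents, two identical files end up
as a single object, and a corrupted object no longer matches its own name,
which is the kind of inconsistency an fsck can look for.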

To get a copy of Git:
	Daily snapshots are available at:
	http://www.codemonkey.org.uk/projects/git-snapshots/git/
	(Thanks to Dave Jones)

	Source tarballs and RPMs at:
	http://www.kernel.org/pub/software/scm/git/

	Deb packages at:
	<insert url here>

	Or via Git itself:
	git clone http://www.kernel.org/pub/scm/git/git.git/
	git clone rsync://rsync.kernel.org/pub/scm/git/git.git/
	(rsync is generally faster for an initial pull)

Git distributions contain a tutorial in the Documentation subdirectory.
Additionally, the Kernel-Hacker's Git Tutorial at
http://linux.yyz.us/git-howto.html may be useful.  (Thanks to Jeff Garzik for
that document)

Git development takes place on the Git mailing list.  To subscribe, send an
email with just "subscribe git" in the body to majordomo@vger.kernel.org.
Mailing list archives are available at http://marc.theaimsgroup.com/?l=git

Git results from the inspiration and frustration of Linus Torvalds, and
the enthusiastic help of over 300 participants on the development
mailing list.[1]  It is maintained by Junio C Hamano <junkio@cox.net>.

1 - Generated with the following, in a maildir folder:
        find . -type f | xargs grep -h "^From:" | perl -ne \
        'tr#A-Z#a-z#; m#<(.*)># && print $1,"\n";' | sort -u | wc -l
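
For readers who do not speak Perl, the same count can be sketched in Python.
This is only a restatement of the pipeline above under the same assumptions
(only "From:" lines that carry an address in angle brackets are counted,
compared case-insensitively); the function name count_senders is
hypothetical.

```python
import re


def count_senders(lines):
    """Count distinct lowercased <address> values on lines starting with From:."""
    seen = set()
    for line in lines:
        if not line.startswith("From:"):
            continue
        # Greedy match between the first "<" and the last ">", like m#<(.*)>#.
        match = re.search(r"<(.*)>", line.lower())
        if match:
            seen.add(match.group(1))
    return len(seen)


sample = [
    "From: Alice Example <Alice@Example.org>",
    "From: alice <alice@example.org>",
    "Subject: not a sender line",
    "From: Bob <bob@example.net>",
]
print(count_senders(sample))
```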

(This summary written by Ryan Anderson <ryan@michonline.com>.  Please bug him
with any corrections or complaints.)




-- 

Ryan Anderson
  sometimes Pug Majere

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Git 1.0 Synopis (Draft v2)
  2005-07-27 10:01 Git 1.0 Synopis (Draft v2) Ryan Anderson
@ 2005-07-27 22:13 ` Junio C Hamano
  2005-07-29  8:27   ` Ryan Anderson
  2005-07-29  8:29 ` Git 1.0 Synopis (Draft v3 Ryan Anderson
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2005-07-27 22:13 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: git

Ryan Anderson <ryan@michonline.com> writes:

> Source Code Management with Git

Thanks for doing this.  Generally looks excellent.

>   o Two, interchangeable, on-disk formats are used:
>     o An efficient, packed format that saves spaced and network
>       bandwidth.

??? "spaced" ???

> 	Or via Git itself:
> 	git clone http://www.kernel.org/pub/scm/git/git.git/
> 	git clone rsync://rsync.kernel.org/pub/scm/git/git.git/
> 	(rsync is generally faster for an initial pull)

These need a target directory name to create, like this:

    git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ $new_dir
    git clone http://www.kernel.org/pub/scm/git/git.git/ $new_dir

> Git results from the inspiration and frustration of Linus Torvalds, and
> the enthusiastic help of over 300 participants on the development
> mailing list.[1]  It is maintained by Junio C Hamano <junkio@cox.net>.

Please drop the e-mail address here; you mention nobody else's.

Well, dropping "the current maintainer" information altogether
might be even better; the above to a casual reader sounds like
Linus was frustrated and I wrote it for him, which is definitely
not what we would like to say.  I suspect it still has more code
by Linus than anybody else (I stopped counting some time ago).


* Re: Git 1.0 Synopis (Draft v2)
  2005-07-27 22:13 ` Junio C Hamano
@ 2005-07-29  8:27   ` Ryan Anderson
  0 siblings, 0 replies; 1752+ messages in thread
From: Ryan Anderson @ 2005-07-29  8:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jul 27, 2005 at 03:13:18PM -0700, Junio C Hamano wrote:
> > Git results from the inspiration and frustration of Linus Torvalds, and
> > the enthusiastic help of over 300 participants on the development
> > mailing list.[1]  It is maintained by Junio C Hamano <junkio@cox.net>.
> 
> Please drop the e-mail address here; you mention nobody else's.
> 
> Well, dropping "the current maintainer" information altogether
> might be even better; the above to a casual reader sounds like
> Linus was frustrated and I wrote it for him, which is definitely
> not what we would like to say.  I suspect it still has more code
> by Linus than anybody else (I stopped counting some time ago).

Ok.  I was thinking I could add "current" into that description.  Or,
something like, "Linus has since returned his focus to the kernel, and
passed maintainership to ...".

-- 

Ryan Anderson
  sometimes Pug Majere


* Git 1.0 Synopis (Draft v3
  2005-07-27 10:01 Git 1.0 Synopis (Draft v2) Ryan Anderson
  2005-07-27 22:13 ` Junio C Hamano
@ 2005-07-29  8:29 ` Ryan Anderson
  2005-07-29 10:58   ` Johannes Schindelin
                     ` (2 more replies)
  1 sibling, 3 replies; 1752+ messages in thread
From: Ryan Anderson @ 2005-07-29  8:29 UTC (permalink / raw)
  To: git


Source Code Management with Git

"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command.  The fact that it is a
   mispronunciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room. 
 - "goddamn idiotic truckload of sh*t": when it breaks

Git is a "directory content manager".  Git has been designed to handle
absolutely massive projects with speed and efficiency, and the release of the
2.6.12 and (soon) the 2.6.13 version of the Linux kernel would indicate that it
does this task well.

Git falls into the category of distributed source code management tools,
similar to Arch or Darcs (or, in the commercial world, BitKeeper).  Every Git
working directory is a full-fledged repository with full revision tracking
capabilities, not dependent on network access to a central server.

Git provides a content-addressable pseudo filesystem, complete with its own
version of fsck.

  o Speed of use, both for the project maintainer, and the end-users, is
    a key development principle.

  o The history is stored as a directed acyclic graph, making long-lived
    branches and repeated merging simple.

  o The core Git project considers itself to provide "plumbing" for other
    projects, as well as to serve to arbitrate for compatibility between them.
    The projects built on top of the core Git are referred to as "porcelain".
    StGIT, Cogito, qgit, gitk and gitweb all build upon the core Git
    tools and provide an easy-to-use interface to various pieces of
    functionality.

  o Some other projects have taken the concepts from the core Git project, and
    are either porting an existing toolset to use the Git tools, or
    reimplementing the concepts internally, to benefit from the performance
    improvements.  This includes both Arch 2.0 and Darcs-git.
  
  o Two, interchangeable, on-disk formats are used:
    o An efficient, packed format that saves space and network
      bandwidth.
    o An unpacked format, optimized for fast writes and incremental
      work.
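
To illustrate why storing history as a directed acyclic graph helps with
repeated merging, here is a small sketch.  The commit names and both helper
functions are hypothetical, and this is not Git's actual merge-base code; it
only shows that once a merge is recorded in the graph, the next merge between
the same two lines of development starts from a more recent common ancestor.

```python
def ancestors(parents, commit):
    """Return `commit` plus every commit reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents, a, b):
    """Return the common ancestors of a and b that no other common ancestor descends from."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


# Each commit maps to its list of parents; a merge commit has two.
PARENTS = {
    "root": [],
    "a1": ["root"],
    "b1": ["root"],
    "m": ["a1", "b1"],   # line A merges line B for the first time
    "a2": ["m"],
    "b2": ["b1"],        # line B keeps going after the merge
}

print(merge_bases(PARENTS, "a1", "b1"))  # before the first merge
print(merge_bases(PARENTS, "a2", "b2"))  # before the second merge
```

In this toy history the first merge starts from "root", while the second one
starts from "b1", so only the work done on line B since the previous merge
has to be considered.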

To get a copy of Git:
	Daily snapshots are available at:
	http://www.codemonkey.org.uk/projects/git-snapshots/git/
	(Thanks to Dave Jones)

	Source tarballs and RPMs at:
	http://www.kernel.org/pub/software/scm/git/

	Deb packages at:
	<insert url here>

	Or via Git itself:
	git clone http://www.kernel.org/pub/scm/git/git.git/ <local directory>
	git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ <local directory>

	(rsync is generally faster for an initial clone; you can switch later
	by editing .git/branches/origin and changing the URL)

To get the 'Porcelain' tools mentioned above:
	SCM Interface layers:
	cogito - http://www.kernel.org/pub/software/scm/cogito/
	StGIT - http://www.procode.org/stgit/

	History Visualization:
	gitk - http://ozlabs.org/~paulus/gitk/
	gitweb - http://www.kernel.org/pub/software/scm/gitweb/
	qgit - http://sourceforge.net/projects/qgit


Git distributions contain a tutorial in the Documentation subdirectory.
Additionally, the Kernel-Hacker's Git Tutorial at
http://linux.yyz.us/git-howto.html may be useful.  (Thanks to Jeff Garzik for
that document)

Git development takes place on the Git mailing list.  To subscribe, send an
email with just "subscribe git" in the body to majordomo@vger.kernel.org.
Mailing list archives are available at http://marc.theaimsgroup.com/?l=git

(This summary written by Ryan Anderson <ryan@michonline.com>.  Please bug him
with any corrections or complaints.)


* Re: Git 1.0 Synopis (Draft v3
  2005-07-29  8:29 ` Git 1.0 Synopis (Draft v3 Ryan Anderson
@ 2005-07-29 10:58   ` Johannes Schindelin
  2005-07-29 21:26   ` Sam Ravnborg
  2005-07-31 22:15   ` Horst von Brand
  2 siblings, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2005-07-29 10:58 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: git

I like it!

Ciao,
Dscho


* Re: Git 1.0 Synopis (Draft v3
  2005-07-29  8:29 ` Git 1.0 Synopis (Draft v3 Ryan Anderson
  2005-07-29 10:58   ` Johannes Schindelin
@ 2005-07-29 21:26   ` Sam Ravnborg
  2005-07-31 22:18     ` Horst von Brand
  2005-07-31 22:15   ` Horst von Brand
  2 siblings, 1 reply; 1752+ messages in thread
From: Sam Ravnborg @ 2005-07-29 21:26 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: git

On Fri, Jul 29, 2005 at 04:29:41AM -0400, Ryan Anderson wrote:
> Source Code Management with Git
....

The article should include a HOWTO part also. So people can see how to
edit a file, pull from a remote repository etc.
Since you have introduced core and porcelains it would be most logical
to use one of the porcelains in these examples, maybe accompanied by the
raw git commands being executed.

	Sam


* Re: Git 1.0 Synopis (Draft v3
  2005-07-29  8:29 ` Git 1.0 Synopis (Draft v3 Ryan Anderson
  2005-07-29 10:58   ` Johannes Schindelin
  2005-07-29 21:26   ` Sam Ravnborg
@ 2005-07-31 22:15   ` Horst von Brand
  2005-08-01 13:21     ` Horst von Brand
  2005-08-15  4:55     ` Git 1.0 Synopis (Draft v4) Ryan Anderson
  2 siblings, 2 replies; 1752+ messages in thread
From: Horst von Brand @ 2005-07-31 22:15 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: git

Ryan Anderson <ryan@michonline.com> wrote:
> Source Code Management with Git

More bugging...

- Either stay with your idea of "Git is the idea, git the implementation"
  (iff blessed by the Git Powers That Be) and be consistent about it, or
  just use "git" throughout.

- Attribute the meaning appropriately, say by:

In Linus' own words as the inventor of git:

> "git" can mean anything, depending on your mood.
> 
>  - random three-letter combination that is pronounceable, and not
>    actually used by any common UNIX command.  The fact that it is a
>    mispronunciation of "get" may or may not be relevant.
>  - stupid. contemptible and despicable. simple. Take your pick from the
>    dictionary of slang.
>  - "global information tracker": you're in a good mood, and it actually
>    works for you. Angels sing, and a light suddenly fills the room. 
>  - "goddamn idiotic truckload of sh*t": when it breaks
[...]

> To get a copy of Git:
> 	Daily snapshots are available at:
> 	http://www.codemonkey.org.uk/projects/git-snapshots/git/
> 	(Thanks to Dave Jones)
> 
> 	Source tarballs and RPMs at:
> 	http://www.kernel.org/pub/software/scm/git/
> 
> 	Deb packages at:
> 	<insert url here>
> 
> 	Or via Git itself:
> 	git clone http://www.kernel.org/pub/scm/git/git.git/ <local directory>
> 	git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ <local directory>
> 
> 	(rsync is generally faster for an initial clone, you can switch later
> 	by editing .git/branches/origin and changing the url)
> 
> To get the 'Porcelain' tools mentioned above:
> 	SCM Interface layers:
> 	cogito - http://www.kernel.org/pub/software/scm/cogito/
> 	StGIT - http://www.procode.org/stgit/

At least cogito includes a (slightly old) version of git. Dunno about
StGIT. And git and cogito have a gitk inside too. This should be mentioned,
i.e., look at the package(s) you are interested in and see what else they
carry or require, and keep in mind that (for now?) a git obtained as part of
one package is /not/ guaranteed to be compatible with another or with standard
git.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: Git 1.0 Synopis (Draft v3
  2005-07-29 21:26   ` Sam Ravnborg
@ 2005-07-31 22:18     ` Horst von Brand
  0 siblings, 0 replies; 1752+ messages in thread
From: Horst von Brand @ 2005-07-31 22:18 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Ryan Anderson, git

Sam Ravnborg <sam@ravnborg.org> wrote:
> On Fri, Jul 29, 2005 at 04:29:41AM -0400, Ryan Anderson wrote:
> > Source Code Management with Git
> ....

> The article should include a HOWTO part alos.

I'd vote for a separate file.

>                                               So people can see how to
> edit a file, pull from a remote repository etc.

Exactly.

> Since you have introduced core and porcelains it would be most logical
> to use one of the porcelains in these examples, maybe accompanied by the
> raw git commands being executed.

Better leave the Porcelain-HOWTO to individual Porcelain. Perhaps the
Plumbing-HOWTO should include a section on interfacing to Porcelain (or it
should be yet another file).
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: Git 1.0 Synopis (Draft v3
  2005-07-31 22:15   ` Horst von Brand
@ 2005-08-01 13:21     ` Horst von Brand
  2005-08-15  4:55     ` Git 1.0 Synopis (Draft v4) Ryan Anderson
  1 sibling, 0 replies; 1752+ messages in thread
From: Horst von Brand @ 2005-08-01 13:21 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Ryan Anderson, git

[Yes, I know it is considered odd when you speak to yourself in public...]

Horst von Brand <vonbrand@inf.utfsm.cl> wrote:
> Ryan Anderson <ryan@michonline.com> wrote:
> > Source Code Management with Git

> More bugging...

And then some.

> > To get the 'Porcelain' tools mentioned above:
> > 	SCM Interface layers:
> > 	cogito - http://www.kernel.org/pub/software/scm/cogito/
> > 	StGIT - http://www.procode.org/stgit/
> 
> At least cogito includes a (slightly old) version of git. Dunno about
> StGIT. And git and cogito have a gitk inside too. This should be mentioned,
> i.e., look at the package(s) you are interested and see what else they
> carry or require and keep in mind that (for now?) getting git as part of
> one package is /not/ guaranteed to be compatible with another or standard
> git.

Also note that StGIT is /not/ a SCM (as cogito is), it is a tool to shuffle
patches that uses git as a backend/target.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Git 1.0 Synopis (Draft v4)
  2005-07-31 22:15   ` Horst von Brand
  2005-08-01 13:21     ` Horst von Brand
@ 2005-08-15  4:55     ` Ryan Anderson
  2005-08-15  5:09       ` Ryan Anderson
  2005-08-15  5:19       ` Junio C Hamano
  1 sibling, 2 replies; 1752+ messages in thread
From: Ryan Anderson @ 2005-08-15  4:55 UTC (permalink / raw)
  To: Horst von Brand; +Cc: git, Junio C Hamano

On Sun, Jul 31, 2005 at 06:15:40PM -0400, Horst von Brand wrote:
> Ryan Anderson <ryan@michonline.com> wrote:
> > Source Code Management with Git
> 
> More bugging...

Ok, I think I've got all this addressed (plus the other email).

It just took me a lot longer to get to it than I planned.

Junio, do you want to pull this into the git tree?  (I'll reply with a
patch)

==========

Source Code Management with git

In Linus's own words as the creator of git:
"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command.  The fact that it is a
   mispronunciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room. 
 - "goddamn idiotic truckload of sh*t": when it breaks

git is a "directory content manager".  git has been designed to handle
absolutely massive projects with speed and efficiency, and the release of the
2.6.12 and (soon) the 2.6.13 version of the Linux kernel would indicate that it
does this task well.

git falls into the category of distributed source code management tools,
similar to Arch or Darcs (or, in the commercial world, BitKeeper).  Every git
working directory is a full-fledged repository with full revision tracking
capabilities, not dependent on network access to a central server.

git provides a content-addressable pseudo filesystem, complete with its own
version of fsck.

  o Speed of use, both for the project maintainer, and the end-users, is
    a key development principle.

  o The history is stored as a directed acyclic graph, making long-lived
    branches and repeated merging simple.

  o The core git project considers itself to provide "plumbing" for other
    projects, as well as to serve to arbitrate for compatibility between them.
    The projects built on top of the core git are referred to as "porcelain".
    StGIT, Cogito, qgit, gitk and gitweb all build upon the core git
    tools and provide an easy-to-use interface to various pieces of
    functionality.

  o Some other projects have taken the concepts from the core git project, and
    are either porting an existing toolset to use the git tools, or
    reimplementing the concepts internally, to benefit from the performance
    improvements.  This includes both Arch 2.0 and Darcs-git.
  
  o Two, interchangeable, on-disk formats are used:
    o An efficient, packed format that saves space and network
      bandwidth.
    o An unpacked format, optimized for fast writes and incremental
      work.
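
As a rough sketch of what the unpacked format looks like on disk: each object
lives in its own zlib-compressed file, named after its SHA1.  The layout
assumed here (a "blob <size>\0" header, and a path made of the first two hex
digits as a directory plus the remaining 38 as the file name) matches how git
lays out .git/objects; the function names are hypothetical and the packed
format is not covered.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_blob(objects_dir, content):
    """Store `content` as one compressed file and return its object name."""
    data = b"blob %d\x00" % len(content) + content
    name = hashlib.sha1(data).hexdigest()
    path = os.path.join(objects_dir, name[:2], name[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(zlib.compress(data))
    return name


def read_loose_blob(objects_dir, name):
    """Read an object back and check its header against its contents."""
    path = os.path.join(objects_dir, name[:2], name[2:])
    with open(path, "rb") as handle:
        data = zlib.decompress(handle.read())
    header, _, content = data.partition(b"\x00")
    kind, size = header.split()
    if kind != b"blob" or int(size) != len(content):
        raise ValueError("corrupt object " + name)
    return content


# Demo in a throwaway directory standing in for .git/objects.
with tempfile.TemporaryDirectory() as objects_dir:
    object_name = write_loose_blob(objects_dir, b"hello world\n")
    print(object_name, read_loose_blob(objects_dir, object_name))
```

Writing a new object touches only one new file, which is what makes this
form cheap for incremental work; the cost is one file per object, which the
packed form avoids.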

To get a copy of git:
	Daily snapshots are available at:
	http://www.codemonkey.org.uk/projects/git-snapshots/git/
	(Thanks to Dave Jones)

	Source tarballs and RPMs at:
	http://www.kernel.org/pub/software/scm/git/

	Debian packages should be available in unstable (sid) as "git-core"

	Or via git itself:
	git clone http://www.kernel.org/pub/scm/git/git.git/ <local directory>
	git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ <local directory>

	(rsync is generally faster for an initial clone; you can switch later
	by editing .git/branches/origin and changing the URL)
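
The switch described above can also be scripted.  The following is a minimal
sketch that assumes .git/branches/origin holds nothing but the URL on a
single line; the helper name set_origin_url is hypothetical, and the demo
runs against a throwaway directory rather than a real clone.

```python
from pathlib import Path
import tempfile


def set_origin_url(repo_dir, new_url):
    """Replace the URL stored in .git/branches/origin and return the old one."""
    origin_file = Path(repo_dir) / ".git" / "branches" / "origin"
    old_url = origin_file.read_text().strip()
    origin_file.write_text(new_url + "\n")
    return old_url


# Demo: fake the directory layout, then switch from rsync to http.
with tempfile.TemporaryDirectory() as repo:
    branches = Path(repo) / ".git" / "branches"
    branches.mkdir(parents=True)
    (branches / "origin").write_text(
        "rsync://rsync.kernel.org/pub/scm/git/git.git/\n"
    )
    previous = set_origin_url(repo, "http://www.kernel.org/pub/scm/git/git.git/")
    print("was:", previous)
    print("now:", (branches / "origin").read_text().strip())
```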

To get the 'Porcelain' tools mentioned above:
	SCM Interface layers:
	cogito - http://www.kernel.org/pub/software/scm/cogito/

	Patch Management (similar to Quilt):
	StGIT - http://www.procode.org/stgit/

	History Visualization:
	gitk - http://ozlabs.org/~paulus/gitk/ (Included in the standard git
		distribution)
	gitweb - http://www.kernel.org/pub/software/scm/gitweb/
	qgit - http://sourceforge.net/projects/qgit


git distributions contain a tutorial in the Documentation subdirectory.
Additionally, the Kernel-Hacker's git Tutorial at
http://linux.yyz.us/git-howto.html may be useful.  (Thanks to Jeff Garzik for
that document)

git development takes place on the git mailing list.  To subscribe, send an
email with just "subscribe git" in the body to majordomo@vger.kernel.org.
Mailing list archives are available at http://marc.theaimsgroup.com/?l=git

(This summary written by Ryan Anderson <ryan@michonline.com>.  Please bug him
with any corrections or complaints.)


-- 

Ryan Anderson
  sometimes Pug Majere


* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15  4:55     ` Git 1.0 Synopis (Draft v4) Ryan Anderson
@ 2005-08-15  5:09       ` Ryan Anderson
  2005-08-15  5:19       ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Ryan Anderson @ 2005-08-15  5:09 UTC (permalink / raw)
  To: Horst von Brand; +Cc: git, Junio C Hamano


Add a SYNOPSIS/release summary to the tree.

Signed-off-by: Ryan Anderson <ryan@michonline.com>

diff --git a/SYNOPSIS b/SYNOPSIS
new file mode 100644
--- /dev/null
+++ b/SYNOPSIS
@@ -0,0 +1,93 @@
+Source Code Management with git
+
+In Linus's own words as the creator of git:
+"git" can mean anything, depending on your mood.
+
+ - random three-letter combination that is pronounceable, and not
+   actually used by any common UNIX command.  The fact that it is a
+   mispronunciation of "get" may or may not be relevant.
+ - stupid. contemptible and despicable. simple. Take your pick from the
+   dictionary of slang.
+ - "global information tracker": you're in a good mood, and it actually
+   works for you. Angels sing, and a light suddenly fills the room. 
+ - "goddamn idiotic truckload of sh*t": when it breaks
+
+git is a "directory content manager".  git has been designed to handle
+absolutely massive projects with speed and efficiency, and the release of the
+2.6.12 and (soon) the 2.6.13 version of the Linux kernel would indicate that it
+does this task well.
+
+git falls into the category of distributed source code management tools,
+similar to Arch or Darcs (or, in the commercial world, BitKeeper).  Every git
+working directory is a full-fledged repository with full revision tracking
+capabilities, not dependent on network access to a central server.
+
+git provides a content-addressable pseudo filesystem, complete with its own
+version of fsck.
+
+  o Speed of use, both for the project maintainer, and the end-users, is
+    a key development principle.
+
+  o The history is stored as a directed acyclic graph, making long-lived
+    branches and repeated merging simple.
+
+  o The core git project considers itself to provide "plumbing" for other
+    projects, as well as to serve to arbitrate for compatibility between them.
+    The projects built on top of the core git are referred to as "porcelain".
+    StGIT, Cogito, qgit, gitk and gitweb all build upon the core git
+    tools and provide an easy-to-use interface to various pieces of
+    functionality.
+
+  o Some other projects have taken the concepts from the core git project, and
+    are either porting an existing toolset to use the git tools, or
+    reimplementing the concepts internally, to benefit from the performance
+    improvements.  This includes both Arch 2.0 and Darcs-git.
+  
+  o Two, interchangeable, on-disk formats are used:
+    o An efficient, packed format that saves space and network
+      bandwidth.
+    o An unpacked format, optimized for fast writes and incremental
+      work.
+
+To get a copy of git:
+	Daily snapshots are available at:
+	http://www.codemonkey.org.uk/projects/git-snapshots/git/
+	(Thanks to Dave Jones)
+
+	Source tarballs and RPMs at:
+	http://www.kernel.org/pub/software/scm/git/
+
+	Debian packages should be available in unstable (sid) as "git-core"
+
+	Or via git itself:
+	git clone http://www.kernel.org/pub/scm/git/git.git/ <local directory>
+	git clone rsync://rsync.kernel.org/pub/scm/git/git.git/ <local directory>
+
+	(rsync is generally faster for an initial clone; you can switch later
+	by editing .git/branches/origin and changing the URL)
+
+To get the 'Porcelain' tools mentioned above:
+	SCM Interface layers:
+	cogito - http://www.kernel.org/pub/software/scm/cogito/
+
+	Patch Management (similar to Quilt):
+	StGIT - http://www.procode.org/stgit/
+
+	History Visualization:
+	gitk - http://ozlabs.org/~paulus/gitk/ (Included in the standard git
+		distribution)
+	gitweb - http://www.kernel.org/pub/software/scm/gitweb/
+	qgit - http://sourceforge.net/projects/qgit
+
+
+git distributions contain a tutorial in the Documentation subdirectory.
+Additionally, the Kernel-Hacker's git Tutorial at
+http://linux.yyz.us/git-howto.html may be useful.  (Thanks to Jeff Garzik for
+that document)
+
+git development takes place on the git mailing list.  To subscribe, send an
+email with just "subscribe git" in the body to majordomo@vger.kernel.org.
+Mailing list archives are available at http://marc.theaimsgroup.com/?l=git
+
+(This summary written by Ryan Anderson <ryan@michonline.com>.  Please bug him
+with any corrections or complaints.)

-- 

Ryan Anderson
  sometimes Pug Majere


* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15  4:55     ` Git 1.0 Synopis (Draft v4) Ryan Anderson
  2005-08-15  5:09       ` Ryan Anderson
@ 2005-08-15  5:19       ` Junio C Hamano
  2005-08-15  6:58         ` Ryan Anderson
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2005-08-15  5:19 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: git

Ryan Anderson <ryan@michonline.com> writes:

> Junio, do you want to pull this into the git tree?

Yes, but I have been wondering where it should go.  Should it go
under Documentation/ and made into html via asciidoc along with
other tools?


* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15  5:19       ` Junio C Hamano
@ 2005-08-15  6:58         ` Ryan Anderson
  2005-08-15  7:17           ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Ryan Anderson @ 2005-08-15  6:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, Aug 14, 2005 at 10:19:18PM -0700, Junio C Hamano wrote:
> Ryan Anderson <ryan@michonline.com> writes:
> 
> > Junio, do you want to pull this into the git tree?
> 
> Yes, but I have been wondering where it should go.  Should it go
> under Documentation/ and made into html via asciidoc along with
> other tools?

I was somewhat thinking it should go in the main directory, and be a
useful introduction to the project for people.... but it's not really
aimed at that very well, now that I think about it.

To be fair, it's not really aimed well at being documentation for people
that already have git, either.  I've been writing it with the idea of
"something to send to LWN when 1.0 happens so they can post it mostly
verbatim."

I guess this means, "I dunno, either place works for me."

-- 

Ryan Anderson
  sometimes Pug Majere


* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15  6:58         ` Ryan Anderson
@ 2005-08-15  7:17           ` Junio C Hamano
  2005-08-15  8:02             ` Ryan Anderson
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2005-08-15  7:17 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: git

Ryan Anderson <ryan@michonline.com> writes:

> I guess this means, "I dunno, either place works for me."

I was hoping it means "Oh, come to think of it, maybe I
should send this to corbet@lwn.net" ;-).

I agree with you that this may be a lot more suitable for people
_before_ they get the git sources, which is to say it may make
more sense not to include it in the core-git tarball but to make it
into a patch to Pasky's introduction website.


* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15  7:17           ` Junio C Hamano
@ 2005-08-15  8:02             ` Ryan Anderson
  2005-08-15  8:17               ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Ryan Anderson @ 2005-08-15  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Aug 15, 2005 at 12:17:46AM -0700, Junio C Hamano wrote:
> Ryan Anderson <ryan@michonline.com> writes:
> 
> > I guess this means, "I dunno, either place works for me."
> 
> I was hoping it means to "Oh, come to think of it, maybe I
> should send this to corbet@lwn.net" ;-).

I was waiting until you said, "Ok, 1.00 tomorrow morning"

> I agree with you that this may be a lot more suitable for people
> _before_ they get the git sources, which is to say it may make
> more sense not to include in core-git tarball but is made into a
> patch to Pasky's introduction website.

Good point.

It's already there (now that I found the site.)

-- 

Ryan Anderson
  sometimes Pug Majere


* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15  8:02             ` Ryan Anderson
@ 2005-08-15  8:17               ` Junio C Hamano
  2005-08-15 18:59                 ` Daniel Barkalow
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2005-08-15  8:17 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: git

Ryan Anderson <ryan@michonline.com> writes:

> I was waiting until you said, "Ok, 1.00 tomorrow morning"

Makes sense.  There would be some weeks until that happens I am
afraid.


* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15  8:17               ` Junio C Hamano
@ 2005-08-15 18:59                 ` Daniel Barkalow
  2005-08-16  7:28                   ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Daniel Barkalow @ 2005-08-15 18:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ryan Anderson, git

On Mon, 15 Aug 2005, Junio C Hamano wrote:

> Ryan Anderson <ryan@michonline.com> writes:
> 
> > I was waiting until you said, "Ok, 1.00 tomorrow morning"
> 
> Makes sense.  There would be some weeks until that happens I am
> afraid.

It might be worth putting the list of things left to do before 1.0 in the 
tree (since they clearly covary), and it would be useful to know what 
you're thinking of as preventing the release at any particular stage.

	-Daniel
*This .sig left intentionally blank*

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-15 18:59                 ` Daniel Barkalow
@ 2005-08-16  7:28                   ` Junio C Hamano
  2005-08-16 10:03                     ` Johannes Schindelin
                                       ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Junio C Hamano @ 2005-08-16  7:28 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Ryan Anderson, git

Daniel Barkalow <barkalow@iabervon.org> writes:

> It might be worth putting the list of things left to do before 1.0 in the 
> tree (since they clearly covary), and it would be useful to know what 
> you're thinking of as preventing the release at any particular stage.

Yeah, yeah.  Call me lazy.

Excerpts from my "last mile to 1.0", my Itchlist, and pieces from
random other messages since then.

- Documentation. [I really need help here --- among ~7000 lines
  there, I've written around 2500 lines, David Greaves another
  2500, and Linus 1400.  And it is not very easy to proofread
  what you wrote yourself.]

  - Are all the core commands described in Documentation/
    directory?

  - Many files under Documentation/ directory have become stale.
    I've tried to do one pass of full sweep recently [and
    another since I wrote the original "last mile" message], but
    I'd like somebody else to make another pass to make sure
    that the usage strings in programs, what the programs do,
    and what Documentation says they do match.  Also, the
    spelling and grammar fixes, which I am very bad at and have
    not attempted, need to be done.

    Volunteers?

  - Are all the files in Documentation/ reachable from git(7)
    or otherwise made into a standalone document using asciidoc
    by the Makefile?  I haven't looked into documentation
    generation myself (I use only the text files as they are);
    help to update the Makefile by somebody handy with asciidoc
    suite is greatly appreciated here.

    Volunteers?

  - We may want to describe more Best Current Practices, along
    the lines of "Working with Others" section in the tutorial.
    Please write on your favorite topic and send patches in
    ;-)) [ryan started collecting Documentation/howto which
    would greatly help in this area].

  - Glossary documentation Johannes Schindelin is working on.

    I think coming up with the consensus of terms would come
    fairly quickly on the list.  Updating docs to match the
    consensus may take some time.  Help is greatly appreciated.

  - Maybe doing another pass at tutorial.  Could somebody run
    (or preferably, find a friend who has never touched git and
    have her run) the tutorial examples from the beginning to
    the end, and find room for improvement?  Does the order of
    materials presented make sense?  Do we talk about things
    assuming that the user knows something else that we have not
    talked about?  Have we introduced a better way of doing the
    same thing since the tutorial was written?

    I've done that once with the text that is currently in the
    head of the master branch, but that is getting rather stale,
    and also I did that myself so I am sure I've sidestepped
    pitfalls without even realizing.

The above does not have to be all there in 0.99.5, but I
consider that lack of any of the above to block 1.0.

- Commit walker downloading from packed repository is finally
  complete.  Thanks, Daniel!

- Teach fetch-pack reference renaming.

  On the push side, send-pack now knows updating arbitrary
  remote references from local references.  We need something
  similar for fetching [since then I outlined the design of the
  new shorthand file format and semantics but have not got
  around to actually doing it.  Maybe on my next GIT day...].  This
  is scheduled for 0.99.5.

- commit template filler discussed with Pasky some time ago,
  with perhaps pre-commit and post-commit hooks.  Somehow the
  discussion died out but that does not mean _I_ forgot about
  it.

- Binary packaging.  Should _I_ worry about "/usr/bin/git" staying
  there myself --- I think not.  But I _do_ want to help Debian
  packaging folks if that path is causing problems in their
  effort to push git-core into the official Debian archive.

  As Linus mentioned earlier, this seems to be a Debian specific
  problem, and will not block 1.0 --- if Debian heavyweights do
  not want to stay compatible with the rest of the world, so be
  it.

- I have not heard from Darwin or BSD people for some time.  Is
  your portfile up to date?  Do you have updates you want me to
  include?  Have we introduced non-Linux non-GNU
  incompatibilities lately that you want to see fixed and/or
  worked around?

  Again, I consider binary packaging issue independent from our
  release schedule; it is a distribution local issue, so this
  would not block 1.0 in any way.  But I _am_ willing to help
  them.

- Oh, another itch I did not list in the previous message.  Is
  anybody interested in doing an Emacs VC back-end for GIT?

- git prune and git fsck-cache; think about their interactions
  with an object database that borrows from another.  This
  includes the case where .git/objects itself is symlinked to
  somewhere else (i.e. running "git prune" that somewhere else
  without consulting this repository would lose objects), and
  alternates pointing at somewhere else (i.e. ditto).

  My personal feeling is that we should just warn users about
  doing .git/objects symlinking and/or alternates pointing ---
  do not do it unless you have an off-line arrangement with the
  owner of the repository you are borrowing from.  Even if that
  would become our official position to take, it needs to be
  documented clearly before we declare this issue to have been
  "dealt with".

I am sure I am forgetting something, but the above would be a
good start.

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16  7:28                   ` Junio C Hamano
@ 2005-08-16 10:03                     ` Johannes Schindelin
  2005-08-16 10:14                       ` Dongsheng Song
  2005-08-16 10:17                       ` about git server & permissions Dongsheng Song
  2005-08-16 15:31                     ` Git 1.0 Synopis (Draft v4) Johannes Schindelin
                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2005-08-16 10:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Daniel Barkalow, Ryan Anderson, git

Hi,

On Tue, 16 Aug 2005, Junio C Hamano wrote:

>   - Glossary documentation Johannes Schindelin is working on.

Yeah, yeah. Call _me_ lazy :-) I'll try to come up with a discussable item 
today.

> - git prune and git fsck-cache; think about their interactions
>   with an object database that borrows from another.  This
>   includes the case where .git/objects itself is symlinked to
>   somewhere else (i.e. running "git prune" that somewhere else
>   without consulting this repository would lose objects), and
>   alternates pointing at somewhere else (i.e. ditto).

I don't see how git could help in the case you are pruning a repository
which another repository points to. After all, the first repository
doesn't know about being used by the second.
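
A toy model of the problem, in Python (the object ids and the
reachable()/prune() helpers are made up for illustration; real
git-prune works on the object database, not on dicts).  Pruning the
lending repository using only its own refs drops an object that the
borrowing repository still needs:

```python
def reachable(objects, refs):
    """Return every object id reachable from `refs`.

    `objects` maps an object id to the ids it refers to.
    """
    seen, stack = set(), list(refs)
    while stack:
        oid = stack.pop()
        if oid in seen or oid not in objects:
            continue
        seen.add(oid)
        stack.extend(objects[oid])
    return seen


def prune(objects, refs):
    """Drop objects unreachable from `refs`, as a prune would."""
    keep = reachable(objects, refs)
    return {oid: deps for oid, deps in objects.items() if oid in keep}


# Hypothetical object store of repository A; repository B borrows from it.
lender_objects = {"c1": ["t1"], "t1": [], "c2": ["t2"], "t2": []}
lender_refs = ["c1"]      # A itself only references c1
borrower_refs = ["c2"]    # B still needs c2, which lives in A's store

after = prune(lender_objects, lender_refs)
missing = [oid for oid in borrower_refs if oid not in after]
print(missing)  # prints ['c2']
```

Nothing in A's own refs mentions c2, so a prune that consults only A
has no way of knowing that it must be kept.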

> I am sure I am forgetting something, but the above would be a
> good start.

Maybe your $GIT_DIR/remotes idea? Along with a "--store <remotename>" flag 
to git-pull-script?

Ciao,
Dscho

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16 10:03                     ` Johannes Schindelin
@ 2005-08-16 10:14                       ` Dongsheng Song
  2005-08-16 10:17                       ` about git server & permissions Dongsheng Song
  1 sibling, 0 replies; 1752+ messages in thread
From: Dongsheng Song @ 2005-08-16 10:14 UTC (permalink / raw)
  To: git

Hi,

Is there any guide or advice for deploying a git server?

How do I set repository permissions correctly?

cauchy

* about git server & permissions
  2005-08-16 10:03                     ` Johannes Schindelin
  2005-08-16 10:14                       ` Dongsheng Song
@ 2005-08-16 10:17                       ` Dongsheng Song
  1 sibling, 0 replies; 1752+ messages in thread
From: Dongsheng Song @ 2005-08-16 10:17 UTC (permalink / raw)
  To: git

Hi,

Is there any guide or advice for deploying a git server? Especially
an http/https/ssh server.

How do I set repository permissions correctly?

cauchy

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16  7:28                   ` Junio C Hamano
  2005-08-16 10:03                     ` Johannes Schindelin
@ 2005-08-16 15:31                     ` Johannes Schindelin
  2005-08-16 15:47                       ` Daniel Barkalow
  2005-08-16 15:39                     ` Daniel Barkalow
  2005-08-16 19:41                     ` Horst von Brand
  3 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2005-08-16 15:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Daniel Barkalow, Ryan Anderson, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1072 bytes --]

Hi,

On Tue, 16 Aug 2005, Junio C Hamano wrote:

>   - Are all the files in Documentation/ reachable from git(7)
>     or otherwise made into a standalone document using asciidoc
>     by the Makefile?  I haven't looked into documentation
>     generation myself (I use only the text files as they are);
>     help to update the Makefile by somebody handy with asciidoc
>     suite is greatly appreciated here.
> 
>     Volunteers?

The attached script reveals:

git-unpack-objects.txt is not reachable from git.txt
git-cvsimport-script.txt is not reachable from git.txt
git-send-email-script.txt is not reachable from git.txt
git-rename-script.txt is not reachable from git.txt
tutorial.txt is not reachable from git.txt
git-show-index.txt is not reachable from git.txt
cvs-migration.txt is not reachable from git.txt
diffcore.txt is not reachable from git.txt
git-ls-remote-script.txt is not reachable from git.txt
git-apply.txt is not reachable from git.txt
git-diff-stages.txt is not reachable from git.txt
pack-protocol.txt is not reachable from git.txt
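
The same kind of check can be sketched in Python.  This is not the
attached script; it assumes that the documents refer to each other
with asciidoc "link:NAME.html[...]" or "include::NAME.txt[]" syntax,
and unreachable() is a hypothetical helper name:

```python
import re
from collections import deque

# Assumed link syntax: asciidoc "link:NAME.html[...]" and "include::NAME.txt[]".
LINK_RE = re.compile(r"(?:link:|include::)([\w.-]+?)\.(?:html|txt)\[")


def unreachable(docs, root="git.txt"):
    """Return the names in `docs` that cannot be reached from `root`.

    `docs` maps a file name such as "git-apply.txt" to its text.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        name = queue.popleft()
        for target in LINK_RE.findall(docs.get(name, "")):
            target += ".txt"
            if target in docs and target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(docs) - seen)


# Made-up miniature Documentation/ directory for demonstration.
docs = {
    "git.txt": "link:git-commit-tree.html[git-commit-tree]\n",
    "git-commit-tree.txt": "See link:git-write-tree.html[git-write-tree].\n",
    "git-write-tree.txt": "Writes a tree object.\n",
    "git-apply.txt": "Applies a patch.\n",
    "tutorial.txt": "A short tutorial.\n",
}
print(unreachable(docs))  # prints ['git-apply.txt', 'tutorial.txt']
```

In a real tree the docs mapping would be filled by reading every
*.txt file under Documentation/.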

Ciao,
Dscho

[-- Attachment #2: Type: APPLICATION/x-perl, Size: 1215 bytes --]

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16  7:28                   ` Junio C Hamano
  2005-08-16 10:03                     ` Johannes Schindelin
  2005-08-16 15:31                     ` Git 1.0 Synopis (Draft v4) Johannes Schindelin
@ 2005-08-16 15:39                     ` Daniel Barkalow
  2005-08-16 19:41                     ` Horst von Brand
  3 siblings, 0 replies; 1752+ messages in thread
From: Daniel Barkalow @ 2005-08-16 15:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ryan Anderson, git

On Tue, 16 Aug 2005, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
>
> > It might be worth putting the list of things left to do before 1.0 in the
> > tree (since they clearly covary), and it would be useful to know what
> > you're thinking of as preventing the release at any particular stage.
>
> Yeah, yeah.  Call me lazy.
>
> Excerpts from my "last mile to 1.0", my Itchlist, and pieces from
> random other messages since then.
>
> - Documentation. [I really need help here --- among ~7000 lines
>   there, I've written around 2500 lines, David Greaves another
>   2500, and Linus 1400.  And it is not very easy to proofread
>   what you wrote yourself.]

I'm not sure how done this can actually get before some sort of feature
freeze; the best ways to do things keep changing as more convenient ways
are added. Once the new stuff is diverted to post-1.0, I'd be interested
in going through it.

> - git prune and git fsck-cache; think about their interactions
>   with an object database that borrows from another.  This
>   includes the case where .git/objects itself is symlinked to
>   somewhere else (i.e. running "git prune" that somewhere else
>   without consulting this repository would lose objects), and
>   alternates pointing at somewhere else (i.e. ditto).

It should be fine, but only if .git/refs is symlinked to the matching
place; this gives you the same repository with multiple working trees.
Having refs/ and objects/ directories that aren't always together would be
much less safe.

	-Daniel
*This .sig left intentionally blank*

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16 15:31                     ` Git 1.0 Synopis (Draft v4) Johannes Schindelin
@ 2005-08-16 15:47                       ` Daniel Barkalow
  0 siblings, 0 replies; 1752+ messages in thread
From: Daniel Barkalow @ 2005-08-16 15:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Ryan Anderson, git

On Tue, 16 Aug 2005, Johannes Schindelin wrote:

> Hi,
>
> On Tue, 16 Aug 2005, Junio C Hamano wrote:
>
> >   - Are all the files in Documentation/ reachable from git(7)
> >     or otherwise made into a standalone document using asciidoc
> >     by the Makefile?  I haven't looked into documentation
> >     generation myself (I use only the text files as they are);
> >     help to update the Makefile by somebody handy with asciidoc
> >     suite is greatly appreciated here.
> >
> >     Volunteers?
>
> The attached script reveals:
>
> git-unpack-objects.txt is not reachable from git.txt
> git-cvsimport-script.txt is not reachable from git.txt
> git-send-email-script.txt is not reachable from git.txt
> git-rename-script.txt is not reachable from git.txt
> tutorial.txt is not reachable from git.txt
> git-show-index.txt is not reachable from git.txt
> cvs-migration.txt is not reachable from git.txt
> diffcore.txt is not reachable from git.txt
> git-ls-remote-script.txt is not reachable from git.txt
> git-apply.txt is not reachable from git.txt
> git-diff-stages.txt is not reachable from git.txt
> pack-protocol.txt is not reachable from git.txt

The ones that don't start with git probably don't belong in the same set;
perhaps there should be a "technical" (or something similar but shorter)
subdirectory for developer documentation instead of user documentation?
(And tutorial and cvs-migration can move to howto)

	-Daniel
*This .sig left intentionally blank*

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16  7:28                   ` Junio C Hamano
                                       ` (2 preceding siblings ...)
  2005-08-16 15:39                     ` Daniel Barkalow
@ 2005-08-16 19:41                     ` Horst von Brand
  2005-08-16 20:41                       ` Johannes Schindelin
  2005-08-18  9:27                       ` Matthias Urlichs
  3 siblings, 2 replies; 1752+ messages in thread
From: Horst von Brand @ 2005-08-16 19:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Daniel Barkalow, Ryan Anderson, git

Junio C Hamano <junkio@cox.net> wrote:

[...]

> - Oh, another itch I did not list in the previous message.  Is
>   anybody interested in doing an Emacs VC back-end for GIT?

And teach make(1) about checking out files from git... or just create a
co(1) command for git.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16 19:41                     ` Horst von Brand
@ 2005-08-16 20:41                       ` Johannes Schindelin
  2005-08-18  9:27                       ` Matthias Urlichs
  1 sibling, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2005-08-16 20:41 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Junio C Hamano, Daniel Barkalow, Ryan Anderson, git

Hi,

On Tue, 16 Aug 2005, Horst von Brand wrote:

> And teach make(1) about checking out files from git... or just create a
> co(1) command for git.

How about "git-checkout-script", optionally with the "-f" flag to ignore 
changes since the last checkout/checkin?

Ciao,
Dscho

* Re: Git 1.0 Synopis (Draft v4)
  2005-08-16 19:41                     ` Horst von Brand
  2005-08-16 20:41                       ` Johannes Schindelin
@ 2005-08-18  9:27                       ` Matthias Urlichs
  1 sibling, 0 replies; 1752+ messages in thread
From: Matthias Urlichs @ 2005-08-18  9:27 UTC (permalink / raw)
  To: git

Hi, Horst von Brand wrote:

> And teach make(1) about checking out files from git... or just create a
> co(1) command for git.

Ummm... why?

make's SCCS support depends on the presence of a SCCS/s.<name> file
for each <name>. We don't have that. Teaching make about git would be
equivalent to teaching it about parsing the index file.

Technically, that would require a stable libgit.so or so.
In reality, however, I don't know when I last had a tree which wasn't
fully populated, but it's been a while, and it's something that can be
readily fixed by "git-checkout-cache -a".

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
One possible reason that things aren't going according to plan
is that there never was a plan in the first place.

* VCS comparison table
@ 2006-10-14 15:07 Jon Smirl
  2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 20:20 ` Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: Jon Smirl @ 2006-10-14 15:07 UTC (permalink / raw)
  To: Git Mailing List

I was reading Brendan's blog post about Mozilla 2
http://weblogs.mozillazine.org/roadmap/archives/2006/10/mozilla_2.html

It refers to this comparison chart between source control systems.
http://bazaar-vcs.org/RcsComparisons

Does it accurately reflect the current status of git? Is their
assessment of git's rename capability correct?

They want changes via IRC. "Please discuss changes to this table on
the freenode IRC network channel #bzr, or on the mailing list. The
terms used in the table have precise meanings, and not all VCS's use
the same term in the same way - which means that some translation is
needed to fill it in properly."

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-14 15:07 VCS comparison table Jon Smirl
@ 2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 17:18   ` Jon Smirl
                     ` (2 more replies)
  2006-10-14 20:20 ` Jakub Narebski
  1 sibling, 3 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-14 16:40 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jon Smirl wrote:

> It refers to this comparison chart between source control systems.
> http://bazaar-vcs.org/RcsComparisons

It is quite obvious that a comparison of programs of a given type (SCM)
on the site of one such program (Bazaar-NG) is usually biased towards that
program, perhaps unconsciously: by emphasizing the features which were
important for the developers of that program.
 
> Does it accurately reflect the current status of git? Is their
> assessment of git's rename capability correct?

For example simple namespace for git: you can use shortened sha1
(even to only 6 characters, although usually 8 are used), you can
use tags, you can use ref^m~n syntax.
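
To make the ref^m~n part concrete, here is a minimal sketch in Python
(a toy model, not git's own parser): "^m" selects the m-th parent of a
commit and "~n" follows the first parent n times.  The PARENTS and REFS
tables and the resolve() helper are made up for illustration.

```python
import re

# Toy history: commit name -> list of parent names (first parent first).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "M": ["B", "C"],  # merge commit with two parents
    "D": ["M"],
}
REFS = {"master": "D"}


def resolve(expr):
    """Resolve a `ref^m~n`-style expression against the toy history."""
    match = re.match(r"[^~^]+", expr)
    if match is None:
        raise ValueError("expression must start with a ref or commit name")
    name = match.group(0)
    commit = REFS.get(name, name)
    for op, num in re.findall(r"([~^])(\d*)", expr[match.end():]):
        count = int(num) if num else 1
        if op == "^":
            if count:  # "^0" leaves the commit unchanged
                commit = PARENTS[commit][count - 1]  # ^m: the m-th parent
        else:
            for _ in range(count):  # ~n: n steps along first parents
                commit = PARENTS[commit][0]
    return commit


print(resolve("master~2"))   # prints B
print(resolve("master^^2"))  # prints C
```

So "master~2" and "master^^" name the same commit here, while
"master^^2" reaches the second parent of the merge.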

I'm not sure about "No" in "Supports Repository". Git supports multiple
branches in one repository, and what's better supports development using
multiple branches, but cannot for example do a diff or a cherry-pick
between repositories (well, you can use git-format-patch/git-am to
cherry-pick changes between repositories...).

About "checkouts", i.e. working directories with the repository elsewhere:
you can use the GIT_DIR environment variable or the "git --git-dir" option,
or symlinks, and if the Nguyen Thai Ngoc D proposal to have a .gitdir/.git
"symref"-like file pointing to the repository passes, we can use that.

Partial checkouts are only partially supported as of now; it means
you have to do some low level stuff to do a partial checkout, and be
careful when committing. BTW it depends what you mean by partial
checkout, but they are somewhat incompatible with atomic commits
to a snapshot based repository.

Git supports renames in its own way; it doesn't use file ids, nor
remember renames (the new "note" header for use e.g. by porcelains 
didn't pass if I remember correctly). But it does *detect* moving
_contents_, and even *copying* _contents_ when requested. And of
course it detects renames in merges.
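
As a rough illustration of detecting a rename from contents alone,
without file ids, here is a sketch using Python's difflib.  It is not
git's actual similarity algorithm; detect_renames() and the 0.5
threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher


def detect_renames(old, new, threshold=0.5):
    """Pair deleted paths with added paths whose contents are similar.

    `old` and `new` map path -> file contents for two snapshots.
    Returns a list of (old_path, new_path, similarity) tuples.
    """
    deleted = [p for p in old if p not in new]
    added = [p for p in new if p not in old]
    renames = []
    for src in sorted(deleted):
        best, best_score = None, threshold
        for dst in sorted(added):
            score = SequenceMatcher(None, old[src], new[dst]).ratio()
            if score >= best_score:
                best, best_score = dst, score
        if best is not None:
            renames.append((src, best, round(best_score, 2)))
            added.remove(best)  # each added path is matched at most once
    return renames


# Made-up snapshots: util.c was moved to lib/math.c and slightly edited.
old = {
    "README": "hello\nworld\n",
    "util.c": "int add(int a, int b) { return a + b; }\n",
}
new = {
    "README": "hello\nworld\n",
    "lib/math.c": "int add(int a, int b) { return a + b; }\n/* moved */\n",
}
for src, dst, score in detect_renames(old, new):
    print(src, "->", dst, score)
```

Only the two snapshots are compared; nothing about the move had to be
recorded when the commit was made.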

Git doesn't have some "plugin framework", but because it has many
"plumbing" commands, it is easy to add new commands, and also new
merge strategies, using shell scripts, Perl, Python and of course C.
So the answer would be "Somewhat", as git has pluggable merge strategies,
or even "Yes" as it is easy to add a new git command.

> They want changes via IRC. "Please discuss changes to this table on
> the freenode IRC network channel #bzr, or on the mailing list."

Gaah, subscribe-to-post mailing list!
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
  2006-10-14 16:40 ` Jakub Narebski
@ 2006-10-14 17:18   ` Jon Smirl
  2006-10-14 17:42     ` Jakub Narebski
  2006-10-16  3:53   ` Martin Pool
  2006-10-16 22:26   ` Aaron Bentley
  2 siblings, 1 reply; 1752+ messages in thread
From: Jon Smirl @ 2006-10-14 17:18 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
>
> > It refers to this comparison chart between source control systems.
> > http://bazaar-vcs.org/RcsComparisons
>
> It is quite obvious that a comparison of programs of a given type (SCM)
> on the site of one such program (Bazaar-NG) is usually biased towards that
> program, perhaps unconsciously: by emphasizing the features which were
> important for the developers of that program.
>
> > Does it accurately reflect the current status of git? Is their
> > assessment of git's rename capability correct?
>
> For example simple namespace for git: you can use shortened sha1
> (even to only 6 characters, although usually 8 are used), you can
> use tags, you can use ref^m~n syntax.
>
> I'm not sure about "No" in "Supports Repository". Git supports multiple
> branches in one repository, and what's better supports development using
> multiple branches, but cannot for example do a diff or a cherry-pick
> between repositories (well, you can use git-format-patch/git-am to
> cherry-pick changes between repositories...).
>
> About "checkouts", i.e. working directories with the repository elsewhere:
> you can use the GIT_DIR environment variable or the "git --git-dir" option,
> or symlinks, and if the Nguyen Thai Ngoc D proposal to have a .gitdir/.git
> "symref"-like file pointing to the repository passes, we can use that.

I believe they mean checking out only the latest few revisions instead
of copying the whole repo. This issue is a problem for Mozilla. If you
want to change a line in the git version you have to download the
entire 500MB tree with full history.

>
> Partial checkouts are only partially supported as of now; it means
> you have to do some low level stuff to do a partial checkout, and be
> careful when committing. BTW it depends what you mean by partial
> checkout, but they are somewhat incompatible with atomic commits
> to a snapshot based repository.

I believe partial checkout means being able to check one directory
tree out of the repo and work on it while ignoring what is happening
in the rest of the repo. This is another issue for Mozilla which has
multiple dependent projects checked into a single repo.

>
> Git supports renames in its own way; it doesn't use file ids, nor
> remember renames (the new "note" header for use e.g. by porcelains
> didn't pass if I remember correctly). But it does *detect* moving
> _contents_, and even *copying* _contents_ when requested. And of
> course it detects renames in merges.
>
> Git doesn't have some "plugin framework", but because it has many
> "plumbing" commands, it is easy to add new commands, and also new
> merge strategies, using shell scripts, Perl, Python and of course C.
> So the answer would be "Somewhat", as git has pluggable merge strategies,
> or even "Yes" as it is easy to add a new git command.
>
> > They want changes via IRC. "Please discuss changes to this table on
> > the freenode IRC network channel #bzr, or on the mailing list."
>
> Gaah, subscribe-to-post mailing list!

It is annoying, but subscribe with the no delivery option.

> --
> Jakub Narebski
> Warsaw, Poland
> ShadeHawk on #git
>
>
>


-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-14 17:18   ` Jon Smirl
@ 2006-10-14 17:42     ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-14 17:42 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git

Jon Smirl wrote:
>> About "checkouts", i.e. working directories with the repository
>> elsewhere: you can use the GIT_DIR environment variable or the "git
>> --git-dir" option, or symlinks, and if the Nguyen Thai Ngoc D proposal
>> to have a .gitdir/.git "symref"-like file pointing to the repository
>> passes, we can use that.
>
> I believe they mean checking out only the latest few revisions
> instead of copying the whole repo. This issue is a problem for
> Mozilla. If you want to change a line in the git version you have to
> download the entire 500MB tree with full history.

From http://bazaar-vcs.org/RcsComparisons:
  A "Checkout" is a working tree that points elsewhere for its RCS data.

You can always do as the Linux kernel did, splitting the repository into
a current and a historical part (which would also contain dead branches),
and creating and publishing a current-historical graft file, to join
the history if needed.

>> Partial checkouts are only partially supported as of now; it means
>> you have to do some low level stuff to do a partial checkout, and be
>> careful when committing. BTW it depends what you mean by partial
>> checkout, but they are somewhat incompatible with atomic commits
>> to a snapshot based repository.
> 
> I believe partial checkout means being able to check one directory
> tree out of the repo and work on it while ignoring what is happening
> in the rest of the repo. This is another issue for Mozilla which has
> multiple dependent projects checked into a single repo.

So split different projects into different repositories. There was some 
helper program (git-splitrepo or something like that) for that posted 
on git mailing list. And use "superrepository" to gather all projects 
together (see last discussion about subprojects on git mailing list).
-- 
Jakub Narebski
Poland

* Re: VCS comparison table
  2006-10-14 15:07 VCS comparison table Jon Smirl
  2006-10-14 16:40 ` Jakub Narebski
@ 2006-10-14 20:20 ` Jakub Narebski
  2006-10-14 23:06   ` Jon Smirl
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-14 20:20 UTC (permalink / raw)
  To: git

Jon Smirl wrote:

> I was reading Brendan's blog post about Mozilla 2
> http://weblogs.mozillazine.org/roadmap/archives/2006/10/mozilla_2.html

You mean:
 "Oh, and isn't it time that we get off of CVS? The best way to do that
  without throwing 1.9 into an uproar is to develop Mozilla 2 using a new
  Version Control System (VCS) that can merge with CVS (since we will want
  to track changes to files not being revamped at first, or at all; and
  we'll probably find bugs whose fixes should flow back into 1.9). The
  problem with VCSes is that there are too many to choose from now.
  Nevertheless, looking for mostly green columns in that chart should help
  us make a quick decision. We don't need "the best" or the "newest", but we
  do need better merging, branching, and renaming support."

There is work by Jon Smirl and Shawn Pearce on a CVS to Git importer which can
manage the large and complicated (read: f*cked-up) Mozilla CVS repository.
  http://git.or.cz/gitwiki/InterfacesFrontendsAndTools#cvs2git

By the way, I'd rather use SCM comparison table on neutral site, not on SCM
site.


I think that the Mozilla project should come up with its own set of requirements
and weights for the best SCM _for the Mozilla project_.

1. Converting existing CVS repository. This should be without data loss...
well, besides the data loss that stems from using CVS in the first place. "Best" SCM
would have:
  * Tool to convert CVS repository, which can then incrementally import
    changes.
  * It would be nice to have tool to exchange commits between SCM and CVS,
    be it like Tailor/git-svn, or via incremental import and exporting
    commits to CVS like git-cvsexportcommit. This would ease changing SCM,
    as both new SCM and CVS could be deployed in parallel, for a short time
    of course.
  * It would be nice to have CVS emulation like git-cvsserver, so users
    accustomed to CVS could still use it.

2. Good support for system which most important developers use, and good
support for system which most contributors use. If MS Windows is included
in those, then Git perhaps wouldn't be the best choice.

3. Good support for the workflow used in the project. Is it exchanging
patches via email (hello, Git!), having ssh access to some central
repository to push changes to, a net/mesh of repositories exchanging
information, or posting patches on some bug tracking software integrated
with the SCM? Is it using many branches (topic branches), or is it using
few branches and merging?

But it is equally important to realize what would be the best workflow to
use, without being constrained to the workflow imposed by the limitations
of CVS.

4. Good support for a _large_ project, with a large history. Namely, that
a developer wouldn't need to download many megabytes and/or wouldn't need
megabytes of working area. How that is solved, be it partial checkouts,
lazy/shallow/sparse clone, subprojects, splitting into
projects/repositories and having some superproject or build-time
superproject, splitting repository into current and historical... that of
course depends on SCM.

5. ....

and probably a few more
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
  2006-10-14 20:20 ` Jakub Narebski
@ 2006-10-14 23:06   ` Jon Smirl
  2006-10-14 23:34     ` Jakub Narebski
                       ` (4 more replies)
  0 siblings, 5 replies; 1752+ messages in thread
From: Jon Smirl @ 2006-10-14 23:06 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
>
> > I was reading Brendan's blog post about Mozilla 2
> > http://weblogs.mozillazine.org/roadmap/archives/2006/10/mozilla_2.html
>
> You mean:
>  "Oh, and isn't it time that we get off of CVS? The best way to do that
>   without throwing 1.9 into an uproar is to develop Mozilla 2 using a new
>   Version Control System (VCS) that can merge with CVS (since we will want
>   to track changes to files not being revamped at first, or at all; and
>   we'll probably find bugs whose fixes should flow back into 1.9). The
>   problem with VCSes is that there are too many to choose from now.
>   Nevertheless, looking for mostly green columns in that chart should help
>   us make a quick decision. We don't need "the best" or the "newest", but we
>   do need better merging, branching, and renaming support."
>
> There is work by Jon Smirl and Shawn Pearce on CVS to Git importer which can
> manage large and complicated (read: f*cked-up) Mozilla CVS repository.
>   http://git.or.cz/gitwiki/InterfacesFrontendsAndTools#cvs2git

I am still working with the developers of the cvs2svn import tool to
fix things so that Mozilla CVS can be correctly imported. There are
still outstanding bugs in cvs2svn preventing a correct import. MozCVS
can be imported, but the resulting repository is not entirely correct.

Once they get the base cvs2svn fixed I'll port my patches to turn it
into cvs2git again.

There is no existing CVS importer that will correctly import the
Mozilla CVS. I have tried them all.

> By the way, I'd rather use SCM comparison table on neutral site, not on SCM
> site.
>
>
> I think that Mozilla project should come with its own set of requirements
> and weights for best SCM _for Mozilla project_.
>
> 1. Converting existing CVS repository. This should be without data loss...
> well, beside data loss that stems from using CVS in first place. "Best" SCM
> would have:
>   * Tool to convert CVS repository, which can then incrementally import
>     changes.
>   * It would be nice to have tool to exchange commits between SCM and CVS,
>     be it like Tailor/git-svn, or via incremental import and exporting
>     commits to CVS like git-cvsexportcommit. This would ease changing SCM,
>     as both new SCM and CVS could be deployed in parallel, for a short time
>     of course.

From what Brendan wrote they are looking to continue 1.9 in CVS and
start 2.0 in a new SCM. This pretty much mandates tracking CVS into
the new SCM for a long period of time. Possibly as much as two years.
There does not appear to be a need to push 2.0 back into CVS.


>   * It would be nice to have CVS emulation like git-cvsserver, so users
>     accustomed to CVS could still use it.

This can also solve some of the problems with Windows support.

>
> 2. Good support for system which most important developers use, and good
> support for system which most contributors use. If MS Windows is included
> in those, then Git perhaps wouldn't be the best choice.

Better Windows support is needed to make git the first choice among
the various SCMs.

>
> 3. Good support for the workflow used in the project. Is it exchanging
> patches via email (hello, Git!), having ssh access to a central
> repository to push changes to, a network/mesh of repositories
> exchanging information, or posting patches to bug tracking software
> integrated with the SCM? Is it using many branches (topic branches),
> or is it using few branches and merging?
>
> But it is equally important to realize what would be the best workflow to
> use, not constraining itself to the workflow imposed by limitations of CVS.

A big problem for Mozilla is outside companies doing major work in a
local CVS. Since CVS is not decentralized these local repos drift away
from the main one over time making things hard to merge. Any new SCM
will have to be distributed.

> 4. Good support for _large_ project, with large history. Namely, that
> developer wouldn't need to download many megabytes and/or wouldn't need
> megabytes of working area. How that is solved, be it partial checkouts,
> lazy/shallow/sparse clone, subprojects, splitting into
> projects/repositories and having some superproject or build-time
> superproject, splitting repository into current and historical... that of
> course depends on SCM.

git has issues here. The smallest Mozilla download we have built so
far is 450MB for the initial checkout.

>
> 5. ....
>
> and probably few more


The three most complex repositories are the kernel, gcc and Mozilla.
GCC is in SVN now, Mozilla is in CVS, and the kernel is in git.

There are much larger repositories around for some of the distros, but
they are doing things like checking ISO images in to the repo which
just makes it big, not complex.

Top two git issues affecting Mozilla choosing it
1) some way to avoid the initial 450MB download
2) better windows support


-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
@ 2006-10-14 23:34     ` Jakub Narebski
       [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-14 23:34 UTC (permalink / raw)
  To: git

Jon Smirl wrote:

> Top two git issues affecting Mozilla choosing it
> 1) some way to avoid the initial 450MB download

Give out CDs with Mozilla's git repository (and use alternates) ;-)
Just kidding...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
       [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
@ 2006-10-15  0:03       ` Sean
  2006-10-15  0:34         ` Jon Smirl
  0 siblings, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-15  0:03 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

On Sat, 14 Oct 2006 19:06:10 -0400
"Jon Smirl" <jonsmirl@gmail.com> wrote:

> Top two git issues affecting Mozilla choosing it
> 1) some way to avoid the initial 450MB download

Why not split the repository up after you import it?  Break it into
two repositories, last year or two, and then everything else.

> 2) better windows support

Hard to imagine native windows support existing in time to be used by 
the Mozilla folks, maybe in time for 3.0 :o)

Sean

* Re: VCS comparison table
  2006-10-15  0:03       ` Sean
@ 2006-10-15  0:34         ` Jon Smirl
       [not found]           ` <20061014214452.8c2d2a5c.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 1752+ messages in thread
From: Jon Smirl @ 2006-10-15  0:34 UTC (permalink / raw)
  To: Sean; +Cc: Jakub Narebski, git

On 10/14/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 14 Oct 2006 19:06:10 -0400
> "Jon Smirl" <jonsmirl@gmail.com> wrote:
>
> > Top two git issues affecting Mozilla choosing it
> > 1) some way to avoid the initial 450MB download
>
> Why not split the repository up after you import it?  Break it into
> two repositories, last year or two, and then everything else.

That is possible but I wish git had tools supporting this. What do you
do about core developers that want the full repo syncing to other
developers that only have a partial copy?

>
> > 2) better windows support
>
> Hard to imagine native windows support existing in time to be used by
> the Mozilla folks, maybe in time for 3.0 :o)
>
> Sean
>


-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
  2006-10-14 23:34     ` Jakub Narebski
       [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
@ 2006-10-15  0:53     ` Jakub Narebski
  2006-10-15 15:37     ` Jakub Narebski
  2006-10-15 18:23     ` Petr Baudis
  4 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-15  0:53 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git

Jon Smirl wrote:
> On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:

>>   * It would be nice to have tool to exchange commits between SCM and CVS,
>>     be it like Tailor/git-svn, or via incremental import and exporting
>>     commits to CVS like git-cvsexportcommit. This would ease changing SCM,
>>     as both new SCM and CVS could be deployed in parallel, for a short time
>>     of course.
> 
> From what Brendan wrote they are looking to continue 1.9 in CVS and
> start 2.0 in a new SCM. This pretty much mandates tracking CVS into
> the new SCM for a long period of time. Possibly as much as two years.
> There does not appear to be a need to push 2.0 back into CVS.

That of course limits what we can do in 1.9 to what CVS supports.

> >   * It would be nice to have CVS emulation like git-cvsserver, so users
> >     accustomed to CVS could still use it.
> 
> This can also solve some of the problems with Windows support.

Well, git-cvsserver (perhaps with some improvements) could also serve as
CVS server for 1.9.
 
> > 4. Good support for _large_ project, with large history. Namely, that
> > developer wouldn't need to download many megabytes and/or wouldn't need
> > megabytes of working area. How that is solved, be it partial checkouts,
> > lazy/shallow/sparse clone, subprojects, splitting into
> > projects/repositories and having some superproject or build-time
> > superproject, splitting repository into current and historical... that of
> > course depends on SCM.
> 
> git has issues here. The smallest Mozilla download we have built so
> far is 450MB for the initial checkout.

One way to reduce repository size would be to split fairly independent
subprojects (independent = independently testable) into separate repositories,
and perhaps use some kind of "super-repository" (common repository) to join
all the projects into a single entity. The split can be done using
git-splitrepo (or something like that) which was posted on git mailing list
(most probably by some member of X.Org), or just cg-admin-rewritehist.
While at it we could split repository into current work and historical repo;
and clean up current work repository from the cruft accumulated (e.g. dead
branches, broken tags etc.).


Another way is to use grafts.

Linux kernel has its current repository (starting somewhere 2.6.x),
and its historical repository. I don't remember how they arrived at it
(and don't want to check KernelTrap articles), if the seed for current
work repository was simply project import at some state, or (very slow)
import of BitKeeper history. But if I remember correctly it was born split.
You can join both repositories into one (wrt. log and diff for example)
using grafts.
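
To make the graft idea concrete, here is a minimal sketch in Python. It is
not git's code; the commit names (c1..c3, h1, h2) and the walk_history
helper are made up for illustration. The only assumption taken from git is
that a graft entry overrides the parent list recorded in a commit.

```python
from typing import Dict, List


def walk_history(tip: str,
                 parents: Dict[str, List[str]],
                 grafts: Dict[str, List[str]]) -> List[str]:
    """Return the commits reachable from tip, newest first.

    A graft entry, when present, replaces the parents recorded in the
    commit itself; otherwise the recorded parents are used.
    """
    seen: List[str] = []
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.append(commit)
        effective = grafts.get(commit, parents.get(commit, []))
        stack.extend(reversed(effective))
    return seen


# "Current" repository: c1 is a root commit. Historical repository: h1 <- h2.
parents = {"c3": ["c2"], "c2": ["c1"], "c1": [], "h2": ["h1"], "h1": []}

split = walk_history("c3", parents, {})               # no grafts at all
joined = walk_history("c3", parents, {"c1": ["h2"]})  # graft joins histories
cut = walk_history("c3", parents, {"c2": []})         # graft "cauterizes"
```

With no grafts the walk stops at c1; grafting h2 in as the parent of c1
makes log-like traversal continue into the historical commits; an empty
parent list for c2 hides everything older than c2.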

I'm not sure what happens if you pull from a repository which has a graft
file "cauterizing" history; would you get the graft file and history up to
the cutoff point? And what would happen if the repository you pull into
has a cauterizing graft file; would it get the cut history? Of course
the problem (and the source of the proposals for, and troubles with
implementing, shallow/sparse/lazy clone) arises if someone branches (in a
public repo) from below the cutoff point. But that is a matter of policy.

But it is true that the size of Mozilla repository is a challenge.
BTW, do you perchance know how other SCMs deal with a repository
of that size?

-- 
Jakub Narebski
ShadeHawk on #git
Poland

* Re: VCS comparison table
       [not found]           ` <20061014214452.8c2d2a5c.seanlkml@sympatico.ca>
@ 2006-10-15  1:44             ` Sean
  0 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-15  1:44 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

On Sat, 14 Oct 2006 20:34:22 -0400
"Jon Smirl" <jonsmirl@gmail.com> wrote:

> That is possible but I wish git had tools supporting this. What do you
> do about core developers that want the full repo syncing to other
> developers that only have a partial copy?

I don't think that will be an issue at all.

As an example, take the current Linux kernel repo maintained by Linus,
and one of the repos containing old historic kernel data imported into
Git.  Graft in the old historic data into your clone of Linus' repo,
and you're done. Anyone can pull from you even if they don't have the
historic data themselves.

With a little work you could do the same thing with the Mozilla data.
After you decide where to make the split, you'd have to rewrite the
commit history for the "current" repository, so that it terminates
at an initial commit rather than having a direct connection to the
historic data.  After that, the repos could be used just as described
above, separately or grafted together.

As far as I know though, there is still no way to use the git protocol
for the initial pull of such a combined repository.  You have to pull
both repos separately and graft them together locally.  This sounds
harder than it is though and can be scripted easily.
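
As a rough sketch of the scripted part: as far as I know, the graft file
(.git/info/grafts) holds one line per grafted commit, the commit id
followed by the ids of its pretended parents, separated by spaces. The
helper name graft_line and the placeholder ids below are invented for
illustration; real ids would come from the two repositories.

```python
import re

# A full SHA1 object name is 40 lowercase hex digits.
SHA1_RE = re.compile(r"[0-9a-f]{40}")


def graft_line(commit: str, parents: list) -> str:
    """Format one graft entry: '<commit> <parent> [<parent> ...]'."""
    for name in [commit, *parents]:
        if not SHA1_RE.fullmatch(name):
            raise ValueError(f"not a full SHA1: {name!r}")
    return " ".join([commit, *parents])


# Placeholder ids, NOT real commits: the root commit of the "current"
# repository and the tip of the historic one.
root_of_current = "a" * 40
tip_of_historic = "b" * 40

line = graft_line(root_of_current, [tip_of_historic])
```

The resulting line would be appended to the graft file after the objects
of both repositories are available locally.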

Sean

* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
                       ` (2 preceding siblings ...)
  2006-10-15  0:53     ` Jakub Narebski
@ 2006-10-15 15:37     ` Jakub Narebski
  2006-10-15 18:23     ` Petr Baudis
  4 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-15 15:37 UTC (permalink / raw)
  To: git

Jon Smirl wrote:

> The three most complex repositories are the kernel, gcc and Mozilla.
> Gcc is in SVN now. Mozilla CVS and the kernel git.
> 
> There are much larger repositories around for some of the distros, but
> they are doing things like checking ISO images in to the repo which
> just makes it big, not complex.

I guess that one of the important things is the _size_ of the repository;
for example 12GB (if I remember correctly value for Subversion/SVK) vs 500MB
for git...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
                       ` (3 preceding siblings ...)
  2006-10-15 15:37     ` Jakub Narebski
@ 2006-10-15 18:23     ` Petr Baudis
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
  2006-10-15 19:49       ` Jon Smirl
  4 siblings, 2 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-15 18:23 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

Dear diary, on Sun, Oct 15, 2006 at 01:06:10AM CEST, I got a letter
where Jon Smirl <jonsmirl@gmail.com> said that...
> On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:
> >There is work by Jon Smirl and Shawn Pearce on CVS to Git importer which 
> >can
> >manage large and complicated (read: f*cked-up) Mozilla CVS repository.
> >  http://git.or.cz/gitwiki/InterfacesFrontendsAndTools#cvs2git
> 
> I am still working with the developers of the cvs2svn import tool to
> fix things so that Mozilla CVS can be correctly imported. There are
> still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> can be imported, but the resulting repository is not entirely correct.
> 
> Once they get the base cvs2svn fixed I'll port my patches to turn it
> into cvs2git again.

So what exactly is the cvs2git status now? AFAIU, there's a tool that
parses the CVS repository and that is then "piped" to git-fastimport?
git-fastimport is available somewhere (perhaps it would be interesting
to publish it at repo.or.cz or something), is the current cvs2git
version available as well?

> >2. Good support for system which most important developers use, and good
> >support for system which most contributors use. If MS Windows is included
> >in those, then Git perhaps wouldn't be the best choice.
> 
> Better Windows support is needed to make git the first choice among
> the various SCMs.

And this is probably not likely to happen soon.

Well, I'm enlisted in a "Programming in Windows" course at my university
now and I had this kind of thoughts, but I really can't promise
anything. :-)

> >4. Good support for _large_ project, with large history. Namely, that
> >developer wouldn't need to download many megabytes and/or wouldn't need
> >megabytes of working area. How that is solved, be it partial checkouts,
> >lazy/shallow/sparse clone, subprojects, splitting into
> >projects/repositories and having some superproject or build-time
> >superproject, splitting repository into current and historical... that of
> >course depends on SCM.
> 
> git has issues here. The smallest Mozilla download we have built so
> far is 450MB for the initial checkout.

(BTW, yes, grafting the old history could help this time, but it is a
hack and not a good long-term solution - it is just putting the real
solution away until the project history grows back. Periodic
regrafting is an even worse hack, since at that moment you break
fast-forwarding and this kind of "restarting the history" breaks deep
into the Git distributiveness.)

> >5. ....
> >
> >and probably few more
> 
> 
> The three most complex repositories are the kernel, gcc and Mozilla.
> Gcc is in SVN now. Mozilla CVS and the kernel git.

I believe OpenOffice CVS probably beats all three hands down very
easily. KDE is also very big, and I don't think NetBSD is just ISO
images either (if it contains any at all).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
@ 2006-10-15 18:39         ` Sean
  2006-10-15 19:24         ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-15 18:39 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jon Smirl, Jakub Narebski, git

On Sun, 15 Oct 2006 20:23:03 +0200
Petr Baudis <pasky@suse.cz> wrote:

> (BTW, yes, grafting the old history could help this time, but it is a
> hack and not a good long-term solution - it is just putting the real
> solution away until the project history grows back. Periodic
> regrafting is an even worse hack, since at that moment you break
> fast-forwarding and this kind of "restarting the history" breaks deep
> into the Git distributiveness.)

But is there a better practical solution he can use today?  I don't think
there is.  And the experience of the Linux kernel has shown that it's not
really all that big a problem.  You even made a nice script to help people
do it! ;o)

It's probably not the solution that should be used _next_ time the repository
grows too big, but it sure seems like the correct solution this time around.
Not many people will want all that old history anyway (10+ years as i recall?).

Sean

* Re: VCS comparison table
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
  2006-10-15 18:39         ` Sean
@ 2006-10-15 19:24         ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-15 19:24 UTC (permalink / raw)
  To: Sean; +Cc: Jon Smirl, Jakub Narebski, git

On Sun, Oct 15, 2006 at 08:39:56PM CEST, Sean wrote:
> On Sun, 15 Oct 2006 20:23:03 +0200
> Petr Baudis <pasky@suse.cz> wrote:
> 
> > (BTW, yes, grafting the old history could help this time, but it is a
> > hack and not a good long-term solution - it is just putting the real
> > solution away until the project history grows back. Periodic
> > regrafting is an even worse hack, since at that moment you break
> > fast-forwarding and this kind of "restarting the history" breaks deep
> > into the Git distributiveness.)
> 
> But is there a better practical solution he can use today?  I don't think
> there is.  And the experience of the Linux kernel has shown that it's not
> really all that big a problem.  You even made a nice script to help people
> do it! ;o)
> 
> It's probably not the solution that should be used _next_ time the repository
> grows too big, but it sure seems like the correct solution this time around.
> Not many people will want all that old history anyway (10+ years as i recall?).

Well I'm not saying it's the incorrect solution today, only that we
won't get around the problem by suggesting grafting forever. :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-15 18:23     ` Petr Baudis
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
@ 2006-10-15 19:49       ` Jon Smirl
  2006-10-16  3:23         ` Petr Baudis
  1 sibling, 1 reply; 1752+ messages in thread
From: Jon Smirl @ 2006-10-15 19:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, git

On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> > I am still working with the developers of the cvs2svn import tool to
> > fix things so that Mozilla CVS can be correctly imported. There are
> > still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> > can be imported, but the resulting repository is not entirely correct.
> >
> > Once they get the base cvs2svn fixed I'll port my patches to turn it
> > into cvs2git again.
>
> So what exactly is the cvs2git status now? AFAIU, there's a tool that
> parses the CVS repository and that is then "piped" to git-fastimport?
> git-fastimport is available somewhere (perhaps it would be interesting
> to publish it at repo.or.cz or something), is the current cvs2git
> version available as well?

cvs2git is a set of patches that get applied to cvs2svn. The patches
modify cvs2svn to output things in a format that git-fastimport can
consume.

The problem is that there are issues with cvs2svn and how it converts
CVS into change sets that are not getting fixed. These issues are
annoying for SVN users but they are fatal for git. The exact problem
is a bug in the way CVS symbol dependencies are dealt with in cvs2svn.
The bug results in most branches and symbols being based on 5-7
different change sets instead of a single change set. SVN then copies
from the 5-7 change sets to build the branch base or symbol base.
Copying from the 5-7 change sets is addressing the symptoms of the bug
instead of fixing the underlying problem which is incorrect ordering
of the base change sets.
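
A toy model of the ordering problem, as a sketch only: this is not cvs2svn
code, and the file names, revision numbers and the find_single_base helper
are invented. Each change set maps files to the revision it commits, and a
symbol (tag or branch point) records the revision of each file it covers.
If the change sets are ordered correctly, there is one change set after
which the tree matches the symbol exactly.

```python
from typing import Dict, List, Optional

ChangeSet = Dict[str, str]  # file name -> revision committed in this change set


def find_single_base(changesets: List[ChangeSet],
                     symbol: Dict[str, str]) -> Optional[int]:
    """Return the index of the first change set after which every file in
    the symbol is at the symbol's revision, or None if no such point exists."""
    state: Dict[str, str] = {}
    for index, changes in enumerate(changesets):
        state.update(changes)
        if all(state.get(path) == rev for path, rev in symbol.items()):
            return index
    return None


# The tag was taken when a.c was at 1.2 and b.c was still at 1.1.
tag = {"a.c": "1.2", "b.c": "1.1"}

good_order = [{"a.c": "1.1", "b.c": "1.1"}, {"a.c": "1.2"}, {"b.c": "1.2"}]
bad_order = [{"a.c": "1.1", "b.c": "1.1"}, {"b.c": "1.2"}, {"a.c": "1.2"}]

good_base = find_single_base(good_order, tag)
bad_base = find_single_base(bad_order, tag)
```

In the badly ordered list no single change set can serve as the base, so a
converter has to assemble the symbol by copying files from several change
sets. SVN can represent that; a git commit has no way to.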

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-15 19:49       ` Jon Smirl
@ 2006-10-16  3:23         ` Petr Baudis
  2006-10-16  3:30           ` Jon Smirl
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-16  3:23 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

Dear diary, on Sun, Oct 15, 2006 at 09:49:08PM CEST, I got a letter
where Jon Smirl <jonsmirl@gmail.com> said that...
> On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> >> I am still working with the developers of the cvs2svn import tool to
> >> fix things so that Mozilla CVS can be correctly imported. There are
> >> still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> >> can be imported, but the resulting repository is not entirely correct.
> >>
> >> Once they get the base cvs2svn fixed I'll port my patches to turn it
> >> into cvs2git again.
> >
> >So what exactly is the cvs2git status now? AFAIU, there's a tool that
> >parses the CVS repository and that is then "piped" to git-fastimport?
> >git-fastimport is available somewhere (perhaps it would be interesting
> >to publish it at repo.or.cz or something), is the current cvs2git
> >version available as well?
> 
> cvs2git is a set of patches that get applied to cvs2svn. The patches
> modify cvs2svn to output things in a format that git-fastimport can
> consume.

By the way, isn't what you want an incremental importer, because of the
1.9 branch? According to its homepage, cvs2svn is not designed for
incremental importing. Or are you fixing that as well?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-16  3:23         ` Petr Baudis
@ 2006-10-16  3:30           ` Jon Smirl
  2006-10-17  3:52             ` Sam Vilain
  0 siblings, 1 reply; 1752+ messages in thread
From: Jon Smirl @ 2006-10-16  3:30 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, git

On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Sun, Oct 15, 2006 at 09:49:08PM CEST, I got a letter
> where Jon Smirl <jonsmirl@gmail.com> said that...
> > On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> > >> I am still working with the developers of the cvs2svn import tool to
> > >> fix things so that Mozilla CVS can be correctly imported. There are
> > >> still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> > >> can be imported, but the resulting repository is not entirely correct.
> > >>
> > >> Once they get the base cvs2svn fixed I'll port my patches to turn it
> > >> into cvs2git again.
> > >
> > >So what exactly is the cvs2git status now? AFAIU, there's a tool that
> > >parses the CVS repository and that is then "piped" to git-fastimport?
> > >git-fastimport is available somewhere (perhaps it would be interesting
> > >to publish it at repo.or.cz or something), is the current cvs2git
> > >version available as well?
> >
> > cvs2git is a set of patches that get applied to cvs2svn. The patches
> > modify cvs2svn to output things in a format that git-fastimport can
> > consume.
>
> By the way, isn't what you want an incremental importer, because of the
> 1.9 branch? According to its homepage, cvs2svn is not designed for
> incremental importing. Or are you fixing that as well?

cvsps works ok on small amounts of data, but it can't handle the full
Mozilla repo. The current idea is to convert the full repo with
cvs2git and build the ini file needed by cvsps to support incremental
imports. After that use cvsps.


>
> --
>                                 Petr "Pasky" Baudis
> Stuff: http://pasky.or.cz/
> #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
> $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
> lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
>


-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 17:18   ` Jon Smirl
@ 2006-10-16  3:53   ` Martin Pool
  2006-10-22 15:50     ` Jakub Narebski
  2006-10-16 22:26   ` Aaron Bentley
  2 siblings, 1 reply; 1752+ messages in thread
From: Martin Pool @ 2006-10-16  3:53 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On 14 Oct 2006, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
> 
> > It refers to this comparison chart between source control systems.
> > http://bazaar-vcs.org/RcsComparisons
> 
> It is quite obvious that comparison of programs of given type (SCM)
> on some program site (Bazaar-NG) is usually biased towards said program,
> perhaps unconsciously: by emphasizing the features which were important
> for developers of said program.

I don't think I saw the original post but thanks for the feedback, we'll
update it.

> Gaah, subscribe-to-post mailing list!

No, it's just moderated for first time posters to avoid spam.  Your
message got through.

* Re: VCS comparison table
  2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 17:18   ` Jon Smirl
  2006-10-16  3:53   ` Martin Pool
@ 2006-10-16 22:26   ` Aaron Bentley
  2006-10-16 22:35     ` Andy Whitcroft
                       ` (3 more replies)
  2 siblings, 4 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-16 22:26 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
>>Does it accurately reflect the current status of git? Is their
>>assessment of git's rename capability correct?
> 
> 
> For example simple namespace for git: you can use shortened sha1
> (even to only 6 characters, although usually 8 are used), you can
> use tags, you can use ref^m~n syntax.

Bazaar's namespace is "simple" because all branches can be named by a
URL, and all revisions can be named by a URL + a number.

If that's true of Git, then it certainly has a simple namespace.  Using
eight-digit hex values doesn't sound simple to me, though.
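
For readers unfamiliar with the shortened names under discussion, a small
sketch of how a hex prefix can stand for a full SHA1, provided it is
unique among the known objects. The resolve_prefix helper and the sample
objects are hypothetical, and this is not how git itself is implemented.

```python
import hashlib
from typing import Iterable


def resolve_prefix(prefix: str, object_ids: Iterable[str]) -> str:
    """Return the single object id starting with prefix.

    Raises KeyError when the prefix matches nothing or more than one id.
    """
    matches = [oid for oid in object_ids if oid.startswith(prefix)]
    if not matches:
        raise KeyError(f"no object matches {prefix!r}")
    if len(matches) > 1:
        raise KeyError(f"prefix {prefix!r} is ambiguous")
    return matches[0]


# Three made-up objects; their ids are the SHA1 of the content.
ids = [hashlib.sha1(data).hexdigest() for data in (b"one", b"two", b"three")]

full = resolve_prefix(ids[0][:8], ids)
```

The prefix only has to be long enough to be unambiguous within one
repository, which is why the usable length depends on the repository.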

> I'm not sure about "No" in "Supports Repository". Git supports multiple
> branches in one repository, and what's better supports development using
> multiple branches, but cannot for example do a diff or a cherry-pick
> between repositories (well, you can use git-format-patch/git-am to
> cherry-pick changes between repositories...).

That sounds right.  So those branches are persistent, and can be worked
on independently?

> About "checkouts", i.e. working directories with repository elsewhere:
> you can use GIT_DIR environmental variable or "git --git-dir" option,
> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> "symref"-like file to point to repository passes, we can use that.

It sounds like the .gitdir/.git proposal would give Git "checkouts", by
our meaning of the term.

> Partial checkouts are only partially supported as of now; it means
> you have to do some low level stuff to do partial checkout, and be
> careful when committing. BTW it depends what you mean by partial
> checkout, but they are somewhat incompatible with atomic commits
> to snapshot based repository.

Yes, I'm very much aware of that tension.  It will be fun when Bazaar
tries to support that... :-)

> Git supports renames in its own way; it doesn't use file ids, nor
> remember renames (the new "note" header for use e.g. by porcelains 
> didn't pass if I remember correctly). But it does *detect* moving
> _contents_, and even *copying* _contents_ when requested. And of
> course it detects renames in merges.

You'll note we referred to that behavior on the page.  We don't think
what Git does is the same as supporting renames.  AIUI, some Git users
feel the same way.

> Git doesn't have some "plugin framework", but because it has many
> "plumbing" commands, it is easy to add new commands, and also new
> merge strategies, using shell scripts, Perl, Python and of course C.
> So the answer would be "Somewhat", as git has pluggable merge strategies,
> or even "Yes" as it is easy to add new git command.

It sounds like you're saying it's extensible, not that it supports
plugins.  Plugins have very simple installation requirements.  They can
provide merge strategies, repository types, internet protocols, new
commands, etc., all seamlessly integrated.

What you're describing actually sounds like the Arch approach to
extensibility: provide a whole bunch of basic commands and let users
build an RCS on top of that.

As the author of two different Arch front-ends, I can say I haven't
found that approach satisfactory.  Invoking multiple commands tends to
re-invoke the same validation routines over and over, killing
efficiency, and diagnostics tend to be pretty poorly integrated.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNAb90F+nu1YWqI0RAvRDAJ9HHHdbhT1+aA3wOGeuUDkjRIr7BQCcDBKB
cL+DAy5GdTDk8Iz9TUkQ//M=
=AJAu
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
@ 2006-10-16 22:35     ` Andy Whitcroft
  2006-10-16 22:53       ` Jakub Narebski
  2006-10-16 23:19     ` Jakub Narebski
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 1752+ messages in thread
From: Andy Whitcroft @ 2006-10-16 22:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:

>>> Git supports renames in its own way; it doesn't use file ids, nor
>>> remember renames (the new "note" header for use e.g. by porcelains 
>>> didn't pass if I remember correctly). But it does *detect* moving
>>> _contents_, and even *copying* _contents_ when requested. And of
>>> course it detects renames in merges.
> 
> You'll note we referred to that behavior on the page.  We don't think
> what Git does is the same as supporting renames.  AIUI, some Git users
> feel the same way.

In my experience there are two key features to rename support.  The
first is that files move about efficiently, i.e. we don't have to carry
a different copy of the same file for each name it has had; this git
handles nicely.  The second is the seamless following of history 'back';
this git does not do trivially (when limited to specific files).  git
log on a renamed file pretty much stops at the rename point and you have
to deal with it yourself.

I would love to see someone respond with a pickaxe-like command line
which would list each and every change and its origin through merges and
the like.

Hmmm.

-apw

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 22:35     ` Andy Whitcroft
@ 2006-10-16 22:53       ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-16 22:53 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Andy Whitcroft wrote:

> Aaron Bentley wrote:
> 
>>>> Git supports renames in its own way; it doesn't use file ids, nor
>>>> remember renames (the new "note" header for use e.g. by porcelains 
>>>> didn't pass if I remember correctly). But it does *detect* moving
>>>> _contents_, and even *copying* _contents_ when requested. And of
>>>> course it detects renames in merges.
>> 
>> You'll note we referred to that behavior on the page.  We don't think
>> what Git does is the same as supporting renames.  AIUI, some Git users
>> feel the same way.
> 
> In my experience there are two key features to rename support.  The
> first is that files move about efficiently, i.e. we don't have to carry
> a different copy of the same file for each name it has had; this git
> handles nicely.  The second is the seamless following of history 'back';
> this git does not do trivially (when limited to specific files).  git
> log on a renamed file pretty much stops at the rename point and you have
> to deal with it yourself.

Both git log and git diff follow renames (with -M) and even copies
(with -C), but the path _limiter_ doesn't follow renames. There is a
proposal to add a --follow option to git rev-list to follow specified
paths. A patch adding this option was posted here on the git mailing list
(check the archives), but it was not applied because it was fairly
intrusive and not a complete solution, IIRC.
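
To make "detecting moved _contents_" concrete, here is a minimal sketch in
Python. It is not git's actual algorithm: it assumes trees are plain dicts
mapping path to file contents, uses difflib as a stand-in similarity
measure, and the 0.5 threshold is only an illustrative default. The name
detect_renames is hypothetical.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair deleted paths with added paths whose contents are similar.

    old_tree / new_tree map path -> file contents (str).
    Returns a list of (old_path, new_path, similarity) tuples.
    """
    # Only paths that vanished or appeared are rename candidates.
    deleted = {p: c for p, c in old_tree.items() if p not in new_tree}
    added = {p: c for p, c in new_tree.items() if p not in old_tree}
    renames = []
    for old_path, old_content in sorted(deleted.items()):
        best_path, best_score = None, threshold
        for new_path, new_content in sorted(added.items()):
            score = SequenceMatcher(None, old_content, new_content).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames.append((old_path, best_path, best_score))
            del added[best_path]  # each added file matches at most one source
    return renames


old = {"README": "hello\nworld\n",
       "util.c": "int add(int a, int b) { return a + b; }\n"}
new = {"README": "hello\nworld\n",
       "lib/util.c": "int add(int a, int b) { return a + b; }\n/* moved */\n"}
for old_path, new_path, score in detect_renames(old, new):
    print(old_path, "->", new_path)
```

Nothing is recorded at commit time in this model; the pairing is computed
from the two trees whenever somebody asks for it.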

I'd say that the second part is _partially_ supported, as we can follow
the history of a renamed file with a pathlimit, detect that the file was
renamed, and continue using the previous name as the pathlimit. For
example, if you know all the names the file had through history, you can
get the whole history by providing all those names as the pathlimit (well,
unless there is some conflict, like creating a new file with the same name
the file had before the rename; something that all file-id based solutions
have a problem with).
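
The manual procedure described above (follow with a pathlimit, notice the
rename, continue with the old name) can be sketched as a toy model in
Python. The commit layout here (dicts with "id", "touched" and "renames"
keys) and the name follow_history are assumptions for illustration only,
not git data structures.

```python
def follow_history(commits, path):
    """Walk commits newest-first, tracking one file across renames.

    Each commit is a dict with an "id", a set of "touched" paths and a
    "renames" dict mapping new_path -> old_path (as reported by rename
    detection). Returns the ids of commits that touched the file.
    """
    history = []
    current = path
    for commit in commits:  # newest first
        if current in commit["touched"]:
            history.append(commit["id"])
        if current in commit["renames"]:
            # Before this commit the file lived under its old name,
            # so switch the path limit for all older commits.
            current = commit["renames"][current]
    return history


commits = [
    {"id": "c3", "touched": {"lib/util.c"}, "renames": {}},
    {"id": "c2", "touched": {"lib/util.c"},
     "renames": {"lib/util.c": "util.c"}},
    {"id": "c1", "touched": {"util.c"}, "renames": {}},
]
print(follow_history(commits, "lib/util.c"))
```

A walk limited to the fixed path "lib/util.c" would stop at c2; switching
the path limit at the rename is what picks up c1 as well.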
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
  2006-10-16 22:35     ` Andy Whitcroft
@ 2006-10-16 23:19     ` Jakub Narebski
  2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
                         ` (2 more replies)
  2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:45     ` Johannes Schindelin
  3 siblings, 3 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-16 23:19 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
> >>Does it accurately reflect the current status of git? Is their
> >>assessment of git's rename capability correct?
> >
> >
> > For example simple namespace for git: you can use shortened sha1
> > (even to only 6 characters, although usually 8 are used), you can
> > use tags, you can use ref^m~n syntax.
> 
> Bazaar's namespace is "simple" because all branches can be named by a
> URL, and all revisions can be named by a URL + a number.

Well, all refs (branches and tags) are named by [relative] path. So for
example we can have 'master', 'next', 'jc/diff' branches, 'v1.4.0' and
'examples/tag' tags. Cogito for example uses <repository URL>#<branch>
syntax.

> If that's true of Git, then it certainly has a simple namespace.  Using
> eight-digit hex values doesn't sound simple to me, though.

Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
(which constantly change) is a moving target.

There was a proposal to add some kind of serial number to git (like
Subversion revision numbers) and even a solution for how to do this...
but one must realize that any serial number must be _local_ to the
repository. One cannot have universally valid revision numbers (even
only per branch) in distributed development. Subversion can do that only
because it is a centralized SCM. Global numbering and a distributed nature
don't mix... hence content-based sha1 as commit identifiers.


But this doesn't matter much, because you can have really lightweight
tags in git (especially now with packed refs support). So you can have
the namespace you want.

>> I'm not sure about "No" in "Supports Repository". Git supports multiple
>> branches in one repository, and what's better supports development using
>> multiple branches, but cannot for example do a diff or a cherry-pick
>> between repositories (well, you can use git-format-patch/git-am to
>> cherry-pick changes between repositories...).
> 
> That sounds right.  So those branches are persistent, and can be worked
> on independently?

Branches are persistent, have a _separate_ (!) namespace (they are not
incorporated in the repository URL according to some kind of convention
like in Subversion), can be worked on independently, and you can easily
switch between branches in one working directory. Branches are cheap
in git (hence the notion of topic branches).

I wonder if any SCM other than git has an easy way to "rebase" a branch,
i.e. cut a branch at its branching point and transplant it to the tip
of another branch. For example, you work on the 'xx/topic' topic branch
and want to have the changes in that branch applied to current work,
not to the version from some time ago when you started working on
said feature.

What your comparison matrix lacks, for example, is whether a given SCM
saves information about branching points and merges, so you can
find where two branches diverged, and when one branch was merged into
another.
 
>> About "checkouts", i.e. working directories with repository elsewhere:
>> you can use GIT_DIR environmental variable or "git --git-dir" option,
>> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
>> "symref"-like file to point to repository passes, we can use that.
> 
> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> our meaning of the term.

Actually it is better to work with a clone of the repository, perhaps
either symlinking the object database or using the alternates mechanism
(with alternates, the repositories would share old history but gather new
history independently, I think).

>> Git doesn't have some "plugin framework", but because it has many
>> "plumbing" commands, it is easy to add new commands, and also new
>> merge strategies, using shell scripts, Perl, Python and of course C.
>> So the answer would be "Somewhat", as git has pluggable merge strategies,
>> or even "Yes", as it is easy to add new git commands.
> 
> It sounds like you're saying it's extensible, not that it supports
> plugins.  Plugins have very simple installation requirements.  They can
> provide merge strategies, repository types, internet protocols, new
> commands, etc., all seamlessly integrated.

Plugins = API + detection infrastructure + loading on demand.
Git has an API and a kind of detection infrastructure (for commands and
merge strategies only), but doesn't have loading on demand. You can
easily provide new commands (thanks to the git wrapper) and new merge
strategies.
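
As a rough sketch of what "detection" means for commands: the wrapper can
treat "git foo" as a request to run an executable named git-foo that it
finds on a search path. The Python below only models that lookup; the
function name find_git_command is hypothetical, and it ignores builtins
and git's own exec directory.

```python
import shutil


def find_git_command(name, search_path=None):
    """Return the full path of an executable called git-<name>, or None.

    search_path is a PATH-style string; None means use the PATH from the
    environment.
    """
    return shutil.which("git-" + name, path=search_path)


# With no git-<name> executable on the search path the lookup simply
# fails, and the wrapper would report an unknown command.
print(find_git_command("surely-not-a-real-command"))
```

Dropping an executable script named git-hello into a directory on the
search path is then enough for the lookup to find a new "hello" command,
with no registration step.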

Does git need a "plugin framework"? I'm not sure. Right now it is like
a Linux kernel without loadable module support...

> What you're describing actually sounds like the Arch approach to
> extensibility: provide a whole bunch of basic commands and let users
> build an RCS on top of that.
>
> As the author of two different Arch front-ends, I can say I haven't
> found that approach satisfactory.  Invoking multiple commands tends to
> re-invoke the same validation routines over and over, killing
> efficiency, and diagnostics tend to be pretty poorly integrated.

Actually I think that is how git was made. First came the low level stuff,
"plumbing" in git parlance. Then there were scripts which used those
low level commands. There is an ongoing project to rewrite them as builtin
commands (written in C); many of them have already been rewritten.

When git had very few higher level commands, along came git-pasky,
later renamed to Cogito: a higher level SCM built on top of Git (in bash
shell). Now core git contains many high level commands, "porcelain"
in git jargon.

Well, there is also StGit and its alternative pg (Patchy Git), which
implement Quilt-like functionality (patch management) on top of Git.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
  2006-10-16 22:35     ` Andy Whitcroft
  2006-10-16 23:19     ` Jakub Narebski
@ 2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:55       ` Jakub Narebski
                         ` (2 more replies)
  2006-10-16 23:45     ` Johannes Schindelin
  3 siblings, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-16 23:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Mon, 16 Oct 2006, Aaron Bentley wrote:
> 
> Bazaar's namespace is "simple" because all branches can be named by a
> URL, and all revisions can be named by a URL + a number.
> 
> If that's true of Git, then it certainly has a simple namespace.  Using
> eight-digit hex values doesn't sound simple to me, though.

Hey, "simple" is in the eye of the beholder. You can always just define 
Bazaar's naming convention to be simple. 

I pretty much _guarantee_ that a "number" is not a valid way to uniquely 
name a revision in a distributed environment, though. I bet the "number" 
really only names a revision in one _single_ repository, right?

Which means that it's actually not a "name" of the revision at all. It's 
just a local shorthand that has no meaning, and the exact same revision 
will be called something different when in somebody else's repository.

I wouldn't call that "simple". I'd call it "insane".

In contrast, in git, a revision is a revision is a revision. If you give 
the SHA1 name, it's well-defined even between different repositories, and 
you can tell somebody that "revision XYZ is when the problem started", and 
they'll know _exactly_ which revision it is, even if they don't have your 
particular repository.

Now _that_ is true simplicity. It does automatically mean that the names 
are a bit longer, but in this case, "longer" really _does_ mean "simpler".
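
A small Python sketch of the difference between the two kinds of name.
The blob_id function follows git's object naming for file contents (SHA-1
over a "blob <size>" header plus the data); blob ids are used here only as
stand-ins for revision names, and the local_numbers helper and the
alice/bob repositories are hypothetical.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Name git gives to file contents: SHA-1 of a small header plus data."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def local_numbers(arrival_order):
    """Number revisions 1..n in the order one repository received them."""
    return {rev: n for n, rev in enumerate(arrival_order, start=1)}


# Two repositories hold the same three revisions but received them in a
# different order (made-up contents, for illustration only).
revs = [blob_id(b"first\n"), blob_id(b"second\n"), blob_id(b"third\n")]
alice = local_numbers([revs[0], revs[1], revs[2]])
bob = local_numbers([revs[0], revs[2], revs[1]])

print(alice[revs[2]], bob[revs[2]])  # the local numbers disagree
print(blob_id(b"hello\n"))           # derived from content alone
```

The serial numbers depend on who received what and when; the hash depends
only on the content, so it can be quoted to somebody who has never seen
your repository.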

If you want a short, human-readable name, you _tag_ it. It takes all of a 
hundredth of a second or so to do.

> > I'm not sure about "No" in "Supports Repository". Git supports multiple
> > branches in one repository, and what's better supports development using
> > multiple branches, but cannot for example do a diff or a cherry-pick
> > between repositories (well, you can use git-format-patch/git-am to
> > cherry-pick changes between repositories...).
> 
> That sounds right.  So those branches are persistent, and can be worked
> on independently?

Yes.

> > About "checkouts", i.e. working directories with repository elsewhere:
> > you can use GIT_DIR environmental variable or "git --git-dir" option,
> > or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> > "symref"-like file to point to repository passes, we can use that.
> 
> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> our meaning of the term.

Well, in the git world, it's really just one shared repository that has 
separate branch-namespaces, and separate working trees (aka "checkouts"). 
So yes, it probably matches what bazaar would call a checkout.

Almost nobody seems to actually use it that way in git - it's mostly more 
efficient to just have five different branches in the same working tree, 
and switch between them. When you switch between branches in git, git only 
rewrites the part of your working tree that actually changed, so switching 
is extremely efficient even with a large repo. 
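
A minimal sketch of why switching is cheap, assuming each branch tip is
modelled as a dict mapping path to the object id of the file contents
(the paths_to_rewrite name and the ids below are made up for
illustration):

```python
def paths_to_rewrite(current_tree, target_tree):
    """Return (write, delete): paths to (re)write and to remove on switch.

    Trees map path -> object id of the file contents.
    """
    # A path needs writing only if it is new or its object id differs.
    write = sorted(p for p, oid in target_tree.items()
                   if current_tree.get(p) != oid)
    # A path needs removing if the target branch no longer has it.
    delete = sorted(p for p in current_tree if p not in target_tree)
    return write, delete


master = {"Makefile": "a1", "main.c": "b2", "doc.txt": "c3"}
bugfix = {"Makefile": "a1", "main.c": "b9", "test.c": "d4"}
print(paths_to_rewrite(master, bugfix))
```

Files whose object id is identical on both branches (Makefile here) are
never touched, so their timestamps stay put and a make-based build has
nothing to redo for them.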

So there is seldom any real need or reason to actually have multiple 
checkouts. But it certainly _works_.

> You'll note we referred to that behavior on the page.  We don't think
> what Git does is the same as supporting renames.  AIUI, some Git users
> feel the same way.

The fact is, git supports renames better than just about anybody else. It 
just does them technically differently. The fact that it happens to be the 
_right_ way, and everybody else is incompetent, is not my fault ;)

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:19     ` Jakub Narebski
@ 2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
  2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  9:37       ` Robert Collins
  2 siblings, 0 replies; 1752+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-10-16 23:39 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

On 10/17/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Aaron Bentley wrote:
> > Jakub Narebski wrote:
> >> About "checkouts", i.e. working directories with repository elsewhere:
> >> you can use GIT_DIR environmental variable or "git --git-dir" option,
> >> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> >> "symref"-like file to point to repository passes, we can use that.
> >
> > It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> > our meaning of the term.
>
> Actually it is better to work with clone of repository, perhaps either
> symlinking object database, or by alternates mechanism (with alternates
> repositories would share old history, but gather new independently
> I think).
I agree. Each Git repository is designed to work with one working
directory. Using the .gitdir/.git proposal, you are likely to check out
two working directories from one repo.
-- 
Duy

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
                       ` (2 preceding siblings ...)
  2006-10-16 23:35     ` Linus Torvalds
@ 2006-10-16 23:45     ` Johannes Schindelin
  2006-10-17  2:40       ` Petr Baudis
                         ` (2 more replies)
  3 siblings, 3 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-16 23:45 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Hi Aaron,

On Mon, 16 Oct 2006, Aaron Bentley wrote:

> Jakub Narebski wrote:
> >>Does it accurately reflect the current status of git? Is their
> >>assessment of git's rename capability correct?
> >
> >
> > For example simple namespace for git: you can use shortened sha1
> > (even to only 6 characters, although usually 8 are used), you can
> > use tags, you can use ref^m~n syntax.
> 
> Bazaar's namespace is "simple" because all branches can be named by a 
> URL, and all revisions can be named by a URL + a number.

How should this cope with a distributed project? IOW how does it deal with 
"this revision and that revision are exactly the same"?

If I understand you correctly, you are claiming that you are not really 
identifying a revision, but a revision _at a certain place with a 
place-dependent number_. This conflicts with my understanding of a 
revision.

> If that's true of Git, then it certainly has a simple namespace.  Using 
> eight-digit hex values doesn't sound simple to me, though.

It depends on your usage. If you want to do anything interesting, like 
assure that you have the correct version, or assure that two different 
person's tags actually tag the same revision, there is no simpler 
representation.

> > I'm not sure about "No" in "Supports Repository". Git supports multiple
> > branches in one repository, and what's better supports development using
> > multiple branches, but cannot for example do a diff or a cherry-pick
> > between repositories (well, you can use git-format-patch/git-am to
> > cherry-pick changes between repositories...).
> 
> That sounds right.  So those branches are persistent, and can be worked
> on independently?

Of course! Persistence (and reliability) are the number one goal of git. 
Performance is the next one.

As an example of completely independent branches, look at the "next" and 
the "todo" branch of git. They are _completely_ independent, i.e. not even 
sharing history, let alone files.

> > Git supports renames in its own way; it doesn't use file ids, nor
> > remember renames (the new "note" header for use e.g. by porcelains
> > didn't pass if I remember correctly). But it does *detect* moving
> > _contents_, and even *copying* _contents_ when requested. And of
> > course it detects renames in merges.
> 
> You'll note we referred to that behavior on the page.  We don't think
> what Git does is the same as supporting renames.  AIUI, some Git users
> feel the same way.

Oh, we start another flamewar again?

Honestly, if you want to record renames, why don't you also support (with 
a command for each of those purposes) code copying? And refactoring? And 
copyright year bumps? _put your favourite here_

If you really, really think about it: it makes much more sense to record 
your intention in the commit message. So, instead of recording for _every_ 
_single_ file in folder1/ that it was moved to folder2/, it is better to 
say that you moved folder1/ to folder2/ _because of some special reason_!

Same goes for all other thinkable examples.

If you want to track code, then let the tracker do its work, i.e. let 
git-pickaxe figure out where your code came from. It is likely to be more 
precise than any human ever can be.

> > Git doesn't have some "plugin framework", but because it has many
> > "plumbing" commands, it is easy to add new commands, and also new
> > merge strategies, using shell scripts, Perl, Python and of course C.
> > So the answer would be "Somewhat", as git has pluggable merge strategies,
> > or even "Yes", as it is easy to add new git commands.
> 
> It sounds like you're saying it's extensible, not that it supports
> plugins.  Plugins have very simple installation requirements.  They can
> provide merge strategies, repository types, internet protocols, new
> commands, etc., all seamlessly integrated.
> 
> What you're describing actually sounds like the Arch approach to
> extensibility: provide a whole bunch of basic commands and let users
> build an RCS on top of that.

It is more like the Unix way. Let each command do _one_ thing, but let it 
do it _perfectly_.

> As the author of two different Arch front-ends, I can say I haven't
> found that approach satisfactory.  Invoking multiple commands tends to
> re-invoke the same validation routines over and over, killing
> efficiency, and diagnostics tend to be pretty poorly integrated.

Welcome to git! Git's commands are very efficient, and you can even pipe 
them efficiently! And now that we have GIT_TRACE, diagnostics are no 
concern.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:35     ` Linus Torvalds
@ 2006-10-16 23:55       ` Jakub Narebski
  2006-10-17  0:04         ` Johannes Schindelin
  2006-10-17  0:08         ` Linus Torvalds
  2006-10-17  0:29       ` Luben Tuikov
  2006-10-17  4:24       ` Aaron Bentley
  2 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-16 23:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Aaron Bentley, bazaar-ng, git

Linus Torvalds wrote:
>>> About "checkouts", i.e. working directories with repository elsewhere:
>>> you can use GIT_DIR environmental variable or "git --git-dir" option,
>>> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
>>> "symref"-like file to point to repository passes, we can use that.
>> 
>> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
>> our meaning of the term.
> 
> Well, in the git world, it's really just one shared repository that has 
> separate branch-namespaces, and separate working trees (aka "checkouts"). 
> So yes, it probably matches what bazaar would call a checkout.
> 
> Almost nobody seems to actually use it that way in git - it's mostly more 
> efficient to just have five different branches in the same working tree, 
> and switch between them. When you switch between branches in git, git only 
> rewrites the part of your working tree that actually changed, so switching 
> is extremely efficient even with a large repo. 

Unless you have branch(es) with totally different contents, like git.git
'todo' branch.

> So there is seldom any real need or reason to actually have multiple 
> checkouts. But it certainly _works_.

But without .git being either a symlink or a .git/.gitdir "symref"-link,
you have to remember what to set GIT_DIR to, or the parameter for the
--git-dir option.

I'd like to mention once again that in Git, branches and tags have
a namespace totally separate from the repository namespace.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:55       ` Jakub Narebski
@ 2006-10-17  0:04         ` Johannes Schindelin
  2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:08         ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-17  0:04 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Aaron Bentley, bazaar-ng, git

Hi,

On Tue, 17 Oct 2006, Jakub Narebski wrote:

> Linus Torvalds wrote:
> >>> About "checkouts", i.e. working directories with repository elsewhere:
> >>> you can use GIT_DIR environmental variable or "git --git-dir" option,
> >>> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> >>> "symref"-like file to point to repository passes, we can use that.
> >> 
> >> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> >> our meaning of the term.
> > 
> > Well, in the git world, it's really just one shared repository that has 
> > separate branch-namespaces, and separate working trees (aka "checkouts"). 
> > So yes, it probably matches what bazaar would call a checkout.
> > 
> > Almost nobody seems to actually use it that way in git - it's mostly more 
> > efficient to just have five different branches in the same working tree, 
> > and switch between them. When you switch between branches in git, git only 
> > rewrites the part of your working tree that actually changed, so switching 
> > is extremely efficient even with a large repo. 
> 
> Unless you have branch(es) with totally different contents, like git.git
> 'todo' branch.

But I _do_ work with it! I just don't need to "checkout" it! Example:

git -p cat-file -p todo:TODO

(How about making git-cat a shortcut for "git -p cat-file -p"?)

> > So there is seldom any real need or reason to actually have multiple 
> > checkouts. But it certainly _works_.
> 
> But without .git being either symlink, or .git/.gitdir "symref"-link,
> you have to remember what to set GIT_DIR to, or parameter for --git-dir
> option.

You'd just use alternates for that.

But as Linus mentioned in another email, you mostly can use the _same_ 
working directory. If you want to work on another branch, which is not all 
that different from the current branch (say, you have a bug fix branch on 
top of an upstream branch), you just _switch_ to it. Git recognizes those 
files which are changed, and updates only these. Therefore, if you have 
something like a Makefile system to build the project, you actually save 
(compile) time as compared to the multiple-checkout scenario.

I use this system a lot, since I maintain a few bugfixes for a few 
projects until the bugfixes are applied upstream. BTW the 
multiple-branches-in-one-working-directory workflow was propagated by Jeff 
a long time ago, and it really changed my way of working. Thanks, Jeff!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:55       ` Jakub Narebski
  2006-10-17  0:04         ` Johannes Schindelin
@ 2006-10-17  0:08         ` Linus Torvalds
  2006-10-17  0:24           ` Jakub Narebski
  2006-10-17  4:31           ` Aaron Bentley
  1 sibling, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17  0:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git



On Tue, 17 Oct 2006, Jakub Narebski wrote:
> > rewrites the part of your working tree that actually changed, so switching 
> > is extremely efficient even with a large repo. 
> 
> Unless you have branch(es) with totally different contents, like git.git
> 'todo' branch.

Yes. I have to say, that's likely a fairly odd case, and I wouldn't be 
surprised if other VCS's don't support that mode of operation at _all_.

The fact that git branches can be independent of each other is very 
natural in the git world, but 

> > So there is seldom any real need or reason to actually have multiple 
> > checkouts. But it certainly _works_.
> 
> But without .git being either symlink, or .git/.gitdir "symref"-link,
> you have to remember what to set GIT_DIR to, or parameter for --git-dir
> option.

I'd strongly suggest that people who do this should actually do

	git clone -l

instead of actually playing games with symlinking .git/ itself or using 
GIT_DIR. It means that the two checkouts get separate branch namespaces, 
but that's really what you'd want most of the time. 

You _can_ share the whole branch namespace and do the symlink of .git (or 
just set GIT_DIR - but that's pretty inconvenient), and it might end up 
being "closer" to what some other VCS would do. But the natural thing to 
do with git is to just share some of the objects through local "slaving" 
of the repositories, and consider them otherwise entirely independent.

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  0:04         ` Johannes Schindelin
@ 2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:36             ` Johannes Schindelin
                               ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17  0:23 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git



On Tue, 17 Oct 2006, Johannes Schindelin wrote:
> > 
> > Unless you have branch(es) with totally different contents, like git.git
> > 'todo' branch.
> 
> But I _do_ work with it! I just don't need to "checkout" it! Example:
> 
> git -p cat-file -p todo:TODO

Ok, if there ever was an example of a strange git command-line, that was 
it.

> (How about making git-cat a shortcut for "git -p cat-file -p"?)

Well, you can just add

	[alias]
		cat=-p cat-file -p

to your ~/.gitconfig file, and you're there.

[ For all the non-git people here: the first "-p" is shorthand for 
  "--paginate", and means that git will automatically start a pager for 
  the output. The second "-p" is shorthand for "pretty" (there's no 
  long-format command line switch for it, though), and means that git 
  cat-file will show the result in a human-readable way, regardless of 
  whether it's just a text-file, or a git directory ]

So then you can do just

	git cat todo:TODO

and you're done.

[ So for the non-git people, what that will actually _do_ is to show the 
  TODO file in the "todo" branch - regardless of whether it is checked out 
  or not, and start a pager for you. ]

I actually do this sometimes, but I've never done it for branches (and I 
do it seldom enough that I haven't added the alias). I do it for things 
like

	git cat v2.6.16:Makefile

to see what a file looked like in a certain tagged release.

People sometimes find the git command line confusing, but I have to say, 
the thing is _damn_ expressive. I've never seen anybody else do things 
like the above that git does really naturally, with not that much 
confusion really.

Even that "alias" file is quite readable, although I'd suggest writing out 
the switches in full, ie

	[alias]
		cat=--paginate cat-file -p

instead. That kind of helps explains what the alias does and avoids the 
question of why there are two "-p" switches.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  0:08         ` Linus Torvalds
@ 2006-10-17  0:24           ` Jakub Narebski
  2006-10-17  4:31           ` Aaron Bentley
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17  0:24 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Linus Torvalds wrote:

>> > So there is seldom any real need or reason to actually have multiple 
>> > checkouts. But it certainly _works_.
>> 
>> But without .git being either symlink, or .git/.gitdir "symref"-link,
>> you have to remember what to set GIT_DIR to, or parameter for --git-dir
>> option.
> 
> I'd strongly suggest that people who do this should actually do
> 
>         git clone -l
> 
> instead of actually playing games with symlinking .git/ itself or using 
> GIT_DIR. It means that the two checkouts get separate branch namespaces, 
> but that's really what you'd want most of the time. 

Or symlinking .git/objects (and perhaps .git/remotes and .git/branches).
BTW, wouldn't it rather be git clone -l -s? What would happen on repack,
or on repack -a -d?

But it is true that there is no need to checkout different branches
to different working areas.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:55       ` Jakub Narebski
@ 2006-10-17  0:29       ` Luben Tuikov
  2006-10-17  4:24       ` Aaron Bentley
  2 siblings, 0 replies; 1752+ messages in thread
From: Luben Tuikov @ 2006-10-17  0:29 UTC (permalink / raw)
  To: Linus Torvalds, Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

--- Linus Torvalds <torvalds@osdl.org> wrote:
> Well, in the git world, it's really just one shared repository that has 
> separate branch-namespaces, and separate working trees (aka "checkouts"). 
> So yes, it probably matches what bazaar would call a checkout.
> 
> Almost nobody seems to actually use it that way in git - it's mostly more 
> efficient to just have five different branches in the same working tree, 
> and switch between them. When you switch between branches in git, git only 
> rewrites the part of your working tree that actually changed, so switching 
> is extremely efficient even with a large repo. 
> 
> So there is seldom any real need or reason to actually have multiple 
> checkouts. But it certainly _works_.

It does work, very well at that.

I have a directory for each separate branch and simply use
cd(1) to change the current working directory to that branch.
So, instead of "git checkout <branch>", I do "cd ../<branch>".

One only needs to watch out when one updates the repository.
If there had been updates in those branches, then one needs
to git-reset the "branch" directory... (you know what I mean)
(For example when I come to work in the morning and sync up
 with home from my usb key...)

The script is called:
Usage: git-mkdir-of-branch <original-directory> <branch> <new-directory>
  where <branch> is the name of an existing branch in <original-directory>/.git/refs/heads

and uses simple symbolic links and some git plumbing to do the
job.  It can be found in my git trees.  I never bothered to send
it out to Junio, since it could be considered heretical. ;-)

     Luben

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  0:23           ` Linus Torvalds
@ 2006-10-17  0:36             ` Johannes Schindelin
  2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
  2006-10-17  7:26             ` Christian MICHON
  2 siblings, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-17  0:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, git

Hi,

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> On Tue, 17 Oct 2006, Johannes Schindelin wrote:
> 
> > (How about making git-cat be a shortcut to "git -p cat-file -p"?)
> 
> Well, you can just add
> 
> 	[alias]
> 		cat=-p cat-file -p
> 
> to your ~/.gitconfig file, and you're there.

Ha! I have had that for a long time! Although I named it "s", since "git s 
todo:TODO" is two letters shorter...

Ciao,
Dscho

P.S.: BTW a certain person complained about ~/.gitconfig not being 
documented, but evidently the itch was not big enough for that person to 
document it himself...

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:36             ` Johannes Schindelin
@ 2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
  2006-10-17  7:26             ` Christian MICHON
  2 siblings, 0 replies; 1752+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-10-17  1:17 UTC (permalink / raw)
  To: git

On 10/17/06, Linus Torvalds <torvalds@osdl.org> wrote:
> So then you can do just
>
>         git cat todo:TODO
>
> and you're done.
>
> [ So for the non-git people, what that will actually _do_ is to show the
>   TODO file in the "todo" branch - regardless of whether it is checked out
>   or not, and start a pager for you. ]
>
> I actually do this sometimes, but I've never done it for branches (and I
> do it seldom enough that I haven't added the alias). I do it for things
> like
>
>         git cat v2.6.16:Makefile
>
> to see what a file looked like in a certain tagged release.

This very useful syntax (<ent>:<path>) was never documented
"officially" anywhere. It is actually described in the commit log of
v1.4.1^0~255^2. Maybe someone should copy it into the git
documentation? Maybe core-tutorial.txt or git-rev-parse.txt, or is
there a better place?
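
A minimal sketch of how an <ent>:<path> name can be read, assuming the
name is split at the first colon (ref names cannot contain one, paths
can).  The SNAPSHOTS table and the show() helper are hypothetical
stand-ins for git's object database, not git's own code:

```python
# Toy model only: real git resolves these names against its object database.
def split_object_spec(spec):
    """Split '<ent>:<path>' at the first colon into (ent, path)."""
    ent, sep, path = spec.partition(":")
    if not sep:
        raise ValueError("expected '<ent>:<path>', got %r" % spec)
    return ent, path


# Hypothetical snapshots: ref or tag name -> {path: file contents}.
SNAPSHOTS = {
    "todo": {"TODO": "document the <ent>:<path> syntax\n"},
    "v2.6.16": {"Makefile": "VERSION = 2\nPATCHLEVEL = 6\nSUBLEVEL = 16\n"},
}


def show(spec, snapshots=SNAPSHOTS):
    """Return the contents of <path> as recorded in the snapshot named <ent>."""
    ent, path = split_object_spec(spec)
    return snapshots[ent][path]


if __name__ == "__main__":
    print(show("v2.6.16:Makefile"), end="")
```

The point of the syntax is that the lookup never touches the working
tree: the file is read from the named snapshot, checked out or not.
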
-- 
Duy

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:45     ` Johannes Schindelin
@ 2006-10-17  2:40       ` Petr Baudis
  2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  9:33       ` Robert Collins
  2 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-17  2:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Aaron Bentley, Jakub Narebski, bazaar-ng, git

Hi!

Dear diary, on Tue, Oct 17, 2006 at 01:45:34AM CEST, I got a letter
where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> On Mon, 16 Oct 2006, Aaron Bentley wrote:
> > As the author of two different Arch front-ends, I can say I haven't
> > found that approach satisfactory.  Invoking multiple commands tends to
> > re-invoke the same validation routines over and over, killing
> > efficiency, and diagnostics tend to be pretty poorly integrated.
> 
> Welcome to git! Git's commands are very efficient, and you can even pipe 
> them efficiently! And now that we have GIT_TRACE, diagnostics are no 
> concern.

I think Aaron rather meant that in case of an error, the error messages
may seem incoherent from the perspective of a porcelain user if they've
been generated by the plumbing. And I had that problem in Cogito as well
a few times in the past, but I think most of those are reasonable now (I
can't think of a counter-example off the top of my head).

Calling multiple git commands _is_ a problem, especially in a loop, but
I think it's more the inherent fork()+execve() overhead than whatever
happens over and over when main() takes over. Many git commands got
adjusted so that you can call them just once and then feed data from/to
them over a longer time period.
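
The difference can be sketched roughly as follows.  This is only an
illustration under stated assumptions: the child program is a made-up
stand-in (it upper-cases its input), not a git command, and both helper
names are hypothetical.  The first helper pays process startup once per
item, the second starts one child and feeds it everything over stdin:

```python
import subprocess
import sys

# Stand-in child program (an assumption, not a git command):
# upper-case every line read from stdin.
CHILD = "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())\n"


def one_process_per_item(items):
    """Spawn a fresh child for every item: process startup is paid N times."""
    out = []
    for item in items:
        result = subprocess.run(
            [sys.executable, "-c", CHILD],
            input=item + "\n", capture_output=True, text=True, check=True,
        )
        out.append(result.stdout.rstrip("\n"))
    return out


def one_long_lived_process(items):
    """Spawn a single child and feed it all items over stdin."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input="".join(item + "\n" for item in items),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    names = ["todo", "master", "next"]
    print(one_process_per_item(names) == one_long_lived_process(names))
```

Both helpers return the same answers; only the number of processes
created differs, which is where the time goes in a loop.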

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16  3:30           ` Jon Smirl
@ 2006-10-17  3:52             ` Sam Vilain
  2006-10-17 12:59               ` Jon Smirl
  0 siblings, 1 reply; 1752+ messages in thread
From: Sam Vilain @ 2006-10-17  3:52 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Petr Baudis, Jakub Narebski, git

Jon Smirl wrote:
> cvsps works ok on small amounts of data, but it can't handle the full
> Mozilla repo. The current idea is to convert the full repo with
> cvs2git and build the ini file needed by cvsps to support incremental
> imports. After that use cvsps.
>   

Looking through the client.mk used to check out the sub-portions of the
CVS repository, I have to ask:

Why are you trying to import this big collection of projects into a
single git repository?

View git's repositories not as a container for an entire community's
code base, but more as object partitions.  Currently you are quite happy
to use the per-file version control partitions inherent to CVS.  Now you are
looking at removing all of the partitions completely and hoping to end
up with something manageable.  That it has been possible at all to fit it
into less space than a CD holds is staggering, but surely a
piecemeal approach would be a pragmatic solution to this problem.

Sam.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:55       ` Jakub Narebski
  2006-10-17  0:29       ` Luben Tuikov
@ 2006-10-17  4:24       ` Aaron Bentley
  2006-10-17  7:50         ` Andreas Ericsson
                           ` (3 more replies)
  2 siblings, 4 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17  4:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Mon, 16 Oct 2006, Aaron Bentley wrote:
>> Bazaar's namespace is "simple" because all branches can be named by a
>> URL, and all revisions can be named by a URL + a number.

> I pretty much _guarantee_ that a "number" is not a valid way to uniquely 
> name a revision in a distributed environment, though. I bet the "number" 
> really only names a revision in one _single_ repository, right?

Right.  That's why I said all revisions can be named by a URL + a
number, because it's the combination of the URL + a number that is
unique.  (In bzr, each branch has a URL.)

> In contrast, in git, a revision is a revision is a revision. 

I agree that a revision is a revision, but I don't think that's a
property unique to git. :-)

> If you give 
> the SHA1 name, it's well-defined even between different repositories, and 
> you can tell somebody that "revision XYZ is when the problem started", and 
> they'll know _exactly_ which revision it is, even if they don't have your 
> particular repository.

When two people have copies of the same revision, it's usually because
they are each pulling from a common branch, and so the revision in that
branch can be named.  Bazaar does use unique ids internally, but it's
extremely rare that the user needs to use them.

> Now _that_ is true simplicity. It does automatically mean that the names 
> are a bit longer, but in this case, "longer" really _does_ mean "simpler".
> 
> If you want a short, human-readable name, you _tag_ it. It takes all of a 
> hundredth of a second or so to do.

But tags have local meaning only, unless someone has access to your
repository, right?

>>> About "checkouts", i.e. working directories with repository elsewhere:
>>> you can use the GIT_DIR environment variable or the "git --git-dir" option,
>>> or symlinks, and if Nguyen Thai Ngoc Duy's proposal to have a .gitdir/.git
>>> "symref"-like file point to the repository passes, we can use that.
>> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
>> our meaning of the term.
> 
> Well, in the git world, it's really just one shared repository that has 
> separate branch-namespaces, and separate working trees (aka "checkouts"). 
> So yes, it probably matches what bazaar would call a checkout.

The key thing about a checkout is that it's stored in a different
location from its repository.  This provides a few benefits:

- - you can publish a repository without publishing its working tree,
  possibly using standard mirroring tools like rsync.

- - you can have working trees on local systems while having the
  repository on a remote system.  This makes it easy to work on one
  logical branch from multiple locations, without getting out of sync.

- - you can use a checkout to maintain a local mirror of a read-only
  branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).

> Almost nobody seems to actually use it that way in git - it's mostly more 
> efficient to just have five different branches in the same working tree, 
> and switch between them. When you switch between branches in git, git only 
> rewrites the part of your working tree that actually changed, so switching 
> is extremely efficient even with a large repo.

You can operate that way in bzr too, but I find it nicer to have one
checkout for each active branch, plus a checkout of bzr.dev.  Our switch
command also rewrites only the changed part of the working tree.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNFrv0F+nu1YWqI0RAgBHAJ9XpmdvuCNDysxFhnyeCmkEG/z0ggCggMsJ
WyW6lqGMokh0k0It1KOdgtk=
=L1SR
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  0:08         ` Linus Torvalds
  2006-10-17  0:24           ` Jakub Narebski
@ 2006-10-17  4:31           ` Aaron Bentley
  2006-10-19 19:01             ` Nathaniel Smith
  1 sibling, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17  4:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Jakub Narebski wrote:
>> Unless you have branch(es) with totally different contents, like git.git
>> 'todo' branch.
> 
> Yes. I have to say, that's likely a fairly odd case, and I wouldn't be 
> surprised if other VCS's don't support that mode of operation at _all_.

Bazaar also supports multiple unrelated branches in a repository, as
does CVS, SVN (depending how you squint), Arch, and probably Monotone.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNFy90F+nu1YWqI0RAgMeAJ99OikxXspSg+efnN6j3ySoPuOovQCfaKA6
yPCRw5Kl/V+ThnU6fsPA8TQ=
=DYAN
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:19     ` Jakub Narebski
  2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
@ 2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  5:20         ` Shawn Pearce
                           ` (3 more replies)
  2006-10-17  9:37       ` Robert Collins
  2 siblings, 4 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17  4:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
> (which constantly change) is a moving target.

Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
positive numbers to refer to the number of commits that have been made
since the branch was initialized.
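
A toy sketch of the two naming schemes, to show why one is a moving
target and the other is not.  The PARENTS table and all three helpers
are hypothetical, and the revno rule (1 = first commit on the branch)
is my assumption from the description above, not Bazaar's actual code:

```python
# Hypothetical history: commit -> list of parents (first parent first).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
}


def nth_ancestor(commit, n, parents=PARENTS):
    """Resolve '<commit>~<n>': follow the first parent n times."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit


def mainline(tip, parents=PARENTS):
    """First-parent chain, from the oldest commit up to tip."""
    chain = [tip]
    while parents[chain[-1]]:
        chain.append(parents[chain[-1]][0])
    return chain[::-1]


def by_revno(tip, revno, parents=PARENTS):
    """Positive revno counts from the start of the branch (1 = first commit)."""
    return mainline(tip, parents)[revno - 1]


if __name__ == "__main__":
    print(nth_ancestor("D", 1), by_revno("D", 3))    # branch tip is D
    grown = dict(PARENTS, E=["D"])                   # one more commit lands
    print(nth_ancestor("E", 1, grown), by_revno("E", 3, grown))
```

Counting back from the tip gives a different commit once the branch
grows, while counting forward from the start keeps naming the same one,
but only within that one branch.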

> One cannot have universally valid revision numbers (even
> only per branch) in distributed development. Subversion can do that only
> because it is a centralized SCM. Global numbering and a distributed nature
> don't mix... hence content-based sha1 as commit identifiers.

Sure.  Our UI approach is that unique identifiers can usefully be
abstracted away with a combination of URL + number, in the vast majority
of cases.

> But this doesn't matter much, because you can have really lightweight
> tags in git (especially now with packed refs support). So you can have
> the namespace you want.

The nice thing about revision numbers is that they're implicit-- no one
needs to take any action to update them, and so you can always use them.

> I wonder if any SCM other than git has an easy way to "rebase" a branch,
> i.e. cut a branch at its branching point, and transplant it to the tip
> of another branch. For example you work on the 'xx/topic' topic branch,
> and want to have the changes in that branch applied to current work,
> not to the version from some time ago when you started working on
> said feature.

If I understand correctly, in Bazaar, you'd just merge the current work
into 'xx/topic'.

> What your comparison matrix lacks, for example, is whether a given SCM
> saves information about branching points and merges, so you can
> get where two branches diverged, and when one branch was merged into
> another.

I'm not sure what you mean about divergence.  For example, Bazaar
records the complete ancestry of each branch, and determining the point
of divergence is as simple as finding the last common ancestor.  But are
you considering only the initial divergence?  Or if the branches merge
and then diverge again, would you consider that the point of divergence?
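
To make the question concrete, here is a small sketch of "last common
ancestor" on a commit graph.  The HISTORY table and both function names
are hypothetical; this is neither Bazaar's nor git's implementation,
just the plain graph definition:

```python
def ancestors(commit, parents):
    """Return commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def latest_common_ancestors(a, b, parents):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other, parents)
                   for other in common)
    }


# Hypothetical history: the X and Y lines fork at B, the Y line merges X1
# (commit M), and then both lines continue separately.
HISTORY = {
    "A": [], "B": ["A"],
    "X1": ["B"], "Y1": ["B"],
    "M": ["Y1", "X1"],
    "X2": ["X1"], "Y2": ["M"],
}

if __name__ == "__main__":
    print(latest_common_ancestors("X1", "Y1", HISTORY))  # before the merge
    print(latest_common_ancestors("X2", "Y2", HISTORY))  # after merging and diverging again
```

Before the merge the answer is the original fork point B.  After the
merge and renewed divergence it is X1, the commit that was merged, so
under this definition the point of divergence moves forward with each
merge.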

merge-point tracking is a prerequisite for Smart Merge, which does
appear on our matrix.

> Plugins = API + detection infrastructure + loading on demand.
> Git has an API, has a kind of detection infrastructure (for commands and
> merge strategies only), and doesn't have loading on demand. You can
> easily provide new commands (thanks to the git wrapper) and new merge
> strategies.

I'm not sure what you mean by API, unless you mean the commandline.  If
that's what you mean, surely all unix commands are extensible in that
regard.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNGKQ0F+nu1YWqI0RAsW+AJoDOsNRmBjo3raT43JL6qn7SuJNRwCfe9l5
oAZ9OyrxMQlHnwrruhcjz9Y=
=RNuG
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:45     ` Johannes Schindelin
  2006-10-17  2:40       ` Petr Baudis
@ 2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  5:25         ` Carl Worth
                           ` (3 more replies)
  2006-10-17  9:33       ` Robert Collins
  2 siblings, 4 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17  5:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johannes Schindelin wrote:
> On Mon, 16 Oct 2006, Aaron Bentley wrote:

>> Bazaar's namespace is "simple" because all branches can be named by a 
>> URL, and all revisions can be named by a URL + a number.
> 
> How should this cope with a distributed project? IOW how does it deal with 
> "this revision and that revision are exactly the same"?

There are two answers here.  One is that the URL + number is UI, not
internals.  A unique ID is used internally, so that can be compared.

But to fully ensure that there are no differences, i.e. that no one has
reused an ID, you can generate a revision testament.

> If I understand you correctly, you are claiming that you are not really 
> identifying a revision, but a revision _at a certain place with a 
> place-dependent number_. This conflicts with my understanding of a 
> revision.

No, I am claiming that a revision at a certain place with a
place-dependent number is one name for a revision, but it may have other
names.

>> If that's true of Git, then it certainly has a simple namespace.  Using 
>> eight-digit hex values doesn't sound simple to me, though.
> 
> It depends on your usage. If you want to do anything interesting, like 
> assure that you have the correct version, or assure that two different 
> people's tags actually tag the same revision, there is no simpler 
> representation.

I can use the 'bzr missing' command to check whether my branch is in
sync with a remote branch.  Or I can use the 'pull' command to update my
branch to a given revno in a remote branch.


>> That sounds right.  So those branches are persistent, and can be worked
>> on independently?
> 
> Of course! Persistence (and reliability) are the number one goal of git. 
> Performance is the next one.

You'd be surprised.  When we last spoke to the Mercurial team, Mercurial
didn't support multiple persistent branches in one repository.  Pulling
from a remote repository could join two branches into one.  I'm told
they're fixing that now.


>> You'll note we referred to that behavior on the page.  We don't think
>> what Git does is the same as supporting renames.  AIUI, some Git users
>> feel the same way.
> 
> Oh, we start another flamewar again?

I'd hope not.  It sounds as though you feel that supporting renames in
the data representation is *wrong*, and therefore it should be an insult
to you if we said that Git fully supported renames.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNGVq0F+nu1YWqI0RAsXiAJ9hjH2sQGG3E9oIYP2SxscXvVQsJACdHtkj
+r37JPSjbQCuchPo08P3px8=
=5MHE
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
@ 2006-10-17  5:20         ` Shawn Pearce
  2006-10-17  8:21           ` Martin Pool
  2006-10-17  8:15         ` Jakub Narebski
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-17  5:20 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Jakub Narebski wrote:
> > One cannot have universally valid revision numbers (even
> > only per branch) in distributed development. Subversion can do that only
> > because it is a centralized SCM. Global numbering and a distributed nature
> > don't mix... hence content-based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast majority
> of cases.

But this only works when the URL is public.  In Git I can just look up
the unique SHA1 for a revision in my private repository and toss it
into an email with a quick copy and paste.  With Bazaar it sounds
like I'd have to do that relative to some known public repository,
which just sounds like more work to me.

But I don't want to see this otherwise interesting thread devolve into
a "we do X better!" match so I'm not going to say anything further here.
 
> > I wonder if any SCM other than git has an easy way to "rebase" a branch,
> > i.e. cut a branch at its branching point, and transplant it to the tip
> > of another branch. For example you work on the 'xx/topic' topic branch,
> > and want to have the changes in that branch applied to current work,
> > not to the version from some time ago when you started working on
> > said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.

Git has two approaches:

 - merge: The two independent lines of development are merged
   together under a new single graph node.  This is a merge commit
   and has two parent pointers, one for each independent line of
   development which was combined into one.  Up to 16 independent
   lines can be merged at once, though 12 is the record.

 - rebase: The commits from one line of development are replayed
   onto a totally different line of development.  This is often
   used to reapply your changes onto the upstream branch after the
   upstream has changed but before you send your changes upstream.
   It can often generate more readable commit history.

I believe what you are talking about in Bazaar is the former (merge)
while what Jakub was talking about was the latter (rebase).
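
The two approaches can be sketched on a toy commit graph.  Everything
here is hypothetical (the Repo class, the 8-character ids, the way a
"change" is just a string); it only models the shape of the resulting
history, not how git applies patches:

```python
import hashlib


def commit_id(parents, change):
    """Toy commit id: a hash over the parent ids and the change description."""
    data = " ".join(parents) + "\n" + change
    return hashlib.sha1(data.encode()).hexdigest()[:8]


class Repo:
    def __init__(self):
        self.parents = {}   # commit id -> list of parent ids
        self.change = {}    # commit id -> change description

    def commit(self, parents, change):
        cid = commit_id(parents, change)
        self.parents[cid] = list(parents)
        self.change[cid] = change
        return cid

    def merge(self, ours, theirs):
        """Join two lines of development under one new commit with two parents."""
        return self.commit([ours, theirs], "merge")

    def rebase(self, topic_commits, onto):
        """Replay each topic commit's change on top of `onto`, oldest first."""
        tip = onto
        for cid in topic_commits:
            tip = self.commit([tip], self.change[cid])
        return tip


if __name__ == "__main__":
    repo = Repo()
    base = repo.commit([], "base")
    upstream = repo.commit([base], "upstream work")
    t1 = repo.commit([base], "topic 1")
    t2 = repo.commit([t1], "topic 2")
    print("merge parents:", repo.parents[repo.merge(upstream, t2)])
    print("rebased tip parents:", repo.parents[repo.rebase([t1, t2], upstream)])
```

The merge keeps both original lines and adds one node with two parents.
The rebase produces new commits (new ids, because the parents differ)
in a single line on top of upstream, which is why the history reads
more linearly afterwards.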
 
> > What your comparison matrix lacks, for example, is whether a given SCM
> > saves information about branching points and merges, so you can
> > get where two branches diverged, and when one branch was merged into
> > another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?

I believe you hit the nail on the head about what Jakub was talking
about.  And yes, I noticed it's in your matrix, but it's not very clear.
I think that some additional explanation there may help other
readers.
 
-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  5:08       ` Aaron Bentley
@ 2006-10-17  5:25         ` Carl Worth
  2006-10-17  5:31         ` Shawn Pearce
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-17  5:25 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Johannes Schindelin, Jakub Narebski, bazaar-ng, git

On Tue, 17 Oct 2006 01:08:59 -0400, Aaron Bentley wrote:
> >> If that's true of Git, then it certainly has a simple namespace.  Using
> >> eight-digit hex values doesn't sound simple to me, though.
> >
> > It depends on your usage. If you want to do anything interesting, like
> > assure that you have the correct version, or assure that two different
> > people's tags actually tag the same revision, there is no simpler
> > representation.
>
> I can use the 'bzr missing' command to check whether my branch is in
> sync with a remote branch.  Or I can use the 'pull' command to update my
> branch to a given revno in a remote branch.

I think you missed the simplicity of the git naming here. With git, I
can receive a bug report that specifies a bug that appears in a
revision such as:

	71037f3612da9d11431567c05c17807499ab1746

And since I have a commit object in my repository with that same name
I have a strong assurance that I am testing the identical software as
the bug reporter without me ever needing any access to pull from the
reporter's repository.

And this works in an entirely distributed fashion. Any two users can
be certain they are working with identical software on both ends by
exchanging and comparing a few bytes (in email, irc, bugzilla, what
have you), without any need to refer to a common repository which both
users have access to.
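
A rough sketch of why the name alone carries that assurance.  The object
id is the SHA1 of a short header plus the object's bytes; the header
layout below matches how git names objects, but toy_commit_body() is a
simplified, hypothetical commit layout (real commits also carry author
and committer lines):

```python
import hashlib


def git_object_id(obj_type, body):
    """SHA1 over '<type> <size>' + NUL byte + the raw object body."""
    header = ("%s %d" % (obj_type, len(body))).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def toy_commit_body(tree_id, parent_ids, message):
    """Simplified commit body: tree line, parent lines, blank line, message."""
    lines = ["tree " + tree_id] + ["parent " + p for p in parent_ids]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


if __name__ == "__main__":
    tree = git_object_id("tree", b"")
    first = git_object_id("commit", toy_commit_body(tree, [], "initial"))
    second = git_object_id("commit", toy_commit_body(tree, [first], "fix bug"))
    print(first)
    print(second)
```

Because each commit's bytes include its tree id and its parents' ids,
two people who compute the same commit name necessarily agree on the
content and on the whole history behind it, with no shared server
involved.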

-Carl


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  5:25         ` Carl Worth
@ 2006-10-17  5:31         ` Shawn Pearce
  2006-10-17  6:23         ` Junio C Hamano
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  3 siblings, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-17  5:31 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Johannes Schindelin, Jakub Narebski, bazaar-ng, git

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Johannes Schindelin wrote:
> > On Mon, 16 Oct 2006, Aaron Bentley wrote:
> >> You'll note we referred to that behavior on the page.  We don't think
> >> what Git does is the same as supporting renames.  AIUI, some Git users
> >> feel the same way.
> > 
> > Oh, we start another flamewar again?
> 
> I'd hope not.  It sounds as though you feel that supporting renames in
> the data representation is *wrong*, and therefore it should be an insult
> to you if we said that Git fully supported renames.

It would seem that the majority of folks on the Git list feel that
way, myself among them.  I don't know that we'd find it an insult
to say Git fully supports renames but I do think we have had better
results from *not* recording them and looking for them after the
fact with smart tools.

Junio's recent work with git-pickaxe (or whatever its name finally
settles out to be) is a perfect example of this.  Despite not having
"recorded renames" git-pickaxe is able to fairly accurately detect
blocks of code moving between files, of which renaming files is just
a special case.  This provides some fairly accurate blame reporting
pointing to exactly which commit/author/datetime put a given line
of code into the project.

No additional metadata required.  All existing repositories can
immediately benefit from the new tool.  Rather slick if you ask me.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  5:25         ` Carl Worth
  2006-10-17  5:31         ` Shawn Pearce
@ 2006-10-17  6:23         ` Junio C Hamano
  2006-10-17 18:52           ` J. Bruce Fields
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  3 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-17  6:23 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: git

Aaron Bentley <aaron.bentley@utoronto.ca> writes:

> Johannes Schindelin wrote:
>
>>> You'll note we referred to that behavior on the page.  We don't think
>>> what Git does is the same as supporting renames.  AIUI, some Git users
>>> feel the same way.
>> 
>> Oh, we start another flamewar again?
>
> I'd hope not.  It sounds as though you feel that supporting renames in
> the data representation is *wrong*, and therefore it should be an insult
> to you if we said that Git fully supported renames.

Not recording and not supporting are quite different things.

What we don't do is to _record_ renames in the data structure.
I personally would not use a word as strong as _wrong_ (and
Linus may disagree), but (1) we can support renames without
recording them just fine, (2) recording renames would not help
to tell users about line movements across files which we would
want to do, and (3) we are getting closer to coming up with a way
to even do (2) without recording renames.  Given these, perhaps
I might say recording renames is _pointless_ when I am in a good
mood.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:36             ` Johannes Schindelin
  2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
@ 2006-10-17  7:26             ` Christian MICHON
  2 siblings, 0 replies; 1752+ messages in thread
From: Christian MICHON @ 2006-10-17  7:26 UTC (permalink / raw)
  To: git

On 10/17/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Well, you can just add
>
>        [alias]
>                cat=-p cat-file -p
>
> to your ~/.gitconfig file, and you're there.

_WONDERFUL_. Really :)

-- 
Christian

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  4:24       ` Aaron Bentley
@ 2006-10-17  7:50         ` Andreas Ericsson
  2006-10-17 14:05           ` Aaron Bentley
  2006-10-17  8:30         ` Jakub Narebski
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-17  7:50 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Linus Torvalds wrote:
>> On Mon, 16 Oct 2006, Aaron Bentley wrote:
>>> Bazaar's namespace is "simple" because all branches can be named by a
>>> URL, and all revisions can be named by a URL + a number.
> 
>> I pretty much _guarantee_ that a "number" is not a valid way to uniquely 
>> name a revision in a distributed environment, though. I bet the "number" 
>> really only names a revision in one _single_ repository, right?
> 
> Right.  That's why I said all revisions can be named by a URL + a
> number, because it's the combination of the URL + a number that is
> unique.  (In bzr, each branch has a URL.)
> 

The revision will change between different repos though, so 
random-contributor A that doesn't have his repo publicised needs to send 
patches and can't log his exact problem revision somewhere, which makes 
it hard for random contributor B that runs into a similar problem but on 
a different project sometime later to find the offending code. I prefer 
the git way, but I'm a git user and probably biased.

That said, it shouldn't be impossible to add fixed, user-friendly 
bazaar-like revision numbers for git. We just have to reverse the
<committish>[^~]<number> syntax to also accept <committish>+<number>.

This would work marvelously with serial development but breaks horribly 
with merges unless the first (or last) commit on each new branch gets 
given a tag or some such.

Either way, I'm fairly certain both bazaar and git need to distribute 
information to the user in need of finding the revision (which url and 
which number vs which sha). I also imagine that the bazaar users, just 
like the git users, are sufficiently apt copy-paste people to never 
actually read the prerequisite information.

> 
>> If you give 
>> the SHA1 name, it's well-defined even between different repositories, and 
>> you can tell somebody that "revision XYZ is when the problem started", and 
>> they'll know _exactly_ which revision it is, even if they don't have your 
>> particular repository.
> 
> When two people have copies of the same revision, it's usually because
> they are each pulling from a common branch, and so the revision in that
> branch can be named.  Bazaar does use unique ids internally, but it's
> extremely rare that the user needs to use them.
> 

Well, if two people have the same revision in git, you *know* they have 
pulled from each other, because ALL objects are immutable. The point of 
"naming" the revision is moot, because it's something all SCMs can do.


>> Now _that_ is true simplicity. It does automatically mean that the names 
>> are a bit longer, but in this case, "longer" really _does_ mean "simpler".
>>
>> If you want a short, human-readable name, you _tag_ it. It takes all of a 
>> hundredth of a second or so to do.
> 
> But tags have local meaning only, unless someone has access to your
> repository, right?
> 

I imagine the bazaar-names with url+number only have local meaning unless 
someone has access to your repository too. One of the great benefits of 
git is that each revision is *always exactly the same* no matter in 
which repository it appears. This includes file-content, filesystem 
layout and, last but also most important, history.


>>>> About "checkouts", i.e. working directories with repository elsewhere:
>>>> you can use the GIT_DIR environment variable or the "git --git-dir" option,
>>>> or symlinks, and if Nguyen Thai Ngoc Duy's proposal to have a .gitdir/.git
>>>> "symref"-like file point to the repository passes, we can use that.
>>> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
>>> our meaning of the term.
>> Well, in the git world, it's really just one shared repository that has 
>> separate branch-namespaces, and separate working trees (aka "checkouts"). 
>> So yes, it probably matches what bazaar would call a checkout.
> 
> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:
> 
> - you can publish a repository without publishing its working tree,
>   possibly using standard mirroring tools like rsync.
> 

Can't all SCMs do this?

> - you can have working trees on local systems while having the
>   repository on a remote system.  This makes it easy to work on one
>   logical branch from multiple locations, without getting out of sync.
> 

This I'm not so sure about. Anyone want to fill in how shallow clones and 
all that jazz work?

> - you can use a checkout to maintain a local mirror of a read-only
>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> 

Check. Well, actually, you just clone it as usual but with the --bare 
argument and it won't write out the working tree files.

>> Almost nobody seems to actually use it that way in git - it's mostly more 
>> efficient to just have five different branches in the same working tree, 
>> and switch between them. When you switch between branches in git, git only 
>> rewrites the part of your working tree that actually changed, so switching 
>> is extremely efficient even with a large repo.
> 
> You can operate that way in bzr too, but I find it nicer to have one
> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
> command also rewrites only the changed part of the working tree.
> 

Works in git as well, but each "checkout" (actually, locally referenced 
repository clone) gets a separate branch/tag namespace.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  5:20         ` Shawn Pearce
@ 2006-10-17  8:15         ` Jakub Narebski
  2006-10-17  8:16         ` Andreas Ericsson
  2006-10-17  9:20         ` Jakub Narebski
  3 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17  8:15 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

On Tuesday, 17 October 2006 06:56, Aaron Bentley wrote:
> Jakub Narebski wrote:
> > Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
> > (which constantly change) is a moving target.
> 
> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> positive numbers to refer to the number of commits that have been made
> since the branch was initialized.
> 
> > One cannot have universally valid revision numbers (even
> > only per branch) in distributed development. Subversion can do that only
> > because it is centralized SCM. Global numbering and distributed nature
> > doesn't mix... hence contents based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast majority
> of cases.
> 
> > But this doesn't matter much, because you can have really lightweight
> > tags in git (especially now with packed refs support). So you can have
> > the namespace you want.
> 
> The nice thing about revision numbers is that they're implicit-- no one
> needs to take any action to update them, and so you can always use them.
> 
> > I wonder if any SCM other than git has easy way to "rebase" a branch,
> > i.e. cut branch at branching point, and transplant it to the tip
> > of other branch. For example you work on 'xx/topic' topic branch,
> > and want to have changes in those branch but applied to current work,
> > not to the version some time ago when you have started working on
> > said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.
> 
> > What your comparison matrix lacks for example is if given SCM
> > saves information about branching point and merges, so you can
> > get where two branches diverged, and when one branch was merged into
> > another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?
> 
> merge-point tracking is a prerequisite for Smart Merge, which does
> appear on our matrix.
> 
> > Plugins = API + detection infrastructure + loading on demand.
> > Git has API, has a kind of detection infrastructure (for commands and
> > merge strategies only), doesn't have loading on demand. You can
> > easily provide new commands (thanks to git wrapper) and new merge
> > strategies.
> 
> I'm not sure what you mean by API, unless you mean the commandline.  If
> that's what you mean, surely all unix commands are extensible in that
> regard.
> 
> Aaron
> 

* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  5:20         ` Shawn Pearce
  2006-10-17  8:15         ` Jakub Narebski
@ 2006-10-17  8:16         ` Andreas Ericsson
  2006-10-17 20:01           ` Aaron Bentley
  2006-10-17  9:20         ` Jakub Narebski
  3 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-17  8:16 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
>> (which constantly change) is a moving target.
> 
> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> positive numbers to refer to the number of commits that have been made
> since the branch was initialized.
> 

What do you do once a branch has been thrown away, or has had 20 other 
branches merged into it? Does the offset-number change for the revision 
then, or do you track branch-points explicitly?

>> One cannot have universally valid revision numbers (even
>> only per branch) in distributed development. Subversion can do that only
>> because it is centralized SCM. Global numbering and distributed nature
>> doesn't mix... hence contents based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast majority
> of cases.
> 
>> But this doesn't matter much, because you can have really lightweight
>> tags in git (especially now with packed refs support). So you can have
>> the namespace you want.
> 
> The nice thing about revision numbers is that they're implicit-- no one
> needs to take any action to update them, and so you can always use them.
> 
>> I wonder if any SCM other than git has easy way to "rebase" a branch,
>> i.e. cut branch at branching point, and transplant it to the tip
>> of other branch. For example you work on 'xx/topic' topic branch,
>> and want to have changes in those branch but applied to current work,
>> not to the version some time ago when you have started working on
>> said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.
> 

merge != rebase though, although they are indeed similar. Let's take the 
example of a 'master' branch and topic branch topicA. If you rebase 
topicA onto 'master', development will appear to have been serial. If 
you instead merge them, it will either register as a real merge or, if 
the branch tip of 'master' is the branch start-point of topicA, it will 
result in a "fast-forward" where 'master' is just updated to the 
branch-tip of 'topicA'.
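
The choice between a real merge and a fast-forward can be sketched over a
toy history graph. This is an illustration in Python, not git's
implementation; the commit names and the helpers is_ancestor and merge_kind
are hypothetical.

```python
def is_ancestor(parents, old, new):
    """True if commit `old` is reachable from commit `new` via parent links."""
    stack, seen = [new], set()
    while stack:
        current = stack.pop()
        if current == old:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False


def merge_kind(parents, master_tip, topic_tip):
    """Classify what merging topic_tip into master_tip would amount to."""
    if is_ancestor(parents, topic_tip, master_tip):
        return "nothing to merge"
    if is_ancestor(parents, master_tip, topic_tip):
        # master has not moved since the topic branch started
        return "fast-forward"
    return "real merge"


# D---E---F   with topic A---B forked from E
HISTORY = {"D": [], "E": ["D"], "F": ["E"], "A": ["E"], "B": ["A"]}

print(merge_kind(HISTORY, "E", "B"))  # fast-forward
print(merge_kind(HISTORY, "F", "B"))  # real merge
```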

>> What your comparison matrix lacks for example is if given SCM
>> saves information about branching point and merges, so you can
>> get where two branches diverged, and when one branch was merged into
>> another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?
> 
> merge-point tracking is a prerequisite for Smart Merge, which does
> appear on our matrix.
> 
>> Plugins = API + detection infrastructure + loading on demand.
>> Git has API, has a kind of detection infrastructure (for commands and
>> merge strategies only), doesn't have loading on demand. You can
>> easily provide new commands (thanks to git wrapper) and new merge
>> strategies.
> 
> I'm not sure what you mean by API, unless you mean the commandline.  If
> that's what you mean, surely all unix commands are extensible in that
> regard.
> 

I'm fairly certain he's talking about the API in the sense it's being 
talked about in every other application. Extensive work has been done to 
libify a lot of the git code, which means that most git commands are 
made up of less than 400 lines of C code, where roughly 80% of the code 
is command-specific (i.e., argument parsing and presentation).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: VCS comparison table
  2006-10-17  5:20         ` Shawn Pearce
@ 2006-10-17  8:21           ` Martin Pool
  0 siblings, 0 replies; 1752+ messages in thread
From: Martin Pool @ 2006-10-17  8:21 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Aaron Bentley, bazaar-ng, git, Jakub Narebski

On 17 Oct 2006, Shawn Pearce <spearce@spearce.org> wrote:
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > Jakub Narebski wrote:
> > > One cannot have universally valid revision numbers (even
> > > only per branch) in distributed development. Subversion can do that only
> > > because it is centralized SCM. Global numbering and distributed nature
> > > doesn't mix... hence contents based sha1 as commit identifiers.
> > 
> > Sure.  Our UI approach is that unique identifiers can usefully be
> > abstracted away with a combination of URL + number, in the vast majority
> > of cases.
> 
> But this only works when the URL is public.  In Git I can just lookup
> the unique SHA1 for a revision in my private repository and toss it
> into an email with a quick copy and paste.  

Yes, but then people need to know how to get it out of your private
repository.  For stuff that goes into well-known repositories I suppose
it just propagates.

> With Bazaar it sounds like I'd have to do that relative to some known
> public repository, which just sounds like more work to me.

You can also name a revision using its UUID, in which case things will
work similarly to git.  We tend to often say "in r1234 of dev".

> But I don't want to see this otherwise interesting thread devolve into
> a "we do X better!" match so I'm not going to say anything further here.

Sure.

> > > I wonder if any SCM other than git has easy way to "rebase" a branch,
> > > i.e. cut branch at branching point, and transplant it to the tip
> > > of other branch. For example you work on 'xx/topic' topic branch,
> > > and want to have changes in those branch but applied to current work,
> > > not to the version some time ago when you have started working on
> > > said feature.
> > 
> > If I understand correctly, in Bazaar, you'd just merge the current work
> > into 'xx/topic'.
> 
> Git has two approaches:
> 
>  - merge: The two independent lines of development are merged
>    together under a new single graph node.  This is a merge commit
>    and has two parent pointers, one for each independent line of
>    development which was combined into one.  Up to 16 independent
>    lines can be merged at once, though 12 is the record.
> 
>  - rebase: The commits from one line of development are replayed
>    onto a totally different line of development.  This is often
>    used to reapply your changes onto the upstream branch after the
>    upstream has changed but before you send your changes upstream.
>    It can often generate more readable commit history.
> 
> I believe what you are talking about in Bazaar is the former (merge)
> while what Jakub was talking about was the latter (rebase).

For the 'rebase' operation in Bazaar you can use 'bzr graft':

  http://spacepants.org/src/bzrgraft/

-- 
Martin

* Re: VCS comparison table
  2006-10-17  4:24       ` Aaron Bentley
  2006-10-17  7:50         ` Andreas Ericsson
@ 2006-10-17  8:30         ` Jakub Narebski
  2006-10-17 11:19           ` Matthieu Moy
       [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
  2006-10-17 15:03         ` Linus Torvalds
  3 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17  8:30 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
> Linus Torvalds wrote:

>> If you want a short, human-readable name, you _tag_ it. It takes all of a
>> hundredth of a second to do or so.
> 
> But tags have local meaning only, unless someone has access to your
> repository, right?

Tags are propagated during clone, and during fetch/pull (getting changes
from repository). So in that sense they are global.

If you don't publish your repository, then neither tags nor <URL>+<rev no>
mean anything to anybody outside your local private repository.
 

>> Well, in the git world, it's really just one shared repository that has
>> separate branch-namespaces, and separate working trees (aka "checkouts").
>> So yes, it probably matches what bazaar would call a checkout.
> 
> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:
> 
> - you can publish a repository without publishing its working tree,
>   possibly using standard mirroring tools like rsync.

git clone --bare
 
> - you can have working trees on local systems while having the
>   repository on a remote system.  This makes it easy to work on one
>   logical branch from multiple locations, without getting out of sync.

In git we usually use "git clone --local" (with repository database
hardlinked) or "git clone --shared"/"git clone --reference <repository>"
(which automatically sets up alternates, i.e. a file pointing to an
alternate repository database) for that. This way one gets his/her own refs
namespace, so two people can work on different branches simultaneously.

An alternate solution would be to symlink .git, or .git/objects (i.e.
the repository "database").

> - you can use a checkout to maintain a local mirror of a read-only
>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).

In git you can access contents _without_ checkout/working area.
For example gitweb (one of git's web interfaces) uses only repository
database and doesn't need checkout/working area.

>> Almost nobody seems to actually use it that way in git - it's mostly more
>> efficient to just have five different branches in the same working tree,
>> and switch between them. When you switch between branches in git, git only
>> rewrites the part of your working tree that actually changed, so switching
>> is extremely efficient even with a large repo.
> 
> You can operate that way in bzr too, but I find it nicer to have one
> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
> command also rewrites only the changed part of the working tree.

Luben (IIRC) works this way.

* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
                           ` (2 preceding siblings ...)
  2006-10-17  8:16         ` Andreas Ericsson
@ 2006-10-17  9:20         ` Jakub Narebski
  2006-10-17  9:40           ` Robert Collins
  2006-10-17  9:59           ` VCS comparison table Andreas Ericsson
  3 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17  9:20 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
>> (which constantly change) is a moving target.
> 
> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> positive numbers to refer to the number of commits that have been made
> since the branch was initialized.

How does that work with branching points, and with merges? For example,
in the case depicted below, how do you refer to the commit marked by X?

          ---- time --->

    --*--*--*--*--*--*--*--*--*-- <branch>
          \            /
           \-*--X--*--/

The branch it used to be on is gone...


Besides, in git a commit object has pointers (in the form of sha1 ids)
to all its parents. So <ref>^ (parent of <ref>), or <ref>^<m> (m-th
parent of <ref>), or <ref>~<n> (n-th parent in 1st-parent lineage
of <ref>) are natural, and fast. <ref>+<n> (which would make yet another
character forbidden in branch names) would need either a database mapping
serial numbers (per repository or per branch) to commit ids, or walking
the full history to look the commit up.

Branches in git are remembered not by their starting points, but by
their tips (ending points).
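
A small Python sketch of how these suffixes resolve when every commit only
stores pointers to its parents. It is a toy model, not git's code: the
commit ids are short placeholder strings rather than sha1 names, and the
helper names nth_parent and nth_ancestor are made up here.

```python
# Toy commit graph: each commit id maps to its ordered list of parent ids.
PARENTS = {
    "c1": [],
    "c2": ["c1"],
    "s1": ["c2"],        # commit on a side branch
    "c3": ["c2"],
    "m": ["c3", "s1"],   # merge: first parent c3, second parent s1
}


def nth_parent(commit, m=1):
    """<commit>^<m>: the m-th parent of a commit (counting from 1)."""
    return PARENTS[commit][m - 1]


def nth_ancestor(commit, n):
    """<commit>~<n>: follow the first parent n times."""
    for _ in range(n):
        commit = nth_parent(commit, 1)
    return commit


tip = "m"                    # a branch is just a name for its tip commit
print(nth_parent(tip, 2))    # s1
print(nth_ancestor(tip, 2))  # c2
```

Both lookups only walk backwards from the tip, which is why they need no
extra index; a forward-counting <ref>+<n> has no such pointer to follow.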

>> One cannot have universally valid revision numbers (even
>> only per branch) in distributed development. Subversion can do that
>> only because it is centralized SCM. Global numbering and distributed
>> nature doesn't mix... hence contents based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast
> majority of cases.

Git could do that too, by having a file (or files) mapping a serial number,
or a branch/tag plus serial number, to a commit id. But this would have
to be a local matter. It would also take some disk space, and would
seriously affect fetch performance (now git just downloads what it
doesn't have and dumps it into the repository database).

BTW, what if the repository is moved from one URL to another, for example
to a different host? Do all the "abstracted away" identifiers get
invalidated?

>> But this doesn't matter much, because you can have really lightweight
>> tags in git (especially now with packed refs support). So you can have
>> the namespace you want.
> 
> The nice thing about revision numbers is that they're implicit-- no one
> needs to take any action to update them, and so you can always use them.

Two words: post-commit hook. You can automate the action of adding tags
(especially now with packed refs, which means that we can have a huge number
of tags without affecting I/O performance or repository size).

>> I wonder if any SCM other than git has easy way to "rebase" a branch,
>> i.e. cut branch at branching point, and transplant it to the tip
>> of other branch. For example you work on 'xx/topic' topic branch,
>> and want to have changes in those branch but applied to current work,
>> not to the version some time ago when you have started working on
>> said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.

That is the alternate solution, but it would mean that the merge would be
recorded (unless you squash it). And for published branches (like 'next'
for example) it is the better solution, because rebase is in fact rewriting
history.

But rebase means that you had

                 A---B---C topic
                /
           D---E---F---G master

Rebasing 'topic' branch on top of master would mean that you would get

                         A'--B'--C' topic
                        /
           D---E---F---G master

where A', B', C' represent the same changeset as A, B, C up to resolved
conflicts.
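
Why A, B, C necessarily become new commits A', B', C' can be sketched in
Python with a toy naming scheme in which a commit's name is a hash over its
parent's name and its change. This is a simplification and an assumption
of the sketch: real git commit names also cover the tree, author,
committer and message. The helpers commit_id and build are hypothetical.

```python
import hashlib


def commit_id(parent_id, change):
    """Toy commit name: a hash over the parent's name and the change."""
    return hashlib.sha1(f"{parent_id}\n{change}".encode()).hexdigest()


def build(base_id, changes):
    """Apply changes in order on top of base_id; return the new commit ids."""
    ids, parent = [], base_id
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids


e = build("root", ["D", "E"])[-1]
master = build(e, ["F", "G"])                 # D---E---F---G
topic = build(e, ["A", "B", "C"])             # A---B---C forked from E
rebased = build(master[-1], ["A", "B", "C"])  # A'--B'--C' on top of G

# Same changes, different parents, so every replayed commit has a new name.
print(set(topic).isdisjoint(rebased))  # True
```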

And yes, that is "bzr graft"
  http://spacepants.org/src/bzrgraft/
equivalent. Do I understand correctly that this is a third-party
contribution?

>> What your comparison matrix lacks for example is if given SCM
>> saves information about branching point and merges, so you can
>> get where two branches diverged, and when one branch was merged into
>> another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?
> 
> merge-point tracking is a prerequisite for Smart Merge, which does
> appear on our matrix.

I was talking about point-of-divergence (branching point, fork point)
tracking, and merge-point tracking (or saving merge information).
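
As an illustration of what can be recovered when every commit records its
parents, here is a Python sketch that finds the latest common ancestor of
two tips, including the case where the branches merged and then diverged
again. It is a naive toy, not how any of the tools discussed implement it;
the graph and the helper names ancestors and merge_bases are assumptions
made for the example.

```python
def ancestors(parents, commit):
    """Return the set holding commit and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that no other common ancestor descends from."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


HISTORY = {
    "c1": [], "c2": ["c1"],
    "s1": ["c2"],          # side branch forks from c2
    "c3": ["c2"],
    "m": ["c3", "s1"],     # side branch merged back (merge point)
    "c4": ["m"],
    "s2": ["s1"],          # side branch continues after the merge
}

print(merge_bases(HISTORY, "c3", "s1"))  # {'c2'}
print(merge_bases(HISTORY, "c4", "s2"))  # {'s1'}
```

In this model the initial fork point (c2) is what you get before the merge,
and after the merge the answer moves forward to the merged commit (s1).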

>> Plugins = API + detection infrastructure + loading on demand.
>> Git has API, has a kind of detection infrastructure (for commands and
>> merge strategies only), doesn't have loading on demand. You can
>> easily provide new commands (thanks to git wrapper) and new merge
>> strategies.
> 
> I'm not sure what you mean by API, unless you mean the commandline.  If
> that's what you mean, surely all unix commands are extensible in that
> regard.

I mean API in the most common sense. 

For commands written in C it means "engine" (plumbing) functions and
data structures which do most work, so writing new command means some
command specific code and calling some functions to do the work.

For commands written in shell it means having versatile plumbing
commands (like for example git-rev-parse, git-rev-list, git-merge-base,
git-cat-file, etc.) which can be joined together including pipes
(--stdin option, --revs option to some commands), and git-sh-setup,
common git shell setup code. 

For commands written in Perl it means the same, with Git.pm module
instead of git-sh-setup.


About new command detection: if you put a program named git-<command>
in the directory with the rest of the git commands, then you can call it
as "git <command>" using the git wrapper. I think.
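
The lookup described above can be sketched in Python as follows. This only
illustrates the naming convention; which directories the real wrapper
searches, and in what order, is not covered here, so exec_dirs and the
function name find_subcommand are hypothetical.

```python
import os
import tempfile


def find_subcommand(name, exec_dirs):
    """Return the path of an executable called git-<name>, or None."""
    for directory in exec_dirs:
        candidate = os.path.join(directory, "git-" + name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Drop a new program into a command directory and look it up by its
# short name, the way "git hello" would be mapped to "git-hello".
with tempfile.TemporaryDirectory() as command_dir:
    script = os.path.join(command_dir, "git-hello")
    with open(script, "w") as handle:
        handle.write("#!/bin/sh\necho hello\n")
    os.chmod(script, 0o755)

    print(find_subcommand("hello", [command_dir]) == script)
    print(find_subcommand("no-such-command", [command_dir]))
```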

About adding new merge strategies: no autodetection, you would
have to add the new merge strategy to git-merge.sh.
-- 
Jakub Narebski
Poland

* Re: VCS comparison table
  2006-10-16 23:45     ` Johannes Schindelin
  2006-10-17  2:40       ` Petr Baudis
  2006-10-17  5:08       ` Aaron Bentley
@ 2006-10-17  9:33       ` Robert Collins
  2006-10-17  9:45         ` Jakub Narebski
  2 siblings, 1 reply; 1752+ messages in thread
From: Robert Collins @ 2006-10-17  9:33 UTC (permalink / raw)
  To: bazaar-ng@lists.canonical.com; +Cc: git

On Tue, 2006-10-17 at 01:45 +0200, Johannes Schindelin wrote:
> 
> If you really, really think about it: it makes much more sense to
> record 
> your intention in the commit message. So, instead of recording for
> _every_ 
> _single_ file in folder1/ that it was moved to folder2/, it is better
> to 
> say that you moved folder1/ to folder2/ _because of some special
> reason_!

Just a small nit here: bzr does /not/ record the move of every file: it
records the rename of folder1 to folder2. One piece of data is all that's
recorded - no new manifest for the subdirectory is needed.

Of course, a user can choose to move all the contents of a folder and
not the folder itself - it's up to the user.

By recording the folder rename rather than the contents rename, new files
added to folder1 in other branches come into folder2 automatically on
merge, without needing to do arbitrarily deep history processing
to determine that.

This also does not prevent us doing history analysis as well, to
determine other interesting things - such as cross file 'blame' as has
been mentioned in this thread. 

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

* Re: VCS comparison table
  2006-10-16 23:19     ` Jakub Narebski
  2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
  2006-10-17  4:56       ` Aaron Bentley
@ 2006-10-17  9:37       ` Robert Collins
       [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
  2006-10-17 10:06         ` Jakub Narebski
  2 siblings, 2 replies; 1752+ messages in thread
From: Robert Collins @ 2006-10-17  9:37 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

On Tue, 2006-10-17 at 01:19 +0200, Jakub Narebski wrote:
> 
> I wonder if any SCM other than git has easy way to "rebase" a branch,
> i.e. cut branch at branching point, and transplant it to the tip
> of other branch. For example you work on 'xx/topic' topic branch,
> and want to have changes in those branch but applied to current work,
> not to the version some time ago when you have started working on
> said feature. 

Precisely how does this rebase operate in git?
Does it preserve revision ids for the existing work, or do they all
change?


bzr has a graft plugin which walks one branch applying all its changes
to another, preserving the user's metadata but changing the uuids for
revisions.

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

* Re: VCS comparison table
  2006-10-17  9:20         ` Jakub Narebski
@ 2006-10-17  9:40           ` Robert Collins
  2006-10-17 10:08             ` Andreas Ericsson
  2006-10-17 16:41             ` Linus Torvalds
  2006-10-17  9:59           ` VCS comparison table Andreas Ericsson
  1 sibling, 2 replies; 1752+ messages in thread
From: Robert Collins @ 2006-10-17  9:40 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> 
>           ---- time --->
> 
>     --*--*--*--*--*--*--*--*--*-- <branch>
>           \            /
>            \-*--X--*--/
> 
> The branch it used to be on is gone...

In bzr 0.12 this is:
2.1.2

(assuming the first * is numbered '1'.)

These numbers are fairly stable. In particular, every revision in the
mainline keeps the same number in all the branches created from it at
that point in time, but a branch that initially creates a revision, or
obtains it before the mainline does, will have a different number for it
until it synchronises with the mainline via pull.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

* Re: VCS comparison table
  2006-10-17  9:33       ` Robert Collins
@ 2006-10-17  9:45         ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17  9:45 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Robert Collins wrote:

> On Tue, 2006-10-17 at 01:45 +0200, Johannes Schindelin wrote:
>> 
>> If you really, really think about it: it makes much more sense to record 
>> your intention in the commit message. So, instead of recording for _every_ 
>> _single_ file in folder1/ that it was moved to folder2/, it is better to 
>> say that you moved folder1/ to folder2/ _because of some special
>> reason_!
> 
> Just a small nit here: bzr does /not/ record the move of every file: it
> records the rename of folder1 to folder2. One piece of data is all that's
> recorded - no new manifest for the subdirectory is needed.
> 
> Of course, a user can choose to move all the contents of a folder and
> not the folder itself - it's up to the user.
> 
> By recording the folder rename rather than the contents rename, we get
> merges of new files added to folder1 in other branches come into folder2
> automatically, without needing to do arbitrarily deep history processing
> to determine that.

Hmmm... I wonder how well git manages that (merge with renamed directory).

  folder1/a  -->  folder2/a  --------> folder2/a
  folder1/b  -->  folder2/b       /    folder2/b
      \                          /     folder2/c
       \------->  folder1/a  ---/
                  folder1/b
                  folder1/c


I wonder how bzr manages "separate some files into subdirectory" (and how
well git does that), i.e. we have

   sub-file1
   sub-file2
   filea
   fileb

In the 'main' branch we separated "sub-*" files into subdirectory

   sub/file1
   sub/file2
   filea
   fileb

How would that merge with a new sub-* file added on the branch to be merged?

   sub-file1
   sub-file2
   sub-file3
   filea
   fileb


Or how does bzr manage sub-file movement, such as splitting a file into two,
or joining two files into one?


P.S. Is anyone working on a --follow option, for following renames under
path limiting?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
  2006-10-17  9:20         ` Jakub Narebski
  2006-10-17  9:40           ` Robert Collins
@ 2006-10-17  9:59           ` Andreas Ericsson
  1 sibling, 0 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-17  9:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski wrote:
> 
> About new command detection: if you put program named git-<command>
> in directory with the rest of git commands, then you can call it
> as "git <command>" using git wrapper. I think.
> 

Yup. The new command will also automagically appear in the "git help -a" 
output. Those two functions have been available since the C wrapper was 
born, although "git help -a" was the only available output for "command 
not found" until someone introduced the more newbie-friendly list that 
pops up nowadays.

* Re: VCS comparison table
       [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
  2006-10-17 10:01           ` Sean
@ 2006-10-17 10:01           ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 10:01 UTC (permalink / raw)
  To: Robert Collins; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git

On Tue, 17 Oct 2006 19:37:45 +1000
Robert Collins <robertc@robertcollins.net> wrote:

> Precisely how does this rebase operate in git ? 
> Does it preserve revision ids for the existing work, or do they all
> change?
> 
> bzr has a graft plugin which walks one branch applying all its changes
> to another preserving the user's metadata but changing the uuids for
> revisions. 

git rebase does exactly the same as you describe, including changing
the sha1 for each commit it moves.

Sean

* Re: VCS comparison table
  2006-10-17  9:37       ` Robert Collins
       [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
@ 2006-10-17 10:06         ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 10:06 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Robert Collins wrote:

> On Tue, 2006-10-17 at 01:19 +0200, Jakub Narebski wrote:
>> 
>> I wonder if any SCM other than git has easy way to "rebase" a branch,
>> i.e. cut branch at branching point, and transplant it to the tip
>> of other branch. For example you work on 'xx/topic' topic branch,
>> and want to have changes in those branch but applied to current work,
>> not to the version some time ago when you have started working on
>> said feature. 
> 
> Precisely how does this rebase operate in git ? 
> Does it preserve revision ids for the existing work, or do they all
> change?

Revision ids (commit ids) change of course. Therefore rebasing published
branches is not recommended, as it is in fact rewriting history.
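
A minimal Python sketch of why the ids have to change, under simplifying
assumptions: the toy commit_id() below hashes only a tree name, the parent ids
and the message (real git commits also record author and committer data), and
"old-base" / "new-tip" are hypothetical stand-ins for real commit ids. Because
the parent id is part of what gets hashed, replaying the same changes on a
different base yields different ids for every replayed commit.

```python
import hashlib

def commit_id(tree, parents, message):
    """Toy commit id: SHA-1 over the tree, the parent ids and the message."""
    payload = "tree %s\n" % tree
    payload += "".join("parent %s\n" % p for p in parents)
    payload += "\n" + message
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

def replay(commits, new_base):
    """Re-create (tree, message) commits on top of new_base, oldest first."""
    ids = []
    parent = new_base
    for tree, message in commits:
        parent = commit_id(tree, [parent], message)
        ids.append(parent)
    return ids

# Hypothetical two-commit topic branch, replayed on two different bases.
topic = [("tree-a", "add feature"), ("tree-b", "fix feature")]
old_ids = replay(topic, "old-base")
new_ids = replay(topic, "new-tip")
```

Even with identical trees and messages, old_ids and new_ids share no entry; in
a real rebase the trees usually differ as well, since the changes are applied
to newer content.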

It is, however, recommended to rebase before sending a _series_ of patches
(work on that series should be done on a topic branch), so that the patches
apply cleanly on top of current work. Or use StGit or another Quilt (patch
management) equivalent.

> bzr has a graft plugin which walks one branch applying all its changes
> to another preserving the users metadata but changing the uuids for
> revisions. 

It looks like "bzr graft" is the same as "git rebase". It can deal with
conflicts, can't it?


P.S. It looks like we have yet another terminology conflict. In git, "graft"
means "history graft", i.e. a file which overrides the parents of some commits.
For example, if we have a historical repository and a current repository, we
can join them together using grafts (otherwise we would need to rewrite
history, as the sha1 which serves as the commit id includes parent
information), e.g.

   x--*--*--*--*....x--*--*--*--*

    historical         current

where 'x' is 'root' (parentless) commit, '--' denotes parentship, and '....'
denotes "history graft".      
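
The graft idea can be sketched in Python as an overlay that is consulted before
the recorded parents. This is only an illustration: the commit names (h1..h3,
c1, c2) and the helper functions are hypothetical, and the dictionaries stand
in for the object database and the grafts file.

```python
def parents_of(commit, recorded, grafts):
    """Return the grafted parents if there is a graft, else the recorded ones."""
    return grafts.get(commit, recorded[commit])

def history(tip, recorded, grafts):
    """Walk first parents from tip back to a root commit."""
    out = []
    cur = tip
    while cur is not None:
        out.append(cur)
        ps = parents_of(cur, recorded, grafts)
        cur = ps[0] if ps else None
    return out

# Two separate histories: h1--h2--h3 (historical) and c1--c2 (current).
recorded = {"h1": [], "h2": ["h1"], "h3": ["h2"], "c1": [], "c2": ["c1"]}
# The graft pretends that the root commit c1 has h3 as its parent.
grafts = {"c1": ["h3"]}

without_graft = history("c2", recorded, {})
with_graft = history("c2", recorded, grafts)
```

The recorded parents are never modified, so no commit id changes; only the
history walk sees the joined line.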
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:40           ` Robert Collins
@ 2006-10-17 10:08             ` Andreas Ericsson
  2006-10-17 10:47               ` Matthieu Moy
  2006-10-18  4:55               ` Robert Collins
  2006-10-17 16:41             ` Linus Torvalds
  1 sibling, 2 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-17 10:08 UTC (permalink / raw)
  To: Robert Collins; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git

Robert Collins wrote:
> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
>>           ---- time --->
>>
>>     --*--*--*--*--*--*--*--*--*-- <branch>
>>           \            /
>>            \-*--X--*--/
>>
>> The branch it used to be on is gone...
> 
> In bzr 0.12 this is :
> 2.1.2
> 

Would it be a different number in a different version of bazaar?

> (assuming the first * is numbered '1'.)
> 
> These numbers are fairly stable, in particular everything's number in
> the mainline will be the same number in all the branches created from it
> at that point in time, but a branch that initially creates a revision or
> obtains it before the mainline will have a different number until they
> synchronise with the mainline via pull.
> 

So basically anyone can pull/push from/to each other but only so long as 
they decide upon a common master that handles synchronizing of the 
number part of the url+number revision short-hands?

One thing that's been nagging me is how you actually find out the 
url+number where the desired revision exists. That is, after you've 
synced with master, or merged the mothership's master-branch into one of 
your experimental branches where you've done some work that went before 
mothership's master's current tip, do you have to have access to the 
mothership's repo (as in, do you have to be online) to find out the 
number part of url+number shorthand, or can you determine it solely from 
what you have on your laptop?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
  2006-10-17 10:23           ` Sean
@ 2006-10-17 10:23           ` Sean
  2006-10-17 10:30             ` Johannes Schindelin
                               ` (3 more replies)
  1 sibling, 4 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 10:23 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

On Tue, 17 Oct 2006 00:24:15 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:
> 
> - - you can publish a repository without publishing its working tree,
>   possibly using standard mirroring tools like rsync.

Yeah, even in git you typically don't publish your working tree when
making it available for cloning.  In fact the native git network
protocol doesn't even have a way to transfer working trees.

> - - you can have working trees on local systems while having the
>   repository on a remote system.  This makes it easy to work on one
>   logical branch from multiple locations, without getting out of sync.

That is a very nice feature.  Git would be improved if it could
support that mode of operation as well.

> - - you can use a checkout to maintain a local mirror of a read-only
>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).

I'm not sure what you mean here.  A bzr checkout doesn't have any history
does it?  So it's not a mirror of a branch, but just a checkout of the
branch head?

If so, Git can export a tarball of a branch (actually a snapshot as at
any given commit) which can be mirrored out.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  2006-10-17 10:23           ` Sean
@ 2006-10-17 10:23           ` Sean
  2006-10-18  6:33           ` Jeff King
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 10:23 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Johannes Schindelin, Jakub Narebski, bazaar-ng, git

On Tue, 17 Oct 2006 01:08:59 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> I can use the 'bzr missing' command to check whether my branch is in
> sync with a remote branch.  Or I can use the 'pull' command to update my
> branch to a given revno in a remote branch.

The "bzr missing" command sounds like a handy one.  

Someone on the xorg mailing list was recently lamenting that git does not
have an easy way to compare a local branch to a remote one.  While this
turns out to not be a big problem in git, it might be nice to have such
a command.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
@ 2006-10-17 10:30             ` Johannes Schindelin
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
                                 ` (2 more replies)
  2006-10-17 19:51             ` Aaron Bentley
                               ` (2 subsequent siblings)
  3 siblings, 3 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-17 10:30 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, bazaar-ng, git

Hi,

On Tue, 17 Oct 2006, Sean wrote:

> On Tue, 17 Oct 2006 00:24:15 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> 
> > - - you can have working trees on local systems while having the
> >   repository on a remote system.  This makes it easy to work on one
> >   logical branch from multiple locations, without getting out of sync.
> 
> That is a very nice feature.  Git would be improved if it could
> support that mode of operation as well.

It would also make things slow as hell. How do you deal with something 
like annotate in such a setup?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
@ 2006-10-17 10:35                 ` Sean
  2006-10-17 10:35                 ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 10:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Aaron Bentley, bazaar-ng, git

On Tue, 17 Oct 2006 12:30:27 +0200 (CEST)
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> It would also make things slow as hell. How do you deal with something 
> like annotate in such a setup?

Some commands like annotate might not make any sense in such a setup.

But one way to get the same (perhaps even better) feature into git 
would be to support shallow clones, in which case even annotate would
continue to work even if somewhat crippled by the lack of a complete
history.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:30             ` Johannes Schindelin
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
@ 2006-10-17 10:45               ` Matthias Kestenholz
  2006-10-17 13:48               ` Aaron Bentley
  2 siblings, 0 replies; 1752+ messages in thread
From: Matthias Kestenholz @ 2006-10-17 10:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Sean, Aaron Bentley, bazaar-ng, git

Hi,

On Tue, 2006-10-17 at 12:30 +0200, Johannes Schindelin wrote:
> Hi,
> 
> On Tue, 17 Oct 2006, Sean wrote:
> 
> > On Tue, 17 Oct 2006 00:24:15 -0400
> > Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > 
> > > - - you can have working trees on local systems while having the
> > >   repository on a remote system.  This makes it easy to work on one
> > >   logical branch from multiple locations, without getting out of sync.
> > 
> > That is a very nice feature.  Git would be improved if it could
> > support that mode of operation as well.
> 
> It would also make things slow as hell. How do you deal with something 
> like annotate in such a setup?

You'd probably have to do all processing server-side (git log, blame,
merges... like in subversion, where you can merge and rename/move files
remotely, IIRC). Of course, all the things which make git really useful
for me (gitk, git log with all its arguments etc.) would not be
available. Cheap checkouts would be made possible easily that way at the
cost of higher server load and an abstraction layer over network for
object access.

I don't know if that sounds reasonable at all.

	Matthias

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:08             ` Andreas Ericsson
@ 2006-10-17 10:47               ` Matthieu Moy
  2006-10-18  4:55               ` Robert Collins
  1 sibling, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 10:47 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Robert Collins, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> Robert Collins wrote:
>> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
>>>           ---- time --->
>>>
>>>     --*--*--*--*--*--*--*--*--*-- <branch>
>>>           \            /
>>>            \-*--X--*--/
>>>
>>> The branch it used to be on is gone...
>>
>> In bzr 0.12 this is :
>> 2.1.2
>>
>
> Would it be a different number in a different version of bazaar?

I can't say for bzr 0.>12 which do not exist ;-)

For previous versions, it didn't have that "simple" number, and you
had to use the rev-id.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  8:30         ` Jakub Narebski
@ 2006-10-17 11:19           ` Matthieu Moy
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
                               ` (4 more replies)
  0 siblings, 5 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 11:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>> - you can use a checkout to maintain a local mirror of a read-only
>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>
> In git you can access contents _without_ checkout/working area.

Bazaar can do this too. For example,
"bzr cat http://something -r some-revision" gets the content of a file
at a given revision. But that's not what Aaron was referring to.

In Bazaar, checkouts can be two things:

1) a working tree without any history information, pointing to some
   other location for the history itself (a la svn/CVS/...).
   (this is "light checkout")

2) a bound branch. It's not _very_ different from a normal branch, but
   mostly "commit" behaves differently:
   - it commits both on the local and the remote branch (equivalent to
     "commit" + "push", but in a transactional way).
   - it refuses to commit if you're out of date with the branch you're
     bound to.
   (this is "heavy checkout")

In both cases, this has the side effect that you can't commit if the
"upstream" branch is read-only. That's not fundamental, but handy.

I use it for example to have several "checkouts" of the same branch on
different machines. When I commit, bzr tells me "hey, boss, you're out
of date, why don't you update first" if I'm out of date. And if commit
succeeds, I'm sure it is already committed to the main branch. I'm sure
I won't pollute my history with merges which would only be the result
of forgetting to update.

Once more, that's not fundamental, but handy.

The more fundamental thing I suppose is that it allows people to work
in a centralized way (checkout/commit/update/...), and Bazaar was
designed to allow several different workflows, including the
centralized one.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
@ 2006-10-17 11:38               ` Sean
  2006-10-17 12:03                 ` Matthieu Moy
  2006-10-17 11:38               ` Sean
  2006-10-21 14:13               ` Jan Hudec
  2 siblings, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-17 11:38 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 13:19:08 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> 1) a working tree without any history information, pointing to some
>    other location for the history itself (a la svn/CVS/...).
>    (this is "light checkout")

Git can do this from a local repository, it just can't do it from
a remote repo (at least over the git native protocol).  However,
over gitweb you can grab and unpack a tarball from a remote repo.
In practice this is probably enough support for such a feature.

> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")

This doesn't sound right, at least in the spirit of git.  Git really
wants to have a local commit which you may or may not push to a
remote repo at a later time.  There is no upside to forcing it all to
happen in one step, and a lot of downsides.  Git's focus is to support
distributed offline development, not requiring a remote repo to be
available at commit time.
 
> In both cases, this has the side effect that you can't commit if the
> "upstream" branch is read-only. That's not fundamental, but handy.

Again this seems really anti-git.  There is no reason for your local
branch to be marked read only just because some upstream branch is
so marked.

> I use it for example to have several "checkouts" of the same branch on
> different machines. When I commit, bzr tells me "hey, boss, you're out
> of date, why don't you update first" if I'm out of date. And if commit
> succeeds, I'm sure it is already committed to the main branch. I'm sure
> I won't pollute my history with merges which would only be the result
> of forgetting to update.

This is exactly the same in Git.  You really only ever push upstream
when your local changes fast-forward the remote (i.e. you're up to date).
Git will warn you if your changes don't fast forward the remote.
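
A minimal Python sketch of the fast-forward test, assuming the history is given
as a hypothetical dictionary mapping each commit to its parents: a push is a
fast-forward when the remote tip is already an ancestor of the local tip.

```python
def ancestors(tip, parents):
    """All commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen

def is_fast_forward(remote_tip, local_tip, parents):
    """True if moving the remote to local_tip loses no remote commits."""
    return remote_tip in ancestors(local_tip, parents)

# Hypothetical history: A--B--C, plus D committed on top of B elsewhere.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

up_to_date = is_fast_forward("B", "C", parents)  # remote at B, local at C
diverged = is_fast_forward("D", "C", parents)    # remote gained D meanwhile
```

In the diverged case the local side would first have to merge or rebase onto D
before the push becomes a fast-forward again.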
 
> The more fundamental thing I suppose is that it allows people to work
> in a centralized way (checkout/commit/update/...), and Bazaar was
> designed to allow several different workflows, including the
> centralized one.

While Git really isn't meant to work in a centralized way there's nothing
preventing such a work flow.  It just requires the use of some surrounding
infrastructure.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
@ 2006-10-17 11:45             ` Jakub Narebski
  2006-10-17 12:02               ` Jakub Narebski
                                 ` (2 more replies)
  2006-10-17 12:00             ` Andreas Ericsson
                               ` (2 subsequent siblings)
  4 siblings, 3 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 11:45 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> - you can use a checkout to maintain a local mirror of a read-only
>>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>>
>> In git you can access contents _without_ checkout/working area.
> 
> Bazaar can do this too. For example,
> "bzr cat http://something -r some-revision" gets the content of a file
> at a given revision. But that's not what Aaron was referring to.

Git cannot do that remotely yet (with the exception of
git-tar-tree/git-archive, which have a --remote option). But you can get
the contents of a file (with
"git cat-file -p [<revision>:|:<stage>:]<filename>"), list a
directory (with "git ls-tree <tree-ish>") and compare files or
directories (the git diff family of commands) without needing a working
directory.
 
AFAICT working area is required _only_ to resolve conflicts during 
merge.
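
A rough Python sketch of why no working area is needed for such lookups: file
contents live in a content-addressed object store and are reached through the
revision's tree. The blob id is computed the way git does it (SHA-1 over a
"blob <size>" header, a NUL byte and the content); the `trees` dictionary and
the cat_file() helper are simplified, hypothetical stand-ins for real tree
objects and for the "<revision>:<filename>" lookup.

```python
import hashlib

def blob_id(data):
    """Id of a blob: SHA-1 of b'blob <size>\\0' followed by the content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()

objects = {}  # toy object database: blob id -> content

def put_blob(data):
    """Store content under its id and return the id."""
    oid = blob_id(data)
    objects[oid] = data
    return oid

# Hypothetical revisions "v1" and "v2", each mapping file names to blob ids.
trees = {
    "v1": {"README": put_blob(b"first\n")},
    "v2": {"README": put_blob(b"second\n")},
}

def cat_file(spec):
    """Resolve '<revision>:<filename>' to file content, with no checkout."""
    rev, _, path = spec.partition(":")
    return objects[trees[rev][path]]

old_readme = cat_file("v1:README")
```

Nothing here touches files on disk outside the store, which is the point being
made above.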

> In Bazaar, checkouts can be two things:
> 
> 1) a working tree without any history information, pointing to some
>    other location for the history itself (a la svn/CVS/...).
>    (this is "light checkout")
> 
> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")

In git, by default the top directory of the working area holds a .git
directory which contains the whole repository (object database, refs (i.e.
branches and tags), information about which branch is current, the index
aka gitcache, configuration, etc.). You can share the object database
locally (which includes network filesystems).

You can have a .git directory (usually named <project>.git then) without a
working area.

And you can symlink (and in the future "symref"-link) the .git directory.

> In both cases, this has the side effect that you can't commit if the
> "upstream" branch is read-only. That's not fundamental, but handy.

There was a proposal to allow tracking branches to be marked
read-only, but it has not been implemented yet.

But git has the reverse check: it refuses (unless forced by the user) to
fetch into a branch which has local changes (i.e. when the fetch would not
fast-forward). This makes sure that no information is lost.

The idea is that you fetch changes into a tracking branch (e.g. the 'master'
branch of some parent remote repository into the 'origin' or
'remotes/<repository name>/master' branch); you don't commit changes to
such a branch. You do your own work either on the 'master' branch, then merge
(typically using "git pull") the corresponding 'origin' tracking branch, or
use a separate private feature branch and rebase after fetching.

[...]
> The more fundamental thing I suppose is that it allows people to work
> in a centralized way (checkout/commit/update/...), and Bazaar was
> designed to allow several different workflows, including the
> centralized one.

Git is designed for distributed workflows, not for a centralized one.
All repositories are created equal :-)

-- 
Jakub Narebski
ShadeHawk on #git and #revctl
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
  2006-10-17 11:45             ` Jakub Narebski
@ 2006-10-17 12:00             ` Andreas Ericsson
  2006-10-17 13:27               ` Matthieu Moy
  2006-10-17 14:19             ` Olivier Galibert
  2006-10-18  1:46             ` Petr Baudis
  4 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-17 12:00 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> - you can use a checkout to maintain a local mirror of a read-only
>>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>> In git you can access contents _without_ checkout/working area.
> 
> Bazaar can do this too. For example,
> "bzr cat http://something -r some-revision" gets the content of a file
> at a given revision. But that's not what Aaron was referring to.
> 
> In Bazaar, checkouts can be two things:
> 
> 1) a working tree without any history information, pointing to some
>    other location for the history itself (a la svn/CVS/...).
>    (this is "light checkout")
> 
> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")
> 

What about

3) getting the repo with all the history while still not having to be 
online to actually commit to *your* copy of the repo. When you later get 
online, you can send all your changes in a big hunk, or let bazaar email 
them to the maintainer as patches, or...

> In both cases, this has the side effect that you can't commit if the
> "upstream" branch is read-only. That's not fundamental, but handy.
> 

It appears we have different ideas of what's handy. Perhaps it's just a 
difference in workflow, or lack of "email-commits-as-patches" tools in 
bazaar, but the ability to commit to whatever branch I like in my local 
repo and then just send the diffs by email or please-pull requests to 
upstream authors is what makes git work so well for me. I can of course 
also pull the changes to another branch, or cherry-pick them one by one, 
or...

OTOH, if by "commit" you mean "send your changes back to central 
server", and bazaar'ish for "register my current set of changes in the 
local clone of the repo" is called something else, it sounds very 
similar to what git does.

> 
> The more fundamental thing I suppose is that it allows people to work
> in a centralized way (checkout/commit/update/...), and Bazaar was
> designed to allow several different workflows, including the
> centralized one.
> 

Centralized works in git too after a fashion. Most projects have a 
master repo hidden somewhere that frequently gets pushed out for 
publishing and which most (all?) contributors sync against from time to 
time, but it's by no means a certainty. What *is* a certainty is that 
the published branches are exactly identical to the ones in the master 
repo, and all the downstream authors will get a history where they can 
easily track master's development.

For git, I suppose Junio has the hidden master repo which he publishes 
at kernel.org. Linus does the same with the Linux repo.

On a side-note, it sounds as though the "bound branch" scenario 
encourages making a big change as one mega-diff, so long as it 
implements one feature, whereas the git workflow with topic-branches 
that eventually get merged to master allows changes to sort of 
accumulate up to a feature in the steps one actually has to take to make 
the feature work.

Side-note 2: Three really great things that have made work a lot easier 
and more enjoyable since we changed from cvs to git and that aren't 
mentioned in the comparison table:
* Dependency/history graph display tools à la qgit/gitk
* Bisection tool for finding bug introduction revisions.
* Tools for sending commits as emails.
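
The bisection idea in the list above can be sketched as a plain binary
search over a linear run of revisions. This is only a toy model, not how
git-bisect is implemented: the revision names and the is_bad() callback
are assumptions made up for illustration, and real history is a graph
rather than a list.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, the first one is
    known good, the last one is known bad, and once a revision is bad
    every later revision is bad as well.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the bug was introduced at mid or earlier
        else:
            lo = mid  # the bug was introduced after mid
    return revisions[hi]


# Hypothetical history r1..r10 where the bug first appears in r7.
revs = [f"r{i}" for i in range(1, 11)]
culprit = first_bad(revs, lambda r: int(r[1:]) >= 7)
```

Each test halves the remaining range, so ten revisions need only three
or four builds instead of up to nine.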

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:45             ` Jakub Narebski
@ 2006-10-17 12:02               ` Jakub Narebski
       [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
  2006-10-17 13:33               ` Matthieu Moy
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 12:02 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git

Jakub Narebski wrote:
> In git by default in the top directory of working area you have .git 
> directory which contains whole repository (object database, refs (i.e. 
> branches and tags), information which branch is current, index aka. 
> gitcache, configuration, etc.). You can share object database locally 
> (which includes network filesystem).
> 
> You can have .git (usually <project>.git then) directory without working 
> area.

So called "bare" repository.
> 
> And you can symlink (and in the future "symref"-link) .git directory.

And you can use the GIT_DIR environment variable or the --git-dir option
to the git wrapper.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:38               ` Sean
@ 2006-10-17 12:03                 ` Matthieu Moy
  2006-10-17 12:56                   ` Jakub Narebski
                                     ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 12:03 UTC (permalink / raw)
  To: Sean; +Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Sean <seanlkml@sympatico.ca> writes:

> On Tue, 17 Oct 2006 13:19:08 +0200
> Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
>
>> 1) a working tree without any history information, pointing to some
>>    other location for the history itself (a la svn/CVS/...).
>>    (this is "light checkout")
>
> Git can do this from a local repository, it just can't do it from
> a remote repo (at least over the git native protocol).  However,
> over gitweb you can grab and unpack a tarball from a remote repo.
> In practice this is probably enough support for such a feature.

Anyway, given the price of disk space today, this only makes sense if
you have a fast access to the repository (otherwise, you consider your
local repository as a cache, and you're ready to pay the disk space
price to save your bandwidth). In this case, it's often in your
filesystem (local or NFS).

>> 2) a bound branch. It's not _very_ different from a normal branch, but
>>    mostly "commit" behaves differently:
>>    - it commits both on the local and the remote branch (equivalent to
>>      "commit" + "push", but in a transactional way).
>>    - it refuses to commit if you're out of date with the branch you're
>>      bound to.
>>    (this is "heavy checkout")
>
> This doesn't sound right, at least in the spirit of git.  Git really
> wants to have a local commit which you may or may not push to a
> remote repo at a later time.  There is no upside to forcing it all to
> happen in one step, and a lot of downsides.  Git's focus is to support
> distributed offline development, not requiring a remote repo to be
> available at commit time.

I lied in my above description ;-).

I should have said "by default" ... but you have "commit --local" if
you want to have a local commit on a bound branch (at this point, I
should remind that not all branches are "bound branches". "bzr branch"
creates branches similar to git ones).

>> In both cases, this has the side effect that you can't commit if the
>> "upstream" branch is read-only. That's not fundamental, but handy.
>
> Again this seems really anti-git.  There is no reason for your local
> branch to be marked read only just because some upstream branch is
> so marked.

Well, take the example of my bzr setup.

I have one repository, say, $repo.

In it, I have one branch "$repo/bzr.dev" which is an exact mirror of
http://bazaar-vcs.org's branch.

I also have branches for patches (occasional in my case) that I'll
send to upstream. Say $repo/feature1, $repo/feature2, ...

If, by mistake, I start hacking on bzr.dev itself, I'll be warned at
commit time, create a branch, and commit in this new branch. I believe
git manages this in a different way, allowing you to commit in this
branch, and creating the branch next time you pull. But you know this
better than I ;-), I never got time to give a real try to git.

>> I use it for example to have several "checkouts" of the same branch on
>> different machines. When I commit, bzr tells me "hey, boss, you're out
>> of date, why don't you update first" if I'm out of date. And if commit
>> succeeds, I'm sure it is already committed to the main branch. I'm sure
>> I won't pollute my history with merges which would only be the result
>> of forgetting to update.
>
> This is exactly the same in Git.  You really only ever push upstream
> when your local changes fast forward the remote, (ie. you're up to date).
> Git will warn you if your changes don't fast forward the remote.

Yes, but you will have to do a merge at some point, right ? While I'm
keeping a purely linear history (not that it is good in the general
case, but for "projects" on which I'm the only developer, I find it
good. For example, my ${HOME}/etc/).

But don't get me wrong, I also prefer the decentralized way in most
cases. And I'm happy that bzr and git work like this by default. Just
that at least *I* have cases where a centralized approach suits me
better, and then I'm happy with that particular feature of bzr.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
  2006-10-17 12:07                 ` Sean
@ 2006-10-17 12:07                 ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 12:07 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthieu Moy, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 13:45:31 +0200
Jakub Narebski <jnareb@gmail.com> wrote:

> Git cannot do that remotely (with exception of git-tar-tree/git-archive 
> which has --remote option), yet. But you can get contents of a file 
> (with "git cat-file -p [<revision>:|:<stage>:]<filename>"), list 
> directory (with "git ls-tree <tree-ish>") and compare files or 
> directories (git diff family of commands) without need for working 
> directory.

Interesting, I didn't know about the --remote option.  So in fact as long
as the remote has enabled upload-tar then anyone can do a "light checkout".
However, it appears that kernel.org for instance doesn't enable this feature.

Sean
  

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:03                 ` Matthieu Moy
@ 2006-10-17 12:56                   ` Jakub Narebski
       [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 12:56 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Sean, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Matthieu Moy wrote:
>> This is exactly the same in Git.  You really only ever push upstream
>> when your local changes fast forward the remote, (ie. you're up to date).
>> Git will warn you if your changes don't fast forward the remote.
> 
> Yes, but you will have to do a merge at some point, right ? While I'm
> keeping a purely linear history (not that it is good in the general
> case, but for "projects" on which I'm the only developer, I find it
> good. For example, my ${HOME}/etc/).

Fast-forward doesn't result in merge.

If you have

  1---2---3        <branch 1, or branch locally>
           \
            4---5  <branch 2, or branch at remote>

then this is fast-forward case. After pull (or push) you have

  1---2---3---4---5 <branch 1>

without merge.
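
A minimal sketch of that check, assuming a toy history stored as a dict
from commit id to parent ids (the ids and the extra commit "x" are made
up for illustration; this is not Git's actual code). A branch can be
fast-forwarded exactly when its current tip is an ancestor of the tip it
is being updated to.

```python
from typing import Dict, List

# Hypothetical toy history: commit id -> list of parent ids.
# 1---2---3---4---5, plus a side commit "x" made on top of 3.
PARENTS: Dict[str, List[str]] = {
    "1": [],
    "2": ["1"],
    "3": ["2"],
    "4": ["3"],
    "5": ["4"],
    "x": ["3"],
}


def is_ancestor(parents: Dict[str, List[str]], ancestor: str, commit: str) -> bool:
    """True if `ancestor` is reachable from `commit` by following parents."""
    stack = [commit]
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents[current])
    return False


def can_fast_forward(parents: Dict[str, List[str]], local_tip: str, remote_tip: str) -> bool:
    """The local branch can simply be moved to remote_tip, with no merge
    commit, when local_tip is already part of remote_tip's history."""
    return is_ancestor(parents, local_tip, remote_tip)
```

With the history above, can_fast_forward(PARENTS, "3", "5") holds, while
a branch whose tip is "x" has diverged and would need a merge (or a
rebase) to take in 4 and 5.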

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
@ 2006-10-17 12:57                     ` Sean
  2006-10-17 13:44                       ` Matthieu Moy
  2006-10-17 12:57                     ` VCS comparison table Sean
  1 sibling, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-17 12:57 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 14:03:21 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> Anyway, given the price of disk space today, this only makes sense if
> you have a fast access to the repository (otherwise, you consider your
> local repository as a cache, and you're ready to pay the disk space
> price to save your bandwidth). In this case, it's often in your
> filesystem (local or NFS).

This is most likely the reason that people using Git don't clamor
more for the ability to work without a local repository.  Disk is cheap
and it just makes sense the vast majority of the time to have a complete
copy of the repository yourself.  There are a lot of powerful things
you can do once you have all that information in your repo.  Not the least
of which is performing any and all operations while flying on a plane
or sitting on a park bench.

> I should have said "by default" ... but you have "commit --local" if
> you want to have a local commit on a bound branch (at this point, I
> should remind that not all branches are "bound branches". "bzr branch"
> creates branches similar to git ones).

Well, with Git the default is to only commit locally.  Of course, you
could set your post commit hook to always push it to a remote if
you wanted to.

> Well, take the example of my bzr setup.
> 
> I have one repository, say, $repo.
> 
> In it, I have one branch "$repo/bzr.dev" which is an exact mirror of
> http://bazaar-vcs.org's branch.
> 
> I also have branches for patches (occasional in my case) that I'll
> send to upstream. Say $repo/feature1, $repo/feature2, ...
> 
> If, by mistake, I start hacking on bzr.dev itself, I'll be warned at
> commit time, create a branch, and commit in this new branch. I believe
> git manages this in a different way, allowing you to commit in this
> branch, and creating the branch next time you pull. But you know this
> better than I ;-), I never got time to give a real try to git.

Well, it's just a slight difference in perspective rather than any
big issue here.  Git treats all repositories as peers, so it would never
assume that just because one other particular repo has a branch marked
as read only that it should be marked read only locally.  It lets you
commit to it, and then push to say a third and fourth repo that are
writable as well.  In practice this doesn't really cause any
insurmountable problems.

> Yes, but you will have to do a merge at some point, right ? While I'm
> keeping a purely linear history (not that it is good in the general
> case, but for "projects" on which I'm the only developer, I find it
> good. For example, my ${HOME}/etc/).

Well if you're committing changes from multiple different machines,
how is that different from having say 3 different developers committing
changes to the central repo?  How does bzr avoid a merge when you're
pushing changes from 3 separate machines? 

You mentioned that if you try to push and you're not up to date you'll
be prompted to update (ie. pull from the upstream repo).  When you do such
a pull do your local changes get rebased on top or is there a merge?   By
your comments I guess you're saying they're rebased rather than merged, and
this is how you keep a linear history.  Git can do this easily, but it's
not done by default.
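
To make the rebase-versus-merge distinction concrete, here is a rough
sketch on a toy single-parent history. The commit ids, the dict layout
and the priming of replayed ids are all assumptions for illustration; a
real rebase re-applies each change as a patch and can hit conflicts,
which this model ignores.

```python
from typing import Dict, List, Optional


def history(parents: Dict[str, Optional[str]], tip: str) -> List[str]:
    """Commits from the root up to `tip`, following single-parent links."""
    out = []
    current: Optional[str] = tip
    while current is not None:
        out.append(current)
        current = parents[current]
    return out[::-1]


def rebase(parents: Dict[str, Optional[str]], local_tip: str, upstream_tip: str) -> str:
    """Replay the commits that exist only on the local branch on top of
    upstream_tip and return the new local tip.

    Replayed commits are recorded in `parents` under new ids (the old id
    plus a prime), mirroring the fact that a rebased commit is a new
    commit with a different parent.
    """
    upstream = set(history(parents, upstream_tip))
    to_replay = [c for c in history(parents, local_tip) if c not in upstream]
    new_tip = upstream_tip
    for commit in to_replay:
        new_id = commit + "'"
        parents[new_id] = new_tip
        new_tip = new_id
    return new_tip


# Upstream is 1-2-3; local work a-b was started from 2.
toy = {"1": None, "2": "1", "3": "2", "a": "2", "b": "a"}
new_tip = rebase(toy, local_tip="b", upstream_tip="3")
linear = history(toy, new_tip)
```

After the rebase the local branch is a straight line on top of upstream,
so pushing it is a fast-forward and no merge commit is recorded.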

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  3:52             ` Sam Vilain
@ 2006-10-17 12:59               ` Jon Smirl
  0 siblings, 0 replies; 1752+ messages in thread
From: Jon Smirl @ 2006-10-17 12:59 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Petr Baudis, Jakub Narebski, git

On 10/16/06, Sam Vilain <sam@vilain.net> wrote:
> Jon Smirl wrote:
> > cvsps works ok on small amounts of data, but it can't handle the full
> > Mozilla repo. The current idea is to convert the full repo with
> > cvs2git and build the ini file needed by cvsps to support incremental
> > imports. After that use cvsps.
> >
>
> Looking through the client.mk used to check out the sub-portions of the
> CVS repository, I have to ask;
>
> Why are you trying to import this big collection of projects into a
> single git repository?

All of Mozilla is in a single CVS repo, client.mk is checking out
directories from the mozilla project. This is how it has been
historically for over ten years. It also allows commits that
simultaneously go to all subcomponents when interfaces are changed.
Even if it was split into different git repos you still need to
download about 70% of them to build the browser.

I've been trying to simply translate the existing repo without
changing its structure in any way. Changing structure is going to
require a lot of buy-in from all of the developers.

>
> View git's repositories not as a container for an entire community's
> code base, but more as object partitions.  Currently you are quite happy
> to use per-file version control partitions inherent to CVS.  Now you are
> looking at removing all of the partitions completely and hoping to end
> up with something manageable.  That it has been possible at all to fit it
> into the space less than the size of a CD is staggering, but surely a
> piecemeal approach would be a pragmatic solution to this problem.
>
> Sam.
>


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:00             ` Andreas Ericsson
@ 2006-10-17 13:27               ` Matthieu Moy
  2006-10-17 13:55                 ` Jakub Narebski
  2006-10-17 14:01                 ` Andreas Ericsson
  0 siblings, 2 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 13:27 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> What about
>
> 3) getting the repo with all the history while still not having to be
> online to actually commit to *your* copy of the repo. When you later
> get online, you can send all your changes in a big hunk, or let bazaar
> email them to the maintainer as patches, or...

Well, the discussion was about checkouts, so I was talking about
checkouts ;-).

What you mention is the default behavior of Bazaar when you use 
"bzr branch" or "bzr get". BTW, it's also possible to do this with a
heavy checkout, that's "commit --local".

> It appears we have different ideas of what's handy. Perhaps it's just
> a difference in workflow, or lack of "email-commits-as-patches" tools
> in bazaar,

You have "bzr bundle" in Bazaar, and there was work to have it
actually send the email ( http://bazaar-vcs.org/SubmitByMail ), but I
don't think it's finished yet.

And yes, this is a great feature, the first time I used it was with
Darcs, and I was impressed how easy I could submit a patch without any
setup and with a 5-lines tutorial. Even wiki seems complex after
that ;-).

> but the ability to commit to whatever branch I like in my local repo
> and then just send the diffs by email or please-pull requests to
> upstream authors is what makes git work so well for me.

Sure. Once again, Bazaar does it this way too. There's an _additional
feature_ called checkout which allows you to work in another way,
though. Like most features, it's not useful to everybody.

And I repeat that I'm in no way arguing against the git model :-).

> Side-note 2: Three really great things that have made work a lot
> easier and more enjoyable since we changed from cvs to git and that
> aren't mentioned in the comparison table:

Sure. And regarding this, hopefully, most modern VCSes go in the same
direction.

> * Dependency/history graph display tools à la qgit/gitk

http://bazaar-vcs.org/bzr-gtk
http://samba.org/~jelmer/bzr/bzrk.png

> * Bisection tool for finding bug introduction revisions.

This took time to come in bzr, but that's the bisect plugin:

http://bazaar-vcs.org/PluginRegistry

> * Tools for sending commits as emails.

(Surprisingly, I had added this to the table, but it has been removed for
some obscure reasons)

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:45             ` Jakub Narebski
  2006-10-17 12:02               ` Jakub Narebski
       [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
@ 2006-10-17 13:33               ` Matthieu Moy
  2 siblings, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 13:33 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

> But git has reverse check: it forbids (unless forced by user) to fetch 
> into branch which has local changes (does not fast-forward).

Same as bzr then I believe. "bzr pull" will suggest you to use "merge"
in this situation, unless you say "pull --overwrite".

>> The more fundamental thing I suppose is that it allows people to work
>> in a centralized way (checkout/commit/update/...), and Bazaar was
>> designed to allow several different workflows, including the
>> centralized one.
>
> Git is designed for distributed workflows, not for centralized one.
> All repositories are created equal :-)

Note that "bound branches" and "other branches" in bzr are not so
different. The "master" (the one you make a checkout of) doesn't have
to know it has checkouts, and the "checkout" just has one file
pointing to the "master", and you can switch from one flow to the
other with "bzr bind/unbind".

So, in Bazaar, all repositories are /almost/ created equal ;-).

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:57                     ` Sean
@ 2006-10-17 13:44                       ` Matthieu Moy
       [not found]                         ` <20061017100150.b4919aac.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 13:44 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Sean <seanlkml@sympatico.ca> writes:

>> Yes, but you will have to do a merge at some point, right ? While I'm
>> keeping a purely linear history (not that it is good in the general
>> case, but for "projects" on which I'm the only developer, I find it
>> good. For example, my ${HOME}/etc/).
>
> Well if you're committing changes from multiple different machines,
> how is that different from having say 3 different developers committing
> changes to the central repo?

The workflow is different.

If I commit broken changes on a repository shared by multiple
developers, they'll insult me, and they'll be right. While I find
nothing wrong in committing broken changes to my ${HOME}/etc/ when
leaving the office, and fix it from home.

> How does bzr avoid a merge when you're pushing changes from 3
> separate machines?

Err, the same way people have been doing for years ;-). If you don't
have local commits, "bzr update" will work in the same way as "cvs
update", it keeps your local changes, without recording history. Like
"git pull" does if you have uncommitted changes I think.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:30             ` Johannes Schindelin
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
  2006-10-17 10:45               ` Matthias Kestenholz
@ 2006-10-17 13:48               ` Aaron Bentley
  2 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 13:48 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Sean, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johannes Schindelin wrote:
> On Tue, 17 Oct 2006, Sean wrote:
>>Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

>>>- - you can have working trees on local systems while having the
>>>  repository on a remote system.  This makes it easy to work on one
>>>  logical branch from multiple locations, without getting out of sync.
>>
>>That is a very nice feature.  Git would be improved if it could
>>support that mode of operation as well.
> 
> 
> It would also make things slow as hell. How do you deal with something 
> like annotate in such a setup?

For the particular case of annotate, bzr is designed to store
annotations at commit time.  So annotate should require remote access to
a small amount of data from two files-- not a great cost.

But our default form of checkout contains a local copy of all history
data, so that readonly operations happen at local speed.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNN8Y0F+nu1YWqI0RAqXtAJ4qKGQ5ZwlMF795kz3udeuRTcRy6wCghr53
tjw9cNVxzrQ0XSUO2v52ZIo=
=W6q7
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 13:27               ` Matthieu Moy
@ 2006-10-17 13:55                 ` Jakub Narebski
  2006-10-17 14:08                   ` Matthieu Moy
  2006-10-18 18:03                   ` Jeff Licquia
  2006-10-17 14:01                 ` Andreas Ericsson
  1 sibling, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 13:55 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Matthieu Moy wrote:
>> Side-note 2: Three really great things that have made work a lot
>> easier and more enjoyable since we changed from cvs to git and that
>> aren't mentioned in the comparison table:
> 
> Sure. And regarding this, hopefully, most modern VCSes go in the same
> direction.
> 
> > * Dependency/history graph display tools à la qgit/gitk
> 
> http://bazaar-vcs.org/bzr-gtk
> http://samba.org/~jelmer/bzr/bzrk.png

Hmmm... most of the tools look similar. Git has gitk (Tcl/Tk, now in 
git.git repository), QGit (Qt), GitView (GTK+, in contrib/), 
git-browser (JavaScript, uses High Performance JavaScript Graphics 
Library by Walter Zorn, http://www.walterzorn.com, for graphics).

Tig (Text-mode Interface for Git, ncurses) also has, in its git version, 
a kind of history graph drawn in ASCII art.


That is a very important tool to have for any SCM which allows (and 
encourages) nonlinear history development.
 
>> * Bisection tool for finding bug introduction revisions.
> 
> This took time to come in bzr, but that's the bisect plugin:
> 
> http://bazaar-vcs.org/PluginRegistry

Hmmm... I wonder which SCM had it first.
 
>> * Tools for sending commits as emails.
> 
> (Surprisingly, I had added this in the table, but has been removed for
> some obscure reasons)

While email can be used to exchange patches (git-format-patch to 
generate patches, git-send-email to send them if you don't want to 
use an ordinary email client, git-am to apply them), it cannot be used 
to exchange all information (one cannot send, for example, tags or 
merge commits).

It is a very useful tool to have for an "accidental" developer. You 
don't have to have a constant on-line presence in the form of a web 
server or git server somewhere for sending pull requests (although the 
http://repo.or.cz public git repo hosting can help with that), and you 
don't need access (ssh, perhaps limited, or WebDAV) to push to somebody 
else's repository; you can just send email to some mailing list.

BTW, git can provide a binary patch for binary files (e.g. adding a 
favicon for gitweb in git.git).


Other often and not-so-often used tools include:
 * git-rerere - Reuse recorded resolve (of merge conflicts)
 * reflog - Records where a given branch was at a given time (no UI yet)
 * git-diff -S'text' aka. pickaxe - find commits which added or removed
   given 'text'; and other revision limiters
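
As a rough illustration of what the pickaxe looks for, the sketch below
walks a made-up sequence of file versions and reports the commits in
which the number of occurrences of the text changes. The commit ids and
file contents are hypothetical, and the real option works on diffs
between commits rather than on a prepared list like this.

```python
from typing import List, Tuple


def pickaxe(versions: List[Tuple[str, str]], text: str) -> List[str]:
    """Return ids of commits that changed how often `text` occurs.

    `versions` lists (commit id, file content) pairs from oldest to
    newest; the file is treated as empty before the first commit.
    """
    hits = []
    previous_count = 0
    for commit_id, content in versions:
        count = content.count(text)
        if count != previous_count:
            hits.append(commit_id)  # text was added or removed here
        previous_count = count
    return hits


# Hypothetical history of one file.
versions = [
    ("c1", "int main() {}"),
    ("c2", "int main() { foo(); }"),
    ("c3", "int main() { foo(); bar(); }"),
    ("c4", "int main() { bar(); }"),
]
changed = pickaxe(versions, "foo()")
```

Here the call to foo() is added in c2 and removed in c4; c3 touches the
same line but leaves the count unchanged, so it is not reported.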

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 13:27               ` Matthieu Moy
  2006-10-17 13:55                 ` Jakub Narebski
@ 2006-10-17 14:01                 ` Andreas Ericsson
  2006-10-17 14:24                   ` Matthieu Moy
  1 sibling, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-17 14:01 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Matthieu Moy wrote:
> Andreas Ericsson <ae@op5.se> writes:
> 
>> What about
>>
>> 3) getting the repo with all the history while still not having to be
>> online to actually commit to *your* copy of the repo. When you later
>> get online, you can send all your changes in a big hunk, or let bazaar
>> email them to the maintainer as patches, or...
> 
> Well, the discussion was about checkouts, so I was talking about
> checkouts ;-).
> 

Differences in nomenclature are really messing this discussion up. In 
git, a "checkout" is the act of pulling objects from the object database 
into the working tree. I.e., the act of "clothing" a "bare" repository.


>> but the ability to commit to whatever branch I like in my local repo
>> and then just send the diffs by email or please-pull requests to
>> upstream authors is what makes git work so well for me.
> 
> Sure. Once again, Bazaar does it this way too. There's an _additional
> feature_ called checkout which allows you to work in another way,
> though. As most "feature", it's not useful to everybody.
> 

Now I'm really confused. Does bazaar have both "clone" (git-style 
fetching a full repo and all the branches) and "checkout" (cvs-style 
fetching only the working tree)?

> 
>> Side-note 2: Three really great things that have made work a lot
>> easier and more enjoyable since we changed from cvs to git and that
>> aren't mentioned in the comparison table:
> 
> Sure. And regarding this, hopefully, most modern VCSes go in the same
> direction.
> 
>> * Dependency/history graph display tools à la qgit/gitk
> 
> http://bazaar-vcs.org/bzr-gtk
> http://samba.org/~jelmer/bzr/bzrk.png
> 
>> * Bisection tool for finding bug introduction revisions.
> 
> This took time to come in bzr, but that's the bisect plugin:
> 
> http://bazaar-vcs.org/PluginRegistry
> 
>> * Tools for sending commits as emails.
> 
> (Surprisingly, I had added this in the table, but has been removed for
> some obscure reasons)
> 

Merge-conflict with the webpage? ;-)

However, I know that bazaar has many of these features. I was merely 
commenting on the absence of these killer-features in the table. It 
might help people pick the right scm for their project, which is always 
a Good Thing(tm).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                         ` <20061017100150.b4919aac.seanlkml@sympatico.ca>
  2006-10-17 14:01                           ` Sean
@ 2006-10-17 14:01                           ` Sean
  2006-10-17 14:19                             ` Matthieu Moy
  1 sibling, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-17 14:01 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 15:44:36 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> > How does bzr avoid a merge when you're pushing changes from 3
> > separate machines?
> 
> Err, the same way people have been doing for years ;-). If you don't
> have local commits, "bzr update" will work in the same way as "cvs
> update", it keeps your local changes, without recording history. Like
> "git pull" does if you have uncommitted changes, I think.

Ah, okay.  Well Git can definitely manage this.  Just means you have to
rebase any local changes before pushing.  This will keep the history
linear and make sure that no merges are needed in the case you were asking
about.

So far, it sounds to me like bazaar and git are more alike than they are
different.  Each has a few commands the other doesn't but all in all
they sound very similar.  But I'm a Git fanboy so I ain't switching
now ;o)

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  7:50         ` Andreas Ericsson
@ 2006-10-17 14:05           ` Aaron Bentley
       [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
  2006-10-17 15:05             ` Andreas Ericsson
  0 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 14:05 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> Aaron Bentley wrote:

>> When two people have copies of the same revision, it's usually because
>> they are each pulling from a common branch, and so the revision in that
>> branch can be named.  Bazaar does use unique ids internally, but it's
>> extremely rare that the user needs to use them.
>>
> 
> Well, if two people have the same revision in git, you *know* they have
> pulled from each other

No, you don't.  They may have each pulled from a different repository.

Take revision 00aabbcc, created by Linus.  Linus has it because he
committed it.  I have it because I pulled Linus' repository.  You have
it because Andrew Morton pulled Linus' repository, and you pulled Andrew
Morton's repository.

>> But tags have local meaning only, unless someone has access to your
>> repository, right?
>>
> 
> I imagine the bazaar-names with url+number only has local meaning unless
> someone has access to your repository too.

Yes.  That phrasing was from Linus' description of revnos.

> One of the great benefits of
> git is that each revision is *always exactly the same* no matter in
> which repository it appears. This includes file-content, filesystem
> layout and, last but also most important, history.

In Bazaar, a revision id always refers to the same logical entity, but
it may be stored in different formats in different repositories.

>> - - you can publish a repository without publishing its working tree,
>>   possibly using standard mirroring tools like rsync.
>>
> 
> Can't all scm's do this?

With most SCMs that store the repository in the root of the tree,
disentangling the tree and repository requires care.  OTOH, this is just
as easy with Arch, CVS and SVN as it is with Bazaar.

>> - - you can use a checkout to maintain a local mirror of a read-only
>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>>
> 
> Check. Well, actually, you just clone it as usual but with the --bare
> argument and it won't write out the working tree files.

No, I *want* the working tree files.  I run bzr from a checkout of bzr.dev.

>> You can operate that way in bzr too, but I find it nicer to have one
>> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
>> command also rewrites only the changed part of the working tree.
>>
> 
> Works in git as well, but each "checkout" (actually, locally referenced
> repository clone) gets a separate branch/tag namespace.

In our terminology, if it can diverge from the original, it's a branch,
not a checkout.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNOM10F+nu1YWqI0RAvNUAJwN/QviOs+sUuN9ep4Otyrgax9SmwCfSH7t
XdxOxo7smshNlzU3qoxq6Nw=
=nxsM
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 13:55                 ` Jakub Narebski
@ 2006-10-17 14:08                   ` Matthieu Moy
  2006-10-17 14:41                     ` Jakub Narebski
  2006-10-18 18:03                   ` Jeff Licquia
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 14:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

> While email can be used to exchange patches (git-format-patch to 
> generate patches, git-send-email to send patches if you don't want to 
> use ordinary email client, git-am to apply patches) it cannot be used 
> to exchange all information (one cannot send for example tags, or merge 
> commits).

In bzr, the "bundle" looks like a patch, but it actually carries the
same information as the revision(s) it contains (I believe this
applies to hg and Darcs too). A bundle can be used almost like a
branch. That's a key point, since revision identity is not based on
content's hash, so applying a patch is very different from merging a
bundle.

> It is a very useful tool to have for the "accidental" developer.

That's the key point, but patch review for non-accidental developers
is also good :-).

> BTW. git can provide binary patch for binary files (e.g. adding favicon 
> for gitweb in git.git).

Bazaar's bundles use base64 encoding for binaries. I don't think that's
an efficient binary diff (xdelta-like) though. Aaron has been fighting
quite a lot with MUAs and MTAs mangling the patches (line endings in
particular) ...

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:01                           ` Sean
@ 2006-10-17 14:19                             ` Matthieu Moy
       [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 14:19 UTC (permalink / raw)
  To: bazaar-ng, git

Sean <seanlkml@sympatico.ca> writes:

> Ah, okay.  Well Git can definitely manage this.  Just means you have to
> rebase any local changes before pushing.  This will keep the history
> linear and make sure that no merges are needed in the case you were asking
> about.

Sure. As I said before, the little add-on of checkouts is that you say
once "I don't want to do local commits here", and bzr reminds you of this
each time you commit. Well, where it can make a difference is that it
does it in a transactional way, that is, you don't have that little
window between the time you pull and the time you push your next
commit. But this would really be bad luck ;-).

> So far, it sounds to me like bazaar and git are more alike than they are
> different.  Each have a few commands the other doesn't but all in all
> they sound very similar.

Sure. And at least, if you want to prove that your decentralized SCM
is the best, you'd better look at features other than the ability to
commit on a local branch ;-). If you want a _real_ flamewar, better
talk about rename management or revision identity.

The thing is that most people migrated from CVS/svn, so they found
their new SCM to be incredibly better than the existing one. But it's generally
not _so_ much better than the other modern alternatives ;-). (and
don't forget to thank Darcs and Monotone who brought most of the good
ideas you and I are using)

> But i'm a Git fanboy so I aint switching now ;o)

Probably not going to switch either, but that might happen.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
                               ` (2 preceding siblings ...)
  2006-10-17 12:00             ` Andreas Ericsson
@ 2006-10-17 14:19             ` Olivier Galibert
  2006-10-17 15:37               ` Matthieu Moy
  2006-10-18  1:46             ` Petr Baudis
  4 siblings, 1 reply; 1752+ messages in thread
From: Olivier Galibert @ 2006-10-17 14:19 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, Oct 17, 2006 at 01:19:08PM +0200, Matthieu Moy wrote:
> I use it for example to have several "checkouts" of the same branch on
> different machines. When I commit, bzr tells me "hey, boss, you're out
> of date, why don't you update first" if I'm out of date.

You're not telling us bzr still follows the utterly stupid
update-before-commit model, right?  Right?

  OG.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:01                 ` Andreas Ericsson
@ 2006-10-17 14:24                   ` Matthieu Moy
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 14:24 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> Now I'm really confused. Does bazaar have both "clone" (git-style
> fetching a full repo and all the branches) and "checkout" (cvs-style
> fetching only the working tree)?

Yes, it has both. That's "bzr branch" (git clone) and "bzr checkout"
(cvs checkout).

Difference between "bzr branch" and "git clone" is that bzr doesn't
fetch all the branches. It fetches one "branch" (succession of
revisions) with all the ancestors of the revisions of the branch.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
@ 2006-10-17 14:34               ` Sean
  0 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 14:34 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Andreas Ericsson, Linus Torvalds, Jakub Narebski, bazaar-ng, git

On Tue, 17 Oct 2006 10:05:41 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:


> No, you don't.  They may have each pulled from a different repository.
> 
> Take revision 00aabbcc, created by Linus.  Linus has it because he
> committed it.  I have it because I pulled Linus' repository.  You have
> it because Andrew Morton pulled Linus' repository, and you pulled Andrew
> Morton's repository.

Well his point was that they have pulled from each other directly or
indirectly.  You can safely say that rev 00aabbcc.. in _any_ repository
is the same rev.  This discussion started because of doubt expressed
by some here on the list that the "simple" numbering scheme used by
bzr can offer the same guarantee.  That is, rev 1.2.1 may be completely
different commits in different repos in bazaar.
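
To make the "same name means same revision" guarantee concrete, here is
a minimal Python sketch. git_blob_id() follows the way git names blob
objects; toy_commit_id() is a simplified, hypothetical commit layout
(real commits also hash author, committer and dates), so treat it as an
illustration of the idea rather than git's actual format.

```python
import hashlib


def git_blob_id(content):
    """SHA-1 name of a blob: hash of a 'blob <size>' header, a NUL, and the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_id(tree_id, parent_ids, message):
    """Toy commit name: a hash over the tree, the parent names, and the message.

    Assumption: simplified layout for illustration, not git's real commit format.
    """
    lines = ["tree " + tree_id]
    lines += ["parent " + p for p in parent_ids]
    body = ("\n".join(lines) + "\n\n" + message).encode("utf-8")
    return hashlib.sha1(body).hexdigest()


tree = git_blob_id(b"")  # stand-in for a real tree object

# Two repositories that hold the same history compute the same names.
root_a = toy_commit_id(tree, [], "initial")
root_b = toy_commit_id(tree, [], "initial")
child_a = toy_commit_id(tree, [root_a], "fix bug")
child_b = toy_commit_id(tree, [root_b], "fix bug")

# Same content and message, but a different history: a different name.
other = toy_commit_id(tree, [], "fix bug")

print(child_a == child_b)  # the name does not depend on the repository
print(child_a != other)    # the name does depend on the ancestry
```

Because the parents' names are part of what gets hashed, a name pins
down the whole history behind it, not just the file contents.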
 
> With most SCMs that store the repository in the root of the tree,
> disentangling the tree and repository requires care.  OTOH, this is just
> as easy with Arch, CVS and SVN as it is with Bazaar.

Just in case it wasn't clear, this is drop dead easy in Git too.

> No, I *want* the working tree files.  I run bzr from a checkout of bzr.dev.

Why?  Uncommitted changes shouldn't be propagated.  Once you have cloned
the repo, you can checkout your own copy of the working tree files.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:08                   ` Matthieu Moy
@ 2006-10-17 14:41                     ` Jakub Narebski
  2006-10-18  0:00                       ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 14:41 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> While email can be used to exchange patches (git-format-patch to 
>> generate patches, git-send-email to send patches if you don't want to 
>> use ordinary email client, git-am to apply patches) it cannot be used 
>> to exchange all information (one cannot send for example tags, or
>> merge commits).
> 
> In bzr, the "bundle" looks like a patch, but it actually carries the
> same information as the revision(s) it contains (I believe this
> applies to hg and Darcs too). A bundle can be used almost like a
> branch. That's a key point, since revision identity is not based on
> content's hash, so applying a patch is very different from merging a
> bundle.

The patch generated by git-format-patch has author information (in the 
"From:" header), the original commit date (in the "Date:" header), the 
commit message (first line in "Subject:", the rest in the message body), 
a place for comments which are not to be included in the commit message, 
a diffstat for easier patch review, and a git extended diff (with 
information about rename detection, mode changes, and 7-character 
abbreviations of file content identifiers). It does not record parent 
information, the original committer and committer date, which branch we 
are on, etc. You can quite easily provide an ordering of patches.

Sending patches via email means the first line of the commit message 
cannot begin with text in brackets (the subject usually is "[PATCH] 
Commit description" or "[PATCH n/m] Commit description", and that 
bracketed prefix is stripped when the patch is applied), and it enforces 
the git convention that a commit message consists of a first line 
describing the commit briefly, separated by an empty line from the 
longer description and signoff lines.
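
As a rough illustration of how the commit fields map onto the mail, here
is a small Python sketch that pulls them back out of a patch email. The
sample message, the address and the commit_fields() helper are made up
for this example; it is not how git-am is implemented, only the same
idea in miniature.

```python
import re
from email import message_from_string

# Hypothetical patch mail in the shape described above.
PATCH_MAIL = """\
From: A U Thor <author@example.com>
Date: Tue, 17 Oct 2006 16:41:00 +0200
Subject: [PATCH 2/3] gitweb: add favicon

Longer description of the change.

Signed-off-by: A U Thor <author@example.com>
---
 gitweb/gitweb.perl | 1 +
"""


def commit_fields(raw):
    """Recover author, date, summary line and message body from a patch mail."""
    msg = message_from_string(raw)
    # Drop a leading "[PATCH]" or "[PATCH n/m]" marker from the subject.
    summary = re.sub(r"^\[PATCH[^\]]*\]\s*", "", msg["Subject"])
    # Everything after the "---" line is diffstat and diff, not commit message.
    body = msg.get_payload().split("\n---\n", 1)[0].strip()
    return {
        "author": msg["From"],
        "date": msg["Date"],
        "summary": summary,
        "body": body,
    }


fields = commit_fields(PATCH_MAIL)
print(fields["summary"])
```

Note what is absent: nothing in the mail names the parent commit or the
committer, which is why a patch applied this way becomes a new commit.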

"Bundle" equivalent, although binary in nature, would be thin pack.
 
>> It is a very useful tool to have for the "accidental" developer.
> 
> That's the key point, but patch review for non-accidental developers
> is also good :-).

How very true...
 
>> BTW. git can provide binary patch for binary files (e.g. adding
>> favicon for gitweb in git.git).
> 
> Bazaar's bundles use base64 encoding for binaries. I don't think that's
> an efficient binary diff (xdelta-like) though. Aaron has been fighting
> quite a lot with MUAs and MTAs mangling the patches (line endings in
> particular) ...

If I remember correctly, the git binary diff format is xdiff based, and 
uses a kind of ascii85 encoding (as in PostScript).
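
To put a number on the encoding overhead alone (leaving the delta
question aside), here is a short Python comparison. It uses the standard
library's b85encode, which the Python documentation describes as the
base85 variant used in git-style binary diffs; the 60-byte payload is an
arbitrary example.

```python
import base64

payload = bytes(range(60))  # arbitrary binary data, 60 bytes

b64 = base64.b64encode(payload)  # 3 bytes in -> 4 characters out
b85 = base64.b85encode(payload)  # 4 bytes in -> 5 characters out

print(len(payload), len(b64), len(b85))

# Both encodings are lossless and mail-safe ASCII.
assert base64.b64decode(b64) == payload
assert base64.b85decode(b85) == payload
```

So base85 costs about 25% extra where base64 costs about 33%; how
compact the patch is overall depends far more on whether a delta is
sent instead of the whole file.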

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  4:24       ` Aaron Bentley
                           ` (2 preceding siblings ...)
       [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
@ 2006-10-17 15:03         ` Linus Torvalds
  3 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17 15:03 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> 
> But tags have local meaning only, unless someone has access to your
> repository, right?

Ehh. Exactly like the bzr numbers? You have to have access to the original 
repo to name it.

So your point is?

If you do

	git log v2.6.17

in a kernel repository, you'll see exactly what I see - because you'll 
have gotten the tags, aka the "easy revision names".

Now, I'm obviously biased, but the thing is, git really does do this 
right. No meaningless numbers. You give _meaningful_ revision names, and 
they can be extremely powerful.

And no, it's not just tags or the raw SHA1 numbers. You can do 
relationships like

	git log HEAD~5..

which means "show the log for everything since five parents ago" (which is 
_not_ the same as "show the last five revisions", because one of them may 
have been a merge, and brought in a lot more of new commits).
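
The difference is easy to see on a toy history. The Python sketch below
models commits as a dict from name to parent list (names are made up)
and computes "tip~n..tip" as everything reachable from the tip but not
from its n-th first-parent ancestor; it is a model of the semantics, not
of git's implementation.

```python
def ancestors(parents, tip):
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def nth_first_parent(parents, tip, n):
    """What 'tip~n' names: follow the first parent n times."""
    for _ in range(n):
        tip = parents[tip][0]
    return tip


def since(parents, tip, n):
    """Commits in 'tip~n..tip': reachable from tip but not from tip~n."""
    base = nth_first_parent(parents, tip, n)
    return ancestors(parents, tip) - ancestors(parents, base)


# Mainline m0..m5; m3 is a merge that brings in side branch s1..s3.
parents = {
    "m0": [],
    "m1": ["m0"],
    "m2": ["m1"],
    "s1": ["m0"],
    "s2": ["s1"],
    "s3": ["s2"],
    "m3": ["m2", "s3"],
    "m4": ["m3"],
    "m5": ["m4"],
}

log = since(parents, "m5", 5)
print(sorted(log))  # five mainline commits plus the three merged ones
```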

Or, you can say

	git diff mybranch@{2.days.ago}..nextbranch

which says exactly what you'd read it as: show the diff between what 
"mybranch" looked like 2 days ago and what "nextbranch" looks like right 
now.

Or, since the namespace is the same for commit history _and_ for actual 
file contents, and since some commands don't need commits, you can decide 
to name not a revision, but a specific file or subdirectory in a revision, 
and do things like

	git -p grep -1 request_irq v2.6.17~2:drivers/char

where the "revision" is not a commit revision at all, it's a _tree_ 
revision, because we've looked up the revision for "v2.6.17~2" (which 
means "the grandparent of the tag 2.6.17"), and then within that commit we 
looked up the tree "drivers/char", and then we grepped (recursively) for 
the string "request_irq" within that subtree (with one line of context), 
and then we paginated the output through "less" (or whatever your pager is 
set to).

In other words, yes, the above does _exactly_ what you'd expect it to do.

The fact is, nobody ever uses the SHA1 names directly in their normal 
work. You'd use the branch names, tag-names, or some relationship operator 
like "this long ago" or "the parent of" or similar.

The only time you use actual SHA1 names is when you tell somebody _else_ 
something. Or when you use "gitk" to look something up, and select a 
commit, and then paste that commit name into "git show" (which is 
obviously telling "somebody else" - it's communicating between two 
programs).

There's simply no reason to ever use the SHA1 names directly normally. But 
they are there, and they are the _real_ revision numbers, and they 
actually have real meaning between different repositories.

So that "git grep" example above is actually 100% equivalent to

	git -p grep -1 request_irq 3ff4e205e1

but why would I ever write that? That's just insane. But in case you care, 
the way I got that "3ff4e205e1" number, it was just by doing

	git rev-parse v2.6.17~2:drivers/char

and cutting-and-pasting the first ten hex-digits to  make sure I had 
enough of a name to make it unique.
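
Picking "enough hex digits to be unique" can be sketched in a few lines
of Python. The object names below are invented 40-character strings and
shortest_unique_prefix() is a hypothetical helper, not a git function;
git itself resolves abbreviations against the objects it actually has.

```python
def shortest_unique_prefix(name, all_names, minimum=4):
    """Shortest prefix of name (at least `minimum` long) that no other name shares."""
    for length in range(minimum, len(name) + 1):
        prefix = name[:length]
        matches = sum(1 for other in all_names if other.startswith(prefix))
        if matches == 1:
            return prefix
    return name


# Made-up object names, padded to 40 hex digits.
names = [
    "a1b2c3d4" + "0" * 32,
    "a1b2c3ff" + "0" * 32,
    "ffee0011" + "0" * 32,
]

print(shortest_unique_prefix(names[0], names))
print(shortest_unique_prefix(names[2], names))
```

The more objects a repository holds, the longer the prefix has to be,
which is why ten digits is a comfortable margin for pasting into mail.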

So the SHA1 names always exist, and they are what git _internally_ uses, 
but you'd normally not use them that much in your daily life. 

They are great for explaining things, though. For example, when somebody 
reports a bug, and has used "git bisect" to figure out where the bug 
started happening, that's when the "real name" matters - since we normally 
didn't tag that commit as being buggy when we created it ;)

So that's when you'd say: "I bisected the problem, and it started 
happening in commit 0123456789abcdef". And now everybody with a git 
repository of the kernel can just look it up locally by 
cutting-and-pasting that one number.
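
For readers who have not used it, the idea behind bisection fits in a
few lines of Python. This sketch assumes a linear history and a made-up
is_bad() test; real "git bisect" also copes with merges, so take it as
the principle only.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by binary search.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (commit, number_of_tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


# Hypothetical history of 100 commits where the bug arrives in c37.
commits = ["c%d" % i for i in range(100)]


def is_bad(commit):
    return int(commit[1:]) >= 37


culprit, tests = first_bad(commits, is_bad)
print(culprit, tests)  # a handful of tests instead of a hundred
```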

> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:

Actually, git does something even better.

Git allows the repository to be split up.

You can get a git repository on a CD or DVD, and do

	git clone -l -s /mount/cdrom myrepo

and that "-s" means that the new "myrepo" actually is linked to the 
original CDROM repository, and you can now _commit_ stuff and make changes 
in myrepo, even though all the old history is on that CD-ROM. It won't add 
any unnecessary stuff at all to the new repo.
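
Roughly, the new repository keeps its own object store and falls back
to the shared one for anything it does not have. A toy Python model of
that lookup, with an invented ObjectStore class and made-up object
names, could look like this:

```python
class ObjectStore:
    """Toy object database with an optional read-through alternate."""

    def __init__(self, objects=None, alternate=None, read_only=False):
        self.objects = dict(objects or {})
        self.alternate = alternate
        self.read_only = read_only

    def write(self, name, data):
        if self.read_only:
            raise PermissionError("store is read-only")
        self.objects[name] = data

    def read(self, name):
        # Look locally first, then fall back to the alternate store.
        if name in self.objects:
            return self.objects[name]
        if self.alternate is not None:
            return self.alternate.read(name)
        raise KeyError(name)


cdrom = ObjectStore({"old-commit": b"history on the CD"}, read_only=True)
myrepo = ObjectStore(alternate=cdrom)

myrepo.write("new-commit", b"my local change")

print(myrepo.read("old-commit"))  # served from the CD
print(sorted(myrepo.objects))     # only the new object lives locally
```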

Or, you could do the "totally naked" checkout, so that the whole 
repository is somewhere else (if that "somewhere else" is the CD-ROM, you 
obviously cannot change anything ;)

Or you can have <n> different repositories that are all related, and all 
contain just the part that _they_ care about.

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:05           ` Aaron Bentley
       [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
@ 2006-10-17 15:05             ` Andreas Ericsson
  2006-10-17 15:32               ` Matthieu Moy
  2006-10-17 19:44               ` Aaron Bentley
  1 sibling, 2 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-17 15:05 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Andreas Ericsson wrote:
>> Aaron Bentley wrote:
> 
>>> When two people have copies of the same revision, it's usually because
>>> they are each pulling from a common branch, and so the revision in that
>>> branch can be named.  Bazaar does use unique ids internally, but it's
>>> extremely rare that the user needs to use them.
>>>
>> Well, if two people have the same revision in git, you *know* they have
>> pulled from each other
> 
> No, you don't.  They may have each pulled from a different repository.
> 

I realized it as I read it now. What I meant was that you know you have 
the exact same revision as the original author once committed.

> 
>>> But tags have local meaning only, unless someone has access to your
>>> repository, right?
>>>
>> I imagine the bazaar-names with url+number only has local meaning unless
>> someone has access to your repository too.
> 
> Yes.  That phrasing was from Linus' description of revnos.
> 
>> One of the great benefits of
>> git is that each revision is *always exactly the same* no matter in
>> which repository it appears. This includes file-content, filesystem
>> layout and, last but also most important, history.
> 
> In Bazaar, a revision id always refers to the same logical entity, but
> it may be stored in different formats in different repositories.
> 

This I don't understand. Let's say Alice has revision-154 in her repo, 
located at alice.example.com. Let's say that commit is accessible with 
the url "alice.example.com:revision-154". Bob pulls from her repo into 
his own, which is located at bob.example.com.

Lots of questions here, so I'll split them up. Feel free to delete the 
non-applicable ones.

Will the commit in Bob's repo be accessible at 
"bob.example.com:revision-154"?

If it's not, how can you backtrack from old bugreports and find the 
error being discussed?

If it is, how does that work if Bob suddenly wants to commit things 
before Alice is done working with her changes?

Also, suppose they both push to a master-repo where Caesar has pushed 
his changes and nicked the slot for revision-154. Does the master repo 
re-organize everything and then invalidate Bob's and Alice's changes, or 
does it tell Alice and Bob that they need to update and then reorganize 
their repos before they're allowed to push?

I really can't get my head around the usefulness of revision-numbers 
hopping around, which is probably why I'm having such trouble grokking 
how it works.
> 
>>> - - you can use a checkout to maintain a local mirror of a read-only
>>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>>>
>> Check. Well, actually, you just clone it as usual but with the --bare
>> argument and it won't write out the working tree files.
> 
> No, I *want* the working tree files.  I run bzr from a checkout of bzr.dev.
> 

You get the working tree files by default. Use --bare if you don't want 
them to be checked out (i.e. written to the working tree) after the 
clone is complete.

>>> You can operate that way in bzr too, but I find it nicer to have one
>>> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
>>> command also rewrites only the changed part of the working tree.
>>>
>> Works in git as well, but each "checkout" (actually, locally referenced
>> repository clone) gets a separate branch/tag namespace.
> 
> In our terminology, if it can diverge from the original, it's a branch,
> not a checkout.
> 

This clears things up immensely. bazaar checkout != git checkout.
I still fail to see how a local copy you can't commit to is useful, but 
it doesn't really matter to me as I've already found a tool that does 
everything I want wrt scm needs.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
  2006-10-17 15:06                                 ` Sean
@ 2006-10-17 15:06                                 ` Sean
  2006-10-18  0:14                                 ` Petr Baudis
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 15:06 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

On Tue, 17 Oct 2006 16:19:46 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> Sure. As I said before, the little add-on of checkouts is that you say
> once "I don't want to do local commits here", and bzr reminds you of this
> each time you commit. Well, where it can make a difference is that it
> does it in a transactional way, that is, you don't have that little
> window between the time you pull and the time you push your next
> commit. But this would really be bad luck ;-).

Yeah, it would be bad luck, but Git wouldn't actually let the push
succeed if someone had changed the upstream repo in that small window.
It would complain that your push wasn't a fast forward and ask you
to update before pushing.
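
The fast-forward rule itself is simple: the push is accepted only if
the remote's current tip is an ancestor of what you are pushing. A
minimal Python model, with made-up commit names and a hypothetical
try_push() helper:

```python
def is_ancestor(parents, ancestor, tip):
    """True if `ancestor` is reachable from `tip` by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False


def try_push(parents, remote_tip, new_tip):
    """Accept the push only when it would fast-forward the remote branch."""
    if is_ancestor(parents, remote_tip, new_tip):
        return "accepted"
    return "rejected: not a fast-forward"


parents = {
    "base": [],
    "mine": ["base"],            # my commit, made on top of base
    "theirs": ["base"],          # someone else pushed this in the window
    "mine-rebased": ["theirs"],  # my change redone on top of theirs
}

print(try_push(parents, "base", "mine"))            # nobody else pushed
print(try_push(parents, "theirs", "mine"))          # lost the race
print(try_push(parents, "theirs", "mine-rebased"))  # after updating
```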

> Sure. And at least, if you want to prove that your decentralized SCM
> is the best, you'd better look at features other than the ability to
> commit on a local branch ;-). If you want a _real_ flamewar, better
> talk about rename management or revision identity.
> 
> The thing is that most people migrated from CVS/svn, so they found
> their new SCM to be incredibly better than the existing one. But it's generally
> not _so_ much better than the other modern alternatives ;-). (and
> don't forget to thank Darcs and Monotone who brought most of the good
> ideas you and I are using)

Heh, true enough.  And the fact is they're all "borrowing" the
best ideas from one another.  All of a sudden the others are all
getting git-like bisect and gitk guis.  And of course Linus has
said that he got quite a bit of inspiration from Monotone
originally.

Beyond the distributed offline nature of using Git, the killer
"feature" for me is its raw speed and flexibility[1].  It's
really nice to be able to branch in under a second and try
out a line of development etc.  Maybe this is just as easy
in Bazaar but it's not true of say Mercurial.  Honestly, I
just can't imagine any other SCM meeting my needs better than
Git.  So I have a hard time taking complaints about rename
management or revision identity seriously.

While they don't affect my usage, IMHO the two biggest failings
of Git are its lack of a shallow clone and its reliance on shell
and other scripting languages so there is no native Windows version.
I'm sure both of these areas are handled better by Bazaar and/or
some of the other new SCMs where they'd be a better choice than
Git.

Sean

[1] As an aside, I don't understand why bazaar pushes the idea
of "plugins".  For instance someone mentioned that bazaar has
a bisect "plugin".  Well Git was able to add a bisect "command"
without needing a plugin architecture.. so I'm at a loss as 
to why plugins are seen as an advantage.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
@ 2006-10-17 15:06                                 ` Sean
  2006-10-17 15:06                                 ` Sean
  2006-10-18  0:14                                 ` Petr Baudis
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 15:06 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

On Tue, 17 Oct 2006 16:19:46 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> Sure. As I said before, the little add-on of checkouts is that you say
> once "I don't want to do local commit here", and bzr reminds you this
> each time you commit. Well, where it can make a difference is that it
> does it in a transactional way, that is, you don't have that little
> window between the time you pull and the time you push your next
> commit. But this would really be bad luck ;-).

Yeah, it would be bad luck, but Git wouldn't actually let the push
succeed if someone had changed the upstream repo in that small window.
It would complain that your push wasn't a fast forward and ask you
to update before pushing.

> Sure. And at least, if you want to prove that your decentralized SCM
> is the best, you'd better look at features other than the ability to
> commit on a local branch ;-). If you want a _real_ flamewar, better
> talk about rename management or revision identity.
> 
> The thing is that most people migrated from CVS/svn, so they found
> their new SCM to be incredibly better the existing. But it's generally
> not _so_ much better than the other modern alternatives ;-). (and
> don't forget to thank Darcs and Monotone who brought most of the good
> ideas you and I are using)

Heh, true enough.  And the fact is they're all "borrowing" the
best ideas from one another.  All of a sudden the others are all
getting git-like bisect and gitk guis.  And of course Linus has
said that he got quite a bit of inspiration from Monotone
originally.

Beyond the distributed offline nature of using Git, the killer
"feature" for me is its raw speed and flexibility[1].  It's
really nice to be able to branch in under a second and try
out a line of development etc.  Maybe this is just as easy
in Bazaar but it's not true of say Mercurial.  Honestly, I
just can't imagine any other SCM meeting my needs better than
Git.  So I have a hard time taking complaints about rename
management or revision identity seriously.

While they don't affect my usage, IMHO the two biggest failings
of Git are its lack of a shallow clone and its reliance on shell
and other scripting languages so there is no native Windows version.
I'm sure both of these areas are handled better by Bazaar and/or
some of the other new SCMs where they'd be a better choice than
Git.

Sean

[1] As an aside, I don't understand why bazaar pushes the idea
of "plugins".  For instance someone mentioned that bazaar has
a bisect "plugin".  Well Git was able to add a bisect "command"
without needing a plugin architecture... so I'm at a loss as
to why plugins are seen as an advantage.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 15:05             ` Andreas Ericsson
@ 2006-10-17 15:32               ` Matthieu Moy
  2006-10-17 19:44               ` Aaron Bentley
  1 sibling, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 15:32 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> This I don't understand. Let's say Alice has revision-154 in her repo,
> located at alice.example.com. Let's say that commit is accessible with
> the url "alice.example.com:revision-154". Bob pulls from her repo into
> his own, which is located at bob.example.com.

Another equation can help.

Revision Identity != Revision Number.

$ bzr log --show-ids
------------------------------------------------------------
revno: 1
revision-id: Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d
committer: Matthieu Moy <Matthieu.Moy@imag.fr>
branch nick: foo
timestamp: Tue 2006-10-17 17:20:29 +0200
message:
  some message


See, bzr has this unique revision identifier (not based on a hashsum).
The design choice of bzr is to hide it as much as possible from the
user interface.

Then, if I'm in the branch in which I typed this command, I can refer
to this revision with simply

  bzr whatever -r 1

In the general case, I can access it with

  bzr whatever -r revid:Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d

(There's currently a lack in the UI to specify a remote revision-id,
but that's not a problem in the model itself)

bzr's internals use revision IDs almost exclusively (ancestry
information is all about revision IDs), and revnos are a UI layered on
top of them.

I don't have strong needs in revision control, but I actually never
encountered a case where I had to access a revision by providing its
ID. So, for people like me, revision numbers are sufficient, and they
are simple (for example, I can tell without running any command that
revision 42 is older than revision 56 in a particular branch).

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:19             ` Olivier Galibert
@ 2006-10-17 15:37               ` Matthieu Moy
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-17 15:37 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Olivier Galibert <galibert@pobox.com> writes:

> You're not telling us bzr still follows the utterly stupid
> update-before-commit model, right?  Right?

One last time:

bzr _CAN_ follow the utterly stupid update-before-commit model.

It doesn't force you to do so, obviously.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:40           ` Robert Collins
  2006-10-17 10:08             ` Andreas Ericsson
@ 2006-10-17 16:41             ` Linus Torvalds
  2006-10-17 22:27               ` Robert Collins
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17 16:41 UTC (permalink / raw)
  To: Robert Collins; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git



On Tue, 17 Oct 2006, Robert Collins wrote:

> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> > 
> >           ---- time --->
> > 
> >     --*--*--*--*--*--*--*--*--*-- <branch>
> >           \            /
> >            \-*--X--*--/
> > 
> > The branch it used to be on is gone...
> 
> In bzr 0.12 this is :
> 2.1.2
> 
> (assuming the first * is numbered '1'.)
> 
> These numbers are fairly stable

And here, by "fairly stable", you really mean "totally idiotic", don't 
you?

Guys, let's be blunt here, and just say you're wrong. The fact is, I've 
used a system that uses the same naming bzr does, and I've used it likely 
longer and with a bigger project than anybody has likely _ever_ used bzr 
for.

It sounds like bzr is doing _exactly_ what bitkeeper did. 

Those "simple" numbers are totally idiotic. And when I say "totally 
idiotic", please go back up a few sentences, and read those again. I know 
what I'm talking about. I know probably better than anybody in the bzr 
camp.

Those "simple" numbers are anything but. They may be short, most of the 
time, but when you bandy things like "-r 56" around, what you're ignoring 
is that for a _real_ project you actually get numbers like "1.517.3.57", 
which isn't really any simpler or shorter than saying "7786ce19". You 
still want to cut-and-paste it.

And the "simple" numbers have a real downside, which is that THEY CHANGE.

What happens is that somebody else started _another_ branch at revision 2, 
and did important work, and they also had a "2.1.2" revision, and then 
they merged your work, and you merged their merge back, that "simple" 
revision number changed, didn't it? Suddenly "2.1.2" means something 
different for one of the users.
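
To make that concrete, here is a minimal sketch in Python. It is only an
illustration under stated assumptions: `commit_id` and `mainline_revnos`
are hypothetical helpers, not bzr, BitKeeper or git code, and the
numbering is simplified to counting along the first-parent chain. It
shows a position-based number naming different commits in two
repositories, and moving after a merge, while a content-derived name
stays put.

```python
import hashlib

def commit_id(message, parents):
    """Name a commit by hashing its message together with its parents' ids."""
    data = message + "\n" + "\n".join(parents)
    return hashlib.sha1(data.encode("utf-8")).hexdigest()

def mainline_revnos(tip, parents_of):
    """Map 1, 2, 3, ... onto the first-parent chain ending at tip (oldest = 1)."""
    chain = []
    node = tip
    while node is not None:
        chain.append(node)
        parents = parents_of[node]
        node = parents[0] if parents else None
    chain.reverse()
    return {number: commit for number, commit in enumerate(chain, start=1)}

# Alice and Bob both start from the same root and each commit once.
root = commit_id("initial import", [])
alice_commit = commit_id("alice's work", [root])
bob_commit = commit_id("bob's work", [root])
parents_of = {root: [], alice_commit: [root], bob_commit: [root]}

alice_view = mainline_revnos(alice_commit, parents_of)
bob_view = mainline_revnos(bob_commit, parents_of)

# Alice merges Bob, then Bob mirrors Alice's branch.
merge = commit_id("merge bob", [alice_commit, bob_commit])
parents_of[merge] = [alice_commit, bob_commit]
bob_view_after = mainline_revnos(merge, parents_of)

print(alice_view[2] == bob_view[2])      # False: "2" names different commits
print(bob_view_after[2] == bob_commit)   # False: Bob's old "2" has moved
```

The hash of Bob's commit is the same string in every repository that
holds it; only the small integer depends on whose history you count from.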

We had people in the bitkeeper world that _never_ actually understood that 
the numbers changed. The "simple" numbers were stable enough that a lot of 
people thought they were real revisions, and then they were really 
_really_ confused when a number like "1.517.3.57" suddenly went away after 
a merge, and became something else instead.

And yes, bitkeeper had a "real key" internally too. If you actually wanted 
to give a real revision, you had to give something that looked a lot like 
what the bzr internal revision numbers look like.

Of course, most users didn't even _know_ or understand those revision 
numbers, so as a result, you had tons of people who used the "simple" 
thing (which was what "bk log" and all other tools would show), and since 
it worked quite often, they thought it was ok. And then sometimes it 
didn't work at all, or it "worked" by giving the wrong commit, and it was 
just a total disaster.

Something that works "most of the time" is not simple to use. It's just a 
way to make people _believe_ it is simple, and then be really confused 
when it doesn't work.

So trust me, naming things so that the name depends on the local shape of 
the history is idiotic. I _know_. Been there, done that.

The thing is, when I designed git, I actually had years of experience 
working with a big project in a truly distributed manner. I _knew_ that 
handling renames specially is a bad idea (not that you should even need to 
have used BK to know that).

And I _knew_ that the simple revision numbers aren't real and just cause 
confusion.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  6:23         ` Junio C Hamano
@ 2006-10-17 18:52           ` J. Bruce Fields
  2006-10-17 19:12             ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-17 18:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Aaron Bentley, git

On Mon, Oct 16, 2006 at 11:23:53PM -0700, Junio C Hamano wrote:
> Aaron Bentley <aaron.bentley@utoronto.ca> writes:
> 
> > Johannes Schindelin wrote:
> >
> >>> You'll note we referred to that behavior on the page.  We don't think
> >>> what Git does is the same as supporting renames.  AIUI, some Git users
> >>> feel the same way.
> >> 
> >> Oh, we start another flamewar again?
> >
> > I'd hope not.  It sounds as though you feel that supporting renames in
> > the data representation is *wrong*, and therefore it should be an insult
> > to you if we said that Git fully supported renames.
> 
> Not recording and not supporting are quite different things.

Yes.  There's a risk of confusing a feature with an implementation
detail.  From http://bazaar-vcs.org/RcsComparisons:

	"If a user can rename a file in the RCS without loosing the RCS
	history for a file, then renames are considered supported. If
	the operation resultes in a delete/add (aka "DA pair"), then
	renames are not considered supported. If the operation results
	in a copy/delete pair, renames are considered "somewhat"
	supported. The problem with copy support is that it is hard to
	define sane merge semantics for copies."

The first sentence sounds like a description of a user-visible feature.
The rest of it sounds like implementation.

And git probably has some deficiencies here, but it'd be more useful to
identify them in terms of things a user can't do.

--b.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 18:52           ` J. Bruce Fields
@ 2006-10-17 19:12             ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 19:12 UTC (permalink / raw)
  To: git

J. Bruce Fields wrote:

> On Mon, Oct 16, 2006 at 11:23:53PM -0700, Junio C Hamano wrote:
>> Aaron Bentley <aaron.bentley@utoronto.ca> writes:
>> 
>> > Johannes Schindelin wrote:
>> >
>> >>> You'll note we referred to that behavior on the page.  We don't think
>> >>> what Git does is the same as supporting renames.  AIUI, some Git users
>> >>> feel the same way.
>> >> 
>> >> Oh, we start another flamewar again?
>> >
>> > I'd hope not.  It sounds as though you feel that supporting renames in
>> > the data representation is *wrong*, and therefore it should be an insult
>> > to you if we said that Git fully supported renames.
>> 
>> Not recording and not supporting are quite different things.
> 
> Yes.  There's a risk of confusing a feature with an implementation
> detail.  From http://bazaar-vcs.org/RcsComparisons:
> 
>       "If a user can rename a file in the RCS without loosing the RCS
>       history for a file, then renames are considered supported. If
>       the operation resultes in a delete/add (aka "DA pair"), then
>       renames are not considered supported. If the operation results
>       in a copy/delete pair, renames are considered "somewhat"
>       supported. The problem with copy support is that it is hard to
>       define sane merge semantics for copies."
> 
> The first sentence sounds like a description of a user-visible feature.
> The rest of it sounds like implementation.

The proper description would be: if we get history of file up to rename
unrelated to the history of file before rename ("DA pair"), where
"unrelated" means that SCM doesn't store this relation (or equivalent
information), renames are not considered supported. If we get full
history of file under new name, and unrelated history of file up to rename
("CD pair"), renames are not considered supported ;-)
 
> And git probably has some deficiencies here, but it'd be more useful to
> identify them in terms of things a user can't do.

For example:
 * if we rename (or delete) a file on one branch, and then merge changes
   with another branch where no such rename took place, does the merge do
   the correct thing?
 * can we get the whole history of a file, before and after the rename? Can
   we do this automatically, in one go?
 * are renames marked (or can they be marked) as such in diff output?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 15:05             ` Andreas Ericsson
  2006-10-17 15:32               ` Matthieu Moy
@ 2006-10-17 19:44               ` Aaron Bentley
  2006-10-17 23:28                 ` Petr Baudis
  2006-10-17 23:39                 ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 19:44 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
>> In Bazaar, a revision id always refers to the same logical entity, but
>> it may be stored in different formats in different repositories.
>>
> 
> This I don't understand. Let's say Alice has revision-154 in her repo,
> located at alice.example.com. Let's say that commit is accessible with
> the url "alice.example.com:revision-154". Bob pulls from her repo into
> his own, which is located at bob.example.com.
> 
> Lots of questions here, so I'll split them up. Feel free to delete the
> non-applicable ones.
> 
> Will the commit in Bob's repo be accessible at
> "bob.example.com:revision-154"?

bzr differentiates between pull and merge.  Pull is a mirroring command.
 So with pull, yes revision-154 will be accessible at
bob.example.com:revision-154.

With merge, it won't.  Bob can refer to it as "154:alice.example.com",
though.

> If it's not, how can you backtrack from old bugreports and find the
> error being discussed?

Refer to it as 'alice.example.com revno 154' or by its revision-id.

> If it is, how does that work if Bob suddenly wants to commit things
> before Alice is done working with her changes?

I don't see how this applies.  You can always commit in a branch.  If
alice and bob both commit, then they are diverged and can't pull.  If
alice merges bob, then they converge and bob can pull alice.

> Also, suppose they both push to a master-repo where Caesar has pushed
> his changes and nicked the slot for revision-154. Does the master repo
> re-organize everything and then invalidate Bob's and Alice's changes, or
> does it tell Alice and Bob that they need to update and then reorganize
> their repos before they're allowed to push?

They must merge from the master-repo before they can push to it.

>> In our terminology, if it can diverge from the original, it's a branch,
>> not a checkout.
>>
> 
> This clears things up immensely. bazaar checkout != git checkout.
> I still fail to see how a local copy you can't commit to is useful

My bzr is run from a local copy I can't commit to.  To get the latest
changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
To merge the latest changes into my branch, I can run
"bzr merge ~/bzr/dev".  It's also convenient for applying other peoples'
patches to.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNTKl0F+nu1YWqI0RAhRkAJ0d5KyRElEiFm/m5iRrTIk00RyqywCfe2IY
dhW46SYWm+FTQpN30VY5tPs=
=6SFm
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
  2006-10-17 10:30             ` Johannes Schindelin
@ 2006-10-17 19:51             ` Aaron Bentley
  2006-10-21 18:58               ` Jan Hudec
  2006-10-20  8:26             ` James Henstridge
  2006-10-20  8:56             ` Erik Bågfors
  3 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 19:51 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sean wrote:
> On Tue, 17 Oct 2006 00:24:15 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
>>- - you can use a checkout to maintain a local mirror of a read-only
>>  branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> 
> 
> I'm not sure what you mean here.  A bzr checkout doesn't have any history
> does it?

By default, they do.  You must use a flag to get a checkout with no history.

> So it's not a mirror of a branch, but just a checkout of the
> branch head?

It's a mirror of a branch, and a copy of the branch's working tree.

> If so, Git can export a tarball of a branch (actually a snapshot as at
> any given commit) which can be mirrored out.

Sure, and so can bzr.  But using a checkout of the branch head means:
- - No one has to do anything special to provide a working tree of a given
  revision
- - I can still run any readonly operations I desire
- - I can update to the latest version of bzr.dev with one command.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNTRc0F+nu1YWqI0RAsL2AKCCG0bP8m01WVllfPMzCdFZjmgEgACfeToz
57HERFJ6ZkkS3VrxLRnVPAs=
=3CX7
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17  8:16         ` Andreas Ericsson
@ 2006-10-17 20:01           ` Aaron Bentley
  2006-10-17 21:01             ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 20:01 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> Aaron Bentley wrote:
>> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
>> positive numbers to refer to the number of commits that have been made
>> since the branch was initialized.
>>
> 
> What do you do once a branch has been thrown away, or has had 20 other
> branches merged into it? Does the offset-number change for the revision
> then, or do you track branch-points explicitly?

We always track the number of parents since the initial commit in the
project.  Sorry, I don't think I said that clearly before.

>> If I understand correctly, in Bazaar, you'd just merge the current work
>> into 'xx/topic'.
>>
> 
> merge != rebase though, although they are indeed similar. Let's take the
> example of a 'master' branch and topic branch topicA. If you rebase
> topicA onto 'master', development will appear to have been serial.

Ah, now I see what you mean, and the "graft" plugin mentioned by others
fills that role.  I've never used it, though.

> If
> you instead merge them, it will either register as a real merge or, if
> the branch tip of 'master' is the branch start-point of topicA, it will
> result in a "fast-forward" where 'master' is just updated to the
> branch-tip of 'topicA'.

Interesting.  We don't do 'fast-forward' in that case.

>> I'm not sure what you mean by API, unless you mean the commandline.  If
>> that's what you mean, surely all unix commands are extensible in that
>> regard.
>>
> 
> I'm fairly certain he's talking about the API in the sense it's being
> talked about in every other application. Extensive work has been made to
> libify a lot of the git code, which means that most git commands are
> made up of less than 400 lines of C code, where roughly 80% of the code
> is command-specific (i.e., argument parsing and presentation).

Ah, okay.

So it sounds to me like git is extensible, though not as thoroughly as bzr.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNTat0F+nu1YWqI0RAn9aAJ9WzMrM72be+3SlwCpvJXQ/X2Y3nQCfeYk3
NTIJuZSze9URUaAsiO4Hu5o=
=9nvr
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 20:01           ` Aaron Bentley
@ 2006-10-17 21:01             ` Jakub Narebski
  2006-10-17 21:27               ` Aaron Bentley
  2006-10-17 23:35               ` Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 21:01 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Andreas Ericsson wrote:
>> Aaron Bentley wrote:
>>> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
>>> positive numbers to refer to the number of commits that have been made
>>> since the branch was initialized.
>>>
>>
>> What do you do once a branch has been thrown away, or has had 20 other
>> branches merged into it? Does the offset-number change for the revision
>> then, or do you track branch-points explicitly?
> 
> We always track the number of parents since the initial commit in the
> project.  Sorry, I don't think I said that clearly before.

This, I think, is quite reliable (there was an idea to store a "generation
number" with each commit, e.g. using the not yet implemented "note" header,
or a commit-id to generation-number "database", as a better heuristic than
the timestamp for revision ordering in git-rev-list output), and probably
independent of the repository (it is a global property of the commit history,
and a commit's sha1 covers its parents, hence its whole history). Numbering
branching points, however, is unreliable, as is relying on branch names.
 
>>> If I understand correctly, in Bazaar, you'd just merge the current work
>>> into 'xx/topic'.
>>>
>>
>> merge != rebase though, although they are indeed similar. Let's take the
>> example of a 'master' branch and topic branch topicA. If you rebase
>> topicA onto 'master', development will appear to have been serial.
> 
> Ah, now I see what you mean, and the "graft" plugin mentioned by others
> fills that role.  I've never used it, though.

Very useful as a kind of poor-man's-Quilt (or StGit). You develop some
feature step by step, commit by commit in your repository cooking it
in topic branch. Then before sending it to mailing list or maintainer
as a series of patches (using git-format-patch and git-send-email)
you rebase it on top of current work (current state), to ensure that
it would apply cleanly.
 
>> If
>> you instead merge them, it will either register as a real merge or, if
>> the branch tip of 'master' is the branch start-point of topicA, it will
>> result in a "fast-forward" where 'master' is just updated to the
>> branch-tip of 'topicA'.
> 
> Interesting.  We don't do 'fast-forward' in that case.

Fast-forward is a really good idea. Perhaps you could implement it,
if it is not hidden under different name?
 
>>> I'm not sure what you mean by API, unless you mean the commandline.  If
>>> that's what you mean, surely all unix commands are extensible in that
>>> regard.
>>>
>>
>> I'm fairly certain he's talking about the API in the sense it's being
>> talked about in every other application. Extensive work has been made to
>> libify a lot of the git code, which means that most git commands are
>> made up of less than 400 lines of C code, where roughly 80% of the code
>> is command-specific (i.e., argument parsing and presentation).
> 
> Ah, okay.
> 
> So it sounds to me like git is extensible, though not as thoroughly as bzr.

I think having a good API for C, shell and Perl (and to a lesser extent for
any scripting language) means that it is more extensible. Git is not yet
libified; once it is, we could think about bindings for other
programming languages (there is a preliminary Java binding/interface).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:01             ` Jakub Narebski
@ 2006-10-17 21:27               ` Aaron Bentley
  2006-10-17 21:51                 ` Jakub Narebski
                                   ` (2 more replies)
  2006-10-17 23:35               ` Jakub Narebski
  1 sibling, 3 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 21:27 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
>>Ah, now I see what you mean, and the "graft" plugin mentioned by others
>>fills that role.  I've never used it, though.
> 
> 
> Very useful as a kind of poor-man's-Quilt (or StGit). You develop some
> feature step by step, commit by commit in your repository cooking it
> in topic branch. Then before sending it to mailing list or maintainer
> as a series of patches (using git-format-patch and git-send-email)
> you rebase it on top of current work (current state), to ensure that
> it would apply cleanly.

What is the bad side of using merge in this situation?

>>Interesting.  We don't do 'fast-forward' in that case.
> 
> 
> Fast-forward is a really good idea. Perhaps you could implement it,
> if it is not hidden under different name?

We support it as 'pull', but merge doesn't do it automatically, because
we'd rather have merge behave the same all the time, and because 'pull'
throws away your local commit ordering.

>>So it sounds to me like git is extensible, though not as thoroughly as bzr.
> 
> 
> I think having good API for C, shell and Perl (and to lesser extent for any
> scripting language) means that it is extensible more.

I guess it's a value judgement on which is more important to extensibility:

Git has more language support.

Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
and more.  Because Python supports monkey-patching, a plugin can change
absolutely anything.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNUrP0F+nu1YWqI0RAizXAJ0Wnf2ZoIRpaba3mX2L4pN9XcWDPQCePtg/
G/W6Oxm+kd8SzhGEEfLAxL8=
=VqC7
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:27               ` Aaron Bentley
@ 2006-10-17 21:51                 ` Jakub Narebski
  2006-10-17 22:28                   ` Aaron Bentley
  2006-10-18  6:22                   ` Matthieu Moy
       [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
  2006-10-17 22:03                 ` Linus Torvalds
  2 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 21:51 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:

>>>Ah, now I see what you mean, and the "graft" plugin mentioned by others
>>>fills that role.  I've never used it, though.
>>
>> Very useful as a kind of poor-man's-Quilt (or StGit). You develop some
>> feature step by step, commit by commit in your repository cooking it
>> in topic branch. Then before sending it to mailing list or maintainer
>> as a series of patches (using git-format-patch and git-send-email)
>> you rebase it on top of current work (current state), to ensure that
>> it would apply cleanly.
> 
> What is the bad side of using merge in this situation?

We want linear history, not polluted by merges. For example, you cannot
send a merge commit via email. Another problem is that you want to
send a _series_ of patches, a string of commits (revisions), creating the
feature part by part, with clean history. With merge, only the _final
result_ is guaranteed to apply cleanly; with rebase, each patch in the
series applies cleanly.
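
A rough sketch of the difference, in Python. This is a toy model under
loud assumptions: `add_commit`, `rebase` and `first_parent_log` are
hypothetical helpers, a "patch" is just a string, and conflicts are
ignored entirely, which the real git-rebase cannot do. It only shows the
shape of the resulting history: a merge adds one commit with two
parents, while a rebase replays each topic commit on the new base and
yields a straight line of single-parent commits.

```python
import hashlib

store = {}  # commit id -> {"patch": str, "parents": [ids]}

def add_commit(patch, parents):
    """Store a commit and return its id (a hash of the patch and parent ids)."""
    data = patch + "|" + ",".join(parents)
    cid = hashlib.sha1(data.encode("utf-8")).hexdigest()
    store[cid] = {"patch": patch, "parents": list(parents)}
    return cid

def rebase(topic_commits, new_base):
    """Replay topic_commits (oldest first) on top of new_base; return the new tip."""
    tip = new_base
    for cid in topic_commits:
        # Same patch, new parent, therefore a new commit id.
        tip = add_commit(store[cid]["patch"], [tip])
    return tip

def first_parent_log(tip):
    """List patches from tip back to the root, following first parents."""
    log = []
    node = tip
    while node is not None:
        log.append(store[node]["patch"])
        parents = store[node]["parents"]
        node = parents[0] if parents else None
    return log

base = add_commit("base", [])
step1 = add_commit("topic step 1", [base])
step2 = add_commit("topic step 2", [step1])
upstream = add_commit("upstream change", [base])

merged_tip = add_commit("merge upstream", [step2, upstream])  # two parents
rebased_tip = rebase([step1, step2], upstream)                # linear chain

for patch in reversed(first_parent_log(rebased_tip)):
    print(patch)
```

Walking back from `rebased_tip` gives the upstream work followed by the
topic steps, one patch per commit, which is the form you want for a
series sent by email.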
 
>>>Interesting.  We don't do 'fast-forward' in that case.
>>
>> Fast-forward is a really good idea. Perhaps you could implement it,
>> if it is not hidden under different name?
> 
> We support it as 'pull', but merge doesn't do it automatically, because
> we'd rather have merge behave the same all the time, and because 'pull'
> throws away your local commit ordering.

I smell yet another terminology conflict (although this time the fault is
on the git side), namely that in git terminology "pull" is "fetch"
(i.e. getting the changes made in the remote repository since the last
"fetch" or since "clone") followed by a merge. pull = fetch + merge.

>>>So it sounds to me like git is extensible, though not as thoroughly as bzr.
>>
>>
>> I think having good API for C, shell and Perl (and to lesser extent for any
>> scripting language) means that it is extensible more.
> 
> I guess it's a value judgement on which is more important to extensibility:
> 
> Git has more language support.
> 
> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
> and more.  Because Python supports monkey-patching, a plugin can change
> absolutely anything.

Which is _not_ a good idea. Git is designed in such a way that the repository
is abstracted away (the introduction of the pack format, and improvements to
it, could be and were done "behind the scenes", without changing any
porcelain-ish (user) commands), and we don't want any change that would
break this abstraction. Changing the repository format is not a good idea for
"dumb" protocols; the native protocol is quite extensible (for example, the
multi-ack extension was introduced for better downloading of multiple
branches with fewer objects in the pack sent; even earlier, thin packs were
introduced), and does a kind of feature detection between client and server.
Adding cURL-based read-only FTP support to the existing HTTP support was a
matter of a few lines, if I remember correctly.

Besides, if monkey-patching is something akin to advices, I guess that
performance might suffer.


To make a perhaps not-so-good analogy: in git, adding a new command is
like adding a new filesystem to the Linux kernel using the existing VFS
interface, or the existing FUSE/LUFS interface. In Bazaar, adding a new
command is like writing new filesystem support (a plugin) for a microkernel
like L4/Mach. (And please note which project git was created for :-))

-- 
Jakub Narebski
ShadeHawk on #git
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
  2006-10-17 22:00                   ` Sean
@ 2006-10-17 22:00                   ` Sean
  2006-10-17 22:44                     ` Aaron Bentley
  2006-10-20  9:43                     ` Matthieu Moy
  1 sibling, 2 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 22:00 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On Tue, 17 Oct 2006 17:27:44 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
> and more.  Because Python supports monkey-patching, a plugin can change
> absolutely anything.

But really why does any of that matter?  This is the open source world.
We don't need plugins to extend features, we just add the feature to
the source.  The example I asked about earlier is a case in point. 
Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
was implemented as a command without any issue at all, no plugins
needed, and it's compiled and runs at machine speed.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:27               ` Aaron Bentley
  2006-10-17 21:51                 ` Jakub Narebski
       [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
@ 2006-10-17 22:03                 ` Linus Torvalds
  2006-10-17 22:53                   ` Aaron Bentley
  2 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17 22:03 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> 
> >>Interesting.  We don't do 'fast-forward' in that case.
> > 
> > Fast-forward is a really good idea. Perhaps you could implement it,
> > if it is not hidden under different name?
> 
> We support it as 'pull', but merge doesn't do it automatically, because
> we'd rather have merge behave the same all the time, and because 'pull'
> throws away your local commit ordering.

Excuse me? What does that "throws away your local commit ordering" mean?

A fast-forward does no such thing. It leaves the local commit ordering 
alone, it just appends other things on top of it. It's the only sane thing 
you can do, since the work you merged was already based on your top 
commit.

So generating an extra "merge" commit would be actively wrong, and adds 
"history" that is not history at all.

It also means that if people merge back and forth from each other, you get 
into an endless loop of useless merge commits. What's the point? They only 
clutter up the history, and they mean that you can never agree on a common 
state.

There's no reason _ever_ to not just fast-forward if one repository is a 
strict superset of the other.
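
A minimal sketch of that rule, assuming history is held as a plain mapping
from commit id to parent ids. The names ancestors and merge are hypothetical
helpers for illustration, not git's internals: a merge degenerates into a
fast-forward exactly when our tip is already an ancestor of theirs.

```python
def ancestors(parents, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def merge(parents, ours, theirs):
    """Return (new_tip, kind) for merging theirs into ours.

    parents maps commit id -> tuple of parent ids. It is modified only
    when a real merge commit is needed.
    """
    if ours in ancestors(parents, theirs):
        # Their history is a strict superset of ours: just move the tip.
        return theirs, "fast-forward"
    if theirs in ancestors(parents, ours):
        return ours, "already up to date"
    merge_id = "merge(%s,%s)" % (ours, theirs)
    parents[merge_id] = (ours, theirs)
    return merge_id, "merge commit"


history = {"a": (), "b": ("a",), "c": ("b",), "d": ("b",), "e": ("d", "c")}
ff_result = merge(history, "c", "e")
diverged_result = merge(history, "c", "d")
```

With this toy history, merging e into c moves the tip without creating
anything, while merging d into c (two tips that really diverged) records a
new two-parent commit.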

You must be doing something wrong. Is it just that people want to pee in 
the snow and leave their mark?

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 16:41             ` Linus Torvalds
@ 2006-10-17 22:27               ` Robert Collins
       [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 1752+ messages in thread
From: Robert Collins @ 2006-10-17 22:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski

On Tue, 2006-10-17 at 09:41 -0700, Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Robert Collins wrote:
> 
> > On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> > > 
> > >           ---- time --->
> > > 
> > >     --*--*--*--*--*--*--*--*--*-- <branch>
> > >           \            /
> > >            \-*--X--*--/
> > > 
> > > The branch it used to be on is gone...
> > 
> > In bzr 0.12 this is :
> > 2.1.2
> > 
> > (assuming the first * is numbered '1'.)
> > 
> > These numbers are fairly stable
> 
> And here, by "fairly stable", you really mean "totally idiotic", don't 
> you?
> 
> Guys, let's be blunt here, and just say you're wrong. The fact is, I've 
> used a system that uses the same naming bzr does, and I've used it likely 
> longer and with a bigger project than anybody has likely _ever_ used bzr 
> for.
> 
> It sounds like bzr is doing _exactly_ what bitkeeper did. 
> 
> Those "simple" numbers are totally idiotic. And when I say "totally 
> idiotic", please go back up a few sentences, and read those again. I know 
> what I'm talking about. I know probably better than anybody in the bzr 
> camp.

Be as blunt as you want. You're expressing an opinion, and that's fine. I
happen to think that we're right: users appear to really appreciate
this bit of the UI, and I've not yet seen any evidence of confusion
about it - though I will admit there is the possibility of that
occurring.

I think it's completely OK that git and bzr have made different choices
in this regard, but I *don't* think our choice is in any regard 'totally
idiotic'.

[snip examples that are clearly predicated on how bk worked, not on how
bzr works].

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:51                 ` Jakub Narebski
@ 2006-10-17 22:28                   ` Aaron Bentley
  2006-10-17 22:57                     ` Jakub Narebski
  2006-10-18  6:22                   ` Matthieu Moy
  1 sibling, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 22:28 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:

>> What is the bad side of using merge in this situation?
> 
> We want linear history, not polluted by merges. For example you cannot
> send merge commit via email.

Oh.  Bazaar supports sending merge commits by email.

> Another problem is that you want to
> send _series_ of patches, string of commits (revisions), creating feature
> part by part, with clean history; with merge you get _final result_
> which will apply cleanly, with rebase you would get that series
> of patches will apply cleanly.

Yes, that's something that I'd heard about the kernel development
methodology-- that a series of small patches is preferred to one patch
that makes the whole change.

That's not the way we operate.  We like to review all the changes at
once.  But because bundles are applied with a 'merge' command, not a
'patch' command, an old bundle will tend to apply more cleanly than an
old patch would.

> I smell yet another terminology conflict (although this time fault is
> on the git side), namely that in git terminology "pull" is "fetch"
> (i.e. getting changes done in remote repository since last "fetch"
> or since "clone") followed by merge. pull = fetch + merge.

I guess so, since git merge will do fast-forward after a fetch.

>> and more.  Because Python supports monkey-patching, a plugin can change
>> absolutely anything.
> 
> Which is _not_ a good idea. Git is created in such way, that the repository
> is abstracted away (introduction of pack format, and improving pack format
> can and was done "behind the scenes", not changing any porcelain-ish (user)
> commands), but we don't want any change that would change this abstraction.

I'm not sure what you think Bazaar does.  In Bazaar, a repository format
plugin  implements the same API that a native repository format does.

This is how bzr supports Subversion, Mercurial and Git repositories.

> Changing repository format is not a good idea for "dumb" protocols; 

I can't parse this.  Repository formats and protocols are different
things, right?

> native
> protocol is quite extensible

I was meaning dumb protocol extension.  I can't say how extensible the
bzr native protocol is.
> Adding
> cURL based FTP read-only support to existing HTTP support was a matter
> of few lines, if I remember correctly.

We support read and write over native, ftp and WebDAV (a plugin).  We
also have readonly http support.

> Besides, if monkey-patching is something akin to advices, I guess that
> performance might suffer.

No, monkey-patched code executes at the same speed as unpatched code.
There are arguments against monkey-patching, but speed is not one of them.
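
A minimal sketch of what monkey-patching means here. Repository and
plugin_describe are hypothetical names for illustration, not bzr's real API.
The patch simply rebinds an attribute on the class, so later calls go through
the same attribute lookup as before and find an ordinary function.

```python
class Repository:
    """Hypothetical stand-in for a core class; not bzr's real API."""

    def describe(self):
        return "native format"


def plugin_describe(self):
    # Replacement behaviour supplied by a hypothetical plugin.
    return "plugin format"


repo = Repository()
before = repo.describe()

# The monkey-patch: rebind the method on the class at run time.
# Existing instances pick up the new behaviour immediately.
Repository.describe = plugin_describe
after = repo.describe()
```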

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNVkM0F+nu1YWqI0RAjCaAJwOcWSUdVy7RpUZROJVxAC9aj/V/wCfUg0T
uHkdc9k6i+v0QnhEvTXdszM=
=YO8G
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:00                   ` Sean
@ 2006-10-17 22:44                     ` Aaron Bentley
       [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
  2006-10-20  9:43                     ` Matthieu Moy
  1 sibling, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 22:44 UTC (permalink / raw)
  To: Sean; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sean wrote:
> On Tue, 17 Oct 2006 17:27:44 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> 
>> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
>> and more.  Because Python supports monkey-patching, a plugin can change
>> absolutely anything.
> 
> But really why does any of that matter?  This is the open source world.
> We don't need plugins to extend features, we just add the feature to
> the source.

That can lead to feature bloat.  Some plugins are not useful to
everyone, e.g. Mercurial repository support.  Some plugins introduce
additional dependencies that we don't want to have in the core (e.g. the
rsync, baz-import and graph-ancestry commands).

Plugins also don't have Bazaar's rigid release cycle, testing
requirements and coding conventions, so they are a convenient way to try
out an idea, before committing to the effort of getting it merged into
the core.

> The example I asked about earlier is a case in point. 
> Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
> was implemented as a command without any issue at all, no plugins
> needed, and it's compiled and runs at machine speed.

The bisect plugin is just as performant as any other bzr command.  (The
whole VCS is in Python.)  Most people don't use it, so we don't ship it
as part of the base install, but anyone who wants it can have it.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNVy70F+nu1YWqI0RAnlxAJ9+ZXryG/KJxi6hjpz+U/gU3y06MQCdH2Ez
cFlnxwWksB+q2b1dXI3cfwo=
=HAy6
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:03                 ` Linus Torvalds
@ 2006-10-17 22:53                   ` Aaron Bentley
  2006-10-17 23:09                     ` Linus Torvalds
  2006-10-17 23:24                     ` Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 22:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Aaron Bentley wrote:
>>>> Interesting.  We don't do 'fast-forward' in that case.
>>> Fast-forward is a really good idea. Perhaps you could implement it,
>>> if it is not hidden under different name?
>> We support it as 'pull', but merge doesn't do it automatically, because
>> we'd rather have merge behave the same all the time, and because 'pull'
>> throws away your local commit ordering.
> 
> Excuse me? What does that "throws away your local commit ordering" mean?

Say this is the ordering in branch A:

a
|
b
|
c

Say this is the ordering in branch B:

a
|
b
|\
d c
|/
e

When A pulls B, it gets the same ordering as B has.  If B did not have e
and c, the pull would fail.

> So generating an extra "merge" commit would be actively wrong, and adds 
> "history" that is not history at all.

It's not a tree change, but it records the fact that one branch merged
the other.

> It also means that if people merge back and forth from each other, you get 
> into an endless loop of useless merge commits.

You can pull if you don't want that.  We haven't found that people are
very fussed about it.

> There's no reason _ever_ to not just fast-forward if one repository is a 
> strict superset of the other.

Maybe not in Git.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNV7u0F+nu1YWqI0RAhGtAJwOlWpl088pbl63EHyF04qQCYlXBgCfW0Tm
cfXuE0vqeWelfFbpzffiCNI=
=McQ2
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
  2006-10-17 22:56                         ` Sean
@ 2006-10-17 22:56                         ` Sean
  2006-10-17 23:11                           ` Jakub Narebski
  2006-10-18 21:04                           ` Charles Duffy
  2006-10-18 21:51                         ` Petr Baudis
  2 siblings, 2 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 22:56 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On Tue, 17 Oct 2006 18:44:11 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> That can lead to feature bloat.  Some plugins are not useful to
> everyone, e.g. Mercurial repository support.  Some plugins introduce
> additional dependencies that we don't want to have in the core (e.g. the
> rsync, baz-import and graph-ancestry commands).

Shrug, it's really not that tough to do in regular ole source code.
On Fedora for instance you have your choice of which rpms you want
to install to get the features of Git you want.

> Plugins also don't have Bazaar's rigid release cycle, testing
> requirements and coding conventions, so they are a convenient way to try
> out an idea, before committing to the effort of getting it merged into
> the core.

Hmm.. It's pretty easy to test out Git ideas too.  People do it all
the time, and without plugins.  Junio maintains several such trees
for instance.  Dunno.. I just think plugins _sound_ good to developers
without much real benefit to users over regular ole source code.

> The bisect plugin is just as performant as any other bzr command.  (The
> whole VCS is in Python.)  Most people don't use it, so we don't ship it
> as part of the base install, but anyone who wants it can have it.

Sure, and anyone who wants to use StGit on top of Git can download and
use it as well.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:28                   ` Aaron Bentley
@ 2006-10-17 22:57                     ` Jakub Narebski
  2006-10-17 22:59                       ` Jakub Narebski
                                         ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 22:57 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Aaron Bentley wrote:
> 
>>> What is the bad side of using merge in this situation?
>>
>> We want linear history, not polluted by merges. For example you cannot
>> send merge commit via email.
> 
> Oh.  Bazaar supports sending merge commits by email.
> 
>> Another problem is that you want to
>> send _series_ of patches, string of commits (revisions), creating feature
>> part by part, with clean history; with merge you get _final result_
>> which will apply cleanly, with rebase you would get that series
>> of patches will apply cleanly.
> 
> Yes, that's something that I'd heard about the kernel development
> methodology-- that a series of small patches is preferred to one patch
> that makes the whole change.
> 
> That's not the way we operate.  We like to review all the changes at
> once.  But because bundles are applied with a 'merge' command, not a
> 'patch' command, an old bundle will tend to apply more cleanly than an
> old patch would.

Perhaps it would be nice to have "bundles" in git too. As of now
we can save arbitrary part of history in a pack, but it is binary
not textual representation.

Some of git workflow stems from old, pre-SCM Linux kernel workflow
of sending _patches_ via email.


By the way, are bzr "bundles" compatible with ordinary patch?
git-format-patch patches are. They have additional metainfo,
but they are patches at heart.
  
>>> and more.  Because Python supports monkey-patching, a plugin can change
>>> absolutely anything.
>>
>> Which is _not_ a good idea. Git is created in such way, that the repository
>> is abstracted away (introduction of pack format, and improving pack format
>> can and was done "behind the scenes", not changing any porcelain-ish (user)
>> commands), but we don't want any change that would change this abstraction.
> 
> I'm not sure what you think Bazaar does.  In Bazaar, a repository format
> plugin  implements the same API that a native repository format does.
> 
> This is how bzr supports Subversion, Mercurial and Git repositories.

But if I remember correctly Subversion does not remember merge points
(merge commits), so how can you provide full Bazaar-NG compatibility
with Subversion repository as backend? Some repository formats lack
some features. Besides, as I said repository database and stuff is
quite well abstracted away.

In git we have import tools (most of them capable of incremental import),
a few exchange tools like git-cvsexportcommit, git-cvsserver, and
Tailor-like git-svn.
 
>> Changing repository format is not a good idea for "dumb" protocols;
> 
> I can't parse this.  Repository formats and protocols are different
> things, right?

"Dumb" protocols in git are protocols for which server provides access
to the contents of the git repository plus some additional info (usually
generated using hooks). The client (be it git-fetch or git-push) discovers
which files to download or what to upload, but it can only download the
repository "as is". So if the server repository was created with a repository
format plugin, and the client doesn't have said plugin, you are out of luck.
 
>> native protocol is quite extensible
> 
> I was meaning dumb protocol extension.  I can't say how extensible the
> bzr native protocol is.

The native git protocol (git:// and git+ssh://) does feature discovery, then
negotiates which contents have to be sent, and finally tries to send a minimal
number of objects.
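
A rough sketch of the "minimal number of objects" idea, under loud
assumptions: history is a plain mapping from object id to parent ids, and
reachable and objects_to_send are hypothetical helpers. This is not the real
wire protocol, and feature discovery is not modelled at all. The receiver
says what it wants and what it already has, and the sender transmits only
the difference.

```python
def reachable(parents, tips):
    """Return every object reachable from the given tips."""
    seen = set()
    stack = list(tips)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(parents.get(obj, ()))
    return seen


def objects_to_send(parents, wants, haves):
    """Objects the receiver lacks: reachable from wants but not from haves."""
    return reachable(parents, wants) - reachable(parents, haves)


history = {"a": (), "b": ("a",), "c": ("b",), "d": ("b",), "e": ("d", "c")}
to_send = objects_to_send(history, wants=["e"], haves=["c"])
```

Here a receiver that already has everything up to c and asks for e would be
sent only d and e.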

>> Adding
>> cURL based FTP read-only support to existing HTTP support was a matter
>> of few lines, if I remember correctly.
> 
> We support read and write over native, ftp and WebDAV (a plugin).  We
> also have readonly http support.

Git has read-only access over git:// protocol (served by git-daemon on
port 9418), read-write access over git+ssh:// protocol (you can limit
exposure using git-shell), read-only access via HTTP, HTTPS, FTP "dumb"
protocols, read-write access via WebDAV "dumb" protocol.

Git is open-source, we don't need plugins ;-)
-- 
Jakub Narebski
ShadeHawk on #git
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:57                     ` Jakub Narebski
@ 2006-10-17 22:59                       ` Jakub Narebski
  2006-10-17 23:16                       ` Linus Torvalds
  2006-10-17 23:33                       ` Aaron Bentley
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 22:59 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Jakub Narebski wrote:

> Git has read-only access over git:// protocol (served by git-daemon on
> port 9418), read-write access over git+ssh:// protocol (you can limit
> exposure using git-shell), read-only access via HTTP, HTTPS, FTP "dumb"
> protocols, read-write access via WebDAV "dumb" protocol.

And the deprecated rsync:// "dumb" protocol, which is read-only (I think)
and recommended only for cloning.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:53                   ` Aaron Bentley
@ 2006-10-17 23:09                     ` Linus Torvalds
  2006-10-18  0:23                       ` Aaron Bentley
  2006-10-17 23:24                     ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17 23:09 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> > 
> > Excuse me? What does that "throws away your local commit ordering" mean?
> 
> Say this is the ordering in branch A:
> 
> a
> |
> b
> |
> c
> 
> Say this is the ordering in branch B:
> 
> a
> |
> b
> |\
> d c
> |/
> e
> 
> When A pulls B, it gets the same ordering as B has.  If B did not have e
> and c, the pull would fail.

Sure. But that doesn't throw away any local commit ordering. The original 
order (a->b->c) is still very much there. The fact that there was a branch 
off 'b' and there is also (a->b->d) and a merge of the two at 'e' doesn't 
take away anything from the original local commit ordering. 
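
To make that concrete, here is a small sketch using the two graphs quoted
above. It models only the DAG view of history (a mapping from commit to
parents), not bzr's notion of a mainline, and history_edges is a
hypothetical helper. Every parent link that existed in A before the pull is
still present afterwards.

```python
def history_edges(parents, tip):
    """Return every (child, parent) edge reachable from tip."""
    edges = set()
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        for parent in parents.get(commit, ()):
            edges.add((commit, parent))
            stack.append(parent)
    return edges


branch_a = {"a": (), "b": ("a",), "c": ("b",)}
branch_b = {"a": (), "b": ("a",), "d": ("b",), "c": ("b",), "e": ("d", "c")}

before_pull = history_edges(branch_a, "c")
after_pull = history_edges(branch_b, "e")

# True when nothing from the original a->b->c chain has been lost.
preserved = before_pull <= after_pull
```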

> > So generating an extra "merge" commit would be actively wrong, and adds 
> > "history" that is not history at all.
> 
> It's not a tree change, but it records the fact that one branch merged
> the other.

But that's a totally specious "record". It has no meaning in a distributed 
SCM. There is absolutely zero semantic information in it.

The fact that you _locally_ want to remember where you were is a total 
non-issue for a true distributed system. You shouldn't force everybody 
else to see your local view - since it has no relevance to them, and 
doesn't add any information.

> Maybe not in Git.

I don't think there is any in bzr either. Can you explain?

In other words, the empty merge is totally semantically empty even in the 
bazaar world. Why does it exist?

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:56                         ` Sean
@ 2006-10-17 23:11                           ` Jakub Narebski
  2006-10-18 21:04                           ` Charles Duffy
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:11 UTC (permalink / raw)
  To: Sean; +Cc: Andreas Ericsson, bazaar-ng, git

/me too post ;-)

Sean wrote:
> On Tue, 17 Oct 2006 18:44:11 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> 
> > That can lead to feature bloat.  Some plugins are not useful to
> > everyone, e.g. Mercurial repository support.  Some plugins introduce
> > additional dependencies that we don't want to have in the core (e.g. the
> > rsync, baz-import and graph-ancestry commands).
> 
> Shrug, it's really not that tough to do in regular ole source code.
> On Fedora for instance you have your choice of which rpms you want
> to install to get the features of Git you want.

git-core, git-email, git-arch, git-cvs, git-svn, gitk
(and git-debuginfo).

gitk and gitweb were developed in their own repositories, but some time
ago got incorporated into the git repository. We have a contrib/ area.
QGit, Cogito, StGit are developed separately.

> > Plugins also don't have Bazaar's rigid release cycle, testing
> > requirements and coding conventions, so they are a convenient way to try
> > out an idea, before committing to the effort of getting it merged into
> > the core.
> 
> Hmm.. It's pretty easy to test out Git ideas too.  People do it all
> the time, and without plugins.  Junio maintains several such trees
> for instance.  Dunno.. I just think plugins _sound_ good to developers
> without much real benefit to users over regular ole source code.

Thanks to the many low-level (plumbing, in git-speak) commands it is very
easy to prototype (actually, write) a new command in a language suitable
for fast prototyping, i.e. shell or Perl (or Python, too). Then if it is
performance critical, or if it gets troublesome to manage the shell script
version, it gets rewritten in C as a builtin command.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:57                     ` Jakub Narebski
  2006-10-17 22:59                       ` Jakub Narebski
@ 2006-10-17 23:16                       ` Linus Torvalds
  2006-10-18  5:36                         ` Jeff King
  2006-10-17 23:33                       ` Aaron Bentley
  2 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17 23:16 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git



On Wed, 18 Oct 2006, Jakub Narebski wrote:
> 
> Perhaps it would be nice to have "bundles" in git too. As of now
> we can save arbitrary part of history in a pack, but it is binary
> not textual representation.
> 
> Some of git workflow stems from old, pre-SCM Linux kernel workflow
> of sending _patches_ via email.

Actually, the reason to _not_ have bundles very much stems from the fact 
that BK did have bundles, and they were pretty horrid.

It would be easy to send the exact same data as the native git protocol 
sends over ssh (or the git port) as an email encoding. We did that a few 
times with BK (there it's called "bk send" and "bk receive" to pack and 
unpack those things), and after doing it about five times, I absolutely 
refused to ever do it again. There's just no point, except to make your 
mailbox grow without bounds, and it was really annoying. 

So sending things as patches is just a lot more convenient if you want 
emails.  And if you want to sync two repos directly, I think we've gotten 
sufficiently past the old UUCP days when you want to use email as a 
packetization medium.

That said, "bundles" certainly wouldn't be _hard_ to do. And as long as 
nobody tries to send _me_ any of them, I won't mind ;)

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
  2006-10-17 23:18                   ` Sean
@ 2006-10-17 23:18                   ` Sean
  2006-10-17 23:33                   ` Petr Baudis
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-17 23:18 UTC (permalink / raw)
  To: Robert Collins; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Wed, 18 Oct 2006 08:27:58 +1000
Robert Collins <robertc@robertcollins.net> wrote:

> Be as blunt as you want. You're expressing an opinion, and that's fine. I
> happen to think that we're right: users appear to really appreciate
> this bit of the UI, and I've not yet seen any evidence of confusion
> about it - though I will admit there is the possibility of that
> occurring.

Yeah, but it's an opinion that is based on a huge real world project with
hundreds of developers.  If Bazaar is ever used in a project of that
size it may just see the same type of issues as Bk.  As has been mentioned
elsewhere, Git users really appreciate the short forms it provides for
referencing commits, so much so that there is no reason to invent a
new (unstable) numbering system or attempt to hide the true underlying
commit identities.
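
As a small illustration of why those short forms are stable: an object's
name is the SHA-1 of a short header plus its content, so it is the same in
every repository, and a short form is just a prefix of that name that is
still unambiguous. The helper names below are hypothetical, and the
minimum length of 4 is an assumption for the sketch.

```python
import hashlib


def object_id(kind, body):
    """SHA-1 of a git-style object: '<kind> <size>', a NUL byte, then body."""
    header = ("%s %d" % (kind, len(body))).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def shortest_unique_prefix(target, all_ids, minimum=4):
    """Shortest prefix of target (at least minimum chars) matching no other id."""
    others = [i for i in all_ids if i != target]
    for length in range(minimum, len(target) + 1):
        prefix = target[:length]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    return target


ids = [object_id("blob", data) for data in (b"", b"x", b"y")]
short = shortest_unique_prefix(ids[0], ids)
```

The prefix may need to grow as a repository gains objects, but the full
name it abbreviates never changes.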

Just out of curiosity is there a Bazaar repo of the Linux kernel available
somewhere?

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:53                   ` Aaron Bentley
  2006-10-17 23:09                     ` Linus Torvalds
@ 2006-10-17 23:24                     ` Jakub Narebski
  2006-10-17 23:50                       ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:24 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:

[...]

>> So generating an extra "merge" commit would be actively wrong, and adds
>> "history" that is not history at all.
> 
> It's not a tree change, but it records the fact that one branch merged
> the other.
> 
>> It also means that if people merge back and forth from each other, you get
>> into an endless loop of useless merge commits.
> 
> You can pull if you don't want that.  We haven't found that people are
> very fussed about it.
> 
>> There's no reason _ever_ to not just fast-forward if one repository is a
>> strict superset of the other.
> 
> Maybe not in Git.

Think what the existence of merge commit is for. It is a place where
we can record how we resolved conflicts. It means: we _merged_ (joined)
two (or more: does bzr support octopus merge?) lines of development.

A merge commit in the fast-forward case only marks "here we did a pull"
(here we downloaded from the other repository). It is just a marker whose
place is in the reflog, not in history. It only clutters history.


Besides one of canonical workflows used and encouraged by git is:

 * repository A does its own work on branch 'master',
   and fetches changes from 'master' branch of repository B
   into branch 'origin'. "git pull origin" when on branch 'master'
   fetches changes from 'master' branch of repository B (requiring
   usually that it fast-forwards) into branch 'origin', then
   merges branch 'origin' into branch 'master', automatically
   creating merge commit message.

 * repository B does its own work on branch 'master',
   and fetches changes from 'master' branch of repository A
   into [tracking] branch 'origin'. (...)

Instead of pull/fetch, we could use push.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 19:44               ` Aaron Bentley
@ 2006-10-17 23:28                 ` Petr Baudis
  2006-10-17 23:39                 ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-17 23:28 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Dear diary, on Tue, Oct 17, 2006 at 09:44:37PM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Andreas Ericsson wrote:
> >> In our terminology, if it can diverge from the original, it's a branch,
> >> not a checkout.
> >>
> > 
> > This clears things up immensely. bazaar checkout != git checkout.
> > I still fail to see how a local copy you can't commit to is useful
> 
> My bzr is run from a local copy I can't commit to.  To get the latest
> changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
> To merge the latest changes into my branch, I can run
> "bzr merge ~/bzr/dev".  It's also convenient for applying other peoples'
> patches to.

The question is, why is it useful to enforce the "no commit" rule? Git
can work exactly the same, it just doesn't _enforce_ the rule. And is
the capability of enforcing such a rule important enough to warrant its
own column in the comparison table?

* Re: VCS comparison table
       [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
  2006-10-17 23:18                   ` Sean
  2006-10-17 23:18                   ` Sean
@ 2006-10-17 23:33                   ` Petr Baudis
  2006-10-18  5:26                     ` Robert Collins
  2 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-17 23:33 UTC (permalink / raw)
  To: Sean; +Cc: Robert Collins, Linus Torvalds, bazaar-ng, git, Jakub Narebski

Dear diary, on Wed, Oct 18, 2006 at 01:18:38AM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Wed, 18 Oct 2006 08:27:58 +1000
> Robert Collins <robertc@robertcollins.net> wrote:
> 
> > Be as blunt as you want. You're expressing an opinion, and thats fine. I
> > happen to think that we're right : users appear to really appreciate
> > this bit of the UI, and I've not yet seen any evidence of confusion
> > about it - though I will admit there is the possibility of that
> > occurring.
> 
> Yeah, but it's an opinion that is based on a huge real world project with
> hundreds of developers.  If Bazaar is ever used in a project of that
> size it may just see the same type of issues as Bk.  As has been mentioned
> elsewhere, Git users really appreciate the short forms it provides for
> referencing commits, so much so that there is no reason to invent a
> new (unstable) numbering system or attempt to hide the true underlying
> commit identities.

BTW, I think it's fine to build a system optimized for small-scale
projects (if that's the intent), simplifying some things in favour of
mostly straight histories instead of more complicated merge situations
(although I tend to agree with Linus that if you don't behave the way
users are used to in 100% of cases, then the more frequently you do
behave that way, the worse it comes back to bite in the rare cases
where you don't). Just as RCS is fine when maintaining individual files
for personal usage (I still actually occasionally use it for a few
files).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-17 22:57                     ` Jakub Narebski
  2006-10-17 22:59                       ` Jakub Narebski
  2006-10-17 23:16                       ` Linus Torvalds
@ 2006-10-17 23:33                       ` Aaron Bentley
  2006-10-18  8:13                         ` Andreas Ericsson
  2 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-17 23:33 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
> By the way, are bzr "bundles" compatible with ordinary patch?
> git-format-patch patches are. They have additional metainfo,
> but they are patches at heart.

Yes, they are.

>> I'm not sure what you think Bazaar does.  In Bazaar, a repository format
>> plugin  implements the same API that a native repository format does.
>>
>> This is how bzr supports Subversion, Mercurial and Git repositories.
> 
> But if I remember correctly Subversion does not remember merge points
> (merge commits), so how can you provide full Bazaar-NG compatibility
> with Subversion repository as backend? Some repository formats lack
> some features.

That's true.  We support merge points in a way that's compatible with
svk.  Subversion allows revisions to have arbitrary properties, and svk
sets a property to indicate merges.

> In git we have import tools (most of them capable of incremental import),
> a few exchange tools like git-cvsexportcommit, git-cvsserver, and
> Tailor-like git-svn.

Bzr's subversion support is quite nice.  You can commit, merge, run
history viewers.

There are screenshots and stuff here:
http://bazaar-vcs.org/BzrForeignBranches/Subversion

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNWhc0F+nu1YWqI0RAkH7AJ4/S648shA8IKg42xcGWdjnjmA+PgCdEDhg
Af/mcG+XTy3Tsb9b1x3rYcg=
=xnjF
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-17 21:01             ` Jakub Narebski
  2006-10-17 21:27               ` Aaron Bentley
@ 2006-10-17 23:35               ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Dnia wtorek 17. października 2006 23:01, Jakub Narebski napisał:
> Aaron Bentley wrote:
> > Andreas Ericsson wrote:
> >> Aaron Bentley wrote:
> >>> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> >>> positive numbers to refer to the number of commits that have been made
> >>> since the branch was initialized.
> >>>
> >>
> >> What do you do once a branch has been thrown away, or has had 20 other
> >> branches merged into it? Does the offset-number change for the revision
> >> then, or do you track branch-points explicitly?
> > 
> > We always track the number of parents since the initial commit in the
> > project.  Sorry, I don't think I said that clearly before.
> 
> While this I think is quite reliable (there was idea to store "generation
> number" with each commit, e.g. using not implemented "note" header, or
> commit-id to generation number "database" as a better heuristic than
> timestamp for revision ordering in git-rev-list output), and probably
> independent on repository (it is global property of commit history,
> and commit history is included in sha1 of its parents), numbering branching
> points is unreliable, as is relying on branch names.

Take for example the following situation:


Initially we had

  A--B--C--D  - repository A

and we cloned the repository:

  A--B--C--D  - repository B

Then, in parallel/independently we branched off C in repository A, and
branched off B in repository B

          -x
         /
  A--B--C--D  - repository A


  A--B--C--D  - repository B
      \
       -y

If we then fetch changes from B into A, and fetch changes from A into B,
then in repository A the branch off C appeared earlier (before the branch
off B), while in repository B the branch off C appeared later (after the
branch off B).
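
To make the contrast concrete, here is a minimal sketch in Python. It is
only an illustration: the function names, the "arrival order" lists and
the numbering rule for branch points are assumptions made up for this
example, not anything git or bzr implements. It computes a generation
number from ancestry alone, and numbers branch points in the order each
repository learned about them.

```python
# Commit graph from the example above: commit -> list of parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "x": ["C"],  # branched off C in repository A
    "y": ["B"],  # branched off B in repository B
}


def generation(commit, parents=PARENTS):
    """Longest distance from a root commit; depends only on ancestry."""
    ps = parents[commit]
    return 0 if not ps else 1 + max(generation(p, parents) for p in ps)


def branch_point_numbers(arrival_order, parents=PARENTS):
    """Number branch points in the order this repository first saw them.

    A commit becomes a branch point when its second child arrives.
    """
    numbers = {}
    children_seen = {}
    for commit in arrival_order:
        for p in parents[commit]:
            children_seen[p] = children_seen.get(p, 0) + 1
            if children_seen[p] == 2 and p not in numbers:
                numbers[p] = len(numbers) + 1
    return numbers


# Hypothetical arrival orders: each repository creates its own branch
# first and only later fetches the other one.
repo_a = ["A", "B", "C", "D", "x", "y"]
repo_b = ["A", "B", "C", "D", "y", "x"]

generations = {c: generation(c) for c in PARENTS}
numbers_in_a = branch_point_numbers(repo_a)
numbers_in_b = branch_point_numbers(repo_b)
```

The generation numbers come out the same no matter which repository
computes them, while numbers_in_a and numbers_in_b disagree about which
branch point is "first".
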
-- 
Jakub Narebski
Poland

* Re: VCS comparison table
  2006-10-17 19:44               ` Aaron Bentley
  2006-10-17 23:28                 ` Petr Baudis
@ 2006-10-17 23:39                 ` Jakub Narebski
  2006-10-18  0:24                   ` Aaron Bentley
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:39 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
>> This clears things up immensely. bazaar checkout != git checkout.
>> I still fail to see how a local copy you can't commit to is useful
> 
> My bzr is run from a local copy I can't commit to.  To get the latest
> changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
> To merge the latest changes into my branch, I can run
> "bzr merge ~/bzr/dev".  It's also convenient for applying other peoples'
> patches to.

Can you do "bzr log" in 'checkout', without need to specify "~/bzr/dev"?
If not, how does this differ from checking out (in git terminology)
outside the default working area, which requires providing GIT_DIR or
--git-dir for every command?
-- 
Jakub Narebski
Poland

* Re: VCS comparison table
  2006-10-17 23:24                     ` Jakub Narebski
@ 2006-10-17 23:50                       ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-17 23:50 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, Andreas Ericsson, bazaar-ng, git



On Wed, 18 Oct 2006, Jakub Narebski wrote:
> 
> Merge commit in fast-forward case is only marking "here we did a pull"
> (here we downloaded from other repository). It is just a marker which
> place is in reflog, not in history. It is only cluttering history.

For non-git people (and maybe even git people who didn't follow some of 
the "reflog" work):

 - git does actually have "local view" support, but it is very much 
   _defined_ to be local. It does not pollute any history as seen by 
   anybody else. It's called "reflog" (where "ref" is just the git name 
   for any reference into a tree, and the "log" part is hopefully obvious)

So each git repository can have (if you enable it) a full log of all the 
changes to each branch. But it's not in the core git datastructures that 
get replicated - because the local view of how the branches have changed 
really _is_ just a local view. It's just a local log to each repository 
(actually, one per branch).

It's what allows a git person to say

	git diff "master@{5.hours.ago}"

because while "5 hours ago" is _not_ well-defined in a distributed 
environment (five hours ago for _whom_?) it's perfectly well-defined in a 
purely _local_ sense of one particular branch.

So there's no need for a fakey "merge" that isn't a real merge and that 
doesn't make sense for anybody else because it doesn't actually add any 
real knowledge about the _history_ of the tree (only about a single 
repository). If you want to see how the history of a particular repository 
has evolved, you can just look at the reflog (although admittedly, common
tools like "gitk" don't even show it - the data is there if they
want it, but the most common usage is the above kind of "show me what
happened in the last five hours in my current branch").
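
As a rough illustration of why a time-based lookup is well-defined
locally, here is a minimal Python sketch of a per-branch log. The class
name, method names, commit ids and timestamps are all hypothetical; this
models the idea only and is not git's reflog format or code.

```python
from datetime import datetime, timedelta


class Reflog:
    """Repository-local log of the values one branch has pointed at."""

    def __init__(self):
        self.entries = []  # (timestamp, commit_id), oldest first

    def update(self, when, commit_id):
        """Record that the branch started pointing at commit_id."""
        self.entries.append((when, commit_id))

    def at(self, when):
        """Return what the branch pointed at at time `when`.

        Returns None if the branch did not exist yet.
        """
        value = None
        for timestamp, commit_id in self.entries:
            if timestamp > when:
                break
            value = commit_id
        return value


# Hypothetical local history of 'master' in one repository.
now = datetime(2006, 10, 18, 0, 0)
master = Reflog()
master.update(now - timedelta(hours=8), "c1")
master.update(now - timedelta(hours=6), "c2")
master.update(now - timedelta(hours=1), "c3")

five_hours_ago = master.at(now - timedelta(hours=5))
```

The answer depends only on when this particular repository moved its own
branch, which is why the log never needs to be replicated to anybody
else.
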

			Linus

* Re: VCS comparison table
  2006-10-17 14:41                     ` Jakub Narebski
@ 2006-10-18  0:00                       ` Petr Baudis
  2006-10-18  0:30                         ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  0:00 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthieu Moy, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 04:41:02PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> "Bundle" equivalent, although binary in nature, would be thin pack.

It should be noted that there's no user interface for sending/receiving
that and I suspect no reasonably usable user interface for creating it.

How frequently are the bundles used in practice?

It's a cultural difference, I suspect. Git comes from an environment
based on intensive exchanges of patches and patch series, one that does
not mandate that developers use any tool besides diff/patch, so Git is
very focused on good support for applying patches, and given this there
simply has been no big conscious demand for bundle support.

Another aspect of this is that Git (Linus ;) is very focused on getting
the history right, nice and clean (though it does not _mandate_ it and
you can just wildly do one commit after another; it just provides tools
to easily do it). This means that the downstream maintainers have to
rebase patches, possibly reorder them, and update the changesets with
bugfixes instead of stacking the bugfixes upon them in separate changes
- then Linus merges the patches and only at that point they are "etched"
forever. This means that the history will contain a neatly laid out
account of how $FEATURE was achieved, but of course it also means more
work for downstream maintainers.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
       [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
  2006-10-17 15:06                                 ` Sean
  2006-10-17 15:06                                 ` Sean
@ 2006-10-18  0:14                                 ` Petr Baudis
  2006-10-18  1:36                                   ` Integrating gitweb and git-browser (was: Re: VCS comparison table) Jakub Narebski
  2 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  0:14 UTC (permalink / raw)
  To: Sean; +Cc: Matthieu Moy, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 05:06:55PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> [1] As an aside, I don't understand why bazaar pushes the idea
> of "plugins".  For instance someone mentioned that bazaar has
> a bisect "plugin".  Well Git was able to add a bisect "command"
> without needing a plugin architecture.. so i'm at a loss as 
> to why plugins are seen as an advantage.

Greater flexibility: you can "provide this great Git addon that will
let you push over FTP" without requiring users to patch their Git
installations or wait for a new Git version that might include it.
This is especially important if you want a lot of users to test out
your experimental feature, or if it's something project-specific, etc.

BTW, I'm thinking about implementing some plugin functionality for
gitweb so that you can add your own views, so that git-browser can
integrate with it more reasonably. (Currently it has a completely
different UI and you have to patch gitweb in order to get the proper
links in the proper places.) Sure, git-browser might get fully
integrated into gitweb later, but that needs to be done sensitively so
that people are not scared by the horrible javascript blobs, etc.;
currently git-browser is very experimental, and adding it would be
quite intrusive.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-17 23:09                     ` Linus Torvalds
@ 2006-10-18  0:23                       ` Aaron Bentley
  2006-10-18  0:46                         ` Jakub Narebski
                                           ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Aaron Bentley wrote:
>>> Excuse me? What does that "throws away your local commit ordering" mean?
>> Say this is the ordering in branch A:
>>
>> a
>> |
>> b
>> |
>> c
>>
>> Say this is the ordering in branch B:
>>
>> a
>> |
>> b
>> |\
>> d c
>> |/
>> e
>>
>> When A pulls B, it gets the same ordering as B has.  If B did not have e
>> and c, the pull would fail.
> 
> Sure. But that doesn't throw away any local commit ordering. The original 
> order (a->b->c) is still very much there.

After the pull, it's no longer the mainline ordering for the branch.  c
is represented as a revision that was merged into the branch, while d is
represented as a commit on the mainline of the branch.

> The fact that there was a branch 
> off 'b' and there is also (a->b->d) and a merge of the two at 'e' doesn't 
> take away anything from the original local commit ordering.

It means the order that revisions are shown in log commands changes,
and the revision numbers can change.
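
A small Python sketch of that effect on the a..e example above. It
assumes a simplified model in which a revision number is just a
commit's position along the first-parent chain from the branch tip; the
function names are hypothetical and this is not bzr's actual code.

```python
def mainline(tip, parents):
    """Follow first parents from the tip back to the root, oldest first."""
    line = []
    commit = tip
    while commit is not None:
        line.append(commit)
        ps = parents[commit]
        commit = ps[0] if ps else None
    return list(reversed(line))


def revnos(tip, parents):
    """Map each mainline commit to its 1-based position on the mainline."""
    return {c: n for n, c in enumerate(mainline(tip, parents), start=1)}


# Graph from the example: e merges c into the line a-b-d, so the first
# parent of e is d.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}

before_pull = revnos("c", parents)  # branch A before pulling B
after_pull = revnos("e", parents)   # branch A after pulling B
```

Before the pull, c holds position 3 on A's mainline; afterwards position
3 belongs to d, and c no longer has a mainline number at all.
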

> But that's a totally specious "record". It has no meaning in a distributed 
> SCM. There is absolutely zero semantic information in it.

It records the committer, the date, the commit message, the parent
revisions.

> The fact that you _locally_ want to remember where you were is a total 
> non-issue for a true distributed system. You shouldn't force everybody 
> else to see your local view - since it has no relevance to them, and 
> doesn't add any information.

Nobody is forced to use your local view.

> In other words, the empty merge is totally semantically empty even in the 
> bazaar world. Why does it exist?

It exists because it is useful.  Because it makes the behavior of bzr
merge uniform.  Because in some workflows, commits show that a person
has signed off on a change.

It's not something special-- it's just another commit, like regular
commits, and merge commits.  It would be harder to forbid than it is to
permit.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXQQ0F+nu1YWqI0RAnxDAJ4hbuLkEK1eBlyoEOz7NAlqLVth9gCfed4w
nfeiR2KVvN+N9zdSrC8MKcY=
=et73
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-17 23:39                 ` Jakub Narebski
@ 2006-10-18  0:24                   ` Aaron Bentley
  0 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, Linus Torvalds, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>>> This clears things up immensely. bazaar checkout != git checkout.
>>> I still fail to see how a local copy you can't commit to is useful
>> My bzr is run from a local copy I can't commit to.  To get the latest
>> changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
>> To merge the latest changes into my branch, I can run
>> "bzr merge ~/bzr/dev".  It's also convenient for applying other peoples'
>> patches to.
> 
> Can you do "bzr log" in 'checkout', without need to specify "~/bzr/dev"?

Sure.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXRU0F+nu1YWqI0RAptIAJ0btflKFEjF9a7Kt/qVZufK003DpACeK7Dc
leW4ICG1LbOC9DGrAd5ztlY=
=JGvL
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-17 12:03                 ` Matthieu Moy
  2006-10-17 12:56                   ` Jakub Narebski
       [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
@ 2006-10-18  0:25                   ` Petr Baudis
  2006-10-18  0:38                     ` Aaron Bentley
       [not found]                     ` <4535778D.40006@utoronto.ca>
  2006-10-18  1:11                   ` Petr Baudis
  3 siblings, 2 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  0:25 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Sean, Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng,
	git

Dear diary, on Tue, Oct 17, 2006 at 02:03:21PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> Sean <seanlkml@sympatico.ca> writes:
> 
> > On Tue, 17 Oct 2006 13:19:08 +0200
> > Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> >
> >> 1) a working tree without any history information, pointing to some
> >>    other location for the history itself (a la svn/CVS/...).
> >>    (this is "light checkout")
> >
> > Git can do this from a local repository, it just can't do it from
> > a remote repo (at least over the git native protocol).  However,
> > over gitweb you can grab and unpack a tarball from a remote repo.
> > In practice this is probably enough support for such a feature.
> 
> Anyway, given the price of disk space today,

(In rich countries. This may still be very different in poorer
countries.  E.g. some actual mplayer developer(s) from Turkey opposed
transition to a distributed version control system simply because they
have trouble affording the required additional diskspace for the full
history.  SVN is already very space-hungry for them.  (It stores
basically two complete checkouts in parallel.))

But the much bigger practical problem is bandwidth, plenty of people
still have internet connections where downloading several tens/hundreds
of megabytes of the complete history is quite a big thing, and the
servers ain't gonna be happy from that either, nor those paying the
bandwidth bills. ;-) And this is one of the big problems the Mozilla
guys have - having everyone download 450M worth of the full CVS-imported
history (and I'll bet no other VCS will beat that size) seems to be not
an option at all.

> this only makes sense if
> you have a fast access to the repository (otherwise, you consider your
> local repository as a cache, and you're ready to pay the disk space
> price to save your bandwidth). In this case, it's often in your
> filesystem (local or NFS).

So how is the light checkout actually implemented? Do you grab the
complete new snapshot each time the remote repository is updated? Do all
the (at least read-only, like "log" and "diff", perhaps "status")
commands work on such a light checkout?

This is something sorely missing in Git but if it's really only "we just
provide bandwidth-expensive way to keep your tree up-to-date and that's
all," that would not be hard at all to implement in Git too, using
git-archive --remote.
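
A minimal sketch of that bandwidth-expensive approach, in Python. It
assumes a git new enough to provide "git archive --remote" and a server
that allows it; the function names and the example URL are hypothetical.
Every refresh downloads a complete snapshot, and files deleted upstream
are not removed from the destination directory.

```python
import io
import subprocess
import tarfile


def archive_command(remote, treeish="HEAD"):
    """Build the command that asks `remote` for a tar of `treeish`."""
    return ["git", "archive", "--format=tar", f"--remote={remote}", treeish]


def refresh_snapshot(remote, dest, treeish="HEAD"):
    """Fetch a full snapshot of `treeish` and unpack it into `dest`.

    No history is stored locally, so commands like log or diff against
    older revisions cannot work on the result.
    """
    result = subprocess.run(
        archive_command(remote, treeish),
        capture_output=True,
        check=True,  # raises CalledProcessError if git fails
    )
    with tarfile.open(fileobj=io.BytesIO(result.stdout), mode="r:") as tar:
        tar.extractall(dest)


# Example (placeholder URL, not run here):
#   refresh_snapshot("git://example.org/project.git", "project", "master")
```
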

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-18  0:00                       ` Petr Baudis
@ 2006-10-18  0:30                         ` Aaron Bentley
  2006-10-18  0:39                           ` Petr Baudis
                                             ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:30 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
> How frequently are the bundles used in practice?

Many times each day.  Most submissions to the bzr mainline are done with
bundles.

> Another aspect of this is that Git (Linus ;) is very focused on getting
> the history right, nice and clean (though it does not _mandate_ it and
> you can just wildly do one commit after another; it just provides tools
> to easily do it).

Yes, rebasing is very uncommon in the bzr community.  We would rather
evaluate the complete change than walk through its history.  (Bundles
only show the changes you made, not the changes you merged from the
mainline.)

In an earlier form, bundles contained a patch for every revision, and
people *hated* reading them.  So there's definitely a cultural
difference there.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXWW0F+nu1YWqI0RAuRnAJ9aZVLo4T1sfmyGC2t364UyHX+6wACff7sM
peal5rAdk/T515RGeKXkWlo=
=O61J
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-18  0:25                   ` Petr Baudis
@ 2006-10-18  0:38                     ` Aaron Bentley
       [not found]                     ` <4535778D.40006@utoronto.ca>
  1 sibling, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:38 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng,
	git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
>> this only makes sense if
>> you have a fast access to the repository (otherwise, you consider your
>> local repository as a cache, and you're ready to pay the disk space
>> price to save your bandwidth). In this case, it's often in your
>> filesystem (local or NFS).
> 
> So how is the light checkout actually implemented? Do you grab the
> complete new snapshot each time the remote repository is updated?

No, the lightweight checkouts store very little.  They have
- a copy of tree shape (filenames, paths, sha1 sums) from the last
  commit.
- a copy of tree shape for the current working directory
- a map from stat values to sha-1 hashes
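
The last item can be pictured with a short Python sketch: a cache keyed
on a file's stat values, so that unchanged files are not re-read and
re-hashed. The class name and the choice of key (path, size, mtime) are
assumptions for illustration only, not bzr's actual on-disk format.

```python
import hashlib
import os
import tempfile


class StatCache:
    """Map stat values to SHA-1 digests to avoid re-hashing files."""

    def __init__(self):
        self._cache = {}  # (path, size, mtime_ns) -> sha1 hex digest
        self.hashed = 0   # number of times a file was actually read

    def sha1(self, path):
        st = os.stat(path)
        key = (path, st.st_size, st.st_mtime_ns)
        if key not in self._cache:
            with open(path, "rb") as f:
                self._cache[key] = hashlib.sha1(f.read()).hexdigest()
            self.hashed += 1
        return self._cache[key]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hello.txt")
    with open(path, "wb") as f:
        f.write(b"hello\n")
    cache = StatCache()
    first = cache.sha1(path)
    second = cache.sha1(path)  # stat unchanged, so served from the cache
    files_hashed = cache.hashed
```

A real implementation would also have to deal with files modified within
the timestamp granularity, which this sketch ignores.
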


> Do all
> the (at least read-only, like "log" and "diff", perhaps "status")
> commands work on such a light checkout?

Yes.  And if you check out from a read-write branch, all write commands
work, too.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXeN0F+nu1YWqI0RAsdrAJ0bUj4swxm5sod9WnsbPZ9yIQ7FVQCdE4UB
8x0ddFkbr5cPISTihw96d8c=
=/XAr
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-18  0:30                         ` Aaron Bentley
@ 2006-10-18  0:39                           ` Petr Baudis
  2006-10-18  1:28                           ` Jakub Narebski
       [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
  2 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  0:39 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Matthieu Moy

Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Petr Baudis wrote:
> > Another aspect of this is that Git (Linus ;) is very focused on getting
> > the history right, nice and clean (though it does not _mandate_ it and
> > you can just wildly do one commit after another; it just provides tools
> > to easily do it).
> 
> Yes, rebasing is very uncommon in the bzr community.  We would rather
> evaluate the complete change than walk through its history.  (Bundles
> only show the changes you made, not the changes you merged from the
> mainline.)
> 
> In an earlier form, bundles contained a patch for every revision, and
> people *hated* reading them.  So there's definitely a cultural
> difference there.

BTW, I think what describes Git's (the kernel's) stance very nicely is
what I call Al Viro's "homework problem":

	http://lkml.org/lkml/2005/4/7/176

If I understand you right, the bzr approach is what's described as "the
dumbest kind" there? (No offense meant!)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
       [not found]                     ` <4535778D.40006@utoronto.ca>
@ 2006-10-18  0:42                       ` Petr Baudis
  2006-10-18  0:48                       ` Jakub Narebski
       [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
  2 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  0:42 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng,
	git

Dear diary, on Wed, Oct 18, 2006 at 02:38:37AM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Petr Baudis wrote:
> >> this only makes sense if
> >> you have a fast access to the repository (otherwise, you consider your
> >> local repository as a cache, and you're ready to pay the disk space
> >> price to save your bandwidth). In this case, it's often in your
> >> filesystem (local or NFS).
> > 
> > So how is the light checkout actually implemented? Do you grab the
> > complete new snapshot each time the remote repository is updated?
> 
> No, the lightweight checkouts store very little.  They have
> - a copy of tree shape (filenames, paths, sha1 sums) from the last
>   commit.
> - a copy of tree shape for the current working directory
> - a map from stat values to sha-1 hashes

I see, I guess that means "the index file and tree objects for the last
commit" in git-speak. Thanks.

> > Do all
> > the (at least read-only, like "log" and "diff", perhaps "status")
> > commands work on such a light checkout?
> 
> Yes.  And if you check out from a read-write branch, all write commands,
> work, too.

Ok, one last question - do you do most of the work locally, fetching
bits of data as you need, or remotely, only taking input/producing
output over the network (the pserver model)?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-18  0:23                       ` Aaron Bentley
@ 2006-10-18  0:46                         ` Jakub Narebski
       [not found]                         ` <200610180246.18758.jnareb@gmail.com>
  2006-10-18  3:25                         ` Ryan Anderson
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18  0:46 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Linus Torvalds wrote:
>>
>> On Tue, 17 Oct 2006, Aaron Bentley wrote:
> >>> Excuse me? What does that "throws away your local commit ordering" mean?
> >> Say this is the ordering in branch A:
> >>
> >> a
> >> |
> >> b
> >> |
> >> c
> >>
> >> Say this is the ordering in branch B:
> >>
> >> a
> >> |
> >> b
> >> |\
> >> d c
> >> |/
> >> e
> >>
> >> When A pulls B, it gets the same ordering as B has.  If B did not have e
> >> and c, the pull would fail.
> >
> > Sure. But that doesn't throw away any local commit ordering. The original
> > order (a->b->c) is still very much there.
> 
> After the pull, it's no longer the mainline ordering for the branch.  c
> is represented a revision that was merged into the branch, while d is
> represented as a commit on the mainline of the branch.

Well, that is another example of why, while a generation number is (or
can be) global, any numbering of branches must be local-only.

> > The fact that there was a branch
> > off 'b' and there is also (a->b->d) and a merge of the two at 'e' doesn't
> > take away anything from the original local commit ordering.
> 
> It means the the order that revisions are shown in log commands changes,

That doesn't matter...

> and the revision numbers can change.

...but that means that revision numbers are totally, absolutely useless.
Unless by some miracle of engineering, or by adding a namespace, they can
be made unchangeable.

> > But that's a totally specious "record". It has no meaning in a distributed
> > SCM. There is absolutely zero semantic information in it.
> 
> It records the committer, the date, the commit message, the parent
> revisions.

All totally empty information. What should the commit message be? "I have
fetched changes from a remote repository"? You can remove one of the
parents (the one pointing to the tip before the fast-forward "merge")
without changing reachability.

              ---------
             /         \
     *--*---x---*---*---y---*
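
A quick Python check of that claim on the graph drawn above. The commit
names p1, p2, m1, m2 and z are made up here to label the unnamed '*'
nodes; y is the fast-forward "merge" whose extra parent is x.

```python
def reachable(tip, parents):
    """Return the set of commits reachable from `tip`, including itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


# History with the marker: y records both m2 and the old tip x.
with_marker = {
    "p1": [], "p2": ["p1"], "x": ["p2"],
    "m1": ["x"], "m2": ["m1"],
    "y": ["m2", "x"],
    "z": ["y"],
}
# Same history with the redundant parent of y removed.
without_marker = dict(with_marker, y=["m2"])

same_history = reachable("z", with_marker) == reachable("z", without_marker)
```

Because x is already an ancestor of m2, the second parent of y adds no
commit to the reachable set.
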

> > The fact that you _locally_ want to remember where you were is a total
> > non-issue for a true distributed system. You shouldn't force everybody
> > else to see your local view - since it has no relevance to them, and
> > doesn't add any information.
> 
> Nobody is forced to use your local view.

But if you record a "fast-forward merge", you force everyone pulling
from your repository to carry this purely local "I have fetched then"
marker, which holds no significant information.

> > In other words, the empty merge is totally semantically empty even in the
> > bazaar world. Why does it exist?
> 
> It exists because it is useful.  Because it makes the behavior of bzr
> merge uniform.  Because in some workflows, commits show that a person
> has signed off on a change.

Signing off on the fact of fetching changes? For a true merge you are signing
off on the fact that there were no conflicts, or you sign off on your
conflict resolution.

> It's not something special-- it's just another commit, like regular
> commits, and merge commits.  It would be harder to forbid than it is to
> permit.

Actually, the check is very easy. And you have to do a similar check when
fetching/pushing to ensure that you don't clobber your changes.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                     ` <4535778D.40006@utoronto.ca>
  2006-10-18  0:42                       ` Petr Baudis
@ 2006-10-18  0:48                       ` Jakub Narebski
       [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18  0:48 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Petr Baudis, Matthieu Moy, Sean, Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
> Petr Baudis wrote:
>>> this only makes sense if
>>> you have a fast access to the repository (otherwise, you consider your
>>> local repository as a cache, and you're ready to pay the disk space
>>> price to save your bandwidth). In this case, it's often in your
>>> filesystem (local or NFS).
>>
>> So how is the light checkout actually implemented? Do you grab the
>> complete new snapshot each time the remote repository is updated?
> 
> No, the lightweight checkouts store very little.  They have
> - a copy of tree shape (filenames, paths, sha1 sums) from the last
>   commit.
> - a copy of tree shape for the current working directory
> - a map from stat values to sha-1 hashes

Ah. So in git terminology it stores the index and the working directory
(and perhaps the name of the branch).
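
A "map from stat values to sha-1 hashes" can be sketched roughly as below.
This is an assumption-laden illustration, not bzr's or git's actual format:
the choice of stat fields, the cache layout and the helper names are all
hypothetical, and the hash is a plain SHA-1 of the file contents.

```python
import hashlib
import os
import tempfile


def stat_key(path):
    """Cheap fingerprint of a file: size, modification time and inode."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns, st.st_ino)


def cached_sha1(path, cache):
    """Return the SHA-1 of the file, re-reading it only if its stat changed."""
    key = (path, stat_key(path))
    if key not in cache:
        with open(path, "rb") as f:
            cache[key] = hashlib.sha1(f.read()).hexdigest()
    return cache[key]


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "file.txt")
    with open(path, "wb") as f:
        f.write(b"hello\n")
    cache = {}
    first = cached_sha1(path, cache)   # reads and hashes the file
    second = cached_sha1(path, cache)  # answered from the cache
    print(first == second, len(cache))  # True 1
```

The point of such a map is that checking for local changes needs only a
stat call per file, not a re-read of its contents.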

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
       [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
@ 2006-10-18  0:50                         ` Aaron Bentley
       [not found]                         ` <45357A6E.3050603@utoronto.ca>
  1 sibling, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:50 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng,
	git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:

> Ok, one last question - do you do most of the work locally, fetching
> bits of data as you need, or remotely, only taking input/producing
> output over the network (the pserver model)?

Personally, I do not do remote commits over slow links.  At home, I use
a single machine, and mirror my repository to a public machine using
rsync.  At work, I store my repository on an NFS server, and push my
repository to a public machine using rsync.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXpu0F+nu1YWqI0RAjPTAJ4w9YOM5XLpnIP9jYywtfMr+LZLvACfdycA
/TYAGUVGweR5+cPtDVAIBq4=
=rsNR
-----END PGP SIGNATURE-----


* Re: VCS comparison table
       [not found]                         ` <45357A6E.3050603@utoronto.ca>
@ 2006-10-18  0:57                           ` Petr Baudis
  2006-10-18  1:05                             ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  0:57 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng,
	git

Dear diary, on Wed, Oct 18, 2006 at 02:50:54AM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Petr Baudis wrote:
> 
> > Ok, one last question - do you do most of the work locally, fetching
> > bits of data as you need, or remotely, only taking input/producing
> > output over the network (the pserver model)?
> 
> Personally, I do not do remote commits over slow links.  At home, I use
> a single machine, and mirror my repository to a public machine using
> rsync.  At work, I store my repository on an NFS server, and push my
> repository to a public machine using rsync.

I meant the work of the commands (bzr log and such), not your personal
workflow. :-) Sorry for being unclear.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
       [not found]                         ` <200610180246.18758.jnareb@gmail.com>
@ 2006-10-18  1:00                           ` Aaron Bentley
  2006-10-18  1:25                             ` Carl Worth
  2006-10-18  3:35                             ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  1:00 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>> Linus Torvalds wrote:
>>> On Tue, 17 Oct 2006, Aaron Bentley wrote:
>>>>> Excuse me? What does that "throws away your local commit ordering" mean?
>>>> Say this is the ordering in branch A:
>>>>
>>>> a
>>>> |
>>>> b
>>>> |
>>>> c
>>>>
>>>> Say this is the ordering in branch B:
>>>>
>>>> a
>>>> |
>>>> b
>>>> |\
>>>> d c
>>>> |/
>>>> e
>>>>
>>>> When A pulls B, it gets the same ordering as B has.  If B did not have e
>>>> and c, the pull would fail.
>>> Sure. But that doesn't throw away any local commit ordering. The original
>>> order (a->b->c) is still very much there.
>> After the pull, it's no longer the mainline ordering for the branch.  c
>> is represented as a revision that was merged into the branch, while d is
>> represented as a commit on the mainline of the branch.
> 
> Well, that is another example of why, while a generation number is (or can
> be) global, any numbering of branches must be local-only.

No.  The numbering always follows the leftmost parent.  So each revision
has a permanent (but non-unique) number.

> That doesn't matter...

It has significant UI impact.

>> and the revision numbers can change.
> 
> ...but that means that revision numbers are totally, absolutely useless,
> unless by some miracle of engineering, or by adding a namespace, they can be
> made unchangeable.

No, because no one pulls unless they're trying to maintain a mirror of
the other branch, or else they decide to throw their local history away.

>> Nobody is forced to use your local view.
> 
> But if you record a "fast-forward merge", you force everyone pulling
> from your repository to carry this purely local "I have fetched then"
> marker, which holds no significant information.

Even if I agreed that the revision was meaningless, the cost of such a
revision is minuscule.

>>> In other words, the empty merge is totally semantically empty even in the
>>> bazaar world. Why does it exist?
>> It exists because it is useful.  Because it makes the behavior of bzr
>> merge uniform.  Because in some workflows, commits show that a person
>> has signed off on a change.
> 
> Signing off on the fact of fetching changes? For a true merge you are signing
> off on the fact that there were no conflicts, or you sign off on your
> conflict resolution.

You sign off on the contents of the revision you fetched.  You say "I
have reviewed this revision, and approved it."

>> It's not something special-- it's just another commit, like regular
>> commits, and merge commits.  It would be harder to forbid than it is to
>> permit.
> 
> Actually, the check is very easy.

Agreed.  It's just that not checking is easier still.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXzD0F+nu1YWqI0RAiGvAJsEbPNNlqZ7QCH7EE39YABqEm/BtwCaAxIo
NHqG4NVZpvymTUlCLYyCqKM=
=YUdC
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-18  0:57                           ` Petr Baudis
@ 2006-10-18  1:05                             ` Aaron Bentley
  0 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  1:05 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng,
	git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
> Dear diary, on Wed, Oct 18, 2006 at 02:50:54AM CEST, I got a letter
> where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
>> Petr Baudis wrote:
>>
>>> Ok, one last question - do you do most of the work locally, fetching
>>> bits of data as you need, or remotely, only taking input/producing
>>> output over the network (the pserver model)?
>> Personally, I do not do remote commits over slow links.  At home, I use
>> a single machine, and mirror my repository to a public machine using
>> rsync.  At work, I store my repository on an NFS server, and push my
>> repository to a public machine using rsync.
> 
> I meant the work of the commands (bzr log and such), not your personal
> workflow. :-) Sorry for being unclear.

When using the native network protocol, work can happen remotely.  (But
the native protocol is quite new, and support for "smart" operations is
currently limited.)  When using the dumb protocols, data is fetched from
the remote system and processed locally.  Light checkouts are not
recommended when the server is on a slow link, but heavyweight checkouts
are quite suitable in that situation.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNX3j0F+nu1YWqI0RAtRcAJ0fEZam6H3hs3YHY/dEYEhk3A73BQCdENHY
s9+KZTfqnDJg8mHNmC2C/Ok=
=Nqcn
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-17 12:03                 ` Matthieu Moy
                                     ` (2 preceding siblings ...)
  2006-10-18  0:25                   ` Petr Baudis
@ 2006-10-18  1:11                   ` Petr Baudis
  2006-10-18  6:44                     ` Matthieu Moy
  3 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  1:11 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Sean, Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng,
	git

Dear diary, on Tue, Oct 17, 2006 at 02:03:21PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> I have one repository, say, $repo.
> 
> In it, I have one branch "$repo/bzr.dev" which is an exact mirror of
> http://bazaar-vcs.org's branch.
> 
> I also have branches for patches (occasional in my case) that I'll
> send to upstream. Say $repo/feature1, $repo/feature2, ...
> 
> If, by mistake, I start hacking on bzr.dev itself, I'll be warned at
> commit time, create a branch, and commit in this new branch. I believe
> git manages this in a different way, allowing you to commit in this
> branch, and creating the branch next time you pull. But you know this
> better than I ;-), I never got time to give a real try to git.

In fact, in Git the branch is actually created at the moment you clone.

For simplicity's sake, let's say you cloned just a single branch, not the
whole repository (or imagine a repository with a single branch). Then,
in your local repository, two branches will be created: 'origin' and
'master'. The origin branch is considered readonly (though Git does
not enforce it) and only mirrors the branch in the remote repository.
The master branch is the branch you do your work on, and it corresponds
to the contents of your working tree.

Thus, when you are "updating" your repository (we also call that
"pull"), what happens is that new commits are _fetched_ from the remote
repository to your 'origin' branch and then the 'origin' branch is
_merged_ to the 'master' branch. (You can even separate those two steps
and do them manually. So you can e.g. periodically fetch but just check
diffs with your master branch and never actually merge, or whatever.)

If you never do any local commits on the repository, every time you
merge, the 'master' branch is an ancestor of the 'origin' branch and only
a so-called fast-forward merge happens - the 'master' branch is updated to
point at the same commit as the 'origin' branch.

If you _did_ do some local commits, a real merge of the two branches
happens and a new merge commit tying the current master and origin
history together is recorded on the master branch.
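
The decision between the two cases can be sketched as follows. This is a toy
model, not Git's implementation: the commit graph is a plain dictionary, the
function names and the merge_id argument are hypothetical, and the fetch step
is assumed to have already updated 'origin'.

```python
def ancestors(parents, head):
    """Return head plus every commit reachable through parent links."""
    seen = set()
    stack = [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def merge_origin(parents, branches, merge_id):
    """Merge 'origin' into 'master' and report what kind of merge happened."""
    master, origin = branches["master"], branches["origin"]
    if origin in ancestors(parents, master):
        return "up-to-date"              # nothing new was fetched
    if master in ancestors(parents, origin):
        branches["master"] = origin      # no local commits: just move master
        return "fast-forward"
    parents[merge_id] = [master, origin]  # local commits: record a real merge
    branches["master"] = merge_id
    return "merge"


parents = {"a": [], "b": ["a"], "c": ["b"]}
branches = {"master": "b", "origin": "c"}
print(merge_origin(parents, branches, "m1"))  # fast-forward

parents["d"] = ["c"]        # a local commit on master
branches["master"] = "d"
parents["e"] = ["c"]        # a new upstream commit, already fetched
branches["origin"] = "e"
print(merge_origin(parents, branches, "m2"))  # merge
print(parents["m2"])                          # ['d', 'e']
```

In the fast-forward case no new commit is created; only the 'master'
pointer moves.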

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-18  1:00                           ` Aaron Bentley
@ 2006-10-18  1:25                             ` Carl Worth
  2006-10-18  3:10                               ` Aaron Bentley
  2006-10-18  3:35                             ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-10-18  1:25 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git


On Tue, 17 Oct 2006 21:00:51 -0400, Aaron Bentley wrote:
> Jakub Narebski wrote:
> > Well, that is another example of why, while a generation number is (or can
> > be) global, any numbering of branches must be local-only.
>
> No.  The numbering always follows the leftmost parent.  So each revision
> has a permanent (but non-unique) number.

Aaron, thanks for carrying this thread along and helping to bridge
some communication gaps. For example, when I saw your original two
diagrams I was totally mystified how you were claiming that appending
a couple of nodes and edges to a DAG could change the "order" of the
DAG.

I think I understand what you're describing with the leftmost-parent
ordering now. But it's definitely an ordering that I would describe as
local-only. That is, the ordering has meaning only with respect to a
particular linearization of the DAG and that linearization is
different from one repository to the next.
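
A minimal sketch of that point, using the two branches from the diagrams
earlier in the thread. It assumes revision numbers are simply positions
along the leftmost-parent chain, as described above; the function names are
hypothetical and real bzr may differ in details.

```python
def mainline(parents, head):
    """Walk leftmost parents from head back to the root, oldest first."""
    line = []
    commit = head
    while commit is not None:
        line.append(commit)
        commit_parents = parents[commit]
        commit = commit_parents[0] if commit_parents else None
    line.reverse()
    return line


def revnos(parents, head):
    """Number the mainline of the given head, starting at 1."""
    return {n: c for n, c in enumerate(mainline(parents, head), start=1)}


# e merges c into B's mainline, so its leftmost parent is d.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}

print(revnos(parents, "c"))  # branch A: {1: 'a', 2: 'b', 3: 'c'}
print(revnos(parents, "e"))  # branch B: {1: 'a', 2: 'b', 3: 'd', 4: 'e'}
```

Revision 3 names c when counted from A's head and d when counted from B's
head, even though both heads live in the same DAG.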

> > ...but that means that revision numbers are totally, absolutely useless,
> > unless by some miracle of engineering, or by adding a namespace, they can be
> > made unchangeable.
>
> No, because no one pulls unless they're trying to maintain a mirror of
> the other branch, or else they decide to throw their local history away.

If in practice, nobody does the mirroring "pull" operation then how
are the numbers useful? For example, given your examples above, if
I'm understanding the concepts and terminology correctly, then if A
and B both "merge" from each other (and don't "pull") then they will
each end up with identical DAGs for the revision history but totally
distinct numbers. Correct?

So in that situation the numbers will not help A and B determine that
they have identical history or even identical working trees. So what
good are the numbers?

I can see that the numbers would have applicability with reference to
a single repository, (or equivalently a mirror of that repository),
but no utility as soon as there is any distributed development
happening.

-Carl



* Re: VCS comparison table
  2006-10-18  0:30                         ` Aaron Bentley
  2006-10-18  0:39                           ` Petr Baudis
@ 2006-10-18  1:28                           ` Jakub Narebski
  2006-10-18  1:44                             ` Carl Worth
       [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
  2 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18  1:28 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, bazaar-ng, Linus Torvalds, Andreas Ericsson,
	Petr Baudis, git

Aaron Bentley wrote:
> Petr Baudis wrote:
>>
>> Another aspect of this is that Git (Linus ;) is very focused on getting
>> the history right, nice and clean (though it does not _mandate_ it and
>> you can just wildly do one commit after another; it just provides tools
>> to easily do it).
> 
> Yes, rebasing is very uncommon in the bzr community.  We would rather
> evaluate the complete change than walk through its history.  (Bundles
> only show the changes you made, not the changes you merged from the
> mainline.)
> 
> In an earlier form, bundles contained a patch for every revision, and
> people *hated* reading them.  So there's definitely a cultural
> difference there.

Take for example 
 "[PATCH 0/6] ref deletion and D/F conflict avoidance with packed-refs."
 http://thread.gmane.org/gmane.comp.version-control.git/28150/focus=28154

> This series cleans up the area that was affected by the recent
> addition of "packed-refs".  Christian Couder and Jeff King CC'ed
> since they seem to be touching in the general vicinity of the
> code these patches touch.
> 
> [1/6] ref locking: allow 'foo' when 'foo/bar' used to exist but not anymore.
> [2/6] refs: minor restructuring of cached refs data.
> [3/6] lock_ref_sha1(): do not sometimes error() and sometimes die().
> [4/6] lock_ref_sha1(): check D/F conflict with packed ref when creating.
> [5/6] delete_ref(): delete packed ref
> [6/6] git-branch: remove D/F check done by hand.
> 
> I opted for removing from the packed-ref file when a ref that is
> packed is deleted.

Isn't it easier to review than "bundle", aka. mega-patch?


* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  0:14                                 ` Petr Baudis
@ 2006-10-18  1:36                                   ` Jakub Narebski
  2006-10-18  1:52                                     ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18  1:36 UTC (permalink / raw)
  To: git

Petr Baudis wrote:

> BTW, I'm thinking about implementing some plugin functionality for
> gitweb 

Feature support is a kind of plugin system for gitweb. But certainly we could
split gitweb into modules.

> so that you can add your own views, so that git-browser can 
> integrate to it more reasonably. (Currently it has completely different
> UI and you have to patch gitweb in order to get the proper links at
> proper places.) Sure, git-browser might get fully integrated to gitweb
> later but that needs to be done sensitively so that people are not
> scared by the horrible javascript blobs, etc.; currently git-browser is
> very experimental, and adding it would be quite intrusive.

I was thinking about using JavaScript to add one extra "diagram" column to
the shortlog views (and perhaps shortlog-extended, i.e. with date and
author), with its width set by a JavaScript-generated embedded style, and to
use only the part of git-browser that generates the diagram to draw it there.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-18  1:28                           ` Jakub Narebski
@ 2006-10-18  1:44                             ` Carl Worth
  2006-10-18  3:27                               ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-10-18  1:44 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy


On Wed, 18 Oct 2006 03:28:30 +0200, Jakub Narebski wrote:
>
> Isn't it easier to review than "bundle", aka. mega-patch?

There are even more important reasons to prefer a series of
micro-commits over a mega-patch than just ease of merging.

In the cairo project, I've often reviewed a single patch and said:

	"This all looks like perfectly good code and I'd be happy to
	have it all in the tree. But please rebuild this as a series
	of independent patches (perhaps along the lines of a, b, c,
	...)"

I do that not just to make the history "look nice" but because code
history is something we _use_ a lot and separate commits for separate
actions just make the history so much more usable.

We have great tools like bisect to identify commits that introduce
bugs. I know that I'd be delighted to see bisect come back pointing
at some minimal commit as causing a bug (which would make finding the
bug so much easier).

But it's also been my experience that the largest commits are also the
most likely to be the things returned by bisect. Big commits really do
introduce bugs more frequently than small commits.
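
The idea behind bisect can be sketched as a plain binary search. This is a
simplification, not git-bisect itself: it assumes a linear history, a known
good oldest commit, a known bad newest commit, and a single good-to-bad
transition; the names first_bad and is_bad are hypothetical.

```python
def first_bad(commits, is_bad):
    """Binary-search commits (oldest first) for the first bad one.

    Assumes commits[0] is good and commits[-1] is bad.
    Returns the culprit and the number of commits that had to be tested.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests


history = [f"c{i}" for i in range(16)]
# Pretend the bug was introduced in c11 and persists afterwards.
culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 11)
print(culprit, tests)  # c11 4
```

The search narrows things down to a single commit, so the smaller that
commit is, the less code is left to inspect by hand.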

Finally, if someone had gone through the useful work to create small,
independent changes, (and likely finding and fixing bugs in the
process), what a horrible shame it would be to throw away that work
and merge it as a single patch, (welcome to the pain of CVS branch
merging).

Now, I do admit that it is often useful to take the overall view of a
patch series being submitted. This is often the case when a patch
series is in some sub-module of the code for which I don't have as
much direct involvement. In cases like that I will often do review
only of the diff between the tips of the mainline and the branch of
interest, (or if I trust the maintainer enough, perhaps just the
diffstat between the two). But I'm still very glad that what lands in
the history is the series of independent changes, and not one mega
commit.

-Carl



* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
                               ` (3 preceding siblings ...)
  2006-10-17 14:19             ` Olivier Galibert
@ 2006-10-18  1:46             ` Petr Baudis
  4 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  1:46 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 01:19:08PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")

It isn't very nice because it enforces the update-before-commit
workflow, which was a complaint of many CVS users; I can remember freedom
from it being one of the selling points of the distributed VCSes in 2001 or
so, although it is not so emphasized lately. (I understand that this is
something optional in Bazaar.)

BTW, merge commits aren't bad. They reflect what really happened,
explicitly record the merge resolution taken, if there was any, and
protect you from accidentally losing or damaging [any portion of] your
changes. And they don't add clutter either, since we hide them from
non-graphical history listings by default.

Still, I can recognize that in some scenarios, people might find it
useful, and I can remember some people asking for it in the past. So I
couldn't resist and implemented it in Cogito as cg-commit --push. Pushed
out now. Took me about 5 minutes implementing it and 10 minutes documenting
it.  ;-)


P.S.: A general note for bleeding-edge Cogito users, I've rewritten the
local changes handling so that we always do three-way merge now instead
of that braindead patches diffing/applying, but it's not completely
stable yet, some testcases still fail. So be a bit careful when
updating/uncommitting/switching/... with uncommitted changes in the
working tree.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  1:36                                   ` Integrating gitweb and git-browser (was: Re: VCS comparison table) Jakub Narebski
@ 2006-10-18  1:52                                     ` Petr Baudis
  2006-10-18  1:58                                       ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  1:52 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Dear diary, on Wed, Oct 18, 2006 at 03:36:36AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Petr Baudis wrote:
> 
> > BTW, I'm thinking about implementing some plugin functionality for
> > gitweb 
> 
> Feature support is a kind of plugin system for gitweb. But certainly we could
> split gitweb into modules.
> 
> > so that you can add your own views, so that git-browser can 
> > integrate to it more reasonably. (Currently it has completely different
> > UI and you have to patch gitweb in order to get the proper links at
> > proper places.) Sure, git-browser might get fully integrated to gitweb
> > later but that needs to be done sensitively so that people are not
> > scared by the horrible javascript blobs, etc.; currently git-browser is
> > very experimental, and adding it would be quite intrusive.
> 
> I was thinking about using JavaScript to add one extra "diagram" column to
> the shortlog views (and perhaps shortlog-extended, i.e. with date and
> author), with its width set by a JavaScript-generated embedded style, and to
> use only the part of git-browser that generates the diagram to draw it there.

Shortlog is paginated and that's not very practical for diagrams, I
think - you need to gradually extend it instead in that case. But yes,
keeping the _visual_ difference of git-browser and gitweb as small as
possible has been the main reason for me to think about integrating it
more tightly.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  1:52                                     ` Petr Baudis
@ 2006-10-18  1:58                                       ` Jakub Narebski
  2006-10-18  2:02                                         ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18  1:58 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis wrote:
> Dear diary, on Wed, Oct 18, 2006 at 03:36:36AM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...
>> Petr Baudis wrote:
>>
>>> so that you can add your own views, so that git-browser can 
>>> integrate to it more reasonably. (Currently it has completely different
>>> UI and you have to patch gitweb in order to get the proper links at
>>> proper places.) Sure, git-browser might get fully integrated to gitweb
>>> later but that needs to be done sensitively so that people are not
>>> scared by the horrible javascript blobs, etc.; currently git-browser is
>>> very experimental, and adding it would be quite intrusive.
>> 
>> I was thinking about using JavaScript to add one extra "diagram" column to
>> the shortlog views (and perhaps shortlog-extended, i.e. with date and
>> author), with its width set by a JavaScript-generated embedded style, and to
>> use only the part of git-browser that generates the diagram to draw it there.
> 
> Shortlog is paginated and that's not very practical for diagrams, I
> think - you need to gradually extend it instead in that case. But yes,
> keeping the _visual_ difference of git-browser and gitweb as small as
> possible has been the main reason for me to think about integrating it
> more tightly.

You can have a paginated graph (diagram), although it is more natural
to have the diagram on the first page only, just like gitk --max-count=100.

The idea is for gitweb to generate the (short)log, perhaps with pagination
turned off (CSS overflow: scroll), and for the git-browser part to generate
the diagram and add it to the log.
-- 
Jakub Narebski
Poland


* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  1:58                                       ` Jakub Narebski
@ 2006-10-18  2:02                                         ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18  2:02 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Dear diary, on Wed, Oct 18, 2006 at 03:58:03AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> You can have a paginated graph (diagram), although it is more natural
> to have the diagram on the first page only, just like gitk --max-count=100.

Of course you _can_ have it, but you're going to have a lot of trouble
following the threads over page boundaries, especially if some branch
has no commits whatsoever on some page(s).

> The idea is for gitweb to generate the (short)log, perhaps with pagination
> turned off (CSS overflow: scroll), and for the git-browser part to generate
> the diagram and add it to the log.

What's missing there is the scary AJAXish thing for fetching more
commits. You do not want to load the whole kernel history at once, but
instead fetch more revisions on demand.

BTW, I'm most probably not the one going to hack git-browser to fit in
this. My javascript knowledge is barely enough to implement a web
browser support for it. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-18  1:25                             ` Carl Worth
@ 2006-10-18  3:10                               ` Aaron Bentley
  2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18 15:38                                 ` Carl Worth
  0 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  3:10 UTC (permalink / raw)
  To: Carl Worth
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> Aaron, thanks for carrying this thread along and helping to bridge
> some communication gaps. For example, when I saw your original two
> diagrams I was totally mystified how you were claiming that appending
> a couple of nodes and edges to a DAG could change the "order" of the
> DAG.
> 
> I think I understand what you're describing with the leftmost-parent
> ordering now. But it's definitely an ordering that I would describe as
> local-only. That is, the ordering has meaning only with respect to a
> particular linearization of the DAG and that linearization is
> different from one repository to the next.

Well, the linearization for any particular head is well-defined, but
since different branches have different heads...

> If in practice, nobody does the mirroring "pull" operation then how
> are the numbers useful? For example, given your examples above, if
> I'm understanding the concepts and terminology correctly, then if A
> and B both "merge" from each other (and don't "pull") then they will
> each end up with identical DAGs for the revision history but totally
> distinct numbers. Correct?

The DAGs will be different.  If A merges B, we get:

a
|
b
|\
c d
|\|
| e
|/
f

If B merges A before this, nothing happens, because B is already a
superset of A.

If B merges afterward, we get this:
a
|
b
|\
d c
|/|
e |
|\|
| f
|/
g
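
A minimal Python sketch of leftmost-parent numbering, applied to the two
diagrams above. The parent lists are an assumed reading of the ASCII art
(oldest revision at the top, first-listed parent being the leftmost one),
and `mainline` and `revnos` are illustrative names, not bzr functions.

```python
def mainline(parents, head):
    """Follow leftmost (first) parents from head back to the root, oldest first."""
    chain = []
    node = head
    while node is not None:
        chain.append(node)
        node_parents = parents[node]
        node = node_parents[0] if node_parents else None
    chain.reverse()
    return chain


def revnos(parents, head):
    """Number the mainline revisions 1, 2, 3, ... as seen from this head."""
    return {rev: number
            for number, rev in enumerate(mainline(parents, head), start=1)}


# Assumed reading of the first diagram: e is B's merge of c, f is A's merge of e.
branch_a = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"],
            "e": ["d", "c"], "f": ["c", "e"]}
# Assumed reading of the second diagram: B then merges f, creating g.
branch_b = dict(branch_a, g=["e", "f"])

print(revnos(branch_a, "f"))  # numbering as seen from A's head
print(revnos(branch_b, "g"))  # numbering as seen from B's head
```

Under that reading, `c` is revno 3 on A's branch but is not on B's mainline
at all, while `d` is revno 3 on B's branch and off the mainline on A's.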

> So in that situation the numbers will not help A and B determine that
> they have identical history or even identical working trees.

They don't really have identical history.

> So what good are the numbers?

They are good for naming mainline revisions that introduced particular
changes.

> I can see that the numbers would have applicability with reference to
> a single repository, (or equivalently a mirror of that repository),
> but no utility as soon as there is any distributed development
> happening.

Well, there's distributed, and then there's *DISTRIBUTED*.  We don't
quasi-randomly merge each others' branches.  We have a star topology
around bzr.dev.  So when we refer to revnos, they're usually in bzr.dev.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNZsp0F+nu1YWqI0RAkmWAJ9PkrkubIHVgAn5Wbdkg9IBAHCviACdFx2x
6ClmK4GmC1pRuRQACcSijNM=
=SM1Y
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-18  0:23                       ` Aaron Bentley
  2006-10-18  0:46                         ` Jakub Narebski
       [not found]                         ` <200610180246.18758.jnareb@gmail.com>
@ 2006-10-18  3:25                         ` Ryan Anderson
  2 siblings, 0 replies; 1752+ messages in thread
From: Ryan Anderson @ 2006-10-18  3:25 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On 10/17/06, Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > In other words, the empty merge is totally semantically empty even in the
> > bazaar world. Why does it exist?
>
> It exists because it is useful.  Because it makes the behavior of bzr
> merge uniform.  Because in some workflows, commits show that a person
> has signed off on a change.

In the Git world that happens via "git tag -s", i.e., a
cryptographically strong "signoff".
(There's also the secondary convention of appending Signed-off-by: to
email-applied patches, but that's something that would translate
effectively to any other system, since it's outside the SCM.)

* Re: VCS comparison table
  2006-10-18  1:44                             ` Carl Worth
@ 2006-10-18  3:27                               ` Aaron Bentley
  2006-10-18  9:20                                 ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18  3:27 UTC (permalink / raw)
  To: Carl Worth
  Cc: Jakub Narebski, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Wed, 18 Oct 2006 03:28:30 +0200, Jakub Narebski wrote:
>> Isn't it easier to review than "bundle", aka. mega-patch?
> 
> There are even more important reasons to prefer a series of
> micro-commits over a mega-patch than just ease of merging.

A bundle isn't a mega-patch.  It contains all the source revisions.  So
when you merge or pull it, you get all the original revisions in your
repository.


> We have great tools like bisect to identify commits that introduce
> bugs. I know that I'd be delighted to see bisect come back pointing
> at some minimal commit as causing a bug, (which would make finding the
> bug so much easier).

Bisect should work equally well with revisions pulled or merged from a
bundle as revisions re-committed from patches.

> But it's also been my experience that the largest commits are also the
> most likely to be the things returned by bisect. Big commits really do
> introduce bugs more frequently than small commits.

The number of changes shown in the diff has nothing to do with the
number of changes made per commit.

> Now, I do admit that it is often useful to take the overall view of a
> patch series being submitted. This is often the case when a patch
> series is in some sub-module of the code for which I don't have as
> much direct involvement. In cases like that I will often do review
> only of the diff between the tips of the mainline and the branch of
> interest, (or if I trust the maintainer enough, perhaps just the
> diffstat between the two). But I'm still very glad that what lands in
> the history is the series of independent changes, and not one mega
> commit.

So the difference here is that bundles preserve the original commits the
changes came from, so even though it's presented as an overview, you
still have a series of independent changes in your history.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNZ820F+nu1YWqI0RAjNyAJ90HMCAiopuAMvkKlcCEdc4F6QKLwCdGEWI
VOZThAQrvqybe5z93eC44BY=
=xBZM
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-18  1:00                           ` Aaron Bentley
  2006-10-18  1:25                             ` Carl Worth
@ 2006-10-18  3:35                             ` Linus Torvalds
  2006-10-19  3:10                               ` Aaron Bentley
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18  3:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> 
> > That doesn't matter...
> 
> It has significant UI impact.

Right. You have to do it your way, because of the "simple revision 
numbers".

Which gets us back to where we started: "simple" is in the eye of the 
beholder. I personally think that git revision naming is a lot simpler, 
exactly because it doesn't impose arbitrary rules on users.

For example, what happens is that:
 - you like the simple revision numbers
 - that in turn means that you can never allow a mainline-merge to be done 
   by anybody other than the main maintainer
 - that in turn means that the whole situation is no longer distributed, 
   it's more like a "disconnected access to a central repository"

The "main trunk matters" mentality (which has deep roots in CVS - don't 
get me wrong, I don't think you're the first one to do this) is 
fundamentally antithetical to a truly distributed system, because it 
basically assumes that some maintainer is "more important" than others. 

That special maintainer is the maintainer whose merge-trunk is followed, 
and whose revision numbers don't change when they are merged back. 

That may even be _true_ in many cases. But please do realize that it's a 
real issue, and that it has real impact - it does two things:

 - it impacts the technology and workflow directly: "pull" and 
   "merge" are different (a central maintainer would tend to do a "merge", 
   and one more in the outskirts would tend to do more of a "pull", 
   expecting his work to then be merged back to the "trunk" at some later 
   point)

 - it will result in _psychological_ damage, in the sense that there's 
   always one group that is the "trunk" group, and while you can pass the 
   baton around (like the perl people do), it's always clear who sits 
   centrally.

Maybe this is fine. It's certainly how most projects tend to work. 

I'll just point out that one of my design goals for git was to make every 
single repository 100% equal. That means that there MUST NOT be a "trunk", 
or a special line of development. There is no "vendor branch". It's 
something that a lot of people on the git lists understand now, but it 
took a while for it to sink in - people used to believe that the "first 
parent" of a merge was somehow special, and I had to point out several 
times on the git list that no, that's not how it works - because the merge 
might have been done by somebody _else_ than the person who you think of 
as being "on the trunk".

So when I say that your "simple" revision numbers are totally broken and 
horrible, I say that not because I think a number like "1.45.3.17" is 
ugly, but because I think that the deeper _implications_ of using a number 
like that are ugly. It implies one of two things:

 - the numbers change all the time as things get merged both ways

OR

 - people try to maintain a "trunk" mentality

and I think both of those situations are simply not good situations.

In git, the fact that everybody is on an equal footing is something that I 
think is really good. For example, when I was away for effectively three 
weeks during August, all the git-level merging for the kernel was done by 
Greg KH.

And realize that he didn't use "my tree". No baton was passed. I emailed 
with him (and some others) before-hand, so that everybody knew that I 
expected to just pull from Greg when I came back, but it was _his_ tree 
that he merged in, and he just worked the same way I did.

And when I did come back, I did a "pull" from his tree. At no point is 
there a big merge-commit with a sign saying

	"I now merged all the work that Greg did while I was away"

No. Because the way git works, my pull just fast-forwarded my tree, 
because while I was away, Greg's tree _was_ the main tree, thanks to the 
fact that git believes that everybody is 100% equal.
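
The fast-forward described here can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: history is modelled
as a dict from commit name to parent list, and `ancestors`, `pull` and the
commit names are hypothetical, not git internals.

```python
def ancestors(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def pull(parents, mine, theirs):
    """Return the new tip after pulling `theirs` into `mine`."""
    if mine in ancestors(parents, theirs):
        return theirs          # fast-forward: no merge commit is created
    if theirs in ancestors(parents, mine):
        return mine            # already up to date
    merge = "merge(%s,%s)" % (mine, theirs)
    parents[merge] = [mine, theirs]   # true merge: histories had diverged
    return merge


# Linus's tip is an ancestor of Greg's tip, so the pull just moves the tip.
kernel = {"linus-1": [], "greg-1": ["linus-1"], "greg-2": ["greg-1"]}
new_tip = pull(kernel, "linus-1", "greg-2")

# Two diverged tips, by contrast, need a real merge commit.
diverged = {"base": [], "x": ["base"], "y": ["base"]}
merged_tip = pull(diverged, "x", "y")
```

In the first case no new commit is recorded, which is why there is no
"I merged Greg's work" commit to point at afterwards.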

So it's actually a big conceptual thing. 

I'm actually very happy with the design of git, and a large part of that 
is that I think the data structures and the basic design were really good. 
Now, I know I'm smarter than anybody else ("Bow down before me, you 
worthless scum"), but the thing is, the way to do good basic design isn't 
actually to be really smart about it, but to try to have a few basic 
concepts.

And the "every repository is equal" is one such concept. The naming 
follows from that - you simply _cannot_ use numbers if everybody is on the 
same footing (at least not _stable_ numbers). 

Btw, BK did get this right. I didn't _like_ the naming in BK, and it was 
numbers, but it worked. But it only worked when people understood that the 
numbers were ephemeral, and it _did_ cause confusion. But hey, the 
confusion wasn't _that_ big of a problem.

> Even if I agreed that the revision was meaningless, the cost of such a
> revision is minuscule.

No. The _cost_ of the revision is the "trunk mentality". THAT is the true 
cost.  The belief that there is one "main line of development".

		Linus

* Re: VCS comparison table
  2006-10-17 10:08             ` Andreas Ericsson
  2006-10-17 10:47               ` Matthieu Moy
@ 2006-10-18  4:55               ` Robert Collins
  2006-10-18  8:53                 ` Andreas Ericsson
  2006-10-18 15:31                 ` Linus Torvalds
  1 sibling, 2 replies; 1752+ messages in thread
From: Robert Collins @ 2006-10-18  4:55 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: bazaar-ng, git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 3199 bytes --]

On Tue, 2006-10-17 at 12:08 +0200, Andreas Ericsson wrote:
> Robert Collins wrote:
> > On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> >>           ---- time --->
> >>
> >>     --*--*--*--*--*--*--*--*--*-- <branch>
> >>           \            /
> >>            \-*--X--*--/
> >>
> >> The branch it used to be on is gone...
> > 
> > In bzr 0.12 this is :
> > 2.1.2
> > 
> 
> Would it be a different number in a different version of bazaar?

The dotted decimal display has only been introduced in bzr 0.12

> > (assuming the first * is numbered '1'.)
> > 
> > These numbers are fairly stable, in particular everything's number in
> > the mainline will be the same number in all the branches created from it
> > at that point in time, but a branch that initially creates a revision or
> > obtains it before the mainline will have a different number until they
> > synchronise with the mainline via pull.
> > 
> 
> So basically anyone can pull/push from/to each other but only so long as 
> they decide upon a common master that handles synchronizing of the 
> number part of the url+number revision short-hands?

Anyone can push and pull from each other - full stop. Whenever they
'pull' in bzr terms, they get fast-forward happening (if I understand
the git fast-forward behaviour correctly). After a fast-forward, the
dotted decimal revision numbers in the two branches are identical - and
they remain immutable until another fast forward occurs. Push always
fast forwards, so the public copy of one's own repository that others
pull or merge from is identical to your own. In a 'collection of
branches with no mainline' scenario, people usually have fast forward
occur from time to time, keeping the numbers consistent from the point
your branch was last pulled by someone else, or you pulled them.

> One thing that's been nagging me is how you actually find out the 
> url+number where the desired revision exists. That is, after you've 
> synced with master, or merged the mothership's master-branch into one of 
> your experimental branches where you've done some work that went before 
> mothership's master's current tip, do you have to have access to the 
> mothership's repo (as in, do you have to be online) to find out the 
> number part of url+number shorthand, or can you determine it solely from 
> what you have on your laptop?

You can determine it locally - if you know any of the mothership's
revisions locally, we can generate the dotted-revnos that the
mothership's master-branch would have from the local data - and the last
merge of mothership you did will have given you those details. I don't
think we have a ui command to spit this out just yet, but it will be
trivial to whip one up.

More commonly though, like git users have 'origin' and 'master'
branches, bzr users tend to have a branch that is the 'origin' (for bzr
itself this is usually called bzr.dev), as well as N other branches for
their own work, which is probably why we haven't seen the need to have a
ui command to spit out the revnos for an arbitrary branch.

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

* Re: VCS comparison table
  2006-10-17 23:33                   ` Petr Baudis
@ 2006-10-18  5:26                     ` Robert Collins
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
  0 siblings, 1 reply; 1752+ messages in thread
From: Robert Collins @ 2006-10-18  5:26 UTC (permalink / raw)
  To: Petr Baudis; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 1495 bytes --]

On Wed, 2006-10-18 at 01:33 +0200, Petr Baudis wrote:
> 
> BTW, I think it's fine to build a system optimized for small-scale
> projects (if that's the intent), simplifying some things in favour of
> mostly straight histories instead of more complicated merge situations
> (although I tend to agree with Linus that if you don't behave in the
> way the users are used to in 100% cases, the more frequently you
> behave so the worse it comes back to bite in the rare cases you do).
> Just as RCS is fine when maintaining individual files for personal
> usage (I still actually occasionally use it for a few files).

revnos visibly change as your work is merged into the mainline - we've
been doing this for years without trouble: one's own commits to a branch
get '3', '4', '5' etc as revnos, and when they are merged to the
mainline they used to stop having revnos at all, but now they will be
given this dotted decimal revno. If you pull from the mainline after the
merge, you see the new numbers, and when you look at mainline you can
see the difference. So while I agree that the surprise the user gets is
inversely related to the frequency with which they see the behaviour, I
think our users see it a lot, so are not surprised much.

FWIW, we're not optimising for mostly straight histories as I understand
such things : our own history has 3 commits on branches to every one on
the mainline.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

* Re: VCS comparison table
  2006-10-17 23:16                       ` Linus Torvalds
@ 2006-10-18  5:36                         ` Jeff King
  2006-10-18  5:57                           ` Junio C Hamano
  2006-10-18 14:52                           ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-18  5:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jakub Narebski, Aaron Bentley, Andreas Ericsson, bazaar-ng, git

On Tue, Oct 17, 2006 at 04:16:15PM -0700, Linus Torvalds wrote:

> It would be easy to send the exact same data as the native git protocol 
> sends over ssh (or the git port) as an email encoding. We did that a few 
> times with BK (there it's called "bk send" and "bk receive" to pack and 
[...]
> That said, "bundles" certainly wouldn't be _hard_ to do. And as long as 
> nobody tries to send _me_ any of them, I won't mind ;)

I never used BK, but my understanding is that it was based on
changesets, so a bundle was a group of changesets. Because a git commit
represents the entire tree state, how can we avoid sending the entire
tree in each bundle? The interactive protocols can ask "what do you
have?" but an email bundle is presumably meant to work without a round
trip.

We could always make a guess ("git send --remote-has master~10") but
that seems awfully error-prone. I assume a changeset-oriented system
would implicitly keep some concept of "I think Linus is at master~10"
and do it automatically.

-Peff

* Re: VCS comparison table
  2006-10-18  5:36                         ` Jeff King
@ 2006-10-18  5:57                           ` Junio C Hamano
  2006-10-18 14:52                           ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18  5:57 UTC (permalink / raw)
  To: Jeff King
  Cc: Jakub Narebski, Aaron Bentley, Andreas Ericsson, bazaar-ng, git,
	Linus Torvalds

Jeff King <peff@peff.net> writes:

> We could always make a guess ("git send --remote-has master~10") but
> that seems awfully error-prone. I assume a changeset-oriented system
> would implicitly keep some concept of "I think Linus is at master~10"
> and do it automatically.

We could always anchor at a well known point ("git send v2.6.18..").
If you as the recipient do not have the preimage, the "bundle" would
identify what the assumed common ancestor is and you can fetch
it before proceeding.
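
A rough Python sketch of the anchoring idea, with loud assumptions: the
`git send` command in this exchange is hypothetical, and so are the
`make_bundle` and `can_apply` helpers and the dict layout below. The point
is only that a bundle built from `base..head` can name the base it
assumes, so the recipient can check for it without a round trip.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def make_bundle(parents, base, head):
    """Hypothetical bundle: the commits in base..head plus the assumed base."""
    needed = reachable(parents, head) - reachable(parents, base)
    return {"prerequisites": [base], "head": head, "commits": sorted(needed)}


def can_apply(bundle, have):
    """The recipient can apply the bundle only if it has every prerequisite."""
    return all(commit in have for commit in bundle["prerequisites"])


# Hypothetical history: two patches on top of a well-known tag.
history = {"v2.6.18": [], "p1": ["v2.6.18"], "p2": ["p1"]}
bundle = make_bundle(history, "v2.6.18", "p2")
```

A recipient who lacks `v2.6.18` learns exactly which commit to fetch first,
instead of receiving the entire tree in every bundle.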

* Re: VCS comparison table
  2006-10-17 21:51                 ` Jakub Narebski
  2006-10-17 22:28                   ` Aaron Bentley
@ 2006-10-18  6:22                   ` Matthieu Moy
  1 sibling, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-18  6:22 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>>> Fast-forward is a really good idea. Perhaps you could implement it,
>>> if it is not hidden under a different name?
>> 
>> We support it as 'pull', but merge doesn't do it automatically, because
>> we'd rather have merge behave the same all the time, and because 'pull'
>> throws away your local commit ordering.
>
> I smell yet another terminology conflict (although this time the fault is
> on the git side), namely that in git terminology "pull" is "fetch"
> (i.e. getting changes done in the remote repository since the last "fetch"
> or since "clone") followed by merge. pull = fetch + merge.

AIUI, the initial claim was that after a rebase, git can do a
fast-forward, but Aaron has missed the /after a rebase/ part.

And yes, in the bzr terminology, bzr can do a "pull" after a "graft".
I don't think there's a fundamental difference here.

* Re: VCS comparison table
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  2006-10-17 10:23           ` Sean
  2006-10-17 10:23           ` Sean
@ 2006-10-18  6:33           ` Jeff King
  2 siblings, 0 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-18  6:33 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, Johannes Schindelin, Jakub Narebski, bazaar-ng,
	git

On Tue, Oct 17, 2006 at 06:23:41AM -0400, Sean wrote:

> The "bzr missing" command sounds like a handy one.  
> 
> Someone on the xorg mailing list was recently lamenting that git does not
> have an easy way to compare a local branch to a remote one.  While this
> turns out to not be a big problem in git, it might be nice to have such
> a command.

What's wrong with:

  git-fetch
  gitk master...origin

The git model is to do operations on local refs and objects, so the
fetch is a natural part of that. The only downside I see is that you
actually end up fetching the data rather than simply peeking at where
the remote is. But a useful comparison will include at least grabbing
the commit objects, and probably the tree objects (to do diffs) anyway.
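
What the three-dot range in `master...origin` selects can be sketched in
Python: the commits reachable from either tip but not from both. The
history dict and helper names are assumptions made up for illustration.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def symmetric_difference(parents, left, right):
    """Split the commits of left...right into (only in left, only in right)."""
    from_left = reachable(parents, left)
    from_right = reachable(parents, right)
    return from_left - from_right, from_right - from_left


# Hypothetical state after a fetch: one local commit, two new remote ones.
history = {"base": [], "m1": ["base"], "o1": ["base"], "o2": ["o1"]}
only_master, only_origin = symmetric_difference(history, "m1", "o2")
```

Here `only_master` holds the local work not yet upstream and `only_origin`
the upstream work not yet merged, which is the comparison Sean asks about.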

-Peff

* Re: VCS comparison table
  2006-10-18  1:11                   ` Petr Baudis
@ 2006-10-18  6:44                     ` Matthieu Moy
  2006-10-18  7:16                       ` Shawn Pearce
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-18  6:44 UTC (permalink / raw)
  To: bazaar-ng, git

Petr Baudis <pasky@suse.cz> writes:

> The origin branch is considered readonly (though Git does
> not enforce it) and only mirrors the branch in the remote repository.

Out of curiosity, what happens if you accidentally commit to it?

-- 
Matthieu

* Re: VCS comparison table
  2006-10-18  6:44                     ` Matthieu Moy
@ 2006-10-18  7:16                       ` Shawn Pearce
  0 siblings, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18  7:16 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Petr Baudis <pasky@suse.cz> writes:
> 
> > The origin branch is considered readonly (though Git does
> > not enforce it) and only mirrors the branch in the remote repository.
> 
> Out of curiosity, what happens if you accidentally commit to it?

It will quietly accept the commit.

Later, when you run `git fetch` to download changes from the remote
repository into your local origin branch, the fetch will fail: the
update is not a strict fast-forward, because origin now contains
changes that are not in the remote repository being downloaded.

The user can force those changes to be thrown away with `git fetch
--force`, though they probably would want to first examine the
branch with `git log origin` to see what commits (if any) should
be saved, and either extract them to patches for reapplication or
create a holder branch via `git branch holder origin` to allow them
to later merge the holder branch (or parts thereof) after the fetch
has forced origin to match the remote repository.
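
The check described above can be sketched in Python. This is a simplified
model, not git's implementation: `update_tracking_branch`, the `force`
parameter and the commit names are assumptions chosen for illustration.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def update_tracking_branch(parents, local_tip, remote_tip, force=False):
    """Return the new tip of the local tracking branch after a fetch."""
    if local_tip in reachable(parents, remote_tip):
        return remote_tip      # strict fast-forward: nothing local is lost
    if force:
        return remote_tip      # local-only commits drop off the branch
    raise ValueError("refusing to update: %s is not an ancestor of %s"
                     % (local_tip, remote_tip))


# "oops" is an accidental local commit on origin; the remote moved to r2.
history = {"r1": [], "r2": ["r1"], "oops": ["r1"]}
```

Saving the stray commit first, as suggested with `git branch holder origin`,
corresponds to keeping another name for "oops" before forcing the update.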

So in short by default Git stops and tells the user something fishy
is going on, but the error message isn't obvious about what that
is and how they can resolve it easily.

There has been discussion about marking these branches that we
know the user fetches into as read-only, to prevent `git commit`
from actually committing to such a branch (we also have the same
case with the special bisect branch), but I don't think anyone has
stepped forward with the complete implementation of that yet.

Like anything I think people get used to the idea that those branches
are strictly for fetching and shouldn't be used for anything else.
There's really no reason to check out a fetched-into branch anyway;
temporary branches are less than 1 second away with
`git checkout -b tmp origin` (for example).

-- 
Shawn.

* Re: VCS comparison table
  2006-10-17 23:33                       ` Aaron Bentley
@ 2006-10-18  8:13                         ` Andreas Ericsson
  0 siblings, 0 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-18  8:13 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Jakub Narebski wrote:
>> Aaron Bentley wrote:
>> By the way, are bzr "bundles" compatible with ordinary patches?
>> git-format-patch patches are. They have additional metainfo,
>> but they are patches at heart.
> 
> Yes, they are.
> 

Sounds a bit like [PATCH 0/8] would have the output of

	git diff $(git merge-base master topic-branch)..topic-branch

for any given patch-series. It might be easier to review the whole 
patch-series in some cases. Especially with patch-series where more than 
one patch touches the same part of the code.
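
For readers unfamiliar with the merge base used in that command, here is
a small Python sketch of the concept: the common ancestors of two tips
that are not themselves ancestors of another common ancestor. The
function names and the sample history are hypothetical, and real git is
far more efficient than this brute-force version.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def merge_bases(parents, a, b):
    """Return the 'best' common ancestors of a and b, sorted by name."""
    common = reachable(parents, a) & reachable(parents, b)
    return sorted(
        c for c in common
        # Drop any common ancestor that another common ancestor already contains.
        if not any(c != other and c in reachable(parents, other)
                   for other in common)
    )


# Hypothetical history: topic-branch forked from master at m1.
history = {"root": [], "m1": ["root"], "m2": ["m1"],
           "t1": ["m1"], "t2": ["t1"]}
bases = merge_bases(history, "m2", "t2")
```

Diffing from that base to the topic tip shows only what the patch series
changed, ignoring whatever happened on master since the fork.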

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: VCS comparison table
  2006-10-18  3:10                               ` Aaron Bentley
@ 2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18  9:04                                   ` Peter Baumann
                                                     ` (2 more replies)
  2006-10-18 15:38                                 ` Carl Worth
  1 sibling, 3 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-18  8:39 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Carl Worth, Jakub Narebski, Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Carl Worth wrote:
>> Aaron, thanks for carrying this thread along and helping to bridge
>> some communication gaps. For example, when I saw your original two
>> diagrams I was totally mystified how you were claiming that appending
>> a couple of nodes and edges to a DAG could change the "order" of the
>> DAG.
>>
>> I think I understand what you're describing with the leftmost-parent
>> ordering now. But it's definitely an ordering that I would describe as
>> local-only. That is, the ordering has meaning only with respect to a
>> particular linearization of the DAG and that linearization is
>> different from one repository to the next.
> 
> Well, the linearization for any particular head is well-defined, but
> since different branches have different heads...
> 
>> If in practice, nobody does the mirroring "pull" operation then how
>> are the numbers useful? For example, given your examples above, if
>> I'm understanding the concepts and terminology correctly, then if A
>> and B both "merge" from each other (and don't "pull") then they will
>> each end up with identical DAGs for the revision history but totally
>> distinct numbers. Correct?
> 
> The DAGs will be different.  If A merges B, we get:
> 
> a
> |
> b
> |\
> c d
> |\|
> | e
> |/
> f
> 
> If B merges A before this, nothing happens, because B is already a
> superset of A.
> 
> If B merges afterward, we get this:
> a
> |
> b
> |\
> d c
> |/|
> e |
> |\|
> | f
> |/
> g
> 

Seems like an awful lot of merge commits. In git, I think these trees 
would be identical (actually both to bazaar and to each other), with the 
exception that the 'g' commit wouldn't exist, since git does 
fast-forward and relies on dependency-chain only to present the graph 
instead of mucking around with info in external files (recording of 
fetches).

>> So in that situation the numbers will not help A and B determine that
>> they have identical history or even identical working trees.
> 
> They don't really have identical history.
> 

As explained above, they would be identical in git. The fact that you 
register a fast-forward as a merge makes them not so, but this is 
something most gitizens are against, as it can quickly clutter up the DAG.

>> So what good are the numbers?
> 
> They are good for naming mainline revisions that introduced particular
> changes.
> 
>> I can see that the numbers would have applicability with reference to
>> a single repository, (or equivalently a mirror of that repository),
>> but no utility as soon as there is any distributed development
>> happening.
> 
> Well, there's distributed, and then there's *DISTRIBUTED*.  We don't
> quasi-randomly merge each others' branches.  We have a star topology
> around bzr.dev.  So when we refer to revnos, they're usually in bzr.dev.
> 

So in essence, the revnos work wonderfully so long as there is a central 
server to make them immutable?

Doesn't this mean that one of your key features doesn't actually work in 
a completely distributed setup (i.e., each dev has his own repo, there 
is no mother-ship, everyone pulls from each other)?

I can see the six-line hook that lays the groundwork for this in git 
before me right now. I'll happily refuse to write it down anywhere. I 
get the feeling that sha's are easier to handle in the long run, while 
revno's might be good to use in development work. In git, we have 
<branch/tag/"committish">~<number> syntax for this.
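
As an illustration of the `~<number>` syntax, the following Python sketch
resolves a name such as `master~2` by stepping back along first parents.
The `resolve` function, the `refs` table and the commit names are
assumptions for this example only; it handles just the `name~n` form.

```python
def resolve(parents, refs, spec):
    """Resolve 'name' or 'name~n' to a commit by following first parents."""
    name, tilde, count = spec.partition("~")
    # A bare '~' with no number means one step back.
    steps = int(count) if count else (1 if tilde else 0)
    commit = refs[name]
    for _ in range(steps):
        commit = parents[commit][0]   # raises IndexError past the root commit
    return commit


# Hypothetical history: c4 merges a side branch into the c1..c3 line.
history = {"c1": [], "c2": ["c1"], "c3": ["c2"],
           "side": ["c1"], "c4": ["c3", "side"]}
refs = {"master": "c4"}

two_back = resolve(history, refs, "master~2")
```

Such a name is relative to wherever `master` currently points, so it is
convenient during development but does not stay put the way a sha does.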

In my experience, finding the revision sha of an old bug is what takes 
time. Copy-paste is just as fast with 20 bytes as with 4 bytes. Honestly 
now, do you actually remember the revno for a bug that you stopped 
working on three weeks ago, or do you have to go look it up? If someone 
wants to notify you about the revision in which a bug was introduced, do they not 
communicate the revno to you by email/irc/somesuch?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: VCS comparison table
  2006-10-18  4:55               ` Robert Collins
@ 2006-10-18  8:53                 ` Andreas Ericsson
  2006-10-18 11:15                   ` Petr Baudis
  2006-10-18 15:31                 ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-18  8:53 UTC (permalink / raw)
  To: Robert Collins; +Cc: bazaar-ng, git, Jakub Narebski

Robert Collins wrote:
> On Tue, 2006-10-17 at 12:08 +0200, Andreas Ericsson wrote:
>> Robert Collins wrote:
>>> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
>>>>           ---- time --->
>>>>
>>>>     --*--*--*--*--*--*--*--*--*-- <branch>
>>>>           \            /
>>>>            \-*--X--*--/
>>>>
>>>> The branch it used to be on is gone...
>>> In bzr 0.12 this is :
>>> 2.1.2
>>>
>> Would it be a different number in a different version of bazaar?
> 
> The dotted decimal display has only been introduced in bzr 0.12
> 
>>> (assuming the first * is numbered '1'.)
>>>
>>> These numbers are fairly stable, in particular everything's number in
>>> the mainline will be the same number in all the branches created from it
>>> at that point in time, but a branch that initially creates a revision or
>>> obtains it before the mainline will have a different number until they
>>> synchronise with the mainline via pull.
>>>
>> So basically anyone can pull/push from/to each other but only so long as 
>> they decide upon a common master that handles synchronizing of the 
>> number part of the url+number revision short-hands?
> 
> Anyone can push and pull from each other - full stop. Whenever they
> 'pull' in bzr terms, they get fast-forward happening (if I understand
> the git fast-forward behaviour correctly). After a fast-forward, the
> dotted decimal revision numbers in the two branches are identical - and
> they remain immutable until another fast forward occurs.


This is where it breaks down for me. "until another fast forward occurs" 
is just not good enough, imo.

> 
>> One thing that's been nagging me is how you actually find out the 
>> url+number where the desired revision exists. That is, after you've 
>> synced with master, or merged the mothership's master-branch into one of 
>> your experimental branches where you've done some work that went before 
>> mothership's master's current tip, do you have to have access to the 
>> mothership's repo (as in, do you have to be online) to find out the 
>> number part of url+number shorthand, or can you determine it solely from 
>> what you have on your laptop?
> 
> You can determine it locally - if you know any of the motherships
> revisions locally, we can generate the dotted-revnos that the
> motherships master-branch would have from the local data - and the last
> merge of mothership you did will have given you that details.


To me, this means bazaar isn't distributed at all and I could achieve 
much the same distributedness(?) by rsyncing an SVN repo, working 
against that and then rsyncing it back with some fancy merging. In other 
words, bazaar requires there to be one Lord of the Code, or some of the 
key features break down.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


* Re: VCS comparison table
  2006-10-18  8:39                                 ` Andreas Ericsson
@ 2006-10-18  9:04                                   ` Peter Baumann
  2006-10-18  9:07                                   ` Jakub Narebski
  2006-10-18 10:32                                   ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Peter Baumann @ 2006-10-18  9:04 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Carl Worth, Jakub Narebski, Linus Torvalds,
	bazaar-ng, git

2006/10/18, Andreas Ericsson <ae@op5.se>:
> Aaron Bentley wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > Carl Worth wrote:
> >> Aaron, thanks for carrying this thread along and helping to bridge
> > >> some communication gaps. For example, when I saw your original two
> >> diagrams I was totally mystified how you were claiming that appending
> >> a couple of nodes and edges to a DAG could change the "order" of the
> >> DAG.
> >>
> >> I think I understand what you're describing with the leftmost-parent
> >> ordering now. But it's definitely an ordering that I would describe as
> >> local-only. That is, the ordering has meaning only with respect to a
> >> particular linearization of the DAG and that linearization is
> >> different from one repository to the next.
> >
> > > Well, the linearization for any particular head is well-defined, but
> > since different branches have different heads...
> >
> >> If in practice, nobody does the mirroring "pull" operation then how
> >> are the numbers useful? For example, given your examples above, if
> >> I'm understanding the concepts and terminology correctly, then if A
> >> and B both "merge" from each other (and don't "pull") then they will
> >> each end up with identical DAGs for the revision history but totally
> >> distinct numbers. Correct?
> >
> > The DAGs will be different.  If A merges B, we get:
> >
> > a
> > |
> > b
> > |\
> > c d
> > |\|
> > | e
> > |/
> > f
> >
> > If B merges A before this, nothing happens, because B is already a
> > superset of A.
> >
> > If B merges afterward, we get this:
> > a
> > |
> > b
> > |\
> > d c
> > |/|
> > e |
> > |\|
> > | f
> > |/
> > g
> >
>
> Seems like an awful lot of merge commits. In git, I think these trees
> would be identical (actually both to bazaar and to each other), with the
> exception that the 'g' commit wouldn't exist, since git does
> fast-forward and relies on dependency-chain only to present the graph
> instead of mucking around with info in external files (recording of
> fetches).
>

Ok. This I don't get. Let me recapitulate:

Branch A
a
|
b
|
c

Branch B
a
|
b
| \
d c
| /
e

In branch A, merging branch B (git pull B) gives you branch B as the result,
because A fast-forwards to B and you don't get a merge commit f.

In branch B, merging branch A (git pull A) leaves branch B unchanged, because
it is already up to date.

You _never_ have a commit f or g.
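
The two cases above can be sketched with a toy model. This is only an
illustration under stated assumptions, not git's implementation: the
PARENTS table encodes the two example branches, and ancestors() and
pull() are hypothetical helpers invented for this sketch.

```python
# Toy model of the two branches above: each commit maps to its parent list.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["b"],
    "e": ["d", "c"],  # the merge commit at the tip of branch B
}


def ancestors(commit, parents=PARENTS):
    """Return the set holding `commit` and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


def pull(ours, theirs, parents=PARENTS):
    """Classify what merging `theirs` into `ours` would do (hypothetical helper)."""
    if theirs in ancestors(ours, parents):
        return ("up-to-date", ours)      # nothing new to bring in
    if ours in ancestors(theirs, parents):
        return ("fast-forward", theirs)  # just move our head forward
    return ("merge-commit", None)        # diverged: a new commit would be needed


print(pull("c", "e"))  # branch A (tip c) pulls branch B (tip e)
print(pull("e", "c"))  # branch B (tip e) pulls branch A (tip c)
```

In this model neither pull creates a new commit, which is the point made
above: a merge commit is only needed when neither tip is an ancestor of
the other.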

-Peter


* Re: VCS comparison table
  2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18  9:04                                   ` Peter Baumann
@ 2006-10-18  9:07                                   ` Jakub Narebski
  2006-10-18 10:32                                   ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18  9:07 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, bazaar-ng, git

Andreas Ericsson wrote:
> Aaron Bentley wrote:
>> Well, there's distributed, and then there's *DISTRIBUTED*.  We don't
>> quasi-randomly merge each others' branches.  We have a star topology
>> around bzr.dev.  So when we refer to revnos, they're usually in bzr.dev.
>> 
> 
> So in essence, the revnos work wonderfully so long as there is a central 
> server to make them immutable?
> 
> Doesn't this mean that one of your key features doesn't actually work in 
> a completely distributed setup (i.e., each dev has his own repo, there 
> is no mother-ship, everyone pulls from each other)?
> 
> I can see the six-line hook that lays the groundwork for this in git 
> before me right now. I'll happily refuse to write it down anywhere. I 
> get the feeling that sha's are easier to handle in the long run, while 
> revno's might be good to use in development work. In git, we have 
> <branch/tag/"committish">~<number> syntax for this.
> 
> In my experience, finding the revision sha of an old bug is what takes 
> time. Copy-paste is just as fast with 20 bytes as with 4 bytes. Honestly 
> now, do you actually remember the revno for a bug that you stopped 
> working on three weeks ago, or do you have to go look it up? If someone 
> wants to notify you about the revision a bug was introduced, do they not 
> communicate the revno to you by email/irc/somesuch?

Revnos were supposed to be superior to using sha1 (or shortened sha1)
as commit identifiers because of two key features:
 1. They were simpler than sha1, and therefore easier to use
 2. Given two revisions related by lineage (i.e. one is an ancestor of
    the other) you can tell at a glance which revision was earlier

But the details invalidate 1.: for a large project with complicated
history, many contributors and nonlinear development, an immutable
revno looks like www.repository.com:127.2.31.57, compared with 988859a
(a 7-character abbreviation of the sha1). And we have to use _immutable_
(for up to a few years) revision identifiers, unless we want our
"simple ids" scheme to make a mess...

And I'm not sure 2. holds either, since even for revisions with direct
lineage we might have to compare, for example, 127.15.2.16 with
210.2.20.3. Having a generation number would solve 2.; as of now git
checks for the fast-forward case by checking whether the merge-base of
the two revisions is one of the revisions.
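
A minimal sketch of the generation-number idea, assuming the usual
definition (a root commit has generation 1, any other commit has one
more than the largest generation among its parents). The history in
PARENTS and the helper names are made up for illustration; this is not
how git stores or computes anything.

```python
from functools import lru_cache

# Hypothetical history: "m" merges a side branch (s1, s2) into the mainline.
PARENTS = {
    "r": [],
    "x": ["r"],
    "y": ["x"],
    "s1": ["r"],
    "s2": ["s1"],
    "m": ["y", "s2"],
}


@lru_cache(maxsize=None)
def generation(commit):
    """1 for a root commit, else 1 + the largest generation of its parents."""
    return 1 + max((generation(p) for p in PARENTS[commit]), default=0)


def is_ancestor(old, new):
    """True if `old` is reachable from `new` (or is the same commit)."""
    if generation(old) > generation(new):
        return False  # cheap rejection: an ancestor never has a larger number
    seen, stack = set(), [new]
    while stack:
        cur = stack.pop()
        if cur == old:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(PARENTS[cur])
    return False


print(generation("s1"), generation("m"))
print(is_ancestor("s1", "m"), is_ancestor("y", "s2"))
```

For two revisions related by lineage, the ancestor always has the
strictly smaller generation number, so a single integer comparison
orders them. The converse does not hold: "y" and "s2" have the same
generation here and are unrelated, so the number alone cannot prove
lineage.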
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-18  3:27                               ` Aaron Bentley
@ 2006-10-18  9:20                                 ` Jakub Narebski
  2006-10-18 16:31                                   ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18  9:20 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy

Aaron Bentley wrote:
> Carl Worth wrote:
>> On Wed, 18 Oct 2006 03:28:30 +0200, Jakub Narebski wrote:
>>> Isn't it easier to review than "bundle", aka. mega-patch?
>>
>> There are even more important reasons to prefer a series of
>> micro-commits over a mega-patch than just ease of merging.
> 
> A bundle isn't a mega-patch.  It contains all the source revisions.  So
> when you merge or pull it, you get all the original revisions in your
> repository.

But what the patch reviewer sees is a mega-patch showing the changeset
of a whole "bundle", isn't it?
[...]
>> Now, I do admit that it is often useful to take the overall view of a
>> patch series being submitted. This is often the case when a patch
>> series is in some sub-module of the code for which I don't have as
>> much direct involvement. In cases like that I will often do review
>> only of the diff between the tips of the mainline and the branch of
>> interest, (or if I trust the maintainer enough, perhaps just the
>> diffstat between the two). But I'm still very glad that what lands in
>> the history is the series of independent changes, and not one mega
>> commit.
> 
> So the difference here is that bundles preserve the original commits the
> changes came from, so even though it's presented as an overview, you
> still have a series of independent changes in your history.

I think it is much better to review a series of patches commit by commit;
besides, it allows correcting some inner patches before applying the whole
series, or dropping one of the patches in the series (and that has happened
from time to time on the git mailing list).

So if git introduces bundles, I think they would take the form of a series
of "patch" mails plus an introductory email with the series description
(currently it is not saved anywhere), shortlog, diffstat and perhaps more
metainfo like the bundle parent (which I think should really be the email
form of a branch), tags introduced etc.
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
       [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
@ 2006-10-18  9:28                             ` Erik Bågfors
  2006-10-18 11:08                               ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-18  9:28 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Aaron Bentley, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git, Jakub Narebski

On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
> where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> > Petr Baudis wrote:
> > > Another aspect of this is that Git (Linus ;) is very focused on getting
> > > the history right, nice and clean (though it does not _mandate_ it and
> > > you can just wildly do one commit after another; it just provides tools
> > > to easily do it).
> >
> > Yes, rebasing is very uncommon in the bzr community.  We would rather
> > evaluate the complete change than walk through its history.  (Bundles
> > only show the changes you made, not the changes you merged from the
> > mainline.)
> >
> > In an earlier form, bundles contained a patch for every revision, and
> > people *hated* reading them.  So there's definitely a cultural
> > difference there.
>
> BTW, I think what describes the Git's (kernel's) stance very nicely is
> what I call the Al Viro's "homework problem":
>
>         http://lkml.org/lkml/2005/4/7/176
>
> If I understand you right, the bzr approach is what's described as "the
> dumbest kind" there? (No offense meant!)

Yes and no, The bundle includes both the full final thing, and each
step along the way. Each step along the way is something you'll get
when you merge it.

Once merged, it will be "next one" in the description above. It would
typically look something like this in "bzr log"(shortened)  In this
example, doing C requires doing A and B as well...

committer: foobar@foobar.com
message: merged in C
      -------
      committer: bar@bar.com
      message: opps, fix bug in A
      -------
      committer: bar@bar.com
      message: implement B
      -------
      committer: bar@bar.com
      message: implement A

So, you'll get full history, including errors made :)  You can also
see who approved it to this branch (foobar) and who did the actual
work (bar)

/Erik


* Re: VCS comparison table
  2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18  9:04                                   ` Peter Baumann
  2006-10-18  9:07                                   ` Jakub Narebski
@ 2006-10-18 10:32                                   ` Matthew D. Fuller
  2006-10-18 11:19                                     ` Andreas Ericsson
  2 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-18 10:32 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Linus Torvalds, Carl Worth, bazaar-ng, git,
	Jakub Narebski

On Wed, Oct 18, 2006 at 10:39:32AM +0200 I heard the voice of
Andreas Ericsson, and lo! it spake thus:
>
> So in essence, the revnos work wonderfully so long as there is a
> central server to make them immutable?

It seems from my somewhat detached perspective that there's a lot of
conflation of 'conventions' with 'capabilities' around this thread...


With a single linear branch, revnos work wonderfully, and are probably
much more useful than any sort of UUID.  It would be silly in this day
and age to design a VCS aimed specifically for this use case, of
course.  That doesn't mean a VCS shouldn't make it easy, though.


With a star config, revnos are useful locally and with reference to
the "main" branch[es].  And, most of the world is star configs of one
sort or another.  Actually, one might say that practically ALL the
world outside of linux-kernel is star-configs   ;)

In many cases in the star setup, a revno (particularly along the
'trunk') is more directly useful than a UUID; consider particularly
the case of somebody who's just mirroring/following, not actively
developing.  In some cases, the UUID is more useful.  Certainly, using
a revno in a case where the UUID is more appropriate is Bad, but
that's just a matter of using the right tool.


With a uber-distributed full-mesh setup, revnos may be basically
useless for anything except local lookups (which boils down to
"useless for most anything you'd identify a revision for").  For that
case, you'd practically always use the UUID, and pretend revnos don't
exist.


The merge revno forms (123.5.2.17 and the like), I'm somewhat
ambivalent about in many ways.  But, you don't have to use them any
more than you have to use "top-level" revnos.  If either form of revno
is Wrong for your case (whether it be because "I hate numbers
wholesale", or because "Numbers don't cover this case usefully"), then
you just use the UUID and pretend the number isn't there.  If you
wanted them completely out of sight, I wouldn't expect it to be very
hard to talk bzr into never showing the revnos and just showing the
UUID ("revid").



[ I don't speak for bzr, despite the fact that I'm about to appear to ]

From where I sit, revnos are quite useful in the first 1.5 or 2 cases.
Some would argue that they're not useless in the third case as well,
but that's no necessary point to hash out; it certainly does no
technical harm to have them there, since you can just ignore them if
they don't help you.  I think a good case could be made that the vast
majority of VCS use in the world is a form of case 2.

Git comes out of a world where case 3 is All, and the other cases are,
if not actively ignored, at least far secondary considerations, so it
can hardly be surprising that it doesn't have or want something that
adds practically nothing to its case.

bzr, both in its own development schema, and in the expected audience,
is overwhelmingly case 2 (of which case 1 is really just a degenerate
version), but that doesn't mean case 3 is ignored or impossible.  The
UUID's are there for when you need them, and can be used anywhere you
might use a number, and just as easily.  It's a community convention
to organize development in such a way that the number is "usually"
useful, and when it is, it's certainly easier.  That doesn't mean you
HAVE to use it in cases where it doesn't fit, though.  "bzr people
like to avoid using UUID's" doesn't lead to "bzr can't handle the
cases where UUID's are necessary".


> Doesn't this mean that one of your key features doesn't actually
> work in a completely distributed setup

That's one way of phrasing it, I guess.  I'd say rather "a particular
feature isn't applicable to a completely distributed setup".  I'm sure
git has a lot of features that are key for somebody that "don't work"
for someone else, just because they're doing something that person
doesn't want done.  Just because somebody else thinks their toaster
oven is a great way to solder, doesn't mean you have to sell yours.
You can just leave it in the cupboard and use an iron instead.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-18  9:28                             ` Erik Bågfors
@ 2006-10-18 11:08                               ` Petr Baudis
  2006-10-18 11:17                                 ` Jakub Narebski
  2006-10-18 13:09                                 ` Erik Bågfors
  0 siblings, 2 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 11:08 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Aaron Bentley, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git, Jakub Narebski

Dear diary, on Wed, Oct 18, 2006 at 11:28:32AM CEST, I got a letter
where Erik Bågfors <zindar@gmail.com> said that...
> On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> >Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
> >where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> >> Petr Baudis wrote:
> >> > Another aspect of this is that Git (Linus ;) is very focused on getting
> >> > the history right, nice and clean (though it does not _mandate_ it and
> >> > you can just wildly do one commit after another; it just provides tools
> >> > to easily do it).
> >>
> >> Yes, rebasing is very uncommon in the bzr community.  We would rather
> >> evaluate the complete change than walk through its history.  (Bundles
> >> only show the changes you made, not the changes you merged from the
> >> mainline.)
> >>
> >> In an earlier form, bundles contained a patch for every revision, and
> >> people *hated* reading them.  So there's definitely a cultural
> >> difference there.
> >
> >BTW, I think what describes the Git's (kernel's) stance very nicely is
> >what I call the Al Viro's "homework problem":
> >
> >        http://lkml.org/lkml/2005/4/7/176
> >
> >If I understand you right, the bzr approach is what's described as "the
> >dumbest kind" there? (No offense meant!)
> 
> Yes and no, The bundle includes both the full final thing, and each
> step along the way. Each step along the way is something you'll get
> when you merge it.
> 
> Once merged, it will be "next one" in the description above. It would
> typically look something like this in "bzr log"(shortened)  In this
> example, doing C requires doing A and B as well...
> 
> committer: foobar@foobar.com
> message: merged in C
>      -------
>      committer: bar@bar.com
>      message: opps, fix bug in A
>      -------
>      committer: bar@bar.com
>      message: implement B
>      -------
>      committer: bar@bar.com
>      message: implement A
> 
> So, you'll get full history, including errors made :)  You can also
> see who approved it to this branch (foobar) and who did the actual
> work (bar)

I see, that's what I've been missing, thanks. So it's the middle path
(like any other commonly used VCS for that matter, except maybe darcs?;
patch queues and rebasing count but it's a hack, not something properly
supported by the design of Git, since at this point the development
cannot be fully distributed).

I also assume that, given this is the case, the big diff really does not
serve any purpose besides human review?

But somewhere else in the thread it's been said that bundles can also
contain merges. Does that mean that bundles can look like:

   1
  / \
 2   4
 |   | _
 3   5  |
  \ /   | a bundle
   6    |
       ~

In that case, what is the big diff from 6 done against? 2? 4? Or even 1?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-18  8:53                 ` Andreas Ericsson
@ 2006-10-18 11:15                   ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 11:15 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Robert Collins, bazaar-ng, git, Jakub Narebski

Dear diary, on Wed, Oct 18, 2006 at 10:53:16AM CEST, I got a letter
where Andreas Ericsson <ae@op5.se> said that...
> Robert Collins wrote:
> >Anyone can push and pull from each other - full stop. Whenever they
> >'pull' in bzr terms, they get fast-forward happening (if I understand
> >the git fast-forward behaviour correctly). After a fast-forward, the
> >dotted decimal revision numbers in the two branches are identical - and
> >they remain immutable until another fast forward occurs.
..snip..
> >You can determine it locally - if you know any of the motherships
> >revisions locally, we can generate the dotted-revnos that the
> >motherships master-branch would have from the local data - and the last
> >merge of mothership you did will have given you that details.
> 
> 
> To me, this means bazaar isn't distributed at all and I could achieve 
> much the same distributedness(?) by rsyncing an SVN repo, working 
> against that and then rsyncing it back with some fancy merging. In other 
> words, bazaar requires there to be one Lord of the Code, or some of the 
> key features break down.

Well as far as I understand, the Lord of the Code is whoever you pulled
from the last time.

It's just a different focus here. If I understood everything in this
thread correctly, both Git and Bazaar have persistent (SHA1, UUID) and
volatile (revspec, revision number) revision ids. The only difference is
that Git primarily presents the user with the SHA1 ids while Bazaar
primarily presents the user with a revision number (and that revspecs
change after every commit while revision numbers change only after a
merge).
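
To illustrate the "volatile" half of that comparison, here is a small
sketch of how a <branch>~<number> revspec could be resolved by walking
first parents. The commit names c1..c4 and the resolve() helper are
hypothetical stand-ins; real ids would be SHA1s.

```python
# Hypothetical first-parent chain, oldest to newest; c1 is the root commit.
FIRST_PARENT = {"c1": None, "c2": "c1", "c3": "c2", "c4": "c3"}


def resolve(tip, n, first_parent=FIRST_PARENT):
    """Follow the first parent n times from `tip`, like <tip>~<n>."""
    cur = tip
    for _ in range(n):
        cur = first_parent[cur]
        if cur is None:
            raise ValueError("history is shorter than the requested offset")
    return cur


head = "c3"
before = resolve(head, 1)  # head~1 while the branch tip is c3
head = "c4"                # a new commit lands on the branch
after = resolve(head, 1)   # the same revspec now names a different commit
print(before, after)
```

The persistent names (c2, c3) keep pointing at the same commits
throughout, while the relative name head~1 moves as soon as the branch
gains a commit.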

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-18 11:08                               ` Petr Baudis
@ 2006-10-18 11:17                                 ` Jakub Narebski
  2006-10-18 13:09                                 ` Erik Bågfors
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18 11:17 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Erik Bågfors, Aaron Bentley, Matthieu Moy, bazaar-ng,
	Linus Torvalds, Andreas Ericsson, git

Petr Baudis wrote:
> But somewhere else in the thread it's been said that bundles can also
> contain merges. Does that mean that bundles can look like:
>
>    1
>   / \
>  2   4
>  |   | _
>  3   5  |
>   \ /   | a bundle
>    6    |
>        ~
>
> In that case [merge bundle], what is the big diff from 6 done against?
> 2? 4? Or even 1?

Or do you use an equivalent of git's combined diff format?
http://www.kernel.org/pub/software/scm/git/docs/git-diff-tree.html
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-18 10:32                                   ` Matthew D. Fuller
@ 2006-10-18 11:19                                     ` Andreas Ericsson
  2006-10-18 12:43                                       ` Matthew D. Fuller
  0 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-18 11:19 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Aaron Bentley, Linus Torvalds, Carl Worth, bazaar-ng, git,
	Jakub Narebski

Matthew D. Fuller wrote:
> On Wed, Oct 18, 2006 at 10:39:32AM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
>> So in essence, the revnos work wonderfully so long as there is a
>> central server to make them immutable?
> 
> 
> With a star config, revnos are useful locally and with reference to
> the "main" branch[es].  And, most of the world is star configs of one
> sort or another.  Actually, one might say that practically ALL the
> world outside of linux-kernel is star-configs   ;)
> 

That might be the case today. However, since we introduced git at the 
office, mini-projects are cropping up like mad, and pieces of toy-code 
are being pushed around among the employees. When something is found to 
be useful enough to attract management attention, it's given a spot at 
the "master site". It doesn't need one. It's just that we have this one 
place where gitweb is installed, which management likes whereas devs 
don't have that on their laptop. It's also convenient to have one place 
to find all changes rather than pulling from 1-to-N different people 
just to have a look at what they've done.

The point I'm trying to make here is that the star config might be the 
most common case today because
a) old SCMs enforced this use case and it is therefore the most common
way just out of habit.
b) projects you actually *see* have gotten past the "Joe made some cool 
changes, pull his 'jukebox-ui' branch".


> In many cases in the star setup, a revno (particularly along the
> 'trunk') is more directly useful than a UUID; consider particularly
> the case of somebody who's just mirroring/following, not actively
> developing.  In some cases, the UUID is more useful.  Certainly, using
> a revno in a case where the UUID is more appropriate is Bad, but
> that's just a matter of using the right tool.
> 

I can easily imagine the use case Linus pointed out with BK. Because 
revnos work wonderfully 80% of the time, people get confused, frustrated 
and downright pissed off when they don't.

> 
> With a uber-distributed full-mesh setup, revnos may be basically
> useless for anything except local lookups (which boils down to
> "useless for most anything you'd identify a revision for").  For that
> case, you'd practically always use the UUID, and pretend revnos don't
> exist.
> 

But they *do* exist, and they *usually* work, so people are bound to try 
them first. Teaching them when they work and when they don't (or rather, 
when they should and when they shouldn't, cause they will work by 
accident sometimes too) is bound to be a lot harder than sending them a 
10 char irc message.

> 
> The merge revno forms (123.5.2.17 and the like), I'm somewhat
> ambivalent about in many ways.  But, you don't have to use them any
> more than you have to use "top-level" revnos.  If either form of revno
> is Wrong for your case (whether it be because "I hate numbers
> wholesale", or because "Numbers don't cover this case usefully"), then
> you just use the UUID and pretend the number isn't there.  If you
> wanted them completely out of sight, I wouldn't expect it to be very
> hard to talk bzr into never showing the revnos and just showing the
> UUID ("revid").
> 

So what's the point in having them? You can't seriously tell me that you
think of 123.5.2.17 as something you can easily remember, can you? Count
the times, during one day, when you use the revnos and type them manually.

> 
> 
> [ I don't speak for bzr, despite the fact that I'm about to appear to ]
> 
> From where I sit, revnos are quite useful in the first 1.5 or 2 cases.
> Some would argue that they're not useless in the third case as well,
> but that's no necessary point to hash out; it certainly does no
> technical harm to have them there, since you can just ignore them if
> they don't help you.  I think a good case could be made that the vast
> majority of VCS use in the world is a form of case 2.
> 
> Git comes out of a world where case 3 is All, and the other cases are,
> if not actively ignored, at least far secondary considerations, so it
> can hardly be surprising that it doesn't have or want something that
> adds practically nothing to its case.
> 

Not really. It's just that case 3 is the most flexible of them all. It's 
trivial to enforce linear development in git. Just add a hook that 
forbids merge commits. Set up a "master repo" and put the hook there and 
you've turned it into CVS with off-line log-browsing (more or less).
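
A rough sketch of the policy such a hook would enforce, written against
a toy parent table rather than a real repository. The function names and
the commit ids are hypothetical. In a real update hook, which receives
the ref name plus the old and new ids, the list of offending commits
could presumably come from something like
`git rev-list --merges <old>..<new>`; that wiring is an assumption and
is not shown here.

```python
def merge_commits(new_commits, parents):
    """Return the commits in `new_commits` that have more than one parent."""
    return [c for c in new_commits if len(parents[c]) > 1]


def accept_push(new_commits, parents):
    """Hypothetical hook policy: accept only if no pushed commit is a merge."""
    offenders = merge_commits(new_commits, parents)
    return (not offenders, offenders)


# Toy data: "m" merges "y" and "s"; everything else is linear.
parents = {"r": [], "x": ["r"], "y": ["x"], "s": ["r"], "m": ["y", "s"]}

print(accept_push(["x", "y"], parents))       # linear push
print(accept_push(["y", "s", "m"], parents))  # push that contains a merge
```

Rejecting every commit with two or more parents is what keeps the
history on the master repo strictly linear; contributors would have to
rebase their work instead of merging it.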

Set up a master-server and enable the reflog there and you've turned it 
into bazaar, more or less.

In git, the mothership repo is there for convenience, because it's nice
to have one place to set up mailing-list hooks, gitweb, git-daemon and
the like. Everything works *exactly* as it would have done without it
in all repos around the world.


> bzr, both in its own development schema, and in the expected audience,
> is overwhelmingly case 2 (of which case 1 is really just a degenerate
> version), but that doesn't mean case 3 is ignored or impossible.  The
> UUID's are there for when you need them, and can be used anywhere you
> might use a number, and just as easily.  It's a community convention
> to organize development in such a way that the number is "usually"
> useful, and when it is, it's certainly easier.  That doesn't mean you
> HAVE to use it in cases where it doesn't fit, though.  "bzr people
> like to avoid using UUID's" doesn't lead to "bzr can't handle the
> cases where UUID's are necessary".
> 

Have a look at the list of things that CVS "can handle" and compare it 
mentally to the things CVS "handles gracefully" and you'll see why 
people have stopped using it.

> 
>> Doesn't this mean that one of your key features doesn't actually
>> work in a completely distributed setup
> 
> That's one way of phrasing it, I guess.  I'd say rather "a particular
> feature isn't applicable to a completely distributed setup".

So how come it's in the same list of features as the "distributed 
repository model", and both are marked as supported when they're 
apparently mutually exclusive?


>  I'm sure
> git has a lot of features that are key for somebody that "don't work"
> for someone else, just because they're doing something that person
> doesn't want done.

The main point, the *important* point about git is that everything it 
shows always makes sense and works in exactly the same way no matter 
which setup you use. There are no features in git that are mutually 
exclusive, or only sane in one particular setup but not in others. You 
can use them all or pick which ones you like. Whatever you choose, it 
never comes at the expense of losing something else.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 11:19                                     ` Andreas Ericsson
@ 2006-10-18 12:43                                       ` Matthew D. Fuller
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
                                                           ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-18 12:43 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Linus Torvalds, Carl Worth, bazaar-ng, git,
	Jakub Narebski

On Wed, Oct 18, 2006 at 01:19:10PM +0200 I heard the voice of
Andreas Ericsson, and lo! it spake thus:
> 
> It's just that we have this one place where gitweb is installed,
> which management likes whereas devs don't have that on their laptop.
> It's also convenient to have one place to find all changes rather
> than pulling from 1-to-N different people just to have a look at
> what they've done.

I think this just by itself lends support to:

> The point I'm trying to make here is that the star config might be
> the most common case today because

c) Stars work well as a mental model for humans.

Heck, in large, Linux is star-ish.  There's "2.6.1", "2.6.2", etc;
that's a trunk.  Any time you have releases, you're establishing a
"master" branch.  For most people using Linux, there's a trunk,
whether it's the kernel.org trunk, or the "What Redhat ships" trunk,
etc.  The closer you drill to the day-to-day work on the kernel, the
farther it gets from trunks, but if it were full-mesh at all levels I
don't think it would be nearly as usable for regular computing tasks
as it is.


Perhaps someday a heavy full-mesh setup will be the common case for
VCS usage.  I find that very difficult to buy for various reasons, but
it could happen.  If it does, bzr may well revisit the choice and
decide revnos contribute little enough marginal value as to be a loss,
and discard them.  But that's not today.


> But they *do* exist, and they *usually* work, so people are bound to
> try them first. Teaching them when they work and when they don't (or
> rather, when they should and when they shouldn't, cause they will
> work by accident sometimes too) is bound to be a lot harder than
> sending them a 10 char irc message.

Perhaps, for some projects.  And in those cases, perhaps you'd want to
flip a hypothetical "dump those numbers in the bin" switch.  That
doesn't mean every project wants to, or that those projects who don't
and have no trouble and discernible gain from revno usage are
hypothetical.


> So what's the point in having them? You can't seriously tell me that
> you think of 123.5.2.17 as something you can easily remember, do
> you? Count the times, during one day, where you use the revnos and
> type them manually.

No, I don't.  But I don't use merge revnos for various reasons, one of
the primary ones being that they don't currently follow intuitively
for me (and that intuitiveness is the major attraction of revnos in
the first place).

I rarely refer to non-mainline revisions at all, in fact.  And I use
revnos for mainline revisions regularly.  Heck, I communicate revnos
_verbally_; people handle that easily with numbers, not so easily with
hex strings.  The vast majority of my branches are simple cases, and I
like simple tools that match simple mental models for them.  For the
more intricate cases, revids provide a more rigorous tool, and I WANT
a VCS that lets me choose which is appropriate.  If I wanted a
computer to tell me how to work, I'd run Windows    ;)


> Not really. It's just that case 3 is the most flexible of them all.

Yes, but this doesn't necessarily mean everything you seem to try and
cover with it.  The more rigorous tool will cover the simplest case
(those being just a degenerate form of the more complex after all),
but that doesn't mean it's the EASIEST way of handling that case.


> Everything works *exactly* as it would have done without it in all
> repos around the world.

And if you use the UUID's, the same applies to bzr.

That is, if you use git like you use git, the above is true.  If you
use bzr like you use git, the above is ALSO true.

The difference is that bzr ALSO chooses to support and optimize for a
different case in the default UI presentation, because We[0] consider
that far and away the common case on the one hand, and that people
trying to use the more complex case are ipso facto more able to use a
behavior differing from the norm on the other.


[0] Note how adroitly I again speak for other people.  Practice,
    practice!


> >That's one way of phrasing it, I guess.  I'd say rather "a
> >particular feature isn't applicable to a completely distributed
> >setup".
> 
> So how come it's in the same list of features as the "distributed
> repository model", and both are marked as supported when they're
> apparently mutually exclusive?

I assume in this you're referring to the RcsComparisons page that
started the thread.  First off, I don't agree with all the
characterizations on the page, so don't expect me to support it as
gospel.  That said, they're not "mutually exclusive"; one is just
inapplicable in extreme cases of the other.  "Plugins" is on the same
list as "distributed repository model" too.  And you can't count on
other people having the same plugins as you, so it's just as "mutually
exclusive" with distributed.


> The main point, the *important* point about git is that everything
> it shows always makes sense and works in exactly the same way no
> matter which setup you use.  There are no features in git that are
> mutually exclusive, or only sane in one particular setup but not in
> others.

I find it really hard to believe that that's strictly true, just as a
general rule.  For that matter, I think it's demonstrably false: using
SHA1 hashes as revision identifiers in a simple linear tree with 5
revs doesn't strike me as "sane".  But that aside...

I don't think of that as a positive thing.  There are lots of things
that make sense in certain setups that don't in others.  We have two
techniques, A and B, and two general cases, X and Y.  A works really
well for X, and is useless with Y.  B works ok for X, and handles Y
well.  "Use A for X and B for Y" seems like a heck of a lot better
answer than "Only support B".  You certainly CAN shape wood joints
with just a claw hammer, but I wouldn't want to.  A jigsaw makes it
much easier, no matter how useless it may be for forging iron.


Your position seems to be, in essence, "This feature can be misused,
therefore it should be eliminated".  And you should certainly use a
tool that provides the behavior you want.  So, too, should other
people.

I don't want to use git for any number of reasons, which sum up
concisely if undescriptively as "It doesn't work for me", but it seems
to work great for the community it was built for, and that's
excellent.  Not all aspects of that design work well for other people,
though; no matter how poorly some capability "fits" you
(non-specific), it can still fit others very well.  This particular
item certainly seems one of those significant divides.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
  2006-10-18 13:02                                           ` Sean
@ 2006-10-18 13:02                                           ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-18 13:02 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Andreas Ericsson, Aaron Bentley, Linus Torvalds, Carl Worth,
	bazaar-ng, git, Jakub Narebski

On Wed, 18 Oct 2006 07:43:20 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> The difference is that bzr ALSO chooses to support and optimize for a
> different case in the default UI presentation, because We[0] consider
> that far and away the common case on the one hand, and that people
> trying to use the more complex case are ipso facto more able to use a
> behavior differing from the norm on the other.
> 
> [0] Note how adroitly I again speak for other people.  Practice,
>     practice!

Just to be clear here, Git is also able to support this model if
you so choose.  It's quite easy for a server to generate Git tags
for every commit it gets.

It's just that this is basically a non issue in the Git world.  People
who use Git aren't crying out for salvation from sha1 numbers.  So I
think this entire discussion is a bit overblown.

But just to be clear, there is nothing in the Git model that prohibits
tagging every commit with something you find less objectionable than
sha1's.  They can appear in the log listings and in gitk etc, and
everyone who pulls from the central server will get them.  In fact,
for some imports of other VCS into Git, exactly that is done; so every
commit can be referenced by its sha1 _or_ the "friendly" number it was
known by in its original VCS.
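
A minimal sketch of that idea, not taken from any real import tool: the
throwaway repository, the commit messages and the r1, r2, ... tag names
are all assumptions made up for illustration. It walks a linear history
oldest-first and gives every commit a sequential lightweight tag, so each
commit can be named by its sha1 or by its "friendly" number.

```shell
set -eu
repo=$(mktemp -d)
# Helper: run git inside the scratch repository with a fixed identity.
g() { git -C "$repo" -c user.name=Example -c user.email=example@example.invalid "$@"; }

g init -q
for msg in first second third; do
    echo "$msg" >> "$repo/file"
    g add file
    g commit -q -m "$msg"
done

# Walk history oldest-first and tag each commit r1, r2, r3, ...
n=0
for sha in $(g rev-list --reverse HEAD); do
    n=$((n + 1))
    g tag "r$n" "$sha"
done

# The tags now show up next to the abbreviated sha1 in the log.
g log --format='%h%d %s'
```

On a real server the same loop would presumably live in a hook and only
tag newly arrived commits; that part is left out here.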

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 11:08                               ` Petr Baudis
  2006-10-18 11:17                                 ` Jakub Narebski
@ 2006-10-18 13:09                                 ` Erik Bågfors
  1 sibling, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-18 13:09 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, bazaar-ng, Linus Torvalds, Andreas Ericsson, git,
	Jakub Narebski

On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Wed, Oct 18, 2006 at 11:28:32AM CEST, I got a letter
> where Erik Bågfors <zindar@gmail.com> said that...
> > On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> > >Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
> > >where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> > >> Petr Baudis wrote:
> > >> > Another aspect of this is that Git (Linus ;) is very focused on getting
> > >> > the history right, nice and clean (though it does not _mandate_ it and
> > >> > you can just wildly do one commit after another; it just provides tools
> > >> > to easily do it).
> > >>
> > >> Yes, rebasing is very uncommon in the bzr community.  We would rather
> > >> evaluate the complete change than walk through its history.  (Bundles
> > >> only show the changes you made, not the changes you merged from the
> > >> mainline.)
> > >>
> > >> In an earlier form, bundles contained a patch for every revision, and
> > >> people *hated* reading them.  So there's definitely a cultural
> > >> difference there.
> > >
> > >BTW, I think what describes the Git's (kernel's) stance very nicely is
> > >what I call the Al Viro's "homework problem":
> > >
> > >        http://lkml.org/lkml/2005/4/7/176
> > >
> > >If I understand you right, the bzr approach is what's described as "the
> > >dumbest kind" there? (No offense meant!)
> >
> > Yes and no, The bundle includes both the full final thing, and each
> > step along the way. Each step along the way is something you'll get
> > when you merge it.
> >
> > Once merged, it will be "next one" in the description above. It would
> > typically look something like this in "bzr log"(shortened)  In this
> > example, doing C requires doing A and B as well...
> >
> > committer: foobar@foobar.com
> > message: merged in C
> >      -------
> >      committer: bar@bar.com
> >      message: opps, fix bug in A
> >      -------
> >      committer: bar@bar.com
> >      message: implement B
> >      -------
> >      committer: bar@bar.com
> >      message: implement A
> >
> > So, you'll get full history, including errors made :)  You can also
> > see who approved it to this branch (foobar) and who did the actual
> > work (bar)
>
> I see, that's what I've been missing, thanks. So it's the middle path
> (as any other commonly used VCS for that matter, except maybe darcs?;
> patch queues and rebasing count but it's a hack, not something properly
> supported by the design of Git, since at this point the development
> cannot be fully distributed).
>
> I also assume that given this is the case, the big diff does really not
> serve any purpose besides human review?
>
> But somewhere else in the thread it's been said that bundles can also
> contain merges. Does that mean that bundles can look like:
>
>    1
>   / \
>  2   4
>  |   | _
>  3   5  |
>   \ /   | a bundle
>    6    |
>        ~
>
> In that case, against what the big diff from 6 is done? 2? 4? Or even 1?

When you run the "bundle" command, you can tell it what you want the
bundle to be created against.  So, if I just committed 5, I can run
"bzr bundle -r-1" to get the bundle against 4, or I can do "bzr bundle
path/to/other/branch" to get a bundle that relates to it.

To merge a bundle into a branch, the parent of the first revision in
the bundle has to exist in the branch it's being merged into. (well,
unless you use patch, but that's outside of bzr, and bzr wouldn't know
about each revision in them)

This command will find a common root and create a bundle that
corresponds to it.  The "big diff" as you call it, would be the
changes between the point where the branch was created, and the last
commit.

In the case of just committing 5, and you want to create a bundle that
can be merged back at point 6, the "big diff" would be against 1 since
that's the branch point.

/Erik

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 12:43                                       ` Matthew D. Fuller
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
@ 2006-10-18 13:10                                         ` Jakub Narebski
  2006-10-18 16:07                                         ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18 13:10 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Andreas Ericsson, Aaron Bentley, Linus Torvalds, Carl Worth,
	bazaar-ng, git

On Wednesday, 18 October 2006 14:43, Matthew D. Fuller wrote:
> On Wed, Oct 18, 2006 at 01:19:10PM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
> > 
> > It's just that we have this one place where gitweb is installed,
> > which management likes whereas devs don't have that on their laptop.
> > It's also convenient to have one place to find all changes rather
> > than pulling from 1-to-N different people just to have a look at
> > what they've done.
> 
> I think this just by itself lends support to:
> 
> > The point I'm trying to make here is that the star config might be
> > the most common case today because
> 
> c) Stars work well as a mental model for humans.
> 
> Heck, in large, Linux is star-ish.  There's "2.6.1", "2.6.2", etc;
> that's a trunk.  Any time you have releases, you're establishing a
> "master" branch.  For most people using Linux, there's a trunk,
> whether it's the kernel.org trunk, or the "What Redhat ships" trunk,
> etc.  The closer you drill to the day-to-day work on the kernel, the
> farther it gets from trunks, but if it were full-mesh at all levels I
> don't think it would be nearly as usable for regular computing tasks
> as it is.

No, it is not. If you consider only the published Linus repository, and
the private repositories of other people, it usually is star-ish (although
the mentioned situation, where somebody else's repository took the place
of the center of the star-ish configuration, wouldn't be possible in a
true star-ish model). But please take note of the stable repository and
the -mm repository; changes are exchanged there and back again. And "What
Redhat ships" is AFAIK a mix of different repositories and its own patches.
 
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18  5:36                         ` Jeff King
  2006-10-18  5:57                           ` Junio C Hamano
@ 2006-10-18 14:52                           ` Linus Torvalds
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
  2006-10-18 21:20                             ` VCS comparison table Jeff King
  1 sibling, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 14:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Ericsson, bazaar-ng, git, Jakub Narebski



On Wed, 18 Oct 2006, Jeff King wrote:
> 
> I never used BK, but my understanding is that it was based on
> changesets, so a bundle was a group of changesets.

Yes.

> Because a git commit represents the entire tree state, how can we avoid 
> sending the entire tree in each bundle?

That's not the problem. That's easy to handle - and we already do. That's 
the whole point of the wire-transfer protocol (ie sending deltas, and only 
sending enough to actually matter).

> The interactive protocols can ask "what do you have?" but an email 
> bundle is presumably meant to work without a round trip.

Right, but they can do exactly what bk did: you have to have a reference 
to what the other side has. In git, that's usually even simpler: you'd do

	git send origin..

and that "origin" is what the other end is expected to already have.

Of course, if you send an unconnected bundle (ie you give an origin that 
the other end _doesn't_ have), you're screwed.

In other words, to get such a pack, we'd _literally_ just do something 
like

	git-rev-list --objects-edge origin.. |
		git-pack-objects --stdout |
		uuencode

and that would be it. You'd still need to add a "diffstat" to the thing, 
and tell the other end what the current HEAD is (so that it knows what 
it's supposed to fast-forward to), but it _literally_ is that simple.
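
For what it's worth, later Git releases grew a "git bundle" command that
packages this up. What follows is only a sketch under assumptions, not the
pipeline above: the two scratch repositories, the "base" tag (standing in
for "origin") and the file names are invented for illustration. The sender
packs just the commits the receiver lacks into one file, and the receiver
fetches from that file with no round trip.

```shell
set -eu
work=$(mktemp -d)
# Helper: run git with a fixed identity so commits work on any machine.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

# Sender: one commit both sides share, tagged "base".
g init -q "$work/sender"
echo one > "$work/sender/file"
g -C "$work/sender" add file
g -C "$work/sender" commit -q -m "shared commit"
g -C "$work/sender" tag base
branch=$(g -C "$work/sender" symbolic-ref --short HEAD)

# Receiver starts out with exactly that shared history.
g clone -q "$work/sender" "$work/receiver"

# Sender adds new work, then packs only base..branch into a single file.
echo two >> "$work/sender/file"
g -C "$work/sender" commit -q -a -m "new work"
g -C "$work/sender" bundle create "$work/update.bundle" "base..$branch"

# Receiver checks it has the prerequisite commit, then fetches from the file.
g -C "$work/receiver" bundle verify "$work/update.bundle"
g -C "$work/receiver" fetch -q "$work/update.bundle" "$branch"

sent=$(g -C "$work/sender" rev-parse HEAD)
received=$(g -C "$work/receiver" rev-parse FETCH_HEAD)
echo "sent $sent, received $received"
```

If the receiver did not have "base", the verify step would fail, which is
the "unconnected bundle" case described above.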

"plug-in architecture" my ass. "I recognize this - it's UNIX!".

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18  4:55               ` Robert Collins
  2006-10-18  8:53                 ` Andreas Ericsson
@ 2006-10-18 15:31                 ` Linus Torvalds
  2006-10-18 15:50                   ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 15:31 UTC (permalink / raw)
  To: Robert Collins; +Cc: Andreas Ericsson, bazaar-ng, git, Jakub Narebski



On Wed, 18 Oct 2006, Robert Collins wrote:
> 
> More commonly though, like git users have 'origin' and 'master'
> branches, bzr users tend to have a branch that is the 'origin' (for bzr
> itself this is usually called bzr.dev), as well as N other branches for
> their own work, which is probably why we haven't seen the need to have a
> ui command to spit out the revnos for an arbitrary branch.

You mis-understand.

git doesn't have a "ui command to spit out the revnos for an arbitrary 
branch" either.

Normally, you'd just use the branch-name. Nobody ever uses the SHA1's 
directly.

What git does (and does very well) is to be _scriptable_. It was designed 
that way. I'm a UNIX guy. I think piping is very powerful. And when you 
script things, your scripts pass SHA1's around internally.

So for example, to repack a git archive, you'd normally do

	git repack -a -d

and you don't have any "UI" with SHA1 numbers. But internally, this used 
to be

	git-rev-list --all --objects |
		git-pack-objects 

where "git-rev-list" is the one that lists all object names (which are the 
SHA1 numbers), and "git-pack-objects" is the one that takes a list of 
objects and packs them. 

(These days, since our internal C libraries have become so much better, 
the object traversal is done internally to packing, so we don't actually 
use the pipe any more for repacking an archive, but that's just an 
implementation detail)

You seem to think that we use SHA1 names as _humans_. We don't. The SHA1 
names are used internally, and humans just use the branch names.

The only case you'd (as a human) use the SHA1 name is when you want to 
pass it on to another person that may have a different archive (ie you 
mail somebody a revision that is problematic). It would obviously be 
totally unworkable to say "it's the grand-parent of my current HEAD 
commit", since that's a local description. So instead, you'd say "it's 
commit 9550e59c4587f637d9aa34689e32eea460e6f50c".
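
A small sketch of that translation step, assuming a throwaway repository
with three made-up commits: "HEAD~2" (the grand-parent of HEAD) is a local
description, and "git rev-parse" turns it into the full object name that
can be mailed to somebody else.

```shell
set -eu
repo=$(mktemp -d)
# Helper: run git inside the scratch repository with a fixed identity.
g() { git -C "$repo" -c user.name=Example -c user.email=example@example.invalid "$@"; }

g init -q
for msg in first second third; do
    echo "$msg" >> "$repo/file"
    g add file
    g commit -q -m "$msg"
done

# "Grand-parent of my current HEAD" only means something in this repository...
stable=$(g rev-parse HEAD~2)
# ...whereas the full object name identifies the same commit in any copy.
echo "local name HEAD~2 is commit $stable"
subject=$(g log -1 --format=%s "$stable")
echo "its subject line is: $subject"
```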

So I think people (totally incorrectly) think that git users use a lot of 
SHA1 names, just because they see the git users on the kernel mailing list 
sending each others SHA1 names. But that's because you see only the case 
where you _want_ to communicate a stable revision name to another side. 
Sending a number like 1.57.8.312 to describe what commit broke would be a 
_bug_, because a person who has a differently shaped tree wouldn't even 
_have_ that revision.

But normally? You'd be hard-pressed to find anything but the branch (and 
tag) names on a command line.

See?

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18  3:10                               ` Aaron Bentley
  2006-10-18  8:39                                 ` Andreas Ericsson
@ 2006-10-18 15:38                                 ` Carl Worth
  2006-10-19  9:10                                   ` Matthew D. Fuller
  1 sibling, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-10-18 15:38 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 3786 bytes --]

On Tue, 17 Oct 2006 23:10:34 -0400, Aaron Bentley wrote:
> If B merges A before this, nothing happens, because B is already a
> superset of A.
>
> If B merges afterward, we get this:

Wow. Thanks for elucidating---again I was making some incorrect
assumptions about the system, so your answer was surprising and
appreciated.

So, am I correct in my understanding now that it's impossible for two
users to establish identical code history on both sides through merge?
If the two kept merging back and forth the history would pick up a new
commit each time even though there were no code changes. Right?

That's a startling property. I'm surprised to learn that the
generally-used mechanism for getting new changes doesn't have a mode
where it says "you're already up to date---doing nothing".

I do understand that there's a separate "pull" that does allow for
correct synchronization of a local repository with a remote
repository, and it does have the "up to date---doing nothing"
behavior. But as you already said, it's often avoided specifically
because it destroys locally-created revision numbers.

Another way of describing bzr's "pull" is that it establishes a
master-slave relationship between the remote and local repository,
(his numbers are more important than mine, so I'll throw mine away).
I think Linus already provided a good argument in this thread about
why that kind of asymmetry is bad for software projects and why tools
should not provide it.

So there are some aspects of the bzr design that rob from its ability
to function as a distributed version control system. It really does
bias itself toward centralization, (the so called "star topology" as
opposed to something "fully" distributed).

And by the way, some people seem to have the opinion that there's
something unique about the way the linux kernel is developed that
allows it to benefit from a fully distributed system. The assumption
seems to be that projects with a central tree won't benefit the same
way, and don't really need the full set of features of a distributed
system. That's not true in my experience.

With cairo, for example, we had been using cvs. Obviously, it imposes
a centralized model, but most of the active developers had been using
rsync or other repository synchronization so that we could at least do
offline history browsing. So even with cvs we had as much of a star
topology as possible, (but we didn't have offline commits to our
roaming repositories, nor did we have any sharing between them).

Now, after the switch from cvs to git, we still do have a central
repository that all developers share and push into, (this is distinct
from how linux or the git project itself use git). And git supports
this kind of shared central repository perfectly well.

But a lot of the big advantages the cairo project gets from git come
from our ability to now easily share branches among ourselves without
going through the central repository. We only push fully-cooked
branches to the central tree. But now, with everyone owning their own
publicly-visible repository with all their work in it, we can now
easily share the half-baked ideas we have with all their history. One
person can start an idea, and others can easily pick it up, (without
having to drop down to a mega-patch like we would have done with
cvs). And people actually have the ability to collaborate on turning
an answer into a solution, (in Al Viro's terminology).

So even a project that's very oriented around a single, central tree
can get a lot of benefit from being able to share things arbitrarily
between any two given repositories. And I think that any project will
naturally start doing more of this kind of sharing, (and benefitting
considerably from it), as it adopts tools that support it well.

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 15:31                 ` Linus Torvalds
@ 2006-10-18 15:50                   ` Jakub Narebski
  2006-10-18 16:22                     ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18 15:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Robert Collins, Andreas Ericsson, bazaar-ng, git

Linus Torvalds wrote:
> 
> On Wed, 18 Oct 2006, Robert Collins wrote:
>> 
>> More commonly though, like git users have 'origin' and 'master'
>> branches, bzr users tend to have a branch that is the 'origin' (for bzr
>> itself this is usually called bzr.dev), as well as N other branches for
>> their own work, which is probably why we haven't seen the need to have a
>> ui command to spit out the revnos for an arbitrary branch.
> 
> You mis-understand.
> 
> git doesn't have a "ui command to spit out the revnos for an arbitrary 
> branch" either.
> 
> Normally, you'd just use the branch-name. Nobody ever uses the SHA1's 
> directly.

With the exception of having sometimes commit-ids in the commit messages,
for example "Fixes bug introduced by aabbcc00" (although usually you just
write "Fixes bug in some_function in some_file"), and automatically
generated 
  This reverts d119e3de13ea1493107bd57381d0ce9c9dd90976 commit.
(in addition to 'Revert "<Commit title>"') for git-revert generated
commit messages.

And it is true that you usually use branchname, or branchname~n syntax.
Git even has git-name-rev to convert from sha1 to temporary, local
ref^m~n... syntax.


By the way, git has a very powerful syntax to get revisions, and
revision lists. For example "git-rev-list foo bar  ^baz" means
"list all the commits which are included in foo and bar lineage,
but not in baz", or more useful "git log origin..next".
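
To make the set arithmetic concrete, here is a sketch in a scratch
repository. The branch names foo, bar and baz come from the example above;
the commit messages and the file are invented for illustration.

```shell
set -eu
repo=$(mktemp -d)
# Helpers: run git in the scratch repository, and make a one-line commit.
g() { git -C "$repo" -c user.name=Example -c user.email=example@example.invalid "$@"; }
commit() { echo "$1" >> "$repo/file"; g add file; g commit -q -m "$1"; }

g init -q
commit base
g branch baz                 # baz stays at the shared starting point
g checkout -q -b foo
commit foo-work
g checkout -q -b bar baz     # bar also forks from baz
commit bar-work

# Everything reachable from foo or bar, minus what baz already has.
only_new=$(g log --format=%s foo bar ^baz | sort)
echo "$only_new"

# "baz..foo" is shorthand for "foo ^baz".
dotted=$(g rev-list baz..foo)
explicit=$(g rev-list foo ^baz)
```

The "base" commit never shows up in the first listing because it is
reachable from baz.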

How's that in bzr?
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 12:43                                       ` Matthew D. Fuller
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
  2006-10-18 13:10                                         ` Jakub Narebski
@ 2006-10-18 16:07                                         ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 16:07 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: bazaar-ng, Andreas Ericsson, Carl Worth, git, Jakub Narebski



On Wed, 18 Oct 2006, Matthew D. Fuller wrote:

> On Wed, Oct 18, 2006 at 01:19:10PM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
> > 
> > It's just that we have this one place where gitweb is installed,
> > which management likes whereas devs don't have that on their laptop.
> > It's also convenient to have one place to find all changes rather
> > than pulling from 1-to-N different people just to have a look at
> > what they've done.
> 
> I think this just by itself lends support to:
> 
> > The point I'm trying to make here is that the star config might be
> > the most common case today because
> 
> c) Stars work well as a mental model for humans.

I really don't think that's even true.

Most projects do tend to have a star-like setup, but I think that's 
largely due to historical tools, not mental models. 

For example, I used CVS professionally for too long a few years ago, and 
the thing I _really_ hated was exactly how it forced people who were 
working on "experimental stuff" to be so tightly organized around the 
central repository (and how they had to do things that were visible and 
annoying to the mainline).

And I think that's where the "star-like" situation breaks down: when you 
have a group of people who go off to do something experimental. Suddenly 
the "mainline" in that case isn't the central and most important 
repository any more, and instead you really have another second (and 
third, fourth etc) "centerpoint" that another group works around.

Now, what does that mean? It means that whenever you look at a big project 
from the outside, you tend to see a star-like thing: there's the "big 
common thing", and you won't even be _seeing_ the off-shoots, because they 
tend to be used by developers to try out new ideas etc. So it looks like a 
star, but it really isn't, and shouldn't be.

An SCM should support the _developers_, not the users. The users don't 
need an SCM, they just need a place to fetch the "standard" thing 
(preferably with a vendor that supports them or at least makes them feel 
comfy). But an SCM really should support the off-shoots, because that's 
where the exciting stuff happens.

Btw, this is also why distribution is so fundamentally important:

Most of the off-shoots tend to be failures, but that is as it should be. 
Again, this is where SVN and CVS and other centralized models fail 
_miserably_. Because branches are in a centralized repository, the cost of 
failure is visible to all, and thus people don't like creating branches 
for things that don't look "obviously viable" to the people around the 
central repository.

In contrast, in a truly distributed environment, a failed branch is 
something that people don't even KNOW about. Anybody can take the kernel 
git tree, start his own development line (with ten other people) and try 
to improve it. And if it fails, I'd never even know: there is literally 
_zero_ cost to everybody else from failed branches. And if they succeed, 
they'll just say "hey, pull this, it works, and it makes Xyz go five times 
faster".

		Linus


* Re: VCS comparison table
  2006-10-18 15:50                   ` Jakub Narebski
@ 2006-10-18 16:22                     ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 16:22 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git, Robert Collins



On Wed, 18 Oct 2006, Jakub Narebski wrote:
> > 
> > Normally, you'd just use the branch-name. Nobody ever uses the SHA1's 
> > directly.
> 
> With the exception of having sometimes commit-ids in the commit messages,
> for example "Fixes bug introduced by aabbcc00" (although usually you just
> write "Fixes bug in some_function in some_file"), and automatically
> generated 
>   This reverts d119e3de13ea1493107bd57381d0ce9c9dd90976 commit.

Yes. But in both cases, that's usually because you literally ended up 
having the commit name because somebody else (which _can_ be you) searched 
for it (with something like "bisect") and gave it to you.

So even that case is really about communicating a stable name from one 
place (the "find the bug") to another (the "revert the buggy commit").

So yes, _communication_ should always happen by full SHA1's, because those 
are the only things that always remain stable.

(The fact that "gitk" and I think "gitweb" can then turn them into 
hyperlinks in the commit message is obviously one reason we then tend to 
give them such prominent visibility - they actually end up being very 
useful later on).

In bzr, either you don't get the hyperlinks, or you need to use the 
non-simple name in the commit messages, since the simple names don't 
actually work. Either way, it's an inferior setup.

			Linus


* Re: VCS comparison table
  2006-10-18  9:20                                 ` Jakub Narebski
@ 2006-10-18 16:31                                   ` Aaron Bentley
  2006-10-21 15:56                                     ` Jan Hudec
  0 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-18 16:31 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Carl Worth, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
> 
>>Carl Worth wrote:
>>>There are even more important reasons to prefer a series of
>>>micro-commits over a mega-patch than just ease of merging.
>>
>>A bundle isn't a mega-patch.  It contains all the source revisions.  So
>>when you merge or pull it, you get all the original revisions in your
>>repository.
> 
> 
> But what patch reviewer see is a mega-patch showing the changeset
> of a whole "bundle", isn't it?
> [...]

Yes.  Carl was saying that, aside from the issue of what a reviewer
sees, a bundle is bad for other reasons.  I am saying those other
reasons don't apply.  I wasn't addressing the issue of what a reviewer sees.

To me, seeing the individual patches is like reading a book where every
page has a different word on it, and so it's hard to put it together
into a full sentence.  I'm not saying my way is The Right Way, just my
personal preference.

For larger pieces of work, we try to split them up into logical units,
and merge those units independently.

The Bundle format can also support a patch-by-patch output, but we don't
have UI to select that.

> I think it is much better to review series of patches commit by commit;
> besides it allows to correct some inner patches before applying the whole
> series or drop one of patches in series (and it happened from time to time
> on git mailing list).

It's important to remember that bundles represent revisions, not
patches.  When you merge a bundle, you

1. install those revisions into your repository.  These revisions are
   latent, as though they were on another branch.
2. merge the head revision of the bundle into your branch.

Virtually any merge selection process that works with branches would
also work with bundles.  So tweaking before merging is really a matter
of replacing the UI for 2.
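
As a rough illustration of the two steps above, here is a toy model in
Python. It is not bzr code: the Repo class, its methods and the revision
ids are all made up for the sketch. The only point it tries to show is
that installing a bundle adds revisions without moving the branch head,
and that the merge is a separate step that could be swapped for another
selection process.

```python
from typing import Dict, Optional, Tuple


class Repo:
    """Toy repository: revision-id -> parent ids, plus one branch head."""

    def __init__(self) -> None:
        self.revisions: Dict[str, Tuple[str, ...]] = {}
        self.head: Optional[str] = None

    def commit(self, rev_id: str) -> None:
        """Record a new revision on top of the current head."""
        parents = (self.head,) if self.head else ()
        self.revisions[rev_id] = parents
        self.head = rev_id

    def install(self, bundle_revs: Dict[str, Tuple[str, ...]]) -> None:
        """Step 1: store the bundled revisions; the branch head does not move."""
        for rev_id, parents in bundle_revs.items():
            self.revisions.setdefault(rev_id, parents)

    def merge(self, other_head: str, merge_id: str) -> None:
        """Step 2: record a merge of other_head into the current branch."""
        if other_head not in self.revisions:
            raise KeyError(f"revision {other_head!r} is not installed")
        self.revisions[merge_id] = (self.head, other_head)
        self.head = merge_id


repo = Repo()
repo.commit("base")

# Hypothetical bundle: two revisions, each naming its parent revision-id.
bundle = {"feat-1": ("base",), "feat-2": ("feat-1",)}

repo.install(bundle)             # revisions are now present but latent
head_after_install = repo.head   # unchanged by the install
repo.merge("feat-2", "merge-1")  # only this step touches the branch
print(head_after_install, repo.head, repo.revisions["merge-1"])
```

Because the revisions keep their own ids and parents, anything that works
on a branch (picking a different head to merge, for instance) would work
on the installed revisions in this model too.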

> So if git introduces bundles, I think they would take form of series
> of "patch" mails + introductory email with series description (currently
> it is not saved anywhere), shortlog, diffstat and perhaps more metainfo
> like bundle parent (which I think should be email form of branch really),
> tags introduced etc.

The parent in a bundle revision is the revision-id of the parent of that
revision in the branch.  I don't think it's possible to change that
parent id into something else, without changing the meaning of a bundle.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNlb40F+nu1YWqI0RAnxxAJ9ETibey1Qyvz/zVxdGipaHGtnddgCfTtzt
CQUZ2dK64BS5K5WYecFAsfM=
=bJxq
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-17 13:55                 ` Jakub Narebski
  2006-10-17 14:08                   ` Matthieu Moy
@ 2006-10-18 18:03                   ` Jeff Licquia
  1 sibling, 0 replies; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-18 18:03 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Tue, 2006-10-17 at 15:55 +0200, Jakub Narebski wrote:
> Matthieu Moy wrote:
> > This took time to come in bzr, but that's the bisect plugin:
> > 
> > http://bazaar-vcs.org/PluginRegistry
> 
> Hmmm... I wonder which SCM had it first.

You did.  The plugin is largely based on my experiences with the git
version, and explicitly gives credit in the comments.


* [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 14:52                           ` Linus Torvalds
@ 2006-10-18 18:52                             ` Petr Baudis
  2006-10-18 18:59                               ` Petr Baudis
                                                 ` (2 more replies)
  2006-10-18 21:20                             ` VCS comparison table Jeff King
  1 sibling, 3 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 18:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff King, Jakub Narebski, Aaron Bentley, Andreas Ericsson,
	bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 04:52:25PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> In other words, to get such a pack, we'd _literally_ just do something 
> like
> 
> 	git-rev-list --objects-edge origin.. |
> 		git-pack-objects --stdout |
> 		uuencode
> 
> and that would be it. You'd still need to add a "diffstat" to the thing, 
> and tell the other end what the current HEAD is (so that it knows what 
> it's supposed to fast-forward to), but it _literally_ is that simple.
> 
> "plug-in architecture" my ass. "I recognize this - it's UNIX!".

Took me exactly an hour from mkdir cogito-bundle to cg-push to
kernel.org. :-)

cogito-bundle is an example of how to create third-party addons or
plugins that add their own commands to Cogito and use its infrastructure.
It's not _that_ easy currently since you have to replicate a large part of
the build infrastructure locally; that could be fixed by installing some
"library makefiles" and the asciidoc toolkit to /usr/share or something, if
there were real demand for such an addon API. cg-help and the cg
wrapper will pick up the newly installed commands automagically. The
only thing missing is updating cogito(7) to list the addon commands,
which would take a bit more work.

Though it's an example, it's actually supposed to be useful, by doing
exactly what is outlined above - it lets you exchange commits over
mail by so-called "bundles", similar to e.g. Bazaar bundles - basically,
it is like push or fetch, but over email, and the commit ids are
preserved when transferred in bundles (if you just send patches, the
commit ids will end up different).

The provided cg-bundle and cg-unbundle commands are rather crude and
don't support many things - they don't actually include a diff, only a
diffstat, etc. The uuencoded bundle is inlined in the mail, which I
suspect isn't very useful; perhaps it would be more practical to just
attach it as a binary file. Feel free to send patches (or bundles ;).

An example bundle is available at

	http://pasky.or.cz/~pasky/cp/example-bundle.txt

as generated by

	cogito.master$ cg-bundle -r v0.18 -m"Subject is this" \
		-m"And some body now..." --stdout

and cogito-bundle is available at

	git://git.kernel.org/pub/scm/cogito/cogito-bundle.git/
	(gitweb http://kernel.org/git/?p=cogito/cogito-bundle.git)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
@ 2006-10-18 18:59                               ` Petr Baudis
  2006-10-18 19:04                                 ` Junio C Hamano
                                                   ` (2 more replies)
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
  2006-10-19  6:46                               ` Alexander Belchenko
  2 siblings, 3 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 18:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Dear diary, on Wed, Oct 18, 2006 at 08:52:25PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Dear diary, on Wed, Oct 18, 2006 at 04:52:25PM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
> > In other words, to get such a pack, we'd _literally_ just do something 
> > like
> > 
> > 	git-rev-list --objects-edge origin.. |
> > 		git-pack-objects --stdout |
> > 		uuencode
> > 
> > and that would be it. You'd still need to add a "diffstat" to the thing, 
> > and tell the other end what the current HEAD is (so that it knows what 
> > it's supposed to fast-forward to), but it _literally_ is that simple.
> > 
> > "plug-in architecture" my ass. "I recognize this - it's UNIX!".
> 
> Took me exactly an hour from mkdir cogito-bundle to cg-push to
> kernel.org. :-)

By the way, originally I just wanted to index and save the pack, but
when trying to feed it to git-index-pack, I kept getting

	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas

while feeding it to git-unpack-objects works fine. Any idea what's wrong?

(BTW, I got the id by sha1summing the pack file; is there an existing
way to name a pack properly if I have it lying around, unnamed? sha1sum
seems to be specific to a fairly new GNU coreutils version.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:59                               ` Petr Baudis
@ 2006-10-18 19:04                                 ` Junio C Hamano
  2006-10-18 19:13                                   ` Nicolas Pitre
  2006-10-18 19:09                                 ` Nicolas Pitre
  2006-10-18 20:08                                 ` Linus Torvalds
  2 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 19:04 UTC (permalink / raw)
  To: git

Petr Baudis <pasky@suse.cz> writes:

> Dear diary, on Wed, Oct 18, 2006 at 08:52:25PM CEST, I got a letter
> where Petr Baudis <pasky@suse.cz> said that...
>> Dear diary, on Wed, Oct 18, 2006 at 04:52:25PM CEST, I got a letter
>> where Linus Torvalds <torvalds@osdl.org> said that...
>> > In other words, to get such a pack, we'd _literally_ just do something 
>> > like
>> > 
>> > 	git-rev-list --objects-edge origin.. |
>> > 		git-pack-objects --stdout |
>> > 		uuencode
>> > 
>> > and that would be it. You'd still need to add a "diffstat" to the thing, 
>> > and tell the other end what the current HEAD is (so that it knows what 
>> > it's supposed to fast-forward to), but it _literally_ is that simple.
>> > 
>> > "plug-in architecture" my ass. "I recognize this - it's UNIX!".
>> 
>> Took me exactly an hour from mkdir cogito-bundle to cg-push to
>> kernel.org. :-)
>
> By the way, originally I just wanted to index and save the pack, but
> when trying to feed it to git-index-pack, I kept getting
>
> 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
>
> while feeding it to git-unpack-objects works fine. Any idea what's wrong?

Yes.  You told the pipeline, with --objects-edge, to create a
thin pack.  By definition that is _not_ indexable.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:59                               ` Petr Baudis
  2006-10-18 19:04                                 ` Junio C Hamano
@ 2006-10-18 19:09                                 ` Nicolas Pitre
  2006-10-18 20:08                                 ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-18 19:09 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Wed, 18 Oct 2006, Petr Baudis wrote:

> By the way, originally I just wanted to index and save the pack, but
> when trying to feed it to git-index-pack, I kept getting
> 
> 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> 
> while feeding it to git-unpack-objects works fine. Any idea what's wrong?

Did you really manage to miss the "heads-up: git-index-pack in "next" is 
broken" thread?

The fix:

diff --git a/index-pack.c b/index-pack.c
index fffddd2..56c590e 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -23,6 +23,12 @@ union delta_base {
 	unsigned long offset;
 };
 
+/*
+ * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
+ * to memcmp() only the first 20 bytes.
+ */
+#define UNION_BASE_SZ	20
+
 struct delta_entry
 {
 	struct object_entry *obj;
@@ -211,7 +217,7 @@ static int find_delta(const union delta_
                 struct delta_entry *delta = &deltas[next];
                 int cmp;
 
-                cmp = memcmp(base, &delta->base, sizeof(*base));
+                cmp = memcmp(base, &delta->base, UNION_BASE_SZ);
                 if (!cmp)
                         return next;
                 if (cmp < 0) {
@@ -232,9 +238,9 @@ static int find_delta_childs(const union
 
 	if (first < 0)
 		return -1;
-	while (first > 0 && !memcmp(&deltas[first - 1].base, base, sizeof(*base)))
+	while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ))
 		--first;
-	while (last < end && !memcmp(&deltas[last + 1].base, base, sizeof(*base)))
+	while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ))
 		++last;
 	*first_index = first;
 	*last_index = last;
@@ -312,7 +318,7 @@ static int compare_delta_entry(const voi
 {
 	const struct delta_entry *delta_a = a;
 	const struct delta_entry *delta_b = b;
-	return memcmp(&delta_a->base, &delta_b->base, sizeof(union delta_base));
+	return memcmp(&delta_a->base, &delta_b->base, UNION_BASE_SZ);
 }
 
 static void parse_pack_objects(void)


Nicolas
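
To see why comparing sizeof(union delta_base) bytes can go wrong, the
sketch below models the union with Python's ctypes. Assumptions: DeltaBase,
make_base and raw are illustrative names, c_uint64 stands in for a 64-bit
"unsigned long", and the filler bytes imitate whatever happens to sit in
the tail of the union beyond the 20-byte SHA-1. It is a model of the
problem, not the index-pack code itself.

```python
import ctypes

UNION_BASE_SZ = 20  # same constant the patch introduces


class DeltaBase(ctypes.Union):
    # Stand-in for `union delta_base`; c_uint64 models a 64-bit unsigned long.
    _fields_ = [("sha1", ctypes.c_ubyte * 20), ("offset", ctypes.c_uint64)]


def make_base(sha1: bytes, filler: int) -> DeltaBase:
    """Build a union whose bytes beyond the SHA-1 hold arbitrary filler."""
    base = DeltaBase()
    ctypes.memset(ctypes.addressof(base), filler, ctypes.sizeof(base))
    ctypes.memmove(ctypes.addressof(base), sha1, len(sha1))
    return base


def raw(base: DeltaBase, nbytes: int) -> bytes:
    """Return the first nbytes of the union, as memcmp() would see them."""
    return ctypes.string_at(ctypes.addressof(base), nbytes)


sha1 = bytes(range(20))       # dummy 20-byte object name
a = make_base(sha1, 0x00)
b = make_base(sha1, 0xFF)     # same SHA-1, different trailing bytes

size = ctypes.sizeof(DeltaBase)
print("sizeof(union):", size)
print("equal over sizeof(union):", raw(a, size) == raw(b, size))
print("equal over UNION_BASE_SZ:", raw(a, UNION_BASE_SZ) == raw(b, UNION_BASE_SZ))
```

Where alignment rounds the union up past 20 bytes, the two entries hold
the same SHA-1 yet differ when compared over the full size, which is the
kind of mismatch that would leave deltas unresolved.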


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:04                                 ` Junio C Hamano
@ 2006-10-18 19:13                                   ` Nicolas Pitre
  2006-10-18 19:18                                     ` Shawn Pearce
  2006-10-18 19:33                                     ` Junio C Hamano
  0 siblings, 2 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-18 19:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, 18 Oct 2006, Junio C Hamano wrote:

> Petr Baudis <pasky@suse.cz> writes:
> 
> > By the way, originally I just wanted to index and save the pack, but
> > when trying to feed it to git-index-pack, I kept getting
> >
> > 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> >
> > while feeding it to git-unpack-objects works fine. Any idea what's wrong?
> 
> Yes.  You told the pipeline, with --objects-edge, to create a
> thin pack.  By definition that is _not_ indexable.

Ah true.  I missed the "thin" pack.

Any idea why we should still prevent this?  It is not like it was a 
technical limitation.


Nicolas


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:13                                   ` Nicolas Pitre
@ 2006-10-18 19:18                                     ` Shawn Pearce
  2006-10-18 19:33                                       ` Nicolas Pitre
  2006-10-18 19:33                                     ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 19:18 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

Nicolas Pitre <nico@cam.org> wrote:
> On Wed, 18 Oct 2006, Junio C Hamano wrote:
> 
> > Petr Baudis <pasky@suse.cz> writes:
> > 
> > > By the way, originally I just wanted to index and save the pack, but
> > > when trying to feed it to git-index-pack, I kept getting
> > >
> > > 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> > >
> > > while feeding it to git-unpack-objects works fine. Any idea what's wrong?
> > 
> > Yes.  You told the pipeline, with --objects-edge, to create a
> > thin pack.  By definition that is _not_ indexable.
> 
> Ah true.  I missed the "thin" pack.
> 
> Any idea why we should still prevent this?  It is not like it was a 
> technical limitation.

It still is in sha1-file.c; or at least the last time I looked at
that code.  The base is always resolved from the same pack/index
as the delta.  If you fix sha1-file.c sure, I don't see why you
can't allow indexing thin packs.

-- 
Shawn.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:18                                     ` Shawn Pearce
@ 2006-10-18 19:33                                       ` Nicolas Pitre
  2006-10-18 20:46                                         ` Shawn Pearce
  0 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-18 19:33 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio C Hamano, git

On Wed, 18 Oct 2006, Shawn Pearce wrote:

> Nicolas Pitre <nico@cam.org> wrote:
> > On Wed, 18 Oct 2006, Junio C Hamano wrote:
> > 
> > > Petr Baudis <pasky@suse.cz> writes:
> > > 
> > > > By the way, originally I just wanted to index and save the pack, but
> > > > when trying to feed it to git-index-pack, I kept getting
> > > >
> > > > 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> > > >
> > > > while feeding it to git-unpack-objects works fine. Any idea what's wrong?
> > > 
> > > Yes.  You told the pipeline, with --objects-edge, to create a
> > > thin pack.  By definition that is _not_ indexable.
> > 
> > Ah true.  I missed the "thin" pack.
> > 
> > Any idea why we should still prevent this?  It is not like it was a 
> > technical limitation.
> 
> It still is in sha1-file.c; or at least the last time I looked at
> that code.  The base is always resolved from the same pack/index
> as the delta.  

Yep.  I mean this doesn't have to be like that fundamentally.

> If you fix sha1-file.c sure, I don't see why you
> can't allow indexing thin packs.

If there are advantages to doing so then maybe. That would be for another 
day though, as I've been burned a bit with packs recently.


Nicolas


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:13                                   ` Nicolas Pitre
  2006-10-18 19:18                                     ` Shawn Pearce
@ 2006-10-18 19:33                                     ` Junio C Hamano
  2006-10-18 20:47                                       ` Shawn Pearce
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 19:33 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> Ah true.  I missed the "thin" pack.
>
> Any idea why we should still prevent this?  It is not like it was a 
> technical limitation.

It is a technical limitation.  We have never assumed that the
virtual address space is big enough to hold more than one whole
pack mmapped at the same time.

Lifting this needs the piecemeal mmap() change somebody was
talking about.

I might bite the bullet and do that myself but I've been hoping
to get an appliable patch from somewhere else ;-).
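
For readers wondering what a "piecemeal mmap()" might look like, here is a
minimal sketch in Python rather than in Git's C. It maps only one aligned
window of a file at a time instead of the whole file. The function name
read_via_window and the default window size are assumptions made for the
illustration; this is not the patch being discussed.

```python
import mmap
import os
import tempfile


def read_via_window(path: str, offset: int, length: int,
                    window: int = mmap.ALLOCATIONGRANULARITY) -> bytes:
    """Read `length` bytes at `offset`, mapping one aligned window at a time.

    `window` must be a multiple of mmap.ALLOCATIONGRANULARITY.
    """
    out = bytearray()
    size = os.path.getsize(path)
    end = min(offset + length, size)
    with open(path, "rb") as f:
        pos = offset
        while pos < end:
            start = (pos // window) * window      # mmap offsets must be aligned
            map_len = min(window, size - start)   # last window may be short
            with mmap.mmap(f.fileno(), map_len,
                           access=mmap.ACCESS_READ, offset=start) as m:
                take = min(end, start + map_len) - pos
                out += m[pos - start:pos - start + take]
                pos += take
    return bytes(out)


# Demo on a throwaway file that spans several windows.
gran = mmap.ALLOCATIONGRANULARITY
payload = bytes(i % 251 for i in range(3 * gran + 123))
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name
try:
    chunk = read_via_window(demo_path, gran - 10, gran + 50)
    print(len(chunk), chunk == payload[gran - 10:2 * gran + 40])
finally:
    os.remove(demo_path)
```

At no point is more than one window mapped, so the address space needed
stays bounded regardless of how large the file is. A real implementation
would presumably keep a few windows cached instead of remapping on every
read.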


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
@ 2006-10-18 19:57                                 ` Sean
  2006-10-18 20:46                                 ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-18 19:57 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Jeff King, Jakub Narebski, Aaron Bentley,
	Andreas Ericsson, bazaar-ng, git

On Wed, 18 Oct 2006 20:52:25 +0200
Petr Baudis <pasky@suse.cz> wrote:

> Took me exactly an hour from mkdir cogito-bundle to cg-push to
> kernel.org. :-)

Nicely done :-).

> cogito-bundle is an example on how to create third-party addons or
> plugins adding own commands to Cogito and using Cogito's infrastructure.
> It's not _that_ easy currently since you have to replicate large part of
> the build infrastructure locally; that could be fixed by installing some
> "library makefiles" and asciidoc toolkit to /usr/share or something, if
> there would be a real demand for such an addon API. cg-help and the cg
> wrapper will pick up the newly installed commands automagically. The
> only thing missing is updating cogito(7) to list the addon commands,
> which would take a bit more work.

Couldn't these just as easily have been written as git-bundle and
git-unbundle without needing any plugins or other cogito infrastructure?

> Though it's an example, it's actually supposed to be useful, by doing
> exactly what is outlined above - it lets you exchange commits over
> mail by so-called "bundles", similar to e.g. Bazaar bundles - basically,
> it is like push or fetch, but over email, and the commit ids are
> preserved when transferred in bundles (if you just send patches, the
> commit ids will end up different).

Not sure if it would be useful, but it shouldn't be too hard to have the
same commit ids regenerated at the receiving end with git patches.

> The provided cg-bundle and cg-unbundle commands are rather crude and
> don't support many things - they don't actually include a diff, only a
> diffstat, etc. The uuencoded bundle is inlined in the mail, which I
> suspect isn't very useful; perhaps it would be more practical to just
> attach it binarily. Feel free to send patches (or bundles ;).

Think you're right about making it an attachment instead.

Sean


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:59                               ` Petr Baudis
  2006-10-18 19:04                                 ` Junio C Hamano
  2006-10-18 19:09                                 ` Nicolas Pitre
@ 2006-10-18 20:08                                 ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 20:08 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git



On Wed, 18 Oct 2006, Petr Baudis wrote:
> 
> By the way, originally I just wanted to index and save the pack, but
> when trying to feed it to git-index-pack, I kept getting
> 
> 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> 
> while feeding it to git-unpack-objects works fine. Any idea what's wrong?

Since you created a "thin" pack (that's what the "--objects-edge" means), 
the pack actually contains deltas to objects that are _not_ in the pack. 

In other words, it's not a valid stand-alone pack, it's only a valid thin 
pack, useful to transfer data to the other end (and the other end had 
better have the objects that the deltas are against already).

As a result, git-index-pack refuses to index it: it cannot be used as a 
stand-alone pack, it's _only_ useful as a transfer medium.

So don't even _try_ to use it as a standalone pack-file. It won't work.

(If you want something that actually works as a stand-alone pack-file, 
change the "--objects-edge" flag to just "--objects" - that makes the 
pack-file self-sufficient, and doesn't try to delta against "edge" 
objects).
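
The difference between a thin pack and a self-sufficient one can be
modelled in a few lines. This is a toy sketch, not Git's pack format: a
"pack" here is just a mapping from object id to the id of its delta base
(None for a full object), and the ids and the missing_bases name are
invented for the example.

```python
from typing import Dict, Optional, Set


def missing_bases(pack: Dict[str, Optional[str]]) -> Set[str]:
    """Return delta bases that the pack refers to but does not contain."""
    return {base for base in pack.values()
            if base is not None and base not in pack}


# Thin pack: the new version is stored as a delta against an object that
# the receiver is expected to have already.
thin = {"blob-v2": "blob-v1"}

# Self-sufficient pack: the base travels along with the delta.
full = {"blob-v1": None, "blob-v2": "blob-v1"}

print("thin pack is missing:", missing_bases(thin))
print("full pack is missing:", missing_bases(full))
```

A non-empty result corresponds to the "unresolved deltas" situation: the
pack alone does not have enough information to reconstruct every object.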

> (BTW, I got the id by sha1summing the pack file; is there an existing
> way to name a pack properly if I have it lying around, unnamed? sha1sum
> seems to be specific to a fairly new GNU coreutils version.)

A properly named _standalone_ pack gets named not by its actual contents, 
but by the SHA1-sum of the sorted list of objects it contains. That's so 
that a pack-file will be named the same thing regardless of how the 
contents are actually packed.
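
A minimal sketch of that naming rule, assuming the name is the SHA-1 over
the sorted binary object names concatenated together. The helper name
pack_name_from_objects and the dummy object ids are made up for the
example, and other Git versions may derive the name differently.

```python
import hashlib
from typing import Iterable


def pack_name_from_objects(object_ids_hex: Iterable[str]) -> str:
    """Hash the sorted binary object names; packing details play no part."""
    digest = hashlib.sha1()
    for object_id in sorted(bytes.fromhex(h) for h in object_ids_hex):
        digest.update(object_id)
    return digest.hexdigest()


# Hypothetical object ids, derived from dummy contents for the demo only.
ids = [hashlib.sha1(text).hexdigest() for text in (b"one", b"two", b"three")]

name = pack_name_from_objects(ids)
print(f"pack-{name}.pack")
print(name == pack_name_from_objects(reversed(ids)))
```

Since only the set of object names feeds the hash, the order in which the
objects were listed (and how they were deltified or compressed) does not
change the resulting name.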

A thin pack cannot be named that way at all, for the same reason you 
cannot index it: it has a set of objects it enumerates (so you could name 
it by them), but it _also_ has a set of objects outside of it that it 
depends on. 

That said, even a thin pack internally has a SHA1 checksum of its 
contents: the last 20 bytes should be the SHA1-sum of all preceding bytes. 
So if you just want _some_ kind of name, you can use the last 20 bytes of 
a pack, which is just its internal integrity-checksum (but that is 
_different_ from the "pack-xxxxxx.idx"/"pack-xxxxxx.pack" naming).
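
Checking that internal checksum takes only a few lines. The sketch below
uses Python's hashlib; trailer_hex and trailer_is_valid are illustrative
names, and the test input is a hand-built pack header announcing zero
objects rather than a real pack from a repository.

```python
import hashlib
import struct


def trailer_hex(pack: bytes) -> str:
    """Return the last 20 bytes of the pack as hex."""
    return pack[-20:].hex()


def trailer_is_valid(pack: bytes) -> bool:
    """Check that the trailer is the SHA-1 of every byte before it."""
    return len(pack) > 20 and hashlib.sha1(pack[:-20]).digest() == pack[-20:]


# Smallest possible input: signature, version 2, object count 0.
header = b"PACK" + struct.pack(">II", 2, 0)
empty_pack = header + hashlib.sha1(header).digest()

print(trailer_hex(empty_pack))
print(trailer_is_valid(empty_pack))
```

The same check applies to a thin pack, because the trailer covers the
bytes of the file itself and says nothing about objects outside it.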

			Linus


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
  2006-10-18 19:57                                 ` Sean
@ 2006-10-18 20:46                                 ` Petr Baudis
       [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
  1 sibling, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 20:46 UTC (permalink / raw)
  To: Sean
  Cc: Linus Torvalds, Jeff King, Jakub Narebski, Aaron Bentley,
	Andreas Ericsson, bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 09:57:04PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> Couldn't these just as easily have been written as git-bundle and
> git-unbundle without needing any plugins or other cogito infrastructure?

They could be written, but certainly not "just as easily". I'm more used
to coding Cogito, I find it much more convenient than hacking git's
shell scripts (those two may be interconnected ;), and there's plenty of
infrastructure in Cogito missing in Git - Cogito has more flexible
argument parsing, documentation bundled with code, I could just
cut'n'paste the code to handle -m arguments and message editor (and most
of it is libified anyway) so I got that basically for free, and I think
Cogito beats Git hands down in code readability.

> Not sure if it would be useful, but it shouldn't be too hard to have
> same commit ids regenerated at receiving end with git patches.

It would of course be technically possible, yes. But it's somewhat more
work; this is just a quick hack.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:33                                       ` Nicolas Pitre
@ 2006-10-18 20:46                                         ` Shawn Pearce
  2006-10-18 21:17                                           ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 20:46 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

Nicolas Pitre <nico@cam.org> wrote:
> If there are advantages to do so then maybe. That would be for another 
> day though, as I've been burned a bit with packs recently.

I guess it's my turn then to work on the mmap window code, huh?  :-)

-- 
Shawn.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:33                                     ` Junio C Hamano
@ 2006-10-18 20:47                                       ` Shawn Pearce
  0 siblings, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 20:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git

Junio C Hamano <junkio@cox.net> wrote:
> Nicolas Pitre <nico@cam.org> writes:
> 
> > Ah true.  I missed the "thin" pack.
> >
> > Any idea why we should still prevent this?  It is not like it was a 
> > technical limitation.
> 
> It is a technical limitation.  We have never assumed that the
> virtual address space is big enough to hold more than one whole
> pack mmapped at the same time.

Even though it's not big enough for some larger packs on a 32-bit
system.
 
> Lifting this needs the piecemeal mmap() change somebody was
> talking about.
> 
> I might bite the bullet and do that myself but I've been hoping
> to get an appliable patch from somewhere else ;-).

I might be able to do it this weekend.  I'll try to spend some time
on it.  You'll either see a patch series, or you won't.  ;-)

-- 
Shawn.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
@ 2006-10-18 20:53                                     ` Sean
  2006-10-18 21:39                                     ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-18 20:53 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Jeff King, Jakub Narebski, Aaron Bentley,
	Andreas Ericsson, bazaar-ng, git

On Wed, 18 Oct 2006 22:46:18 +0200
Petr Baudis <pasky@suse.cz> wrote:

> They could be written, but certainly not "just as easily". I'm more used
> to coding Cogito, I find it much more convenient than hacking git's
> shell scripts (those two may be interconnected ;), and there's plenty of
> infrastructure in Cogito missing in Git - Cogito has more flexible
> arguments parsing, documentation bundled with code, I could just
> cut'n'paste the code to handle -m arguments and message editor (and most
> of it is libified anyway) so I got that basically for free, and I think
> Cogito beats Git hands down in code readability.

Hmmm, if I get some time over the weekend I'll take a look at porting
them to Git.  But maybe some of the items you mentioned above deserve
to become part of Git proper?  It would definitely be nice to see
something like what you just did put into the hands of more users than
just those using Cogito, and it's unfortunate that the current state
of Git code kept you from going that route.

> It would be of course technically possible, yes. But somewhat more work,
> this is just a quick hack.

No doubt, there would be some slightly thorny issues to deal with.  It
might even end up too fragile to be worthwhile.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:56                         ` Sean
  2006-10-17 23:11                           ` Jakub Narebski
@ 2006-10-18 21:04                           ` Charles Duffy
       [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
  1 sibling, 1 reply; 1752+ messages in thread
From: Charles Duffy @ 2006-10-18 21:04 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Sean wrote:
> Hmm.. It's pretty easy to test out Git ideas too.  People do it all
> the time, and without plugins.  Junio maintains several such trees
> for instance.  Dunno.. I just think plugins _sound_ good to developers
> without much real benefit to users over regular ole source code.

Example time!

There's a plugin for Bzr which adds support for Cygwin-compatible 
symlink support on Windows. (IIRC, this involves monkey-patching some of 
the Python standard library bits).

Now, this is something which is *proposed* as a feature to be merged 
into upstream bzr, and it may happen at some point. That said, when I 
have a Windows-using coworker who wants to check out a repository that 
has symlinks in it (with his win32-native, no-cygwin-required bzr 
upstream binary), I don't need to tell him to go download and build bzr 
from a third party; instead, I just need to tell him to run a single 
command to check out the plugin in question into the bzr plugins folder.

From an end-user convenience perspective, it's a pretty significant win.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 20:46                                         ` Shawn Pearce
@ 2006-10-18 21:17                                           ` Linus Torvalds
  2006-10-18 21:32                                             ` Shawn Pearce
                                                               ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 21:17 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git



On Wed, 18 Oct 2006, Shawn Pearce wrote:
>
> I guess its my turn then to work in the mmap window code, huh?  :-)

There are bigger reasons to _never_ allow packs to contain deltas to 
outside of themselves:

 - there's no point. 

   If you have many small packs, you're doing something wrong. The whole 
   _point_ of packs is to put things into the same file, so that you can 
   avoid the filesystem overhead. And once packs are big and few, the 
   advantage of having deltas to outside the pack is basically zero.

 - it's a bad design. 

   Self-sufficient packs means that a pack is a "safe" thing. When the 
   index says that it contains an object, then it damn well contains it.

   In contrast, if you had packs that only contained a delta, and the pack 
   needed some _other_ pack (or loose object) to actually generate that 
   object, then it's not safe any more. You could end up with a situation 
   where you get two packs from two different sources, and they contain 
   deltas to _each_other_, and you have no way of actually generating the 
   object itself any more.

   (Or you end up having to have rules to figure out when you have a loop,
   and stop looking just in the packed files, and start looking for loose 
   objects instead)

   In other words, it has potentially _serious_ downsides.
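
The loop described above can be made concrete with a toy model. This is
only a sketch in Python: the pack layout (plain dicts), the append-only
"delta", and the names resolve and UnresolvableObject are invented for
illustration and are not git's actual pack code.

```python
class UnresolvableObject(Exception):
    """Raised when no chain of deltas leads to a full object."""


def resolve(obj_id, packs, _seen=None):
    """Return the full content of obj_id, following delta bases across packs.

    Each pack is a dict mapping an object id to either
    ("full", data) or ("delta", (base_id, suffix)).
    The toy delta simply appends suffix to the base content.
    """
    seen = _seen if _seen is not None else set()
    if obj_id in seen:
        # We came back to an object we are already trying to build.
        raise UnresolvableObject(f"delta loop through {obj_id}")
    seen.add(obj_id)
    for pack in packs:
        entry = pack.get(obj_id)
        if entry is None:
            continue
        kind, payload = entry
        if kind == "full":
            return payload
        base_id, suffix = payload
        return resolve(base_id, packs, seen) + suffix
    raise UnresolvableObject(f"{obj_id} not found in any pack")


# A self-contained pack: every delta base lives in the same pack.
self_contained = {"a": ("full", b"hello"), "b": ("delta", ("a", b" world"))}
print(resolve("b", [self_contained]))

# Two packs from different sources whose deltas point at each other.
pack_one = {"a": ("delta", ("b", b"x"))}
pack_two = {"b": ("delta", ("a", b"y"))}
try:
    resolve("a", [pack_one, pack_two])
except UnresolvableObject as err:
    print("cannot rebuild:", err)
```

With self-contained packs the loop check is never needed, which is the
"safe" property argued for above.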

So DAMMIT! Stop looking to make the data structures worse. The fact is, 
the git data structures are FINE. They are well-designed. They work well. 
There's no _point_ in changing them, especially since changing them seems 
to be all about making things less reliable for dubious gain.

One of the advantages of git is that you can explain things with object 
relationships, and that the file format is stable as _hell_. That's a GOOD 
thing. Please realize that if you want to change the file formats, you'd 
need a hell of a better reason for it than "just because I can".

Please. Really.

So next time somebody suggests a new pack-format, ask yourself:

 - does it save disk-space by 50% or more?

 - does it drop memory usage by 50% or more?

 - does it improve performance by 50% or more?

 - does it make something possible that really fundamentally isn't 
   possible right now?

And if the answer to those questions is "no", then JUST DON'T DO IT.

It really needs to be _damn_ spectacular to be worthy of a new format. 
Really. We've had a few of those, so it clearly does happen:

 - The "compress _after_ SHA1". The original object format was just 
   broken, and the SHA1 name depended on how things compressed. I fixed 
   it. It needed fixing. We couldn't have done a lot of the things we did 
   without switching compression and SHA1-hashing around.

 - the pack-file in the first place: this saved orders of magnitude both 
   in diskspace _and_ performance. Not "10%". More like "factors of 100".

   THAT was worthy of a major format change.

 - the "make loose object contents look the same as packed objects". This 
   was not just a cleanup, it allows us to create pack-files much faster. 

   That said, we're still defaulting to the legacy format, and maybe it 
   wasn't really worth it. 
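
The first item in the list, hashing before compressing, can be
illustrated with a short Python sketch. The header layout ("type size",
a NUL byte, then the content) is the convention git uses for object
names; the helper names object_name and store are made up here, and
zlib levels 1 and 9 are arbitrary choices.

```python
import hashlib
import zlib


def object_name(kind, content):
    """Name an object by hashing its uncompressed header and content."""
    header = f"{kind} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


def store(kind, content, level):
    """Return (name, compressed bytes); the name ignores the zlib level."""
    header = f"{kind} {len(content)}\0".encode("ascii")
    return object_name(kind, content), zlib.compress(header + content, level)


content = b"stability matters\n" * 100
fast_name, fast_bytes = store("blob", content, 1)
best_name, best_bytes = store("blob", content, 9)

# The stored bytes may differ between levels, but the name must not.
print(fast_name == best_name)
```

Had the name been computed over the compressed bytes instead, any change
in compression level or zlib version could have renamed the object.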

My personal suspicion is that we'll want to have a 64-bit index file some 
day, and THAT is worthy of a format change. That day is not now, btw. It's 
probably not even very close. Even the mozilla repo that was pushing the 
limit was only doing so until it was optimized better, and now it's 
apparently nowhere _near_ that limit.

But even then, we might well want to update _just_ the index file format.
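
As a rough illustration of why a 64-bit index would eventually matter,
the sketch below assumes the index records each object's position in the
pack as a 4-byte big-endian unsigned integer, which is what the 4 GiB
limit mentioned elsewhere in this thread implies. The function names are
hypothetical.

```python
import struct

GIB = 1024 ** 3


def encode_offset_32(offset):
    """Encode a pack offset as a 4-byte big-endian unsigned integer."""
    return struct.pack(">I", offset)


def fits_in_32_bits(offset):
    """Report whether an offset can be stored in a 32-bit index entry."""
    try:
        encode_offset_32(offset)
        return True
    except struct.error:
        return False


print(fits_in_32_bits(4 * GIB - 1))  # last byte below the 4 GiB boundary
print(fits_in_32_bits(4 * GIB))      # first byte past it
```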

Because in an SCM, stability and trustworthiness is more important than 
just about _anything_ else. 

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 14:52                           ` Linus Torvalds
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
@ 2006-10-18 21:20                             ` Jeff King
  1 sibling, 0 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-18 21:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Wed, Oct 18, 2006 at 07:52:25AM -0700, Linus Torvalds wrote:

> 	git send origin..
> 
> and that "origin" is what the other end is expected to already have.
> 
> Of course, if you send an unconnected bundle (ie you give an origin that 
> the other end _doesn't_ have), you're screwed.

OK, that was how I was envisioning it, as well, but I was concerned
about the "screwed" part. But I'm not sure how often that would be an
issue in practice (after all, patches require some matchup of the base,
though not as strict as SHA1s).

Thanks for the explanation.

-Peff

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
  2006-10-18 21:29                               ` Sean
@ 2006-10-18 21:29                               ` Sean
  2006-10-18 23:31                                 ` Charles Duffy
  2006-10-18 21:37                               ` Shawn Pearce
  2 siblings, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-18 21:29 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git, bazaar-ng

On Wed, 18 Oct 2006 16:04:52 -0500
Charles Duffy <cduffy@spamcop.net> wrote:

> Example time!
> 
> There's a plugin for Bzr which adds support for Cygwin-compatible 
> symlink support on Windows. (IIRC, this involves monkey-patching some of 
> the Python standard library bits).
> 
> Now, this is something which is *proposed* as a feature to be merged 
> into upstream bzr, and it may happen at some point. That said, when I 
> have a Windows-using coworker who wants to check out a repository that 
> has symlinks in it (with his win32-native, no-cygwin-required bzr 
> upstream binary), I don't need to tell him to go download and build bzr 
> from a third party; instead, I just need to tell him to run a single 
> command to check out the plugin in question into the bzr plugins folder.
> 
>  From an end-user convenience perspective, it's a pretty significant win.

You'll need a better example than that.  Git has supported a version
of Cygwin-compatible symlink support on Windows for quite some time.
And no plugins were needed.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
@ 2006-10-18 21:32                                             ` Shawn Pearce
  2006-10-18 21:42                                               ` Junio C Hamano
  2006-10-18 21:55                                               ` Linus Torvalds
  2006-10-18 21:41                                             ` Nicolas Pitre
                                                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> On Wed, 18 Oct 2006, Shawn Pearce wrote:
> >
> > I guess its my turn then to work in the mmap window code, huh?  :-)
> 
> There are bigger reasons to _never_ allow packs to contain deltas to 
> outside of themselves:
> 
>  - there's no point. 
>  - it's a bad design. 

That and all of the other reasons you cited in your message are
why I haven't finished trying to use some sort of dictionary based
compression for packing objects.

On the other hand we've already seen how packs >1.5 GiB in size
(certainly well within the 4 GiB limitation in the current index
file format) cannot be repacked by git-repack-objects on a 32
bit address space as the entire pack file is mmap'd in one shot.
After the kernel space of ~1 GiB and the pack file at ~1.5 GiB
there's very little address space left for the application code.

My comment that you quoted was about mmap'ing the pack files in
large chunks (around 64-128 MiB at a time, but configurable from
.git/config) rather than as an entire massive mapping.  It had
absolutely nothing to do with changing the pack file format, the
index format, or any other on disk format.  Although it would add
a new pair of configuration options to .git/config.  Is that change
too radical?  :-)

With such a change the Git and Linux kernel repositories would both
still mmap in one chunk but much larger projects like Mozilla or
very large pack files coming out of git-fastimport would actually
be usable on 32 bit architectures without running into address space
limitations so quickly.  Git would also be slightly more usable for
some people who have a lot of very incompressible data stored in Git.
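
To show the idea of mapping a pack in chunks rather than in one shot,
here is a minimal Python sketch. It is not the proposed git patch: the
function name sha1_through_windows is made up, the 64 MiB default is
taken from the range mentioned above, and hashing the file is just a
stand-in for "some work that needs to touch every byte".

```python
import hashlib
import mmap
import os


def sha1_through_windows(path, window_size=64 * 1024 * 1024):
    """Hash a file while keeping at most one window of it mapped."""
    # mmap offsets must be a multiple of the allocation granularity,
    # so round the window size down to such a multiple.
    gran = mmap.ALLOCATIONGRANULARITY
    window_size = max(gran, window_size - window_size % gran)

    digest = hashlib.sha1()
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window_size, size - offset)
            # Only this window occupies address space; it is unmapped
            # again before the next one is created.
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as window:
                digest.update(window)
            offset += length
    return digest.hexdigest()
```

A file smaller than the window is still mapped in a single chunk, which
matches the expectation above that small repositories would behave as
before.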


Unless of course you are actively working on a fix for the Linux
kernel so that we can actually have all 4 GiB of virtual address
space available for the userspace git-repack-objects process.
Or have some sort of secret plan to upgrade everyone who uses Git
to 64 bit processors which support 64 bit address spaces...

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
  2006-10-18 21:29                               ` Sean
  2006-10-18 21:29                               ` Sean
@ 2006-10-18 21:37                               ` Shawn Pearce
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
  2006-10-18 23:38                                 ` Johannes Schindelin
  2 siblings, 2 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:37 UTC (permalink / raw)
  To: Sean; +Cc: Charles Duffy, git, bazaar-ng

Sean <seanlkml@sympatico.ca> wrote:
> On Wed, 18 Oct 2006 16:04:52 -0500
> Charles Duffy <cduffy@spamcop.net> wrote:
> 
> > Example time!
> > 
> > There's a plugin for Bzr which adds support for Cygwin-compatible 
> > symlink support on Windows. (IIRC, this involves monkey-patching some of 
> > the Python standard library bits).
> > 
> > Now, this is something which is *proposed* as a feature to be merged 
> > into upstream bzr, and it may happen at some point. That said, when I 
> > have a Windows-using coworker who wants to check out a repository that 
> > has symlinks in it (with his win32-native, no-cygwin-required bzr 
> > upstream binary), I don't need to tell him to go download and build bzr 
> > from a third party; instead, I just need to tell him to run a single 
> > command to check out the plugin in question into the bzr plugins folder.
> > 
> >  From an end-user convenience perspective, it's a pretty significant win.
> 
> You'll need a better example than that.  Git has supported a version
> of Cygwin-compatible symlink support on Windows for quite some time.
> And no plugins were needed.

Actually I think the only part of that example that was really
interesting was that Bzr runs natively on Windows and that Bzr's
native method of extending the tool with additional features doesn't
require Cygwin.


Today Git doesn't run natively on Windows.  It runs slowly through
Cygwin, thanks to lots of various overheads in different places.
And due to the crappy disk drive in my Windows box.  :-)

Today Git is typically extended (at least initially in prototyping
mode) through Perl, Python, TCL or Bourne shell scripts.  Although
the first three are available natively on Windows the last requires
Cygwin... and we've had some issues with ActiveState Perl on Windows
in the past too.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
  2006-10-18 20:53                                     ` Sean
@ 2006-10-18 21:39                                     ` Petr Baudis
       [not found]                                       ` <20061018175443.50b728f6.seanlkml@sympatico.ca>
  1 sibling, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 21:39 UTC (permalink / raw)
  To: Sean; +Cc: git

(Trimmed Cc: list; this is off-topic for bazaar-ng.)

Dear diary, on Wed, Oct 18, 2006 at 10:53:41PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Wed, 18 Oct 2006 22:46:18 +0200
> Petr Baudis <pasky@suse.cz> wrote:
> 
> > They could be written, but certainly not "just as easily". I'm more used
> > to coding Cogito, I find it much more convenient than hacking git's
> > shell scripts (those two may be interconnected ;), and there's plenty of
> > infrastructure in Cogito missing in Git - Cogito has more flexible
> > arguments parsing, documentation bundled with code, I could just
> > cut'n'paste the code to handle -m arguments and message editor (and most
> > of it is libified anyway) so I got that basically for free, and I think
> > Cogito beats Git hands down in code readability.
> 
> Hmmm, if I get some time over the weekend i'll take a look at porting
> them to Git.  But maybe some of the items you mentioned above deserve
> to become part of Git proper?  It would definitely be nice to see
> something like what you just did put into the hands of more users than
> just those using Cogito, and its unfortunate that the current state
> of Git code kept you from going that route.

You can use just this single tool from Cogito. ;-)

The point is, I'll of course prefer doing this stuff in Cogito while I'm
enhancing Cogito, and I'll work on Cogito while I and others will be
using it. I didn't move on to pure Git a long time ago since I simply
consider its UI much inferior to Cogito's. Sure, given enough time and
work, it is fixable - but UI flaws are very hard to fix and I find it
more effective to work on Cogito for the time being, at least until I
bring it to 1.0, then I'll see.

Besides, I'm used to Cogito. :-)

So yes, current Git code definitely is a part of the reason, but it is
certainly not the main part of it.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
  2006-10-18 21:32                                             ` Shawn Pearce
@ 2006-10-18 21:41                                             ` Nicolas Pitre
  2006-10-18 21:41                                             ` Shawn Pearce
  2006-10-18 21:56                                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Junio C Hamano
  3 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-18 21:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Shawn Pearce, Junio C Hamano, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> There are bigger reasons to _never_ allow packs to contain deltas to 
> outside of themselves:
> 
>  - there's no point. 

Remember what I said earlier: "If there are advantages to do so then 
maybe."  So far there are none.

>    You could end up with a situation where you get two packs from two 
>    different sources, and they contain deltas to _each_other_, and you 
>    have no way of actually generating the object itself any more.

To me this is the real killer.

Shawn was talking about a different issue though.


Nicolas

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
  2006-10-18 21:32                                             ` Shawn Pearce
  2006-10-18 21:41                                             ` Nicolas Pitre
@ 2006-10-18 21:41                                             ` Shawn Pearce
  2006-10-18 22:00                                               ` Linus Torvalds
  2006-10-18 22:13                                               ` Junio C Hamano
  2006-10-18 21:56                                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Junio C Hamano
  3 siblings, 2 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> There are bigger reasons to _never_ allow packs to contain deltas to 
> outside of themselves:
> 
>  - there's no point. 

Actually there is a point to storing thin packs.  When I pull from
a remote repo (or push to a remote repo) a huge number of objects
and the target disk that is about to receive that huge number of
loose objects is slooooooooow, I would rather just store the thin
pack than store the loose objects.

Ideally that thin pack would be repacked (along with the other
existing packs) as quickly as possible into a self-contained pack.
But that of course is unlikely to happen in practice; especially
on a push.
 
>  - it's a bad design. 
> 
>    In other words, it has potentially _serious_ downsides.

Yes, it does.

But it could also be useful when you fetch 20k+ objects onto a
Windows system or push 1k+ objects onto the slowest NFS system I
have ever seen...  where writing file data (aka packs) is reasonable
but creating or deleting files takes nearly 1 second per file.
I don't want to kill the better part of an hour waiting for a push
to complete!

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:32                                             ` Shawn Pearce
@ 2006-10-18 21:42                                               ` Junio C Hamano
  2006-10-18 21:52                                                 ` Shawn Pearce
  2006-10-18 21:55                                               ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 21:42 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

> ...  Although it would add
> a new pair of configuration options to .git/config.  Is that change
> too radical?  :-)

I wonder what you would need the configuration options for.

If mmap() pack works well, it works well, and if it is broken
nobody has reason to enable it.  The code should be able to
adjust the mmap window to appropriate size itself and its
automatic adjustment does not even have to be the absolute
optimum (since the user would not know what the optimum would be
anyway), so maybe your configuration options would not be
"enable" nor "window-size" -- and I am puzzled as to what they
are.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
@ 2006-10-18 21:44                                   ` Sean
  2006-10-18 21:52                                   ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-18 21:44 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

On Wed, 18 Oct 2006 17:37:03 -0400
Shawn Pearce <spearce@spearce.org> wrote:

> Today Git is typically extended (at least initially in prototyping
> mode) through Perl, Python, TCL or Bourne shell scripts.  Although
> the first three are available natively on Windows the last requires
> Cygwin... and we've had some issues with ActiveState Perl on Windows
> in the past too.

Just for kicks and giggles it would be nice if someone tried out
one of the native Windows Bourne shell ports[1] just to see how much
is missing.  A bunch of command line utilities would have to be ported
as well; maybe too many.  But I've held out on booting a Windows box
for a long time so.... not it!

Sean

[1] For example, http://www.steve.org.uk/Software/bash/

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18  5:26                     ` Robert Collins
@ 2006-10-18 21:46                       ` Jan Hudec
  2006-10-18 22:14                         ` Jakub Narebski
                                           ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-18 21:46 UTC (permalink / raw)
  To: Robert Collins; +Cc: Petr Baudis, bazaar-ng, git

On Wed, Oct 18, 2006 at 03:26:40PM +1000, Robert Collins wrote:
> revnos visibly change as your work is merged into the mainline - we've
> been doing this for years without trouble: one's own commits to a branch
> get '3', '4', '5' etc as revnos, and when they are merged to the
> mainline they used to stop having revnos at all, but now they will be
> given this dotted decimal revno. If you pull from the mainline after the
> merge, you see the new numbers, and when you look at mainline you can
> see the difference. So while I agree that the surprise the user gets is
> inversely related to the frequency with which they see the behaviour, I
> think our users see it a lot, so are not surprised much.
> 
> FWIW, we're not optimising for mostly straight histories as I understand
> such things : our own history has 3 commits on branches to every one on
> the mainline.

Reading this thread, I came to think that the revnos should be assigned
to _all_ revisions _available_, in order of when they entered the
repository (there are some possible variations I will mention below):

 - Such revnos would be purely local, but:
   - Current revnos are not guaranteed to be the same in different
     branches either.
   - They could be done so that mirror has the same revnos as the
     master.
 - They would be easier to use than the dotted ones. What (at least as
   far as I understand) makes revnos easier to use than revids is, that
   you can remember few of them for short time while composing some
   operation. I.e. look up 2 or 3 revisions in the log and then do some
   command on them. And a 4 to 5-digit number like 10532 is easier to
   remember than something like 3250.2.45.86.
 - Their ordering would be an (arbitrary) superset of the partial
   ordering by descendance, ie. if revision A is ancestor of B, it would
   always have lower revno.
   - The intuition that lower revno means older revision would be always
     valid for related revisions and approximately valid for unrelated
     ones.
 - They would be *locally stable*. That is once assigned the revno would
   always mean the same revision in given branch (as determined by
   location, not tip).
     - This is more than the current scheme can give, since now pull can
       renumber revisions.
 - They wouldn't make any branch special, so the objections Linus raised
   do not apply.
 - They would be the same as what Subversion and svk (and IIRC Mercurial
   as well) use, so:
   - They would already be familiar to users coming from those systems.
   - They are known to be useful that way. In fact for svk it's the only
     way to refer to revisions and seems to work satisfactorily (though
     note that svk is not really suitable to ad-hoc topologies).

Now I said there are two options how to assign them. These are:

 - Repository-wide: A number would be assigned to each revision entering
   the repository, even when it is not in the ancestry of any branch (i.e.
   if one starts a merge, but then reverts it).
   - Advantages:
     - Simpler to implement (just log every written-out revision).
     - All branches in the same repository use the same revision
       numbers, so if you keep branches in a shared repo, it makes
       easier to look up one revision in log of one branch, other in log
       of other branch and run diff on them.
   - Disadvantages:
     - Mirror only has the same revnos if both master and the mirror are
       stand-alone branches.
 - Branch-wide: A number would be assigned to each revision that becomes
   an ancestor of the current head revision.
   - Advantages:
     - A mirror (always updated by push from the same source) always has
       the same revision numbers.
     - The revno assignment list could be reused for referring to state
       at a particular point in time (in fact, it would be exactly the
       same thing as git reflog).
     - Bound branches could be forced to have the same revnos.
   - Disadvantages:
     - More complex to implement.
     - More work at runtime and more space needed in a shared
       repository, since each branch has its own mapping.

Both ways, it would be implemented the way revision-history currently
is, just it would list all revisions, not just the path along the
leftmost parent.
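
The numbering scheme above can be sketched in a few lines of Python.
This is only an illustration of the proposal, not bzr code: the class
RevnoLog, its record method, and the dict-based ancestry are all made up
for the example.

```python
class RevnoLog:
    """Append-only mapping between local revnos and revision ids."""

    def __init__(self):
        self._ids = []     # index (revno - 1) -> revision id
        self._revno = {}   # revision id -> revno

    def record(self, rev_id, parents_of):
        """Number rev_id and any not-yet-numbered ancestors.

        parents_of maps a revision id to the list of its parent ids.
        Ancestors are numbered before descendants, and numbers that
        were already handed out are never changed.
        """
        stack = [(rev_id, False)]
        while stack:
            rid, parents_done = stack.pop()
            if rid in self._revno:
                continue
            if parents_done:
                self._ids.append(rid)
                self._revno[rid] = len(self._ids)
            else:
                stack.append((rid, True))
                # Reverse so the leftmost parent is numbered first.
                for parent in reversed(parents_of.get(rid, [])):
                    stack.append((parent, False))
        return self._revno[rev_id]

    def revno(self, rev_id):
        return self._revno[rev_id]


# a is the root; b and c both descend from a; m merges b and c.
parents_of = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}
log = RevnoLog()
log.record("b", parents_of)   # local work: numbers a, then b
log.record("m", parents_of)   # merge arrives: numbers c, then m
print([(rid, log.revno(rid)) for rid in "abcm"])
```

The revno of b does not change when the merge arrives, which is the
"locally stable" property, and every ancestor ends up with a lower
number than its descendants.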

Comments?

(Should I put it on the wiki?)

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
  2006-10-17 22:56                         ` Sean
  2006-10-17 22:56                         ` Sean
@ 2006-10-18 21:51                         ` Petr Baudis
  2 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 21:51 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 12:56:22AM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Tue, 17 Oct 2006 18:44:11 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > Plugins also don't have a Bazaar's rigid release cycle, testing
> > requirements and coding conventions, so they are a convenient way to try
> > out an idea, before committing to the effort of getting it merged into
> > the core.
> 
> Hmm.. It's pretty easy to test out Git ideas too.  People do it all
> the time, and without plugins.  Junio maintains several such trees
> for instance.  Dunno.. I just think plugs _sounds_ good to developers
> without much real benefit to users over regular ole source code.

I think this is just another cultural difference. Git comes from the
kernel environment (although it is currently used in far more
environments than just the kernel and kernel-related stuff) and the
_kernel_'s development style is that you want to get as much stuff as
possible inside the kernel, and on the other hand don't care at all
about breaking in-kernel APIs and such.

The Git "plumbing" is very much the "kernel". We aren't as much
interested in having support for external bits of code poking in the Git
innards, we would much rather have them integrated into Git as soon as
possible rather than live around externally. OTOH, the "kernel" gives a
very flexible ("UNIXy") API to the writhing mass of porcelain scripts you
may call the "userland".

I'm not saying it must always be a sharply better approach than the
plugin-encouraging approach. It's just how it is. (Also, another reason
is probably a purely technical one: it is much easier to have pluggable
functions in scripting languages that support "monkey-patching" than
to have them in C, since there you need to explicitly add all the hooks
etc. So in Python, to a large extent, you get the plugin support for
free.)
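
To illustrate the C side of that point, here is a minimal sketch of an
explicit hook. All names (format_hook, format_name, install_plugin) are
hypothetical and not taken from Git; the only claim is that in C the
call site must check a function pointer that somebody added on purpose,
whereas a monkey-patching language lets outside code replace the
function itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook point: the core calls through this pointer if set. */
typedef const char *(*format_hook_fn)(const char *name);
static format_hook_fn format_hook;      /* NULL means "no plugin installed" */

/* Built-in behaviour used when no plugin has registered itself. */
static const char *default_format(const char *name)
{
    return name;
}

const char *format_name(const char *name)
{
    assert(name != NULL);
    /* This branch exists only because the author wrote it by hand. */
    return format_hook ? format_hook(name) : default_format(name);
}

/* What a plugin would supply (illustrative only). */
static const char *plugin_format(const char *name)
{
    (void)name;
    return "PLUGIN";
}

void install_plugin(void)
{
    format_hook = plugin_format;
}
```

Every function that should be overridable needs its own pointer and its
own check, which is the "explicitly add all the hooks" cost mentioned
above.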

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:42                                               ` Junio C Hamano
@ 2006-10-18 21:52                                                 ` Shawn Pearce
  2006-10-18 22:02                                                   ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
> 
> > ...  Although it would add
> > a new pair of configuration options to .git/config.  Is that change
> > too radical?  :-)
> 
> I wonder what you would need the configuration options for.
> 
> If mmap() pack works well, it works well, and if it is broken
> nobody has reason to enable it.  The code should be able to
> adjust the mmap window to appropriate size itself and its
> automatic adjustment does not even have to be the absolute
> optimum (since the user would not know what the optimum would be
> anyway), so maybe your configuration options would not be
> "enable" nor "window-size" -- and I am puzzled as to what they

All very true.

However what do we do about the case where we mmap over 1 GiB worth
of pack data (because the mmap succeeds and we have at least that
much in .pack and .idx files) and then the application starts to
demand a lot of memory via malloc?  At some point malloc will return
NULL, xmalloc will die(), and that's the end of the program.

If the user was able to set the maximum threshold of how much data
we mmap then they could initially prevent us from mmap'ing over 1 GiB;
instead using a smaller upper limit like 512 MiB.

Of course as I write this I think the better solution to this
problem is to simply modify xmalloc (and friends) so that if the
underlying malloc returned NULL and we have a large amount of stuff
mmap'd from packs we try releasing some of the unused pack windows
and retry the malloc before die()'ing.
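
A minimal sketch of that retry idea, under loud assumptions: the
address-space budget, the window table and the names sim_malloc,
release_unused_window and xmalloc_retry are all invented for
illustration and are not Git's actual code. It only shows the control
flow: on allocation failure, drop an unused pack window and try again,
and give up only when nothing is left to release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_WINDOWS 8

/* Simulated pack window: how much space it pins and whether it is busy. */
struct window {
    size_t len;
    int in_use;
    int mapped;
};

static struct window windows[MAX_WINDOWS];
static size_t space_left;   /* simulated free address space, in bytes */

/* Stand-in for malloc() that fails once the simulated budget is used up. */
static void *sim_malloc(size_t n)
{
    if (n > space_left)
        return NULL;
    space_left -= n;
    return malloc(n);
}

/* Unmap one mapped-but-idle window; return 1 if something was released. */
static int release_unused_window(void)
{
    for (int i = 0; i < MAX_WINDOWS; i++) {
        if (windows[i].mapped && !windows[i].in_use) {
            windows[i].mapped = 0;
            space_left += windows[i].len;
            return 1;
        }
    }
    return 0;
}

/* Retry the allocation for as long as releasing a window makes progress. */
void *xmalloc_retry(size_t size)
{
    void *p;

    assert(size > 0);
    p = sim_malloc(size);
    while (!p && release_unused_window())
        p = sim_malloc(size);
    return p;   /* NULL here is where a real xmalloc would die() */
}
```

Windows that are still in use are never touched, so the retry cannot
pull data out from under a caller that is reading a pack.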


The other configuration option is the size of the mmap window.
This should by default be at least 32 MiB, probably closer to
128 MiB.  But it's nice to be able to force it as low as a single
system page to set up test cases in the t/ directory for the mmap
window code.

Earlier this summer we discussed this exact issue and said this
value probably needs to be configurable if only to facilitate the
unit tests.
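
For readers who want to see what a windowed mapping looks like, here is
a minimal POSIX sketch. It assumes nothing about the actual patch: the
function names and the window_size parameter are hypothetical, and the
window is mapped and unmapped for a single byte purely to keep the
example short. It does show why a one-page window is a useful test
setting, since every access then crosses the alignment logic.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Round a file offset down to the start of the window that contains it. */
static off_t window_start(off_t offset, size_t window_size)
{
    return offset - (offset % (off_t)window_size);
}

/*
 * Read one byte of the file through a window of window_size bytes.
 * window_size must be a multiple of the system page size so that the
 * mmap() offset is page aligned.  Returns the byte, or -1 on failure.
 */
int read_byte_windowed(int fd, off_t file_size, off_t offset,
                       size_t window_size)
{
    off_t start;
    size_t len;
    unsigned char *win;
    int byte;

    assert(offset >= 0 && offset < file_size);
    assert(window_size % (size_t)sysconf(_SC_PAGESIZE) == 0);

    start = window_start(offset, window_size);
    len = window_size;
    if ((off_t)len > file_size - start)
        len = (size_t)(file_size - start);   /* last window may be short */

    win = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, start);
    if (win == MAP_FAILED) {
        perror("mmap");
        return -1;
    }
    byte = win[offset - start];
    munmap(win, len);
    return byte;
}
```

A real implementation would keep windows around and reuse them; the
upper limit on total mapped bytes discussed above would decide when an
old window has to go.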

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
  2006-10-18 21:44                                   ` Sean
@ 2006-10-18 21:52                                   ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 21:52 UTC (permalink / raw)
  To: Sean; +Cc: Shawn Pearce, git

Dear diary, on Wed, Oct 18, 2006 at 11:44:50PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Wed, 18 Oct 2006 17:37:03 -0400
> Shawn Pearce <spearce@spearce.org> wrote:
> 
> > Today Git is typically extended (at least initially in prototyping
> > mode) through Perl, Python, TCL or Bourne shell scripts.  Although
> > the first three are available natively on Windows the last requires
> > Cygwin... and we've had some issues with ActiveState Perl on Windows
> > in the past too.
> 
> Just for kicks and giggles it would be nice if someone tried out
> one of the native Windows bourne shell ports[1] just to see how much
> is missing.  A bunch of command line utilities would have to be ported
> as well; maybe too many.  But I've held out booting a Windows box
> for a long time so.... not it!

I think that before starting to think about the porcelain scripts, you
need to port the plumbing. :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                       ` <20061018175443.50b728f6.seanlkml@sympatico.ca>
@ 2006-10-18 21:54                                         ` Sean
  0 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-18 21:54 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Wed, 18 Oct 2006 23:39:35 +0200
Petr Baudis <pasky@suse.cz> wrote:

> You can use just this single tool from Cogito. ;-)

I'd rather not have to keep two separate tools up to date; I just want
to install Git and have all these features installed.  Especially since
there is so much overlap in what these two packages do.  That would seem
like the best thing to do for most users, in fact; asking them to install
and keep both up to date just doesn't make sense, to me at least.

> The point is, I'll of course prefer doing this stuff in Cogito while I'm
> enhancing Cogito, and I'll work on Cogito while I and others will be
> using it. I didn't move on to pure Git a long time ago since I simply
> consider its UI much inferior to Cogito's. Sure, given enough time and
> work, it is fixable - but UI flaws are very hard to fix and I find it
> more effective to work on Cogito for the time being, at least until I
> bring it to 1.0, then I'll see.
> 
> Besides, I'm used to Cogito. :-)
> 
> So yes, current Git code definitely is a part of the reason, but it is
> certainly not the main part of it.

It's just a shame that your talents are split off from helping the main
project more.  Git would be further along today in content and PR if it
had managed to attract you back from your Cogito adventure.  All
the nice things you're able to say about Cogito might then be said
about Git proper, and maybe we'd attract even more users.

While you've contributed more to Git than many others (including me
obviously), it would sure be nice to see you back full time on Git.
I want to type "git bundle" without having to install more
software damnit ;o)  But of course you have to decide what's best
for yourself.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:32                                             ` Shawn Pearce
  2006-10-18 21:42                                               ` Junio C Hamano
@ 2006-10-18 21:55                                               ` Linus Torvalds
  2006-10-18 22:05                                                 ` Shawn Pearce
  2006-10-18 22:07                                                 ` Junio C Hamano
  1 sibling, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 21:55 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git



On Wed, 18 Oct 2006, Shawn Pearce wrote:
> 
> My comment that you quoted was about mmap'ing the pack files in
> large chunks (around 64-128 MiB at a time, but configurable from
> .git/config) rather than as an entire massive mapping.

Sure. I agree that we should do that, if only because it's clearly getting 
hard to handle large pack-files on a 32-bit architecture.

You just seemed to say that in the _context_ of wanting to support having 
multiple pack-files open (in order to allow deltas to refer to things 
outside their own pack-file).

I just wanted to head that particular idea off at the pass.

I think thin packs have been a good idea, and they certainly cut the 
amount of data sent over the network down by a large amount (much more 
than 50%), so I think thin packs are a great idea. Just _not_ when 
indexed.

So I don't object to mmap windows at all. I object to them only in the 
context of "they would allow us to use deltas between two different packs"
discussion ;)

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
                                                               ` (2 preceding siblings ...)
  2006-10-18 21:41                                             ` Shawn Pearce
@ 2006-10-18 21:56                                             ` Junio C Hamano
  3 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 21:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> My personal suspicion is that we'll want to have a 64-bit index file some 
> day, and THAT is worthy of a format change. That day is not now, btw. It's 
> probably not even very close. Even the mozilla repo that was pushing the 
> limit was only doing so until it was optimized better, and now it's 
> apparently nowhere _near_ that limit.
>
> But even then, we might well want to update _just_ the index file format.

We've tried this already, and I shelved the patch for the 64-bit index
for now due to exactly the same reasoning as yours (and it would
have conflicted heavily with Shawn's windowed-mmap() patch).  It
involved updating just the index file format, so you are right
on both counts.

But you are always right anyway, so it may not be news at all
;-).

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:41                                             ` Shawn Pearce
@ 2006-10-18 22:00                                               ` Linus Torvalds
  2006-10-18 22:11                                                 ` Shawn Pearce
  2006-10-18 22:13                                               ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 22:00 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git



On Wed, 18 Oct 2006, Shawn Pearce wrote:
> 
> Actually there is a point to storing thin packs.  When I pull from
> a remote repo (or push to a remote repo) a huge number of objects
> and the target disk that is about to receive that huge number of
> loose objects is slooooooooow I would rather just store the thin
> pack than store the loose objects.
> 
> Ideally that thin pack would be repacked (along with the other
> existing packs) as quickly as possible into a self-contained pack.
> But that of course is unlikely to happen in practice; especially
> on a push.

I'm really nervous about keeping thin packs around. 

But a possibly good (and fairly simple) alternative would be to just 
create a non-thin pack on the receiving side. Right now we unpack into a 
lot of loose objects, but it should be possible to instead "unpack" into a 
non-thin pack.

In other words, we could easily still use the thin pack for communication, 
we'd just "fill it out" on the receiving side.

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:52                                                 ` Shawn Pearce
@ 2006-10-18 22:02                                                   ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:02 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

> However what do we do about the case where we mmap over 1 GiB worth
> of pack data (because the mmap succeeds and we have at least that
> much in .pack and .idx files) and then the application starts to
> demand a lot of memory via malloc?...
>
> The other configuration option is the size of the mmap window.
>...
> Earlier this summer we discussed this exact issue and said this
> value probably needs to be configurable if only to facilitate the
> unit tests.

I see.  So you are allowing users to control individual window
size and total mmap memory.  That makes sense.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:55                                               ` Linus Torvalds
@ 2006-10-18 22:05                                                 ` Shawn Pearce
  2006-10-18 22:07                                                 ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 22:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> So I don't object to mmap windows at all. I object to them only in the 
> context of "they would allow us to use deltas between two different packs"
> discussion ;)

Having mmap windows or not has no impact on using deltas between
packs.  We already map multiple packs at once.  We just don't do
delta resolution between them, for the reasons you have already
given.

The two are totally unrelated.  I apologize for somehow making
you (and others) think they are.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:55                                               ` Linus Torvalds
  2006-10-18 22:05                                                 ` Shawn Pearce
@ 2006-10-18 22:07                                                 ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> I think thin packs have been a good idea, and they certainly cut the 
> amount of data sent over the network down by a large amount (much more 
> than 50%), so I think thin packs are a great idea. Just _not_ when 
> indexed.

Ah, I feel quite behind.  I was about to say "oh have you been
pushing with --thin option?", and then realized that we made it
default since late March this year.

I need to run memtest86 on myself X-<.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:00                                               ` Linus Torvalds
@ 2006-10-18 22:11                                                 ` Shawn Pearce
  0 siblings, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 22:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> On Wed, 18 Oct 2006, Shawn Pearce wrote:
> > 
> > Actually there is a point to storing thin packs.  When I pull from
> > a remote repo (or push to a remote repo) a huge number of objects
> > and the target disk that is about to receive that huge number of
> > loose objects is slooooooooow I would rather just store the thin
> > pack than store the loose objects.
> > 
> > Ideally that thin pack would be repacked (along with the other
> > existing packs) as quickly as possible into a self-contained pack.
> > But that of course is unlikely to happen in practice; especially
> > on a push.
> 
> I'm really nervous about keeping thin packs around. 
> 
> But a possibly good (and fairly simple) alternative would be to just 
> create a non-thin pack on the receiving side. Right now we unpack into a 
> lot of loose objects, but it should be possible to instead "unpack" into a 
> non-thin pack.
> 
> In other words, we could easily still use the thin pack for communication, 
> we'd just "fill it out" on the receiving side.

Funny, I had the same thought.  :-)

We already know how many objects are coming in on a thin pack;
it's right there in the header.  We could just have some threshold
at which we start writing a full pack rather than unpacking.

Writing such a full pack would be a simple matter of copying the
input stream out to a temporary pack, but sticking any delta bases
into a table in memory.  At the end of the data stream if we have any
delta bases which weren't actually in that pack then find them and
copy them onto the end, update the header and recompute the checksum.
git-fastimport does some of that already, though it's trivial code...

Worst case scenario would be the incoming thin pack is 100% deltas
as we would need to copy in a base object for every object mentioned
in the pack.
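
The bookkeeping described above can be sketched as follows. This is a
toy model under stated assumptions, not the real pack format: objects
are identified by small integers instead of SHA-1s, and the struct and
function names are made up. It only computes which bases would have to
be appended and what the rewritten header count would be.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    int id;     /* stand-in for the object's SHA-1 */
    int base;   /* id of the delta base, or -1 for a non-delta object */
};

/* Linear search is enough for a sketch; a real tool would use a hash. */
static int contains(const int *ids, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (ids[i] == id)
            return 1;
    return 0;
}

static int in_pack(const struct entry *in, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (in[i].id == id)
            return 1;
    return 0;
}

/*
 * Collect every delta base that the incoming pack refers to but does not
 * contain.  missing[] must have room for n ids.  Returns the object count
 * the completed pack's header should carry.
 */
size_t complete_thin_pack(const struct entry *in, size_t n,
                          int *missing, size_t *n_missing)
{
    size_t m = 0;

    assert(in != NULL && missing != NULL && n_missing != NULL);
    for (size_t i = 0; i < n; i++) {
        int base = in[i].base;

        if (base < 0 || in_pack(in, n, base))
            continue;               /* not a delta, or base already present */
        if (!contains(missing, m, base))
            missing[m++] = base;    /* append each missing base only once */
    }
    *n_missing = m;
    return n + m;
}
```

In this model the worst case mentioned above is visible directly: if
every entry is a delta against a distinct outside base, n_missing
equals n and the completed pack holds twice as many objects.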

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:41                                             ` Shawn Pearce
  2006-10-18 22:00                                               ` Linus Torvalds
@ 2006-10-18 22:13                                               ` Junio C Hamano
  2006-10-18 22:42                                                 ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:13 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Linus Torvalds, Nicolas Pitre, git

Shawn Pearce <spearce@spearce.org> writes:

> Ideally that thin pack would be repacked (along with the other
> existing packs) as quickly as possible into a self-contained pack.

It should not be hard to write another program that generates a
packfile like pack-objects does but taking a thin pack as its
input.  Then receive-pack can drive it instead of
unpack-objects.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
@ 2006-10-18 22:14                         ` Jakub Narebski
  2006-10-19  5:45                           ` Jan Hudec
  2006-10-19  8:19                         ` Alexander Belchenko
  2006-10-20  2:09                         ` Horst H. von Brand
  2 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18 22:14 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jan Hudec wrote:

> Comments?

What about fetching from a repository? With revnos you have to assign a revno to
every commit you have downloaded; right now you only need to unpack the received
pack (or not, if you used the --keep option). More work.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:13                                               ` Junio C Hamano
@ 2006-10-18 22:42                                                 ` Linus Torvalds
  2006-10-18 22:48                                                   ` Junio C Hamano
  2006-10-18 23:18                                                   ` Nicolas Pitre
  0 siblings, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-18 22:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, Nicolas Pitre, git



On Wed, 18 Oct 2006, Junio C Hamano wrote:
> 
> It should not be hard to write another program that generates a
> packfile like pack-objects does but taking a thin pack as its
> input.  Then receive-pack can drive it instead of
> unpack-objects.

Give me half an hour. It should be trivial to make "unpack-objects" write 
the "unpacked" objects into a pack-file instead.

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:42                                                 ` Linus Torvalds
@ 2006-10-18 22:48                                                   ` Junio C Hamano
  2006-10-18 23:22                                                     ` Shawn Pearce
  2006-10-18 23:18                                                   ` Nicolas Pitre
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Wed, 18 Oct 2006, Junio C Hamano wrote:
>> 
>> It should not be hard to write another program that generates a
>> packfile like pack-objects does but taking a thin pack as its
>> input.  Then receive-pack can drive it instead of
>> unpack-objects.
>
> Give me half an hour. It should be trivial to make "unpack-objects" write 
> the "unpacked" objects into a pack-file instead.

Heh, three people having the same idea that goes in the same
direction at the same time is not necessarily a good sign of
efficient project management...

I am currently fighting with FC5 so please go ahead.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:42                                                 ` Linus Torvalds
  2006-10-18 22:48                                                   ` Junio C Hamano
@ 2006-10-18 23:18                                                   ` Nicolas Pitre
  2006-10-18 23:50                                                     ` Johannes Schindelin
  2006-10-19  0:07                                                     ` Linus Torvalds
  1 sibling, 2 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-18 23:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Shawn Pearce, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 18 Oct 2006, Junio C Hamano wrote:
> > 
> > It should not be hard to write another program that generates a
> > packfile like pack-objects does but taking a thin pack as its
> > input.  Then receive-pack can drive it instead of
> > unpack-objects.
> 
> Give me half an hour. It should be trivial to make "unpack-objects" write 
> the "unpacked" objects into a pack-file instead.

If you use builtin-unpack-objects.c from next, you'll be able to 
generate the pack index pretty easily as well, as all the needed info is 
stored in the obj_list array.  Just need to append objects remaining on 
the delta_list array to the end of the pack, sort the obj_list by sha1 
and write the index.

Pretty trivial indeed.
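
A rough sketch of the "sort by sha1 and write the index" step, assuming
the original (version 1) .idx layout of a 256-entry fan-out table
followed by entries sorted by SHA-1. The struct below is a simplified
stand-in, not the obj_list type from builtin-unpack-objects.c, and the
sketch stops at building the table rather than writing the file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for one unpacked object. */
struct obj_entry {
    unsigned char sha1[20];
    uint32_t offset;        /* where the object starts in the pack */
};

/* Order entries by raw SHA-1 bytes, as the index lookup expects. */
static int cmp_sha1(const void *a, const void *b)
{
    const struct obj_entry *x = a;
    const struct obj_entry *y = b;

    return memcmp(x->sha1, y->sha1, 20);
}

/*
 * Sort the objects and fill the fan-out table: fanout[b] ends up holding
 * the number of objects whose first SHA-1 byte is <= b.
 */
void build_index(struct obj_entry *objs, size_t n, uint32_t fanout[256])
{
    assert(objs != NULL && fanout != NULL);

    qsort(objs, n, sizeof(*objs), cmp_sha1);

    memset(fanout, 0, 256 * sizeof(fanout[0]));
    for (size_t i = 0; i < n; i++)
        fanout[objs[i].sha1[0]]++;          /* per-byte counts first */
    for (int b = 1; b < 256; b++)
        fanout[b] += fanout[b - 1];         /* then running totals */
}
```

Writing the file would then mean emitting the table and the sorted
(offset, sha1) pairs in network byte order, followed by the checksums.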


Nicolas

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:48                                                   ` Junio C Hamano
@ 2006-10-18 23:22                                                     ` Shawn Pearce
  0 siblings, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-18 23:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

Junio C Hamano <junkio@cox.net> wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > On Wed, 18 Oct 2006, Junio C Hamano wrote:
> >> 
> >> It should not be hard to write another program that generates a
> >> packfile like pack-objects does but taking a thin pack as its
> >> input.  Then receive-pack can drive it instead of
> >> unpack-objects.
> >
> > Give me half an hour. It should be trivial to make "unpack-objects" write 
> > the "unpacked" objects into a pack-file instead.
> 
> Heh, three people having the same idea that goes in the same
> direction at the same time is not necessarily a good sign of
> efficient project management...

Or maybe it is just a sign of a good way to resolve the issue I
was raising.  :-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 21:29                               ` Sean
@ 2006-10-18 23:31                                 ` Charles Duffy
  2006-10-18 23:48                                   ` Johannes Schindelin
                                                     ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Charles Duffy @ 2006-10-18 23:31 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Sean wrote:
> You'll need a better example than that.  Git has supported a version
> of Cygwin-compatible symlink support on Windows for quite some time.
> And no plugins were needed.

The win32-compatible symlink support is not, in and of itself, the point.

The point is that core, pervasive functionality can be modified at 
runtime, with no recompilation or installation of tools not included in 
the bzr package itself, simply by dropping a directory into place. This 
means that folks who don't have the skillset to merge three branches 
together (say, upstream plus two different trees adding extra 
functionality) and run a build can still install a few plugins to 
enhance their copy of bzr (which was installed by their IT staff, or a 
shiny click-through idiot-friendly Windows installer, etc).

And yes, there are people like that who are part of bzr's target 
audience. Think (of the lower end of the set of) DBAs, QA folk and such.


Granted, I'm speaking with my IT hat on here rather than my developer 
hat -- but plugins are a pretty clear usability win.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 21:37                               ` Shawn Pearce
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
@ 2006-10-18 23:38                                 ` Johannes Schindelin
  2006-10-18 23:54                                   ` Petr Baudis
  1 sibling, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-18 23:38 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Hi,

On Wed, 18 Oct 2006, Shawn Pearce wrote:

> Today Git doesn't run natively on Windows.

As I mentioned some time ago, I started a branch on MinGW. It works quite 
well for the moment, but it lacks fork() emulation, and glob() emulation. 
And I lack the time to continue working on it.

> Today Git is typically extended (at least initially in prototyping
> mode) through Perl, Python, TCL or Bourne shell scripts.  Although
> the first three are available natively on Windows the last requires
> Cygwin... and we've had some issues with ActiveState Perl on Windows
> in the past too.

Those are not the only problems with scripting. Scripting is fine for 
prototyping, but _anything_ remotely serious should be implemented using a 
portable (!) and safe (!) API.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:31                                 ` Charles Duffy
@ 2006-10-18 23:48                                   ` Johannes Schindelin
  2006-10-19  1:58                                     ` Charles Duffy
  2006-10-18 23:48                                   ` Jakub Narebski
       [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-18 23:48 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git, bazaar-ng

Hi,

On Wed, 18 Oct 2006, Charles Duffy wrote:

> The point is that core, pervasive functionality can be modified at 
> runtime, with no recompilation or installation of tools not included in 
> the bzr package itself, simply by dropping a directory into place.

Please note that this is not welcome here. I _need_ to trust my SCM. And 
_that_ means that no strange non-mainline beast can be allowed to change 
core features.

So, the wonderful upside of plugins you described here is actually the 
reason I will never, _never_ use bzr with plugins.

Ciao,
Dscho

--

It's not paranoia. It's called experience.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:31                                 ` Charles Duffy
  2006-10-18 23:48                                   ` Johannes Schindelin
@ 2006-10-18 23:48                                   ` Jakub Narebski
       [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-18 23:48 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Charles Duffy wrote:

> Sean wrote:
>> You'll need a better example than that.  Git has supported a version
>> of Cygwin-compatible symlink support on Windows for quite some time.
>> And no plugins were needed.
> 
> The win32-compatible symlink support is not, in and of itself, the point.
> 
> The point is that core, pervasive functionality can be modified at 
> runtime, with no recompilation or installation of tools not included in 
> the bzr package itself, simply by dropping a directory into place. This 
> means that folks who don't have the skillset to merge three branches 
> together (say, upstream plus two different trees adding extra 
> functionality) and run a build can still install a few plugins to 
> enhance their copy of bzr (which was installed by their IT staff, or a 
> shiny click-through idiot-friendly Windows installer, etc).

You don't need plugins for that. Take for example git-svn (perhaps not the
best example, as it is a Perl script; but Python, although it has a compiled
form, is a scripting language at heart), which went AFAIK from an external
contribution, to being in contrib/, to being in mainline (and in the git-svn
package).

About plugins modifying some core functionality: this is rather a sign
of not attracting developers to do it in-core...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
@ 2006-10-18 23:49                                     ` Sean
  2006-10-18 23:49                                     ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-18 23:49 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git, bazaar-ng

On Wed, 18 Oct 2006 18:31:32 -0500
Charles Duffy <cduffy@spamcop.net> wrote:

> Granted, I'm speaking with my IT hat on here rather than my developer 
> hat -- but plugins are a pretty clear usability win.

Sure they can be.  But their value I think is overstated, especially
in an open source project where anyone can grab a copy of the source
and update it with a trial feature.  This updated copy can be wrapped
in a nice GUI installer just as easily as any plugin.

Now, I suppose plugins let end users mix and match trial features
slightly easier, but hopefully your base package isn't so devoid of
features that this is honestly necessary.

As Petr pointed out, all this comes to Bzr essentially for free
since it's a part of Python.  So be it, but I've yet to hear an
example where plugins were anything more than a minor convenience
rather than a fundamental win over the way Git is developing.

For an example, just look how few lines of git were needed to
implement the essential features of the bzr bundle feature.
With no plugins or monkey business needed ;o)

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 23:18                                                   ` Nicolas Pitre
@ 2006-10-18 23:50                                                     ` Johannes Schindelin
  2006-10-19  0:07                                                     ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-18 23:50 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Hi,

On Wed, 18 Oct 2006, Nicolas Pitre wrote:

> Pretty trivial indeed.

Easy! You take all the fun out of it!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* [ANNOUNCE] GIT 1.4.3
@ 2006-10-18 23:53 Junio C Hamano
  2006-10-20 12:31 ` Horst H. von Brand
                   ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-18 23:53 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

The latest feature release GIT 1.4.3 is available at the usual
places:

  http://www.kernel.org/pub/software/scm/git/

  git-1.4.3.tar.{gz,bz2}			(tarball)
  git-htmldocs-1.4.3.tar.{gz,bz2}		(preformatted docs)
  git-manpages-1.4.3.tar.{gz,bz2}		(preformatted docs)
  RPMS/$arch/git-*-1.4.3-1.$arch.rpm	(RPM)

Please holler if the i386 RPMs are broken, since they were not cut
on the machine I usually use (I ended up burning half a day
installing and futzing with FC5 on my older laptop resurrected
from the boneyard).

User visible changes, other than bugfixes, since v1.4.2.4 are:

 - upload-tar is deprecated but not removed; we now have
   upload-archive --format=tar and --format=zip instead.

 - The ftp:// protocol is supported the same way as http:// and
   https://.

 - git-diff paginates its output to the tty by default.  If this
   irritates you, using LESS=RF might help.

 - git-cherry-pick no longer leaves the often useless
   "cherry-picked from" message.

 - git-merge-recursive was replaced by a rewritten implementation
   in C.  The original Python implementation is available as
   "recursive-old" strategy for now, but hopefully we can remove
   it in the next cycle.

 - git-daemon can do name based virtual hosting.

 - git-daemon can serve tar and zip snapshots.

 - many gitweb tweaks and cleanups.

 - git-apply --reverse, --reject.

 - git-diff --color highlights whitespace errors.

 - git-diff --stat can be taught to use non-default widths.

 - git-status can use colors.

 - many more commands are built-in.
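
A few of the items above can be tried out roughly as follows.  This is
only a sketch, not part of the release itself: it assumes a reasonably
recent git where the commands are spelled "git archive" and "git diff"
(1.4.3 installs them as git-archive and git-diff), and the scratch
repository, the README file and the demo-1.0/ prefix are made up for
illustration.  The --stat=<width> spelling is the one current git
accepts; the exact option in 1.4.3 may differ.

```shell
#!/bin/sh
# Sketch only: exercises a few of the user-visible changes listed above
# in a throwaway repository.  All names here are illustrative.
set -e

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.invalid" \
    commit -q -m "initial"

# tar and zip snapshots of HEAD, with every path placed under demo-1.0/
git archive --format=tar --prefix=demo-1.0/ HEAD > "$work/demo-1.0.tar"
git archive --format=zip --prefix=demo-1.0/ HEAD > "$work/demo-1.0.zip"

# List what ended up in the tarball.
tar -tf "$work/demo-1.0.tar"

# Make a change so that the diff commands have something to show.
echo "more text" >> README

# Diffstat limited to a non-default total width (60 columns here).
git diff --stat=60

# With LESS=RF, less passes colour escapes through (R) and exits at once
# when the output fits on one screen (F).  When stdout is not a tty, git
# does not start a pager at all.
LESS=RF git diff

# Remove "$work" by hand when done.
```

The two archive commands differ only in --format, which is the point
of replacing the tar-only command with a general one.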

----------------------------------------------------------------

 .gitignore                                         |   10 +-
 Documentation/Makefile                             |    4 +-
 Documentation/asciidoc.conf                        |    1 +
 Documentation/config.txt                           |   34 +
 Documentation/core-tutorial.txt                    |    2 +-
 Documentation/cvs-migration.txt                    |    2 +-
 Documentation/diff-options.txt                     |   10 +-
 Documentation/git-apply.txt                        |   69 +-
 .../{git-tar-tree.txt => git-archive.txt}          |   93 +-
 Documentation/git-blame.txt                        |   29 +-
 Documentation/git-cherry-pick.txt                  |   23 +-
 Documentation/git-daemon.txt                       |  135 +-
 Documentation/git-grep.txt                         |   15 +-
 Documentation/git-http-push.txt                    |    2 +-
 Documentation/git-init-db.txt                      |    4 +
 Documentation/git-ls-remote.txt                    |   18 +-
 Documentation/git-pack-objects.txt                 |   26 +-
 Documentation/git-receive-pack.txt                 |    2 +
 Documentation/git-repack.txt                       |   13 +-
 Documentation/git-repo-config.txt                  |    3 +-
 Documentation/git-rev-list.txt                     |  428 ++-
 Documentation/git-rev-parse.txt                    |    2 +-
 Documentation/git-send-pack.txt                    |    2 +-
 Documentation/git-shortlog.txt                     |   17 +-
 Documentation/git-svn.txt                          |  399 ++-
 Documentation/git-tar-tree.txt                     |    3 +
 Documentation/git-unpack-objects.txt               |    8 +-
 Documentation/git-update-index.txt                 |    4 +-
 .../{git-upload-tar.txt => git-upload-archive.txt} |   24 +-
 Documentation/git.txt                              |   37 +-
 Documentation/gitk.txt                             |  151 +-
 Documentation/glossary.txt                         |    4 +-
 Documentation/hooks.txt                            |   56 +-
 Documentation/technical/racy-git.txt               |  193 +
 Documentation/tutorial-2.txt                       |    2 +-
 GIT-VERSION-GEN                                    |    2 +-
 INSTALL                                            |   15 +-
 Makefile                                           |  297 +-
 builtin-tar-tree.c => archive-tar.c                |  229 +-
 archive-zip.c                                      |  333 ++
 archive.h                                          |   47 +
 blame.c                                            |   19 +-
 builtin-apply.c                                    |  708 +++-
 builtin-archive.c                                  |  263 ++
 builtin-cat-file.c                                 |   40 +-
 checkout-index.c => builtin-checkout-index.c       |   29 +-
 builtin-commit-tree.c                              |    2 +-
 builtin-count.c => builtin-count-objects.c         |    2 +-
 builtin-diff-files.c                               |    7 -
 builtin-diff-stages.c                              |    2 +-
 builtin-diff.c                                     |   16 +-
 builtin-fmt-merge-msg.c                            |   39 +-
 builtin-grep.c                                     |  650 +---
 builtin-init-db.c                                  |    1 +
 builtin-log.c                                      |    7 +-
 builtin-ls-files.c                                 |   27 +-
 builtin-ls-tree.c                                  |    6 +-
 builtin-mailinfo.c                                 |   17 +-
 builtin-mv.c                                       |   12 +-
 name-rev.c => builtin-name-rev.c                   |    8 +-
 pack-objects.c => builtin-pack-objects.c           |  439 ++-
 builtin-prune-packed.c                             |    2 +-
 builtin-prune.c                                    |    4 +-
 builtin-push.c                                     |   32 +-
 builtin-read-tree.c                                |  865 +----
 builtin-repo-config.c                              |   34 +-
 builtin-rev-list.c                                 |  171 +-
 builtin-rev-parse.c                                |   10 +-
 builtin-rm.c                                       |    2 +-
 builtin-runstatus.c                                |   36 +
 builtin-show-branch.c                              |   18 +-
 symbolic-ref.c => builtin-symbolic-ref.c           |    8 +-
 builtin-tar-tree.c                                 |  439 +--
 unpack-objects.c => builtin-unpack-objects.c       |   78 +-
 builtin-update-index.c                             |   18 +-
 builtin-update-ref.c                               |    2 +-
 builtin-upload-archive.c                           |  175 +
 builtin-upload-tar.c                               |   74 -
 verify-pack.c => builtin-verify-pack.c             |   15 +-
 builtin-write-tree.c                               |    4 +-
 builtin.h                                          |   86 +-
 cache-tree.c                                       |   14 +-
 cache.h                                            |   68 +-
 check-racy.c                                       |   28 +
 color.c                                            |  176 +
 color.h                                            |   12 +
 combine-diff.c                                     |   41 +-
 commit.c                                           |   51 +-
 commit.h                                           |    2 +-
 compat/inet_pton.c                                 |  220 +
 config.c                                           |   18 +-
 config.mak.in                                      |   18 +
 configure.ac                                       |  561 ++-
 connect.c                                          |   50 +-
 contrib/completion/git-completion.bash             |  324 ++
 contrib/emacs/git.el                               |    4 +-
 contrib/emacs/vc-git.el                            |    6 +-
 contrib/gitview/gitview.txt                        |   56 +-
 contrib/vim/README                                 |    8 +
 contrib/vim/syntax/gitcommit.vim                   |   18 +
 convert-objects.c                                  |    8 +-
 csum-file.c                                        |    6 +-
 daemon.c                                           |  431 ++-
 date.c                                             |  132 +-
 describe.c                                         |   14 +-
 diff-delta.c                                       |    4 +-
 diff-lib.c                                         |   32 +-
 diff.c                                             |  666 ++-
 diff.h                                             |   15 +-
 diffcore-break.c                                   |    2 +-
 diffcore-rename.c                                  |    2 +-
 dir.c                                              |   48 +-
 dir.h                                              |    1 +
 dump-cache-tree.c                                  |    2 +-
 entry.c                                            |    4 +-
 environment.c                                      |   20 +-
 exec_cmd.c                                         |   20 +-
 fetch-clone.c                                      |   33 +-
 fetch-pack.c                                       |   24 +-
 fetch.c                                            |    9 +-
 fsck-objects.c                                     |   44 +-
 generate-cmdlist.sh                                |    1 +
 git-branch.sh                                      |   10 +
 git-checkout.sh                                    |    9 +-
 git-cherry.sh                                      |    3 -
 git-clone.sh                                       |    8 +-
 git-commit.sh                                      |  582 +--
 git-compat-util.h                                  |   18 +-
 git-cvsexportcommit.perl                           |    2 +-
 git-cvsserver.perl                                 |   65 +-
 git-fetch.sh                                       |   26 +-
 git-ls-remote.sh                                   |    6 +-
 ...erge-recursive.py => git-merge-recursive-old.py |    0 
 git-merge.sh                                       |    5 +-
 git-parse-remote.sh                                |   43 +-
 git-pull.sh                                        |    2 +-
 git-rebase.sh                                      |    6 +-
 git-repack.sh                                      |   25 +-
 git-reset.sh                                       |    3 -
 git-resolve.sh                                     |    4 +
 git-revert.sh                                      |   14 +-
 git-send-email.perl                                |   42 +-
 git-shortlog.perl                                  |   44 +-
 git-svn.perl                                       |  122 +-
 git-svnimport.perl                                 |   35 +-
 git.c                                              |  131 +-
 git.spec.in                                        |   23 +-
 gitk                                               |  682 +++-
 gitweb/README                                      |   61 +-
 gitweb/git-favicon.png                             |  Bin
 gitweb/git-logo.png                                |  Bin
 gitweb/gitweb.css                                  |   80 +-
 gitweb/{gitweb.cgi => gitweb.perl}                 | 4459 ++++++++++++--------
 grep.c                                             |  498 +++
 grep.h                                             |   79 +
 builtin-help.c => help.c                           |    4 +-
 http-fetch.c                                       |  303 +--
 http-push.c                                        |   95 +-
 http.c                                             |   12 +
 http.h                                             |    4 +
 imap-send.c                                        |   45 +-
 index-pack.c                                       |   16 +-
 interpolate.c                                      |  108 +
 interpolate.h                                      |   26 +
 builtin-prune.c => list-objects.c                  |  255 +-
 list-objects.h                                     |   12 +
 local-fetch.c                                      |    8 +-
 log-tree.c                                         |   82 +-
 merge-base.c                                       |    2 +-
 merge-file.c                                       |    2 +-
 merge-index.c                                      |    5 +-
 merge-recursive.c                                  | 1351 ++++++
 merge-tree.c                                       |   10 +-
 mktag.c                                            |    2 +-
 mktree.c                                           |    5 +-
 object-refs.c                                      |   11 +-
 object.c                                           |    6 +-
 object.h                                           |   11 -
 pack-check.c                                       |   25 +-
 pack-redundant.c                                   |   18 +-
 pager.c                                            |    4 +-
 patch-id.c                                         |    2 +-
 path-list.c                                        |    5 +-
 path.c                                             |   10 +-
 peek-remote.c                                      |    5 +-
 perl/.gitignore                                    |    4 +
 perl/Git.pm                                        |  837 ++++
 perl/Makefile.PL                                   |   28 +
 perl/private-Error.pm                              |  827 ++++
 quote.c                                            |   61 +
 quote.h                                            |    7 +
 read-cache.c                                       |   77 +-
 receive-pack.c                                     |   28 +-
 refs.c                                             |   26 +-
 revision.c                                         |  258 +-
 revision.h                                         |   14 +-
 rsh.c                                              |   31 +-
 run-command.c                                      |    8 +-
 send-pack.c                                        |  126 +-
 server-info.c                                      |    2 +-
 setup.c                                            |    2 +
 sha1_file.c                                        |  596 ++--
 sha1_name.c                                        |   60 +-
 sideband.c                                         |   78 +
 sideband.h                                         |   13 +
 ssh-fetch.c                                        |   10 +-
 ssh-upload.c                                       |    4 +-
 t/t1200-tutorial.sh                                |    2 +-
 t/t1400-update-ref.sh                              |   86 +-
 t/t3200-branch.sh                                  |   12 +
 t/t3403-rebase-skip.sh                             |    4 +-
 t/t3700-add.sh                                     |   22 +
 t/t4015-diff-whitespace.sh                         |  122 +
 t/t4103-apply-binary.sh                            |    4 +-
 t/t4104-apply-boundary.sh                          |  115 +
 t/t4116-apply-reverse.sh                           |   85 +
 t/t4117-apply-reject.sh                            |  157 +
 t/t5400-send-pack.sh                               |   14 +
 t/t5510-fetch.sh                                   |   69 +
 t/t5600-clone-fail-cleanup.sh                      |    6 +
 t/t5710-info-alternate.sh                          |    2 +
 t/t6001-rev-list-graft.sh                          |  113 +
 t/t7002-grep.sh                                    |   31 +-
 t/t7201-co.sh                                      |    9 +
 t/test-lib.sh                                      |   17 +-
 trace.c                                            |  150 +
 tree-diff.c                                        |   15 +-
 tree-walk.c                                        |    4 +-
 tree.c                                             |    5 +-
 builtin-read-tree.c => unpack-trees.c              |  474 +--
 unpack-trees.h                                     |   35 +
 upload-pack.c                                      |  190 +-
 write_or_die.c                                     |   45 +
 wt-status.c                                        |  276 ++
 wt-status.h                                        |   25 +
 xdiff-interface.c                                  |   12 +-
 xdiff/xutils.c                                     |   29 +-
 237 files changed, 16898 insertions(+), 8168 deletions(-)
 copy Documentation/{git-tar-tree.txt => git-archive.txt} (29%)
 rewrite Documentation/git-rev-list.txt (61%)
 rename Documentation/{git-upload-tar.txt => git-upload-archive.txt} (30%)
 rewrite Documentation/gitk.txt (37%)
 create mode 100644 Documentation/technical/racy-git.txt
 copy builtin-tar-tree.c => archive-tar.c (59%)
 create mode 100644 archive-zip.c
 create mode 100644 archive.h
 create mode 100644 builtin-archive.c
 rename builtin-cat-file.c => builtin-cat-file.c (0%)
 rename checkout-index.c => builtin-checkout-index.c (92%)
 rename builtin-count.c => builtin-count-objects.c (99%)
 rename name-rev.c => builtin-name-rev.c (97%)
 rename pack-objects.c => builtin-pack-objects.c (81%)
 create mode 100644 builtin-runstatus.c
 rename symbolic-ref.c => builtin-symbolic-ref.c (75%)
 rename unpack-objects.c => builtin-unpack-objects.c (82%)
 create mode 100644 builtin-upload-archive.c
 delete mode 100644 builtin-upload-tar.c
 rename verify-pack.c => builtin-verify-pack.c (83%)
 create mode 100644 check-racy.c
 create mode 100644 color.c
 create mode 100644 color.h
 create mode 100644 compat/inet_pton.c
 rewrite configure.ac (21%)
 create mode 100755 contrib/completion/git-completion.bash
 rename contrib/gitview/{gitview.txt => gitview.txt} (74%)
 create mode 100644 contrib/vim/README
 create mode 100644 contrib/vim/syntax/gitcommit.vim
 rename git-merge-recursive.py => git-merge-recursive-old.py (100%)
 create mode 100644 gitweb/git-favicon.png
 create mode 100644 gitweb/git-logo.png
 rename gitweb/{gitweb.cgi => gitweb.perl} (30%)
 create mode 100644 grep.c
 create mode 100644 grep.h
 rename builtin-help.c => help.c (99%)
 create mode 100644 interpolate.c
 create mode 100644 interpolate.h
 copy builtin-prune.c => list-objects.c (24%)
 create mode 100644 list-objects.h
 create mode 100644 merge-recursive.c
 create mode 100644 perl/.gitignore
 create mode 100644 perl/Git.pm
 create mode 100644 perl/Makefile.PL
 create mode 100644 perl/private-Error.pm
 create mode 100644 sideband.c
 create mode 100644 sideband.h
 create mode 100755 t/t4015-diff-whitespace.sh
 create mode 100755 t/t4104-apply-boundary.sh
 create mode 100755 t/t4116-apply-reverse.sh
 create mode 100755 t/t4117-apply-reject.sh
 create mode 100755 t/t5510-fetch.sh
 create mode 100755 t/t6001-rev-list-graft.sh
 create mode 100644 trace.c
 copy builtin-read-tree.c => unpack-trees.c (62%)
 create mode 100644 unpack-trees.h
 create mode 100644 write_or_die.c
 create mode 100644 wt-status.c
 create mode 100644 wt-status.h

Alan Chandler (2):
      Update the gitweb/README file to include setting the GITWEB_CONFIG environment
      Fix usage string to match that given in the man page

Alex Riesen (3):
      Use const for interpolate arguments
      fix daemon.c compilation for NO_IPV6=1
      do not discard constness in interp_set_entry value argument

Alexandre Julliard (2):
      git.el: Fixed inverted "renamed from/to" message.
      vc-git.el: Switch to using git-blame instead of git-annotate.

Andy Whitcroft (4):
      send-pack: remove remote reference limit
      send-pack: switch to using git-rev-list --stdin
      svnimport: add support for parsing From: lines for author
      add proper dependancies on the xdiff source

Aneesh Kumar K.V (4):
      gitweb: Support for snapshot
      gitweb: fix snapshot support
      gitweb: Make blame and snapshot a feature.
      gitweb: Fix git_blame

Art Haas (1):
      Patch for http-fetch.c and older curl releases

Christian Couder (9):
      Trace into open fd and refactor tracing code.
      Trace into a file or an open fd and refactor tracing code.
      Update GIT_TRACE documentation.
      Fix memory leak in prepend_to_path (git.c).
      Move add_to_string to "quote.c" and make it extern.
      Fix a memory leak in "connect.c" and die if command too long.
      Fix space in string " false" problem in "trace.c".
      Remove empty ref directories that prevent creating a ref.
      Fix tracing when GIT_TRACE is set to an empty string.

David Rientjes (18):
      blame.c return cleanup
      builtin-grep.c cleanup
      builtin-push.c cleanup
      diff.c cleanup
      http-push.c cleanup
      read-cache.c cleanup
      Make pprint_tag void and cleans up call in cmd_cat_file.
      Make show_entry void
      Make checkout_all void.
      Make fsck_dir void.
      Make pack_objects void.
      Make track_tree_refs void.
      Make upload_pack void and remove conditional return.
      Make sha1flush void and remove conditional return.
      make inline is_null_sha1 global
      use appropriate typedefs
      remove unnecessary initializations
      Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.

Dennis Stosberg (12):
      "test" in Solaris' /bin/sh does not support -e
      Makefile fix for Solaris
      Add possibility to pass CFLAGS and LDFLAGS specific to the perl subdir
      Solaris has strlcpy() at least since version 8
      Look for sockaddr_storage in sys/socket.h
      Fix detection of ipv6 on Solaris
      Fix compilation with Sun CC
      gitweb: Use --git-dir parameter instead of setting $ENV{'GIT_DIR'}
      gitweb: Remove forgotten call to git_to_hash
      use do() instead of require() to include configuration
      lock_ref_sha1_basic does not remove empty directories on BSD
      Add default values for --window and --depth to the docs

Dmitry V. Levin (3):
      Make count-objects, describe and merge-tree work in subdirectory
      Documentation: Fix broken links
      Handle invalid argc gently

Eric Wong (13):
      pass DESTDIR to the generated perl/Makefile
      git-svn: establish new connections on commit after fork
      git-svn: recommend rebase for syncing against an SVN repo
      git-svn: add the 'dcommit' command
      git-svn: stop repeatedly reusing the first commit message with dcommit
      git-svn: multi-init saves and reuses --tags and --branches arguments
      git-svn: log command fixes
      Documentation/git-svn: document some of the newer features
      git-svn: -h(elp) message formatting fixes
      commit: fix a segfault when displaying a commit with unreachable parents
      git-svn: add a message encouraging use of SVN::* libraries
      git-svn: fix commits over svn+ssh://
      git-svn: reduce memory usage for large commits

Franck Bui-Huu (11):
      Add a newline before appending "Signed-off-by: " line
      log-tree.c: cleanup a bit append_signoff()
      Add git-archive
      git-archive: wire up TAR format.
      git-archive: wire up ZIP format.
      Add git-upload-archive
      connect.c: finish_connect(): allow null pid parameter
      Test return value of finish_connect()
      upload-archive: monitor child communication even more carefully.
      git-archive: update documentation
      Add git-upload-archive to the main git man page

Haavard Skinnemoen (1):
      git-send-email: Don't set author_not_sender from Cc: lines

Jakub Narebski (139):
      gitweb: whitespace cleanup
      gitweb: Use list for of open for running git commands, thorougly.
      gitweb: simplify git_get_hash_by_path
      gitweb: More explicit error messages for open "-|"
      gitweb: Cleanup - chomp $line in consistent style
      gitweb: Cleanup - chomp @lines in consistent style
      gitweb: Add git_page_nav for later use
      gitweb: Navbar refactoring - use git_page_nav to generate navigation bar
      gitweb: Replace form-feed character by ^L
      gitweb: Show project descriptions with utf-8 characters in project list correctly
      gitweb: Add "\n" after <br/> in git_page_nav
      gitweb: Pager refactoring - use git_get_paging_nav for pagination
      gitweb: Remove $project from git_get_paging_nav arguments
      gitweb: Headers refactoring - use git_header_div for header divs
      gitweb: Remove characters entities entirely when shortening string
      gitweb: Ref refactoring - use git_get_referencing for marking tagged/head commits
      gitweb: Refactor generation of shortlog, tags and heads body
      gitweb: do not quote path for list version of open "-|"
      gitweb: Remove characters entities entirely when shortening string -- correction
      gitweb: Reordering code and dividing it into categories
      gitweb: Refactoring git_project_list
      autoconf: Add support for setting SHELL_PATH and PERL_PATH
      autoconf: Move site configuration section earlier in configure.ac
      autoconf: Add support for setting PYTHON_PATH or NO_PYTHON
      autoconf: Check for ll hh j z t size specifiers introduced by C99
      autoconf: Typo cleanup, reordering etc.
      Copy description of new build configuration variables to configure.ac
      autoconf: Set NEEDS_LIBICONV unconditionally if there is no iconv in libc
      gitweb: Separate input validation and dispatch, add comment about opml action
      gitweb: die_error first (optional) parameter is HTTP status
      gitweb: Use undef for die_error to use default first (status) parameter value
      gitweb: Don't undefine query parameter related variables before die_error
      gitweb: Cleanup and uniquify error messages
      gitweb: No periods for error messages
      gitweb: No error messages with unescaped/unprotected user input
      gitweb: PATH_INFO=/ means no project
      gitweb: Inline $rss_link
      gitweb: Refactor untabifying - converting tabs to spaces
      gitweb: fix commitdiff for root commits
      gitweb: Skip nonmatching lines in difftree output, consistently
      autoconf: Unset NO_STH and NEED_STH when it is detected not needed
      gitweb: Remove unused variables in git_shortlog_body and git_heads
      autoconf: Add configure target to main Makefile
      autoconf: Error out on --without-shell and --without-perl
      autoconf: Improvements in NO_PYTHON/PYTHON_PATH handling
      autoconf: Move variables which we always set to config.mak.in
      autoconf: It is --without-python, not --no-python
      autoconf: Add support for setting CURLDIR, OPENSSLDIR, EXPATDIR
      gitweb: Whitespace cleanup - tabs are for indent, spaces are for align
      gitweb: Great subroutines renaming
      gitweb: Separate ref parsing in git_get_refs_list into parse_ref
      gitweb: Refactor printing shortened title in git_shortlog_body and git_tags_body
      gitweb: Separate main part of git_history into git_history_body
      gitweb: Separate finding project owner into git_get_project_owner
      gitweb: Change appereance of marker of refs pointing to given object
      gitweb: Skip comments in mime.types like file
      gitweb: True fix: Support for the standard mime.types map in gitweb
      gitweb: Separate printing difftree in git_commit into git_difftree_body
      gitweb: Show project's git URL on summary page
      gitweb: Add support for per project git URLs
      gitweb: Uniquify version info output, add meta generator in page header
      gitweb: Refactor printing commit message
      gitweb: Added parse_difftree_raw_line function for later use
      gitweb: Use parse_difftree_raw_line in git_difftree_body
      gitweb: bugfix: a.list formatting regression
      gitweb: Replace some presentational HTML by CSS
      gitweb: Whitespace cleanup: realign, reindent
      gitweb: Use underscore instead of hyphen to separate words in HTTP headers names
      gitweb: Route rest of action subroutines through %actions
      gitweb: Use here-doc
      gitweb: Drop the href() params which keys are not in %mapping
      gitweb: Sort CGI parameters returned by href()
      gitweb: Use git-diff-tree patch output for commitdiff
      gitweb: Show information about incomplete lines in commitdiff
      gitweb: Remove invalid comment in format_diff_line
      gitweb: Streamify patch output in git_commitdiff
      gitweb: Add git_get_{following,preceding}_references functions
      gitweb: Faster return from git_get_preceding_references if possible
      gitweb: Add git_get_rev_name_tags function
      gitweb: Use git_get_name_rev_tags for commitdiff_plain X-Git-Tag: header
      gitweb: Add support for hash_parent_base parameter for blobdiffs
      gitweb: Allow for pre-parsed difftree info in git_patchset_body
      gitweb: Parse two-line from-file/to-file diff header in git_patchset_body
      gitweb: Add invisible hyperlink to from-file/to-file diff header
      gitweb: Always display link to blobdiff_plain in git_blobdiff
      gitweb: Change here-doc back for style consistency in git_blobdiff
      gitweb: Use git-diff-tree or git-diff patch output for blobdiff
      gitweb: git_blobdiff_plain is git_blobdiff('plain')
      gitweb: Remove git_diff_print subroutine
      gitweb: Remove creating directory for temporary files
      gitweb: git_annotate didn't expect negative numeric timezone
      gitweb: Remove workaround for git-diff bug fixed in f82cd3c
      gitweb: Improve comments about gitweb features configuration
      gitweb: blobs defined by non-textual hash ids can be cached
      gitweb: Fix typo in git_difftree_body
      gitweb: Fix typo in git_patchset_body
      gitweb: Remove unused git_get_{preceding,following}_references
      gitweb: Remove git_to_hash function
      gitweb: Use @diff_opts, default ('M'), as git-diff and git-diff-tree paramete
      gitweb: Make git_print_log generic; git_print_simplified_log uses it
      gitweb: Do not remove signoff lines in git_print_simplified_log
      gitweb: Add author information to commitdiff view
      gitweb: git_print_log: signoff line is non-empty line
      gitweb: Add diff tree, with links to patches, to commitdiff view
      gitweb: Add local time and timezone to git_print_authorship
      gitweb: Move git-ls-tree output parsing to parse_ls_tree_line
      gitweb: Separate printing of git_tree row into git_print_tree_entry
      gitweb: Extend parse_difftree_raw_line to save commit info
      gitweb: Change the name of diff to parent link in "commit" view to "diff
      gitweb: Add GIT favicon, assuming image/png type
      gitweb: Correct typo: '==' instead of 'eq' in git_difftree_body
      gitweb: Divide page path into directories -- path's "breadcrumbs"
      autoconf: Add -liconv to LIBS when NEEDS_LIBICONV
      autoconf: Check for subprocess.py
      autoconf: Quote AC_CACHE_CHECK arguments
      autoconf: Fix copy'n'paste error
      autoconf: Set NO_ICONV if iconv is found neither in libc, nor in libiconv
      autoconf: Add support for setting NO_ICONV and ICONVDIR
      autoconf: Add config.cache to .gitignore
      gitweb: Make pickaxe search a feature
      gitweb: Paginate history output
      gitweb: Use File::Find::find in git_get_projects_list
      gitweb: Do not parse refs by hand, use git-peek-remote instead
      gitweb: Add git_project_index for generating index.aux
      gitweb: Allow for href() to be used for links without project param
      gitweb: Add link to "project_index" view to "project_list" page
      gitweb: Fix mimetype_guess_file for files with multiple extensions
      gitweb: Even more support for PATH_INFO based URLs
      gitweb: Require project for almost all actions
      gitweb: Always use git-peek-remote in git_get_references
      gitweb: Make git_get_refs_list do work of git_get_references
      gitweb: Fix thinko in git_tags and git_heads
      gitweb: Make git_get_hash_by_path check type if provided
      gitweb: Strip trailing slashes from $path in git_get_hash_by_path
      gitweb: Use "return" instead of "return undef" for some subs
      gitweb: Split validate_input into validate_pathname and validate_refname
      gitweb: Add git_url subroutine, and use it to quote full URLs
      gitweb: Quote filename in HTTP Content-Disposition: header
      gitweb: Cleanup Git logo and Git logo target generation

Jeff King (9):
      gitweb: optionally read config from GITWEB_CONFIG
      diff: support custom callbacks for output
      Move color option parsing out of diff.c and into color.[ch]
      git-commit.sh: convert run_status to a C builtin
      git-status: document colorization config options
      contrib/vim: add syntax highlighting file for commits
      wt-status: remove extraneous newline from 'deleted:' output
      rev-list: fix segfault with --{author,committer,grep}
      git-repack: allow git-repack to run in subdirectory

Johannes Schindelin (38):
      Git.xs: older perl do not know const char *
      Status update on merge-recursive in C
      Cumulative update of merge-recursive in C
      merge-recur: Convert variable names to lower_case
      merge-recur: Get rid of debug code
      merge-recur: Remove dead code
      merge-recur: Fix compiler warning with -pedantic
      merge-recur: Cleanup last mixedCase variables...
      merge-recur: Explain why sha_eq() and struct stage_data cannot go
      merge-recur: fix thinko in unique_path()
      read-trees: refactor the unpack_trees() part
      read-tree: move merge functions to the library
      merge-recur: use the unpack_trees() interface instead of exec()ing read-tree
      merge-recur: virtual commits shall never be parsed
      merge-recursive: fix rename handling
      http-push: avoid fork() by calling merge_bases() directly
      merge-recur: do not call git-write-tree
      merge-recur: do not setenv("GIT_INDEX_FILE")
      merge-recur: if there is no common ancestor, fake empty one
      merge-recur: try to merge older merge bases first
      merge-recur: do not die unnecessarily
      discard_cache(): discard index, even if no file was mmap()ed
      Add the --color-words option to the diff options family
      builtin-mv: readability patch
      unpack-objects: remove unused variable "eof"
      Makefile: fix typo
      Remove uneeded #include
      fmt-merge-msg: fix off-by-one bug
      Teach runstatus about --untracked
      add receive.denyNonFastforwards config variable
      receive-pack: plug memory leak in fast-forward checking code.
      Document receive.denyNonFastforwards
      runstatus: do not recurse into subdirectories if not needed
      daemon: default to 256 for HOST_NAME_MAX if it is not defined
      diff --stat: ensure at least one '-' for deletions, and one '+' for additions
      diff: fix 2 whitespace issues
      cvsserver: Show correct letters for modified, removed and added files
      cvsserver: fix "cvs diff" in a subdirectory

Jon Loeliger (3):
      Add virtualization support to git-daemon
      Cleaned up git-daemon virtual hosting support.
      Removed memory leaks from interpolation table uses.

Jonas Fonseca (21):
      git-apply(1): document missing options and improve existing ones
      git-ls-remote(1): document --upload-pack
      git-blame(1): mention options in the synopsis and advertise pickaxe
      gitk(1): expand the manpage to look less like a template
      git(7): put the synopsis in a verse style paragraph
      gitview.txt: improve asciidoc markup
      git-svn(1): improve asciidoc markup
      describe: fix off-by-one error in --abbrev=40 handling
      Use PATH_MAX instead of MAXPATHLEN
      Use xrealloc instead of realloc
      Use fstat instead of fseek
      Use xcalloc instead of calloc
      Add --relative-date option to the revision interface
      git(7): move gitk(1) to the list of porcelain commands
      Use xmalloc instead of malloc
      Include config.mak.autogen in the doc Makefile
      git-rev-list(1): group options; reformat; document more options
      git-apply(1): document --unidiff-zero
      git-repack(1): document --window and --depth
      Fix trivial typos and inconsistencies in hooks documentation
      gitk(1): mention --all

Junio C Hamano (139):
      Perl interface: add build-time configuration to allow building with -fPIC
      Perl interface: make testsuite work again.
      perl: fix make clean
      Git.pm: tentative fix to test the freshly built Git.pm
      Perly Git: arrange include path settings properly.
      Makefile: Set USE_PIC on x86-64
      Perly git: work around buggy make implementations.
      Git.pm: clean generated files.
      Perly Git: make sure we do test the freshly built one.
      INSTALL: a tip for running after building but without installing.
      Work around sed and make interactions on the backslash at the end of line.
      upload-pack: use object pointer not copy of sha1 to keep track of has/needs.
      upload-pack: lift MAX_NEEDS and MAX_HAS limitation
      recur vs recursive: help testing without touching too many stuff.
      sha1_file.c: expose map_sha1_file() interface.
      pack-objects: reuse deflated data from new-style loose objects.
      unpack-objects: read configuration data upon startup.
      Makefile: git-merge-recur depends on xdiff libraries.
      gitweb: There can be more than two levels of subdirectories
      gitweb: an obvious cut and paste error.
      gitweb: fix use of uninitialized value.
      gitweb: when showing history of a tree, show tree link not blob
      gitweb: avoid undefined value warning in print_page_path
      gitweb/README: do not bug Kay with gitweb questions anymore
      Makefile: gitweb/gitweb.cgi is now generated.
      gitweb: do not use @@FOO@@ for replaced tokens
      .gitignore: git-merge-recur is a built file.
      Make git-checkout-index a builtin
      builtins: Makefile clean-up
      git.c: Rename NEEDS_PREFIX to RUN_SETUP
      autoconf: fix NEEDS_SSL_WITH_CRYPTO
      autoconf: NO_IPV6
      Racy git: avoid having to be always too careful
      read-cache: tweak racy-git delay logic
      autoconf: clean temporary file mak.append
      git-grep: show pathnames relative to the current directory
      upload-pack: minor clean-up in multi-ack logic
      Fix type of combine-diff.c::show_patch_diff()
      Remove combine-diff.c::uninteresting()
      t4116 apply --reverse test
      git-apply --reverse: simplify reverse option.
      git-apply --binary: clean up and prepare for --reverse
      avoid nanosleep(2)
      Documentation/technical/racy-git.txt
      Add check program "git-check-racy"
      Remove the "delay writing to avoid runtime penalty of racy-git avoidance"
      builtin-grep: remove unused debugging cruft.
      builtin-apply --reverse: two bugfixes.
      diff.c: make binary patch reversible.
      apply --reverse: tie it all together.
      git-apply --reject
      git-apply --reject: send rejects to .rej files.
      git-apply --verbose
      apply --reject: count hunks starting from 1, not 0
      Convert memset(hash,0,20) to hashclr(hash).
      hashcpy/hashcmp remaining bits.
      builtin-grep.c: remove unused debugging piece.
      update-index -g
      git-apply --reject: finishing touches.
      free(NULL) is perfectly valid.
      daemon: prepare for multiple services.
      daemon: add upload-tar service.
      multi-service daemon: documentation
      t5710: fix two thinkos.
      Constness tightening for move/link_temp_to_file()
      consolidate two copies of new style object header parsing code.
      pack-objects: re-validate data we copy from elsewhere.
      Revert "Convert git-annotate to use Git.pm"
      Revert "Git.pm: Introduce fast get_object() method"
      Revert "Make it possible to set up libgit directly (instead of from the environment)"
      pack-objects: fix thinko in revalidate code
      more lightweight revalidation while reusing deflated stream in packing
      unpack-objects desperately salvages objects from a corrupt pack
      revision.c: allow injecting revision parameters after setup_revisions().
      Teach rev-list an option to read revs from the standard input.
      Revert "daemon: add upload-tar service."
      Make apply --binary a no-op.
      diff --binary generates full index on binary files.
      Separate object listing routines out of rev-list
      pack-objects: run rev-list equivalent internally.
      pack-objects: further work on internal rev-list logic.
      pack-objects --unpacked=<existing pack> option.
      get_sha1_hex() micro-optimization
      archive: allow remote to have more formats than we understand.
      Move sideband client side support into reusable form.
      Move sideband server side support into reusable form.
      archive: force line buffered output to stderr
      Add --verbose to git-archive
      Teach --exec to git-archive --remote
      Prepare larger packet buffer for upload-pack protocol.
      Add sideband status report to git-archive protocol
      upload-archive: monitor child communication more carefully.
      builtin-archive.c: rename remote_request() to extract_remote_arg()
      pack-objects: document --revs, --unpacked and --all.
      http-fetch: fix alternates handling.
      unpack-objects -r: call it "recover".
      Document git-grep -[Hh]
      Define fallback PATH_MAX on systems that do not define one in <limits.h>
      Fix git-am safety checks
      http-fetch.c: consolidate code to detect missing fetch target
      Add ftp:// protocol support for git-http-fetch
      t1400: make test debuggable.
      apply --unidiff-zero: loosen sanity checks for --unidiff=0 patches
      builtin-grep: make pieces of it available as library.
      revision traversal: prepare for commit log match.
      revision traversal: --author, --committer, and --grep.
      repack: use only pack-objects, not rev-list.
      Update grep internal for grepping only in head/body
      git log: Unify header_filter and message_filter into one.
      Make hexval() available to others.
      sha1_name.c: understand "describe" output as a valid object name
      diff.c: second war on whitespace.
      git-apply: second war on whitespace.
      Add t5510 to test per branch configuration affecting git-fetch.
      Remove upload-tar and make git-tar-tree a thin wrapper to git-archive
      Deprecate merge-recursive.py
      diff --stat: allow custom diffstat output width.
      diff --stat: color output.
      An illustration of rev-list --parents --pretty=raw
      grep: free expressions and patterns when done.
      grep: fix --fixed-strings combined with expression.
      Contributed bash completion support for core Git tools.
      git-diff -B output fix.
      Remove -fPIC which was only needed for Git.xs
      GIT 1.4.3-rc1
      Makefile: install and clean merge-recur, still.
      escape tilde in Documentation/git-rev-parse.txt
      tar-tree deprecation: we eat our own dog food.
      gitweb: Make the Git logo link target to point to the homepage
      git-send-email: avoid uninitialized variable warning.
      cherry-pick: make -r the default
      Add WEBDAV timeout to http-fetch.
      Fix git-revert
      git-fetch --update-head-ok typofix
      git-pull: we say commit X, not X commit.
      git.spec.in: perl subpackage is installed in perl_vendorlib not vendorarch
      apply --numstat -z: line termination fix.
      t4015: work-around here document problem on Cygwin.
      Revert "move pack creation to version 3"

Linus Torvalds (10):
      Relative timestamps in git log
      git-fsck-objects: lacking default references should not be fatal
      Fix git-fsck-objects SIGSEGV/divide-by-zero
      Add "-h/-H" parsing to "git grep"
      Allow multiple "git_path()" uses
      git-log --author and --committer are not left-anchored by default
      Clean up approxidate() in preparation for fixes
      Fix approxidate() to understand more extended numbers
      diff --stat=width[,name-width]: allow custom diffstat output width.
      Fix approxidate() to understand 12:34 AM/PM are 00:34 and 12:34

Liu Yubao (1):
      Fix duplicate xmalloc in builtin-add

Luben Tuikov (22):
      gitweb: git_tree displays blame based on repository config
      gitweb: bugfix: git_commit and git_commitdiff parents
      gitweb: blame table row no highlight fix
      gitweb: bugfix: commitdiff regression
      gitweb: bugfix: git_print_page_path() needs the hash base
      gitweb: tree view: eliminate redundant "blob"
      gitweb: Remove redundant "tree" link
      gitweb: extend blame to show links to diff and previous
      Revert "gitweb: extend blame to show links to diff and previous"
      gitweb: Remove excessively redundant entries from git_difftree_body
      gitweb: Add history and blame to git_difftree_body()
      gitweb: "alternate" starts with shade (i.e. 1)
      gitweb: Remove redundant "commit" link from shortlog
      gitweb: Factor out gitweb_have_snapshot()
      gitweb: Add snapshot to shortlog
      gitweb: Don't use quotemeta on internally generated strings
      gitweb: Remove redundant "commit" from history
      gitweb: History: blob and tree are first, then commitdiff, etc
      gitweb: tree view: hash_base and hash are now context sensitive
      gitweb: Escape ESCAPE (\e) character
      gitweb: Do not print "log" and "shortlog" redundantly in commit view
      gitweb: blame: Minimize vertical table row padding

Markus Amsler (1):
      git-imap-send: Strip smtp From_ header from imap message.

Martin Langhoff (1):
      git-repack: create new packs inside $GIT_DIR, not cwd

Martin Waitz (16):
      gitweb: fill in gitweb configuration by Makefile
      gitweb: use out-of-line GIT logo.
      gitweb: provide function to format the URL for an action link.
      gitweb: consolidate action URL generation.
      gitweb: continue consolidation of URL generation.
      gitweb: support for "fp" parameter.
      gitweb: support for / as home_link.
      gitweb: fix project list if PATH_INFO=="/".
      gitweb: more support for PATH_INFO based URLs
      gitweb: fix uninitialized variable warning.
      gitweb: fix display of trees via PATH_INFO.
      gitweb: document webserver configuration for common gitweb/repo URLs.
      git-commit: cleanup unused function.
      git-commit: fix coding style.
      test-lib: separate individual test better in verbose mode.
      paginate git-diff by default

Matthias Kestenholz (6):
      Make git-name-rev a builtin
      Make git-pack-objects a builtin
      Make git-unpack-objects a builtin
      Make git-symbolic-ref a builtin
      Add gitweb.cgi to .gitignore
      Check if pack directory exists prior to descending into it

Matthias Lederhofer (12):
      pager: environment variable GIT_PAGER to override PAGER
      gitweb: use a hash to lookup the sub for an action
      gitweb: require $ENV{'GITWEB_CONFIG'}
      gitweb: check if HTTP_ACCEPT is really set
      gitweb: fix commitdiff_plain for root commits
      gitweb: fix $project usage
      gitweb: do not use 'No such directory' error message
      gitweb: export options
      gitweb: fix warnings in PATH_INFO code and add export_ok/strict_export
      gitweb fix validating pg (page) parameter
      format-patch: use cwd as default output directory
      git-format-patch: fix bug using -o in subdirectories

Nicolas Pitre (4):
      move pack creation to version 3
      many cleanups to sha1_file.c
      add commit count options to git-shortlog
      atomic write for sideband remote messages

Paul Mackerras (10):
      gitk: Minor cleanups
      gitk: Recompute ancestor/descendent heads/tags when rereading refs
      gitk: Add a row context-menu item for creating a new branch
      gitk: Add a context menu for heads
      gitk: Fix a couple of buglets in the branch head menu items
      gitk: Add a menu item for cherry-picking commits
      gitk: Update preceding/following tag info when creating a tag
      gitk: Improve responsiveness while reading and layout out the graph
      gitk: Fix some bugs in the new cherry-picking code
      diff-index --cc shows a 3-way diff between HEAD, index and working tree.

Pavel Roskin (3):
      Fix probing for already installed Error.pm
      Delete manuals if compiling without docs
      Make perl interface a separate package

Petr Baudis (48):
      Introduce Git.pm (v4)
      Git.pm: Implement Git::exec_path()
      Git.pm: Call external commands using execv_git_cmd()
      Git.pm: Implement Git::version()
      Add Error.pm to the distribution
      Git.pm: Better error handling
      Git.pm: Handle failed commands' output
      Git.pm: Enhance the command_pipe() mechanism
      Git.pm: Implement options for the command interface
      Git.pm: Add support for subdirectories inside of working copies
      Convert git-mv to use Git.pm
      Git.pm: assorted build related fixes.
      Git.pm: Try to support ActiveState output pipe
      Git.pm: Swap hash_object() parameters
      Git.pm: Fix Git->repository("/somewhere/totally/elsewhere")
      Git.pm: Support for perl/ being built by a different compiler
      Git.pm: Remove PerlIO usage from Git.xs
      Git.pm: Avoid ppport.h
      Git.pm: Don't #define around die
      Use $GITPERLLIB instead of $RUNNING_GIT_TESTS and centralize @INC munging
      Git.pm: Add config() method
      Convert git-send-email to use Git.pm
      Git.pm: Introduce ident() and ident_person() methods
      Make it possible to set up libgit directly (instead of from the environment)
      Git.pm: Introduce fast get_object() method
      Convert git-annotate to use Git.pm
      Eliminate Scalar::Util usage from private-Error.pm
      Fix showing of path in tree view
      gitweb: Link (HEAD) tree for each project from projects list
      gitweb: More per-view navigation bar links
      gitweb: Link to tree instead of snapshot in shortlog
      gitweb: Link to latest tree from the head line in heads list
      gitweb: Link to associated tree from a particular log item in full log view
      gitweb: Rename "plain" labels to "raw"
      gitweb: Relabel "head" as "HEAD"
      Make path in tree view look nicer
      gitweb: Fix tree link associated with each commit log entry.
      gitweb: Fix @git_base_url_list usage
      Fix snapshot link in tree view
      Git.pm: Kill Git.xs for now
      Deprecate git-resolve.sh
      gitweb: Consolidate escaping/validation of query string
      gitweb: fix over-eager application of esc_html().
      Show snapshot link in shortlog only if have_snapsho
      gitweb: Separate (new) and (deleted) in commitdiff by a space
      gitweb: Handle commits with empty commit messages more reasonably
      gitweb: [commit view] Do not suppress commitdiff link in root commit
      svnimport: Fix broken tags being generated

Pierre Habouzit (7):
      Fix a comparison bug in diff-delta.c
      avoid to use error that shadows the function name, use err instead.
      git_dir holds pointers to local strings, hence MUST be const.
      missing 'static' keywords
      remove ugly shadowing of loop indexes in subloops.
      use name[len] in switch directly, instead of creating a shadowed variable.
      n is in fact unused, and is later shadowed.

Randal L. Schwartz (1):
      builtin-upload-archive.c broken on openbsd

Rene Scharfe (21):
      git-verify-pack: make builtin
      Axe the last ent
      Add write_or_die(), a helper function
      Add git-zip-tree
      git-cherry: remove unused variable
      git-reset: remove unused variable
      Add git-zip-tree to .gitignore
      git-archive: make compression level of ZIP archives configurable
      Use xstrdup instead of strdup in builtin-{tar,zip}-tree.c
      git-archive: inline default_parse_extra()
      git-tar-tree: devolve git-tar-tree into a wrapper for git-archive
      Remove git-zip-tree
      Rename builtin-zip-tree.c to archive-zip.c
      git-tar-tree: Remove duplicate git_config() call
      git-tar-tree: Move code for git-archive --format=tar to archive-tar.c
      git-tar-tree: don't RUN_SETUP
      Documentation: add missing second colons and remove a typo
      Add hash_sha1_file()
      Make write_sha1_file_prepare() static
      Make write_sha1_file_prepare() void
      Replace open-coded version of hash_sha1_file()

Robin Rosenberg (3):
      Quote arguments to tr in test-lib
      Make cvsexportcommit remove files.
      Error in test description of t1200-tutorial

Santi Béjar (4):
      Fetch: default remote repository from branch properties
      fetch: get the remote branches to merge from the branch properties
      Add test for the default merges in fetch.
      fetch: Reset remote refs list each time fetch_main is called

Sasha Khapyorsky (3):
      Trivial support for cloning and fetching via ftp://.
      git-svnimport: Parse log message for Signed-off-by: lines
      http/ftp: optionally ask curl to not use EPSV command

Sergey Vlasov (2):
      Documentation: Fix howto/revert-branch-rebase.html generation
      git-svn: Fix fetch --no-ignore-externals with GIT_SVN_NO_LIB=1

Shawn Pearce (15):
      Verify we know how to read a pack before trying to using it.
      Remove unnecessary forward declaration of unpack_entry.
      Convert memcpy(a,b,20) to hashcpy(a,b).
      Convert memcpy(a,b,20) to hashcpy(a,b).
      Reorganize/rename unpack_non_delta_entry to unpack_compressed_entry.
      Reuse compression code in unpack_compressed_entry.
      Cleanup unpack_entry_gently and friends to use type_name array.
      Cleanup unpack_object_header to use only offsets.
      Convert unpack_entry_gently and friends to use offsets.
      Replace uses of strdup with xstrdup.
      Allow 'svn fetch' on '(no date)' revisions in Subversion.
      Allow '(no author)' in git-svn's authors file.
      Ignore executable bit when adding files if filemode=0.
      Allow git-checkout when on a non-existant branch.
      Corrected copy-and-paste thinko in ignore executable bit test case.

Tilman Sauerbeck (2):
      Indentation fix.
      Added support for dropping privileges to git-daemon.

Timo Hirvonen (2):
      --name-only, --name-status, --check and -s are mutually exclusive
      Remove awkward compatibility warts

Ville Skyttä (1):
      Be nicer if git executable is not installed

Yasushi SHOJI (4):
      gitweb: configurable home link string
      gitweb: Decode long title for link tooltips
      gitweb: refactor decode() for utf8 conversion
      clone: the given repository dir should be relative to $PWD

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:38                                 ` Johannes Schindelin
@ 2006-10-18 23:54                                   ` Petr Baudis
  2006-10-19  0:33                                     ` Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-18 23:54 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Shawn Pearce, git

  Hi,

Dear diary, on Thu, Oct 19, 2006 at 01:38:45AM CEST, I got a letter
where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> On Wed, 18 Oct 2006, Shawn Pearce wrote:
> 
> > Today Git doesn't run natively on Windows.
> 
> As I mentioned some time ago, I started a branch on MinGW. It works quite 
> well for the moment, but it lacks fork() emulation, and glob() emulation. 
> And I lack the time to continue working on it.

  care to publish it somewhere, e.g. on repo.or.cz?

  (P.S., have fun in Prague! Too bad I won't be around over the weekend.
:-( )

  Thanks,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 23:18                                                   ` Nicolas Pitre
  2006-10-18 23:50                                                     ` Johannes Schindelin
@ 2006-10-19  0:07                                                     ` Linus Torvalds
  2006-10-19  0:15                                                       ` Linus Torvalds
                                                                         ` (3 more replies)
  1 sibling, 4 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19  0:07 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, Shawn Pearce, git



On Wed, 18 Oct 2006, Nicolas Pitre wrote:
> 
> If you use builtin-unpack-objects.c from next, you'll be able to 
> generate the pack index pretty easily as well, as all the needed info is 
> stored in the obj_list array.  Just need to append objects remaining on 
> the delta_list array to the end of the pack, sort the obj_list by sha1 
> and write the index.

Actually, I've hit an impasse.

The index isn't the problem. The problem is actually writing the resultant 
pack-file itself in one go.

The silly thing is, the pack-file contains the number of entries in the 
header. That's a silly problem, because the _natural_ way to turn a thin 
pack into a normal pack would be to just add the missing objects from the 
local store into the resulting pack. But we don't _know_ how many such 
missing objects there are, until we've gone through the whole source pack. 

So you can't easily do a streaming "write the result as you go along" 
version using that approach.
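
To make the header problem concrete, here is a minimal sketch (not code from
Git or from the patches in this thread). It assumes the usual pack layout: a
12-byte header holding the signature "PACK", a version number and the object
count, with both numbers stored as 32-bit big-endian values. The helper names
write_pack_header() and parse_pack_header() are invented for the example.

```c
#include <stdint.h>
#include <string.h>

#define PACK_HEADER_SIZE 12

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/* Read a 32-bit value stored in network (big-endian) byte order. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: fill buf with "PACK", the version and the object
 * count.  The count has to be known here, before any object is written.
 */
static void write_pack_header(unsigned char buf[PACK_HEADER_SIZE],
			      uint32_t version, uint32_t nr_objects)
{
	memcpy(buf, "PACK", 4);
	put_be32(buf + 4, version);
	put_be32(buf + 8, nr_objects);
}

/* Hypothetical helper: returns 0 on success, -1 on a bad signature. */
static int parse_pack_header(const unsigned char buf[PACK_HEADER_SIZE],
			     uint32_t *version, uint32_t *nr_objects)
{
	if (memcmp(buf, "PACK", 4))
		return -1;
	*version = get_be32(buf + 4);
	*nr_objects = get_be32(buf + 8);
	return 0;
}
```

Since the count sits at bytes 8..11, ahead of every object, a writer that
streams its output has to commit to a number before it has seen the whole
source pack.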

So there's _another_ way of fixing a thin pack: it's to expand the objects 
without a base into non-delta objects, and keep the number of objects 
in the pack the same. But _again_, we don't actually know which ones to 
expand until it's too late.

The end result? I can expand them all (I have a patch that does that). Or 
I could leave as deltas the ones I have already seen the base for in the 
pack-file (I don't have that yet, but that should be a SMOP). But I'm not 
very happy with even the latter choice, because it really potentially 
expands things that didn't _need_ expansion, they just got expanded 
because we hadn't seen the base object yet.

So I'll happily send my patches to anybody who wants to try (I don't write 
the index file yet, but it should be easy to add), but I'm getting the 
feeling that "builtin-unpack-objects.c" is the wrong tool to use for this, 
because it's very much designed for streaming.

It would probably be better to start from "index-pack.c" instead, which is 
already a multi-pass thing, and wouldn't have had any of the problems I 
hit. 

Gaah.

> Pretty trivial indeed.

So it's conceptually totally trivial to rewrite a pack-file as another 
pack-file, but at least so far, it's turned out to be less trivial in 
practice (or at least in a single pass, without holding everything in 
memory, which I definitely do _not_ want to do).

So I'm leaving this for today, and perhaps coming back to it tomorrow with 
a fresh eye.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
@ 2006-10-19  0:15                                                       ` Linus Torvalds
  2006-10-19  0:31                                                       ` Johannes Schindelin
                                                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19  0:15 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, Shawn Pearce, git



On Wed, 18 Oct 2006, Linus Torvalds wrote:
> 
> So I'll happily send my patches to anybody who wants to try (I don't write 
> the index file yet, but it should be easy to add), but I'm getting the 
> feeling that "builtin-unpack-objects.c" is the wrong tool to use for this, 
> because it's very much designed for streaming.

A potentially even simpler way would probably be to literally just use 
"git-pack-objects" directly, and just have a very special mode that allows 
mapping the thin pack as if it was a real pack (ie basically 
pre-populating a fake pack entry, where the fake part comes from adding 
the missing objects by hand to the mapping).

So many ways to do it, so little real motivation ;)

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
  2006-10-19  0:15                                                       ` Linus Torvalds
@ 2006-10-19  0:31                                                       ` Johannes Schindelin
  2006-10-19  0:46                                                         ` Linus Torvalds
  2006-10-19  3:01                                                       ` Nicolas Pitre
  2006-10-19  3:46                                                       ` Junio C Hamano
  3 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-19  0:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, Shawn Pearce, git

Hi,

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> The silly thing is, the pack-file contains the number of entries in the 
> header.

You do not write this to stdout, right? Why not just come back and correct 
the number of objects? Of course, the SHA1 has to be calculated _after_ 
that.
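
A rough sketch of that idea, assuming the output is a seekable file and that
the object count is the 32-bit big-endian field at byte offset 8 of the pack
header. The function name fixup_pack_count() is invented for the example and
is not an existing Git function.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: overwrite the object count in the header of a pack
 * that has already been written to a seekable stream, then move back to the
 * end of the file.  Returns 0 on success, -1 on error.
 *
 * Any checksum over the file contents must be computed after this fixup,
 * since the bytes at offset 8..11 change.
 */
static int fixup_pack_count(FILE *f, uint32_t nr_objects)
{
	unsigned char be[4];

	be[0] = (unsigned char)(nr_objects >> 24);
	be[1] = (unsigned char)(nr_objects >> 16);
	be[2] = (unsigned char)(nr_objects >> 8);
	be[3] = (unsigned char)nr_objects;

	if (fseek(f, 8L, SEEK_SET) != 0)
		return -1;
	if (fwrite(be, 1, sizeof(be), f) != sizeof(be))
		return -1;
	if (fflush(f) != 0)
		return -1;
	return fseek(f, 0L, SEEK_END) != 0 ? -1 : 0;
}
```

This cannot work when the pack goes to a pipe or to stdout, because fseek()
fails on a stream that is not seekable.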

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:54                                   ` Petr Baudis
@ 2006-10-19  0:33                                     ` Johannes Schindelin
  0 siblings, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-19  0:33 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Shawn Pearce, git

Hi,

On Thu, 19 Oct 2006, Petr Baudis wrote:

> Dear diary, on Thu, Oct 19, 2006 at 01:38:45AM CEST, I got a letter
> where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> > On Wed, 18 Oct 2006, Shawn Pearce wrote:
> > 
> > > Today Git doesn't run natively on Windows.
> > 
> > As I mentioned some time ago, I started a branch on MinGW. It works quite 
> > well for the moment, but it lacks fork() emulation, and glob() emulation. 
> > And I lack the time to continue working on it.
> 
>   care to publish it somewhere, e.g. on repo.or.cz?

It is way too dirty for that. I would only dare give it to somebody in return 
for the promise to clean everything up.

BTW I completely forgot that in the absence of poll() from MinGW, all the 
networking code is actually just wrapped into "return -1;" functions.

>   (P.S., have fun in Prague! Too bad I won't be around over the weekend.
> :-( )

Pity. You seem to have good connections...

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:31                                                       ` Johannes Schindelin
@ 2006-10-19  0:46                                                         ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19  0:46 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Nicolas Pitre, Junio C Hamano, Shawn Pearce, Git Mailing List



On Thu, 19 Oct 2006, Johannes Schindelin wrote:
> 
> You do not write this to stdout, right? Why not just come back and correct 
> the number of objects? Of course, the SHA1 has to be calculated _after_ 
> that.

That's the issue. I wanted the pack-file thing to look as similar to the 
old code as possible. And that means using the "sha1write()" interfaces, 
which calculate the SHA1 checksum _as_ we write.

So yes, I wanted to do it all in one phase.
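
To illustrate the "checksum _as_ we write" idea, here is a minimal sketch, not the actual csum-file.c code. The names csum_file, csum_init and csum_write are hypothetical, and FNV-1a is used only as a small stand-in for SHA-1 so the example needs nothing beyond the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct sha1file: a stream plus a running hash. */
struct csum_file {
	FILE *fp;
	uint32_t hash;	/* FNV-1a state; the real code keeps a SHA-1 context */
};

static void csum_init(struct csum_file *f, FILE *fp)
{
	assert(fp != NULL);
	f->fp = fp;
	f->hash = 2166136261u;	/* FNV-1a offset basis */
}

/* Write the bytes and fold them into the running hash in the same pass. */
static int csum_write(struct csum_file *f, const void *buf, size_t count)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < count; i++) {
		f->hash ^= p[i];
		f->hash *= 16777619u;	/* FNV-1a prime */
	}
	return fwrite(buf, 1, count, f->fp) == count ? 0 : -1;
}
```

Because the hash state only ever moves forward, a header that has already passed through csum_write() cannot be corrected later without hashing the whole stream again, which is why a one-phase writer needs the entry count up front.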

Anyway, if anybody is interested, here's a series of four patches that do 
something that _almost_ works. I save away the SHA1's and the offsets so 
that I could write an index too, but I didn't actually do that part.
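
For the part that was not done, a rough sketch of collecting (SHA1, offset) pairs and ordering them by SHA1 could look like the following. It only covers the bookkeeping and does not attempt the on-disk index format. The names pack_index, index_add and index_sort are made up for this sketch; the growth policy mirrors the one in the patch below.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct index_entry {
	unsigned long offset;	/* where the object starts in the pack */
	unsigned char sha1[20];
};

/* Hypothetical container; the patch uses file-scope variables instead. */
struct pack_index {
	struct index_entry *entries;
	size_t nr, alloc;
};

static void index_add(struct pack_index *idx, const unsigned char *sha1,
		      unsigned long offset)
{
	if (idx->nr >= idx->alloc) {
		size_t alloc = (idx->alloc + 64) * 3 / 2;
		struct index_entry *grown =
			realloc(idx->entries, alloc * sizeof(*grown));
		if (!grown)
			abort();
		idx->entries = grown;
		idx->alloc = alloc;
	}
	assert(idx->nr < idx->alloc);
	memcpy(idx->entries[idx->nr].sha1, sha1, 20);
	idx->entries[idx->nr].offset = offset;
	idx->nr++;	/* advance the shared count, not a local copy */
}

static int cmp_sha1(const void *a, const void *b)
{
	const struct index_entry *x = a, *y = b;
	return memcmp(x->sha1, y->sha1, 20);
}

/* Order the entries by SHA1, as a lookup table would need. */
static void index_sort(struct pack_index *idx)
{
	if (idx->nr)
		qsort(idx->entries, idx->nr, sizeof(*idx->entries), cmp_sha1);
}
```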

But with this, I can rewrite a pack-file "in flight", and the end result 
can then have "git index-pack" run on it, and used as a pack. It's just 
that there are no deltas left because of some of the silly problems I 
outlined (the code to write out deltas is actually there and just 
disabled - it works, but it leaves the end result with unsatisfied 
deltas again).

		Linus
---
commit 4efd9b0f44635b3075c9aad6d1cc8830e3abded3
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 17:22:04 2006 -0700

    Fix up csum-file interfaces
    
    Add "const" where appropriate
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/csum-file.c b/csum-file.c
index b7174c6..3237228 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -47,7 +47,7 @@ int sha1close(struct sha1file *f, unsign
 	return 0;
 }
 
-int sha1write(struct sha1file *f, void *buf, unsigned int count)
+int sha1write(struct sha1file *f, const void *buf, unsigned int count)
 {
 	while (count) {
 		unsigned offset = f->offset;
@@ -115,7 +115,7 @@ struct sha1file *sha1fd(int fd, const ch
 	return f;
 }
 
-int sha1write_compressed(struct sha1file *f, void *in, unsigned int size)
+int sha1write_compressed(struct sha1file *f, const void *in, unsigned int size)
 {
 	z_stream stream;
 	unsigned long maxsize;
@@ -127,7 +127,7 @@ int sha1write_compressed(struct sha1file
 	out = xmalloc(maxsize);
 
 	/* Compress it */
-	stream.next_in = in;
+	stream.next_in = (void *) in;
 	stream.avail_in = size;
 
 	stream.next_out = out;
diff --git a/csum-file.h b/csum-file.h
index 3ad1a99..fee8589 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -13,7 +13,7 @@ struct sha1file {
 extern struct sha1file *sha1fd(int fd, const char *name);
 extern struct sha1file *sha1create(const char *fmt, ...) __attribute__((format (printf, 1, 2)));
 extern int sha1close(struct sha1file *, unsigned char *, int);
-extern int sha1write(struct sha1file *, void *, unsigned int);
-extern int sha1write_compressed(struct sha1file *, void *, unsigned int);
+extern int sha1write(struct sha1file *, const void *, unsigned int);
+extern int sha1write_compressed(struct sha1file *, const void *, unsigned int);
 
 #endif
\f
commit c2c8480b05a75d93f78a0ddd1cce18c6864738eb
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 17:20:53 2006 -0700

    Make some of the pack-writing helper functions available
    
    string_to_type() and encode_header() are useful in general.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 96c069a..ea39bf3 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -220,6 +220,20 @@ static void *delta_against(void *buf, un
 	return delta_buf;
 }
 
+enum object_type string_to_type(const char *type, const unsigned char *sha1)
+{
+	if (!strcmp(type, commit_type))
+		return OBJ_COMMIT;
+	if (!strcmp(type, tree_type))
+		return OBJ_TREE;
+	if (!strcmp(type, blob_type))
+		return OBJ_BLOB;
+	if (!strcmp(type, tag_type))
+		return OBJ_TAG;
+	die("strange object %s of unknown type %s",
+		    sha1_to_hex(sha1), type);
+}
+
 /*
  * The per-object header is a pretty dense thing, which is
  *  - first byte: low four bits are "size", then three bits of "type",
@@ -227,7 +241,7 @@ static void *delta_against(void *buf, un
  *  - each byte afterwards: low seven bits are size continuation,
  *    with the high bit being "size continues"
  */
-static int encode_header(enum object_type type, unsigned long size, unsigned char *hdr)
+int encode_header(enum object_type type, unsigned long size, unsigned char *hdr)
 {
 	int n = 1;
 	unsigned char c;
@@ -943,17 +957,7 @@ static void check_object(struct object_e
 		die("unable to get type of object %s",
 		    sha1_to_hex(entry->sha1));
 
-	if (!strcmp(type, commit_type)) {
-		entry->type = OBJ_COMMIT;
-	} else if (!strcmp(type, tree_type)) {
-		entry->type = OBJ_TREE;
-	} else if (!strcmp(type, blob_type)) {
-		entry->type = OBJ_BLOB;
-	} else if (!strcmp(type, tag_type)) {
-		entry->type = OBJ_TAG;
-	} else
-		die("unable to pack object %s of type %s",
-		    sha1_to_hex(entry->sha1), type);
+	entry->type = string_to_type(type, entry->sha1);
 }
 
 static unsigned int check_delta_limit(struct object_entry *me, unsigned int n)
diff --git a/pack.h b/pack.h
index eb07b03..346a430 100644
--- a/pack.h
+++ b/pack.h
@@ -15,6 +15,9 @@ struct pack_header {
 	unsigned int hdr_entries;
 };
 
+enum object_type string_to_type(const char *type, const unsigned char *sha1);
+int encode_header(enum object_type type, unsigned long size, unsigned char *hdr);
+
 extern int verify_pack(struct packed_git *, int);
 extern int check_reuse_pack_delta(struct packed_git *, unsigned long,
 				  unsigned char *, unsigned long *,
\f
commit 94d620067b4a4179656c0ce347cb87be52a9d67f
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 15:44:40 2006 -0700

    git-unpack-objects: pass in the original delta data when writing the object
    
    This does nothing right now, but if we want to instead of loose objects
    write a new "verified packfile" with an index, this lets us do that instead.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index 4f96bca..bbb6e21 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -109,7 +109,8 @@ static void add_delta_to_list(unsigned c
 
 static void added_object(unsigned char *sha1, const char *type, void *data, unsigned long size);
 
-static void write_object(void *buf, unsigned long size, const char *type)
+static void write_object(void *buf, unsigned long size, const char *type,
+	unsigned char *base, void *delta, unsigned long delta_size)
 {
 	unsigned char sha1[20];
 	if (write_sha1_file(buf, size, type, sha1) < 0)
@@ -117,7 +118,7 @@ static void write_object(void *buf, unsi
 	added_object(sha1, type, buf, size);
 }
 
-static void resolve_delta(const char *type,
+static void resolve_delta(const char *type, unsigned char *base_sha1,
 			  void *base, unsigned long base_size,
 			  void *delta, unsigned long delta_size)
 {
@@ -129,8 +130,8 @@ static void resolve_delta(const char *ty
 			     &result_size);
 	if (!result)
 		die("failed to apply delta");
+	write_object(result, result_size, type, base_sha1, delta, delta_size);
 	free(delta);
-	write_object(result, result_size, type);
 	free(result);
 }
 
@@ -143,7 +144,7 @@ static void added_object(unsigned char *
 		if (!hashcmp(info->base_sha1, sha1)) {
 			*p = info->next;
 			p = &delta_list;
-			resolve_delta(type, data, size, info->delta, info->size);
+			resolve_delta(type, sha1, data, size, info->delta, info->size);
 			free(info);
 			continue;
 		}
@@ -164,7 +165,7 @@ static void unpack_non_delta_entry(enum 
 	default: die("bad type %d", kind);
 	}
 	if (!dry_run && buf)
-		write_object(buf, size, type);
+		write_object(buf, size, type, NULL, NULL, 0);
 	free(buf);
 }
 
@@ -197,7 +198,7 @@ static void unpack_delta_entry(unsigned 
 		has_errors = 1;
 		return;
 	}
-	resolve_delta(type, base, base_size, delta_data, delta_size);
+	resolve_delta(type, base_sha1, base, base_size, delta_data, delta_size);
 	free(base);
 }
 
diff --git a/date.c b/date.c
index 1825922..0b06994 100644
--- a/date.c
+++ b/date.c
@@ -657,6 +657,7 @@ static const struct typelen {
 	{ "hours", 60*60 },
 	{ "days", 24*60*60 },
 	{ "weeks", 7*24*60*60 },
+	{ "fortnights", 2*7*24*60*60 },
 	{ NULL }
 };	
 
\f
commit 636210e7fcceb7297ccf0fc54291bb1c8356f0d3
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 17:23:06 2006 -0700

    Make "unpack-objects" able to write a single pack-file instead
    
    This is idiotic. It writes everything undeltified, which is
    horrid. I need a brain.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index bbb6e21..f139308 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -7,11 +7,12 @@ #include "blob.h"
 #include "commit.h"
 #include "tag.h"
 #include "tree.h"
+#include "csum-file.h"
 
 #include <sys/time.h>
 
 static int dry_run, quiet, recover, has_errors;
-static const char unpack_usage[] = "git-unpack-objects [-n] [-q] [-r] < pack-file";
+static const char unpack_usage[] = "git-unpack-objects [-n] [-q] [-r] [--repack=pack-name] < pack-file";
 
 /* We always read in 4kB chunks. */
 static unsigned char buffer[4096];
@@ -87,6 +88,56 @@ static void *get_data(unsigned long size
 	return buf;
 }
 
+static struct sha1file *pack_file;
+static unsigned long pack_file_offset;
+
+struct index_entry {
+	unsigned long offset;
+	unsigned char sha1[20];
+};
+
+static unsigned int index_nr, index_alloc;
+static struct index_entry **index_array;
+
+static void add_pack_index(unsigned char *sha1)
+{
+	struct index_entry *entry;
+	int nr = index_nr;
+	if (nr >= index_alloc) {
+		index_alloc = (index_alloc + 64) * 3 / 2;
+		index_array = xrealloc(index_array, index_alloc * sizeof(*index_array));
+	}
+	entry = xmalloc(sizeof(*entry));
+	entry->offset = pack_file_offset;
+	hashcpy(entry->sha1, sha1);
+	index_array[index_nr++] = entry;
+}
+
+static void write_pack_delta(const unsigned char *base, const void *delta, unsigned long delta_size)
+{
+	unsigned char header[10];
+	unsigned hdrlen, datalen;
+
+	hdrlen = encode_header(OBJ_DELTA, delta_size, header);
+	sha1write(pack_file, header, hdrlen);
+	sha1write(pack_file, base, 20);
+	datalen = sha1write_compressed(pack_file, delta, delta_size);
+
+	pack_file_offset += hdrlen + 20 + datalen;
+}
+
+static void write_pack_object(const char *type, const unsigned char *sha1, const void *buf, unsigned long size)
+{
+	unsigned char header[10];
+	unsigned hdrlen, datalen;
+
+	hdrlen = encode_header(string_to_type(type, sha1), size, header);
+	sha1write(pack_file, header, hdrlen);
+	datalen = sha1write_compressed(pack_file, buf, size);
+
+	pack_file_offset += hdrlen + datalen;
+}
+
 struct delta_info {
 	unsigned char base_sha1[20];
 	unsigned long size;
@@ -113,7 +164,16 @@ static void write_object(void *buf, unsi
 	unsigned char *base, void *delta, unsigned long delta_size)
 {
 	unsigned char sha1[20];
-	if (write_sha1_file(buf, size, type, sha1) < 0)
+
+	if (pack_file) {
+		if (hash_sha1_file(buf, size, type, sha1) < 0)
+			die("failed to compute object hash");
+		add_pack_index(sha1);
+		if (0 && base)
+			write_pack_delta(base, delta, delta_size);
+		else
+			write_pack_object(type, sha1, buf, size);
+	} else if (write_sha1_file(buf, size, type, sha1) < 0)
 		die("failed to write object");
 	added_object(sha1, type, buf, size);
 }
@@ -254,7 +314,7 @@ static void unpack_one(unsigned nr, unsi
 	}
 }
 
-static void unpack_all(void)
+static void unpack_all(const char *repack)
 {
 	int i;
 	struct pack_header *hdr = fill(sizeof(struct pack_header));
@@ -266,17 +326,32 @@ static void unpack_all(void)
 		die("unknown pack file version %d", ntohl(hdr->hdr_version));
 	fprintf(stderr, "Unpacking %d objects\n", nr_objects);
 
+	if (repack) {
+		struct pack_header newhdr;
+		newhdr.hdr_signature = htonl(PACK_SIGNATURE);
+		newhdr.hdr_version = htonl(PACK_VERSION);
+		newhdr.hdr_entries = htonl(nr_objects);
+		
+		pack_file = sha1create("%s.pack", repack);
+		sha1write(pack_file, &newhdr, sizeof(newhdr));
+		pack_file_offset = sizeof(newhdr);
+	}
+		
+
 	use(sizeof(struct pack_header));
 	for (i = 0; i < nr_objects; i++)
 		unpack_one(i+1, nr_objects);
 	if (delta_list)
 		die("unresolved deltas left after unpacking");
+	if (repack)
+		sha1close(pack_file, NULL, 1);
 }
 
 int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	unsigned char sha1[20];
+	const char *repack = NULL;
 
 	git_config(git_default_config);
 
@@ -298,6 +373,10 @@ int cmd_unpack_objects(int argc, const c
 				recover = 1;
 				continue;
 			}
+			if (!strncmp(arg, "--repack=", 9)) {
+				repack = arg + 9;
+				continue;
+			}
 			usage(unpack_usage);
 		}
 
@@ -305,7 +384,7 @@ int cmd_unpack_objects(int argc, const c
 		usage(unpack_usage);
 	}
 	SHA1_Init(&ctx);
-	unpack_all();
+	unpack_all(repack);
 	SHA1_Update(&ctx, buffer, offset);
 	SHA1_Final(sha1, &ctx);
 	if (hashcmp(fill(20), sha1))
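
As a companion to the per-object header comment quoted in the second patch (low four bits of size and three bits of type in the first byte, seven bits of size per following byte, high bit meaning "size continues"), here is a standalone sketch of that encoding with a matching decoder. The function names and the plain unsigned type code are assumptions for illustration; this is not a copy of encode_header().

```c
#include <assert.h>

/* Returns the number of header bytes written (at most 10 for a 64-bit size). */
static int encode_obj_header(unsigned type, unsigned long size,
			     unsigned char *hdr)
{
	int n = 1;
	unsigned char c;

	assert(type >= 1 && type <= 7);	/* must fit in three bits */
	c = (unsigned char)((type << 4) | (size & 15));
	size >>= 4;
	while (size) {
		*hdr++ = c | 0x80;	/* high bit: more size bytes follow */
		c = size & 0x7f;
		size >>= 7;
		n++;
	}
	*hdr = c;
	return n;
}

/* Inverse of the above; returns the number of header bytes consumed. */
static int decode_obj_header(const unsigned char *hdr, unsigned *type,
			     unsigned long *size)
{
	int n = 0;
	unsigned shift = 4;
	unsigned char c = hdr[n++];

	*type = (c >> 4) & 7;
	*size = c & 15;
	while (c & 0x80) {
		c = hdr[n++];
		*size |= (unsigned long)(c & 0x7f) << shift;
		shift += 7;
	}
	return n;
}
```

With type code 3 and size 100 this yields the two bytes 0xb4 0x06, and decoding them gives back the same type and size.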

^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:48                                   ` Johannes Schindelin
@ 2006-10-19  1:58                                     ` Charles Duffy
  2006-10-19 11:01                                       ` Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Charles Duffy @ 2006-10-19  1:58 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Johannes Schindelin wrote:
> So, the wonderful upside of plugins you described here are actually the 
> reason I will never, _never_ use bzr with plugins.
> 

I presume that for this reason you will also never, _never_ use a 
non-mainline branch of git -- even if its actual code only touches UI 
enhancements or something similarly non-core -- because third-party 
branches have the ability, in theory, to make changes to the core of the 
revision control system. And that you will never, _never_ use 
third-party wrappers because they might play LD_PRELOAD tricks. Or run 
any software with root privileges you haven't personally written. Or...

Sean's point that plugins are a comparatively minor win made inexpensive 
on account of bzr's use of Python is reasonable (though we may choose to 
differ on what level of value we attach to the utility). The claim that 
an extensibility mechanism should be rejected wholesale on account of 
being excessively powerful, on the other hand, is just silly.



(If you couldn't write a plugin that *didn't* touch the core, this would 
be a different story. This is, however, very much not the case).

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
  2006-10-19  0:15                                                       ` Linus Torvalds
  2006-10-19  0:31                                                       ` Johannes Schindelin
@ 2006-10-19  3:01                                                       ` Nicolas Pitre
  2006-10-19  3:46                                                       ` Junio C Hamano
  3 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-19  3:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Shawn Pearce, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 18 Oct 2006, Nicolas Pitre wrote:
> > 
> > If you use builtin-unpack-objects.c from next, you'll be able to 
> > generate the pack index pretty easily as well, as all the needed info is 
> > stored in the obj_list array.  Just need to append objects remaining on 
> > the delta_list array to the end of the pack, sort the obj_list by sha1 
> > and write the index.
> 
> Actually, I've hit an impasse.
> 
> The index isn't the problem. The problem is actually writing the resultant 
> pack-file itself in one go.
> 
> The silly thing is, the pack-file contains the number of entries in the 
> header. That's a silly problem, because the _natural_ way to turn a thin 
> pack into a normal pack would be to just add the missing objects from the 
> local store into the resulting pack. But we don't _know_ how many such 
> missing objects there are, until we've gone through the whole source pack. 
> 
> So you can't easily do a streaming "write the result as you go along" 
> version using that approach.

Hmmm.... unpack-objects receives a (possibly thin) pack over its stdin.  
That part has to be streamed.  But its output is currently always 
written to multiple files as separate objects.  So, while the input 
comes from a stream, the output doesn't have to.

In that case, why not just write the input directly to a temporary file, 
append the missing objects, seek back to adjust the object number, and 
finally run a SHA1_Update() on the whole thing?  This forces you to 
write everything and then read everything back, but this should not be 
too bad, especially since the written data is likely to still be cached.  
Once its final sha1sum is written, it just needs to be moved to the 
appropriate name.
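
A rough sketch of that two-pass idea, under some assumptions: the header is taken to be three 32-bit big-endian fields (signature, version, entries) as in struct pack_header, so the count sits at byte offset 8; FNV-1a stands in for SHA-1; and the helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ENTRIES_OFFSET 8L	/* signature (4) + version (4), assumed layout */

/* Seek back and overwrite the 32-bit big-endian entry count. */
static int fixup_entry_count(FILE *fp, uint32_t nr)
{
	unsigned char be[4] = {
		(unsigned char)(nr >> 24), (unsigned char)(nr >> 16),
		(unsigned char)(nr >> 8), (unsigned char)nr
	};

	assert(fp != NULL);
	if (fseek(fp, ENTRIES_OFFSET, SEEK_SET) != 0)
		return -1;
	if (fwrite(be, 1, sizeof(be), fp) != sizeof(be))
		return -1;
	return fflush(fp);
}

/* Second pass: re-read the whole file and checksum it (SHA-1 stand-in). */
static int checksum_file(FILE *fp, uint32_t *out)
{
	uint32_t h = 2166136261u;	/* FNV-1a offset basis */
	int ch;

	assert(fp != NULL);
	if (fseek(fp, 0L, SEEK_SET) != 0)
		return -1;
	while ((ch = getc(fp)) != EOF) {
		h ^= (unsigned char)ch;
		h *= 16777619u;		/* FNV-1a prime */
	}
	*out = h;
	return ferror(fp) ? -1 : 0;
}
```

The caller would write the incoming pack to a temporary file, append the missing objects, call fixup_entry_count() with the new total, and only then run checksum_file() to get the trailer value.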

> So there's _another_ way of fixing a thin pack: it's to expand the objects 
> without a base into non-delta objects, and keeping the number of objects 
> in the pack the same. But _again_, we don't actually know which ones to 
> expand until it's too late.
> 
> The end result? I can expand them all (I have a patch that does that). Or 
> I could leave as deltas the ones I have already seen the base for in the 
> pack-file (I don't have that yet, but that should be a SMOP). But I'm not 
> very happy with even the latter choice, because it really potentially 
> expands things that didn't _need_ expansion, they just got expanded 
> because we hadn't seen the base object yet.

Most base objects, well all of them nowadays, are written before their 
deltas.  So in practice the only objects that will get expanded are the 
deltas with a missing base.  Still, it is unfortunate.

> So I'll happily send my patches to anybody who wants to try (I don't write 
> the index file yet, but it should be easy to add), but I'm getting the 
> feeling that "builtin-unpack-objects.c" is the wrong tool to use for this, 
> because it's very much designed for streaming.
> 
> It would probably be better to start from "index-pack.c" instead, which is 
> already a multi-pass thing, and wouldn't have had any of the problems I 
> hit. 

But index-pack is totally incompatible with any streaming.  It mmap()s 
the whole pack and happily performs random accesses.  So you'd need to 
write the entire thin pack to disk anyway before it could work on it.  
This is not really better than the unpack-objects option.  At least 
unpack-objects is structured to perform work on the fly as data is 
received.

> Gaah.
> 
> > Pretty trivial indeed.
> 
> So it's conceptually totally trivial to rewrite a pack-file as another 
> pack-file, but at least so far, it's turned out to be less trivial in 
> practice (or at least in a single pass, without holding everything in 
> memory, which I definitely do _not_ want to do).
> 
> So I'm leaving this for today, and perhaps coming back to it tomorrow with 
> a fresh eye.

I'll have a look at your patches tomorrow as well.  I have many ideas 
brewing, including rendering index-pack obsolete since actually 
unpack-objects could do it all already (both tools have many concepts in 
common).


Nicolas

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18  3:35                             ` Linus Torvalds
@ 2006-10-19  3:10                               ` Aaron Bentley
  2006-10-19  5:21                                 ` Carl Worth
                                                   ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-19  3:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:

> For example, what happens is that:
>  - you like the simple revision numbers
>  - that in turn means that you can never allow a mainline-merge to be done 
>    by anybody else than the main maintainer

That's not true of bzr development.  The "main maintainer" that runs the
bzr.dev is an email bot.  It's not an integrator-- its work is purely
mechanical.  It can't resolve merge conflicts.

Most of the merge work is done in integration branches run by the core
developers.  Although Martin is our project leader, lays out ground
rules, and makes design decisions, he doesn't have to be involved in any
particular merge.

> The "main trunk matters" mentality (which has deep roots in CVS - don't 
> get me wrong, I don't think you're the first one to do this) is 
> fundamentally antithetical to a truly distributed system, because it 
> basically assumes that some maintainer is "more important" than others. 

Linus, if you got hit by a bus, it would still be a shock, and it would
still take time for the Linux world to recover.  Your insights and
talent, both technical and social, make you the most important kernel
developer.  And it stays that way because you deserve it.  Projects with
good leadership don't fork, or if they do, the fork withers and dies
pretty quickly.

It is fine to say all branches are equal from a technical perspective.
- From a social perspective, it's just not true.

The scale of Bazaar development is much smaller than the scale of kernel
development, so it doesn't make sense to maintain long-term divergent
branches like the mm tree.  We do occasionally have long-lived feature
branches, though.

> That special maintainer is the maintainer whose merge-trunk is followed, 
> and whose revision numbers don't change when they are merged back.

In bzr development, it's very rare for anyone's revision numbers to change.

> That may even be _true_ in many cases. But please do realize that it's a 
> real issue, and that it has real impact - it does two things:
> 
>  - it impacts the technology and workflow directly itself: "pull" and 
>    "merge" are different: a central maintainer would tend to do a "merge", 
>    and one more in the outskirts would tend to do more of a "pull", 
>    expecting his work to then be merged back to the "trunk" at some later 
>    point)

AFAIK, everyone who maintains long-lived branches in bzr uses "merge".

>  - it will result in _psychological_ damage, in the sense that there's 
>    always one group that is the "trunk" group, and while you can pass the 
>    baton around (like the perl people do), it's always clear who sits 
>    centrally.

As I mentioned earlier, there are four people who each run their own
integration branches and make decisions about what gets merged.  No baton.

> 
> Maybe this is fine. It's certainly how most projects tend to work. 
> 
> I'll just point out that one of my design goals for git was to make every 
> single repository 100% equal. That means that there MUST NOT be a "trunk", 
> or a special line of development. There is no "vendor branch".

I think you're implying that on a technical level, bzr doesn't support
this.  But it does.  Every published repository has unique identifiers
for every revision on its mainline, and it's exceedingly uncommon for
these to change.  There are special procedures to maintain bzr.dev, but
there's nothing technically unique about it.  People develop against
bzr.dev rather than my integration branch, because they have
non-technical reasons for wanting their changes to be merged into
bzr.dev, not my integration branch.

> It's 
> something that a lot of people on the git lists understand now, but it 
> took a while for it to sink in - people used to believe that the "first 
> parent" of a merge was somehow special, and I had to point out several 
> times on the git list that no, that's not how it works - because the merge 
> might have been done by somebody _else_ than the person who you think of 
> as being "on the trunk".

On an actively-developed bzr branch, the first parent *is* special:
- - it's a revision that you committed
- - the diff between a revision and its first parent is the same as the
  diff that would be produced just before it was committed.

> So when I say that your "simple" revision numbers are totally broken and 
> horrible, I say that not because I think a number like "1.45.3.17" is 
> ugly, but because I think that the deeper _implications_ of using a number 
> like that is ugly. It implies one of two things:
> 
>  - the numbers change all the time as things get merged both ways
> 
> OR
> 
>  - people try to maintain a "trunk" mentality

I don't think your analysis holds together completely, because all
actively-maintained branches have very stable revnos that anyone can
refer to.

> In git, the fact that everybody is on an equal footing is something that I 
> think is really good. For example, when I was away for effectively three 
> weeks during August, all the git-level merging for the kernel was done by 
> Greg KH.
> 
> And realize that he didn't use "my tree". No baton was passed. I emailed 
> with him (and some others) before-hand, so that everybody knew that I 
> expected to just pull from Greg when I came back, but it was _his_ tree 
> that he merged in, and he just worked the same way I did.
>
> And when I did come back, I did a "pull" from his tree.

That sounds to me like a baton was passed.  You asked Greg to behave
like you, and told everyone else to expect that, too.  Passing the baton
was a social, not technical event, but it did happen.  And there would
certainly be no difficulty doing exactly that (right down to running
"pull") in Bazaar land.

In fact, we are currently rotating release managers.  The 0.10 and 0.11
releases were done by Robert, and the upcoming 0.12 is being managed by
John.  Neither of them is the project leader.  They threaten that they
want me to manage a release, too.  We shall see...

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNuyT0F+nu1YWqI0RAjxSAJ9YulgRMmIuy9RS1xrrYnKl9x2arQCaAr5/
u56sojZb6jhKl3fMQ/ZxLf4=
=EYC+
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
                                                                         ` (2 preceding siblings ...)
  2006-10-19  3:01                                                       ` Nicolas Pitre
@ 2006-10-19  3:46                                                       ` Junio C Hamano
  2006-10-19 14:27                                                         ` Nicolas Pitre
  2006-10-19 14:55                                                         ` Linus Torvalds
  3 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-19  3:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> Actually, I've hit an impasse.
>
> So there's _another_ way of fixing a thin pack: it's to expand the objects 
> without a base into non-delta objects, and keeping the number of objects 
> in the pack the same. But _again_, we don't actually know which ones to 
> expand until it's too late.

pack-objects.c::write_one() makes sure that we write out base
immediately after delta if we haven't written out its base yet,
so I suspect if you buffer one delta you should be Ok, no?
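
A toy sketch of the "buffer one delta" idea, with many simplifications: objects are small integer ids rather than SHA-1s, and the code only counts which deltas could stay deltas. The struct and function names are invented for this illustration and do not come from pack-objects.c or unpack-objects.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ID 64	/* toy bound on object ids, purely for this sketch */

struct entry {
	int id;		/* stand-in for an object's SHA-1 */
	int base;	/* id of the delta base, or -1 for a non-delta */
};

struct outcome {
	int kept;	/* deltas that can be written out as deltas */
	int expanded;	/* deltas that must become full objects */
};

static struct outcome classify(const struct entry *e, size_t n)
{
	bool seen[MAX_ID] = { false };
	const struct entry *pending = NULL;	/* the one buffered delta */
	struct outcome out = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		const struct entry *cur = &e[i];

		assert(cur->id >= 0 && cur->id < MAX_ID && cur->base < MAX_ID);
		if (pending) {
			/* Did the base show up right after its delta? */
			if (pending->base == cur->id)
				out.kept++;
			else
				out.expanded++;
			pending = NULL;
		}
		if (cur->base >= 0) {
			if (seen[cur->base])
				out.kept++;	/* base already in the pack */
			else
				pending = cur;	/* hold it for one more entry */
		}
		seen[cur->id] = true;
	}
	if (pending)
		out.expanded++;	/* stream ended without the base */
	return out;
}
```

If the writer really does emit a missing base immediately after its delta, then only deltas whose base is not in the pack at all (the thin-pack case) end up counted as expanded here.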

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
@ 2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:56                                   ` Martin Pool
                                                     ` (2 more replies)
  2006-10-19  5:33                                 ` Jan Hudec
                                                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-19  5:21 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 2797 bytes --]

On Wed, 18 Oct 2006 23:10:11 -0400, Aaron Bentley wrote:
> It is fine to say all branches are equal from a technical perspective.
> - From a social perspective, it's just not true.

That's actually a very important insight, but supporting the wrong
conclusion.

In a healthy situation, the only things that make a branch special are
social issues, such as the ones you describe. That's how it should be.

But think about your favorite example of an unhealthy social situation
around a software project and a big, nasty fork. Every example I can
think of involves some technical distinction that makes one branch
more special than another.

Now, those situations also involve social problems, and those are even
more significant. But the technical blessing of one branch does not
help. And I think it contributes to the social problems in many cases.

So, I think the technical thing that is distributed version control is
an extremely important thing for us to use to help maintain healthy
social software projects. Reducing the technical hurdle of a fork, (to
where continual forking is actually a totally expected part of the
process), is a very healthy thing.

Now, both bzr and git are distributed systems, and either one will
help a great deal in the respects I'm talking about compared to
something like cvs.

As far as the revision numbers, my impression is that the numbers
would be confusing or worthless if I were to use bzr the way I'm
currently using git, as they certainly could not remain stable.

> In bzr development, it's very rare for anyone's revision numbers to change.

Which just says to me that the bzr developers really are sticking to a
centralized model. That's fine, but it does have impacts, and the tool
really does seem to have some bias toward this.

> I think you're implying that on a technical level, bzr doesn't support
> this.  But it does.  Every published repository has unique identifiers
> for every revision on its mainline, and it's exceedingly uncommon for
> these to change.

Every argument you make for the number change being uncommon just
strengthens the argument that it will be all the more
confusing/frustrating when the numbers do change.

In cairo, for example, we've made a habit of including a revision
identifier in our bug tracking system for every commit that resolves a
bug. I like having the assurance that those numbers will survive
forever. And it doesn't matter if the repository moves, or the project
is forked, or anything else. Those numbers cannot change.

I understand that bzr also has unique identifiers, but it sounds like
the tools try to hide them, and people aren't in the habit of using
them for things like this. Do bzr developers put revision numbers in
their bug trackers? Is there a guarantee they will always be valid?

-Carl

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
  2006-10-19  5:21                                 ` Carl Worth
@ 2006-10-19  5:33                                 ` Jan Hudec
  2006-10-19  7:02                                 ` Erik Bågfors
  2006-10-20 13:22                                 ` Horst H. von Brand
  3 siblings, 0 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-19  5:33 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

On Wed, Oct 18, 2006 at 11:10:11PM -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Linus Torvalds wrote:
> 
> > For example, what happens is that:
> >  - you like the simple revision numbers
> >  - that in turn means that you can never allow a mainline-merge to be done 
> >    by anybody else than the main maintainer
> 
> That's not true of bzr development.  The "main maintainer" that runs the
> bzr.dev is an email bot.  It's not an integrator-- its work is purely
> mechanical.  It can't resolve merge conflicts.

The point here is that, because of using the bot, the revnos on bzr.dev
are indeed stable (and many of the merges are in fact pointless merges
(i.e. merges of a revision and its ancestor)). But if you don't use the
bot, then doing:

bzr merge mainline
bzr push mainline

makes your revision the leftmost parent, not the one from
"mainline". The fact that bzr treats the leftmost parent somewhat
specially makes people replace the above with

bzr branch mainline
cd mainline
bzr merge feature-branch
bzr push

which is, well, more complicated (but you see it's not about main
maintainer -- anybody with write access can push).
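
The difference between the two command sequences can be sketched in Python. This is a toy model only, not bzr's code: the `Rev` class and the `mainline` and `merge` helpers are made up for illustration, and the model assumes that the tip of the branch where the merge is committed becomes the leftmost parent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rev:
    """A revision with an ordered tuple of parents (leftmost first)."""
    name: str
    parents: tuple = ()

def mainline(tip):
    """Return revision names from the root to tip, following leftmost parents."""
    path = []
    rev = tip
    while rev is not None:
        path.append(rev.name)
        rev = rev.parents[0] if rev.parents else None
    return list(reversed(path))

def merge(name, into, other):
    """Commit a merge in the branch whose tip is `into`.

    Assumption of this model: `into` becomes the leftmost parent.
    """
    return Rev(name, (into, other))

root = Rev("root")
m1 = Rev("mainline-1", (root,))
m2 = Rev("mainline-2", (m1,))
f1 = Rev("feature-1", (root,))

# "bzr merge mainline" inside the feature branch, then "bzr push mainline":
pushed_from_feature = merge("merge", into=f1, other=m2)
# "bzr merge feature-branch" inside a fresh branch of mainline, then "bzr push":
merged_in_mainline = merge("merge", into=m2, other=f1)

print(mainline(pushed_from_feature))  # ['root', 'feature-1', 'merge']
print(mainline(merged_in_mainline))   # ['root', 'mainline-1', 'mainline-2', 'merge']
```

In the first case the old mainline revisions drop off the leftmost path, so their numbers change; in the second they keep their positions.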

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 22:14                         ` Jakub Narebski
@ 2006-10-19  5:45                           ` Jan Hudec
  0 siblings, 0 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-19  5:45 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Thu, Oct 19, 2006 at 12:14:02AM +0200, Jakub Narebski wrote:
> Jan Hudec wrote:
> > Comments?
> 
> What about fetching from repository? For revnos you have to assign revno for
> all commit you have downloaded; now you need only to unpack received pack
> (or not, if you used --keep option). More work.

I don't know git internals, so I can't tell for git. For bzr:
1) You have to add the data to the knits, since the knits are one for
   each versioned file plus one for inventory and one for revision
   metadata, so this is just a small addition to that work. In fact the
   revnos in repository-wide case would be just the indices into the
   revisions knit (while in the branch-wide there would have to be a
   special list).
2) Bzr already generates a special list, revision-history, where it
   stores a list of mainline branches (in fact it used to store a list
   of local commits, but now lists the path over leftmost parents).
   So it already does the work.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  5:21                                 ` Carl Worth
@ 2006-10-19  5:56                                   ` Martin Pool
  2006-10-19 14:58                                   ` Aaron Bentley
  2006-10-19 15:25                                   ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Martin Pool @ 2006-10-19  5:56 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On 18 Oct 2006, Carl Worth <cworth@cworth.org> wrote:

> I understand that bzr also has unique identifiers, but it sounds like
> the tools try to hide them, and people aren't in the habit of using
> them for things like this. Do bzr developers put revision numbers in
> their bug trackers? Is there a guarantee they will always be valid?

There is a mix of 

 - Just giving the overall tarball version number, which is most 
   meaningful to users (and not related to bzr versions)

 - Giving a mainline revision number, which will never revert because we
   never pull (fast-forward) that branch.  That has the substantial
   (imo) benefit that you can immediately compare these numbers by eye,
   and they are easy to quote.

 - Giving a unique id, which is obviously most definitive and
   appropriate if you're talking about something which is not 
   on the mainline or a well known branch.  The launchpad.net 
   bug tracker links branches to bugs and does this through 
   revision ids.

-- 
Martin

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
  2006-10-18 18:59                               ` Petr Baudis
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
@ 2006-10-19  6:46                               ` Alexander Belchenko
       [not found]                                 ` <20061019064049.bec89582.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Alexander Belchenko @ 2006-10-19  6:46 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Petr Baudis wrote:
...
> An example bundle is available at
> 
> 	http://pasky.or.cz/~pasky/cp/example-bundle.txt

You are probably missing the main idea of bzr bundles. A bundle is not
just a way to send part of a repository via e-mail or another
appropriate transport. It was primarily designed to be as human-readable
as a usual diff (i.e. a patch). It was designed to do 2 things
simultaneously:

- be as informative for a human as a usual patch
- be consistent for the machine.

--
Alexander

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
  2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:33                                 ` Jan Hudec
@ 2006-10-19  7:02                                 ` Erik Bågfors
  2006-10-19  8:49                                   ` Christian MICHON
  2006-10-19 11:37                                   ` Petr Baudis
  2006-10-20 13:22                                 ` Horst H. von Brand
  3 siblings, 2 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-19  7:02 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

> > In git, the fact that everybody is on an equal footing is something that I
> > think is really good. For example, when I was away for effectively three
> > weeks during August, all the git-level merging for the kernel was done by
> > Greg KH.
> >
> > And realize that he didn't use "my tree". No baton was passed. I emailed
> > with him (and some others) before-hand, so that everybody knew that I
> > expected to be just pull from Greg when I came back, but it was _his_ tree
> > that he merged in, and he just worked the same way I did.
> >
> > And when I did come back, I did a "pull" from his tree.
>
> That sounds to me like a baton was passed.  You asked Greg to behave
> like you, and told everyone else to expect that, too.  Passing the baton
> was a social, not technical event, but it did happen.  And there would
> certainly be no difficulty doing exactly that (right down to running
> "pull") in Bazaar land.


I'd like to point out that the same thing has happened in bzr-land.
Back in the "pre-bot" days, only Martin put things in "his branch",
where most people got bzr from (same as Linus' git branch), but he was
away for a few weeks and during this time there were 3 (or perhaps 4)
other branches, called integration branches, that were being used.
They were all maintained by different people.

Everyone learned really quickly to use them instead of Martin's
branch. When Martin came back, he just pulled/merged these branches
and everything was back to normal.

I'd say in this case, bzr was even more "without a trunk" than in the
example Linus gives above.

What seems to be one interesting thing in this discussion is that,
because people use bzr and git in slightly different ways, they think
that one or the other cannot be used in another way.

bzr's use of revision numbers doesn't mean it hasn't got unique
revision identifiers, and I can't see any reason why it couldn't be
used in the same way as git.  Both are excellent tools, and since git
is more specialized (built to support the exact workflow used in
kernel development), it's more suited for that exact use.

bzr tries to take a broader view; for example, it does support a
centralized workflow if you want one.  Most people don't, but a few
might. Because of this, it probably fits kernel development less
well than git does.  That's fine, I think! It happens to fit my workflow
better than git does :)

Regards,
Erik

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
  2006-10-18 22:14                         ` Jakub Narebski
@ 2006-10-19  8:19                         ` Alexander Belchenko
  2006-10-21 13:48                           ` Jan Hudec
  2006-10-20  2:09                         ` Horst H. von Brand
  2 siblings, 1 reply; 1752+ messages in thread
From: Alexander Belchenko @ 2006-10-19  8:19 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jan Hudec wrote:
...
> 
> Reading this thread I came to think, that the revnos should be assigned
> to _all_ revisions _available_, in order of when they entered the
> repository (there are some possible variations I will mention below)
> 
...
>  - They would be the same as subversion and svk, and IIRC mercurial as
>    well, use, so:
>    - They would already be familiar to users comming from those systems.
>    - They are known to be useful that way. In fact for svk it's the only
>      way to refer to revisions and seem to work satisfactorily (though
>      note that svk is not really suitable to ad-hoc topologies).

I think that the SVN model of revision numbers is wrong, and applying it
to bzr would break many UI habits. For example, when you use svn and
your repo has many branches, you can never say which revisions belong to
the mainline. So things like
bzr diff -rM..N
(where M and N are absolute revision numbers, and N = M+1 (or +2), etc.)
will be more complicated, because in this case you first need to run the
log command and remember the actual numbers of those revisions.
And each time I find it frustrating to see that after mainline svn
revision 1000 the next mainline revision might be 1020. It's
very-very-very confusing. Maybe only for me.
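
The gap in numbers can be shown with a small Python sketch. It is a toy model with invented data and function names, comparing one repository-wide counter (the svn-style scheme under discussion) with a counter kept per branch.

```python
# Revisions in the order they entered one shared repository,
# each labelled with the branch it was committed on (invented data).
arrivals = ["trunk", "trunk", "feature", "feature", "trunk", "feature", "trunk"]

def repo_wide_numbers(arrivals, branch):
    """One counter for the whole repository: every commit takes the next number."""
    return [n for n, b in enumerate(arrivals, start=1) if b == branch]

def per_branch_numbers(arrivals, branch):
    """One counter per branch: each branch counts only its own revisions."""
    return list(range(1, arrivals.count(branch) + 1))

print(repo_wide_numbers(arrivals, "trunk"))   # [1, 2, 5, 7]
print(per_branch_numbers(arrivals, "trunk"))  # [1, 2, 3, 4]
```

With the repository-wide counter, `-rM..N` with N = M+1 no longer names two neighbouring trunk revisions, which is the habit that would break.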

There are 2 reasons why I don't want to switch to svn (if I can make my
own choice): its strange tags implementation (its tags are the same as
branches, so what's the difference?) and its revision numbers.

I also think that dotted revisions are not the answer in this case,
though they look very logical and nice.

I think bzr needs a switch, a flag, probably in .bazaar.conf, to show
either the revno or the revid to the user. Then the user can easily
select which model is more appropriate for him:

* decentralized (with revno)
* or distributed (with revid, i.e. UUID)

> Comments?

-1 to make revno as in svn.

--
Alexander

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  7:02                                 ` Erik Bågfors
@ 2006-10-19  8:49                                   ` Christian MICHON
  2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19 11:37                                   ` Petr Baudis
  1 sibling, 1 reply; 1752+ messages in thread
From: Christian MICHON @ 2006-10-19  8:49 UTC (permalink / raw)
  To: bazaar-ng, git

Close to 200 posts on the bzr-git war!
Is this the right place (the git mailing list) to discuss future
features of bzr?

-- 
Christian

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  8:49                                   ` Christian MICHON
@ 2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19  9:10                                       ` Matthieu Moy
                                                         ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-19  8:58 UTC (permalink / raw)
  To: Christian MICHON; +Cc: bazaar-ng, git

Christian MICHON wrote:
> close to 200 post on bzr-git war!
> is this the right place (git mailing list) to discuss about future
> features of bzr ?
> 

Perhaps not, but the tone is friendly (mostly), the patience of the 
bazaar people seems infinite and lots of people seem to be having fun 
while at the same time learning a thing or two about a different SCM.
Best case scenario, both git and bazaar come out of the discussion as 
better tools. If there would never be any cross-pollination, git 
wouldn't have half the features it has today.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  8:58                                     ` Andreas Ericsson
@ 2006-10-19  9:10                                       ` Matthieu Moy
  2006-10-19 14:57                                         ` Tim Webster
  2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
  2006-10-20 10:40                                       ` Jakub Narebski
  2 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-19  9:10 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Christian MICHON, bazaar-ng, git

Andreas Ericsson <ae@op5.se> writes:

> Perhaps not, but the tone is friendly (mostly), the patience of the
> bazaar people seems infinite and lots of people seem to be having fun
> while at the same time learning a thing or two about a different SCM.
> Best case scenario, both git and bazaar come out of the discussion as
> better tools. If there would never be any cross-pollination, git
> wouldn't have half the features it has today.

I second this.

I'm a bzr user and occasional developer, and I learnt a lot about git
in the discussion. I hope I was also able to explain some of the
features of bzr well to some git guys. It's always interesting to
understand why other people do things in a different way, or why they
do it in the same way.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 15:38                                 ` Carl Worth
@ 2006-10-19  9:10                                   ` Matthew D. Fuller
  2006-10-19 11:15                                     ` Andreas Ericsson
  2006-10-19 11:27                                     ` Karl Hasselström
  0 siblings, 2 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-19  9:10 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On Wed, Oct 18, 2006 at 08:38:24AM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
> 
> But as you already said, it's often avoided specifically because it
> destroys locally-created revision numbers.

I think this has the causality backward.  It's avoided because it
changes the ancestry of the branch in question, by rearranging the
left parents; this ties into Linus' assertion that all parents ought
to be treated equally, which I'm beginning to think is the base
lynchpin of this whole dissension.


Without a differentiation of the parents, there's no such creature as
a "mainline" on a branch, so it's hard to find anything to base revnos
on from the get-go; the whole discussion becomes meaningless and
incomprehensible then.

With the differentiation, numbering along the leftmost 'mainline'
makes sense, and fits the way people tend to work.  "I did this, then
I did this, then I merged in Joe's stuff, then I did this", and the
numbering follows along that.  And as long as it's the same branch,
those revnos will always be the same; I can't go back and add
something in between my first and second commits.  THAT'S where revnos
are useful; referring to a point on a given branch.
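
A minimal Python sketch of that numbering follows. It assumes revnos are simply positions on the leftmost-parent path; the `history` dict and the `revnos` helper are invented for illustration and are not bzr's implementation.

```python
# Toy history: revision -> ordered list of parents (leftmost first).
history = {
    "did-this": [],
    "did-that": ["did-this"],
    "joe-1": ["did-this"],
    "joe-2": ["joe-1"],
    "merge-joe": ["did-that", "joe-2"],
    "did-more": ["merge-joe"],
}

def revnos(history, tip):
    """Number the revisions on the leftmost-parent path of tip, starting at 1."""
    path = []
    rev = tip
    while rev is not None:
        path.append(rev)
        parents = history[rev]
        rev = parents[0] if parents else None
    path.reverse()
    return {name: number for number, name in enumerate(path, start=1)}

before = revnos(history, "did-more")
print(before)
# {'did-this': 1, 'did-that': 2, 'merge-joe': 3, 'did-more': 4}

# Committing on top of the branch only appends; existing numbers are unchanged.
history["did-even-more"] = ["did-more"]
after = revnos(history, "did-even-more")
print(all(after[name] == number for name, number in before.items()))  # True
print(after["did-even-more"])  # 5
```

Joe's revisions get no number in this simple model; only the merge that brought them in does.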


Certainly, they're of no (or extremely limited) use when referring to
_different_ branches.  And when you change the arrangement of parents
on a branch, you create a different branch.  That's why bzr (the
project, not the program) tends toward trunks that are merged into,
rather than ephemeral trunks that are merged from and then replaced
with the new trunk, and has its UI optimized by default for that case;
because the ordering of the parents IS considered important and to be
preserved.  Ancestry changes aren't avoided because it would screw up
the revnos; the revnos don't get screwed up because the ancestry
changes are avoided for their OWN sake, and it's BECAUSE of that
pre-existing tendency that the revnos could come into being in the
first place.


If you need to refer to a specific revision in a vacuum, a revno is
the *WRONG* tool for the job.  Revnos exist to refer to points along a
branch.  And in cases where there's a meaningful persistent branch, as
happens in most projects which have a trunk in some sense or another,
they can be the right tool for referring to points along that.


> So there are some aspects of the bzr design that rob from its
> ability to function as a distributed version control system. It
> really does bias itself toward centralization, (the so called "star
> topoloogy" as opposed to something "fully" distributed).

That depends on what you mean by 'bias' (and for that matter, what you
mean by 'centralization'; I think that's being used in very different
ways here).  If you don't care about the ancestry changes, you can go
ahead and change it around by merging and pushing like there's no
tomorrow, and it'll keep up just fine.  Some attributes of it like the
revnos which assume you do care about the ancestry simply cease to be
of any applicability.  That doesn't make it a useless feature, any
more than diff being inapplicable in a branch I'm using to store
binary files makes diff useless; it's just not one that's meaningful
in a given case.

bzr (the project) does care about the ordering of the parents, so it
doesn't do that.  bzr (the tool) assumes that the majority of its
users will care, which is why it has revnos; because in the case where
you don't disturb the ancestry of given branches, revnos are very
useful in reference to that branch.


> So even a project that's very oriented around a single, central tree
> can get a lot of benefit from being able to share things arbitrarily
> between any two given repositories.

I agree wholeheartedly.  That's one of the reasons I'm using bzr, even
though 95% or better of what I do is very oriented around single,
central trees, after all    8-}


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                 ` <20061019064049.bec89582.seanlkml@sympatico.ca>
@ 2006-10-19 10:40                                   ` Sean
  2006-10-20 14:03                                     ` Aaron Bentley
  2006-10-19 10:40                                   ` Sean
  1 sibling, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-19 10:40 UTC (permalink / raw)
  To: Alexander Belchenko; +Cc: git, bazaar-ng

On Thu, 19 Oct 2006 09:46:32 +0300
Alexander Belchenko <bialix@ukr.net> wrote:

> You probably miss main idea of bzr bundles. It's not just the way to
> send via e-mail or other appropriate transport the part of repository.
> It primarily was designed to be human readable as usual diff (i.e.
> patch). It was designed to solve 2 thing simultaneously:
> 
> - be informative for human as usual patch
> - be consistent for machine.

Petr already mentioned that the data currently shown in the email
text isn't really useful.  But it's simple to make it an attachment
and show a combined diff instead.

Although that might just make the email bigger for not a lot of
gain.  It's easy to use the git command line and gui tools to inspect
the bundle after importing it into your repository.  And just as
easy to expunge the bundle afterward if it isn't up to grade.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  1:58                                     ` Charles Duffy
@ 2006-10-19 11:01                                       ` Johannes Schindelin
  2006-10-19 11:10                                         ` Charles Duffy
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-19 11:01 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git

Hi,

On Wed, 18 Oct 2006, Charles Duffy wrote:

> Johannes Schindelin wrote:

you neatly clipped the most important part of my email: I quoted you 
saying that plugins can even change core behaviour!

> > So, the wonderful upside of plugins you described here are actually the
> > reason I will never, _never_ use bzr with plugins.
> > 
> 
> I presume that for this reason you will also never, _never_ use a 
> non-mainline branch of git -- even if its actual code only touches UI 
> enhancements or something similarly non-core

NO! The point was that I will not gladly run anything which could change 
the core. If I know it touches only the UI, there is no problem.

If I get a shell script using git-core programs to do its job, I 
_know_ that my repository will not be fscked afterwards.

And _that_ was the whole point of my email.

> And that you will never, _never_ use third-party wrappers because they 
> might play LD_PRELOAD tricks. Or run any software with root privileges 
> you haven't personally written. Or...

Most of it comes down to trust. And yes, you are correct, I will not run 
git with some obscure module LD_PRELOADed that some guy from some planet 
sent me.

You might have missed my argument being about the SCM, and not the 
universe and all the rest.

> The claim that an extensibility mechanism should be rejected wholesale 
> on account of being excessively powerful, on the other hand, is just 
> silly.

Oh, but NO! An extensibility mechanism which allows for a fragile system 
_is_ silly. Not my rejection of it.

Just take an example (illustrating that once again, one should not 
attribute everything to malevolence...): I write a plugin for bzr. It does 
really wonderful things, it even cooks you dinner.

Only that I happened to make a small mistake (if you followed some threads 
on the git list, you'd know that small mistakes are a hobby of mine), and 
by this mistake, your repository is ... gone. Small mistake, big
consequence. That is what is wrong with such a powerful system, which
caters to developers, who are human after all.

Note that such a small mistake would be much more likely caught in git: if 
it touches the core, plenty of eyes look at it.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:01                                       ` Johannes Schindelin
@ 2006-10-19 11:10                                         ` Charles Duffy
  2006-10-19 11:24                                           ` Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Charles Duffy @ 2006-10-19 11:10 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
>> I presume that for this reason you will also never, _never_ use a 
>> non-mainline branch of git -- even if its actual code only touches UI 
>> enhancements or something similarly non-core
>>     
>
> NO! The point was that I will not gladly run anything which could change 
> the core. If I know it touches only the UI, there is no problem.
>   

If you're willing to look at the source of a branch to know that it 
touches only the UI, why would you not be willing to look at the source 
of a plugin to do the same thing?

> If I get a shell script using git-core programs to do its job, I 
> _know_ that my repository will not be fscked afterwards.
>
> And _that_ was the whole point of my email.
>   

It's a silly point. If you're willing to look at what your shell script 
does and validate that it doesn't do LD_PRELOAD tricks or swap out git 
core pieces, why wouldn't you be willing to accept a plugin after a 
similar level of review, rather than stating outright that you would 
*never* use them?

>> The claim that an extensibility mechanism should be rejected wholesale 
>> on account of being excessively powerful, on the other hand, is just 
>> silly.
>>     
>
> Oh, but NO! An extensibility mechanism which allows for a fragile system 
> _is_ silly. Not my rejection of it.
>   

Shell scripts allow for a fragile system because they could include C 
code snippets which they then compile and LD_PRELOAD. Sure, they "allow 
for" a fragile system -- but the author has to go out of their way to 
make it so. Similarly, folks writing bzr plugins need to take explicit 
actions to monkeypatch existing code (as opposed to adding a new 
transport/storage format/command/etc but leaving the old ones alone).

If you trust the author of your shell script not to build their own 
LD_PRELOAD at runtime, why don't you trust the author of your bzr plugin 
not to monkeypatch in replacements to core code if they say they aren't?

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  9:10                                   ` Matthew D. Fuller
@ 2006-10-19 11:15                                     ` Andreas Ericsson
  2006-10-19 12:04                                       ` Matthieu Moy
  2006-10-19 11:27                                     ` Karl Hasselström
  1 sibling, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-19 11:15 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

Matthew D. Fuller wrote:
> On Wed, Oct 18, 2006 at 08:38:24AM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
>> But as you already said, it's often avoided specifically because it
>> destroys locally-created revision numbers.
> 
> I think this has the causality backward.  It's avoided because it
> changes the ancestry of the branch in question, by rearranging the
> left parents; this ties into Linus' assertion that all parents ought
> to be treated equally, which I'm beginning to think is the base
> lynchpin of this whole dissension.
> 
> 
> Without a differentiation of the parents, there's no such creature as
> a "mainline" on a branch, so it's hard to find anything to base revnos
> on from the get-go; the whole discussion becomes meaningless and
> incomprehensible then.
> 
> With the differentiation, numbering along the leftmost 'mainline'


You, and others, keep saying "leftmost". What on earth does left or 
right have to do with anything? Or rather, how do you determine which 
side anything at all is on?

> makes sense, and fits the way people tend to work.  "I did this, then
> I did this, then I merged in Joe's stuff, then I did this", and the
> numbering follows along that.  And as long as it's the same branch,
> those revnos will always be the same; I can't go back and add
> something in between my first and second commits.  THAT'S where revnos
> are useful; referring to a point on given branch.
> 

So long as the given branch is, in git-speak, "master"? I think I'm 
starting to see how this would work, but I still fail to see how you can 
then come up with revnos such as 2343.1.14.7.19, since the only ones 
that seem to actually make any sense are the ones that track the 
strictly linear development.

In git, this can be accomplished by auto-tagging each update of any 
branch with a tag named numerically and incrementally, although no-one 
really bothers with it.

Let's say you have the following graph, where A is the root commit, B 
introduces the base for a couple of new features that three separate 
coders start to work on in their own repositories. The feature started 
on in D is logically coded as a two-stage change. F fixes a bug 
introduced in D. I is the result of an octopus merge of all three 
branches, where the three features are implemented and all bugs are 
fixed (this is btw by far the most common pattern we have in our repos 
here at work).

   A
   |
   B
  /|\
 C |  D
 | |  |\
 | |  E F
 | |  |/
 | |  G
 | H /
  \|/
   I

Now a couple of questions arise.
- How do I get to C, D, E, F, G and H?
- When these get merged, which one will be considered the "left" parent,
and why?

> 
>> So there are some aspects of the bzr design that rob from its
>> ability to function as a distributed version control system. It
>> really does bias itself toward centralization, (the so called "star
>> topoloogy" as opposed to something "fully" distributed).
> 
> That depends on what you mean by 'bias' (and for that matter, what you
> mean by 'centralization'; I think that's being used in very different
> ways here).  If you don't care about the ancestry changes, you can go
> ahead and change it around by merging and pushing like there's no
> tomorrow, and it'll keep up just fine.  Some attributes of it like the
> revnos which assume you do care about the ancestry simply cease to be
> of any applicability.


How deep will I have to dig to get the immutable revids instead?


>  That doesn't make it a useless feature, any
> more than diff being inapplicable in a branch I'm using to store
> binary files makes diff useless; it's just not one that's meaningful
> in a given case.
> 

Binary diffs work just fine, thank you very much ;-)

> bzr (the project) does care about the ordering of the parents, so it
> doesn't do that.  bzr (the tool) assumes that the majority of its
> users will care, which is why it has revnos; because in the case where
> you don't disturb the ancestry of given branches, revnos are very
> useful in reference to that branch.
> 
> 
>> So even a project that's very oriented around a single, central tree
>> can get a lot of benefit from being able to share things arbitrarily
>> between any two given repositories.
> 
> I agree wholeheartedly.  That's one of the reasons I'm using bzr, even
> though 95% or better of what I do is very oriented around single,
> central trees, after all    8-}
> 

I'm sure it's supported. The question is whether or not bazaar makes it 
easy for those developers to exchange valuable information (revids, 
since their revnos will be mixed up) so they can communicate detailed 
info about "commit X introduced a bug in foo_diddle(). I fixed it in 
commit Y, so if you merge it we can release". If revids are always 
printed anyways, I see even less need for revnos. If it's hard to get 
the revids I wouldn't consider the truly distributed workflow supported 
any more than I consider CVS file rename support à la "just hand-edit 
the ,v-files" to actually work.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


* Re: VCS comparison table
  2006-10-19 11:10                                         ` Charles Duffy
@ 2006-10-19 11:24                                           ` Johannes Schindelin
  2006-10-19 11:30                                             ` Charles Duffy
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-19 11:24 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git

Hi,

On Thu, 19 Oct 2006, Charles Duffy wrote:

> Johannes Schindelin wrote:
> > > I presume that for this reason you will also never, _never_ use a
> > > non-mainline branch of git -- even if its actual code only touches UI
> > > enhancements or something similarly non-core
> > >     
> > 
> > NO! The point was that I will not gladly run anything which could change the
> > core. If I know it touches only the UI, there is no problem.
> >   
> 
> If you're willing to look at the source of a branch to know that it 
> touches only the UI, why would you not be willing to look at the source 
> of a plugin to do the same thing?

That is why I said I'd be gladly using a shell-script using git-core 
programs. It is typically no more than 20 lines, and I can review that 
quite easily.

> Shell scripts allow for a fragile system because they could include C code
> snippets which they then compile and LD_PRELOAD.

Well, I do not expect people to misbehave. You do not compile a nasty 
C-program from a shell script _by mistake_.

I also expect people not to constantly miss my point. It could be that I 
am not as proficient in the English language as I thought. In that case, 
I'll better shut up.

Ciao,
Dscho


* Re: VCS comparison table
  2006-10-19  9:10                                   ` Matthew D. Fuller
  2006-10-19 11:15                                     ` Andreas Ericsson
@ 2006-10-19 11:27                                     ` Karl Hasselström
  2006-10-19 11:46                                       ` Petr Baudis
  1 sibling, 1 reply; 1752+ messages in thread
From: Karl Hasselström @ 2006-10-19 11:27 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Jakub Narebski

On 2006-10-19 04:10:45 -0500, Matthew D. Fuller wrote:

> I think this has the causality backward. It's avoided because it
> changes the ancestry of the branch in question, by rearranging the
> left parents; this ties into Linus' assertion that all parents ought
> to be treated equally, which I'm beginning to think is the base
> lynchpin of this whole dissension.

Yes, it seems you have found the needle. :-) In git, history is a DAG;
a commit has a _set_ of parents, so by definition they are not
ordered. This has a number of consequences. For example, you can't
really answer the question "Which branch was this commit on?". All you
can say is that "This commit is reachable from (and therefore part of)
branches X, Y, and Z."

In all other SCMs I have seen, a "branch" is conceptually an ordered
series of commits (some of which may be merges). In git, a "branch" is
a pointer to a commit, period. The commit knows its set of parents, so
all its history is there, but there is fundamentally no way to tell
which branch a commit was "on" when it was created.

This is an important point; it means there is no concept of "my" or
"your" branch. Every participant is adding commits to the same DAG,
and may at any point decide to share her additions with someone else,
or keep them private forever. And because "branches" don't really
exist, every commit really is created equal.

Really, every commit. Not even the initial commit of a project is
special -- it's just a commit with an empty parent set. And, it's
perfectly possible to make a (merge) commit whose parents belong to
previously disconnected parts of the DAG. This of course means that
it's not even possible to differentiate commits based on which project
they're part of, since one can create a commit whose parents belong to
different projects. All commits are _really_ born equal! There's just
one great DAG of all git commits that could possibly exist. (This has
been done in git's own history; the graphical viewer gitk was
originally a separate project, with its own initial commit, but that
initial commit is now reachable from all commits currently being made
to git -- that is, it has been merged.)

This structure of things may seem complex, since it's different, but
mathematically it's quite simple, and that's what counts in the end if
you want to do nontrivial things.
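
The model described above can be sketched in a few lines of Python. This is
a toy illustration only, not git's actual storage format: the Commit class,
the commit names and the branch names are all made up for the example. It
shows parents held as an unordered set, a branch as a plain pointer, a root
commit as a commit with an empty parent set, and "which branch is this
commit on?" answered only as "which branch tips can reach it?".

```python
# Toy model of the history DAG; all names here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    parents: frozenset = frozenset()   # a set, so parents carry no order


def reachable(tip):
    """Return the names of every commit reachable from tip (tip included)."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit.name not in seen:
            seen.add(commit.name)
            stack.extend(commit.parents)
    return seen


# Two histories, each with its own initial commit (empty parent set).
a = Commit("a")
b = Commit("b", frozenset({a}))
k = Commit("k")                      # root of a formerly separate project
m = Commit("m", frozenset({b, k}))   # merge joining the two histories

# A "branch" is nothing more than a name pointing at a commit.
branches = {"master": m, "topic": b}


def branches_containing(commit_name):
    """List the branches whose tip can reach the named commit."""
    return sorted(name for name, tip in branches.items()
                  if commit_name in reachable(tip))
```

Asking branches_containing("a") names both branches, because "a" is
reachable from both tips; nothing in the data records which branch it was
created "on".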

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: VCS comparison table
  2006-10-19 11:24                                           ` Johannes Schindelin
@ 2006-10-19 11:30                                             ` Charles Duffy
  2006-10-20 11:38                                               ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Charles Duffy @ 2006-10-19 11:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
>> Shell scripts allow for a fragile system because they could include C code
>> snippets which they then compile and LD_PRELOAD.
>>     
>
> Well, I do not expect people to misbehave. You do not compile a nasty 
> C-program from a shell script _by mistake_.
>   

You also don't replace bzrlib functionality (in your terms, plumbing) in 
a plugin by mistake.

> I also expect people not to constantly miss my point.

I think your point is predicated on a misunderstanding of how plugins work.


* Re: VCS comparison table
  2006-10-19  7:02                                 ` Erik Bågfors
  2006-10-19  8:49                                   ` Christian MICHON
@ 2006-10-19 11:37                                   ` Petr Baudis
  2006-10-19 15:17                                     ` Matthew D. Fuller
  1 sibling, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-19 11:37 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

Dear diary, on Thu, Oct 19, 2006 at 09:02:16AM CEST, I got a letter
where Erik Bågfors <zindar@gmail.com> said that...
> bzr's use of revision numbers, doesn't mean it hasn't got unique
> revision identifiers, and I can't see any reason why it couldn't be
> used in the same way as git.

There is perhaps no "technical" reason, but it's also what the user
interface is designed around - most probably, using UUIDs instead of
revnos would be a lot less convenient for bzr people because you
probably primarily show revnos everywhere and UUIDs only in a few special
places and/or when asked specifically through a command (correct me if
I'm wrong). Also, do you support "UUID autocompletion" so that you can
type just the unique UUID prefix instead of the whole thing?
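
What is meant by prefix completion can be sketched as follows. This is a
minimal Python illustration, not code from git or bzr; the function name and
the sample ids are invented for the example.

```python
def resolve_prefix(ids, prefix):
    """Return the single id starting with prefix, or raise KeyError."""
    matches = [rev_id for rev_id in ids if rev_id.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(f"no revision id starts with {prefix!r}")
    raise KeyError(f"prefix {prefix!r} is ambiguous ({len(matches)} matches)")


# Hypothetical revision ids, shortened for readability.
ids = ["3f2a9c51", "3f7b01de", "a41d7702"]
```

With these ids, "a4" and "3f2" each identify one revision, while "3f" is
rejected as ambiguous.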

> Both are excellent tools, and since git
> is more specialized (built to support the exact workflow used in
> kernel development), it's more suited for that exact use.
> 
> bzr tries to take a broader view, for example, it does support a
> centralized workflow if you want one.  Most people don't, but a few
> might. Because of this, it probably fits the kernel development less
> well than git.  That's fine I think! It happens to fit my workflow
> better than git does :)

I think they are in fact just as flexible (+-epsilon). Git can support
centralized workflow as well - you have some central repository
somewhere and all the developers clone it, then pull from it and push to
it in basically the same way they would use CVS. And it is perhaps
even more widely used in practice than the "single-man" workflow
nowadays, as more projects are using Git.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-19 11:27                                     ` Karl Hasselström
@ 2006-10-19 11:46                                       ` Petr Baudis
  2006-10-19 16:01                                         ` Matthew D. Fuller
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-19 11:46 UTC (permalink / raw)
  To: Karl Hasselström
  Cc: Matthew D. Fuller, Carl Worth, Aaron Bentley, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Dear diary, on Thu, Oct 19, 2006 at 01:27:59PM CEST, I got a letter
where Karl Hasselström <kha@treskal.com> said that...
> Really, every commit. Not even the initial commit of a project is
> special -- it's just a commit with an empty parent set. And, it's
> perfectly possible to make a (merge) commit whose parents belong to
> previously disconnected parts of the DAG. This of course means that
> it's not even possible to differentiate commits based on which project
> they're part of, since one can create a commit whose parents belong to
> different projects.

FWIW, IIRC the Git project has about 6 initial commits. :-)

BTW, a popular source of horrification in other VCSes are Git's octopus
merges. (A popular source of horrification in Git are kernel developers
doing octopus merges of 40 branches at once.) Does Bazaar support those?
(I can't really say it's a defect if it doesn't...)

(An octopus merge is a merge of more than two branches at once, in a
single commit.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-19 11:15                                     ` Andreas Ericsson
@ 2006-10-19 12:04                                       ` Matthieu Moy
  2006-10-19 12:33                                         ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-19 12:04 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth, git,
	Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> You, and others, keep saying "leftmost". What on earth does left or
> right have to do with anything? Or rather, how do you determine which
> side anything at all is on?

Not sure it's the same in git, but in bzr, a new revision is always
created by a commit (it can be "fetched" by other commands though). If
you "merge", then you have to commit after.

What people call "leftmost ancestor" is the revision which used to be
the tip at the time you committed. For example, if you do "bzr diff;
bzr commit" the diff shown before is the same as the one you get with
"bzr diff -r last:1" right after the commit.

I believe this doesn't make a difference for merge algorithms, but in
the UI, it's here when you say, e.g.:

bzr diff -r last:12..before:revid:foo@bar-auents987aue

(once in "last:", and once in "before:")

-- 
Matthieu


* Re: VCS comparison table
  2006-10-19 12:04                                       ` Matthieu Moy
@ 2006-10-19 12:33                                         ` Petr Baudis
  2006-10-19 13:44                                           ` Matthieu Moy
  2006-10-20 11:50                                           ` Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-19 12:33 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Andreas Ericsson, Matthew D. Fuller, bazaar-ng, Linus Torvalds,
	Carl Worth, git, Jakub Narebski

Dear diary, on Thu, Oct 19, 2006 at 02:04:14PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> What people call "leftmost ancestor" is the revision which used to be
> the tip at the time you committed. For example, if you do "bzr diff;
> bzr commit" the diff shown before is the same as the one got with
> "bzr diff -r last:1" right after the commit.

The lack of parents ordering in Git is directly connected with
fast-forwarding.

Consider

 repo1   repo2

   a       a
  /       /
 b       c

Now repo2 merges with repo1:

 repo1   repo2

   a       a
  /       / \
 b       c   b
          \ /
           m

repo1 tip ('b') is not ancestor of repo2 tip ('c') so a three-way merge
is done and a new 'm' merge commit is created.

And now repo1 merges with repo2:

 repo1   repo2

   a       a
  / \     / \
 c   b   c   b
  \ /     \ /
   m       m

Because previous repo1 tip ('b') was ancestor of repo2 tip ('m'), a
fast-forward happened and repo1 tip simply moved to 'm'. But this
"flipped" the development from repo1 POV - you cannot assume anymore
that the first ("leftmost") parent is special.
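
The scenario above can be replayed with a small Python sketch. It is a toy
model, not git's merge code: commits are a dict mapping a name to a tuple of
parents, both repos share one dict for brevity (as if all objects had
already been fetched), and the merge() helper and the name "m2" are
hypothetical.

```python
def ancestors(commits, tip):
    """Return tip plus every commit reachable from it."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(commits[commit])
    return seen


def merge(commits, ours, theirs, new_name):
    """Merge theirs into ours and return the resulting tip."""
    if ours in ancestors(commits, theirs):
        return theirs                      # fast-forward: no new commit
    if theirs in ancestors(commits, ours):
        return ours                        # already up to date
    commits[new_name] = (ours, theirs)     # real merge: ours is listed first
    return new_name


commits = {"a": (), "b": ("a",), "c": ("a",)}
repo1, repo2 = "b", "c"

repo2 = merge(commits, repo2, repo1, "m")    # diverged, so 'm' is created
repo1 = merge(commits, repo1, repo2, "m2")   # 'b' is an ancestor of 'm'
```

After the second call repo1 points at 'm', no "m2" commit exists, and the
first parent of repo1's tip is 'c', a commit that was never the tip of
repo1.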

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-19 12:33                                         ` Petr Baudis
@ 2006-10-19 13:44                                           ` Matthieu Moy
  2006-10-19 16:03                                             ` Carl Worth
  2006-10-20 11:50                                           ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-19 13:44 UTC (permalink / raw)
  To: Petr Baudis; +Cc: bazaar-ng, git

Petr Baudis <pasky@suse.cz> writes:

> The lack of parents ordering in Git is directly connected with
> fast-forwarding.

[...]

>  repo1 repo2
>
>    a       a
>   / \     / \
>  c   b   c   b
>   \ /     \ /
>    m       m

Yes, bzr has a similar thing too. AIUI, the difference is that git does
it automatically, while bzr has two commands in its UI, "merge" and
"pull".

In your case, the "leftmost ancestor" of m is b, because at the time
it was created, it was committed from b.

One problem with that approach is that from revision m and looking
backward in history (say, running "bzr log"), you have two ways to go
backward:

1) Take the history of _your_ commits and your pulls, back to the point
   where you branched.

2) Follow the history taking the leftmost ancestor at each step.

In bzr, the notion of "branch" corresponds to a succession of
revisions, which are explicitly stored in a file (ls
.bzr/branch/revision-history), which is what commands like "log"
follow, and what is used for revision numbering. And this succession of
revisions must obey (at most) one of the above. In the past, it was 1),
which means that "pull" (i.e. fast-forward) was only adding revisions
to a branch. In your scenario, repo1 would get a revision history of
"a c m" while repo2 would have had "a b m" with the same tip.

Today, the revision history follows leftmost ancestor. One good
property of this is that revision history is unique for a given
revision. But the terrible drawback is that "pull" and "push" do not
/add/ revisions to your revision history, they rewrite the target one
with the source one. That means I can have

$ bzr log --line
1: some upstream stuff
2: started my work
3: continued my work

# upstream merges.

$ bzr pull
$ bzr log --line
1: some upstream stuff
2: some other upstream stuff ...
3: ... committed while I was working
4: merged from Matthieu this terrible feature
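
The renumbering shown in the two logs can be reproduced with a short Python
sketch. It assumes that revnos are simply positions along the chain of
first ("leftmost") parents; this is an illustration of that rule, not bzr's
code, and the commit names are made up.

```python
def mainline(commits, tip):
    """Follow first parents from tip back to the root; oldest first."""
    chain = []
    while tip is not None:
        chain.append(tip)
        parents = commits[tip]
        tip = parents[0] if parents else None
    return chain[::-1]


def revnos(commits, tip):
    """Number the mainline revisions 1, 2, 3, ... starting at the root."""
    return {rev: n for n, rev in enumerate(mainline(commits, tip), start=1)}


# Hypothetical history: name -> tuple of parents, first parent is leftmost.
commits = {
    "up1": (),
    "mine1": ("up1",),
    "mine2": ("mine1",),
    "up2": ("up1",),
    "up3": ("up2",),
    "merge": ("up3", "mine2"),   # upstream merges my work; its own tip is leftmost
}

before = revnos(commits, "mine2")   # my branch before the pull
after = revnos(commits, "merge")    # my branch after pulling upstream's merge
```

Before the pull, "mine1" and "mine2" are revisions 2 and 3. After it, 2 and
3 are "up2" and "up3", and my own commits no longer have a mainline number
at all.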

-- 
Matthieu -- definitely curious to give a real try to git ;-)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  3:46                                                       ` Junio C Hamano
@ 2006-10-19 14:27                                                         ` Nicolas Pitre
  2006-10-19 14:55                                                         ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-19 14:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Wed, 18 Oct 2006, Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > Actually, I've hit an impasse.
> >
> > So there's _another_ way of fixing a thin pack: it's to expand the objects 
> > without a base into non-delta objects, and keeping the number of objects 
> > in the pack the same. But _again_, we don't actually know which ones to 
> > expand until it's too late.
> 
> pack-objects.c::write_one() makes sure that we write out base
> immediately after delta if we haven't written out its base yet,
> so I suspect if you buffer one delta you should be Ok, no?

If we create full packs out of thin packs the base objects will end up 
at the end of the pack so this assumption is a bad one to rely upon if 
we want to make things robust (like being able to feed such a pack 
back).


Nicolas


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  3:46                                                       ` Junio C Hamano
  2006-10-19 14:27                                                         ` Nicolas Pitre
@ 2006-10-19 14:55                                                         ` Linus Torvalds
  2006-10-19 16:07                                                           ` Jan Harkes
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19 14:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 18 Oct 2006, Junio C Hamano wrote:
>
> Linus Torvalds <torvalds@osdl.org> writes:
> >
> > Actually, I've hit an impasse.
> >
> > So there's _another_ way of fixing a thin pack: it's to expand the objects 
> > without a base into non-delta objects, and keeping the number of objects 
> > in the pack the same. But _again_, we don't actually know which ones to 
> > expand until it's too late.
> 
> pack-objects.c::write_one() makes sure that we write out base
> immediately after delta if we haven't written out its base yet,
> so I suspect if you buffer one delta you should be Ok, no?

It doesn't matter. I realized that my bogus patch to unpack-objects was 
more seriously broken anyway: even the "un-deltify every single object" 
was broken. And that's despite the fact that I _tested_ it, and verified 
the end result by hand.

Why? Because I tested it within one repo, by just piping the output of 
git-pack-objects --stdout directly to the repacker. That seemed to be a 
good way to test it without setting up anything bigger. But it turns out 
that it misses one of the big problems: if you don't unpack the objects in 
a way that later phases can read, none of the streaming code works at all, 
and you have to buffer up _everything_ in memory just to be able to read 
any previous _non_delta objects too.

So my patch-series works - but it only works in a repo that already has 
all the objects in question, because then it can look up the objects in 
the original database. Which makes it useless. Duh.

So forget about unpack-objects. It's designed to be streaming (and it's a 
_good_ design for what it does), but repacking really cannot be done that 
way. Repacking needs to be done by saving the thin pack to disk, and then 
doing a multi-pass over it (like git-index-pack does, for example).

Just throw my patch away. It's not even useful as a basis for anything 
else, unless you want to use it as a way to keep all the objects in memory 
and use the "unpack-objects" logic to just _parse_ the incoming pack.

I suspect using "index-pack" is saner (since it already has the multi-pass 
logic), or just doing something that maps all the objects in memory, and 
then calls builtin-pack-objects once it has set up the new thin pack so 
that others can see/use the new objects without realizing that they aren't 
in the canonical pack-format.
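
Why a single streaming pass is not enough can be shown with a toy Python
sketch. It is emphatically not the git pack format or the index-pack
algorithm: the entry tuples are invented, and a "delta" here is just a
suffix appended to its base. The only point is that a delta may arrive
before its base, or refer to a base that is not in the (thin) pack at all,
so the entries have to be saved and revisited.

```python
def resolve_pack(entries, existing=None):
    """Resolve a toy pack in several passes.

    entries:  list of ("full", oid, data) or ("delta", oid, base_oid, suffix)
    existing: objects already in the receiving repository (oid -> data)
    Returns (objects, borrowed), where borrowed holds the base ids that had
    to be taken from the existing repository.
    """
    existing = existing or {}
    objects, pending, borrowed = {}, [], set()

    # Pass 1: record every non-delta object and defer every delta.
    for entry in entries:
        if entry[0] == "full":
            objects[entry[1]] = entry[2]
        else:
            pending.append(entry)

    # Later passes: resolve the deltas whose base has become available.
    progress = True
    while pending and progress:
        progress = False
        unresolved = []
        for kind, oid, base, suffix in pending:
            if base in objects:
                objects[oid] = objects[base] + suffix
                progress = True
            elif base in existing:
                objects[oid] = existing[base] + suffix
                borrowed.add(base)
                progress = True
            else:
                unresolved.append((kind, oid, base, suffix))
        pending = unresolved

    if pending:
        missing = ", ".join(entry[1] for entry in pending)
        raise ValueError("unresolvable deltas: " + missing)
    return objects, borrowed


# Hypothetical thin pack: "d2" precedes its base "d1", and "base" is absent.
entries = [
    ("delta", "d2", "d1", b"!"),
    ("delta", "d1", "base", b" world"),
    ("full", "x", b"x"),
]
objects, borrowed = resolve_pack(entries, existing={b"".decode() or "base": b"hello"})
```

The borrowed set is what a repacker would have to add to turn the thin pack
into a self-contained one.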

		Linus


* Re: VCS comparison table
  2006-10-19  9:10                                       ` Matthieu Moy
@ 2006-10-19 14:57                                         ` Tim Webster
  2006-10-19 15:30                                           ` Aaron Bentley
  2006-10-19 16:14                                           ` Matthieu Moy
  0 siblings, 2 replies; 1752+ messages in thread
From: Tim Webster @ 2006-10-19 14:57 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Andreas Ericsson, Christian MICHON, bazaar-ng, git

On 10/19/06, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Andreas Ericsson <ae@op5.se> writes:
>
> > Perhaps not, but the tone is friendly (mostly), the patience of the
> > bazaar people seems infinite and lots of people seem to be having fun
> > while at the same time learning a thing or two about a different SCM.
> > Best case scenario, both git and bazaar come out of the discussion as
> > better tools. If there would never be any cross-pollination, git
> > wouldn't have half the features it has today.
>
>


Thanks everyone for taking time to explain details.

However, I don't use SCM for code development. I use it for collaborative
documentation, white boarding and tracking configurations.
In fact in my company no one uses SCM for code development.
Everyone here uses it for collaborative documentation and white boarding.
Only I use SCM for tracking configurations.

I think of SCMs in terms of an SCM core and SCM tools.

First I want to say every SCM I know of sucks when it comes to tracking
configurations, simply because they don't record or restore file metadata
such as perms, ownership, and ACLs. I don't see recording or restoring
file metadata as part of the SCM core. I do, however, feel an SCM core needs
to have provisions for extended file inventory information. The problem
with extended file inventory information is that it is fs specific. For this
reason I feel it is essential that the SCM core allow multiple sets of
extended file inventory information. The SCM tools are responsible, based on
the local config, for recording metadata, creating the extended file
inventory, and translating the file metadata of a given file system.

When tracking configurations, octopus merges are surprisingly common. If a
configuration change is not signed off by a responsible person, it cannot be
accepted. Doing otherwise is simply an invitation to attackers and makes
troubleshooting far too difficult. Also, configuration files in one
directory will most often not be members of the same repo. For example, each
file in the etc directory would be a member of a different repo according to
its associated application/pkg.

Some things I would like the SCM tools to handle: Personally I would like
the SCM tools to be platform independent. This would ensure that the correct
things happen on, say, ext3 mounted on Windows.
I don't think the execute bit belongs in the basic file inventory
information. Instead I would like to see it replaced by a filter in the
extended file inventory indicating what file metadata, if any, should be
recorded or restored. When the local SCM tools config has metadata use
enabled, the filter is used. A filter lets the user select which file
metadata to record/restore, such as ownership, permissions, and ACLs.

For SCM configuration tracking to function reliably, pulls, pushes and merges
need to be atomic. Personally I like my servers to pull change updates. And
I like to push changes I make on local servers to branches. On the
configuration master, the branches are merged into groups. When a server
pulls changes for a particular application/pkg, the SCM tools need to
perform the following steps:

1) Perform a pre update step, such as optionally stopping a service.
2) Pull updates and build the changed files in a scratch space, then apply
   file metadata. Unchanged files would be links from the scratch space to
   the original files.
3) Verify all files are correct by checking their sha1 or md5.
4) Atomically move configuration files and scripts to install them.
5) Perform a post update step, such as starting or reloading a service.

The pre update step and the post update step are very much like pkg pre and
post install scripts. The pre update and post update scripts are in fact
part of the application/pkg configuration files.
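
The "verify, then atomically move into place" part of the procedure above
could look roughly like this in Python. It is a sketch under stated
assumptions, not part of any existing SCM tool: install_verified is a
made-up name, the scratch file is created next to the target so that the
final rename stays on one file system, and the pre and post update hooks
and the metadata handling are left out.

```python
import hashlib
import os
import tempfile


def install_verified(data: bytes, expected_sha1: str, target: str) -> None:
    """Write data to target atomically, but only if its SHA-1 matches."""
    if hashlib.sha1(data).hexdigest() != expected_sha1:
        raise ValueError("checksum mismatch; refusing to install " + target)

    # Build the new file in a scratch location in the target's directory.
    directory = os.path.dirname(os.path.abspath(target))
    fd, scratch = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace renames over the target in a single step, so readers
        # see either the old file or the new one, never a partial write.
        os.replace(scratch, target)
    except BaseException:
        if os.path.exists(scratch):
            os.unlink(scratch)
        raise
```

A failed checksum leaves the existing configuration file untouched, which is
the property wanted before a service is restarted.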


Collaborative document editing and white boarding are other requirements.
ODF and SVG are XML file formats. I would like to see an efficient
XML diff as part of the SCM core. Using MIME types, SCM tools can unzip
files and bundles, and use the MIME type information to select the SCM core
XML diff or plain diff as required. I think it is essential that the SCM
core include provisions for multiple repo partners. For example, this can be
used to create a failover star SCM architecture.
In collaborative document editing you often want to compress / summarize
some of the change history.

We currently use our SCM based collaborative document editing as an ad hoc
white board, coordinating our commits and updates via IM. :)
It would be nice if the SCM tools included RSS feeds for communicating zip
patch bundles.


* Re: VCS comparison table
  2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:56                                   ` Martin Pool
@ 2006-10-19 14:58                                   ` Aaron Bentley
  2006-10-19 16:59                                     ` Carl Worth
  2006-10-19 17:01                                     ` Carl Worth
  2006-10-19 15:25                                   ` Linus Torvalds
  2 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-19 14:58 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Wed, 18 Oct 2006 23:10:11 -0400, Aaron Bentley wrote:
> But think about your favorite example of an unhealthy social situation
> around a software project and a big, nasty fork. Every example I can
> think of involves some technical distinction that makes one branch
> more special than another.
>
> Now, those situations also involve social problems, and those are even
> more significant. But the technical blessing of one branch does not
> help. And I think it contributes to the social problems in many cases.

I'm not as familiar with those details.  The one fork that I know a lot
about, when Baz (the old Bazaar architecture) forked off from Arch,
showed me that for each developer branch, one branch must be special.

This is just because it is hard to maintain a branch that applies
cleanly to two diverging codebases.  So each developer must develop
against the fork that they want to merge their code into.  If they want
their code to be applied to the other fork, someone must port it.

So I really do feel that special branches are inescapable.

With bzr, you have the freedom to choose which branch you consider
special, and change your mind at any time.  There are no technical
limitations in that regard.

> As far as the revision numbers, my impression is that the numbers
> would be confusing or worthless if I were to use bzr the way I'm
> currently using git, as they certainly could not remain stable.

They would remain stable if you only used pull to update your origin
branch, and used merge+commit to update your development branch.

>> In bzr development, it's very rare for anyone's revision numbers to change.
> 
> Which just says to me that the bzr developers really are sticking to a
> centralized model.

I don't see why you're reaching that conclusion.  I'd like to understand
that better, because Linus seems to be concluding the same thing, and it
doesn't make sense to me.

>> I think you're implying that on a technical level, bzr doesn't support
>> this.  But it does.  Every published repository has unique identifiers
>> for every revision on its mainline, and it's exceedingly uncommon for
>> these to change.
> 
> Every argument you make for the number change being uncommon just
> strengthens the argument that it will be all that more
> confusing/frustrating when the numbers do change.

That doesn't follow.  Just because something is arguably true doesn't
make it bad.  And in this case, I'm not arguing that it's true, I'm
saying that it's true, because that is what my experience tells me is true.

> In cairo, for example, we've made a habit of including a revision
> identifier in our bug tracking system for every commit that resolves a
> bug.

We do it the other way around: we put a bug number in the commit
message.  And I personally have been developing a bugtracker that is
distributed in the same way bzr is; it stores bug data in the source
tree of a project, so that bug activities follow branches around.

> I like having the assurance that those numbers will survive
> forever. And it doesn't matter if the repository moves, or the project
> is forked, or anything else. Those numbers cannot change.
> 
> I understand that bzr also has unique identifiers, but it sounds like
> the tools try to hide them, and people aren't in the habit of using
> them for things like this. Do bzr developers put revision numbers in
> their bug trackers? Is there a guarantee they will always be valid?

Yes, we put revnos in our bug trackers.  No, we can't prove that they
will always be valid.  But there are significant disincentives to
changing them, so I am quite comfortable assuming they will not change.
 And the older a revno gets, the less likely it is to change.

On the other hand, I think your revision identifiers are not as
permanent as you think.

In the first place, it seems fairly common in the Git community to
rebase.  This process throws away old revisions and creates new
revisions that are morally equivalent[1].  I don't know whether Git
fetches unreferenced revisions, but bzr's policy is to fetch only
revisions referenced in the ancestry DAG of the branch.

In the second place, one must consider the "nuclear launch codes"
scenario.  In this scenario, someone has committed the codes necessary
to begin a nuclear attack into their branch.  This is an unlikely event,
of course, but nuclear launch codes are an extreme example of data that
absolutely, positively must be completely expunged from the branch.
Other examples include proprietary code (e.g. if SCO wasn't a bunch of
charlatans), passwords and obscene or libelous statements.

In a nuclear codes scenario, the revision that introduced the nuclear
launch codes and all its descendants must be expunged from the
repository.  You may, perhaps, rebase in order to retain the shape of
the history, but the revision-ids that you have recorded will be gone.

Aaron

[1] This is a process that I find discomforting, because I consider the
original revisions to be real, historical data, and I don't like the
idea of throwing it away.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFN5F70F+nu1YWqI0RAhrsAJ9rcqNGv28134eTvbGoxxteOxif3wCfTbaq
fpD0HNeGgdlMwuJldyzUxRM=
=9k8r
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-19 11:37                                   ` Petr Baudis
@ 2006-10-19 15:17                                     ` Matthew D. Fuller
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 15:17 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Erik Bågfors, bazaar-ng, git

[ trim back CC a bit ]

On Thu, Oct 19, 2006 at 01:37:31PM +0200 I heard the voice of
Petr Baudis, and lo! it spake thus:
> 
> [...] you probably primarily show revnos everywhere and UUIDs only
> in few special places and/or when asked specifically through a
> command (correct me if I'm wrong).

The primary place you'd see either is in 'log'.  To show the UUID,
you'd add a "--show-ids" arg to it (and via per-user config aliasing,
you could just alias 'log' to 'log --show-ids' if you always wanted to
see them, so you wouldn't have to type it).  The output looks something
like:

revno: 1
revision-id: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
committer: Matthew Fuller <fullermd@over-yonder.net>
branch nick: a
timestamp: Thu 2006-10-19 10:14:37 -0500
message:
  Foo

(without --show-ids, it's the same, except not showing the
revision-id: line)


> Also, do you support "UUID autocompletion" so that you can type just
> the unique UUID prefix instead of the whole thing?

With the form of bzr UUID's, that's not particularly useful, since
you're probably into the minutes/seconds of the timestamp before it
becomes unique, at which point you're close to 2/3 of the way through
the whole string.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:56                                   ` Martin Pool
  2006-10-19 14:58                                   ` Aaron Bentley
@ 2006-10-19 15:25                                   ` Linus Torvalds
  2006-10-19 16:13                                     ` Matthew D. Fuller
  2 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19 15:25 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Wed, 18 Oct 2006, Carl Worth wrote:
> 
> I understand that bzr also has unique identifiers, but it sounds like
> the tools try to hide them, and people aren't in the habit of using
> them for things like this. Do bzr developers put revision numbers in
> their bug trackers? Is there a guarantee they will always be valid?

bzr seems to use the classic UUID format, and it's funny how much it looks 
like a real BK ChangeSet revision number ("key").

Here's the quoted bzr "true" revision ID:

	Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d

and here's a BK "ChangeSet Key":

	adi@zaphod.bitmover.com|ChangeSet|20031031183805|57296

(I don't have BK installed anywhere, so I had to google for changeset 
keys, and this was just some random key in the BK bugzilla ;)

Look very similar, don't they? And yes, the true revision ID is stable
over time (at least it was in BK, and I assume it is in bzr too).

The biggest difference seems to be that in bzr, the final checksum is
64-bit, while for BK, it was just a 16-bit checksum/unique number (the
rest is just user-name/machine-name and date: I assume that the bzr commit
was done at 10/17/2006 3:20:29PM, and the example BK ChangeSet was created
10/31/2003 6:38:05PM - it looks like _exactly_ the same date format).
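As a rough illustration of how the two identifiers quoted above break
down, here is a small Python sketch.  The field layout is inferred only
from these two example strings, not from bzr or BK documentation, and
the function names are made up for the example.

```python
from datetime import datetime

STAMP_FORMAT = "%Y%m%d%H%M%S"  # assumed: 14 digits, year down to seconds


def parse_bzr_revid(revid):
    """Split a bzr-style id into (committer, timestamp, suffix).

    rsplit is used because the committer part may itself contain '-'.
    """
    committer, stamp, suffix = revid.rsplit("-", 2)
    return committer, datetime.strptime(stamp, STAMP_FORMAT), suffix


def parse_bk_key(key):
    """Split a BK-style ChangeSet key into (user_host, kind, timestamp, number)."""
    user_host, kind, stamp, number = key.split("|")
    return user_host, kind, datetime.strptime(stamp, STAMP_FORMAT), int(number)


bzr = parse_bzr_revid("Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d")
bk = parse_bk_key("adi@zaphod.bitmover.com|ChangeSet|20031031183805|57296")

print(bzr[1].isoformat(), len(bzr[2]) * 4, "bit suffix")
print(bk[2].isoformat(), "suffix fits in 16 bits:", bk[3] < 2 ** 16)
```

Both timestamps parse with the same format string, which is the
similarity being pointed at; the sketch says nothing about how either
tool generates the trailing field.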

With BK, you can also use a "md5 key", and I don't actually know how they 
work. They may just be the md5 hash of the ChangeSet key, I think that may 
be how those things are indexed. So in bkcvs, you'll see a line like this:

	BKrev: 42516681VmgTWL0bkLcltPGiI6Yk5Q

which is the BK md5 key for my last kernel revision in BK (2.6.12-rc2). 
Again, these numbers are stable, unlike the simple revisions.

Note that from a usability standpoint, the UUID's look more readable to a 
human, but are actually much worse than the md5 keys (or the SHA1's that 
git uses). At least with a hash, the first few digits are likely to be 
unique, so you can do things like auto-completion (or just short names). 
With the email+date+random number kind of UUID, you don't have that.

(Pure hashes obviously also tend to just all have the same length, and are 
easier to parse automatically, so from a programmatic standpoint they are 
a lot easier too - but the surprising thing is how they are actually 
easier on humans too, even if the UUID's look more readable).
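To make the prefix argument concrete, here is a minimal Python sketch
that measures how many leading characters are needed before a set of
identifiers becomes unambiguous.  The three bzr-style ids are revision
ids quoted elsewhere in this thread; the SHA-1 digests are computed
from those strings purely for illustration (that is an assumption of
the example, not how bzr or git derive anything).

```python
import hashlib


def shortest_unique_prefix_len(ids):
    """Smallest n such that the first n characters of all ids are distinct."""
    longest = max(len(i) for i in ids)
    for n in range(1, longest + 1):
        if len({i[:n] for i in ids}) == len(ids):
            return n
    return longest  # only reached if two ids are identical


# Same committer, commits made within a few minutes of each other.
uuid_style = [
    "fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd",
    "fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237",
    "fullermd@over-yonder.net-20061019151807-3d7047e387edcad9",
]

# Illustrative hash-style names for the same three revisions.
hash_style = [hashlib.sha1(u.encode()).hexdigest() for u in uuid_style]

uuid_len = shortest_unique_prefix_len(uuid_style)
hash_len = shortest_unique_prefix_len(hash_style)

print("uuid-style ids need", uuid_len, "of", len(uuid_style[0]), "characters")
print("hash-style ids need", hash_len, "of", len(hash_style[0]), "characters")
```

With the email+date form, the shared committer and date prefix has to
be typed in full before the ids start to differ, whereas hash digits
differ almost immediately.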

			Linus


* Re: VCS comparison table
  2006-10-19 14:57                                         ` Tim Webster
@ 2006-10-19 15:30                                           ` Aaron Bentley
  2006-10-20  3:14                                             ` Tim Webster
  2006-10-20 10:44                                             ` Jakub Narebski
  2006-10-19 16:14                                           ` Matthieu Moy
  1 sibling, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-19 15:30 UTC (permalink / raw)
  To: Tim Webster
  Cc: Matthieu Moy, Christian MICHON, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tim Webster wrote:
> First I want to say every SCM I know of sucks when it comes to tracking
> configurations, simply because they don't record or restore file metadata,
> like perms, ownership, and acl.

Arch supports that kind of metadata.

I believe SVN supports recording arbitrary file properties, so it's just
a matter of applying those properties to the tree.

> Somethings I like the SCM tools to handle. Personally I would like the
> SCM tools to be platform independent. This would ensure that correct
> things happening on ext3 mounted on windows.
> I don't think execute bit belongs in the basic file inventory information.

Our choices have been predicated on producing the best SCM we can for
the purpose of developing software.  We find that the execute bit is
very useful for build scripts and other incidental scripts.

The other attributes didn't seem useful for software development, so
they're not part of the baseline.

> Collaborative document editing and white boarding are other requirements.
> odf and svg are xml file formats. I would like to see an efficient
> xml diff as part of the SCM core. Using mime types SCM tools can unzip
> files, bundles, and use mime type information to the SCM core xml
> diff, plain diff
> as required.

An XML diff/patch or merge will not handle ODF properly.  There's too
much extra semantic information.

> I think it is essential that the SCM core include
> previsions for multiple
> repo partners.

You mean multiple merge sources?

> It would be nice if the SCM tools included rss feeds for communicating zip
> patch bundles.

The bzr "webserve" plugin provides rss feeds.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFN5oB0F+nu1YWqI0RAjSoAJ9xrZtSrZpVVoz6qAf/sZnd/StsUACfenqX
6bemNgMSbhtL0JjIlvulrb4=
=bSpK
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19  9:10                                       ` Matthieu Moy
@ 2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
  2006-10-20 10:40                                       ` Jakub Narebski
  2 siblings, 0 replies; 1752+ messages in thread
From: Ramon Diaz-Uriarte @ 2006-10-19 15:45 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Christian MICHON, bazaar-ng, git

On 10/19/06, Andreas Ericsson <ae@op5.se> wrote:
> Christian MICHON wrote:
> > close to 200 post on bzr-git war!
> > is this the right place (git mailing list) to discuss about future
> > features of bzr ?
> >
>
> Perhaps not, but the tone is friendly (mostly), the patience of the
> bazaar people seems infinite and lots of people seem to be having fun
> while at the same time learning a thing or two about a different SCM.
> Best case scenario, both git and bazaar come out of the discussion as
> better tools. If there would never be any cross-pollination, git
> wouldn't have half the features it has today.
>

I fully agree with Andreas: I am just a bzr user (not even a bzr
developer) and when looking for a decentralized VCS I also looked at
git and a few others. I think I am learning quite a bit about bzr,
git, and VCS in general.

R.

> --
> Andreas Ericsson                   andreas.ericsson@op5.se
> OP5 AB                             www.op5.se
> Tel: +46 8-230225                  Fax: +46 8-230231
>
>


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz


* Re: VCS comparison table
  2006-10-19 11:46                                       ` Petr Baudis
@ 2006-10-19 16:01                                         ` Matthew D. Fuller
  2006-10-19 17:06                                           ` Matthew D. Fuller
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 16:01 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Karl Hasselström, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git, Jakub Narebski

On Thu, Oct 19, 2006 at 01:46:39PM +0200 I heard the voice of
Petr Baudis, and lo! it spake thus:
> 
> Does Bazaar support those?  (I can't really say it's a defect if it
> doesn't...)

By default, merge will refuse to do its thing if there are uncommitted
changes in the working tree, whether those changes are something
you've done, or the pending results of a previous merge.  A '--force'
arg to merge will make it go forward though, so yes, you can merge
multiple other branches in one merge if you want to.

Actually, I can kill 2 birds here.  Quick little bictopus merge:

% bzr log --show-ids
------------------------------------------------------------
revno: 2
revision-id: fullermd@over-yonder.net-20061019151856-c3b406b8bcdfb537
parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
parent: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
parent: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
committer: Matthew Fuller <fullermd@over-yonder.net>
branch nick: a
timestamp: Thu 2006-10-19 10:18:56 -0500
message:
  merge
    ------------------------------------------------------------
    revno: 1.2.1
    merged: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    committer: Matthew Fuller <fullermd@over-yonder.net>
    branch nick: b
    timestamp: Thu 2006-10-19 10:18:00 -0500
    message:
      bar
    ------------------------------------------------------------
    revno: 1.1.1
    merged: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    committer: Matthew Fuller <fullermd@over-yonder.net>
    branch nick: c
    timestamp: Thu 2006-10-19 10:18:07 -0500
    message:
      baz
------------------------------------------------------------
revno: 1
revision-id: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
committer: Matthew Fuller <fullermd@over-yonder.net>
branch nick: a
timestamp: Thu 2006-10-19 10:14:37 -0500
message:
  Foo


(I'll refer to revids by the last segment)

Note that this also shows the "left-most" parent distinction.  The
"left-most" parent of revno 2 (c3b406b8bcdfb537) is revno 1
(5b99dff6ed1d76cd), because that's the last thing I did in THIS
branch.  That's my 'mainline'; the commits from branch b
(2fe41e4949f5e237) and c (3d7047e387edcad9) are then additional
parents of the merge at revno 2.

The graph for branch a now looks something like (calling the 3
original commits 'a', 'b', and 'c' and the merge rev 'D'):

  a-.
  |\ \
  | b c
  |/ /
  D-'


The 2fe41e4949f5e237 rev is on branch b's mainline forever, and it has
a single-digit revno (2 in this case) on branch b, but it's not on
mine in a.  Now, let's pretend we're branch b, and we want to pick up
from a.  Because a is a superset of b, we could pull ('fast-forward')
a.  If we do that, the graph in b will be identical to a (and so 'log'
will be too).  That, AIUI, is what you'd do in git.

In the bzr methodology we've been discussing, where you want to
maintain your branch's identity, you'd instead merge from a into b.
You've got two new revisions to pick up in doing so; the
3d7047e387edcad9 from branch c, and the merge rev c3b406b8bcdfb537;
you already have 2fe41e4949f5e237 on your mainline.  So, post-merge,
the log for b will look like (somewhat trimmed for space):


------------------------------------------------------------
revno: 3
revision-id: fullermd@over-yonder.net-20061019153827-78d6209cd0f5f2f7
parent: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
parent: fullermd@over-yonder.net-20061019151856-c3b406b8bcdfb537
branch nick: b
    ------------------------------------------------------------
    revno: 1.1.1
    merged: fullermd@over-yonder.net-20061019151856-c3b406b8bcdfb537
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    parent: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
    parent: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
    branch nick: a
    ------------------------------------------------------------
    revno: 1.2.1
    merged: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    branch nick: c
------------------------------------------------------------
revno: 2
revision-id: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
branch nick: b
------------------------------------------------------------
revno: 1
revision-id: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
branch nick: a


The 2fe41e4949f5e237 which was originally on b's mainline is still on
the mainline at revno 2.  The graph in b now looks like (adding the
new 'E' merge commit)[0]:

  a-.
  |\ \
  b c |
  |\|/
  | D
  |/ 
  E


Now, "is that merge commit E really necessary, when you could just
attach D to the end of the graph?"  That would create something like:

  a-.
  |\ \
  b c |
  |/ /
  D-'

That is perhaps a useful question (and one that there's obviously
disagreement on).  And it may be a fruitful one to discuss, if we're
not way off in the weeds already.  But, it's also not QUITE the same
question as "Is the left-vs-other path distinction meaningful and to
be preserved?"



[0] For reference at this point:
    a: 5b99dff6ed1d76cd
    b: 2fe41e4949f5e237
    c: 3d7047e387edcad9
    D: c3b406b8bcdfb537
    E: 78d6209cd0f5f2f7


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-19 13:44                                           ` Matthieu Moy
@ 2006-10-19 16:03                                             ` Carl Worth
  2006-10-19 16:38                                               ` Matthieu Moy
  0 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-10-19 16:03 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Petr Baudis, bazaar-ng, git


On Thu, 19 Oct 2006 15:44:34 +0200, Matthieu Moy wrote:
> > The lack of parents ordering in Git is directly connected with
> > fast-forwarding.

Yes. We're identifying the core underlying technical difference behind
the recent discussion. Namely bzr treats one parent as special, (the
parent that was the branch tip previously). And this special treatment
eliminates the ability to fast-forward, adds merge commits that
wouldn't exist with fast forwarding, and is able to make its revision
numbers a bit more stable as a consequence.

> >    a       a
> >   / \     / \
> >  c   b   c   b
> >   \ /     \ /
> >    m       m
>
> Yes, bzr has similar thing too. AIUI, the difference is that git does
> it automatically, while bzr has two commands in its UI, "merge" and
> "pull".

There's a bit more to it than that though. The git command named
"pull" will perform a fast-forward if possible, but will create a
merge commit if necessary. For example:

	a       a                      a
	| pulls | and fast-forwards to |
	b       b                      b
	        |                      |
	        c                      c

whereas:

	a       a                       a
	| pulls | and creates a merge  / \
	b       c                     b   c
                                       \ /
                                        m

So I'm curious. What does bzr pull do in the case of divergence like
this? (And this is the "numbers will be changed" case, by the way).

> In your case, the "leftmost ancestor" of m is b, because at the time
> it was created, it was commited from b.

It should be mentioned that git can, (annoyingly not by default), save
a file detailing the history of a branch, (a time and a revision ID for
every time the branch tip moved). This is the "reflog" support and
provides the same information that bzr is encoding in its "leftmost
ancestor" branches.

Importantly, though, git's reflog is entirely local and is not
propagated by push/pull etc.

> One problem with that approach is that from revision m and looking
> backward in history (say, running "bzr log"), you have two ways to go
> backward:
>
> 1) Take the history of _your_ commits, and your pull till the point
>    where you've branched.
>
> 2) Follow the history taking the leftmost ancestor at each step.

Uhm, don't you really have to follow both? And the only ambiguity is
which one you see first?

>              In your scenario, repo1 would get a revision history of
> "a c m" while repo2 would have had "a b m" with the same tip.

OK. With git the two reflogs on the two machines would also have "a c
m" and "a b m". But is this the only kind of log that exists? If I
had code history as above and wanted to ask questions about what led
to commit m, then I would want to know about both b and c which
contribute to it.

And that's what "git log" provides. It lists all the commits that are
reachable from a given commit by following parent links. Surely bzr
has a way to view the complete history that way?
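The two views of history being contrasted here can be sketched on the
small a/b/c/m graph from above.  This is a toy Python model, not git or
bzr code; the dictionary of parent links and both function names are
assumptions of the example.

```python
def reachable(tip, parents):
    """Every commit reachable from tip by following all parent links."""
    seen = set()
    order = []
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        stack.extend(parents[commit])
    return order


def mainline(tip, parents):
    """Follow only the first (left-most) parent at each step."""
    path = [tip]
    while parents[path[-1]]:
        path.append(parents[path[-1]][0])
    return path


# m was committed on top of b and merged c, so b is listed first.
parents = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}

full_history = reachable("m", parents)
first_parent_history = mainline("m", parents)

print(sorted(full_history))
print(first_parent_history)
```

The full walk reports all four commits regardless of parent order; only
the first-parent walk depends on which parent is stored first.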

Meanwhile, I suggest that there really is no significance to which
parent of a commit used to have the branch head pointing at it. Saving
that information as part of the history is saving it in the wrong
place. It forces the user to have to be careful about which direction
merges happen, leading to awkward command sequences as demonstrated
above, (or daemons to hide them). And in the end, it's just not
important information to have saved in the permanent history.

It is useful in a transient sense to be able to say, (as git reflog
allows), what was my "master" branch pointing at yesterday, (because I
know the code was working before I merged in some bad code this
morning, for instance). But that's a local-only question and will
never have historical significance. "What was cworth's master branch
pointing at on 2006-10-18" is a question that nobody will ever need
the answer to in any historical sense.

-Carl

PS. Here are the commands that show the divergent pull example I gave
above with git:

# Start a new empty repository
$ mkdir git-example; cd git-example
$ git init-db
defaulting to local storage area

# Create initial commit 'a'
$ touch a; git add a; git commit -m "Initial commit of a"
Committing initial tree 496d6428b9cf92981dc9495211e6e1120fb6f2ba

# Create the 'b' commit on a new 'b' branch from 'a'
$ git checkout -b b; touch b; git add b; git commit -m "Add b on branch b"

# Create the 'c' commit on a new 'c' branch from 'a'
$ git checkout -b c master; touch c; git add c; git commit -m "Add c on branch c"

# Checkout the 'master' branch, (which is pointing at 'a')
$ git checkout master

# Merge the 'b' branch, (notice that this is a fast forward)
$ git pull . b
Updating from faf5f2f7363ef5de740193afd89bedee095ef966 to 141811d050aa7008f19867280c41405e05b3dbf7
Fast forward
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 b

# Now merge the 'c' branch (notice that this is not a fast
# forward, but instead creates a new merge commit)
$ git pull . c
Trying really trivial in-index merge...
Wonderful.
In-index merge
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 c

# Show the log of commits reachable from 'master', (all 4 commits)
$ git log
commit 59b3cdaf930824d4c0def4ba7ef9b913fcf05d96
Merge: 141811d... dfc35d5...
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:15:23 2006 -0700

    Merge branch 'c'

commit dfc35d5bd88b22f836bd6f46991169d3c3960b69
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:14:30 2006 -0700

    Add c on branch c

commit 141811d050aa7008f19867280c41405e05b3dbf7
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:14:10 2006 -0700

    Add b on branch b

commit faf5f2f7363ef5de740193afd89bedee095ef966
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:13:53 2006 -0700

    Initial commit of a



* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 14:55                                                         ` Linus Torvalds
@ 2006-10-19 16:07                                                           ` Jan Harkes
  2006-10-19 16:48                                                             ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Jan Harkes @ 2006-10-19 16:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

On Thu, Oct 19, 2006 at 07:55:18AM -0700, Linus Torvalds wrote:
> On Wed, 18 Oct 2006, Junio C Hamano wrote:
> >
> > Linus Torvalds <torvalds@osdl.org> writes:
> > >
> > > Actually, I've hit an impasse.
> > >
> > > So there's _another_ way of fixing a thin pack: it's to expand the objects 
> > > without a base into non-delta objects, and keeping the number of objects 
> > > in the pack the same. But _again_, we don't actually know which ones to 
> > > expand until it's too late.
> > 
> > pack-objects.c::write_one() makes sure that we write out base
> > immediately after delta if we haven't written out its base yet,
> > so I suspect if you buffer one delta you should be Ok, no?
> 
> It doesn't matter. I realized that my bogus patch to unpack-objects was 
> more seriously broken anyway: even the "un-deltify every single object" 
> was broken. And that's despite the fact that I _tested_ it, and verified 
> the end result by hand.
> 
> Why? Because I tested it within one repo, by just piping the output of 
> git-pack-objects --stdout directly to the repacker. That seemed to be a 
> good way to test it without setting up anything bigger. But it turns out 
> that it misses one of the big problems: if you don't unpack the objects in 
> a way that later phases can read, none of the streaming code works at all, 
> and you have to buffer up _everything_ in memory just to be able to read 
> any previous _non_delta objects too.

You are correct that it is not possible to create a pack with all
objects expanded in a single pass. But that doesn't mean that a single
pass conversion to a full pack is impossible.

If we find a delta against a base that is not found in our repository, we
can keep it as a delta; the base should show up later on in the
thin-pack. Whenever we find a delta against a base that we haven't seen
in the received part of the thin pack, but that is available from the
repository, we should expand it, because there is a chance we may not see
this base in the remainder of the thin-pack.
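The rule just described could be sketched roughly as follows.  This is
a toy Python model over object names only, not the real pack format or
any git API; the function name, the (object, base) tuples and the
"base already seen in the pack" case are assumptions added to make the
example complete.

```python
def plan_thin_pack(entries, repo_objects):
    """Decide, in one pass, what to do with each incoming pack entry.

    entries: list of (object_id, base_id) in pack order; base_id is
             None for a non-delta object.
    repo_objects: set of object ids already in the local repository.
    """
    seen = set()      # objects received so far in this pack
    awaiting = {}     # delta -> base we expect to arrive later
    plan = {}
    for obj, base in entries:
        if base is None:
            plan[obj] = "store as full object"
        elif base in seen:
            plan[obj] = "keep as delta"
        elif base in repo_objects:
            # The base may never appear in the pack, so expand now.
            plan[obj] = "expand using base from repository"
        else:
            # Base is unknown locally, so it has to come later in the pack.
            plan[obj] = "keep as delta, base expected later"
            awaiting[obj] = base
        seen.add(obj)
    # Deltas whose base never arrived would have to be reported as errors.
    dangling = sorted(o for o, b in awaiting.items() if b not in seen)
    return plan, dangling


entries = [("A", None), ("B", "A"), ("C", "X"), ("D", "Y"), ("Y", None)]
plan, dangling = plan_thin_pack(entries, repo_objects={"X"})
for name in "ABCDY":
    print(name, "->", plan[name])
print("dangling:", dangling)
```

The sketch only classifies entries; it does not address how much object
data has to be held in memory to actually write the result.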

> So my patch-series works - but it only works in a repo that already has 
> all the objects in question, because then it can look up the objects in 
> the original database. Which makes it useless. Duh.

About that patch series, is there a simple way to import the series into
a local repository? git-am doesn't like it, even after splitting it into
separate files on the linebreaks. I guess git-mailinfo could be taught
to recognise the git-log headers. Or have I missed some useful git apply
trick?

Jan


* Re: VCS comparison table
  2006-10-19 15:25                                   ` Linus Torvalds
@ 2006-10-19 16:13                                     ` Matthew D. Fuller
  2006-10-19 16:49                                       ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 16:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Carl Worth, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

On Thu, Oct 19, 2006 at 08:25:26AM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> The biggest difference seems to be that in bzr, the final checksum
> is 64-bit,

Actually, as best I know, it's not a checksum, just random bits (a
quick glance at the code seems to agree with me).


> Note that from a usability standpoint, the UUID's look more readable
> to a human, but are actually much worse [...]

This I agree with, at least in part.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-19 14:57                                         ` Tim Webster
  2006-10-19 15:30                                           ` Aaron Bentley
@ 2006-10-19 16:14                                           ` Matthieu Moy
  2006-10-20  3:40                                             ` Tim Webster
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-19 16:14 UTC (permalink / raw)
  To: Tim Webster; +Cc: Christian MICHON, Andreas Ericsson, bazaar-ng, git

"Tim Webster" <tdwebste@gmail.com> writes:

> First I want to say every SCM I know of sucks when it comes to tracking
> configurations, simply because they don't record or restore file metadata,
> like perms, ownership, and acl.

That's not a simple matter.

Tracking ownership hardly makes sense as soon as you have two
developers on the same project. What does it mean to checkout a file
belonging to user foo and group bar on a system not having such user
and group?

Just restoring the complete user/group/other rwx permission is already
a mess. In my experience (GNU Arch did this):

1) It sucks ;-). I work with umask 022 so that my colleagues can
   "cp -r" from me; working on a project with people who have umask
   077, I got some files that were not readable and some that were,
   in short, a mess. *I* have set my umask, and *I* want my tools to
   obey.

2) It's a security hole. If you work with people having umask=002 (not
   indecent if your default group contains just you), you end up with
   group-writable files in your ${HOME}.
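For reference, here is a small Python sketch of which permission bits
survive each of the umask values mentioned above, assuming a tool asks
for mode 0666 on a regular file (the usual request for non-executable
files; the helper name is made up for the example).

```python
import stat


def resulting_mode(requested, umask):
    """Mode a newly created file ends up with under the given umask."""
    return requested & ~umask


for umask in (0o022, 0o077, 0o002):
    mode = resulting_mode(0o666, umask)
    print(
        "umask %03o -> mode %03o" % (umask, mode),
        "group-writable" if mode & stat.S_IWGRP else "",
        "other-writable" if mode & stat.S_IWOTH else "",
    )
```

A tool that restores the recorded mode bits verbatim bypasses this
calculation on the checking-out side, which is the complaint in both
points.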

That said, it can be interesting to have it, but disabled by default.

The 'x' bit, OTOH, is definitely useful.


* Re: VCS comparison table
  2006-10-19 16:03                                             ` Carl Worth
@ 2006-10-19 16:38                                               ` Matthieu Moy
  2006-10-20 11:24                                                 ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-19 16:38 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, Petr Baudis, git

Carl Worth <cworth@cworth.org> writes:

> Yes. We're identifying the core underlying technical difference behind
> the recent discussion. Namely bzr treats one parent as special, (the
> parent that was the branch tip previously). And this special treatment
> eliminates the ability to fast-forward, 

No.

bzr could trivially do fast-forward too. It's an explicit design
decision to have two separate commands.

> adds merge commits that wouldn't exist with fast forwarding,

They don't exist either with "pull".

The difference between bzr and git on this point is smaller than you
think, I believe.

> There's a bit more to it than that though. The git command named
> "pull" will perform a fast-forward if possible, but will create a
> merge commit if necessary. For example:

The bzr command "pull" will do a fast-forward if possible, but will
refuse to continue and ask you to create the merge commit with other
commands if necessary.

> 	a       a                      a
> 	| pulls | and fast-forwards to |
> 	b       b                      b
> 	        |                      |
> 	        c                      c

Same as bzr.

> whereas:
>
>         a       a                       a
>         | pulls | and creates a merge  / \
>         b       c                     b   c
>                                        \ /
>                                         m

Here, bzr will refuse to pull. It will say "branches have diverged"
and tell you to use merge.

Then, you'll do

$ bzr merge

# optionally "bzr status"

$ bzr commit -m "merged such or such thing"


So, "git pull" seems roughly equivalent to something like

$ bzr pull || (bzr merge; bzr commit -m merge)
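The decision both tools make can be sketched on a toy DAG.  This is a
Python model of the behaviour as described in this message, not code
from either tool; the function names, the returned strings and the
merge_on_divergence flag are assumptions of the example.

```python
def ancestors(tip, parents):
    """All commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def pull(local_tip, remote_tip, parents, merge_on_divergence):
    """Model of 'pull': fast-forward when possible, else merge or refuse."""
    if remote_tip in ancestors(local_tip, parents):
        return "already up to date"
    if local_tip in ancestors(remote_tip, parents):
        return "fast-forward"
    if merge_on_divergence:
        return "merge commit"                  # git-style pull
    return "refuse: branches have diverged"    # bzr-style pull


linear = {"a": [], "b": ["a"], "c": ["b"]}
forked = {"a": [], "b": ["a"], "c": ["a"]}

print(pull("b", "c", linear, merge_on_divergence=True))
print(pull("b", "c", linear, merge_on_divergence=False))
print(pull("b", "c", forked, merge_on_divergence=True))
print(pull("b", "c", forked, merge_on_divergence=False))
```

In the linear case both variants fast-forward; they only differ once
the two tips have diverged.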

> So I'm curious. What does bzr pull do in the case of divergence like
> this? (And this is the "numbers will be changed" case, by the way).

Not yet. The "numbers will be changed" is if b pulls, right after.


Then, one other difference is in the UI. bzr shows you commits in a
kind of hierarchical manner, like this (a made-up example, not the
exact real format):

$ bzr log
committer: upstream@maintainer.com
message:
  merged the work on a feature
  ------
  committer: contributor@site.com
  message:
    prepared for feature X
  ------
  committer: contributor@site.com
  message:
    implemented feature X
  ------
  committer: contributor@site.com
  message:
    added testcase for feature X
------
committer: upstream@maintainer.com
message:
  something else

No big difference in the model either, but it probably reveals a
different vision of what "history" means.

-- 
Matthieu


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 16:07                                                           ` Jan Harkes
@ 2006-10-19 16:48                                                             ` Linus Torvalds
  2006-10-20  0:20                                                               ` Jan Harkes
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19 16:48 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Junio C Hamano, git



On Thu, 19 Oct 2006, Jan Harkes wrote:
> 
> If we find a delta against a base that is not found in our repository we
> can keep it as a delta, the base should show up later on in the
> thin-pack. Whenever we find a delta against a base that we haven't seen
> in the received part of the thin pack, but is available from the
> repository we should expand it because there is a chance we may not see
> this base in the remainder of the thin-pack.

Yes, indeed. We can also have another heuristic: if we find a delta, and 
we haven't seen the object it deltas against, we can still keep it as a 
delta IF WE ALSO DON'T ALREADY HAVE THE BASE OBJECT. Because then we know 
that the base object has to be there later in the pack (or we have a 
dangling delta, which we'll just consider an error).

So yeah, maybe my patch-series is something we can still save.

However, the thing that makes me suspect that it is _not_ saveable, is 
this:

 - let's assume we have a nice thin pack, with object A B C D (in that 
   order), which is actually a good pack in itself (ie it _might_ be thin, 
   but it's actually self-sufficient)

 - let A be a full object, and B be packed as a delta off A, C as a delta 
   off B, and D as a delta off C.

 - Try to repack it as a streaming thing (the end result _should_ 
   obviously be exactly the same as the input, since it turns out to be 
   self-sufficient)

Looks trivial, no?

The answer is: no. It's not trivial. Or rather, it _is_ trivial, but you 
have to _remember_ all of the actual data for A, B, C and D all the way to 
the end, because only if you have that data in memory can you actually 
_recreate_ B, C and D even enough to get their SHA1's (which you need, 
just in order to know that the pack is complete, much less to be able to 
create a non-delta version in case it hadn't been).

So we can definitely do the one-pass creation, but it requires that we 
keep track of everything we've expanded so far in memory (because we won't 
have the data available any other way - we don't have them as objects in 
our object database, and we don't have a good new pack yet).

But if you do that, then yes, it's salvageable.
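
To make the memory cost concrete, here is a small Python sketch of the
one-pass expansion of the A, B, C, D chain. The copy/insert delta
encoding is a simplified stand-in invented for this example (it is not
git's actual pack delta format), and stream_pack() and apply_delta()
are hypothetical names. Only the blob naming rule (SHA1 over a
"blob <size>\0" header plus the contents) follows what git really does.

```python
import hashlib


def blob_sha1(data):
    """Object name git gives a blob with these contents."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def apply_delta(base, ops):
    """Rebuild an object from `base` using toy copy/insert instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:  # ("insert", literal_bytes)
            out += op[1]
    return bytes(out)


def stream_pack(entries):
    """Expand a pack in one pass, keeping every expanded object in memory.

    Each entry is ("full", data) or ("delta", base_sha1, ops).  Returns a
    dict mapping object name to contents; raises KeyError when a delta's
    base has not been seen earlier in the stream.
    """
    expanded = {}
    for entry in entries:
        if entry[0] == "full":
            data = entry[1]
        else:
            _, base_sha1, ops = entry
            data = apply_delta(expanded[base_sha1], ops)
        # The name is only known once the full contents are rebuilt.
        expanded[blob_sha1(data)] = data
    return expanded


obj_a = b"hello world\n"
obj_b = b"hello brave world\n"
obj_c = b"hello brave new world\n"
obj_d = b"hello brave new world!\n"
pack = [
    ("full", obj_a),
    ("delta", blob_sha1(obj_a),
     [("copy", 0, 6), ("insert", b"brave "), ("copy", 6, 6)]),
    ("delta", blob_sha1(obj_b),
     [("copy", 0, 12), ("insert", b"new "), ("copy", 12, 6)]),
    ("delta", blob_sha1(obj_c),
     [("copy", 0, 21), ("insert", b"!\n")]),
]
objects = stream_pack(pack)
print(len(objects), "objects held in memory at the end of the stream")
```

The `expanded` dict is the point of the sketch: nothing can be dropped
from it before the end of the stream, because a later delta may name any
earlier object as its base.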

> About that patch series, is there a simple way to import the series into
> a local repository? git-am doesn't like it, even after splitting it into
> separate files on the linebreaks. I guess git-mailinfo could be taught
> to recognise the git-log headers. Or have I missed some useful git apply
> trick.

No, you've not missed anything. I didn't really expect anybody to want to 
seriously play with it, so I didn't bother to do things properly. 

Especially since I hadn't even written very good commit messages.

Anyway, I just pushed the "rewrite-pack" branch to my git repo on 
kernel.org, so once it mirrors out, if you really want to try to fix up 
the mess I left behind, there it is:

	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git rewrite-pack

Maybe it's recoverable. 

		Linus

* Re: VCS comparison table
  2006-10-19 16:13                                     ` Matthew D. Fuller
@ 2006-10-19 16:49                                       ` Linus Torvalds
  2006-10-19 18:30                                         ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19 16:49 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Andreas Ericsson, Carl Worth, bazaar-ng, git, Jakub Narebski



On Thu, 19 Oct 2006, Matthew D. Fuller wrote:

> On Thu, Oct 19, 2006 at 08:25:26AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> > 
> > The biggest difference seems to be that in bzr, the final checksum
> > is 64-bit,
> 
> Actually, as best I know, it's not a checksum, just random bits (a
> quick glance at the code seems to agree with me).

Ahh. They may be that even in BK. I know BK had various 16-bit CRC 
checksums, but they were probably on the actual _file_ contents, not in 
the key itself.

		Linus

* Re: VCS comparison table
  2006-10-19 14:58                                   ` Aaron Bentley
@ 2006-10-19 16:59                                     ` Carl Worth
  2006-10-19 23:01                                       ` Aaron Bentley
  2006-10-19 17:01                                     ` Carl Worth
  1 sibling, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-10-19 16:59 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 7913 bytes --]

On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:
> >> In bzr development, it's very rare for anyone's revision numbers to change.
> >
> > Which just says to me that the bzr developers really are sticking to a
> > centralized model.
>
> I don't see why you're reaching that conclusion.  I'd like to understand
> that better, because Linus seems to be concluding the same thing, and it
> doesn't make sense to me.

First, I want to point out that I think we're having a delightfully
enlightening conversation here, and I'm glad for that.

Let me provide a couple of hypothetical situations to try to
demonstrate my thinking here. The first is far-fetched but perhaps
easier to understand the implications. But the second is the real,
everyday situation that is much more important.

Far-fetched
-----------
Let's imagine there's a complete fork in the bzr codebase tomorrow. We
need not suppose any acrimony, just an amiable split as two subsets of
the team start taking the code in different directions.

Now, at the time of the fork, all published revision numbers apply
equally well to either team's codebase, (obviously, since they are
identical). But as the projects diverge they each start publishing
revision numbers with respect to their own repositories in their own
bug trackers, etc. Obviously, each project has its own "mainline" so
these new revision numbers are only unique within each project and not
between the two.

Time passes...

Finally the two teams (who had remained good friends after the
breakup) find a unifying theory that will let them work on a single
tool that will meet the needs of both user bases. So they want to
merge their code together.

After the merge, there can be only one mainline, so one team or the
other will have to concede to give up the numbers they had generated
and published during the fork. That is, the numbers will not be usable
within the new, merged repository.

Everyday
--------
Now, the above scenario is just silly. It's not likely to ever happen,
so it's really not worth considering as a motivating case.

But, what does (and should) happen every day is exactly the same. So
here's a realistic situation that is worth considering:

An individual takes the bzr codebase and starts working on it. It's
experimental stuff, so it's not pushed back into the central
repository yet. But our coder isn't a total recluse, so his friends
help him with the code he's working on. They communicate about their
work, (perhaps on the main bzr mailing list), and make statements such
as "feature F is working perfectly as of version V".

But for these communications, revision numbers will not provide
historically stable values that can be used. It's impossible for our
coder to predict the numbers that will be assigned to his code when
they get merged back into the mainline---since some other unknown
programmer may have branched at exactly the same point and is trying
to make the same determination. Neither programmer can know which code
will land first, so neither can know what numbers will get assigned,
right?

Now, the programmers could get stable numbers by keeping the branch in
the main tree, or by at least pushing out the branching point to
"reserve" a number in the main tree.

So, the only way to get stable numbers is to rely on this central
tree.

Does that make sense?
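
Here is a toy Python model of that argument. It is deliberately
simplistic and is not bzr's actual numbering scheme: land() simply
numbers commits by the order in which they reach a shared mainline, and
commit_id() derives an identifier from a commit's parents and message.
Both function names, and the "alice"/"bob" commits, are made up for the
illustration.

```python
import hashlib


def commit_id(parent_ids, message):
    """Content-derived identifier: depends only on parents and message."""
    payload = "\n".join(list(parent_ids) + [message]).encode()
    return hashlib.sha1(payload).hexdigest()


def land(mainline, branches):
    """Append each branch's commits to a copy of mainline, in landing
    order, and return {commit_id: revision_number}."""
    result = list(mainline)
    for branch in branches:
        result.extend(branch)
    return {cid: number for number, cid in enumerate(result, start=1)}


base = commit_id([], "base")
alice = commit_id([base], "feature F")
bob = commit_id([base], "feature G")

alice_first = land([base], [[alice], [bob]])
bob_first = land([base], [[bob], [alice]])

# The number of Alice's commit depends on who lands first ...
print(alice_first[alice], bob_first[alice])
# ... while its content-derived identifier was fixed when she committed.
print(alice[:12])
```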

> That doesn't follow.  Just because something is arguably true doesn't
> make it bad.  And in this case, I'm not arguing that it's true, I'm
> saying that it's true, because that is what my experience tells me is true.

[I'm sorry, but I didn't grasp this sentence. I think I lost the
antecedent of "it" somewhere.]

> > In cairo, for example, we've made a habit of including a revision
> > identifier in our bug tracking system for every commit that resolves a
> > bug.
>
> We do it the other way around: we put a bug number in the commit
> message.

Oh, we do that too. That number is important, (for "what the heck is
this commit trying to do, and why", since (sadly) much of the why ends
up getting stuck off in external bug tracking tools). But the reverse
direction is also important, ("Hey, this bug got fixed in the
development version, but I want to backport it to my distribution
package. Where can I find it?").

>          And I personally have been developing a bugtracker that is
> distributed in the same way bzr is; it stores bug data in the source
> tree of a project, so that bug activities follow branches around.

That kind of thing sounds very useful. As I've been talking about
"numbers" here in bug trackers and mailing lists, it should be obvious
that I consider the information stored in such systems an important
part of the history of a code project. So it would be nice if all of
that history were stored in an equally reliable system in some way.

> On the other hand, I think your revision identifiers are not as
> permanent as you think.
>
> In the first place, it seems fairly common in the Git community to
> rebase.  This process throws away old revisions and creates new
> revisions that are morally equivalent[1].

Yes, rebasing does "destroy history" in one sense, (in actual fact, it
creates new commits and leaves the old ones around, which may or may
not have references to them anymore). But it's definitely not common
for git users to use rebase in a situation where it would change any
published number.

For example, I regularly use git-rebase, (and similar "git-commit
 --amend"), as I'm putting together a new branch that exists only
in a repository on my laptop with nobody having external visibility to
it.

So, if I see a typo in a commit and I've never pushed it anywhere,
I'll just "git commit --amend" to fix it. But if I see that typo only
after I push out the change, then I just make a new commit to fix it,
(and suck up the fact that my mistake will be a permanent part of the
history).

And git helps with this as well. If I ever forget that I've already
pushed a change and then I rebase, then the next time I try to push,
git will complain that I'm attempting to throw away history on the
remote end, and will refuse to cooperate, (unless I force it).

There's a similar safety mechanism on the pull side. If I did force a
history-rewriting push, then users who tried to pull it would also
have to force git's hand before it would rewrite their history.

[By the way, it is sometimes useful to make chaotic, regularly-rebased
branches visible to others, so they can watch what's going on. (Junio
does this with his "proposed updates (pu)" branch in his repository
for git itself, for example). It's just that such branches should
never be used to start new development if they expect to pull from the
branch again later, nor should the revision numbers of such a branch
ever be considered permanent, nor published anywhere.]

> In the second place, one must consider the "nuclear launch codes"
> scenario.

Sure. And git does provide tools that can do this. Of course, the
"normal" tools strictly add new commits and move branches (which are
no more than references to commits) around. But moving branches can
leave commits unreferenced. And a "prune" command does exist, (which
isn't needed in "normal" use), which will delete unreferenced objects.
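
A minimal Python sketch of that idea, modelling only commits (a real
repository also holds trees and blobs) and using hypothetical
reachable() and prune() helpers rather than anything from git itself:

```python
def reachable(graph, refs):
    """Return the set of commits reachable from any branch in `refs`."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(graph[commit])
    return seen


def prune(graph, refs):
    """Return a copy of `graph` without the unreferenced commits."""
    keep = reachable(graph, refs)
    return {c: parents for c, parents in graph.items() if c in keep}


# "secret" is the commit that should never have been published.
graph = {"a": [], "b": ["a"], "secret": ["b"], "c": ["b"]}
refs = {"master": "secret"}

refs["master"] = "c"          # move the branch; nothing is deleted yet
print(sorted(graph))          # the old commit is still stored

graph = prune(graph, refs)    # now the unreferenced commit goes away
print(sorted(graph))
```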

-Carl

> [1] This is a process that I find discomforting, because I consider the
> original revisions to be real, historical data, and I don't like the
> idea of throwing it away.

As I mentioned above, they aren't thrown away. I often use rebase when
re-building an ugly series of patches into a nice clean set of
patches. And in that situation, I might rebase from the old to the
new, but still with a reference to the old branch until I'm done with
the entire process. And it's perfectly possible, and legitimate that
such a reference has been published and the old branch will live
"forever" even if I rebased it. So rebase isn't necessarily
destructive.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

* Re: VCS comparison table
  2006-10-19 14:58                                   ` Aaron Bentley
  2006-10-19 16:59                                     ` Carl Worth
@ 2006-10-19 17:01                                     ` Carl Worth
  2006-10-19 17:14                                       ` J. Bruce Fields
  1 sibling, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-10-19 17:01 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 7913 bytes --]

On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:
> >> In bzr development, it's very rare for anyone's revision numbers to change.
> >
> > Which just says to me that the bzr developers really are sticking to a
> > centralized model.
>
> I don't see why you're reaching that conclusion.  I'd like to understand
> that better, because Linus seems to be concluding the same thing, and it
> doesn't make sense to me.

First, I want to point out that I think we're having a delightfully
enlightening conversation here, and I'm glad for that.

Let me provide a couple of hypothetical situations to try to
demonstrate my thinking here. The first is far-fetched but perhaps
easier to understand the implications. But the second is the real,
everyday situation that is much more important.

Far-fetched
-----------
Let's imagine there's a complete fork in the bzr codebase tomorrow. We
need not suppose any acrimony, just an amiable split as two subsets of
the team start taking the code in different directions.

Now, at the time of the fork, all published revision numbers apply
equally well to either team's codebase, (obviously, since they are
identical). But as the projects diverge they each start publishing
revision numbers with respect to their own repositories in their own
bug trackers, etc. Obviously, each project has its own "mainline" so
these new revision numbers are only unique within each project and not
between the two.

Time passes...

Finally the two teams (who had remained good friends after the
breakup) find a unifying theory that will let them work on a single
tool that will meet the needs of both user bases. So they want to
merge their code together.

After the merge, there can be only one mainline, so one team or the
other will have to concede to give up the numbers they had generated
and published during the fork. That is, the numbers will not be usable
within the new, merged repository.

Everyday
--------
Now, the above scenario is just silly. It's not likely to ever happen,
so it's really not worth considering as a motivating case.

But, what does (and should) happen every day is exactly the same. So
here's a realistic situation that is worth considering:

An individual takes the bzr codebase and starts working on it. It's
experimental stuff, so it's not pushed back into the central
repository yet. But our coder isn't a total recluse, so his friends
help him with the code he's working on. They communicate about their
work, (perhaps on the main bzr mailing list), and make statements such
as "feature F is working perfectly as of version V".

But for these communications, revision numbers will not provide
historically stable values that can be used. It's impossible for our
coder to predict the numbers that will be assigned to his code when
they get merged back into the mainline---since some other unknown
programmer may have branched at exactly the same point and is trying
to make the same determination. Neither programmer can know which code
will land first, so neither can know what numbers will get assigned,
right?

Now, the programmers could get stable numbers by keeping the branch in
the main tree, or by at least pushing out the branching point to
"reserve" a number in the main tree.

So, the only way to get stable numbers is to rely on this central
tree.

Does that make sense?

> That doesn't follow.  Just because something is arguably true doesn't
> make it bad.  And in this case, I'm not arguing that it's true, I'm
> saying that it's true, because that is what my experience tells me is true.

[I'm sorry, but I didn't grasp this sentence. I think I lost the
antecedent of "it" somewhere.]

> > In cairo, for example, we've made a habit of including a revision
> > identifier in our bug tracking system for every commit that resolves a
> > bug.
>
> We do it the other way around: we put a bug number in the commit
> message.

Oh, we do that too. That number is important, (for "what the heck is
this commit trying to do, and why", since (sadly) much of the why ends
up getting stuck off in external bug tracking tools). But the reverse
direction is also important, ("Hey, this bug got fixed in the
development version, but I want to backport it to my distribution
package. Where can I find it?").

>          And I personally have been developing a bugtracker that is
> distributed in the same way bzr is; it stores bug data in the source
> tree of a project, so that bug activities follow branches around.

That kind of thing sounds very useful. As I've been talking about
"numbers" here in bug trackers and mailing lists, it should be obvious
that I consider the information stored in such systems an important
part of the history of a code project. So it would be nice if all of
that history were stored in an equally reliable system in some way.

> On the other hand, I think your revision identifiers are not as
> permanent as you think.
>
> In the first place, it seems fairly common in the Git community to
> rebase.  This process throws away old revisions and creates new
> revisions that are morally equivalent[1].

Yes, rebasing does "destroy history" in one sense, (in actual fact, it
creates new commits and leaves the old ones around, which may or may
not have references to them anymore). But it's definitely not common
for git users to use rebase in a situation where it would change any
published number.

For example, I regularly use git-rebase, (and similar "git-commit
 --amend"), as I'm putting together a new branch that exists only
in a repository on my laptop with nobody having external visibility to
it.

So, if I see a typo in a commit and I've never pushed it anywhere,
I'll just "git commit --amend" to fix it. But if I see that typo only
after I push out the change, then I just make a new commit to fix it,
(and suck up the fact that my mistake will be a permanent part of the
history).

And git helps with this as well. If I ever forget that I've already
pushed a change and then I rebase, then the next time I try to push,
git will complain that I'm attempting to throw away history on the
remote end, and will refuse to cooperate, (unless I force it).

There's a similar safety mechanism on the pull side. If I did force a
history-rewriting push, then users who tried to pull it would also
have to force git's hand before it would rewrite their history.

[By the way, it is sometimes useful to make chaotic, regularly-rebased
branches visible to others, so they can watch what's going on. (Junio
does this with his "proposed updates (pu)" branch in his repository
for git itself, for example). It's just that such branches should
never be used to start new development if they expect to pull from the
branch again later, nor should the revision numbers of such a branch
ever be considered permanent, nor published anywhere.]

> In the second place, one must consider the "nuclear launch codes"
> scenario.

Sure. And git does provide tools that can do this. Of course, the
"normal" tools strictly add new commits and move branches (which are
no more than references to commits) around. But moving branches can
leave commits unreferenced. And a "prune" command does exist, (which
isn't needed in "normal" use), which will delete unreferenced objects.

-Carl

> [1] This is a process that I find discomforting, because I consider the
> original revisions to be real, historical data, and I don't like the
> idea of throwing it away.

As I mentioned above, they aren't thrown away. I often use rebase when
re-building an ugly series of patches into a nice clean set of
patches. And in that situation, I might rebase from the old to the
new, but still with a reference to the old branch until I'm done with
the entire process. And it's perfectly possible, and legitimate that
such a reference has been published and the old branch will live
"forever" even if I rebased it. So rebase isn't necessarily
destructive.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

* Re: VCS comparison table
  2006-10-19 16:01                                         ` Matthew D. Fuller
@ 2006-10-19 17:06                                           ` Matthew D. Fuller
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 17:06 UTC (permalink / raw)
  To: Petr Baudis
  Cc: bazaar-ng, Karl Hasselström, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git, Jakub Narebski

On Thu, Oct 19, 2006 at 11:01:03AM -0500 I heard the voice of
Matthew D. Fuller, and lo! it spake thus:
>
> Now, the question of "is that merge commit E really necessary, when
> you could just attach D to the end of the graph and create something
> like [...] is perhaps a useful question (and one that there's
> obviously disagreement on).  And it may be a fruitful one to
> discuss, if we're not way off in the weeds already.
>
> But, it's also not QUITE the same question as "Is the left-vs-other
> path distinction meaningful and to be preserved?"

Let me elaborate a little on this.

bzr COULD create

>   a-.
>   |\ \
>   b c |
>   |/ /
>   D-'

instead of

>   a-.
>   |\ \
>   b c |
>   |\|/
>   | D
>   |/ 
>   E

for the previously discussed merge, basically duplicating
'fast-forward' behavior.  It doesn't currently, but it could just as
well without disturbing the attributes it gains from assigning meaning
to the left-most parent.  The choice to create E is the result of an
independent decision from the choice to treat the left path as
special.


What the leftmost discussion impacts is the case of 

    a-.
    |\ \
    | b c
    |/ /
    D-'

vs

    a-.-.
     \ \ \
      b c |
     / / /
    D-'-'

Now, the branches are distinct to bzr, but they're not different.  If
you try to merge one from the other, merge will quite rightly tell you
there's nothing to do, since you both have all the same revs.  git
doesn't recognize the distinction at all, of course.  The difference
is mostly cosmetic.  But, it's a cosmetic difference that bzr devs
(and users, I venture) find _useful_, which is why it's fought for.
And everything else seems to follow from that.
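
A small Python sketch of what "assigning meaning to the left-most
parent" amounts to. It uses a simpler two-parent merge of my own rather
than the exact diagrams above, and mainline() and all_revisions() are
hypothetical helper names: the two graphs hold the same revisions, and
only the order of the merge's parents differs.

```python
def mainline(parents, head):
    """Follow the left-most (first) parent from `head` back to the root."""
    path = [head]
    while parents[path[-1]]:
        path.append(parents[path[-1]][0])
    return path


def all_revisions(parents, head):
    """Every revision reachable from `head`, ignoring parent order."""
    seen, stack = set(), [head]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen


# Same revisions, but the merge m lists its parents in a different order.
merged_into_b = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}
merged_into_c = {"a": [], "b": ["a"], "c": ["a"], "m": ["c", "b"]}

print(mainline(merged_into_b, "m"))
print(mainline(merged_into_c, "m"))
print(all_revisions(merged_into_b, "m") == all_revisions(merged_into_c, "m"))
```

A merge between the two has nothing to do because the revision sets are
equal; only a view that walks the first parent, such as a log that
indents merged revisions, can tell them apart.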

If you don't think the distinction is meaningful or useful, you can
ignore it, and the tool should work just fine.  The main place the
distinction would show up is in the cosmetics of how "log" looks (and
probably similarly in any tool that graphically describes ancestry),
and a custom log output formatter could probably be very easily
written to obviate even that.

* Re: VCS comparison table
  2006-10-19 17:01                                     ` Carl Worth
@ 2006-10-19 17:14                                       ` J. Bruce Fields
  2006-10-20 14:31                                         ` Jeff King
  0 siblings, 1 reply; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-19 17:14 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

On Thu, Oct 19, 2006 at 10:01:33AM -0700, Carl Worth wrote:
> On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:
> > On the other hand, I think your revision identifiers are not as
> > permanent as you think.
> >
> > In the first place, it seems fairly common in the Git community to
> > rebase.  This process throws away old revisions and creates new
> > revisions that are morally equivalent[1].
> 
> Yes, rebasing does "destroy history" in one sense, (in actual fact, it
> creates new commits and leaves the old ones around, which may or may
> not have references to them anymore).

Note that the id's are still permanent in this case; they will never
(modulo some assumptions about the crypto) be reused.  So a given id
points at one and only one object, for all time; it's just that we may
forget what that one object is....

> > In the second place, one must consider the "nuclear launch codes"
> > scenario.
> 
> Sure. And git does provide tools that can do this.

So in this case you can certainly lose the launch codes.  But you have
forever granted everyone a way to determine whether a given guess at the
launch codes is correct.  (Again, assuming some stuff about SHA1).
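
A sketch of what that check looks like, assuming the secret was stored
as a whole file (a blob) and that its object id was published at some
point. The file contents below are of course made up, and
guess_matches() is a hypothetical helper; blob_sha1() follows git's
rule of hashing a "blob <size>\0" header followed by the contents.

```python
import hashlib


def blob_sha1(data):
    """Object name git assigns to a blob holding exactly `data`."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def guess_matches(guess, published_id):
    """True if `guess` hashes to the identifier that was once published."""
    return blob_sha1(guess) == published_id


# Hypothetical secret that was committed, published, and later pruned.
published_id = blob_sha1(b"launch code: 0000\n")

print(guess_matches(b"launch code: 1234\n", published_id))
print(guess_matches(b"launch code: 0000\n", published_id))
```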

--b.

* Re: VCS comparison table
  2006-10-19 16:49                                       ` Linus Torvalds
@ 2006-10-19 18:30                                         ` Linus Torvalds
  2006-10-19 18:54                                           ` Matthieu Moy
  2006-10-19 19:16                                           ` Junio C Hamano
  0 siblings, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19 18:30 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Andreas Ericsson, bazaar-ng, git, Jakub Narebski



On Thu, 19 Oct 2006, Linus Torvalds wrote:
> 
> Ahh. They may be that even in BK. I know BK had various 16-bit CRC 
> checksums, but they were probably on the actual _file_ contents, not in 
> the key itself.

Btw, I do believe that bzr seems to be acting a lot like BK, at least when 
it comes to versioning. I suspect that is not entirely random either, and 
I suspect it's been a conscious effort to some degree.

Which is fine, in the sense that there are certainly much worse things to 
try to copy.

That said, at least BK was up-front about the versions changing, and 
didn't try to do anything to hinder it. It still confused some people, and 
it wasn't a great naming system, but it did work.

In the big picture, the version naming between BK and git hasn't been an 
issue for anybody in practice, I suspect.

So if you want to look at features that actually matter more, try out 
something like

	gitk drivers/scsi include/scsi

on the kernel archive (I assume that somebody has tried importing the 
kernel git tree into bzr - quite frankly, if bzr cannot handle that size 
tree without problems, you have much bigger issues!).

In other words, being able to look at history of more than a single file 
has been a _huge_ bonus. 
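
As a very rough Python sketch of the idea (not of how git or gitk
implement it, since real path limiting also simplifies the commit graph
rather than just filtering a list), here is history limited to a set of
directories. The commit names and the limit_history() and touches()
helpers are invented for the example.

```python
def touches(changed_paths, prefixes):
    """True if any changed path is one of `prefixes` or lies below it."""
    for path in changed_paths:
        for prefix in prefixes:
            prefix = prefix.rstrip("/")
            if path == prefix or path.startswith(prefix + "/"):
                return True
    return False


def limit_history(history, prefixes):
    """Keep only the commits that changed something under `prefixes`."""
    return [name for name, changed in history if touches(changed, prefixes)]


# (commit name, files changed by that commit); the names are made up.
history = [
    ("c1", ["drivers/scsi/sd.c"]),
    ("c2", ["fs/ext3/inode.c"]),
    ("c3", ["include/scsi/scsi.h", "kernel/sched.c"]),
    ("c4", ["drivers/net/tg3.c"]),
]

print(limit_history(history, ["drivers/scsi", "include/scsi"]))
```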

The other big difference is being able to do merges in seconds. The 
biggest cost of doing a big merge these days seems to literally be 
generating the diffstat of the changes at the end (which is purely a UI 
issue, but one that I find so important that I'll happily take the extra 
few seconds for that, even if it sometimes effectively doubles the 
overhead).

Looking at the dates of the merges yesterday, they're literally half a 
minute apart, and that's not me _scripting_ them - that's me actually 
looking up the emails, typing in the "git pull " and pasting the source 
repository, and git fetching the data over the network and merging it, and 
checking out the result (and me verifying that the resulting diffstat 
matches what the email says). Doing four of those in a row in less than 
two minutes is actually a really big deal.

At some point, "performance" is more than just a question of how fast 
things are, it becomes a big part of usability.

			Linus

* Re: VCS comparison table
  2006-10-19 18:30                                         ` Linus Torvalds
@ 2006-10-19 18:54                                           ` Matthieu Moy
  2006-10-19 20:47                                             ` Linus Torvalds
  2006-10-19 23:28                                             ` Ryan Anderson
  2006-10-19 19:16                                           ` Junio C Hamano
  1 sibling, 2 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-19 18:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: bazaar-ng, Matthew D. Fuller, Carl Worth, Andreas Ericsson, git,
	Jakub Narebski

Linus Torvalds <torvalds@osdl.org> writes:

> Btw, I do believe that bzr seems to be acting a lot like BK, at least when 
> it comes to versioning. I suspect that is not entirely random either, and 
> I suspect it's been a conscious effort to some degree.
>
> Which is fine, in the sense that there are certainly much worse things to 
> try to copy.

Out of curiosity, how would you compare git and Bitkeeper, on a purely
technical basis? (not asking for a detailed comparison, but an "X is
globally/much/terribly/not better than Y" kind of statement ;-) )

* Re: VCS comparison table
  2006-10-17  4:31           ` Aaron Bentley
@ 2006-10-19 19:01             ` Nathaniel Smith
  2006-10-20 10:32               ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Nathaniel Smith @ 2006-10-19 19:01 UTC (permalink / raw)
  To: git

Aaron Bentley <aaron.bentley <at> utoronto.ca> writes:
> Bazaar also supports multiple unrelated branches in a repository, as
> does CVS, SVN (depending how you squint), Arch, and probably Monotone.

It's quite common in Monotone.  You could probably do it in Mercurial as well,
though I don't know that anyone does.  SVK definitely does it (since each user
has a single repo that's shared by all the projects they work on).

Trivia-ly yours,
-- Nathaniel

* Re: VCS comparison table
  2006-10-19 18:30                                         ` Linus Torvalds
  2006-10-19 18:54                                           ` Matthieu Moy
@ 2006-10-19 19:16                                           ` Junio C Hamano
  2006-10-20 10:51                                             ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-19 19:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> The other big difference is being able to do merges in seconds. The 
> biggest cost of doing a big merge these days seems to literally be 
> generating the diffstat of the changes at the end (which is purely a UI 
> issue, but one that I find so important that I'll happily take the extra 
> few seconds for that, even if it sometimes effectively doubles the 
> overhead).

An interesting effect of this is that when people have a column for
merge performance in an SCM comparison table, they would include the
time to run the diffstat as part of the time spent merging when they
fill in the number for git, but not for any other SCM.

I know you won't misunderstand me but for the sake of others, I
should add this: I am not saying diffstat should be optional.

* Re: VCS comparison table
  2006-10-19 18:54                                           ` Matthieu Moy
@ 2006-10-19 20:47                                             ` Linus Torvalds
  2006-10-21  5:49                                               ` Junio C Hamano
  2006-10-19 23:28                                             ` Ryan Anderson
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-19 20:47 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Matthew D. Fuller, Andreas Ericsson, Carl Worth, bazaar-ng, git,
	Jakub Narebski

On Thu, 19 Oct 2006, Matthieu Moy wrote:
> 
> By curiosity, how would you compare git and Bitkeeper, on a purely
> technical basis? (not asking for a detailed comparison, but an "X is
> globally/much/terribly/not better than Y" kind of statement ;-) )

I think git is better for kernel work these days, but a large portion of 
that is that a lot of the features have literally been tweaked for us (for 
very obvious reasons).

For example, the whole "rebase" thing (or explicitly making cherry-picking 
easy) is something that a number of kernel people do, and even if I have 
to admit to not liking the practice very much (it kind of hides the "true" 
development history), it does have huge advantages, and it makes history a 
lot easier to read.

Similarly, I often used the single-file graphical history viewing in BK 
("revtool"), but being able to follow the history of multiple files as one 
"entity" really is something that once you get used to, it's really really 
hard going back, and "gitk" does generate a much more readable graph.

And I think the git way of doing branches is just simply superior. Git 
always did branches in the sense that the way merges happened you _always_ 
had several heads, but actually making them available and switching 
between them was something that wasn't my idea, and that I even was a bit 
apprehensive about. I was wrong. Git branches are branches done right. I 
just don't see how you _could_ do them better.

That said, a lot of the features I like and _I_ consider really important 
are possibly not that important to others. For example, maybe nobody else 
really cares about viewing the history of a particular subsystem, the way 
I do. For a lot of people, single-file is probably ok. 

For example, while git now does "annotate" (or "blame"), it's not 
lightning fast, and I simply don't care. Doing a

	git blame kernel/sched.c

takes about three seconds for me, and that's on a pretty good machine (and 
on the kernel tree, which for me is always in the cache ;). Quite frankly, 
if I cared deeply about that kind of annotation, I'd probably be upset 
about it. There are basically _no_ other git operations that take that 
long. I can get the _full_ log of the last 18 months of the kernel much 
faster than that.

And the slowness of annotate comes directly from the design of git, and 
from the fact that it's not how I tend to look at changes. Rather than 
doing "git blame kernel/sched.c", I'm _much_ more likely to just do

	git log -p kernel/sched.c

and see the changes as individual patches instead (and perhaps search for 
some pattern that I'm looking for by just literally using a regex in the 
pager).

Also, the fact that you need to repack the archive every once in a while 
doesn't disturb me. I probably end up repacking the kernel almost daily, 
which is _waay_ excessive, but it's just become habit of mine. I've seen 
people who really don't like it, and I've also seen people who apparently 
never even realized that they should do an occasional "git repack -a -d", 
and then they have hundreds of thousands of loose objects and wonder why 
the performance is so bad ;)
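
To see whether a repository has reached that state, one can count its
loose objects. The following is a minimal sketch in Python, not part of
git itself. It assumes the usual on-disk layout in which each loose
object lives at objects/<2 hex digits>/<38 hex digits>; the function name
count_loose_objects is made up for illustration.

```python
import os
import re

# Loose objects are stored as objects/ab/cdef... (2 + 38 hex digits).
FANOUT_DIR = re.compile(r"^[0-9a-f]{2}$")
OBJECT_FILE = re.compile(r"^[0-9a-f]{38}$")


def count_loose_objects(objects_dir: str) -> int:
    """Count loose (unpacked) objects under a git objects directory."""
    count = 0
    for entry in os.listdir(objects_dir):
        subdir = os.path.join(objects_dir, entry)
        # Skip "pack", "info" and anything else that is not a fan-out dir.
        if FANOUT_DIR.match(entry) and os.path.isdir(subdir):
            count += sum(
                1 for name in os.listdir(subdir) if OBJECT_FILE.match(name)
            )
    return count
```

Calling count_loose_objects(".git/objects") from the top of a working
tree gives the number; a very large result is a hint that a
"git repack -a -d" is overdue.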

BK never had these issues. BK always kept things "packed", which made a 
lot of operations much slower ("bk undo" was painfully slow). BK could 
annotate quickly, since it was really a file-based history, in a way that 
git fundamentally isn't, and can never be (and I don't _want_ it to be, 
but it means that "annotate" is slow).

And BK had some great tools. The merge tool was superior ("bk resolve"? I 
forget). The patch-application tool was great.

But both of those tools are things that git doesn't have, for _another_ 
reason: the way git works, you don't really need them. For example, the 
patch application tool was great, but the biggest reason it was needed in 
the first place was tracking renames explicitly.

In that kind of environment, you have serious problems with patches, and 
you actually _need_ a tool to let the user explain when something is a 
rename and when it isn't. With git not tracking renames, the patch 
application tool simply isn't needed.

The same goes to some degree to "bk resolve". Because git has the index, 
and you can _leave_ things unresolved in the index, you don't need a 
graphical tool to resolve things - git knows very fundamentally about 
incomplete merges _and_ about multiple branches (which you need in order 
to keep track of both the branch you merge from and the branch you merge 
into), and it's fine to resolve any conflicts in the normal working tree.

So for at least _my_ usage, git does everything very well, but that's 
because if it didn't fit me, I fixed it until it did. 

And "git bisect" really does rock. I still cannot believe that apparently 
nobody did it before us. It's such a useful thing, and it works so well in 
unambiguous cases (and not all cases are that unambiguous, but an 
appreciably large subset is).
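
For readers who have not used it, the idea behind bisection is a binary
search over history. Below is a minimal sketch in Python under strong
assumptions: the history is a plain list ordered oldest to newest, the
first commit is known good, the last is known bad, and is_bad is a
hypothetical test callback supplied by the user. The real "git bisect"
works on the full history graph, which this sketch does not attempt.

```python
from typing import Callable, Sequence


def bisect_first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear, oldest-to-newest history.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad (the "unambiguous" case).
    """
    good, bad = 0, len(commits) - 1
    # Invariant: commits[good] is good and commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(1000)]  # hypothetical commit names
    tested = []

    def is_bad(commit: str) -> bool:
        tested.append(commit)
        return int(commit[1:]) >= 613  # pretend c613 introduced the bug

    print(bisect_first_bad(history, is_bad), "found after", len(tested), "tests")
```

The number of commits that need testing grows with the logarithm of the
history length, which is why the approach stays practical even on a
large tree.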

So that said, git does work very well for us, but I do want to end on a 
note on things that BitKeeper did and nobody else has:

 - Larry was first. The undeniable fact is, that before BK (and for 
   several years _after_ BK), the open-source alternatives were just CRAP.

   You can say anything you like about his personality, but dammit, 
   compared to Larry, most people I know are idiots. People don't give BK 
   the credit it deserves. When Tridge "reverse-engineered" it, people 
   were making jokes about how trivial some of the protocols were. That 
   misses the point ENTIRELY. The point is, compared to BK, everything 
   else absolutely _sucked_, and BK really was a watershed program.

   Never EVER underestimate how important BK was. Quite frankly, I think 
   most open-source SCM's _still_ suck. I'm constantly amazed that anybody 
   would touch SVN with a ten-foot pole. Talk about crap. And SVN is at 
   least usable, unlike a lot of other projects.

 - When I did git, one of the things that actually _helped_ me was that I 
   was consciously trying to not do a BK clone. I wanted to do the same 
   things that BK did, but I very much did _not_ want to do them the _way_ 
   BK did them. I respect Larry too much, and I didn't want there to be 
   any question about git being just a "clone".

   So a lot of the git design ended up very much trying to avoid old 
   designs on purpose, and I think that really helped. The fact that I 
   didn't have a background in SCM's, and that I thought all the weaves 
   etc were confusing, meant that I instead went for a radically different 
   way of doing things.

   And I'm 100% convinced that "radically different" was the right thing 
   to do. That was what allowed git to really soar. A lot of the good 
   things in git come exactly from the fact that git does _not_ do things 
   like most traditional SCM's do. But BK should still get a lot of 
   credit, because it was what taught me (and a lot of other people) what 
   being "distributed" really meant.

 - On a more personal note: people say that BK showed the "failure" of 
   using a commercial closed-source program. I would disagree. Not only 
   did the kernel get a whole lot of useful work out of BK, we learnt how 
   distributed systems _should_ work, and quite frankly, I'd do it all 
   over again in a heartbeat.

   If there was a "failure" in the BK saga, it was in how horrendously 
   _bad_ all open-source SCM's were, even with BK showing how it should 
   have been done for several years. THAT is the failure. The fact that 
   there were hundreds of people who whined about BK, and nobody really 
   did anything productive. 

Now, I'm obviously biased, but I really do believe that git is the best 
open-source SCM there is, by a _mile_. I don't know how many people 
realize this, but we literally haven't changed our data formats in over a 
year. I was looking at my old git import of the BKCVS tree today, because 
I wanted to look up the "BKrev" format for the email earlier in this tree, 
and I realized that the pack-file was from July of last year. That's 
within a few _weeks_ of the pack-file being introduced at all, and guess 
what? It all still worked. No "on-the-fly format conversion", no 
_nothing_. It just worked.

That should tell people something. It's pretty much the fastest SCM out 
there (and yeah, that's on almost any operation you can name), it still 
has the smallest disk footprint I've ever heard of, and it hasn't had the 
"format of the week" disease that every other project seems to go through.

And it's used in production settings on some of the biggest projects out 
there. SVN has more users, but let's face it, SVN really isn't even in the 
running. Technology-wise, the thing is just not worth bothering with, but 
it's a good crutch for people who are used to CVS and never want to use 
anything else.

Am I happy with git? I'm happy as a clam. It turned out even better than I 
ever thought it would. And BK was what taught me what to aim for.

			Linus

* Re: VCS comparison table
  2006-10-19 16:59                                     ` Carl Worth
@ 2006-10-19 23:01                                       ` Aaron Bentley
  2006-10-19 23:42                                         ` Carl Worth
  2006-10-20 10:53                                         ` Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-19 23:01 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:

> Let's imagine there's a complete fork in the bzr codebase tomorrow. We
> need not suppose any acrimony, just an amiable split as two subsets of
> the team start taking the code in different directions.

...

> Finally the two teams ... want to
> merge their code together.
> 
> After the merge, there can be only one mainline, so one team or the
> other will have to concede to give up the numbers they had generated
> and published during the fork.

I don't think this is true.  The abandoned mainline does not need to be
destroyed.  It can be kept at the same location that it always was, with
the numbers that it always had.  So the number + URL combo stays
meaningful.  Additionally, the new mainline can keep a mirror of the
abandoned mainline in its repository, because there are virtually no
additional storage requirements to doing so.

> An individual takes the bzr codebase and starts working on it. It's
> experimental stuff, so it's not pushed back into the central
> repository yet. But our coder isn't a total recluse, so his friends
> help him with the code he's working on. They communicate about their
> work, (perhaps on the main bzr mailing list), and make statements such
> as "feature F is working perfectly as of version V".
> 
> But for these communications, revision numbers will not provide
> historically stable values that can be used.

They certainly can.

The coder says "I've put up a branch at http://example.com/bzr/feature.
 In revision 5, I started work on feature A.  I finished work in
revision 6.  But then I had to fix a related bug in revision 7."

As long as that coder is active, they'll keep their repository at the
same location.  And because branches are cheap (even cheaper than
delta-compressed revisions), there's no reason to delete old branches.
It's better to keep them around for reference purposes.

> It's impossible for our
> coder to predict the numbers that will be assigned to his code when
> they get merged back into the mainline---since some other unknown
> programmer may have branched at exactly the same point and is trying
> to make the same determination.

This is true, but his code is likely to all land in the mainline at
once.  Since his own revnos are more fine-grained, he's not likely to want
to use the mainline revnos.

> Now, the programmers could get stable numbers by keeping the branch in
> the main tree, or by at least pushing out the branching point to
> "reserve" a number in the main tree.

I don't know what you mean by pushing out the branching point.

>> That doesn't follow.  Just because something is arguably true doesn't
>> make it bad.  And in this case, I'm not arguing that it's true, I'm
>> saying that it's true, because that is what my experience tells me is true.
> 
> [I'm sorry, but I didn't grasp this sentence. I think I lost the
> antecedent of "it" somewhere.]

I felt that you were mischaracterizing my _statement_ that "it's
exceedingly uncommon for [revnos] to change" as an _argument_ "it's
exceedingly uncommon for [revnos] to change".  The reality is that we
keep saying revnos don't change because git users keep saying "but what
if the revnos change?".


>>          And I personally have been developing a bugtracker that is
>> distributed in the same way bzr is; it stores bug data in the source
>> tree of a project, so that bug activities follow branches around.
> 
> That kind of thing sounds very useful. As I've been talking about
> "numbers" here in bug trackers and mailing lists, it should be obvious
> that I consider the information stored in such systems an important
> part of the history of a code project. So it would be nice if all of
> that history were stored in an equally reliable system in some way.

If you're interested, it's called "Bugs Everywhere" and it's available here:
http://panoramicfeedback.com/opensource/

New VCS backends are welcome :-D

>> In the first place, it seems fairly common in the Git community to
>> rebase.  This process throws away old revisions and creates new
>> revisions that are morally equivalent[1].
> 
> Yes, rebasing does "destroy history" in one sense, (in actual fact, it
> creates new commits and leaves the old ones around, which may or may
> not have references to them anymore). But it's definitely not common
> for git users to use rebase in a situation where it would change any
> published number.

So actually, not all branches are treated equally by Git users.  Public
branches are treated as append-only, but private branches are treated as
mutable.  (It's the same with bzr users, of course.)

> And git helps with this as well. If I ever forget that I've already
> pushed a change and then I rebase, then the next time I try to push,
> git will complain that I'm attempting to throw away history on the
> remote end, and will refuse to cooperate, (unless I force it).

Same here.

> There's a similar safety mechanism on the pull side. If I did force a
> history-rewriting push, then users who tried to pull it would also
> have to force git's hand before it would rewrite their history.

Same here.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOAPm0F+nu1YWqI0RAhkdAJ9InxuEjbToGQU2AOJmfZw124Lb2wCeMmDC
9w08eZbmL19FfVQmtpPcYkQ=
=AmGo
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-19 18:54                                           ` Matthieu Moy
  2006-10-19 20:47                                             ` Linus Torvalds
@ 2006-10-19 23:28                                             ` Ryan Anderson
  1 sibling, 0 replies; 1752+ messages in thread
From: Ryan Anderson @ 2006-10-19 23:28 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Linus Torvalds, Matthew D. Fuller, Andreas Ericsson, Carl Worth,
	bazaar-ng, git, Jakub Narebski

On 10/19/06, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > Btw, I do believe that bzr seems to be acting a lot like BK, at least when
> > it comes to versioning. I suspect that is not entirely random either, and
> > I suspect it's been a conscious effort to some degree.
> >
> > Which is fine, in the sense that there are certainly much worse things to
> > try to copy.
>
> By curiosity, how would you compare git and Bitkeeper, on a purely
> technical basis? (not asking for a detailed comparison, but an "X is
> globally/much/terribly/not better than Y" kind of statement ;-) )

Having used both in a past job setting (simultaneously even),
BitKeeper was a huge win over CVS, but after a while, some of its
tools were just very frustrating in comparison with comparable Git
interfaces, and I had actually written a terribly slow BK -> Git
converter just so I could incrementally import our BK tree, then use
Git's history-viewing because it was so much more pleasant to work
with.

For small projects (~5 people), they weren't hugely different, but Git
just felt more comfortable after a while.  (It was actually possible
to do a commit from the command line in a single command, without
getting annoyed by the interface, for a trivial example.)

* Re: VCS comparison table
  2006-10-19 23:01                                       ` Aaron Bentley
@ 2006-10-19 23:42                                         ` Carl Worth
  2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  2:53                                           ` James Henstridge
  2006-10-20 10:53                                         ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-19 23:42 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:
> I don't think this is true.  The abandoned mainline does not need to be
> destroyed.  It can be kept at the same location that it always was, with
> the numbers that it always had. So the number + URL combo stays
> meaningful.

Sure that's possible, but it gets rather unwieldy the more
repositories you have involved. I've been arguing that bzr really does
encourage centralized, not distributed development, and you were having
trouble seeing how I came to that conclusion. Do you see how "maintain
an independent URL namespace for every distributed branch" doesn't
encourage much distributed development?

>             Additionally, the new mainline can keep a mirror of the
> abandoned mainline in its repository, because there are virtually no
> additional storage requirements to doing so.

And this part I don't understand. I can understand the mainline
storing the revisions, but I don't understand how it could make them
accessible by the published revision numbers of the "abandoned"
line. And that's the problem.

> > But for these communications, revision numbers will not provide
> > historically stable values that can be used.
>
> They certainly can.
>
> The coder says "I've put up a branch at http://example.com/bzr/feature.
>  In revision 5, I started work on feature A.  I finished work in
> revision 6.  But then I had to fix a related bug in revision 7."

"I've put this branch up" isn't historically stable...

> As long as that coder is active

...which is what you just said there yourself.

On the other hand, git names really do live forever, regardless of
where the code is hosted or how it moves around. When I'm talking
about historical stability, I'm talking about being able to publish
numbers that live forever.

It sounds like bzr has numbers like this inside it, (but not nearly as
simple as the ones that git has), but that users aren't in the
practice of communicating with them. Instead, users communicate with
the unstable numbers. And that's a shame from an historical
perspective.

> This is true, but his code is likely to all land in the mainline at
> once.  Since his own revnos are more fine-grained, he's not likely to want
> to use the mainline revnos.

What I'd like to be able to do, is advertise a temporary repository,
and while using it, publish names for revisions that will still be
valid when the code gets pushed out to the mainline. That is
supporting distributed development, and everything I'm hearing says
that the bzr revision numbers don't support that.

> I felt that you were mischaracterizing my _statement_ that "it's
> exceedingly uncommon for [revnos] to change" as an _argument_ "it's
> exceedingly uncommon for [revnos] to change".  The reality is that we
> keep saying revnos don't change because git users keep saying "but what
> if the revnos change?".

OK.

The original claim that sparked the discussion was that bzr has a
"simple namespace" while git does not. We've been talking for quite a
while here, and I still don't fully understand how these numbers are
generated or what I can expect to happen to the numbers associated
with a given revision as that revision moves from one repository to
another. It's really not a simple scheme.

Meanwhile, I have been arguing that the "simple" revision numbers that
bzr advertises have restrictions on their utility, (they can only be
used with reference to a specific repository, or with reference to
another that treats it as canonical). I _think_ I understand the
numbers well enough to say that still.

Compare that with the git names. The scheme really is easy to
understand, (either the new user already understands cryptographic
hashes, or else it's as easy as "a long string of digits that git
assigns as the name"). The names have universal utility in time and
space, (for definitions of the universe larger than I will ever be
able to observe anyway). And the natural inclination to abbreviate
a name when repeating it, (note the recent post with bzr UUIDs
exhibiting the same inclination), doesn't make the names any less
useful since the abbreviation alone will work most always.
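
As an illustration of how little there is to the scheme, here is a
minimal sketch in Python of how the name of a file's contents (a "blob")
is derived, plus a helper that finds the shortest abbreviation still
unambiguous within a given set of names. The function names are made up
for this example, and the four-digit minimum is an arbitrary choice for
the sketch rather than anything git mandates.

```python
import hashlib
from typing import Iterable


def git_blob_name(content: bytes) -> str:
    """Name of a blob: SHA-1 over a short header followed by the bytes."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def shortest_unique_prefix(name: str, all_names: Iterable[str], minimum: int = 4) -> str:
    """Shortest prefix of `name` (>= `minimum` digits) shared by no other name."""
    others = [other for other in all_names if other != name]
    for length in range(minimum, len(name) + 1):
        prefix = name[:length]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    return name


if __name__ == "__main__":
    names = [git_blob_name(b""), git_blob_name(b"hello\n")]
    for name in names:
        print(name, "->", shortest_unique_prefix(name, names))
```

Because the name depends only on the content, two people who have never
exchanged a byte will still compute the same name for the same file,
which is what makes the names usable across repositories.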

The naming in git really is beautiful and beautifully simple.

It's not monotonically increasing from one revision to the next, but
I've never found that to be an issue. Of course, we do still use our
own "simple" names for versioning the releases and snapshots of
software we manage with git, and that's where being able to easily
determine "newer" or "older" by simple numerical examination is
important. I've honestly never encountered a situation where I was
handed two git sha1 sums and wished that I could do the same thing.

> If you're interested, it's called "Bugs Everywhere" and it's available here:
> http://panoramicfeedback.com/opensource/
>
> New VCS backends are welcome :-D

Thanks, I hope to take a look at that at some point.

> So actually, not all branches are treated equally by Git users.  Public
> branches are treated as append-only, but private branches are treated as
> mutable.  (It's the same with bzr users, of course.)

Well, some users treat all branches as append only and shun rebase.

[snip of remaining agreement of similarity between the tools]

-Carl

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 16:48                                                             ` Linus Torvalds
@ 2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20 14:41                                                                 ` Jeff King
  2006-10-20  0:20                                                               ` [PATCH 1/2] Pass through unresolved deltas when writing a pack Jan Harkes
  2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
  2 siblings, 1 reply; 1752+ messages in thread
From: Jan Harkes @ 2006-10-20  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

On Thu, Oct 19, 2006 at 09:48:29AM -0700, Linus Torvalds wrote:
> On Thu, 19 Oct 2006, Jan Harkes wrote:
> > 
> > If we find a delta against a base that is not found in our repository we
> > can keep it as a delta, the base should show up later on in the
> > thin-pack. Whenever we find a delta against a base that we haven't seen
> > in the received part of the thin pack, but is available from the
> > repository we should expand it because there is a chance we may not see
> > this base in the remainder of the thin-pack.
> 
> Yes, indeed. We can also have another heuristic: if we find a delta, and 
> we haven't seen the object it deltas against, we can still keep it as a 
> delta IF WE ALSO DON'T ALREADY HAVE THE BASE OBJECT. Because then we know 
> that the base object has to be there later in the pack (or we have a 
> dangling delta, which we'll just consider an error).
> 
> So yeah, maybe my patch-series is something we can still save.

It looks like you were really close. When we cannot resolve a delta, we
just write it to the packfile and we don't queue it. If it can be
resolved we write it as a full object.
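
The rule can be sketched in a few lines of Python. This is a toy model
and not the code in builtin-unpack-objects.c: the repo dictionary stands
in for has_sha1_file()/read_sha1_file(), and apply_delta is a
hypothetical callback replacing the real delta format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Full:
    data: bytes


@dataclass
class Delta:
    base: str    # name of the object this delta applies to
    delta: bytes


Entry = Union[Full, Delta]


def rewrite_pack(
    incoming: List[Entry],
    repo: Dict[str, bytes],
    apply_delta: Callable[[bytes, bytes], bytes],
) -> List[Entry]:
    """Rewrite a received pack: resolve what we can, pass the rest through."""
    out: List[Entry] = []
    for entry in incoming:
        if isinstance(entry, Delta) and entry.base in repo:
            # Base is available locally: resolve now, store a full object.
            out.append(Full(apply_delta(repo[entry.base], entry.delta)))
        else:
            # Full objects, and deltas whose base we do not have, are
            # written unchanged; such a base is expected to be somewhere
            # in the same pack.
            out.append(entry)
    return out
```

Note that nothing is queued and no object names are computed for the
deltas that pass through, so the sketch never needs to hold earlier
objects in memory.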

The only thing that cannot be reliably tracked is the pack index
information. The offsets are trivial, but we cannot calculate the SHA1
for a delta without applying it to its base. If the base comes later,
the existing code could do it, but if it has already been written to the
pack we can't easily track back.

And why add all the extra complexity? Running git-index-pack after
git-unpack-objects --repack not only generates the correct index without
a problem, it also serves as an extra consistency check and we keep this
code isolated from any possible future changes to the index file format.

I'll try to follow this up with 2 patches, one is an almost trivial
change to your code that makes it write out a pack with all full objects
and resolvable deltas converted to full objects, any unresolved deltas
are expected to be relative to some other object in the same pack.

The rewritten pack is indexed correctly even when I run git-unpack-objects
in a repository that does not contain any of the objects in the thin-pack.
Of course it also works when the objects are available, but the resulting
full pack is considerably bigger since we can then find a suitable base for
every delta and expand each one into a full object.

> However, the thing that makes me suspect that it is _not_ saveable, is 
> this:
...
> The answer is: no. It's not trivial. Or rather, it _is_ trivial, but you 
> have to _remember_ all of the actual data for A, B, C and D all the way to 
> the end, because only if you have that data in memory can you actually 
> _recreate_ B, C and D even enough to get their SHA1's (which you need, 
> just in order to know that the pack is complete, much less to be able to 
> create a non-delta version in case it hadn't been).

Only if you want to build the index at the same time. We don't need to
know the SHA1 values for unresolved deltas.

> Anyway, I just pushed the "rewrite-pack" branch to my git repo on 
> kernel.org, so once it mirrors out, if you really want to try to fix up 
> the mess I left behind, there it is:

I think I still left quite a bit of the mess unfixed.

Jan

* [PATCH 1/2] Pass through unresolved deltas when writing a pack
  2006-10-19 16:48                                                             ` Linus Torvalds
  2006-10-20  0:20                                                               ` Jan Harkes
@ 2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
  2 siblings, 0 replies; 1752+ messages in thread
From: Jan Harkes @ 2006-10-20  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

The resulting pack should be correct if we have the base somewhere else in
the received pack. If we didn't have the base, the received pack would be
faulty and couldn't be unpacked as loose objects either.

The internal pack index information is not updated correctly anymore.

Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>

---
 builtin-unpack-objects.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index f139308..b95c93c 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -246,7 +246,10 @@ static void unpack_delta_entry(unsigned 
 	}
 
 	if (!has_sha1_file(base_sha1)) {
-		add_delta_to_list(base_sha1, delta_data, delta_size);
+		if (pack_file)
+			write_pack_delta(base_sha1, delta_data, delta_size);
+		else
+			add_delta_to_list(base_sha1, delta_data, delta_size);
 		return;
 	}
 	base = read_sha1_file(base_sha1, type, &base_size);
-- 
1.4.2.1

* [PATCH 2/2] Remove unused index tracking code.
  2006-10-19 16:48                                                             ` Linus Torvalds
  2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20  0:20                                                               ` [PATCH 1/2] Pass through unresolved deltas when writing a pack Jan Harkes
@ 2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20  1:11                                                                 ` Nicolas Pitre
  2 siblings, 1 reply; 1752+ messages in thread
From: Jan Harkes @ 2006-10-20  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Tracking the offsets is not that hard, but calculating the sha1 for the
deltas is tricky: we may have already seen and written out the base we
need. So it is actually easier to avoid the complexity altogether and
rely on git-index-pack to rebuild the index. The indexing step is also a
useful validation that the final pack contains a base for every delta.

Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>

---
 builtin-unpack-objects.c |   57 +++++++++++-----------------------------------
 1 files changed, 14 insertions(+), 43 deletions(-)

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index b95c93c..3df7938 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -89,29 +89,6 @@ static void *get_data(unsigned long size
 }
 
 static struct sha1file *pack_file;
-static unsigned long pack_file_offset;
-
-struct index_entry {
-	unsigned long offset;
-	unsigned char sha1[20];
-};
-
-static unsigned int index_nr, index_alloc;
-static struct index_entry **index_array;
-
-static void add_pack_index(unsigned char *sha1)
-{
-	struct index_entry *entry;
-	int nr = index_nr;
-	if (nr >= index_alloc) {
-		index_alloc = (index_alloc + 64) * 3 / 2;
-		index_array = xrealloc(index_array, index_alloc * sizeof(*index_array));
-	}
-	entry = xmalloc(sizeof(*entry));
-	entry->offset = pack_file_offset;
-	hashcpy(entry->sha1, sha1);
-	index_array[nr++] = entry;
-}
 
 static void write_pack_delta(const unsigned char *base, const void *delta, unsigned long delta_size)
 {
@@ -122,11 +99,9 @@ static void write_pack_delta(const unsig
 	sha1write(pack_file, header, hdrlen);
 	sha1write(pack_file, base, 20);
 	datalen = sha1write_compressed(pack_file, delta, delta_size);
-
-	pack_file_offset += hdrlen + 20 + datalen;
 }
 
-static void write_pack_object(const char *type, const unsigned char *sha1, const void *buf, unsigned long size)
+static void write_pack_object(const void *buf, unsigned long size, const char *type, const unsigned char *sha1)
 {
 	unsigned char header[10];
 	unsigned hdrlen, datalen;
@@ -134,8 +109,6 @@ static void write_pack_object(const char
 	hdrlen = encode_header(string_to_type(type, sha1), size, header);
 	sha1write(pack_file, header, hdrlen);
 	datalen = sha1write_compressed(pack_file, buf, size);
-
-	pack_file_offset += hdrlen + datalen;
 }
 
 struct delta_info {
@@ -160,22 +133,21 @@ static void add_delta_to_list(unsigned c
 
 static void added_object(unsigned char *sha1, const char *type, void *data, unsigned long size);
 
-static void write_object(void *buf, unsigned long size, const char *type,
-	unsigned char *base, void *delta, unsigned long delta_size)
+static void write_object(void *buf, unsigned long size, const char *type)
 {
 	unsigned char sha1[20];
 
 	if (pack_file) {
 		if (hash_sha1_file(buf, size, type, sha1) < 0)
 			die("failed to compute object hash");
-		add_pack_index(sha1);
-		if (0 && base)
-			write_pack_delta(base, delta, delta_size);
-		else
-			write_pack_object(type, sha1, buf, size);
-	} else if (write_sha1_file(buf, size, type, sha1) < 0)
-		die("failed to write object");
-	added_object(sha1, type, buf, size);
+
+		write_pack_object(buf, size, type, sha1);
+	} else {
+		if (write_sha1_file(buf, size, type, sha1) < 0)
+		    die("failed to write object");
+
+		added_object(sha1, type, buf, size);
+	}
 }
 
 static void resolve_delta(const char *type, unsigned char *base_sha1,
@@ -190,7 +162,7 @@ static void resolve_delta(const char *ty
 			     &result_size);
 	if (!result)
 		die("failed to apply delta");
-	write_object(result, result_size, type, base_sha1, delta, delta_size);
+	write_object(result, result_size, type);
 	free(delta);
 	free(result);
 }
@@ -225,7 +197,7 @@ static void unpack_non_delta_entry(enum 
 	default: die("bad type %d", kind);
 	}
 	if (!dry_run && buf)
-		write_object(buf, size, type, NULL, NULL, 0);
+		write_object(buf, size, type);
 	free(buf);
 }
 
@@ -334,12 +306,11 @@ static void unpack_all(const char *repac
 		newhdr.hdr_signature = htonl(PACK_SIGNATURE);
 		newhdr.hdr_version = htonl(PACK_VERSION);
 		newhdr.hdr_entries = htonl(nr_objects);
-		
+
 		pack_file = sha1create("%s.pack", repack);
 		sha1write(pack_file, &newhdr, sizeof(newhdr));
-		pack_file_offset = sizeof(newhdr);
 	}
-		
+
 
 	use(sizeof(struct pack_header));
 	for (i = 0; i < nr_objects; i++)
-- 
1.4.2.1

^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 23:42                                         ` Carl Worth
@ 2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  5:05                                             ` Linus Torvalds
                                                               ` (4 more replies)
  2006-10-20  2:53                                           ` James Henstridge
  1 sibling, 5 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20  1:06 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:

> Do you see how "maintain
> an independent URL namespace for every distributed branch" doesn't
> encourage much distributed development?

I understand your argument now.  It's nothing to do with numbers per se,
and all about per-branch namespaces.  Correct?

>>             Additionally, the new mainline can keep a mirror of the
>> abandoned mainline in its repository, because there are virtually no
>> additional storage requirements to doing so.
> 
> And this part I don't understand. I can understand the mainline
> storing the revisions, but I don't understand how it could make them
> accessible by the published revision numbers of the "abandoned"
> line. And that's the problem.

I meant that the active branch and a mirror of the abandoned branch
could be stored in the same repository, for ease of access.

Bazaar encourages you to stick lots and lots of branches in your
repository.  They don't even have to be related.  For example, my repo
contains branches of bzr, bzrtools, Meld, and BazaarInspect.

> It sounds like bzr has numbers like this inside it, (but not nearly as
> simple as the ones that git has), but that users aren't in the
> practice of communicating with them. Instead, users communicate with
> the unstable numbers. And that's a shame from an historical
> perspective.

I can see where you're coming from, but to me, the trade-off seems
worthwhile.  Because historical data gets less and less valuable the
older it gets.  By the time the URL for a branch goes dark, there's
unlikely to be any reason to refer to one of its revisions at all.

> The original claim that sparked the discussion was that bzr has a
> "simple namespace" while git does not. We've been talking for quite a
> while here, and I still don't fully understand how these numbers are
> generated or what I can expect to happen to the numbers associated
> with a given revision as that revision moves from one repository to
> another. It's really not a simple scheme.

When you create a new branch from scratch, the number starts at zero.
If you copy a branch, you copy its number, too.

Every time you commit, the number is incremented.  If you pull, your
numbers are adjusted to be identical to those of the branch you pulled from.
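
As a rough sketch of those rules (the struct and function names below
are made up for illustration, this is not how bzr itself is implemented,
and a pull from a diverged branch is ignored), a branch can be modelled
as nothing more than its ordered mainline history:

```c
#include <assert.h>
#include <string.h>

#define MAX_REVS 64

/* Hypothetical model: a branch is just its ordered mainline history. */
struct branch {
	const char *revid[MAX_REVS];	/* globally unique revision ids */
	int nr;				/* current revno; 0 for a new branch */
};

/* Committing appends one revision, so the revno goes up by one. */
int branch_commit(struct branch *b, const char *revid)
{
	assert(b->nr < MAX_REVS);
	b->revid[b->nr++] = revid;
	return b->nr;
}

/* Copying a branch copies its history, and therefore its numbers. */
void branch_copy(struct branch *dst, const struct branch *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Pulling makes the local numbering identical to the source's. */
void branch_pull(struct branch *dst, const struct branch *src)
{
	branch_copy(dst, src);
}

/* Revno of a revision in this branch, or 0 if it is not on the mainline. */
int branch_revno(const struct branch *b, const char *revid)
{
	int i;

	for (i = 0; i < b->nr; i++)
		if (!strcmp(b->revid[i], revid))
			return i + 1;
	return 0;
}
```

In this model two commits on a fresh branch get revnos 1 and 2, a copy
of that branch starts out at 2, and the number only means something
together with the branch it was looked up in.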

Is that really complicated?

> Meanwhile, I have been arguing that the "simple" revision numbers that
> bzr advertises have restrictions on their utility, (they can only be
> used with reference to a specific repository, or with reference to
> another that treats it as canonical). I _think_ I understand the
> numbers well enough to say that still.

Sure.  It's the "favors centralization" thing that I don't agree with,
but I now understand your argument.

> Compare that with the git names. The scheme really is easy to
> understand, (either the new user already understands cryptographic
> hashes, or else it's as easy as "a long string of digits that git
> assigns as the name").

In my experience, users who don't understand distributed systems don't
understand why UUIDs must be used as identifiers.

> The naming in git really is beautiful and beautifully simple.

Well, you've got to admit that those names are at least superficially ugly.

> It's not monotonically increasing from one revision to the next, but
> I've never found that to be an issue. Of course, we do still use our
> own "simple" names for versioning the releases and snapshots of
> software we manage with git, and that's where being able to easily
> determine "newer" or "older" by simple numerical examination is
> important. I've honestly never encountered a situation where I was
> handed two git sha1 sums and wished that I could do the same thing.

What's nice is being able to see the revno 753 and knowing that "diff -r
752..753" will show the changes it introduced.  Checking the revno on a
branch mirror and knowing how out-of-date it is.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOCEf0F+nu1YWqI0RAhgtAJwK4jkWFjjF2iHJb1VyXqgszsHElACff2U7
olZJiAED80tIS6kgkqFsJps=
=BkRZ
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
@ 2006-10-20  1:11                                                                 ` Nicolas Pitre
  2006-10-20  1:35                                                                   ` Junio C Hamano
  2006-10-20  2:27                                                                   ` Jan Harkes
  0 siblings, 2 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-20  1:11 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Linus Torvalds, Junio C Hamano, git

On Thu, 19 Oct 2006, Jan Harkes wrote:

> Tracking the offsets is not that hard, but calculating the sha1 for the
> deltas is tricky: we may have already seen and written out the base we
> need. So it is actually easier to avoid the complexity altogether and
> rely on git-index-pack to rebuild the index. The indexing step is also a
> useful check that the final pack contains a base for every delta.
> 
> Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>

I don't think it is a good idea.

After looking at the problem for a while I should side with Linus.  
unpack-objects is not the proper tool for the job.  The way to go is to 
make input to index-pack streamable.

This patch in particular creates additional restrictions on pack 
files that were not present before.  And I don't think this is a good 
thing.

This patch imposes an ordering on REF_DELTA objects that doesn't need to
exist.  Say for example that an OFS_DELTA depends on an object which is
a REF_DELTA object.  With this patch any pack with the base for that
REF_DELTA stored after the OFS_DELTA object will be broken.

And to really do thin pack fixing properly we really want to just append
missing base objects at the end of the pack, which falls in the broken
case above.

So this is a NAK from me.

> ---
>  builtin-unpack-objects.c |   57 +++++++++++-----------------------------------
>  1 files changed, 14 insertions(+), 43 deletions(-)
> 
> diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
> index b95c93c..3df7938 100644
> --- a/builtin-unpack-objects.c
> +++ b/builtin-unpack-objects.c
> @@ -89,29 +89,6 @@ static void *get_data(unsigned long size
>  }
>  
>  static struct sha1file *pack_file;
> -static unsigned long pack_file_offset;
> -
> -struct index_entry {
> -	unsigned long offset;
> -	unsigned char sha1[20];
> -};
> -
> -static unsigned int index_nr, index_alloc;
> -static struct index_entry **index_array;
> -
> -static void add_pack_index(unsigned char *sha1)
> -{
> -	struct index_entry *entry;
> -	int nr = index_nr;
> -	if (nr >= index_alloc) {
> -		index_alloc = (index_alloc + 64) * 3 / 2;
> -		index_array = xrealloc(index_array, index_alloc * sizeof(*index_array));
> -	}
> -	entry = xmalloc(sizeof(*entry));
> -	entry->offset = pack_file_offset;
> -	hashcpy(entry->sha1, sha1);
> -	index_array[nr++] = entry;
> -}
>  
>  static void write_pack_delta(const unsigned char *base, const void *delta, unsigned long delta_size)
>  {
> @@ -122,11 +99,9 @@ static void write_pack_delta(const unsig
>  	sha1write(pack_file, header, hdrlen);
>  	sha1write(pack_file, base, 20);
>  	datalen = sha1write_compressed(pack_file, delta, delta_size);
> -
> -	pack_file_offset += hdrlen + 20 + datalen;
>  }
>  
> -static void write_pack_object(const char *type, const unsigned char *sha1, const void *buf, unsigned long size)
> +static void write_pack_object(const void *buf, unsigned long size, const char *type, const unsigned char *sha1)
>  {
>  	unsigned char header[10];
>  	unsigned hdrlen, datalen;
> @@ -134,8 +109,6 @@ static void write_pack_object(const char
>  	hdrlen = encode_header(string_to_type(type, sha1), size, header);
>  	sha1write(pack_file, header, hdrlen);
>  	datalen = sha1write_compressed(pack_file, buf, size);
> -
> -	pack_file_offset += hdrlen + datalen;
>  }
>  
>  struct delta_info {
> @@ -160,22 +133,21 @@ static void add_delta_to_list(unsigned c
>  
>  static void added_object(unsigned char *sha1, const char *type, void *data, unsigned long size);
>  
> -static void write_object(void *buf, unsigned long size, const char *type,
> -	unsigned char *base, void *delta, unsigned long delta_size)
> +static void write_object(void *buf, unsigned long size, const char *type)
>  {
>  	unsigned char sha1[20];
>  
>  	if (pack_file) {
>  		if (hash_sha1_file(buf, size, type, sha1) < 0)
>  			die("failed to compute object hash");
> -		add_pack_index(sha1);
> -		if (0 && base)
> -			write_pack_delta(base, delta, delta_size);
> -		else
> -			write_pack_object(type, sha1, buf, size);
> -	} else if (write_sha1_file(buf, size, type, sha1) < 0)
> -		die("failed to write object");
> -	added_object(sha1, type, buf, size);
> +
> +		write_pack_object(buf, size, type, sha1);
> +	} else {
> +		if (write_sha1_file(buf, size, type, sha1) < 0)
> +		    die("failed to write object");
> +
> +		added_object(sha1, type, buf, size);
> +	}
>  }
>  
>  static void resolve_delta(const char *type, unsigned char *base_sha1,
> @@ -190,7 +162,7 @@ static void resolve_delta(const char *ty
>  			     &result_size);
>  	if (!result)
>  		die("failed to apply delta");
> -	write_object(result, result_size, type, base_sha1, delta, delta_size);
> +	write_object(result, result_size, type);
>  	free(delta);
>  	free(result);
>  }
> @@ -225,7 +197,7 @@ static void unpack_non_delta_entry(enum 
>  	default: die("bad type %d", kind);
>  	}
>  	if (!dry_run && buf)
> -		write_object(buf, size, type, NULL, NULL, 0);
> +		write_object(buf, size, type);
>  	free(buf);
>  }
>  
> @@ -334,12 +306,11 @@ static void unpack_all(const char *repac
>  		newhdr.hdr_signature = htonl(PACK_SIGNATURE);
>  		newhdr.hdr_version = htonl(PACK_VERSION);
>  		newhdr.hdr_entries = htonl(nr_objects);
> -		
> +
>  		pack_file = sha1create("%s.pack", repack);
>  		sha1write(pack_file, &newhdr, sizeof(newhdr));
> -		pack_file_offset = sizeof(newhdr);
>  	}
> -		
> +
>  
>  	use(sizeof(struct pack_header));
>  	for (i = 0; i < nr_objects; i++)
> -- 
> 1.4.2.1


Nicolas

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  1:11                                                                 ` Nicolas Pitre
@ 2006-10-20  1:35                                                                   ` Junio C Hamano
  2006-10-20  2:27                                                                   ` Jan Harkes
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-20  1:35 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> This patch in particular creates additional restrictions on pack 
> files that were not present before.  And I don't think this is a good 
> thing.
>
> This patch imposes an ordering on REF_DELTA objects that doesn't need to
> exist.  Say for example that an OFS_DELTA depends on an object which is
> a REF_DELTA object.  With this patch any pack with the base for that
> REF_DELTA stored after the OFS_DELTA object will be broken.
>
> And to really do thin pack fixing properly we really want to just append
> missing base objects at the end of the pack, which falls in the broken
> case above.
>
> So this is a NAK from me.

I agree.

By the way, it is rather rare for us to see a NAK on this list.
I'd welcome seeing more of them ;-).

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
  2006-10-18 22:14                         ` Jakub Narebski
  2006-10-19  8:19                         ` Alexander Belchenko
@ 2006-10-20  2:09                         ` Horst H. von Brand
  2006-10-20  5:38                           ` Jan Hudec
  2 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-20  2:09 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Robert Collins, Petr Baudis, bazaar-ng, git

Jan Hudec <bulb@ucw.cz> wrote:

[...]

> Reading this thread I came to think, that the revnos should be assigned
> to _all_ revisions _available_, in order of when they entered the
> repository (there are some possible variations I will mention below)
> 
>  - Such revnos would be purely local, but:
>    - Current revnos are not guaranteed to be the same in different
>      branches either.
>    - They could be done so that mirror has the same revnos as the
>      master.

Then they are almost useless (except for people working alone). You need to
be able to talk about a particular commit with others working independently.

>  - They would be easier to use than the dotted ones. What (at least as
>    far as I understand) makes revnos easier to use than revids is that
>    you can remember a few of them for a short time while composing some
>    operation, i.e. look up 2 or 3 revisions in the log and then run some
>    command on them. And a 4 to 5-digit number like 10532 is easier to
>    remember than something like 3250.2.45.86.

Probably. In git you can (mostly) get away with partial SHA-1's, BTW.

>  - Their ordering would be an (arbitrary) superset of the partial
>    ordering by descendance, ie. if revision A is ancestor of B, it would
>    always have lower revno.
>    - The intuition that lower revno means older revision would be always
>      valid for related revisions and approximately valid for unrelated
>      ones.
>  - They would be *locally stable*. That is, once assigned, the revno would
>    always mean the same revision in a given branch (as determined by
>    location, not tip).

Tip-relative is extremely useful: I wouldn't normally remember the current
revision, but I'll probably often be talking about "the change before this
one" and so on.

>      - This is more than the current scheme can give, since now pull can
>        renumber revisions.

Urgh. Get an update, and all your bearings change?

>  - They wouldn't make any branch special, so the objections Linus raised
>    do not apply.

But the original branch /is/ special?

>  - They would be the same as what subversion and svk, and IIRC mercurial
>    as well, use, so:
>    - They would already be familiar to users coming from those systems.
>    - They are known to be useful that way. In fact for svk it's the only
>      way to refer to revisions and seems to work satisfactorily (though
>      note that svk is not really suitable to ad-hoc topologies).

SVN is /centralized/, there it does make sense talking about (the one and
only) history. In a distributed system, potentially each has a different
history, and they are intertwined.

Not at all useful.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  1:11                                                                 ` Nicolas Pitre
  2006-10-20  1:35                                                                   ` Junio C Hamano
@ 2006-10-20  2:27                                                                   ` Jan Harkes
  2006-10-20  2:30                                                                     ` Junio C Hamano
  2006-10-20  3:36                                                                     ` Nicolas Pitre
  1 sibling, 2 replies; 1752+ messages in thread
From: Jan Harkes @ 2006-10-20  2:27 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Linus Torvalds, Junio C Hamano, git

On Thu, Oct 19, 2006 at 09:11:10PM -0400, Nicolas Pitre wrote:
> This patch imposes an ordering on REF_DELTA objects that doesn't need to
> exist.  Say for example that an OFS_DELTA depends on an object which is
> a REF_DELTA object.  With this patch any pack with the base for that
> REF_DELTA stored after the OFS_DELTA object will be broken.

I don't see where it imposes any ordering.

If we see a complete object it will remain complete. If we find a delta,
and we have the base in the current repository it will be expanded to a
complete object. When we get a delta that doesn't have a base in the
current repository it will remain unresolved and is written out as a
delta.

So the output pack will never contain more deltas than the input.
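
A minimal sketch of that decision, just to make the three cases
explicit.  The enum and function names are invented for this example
and are not taken from builtin-unpack-objects.c; whether a base is
available locally is passed in as a plain flag.

```c
#include <assert.h>

enum entry_kind { ENTRY_FULL, ENTRY_DELTA };

/*
 * Decide how one incoming entry is written to the output pack.
 * base_in_repo is only meaningful for deltas.
 */
enum entry_kind output_kind(enum entry_kind in, int base_in_repo)
{
	if (in == ENTRY_FULL)
		return ENTRY_FULL;	/* complete stays complete */
	if (base_in_repo)
		return ENTRY_FULL;	/* delta is expanded locally */
	return ENTRY_DELTA;		/* base must be in the pack itself */
}

/* Count the deltas that survive into the output pack. */
int count_output_deltas(const enum entry_kind *in, const int *base_in_repo,
			int nr)
{
	int i, deltas = 0;

	assert(nr >= 0);
	for (i = 0; i < nr; i++)
		if (output_kind(in[i], base_in_repo[i]) == ENTRY_DELTA)
			deltas++;
	return deltas;
}
```

Each entry is classified on its own, without looking at its neighbours,
which is why the position of the base in the pack does not appear here.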

btw. I don't really know what OFS_DELTA and REF_DELTA objects are, I
grepped the source and found no references to either. I can only find
an OBJ_DELTA.

But if any of the deltas depend on an object that is not in the thin
pack, the base has to be available in the current repository and as such
it will be expanded to a full object, replacing the possibly external
delta reference with an internal base object. If the base is not found
in the current repository the base has to be another object in the
original thin pack so we can write out the delta as is.

There is no before or after decision here. We don't look back in the
thin pack, and we don't have to look forward either. So I don't
understand why your example would break or not depending on if the base
object happens to be before or after the OFS_DELTA.

> And to really do thin pack fixing properly we really want to just append 
> missing base objects at the end of the pack which falls in the broken 
> case above.

I guess I'll grep through the mailing lists to try to figure out what
these OFS and REF deltas are and why they behave so differently
depending on their order in the pack.

Jan

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  2:27                                                                   ` Jan Harkes
@ 2006-10-20  2:30                                                                     ` Junio C Hamano
  2006-10-20  2:46                                                                       ` Jan Harkes
  2006-10-20  3:36                                                                     ` Nicolas Pitre
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-20  2:30 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Nicolas Pitre, Linus Torvalds, git

Jan Harkes <jaharkes@cs.cmu.edu> writes:

> I guess I'll grep through the mailing lists to try to figure out what
> these OFS and REF deltas are and why they behave so differently
> depending on their order in the pack.

It's been cooking in "next" branch for quite a while.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  2:30                                                                     ` Junio C Hamano
@ 2006-10-20  2:46                                                                       ` Jan Harkes
  0 siblings, 0 replies; 1752+ messages in thread
From: Jan Harkes @ 2006-10-20  2:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, Linus Torvalds, git

On Thu, Oct 19, 2006 at 07:30:27PM -0700, Junio C Hamano wrote:
> Jan Harkes <jaharkes@cs.cmu.edu> writes:
> 
> > I guess I'll grep through the mailing lists to try to figure out what
> > these OFS and REF deltas are and why they behave so differently
> > depending on their order in the pack.
> 
> It's been cooking in "next" branch for quite a while.

Ah yes, just went through the thread about the git-index-pack breaking on
64-bit systems and the back and forth about the possible complexity of
the new code.

> It is really simple:
>
>  - if the found union content matches with a reference union initialized
>    through the sha1 member then deltas[j].obj->type == OBJ_REF_DELTA
>    must be true.
>
>  - if the found union content matches with a reference union initialized
>    through the sha1 member then deltas[j].obj->type == OBJ_OFS_DELTA
>    must be true.
...

I guess one of these must be false.

But clearly this patch breaks those offset-based deltas when we expand
random deltas in place.
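
To spell out why, here is a small sketch.  It assumes that an
offset-based delta records the distance from its own position back to
its base, and it ignores the pack header; the layout and the function
names are hypothetical and only meant to show the arithmetic.

```c
#include <assert.h>

/*
 * Hypothetical pack layout: entry i occupies size[i] bytes and starts
 * right after entry i-1.  Returns the offset at which entry idx starts.
 */
unsigned long entry_offset(const unsigned long *size, int idx)
{
	unsigned long off = 0;
	int i;

	assert(idx >= 0);
	for (i = 0; i < idx; i++)
		off += size[i];
	return off;
}

/*
 * Check whether a distance recorded when the pack was created still
 * leads from the delta entry back to the start of its base entry.
 */
int ofs_delta_still_valid(const unsigned long *size, int base, int delta,
			  unsigned long recorded_distance)
{
	assert(base < delta);
	return entry_offset(size, delta) - recorded_distance ==
	       entry_offset(size, base);
}
```

If an entry sitting between the base and the delta grows because it was
expanded to a full object, the delta moves further away from its base
and the recorded distance no longer lands on it.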

Jan

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 23:42                                         ` Carl Worth
  2006-10-20  1:06                                           ` Aaron Bentley
@ 2006-10-20  2:53                                           ` James Henstridge
  2006-10-20  9:51                                             ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-20  2:53 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:
> > I don't think this is true.  The abandoned mainline does not need to be
> > destroyed.  It can be kept at the same location that it always was, with
> > the numbers that it always had. So the number + URL combo stays
> > meaningful.
>
> Sure that's possible, but it gets rather unwieldy the more
> repositories you have involved. I've been arguing that bzr really does
> encourage centralized, not distributed development, and you were having
> trouble seeing how I came to that conclusion. Do you see how "maintain
> an independent URL namespace for every distributed branch" doesn't
> encourage much distributed development?
>
> >             Additionally, the new mainline can keep a mirror of the
> > abandoned mainline in its repository, because there are virtually no
> > additional storage requirements to doing so.
>
> And this part I don't understand. I can understand the mainline
> storing the revisions, but I don't understand how it could make them
> accessible by the published revision numbers of the "abandoned"
> line. And that's the problem.

With this sort of setup, I would publish my branches in a directory
tree like this:

    /repo
        /branch1
        /branch2

I make "/repo" a Bazaar repository so that it stores the revision data
for all branches contained in the directory (the tree contents,
revision meta data, etc).

The "/repo/branch1" directory essentially just contains a list of
mainline revision IDs that identify the branch.  It could probably
just store the head revision ID, but there are some optimisations that
make use of the linear history here.

If the ancestry of "/repo/branch2" is a subset of branch1 (as it might
be in the case of forked then merged projects), then all its
revision data will already be in the repository from when branch1 was
imported.  The only cost of keeping the branch around (and publishing
it) is the list of revision IDs in its mainline history.

For similar reasons, the cost of publishing 20 related Bazaar branches
on my web server is generally not 20 times the cost of publishing a
single branch.
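
A toy model of that storage argument, with invented names (this is not
Bazaar's actual repository format): the repository keeps each revision
once, and importing a branch only adds the revisions that are not
already there.

```c
#include <assert.h>
#include <string.h>

#define MAX_REVS 64

/* Hypothetical shared store: each revision's data is kept only once. */
struct repo {
	const char *revid[MAX_REVS];
	int nr;
};

int repo_has(const struct repo *r, const char *revid)
{
	int i;

	for (i = 0; i < r->nr; i++)
		if (!strcmp(r->revid[i], revid))
			return 1;
	return 0;
}

/*
 * Import a branch, given as its list of revision ids.  Returns how many
 * revisions had to be added; ids already in the store cost nothing.
 */
int repo_import_branch(struct repo *r, const char *const *revids, int nr)
{
	int i, added = 0;

	for (i = 0; i < nr; i++) {
		if (repo_has(r, revids[i]))
			continue;
		assert(r->nr < MAX_REVS);
		r->revid[r->nr++] = revids[i];
		added++;
	}
	return added;
}
```

Importing a second branch whose ancestry is a subset of the first adds
nothing to the store; all that remains is the branch's own list of IDs.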

I understand that you get similar benefits from a GIT repository with
multiple head revisions.


> > > But for these communications, revision numbers will not provide
> > > historically stable values that can be used.
> >
> > They certainly can.
> >
> > The coder says "I've put up a branch at http://example.com/bzr/feature.
> >  In revision 5, I started work on feature A.  I finished work in
> > revision 6.  But then I had to fix a related bug in revision 7."
>
> "I've put this branch up" isn't historically stable...

With the repository structure mentioned above, the cost of publishing
multiple branches is quite low.  If I continue to work on the project,
then there are no particular bandwidth or disk space reasons for me to
cut off access to my old branches.

For similar reasons, it doesn't cost me much to mirror other people's
related branches if I really care about them.

> > As long as that coder is active
>
> ...which is what you just said there yourself.
>
> On the other hand, git names really do live forever, regardless of
> where the code is hosted or how it moves around. When I'm talking
> about historical stability, I'm talking about being able to publish
> numbers that live forever.
>
> It sounds like bzr has numbers like this inside it, (but not nearly as
> simple as the ones that git has), but that users aren't in the
> practice of communicating with them. Instead, users communicate with
> the unstable numbers. And that's a shame from an historical
> perspective.

If you need that level of stability then you want the revision
identifier in both the GIT and Bazaar cases.

As for simplicity, note that Bazaar doesn't extract any special
meaning from the "$email-$date-$random" format of the revision
identifiers.  The only property it cares about is that they are
globally unique.  For example, revision identifiers generated by the
Arch -> Bazaar importer have a different format and are handled the
same.


> > This is true, but his code is likely to all land in the mainline at
> > once.  Since his own revnos are more fine-grained, he's not likely want
> > to use the mainline revnos.
>
> What I'd like to be able to do, is advertise a temporary repository,
> and while using it, publish names for revisions that will still be
> valid when the code gets pushed out to the mainline. That is
> supporting distributed development, and everything I'm hearing says
> that the bzr revision numbers don't support that.

That is correct.  The revision numbers assigned to particular
revisions in the context of one branch won't necessarily be the same
as the numbers in another branch.


> > I felt that you were mischaracterizing my _statement_ that "it's
> > exceedingly uncommon for [revnos] to change" as an _argument_ "it's
> > exceedingly uncommon for [revnos] to change".  The reality is that we
> > keep saying revnos don't change because git users keep saying "but what
> > if the revnos change?".
>
> OK.
>
> The original claim that sparked the discussion was that bzr has a
> "simple namespace" while git does not. We've been talking for quite a
> while here, and I still don't fully understand how these numbers are
> generated or what I can expect to happen to the numbers associated
> with a given revision as that revision moves from one repository to
> another. It's really not a simple scheme.

I can't say anything about the dotted revision numbers that have been
recently introduced to Bazaar, but I have definitely found the simple
numeric revision numbers for mainline revisions useful when using
Bazaar.  The revisions with these short revision numbers are generally
the ones I am most interested in when working on that branch.

It hasn't ever seemed a problem that those revisions no longer had short
revision numbers assigned to them when someone else merged my branch.


> Meanwhile, I have been arguing that the "simple" revision numbers that
> bzr advertises have restrictions on their utility, (they can only be
> used with reference to a specific repository, or with reference to
> another that treats it as canonical). I _think_ I understand the
> numbers well enough to say that still.

Using Bazaar terminology, the revision numbers are specific to a
particular _branch_.  If I copy a branch from one repository to
another, its revision numbers will stay the same.  And conversely, two
branches in the same repository can have different revision numbers.


> Compare that with the git names. The scheme really is easy to
> understand, (either the new user already understands cryptographic
> hashes, or else it's as easy as "a long string of digits that git
> assigns as the name"). The names have universal utility in time and
> space, (for definitions of the universe larger than I will ever be
> able to observe anyway). And the natural inclination to abbreviate
> a name when repeating it, (note the recent post with bzr UUIDs
> exhibiting the same inclination), doesn't make the names any less
> useful since the abbreviation alone will almost always work.
>
> The naming in git really is beautiful and beautifully simple.

I don't think anyone is saying that universally unique names are bad.
But I also don't see a problem with using shorter names that only have
meaning in a local scope.

I've noticed some people using abbreviated SHA1 sums with GIT.  Isn't
that also a case of trading potential global uniqueness for
convenience when working in a local scope?
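
As a rough sketch of what that trade looks like, the Python below
resolves an abbreviation against the set of ids known locally. It is an
illustration only, not how git implements it; the function name and the
second id in the example are made up.

```python
def resolve_prefix(prefix, known_ids):
    """Return the single id in `known_ids` that starts with `prefix`."""
    matches = [object_id for object_id in known_ids
               if object_id.startswith(prefix)]
    if not matches:
        raise KeyError(f"no object matches {prefix!r}")
    if len(matches) > 1:
        # The abbreviation is only meaningful while it stays unique locally.
        raise ValueError(f"{prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


known = [
    "c3424aebbf722c1f204931bf1c843e8a103ee143",
    "c34f0000000000000000000000000000000000aa",  # hypothetical second id
]
full_id = resolve_prefix("c342", known)
```

A prefix such as "c34" would be rejected as ambiguous in this example,
which is the local-scope limitation in question.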


James.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 15:30                                           ` Aaron Bentley
@ 2006-10-20  3:14                                             ` Tim Webster
  2006-10-20  4:05                                               ` Aaron Bentley
  2006-10-20 10:44                                             ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Tim Webster @ 2006-10-20  3:14 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, Christian MICHON, Andreas Ericsson, bazaar-ng, git

On 10/19/06, Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Tim Webster wrote:
> > First I want to say every SCM I know of sucks when it comes to tracking
> > configurations, simply because they don't record or restore file metadata,
> > like perms, ownership, and acl.
>
> Arch supports that kind of metadata.
>
> I believe SVN supports recording arbitrary file properties, so it's just
> a matter of applying those properties to the tree.

Yes, svn has arbitrary properties which can be manipulated.
They are not really intended for permissions, ownership, and ACLs.
Using the svn properties for this requires adding SCM tools.
Also, svn does not allow files in the same directory to live in
multiple repos.

>
> > Somethings I like the SCM tools to handle. Personally I would like the

> > Collaborative document editing and white boarding are other requirements.
> > odf and svg are xml file formats. I would like to see an efficient
> > xml diff as part of the SCM core. Using mime types SCM tools can unzip
> > files, bundles, and use mime type information to the SCM core xml
> > diff, plain diff
> > as required.
>
> An XML diff/patch or merge will not handle ODF properly.  There's too
> much extra semantic information.

I have only experimented with XML diffs on ODF files.
From my experience, XML diffs work fine on SVG files.
For more information, please refer to
http://www.unibw.de/inf2/OO_VCS/oo_rcs_api.html


> > I think it is essential that the SCM core include
> > provisions for multiple
> > repo partners.
>
> You mean multiple merge sources?

Yes, multiple merge sources are handy for collaborative document editing.


* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  2:27                                                                   ` Jan Harkes
  2006-10-20  2:30                                                                     ` Junio C Hamano
@ 2006-10-20  3:36                                                                     ` Nicolas Pitre
  1 sibling, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-20  3:36 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Linus Torvalds, Junio C Hamano, git

On Thu, 19 Oct 2006, Jan Harkes wrote:

> If we see a complete object it will remain complete. If we find a delta,
> and we have the base in the current repository it will be expanded to a
> complete object.
> When we get a delta that doesn't have a base in the
> current repository it will remain unresolved and is written out as a
> delta.

But the point of the whole exercise is actually to avoid unresolved 
deltas.  And you know if you have unresolved deltas only when the whole 
pack has been processed.

If the base object is not in the repository but it is in the pack 
_after_ the delta that needs it, you won't have resolved it.  If this is 
a thin pack with missing base objects for whatever reason you're 
screwed.

If the delta has its base object in both the repository _and_ in the 
pack but after the delta then you will have expanded the delta 
needlessly.

So your solution is suboptimal.

The optimal solution really consists of appending missing base objects 
to a thin pack in order to make it complete, or error out if those 
cannot be found.
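
A minimal Python sketch of that last idea, under simplifying
assumptions: a pack is modelled as a list of dicts with an "id" and an
optional "base", the repository as a set of ids, and a missing base is
appended as a full object. The names are made up for illustration and
this is not the actual pack code.

```python
def complete_thin_pack(pack, repo_objects):
    """Return a copy of `pack` with missing delta bases appended.

    `pack` is a list of dicts with an "id" and an optional "base" (the id
    of the object a delta is stored against).  `repo_objects` is the set
    of ids already present in the local repository.
    """
    # The whole pack has to be scanned before we know which bases it holds.
    in_pack = {entry["id"] for entry in pack}
    completed = list(pack)
    for entry in pack:
        base = entry.get("base")
        if base is None or base in in_pack:
            continue  # full object, or the base is somewhere in the pack
        if base not in repo_objects:
            raise LookupError(
                f"missing base object {base} for {entry['id']}")
        completed.append({"id": base})  # append the base as a full object
        in_pack.add(base)
    return completed
```

Deltas whose base appears later in the pack are left alone, and each
missing base is appended only once.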


Nicolas


* Re: VCS comparison table
  2006-10-19 16:14                                           ` Matthieu Moy
@ 2006-10-20  3:40                                             ` Tim Webster
  0 siblings, 0 replies; 1752+ messages in thread
From: Tim Webster @ 2006-10-20  3:40 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Christian MICHON, Andreas Ericsson, bazaar-ng, git

On 10/20/06, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> "Tim Webster" <tdwebste@gmail.com> writes:
>
> > First I want to say every SCM I know of sucks when it comes to tracking
> > configurations, simply because they don't record or restore file metadata,
> > like perms, ownership, and acl.
>
> That's not a simple matter.
>
> Tracking ownership hardly makes sense as soon as you have two
> developers on the same project. What does it mean to checkout a file
> belonging to user foo and group bar on a system not having such user
> and group?
>
> That said, it can be interesting to have it, but disabled by default.

Yes I agree it should be disabled by default. And enabled based on the
local settings.


* Re: VCS comparison table
  2006-10-20  3:14                                             ` Tim Webster
@ 2006-10-20  4:05                                               ` Aaron Bentley
  2006-10-21 12:30                                                 ` Jan Hudec
  0 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20  4:05 UTC (permalink / raw)
  To: Tim Webster
  Cc: Christian MICHON, Andreas Ericsson, bazaar-ng, git, Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tim Webster wrote:
> On 10/19/06, Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
>> I believe SVN supports recording arbitrary file properties, so it's just
>> a matter of applying those properties to the tree.
> 
> yes svn has arbitrary properties which can be manipulated.
> They are not really intended for permissions, ownership, and acl.
> To use the svn properties for this requires adding scm tools.

Agreed.  I think it's okay to require extra work to set the scm up to
handle configurations.

> Also svn does not allow files in the same directory to live in
> multiple repos

It would surprise me if many SCMs that support atomic commit also
support intermixing files from multiple repos in the same directory.

>> You mean multiple merge sources?
> 
> yes, Multiple merge sources is handy for collaborative document editing

That's something I'd like for software development, too.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOEsO0F+nu1YWqI0RAo+6AJ9lzF0+O1I8rgkyCOdhsir1gjo0NQCfXEVV
EIsDmS+eR/7cHKQfmnPJRA4=
=g5jk
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
@ 2006-10-20  5:05                                             ` Linus Torvalds
  2006-10-20  7:47                                               ` Lachlan Patrick
  2006-10-20  9:57                                             ` Jakub Narebski
                                                               ` (3 subsequent siblings)
  4 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20  5:05 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Andreas Ericsson, Carl Worth, bazaar-ng, git, Jakub Narebski



On Thu, 19 Oct 2006, Aaron Bentley wrote:
> 
> I understand your argument now.  It's nothing to do with numbers per se,
> and all about per-branch namespaces.  Correct?

I don't know if that is what Carl's problem is, but yes, to somebody from 
the git world, it's totally insane to have the _same_ commit have ten 
different names just depending on which branch it was in.

In git-land, the name of a commit is the same in every branch.

Do you have something like

	gitk --all

in your graphical viewers? That one shows _all_ the branches of a 
repository, and how they relate to each other in git. How do you name your 
commits in such a viewer, since every branch has a _different_ name for 
the same commit?

			Linus


* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-20  2:09                         ` Horst H. von Brand
@ 2006-10-20  5:38                           ` Jan Hudec
  0 siblings, 0 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-20  5:38 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: Robert Collins, Petr Baudis, bazaar-ng, git

On Thu, Oct 19, 2006 at 11:09:49PM -0300, Horst H. von Brand wrote:
> Jan Hudec <bulb@ucw.cz> wrote:
> 
> [...]
> 
> > Reading this thread I came to think, that the revnos should be assigned
> > to _all_ revisions _available_, in order of when they entered the
> > repository (there are some possible variations I will mention below)
> > 
> >  - Such revnos would be purely local, but:
> >    - Current revnos are not guaranteed to be the same in different
> >      branches either.
> >    - They could be done so that mirror has the same revnos as the
> >      master.
> 
> Then they are almost useless (except for people working alone). You need to
> be able to talk about a particular commit with others working independently.

As they currently are you can't either. Because currently it is
guaranteed that the revnos will be the same in two branches with the
same current revision. But when the current revisions differ, the
numbers may as well.

Moreover currently they can change for the same branch over time, while
with the alternate proposal they would not, so you could reliably say
revision 567 on foo.

> >  - They would be easier to use than the dotted ones. What (at least as
> >    far as I understand) makes revnos easier to use than revids is, that
> >    you can remember few of them for short time while composing some
> >    operation. Ie. look up 2 or 3 revisions in the log and then do some
> >    command on them. And a 4 to 5-digit number like 10532 is easier to
> >    remember than something like 3250.2.45.86.
> 
> Probably. In git you can (mostly) get away with partial SHA-1's, BTW.

1) Partial SHA-1 is still longer (starts being useful at 6 digits,
   usually you need 8).
2) Decimal numbers are easier to remember than hexadecimal ones.
3) The hashes are not ordered.

> >  - Their ordering would be an (arbitrary) superset of the partial
> >    ordering by descendance, ie. if revision A is ancestor of B, it would
> >    always have lower revno.
> >    - The intuition that lower revno means older revision would be always
> >      valid for related revisions and approximately valid for unrelated
> >      ones.
> >  - They would be *locally stable*. That is once assigned the revno would
> >    always mean the same revision in given branch (as determined by
> >    location, not tip).
> 
> Tip-relative is extremely useful: I wouldn't normally remember the current
> revision, but I'll probably often be talking about "the change before this
> one" and so on.

That's a separate issue, however. Negative numbers are tip-relative and
there are various prefixes in bzr (like before:, ancestor: etc.) for
relative revision addressing.

> >      - This is more than the current scheme can give, since now pull can
> >        renumber revisions.
> 
> Urgh. Get an update, and all your bearings change?

Currently yes. Currently pull changes the branch to be a mirror of the
pulled-from branch, including the way they are numbered.

> >  - They wouldn't make any branch special, so the objections Linus raised
> >    does not apply.
> 
> But the original branch /is/ special?

Some branches are usually special, but which they are may not
necessarily coincide with the left-parent lineage.

> >  - They would be the same as subversion and svk, and IIRC mercurial as
> >    well, use, so:
> >    - They would already be familiar to users coming from those systems.
> >    - They are known to be useful that way. In fact for svk it's the only
> >      way to refer to revisions and seem to work satisfactorily (though
> >      note that svk is not really suitable to ad-hoc topologies).
> 
> SVN is /centralized/, there it does make sense talking about (the one and
> only) history. In a distributed system, potentially each has a different

Did you notice that I also said svk and mercurial? They both *ARE*
distributed (well, svk has its limitations, but mercurial is really very
similar to git).

> history, and they are intertwined.
> 
> Not at all useful.

There are no global persistent revision numbers in a distributed system.
There can't be. But numbers with limited scope can still be really
useful. The question is what that scope should be.
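
To illustrate the proposal, here is a small Python sketch of locally
stable numbers handed out in arrival order. It is only a model of the
idea, not anything bzr does; the class and method names are made up,
and it assumes parents always enter the repository before their
children.

```python
class LocalRevnoTable:
    """Assign revnos in the order revisions enter one local repository."""

    def __init__(self):
        self._revno_of = {}   # revision id -> revno
        self._rev_at = []     # index (revno - 1) -> revision id

    def add(self, rev_id, parent_ids=()):
        """Record a revision and return its revno; re-adding changes nothing."""
        if rev_id in self._revno_of:
            return self._revno_of[rev_id]
        for parent in parent_ids:
            if parent not in self._revno_of:
                raise ValueError(f"parent {parent} must be added first")
        self._rev_at.append(rev_id)
        self._revno_of[rev_id] = len(self._rev_at)
        return self._revno_of[rev_id]

    def lookup(self, revno):
        """Return the revision id that a revno refers to."""
        if not 1 <= revno <= len(self._rev_at):
            raise IndexError(f"no revision numbered {revno}")
        return self._rev_at[revno - 1]
```

Because a revision is only numbered after all of its parents, an
ancestor always gets a lower number than its descendants, and a number
never moves once it has been handed out.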

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>


* Re: VCS comparison table
  2006-10-20  5:05                                             ` Linus Torvalds
@ 2006-10-20  7:47                                               ` Lachlan Patrick
  2006-10-20  8:38                                                 ` Johannes Schindelin
  2006-10-20 10:16                                                 ` Petr Baudis
  0 siblings, 2 replies; 1752+ messages in thread
From: Lachlan Patrick @ 2006-10-20  7:47 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Linus Torvalds wrote:
> 
> On Thu, 19 Oct 2006, Aaron Bentley wrote:
>> I understand your argument now.  It's nothing to do with numbers per se,
>> and all about per-branch namespaces.  Correct?
> 
> I don't know if that is what Carl's problem is, but yes, to somebody from 
> the git world, it's totally insane to have the _same_ commit have ten 
> different names just depending on which branch it was in.
> 
> In git-land, the name of a commit is the same in every branch.

I've been following the git-vs-bzr discussion, and I'd like to ask a
question (being new to both bzr and git). How does git disambiguate SHA1
hash collisions? I think git has an alternative way to name revisions
(can someone please explain it in more detail, I've seen <ref>~<n>
mentioned only in passing in this thread). It seems to me collisions are
a good argument in favour of having two independent naming schemes, so
that you're not solely relying on hashes being unique.

A strong argument is that a global namespace based on hashes of data is
ideal because the names are generated from the data being named, and
therefore are immutable. Same data => same name for that data, always
and forever, which is desirable when merging named data from many
sources. But the converse isn't true: one name does not necessarily map
to only that data. Have I misunderstood? Is this a problem?
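
For reference, the "same data => same name" property can be sketched in
Python. Git names a blob by the SHA-1 of a short header ("blob", the
length, a NUL byte) followed by the content; the function name below is
made up for illustration.

```python
import hashlib


def git_blob_id(data):
    """Return the hex object name git would give a blob with this content."""
    header = f"blob {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


empty_id = git_blob_id(b"")
```

The name depends only on the bytes, so two repositories that store the
same content arrive at the same name without talking to each other.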

Ta,
Loki


* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
  2006-10-17 10:30             ` Johannes Schindelin
  2006-10-17 19:51             ` Aaron Bentley
@ 2006-10-20  8:26             ` James Henstridge
  2006-10-20 10:19               ` Jakub Narebski
  2006-10-20  8:56             ` Erik Bågfors
  3 siblings, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-20  8:26 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On 17/10/06, Sean <seanlkml@sympatico.ca> wrote:
> > - - you can use a checkout to maintain a local mirror of a read-only
> >   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>
> I'm not sure what you mean here.  A bzr checkout doesn't have any history
> does it?  So it's not a mirror of a branch, but just a checkout of the
> branch head?

There are two forms of checkout: a normal checkout which contains the
complete history of the branch, and a lightweight checkout, which just
has a pointer back to the original location of the history.

In both cases, a "bzr commit" invocation will commit changes to the
remote location.  In general, you only want to use a lightweight
checkout when there is a fast, reliable connection to the branch (e.g.
if it is on the local file system, or local network).

Aaron would be talking about a normal (heavyweight) checkout here.
With a heavyweight checkout, you can do pretty much anything without
access to the branch.  In contrast, almost all operations on a
lightweight checkout need access to the branch.

James.


* Re: VCS comparison table
  2006-10-20  7:47                                               ` Lachlan Patrick
@ 2006-10-20  8:38                                                 ` Johannes Schindelin
  2006-10-20 10:13                                                   ` Petr Baudis
  2006-10-20 11:09                                                   ` Jakub Narebski
  2006-10-20 10:16                                                 ` Petr Baudis
  1 sibling, 2 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-20  8:38 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git

Hi,

On Fri, 20 Oct 2006, Lachlan Patrick wrote:

> How does git disambiguate SHA1 hash collisions?

It does not. You can fully expect the universe to go down before that 
happens.

The only reasonable worry is about SHA-1 being broken some time in future, 
i.e. being able to construct a malign version of some source code _which 
has the same hash_. There were plenty of discussions about that; Please 
search the mailing list. (The consensus was that those do not matter, 
because an existing object will _never_ be overwritten by a fetch, so you 
would not get that invalid object anyway.)
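
The "never overwritten" policy can be sketched as a toy in-memory store
in Python. This is only an illustration of the rule, not git's object
database code; the class and method names are made up.

```python
class ObjectStore:
    """Content-addressed store in which the first copy of an id wins."""

    def __init__(self):
        self._objects = {}

    def put(self, object_id, data):
        """Store `data` under `object_id` unless that id already exists.

        Returns True if the object was stored, False if an existing
        object with the same id was kept instead.
        """
        if object_id in self._objects:
            return False
        self._objects[object_id] = data
        return True

    def get(self, object_id):
        return self._objects[object_id]
```

With such a rule, a later object that claims an id you already have is
simply ignored, so your existing copy stays intact.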

Hth,
Dscho


* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
                               ` (2 preceding siblings ...)
  2006-10-20  8:26             ` James Henstridge
@ 2006-10-20  8:56             ` Erik Bågfors
  3 siblings, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-20  8:56 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

> > - - you can use a checkout to maintain a local mirror of a read-only
> >   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>
> I'm not sure what you mean here.  A bzr checkout doesn't have any history
> does it?  So it's not a mirror of a branch, but just a checkout of the
> branch head?

In bzr there are two different kinds of checkouts.  One is called a
lightweight checkout, and that's really a "normal" checkout in the way
svn, for example, does it.  In this mode, you have the branch remotely
and only the working tree locally.  So it's just a checkout of the
branch head (or any other revision if using -r when doing the
checkout).

Then there are non-lightweight checkouts, i.e. heavyweight checkouts.
These are the default type.  A heavyweight checkout is in fact a full
branch locally, but it is "bound" to the remote branch.  What this
means is that all commands such as diff/status/log/etc can be done
locally. So it's really quick.

It acts the same as a lightweight checkout in most regards, so when I
run "bzr update" it actually pulls from the remote branch, and when I
run "bzr commit" it commits the same revision in both the remote
branch and the local branch. It does this in one transaction so one
can't work and the other fail (they would both fail in that case).

What this also gives you is that when you want to clone the branch,
you don't need to go to the remote branch to get the revisions and
also, when being offline, you can commit locally.

Committing locally is a very cool feature in my mind.  If you work in
a centralized manner with checkouts, you normally commit directly to
the central branch, but when you are offline, that will fail (of
course :) ).  So what you can do then is to run "bzr commit --local"
to commit only to your local checkout branch, then when you get online
again you can run "bzr update".  In this case the update will take any
new commits that have been made while you were away, pull them into
your local branch, and make your local commits into something that has
been merged into the "checkout".

I find this REALLY useful.

Don't know if that made sense, here it is in commands.

$ bzr checkout t p
$ cd p
$ echo hej >> hosts
$ bzr commit --local -m 'offline'
$ echo hej >> hosts
$ bzr commit --local -m 'offline 2'

Now I get back, someone has committed new stuff... I run bzr update
$ bzr update
All changes applied successfully.
Updated to revision 2.
Your local commits will now show as pending merges with 'bzr status',
and can be committed with 'bzr commit'.
$ bzr status
modified:
  hosts
pending merges:
  Erik Bågfors 2006-10-20 offline 2
    Erik Bågfors 2006-10-20 offline
$ bzr commit -m 'my offline stuff'
modified hosts
Committed revision 3.

$ bzr log -r-1
------------------------------------------------------------
revno: 3
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: p
timestamp: Fri 2006-10-20 10:51:08 +0200
message:
  my offline stuff
    ------------------------------------------------------------
    merged: erik@bagfors.nu-20061020084949-8bc43db8f5cd449b
    committer: Erik Bågfors <erik@bagfors.nu>
    branch nick: p
    timestamp: Fri 2006-10-20 10:49:49 +0200
    message:
      offline 2
    ------------------------------------------------------------
    merged: erik@bagfors.nu-20061020084945-13e5093f98c0c380
    committer: Erik Bågfors <erik@bagfors.nu>
    branch nick: p
    timestamp: Fri 2006-10-20 10:49:45 +0200
    message:
      offline

I think that bzr really allows you to work well in a centralized
environment as well as a distributed one, which is one of the things I
like best about bzr.

Regards,
Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com


* Signed git-tag doesn't find default key
@ 2006-10-20  9:04 Andy Parkins
  2006-10-20 16:32 ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Andy Parkins @ 2006-10-20  9:04 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]

Hello,

I did this:

$ git tag -s adp-sign-tag
gpg: skipped "Andy Parkins <andyparkins@gmail.com>": secret key not available
gpg: signing failed: secret key not available
failed to sign the tag with GPG.

I believe the problem is that I have used the comment field in my key's UID 
definition.

$ gpg --list-keys andy
pub   1024D/4F712F6D 2003-08-14
uid                  Andy Parkins (Google) <andyparkins@gmail.com>

So when git-tag looks for "Andy Parkins <andyparkins@gmail.com>", it's not 
found.  The answer is (I think) to search only on the email address when 
looking for a key.  I've simply changed git-tag to have

username=$(git-repo-config user.email)

However, this is clearly wrong as what it actually wants is the committer 
email.  Am I safe to simply process the $tagger variable to extract it?
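
To show the kind of extraction meant here, a Python sketch follows. It
assumes the tagger string has the usual ident shape of a name, an
address in angle brackets, then a timestamp and timezone; the function
name and the timestamp in the example are made up, and git-tag itself
is a shell script, so this is an illustration only.

```python
import re


def extract_email(ident):
    """Return the address between angle brackets in an ident string."""
    match = re.search(r"<([^<>]+)>", ident)
    if match is None:
        raise ValueError(f"no email address found in {ident!r}")
    return match.group(1)


# Example ident; the timestamp and timezone are invented.
tagger = "Andy Parkins <andyparkins@gmail.com> 1161335040 +0100"
email = extract_email(tagger)
```

Searching gpg by the bare address would then match the key regardless
of the comment field in its UID.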



Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE
andyparkins@gmail.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: VCS comparison table
  2006-10-17 22:00                   ` Sean
  2006-10-17 22:44                     ` Aaron Bentley
@ 2006-10-20  9:43                     ` Matthieu Moy
  2006-10-24  6:02                       ` Lachlan Patrick
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-20  9:43 UTC (permalink / raw)
  To: Sean; +Cc: Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Sean <seanlkml@sympatico.ca> writes:

> On Tue, 17 Oct 2006 17:27:44 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
>
>> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
>> and more.  Because Python supports monkey-patching, a plugin can change
>> absolutely anything.
>
> But really why does any of that matter?  This is the open source world.
> We don't need plugins to extend features, we just add the feature to
> the source.  The example I asked about earlier is a case in point. 
> Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
> was implemented as a command without any issue at all,

The plugin Vs core feature is not a technical problem. The code for a
plugin and for a core functionality will roughly be the same, but in a
different file.

There can be many reasons why you want to implement something as a
plugin:

* This is project-specific, upstream is not interested (for example,
  bzr has a plugin to submit a merge request to a robot, it will
  probably never come in the core).

* The feature is not mature enough, so you don't want to merge it in
  upstream, but you want to make it available to people without
  patching (for example, "bzr uncommit" was once in the bzrtools
  plugin, and finally landed in upstream).

* The feature you're adding is only of use to a small subset of
  users. You don't want to pollute, in particular "bzr help commands"
  with it, especially not to disturb beginners. I've been arguing in
  favor of a configuration option to hide commands from "bzr help
  commands" instead, but nobody seemed interested.

* Explicit divergent points of view between the implementor of the
  plugin and upstream. That avoids a fork. I don't remember any such
  case with bzr.

I'd compare bzr's plugins to Firefox extensions. Geeks used to like
the big Mozilla-with-tons-of-config-options, but
Firefox-with-only-the-most-relevant-features is the one which allowed
a wide adoption by non-geeks. Still, geeks can customize their
browser, and add features without having to wait for Mozilla Foundation
to incorporate it in upstream.

Now, I don't know git enough to know whether the way it is extensible
allows all of the above, but bzr's plugin system is quite good at that.
At the time git was almost exclusively used by the kernel, you didn't
have all those problems since you targeted only one community, but I
guess you already had some needs for flexibility.


* Re: VCS comparison table
  2006-10-20  2:53                                           ` James Henstridge
@ 2006-10-20  9:51                                             ` Jakub Narebski
  2006-10-20 10:42                                               ` James Henstridge
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20  9:51 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

James Henstridge wrote:
> On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
>> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:

>>>             Additionally, the new mainline can keep a mirror of the
>>> abandoned mainline in its repository, because there are virtually no
>>> additional storage requirements to doing so.
>>
>> And this part I don't understand. I can understand the mainline
>> storing the revisions, but I don't understand how it could make them
>> accessible by the published revision numbers of the "abandoned"
>> line. And that's the problem.
> 
> With this sort of setup, I would publish my branches in a directory
> tree like this:
> 
>     /repo
>         /branch1
>         /branch2
> 
> I make "/repo" a Bazaar repository so that it stores the revision data
> for all branches contained in the directory (the tree contents,
> revision meta data, etc).

And here we have a feature which is as far as I see unique to git,
namely to have persistent branches with _separate namespace_. It means
that we can have hierarchical branch names (including names like
"remotes/<remotename>/<branch of remote>", or "jc/diff"), and we don't
have to guess where repository name ends and branch name begins.

The idea of "branches (and tags) as directories" was if I understand
it correctly introduced by Subversion, and from what can be seen from
troubles with git-svn (stemming from the fact that division between
project name and branch name is the matter of _convention_) at least
slightly brain-damaged.
 
> The "/repo/branch1" essentially just contains a list of mainline
> revision IDs that identify the branch.  This could probably just
> store the head revision ID, but there are some optimisations that make
> use of the linear history here.
> 
> If the ancestry of "/repo/branch2" is a subset of branch1 (as it might
> be in the case of forked then merged projects), then all its
> revision data will already be in the repository when branch1 was
> imported.  The only cost of keeping the branch around (and publishing
> it) is the list of revision IDs in its mainline history.
> 
> For similar reasons, the cost of publishing 20 related Bazaar branches
> on my web server is generally not 20 times the cost of publishing a
> single branch.
> 
> I understand that you get similar benefits by a GIT repository with
> multiple head revisions.

You can get similar benefits by a GIT repository with shared object
database using alternates mechanism. And that is usually preferred
over storing unrelated branches, i.e. branches pointing to disconnected
DAG (separate trees in BK terminology) of revisions, if that is what you mean by
multiple head revisions (because in GIT there is no notion of "mainline"
branch, only of current (HEAD) branch).


>>>> But for these communications, revision numbers will not provide
>>>> historically stable values that can be used.
>>>
>>> They certainly can.
>>>
>>> The coder says "I've put up a branch at http://example.com/bzr/feature.
>>>  In revision 5, I started work on feature A.  I finished work in
>>> revision 6.  But then I had to fix a related bug in revision 7."
>>
>> "I've put this branch up" isn't historically stable...
> 
> With the repository structure mentioned above, the cost of publishing
> multiple branches is quite low.  If I continue to work on the project,
> then there is no particular bandwidth or disk space reasons for me to
> cut off access to my old branches.
> 
> For similar reasons, it doesn't cost me much to mirror other people's
> related branches if I really care about them.

But the revision number in this case _changes_. It is from 7 to
branch:7 but still it changes somewhat.

[...]
>> The naming in git really is beautiful and beautifully simple.
> 
> I don't think anyone is saying that universally unique names are bad.
> But I also don't see a problem with using shorter names that only have
> meaning in a local scope.
> 
> I've noticed some people using abbreviated SHA1 sums with GIT.  Isn't
> that also a case of trading potential global uniqueness for
> convenience when working in a local scope?

Emphasis on _potential_. A SHA1 id abbreviated to 6 characters might
not be unique in a larger project, but for example the chance that a
SHA1 id abbreviated to 7 or 8 characters is not unique is really low.
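
A back-of-the-envelope estimate can be sketched in Python. It assumes
ids behave like uniformly random hex strings and asks how likely it is
that at least one other object in the repository shares a given
abbreviation; the function name is made up for illustration.

```python
import math


def ambiguous_prefix_probability(num_objects, prefix_len):
    """Chance that some other object shares a given hex prefix.

    Assumes object ids are uniformly random, so each of the other
    (num_objects - 1) ids matches the prefix with probability
    16 ** -prefix_len, independently of the rest.
    """
    if num_objects < 1:
        raise ValueError("num_objects must be at least 1")
    per_object = 16.0 ** -prefix_len
    # 1 - (1 - p) ** (n - 1), computed in a numerically stable way.
    return -math.expm1((num_objects - 1) * math.log1p(-per_object))
```

Each extra hex digit divides the per-object chance by 16, which is why
going from 6 to 8 characters makes such a difference.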


Yet another analogy:

SHA1 identifiers of commits (and not only commits) can be compared
to Message-Ids of Usenet messages, while revision numbers can be compared
to Xref number of Usenet message which if I understand correctly is unique
only for given news server. But Message-Ids cannot be shortened
meaningfully like SHA1 ids can; nevertheless they are used in communication
without any problems. Even if namespace is not simple ;-)


* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  5:05                                             ` Linus Torvalds
@ 2006-10-20  9:57                                             ` Jakub Narebski
  2006-10-20 10:02                                               ` Matthieu Moy
  2006-10-20 10:45                                               ` James Henstridge
  2006-10-20 11:00                                             ` Jakub Narebski
                                                               ` (2 subsequent siblings)
  4 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20  9:57 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
>> The naming in git really is beautiful and beautifully simple.
> 
> Well, you've got to admit that those names are at least superficially
> ugly. 

If you want a pretty name, you tag it. Tags are exchanged during
fetch/push operations. And you can have pretty names for revisions,
like v1.4.3.
 
>> It's not monotonically increasing from one revision to the next, but
>> I've never found that to be an issue. Of course, we do still use our
>> own "simple" names for versioning the releases and snapshots of
>> software we manage with git, and that's where being able to easily
>> determine "newer" or "older" by simple numerical examination is
>> important. I've honestly never encountered a situation where I was
>> handed two git sha1 sums and wished that I could do the same thing.
> 
> What's nice is being able see the revno 753 and knowing that "diff -r
> 752..753" will show the changes it introduced.  Checking the revno on a
> branch mirror and knowing how out-of-date it is.

Huh? If you want to see what changes were introduced by commit
c3424aebbf722c1f204931bf1c843e8a103ee143, you just do

# git diff c3424aebbf722c1f204931bf1c843e8a103ee143

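(or better "git show" instead of "git diff" or "git diff-tree").
If you give only one commit (only one revision), "git show" and
"git diff-tree" automatically give the diff against its parent(s);
plain "git diff" would compare that commit with your working tree.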


By the way, is referring to a revision by its revno _fast_?
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  9:57                                             ` Jakub Narebski
@ 2006-10-20 10:02                                               ` Matthieu Moy
  2006-10-20 10:45                                                 ` Andy Whitcroft
  2006-10-20 10:45                                               ` James Henstridge
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-20 10:02 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Andreas Ericsson, Carl Worth, git

Jakub Narebski <jnareb@gmail.com> writes:

> Huh? If you want what changes have been introduced by commit 
> c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
>
> # git diff c3424aebbf722c1f204931bf1c843e8a103ee143

How does git choose which ancestor to use if this revision has more
than one in this case?

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  8:38                                                 ` Johannes Schindelin
@ 2006-10-20 10:13                                                   ` Petr Baudis
  2006-10-20 11:09                                                   ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 10:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: bazaar-ng, git

  Hi,

Dear diary, on Fri, Oct 20, 2006 at 10:38:48AM CEST, I got a letter
where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
> 
> > How does git disambiguate SHA1 hash collisions?
> 
> It does not. You can fully expect the universe to go down before that 
> happens.
> 
> The only reasonable worry is about SHA-1 being broken some time in the future, 
> i.e. being able to construct a malign version of some source code _which 
> has the same hash_. There were plenty of discussions about that; please 
> search the mailing list. (The consensus was that those do not matter, 
> because an existing object will _never_ be overwritten by a fetch, so you 
> would not get that invalid object anyway.)

  well, that's a somewhat bold statement, since once you have a way to
fabricate malicious objects, you can probably socially engineer it into
a large portion of repositories if you try hard enough. Or you hack
kernel.org and replace the object. Who knows.

  But the thing is that no one has come anywhere close to this kind of
attack. The currently known attacks let you relatively quickly (which
doesn't mean "5 minutes"; I think that in the case of SHA1 the complexity
is still huge, just smaller than intended, but I may remember wrong; you
can get an MD5 collision of this kind within one minute on a standard
notebook) create a _pair_ of objects sharing the same hash, where both
objects contain a big binary blob. So you would first have to engineer
getting one of those objects accepted officially, then engineer getting
the malicious one in. Generating an object that hashes to a
predetermined value is a much harder problem, and AFAIK there has not
been much progress in breaking this.
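
  The "an existing object is never overwritten" argument quoted above can
be pictured with a toy content-addressed store. This is only a sketch under
loud assumptions: ToyObjectStore and its methods are made-up names, it
hashes the raw bytes (git itself hashes a small header plus the content),
and the "forged" object is simply delivered under a key we already hold,
since nobody can actually produce a colliding pair here.

```python
import hashlib


class ToyObjectStore:
    """Hypothetical in-memory store keyed by the SHA-1 of the content."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        """Store locally created content and return its key."""
        key = hashlib.sha1(data).hexdigest()
        # setdefault leaves an already stored object untouched.
        self._objects.setdefault(key, data)
        return key

    def receive(self, key: str, data: bytes) -> bool:
        """Simulate a fetch delivering (key, data) from elsewhere.

        Returns True if the object was stored, False if we already had
        an object under that key and therefore kept our own copy.
        """
        if key in self._objects:
            return False
        self._objects[key] = data
        return True

    def get(self, key: str) -> bytes:
        return self._objects[key]


if __name__ == "__main__":
    store = ToyObjectStore()
    key = store.put(b"trusted source code\n")
    # Pretend a remote offers different bytes under the same key.
    accepted = store.receive(key, b"malicious source code\n")
    print(accepted, store.get(key))
```

  In this toy model the first object seen under a key wins, which is why
the attacker would have to get the malicious half of a pair in before the
good one, or into repositories that have never seen the good one.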

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  7:47                                               ` Lachlan Patrick
  2006-10-20  8:38                                                 ` Johannes Schindelin
@ 2006-10-20 10:16                                                 ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 10:16 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 09:47:16AM CEST, I got a letter
where Lachlan Patrick <loki@research.canon.com.au> said that...
> I think git has an alternative way to name revisions
> (can someone please explain it in more detail, I've seen <ref>~<n>
> mentioned only in passing in this thread).

This is just a notation that lets you point to revisions relative to a
given id. <id>~<n> means the n-th generation ancestor of the given commit,
following first parents.
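
As a minimal sketch of what the notation resolves to, assuming a made-up
five-commit history (the PARENTS table and both function names are
illustrative only, not git internals): <id>^<n> picks the n-th parent of
a commit, and <id>~<n> follows the first parent n times.

```python
# Hypothetical history: M is a merge of B and C, D sits on top of M.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "M": ["B", "C"],
    "D": ["M"],
}


def nth_parent(commit: str, n: int = 1) -> str:
    """Model of <commit>^<n>: the n-th parent (1-based) of a commit."""
    return PARENTS[commit][n - 1]


def nth_ancestor(commit: str, n: int) -> str:
    """Model of <commit>~<n>: follow the first parent n times."""
    for _ in range(n):
        commit = nth_parent(commit, 1)
    return commit


if __name__ == "__main__":
    print(nth_ancestor("D", 1))  # first parent of D
    print(nth_ancestor("D", 2))  # first parent of the merge M
    print(nth_parent("M", 2))    # second parent of the merge M
```

Note that D~2 goes through the merge's first parent only; to reach the
other side of the merge you combine the two forms, as in D~1^2.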

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  8:26             ` James Henstridge
@ 2006-10-20 10:19               ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:19 UTC (permalink / raw)
  To: James Henstridge; +Cc: Sean, Aaron Bentley, Linus Torvalds, bazaar-ng, git

James Henstridge wrote:
> On 17/10/06, Sean <seanlkml@sympatico.ca> wrote:
> > > - - you can use a checkout to maintain a local mirror of a read-only
> > >   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> >
> > I'm not sure what you mean here.  A bzr checkout doesn't have any history
> > does it?  So it's not a mirror of a branch, but just a checkout of the
> > branch head?
> 
> There are two forms of checkout: a normal checkout which contains the
> complete history of the branch, and a lightweight checkout, which just
> has a pointer back to the original location of the history.
> 
> In both cases, a "bzr commit" invocation will commit changes to the
> remote location.  In general, you only want to use a lightweight
> checkout when there is a fast, reliable connection to the branch (e.g.
> if it is on the local file system, or local network).

So the "lightweight checkout" is the equivalent of the "lazy clone" we
have discussed at length on the git mailing list (without any resulting
code, unfortunately). The sticking point was how to make this fast
without needing a fast, reliable connection to the repository it was
cloned from; for example, whether to keep fetched objects in some kind
of cache, or even in the object database of the "lightweight
checkout"/"lazy clone" repository itself.

If the repository we do the "lightweight checkout"/"lazy clone" from is
on the local file system (perhaps a network file system), then we can use
the alternates mechanism (git clone -l -s). That's why "lazy clone" was
sometimes called "remote alternates".
 
> Aaron would be talking about a normal (heavyweight) checkout here.
> With a heavyweight checkout, you can do pretty much anything without
> access to the branch.  In contrast, almost all operations on a
> lightweight checkout need access to the branch.

We have a terminology conflict here. Bazaar-NG "pull" and "merge" vs.
GIT "fetch", "pull" and "merge"; Bazaar-NG "checkout" vs. GIT "clone"
and "checkout".

In GIT "clone" is what is used to copy a whole repository, "checkout"
is what is used to extract a given/current branch to a [given] working area.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 19:01             ` Nathaniel Smith
@ 2006-10-20 10:32               ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:32 UTC (permalink / raw)
  To: git

Nathaniel Smith wrote:

> Aaron Bentley <aaron.bentley <at> utoronto.ca> writes:
>
>> Bazaar also supports multiple unrelated branches in a repository, as
>> does CVS, SVN (depending how you squint), Arch, and probably Monotone.
> 
> It's quite common in Monotone.  You could probably do it in Mercurial as well,
> though I don't know that anyone does.  SVK definitely does it (since each user
> has a single repo that's shared by all the projects they work on).

I think that GIT's separation of the root, repository, and branch
namespaces is why there are so many calls for adding subproject
support to GIT; people want to carry their existing layout over to GIT
literally, for example by putting everything in one large repository.

In GIT there is no concept of a root, like in CVS or SVN. You can
put a repository anywhere. By default GIT looks for the repository
in the current directory or one of its parents; otherwise you have to
provide the location of the repository either by using the GIT_DIR
environment variable, or by using the --git-dir option to the git wrapper.
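
The lookup order described above can be sketched roughly as follows. This
is a simplification under stated assumptions: find_git_dir is a made-up
helper, it only recognises a ".git" directory, and it ignores everything
else the real tool may take into account.

```python
import os
from pathlib import Path


def find_git_dir(start, environ=None):
    """Return the repository directory for `start`, or None.

    An explicit GIT_DIR setting wins; otherwise walk from `start`
    up through its parents looking for a ".git" directory.
    """
    env = os.environ if environ is None else environ
    if env.get("GIT_DIR"):
        return Path(env["GIT_DIR"])
    current = Path(start).resolve()
    for directory in (current, *current.parents):
        candidate = directory / ".git"
        if candidate.is_dir():
            return candidate
    return None


if __name__ == "__main__":
    print(find_git_dir(os.getcwd()))
```

Because the repository is found by walking up from wherever you are,
nothing in the path has to encode a root, a project name, or a branch.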

And the branch namespace is totally separate. There are some
restrictions on branch names (caused by the notation GIT uses, for
example <branch>^ means the [first] parent of the commit given by
<branch>), but very few. Branch names can be hierarchical, like "jc/diff".

So there is no "store everything in URL/path" of
  /root/repo/branch
notation in GIT.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19  9:10                                       ` Matthieu Moy
  2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
@ 2006-10-20 10:40                                       ` Jakub Narebski
  2006-10-20 13:36                                         ` Shawn Pearce
  2006-10-21 12:30                                         ` Matthew D. Fuller
  2 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:40 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Andreas Ericsson wrote:

> Christian MICHON wrote:
>
>> close to 200 post on bzr-git war!
>> is this the right place (git mailing list) to discuss about future
>> features of bzr ?
>> 
> 
> Perhaps not, but the tone is friendly (mostly), the patience of the 
> bazaar people seems infinite and lots of people seem to be having fun 
> while at the same time learning a thing or two about a different SCM.
> Best case scenario, both git and bazaar come out of the discussion as 
> better tools. If there would never be any cross-pollination, git 
> wouldn't have half the features it has today.

And it certainly helps to explain the user-visible differences between
Bazaar-NG and GIT; I'd like to put a ComparisonWithBazaarNG page on
GitWiki (http://git.or.cz/gitwiki/) some time soon, in addition to the
ComparisonWithMercurial page I have meant to add for some time (stemming
from a discussion on the #revctrl channel on FreeNode), and in addition
to the existing GitSvnComparison page on GitWiki.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  9:51                                             ` Jakub Narebski
@ 2006-10-20 10:42                                               ` James Henstridge
  2006-10-20 13:17                                                 ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-20 10:42 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> James Henstridge wrote:
> > On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
> >> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:
>
> >>>             Additionally, the new mainline can keep a mirror of the
> >>> abandoned mainline in its repository, because there are virtually no
> >>> additional storage requirements to doing so.
> >>
> >> And this part I don't understand. I can understand the mainline
> >> storing the revisions, but I don't understand how it could make them
> >> accessible by the published revision numbers of the "abandoned"
> >> line. And that's the problem.
> >
> > With this sort of setup, I would publish my branches in a directory
> > tree like this:
> >
> >     /repo
> >         /branch1
> >         /branch2
> >
> > I make "/repo" a Bazaar repository so that it stores the revision data
> > for all branches contained in the directory (the tree contents,
> > revision meta data, etc).
>
> And here we have a feature which is as far as I see unique to git,
> namely to have persistent branches with _separate namespace_. It means
> that we can have hierarchical branch names (including names like
> "remotes/<remotename>/<branch of remote>", or "jc/diff"), and we don't
> have to guess where repository name ends and branch name begins.

With the above layout, I would just type:
    bzr branch http://server/repo/branch1

This command behaves identically whether the repository data is in
/repo or in /repo/branch1.  Someone pulling from the branch doesn't
have to care what the repository structure is.  Having a separate
namespace for branch names only really makes sense if the user needs
to care about it.

As for hierarchical names, there is nothing stopping you from using
deeper directory structures with Bazaar too.  Bazaar just checks each
successive parent directory until it finds a repository for the branch.


> The idea of "branches (and tags) as directories" was if I understand
> it correctly introduced by Subversion, and from what can be seen from
> troubles with git-svn (stemming from the fact that division between
> project name and branch name is the matter of _convention_) at least
> slightly brain-damaged.

I think you are a bit confused about how Bazaar works here.  A Bazaar
repository is a store of trees and revision metadata.  A Bazaar branch
is just a pointer to a head revision in the repository.  As you can
probably guess, the data for the branch is a lot smaller than the data
for the repository.

You can store the repository and branch in the same directory to get a
standalone branch.  The layout I described above has a repository in a
parent directory, shared by multiple branches.

If you are comparing Subversion and Bazaar, a Bazaar branch shares
more properties with a full Subversion repository rather than a
Subversion branch.


> > The "/repo/branch1" essentially just contains a list of mainline
> > revision IDs that identify the branch.  This could probably just
> > store the head revision ID, but there are some optimisations that make
> > use of the linear history here.
> >
> > If the ancestry of "/repo/branch2" is a subset of branch1 (as it might
> > be in the case of forked then merged projects), then all its
> > revision data will already be in the repository when branch1 was
> > imported.  The only cost of keeping the branch around (and publishing
> > it) is the list of revision IDs in its mainline history.
> >
> > For similar reasons, the cost of publishing 20 related Bazaar branches
> > on my web server is generally not 20 times the cost of publishing a
> > single branch.
> >
> > I understand that you get similar benefits by a GIT repository with
> > multiple head revisions.
>
> You can get similar benefits by a GIT repository with shared object
> database using alternates mechanism. And that is usually preferred
> over storing unrelated branches, i.e. branches pointing to disconnected
> DAG (separate trees in BK terminology) of revisions, if that is what you mean by
> multiple head revisions (because in GIT there is no notion of "mainline"
> branch, only of current (HEAD) branch).

I may have got the git terminology wrong. I was trying to draw
parallels between the .git/refs/... files in a git repository and the
way multiple branches can be stored in a Bazaar repository.

I am not claiming that you'll get bandwidth or disk space benefits for
storing unrelated branches in a single Bazaar repository.  But if the
branches are related, then there will be space savings (which is what
the great-grandparent post was asking about).


> >>>> But for these communications, revision numbers will not provide
> >>>> historically stable values that can be used.
> >>>
> >>> They certainly can.
> >>>
> >>> The coder says "I've put up a branch at http://example.com/bzr/feature.
> >>>  In revision 5, I started work on feature A.  I finished work in
> >>> revision 6.  But then I had to fix a related bug in revision 7."
> >>
> >> "I've put this branch up" isn't historically stable...
> >
> > With the repository structure mentioned above, the cost of publishing
> > multiple branches is quite low.  If I continue to work on the project,
> > then there is no particular bandwidth or disk space reasons for me to
> > cut off access to my old branches.
> >
> > For similar reasons, it doesn't cost me much to mirror other people's
> > related branches if I really care about them.
>
> But the revision number in this case _changes_. It is from 7 to
> branch:7 but still it changes somewhat.

A revision number only has meaning in the context of a branch.  If
I mirror a branch, the revision numbers in the context of each will
refer to the same revision IDs.

I am not sure what sort of distinction you are trying to draw.


> >> The naming in git really is beautiful and beautifully simple.
> >
> > I don't think anyone is saying that universally unique names are bad.
> > But I also don't see a problem with using shorter names that only have
> > meaning in a local scope.
> >
> > I've noticed some people using abbreviated SHA1 sums with GIT.  Isn't
> > that also a case of trading potential global uniqueness for
> > convenience when working in a local scope?
>
> Emphasis on _potential_. SHA1 id abbreviated to 6 characters might
> be not unique in larger project, but for example the chance that
> SHA1 id abbreviated to 7 or 8 characters is not unique is really low.

My point was that by shortening the IDs with GIT, you are trading
global uniqueness (i.e. the identifier may clash with one found in a
different context) for the convenience of shorter identifiers.

Provided you know that the tradeoff is being made, it isn't generally
much of a problem.  I agree that the ability to pick how much of a
tradeoff is made by altering the length of the identifier is a nice
property of GIT.


> Yet another analogy:
>
> SHA1 identifiers of commits (and not only commits) can be compared
> to Message-Ids of Usenet messages, while revision numbers can be compared
> to Xref number of Usenet message which if I understand correctly is unique
> only for given news server. But Message-Ids cannot be shortened
> meaningfully like SHA1 ids can; nevertheless they are used in communication
> without any problems. Even if namespace is not simple ;-)

I can't say I ever used usenet much, so can't comment too much.  But
from your description, a (server, xref) tuple could be used to look up
the unique identifier in a similar way to how you can do so in Bazaar
with a (branch_url, revno) tuple.
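
A minimal sketch of that lookup, assuming each branch is just an ordered
list of mainline revision IDs as described earlier in this mail. The
BRANCHES table, the mirror URL, the "rev-..." IDs and the resolve helper
are all invented for illustration and are not Bazaar's API.

```python
# Hypothetical branches: the mirror carries the same mainline as the
# original, while "other" diverged after the first revision.
BRANCHES = {
    "http://example.com/bzr/feature": ["rev-aaa", "rev-bbb", "rev-ccc"],
    "http://mirror.example.org/feature": ["rev-aaa", "rev-bbb", "rev-ccc"],
    "http://example.com/bzr/other": ["rev-aaa", "rev-ddd"],
}


def resolve(branch_url: str, revno: int) -> str:
    """Map a (branch_url, revno) pair to a globally unique revision ID.

    Revision numbers are 1-based positions on the branch's mainline.
    """
    mainline = BRANCHES[branch_url]
    if not 1 <= revno <= len(mainline):
        raise LookupError(f"{branch_url} has no revision {revno}")
    return mainline[revno - 1]


if __name__ == "__main__":
    print(resolve("http://example.com/bzr/feature", 2))
    print(resolve("http://mirror.example.org/feature", 2))
    print(resolve("http://example.com/bzr/other", 2))
```

The same revno means the same revision on a branch and its mirror, but
may mean something different on another branch, which is the sense in
which the number is only meaningful together with the branch.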

James.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 15:30                                           ` Aaron Bentley
  2006-10-20  3:14                                             ` Tim Webster
@ 2006-10-20 10:44                                             ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:44 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Aaron Bentley wrote:

>> It would be nice if the SCM tools included rss feeds for communicating zip
>> patch bundles.
> 
> The bzr "webserve" plugin provides rss feeds.

Git's "gitweb" web interface (in the git.git repo for some time now)
provides OPML and RSS feeds.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:02                                               ` Matthieu Moy
@ 2006-10-20 10:45                                                 ` Andy Whitcroft
  0 siblings, 0 replies; 1752+ messages in thread
From: Andy Whitcroft @ 2006-10-20 10:45 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Andreas Ericsson, Linus Torvalds,
	Carl Worth, bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Huh? If you want what changes have been introduced by commit 
>> c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
>>
>> # git diff c3424aebbf722c1f204931bf1c843e8a103ee143
> 
> How does git choose which ancestor to use if this revision has more
> than one in this case?

Well, if there is more than one parent, then there is more than one
diff.  For instance, this is a merge commit which I asked to 'see'.

This gets shown in the combined diff format, showing the results of the
conflict resolution.

diff --cc this
index fbbafbf,10c8337..43b7af0
--- a/this
+++ b/this
@@@ -1,3 -1,3 +1,4 @@@
  1
+ 2a
 +2b
  3

If you want to know each individual diff in a more 'standard' form you
can ask about the parents specifically.

apw@pinky$ git diff HEAD^1..
diff --git a/this b/this
index fbbafbf..43b7af0 100644
--- a/this
+++ b/this
@@ -1,3 +1,4 @@
 1
+2a
 2b
 3

apw@pinky$ git diff HEAD^2..
diff --git a/bar b/bar
new file mode 100644
index 0000000..8dc5f23
--- /dev/null
+++ b/bar
@@ -0,0 +1 @@
+this that other
diff --git a/this b/this
index 10c8337..43b7af0 100644
--- a/this
+++ b/this
@@ -1,3 +1,4 @@
 1
 2a
+2b
 3

^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  9:57                                             ` Jakub Narebski
  2006-10-20 10:02                                               ` Matthieu Moy
@ 2006-10-20 10:45                                               ` James Henstridge
  2006-10-20 12:01                                                 ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-20 10:45 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> > What's nice is being able see the revno 753 and knowing that "diff -r
> > 752..753" will show the changes it introduced. Checking the revno on a
> > branch mirror and knowing how out-of-date it is.
>
> Huh? If you want what changes have been introduced by commit
> c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
>
> # git diff c3424aebbf722c1f204931bf1c843e8a103ee143
>
> (or better "git show" instead of "git diff" or "git diff-tree").
> If you give only one commit (only one revision) git automatically
> gives diff to its parent(s).

If a revision has multiple parents, what does it diff against in this
case?  Do you get one diff against each parent revision?

James.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 19:16                                           ` Junio C Hamano
@ 2006-10-20 10:51                                             ` Jakub Narebski
  2006-10-20 15:58                                               ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:51 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
> 
>> The other big difference is being able to do merges in seconds. The 
>> biggest cost of doing a big merge these days seems to literally be 
>> generating the diffstat of the changes at the end (which is purely a UI 
>> issue, but one that I find so important that I'll happily take the extra 
>> few seconds for that, even if it sometimes effectively doubles the 
>> overhead).
> 
> An interesting effect on this is when people have a column for
> merge performance in a SCM comparison table, they would include
> time to run the diffstat as part of the time spent for merging
> when they fill in the number for git, but not for any other SCM.

So if you want to compare merge performance with other SCMs, you should
either add the time to run diffstat for the other SCMs, or subtract the
time to run "git diff-tree --stat".

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 23:01                                       ` Aaron Bentley
  2006-10-19 23:42                                         ` Carl Worth
@ 2006-10-20 10:53                                         ` Jakub Narebski
  2006-10-20 12:34                                           ` Matthieu Moy
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:53 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Aaron Bentley wrote:

>>>          And I personally have been developing a bugtracker that is
>>> distributed in the same way bzr is; it stores bug data in the source
>>> tree of a project, so that bug activities follow branches around.
>>
>> That kind of thing sounds very useful. As I've been talking about
>> "numbers" here in bug trackers and mailing lists, it should be obvious
>> that I consider the information stored in such systems an important
>> part of the history of a code project. So it would be nice if all of
>> that history were stored in an equally reliable system in some way.
> 
> If you're interested, it's called "Bugs Everywhere" and it's available here:
> http://panoramicfeedback.com/opensource/
> 
> New VCS backends are welcome :-D

While an SCM can (and usually should) be distributed, I think that a
bugtracker has to be centralized.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  5:05                                             ` Linus Torvalds
  2006-10-20  9:57                                             ` Jakub Narebski
@ 2006-10-20 11:00                                             ` Jakub Narebski
  2006-10-20 14:12                                             ` Jeff King
  2006-10-20 21:48                                             ` Carl Worth
  4 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:00 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:

> Bazaar encourages you to stick lots and lots of branches in your
> repository.  They don't even have to be related.  For example, my repo
> contains branches of bzr, bzrtools, Meld, and BazaarInspect.

GIT encourages you to use separate repositories for unrelated projects,
and the alternates mechanism for related projects (like the different
Linux kernel repositories: Linus, stable, etc.).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20  8:38                                                 ` Johannes Schindelin
  2006-10-20 10:13                                                   ` Petr Baudis
@ 2006-10-20 11:09                                                   ` Jakub Narebski
  2006-10-20 11:37                                                     ` Johannes Schindelin
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:09 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
> 
>> How does git disambiguate SHA1 hash collisions?
> 
> It does not. You can fully expect the universe to go down before that 
> happens.
 
Or you can compile git with COLLISION_CHECK

From Makefile:
# Define COLLISION_CHECK below if you believe that SHA1's
# 1461501637330902918203684832716283019655932542976 hashes do not give you
# sufficient guarantee that no collisions between objects will ever happen.
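
The long number in that comment is simply the count of distinct 160-bit
values, i.e. 2^160. A quick check, assuming nothing beyond Python's
standard hashlib (the variable names are mine):

```python
import hashlib

# The number quoted in the Makefile comment above.
MAKEFILE_NUMBER = 1461501637330902918203684832716283019655932542976

# SHA-1 produces a 20-byte digest, i.e. 160 bits.
digest_bits = hashlib.sha1().digest_size * 8
distinct_hashes = 2 ** digest_bits

print(digest_bits)
print(distinct_hashes == MAKEFILE_NUMBER)
```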

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 16:38                                               ` Matthieu Moy
@ 2006-10-20 11:24                                                 ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:24 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Matthieu Moy wrote:

> Then, one other difference is in the UI. bzr shows you commits in a
> kind of hierarchical manner, like (fictive example, that's not the real
> exact format).
> 
> $ bzr log
> commiter: upstream@maintainer.com
> message:
>   merged the work on a feature
>   ------
>   commiter: contributor@site.com
>   message:
>     prepared for feature X
>   ------
>   commiter: contributor@site.com
>   message:
>     implemented feature X
>   ------
>   commiter: contributor@site.com
>   message:
>     added testcase for feature X
> ------
> commiter: upstream@maintainer.com
> message:
>   something else
> 
> No big difference in the model either, but it probably reveals a
> different vision of what "history" means.

In GIT we have the git-show-branch command for that (although it
has a rather strange UI, and shows only the title of each commit); we
can do "git log | git name-rev --stdin", or better, use graphical
history viewers like gitk (Tcl/Tk) or qgit (Qt). Graphical history
viewers are a must with a more complicated history.

Bazaar-NG has bzr-gtk.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 11:09                                                   ` Jakub Narebski
@ 2006-10-20 11:37                                                     ` Johannes Schindelin
  2006-10-20 12:03                                                       ` Jakub Narebski
  2006-10-20 17:23                                                       ` David Lang
  0 siblings, 2 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-20 11:37 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Johannes Schindelin wrote:
> 
> > On Fri, 20 Oct 2006, Lachlan Patrick wrote:
> > 
> >> How does git disambiguate SHA1 hash collisions?
> > 
> > It does not. You can fully expect the universe to go down before that 
> > happens.
>  
> Or you can compile git with COLLISION_CHECK
> 
> From Makefile:
> # Define COLLISION_CHECK below if you believe that SHA1's
> # 1461501637330902918203684832716283019655932542976 hashes do not give you
> # sufficient guarantee that no collisions between objects will ever happen.

You can document your disbelief.

But it does not change a thing. Since v0.99~653, we do not have any 
collision check, even if compiled with COLLISION_CHECK.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:30                                             ` Charles Duffy
@ 2006-10-20 11:38                                               ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:38 UTC (permalink / raw)
  To: git

Charles Duffy wrote:

> Johannes Schindelin wrote:
>>> Shell scripts allow for a fragile system because they could include C
>>> code snippets which they then compile and LD_PRELOAD.
>>>     
>>
>> Well, I do not expect people to misbehave. You do not compile a nasty 
>> C-program from a shell script _by mistake_.
> 
> You also don't replace bzrlib functionality (in your terms, plumbing) in 
> a plugin by mistake.

Perhaps the reason for not having plugins in GIT (besides the fact that
it follows OSS + Unix guidelines) is that git is not libified yet. It
is "scriptified", i.e. it has many helper programs and options for
pipelining that make it really easy to use in scripts (Cogito, pg, StGit),
but the libification effort is [only] ongoing.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19 12:33                                         ` Petr Baudis
  2006-10-19 13:44                                           ` Matthieu Moy
@ 2006-10-20 11:50                                           ` Jakub Narebski
  2006-10-20 13:26                                             ` Jakub Narebski
  2006-10-20 23:19                                             ` Junio C Hamano
  1 sibling, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:50 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Andreas Ericsson, Matthew D. Fuller, bazaar-ng,
	Linus Torvalds, Carl Worth, git

I have lost somewhere among many emails in this thread the email I 
wanted to reply to, the one mentioning for the first time the lack of 
parents ordering in GIT, but this one should do.


Petr Baudis wrote:

> The lack of parents ordering in Git is directly connected with
> fast-forwarding.

There are exactly _two_ places where Git treats first parent specially 
(correct me if I'm wrong).

First, <commit-ish>^ is a shortcut for <commit-ish>^1, i.e. for the first
parent of the commit. <commit-ish>~<n> is a shortcut for <commit-ish>^^...^
(n times '^'), which means that <commit-ish>~<n> is the n-th ancestor in the
first-parent lineage of <commit-ish>. But you can always use names
like, for example, next~12^2^^2~2.
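
To make the '^' and '~' notation concrete, here is a minimal Python
sketch that resolves such names against a toy history. The commit names
and the PARENTS table are hypothetical; real git resolves these names
against the object database, not a dictionary.

```python
import re

# Hypothetical toy history: commit name -> ordered list of parent names.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "M": ["B", "C"],  # merge commit: first parent B, second parent C
    "D": ["M"],
}


def resolve(expr, parents=PARENTS):
    """Resolve names like 'D~1^2' by walking the parent lists."""
    base = re.match(r"[^~^]+", expr)
    commit = base.group(0)
    for op, num in re.findall(r"([~^])(\d*)", expr[base.end():]):
        n = int(num) if num else 1
        if op == "^":
            if n > 0:  # rev^0 names the commit itself
                commit = parents[commit][n - 1]  # n-th parent
        else:
            for _ in range(n):  # rev~n: follow first parents n times
                commit = parents[commit][0]
    return commit


if __name__ == "__main__":
    print(resolve("D^"), resolve("D~2"), resolve("D^^2"))
```

Here D^ and D~1 name the same commit, while D^^2 steps to the merge and
then takes its second parent.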

Second, git-diff with only one <commit-ish> generates a diff to the first
parent. But you can always use the '-c' or '--cc' combined diff format,
or '-m' with the default diff format, to compare to _all_ parents.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:45                                               ` James Henstridge
@ 2006-10-20 12:01                                                 ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 12:01 UTC (permalink / raw)
  To: James Henstridge
  Cc: Aaron Bentley, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

James Henstridge wrote:
> On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> > > What's nice is being able see the revno 753 and knowing that "diff -r
> > > 752..753" will show the changes it introduced. Checking the revno on a
> > > branch mirror and knowing how out-of-date it is.
> >
> > Huh? If you want what changes have been introduced by commit
> > c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
> >
> > # git diff c3424aebbf722c1f204931bf1c843e8a103ee143
> >
> > (or better "git show" instead of "git diff" or "git diff-tree").
> > If you give only one commit (only one revision) git automatically
> > gives diff to its parent(s).
> 
> If a revision has multiple parents, what does it diff against in this
> case?  Do you get one diff against each parent revision?

If a revision has multiple parents (i.e. it is a merge commit), git-diff
(which is used by git-show) does not show differences (unless you
give two revisions, in the git-diff case).

You can either use '-m' option to show differences from all its
parents, or '-c'/'--cc' to show combined diff ('--cc' shows more
compact diff).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 11:37                                                     ` Johannes Schindelin
@ 2006-10-20 12:03                                                       ` Jakub Narebski
  2006-10-20 12:48                                                         ` Johannes Schindelin
  2006-10-20 17:23                                                       ` David Lang
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 12:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Johannes Schindelin wrote:
>> 
>>> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
>>> 
>>>> How does git disambiguate SHA1 hash collisions?
>>> 
>>> It does not. You can fully expect the universe to go down before that 
>>> happens.
>>  
>> Or you can compile git with COLLISION_CHECK
>> 
>> From Makefile:
>> # Define COLLISION_CHECK below if you believe that SHA1's
>> # 1461501637330902918203684832716283019655932542976 hashes do not give you
>> # sufficient guarantee that no collisions between objects will ever happen.
> 
> You can document your disbelief.
> 
> But it does not change a thing. Since v0.99~653, we do not have any 
> collision check, even if compiled with COLLISION_CHECK.

So why is it left in the Makefile? Does defining it change anything
or not (in which case this section should be removed)?

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-18 23:53 [ANNOUNCE] GIT 1.4.3 Junio C Hamano
@ 2006-10-20 12:31 ` Horst H. von Brand
  2006-10-20 13:26 ` Peter Eriksen
  2006-10-20 23:35 ` Junio C Hamano
  2 siblings, 0 replies; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-20 12:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, linux-kernel

Junio C Hamano <junkio@cox.net> wrote:
> The latest feature release GIT 1.4.3 is available at the usual
> places:

[...]

>  rename builtin-cat-file.c => builtin-cat-file.c (0%)

Huh?!
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:53                                         ` Jakub Narebski
@ 2006-10-20 12:34                                           ` Matthieu Moy
  2006-10-20 13:20                                             ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-20 12:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>> If you're interested, it's called "Bugs Everywhere" and it's available here:
>> http://panoramicfeedback.com/opensource/
>> 
>> New VCS backends are welcome :-D
>
> While SCM can (and should be usually) distributed, I think that bugtracker
> has to be centralized.

Well, indeed, I think bug _reporting_ should be somehow centralized,
while bug _fixing_ can be decentralized: You fix a bug, you mark it as
fixed, and then the main branch gets the information that the bug is
fixed when the bugfix is merged.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 12:03                                                       ` Jakub Narebski
@ 2006-10-20 12:48                                                         ` Johannes Schindelin
  0 siblings, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-20 12:48 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> > But it does not change a thing. Since v0.99~653, we do not have any 
> > collision check, even if compiled with COLLISION_CHECK.
> 
> So why is it left in the Makefile? Does defining it change anything
> or not (in which case this section should be removed)?

It does not. The relevant parts in the code read like this:

sha1_file.c:1442
                /* FIXME!!! Collision check here ? */

sha1_file.c:1541
                /*
                 * FIXME!!! We might do collision checking here, but we'd
                 * need to uncompress the old file and check it. Later.
                 */

It was hoped that the people who actually care would implement that 
functionality. (Note that in an earlier version, the check was 
implemented, but would have to be different these days: pack files did not 
exist then).
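
As an illustration of what such a check would amount to, here is a
minimal Python sketch of a toy object store that compares contents when
an object with the same name already exists. This is not git's code:
the class ToyObjectStore and its object_id parameter are hypothetical,
and a real check would also have to uncompress the old object and look
inside pack files, as noted above.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """SHA1 over a 'blob <size>' header, a NUL byte, and the content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class ToyObjectStore:
    """In-memory stand-in for an object database (illustration only)."""

    def __init__(self):
        self._objects = {}

    def write(self, data: bytes, object_id=None) -> str:
        # object_id can be forced so that a collision can be simulated.
        oid = object_id or blob_id(data)
        old = self._objects.get(oid)
        if old is not None and old != data:
            raise ValueError("hash collision on object " + oid)
        self._objects[oid] = data
        return oid


if __name__ == "__main__":
    store = ToyObjectStore()
    print(store.write(b"hello\n"))
```

Writing the same content twice is harmless; only different content
under an existing name is reported.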

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:42                                               ` James Henstridge
@ 2006-10-20 13:17                                                 ` Jakub Narebski
  2006-10-20 13:36                                                   ` Petr Baudis
  2006-10-20 14:59                                                   ` James Henstridge
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 13:17 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

James Henstridge wrote:
> On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> James Henstridge wrote:
>>> On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
>>>> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:

>>> With this sort of setup, I would publish my branches in a directory
>>> tree like this:
>>>
>>>     /repo
>>>         /branch1
>>>         /branch2
>>>
>>> I make "/repo" a Bazaar repository so that it stores the revision data
>>> for all branches contained in the directory (the tree contents,
>>> revision meta data, etc).
>>
>> And here we have a feature which is as far as I see unique to git,
>> namely to have persistent branches with _separate namespace_. It means
>> that we can have hierarchical branch names (including names like
>> "remotes/<remotename>/<branch of remote>", or "jc/diff"), and we don't
>> have to guess where repository name ends and branch name begins.
> 
> With the above layout, I would just type:
>     bzr branch http://server/repo/branch1

With Cogito (you can think of it either as alternate Git UI, or as SCM
built on top of Git) you would use

   $ cg clone http://server/repo#branch

for example

   $ cg clone git://git.kernel.org/pub/scm/git/git.git#next

to clone _single_ branch (in bzr terminology, "heavy checkout" of branch).
But you can also clone _whole_ repository, _all_ published branches with

   $ cg clone git://git.kernel.org/pub/scm/git/git.git

With core Git it is the same, but we don't have the above shortcut
for checking only one branch; branches to checkout are in separate
arguments to git-clone.

In bzr it seems that you cannot distinguish (at least not only
from URL) where repository ends and branch begins.

*Sidenote:* In current version of gitweb you can get file
in given repository in given branch using the following
notation:

   http://path/to/gitweb.cgi/repo/sitory/branch/name:file/name

gitweb can detect where the repository name ends and the branch name
begins; usually (by convention) "bare" git repositories use the
<project>.git name, while "clothed" git repositories use
<project>/.git
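
One way such detection can work is to try successively shorter path
prefixes until one names a known repository. The following Python
sketch only illustrates that idea under this assumption; it is not
gitweb's actual code, and split_path_info and known_repos are
hypothetical names.

```python
def split_path_info(path_info, known_repos):
    """Split 'repo/sitory/branch/name:file/name' into its three parts.

    Tries the longest candidate repository name first, so that a branch
    name containing '/' is not mistaken for part of the repository.
    """
    path, _, file_name = path_info.strip("/").partition(":")
    parts = path.split("/")
    for i in range(len(parts), 0, -1):
        repo = "/".join(parts[:i])
        if repo in known_repos:
            branch = "/".join(parts[i:])
            return repo, branch or None, file_name or None
    return None  # no known repository matches this path


if __name__ == "__main__":
    repos = {"pub/scm/git/git.git"}
    print(split_path_info("/pub/scm/git/git.git/jc/diff:diff.c", repos))
```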


See also below.

> This command behaves identically whether the repository data is in
> /repo or in /repo/branch1.  Someone pulling from the branch doesn't
> have to care what the repository structure is.  Having a separate
> namespace for branch names only really makes sense if the user needs
> to care about it.
> 
> As for hierarchical names, there is nothing stopping you from using
> deeper directory structures with Bazaar too.  Bazaar just checks each
> successive parent directory until it finds a repository for the branch.
> 
>> The idea of "branches (and tags) as directories" was if I understand
>> it correctly introduced by Subversion, and from what can be seen from
>> troubles with git-svn (stemming from the fact that division between
>> project name and branch name is the matter of _convention_) at least
>> slightly brain-damaged.
> 
> I think you are a bit confused about how Bazaar works here.  A Bazaar
> repository is a store of trees and revision metadata.  A Bazaar branch
> is just a pointer to a head revision in the repository.  As you can
> probably guess, the data for the branch is a lot smaller than the data
> for the repository.
> 
> You can store the repository and branch in the same directory to get a
> standalone branch.  The layout I described above has a repository in a
> parent directory, shared by multiple branches.
> 
> If you are comparing Subversion and Bazaar, a Bazaar branch shares
> more properties with a full Subversion repository rather than a
> Subversion branch.

Oh, that explains yet another difference between Bazaar-NG (and other
SCMs which use a similar model) and Git.

In Git a branch is just a pointer to the head (top) commit of a given line
of development (hence branches are stored under .git/refs/heads/). Git also
stores information (in .git/HEAD) about which branch we are currently on,
i.e. the branch on which git puts new commits. Nothing more (well, there
can be a log of changes to the head in .git/logs/refs/heads/, but that is
optional and purely local information). In Bazaar-NG you have to store (if I
understand it correctly) a mapping from revnos to revisions.
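
The "branch is just a pointer" idea can be sketched in a few lines of
Python. This is a toy that assumes HEAD is a plain file containing
"ref: refs/heads/<name>" and that each branch is a loose file holding a
commit id; it ignores symlinked HEADs and any other storage form. The
helper names make_toy_git_dir and current_branch are made up.

```python
import os
import tempfile


def make_toy_git_dir(branches, current):
    """Create a throwaway .git-like directory with flat branch names."""
    git_dir = os.path.join(tempfile.mkdtemp(), ".git")
    heads = os.path.join(git_dir, "refs", "heads")
    os.makedirs(heads)
    for name, commit_id in branches.items():
        with open(os.path.join(heads, name), "w") as f:
            f.write(commit_id + "\n")
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/%s\n" % current)
    return git_dir


def current_branch(git_dir):
    """Return (ref_name, commit_id) for the branch that HEAD names."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return None, head  # HEAD holds a commit id directly
    ref = head[len("ref: "):]
    with open(os.path.join(git_dir, *ref.split("/"))) as f:
        return ref, f.read().strip()


if __name__ == "__main__":
    d = make_toy_git_dir({"master": "a" * 40, "next": "b" * 40}, "next")
    print(current_branch(d))
```

Switching branches in this model means rewriting one line in HEAD;
committing means rewriting one line in the file for the current branch.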
 
By default (for example, the default behavior of git-clone when the --bare
option is not used) the git repository is _embedded_ in the working area. We have

   .git/
   .git/HEAD
   ...
   .git/refs/heads/
   ...
   <working area files, e.g.>

So repo/branch wouldn't work, because 'branch' would conflict with working
area files. GIT doesn't follow the CVS model of separate storage area
(CVSROOT) and having only pointer to said area (files in CVS/ 
subdirectories) in working directory.

In GIT, to work on some repository you don't (as, from what I understand,
you do in Bazaar-NG) "checkout" some branch (which would automatically copy
some data in the case of a "heavy checkout", or just save a pointer to the
repository in the "lightweight checkout" case). You clone the whole
repository; well, you can select which branches to clone. "Checkout" in GIT
terminology means to populate the working area with a given version (and,
usually, to change which branch is current in the repository).

What does a checked-out working area look like in Bazaar-NG?

[...]
>>> For similar reasons, the cost of publishing 20 related Bazaar branches
>>> on my web server is generally not 20 times the cost of publishing a
>>> single branch.
>>>
>>> I understand that you get similar benefits by a GIT repository with
>>> multiple head revisions.
>>
>> You can get similar benefits by a GIT repository with shared object
>> database using alternates mechanism. And that is usually preferred
>> over storing unrelated branches, i.e. branches pointing to disconnected
>> DAG (separate trees in BK terminology) of revision, if that you mean by
>> multiple head revisions (because in GIT there is no notion of "mainline"
>> branch, only of current (HEAD) branch).
> 
> I may have got the git terminology wrong. I was trying to draw
> parallels between the .git/refs/... files in a git repository and the
> way multiple branches can be stored in a Bazaar repository.

Yes, but using Git that way has serious disadvantages. For example
there is only one current branch pointer and only one index (dircache)
per git repository.

> I am not claiming that you'll get bandwidth or disk space benefits for
> storing unrelated branches in a single Bazaar repository.  But if the
> branches are related, then there will be space savings (which is what
> the great-grandparent post was asking about).

So it is way better to use one repository per project, and use alternates
mechanism to save space.

But I agree that saving "old fork" info as a separate branch doesn't lead
to as much inefficiency as might be thought.

But after saving the "old fork" as a branch, revno-based revision identifiers
change from http://old.host/old/repo:127 to http://host/repo/old.fork:127.
That may be a minimal change, but it is a change!


P.S. In two separate git repositories, even if they exchange information
with each other, the branch names can be different.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 12:34                                           ` Matthieu Moy
@ 2006-10-20 13:20                                             ` Jakub Narebski
  2006-10-20 13:47                                               ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 13:20 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> >> If you're interested, it's called "Bugs Everywhere" and it's available here:
> >> http://panoramicfeedback.com/opensource/
> >> 
> >> New VCS backends are welcome :-D
> >
> > While SCM can (and should be usually) distributed, I think that bugtracker
> > has to be centralized.
> 
> Well, indeed, I think bug _reporting_ should be somehow centralized,
> while bug _fixing_ can be decentralized: You fix a bug, you mark it as
> fixed, and then the main branch gets the information that the bug is
> fixed when the bugfix is merged.

But you don't need much infrastructure for bug fixing. Fix it in the
repository, and write the bug number (you have to have a centralized
bugtracker for numbers) or bug identifier in the commit message. You write
(or a post-commit hook writes) in the bugtracker that the bug was fixed in
commit <commit-id>. You tell mainline to pull from you. That's all.
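
As a rough illustration of what such a post-commit hook could do, here
is a minimal Python sketch that pulls bug numbers out of a commit
message and formats the notes for the bugtracker. The "Fixes #N"
convention, the function names, and the note format are all
assumptions; nothing like this ships with git, and the step that
actually talks to a bugtracker is left out.

```python
import re

# Assumed convention: "fixes #12", "Fixed bug #12", "closes #12", ...
BUG_REF = re.compile(
    r"\b(?:fix(?:es|ed)?|close[sd]?)\s+(?:bug\s+)?#(\d+)", re.IGNORECASE
)


def bugs_fixed(commit_message):
    """Return the sorted, de-duplicated bug numbers a message refers to."""
    return sorted({int(n) for n in BUG_REF.findall(commit_message)})


def tracker_notes(commit_id, commit_message):
    """Build one note per bug, ready to be sent to the bugtracker."""
    return [
        "bug #%d: fixed in commit %s" % (n, commit_id)
        for n in bugs_fixed(commit_message)
    ]


if __name__ == "__main__":
    msg = "Fix off-by-one in pager\n\nFixes #42, closes bug #7\n"
    for note in tracker_notes("c3424aeb", msg):
        print(note)
```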

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
                                                   ` (2 preceding siblings ...)
  2006-10-19  7:02                                 ` Erik Bågfors
@ 2006-10-20 13:22                                 ` Horst H. von Brand
  2006-10-20 13:46                                   ` Christian MICHON
  3 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-20 13:22 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Linus Torvalds wrote:

[...]

> > The "main trunk matters" mentality (which has deep roots in CVS - don't 
> > get me wrong, I don't think you're the first one to do this) is 
> > fundamentally antithetical to truly distributed system, because it 
> > basically assumes that some maintainer is "more important" than others. 

> Linus, if you got hit by a bus, it would still be a shock, and it would
> still take time for the Linux world to recover.  Your insights and
> talent, both technical and social, make you the most important kernel
> developer.  And it stays that way because you deserve it.  Projects with
> good leadership don't fork, or if they do, the fork withers and dies
> pretty quickly.

So? It makes no sense to me to cater only to "successful projects"... most
projects /aren't/ successful ;-)

> It is fine to say all branches are equal from a technical perspective.
> From a social perspective, it's just not true.

Yes, but what matters here is the principle... if branches aren't equal, it
makes some things unnecessarily hard (i.e., forking, passing maintainership
over, ...). Sure, they aren't activities that should be actively
encouraged, but they shouldn't be made harder than necessary either.

> The scale of Bazaar development is much smaller than the scale of kernel
> development, so it doesn't make sense to maintain long-term divergent
> branches like the mm tree.  We do occasionally have long-lived feature
> branches, though.

Are you saying Bazaar is aimed at small(ish) projects (only)?

> > That special maintainer is the maintainer whose merge-trunk is followed, 
> > and whose revision numbers don't change when they are merged back.

> In bzr development, it's very rare for anyone's revision numbers to
> change.

"Very rare" != "never". The "very rare" cases /will/ come back to bite you,
once you grow accustomed to "hasn't ever happened"

[...]

> > I'll just point out that one of my design goals for git was to make every 
> > single repository 100% equal. That means that there MUST NOT be a "trunk", 
> > or a special line of development. There is no "vendor branch".

> I think you're implying that on a technical level, bzr doesn't support
> this.  But it does.  Every published repository

What makes a "published repository" special, as opposed to my local
playground?

>                                                 has unique identifiers
> for every revision on its mainline,

Are they different among repositories, even though they came from another
of the set?

>                                     and it's exceedingly uncommon for
> these to change.

See above.

>                   There are special procedures to maintain bzr.dev, but
> there's nothing technically unique about it.  People develop against
> bzr.dev rather than my integration branch, because they have
> non-technical reasons for wanting their changes to be merged into
> bzr.dev, not my integration branch.

OK.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-18 23:53 [ANNOUNCE] GIT 1.4.3 Junio C Hamano
  2006-10-20 12:31 ` Horst H. von Brand
@ 2006-10-20 13:26 ` Peter Eriksen
  2006-10-20 23:35 ` Junio C Hamano
  2 siblings, 0 replies; 1752+ messages in thread
From: Peter Eriksen @ 2006-10-20 13:26 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

Junio C Hamano <junkio@cox.net> writes:

> The latest feature release GIT 1.4.3 is available at the usual
> places:
...
> ----------------------------------------------------------------
> 
>  .gitignore                                         |   10 +-
>  Documentation/Makefile                             |    4 +-
>  Documentation/asciidoc.conf                        |    1 +
>  Documentation/config.txt                           |   34 +
...
>  rename contrib/gitview/{gitview.txt => gitview.txt} (74%)

How does it come to the result, that this is a rename?

Peter

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 11:50                                           ` Jakub Narebski
@ 2006-10-20 13:26                                             ` Jakub Narebski
  2006-10-20 23:19                                             ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 13:26 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Andreas Ericsson, Matthew D. Fuller, bazaar-ng,
	Linus Torvalds, Carl Worth, git

Jakub Narebski wrote:
> Second, git-diff with only one <commit-ish> generates diff to first
> parent. But you can always use '-c' or '--cc' combined diff format
> or '-m' with default diff format to compare to _all_ parents.

I stand corrected: git-diff refuses to show anything if provided
with only one commit and that commit has more than one parent. So it
does not treat the first parent specially.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 13:17                                                 ` Jakub Narebski
@ 2006-10-20 13:36                                                   ` Petr Baudis
  2006-10-20 14:12                                                     ` Jakub Narebski
  2006-10-20 14:59                                                   ` James Henstridge
  1 sibling, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 13:36 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: James Henstridge, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Dear diary, on Fri, Oct 20, 2006 at 03:17:26PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> But you can also clone _whole_ repository, _all_ published branches with
> 
>    $ cg clone git://git.kernel.org/pub/scm/git/git.git

Nope, cg clone will in this case clone the master branch (or whatever
the remote HEAD points at). cg clone -a is planned but not implemented
yet. Very soon now, hopefully. :-)

> In GIT to work on some repository you don't (like from what I understand
> in Bazaar-NG) "checkout" some branch (which would automatically copy some
> data in case of "heavy checkout" or just save some pointer to repository
> in "lightweight checkout" case). You clone whole repository; well you can
> select which branches to clone. "Checkout" in GIT terminology means to
> populate working area with given version (and change in repository which
> branch is current, usually).

You don't need to, you can switch your working tree between various
branches.  I think Linus said he does that (or was it Junio?), and I do that
as well, as well as many others.

A good question would be "when to create another branch and when to
clone the repository". And I don't think there's any good answer, except
"when you are comfortable with it". :-) Both approaches have pros/cons.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:40                                       ` Jakub Narebski
@ 2006-10-20 13:36                                         ` Shawn Pearce
  2006-10-21 12:30                                         ` Matthew D. Fuller
  1 sibling, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-20 13:36 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> wrote:
> from discussion on #revctrl list on FreeNode), and in addition
> to existing GitSvnComparison page on GitWiki).

Oh, you mean that document that I orphaned when I got sidetracked
and forgot I hadn't quite finished it?  :-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 13:22                                 ` Horst H. von Brand
@ 2006-10-20 13:46                                   ` Christian MICHON
  2006-10-20 15:05                                     ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Christian MICHON @ 2006-10-20 13:46 UTC (permalink / raw)
  To: bazaar-ng, git

On 10/20/06, Horst H. von Brand <vonbrand@inf.utfsm.cl> wrote:
> Are you saying Bazaar is aimed at small(ish) projects (only)?

funny. I actually read another post from Linus, and when I
"merge" with your post (understand: bisect), the following
comes out:

- git is the fastest scm around
- git has the smallest scm footprint
- git is also aimed at small(ish) projects

my personal proof of concept on the last point is that I'm a
IC design engineer who threw away other scm in favor of git
since git-1.4.2 and regret now the years wasted on _other_
scm. But your mileage may vary.

-- 
Christian

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 13:20                                             ` Jakub Narebski
@ 2006-10-20 13:47                                               ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 13:47 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Matthieu Moy, bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 03:20:42PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Matthieu Moy wrote:
> > Jakub Narebski <jnareb@gmail.com> writes:
> > 
> > >> If you're interested, it's called "Bugs Everywhere" and it's available here:
> > >> http://panoramicfeedback.com/opensource/
> > >> 
> > >> New VCS backends are welcome :-D
> > >
> > > While SCM can (and should be usually) distributed, I think that bugtracker
> > > has to be centralized.
> > 
> > Well, indeed, I think bug _reporting_ should be somehow centralized,
> > while bug _fixing_ can be decentralized: You fix a bug, you mark it as
> > fixed, and then the main branch gets the information that the bug is
> > fixed when the bugfix is merged.
> 
> But you don't need much infrastructure for branch fixing. Fix it in
> repository, and write bug number (you have to have centralized bugtracker
> for numbers) or bug identifier in commit message. You write (or post-commit
> hook writes) in bugtracker that bug was fixed in commit <commit-id>.
> You tell mainline to pull from you. That's all.

Yes, but no one has built the infrastructure yet. :-) Also, we need a way to
make it work smoothly, e.g. so that you don't have to download any
special stuff after cloning a branch - thus the post-commit hook needs
to be cloned too, but you also need to deal with the security
implications reasonably. (We would very much like to have "hooks
cloning" in Git in our in-SUSE usage as well; I didn't get to it yet.)

On a somewhat related note, I attended Microsoft's presentation at my
university about their Team Foundation Server. And Microsoft's clearly
aware that SourceSafe was a horrible crap and the version control in TFS
is much more advanced and even shows some signs of distributiveness (but
I don't know how much, the presenter did not know details about how it
works).

But their selling point really is the tight integration with bug
tracking and autobuild system. And it indeed does look pretty nice (when
you watch it, you might get quite a different perspective when actually
*using* it ;).

You can read my brief notes from the presentation at

	http://pasky.or.cz/~pasky/cp/tfs-lecture-notes.txt

It's a bit of bureaucracy for developers but managers will absolutely
*adore* it.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 10:40                                   ` Sean
@ 2006-10-20 14:03                                     ` Aaron Bentley
  2006-10-20 14:56                                       ` Jakub Narebski
       [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
  0 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 14:03 UTC (permalink / raw)
  To: Sean; +Cc: Alexander Belchenko, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sean wrote:

> Petr already mentioned that the data currently shown in the email
> text isn't really useful.

In Bazaar bundles, the text of the diff is an integral part of the data.
 It is used to generate the text of all the files in the revision.

Bazaar bundles were designed to be used on mailing lists.  So you can
review the changes from the diff, comment on them, and if it seems
suitable, merge them.

> Although that might just make the email bigger for not a lot of
> gain.

It's my understanding that most changes discussed on lkml are provided
as a series of patches.  Bazaar bundles are intended as a direct
replacement for patches in that use case.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFONck0F+nu1YWqI0RAgrHAJ0flmF1wCGYYUSk8f2iy8LuZnkaKQCdFSIo
JIaKi9S8TzUkhvaWpYYP5AA=
=MgZo
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-20 13:36                                                   ` Petr Baudis
@ 2006-10-20 14:12                                                     ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 14:12 UTC (permalink / raw)
  To: Petr Baudis
  Cc: James Henstridge, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Petr Baudis wrote:
> Dear diary, on Fri, Oct 20, 2006 at 03:17:26PM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...

>> But you can also clone _whole_ repository, _all_ published branches with
>> 
>>    $ cg clone git://git.kernel.org/pub/scm/git/git.git
> 
> Nope, cg clone will in this case clone the master branch (or whatever
> the remote HEAD points at). cg clone -a is planned but not implemented
> yet. Very soon now, hopefully. :-)

That's probably because Cogito still uses obsolete branches/


$ git clone git://git.kernel.org/pub/scm/git/git.git

clones _whole_ repository, all the branches and tags, and saves information
about the branches it cloned, and URL to repository in remotes/ file.
 
>> In GIT to work on some repository you don't (like from what I understand
>> in Bazaar-NG) "checkout" some branch (which would automatically copy some
>> data in case of "heavy checkout" or just save some pointer to repository
>> in "lightweight checkout" case). You clone whole repository; well you can
>> select which branches to clone. "Checkout" in GIT terminology means to
>> populate working area with given version (and change in repository which
>> branch is current, usually).
> 
> You don't need to, you can switch your working tree between various
> branches.  I think Linus said he does that (or was it Junio?), and I do that
> as well, as well as many others.

I should have said: bring working area to state given by some revision
(instead of "populate working area").

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
                                                               ` (2 preceding siblings ...)
  2006-10-20 11:00                                             ` Jakub Narebski
@ 2006-10-20 14:12                                             ` Jeff King
  2006-10-20 14:40                                               ` Jakub Narebski
  2006-10-21 17:57                                               ` Aaron Bentley
  2006-10-20 21:48                                             ` Carl Worth
  4 siblings, 2 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-20 14:12 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:

> What's nice is being able see the revno 753 and knowing that "diff -r
> 752..753" will show the changes it introduced.  Checking the revno on a
> branch mirror and knowing how out-of-date it is.

I was accustomed to doing such things in CVS, but I find the git way
much more pleasant, since I don't have to do any arithmetic:
  diff d8a60^..d8a60
(Yes, I am capable of performing subtraction in my head, but I find that
a "parent-of" operator matches my cognitive model better, especially
when you get into things like d8a60^2~3).

Does bzr have a similar shorthand for mentioning relative commits?

-Peff


* Re: VCS comparison table
  2006-10-19 17:14                                       ` J. Bruce Fields
@ 2006-10-20 14:31                                         ` Jeff King
  2006-10-20 15:33                                           ` J. Bruce Fields
  0 siblings, 1 reply; 1752+ messages in thread
From: Jeff King @ 2006-10-20 14:31 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git

On Thu, Oct 19, 2006 at 01:14:09PM -0400, J. Bruce Fields wrote:

> > > In the second place, one must consider the "nuclear launch codes"
> > > scenario.
> > Sure. And git does provide tools that can do this.
> 
> So in this case you can certainly lose the launch codes.  But you have
> forever granted everyone a way to determine whether a given guess at the
> launch codes is correct.  (Again, assuming some stuff about SHA1).

In what sense? Yes, you can make a guess if you have stored the SHA1
that contained the launch codes. But the point is that that particular
SHA1 is no longer part of the repository. Keeping that SHA1 is no easier
than just keeping the launch codes in the first place.
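
To make the "checking a guess" point concrete, here is a minimal sketch,
assuming the secret was stored as a single blob. The blob name is the SHA1
of a short header plus the content; the secret string and the helper names
are made up for illustration.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object name git gives a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def guess_matches(known_sha1: str, guess: bytes) -> bool:
    """Anyone who kept the SHA1 can test a guess without the object itself."""
    return git_blob_sha1(guess) == known_sha1


# Hypothetical secret; only its SHA1 is assumed to have survived.
kept_sha1 = git_blob_sha1(b"launch code: 0000\n")
print(guess_matches(kept_sha1, b"launch code: 1234\n"))
print(guess_matches(kept_sha1, b"launch code: 0000\n"))
```

The check only works for whoever still holds the 40-character name, which is
the case being argued about here.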

-Peff


* Re: VCS comparison table
  2006-10-20 14:12                                             ` Jeff King
@ 2006-10-20 14:40                                               ` Jakub Narebski
  2006-10-20 14:52                                                 ` Johannes Schindelin
  2006-10-21 17:57                                               ` Aaron Bentley
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 14:40 UTC (permalink / raw)
  To: Jeff King
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

Jeff King wrote:
> On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:
> 
>> What's nice is being able see the revno 753 and knowing that "diff -r
>> 752..753" will show the changes it introduced.  Checking the revno on a
>> branch mirror and knowing how out-of-date it is.
> 
> I was accustomed to doing such things in CVS, but I find the git way
> much more pleasant, since I don't have to do any arithmetic:
>   diff d8a60^..d8a60

By the way "diff d8a60" also works (unless d8a60 is merge commit, in
which case you would need "diff -c d8a60" or "diff -m d8a60").

> (Yes, I am capable of performing subtraction in my head, but I find that
> a "parent-of" operator matches my cognitive model better, especially
> when you get into things like d8a60^2~3).
> 
> Does bzr have a similar shorthand for mentioning relative commits?

By the way, git has the following extended SHA1 syntax for <commit-ish>
(documented in git-rev-parse(1)):
 * full SHA1 (40-chars hexadecimal string) or abbreviation unique for
   repository
 * symbolic ref name. E.g. 'master' typically means commit object referenced
   by $GIT_DIR/refs/heads/master; 'v1.4.1' means commit object referenced
   [indirectly] by $GIT_DIR/refs/tags/v1.4.1. You can say 'heads/master'
   and 'tags/master' if you have both head (branch) and tag named 'master',
   but don't do that. HEAD means current branch (and is usually default).
 * <ref>@{<date>} or <ref>@{<n>} to specify value of <ref> (usually branch)
   at given point of time, or n changes to ref back. Available only if you
   have reflog for given ref.
 * <commit-ish>^<n> means n-th parent of given revision. <commit-ish>^0
   means commit itself. <commit-ish>^ is a shortcut for <commit-ish>^1.
   <commit-ish>~<n> is shortcut for <commit-ish>^^..^ with n*'^', for
   example rev~3 is equivalent to rev^^^, which in turn is equivalent
   to rev^1^1^1
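
The ^ and ~ rules above can be sketched on a toy history. This is only an
illustration, not how git-rev-parse is implemented; the commit names and
the PARENTS table are invented, and abbreviations, ref names and reflog
syntax are left out.

```python
import re

# Hypothetical toy history: each commit maps to its ordered list of parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["A"],
    "Y": ["X"],
    "M": ["C", "Y"],  # merge commit: first parent C, second parent Y
}


def resolve(spec: str) -> str:
    """Resolve a spec such as 'M^2~1' against the toy PARENTS table."""
    match = re.match(r"[^~^]+", spec)
    if match is None or match.group(0) not in PARENTS:
        raise ValueError(f"unknown starting commit in {spec!r}")
    commit = match.group(0)
    # Apply each suffix operator from left to right.
    for op, digits in re.findall(r"([~^])(\d*)", spec[match.end():]):
        n = int(digits) if digits else 1  # a bare '^' or '~' means 1
        if op == "^":
            if n > 0:  # '^0' names the commit itself
                commit = PARENTS[commit][n - 1]
        else:
            for _ in range(n):  # '~n' follows the first parent n times
                commit = PARENTS[commit][0]
    return commit


print(resolve("M^"), resolve("M^2"), resolve("M~3"), resolve("M^2~1"))
```

With this table the last line prints "C Y A X": M^ is the first parent, M^2
the second, M~3 walks three first parents back, and M^2~1 is the parent of
the second parent.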

Additionally, it has the following undocumented extended SHA1 syntax to refer
to trees (directories) and blobs (file contents)
 * <revision>:<filename> gives SHA1 of tree or blob at given revision
 * :<stage>:<filename> (I think for blobs only) gives SHA1 for different
   versions of file during unresolved merge conflict.

I'm not enumerating here all the ways to specify part of DAG of history,
except that it includes "A ^B" meaning "all from A", "exclude all from B",
"B..A" meaning "^B A", "A...B" meaning "A B --not $(git merge-base A B)",
and of course "A -- path" meaning "all from A", "limit to changes in path".
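
As a rough sketch of these range forms, treating each one as a set
operation over the commits reachable from a revision. The toy history and
function names are assumptions made for the example; path limiting
("A -- path") is not modelled, and "A...B" is written as a symmetric
difference, which is what excluding the merge base amounts to.

```python
# Hypothetical toy history: each commit maps to its list of parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["A"],
    "Y": ["X"],
}


def reachable(commit):
    """All commits reachable from `commit`, including the commit itself."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def exclude(a, b):
    """'A ^B': everything reachable from A, minus everything from B."""
    return reachable(a) - reachable(b)


def two_dot(b, a):
    """'B..A' is shorthand for '^B A'."""
    return exclude(a, b)


def three_dot(a, b):
    """'A...B': reachable from either side, but not from both."""
    return reachable(a) ^ reachable(b)


print(sorted(two_dot("C", "Y")))
print(sorted(three_dot("C", "Y")))
```

Here C and Y are two branch tips forked from A, so "C..Y" gives X and Y,
while "C...Y" gives B, C, X and Y (everything except the common root A).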

What about _your_ SCM? ;-)
-- 
Jakub Narebski
Poland


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20  0:20                                                               ` Jan Harkes
@ 2006-10-20 14:41                                                                 ` Jeff King
  0 siblings, 0 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-20 14:41 UTC (permalink / raw)
  To: Linus Torvalds, Junio C Hamano, git

On Thu, Oct 19, 2006 at 08:20:32PM -0400, Jan Harkes wrote:

> It looks like you were really close. When we cannot resolve a delta, we
> just write it to the packfile and we don't queue it. If it can be
> resolved we write it as a full object.

If I understand correctly, if we see an unresolvable delta, we are just
making the assumption that its base has arrived (or will arrive) in the
same pack (without checking).  This means that we could end up with a
corrupted repository if the sender gives us a bad pack. I believe that
git's network interaction has been designed specifically to avoid such
possibilities (e.g., verifying completeness and integrity of downloaded
objects).

-Peff


* Re: VCS comparison table
  2006-10-20 14:40                                               ` Jakub Narebski
@ 2006-10-20 14:52                                                 ` Johannes Schindelin
  2006-10-20 15:34                                                   ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-20 14:52 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jeff King, Aaron Bentley, Carl Worth, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Jeff King wrote:
> > 
> > I was accustomed to doing such things in CVS, but I find the git way
> > much more pleasant, since I don't have to do any arithmetic:
> >   diff d8a60^..d8a60
> 
> By the way "diff d8a60" also works (unless d8a60 is merge commit, in
> which case you would need "diff -c d8a60" or "diff -m d8a60").

I could be wrong, but I have the impression (even after actually testing 
it) that "git diff d8a60" is equivalent to "git diff d8a60..HEAD", _not_ 
"git diff d8a60^..d8a60".

IIRC we had a "-p" flag to denote "parent" once upon a time, but that no 
longer works...

"git-show" is definitely what you want.

Ciao,
Dscho


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 14:03                                     ` Aaron Bentley
@ 2006-10-20 14:56                                       ` Jakub Narebski
  2006-10-20 15:34                                         ` Aaron Bentley
  2006-10-21  7:56                                         ` Matthieu Moy
       [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
  1 sibling, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 14:56 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Aaron Bentley wrote:

> Sean wrote:
> 
>> Petr already mentioned that the data currently shown in the email
>> text isn't really useful.
> 
> In Bazaar bundles, the text of the diff is an integral part of the data.
>  It is used to generate the text of all the files in the revision.

I thought that the diff was combined diff of changes.
 
> Bazaar bundles were designed to be used on mailing lists.  So you can
> review the changes from the diff, comment on them, and if it seems
> suitable, merge them.

If you have only mega-diff, you can comment only on this mega-diff.
It is more useful, for changes which have a natural multi-commit history,
to review and comment on each of the commits/patches in the series _separately_.

>> Although that might just make the email bigger for not a lot of
>> gain.
> 
> It's my understanding that most changes discussed on lkml are provided
> as a series of patches.  Bazaar bundles are intended as a direct
> replacement for patches in that use case.

As _series_ of patches. You have git-format-patch + git-send-email
to format and send them, git-am to apply them (as patches, not as branch).

I was under an impression that user sees only mega-patch of all the
revisions in bundle together, and rest is for machine consumption only.

cg-bundle doesn't have this "mega-diff", but has a shortlog (does a bzr
bundle have a shortlog/log of the changes contained therein?) and a diffstat
was planned.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-20 13:17                                                 ` Jakub Narebski
  2006-10-20 13:36                                                   ` Petr Baudis
@ 2006-10-20 14:59                                                   ` James Henstridge
  2006-10-20 22:50                                                     ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-20 14:59 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> James Henstridge wrote:
> > With the above layout, I would just type:
> >     bzr branch http://server/repo/branch1
>
> With Cogito (you can think of it either as alternate Git UI, or as SCM
> built on top of Git) you would use
>
>    $ cg clone http://server/repo#branch
>
> for example
>
>    $ cg clone git://git.kernel.org/pub/scm/git/git.git#next
>
> to clone _single_ branch (in bzr terminology, "heavy checkout" of branch).

My understanding of git is that this would be equivalent to the "bzr
branch" command.  A checkout (heavy or lightweight) has the property
that commits are made to the original branch.

> But you can also clone _whole_ repository, _all_ published branches with
>
>    $ cg clone git://git.kernel.org/pub/scm/git/git.git

I suppose that'd be useful if you want a copy of all the branches at
once.  There is no builtin command in Bazaar to do that at present.


> With core Git it is the same, but we don't have the above shortcut
> for checking only one branch; branches to checkout are in separate
> arguments to git-clone.
>
> In bzr it seems that you cannot distinguish (at least not only
> from URL) where repository ends and branch begins.

I guess this highlights that the two tools optimise for different workflows.

> > This command behaves identically whether the repository data is in
> > /repo or in /repo/branch1.  Someone pulling from the branch doesn't
> > have to care what the repository structure is.  Having a separate
> > namespace for branch names only really makes sense if the user needs
> > to care about it.
> >
> > As for hierarchical names, there is nothing stopping you from using
> > deeper directory structures with Bazaar too.  Bazaar just checks each
> > successive parent directory til it finds a repository for the branch.
> >
> >> The idea of "branches (and tags) as directories" was if I understand
> >> it correctly introduced by Subversion, and from what can be seen from
> >> troubles with git-svn (stemming from the fact that division between
> >> project name and branch name is the matter of _convention_) at least
> >> slightly brain-damaged.
> >
> > I think you are a bit confused about how Bazaar works here.  A Bazaar
> > repository is a store of trees and revision metadata.  A Bazaar branch
> > is just a pointer to a head revision in the repository.  As you can
> > probably guess, the data for the branch is a lot smaller than the data
> > for the repository.
> >
> > You can store the repository and branch in the same directory to get a
> > standalone branch.  The layout I described above has a repository in a
> > parent directory, shared by multiple branches.
> >
> > If you are comparing Subversion and Bazaar, a Bazaar branch shares
> > more properties with a full Subversion repository rather than a
> > Subversion branch.
>
> Oh, that explained yet another difference between Bazaar-NG (and other
> SCM which uses similar model) and Git.
>
> In Git branch is just a pointer to head (top) commit (hence they are stored
> under .git/refs/heads/) in given line of development. Git also stores
> information (in .git/HEAD) about which branch we are currently on, which
> means on which branch git puts new commits. Nothing more (well, there
> can be log of changes to head in .git/logs/refs/heads/ but that is optional
> and purely local information). In Bazaar-NG you have to store (if I
> understand it correctly) mapping from revnos to revisions.
>
> By default (it means for example default behavior of git-clone, if we don't
> use --bare option) git repository is _embedded_ in working area. We have

Two points:
(1) if we are publishing branches, we wouldn't include working trees
-- they are not needed to pull or merge from such a branch.
(2) if we did have working trees, they'd be rooted at /repo/branch1
and /repo/branch2 -- not at /repo (since /repo is not a branch).

In case (2) there is a potential for conflicts if you nest branches,
but people don't generally trigger this problem with the way they use
Bazaar.

> So repo/branch wouldn't work, because 'branch' would conflict with working
> area files. GIT doesn't follow the CVS model of separate storage area
> (CVSROOT) and having only pointer to said area (files in CVS/
> subdirectories) in working directory.

That is fairly similar to the default mode of operation with Bazaar:
you have a repository, branch and working tree all rooted in the same
directory.  If you have separated working trees and branches, then
that is because you specifically asked for it.


> In GIT to work on some repository you don't (like from what I understand
> in Bazaar-NG) "checkout" some branch (which would automatically copy some
> data in case of "heavy checkout" or just save some pointer to repository
> in "lightweight checkout" case). You clone whole repository; well you can
> select which branches to clone. "Checkout" in GIT terminology means to
> populate working area with given version (and change in repository which
> branch is current, usually).

I think you have a slight misunderstanding of what a Bazaar checkout is.

>
> How checked out working area looks like in Bazaar-NG?

The layout of a standalone branch would be:
  .bzr/repository/ -- storage of trees and metadata
  .bzr/branch/ -- branch metadata (e.g. pointer to the head revision)
  .bzr/checkout/ -- working tree book-keeping files
  source code

If we use a shared repository, the contained branches would lack the
.bzr/repository/ directory.  The parent directory would instead have a
.bzr/repository/, but usually wouldn't have .bzr/branch/ (unless there
is a branch rooted at the base of the repository).

If we are publishing a branch to a web server, we'd skip the working
tree, so the source code and .bzr/checkout/ directory would be
missing.

In the case of a checkout, the .bzr/branch/ directory has a special
format and acts as a pointer to the original branch.  If the checkout
is lightweight, the .bzr/repository/ directory would be missing, and
bzr would need to contact the original branch for the data.
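
Combining the layout above with the parent-directory lookup mentioned
earlier, finding the repository for a branch could be sketched roughly like
this. It is not Bazaar's actual code: the function name is invented, and
it only tests for the presence of a .bzr/repository/ directory.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_repository(branch_dir: Path) -> Optional[Path]:
    """Walk from the branch directory upwards until a repository is found."""
    for candidate in [branch_dir, *branch_dir.parents]:
        if (candidate / ".bzr" / "repository").is_dir():
            return candidate
    return None  # e.g. a lightweight checkout with no local repository


# Build a throwaway shared-repository layout: repo/ holds the storage,
# repo/branch1/ holds only branch metadata.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "repo" / ".bzr" / "repository").mkdir(parents=True)
    (root / "repo" / "branch1" / ".bzr" / "branch").mkdir(parents=True)
    found = find_repository(root / "repo" / "branch1")
    print(found == root / "repo")
```

A standalone branch is found on the first iteration, since its own
directory contains .bzr/repository/; a branch in a shared repository is
found one or more levels up.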


> >>> For similar reasons, the cost of publishing 20 related Bazaar branches
> >>> on my web server is generally not 20 times the cost of publishing a
> >>> single branch.
> >>>
> >>> I understand that you get similar benefits by a GIT repository with
> >>> multiple head revisions.
> >>
> >> You can get similar benefits by a GIT repository with shared object
> >> database using alternates mechanism. And that is usually preferred
> >> over storing unrelated branches, i.e. branches pointing to disconnected
> >> DAG (separate trees in BK terminology) of revision, if that you mean by
> >> multiple head revisions (because in GIT there is no notion of "mainline"
> >> branch, only of current (HEAD) branch).
> >
> > I may have got the git terminology wrong. I was trying to draw
> > parallels between the .git/refs/... files in a git repository and the
> > way multiple branches can be stored in a Bazaar repository.
>
> Yes, but using Git that way has serious disadvantages. For example
> there is only one current branch pointer and only one index (dircache)
> per git repository.

Okay.  So using Bazaar terminology, this seems to be an issue of the
working tree being associated with the repository rather than the
branch?


[...]
> But I agree that saving "old fork" info as separate branch doesn't lead
> to that much inefficiency as might be thought.
>
> But after saving "old fork" as a branch revno based revision identifiers
> change from http://old.host/old/repo:127 to http://host/repo/old.fork:127
> That is maybe minimal change, but this is change!

Well, a branch can easily have multiple URLs even if there is only one
copy of it.  I might write to it via local file access or sftp (which
would be a file: or sftp: URL).

Mirrors of branches don't usually confuse users (and remember that the
revision numbers are primarily intended for users -- if I am writing a
Bazaar plugin, I'd work in terms of revision IDs).


James.


* Re: VCS comparison table
  2006-10-20 13:46                                   ` Christian MICHON
@ 2006-10-20 15:05                                     ` Jakub Narebski
  2006-10-20 15:16                                       ` Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 15:05 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Christian MICHON wrote:

> - git is the fastest scm around

Mercurial also claims that. It probably depends on the benchmark, though.
But Mercurial (hg) lacks, from what I understand, persistent branches, and
has only partial support for renames. YMMV.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-20 15:05                                     ` Jakub Narebski
@ 2006-10-20 15:16                                       ` Johannes Schindelin
  2006-10-20 15:28                                         ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-20 15:16 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Christian MICHON wrote:
> 
> > - git is the fastest scm around
> 
> Mercurial also claims that.

Funny. When you type in "mercurial" and "benchmark" into Google, the 
_first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
benchmark". Performed by the good Mercurial people.

Leaving git as winner.

Ciao,
Dscho


* Re: VCS comparison table
  2006-10-20 15:16                                       ` Johannes Schindelin
@ 2006-10-20 15:28                                         ` Jakub Narebski
  2006-10-20 15:39                                           ` Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 15:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, bazaar-ng

Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Christian MICHON wrote:
>> 
>>> - git is the fastest scm around
>> 
>> Mercurial also claims that.
> 
> Funny. When you type in "mercurial" and "benchmark" into Google, the 
> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
> benchmark". Performed by the good Mercurial people.
> 
> Leaving git as winner.
 
Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
comparison of Git and Mercurial" for the latest (OLS2006) benchmark
by Mercurial. Probably not indexed by Google, or doesn't have high 
pagerank because it is in PDF and fairly new (therefore has low 
"citations" number).

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20 14:31                                         ` Jeff King
@ 2006-10-20 15:33                                           ` J. Bruce Fields
  2006-10-20 15:43                                             ` Jeff King
  0 siblings, 1 reply; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-20 15:33 UTC (permalink / raw)
  To: Jeff King
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git

On Fri, Oct 20, 2006 at 10:31:11AM -0400, Jeff King wrote:
> On Thu, Oct 19, 2006 at 01:14:09PM -0400, J. Bruce Fields wrote:
> > So in this case you can certainly lose the launch codes.  But you have
> > forever granted everyone a way to determine whether a given guess at the
> > launch codes is correct.  (Again, assuming some stuff about SHA1).
> 
> In what sense? Yes, you can make a guess if you have stored the SHA1
> that contained the launch codes. But the point is that that particular
> SHA1 is no longer part of the repository.

Well, I thought the discussion was about what meaning references have
after branches were modified or removed.  In which case the interesting
situation is one where an object is gone but someone somewhere still
holds a reference (because the SHA1 was mentioned in a bug report or an
email or whatever).

> Keeping that SHA1 is no easier than just keeping the launch codes in
> the first place.

Could be.

Anyway, the important difference between the SHA1 references and small
integers is that there's no aliasing in the former case.  Which is
important--I'd rather have a reference to nothing than a reference to
the wrong thing....
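
A small sketch of the aliasing point, under loud assumptions: the "commit
id" below is a toy hash of parent id plus message, not git's real commit
format, and the two branch histories are invented.

```python
import hashlib


def commit_id(parent_id: str, message: str) -> str:
    """Toy content-derived id: depends on the whole history before it."""
    return hashlib.sha1(f"{parent_id}\n{message}".encode()).hexdigest()


def build_branch(messages):
    """Return the list of commit ids for a linear history of messages."""
    ids = []
    parent = ""
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


mainline = build_branch(["init", "add parser", "fix typo"])
fork = build_branch(["init", "add parser", "rewrite lexer"])

# Small integers alias: "revision 3" exists on both branches...
print(len(mainline) == len(fork) == 3)
# ...but it names different commits, which the hashes make visible.
print(mainline[2] != fork[2])
# The shared history still has identical ids on both sides.
print(mainline[:2] == fork[:2])
```

A stale hash at worst points at nothing; a stale "3" can silently point at
the other branch's third commit.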

--b.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 14:56                                       ` Jakub Narebski
@ 2006-10-20 15:34                                         ` Aaron Bentley
  2006-10-20 16:21                                           ` Jakub Narebski
  2006-10-20 22:40                                           ` Petr Baudis
  2006-10-21  7:56                                         ` Matthieu Moy
  1 sibling, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 15:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2005 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>>In Bazaar bundles, the text of the diff is an integral part of the data.
>> It is used to generate the text of all the files in the revision.
> 
> 
> I thought that the diff was combined diff of changes.

It is.  It's a description of how to produce revision X given revision
Y, where Y is the last-merged mainline revision.

>>Bazaar bundles were designed to be used on mailing lists.  So you can
>>review the changes from the diff, comment on them, and if it seems
>>suitable, merge them.
> 
> 
> If you have only mega-diff, you can comment only on this mega-diff.

That is what we prefer to review.

>>>Although that might just make the email bigger for not a lot of
>>>gain.
>>
>>It's my understanding that most changes discussed on lkml are provided
>>as a series of patches.  Bazaar bundles are intended as a direct
>>replacement for patches in that use case.
> 
> 
> As _series_ of patches. You have git-format-patch + git-send-email
> to format and send them, git-am to apply them (as patches, not as branch).

If you want to do it exactly the same way, you send a series of bundles.

The bundle format can also support sending a single bundle that
displays the series of patches, though there's currently no UI to select
this.

> I was under an impression that user sees only mega-patch of all the
> revisions in bundle together, and rest is for machine consumption only.

All of it is for machine consumption.  The MIME-encoded sections are a
series of patches.  They're usually MIME-encoded to avoid confusion with
the overview patch, but this is optional.

I've attached an example of what a combined patch-by-patch bundle looks
like.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOOyB0F+nu1YWqI0RAtU6AKCJndTNlTTPNnzxZX53lkBUUHTYkwCfePlG
7x3cjpYwh8LXEb5ZWXXmu6s=
=6Lgv
-----END PGP SIGNATURE-----

[-- Attachment #2: hello-world.patch --]
[-- Type: text/x-patch, Size: 1808 bytes --]

# Bazaar revision bundle v0.8
#
# message:
#   Added 'world'
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:30:21.903000116 -0400

=== modified file world
--- world
+++ world
@@ -1,1 +1,1 @@
-Hello
+Hello, world

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 153021-b5fcea14e9cd2b34
# revision id: abentley@panoramicfeedback.com-20061020153021-b5fcea14e9cd2b34
# sha1: 6d553e72158aaa76c258d98c15cd24922d171cd9
# inventory sha1: 64af82c4d81d9d6ad4f33fc734d32c2a1eaa0df5
# parent ids:
#   abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# properties:
#   branch-nick: bar

# message:
#   Capitalized
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:51.953999996 -0400

=== modified file world
--- world
+++ world
@@ -1,1 +1,1 @@
-hello
+Hello

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 152951-10cff5ff5a51e9a2
# revision id: abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# sha1: f7b79934bc3b0a944e35168b5df6b106c5b29ebf
# inventory sha1: 1400d56451752300cc31c9c94ff7ee2188e8ef8c
# parent ids:
#   abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# properties:
#   branch-nick: bar

# message:
#   initial commit
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:35.536999941 -0400

=== added directory  // file-id:TREE_ROOT
=== added file world // file-id:world-20061020152929-12bknd8mm9mx48as-1
--- /dev/null
+++ world
@@ -0,0 +1,1 @@
+hello

# revision id: abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# sha1: 0728f761b891b257f0a71e2e360799eec080cd21
# inventory sha1: e52e030ea40f6bf5da78f4e8eb8efcd072b0930a
# properties:
#   branch-nick: bar



* Re: VCS comparison table
  2006-10-20 14:52                                                 ` Johannes Schindelin
@ 2006-10-20 15:34                                                   ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 15:34 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff King, Aaron Bentley, Carl Worth, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Jeff King wrote:
>>> 
>>> I was accustomed to doing such things in CVS, but I find the git way
>>> much more pleasant, since I don't have to do any arithmetic:
>>>   diff d8a60^..d8a60
>> 
>> By the way "diff d8a60" also works (unless d8a60 is merge commit, in
>> which case you would need "diff -c d8a60" or "diff -m d8a60").
> 
> I could be wrong, but I have the impression (even after actually testing 
> it) that "git diff d8a60" is equivalent to "git diff d8a60..HEAD", _not_ 
> "git diff d8a60^..d8a60".

Ooops, I mixed git-diff-tree (which behaves as mentioned above) with
git-diff, which according to documentation compares with working tree
(and not HEAD) if only one <tree-ish> is given.

git-diff(1):
       *  When one <tree-ish> is given, the working tree and the named tree are
          compared, using git-diff-index. The option --cached can be given to
          compare the index file and the named tree.

git-diff-tree(1):
       If there is only one <tree-ish> given, the commit is compared with its
       parents (see --stdin below).
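
The difference between the two manual page excerpts can be summarised in a
small Python sketch. The function name diff_endpoints and the string labels
are made up for illustration; it models only the single <tree-ish> case
quoted above and does not invoke git.

```python
def diff_endpoints(command, treeish, cached=False):
    """Return (old_side, new_side) compared when one <tree-ish> is given.

    Illustrative model of the quoted manual pages, not a git API.
    """
    if command == "git-diff-tree":
        # The commit is compared with its parent(s).
        return (treeish + "^", treeish)
    if command == "git-diff":
        # The named tree is compared with the working tree, or with the
        # index file when --cached is given.
        return (treeish, "index" if cached else "working tree")
    raise ValueError("unknown command: " + command)


print(diff_endpoints("git-diff-tree", "d8a60"))
print(diff_endpoints("git-diff", "d8a60"))
print(diff_endpoints("git-diff", "d8a60", cached=True))
```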
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
@ 2006-10-20 15:37                                         ` Sean
  2006-10-20 15:37                                         ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-20 15:37 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Alexander Belchenko, bazaar-ng, git

On Fri, 20 Oct 2006 10:03:16 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> In Bazaar bundles, the text of the diff is an integral part of the data.
> It is used to generate the text of all the files in the revision.
> 
> Bazaar bundles were designed to be used on mailing lists.  So you can
> review the changes from the diff, comment on them, and if it seems
> suitable, merge them.

Perhaps I missed something in the earlier mails about this feature.
As I understood it, the email sent has a combined diff that shows
the net effect of all the commits included in the bundle.  (Whereas
the current Cogito version only shows a diffstat)

If the recipient of such a bundle is unable to extract the diff of
each separate commit included in the bundle then I can't see any
value in the feature at all.  But showing a combined diff in the
email may have marginal value, so long as when the bundle is 
imported into the recipient repository the individual commits
are available.

> It's my understanding that most changes discussed on lkml are provided
> as a series of patches.  Bazaar bundles are intended as a direct
> replacement for patches in that use case.

A combined diff of a bunch of changes would usually be most _unwelcome_
for review on lkml.  The constant refrain is to ask people to split their
changes up into smallish individual patches for review.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:28                                         ` Jakub Narebski
@ 2006-10-20 15:39                                           ` Johannes Schindelin
  2006-10-20 16:05                                             ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-20 15:39 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Johannes Schindelin wrote:
> 
> > On Fri, 20 Oct 2006, Jakub Narebski wrote:
> > 
> >> Christian MICHON wrote:
> >> 
> >>> - git is the fastest scm around
> >> 
> >> Mercurial also claims that.
> > 
> > Funny. When you type in "mercurial" and "benchmark" into Google, the 
> > _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
> > benchmark". Performed by the good Mercurial people.
> > 
> > Leaving git as winner.
>  
> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
> by Mercurial.

Thanks for the hint!

BTW the tests in Clone/status/pull make sense, especially the "4 times 
slower on pull/merge". In my tests, merge-recur (the default merge 
strategy, which was written in Python, and is now in C) was substantially 
faster.

> Probably not indexed by Google, or doesn't have high pagerank because it 
> is in PDF and fairly new (therefore has low "citations" number).

I hope these posts boost the pagerank.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:33                                           ` J. Bruce Fields
@ 2006-10-20 15:43                                             ` Jeff King
  0 siblings, 0 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-20 15:43 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git

On Fri, Oct 20, 2006 at 11:33:23AM -0400, J. Bruce Fields wrote:

> Well, I thought the discussion was about what meaning references have
> after branches were modified or removed.  In which case the interesting
> situation is one where an object is gone but someone somewhere still
> holds a reference (because the SHA1 was mentioned in a bug report or an
> email or whatever).

Git tries very hard to make sure you don't have a reference to something
that doesn't exist. But yes, you could have a reference to the SHA1 in
another, non-git source, and try to guess the data from it. However,
there's a bit of a two-step procedure, since the SHA1 will likely be of
the commit. You have to guess the commit author, date, message, and
the contents of the rest of the tree to make a correct guess.

In practice I think most "launch code" scenarios are less about
guessable confidentiality, and more about ceasing to publish things you
shouldn't be (like copyright or patent encumbered code).

-Peff

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:51                                             ` Jakub Narebski
@ 2006-10-20 15:58                                               ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 15:58 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git



On Fri, 20 Oct 2006, Jakub Narebski wrote:
> Junio C Hamano wrote:
> > 
> > An interesting effect on this is when people have a column for
> > merge performance in a SCM comparison table, they would include
> > time to run the diffstat as part of the time spent for merging
> > when they fill in the number for git, but not for any other SCM.
> 
> So if you want to compare merge performance with other SCMs, you should
> either add the time to run diffstat for the other SCMs, or subtract the
> time to run "git diff-tree --stat".

Naah. Just run "git pull -n". It's even documented:

	OPTIONS
	       -n, --no-summary
	              Do not show diffstat at the end of the merge.

so while the _default_ is to always show the diffstat, you certainly can 
easily do without it.

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:39                                           ` Johannes Schindelin
@ 2006-10-20 16:05                                             ` Jakub Narebski
  2006-10-20 16:24                                               ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 16:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, bazaar-ng

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>> Johannes Schindelin wrote:
>>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>> 
>>>> Christian MICHON wrote:
>>>> 
>>>>> - git is the fastest scm around
>>>> 
>>>> Mercurial also claims that.
>>> 
>>> Funny. When you type in "mercurial" and "benchmark" into Google, the 
>>> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
>>> benchmark". Performed by the good Mercurial people.
>>> 
>>> Leaving git as winner.
>>  
>> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
>> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
>> by Mercurial.
> 
> Thanks for the hint!
> 
> BTW the tests in Clone/status/pull make sense, especially the "4 times 
> slower on pull/merge". In my tests, merge-recur (the default merge 
> strategy, which was written in Python, and is now in C) was substantially 
> faster.

As was mentioned elsewhere in this thread, to compare times
for pull/merge in git with other SCMs one should in principle subtract
the time for diffstat/git diff --stat.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 15:34                                         ` Aaron Bentley
@ 2006-10-20 16:21                                           ` Jakub Narebski
  2006-10-20 17:03                                             ` Aaron Bentley
  2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 22:40                                           ` Petr Baudis
  1 sibling, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 16:21 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:

> === added directory  // file-id:TREE_ROOT

Gaaah, so rename detection in bzr is done using file-ids?
Linus will tell you the inherent problems with that "solution".
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 16:05                                             ` Jakub Narebski
@ 2006-10-20 16:24                                               ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 16:24 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jakub Narebski wrote:

> Johannes Schindelin wrote:
>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>> Johannes Schindelin wrote:
>>>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>>> 
>>>>> Christian MICHON wrote:
>>>>> 
>>>>>> - git is the fastest scm around
>>>>> 
>>>>> Mercurial also claims that.
>>>> 
>>>> Funny. When you type in "mercurial" and "benchmark" into Google, the 
>>>> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
>>>> benchmark". Performed by the good Mercurial people.
>>>> 
>>>> Leaving git as winner.
>>>  
>>> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
>>> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
>>> by Mercurial.
>> 
>> Thanks for the hint!
>> 
>> BTW the tests in Clone/status/pull make sense, especially the "4 times 
>> slower on pull/merge". In my tests, merge-recur (the default merge 
>> strategy, which was written in Python, and is now in C) was substantially 
>> faster.
> 
> As was mentioned elsewhere in this thread, to compare times
> for pull/merge in git with other SCMs one should in principle subtract
> the time for diffstat/git diff --stat.

Or, as Linus reminded us, use the -n (--no-summary) option to git pull.

BTW. I'd rather have -n == --no-commit for git pull...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Signed git-tag doesn't find default key
  2006-10-20  9:04 Signed git-tag doesn't find default key Andy Parkins
@ 2006-10-20 16:32 ` Linus Torvalds
  2006-10-20 19:21   ` Andy Parkins
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 16:32 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git



On Fri, 20 Oct 2006, Andy Parkins wrote:
> 
> I did this:
> 
> $ git tag -s adp-sign-tag
> gpg: skipped "Andy Parkins <andyparkins@gmail.com>": secret key not available
> gpg: signing failed: secret key not available
> failed to sign the tag with GPG.

I would suggest one of two things:

 - specify the signing entity explicitly:

	git tag -u "andyparkins@gmail.com" adp-sign-tag

 - or just add a new alternate user ID to match the full git user ID.

Currently, your pgp key has the full ID "Andy Parkins (Google) 
<andyparkins@gmail.com>", and the way gpg matches ID's, that will _not_ 
match an ID of "Andy Parkins <andyparkins@gmail.com>"

But you can just do something like

	gpg --edit-key andyparkins@gmail.com

and then do an "adduid", and then add your UID _without_ the "(Google)" in 
there, and that should solve all your problems.

> So when git-tag looks for "Andy Parkins <andyparkins@gmail.com>"; it's not 
> found.  The answer is (I think) to search only on the email address when 
> looking for a key.  I've simply changed git-tag to have
> 
> username=$(git-repo-config user.email)
> 
> However, this is clearly wrong as what it actually wants is the committer 
> email.  Am I safe to simply process the $tagger variable to extract it?

You're probably better off with something like

	git var GIT_COMMITTER_IDENT | sed 's/\(.*\)<\(.*\)>\(.*\)/\2/'

which should work, but see above: I think you literally are better off 
just adding an alias to your PGP key that doesn't have the comment field.
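
For readers who would rather not squint at the sed expression, here is a
rough Python equivalent of the same substitution. The sample ident string is
an assumption (name and email taken from this thread, timestamp and timezone
invented); the function name committer_email is hypothetical as well.

```python
import re


def committer_email(ident):
    """Keep only what sits between '<' and '>' in an ident line.

    Mirrors the sed expression s/\\(.*\\)<\\(.*\\)>\\(.*\\)/\\2/ from the mail.
    """
    return re.sub(r"(.*)<(.*)>(.*)", r"\2", ident)


# Assumed shape of "git var GIT_COMMITTER_IDENT" output: name, address in
# angle brackets, then a timestamp and timezone (the numbers are made up).
sample = "Andy Parkins <andyparkins@gmail.com> 1161361920 +0100"
print(committer_email(sample))
```

The extracted address could then be handed to "git tag -u", as in the first
suggestion above.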

That said, I've never understood why gpg matches on the comment field. 
Dammit, it _should_ find the key anyway. Stupid program.

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 16:21                                           ` Jakub Narebski
@ 2006-10-20 17:03                                             ` Aaron Bentley
  2006-10-20 17:18                                               ` Linus Torvalds
  2006-10-20 17:21                                               ` Shawn Pearce
  2006-10-20 18:12                                             ` Jan Hudec
  1 sibling, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 17:03 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
> 
> 
>>=== added directory  // file-id:TREE_ROOT
> 
> 
> Gaaah, so rename detection in bzr is done using file-ids?
> Linus will tell you the inherent problems with that "solution".

All solutions have disadvantages.  We prefer the disadvantages that come
from using file-ids over the disadvantages that come from using
content-based rename detection.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOQFo0F+nu1YWqI0RAlCnAJwIqwuPG/IPBBQWaGyEImTm4GMP6QCfTV89
QZaMQsTqXBH8wrt7VKAHpII=
=Qx2i
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:03                                             ` Aaron Bentley
@ 2006-10-20 17:18                                               ` Linus Torvalds
  2006-10-20 17:45                                                 ` Jakub Narebski
  2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
  2006-10-20 17:21                                               ` Shawn Pearce
  1 sibling, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 17:18 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
> All solutions have disadvantages.  We prefer the disadvantages that come
> from using file-ids over the disadvantages that come from using
> content-based rename detection.

That's fine, but please don't call the git rename handling "maybe" or 
"partial", like a lot of people seem to do. 

Git _definitely_ handles renames, both in everyday life and when merging. 
Some people may not like how it's done, but other (I'll say "equally 
informed", even though obviously I know better ;) people really don't like 
the way bzr or others do their rename handling.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:03                                             ` Aaron Bentley
  2006-10-20 17:18                                               ` Linus Torvalds
@ 2006-10-20 17:21                                               ` Shawn Pearce
  2006-10-20 17:48                                                 ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-20 17:21 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Jakub Narebski wrote:
> > Gaaah, so rename detection in bzr is done using file-ids?
> > Linus will tell you the inherent problems with that "solution".
> 
> All solutions have disadvantages.  We prefer the disadvantages that come
> from using file-ids over the disadvantages that come from using
> content-based rename detection.

As good as the content based rename detection is I got burned
recently by it.

I renamed hundreds of small files in one shot and also did a few
hundred adds and deletes of other small XML files.  Git reported
a lot of those unrelated adds/deletes as rename/modifies, as their
content was very similar.  Some people involved in the project
freaked as the files actually had nothing in common with one
another... except for a lot of XML elements (as they shared the
same DTD).
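
A toy illustration of how this can happen, in Python. This is not git's
actual similarity algorithm; the line-counting score, the 0.5 threshold, and
the file names are all assumptions chosen to keep the sketch short.

```python
from collections import Counter


def similarity(old_lines, new_lines):
    """Fraction of lines shared by two files, between 0.0 and 1.0."""
    if not old_lines and not new_lines:
        return 1.0
    common = sum((Counter(old_lines) & Counter(new_lines)).values())
    return 2.0 * common / (len(old_lines) + len(new_lines))


def detect_renames(deleted, added, threshold=0.5):
    """Pair each deleted path with its most similar added path, if any."""
    pairs = []
    for old_path, old_lines in deleted.items():
        best = max(added, key=lambda p: similarity(old_lines, added[p]),
                   default=None)
        if best is not None and similarity(old_lines, added[best]) >= threshold:
            pairs.append((old_path, best))
    return pairs


# Two unrelated files that share only the boilerplate their DTD requires.
boilerplate = [
    '<?xml version="1.0"?>',
    '<!DOCTYPE config SYSTEM "config.dtd">',
    "<config>",
    "  <meta>",
    "    <owner>ops</owner>",
    "  </meta>",
    "  <items>",
    "  </items>",
]
deleted = {"users.xml": boilerplate + ["<user/>", "<group/>"]}
added = {"printers.xml": boilerplate + ["<printer/>", "<queue/>"]}

print(similarity(deleted["users.xml"], added["printers.xml"]))  # 0.8
print(detect_renames(deleted, added))
```

With small files the shared boilerplate dominates the score, so an unrelated
delete/add pair is reported as a rename; raising the threshold in this sketch
makes the pairing go away.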

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 11:37                                                     ` Johannes Schindelin
  2006-10-20 12:03                                                       ` Jakub Narebski
@ 2006-10-20 17:23                                                       ` David Lang
  1 sibling, 0 replies; 1752+ messages in thread
From: David Lang @ 2006-10-20 17:23 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, git, bazaar-ng

On Fri, 20 Oct 2006, Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>
>> Johannes Schindelin wrote:
>>
>>> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
>>>
>>>> How does git disambiguate SHA1 hash collisions?
>>>
>>> It does not. You can fully expect the universe to go down before that
>>> happens.
>>
>> Or you can compile git with COLLISION_CHECK
>>
>>> From Makefile:
>> # Define COLLISION_CHECK below if you believe that SHA1's
>> # 1461501637330902918203684832716283019655932542976 hashes do not give you
>> # sufficient guarantee that no collisions between objects will ever happen.
>
> You can document your disbelief.
>
> But it does not change a thing. Since v0.99~653, we do not have any
> collision check, even if compiled with COLLISION_CHECK.

I had the same disbelief as you about this, however the last time this came up 
Linus pointed out something that satisfied me.

any action in git that could create or recreate an object will not overwrite 
an object that it thinks that it already has.

so

if you create a new local file that would conflict and save it, git will accept 
your save and throw away the new file.

if you pull from a remote repository and there is a file there that conflicts 
with a file you already have it will throw away the new file.

if you pull from a remote repository and someone has hacked it to replace a file 
with a bad one, if you already have the good one git will throw away the bad 
one.

as a result the worst case is that a new file being checked in doesn't really 
get in and when someone checks it out and tries to use it they get the old 
contents. In the case of code, it's extremely unlikely that the wrong code will 
even compile, let alone do anything remotely close to working correctly. At this 
point the fix is to go back to the original developer to get the correct 
version while additional changes are made to git (and remember, that unless this 
is a brand new file the prior version is readily available so only the latest 
diff needs to be recovered)

so the odds are extremely low and the consequences of a collision are fairly 
minor.

git has (or had) an option to actually check the full contents before throwing 
away the new copy instead of just checking the hash (and throwing an error if 
the contents don't match), but the performance cost of this is pretty high.
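
to make the "existing object wins" behaviour concrete, here is a minimal 
Python sketch. The ObjectStore class, its collision_check flag and the 
deliberately weak weak_id hash are all hypothetical and exist only to force a 
collision for the demonstration; only blob_id follows the way git names a 
blob (SHA-1 over a "blob <size>\0" header plus the content).

```python
import hashlib


def blob_id(content):
    """Name a blob the way git does: SHA-1 of a header plus the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def weak_id(content):
    """Deliberately bad hash (first byte only) so that collisions are easy."""
    return blob_id(content[:1])


class ObjectStore:
    """Toy content-addressed store; not git's real object database."""

    def __init__(self, hash_fn=blob_id, collision_check=False):
        self.objects = {}
        self.hash_fn = hash_fn
        self.collision_check = collision_check

    def write(self, content):
        oid = self.hash_fn(content)
        existing = self.objects.get(oid)
        if existing is None:
            self.objects[oid] = content
        elif self.collision_check and existing != content:
            # The expensive variant: compare full contents, not just the hash.
            raise RuntimeError("hash collision on " + oid)
        # Otherwise the copy we already have is kept and the new one dropped.
        return oid


store = ObjectStore(hash_fn=weak_id)
first = store.write(b"good code")
second = store.write(b"garbage")      # collides under weak_id
print(first == second)                # True
print(store.objects[first])           # b'good code'
```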

David Lang

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:18                                               ` Linus Torvalds
@ 2006-10-20 17:45                                                 ` Jakub Narebski
  2006-10-20 17:59                                                   ` Linus Torvalds
  2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 17:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Aaron Bentley, bazaar-ng, git

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
>> 
>> All solutions have disadvantages.  We prefer the disadvantages that come
>> from using file-ids over the disadvantages that come from using
>> content-based rename detection.

If I remember correctly, git decided on content (plus filename)
similarity-based rename detection because 1) it is more generic,
as it covers (or can cover) content moving and not only wholesale
renames of a file, and 2) file-id based rename handling works only
if you explicitly use an SCM command to rename the file, which is not the
case for a non-SCM-aware channel such as patches (and accepting
ordinary patches is important for the Linux kernel, the project git was
created for).

Another problem with file-id based rename handling is that it does not
handle file copying (correct me if I'm wrong), and that it has trouble with
removing or renaming a file and then adding a new file under the old name.
 
> That's fine, but please don't call the git rename handling "maybe" or 
> "partial", like a lot of people seem to do. 
> 
> Git _definitely_ handles renames, both in everyday life and when merging. 
> Some people may not like how it's done, but other (I'll say "equally 
> informed", even though obviously I know better ;) people really don't like 
> the way bzr or others do their rename handling.

I think that "partial" refers to the incomplete handling of renames
in file history: a pathspec limit doesn't follow renames. The
information is there in the SCM; it's the tools that need extending
(the proposed --follow option for following renames with a single-file
pathspec limit).

There was also a suggestion of an rr2-cache, which would record corrections
to automatic rename detection (rename/copy conflict resolution),
if I remember correctly.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:18                                               ` Linus Torvalds
  2006-10-20 17:45                                                 ` Jakub Narebski
@ 2006-10-20 17:47                                                 ` Aaron Bentley
  2006-10-20 18:06                                                   ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 17:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
>>All solutions have disadvantages.  We prefer the disadvantages that come
>>from using file-ids over the disadvantages that come from using
>>content-based rename detection.
> 
> 
> That's fine, but please don't call the git rename handling "maybe" or 
> "partial", like a lot of people seem to do. 
> 
> Git _definitely_ handles renames, both in everyday life and when merging.

Hmm.  Could you say more here?  The only examples I can think of for
handling renames are situations that can be expressed as a merge.

For example, populating a working tree can be expressed as:
BASE: nothing
THIS: nothing
OTHER: aabbccddee

Or revert can be expressed as

BASE: current
THIS: current
OTHER: aabbccddee

Or fast-forward pull

BASE: last-commit
THIS: current
OTHER: aabbccddee
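
The three cases above can be run through a generic three-way rule, sketched
here in Python. This is the textbook rule applied to whole values, not
Bazaar's implementation; merge3 and the CONFLICT marker are hypothetical
names, and None stands in for "nothing".

```python
CONFLICT = object()  # marker returned when both sides changed differently


def merge3(base, this, other):
    """Generic three-way rule applied to a single value."""
    if this == other:
        return this       # both sides agree
    if this == base:
        return other      # only OTHER changed, take it
    if other == base:
        return this       # only THIS changed, keep it
    return CONFLICT


# Populating a working tree: BASE and THIS are both "nothing".
print(merge3(None, None, "aabbccddee"))
# Revert: BASE and THIS are both the current state.
print(merge3("current", "current", "aabbccddee"))
# Fast-forward pull with an unmodified tree (current == last-commit).
print(merge3("last-commit", "last-commit", "aabbccddee"))
# The same pull with local edits needs a real merge.
print(merge3("last-commit", "locally-edited", "aabbccddee") is CONFLICT)
```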

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOQuv0F+nu1YWqI0RAotBAKCEEzvh1Cc2jJH4NIEBwoYrDJlbUQCgiPBF
DZ4+hSbkjbvgOwbT4+oLzFA=
=wSgK
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:21                                               ` Shawn Pearce
@ 2006-10-20 17:48                                                 ` Linus Torvalds
  2006-10-20 17:58                                                   ` David Lang
                                                                     ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 17:48 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Aaron Bentley, Jakub Narebski, bazaar-ng, git



On Fri, 20 Oct 2006, Shawn Pearce wrote:
> 
> I renamed hundreds of small files in one shot and also did a few
> hundred adds and deletes of other small XML files.  Git reported
> a lot of those unrelated adds/deletes as rename/modifies, as their
> content was very similar.  Some people involved in the project
> freaked as the files actually had nothing in common with one
> another... except for a lot of XML elements (as they shared the
> same DTD).

Heh. We can probably tweak the heuristics (one of the _great_ things about 
content detection is that you can fix it after the fact, unlike the 
alternative).

That said, I've personally actually found the content-based similarity 
analysis to often be quite informative, even when (and perhaps 
_especially_ when) it ended up showing something that the actual author of 
the thing didn't intend.

So yeah, I've seen a few strange cases myself, but they've actually been 
interesting. Like seeing how much of a file was just a copyright license, 
and then a file being considered a "copy" just because it didn't actually 
introduce any real new code.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
@ 2006-10-20 17:58                                                   ` David Lang
  2006-10-20 18:15                                                   ` Jon Smirl
                                                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: David Lang @ 2006-10-20 17:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Aaron Bentley, Jakub Narebski, bazaar-ng, git

On Fri, 20 Oct 2006, Linus Torvalds wrote:

> On Fri, 20 Oct 2006, Shawn Pearce wrote:
>>
>> I renamed hundreds of small files in one shot and also did a few
>> hundred adds and deletes of other small XML files.  Git reported
>> a lot of those unrelated adds/deletes as rename/modifies, as their
>> content was very similar.  Some people involved in the project
>> freaked as the files actually had nothing in common with one
>> another... except for a lot of XML elements (as they shared the
>> same DTD).
>
> Heh. We can probably tweak the heuristics (one of the _great_ things about
> content detection is that you can fix it after the fact, unlike the
> alternative).
>
> That said, I've personally actually found the content-based similarity
> analysis to often be quite informative, even when (and perhaps
> _especially_ when) it ended up showing something that the actual author of
> the thing didn't intend.
>
> So yeah, I've seen a few strange cases myself, but they've actually been
> interesting. Like seeing how much of a file was just a copyright license,
> and then a file being considered a "copy" just because it didn't actually
> introduce any real new code.
>

isn't the default to consider them a copy if they are 80% the same, with a 
command line option to tweak this (IIRC -m, but I could easily be wrong)

David Lang

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:45                                                 ` Jakub Narebski
@ 2006-10-20 17:59                                                   ` Linus Torvalds
  2006-10-20 20:17                                                     ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 17:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git



On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
> If I remember correctly, git decided on content (plus filename)
> similarity-based rename detection because 1) it is more generic,
> as it covers (or can cover) content moving and not only wholesale
> renames of a file, and 2) file-id based rename handling works only
> if you explicitly use an SCM command to rename the file, which is not the
> case for a non-SCM-aware channel such as patches (and accepting
> ordinary patches is important for the Linux kernel, the project git was
> created for).

There are lots of problems with file IDs. One of the more obvious ones is 
indeed that if you arrive at the same state two different ways (eg patches 
vs "native SCM"), you end up with two fundamentally different trees. Even 
though clearly there was no real difference.
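
A small Python sketch of that first problem. It is a toy model, not how git
or any file-ID based SCM actually serialises a tree; the function names, the
file names and the IDs "id-1" and "id-2" are invented for the illustration.

```python
import hashlib


def content_tree_id(files):
    """Tree identity computed from paths and contents only."""
    h = hashlib.sha1()
    for path in sorted(files):
        h.update(path.encode() + b"\0" + hashlib.sha1(files[path]).digest())
    return h.hexdigest()


def file_id_tree_id(files, file_ids):
    """Tree identity that also mixes in a per-file ID."""
    h = hashlib.sha1()
    for path in sorted(files):
        h.update(path.encode() + b"\0" + file_ids[path].encode() + b"\0"
                 + hashlib.sha1(files[path]).digest())
    return h.hexdigest()


# The same end state reached two ways.
files = {"new.c": b"int main(void) { return 0; }\n"}
via_native_rename = {"new.c": "id-1"}  # old.c renamed with the SCM command
via_plain_patch = {"new.c": "id-2"}    # a patch deleted old.c, created new.c

print(content_tree_id(files) == content_tree_id(dict(files)))   # True
print(file_id_tree_id(files, via_native_rename)
      == file_id_tree_id(files, via_plain_patch))               # False
```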

There are other serious problems. For example, file-ID based systems 
invariably have _huge_ problems with handling two branches deleting and 
renaming things differently, and we had several issues with that during 
the BK days (ie two people would move files differently, and ending up 
with different file ID's for the same path, and merging that inevitably 
causes problems not just during the merge, but ever after, since one of 
the file ID's will then have to be "deleted" even though it might be 
active in one of the branches).

Finally, file-ID based systems fundamentally cannot handle some simple and 
interesting cases, like partial content movement. We're starting to see 
git actually being able to track file content moving between files: even 
when the files themselves didn't move (ie Junio's "git pickaxe" work could 
do things like that).

And there really aren't as many advantages to tracking renames as people 
claim. The biggest advantage of tracking renames is to avoid the trap that 
CVS fell into: being file-ID based _and_ not being able to track the file 
ID moving is clearly the worst of all worlds.

So for anybody coming from a CVS background, tracking renames explicitly 
is a _huge_ advantage, which is, I think, why some SCM people have gotten 
so hung up about them. It's just that if you don't have the file-ID 
problem in the first place (and git doesn't), then rename tracking doesn't 
actually make any sense, and only makes things much worse.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
@ 2006-10-20 18:06                                                   ` Linus Torvalds
  2006-10-20 18:30                                                     ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 18:06 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> > 
> > Git _definitely_ handles renames, both in everyday life and when merging.
> 
> Hmm.  Could you say more here?  The only examples I can think of for
> handling renames are situations that can be expressed as a merge.

So yes, merges are the situation where renames are normally considered a 
"problem", but it's actually not nearly the most every-day situation at 
all.

The most common one is actually just showing things as a diff.

If you are looking at a code-change, there's an absolutely _huge_ 
difference if you look at the result as a "delete this huge file" and 
"create this other huge file" and seeing it as a "move this huge file from 
here to here, and change a few lines in the process".
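
As a rough sketch of what "seeing it as a move" means when only content is
available, the Python below pairs deleted paths with added paths by line
similarity. This is not git's actual algorithm; detect_renames(), the 0.5
threshold and the file names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def detect_renames(old, new, threshold=0.5):
    """Pair deleted paths with added paths whose contents look alike.

    old and new map path -> file text. The threshold is an assumption
    chosen for this sketch.
    """
    deleted = {p: t for p, t in old.items() if p not in new}
    added = {p: t for p, t in new.items() if p not in old}
    renames = []
    for src in sorted(deleted):
        best, best_score = None, threshold
        for dst in sorted(added):
            score = SequenceMatcher(
                None, deleted[src].splitlines(), added[dst].splitlines()
            ).ratio()
            if score >= best_score:
                best, best_score = dst, score
        if best is not None:
            # Report "moved from src to best" instead of delete + create.
            renames.append((src, best, round(best_score, 2)))
            del added[best]
    return renames


body = [f"line {i}" for i in range(10)]
old = {"drivers/foo.c": "\n".join(body)}
body[4] = "line 4, fixed"  # change a single line while moving the file
new = {"drivers/net/foo.c": "\n".join(body)}
print(detect_renames(old, new))
```

With nine of ten lines unchanged the pair scores 0.9, so a diff viewer
could show one move plus a one-line change rather than two huge hunks.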

So the most _important_ part of rename tracking from a user perspective is 
for the person who walks through somebody else's code history, and wants to
know how a certain state came to be. The merges are usually not as big of 
a deal for the user (although they are clearly the most hairy case for the 
SCM - which is why SCM people concentrate on merges).

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 16:21                                           ` Jakub Narebski
  2006-10-20 17:03                                             ` Aaron Bentley
@ 2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 18:35                                               ` Jakub Narebski
                                                                 ` (4 more replies)
  1 sibling, 5 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-20 18:12 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

On Fri, Oct 20, 2006 at 06:21:34PM +0200, Jakub Narebski wrote:
> Aaron Bentley wrote:
> 
> > === added directory  // file-id:TREE_ROOT
> 
> Gaaah, so rename detection in bzr is done using file-ids?
> Linus will tell you the inherent problems with that "solution".

Ok, I tried to read
http://permalink.gmane.org/gmane.comp.version-control.git/217

It's all nice and well, but my question is whether the below cases work
in git. Yes, they are particular cases, but they are particularly
important. If they don't, I'd rather have a file-id scheme, that is
limited to just them, but handles them, than something with big plans,
but nothing working.

Let's consider the following scenario:

(where A$ means working in branch A, B$ means working in branch B and
 VCT stands for version control tool of choice)

A$ echo Hello Warld! > hello.txt
A$ VCT add hello.txt
A$ VCT commit -m "Created greeting"
$ VCT branch A B
A$ VCT mkdir data
A$ VCT mv hello.txt data/
A$ VCT commit -m "Moved hello.txt to data dir"
B$ ed hello.txt
? 1s/Warld/World/
? wq
B$ VCT commit -m "Fixed typo in greeting"
A$ VCT merge B

At this point, I expect the tree to look like this:
A$ ls -R
.:
data/
data:
hello.txt
A$ cat data/hello.txt
Hello World!

The file-id algorithm is not exceptionally clever, is a bit of a
special case and all that, but it handles the above case right. And
while that scenario is just a special case of general moving contents,
it is:
1) Very common
2) Possible to handle in an obviously correct way

It is very important for me that a version control tool I use handles
this case. If it handles the more general cases, that's nice, but this
is a must.

Oh, and there is one more complicated case, that I also require to work
and that works in Bzr, but did not work in Arch:

...let's start with the tree at the end of previous example...

A$ VCT mv data greetings
A$ VCT commit -m "Renamed the data directory to greetings"
B$ echo "Goodbye World!" > data/goodbye.txt
B$ VCT add data/goodbye.txt
B$ VCT commit -m "Added goodbye message."
A$ VCT merge B

And now I expect to have the tree looking like this:

A$ ls -R
.:
greetings/
greetings:
hello.txt
goodbye.txt

And note, that it is /not/ required to use file-ids to handle this.
Darcs handles this just as well with its patch algebra
(http://darcs.net/DarcsWiki/PatchTheory) without need of any IDs.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
  2006-10-20 17:58                                                   ` David Lang
@ 2006-10-20 18:15                                                   ` Jon Smirl
  2006-11-03  3:43                                                     ` Matthew Hannigan
  2006-10-20 20:23                                                   ` Petr Baudis
  2006-10-20 20:53                                                   ` Shawn Pearce
  3 siblings, 1 reply; 1752+ messages in thread
From: Jon Smirl @ 2006-10-20 18:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Aaron Bentley, Jakub Narebski, bazaar-ng, git

On 10/20/06, Linus Torvalds <torvalds@osdl.org> wrote:
> So yeah, I've seen a few strange cases myself, but they've actually been
> interesting. Like seeing how much of a file was just a copyright license,
> and then a file being considered a "copy" just because it didn't actually
> introduce any real new code.

It may be worth doing something special for licenses. Lots of small
Mozilla files are also getting tripped up by the large copyright
notices. The notices take up a lot of space too. The Mozilla license
has been changed five times. That is 110,000 files times one to five
licenses at 800-1500 characters each. 500MB+ of junk before
compression.

You could have a file of macro substitutions that is applied/expanded
when files go in/out of git. The macros would replace the copyright
notices, improving the move/rename tracking and reducing the repository
size. The macros could be recorded out of band to eliminate the need
for escaping the file contents. Even simpler, the only valid place for
the macro could be the beginning of the file.
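
A minimal Python sketch of the idea, under stated assumptions: the notice
texts, the macro names (LICENSE_V1, LICENSE_V2) and the two helper
functions are hypothetical, and the macro name is returned separately to
stand in for "recorded out of band". Nothing like this exists in git; it
only shows the shape of the substitution.

```python
# Hypothetical notice texts; real ones would be the 800-1500 character headers.
NOTICES = {
    "LICENSE_V1": "/* Example license notice, version 1 */\n",
    "LICENSE_V2": "/* Example license notice, version 2 */\n",
}


def strip_notice(text):
    """Going into the repository: return (macro name or None, remaining body)."""
    for name, notice in NOTICES.items():
        # Only the very beginning of the file is a valid place for a notice.
        if text.startswith(notice):
            return name, text[len(notice):]
    return None, text


def restore_notice(name, body):
    """Coming out of the repository: put the recorded notice back in front."""
    return body if name is None else NOTICES[name] + body


source = NOTICES["LICENSE_V2"] + "int main(void) { return 0; }\n"
name, body = strip_notice(source)
print(name, repr(body))
print(restore_notice(name, body) == source)
```

Because the macro name lives outside the stored body, the file contents
never need escaping, and similarity is computed on the code alone.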

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:06                                                   ` Linus Torvalds
@ 2006-10-20 18:30                                                     ` Linus Torvalds
  2006-10-20 19:04                                                       ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 18:30 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Fri, 20 Oct 2006, Linus Torvalds wrote:
> 
> So yes, merges are the situation where renames are normally considered a 
> "problem", but it's actually not nearly the most every-day situation at 
> all.

Btw, this is a pet peeve of mine, and it is not at all restricted to 
the SCM world.

In CompSci in general, you see a _lot_ of papers about things that almost 
don't matter - not because the issues are that important in practice, but 
because the issues are something small enough to be something you can 
discuss and explain without having to delve into tons of ugly detail, and 
because it's something that has a lot of "mental masturbation" associated 
with it - ie you can discuss it endlessly.

In the OS world, it's things like schedulers. You find an _inordinate_ 
number of papers on scheduling, considering that the actual algorithm then 
tends to be something that can be expressed in a hundred lines of code or 
so, but it's got quite high "mental masturbatory value" (hereafter called 
MMV).

Other high-MMV areas are page-out algorithms (never mind that almost all 
_real_ VM problems are elsewhere) and some zero-copy schemes (never mind 
that if you actually need to _work_ with the data, zero-copy DMA may 
actually be much worse because it ends up having bad cache behaviour).

In the SCM world, file renames and merging seem to be the high-MMV things. 
Never mind that the real issues tend to be elsewhere (like _performance_ 
when you have a few thousand commits that you want to merge).

For example, in the kernel, I think about half of all merges are what git 
calls "trivial in-index merges". That's HALF. Being a trivial in-index 
merge means that there was not a single file-level conflict that even 
needed a three-way merge, much less any study of the history AT ALL (other 
than finding the common ancestor, of course).
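
For readers who have not looked at what such a merge involves, here is a
small Python sketch of the per-path decision. It is a simplification, not
git's code: trivial_merge() is a made-up name, and each tree is modelled
as a plain dict mapping path to blob ID.

```python
def trivial_merge(base, ours, theirs):
    """Resolve each path from three blob IDs alone; no history is consulted.

    Returns (merged tree, paths that would need a file-level 3-way merge).
    A missing path is treated as None, which also covers adds and deletes.
    """
    merged, needs_file_merge = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            pick = o          # both sides agree (or neither touched it)
        elif o == b:
            pick = t          # only their side changed the path
        elif t == b:
            pick = o          # only our side changed the path
        else:
            needs_file_merge.append(path)
            continue
        if pick is not None:  # None means the path ends up deleted
            merged[path] = pick
    return merged, needs_file_merge


base = {"Makefile": "m1", "a.c": "a1", "b.c": "b1"}
ours = {"Makefile": "m1", "a.c": "a2", "b.c": "b2"}
theirs = {"Makefile": "m1", "a.c": "a1", "b.c": "b3", "c.c": "c1"}
print(trivial_merge(base, ours, theirs))
```

When the second list comes back empty, the whole merge was decided by
comparing identifiers, which is why it can be so cheap.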

Of the rest, most by far need some trivial 3-way merging. And the ones 
that have trouble? In practice, that trivial and maligned 3-way does 
_better_ than anything more complicated.

Yet, if you actually bother to follow all the discussion on #revctrl and 
other places, what do you find discussed? Right: various high-MMV issues 
like "staircase merge" etc crap.

Go to revctrl.org for prime example of this. I think half the stuff is 
about merge algorithms, some of it is about glossary, and almost none of 
it is about something as pedestrian and simple as performance and 
scalability.

(Actually, to be honest, I think some of the #revctrl noise has become 
better lately. I'm not seeing quite as much theoretical discussion, it may 
be that as open-source distributed SCM's are getting to be more "real", 
people start to slowly realize that the masturbatory crap isn't actually 
what it's all about. So maybe at least this area is getting more about 
real every-day problems, and less about the theoretical-but-not-very- 
important issues).

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
@ 2006-10-20 18:35                                               ` Jakub Narebski
  2006-10-20 18:46                                                 ` Jakub Narebski
  2006-10-20 18:47                                               ` Jakub Narebski
                                                                 ` (3 subsequent siblings)
  4 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 18:35 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Aaron Bentley, bazaar-ng, git

Jan Hudec wrote:
> On Fri, Oct 20, 2006 at 06:21:34PM +0200, Jakub Narebski wrote:
> > Aaron Bentley wrote:
> > 
> > > === added directory  // file-id:TREE_ROOT
> > 
> > Gaaah, so rename detection in bzr is done using file-ids?
> > Linus will tell you the inherent problems with that "solution".
> 
> Ok, I tried to read
> http://permalink.gmane.org/gmane.comp.version-control.git/217
> 
> It's all nice and well, but my question is whether the below cases work
> in git. Yes, they are particular cases, but they are particularly
> important. If they don't, I'd rather have a file-id scheme, that is
> limited to just them, but handles them, than something with big plans,
> but nothing working.
> 
> Let's consider the following scenario:
> 
> (where A$ means working in branch A, B$ means working in branch B and
>  VCT stands for version control tool of choice)

1077:jnareb@roke:/tmp/jnareb> mkdir tmp
1078:jnareb@roke:/tmp/jnareb> cd tmp/
1079:jnareb@roke:/tmp/jnareb/tmp> git init-db
defaulting to local storage area

> A$ echo Hello Warld! > hello.txt
1081:jnareb@roke:/tmp/jnareb/tmp> echo 'Hello Warld!' > hello.txt

> A$ VCT add hello.txt
1082:jnareb@roke:/tmp/jnareb/tmp> git add hello.txt

> A$ VCT commit -m "Created greeting"
1083:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Created greeting"

(we are still on the default branch 'master' here; let us switch to A)
1084:jnareb@roke:/tmp/jnareb/tmp> git branch A
1088:jnareb@roke:/tmp/jnareb/tmp> git checkout A

> $ VCT branch A B
1085:jnareb@roke:/tmp/jnareb/tmp> git branch B A
(create branch B based on A)

> A$ VCT mkdir data
1089:jnareb@roke:/tmp/jnareb/tmp> mkdir data

> A$ VCT mv hello.txt data/
1090:jnareb@roke:/tmp/jnareb/tmp> git mv hello.txt data/

> A$ VCT commit -m "Moved hello.txt to data dir"
1092:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Moved hello.txt to data dir"

> B$ ed hello.txt
> ? 1s/Warld/World/
> ? wq
1094:jnareb@roke:/tmp/jnareb/tmp> ed hello.txt 
13
1s/Warld/World/
wq
13

> B$ VCT commit -m "Fixed typo in greeting"
1096:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Fixed typo in greeting"

> A$ VCT merge B
1097:jnareb@roke:/tmp/jnareb/tmp> git checkout A
1098:jnareb@roke:/tmp/jnareb/tmp> git pull . B
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with 9de7290d385ec2b0c2ade9b888f6c3a6633ac926
Merging: 
5f0eb04467538f0f1414af85ec6481150107c0b2 Moved hello.txt to data dir 
9de7290d385ec2b0c2ade9b888f6c3a6633ac926 Fixed typo in greeting 
found 1 common ancestor(s): 
f49a520e40143cb9d84b00e9728c5742897c0a22 Created greeting 

Merge made by recursive.
 data/hello.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

> At this point, I expect the tree to look like this:
> A$ ls -R
1099:jnareb@roke:/tmp/jnareb/tmp> ls -R
.:
data

./data:
hello.txt

> A$ cat data/hello.txt
1100:jnareb@roke:/tmp/jnareb/tmp> cat data/hello.txt 
Hello World!



> A$ VCT mv data greetings
1102:jnareb@roke:/tmp/jnareb/tmp> git mv data greetings

> A$ VCT commit -m "Renamed the data directory to greetings"
1105:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Renamed the data directory to greetings"

> B$ echo "Goodbye World!" > data/goodbye.txt
1106:jnareb@roke:/tmp/jnareb/tmp> git checkout B
1109:jnareb@roke:/tmp/jnareb/tmp> echo 'Goodbye World!' > data/goodbye.txt
bash: data/goodbye.txt: There is no such file or directory
1110:jnareb@roke:/tmp/jnareb/tmp> ls -R
.:
hello.txt

You need to revise your example.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:35                                               ` Jakub Narebski
@ 2006-10-20 18:46                                                 ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 18:46 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Aaron Bentley, bazaar-ng, git

Jakub Narebski wrote:
>> A$ VCT commit -m "Moved hello.txt to data dir"
> 1092:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Moved hello.txt to data dir"
> 
>> B$ ed hello.txt
>> ? 1s/Warld/World/
>> ? wq
Sorry, I forgot to include "git checkout B" in the email
to actually switch to branch B.

> 1094:jnareb@roke:/tmp/jnareb/tmp> ed hello.txt 
> 13
> 1s/Warld/World/
> wq
> 13

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 18:35                                               ` Jakub Narebski
@ 2006-10-20 18:47                                               ` Jakub Narebski
  2006-10-20 19:00                                                 ` Linus Torvalds
  2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
                                                                 ` (2 subsequent siblings)
  4 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 18:47 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git

Jan Hudec wrote:

> And note, that it is /not/ required to use file-ids to handle this.
> Darcs handles this just as well with its patch algebra
> (http://darcs.net/DarcsWiki/PatchTheory) without need of any IDs.

And Darcs is, from opinions I've read, dog-slow.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 18:35                                               ` Jakub Narebski
  2006-10-20 18:47                                               ` Jakub Narebski
@ 2006-10-20 18:48                                               ` Linus Torvalds
  2006-10-20 22:13                                                 ` Jeff Licquia
  2006-10-20 19:14                                               ` Jakub Narebski
  2006-10-20 22:59                                               ` Jeff King
  4 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 18:48 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Jan Hudec wrote:
>
> Let's consider the following scenario:

Here's a real-life scenario that we hit several times with BK over the
years:

 - take a real repository, and a patch that gets discussed that adds a new 
   file.
 - take two different people applying that patch to their trees (or, do 
   the equivalent thing, which is to just create the same filename
   independently, because the solution is obvious - and the same - to 
   both developers).
 - now, have somebody merge both of those two people's trees (eg me)
 - have the two people continue to use their trees, modifying them, and
   getting merged.

Trust me, this isn't even _unlikely_. It happens. And it's a serious 
problem for a file-ID case. Why? Because you have two different file ID's 
for the same pathname. 

(It happily only happened a handful of times, so it was never a big enough 
problem to cause me to think that BK was crap. But it definitely was a 
real issue).

What BK did (and what is likely the only reasonable thing to do) is to 
move one of the file-ID's to an "Attic" kind of place, and just go with 
the other. The nasty part is that now the developer whose file was 
"dropped" (and anybody who got the work from him) may still be continuing 
to work with _his_ copy of the same file, never even realizing that when 
his work gets merged, all his fixes GET THROWN AWAY!
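
The failure mode can be shown with a toy Python model. To be clear about
what is assumed: this is not how BK stored anything; merge_by_file_id(),
the "Attic/" prefix, the ID strings and the file contents are all
hypothetical, chosen only to show how a fix can land on a file nobody
builds from any more.

```python
def merge_by_file_id(ours, theirs):
    """Toy file-ID merge. Each tree maps file_id -> (path, content)."""
    merged = dict(ours)
    taken = {path for path, _ in ours.values()}
    for fid, (path, content) in theirs.items():
        if fid in merged:
            # Known ID: keep the path an earlier merge gave it, take new content.
            merged[fid] = (merged[fid][0], content)
            continue
        if path in taken:
            # Two different IDs claim one pathname: park the newcomer.
            path = "Attic/" + path
        merged[fid] = (path, content)
        taken.add(path)
    return merged


# Two developers apply the same patch independently: same path, different IDs.
alice = {"id-alice": ("fix.c", "v1")}
bob = {"id-bob": ("fix.c", "v1")}
mainline = merge_by_file_id(alice, bob)

# Bob keeps working on _his_ copy and gets merged again later.
bob = {"id-bob": ("fix.c", "v1 plus a fix")}
mainline = merge_by_file_id(mainline, bob)

for fid, (path, content) in sorted(mainline.items()):
    print(fid, path, repr(content))
```

After the second merge the live fix.c still holds "v1", while the fix
sits under Attic/fix.c.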

And trust me, this isn't a theoretical thing. This actually happens. So 
you have problems at many levels: you have the problems that happen during 
the merge (where somebody needs to decide how to resolve the file-ID 
clash), but what a lot of SCM people seem to not have understood is that 
the problem actually _remains_ after the merge, and causes problems even 
down the line.

So yeah, content-based merging has its own problems (especially if you do 
things like re-indent a file as you move it, or if you have files that 
just look the same because they share 99% of their content through a 
copyright message), but at least so far, we've not really ever hit that 
issue in the kernel.

And we are actually approaching the old kernel BK tree in size with the 
current git tree (we're about 2/3rds of the way if you count number of 
commits). That's despite the fact that we actually have been moving things 
around.  So from a purely _practical_ standpoint, I really do have 
anecdotal evidence that I'm right.

I didn't have that evidence when I started, but I knew I was right anyway ;)

		Linus

PS. It's undoubtedly true that the SCM you use impacts _how_ you do 
development, so any project will almost automatically align itself with 
whatever SCM rules there are in place.

So "anecdotal evidence" in that sense isn't really wonderful, since it 
obviously is always a matter of a certain project/SCM combination - but 
the above example is about as neutral as you can get, since it's the 
_same_ project, with the _same_ maintainer, and roughly the _same_ rules,
just two different approaches wrt renames of the SCM's in question.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:47                                               ` Jakub Narebski
@ 2006-10-20 19:00                                                 ` Linus Torvalds
  2006-10-20 19:10                                                   ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 19:00 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Jan Hudec, Aaron Bentley, bazaar-ng, git



On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Jan Hudec wrote:
> 
> > And note, that it is /not/ required to use file-ids to handle this.
> > Darcs handles this just as well with its patch algebra
> > (http://darcs.net/DarcsWiki/PatchTheory) without need of any IDs.
> 
> And Darcs is, from opinions I've read, dog-slow.

You really cannot expect to get any kind of performance at all unless you:

 - are able to ignore 99.9% of all files on merging (ie you have to be 
   able to totally ignore the files that are identical in both sides, and 
   you really shouldn't even _care_ about why they ended up being 
   identical)

 - are able to ignore 99% of what the commits _did_ in between the merges 
   (ie if you need to look at them at all, only look at the part that 
   matters for the 0.1% of files that you couldn't ignore)

If you have to parse all the commit details all the way down to the common 
parent, you're basically already screwed. There's no _way_ you can make it 
fast. 

Git goes one step further: it _really_ doesn't matter how you got to
a certain state. Absolutely _none_ of what the commits in between the
final stages and the common ancestor did matters in the least. The only
thing that matters is what the states at the end-points are.

(Of course, you _could_ plug in a merge algorithm that cares, since there 
is more data there. I'm just talking about the standard "recursive" 
algorithm here.)

That's why git can be so fast, but it's actually more important than that: 
the fact that it doesn't matter _how_ you got to a certain state is 
actually a huge and important feature. In other words, you should see it 
as a guarantee, not as a "lack of knowledge".

Darcs thinks it matters how you got somewhere. Git consciously says: none 
of the individual patches matter, the only thing that matters is the end 
result, because you could have gotten the same result in a lot of 
different ways, and nobody _cares_.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:30                                                     ` Linus Torvalds
@ 2006-10-20 19:04                                                       ` Aaron Bentley
  2006-10-20 19:31                                                         ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 19:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Linus Torvalds wrote:
> 
>>So yes, merges are the situation where renames are normally considered a 
>>"problem", but it's actually not nearly the most every-day situation at 
>>all.
> 
> 
> Btw, this is a pet peeve of mine, and it is not at all restricted to 
> the SCM world.

I guess I don't mind a bit of high-mmv discussion, so long as it doesn't
get in the way of real work.  Polishing these kinds of things seems to
fall in the category of 10% of functionality that takes 90% of effort.

> Of the rest, most by far need some trivial 3-way merging. And the ones 
> that have trouble? In practice, that trivial and maligned 3-way does 
> _better_ than anything more complicated.

I think the great motivator for exploring other merge algorithms has
been criss-cross merge.  There are some workflows (e.g. the Launchpad
workflow) in which heavy mesh-merging takes place, leading to frequent
criss-crosses.

Bog-standard three-way doesn't handle that criss-cross very well.  I
understand git uses recursive three-way in that situation.

The other motivator has been cherry-picking.

So I'm happy that people are trying to devise merge algorithms that are
better than three-way.  When someone gets it right, we'll implement it.

And then there are other more incremental tweaks, like
merge-across-indent and merge-across-line-ending-change that I'd like to
see.

> Go to revctrl.org for prime example of this. I think half the stuff is 
> about merge algorithms, some of it is about glossary, and almost none of 
> it is about something as pedestrian and simple as performance and 
> scalability.

Partly this is because of Bram's interests.  AIUI, he started with a
merge algorithm and built a VCS around it.

> (Actually, to be honest, I think some of the #revctrl noise has become 
> better lately.

I used to spend time on #revctrl, but I think that was before you
started visiting.  Too bad I missed ya.

> So maybe at least this area is getting more about
> real every-day problems, and less about the theoretical-but-not-very- 
> important issues).

It wouldn't surprise me if the early phases of VCS development tended
toward more theoretical discussion, just because so many questions are open.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOR3D0F+nu1YWqI0RAo5lAJ99+5ShvLXaVIRG1A8XN7HRicoPngCeLO+y
meMZVcjdX7AX9JCfhSN5uK4=
=AI8p
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:00                                                 ` Linus Torvalds
@ 2006-10-20 19:10                                                   ` Aaron Bentley
  2006-10-20 19:46                                                     ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 19:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Jan Hudec, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> Git goes one step further: it _really_ doesn't matter about how you got to 
> a certain state. Absolutely _none_ of what the commits in between the 
> final stages and the common ancestor matter in the least. The only thing 
> that matters is what the states at the end-point are.

That's interesting, because I've always thought one of the strengths of
file-ids was that you only had to worry about end-points, not how you
got there.

How do you handle renames without looking at the history?

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOR8c0F+nu1YWqI0RAkhJAJ9QJ3nyP/437/bNPI3VEVHZP0dEZACfZyEg
SWAp+673iTDEZfH00M4RG4k=
=1XO+
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
                                                                 ` (2 preceding siblings ...)
  2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
@ 2006-10-20 19:14                                               ` Jakub Narebski
  2006-10-20 22:59                                               ` Jeff King
  4 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 19:14 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git

Jan Hudec wrote:

> Let's consider the following scenario:
> 
> (where A$ means working in branch A, B$ means working in branch B and
>  VCT stands for version control tool of choice)
[...]
> At this point, I expect the tree to look like this:
> A$ ls -R
> .:
> data/
> data:
> hello.txt
> A$ cat data/hello.txt
> Hello World!
[...]
> Oh, and there is one more complicated case, that I also require to work
> and that works in Bzr, but did not work in Arch:
> 
> ...let's start with the tree at the end of previous example...
> 
> A$ VCT mv data greetings
> A$ VCT commit -m "Renamed the data directory to greetings"
> B$ echo "Goodbye World!" > data/goodbye.txt
> B$ VCT add data/goodbye.txt
> B$ VCT commit -m "Added goodbye message."
> A$ VCT merge B

(slightly corrected example).

A$ git branch B
A$ git mv data greetings
A$ git commit -a -m "Renamed the data directory to greetings"
A$ git checkout B
B$ echo 'Goodbye World!' > data/goodbye.txt
B$ git add data/goodbye.txt
B$ git commit -a -m "Added goodbye message."
B$ git checkout A
A$ git pull . B
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with 4a8a1a7941f214c6173786b583830b4f74a67c1f
Merging: 
96738390ba0b4de5b234059081701badc1c86693 Renamed the data directory to greetings 
4a8a1a7941f214c6173786b583830b4f74a67c1f Added goodbye message. 
found 1 common ancestor(s): 
7cfd8edd06b7cb016856737d8fd98d5d096955b5 Merge branch 'B' into A 

Merge made by recursive.
 data/goodbye.txt |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 data/goodbye.txt

> And now I expect to have tree looking like this:
> 
> A$ ls -R
> .:
> greetings/
> greetings:
> hello.txt
> goodbye.txt

So git _fails_ (your expectations) in this case:
A$ ls -R
.:
data  greetings

./data:
goodbye.txt

./greetings:
hello.txt

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Signed git-tag doesn't find default key
  2006-10-20 16:32 ` Linus Torvalds
@ 2006-10-20 19:21   ` Andy Parkins
  2006-10-21  0:52     ` Horst H. von Brand
  0 siblings, 1 reply; 1752+ messages in thread
From: Andy Parkins @ 2006-10-20 19:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1833 bytes --]

On Friday 2006, October 20 17:32, Linus Torvalds wrote:

> and then do an "adduid", and then add your UID _without_ the "(Google)" in
> there, and that should solve all your problems.

Yeah, obviously that's one way; and while it doesn't really matter to me, it 
seems poor form that git doesn't work with gpg as it is.  While one could of 
course use the "-u" switch, if that is the answer, then why bother with 
having the "-s" switch at all?

> You're probably better off with something like
>
> 	git var GIT_COMMITTER_IDENT | sed 's/\(.*\)<\(.*\)>\(.*\)/\2/'

I've actually settled on:

: ${username:=$(expr "z$tagger" : 'z.*<\(.*\)>')}

In git-tag.sh.
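
For anyone who finds the expr idiom hard to read, here is the same
extraction as a Python sketch. It assumes the ident string has the form
"Name <email> timestamp timezone"; email_from_ident() is a made-up helper
name and the timestamp below is an invented example value.

```python
import re


def email_from_ident(ident):
    """Return the text between '<' and the last '>', or None if there is none."""
    # Like the expr pattern, the match is anchored at the start and greedy.
    match = re.match(r".*<(.*)>", ident)
    return match.group(1) if match else None


ident = "Andy Parkins <andyparkins@gmail.com> 1161372060 +0100"
print(email_from_ident(ident))
```

Searching on that address alone avoids depending on how the name or the
comment field was written in the gpg UID.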

> That said, I've never understood why gpg matches on the comment field.
> Dammit, it _should_ find the key anyway. Stupid program.

I think it's doing the right thing unfortunately.  If you search on any part
 "Andy Parkins"
 "<andyparkins@gmail.com>"
 "andyparkins@gmail.com"
 "andyparkins"
It finds it fine; the only thing it doesn't find is
 "Andy Parkins <andyparkins@gmail.com>"
Which I suppose is fair enough, as it's a fairly specific format to be 
searching for.

I'm going to advocate my change of only searching on the email address for 
finding the key - there shouldn't be two keys with the same email address 
anyway, so there shouldn't be a danger of ambiguity of key.  Also, it deals 
with the case when someone has entered a different name in git and in their 
gpg UID.  For example, I would think it shouldn't be a problem that I like to 
be called "Andy" on the git list, and yet want my key to say "A. D. 
Parkins", "Andrew Parkins" or "Sparky McFly". 

Now, I think I've written my name far, far too many times in this email.


Sparky McFly
-- 
Dr Andrew Parkins, M Eng (Hons), AMIEE
andyparkins@gmail.com



* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:04                                                       ` Aaron Bentley
@ 2006-10-20 19:31                                                         ` Linus Torvalds
  2006-10-20 20:12                                                           ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 19:31 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> > 
> > Btw, this is a pet peeve of mine, and it is not at all restricted to 
> > the SCM world.
> 
> I guess I don't mind a bit of high-mmv discussion, so long as it doesn't
> get in the way of real work.  Polishing these kinds of things seems to
> fall in the category of 10% of functionality that takes 90% of effort.

Well, the thing is, that 10% of the functionality usually takes a whole 
lot _less_ than 10% of the work.

The stuff you can think through (and argue about) tends to be the easy 
stuff. Exactly because you _can_ think about it abstractly.

The stuff that is actually really hard and time-consuming is the stuff 
that you find out in practice, and you have to iterate on.

In kernels, for example, it seems like 99% of the effort ends up being 
hardware-dependent stuff. Getting architecture interfaces right, and 
getting working drivers. Hotplugging and device management turns out to be 
a _much_ bigger issue than schedulers or VM page-out has _ever_ been. 

But show me a single paper about them. I'm sure they exist. I'm just 
saying that they're sure as heck not getting 99% of the attention (or even 
1% of the attention) in discussions, even though they're definitely 99% of 
the real everyday work and effort.

(Maybe it's not 99%. Numbers taken out of my nether regions. The point 
should be clear).

The same is actually true of SCM's too, I'm totally convinced. At least in 
git, we really haven't spent _that_ much time on merges, for example. My 
original stupid three-way merge was really simple, and I think the way I 
introduced "stages" into the git index was really clever, but it was still 
a small detail. And it worked surprisingly well.

After that merge, people improved it. And "recursive" is a _huge_ 
improvement, don't get me wrong: it's still entirely a 3-way merge on the 
file contents, but it now does those 3-way merges in several stages if 
there are multiple independent common parents, and the rename logic is 
clearly important.

But if you actually look at how much effort was spent on merging, and how 
much was spent on just "details in general", I think you'll find merging 
to be pretty low down the list, even though the recursive merge ended up 
_also_ getting re-written in C. Perhaps it was one of the bigger 
_individual_ efforts, but compared to all the work we've continually done 
on performance and usability, for example, it's been pretty small in the 
end.

As an example: I suspect that in git just the CVS importer has gotten 
_way_ more attention than merging ever got. Importing from CVS is simply a 
much harder problem in practice, and we've probably had more people 
working on it (and that's _despite_ the fact that this is one of the areas 
where git has successfully re-used other projects that had similar goals: 
cvsps, cvs2svn etc). It's hard to "think" about, because a lot of the 
problems with importing from CVS are literally all about the details and 
the nasty crud. I really think "merging" is _way_ easier.

			Linus


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:10                                                   ` Aaron Bentley
@ 2006-10-20 19:46                                                     ` Linus Torvalds
  2006-10-20 20:29                                                       ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 19:46 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Jan Hudec, bazaar-ng, git



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
> Linus Torvalds wrote:
> > Git goes one step further: it _really_ doesn't matter about how you got to 
> > a certain state. Absolutely _none_ of what the commits in between the 
> > final stages and the common ancestor matter in the least. The only thing 
> > that matters is what the states at the end-point are.
> 
> That's interesting, because I've always thought one of the strengths of
> file-ids was that you only had to worry about end-points, not how you
> got there.
> 
> How do you handle renames without looking at the history?

You first handle all the non-renames that just merge on their own. That 
takes care of 99.99% of the stuff (and I'm not exaggerating: in the 
kernel, you have ~21000 files, and most merges don't have a single rename 
to worry about - and even when you do have them, they tend to be in the 
"you can count them on one hand" kind of situation).

Then you just look at all the pathnames you _couldn't_ resolve, and that's 
usually cut down the thing to something where you can literally use a lot 
of CPU power per file, because now you only have a small number of 
candidates left.

If you were to use one hundredth of a second per file regardless of file, 
a stupid per-file merge would take 210 seconds, which is just 
unacceptable. So you really don't want to do that. You want to merge whole 
subdirectories in one go (and with git, you can: since the SHA1 of a 
directory defines _all_ of the contents under it, if the two branches you 
merge have an identical subdirectory, you don't need to do anything at 
_all_ about that one. See?).
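
A minimal sketch of that idea in Python, under loud assumptions: trees
are modelled as nested dicts, the hashing scheme only imitates the
"directory id covers everything below it" property and is not git's real
object format, and the names build and changed_paths are invented for
the example.

```python
import hashlib

def build(node):
    """Turn nested dicts/bytes into (object_id, kind, children) tuples."""
    if isinstance(node, bytes):
        return (hashlib.sha1(b"blob\0" + node).hexdigest(), "blob", None)
    children = {name: build(child) for name, child in sorted(node.items())}
    digest = hashlib.sha1(b"tree\0")
    for name, (oid, kind, _) in children.items():
        # The tree id depends on every child id, so it covers the whole subtree.
        digest.update(f"{kind} {name} {oid}\n".encode())
    return (digest.hexdigest(), "tree", children)

def changed_paths(a, b, path=""):
    """List differing paths without ever descending into identical subtrees."""
    if a is not None and b is not None and a[0] == b[0]:
        return []  # same id: nothing below this point can differ
    a_is_tree = a is not None and a[1] == "tree"
    b_is_tree = b is not None and b[1] == "tree"
    if not a_is_tree and not b_is_tree:
        return [path]  # a changed, added or removed file
    # Simplification: a file replaced by a directory shows up only through
    # the directory's entries.
    a_kids = a[2] if a_is_tree else {}
    b_kids = b[2] if b_is_tree else {}
    out = []
    for name in sorted(set(a_kids) | set(b_kids)):
        child_path = f"{path}/{name}" if path else name
        out += changed_paths(a_kids.get(name), b_kids.get(name), child_path)
    return out

ours = build({"drivers": {"a.c": b"1", "b.c": b"2"}, "kernel": {"sched.c": b"x"}})
theirs = build({"drivers": {"a.c": b"1", "b.c": b"2"}, "kernel": {"sched.c": b"y"}})
print(changed_paths(ours, theirs))
```

In the example, "drivers" is dismissed after a single id comparison, no
matter how many files it holds; only "kernel" is opened.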

So instead of trying to be really fast on individual files and doing them 
one at a time, git makes individual files basically totally free (you 
literally often don't need to look at them AT ALL). And then for the few 
files you can't resolve, you can afford to spend more time.

So say that you spend one second per file-pair because you do complex 
heuristics etc - you'll still have a merge that is a _lot_ faster than 
your 210-second one.
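
The arithmetic behind that comparison, spelled out.  The count of five
unresolved pairs is an assumption standing in for "you can count them on
one hand"; the other figures are the ones quoted above.

```python
FILES = 21000        # approximate kernel file count quoted above
PER_FILE_MS = 10     # one hundredth of a second per file
UNRESOLVED = 5       # assumed number of file pairs left over
PER_PAIR_MS = 1000   # one second of heuristics per unresolved pair

naive_ms = FILES * PER_FILE_MS          # touch every file
targeted_ms = UNRESOLVED * PER_PAIR_MS  # spend time only on the leftovers
print(naive_ms // 1000, "seconds versus", targeted_ms // 1000, "seconds")
```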

So recursive basically generates the matrix of similarity for the 
new/deleted files, and tries to match them up, and there you have your 
renames - without ever looking at the history of how you ended up where 
you are.
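
As a rough sketch of what "matrix of similarity" means here (not the
code in merge-recursive): score every deleted path against every added
path and keep the best pairs above a cut-off.  difflib's SequenceMatcher
is only a stand-in for git's own similarity estimate, and the function
name detect_renames and the 0.5 threshold are assumptions for the
example.

```python
from difflib import SequenceMatcher

def detect_renames(deleted, added, threshold=0.5):
    """Pair deleted paths with added paths by content similarity.

    `deleted` and `added` map path -> file text. Returns {old_path: new_path}.
    """
    scores = []
    for old, old_text in deleted.items():
        for new, new_text in added.items():
            ratio = SequenceMatcher(None, old_text, new_text).ratio()
            if ratio >= threshold:
                scores.append((ratio, old, new))
    renames, used = {}, set()
    # Greedy matching: best-scoring pairs first, each path used at most once.
    for ratio, old, new in sorted(scores, reverse=True):
        if old not in renames and new not in used:
            renames[old] = new
            used.add(new)
    return renames

deleted = {"hello.txt": "hello world\nline two\nline three\n"}
added = {
    "greetings/hello.txt": "hello world\nline two\nline three!\n",
    "data/notes.txt": "completely unrelated 12345\n",
}
print(detect_renames(deleted, added))
```

Only the end states go in: the set of paths that vanished and the set
that appeared.  No commit in between is consulted.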

Btw, that "210 second" merge is not at all unlikely. Some of the SCM's 
seem to scale much worse than that to big archives, and I've heard people 
talk about merges that took 20 minutes or more. In contrast, git doing a 
merge in ~2-3 seconds for the kernel is _normal_.

[ In fact, I just re-tested doing my last kernel merge: it took 0.970 
  seconds, and that was _including_ the diffstat of the result - though
  obviously not including the time to fetch the other branch over the
  network.

  I don't know if people appreciate how good it is to do a merge of two 
  21000-file branches in less than a second. It didn't have any renames, 
  and it only had a single well-defined common parent, but not only is 
  that the common case, being that fast for the simple case is what 
  _allows_ you to do well on the complex cases too, because it's what gets 
  rid of all the files you should _not_ worry about ]

Performance does matter. 

			Linus


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:31                                                         ` Linus Torvalds
@ 2006-10-20 20:12                                                           ` Aaron Bentley
  0 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 20:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
>>>Btw, this is a pet peeve of mine, and it is not at all restricted to 
>>>the SCM world.
>>
>>I guess I don't mind a bit of high-mmv discussion, so long as it doesn't
>>get in the way of real work.  Polishing these kinds of things seems to
>>fall in the category of 10% of functionality that takes 90% of effort.
> 
> 
> Well, the thing is, that 10% of the functionality usually takes a whole 
> lot _less_ than 10% of the work.

I guess this depends on whether you consider the brainstorming and
discussion to be part of the work of polishing, and I do mean polishing.
 Getting from something that works 90% of the time to something that
works 99% of the time can be a questionable expenditure of time and effort.

> The same is actually true of SCM's too, I'm totally convinced. At least in 
> git, we really haven't spent _that_ much time on merges, for example. My 
> original stupid three-way merge was really simple, and I think the way I 
> introduced "stages" into the git index was really clever, but it was still 
> a small detail. And it worked surprisingly well.

I did rewrite our merge code once, but that was because the API was
quite hard to deal with and made it hard to maintain.  I agree that it's
important to focus effort on the areas that make a difference.

On the other hand, our "exotic" text merge algorithms have been praised
by the people who work on Launchpad.  So that's a win.

> As an example: I suspect that in git just the CVS importer has gotten 
> _way_ more attention than merging ever got. Importing from CVS is simply a 
> much harder problem in practice, and we've probably had more people 
> working on it (and that's _despite_ the fact that this is one of the areas 
> where git has successfully re-used other projects that had similar goals: 
> cvsps, cvs2svn etc). It's hard to "think" about, because a lot of the 
> problems with importing from CVS are literally all about the details and 
> the nasty crud. I really think "merging" is _way_ easier.

Yes, I don't even want to think about CVS when I don't have to.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOS2Y0F+nu1YWqI0RAiOcAJ0TXmBdiCcvnTzmg+nnF+kayJ25cgCggMFx
w6xFlFHwPoNm9dt/T4LnmCU=
=zNuy
-----END PGP SIGNATURE-----


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:59                                                   ` Linus Torvalds
@ 2006-10-20 20:17                                                     ` Junio C Hamano
  2006-10-20 20:40                                                       ` Jakub Narebski
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-20 20:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> ...  We're starting to see 
> git actually being able to track file content moving between files: even 
> when the files themselves didn't move (ie Junio's "git pickaxe" work could 
> do things like that).

I've reordered the git-pickaxe series I parked in "pu" during the
1.4.3-rc cycle and merged it into "next".

The earlier one I was futzing with in "pu" had built-in
heuristics and pure mechanisms mixed together in the same patch,
which was quite bad as development history.  I think the
reordered sequence shows the logical evolution better.

  1. git-pickaxe: blame rewritten.

     This implements the infrastructure (parent traversal,
     identifying "corresponding path" in the parent -- aka
     "handling renames", passing blames to the parents and
     taking responsibility for the remainder) and uses the
     same old "single diff with parent file identifies what we
     inherited from the parent" logic git-blame uses for passing
     blames.

  2. git-pickaxe -M: blame line movements within a file.

     This adds logic to find swapped groups of lines in the same
     file.  When the file in the parent had A and B and the child
     has B and A, "single diff with parent" would find only one
     of A or B is inherited from the parent, not both.  This
     re-diffs the remainder with the parent's file to find both.

     I used to have heuristics to avoid trivial groups of lines
     from being subject to this step, but in this version they
     have been removed, so that we can see the core logic and
     need for heuristics more clearly.

     On the other hand, the version I used to have in "pu" gave
     blame to the first match.  This one tries to find the best
     match and assign the blame to it.

  3. git-pickaxe -C: blame cut-and-pasted lines.

     This adds logic to find groups of lines brought in from
     existing file in the parent.  We scan the remainder using
     the same logic as -M detection, but it is done against
     other files in the parent.

     There was a heuristic that gave the blame to the parent
     right then and there when we find a copy-and-paste instead
     of allowing the parent to pass blame further on to its
     ancestors; again I removed this heuristic in the reordered
     series.
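
To make step 2 concrete, here is a small Python sketch of the two-pass
idea.  It is an illustration under assumptions, not the git-pickaxe
code: difflib's SequenceMatcher stands in for the real diff, there is no
"best match" scoring, and the names blame_with_moves and _runs are
invented for the example.

```python
from difflib import SequenceMatcher

def blame_with_moves(parent, child):
    """Return the set of child line indexes that can be passed to the parent."""
    inherited = set()
    # Pass 1: a single diff finds lines kept in the same relative order.
    first = SequenceMatcher(None, parent, child, autojunk=False)
    for block in first.get_matching_blocks():
        inherited.update(range(block.b, block.b + block.size))
    # Pass 2: re-diff each leftover run of child lines against the whole parent.
    leftover = [i for i in range(len(child)) if i not in inherited]
    for run in _runs(leftover):
        chunk = [child[i] for i in run]
        again = SequenceMatcher(None, parent, chunk, autojunk=False)
        for block in again.get_matching_blocks():
            inherited.update(run[block.b + k] for k in range(block.size))
    return inherited

def _runs(indexes):
    """Split sorted indexes into lists of consecutive values."""
    runs = []
    for i in indexes:
        if runs and runs[-1][-1] == i - 1:
            runs[-1].append(i)
        else:
            runs.append([i])
    return runs

parent = ["a1", "a2", "b1", "b2"]          # groups A and B
child = ["b1", "b2", "a1", "a2", "new"]    # B and A swapped, one new line
print(sorted(blame_with_moves(parent, child)))
```

The first pass can claim only one of the two swapped groups; the second
pass recovers the other, and only the genuinely new line stays with the
child.  Step 3 would run the same second pass against other files in the
parent instead of the same file.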

The next logical step is to come up with a good set of
heuristics to avoid excessive nonsense matches the code
currently gives.

Groups of a small number of empty lines, lines with indentation
blanks followed by a closing brace, and '#include' lines that
include common header files occur so commonly that without any
heuristics (which can be seen in the "next" branch today) the
algorithm would give surprisingly idiotic results.  For example:

	git -p pickaxe -C -f -n v1.4.3 -- commit.c

tells you that the first line of commit.c in v1.4.3 release,
which is '#include "cache.h"' came from the first line of
receive-pack.c which is total nonsense (this particular line
could actually be a bug in the -M or -C logic -- I need to
check).

A less "obviously wrong" but still idiotic case is that we find
ll.409-411 came from ll.94-96 of describe.c in commit 908e5310.
These three lines read as:

	409		}
	410	}
	411

While this blame assignment might be technically correct, it
does not add much value to pass blames on in such a case.
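
One possible shape for such a heuristic, sketched in Python purely as an
illustration: score a chunk by how many alphanumeric characters it
contains and refuse to pass blame for chunks below a cut-off.  Both the
scoring rule and the cut-off of 20 are assumptions for this sketch, and
the function names are hypothetical.

```python
def chunk_score(lines):
    """Count alphanumeric characters: a crude measure of how distinctive a chunk is."""
    return sum(ch.isalnum() for line in lines for ch in line)

def worth_blaming(lines, min_score=20):
    """Hypothetical cut-off: only pass blame for chunks scoring at least min_score."""
    return chunk_score(lines) >= min_score

print(chunk_score(["\t\t}", "}", ""]))          # closing braces and a blank line
print(worth_blaming(['#include "cache.h"']))    # a common header include
```

With this rule the three-line chunk of closing braces scores zero, and a
lone '#include "cache.h"' also falls below the cut-off, while the opening
of a real function easily clears it.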

On the brighter side, we find that ll.415-419 (the beginning of
function "static int get_one_line()") originally came from
diff-tree.c (commit cee99d22, ll.275-279).


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
  2006-10-20 17:58                                                   ` David Lang
  2006-10-20 18:15                                                   ` Jon Smirl
@ 2006-10-20 20:23                                                   ` Petr Baudis
  2006-10-20 20:49                                                     ` David Lang
  2006-10-20 20:53                                                   ` Shawn Pearce
  3 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 20:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Aaron Bentley, Jakub Narebski, bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 07:48:58PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> So yeah, I've seen a few strange cases myself, but they've actually been 
> interesting. Like seeing how much of a file was just a copyright license, 
> and then a file being considered a "copy" just because it didn't actually 
> introduce any real new code.

Well it's certainly "interesting" and fun to see, but is it equally fun
to handle mismerges caused by a broken detection?

I've talked to some people who really didn't mind (or even liked) Git's
heuristics when it came to _inspecting_ movement of content, but were
really nervous about merge following such heuristics.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:46                                                     ` Linus Torvalds
@ 2006-10-20 20:29                                                       ` Aaron Bentley
  2006-10-20 20:57                                                         ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 20:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Jan Hudec, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
>>Linus Torvalds wrote:
>>
>>>Git goes one step further: it _really_ doesn't matter about how you got to 
>>>a certain state. Absolutely _none_ of what the commits in between the 
>>>final stages and the common ancestor matter in the least. The only thing 
>>>that matters is what the states at the end-point are.
>>
>>That's interesting, because I've always thought one of the strengths of
>>file-ids was that you only had to worry about end-points, not how you
>>got there.
>>
>>How do you handle renames without looking at the history?
> 
> 
> You first handle all the non-renames that just merge on their own.
> If you were to use one hundredth of a second per file regardless of file, 
> a stupid per-file merge would take 210 seconds, which is just 
> unacceptable. So you really don't want to do that.

Agreed.  We start by comparing BASE and OTHER, so all those comparisons
are in-memory operations that don't hit disk.  Only for files where BASE
and OTHER differ do we even examine the THIS version.

We can do a do-nothing kernel merge in < 20 seconds, and that's
comparing every single file in the tree.  In Python.  I was aiming for
less than 10 seconds, but didn't quite hit it.

> So recursive basically generates the matrix of similarity for the 
> new/deleted files, and tries to match them up, and there you have your 
> renames - without ever looking at the history of how you ended up where 
> you are.

So in the simple case, you compare unmatched THIS, OTHER and BASE files
to find the renames?

>   I don't know if people appreciate how good it is to do a merge of two 
>   21000-file branches in less than a second. It didn't have any renames, 
>   and it only had a single well-defined common parent, but not only is 
>   that the common case, being that fast for the simple case is what 
>   _allows_ you to do well on the complex cases too, because it's what gets 
>   rid of all the files you should _not_ worry about ]

Well, I certainly appreciate that.  I've never worried about the speed
of text merge algorithms, because you rarely merge very many files.  The
key is making the tree merge fast.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOTGN0F+nu1YWqI0RAii+AJ0eduC3bYya5Ao8vm1EpBb38tJP4ACeJRYe
9/D+ahDRJa87NTryc7j3C+U=
=plWA
-----END PGP SIGNATURE-----


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:17                                                     ` Junio C Hamano
@ 2006-10-20 20:40                                                       ` Jakub Narebski
  2006-10-20 22:41                                                       ` [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring Junio C Hamano
  2006-10-20 22:41                                                       ` [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks Junio C Hamano
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 20:40 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

>   2. git-pickaxe -M: blame line movements within a file.
> 
>      This adds logic to find swapped groups of lines in the same
>      file.  When the file in the parent had A and B and the child
>      has B and A, "single diff with parent" would find only one
>      of A or B is inherited from the parent, not both.  This
>      re-diffs the remainder with the parent's file to find both.
> 
>      I used to have heuristics to avoid trivial groups of lines
>      from being subject to this step, but in this version they
>      have been removed, so that we can see the core logic and
>      need for heuristics more clearly.
> 
>      On the other hand, the version I used to have in "pu" gave
>      blame to the first match.  This one tries to find the best
>      match and assign the blame to it.
> 
>   3. git-pickaxe -C: blame cut-and-pasted lines.
> 
>      This adds logic to find groups of lines brought in from
>      existing file in the parent.  We scan the remainder using
>      the same logic as -M detection, but it is done against
>      other files in the parent.
> 
>      There was a heuristic that gave the blame to the parent
>      right then and there when we find a copy-and-paste instead
>      of allowing the parent to pass blame further on to its
>      ancestors; again I removed this heuristics in the reordered
>      series.

The names of options clash somewhat with -M and -C in diffcore,
which detect contents 'M'oving (renaming files), and contents
'C'opying (copying files), where in git-pickaxe -C is still about
code movement, only across files (-M -M or --MM?).

Would git-pickaxe try to do also copy-and-paste within the file,
and across files?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:23                                                   ` Petr Baudis
@ 2006-10-20 20:49                                                     ` David Lang
  2006-10-20 20:53                                                       ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: David Lang @ 2006-10-20 20:49 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Shawn Pearce, Aaron Bentley, Jakub Narebski,
	bazaar-ng, git

On Fri, 20 Oct 2006, Petr Baudis wrote:

> 
> Dear diary, on Fri, Oct 20, 2006 at 07:48:58PM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
>> So yeah, I've seen a few strange cases myself, but they've actually been
>> interesting. Like seeing how much of a file was just a copyright license,
>> and then a file being considered a "copy" just because it didn't actually
>> introduce any real new code.
>
> Well it's certainly "interesting" and fun to see, but is it equally fun
> to handle mismerges caused by a broken detection?
>
> I've talked to some people who really didn't mind (or even liked) Git's
> heuristics when it came to _inspecting_ movement of content, but were
> really nervous about merge following such heuristics.

remember, git only stores the results. so when you are merging it doesn't even 
look for renames.

the only time you get renames is after-the-fact when you ask git for a report 
about what changed. then (if you enable rename detection) it will tell you what 
files have changed, and what files look like they may have been renames 
(possibly with changes). but if you don't ask git to look for renames it won't 
bother and you can just ignore the concept entirely.

or if you only want complete renames (as opposed to rename + change) then use 
the option to tell it that you don't want to consider it a rename unless it's 
100% the same (or 99%, or whatever satisfies you)

David Lang


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
                                                                     ` (2 preceding siblings ...)
  2006-10-20 20:23                                                   ` Petr Baudis
@ 2006-10-20 20:53                                                   ` Shawn Pearce
  3 siblings, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-20 20:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski

Linus Torvalds <torvalds@osdl.org> wrote:
> On Fri, 20 Oct 2006, Shawn Pearce wrote:
> > 
> > I renamed hundreds of small files in one shot and also did a few
> > hundred adds and deletes of other small XML files.  Git generated
> > a lot of those unrelated adds/deletes as rename/modifies, as their
> > content was very similar.  Some people involved in the project
> > freaked as the files actually had nothing in common with one
> > another... except for a lot of XML elements (as they shared the
> > same DTD).
> 
> Heh. We can probably tweak the heuristics (one of the _great_ things about 
> content detection is that you can fix it after the fact, unlike the 
> alternative).
> 
> That said, I've personally actually found the content-based similarity 
> analysis to often be quite informative, even when (and perhaps 
> _especially_ when) it ended up showing something that the actual author of 
> the thing didn't intend.
> 
> So yeah, I've seen a few strange cases myself, but they've actually been 
> interesting. Like seeing how much of a file was just a copyright license, 
> and then a file being considered a "copy" just because it didn't actually 
> introduce any real new code.

Aside from that one strange case I just mentioned I've always seen
the strategy to work very well.  It's never done something I didn't
expect and I've never seen copies or renames that I didn't expect to see,
knowing what the author of the change did.

So even though I had a little bit of trouble with that rename
situation above I'm _very_ happy with the way Git handles renames.

And the truth is that case above really was quite correct: XML is
very verbose.  When 70% of the file is just required XML to frame
the other 30% of the file's payload it's not surprising that files
are considered to be similar when they only differ by a little bit
of payload.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:49                                                     ` David Lang
@ 2006-10-20 20:53                                                       ` Petr Baudis
  2006-10-20 20:55                                                         ` David Lang
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 20:53 UTC (permalink / raw)
  To: David Lang; +Cc: bazaar-ng, Linus Torvalds, Shawn Pearce, git, Jakub Narebski

Dear diary, on Fri, Oct 20, 2006 at 10:49:53PM CEST, I got a letter
where David Lang <dlang@digitalinsight.com> said that...
> On Fri, 20 Oct 2006, Petr Baudis wrote:
> 
> >
> >Dear diary, on Fri, Oct 20, 2006 at 07:48:58PM CEST, I got a letter
> >where Linus Torvalds <torvalds@osdl.org> said that...
> >>So yeah, I've seen a few strange cases myself, but they've actually been
> >>interesting. Like seeing how much of a file was just a copyright license,
> >>and then a file being considered a "copy" just because it didn't actually
> >>introduce any real new code.
> >
> >Well it's certainly "interesting" and fun to see, but is it equally fun
> >to handle mismerges caused by a broken detection?
> >
> >I've talked to some people who really didn't mind (or even liked) Git's
> >heuristics when it came to _inspecting_ movement of content, but were
> >really nervous about merge following such heuristics.
> 
> remember, git only stores the results. so when you are merging it doesn't 
> even look for renames.

Of course it does look for renames; when you use the recursive strategy,
it will try to merge across renames.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:53                                                       ` Petr Baudis
@ 2006-10-20 20:55                                                         ` David Lang
  0 siblings, 0 replies; 1752+ messages in thread
From: David Lang @ 2006-10-20 20:55 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Shawn Pearce, Aaron Bentley, Jakub Narebski,
	bazaar-ng, git

On Fri, 20 Oct 2006, Petr Baudis wrote:

>>> I've talked to some people who really didn't mind (or even liked) Git's
>>> heuristics when it came to _inspecting_ movement of content, but were
>>> really nervous about merge following such heuristics.
>>
>> remember, git only stores the results. so when you are merging it doesn't
>> even look for renames.
>
> Of course it does look for renames; when you use the recursive strategy,
> it will try to merge across renames.

sorry, missed that.

David Lang


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:29                                                       ` Aaron Bentley
@ 2006-10-20 20:57                                                         ` Linus Torvalds
  2006-10-21  2:03                                                           ` git-merge-recursive, was " Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 20:57 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, Jan Hudec, Git Mailing List, Jakub Narebski



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
> Agreed.  We start by comparing BASE and OTHER, so all those comparisons
> are in-memory operations that don't hit disk.  Only for files where BASE
> and OTHER differ do we even examine the THIS version.

Git just slurps in all three trees. I actually think that the current 
merge-recursive.c does it the stupid way (ie it expands all trees 
recursively, regardless of whether it's needed or not), but I should 
really check with Dscho, since I had nothing to do with that code.

I wrote a tree-level merger that avoided doing the recursive tree reading 
when the tree-SHA1's matched entirely, and re-doing the latest merge using 
that took all of 0.037s, because it didn't recursively expand any of the 
uninteresting trees.

But the default recursive merge was ported from the python script that 
did it a full tree at a time, so it's comparatively "slow". But it's fast 
enough (witness the under-1s time ;) that I think the motivation to be 
smarter about reading the trees was basically just not there, so my 
"git-merge-tree" thing is languishing as a proof-of-concept.

So right now, git merging itself doesn't even take advantage of the "you 
can compare two whole directories in one go". We do that all over the 
place in other situations, though (it's a big reason for why doing a 
"diff" between different revisions is so fast - you can cut the problem 
space up and ignore the known-identical parts much faster).
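
A hedged sketch of what a tree-level three-way merge with that shortcut
could look like.  This is not git-merge-tree or merge-recursive: entries
are modelled as (object_id, children) tuples with children set to None
for files, the ids are made-up strings, and the function names are
hypothetical.

```python
def oid(entry):
    """Object id of an entry, or None when the path is absent."""
    return entry[0] if entry is not None else None

def merge(base, ours, theirs, path="", conflicts=None):
    """Three-way merge of one entry; returns (merged_entry, conflict_paths)."""
    conflicts = [] if conflicts is None else conflicts
    if oid(ours) == oid(theirs):
        return ours, conflicts      # both sides agree, whole subtree done
    if oid(base) == oid(theirs):
        return ours, conflicts      # only our side changed
    if oid(base) == oid(ours):
        return theirs, conflicts    # only their side changed
    ours_kids = ours[1] if ours is not None else None
    theirs_kids = theirs[1] if theirs is not None else None
    if ours_kids is None or theirs_kids is None:
        conflicts.append(path)      # both sides touched a file: needs a content merge
        return ours, conflicts
    base_kids = (base[1] if base is not None else None) or {}
    merged = {}
    for name in sorted(set(base_kids) | set(ours_kids) | set(theirs_kids)):
        child_path = f"{path}/{name}" if path else name
        child, _ = merge(base_kids.get(name), ours_kids.get(name),
                         theirs_kids.get(name), child_path, conflicts)
        if child is not None:
            merged[name] = child
    # Placeholder id: a real implementation would hash the merged tree.
    return ("merged:" + path, merged), conflicts

drivers = ("d0", {"a.c": ("a0", None)})
base = ("t0", {"drivers": drivers, "kernel": ("k0", {"sched.c": ("s0", None)}), "README": ("r0", None)})
ours = ("t1", {"drivers": drivers, "kernel": ("k1", {"sched.c": ("s1", None)}), "README": ("r0", None)})
theirs = ("t2", {"drivers": drivers, "kernel": ("k0", {"sched.c": ("s0", None)}), "README": ("r2", None)})
result, conflicts = merge(base, ours, theirs)
print(sorted((name, entry[0]) for name, entry in result[1].items()), conflicts)
```

In the example only the top-level tree is ever expanded: "drivers" and
"kernel" are each settled by comparing ids, without looking at a single
file beneath them.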

That tree-based data structure turned out to be wonderful. Originally (as 
in "first weeks of actual git work" in April 2005) git had a flat "file 
manifest" kind of thing, and that really sucked.  So the data structures 
are important, and I think we got those right fairly early on.

> We can do a do-nothing kernel merge in < 20 seconds, and that's
> comparing every single file in the tree.  In Python.  I was aiming for
> less than 10 seconds, but didn't quite hit it.

Well, so I know I can do that particular actual merge in 0.037 seconds 
(that's not counting the history traversal to actually find the common 
parent, which is another 0.01s or more ;), so we should be able to 
comfortably do the simple merges in less than a tenth of a second. But at 
some point, apparently nobody just cares.

Of course, this kind of thing depends a lot on developer behaviour. We had 
some performance bugs that we didn't notice simply because the kernel 
didn't show any of those patterns, but people using it for other things 
had slower merges. Sometimes you don't see the problem, just because you 
end up looking at the wrong pattern for performance.

> > So recursive basically generates the matrix of similarity for the 
> > new/deleted files, and tries to match them up, and there you have your 
> > renames - without ever looking at the history of how you ended up where 
> > you are.
> 
> So in the simple case, you compare unmatched THIS, OTHER and BASE files
> to find the renames?

Right. Some cases are easy: if one of the branches only added files (which 
is relatively common), that obviously cannot be a rename. So you don't 
even have to compare all possible combinations - you know you don't have 
renames from one branch to the other ;)

But I'm not even the authoritative person to explain all the details of the 
current recursive merge, and I might have missed something. Dscho? 
Fredrik? Anything you want to add?

			Linus


* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
                                                               ` (3 preceding siblings ...)
  2006-10-20 14:12                                             ` Jeff King
@ 2006-10-20 21:48                                             ` Carl Worth
  2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 20:05                                               ` Aaron Bentley
  4 siblings, 2 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-20 21:48 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git


On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:
> I understand your argument now.

Well, I'm glad to know we each feel like we are communicating at
times, here.

>                                  It's nothing to do with numbers per se,
> and all about per-branch namespaces.  Correct?

The entire discussion is about how to name things in a distributed
system. The premise that Linus has put forth in a very compelling way,
is that attempting to use sequential numbers for names in a
distributed system will break down. The breakdown could be that the
names are not stable, or that the system is used in a centralized way
to avoid the instability of the names.

Now, that causality might not accurately describe the way bzr has
developed. It may be that the centralization bias was determined by
other reasons, and that given those, using sequential numbers for
names makes perfect sense.

But it really is fundamental and unavoidable that sequential numbers
don't work as names in a distributed version control system.

> I meant that the active branch and a mirror of the abandoned branch
> could be stored in the same repository, for ease of access.

Granted, everything can be stored in one repository. But that still
doesn't change what I was trying to say with my example. One of the
repositories would "win" (the names it published during the fork would
still be valid). And the other repository would "lose" (the names it
published would be not valid anymore). Right?

Now, maybe there's some "simple" mapping from old names to new names
for the losing repository, (something like adding a prefix of
"losers/" to the beginning of the names or something or adding a "15."
prefix or whatever). The point is that the old names are
invalidated. And there's no way to guarantee this kind of change won't
happen in the future, (no matter how old a project is).

I constructed that example to show that the naming has a social impact
in forcing a distinction between winners and losers in the merge, (or
mainline and side branch, or whatever you want to name the
distinction). The two re-joining projects could be really amiable,
create a new virgin mainline and treat both histories as side
branches. In this version, everyone loses as all the old names are
invalidated.

> Bazaar encourages you to stick lots and lots of branches in your
> repository.  They don't even have to be related.  For example, my repo
> contains branches of bzr, bzrtools, Meld, and BazaarInspect.

Git allows this just fine. And lots of branches belonging to a single
project is definitely the common usage. It is not common (nor
encouraged) for unrelated projects to share a repository, since a git
clone will fetch every branch in the repository. It is common for a single
base URL to provide a common basis for a hierarchy of git
repositories, (see, for example http://repo.or.cz/), and that may
provide similar benefits.

I'm noticing another terminology conflict here. The notion of "branch"
in bzr is obviously very different than in git. For example the bzr
man page has a sentence beginning with "if there is already a branch
at the location but it has no working tree". I'm still not sure
exactly what a bzr branch is, but it's clearly something different
from a git branch, (which is absolutely nothing more than a name
referencing a particular commit object). [Note: after playing with it
a bit more down below, a bzr "branch" appears to be something like a
git "repository" that can only hold a single branch.]

> I can see where you're coming from, but to me, the trade-off seems
> worthwhile.  Because historical data gets less and less valuable the
> older it gets.  By the time the URL for a branch goes dark, there's
> unlikely to be any reason to refer to one of its revisions at all.

I strongly disagree on this point. First, I don't think that the "time
for a branch to go dark" is necessarily long, (or if it is, then
that's another barrier that's set up against distributed
development---people have to have a long-term repository before they
can usefully start publishing a branch). Second, I'm not comfortable
with any limit on usefulness of history. Would you willingly throw
away commits, mailing list posts, or closed bug reports older than any
given age for any projects that you care about?

> When you create a new branch from scratch, the number starts at zero.
> If you copy a branch, you copy its number, too.
>
> Every time you commit, the number is incremented.  If you pull, your
> numbers are adjusted to be identical to those of the branch you pulled from.
>
> Is that really complicated?

OK. So now I had to actually try things out. I went ahead and
installed bzr and was able to init and commit from the man page. I had
to go to IRC to figure out how to create and change branches, (the
documentation for "bzr branch" just said FROM_LOCATION and TO_LOCATION
and I couldn't figure out what to pass for those).

Here's the setup I came up with for a tweaked version of the a[bc]m
diamond example I showed with git earlier, (I just added a second
commit to each branch before merging):

	mkdir bzrtest; cd bzrtest
	mkdir master; cd master; bzr init
	touch a; bzr add a; bzr commit -m "Initial commit of a"
	cd ..
	bzr branch master b; cd b
	touch b; bzr add b; bzr commit -m "Commit b on b branch"
	echo "change" > b; bzr commit -m "Change b on b branch"
	cd ..
	bzr branch master c; cd c
	touch c; bzr add c; bzr commit -m "Commit c on c branch"
	echo "change" > c; bzr commit -m "Change c on c branch"
	cd ../master
	bzr merge ../b; bzr commit -m "Merge in b"
	bzr merge ../c; bzr commit -m "Merge in c"

First, I've been told that this is a lot less efficient than possible
since I have what in bzr terms is three unshared "branches" here,
(what git would really call three separate "repositories").

Second, I think that using the filesystem for separating branches is a
really bad idea. One, it intrudes on my branch namespace, (note that
in many commands above I have to use things like "../b" where I'd like
to just name my branch "b"). Two, it prevents bzr from having any
notion of "all branches" in places where git takes advantage of it,
(such as git-clone and "gitk --all"). Three, it certainly encourages
the storage problem I ran into above, (and I'd be interested to see a
"corrected" version of the commands above to fix the storage
inefficiencies).

But anyway, those are all new topics, what we were trying to talk
about is revision numbers. After the above commands I can run bzr log
in my three branches, master, b, and c and I get the following
revision number sequences:

master: 1 2 3
b: 1 2 3
c: 1 2 3

And from this state if I ask questions with bzr missing and look at
just the revision numbers, then the answers are useless. I get answers
like:

	.../b:$ bzr missing ../c
	You have 2 extra revision(s):
	revno: 3
	  Change b on b branch
	revno: 2
	  Commit b on b branch

	You are missing 2 revision(s):
	revno: 3
	  Change c on c branch
	revno: 2
	  Commit c on c branch

	.../b:$ bzr missing ../master
	You are missing 2 revision(s):
	revno: 3
	  Merge in c
	revno: 2
	  Merge in b

So there we have the revision numbers 2 and 3 each being used to name
three different revisions. That's a lot of aliasing already.
Then, if the b and c branches each treat master as their mainline and
each pull, then both branches get their numbers all shuffled.
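
The contrast with content-derived names can be shown with a toy model.
This is a sketch under stated assumptions, not how git computes object
names: a 32-bit FNV-1a hash stands in for SHA1, a commit is reduced to
its parent's name plus its message (a real git commit object also
covers the tree, author and dates), and commit_name() and ROOT_PARENT
are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV offset basis, used here as the "no parent" seed. */
#define ROOT_PARENT 2166136261u

/*
 * Name a commit by hashing its message on top of its parent's name
 * (FNV-1a, only as a short stand-in for SHA-1).  The result depends on
 * what the commit is and where it came from, not on which branch or
 * machine happens to be counting.
 */
static uint32_t commit_name(uint32_t parent, const char *message)
{
	uint32_t h = parent;

	assert(message != NULL);
	for (; *message; message++) {
		h ^= (unsigned char)*message;
		h *= 16777619u;
	}
	return h;
}
```

Two different commits made on top of the same parent get different
names even though each would be "revno 2" on its own branch, and
recomputing a name anywhere yields the same value, so pulling never has
to renumber anything.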

Oh, drat. I just realized that I'm running 0.11 here which doesn't
have the dotted-decimal numbers. (I'm trying to get bzr.dev too, but
it appears to be stuck about 40% of the way through "Fetch phase
1/4"). In this version, the commits brought in as part of a merge
don't get any "simple" number at all and instead "bzr log" shows a
merge ID.

I hadn't realized that the dotted decimal notation was so new that the
community hadn't had a lot of experience with it yet. But, your
description doesn't actually presume that notation. What you asked
was:

	> When you create a new branch from scratch, the number starts at zero.
	> If you copy a branch, you copy its number, too.
	>
	> Every time you commit, the number is incremented.  If you pull, your
	> numbers are adjusted to be identical to those of the branch you pulled from.
	>
	> Is that really complicated?

And to answer. That description doesn't describe at all what happens
to the "simple" numbers of commits that are merged. In the version I
have, they disappear and get replaced with "ugly" numbers. In 0.12
something else happens instead, (that's the part I don't understand
yet).

And my argument isn't just "confusing", it's "confusing or
useless". I understand that pull destroys numbers, and how, but that
makes the numbers I had generated earlier useless. I still don't
understand how people can avoid number changing, (since pull seems the
only way to synch up without infinite new merge commits being added
back and forth).

So, yes, it really is complicated or my brain is just too small.

> > The naming in git really is beautiful and beautifully simple.
>
> Well, you've got to admit that those names are at least superficially ugly.

Sure. But I'll gladly take a simple system with superficial warts over
a complex system with superficial beauty.

> What's nice is being able see the revno 753 and knowing that "diff -r
> 752..753" will show the changes it introduced.  Checking the revno on a
> branch mirror and knowing how out-of-date it is.

With git I get to see a revision number of b62710d4 and know that
"diff b62710d4^ b62710d4" will show its changes, though much more
likely just "show b62710d4". I really cannot fathom a place where
arithmetic on revision numbers does something useful that git revision
specifications don't do just as easily. Anybody have an example for
me?

-Carl

PS. The "bzr branch" of bzr.dev did eventually finish. I can see the
dotted-decimal numbers in my example now, (1.1.1 and 1.1.2 for the
commits that came from branch b; 1.2.1 and 1.2.2 for the commits that
came from branch c). At 5 characters a piece these are well on their
way to getting just as "ugly" as git names, (once it's all
cut-and-paste the difference in ugliness is negligible).

And now, I see it's not just pull that does number rewriting. If I use
the following command (after the chunk of commands above):

	cd ..; bzr branch -r 1.2.2 master 1.2.2

It appears to just create newly linearized revision numbers from whole
cloth for the new branch (1, 2, and 3 corresponding to mainline 1,
1.2.1, and 1.2.2). That's totally surprising, very confusing, and
would invalidate any use I wanted to make of published revision
numbers for the mainline branch while I was working on this branch.

See? This stuff really doesn't work.

Motivating scenario for the above: Imagine 1.2.3 committed garbage so I
want to fix it by branching from 1.2.2 rather than the mainline
"2". Then after I branch, I learn something about "1.2.1" that I want
to investigate more closely. I try to inspect that in my branch, but
ouch! I don't have that revision.

Is there even a way to say "show me the change introduced by what is
named '1.2.1' in the source branch in this scenario" ?

Note: In #bzr I just learned that there is a way for me to do this
_if_ I also happen to have a pull of the original branch somewhere on
my machine. Something like:

	bzr diff -r1.2.0:../master -r1.2.1:../master

I don't know if there's a way to get diff's .. notation to work with
that, (I can't manage to). But these simple numbers are getting less
simple all the time.

With git, if I find a revision number somewhere, I can cut-and-paste
it and get the right thing:

	git show b62710d4f8602203d848daf2d444865b611fff09

But with bzr if I find "1.2.1" somewhere I'm likely to type:

	bzr diff -r1.2.0..1.2.1

If I'm lucky, then that fails with:

	bzr: ERROR: Requested revision: '1.2.0' does not exist in branch:

and I go back to the source, find out what branch it was referring to,
remember where that is on my machine (../master, say), and manually
type that to my command line to get:

	bzr diff -r1.2.0:../master -r1.2.1:../master

If I'm unlucky then the first diff comes up with some unrelated commit
and I get to be confused before I go through that same process.

Now do you see? It really, really does not work. This stuff is about
as un-simple as could be, and these things will happen.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
@ 2006-10-20 22:13                                                 ` Jeff Licquia
  2006-10-20 23:05                                                   ` Robert Collins
  2006-10-20 23:59                                                   ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-20 22:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jan Hudec, bazaar-ng, git, Jakub Narebski

On Fri, 2006-10-20 at 11:48 -0700, Linus Torvalds wrote:
> Here's a real-life scenario that we hit several times with BK over the 
> years:
> 
>  - take a real repository, and a patch that gets discussed that adds a new 
>    file.
>  - take two different people applying that patch to their trees (or, do 
>    the equivalent thing, which is to just create the same filename
>    independently, because the solution is obvious - and the same - to 
>    both developers).
>  - now, have somebody merge both of those two people's trees (eg me)
>  - have the two people continue to use their trees, modifying it, and 
>    getting merged.
> 
> Trust me, this isn't even _unlikely_. It happens. And it's a serious 
> problem for a file-ID case. Why? Because you have two different file ID's 
> for the same pathname. 

I tried this to see what bzr would do.  Here's the critical point where
the first merges are done ("a" is mainline, "b" and "c" are external
branches being merged into "a").

---
jeff@lsblap:~/tmp/linus-file-id/a$ bzr pull ../b
All changes applied successfully.
1 revision(s) pulled.
jeff@lsblap:~/tmp/linus-file-id/a$ bzr pull ../c
bzr: ERROR: These branches have diverged.  Use the merge command to reconcile them.
jeff@lsblap:~/tmp/linus-file-id/a$ bzr merge ../c
Conflict adding file file2.  Moved existing file to file2.moved.
1 conflicts encountered.
jeff@lsblap:~/tmp/linus-file-id/a$ bzr status
added:
  file2
renamed:
  file2 => file2.moved
conflicts:
  Conflict adding file file2.  Moved existing file to file2.moved.
pending merges:
  Jeff Licquia 2006-10-20 commit c of file2
---

file2 and file2.moved have identical contents at this point.  I fixed it
by deleting file2.moved, "bzr resolve file2", and committing.

After this conflict is resolved, merging from b causes conflicts, while
merging from c appears to work fine.  This continues until b merges from
a (and resolves a conflict in a similar manner to a), at which time
merging/pulling works as you'd expect between the branches.  Whenever b
is marked as conflicting before it merges from a, bzr preserves b's
changes by moving b's modified file.

All in all, not ideal, but it seems bzr handles this better than bk.
Certainly, bzr doesn't silently drop anyone's changes, at least.  I
suspect that bzr could improve its handling of this use case, but not,
I'm sure, to Linus's specifications; some of the fun and games does seem
to come from the use of file IDs.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 15:34                                         ` Aaron Bentley
  2006-10-20 16:21                                           ` Jakub Narebski
@ 2006-10-20 22:40                                           ` Petr Baudis
  2006-10-20 23:33                                             ` Aaron Bentley
  1 sibling, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 22:40 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 05:34:39PM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Jakub Narebski wrote:
> > Aaron Bentley wrote:
> >>In Bazaar bundles, the text of the diff is an integral part of the data.
> >> It is used to generate the text of all the files in the revision.
> > 
> > 
> > I thought that the diff was combined diff of changes.
> 
> It is.  It's a description of how to produce revision X given revision
> Y, where Y is the last-merged mainline revision.

Aha, so by default a bundle can carry just a _single_ revision?

That doesn't sound right either, because then it wouldn't make sense to
talk about "combined" or "simple" diffs. So I guess sending a bundle
really is taking n revisions at your side, bundling them to a single
diff and when the other side takes it, it will result in a single
revision? That is basically what our merge --squash does.

Hmm, but that doesn't sound right either, that's certainly no revolutionary
functionality and seems to be in contradiction with previous bundles
description. But if it doesn't squash the changes, I don't see how the
combined diff can be integral part of the data. Sorry, I don't get it.

> The bundle format can also support sending a single bundle that
> displays the series of patches, though there's currently no UI to select
> this.
..snip..
> > I was under an impression that user sees only mega-patch of all the
> > revisions in bundle together, and rest is for machine consumption only.
> 
> All of it is for machine consumption.  The MIME-encoded sections are a
> series of patches.  They're usually MIME-encoded to avoid confusion with
> the overview patch, but this is optional.
> 
> I've attached an example of what a combined patch-by-patch bundle looks
> like.

But that's the one there's no UI to select? Or where is the combined
diff?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring
  2006-10-20 20:17                                                     ` Junio C Hamano
  2006-10-20 20:40                                                       ` Jakub Narebski
@ 2006-10-20 22:41                                                       ` Junio C Hamano
  2006-10-20 22:41                                                       ` [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks Junio C Hamano
  2 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-20 22:41 UTC (permalink / raw)
  To: git

Instead of comparing number of lines matched, look at the
matched characters and count alnums, so that we do not pass
blame on not-so-interesting lines, such as empty lines and lines
that are indentation with closing brace.

Add an option --score-debug to show the score of each
blame_entry while we cook this further on the "next" branch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * This comes on top of "next".  The next one makes output from
   "pickaxe -C commit" actually make sense.

 builtin-pickaxe.c |   71 +++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/builtin-pickaxe.c b/builtin-pickaxe.c
index 74c7c9a..3c73d82 100644
--- a/builtin-pickaxe.c
+++ b/builtin-pickaxe.c
@@ -34,8 +34,7 @@ static int longest_file;
 static int longest_author;
 static int max_orig_digits;
 static int max_digits;
-
-#define DEBUG 0
+static int max_score_digits;
 
 #define PICKAXE_BLAME_MOVE		01
 #define PICKAXE_BLAME_COPY		02
@@ -78,6 +77,11 @@ struct blame_entry {
 	 * suspect's file; internally all line numbers are 0 based.
 	 */
 	int s_lno;
+
+	/* how significant this entry is -- cached to avoid
+	 * scanning the lines over and over
+	 */
+	unsigned score;
 };
 
 struct scoreboard {
@@ -215,9 +219,6 @@ static void process_u_diff(void *state_,
 	struct chunk *chunk;
 	int off1, off2, len1, len2, num;
 
-	if (DEBUG)
-		fprintf(stderr, "%.*s", (int) len, line);
-
 	num = state->ret->num;
 	if (len < 4 || line[0] != '@' || line[1] != '@') {
 		if (state->hunk_in_pre_context && line[0] == ' ')
@@ -295,10 +296,6 @@ static struct patch *get_patch(struct or
 	char *blob_p, *blob_o;
 	struct patch *patch;
 
-	if (DEBUG) fprintf(stderr, "get patch %.8s %.8s\n",
-			   sha1_to_hex(parent->commit->object.sha1),
-			   sha1_to_hex(origin->commit->object.sha1));
-
 	blob_p = read_sha1_file(parent->blob_sha1, type,
 				(unsigned long *) &file_p.size);
 	blob_o = read_sha1_file(origin->blob_sha1, type,
@@ -352,6 +349,7 @@ static void dup_entry(struct blame_entry
 	memcpy(dst, src, sizeof(*src));
 	dst->prev = p;
 	dst->next = n;
+	dst->score = 0;
 }
 
 static const char *nth_line(struct scoreboard *sb, int lno)
@@ -448,7 +446,7 @@ static void split_blame(struct scoreboar
 		add_blame_entry(sb, new_entry);
 	}
 
-	if (DEBUG) {
+	if (1) { /* sanity */
 		struct blame_entry *ent;
 		int lno = 0, corrupt = 0;
 
@@ -530,12 +528,6 @@ static int pass_blame_to_parent(struct s
 	for (i = 0; i < patch->num; i++) {
 		struct chunk *chunk = &patch->chunks[i];
 
-		if (DEBUG)
-			fprintf(stderr,
-				"plno = %d, tlno = %d, "
-				"same as parent up to %d, resync %d and %d\n",
-				plno, tlno,
-				chunk->same, chunk->p_next, chunk->t_next);
 		blame_chunk(sb, tlno, plno, chunk->same, target, parent);
 		plno = chunk->p_next;
 		tlno = chunk->t_next;
@@ -547,14 +539,37 @@ static int pass_blame_to_parent(struct s
 	return 0;
 }
 
-static void copy_split_if_better(struct blame_entry best_so_far[3],
+static unsigned ent_score(struct scoreboard *sb, struct blame_entry *e)
+{
+	unsigned score;
+	const char *cp, *ep;
+
+	if (e->score)
+		return e->score;
+
+	score = 0;
+	cp = nth_line(sb, e->lno);
+	ep = nth_line(sb, e->lno + e->num_lines);
+	while (cp < ep) {
+		unsigned ch = *((unsigned char *)cp);
+		if (isalnum(ch))
+			score++;
+		cp++;
+	}
+	e->score = score;
+	return score;
+}
+
+static void copy_split_if_better(struct scoreboard *sb,
+				 struct blame_entry best_so_far[3],
 				 struct blame_entry this[3])
 {
 	if (!this[1].suspect)
 		return;
-	if (best_so_far[1].suspect &&
-	    (this[1].num_lines < best_so_far[1].num_lines))
-		return;
+	if (best_so_far[1].suspect) {
+		if (ent_score(sb, &this[1]) < ent_score(sb, &best_so_far[1]))
+			return;
+	}
 	memcpy(best_so_far, this, sizeof(struct blame_entry [3]));
 }
 
@@ -596,7 +611,7 @@ static void find_copy_in_blob(struct sco
 				      tlno + ent->s_lno, plno,
 				      chunk->same + ent->s_lno,
 				      parent);
-			copy_split_if_better(split, this);
+			copy_split_if_better(sb, split, this);
 		}
 		plno = chunk->p_next;
 		tlno = chunk->t_next;
@@ -699,7 +714,7 @@ static int find_copy_in_parent(struct sc
 				continue;
 			}
 			find_copy_in_blob(sb, ent, norigin, this, &file_p);
-			copy_split_if_better(split, this);
+			copy_split_if_better(sb, split, this);
 		}
 		if (split[1].suspect)
 			split_blame(sb, split, ent);
@@ -944,6 +959,7 @@ #define OUTPUT_RAW_TIMESTAMP	004
 #define OUTPUT_PORCELAIN	010
 #define OUTPUT_SHOW_NAME	020
 #define OUTPUT_SHOW_NUMBER	040
+#define OUTPUT_SHOW_SCORE      0100
 
 static void emit_porcelain(struct scoreboard *sb, struct blame_entry *ent)
 {
@@ -1016,6 +1032,8 @@ static void emit_other(struct scoreboard
 					   show_raw_time),
 			       ent->lno + 1 + cnt);
 		else {
+			if (opt & OUTPUT_SHOW_SCORE)
+				printf(" %*d", max_score_digits, ent->score);
 			if (opt & OUTPUT_SHOW_NAME)
 				printf(" %-*.*s", longest_file, longest_file,
 				       suspect->path);
@@ -1060,8 +1078,9 @@ static void output(struct scoreboard *sb
 	for (ent = sb->ent; ent; ent = ent->next) {
 		if (option & OUTPUT_PORCELAIN)
 			emit_porcelain(sb, ent);
-		else
+		else {
 			emit_other(sb, ent, option);
+		}
 	}
 }
 
@@ -1118,6 +1137,7 @@ static void find_alignment(struct scoreb
 {
 	int longest_src_lines = 0;
 	int longest_dst_lines = 0;
+	unsigned largest_score = 0;
 	struct blame_entry *e;
 
 	for (e = sb->ent; e; e = e->next) {
@@ -1143,9 +1163,12 @@ static void find_alignment(struct scoreb
 		num = e->lno + e->num_lines;
 		if (longest_dst_lines < num)
 			longest_dst_lines = num;
+		if (largest_score < ent_score(sb, e))
+			largest_score = ent_score(sb, e);
 	}
 	max_orig_digits = lineno_width(longest_src_lines);
 	max_digits = lineno_width(longest_dst_lines);
+	max_score_digits = lineno_width(largest_score);
 }
 
 static int has_path_in_work_tree(const char *path)
@@ -1206,6 +1229,8 @@ int cmd_pickaxe(int argc, const char **a
 				tmp = top; top = bottom; bottom = tmp;
 			}
 		}
+		else if (!strcmp("--score-debug", arg))
+			output_option |= OUTPUT_SHOW_SCORE;
 		else if (!strcmp("-f", arg) ||
 			 !strcmp("--show-name", arg))
 			output_option |= OUTPUT_SHOW_NAME;
-- 
1.4.3.ge193
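
The scoring idea in the patch above can be tried in isolation with a
small standalone function. This is a sketch, not the patch's
ent_score(): it takes a plain buffer instead of a scoreboard and a
blame_entry, it does no caching, and alnum_score() is a made-up name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Count the alphanumeric bytes in buf[0..len).  Blank lines and lines
 * holding only indentation and a closing brace contribute nothing, so
 * a chunk made of those scores 0 however many lines it spans.
 */
static unsigned alnum_score(const char *buf, size_t len)
{
	unsigned score = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++)
		if (isalnum((unsigned char)buf[i]))
			score++;
	return score;
}
```

Under this measure a three-line chunk such as "\t}\n\n" loses to a
single line such as "return 0;\n", which is the opposite of what
counting matched lines would say.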


* [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks
  2006-10-20 20:17                                                     ` Junio C Hamano
  2006-10-20 20:40                                                       ` Jakub Narebski
  2006-10-20 22:41                                                       ` [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring Junio C Hamano
@ 2006-10-20 22:41                                                       ` Junio C Hamano
  2 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-20 22:41 UTC (permalink / raw)
  To: git

This adds scoring logic to blame_entry to prevent blames on very
trivial chunks (e.g. lots of empty lines, indent followed by a
closing brace) from being passed down to unrelated lines in the
parent.

The current heuristics are quite simple and may need to be
tweaked later, but we need to start from somewhere.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 builtin-pickaxe.c |   36 ++++++++++++++++++++++++++++++++----
 1 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/builtin-pickaxe.c b/builtin-pickaxe.c
index 3c73d82..49673a5 100644
--- a/builtin-pickaxe.c
+++ b/builtin-pickaxe.c
@@ -40,6 +40,15 @@ #define PICKAXE_BLAME_MOVE		01
 #define PICKAXE_BLAME_COPY		02
 #define PICKAXE_BLAME_COPY_HARDER	04
 
+/*
+ * blame for a blame_entry with score lower than these thresholds
+ * is not passed to the parent using move/copy logic.
+ */
+static unsigned blame_move_score;
+static unsigned blame_copy_score;
+#define BLAME_DEFAULT_MOVE_SCORE	20
+#define BLAME_DEFAULT_COPY_SCORE	40
+
 /* bits #0..7 in revision.h, #8..11 used for merge_bases() in commit.c */
 #define METAINFO_SHOWN		(1u<<12)
 #define MORE_THAN_ONE_PATH	(1u<<13)
@@ -645,7 +654,8 @@ static int find_move_in_parent(struct sc
 		if (ent->suspect != target || ent->guilty)
 			continue;
 		find_copy_in_blob(sb, ent, parent, split, &file_p);
-		if (split[1].suspect)
+		if (split[1].suspect &&
+		    blame_move_score < ent_score(sb, &split[1]))
 			split_blame(sb, split, ent);
 	}
 	free(blob_p);
@@ -716,7 +726,8 @@ static int find_copy_in_parent(struct sc
 			find_copy_in_blob(sb, ent, norigin, this, &file_p);
 			copy_split_if_better(sb, split, this);
 		}
-		if (split[1].suspect)
+		if (split[1].suspect &&
+		    blame_copy_score < ent_score(sb, &split[1]))
 			split_blame(sb, split, ent);
 	}
 	diff_flush(&diff_opts);
@@ -1177,6 +1188,15 @@ static int has_path_in_work_tree(const c
 	return !lstat(path, &st);
 }
 
+static unsigned parse_score(const char *arg)
+{
+	char *end;
+	unsigned long score = strtoul(arg, &end, 10);
+	if (*end)
+		return 0;
+	return score;
+}
+
 int cmd_pickaxe(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info revs;
@@ -1206,12 +1226,15 @@ int cmd_pickaxe(int argc, const char **a
 			output_option |= OUTPUT_LONG_OBJECT_NAME;
 		else if (!strcmp("-S", arg) && ++i < argc)
 			revs_file = argv[i];
-		else if (!strcmp("-M", arg))
+		else if (!strncmp("-M", arg, 2)) {
 			opt |= PICKAXE_BLAME_MOVE;
-		else if (!strcmp("-C", arg)) {
+			blame_move_score = parse_score(arg+2);
+		}
+		else if (!strncmp("-C", arg, 2)) {
 			if (opt & PICKAXE_BLAME_COPY)
 				opt |= PICKAXE_BLAME_COPY_HARDER;
 			opt |= PICKAXE_BLAME_COPY | PICKAXE_BLAME_MOVE;
+			blame_copy_score = parse_score(arg+2);
 		}
 		else if (!strcmp("-L", arg) && ++i < argc) {
 			char *term;
@@ -1249,6 +1272,11 @@ int cmd_pickaxe(int argc, const char **a
 			argv[unk++] = arg;
 	}
 
+	if (!blame_move_score)
+		blame_move_score = BLAME_DEFAULT_MOVE_SCORE;
+	if (!blame_copy_score)
+		blame_copy_score = BLAME_DEFAULT_COPY_SCORE;
+
 	/* We have collected options unknown to us in argv[1..unk]
 	 * which are to be passed to revision machinery if we are
 	 * going to do the "bottom" procesing.
-- 
1.4.3.ge193
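
How the option parsing and the threshold test in the patch above fit
together can be sketched in one place. The names move_threshold() and
passes_threshold() are hypothetical, and the sketch folds the patch's
parse_score() and its "fall back to the default when the score is 0"
step into a single function; only the -M case is shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_MOVE_SCORE 20	/* same value as BLAME_DEFAULT_MOVE_SCORE */

/*
 * Turn "-M" or "-M<num>" into a threshold.  A missing number, a zero,
 * or trailing junk after the digits all fall back to the default.
 */
static unsigned move_threshold(const char *arg)
{
	char *end;
	unsigned long score;

	assert(!strncmp(arg, "-M", 2));
	score = strtoul(arg + 2, &end, 10);
	if (*end || !score)
		return DEFAULT_MOVE_SCORE;
	return (unsigned)score;
}

/* Blame is passed on only when the score is strictly above the threshold. */
static int passes_threshold(unsigned threshold, unsigned score)
{
	return threshold < score;
}
```

Because the comparison is strict, a chunk whose score equals the
threshold is treated like one below it and is not passed to the parent.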


* Re: VCS comparison table
  2006-10-20 14:59                                                   ` James Henstridge
@ 2006-10-20 22:50                                                     ` Jakub Narebski
  2006-10-20 22:58                                                       ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 22:50 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On 20-10-2006, James Henstridge wrote:
> On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> James Henstridge wrote:

>>> With the above layout, I would just type:
>>>     bzr branch http://server/repo/branch1
>>
>> With Cogito (you can think of it either as alternate Git UI, or as SCM
>> built on top of Git) you would use
>>
>>    $ cg clone http://server/repo#branch
>>
>> for example
>>
>>    $ cg clone git://git.kernel.org/pub/scm/git/git.git#next
>>
>> to clone _single_ branch (in bzr terminology, "heavy checkout" of branch).
> 
> My understanding of git is that this would be equivalent to the "bzr
> branch" command.  A checkout (heavy or lightweight) has the property
> that commits are made to the original branch.

Not exactly (my mistake in explaining it). "cg clone git://host/repo#branch"
clones only the part of the commit history DAG that is reachable from the
given branch. It is still a full repository: you can add branches to it later
with cg-branch-add and fetch changes with cg-fetch.

>> But you can also clone _whole_ repository, _all_ published branches with
>>
>>    $ cg clone git://git.kernel.org/pub/scm/git/git.git
> 
> I suppose that'd be useful if you want a copy of all the branches at
> once.  There is no builtin command in Bazaar to do that at present.

That is _very_ useful, and it is the default for Git. For example,
with the git.git repository I'm interested both in the 'master'
branch (main line of development) and in the 'next' branch (development
branch). Say I send some patches based on 'master', and they get
accepted, but into 'next' (to cook for a while, for example). If I
then want to do further work in this direction, I have to base my
new work on the 'next' branch.

It looks like the Bazaar-NG "branches" are the equivalent of the
one-branch clone in Git.

And if there is no command to clone a whole repository, how
do you publish a repository?

See below.

[...] 
> Two points:
> (1) if we are publishing branches, we wouldn't include working trees
> -- they are not needed to pull or merge from such a branch.

Same with Git. Public repositories are usually "bare" clones, i.e.
without a working directory. We can clone/fetch from a "clothed" repo
without problems; we just have to point to its .git directory.

> (2) if we did have working trees, they'd be rooted at /repo/branch1
> and /repo/branch2 -- not at /repo (since /repo is not a branch).

That explains it.

> In case (2) there is a potential for conflicts if you nest branches,
> but people don't generally trigger this problem with the way they use
> Bazaar.

In Git there is no problem with having a git repository nested within a
working area: of course you had better ignore its .git directory; you can
ignore the files in this embedded repository or not.

[...]
>> What does a checked-out working area look like in Bazaar-NG?
> 
> The layout of a standalone branch would be:
>   .bzr/repository/ -- storage of trees and metadata
>   .bzr/branch/ -- branch metadata (e.g. pointer to the head revision)
>   .bzr/checkout/ -- working tree book-keeping files
>   source code

A git repository (git clone, as it is the equivalent of bzr branch)
has the following layout:
  .git/objects/ -- repository objects database
  .git/refs/ -- heads (branches) and tags
  .git/index -- staging area for commit (adding files, merge resolving)
  .git/HEAD -- which branch is current branch
  source code

> If we use a shared repository, the contained branches would lack the
> .bzr/repository/ directory.  The parent directory would instead have a
> .bzr/repository/, but usually wouldn't have .bzr/branch/ (unless there
> is a branch rooted at the base of the repository).

The equivalent of a shared repository would be to make .git/objects/
a symlink to some directory which serves as a common area
for storing the object database.

You can use an alternates file: .git/objects/info/alternates can hold a
list of absolute pathnames (one per line) where objects can also be found.
If I understand correctly, new objects get written to the current
repository's object database. Therefore, to get the equivalent of symlinking
the .git/objects directory, every repository that should share the object
database would have to list all the other repositories (all except itself)
in its alternates file.

Or you can use the GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable.

A repository using any kind of alternates mechanism is not suitable
for publishing over "dumb" (non-git-aware) transports.
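
The alternates lookup can be pictured with a small Python model. It is
a sketch under simplifying assumptions, not how git itself is implemented:
it handles loose objects only (no packs), takes the alternates entries as
absolute paths exactly as described above, and the helper names
(object_dirs, find_loose_object, make_repo, demo) are made up for the
illustration.

```python
import os
import tempfile


def object_dirs(git_dir):
    """Return the repository's own object directory followed by its alternates."""
    own = os.path.join(git_dir, "objects")
    dirs = [own]
    alternates = os.path.join(own, "info", "alternates")
    if os.path.exists(alternates):
        with open(alternates) as f:
            dirs += [line.strip() for line in f if line.strip()]
    return dirs


def find_loose_object(git_dir, sha1):
    """Look for objects/ab/cdef... locally first, then in each alternate."""
    for directory in object_dirs(git_dir):
        path = os.path.join(directory, sha1[:2], sha1[2:])
        if os.path.exists(path):
            return path
    return None


def make_repo(root, name):
    """Create an empty toy repository skeleton and return its .git path."""
    git_dir = os.path.join(root, name, ".git")
    os.makedirs(os.path.join(git_dir, "objects", "info"))
    return git_dir


def demo():
    root = tempfile.mkdtemp()
    repo_a = make_repo(root, "a")
    repo_b = make_repo(root, "b")
    sha1 = "6d553e72158aaa76c258d98c15cd24922d171cd9"
    # Put a (fake, empty) loose object into repository "a" only.
    fan_out = os.path.join(repo_a, "objects", sha1[:2])
    os.makedirs(fan_out)
    stored = os.path.join(fan_out, sha1[2:])
    open(stored, "w").close()
    before = find_loose_object(repo_b, sha1)
    # Point "b" at the object database of "a".
    with open(os.path.join(repo_b, "objects", "info", "alternates"), "w") as f:
        f.write(os.path.join(repo_a, "objects") + "\n")
    after = find_loose_object(repo_b, sha1)
    return before, after, stored


if __name__ == "__main__":
    print(demo())
```

The model also shows why such a repository is awkward for "dumb"
transports: the object is simply not present under b's own objects/
directory, so a client that only walks the published files would miss it.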

> if we are publishing a branch to a web server, we'd skip the working
> tree, so the source code and .bzr/checkout/ directory would be
> missing.

For a "bare" clone only the source files would be missing. Well, perhaps
also '.git/index', but I'm not sure.

> In the case of a checkout, the .bzr/branch/ directory has a special
> format and acts as a pointer to the original branch.  If the checkout
> is lightweight, the .bzr/repository/ directory would be missing, and
> bzr would need to contact the original branch for the data.

There is no equivalent of the bzr "checkout" in Git (and could you please
use another name for that, like "lazy branch"?). There was some talk
about how to do a "lazy clone"/"remote alternates" in Git, but no consensus
was reached about how to do this efficiently, for both "dumb"
(http, https, ftp, rsync) transports and git-aware (local, git, ssh+git)
transports. From what I've read, Bazaar-NG doesn't try the "efficient"
part...

[...]
>> Yes, but using Git that way has serious disadvantages. For example
>> there is only one current branch pointer and only one index (dircache)
>> per git repository.
> 
> Okay.  So using Bazaar terminology, this seems to be an issue of the
> working tree being associated with the repository rather than the
> branch?
 
From the point of view of Git users, the issue (in Bazaar-NG) is that
the working tree is associated with an individual branch rather than
with the repository.

In git, to work on some project you clone its repository; in bzr, to
work on some project you get one of its branches.


IMVHO if "Cheap Branching Anywhere" was changed to "Lightweight Branches"
then Bazaar-NG would have to put "Partial" in there. Unless you set up
your branches to share data, branches are not cheap (in the sense of
disk space). That's probably the cause of the _need_ for "checkouts".
Bazaar-NG doesn't encourage using temporary branches with a
lifespan no longer than a day. Can you ever switch between branches
using only one working area; can you do it fast?

It looks somewhat as if bzr started without permanent branches, and
they were added later (sharing repository data). But I might be mistaken.

P.S. What Git lacks, at least for now, is a way to generate a diff between
two different local repositories, but you can always set up an alternates
file and fetch the other repository into some tag.
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20 22:50                                                     ` Jakub Narebski
@ 2006-10-20 22:58                                                       ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 22:58 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: James Henstridge, bazaar-ng, Linus Torvalds, Andreas Ericsson,
	Carl Worth, git

Dear diary, on Sat, Oct 21, 2006 at 12:50:31AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> P.S. what Git lacks at least now is a way to generate diff between
> two different local repositories, but you can always setup alternates
> file and fetch the other repository into some tag.

It's not exactly convenient, but you can do

	xpasky@machine[0:0]~/git$ GIT_ALTERNATE_OBJECT_DIRECTORIES=../cogito/.git/objects cg-diff -r `GIT_DIR=../cogito/.git cg-object-id -c HEAD`..HEAD

I don't personally think it's worth a special UI, but there're no
boundaries for initiative... :-)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
                                                                 ` (3 preceding siblings ...)
  2006-10-20 19:14                                               ` Jakub Narebski
@ 2006-10-20 22:59                                               ` Jeff King
  2006-10-21 17:40                                                 ` Jan Hudec
  4 siblings, 1 reply; 1752+ messages in thread
From: Jeff King @ 2006-10-20 22:59 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git, Jakub Narebski

On Fri, Oct 20, 2006 at 08:12:10PM +0200, Jan Hudec wrote:

> At this point, I expect the tree to look like this:
> A$ ls -R
> .:
> data/
> data:
> hello.txt
> A$ cat data/hello.txt
> Hello World!

Git does what you expect here.

> A$ VCT mv data greetings
> A$ VCT commit -m "Renamed the data directory to greetings"
> B$ echo "Goodbye World!" > data/goodbye.txt
> B$ VCT add data/goodbye.txt
> B$ VCT commit -m "Added goodbye message."
> A$ VCT merge B
> 
> And now I expect to have tree looking like this:
> 
> A$ ls -R
> .:
> greetings/
> greetings:
> hello.txt
> goodbye.txt

Git does not do what you expect here. It notes that files moved, but it
does not have a concept of directories moving.  Git could, even without
file-ids or special patch types, figure out what happened by noting that
every file in data/ was renamed to its analogue in greetings/, and infer
that previously nonexistent files in data/ should also be moved to
greetings/.
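
The inference described in this paragraph could look roughly like the
Python sketch below. It is an illustration of the idea only, not
anything git does: infer_directory_renames() and relocate() are
hypothetical helpers, and the rule used (every renamed file kept its
basename, all of them landed in the same new directory, and nothing is
left in the old one) is just one possible reading of "every file in data/
was renamed to its analogue in greetings/".

```python
import posixpath


def infer_directory_renames(file_renames, paths_after):
    """Map old_dir -> new_dir when every file renamed out of old_dir kept its
    name, landed in the same new_dir, and nothing remains in old_dir."""
    candidates = {}
    for old, new in file_renames.items():
        old_dir, old_name = posixpath.split(old)
        new_dir, new_name = posixpath.split(new)
        # A file whose basename changed is not a plain "analogue" of the old one.
        target = new_dir if old_name == new_name else None
        candidates.setdefault(old_dir, set()).add(target)
    occupied = {posixpath.dirname(path) for path in paths_after}
    return {
        old_dir: next(iter(targets))
        for old_dir, targets in candidates.items()
        if len(targets) == 1 and None not in targets and old_dir not in occupied
    }


def relocate(path, dir_renames):
    """Move a newly added path along with its (inferred) renamed directory."""
    directory, name = posixpath.split(path)
    return posixpath.join(dir_renames.get(directory, directory), name)


if __name__ == "__main__":
    renames_in_a = {"data/hello.txt": "greetings/hello.txt"}
    tree_a = ["greetings/hello.txt"]
    dir_renames = infer_directory_renames(renames_in_a, tree_a)
    print(relocate("data/goodbye.txt", dir_renames))
```

As soon as anything is left behind in data/, the sketch infers nothing and
the new file stays where it was added.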

However, I'm not sure that I personally would prefer that behavior. In
some cases you might actually WANT data/goodbye.txt, and in some other
cases a conflict might be more appropriate. In any case, I would rather
the SCM do the simple and predictable thing (which I consider to be
creating data/goodbye.txt) rather than be clever and wrong (even if it's
only wrong a small percentage of the time).

In short, git doesn't do what you expect, but I'm not convinced that
it's a bug or lack of feature, and not simply a difference in desired
behavior.

-Peff


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:13                                                 ` Jeff Licquia
@ 2006-10-20 23:05                                                   ` Robert Collins
  2006-10-20 23:15                                                     ` Robert Collins
  2006-10-20 23:24                                                     ` Jakub Narebski
  2006-10-20 23:59                                                   ` Linus Torvalds
  1 sibling, 2 replies; 1752+ messages in thread
From: Robert Collins @ 2006-10-20 23:05 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

On Fri, 2006-10-20 at 18:13 -0400, Jeff Licquia wrote:
> 
> All in all, not ideal, but it seems bzr handles this better than bk.
> Certainly, bzr doesn't silently drop anyone's changes, at least.  I
> suspect that bzr could improve its handling of this use case, but not,
> I'm sure, to Linus's specifications; some of the fun and games does
> seem to come from the use of file IDs. 

We have a few features we're focusing on right now, but coming shortly
after them we hope to address parallel imports [which this is a case of]
better than we do now. I have a number of ideas, and I'm sure other devs
do too, about the right way to solve this. Fundamentally, I think using
1-1 mapped path ids [which can be considered a memo of the origin commit
id + path] of a path is not sufficiently rich a representation of what
happens to paths - there is a dual that you can convert to, which is
identity via ancestry traversal - each path has N <= M parent paths in
each of M parent revisions. Our current path ids can only represent the
case where when you traverse to the start of history this graph has a
single tail (that is, that a single file must start at one and only one
place). The graph however is not intrinsically limited in this way -
files can split and join, and we should be able to represent this more
fully.

I'll happily acknowledge that we don't need fileids per se: tracking
renames can be done without a memo of the origin.

However, I'm still convinced that tracking the user intention of renames
leads to a slicker system than renames via inference. My off the cuff
list of corner cases is:

 - change file, rename: rename the changed file/change the renamed file.
 - change file, remove: conflict on removal/text change
 - add path to dir, rename the dir: move the current contents of the
directory/add the new path to the renamed directory.
 - move paths out of a directory, rename the directory: leave the paths
moved out where they were moved to/move the paths from wherever their
new location is.
 - introduce path A + rename old A to B , change path A: change path
B/rename A to B and introduce the new A.

All these cases work roughly along the form of 'have two branches, do
one action in one, one in the other: merge other to one/merge one to
other'. I haven't yet seen an inference system get all these right.

There are other, more complex cases, but I think they all boil down to
one of those primitives to all intents and purposes.

Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:05                                                   ` Robert Collins
@ 2006-10-20 23:15                                                     ` Robert Collins
  2006-10-20 23:39                                                       ` Jeff Licquia
  2006-10-20 23:24                                                     ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Robert Collins @ 2006-10-20 23:15 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 845 bytes --]

On Sat, 2006-10-21 at 09:05 +1000, Robert Collins wrote:
> On Fri, 2006-10-20 at 18:13 -0400, Jeff Licquia wrote:
> > 
> > All in all, not ideal, but it seems bzr handles this better than bk.
> > Certainly, bzr doesn't silently drop anyone's changes, at least.  I
> > suspect that bzr could improve its handling of this use case, but not,
> > I'm sure, to Linus's specifications; some of the fun and games does
> > seem to come from the use of file IDs. 
...
> However, I'm still convinced that tracking the user intention of renames
> leads to a slicker system than renames via inference. My off the cuff
> list of corner cases is:

I meant to add, that I think inference is a great tool to use as an
adjunct to whatever explicit data one can capture.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]


* Re: VCS comparison table
  2006-10-20 11:50                                           ` Jakub Narebski
  2006-10-20 13:26                                             ` Jakub Narebski
@ 2006-10-20 23:19                                             ` Junio C Hamano
  2006-10-21  0:07                                               ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-20 23:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

>> The lack of parents ordering in Git is directly connected with
>> fast-forwarding.
>
> There are exactly _two_ places where Git treats first parent specially 
> (correct me if I'm wrong).

I am not bold enough to say _exactly_ N places, but you missed
at least one more important one.  Merge simplification favors
the earlier parents over later ones.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:05                                                   ` Robert Collins
  2006-10-20 23:15                                                     ` Robert Collins
@ 2006-10-20 23:24                                                     ` Jakub Narebski
  2006-10-20 23:28                                                       ` Petr Baudis
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-20 23:24 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Robert Collins wrote:

> However, I'm still convinced that tracking the user intention of renames
> leads to a slicker system than renames via inference.

Well, there was an idea (abandoned for now) of an rr2-cache, a cache of how
renames were resolved during merge conflict resolution.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:24                                                     ` Jakub Narebski
@ 2006-10-20 23:28                                                       ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-20 23:28 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Dear diary, on Sat, Oct 21, 2006 at 01:24:51AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Robert Collins wrote:
> 
> > However, I'm still convinced that tracking the user intention of renames
> > leads to a slicker system than renames via inference.
> 
> Well, there was (abandoned for now) idea of rr2-cache, the cache of how
> renames were resolved during merge conflict resolving.

Is that really relevant? It rather seems like something similar to rerere,
which is handy, but only if you are the one who is actually supposed to
have a clue how it should be resolved; the caches aren't replicated on clones.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:40                                           ` Petr Baudis
@ 2006-10-20 23:33                                             ` Aaron Bentley
  0 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-20 23:33 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2835 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
> Dear diary, on Fri, Oct 20, 2006 at 05:34:39PM CEST, I got a letter
> where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Jakub Narebski wrote:
>>> Aaron Bentley wrote:
>>>> In Bazaar bundles, the text of the diff is an integral part of the data.
>>>> It is used to generate the text of all the files in the revision.
>>>
>>> I thought that the diff was combined diff of changes.
>> It is.  It's a description of how to produce revision X given revision
>> Y, where Y is the last-merged mainline revision.
> 
> Aha, so by default a bundle can carry just a _single_ revision?

No, bundles contain 1 or more revisions.  They contain all the ancestors
of X that are not ancestors of Y.

Only the diff from X to Y is shown, but the diffs for all other
revisions are present in the MIME-encoded section.

Consider these four revisions in a straight-line ancestry: a, b, c, d.
'a' is a common ancestor.  b, c and d are the revisions that are missing
from the target repository.

A default bundle will contain

metadata for d
diff from a -> d in plaintext
metadata for c
diff from b -> c in MIME encoding
metadata for b
diff from a -> b in MIME encoding

To install b, the diff for a->b is applied to a.  To install c, the diff
for b->c is applied to b.  To install d, the diff for a -> d is applied
to a.

Doing a diff from a -> d instead of from c -> d introduces some
redundancy, of course.  But we do that because we want an overview diff.
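
The install order for the a, b, c, d example can be modelled in a few
lines of Python. This is a toy sketch and not the Bazaar bundle code: a
"diff" here is just a list of (line_index, old_line, new_line)
replacements, the file contents are made up, and apply_diff() and
install_bundle() are hypothetical names.

```python
def apply_diff(base, diff):
    """Apply a toy diff: a list of (line_index, old_line, new_line) replacements."""
    lines = list(base)
    for index, old, new in diff:
        if lines[index] != old:
            raise ValueError("diff does not apply to this base")
        lines[index] = new
    return lines


def install_bundle(store, bundle):
    """Install every revision in the bundle; entries are listed newest first."""
    for revision, base, diff in reversed(bundle):
        store[revision] = apply_diff(store[base], diff)
    return store


# (revision, base, diff) in the order the bundle lists them: d, c, b.
BUNDLE = [
    ("d", "a", [(0, "hello", "Hello, world")]),  # overview diff a -> d
    ("c", "b", [(0, "Hello", "Hello world")]),   # b -> c
    ("b", "a", [(0, "hello", "Hello")]),         # a -> b
]


if __name__ == "__main__":
    # The target repository already has the common ancestor 'a'.
    store = install_bundle({"a": ["hello"]}, BUNDLE)
    for revision in "abcd":
        print(revision, store[revision])
```

Because the diff that produces 'd' is based on 'a', the overview diff is
the only one in the bundle that yields 'd'; in the model, applying it to
'c' instead fails the base check.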

> That doesn't sound right either, because then it wouldn't make sense to
> talk about "combined" or "simple" diffs. So I guess sending a bundle
> really is taking n revisions at your side, bundling them to a single
> diff and when the other side takes it, it will result in a single
> revision?

No, it copies the revisions verbatim, and we are careful to avoid data loss.

> Hmm, but that doesn't sound right either, that's certainly no revolting
> functionality and seems to be in contradiction with previous bundles
> description. But if it doesn't squash the changes, I don't see how the
> combined diff can be integral part of the data. Sorry, I don't get it.

It's because there's no other diff in the bundle that produces 'd'.

>> I've attached an example of what a combined patch-by-patch bundle looks
>> like.
> 
> But that's the one there's no UI to select? Or where is the combined
> diff?

That is the one that doesn't have UI to select it.  I've attached a
normal bundle for comparison.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOVzR0F+nu1YWqI0RAkACAJ4z2SJZgelZLfhoFKhEZbmvRIXMjACfag+h
6j+5vvIeHt7xMZOvp6CUcPk=
=33G4
-----END PGP SIGNATURE-----

[-- Attachment #2: hello-world-default.patch --]
[-- Type: text/x-patch, Size: 1884 bytes --]

# Bazaar revision bundle v0.8
#
# message:
#   Added 'world'
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:30:21.903000116 -0400

=== added directory  // file-id:TREE_ROOT
=== added file world // file-id:world-20061020152929-12bknd8mm9mx48as-1
--- /dev/null
+++ world
@@ -0,0 +1,1 @@
+Hello, world

# revision id: abentley@panoramicfeedback.com-20061020153021-b5fcea14e9cd2b34
# sha1: 6d553e72158aaa76c258d98c15cd24922d171cd9
# inventory sha1: 64af82c4d81d9d6ad4f33fc734d32c2a1eaa0df5
# parent ids:
#   abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# base id: null:
# properties:
#   branch-nick: bar

# message:
#   Capitalized
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:51.953999996 -0400

=== modified file world // encoding:base64
LS0tIHdvcmxkCisrKyB3b3JsZApAQCAtMSwxICsxLDEgQEAKLWhlbGxvCitIZWxsbwoK

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 152951-10cff5ff5a51e9a2
# revision id: abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# sha1: f7b79934bc3b0a944e35168b5df6b106c5b29ebf
# inventory sha1: 1400d56451752300cc31c9c94ff7ee2188e8ef8c
# parent ids:
#   abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# properties:
#   branch-nick: bar

# message:
#   initial commit
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:35.536999941 -0400

=== added directory  // file-id:TREE_ROOT
=== added file world // file-id:world-20061020152929-12bknd8mm9mx48as-1 // enco
... ding:base64
LS0tIC9kZXYvbnVsbAorKysgd29ybGQKQEAgLTAsMCArMSwxIEBACitoZWxsbwoK

# revision id: abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# sha1: 0728f761b891b257f0a71e2e360799eec080cd21
# inventory sha1: e52e030ea40f6bf5da78f4e8eb8efcd072b0930a
# properties:
#   branch-nick: bar



* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-18 23:53 [ANNOUNCE] GIT 1.4.3 Junio C Hamano
  2006-10-20 12:31 ` Horst H. von Brand
  2006-10-20 13:26 ` Peter Eriksen
@ 2006-10-20 23:35 ` Junio C Hamano
  2006-10-21  0:14   ` Linus Torvalds
                     ` (2 more replies)
  2 siblings, 3 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-20 23:35 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

Junio C Hamano <junkio@cox.net> writes:

>  - git-diff paginates its output to the tty by default.  If this
>    irritates you, using LESS=RF might help.

I am considering the following to address irritation some people
(including me, actually) are experiencing with this change when
viewing a small (or no) diff.  Any objections?

diff --git a/pager.c b/pager.c
index dcb398d..8bd33a1 100644
--- a/pager.c
+++ b/pager.c
@@ -50,7 +50,7 @@ void setup_pager(void)
 	close(fd[0]);
 	close(fd[1]);
 
-	setenv("LESS", "-RS", 0);
+	setenv("LESS", "FRS", 0);
 	run_pager(pager);
 	die("unable to execute pager '%s'", pager);
 	exit(255);


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:15                                                     ` Robert Collins
@ 2006-10-20 23:39                                                       ` Jeff Licquia
  0 siblings, 0 replies; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-20 23:39 UTC (permalink / raw)
  To: Robert Collins; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 09:15 +1000, Robert Collins wrote:
> I meant to add, that I think inference is a great tool to use as an
> adjunct to whatever explicit data one can capture.

If you ask me, that's the most interesting idea in this whole thread.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:13                                                 ` Jeff Licquia
  2006-10-20 23:05                                                   ` Robert Collins
@ 2006-10-20 23:59                                                   ` Linus Torvalds
  2006-10-21  1:26                                                     ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-20 23:59 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Jan Hudec, bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Jeff Licquia wrote:
> 
> After this conflict is resolved, merging from b causes conflicts, while
> merging from c appears to work fine.  This continues until b merges from
> a (and resolves a conflict in a similar manner to a), at which time
> merging/pulling works as you'd expect between the branches.  Whenever b
> is marked as conflicting before it merges from a, bzr preserves b's
> changes by moving b's modified file.

This sounds somewhat like what I think BK did. I'm not sure if BK actually 
marked it as a conflict or whether BK just warned about "changes to 
deleted file" or something similar, but it didn't entirely _silently_ 
throw them away.

But I hope this shows some of the basic problems.

The much more _serious_ problem of "file identity" tracking is actually 
that you can't track partial file movement or file copies sanely. The 
thing is, tracking things at file boundaries is fundamentally a 
broken notion, simply because _code_ doesn't get done at file boundaries.

Both of these are things that git can actually do. Admittedly it does not do 
that in any _released_ version, so you'd have to work with the development 
branch, and it's a fairly early thing, but currently it can actually 
notice that our "revision.c" file largely came from the "rev-list.c" file 
that still exists!

And btw, that's not just some random feature that happened to get 
implemented last week. Yes, it actually _did_ get implemented last week, 
but this was something I outlined when I started git in April of last 
year, and tried to explain to people WHY TRACKING FILE ID'S ARE WRONG!

You can find me explaining these things to people in April-2005, which 
should tell you something: the initial revision of "git" was on Thursday, 
April 7. So the lack of file identity tracking has been controversial from 
the very beginning, but I was right then, and I'm right now.

Because the _fact_ is, that as long as you track stuff on a file basis, 
you're _never_ going to be able to do the things that git already does, 
and that are very natural.

Here's the real-world example of something that git CAN DO TODAY:

 - we used to have a file called "rev-list.c", which did a lot of the 
   commit history revision traversal, and is the source of the git command 
   "git rev-list".

 - I (and others) extended it a lot, and turned it into a more generic 
   library interface, so that other commands could traverse the commit 
   graph on their own, rather than forking and executing "git-rev-list" 
   and piping the output between them.

 - as a result, the old "rev-list.c" still exists (except it was renamed 
   to "builtin-rev-list.c" since it's now a builtin command to the main 
   "git" binary). 

 - HOWEVER, a lot of the actual code got split into the library file, 
   called "revision.c", which contains the real smarts of the program.

See? There was a file rename involved (rev-list.c => builtin-rev-list.c), 
but that actually happened after a lot of the really _interesting_ code 
had been excised from that file, and put into the new internal library 
file (revision.c).

Now, as a result, in many ways the rename is _much_ less interesting than 
the question about the history of the code in "revision.c" (because that's 
really some very core code). And that was never a rename at all. That was 
just a file create, where a lot of the contents happened to come from a 
file that continued to exist.

Wouldn't you want "annotate" to be able to follow this kind of data 
movement? Notice how there is no "file" that moved at all. Only code that 
moved between files.
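
To make "code that moved between files" concrete, here is a small
Python sketch that looks for runs of lines in a new file that also appear
in another file. It is only an illustration of content-based matching
using the standard difflib module, not the algorithm pickaxe uses; the
function name, the min_lines threshold and the sample lines are all made
up for the example.

```python
from difflib import SequenceMatcher


def blocks_from_other_file(new_lines, other_lines, min_lines=2):
    """Return (other_start, new_start, length) for each run of at least
    min_lines identical lines shared by the two files."""
    matcher = SequenceMatcher(None, other_lines, new_lines, autojunk=False)
    return [
        (match.a, match.b, match.size)
        for match in matcher.get_matching_blocks()
        if match.size >= min_lines
    ]


if __name__ == "__main__":
    old_file = [
        "int verbose;",
        "void walk(void)",
        "{",
        "  traverse();",
        "}",
        "int main(void)",
    ]
    new_file = [
        '#include "revision.h"',
        "void walk(void)",
        "{",
        "  traverse();",
        "}",
    ]
    for other_start, new_start, length in blocks_from_other_file(new_file, old_file):
        print(f"{length} lines at new:{new_start} match old:{other_start}")
```

No file was renamed in this example; the new file was simply created, and
the match is found purely from the contents.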

I tell you: as long as you work with "file ID's", you'll always be 
inferior. You'll never be able to see that some code was copied 
_partially_ from one file into another. You'll never be able to see an 
important function moving between file boundaries.

Unless you work with "git", that is. Because git isn't so _stupid_ as to 
think that file boundaries matter. Git knows better. The only thing that 
matters is the actual _data_, and file boundaries are just one way of 
delimiting that data.

Just try it out. Get the "next" branch of the git repository (that's the 
"stable development" branch in git.git - ie it's going to be in the next 
release and is expected to work, unlike some of the more "experimental 
development" that is in the "pu" branch - pu = proposed updates), compile 
it, and run

	git pickaxe -C revision.c | less -S

and marvel. Marvel at my shining intelligence (and the small matter of 
programming, which was all done by Junio, but I'm taking all the credit 
_anyway_, because *dammit* I talked about this last year when people 
didn't understand! And besides, I always take all the credit regardless, 
so what are you whining about? Get off my back!).

More seriously, Junio really did a kick-ass job. I really had nothing at 
all to do with it, and deserve no real credit. But I _did_ foresee it, and 
yes, it really is about the fact that git tracks _contents_.

As somebody smarter than I has said (*): "I'm always right, but this time 
I'm even more right than usual".

			Linus

(*) Just kidding. It was me. Of course.


* Re: VCS comparison table
  2006-10-20 23:19                                             ` Junio C Hamano
@ 2006-10-21  0:07                                               ` Linus Torvalds
  2006-10-21  1:09                                                 ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21  0:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> I am not bold enough to say _exactly_ N places, but you missed
> at least one more important one.  Merge simplification favors
> the earlier parents over later ones.

Which is probably slightly inconsistent (although I seriously doubt 
anybody really cares - when we simplify a merge we obviously do it 
exactly because the parents are identical wrt the files we are following).

Most of the rest of commit traversal tends to have a rule that says 
"traverse youngest parent first", simply by virtue of the fact that 
revlist() normally pops off the queue in date order. But Jakub is 
certainly correct that when we do "^" we just take the first one. 
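To make the two rules concrete, here is a minimal Python sketch. It is not
git's code: the Commit class and the toy history are made up for
illustration. The walk pops commits off a priority queue in date order, so
the youngest parent is visited first wherever it sits in the parent list,
while "^" simply takes parent number one.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Commit:
    name: str
    date: int                      # commit timestamp, larger = younger
    parents: list = field(default_factory=list)


def first_parent(commit):
    """What "commit^" means: take the first listed parent."""
    return commit.parents[0]


def date_order(tip):
    """Walk history by always popping the youngest queued commit."""
    queue = [(-tip.date, tip.name, tip)]   # negate date: heapq is a min-heap
    seen = {tip.name}
    order = []
    while queue:
        _, _, commit = heapq.heappop(queue)
        order.append(commit.name)
        for parent in commit.parents:
            if parent.name not in seen:
                seen.add(parent.name)
                heapq.heappush(queue, (-parent.date, parent.name, parent))
    return order


# Toy history (hypothetical): the merge lists its older parent first.
root = Commit("root", 1)
old_side = Commit("old-side", 2, [root])
young_side = Commit("young-side", 3, [root])
merge = Commit("merge", 4, [old_side, young_side])
```

With this history, first_parent(merge) is the older side, while
date_order(merge) visits the younger side before it.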

And "gitweb" does consider the first one special, since it shows diffs 
against that one (although I've argued that it probably shouldn't, and 
that there should be some way to show branches against arbitrary parents).

So we're a bit confused. Not that it probably really ever matters. We 
might as well say that parent order is random, and that our "random number 
generators" are pretty damn lazy ;)

		Linus


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-20 23:35 ` Junio C Hamano
@ 2006-10-21  0:14   ` Linus Torvalds
  2006-10-21  0:22     ` Petr Baudis
  2006-10-21  2:12     ` Al Viro
  2006-10-21  0:47   ` Nicolas Pitre
  2006-10-23  0:53   ` prune/prune-packed J. Bruce Fields
  2 siblings, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21  0:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, linux-kernel



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> I am considering the following to address irritation some people
> (including me, actually) are experiencing with this change when
> viewing a small (or no) diff.  Any objections?

Not from me. I use "git diff" just to check that the tree is empty, and 
the fact that it now throws me into an empty pager is irritating.

That said, "LESS=FRS" doesn't really help that much. It still clears the 
screen. Using "LESS=FRSX" fixes that, but the alternate display sequence 
is actually nice _if_ the pager is used.

Still, I think I'd prefer FRSX as the default.
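
A small sketch of what "FRSX as the default" could look like: set LESS
only when the user has not set it, so a personal setting still wins. This
is an illustration in Python, not git's code, and the name pager_env is
made up. The flag meanings are the ones documented in less(1): F quits if
the output fits on one screen, R passes colour escapes through, S chops
long lines instead of wrapping them, and X skips the termcap init/deinit
strings (which is what turns off the alternate screen).

```python
import os

DEFAULT_LESS = "FRSX"


def pager_env(environ=None):
    """Return a copy of the environment with LESS defaulted for the pager."""
    env = dict(os.environ if environ is None else environ)
    env.setdefault("LESS", DEFAULT_LESS)   # never override the user's choice
    return env
```

Anybody who prefers the alternate screen would keep it simply by exporting
their own LESS value.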

		Linus


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  0:14   ` Linus Torvalds
@ 2006-10-21  0:22     ` Petr Baudis
  2006-10-21  0:31       ` Linus Torvalds
                         ` (2 more replies)
  2006-10-21  2:12     ` Al Viro
  1 sibling, 3 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-21  0:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, linux-kernel

> That said, "LESS=FRS" doesn't really help that much. It still clears the 
> screen. Using "LESS=FRSX" fixes that, but the alternate display sequence 
> is actually nice _if_ the pager is used.

Hmm, what terminal emulator do you use? The reasonable ones should
restore the original screen. At least xterm does, and I *think*
gnome-terminal does too (although I'm too lazy to boot up my notebook
and confirm).

(I personally consider alternate screen an abomination. It would be so
nice if the terminal emulators would just make it optional.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  0:22     ` Petr Baudis
@ 2006-10-21  0:31       ` Linus Torvalds
  2006-10-21  9:53       ` Andreas Schwab
  2006-10-22 21:09       ` Anders Larsen
  2 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21  0:31 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, git, linux-kernel



On Sat, 21 Oct 2006, Petr Baudis wrote:
>
> > That said, "LESS=FRS" doesn't really help that much. It still clears the 
> > screen. Using "LESS=FRSX" fixes that, but the alternate display sequence 
> > is actually nice _if_ the pager is used.
> 
> Hmm, what terminal emulator do you use? The reasonable ones should
> restore the original screen. At least xterm does, and I *think*
> gnome-terminal does too (although I'm too lazy to boot up my notebook
> and confirm).

Not xterm, at least.

Not gnome-terminal either, for that matter.

I just tried.

	LESS=FRS git diff

clears the screen and leaves the thing at the end.

	LESS=FRSX git diff

works fine, but for people who _like_ the alternate screens (and I do, 
once I really use a pager) it also disables the alternate screen.

It might depend on the termcap, of course. I'm running FC5.

		Linus


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-20 23:35 ` Junio C Hamano
  2006-10-21  0:14   ` Linus Torvalds
@ 2006-10-21  0:47   ` Nicolas Pitre
  2006-10-23  0:53   ` prune/prune-packed J. Bruce Fields
  2 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-21  0:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, linux-kernel

On Fri, 20 Oct 2006, Junio C Hamano wrote:

> Junio C Hamano <junkio@cox.net> writes:
> 
> >  - git-diff paginates its output to the tty by default.  If this
> >    irritates you, using LESS=RF might help.
> 
> I am considering the following to address irritation some people
> (including me, actually) are experiencing with this change when
> viewing a small (or no) diff.  Any objections?

I think this is an excellent idea.


Nicolas


* Re: Signed git-tag doesn't find default key
  2006-10-20 19:21   ` Andy Parkins
@ 2006-10-21  0:52     ` Horst H. von Brand
  2006-10-21  7:44       ` Andy Parkins
  0 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-21  0:52 UTC (permalink / raw)
  To: Andy Parkins; +Cc: Linus Torvalds, git

Andy Parkins <andyparkins@gmail.com> wrote:

[...]

> I'm going to advocate my change of only searching on the email address
> for finding the key - there shouldn't be two keys with the same email
> address anyway, so there shouldn't be a danger of ambiguity of key.

There very well might be... say you have a key for signing git stuff,
another one for emailing, another one for signing RPMs you create, ... I
believe that is the idea of the GPG comment field, precisely.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513


* Re: VCS comparison table
  2006-10-21  0:07                                               ` Linus Torvalds
@ 2006-10-21  1:09                                                 ` Junio C Hamano
  2006-10-21  1:19                                                   ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-21  1:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Jakub Narebski

Linus Torvalds <torvalds@osdl.org> writes:

> And "gitweb" does consider the first one special, since it shows diffs 
> against that one (although I've argued that it probably shouldn't, and 
> that there should be some way to show branches against arbitrary parents)
>
> So we're a bit confused. Not that it probably really ever matters.

There is another one similar to the gitweb one you mentioned:
git-show --stat on a merge.  We deliberately chose to show the
difference from the first parent; it is called "showing the
changes the person who made this merge saw".


* Re: VCS comparison table
  2006-10-21  1:09                                                 ` Junio C Hamano
@ 2006-10-21  1:19                                                   ` Linus Torvalds
  2006-10-21  1:27                                                     ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21  1:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jakub Narebski



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> There is another one similar to the gitweb one you mentioned:
> git-show --stat on a merge.  We deliberately chose to show the
> difference from the first parent; it is called "showing the
> changes the person who made this merge saw".

Well, that one actually makes sense. It's just the stat from the previous 
state, after all, and it actually is done _together_ with the operation 
that causes the diffs.

So that one I don't think you can really even claim.

Also, it's not even the "first parent". Look closer. It's literally 
"previous state", because it does so for a fast-forward too. It's from 
ORIG_HEAD.

		Linus


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:59                                                   ` Linus Torvalds
@ 2006-10-21  1:26                                                     ` Junio C Hamano
  2006-10-21  8:40                                                       ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-21  1:26 UTC (permalink / raw)
  To: git; +Cc: Jan Hudec, bazaar-ng, Jeff Licquia, Linus Torvalds,
	Jakub Narebski

Linus Torvalds <torvalds@osdl.org> writes:

> Both of these things that git can actually do. Admittedly it does not do 
> that in any _released_ version, so you'd have to work with the development 
> branch, and it's a fairly early thing, but currently it can actually 
> notice that our "revision.c" file largely came from the "rev-list.c" file 
> that still exists!
>
> And btw, that's not just some random feature that happened to get 
> implemented last week. Yes, it actually _did_ get implemented last week, 
> but this was something I outlined when I started git in April of last 
> year, and tried to explain to people WHY TRACKING FILE ID'S ARE WRONG!
>
> You can find me explaining these things to people in April-2005, which 
> should tell you something: the initial revision of "git" was on Thursday, 
> April 7. So the lack of file identity tracking has been controversial from 
> the very beginning, but I was right then, and I'm right now.

For people new to the list, the message is:

    http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217

I think I've quoted this link at least three times on this list;
I consider it _the_ most important message in the whole list
archive.  If you haven't read it, read it now, print it out,
read it three more times, place it under the pillow before you
sleep tonight.  Repeat that until you can recite the whole
message.  It should not take more than a week.

To me, personally, achieving that ideal "drill down" dream was
one of the more important goals of my involvement in this
project.  I did diffcore-rename to fill some part of the dream,
and then diffcore-pickaxe to fill some other part.  Neither was
even close.  I think the recent round of pickaxe is getting much
closer.


* Re: VCS comparison table
  2006-10-21  1:19                                                   ` Linus Torvalds
@ 2006-10-21  1:27                                                     ` Junio C Hamano
  2006-10-21  1:55                                                       ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-21  1:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Fri, 20 Oct 2006, Junio C Hamano wrote:
>> 
>> There is another one similar to the gitweb one you mentioned:
>> git-show --stat on a merge.  We deliberately chose to show the
>> difference from the first parent; it is called "showing the
>> changes the person who made this merge saw".
>
> Well, that one actually makes sense. It's just the stat from the previous 
> state, after all, and it actually is done _together_ with the operation 
> that causes the diffs.
>
> So that one I don't think you can really even claim.
>
> Also, it's not even the "first parent". Look closer. It's literally 
> "previous state", because it does so for a fast-forward too. It's from 
> ORIG_HEAD.

I was not talking about "git pull".  I was talking about "git
show".


* Re: VCS comparison table
  2006-10-21  1:27                                                     ` Junio C Hamano
@ 2006-10-21  1:55                                                       ` Linus Torvalds
  2006-10-21  8:32                                                         ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21  1:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> I was not talking about "git pull".  I was talking about "git
> show".

Duh. I don't know why I misread that.

Yeah, that makes no sense at all. I _think_ "git show" should be the same 
thing as a single-entry "git log -p".

		Linus


* git-merge-recursive, was Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:57                                                         ` Linus Torvalds
@ 2006-10-21  2:03                                                           ` Johannes Schindelin
  2006-10-21  2:17                                                             ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-21  2:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Aaron Bentley, Jakub Narebski, Jan Hudec, bazaar-ng,
	Git Mailing List



On Fri, 20 Oct 2006, Linus Torvalds wrote:

> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> > 
> > Agreed.  We start by comparing BASE and OTHER, so all those comparisons
> > are in-memory operations that don't hit disk.  Only for files where BASE
> > and OTHER differ do we even examine the THIS version.
> 
> Git just slurps in all three trees. I actually think that the current 
> merge-recursive.c does it the stupid way (ie it expands all trees 
> recursively, regardless of whether it's needed or not), but I should 
> really check with Dscho, since I had nothing to do with that code.

AFAIR yes, it does the dumb thing, namely it does not take advantage of 
trees being identical when their SHA1s are identical.

Fixing that would be a _tremendous_ speed-up.
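
A rough sketch of the optimization, in Python and under loud assumptions:
this is a toy object model (the blob/tree helpers and paths_to_examine are
made up here, and the hashing is not git's real object format), not the
merge-recursive code. The only point is that when all three sides carry
the same id for a subtree, the walk can stop there without expanding it.

```python
import hashlib


def blob(data):
    """A blob is identified by the hash of its contents."""
    return ("blob", hashlib.sha1(data.encode()).hexdigest(), None)


def tree(entries):
    """A tree is identified by the hash of its (name, child id) pairs."""
    h = hashlib.sha1()
    for name in sorted(entries):
        h.update(f"{name}:{entries[name][1]};".encode())
    return ("tree", h.hexdigest(), entries)


def paths_to_examine(base, ours, theirs, prefix="", visited=None):
    """List paths where the three sides differ, skipping identical subtrees."""
    if visited is not None:
        visited.append(prefix or "/")   # record what the walk had to open
    ids = {node[1] if node else None for node in (base, ours, theirs)}
    if len(ids) == 1:
        return []                       # same id on all sides: same content
    sides = [node for node in (base, ours, theirs) if node]
    if not all(node[0] == "tree" for node in sides):
        return [prefix]                 # a differing blob (or blob/tree clash)
    names = sorted(set().union(*(node[2] for node in sides)))
    paths = []
    for name in names:
        children = [node[2].get(name) if node else None
                    for node in (base, ours, theirs)]
        paths += paths_to_examine(*children, prefix=f"{prefix}/{name}",
                                  visited=visited)
    return paths


# Toy trees (hypothetical): only main.c differs, docs/ is shared by all sides.
docs = tree({"a.txt": blob("a"), "b.txt": blob("b")})
base = tree({"docs": docs, "main.c": blob("v1")})
ours = tree({"docs": docs, "main.c": blob("v2")})
theirs = tree({"docs": docs, "main.c": blob("v1")})
```

Here the walk looks at docs once, sees three identical ids, and never
descends into it; only main.c is left to examine.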

> > > So recursive basically generates the matrix of similarity for the 
> > > new/deleted files, and tries to match them up, and there you have your 
> > > renames - without ever looking at the history of how you ended up where 
> > > you are.
> > 
> > So in the simple case, you compare unmatched THIS, OTHER and BASE files
> > to find the renames?
> 
> Right. Some cases are easy: if one of the branches only added files (which 
> is relatively common), that obviously cannot be a rename. So you don't 
> even have to compare all possible combinations - you know you don't have 
> renames from one branch to the other ;)
> 
> But I'm not even the authoritative person to explain all the details of the 
> current recursive merge, and I might have missed something. Dscho? 
> Fredrik? Anything you want to add?

Not me. Only that there is much potential for optimization (meaning 
performance, not the basic algorithm).

Ciao,
Dscho


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  0:14   ` Linus Torvalds
  2006-10-21  0:22     ` Petr Baudis
@ 2006-10-21  2:12     ` Al Viro
  2006-10-21  5:29       ` Junio C Hamano
  2006-10-21 14:29       ` Rene Scharfe
  1 sibling, 2 replies; 1752+ messages in thread
From: Al Viro @ 2006-10-21  2:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, linux-kernel

On Fri, Oct 20, 2006 at 05:14:39PM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 20 Oct 2006, Junio C Hamano wrote:
> > 
> > I am considering the following to address irritation some people
> > (including me, actually) are experiencing with this change when
> > viewing a small (or no) diff.  Any objections?
> 
> Not from me. I use "git diff" just to check that the tree is empty, and 
> the fact that it now throws me into an empty pager is irritating.

Speaking of irritations...  There is a major (and AFAICS fixable)
suckitude in git-cherry.  Basically, what it does is
	* use git-rev-list to find commits on our branches
	* do git-diff-tree -p for each commit
	* do git-patch-id on each delta
	* compare sets.
For one thing, there are better ways to do set comparison than creating
a file for each element in one set and going through another checking
if corresponding files exist (join(1) and sort(1) or just use perl hashes).
That one is annoying on journalling filesystems (a lot of files being
created, read and removed - fsckloads of disk traffic), but it's actually
not the worst problem.

Far more annoying is that we keep recalculating git-diff-tree -p | git-patch-id
again and again; try to do git cherry on a dozen short branches forked at
2.6.18 and you'll see the damn thing recalculated a dozen times for
each commit from 2.6.18 to current.  It's not cheap, to put it mildly.

git-rev-list ^v2.6.18 HEAD|while read i; do git-diff-tree -p $i; done |git-patch-id >/dev/null

out of hot cache on 2GHz amd64 box (Athlon 64 3400+) takes 3 minutes of
wall time.  Repeat that for each branch and it's starting to get old very
fast.

Note that we are calculating a function of commit; it _never_ changes.
Even if we don't just calculate and memorize it at commit time, a cache
somewhere under .git would speed things up a lot...
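
A minimal sketch of both ideas, in Python rather than shell, and not the
actual git-cherry code: PatchIdCache, not_upstream and the JSON cache file
are all made up for illustration. The expensive step is passed in as a
callable; in practice it would run the git-diff-tree -p | git-patch-id
pipeline shown above for one commit. The set comparison is done with an
in-memory set instead of one file per element.

```python
import json
import os


class PatchIdCache:
    """Memoize commit -> patch-id; the mapping never changes for a commit."""

    def __init__(self, path, compute):
        self.path = path            # hypothetical location, e.g. under .git
        self.compute = compute      # callable: commit id -> patch-id string
        self.ids = {}
        if os.path.exists(path):
            with open(path) as f:
                self.ids = json.load(f)

    def patch_id(self, commit):
        if commit not in self.ids:
            self.ids[commit] = self.compute(commit)   # the expensive part
        return self.ids[commit]

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.ids, f)


def not_upstream(cache, ours, upstream):
    """Commits in `ours` whose patch-id matches nothing in `upstream`."""
    upstream_ids = {cache.patch_id(c) for c in upstream}   # in-memory set
    return [c for c in ours if cache.patch_id(c) not in upstream_ids]
```

With a cache like this, running the comparison for a dozen branches forked
from the same point computes each upstream patch-id once, and a saved cache
file carries the results over to the next run.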


* Re: git-merge-recursive, was Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  2:03                                                           ` git-merge-recursive, was " Johannes Schindelin
@ 2006-10-21  2:17                                                             ` Junio C Hamano
  2006-10-22 21:04                                                               ` [PATCH] threeway_merge: if file will not be touched, leave it alone Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-21  2:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Fri, 20 Oct 2006, Linus Torvalds wrote:
>
>> Git just slurps in all three trees. I actually think that the current 
>> merge-recursive.c does it the stupid way (ie it expands all trees 
>> recursively, regardless of whether it's needed or not), but I should 
>> really check with Dscho, since I had nothing to do with that code.
>
> AFAIR yes, it does the dumb thing, namely it does not take advantage of 
> trees being identical when their SHA1s are identical.
>
> This will be a _tremendous_ speed-up.

While we are talking about merge-recursive, I could use some
help from somebody familiar with merge-recursive to complete the
read-tree changes Linus mentioned early this month.

The issue is that we would want to remove one verify_absent()
call in unpack-tree.c:threeway_merge().  When read-tree decides
to leave higher stages around, we do not want it to check if the
merge could clobber a working tree file, because having an
unrelated file at the same path in the working tree sometimes is
and sometimes is not a conflict, depending on the outcome of the
merge, and that part of the code does not _know_ the outcome
yet.

What this means is that we would need to have the equivalent
check in the merge strategy that uses read-tree for three-way
merge when we remove this overcautious safety check from
read-tree.  I've adjusted merge-one-file to do so, but not many
people use 'resolve' strategy these days, and we would need the
matching change in merge-recursive.

If you are interested, you can see the details in commit 0b35995.


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  2:12     ` Al Viro
@ 2006-10-21  5:29       ` Junio C Hamano
  2006-10-21  5:40         ` Al Viro
  2006-10-21 14:29       ` Rene Scharfe
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-21  5:29 UTC (permalink / raw)
  To: Al Viro; +Cc: git

Al Viro <viro@ftp.linux.org.uk> writes:

> Speaking of irritations...  There is a major (and AFAICS fixable)
> suckitude in git-cherry.  Basically, what it does is...

Yeah, that sucks big time.  I never realized there are people
who still are using it, though. git-format-patch used to use it,
but the version was retired exactly five months ago, and there
are no in-tree users anymore.

I guess we could separate out the revision filtering logic in
builtin-log.c:cmd_format_patch() and implement git-cherry as a
new built-in.


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  5:29       ` Junio C Hamano
@ 2006-10-21  5:40         ` Al Viro
  0 siblings, 0 replies; 1752+ messages in thread
From: Al Viro @ 2006-10-21  5:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Oct 20, 2006 at 10:29:37PM -0700, Junio C Hamano wrote:
> Al Viro <viro@ftp.linux.org.uk> writes:
> 
> > Speaking of irritations...  There is a major (and AFAICS fixable)
> > suckitude in git-cherry.  Basically, what it does is...
> 
> Yeah, that sucks big time.  I never realized there are people
> who still are using it, though. git-format-patch used to use it,
> but the version was retired exactly five months ago, and there
> are no in-tree users anymore.

Huh?  If you have a saner way to do reordering/changeset-by-changeset
rebasing of branches...  git-cherry followed by selective cherry-pick
works and is much more convenient than messing with implementing what
I need via git-am and shitloads of editing...


* Re: VCS comparison table
  2006-10-19 20:47                                             ` Linus Torvalds
@ 2006-10-21  5:49                                               ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-21  5:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> For example, while git now does "annotate" (or "blame"), it's not 
> lightning fast, and I simply don't care. Doing a
>
> 	git blame kernel/sched.c
>
> takes about three seconds for me, and that's on a pretty good machine (and 
> on the kernel tree, which for me is always in the cache ;).

ll.6041-6091 of that file are blamed to arch/ia64/kernel/domain.c
by pickaxe -C (attributed to commit 2.6.12-rc2) while blame says
they are brought in by commit 9c1cfa, which says "Move the ia64
domain setup code to the generic code".  I am slowly realizing
that comparing the output from blame and pickaxe might be a good
way to study the project history.


* Re: Signed git-tag doesn't find default key
  2006-10-21  0:52     ` Horst H. von Brand
@ 2006-10-21  7:44       ` Andy Parkins
  0 siblings, 0 replies; 1752+ messages in thread
From: Andy Parkins @ 2006-10-21  7:44 UTC (permalink / raw)
  To: git

On Saturday 2006, October 21 01:52, Horst H. von Brand wrote:

> There very well might be... say you have a key for signing git stuff,
> another one for emailing, another one for signing RPMs you create, ... I
> believe that is the idea of the GPG comment field, precisely.

Either way, you're arguing for the fault being with Git - which has no notion 
of comment fields and so won't find the key anyway.

Andy

-- 
Dr Andrew Parkins, M Eng (Hons), AMIEE
andyparkins@gmail.com


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 14:56                                       ` Jakub Narebski
  2006-10-20 15:34                                         ` Aaron Bentley
@ 2006-10-21  7:56                                         ` Matthieu Moy
  2006-10-21  8:36                                           ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-21  7:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>> It's my understanding that most changes discussed on lkml are provided
>> as a series of patches.  Bazaar bundles are intended as a direct
>> replacement for patches in that use case.
>
> As _series_ of patches. You have git-format-patch + git-send-email
> to format and send them, git-am to apply them (as patches, not as branch).
>
> I was under an impression that user sees only mega-patch of all the
> revisions in bundle together, and rest is for machine consumption only.

Nothing prevents you from using series of bundles.

A bundle for a single revision looks like a patch with a few comments
on top and bottom. _If_ you have several revisions in your patch, you
get the diff as human readable, and the intermediate revisions as
MIME-encoded.

For big changes, people do send several bundles.

So, a bundle is a direct replacement for a patch, not for series of
patches.

-- 
Matthieu


* Re: VCS comparison table
  2006-10-17 12:07                 ` Sean
@ 2006-10-21  8:27                   ` Jakub Narebski
  2006-10-21  8:48                     ` Erik Bågfors
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:27 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Sean wrote:

> On Tue, 17 Oct 2006 13:45:31 +0200
> Jakub Narebski <jnareb@gmail.com> wrote:
> 
>> Git cannot do that remotely (with exception of git-tar-tree/git-archive 
>> which has --remote option), yet. But you can get contents of a file 
>> (with "git cat-file -p [<revision>:|:<stage>:]<filename>"), list 
>> directory (with "git ls-tree <tree-ish>") and compare files or 
>> directories (git diff family of commands) without need for working 
>> directory.
> 
> Interesting, I didn't know about the --remote option.  So in fact as long
> as the remote has enabled upload-tar then anyone can do a "light
> checkout". 

Not exactly. "Light checkout" (aka "lazy one-branch clone") in bzr
contains also info about the repository it came from, and has some
metadata that you can commit to it locally. git tar-tree --remote
just gets a snapshot.

> However, it appears that kernel.org for instance doesn't enable this
> feature. 

One can get a snapshot from gitweb... if gitweb is new enough and
has this feature enabled (it is enabled by default). Again, that is
not the case for kernel.org.


* Re: VCS comparison table
  2006-10-21  1:55                                                       ` Linus Torvalds
@ 2006-10-21  8:32                                                         ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:32 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> On Fri, 20 Oct 2006, Junio C Hamano wrote:
>> 
>> I was not talking about "git pull".  I was talking about "git
>> show".
> 
> Duh. I don't know why I misread that.
> 
> Yeah, that makes no sense at all. I _think_ "git show" should be the same 
> thing as a single-entry "git log -p".

Huh?

$ git show ff49fae6a547e5c70117970e01c53b64d983cd10
commit ff49fae6a547e5c70117970e01c53b64d983cd10
Merge: 7ad4ee7... 75f9007... 14eab2b... 0b35995... eee4609...
[...]
diff --cc Makefile
index 36b9e06,68ae43b,66c8b4b,66c8b4b,09f60bb..a2f2f7c
[...]

"git show" doesn't prefer first parent: it uses compact combined
(that is the meaning of --cc, isn't it?) format for merges.

git version 1.4.2.1
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  7:56                                         ` Matthieu Moy
@ 2006-10-21  8:36                                           ` Jakub Narebski
  2006-10-21 10:09                                             ` Matthieu Moy
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:36 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> It's my understanding that most changes discussed on lkml are provided
>>> as a series of patches.  Bazaar bundles are intended as a direct
>>> replacement for patches in that use case.
>>
>> As _series_ of patches. You have git-format-patch + git-send-email
>> to format and send them, git-am to apply them (as patches, not as branch).
>>
>> I was under an impression that user sees only mega-patch of all the
>> revisions in bundle together, and rest is for machine consumption only.
> 
> Nothing prevents you from using series of bundles.
> 
> A bundle for a single revision looks like a patch with a few comments
> on top and bottom. _If_ you have several revisions in your patch, you
> get the diff as human readable, and the intermediate revisions as
> MIME-encoded.
> 
> For big changes, people do send several bundles.
> 
> So, a bundle is a direct replacement for a patch, not for series of
> patches.

Ah, that explains it. So why do people use bundles instead of patches
(with some metainfo like the commit message)? And does bzr have a command
to apply a series of bundles in the correct order, whether they were sent
as a reply chain (each patch in the series is a reply to the previous
patch) or as replies to the patch series' introductory message?

-- 
Jakub Narebski
Poland


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  1:26                                                     ` Junio C Hamano
@ 2006-10-21  8:40                                                       ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jan Hudec, bazaar-ng, Jeff Licquia, Linus Torvalds

Junio C Hamano wrote:

> For people new to the list, the message is:
> 
>     http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217
> 
> I think I've quoted this link at least three times on this list;
> I consider it _the_ most important message in the whole list
> archive.  If you haven't read it, read it now, print it out,
> read it three more times, place it under the pillow before you
> sleep tonight.  Repeat that until you can recite the whole
> message.  It should not take more than a week.
> 
> To me, personally, achieving that ideal "drill down" dream was
> one of the more important goals of my involvement in this
> project.  I did diffcore-rename to fill some part of the dream,
> and then diffcore-pickaxe to fill some other part.  Neither was
> even close.  I think the recent round of pickaxe is getting much
> closer.

What I find lacking in this mail, and in git as it is now, is some
way of remembering, and perhaps even propagating, the user's corrections
to automatic content movement detection (which includes file renames
and file copying).
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-21  8:27                   ` Jakub Narebski
@ 2006-10-21  8:48                     ` Erik Bågfors
  0 siblings, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-21  8:48 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On 10/21/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Sean wrote:
>
> > On Tue, 17 Oct 2006 13:45:31 +0200
> > Jakub Narebski <jnareb@gmail.com> wrote:
> >
> >> Git cannot do that remotely (with exception of git-tar-tree/git-archive
> >> which has --remote option), yet. But you can get contents of a file
> >> (with "git cat-file -p [<revision>:|:<stage>:]<filename>"), list
> >> directory (with "git ls-tree <tree-ish>") and compare files or
> >> directories (git diff family of commands) without need for working
> >> directory.
> >
> > Interesting, I didn't know about the --remote option.  So in fact as long
> > as the remote has enabled upload-tar then anyone can do a "light
> > checkout".
>
> Not exactly. "Light checkout" (aka "lazy one-branch clone") in bzr
> contains also info about the repository it came from, and has some
> metadata that you can commit to it locally. git tar-tree --remote
> just gets snapshot.

No, a lightweight checkout doesn't have that.  A lightweight checkout
is basically just the latest revision checked out, a snapshot. For
everything else it needs to go to the remote branch to get information.
You cannot commit locally on a "lightweight checkout".

A "normal/heavyweight" checkout has the ability to commit locally.

/Erik


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  0:22     ` Petr Baudis
  2006-10-21  0:31       ` Linus Torvalds
@ 2006-10-21  9:53       ` Andreas Schwab
  2006-10-22 21:09       ` Anders Larsen
  2 siblings, 0 replies; 1752+ messages in thread
From: Andreas Schwab @ 2006-10-21  9:53 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Linus Torvalds, Junio C Hamano, git, linux-kernel

Petr Baudis <pasky@suse.cz> writes:

> (I personally consider alternate screen an abomination. It would be so
> nice if the terminal emulators would just make it optional.)

$ xterm -xrm "*titeInhibit: true"

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  8:36                                           ` Jakub Narebski
@ 2006-10-21 10:09                                             ` Matthieu Moy
  2006-10-21 10:34                                               ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-21 10:09 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

> Ah, that explains this. So why people use bundles instead of patches
> (with some metainfo like commit message)?

You need more metainfo than the commit message. Since the revision-id is
not based on the content, you need at least to specify the
revision-id.

And a bzr bundle indeed gives _all_ the information that is in the
repository about this revision (i.e. commit message, ancestors, ...).

Another relevant difference between a patch and a bundle is that a
bundle knows its ancestor, so when you apply the bundle, it builds
the new revision with exact patching. If you need a merge, then it
will happen in exactly the same way as a merge between two branches
(i.e. a three-way merge, for example).

> And do bzr have command to apply in correct ordering series of
> bundles send either chain replied to (each patch in the series is
> reply to previous patch) or being replies to patchseries
> introductory message?

Not directly AFAIK, but since the bundle knows which revision it
applies to, it will refuse to apply the second if the first one is not
in your repository already for example.
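
To make the ordering point concrete, here is a toy model in Python. It
is only a sketch of the behaviour described above, not bzr's code: the
Bundle type, its field names and apply_in_order() are made up for
illustration. Each bundle names the revision it applies on top of, so a
series can be applied in whatever order the mails arrive, and a bundle
whose base is missing is simply left unapplied.

```python
from typing import List, NamedTuple, Set


class Bundle(NamedTuple):
    revid: str  # id of the revision the bundle creates (hypothetical field)
    base: str   # id of the revision it must be applied on top of


def apply_in_order(repo: Set[str], bundles: List[Bundle]) -> List[str]:
    """Apply every bundle whose base is present; repeat until none fits.

    Returns the revids in the order they were applied. Bundles whose
    base never shows up stay unapplied, mirroring the "refuse" case.
    """
    pending = list(bundles)
    applied: List[str] = []
    progress = True
    while pending and progress:
        progress = False
        for bundle in list(pending):
            if bundle.base in repo:
                repo.add(bundle.revid)
                applied.append(bundle.revid)
                pending.remove(bundle)
                progress = True
    return applied


repo = {"rev-1"}
series = [
    Bundle("rev-3", "rev-2"),  # arrives first, but needs rev-2
    Bundle("rev-2", "rev-1"),
    Bundle("rev-9", "rev-8"),  # base is unknown, so it is never applied
]
print(apply_in_order(repo, series))
```

With this input the second bundle is applied first, then the first one;
the third is skipped because its base is not in the repository.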

It would probably be interesting to have more features to help with sending
series of bundles and applying them, but no one has really been asking
for it up to now.

-- 
Matthieu


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 10:09                                             ` Matthieu Moy
@ 2006-10-21 10:34                                               ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 10:34 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy wrote:

> Another relevant difference between a patch and a bundle is that the
> bundles knows its ancestor, so, when you apply the bundle, it builds
> the new revision with exact patching. If you need a merge, then it
> will happen exactly in the same way as a merge between two branches
> (ie. three-way merge for example).

By the way, if a patch sent via email is a git enhanced patch, with
[shortened] sha1 of blobs (file contents), and our repository has
the blob the patch is supposed to apply to (but, for example, the line
of development has moved forward), we can pass the --3way option
to git-am to fall back on a 3-way merge if the patch doesn't
apply cleanly.

It is not as powerful as a merge of branches, but it is sufficient
in most cases. And in the other cases you have to resolve conflicts by
hand anyway; git-rerere (which records conflict resolutions and
reuses them) can help there.
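
As a rough illustration of why the blob ids in such a patch make the
3-way fallback possible: a blob's name is the SHA-1 of a short header
plus the file contents, so the shortened id carried by the patch
identifies the exact pre-image if we have it. The Python below is a
minimal sketch, not git's code; find_preimage() and the in-memory
"store" are assumptions made up for the example.

```python
import hashlib
from typing import List, Optional


def git_blob_id(content: bytes) -> str:
    """Return the object name git gives to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def find_preimage(abbrev: str, blobs: List[bytes]) -> Optional[bytes]:
    """Find the single blob whose id starts with the shortened id."""
    matches = [b for b in blobs if git_blob_id(b).startswith(abbrev)]
    return matches[0] if len(matches) == 1 else None


# Contents we happen to have in our (toy) object store.
store = [b"", b"hello\n"]

print(git_blob_id(b"hello\n"))
print(find_preimage("ce01362", store))
```

If the pre-image is found, the patch can be applied to it exactly and
the result merged with the current version of the file; if not, there
is nothing to fall back on.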
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20  4:05                                               ` Aaron Bentley
@ 2006-10-21 12:30                                                 ` Jan Hudec
  2006-10-21 13:05                                                   ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 12:30 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Tim Webster, Christian MICHON, Andreas Ericsson, bazaar-ng, git,
	Matthieu Moy

On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
> Tim Webster wrote:
> > Also svn does not allow files in the same directory to live in
> > multiple repos
> 
> It would surprise me if many SCMs that support atomic commit also
> support intermixing files from multiple repos in the same directory.

In fact I think svk would. You would have to switch between them by setting
an environment variable, but it's probably doable. That is because,
unlike other version control systems, it does not store the information
about a checkout in the checkout itself, but in a central directory whose
location can be set. I don't know git well enough to tell whether git could
do the same by setting GIT_DIR.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>


* Re: VCS comparison table
  2006-10-20 10:40                                       ` Jakub Narebski
  2006-10-20 13:36                                         ` Shawn Pearce
@ 2006-10-21 12:30                                         ` Matthew D. Fuller
  1 sibling, 0 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 12:30 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Fri, Oct 20, 2006 at 12:40:11PM +0200 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> 
> I'd like to put ComparisonWithBazaarNG page on GitWiki
> (http://git.or.cz/gitwiki/) some time soon,

This is a good idea; I think we've plowed a lot of ground in this
thread that would be useful to document somewhere easily
referenceable.  I've thought a few times while going through these
mails of putting some of the material up on the Bazaar wiki.  I'm not
really the best person to try and sort it out, but I may try and put
together some notes at least.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-20 21:48                                             ` Carl Worth
@ 2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 14:08                                                 ` Jakub Narebski
                                                                   ` (2 more replies)
  2006-10-21 20:05                                               ` Aaron Bentley
  1 sibling, 3 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 13:01 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
> 
> The entire discussion is about how to name things in a distributed
> system.

I think we're getting into scratched-record-mode on this.


Git: Revnos aren't globally unique or persistent.

Bzr: Yes, we know.

G: Therefore they're useless.

B: No, they're very useful in [situation] and [situation], and we deal
   with [situation] all the time, and they work great for that.

G: But they fall apart totally in [situation].

B: Yes, so use revids there.

G: So use revids everywhere.

B: Revnos are handier tools for [situation] and [situation] for
   [reason] and [reason].

*brrrrrrrrrrrrrrrrip!!!*    *skip back to start*


I'm not sure there's any unturned stone left along this line, so I'm
not sure how productive it really is to keep walking down it.  So, to
make something productive of it, I'm going to put it onto my todo list
to spend some time with bzr trying to use revids for stuff.  I'm
fairly certain that, due to the bzr cultural tendency to use revnos
where possible, there are some rough edges in the UI when using revids
that should be filed down (though I think it much less likely to turn
up underlying model failures that interfere with using revids).


> It may be that the centralization bias

I think it's more accurately describable as a branch-identity bias.
The git claim seems to be that the two statements are identical, but I
have some trouble swallowing that.


> I'm still not sure exactly what a bzr branch is, but it's clearly
> something different from a git branch,

The term is somewhat overloaded, which is why it's causing you trouble
(and did me).  It refers both to the conceptual entity ("a line of
development" roughly, much like what 'branch' means in git and VCS in
general), and to the physical location (directory, URL) where that
branch is stored, and where it'll often have a working tree.  Branches
are always referred to by location, never by name.


> (and I'd be interested to see a "corrected" version of the commands
> above to fix the storage inefficiencies).

The 'corrected' step would be:

> 	mkdir bzrtest; cd bzrtest
    bzr init-repo .
> 	mkdir master; cd master; bzr init

Then all branches stored under that 'bzrtest' dir will use the
bzrtest/.bzr/ dir for storing the revisions, and shared revisions will
exist only once, saving the space/time of multiple copies.

Probably, you'd actually want 'init-repo --trees' in this case,
because repos default to being [working]tree-less.  In a tree-less
setup, you'd create a [lightweight] checkout of the branch(es) you
wanted to work on elsewhere, giving you a layout much like CVS or SVN
where "my VCS files are THERE, my working tree is HERE".


> (since pull seems the only way to synch up without infinite new
> merge commits being added back and forth).

The infinite-merge-commits case doesn't happen in bzr-land because we
generally don't merge other branches except when the branch owner says
"Hey, I've got something for you to merge".  If you were to set up a
script to merge two branches back and forth until they were 'equal',
yes, it'd churn away until you filled up your disk with the N bytes of
metadata every new revision uses up.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-21 12:30                                                 ` Jan Hudec
@ 2006-10-21 13:05                                                   ` Jakub Narebski
  2006-10-21 13:15                                                     ` Jan Hudec
  2006-10-21 16:56                                                     ` Aaron Bentley
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 13:05 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jan Hudec wrote:

> On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
>> Tim Webster wrote:
>> > Also svn does not allow files in the same directory to live in
>> > multiple repos
>> 
>> It would surprise me if many SCMs that support atomic commit also
>> support intermixing files from multiple repos in the same directory.
> 
> In fact I think svk would. You would have to switch them by setting
> an environment variable, but it's probably doable. That is because
> unlike other version control systems, it does not store the information
> about checkout in the checkout, but in the central directory and that
> can be set. I don't know git well enough to tell whether git could do
> the same by setting GIT_DIR.

You can very simply embed one "clothed" repository into another in GIT,
as shown below:

  project/.git
  project/subdir/
  project/subdir/file
  project/subproject/
  project/subproject/.git
  project/subproject/file
  ...

Whether you want the files belonging to the subdirectory to be ignored
by the top repository depends on the circumstances. You would want to
ignore the nested .git/ directory, though.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-21 13:05                                                   ` Jakub Narebski
@ 2006-10-21 13:15                                                     ` Jan Hudec
  2006-10-21 13:29                                                       ` Jakub Narebski
  2006-10-21 16:56                                                     ` Aaron Bentley
  1 sibling, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 13:15 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Sat, Oct 21, 2006 at 03:05:22PM +0200, Jakub Narebski wrote:
> Jan Hudec wrote:
> 
> > On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
> >> Tim Webster wrote:
> >> > Also svn does not allow files in the same directory to live in
> >> > multiple repos
> >> 
> >> It would surprise me if many SCMs that support atomic commit also
> >> support intermixing files from multiple repos in the same directory.
> > 
> > In fact I think svk would. You would have to switch them by setting
> > an environment variable, but it's probably doable. That is because
> > unlike other version control systems, it does not store the information
> > about checkout in the checkout, but in the central directory and that
> > can be set. I don't know git well enough to tell whether git could do
> > the same by setting GIT_DIR.
> 
> You can very simply embed one "clothed" repository into another in GIT,
> like shown below
> 
>   project/.git
>   project/subdir/
>   project/subdir/file
>   project/subproject/
>   project/subproject/.git
>   project/subproject/file
>   ...
> 
> It depends on circumstances if one wants files belonging to subdirectory
> be ignored by top repository. You would want to ignore .git/ directory,
> though.

Yes, you can do that with bzr and most other tools I know of as well.
But I understood the original question as asking for the working trees
to be rooted at the same place (i.e. all in /etc), because each has some
files and some directories that have to be placed next to each other.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>


* Re: VCS comparison table
  2006-10-21 13:15                                                     ` Jan Hudec
@ 2006-10-21 13:29                                                       ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 13:29 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git

On Saturday, 21 October 2006 15:15, Jan Hudec wrote:
> On Sat, Oct 21, 2006 at 03:05:22PM +0200, Jakub Narebski wrote:
>> Jan Hudec wrote:
>> 
>>> On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
>>>> Tim Webster wrote:
>>>>> Also svn does not allow files in the same directory to live in
>>>>> multiple repos
>>>> 
>>>> It would surprise me if many SCMs that support atomic commit also
>>>> support intermixing files from multiple repos in the same directory.
>>> 
>>> In fact I think svk would. You would have to switch them by setting
>>> an environment variable, but it's probably doable. That is because
>>> unlike other version control systems, it does not store the information
>>> about checkout in the checkout, but in the central directory and that
>>> can be set. I don't know git well enough to tell whether git could do
>>> the same by setting GIT_DIR.
>> 
>> You can very simply embed one "clothed" repository into another in GIT,
>> like shown below
[...]
>> It depends on circumstances if one wants files belonging to subdirectory
>> be ignored by top repository. You would want to ignore .git/ directory,
>> though.
> 
> Yes, you can do that with bzr and most other tools I know of as well.
> But I understand the original question as requesting the working trees
> to be rooted at the same place (ie. all in /etc), because each has some
> files and some directories that have to be placed next to each other.

You can separate the working area from the repository (you don't need to have
the repository in the top directory of the working area), but you must then
provide the location of the repository for each git command you run, either by
setting the GIT_DIR environment variable (GIT_DIR=/path/to/repo.git git commit ...)
or by using the --git-dir option of the git wrapper (git --git-dir=/path/to/repo.git diff),
as automatic detection of the repository wouldn't work, of course.
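
For anyone scripting this, the two forms can be wrapped as below. This
is only a sketch in Python: the helper names are made up, the path is
the placeholder used above, and nothing is executed here; the helpers
just build what would be handed to subprocess.run().

```python
import os
from typing import Dict, List, Tuple


def with_git_dir_env(repo: str, *args: str) -> Tuple[List[str], Dict[str, str]]:
    """Command plus environment selecting the repository via GIT_DIR."""
    env = dict(os.environ, GIT_DIR=repo)
    return ["git", *args], env


def with_git_dir_option(repo: str, *args: str) -> List[str]:
    """Command selecting the repository via the --git-dir option."""
    return ["git", f"--git-dir={repo}", *args]


argv, env = with_git_dir_env("/path/to/repo.git", "commit")
print(env["GIT_DIR"], argv)
print(with_git_dir_option("/path/to/repo.git", "diff"))
```

Keeping one such wrapper per repository is one way to have several
repositories track files rooted in the same directory.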
-- 
Jakub Narebski
Poland


* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-19  8:19                         ` Alexander Belchenko
@ 2006-10-21 13:48                           ` Jan Hudec
  0 siblings, 0 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 13:48 UTC (permalink / raw)
  To: Alexander Belchenko; +Cc: bazaar-ng, git

On Thu, Oct 19, 2006 at 11:19:30AM +0300, Alexander Belchenko wrote:
> Jan Hudec writes:
> >Reading this thread I came to think, that the revnos should be assigned
> >to _all_ revisions _available_, in order of when they entered the
> >repository (there are some possible variations I will mention below)
> ...
> > - They would be the same as subversion and svk, and IIRC mercurial as
> >   well, use, so:
> >   - They would already be familiar to users comming from those systems.
> >   - They are known to be useful that way. In fact for svk it's the only
> >     way to refer to revisions and seem to work satisfactorily (though
> >     note that svk is not really suitable to ad-hoc topologies).
> 
> I think that SVN model of revision numbers is wrong. And apply it to bzr
> break many UI habits. Per example, when ones use svn and their repo has
> many branches you never could say what revisions belongs to mainline. So
> things like
> bzr diff -rM..N
> (where M and N absolute revisions numbers, and N = M+1(+2) etc.)
> will more complicated, because in this case you first need to run log
> command, remember actual numbers of those revisions.

Well, you need to run log anyway, because you usually want to see a diff
between some particular revisions, so you need to find them anyway.

On the other hand in subversion all revisions actually exist on all
branches, so svn diff -r N-1:N always shows changes introduced by
revision N, while here you would have to use before:N..N.

> And I each time frustrating to see that after mainline svn revision 1000
> might be mainline revision 1020. It's very-very-very confusing. May be
> only for me.

I got used to this pretty quickly when I used svk. And there it actually
happens much more often than in subversion itself, because you have the
mirrored branches and each commit on them also gets a revision number.
But yes, they feel more weird.

> There is 2 things why I don't want to switch to svn (if I can do my own
> choice): their strange tags implementation (their tags is the same as
> branches, so what difference?) and their revisions numbers.
> 
> I also think that dotted revisions is not answer in this case, but it
> looks very logical and nice.
> 
> I think bzr need to have a switch, a flag, probably in .bazaar.conf to
> show revno to user or revid. And user can easily select what model is
> more appropriate for him:
> 
> * decentralized (with revno)
> * or distrubuted (with revid i.e. UUID)

Personally I'd like the UI to make the revision ids more visible, since
they are the canonical way of referring to revisions and, as shown in
this thread among other places, people who know something about
distributed version control are actually confused by their not being
visible and think they are not there.

> >Comments?
> 
> -1 to make revno as in svn.

Hm, you are probably right. In any case it's more useful to teach the
users not to get attached to the revnos too much.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>


* Re: VCS comparison table
  2006-10-21 13:01                                               ` Matthew D. Fuller
@ 2006-10-21 14:08                                                 ` Jakub Narebski
  2006-10-21 16:31                                                   ` Erik Bågfors
  2006-10-21 18:11                                                   ` Matthew D. Fuller
  2006-10-21 20:47                                                 ` Carl Worth
  2006-10-25  9:35                                                 ` Andreas Ericsson
  2 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 14:08 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

On Saturday, 21 October 2006 15:01, Matthew D. Fuller wrote:
> On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
> > 
> > The entire discussion is about how to name things in a distributed
> > system.
> 
> I think we're getting into scratched-record-mode on this.
> 
> 
> Git: Revnos aren't globally unique or persistent.
> 
> Bzr: Yes, we know.
> 
> G: Therefore they're useless.
> 
> B: No, they're very useful in [situation] and [situation], and we deal
>    with [situation] all the time, and they work great for that.
> 
> G: But they fall apart totally in [situation].

G: But revnos force centralized/star-topology development. And even in
   [situation] have [disadvantages].

> B: Yes, so use revids there.
> 
> G: So use revids everywhere.
> 
> B: Revnos are handier tools for [situation] and [situation] for
>    [reason] and [reason].

G: Shortened sha1 commit-ids are almost as handy.

> *brrrrrrrrrrrrrrrrip!!!*    *skip back to start*

There _are_ terminology conflicts. For example:
  o bzr "branch" is roughly equivalent to a one-branch git "repository";
  o bzr "repository" is just a collection of branches sharing common
    storage, which is similar to a set of git "repositories" with
    .git/objects/ linked to a common object repository (storage area) or
    with an appropriately set alternates file (although that is not common
    usage in git, and for example you would have to be careful when
    running git-prune);
  o bzr "lightweight checkout" is equivalent to the nonexistent "lazy
    clone"/"remote alternates" discussed on the git mailing list but not
    implemented because of performance concerns;
  o bzr "normal checkout" is, I think, similar to a git "shared clone"
    (but a shared clone is limited to repositories on the same
    filesystem);
  o bzr "heavyweight checkout" is roughly equivalent to a
    one-branch-only "clone" in git or cg (cg = Cogito).

And there are differences in opinion. For example, the "simple namespace
for revisions" which is important for bzr looks only superficially simple
from the git point of view (as it works only for the centralized approach,
and for leaf repositories you have to have access to the central repository
to get the final revnos); on the other hand, the "non-simplicity" of git's
sha1 identifiers is not that complicated in everyday work, as one usually
uses branch and tag names and the <ref>~<n> and <ref1>..<ref2> syntax,
sometimes shortened sha1 names, and full sha1 names only rarely. For bzr
it is more important to be able to tell from the revno which commit on a
branch was earlier; for git it is more important that commit ids never
ever change, and we can use git commands to check which commit was
earlier. For bzr plugins are important; for git it is important that new
commands are easy to add, using scripts for fast prototyping.
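
To show what a shortened sha1 name amounts to, here is a small Python
sketch. It is not git's implementation: the ids are generated from
made-up strings, the function names are hypothetical, and the minimum
length of 4 hex digits is an assumption of the example. A prefix is
usable as a name as long as exactly one known id starts with it.

```python
import hashlib
from typing import List


def resolve(prefix: str, ids: List[str]) -> str:
    """Return the single full id starting with prefix, or raise."""
    matches = [i for i in ids if i.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"{prefix!r} matches {len(matches)} ids")
    return matches[0]


def shortest_unique(full_id: str, ids: List[str], minimum: int = 4) -> str:
    """Shortest prefix of full_id (at least `minimum` long) that is unique."""
    for n in range(minimum, len(full_id) + 1):
        prefix = full_id[:n]
        if sum(1 for i in ids if i.startswith(prefix)) == 1:
            return prefix
    return full_id


# A thousand made-up "commit ids" standing in for a repository's history.
ids = [hashlib.sha1(f"commit {n}".encode()).hexdigest() for n in range(1000)]

short = shortest_unique(ids[0], ids)
print(short, "->", resolve(short, ids))
```

The prefix stays short even with many ids, and unlike a revno it names
the same commit in every clone that contains that commit.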

> > It may be that the centralization bias
> 
> I think it's more accurately describable as a branch-identity bias.
> The git claim seems to be that the two statements are identical, but I
> have some trouble swallowing that.

When two clones of the same repository (in git terminology), or two
"branches" (in bzr terminology), used by different people, cannot be
totally equivalent, that is centralization bias. By equivalent I mean
that the "old history" is exactly the same (the same diagram, the same
identifiers, or at least the same commonly used identifiers).
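
A toy model of what I mean, in Python. It assumes, as described in this
thread, that revnos are assigned along the first-parent (mainline) chain
of a branch; the revids and the revnos() helper are made up for
illustration and this is not bzr's code. Two people start from the same
revision, each commits once, and each merges the other.

```python
from typing import Dict, List

# Toy history: revid -> list of parent revids (first parent first).
parents: Dict[str, List[str]] = {
    "a": [],
    "b": ["a"],        # first person's commit
    "c": ["a"],        # second person's commit
    "m1": ["b", "c"],  # first person merges the second
    "m2": ["c", "b"],  # second person merges the first
}


def revnos(head: str) -> Dict[str, int]:
    """Number revisions along the first-parent chain ending at head."""
    chain = []
    rev = head
    while True:
        chain.append(rev)
        if not parents[rev]:
            break
        rev = parents[rev][0]
    chain.reverse()
    return {revid: n for n, revid in enumerate(chain, start=1)}


print(revnos("m1"))  # {'a': 1, 'b': 2, 'm1': 3}
print(revnos("m2"))  # {'a': 1, 'c': 2, 'm2': 3}
```

Both sides contain the same revisions a, b and c, yet revno 2 names a
different revision on each side, so the two cannot be equivalent in the
identifiers people actually use.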
 
The fact that you have two different commands, "merge" vs "pull",
for use in the one mother/mainline "branch" vs the other "branches" tells
us that there is a bias towards centralization.

> > I'm still not sure exactly what a bzr branch is, but it's clearly
> > something different from a git branch,
> 
> The term is somewhat overloaded, which is why it's causing you trouble
> (and did me).  It refers both to the conceptual entity ("a line of
> development" roughly, much like what 'branch' means in git and VCS in
> general), and to the physical location (directory, URL) where that
> branch is stored, and where it'll often have a working tree.  Branches
> are always referred to by location, never by name.

I'd rather use another name, then. Perhaps "fork" for the physical "branch",
i.e. branch metadata (like the revno to revid mapping) + object repository
or pointer to it + optionally working area/working files.

[...]
> > (since pull seems the only way to synch up without infinite new
> > merge commits being added back and forth).
> 
> The infinite-merge-commits case doesn't happen in bzr-land because we
> generally don't merge other branches except when the branch owner says
> "Hey, I've got something for you to merge".  If you were to setup a
> script to merge two branches back and forth until they were 'equal',
> yes, it'd churn away until you filled up your disk with the N bytes of
> metadata every new revision uses up.

And you say that bzr is not biased towards centralization? In git you 
can just pull (fetch) to check if there were any changes, and if there 
were not you don't get useless marker-merges.


Take for example two simple git scenarios:
1. Single branch repository. We have two clones of the same repository,
both with only one branch, 'master', both working on this branch, and
both considered equal. If only one person worked on the branch, "pull"
would result in a fast-forward. If both worked on the branch, "pull" would
result in a merge. This is the "diamond" example by Pasky, which
explained why git doesn't treat the first parent as special: because of
fast-forward. Bzr treats the first parent/mainline/"the branch" as special,
and therefore it generates superficial merge commits if we preserve revnos;
BTW doesn't "pull" clobber your changes?

2. But the preferred git workflow is to have two branches in each of the two
clones: the 'origin' branch, into which you fetch changes from the other
repository (a so-called "tracking branch") and to which you don't commit your
own changes (by convention, as git doesn't protect the branch from being
committed to, although it would refuse to fetch in the non-fast-forward
case unless forced), and the 'master' branch, where you put your work and
into which you merge the 'origin' branch. This allows, for example, fetching
changes into 'origin' but _not_ merging them immediately into 'master',
for example if you are in the middle of some larger work but want to
check what the other side did, so as not to create conflicts unnecessarily.
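
The decision behind both scenarios can be sketched in a few lines of
Python. This is a toy commit graph with made-up ids and hypothetical
function names, not git's code: a pull is a fast-forward when our tip is
an ancestor of theirs, a no-op when theirs is an ancestor of ours, and a
real merge only when the two histories have diverged.

```python
from typing import Dict, List, Set

# Toy commit graph: commit id -> parent ids.
Graph = Dict[str, List[str]]


def ancestors(graph: Graph, tip: str) -> Set[str]:
    """All commits reachable from tip, including tip itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit])
    return seen


def pull_outcome(graph: Graph, ours: str, theirs: str) -> str:
    """Classify what pulling `theirs` into `ours` would do."""
    if theirs in ancestors(graph, ours):
        return "up to date"
    if ours in ancestors(graph, theirs):
        return "fast-forward"
    return "merge"


graph: Graph = {"a": [], "b": ["a"], "c": ["a"]}
print(pull_outcome(graph, "a", "b"))  # fast-forward
print(pull_outcome(graph, "b", "c"))  # merge
print(pull_outcome(graph, "b", "a"))  # up to date
```

Only the diverged case creates a new commit, which is why repeated
pulling between equal clones does not pile up marker merges.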

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
  2006-10-17 11:38               ` Sean
  2006-10-17 11:38               ` Sean
@ 2006-10-21 14:13               ` Jan Hudec
       [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 14:13 UTC (permalink / raw)
  To: Sean; +Cc: Matthieu Moy, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Tue, Oct 17, 2006 at 07:38:39AM -0400, Sean wrote:
> On Tue, 17 Oct 2006 13:19:08 +0200
> Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> 
> > 1) a working tree without any history information, pointing to some
> >    other location for the history itself (a la svn/CVS/...).
> >    (this is "light checkout")
> 
> Git can do this from a local repository, it just can't do it from
> a remote repo (at least over the git native protocol).  However,
> over gitweb you can grab and unpack a tarball from a remote repo.
> In practice this is probably enough support for such a feature.
> 
> > 2) a bound branch. It's not _very_ different from a normal branch, but
> >    mostly "commit" behaves differently:
> >    - it commits both on the local and the remote branch (equivalent to
> >      "commit" + "push", but in a transactional way).
> >    - it refuses to commit if you're out of date with the branch you're
> >      bound to.
> >    (this is "heavy checkout")
> 
> This doesn't sound right, at least in the spirit of git.  Git really
> wants to have a local commit which you may or may not push to a
> remote repo at a later time.  There is no upside to forcing it all to
> happen in one step, and a lot of downsides.  Gits focus is to support
> distributed offline development, not requiring a remote repo to be
> available at commit time.

While there is no upside to forcing it all to _always_ happen in one
step, there are good reasons to allow it in particular cases.

The most common is if you work on something from two different computers
(at home and at work, or on a desktop and a notebook, or similar cases) and
want to be sure you don't forget to synchronize your changes.

You can always unbind the branch or do a commit --local, which allows
doing a local commit anyway (eg. when disconnected) and then the next
commit will require a merge if the branches diverged.

> > In both cases, this has the side effect that you can't commit if the
> > "upstream" branch is read-only. That's not fundamental, but handy.
> 
> Again this seems really anti-git.  There is no reason for your local
> branch to be marked read only just because some upstream branch is
> so marked.

Again, that is only the case if you opt in. E.g. people who
often have many terminals with different current directories may use it
to protect themselves from accidentally running commands in the wrong
one. You don't have to use it if you don't want to.

> > I use it for example to have several "checkouts" of the same branch on
> > different machines. When I commit, bzr tells me "hey, boss, you're out
> > of date, why don't you update first" if I'm out of date. And if commit
> > succeeds, I'm sure it is already commited to the main branch. I'm sure
> > I won't pollute my history with merges which would only be the result
> > of forgetting to update.
> 
> This is exactly the same in Git.  You really only ever push upstream
> when your local changes fast forward the remote, (ie. you're up to date).
> Git will warn you if your changes don't fast forward the remote.

In bzr, push and pull only work for the fast-forward case. They operate
on branches and actually apply the changes on the target. But that's a
different thing. Bound branches are mainly about not forgetting to
synchronize.

> > The more fundamental thing I suppose is that it allows people to work
> > in a centralized way (checkout/commit/update/...), and Bazaar was
> > designed to allow several different workflows, including the
> > centralized one.
> 
> While Git really isn't meant to work in a centralized way there's nothing
> preventing such a work flow.  It just requires the use of some surrounding
> infrastructure.

Bzr is meant to be used in both ways, depending on user's choice.
Therefore it comes with that infrastructure and you can choose whether
you want to use it or not.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
  2006-10-21 14:23                   ` Sean
@ 2006-10-21 14:23                   ` Sean
  2006-10-21 16:19                     ` Erik Bågfors
  2006-10-21 18:34                   ` Jan Hudec
  2 siblings, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-21 14:23 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Matthieu Moy, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Sat, 21 Oct 2006 16:13:28 +0200
Jan Hudec <bulb@ucw.cz> wrote:

> Bzr is meant to be used in both ways, depending on user's choice.
> Therefore it comes with that infrastructure and you can choose whether
> you want to use it or not.

From what we've read on this thread, bzr appears to be biased towards
working with a central repo.  That is the model that supports the use of
revnos etc that the bzr folks are so fond of.   However Git is perfectly
capable of being used in any number of models, including centralized.
Git just doesn't make the mistake of training new users into using
features that are only stable in a limited number of those models.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  2:12     ` Al Viro
  2006-10-21  5:29       ` Junio C Hamano
@ 2006-10-21 14:29       ` Rene Scharfe
  1 sibling, 0 replies; 1752+ messages in thread
From: Rene Scharfe @ 2006-10-21 14:29 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, Junio C Hamano, git

Al Viro schrieb:
> Speaking of irritations...  There is a major (and AFAICS fixable)
> suckitude in git-cherry.

[...]

> For one thing, there are better ways to do set comparison than creating
> a file for each element in one set and going through another checking
> if corresponding files exist (join(1) and sort(1) or just use perl hashes).

[...]

> Note that we are calculating a function of commit; it _never_ changes.
> Even if we don't just calculate and memorize it at commit time, a cache
> somewhere under .git would speed the things up a lot...

How about this patch?  It does away with using temporary files and instead
creates persistent cache files under .git/patch-ids/.  It is a very stupid
cache layout: file name = commit SHA1, file contents = patch ID.  Perhaps
it needs fan-out directories like .git/objects/ has before it can be
considered for merge.
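
As a rough sketch of what such a fan-out could look like (not part of the
patch; the cache_path helper name and the two-character split are
assumptions borrowed from how .git/objects/ is laid out):

```shell
# Hypothetical helper: map a commit SHA1 to a fan-out cache path,
# i.e. <cache>/ab/cdef... instead of <cache>/abcdef...
PATCH_ID_CACHE="${GIT_DIR:-.git}/patch-ids"

cache_path() {
	commit="$1"
	prefix=`echo "$commit" | cut -c1-2`   # first two hex digits
	rest=`echo "$commit" | cut -c3-`      # remaining digits
	echo "$PATCH_ID_CACHE/$prefix/$rest"
}

# Made-up commit name, for illustration only.
cache_path 8832573c0ffee
```

The caller would still need a "mkdir -p" on the directory part before
writing an entry.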

The set compare is stupid, too, but at least it is in-shell now, using a
space separated list and the is_in function.
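
For comparison, Al's sort(1)-based suggestion could look roughly like the
following, using comm(1) rather than join(1).  This is only a sketch and
not what the patch does; the patch IDs are made up and stand in for
git-patch-id output:

```shell
# Made-up patch IDs, one per line, sorted as comm(1) requires.
tmpdir=`mktemp -d`
printf '%s\n' aaa111 bbb222 ccc333 | sort >"$tmpdir/inup"
printf '%s\n' bbb222 ddd444 | sort >"$tmpdir/ours"

# comm -12 keeps lines present in both files (already upstream);
# comm -13 keeps lines present only in the second file (only ours).
already_upstream=`comm -12 "$tmpdir/inup" "$tmpdir/ours"`
only_ours=`comm -13 "$tmpdir/inup" "$tmpdir/ours"`

echo "- $already_upstream"
echo "+ $only_ours"
rm -rf "$tmpdir"
```

This brings back temporary files, which the patch set out to avoid, so it
is a trade-off rather than a clear win.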

And the cache file creation is not safe for multiple parallel git-cherry's.
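
One common way to make that safer, sketched here under the assumption that
the temporary file and the final file live on the same filesystem (so that
mv is an atomic rename), is to write each entry under a temporary name and
then rename it into place.  The write_cache_entry helper is hypothetical
and not part of the patch:

```shell
# Hypothetical helper: write a cache entry via a temporary file so that
# readers never see a half-written entry.
write_cache_entry() {
	file="$1"
	value="$2"
	tmp="$file.tmp.$$"          # per-process temporary name
	echo "$value" >"$tmp" &&
	mv -f "$tmp" "$file"        # rename is atomic within one filesystem
}

# Usage with made-up values.
cache=`mktemp -d`
write_cache_entry "$cache/8832573c0ffee" "deadbeef"
read id <"$cache/8832573c0ffee"
echo "$id"
rm -rf "$cache"
```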

It survives "make test" and is otherwise untested.  Care to test drive
this prototype? :-D

Thanks,
René


diff --git a/git-cherry.sh b/git-cherry.sh
index 8832573..c88afc3 100755
--- a/git-cherry.sh
+++ b/git-cherry.sh
@@ -46,18 +46,29 @@ # not that the order in inup matters...
 inup=`git-rev-list ^$ours $upstream` &&
 ours=`git-rev-list $ours ^$limit` || exit
 
-tmp=.cherry-tmp$$
-patch=$tmp-patch
-mkdir $patch
-trap "rm -rf $tmp-*" 0 1 2 3 15
+is_in() {
+	what="$1"
+	while [ $# -gt 1 ]; do
+		shift
+		[ "$what" = "$1" ] && return 0
+	done
+	return 1
+}
 
+# prime patch-ID cache
+PATCH_ID_CACHE="$GIT_DIR/patch-ids"
+mkdir -p "$PATCH_ID_CACHE"
+for commit in $inup $ours; do
+	[ -f "$PATCH_ID_CACHE/$commit" ] && continue
+	set x `git-diff-tree -p $commit | git-patch-id`
+	echo "$2" >"$PATCH_ID_CACHE/$commit"
+done
+
+ids_inup=
 for c in $inup
 do
-	git-diff-tree -p $c
-done | git-patch-id |
-while read id name
-do
-	echo $name >>$patch/$id
+	read id <"$PATCH_ID_CACHE/$c"
+	ids_inup="$ids_inup $id"
 done
 
 LF='
@@ -66,10 +77,10 @@ LF='
 O=
 for c in $ours
 do
-	set x `git-diff-tree -p $c | git-patch-id`
-	if test "$2" != ""
+	read id <"$PATCH_ID_CACHE/$c"
+	if test "$id" != ""
 	then
-		if test -f "$patch/$2"
+		if is_in $id $ids_inup
 		then
 			sign=-
 		else

^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-18 16:31                                   ` Aaron Bentley
@ 2006-10-21 15:56                                     ` Jan Hudec
  2006-10-21 16:13                                       ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 15:56 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, Petr Baudis, Carl Worth, git

On Wed, Oct 18, 2006 at 12:31:52PM -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Jakub Narebski wrote:
> > Aaron Bentley wrote:
> > 
> >>Carl Worth wrote:
> >>>There are even more important reasons to prefer a series of
> >>>micro-commits over a mega-patch than just ease of merging.
> >>
> >>A bundle isn't a mega-patch.  It contains all the source revisions.  So
> >>when you merge or pull it, you get all the original revisions in your
> >>repository.
> > 
> > 
> > But what patch reviewer see is a mega-patch showing the changeset
> > of a whole "bundle", isn't it?
> > [...]
> 
> Yes.  Carl was saying that, aside from the issue of what a reviewer
> sees, a bundle is bad for other reasons.  I am saying those other
> reasons don't apply.  I wasn't addressing the issue of what a reviewer sees.
> 
> To me, seeing the individual patches is like reading a book where every
> page has a different word on it, and so it's hard to put it together
> into a full sentence.  I'm not saying my way is The Right Way, just my
> personal preference.
> 
> For larger pieces of work, we try to split them up into logical units,
> and merge those units independently.
> 
> The Bundle format can also support a patch-by-patch output, but we don't
> have UI to select that.

As for what the reviewer wants to see, I think it depends on what kind
of code it is. Kernel code is complex and does not have (at least I have
not heard of) unit tests, so short patches are preferable for review.
And since C is one of the more verbose languages, keeping patches short
means splitting the work up into several pieces.

On the other hand bzr has unit-tests and python is less verbose, so the
single patch for a feature is not so big and is manageable. The patches
to bzr still come in logical steps, but usually one step per feature is
enough.

Also programmers usually don't develop even a single logical step as a
single commit. Instead they also commit to back up their work, when
they try something they think they may return to in the future, when
they need to continue on another computer and so on. And these commits
are generally not logical steps. Also the steps are often not in a
logical order. Therefore showing a diff for each commit in the bundle
often does not make sense.

So there is one bundle per logical step, and it therefore has a summary
diff. Individual bundles for individual steps are preferable anyway,
since the maintainer may decide to accept just some of them.  A tool to
generate a series of bundles (either each with just one commit or each
with several commits) would be possible, just no one has been interested
enough to do it yet.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 15:56                                     ` Jan Hudec
@ 2006-10-21 16:13                                       ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 16:13 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Aaron Bentley, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, Petr Baudis, Carl Worth, git

Jan Hudec wrote:

> Also programmers usually don't develop even the single logical step as a
> single commit. Instead they they also commit to backup their work,

In git you can back up your work on a temporary branch; besides, there
is git commit --amend to correct the last commit.

> when they try something they think they may in future return, when they
> need to continue on another computer and so on. And these commits are
> generally not logical steps. Also the steps are often not in a logical
> order. Therefore showing diff for each commit in the bundle often does
> not make sense.

That is why, before sending a patch series based on some feature branch,
you should at least rebase the branch on top of current work, to ensure
that the series applies cleanly.

If a feature branch/patch series needs cleanup (going from "answer" to
"solution" http://lkml.org/lkml/2005/4/7/176), i.e. patch (commit)
reordering, joining two patches into one, or patch splitting, you can
use a combination of git-cherry-pick, git-cherry-pick --no-commit and
git commit --amend, or git-format-patch, patch editing and reordering,
and git-am. Or just use StGit or pg.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 14:23                   ` Sean
@ 2006-10-21 16:19                     ` Erik Bågfors
  2006-10-21 16:31                       ` Jakub Narebski
                                         ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-21 16:19 UTC (permalink / raw)
  To: Sean
  Cc: Jan Hudec, Linus Torvalds, bazaar-ng, git, Matthieu Moy,
	Jakub Narebski

On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 21 Oct 2006 16:13:28 +0200
> Jan Hudec <bulb@ucw.cz> wrote:
>
> > Bzr is meant to be used in both ways, depending on user's choice.
> > Therefore it comes with that infrastructure and you can choose whether
> > you want to use it or not.
>
> From what we've read on this thread, bzr appears to be biased towards
> working with a central repo.  That is the model that supports the use of
> revnos etc that the bzr folks are so fond of.   However Git is perfectly
> capable of being used in any number of models, including centralized.
> Git just doesn't make the mistake of training new users into using
> features that are only stable in a limited number of those models.

This is just plain wrong.

bzr is a fully decentralized VCS. I've read this thread for quite some
time now and I really cannot understand why people come to this
conclusion.

However, if you do want to work centralized, bzr has commands that
fit that workflow really well.


/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 14:08                                                 ` Jakub Narebski
@ 2006-10-21 16:31                                                   ` Erik Bågfors
  2006-10-21 16:59                                                     ` Jakub Narebski
  2006-10-21 18:11                                                   ` Matthew D. Fuller
  1 sibling, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-21 16:31 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

> There _are_ terminology conflicts. For example bzr "branch" is roughly
> equivalent to one-branch git "repository";

Agreed.

> bzr "repository" is just
> collection of branches sharing common storage,
Agreed

> which is similar to set
> of git "repositories" with .git/objects/ linked to common object
> repository (storage area) or appropriately set alternates file
> (although that is not common usage in git, and for example you would
> have to be carefull with running git-prune); bzr "lightweight checkout"
> is equivalent to nonexistent "lazy clone"/"remote alternates" discussed
> on git mailing list but not implemented because of performance
> concerns; bzr "normal checkout" is I think similar to git "shared
> clone" (but shared clone is limited to repositories on the same
> filesystem); bzr "heavyweight checkout" is roughly equivalent to
> one-branch-only "clone" in git or cg (cg = Cogito).

This is wrong. There are two kinds of checkouts:
lightweight and "normal/heavyweight".

I think you are getting this a little wrong, and I think the reason is
that you are thinking of repositories, while in bzr you normally think
of branches.

For example, I think (correct me if I'm wrong) that if I have a git
repository of an upstream linux-repo (Linus' for example), I'll use
"pull" to keep my copy up to date with the upstream repo? If
I then would like to hack something special, I would "clone" the repo
and get a new repo and that's where I do my work.  Is that correct?

In bzr you never (well...)  clone a full repository, but you clone one
line-of-development (a branch).  So "bzr branch"  is always a
"one-branch-only "clone" in git or cg".

"bzr checkout" is a "bzr branch" followed by a setting saying
"whenever you commit here, commit in the master branch also".

"bzr checkout --lightweight" is a way to get only a snapshot of the
working tree out of a branch. Whenever you commit, it's done in the
remote branch.

/Erik

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:19                     ` Erik Bågfors
@ 2006-10-21 16:31                       ` Jakub Narebski
       [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
  2006-10-21 21:04                       ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 16:31 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Sean, Jan Hudec, Linus Torvalds, bazaar-ng, git, Matthieu Moy

Erik Bågfors wrote:
> On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
>> On Sat, 21 Oct 2006 16:13:28 +0200
>> Jan Hudec <bulb@ucw.cz> wrote:
>>
>>> Bzr is meant to be used in both ways, depending on user's choice.
>>> Therefore it comes with that infrastructure and you can choose whether
>>> you want to use it or not.
>>
>> From what we've read on this thread, bzr appears to be biased towards
>> working with a central repo.  That is the model that supports the use of
>> revnos etc that the bzr folks are so fond of.   However Git is perfectly
>> capable of being used in any number of models, including centralized.
>> Git just doesn't make the mistake of training new users into using
>> features that are only stable in a limited number of those models.
> 
> This is just plain wrong.
> 
> bzr is a fully decentralized VCS. I've read this thread for quite some
> time now and I really cannot understand why people come to this
> conclusion.
> 
> However, if you do want to work centralized, bzr has commands that
> fits that workflow really good.

Read carefully: bzr is _biased_ towards working with a central repository.
The default workflow of bzr (for example using revnos, or using
"merge" for one repository and "pull" for another) is geared
towards a star topology, i.e. some centralized repository.

That said, it is supposed to be able to work in a fully decentralized
way, using revids. But then, for example, you don't have a "simple rev
namespace" (moreover you have a _worse_ namespace than git's sha1 ids).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
@ 2006-10-21 16:35                         ` Erik Bågfors
       [not found]                           ` <BAYC1-PASMTP04FAD1FBB91BA4C07A5E79AE020@CEZ.ICE>
  0 siblings, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-21 16:35 UTC (permalink / raw)
  To: Sean
  Cc: Jan Hudec, Linus Torvalds, bazaar-ng, git, Matthieu Moy,
	Jakub Narebski

On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 21 Oct 2006 18:19:54 +0200
> "Erik Bågfors" <zindar@gmail.com> wrote:
>
> > This is just plain wrong.
> >
> > bzr is a fully decentralized VCS. I've read this thread for quite some
> > time now and I really cannot understand why people come to this
> > conclusion.
> >
> > However, if you do want to work centralized, bzr has commands that
> > fits that workflow really good.
>
> Have you been reading this thread at all?

Yes.

> Even the bzr people have now
> stated rather firmly that the revno scheme doesn't work very well in
> a number of situations.  Numerous examples have been given where the
> revno will be useless, or worse misleading when bzr is used without
> a central server.  The answer from the bzr folks has been then don't
> use the revno in those situations.  However, it's quite clear from the
> bzr UI that there is a _bias_ towards using revno's.
>
> So yes, clearly you can use bzr without a central server; but it's just
> as clearly biased against such usage.

So... I do agree that revnos might not fit perfectly at all times.
But I strongly disagree that they automatically mean that bzr is not a
decentralized VCS.  They are just one part of the equation.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:05                                                   ` Jakub Narebski
  2006-10-21 13:15                                                     ` Jan Hudec
@ 2006-10-21 16:56                                                     ` Aaron Bentley
  2006-10-21 17:03                                                       ` Jakub Narebski
  2006-10-21 17:31                                                       ` Linus Torvalds
  1 sibling, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-21 16:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Jan Hudec wrote:
> 
>> On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
>>> Tim Webster wrote:
>>>> Also svn does not allow files in the same directory to live in
>>>> multiple repos
>>> It would surprise me if many SCMs that support atomic commit also
>>> support intermixing files from multiple repos in the same directory.
>> In fact I think svk would. You would have to switch them by setting
>> an environment variable, but it's probably doable. That is because
>> unlike other version control systems, it does not store the information
>> about checkout in the checkout, but in the central directory and that
>> can be set. I don't know git well enough to tell whether git could do
>> the same by setting GIT_DIR.
> 
> You can very simply embed one "clothed" repository into another in GIT,
> like shown below
> 
>   project/.git
>   project/subdir/
>   project/subdir/file
>   project/subproject/
>   project/subproject/.git
>   project/subproject/file
>   ...
> 
> It depends on circumstances if one wants files belonging to subdirectory
> be ignored by top repository. You would want to ignore .git/ directory,
> though.

Any SCM worth its salt should support that.  AIUI, that's not what Tim
wants.  He wants to intermix files from different repos in the same
directory.

i.e.

project/file-1
project/file-2
project/.git-1
project/.git-2

So file-1 would be in the .git-1 repository, but file-2 would be in the
.git-2 repository.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOlE70F+nu1YWqI0RAvNcAJ0Rd6ovGoBNtKxcPNOrMH1yc+bzWQCfQlqT
hREsUmCBAW8mIYzfzdnqZqU=
=unGE
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:31                                                   ` Erik Bågfors
@ 2006-10-21 16:59                                                     ` Jakub Narebski
  2006-10-21 17:41                                                       ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 16:59 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Erik Bågfors wrote:
> Jakub Narebski wrote:
>>
>> There _are_ terminology conflicts. For example bzr "branch" is roughly
>> equivalent to one-branch git "repository";
> 
> Agreed.
> 
>> bzr "repository" is just
>> collection of branches sharing common storage,
>
> Agreed

What is worse (in comparing git with bzr) is that there are no exact
equivalents. For example a bzr "branch" is something between a git
repository (clone of a repository) and a git branch. A Bazaar-NG
"repository" is something like a multi-branch git repository, but also
like a collection of git repositories sharing an object database.
 
>> which is similar to set
>> of git "repositories" with .git/objects/ linked to common object
>> repository (storage area) or appropriately set alternates file
>> (although that is not common usage in git, and for example you would
>> have to be carefull with running git-prune); bzr "lightweight checkout"
>> is equivalent to nonexistent "lazy clone"/"remote alternates" discussed
>> on git mailing list but not implemented because of performance
>> concerns; bzr "normal checkout" is I think similar to git "shared
>> clone" (but shared clone is limited to repositories on the same
>> filesystem); bzr "heavyweight checkout" is roughly equivalent to
>> one-branch-only "clone" in git or cg (cg = Cogito).
> 
> This is wrong. There are two kinds of checkouts
> lightweight.. and "normal/heavyweight".
> 
> I think you are getting this a little wrong, and I think the reason is
> that you are thinking of repositories, while in bzr you normally think
> of branches.

As I said: conflict of concepts. And perhaps philosophies.

> For example, I think (correct me if I'm wrong) that if I have a git
> repository of a upstream linux-repo (Linus' for example).  I guess
> I'll use "pull" to keep my copy up to date with the upstream repo? If
> I then would like to hack something special, I would "clone" the repo
> and get a new repo and that's where I do my work.  Is that correct?

Not exactly.

To work for example on Linus' version of the Linux kernel you clone the upstream
linux-repo git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
The working area is associated with the repository in Git, not with a "branch"
as in Bazaar-NG. In the default configuration the 'master' (main) branch of the
cloned repository (in the case of Linus' public repo it is the only branch)
corresponds to the 'origin' branch in your repository.

Now you can work on the 'master' branch, putting your changes there. git-fetch
will update the 'origin' branch to the current version of the 'master' branch of
the cloned repo; git-pull will additionally merge into 'master', i.e. merge
new changes into your work.

Now if you want to hack something special that you prefer to use a separate
branch for, you don't need to clone the repository anew (although you could,
using --local --shared to reduce the cost of cloning); it is enough to
create a new branch in your repository. You can very easily switch between
branches using the same working area (in bzr it would probably mean a
"branch checkout" to the same directory).

> In bzr you never (well...)  clone a full repository, but you clone one

It's a pity... for example you usually want to have access to both the
stable ('master') and development ('next') branches, perhaps
also to the fixes ('maint') and beta stage development ('pu') branches.
In bzr it is a bit of work (to correctly set up a "repository"), in git
it is one command.

> line-of-development (a branch).  So "bzr branch"  is always a
> "one-branch-only "clone" in git or cg".

More or less.

> "bzr checkout" is a "bzr branch" followed by a setting saying
> "whenever you commit here, commit in the master branch also".

Git doesn't have an exact equivalent here. For "bzr checkout" on
the same system, it is similar to setting up a common object repository;
for a remote "bzr checkout" it might be approximated by hooks which
would push changes to the remote repository (although we would have
to implement some transaction/journal framework).

> "bzr checkout --lightweight" is a way to get only a snapshot of the
> working tree out of a branch. Whenever you commit, it's done in the
> remote branch.

Yes, but with "bzr checkout --lightweight" you also get a pointer
to the remote branch where changes are to be committed. Git doesn't have
something like that, at least not for a remote branch; mostly because of
poor performance or the need for a fast and constant network connection
to the source branch.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:56                                                     ` Aaron Bentley
@ 2006-10-21 17:03                                                       ` Jakub Narebski
  2006-10-21 17:31                                                       ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 17:03 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:
> AIUI, that's not what Tim  wants.  He wants to intermix files from
> different repos in the same directory.
> 
> i.e.
> 
> project/file-1
> project/file-2
> project/.git-1
> project/.git-2
> 
> So file-1 would be in the .git-1 repository, but file-2 would be
> in the .git-2 repository.

Possible (as I said), although it would screw up automatic repository 
detection. So you would have to say "git --git-dir=.git-1 commit -a"
or "GIT_DIR=.git-2 git log -p; git diff; ...", i.e. specify repo
for each command.

Of course you would have to hide the repositories from each other,
and it would probably be better to hide files provided by the other
repository.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:56                                                     ` Aaron Bentley
  2006-10-21 17:03                                                       ` Jakub Narebski
@ 2006-10-21 17:31                                                       ` Linus Torvalds
  2006-10-21 17:38                                                         ` Linus Torvalds
  2006-10-22  7:49                                                         ` Tim Webster
  1 sibling, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21 17:31 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski



On Sat, 21 Oct 2006, Aaron Bentley wrote:
> 
> Any SCM worth its salt should support that.  AIUI, that's not what Tim
> wants.  He wants to intermix files from different repos in the same
> directory.
> 
> i.e.
> 
> project/file-1
> project/file-2
> project/.git-1
> project/.git-2

Ok, that's just insane.

It's going to always result in problems (ie some files are going to be 
considered "untracked" depending on which repository you're looking at 
right then and there).

That said, if you _really_ want this, you can do it. Here's how:

	# Create insane repository layout
	mkdir I-am-insane
	cd I-am-insane

	# Tell people we want to work with ".git-1"
	export GIT_DIR=.git-1

	git init-db
	echo "This is file 1 in repo 1" > file-1
	git add file-1
	git commit -m "Silly commit" 

	# Now we switch repos
	export GIT_DIR=.git-2

	git init-db
	echo "This is another file in repo 2" > file-2
	git add file-2
	git commit -m "Silly commit in another repo"

and now you literally have two repositories in the same subdirectory, and 
they don't know about each other, and you can switch your "attention" 
between them by simply doing

	export GIT_DIR=.git-1

(or .git-2). Then you can just do "git diff" etc normally, and work in the 
repo totally ignoring the other one in the same directory structure.

Of course, things like "git status" that show untracked files will always 
then show the "other" repository files as untracked - the two things will 
really be _totally_ independent, they don't at any point know about each 
other's files, although they can actually _share_ checked-out files if you 
want to:

	echo "This is a shared file" > file-shared

	export GIT_DIR=.git-1
	git add file-shared
	git commit -m "Add shared file to repo 1"

	export GIT_DIR=.git-2
	git add file-shared
	git commit -m "Add shared file to repo 2"

and now if you change that file, both repositories will see it as being 
changed.

INSANE. And probably totally useless. But you can do it. If you really 
want to.

The git directories don't even have to be in the same subdirectory 
structure. You could have done

	export GIT_DIR=~/insane-git-setup/dir1

instead, and the git information for that thing would have been put in 
that subdirectory.

Note: the above literally creates two different repositories. You can do 
the same thing with a single object repository (so that any actual shared 
data shows up in a shared database) by still using different GIT_DIR 
variables, but using GIT_OBJECT_DIRECTORY to point to a shared database 
directory (which again could be anywhere - it could be under ".git-1", or 
it could be in a separate place in your home directory).

Or you could do it even _more_ differently by actually having just a 
single repository, and having two different branches in that repository, 
and just tracking them separately: in that case you would keep the same 
GIT_DIR/GIT_OBJECT_DIRECTORY (or keep them unset, which just means that 
they default to ".git" and ".git/objects" as normal), and then just switch 
the "index" file and the HEAD files around. That would mean that to switch 
from one "view" to the other, you'd do something like

	export GIT_INDEX_FILE=.git/index1
	git symbolic-ref HEAD refs/heads/branch1

to set your view to "branch1".
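
As a rough sketch of that last variant (one repository, two "views"), under
the assumption of a throwaway repository and made-up file names, and with
absolute index paths so the current directory does not matter:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
# Throwaway identity so "git commit" works in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# View 1: its own index file, HEAD pointing at branch1.
export GIT_INDEX_FILE="$repo/.git/index1"
git symbolic-ref HEAD refs/heads/branch1
echo one > file-1
git add file-1
git commit -q -m "Only in branch1"

# View 2: a different index file and a different branch.
export GIT_INDEX_FILE="$repo/.git/index2"
git symbolic-ref HEAD refs/heads/branch2
echo two > file-2
git add file-2
git commit -q -m "Only in branch2"

# Each branch records only what its own index knew about.
tree1=$(git ls-tree --name-only branch1)
tree2=$(git ls-tree --name-only branch2)
echo "branch1: $tree1 / branch2: $tree2"
```

Both files sit side by side in the working directory the whole time; only
the index and HEAD decide which of them a given view tracks.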

Anyway, I would strongly discourage people from actually doing anything 
like this. It should _work_, but quite frankly, if you actually want to do 
this, you have serious mental problems.

What's probably much better is to have two separate development 
repositories, and then perhaps mix the end _result_ somewhere else. For 
example, you can run

	git checkout-index -a -f --prefix=/usr/shared/result/

in both (separate) repositories, and you'll end up with basically a 
snapshot of the "union" in /usr/shared/result.
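
A small sketch of that "union" export, assuming two hypothetical
repositories ("repo-a" and "repo-b") and a scratch result directory in
place of /usr/shared/result:

```shell
top=$(mktemp -d)
git init -q "$top/repo-a"
git init -q "$top/repo-b"
echo "from a" > "$top/repo-a/from-a.txt"
echo "from b" > "$top/repo-b/from-b.txt"

# checkout-index works from the index, so staging the files is enough.
# Note the trailing slash on the prefix: it is prepended verbatim.
(cd "$top/repo-a" && git add from-a.txt &&
    git checkout-index -a -f --prefix="$top/result/")
(cd "$top/repo-b" && git add from-b.txt &&
    git checkout-index -a -f --prefix="$top/result/")

ls "$top/result"
```

If both repositories carry a file with the same name, the "-f" means the
later export silently wins.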

(Not that I see why you'd want to do that _either_, but hey, at least 
you're not going to be _totally_ confused by the end result).

Anyway. Git certainly allows you to do some really insane things. The 
above is just the beginning - it's not even talking about alternate object 
directories where you can share databases _partially_ between two 
otherwise totally independent repositories etc.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                           ` <BAYC1-PASMTP04FAD1FBB91BA4C07A5E79AE020@CEZ.ICE>
@ 2006-10-21 17:33                             ` Erik Bågfors
  0 siblings, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-21 17:33 UTC (permalink / raw)
  To: Sean
  Cc: Matthieu Moy, bazaar-ng, Linus Torvalds, Jan Hudec, git,
	Jakub Narebski

On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 21 Oct 2006 18:35:18 +0200
> "Erik Bågfors" <zindar@gmail.com> wrote:
>
>
> > So... I do agree that revnos might not fit perfectly in at all times.
> > But that they automatically mean that bzr is not a decentralized VCS,
> > I strongly disagree with.  They are just one part of the equation.
>
> Who are you strongly disagreeing with?  Nobody said it wasn't a
> decentralized VCS.  But there is a _clear_ bias towards using it
> with a central server.


Ok, I take that back :)

When I think "centralized" I think "everyone must commit to a central
repository"... which is not what we are talking about here...

/Erik
ps. Sean, your mailer does something weird with my last name in the
to-field, so I can't just hit "reply" without removing my name
first...

/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 17:31                                                       ` Linus Torvalds
@ 2006-10-21 17:38                                                         ` Linus Torvalds
  2006-10-22  7:49                                                         ` Tim Webster
  1 sibling, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21 17:38 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Sat, 21 Oct 2006, Linus Torvalds wrote:
> 
> 	# Tell people we want to work with ".git-1"
> 	export GIT_DIR=.git-1

Actually, I think Jakub's approach is better: you'd be better off doing 
this as

	alias git-1="git --git-dir=.git-1"
	alias git-2="git --git-dir=.git-2"

and now you should be able to just do

	git-1 diff

(or any other git command) and

	git-2 diff

and can happily share the same directory and mix git commands without 
changing an environment variable all the time.

That would still be insane, but it wouldn't likely be _quite_ as confusing 
(or error-prone in case you forgot to switch the variable).
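
In a script, where aliases are usually not expanded, shell functions can
play the same role. This is only a sketch: the names "git1" and "git2" are
made up, and the explicit "--work-tree=." is an addition that is not in the
aliases above; it is there so the current directory is treated as the
working tree no matter how the repository was initialised.

```shell
# Hypothetical wrappers, one per repository.
git1() { git --git-dir=.git-1 --work-tree=. "$@"; }
git2() { git --git-dir=.git-2 --work-tree=. "$@"; }

work=$(mktemp -d) && cd "$work"
git1 init -q
git2 init -q

echo "only repo 1 tracks this" > file-1
git1 add file-1

tracked_by_1=$(git1 ls-files)
tracked_by_2=$(git2 ls-files)
echo "repo 1: [$tracked_by_1]  repo 2: [$tracked_by_2]"
```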

			Linus

PS. I'd still _not_ suggest doing this. It should _work_, but I mean - 
really..

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:59                                               ` Jeff King
@ 2006-10-21 17:40                                                 ` Jan Hudec
  2006-10-21 17:51                                                   ` Jakub Narebski
  2006-10-21 18:42                                                   ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 17:40 UTC (permalink / raw)
  To: Jeff King; +Cc: bazaar-ng, git, Jakub Narebski

On Fri, Oct 20, 2006 at 06:59:17PM -0400, Jeff King wrote:
> On Fri, Oct 20, 2006 at 08:12:10PM +0200, Jan Hudec wrote:
> 
> > At this point, I expect the tree to look like this:
> > A$ ls -R
> > .:
> > data/
> > data:
> > hello.txt
> > A$ cat data/hello.txt
> > Hello World!
> 
> Git does what you expect here.
> 
> > A$ VCT mv data greetings
> > A$ VCT commit -m "Renamed the data directory to greetings"
> > B$ echo "Goodbye World!" > data/goodbye.txt
> > B$ VCT add data/goodbye.txt
> > B$ VCT commit -m "Added goodbye message."
> > A$ VCT merge B
> > 
> > And now I expect to have tree looking like this:
> > 
> > A$ ls -R
> > .:
> > greetings/
> > greetings:
> > hello.txt
> > goodbye.txt
> 
> Git does not do what you expect here. It notes that files moved, but it
> does not have a concept of directories moving.  Git could, even without
> file-ids or special patch types, figure out what happened by noting that
> every file in data/ was renamed to its analogue in greetings/, and infer
> that previously nonexistent files in data/ should also be moved to
> greetings/.
> 
> However, I'm not sure that I personally would prefer that behavior. In
> some cases you might actually WANT data/goodbye.txt, and in some other
> cases a conflict might be more appropriate. In any case, I would rather
> the SCM do the simple and predictable thing (which I consider to be
> creating data/goodbye.txt) rather than be clever and wrong (even if it's
> only wrong a small percentage of the time).
> 
> In short, git doesn't do what you expect, but I'm not convinced that
> it's a bug or lack of feature, and not simply a difference in desired
> behavior.

I still consider it a bug, but different problems of the file-id
solution have already been described in this thread that I consider bugs
as well.

Besides I start to think that it should be actually possible to solve
this case with the git-style approach. I have to state beforehand, that
I don't know how the most recent git algorithm works, but I imagine
there is some kind of 'brackets' saying the text is in a given file. Now
if those 'brackets' were not flat, but nested, ie. instead of saying
'this is in foo/bar' it would say 'this is in bar is in foo', the
difference when renaming directory would only affect the 'outer bracket'
and therefore merge correctly with adding content inside it.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:59                                                     ` Jakub Narebski
@ 2006-10-21 17:41                                                       ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 17:41 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Matthew D. Fuller, bazaar-ng, Carl Worth, Andreas Ericsson, git

Note: instead of symlinking .git/objects/ objects database,
you can simply set and export GIT_OBJECT_DIRECTORY environment
variable.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 17:40                                                 ` Jan Hudec
@ 2006-10-21 17:51                                                   ` Jakub Narebski
  2006-10-21 19:20                                                     ` Jan Hudec
  2006-10-21 18:42                                                   ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 17:51 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Jeff King, bazaar-ng, git

Jan Hudec wrote:

> Besides I start to think that it should be actually possible to solve
> this case with the git-style approach. I have to state beforehand, that
> I don't know how the most recent git algorithm works, but I imagine
> there is some kind of 'brackets' saying the text is in a given file. Now
> if those 'brackets' were not flat, but nested, ie. instead of saying
> 'this is in foo/bar' it would say 'this is in bar is in foo', the
> difference when renaming directory would only affect the 'outer bracket'
> and therefore merge correctly with adding content inside it.

You mean, to consider the "contents" of a directory to be the union of
the contents of the files and directories it contains, and then use the
same "rename detection" algorithm as for files?

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 14:12                                             ` Jeff King
  2006-10-20 14:40                                               ` Jakub Narebski
@ 2006-10-21 17:57                                               ` Aaron Bentley
  2006-10-21 18:20                                                 ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-21 17:57 UTC (permalink / raw)
  To: Jeff King
  Cc: Carl Worth, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jeff King wrote:
> On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:
> 
>> What's nice is being able to see the revno 753 and knowing that "diff -r
>> 752..753" will show the changes it introduced.  Checking the revno on a
>> branch mirror and knowing how out-of-date it is.
> 
> I was accustomed to doing such things in CVS, but I find the git way
> much more pleasant, since I don't have to do any arithmetic:
>   diff d8a60^..d8a60

> Does bzr have a similar shorthand for mentioning relative commits?

Yes, you could e.g. do:

bzr diff -r before:753..753

Aaron

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOl9s0F+nu1YWqI0RAhW7AJ4vi4kgen/8h6j2AgueU+kcsmLrPwCeKry9
pp68K4rAmXjjkPvK32LvmPk=
=qDn2
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 14:08                                                 ` Jakub Narebski
  2006-10-21 16:31                                                   ` Erik Bågfors
@ 2006-10-21 18:11                                                   ` Matthew D. Fuller
  2006-10-21 19:19                                                     ` Jeff King
  2006-10-21 19:41                                                     ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 18:11 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On Sat, Oct 21, 2006 at 04:08:18PM +0200 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> Dnia sobota 21. października 2006 15:01, Matthew D. Fuller napisał:
> > 
> > I think we're getting into scratched-record-mode on this.
>
>  [....]

Thank you for demonstrating my point   8-}


> When two clones of the same repository (in git terminology), or two
> "branches" (in bzr terminology), used by different people, cannot be
> totally equivalent that is centralization bias.

This is obviously some new meaning of "centralization" bearing no
resemblance whatsoever to how I understand the word.

In git, apparently, you don't give a crap about a branch's identity
(alternately expressible as "it has none"), and so you throw it away
all the time.  Given that, revnos even if git had them would never be
of ANY use to you, so it's no wonder you have no use for the notion.

I DO give a crap about my branches' identities.  I WANT them to retain
them.  If I have 8 branches, they have 8 identities.  When I merge one
into another, I don't WANT it to lose its identity.  When I merge a
branch that's a strict superset of second into that second, I don't
WANT the second branch to turn into a copy of the first.  If I wanted
that, I'd just use the second branch, or make another copy of it.  I
don't WANT to copy it.  I just want to merge the changes in, and keep
on with my branch's current identity.

Maybe that's what you mean by 'centralization'; each branch is central
to itself.  That seems a pretty useless definition, though.  In my
mind, actually, it's MORE distributed; my branch remains my branch,
and your branch remains your branch, and the difference doesn't keep
us from working together and moving changes back and forth.  Forcing
my branch to become your branch sounds a lot more "centralized" to me.


Now, we can discuss THAT distinction.  I'm not _opposed_ to git's
model per se, and I can think of a lot of cases where it'd be really
handy.  But those aren't most of my cases.  And as long as we don't
agree on branch identity, it's completely pointless to keep yakking
about revnos, because they're a direct CONSEQUENCE of that difference
in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
have revnos, I'd STILL want my branch to keep its identity.  You could
name the mainline revisions after COLORS if you wanted, and I'd still
want my branch to keep its identity.  Aren't we through rehashing the
same discussion about the EFFECTS?


> > It refers both to the conceptual entity ("a line of development"
> > roughly, much like what 'branch' means in git and VCS in general),
> > and to the physical location (directory, URL)
> 
> I'd rather use other name then. Perhaps "forks" for physical
> "branch", i.e. branch metadata (like revno to revid mapping) +
> object repository or pointer to it + optionally working area/working
> files. 

It's the same name in bzr because branches are their location, not
their 'name'.  Every branch always has a location, and every location
refers to a branch (well, as long as it's a location that's meaningful
to bzr; "/etc/passwd" is a location, but it's nothing to do with bzr,
so it's not a branch.  Don't dawdle in irrelevancies).


> And you say that bzr is not biased towards centralization? In git
> you can just pull (fetch) to check if there were any changes, and if
> there were not you don't get useless marker-merges.

If I don't tell you my branch has something in it ready to grab, you
shouldn't merge it.  It probably won't work, and is quite likely to
set your computer on fire, slaughter and fillet your pet goldfish, and
make demons fly out of your nose.  If you wanna get stuck with all my
incomplete WIP, let's just use a CVS module and be done with it.


> 2. But the preferred git workflow is to have two branches in each of
> two clones. The 'origin' branch where you fetch changes from other
> repository (so called "tracking branch") and you don't commit your
> changes to [...]

Funny, since this reads to me EXACTLY like the bzr flow of "upstream
branch I pull" and "my branch I merge from upstream" that's getting
kvetched around...



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 17:57                                               ` Aaron Bentley
@ 2006-10-21 18:20                                                 ` Jakub Narebski
  2006-10-22 14:27                                                   ` Matthieu Moy
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 18:20 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jeff King, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

Aaron Bentley wrote:
> Jeff King wrote:
>> On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:
>>
>>> What's nice is being able to see the revno 753 and knowing that "diff -r
>>> 752..753" will show the changes it introduced.  Checking the revno on a
>>> branch mirror and knowing how out-of-date it is.
>>
>> I was accustomed to doing such things in CVS, but I find the git way
>> much more pleasant, since I don't have to do any arithmetic:
>>   diff d8a60^..d8a60
> 
>> Does bzr have a similar shorthand for mentioning relative commits?
> 
> Yes, you could e.g. do:
> 
> bzr diff -r before:753..753

What about grandparent of commit (d8a60^^ or d8a60~2 in git),
or choosing one of the parents in merge commit (d8a60^2 is second
parent of a commit)? before:before:753 ?
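
For reference, a sketch of how the git forms mentioned above resolve. It
builds a scratch repository (the branch and file names are made up) and
checks the spellings against each other with "git rev-parse":

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git symbolic-ref HEAD refs/heads/trunk   # fixed branch name for the demo

# Three commits on trunk: c1 <- c2 <- c3
for n in 1 2 3; do
    echo "$n" > trunk.txt && git add trunk.txt && git commit -q -m "trunk $n"
done
trunk_tip=$(git rev-parse HEAD)

# Grandparent, spelled two ways.
via_carets=$(git rev-parse 'HEAD^^')
via_tilde=$(git rev-parse 'HEAD~2')

# A side branch forked from c2, then merged back into trunk.
git checkout -q -b side 'HEAD^'
echo s > side.txt && git add side.txt && git commit -q -m "side work"
side_tip=$(git rev-parse HEAD)
git checkout -q trunk
git merge -q --no-ff -m "merge side" side

# Parents of the merge commit: ^1 is where we were, ^2 is what we merged.
first_parent=$(git rev-parse 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```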

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
  2006-10-21 14:23                   ` Sean
  2006-10-21 14:23                   ` Sean
@ 2006-10-21 18:34                   ` Jan Hudec
       [not found]                     ` <20061021144704.71d75e83.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 18:34 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On Sat, Oct 21, 2006 at 10:23:46AM -0400, Sean wrote:
> On Sat, 21 Oct 2006 16:13:28 +0200
> Jan Hudec <bulb@ucw.cz> wrote:
> 
> > Bzr is meant to be used in both ways, depending on user's choice.
> > Therefore it comes with that infrastructure and you can choose whether
> > you want to use it or not.
> 
> From what we've read on this thread, bzr appears to be biased towards
> working with a central repo.  That is the model that supports the use of
> revnos etc that the bzr folks are so fond of.   However Git is perfectly
> capable of being used in any number of models, including centralized.
> Git just doesn't make the mistake of training new users into using
> features that are only stable in a limited number of those models.

For one thing I, like others already expressed, think a difference should
be made between 'centralized' and 'star-topology'. Subversion is
centralized -- I don't think bzr is biased towards that kind of
centralization, though it provides tools (bound branches) to make it
easy.

I would agree it IS biased towards viewing branches as organized in a
hierarchy, while git strictly treats them as equal peers, which I'd call
star-topology (and I don't think it is because it _has_ revnos, but
because the user interface strongly favors them over revids).

On the other hand git is biased away from centralized (as in subversion
is centralized) in that it takes extra work to make sure you are always
synchronized (while bzr has bound branches to do the checking for you).
For open-source development, centralized is a wrong way to go, but
people use version control tools for other purposes as well and for some
of them staying synchronized is important.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 17:40                                                 ` Jan Hudec
  2006-10-21 17:51                                                   ` Jakub Narebski
@ 2006-10-21 18:42                                                   ` Linus Torvalds
  2006-10-21 19:21                                                     ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21 18:42 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Jeff King, bazaar-ng, git, Jakub Narebski



On Sat, 21 Oct 2006, Jan Hudec wrote:
>
> [ On not moving files that weren't moved originally, but whose
>   directories were moved ]
> 
> I still consider it a bug, but different problems of the file-id
> solution have already been described in this thread that I consider bugs
> as well.
> 
> Besides I start to think that it should be actually possible to solve
> this case with the git-style approach.

It's certainly _possible_ to figure out, but one reason git does what it 
does is that it's just simpler (ie just ignore the whole "directory move" 
situation entirely, and just consider it to be "many files moved"). 

Another reason is that this really is an ambiguous case. When the 
directory was moved, the file in question really didn't exist. So when it 
was created independently of the move, it really _is_ somewhat ambiguous 
whether the intention was to move it with the other files or whether the 
new creation point is the right one.

I think that for a human, the details would likely be obvious (and I 
suspect that in most cases it would indeed move with the directory). But 
it really isn't totally clear: what does moving a directory imply for the 
future? Does it imply that the directory should never exist in the future, 
or does it just imply that the _current_ contents move?

Git "tends to" have a policy of not caring about directories at all. For 
example, git will not track an empty directory by default. You _can_ make 
it track one in your commits (the data structures support it), but you're 
really just better off just thinking of git as tracking individual files, 
and not really directories. So as far as git is concerned, "directories" 
mostly don't really have any existence on their own, they only exist as 
paths to reach files.

In that kind of mindset, renaming a directory really is about renaming the 
files that are in that directory, and that explains the git behaviour. It 
may not necessarily be what you expect, but it _is_ consistent, and it's 
not really "wrong" either. It's just another way of looking at the thing.
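
The scenario under discussion can be replayed with a sketch like the one
below, assuming a scratch repository with branches named A and B after the
two working copies in the example. Where goodbye.txt lands is deliberately
not hard-coded: the behaviour described here leaves it in data/, while
newer Git releases have added directory-rename detection to merges and may
move it to greetings/ or stop with a conflict instead.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git symbolic-ref HEAD refs/heads/A

mkdir data && echo "Hello World!" > data/hello.txt
git add data/hello.txt && git commit -q -m "Initial commit"
git branch B

# Branch A renames the directory ...
git mv data greetings
git commit -q -m "Renamed the data directory to greetings"

# ... while branch B adds a file under the old name.
git checkout -q B
echo "Goodbye World!" > data/goodbye.txt
git add data/goodbye.txt && git commit -q -m "Added goodbye message."

# Merge B into A; "|| true" because the merge may stop with a conflict.
git checkout -q A
git merge -m "Merge B" B || true

# Where did goodbye.txt end up?
landed=$(ls -d data/goodbye.txt greetings/goodbye.txt 2>/dev/null)
echo "goodbye.txt landed in: $landed"
```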

Also, I'd like to point out that people worry way too much about merges. 
There are much harder merge conflicts to fix up. If you notice that things 
didn't go the way you expected in a merge, even if it was done 
automatically, you can just do a

	git mv unexpected/directory/file expected/directory/file
	git commit --amend

which basically "fixes up" the automatic merge (that's what the "--amend" 
means: it means "re-do the last commit with _this_ state instead").

(Of course, you could also just make a separate commit to move the file, 
but I think the "manual fixup of the merge" is just cleaner - just add a 
note in the commit message to say you fixed it up by hand. When you do 
your "git commit --amend", it will automatically just give you an editor 
to edit up the commit message too while you're at it).

So again: merges are certainly fairly "hard" from a SCM standpoint, but 
from a user standpoint, they tend to be not at all as important. I would 
again argue that more important than the merge itself (which you can 
trivially just fix up to match your expectations) is to make it easy to 
later _show_ what happened, ie if you examine the file later, you should 
be able to see where it came from.

(And again, with git, things like "git pickaxe" - think of it as just a 
"better annotate" - will indeed pick up the similarity, regardless of 
whether the rename was done manually or automatically as part of the 
merge - exactly because git only really cares about actual contents).

Btw, just to be honest: git _mostly_ thinks in terms of "constant 
pathname patterns" as opposed to "individual paths that move around". 
That's at least partly because of how I work. I actually fairly seldom 
look at an individual file, and tend to much more often look at a group of 
files, and then it's a _lot_ more convenient to do

	gitk drivers/usb include/linux/usb*

where those argument pathnames are _not_ a set of filenames that we track, 
but really something more generic, namely a "repository pathname subset" 
which is constant. The above will show the _subset_ of the kernel 
repository history that is relevant for all the named pathnames, but the 
pathnames are _fixed_. It won't follow files that move out of the 
subdirectories: it will show the history as seen from the viewpoint of a 
certain subset of pathnames.

This also extends to things like "git log". So when you do

	git log kernel/sched.c

if you have a "file ID" mentality, you expect the above to follow renames. 
It doesn't - even though git -can- follow renames, what the above actually 
_means_ is "show the log for the fixed pathname set that only includes one 
single path". 

So if "kernel/sched.c" had originally been called something else, the 
above wouldn't show the rename at all. It would just show that "oh, this 
pathname suddenly was created as a new file", because from the viewpoint 
of that fixed pathname, that's _exactly_ what happens.

We've discussed adding a "--follow" flag to tell "git log" to consider the 
argument to not be a "pathname filter", but an "individual file" kind of 
thing, and I think there was even a patch for it, but I suspect it hasn't 
been a big issue, probably partly because you get rather used to the 
"pathname filter" approach fairly quickly. If you knew what the old 
pathname was, for example, you could get git to _tell_ you about the 
rename by doing

	git log -M -- <set-of-all-pathnames-we're-interested-in-old-included>

and git would happily see the renames that happen _within_ that pathname 
filter (the "-M" is there because by default "git log" doesn't show any 
patches at all, of course, so if you want to see the rename, you need to 
tell git so).
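
A sketch of the difference, on a scratch repository with made-up file
names. It assumes a Git new enough to ship "git log --follow", which is
only under discussion above, and uses "--name-status" to get the rename
reported without a full patch:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

printf 'int main(void)\n{\n\treturn 0;\n}\n' > old.c
git add old.c && git commit -q -m "Add old.c"
git mv old.c new.c && git commit -q -m "Rename old.c to new.c"

# Pathname filter: only commits touching the fixed path "new.c".
as_filter=$(git log --format=%s -- new.c | wc -l | tr -d ' ')

# Individual-file view: keeps going across the rename.
as_file=$(git log --format=%s --follow -- new.c | wc -l | tr -d ' ')

# Filter covering both names, with rename detection switched on.
rename_line=$(git log -M --name-status --pretty=format: -- old.c new.c |
    grep '^R')

echo "filter: $as_filter commit(s), follow: $as_file commit(s)"
echo "$rename_line"
```

With the plain filter the history of new.c starts at the rename commit,
where the path "suddenly was created".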

As a particular example of this behaviour, if you do

	git log -M kernel/

you'll always see any renames that happen _within_ that subdirectory, but 
any files that are moved into (or out of) the subdirectory will be 
considered to be "create" or "delete" events - because you've literally 
told git to ignore all history that is not relevant to the kernel/ 
subdirectory (so they really _are_ "create/delete" events as far as that 
subdirectory is concerned).
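
The same commit seen with and without the directory filter, as a sketch on
a scratch repository (the lib/ directory and file contents are made up):

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

mkdir lib kernel
echo "scheduler" > lib/sched.c
git add lib/sched.c && git commit -q -m "Add lib/sched.c"
git mv lib/sched.c kernel/sched.c
git commit -q -m "Move sched.c into kernel/"

# Whole-repository view: the move is reported as a rename.
whole=$(git log -1 -M --name-status --pretty=format: | grep sched.c)

# Restricted to kernel/: the same commit looks like a plain creation.
inside=$(git log -1 -M --name-status --pretty=format: -- kernel/ |
    grep sched.c)

echo "unfiltered: $whole"
echo "kernel/ only: $inside"
```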

Is this different from other SCM's? Hell yes. git does a lot of things 
differently. Is it useful? Again, hell yes. Especially for a maintainer, 
the ability to talk about pathname _patterns_ is generally much more 
important than talking about any particular file.

[ The pathname thing also means that it's trivial to ask questions like 
  "ok, so what happened to file xyz that I _know_ we used to have, but 
  clearly don't have any more?".

  You just do "git log -- xyz", and you'll see exactly what you wanted to 
  see. The "--" here (and in a previous example) is there because, to avoid 
  ambiguity, git requires that if you name files that don't actually 
  exist, you make it clear that they are filenames, not just mistyped 
  revision ID's or something else. ]
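
A sketch of that lookup on a scratch repository, using the same "xyz" name
for a file that was added and later removed:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "short-lived" > xyz
git add xyz && git commit -q -m "Add xyz"
git rm -q xyz && git commit -q -m "Remove xyz"

# With "--", git treats xyz as a path even though it no longer exists.
history=$(git log --format=%s -- xyz)
echo "$history"

# Without "--" the name is ambiguous and git refuses to guess.
if git log xyz >/dev/null 2>&1; then
    without_dashes=accepted
else
    without_dashes=rejected
fi
echo "git log xyz: $without_dashes"
```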

In general, git gives you the best of both worlds. It knows how to follow 
individual files if you want to, but by default it uses this much more 
generic concept of "pathname filters". The default is definitely 
influenced both by my usage, and my (obviously very strong) opinions on 
what is more important (and thus the git "mental model").

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                     ` <20061021144704.71d75e83.seanlkml@sympatico.ca>
@ 2006-10-21 18:47                       ` Sean
  2006-10-21 18:47                       ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-21 18:47 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Linus Torvalds, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On Sat, 21 Oct 2006 20:34:28 +0200
Jan Hudec <bulb@ucw.cz> wrote:

> For one thing I, like others already expressed, think a difference should
> be made between 'centralized' and 'star-topology'. Subversion is
> centralized -- I don't think bzr is biased towards that kind of
> centralization, though it provides tools (bound branches) to make it
> easy.

A star-topology assumes there is a central server from which the points
of the star emerge.  It is very much a centralized model and one that
bzr is clearly optimized for.  The difference between bzr and say
cvs is that bzr provides offline abilities where checkins to the
central server can be deferred by checking them in locally first.

The bzr bias towards this model is implicit in its affection for
revnos, which depend on a central repository to synchronize them for
all the points of the star.

[...]
> On the other hand git is biased away from centralized (as in subversion
> is centralized) in that it takes extra work to make sure you are always
> synchronized (while bzr has bound branches to do the checking for you).
> For open-source development, centralized is a wrong way to go, but
> people use version control tools for other purposes as well and for some
> of them staying synchronized is important.

Please reconsider this point: Git can be configured to push every commit
to a central server immediately.  It's just that such a model is so inferior
in almost every way, that it's not typically done.
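
One way such a setup could look, purely as a sketch: a post-commit hook
that pushes the current branch after every commit. The hook body, the
"central.git" repository and the local path standing in for a server are
all assumptions for illustration, not something described in this thread.

```shell
top=$(mktemp -d) && cd "$top"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A bare repository plays the role of the central server.
git init -q --bare central.git
git init -q work && cd work
git remote add origin "$top/central.git"

# Hypothetical hook: push the current branch after every commit.
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
exec git push --quiet origin HEAD
EOF
chmod +x .git/hooks/post-commit

echo "hello" > file && git add file && git commit -q -m "First commit"

branch=$(git symbolic-ref --short HEAD)
local_tip=$(git rev-parse HEAD)
central_tip=$(git --git-dir="$top/central.git" rev-parse "refs/heads/$branch")
echo "local $local_tip, central $central_tip"
```

A failed push does not undo the commit, since post-commit runs after the
commit already exists; the local history simply gets ahead of the server.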

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-17 19:51             ` Aaron Bentley
@ 2006-10-21 18:58               ` Jan Hudec
       [not found]                 ` <20061021150233.c29e11c5.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 18:58 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Sean, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Tue, Oct 17, 2006 at 03:51:56PM -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Sean wrote:
> > On Tue, 17 Oct 2006 00:24:15 -0400
> > Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> >>- - you can use a checkout to maintain a local mirror of a read-only
> >>  branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> > 
> > 
> > I'm not sure what you mean here.  A bzr checkout doesn't have any history
> > does it?
> 
> By default, they do.  You must use a flag to get a checkout with no history.

If I can add some clarification: there is a lightweight checkout and a
heavyweight checkout. The former contains no history and does everything
(except status, and I am not sure about diff) by accessing the remote
data. The latter contains a mirror of the history data and does
write-through on commit (and otherwise behaves like a normal branch with
a repository).

What would be really useful would be a checkout, or even a branch (i.e.
with the ability to commit locally), that would only contain history data
since some point. This would allow downloading very little data when
branching, but then working locally as with a normal repository clone.

In bzr this was already discussed, and the storage supports so-called
"ghost" revisions, whose existence is known, but not their data. There
are even repositories around that contain them (created by converting
data from arch), but to the best of my knowledge there is no user interface to
create branches or checkouts with partial data.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061021150233.c29e11c5.seanlkml@sympatico.ca>
  2006-10-21 19:02                   ` Sean
@ 2006-10-21 19:02                   ` Sean
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-21 19:02 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Sat, 21 Oct 2006 20:58:25 +0200
Jan Hudec <bulb@ucw.cz> wrote:

> In bzr this was already discussed and the storage supports so called
> "ghost" revisions, whose existence is known, but not their data. There
> are even repositories around that contain them (created by converting
> data from arch), but to my best knowledge there is no user interface to
> create branches or checkouts with partial data.

In Git the same functionality could be achieved with so-called shallow
clones.  Unfortunately, they've only been discussed and not yet
implemented.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 18:11                                                   ` Matthew D. Fuller
@ 2006-10-21 19:19                                                     ` Jeff King
  2006-10-21 19:30                                                       ` Jakub Narebski
  2006-10-21 21:46                                                       ` Matthew D. Fuller
  2006-10-21 19:41                                                     ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-21 19:19 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Jakub Narebski, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On Sat, Oct 21, 2006 at 01:11:49PM -0500, Matthew D. Fuller wrote:

> Maybe that's what you mean by 'centralization'; each branch is central
> to itself.  That seems a pretty useless definition, though.  In my
> mind, actually, it's MORE distributed; my branch remains my branch,
> and your branch remains your branch, and the difference doesn't keep
> us from working together and moving changes back and forth.  Forcing
> my branch to become your branch sounds a lot more "centralized" to me.
> 
> Now, we can discuss THAT distinction.  I'm not _opposed_ to git's

OK, let's discuss. :)

I think the concept of "my" branch doesn't make any sense in git.
Everyone is working collectively on a DAG of the history, and we all
have pointers into the DAG. Something is "my" branch in the sense that I
have a repository with a pointer into the DAG, but then again, so do N
other people. I control my pointer, but that's it.

So don't think of it as "git throws away branch identity" as much as
"git never cared about branch identity in the first place, and doesn't
think it's relevant."

Now, there are presumably advantages and disadvantages to these
approaches. I like the fact that I can prepare a repository from
scratch, import it from cvs, copy it, push it, or do whatever I like,
and the end result is always exactly the same (revids included). With
your model, on the other hand, it seems the advantages are that in many
cases you can do things like distributed revnos.
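
To make the "pointers into a DAG" picture concrete, here is a minimal
Python sketch.  It is a toy model, not git's actual object format: the
Repo class, the commit_id helper and the tree labels are hypothetical,
and real git also hashes author, committer and timestamps into each
commit.  The only point it illustrates is that an id derived from content
plus parent ids comes out the same no matter which repository built the
history or what the local branch is called.

```python
import hashlib


def commit_id(tree, parents, message):
    """Derive an id from the commit's content and its parents' ids."""
    payload = "\n".join([tree, *parents, message]).encode()
    return hashlib.sha1(payload).hexdigest()


class Repo:
    """Toy repository: an object store plus named pointers into it."""

    def __init__(self):
        self.objects = {}  # commit id -> (tree, parents, message)
        self.refs = {}     # branch name -> commit id (the "pointer")

    def commit(self, branch, tree, message):
        parent = self.refs.get(branch)
        parents = (parent,) if parent else ()
        cid = commit_id(tree, parents, message)
        self.objects[cid] = (tree, parents, message)
        self.refs[branch] = cid  # only the pointer moves
        return cid


# Two people build the same history independently, under different names.
mine = Repo()
yours = Repo()
for repo, name in ((mine, "master"), (yours, "imported-from-cvs")):
    repo.commit(name, "tree-1", "initial import")
    repo.commit(name, "tree-2", "fix typo")

same_tip = mine.refs["master"] == yours.refs["imported-from-cvs"]
```

In this model the branch name never enters the hash, which is one way to
read "git never cared about branch identity in the first place".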

> agree on branch identity, it's completely pointless to keep yakking
> about revnos, because they're a direct CONSEQUENCE of that difference
> in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
> have revnos, I'd STILL want my branch to keep its identity.  You could
> name the mainline revisions after COLORS if you wanted, and I'd still
> want my branch to keep its identity.  Aren't we through rehashing the
> same discussion about the EFFECTS?

I agree completely.

> > 2. But the preferred git workflow is to have two branches in each of
> > two clones. The 'origin' branch where you fetch changes from other
> > repository (so called "tracking branch") and you don't commit your
> > changes to [...]
> 
> Funny, since this reads to me EXACTLY like the bzr flow of "upstream
> branch I pull" and "my branch I merge from upstream" that's getting
> kvetched around...

The difference, I think, is that it's easier in git to move the upstream
around: you simply start fetching from a different place. I'm not clear
on how that works in bzr (if it invalidates revnos or has other side
effects).

-Peff

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 17:51                                                   ` Jakub Narebski
@ 2006-10-21 19:20                                                     ` Jan Hudec
  0 siblings, 0 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 19:20 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Jeff King, bazaar-ng, git

On Sat, Oct 21, 2006 at 07:51:43PM +0200, Jakub Narebski wrote:
> Jan Hudec wrote:
> 
> > Besides I start to think that it should be actually possible to solve
> > this case with the git-style approach. I have to state beforehand, that
> > I don't know how the most recent git algorithm works, but I imagine
> > there is some kind of 'brackets' saying the text is in a given file. Now
> > if those 'brackets' were not flat, but nested, ie. instead of saying
> > 'this is in foo/bar' it would say 'this is in bar is in foo', the
> > difference when renaming directory would only affect the 'outer bracket'
> > and therefore merge correctly with adding content inside it.
> 
> You mean, to consider "contents" of a directory union of contents
> of files and directories it contains, and then use the same "rename
> detection" algorithm as for files?

Yes, something like that.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 18:42                                                   ` Linus Torvalds
@ 2006-10-21 19:21                                                     ` Jakub Narebski
  2006-11-03  6:36                                                       ` Martin Langhoff
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 19:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jan Hudec, Jeff King, bazaar-ng, git

Linus Torvalds wrote:

> We've discussed adding a "--follow" flag to tell "git log" to consider the 
> argument to not be a "pathname filter", but a "individual file" kind of 
> thing, and I think there was even a patch for it, but I suspect it hasn't 
> been a big issue, probably partly because you get rather used to the 
> "pathname filter" approach fairly quickly.

If I remember correctly, the patch implementing --follow was fairly
intrusive, and was unfortunate in that it was posted during changes
in diffcore.

Lack of --follow is not a big issue because you can do this "by hand";
you can use git-diff-tree -M at the end of file history to check if
[git considers] it was moved from somewhere.

During the discussion we agreed that we would like to have both
a --follow rename-following limiter and a static path limiter (and
that it would be nice to extend the static path limiter to include globs).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:19                                                     ` Jeff King
@ 2006-10-21 19:30                                                       ` Jakub Narebski
  2006-10-21 19:47                                                         ` Jan Hudec
  2006-10-21 19:55                                                         ` Linus Torvalds
  2006-10-21 21:46                                                       ` Matthew D. Fuller
  1 sibling, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 19:30 UTC (permalink / raw)
  To: Jeff King
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Jeff King wrote:

> The difference, I think, is that it's easier in git to move the upstream
> around: you simply start fetching from a different place. I'm not clear
> on how that works in bzr (if it invalidates revnos or has other side
> effects).

That's a good example of a fully distributed approach. I can fetch directly
(actually, I cannot) from Junio's private repository, I can fetch from the
public git.git repository, using either the git:// or http:// protocol, or
I can fetch from somebody else's clone of the git repository; I can intermix
those fetches, and the revids (commit-ids) remain constant and unchanged.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 18:11                                                   ` Matthew D. Fuller
  2006-10-21 19:19                                                     ` Jeff King
@ 2006-10-21 19:41                                                     ` Jakub Narebski
  2006-10-22 19:18                                                       ` David Clymer
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 19:41 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

Matthew D. Fuller wrote:
> On Sat, Oct 21, 2006 at 04:08:18PM +0200 I heard the voice of
> Jakub Narebski, and lo! it spake thus:
>> Dnia sobota 21. października 2006 15:01, Matthew D. Fuller napisał:

>> When two clones of the same repository (in git terminology), or two
>> "branches" (in bzr terminology), used by different people, cannot be
>> totally equivalent that is centralization bias.
> 
> This is obviously some new meaning of "centralization" bearing no
> resemblance whatsoever to how I understand the word.

Perhaps I'd better use "star topology bias" instead of "centralization
bias".
 
> In git, apparently, you don't give a crap about a branch's identity
> (alternately expressible as "it has none"), and so you throw it away
> all the time.  Given that, revnos even if git had them would never be
> of ANY use to you, so it's no wonder you have no use for the notion.

In git, branches are lightweight. Branch names are local to a repository.
Repositories have identity. A bzr "branch" is a strange mix of a one-branch
git repository and a git branch.

Git's main workflow is fully decentralized. All clones of the
same repository are created equal. In bzr the suggested workflow
(with revnos) forces one (or more) branches to be the mainline (use "merge",
get empty merges, revnos don't change) and the others to be leaves (use
"pull", revnos change).
 
> I DO give a crap about my branches' identities.  I WANT them to retain
> them.  If I have 8 branches, they have 8 identities.  When I merge one
> into another, I don't WANT it to lose its identity.  When I merge a
> branch that's a strict superset of second into that second, I don't
> WANT the second branch to turn into a copy of the first.  If I wanted
> that, I'd just use the second branch, or make another copy of it.  I
> don't WANT to copy it.  I just want to merge the changes in, and keep
> on with my branch's current identity.

I don't understand. If I merge the 'next' branch into 'master' in git, I
still have two branches: 'master' and 'next'.

And I don't understand why you are so hung up on branch identities. Yes, if
somebody clones your 'repo' repository, he can have your 'master' branch
(refs/heads/master) named 'repo' (refs/heads/repo) or 'repo/master'
(refs/remotes/repo/master), but why does that matter to you? It is _his_
(or her ;-) clone.

> Now, we can discuss THAT distinction.  I'm not _opposed_ to git's
> model per se, and I can think of a lot of cases where it'd be really
> handy.  But those aren't most of my cases.  And as long as we don't
> agree on branch identity, it's completely pointless to keep yakking
> about revnos, because they're a direct CONSEQUENCE of that difference
> in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
> have revnos, I'd STILL want my branch to keep its identity.  You could
> name the mainline revisions after COLORS if you wanted, and I'd still
> want my branch to keep its identity.  Aren't we through rehashing the
> same discussion about the EFFECTS?

For revnos to work you MUST have one "branch" that is considered
special, the hub in a star topology. This very much precludes fully
distributed development.

BTW, I get that you can use revids instead of revnos in bzr for fully
distributed development that is not geared towards a star topology. But
Bazaar-NG revids are uglier than Git commit-ids.

[...]
>> And you say that bzr is not biased towards centralization? In git
>> you can just pull (fetch) to check if there were any changes, and if
>> there were not you don't get useless marker-merges.
> 
> If I don't tell you my branch has something in it ready to grab, you
> shouldn't merge it.  It probably won't work, and is quite likely to
> set your computer on fire, slaughter and fillet your pet goldfish, and
> make demons fly out of your nose.  If you wanna get stuck with all my
> incomplete WIP, let's just use a CVS module and be done with it.

In git I can fetch your changes but I don't need to merge them. Take
for example Junio's 'pu' (proposed updates) branch: this is a branch
you shouldn't merge, as its history is constantly being rewritten.

If you don't want your WIP to be publicly available, you don't
publish it. For example, as far as I understand, Junio works on Git
in his private repository, with many, many feature branches, but
he pushes only some subset of branches to the public [bare] repository,
and we can fetch/pull only those.

But still, if I am impatient I can pull from Junio every hour, and
I don't get 24 totally useless empty merge messages if he took a day
off and didn't publish any changes until a day later.

>> 2. But the preferred git workflow is to have two branches in each of
>> two clones. The 'origin' branch where you fetch changes from other
>> repository (so called "tracking branch") and you don't commit your
>> changes to [...]
> 
> Funny, since this reads to me EXACTLY like the bzr flow of "upstream
> branch I pull" and "my branch I merge from upstream" that's getting
> kvetched around...

But please, have you realized that in this workflow the two clones
of the same repository are totally symmetrical? One's 'master' is
the other's 'origin' and vice versa. After a pull on one side and a pull
on the other side (without any changes in between) we have the same
contents and the same revision names (commit-ids in git), even if
the changes (revisions) got to those clones in a different order.
In bzr those two "branches" would get different revnos. No symmetry.
Fully distributed vs star topology (one branch "central", hence
"centralized"; I don't mean the need to access one central repository,
although...)
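
The asymmetry can be shown with a small Python sketch.  It rests on
stated assumptions: revnos are modelled as positions along the
first-parent chain from a branch tip, commit ids as a hash of message
plus parent ids, and both clones share one object store for brevity.
The make_commit and revnos helpers are hypothetical names, not part of
bzr or git.

```python
import hashlib


def make_commit(store, message, parents=()):
    """Record a commit whose id depends only on its message and parents."""
    payload = ("|".join(parents) + "|" + message).encode()
    cid = hashlib.sha1(payload).hexdigest()[:8]
    store[cid] = tuple(parents)
    return cid


def revnos(store, head):
    """Number commits along the first-parent chain, oldest commit = 1."""
    chain = []
    while head is not None:
        chain.append(head)
        parents = store[head]
        head = parents[0] if parents else None
    chain.reverse()
    return {number: cid for number, cid in enumerate(chain, start=1)}


store = {}
base = make_commit(store, "base")
a1 = make_commit(store, "work in clone A", [base])
b1 = make_commit(store, "work in clone B", [base])

before = revnos(store, b1)      # numbering seen in clone B before syncing
merge = make_commit(store, "merge B into A", [a1, b1])
after = revnos(store, merge)    # numbering after B adopts A's merged tip

moved = before[2] != after[2]   # number 2 now names a different commit
```

Here number 2 pointed at B's own commit before the sync and at A's commit
afterwards, while the hash-based ids of both commits never changed.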

-- 
Jakub Narebski
ShadeHawk on #git
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:30                                                       ` Jakub Narebski
@ 2006-10-21 19:47                                                         ` Jan Hudec
  2006-10-21 19:55                                                         ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-21 19:47 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jeff King, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git

On Sat, Oct 21, 2006 at 09:30:30PM +0200, Jakub Narebski wrote:
> Jeff King wrote:
> 
> > The difference, I think, is that it's easier in git to move the upstream
> > around: you simply start fetching from a different place. I'm not clear
> > on how that works in bzr (if it invalidates revnos or has other side
> > effects).

Moving upstream around does not invalidate revnos. Switching to a different
upstream (i.e. one whose head revisions are different) does. And this may
also happen by doing a merge with the previous mainline as a non-first parent
-- revnos are simply short aliases for revids, not persistent unique
identifiers.

> That's good example of fully distributed approach. I can fetch directly
> (actually, I cannot) from Junio private repository, I can fetch from
> public git.git repository, either using git:// or http:// protocol,
> I can fetch from somebody else clone of git repository: intermixing
> those fetches, and revids (commit-ids) remain constant and unchanged.

So do revids in bzr.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:30                                                       ` Jakub Narebski
  2006-10-21 19:47                                                         ` Jan Hudec
@ 2006-10-21 19:55                                                         ` Linus Torvalds
  2006-10-21 20:19                                                           ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21 19:55 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Matthew D. Fuller, Jeff King, Andreas Ericsson,
	Carl Worth, git



On Sat, 21 Oct 2006, Jakub Narebski wrote:
> 
> That's good example of fully distributed approach. I can fetch directly
> (actually, I cannot) from Junio private repository, I can fetch from
> public git.git repository, either using git:// or http:// protocol,
> I can fetch from somebody else clone of git repository: intermixing
> those fetches, and revids (commit-ids) remain constant and unchanged.

This is nice for a couple of situations:

 - if some particular machine is down, nobody really cares. It doesn't 
   really change the workflow at all if "master.kernel.org" were to be 
   off-line due to some trouble - it just happens to be a machine with 
   good bandwidth that a number of kernel (and git) developers have access 
   to, but if you want to sync with something else, go wild. We could just 
   sync directly between developers, although most people tend to have 
   firewalls (I certainly have a very anal one - not even ssh gets in) 
   making it usually easier to go through some - any - public place.

   But in git, the "public place" really is just an intermediary. It has 
   nothing to do with anything history-wise, and its revision IDs are a 
   non-issue. It's just a temporary staging area (although re-using the 
   same repo over and over for pushing things out obviously means you can 
   do just incremental updates, so most everybody does that).

 - sometimes you have multiple branches in the same tree that have very 
   _different_ sources. For example, you might start out cloning my tree, 
   but if you _also_ want to track the stable tree, you just do so: you 
   can just do

	git fetch <repo> <remote-branch-name>:<local-branch-name>

   at any time, and you now have a new branch that tracks a different 
   repository entirely (to make it easier to keep track of them, you'd 
   probably want to make note of this in your .git/config file or your 
   remote tracking data, but that's a small "usability detail", not a real 
   conceptual issue).

 - the same "multi-source" thing is true for pushing things out too, not 
   just fetching: I still have my personal git.git repository on 
   kernel.org for historical reasons, even though Junio maintains the 
   normal one. So when I did some experimental (and broken) stuff for "git 
   unpack-objects" in a local branch, and others were interested in fixing 
   it, I just pushed it out to my git repo as a new branch - one that 
   Junio doesn't have.

   So now my kernel.org git repo not only tracks all of Junio's branches 
   (basically just a mirror of his tree), I also have a few stale branches 
   of my own that I did some work on separately. So it's kind of a 
   "Frankenstein's monster" of different branches from different sources. 

   And I think that's fairly common, actually (i.e. many kernel developers 
   that publicise their own git trees often have a "linus" branch that 
   tracks mine, along with their own "real" branches).

And note how in none of these situations does it matter what the 
"original" branch was. It might even be a way to just pre-populate the 
tree. For a real-life example, a week or two ago, Jesper Juhl wanted to 
download my kernel tree (which is about 140MB in size), but he's somewhere 
in Europe, and apparently the connection to kernel.org was just _really_ 
slow. 

So what I told him to do was:

   Hmm. I suspect most mirrors avoid the /pub/scm directory, but there are a 
   few places that mirror git trees in general, eg

        http://www.jur-linux.org/git/

   might be closer to you.

   Once you have _one_ kernel repo, you can clone another easily using

        git clone --reference <mylocalrepo> <remotereponame> [localdir]

   but you do need to have the thing in git format, not just a snapshot, to 
   do that.

and that's exactly what he did (and he could have just fetched into the 
original archive entirely):

   I could only get 2-3kb/sec from kernel.org and at that speed 140MB is 
   *HUGE*.

   That was a lot better. got more than 200kb/sec from there.

so the point here is, "distributed" really is more than star-topology. If 
you think outside the star, you can take useful shortcuts.

Now, I'm sure that bzr can probably do all the same things. This is likely 
less an issue of "technology" than of "mindset". The "git way" tends to 
make all of these things very trivial - the notion of tracking multiple 
branches from multiple _different_ repositories in one local repo just 
fits very naturally in the whole git mentality.
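
As a rough illustration of the multi-source idea, the Python sketch below
models a fetch of the form <remote-branch-name>:<local-branch-name> from
two different repositories into one local repository.  Everything in it
is a simplification: repositories are plain dicts, the commit ids "c1",
"c2" and "s1" are made-up placeholders, and the reachable and fetch
helpers are hypothetical names rather than anything git exposes.

```python
def reachable(objects, tip):
    """Return every commit id reachable from tip by following parents."""
    seen, todo = set(), [tip]
    while todo:
        cid = todo.pop()
        if cid not in seen:
            seen.add(cid)
            todo.extend(objects[cid])
    return seen


def fetch(local, remote, remote_branch, local_branch):
    """Copy missing history from remote, then point a local name at its tip."""
    tip = remote["refs"][remote_branch]
    for cid in reachable(remote["objects"], tip):
        # Objects already present locally are shared, not duplicated.
        local["objects"].setdefault(cid, remote["objects"][cid])
    local["refs"][local_branch] = tip


# Two source repositories that share the root commit "c1".
linus = {"objects": {"c1": (), "c2": ("c1",)}, "refs": {"master": "c2"}}
stable = {"objects": {"c1": (), "s1": ("c1",)}, "refs": {"master": "s1"}}

mine = {"objects": {}, "refs": {}}
fetch(mine, linus, "master", "linus")
fetch(mine, stable, "master", "stable")
```

After both fetches the local repository holds three commits and two
branch names, and nothing records which source was "original".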

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-20 21:48                                             ` Carl Worth
  2006-10-21 13:01                                               ` Matthew D. Fuller
@ 2006-10-21 20:05                                               ` Aaron Bentley
  2006-10-21 20:48                                                 ` Jakub Narebski
                                                                   ` (2 more replies)
  1 sibling, 3 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-21 20:05 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:
>> I understand your argument now.
>>                                  It's nothing to do with numbers per se,
>> and all about per-branch namespaces.  Correct?
> 
> The entire discussion is about how to name things in a distributed
> system. The premise that Linus has put forth in a very compelling way,
> is that attempting to use sequential numbers for names in a
> distributed system will break down. The breakdown could be that the
> names are not stable, or that the system is used in a centralized way
> to avoid the instability of the names.

So I'd say that revnos without the context of a location can only refer
to the current branch that the user is working on.  They don't refer to
the mainline, which typically has its own numbers that don't match the
user's.

If you're saying that bzr is "centralized" in that the user's current
branch is special, then I'll say "guilty as charged".

> But it really is fundamental and unavoidable that sequential numbers
> don't work as names in a distributed version control system.

Right.  You need something guaranteed to be unique.  It's the revno +
url combo that is unique.  That may not be permanent, but anyone can
create one of those names, so it is decentralized.

>> I meant that the active branch and a mirror of the abandoned branch
>> could be stored in the same repository, for ease of access.
> 
> Granted, everything can be stored in one repository. But that still
> doesn't change what I was trying to say with my example. One of the
> repositories would "win" (the names it published during the fork would
> still be valid). And the other repository would "lose" (the names it
> published would be not valid anymore). Right?

No.  It would be silly for the losing side to publish a mirror of the
winning branch at the same location where they had previously published
their own branch.  So the old number + URL combination would remain valid.

If the losing faction decided to maintain their own branch after the
merge, they'd have two options:

1. continue to develop against the losing "branch", without updating its
numbers from the "winning" branch.  It would be hard to tell who had won
or lost in this case.

2. create a new mirror of the "winning" branch and develop against that.
 I'm not sure what the point of this would be.

I think the most realistic thing in this scenario is that they leave the
"losing" branch exactly where it was, and develop against the "winning"
branch.

>> Bazaar encourages you to stick lots and lots of branches in your
>> repository.  They don't even have to be related.  For example, my repo
>> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
> 
> Git allows this just fine. And lots of branches belonging to a single
> project is definitely the common usage. It is not common (nor
> encouraged) for unrelated projects to share a repository, since a git
> clone will fetch every branch in the repository.

Right.  This is a difference between Bazaar and Git that I'd
characterize as being "branch-oriented" vs "repository-oriented".  We'll
see more of this below.

> I'm noticing another terminology conflict here. The notion of "branch"
> in bzr is obviously very different than in git. For example the bzr
> man page has a sentence beginning with "if there is already a branch
> at the location but it has no working tree". I'm still not sure
> exactly what a bzr branch is, but it's clearly something different
> from a git branch, (which is absolutely nothing more than a name
> referencing a particular commit object).

I got the impression there was also a local ordering of revisions.  Is
that wrong?

A Bazaar branch is a directory inside a repository that contains:
 - a name referencing a particular revision
 - (optional) the location of the default branch to pull/merge from
 - (optional) the location of the default branch to push to
 - (optional) the policy for GPG signing
 - (optional) an alternate committer-id to use for this branch
 - (optional) a nickname for the branch
 - other configuration options

A Bazaar branch doesn't contain any commit objects ("revisions" in
Bazaar parlance).  Those are retrieved from the containing repository.

It doesn't contain any working files, but a branch and a working tree
may coexist in the same directory.  Similarly, a branch and a repository
may coexist in the same directory.
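
The description above can be written down as a small data-structure
sketch in Python.  The field and class names are hypothetical labels for
the items in the list, not Bazaar's real on-disk format or API; the
sketch only shows that a branch holds a pointer plus settings, while the
revisions themselves live in the containing repository.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BranchRecord:
    """What a branch holds in this model: a tip pointer and settings."""
    tip_revision: str                          # name referencing a revision
    parent_location: Optional[str] = None      # default pull/merge source
    push_location: Optional[str] = None        # default push target
    gpg_signing_policy: Optional[str] = None
    committer_id: Optional[str] = None
    nickname: Optional[str] = None
    other_options: dict = field(default_factory=dict)


@dataclass
class Repository:
    """Shared store: revisions live here, branches only refer to them."""
    revisions: dict = field(default_factory=dict)  # revision id -> data
    branches: dict = field(default_factory=dict)   # path -> BranchRecord


repo = Repository()
repo.revisions["rev-1"] = "initial commit"
repo.branches["origin"] = BranchRecord(tip_revision="rev-1")
repo.branches["master"] = BranchRecord(
    tip_revision="rev-1",
    parent_location="../origin",
    nickname="my work",
)
```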

So this is one common layout:

Repository:
~/repo/

Branch:
~/repo/branch

Working Tree:
~/workingtree

This is another common layout:

Repository:
~/

Branch:
~/mybranch

Working Tree:
~/mybranch

This layout is our default, a "standalone tree":

Repository:
~/mybranch

Branch:
~/mybranch

Working Tree:
~/mybranch

This layout is an imitation of Git, as I understand it:

Repository:
~/repo

Branches:
~/repo/origin
~/repo/master

Working Tree:
~/repo

> Second, I'm not comfortable
> with any limit on usefulness of history. Would you willingly throw
> away commits, mailing list posts, or closed bug reports older than any
> given age for any projects that you care about?

I think the mailing list posts age the best, because they provide a
record of rationales for design decisions.  But I'd throw away old
commits if there were a good reason, like lack of disk space.  Not so
sure about bug reports.

> Second, I think that using the filesystem for separating branches is a
> really bad idea. 

The canonical way to name branches in Bazaar is with URLs, though we
support file paths where possible.  Part of the "simple namespace" thing
is that branches are simply URLs, so in order to retrieve a branch, all
you need is one URL.

> One, it intrudes on my branch namespace, (note that
> in many commands above I have to use things like "../b" where I'd like
> to just name my branch "b". 

While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
http://bazaar-vcs.org/bzr/bzr.dev" is a big win.

> Two, it prevents bzr from having any
> notion of "all branches" in places where git takes advantage of it,
> (such as git-clone and "gitk --all").

No, it doesn't.  Bazaar can easily list all the branches in a
repository, just by starting with the repository root, and recursing
through all the subdirectories, looking for branches.
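
A minimal Python sketch of that recursive search follows.  It assumes a
branch can be recognised by a control directory at .bzr/branch inside it;
that marker path and the find_branches helper are assumptions made for
illustration, not a description of how bzr itself performs the listing.

```python
import os
import tempfile


def find_branches(repo_root, marker=os.path.join(".bzr", "branch")):
    """Walk repo_root and return relative paths of directories with marker."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(repo_root):
        if os.path.isdir(os.path.join(dirpath, marker)):
            found.append(os.path.relpath(dirpath, repo_root))
        # Do not descend into the control directories themselves.
        dirnames[:] = [name for name in dirnames if name != ".bzr"]
    return sorted(found)


# Build a throwaway directory tree with two branches and one plain folder.
with tempfile.TemporaryDirectory() as root:
    for name in ("master", "b"):
        os.makedirs(os.path.join(root, name, ".bzr", "branch"))
    os.makedirs(os.path.join(root, "notes"))
    branches = find_branches(root)
```

The cost of such a walk grows with the size of the directory tree, which
is one practical difference from reading a flat list of refs.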

That said, we do have a mentality that branches, not repositories, are
what's important to users in day-to-day use.

> Three, it certainly encourages
> the storage problem I ran into above, (and I'd be interested to see a
> "corrected" version of the commands above to fix the storage
> inefficiencies).

$ bzr init-repo bzrtest --trees
$ bzr init bzrtest/master; cd bzrtest/master
$ touch a; bzr add a; bzr commit -m "Initial commit of a"
$ bzr branch . ../b; cd ../b
$ touch b; bzr add b; bzr commit -m "Commit b on b branch"
$ echo "change" > b; bzr commit -m "Change b on b branch"
$ bzr branch ../master ../c; cd ../c
$ touch c; bzr add c; bzr commit -m "Commit c on c branch"
$ echo "change" > c; bzr commit -m "Change c on c branch"
$ cd ../master
$ bzr merge ../b; bzr commit -m "Merge in b"
$ bzr merge ../c; bzr commit -m "Merge in c"

> I hadn't realized that the dotted decimal notation was so new that the
> community hadn't had a lot of experience with it yet. But, your
> description doesn't actually presume that notation. What you asked
> was:
> 
> 	> When you create a new branch from scratch, the number starts at zero.
> 	> If you copy a branch, you copy its number, too.
> 	>
> 	> Every time you commit, the number is incremented.  If you pull, your
> 	> numbers are adjusted to be identical to those of the branch you pulled from.
> 	>
> 	> Is that really complicated?
> 
> And to answer. That description doesn't describe at all what happens
> to the "simple" numbers of commits that are merged.

Nothing happens to them, because they were never part of this branch, so
they didn't ever exist in this context.

> I still don't
> understand how people can avoid number changing, (since pull seems the
> only way to synch up without infinite new merge commits being added
> back and forth).

Why would anyone commit if the merge introduced no changes?

>> What's nice is being able see the revno 753 and knowing that "diff -r
>> 752..753" will show the changes it introduced.  Checking the revo on a
>> branch mirror and knowing how out-of-date it is.
> 
> With git I get to see a revision number of b62710d4 and know that
> "diff b62710d4^ b62710d4" will show its changes, though much more
> likely just "show b62710d4". I really cannot fathom a place where
> arithmetic on revision numbers does something useful that git revision
> specifications don't do just as easily. Anybody have an example for
> me?

My understanding is that ^ is treated as a special metacharacter by some
shells, which is why bzr revision specs are more long-winded.

> PS. The "bzr branch" of bzr.dev did eventually finish. I can see the
> dotted-decimal numbers in my example now, (1.1.1 and 1.1.2 for the
> commits that came from branch b; 1.2.1 and 1.2.2 for the commits that
> came from branch c). At 5 characters a piece these are well on their
> way to getting just as "ugly" as git names, (once it's all
> cut-and-paste the difference in ugliness is negligible).

Yeah, I'm not sure I like those dotted numbers, either.

> And now, I see it's not just pull that does number rewriting. If I use
> the following command (after the chunk of commands above):
> 
> 	cd ..; bzr branch -r 1.2.2 master 1.2.2

It's not number rewriting, it's number writing.  It doesn't change the
numbers in master, or any other existing branch.  (Push also does number
rewriting, because it's mostly the inverse of pull).

> It appears to just create newly linearized revision numbers from whole
> cloth for the new branch (1, 2, and 3 corresponding to mainline 1,
> 1.2.1, and 1.2.2). That's totally surprising, very confusing, and
> would invalidate any use I wanted to make of published revision
> numbers for the mainline branch while I was working on this branch.

I think the intent of those numbers was for operations like "diff".  I
never branch from a revision, always from a branch, which will preserve
numbers.

> See? This stuff really doesn't work.

Our experience really is that it does work.

> Is there even a way to say "show me the change introduced by what is
> named '1.2.1' in the source branch in this scenario" ?

The revno:branch notation ought to work, but I guess there's a bug.  Not
surprising, since dotted revnos are new in this release.

> Note: In #bzr I just learned that there is a way for me to do this
> _if_ I also happen to have a pull of the original branch somewhere on
> my machine. 

This should work with any URL, not just locations on your machine.

> But with bzr if I find "1.2.1" somewhere I'm likely to type:

The problem here is the "somewhere".  Since each branch has its own
revno namespace, you need to know where to use the revno effectively.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOn190F+nu1YWqI0RAn1nAKCDqT8gbzm/xIMjbc3kTFCkpMbJvwCeJiWr
3fLtDo4uLwtAWi+pQOrgPLU=
=0GeT
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-21 19:55                                                         ` Linus Torvalds
@ 2006-10-21 20:19                                                           ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 20:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff King, Matthew D. Fuller, bazaar-ng, Carl Worth,
	Andreas Ericsson, git

Linus Torvalds wrote:
>  - sometimes you have multiple branches in the same tree that have very 
>    _different_ sources. For example, you might start out cloning my tree, 
>    but if you _also_ want to track the stable tree, you just do so: you 
>    can just do
> 
>         git fetch <repo> <remote-branch-name>:<local-branch-name>
> 
>    at any time, and you now have a new branch that tracks a different 
>    repository entirely (to make it easier to keep track of them, you'd 
>    probably want to make note of this in your .config file or your remote 
>    tracking data, but that's a small "usability detail", not a real 
>    conceptual issue).

That, for example, allows joining two initially separate projects
into one. That was the case for gitk and gitweb, which are now in the
git.git repository. Most probably gitweb/gitk was fetched into a
separate gitweb/gitk branch, then merged with the 'master' branch of
git (in the case of gitweb we "resolved the conflict" by moving all
gitweb files to the gitweb/ subdirectory), then propagated to other
branches by merging with master.

For example, git has 7 "initial" (parentless) commits. Two of them
are the superficial 'html' and 'man' branches for automatic generation
of the HTML and man versions of the git documentation, keeping it
current. There is a 'todo' branch, [also] totally separate, for notes.
And there are the initial commits of git, git-tools, gitk and gitweb:
 * Initial revision of "git", the information manager from hell
 * Start of early patch applicator tools for git.
 * Add initial version of gitk to the CVS repository
 * first working version 
   [of gitweb: this commit message should be more descriptive]

$ git rev-list --parents --all | grep -v " " | xargs git -p show
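
The pipeline above keeps only those lines of "git rev-list --parents"
output that contain no space, i.e. commits listed without any parent.
A minimal Python sketch of the same filter, run on made-up sample
output (the abbreviated ids are hypothetical):

```python
def root_commits(rev_list_output):
    """Return the ids of commits that have no parents.

    Each line of `git rev-list --parents` is a commit id followed by
    the ids of its parents, separated by spaces.
    """
    roots = []
    for line in rev_list_output.splitlines():
        fields = line.split()
        if len(fields) == 1:  # a commit id and nothing else: no parents
            roots.append(fields[0])
    return roots


# Hypothetical sample; real output uses full 40-character ids.
SAMPLE = """\
d4e5 b2c3 c3d4
c3d4 a1b2
b2c3 a1b2
a1b2
f6a7 e5f6
e5f6
"""

roots = root_commits(SAMPLE)
print(roots)
```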
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 14:08                                                 ` Jakub Narebski
@ 2006-10-21 20:47                                                 ` Carl Worth
  2006-10-21 20:55                                                   ` Jakub Narebski
                                                                     ` (3 more replies)
  2006-10-25  9:35                                                 ` Andreas Ericsson
  2 siblings, 4 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-21 20:47 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski


On Sat, 21 Oct 2006 08:01:11 -0500, "Matthew D. Fuller" wrote:
> I think we're getting into scratched-record-mode on this.

I apologize if I've come across as beating a dead horse on this. I've
really tried to only respond where I'm still confused, or there are
explicit indications that the reader hasn't understood what I was
saying, ("I don't understand how you've come to that conclusion",
etc.). I'll be even more careful about that below, labeling paragraphs
as "I'm missing something" or "Maybe I wasn't clear".

> G: So use revids everywhere.
>
> B: Revnos are handier tools for [situation] and [situation] for
>    [reason] and [reason].

I'm missing something:

I still haven't seen strong examples for this last claim. When are
they handier? I asked a couple of messages back and two people replied
that given one revno it's trivial to compute the revno of its
parent. But that's no win over git's revision specifications,
(particularly since they provide "parent of" operators).

> > It may be that the centralization bias
>
> I think it's more accurately describable as a branch-identity bias.
> The git claim seems to be that the two statements are identical, but I
> have some trouble swallowing that.

Maybe I wasn't clear:

There's no doubt that there has been semantic confusion over the term
branch that has been confounding communication on both sides. Here's
my attempt to describe the situation, (which only became this clear
recently as I started playing with bzr more). This is not an attempt
at a complete description, but is hopefully accurate, neutral, and
sufficient for the current discussion:

  Abstract: In a distributed VCS we are using a distributed process to
  create a DAG, (nodes are associated with revisions and point to parent
  nodes). The distributed nature means that the collective DAG will have
  multiple source nodes, (often termed heads or tips).

  Git: A subset of the DAG is stored in a "repository". The DAG in the
  repository may have many source nodes. A "branch" is a named reference
  to a node (whether or not a source). Multiple local repositories may
  share storage for common objects. There are inter-repository commands
  for copying revisions and adjusting branch references, but basically
  all other operations act within a single repository.

  Bzr: A subset of the DAG is stored in a "branch". The DAG in the
  branch has a single source node. Multiple local branches may share
  storage for common objects through a "repository". Basically all
  operations (where applicable) can act between branches.

Let me know if I botched any of that.
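
To make the two models concrete, here is a small Python sketch under
the definitions above. The commit ids, branch names and helper
functions are all invented for illustration; neither tool stores
history this way.

```python
# Commit id -> list of parent ids (hypothetical ids, first parent first).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "M": ["B", "C"],  # a merge of B and C
    "D": ["A"],
}


def ancestors(dag, tip):
    """All commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag[node])
    return seen


def source_nodes(dag):
    """Commits that no other commit names as a parent (heads/tips)."""
    referenced = {p for parents in dag.values() for p in parents}
    return sorted(set(dag) - referenced)


# Git-style: one repository holds the whole DAG, which may have several
# source nodes; a branch is only a name for a node (source or not).
git_branches = {"master": "M", "topic": "D", "old": "B"}

# Bzr-style: each branch holds the sub-DAG reachable from its single tip.
bzr_branches = {name: ancestors(PARENTS, tip)
                for name, tip in git_branches.items()}

heads = source_nodes(PARENTS)
print(heads)
print(sorted(bzr_branches["master"]))
```

Note that "old" names a node that is not a source node, which the git
description permits, while every bzr-style branch above has exactly
one tip by construction.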

One concept that is really not introduced in the above is the
colloquial concept of a "branch" as a "line of development". In my
experience, this notion is a fundamentally short-lived thing. For
example, work happens on a feature branch for a while, and then it
gets merged into the mainline. After the merge, there's not that much
significance to the branch anymore. In a sense, it no longer exists
but for a few edges in the graph.

I imagine that git and bzr users both use this short-lived aspect
in practice. After merging, git users drop their branch references and
bzr users drop their directories containing their branches. Anything
else would be unwieldy as the number of merged-in, "uninteresting"
branches would grow without bound and there wouldn't be any advantage
to keeping them around.

But dropping a merged branch in bzr means throwing away the ability to
reference any of its commits by its custom, branch-specific revision
numbers. And the revision numbers _do_ change: pull, branch, and merge
all introduce revision number differences between branches, (or
changes within a branch in the case of pull). And there is no simple
way to correlate the numbers between branches.

Maybe you can argue that there isn't any centralization bias in
bzr. But anyone that claims that the revnos are stable really is
talking from a standpoint that favors centralization.

But, here's a unifying point about git and bzr. Git also allows
branch-specific, unstable names for revisions. And they're even more
unstable than the ones bzr generates. But there are some important
differences between how they are used, (both by the tool and by
people).

To illustrate, yesterday I gave an example where performing a bzr
branch from a dotted-decimal revision would rewrite the numbers from
the originating branch (1.2.2, 1.2.1, and 1) to unrelated numbers in
the new branch (3, 2, 1). I was surprised at first, and couldn't
imagine any sane reason for the tool to go off and invent new names.
It prevents a user of the new branch from referencing any commits by
their original names. It also prevents the user from communicating
with anyone with these new names, (unless the user publishes the
branch, and any parties to the communication retain the new branch for
as long as said communication might be referenced).

But then I realized why bzr is doing this. It's because bzr users
don't just use the revision numbers for external communication, but
they also use them for lots of direct interaction with the tool. The
rewriting makes it easy to write something like "bzr diff -r1..3".

And it turns out that git also allows branch specific naming for the
exact same reason. In place of 3, 2, 1 in the same situation git would
allow the names HEAD, HEAD~1, and HEAD~2 to refer to the same three
revisions. So the easy diff command would be "git diff HEAD~2 HEAD".
(And where I have HEAD here I could also use any branch name, or any
other reference to a commit as well.)

So there are two fundamentally different uses for names, (and Linus
recently talked about this at some length): 1. day-to-day working with
the tool and 2. externally communicating about specific revisions.

Both bzr and git allow for unstable, branch-specific names to be used
as a convenience in the case of the day-to-day working. Maybe the
reason some people dislike git's "ugly" names so much is that they
imagine that to compare two revisions a user of git must inspect the
logs, fish out the sha1sum for each, and then cut-and-paste to create
the command needed. I agree that if that were required, it would be
exceedingly painful. But that's not required; what the git user uses
is branch names and simple variations.

Now, there are some important differences in the unstable names that
git and bzr have. Most importantly, git's are even less stable, (with
respect to the association between a name and any specific
revision). With every commit, all of the git names effectively shift
as the branch moves, (HEAD points to the new commit, HEAD~1 points to
what HEAD previously pointed to). This is remarkably useful since it
provides stability in terms of what the user cares about, (the latest
commit and its closest ancestors). This means that "diff from
grandparent to current commit" is always "git diff HEAD~2 HEAD"
whereas in bzr it is "bzr diff -r<X-2>..<X>" and the user actually does
need to look up X first, (unless there's more to the bzr revision
specification than I've seen).
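
The two naming directions can be sketched in a few lines of Python.
This is only an illustration on a linear, made-up history: HEAD~n is
modelled as "follow the first-parent link n times", and the revno
function is a simplification that counts from the root along the
first-parent line.

```python
def nth_ancestor(parents, tip, n):
    """Follow first-parent links n times from tip (the HEAD~n idea)."""
    node = tip
    for _ in range(n):
        node = parents[node][0]
    return node


def revno(parents, tip, commit):
    """Number of commit counted from the root along tip's first-parent line."""
    line = [tip]
    while parents[line[-1]]:
        line.append(parents[line[-1]][0])
    return len(line) - line.index(commit)


# Hypothetical linear history c1 <- c2 <- c3.
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"]}
before = (nth_ancestor(parents, "c3", 2), revno(parents, "c3", "c1"))

# One more commit moves the tip.
parents["c4"] = ["c3"]
after = (nth_ancestor(parents, "c4", 2), revno(parents, "c4", "c1"))

print(before)  # ('c1', 1)
print(after)   # ('c2', 1)
```

The tip-relative name now means a different commit, while the
root-relative number for c1 is unchanged within this one branch.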

Finally, since these branch-specific names are changing all the time,
there's never any temptation for people to attempt to use them for
external communication. In contrast, by being numbered in the opposite
direction, bzr revision numbers give a false appearance of stability
and people _do_ use them for communication. This is the mistake we've
been warning bzr users about in this thread.

Also, since the git names are so predictable, git almost never emits
them. It accepts them as names just fine, but it doesn't generate
them, (log, and commit never show the branch-specific names). I think
the only git command that even can emit such a name is the recently
added git-name-rev, which exists solely for the purpose of mapping a
commit identifier to a local, branch-specific name which might have
more intuitive meaning for the user.

So the fact that things like git-log don't print these names also
helps avoid any trap of users trying to communicate with something
unstable.

-Carl



* Re: VCS comparison table
  2006-10-21 20:05                                               ` Aaron Bentley
@ 2006-10-21 20:48                                                 ` Jakub Narebski
  2006-10-21 22:52                                                   ` Edgar Toernig
  2006-10-21 23:39                                                   ` Aaron Bentley
       [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
  2006-10-22  7:45                                                 ` Jan Hudec
  2 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 20:48 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Carl Worth wrote:

>> I'm noticing another terminology conflict here. The notion of "branch"
>> in bzr is obviously very different than in git. For example the bzr
>> man page has a sentence beginning with "if there is already a branch
>> at the location but it has no working tree". I'm still not sure
>> exactly what a bzr branch is, but it's clearly something different
>> from a git branch, (which is absolutely nothing more than a name
>> referencing a particular commit object).
> 
> I got the impression there was also a local ordering of revisions.  Is
> that wrong?

No, there is no such thing as a local ordering of revisions.

Each revision (commit) has links to its parent(s). A branch is
technically just a reference to a particular commit object. The commit
itself gives us a sub-DAG of the DAG of the whole history: the DAG of
all ancestors of said commit. That lineage of the commit pointed to by
a branch is conceptually the branch; i.e. a branch is a DAG of
development (not a line of development, as the first parent has no
special meaning).

You can also have (in a git repository) a reflog, which records the
successive values of the branch-as-reference, i.e. the tip of the
branch-as-named-lineage. But, for example, a fetch that fast-forwards
by 5 commits is recorded as a single event, a single change in the
reflog.
 
> A Bazaar branch is a directory inside a repository that contains:
>  - a name referencing a particular revision
>  - (optional) the location of the default branch to pull/merge from
>  - (optional) the location of the default branch to push to
>  - (optional) the policy for GPG signing
>  - (optional) an alternate committer-id to use for this branch
>  - (optional) a nickname for the branch
>  - other configuration options

Erm, wasn't the revno to revid mapping also part of a bzr "branch"?

In git we store configuration per repository, not per branch, although
there is some branch-specific configuration.

[...]
> This layout is an imitation of Git, as I understand it:
> Repository:
> ~/repo
> 
> Branches:
> ~/repo/origin
> ~/repo/master
> 
> Workingtree:
> ~/repo

Workingtree:
~/

if I understand notation correctly.

>> One, it intrudes on my branch namespace, (note that
>> in many commands above I have to use things like "../b" where I'd like
>> to just name my branch "b".
> 
> While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
> http://bazaar-vcs.org/bzr/bzr.dev" is a big win.

Gaah, it's even more inconvenient. Certainly more than using name
of branch itself, like in git.
 
>> Two, it prevents bzr from having any
>> notion of "all branches" in places where git takes advantage of it,
>> (such as git-clone and "gitk --all").
> 
> No, it doesn't.  Bazaar can easily list all the branches in a
> repository, just by starting with the repository root, and recursing
> through all the subdirectories, looking for branches.

Is there a command to list all branches in bzr? Is there a command
to copy (clone in SCM jargon) whole repository with all branches?
 
> That said, we do have mentality that branches, not repositories, are
> what's important to users in day-to-day use.

That's the opposite of the git view. In git, the working area is
associated with a repository (a clone of a repository), not a branch.
We copy whole repositories (sometimes only part of a repository), not
branches.

>>> What's nice is being able see the revno 753 and knowing that "diff -r
>>> 752..753" will show the changes it introduced.  Checking the revo on a
>>> branch mirror and knowing how out-of-date it is.
>>
>> With git I get to see a revision number of b62710d4 and know that
>> "diff b62710d4^ b62710d4" will show its changes, though much more
>> likely just "show b62710d4". I really cannot fathom a place where
>> arithmetic on revision numbers does something useful that git revision
>> specifications don't do just as easily. Anybody have an example for
>> me?
> 
> My understanding is that ^ is treated as a special metacharacter by some
> shells, which is why bzr revision specs are more long-winded.

Which shells? As I understand it, '^' was chosen (for example as the
NOT operator for specifying a sub-DAG, instead of '!') because it
causes no problems with shell expansion. And considering that many git
commands are/were written in shell, one would certainly have noticed.

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
       [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
@ 2006-10-21 20:53                                                   ` Sean
  2006-10-21 21:10                                                     ` Linus Torvalds
  2006-10-21 20:53                                                   ` Sean
  1 sibling, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-21 20:53 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

On Sat, 21 Oct 2006 16:05:18 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> Our experience really is that it does work.

Of course it works as long as you accept the implicit requirements of
supporting them and ignore the cases where they change out from
underneath the user.  But as soon as users want to embrace distributed
models where there isn't a central shared repo, at best revnos are
unhelpful and at worst they are counterproductive.  The proof of this
is that if revnos were sufficient bzr wouldn't need revids.

Since the utility provided by revnos seems so minimal even in the
case where they do work, Git simply doesn't bother with them.  And
"our" experience is that Git really does work well without them.

Sean


* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
@ 2006-10-21 20:55                                                   ` Jakub Narebski
  2006-10-21 23:07                                                   ` Jeff Licquia
                                                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 20:55 UTC (permalink / raw)
  To: Carl Worth
  Cc: Matthew D. Fuller, Aaron Bentley, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git

Carl Worth wrote:

> Also, since the git names are so predictable, git almost never emits
> them. It accepts them as names just fine, but it doesn't generate
> them, (log, and commit never show the branch-specific names). I think
> the only git command that even can emit such a name was a recently
> added git-name-rev which exists solely for the purpose of mapping a
> commit identifier to a local, branch-specific name which might have
> more intuitive meaning for the user.

git-show-branch also shows git-name-rev-like names.

BTW, git-show-branch has a somewhat strange UI, different from other
git commands. You can think of it as a text version of the gitk/qgit
history viewer (although you can use tig for a CLI (ncurses) graph).
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-21 16:19                     ` Erik Bågfors
  2006-10-21 16:31                       ` Jakub Narebski
       [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
@ 2006-10-21 21:04                       ` Linus Torvalds
  2006-10-21 23:58                         ` Linus Torvalds
                                           ` (2 more replies)
  2 siblings, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21 21:04 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy, Jakub Narebski




On Sat, 21 Oct 2006, Erik Bågfors wrote:
> 
> bzr is a fully decentralized VCS. I've read this thread for quite some
> time now and I really cannot understand why people come to this
> conclusion.

Even the bzr people agree, so what's not to understand?

The revision numbers are totally unstable in a distributed environment 
_unless_ you use a certain work-flow. And that work-flow is definitely not 
"distributed"; it's much closer to "disconnected centralized".

Now, you could be truly distributed: BK used the same revision numbering 
thing, but was distributed. But BK didn't even try to claim that their 
revision numbers were "simple" and that fast-forwarding is sometimes the 
wrong thing to do.

So BK always fast-forwarded, and the revision numbers were just randomly 
changing numbers. They weren't stable, they weren't simple, and nobody 
claimed they were.

So bzr can bite the bullet and say: "revision numbers are changing and 
meaningless, and we should just fast-forward on merges", or you should 
just admit that bzr is really more about "disconnected operation" than 
truly distributed.

You can't have your cake and eat it too. Truly distributed _cannot_ be 
done with a stable dotted numbering scheme (unless the "dotted numbering 
scheme" is just a way to show a hash like git does - so the numbering has 
no _sequential_ meaning).

Btw, this isn't just an "opinion". This is a _fact_. It's something they 
teach in any good introductory course to distributed algorithms. Usually 
it's talked about in the context of "global clock". 

Anybody who thinks that there exists a globally ticking clock in the 
system (and stably increasing dotted numbers are just one such thing) is 
talking about some fantasy-world that doesn't exist, or a world that has 
nothing to do with "distributed".
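
A toy illustration of the point, as a hedged Python sketch: two clones
that commit independently hand out the same sequential number to
different commits, while a name derived from the commit's own data
needs no coordination. The Clone class and commit_id function are
invented for this example, and the hashing is deliberately simplified;
it is not git's real object format.

```python
import hashlib


def commit_id(parent_ids, content):
    """Name a commit by hashing its own data (simplified, not git's format)."""
    data = " ".join(parent_ids) + "\n" + content
    return hashlib.sha1(data.encode()).hexdigest()[:8]


class Clone:
    """A toy repository with a purely linear history."""

    def __init__(self, history=None):
        self.history = list(history or [])  # list of (id, content) pairs

    def commit(self, content):
        parent = [self.history[-1][0]] if self.history else []
        cid = commit_id(parent, content)
        self.history.append((cid, content))
        # Return both kinds of name: a local counter and the hash.
        return len(self.history), cid


alice = Clone()
alice.commit("base")
bob = Clone(alice.history)  # Bob clones, then both work offline.

a_no, a_id = alice.commit("alice's change")
b_no, b_id = bob.commit("bob's change")

print(a_no, b_no)    # both clones call their new commit number 2
print(a_id == b_id)  # the hash names differ
```

Making the counters agree would require the two clones to consult some
shared authority before committing, which is exactly the global clock.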

			Linus


* Re: VCS comparison table
  2006-10-21 20:53                                                   ` Sean
@ 2006-10-21 21:10                                                     ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21 21:10 UTC (permalink / raw)
  To: Sean
  Cc: Aaron Bentley, Carl Worth, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git



On Sat, 21 Oct 2006, Sean wrote:
> 
> Since the utility provided by revno's seems so minimal even in the
> case where they do work, Git simply doesn't bother with them.  And
> "our" experience is that Git really does work well without them.

Yes. This really is what it boils down to.

The _only_ time you actually use revision numbers (as opposed to 
branch-names or tag-names) is when you want a _stable_ number.

It's that simple. You never really need a revision number otherwise. In 
other situations, you do things like 

	git log --since=2.days.ago
	gitk v2.6.18..
	git diff --stat --summary ORIG_HEAD.. 

or whatever. It's clearly not "stable", but it's also clearly not a 
revision number from a UI perspective.

When you want a revision number is _exactly_ when you're moving things 
between branches, or reporting a bug to somebody else, or similar. And 
that's also _exactly_ when you want the number to be stable and meaningful 
(ie the other end should be able to rely on the number).

And if you need to refer to a central repository to do that, it's clearly not 
distributed. Not needing such a central reference point is what the word 
"distributed" _means_ in computer science for chrissake!

			Linus


* Re: VCS comparison table
  2006-10-21 19:19                                                     ` Jeff King
  2006-10-21 19:30                                                       ` Jakub Narebski
@ 2006-10-21 21:46                                                       ` Matthew D. Fuller
       [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
  2006-10-21 22:25                                                         ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 21:46 UTC (permalink / raw)
  To: Jeff King
  Cc: Jakub Narebski, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On Sat, Oct 21, 2006 at 03:19:49PM -0400 I heard the voice of
Jeff King, and lo! it spake thus:
> 
> I think the concept of "my" branch doesn't make any sense in git.
> [...]
> So don't think of it as "git throws away branch identity" as much as
> "git never cared about branch identity in the first place, and
> doesn't think it's relevant."

This is as I understand it.


But in my mind, it does make sense.  I fundamentally DO think of "my
commits" differently from "revisions I've merged", and I want the tool
to preserve that for me.  "My commits" tend to be steps along a path,
"merges" tend to be completed paths.  I usually use bzr's "log
--short" for looking at logs, which doesn't show merged revs at all.
That works, because most of the time I don't care about them; I know
if I merged something, it's a completed piece, which I described in
the log message; it's not a PART of a task like my commits usually
are.  So, just the message for my merge rev tells me what I need to
know, and if I need to drill down into it, I can use the regular
(--long) log output to look at the revisions in it.  This lets me know,
for instance, that if I want to re-check something I did 3 commits
ago, and I just merged another branch, the commit I'm interested in is
the 4th commit back on the mainline; I don't need to grub through a
bunch of revisions that aren't mine to try and find it.

So, if me and Bob are working on different bits of the same project in
parallel, finish up, and merge back and forth to sync up (ignoring for
the moment the "empty merge commit" bit), even though we now both have
the 'same' stuff, we have the same head rev with all the same parents,
the parents are in a different order, and my 'mainline' (the path of
left-most parents, or 'first' as I understand git calls them) is
different than his; my mainline is my commits, his mainline is his.
If one of us were to 'pull' the other, the puller's branch would become
a duplicate of the other's and so adopt the other's 'mainline', which we
want to avoid because then it doesn't fit the mental model of "what I
did", which is what I think of my branch as.
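
A small Python sketch of that "mainline" idea, with made-up commit ids.
It assumes only what is described above: the mainline is the path of
left-most (first) parents, and each of us records our own previous tip
as the first parent when merging the other.

```python
def mainline(parents, tip):
    """Path of first (left-most) parents from tip back to the root."""
    path = [tip]
    while parents[path[-1]]:
        path.append(parents[path[-1]][0])
    return path


# Shared history: root R, my commits M1 and M2, Bob's commits B1 and B2.
base = {"R": [], "M1": ["R"], "M2": ["M1"], "B1": ["R"], "B2": ["B1"]}

# I merge Bob's work: my tip comes first.  Bob merges mine: his comes first.
mine = dict(base, X=["M2", "B2"])
bobs = dict(base, Y=["B2", "M2"])

my_line = mainline(mine, "X")
bobs_line = mainline(bobs, "Y")

print(my_line)    # ['X', 'M2', 'M1', 'R']
print(bobs_line)  # ['Y', 'B2', 'B1', 'R']
```

Both merges contain the same five earlier commits; only the order of
the parents, and therefore the mainline, differs.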


Obviously, this is a totally foreign mentality to git, and that's
great because it seems to work for you.  I can see advantages to it,
and I can conceive of situations where I might want that behavior.
But, in my day-to-day VCS use, I don't hit them, which is why I keep
typing 'bzr' instead of 'git' when I annoyingly need to type 'cvs'.


> The difference, I think, is that it's easier in git to move the
> upstream around: you simply start fetching from a different place.
> I'm not clear on how that works in bzr (if it invalidates revnos or
> has other side effects).

Depends on what you're fetching.  You can always tell 'bzr pull' a new
URL to look from.  If it's a later version of the 'same' branch, it'll
just update.  If it's a 'different' branch (a branch that's a superset
of your current branch/set-of-revisions, but with a different
'mainline' path through the revisions counts as 'different' here),
pull will complain and require a --overwrite to do the deed.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
@ 2006-10-21 22:06                                                           ` Sean
  0 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-21 22:06 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Jeff King, Jakub Narebski, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On Sat, 21 Oct 2006 16:46:29 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> Obviously, this is a totally foreign mentality to git, and that's
> great because it seems to work for you.  I can see advantages to it,
> and I can conceive of situations where I might want that behavior.
> But, in my day-to-day VCS use, I don't hit them, which is why I keep
> typing 'bzr' instead of 'git' when I annoyingly need to type 'cvs'.

It's not completely foreign; it's one of the things you can use the
git reflog feature to record.  It's just that it's utterly clear in
Git that this is a local feature and is never replicated as part
of the distributed data.

> Depends on what you're fetching.  You can always tell 'bzr pull' a new
> URL to look from.  If it's a later version of the 'same' branch, it'll
> just update.  If it's a 'different' branch (a branch that's a superset
> of your current branch/set-of-revisions, but with a different
> 'mainline' path through the revisions counts as 'different' here),
> pull will complain and require a --overwrite to do the deed.

This is where the git model is clearly superior and allows a true
distributed model.  Because there is no concept of a "mainline"
(except locally via reflog) you can always merge with anyone
participating in the DAG without having to overwrite or lose ordering.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 21:46                                                       ` Matthew D. Fuller
       [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
@ 2006-10-21 22:25                                                         ` Jakub Narebski
  2006-10-21 23:42                                                           ` Jeff Licquia
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-21 22:25 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Jeff King, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Matthew D. Fuller wrote:

[cut]
> Obviously, this is a totally foreign mentality to git, and that's
> great because it seems to work for you.  I can see advantages to it,
> and I can conceive of situations where I might want that behavior.
> But, in my day-to-day VCS use, I don't hit them, which is why I keep
> typing 'bzr' instead of 'git' when I annoyingly need to type 'cvs'.

Well, not exactly. If you are interested in your changes, i.e. commits
generated by you, you can (with new git) filter commits by author name,
e.g. 'git log --author="$(git repo-config --get user.email)"'. If you
are interested in commits which you entered into the repository, you can
(with new git) filter commits by committer.

If you are interested in the history of your branch, you can enable the
reflog for this branch. This is of course totally local information, and
doesn't get propagated. It records things like commits, merges,
rebasing, starting a branch anew, amending commits etc. Because it
is separate from the branch and the DAG of revisions, we can do a
fast-forward and have an identical DAG while keeping information about
local history.
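
A rough Python sketch of that separation, for illustration only: the
Branch class, its move() method and the commit names are all invented
here, and the real reflog is a file with its own format. The point is
just that the tip is shared state while the log of how it got there is
local.

```python
class Branch:
    """A branch tip plus a purely local log of how the tip moved."""

    def __init__(self, tip):
        self.tip = tip
        self.reflog = []               # local only; never sent to other repositories

    def move(self, new_tip, reason):
        # Record the old and new tip, then update the tip itself.
        self.reflog.append((self.tip, new_tip, reason))
        self.tip = new_tip


mine = Branch("c1")
mine.move("c2", "commit")
mine.move("c7", "fast-forward")        # several new commits, a single log entry

theirs = Branch("c7")                  # someone who simply started at c7

print(mine.tip == theirs.tip)          # True
print(len(mine.reflog), len(theirs.reflog))  # 2 0
```

Two repositories can agree completely on the tip (and hence on the DAG
behind it) while having entirely different local logs.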

Besides, for more complicated cases git users are used to turning to
graphical history viewers, including gitk (Tcl/Tk, in git repository),
qgit (Qt), gitview (GTK+, in contrib/, less popular), git-show-branch
(core git, strange UI, command line), and tig (ncurses).


I wonder if searching for one's own commits isn't a sign that
the project is of one-main-developer size (i.e. a small project,
without a large number of distributed contributors). I think in a large
project you would rather ask about the history of a specific file or a
specific part of the project (a specific directory), ask why a certain
change was introduced, etc.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:48                                                 ` Jakub Narebski
@ 2006-10-21 22:52                                                   ` Edgar Toernig
  2006-10-21 23:39                                                   ` Aaron Bentley
  1 sibling, 0 replies; 1752+ messages in thread
From: Edgar Toernig @ 2006-10-21 22:52 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

Jakub Narebski wrote:
>
> > My understanding is that ^ is treated as a special metacharacter by some
> > shells, which is why bzr revision specs are more long-winded.
> 
> Which shells?

In the traditional Bourne shell ^ is an alias for the pipe symbol |.

Ciao, ET.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
  2006-10-21 20:55                                                   ` Jakub Narebski
@ 2006-10-21 23:07                                                   ` Jeff Licquia
       [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
  2006-10-22 12:46                                                   ` Matthew D. Fuller
  2006-10-22 19:36                                                   ` David Clymer
  3 siblings, 1 reply; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-21 23:07 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 13:47 -0700, Carl Worth wrote:
> I still haven't seen strong examples for this last claim. When are
> they handier? I asked a couple of messages back and two people replied
> that given one revno it's trivial to compute the revno of its
> parent. But that's no win over git's revision specifications,
> (particularly since they provide "parent of" operators).

Having used both (though my familiarity with git is less), in my opinion
the biggest win is the obvious one: sequential numbers work in the head
better than SHA1 checksums.

"But it's not a problem in practice!" is a good retort, except that I
wonder whether the set of "practices" you're using includes anyone who
decided to pass on git in favor of something else--perhaps because they
saw a few SHAs float by and ran in terror.  Beware of self-selection
bias.

Put another way, "strength" of example is often in the eye of the
beholder.  That we continue to give you the same "weak" examples may be
evidence that we have a different impression of their strengths, and
that your analysis of their strengths isn't convincing to us.

I suppose this line of conversation still has value if you don't see any
benefit at all, but OTOH if you really don't see how sequential numbers
are easier to work with in the head than SHA sums with modifiers, I'm
not sure that's a gap we can bridge.

> Let me know if I botched any of that.

I don't see any problems with it.

> But dropping a merged branch in bzr means throwing away the ability to
> reference any of its commits by its custom, branch-specific revision
> numbers. And the revision numbers _do_ change, pull, branch, and merge
> all introduce revision number differences between branches, (or
> changes within a branch in the case of pull). And there is no simple
> way to correlate the numbers between branches.
> 
> Maybe you can argue that there isn't any centralization bias in
> bzr. But anyone that claims that the revnos. are stable really is
> talking from a standpoint that favors centralization.

I wonder if part of the problem is that the revno scheme we've been
talking about (the x.y.z... format) doesn't technically exist in any
released version of bzr that I know of.

Previous to 0.12, bzr revnos were absolutely a local thing; revisions
from merges didn't even have revnos (except for the merge commit
itself).  If you merged a branch and you later wanted to recreate that
branch, or see a diff from that branch, etc., you had to use revids.

So when you talk of a "centralization bias" in bzr, a lot of us get
confused, defensive, etc., because from our perspective, bzr and git
weren't all that much different until just recently.

Now it may be that you're right that "global" revnos like bzr has now
introduce a bias in favor of centralization.  If that's true, I'm not
sure that totally vindicates the git model.  We have to ask if the bias
is a good thing, but so do you; after all, we may have done so because
of user demand, and if our users want it, maybe yours will want it too
someday.

(I say "may" because I haven't been paying close attention to the new
revno conversation, so I don't want to sound more sure than I am.)

But I think bzr people are more willing to take a wait-and-see approach.
Local revnos weren't a big deal, so we're willing to bet that the new
0.12 revnos won't be, either.

> And it turns out that git also allows branch specific naming for the
> exact same reason. In place of 3, 2, 1 in the same situation git would
> allow the names HEAD, HEAD~1, and HEAD~2 to refer to the same three
> revisions. So the easy diff command would be "git diff HEAD~2 HEAD".
> (And where I have HEAD here I could also use any branch name, or any
> other reference to a commit as well.)

FYI: The strict analogy to HEAD~1 in bzr would be -2.  And yes, -2 is
every bit as unstable as HEAD~1.
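
A small Python sketch of the naming schemes being compared, under the
assumption that a branch's mainline can be modelled as a plain list,
oldest revision first. The three helper functions are hypothetical
stand-ins, not real git or bzr APIs.

```python
def git_relative(mainline, n):
    """Toy HEAD~n: n steps back from the tip."""
    return mainline[-1 - n]


def bzr_negative(mainline, k):
    """Toy bzr -k: k-th revision counting back from the tip (-1 is the tip)."""
    return mainline[-k]


def bzr_revno(mainline, n):
    """Toy positive revno: n-th revision counting from the start."""
    return mainline[n - 1]


history = ["c1", "c2", "c3"]
before = (git_relative(history, 1), bzr_negative(history, 2), bzr_revno(history, 2))

history.append("c4")                   # one more commit on the same branch
after = (git_relative(history, 1), bzr_negative(history, 2), bzr_revno(history, 2))

print(before)  # ('c2', 'c2', 'c2')
print(after)   # ('c3', 'c3', 'c2')
```

In this toy model HEAD~1 and -2 move together with every commit, while
the positive revno stays put only as long as the list is appended to and
never reordered.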

> Finally, since these branch-specific names are changing all the time,
> there's never any temptation for people to attempt to use them for
> external communication. In contrast, by being numbered in the opposite
> direction, bzr revision numbers give a false appearance of stability
> and people _do_ use them for communication. This is the mistake we've
> been warning bzr users about in this thread.

URLs are also used for communication, despite having many of the same
drawbacks as revnos in DVC systems.  This could have been a fatal flaw,
but in reality, this has resulted in some best practices ("permalinks",
for example), and a sense of where a URL is appropriate and where it
isn't.  It's not perfect, and yet it's been wildly successful.

Copying the flaws of a highly successful system does not guarantee
success, of course.  On the other hand, it does influence our evaluation
of the severity of the flaws.

There may be a danger, though, that the bzr community may want to pay
closer attention to.

Several of us have pointed to the (branch, revno) combination as a
sufficiently reliable communication method, and we may be right about
that.  But, so far, those revnos have been entirely local to a single
branch, and have also been as absolutely reliable (locally speaking) as
a revid; the branch "foo" may go away, but while it's around, "revision
14 of branch foo" will always mean the same thing.  But we're now adding
the 0.12 revno scheme, with "global" revnos.  Will those be as reliable?
Will "revision 2418.1.4 on bzr.dev" work as well as "revision 2418 on
bzr.dev" does now?

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
  2006-10-21 23:25                                                       ` Sean
@ 2006-10-21 23:25                                                       ` Sean
  2006-10-22  0:46                                                       ` Jeff Licquia
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-21 23:25 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Carl Worth, bazaar-ng, git

On Sat, 21 Oct 2006 19:07:10 -0400
Jeff Licquia <jeff@licquia.org> wrote:

> Several of us have pointed to the (branch, revno) combination as a
> sufficiently reliable communication method, and we may be right about
> that.  But, so far, those revnos have been entirely local to a single
> branch, and have also been as absolutely reliable (locally speaking) as
> a revid; the branch "foo" may go away, but while it's around, "revision
> 14 of branch foo" will always mean the same thing.  But we're now adding
> the 0.12 revno scheme, with "global" revnos.  Will those be as reliable?
> Will "revision 2418.1.4 on bzr.dev" work as well as "revision 2418 on
> bzr.dev" does now?

There is no need to speculate: the numbers will only be reliable on a local
basis.  So yes, you can force a single repository like bzr.dev to always "win"
any conflict and force the other guy to change, i.e. a central repo model.
But they cannot be maintained consistently in a truly distributed
system.  As Linus pointed out, that is fact, not opinion.

Now the opinion of the bzr people is that it doesn't matter and that for
all important cases it works well enough.  If all the people who don't like
the look of SHA1s self-select bzr, so be it, but that doesn't change the
fundamental argument.

But just to reiterate, the design of Git is flexible enough that you
can automatically generate "revno" tags for every commit in your repo
_today_.  You'd end up with the exact same problems that bzr will
eventually hit, but Git already has everything you need today to refer
to every commit in your repo as r1 r2 r3 r4 etc.
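
As a sketch of why such numbers only hold locally, here is a toy Python
model (not a recipe for creating real tags). It assumes each repository
stores its history as a dict from revision to ordered parents and numbers
only its own first-parent path; number_mainline() and all revision names
are hypothetical.

```python
def number_mainline(parents, head):
    """Assign r1, r2, ... along the first-parent path, oldest first."""
    path = []
    node = head
    while node is not None:
        path.append(node)
        ps = parents[node]
        node = ps[0] if ps else None
    path.reverse()
    return {f"r{i}": rev for i, rev in enumerate(path, start=1)}


base = {"root": [], "a1": ["root"], "b1": ["root"], "b2": ["b1"]}
alice = dict(base, ma=["a1", "b2"])    # Alice merged Bob's work into hers
bob = dict(base, mb=["b2", "a1"])      # Bob merged Alice's work into his

alice_names = number_mainline(alice, "ma")
bob_names = number_mainline(bob, "mb")

print(alice_names)  # {'r1': 'root', 'r2': 'a1', 'r3': 'ma'}
print(bob_names)    # {'r1': 'root', 'r2': 'b1', 'r3': 'b2', 'r4': 'mb'}
```

Here "r2" names a different commit in each repository, so the names agree
only if everyone defers to one repository's numbering.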

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:48                                                 ` Jakub Narebski
  2006-10-21 22:52                                                   ` Edgar Toernig
@ 2006-10-21 23:39                                                   ` Aaron Bentley
  2006-10-22  0:04                                                     ` Carl Worth
  2006-10-22  0:14                                                     ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-21 23:39 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>> Carl Worth wrote:
> 
> No, there is no such thing like local ordering of revisions.

> You can have (in git repository) also reflog, which records values
> of branch-as-reference, or branch tip of branch-as-named-lineage.
> But for example fetch and fast-forward 5 commits in history is
> recorded as single event, single change in reflog.

That must be what I was thinking of.

>> A Bazaar branch is a directory inside a repository that contains:
>>  - a name referencing a particular revision
>>  - (optional) the location of the default branch to pull/merge from
>>  - (optional) the location of the default branch to push to
>>  - (optional) the policy for GPG signing
>>  - (optional) an alternate committer-id to use for this branch
>>  - (optional) a nickname for the branch
>>  - other configuration options
> Erm, wasn't revno to revid mapping also part of bzr "branch"?

It's not part of the conceptual model.  The revno-to-revid mapping is
done using the DAG.  The branch just tracks the head.

The .bzr/branch/revision-history file is from an earlier model in which
branches had a local ordering.  Nowadays, it can be treated as:
 - a reference to the head revision
 - a cache of the revno-to-revid mapping

>> This layout is an imitation of Git, as I understand it:
>> Repository:
>> ~/repo
>>
>> Branches:
>> ~/repo/origin
>> ~/repo/master
>>
>> Workingtree:
>> ~/repo
> 
> Workingtree:
> ~/
> 
> if I understand notation correctly.

The notation was that ~/repo would contain the .git directory for the
repository.

>> While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
>> http://bazaar-vcs.org/bzr/bzr.dev" is a big win.
> 
> Gaah, it's even more inconvenient. Certainly more than using name
> of branch itself, like in git.

Of course if you have a copy of bzr.dev on your computer, you don't need
to type the full URL.  It's just like the 'merge ../b' above.

But how can you use the branch name of a branch that isn't on your
computer?  I suspect git requires a separate 'clone' step to get it onto
your computer first.

> Is there a command to list all branches in bzr?

There's one in the 'bzrtools' plugin.

> Is there a command
> to copy (clone in SCM jargon) whole repository with all branches?

No.

>> My understanding is that ^ is treated as a special metacharacter by some
>> shells, which is why bzr revision specs are more long-winded.
> 
> Which shells? If I understand it '^' was chosen (for example as
> NOT operator for specify sub-DAG instead of '!') because of no problems
> for shell expansion. And considering that many git commands are/were
> written in shell, one certainly would notice that.

Sorry, it's been quite a long time since people complained at me for
using ^, so I don't remember.  Perhaps Edgar is right about it being the
pipe character in old shells.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD4DBQFFOq+80F+nu1YWqI0RAp/KAJ9Bw1q9/nd3gUAjcX3c+24aoEifeQCYlbD0
tUZ01ra11vkQ7V3RzarXeg==
=oFIC
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 22:25                                                         ` Jakub Narebski
@ 2006-10-21 23:42                                                           ` Jeff Licquia
  2006-10-21 23:49                                                             ` Carl Worth
  0 siblings, 1 reply; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-21 23:42 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Sun, 2006-10-22 at 00:25 +0200, Jakub Narebski wrote:
> I wonder if searching for one's own commits isn't the sign that
> the project is of one-main-developer size (i.e. small project,
> without large number of distributed contributors). I think in large 
> project you rather ask of history of specified file, of specified part 
> of project (specified directory), ask about why certain change was 
> introduced etc.

I don't think so.  Recently, I've been trying to track a particular
patch in the kernel.  It was done as a series of commits, and probably
would have been its own branch in bzr, but when I was trying to group
the commits together to analyze them as a group, the easiest way to do
that was by the original committer's name.

Now, there's probably a better way to hunt that stuff down, but in this
case hunting the user down worked for me.  (It may have made a
difference that I was using gitweb instead of a local clone.)

And the case of hunting down your own commits is just a degenerate case
of hunting down someone else's.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:42                                                           ` Jeff Licquia
@ 2006-10-21 23:49                                                             ` Carl Worth
  2006-10-22  0:07                                                               ` Jeff Licquia
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-21 23:49 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Jakub Narebski, bazaar-ng, git

On Sat, 21 Oct 2006 19:42:47 -0400, Jeff Licquia wrote:
> I don't think so.  Recently, I've been trying to track a particular
> patch in the kernel.  It was done as a series of commits, and probably
> would have been its own branch in bzr, but when I was trying to group
> the commits together to analyze them as a group, the easiest way to do
> that was by the original committer's name.

As for "its own branch in bzr", would such a branch remain available
indefinitely even after being merged into the main tree?

> Now, there's probably a better way to hunt that stuff down, but in this
> case hunting the user down worked for me.  (It may have made a
> difference that I was using gitweb instead of a local clone.)

Vast, huge, gaping, cosmic difference.

Almost none of the power of git is exposed by gitweb. It's really not
worth comparing. (Now a gitweb-alike that provided all the kinds of
very easy browsing and filtering of the history like gitk and git
might be nice to have.)

-Carl

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 21:04                       ` Linus Torvalds
@ 2006-10-21 23:58                         ` Linus Torvalds
  2006-10-22  0:13                           ` Erik Bågfors
  2006-10-22  0:09                         ` Erik Bågfors
  2006-10-27  4:51                         ` Jan Hudec
  2 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-21 23:58 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Matthieu Moy, bazaar-ng, Sean, Jan Hudec, git, Jakub Narebski



On Sat, 21 Oct 2006, Linus Torvalds wrote:
> 
> And that work-flow is definitely not "distributed" it's much closer to 
> "disconnected centralized".

Side note: the only reason I think that distinction is worth making at all 
is when comparing git to bzr, and even then this is a fairly subtle 
distinction, and probably not a huge deal in practice.

I obviously think git is a nicer distributed design, but in the end, if 
you compare to something like CVS or SVN that isn't even disconnected, the 
difference between git and bzr in this sense is basically zero. 

So I sound like I care, but at the same time, I realize very well that 
when coming from a totally centralized world, the details we're arguing 
are _so_ not important.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:39                                                   ` Aaron Bentley
@ 2006-10-22  0:04                                                     ` Carl Worth
  2006-10-22  0:14                                                     ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-22  0:04 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

On Sat, 21 Oct 2006 19:39:41 -0400, Aaron Bentley wrote:
> Of course if you have a copy of bzr.dev on your computer, you don't need
> to type the full URL.  It's just like the 'merge ../b' above.
>
> But how can you use the branch name of a branch that isn't on your
> computer?  I suspect git requires a separate 'clone' step to get it onto
> your computer first.

No. You can merge a branch from a remote repository in a single step:

	git pull http://example.com/git/repo branch-of-interest

But if you want to do something besides (or before) a merge, (for
example, just explore its history, do some diffs etc.) then you would
fetch it instead, assigning it a local branch name in the process:

	git fetch http://example.com/git/repo branch-of-interest:local-name

After which "local-name" is all one would need to use. So after a
fetch like the above, the equivalent of "bzr missing --theirs-only"
would be:

	git log ..local-name

[This shows some of the expressive power of git revision
specifications. There's no need for a separate "missing" command. It's
just one case of viewing a particular subset of the DAG. And the
specification language makes almost all interesting subsets easy. The
--mine-only specification would be "local-name.."]
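
For readers who want the range notation spelled out, here is a minimal
Python sketch of "a..b" as a set difference: everything reachable from b
that is not reachable from a. It is a toy model over a dict of parents,
with invented revision names; an omitted side of the range is taken to
mean the current HEAD.

```python
def ancestors(parents, rev):
    """All revisions reachable from rev, including rev itself."""
    seen = set()
    stack = [rev]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def rev_range(parents, a, b):
    """Toy 'a..b': reachable from b but not from a."""
    return ancestors(parents, b) - ancestors(parents, a)


# Two lines of work that forked at root: m1, m2 are mine; t1, t2 were fetched.
parents = {"root": [], "m1": ["root"], "m2": ["m1"], "t1": ["root"], "t2": ["t1"]}
head, local_name = "m2", "t2"

theirs_only = rev_range(parents, head, local_name)   # like "..local-name"
mine_only = rev_range(parents, local_name, head)     # like "local-name.."

print(sorted(theirs_only))  # ['t1', 't2']
print(sorted(mine_only))    # ['m1', 'm2']
```

Swapping the two ends of the range is all it takes to go from
"theirs only" to "mine only", which is why no separate command is needed.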

And beyond what bzr missing does (I believe) it's easy to also see the
patch content of each commit with:

	git log -p ..local-name

And then if everything is happy, one could merge that branch in:

	git pull . local-name

(And, yes, it is the case that "pull" with a repository URL of "." is
how merging is done. It's bizarre to me that this is not "git merge
local-name" instead. There actually _is_ a "git merge" command that
could be used here, but it is somewhat awkward to use, (requiring both
a commit message (without the -m of git-commit(!)) and an explicit
mention of the current branch). So using it would be something like:

	git merge "merge of local-name" HEAD local-name

I've never claimed that git is completely free of its UI
warts---though there are fewer now than when I started using it.)

But, yes, the notion in git is to bring things into the current
repository and then work with them locally. This has an advantage that
network traffic is spent only once if doing multiple operations, (say
the three steps shown above: 1) investigate commit messages, 2)
investigate patch content, 3) perform the merge).

-Carl

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:49                                                             ` Carl Worth
@ 2006-10-22  0:07                                                               ` Jeff Licquia
  2006-10-22  0:47                                                                 ` Linus Torvalds
  2006-10-22 16:02                                                               ` Petr Baudis
  2006-10-25  9:52                                                               ` Andreas Ericsson
  2 siblings, 1 reply; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-22  0:07 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 16:49 -0700, Carl Worth wrote:
> On Sat, 21 Oct 2006 19:42:47 -0400, Jeff Licquia wrote:
> > I don't think so.  Recently, I've been trying to track a particular
> > patch in the kernel.  It was done as a series of commits, and probably
> > would have been its own branch in bzr, but when I was trying to group
> > the commits together to analyze them as a group, the easiest way to do
> > that was by the original committer's name.
> 
> As far as "its own branch in bzr" would such a branch remain available
> indefinitely even after being merged in to the main tree?

Yes, in the sense that you can recreate the branch by using that
branch's last commit.  But not in the git sense that there's a branch ID
pointing at the commit in question.

You know what?  It occurs to me that much of the problem with git
branches vs. bzr branches might be solved when bzr gets proper tagging
support.  Because, after all, aren't branches more like special tags in
git?

> > Now, there's probably a better way to hunt that stuff down, but in this
> > case hunting the user down worked for me.  (It may have made a
> > difference that I was using gitweb instead of a local clone.)
> 
> Vast, huge, gaping, cosmic difference.
> 
> Almost none of the power of git is exposed by gitweb. It's really not
> worth comparing. (Now a gitweb-alike that provided all the kinds of
> very easy browsing and filtering of the history like gitk and git
> might be nice to have.)

So, very probably, I would have had a far easier time of it if I had
been able to really use git to do the work, instead of gitweb.

I still don't think, though, that it's a sign of a small project to be
concerned about one's own branches more than others.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 21:04                       ` Linus Torvalds
  2006-10-21 23:58                         ` Linus Torvalds
@ 2006-10-22  0:09                         ` Erik Bågfors
  2006-10-27  4:51                         ` Jan Hudec
  2 siblings, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-22  0:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On 10/21/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sat, 21 Oct 2006, Erik Bågfors wrote:
> >
> > bzr is a fully decentralized VCS. I've read this thread for quite some
> > time now and I really cannot understand why people come to this
> > conclusion.
>
> Even the bzr people agree, so what's not to understand?

The use of the word "decentralized".

When I think centralized, I think "all users must commit to a central
repo/branch".  In this sense bzr is 100% fully decentralized.  You are
free to commit to a non-central branch.

What I mean is that it's fully decentralized, but it may have a bias
to the usage of a central branch/repo.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:58                         ` Linus Torvalds
@ 2006-10-22  0:13                           ` Erik Bågfors
  2006-10-22  0:22                             ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-22  0:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On 10/22/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sat, 21 Oct 2006, Linus Torvalds wrote:
> >
> > And that work-flow is definitely not "distributed" it's much closer to
> > "disconnected centralized".
>
> Side note: the only reason I think that distinction is worth making at all
> is when comparing git to bzr, and even then this is a fairly subtle
> distinction, and probably not a huge deal in practice.
>
> I obviously think git is a nicer distributed design, but in the end, if
> you compare to something like CVS or SVN that isn't even disconnected, the
> difference between git and bzr in this sense is basically zero.
>
> So I sound like I care, but at the same time, I realize very well that
> when coming from a totally centralized world, the details we're arguing
> are _so_ not important.

I have to agree. Personally I think git, bzr and mercurial are
all VERY nice systems.  If they weren't all started about the same
time, I doubt we would have all three.

I am happy to use any of them, but I have a small preference for bzr
because it suits me. I'm saying this just as a user, nothing else.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:39                                                   ` Aaron Bentley
  2006-10-22  0:04                                                     ` Carl Worth
@ 2006-10-22  0:14                                                     ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22  0:14 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Andreas Ericsson, Linus Torvalds, Carl Worth, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Aaron Bentley wrote:

>>> A Bazaar branch is a directory inside a repository that contains:
>>>  - a name referencing a particular revision
>>>  - (optional) the location of the default branch to pull/merge from
>>>  - (optional) the location of the default branch to push to
>>>  - (optional) the policy for GPG signing
>>>  - (optional) an alternate committer-id to use for this branch
>>>  - (optional) a nickname for the branch
>>>  - other configuration options
>> Erm, wasn't revno to revid mapping also part of bzr "branch"?
> 
> It's not part of the conceptual model.  The revno-to-revid mapping is
> done using the DAG.  The branch just tracks the head.
> 
> The .bzr/branch/revision-history file is from an earlier model in which
> branches had a local ordering.  Nowadays, it can be treated as:
>  - a reference to the head revision
>  - a cache of the revno-to-revid mapping

In git the DAG is a DAG of parents. There are no "child" links. So it is
natural to refer to the n-th ancestor of a given commit (in git <ref>~<n>,
in bzr -<m>).

To have incrementing revision numbers (starting from 1 for the first
revision on a given branch) you either have to have links to "children",
which automatically means that revisions cannot be immutable if branching
at an arbitrary revision is to be allowed, or you have to traverse the DAG
there and back again (perhaps with a cache of the revno-to-revid mapping
to help performance).

Additionally, to have incrementing revision numbers you have to remember
which part of the DAG is your branch, i.e. which parent of a merge to
follow. Bazaar-NG decides here to distinguish the first parent; to keep
the first parent immutable it doesn't use fast-forward and always uses a
merge, sometimes giving an empty merge. If you use "pull", the numbers
change.
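A rough sketch of the "walk the DAG along the first parent" idea, using
git itself on a throwaway repository. The repository, the branch names
and the commit messages are made up for the illustration, and it assumes
a git new enough to have "rev-list --count":

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the demo

# Three commits on master.
for n in 1 2 3; do
    git commit -q --allow-empty -m "master $n"
done

# One commit on a side branch, merged back with an explicit merge commit
# (no fast-forward), so master's own line stays the first parent.
git checkout -q -b side
git commit -q --allow-empty -m "side 1"
git checkout -q master
git merge -q --no-ff -m "merge side" side

# Commits only record their parents, so a "revision number" has to be
# computed by walking back from the head. Following first parents only
# keeps the walk on "our" branch.
first_parent_count=$(git rev-list --first-parent --count HEAD)
all_count=$(git rev-list --count HEAD)
echo "first-parent position of HEAD: $first_parent_count"
echo "commits reachable from HEAD:   $all_count"
```

The first number counts only the commits on master's own line (three
commits plus the merge); the second also counts the commit that came in
through the merge.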
 
>>> This layout is an imitation of Git, as I understand it:
>>> Repository:
>>> ~/repo
>>>
>>> Branches:
>>> ~/repo/origin
>>> ~/repo/master
>>>
>>> Workingtree:
>>> ~/repo
>>
>> Workingtree:
>> ~/
>>
>> if I understand notation correctly.
> 
> The notation was that ~/repo would contain the .git directory for the
> repository.

The default layout of "clothed" repository is

 Repository:
 ~/repo/.git/

 Branches:
 ~/repo/.git/refs/heads/

 Workingtree:
 ~/repo/

>>> While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
>>> http://bazaar-vcs.org/bzr/bzr.dev" is a big win.
>>
>> Gaah, it's even more inconvenient. Certainly more than using name
>> of branch itself, like in git.
> 
> Of course if you have a copy of bzr.dev on your computer, you don't need
> to type the full URL.  it's just like the 'merge ../b' above.
> 
> But how can you use the branch name of a branch that isn't on your
> computer?  I suspect git requires a separate 'clone' step to get it onto
> your computer first.

No, as was said in other messages in this thread, you can fetch
a branch (or branches), even from a repository other than the one you
cloned from, into a given local branch (or branches). For git it would be
  $ git fetch <URL> <remotebranch>:<localbranch>
You would probably want to save the above info in a remotes file or in
the config.
For cg (Cogito) it would be
  $ cg branch-add <localbranch> <URL>#<remotebranch>
  $ cg fetch <localbranch>

In git you always use names like 'master', 'next', 'HEAD' (meaning the
current branch) and also HEAD^, next~5 when comparing branches, viewing
history, merging branches, switching to a branch etc. Not '../master'...
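A minimal sketch of that fetch, assuming two throwaway repositories where
a local path stands in for <URL>; the names "upstream", "mine" and
"theirs" are invented for the example:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the remote repository.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "upstream 1"
git commit -q --allow-empty -m "upstream 2"

# A separate local repository that was not cloned from upstream.
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "local 1"

# git fetch <URL> <remotebranch>:<localbranch>
git fetch -q "$work/upstream" master:theirs

# From here on the fetched branch is referred to by name, not by path.
tip=$(git log -1 --format=%s theirs)
parent=$(git log -1 --format=%s theirs~1)
echo "theirs   -> $tip"
echo "theirs~1 -> $parent"
```

No separate clone step is involved: the fetch creates the local branch
"theirs", and "theirs~1" then names its first ancestor.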

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22  0:13                           ` Erik Bågfors
@ 2006-10-22  0:22                             ` Jakub Narebski
  2006-10-22  1:00                               ` Theodore Tso
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22  0:22 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Linus Torvalds, Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy

Erik Bågfors wrote:

>> So I sound like I care, but at the same time, I realize very well that
>> when coming from a totally centralized world, the details we're arguing
>> are _so_ not important.
> 
> I have to agree. Personally I think git, bzr and mercurial are
> all VERY nice systems.  If they weren't all started about the same
> time, I doubt we would have all three.

If I understand correctly, bzr came to life much earlier than Monotone,
Mercurial and Git, but it was in beta for a very long time. Bazaar-NG
"repositories", which group a bunch of "branches", seem inspired by hg or
git. Git (and probably Mercurial) was inspired by both BitKeeper and
Monotone. Monotone started to be reasonably fast around the time when Git
and Mercurial came to be.

P.S. I'd like very much to see "history of SCM", with links denoting
borrowing of ideas, similar to the "history of UNIX" graphs...
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
  2006-10-21 23:25                                                       ` Sean
  2006-10-21 23:25                                                       ` Sean
@ 2006-10-22  0:46                                                       ` Jeff Licquia
       [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-22  0:46 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 19:25 -0400, Sean wrote:
> Now the opinion of the bzr people is that it doesn't matter and that for
> all important cases it works well enough.  If all the people who don't like
> the look of sha1's self select bzr, so be it, but that doesn't change the
> fundamental argument.

Which opinion is this?  The opinion that old-style local revnos aren't a
big deal, or that new-style dotted revnos aren't a big deal?

I suspect you're conflating the two, and interpreting certainty for the
former as certainty for the latter.  Though I don't mind being
corrected.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22  0:07                                                               ` Jeff Licquia
@ 2006-10-22  0:47                                                                 ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-22  0:47 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Carl Worth, bazaar-ng, git



On Sat, 21 Oct 2006, Jeff Licquia wrote:
> 
> You know what?  It occurs to me that much of the problem with git
> branches vs. bzr branches might be solved when bzr gets proper tagging
> support.  Because, after all, aren't branches more like special tags in
> git?

Both branches _and_ tags in git are 100% the same thing: they're just 
shorthand for the commit name. That's _literally_ all they are. They are a 
symbolic name for a 160-bit SHA1 hash.

So yes, you can say that branches are like special tags, or that 
(unsigned) tags are like special branches. There's no real "technical" 
difference: in both cases, it's just an arbitrary name for the top commit.

However, there are some purely UI differences between tags and branches, 
which really don't affect any of the "name->SHA1" translation at all, but 
which affect how you can _use_ a tag-name vs a branch-name.

 - A branch is always a pointer to a _commit_ object.

   In contrast, a tag can point to anything. It can point to a tree (and 
   that means that you can do _diff_ between a tag and a branch, but such 
   a tree doesn't have any "history" associated with it - it's purely 
   about a certain "state", so you cannot say that it has a parent or 
   anything like that).

   A tag can also point to a single file object ("blob": pure file
   content), which is something that the git.git repository uses to point
   to the GPG public key that Junio uses to sign things, for example.

   But perhaps more commonly, a tag can also point to a special "tag"
   object, which is just a form of indirection that can optionally contain
   an explanation and a digitally signed verification. When I cut a kernel
   release, for example, my tags don't point to the commit that is the
   release commit, they point to a GPG-signed tag-object that in turn
   points to the commit.

   With those signed tags, people can verify (if they get my public key)
   that a particular release was something I did. And due to the
   cryptographic nature of the hash, trusting the tag object also means
   that you can trust the commit it points to, and the whole history that
   commit points to.

   So while from a _revision_lookup_ standpoint a "branch" and a "tag" do 
   100% the same thing, we put some limitations on branches: they always 
   have to point to a commit.

 - Thanks to the limitation on branches being commits, branches can be 
   "checked out" which is saying that you can make it the active working 
   tree state. You cannot "check out" a tag: you need to have a branch 
   that you check out and can do development on.  So a "tag" is considered 
   purely a stationary pointer: it cannot be committed to, and it cannot 
   participate directly in development.

   This literally has nothing to do with looking up the SHA1 name 
   associated with a tag or a branch, this is _purely_ an agreed-upon 
   convention (that is enforced by higher-level commands like "git 
   checkout"). So if you want to check out the state as of some tag, you 
   must always do it within the confines of some branch.

   So for example, you could do

	git checkout -b newbranch v2.6.18

   which uses a tag ("v2.6.18") to define where to start the branch, and 
   then creates a branch called "newbranch" and checks that out. That's 
   purely shorthand for

	git branch newbranch v2.6.18	# create 'newbranch', initialize 
					# it at v2.6.18

	git checkout newbranch		# make 'newbranch' our currently
					# active branch

   but you are _not_ allowed to do

	git checkout v2.6.18

   because that would leave you with a situation where your "top-of-tree" 
   is a tag, and you couldn't do any development on it because you don't 
   have a branch to develop _on_.
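To make the points above concrete, here is a small sketch on a throwaway
repository. The tag names "light" and "v1" are invented, and an unsigned
annotated tag stands in for a GPG-signed one so that no key is needed.
(Later git releases do accept a plain "git checkout <tag>" and leave HEAD
detached, so the sketch sticks to the "checkout -b" form.)

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "release commit"

git tag light                      # plain name for the commit itself
git tag -a -m "demo release" v1    # separate tag object pointing at the commit

# What kind of object does each name resolve to?
branch_type=$(git cat-file -t master)
light_type=$(git cat-file -t light)
tag_type=$(git cat-file -t v1)

# Peeling the tag object yields the same commit the branch names.
head_sha=$(git rev-parse master)
peeled_sha=$(git rev-parse 'v1^{commit}')

# Start a branch at the tag, as in "git checkout -b newbranch v2.6.18".
git checkout -q -b newbranch v1
new_sha=$(git rev-parse newbranch)

echo "master: $branch_type, light: $light_type, v1: $tag_type"
echo "master=$head_sha v1^{commit}=$peeled_sha newbranch=$new_sha"
```

The branch and the plain tag resolve straight to the commit, while "v1"
resolves to a tag object that in turn points at that same commit.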

But all of these kinds of differences between tags and branches are really 
not "core technology" and are purely about having adopted a convention. It 
is literally about just having certain "usage rules" for specific 
"symbolic namespaces".

"branch" and "tag" are just the normal namespaces git gives you and always 
has. You can have others too (and you can define your own) and those names 
will automatically be used for lookup by all the basic git tools. Git 
won't _touch_ those names in any other way, but it means that you can 
create your own tools around git that have their own rules about how the 
names are managed, and you can still use them for lookup.

For example, you could have a "svn" namespace for a project imported from 
svn, and that namespace would contain the SVN revision names for the 
project, so that you could do

	git diff svn/56..

to see the difference between "svn revision 56" and your current HEAD, 
without necessarily polluting the "real" git tag namespace.

(Which can matter, since some commands take arguments like "--tags", which 
just collects all the regular tags - so you might not want to use normal 
tags to remember your SVN revision mapping, even if it might technically 
be fine).

(The above was a totally made-up example. I don't think any of the svn 
importers actually do anything like that: but we do use a few other 
"namespaces" internally: "git bisect" puts the bisection results in the 
"bisect" namespace, and the "remotes" namespace can be used to track 
remote heads as something _different_ than a local branch - so that you 
won't check such a "remote branch" out directly by mistake)
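A sketch of what such a made-up "svn" namespace could look like, using
plumbing on a throwaway repository. The ref name refs/svn/56 and the file
name are assumptions for the illustration only, not something any
importer is known to create:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo one > file.txt
git add file.txt
git commit -q -m "imported r56"
old=$(git rev-parse HEAD)

echo two >> file.txt
git commit -q -a -m "later work"

# Record the hypothetical "svn revision 56" under refs/svn/,
# outside the regular refs/tags/ namespace.
git update-ref refs/svn/56 "$old"

# The short name is found by the normal lookup rules...
resolved=$(git rev-parse svn/56)
changed=$(git diff --name-only svn/56..)
# ...but it does not show up among the regular tags.
tags=$(git tag -l)

echo "svn/56 -> $resolved"
echo "changed since svn/56: $changed"
```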

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22  0:22                             ` Jakub Narebski
@ 2006-10-22  1:00                               ` Theodore Tso
  0 siblings, 0 replies; 1752+ messages in thread
From: Theodore Tso @ 2006-10-22  1:00 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Erik Bågfors, Linus Torvalds, Sean, Jan Hudec, bazaar-ng,
	git, Matthieu Moy

On Sun, Oct 22, 2006 at 02:22:28AM +0200, Jakub Narebski wrote:
> If I understand correctly, bzr came to life much earlier than Monotone,
> Mercurial and Git, but it was in beta for a very long time. Bazaar-NG
> "repositories", which group a bunch of "branches", seem inspired by hg or
> git. Git (and probably Mercurial) was inspired by both BitKeeper and
> Monotone. Monotone started to be reasonably fast around the time when Git
> and Mercurial came to be.

Yes, bzr predates Mercurial and Git; I remember talking to Martin Pool
about Bazaar-NG at the 2005 Linux.conf.au, which was before the BK
turnoff.  At the time, I had considered using bzr-ng (which has since
been renamed bzr), but it didn't have branch functionality at that
point if I remember correctly.  Both git and Mercurial started
development at almost the same time, right after Larry McVoy
announced the pending withdrawal of the BitKeeper no-cost license.

About one month after the announced BK turnoff date, I looked at the
various options for transitioning e2fsprogs, and at that point
Mercurial was **substantially** faster than bzr, and I believe
slightly ahead in features.  I also looked at git, but at that point
Hg was easier to learn how to use, and I figured for a project the
size of e2fsprogs, I didn't need the power of git, so I decided in
favor of Mercurial because it looked like it would be easier for
people to learn how to use it.

I think it's fair to say that the exchange of ideas has profited all
three projects, and that the different projects have different
strengths.

						- Ted

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
@ 2006-10-22  1:26                                                           ` Sean
  2006-10-22  1:26                                                           ` Sean
  2006-10-22  3:23                                                           ` Jeff Licquia
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-22  1:26 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

On Sat, 21 Oct 2006 20:46:45 -0400
Jeff Licquia <jeff@licquia.org> wrote:

> Which opinion is this?  The opinion that old-style local revnos aren't a
> big deal, or that new-style dotted revnos aren't a big deal?
> 
> I suspect you're conflating the two, and interpreting certainty for the
> former as certainty for the latter.  Though I don't mind being
> corrected.

The archives have all the posts of people claiming that there were no
issues with revno's and fully distributed models.  But it's okay, the
issue really isn't all that important in the big scheme of things.  Bzr
and Git have much more in common than they have differences.  I reject
the claim that revno's are an example of where bzr is superior to Git, but
there are no doubt examples where I would concede that bzr has the edge.

Cheers,
Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
  2006-10-22  1:26                                                           ` Sean
  2006-10-22  1:26                                                           ` Sean
@ 2006-10-22  3:23                                                           ` Jeff Licquia
       [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Jeff Licquia @ 2006-10-22  3:23 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 21:26 -0400, Sean wrote:
> On Sat, 21 Oct 2006 20:46:45 -0400
> Jeff Licquia <jeff@licquia.org> wrote:
> > I suspect you're conflating the two, and interpreting certainty for the
> > former as certainty for the latter.  Though I don't mind being
> > corrected.
> 
> The archives have all the posts of people claiming that there were no
> issues with revno's and fully distributed models.  

"revno's"?  Which "revno's"? ...

OK.  So you are conflating the two.  Could someone who isn't comment?

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
  2006-10-22  3:30                                                               ` Sean
@ 2006-10-22  3:30                                                               ` Sean
  2006-10-22 10:00                                                               ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-22  3:30 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

On Sat, 21 Oct 2006 23:23:37 -0400
Jeff Licquia <jeff@licquia.org> wrote:

> > The archives have all the posts of people claiming that there were no
> > issues with revno's and fully distributed models.  
> 
> "revno's"?  Which "revno's"? ...
> 
> OK.  So you are conflating the two.  Could someone who isn't comment?

No, actually I'm not.  Single revno's or your dotted revno's _both_ have the
same property.  They can only be local data and cannot guarantee stability
in a fully distributed environment.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* prune/prune-packed
@ 2006-10-22  3:59 J. Bruce Fields
  2006-10-22  4:59 ` prune/prune-packed Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-22  3:59 UTC (permalink / raw)
  To: git

Both "man prune" and everyday.txt say that git-prune also runs
git-prune-packed.  But that doesn't seem to be true.  Is the bug in the
documentation?

--b.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: prune/prune-packed
  2006-10-22  3:59 prune/prune-packed J. Bruce Fields
@ 2006-10-22  4:59 ` Junio C Hamano
  2006-10-22 23:14   ` prune/prune-packed J. Bruce Fields
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-22  4:59 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: git

"J. Bruce Fields" <bfields@fieldses.org> writes:

> Both "man prune" and everyday.txt say that git-prune also runs
> git-prune-packed.  But that doesn't seem to be true.  Is the bug in the
> documentation?

I think it is a regression when prune was rewritten as a
built-in.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:05                                               ` Aaron Bentley
  2006-10-21 20:48                                                 ` Jakub Narebski
       [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
@ 2006-10-22  7:45                                                 ` Jan Hudec
  2006-10-22  9:05                                                   ` Jakub Narebski
  2 siblings, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-22  7:45 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On Sat, Oct 21, 2006 at 04:05:18PM -0400, Aaron Bentley wrote:
> Carl Worth wrote:
> > On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:
> [...]
> > But it really is fundamental and unavoidable that sequential numbers
> > don't work as names in a distributed version control system.
> 
> Right.  You need something guaranteed to be unique.  It's the revno +
> url combo that is unique.  That may not be permanent, but anyone can
> create one of those names, so it is decentralized.

But it is *not* *distributed*. The definition of a distributed system
requires, among other things, that resource identifiers be independent of
the location of the resources. So only using the revision-ids is really
distributed.

> >> I meant that the active branch and a mirror of the abandoned branch
> >> could be stored in the same repository, for ease of access.
> > 
> > Granted, everything can be stored in one repository. But that still
> > doesn't change what I was trying to say with my example. One of the
> > repositories would "win" (the names it published during the fork would
> > still be valid). And the other repository would "lose" (the names it
> > published would be not valid anymore). Right?
> 
> No.  It would be silly for the losing side to publish a mirror of the
> winning branch at the same location where they had previously published
> their own branch.  So the old number + URL combination would remain valid.

I regularly use bzr and I have never used git. But I'd not hesitate a
second to pull --overwrite over the old location, because for me the url
has the meaning "the base I develop against" and I'd want to preserve
that meaning.

> If the losing faction decided to maintain their own branch after the
> merge, they'd have two options
> 
> 1. continue to develop against the losing "branch", without updating its
> numbers from the "winning" branch.  It would be hard to tell who had won
> or lost in this case.
> 
> 2. create a new mirror of the "winning" branch and develop against that.
>  I'm not sure what the point of this would be.
> 
> I think the most realistic thing in this scenario is that they leave the
> "losing" branch exactly where it was, and develop against the "winning"
> branch.
> 
> >> Bazaar encourages you to stick lots and lots of branches in your
> >> repository.  They don't even have to be related.  For example, my repo
> >> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
> > 
> > Git allows this just fine. And lots of branches belonging to a single
> > project is definitely the common usage. It is not common (nor
> > encouraged) for unrelated projects to share a repository, since a git
> > clone will fetch every branch in the repository.
> 
> Right.  This is a difference between Bazaar and Git that I'd
> characterize as being "branch-oriented" vs "repository-oriented".  We'll
> see more of this below.

This, on the other hand, is one of the things I like better in bzr than
in git, because it is really branches and not repositories that I usually
care about.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 17:31                                                       ` Linus Torvalds
  2006-10-21 17:38                                                         ` Linus Torvalds
@ 2006-10-22  7:49                                                         ` Tim Webster
  2006-10-22 17:12                                                           ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Tim Webster @ 2006-10-22  7:49 UTC (permalink / raw)
  To: git; +Cc: Aaron Bentley, bazaar-ng, Jakub Narebski

On 10/22/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sat, 21 Oct 2006, Aaron Bentley wrote:
> >
> > Any SCM worth its salt should support that.  AIUI, that's not what Tim
> > wants.  He wants to intermix files from different repos in the same
> > directory.
> >
> > i.e.
> >
> > project/file-1
> > project/file-2
> > project/.git-1
> > project/.git-2
>
> Ok, that's just insane.
[snip]
> Anyway. Git certainly allows you to do some really insane things. The
> above is just the beginning - it's not even talking about alternate object
> directories where you can share databases _partially_ between two
> otherwise totally independent repositories etc.


Perhaps this is insane, but it does not make sense to track all config
files in etc as though they belong in a single repo. Each
application/pkg has a set of associated config files. Actually, in some
cases it is easy to track which files belong in each application/pkg
repo. For example, dpkg lists the conffiles per pkg. Additional config
files not in the application/pkg maintainer repo branch are easily added
to the application/pkg local repo branch.

My question is where should file metadata be stored in git? With hook
scripts, the file metadata can be captured and applied appropriately.

If a similar thing can be done with bzr as Linus described for git, I
am all ears.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22  7:45                                                 ` Jan Hudec
@ 2006-10-22  9:05                                                   ` Jakub Narebski
  2006-10-22  9:56                                                     ` Erik Bågfors
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22  9:05 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

Jan Hudec wrote:
> On Sat, Oct 21, 2006 at 04:05:18PM -0400, Aaron Bentley wrote:
>> Carl Worth wrote:
>>> On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:

>>>> Bazaar encourages you to stick lots and lots of branches in your
>>>> repository.  They don't even have to be related.  For example, my repo
>>>> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
>>> 
>>> Git allows this just fine. And lots of branches belonging to a single
>>> project is definitely the common usage. It is not common (nor
>>> encouraged) for unrelated projects to share a repository, since a git
>>> clone will fetch every branch in the repository.
>> 
>> Right.  This is a difference between Bazaar and Git that I'd
>> characterize as being "branch-oriented" vs "repository-oriented".  We'll
>> see more of this below.
> 
> This, on the other hand, is one of the things I like better in bzr than
> in git, because it is really branches and not repositories that I usually
> care about.

That's probably because you are used to Bazaar-NG, and it's your habits
speaking. Think of a git clone of a repository as a bzr "branch".

For example git encourages using many short and longer-lived feature
branches; I don't see bzr encouraging this workflow.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22  9:05                                                   ` Jakub Narebski
@ 2006-10-22  9:56                                                     ` Erik Bågfors
  2006-10-22 13:23                                                       ` Jakub Narebski
  2006-10-22 14:25                                                       ` Carl Worth
  0 siblings, 2 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-22  9:56 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Andreas Ericsson, Carl Worth,
	Jan Hudec, git

> For example git encourages using many short and longer-lived feature
> branches; I don't see bzr encouraging this workflow.

Why not? I think it really does.  And due to the fact that merges are
merges and will show up as such, I think it's very suitable for
feature branches.

In fact, in the development of bzr itself, all commits are done
in feature branches and then merged into bzr.dev (the main "trunk" of
bzr) when they are considered stable.

Consider the following:
bzr branch mainline featureA
cd featureA
hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m 'f2'; etc
Now I want to merge in mainline again:
bzr merge ../mainline; bzr commit -m 'merge'
hack hack; bzr commit -m 'f3'; hack hack; bzr commit -m 'f4'; etc

Right now, I would have something like this in the branch log:
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f4
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f3
----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   merge
      -----------------------------------------------------------------
      committer: Foo Bar <foo@bar.com>
      branch nick: mainline
      message:
         something done in mainline
      -----------------------------------------------------------------
      committer: Foo Bar <foo@bar.com>
      branch nick: mainline
      message:
         something else done in mainline
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f2
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f1

In this view, I can easily see what was part of this feature branch,
because the commits that belong to the feature branch are not
indented, and they have a "branch nick" of "featureA".  I can also
easily see what comes from other branches.

I can also run bzr log with --line or --short, which shows only the
commits made in this branch and not the ones that are merged in.  So
with --line I would get something like:
Erik Bågfors 2006-10-19 f4
Erik Bågfors 2006-10-19 f3
Erik Bågfors 2006-10-19 merge
Erik Bågfors 2006-10-19 f2
Erik Bågfors 2006-10-19 f1

Which will give me a good view of what has been done in this feature
branch only.
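
The distinction between "made in this branch" and "merged in" can be
sketched in a few lines of Python.  This is a toy model only, not how
bzr implements its log: it assumes that the commits made in a branch
are the ones reached by following first parents from the tip, and the
`parents` table, the revision names ("base", "m1", "m2") and both
helper functions are made up for the illustration.

```python
# Toy history for the featureA example; all names are illustrative.
# Each revision maps to its parents, first parent first.
parents = {
    "base": [],
    "f1": ["base"],
    "f2": ["f1"],
    "m1": ["base"],          # "something else done in mainline"
    "m2": ["m1"],            # "something done in mainline"
    "merge": ["f2", "m2"],   # first parent is the branch's own previous tip
    "f3": ["merge"],
    "f4": ["f3"],
}


def mainline(head):
    """Revisions reached from head by following only first parents, newest first."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        rev = parents[rev][0] if parents[rev] else None
    return chain


def merged_in(head):
    """Ancestors of head that are not on its first-parent chain."""
    seen, stack = set(), [head]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen - set(mainline(head))


own = mainline("f4")        # what a one-line-per-commit branch view would list
others = merged_in("f4")    # what the full log shows indented
```

Here `own` comes out as f4, f3, merge, f2, f1 plus the shared "base"
revision, and `others` holds m1 and m2.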

If I understand it correctly, in git, you don't really know what has
been committed as part of this branch/repo, and what has been
committed in another branch/repo (this is my understanding from
reading this thread, I might be wrong, feel free to correct me again
:) )

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
  2006-10-22  3:30                                                               ` Sean
  2006-10-22  3:30                                                               ` Sean
@ 2006-10-22 10:00                                                               ` Matthew D. Fuller
       [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 10:00 UTC (permalink / raw)
  To: Sean; +Cc: Jeff Licquia, bazaar-ng, git

On Sat, Oct 21, 2006 at 11:30:14PM -0400 I heard the voice of
Sean, and lo! it spake thus:
> On Sat, 21 Oct 2006 23:23:37 -0400
> Jeff Licquia <jeff@licquia.org> wrote:
> > 
> > OK.  So you are conflating the two.  Could someone who isn't
> > comment?
> 
> No, actually i'm not.  Single revno's or your dotted revno's _both_
> have the same property.

I think Jeff's actually meaning the other way around.  We're confident
through experience of the utility of the single revnos.  We're NOT (at
least, I'm not) so convinced of the utility and usability of the
dotted ones; they haven't gone through the crucible of experience yet.

During the dotted-decimal discussion, I favored numbering from the
merge point (rather than the ancestral point) for a lot of the same
reasons brought up here.  e.g., the log-ish output would look
something like:

200
199
 199.3
 199.2
 199.1
198
[...]

See <https://lists.ubuntu.com/archives/bazaar-ng/2006q3/017773.html>
for instance.

Of course, now we have them, and they number from ancestors.  So
after that's been in a couple of releases, we'll get to see how it works
in practice.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
  2006-10-22 11:44                                                                   ` Sean
@ 2006-10-22 11:44                                                                   ` Sean
  2006-10-22 13:03                                                                   ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-22 11:44 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Jeff Licquia, bazaar-ng, git

On Sun, 22 Oct 2006 05:00:28 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> I think Jeff's actually meaning the other way around.  We're confident
> through experience of the utility of the single revnos.  We're NOT (at
> least, I'm not) so convinced of the utility and usability of the
> dotted ones; they haven't gone through the crucible of experience yet.
> 

Yes, that's the way I took what he said as well.

Bzr revnos (dotted or otherwise) can not be guaranteed to be stable
in a truly distributed system.   Now it's clear that you folks
just don't really care about that and you're happy enough that they
work out fine for your uses.  That's a fair enough decision to make;
there's no law that says you have to care about the situations where
there will be clashes and/or the numbers will change.  Git makes
a different choice, and for my money it's a better choice.

Cheers,
Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
  2006-10-21 20:55                                                   ` Jakub Narebski
  2006-10-21 23:07                                                   ` Jeff Licquia
@ 2006-10-22 12:46                                                   ` Matthew D. Fuller
  2006-10-22 13:51                                                     ` Jakub Narebski
  2006-10-22 19:36                                                   ` David Clymer
  3 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 12:46 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, git

[ Time to trim up CC's a bit ]

On Sat, Oct 21, 2006 at 01:47:08PM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
> On Sat, 21 Oct 2006 08:01:11 -0500, "Matthew D. Fuller" wrote:
> > I think we're getting into scratched-record-mode on this.
> 
> I apologize if I've come across as beating a dead horse on this.

Oh, I don't mean the whole topic in general.  It's just that there are
only so many ways one can say "revnos are only valid in certain
situations", and I really think we must have hit them all by now.  We
all agree on that; we just disagree (probably highly based on
differing workflows) on the commonness and extent of those situations.


> > B: Revnos are handier tools for [situation] and [situation] for
> >    [reason] and [reason].
> 
> I'm missing something:
> 
> I still haven't seen strong examples for this last claim. When are
> they handier?

This ties in a bit with what you say below, so I'll address it there.


> There's no doubt that there has been semantic confusion over the
> term branch that has been confounding communication on both sides.
  [...]
> Let me know if I botched any of that.

This seems correct; at least, it's correct enough to work from until
we find a detail wrong.


> But dropping a merged branch in bzr means throwing away the ability to
> reference any of its commits by its custom, branch-specific revision
> numbers.

True (though see below).


> And there is no simple way to correlate the numbers between
> branches.

Rather, unless you can one way or another access the branch the number
was for, there's NO way.


> Maybe you can argue that there isn't any centralization bias in bzr.
> But anyone that claims that the revnos. are stable really is talking
> from a standpoint that favors centralization.

I think it's using that 'c' word there that's causing contention here;
we're ascribing different meanings to it.

Revnos only apply to a specific "branch" (in this usage, I'm talking
about branch abstractly and somewhat specifically; more in a moment),
and so except by wild coincidence are only useful in talking about
that branch.  One of the two cases (the second discussed later) where
that's useful is when you have long-lived branches.  In git,
apparently, you don't have long-lived "branches" in this particular
meaning of the word, but the way people use bzr they do.  Perhaps this
is what you mean by 'centralization'.

That long-lived branch doesn't have to be any sort of "trunk", though
it usually is; it could as easily be something totally peripheral.


Now, details of that use of "branch".  In mathematical terms, a branch
may be defined purely by its head rev (and the graph built up by
recursing through all the parents), but in [bzr] UI and mental model
terms, a "branch" is that plus its mainline[0]; the left-most or first
line of descent, which colloquially is the difference between 'things
I commit' and 'things I merge'.

Let me try flexing my git-expression muscles here.  Given a branch at
a specific point in time, you point at the head rev, and there's a
subset we call 'mainline' of the whole set of parents, which is
expressed by following the 'first' parent pointers back to a single
origin (there can be 50 origins in the whole graph, of course, but
only one of them is on the 'mainline').  At some later time, more
revisions have been added to the graph, and the head rev is now
something "later".  If, at that later time, all the nodes which were
previously on that 'mainline' are still on it tracing back from the
new head, then in the sense I'm using "branch", it's still the same
"branch".  All the revnos referring to its earlier incarnation are
still valid for this one (though there are new ones tacked onto the
end; that doesn't affect the pre-existing ones).
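
The rule in the paragraph above can be written down as a short Python
sketch.  It assumes a history given as a plain dict from revision to
parent list (first parent first); the dict, the revision names and the
functions `mainline`, `revnos` and `same_branch` are hypothetical,
chosen for illustration, and are not bzr's API.

```python
# Hypothetical history: revision -> parents, first parent first.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "x": ["a"],        # work done elsewhere, starting from "a"
    "m1": ["c", "x"],  # the a-b-c branch merges x
    "m2": ["x", "c"],  # the other side merges c instead
}


def mainline(head):
    """First-parent chain ending at head, oldest revision first."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        rev = parents[rev][0] if parents[rev] else None
    return chain[::-1]


def revnos(head):
    """Number the mainline 1, 2, 3, ... starting from the origin."""
    return {rev: n for n, rev in enumerate(mainline(head), start=1)}


def same_branch(old_head, new_head):
    """True if every old mainline revision keeps its position under new_head."""
    old, new = mainline(old_head), mainline(new_head)
    return new[:len(old)] == old


before = revnos("c")             # numbering while "c" is the head
kept = same_branch("c", "m1")    # old mainline a, b, c is a prefix of a, b, c, m1
lost = same_branch("c", "m2")    # mainline of m2 is a, x, m2, so b and c fall off
```

With the head at m1 the numbers 1 to 3 still name a, b and c; with the
head at m2, number 2 now names x, which is the kind of renumbering this
thread has been arguing about.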

[I THINK we all understand that, but just making sure]


[0] This probably causes some confusion too, since I know I'm guilty
    of using the word 'mainline' both in the sense of a 'trunk'
    branch, and this particular path through one branch.  _I_ think
    it's usually clear from context, but I guess it probably isn't for
    those with a different mental modeling of "branch".


> To illustrate, yesterday I gave an example where performing a bzr
> branch from a dotted-decimal revision would rewrite the numbers from
> the originating branch (1.2.2, 1.2.1, and 1) to unrelated numbers in
> the new branch (3, 2, 1).

One thing to note here is that that 1.2.1 and 1.2.2 came into your
first branch here by merging from another branch (call that branch
'b').  When you created your new branch here that now has (3,2,1),
those numbers are the same as the numbers that existed locally in 'b'
at the time 1.2.2 was its 'head'.  In a sense, then, you've just
recreated [a copy of] "branch" 'b' at that time.  So, in a way, by
taking a copy of the current bzr.dev branch, you can recreate the
entire state of any branches that were merged into it as of the time
they were merged (excluding cases of cherrypicking, or when merging
prior to the head of those branches of course).


> But then I realized why bzr is doing this. It's because, bzr users
> don't just use the revision numbers for external communication, but
> they also use them for lots of direct interaction with the tool. The
> rewriting makes it easy to write something like "bzr diff -r1..3".

This is an instance of the second case (first above) where the revnos,
applying just to one branch, become useful.  And, it's probably the
case I'm most attached to.

The great majority (I'd say easily 80%) of my references to revisions
are transient.  Most of 'em have probably exhausted their usefulness
in an hour; many of them (as in interaction with the tool you
mentioned) in just a couple seconds.  Virtually all my branches live
longer than that, so the limited lifespan of the numbers in the grand
scheme doesn't matter a whit.

So, from above, some of the places they're handier:

- Typing.  I know, copy and paste copies and pastes one string just as
  well as another, and long strings just as well as short.  But I
  don't want to copy&paste; I want to ^Z out of log and run a quick
  diff, between two revisions only one of which is on my screen at the
  time.  I can just remember the offscreen revno I'm comparing
  against, and it's very easy to quickly type the numbers,
  particularly since 95% of the time I'm comparing mainline revs so I
  don't even have to think about dotted forms.


- Some forms of communicating.  I can yell numbers across the room
  without concern about whether they'll be interpreted right.  Even 6
  digits of an SHA-1 hash are a lot harder to do that with.  I can
  hold revnos in my head while I walk down the hall to talk to
  somebody about them, or pick up a phone, or go to a meeting.  I can
  scrawl them on notepads or whiteboards.  In all these cases, the
  only reason for which I'm communicating that revno will be exhausted
  very shortly, so it's completely irrelevant whether it's meaningful
  in 5 years, or next week.


- Visual comparing (this is one that's useful on the long-lived
  branches, as well as transient stuff) and information gathering.  I
  can hold in my head "Yeah, I looked at 1350 of Joe's branch", and if
  I see an email from him "Oh, I fixed a bug in 1358" or "in 1293", I
  can know just from that whether I saw the fix or not.

  If somebody says "I introduced a bug in revision 3841, and fixed it
  in 3843", I know the window where that bug is in play is probably
  pretty small, whereas "introduced in 3841, fixed in 5337" tells me
  it was alive a looong time.

  bzr.dev is currently on revno 2091.  I didn't know that, I had to
  look it up.  But I knew it was a little past 2000, just from loosely
  watching it.  If somebody talks about something that happened in
  revno 1800, I know automatically "That was fairly recent", compared
  to talking about revno 75, where I know "Wow, that was a long time
  ago".

  This property is true of bzr revids as well.  If I see talk about
  revision "mbp@sourcefrog.net-20050520021228-bc46a17f07eff7f9", I
  know right away Martin committed it, and it was a year and a half
  ago.  If I see talk about an oops in revision "af38cc3", that just
  tells me that somebody screwed up, and it gets mentally filed away
  or goes in one eye and out the other.  But if I see talk about an
  oops in revision "fullermd@over-yonder.net-[...]", that rings bright
  blue bells that tell me that *I* screwed up and I need to jump on
  that right now.  In sufficiently small projects with sufficiently
  discrete task division, I may even be able to guess offhand based on
  the person and date what bit of functionality the commit references,
  though that's a much lower probability.

  It can also be useful in looking at cases where you don't
  necessarily have the tool.  Compare putting CVS's rcsid tags in
  strings in the source.  static const char *rcsid = "$Id$"; and the
  like.  Then you can use 'ident' on the compiled binaries to see the
  revs of files in them.  If somebody says "foo.c has a bug in 1.34,
  fixed in 1.37", I can without any VCS interaction just look at the
  compiled binary and tell whether I'm prior to the bug, have the bug,
  or after the fix.  If the binary is known to be compiled from a
  particular branch, a tree-wide revno tells me that too.  A revid
  (even one containing a date) won't tell me that; I'll have to find
  the tool and a copy of the tree and find out if my rev contains that
  other rev.

  Now, on any given revision reference, I probably don't care about
  most of those bits of info.  I may not care about any of them, but I
  often care about at least one or two.  And we all probably have
  wildly varying appraisals of the commonness of various of the
  situations described.  And yes, a lot of them are just mental
  heuristics.  Sure, with a completely opaque id, I could pull up the
  tool to look up any of those (and a lot more information besides),
  the gain is I don't HAVE to.  Just knowing some bit of that info can
  often tell me if I don't care to investigate whatever the revision
  is being referenced for at all, or that I need to put doing so at
  the top of my priority list.



> And it turns out that git also allows branch specific naming for the
> exact same reason. In place of 3, 2, 1 in the same situation git
> would allow the names HEAD, HEAD~1, and HEAD~2 to refer to the same
> three revisions. So the easy diff command would be "git diff HEAD~2
> HEAD".

In bzr, that would be "bzr diff -r-2..-1" (or just "-r-2.." since
open-ended revspecs pretty much work like you'd expect them to).  IME,
that only works well maybe 4 or 5 revs back; past that, you spend too
much time counting, and it's easier to just whack in the number from
log.

bzr _doesn't_, OTOH, have anything like HEAD^2, for selecting
alternate parent paths.  That's probably use-pattern bias; we hardly
ever do something like that, so it's never occurred to us to add the
ability to.


> Maybe the reason some people dislike git's "ugly" names so much is
> that they imagine that to compare two revisions a user of git must
> inspect the logs, fish out the sha1sum for each, and then
> cut-and-paste to create the command needed.

I do imagine that.  And I think I'd hit it, since I often look around
revs that aren't right near the tip; trying to figure out
"HEAD~293..HEAD~38" is even worse than excavating the sha1sum's.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
  2006-10-22 11:44                                                                   ` Sean
  2006-10-22 11:44                                                                   ` Sean
@ 2006-10-22 13:03                                                                   ` Matthew D. Fuller
       [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 13:03 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 07:44:22AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> Bzr revnos (dotted or otherwise) can not be guaranteed to be stable
> in a truly distributed system.

Perhaps the difference is that we're making a [fine] distinction
between "useful in a truly distributed system" and "useful when
WORKING in a truly distributed system".  cworth's point back up a few
posts is good; nearly all of my use of revnos is in direct interaction
with the tool, where the revnos just came from looking at the history.
And of those uses that aren't in that class, nearly all of THOSE are
very transient.  Non-local (in time or space) stability in either of
those cases is a total non-concern.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22  9:56                                                     ` Erik Bågfors
@ 2006-10-22 13:23                                                       ` Jakub Narebski
  2006-10-22 14:11                                                         ` Erik Bågfors
  2006-10-22 14:25                                                       ` Carl Worth
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 13:23 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Jan Hudec, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Erik Bågfors wrote:
> Jakub Narębski wrote:

>> For example git encourages using many short and longer-lived feature
>> branches; I don't see bzr encouraging this workflow.
> 
> Why not? I think it really does.  And due to the fact that merges are
> merges and will show up as such, I think it's very suitable for
> feature branches.

I think I haven't properly explained what "feature branch" means.
A "feature branch" is a short (or medium) lived branch, created for
the development of one isolated feature. When the feature reaches a
stable stage, we merge the feature branch and forget about it. We are
not interested in the fact that a given feature was developed on a
given branch. BTW, in the published git.git repository, for example,
feature branches are only available in the form of the "digest" 'pu'
(proposed updates) branch.

I guess what you are talking about are long-lived "development
branches" (like the git.git 'maint', 'master', 'next' and 'pu' branches),
or perhaps another user's long-lived clone of a given git repository.

Git considers clones of a given repository totally equivalent, and
considers the fast-forward property more important than remembering
"which branch (which clone) did this commit come from" or at least
"this commit is from this (current) branch-clone".

You have graphical history viewers (bzr has its own: bzr-gtk),
committer and author info, and the reflog (if enabled) if you really,
really need this kind of information.
 
> In fact, in the development of bzr itself, all commits are done
> in feature branches and then merged into bzr.dev (the main "trunk" of
> bzr) when they are considered stable.
> 
> Consider the following:
> bzr branch mainline featureA

Which, if I remember correctly, needs and generates a new working tree
(at least by default).

> cd featureA
> hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m 'f2'; etc
> Now I want to merge in mainline again:
> bzr merge ../mainline; bzr commit -m 'merge'
> hack hack; bzr commit -m 'f3'; hack hack; bzr commit -m 'f4'; etc

As was clarified during this long discussion, bzr "branches" are
something between git branches and one-branch [local] clones.
Can you, for example, create a branch starting from an arbitrary
revision, and not only from the tip of a branch?

The above sequence of operations can be done in (at least) two different
ways in git.

Less used:
 $ cd /somewhere/else
 $ git clone -l -s <mainrepo>/.git featureA
 $ cd featureA
 $ hack; hack; git commit -a -m "f1"; hack; hack; git commit -a -m "f2"; etc   
 $ cd <mainrepo>
 $ git pull /somewhere/else/featureA/.git
 (this does commit and merge)

But more commonly used is:
 $ git branch featureA mainline
 $ git checkout featureA
 $ hack; hack; git commit -a -m "f1"; hack; hack; git commit -a -m "f2"; etc
 $ git checkout mainline
 $ git pull . featureA
 (although this would fast-forward in this example)

> Right now, I would have something like this in the branch log:
> -----------------------------------------------------------------
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: featureA
> message:
>    f4
> -----------------------------------------------------------------
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: featureA
> message:
>    f3
> ----------------------------------------------------------------
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: featureA
> message:
>    merge
>       -----------------------------------------------------------------
>       committer: Foo Bar <foo@bar.com>
>       branch nick: mainline
>       message:
>          something done in mainline
>       -----------------------------------------------------------------
>       committer: Foo Bar <foo@bar.com>
>       branch nick: mainline
>       message:
>          something else done in mainline
The automatic merge message takes care of this, if we enable the
merge.summary config option. For example:

commit 2c8a02263c13c6e1891e9e338eb40a4286b613e5
Merge: 2492932... 87b787a...
Author: Jakub Narebski <jnareb@gmail.com>
Date:   Sat Oct 21 13:23:19 2006 +0200

    Merge branch 'master' of git://git.kernel.org/pub/scm/git/git
    
    * 'master' of git://git.kernel.org/pub/scm/git/git:
      git-clone: define die() and use it.
      Fix typo in show-index.c
      pager: default to LESS=FRS


Another example, this time of "octopus" merge.

commit ff49fae6a547e5c70117970e01c53b64d983cd10
Merge: 7ad4ee7... 75f9007... 14eab2b... 0b35995... eee4609...
Author: Junio C Hamano <junkio@cox.net>
Date:   Fri Oct 20 18:56:14 2006 -0700

    Merge branches 'jc/diff', 'jc/diff-apply-patch', 'jc/read-tree' and 'pb/web' into pu
    
    * jc/diff:
      para walk wip
      para-walk: walk n trees, index and working tree in parallel
    
    * jc/diff-apply-patch:
      git-diff/git-apply: make diff output a bit friendlier to GNU patch (part 2)
    
    * jc/read-tree:
      merge: loosen overcautious "working file will be lost" check.
    
    * pb/web:
      gitweb: Show project README if available

That said, we couldn't do that in the above-mentioned example,
as it is a simple case of fast-forward. We have the above messages
for "true merges" of two _diverging_ lines of development,
and we could use a similar format for "git log". In practice
we rather use history viewers: gitk, qgit, tig, git-show-branch.

For example:
$ git show-branch origin next
! [origin] git-clone: define die() and use it.
 ! [next] Merge branch 'master' into next
--
 - [next] Merge branch 'master' into next
++ [origin] git-clone: define die() and use it.
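
As an aside on the fast-forward case mentioned above, the decision can
be sketched in a few lines of Python.  This is a toy model of the rule
only, not git's implementation; the `parents` table, the revision
names and the functions `is_ancestor` and `merge` are all made up for
the illustration.

```python
# Toy model: revision -> parents, first parent first.  Names are made up.
parents = {
    "base": [],
    "f1": ["base"],
    "f2": ["f1"],
}


def is_ancestor(old, new):
    """True if old is reachable from new (or is new itself)."""
    seen, stack = set(), [new]
    while stack:
        rev = stack.pop()
        if rev == old:
            return True
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return False


def merge(current, other, name):
    """Return the new tip of the current branch after merging other into it."""
    if is_ancestor(other, current):
        return current                  # nothing new to merge
    if is_ancestor(current, other):
        return other                    # fast-forward: just move the tip
    parents[name] = [current, other]    # diverged: record a real merge commit
    return name


# mainline is still at "base", featureA is at "f2": fast-forward.
tip = merge("base", "f2", "merge1")

# If mainline had gained a commit of its own, the histories diverge.
parents["m1"] = ["base"]
tip2 = merge("m1", "f2", "merge2")
```

In the first call no merge commit is recorded and the tip simply moves
to f2, so there is no merge message to attach a summary to; only the
second, diverging case creates a commit with two parents.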

> If I understand it correctly, in git, you don't really know what has
> been committed as part of this branch/repo, and what has been
> committed in another branch/repo (this is my understanding from
> reading this thread, I might be wrong, feel free to correct me again
> :) )

You can browse the reflog to find out which changes were committed
as part of this repo, and which came from another repo (another clone
of this repo).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
@ 2006-10-22 13:28                                                                       ` Sean
  2006-10-22 13:28                                                                       ` Sean
  2006-10-22 13:33                                                                       ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-22 13:28 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

On Sun, 22 Oct 2006 08:03:22 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> Perhaps the difference is that we're making a [fine] distinction
> between "useful in a truly distributed system" and "useful when
> WORKING in a truly distributed system".  cworth's point back up a few
> posts is good; nearly all of my use of revnos is in direct interaction
> with the tool, where the revnos just came from looking at the history.
> And of those uses that aren't in that class, nearly all of THOSE are
> very transient.  Non-local (in time or space) stability in either of
> those cases is a total non-concern.

Sure, but if they're just a local feature then why propagate them with
the distributed data?  If they're meant only to be used locally,
they can be guaranteed to be stable by never replicating
them, with obvious benefits for the local user.  However, bzr makes the
(IMO) mistake of including them in the data that is distributed
between repos.  This suggests the bzr team just doesn't care about the
distributed models where this will not help and will quite possibly
lead to frustration and confusion.  And yes, I know that you
haven't seen those situations yourself yet.  Obviously, it's the
Bzr team's trade-off to make, but if an avid user like yourself thinks
of revnos as local, perhaps they've made the wrong choice.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
  2006-10-22 13:28                                                                       ` Sean
  2006-10-22 13:28                                                                       ` Sean
@ 2006-10-22 13:33                                                                       ` Matthew D. Fuller
       [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 13:33 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 09:28:45AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> Sure, but if they're just a local feature then why propagate them
> with the distributed data?

Because they're 'local' to a given "branch"; see my message to cworth
a little while ago for expansion of the rather particular meaning of
the word used here.  If somebody takes a clone of my _branch_, it's
the same "branch", so the numbers will be the same (and that's
desired).


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
  2006-10-22 13:40                                                                           ` Sean
@ 2006-10-22 13:40                                                                           ` Sean
  2006-10-22 13:57                                                                           ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-22 13:40 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

On Sun, 22 Oct 2006 08:33:36 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> Because they're 'local' to a given "branch"; see my message to cworth
> a little while ago for expansion of the rather particular meaning of
> the word used here.  If somebody takes a clone of my _branch_, it's
> the same "branch", so the numbers will be the same (and that's
> desired).

The fact is that once you start distributing them to other repositories
you CANNOT GUARANTEE their stability.  Those numbers may already be
used by _HIS_ branch and when he tries to get _YOUR_ branch... there
is a conflict.  AND THERE IS NOTHING YOU CAN DO TO FIX THAT.  It's
a fundamental flaw with distributing revnos.  The reason you likely
haven't seen a problem so far is that the bzr world seems to favor
the use of a central server that has the effect of more or less
synchronizing branch numbers to most of the nodes in the system.
However, that's only one model.  So while you may not have seen a
problem yourself, there are _inherent_ limitations of the system
you've embraced.

But it seems like nobody on the bzr team cares or wants to hear about
it, so let's just move on.

Cheers,
Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22 12:46                                                   ` Matthew D. Fuller
@ 2006-10-22 13:51                                                     ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 13:51 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthew D. Fuller wrote:

>   It can also be useful in looking at cases where you don't
>   necessarily have the tool.  Compare putting CVS's rcsid tags in
>   strings in the source.  static const char *rcsid = "$Id"; and the
>   like.  Then you can use 'ident' on the compiled binaries to see the
>   revs of files in them.  If somebody says "foo.c has a bug in 1.34,
>   fixed in 1.37", I can without any VCS interaction just look at the
>   compiled binary and tell whether I'm prior to the bug, have the bug,
>   or after the fix.  If the binary is known to be compiled from a
>   particular branch, a tree-wide revno tells me that too.  A revid
>   (even one containing a date) won't tell me that; I'll have to find
>   the tool and a copy of the tree and find out if my rev contains that
>   other rev.

We use signed tags for tagging official releases (e.g. the v1.4.0 tag),
and we embed the output of "git describe" in the resulting binary at
build time. For example, the current output of git-describe on my
clone of the git repository is:

 $ git describe 
 v1.4.3.1-g2c8a022

The Git project does this, gitweb does this, and the Linux kernel does
this. It is quite coarse-grained, i.e. you know which released version
the build comes after, but you need git tools (or access to git tools
via gitweb) to check whether it is before or after the fix.
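
To make the "coarse-grained" point concrete, here is a minimal Python
sketch of what can be read from an embedded describe string without any
git tools. The function names are hypothetical, and the sketch assumes
purely numeric release tags such as v1.4.3.1; it accepts the
"<tag>-g<abbrev>" shape shown above as well as a "<tag>-<count>-g<abbrev>"
shape with a commit count.

```python
import re

# Assumed shapes: "<tag>-g<abbrev>" or "<tag>-<count>-g<abbrev>".
# Anything else is treated as a bare tag.
DESCRIBE_RE = re.compile(
    r"^(?P<tag>.+?)(?:-(?P<count>\d+))?-g(?P<sha>[0-9a-f]+)$"
)


def parse_describe(text):
    """Split describe-style output into (tag, count, abbreviated hash)."""
    text = text.strip()
    match = DESCRIBE_RE.match(text)
    if match is None:
        return text, None, None
    count = match.group("count")
    return (
        match.group("tag"),
        int(count) if count is not None else None,
        match.group("sha"),
    )


def release_tuple(tag):
    """Turn a numeric tag such as 'v1.4.3.1' into (1, 4, 3, 1)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def built_on_or_after(described, release_tag):
    """Coarse check: is the nearest release tag at or after release_tag?

    This compares version numbers only. It cannot say whether a specific
    fix commit is included; that still needs the repository.
    """
    tag, _count, _sha = parse_describe(described)
    return release_tuple(tag) >= release_tuple(release_tag)


print(parse_describe("v1.4.3.1-g2c8a022"))
print(built_on_or_after("v1.4.3.1-g2c8a022", "v1.4.0"))
```

The abbreviated hash is what you would hand to git (or gitweb) to answer
the finer question of whether the build contains a given fix.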

Of course, that is when you run the git version of the tool...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
  2006-10-22 13:40                                                                           ` Sean
  2006-10-22 13:40                                                                           ` Sean
@ 2006-10-22 13:57                                                                           ` Matthew D. Fuller
       [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 13:57 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 09:40:41AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> The fact is that once you start distributing them to other
> repositories you CAN NOT GUARANTEE their stability.

Terminology.  When those revisions get distributed to other BRANCHES,
their stability is forfeit.  We know.  We don't care.  We only care
about the numbers on ONE BRANCH.


> Those number may already be used by _HIS_ branch and when he tries
> to get _YOUR_ branch.. there is a conflict.

Terminology again.  When he has his branch and gets my branches, he
has two branches, mine and his, side by side, and the numbers in his
'my' branch still correspond to the numbers in my 'my' branch.  When
he merges the REVISIONS from my branch into his, my numbers have no
meaning on his side (there's not a 'conflict' because numbers don't
get copied, they get derived).
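
The "derived, not copied" point can be illustrated with a small Python
sketch. It is a toy model, not bzr's implementation: it assumes a
revision graph stored as a dict from revision id to parent list, and
assumes a branch's numbers come from walking the first parent back from
the branch tip. All ids below are made up.

```python
def mainline_revnos(parents, tip):
    """Number the revisions on the first-parent chain ending at tip."""
    chain = []
    rev = tip
    while rev is not None:
        chain.append(rev)
        rev_parents = parents[rev]
        # Follow only the first parent; merged-in revisions are skipped.
        rev = rev_parents[0] if rev_parents else None
    chain.reverse()
    return {rev: number for number, rev in enumerate(chain, start=1)}


# Hypothetical history: "a" is shared, m1/m2 are on my branch,
# h1 is on his branch, and h2 is his merge of my m2.
parents = {
    "a": [],
    "m1": ["a"],
    "m2": ["m1"],
    "h1": ["a"],
    "h2": ["h1", "m2"],
}

mine = mainline_revnos(parents, "m2")       # my branch, or his copy of it
his = mainline_revnos(parents, "h2")        # his own branch after merging

print(mine)
print(his)
```

In this model a copy of my branch (same tip) always yields the same
numbers, while on his branch m1 and m2 simply have no mainline number.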


> So while you may not have seen a problem yourself,

You keep insisting that there's a PROBLEM here.  You're right, I don't
see one.  I KNOW the numbers only refer to a branch, I KNOW that when
you're talking about a different branch the numbers are meaningless,
and I'm perfectly fine with that because referring to revisions on *A*
branch is exactly what I USE the numbers for.

There doesn't have to be a 'central' branch, nor is there any wish for
such to be.  Any given revno only refers to *A* branch, it doesn't
have to be central to a darn thing.  HEAD in git only has meaning in
the context of *A* branch (and even 'worse', only refers to that
branch at a specific time[0]), but you'll keep on using it every day
anyway I wager.



[0] See again particular term of art "branch".


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22 13:23                                                       ` Jakub Narebski
@ 2006-10-22 14:11                                                         ` Erik Bågfors
  2006-10-22 14:39                                                           ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-22 14:11 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Andreas Ericsson, Carl Worth,
	Jan Hudec, git

On 10/22/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Erik Bågfors wrote:
> > Jakub Narębski wrote:
>
> >> For example git encourages using many short and longer-lived feature
> >> branches; I don't see bzr encouraging this workflow.
> >
> > Why not? I think it really does.  And due to the fact that merges are
> > merges and will show up as such, I think it's very suitable for
> > feature branches.
>
> I think I haven't properly explained what "feature branch" means.
> A "feature branch" is a short (or medium) lived branch, created for
> the development of one isolated feature. When the feature is in a
> stable stage, we merge the feature branch and forget about it. We are
> not interested in the fact that the given feature was developed on a
> given branch. BTW, in the published git.git repository, for example,
> feature branches are only available in the form of the "digest" 'pu'
> (proposed updates) branch.


That's what I'm talking about too.
For example, in my bzr bzr-repo I have
bzr.init-repo-tree/
bzr.aliases/
bzr.dev/

and others...
In bzr.aliases for example, I built the support for defining aliases
in the bzr config file. That was a unique feature that didn't exist in
any other branch.  The branch survived about 17 days before it was
merged into bzr.dev.  During that time, I merged in another branch
twice.  The branch I merged at this time was NOT bzr.dev, but rather
another branch, from one of the main developers.  The reason I merged
his branch was that I needed a bugfix (or two? :) ) that he had done,
but that wasn't approved in bzr.dev yet.

After a time, his branch was merged into bzr.dev, shortly thereafter,
so was my branch.

After my branch was merged, I forgot about it.  I still have it lying
around on my computer because it really doesn't take up any extra
space (since it's in a shared repository), but I really have forgotten
about it.

This is typically how all features in bzr are created.
Short/medium/long-lived feature branches.

/Erik

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
       [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
  2006-10-22 14:24                                                                               ` Sean
@ 2006-10-22 14:24                                                                               ` Sean
  2006-10-22 14:56                                                                               ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-10-22 14:24 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

On Sun, 22 Oct 2006 08:57:02 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> You keep insisting that there's a PROBLEM here.  You're right, I don't
> see one.  I KNOW the numbers only refer to a branch, I KNOW that when
> you're talking about a different branch the numbers are meaningless,
> and I'm perfectly fine with that because referring to revisions on *A*
> branch is exactly what I USE the numbers for.

Light goes on.  Okay.  So a bzr "branch" is only ever editable on a
single machine.  So there is no distributed development on top of a
bzr "branch".  Everyone else just has read-only copies of it.  In this
way you ensure that there is never a conflict of the revnos.  I'm not
sure of the ramifications of this but at least I get where you're coming
from now.

Sean

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22  9:56                                                     ` Erik Bågfors
  2006-10-22 13:23                                                       ` Jakub Narebski
@ 2006-10-22 14:25                                                       ` Carl Worth
  2006-10-22 14:48                                                         ` Erik Bågfors
                                                                           ` (2 more replies)
  1 sibling, 3 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-22 14:25 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Jakub Narebski, Jan Hudec, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git

At Sun, 22 Oct 2006 11:56:32 +0200, Erik Bågfors wrote:
> Consider the following
> bzr branch mainline featureA
> cd featureA
> hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m f2; etc
> Now I want to merge in mainline again
> bzr merge ../mainline; bzr commit -m merge
> hack hack; bzr commit -m f3; hack hack; bzr commit -m f4; etc

Thanks for sharing this example. I think when we look at concrete
things that the tools actually let you do, we have a better
conversation. Plus, this example highlights some very interesting
differences between the tools.

So here is a complete sequence of git commands to construct the
scenario (even the extra hacking in mainline):

	mkdir gittest; cd gittest
	git init-db
	touch mainline; git add mainline; git commit -m "Initial commit of mainline"
	git checkout -b featureA
	touch f1; git add f1; git commit -m f1
	touch f2; git add f2; git commit -m f2
	git checkout -b mainline master
	touch sd; git add sd; git commit -m "something done in mainline"
	touch se; git add se; git commit -m "something else done in mainline"
	git checkout featureA
	git pull . mainline
	touch f3; git add f3; git commit -m f3
	touch f4; git add f4; git commit -m f4

For reference, here's the same with bzr:

	mkdir bzrtest; cd bzrtest
	bzr init-repo . --trees
	bzr init mainline; cd mainline
	touch mainline; bzr add mainline; bzr commit -m "Initial commit of mainline"
	cd ..; bzr branch mainline featureA; cd featureA
	touch f1; bzr add f1; bzr commit -m f1
	touch f2; bzr add f2; bzr commit -m f2
	cd ../mainline/
	touch sd; bzr add sd; bzr commit -m "something done in mainline"
	touch se; bzr add se; bzr commit -m "something else done in mainline"
	cd ../featureA
	bzr merge ../mainline/; bzr commit -m "merge"
	touch f3; bzr add f3; bzr commit -m f3
	touch f4; bzr add f4; bzr commit -m f4

[As has recently been pointed out, the tools really are more the same
than different, and I think the above illustrates that.]

> Right now, I would have something like this in the branch log

OK. So here is a difference in the tools. With git, you don't get the
indentation for the "non-mainline" commits. This is because git
doesn't recognize any branch in the DAG to be more significant than
any other. Instead, git provides a flat, and (heuristically)
time-sorted view of the commits. (It's heuristic in that git just uses
the time stamps in the commit objects---but it doesn't actually care
if these are totally "wrong"---git knows that there is no global
clock.)

That said, git does store an order for the parent edges of each
commit, and this order is assigned deterministically by the commands
that create merge commits. So someone could use git carefully, (which
it seems people are doing with bzr), to preserve "mainline as first
parent" and someone could write a modified git-log that would do
indentation.

But even without any of that manual care for creating a "mainline",
git already provides a very easy way to see the "mainline" view
anyway. See below.

> In this view, I can easily see what was part of this feature branch,
> because the commits that belong to the feature branch are not
> indented, and they have a "branch nick" of "featureA".  I can also
> easily see what comes from other branches.

Ah, I hadn't realized that bzr commits stored an "originating branch"
inside them. Git commits definitely do not have anything like
that. And as I said above, there's no indentation in git-log, so the
commits from separate branches are "mixed up". But see below.

> I can also run bzr log with --line or --short which shows you only the
> commits made in this branch and not the ones that are merged in.  So
> with --line I would get something like
> Erik Bågfors 2006-10-19 f4
> Erik Bågfors 2006-10-19 f3
> Erik Bågfors 2006-10-19 merge
> Erik Bågfors 2006-10-19 f2
> Erik Bågfors 2006-10-19 f1
>
> Which will give me a good view of what has been done in this feature
> branch only.

Thank you. You've provided a concrete example of something to do,
("see commits that belong to a feature branch"), that is really very
practical and useful. And bzr achieves this ability by adopting a
"mainline is special" treatment. This special treatment
influences or directly causes many of the things in bzr that we've
been discussing:

 * mainline commits get special treatment from revision numbers
   (in the old days, they were the only commits to have revision
   numbers---more recently they're the only commits to get non-dotted
   revision numbers)

 * bzr adds empty merge commits instead of fast-forwarding since it
   needs a new "mainline" commit

 * users have to be careful about merge direction to avoid
   accidentally going the "wrong" way

 * users are discouraged from using the "give me their DAG" pull
   command since it would scramble their local view of what "mainline"
   is.
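
The fast-forward case in the second bullet can be sketched in Python.
This is a toy model under stated assumptions (a dict from commit id to
parent list, made-up ids), not code from either tool: a fast-forward is
possible exactly when our tip is already an ancestor of their tip, so
nothing of ours would be lost by just moving to their tip.

```python
def ancestors(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen


def can_fast_forward(parents, ours, theirs):
    """True when ours is contained in the history of theirs."""
    return ours in ancestors(parents, theirs)


# Hypothetical graph: a <- b <- c is a straight line, x forks off a.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}

print(can_fast_forward(parents, ours="b", theirs="c"))
print(can_fast_forward(parents, ours="x", theirs="c"))
```

When the check succeeds, no new commit is needed to represent the
result; recording a merge commit anyway is the choice described in the
bullet above.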

I've been arguing that all of these impacts are dubious. But I can
understand that a bzr user hearing arguments against them might fear
that they would lose the ability to be able to see a view of commits
that "belong" to a particular branch.

But git provides that view perfectly well, and it's what git users
work with all the time. It doesn't require any special treatment of
one commit parent vs. another, nor storage of "originating branch" in
the commit, nor the user taking any care whatsoever about which
direction merges are performed, (nor "who" does the merge).

And as a bonus, the command-line for this view is really simple:

	git log mainline..featureA

This gives a log view just like "bzr log --line" in that it only
includes f1, f2, the merge commit, f3, and f4. You can even drop the
merge if it's uninteresting:

	git log --no-merges mainline..featureA

The mainline..featureA syntax literally just means:

	the set of commits that are reachable by featureA
	and excluding the set of commits reachable by mainline

It's an extraordinarily powerful thing to say, and it's exactly what
you want here. And it's more than a "show mainline" thing, since
these sets of commits can consist of arbitrarily complex DAG
subsets. This syntax is just a really useful way to slice up the DAG.
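
The set definition above can be written out as a short Python sketch.
It is a model of the semantics only, not git's implementation; the
graph is a hand-built dict mirroring the example scenario, with commit
messages standing in for hashes.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen


def commit_range(parents, exclude, include):
    """Model of 'exclude..include': reachable from include, not exclude."""
    return reachable(parents, include) - reachable(parents, exclude)


def without_merges(parents, commits):
    """Model of --no-merges: drop commits with more than one parent."""
    return {rev for rev in commits if len(parents[rev]) <= 1}


# The scenario from the command sequences above.
parents = {
    "init": [],
    "f1": ["init"], "f2": ["f1"],
    "sd": ["init"], "se": ["sd"],
    "merge": ["f2", "se"],
    "f3": ["merge"], "f4": ["f3"],
}
branches = {"mainline": "se", "featureA": "f4"}

selected = commit_range(parents, branches["mainline"], branches["featureA"])
print(sorted(selected))
print(sorted(without_merges(parents, selected)))
```

Nothing in the model needs to know which parent of the merge is
"mainline"; the result depends only on the two tips.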

And this syntax is almost universally accepted by git commands, so you
can visualize a chunk of the DAG with:

	gitk mainline..featureA

Or export it as patches with:

	git format-patch mainline..featureA

I haven't been able to find something similar in bzr yet. Does it
exist?

> If I understand it correctly, in git, you don't really know what has
> been committed as part of this branch/repo, and what has been
> committed in another branch/repo (this is my understanding from
> reading this thread, I might be wrong, feel free to correct me again
> :) )

You're correct that git doesn't _store_ any sort of "branch ownership"
in the commit object. But this is a huge feature. It avoids a lot of
the things in bzr that look so bizarre to people coming from git.

-Carl

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 18:20                                                 ` Jakub Narebski
@ 2006-10-22 14:27                                                   ` Matthieu Moy
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-22 14:27 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, bazaar-ng, Jeff King, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Jakub Narebski <jnareb@gmail.com> writes:

> What about grandparent of commit (d8a60^^ or d8a60~2 in git),
> or choosing one of the parents in merge commit (d8a60^2 is second
> parent of a commit)? before:before:753 ?

Yes, "before:" can take any revision specifier, including
"before:something-else".

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:11                                                         ` Erik Bågfors
@ 2006-10-22 14:39                                                           ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 14:39 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Jan Hudec, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Erik Bågfors wrote:
> On 10/22/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> Erik Bågfors wrote:
>>> Jakub Narębski wrote:
>>
>>>> For example git encourages using many short and longer-lived feature
>>>> branches; I don't see bzr encouraging this workflow.
>>>
>>> Why not? I think it really does.  And due to the fact that merges are
>>> merges and will show up as such, I think it's very suitable for
>>> feature branches.
>>
>> I think I haven't properly explained what "feature branch" means.
>> A "feature branch" is a short (or medium) lived branch, created for
>> the development of one isolated feature. When the feature is in a
>> stable stage, we merge the feature branch and forget about it. We are
>> not interested in the fact that the given feature was developed on a
>> given branch. BTW, in the published git.git repository, for example,
>> feature branches are only available in the form of the "digest" 'pu'
>> (proposed updates) branch.
> 
> That's what I'm talking about too.
> For example, in my bzr bzr-repo I have
> bzr.init-repo-tree/
> bzr.aliases/
> bzr.dev/

Because git uses a separate namespace for branch names, rather than
position in the filesystem, one would probably use 'dev' (or 'master',
or perhaps 'next'), 'aliases' and 'init-repo-tree' as branch names.
There is no need for a 'bzr.' prefix to distinguish branches from
other directories.

Git does use a convention like the above for bare repositories, though
(clones of repositories without a working tree; the working tree is
associated with the repository, not with a branch), e.g. git.git or
linux-2.6.18.y.git.

> and others...
> In bzr.aliases for example, I built the support for defining aliases
> in the bzr config file. That was a unique feature that didn't exist in
> any other branch.  The branch survived about 17 days before it was
> merged into bzr.dev.  During that time, I merge in another branch
> twice.  The branch I merged at this time was NOT bzr.dev, but rather
> another branch, from one of the main developers.  The reason I merged
> his branch was that I needed a bugfix (or two? :) ) that he had done,
> but that wasn't approved in bzr.dev yet.

That is also quite common: merging 'master' into a feature branch,
or 'next' into a feature branch. One could of course cherry-pick
only the bugfix... can you do this in bzr?

> After a time, his branch was merged into bzr.dev, shortly thereafter,
> so was my branch.
> 
> After my branch was merged, I forgot about it.  I still have it laying
> around on my computer because it really doesn't take up any extra
> space (since it's in a shared repository), but I really have forgotten
> about it.

Usually after a feature branch is merged (or fast-forwarded) we delete
it. All the parentage information is in the DAG anyway. We can later
attach a new branch with the same name to the point where the branch was.

> This is typically how all features in bzr are created.
> Short/medium/long-lived feature branches.

As I said, in git.git development we use development branches
(e.g. 'master', 'maint', 'next'), tracking branches (e.g. 'origin',
'linus'), feature branches (e.g. 'jc/pickaxe', 'np/pack'), "helper"
branches storing content that is somewhat unrelated ('html' and 'man'
branches for autogenerated documentation) or entirely unrelated ('todo'
for TODO notes) to the code of the main project, "digest" branches
(e.g. the 'pu' branch in git.git, which is a merge of WIP feature
branches to be published, and does not fast-forward), and temporary
branches (for example for shelving current work).

From long, to medium, to short, to extremely short lived.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:25                                                       ` Carl Worth
@ 2006-10-22 14:48                                                         ` Erik Bågfors
  2006-10-22 15:04                                                           ` Jakub Narebski
  2006-10-22 14:55                                                         ` Jakub Narebski
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2 siblings, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-22 14:48 UTC (permalink / raw)
  To: Carl Worth
  Cc: Jakub Narebski, Jan Hudec, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git

Thanks for this mail; it makes me happy to see. The tools are pretty
much the same but have somewhat different views on how to do things.

On 10/22/06, Carl Worth <cworth@cworth.org> wrote:
>
>         git log --no-merges mainline..featureA
>
> The mainline..featureA syntax literally just means:
>
>         the set of commits that are reachable by featureA
>         and excluding the set of commits reachable by mainline
>
> It's an extraordinarily powerful thing to say, and it's exactly what
> you want here. And it's more than a "show mainline" thing, since
> these sets of commits can consist of arbitrarily complex DAG
> subsets. This syntax is just a really useful way to slice up the DAG.
>
> And this syntax is almost universally accepted by git commands, so you
> can visualize a chunk of the DAG with:
>
>         gitk mainline..featureA
>
> Or export it as patches with:
>
>         git format-patch mainline..featureA
>
> I haven't been able to find something similar in bzr yet. Does it
> exist?

If I understand you correctly, you'll get the same thing with "bzr missing".

$ bzr missing ../mainline/
You have 1 extra revision(s):
------------------------------------------------------------
revno: 2
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: newbranch
timestamp: Sun 2006-10-22 16:43:10 +0200
message:
  hepp


You are missing 1 revision(s):
------------------------------------------------------------
revno: 2
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: mainline
timestamp: Sun 2006-10-22 16:42:53 +0200
message:
  hej

You can also run "bzr missing" with "--theirs-only" or "--mine-only"
to get only one way.
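
Read as set operations, the two halves of that output are the two
one-sided differences between the branches. The Python sketch below is
a model under assumptions, not bzr's code: the graph is a dict from
revision id to parent list, and the ids reuse the commit messages from
the output above ("r1" is a made-up id for the shared first revision).

```python
def reachable(parents, tip):
    """Return every revision reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen


def missing(parents, mine, theirs):
    """Return (extra on my side, missing on my side) as two sets."""
    my_revs = reachable(parents, mine)
    their_revs = reachable(parents, theirs)
    return my_revs - their_revs, their_revs - my_revs


# Both branches share r1; each then gained one revision of its own.
parents = {"r1": [], "hepp": ["r1"], "hej": ["r1"]}

extra, lacking = missing(parents, mine="hepp", theirs="hej")
print("You have", len(extra), "extra revision(s):", sorted(extra))
print("You are missing", len(lacking), "revision(s):", sorted(lacking))
```

Keeping only the first or only the second set corresponds to the
one-directional views mentioned above.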

To get the patches you can run "bzr bundle ../mainline", but then
we're back to the discussion that it currently gives a "big patch" for
viewing, but when you merge it, you get each revision separately.

/Erik

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:25                                                       ` Carl Worth
  2006-10-22 14:48                                                         ` Erik Bågfors
@ 2006-10-22 14:55                                                         ` Jakub Narebski
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 14:55 UTC (permalink / raw)
  To: Carl Worth
  Cc: Erik Bågfors, Jan Hudec, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git

Carl Worth wrote:
> Erik Bågfors wrote:
>> If I understand it correctly, in git, you don't really know what has
>> been committed as part of this branch/repo, and what has been
>> committed in another branch/repo (this is my understanding from
>> reading this thread, I might be wrong, feel free to correct me again
>> :) )
> 
> You're correct that git doesn't _store_ any sort of "branch ownership"
> in the commit object. But this is a huge feature. It avoids a lot of
> the things in bzr that look so bizarre to people coming from git.

Because "branch ownership" is obviously local, we have the reflog, which is
local and not propagated. The reflog uses the following format:

 oldsha1 SP newsha1 SP committer TAB reason LF

where reason might be "commit: <commit description/title/subject>"
or "commit (amend): <commit description>", "am: <commit 
description>" (applied mail patch), "reset --hard HEAD^" (dropped
top commit), "branch: Created from origin^0", or "pull origin: In-index 
merge".

We do not yet have tools to examine the reflog (e.g. to change the committer
info with its timestamp into a human-readable format).
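
Until such a tool exists, a few lines of shell can split a reflog line
into its fields. This is only a sketch: parse_reflog_line is a made-up
helper name, the sample line is invented, and it assumes exactly the
format given above (two SHA-1s, the committer, a TAB, then the reason).

```shell
#!/bin/sh
# Split one reflog line of the form
#   oldsha1 SP newsha1 SP committer TAB reason LF
# into labelled fields. parse_reflog_line is a hypothetical helper.
parse_reflog_line() {
	line=$1
	tab=$(printf '\t')
	reason=${line#*"$tab"}    # everything after the first TAB
	head=${line%%"$tab"*}     # everything before the first TAB
	old=${head%% *}           # first space-separated word
	rest=${head#* }
	new=${rest%% *}           # second space-separated word
	committer=${rest#* }      # name, email, timestamp and timezone
	printf 'old: %s\nnew: %s\nwho: %s\nwhy: %s\n' \
		"$old" "$new" "$committer" "$reason"
}

# Invented sample line, not taken from a real repository.
sample=$(printf '%s %s %s\t%s' \
	0000000000000000000000000000000000000000 \
	1111111111111111111111111111111111111111 \
	'A U Thor <author@example.com> 1161528900 +0200' \
	'commit: hepp')
parse_reflog_line "$sample"
```

The committer field still carries the raw timestamp; converting it to a
readable date is left out because the date(1) options for that differ
between systems.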
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
       [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
  2006-10-22 14:24                                                                               ` Sean
  2006-10-22 14:24                                                                               ` Sean
@ 2006-10-22 14:56                                                                               ` Matthew D. Fuller
  2006-10-22 15:05                                                                                 ` Matthieu Moy
  2 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 14:56 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 10:24:54AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> Light goes on.  Okay.  So a bzr "branch" is only ever editable on a
> single machine.  So there is no distributed development on top of a
> bzr "branch".  Everyone else just has read-only copies of it.

Ah!  Yes, that's exactly[0] right.  Mark up another of those "so
obvious we never think to state it" thought-patterns   :|


Distributed development proper only happens on 'projects', not
branches.  In practice, we say "we're all working on branch X", in the
sense that we use it as a base to work from and intend to merge our
stuff into it, but strictly speaking we're all working on our own
branches that just merge from/into X from time to time.

That's also why we use the phrases "merge from" and "merge to", rather
than "merge WITH".  Of course, where possible, we could 'fast-forward'
to X rather than merge from it, at which point we'd then momentarily
have exactly X, but culturally we don't seem to like doing that.



[0] There are a few very special-case exceptions, notably around the
'checkout' concept or where people are very carefully manually
maintaining sync, but they're irrelevant in this case; and they ARE
star-pattern developments that could be said to be 'centralized'.  Now
I grok where that's coming from.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-22 14:48                                                         ` Erik Bågfors
@ 2006-10-22 15:04                                                           ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 15:04 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Carl Worth, Jan Hudec, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git

Erik Bågfors wrote:

> On 10/22/06, Carl Worth <cworth@cworth.org> wrote:
>>
>>         git log --no-merges mainline..featureA
>>
>> The mainline..featureA syntax literally just means:
>>
>>         the set of commits that are reachable by featureA
>>         and excluding the set of commits reachable by mainline
>>
[...]
>> And this syntax is almost universally accepted by git commands, so you
>> can visualize a chunk of the DAG with:
>>
>>         gitk mainline..featureA
>>
>> Or export it as patches with:
>>
>>         git format-patch mainline..featureA
>>
>> I haven't been able to find something similar in bzr yet. Does it
>> exist?
> 
> If I understand you correctly, you'll get the same thing with "bzr missing".
> 
> $ bzr missing ../mainline/
> You have 1 extra revision(s):
> ------------------------------------------------------------
> revno: 2
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: newbranch
> timestamp: Sun 2006-10-22 16:43:10 +0200
> message:
>   hepp
> 
> 
> You are missing 1 revision(s):
> ------------------------------------------------------------
> revno: 2
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: mainline
> timestamp: Sun 2006-10-22 16:42:53 +0200
> message:
>   hej

That is (roughly) the equivalent of
  $ git log mainline...featureA
(which would give all commits that are in _either_ mainline or featureA
but not in both, although not separated; --topo-order might help), or
  $ git show-branch mainline featureA

> You can also run "bzr missing" with "--theirs-only" or "--mine-only"
> to get only one way.

That would be the equivalent of
  $ git log mainline..featureA
(--theirs-only), or
  $ git log featureA..mainline
(--mine-only).
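
To see the three ranges side by side, here is a small sketch that builds
a throwaway repository with one commit on each side of a common base and
counts what each range selects. The branch names follow the example
above; make_demo_repo, demo_commit and count are helper names invented
for this illustration, and the commits are empty on purpose.

```shell
#!/bin/sh
# Create a scratch repository in which mainline and featureA each have
# one commit the other lacks. All names here are for illustration only.
demo_commit() {
	git -c user.name=demo -c user.email=demo@example.com \
		commit -q --allow-empty -m "$1"
}

make_demo_repo() {
	dir=$(mktemp -d) && cd "$dir" || return 1
	git init -q . &&
	git symbolic-ref HEAD refs/heads/mainline &&
	demo_commit base &&
	git branch featureA &&
	demo_commit hej &&              # only reachable from mainline
	git checkout -q featureA &&
	demo_commit hepp                # only reachable from featureA
}

# Number of commits selected by a revision range.
count() { git rev-list "$@" | wc -l | tr -d ' '; }
```

After running make_demo_repo (which leaves you inside the scratch
directory), "count mainline..featureA" and "count featureA..mainline"
should each report one commit, and "count mainline...featureA" should
report both of them.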

> To get the patches you can run "bzr bundle ../mainline", but then
> we're back to the discussion that it currently gives a "big patch" for
> viewing, but when you merge it, you get each revision separately.

What about
  $ gitk mainline..featureA
i.e. showing selected part of DAG in graphical history viewer?

And of course the syntax is even more powerful, e.g.
  $ git log maint master --not next
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-22 14:56                                                                               ` Matthew D. Fuller
@ 2006-10-22 15:05                                                                                 ` Matthieu Moy
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-22 15:05 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Sean, bazaar-ng, git

"Matthew D. Fuller" <fullermd@over-yonder.net> writes:

> On Sun, Oct 22, 2006 at 10:24:54AM -0400 I heard the voice of
> Sean, and lo! it spake thus:
>> 
>> Light goes on.  Okay.  So a bzr "branch" is only ever editable on a
>> single machine.  So there is no distributed development on top of a
>> bzr "branch".  Everyone else just has read-only copies of it.
>
> Ah!  Yes, that's exactly[0] right.  Mark up another of those "so
> obvious we never think to state it" thought-patterns   :|

Well, I'm not sure you are both still talking about the same thing.
Adding my 2 cents:

If ~/branch1 is a branch, I can get a read-write "copy" of it with

$ bzr branch ~/branch1 ~/branch2

which will roughly be equivalent to

$ cp -r ~/branch1 ~/branch2

Whether they are at this point "the same branch" or "two distinct
branches with same content" is just a matter of vocabulary since there
is no real "branch identity" AFAIK in bzr.

Now, if you commit in ~/branch1, then ~/branch2 is out of date with
it. If you commit also to ~/branch2, then you get two divergent
branches.

(and obviously, I could have done the same with branches in different
machines)

-- 
Matthieu


* Re: VCS comparison table
  2006-10-16  3:53   ` Martin Pool
@ 2006-10-22 15:50     ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 15:50 UTC (permalink / raw)
  Cc: bazaar-ng, git

On 14 Oct 2006, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
> 
>> It refers to this comparison chart between source control systems.
>> http://bazaar-vcs.org/RcsComparisons
> 
> It is quite obvious that comparison of programs of given type (SCM)
> on some program site (Bazaar-NG) is usually biased towards said program,
> perhaps unconsciously: by emphasizing the features which were important
> for developers of said program.

There are also clashes in SCM terminology, which different projects use
differently; sometimes these are coupled with differences in philosophy,
and sometimes with a different understanding of a given name.

For example, "lightweight checkouts" and "normal/heavyweight checkouts",
from what I gather, support the "CVS/centralized model" and the
"disconnected CVS model" respectively (i.e. we can commit changes locally
with no network access, and we save local changes), at least when we
do the "checkout" remotely and not on one local filesystem out-of-the-box.


* Re: VCS comparison table
  2006-10-21 23:49                                                             ` Carl Worth
  2006-10-22  0:07                                                               ` Jeff Licquia
@ 2006-10-22 16:02                                                               ` Petr Baudis
  2006-10-25  9:52                                                               ` Andreas Ericsson
  2 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-22 16:02 UTC (permalink / raw)
  To: Carl Worth; +Cc: Jeff Licquia, Jakub Narebski, bazaar-ng, git

Dear diary, on Sun, Oct 22, 2006 at 01:49:04AM CEST, I got a letter
where Carl Worth <cworth@cworth.org> said that...
> Almost none of the power of git is exposed by gitweb. It's really not
> worth comparing. (Now a gitweb-alike that provided all the kinds of
> very easy browsing and filtering of the history like gitk and git
> might be nice to have.)

http://repo.or.cz/git-browser/by-commit.html?r=linux-2.6.git

It could use plenty of improvement, though.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-22  7:49                                                         ` Tim Webster
@ 2006-10-22 17:12                                                           ` Linus Torvalds
  2006-10-23  5:19                                                             ` Matthew Hannigan
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-22 17:12 UTC (permalink / raw)
  To: Tim Webster; +Cc: git, Aaron Bentley, bazaar-ng, Jakub Narebski



On Sun, 22 Oct 2006, Tim Webster wrote:

> On 10/22/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > 
> > > project/file-1
> > > project/file-2
> > > project/.git-1
> > > project/.git-2
> > 
> > Ok, that's just insane.
> [snip]
> > Anyway. Git certainly allows you to do some really insane things. The
> > above is just the beginning - it's not even talking about alternate object
> > directories where you can share databases _partially_ between two
> > otherwise totally independent repositories etc.
> 
> 
> Perhaps this is insane, but it does not make sense to track all config
> files in etc as though they belong in a single repo.

Oh, ok, now I see what you're going after.

Right - if you track system directories in a repo, you'd quite possibly 
end up with multiple repositories. Although even then, I'd actually 
suggest that as a git user, you would only have one actual repository, and 
just multiple branches that have a disjoint set of files (again, it's 
certainly possible to have file overlap too, of course).

But the usage I would seriously suggest is to _not_ do development "inside 
/etc" itself. You'd have those git repositories somewhere else, say in 
"/usr/src/etc-repo" or similar, and then you'd have a few extra wrappers 
to help your particular usage. I have a few reasons for this:

 - I think being in /etc and doing development is just fundamentally scary 
   in itself, because if you do something wrong in the current directory, 
   you're just pretty badly off. It's better to have a "buffer zone" that 
   you do development in, and when you're happy, you do an "install" 
   command or something.

 - I think developing as "root" is totally broken, and some of the files 
   you are tracking may not even be _readable_ to normal users in their 
   real form, so you can't even do trivial things like "diff" as a normal 
   user otherwise. So again, the solution to this would be to do 
   development somewhere else, and have specific wrappers (with "sudo" as 
   appropriate, and your developer ID obviously specially in the sudo 
   files) to do those special "realdiff" and "install" commands.

 - finally: when you work with almost any SCM designed for source control, 
   you're almost inevitably going to have to have some "special" way to 
   track the things that source control usually does _not_ track because 
   it makes no sense for source code. So you'd have to have some special 
   file that tracks ownership/group/full permissions information, and 
   perhaps special devices (if you're tracking things like /dev).

   Again, the way to solve this would tend to be to have a few helper 
   scripts that use regular file-contents that _describe_ these things to 
   do "realdiff" and "install".

In other words, for at least three _totally_ different reasons, you really 
don't want to do tracking/development directly in /etc, but you want to 
have a buffer zone to do it. And once you have that, you might as well do 
_that_ as the repository, and just add a few specialty commands (let's 
call them "plugins" to make everybody happy) to do the special things.

And once you have that kind of setup, you're really better off with 
more of a "several branches for different kinds of files" or even totally 
different repositories. That's a detail, and I don't think anybody really 
cares.

Anyway, to make this slightly more grounded in examples, let me give a 
quick overview of what I'd do if I did this with git. Not a "real" setup 
at all, but kind of a "maybe something like this" - so don't get _too_ 
hung up about the details, ok? It's just a rough draft kind of thing.

First off, let's just say that I want to track /etc/group, /etc/passwd and 
/etc/shadow as one "thing". Whether that thing is a repository of its own 
or a branch in a bigger repository doesn't matter (right now I'm only 
doing those three), and quite frankly, I'm not going to even go into 
whether it _really_ makes sense to track "groups" and the passwd files 
together, but it's just an example, ok?

What I'd do is roughly:

	# set up the new repo (or branch, or..)
	mkdir identity-repo
	cd identity-repo
	git init-db

	# copy the data, set up a PERMISSIONS file to track extra info
	sudo cp /etc/group /etc/passwd /etc/shadow .
	sudo chown user:user *
	cat <<EOF > PERMISSIONS
	group root:root 0644
	passwd root:root 0644
	shadow root:root 0400
	EOF
	git add .
	git commit -m "Initial setup"

and now I have the initial setup, together with permissions and user/group 
information on the things, all ready to track. I can do development in 
this as if it was a normal source-code repository.

So now I can do "work work work commit commit commit" as if these files 
were nothing special. What else do I need? I need the "plugins" to 
actually expose (install) my work, and perhaps to check that /etc matches 
what I expect (and nobody else did anything behind my back that I'd need 
to merge).

Let's call them "install" and "realdiff" as I did above, ok?

And again, I'm not going to even claim that the above two "plugins" are 
the right ones (maybe you want other operations too to interact with the 
"real" installed files), and I'm not going to really get all the details 
right, but here's kind of how you _might_ do it.

Create the script (let's make it shell, because that's what I'm used 
to, but it could be anything) "git-install" in your git binary directory, 
and make it do something like this:

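	#!/bin/sh
	# For each "name owner:group mode" line in PERMISSIONS: stage a copy,
	# fix up its ownership and mode, then move it into place in /etc.
	while read -r name chown chmod
	do
		cp "$name" "$name.tmp" &&
		sudo chown "$chown" "$name.tmp" &&
		sudo chmod "$chmod" "$name.tmp" &&
		sudo mv "$name.tmp" "/etc/$name"
	done < PERMISSIONS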

and make it executable.

Now, you can work in your git directory, and when you're happy, you can do

	git install

to actually copy it into the _real_ directory in /etc.

See? You can do something similar for "realdiff", that would compare the 
contents in /etc with what you have now in your development tree (where 
you want to script the thing to compare the PERMISSIONS file too).
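
A "realdiff" along those lines might look like the sketch below. Again,
don't get hung up on the details: it only compares file contents (the
owner and mode columns of PERMISSIONS are read but not checked), and
DEST and SUDO are variables invented for this example so that it can be
tried against a scratch directory without root.

```shell
#!/bin/sh
# git-realdiff (sketch): show how the installed files differ from the
# working copies listed in PERMISSIONS. Returns non-zero if any file
# differs or is missing. DEST and SUDO are hypothetical knobs:
#   DEST  directory holding the installed files (default /etc)
#   SUDO  command used to gain read access (default sudo, may be empty)
realdiff() {
	dest=${DEST:-/etc}
	status=0
	while read -r name owner mode
	do
		# owner and mode are parsed but deliberately not compared here
		${SUDO-sudo} diff -u "$dest/$name" "$name" </dev/null || status=1
	done < PERMISSIONS
	return "$status"
}
```

To try it without touching the real /etc, run it from inside the
repository as "SUDO= DEST=/some/scratch/dir realdiff".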

And note: if you do the "plugin scripts" properly, they can work for _all_ 
your repositories that track different files in /etc. So you can work in 
many different repos, and track different files in each, and "git install" 
will do the right thing for each, regardless of the actual files you're 
tracking.

Doesn't this sound like a workable situation? You get all the normal SCM 
tools (looking at history etc), and there's only a few special things you 
need to do when you actually want to install a specific version.

Btw: none of this is really "git-specific". The above tells you how to do 
local "git plugins", and it's obviously fairly trivial, but I suspect any 
SCM can be used in this manner.

		Linus


* Re: VCS comparison table
  2006-10-22 14:25                                                       ` Carl Worth
  2006-10-22 14:48                                                         ` Erik Bågfors
  2006-10-22 14:55                                                         ` Jakub Narebski
@ 2006-10-22 18:53                                                         ` Matthew D. Fuller
  2006-10-22 19:27                                                           ` Jakub Narebski
                                                                             ` (2 more replies)
  2 siblings, 3 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 18:53 UTC (permalink / raw)
  To: Carl Worth; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

On Sun, Oct 22, 2006 at 07:25:41AM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
>
> 	git pull . mainline

This throws me a little.  I'd expect it to Just Do It when it's
fast-forwarding, but if it's doing a merge, I'd prefer it to stop and
wait before creating the commit, even if there are no textual
conflicts.  I realize you can just look at it afterward and back out
the commit if necessary, but still...


> Ah, I hadn't realized that bzr commits stored an "originating
> branch" inside them.

Every branch has a nickname, settable with 'bzr nick' (defaulting to
whatever the directory it's in is), and that's stored as a text field
in each commit.  It's mostly cosmetic, but it's handy to see at a
glance.


> This special treatment influences or directly causes many of the
> things in bzr that we've been discussing:
  [...]
> I've been arguing that all of these impacts are dubious. But I can
> understand that a bzr user hearing arguments against them might fear
> that they would lose the ability to be able to see a view of commits
> that "belong" to a particular branch.

Dead center.


> The mainline..featureA syntax literally just means:
> 
> 	the set of commits that are reachable by featureA
> 	and excluding the set of commits reachable by mainline

From what I can gather from this, though, that means that when I merge
stuff from featureA into mainline (and keep on with other stuff in
featureA), I'll no longer be able to see those older commits from this
command.  And I'll see merged revisions from branches other than
mainline (until they themselves get merged into mainline), correct?
It sounds more like a 'bzr missing --mine-only' than looking down a
mainline in log...


> I haven't been able to find something similar in bzr yet. Does it
> exist?

The branch: (head) and ancestor: (latest common rev) revspecs let you
refer to the respective bits of other branches, which I think would
fill this role.


> It avoids a lot of the things in bzr that look so bizarre to people
> coming from git.

Well, what would be the fun in that?   8-}


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.


* Re: VCS comparison table
  2006-10-21 19:41                                                     ` Jakub Narebski
@ 2006-10-22 19:18                                                       ` David Clymer
  2006-10-22 19:57                                                         ` Jakub Narebski
  2006-10-22 20:06                                                         ` Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: David Clymer @ 2006-10-22 19:18 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git


On Sat, 2006-10-21 at 21:41 +0200, Jakub Narebski wrote:
> Matthew D. Fuller wrote:
> > On Sat, Oct 21, 2006 at 04:08:18PM +0200 I heard the voice of
> > Jakub Narebski, and lo! it spake thus:
> >> Dnia sobota 21. października 2006 15:01, Matthew D. Fuller napisał:
> 
> >> When two clones of the same repository (in git terminology), or two
> >> "branches" (in bzr terminology), used by different people, cannot be
> >> totally equivalent that is centralization bias.
> > 
> > This is obviously some new meaning of "centralization" bearing no
> > resemblance whatsoever to how I understand the word.
> 
> Perhaps I'd better use "star topology bias" instead of "centralization
> bias".
>  
> > In git, apparently, you don't give a crap about a branch's identity
> > (alternately expressible as "it has none"), and so you throw it away
> > all the time.  Given that, revnos even if git had them would never be
> > of ANY use to you, so it's no wonder you have no use for the notion.
> 
> In git branches are lightweight. Branch names are local to repository.
> Repositories have identity. Bzr "branch" is strange mix of one-branch
> git repository and git branch.
> 
> Git main workflow is fully decentralized workflow. All clones of the
> same repository are created equal. In bzr the suggested workflow
> (with revnos) forces one (or more) branches to be mainline (use "merge",
> get empty-merges, revnos don't change) and leaf (use "pull", revnos
> change).
>  
> > I DO give a crap about my branches' identities.  I WANT them to retain
> > them.  If I have 8 branches, they have 8 identities.  When I merge one
> > into another, I don't WANT it to lose its identity.  When I merge a
> > branch that's a strict superset of second into that second, I don't
> > WANT the second branch to turn into a copy of the first.  If I wanted
> > that, I'd just use the second branch, or make another copy of it.  I
> > don't WANT to copy it.  I just want to merge the changes in, and keep
> > on with my branch's current identity.
> 
> I don't understand. If I merge 'next' branch into 'master' in git, I 
> still have two branches: 'master' and 'next'.
> 
> And I don't understand why you are so hung on branch identities. Yes, if
> somebody clones your 'repo' repository, he can have your 'master' branch
> (refs/heads/master) named 'repo' (refs/heads/repo) or 'repo/master'
> (refs/remotes/repo/master), but why does that matter to you? It is _his_
> (or her ;-) clone. 
> 

I think you missed the point. Speaking for myself, I want to maintain
the identity of _my_ branches. If you clone one of them, I _don't_ care.
That's your branch. Branch identity as presented here is not intended to
be globally significant. It's locally significant.

> > Now, we can discuss THAT distinction.  I'm not _opposed_ to git's
> > model per se, and I can think of a lot of cases where it'd be really
> > handy.  But those aren't most of my cases.  And as long as we don't
> > agree on branch identity, it's completely pointless to keep yakking
> > about revnos, because they're a direct CONSEQUENCE of that difference
> > in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
> > have revnos, I'd STILL want my branch to keep its identity.  You could
> > name the mainline revisions after COLORS if you wanted, and I'd still
> > want my branch to keep its identity.  Aren't we through rehashing the
> > same discussion about the EFFECTS?
> 
> For revnos to work you MUST have one "branch" to be considered
> special, the hub in star topology. This very much precludes fully
> distributed development. 
> 
> BTW. I get that you can use revids in revnos in bzr for fully
> distributed and not star-topology geared development. But
> Bazaar-NG revids are uglier than Git commit-ids.

OK, just to clarify what you are saying here: 

1. revnos don't work because they don't serve the same purpose as revids
or git's SHA1 commit ids.

2. bzr does not support fully distributed development because revnos
"don't work" as stated in #1.

3. Ok, bzr does support distributed development, I just say it doesn't
because I think revids are ugly.

Thus, revids are ugly.

Is this really the argument you want to be making? I'm not disagreeing
with you; it's just that I'm not sure it's relevant.

Can we just put the whole "revnos don't work" thing to rest?

Revnos are only intended to be significant relative to a given branch.
They are not intended to serve as an absolute, global identifier.

Revnos + a url _are_ globally significant, but are not static except in
certain topologies.

Revids are globally significant and static in any topology.

If a user does not like or cannot use revnos, they may use revids.
Revnos are not a tool to be used for every job. In no way does that mean
that they are broken.

If a given developer or group of developers primarily use revnos or
revids, it _may_ indicate that _they_ have a bias towards central (or
star) or distributed development, but does not necessarily have any
bearing on the capability of the VCS being used.

> 
> [...]
> >> And you say that bzr is not biased towards centralization? In git
> >> you can just pull (fetch) to check if there were any changes, and if
> >> there were not you don't get useless marker-merges.
> > 
> > If I don't tell you my branch has something in it ready to grab, you
> > shouldn't merge it.  It probably won't work, and is quite likely to
> > set your computer on fire, slaughter and fillet your pet goldfish, and
> > make demons fly out of your nose.  If you wanna get stuck with all my
> > incomplete WIP, let's just use a CVS module and be done with it.
> 
> In git I can fetch your changes but I don't need to merge them. Take
> for example Junio 'pu' (proposed updates) branch: this is the branch
> you shouldn't merge as its history is constantly being rewritten.
> 
> If you don't want for your WIP to be publicly available, you don't
> publish it. For example as far as I understand Junio works on Git
> in his private repository, with many, many feature branches, but
> he does push to public [bare] repository only some subset of branches,
> and we can fetch/pull only those.
> 
> But still, if I am impatient I can pull from Junio every hour, and
> I don't get 24 totally useless empty merge messages if he took day
> off and didn't publish any changes till day later.
> 
> >> 2. But the preferred git workflow is to have two branches in each of
> >> two clones. The 'origin' branch where you fetch changes from other
> >> repository (so called "tracking branch") and you don't commit your
> >> changes to [...]
> > 
> > Funny, since this reads to me EXACTLY like the bzr flow of "upstream
> > branch I pull" and "my branch I merge from upstream" that's getting
> > kvetched around...
> 
> But please, have you realized that in this workflow the two clones
> of the same repository are totally symmetrical? One's 'master' is
> another 'origin' and vice versa. After pull on one side, and pull
> on the other side (without any changes in between) we have the same
> contents, and the same revision names (commit-ids in git), even if
> the changes (revisions) got to those clones in different order.
> In bzr those two "branches" would get different revnos. No symmetry.
> Full distributed vs star topology (one branch "central", hence
> "centralized" - I don't mean need to access to one central repository,
> although...)

I think that when I attempt to pull from one branch to another, if they
are identical, neither branch changes. Merging + pulling results in
identical history, causing revnos on the pulling branch to change. Just
merging maintains divergent views of the same history. 

Perhaps bzr has a central bias in the view that each developer has the
option of seeing their own branch as the central focus of his/her
development. This view would be the same from each branch; each
developer views his/her own branch as special. If the developer does not
want to view their own branch specially, they would merge + pull rather
than just merging. If I remember correctly, abentley covered this
earlier in this whole "VCS comparison table" thread.

Anyway, much of this seems to be a disagreement over the definition of
"distributed VCS." Perhaps this is too simplistic, but to my inexpert
eyes, these appear to be the positions of each side:

Bzr: Branches and all shared history may be stored locally in disparate
locations, and all VCS functions are available locally.

Git: Same thing, except that all shared history must also be identically
ordered.

Did I get that right?

In general, as a mere _user_ of distributed VCS, all I care about is if
I can accurately point you to a particular commit or set of commits, and
that you can access them either in shared history or in a given branch.
The fact that the VCS does not require a central branch and facilitates
code interchange, means to me that it is distributed. As long as all
major uses are fully supported, being slightly biased toward one use
case or another is not a distinction I consider to be worth making.

-davidc
-- 
gpg-key: http://www.zettazebra.com/files/key.gpg



* Re: VCS comparison table
  2006-10-22 18:53                                                         ` Matthew D. Fuller
@ 2006-10-22 19:27                                                           ` Jakub Narebski
  2006-10-23 16:57                                                           ` David Lang
  2006-10-23 17:29                                                           ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 19:27 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Carl Worth, bazaar-ng, git, Erik Bågfors

On Sun, Oct 22, 2006 Matthew D. Fuller wrote:
> On Sun, Oct 22, 2006 at 07:25:41AM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
>>
>> 	git pull . mainline
> 
> This throws me a little.  I'd expect it to Just Do It when it's
> fast-forwarding, but if it's doing a merge, I'd prefer it to stop and
> wait before creating the commit, even if there are no textual
> conflicts.  I realize you can just look at it afterward and back out
> the commit if necessary, but still...

Or you can use the --no-commit option to git pull, and commit later.
But it is true that you can always amend the commit with
git commit --amend, even if the commit is a merge.
 
>> Ah, I hadn't realized that bzr commits stored an "originating
>> branch" inside them.
> 
> Every branch has a nickname, settable with 'bzr nick' (defaulting to
> whatever the directory it's in is), and that's stored as a text field
> in each commit.  It's mostly cosmetic, but it's handy to see at a
> glance.

If I remember correctly, Linus argued against it, because a branch
name is something local to a repository (the most common example is
"my 'master' is your 'origin'").

There was a proposal for a "note" header for notes like the merge algorithm
used, or the branch name, visible only in 'raw' mode, but it wasn't
implemented.

>> The mainline..featureA syntax literally just means:
>> 
>> 	the set of commits that are reachable by featureA
>> 	and excluding the set of commits reachable by mainline
> 
> From what I can gather from this, though, that means that when I merge
> stuff from featureA into mainline (and keep on with other stuff in
> featureA), I'll no longer be able to see those older commits from this
> command.  And I'll see merged revisions from branches other than
> mainline (until they themselves get merged into mainline), correct?
> It sounds more like a 'bzr missing --mine-only' than looking down a
> mainline in log...

That's true. That is what history viewers (gitk, qgit, tig,
gitview, git-show-branch, git-browser) are for.

And there is always reflog (if you enable it, of course).
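
The range semantics discussed above can be sketched as a set difference
over a toy commit graph. This is only an illustration in Python: the commit
names and the helpers reachable() and commit_range() are invented here and
are not git's implementation.

```python
# Toy commit graph: each commit maps to the list of its parents.
# All commit names are hypothetical.
parents = {
    "A": [],
    "B": ["A"],         # a commit on mainline
    "F1": ["A"],        # first commit on featureA
    "F2": ["F1"],       # second commit on featureA
    "M": ["B", "F2"],   # mainline merges featureA
    "F3": ["F2"],       # featureA keeps going after the merge
}

def reachable(tip, parents):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen

def commit_range(exclude_tip, include_tip, parents):
    """Commits selected by 'exclude_tip..include_tip'."""
    return reachable(include_tip, parents) - reachable(exclude_tip, parents)

# Before the merge: mainline is at B, featureA is at F2.
before_merge = commit_range("B", "F2", parents)
# After the merge: mainline is at M, featureA has moved on to F3.
after_merge = commit_range("M", "F3", parents)
```

In this toy graph the range shrinks from {F1, F2} to {F3} once mainline has
merged featureA, which is the effect Matthew describes.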


* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
                                                                     ` (2 preceding siblings ...)
  2006-10-22 12:46                                                   ` Matthew D. Fuller
@ 2006-10-22 19:36                                                   ` David Clymer
  3 siblings, 0 replies; 1752+ messages in thread
From: David Clymer @ 2006-10-22 19:36 UTC (permalink / raw)
  To: Carl Worth
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Andreas Ericsson,
	git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

On Sat, 2006-10-21 at 13:47 -0700, Carl Worth wrote:
> On Sat, 21 Oct 2006 08:01:11 -0500, "Matthew D. Fuller" wrote:
> > I think we're getting into scratched-record-mode on this.
> 
> I apologize if I've come across as beating a dead horse on this. I've
> really tried to only respond where I'm still confused, or there are
> explicit indications that the reader hasn't understood what I was
> saying, ("I don't understand how you've come to that conclusion",
> etc.). I'll be even more careful about that below, labeling paragraphs
> as "I'm missing something" or "Maybe I wasn't clear".
> 
> > G: So use revids everywhere.
> >
> > B: Revnos are handier tools for [situation] and [situation] for
> >    [reason] and [reason].
> 
> I'm missing something:
> 
> I still haven't seen strong examples for this last claim. When are
> they handier? I asked a couple of messages back and two people replied
> that given one revno it's trivial to compute the revno of its
> parent. But that's no win over git's revision specifications,
> (particularly since they provide "parent of" operators).

I would say that: revnos are handier tools than revids...etc

I think that since G: was making a statement about revids, B: was making
an implicit comparison with them.

bzr log -r before:1   

being handier than

bzr log -r before:revid:david@zettazebra.com-20061022175244-4b85cb5f0cbc79ad


-davidc
-- 
gpg-key: http://www.zettazebra.com/files/key.gpg

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: VCS comparison table
  2006-10-22 19:18                                                       ` David Clymer
@ 2006-10-22 19:57                                                         ` Jakub Narebski
  2006-10-22 20:06                                                         ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 19:57 UTC (permalink / raw)
  To: David Clymer
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

David Clymer wrote:
> Bzr: Branches and all shared history may be stored locally in disparate
> locations, and all VCS functions are available locally.

Branches in bzr are both a one-source (one head) DAG (of parents), and
the "mainline", i.e. the track of commits committed in this branch-as-place.
Bazaar-NG tries to keep both kinds of information in the DAG by using the
first parent to mark commits on the current branch-as-place.

Additionally, bzr by default uses revnos, numbering commits on a branch,
which requires maintaining mainline identity so that revnos do not change
even for one branch-as-place.

This leads to the need to use "merge" if you want to keep revnos
unchanged, and "pull" if you are not interested in that.


Git correctly realizes that mainline identity is local information,
and instead of trying to save local information in the DAG, which is shared,
it uses the reflog.

[That's of course a totally biased view.]
 
> Git: Same thing, except that all shared history must also be identically
> ordered.
That is the EFFECT of preferring fast-forward over preserving
"first parent is my branch" property. So the RESULT is that
shared history is identically ordered.
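
The trade-off between fast-forwarding and always merging can be shown on a
toy history. This is a sketch only: the commit names and the helpers
is_ancestor(), first_parent_chain() and integrate() are invented for
illustration and do not describe how either bzr or git is implemented.

```python
# Toy history, child -> list of parents (first parent listed first).
# Commit names are hypothetical.
parents = {
    "A": [],
    "B": ["A"],         # my commit on my branch
    "F1": ["A"],        # their commit
    "M": ["F1", "B"],   # they merged my B into their branch
}

def is_ancestor(anc, desc, parents):
    """True if anc is reachable from desc (or is desc itself)."""
    stack, seen = [desc], set()
    while stack:
        commit = stack.pop()
        if commit == anc:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False

def first_parent_chain(tip, parents):
    """Follow first parents only: the 'mainline' as seen from tip."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = parents[tip][0] if parents[tip] else None
    return chain

def integrate(local_tip, remote_tip, parents, fast_forward, merge_name="N"):
    """Return the new local tip after taking in remote_tip."""
    if fast_forward and is_ancestor(local_tip, remote_tip, parents):
        return remote_tip                       # just move the branch head
    parents[merge_name] = [local_tip, remote_tip]   # record a merge commit
    return merge_name

ff_tip = integrate("B", "M", parents, fast_forward=True)
merge_tip = integrate("B", "M", parents, fast_forward=False)
```

With fast-forward the first-parent chain becomes M, F1, A, so my commit B
drops off the "mainline" and both sides end up with the same ordering.
With a forced merge the chain is N, B, A, which keeps my numbering stable
at the price of an extra commit.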

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-22 19:18                                                       ` David Clymer
  2006-10-22 19:57                                                         ` Jakub Narebski
@ 2006-10-22 20:06                                                         ` Jakub Narebski
  2006-10-23 11:56                                                           ` David Clymer
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-22 20:06 UTC (permalink / raw)
  To: David Clymer
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

David Clymer wrote:
> 1. revnos don't work because they don't serve the same purpose as revids
> or git's SHA1 commit ids.
Revnos work only locally, or in a star-topology configuration. They have
some consequences: treating the first parent specially, the need for merges
instead of fast-forward even if fast-forward would be applicable, and
two different "fetch" operators: "pull" (which uses revids on the
pulled side) and "merge" (which preserves revids on pullee side).

> 2. bzr does not support fully distributed development because revnos
> "don't work" as stated in #1.
Bazaar is biased towards centralized/star-topology development if we
want to use revids. In a fully distributed configuration there is no
"simple namespace".

> 3. Ok, bzr does support distributed development, I just say it doesn't
> because I think revids are ugly.
I think that bzr revids are uglier than git commit-ids.

If "simple namespace" is on the pros side of bzr, you must remember that
it is a simple namespace only for not fully distributed development. The
pros of a "simple namespace" come with the cons of "merge" vs "pull" and the
centralization required for uniqueness of revids.
-- 
Jakub Narebski
Poland


* [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-21  2:17                                                             ` Junio C Hamano
@ 2006-10-22 21:04                                                               ` Johannes Schindelin
  2006-10-22 23:11                                                                 ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-22 21:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


If the merge base _and_ the to-be-merged branch have a certain file, but
HEAD does not, do not complain if that file exists anyway. It will not be
overwritten.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

---

	On Fri, 20 Oct 2006, Junio C Hamano wrote:

	> While we are talking about merge-recursive, I could use some
	> help from somebody familiar with merge-recursive to complete the
	> read-tree changes Linus mentioned early this month.
	>
	> The issue is that we would want to remove one verify_absent()
	> call in unpack-tree.c:threeway_merge().  When read-tree decides
	> to leave higher stages around, we do not want it to check if the
	> merge could clobber a working tree file, because having an
	> unrelated file at the same path in the working tree sometimes is
	> and sometimes is not a conflict, depending on the outcome of the
	> merge, and that part of the code does not _know_ the outcome
	> yet.

	How about this? It passes the testsuite, and I tested it with the 
	test case you did, and with the same test case with recursive 
	merge.

 unpack-trees.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3ac0289..b4994c4 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -658,10 +658,9 @@ int threeway_merge(struct cache_entry **
 	 * up-to-date to avoid the files getting overwritten with
 	 * conflict resolution files.
 	 */
-	if (index) {
+	if (index)
 		verify_uptodate(index, o);
-	}
-	else if (path)
+	else if (no_anc_exists)
 		verify_absent(path, "overwritten", o);
 
 	o->nontrivial_merge = 1;
-- 
1.4.3.1.ga3de1-dirty


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-21  0:22     ` Petr Baudis
  2006-10-21  0:31       ` Linus Torvalds
  2006-10-21  9:53       ` Andreas Schwab
@ 2006-10-22 21:09       ` Anders Larsen
  2 siblings, 0 replies; 1752+ messages in thread
From: Anders Larsen @ 2006-10-22 21:09 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

On Sat, 21 Oct 2006 02:22:51 +0200, Petr Baudis wrote:

>> That said, "LESS=FRS" doesn't really help that much. It still clears the 
>> screen. Using "LESS=FRSX" fixes that, but the alternate display sequence 
>> is actually nice _if_ the pager is used.
> 
> Hmm, what terminal emulator do you use? The reasonable ones should
> restore the original screen.

And indeed they do.
The problem is, when the original screen is restored, the diff output that
was paged through less -FRS goes poof as well.

Cheers
 Anders


* Re: [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-22 21:04                                                               ` [PATCH] threeway_merge: if file will not be touched, leave it alone Johannes Schindelin
@ 2006-10-22 23:11                                                                 ` Junio C Hamano
  2006-10-23  0:48                                                                   ` Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-22 23:11 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> 	How about this? It passes the testsuite, and I tested it with the 
> 	test case you did, and with the same test case with recursive 
> 	merge.
>
>  unpack-trees.c |    5 ++---
>  1 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 3ac0289..b4994c4 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -658,10 +658,9 @@ int threeway_merge(struct cache_entry **
>  	 * up-to-date to avoid the files getting overwritten with
>  	 * conflict resolution files.
>  	 */
> -	if (index) {
> +	if (index)
>  		verify_uptodate(index, o);
> -	}
> -	else if (path)
> +	else if (no_anc_exists)
>  		verify_absent(path, "overwritten", o);
>  
>  	o->nontrivial_merge = 1;

This feels wrong at the philosophical level.  unpack-trees and
read-tree do not know, and more importantly, do not want to
decide, the outcome of the merge, so it should not be doing
verify_absent because it does not know if the path will be
overwritten by the merge.

Complaining when no_anc_exists means that threeway_merge() is
deciding that the merge result should have the path in this
case.  It might be true for the current merge-recursive and
merge-resolve, but I do not think we should force that decision
on future merge strategies, since that is the whole point of
declaring the merge to be nontrivial and _not_ deciding the
outcome ourselves here.


* Re: prune/prune-packed
  2006-10-22  4:59 ` prune/prune-packed Junio C Hamano
@ 2006-10-22 23:14   ` J. Bruce Fields
  0 siblings, 0 replies; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-22 23:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Oct 21, 2006 at 09:59:20PM -0700, Junio C Hamano wrote:
> "J. Bruce Fields" <bfields@fieldses.org> writes:
> 
> > Both "man prune" and everyday.txt say that git-prune also runs
> > git-prune-packed.  But that doesn't seem to be true.  Is the bug in the
> > documentation?
> 
> I think it is a regression when prune was rewritten as a
> built-in.

So would it be as simple as this?

--b.

From d8a01cf8e2d4ccc02dc52fe5dd22b8462997c1ca Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <bfields@citi.umich.edu>
Date: Sun, 22 Oct 2006 19:01:23 -0400
Subject: [PATCH] Make prune also run prune-packed

Both the git-prune manpage and everyday.txt say that git-prune should also prune
unpacked objects that are also found in packs, by running git prune-packed.

Junio thought this was "a regression when prune was rewritten as a built-in."

So modify prune to call prune-packed again.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
---
 builtin-prune-packed.c |   11 +++++------
 builtin-prune.c        |    2 ++
 builtin.h              |    1 +
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/builtin-prune-packed.c b/builtin-prune-packed.c
index 960db49..e12b6cf 100644
--- a/builtin-prune-packed.c
+++ b/builtin-prune-packed.c
@@ -4,9 +4,7 @@ #include "cache.h"
 static const char prune_packed_usage[] =
 "git-prune-packed [-n]";
 
-static int dryrun;
-
-static void prune_dir(int i, DIR *dir, char *pathname, int len)
+static void prune_dir(int i, DIR *dir, char *pathname, int len, int dryrun)
 {
 	struct dirent *de;
 	char hex[40];
@@ -31,7 +29,7 @@ static void prune_dir(int i, DIR *dir, c
 	rmdir(pathname);
 }
 
-static void prune_packed_objects(void)
+void prune_packed_objects(int dryrun)
 {
 	int i;
 	static char pathname[PATH_MAX];
@@ -50,7 +48,7 @@ static void prune_packed_objects(void)
 		d = opendir(pathname);
 		if (!d)
 			continue;
-		prune_dir(i, d, pathname, len + 3);
+		prune_dir(i, d, pathname, len + 3, dryrun);
 		closedir(d);
 	}
 }
@@ -58,6 +56,7 @@ static void prune_packed_objects(void)
 int cmd_prune_packed(int argc, const char **argv, const char *prefix)
 {
 	int i;
+	int dryrun = 0;
 
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
@@ -73,6 +72,6 @@ int cmd_prune_packed(int argc, const cha
 		usage(prune_packed_usage);
 	}
 	sync();
-	prune_packed_objects();
+	prune_packed_objects(dryrun);
 	return 0;
 }
diff --git a/builtin-prune.c b/builtin-prune.c
index 6228c79..7290e6d 100644
--- a/builtin-prune.c
+++ b/builtin-prune.c
@@ -255,5 +255,7 @@ int cmd_prune(int argc, const char **arg
 
 	prune_object_dir(get_object_directory());
 
+	sync();
+	prune_packed_objects(show_only);
 	return 0;
 }
diff --git a/builtin.h b/builtin.h
index f9fa9ff..f71b962 100644
--- a/builtin.h
+++ b/builtin.h
@@ -11,6 +11,7 @@ extern int mailinfo(FILE *in, FILE *out,
 extern int split_mbox(const char **mbox, const char *dir, int allow_bare, int nr_prec, int skip);
 extern void stripspace(FILE *in, FILE *out);
 extern int write_tree(unsigned char *sha1, int missing_ok, const char *prefix);
+extern void prune_packed_objects(int);
 
 extern int cmd_add(int argc, const char **argv, const char *prefix);
 extern int cmd_apply(int argc, const char **argv, const char *prefix);
-- 
1.4.3.1.g87b78


* Re: [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-22 23:11                                                                 ` Junio C Hamano
@ 2006-10-23  0:48                                                                   ` Johannes Schindelin
  2006-10-23  4:17                                                                     ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-10-23  0:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi,

On Sun, 22 Oct 2006, Junio C Hamano wrote:

> Complaining when no_anc_exists means that threeway_merge() is deciding 
> that the merge result should have the path in this case.

Two points:

- you are correct for at least the case of choosing the merge strategy 
"theirs". (Which does not exist yet.)

- in merge-recursive.c:process_entry() (which is called on _all_ unmerged 
entries after threeway merge), "Case A" reads "deleted in one branch". 
Reading the code again, I believe there is a bug, which should be fixed by

diff --git a/merge-recursive.c b/merge-recursive.c
index 2ba43ae..9f6538a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1005,9 +1005,10 @@ static int process_entry(const char *pat
 		    (!a_sha && sha_eq(b_sha, o_sha))) {
 			/* Deleted in both or deleted in one and
 			 * unchanged in the other */
-			if (a_sha)
+			if (!a_sha) {
 				output("Removing %s", path);
-			remove_file(1, path);
+				remove_file(1, path);
+			}
 		} else {
 			/* Deleted in one and changed in the other */
 			clean_merge = 0;

Note that it not only groups the calls to output() and remove_file(), which
matches the expectation, but also changes the condition to "!a_sha", 
meaning that the file is deleted in branch "a", but existed in the merge 
base, where it is identical to what is in branch "b".

Of course, this assumes that even in the recursive case, branch "a" is to 
be preferred over branch "b". (If I still remember correctly, then branch 
"a" is either the current head, or the temporary recursive merge, so this 
would make sense to me.)

So, after applying this patchlet, merge-recursive (more precisely: the 
function process_entry()) should behave correctly with the change to 
unpack-trees.c you have in pu, i.e. the change that drops that 
verify_absent() call to the floor.

However, I could use some additional optical lobes here.

Ciao,
Dscho

P.S.: Maybe I was wrong on my earlier assessment, that merge-recursive 
does not optimize the "subtrees have identical SHA1s" case. This should be 
handled pretty well by the call to unpack_trees() with threeway merge.


* Re: prune/prune-packed
  2006-10-20 23:35 ` Junio C Hamano
  2006-10-21  0:14   ` Linus Torvalds
  2006-10-21  0:47   ` Nicolas Pitre
@ 2006-10-23  0:53   ` J. Bruce Fields
  2006-10-23  1:26     ` prune/prune-packed A Large Angry SCM
  2 siblings, 1 reply; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-23  0:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> writes:
> I am considering the following to address irritation some people
> (including me, actually) are experiencing with this change when
> viewing a small (or no) diff.  Any objections?

So for me, if I run

	less -FRS file

where "file" is less than a page, I see nothing happen whatsoever.

At a guess, maybe it's clearing the screen, displaying the file, then
restoring, all before I see anything happen?

--b.


* Re: prune/prune-packed
  2006-10-23  0:53   ` prune/prune-packed J. Bruce Fields
@ 2006-10-23  1:26     ` A Large Angry SCM
  2006-10-23  2:36       ` [ANNOUNCE] GIT 1.4.3 J. Bruce Fields
  2006-10-23  3:27       ` prune/prune-packed Junio C Hamano
  0 siblings, 2 replies; 1752+ messages in thread
From: A Large Angry SCM @ 2006-10-23  1:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: J. Bruce Fields, git

J. Bruce Fields wrote:
> Junio C Hamano <junkio@cox.net> writes:
>> I am considering the following to address irritation some people
>> (including me, actually) are experiencing with this change when
>> viewing a small (or no) diff.  Any objections?
> 
> So for me, if I run
> 
> 	less -FRS file
> 
> where "file" is less than a page, I see nothing happen whatsoever.
> 
> At a guess, maybe it's clearing the screen, displaying the file, then
> restoring, all before I see anything happen?

Junio,

How about reverting this change? From the reports here, it is causing
problems on a number of different distributions.

These settings are probably something that is better set by the user in 
an environment variable. Or, make the default something that does work 
everywhere and have a config item for those that wish to customize their UI.


* Re: [ANNOUNCE] GIT 1.4.3
  2006-10-23  1:26     ` prune/prune-packed A Large Angry SCM
@ 2006-10-23  2:36       ` J. Bruce Fields
  2006-10-23  3:27       ` prune/prune-packed Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-23  2:36 UTC (permalink / raw)
  To: A Large Angry SCM; +Cc: Junio C Hamano, git

On Sun, Oct 22, 2006 at 06:26:13PM -0700, A Large Angry SCM wrote:
> J. Bruce Fields wrote:
> >So for me, if I run
> >
> >	less -FRS file
> >
> >where "file" is less than a page, I see nothing happen whatsoever.
> >
> >At a guess, maybe it's clearing the screen, displaying the file, then
> >restoring, all before I see anything happen?
...
> 
> How about reverting this change? From the reports here, it is causing
> problems on a number of different distributions.

I'm using gnome-terminal on Debian/Sid, by the way.

> These settings are probably something that is better set by the user in 
> an environment variable. Or, make the default something that does work 
> everywhere and have a config item for those that wish to customize their UI.

(Um, sorry for my mail screwups, by the way....)

--b.


* Re: prune/prune-packed
  2006-10-23  1:26     ` prune/prune-packed A Large Angry SCM
  2006-10-23  2:36       ` [ANNOUNCE] GIT 1.4.3 J. Bruce Fields
@ 2006-10-23  3:27       ` Junio C Hamano
  2006-10-23 18:39         ` prune/prune-packed Petr Baudis
  2006-10-27 21:19         ` prune/prune-packed Jon Loeliger
  1 sibling, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-23  3:27 UTC (permalink / raw)
  To: gitzilla; +Cc: J. Bruce Fields, git

A Large Angry SCM <gitzilla@gmail.com> writes:

> J. Bruce Fields wrote:
>> Junio C Hamano <junkio@cox.net> writes:
>>> I am considering the following to address irritation some people
>>> (including me, actually) are experiencing with this change when
>>> viewing a small (or no) diff.  Any objections?
>>
>> So for me, if I run
>>
>> 	less -FRS file
>>
>> where "file" is less than a page, I see nothing happen whatsoever.
>>
>> At a guess, maybe it's clearing the screen, displaying the file, then
>> restoring, all before I see anything happen?
>
> Junio,
>
> How about reverting this change? From the reports here, it is causing
> problems on a number of different distributions.

Hmmm.  I thought I was using gnome-terminal as well, but I
always work in screen and did not see this problem.

Sorry, but you are right and Linus is more right.  How about
doing FRSX?

diff --git a/pager.c b/pager.c
index 8bd33a1..4587fbb 100644
--- a/pager.c
+++ b/pager.c
@@ -50,7 +50,7 @@ void setup_pager(void)
 	close(fd[0]);
 	close(fd[1]);
 
-	setenv("LESS", "FRS", 0);
+	setenv("LESS", "FRSX", 0);
 	run_pager(pager);
 	die("unable to execute pager '%s'", pager);
 	exit(255);
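
The third argument of setenv() is the overwrite flag. Passing 0 means the
value is only a default, and a LESS already set by the user is left alone.
The same "set only if unset" rule can be sketched in Python. The function
name pager_environment() is invented for this illustration.

```python
def pager_environment(env):
    """Return a copy of env with LESS defaulted, never overwritten.

    Mirrors the effect of setenv("LESS", "FRSX", 0): the value is applied
    only when the variable is not already present.
    """
    result = dict(env)
    result.setdefault("LESS", "FRSX")
    return result

# No LESS set by the user: the default applies.
default_case = pager_environment({"PAGER": "less"})
# The user already chose their own flags: they are kept as-is.
user_case = pager_environment({"PAGER": "less", "LESS": "R"})
```

So anybody who dislikes the default can still export their own LESS and
will not be affected by this change.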


* Re: [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-23  0:48                                                                   ` Johannes Schindelin
@ 2006-10-23  4:17                                                                     ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-23  4:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> diff --git a/merge-recursive.c b/merge-recursive.c
> index 2ba43ae..9f6538a 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1005,9 +1005,10 @@ static int process_entry(const char *pat
>  		    (!a_sha && sha_eq(b_sha, o_sha))) {
>  			/* Deleted in both or deleted in one and
>  			 * unchanged in the other */
> -			if (a_sha)
> +			if (!a_sha) {
>  				output("Removing %s", path);
> -			remove_file(1, path);
> +				remove_file(1, path);
> +			}
>  		} else {
>  			/* Deleted in one and changed in the other */
>  			clean_merge = 0;
>
> Note that it not only groups the calls to output() and remove_file(), which
> matches the expectation, but also changes the condition to "!a_sha", 
> meaning that the file is deleted in branch "a", but existed in the merge 
> base, where it is identical to what is in branch "b".

I think the conditional "output" is to mimic the first case in
git-merge-one-file; there we conditionally give that message
only when ours had that path.  If we lost the path while they
have it the same way as the common ancestor, then we do not have
the path to begin with when we start the merge.  It is not
correct to say "Removing" in such a case.

So the output() call being tied to if (a_sha) _is_ correct in
your code.

What we would want to prevent is to remove the path from the
working tree when we did not have the path at the beginning of
the merge and the merge result says we do not want that path.
In such a case, the file in the working tree is an untracked
file that is not touched by the merge.

E.g., gitweb/gitweb.cgi is not tracked in the current "master",
but used to be around v1.4.0 time.  If you try to merge a
branch forked from v1.4.0 because you are interested in work
on another part of the system (i.e. the branch did not touch
gitweb/ at all), we want to successfully merge that branch into
our "master" even after "make" created gitweb/gitweb.cgi.

Such a merge would start with your HEAD and index missing
gitweb/gitweb.cgi but the path still in your working tree.  The
common ancestor and their tree have the path tracked, so you
would end up with identical stage #1 and #3 with missing stage
#2.

The merge machinery should say the merge result does not have
the path, so it should remove it from the index.  However, it
should _not_ touch the untracked (from the beginning of the time
the merge started) working tree file.  So the remove_file() call you
touch in your patch needs to be told not to update the working
directory in such a case.
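
One way to read the rule above is as a small decision function over the
three stages of a path. This is a sketch under stated assumptions: the
function merge_path() and its return convention are hypothetical and are
not the code in merge-recursive.c or unpack-trees.c. None stands for "the
path is absent on that side".

```python
def merge_path(ancestor, ours, theirs):
    """Decide the outcome for one path that existed in the common ancestor.

    Returns (result, touch_worktree): result is the blob id the merge keeps
    (None if the path is gone), and touch_worktree says whether the merge
    may remove the file from the working tree.
    """
    if ours is None and theirs is None:
        # Deleted on both sides: nothing of ours is left to remove.
        return None, False
    if ours is None and theirs == ancestor:
        # We deleted it, they left it unchanged.  The result has no such
        # path, and any file at that path is untracked: leave it alone.
        return None, False
    if theirs is None and ours == ancestor:
        # They deleted it, we left it unchanged.  Our tracked copy goes
        # away from both the index and the working tree.
        return None, True
    raise NotImplementedError("other cases are outside this sketch")

# Our HEAD lacks the path; ancestor and theirs carry the same blob.
# This corresponds to identical stage #1 and #3 with stage #2 missing.
untracked_case = merge_path("blob1", None, "blob1")
# The mirror image: they removed a file we still track unchanged.
tracked_case = merge_path("blob1", "blob1", None)
```

Only the second situation should ever lead to a file being deleted from
the working tree.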

Under the "aggressive" rule, threeway_merge() is requested to make
the merge policy decision, so it should also loosen this check
itself.  The change by commit 0b35995 needs to be updated with
this patch:

diff --git a/unpack-trees.c b/unpack-trees.c
index b1d78b8..7cfd628 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -642,7 +642,7 @@ int threeway_merge(struct cache_entry **
 		    (remote_deleted && head && head_match)) {
 			if (index)
 				return deleted_entry(index, index, o);
-			else if (path)
+			else if (path && !head_deleted)
 				verify_absent(path, "removed", o);
 			return 0;
 		}


* Re: VCS comparison table
  2006-10-22 17:12                                                           ` Linus Torvalds
@ 2006-10-23  5:19                                                             ` Matthew Hannigan
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthew Hannigan @ 2006-10-23  5:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tim Webster, bazaar-ng, git, Jakub Narebski

On Sun, Oct 22, 2006 at 10:12:00AM -0700, Linus Torvalds wrote:
> [ ... ]
> 
>    Again, the way to solve this would tend to be to have a few helper 
>    scripts that use regular file-contents that _describe_ these things to 
>    do "realdiff" and "install".
> 
> In other words, for at least three _totally_ different reasons, you really 
> don't want to do tracking/development directly in /etc, but you want to 
> have a buffer zone to do it. And once you have that, you might as well do 
> _that_ as the repository, and just add a few specialty commands (let's 
> call them "plugins" to make everybody happy) to do the special things.

Damn you stole my idea!  I had this scheme brewing in my head too,
with some slight variations:

> 	# copy the data, set up a PERMISSIONS file to track extra info
> 	sudo cp /etc/group /etc/passwd /etc/shadow .
> 	sudo chown user:user *
> 	cat <<EOF > PERMISSIONS
> 	group root:root 0644
> 	passwd root:root 0644
> 	shadow root:root 0400
> 	EOF

You may want one perms/metadata file per real file (file.meta?) with contents
like:
	owner root
	group root
	perms u=r,go=

for possibly easier to digest diff output. You could omit "don't care" variables.
You could still have one overarching file (DEFAULT.meta) for defaults.  Also, you
may want to track the implied umask instead of the real perms.
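
A per-file metadata format like this is easy to read back. The following
Python sketch assumes the "key value" layout shown above and handles only
absolute symbolic clauses of the form who=perms. The helper names
(parse_meta, symbolic_to_mode, effective_meta) are invented for
illustration.

```python
def parse_meta(text):
    """Parse 'key value' lines into a dict; blank lines are ignored."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            key, _, value = line.partition(" ")
            meta[key] = value.strip()
    return meta

def symbolic_to_mode(spec):
    """Convert a spec such as 'u=r,go=' to a numeric mode.

    Only '=' clauses with r, w and x are understood in this sketch.
    """
    bits = {"r": 4, "w": 2, "x": 1}
    shift = {"u": 6, "g": 3, "o": 0}
    mode = 0
    for clause in spec.split(","):
        who, _, perms = clause.partition("=")
        value = sum(bits[p] for p in perms)
        for w in who:
            mode |= value << shift[w]
    return mode

def effective_meta(defaults, specific):
    """Per-file entries override defaults; omitted keys fall through."""
    merged = dict(defaults)
    merged.update(specific)
    return merged

defaults = parse_meta("owner root\ngroup root\nperms u=rw,go=r\n")
shadow = effective_meta(defaults, parse_meta("perms u=r,go=\n"))
shadow_mode = symbolic_to_mode(shadow["perms"])
```

Here the per-file entry only states the permissions, and owner and group
come from the defaults, which is the "don't care" omission suggested above.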

You could also track the pathname, (e.g. path /etc/group, path /etc/inet/hosts) so you
didn't have to match the structure of the working tree to the actual destination.

> And again, I'm not going to even claim that the above two "plugins" are 
> the right ones (maybe you want other operations too to interact with the 
> "real" installed files),  [ ... ]

Yes, there are other very useful transformations possible.  One example is to
split the /etc/group file into a series of files, each named after the group,
with contents the sorted list of members.  Again, this is useful for 'diff' and
any SCM. It's important that it's a lossless transformation in both
directions; you may want to scan the destination and make sure
your base revision matches it before 'git install'.
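
The split of /etc/group could look roughly like the sketch below. It is
written in Python purely as an illustration. The on-disk layout of the
per-group files (a "passwd" line, a "gid" line, then one member per line)
and the function names are assumptions made here, since the password and
gid fields have to be kept somewhere for the round trip to work.

```python
def split_group(text):
    """Map each group name to the contents of its own small file."""
    files = {}
    for line in text.splitlines():
        if not line:
            continue
        name, passwd, gid, members = line.split(":")
        body = ["passwd " + passwd, "gid " + gid]
        body += sorted(m for m in members.split(",") if m)
        files[name] = "\n".join(body) + "\n"
    return files

def join_group(files):
    """Rebuild the group file text, ordering the entries by gid."""
    entries = []
    for name, body in files.items():
        rows = body.splitlines()
        passwd = rows[0].split(" ", 1)[1]
        gid = rows[1].split(" ", 1)[1]
        members = ",".join(rows[2:])
        entries.append((int(gid), ":".join([name, passwd, gid, members])))
    return "".join(line + "\n" for _, line in sorted(entries))

original = "root:x:0:\nadm:x:4:alice,bob\n"
round_trip = join_group(split_group(original))
# Member order is normalized, so unsorted input does not come back verbatim.
normalized = join_group(split_group("adm:x:4:bob,alice\n"))
```

Note that sorting the members means the round trip is exact only for input
that is already in this normalized form, so the very first import may show
up as a (harmless) reordering.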

> Btw: none of this is really "git-specific". The above tells you how to do 
> local "git plugins", and it's obviously fairly trivial, but I suspect any 
> SCM can be used in this manner.

Indeed, the essential thing about this is you're representing any
system modification as a text diff, so it makes sense for any
SCM.  In fact the 'plugin' for any SCM would be 95% the same code.

This might also be useful for SCMs that don't handle symlinks
natively.

--
Matt Hannigan


* Re: VCS comparison table
  2006-10-22 20:06                                                         ` Jakub Narebski
@ 2006-10-23 11:56                                                           ` David Clymer
  2006-10-23 12:54                                                             ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: David Clymer @ 2006-10-23 11:56 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2212 bytes --]

On Sun, 2006-10-22 at 22:06 +0200, Jakub Narebski wrote:
> David Clymer wrote:
> > 1. revnos don't work because they don't serve the same purpose as revids
> > or git's SHA1 commit ids.
> Revnos work only locally, or in star-topology configuration. They have
> some consequences: treating first parent specially, need for merges
> instead of fast-forward even if fast-forward would be applicable,
> two different "fetch" operators: "pull" (which uses revids on the
> pulled side) and "merge" (which preserves revids on pullee side).

s/revids/revnos/g  but yes, I think I said this later in my previous
email.

> 
> > 2. bzr does not support fully distributed development because revnos
> > "don't work" as stated in #1.
> Bazaar is biased towards centralized/star-topology development if we
> want to use revids. In fully distributed configuration there is no
> "simple namespace".

So revnos aren't globally meaningful in fully distributed settings. So
what? I don't see how this translates into bias. There is a lot of
functionality provided by bazaar that doesn't really apply to my use
case, but it doesn't mean that it is indicative of some bias in bazaar.

> 
> > 3. Ok, bzr does support distributed development, I just say it doesn't
> > because I think revids are ugly.
> I think that bzr revids are uglier that git commit-ids.
> 
> If on the pros side of bzr is "simple namespace", you must remember that
> it is simple namespace only for not fully distributed development. The
> pros of "simple namespace" with cons of "merge" vs "pull" and centralization
> required for uniqueness of revids.

I think you've switched revids and revnos, but I get what you are
saying. In fact, I think I said pretty much the same thing in the email
you are replying to. I don't think that anyone is disagreeing about
anything other than the assertion that bzr is biased because revnos are
used to simplify cases where it is possible to do so.

In any case, Matthew Fuller & Carl Worth cover this in greater detail in
emails further down in this thread (or one of its siblings), so I think
I'll stop here.

-davidc

-- 
gpg-key: http://www.zettazebra.com/files/key.gpg



* Re: VCS comparison table
  2006-10-23 11:56                                                           ` David Clymer
@ 2006-10-23 12:54                                                             ` Jakub Narebski
  2006-10-23 15:01                                                               ` James Henstridge
  2006-10-24  3:24                                                               ` David Clymer
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-23 12:54 UTC (permalink / raw)
  To: David Clymer
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

On Mon, Oct 23, 2006 David Clymer wrote:
> On Sun, 2006-10-22 at 22:06 +0200, Jakub Narebski wrote:
>> David Clymer wrote:

>>> 2. bzr does not support fully distributed development because revnos
>>> "don't work" as stated in #1.
>>
>> Bazaar is biased towards centralized/star-topology development if we
>> want to use revnos. In fully distributed configuration there is no
>> "simple namespace".
> 
> So revnos aren't globally meaningful in fully distributed settings. So
> what? I don't see how this translates into bias. There is a lot of
> functionality provided by bazaar that doesn't really apply to my use
> case, but it doesn't mean that it is indicative of some bias in bazaar.

First, bzr is biased towards using revnos: bzr commands take revnos
by default to specify a revision (you have to use the revid: prefix/operator
to use revision identifiers), bzr commands output revids only when
requested, and the usage examples use revision numbers.

In order to use revnos as _global_ identifiers in distributed development,
you need central "branch", mainline, to provide those revnos. You have
either to have access to this "revno server" and refer to revisions by
"revno server" URL and revision number, or designate one branch as holding
revision numbers ("revno server") and preserve revnos on "revno server"
by using bzr "merge", while copying revnos when fetching by using bzr "pull"
for leaf branches. In short: for revnos to be global identifiers you need
star-topology.

Even if you use revnos only locally, you need to know which revisions are
"yours", i.e. beside branch as DAG of history of given revision you need
"ordered series of revisions" (to quote Bazaar-NG wiki Glossary), or path
through this diagram from given revision to one of the roots (initial,
parentless revisions). Because bzr does that by preserving mentioned path
as first-parent path (treating first parent specially), i.e. storing local
information in a DAG (which is shared), to preserve revnos you need to
use "merge" instead of "pull", which means that you get empty-merge in
clearly fast-forward case. This means "local changes bias", which some
might take as not being fully distributed.

Sidenote 1: Why does Bazaar-NG try to store "branch as ordered series
of revisions"/"branch as path through revisions DAG" in the DAG instead
of storing it separately (like the reflog, which stores the history of the
tip of a branch and is roughly equivalent to "branch as path" in bzr)? It needs
some kind of cache of the mapping from revno to the revision itself anyway
(unless performance doesn't matter for bzr developers ;-)! All that's
left is to propagate this mapping on "pull"...

Sidenote 2: Consider a "fringe" developer using the default git configuration,
with an 'origin' branch tracking the 'master' branch of the cloned (mainline)
repo and a 'master' branch on which he/she does his/her own work. If he/she
has committed at least one revision on his/her 'master' branch, and those
changes are never pulled but reach the mainline repo only through a "side"
channel like git-enhanced patches sent to the project mailing list,
he/she will see a picture similar to a bzr branch which uses "merge".


The whole discussion about validity of revision numbers started
with "simple namespace" feature in SCM comparison matrix on Bazaar-NG
wiki...
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-23 12:54                                                             ` Jakub Narebski
@ 2006-10-23 15:01                                                               ` James Henstridge
  2006-10-23 17:18                                                                 ` Aaron Bentley
  2006-10-24  3:24                                                               ` David Clymer
  1 sibling, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-23 15:01 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: David Clymer, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Carl Worth, Andreas Ericsson, git

On 23/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> First, bzr is biased towards using revnos: bzr commands uses revnos
> by default to provide revision (you have to use revid: prefix/operator
> to use revision identifiers), bzr commands outputs revids only when
> requested, examples of usage uses revision numbers.

As has been said before, you can set an alias to always show revision
IDs in "bzr log" output.


> In order to use revnos as _global_ identifiers in distributed development,
> you need central "branch", mainline, to provide those revnos. You have
> either to have access to this "revno server" and refer to revisions by
> "revno server" URL and revision number, or designate one branch as holding
> revision numbers ("revno server") and preserve revnos on "revno server"
> by using bzr "merge", while copying revnos when fetching by using bzr "pull"
> for leaf branches. In short: for revnos to be global identifiers you need
> star-topology.

Why do you continue to repeat this argument?  No one is claiming that
a revision number by itself, as Bazaar uses them, is a global
identifier.  In fact, we keep on saying that they only have meaning in
the context of a branch.  If you want to use a revision number as part
of a globally unique identifier, it needs to be in combination with
its branch.


> Even if you use revnos only locally, you need to know which revisions are
> "yours", i.e. beside branch as DAG of history of given revision you need
> "ordered series of revisions" (to quote Bazaar-NG wiki Glossary), or path
> through this diagram from given revision to one of the roots (initial,
> parentless revisions). Because bzr does that by preserving mentioned path
> as first-parent path (treating first parent specially), i.e. storing local
> information in a DAG (which is shared), to preserve revnos you need to
> use "merge" instead of "pull", which means that you get empty-merge in
> clearly fast-forward case. This means "local changes bias", which some
> might take as not being fully distributed.

I won't dispute that Bazaar has features that make it easier to work
with the revisions in the line of development of the branch you're
working on in comparison to the revisions from merges.  But given that
every Bazaar branch has this same bias towards their own main line of
development, how can that affect whether or not it is distributed?

James.


* Re: VCS comparison table
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2006-10-22 19:27                                                           ` Jakub Narebski
@ 2006-10-23 16:57                                                           ` David Lang
  2006-10-23 17:29                                                           ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: David Lang @ 2006-10-23 16:57 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Erik Bågfors, bazaar-ng, git, Jakub Narebski

>> This special treatment influences or directly causes many of the
>> things in bzr that we've been discussing:
>  [...]
>> I've been arguing that all of these impacts are dubious. But I can
>> understand that a bzr user hearing arguments against them might fear
>> that they would lose the ability to be able to see a view of commits
>> that "belong" to a particular branch.
>
> Dead center.
>
>
>> The mainline..featureA syntax literally just means:
>>
>> 	the set of commits that are reachable by featureA
>> 	and excluding the set of commits reachable by mainline
>
> From what I can gather from this, though, that means that when I merge
> stuff from featureA into mainline (and keep on with other stuff in
> featureA), I'll no longer be able to see those older commits from this
> command.  And I'll see merged revisions from branches other than
> mainline (until they themselves get merged into mainline), correct?
> It sounds more like a 'bzr missing --mine-only' than looking down a
> mainline in log...

One thing you are missing: 'mainline' in this git command does not mean 
'everything that's in the main published branch'. It means 'everything 
reachable from the tag mainline'.

so when you branched off for your feature development you could set a tag that 
says 'branchpoint' and no matter what gets merged in mainline after that you can 
always do branchpoint..featureA and find what you've done.

that being said, mainline..featureA is also extremely useful: it tells you what 
development you have done that has not yet been merged into mainline
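
As a rough illustration of the set arithmetic behind the A..B notation, here
is a small sketch over a toy history. The commit names and the dict-of-parents
representation are made up for the example; this is not how git stores or
walks history internally.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip by following parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def commit_range(parents, exclude_tip, include_tip):
    """Commits reachable from include_tip but not from exclude_tip."""
    return reachable(parents, include_tip) - reachable(parents, exclude_tip)


# Hypothetical history: A - B - C on mainline, B - F1 - F2 on featureA,
# and M is a mainline merge of C with F1.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "F1": ["B"], "F2": ["F1"],
    "M": ["C", "F1"],
}

before_merge = commit_range(parents, "C", "F2")   # mainline at C
after_merge = commit_range(parents, "M", "F2")    # mainline at M
since_branchpoint = commit_range(parents, "B", "F2")
```

Once mainline has merged F1, only F2 remains in mainline..featureA, while the
fixed branchpoint name keeps returning both feature commits.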

David Lang


* Re: VCS comparison table
  2006-10-23 15:01                                                               ` James Henstridge
@ 2006-10-23 17:18                                                                 ` Aaron Bentley
  2006-10-23 17:53                                                                   ` Jakub Narebski
  2006-10-23 20:06                                                                   ` Jeff King
  0 siblings, 2 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-23 17:18 UTC (permalink / raw)
  To: James Henstridge
  Cc: Jakub Narebski, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

James Henstridge wrote:
> Why do you continue to repeat this argument?  No one is claiming that
> a revision number by itself, as Bazaar uses them, is a global
> identifier.  In fact, we keep on saying that they only have meaning in
> the context of a branch.

And, unlike git, Bazaar branches are all independent entities[1], and
they each have a URL.

So:

http://code.aaronbentley.com/bzrrepo/bzr.ab 1695

is a name for

abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

And it does not depend on any other branch, especially not bzr.dev

Since:
1. anyone with write access to the urls can create them
2. anyone with read access to the urls can read them
3. the maintainers of the mainline have no control over them
   (except as provided by 1)

these identifiers are not centralized.

Aaron

[1] The fact that they may share storage is not important to the model.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFPPlm0F+nu1YWqI0RAlmLAJ9cpw5X7UXQ82EmoIeUrKzEaFbhdACfZPsS
CRJ69XWi7XAWJRi7Fgt9ICU=
=WrV9
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2006-10-22 19:27                                                           ` Jakub Narebski
  2006-10-23 16:57                                                           ` David Lang
@ 2006-10-23 17:29                                                           ` Linus Torvalds
  2006-10-23 22:21                                                             ` Matthew D. Fuller
  2 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-23 17:29 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Erik Bågfors, bazaar-ng, git, Jakub Narebski



On Sun, 22 Oct 2006, Matthew D. Fuller wrote:
> 
> > This special treatment influences or directly causes many of the
> > things in bzr that we've been discussing:
>   [...]
> > I've been arguing that all of these impacts are dubious. But I can
> > understand that a bzr user hearing arguments against them might fear
> > that they would lose the ability to be able to see a view of commits
> > that "belong" to a particular branch.
> 
> Dead center.

The thing that the bzr people don't seem to realize is that their choice 
of revision naming has serious side effects, some of them really 
technical, and limiting.

I already brought this up once, and I suspect that the bzr people simply 
DID NOT UNDERSTAND the question:

 - how do you do the git equivalent of "gitk --all"

which is just another reason why "branch-local" revision naming is simply 
stupid and has real _technical_ problems.

I really suspect that a lot of people can't see further than their own 
feet, and don't understand the subtle indirect problems that branch-local 
naming causes. 

For example, how long does it take to do an arbitrary "undo" (ie forcing a 
branch to an earlier state) in a project with tens of thousands of 
commits? That's actually a really important operation, and yes, 
performance does matter. It's something that you do a lot when you do 
things like "bisect" (which I used to approximate with BK by hand, and 
yes, re-weaving the branch history was apparently a big part of why it 
took _minutes_ to do sometimes).

Again, this is something that people don't expect to have _anything_ to do 
with revision numbering, but the fact is, it's a big part of the picture. 
If you have branch-local revision numbering, you need to renumber all 
revisions on events like this, and even if it is "just" re-creating the 
revno->"real ID" cache, it's actually an expensive operation exactly 
because it's going to be at least linear in history.
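
A sketch of why rebuilding such a cache is linear in history: the only way to
number revisions along a branch is to walk its first-parent chain from the tip
back to the root. The function below is an illustration under that assumption,
with made-up commit names; it is not bzr's actual implementation.

```python
def first_parent_revnos(parents, tip):
    """Map revno -> commit by walking the first-parent chain from tip.

    The walk visits every commit on the chain, so the cost grows with the
    length of the branch's history.
    """
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        commit_parents = parents.get(commit, [])
        commit = commit_parents[0] if commit_parents else None
    chain.reverse()
    return {revno: c for revno, c in enumerate(chain, start=1)}


# Hypothetical history: a - b - c, a side commit x off a, and m merging x into c.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"], "m": ["c", "x"]}

at_m = first_parent_revnos(parents, "m")
at_x = first_parent_revnos(parents, "x")
```

Moving the tip from m to x changes what revno 2 refers to, so the whole
mapping has to be recomputed rather than patched.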

One of the git design requirements was that no operation should _ever_ 
need to be linear in history size, because it becomes a serious limiter of 
scalability at some point. We were seeing some of those issues with BK, 
which is why I cared.

So in git, doing things like jumping back and forth in history is O(1). 
Always (with a really low constant cost too). Of course, checking out the 
end result is then roughly O(n), but even there "n" is the size of the 
_changes_, not number of revisions or number of files.

(And there are obviously operations that _are_ O(revision history), the 
most trivial one being anything that visualizes all of history - but they 
depend on the size of history not because the operation itself gets more 
expensive, but because the dataset increases).

The whole confusion between "bzr pull" and "bzr merge" is another 
_technical_ sign of why branch-local revision numbers are a mistake. 

			Linus


* Re: VCS comparison table
  2006-10-23 17:18                                                                 ` Aaron Bentley
@ 2006-10-23 17:53                                                                   ` Jakub Narebski
  2006-10-23 18:04                                                                     ` Linus Torvalds
  2006-10-23 20:06                                                                   ` Jeff King
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-23 17:53 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git

Aaron Bentley wrote:
> James Henstridge wrote:

>> Why do you continue to repeat this argument?  No one is claiming that
>> a revision number by itself, as Bazaar uses them, is a global
>> identifier.  In fact, we keep on saying that they only have meaning in
>> the context of a branch.
> 
> And, unlike git, Bazaar branches are all independent entities[1], and
> they each have a URL.
> 
> So:
> 
> http://code.aaronbentley.com/bzrrepo/bzr.ab 1695
> 
> is a name for
> 
> abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> 
> And it does not depend on any other branch, especially not bzr.dev
> 
> Since:
> 1. anyone with write access to the urls can create them
> 2. anyone with read access to the urls can read them
> 3. the maintainers of the mainline have no control over them
>    (except as provided by 1)
> 
> these identifiers are not centralized.

If you don't use centralized numbers (i.e. always referring to bzr.dev,
either by always using (bzr.dev URL, revno), or by using "merge" for
bzr.dev and "pull" for the rest), the numbers are volatile. If the URL vanishes,
then the (URL, revno) to revid mapping is no longer valid. Yeah, I know,
cool URIs don't change...

Besides, you need [constant] network access for this mapping.
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-23 17:53                                                                   ` Jakub Narebski
@ 2006-10-23 18:04                                                                     ` Linus Torvalds
  2006-10-23 18:21                                                                       ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-23 18:04 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Andreas Ericsson,
	Carl Worth, git



On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> Besides, you need [constant] network access for this mapping.

I _think_ that Aaron was trying to say that

	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

is always constant, so you can use that.

Of course, nobody will ever do that, because in practice they're not 
shown, the same way the "true" BK revision names were never shown and thus 
never really used.

		Linus


* Re: VCS comparison table
  2006-10-23 18:04                                                                     ` Linus Torvalds
@ 2006-10-23 18:21                                                                       ` Jakub Narebski
  2006-10-23 18:26                                                                         ` Jelmer Vernooij
  2006-10-23 18:34                                                                         ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-23 18:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

Linus Torvalds wrote:
> 
> On Mon, 23 Oct 2006, Jakub Narebski wrote:
>> 
>> Besides, you need [constant] network access for this mapping.
> 
> I _think_ that Aaron was trying to say that
> 
> 	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> 
> is always constant, so you can use that.
> 
> Of course, nobody will ever do that, because in practice they're not 
> shown, the same way the "true" BK revision names were never shown and thus 
> never really used.

By the way, I wonder if accidentally identical revisions
(see example for accidental clean merge on revctrl.org)
would get the same revision id in bzr. In git they would.
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-23 18:21                                                                       ` Jakub Narebski
@ 2006-10-23 18:26                                                                         ` Jelmer Vernooij
  2006-10-23 18:31                                                                           ` Jakub Narebski
  2006-10-23 18:34                                                                         ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Jelmer Vernooij @ 2006-10-23 18:26 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git


On Mon, 2006-10-23 at 20:21 +0200, Jakub Narebski wrote:
> Linus Torvalds wrote:
> > On Mon, 23 Oct 2006, Jakub Narebski wrote:
> >> 
> >> Besides, you need [constant] network access for this mapping.
> > 
> > I _think_ that Aaron was trying to say that
> > 
> > 	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> > 
> > is always constant, so you can use that.
> > 
> > Of course, nobody will ever do that, because in practice they're not 
> > shown, the same way the "true" BK revision names were never shown and thus 
> > never really used.
> 
> By the way, I wonder if accidentally identical revisions
> (see example for accidental clean merge on revctrl.org)
> would get the same revision id in bzr. In git they would.
They won't. The revision id is made up of the committer's email address,
a timestamp and a bunch of random data. It wouldn't be hard to switch
to using checksums as revids instead, but I don't think there are any plans
in that direction.

Cheers,

Jelmer
-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/



* Re: VCS comparison table
  2006-10-23 18:26                                                                         ` Jelmer Vernooij
@ 2006-10-23 18:31                                                                           ` Jakub Narebski
  2006-10-23 18:44                                                                             ` Jelmer Vernooij
  2006-10-23 18:45                                                                             ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-23 18:31 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

Jelmer Vernooij wrote:
>> By the way, I wonder if accidentally identical revisions
>> (see example for accidental clean merge on revctrl.org)
>> would get the same revision id in bzr. In git they would.

> They won't. The revision id is made up of the committers email address,
> a timestamp and a bunch of random data. It wouldn't be hard to switch
> using checksums as revids instead, but I don't think there are any plans
> in that direction.

The place for timestamp and committer info is in the revision metadata
(in the commit object in git), not in the revision id. Unless you think that
"accidentally the same" doesn't happen...
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-23 18:21                                                                       ` Jakub Narebski
  2006-10-23 18:26                                                                         ` Jelmer Vernooij
@ 2006-10-23 18:34                                                                         ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-23 18:34 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git



On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> By the way, I wonder if accidentally identical revisions
> (see example for accidental clean merge on revctrl.org)
> would get the same revision id in bzr. In git they would.

git can have no "accidentally identical revisions". They'd have to be 
purposefully done, but yes, they'd obviously (on purpose) get the same 
revision name if that's the case.

You may think of tree (not commit) identity, where git on purpose names 
trees the same regardless of how you got to them. So on a _tree_ level, 
you are always supposed to get the same result regardless of how you 
import things (ie two people importing the same tar-ball should always get 
exactly the same tree ID).

But the actual commit names are identical only if the same people are 
claimed to have authored (and committed) them at the same time - so it's 
definitely not "accidental" if the commits are called the same: they 
really _are_ the same.
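
The difference between tree identity and commit identity can be sketched with
a few lines of hashing. The object header ("type size NUL") and the layout of
the commit body follow git's object format as I understand it; the helper
names, the author strings and the timestamps are made up for illustration.

```python
import hashlib


def git_object_id(obj_type, body):
    """SHA1 over a 'type size NUL' header followed by the object body."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, author, committer, message, parents=()):
    """Name a commit from its tree, parents, identities and message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return git_object_id("commit", "\n".join(lines).encode())


# Two people importing the same (here: empty) content get the same tree name.
tree_alice = git_object_id("tree", b"")
tree_bob = git_object_id("tree", b"")

# Hypothetical identities; only the committer timestamp differs.
who_1 = "A U Thor <author@example.com> 1161624000 +0000"
who_2 = "A U Thor <author@example.com> 1161624001 +0000"
same_a = commit_id(tree_alice, who_1, who_1, "import\n")
same_b = commit_id(tree_bob, who_1, who_1, "import\n")
different = commit_id(tree_bob, who_1, who_2, "import\n")
```

Identical content always yields the same tree name, while the commit name
changes as soon as any of the recorded metadata differs.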

Btw, I think you misunderstand the term "accidental clean merge". It means 
that two identical changes on two branches will merge without conflicts 
being reported.

A merge algorithm that doesn't do "accidental clean merge" is totally 
broken. The accidental clean merge is a usability requirement for pretty 
much anything - you often have two branches doing the same thing (possibly 
for different reasons - two people independently found the same bug that 
showed itself in two different ways - so they may even think that they 
are fixing different issues, and may have written totally different 
changelogs to explain the bug, but the solution is identical and should 
obviously merge cleanly).
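
A minimal sketch of the rule in question, at whole-file granularity: if both
sides ended up with the same content, take it, whatever the changelogs say.
This is only the trivial three-way rule applied per path, with made-up file
contents; it is not git's actual merge machinery, which also merges within
files.

```python
CONFLICT = object()  # sentinel for "both sides changed it differently"


def merge_value(base, ours, theirs):
    """Trivial three-way merge of one path's content (None means absent)."""
    if ours == theirs:
        return ours        # identical result on both sides: clean merge
    if ours == base:
        return theirs      # only their side changed
    if theirs == base:
        return ours        # only our side changed
    return CONFLICT


def merge_trees(base, ours, theirs):
    """Merge {path: content} dicts; return (merged tree, conflicting paths)."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        result = merge_value(base.get(path), ours.get(path), theirs.get(path))
        if result is CONFLICT:
            conflicts.append(path)
        elif result is not None:
            merged[path] = result
    return merged, conflicts


base = {"driver.c": "buggy"}
same_fix = merge_trees(base, {"driver.c": "fixed"}, {"driver.c": "fixed"})
rival_fix = merge_trees(base, {"driver.c": "fix A"}, {"driver.c": "fix B"})
```

The first merge comes out clean even though both branches touched the same
file; only the second one, where the results differ, reports a conflict.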

So "accidental clean merge" may _sound_ like something bad, but it's 
actually a seriously good property (it's really just a special case of 
"convergence" - again, that's a good thing).

		Linus


* Re: prune/prune-packed
  2006-10-23  3:27       ` prune/prune-packed Junio C Hamano
@ 2006-10-23 18:39         ` Petr Baudis
  2006-10-27 21:19         ` prune/prune-packed Jon Loeliger
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-23 18:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: gitzilla, J. Bruce Fields, git

Dear diary, on Mon, Oct 23, 2006 at 05:27:49AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> A Large Angry SCM <gitzilla@gmail.com> writes:
> 
> > J. Bruce Fields wrote:
> >> Junio C Hamano <junkio@cox.net> writes:
> >>> I am considering the following to address irritation some people
> >>> (including me, actually) are experiencing with this change when
> >>> viewing a small (or no) diff.  Any objections?
> >>
> >> So for me, if I run
> >>
> >> 	less -FRS file
> >>
> >> where "file" is less than a page, I see nothing happen whatsoever.
> >>
> >> At a guess, maybe it's clearing the screen, displaying the file, the
> >> restoring, all before I see anything happen?
> >
> > Junio,
> >
> > How about reverting this change? From the reports here, is causing
> > problems on a number of different distributions.
> 
> Hmmm.  I thought I was using gnome-terminal as well, but I
> always work in screen and did not see this problem.
> 
> Sorry, but you are right and Linus is more right.  How about
> doing FRSX.

I should like that solution more since I hate the alternate screen, but
I actually don't, since it should be left at the user's will whether to
use the alternate screen or not, and Git shouldn't change the default on
whim. Git is trying to be too smart here, and I think it's more annoying
to override what the user is used to than having to by default press q.

Yes, the user can always override Git by setting their own $LESS, but that
means another explicit action on the user's side is required and they
don't receive any further cool flags we might stick in there later.
(BTW, I don't think this is right either. In Cogito, I do

	LESS="$myflags$LESS"

unless $CG_LESS is set, in which case I do

	LESS="$CG_LESS".

So people like Jens who have LESS set still get sensible behaviour from
Cogito _and_ they don't lose the ability to override Cogito's less
flags.)
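
The same precedence rule, sketched as a small function so the three cases are
easy to see side by side. The default flag string "R" merely stands in for
$myflags, and treating "set" as "present in the environment" is an assumption;
neither is taken from Cogito's source.

```python
def effective_less(environ, tool_flags="R"):
    """Compute the LESS value a tool would export before starting the pager.

    CG_LESS, when present, replaces everything; otherwise the tool's own
    flags are prepended to whatever the user already has in LESS.
    """
    if "CG_LESS" in environ:
        return environ["CG_LESS"]
    return tool_flags + environ.get("LESS", "")


no_settings = effective_less({})
user_less = effective_less({"LESS": "iM"})
full_override = effective_less({"LESS": "iM", "CG_LESS": "S"})
```

A user with LESS already set keeps those flags and still picks up the tool's
additions, and CG_LESS remains available as the full override.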

BTW, I think not seeing the output of paged commands is a major problem;
this probably warrants another bugfix release.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-23 18:31                                                                           ` Jakub Narebski
@ 2006-10-23 18:44                                                                             ` Jelmer Vernooij
  2006-10-23 18:45                                                                             ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Jelmer Vernooij @ 2006-10-23 18:44 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git


On Mon, 2006-10-23 at 20:31 +0200, Jakub Narebski wrote:
> Jelmer Vernooij wrote:
> >> By the way, I wonder if accidentally identical revisions
> >> (see example for accidental clean merge on revctrl.org)
> >> would get the same revision id in bzr. In git they would.
> 
> > They won't. The revision id is made up of the committers email address,
> > a timestamp and a bunch of random data. It wouldn't be hard to switch
> > using checksums as revids instead, but I don't think there are any plans
> > in that direction.
> The place for timestamp and commiter info is in the revision metadata
> (in commit object in git). Not in revision id. Unless you think that
> "accidentally the same" doesn't happen...
The revision id isn't parsed by bzr. It's just a unique identifier that
is generated at commit-time and is currently created by concatenating
those three fields. It can be anything you like. The bzr-svn plugin, for
example, creates revision ids of the form
svn:REVNUM@REPOS_UUID-BRANCHPATH and bzr-git uses git:GITREVID. Nothing
would break if bzr started using a different format.
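
For illustration, an identifier of the shape seen earlier in this thread
(email, a 14-digit timestamp, 16 hex digits) could be produced roughly like
this. The function name, its parameters and the separator handling are
assumptions based only on that one example id; this is not bzr's code.

```python
import secrets
from datetime import datetime, timezone


def make_revid(email, when=None, random_hex=None):
    """Concatenate committer email, timestamp and random data into an id."""
    when = when or datetime.now(timezone.utc)
    random_hex = random_hex or secrets.token_hex(8)  # 8 bytes -> 16 hex digits
    return f"{email}-{when:%Y%m%d%H%M%S}-{random_hex}"


# Rebuilding the id quoted in this thread from its three parts.
example = make_revid(
    "abentley@panoramicfeedback.com",
    datetime(2006, 9, 27, 20, 28, 32),
    "9795d0528e311e31",
)

# Same committer, same second, but fresh random data each time.
now = datetime(2006, 10, 23, 18, 44, 0)
first = make_revid("someone@example.com", now)
second = make_revid("someone@example.com", now)
```

Because the random part is drawn independently on every call, two commits of
identical content by the same person in the same second still get different
ids, which is the behaviour described above.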

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/



* Re: VCS comparison table
  2006-10-23 18:31                                                                           ` Jakub Narebski
  2006-10-23 18:44                                                                             ` Jelmer Vernooij
@ 2006-10-23 18:45                                                                             ` Linus Torvalds
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-23 18:45 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jelmer Vernooij, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git



On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> The place for timestamp and committer info is in the revision metadata
> (in commit object in git). Not in revision id. Unless you think that
> "accidentally the same" doesn't happen...

Well, git and bzr really do share the same "stable" revision naming, 
although in git it's more indirect, and thus "covers" more.

In git, the revision name indirectly includes the commit comments too (and 
git obviously also distinguishes between "committer" and "author", and 
those end up being indirectly credited in the name of the commit too). But 
in a very real sense, the bzr stable ("real") revision name does 
effectively contain the same things as a git ID: it's just that it's a 
small subset (only committer+date+random number) of what git includes in 
its names.

So you could more easily _fake_ a commit name in bzr, and depending on how 
things are done it might be more open to malicious attacks for that reason 
(or unintentionally - if two people apply the exact same patch from an 
email, and take the author/date info from the email like git does, you 
might have clashes. But with a 64-bit random number, that's probably 
unlikely, unless you also hit some other bad luck like having the 
pseudo-random sequence seeded by "time()", and people just _happen_ to 
apply the email at the exact same second).

The git use of hashes and parenthood information make any accidental 
clashes like that a non-issue: if you have exactly the same information, 
it really _is_ the same commit, since the hash includes the parenthood 
too. So you're left with just malicious attacks, and those currently look 
practically impossible too, of course.

So I don't think bzr and git differ in this respect. I think you can 
_trust_ stable git names a lot more, but that's a separate issue.
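
As an illustration of why identical information gives the identical name, here is a small Python sketch of how git derives an object name: SHA-1 over a short header plus the object body. The helper names, the author string and the parent ids are made up for the example, and commit_body builds only a simplified commit (no optional headers); only the hashing rule itself is git's.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's well-known empty tree


def git_object_id(kind: str, body: bytes) -> str:
    """Name an object the way git does: SHA-1 over '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parents, author, committer, message):
    """Build a simplified commit object body (no optional headers)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return ("\n".join(lines) + "\n").encode()


# Hypothetical identity and parents, for illustration only.
who = "A U Thor <author@example.com> 1161629100 +0000"
parent_1 = "1" * 40
parent_2 = "2" * 40

# Two people applying the same patch on the same parent, with the same
# author/committer/date information, produce the very same commit name.
alice = git_object_id("commit", commit_body(EMPTY_TREE, [parent_1], who, who, "Apply patch"))
bob = git_object_id("commit", commit_body(EMPTY_TREE, [parent_1], who, who, "Apply patch"))

# The same patch applied on a different parent is a different commit.
carol = git_object_id("commit", commit_body(EMPTY_TREE, [parent_2], who, who, "Apply patch"))

print(alice == bob)    # same information, same name
print(alice == carol)  # different parenthood, different name
```

The parent ids are part of the hashed body, so the name of a commit depends on its whole ancestry and not just on the patch it carries.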

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:45                                                                             ` Linus Torvalds
@ 2006-10-23 18:56                                                                               ` Jelmer Vernooij
  2006-10-23 19:02                                                                                 ` Shawn Pearce
                                                                                                   ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Jelmer Vernooij @ 2006-10-23 18:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jakub Narebski, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

[-- Attachment #1: Type: text/plain, Size: 2030 bytes --]

On Mon, 2006-10-23 at 11:45 -0700, Linus Torvalds wrote:
> On Mon, 23 Oct 2006, Jakub Narebski wrote:
> > The place for timestamp and committer info is in the revision metadata
> > (in commit object in git). Not in revision id. Unless you think that
> > "accidentally the same" doesn't happen...
> Well, git and bzr really do share the same "stable" revision naming, 
> although in git it's more indirect, and thus "covers" more.
> 
> In git, the revision name indirectly includes the commit comments too (and 
> git obviously also distinguishes between "committer" and "author", and 
> those end up being indirectly credited in the name of the commit too). But 
> in a very real sense, the bzr stable ("real") revision name does 
> effectively contain the same things as a git ID: it's just that it's a 
> small subset (only committer+date+random number) of what git includes in 
> its names.
There are no requirements on what a revid is in bzr. It's a unique
identifier, nothing more. It can be whatever you like, as long as it's
unique for that specific commit. The committer+date+random number is
just what bzr uses at the moment to create those unique identifiers.

> So you could more easily _fake_ a commit name in bzr, and depending on how 
> things are done it might be more open to malicious attacks for that reason 
> (or unintentionally - if two people apply the exact same patch from an 
> email, and take the author/date info from the email like git does, you 
> might have clashes. But with a 64-bit random number, that's probably 
> unlikely, unless you also hit some other bad luck like having the 
> pseudo-random sequence seeded by "time()", and people just _happen_ to 
> apply the email at the exact same second).
Bzr stores a checksum of the commit separately from the revision id in
the metadata of a revision. The revision is not used by itself to check
the integrity of a revision.

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
@ 2006-10-23 19:02                                                                                 ` Shawn Pearce
  2006-10-23 19:12                                                                                 ` Jakub Narebski
  2006-10-23 19:18                                                                                 ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-23 19:02 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, Jakub Narebski, James Henstridge, bazaar-ng,
	Matthew D. Fuller, Andreas Ericsson, Carl Worth, git

Jelmer Vernooij <jelmer@samba.org> wrote:
> On Mon, 2006-10-23 at 11:45 -0700, Linus Torvalds wrote:
> > On Mon, 23 Oct 2006, Jakub Narebski wrote:
> > > The place for timestamp and committer info is in the revision metadata
> > > (in commit object in git). Not in revision id. Unless you think that
> > > "accidentally the same" doesn't happen...
> > Well, git and bzr really do share the same "stable" revision naming, 
> > although in git it's more indirect, and thus "covers" more.
> > 
[snip]
> > So you could more easily _fake_ a commit name in bzr, and depending on how 
> > things are done it might be more open to malicious attacks for that reason 
> > (or unintentionally - if two people apply the exact same patch from an 
> > email, and take the author/date info from the email like git does, you 
> > might have clashes. But with a 64-bit random number, that's probably 
> > unlikely, unless you also hit some other bad luck like having the 
> > pseudo-random sequence seeded by "time()", and people just _happen_ to 
> > apply the email at the exact same second).
> Bzr stores a checksum of the commit separately from the revision id in
> the metadata of a revision. The revision is not used by itself to check
> the integrity of a revision.

I think Linus' original point here was that if you communicate the
revision id to another person and they fetch that revision there
is no assurance that the commit they have received is the exact
same commit you had.

In Git that assurance is implicitly present as the unique
identification you communicated to the other person is also that
integrity verification.  Therefore it's nearly impossible to spoof.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
  2006-10-23 19:02                                                                                 ` Shawn Pearce
@ 2006-10-23 19:12                                                                                 ` Jakub Narebski
  2006-10-23 19:18                                                                                 ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-23 19:12 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Carl Worth, Andreas Ericsson, git

Jelmer Vernooij wrote:

> There are no requirements on what a revid is in bzr. It's a unique
> identifier, nothing more. It can be whatever you like, as long as it's
> unique for that specific commit. The committer+date+random_number is
> just what bzr uses at the moment to create those unique identifiers.

In an unpacked git repository the commit-id is also the commit's address.
Pack files add another level of indirection via the pack index file. The
commit-id also functions as a checksum.

P.S. I'm interested in what the bzr equivalents of git's different types
of objects are: commits (revision info), and what is stored in there besides
the commit message and "snapshot"; trees/manifests, i.e. how files are
gathered together to form a given revision; blobs, i.e. what the storage
format is and how it is divided: changeset-like as in Arch, file "buckets"
as in Mercurial and CVS, or something different altogether. Is there an
equivalent of git tags and tag objects?

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
  2006-10-23 19:02                                                                                 ` Shawn Pearce
  2006-10-23 19:12                                                                                 ` Jakub Narebski
@ 2006-10-23 19:18                                                                                 ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-23 19:18 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Carl Worth,
	Andreas Ericsson, git, Jakub Narebski



On Mon, 23 Oct 2006, Jelmer Vernooij wrote:
>
> Bzr stores a checksum of the commit separately from the revision id in
> the metadata of a revision. The revision is not used by itself to check
> the integrity of a revision.

That wasn't what I was trying to aim at - the problem is that the bzr 
revision ID isn't "safe" in itself. Anybody can create a revision with the 
same names - and they may both have checksums that match their own 
revision, but you have no idea which one is "correct".

So you just have to trust the person that generates the name, to use a 
proper name generation algorithm. You have to _trust_ that your 64-bit 
random number really is random, for example. And that nobody is trying to 
mess with your repo.

This isn't a problem in normal behaviour, but it's a problem in an attack 
scenario: imagine somebody hacking the central server, and replacing the 
repository with something that had all the same commit names, but one of 
the revisions was changed to introduce a nasty backdoor problem. Change 
all the checksums to match too..

It would _look_ fine to somebody who fetches an update, and the maintainer 
might not ever even notice (because he wouldn't send the _old_ revision 
again, and _his_ tree would be fine, so he'd happily continue to send 
out new revisions on top of the bad one on the public site, never even 
realizing that people are fetching something that doesn't match what he is 
pushing).

In contrast, in git, if you replace something in a git repository, the 
name changes, and if I were to try to push an update on top of a broken 
repo like that, it simply wouldn't work - I couldn't fast-forward my own 
branch, because it's no longer a proper subset of what I'm trying to send.

So in git, you can _trust_ the names. They actually self-verify. You can't 
have maliciously made-up names that point to something else than what they 
are. 

[ Also, as a result, and related to this same issue: the git protocol 
  actually never sends object names when sending the object itself. It 
  just sends the object data, and the _recipient_ generates the name from 
  that.

  So you can't do the _other_ kind of spoofing, and make a repository that 
  _claims_ to have one name and the data would differ - because if you do 
  that, anybody who pulls from the spoofed repository will re-create 
  different names than you claimed, and won't even be able to pull such a 
  malicious repository. ]
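
The recipient-side naming described above can be sketched in Python as follows. This is a toy model, not the git protocol: the receive function and the in-memory store are hypothetical, and only the hashing rule (SHA-1 over a type/size header plus the data) is taken from git.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>' + NUL + body, as git names objects."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def receive(store: dict, incoming, wanted: str) -> bool:
    """Store (kind, body) pairs under locally computed names.

    The sender never supplies a name. Returns True only if the object
    the recipient asked for actually turned up among the data.
    """
    for kind, body in incoming:
        store[object_id(kind, body)] = (kind, body)
    return wanted in store


honest = b"int main(void) { return 0; }\n"
tampered = b"int main(void) { backdoor(); return 0; }\n"
wanted = object_id("blob", honest)  # the name the maintainer published

ok = receive({}, [("blob", honest)], wanted)
spoofed = receive({}, [("blob", tampered)], wanted)
print(ok, spoofed)
```

In this model a tampered object simply ends up under a different name, so the requested name stays missing and the fetch cannot be completed with the altered data.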

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 17:18                                                                 ` Aaron Bentley
  2006-10-23 17:53                                                                   ` Jakub Narebski
@ 2006-10-23 20:06                                                                   ` Jeff King
  2006-10-23 20:29                                                                     ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Jeff King @ 2006-10-23 20:06 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: James Henstridge, Jakub Narebski, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Andreas Ericsson, Carl Worth, git

On Mon, Oct 23, 2006 at 01:18:30PM -0400, Aaron Bentley wrote:

> And, unlike git, Bazaar branches are all independent entities[1], and
[...]
> [1] The fact that they may share storage is not important to the model.

Sorry, I don't understand this statement. How are git branches not
independent? Sure, they tend to exist in repositories with other
branches, but there's no need to (it simply allows the sharing of object
storage). There's no reason I can't move any branch from any repo into
its own repo, or vice versa move any unrelated branch into a repo with
other branches.

It all Just Works because there _isn't_ any branch information. It's
simply a pointer into the DAG, so if I have the right parts of the DAG
(which git is careful to make sure of), I can just make a pointer, and I
have absolutely zero connection to wherever the DAG came from.
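
A toy Python model of "a branch is simply a pointer into the DAG", assuming commits are stored as a mapping from id to parent ids and branches as a mapping from name to id. The names reachable and copy_branch are illustrative, not git commands.

```python
def reachable(objects: dict, tip: str) -> set:
    """Return every commit id reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(objects[cid])
    return seen


def copy_branch(src_objects, src_refs, name, dst_objects, dst_refs):
    """Copy one branch: the commits it reaches plus a single pointer."""
    tip = src_refs[name]
    for cid in reachable(src_objects, tip):
        dst_objects[cid] = src_objects[cid]
    dst_refs[name] = tip


# Toy history: A <- B <- C on "master", A <- D on "topic".
objects = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"]}
refs = {"master": "C", "topic": "D"}

# Move "topic" into a fresh, empty repository.
new_objects, new_refs = {}, {}
copy_branch(objects, refs, "topic", new_objects, new_refs)
print(sorted(new_objects), new_refs)
```

Nothing in the copied data records where the commits came from; the new repository holds the needed part of the DAG and one name pointing into it.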

> they each have a URL.

In cogito, branches can each have a URL, but git-clone doesn't have a
way (that I know of) to clone only a subset of branches. It would be
fairly trivial to implement, I think.

> So:
> 
> http://code.aaronbentley.com/bzrrepo/bzr.ab 1695
> 
> is a name for
> 
> abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

The git analog is of course:

http://kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git v2.6.18

as a name for

e478bec0ba0a83a48a0f6982934b6de079e7e6b3

The difference being that Linus assigned the "local" name of v2.6.18
rather than having git auto-assign it.

> And it does not depend on any other branch, especially not bzr.dev

Of course. For me, the above commit is actually

  ssh://peff.net/home/peff/git/linux-2.6 v2.6.18

but once it is in my local repository, it's indistinguishable from one I
pulled directly from kernel.org.

And I wonder if THAT is at the root of this discussion. bzr isn't
"centralized" in the sense that you have to talk to a central server, or
rely on it for doing any operations.  But you actually CARE about where
your commits come from, and git fundamentally doesn't.

-Peff

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 20:06                                                                   ` Jeff King
@ 2006-10-23 20:29                                                                     ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-23 20:29 UTC (permalink / raw)
  To: Jeff King
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Andreas Ericsson, Carl Worth, git

On Mon, 23 Oct 2006, Jeff King wrote:
> On Mon, Oct 23, 2006 at 01:18:30PM -0400, Aaron Bentley wrote:
> 
>> And, unlike git, Bazaar branches are all independent entities[1], and
> [...]
>> [1] The fact that they may share storage is not important to the model.

By the way, git repositories (remember that the working area in bzr is
associated with a branch, and in git with a repository) can share storage
in two ways: either by sharing only immutable "old history" (part of the
DAG) via the $GIT_DIR/objects/info/alternates file or the
GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable, or by having a
shared object database, by symlinking the $GIT_DIR/objects directory or by
setting the GIT_OBJECT_DIRECTORY variable.

Git doesn't fully support the latter out of the box (you must be careful
with prune), but on the other hand bzr doesn't support cloning a whole
repository.
  
> It all Just Works because there _isn't_ any branch information. It's
> simply a pointer into the DAG, so if I have the right parts of the DAG
> (which git is careful to make sure of), I can just make a pointer, and I
> have absolutely zero connection to wherever the DAG came from.

Well, with the exception of the reflog, which is local to the repository
(and doesn't get propagated).
 
>> they each have a URL.
> 
> In cogito, branches can each have a URL, but git-clone doesn't have a
> way (that I know of) to clone only a subset of branches. It would be
> fairly trivial to implement, I think.

On the other hand, Cogito doesn't have a way to clone all the branches.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 17:29                                                           ` Linus Torvalds
@ 2006-10-23 22:21                                                             ` Matthew D. Fuller
  2006-10-23 22:28                                                               ` David Lang
                                                                                 ` (4 more replies)
  0 siblings, 5 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-23 22:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> I already brought this up once, and I suspect that the bzr people
> simply DID NOT UNDERSTAND the question:
> 
>  - how do you do the git equivalent of "gitk --all"

I for one simply DO NOT UNDERSTAND the question, because I don't know
what that is or what I'd be trying to accomplish by doing it.  The
documentation helpfully tells me that it's something undocumented.


> For example, how long does it take to do an arbitrary "undo" (ie
> forcing a branch to an earlier state) [...]

I don't understand the thrust of this, either.  As I understand the
operation you're talking about, it doesn't have anything to do with a
branch; you'd just be whipping the working tree around to different
versions.  That should be O(diff) on any modern VCS.


> and yes, performance does matter.

I agree, and I currently find a number of places bzr doesn't hit the
level of performance I think it should.  I'm not convinced, however,
that any notable proportion of that has to do with the abstract model
behind it.  And insofar as it has to do with the physical storage
model, that can easily be (and I'm confident will be, considering it's
a focus) ameliorated with later repository formats.


> The whole confusion between "bzr pull" and "bzr merge" is another
> _technical_ sign of why branch-local revision numbers are a mistake. 

I consider it a _technical_ sign of a way of thinking about branches I
prefer   8-}


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
@ 2006-10-23 22:28                                                               ` David Lang
  2006-10-23 22:44                                                               ` Linus Torvalds
                                                                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 1752+ messages in thread
From: David Lang @ 2006-10-23 22:28 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

>
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

on many modern VCS systems it's O(n) on the number of changes (start from where 
you are and apply the patch to change it to rev -1, then apply the patch to 
change it to rev -2, etc)

on git it's O(1) (write the new files into place)

David Lang

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
  2006-10-23 22:28                                                               ` David Lang
@ 2006-10-23 22:44                                                               ` Linus Torvalds
  2006-10-24  0:26                                                                 ` Matthew D. Fuller
  2006-10-23 22:45                                                               ` Jakub Narebski
                                                                                 ` (2 subsequent siblings)
  4 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-23 22:44 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git



On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

> On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> > 
> > I already brought this up once, and I suspect that the bzr people
> > simply DID NOT UNDERSTAND the question:
> > 
> >  - how do you do the git equivalent of "gitk --all"
> 
> I for one simply DO NOT UNDERSTAND the question, because I don't know
> what that is or what I'd be trying to accomplish by doing it.  The
> documentation helpfully tells me that it's something undocumented.

gitk (and all other logging functions) can take as its argument a set of 
arbitrary revision expressions.

That means, for example, that you can give it a list of branches and tags, 
and it will generate the combined log for all of them. "--all" is just 
shorthand for that, but it's really just a special case of the generic 
facility.

This is _invaluable_ when you want to actually look at how the branches 
are related. The whole _point_ of having branches is that they tend to 
have common state.

For example, let's say that you have a branch called "development", and a 
branch called "experimental", and a branch called "mainline". Now, 
_obviously_ all of these are related, but if you want to see how, what 
would you do?

In git, one natural thing would be, for example, to do

	gitk development experimental ^mainline

(where instead of "gitk" you can use any of the history listing 
things - gitk is just the visually more clear one) which will show you 
what exists in the branches "development" and "experimental", but it will 
_subtract_ out anything in "mainline" (which is sensible - you may want to 
see _just_ the stuff that is getting worked on - and the stuff in mainline 
is thus uninteresting).

See? When you visualize multiple branches together, HAVING PER-BRANCH 
REVISION NUMBERS IS INSANE! Yet, clearly, it's a valid and interesting 
operation to do.

An equally interesting thing to ask is: I've got two branches, show me the 
differences between them, but not the stuff in common. Again, very simple. 
In git, you'd literally just write

	gitk a...b

(where "..." is "symmetric difference"). Or, if you want to see what is in 
"a" but _not_ in "b", you'd do

	gitk b..a

(now ".." is regular set difference, and the above is really identical to 
the "a ^b" syntax).
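
For readers who want to see the set arithmetic spelled out, here is a rough Python sketch over a toy history. The commit names and the helpers ancestors and rev_set are invented for the example; git computes these sets itself.

```python
def ancestors(parents: dict, tip: str) -> set:
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def rev_set(parents: dict, include, exclude=()) -> set:
    """Commits reachable from any 'include' tip but from no 'exclude' tip."""
    wanted = set().union(*(ancestors(parents, t) for t in include))
    unwanted = set().union(*(ancestors(parents, t) for t in exclude))
    return wanted - unwanted


# Toy history: mainline is M1 <- M2, development is M1 <- D1 <- D2,
# experimental forks from development as D1 <- E1.
parents = {"M1": [], "M2": ["M1"], "D1": ["M1"], "D2": ["D1"], "E1": ["D1"]}
mainline, development, experimental = "M2", "D2", "E1"

# Like: development experimental ^mainline
in_progress = rev_set(parents, [development, experimental], [mainline])

# Like: development...experimental (symmetric difference)
either_only = ancestors(parents, development) ^ ancestors(parents, experimental)

# Like: experimental..development (in development but not in experimental)
dev_only = rev_set(parents, [development], [experimental])

print(sorted(in_progress), sorted(either_only), sorted(dev_only))
```

Every result is a set of commits named independently of any branch, which is what lets one expression range over several branches at once.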

And trust me, these are all very valid things to do, even though you're 
talking about different branches.

Try it out. 

> > For example, how long does it take to do an arbitrary "undo" (ie
> > forcing a branch to an earlier state) [...]
> 
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

No. If you "undo", you'd undo the whole history too. And if you undo to a 
point that was on a branch, you'd have to re-write _all_ the revision 
ID's.

> I consider it a _technical_ sign of a way of thinking about branches I
> prefer   8-}

Quite frankly, I just don't think you understand what it means.

			Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
  2006-10-23 22:28                                                               ` David Lang
  2006-10-23 22:44                                                               ` Linus Torvalds
@ 2006-10-23 22:45                                                               ` Jakub Narebski
  2006-10-23 23:14                                                                 ` Erik Bågfors
  2006-10-24  9:51                                                               ` Matthieu Moy
  2006-10-25 10:52                                                               ` Andreas Ericsson
  4 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-23 22:45 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthew D. Fuller wrote:

> On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
>> 
>> I already brought this up once, and I suspect that the bzr people
>> simply DID NOT UNDERSTAND the question:
>> 
>>  - how do you do the git equivalent of "gitk --all"
> 
> I for one simply DO NOT UNDERSTAND the question, because I don't know
> what that is or what I'd be trying to accomplish by doing it.  The
> documentation helpfully tells me that it's something undocumented.

gitk(1)
=======

NAME
----
gitk - git repository browser

DESCRIPTION
-----------
Displays changes in a repository or a selected set of commits. This includes
visualizing the commit graph, showing information related to each commit, and
the files in the trees of each revision.

Historically, gitk was the first repository browser. It's written in tcl/tk
and started off in a separate repository but was later merged into the main
git repository.

OPTIONS
-------
To control which revisions to show, the command takes options applicable to
the git-rev-list(1) command. This manual page describes only the most
frequently used options.

[...]
--all::

        Show all branches.


Which means that "gitk --all" shows the whole DAG in a graphical history viewer.

Since there is no command (or plugin) in bzr to clone a whole repository,
I guess that the answer is that you can't do this. But perhaps 
I'm mistaken, and you can do this in bzr-gtk/bzrk...

>> For example, how long does it take to do an arbitrary "undo" (ie
>> forcing a branch to an earlier state) [...]
> 
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

For example, if you decide to discard some changes completely, reverting
the branch to some previous revision (in git this action is called 'rewind').

And in git this operation is O(1), not O(diff).

BTW, the following question IIRC remained unanswered: can you easily
create a branch in bzr off an arbitrary revision (for example, deciding that
the stable branch should start two revisions back in history from the
development branch)?

>> and yes, performance does matter.
> 
> I agree, and I currently find a number of places bzr doesn't hit the
> level of performance I think it should.  I'm not convinced, however,
> that any notable proportion of that has to do with the abstract model
> behind it.  And insofar as it has to do with the physical storage
> model, that can easily be (and I'm confident will be, considering it's
> a focus) ameliorated with later repository formats.

Some physical storage models need a specific abstract model. I think
that the git storage model is in this class.

>> The whole confusion between "bzr pull" and "bzr merge" is another
>> _technical_ sign of why branch-local revision numbers are a mistake. 
> 
> I consider it a _technical_ sign of a way of thinking about branches I
> prefer   8-}

Or _perhaps_ just a way of thinking about branches that you are
used to.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:45                                                               ` Jakub Narebski
@ 2006-10-23 23:14                                                                 ` Erik Bågfors
  2006-10-23 23:24                                                                   ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-23 23:14 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

This is starting to turn into a "my VCS is better than yours"
discussion rather than anything else.  That's unfortunate....


>
> Which means that "gitk --all" means show whole DAG in graphical history viewer.
>
> As in bzr there is no command (nor plugin) to clone whole repository,

But it wouldn't be hard to create one...

> I guess that the answer is that you can't do this. But perhaps
> I'm mistaken, and you can do this in bzr-gtk/bzrk...

As of now there is no way to do it due to the fact that nobody has
done it yet. You can of course clone branches into a common repo and do
operations on that. For example, there is a plugin that allows you to
list heads in a repo (and not in branches). So basically, if you lose
a branch, you can still find the head in the repository and recreate
the branch.

I don't see any problem doing a "gitk --all" equivalent in bzr.
Personally, I don't really have a need for it.

> BTW. The following question IIRC remained unanswered: can you easily
> in bzr create branch off arbitrary revision (for example deciding that
> stable branch should start two revisions back in history from development
> branch)?

bzr branch -r-2 development stable
(or "bzr branch -rrevid:foobar" to start at revision id "foobar")

very easy.

/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 23:14                                                                 ` Erik Bågfors
@ 2006-10-23 23:24                                                                   ` Linus Torvalds
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
                                                                                       ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-23 23:24 UTC (permalink / raw)
  To: Erik Bågfors; +Cc: Jakub Narebski, bazaar-ng, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 310 bytes --]



On Tue, 24 Oct 2006, Erik Bågfors wrote:
> 
> I don't see any problem doing a "gitk --all" equivalent in bzr.

The problem? How do you show a commit that is _common_ to two branches, 
but has different revision names in them?

Do you _finally_ see what is so wrong with this whole per-branch naming?

		Linus

* Re: VCS comparison table
  2006-10-23 22:44                                                               ` Linus Torvalds
@ 2006-10-24  0:26                                                                 ` Matthew D. Fuller
  2006-10-24 15:58                                                                   ` David Lang
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-24  0:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

On Mon, Oct 23, 2006 at 03:44:13PM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> gitk (and all other logging functions) can take as its argument a
> set of arbitrary revision expressions.
  [...]
> And trust me, these are all very valid things to do, even though
> you're talking about different branches.

I have zero problem believing that.  It seems from all accounts a
wonderful swiss-army chainsaw, and while none of that power is useful
to me personally in anything I'm VCS'ing at the moment, I'd feel awful
shiny knowing it was sitting there waiting for me.  All else being
equal, I'd think more highly of a VCS with those capabilities than one
without.

bzr-the-program doesn't have a lot of that capability, and what it
does have is rather more verbose to access.  Perhaps some attribute of
bzr-the-current-storage-model would make some bit of that
significantly more expensive than it has to be (I don't know of any,
and can't think offhand of anywhere it might hide, but that's way off
my turf).

But I don't understand how bzr-the-abstract-data-model makes such
things impossible, or even significantly different than doing so in
git.  In git, you're just chopping off one DAG where another one
intersects it (or similar operations).  To do it in bzr, you'd do...
exactly the same thing.  The revnos, or the mainline, are completely
useless in such an operation of course, but they don't hurt it; the
tool would just ignore them like it does the SHA-1 of files in
the revision.


> See? When you visualize multiple branches together, HAVING
> PER-BRANCH REVISION NUMBERS IS INSANE! Yet, clearly, it's a valid
> and interesting operation to do.

I wouldn't be so absolutist about it, but certainly they're of
extremely limited utility if of any at all in such cases.  And yes, it
can be an interesting operation.  But what does that have to do with
using revnos in other cases?  You keep saying "having" where I would
say "using".


> No. If you "undo", you'd undo the whole history too. And if you undo
> to a point that was on a branch, you'd have to re-write _all_ the
> revision ID's.

Well, I guess in this particular case I still don't see why you'd
generally undo big hunks of a branch versus just flipping your working
tree to different versions.  But contrived examples are still
examples, and even if so, truncate()'ing a list of numbers is a
constant time operation.  And even if you had to renumber totally...
my $DEITY, I'd expect my old 200MHz PPro to renumber a hundred
thousand rev long mainline in half a second.
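
To put a rough shape on that estimate, here is a minimal sketch in
Python.  It assumes a deliberately simple model in which a mainline is
just a list of revision ids and a revno is a 1-based position in that
list; the names `mainline`, `truncate` and `renumber` are made up for
illustration and are not bzr's actual data structures.  No timing
figure is claimed here, since that depends entirely on the machine.

```python
import time

# Hypothetical model: a mainline is a list of revision ids, and the
# revno of a revision is its 1-based position in that list.
mainline = [f"rev-{i:06d}" for i in range(100_000)]


def truncate(line, keep):
    """Drop every revision after the first `keep`; earlier revnos are unchanged."""
    del line[keep:]


def renumber(line):
    """Rebuild the whole revid -> revno table from scratch."""
    return {revid: revno for revno, revid in enumerate(line, start=1)}


start = time.perf_counter()
revnos = renumber(mainline)
elapsed = time.perf_counter() - start
print(f"renumbered {len(revnos)} revisions in {elapsed:.4f}s")

# "Undo" the last two revisions; the survivors keep their numbers.
truncate(mainline, 99_998)
print(len(mainline), revnos[mainline[-1]])
```

Truncating never touches the numbers of the revisions that remain,
which is why only the full renumbering case is timed.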


> > I consider it a _technical_ sign of a way of thinking about
> > branches I prefer   8-}
> 
> Quite frankly, I just don't think you understand what it means.

Quite frankly, I just don't think you understand that I WANT to care
about first parents.  No, really.  Seriously.  I really really really
want to.  If my VCS didn't give me numbers along the mainline, I'd
still care about it.  If the revisions were all named SHA-1 hashes, I'd
still care about it.  If I had a metric quidnillion ways to
cross-section and compare branches, I'd still care about it.

This comes with costs.  Chief among them is a restriction of my
actions; I can't fast-forward branches where I care about the
mainline.  That's a cost.  That means I have to take some care about
what operations I perform.  I *GLEEFULLY* pony up that cost.

Because I care about the mainline, revnos can be useful.  I like
revnos.  It has to cost SOMETHING to come up with them (though there
seems to be disagreement about the size of that cost), since doing
'x+y' will always cost more than doing 'x'.  I've never seen a case
where that cost even appeared MEASURABLE, much less significant
(things have to be pretty expensive to compare to the cost of starting
up python and loading a bunch of files into it ;).  So far, I've not
seen the slightest hint of a cost that would make it even worth asking
the question of whether the cost is worth it to me.


I care about that first parent line.  Therefore, I require my tool to
at least _pretend_ to care.  I'm not aware of any way in which the
fundamental bzr structures care, but the UI is chock full of
pretending.  A necessary part of that pretending is not changing my
mainline unless I specifically ask for it, and that means a
merge-vs-pull distinction needs to be there.  That's a _technical_
sign that the tool is ready to work with me the way I want to work.  A
lack of it is a _technical_ sign that it's not suitable.
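
The merge-vs-pull distinction can be sketched with a toy history in
Python.  This is only an illustration under assumed names (`parents`,
`mainline`, `pull`, `merge` are hypothetical, not bzr or git API): I
commit "b" on top of "a", someone else commits "x" on top of "a" and
then merges my "b" into their branch as "m", with their own side as
first parent.

```python
# Hypothetical toy model: each revision maps to its list of parents,
# first parent first.
parents = {
    "a": [],
    "b": ["a"],        # my commit
    "x": ["a"],        # their commit
    "m": ["x", "b"],   # their merge of my work; their side is first parent
}


def mainline(tip):
    """Follow first parents from `tip` back to the root, oldest first."""
    line = []
    while tip is not None:
        line.append(tip)
        tip = parents[tip][0] if parents[tip] else None
    return line[::-1]


def pull(my_tip, their_tip):
    """Fast-forward: adopt their tip as-is."""
    return their_tip


def merge(my_tip, their_tip, new_id):
    """Record a new revision whose first parent is my old tip."""
    parents[new_id] = [my_tip, their_tip]
    return new_id


before = mainline("b")
after_pull = mainline(pull("b", "m"))
after_merge = mainline(merge("b", "m", "m2"))

print(before)       # my mainline before touching anything
print(after_pull)   # "b" has dropped off the first-parent line
print(after_merge)  # "b" is still second on the first-parent line
```

In this model a pull swaps my first-parent line for theirs, while a
merge keeps my old tip as first parent at the price of an extra
revision.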

You, by your own words, don't care about the first parent line.  Your
tool naturally reflects this.  From that perspective, *ANY* cost for
maintaining such a thing is Bad And Wrong, and so you condemn it.
Those condemnations will keep failing to carry any weight with me,
though, as long as I care about that mainline and value the benefits I
find in it.


Maybe I won't always.  2 years ago, I could maybe see some benefits in
DVCS, but I couldn't imagine what possible use they could ever be to
me in anything I do.  Today, I'm using one (if lightly by the
standards of a lot of people in this discussion), and chafing at every
centralized system I have to deal with.  In 5 years, I may be standing
beside you slugging it out at those lunatics and hacks who keep
begging to pay these whopper costs, just to be able to do extra work
to maintain an ordering of parents that doesn't matter for crap.
Could be.  I've changed my mind about far more momentous things in my
life.

Maybe someday I'll still care, but the OTHER advantages of a system
(like git) that doesn't, over all the ones that do, will outweigh the
advantages I gain from that distinction.  Someday I might need such
ultra-expressive ways of comparing branches, and bzr won't have grown
them yet.  Someday I might reach a point where bzr's performance due
to the choice of storage structures or implementation language or
developer habits or whatever else just doesn't cut the mustard, and
git's does.  Someday, some set of other advantages may make it
worthwhile for me to give up my preciouss mainline no matter how much
I might still crave it.

But I can only work from today.  Today, I do care.  Today, it's well
worth whatever I give up to get it.  And I like that my tool makes
that caring easy for me.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
@ 2006-10-24  0:26                                                                     ` Matthew D. Fuller
  2006-10-24  0:38                                                                       ` Matthew D. Fuller
  2006-10-24  0:47                                                                       ` Carl Worth
  2006-10-24  0:39                                                                     ` Martin Langhoff
                                                                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-24  0:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> The problem? How do you show a commit that is _common_ to two
> branches, but has different revision names in them?

Why would you?


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

* Re: VCS comparison table
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
@ 2006-10-24  0:38                                                                       ` Matthew D. Fuller
  2006-10-24  5:42                                                                         ` Linus Torvalds
  2006-10-24  0:47                                                                       ` Carl Worth
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-24  0:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git, Erik Bågfors

On Mon, Oct 23, 2006 at 07:26:57PM -0500 I heard the voice of
Matthew D. Fuller, and lo! it spake thus:
> On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> > 
> > The problem? How do you show a commit that is _common_ to two
> > branches, but has different revision names in them?
> 
> Why would you?

I beg your pardon; that was awful ambiguous of me.  I meant "In such a
case, where the whole purpose of what you're doing is to look
at multiple branches to see relationships between them, why WOULD you
be using branch-local identifiers for revisions at all?"


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
@ 2006-10-24  0:39                                                                     ` Martin Langhoff
  2006-10-24  7:52                                                                       ` Erik Bågfors
  2006-10-24  9:30                                                                     ` Jelmer Vernooij
  2006-10-25 18:41                                                                     ` Aaron Bentley
  3 siblings, 1 reply; 1752+ messages in thread
From: Martin Langhoff @ 2006-10-24  0:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, Jakub Narebski, bazaar-ng, git

On 10/24/06, Linus Torvalds <torvalds@osdl.org> wrote:
> On Tue, 24 Oct 2006, Erik Bågfors wrote:
> >
> > I don't see any problem doing a "gitk --all" equivalent in bzr.
>
> The problem? How do you show a commit that is _common_ to two branches,
> but has different revision names in them?

Erik,

coming from an Arch background, I understand the whole per-branch
commitids approach. After using GIT for a while, you start realising
that it tries to pin down things in the wrong place.

This is especially visible if you run `gitk --all` before and after a
merge. Or on a project with many merges (if you can, get a checkout of
git itself, and browse its history with gitk).

Before the merge, you see

 --o--o--o--o
    \
     \--o--o

and after

 --o--o--o--o
    \        \
     \--o--o--o

Now, after it's merged somewhere, both commits are part of its
history, regardless of where they come from. And it is very clear if
two branches have been merging and remerging.

Where a commit originated does not matter. And fancy
repo-and-branch-centric names get in the way. A lot. And they're
mostly meaningless as soon as you put what matters in the commit
message. Which means that that bit of metadata that you are hoping
that the revno keeps "indirectly" isn't lost on cherry picking.

I guess that's where I used to find revnos useful as they contained
some basic metadata. With bzr it seems to be author-repo-branch where
branch is hopefully "line of work" but all of that can be (and should
be) in the commit message.

You can see similar info in the first part of the commit message for
most git-hosted projects. It'll say something like

   cvsserver: fix the frobnicator to be sequential

which means that at that point, you could be working in a branch
called "fix-this-fscking-thing-attempt524" and no-one would know ;-)

And in a few years (even months) time, that bit of metadata you were
hoping to keep is totally irrelevant. What you have in the commit
message remains relevant and useful.

cheers,


martin

* Re: VCS comparison table
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
  2006-10-24  0:38                                                                       ` Matthew D. Fuller
@ 2006-10-24  0:47                                                                       ` Carl Worth
  2006-10-24  7:31                                                                         ` Erik Bågfors
  2006-10-24 21:51                                                                         ` Erik Bågfors
  1 sibling, 2 replies; 1752+ messages in thread
From: Carl Worth @ 2006-10-24  0:47 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git, Erik Bågfors

[-- Attachment #1: Type: text/plain, Size: 930 bytes --]

On Mon, 23 Oct 2006 19:26:57 -0500, "Matthew D. Fuller" wrote:
>
> On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> >
> > The problem? How do you show a commit that is _common_ to two
> > branches, but has different revision names in them?
>
> Why would you?

Assume you've got two long-lived branches and one periodically gets
merged into the other one. The combined history might look as follows
(more recent commits first):

 f   g
 |   |
 d   e
 |\ /
 b c
 |/
 a

The point is that it is extremely nice to be able to visualize things
that way. Say I've got a "dev" branch that points at f and a "stable"
branch that points at g. With this, a command like:

	gitk dev stable

would result in a picture just like the above. Can a similar figure be
made with bzr? Or only the following two separate pictures:

 f    g
 |    |
 d    e
 |\   |
 b c  c
 |/   |
 a    a
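
For what it's worth, the combined picture and the naming problem can
both be sketched in a few lines of Python.  This is a toy model under
stated assumptions, not how gitk or bzr work internally: the `parents`
table encodes the history drawn above, treating b as the first parent
of d is my assumption, and `mainline_revnos` simply numbers each
branch's first-parent line from the root.

```python
# The example history; each commit lists its parents, first parent first.
# Which parent of d comes first is an assumption made for illustration.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["c"],
    "f": ["d"],
    "g": ["e"],
}
branches = {"dev": "f", "stable": "g"}


def ancestors(tips):
    """All commits reachable from any of `tips`, each counted once."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def mainline_revnos(tip):
    """Number the first-parent line of `tip`, starting from 1 at the root."""
    line = []
    while True:
        line.append(tip)
        if not parents[tip]:
            break
        tip = parents[tip][0]
    return {commit: n for n, commit in enumerate(reversed(line), start=1)}


combined = ancestors(branches.values())       # one graph, as in the first figure
shared = ancestors(["f"]) & ancestors(["g"])  # commits common to both branches
dev = mainline_revnos("f")
stable = mainline_revnos("g")

print(sorted(combined))
print(sorted(shared))
print("c on dev:", dev.get("c"), "| c on stable:", stable.get("c"))
```

With this numbering, commit c is 2 on stable but has no mainline
number at all on dev, where 2 means b instead.  Drawing the single
combined graph needs one name per commit that both branches agree on.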

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

* Re: VCS comparison table
  2006-10-23 12:54                                                             ` Jakub Narebski
  2006-10-23 15:01                                                               ` James Henstridge
@ 2006-10-24  3:24                                                               ` David Clymer
  1 sibling, 0 replies; 1752+ messages in thread
From: David Clymer @ 2006-10-24  3:24 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 3190 bytes --]

On Mon, 2006-10-23 at 14:54 +0200, Jakub Narebski wrote:
> On Mon, Oct 23, 2006 David Clymer wrote:
> > On Sun, 2006-10-22 at 22:06 +0200, Jakub Narebski wrote:
> >> David Clymer wrote:
> 
> >>> 2. bzr does not support fully distributed development because revnos
> >>> "don't work" as stated in #1.
> >>
> >> Bazaar is biased towards centralized/star-topology development if we
> >> want to use revnos. In fully distributed configuration there is no
> >> "simple namespace".
> > 
> > So revnos aren't globally meaningful in fully distributed settings. So
> > what? I don't see how this translates into bias. There is a lot of
> > functionality provided by bazaar that doesn't really apply to my use
> > case, but it doesn't mean that it is indicative of some bias in bazaar.
> 
> First, bzr is biased towards using revnos: bzr commands uses revnos
> by default to provide revision (you have to use revid: prefix/operator
> to use revision identifiers), bzr commands outputs revids only when
> requested, examples of usage uses revision numbers.

Agreed. Of course, I want the simplest case to be the simplest. When
working on my own branch, regardless if it is a standalone project or
part of a distributed one, I don't want to have to type SHA hashes or
revids. Numbers serve my purposes best in this case. When I communicate
with other distributed developers, I can and should use revids.

> 
> In order to use revnos as _global_ identifiers in distributed development,
> you need central "branch", mainline, to provide those revnos. You have
> either to have access to this "revno server" and refer to revisions by
> "revno server" URL and revision number, or designate one branch as holding
> revision numbers ("revno server") and preserve revnos on "revno server"
> by using bzr "merge", while copying revnos when fetching by using bzr "pull"
> for leaf branches. In short: for revnos to be global identifiers you need
> star-topology.

Ok. Let's not repeat this again. I think I said this once, and you've
said it in two following emails. It's a given. Assume that we all know
it.

> 
> Even if you use revnos only locally, you need to know which revisions are
> "yours", i.e. beside branch as DAG of history of given revision you need
> "ordered series of revisions" (to quote Bazaar-NG wiki Glossary), or path
> through this diagram from given revision to one of the roots (initial,
> parentless revisions). Because bzr does that by preserving mentioned path
> as first-parent path (treating first parent specially), i.e. storing local
> information in a DAG (which is shared), to preserve revnos you need to
> use "merge" instead of "pull", which means that you get empty-merge in
> clearly fast-forward case. This means "local changes bias", which some
> might take as not being fully distributed.

"local changes bias" I can buy that. I even like it. I don't even care
if that makes bazaar "not fully distributed." I don't think the
distinction between "fully" and "almost, except for some technicality"
distributed is one that has much practical value.

-davidc
-- 
gpg-key: http://www.zettazebra.com/files/key.gpg

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

* Re: VCS comparison table
  2006-10-24  0:38                                                                       ` Matthew D. Fuller
@ 2006-10-24  5:42                                                                         ` Linus Torvalds
  2006-10-24  5:47                                                                           ` Shawn Pearce
  2006-10-24 16:46                                                                           ` Matthew D. Fuller
  0 siblings, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-24  5:42 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Jakub Narebski, bazaar-ng, git, Erik Bågfors



On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

> On Mon, Oct 23, 2006 at 07:26:57PM -0500 I heard the voice of
> Matthew D. Fuller, and lo! it spake thus:
> > On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> > Linus Torvalds, and lo! it spake thus:
> > > 
> > > The problem? How do you show a commit that is _common_ to two
> > > branches, but has different revision names in them?
> > 
> > Why would you?
> 
> I beg your pardon; that was awful ambiguous of me.  I meant "In such a
> case, where the whole purpose of what you're doing is to look
> at multiple branches to see relationships between them, why WOULD you
> be using branch-local identifiers for revisions at all?"

Well, I would use the globally unique ones, certainly. It's the only thing 
that makes sense.

However, I'd also argue that once you start doing that, _mixing_ the 
globally unique and stable ones and the "simple" ones is a mistake: you'd 
be better off having told your users to use the global ones from the very 
beginning, and trying to make _those_ as simple to use as possible.

Because once you start using both, you're just going to confuse your users 
horribly, and they'll consider the globally unique one really irritating, 
because they're used to using something totally different in most other 
contexts.

Using the _same_ names everywhere is just better. 

		Linus

* Re: VCS comparison table
  2006-10-24  5:42                                                                         ` Linus Torvalds
@ 2006-10-24  5:47                                                                           ` Shawn Pearce
  2006-10-24 16:46                                                                           ` Matthew D. Fuller
  1 sibling, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-24  5:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Erik Bågfors, bazaar-ng, git, Matthew D. Fuller,
	Jakub Narebski

Linus Torvalds <torvalds@osdl.org> wrote:
> Using the _same_ names everywhere is just better. 

I find that it is simpler too.  :-)

* Re: VCS comparison table
  2006-10-20  9:43                     ` Matthieu Moy
@ 2006-10-24  6:02                       ` Lachlan Patrick
  2006-10-24  6:23                         ` Shawn Pearce
  2006-10-24  6:31                         ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Lachlan Patrick @ 2006-10-24  6:02 UTC (permalink / raw)
  To: bazaar-ng, git

Matthieu Moy wrote:
> Sean <seanlkml@sympatico.ca> writes:
>> We don't need plugins to extend features, we just add the feature to
>> the source.  The example I asked about earlier is a case in point. 
>> Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
>> was implemented as a command without any issue at all,
> 
> I'd compare bzr's plugins to Firefox extensions.

So, bzr's plug-in architecture provides a 'protocol' for communicating
with bzr? Or is it functionally the same as a Python module which is
loaded after being named on the bzr command-line (or placed in a special
folder) then executed along with all the other plug-ins? I'm trying to
understand if writing a plug-in is any simpler than understanding the
bzr source code.

Can I ask the git folks what Sean meant in the above about a 'command'.
Are you talking about shell scripts? Is 'git' the only program you need?

AFAIK, 'bzr' is the sole program in Bazaar, and everything is done with
command line options to bzr. Is that true of git? To what extent is git
tied to a [programmable] shell? I've heard someone say there's no
Windows version of git for some reason, can someone elaborate?

Ta,
Loki

* Re: VCS comparison table
  2006-10-24  6:02                       ` Lachlan Patrick
@ 2006-10-24  6:23                         ` Shawn Pearce
  2006-10-24  6:31                         ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-24  6:23 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git

Lachlan Patrick <loki@research.canon.com.au> wrote:
> Can I ask the git folks what Sean meant in the above about a 'command'.
> Are you talking about shell scripts? Is 'git' the only program you need?

'git' is actually two things:

  1) It's a wrapper command which executes 'git-foo' if you call it
     with 'foo' as its first parameter.  It searches for 'git-foo'
     in the GIT_EXEC_PATH environment variable, which has a default
     set at compile time, usually to the directory you are going to
     install Git into.

  2) It's most of the core Git plumbing.  There are currently around 48
     'builtin' commands.  These are things which 'git' knows how to do
     without executing another program.  If you look at the installation
     these 48 builtin commands are just hardlinks back to 'git'.  For
     example 'git-update-index' is really just a hardlink back to 'git'
     and 'git' knows to perform the update index logic when it's called
     as either 'git-update-index' or as 'git update-index'.

We're moving more towards #2, but there are still a large number
of commands which fall into #1.
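
The two roles can be sketched as a small dispatch function.  The
following Python is only a model of the behaviour described above, not
git's actual C code: the `BUILTINS` set is a made-up subset, `resolve`
is a hypothetical name, and the exec path is passed in as a plain
argument rather than read from GIT_EXEC_PATH.

```python
import os

# Hypothetical subset of builtins; the real list lives in git's source.
BUILTINS = {"update-index", "log", "diff"}


def resolve(argv, exec_path):
    """Return ("builtin", name) or ("exec", path) for a command line."""
    invoked_as = os.path.basename(argv[0])
    if invoked_as.startswith("git-"):
        # Called through a hardlink such as 'git-update-index'.
        name = invoked_as[len("git-"):]
    else:
        # Called as 'git foo': the subcommand is the first parameter.
        name = argv[1]
    if name in BUILTINS:
        return ("builtin", name)
    # Anything else is an external program named 'git-<name>'.
    return ("exec", os.path.join(exec_path, "git-" + name))


print(resolve(["git", "update-index"], "/opt/git"))
print(resolve(["/usr/bin/git-update-index"], "/opt/git"))
print(resolve(["git", "fetch"], "/opt/git"))
```

Both spellings of update-index land on the same builtin, while fetch
falls through to an external 'git-fetch' in the exec path.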
 
> AFAIK, 'bzr' is the sole program in Bazaar, and everything is done with
> command line options to bzr. Is that true of git?

No.  In Git at least half of the things Git can do are not builtin to
'git' and thus require exec()'ing an external program (e.g. git-fetch).
However these often appear as though they are command line options to
'git' as 'git fetch' just means exec 'git-fetch' (by #1 above).

On the other hand there are a wide range of tools which are more or
less the same thing, just with different options applied to them.
All of the diff programs, log, whatchanged, show - these are all
just variations on a theme.  Their individual implementations are
very tiny as they all use the same library code.

> To what extent is git
> tied to a [programmable] shell?

Git is still very much tied to a shell.  For example 'git commit'
is really the shell script 'git-commit'.  This is a rather long
shell script and it does a lot of things for the user; not having
it would make Git useless for most people.  It also has not been
rewritten in C.  There is a roadmap however to convert it to C to
help remove the programmable shell requirement and people have been
slowly performing the (rather tedious) conversion work.

> I've heard someone say there's no
> Windows version of git for some reason, can someone elaborate?

Git runs on Cygwin.  But there's no native Win32 (without Cygwin)
version of Git because:

 - Git uses POSIX APIs and expects POSIX behavior from the OS it's
   running on.  Without a compatibility layer to make Windows act
   like UNIX Git won't run.  Cygwin happens to be a really good
   compatibility layer.

 - Git requires a Bourne shell for many of its important tools,
   such as 'git commit'.  Windows lacks such a program, at least
   out of the box, but it's in Cygwin.

 - Git relies on a helper program called 'merge' to perform three
   way file merges.  This tool may or may not be ported to native
   Win32 (I don't know) but it is in Cygwin.

 - Git requires some libraries for certain features, such as libexpat
   or libcurl.  I don't know if these are available for native Win32
   but they are available on Cygwin.

 - Windows isn't the primary target platform for many of the Git
   contributors.  Some consider the fact that it even runs there
   at all a minor miracle, and that's only possible due to the hard
   work the Cygwin folks have done.

 - ... I'm sure there's other reasons ...

-- 
Shawn.

* Re: VCS comparison table
  2006-10-24  6:02                       ` Lachlan Patrick
  2006-10-24  6:23                         ` Shawn Pearce
@ 2006-10-24  6:31                         ` Linus Torvalds
  2006-10-24  6:45                           ` David Rientjes
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-24  6:31 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git



On Tue, 24 Oct 2006, Lachlan Patrick wrote:
> 
> Can I ask the git folks what Sean meant in the above about a 'command'.
> Are you talking about shell scripts? Is 'git' the only program you need?

Historically, "git" was _only_ a wrapper program. When you did

	git log

it just executed the real program called "git-log", which was often a 
shell-script. That was just so that things could easily be extended, and 
you could use shell-script for simple one-liner things, and native C for 
more "core" stuff.

For example, "git log" used to be a one-line shell-script that just did

	git-rev-list --pretty HEAD | LESS=-S ${PAGER:-less}

but it ended up being a lot more capable, and eventually just rewritten 
as an internal command.

These days, most of the simple things like "git log" are all built into 
the "git" program, although for anything not built in, it still acts as 
just a wrapper, which allows not only random functionality to still be 
written in shell (or sometimes perl), but also ends up being the simplest 
possible plug-in mechanism: you can define your own commands by just 
writing a shell-script thing, calling it "git-mycommand", installing it in 
the proper place, and it ends up being accessible as "git mycommand".

That allows for easy prototyping in your language of choice.
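
As a rough illustration of that plug-in mechanism, here is a Python
sketch that installs a script called "git-mycommand" into a directory
and then finds it by name.  It assumes a POSIX system and does not
invoke git at all; `install_command` and `find_command` are
hypothetical helpers, and the temporary directory merely stands in for
"the proper place".

```python
import os
import shutil
import stat
import tempfile


def install_command(directory, name, body):
    """Write an executable shell script named 'git-<name>' into `directory`."""
    path = os.path.join(directory, "git-" + name)
    with open(path, "w") as fh:
        fh.write("#!/bin/sh\n" + body + "\n")
    # Mark the script executable for its owner.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def find_command(name, search_path):
    """Look up 'git-<name>' by scanning `search_path` for an executable."""
    return shutil.which("git-" + name, path=search_path)


with tempfile.TemporaryDirectory() as bindir:
    installed = install_command(bindir, "mycommand", "echo hello from mycommand")
    found = find_command("mycommand", bindir)
    missing = find_command("nosuchcommand", bindir)

print(found == installed, missing)
```

Nothing has to be registered anywhere: the file name and the
executable bit are the whole interface in this model.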

> AFAIK, 'bzr' is the sole program in Bazaar, and everything is done with
> command line options to bzr. Is that true of git? To what extent is git
> tied to a [programmable] shell? I've heard someone say there's no
> Windows version of git for some reason, can someone elaborate?

Almost all of "core" git is pure C, which unlike something like python or 
perl obviously tends to have a fair amount of system issues. That said, 
much of it really is fairly portable, so doing the built-in git stuff 
should _largely_ work even natively under Windows with some effort.

The problem ends up being that few enough people seem to develop under 
Windows, and the cygwin port works better (because it handles a number of 
the portability issues and also handles the scripts that are still shell). 
Those two issues seem to mean that not a lot of effort has been put into 
aiming for a native windows binary (or into moving away from shell 
scripts).

Most of the shell scripts really are fairly simple. So if somebody 
_really_ wanted to, it would probably not be hard to spend some effort to 
either just write them as C and turn them into built-ins, or port them
to some other scripting language.

Of course, most Windows users don't seem to really want a command line 
interface at all. IDE integration would appear to be more interesting to 
some people.

		Linus

* Re: VCS comparison table
  2006-10-24  6:31                         ` Linus Torvalds
@ 2006-10-24  6:45                           ` David Rientjes
       [not found]                             ` <Pine.LNX.4.64.0610240812410.3962@g5.osdl.org>
                                               ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: David Rientjes @ 2006-10-24  6:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Lachlan Patrick, bazaar-ng, git

On Mon, 23 Oct 2006, Linus Torvalds wrote:

> Historically, "git" was _only_ a wrapper program. When you did
> 
> 	git log
> 
> it just executed the real program called "git-log", which was often a 
> shell-script. That was just so that things could easily be extended, and 
> you could use shell-script for simple one-liner things, and native C for 
> more "core" stuff.
> 
> For example, "git log" used to be a one-line shell-script that just did
> 
> 	git-rev-list --pretty HEAD | LESS=-S ${PAGER:-less}
> 
> but it ended up being a lot more capable, and eventually just rewritten 
> as an internal command.
> 

Some of the internal commands that have been coded in C are actually much 
better handled by the shell in the first place.  It's much simpler to 
write and extend as well as being much more traceable for runtime 
problems.  The shell commands that would be used for most of these git
routines have options for requesting more verbose output, so the user
actually has a lot more power over reporting and/or logging.  In addition,
shell tends to be more portable, and the amount of code is drastically reduced
in a script style of programming.  The criticisms against such use of
shell scripting tend to be a matter of personal taste.  People believe,
for some reason or another, that it is a lower-class type of programming 
that is less robust and is harder to understand.  Seldom have there been 
cogent arguments for coding such features in C as opposed to shell 
scripting, especially in the case of git where the shell becomes a very 
powerful ally.

		David

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  0:47                                                                       ` Carl Worth
@ 2006-10-24  7:31                                                                         ` Erik Bågfors
  2006-10-24 21:51                                                                         ` Erik Bågfors
  1 sibling, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-24  7:31 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, bazaar-ng, git, Matthew D. Fuller, Jakub Narebski

On 10/24/06, Carl Worth <cworth@cworth.org> wrote:
> On Mon, 23 Oct 2006 19:26:57 -0500, "Matthew D. Fuller" wrote:
> >
> > On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> > Linus Torvalds, and lo! it spake thus:
> > >
> > > The problem? How do you show a commit that is _common_ to two
> > > branches, but has different revision names in them?
> >
> > Why would you?
>
> Assume you've got two long-lived branches and one periodically gets
> merged into the other one. The combined history might look as follows
> (more recent commits first):
>
>  f   g
>  |   |
>  d   e
>  |\ /
>  b c
>  |/
>  a
>
> The point is that it is extremely nice to be able to visualize things
> that way. Say I've got a "dev" branch that points at f and a "stable"
> branch that points at g. With this, a command like:
>
>         gitk dev stable
>
> would result in a picture just like the above. Can a similar figure be
> made with bzr? Or only the following two separate pictures:

The above picture can easily be created with bzr if you have a
utility/plugin that does it. None exists yet, but there would be
no problem writing one.

Of course, in such a context revision numbers have no use.  But see,
revision numbers are not mandatory in bzr, so that's not a problem.

I haven't really had a need for such a tool, but I do see where it can
be very useful to have.

>  f    g
>  |    |
>  d    e
>  |\   |
>  b c  c
>  |/   |
>  a    a
>

This is what you would get if you visualize the two separate branches,
and not the common repository.

/Erik

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  0:39                                                                     ` Martin Langhoff
@ 2006-10-24  7:52                                                                       ` Erik Bågfors
  2006-10-24  8:37                                                                         ` Jakub Narebski
  2006-10-24 10:11                                                                         ` Martin Langhoff
  0 siblings, 2 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-24  7:52 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

On 10/24/06, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> On 10/24/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > On Tue, 24 Oct 2006, Erik Bågfors wrote:
> > >
> > > I don't see any problem doing a "gitk --all" equivalent in bzr.
> >
> > The problem? How do you show a commit that is _common_ to two branches,
> > but has different revision names in them?
>
> Eric,

It's Erik :)

> coming from an Arch background, I understand the whole per-branch
> commitids approach. After using GIT for a while, you start realising
> that it tries to pin down things in the wrong place.
>
> This is specially visible if you run `gitk --all` before and after a
> merge. Or on a project with many merges (if you can, get a checkout of
> git itself, and browse its history with gitk).
>
> Before the merge, you see
>
>  --o--o--o--o
>     \
>      \--o--o
>
> and after
>
>  --o--o--o--o
>     \        \
>      \--o--o--o
>
> Now, after it's merged somewhere, both commits are part of its
> history, regardless of where they come from. And it is very clear if
> two branches have been merging and remerging.
>
> Where a commit originated does not matter. And fancy
> repo-and-branch-centric names get in the way. A lot. And they're
> mostly meaningless as soon as you put what matters in the commit
> message. Which means that that bit of metadata that you are hoping
> that the revno keeps "indirectly" isn't lost on cherry picking.

Let's make one thing clear.  Revnos are NOT stored with the revision,
they are not "names" of the revision.  They are basically just
shortcuts to specific revisions, that only make sense in the context
of a branch.

As human beings this is something we are very used to in everyday
life. I don't always call my friends by first name and surname, I
just use first name or even "mate".  As long as it's clear who I'm
talking about in that context.  If there are multiple people with the
same first name, then we might have to use the surname as well.

Same with bzr. In the context of a branch, revnos work as shortcuts
to the revision id.  In the context of multiple branches, they don't.
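
To make the "shortcut" idea concrete, here is a small sketch in Python. It
assumes revnos simply count commits along the leftmost-parent path of a
branch, as described in this thread; it is not bzr's code, and mainline and
revnos are made-up helper names. The history is the f/g example Carl Worth
drew earlier in the thread.

```python
# Parent links (child -> list of parents, leftmost parent first).
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["c"],
    "f": ["d"],
    "g": ["e"],
}


def mainline(tip, parents=PARENTS):
    """Follow leftmost parents from tip back to the root; return oldest first."""
    line = []
    commit = tip
    while commit is not None:
        line.append(commit)
        first = parents[commit][:1]
        commit = first[0] if first else None
    line.reverse()
    return line


def revnos(tip, parents=PARENTS):
    """Number the mainline of a branch 1, 2, 3, ... starting at the root."""
    return {commit: n for n, commit in enumerate(mainline(tip, parents), start=1)}


dev_revnos = revnos("f")      # branch "dev" points at f
stable_revnos = revnos("g")   # branch "stable" points at g
```

In this sketch commit c is number 2 on "stable" but has no mainline number
at all on "dev", even though it is part of dev's history. The ids (a..g)
are the same in both branches; only the numbering differs.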

I think they do serve a good purpose but I don't really think that we
absolutely need them either.

> I guess that's where I used to find revnos useful as they contained
> some basic metadata. With bzr it seems to be author-repo-branch where
> branch is hopefully "line of work" but all of that can be (and should
> be) in the commit message.
>
> You can see similar info in the first part of the commit message for
> most git-hosted projects. It'll say something like
>
>    cvsserver: fix the frobnicator to be sequential
>
> which means that at that point, you could be working in a branch
> called "fix-this-fscking-thing-attempt524" and no-one would know ;-)
>
> And in a few years (even months) time, that bit of metadata you were
> hoping to keep is totally irrelevant. What you have in the commit
> message remains relevant and useful.

I'm not even going to try to understand the argument here, as it is
about a totally different thing and doesn't make any sense to me.

I think this discussion is getting out of hand.

There are a few things that are being discussed
1. Revnos are bad/good
2. treating "leftmost" parent special is bad/good
3. plugins are useless/useful
4. And now, storing branch information should be done manually (if
wanted) and not automatically.

1. I don't really care, I haven't seen any confusion based on it, but
I don't have a very strong opinion about it either.
2. This is something I do care about.  For me, this is the only
logical way of doing it. It might be because I am used to it now, but
when I started to look at bzr/hg/git/darcs/etc, I just got a so much
more clear view of the history when running a standard log command,
that it was one of the first things that attracted me to bzr. This is
just a user talking.
There might be technical reasons why it's better to not do it, but for
me it works the way I expect, therefore I'm happy
3. This is just silly
4. No comment.

/Erik

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  7:52                                                                       ` Erik Bågfors
@ 2006-10-24  8:37                                                                         ` Jakub Narebski
  2006-10-24 10:11                                                                         ` Martin Langhoff
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-24  8:37 UTC (permalink / raw)
  To: Erik Bågfors; +Cc: Martin Langhoff, Linus Torvalds, bazaar-ng, git

Erik Bågfors wrote:
> I think this discussion is getting out of hand.
> 
> There are a few things that are being discussed
> 1. Revnos are bad/good
> 2. treating "leftmost" parent special is bad/good
> 3. plugins are useless/useful
> 4. And now, storing branch information should be done manually (if
> wanted) and not automatically.
> 
> 1. I don't really care, I haven't seen any confusion based on it, but
> I don't have a very strong opinion about it either.

To use revnos[*1*] you have to treat a branch as a path through the DAG.
Bzr does that by treating the first parent specially, which leads to empty
merges in the fast-forward case.

Using revnos as implemented in bzr leads to some (perhaps unforeseen)
consequences.

[*1*] Meaning that revnos won't change on you.

> 2. This is something I do care about.  For me, this is the only
> logical way of doing it. It might be because I am used to it now, but
> when I started to look at bzr/hg/git/darcs/etc, I just got a so much
> more clear view of the history when running a standard log command,
> that it was one of the first things that attracted me to bzr. This is
> just a user talking.

Git has the reflog for when you are interested in branch tip history
(it also stores the "reason" for each branch tip change: pull, amending
a commit, rebase,...). Unfortunately, git doesn't have a git-ref-log
command (or a --ref option to git-log) to display the reflog in a
user-friendly format.

Git users are used to relying on graphical history viewers (mainly gitk and 
qgit, but there are also gitview, tig and git-browser) to get a 
clear view of history, a view that log cannot provide.

That said, I _think_ that caring about "branch identity" is just 
something you are used to, perhaps because bzr doesn't have git's wonderful 
log limiting specifiers, aka builtin git log searching (a..b, 
a...b, --max-count, -- <path>, --committer, --grep etc.).

> There might be technical reasons why it's better to not do it, but for
> me it works the way I expect, therefore I'm happy

I think it would be better to maintain "branch identity" separately and 
not in the DAG, but that might have other problems I have not seen.

> 3. This is just silly

I think the discussion/arguments were twofold. 

First, Bazaar-NG has plugin infrastructure "for free" because it is 
written in Python, which allows module loading and monkey-patching. 
Git core is written in C, and git is not yet fully libified.

Second, everything that can be done with plugins, except for core changes, 
can be done in Git by writing scripts (this also allows for fast prototyping). 
Everything except core changes can also be done by writing a few lines of C, 
but you have to compile against some version of Git, and you don't get the 
advantages of a builtin command; git is an OSS project.

> 4. No comment.

Storing branch information could be done automatically on demand ;-)
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
  2006-10-24  0:39                                                                     ` Martin Langhoff
@ 2006-10-24  9:30                                                                     ` Jelmer Vernooij
  2006-10-26 15:22                                                                       ` Aaron Bentley
  2006-10-25 18:41                                                                     ` Aaron Bentley
  3 siblings, 1 reply; 1752+ messages in thread
From: Jelmer Vernooij @ 2006-10-24  9:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 978 bytes --]

On Mon, Oct 23, 2006 at 04:24:30PM -0700, Linus Torvalds wrote:
> On Tue, 24 Oct 2006, Erik Bågfors wrote:
> > I don't see any problem doing a "gitk --all" equivalent in bzr.
> The problem? How do you show a commit that is _common_ to two branches, 
> but has different revision names in them?
It'll have the same revision name. The revision no's will be
different, sure, but that's not a problem.

> Do you _finally_ see what is so wrong with this whole per-branch naming?
revnos are the only naming bit that is branch-specific.

I guess one way of looking at revnos is to regard them completely as a 
command-line ui thing.  They're not explicitly stored anywhere on
disk but just an easy way for users to refer to revisions on a
per-branch basis. 

The graphical frontends to bzr, for example, don't know about revno's but 
only about revids.

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://jelmer.vernstok.nl/
Currently playing: 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
                                                                                 ` (2 preceding siblings ...)
  2006-10-23 22:45                                                               ` Jakub Narebski
@ 2006-10-24  9:51                                                               ` Matthieu Moy
  2006-10-24 10:27                                                                 ` Jakub Narebski
  2006-10-25 10:52                                                               ` Andreas Ericsson
  4 siblings, 1 reply; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-24  9:51 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

"Matthew D. Fuller" <fullermd@over-yonder.net> writes:

>> For example, how long does it take to do an arbitrary "undo" (ie
>> forcing a branch to an earlier state) [...]
>
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

There are two things to do:

* Mark the tree as corresponding to a different revision in the past.
  This is roughly "echo 'revision@id-123' > .bzr/checkout/last-revision"
  in bzr. Obviously, writing the file is O(1), but to compute the
  revision identifier when you say "bzr switch -r 42" (I'm not sure
  switch accepts this BTW), you have to load the revision history.
  Indeed, bzr would load it anyway to make sure that the revision you
  switch to is in the revision history.

  In bzr, you have .bzr/branch/revision-history for each branch, which
  is a newline-separated list of revision-identifiers. In the case of
  bzr.dev, for example, this file is 112KB as of now. This is
  O(history), with "history" being the length of the path from HEAD to
  the initial commit, following the leftmost ancestor (i.e. number of
  revisions in a centralized workflow, and less than this otherwise).
  That said, the constant factor is very small. For example, on
  bzr.dev, I did "grep -n some-rev-id" (which does revid-to-revno), it
  takes 0.004 seconds (Vs 0.003 seconds to grep in /dev/null
  instead ;-) ), so you'd need many orders of magnitude before this
  becomes a limitation.

  Linus's point AIUI is that this will _never_ be a limitation of git.

* Then, do the "merge" to make your tree up to date. You can hardly do
  faster than git and its unpacked format, but this is at the cost of
  disk space. But as you say, in almost any modern VCS, that's
  O(diff). In a space-efficient format, that's just the tradeoff you
  make between full copies of a file and delta-compression.
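
The revision-history lookup in the first point can be sketched like this in
Python. It assumes only what is said above, namely that the file is a
newline-separated list of revision identifiers with the oldest first; the
function names and the sample identifiers are made up, and this is not
bzr's code.

```python
def revno_to_revid(history_text, revno):
    """Revnos are 1-based positions in the newline-separated history."""
    revids = history_text.splitlines()
    if not 1 <= revno <= len(revids):
        raise IndexError(f"no revision {revno} in this branch")
    return revids[revno - 1]


def revid_to_revno(history_text, revid):
    """Linear scan over the history, i.e. O(history), like the grep -n test."""
    for number, candidate in enumerate(history_text.splitlines(), start=1):
        if candidate == revid:
            return number
    return None  # the revision is not on this branch's mainline


# Hypothetical three-revision history file contents.
sample = "rev-a\nrev-b\nrev-c\n"
second = revno_to_revid(sample, 2)
position = revid_to_revno(sample, "rev-c")
```

Both directions have to read the whole file first, which is where the
O(history) cost comes from, small constant or not.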

-- 
Matthieu

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  7:52                                                                       ` Erik Bågfors
  2006-10-24  8:37                                                                         ` Jakub Narebski
@ 2006-10-24 10:11                                                                         ` Martin Langhoff
  1 sibling, 0 replies; 1752+ messages in thread
From: Martin Langhoff @ 2006-10-24 10:11 UTC (permalink / raw)
  To: Erik Bågfors; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

On 10/24/06, Erik Bågfors <zindar@gmail.com> wrote:
> It's Erik :)

Sorry Erik!

> Let's make one thing clear.  Revnos are NOT stored with the revision,
> they are not "names" of the revision.  They are basically just
> shortcuts to specific revisions, that only make sense in the context
> of a branch.

My bad. The revnos examples discussed looked quite Arch-like. As Arch
took them seriously, I thought bzr did too.

Probably quite a few people here thought as much, and got hot under
the t-shirt about it ;-)

Now, the thing about the shorthand is that we have quite a few means
of using shorthand in GIT that don't rely on revnos. We have the whole
^branchname stuff. And when you are looking at gitk it's pretty
obvious which are your recent "local" commits.


...


> 2. treating "leftmost" parent special is bad/good

> 2. This is something I do care about.  For me, this is the only
> logical way of doing it. It might be because I am used to it now, but
> when I started to look at bzr/hg/git/darcs/etc, I just got a so much
> more clear view of the history when running a standard log command,
> that it was one of the first things that attracted me to bzr. This is
> just a user talking.
> There might be technical reasons why it's better to not do it, but for
> me it works the way I expect, therefore I'm happy

Can you give us a quick example of why you got such a clearer picture?

> 3. plugins are useless/useful

Hmmmm. It's more of a unix/C/pipes tradition vs dynamically typed &
compiled scripting language tradition.

> 4. And now, storing branch information should be done manually (if
> wanted) and not automatically.

> 4. No comment.

Probably not. But if someone is using branchnames to identify "lines
of work" and hoping that metadata will remain attached there, it's
probably a bad long-term approach.

But following what you said earlier about that info being transient
and "local", then I was 200% wrong, and thinking of Arch/Bazaar usage
patterns.

cheers,


martin

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  9:51                                                               ` Matthieu Moy
@ 2006-10-24 10:27                                                                 ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-24 10:27 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthieu Moy wrote:
> "Matthew D. Fuller" <fullermd@over-yonder.net> writes:
> 
>>> For example, how long does it take to do an arbitrary "undo" (ie
>>> forcing a branch to an earlier state) [...]
>>
>> I don't understand the thrust of this, either.  As I understand the
>> operation you're talking about, it doesn't have anything to do with a
>> branch; you'd just be whipping the working tree around to different
>> versions.  That should be O(diff) on any modern VCS.

> There are two things to do:
>
> * Mark the tree as corresponding to a different revision in the past.
[...]
> * Then, do the "merge" to make your tree up to date. You can hardly do
>   faster than git and its unpacked format, but this is at the cost of
>   disk space. But as you say, in almost any modern VCS, that's
>   O(diff). In a space-efficient format, that's just the tradeoff you
>   make between full copies of a file and delta-compression.

Actually, this would be "checkout" (in git terminology), i.e. overwriting
the files which differ between the current revision and the revision we
rewind (undo) to. (That's of course a simplification, omitting for example
removing and creating files.) That would be O(changed files), which is the
lower bound; it cannot be done faster. Finding which files changed is also
O(changed files), with a little bit of O(directory depth) in git, with a
very small constant.

And even in the case of packed format, it wouldn't be O(diff)/O(history),
but O(delta length) where delta length is the maximum length of a delta chain
in a pack, by default set to 10. Well, the constant is a bit larger because git
additionally gzip-compresses (even in loose, i.e. unpacked format).
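
A rough sketch of why finding the changed files is cheap when directories
are content-addressed: a subtree whose id is equal on both sides can be
skipped without looking inside it. This is an illustration in Python, not
git's implementation; make_tree and changed_paths are made-up names, and
the blob ids are placeholder strings rather than real SHA-1s.

```python
import hashlib


def make_tree(entries):
    """Build a directory node; entries maps name -> blob id (str) or subtree (dict)."""
    digest = hashlib.sha1()
    for name in sorted(entries):
        child = entries[name]
        child_id = child["id"] if isinstance(child, dict) else child
        digest.update(name.encode() + b"\0" + child_id.encode() + b"\n")
    return {"id": digest.hexdigest(), "entries": entries}


def changed_paths(old, new, prefix=""):
    """Yield paths that differ between two trees, pruning identical subtrees."""
    if old is not None and new is not None and old["id"] == new["id"]:
        return  # same id, same content: nothing below needs visiting
    old_entries = old["entries"] if old is not None else {}
    new_entries = new["entries"] if new is not None else {}
    for name in sorted(set(old_entries) | set(new_entries)):
        a, b = old_entries.get(name), new_entries.get(name)
        path = prefix + name
        a_tree = a if isinstance(a, dict) else None
        b_tree = b if isinstance(b, dict) else None
        if a_tree is not None or b_tree is not None:
            yield from changed_paths(a_tree, b_tree, path + "/")
        a_blob = a if isinstance(a, str) else None
        b_blob = b if isinstance(b, str) else None
        if a_blob != b_blob:
            yield path  # file added, removed, or modified


v1 = make_tree({
    "Makefile": "blob-1",
    "src": make_tree({"main.c": "blob-2", "util.c": "blob-3"}),
    "doc": make_tree({"README": "blob-4"}),
})
v2 = make_tree({
    "Makefile": "blob-1",
    "src": make_tree({"main.c": "blob-5", "util.c": "blob-3"}),
    "doc": make_tree({"README": "blob-4"}),
})
changed = list(changed_paths(v1, v2))
```

Here the "doc" directory is never descended into, because its id is the
same in both versions; only the directories on the path to a changed file
are opened.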
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  6:45                           ` David Rientjes
       [not found]                             ` <Pine.LNX.4.64.0610240812410.3962@g5.osdl.org>
       [not found]                             ` <"Pine.LNX.4.64.0610240812410.3962"@g5.osdl.org>
@ 2006-10-24 15:15                             ` Linus Torvalds
  2006-10-24 20:12                               ` David Rientjes
  2006-10-26  2:29                             ` Linus Torvalds
  3 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-24 15:15 UTC (permalink / raw)
  To: David Rientjes; +Cc: Lachlan Patrick, bazaar-ng, git



On Mon, 23 Oct 2006, David Rientjes wrote:
> 
> Some of the internal commands that have been coded in C are actually much 
> better handled by the shell in the first place.  It's much simpler to 
> write and extend as well as being much more traceable for runtime 
> problems.

Yes. However, from a portability (to Windows) standpoint, shell is just 
about the worst choice.

Not that perl/python/etc really help - unless the _whole_ program is one 
perl/python thing. Windows just doesn't like pipelines etc very much.

So I'd like all the _common_ programs to be built-ins..

		Linus

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  0:26                                                                 ` Matthew D. Fuller
@ 2006-10-24 15:58                                                                   ` David Lang
  2006-10-24 16:34                                                                     ` Matthew D. Fuller
  0 siblings, 1 reply; 1752+ messages in thread
From: David Lang @ 2006-10-24 15:58 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

> But I don't understand how bzr-the-abstract-data-model makes such
> things impossible, or even significantly different than doing so in
> git.  In git, you're just chopping off one DAG where another one
> intersects it (or similar operations).  To do it in bzr, you'd do...
> exactly the same thing.  The revnos, or the mainline, are completely
> useless in such an operation of course, but they don't hurt it; the
> tool would just just ignore them like it does the SHA-1 of files in
> the revision.

one key difference is that with bzr you have to do this chopping by creating the 
branches at the time changes are done, with git you do this chopping after the 
fact when you are displaying the results.

As such you can chop and compare things in ways that were never contemplated by 
anyone at the time changes are made.

>
>> See? When you visualize multiple branches together, HAVING
>> PER-BRANCH REVISION NUMBERS IS INSANE! Yet, clearly, it's a valid
>> and interesting operation to do.
>
> I wouldn't be so absolutist about it, but certainly they're of
> extremely limited utility if of any at all in such cases.  And yes, it
> can be an interesting operation.  But what does that have to do with
> using revnos in other cases?  You keep saying "having" where I would
> say "using".

and the bzr tools strongly encourage the use of these numbers

> I care about that first parent line.  Therefore, I require my tool to
> at least _pretend_ to care.  I'm not aware of any way in which the
> fundamental bzr structures care, but the UI is chock full of
> pretending.  A necessary part of that pretending is not changing my
> mainline unless I specifically ask for it, and that means a
> merge-vs-pull distinction needs to be there.  That's a _technical_
> sign that the tool is ready to work with me the way I want to work.  A
> lack of it is a _technical_ sign that it's not suitable.

nobody is saying that the bzr approach is invalid for your workflow.

what people are saying is that it doesn't easily support a truly distributed 
workflow. this is a very different statement.

your workflow isn't truly distributed so bzr's model works well for you. no 
problem, just don't claim that because you haven't run into any problems with 
your workflow that there are no problems with bzr with other workflows.

David Lang

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24 15:58                                                                   ` David Lang
@ 2006-10-24 16:34                                                                     ` Matthew D. Fuller
  2006-10-24 18:03                                                                       ` David Lang
  0 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-24 16:34 UTC (permalink / raw)
  To: David Lang; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, Oct 24, 2006 at 08:58:56AM -0700 I heard the voice of
David Lang, and lo! it spake thus:
> 
> one key difference is that with bzr you have to do this chopping by
> creating the branches at the time changes are done,

HUH?  Why on earth do you think that?

To do this in a git data model, you point at 2 (or 3, or 4, or...)
revisions, anywhere in the revision-space universe.  You derive back a
DAG of the history from each of them by recursing over parent links.
You figure out where (if anywhere) those DAG's intersect.  And based
on that, you alter what and how you display; including or excluding
certain revs, changing the angles of lines or columnation of dots in a
graph, etc.

To do it in a bzr data model, you would follow *EXACTLY* the same
steps.  As in, you do EXACTLY (a), then EXACTLY (b), then...
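
Spelled out as a sketch in Python, the steps are: walk parent links from
each tip, intersect the resulting sets, and include or exclude revisions
based on that. Nothing here depends on revnos or on which VCS stores the
parent links. The helper names are made up, and the history is the f/g
example from earlier in the thread.

```python
# Parent links (child -> list of parents).
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["c"],
    "f": ["d"],
    "g": ["e"],
}


def ancestors(tip, parents=PARENTS):
    """Return every revision reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen


def common(tips, parents=PARENTS):
    """Revisions shared by the histories of all the given tips."""
    return set.intersection(*(ancestors(tip, parents) for tip in tips))


def only_in(tip, other, parents=PARENTS):
    """Revisions in the history of tip that are not in the history of other."""
    return ancestors(tip, parents) - ancestors(other, parents)


shared = common(["f", "g"])
dev_only = only_in("f", "g")
stable_only = only_in("g", "f")
```

A viewer would then draw the shared revisions once and hang the two
branch-specific sets off them.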


> what people are saying is that it doesn't easily support a truly
> distributed workflow. this is a very different statement.

And it's one that carries around a lot of unstated assumptions about
what "truly distributed" means, which *I*'m certainly not
understanding, because any meaning I can apply to the term doesn't
lead me to the conclusions it does you.  Certainly, depending on your
workflow, certain parts of the UI are of lesser utility than they are
in mine, down to and including zero.  And it's probably certain that
some parts of the UI aren't up to handling various workflows, too,
including OUR workflow.  That's kinda what "in development" means...

But that's a very different statement from the claim that they CAN'T
be without changes to the conceptual model underneath.  Just because a
UI is built around maintaining the fiction of a mainline doesn't mean
the system requires it.  All you'd have to do to abandon it is write a
different log formatter that didn't show revnos and didn't nest merge
commits, and change (or add an option to) 'merge' to fast-forward if
possible.  The difference between the views on how the pieces should
fit together really IS just that fine.
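
For what "fast-forward if possible" would mean in practice, here is a
minimal sketch in Python. It is not taken from either tool; is_ancestor and
merge are hypothetical helpers, and the commit names are invented.

```python
def is_ancestor(candidate, tip, parents):
    """True if candidate is tip itself or reachable from tip via parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit == candidate:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False


def merge(current, other, parents, merge_name):
    """Return the new tip of the current branch after merging other into it."""
    if is_ancestor(other, current, parents):
        return current  # nothing to do: other is already in our history
    if is_ancestor(current, other, parents):
        return other    # fast-forward: move the tip, record no merge commit
    # Histories diverged: record a merge commit with current as first parent.
    parents[merge_name] = [current, other]
    return merge_name


# Hypothetical history: a <- b <- c on one line, x branching off from a.
history = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
fast_forwarded = merge("b", "c", history, "m1")   # b is an ancestor of c
merged = merge("c", "x", history, "m2")           # c and x have diverged
```

A tool that always preserves the mainline would skip the second branch of
merge and record a merge commit even when a fast-forward was possible,
which is where the "empty merges" mentioned elsewhere in the thread come
from.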


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24  5:42                                                                         ` Linus Torvalds
  2006-10-24  5:47                                                                           ` Shawn Pearce
@ 2006-10-24 16:46                                                                           ` Matthew D. Fuller
  1 sibling, 0 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-24 16:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git, Erik Bågfors

On Mon, Oct 23, 2006 at 10:42:23PM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> Well, I would use the globally unique ones, certainly. It's the only
> thing that makes sense.

So would I, and it is.


> Using the _same_ names everywhere is just better. 

This is just where we split on it.  All else being equal, sure, but
all else is never equal.  Most of my time is spent working forward
along one branch (different branches at different times, of course,
but at any given moment I'm almost certainly only concerned about one
branch), and having a different and advantageous localized naming
scheme there is a benefit I celebrate.  If most of my time were
instead spent comparing and contrasting and intersecting and
cross-breeding branches, it would probably be as worthless to me as it
apparently is to you.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24 16:34                                                                     ` Matthew D. Fuller
@ 2006-10-24 18:03                                                                       ` David Lang
  2006-10-24 18:25                                                                         ` Jakub Narebski
  2006-10-25  0:27                                                                         ` Matthew D. Fuller
  0 siblings, 2 replies; 1752+ messages in thread
From: David Lang @ 2006-10-24 18:03 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, 24 Oct 2006, Matthew D. Fuller wrote:

> On Tue, Oct 24, 2006 at 08:58:56AM -0700 I heard the voice of
> David Lang, and lo! it spake thus:
>>
>> one key difference is that with bzr you have to do this chopping by
>> creating the branches at the time changes are done,
>
> HUH?  Why on earth do you think that?
>
> To do this in a git data model, you point at 2 (or 3, or 4, or...)
> revisions, anywhere in the revision-space universe.  You derive back a
> DAG of the history from each of them by recursing over parent links.
> You figure out where (if anywhere) those DAG's intersect.  And based
> on that, you alter what and how you display; including or excluding
> certain revs, changing the angles of lines or columnation of dots in a
> graph, etc.
>
> To do it in a bzr data model, you would follow *EXACTLY* the same
> steps.  As in, you do EXACTLY (a), then EXACTLY (b), then...

it sounded like you were saying that the way to get the slices of the DAG was to 
use branches in bzr. to do this you need to create the branches with the correct 
info on each branch. this is only practical if the branches are created as the 
changes are made, if you try to do this after the fact you need to create the 
changes in the branch before you do the slicing.

with git you can look at the DAG and pick any arbitrary points in it as points 
to use for the slicing at display time.

>> what people are saying is that it doesn't easily support a truly
>> distributed workflow. this is a very different statement.
>
> And it's one that carries around a lot of unstated assumptions about
> what "truly distributed" means, which *I*'m certainly not
> understanding, because any meaning I can apply to the term doesn't
> lead me to the conclusions it does you.  Certainly, depending on your
> workflow, certain parts of the UI are of lesser utility than they are
> in mine, down to and including zero.  And it's probably certain that
> some parts of the UI aren't up to handling various workflows, too,
> including OUR workflow.  That's kinda what "in development" means...
>
> But that's a very different statement from the claim that they CAN'T
> be without changes to the conceptual model underneath.  Just because a
> UI is built around maintaining the fiction of a mainline doesn't mean
> the system requires it.  All you'd have to do to abandon it is write a
> different log formatter that didn't show revnos and didn't nest merge
> commits, and change (or add an option to) 'merge' to fast-forward if
> possible.  The difference between the views on how the pieces should
> fit together really IS just that fine.

the claim isn't that bzr can't be modified to support these other workflows (it 
sounds as if just changing the tools to use the internal revids rather than the 
current revnos would come very close to solving this problem), it's that the 
current revnos (use of which is strongly encouraged by the current UI) cannot 
support some workflows, and therefore the claim that it supports fully 
distributed workflows as well as git is false.

remember that this entire thing started with a feature comparison checklist; 
the definitions of some of the items on the checklist are being questioned.

after that there's the issue of whether the VCS in question has the feature.

this discussion started with two topologies

1. Centralized: all commits must go to one repository, connectivity required to check-in 
2. Distributed: everything else

since then one additional topology has been defined, and one has been redefined

1. Centralized: all commits must go to one repository, connectivity required to check-in

2. Star: one repository is 'special' or 'primary' and all other repositories 
sync to this, but development can take place against local repositories, 
connectivity is only required when syncing the repositories. as updates take 
place the history is defined by the primary repository, which can overwrite or 
change the history as defined by local repositories.

3. Distributed: all repositories are equal (any definition of 'primary' is a 
matter of convention, not a requirement of the tool) development can take place 
against local repositories, connectivity is only required when syncing the 
repositories. repositories with no development taking place can sync back and 
forth with no side effects. History displays the same thing no matter what 
repository is looked at (allowing for the fact that some repositories may not 
have the full history).

everyone agrees that bzr supports the Star topology. Most people (including bzr 
people) seem to agree that currently bzr does not support the Distributed 
topology.

it's just fine for bzr to not support all possible topologies, the only reason 
for discussing these issues (besides everyone understanding each other) is the 
feature checklist that started this entire thread, and what is appropriate there 
for each VCS (see the early part of this discussion to see how that worked with 
git's rename support)

David Lang


* Re: VCS comparison table
  2006-10-24 18:03                                                                       ` David Lang
@ 2006-10-24 18:25                                                                         ` Jakub Narebski
  2006-10-24 19:27                                                                           ` Petr Baudis
  2006-10-25  0:27                                                                         ` Matthew D. Fuller
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-24 18:25 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

David Lang wrote:

> 1. Centralized: all commits must go to one repository, connectivity
> required to check-in 

Bazaar-NG "light checkouts" implements this. Git doesn't support this
topology, and probably wouldn't.

1.5. Disconnected centralized. Like centralized, but you can work (perhaps
limited to what you can do) even without connection to central server.
Minimally you have to be able to commit changes locally, if central server
is not available. Bzr "normal/heavyweight checkouts" are [roughly] about
this. Git "lazy clone" proposal is about similar thing; you can get git to
support this model (although without space savings) with full 
clone + hooks.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-24 18:25                                                                         ` Jakub Narebski
@ 2006-10-24 19:27                                                                           ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-24 19:27 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Dear diary, on Tue, Oct 24, 2006 at 08:25:53PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> David Lang wrote:
> 
> > 1. Centralized: all commits must go to one repository, connectivity
> > required to check-in 
> 
> Bazaar-NG "light checkouts" implements this. Git doesn't support this
> topology, and probably wouldn't.
> 
> 1.5. Disconnected centralized. Like centralized, but you can work (perhaps
> limited to what you can do) even without connection to central server.
> Minimally you have to be able to commit changes locally, if central server
> is not available. Bzr "normal/heavyweight checkouts" are [roughly] about
> this. Git "lazy clone" proposal is about similar thing; you can get git to
> support this model (although without space savings) with full 
> clone + hooks.

Cogito can do it now out of the box, having support for cg-commit --push
and cg-update preserving uncommitted local changes.

Not that you probably should use it. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: VCS comparison table
  2006-10-24 15:15                             ` Linus Torvalds
@ 2006-10-24 20:12                               ` David Rientjes
  2006-10-24 20:28                                 ` Jakub Narebski
  2006-10-25  8:48                                 ` Jeff King
  0 siblings, 2 replies; 1752+ messages in thread
From: David Rientjes @ 2006-10-24 20:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

On Tue, 24 Oct 2006, Linus Torvalds wrote:

> Yes. However, from a portability (to Windows) standpoint, shell is just 
> about the worst choice.
> 
> Not that perl/python/etc really help - unless the _whole_ program is one 
> perl/python thing. Windows just doesn't like pipelines etc very much.
> 
> So I'd like all the _common_ programs to be built-ins..
> 

And I would prefer the opposite because we're talking about git.  As an 
information manager, it should be seen and not heard.  Nobody is going to 
spend their time to become a git or CVS or perforce expert.  As an 
individual primarily interested in development, I should not be required 
to learn command lines for dozens of different git-specific commands to do 
my job quickly and effectively.  I would opt for a much simpler 
approach and deal with shell scripting for many of these commands because 
I'm familiar with them and I can pipe any command with the options I 
already know and have used before to any other command.

As a developer on Linux based systems, I should not need to deal with 
code in a revision control system that is longer and less traceable 
because the authors of that system decided they wanted to support Windows 
too.  Moving away from the functionality that the shell provides is a 
mistake for a system such as git where it could be so advantageous because 
of the inherent nature of git as an information manager.

This is the reason why I was a fan of git long ago and used it for my own 
needs before tons of unnecessary features and unneeded complexity was 
added on.

		David





* Re: VCS comparison table
  2006-10-24 20:12                               ` David Rientjes
@ 2006-10-24 20:28                                 ` Jakub Narebski
  2006-10-25  8:48                                 ` Jeff King
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-24 20:28 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

David Rientjes wrote:

> This is the reason why I was a fan of git long ago and used it for my own 
> needs before tons of unnecessary features and unneeded complexity was 
> added on.

But you can still use git as you used it a long time ago. The plumbing
commands didn't vanish. Git got rich in porcelain-ish commands, true, but the old
core remains. And GIT_TRACE (quite a new addition) certainly helps.

I think git profited very much from being created bottom-up, from the main idea of
SCM, through repository format and structure, through plumbing commands,
through porcelain done with scripts, to having many new plumbing commands,
to having many commands builtin, and in the future perhaps to libification.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: VCS comparison table
  2006-10-24  0:47                                                                       ` Carl Worth
  2006-10-24  7:31                                                                         ` Erik Bågfors
@ 2006-10-24 21:51                                                                         ` Erik Bågfors
  2006-10-25 12:41                                                                           ` Andreas Ericsson
  1 sibling, 1 reply; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-24 21:51 UTC (permalink / raw)
  To: Carl Worth
  Cc: Matthew D. Fuller, Linus Torvalds, bazaar-ng, git, Jakub Narebski

Sorry for going back to an old mail... but....

On 10/24/06, Carl Worth <cworth@cworth.org> wrote:
> On Mon, 23 Oct 2006 19:26:57 -0500, "Matthew D. Fuller" wrote:
> >
> > On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> > Linus Torvalds, and lo! it spake thus:
> > >
> > > The problem? How do you show a commit that is _common_ to two
> > > branches, but has different revision names in them?
> >
> > Why would you?
>
> Assume you've got two long-lived branches and one periodically gets
> merged into the other one. The combined history might look as follows
> (more recent commits first):
>
>  f   g
>  |   |
>  d   e
>  |\ /
>  b c
>  |/
>  a
>
> The point is that it is extremely nice to be able to visualize things
> that way. Say I've got a "dev" branch that points at f and a "stable"
> branch that points at g. With this, a command like:
>
>         gitk dev stable
>
> would result in a picture just like the above. Can a similar figure be
> made with bzr? Or only the following two separate pictures:

I wanted to test how hard it is. So I created a small plugin that will
show the relationships between revisions... The following commands

bzr init-repo repo --trees
bzr init repo/branchA
cd repo/branchA
bzr whoami --branch "Test Devel 1 <test1@devel.com>"
bzr ci --unchanged -m a1
bzr ci --unchanged -m a2
bzr branch . ../branchB
bzr ci --unchanged -m a3
bzr ci --unchanged -m a4
cd ../branchB
bzr whoami --branch "Test Devel 2 <test2@devel.com>"
bzr ci --unchanged -m b1
bzr ci --unchanged -m b2
bzr merge ../branchA
bzr ci -m merge
bzr ci --unchanged -m b3
bzr ci --unchanged -m b4
cd ../branchA
bzr merge ../branchB
bzr ci -m merge
bzr ci --unchanged -m a5
cd ../branchB
bzr ci --unchanged -m b5
cd ..
bzr dotrepo > test.dot
dot -Tpng test.dot > dotrepo.png

Creates the picture you can see at
http://erik.bagfors.nu/bzr-plugins/dotrepo.png

Please remember that this is a 15 min implementation and as such might
suck (the output is not perfect for example, it's slow, etc).  This
just brings in every revision in the entire repo, but expanding it to
take just the branches named on the command line is perfectly possible.

But still.. there is no problem to create this.
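
As a rough illustration of the idea (this is not the dotrepo plugin, whose
code is not shown here), the Python sketch below turns a map of revisions to
parents into Graphviz DOT text. The function name to_dot and the sample
revision names are assumptions made for the example.

```python
def to_dot(parents):
    """Render a {revision: [parent, ...]} map as Graphviz DOT text."""
    lines = ["digraph history {"]
    for rev in sorted(parents):
        lines.append(f'  "{rev}";')
        # One edge per parent link, pointing from child to parent.
        for parent in parents[rev]:
            lines.append(f'  "{rev}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical two-branch history with one merge revision.
sample = {"a1": [], "a2": ["a1"], "b1": ["a2"], "merge": ["b1", "a2"]}
dot_text = to_dot(sample)
```

Writing dot_text to a file and running "dot -Tpng" on it, as in the commands
above, would give a picture of the sample history.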

/Erik
ps. the plugin can be bzr branched from
http://erik.bagfors.nu/bzr-plugins/dotrepo/
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com


* [PATCH] xdiff: Do not consider lines starting by # hunkworthy
@ 2006-10-25  0:07 Petr Baudis
  2006-10-25  0:16 ` Junio C Hamano
  2006-10-25  0:17 ` [PATCH] xdiff: Do not consider lines starting by # hunkworthy Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-25  0:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This will be probably controversial but in my personal experience, the
amount of time this is the right thing to do because of #defines is negligible
compared to amount of time it is wrong, especially because of #ifs and #endifs
in the middle of functions and also because of comments at the line start when
it concerns non-C files.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 xdiff/xemit.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 714c563..4f20075 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -86,8 +86,7 @@ static void xdl_find_func(xdfile_t *xf, 
 		if (len > 0 &&
 		    (isalpha((unsigned char)*rec) || /* identifier? */
 		     *rec == '_' ||	/* also identifier? */
-		     *rec == '(' ||	/* lisp defun? */
-		     *rec == '#')) {	/* #define? */
+		     *rec == '(')) {	/* #define? */
 			if (len > sz)
 				len = sz;


* Re: [PATCH] xdiff: Do not consider lines starting by # hunkworthy
  2006-10-25  0:07 [PATCH] xdiff: Do not consider lines starting by # hunkworthy Petr Baudis
@ 2006-10-25  0:16 ` Junio C Hamano
  2006-10-25  0:28   ` [PATCH] xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of lines Petr Baudis
  2006-10-25  0:17 ` [PATCH] xdiff: Do not consider lines starting by # hunkworthy Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25  0:16 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis <pasky@suse.cz> writes:

> This will be probably controversial but in my personal experience, the
> amount of time this is the right thing to do because of #defines is negligible
> compared to amount of time it is wrong, especially because of #ifs and #endifs
> in the middle of functions and also because of comments at the line start when
> it concerns non-C files.
>
> Signed-off-by: Petr Baudis <pasky@suse.cz>
> ---
>
>  xdiff/xemit.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index 714c563..4f20075 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -86,8 +86,7 @@ static void xdl_find_func(xdfile_t *xf, 
>  		if (len > 0 &&
>  		    (isalpha((unsigned char)*rec) || /* identifier? */
>  		     *rec == '_' ||	/* also identifier? */
> -		     *rec == '(' ||	/* lisp defun? */
> -		     *rec == '#')) {	/* #define? */
> +		     *rec == '(')) {	/* #define? */
>  			if (len > sz)
>  				len = sz;
>  			if (len && rec[len - 1] == '\n')

I'd either omit the opening parenthesis or fix the comment ;-).

More seriously, I'd rather just match default GNU diff behaviour
to use isalpha, underscore or '$'.  I do not particularly like
to have '$' but I feel it is the easiest to match a prior art in
cases like this because I do not have to defend my position when
somebody says "Why do you include '#'???  It makes no sense!".
Since I do not care too much about it, being able to just say
"Well we match what GNU diff does by default." is a good thing.

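To make the difference between the two rules concrete, here is a small Python
model of the heuristic under discussion. It is a simplified sketch, not a port
of xdl_find_func(): the helper names and the sample file are invented, and the
only thing taken from the thread is which first characters count.

```python
def is_candidate(line, extra):
    """True if the first character is an ASCII letter, '_' or one of extra."""
    if not line:
        return False
    first = line[0]
    return (first.isascii() and first.isalpha()) or first == "_" or first in extra

def find_header(lines, before, extra):
    """Search backwards from index before - 1 for the nearest candidate line."""
    for i in range(before - 1, -1, -1):
        if is_candidate(lines[i], extra):
            return lines[i]
    return None

# Hypothetical C file with a preprocessor conditional inside a function.
SAMPLE = [
    "#include <stdio.h>",
    "int main(void)",
    "{",
    "#ifdef DEBUG",
    '    puts("debug");',
    "#endif",
    "}",
]

old_rule = find_header(SAMPLE, 5, extra="(#")  # rule before the patch
gnu_rule = find_header(SAMPLE, 5, extra="$")   # GNU diff default, as described above
```

With the old rule the "#ifdef" line is picked for a hunk starting at "#endif";
with the GNU-style rule the search continues up to the function line.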




* Re: [PATCH] xdiff: Do not consider lines starting by # hunkworthy
  2006-10-25  0:07 [PATCH] xdiff: Do not consider lines starting by # hunkworthy Petr Baudis
  2006-10-25  0:16 ` Junio C Hamano
@ 2006-10-25  0:17 ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25  0:17 UTC (permalink / raw)
  To: git

Petr Baudis wrote:

> -                    *rec == '(' ||     /* lisp defun? */
> -                    *rec == '#')) {    /* #define? */
> +                    *rec == '(')) {    /* #define? */
>                         if (len > sz)

Shouldn't it be:

+                    *rec == '(')) {    /* lisp defun? */
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: VCS comparison table
  2006-10-24 18:03                                                                       ` David Lang
  2006-10-24 18:25                                                                         ` Jakub Narebski
@ 2006-10-25  0:27                                                                         ` Matthew D. Fuller
  2006-10-25 22:40                                                                           ` David Lang
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-25  0:27 UTC (permalink / raw)
  To: David Lang; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, Oct 24, 2006 at 11:03:20AM -0700 I heard the voice of
David Lang, and lo! it spake thus:
> 
> it sounded like you were saying that the way to get the slices of
> the DAG was to use branches in bzr. [...]

I'm not entirely sure I understand what you mean here, but I think
you're saying "Nobody's written the code in bzr to show arbitrary
slices of the DAG", which is true TTBOMK.


> everyone agrees that bzr supports the Star topology. Most people
> (including bzr people) seem to agree that currently bzr does not
> support the Distributed topology.

I think this statement arouses so much grumbling because (a) bzr does
support such a lot better than often seems implied, (b) where it
doesn't, the changes needed to do so are relatively minor (often
merely cosmetic), and (c) disagreement over whether some of the
qualifications included for 'distributed' are really fundamental.


> it's just fine for bzr to not support all possible topologies,

I think there's a real intent for bzr TO support at least all common
topologies.  I'll buy that current development has focused more on
[relatively] simple topologies than the more wildly complex ones.  I
look forward to more addressing of the less common cases as the tool
matures, and I think a lot of this thread will be good material to
work with as that happens.  It's just the suggestion that providing
fruit for simple topologies _necessarily_ prejudices against complex
ones that I find so onerous.


> (besides everyone understanding each other)

That's a good enough reason for me.  Before this thread, I wasn't
interested in using git.  I'm still not, but now I understand much
better /why/ I'm not.  And when (I'm sure it'll happen sooner or
later) some project I follow picks up using git, I'll have enough
grounding in the tool's mental model to work with it when I have to.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/


* [PATCH] xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of lines
  2006-10-25  0:16 ` Junio C Hamano
@ 2006-10-25  0:28   ` Petr Baudis
  2006-10-25  1:33     ` Horst H. von Brand
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-25  0:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This removes the '#' and '(' tests and adds a '$' test instead although I have
no idea what it is actually good for - but hey, if that's what GNU diff does...

Pasky only went and did as Junio sayeth.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 xdiff/xemit.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/xdiff/xemit.c b/xdiff/xemit.c
index 714c563..4139d55 100644
--- a/xdiff/xemit.c
+++ b/xdiff/xemit.c
@@ -86,8 +86,7 @@ static void xdl_find_func(xdfile_t *xf, 
 		if (len > 0 &&
 		    (isalpha((unsigned char)*rec) || /* identifier? */
 		     *rec == '_' ||	/* also identifier? */
-		     *rec == '(' ||	/* lisp defun? */
-		     *rec == '#')) {	/* #define? */
+		     *rec == '$')) {	/* mysterious GNU diff's invention */
 			if (len > sz)
 				len = sz;


* Re: [PATCH] xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of lines
  2006-10-25  0:28   ` [PATCH] xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of lines Petr Baudis
@ 2006-10-25  1:33     ` Horst H. von Brand
  2006-10-25  2:18       ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-25  1:33 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, git

Petr Baudis <pasky@suse.cz> wrote:
> This removes the '#' and '(' tests and adds a '$' test instead although I
> have no idea what it is actually good for - but hey, if that's what GNU
> diff does...

$ starts a shell (or Perl) variable...

> Pasky only went and did as Junio sayeth.

Horst adds a guess...
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239


* Re: [PATCH] xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of lines
  2006-10-25  1:33     ` Horst H. von Brand
@ 2006-10-25  2:18       ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25  2:18 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: Petr Baudis, git

"Horst H. von Brand" <vonbrand@inf.utfsm.cl> writes:

> Petr Baudis <pasky@suse.cz> wrote:
>> This removes the '#' and '(' tests and adds a '$' test instead although I
>> have no idea what it is actually good for - but hey, if that's what GNU
>> diff does...
>
> $ starts a shell (or Perl) variable...
>
>> Pasky only went and did as Junio sayeth.
>
> Horst adds a guess...

If I have to guess, I think that is from VMS.


* Re: VCS comparison table
  2006-10-24 20:12                               ` David Rientjes
  2006-10-24 20:28                                 ` Jakub Narebski
@ 2006-10-25  8:48                                 ` Jeff King
       [not found]                                   ` < Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
                                                     ` (2 more replies)
  1 sibling, 3 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-25  8:48 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Tue, Oct 24, 2006 at 01:12:52PM -0700, David Rientjes wrote:

> And I would prefer the opposite because we're talking about git.  As an 
> information manager, it should be seen and not heard.  Nobody is going to 
> spend their time to become a git or CVS or perforce expert.  As an 
> individual primarily interested in development, I should not be required 
> to learn command lines for dozens of different git-specific commands to do 
> my job quickly and effectively.  I would opt for a much simpler 
> approach and deal with shell scripting for many of these commands because 
> I'm familiar with them and I can pipe any command with the options I 
> already know and have used before to any other command.

I don't understand how converting shell scripts to C has any impact
whatsoever on the usage of git. The plumbing shell scripts didn't go
away; you can still call them and they behave identically.

Is there some specific change in functionality that you're lamenting?

> As a developer on Linux based systems, I should not need to deal with 
> code in a revision control system that is longer and less traceable 
> because the authors of that system decided they wanted to support Windows 
> too.  Moving away from the functionality that the shell provides is a 
> mistake for a system such as git where it could be so advantageous because 
> of the inherent nature of git as an information manager.

Some shell->C conversions may have made the code "longer and less
traceable." However, many of those conversions caused the code to be
shorter (because communication between C functions is simpler than going
over pipes, and because anything involving a data structure more complex
than a string is difficult in the shell) and more robust (fewer
opportunities for quoting/parsing errors, and none of the shell gotchas
like missing the error code in "foo | bar").
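
The "foo | bar" gotcha can be reproduced with a short Python sketch. It
assumes a POSIX sh is available on PATH; the helper name pipeline_status is
made up for the example, and "false" and "true" stand in for foo and bar.

```python
import subprocess

def pipeline_status(script):
    """Run a snippet through sh and return the exit status the caller sees."""
    return subprocess.run(["sh", "-c", script]).returncode

# In a pipeline only the last command's status is reported, so the
# failure of the first command goes unnoticed.
masked = pipeline_status("false | true")

# Run one after the other, the failure is visible to the caller.
visible = pipeline_status("false && true")
```

Here masked ends up as 0 even though the first command failed, while visible
is non-zero.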

Do you have any specific reason to believe that the git code is of worse
quality now than it was before?

> This is the reason why I was a fan of git long ago and used it for my own 
> needs before tons of unnecessary features and unneeded complexity was 
> added on.

Is there something you used to do with git that you no longer can? Is
there a reason you can't ignore the newer commands?



* Re: VCS comparison table
  2006-10-25  8:48                                 ` Jeff King
       [not found]                                   ` < Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
@ 2006-10-25  9:19                                   ` David Rientjes
  2006-10-25  9:32                                     ` Jakub Narebski
  2006-10-25  9:49                                     ` Jeff King
  2006-10-25 21:08                                   ` Junio C Hamano
  2 siblings, 2 replies; 1752+ messages in thread
From: David Rientjes @ 2006-10-25  9:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Wed, 25 Oct 2006, Jeff King wrote:

> I don't understand how converting shell scripts to C has any impact
> whatsoever on the usage of git. The plumbing shell scripts didn't go
> away; you can still call them and they behave identically.
> 
> Is there some specific change in functionality that you're lamenting?
> 

No, my criticism is against the added complexity which makes the 
modification of git increasingly difficult with every new release.  It's a 
pretty limited use case of the entire package, I'm sure, but one of the 
major advantages that I saw in git early on was the ability to tailor it 
to your own personal needs very easily with some simple shell knowledge 
and enough C that was required at the time.

> Some shell->C conversions may have made the code "longer and less
> traceable." However, many of those conversions caused the code to be
> shorter (because communication between C functions is simpler than going
> over pipes, and because anything involving a data structure more complex
> than a string is difficult in the shell) and more robust (fewer
> opportunities for quoting/parsing errors, and none of the shell gotchas
> like missing the error code in "foo | bar").
> 

You're ignoring the advantageous nature of the shell with regard to git.  
The shell is so much better prepared to deal with information managers by 
nature than the C programming language.  It's not a matter of shorter 
code, per se, it's about the developer's ability to make small changes to 
the operation of the information manager on demand to tailor to his or her 
_current_ needs.  For any experienced shell programmer it is so much 
easier to go in and change an option or pipe to a different command or 
comment out a simple shell command in a .sh file than editing the C code.  
And sometimes it's necessary to have several different variations of that 
command which is very easy with slightly renamed .sh files instead of 
adding on more and more flags to commands that have become so complex at 
this point that it's difficult to know the basics of how to manage a 
project.

This all became very obvious when the tutorials came out on "how to use 
git in 20 commands or less" effectively.  These tutorials shouldn't need 
to exist with an information manager that started as a quick, efficient, 
and _simple_ project.  You're treating git development in the same light 
as you treat Linux development; let's be honest and say that 99% of the 
necessary git functionality was there almost a year ago and ever since 
nothing of absolute necessity has been added that serious developers care 
about in a revision control system.  Look at LKML, nobody is waiting on 
these new releases and upgrading to them when they're announced.  And this 
is the community that git has _targeted_.  Most other projects don't care 
about the syntactics of sign-off lines and acked-by lines and format-patch 
like the git community does.

> Do you have any specific reason to believe that the git code is of worse
> quality now than it was before?
> 

Absolutely.  I think I've actually documented that fairly well.  Back in 
the day git was a very concise, well-written package.  Today, a tour 
through the source code for the latest release leaves a lot to be desired 
for any serious C programmer.

> Is there something you used to do with git that you no longer can? Is
> there a reason you can't ignore the newer commands?
> 

Functionality wise, no.  But in terms of being able to _customize_ my 
version of git depending on how I want to use it, I've lost hope on the 
whole idea.  It's a shame too because it appears as though the original 
vision was one of efficiency and simplicity.  I would say that git-1.2.4 
is my package of preference with some slight tweaking in the branching 
department.

I really do miss the old git.



* Re: VCS comparison table
  2006-10-25  9:19                                   ` David Rientjes
@ 2006-10-25  9:32                                     ` Jakub Narebski
  2006-10-25  9:49                                     ` Jeff King
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25  9:32 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

David Rientjes wrote:

> On Wed, 25 Oct 2006, Jeff King wrote:
> 
>> I don't understand how converting shell scripts to C has any impact
>> whatsoever on the usage of git. The plumbing shell scripts didn't go
>> away; you can still call them and they behave identically.
>> 
>> Is there some specific change in functionality that you're lamenting?
>> 
> 
> No, my criticism is against the added complexity which makes the 
> modification of git increasingly difficult with every new release.  It's a 
> pretty limited use case of the entire package, I'm sure, but one of the 
> major advantages that I saw in git early on was the ability to tailor it 
> to your own personal needs very easily with some simple shell knowledge 
> and enough C that was required at the time.
> 
[...]
>> Is there something you used to do with git that you no longer can? Is
>> there a reason you can't ignore the newer commands?
> 
> Functionality wise, no.  But in terms of being able to _customize_ my 
> version of git depending on how I want to use it, I've lost hope on the 
> whole idea.  It's a shame too because it appears as though the original 
> vision was one of efficiency and simplicity.  I would say that git-1.2.4 
> is my package of preference with some slight tweaking in the branching 
> department.

Aha! So you miss the old script versions of git commands, which you could
easily modify and tailor to your needs, don't you? Well, if you don't mind
keeping your clone of the git repository lying around somewhere, you can always
resurrect the old shell version of some git command, e.g.
  $ git cat-file -p v1.2.4:git-prune.sh > $(git --exec-path)/git-prune.sh
then change its name and modify it as you used to do.

Are there any old commands which stopped working?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 14:08                                                 ` Jakub Narebski
  2006-10-21 20:47                                                 ` Carl Worth
@ 2006-10-25  9:35                                                 ` Andreas Ericsson
  2006-10-25  9:46                                                   ` Jakub Narebski
  2006-10-25  9:57                                                   ` Matthieu Moy
  2 siblings, 2 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-25  9:35 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

Matthew D. Fuller wrote:
> On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
> 
>> (since pull seems the only way to synch up without infinite new
>> merge commits being added back and forth).
> 
> The infinite-merge-commits case doesn't happen in bzr-land because we
> generally don't merge other branches except when the branch owner says
> "Hey, I've got something for you to merge".  If you were to setup a
> script to merge two branches back and forth until they were 'equal',
> yes, it'd churn away until you filled up your disk with the N bytes of
> metadata every new revision uses up.
> 

This is new to me. At work, we merge our toy repositories back and forth 
between devs only. There is no central repo at all. Does this mean that 
each merge would add one extra commit per time the one I'm merging with 
has merged with me?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:35                                                 ` Andreas Ericsson
@ 2006-10-25  9:46                                                   ` Jakub Narebski
  2006-10-25 10:08                                                     ` James Henstridge
  2006-10-25  9:57                                                   ` Matthieu Moy
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25  9:46 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Matthew D. Fuller, Carl Worth, Aaron Bentley, Linus Torvalds,
	bazaar-ng, git

Andreas Ericsson wrote:
> Matthew D. Fuller wrote:
>> On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
>> Carl Worth, and lo! it spake thus:
>> 
>>> (since pull seems the only way to synch up without infinite new
>>> merge commits being added back and forth).
>> 
>> The infinite-merge-commits case doesn't happen in bzr-land because we
>> generally don't merge other branches except when the branch owner says
>> "Hey, I've got something for you to merge".  If you were to setup a
>> script to merge two branches back and forth until they were 'equal',
>> yes, it'd churn away until you filled up your disk with the N bytes of
>> metadata every new revision uses up.
> 
> This is new to me. At work, we merge our toy repositories back and forth 
> between devs only. There is no central repo at all. Does this mean that 
> each merge would add one extra commit per time the one I'm merging with 
> has merged with me?

From what I understand, "bzr merge" will create one extra commit to
preserve the "first parent is my branch" feature. "bzr pull" will do a
fast-forward if your DAG is a proper subset of the pulled branch/repository
DAG, but at the cost of changing your revno-to-revision mapping to that
of the pulled repository.

That's a consequence of recording a branch as "my work", i.e. as a path
through the DAG that treats the first parent as special, instead of
saving that information outside the DAG.
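
To illustrate the revno point, here is a minimal sketch in Python. It
assumes that a revno is simply a revision's position on the first-parent
path from the branch tip back to the origin; the revision names and the
helper functions are made up for the example and are not bzr code.

```python
def mainline(parents, tip):
    """Follow first parents from tip back to the origin, oldest first."""
    path = []
    rev = tip
    while rev is not None:
        path.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None  # only the first parent counts
    path.reverse()
    return path


def revnos(parents, tip):
    """Map revno (1-based position on the mainline) to revision id."""
    return {n: rev for n, rev in enumerate(mainline(parents, tip), start=1)}


# Hypothetical history: both sides started from "base"; the other side
# then merged my work, keeping its own commit as the first parent.
parents = {
    "base": [],
    "mine": ["base"],
    "theirs": ["base"],
    "their-merge": ["theirs", "mine"],
}

before = revnos(parents, "mine")         # my branch before the pull
after = revnos(parents, "their-merge")   # my branch after fast-forwarding

print(before)
print(after)
```

In this toy history revno 2 names "mine" before the pull and "theirs"
after it, and "mine" no longer has a mainline revno at all, although it
is still in the DAG.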

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:19                                   ` David Rientjes
  2006-10-25  9:32                                     ` Jakub Narebski
@ 2006-10-25  9:49                                     ` Jeff King
  2006-10-25 13:49                                       ` Andreas Ericsson
  2006-10-25 17:21                                       ` David Rientjes
  1 sibling, 2 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-25  9:49 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linus Torvalds, bazaar-ng, git

On Wed, Oct 25, 2006 at 02:19:15AM -0700, David Rientjes wrote:

> No, my criticism is against the added complexity which makes the 
> modification of git increasingly difficult with every new release.  It's a 

OK, you seemed to imply problems for end users in your first paragraph,
which is what I was responding to.

> _current_ needs.  For any experienced shell programmer it is so much 
> easier to go in and change an option or pipe to a different command or 
> comment out a simple shell command in a .sh file than editing the C code.  

Yes, it's true that some operations might be easier to play with in the
shell. However, does it actually come up that you want to modify
existing git programs? The more common usage seems to be gluing the
plumbing together in interesting ways, and that is still very much
supported.

> And sometimes it's necessary to have several different variations of that 
> command which is very easy with slightly renamed .sh files instead of 
> adding on more and more flags to commands that have become so complex at 
> this point that it's difficult to know the basics of how to manage a 
> project.

You can do the same thing in C. In fact, look at how similar
git-whatchanged, git-log, and git-diff are.

I don't understand how a shell->C conversion has anything to do with
options being added. If you look at all of the conversions, they
replicate the interface _exactly_.

> This all became very obvious when the tutorials came out on "how to use 
> git in 20 commands or less" effectively.  These tutorials shouldn't need 
> to exist with an information manager that started as a quick, efficient, 
> and _simple_ project.  You're treating git development in the same light 

Sorry, I don't see how this is related to the programming language _at
all_. Are you arguing that the interface of git should be simplified so
that such tutorials aren't necessary? If so, then please elaborate, as
I'm sure many here would like to hear proposals for improvements. If
you're arguing that git now has too many features, then which features
do you consider extraneous?

> as you treat Linux development; let's be honest and say that 99% of the 
> necessary git functionality was there almost a year ago and ever since 
> nothing of absolute necessity has been added that serious developers care 
> about in a revision control system.  Look at LKML, nobody is waiting on 

I don't agree with this. There are tons of enhancements that I find
useful (e.g., '...' rev syntax, rebasing with 3-way merge, etc) that I
think other developers ARE using. There are scalability and performance
improvements. And there are new things on the way (Junio's pickaxe work)
that will hopefully make git even more useful than it already is.

If you don't think recent git versions are worthwhile, then why don't
you run an old version? You can even use git to cherry-pick patches onto
your personal branch.

> Absolutely.  I think I've actually documented that fairly well.  Back in 

Where?

> the day git was a very concise, well-written package.  Today, a tour 
> through the source code for the latest release leaves a lot to be desired 
> for any serious C programmer.

I don't agree, but since you haven't provided anything specific enough
to discuss, there's not much to say.

> Functionality wise, no.  But in terms of being able to _customize_ my 
> version of git depending on how I want to use it, I've lost hope on the 
> whole idea.  It's a shame too because it appears as though the original 

Can you name one customization that you would like to perform now that
you feel can't be easily done (and presumably that would have been
easier in the past)?

-Peff




^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:49                                                             ` Carl Worth
  2006-10-22  0:07                                                               ` Jeff Licquia
  2006-10-22 16:02                                                               ` Petr Baudis
@ 2006-10-25  9:52                                                               ` Andreas Ericsson
  2 siblings, 0 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-25  9:52 UTC (permalink / raw)
  To: Carl Worth; +Cc: Jeff Licquia, Jakub Narebski, bazaar-ng, git

Carl Worth wrote:
> On Sat, 21 Oct 2006 19:42:47 -0400, Jeff Licquia wrote:
>> I don't think so.  Recently, I've been trying to track a particular
>> patch in the kernel.  It was done as a series of commits, and probably
>> would have been its own branch in bzr, but when I was trying to group
>> the commits together to analyze them as a group, the easiest way to do
>> that was by the original committer's name.
> 
> As far as "its own branch in bzr" would such a branch remain available
> indefinitely even after being merged in to the main tree?
> 
>> Now, there's probably a better way to hunt that stuff down, but in this
>> case hunting the user down worked for me.  (It may have made a
>> difference that I was using gitweb instead of a local clone.)
> 
> Vast, huge, gaping, cosmic difference.
> 
> Almost none of the power of git is exposed by gitweb. It's really not
> worth comparing. (Now a gitweb-alike that provided all the kinds of
> very easy browsing and filtering of the history like gitk and git
> might be nice to have.)
> 

There was one, but it got discontinued due to performance issues. Shame 
that, because it would have been nice to have to show "foreign" visitors 
how gitk/qgit works. It would especially show the way git thinks about 
branches and stuff like that.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:35                                                 ` Andreas Ericsson
  2006-10-25  9:46                                                   ` Jakub Narebski
@ 2006-10-25  9:57                                                   ` Matthieu Moy
  1 sibling, 0 replies; 1752+ messages in thread
From: Matthieu Moy @ 2006-10-25  9:57 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Matthew D. Fuller, Carl Worth, Aaron Bentley, Linus Torvalds,
	bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> This is new to me. At work, we merge our toy repositories back and
> forth between devs only. There is no central repo at all. Does this
> mean that each merge would add one extra commit per time the one I'm
> merging with has merged with me?

Two things differ in bzr and git, here:

* bzr doesn't do "autocommit" after a merge. So, new revisions are
  created only if you use "commit".

* bzr has two commands, "pull" and "merge". "pull" just does what the
  git people call "fast-forward", and only this (it refuses to do
  anything if the branches diverged). In particular, you never have to
  commit after a pull (well, except if you had some local, uncommitted
  changes). "merge" changes your working directory, and you have to
  commit after. "merge" will never do fast-forward, it will never
  change the revision your working tree refers to, and it's up to
  you whether to commit afterwards (if you see that it introduces no
  changes, you might not want to commit).

The final rule in bzr would be "you create an extra commit each time
you commit" ;-).

As a side-note, it could be interesting to have a git-like merge
command (choosing automatically between merge and pull), probably not
in the core, but as a plugin.
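
A minimal sketch of how such a command could choose, in Python. The
history is a toy dictionary mapping each revision to its parents, and
the function names are hypothetical; nothing here is taken from bzr or
git.

```python
def ancestors(parents, rev):
    """Return rev plus everything reachable from it through parent links."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents.get(r, []))
    return seen


def sync_action(parents, mine, theirs):
    """Decide what a combined pull-or-merge command would do."""
    if theirs in ancestors(parents, mine):
        return "up-to-date"    # the other side has nothing new for me
    if mine in ancestors(parents, theirs):
        return "fast-forward"  # the "pull" case: just move my tip
    return "merge"             # diverged: merge, then commit


history = {
    "A": [],
    "B": ["A"],
    "C": ["B"],  # my extra work on top of B
    "D": ["B"],  # their extra work on top of B
}

print(sync_action(history, "B", "D"))
print(sync_action(history, "C", "D"))
print(sync_action(history, "C", "B"))
```

Only the last branch of sync_action ever leads to a new revision; the
first two leave the set of revisions untouched.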

-- 

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:46                                                   ` Jakub Narebski
@ 2006-10-25 10:08                                                     ` James Henstridge
  2006-10-25 15:54                                                       ` Carl Worth
  0 siblings, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-25 10:08 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Matthew D. Fuller, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On 25/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Andreas Ericsson wrote:
> > This is new to me. At work, we merge our toy repositories back and forth
> > between devs only. There is no central repo at all. Does this mean that
> > each merge would add one extra commit per time the one I'm merging with
> > has merged with me?
>
> From what I understand, "bzr merge" will create one extra commit to
> preserve the "first parent is my branch" feature. "bzr pull" will do
> fast-forward if your DAG is proper subset of pulled branch/repository
> DAG, but at the cost that it would change your revno to revision mapping
> to those of the pulled repository.

Actually, "bzr merge" does not create any commits on the branch -- you
need to run "bzr commit" afterwards (possibly after resolving
conflicts).  The control files for the working tree record a pending
merge, which gets recorded when you get round to the commit.

So you can easily check if there were any tree changes resulting from the merge.

If there aren't, or you made the merge by mistake, you can make a call
to "bzr revert" to clean things up without ever having created a new
revision.

James.




^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
                                                                                 ` (3 preceding siblings ...)
  2006-10-24  9:51                                                               ` Matthieu Moy
@ 2006-10-25 10:52                                                               ` Andreas Ericsson
  2006-10-25 19:53                                                                 ` Junio C Hamano
  4 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-25 10:52 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 1178 bytes --]

Matthew D. Fuller wrote:
> On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
>> I already brought this up once, and I suspect that the bzr people
>> simply DID NOT UNDERSTAND the question:
>>
>>  - how do you do the git equivalent of "gitk --all"
> 
> I for one simply DO NOT UNDERSTAND the question, because I don't know
> what that is or what I'd be trying to accomplish by doing it.  The
> documentation helpfully tells me that it's something undocumented.
> 

See the attached screenshot. This is from qgit --all on the git 
repository, but the DAG output is identical to that of gitk. Note in 
particular the 'pu' and 'next' branches. By scrolling down, I can easily 
see the branch-point of any of them.

To those that do not appreciate or allow email-attachments, I apologize. 
I think however that it was necessary to provide a view for the bazaar 
people of what Linus is talking about without having to download and 
install git and a git repository.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

[-- Attachment #2: Screenshot.png --]
[-- Type: image/png, Size: 148451 bytes --]

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-24 21:51                                                                         ` Erik Bågfors
@ 2006-10-25 12:41                                                                           ` Andreas Ericsson
  2006-10-25 13:15                                                                             ` Erik Bågfors
  0 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-25 12:41 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Carl Worth, Matthew D. Fuller, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

Erik Bågfors wrote:
> 
> Creates the picture you can see at
> http://erik.bagfors.nu/bzr-plugins/dotrepo.png
> 

Looking at this picture, I found a very annoying thing with bzr's 
revids: For commits from the same author on the same day, they don't 
differ in the beginning, making all of them, at a glance, look the same. 
I got a headache just trying to figure out how to read them. It might be 
worth looking into in the future, especially if you decide to show them 
to the users.

Perhaps it's just my git eyes being used to seeing the first 4 chars 
(which is all I normally look at) being different for each different 
commit, but having to look up the near-end of the string to find the 
actual difference in bzr's revids was actually a quite painful experience.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 12:41                                                                           ` Andreas Ericsson
@ 2006-10-25 13:15                                                                             ` Erik Bågfors
  0 siblings, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-25 13:15 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Carl Worth, Matthew D. Fuller, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

On 10/25/06, Andreas Ericsson <ae@op5.se> wrote:
> Erik Bågfors wrote:
> >
> > Creates the picture you can see at
> > http://erik.bagfors.nu/bzr-plugins/dotrepo.png
> >
>
> Looking at this picture, I found a very annoying thing with bzr's
> revids: For commits from the same author on the same day, they don't
> differ in the beginning, making all of them, at a glance, look the same.
> I got a headache just trying to figure out how to read them. It might be
> worth looking into in the future, especially if you decide to show them
> to the users.
>
> Perhaps it's just my git eyes being used to seeing the first 4 chars
> (which is all I normally look at) being different for each different
> commit, but having to look up the near-end of the string to find the
> actual difference in bzr's revids was actually a quite painful experience.

I agree, and new formats for how the revisions should look are being
discussed on the mailing list right now.  It's not set in stone.

/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:49                                     ` Jeff King
@ 2006-10-25 13:49                                       ` Andreas Ericsson
  2006-10-25 21:51                                         ` David Lang
  2006-10-25 17:21                                       ` David Rientjes
  1 sibling, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-25 13:49 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Linus Torvalds, bazaar-ng, David Rientjes

Jeff King wrote:
> On Wed, Oct 25, 2006 at 02:19:15AM -0700, David Rientjes wrote:
> 
>> No, my criticism is against the added complexity which makes the 
>> modification of git increasingly difficult with every new release.  It's a 
> 
> OK, you seemed to imply problems for end users in your first paragraph,
> which is what I was responding to.
> 
>> _current_ needs.  For any experienced shell programmer it is so much 
>> easier to go in and change an option or pipe to a different command or 
>> comment out a simple shell command in a .sh file than editing the C code.  
> 
> Yes, it's true that some operations might be easier to play with in the
> shell. However, does it actually come up that you want to modify
> existing git programs? The more common usage seems to be gluing the
> plumbing together in interesting ways, and that is still very much
> supported.
> 

Indeed. I still use my old git-send-patch script whenever I want to send 
patches, simply because I don't like git-send-email and its defaults 
much. The interface hasn't changed one bit since I wrote it. That's 
pretty stable, since send-patch was created a couple of hours before git.c 
was submitted to the list: I wrote the "send-patch" script to send 
the patch that did the rewriting.

I'm personally all for a rewrite of the necessary commands in C 
("commit" comes to mind), but like many others, I have no personal 
interest in doing the actual work. I'm fairly certain that once we get 
it working natively on windows with some decent performance, windows 
hackers will pick up the ball and write "wingit", which will be a log 
viewer and GUI thing for 
fetching/merging/committing/reverting/rebasing/sending patches and 
whatnot. Possibly it will have hooks to Visual C++ or some other IDE. I 
don't know how that sort of thing works, but I'm sure someone clever and 
bored enough will want to investigate the possibilities.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 10:08                                                     ` James Henstridge
@ 2006-10-25 15:54                                                       ` Carl Worth
  2006-10-26  8:52                                                         ` James Henstridge
  0 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-10-25 15:54 UTC (permalink / raw)
  To: James Henstridge
  Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, git

[-- Attachment #1: Type: text/plain, Size: 1197 bytes --]

On Wed, 25 Oct 2006 18:08:22 +0800, "James Henstridge" wrote:
> If there aren't, or you made the merge by mistake, you can make a call
> to "bzr revert" to clean things up without ever having created a new
> revision.

One result of this approach is that developers of different trees
don't necessarily have common revision IDs to compare. Imagine a
question like:

	When you ran that test did you have the same code I've got?

In git, the answer would be determined by comparing revision IDs.

In bzr, the only answer I'm hearing is attempting a merge to see if it
introduces any changes. (I'm deliberately avoiding "pull" since we're
talking about distributed cases here).

And to comment on something mentioned earlier in the thread, there's
no need for "wildly complex" distributed scenarios. All of these
issues are present with developers working together as peers, (and
each considering their own repository as canonical).

A harder question (for bzr) is:

	Do you have all of the history I've got?

(The problem being that when one developer is missing some history and
merges it in, she necessarily creates new history, so there's never a
stable point for both sides to agree on.)
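
A minimal sketch of why that is, in Python. It assumes a revision ID is
a hash over the tree content, the parent IDs and the message, which is
only git-like in spirit (real commit objects also carry author and
committer data); commit_id and the tree names are invented for the
example.

```python
import hashlib


def commit_id(tree, parents, message):
    """Toy revision ID: a SHA-1 over tree, parent ids and message."""
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]
    lines += ["", message]
    return hashlib.sha1("\n".join(lines).encode("utf-8")).hexdigest()


root = commit_id("tree-v1", [], "initial import")

# Same content on top of the same history gives the same ID, so
# comparing IDs answers "do you have the same code I've got?".
alice_tip = commit_id("tree-v2", [root], "fix the bug")
bob_tip = commit_id("tree-v2", [root], "fix the bug")
print(alice_tip == bob_tip)      # True

# Now both diverge and each merges the other, always with a new commit.
alice_work = commit_id("tree-a", [root], "alice's change")
bob_work = commit_id("tree-b", [root], "bob's change")
alice_merge = commit_id("tree-ab", [alice_work, bob_work], "merge")
bob_merge = commit_id("tree-ab", [bob_work, alice_work], "merge")
print(alice_merge == bob_merge)  # False
```

Both merges end up with identical tree content, yet the two sides hold
different tip IDs. If one side instead adopts the other's merge commit
unchanged, which is what a fast-forward does, the IDs match again.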

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:49                                     ` Jeff King
  2006-10-25 13:49                                       ` Andreas Ericsson
@ 2006-10-25 17:21                                       ` David Rientjes
  2006-10-25 21:03                                         ` Jeff King
  2006-10-26 11:15                                         ` Andreas Ericsson
  1 sibling, 2 replies; 1752+ messages in thread
From: David Rientjes @ 2006-10-25 17:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Wed, 25 Oct 2006, Jeff King wrote:

> Yes, it's true that some operations might be easier to play with in the
> shell. However, does it actually come up that you want to modify
> existing git programs? The more common usage seems to be gluing the
> plumbing together in interesting ways, and that is still very much
> supported.
> 

Yes, it does.  I'll give you an example from six months ago: there was a 
need for the group that I work with to support a faster type of hashing 
function for whatever reason.  This would have been simple with previous 
versions of git, but if you've ever looked at the SHA1 code in git, you'll 
realize that you're probably better off never trying to touch it.  There 
is absolutely _no_ abstraction of it at all and the code is so deeply 
coupled in the source that abstracting it away is a pain.

Likewise, there is always room for personal or organizational tweaks on 
the part of the developer.  Things like distributed pulling and 
merging should actually be pretty simple to implement if the complexity 
wasn't so high in the merge-* family.  This is something I implemented 
after an enormous headache because we were dealing with very large 
projects: yes, larger than the Linux kernel.  And this is _exactly_ where 
piping would help; we have implementations of distributed grep over very 
large datasets (on the order of terabytes).

> You can do the same thing in C. In fact, look at how similar
> git-whatchanged, git-log, and git-diff are.
> 

No you can't.  Making a one line addition, commenting out a line, or 
changing a simple flag in a shell script is much easier.  And like I 
already said, you can save multiple versions for your common use if you 
work on a specific project much of the time and change how it operates 
depending on the needs of that one project so you never need to do it 
again or you can _distribute_ that shell file to your colleagues so that 
everybody is doing their work via the same method.  This makes it so you 
can just say "type X, then type Y, then type Z" and everybody is operating 
together without training them on how to use git.

> > This all became very obvious when the tutorials came out on "how to use 
> > git in 20 commands or less" effectively.  These tutorials shouldn't need 
> > to exist with an information manager that started as a quick, efficient, 
> > and _simple_ project.  You're treating git development in the same light 
> 
> Sorry, I don't see how this is related to the programming language _at
> all_. Are you arguing that the interface of git should be simplified so
> that such tutorials aren't necessary? If so, then please elaborate, as
> I'm sure many here would like to hear proposals for improvements. If
> you're arguing that git now has too many features, then which features
> do you consider extraneous?
> 

It's not, it's related to the original vision of git which was meant for 
efficiency and simplicity.  A year ago it was very easy to pick up the 
package and start using it effectively within a couple hours.  Keep in 
mind that this was without tutorials, it was just reading man pages.  
Today it would be very difficult to know what the essential commands are 
and how to use them simply to get the job done, unless you use the 
tutorials.  This _inherently_ goes against the approach of trying to 
provide something that is simple to the developer.

Revision control is something that should exist in the background and 
do its simple job very efficiently.  Unfortunately git has tried to 
move its presence into the foreground and requires developers to spend 
more time on learning the system.

Have you never tried to show other people git without giving them a 
tutorial on the most common uses?  Try it and you'll see the confusion.  
That _specifically_ illustrates the ever-increasing lack of simplicity 
that git has acquired.

> I don't agree with this. There are tons of enhancements that I find
> useful (e.g., '...' rev syntax, rebasing with 3-way merge, etc) that I
> think other developers ARE using. There are scalability and performance
> improvements. And there are new things on the way (Junio's pickaxe work)
> that will hopefully make git even more useful than it already is.
> 

There are _not_ scalability improvements.  There may be some slight 
performance improvements, but definitely not scalability.  If you have 
ever tried to use git to manage terabytes of data, you will see this 
becomes very clear.  And "rebasing with 3-way merge" is not something 
often used in industry anyway if you've followed the more common models 
for revision control within large companies with thousands of engineers.  
Typically they all work off mainline.

> If you don't think recent git versions are worthwhile, then why don't
> you run an old version? You can even use git to cherry-pick patches onto
> your personal branch.
> 

I do.  And that's why I would recommend to any serious developer to use 
1.2.4, the same version that I used for kernel development at Google.

> Where?
> 

A few months back here on the mailing list.  When I tried cleaning up even 
one program, I got the response back from the original author "why fix a 
non-problem?" because his argument was that since it worked the code 
doesn't matter.

	http://marc.theaimsgroup.com/?l=git&m=115589472706036

And that is simply one thread of larger conversations that have taken 
place off-list and aren't archived.

> I don't agree, but since you haven't provided anything specific enough
> to discuss, there's not much to say.
> 

If there's a question about some of the sloppiness in the git source code 
as it stands today, that's a much bigger issue than the sloppiness.  My 
advice would be to pick up a copy of K&R's 2nd edition C programming 
language book, read it, and then take a tour of the source code.

> Can you name one customization that you would like to perform now that
> you feel can't be easily done (and presumably that would have been
> easier in the past)?
> 

Yes, those mentioned above.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
                                                                                       ` (2 preceding siblings ...)
  2006-10-24  9:30                                                                     ` Jelmer Vernooij
@ 2006-10-25 18:41                                                                     ` Aaron Bentley
  3 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-25 18:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 24 Oct 2006, Erik Bågfors wrote:
> 
>>I don't see any problem doing a "gitk --all" equivalent in bzr.
> 
> 
> The problem? How do you show a commit that is _common_ to two branches, 
> but has different revision names in them?

If you're talking about the old-style single-integer revnos, each
revision only has one of those, because that revision dictates the path
you must take to the origin when determining its revno.  Many others may
share that revno, but each revision has only one.

The new-style dotted-series-of-ints revnos, I agree, will change.
They're not something I use.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFP6/B0F+nu1YWqI0RAs76AJ9nE4BnL2tLDPQwqjQvCi6okDTdpQCdFQ9V
GoL1BWO+L2FxjLjRrCjKtuY=
=yQ6t

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 10:52                                                               ` Andreas Ericsson
@ 2006-10-25 19:53                                                                 ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25 19:53 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git

Andreas Ericsson <ae@op5.se> writes:

> See the attached screenshot. This is from qgit --all on the git
> repository, but the DAG output is identical to that of gitk. Note in
> particular the 'pu' and 'next' branches. By scrolling down, I can
> easily see the branch-point of any of them.

Looking at this picture I noticed the lack of circles or
rectangles on six commits near the tip of "pu" branch.  Nobody
should be doing an Octopus so it might be a non-issue, but
somehow it looks fishy.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 17:21                                       ` David Rientjes
@ 2006-10-25 21:03                                         ` Jeff King
  2006-10-26 11:15                                         ` Andreas Ericsson
  1 sibling, 0 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-25 21:03 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Wed, Oct 25, 2006 at 10:21:42AM -0700, David Rientjes wrote:

> Yes, it does.  I'll give you an example from six months ago: there was a 

First off, thanks for giving examples. I was having trouble seeing where
you were coming from.

> need for the group that I work with to support a faster type of hashing 
> function for whatever reason.  This would have been simple with previous 
> versions of git, but if you've ever looked at the SHA1 code in git, you'll 
> realize that you're probably better off never trying to touch it.  There 
> is absolutely _no_ abstraction of it at all and the code is so deeply 
> coupled in the source that abstracting it away is a pain.

Is this really an artifact of the C code versus the shell code? A lot of
parts of the system need to touch SHA1 hashes, and I think it has been
sprinkled throughout the code from the beginning. In fact, I think the
libification of git-rev-list has made the code a lot _cleaner_ (and
shorter), in that the C programs can all use the same nice interface.
The external interface is still there, but now there is consistency
among programs when using rev syntax (ISTR issues in the distant past
where program X didn't understand syntax because the parsing was all
done ad-hoc).

> Likewise, there is always room for personal or organizational tweaks on 
> the part of the developer.  Things like distributed pulling and 
> merging should actually be pretty simple to implement if the complexity 
> wasn't so high in the merge-* family.  This is something I implemented 
> after an enormous headache because we were dealing with very large 
> projects: yes, larger than the Linux kernel.  And this is _exactly_ where 
> piping would help; we have implementations of distributed grep over very 
> large datasets (on the order of terabytes).

I guess I don't see how this was ever any easier. Do you mean that when
we called an external grep, it was easier to plug in your distributed
grep?

> > You can do the same thing in C. In fact, look at how similar
> > git-whatchanged, git-log, and git-diff are.
> No you can't.


The "same thing" I referred to was changing behavior trivially based on
the program name. So yes, you can.
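
As a rough illustration of that idea (a Python sketch rather than git's
actual C code; the VARIANTS table, its option names and options_for() are
made up for this example), one entry point can pick its defaults from the
name it was invoked under:

```python
import os

# Hypothetical per-command defaults; these option names are illustrative
# only and are not git's real internals.
VARIANTS = {
    "git-log": {"show_diff": False, "walk_history": True},
    "git-whatchanged": {"show_diff": True, "walk_history": True},
    "git-diff": {"show_diff": True, "walk_history": False},
}


def options_for(argv0):
    """Pick default behaviour from the name the program was invoked as."""
    name = os.path.basename(argv0)
    try:
        # Return a copy so callers can adjust options without touching the table.
        return dict(VARIANTS[name])
    except KeyError:
        raise SystemExit("unknown command name: " + name)
```

Calling options_for(sys.argv[0]) at startup would be the usual way to use
it; everything after that point is shared code.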

> Making a one line addition, commenting out a line, or changing a
> simple flag in a shell script is much easier.  And like I already

Sure, shell can be easier to modify (though in well-written C, you're
likely just commenting out a few lines or a function call -- maybe you
can argue whether or not git is well-written). However, I remain
unconvinced that this is a common use case, or that it is something that
should weigh heavily when compared with portability, efficiency, or
robustness concerns.

> It's not, it's related to the original vision of git which was meant for 
> efficiency and simplicity.


Simplicity is fine if all you want is plumbing. But normal people want
to _use_ git without hacking their own shell scripts, so it makes sense
to provide the scripts that other people have hacked together (as shell,
perl, C, or whatever). Do I want to use git-send-email? Hell no, the
interface is terrible to me. But do the plumbing commands still exist so
that I can use the scripts I hacked together? Absolutely. I can take
what I want and leave the rest.

> A year ago it was very easy to pick up the package and start using it
> effectively within a couple hours.  Keep in mind that this was without

Was it? The most common complaint I've heard about git, starting a year
ago, was the lack of documentation and tutorials and the complexity of
use.

> tutorials, it was just reading man pages.  Today it would be very
> difficult to know what the essential commands are and how to use them
> simply to get the job done, unless you use the tutorials.  This

I think this has been the case for a long time. It's just that there
_weren't_ tutorials back then.

> Have you never tried to show other people git without giving them a 
> tutorial on the most common uses?  Try it and you'll see the confusion.  
> That _specifically_ illustrates the ever-increasing lack of simplicity 
> that git has acquired.

No, it illustrates a lack of simplicity that currently exists; it says
_nothing_ about the change in simplicity over time.

> There are _not_ scalability improvements.  There may be some slight 
> performance improvements, but definitely not scalability.  If you have 
> ever tried to use git to manage terabytes of data, you will see this 

There has been work on scaling to larger repositories (e.g., mozilla and
xorg prompting work/discussion on cvs importing, subproject/superproject
support, shallow clones, etc), but not on terabyte scales. I realize
that might not help you, but it is helping a lot of people. Quite
honestly, git is focused on SOURCE CODE MANAGEMENT, not terabytes of
data. Perhaps that is your true complaint: git is developing tools for
working with source code, potentially at the loss of some generality
(though I tend to think it hasn't lost generality, but rather it hasn't
gained).

> becomes very clear.  And "rebasing with 3-way merge" is not something 
> often used in industry anyway if you've followed the more common models 
> for revision control within large companies with thousands of engineers.  
> Typically they all work off mainline.

My point isn't that every feature is useful to every developer. My point
is that just because features aren't useful to _you_ doesn't mean
they're not useful at all.

And if you want to talk about industry standard, didn't the discussion
start off with your complaint about porting to Windows? An
industry-standard SCM needs to be cross-platform across the major
operating systems.

> Few months back here on the mailing list.  When I tried cleaning up even 
> one program, I got the response back from the original author "why fix a 
> non-problem?" because his argument was that since it worked the code 
> doesn't matter.

I remember a big discussion about the order of arguments in relational
expressions. Git may have problems, but I just don't see coding style
nitpicks as a priority.

Abstracting the hashing might be worthwhile, but the list consensus was
that it's not worth the work unless we're actually going to _do_
something with the abstraction.  Your argument seems to be that you
_are_ doing something with the abstraction on your own. If you want to
convince the git developers that this is a worthwhile direction, then
show some code which uses it.

> 	http://marc.theaimsgroup.com/?l=git&m=115589472706036

OK, I remember this particular discussion. And I just read through to
the end of the thread; it looks like Junio ended up with "this code is
ugly; fix it" and Johannes did.

It sounds like your real beef was that you want to use some alternate
"mv" command that handles your data set better, and having git-mv as a
shell-script would make that simpler for you.  Well, it isn't a shell
script and it never was. If you want to write it as one, I imagine it
would be considered for inclusion (though I expect the C version may
have some advantages, such as atomicity of file movement and index
updating).


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  8:48                                 ` Jeff King
       [not found]                                   ` < Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
  2006-10-25  9:19                                   ` David Rientjes
@ 2006-10-25 21:08                                   ` Junio C Hamano
  2006-10-25 21:16                                     ` Jeff King
                                                       ` (2 more replies)
  2 siblings, 3 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25 21:08 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, David Rientjes, bazaar-ng, git

Jeff King <peff@peff.net> writes:

> On Tue, Oct 24, 2006 at 01:12:52PM -0700, David Rientjes wrote:
>
>> And I would prefer the opposite because we're talking about git.  As an 
>> information manager, it should be seen and not heard.  Nobody is going to 
>> spend their time to become a git or CVS or perforce expert.  As an 
>> individual primarily interested in development, I should not be required 
>> to learn command lines for dozens of different git-specific commands to do 
>> my job quickly and effectively.  I would opt for a much more simpler 
>> approach and deal with shell scripting for many of these commands because 
>> I'm familiar with them and I can pipe any command with the options I 
>> already know and have used before to any other command.
>
> I don't understand how converting shell scripts to C has any impact
> whatsoever on the usage of git. The plumbing shell scripts didn't go
> away; you can still call them and they behave identically.
>
> Is there some specific change in functionality that you're lamenting?

That's also what I wondered, but I can also understand where David is
coming from, and I agree with him to a certain degree.

When I learned git, I learned a lot from trying to piece my own
plumbing together, since there wasn't much Porcelain to speak
of back then.  Then we had many usability enhancements before
the 1.0 release that added Porcelain-ish commands as shell scripts.

This had two positive effects, aside from adding usability.
Interested people had more shell scripts to learn from.  The
scripts were easy to adjust to feature requests from the list,
and as we learned from user experience based on these scripts it
was definitely quicker to codify the best current practice
workflow in them than if they were written in C.  It would have
taken us a lot more effort to add "git commit -o paths" vs "git
commit -i paths" if it were already converted to C, for example.
This continued and our Porcelainish scripts matured quickly.

Then 1.3 series started to move some of the mature ones into C.
As many people have already pointed out, being written in C and
not doing pipe() has two advantages: better portability to
platforms with awkward pipe support, and usually better
performance from having one less process.  The git-log family
with path limiting got a real boost in performance because the
path limiting can be done on the revision traversal side rather
than in diff-tree, which used to be on the downstream side of the
pipe.  So overall this was the right thing to do.

One thing we lost during the process, however, is ready access
to the pool of "sample scripts" when people want to
scratch their own itches.  Linus's original tutorial talked
about "this pattern of pipe is so useful that we have a three
liner shell script wrapper that is called git-foo", and
interested people could easily look at how the plumbing commands
fit together.

The plumbing is still there, and I and people who already know
git would still script around git-rev-list when we need to (by
the way, scripting around git-log is a wrong thing to do -- it
is for human consumption and scripting should be done with
plumbing).  But when we rewrote mature ones in C (and I keep
stressing "mature" because another thing I agree with David is
that shell is definitely easier to futz with), we did not leave
the older shell implementation around as reference.  People
coming to git after the 1.3 series certainly have a harder time
learning how the plumbing fits together than git old-timers did,
if that is the area they are interested in, as opposed to just
using git as a revision tracking system.
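
As a small illustration of scripting around plumbing (a Python sketch;
parse_rev_list(), merges() and read_history() are names made up here, and
the parser assumes the one-line-per-commit output of
"git rev-list --parents", where each line is a commit ID followed by its
parent IDs):

```python
import subprocess


def parse_rev_list(output):
    """Map each commit ID to the list of its parent IDs."""
    graph = {}
    for line in output.splitlines():
        fields = line.split()
        if fields:
            graph[fields[0]] = fields[1:]
    return graph


def merges(graph):
    """Return the commits that have more than one parent."""
    return [commit for commit, parents in graph.items() if len(parents) > 1]


def read_history(rev="HEAD"):
    """Run the plumbing command in the current repository and parse it."""
    out = subprocess.run(
        ["git", "rev-list", "--parents", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_rev_list(out)
```

Keeping the parsing separate from the subprocess call means the logic can
be tried on canned text without a repository at hand.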

We could probably do two things to address this issue:

 - Create examples/ hierarchy in the source tree to house these
   historical implementations as a reference material, or an
   entirely different branch or repository to house them.

 - Learn the itches David and other people have, that the
   current git Porcelain-ish does not scratch well, and enrich
   Documentation/technical with real-world working scripts built
   around plumbing.






^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:08                                   ` Junio C Hamano
@ 2006-10-25 21:16                                     ` Jeff King
  2006-10-25 21:32                                       ` Junio C Hamano
  2006-10-25 21:50                                     ` Junio C Hamano
  2006-10-26 11:25                                     ` Andreas Ericsson
  2 siblings, 1 reply; 1752+ messages in thread
From: Jeff King @ 2006-10-25 21:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, bazaar-ng, Linus Torvalds, Lachlan Patrick, David Rientjes

On Wed, Oct 25, 2006 at 02:08:20PM -0700, Junio C Hamano wrote:

> the older shell implementation around as reference.  People
> coming to git after 1.3 series certainly do have harder time to
> learn how plumbing would fit together than when git old-timers
> learned it, if that is the area they are interested in, as
> opposed to just using git as a revision tracking system.

I think this is part of the complication in the discussion I'm having with
David. There are really two sets of users for git: people who want to
hack scripts based on plumbing, and people who want everything to "just
work." I think it's a good point that as the system matures (movement
to C and growth of complexity), it might become less easy to hack.

>  - Create examples/ hierarchy in the source tree to house these
>    historical implementations as a reference material, or an
>    entirely different branch or repository to house them.

Housing historical implementations seems like it would just lead to
out-of-date and non-functional examples.

>  - Learn the itches David and other people have, that the
>    current git Porcelain-ish does not scratch well, and enrich
>    Documentation/technical with real-world working scripts built
>    around plumbing.

I think this is a better approach. I think it also makes sense to
let people know that it's an acceptable approach to start new features
as shell and then have them mature to C (looking at the current
codebase, and some of Dscho's rantings, one might get the impression
that git isn't accepting new shell scripts).


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:16                                     ` Jeff King
@ 2006-10-25 21:32                                       ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25 21:32 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> Housing historical implementations seems like it would just lead to
> out-of-date and non-functional examples.

I agree.  Although that ought to be rare in principle, given
that one advertised feature of git is that the plumbing is
supposed to be stable, we have occasionally had to subtly
break things to improve plumbing and at the same time run around
to make sure that all the script users (both in-tree and
out-of-tree like Cogito, gitweb and StGIT) are updated.

>>  - Learn the itches David and other people have, that the
>>    current git Porcelain-ish does not scratch well, and enrich
>>    Documentation/technical with real-world working scripts built
>>    around plumbing.
>
> I think this is a better approach. I think it also makes sense to
> let people know that it's an acceptable approach to start new features
> as shell and then have them mature to C (looking at the current
> codebase, and some of Dscho's rantings, one might get the impression
> that git isn't accepting new shell scripts).

New commands like pickaxe and for-each-ref were easier to code
in C, and the cherry rewrite in C was really about how crufty the
shell script version was from the beginning (there were no
in-tree users of it left, so it was not maintained at all, but
thanks to plumbing being stable it just kept working, perhaps
correctly but still horribly).

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:08                                   ` Junio C Hamano
  2006-10-25 21:16                                     ` Jeff King
@ 2006-10-25 21:50                                     ` Junio C Hamano
  2006-10-26 11:25                                     ` Andreas Ericsson
  2 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25 21:50 UTC (permalink / raw)
  To: git

Junio C Hamano <junkio@cox.net> writes:

>  - Learn the itches David and other people have, that the
>    current git Porcelain-ish does not scratch well, and enrich
>    Documentation/technical with real-world working scripts built
>    around plumbing.

I meant "Documentation/howto"; sorry for the noise.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 13:49                                       ` Andreas Ericsson
@ 2006-10-25 21:51                                         ` David Lang
  2006-10-25 22:15                                           ` Shawn Pearce
  0 siblings, 1 reply; 1752+ messages in thread
From: David Lang @ 2006-10-25 21:51 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Jeff King, David Rientjes, Linus Torvalds, Lachlan Patrick,
	bazaar-ng, git

a quick lesson on program naming

On Wed, 25 Oct 2006, Andreas Ericsson wrote:

> I'm personally all for a rewrite of the necessary commands in C ("commit" 
> comes to mind), but as many others, I have no personal interest in doing the 
> actual work. I'm fairly certain that once we get it working natively on 
> windows with some decent performance, windows hackers will pick up the ball 
> and write "wingit", which will be a log viewer and GUI thing for
              ^^^^^^

how many other people read this as 'wing it' rather than 'win git'? ;-)

David Lang

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:51                                         ` David Lang
@ 2006-10-25 22:15                                           ` Shawn Pearce
  2006-10-25 22:29                                             ` Jakub Narebski
  2006-10-25 22:41                                             ` David Lang
  0 siblings, 2 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-10-25 22:15 UTC (permalink / raw)
  To: David Lang; +Cc: git

David Lang <dlang@digitalinsight.com> wrote:
> a quick lesson on program naming
> 
> On Wed, 25 Oct 2006, Andreas Ericsson wrote:
> 
> >I'm personally all for a rewrite of the necessary commands in C ("commit" 
> >comes to mind), but as many others, I have no personal interest in doing 
> >the actual work. I'm fairly certain that once we get it working natively 
> >on windows with some decent performance, windows hackers will pick up the 
> >ball and write "wingit", which will be a log viewer and GUI thing for
>              ^^^^^^
> 
> how many other people read this as 'wing it' rather than 'win git'? ;-)

Yes, that's certainly a less than optimal name...

What about gitk?  Is it "gi tk" or "git k" ?  This has actually
been the source of much local debate.  :-)

-- 

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Combined diff format documentation
@ 2006-10-25 22:22 Jakub Narebski
  2006-10-25 22:40 ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25 22:22 UTC (permalink / raw)
  To: git

In Documentation/diff-format.txt we can find the following information about
combined diff format:

 combined diff format
 --------------------
 
 git-diff-tree and git-diff-files can take '-c' or '--cc' option
 to produce 'combined diff', which looks like this: 

 ------------
 diff --combined describe.c
 @@@ +98,7 @@@
    return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1;
   }
 
 - static void describe(char *arg)
  -static void describe(struct commit *cmit, int last_one)
 ++static void describe(char *arg, int last_one)
   {
  +     unsigned char sha1[20];
  +     struct commit *cmit;
 ------------

It goes on to explain how to read the combined diff format, and how --cc
(condensed combined) differs from --combined.

There is no note about which of the extended headers work with the combined
diff format, how they change, or how the chunk header changes.

From what I gathered, there are the following differences as compared to
ordinary (diff --git) git extended headers:

1. "git diff" header which looked like this

      diff --git a/file1 b/file2

    is now

      diff --combined file2

    (where instead of --combined we might have --cc). Not described in
    documentation.

2. the "index" extended header line changes from

     index <hash>..<hash> <mode>

   to
   
     index <hash>,<hash>..<hash>

   Mode information is put in separate line, only if mode changes, for
   example

     mode <mode>,<mode>..<mode>

   <mode> can be 000000 if the file didn't exist in a particular parent; if the
   file was created by the merge we have

     new file mode <mode>

   I haven't checked what happens if file is deleted, either by branch or by
   merge commit itself. Not described in documentation, I'm not sure about
   how this (wrt modes) works.

3. The "rename/copy" headers seem never to be present; see below.

4. From file/to file header _seems_ to function exactly like in ordinary
   diff format, namely

     --- a/file1
     +++ b/file2

   But it seems to function rather like in ordinary "git diff" header,
   i.e. we have a/file1 instead of /dev/null even for files created by
   merge. I have not checked if and how rename detection work here.

5. Hunk header is also modified: in ordinary diff we have

     @@ <from range> <to range> @@

   where <from range> is -<start line>,<number of lines>, and <to range>
   is +<start line>,<number of lines>. In combined diff format it changes
   similarly to "index" extended header, namely

     @@@ <from range> <from range> <to range> @@@

   It might not be obvious that we have (number of parents + 1) '@'
   characters in the chunk header for the combined diff format.

   BTW, it is not mentioned in the documentation that git diff uses a hunk
   section indicator, or what regexp/expression it uses (and whether it is
   configurable).
   Not described in documentation.

6. Documentation/diff-format.txt explains combined and condensed combined
   format quite well, although it doesn't tell us if we can have plusses and
   minuses together in one line...
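
To make point 5 concrete, here is a minimal Python sketch of reading such
a hunk header. It is written only from the observations above, not from
git's own parser; parse_hunk_header() and both regular expressions are
made up for this illustration, and it assumes a missing ",<number of
lines>" means one line, as in ordinary unified diffs:

```python
import re

# Leading run of '@', the ranges, a matching run of '@', optional trailing text.
HUNK_RE = re.compile(r"^(@{2,}) (.+?) (@{2,})(?: .*)?$")
RANGE_RE = re.compile(r"^([-+])(\d+)(?:,(\d+))?$")


def parse_hunk_header(line):
    """Return (number_of_parents, from_ranges, to_range) for a hunk header.

    Each range is a (start_line, line_count) tuple.
    """
    m = HUNK_RE.match(line)
    if not m or m.group(1) != m.group(3):
        raise ValueError("not a hunk header: %r" % line)
    # (number of parents + 1) '@' characters open the header.
    n_parents = len(m.group(1)) - 1
    from_ranges, to_ranges = [], []
    for token in m.group(2).split():
        r = RANGE_RE.match(token)
        if not r:
            raise ValueError("bad range %r in %r" % (token, line))
        start = int(r.group(2))
        count = int(r.group(3)) if r.group(3) is not None else 1
        (from_ranges if r.group(1) == "-" else to_ranges).append((start, count))
    if len(from_ranges) != n_parents or len(to_ranges) != 1:
        raise ValueError("range count does not match '@' count: %r" % line)
    return n_parents, from_ranges, to_ranges[0]
```

An ordinary "@@ ... @@" header comes back with one parent, and a header
with three '@' characters comes back with two, so the same function
covers both formats.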


=====================================================================

Combined diff format and rename detection
-----------------------------------------

We have the following situation:
$ git ls-tree -r --abbrev HEAD
100644 blob 1ce3f81     greetings/goodbye.txt
100644 blob 980a0d5     greetings/hello.txt
$ git ls-tree -r --abbrev HEAD^1
100644 blob 980a0d5     greetings/hello.txt
$ git ls-tree -r --abbrev HEAD^2
100644 blob 1ce3f81     data/goodbye.txt
100644 blob 980a0d5     data/hello.txt

Below are the following diffs: against the first parent, merge (against all
parents) with rename detection, combined, and combined with rename detection.
Is all of this expected?

$ git diff-tree -p HEAD^1 HEAD
diff --git a/greetings/goodbye.txt b/greetings/goodbye.txt
new file mode 100644
index 0000000..1ce3f81
--- /dev/null
+++ b/greetings/goodbye.txt
@@ -0,0 +1 @@
+Goodbye World!

$ git diff-tree -p -M -m HEAD
d0fdd886e3b768678832c8d826bb8b70f2ef7b8e
diff --git a/greetings/goodbye.txt b/greetings/goodbye.txt
new file mode 100644
index 0000000..1ce3f81
--- /dev/null
+++ b/greetings/goodbye.txt
@@ -0,0 +1 @@
+Goodbye World!
d0fdd886e3b768678832c8d826bb8b70f2ef7b8e
diff --git a/data/goodbye.txt b/greetings/goodbye.txt
similarity index 100%
rename from data/goodbye.txt
rename to greetings/goodbye.txt
diff --git a/data/hello.txt b/greetings/hello.txt
similarity index 100%
rename from data/hello.txt
rename to greetings/hello.txt

$ git diff-tree -p -c HEAD
d0fdd886e3b768678832c8d826bb8b70f2ef7b8e
diff --combined greetings/goodbye.txt
index 0000000,0000000..1ce3f81
new file mode 100644
--- a/greetings/goodbye.txt
+++ b/greetings/goodbye.txt
@@@ -1,0 -1,0 +1,1 @@@
++Goodbye World!

$ git diff-tree -p -c -M HEAD
d0fdd886e3b768678832c8d826bb8b70f2ef7b8e
diff --combined greetings/goodbye.txt
index 0000000,1ce3f81..1ce3f81
mode 000000,100644..100644
--- a/greetings/goodbye.txt
+++ b/greetings/goodbye.txt
@@@ -1,0 -1,1 +1,1 @@@
+ Goodbye World!

And for comparison, the last one with --cc (condensed combined) instead of -c:
$ git diff-tree -p --cc -M HEAD
d0fdd886e3b768678832c8d826bb8b70f2ef7b8e
diff --cc greetings/goodbye.txt
index 0000000,1ce3f81..1ce3f81
mode 000000,100644..100644
--- a/greetings/goodbye.txt
+++ b/greetings/goodbye.txt
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:15                                           ` Shawn Pearce
@ 2006-10-25 22:29                                             ` Jakub Narebski
  2006-10-25 22:44                                               ` Petr Baudis
  2006-10-25 22:41                                             ` David Lang
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25 22:29 UTC (permalink / raw)
  To: git

Shawn Pearce wrote:

> David Lang <dlang@digitalinsight.com> wrote:
>> a quick lesson on program naming
>> 
>> On Wed, 25 Oct 2006, Andreas Ericsson wrote:
>> 
>> >I'm personally all for a rewrite of the necessary commands in C ("commit" 
>> >comes to mind), but as many others, I have no personal interest in doing 
>> >the actual work. I'm fairly certain that once we get it working natively 
>> >on windows with some decent performance, windows hackers will pick up the 
>> >ball and write "wingit", which will be a log viewer and GUI thing for
>>              ^^^^^^
>> 
>> how many other people read this as 'wing it' rather than 'win git'? ;-)
> 
> Yes, that's certainly a less than optimal name...
> 
> What about gitk?  Is it "gi tk" or "git k" ?  This has actually
> been the source of much local debate.  :-)

You can always use CamelCase, i.e. WinGit or WinGIT (or wgit,
but this is also silly).

Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
not ggit, curiously ;-) and tig.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25  0:27                                                                         ` Matthew D. Fuller
@ 2006-10-25 22:40                                                                           ` David Lang
  2006-10-25 23:53                                                                             ` Matthew D. Fuller
  2006-10-30 21:46                                                                             ` Jan Hudec
  0 siblings, 2 replies; 1752+ messages in thread
From: David Lang @ 2006-10-25 22:40 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, 24 Oct 2006, Matthew D. Fuller wrote:

> On Tue, Oct 24, 2006 at 11:03:20AM -0700 I heard the voice of
> David Lang, and lo! it spake thus:
>>
>> it sounded like you were saying that the way to get the slices of
>> the DAG was to use branches in bzr. [...]
>
> I'm not entirely sure I understand what you mean here, but I think
> you're saying "Nobody's written the code in bzr to show arbitrary
> slices of the DAG", which is true TTBOMK.

I think we are talking past each other here.

what I think was said was

G 'one feature of git is that you can view arbitrary slices trivially'

B 'bzr can do this too, you just use branches to define the slices'

G 'but this limits you because branches are defined as code is developed, git 
lets you define slices at viewing time'

by the way, I think it's more than just saying 'well, the code could be written 
to do this in $VCS'. some decisions and standard ways of doing things can impact 
how hard it is to implement a feature, and some decisions can make it 
impossible (without doing unexpected things).

>
>> everyone agrees that bzr supports the Star topology. Most people
>> (including bzr people) seem to agree that currently bzr does not
>> support the Distributed topology.
>
> I think this statement arouses so much grumbling because (a) bzr does
> support such a lot better than often seems implied, (b) where it
> doesn't, the changes needed to do so are relatively minor (often
> merely cosmetic), and (c) disagreement over whether some of the
> qualifications included for 'distributed' are really fundamental.
>
>
>> it's just fine for bzr to not support all possible topologies,
>
> I think there's a real intent for bzr TO support at least all common
> topologies.  I'll buy that current development has focused more on
> [relatively] simple topologies than the more wildly complex ones.  I
> look forward to more addressing of the less common cases as the tool
> matures, and I think a lot of this thread will be good material to
> work with as that happens.  It's just the suggestion that providing
> fruit for simple topologies _necessarily_ prejudices against complex
> ones that I find so onerous.

one concern that the git people are voicing is that the things that work for 
simple topologies (revno's) can't be used with the more complex ones (where you 
need the refid's). especially the fact that users need to do things 
significantly different when there are fairly subtle changes to the topology.

the scenario that came up elsewhere today where you have

    Master
    /    \
dev1   dev2

and then dev1 and dev2 both start working on the same thing (without knowing 
it), then discover they are working on the same thing. they now have three 
options

1. merge their stuff up to the master so that they can both pull it down.
   but this puts broken, experimental stuff up in the master

2. declare one of the dev trees to be the master

this changes the topology to

Master--dev1--dev2

3. pull from each other frequently to keep in sync.

this changes the topology to

    Master
    /   \
dev1--dev2

if they do this with bzr then the revno's break, they each get extra commits 
showing up (so they can never show the same history).

in git this is a non-issue, they can pull back and forth and the only new 
history to show up will be changes.
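
a toy Python sketch of the naming difference (this is only a model, it is
not how bzr or git are actually implemented; commit_id() and positions()
are made up for the illustration). a name derived from content and
parents is the same wherever the commit ends up, while a number derived
from where the commit sits in one tree's history depends on the order in
which that tree received it:

```python
import hashlib


def commit_id(content, parents):
    """Name a commit by hashing its content together with its parents' names."""
    h = hashlib.sha1()
    for parent in parents:
        h.update(parent.encode("ascii"))
    h.update(content.encode("utf-8"))
    return h.hexdigest()


def positions(history):
    """Number commits by where they appear in one tree's own history."""
    return {cid: n for n, cid in enumerate(history, start=1)}


base = commit_id("initial import", [])
fix = commit_id("shared fix", [base])      # written by dev2
work = commit_id("dev1 feature", [base])   # written by dev1

dev1_history = [base, work, fix]   # dev1 pulled the fix after their own work
dev2_history = [base, fix, work]   # dev2 wrote the fix first, then pulled dev1

dev1_number = positions(dev1_history)[fix]
dev2_number = positions(dev2_history)[fix]
```

here dev1_number and dev2_number differ for the very same commit, while
the hash-based name "fix" is identical in both trees.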

this is the situation that the kernel developers are in frequently. it sounds as 
if you haven't needed to do this yet, so you haven't encountered the problems.

David Lang

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Combined diff format documentation
  2006-10-25 22:22 Combined diff format documentation Jakub Narebski
@ 2006-10-25 22:40 ` Junio C Hamano
  2006-10-25 22:58   ` Jakub Narebski
                     ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25 22:40 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> 1. "git diff" header which looked like this
> 2. the "index" extended header line changes from
> 3. The "rename/copy" headers seems to be never present; see below.
>...

Thanks for starting this.  Your observation is correct.  It was
pretty much designed for quick _content_ inspection; renames
work correctly to pick which blobs from each tree to compare,
but are otherwise not reflected in the output (the pathnames
are not shown as far as I know).  We could probably add that if
some users need it.

> 5. Hunk header is also modified: in ordinary diff we have
> ...
>    It might not be obvious that we have (number of parents + 1) '@'
>    characters in chunk header for combined diff format.

Correct.  This was done to prevent people from accidentally
feeding it to "patch -p1".  In other words, we wanted to make it
so obvious that it is _not_ a patch.
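
To illustrate the point about the hunk header, here is a minimal Python
sketch (not git code; the name parse_hunk_header and the keys it returns
are made up for this example).  It assumes a header with N+1 '@'
characters carries N from-file ranges and one to-file range, which is
what lets a tool tell a combined diff apart from an ordinary patch:

```python
import re

# Hypothetical helper, not part of git.  The same run of '@' must open
# and close the header; anything after it is the optional section text.
HUNK_RE = re.compile(r"^(@{2,}) (.+?) \1(?: (.*))?$")


def parse_hunk_header(line):
    """Return a dict describing a hunk header, or None if it is not one."""
    m = HUNK_RE.match(line)
    if m is None:
        return None
    markers, ranges, section = m.groups()
    fields = ranges.split()
    # "@@" means one parent (ordinary diff); "@@@" means two parents, etc.
    n_parents = len(markers) - 1
    from_ranges = [f for f in fields if f.startswith("-")]
    to_ranges = [f for f in fields if f.startswith("+")]
    if len(from_ranges) != n_parents or len(to_ranges) != 1:
        return None
    return {
        "parents": n_parents,
        "from": from_ranges,
        "to": to_ranges[0],
        "section": section or "",
    }
```

A consumer that only understands ordinary patches could refuse any
header for which "parents" is not 1.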

There may be more information in "git log -- combine-diff.c"
output that ought to be collected into the documentation, and
now might be a good time to do so, given that that part of the
system is fairly stable and has not changed for quite some time
in git timescale.

>    BTW. it is not mentioned in documentation that git diff uses hunk section
>    indicator, and what regexp/expression it uses (and is it configurable).
>    Not described in documentation.

If you mean by "hunk section indicator" the output similar to
GNU diff -p option, I think it is not worth mentioning and we
are not ready to mention it yet (we have not etched the
expression in stone).  Nobody jumped up and down to say it needs
to be configurable, so it is left undocumented more or less
deliberately.

> 6. Documentation/diff-format.txt explains combined and condensed combined
>    format quite well, although it doesn't tell us if we can have plusses and
>    minuses together in one line...

But you already know the answer to that question, since you
asked me a few days ago ;-).

Patches to documentation would be easier to comment on and more
productive, I guess.

> Below there are following diffs: with first parent, merge (with all parents)
> with renames detection, combined, combined with rename detection. Is it all
> expected?

Yes.  I do not see anything obviously unexpected in your output.



* Re: VCS comparison table
  2006-10-25 22:15                                           ` Shawn Pearce
  2006-10-25 22:29                                             ` Jakub Narebski
@ 2006-10-25 22:41                                             ` David Lang
  1 sibling, 0 replies; 1752+ messages in thread
From: David Lang @ 2006-10-25 22:41 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

On Wed, 25 Oct 2006, Shawn Pearce wrote:

> David Lang <dlang@digitalinsight.com> wrote:
>> a quick lesson on program naming
>>
>> On Wed, 25 Oct 2006, Andreas Ericsson wrote:
>>
>>> I'm personally all for a rewrite of the necessary commands in C ("commit"
>>> comes to mind), but as many others, I have no personal interest in doing
>>> the actual work. I'm fairly certain that once we get it working natively
>>> on windows with some decent performance, windows hackers will pick up the
>>> ball and write "wingit", which will be a log viewer and GUI thing for
>>              ^^^^^^
>>
>> how many other people read this as 'wing it' rather than 'win git'? ;-)
>
> Yes, that's certainly a less than optimal name...
>
> What about gitk?  Is it "gi tk" or "git k" ?  This has actually
> been the source of much local debate.  :-)

in this case I think it's both (or technically git tk, with the double t's 
combined to save typing)



* Re: VCS comparison table
  2006-10-25 22:29                                             ` Jakub Narebski
@ 2006-10-25 22:44                                               ` Petr Baudis
  2006-10-25 23:15                                                 ` Jakub Narebski
  2006-10-26  1:06                                                 ` Horst H. von Brand
  0 siblings, 2 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-25 22:44 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Dear diary, on Thu, Oct 26, 2006 at 12:29:17AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
> not ggit, curiously ;-) and tig.

wit?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Combined diff format documentation
  2006-10-25 22:40 ` Junio C Hamano
@ 2006-10-25 22:58   ` Jakub Narebski
  2006-10-25 23:14     ` Junio C Hamano
  2006-10-25 23:24     ` Junio C Hamano
  2006-10-25 23:45   ` Jakub Narebski
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25 22:58 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
>> 6. Documentation/diff-format.txt explains combined and condensed combined
>>    format quite well, although it doesn't tell us if we can have plusses and
>>    minuses together in one line...
> 
> But you already know the answer to that question, since you
> asked me a few days ago ;-).

Yes, in "[RFC] Syntax highlighting for combined diff" thread
http://permalink.gmane.org/gmane.comp.version-control.git/29566

Well, the _documentation_ doesn't say. I haven't fully grokked the code
for generating and coloring combined diff output, beyond the fact that
I think it uses the last indicator ('+' or '-') to choose the color for
the rest of the line. You said that even if the possibility exists, it is
extremely unlikely.

> Patches to documentation would be easier to comment on and more
> productive, I guess.

I was not sure about the output. All my conclusions about combined diff
output come from examples; I planned to send a documentation patch once
I was sure that at least _most_ of what I've added is correct.

Will do.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Combined diff format documentation
  2006-10-25 22:58   ` Jakub Narebski
@ 2006-10-25 23:14     ` Junio C Hamano
  2006-10-25 23:24     ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25 23:14 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> I was not sure about the output. All my conclusions about combined diff
> output come from examples; I planned to send a documentation patch once
> I was sure that at least _most_ of what I've added is correct.
>
> Will do.

Thanks.


* Re: VCS comparison table
  2006-10-25 22:44                                               ` Petr Baudis
@ 2006-10-25 23:15                                                 ` Jakub Narebski
  2006-10-26  1:06                                                 ` Horst H. von Brand
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25 23:15 UTC (permalink / raw)
  To: git

Petr Baudis wrote:

> Dear diary, on Thu, Oct 26, 2006 at 12:29:17AM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...
>> Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
>> not ggit, curiously ;-) and tig.
> 
> wit?

Taken.

wit - a Python web interface to git maintained by Christian Meder.
Example site on http://www.grmso.net:8090/ . It uses PATH_INFO
much more than gitweb (which uses CGI parameters mostly, but also
supports multiple projects).

Well, not maintained, if http://www.absolutegiganten.org/wit/
is any indicator:

  wit-0.0.4.tar.gz        08-Sep-2005

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Combined diff format documentation
  2006-10-25 22:58   ` Jakub Narebski
  2006-10-25 23:14     ` Junio C Hamano
@ 2006-10-25 23:24     ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-25 23:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Well, the _documentation_ doesn't say. I haven't fully grokked the code
> for generating and coloring combined diff output, beyond the fact that
> I think it uses the last indicator ('+' or '-') to choose the color for
> the rest of the line. You said that even if the possibility exists, it is
> extremely unlikely.

Well if I said that I must have been on booze ;-).

A '-' in the nth column means that the line is from the nth
parent and does _not_ appear in the merge result.  A '+' in the
nth column means that the line _appears_ in the merge result,
and the nth parent does not have that line (i.e. added by the
merge itself, or inherited from other parents).

Hence, by definition, you cannot have '-' and '+' on the same
line (otherwise the line has to exist and not exist in the merge
result at the same time).

A ' ' is a bit tricky to interpret.  A ' ' on a line _without_
any '-' means the line is the same as in that parent and the
merge result (i.e. the result inherited the line from that
parent).  A ' ' on a line that has '-' talks nothing about the
merge result (because by definition '-' lines do not exist in
the merge result) nor the parent that has ' '; in other words,
it is a "don't care" bit.  In the example you quoted from the
commit log of af3feefa:

         - static void describe(char *arg)
          -static void describe(struct commit *cmit, int last_one)
         ++static void describe(char *arg, int last_one)
           {

The first parent had it as one-arg function, and the second one
two-arg but the first parameter was of type "struct commit *";
the merge result has it as two-arg with the first parameter of
type "char *".  The second parent does not know about the
one-arg form of the function so it has ' ' in its column for the
first line.

All versions start the function with an opening brace '{' so the
line has two ' ' prefixed, which is an example of ' ' on a line
without any '-'.
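
The column rules above can be restated as a small Python sketch.  This
is not how combine-diff.c is written; classify() and its status strings
are invented for illustration, and it only covers body lines (not
headers):

```python
def classify(line, n_parents):
    """Interpret the first n_parents columns of one combined-diff body line.

    Returns (in_result, per_parent_status, text).
    """
    cols, text = line[:n_parents], line[n_parents:]
    if len(cols) < n_parents or any(c not in " +-" for c in cols):
        raise ValueError("not a combined-diff body line: %r" % line)
    if "-" in cols and "+" in cols:
        # A line cannot both exist and not exist in the merge result.
        raise ValueError("'-' and '+' on the same line: %r" % line)

    # Any '-' means the line does not appear in the merge result.
    in_result = "-" not in cols
    status = []
    for c in cols:
        if c == "-":
            status.append("dropped")      # this parent had it, result does not
        elif c == "+":
            status.append("lacks")        # result has it, this parent does not
        elif in_result:
            status.append("has")          # ' ' on a line without any '-'
        else:
            status.append("dont_care")    # ' ' on a line that has a '-'
    return in_result, status, text
```

Fed the four lines of the describe() example with n_parents=2, it
reports the first two as absent from the result, the third as present
but missing from both parents, and the '{' line as shared by everybody.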




* Re: Combined diff format documentation
  2006-10-25 22:40 ` Junio C Hamano
  2006-10-25 22:58   ` Jakub Narebski
@ 2006-10-25 23:45   ` Jakub Narebski
  2006-10-26  1:48   ` Horst H. von Brand
  2006-10-26  3:44   ` [PATCH] diff-format.txt: Combined diff format documentation supplement Jakub Narebski
  3 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-25 23:45 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

>>    BTW. it is not mentioned in documentation that git diff uses hunk section
>>    indicator, and what regexp/expression it uses (and is it configurable).
>>    Not described in documentation.
> 
> If you mean by "hunk section indicator" the output similar to
> GNU diff -p option, I think it is not worth mentioning and we
> are not ready to mention it yet (we have not etched the
> expression in stone).  Nobody jumped up and down to say it needs
> to be configurable, so it is left undocumented more or less
> deliberately.

By the way, I have just checked that the combined diff format doesn't have
(for some unknown reason) a "which section" indicator in the chunk header.
Compare
$ git diff-tree -p -m fec9ebf16c948bcb4a8b88d0173ee63584bcde76
and
$ git diff-tree -p -c fec9ebf16c948bcb4a8b88d0173ee63584bcde76
(this is the source of the example combined diff in diff-format.txt,
which I found via
$ git rev-list --parents HEAD -- describe.c | grep " .* "
i.e. finding all merges which included changes to describe.c; there
are only two such commits).
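
For anyone wondering why the grep works: "rev-list --parents" prints each
commit followed by its parents, so a line with at least two spaces has at
least two parents, i.e. it is a merge.  A rough Python equivalent of the
filter, working on already-captured output (merges_from_rev_list is a
made-up name, and the ids in the usage note are placeholders, not real
commits):

```python
def merges_from_rev_list(output):
    """Pick merge commits out of `git rev-list --parents` style output.

    Each input line is "<commit> <parent>..."; a merge has two or more
    parents, which is what grep " .* " selects.
    """
    merges = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 3:          # commit id plus at least two parents
            merges.append((fields[0], fields[1:]))
    return merges
```

With the placeholder input "c3 p1 p2\nc2 p1\nc1\n" only the first line
is kept.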
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: VCS comparison table
  2006-10-25 22:40                                                                           ` David Lang
@ 2006-10-25 23:53                                                                             ` Matthew D. Fuller
  2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-30 21:46                                                                             ` Jan Hudec
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-25 23:53 UTC (permalink / raw)
  To: David Lang; +Cc: bazaar-ng, git

On Wed, Oct 25, 2006 at 03:40:00PM -0700 I heard the voice of
David Lang, and lo! it spake thus:
> 
> I think we are talking past each other here.
> 
> what I think was said was
> 
> G 'one feature of git is that you can view arbitrary slices
> trivially'
> 
> B 'bzr can do this too, you just use branches to define the slices'

Ah.  This is more like "bzr [mostly] only does this now in terms of a
single branch (or some point back along it)".  The slices that go
between branches are very limited ('missing' gives you one view;
'branch:' and 'ancestor:' revision specifications give you another).
bzrk/'visualize' gives an interface similar to gitk, but also only in
the context of a single branch/head looking backward through its
previous tree AFAIK.  Any random DAG-slicing of what you have in the
revision store can be done, somebody would just have to write the code
for it.  Nothing about 'the workflow preserves parents' would make
that any harder than writing the code for git was.

Much of this is probably a result of the 'branch'-centric (rather than
'repository'-centric) view of the world; similarly to the fact that
branches are referred to by location (local ../otherbranch, or remote
http/sftp/etc) rather than by a name.  This is one of the bits of bzr
I'm personally somewhat ambivalent about.


> they now have three options

Those certainly aren't the only choices, but to stay OT:

> 3. pull from each other frequently to keep in sync.
> 
> this changes the topology to
> 
>    Master
>    /   \
>  dev1--dev2
> 
> if they do this with bzr then the revno's break, they each get extra
> commits showing up (so they can never show the same history).

These two are either/or, not and; either they pull (in which case
their old mainline is no longer meaningful), or they merge (in which
case they get the 'extra' merge commits).


> in git this is a non-issue, they can pull back and forth and the
> only new history to show up will be changes.

In git, this is a non-issue because you don't get to CHOOSE which way
to work.  You always (if you can) pull and obliterate your local
mainline.  In bzr, it's only an 'issue' because you CAN choose, and
CAN maintain your local mainline.  You CAN choose, right now, to do a
git and pull back and forth and have only new history show up as changes,
by creating a 'bzr-pull' shell script that does a 'bzr pull || bzr merge'
(though you'd be a lot better off adding a '--fast-forward-if-you-can'
option to merge and aliasing that over).

More basically, though, I don't think that "histories become exactly
equivalent" is a necessary pass-word to enter the Hallowed City of
Truly Distributed Development.  And I certainly see no reason to
believe we'll agree on it this time any more than We (in broad) have
the last 6 times it came up in the thread.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/


* Re: VCS comparison table
  2006-10-25 22:44                                               ` Petr Baudis
  2006-10-25 23:15                                                 ` Jakub Narebski
@ 2006-10-26  1:06                                                 ` Horst H. von Brand
  1 sibling, 0 replies; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-26  1:06 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, git

Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Thu, Oct 26, 2006 at 12:29:17AM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...
> > Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
> > not ggit, curiously ;-) and tig.
> 
> wit?

Wig. 
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239


* Re: Combined diff format documentation
  2006-10-25 22:40 ` Junio C Hamano
  2006-10-25 22:58   ` Jakub Narebski
  2006-10-25 23:45   ` Jakub Narebski
@ 2006-10-26  1:48   ` Horst H. von Brand
  2006-10-26  3:04     ` Junio C Hamano
  2006-10-26  3:44   ` [PATCH] diff-format.txt: Combined diff format documentation supplement Jakub Narebski
  3 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-26  1:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git

Junio C Hamano <junkio@cox.net> wrote:
> Jakub Narebski <jnareb@gmail.com> writes:

[...]

> > 5. Hunk header is also modified: in ordinary diff we have
> > ...
> >    It might not be obvious that we have (number of parents + 1) '@'
> >    characters in chunk header for combined diff format.

> Correct.  This was done to prevent people from accidentally
> feeding it to "patch -p1".  In other words, we wanted to make it
> so obvious that it is _not_ a patch.

It isn't, really... perhaps it should be made /more/ obvious (not use @ but
e.g. &, ...)?
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239


* Re: VCS comparison table
  2006-10-24  6:45                           ` David Rientjes
                                               ` (2 preceding siblings ...)
  2006-10-24 15:15                             ` Linus Torvalds
@ 2006-10-26  2:29                             ` Linus Torvalds
  3 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-26  2:29 UTC (permalink / raw)
  To: David Rientjes; +Cc: Lachlan Patrick, bazaar-ng, git



On Mon, 23 Oct 2006, David Rientjes wrote:
> 
> Some of the internal commands that have been coded in C are actually much 
> better handled by the shell in the first place.

Others have answered this, but the thing is, it was a _wonderful_ way to 
prototype things, and to add obvious (and nice) early UI issues that made 
git much more usable.

But no, things are not better handled in shell.

Shell tends to make some things really _hard_ to do. A fair chunk of the 
rewrite was because core functionality made things easier. For example, 
the whole internal revision parsing library is really actually a lot more 
capable than we could easily expose as a simple pipeline: the original 
"git log" pipeline worked very well, and you can actually still use those 
kinds of pipelines for a lot of work, but at the same time, some things 
really just work better when you have "deeper" interfaces.

For example, the revision parsing library not only makes "git log" trivial 
as C, it's also needed for an efficient "git annotate/blame/pickaxe" kind 
of thing. There are also things that are just ludicrously hard to do in 
shell-script, like exclusive and atomic file operations.

We used perl and python for some things, but finding people who know them 
tends to be problematic, and python in particular was a dependency 
problem too, so the fact that the default recursive merge was python 
wasn't wonderful.

So I think the shell-scripts are great (and some of them quite likely will 
remain around for the foreseeable future) for prototyping, but for core 
functionality they were not wonderful. 

They are sometimes good examples of how powerful a scripting language git 
can be, though. Scripting is still very important, even though a lot of 
the core stuff doesn't necessarily depend on being scripts itself. 

But error handling in scripting is very hard or inconvenient, especially 
in pipelines. So some things were actively problematic (ie "git-rev-list 
--all --objects | git-pack-objects") and moving it to use the internal 
library interface was simply technically the right thing to do.
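
As a rough illustration of the pipeline problem (a sketch, not anything
git does): in "a | b" a shell by default reports only b's exit status, so
a failure in a goes unnoticed unless the driver checks both ends
explicitly.  In the Python below, run_pipeline, PRODUCER and CONSUMER are
invented names, and the two commands are stand-ins for the real stages:

```python
import subprocess
import sys

# Stand-in commands (assumptions for the demo): the producer emits partial
# output and then fails; the consumer happily copies whatever it receives.
PRODUCER = [sys.executable, "-c", "import sys; print('partial'); sys.exit(3)"]
CONSUMER = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]


def run_pipeline(producer_cmd, consumer_cmd):
    """Run `producer | consumer` and return both exit statuses and the output."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Drop our copy of the pipe so the producer sees SIGPIPE if the
    # consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer_status = producer.wait()
    return producer_status, consumer.returncode, output
```

Here the consumer exits 0 even though the producer failed with status 3,
which is exactly the case a plain shell pipeline would treat as success.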

Others had real performance issues, eg the new merge in C is a lot faster. 
It was fast before, it's much faster still.



* Re: Combined diff format documentation
  2006-10-26  1:48   ` Horst H. von Brand
@ 2006-10-26  3:04     ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-26  3:04 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: git

"Horst H. von Brand" <vonbrand@inf.utfsm.cl> writes:

>> Correct.  This was done to prevent people from accidentally
>> feeding it to "patch -p1".  In other words, we wanted to make it
>> so obvious that it is _not_ a patch.
>
> It isn't, really... perhaps it should be made /more/ obvious (not use @ but
> e.g. &, ...)?

Eh, sorry, what I meant was "obvious to the tool", so "patch"
would take notice.


* [PATCH] diff-format.txt: Combined diff format documentation supplement
  2006-10-25 22:40 ` Junio C Hamano
                     ` (2 preceding siblings ...)
  2006-10-26  1:48   ` Horst H. von Brand
@ 2006-10-26  3:44   ` Jakub Narebski
  2006-10-26  6:15     ` Junio C Hamano
  3 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-26  3:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Update example combined diff format to the current version
$ git diff-tree -p -c fec9ebf16c948bcb4a8b88d0173ee63584bcde76
and provide complete first chunk in example.

Document the combined diff format headers: what the "diff header" looks
like, which "extended diff headers" are used with combined diff and what
they look like, how the two-line from-file/to-file header differs from
the non-combined diff format, and the chunk header format.

It should be noted that the combined diff format was designed for quick
_content_ inspection; renames work correctly to pick which blobs
from each tree to compare, but are otherwise not reflected in the
output (the pathnames are not shown).

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Junio C Hamano napisał:
> Patches to documentation would be easier to comment on and more
> productive, I guess.

So here you have. It should perhaps get review on validity by someone
well versed in the combined diff generation code. There are some guesses
here...

It compiles, but the output was not inspected.


 Documentation/diff-format.txt |   70 ++++++++++++++++++++++++++++++++++++++---
 1 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index 617d8f5..0d04b03 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -156,18 +156,78 @@ to produce 'combined diff', which looks 
 
 ------------
 diff --combined describe.c
-@@@ +98,7 @@@
-   return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1;
+index fabadb8,cc95eb0..4866510
+--- a/describe.c
++++ b/describe.c
+@@@ -98,20 -98,12 +98,20 @@@
+  	return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1;
   }
-
+  
 - static void describe(char *arg)
  -static void describe(struct commit *cmit, int last_one)
 ++static void describe(char *arg, int last_one)
   {
- +     unsigned char sha1[20];
- +     struct commit *cmit;
+ +	unsigned char sha1[20];
+ +	struct commit *cmit;
+  	struct commit_list *list;
+  	static int initialized = 0;
+  	struct commit_name *n;
+  
+ +	if (get_sha1(arg, sha1) < 0)
+ +		usage(describe_usage);
+ +	cmit = lookup_commit_reference(sha1);
+ +	if (!cmit)
+ +		usage(describe_usage);
+ +
+  	if (!initialized) {
+  		initialized = 1;
+  		for_each_ref(get_name);
 ------------
 
+1.   It is preceded with a "git diff" header, that looks like
+     this (when '-c' option is used):
+
+       diff --combined fileM
+
+     or like this (when '--cc' option is used):
+
+       diff --c fileM
+
+2.   It is followed by one or more extended header lines
+     (we assume here that we have merge with two parents):
+
+       index <hash>,<hash>..<hash>
+       mode <mode>,<mode>..<mode>
+       new file mode <mode>
+
+     The "mode <mode>,<mode>..<mode>" appears only if at least
+     one of the <mode> is different from the rest. Extended headers
+     with information about detected contents movement (renames
+     and copying detection) are designed to work with diff of two
+     <tree-ish> and are not used by combined diff format. Currently
+     combined diff format cannot show files which were removed
+     by merge, so "deleted file mode <mode>,<mode>" is never used.
+
+3.   It is followed by two-line from-file/to-file header
+
+       --- a/fileM
+       +++ b/fileM
+
+     Contrary to two-line header for traditional 'unified' diff
+     format, and similar to filenames in ordinary "diff header",
+     /dev/null is not used for creation combined diff.
+
+4.   Chunk header format is modified to prevent people from
+     accidentally feeding it to 'patch -p1'. Combined diff format
+     was created for review of merge commit changes, and was not
+     meant for apply. The change is similar to the change in the
+     extended 'index' header
+
+       @@@ <from-file-range> <from-file-range> <to-file-range> @@@
+
+     It might be not obvious that we have number of parents + 1
+     '@' characters in chunk header for combined diff format.
+
 Unlike the traditional 'unified' diff format, which shows two
 files A and B with a single column that has `-` (minus --
 appears in A but removed in B), `+` (plus -- missing in A but
-- 
1.4.2.1



-- 
Jakub Narebski


* Re: [PATCH] diff-format.txt: Combined diff format documentation supplement
  2006-10-26  3:44   ` [PATCH] diff-format.txt: Combined diff format documentation supplement Jakub Narebski
@ 2006-10-26  6:15     ` Junio C Hamano
  2006-10-26  7:05       ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-26  6:15 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> So here you have. It should perhaps get review on validity by someone
> well versed in the combined diff generation code. There are some guesses
> here...

Thanks.

I guess review by the original author would be good enough;
this is entirely my code -- it was done while Linus and gang
were having fun in NZ, if I recall correctly ;-).

> It compiles, but the output was not inspected.

I've done minimal asciidoc mark-up fixes.  Troff man output looks
horrible but that is not limited to this man page -- it looks
quite wrong whenever a numbered list with displayed examples is
used.


* Re: [PATCH] diff-format.txt: Combined diff format documentation supplement
  2006-10-26  6:15     ` Junio C Hamano
@ 2006-10-26  7:05       ` Junio C Hamano
  2006-10-26  7:10         ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-26  7:05 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

By the way, this is only minimally tested, but this patch fills
the blanks in your documentation.

The combine_diff_path structure _does_ keep track of the
original path in each parent, so renames _could_ be shown (we do
not keep the score, though), but the question is how.  "rename to"
would obviously be "the path in the merge result", but "rename from"
needs to be shown for all parents when we have rename from any
of the parents.  "rename from" line as in the usual one parent diff
does not have any place to say "which parent" (because there is
no need), so showing the usual "rename from" for only parents
that the result has rename from makes the output useless -- we
cannot tell "from which parent" from such an output.  I feel
that an evil merge is rare enough that worrying about showing
rename line is probably not worth the effort.

-- >8 --
[PATCH] combine-diff: a few more finishing touches.

"new file" and "deleted file" were already reported in the
original code, but the logic was not as transparent as it could
have been.  This uses a few variables and more comments to clarify
the flow.  The rule is: (1) if a path exists in the merge result
when no parent had it, we report "new" (otherwise it came from
the parents, as opposed to having been added by the evil merge).
(2) if the path does not exist in the merge result, it is "deleted".

Since we can say "new" and "deleted", there is no reason not to
follow the /dev/null convention.  This fixes it.

Appending function name after @@@ ... @@@ is trivial, so
implement it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 combine-diff.c |   48 +++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 46d9121..01a8437 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -489,6 +489,12 @@ static void show_parent_lno(struct sline
 	printf(" -%lu,%lu", l0, l1-l0);
 }
 
+static int hunk_comment_line(const char *bol)
+{
+	int ch = *bol & 0xff;
+	return (isalpha(ch) || ch == '_' || ch == '$');
+}
+
 static void dump_sline(struct sline *sline, unsigned long cnt, int num_parent,
 		       int use_color)
 {
@@ -508,8 +514,13 @@ static void dump_sline(struct sline *sli
 		struct sline *sl = &sline[lno];
 		unsigned long hunk_end;
 		unsigned long rlines;
-		while (lno <= cnt && !(sline[lno].flag & mark))
+		const char *hunk_comment = NULL;
+
+		while (lno <= cnt && !(sline[lno].flag & mark)) {
+			if (hunk_comment_line(sline[lno].bol))
+				hunk_comment = sline[lno].bol;
 			lno++;
+		}
 		if (cnt < lno)
 			break;
 		else {
@@ -526,6 +537,22 @@ static void dump_sline(struct sline *sli
 			show_parent_lno(sline, lno, hunk_end, i);
 		printf(" +%lu,%lu ", lno+1, rlines);
 		for (i = 0; i <= num_parent; i++) putchar(combine_marker);
+
+		if (hunk_comment) {
+			int comment_end = 0;
+			for (i = 0; i < 40; i++) {
+				int ch = hunk_comment[i] & 0xff;
+				if (!ch || ch == '\n')
+					break;
+				if (!isspace(ch))
+				    comment_end = i;
+			}
+			if (comment_end)
+				putchar(' ');
+			for (i = 0; i < comment_end; i++)
+				putchar(hunk_comment[i]);
+		}
+
 		printf("%s\n", c_reset);
 		while (lno < hunk_end) {
 			struct lline *ll;
@@ -707,6 +734,8 @@ static void show_patch_diff(struct combi
 		int use_color = opt->color_diff;
 		const char *c_meta = diff_get_color(use_color, DIFF_METAINFO);
 		const char *c_reset = diff_get_color(use_color, DIFF_RESET);
+		int added = 0;
+		int deleted = 0;
 
 		if (rev->loginfo)
 			show_log(rev, opt->msg_sep);
@@ -722,7 +751,10 @@ static void show_patch_diff(struct combi
 		printf("..%s%s\n", abb, c_reset);
 
 		if (mode_differs) {
-			int added = !!elem->mode;
+			deleted = !elem->mode;
+
+			/* We say it was added if nobody had it */
+			added = !deleted;
 			for (i = 0; added && i < num_parent; i++)
 				if (elem->parent[i].status !=
 				    DIFF_STATUS_ADDED)
@@ -731,7 +763,7 @@ static void show_patch_diff(struct combi
 				printf("%snew file mode %06o",
 				       c_meta, elem->mode);
 			else {
-				if (!elem->mode)
+				if (deleted)
 					printf("%sdeleted file ", c_meta);
 				printf("mode ");
 				for (i = 0; i < num_parent; i++) {
@@ -743,8 +775,14 @@ static void show_patch_diff(struct combi
 			}
 			printf("%s\n", c_reset);
 		}
-		dump_quoted_path("--- a/", elem->path, c_meta, c_reset);
-		dump_quoted_path("+++ b/", elem->path, c_meta, c_reset);
+		if (added)
+			dump_quoted_path("--- /dev/", "null", c_meta, c_reset);
+		else
+			dump_quoted_path("--- a/", elem->path, c_meta, c_reset);
+		if (deleted)
+			dump_quoted_path("+++ /dev/", "null", c_meta, c_reset);
+		else
+			dump_quoted_path("+++ b/", elem->path, c_meta, c_reset);
 		dump_sline(sline, cnt, num_parent, opt->color_diff);
 	}
 	free(result);

^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: [PATCH] diff-format.txt: Combined diff format documentation supplement
  2006-10-26  7:05       ` Junio C Hamano
@ 2006-10-26  7:10         ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-26  7:10 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

This script minimally demonstrates a few interesting things an
evil merge can do.  Run it in a throw-away directory and view
the resulting merge with "git-show" with or without the patch I
sent out earlier.

One thing that you would notice is that the combined-diff code
chooses not to show the original contents of a deleted
file, while showing the whole result of a new file.  Strictly
speaking, this is inconsistent, but an evil merge is rare and
what ended up getting removed is not as interesting as what
remains as the result.

-- >8 --
#!/bin/sh

test -d .git && {
	echo Run me in an empty directory please
	exit 1
}

git init-db

echo one >file1.txt
git add file1.txt
git commit -m initial

git branch side

echo two >file2.txt
git add file2.txt
git commit -m second

git checkout side
echo uno >file1.txt
git commit -a -m side

git merge "Evil merge" HEAD master
rm -f file1.txt
echo added by the evil merge >file3.txt
echo modified by the evil merge >file2.txt
git update-index --add --remove file1.txt file2.txt file3.txt
EDITOR=: VISUAL=: git commit --amend




^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 15:54                                                       ` Carl Worth
@ 2006-10-26  8:52                                                         ` James Henstridge
  2006-10-26  9:33                                                           ` Junio C Hamano
  2006-10-26  9:50                                                           ` Andreas Ericsson
  0 siblings, 2 replies; 1752+ messages in thread
From: James Henstridge @ 2006-10-26  8:52 UTC (permalink / raw)
  To: Carl Worth
  Cc: bazaar-ng, Matthew D. Fuller, Linus Torvalds, Andreas Ericsson,
	git, Jakub Narebski

On 25/10/06, Carl Worth <cworth@cworth.org> wrote:
> On Wed, 25 Oct 2006 18:08:22 +0800, "James Henstridge" wrote:
> > If there aren't, or you made the merge by mistake, you can make a call
> > to "bzr revert" to clean things up without ever having created a new
> > revision.
>
> One result of this approach is that developers of different trees
> don't necessarily have common revision IDs to compare. Imagine a
> question like:
>
>         When you ran that test did you have the same code I've got?
>
> In git, the answer would be determined by comparing revision IDs.

Can you really just rely on equal revision IDs meaning you have the
same code though?

Let's say that I clone your git repository, and then we both merge the
same diverged branch.  Will our head revision IDs match?  From a quick
look at the logs of cairo, it seems that the commits generated for
such a merge include the date and author, so the two commits would
have different SHA1 sums (and hence different revision IDs).

So I'd have a revision you don't have and vice versa, even though the
trees are identical.


> In bzr, the only answer I'm hearing is attempting a merge to see if it
> introduces any changes. (I'm deliberately avoiding "pull" since we're
> talking about distributed cases here).

Or run "bzr missing".  If the sole missing revision is a merge (and
not the revisions introduced by the merge), you could assume that you
have the same tree state.


> And to comment on something mentioned earlier in the thread, there's
> no need for "wildly complex" distributed scenarios. All of these
> issues are present with developers working together as peers, (and
> each considering their own repository as canonical).
>
> A harder question (for bzr) is:
>
>         Do you have all of the history I've got?
>
> (The problem being that when one developer is missing some history and
> merges it in, she necessarily creates new history, so there's never a
> stable point for both sides to agree on.)

Why does it matter if they create a new revision?  They can still tell
if they've got all the history you had.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26  8:52                                                         ` James Henstridge
@ 2006-10-26  9:33                                                           ` Junio C Hamano
  2006-10-26  9:57                                                             ` James Henstridge
  2006-10-26  9:50                                                           ` Andreas Ericsson
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-26  9:33 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Carl Worth, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, git, Jakub Narebski

"James Henstridge" <james@jamesh.id.au> writes:

> Can you really just rely on equal revision IDs meaning you have the
> same code though?

If you two have the same commit that is a guarantee that you two
have identical trees.  The reverse is not true as logic 101
would teach ;-).

Doing a fast-forward instead of a "useless" merge helps
somewhat but not in cases like two people merging the same
branches the same way or two people applying the same patch on
top of the same commit.  You need to compare tree object IDs for
that.

>> In bzr, the only answer I'm hearing is attempting a merge to see if it
>> introduces any changes. (I'm deliberately avoiding "pull" since we're
>> talking about distributed cases here).
>
> Or run "bzr missing".  If the sole missing revision is a merge (and
> not the revisions introduced by the merge), you could assume that you
> have the same tree state.

Is it "you could assume" or "it is guaranteed"?  If former, what
kind of corner cases could invalidate that assumption?


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26  8:52                                                         ` James Henstridge
  2006-10-26  9:33                                                           ` Junio C Hamano
@ 2006-10-26  9:50                                                           ` Andreas Ericsson
  1 sibling, 0 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-26  9:50 UTC (permalink / raw)
  To: James Henstridge
  Cc: Carl Worth, bazaar-ng, Matthew D. Fuller, Linus Torvalds, git,
	Jakub Narebski

James Henstridge wrote:
> On 25/10/06, Carl Worth <cworth@cworth.org> wrote:
>> On Wed, 25 Oct 2006 18:08:22 +0800, "James Henstridge" wrote:
>> > If there aren't, or you made the merge by mistake, you can make a call
>> > to "bzr revert" to clean things up without ever having created a new
>> > revision.
>>
>> One result of this approach is that developers of different trees
>> don't necessarily have common revision IDs to compare. Imagine a
>> question like:
>>
>>         When you ran that test did you have the same code I've got?
>>
>> In git, the answer would be determined by comparing revision IDs.
> 
> Can you really just rely on equal revision IDs meaning you have the
> same code though?
> 

Yes. Because each commit contains parent revision id's, which in turn 
contain *their* parent revision id's, which in turn..., you know you 
have exactly the same revision, code, and history leading up to that 
revision. You may have other revisions on top or on other branches, but 
all commits, including merge-points and whatnot, leading to that 
particular revision id are EXACTLY identical.
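
The chaining argument above can be sketched in a few lines. This is only an
illustration under simplifying assumptions: `commit_id` and `build_history`
are hypothetical helpers, the commit text leaves out the author and
committer lines that real git commits carry, and the tree ids are stand-in
hashes rather than real tree objects.

```python
import hashlib

def commit_id(tree_id, parent_ids, message):
    """Simplified commit id: a hash over the tree id, the parent ids and the message."""
    text = f"tree {tree_id}\n"
    text += "".join(f"parent {p}\n" for p in parent_ids)
    text += "\n" + message + "\n"
    return hashlib.sha1(text.encode()).hexdigest()

def build_history(messages):
    """Build a linear history and return the id of its tip."""
    tip = None
    for n, message in enumerate(messages):
        # Stand-in tree id; a real one would be derived from the file contents.
        tree = hashlib.sha1(f"snapshot {n}".encode()).hexdigest()
        tip = commit_id(tree, [tip] if tip else [], message)
    return tip

same = build_history(["initial", "second", "third"])
again = build_history(["initial", "second", "third"])
edited = build_history(["initial (reworded)", "second", "third"])
print(same == again)   # True: identical history gives an identical tip id
print(same == edited)  # False: a change in any ancestor changes the tip id
```

Because each id is computed over the parent ids, agreeing on the tip id
amounts to agreeing on every commit that leads up to it.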

> Lets say that I clone your git repository, and then we both merge the
> same diverged branch.  Will our head revision IDs match?  From a quick
> look at the logs of cairo, it seems that the commits generated for
> such a merge include the date and author, so the two commits would
> have different SHA1 sums (and hence different revision IDs).
> 
> So I'd have a revision you don't have and vice versa, even though the
> trees are identical.
> 

Merges preserve author and commit info. You may need to create a new 
branch (a git branch, the cheap kind which is a 41-byte file) and fetch 
"his" into "yours". This will be very cheap if you both have the same 
code but not the same history, as everything but a few commit-objects 
will be shared. A more likely scenario though is this:

Bob writes a feature that doesn't work as per spec. He doesn't know why.
He asks Alice to have a look, so he communicates the commits to her by 
"please pull this branch from here", or by sending patches and telling 
Alice the branch-point revision to apply them to.
Alice creates the "bobs-bugs/nr1232" branch at the branch-point and fetches
Bob's branch into that or applies the patches on top of that (in the
fetch scenario she wouldn't need to know the branch point, since git 
would figure this out for her).
She knows this should create a revision named 00123989aaddeddad39, so if 
it doesn't, she doesn't have the same code.


I imagine this works roughly the same in bazaar, although the original 
case where tests have already been done and the testers wanted to know 
if they had the exact same revision Just Works in git.

> 
>> In bzr, the only answer I'm hearing is attempting a merge to see if it
>> introduces any changes. (I'm deliberately avoiding "pull" since we're
>> talking about distributed cases here).
> 
> Or run "bzr missing".  If the sole missing revision is a merge (and
> not the revisions introduced by the merge), you could assume that you
> have the same tree state.
> 

"assume" != "know", or was that just sloppy phrasing?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26  9:33                                                           ` Junio C Hamano
@ 2006-10-26  9:57                                                             ` James Henstridge
  2006-10-26 10:10                                                               ` Jeff King
  0 siblings, 1 reply; 1752+ messages in thread
From: James Henstridge @ 2006-10-26  9:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: bazaar-ng, Matthew D. Fuller, Linus Torvalds, Andreas Ericsson,
	Carl Worth, git, Jakub Narebski

On 26/10/06, Junio C Hamano <junkio@cox.net> wrote:
> "James Henstridge" <james@jamesh.id.au> writes:
>
> > Can you really just rely on equal revision IDs meaning you have the
> > same code though?
>
> If you two have the same commit that is a guarantee that you two
> have identical trees.  The reverse is not true as logic 101
> would teach ;-).

That was the point I was trying to make.  Carl asserted that in git
you could tell if you had the same tree as someone else based on
revision IDs, which doesn't seem to be the case all the time.

The reverse assertion (that if you have the same revision ID, you have
the same tree) seems to hold equally in git and Bazaar.


> Doing fast-forward instead of doing a "useless" merges helps
> somewhat but not in cases like two people merging the same
> branches the same way or two people applying the same patch on
> top of the same commit.  You need to compare tree object IDs for
> that.

Sure, you can do the same in Bazaar by comparing the inventories for
the two revisions.

>
> >> In bzr, the only answer I'm hearing is attempting a merge to see if it
> >> introduces any changes. (I'm deliberately avoiding "pull" since we're
> >> talking about distributed cases here).
> >
> > Or run "bzr missing".  If the sole missing revision is a merge (and
> > not the revisions introduced by the merge), you could assume that you
> > have the same tree state.
>
> Is it "you could assume" or "it is guaranteed"?  If former, what
> kind of corner cases could invalidate that assumption?

The merge revision will also include any manual conflict resolution,
so the assumption breaks if the other person resolved the conflicts
differently.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26  9:57                                                             ` James Henstridge
@ 2006-10-26 10:10                                                               ` Jeff King
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
  0 siblings, 1 reply; 1752+ messages in thread
From: Jeff King @ 2006-10-26 10:10 UTC (permalink / raw)
  To: James Henstridge
  Cc: Junio C Hamano, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git, Jakub Narebski

On Thu, Oct 26, 2006 at 05:57:20PM +0800, James Henstridge wrote:

> >If you two have the same commit that is a guarantee that you two
> >have identical trees.  The reverse is not true as logic 101
> >would teach ;-).
> 
> That was the point I was trying to make.  Carl asserted that in git
> you could tell if you had the same tree as someone else based on
> revision IDs, which doesn't seem to be the case all the time.

If you have the same revision (commit IDs), you have the same tree (at
the same time, by the same committer, etc).

If you have a different revision (commit), you may or may not have the
same tree. You can then check the tree id, which will either be the same
(you have the same tree) or differ (you don't).

Thus, in the converse, if you have the same tree, you _will_ have the
same tree id. You may or may not have the same commit id.
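
A rough sketch of that distinction, assuming a commit layout modelled on
git's (tree line, parent lines, author and committer lines, message). The
names, e-mail addresses, timestamps and parent ids below are made up for
illustration, and `commit_id` is a hypothetical helper, not a git API.

```python
import hashlib

def object_id(kind, body):
    """sha1 over a '<kind> <size>' header, a NUL byte, and the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def commit_id(tree, parents, who, when, message):
    """Id of a commit that records a tree id plus who/when/why metadata."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who} {when}", f"committer {who} {when}", "", message]
    return object_id("commit", "\n".join(lines).encode() + b"\n")

tree = object_id("tree", b"")      # stand-in: both merges produce this tree
parents = ["1" * 40, "2" * 40]     # hypothetical ids of the two merged heads

alice = commit_id(tree, parents, "Alice <alice@example.com>",
                  "1161857600 +0000", "Merge branch 'side'")
bob = commit_id(tree, parents, "Bob <bob@example.com>",
                "1161857900 +0000", "Merge branch 'side'")

print(alice == bob)  # False: different committer and time, so different commit ids
# Both commits record the same tree id, so comparing `tree` answers
# "do we have the same code?" even though the commit ids differ.
```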


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 23:53                                                                             ` Matthew D. Fuller
@ 2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-26 10:45                                                                                 ` Erik Bågfors
                                                                                                   ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-26 10:13 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, David Lang, git

Matthew D. Fuller wrote:
> 
>> 3. pull from each other frequently to keep in sync.
>>
>> this changes the topology to
>>
>>    Master
>>    /   \
>>  dev1--dev2
>>
>> if they do this with bzr then the revno's break, they each get extra
>> commits showing up (so they can never show the same history).
> 
> These two are either/or, not and; either they pull (in which case
> their old mainline is no longer meaningful), or they merge (in which
> case they get the 'extra' merge commits).
> 
> 
>> in git this is a non-issue, they can pull back and forth and the
>> only new history to show up will be changes.
> 
> In git, this is a non-issue because you don't get to CHOOSE which way
> to work.

Yes they do. They can (and in this case probably will) create a 
topic-branch named "the-other-dev/featureX" and keep it solely for 
tracking the other peer's changes, keeping their own topic-branch for
their own changes, and another branch where they merge both changes in, 
or cherry-pick from each branch to get to the desired result fast. This 
works easily because in git
a) branches are as cheap as I can ever imagine an SCM making them.
b) the "slice the DAG and view anything you like from any branch you 
like any time you like and mix them however you want" approach of the 
visualizers makes it trivial for a 10-year old fledgling programmer to 
see what changes what, and where, and by whom, and why.

The "b" above was a feature I didn't know I needed until it became 
available to me. Thanks to Paul Mackerras (spelling?) for creating the 
wonderful gitk tool, and to Marco Costalba for making a faster and, imo, 
more capable version of it.

>  You always (if you can) pull and obliterate your local
> mainline.  In bzr, it's only an 'issue' because you CAN choose, and
> CAN maintain your local mainline.

Git puts emphasis on code. Bazaar puts emphasis on developers and 
branch-structure. Depending on your preference, I imagine one suits
some people better. I really, really, really don't care if my branch-tip 
gets moved because I hadn't made any changes to it while the other dev 
hacked away or if it causes a merge because we had decided to work on 
different parts of the feature. Perhaps this is a result of the insanely 
good visualizers (kudos again to Paul and Marco) that easily let me see
who did what when and where anyways. What I *do* care about is being 
able to easily make sure all the devs have the same code to work and 
test with.

>  You CAN choose, right now, to do a
> git and pull back and forth and only new history show up as changed by
> creating a 'bzr-pull' shell script that does a 'bzr pull || bzr merge'
> (though you'd be a lot better off adding a '--fast-forward-if-you-can'
> option to merge and aliasing that over).
> 
> More basically, though, I don't think that "histories become exactly
> equivalent" is a necessary pass-word to enter the Hallowed City of
> Truely Distributed Development.

The only issue I have with bzr's revno's and truly distributed setup is 
that, by looking at the table, it seems to claim that you have found 
some miraculous way to make revnos work without a central server. Since 
everyone agrees that they don't, this should IMO be listed as mutually 
exclusive features.

On a side-note, git has made my life easier, so I childishly want to 
defend it and see it on top of every list in the world. Something I'm 
sure I share with more people on this list and with some of the bazaar 
users/devs. ;-)



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
@ 2006-10-26 10:45                                                                                 ` Erik Bågfors
  2006-10-26 11:48                                                                                 ` Jakub Narebski
                                                                                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: Erik Bågfors @ 2006-10-26 10:45 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Matthew D. Fuller, bazaar-ng, David Lang, git

> On a side-note, git has made my life easier, so I childishly want to
> defend it and see it on top of every list in the world. Something I'm
> sure I share with more people on this list and with some of the bazaar
> users/devs. ;-)

Haha, I feel the same way about bzr. Some of the features that bazaar
has, such as how it preserves the leftmost parent and treats that
specially in some cases, are things that I REALLY love and don't want
to live without.

All in all, I feel that git and bazaar are both excellent products, and
it will be interesting to see what happens in the future.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:10                                                               ` Jeff King
@ 2006-10-26 10:52                                                                 ` Vincent Ladeuil
  2006-10-26 11:13                                                                   ` Jeff King
                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Vincent Ladeuil @ 2006-10-26 10:52 UTC (permalink / raw)
  To: Jeff King
  Cc: James Henstridge, Junio C Hamano, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Carl Worth, Andreas Ericsson, git, Jakub Narebski

>>>>> "Jeff" == Jeff King <peff@peff.net> writes:

    Jeff> On Thu, Oct 26, 2006 at 05:57:20PM +0800, James Henstridge wrote:
    >> >If you two have the same commit that is a guarantee that you two
    >> >have identical trees.  The reverse is not true as logic 101
    >> >would teach ;-).
    >> 
    >> That was the point I was trying to make.  Carl asserted that in git
    >> you could tell if you had the same tree as someone else based on
    >> revision IDs, which doesn't seem to be the case all the time.

    Jeff> If you have the same revision (commit IDs), you have
    Jeff> the same tree (at the same time, by the same committer,
    Jeff> etc).

    Jeff> If you have a different revision (commit), you may or
    Jeff> may not have the same tree. You can then check the tree
    Jeff> id, which will either be the same (you have the same
    Jeff> tree) or differ (you don't).

    Jeff> Thus, in the converse, if you have the same tree, you
    Jeff> _will_ have the same tree id. You may or may not have
    Jeff> the same commit id.

Ok, so git makes a distinction between the commit (code created by
someone) and the tree (code only).

Commits are defined by their parents.

Trees are defined by their content only ?

If that's the case, how do you proceed ? 

Calculate a sha1 representing the content (or the content of the
diff from parent) of all the files and dirs in the tree ?  Or
from the sha1s of the files and dirs themselves recursively based
on sha1s of the files and dirs they contain ?

I ask because the latter seems to provide some nice effects
similar to what makes BDD
(http://en.wikipedia.org/wiki/Binary_decision_diagram) so
efficient: you can compare graphs of any complexity or size in
O(1) by just comparing their signatures.

    Vincent



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
@ 2006-10-26 11:13                                                                   ` Jeff King
  2006-10-26 11:15                                                                     ` Jeff King
  2006-10-26 12:33                                                                     ` Vincent Ladeuil
  2006-10-26 11:18                                                                   ` Jakub Narebski
  2006-10-26 15:05                                                                   ` Linus Torvalds
  2 siblings, 2 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-26 11:13 UTC (permalink / raw)
  To: Vincent Ladeuil
  Cc: James Henstridge, Junio C Hamano, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Carl Worth, Andreas Ericsson, git, Jakub Narebski

On Thu, Oct 26, 2006 at 12:52:05PM +0200, Vincent Ladeuil wrote:

> Ok, so git make a distinction between the commit (code created by
> someone) and the tree (code only).

Yes (a commit is a tree, zero or more parents, commit message, and
author/committer info).

> Commits are defined by their parents.

Partially, yes.

> Trees are defined by their content only ?

Yes.

> Calculate a sha1 representing the content (or the content of the
> diff from parent) of all the files and dirs in the tree ?  Or
> from the sha1s of the files and dirs themselves recursively based
> on sha1s of the files and dirs they contain ?

Recursively. Each tree is an ordered list of 4-tuples: pathname, type,
sha1, mode. If the type is "blob" then the sha1 is the hash of the file
contents. If the type is "tree" then the sha1 is the id of a sub-tree.
The id of a tree is the sha1 hash of the data structure.
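
A minimal sketch of that recursion, assuming a directory is given as a
Python dict mapping names to either file contents (bytes) or nested dicts.
The entry layout (mode, name, NUL byte, binary id) is modelled on git's tree
objects, but only plain files and directories are handled, and `tree_id` is
a hypothetical helper written for this illustration.

```python
import hashlib

def object_id(kind, body):
    """sha1 over a '<kind> <size>' header, a NUL byte, and the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_id(node):
    """Return the id of a directory given as {name: bytes | dict}."""
    entries = []
    for name, child in node.items():
        if isinstance(child, dict):
            # Sub-tree: its id is computed recursively from its own entries.
            mode, child_id, sort_key = "40000", tree_id(child), name + "/"
        else:
            # Blob: its id depends only on the file contents.
            mode, child_id, sort_key = "100644", object_id("blob", child), name
        entries.append((sort_key.encode(), mode, name, child_id))
    body = b""
    for _, mode, name, child_id in sorted(entries):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(child_id)
    return object_id("tree", body)

mine = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 0; }\n"}}
yours = {"src": {"main.c": b"int main(void) { return 0; }\n"}, "README": b"hello\n"}
theirs = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 1; }\n"}}

print(tree_id(mine) == tree_id(yours))   # True: same content, same id
print(tree_id(mine) == tree_id(theirs))  # False: a nested change reaches the root id
```

Since a sub-tree contributes only its id to its parent, a change deep in the
hierarchy alters every id on the path up to the root and nothing else.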

> I ask because the later seems to provide some nice effects
> similar to what makes BDD
> (http://en.wikipedia.org/wiki/Binary_decision_diagram) so
> efficient: you can compare graphs of any complexity or size in
> O(1) by just comparing their signatures.

Yes, if two trees' hashes compare equal, they contain the same data. I
believe we are not currently using this optimization to find merge
differences, but there was some discussion earlier this week about doing
so.
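
The optimization being discussed can be sketched as a tree comparison that
never descends into a sub-tree whose hash matches on both sides. This is an
illustration only, not git's implementation: `digest` and `changed_paths`
are hypothetical helpers, the hash layout is simplified, and digests are
recomputed on every call here, whereas stored ids would make each comparison
a constant-time check.

```python
import hashlib

def digest(node):
    """Content hash of a file (bytes) or a directory (dict of name -> node)."""
    if isinstance(node, bytes):
        return hashlib.sha1(b"blob\0" + node).hexdigest()
    parts = [f"{name}\0{digest(child)}" for name, child in sorted(node.items())]
    return hashlib.sha1(("tree\0" + "\n".join(parts)).encode()).hexdigest()

def changed_paths(old, new, prefix=""):
    """List paths that differ, skipping any sub-tree whose digest matches."""
    if digest(old) == digest(new):
        return []                      # identical content: no need to descend
    if isinstance(old, bytes) or isinstance(new, bytes):
        return [prefix.rstrip("/")]    # a file changed (or changed kind)
    paths = []
    for name in sorted(set(old) | set(new)):
        if name not in old or name not in new:
            paths.append(prefix + name)            # added or removed entry
        else:
            paths += changed_paths(old[name], new[name], prefix + name + "/")
    return paths

old = {"a": b"1", "d": {"x": b"2", "y": b"3"}}
new = {"a": b"1", "d": {"x": b"2", "y": b"4"}, "n": b"5"}
print(changed_paths(old, new))  # ['d/y', 'n']
```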


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:13                                                                   ` Jeff King
@ 2006-10-26 11:15                                                                     ` Jeff King
  2006-10-26 12:33                                                                     ` Vincent Ladeuil
  1 sibling, 0 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-26 11:15 UTC (permalink / raw)
  To: Vincent Ladeuil
  Cc: James Henstridge, Junio C Hamano, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Carl Worth, Andreas Ericsson, git, Jakub Narebski

On Thu, Oct 26, 2006 at 07:13:39AM -0400, Jeff King wrote:

> Yes (a commit is a tree, zero or more parents, commit message, and
> author/committer info).

Sorry, I should clarify: a commit is a _tree id_, zero or more _parent
ids_, commit message, etc.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 17:21                                       ` David Rientjes
  2006-10-25 21:03                                         ` Jeff King
@ 2006-10-26 11:15                                         ` Andreas Ericsson
  2006-10-26 16:30                                           ` David Lang
  1 sibling, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-26 11:15 UTC (permalink / raw)
  To: David Rientjes; +Cc: Jeff King, Linus Torvalds, Lachlan Patrick, bazaar-ng, git

David Rientjes wrote:
> On Wed, 25 Oct 2006, Jeff King wrote:
> 
>>> This all became very obvious when the tutorials came out on "how to use 
>>> git in 20 commands or less" effectively.  These tutorials shouldn't need 
>>> to exist with an information manager that started as a quick, efficient, 
>>> and _simple_ project.  You're treating git development in the same light 
>> Sorry, I don't see how this is related to the programming language _at
>> all_. Are you arguing that the interface of git should be simplified so
>> that such tutorials aren't necessary? If so, then please elaborate, as
>> I'm sure many here would like to hear proposals for improvements. If
>> you're arguing that git now has too many features, then which features
>> do you consider extraneous?
>>
> 
> It's not, it's related to the original vision of git which was meant for 
> efficiency and simplicity.

Compared to today's version, original git was neither efficient nor 
simple. Unless you mean "some random version along the way where git had 
everything *I* need and not the useless cruft that other people use", in 
which case it's simply a very egotistical view of things.

>  A year ago it was very easy to pick up the 
> package and start using it effectively within a couple hours.   Keep in
> mind that this was without tutorials, it was just reading man pages.  
> Today it would be very difficult to know what the essential commands are 
> and how to use them simply to get the job done, unless you use the 
> tutorials.

Have you tried "git --help"? It shows the most common commands and a 
short description of what they do. It's a very good pointer to which 
man-pages you need to read, and I imagine this would actually be one of 
the very first commands that new git users try. If they don't but just 
expect things to work according to some premade mental model they have 
of scm's, I'd say they'd be screwed no matter which software they tried.


>  This _inherently_ goes against the approach of trying to 
> provide something that is simple to the developer.
> 
> Revision control is something that should exist in the background that 
> does it's simple job very efficiently.  Unfortunately git has tried to 
> move its presence into the foreground and requiring developers to spend 
> more time on learning the system.
> 

No it hasn't. The ten or so commands that Linus first introduced when 
announcing git still work pretty much the same. Nobody in their right 
mind would ever claim that those ten commands made up anything that even 
remotely resembled a complete scm, but they were something to build on 
by anyone who wanted to extend it. So far, ~220 people have wanted to 
extend it in ways that others thought useful, because their patches are 
apparently in the git tree.

> Have you never tried to show other people git without giving them a 
> tutorial on the most common uses?  Try it and you'll see the confusion.  
> That _specifically_ illustrates the ever-increasing lack of simplicity 
> that git has acquired.
> 

Well, my head hurt when I tried to learn CVS without a tutorial, and 
mercurial and darcs and svn as well. I didn't pick up the functionality 
of the 'ls' command completely without reading the man-page for it. If 
you want something that works for everyone without having to read any 
documentation whatsoever, buy Lego, 'cause computers ain't for you, my 
friend.

>> I don't agree with this. There are tons of enhancements that I find
>> useful (e.g., '...' rev syntax, rebasing with 3-way merge, etc) that I
>> think other developers ARE using. There are scalability and performance
>> improvements. And there are new things on the way (Junio's pickaxe work)
>> that will hopefully make git even more useful than it already is.
>>
> 
> There are _not_ scalability improvements.  There may be some slight 
> performance improvements, but definitely not scalability.  If you have 
> ever tried to use git to manage terabytes of data, you will see this 
> becomes very clear.  And "rebasing with 3-way merge" is not something 
> often used in industry anyway if you've followed the more common models 
> for revision control within large companies with thousands of engineers.  
> Typically they all work off mainline.
> 

Actually, I don't see why git shouldn't be perfectly capable of handling 
a repo containing several terabytes of data, provided you don't expect 
it to turn up the full history for the project in a couple of seconds 
and you don't actually *change* that amount of data in each revision. If 
you want a vcs that handles that amount with any kind of speed, I think 
you'll find rsync and raw rvs a suitable solution.

On the other hand, you fellas at google don't really use git to store 
the data from the search database, do you? I mean, it's written for 
source control management. People that tried to keep their mboxes in git 
failed miserably, because large files that constantly change just 
don't work well with git.

>> If you don't think recent git versions are worthwhile, then why don't
>> you run an old version? You can even use git to cherry-pick patches onto
>> your personal branch.
>>
> 
> I do.  And that's why I would recommend to any serious developer to use 
> 1.2.4; this same version that I used for kernel development at Google.
> 
>> Where?
>>
> 
> Few months back here on the mailing list.  When I tried cleaning up even 
> one program, I got the response back from the original author "why fix a 
> non-problem?" because his argument was that since it worked the code 
> doesn't matter.
> 
> 	http://marc.theaimsgroup.com/?l=git&m=115589472706036
> 
> And that is simply one thread of larger conversations that have taken 
> place off-list and aren't archived.
> 

First off, the code got changed as per Junio's desires. He's the 
maintainer and gets to choose about coding style and readability vs 
microoptimizations.

Second, why keep discussions about git development off-list?

Third, if you still have issues with it, why not provide a patch and see 
if Junio accepts it? Cleaner and faster code will, in my experience, 
always get accepted. Code that is cleaner from one dev's point of view 
but doesn't actually provide any other benefits will be dropped to the 
floor, and rightly so.


>> I don't agree, but since you haven't provided anything specific enough
>> to discuss, there's not much to say.
>>
> 
> If there's a question about some of the sloppiness in the git source code 
> as it stands today, that's a much bigger issue than the sloppiness.  My 
> advice would be to pick up a copy of K&R's 2nd edition C programming 
> language book, read it, and then take a tour of the source code.
> 

The first sentence doesn't make sense. The second one is just rude, and 
formed by your own opinion on how code should be written. But again, 
submit patches and see if Junio accepts them. If he doesn't, and you 
really, really *really* can't stand the changes he and the rest of the 
git community want in, fork your own version and hack away to your 
heart's content. Git makes it easy for you, whichever version you use.

>> Can you name one customization that you would like to perform now that
>> you feel can't be easily done (and presumably that would have been
>> easier in the past)?
>>
> 
> Yes, those mentioned above.
> 

Which ones? The git-mv changes you submitted were applied (although in a 
different shape), so there must be other ones. Rewriting C builtins as 
shell-scripts is not really an option, because portability and 
performance *do* matter.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
  2006-10-26 11:13                                                                   ` Jeff King
@ 2006-10-26 11:18                                                                   ` Jakub Narebski
  2006-10-26 15:05                                                                   ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-26 11:18 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Vincent Ladeuil wrote:

> Ok, so git make a distinction between the commit (code created by
> someone) and the tree (code only).
> 
> Commits are defined by their parents.
> 
> Trees are defined by their content only ?

Trees are collections of tuples: (mode, type, sha1, name), where mode
is a simplified mode of a file or directory (git tracks only whether the
entry is a symlink, a directory, a regular file or an executable file),
type is blob (file) or tree (directory), sha1 is the sha1 of the contents
of the given entry, and name is the filename of the given entry.
 
> If that's the case, how do you proceed ? 
> 
> Calculate a sha1 representing the content (or the content of the
> diff from parent) of all the files and dirs in the tree ?  Or
> from the sha1s of the files and dirs themselves recursively based
> on sha1s of the files and dirs they contain ?
 
The sha1 of an object is the sha1 of type+contents, if I remember correctly.
So the sha1 of a tree is based on the sha1s of the files and dirs it contains.
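
To make the recursion concrete, here is a minimal Python sketch, not git's
own code. It assumes the hashed data is a header of the form
"<type> <size>" plus a NUL byte, followed by the raw contents. The helper
names (git_object_id, tree_body) and the file names are made up for
illustration, and sorting entries by plain name is a simplification.

```python
import hashlib


def git_object_id(obj_type: bytes, body: bytes) -> str:
    """Hex SHA-1 of an object: header '<type> <size>' + NUL, then the body."""
    header = obj_type + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_sha1) tuples into a tree object body.

    Simplification (assumption): entries are sorted by plain name.
    """
    out = b""
    for mode, name, hex_sha1 in sorted(entries, key=lambda e: e[1]):
        # Each entry is "<mode> <name>", a NUL byte, then the 20 raw hash bytes.
        out += mode + b" " + name + b"\0" + bytes.fromhex(hex_sha1)
    return out


# Hypothetical project: one file at the top level, one in a subdirectory.
readme = git_object_id(b"blob", b"hello\n")
main_c = git_object_id(b"blob", b"int main(void) { return 0; }\n")

# The subdirectory's id depends only on the ids of what it contains ...
src_tree = git_object_id(b"tree", tree_body([(b"100644", b"main.c", main_c)]))

# ... and the top-level tree's id depends on the subdirectory's id in turn.
root_tree = git_object_id(b"tree", tree_body([
    (b"100644", b"README", readme),
    (b"40000", b"src", src_tree),
]))

print("README", readme)
print("src/  ", src_tree)
print("root  ", root_tree)
```

Changing a single byte in main.c changes its blob id, hence the id of src,
hence the id of the top-level tree.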
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:08                                   ` Junio C Hamano
  2006-10-25 21:16                                     ` Jeff King
  2006-10-25 21:50                                     ` Junio C Hamano
@ 2006-10-26 11:25                                     ` Andreas Ericsson
  2 siblings, 0 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-26 11:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, git, bazaar-ng, Linus Torvalds, Lachlan Patrick,
	David Rientjes

Junio C Hamano wrote:
> 
>  - Learn the itches David and other people have, that the
>    current git Porcelain-ish does not scratch well, and enrich
>    Documentation/technical with real-world working scripts built
>    around plumbing.
> 

Isn't this how git has been developed since day one, more or less? If a 
command is missing, it gets added as a shell-script. I agree with you 
that the "pipes from this sent here does this, and look how useful it is" 
lectures are gone since many commands were rewritten. Otoh, they're gone 
because the commands now instead provide examples of how to interface with 
the libified parts of git, so it's not a loss per se, just a switch in what 
they teach.

I also agree with David that shell is much more fun to muck around with 
and prototype in, because you see results so much faster. However, since 
our plumbing is so rock-solid (and getting extended with --stdin options 
to more and more commands), I see no reason why we shouldn't have a "how 
to extend git" with the old shell-based porcelain scripts up somewhere 
on the web. Perhaps it would kill two birds with one stone and increase 
the addition of new utilities to git, while at the same time keeping the 
already rewritten commands in C.

Btw, the old shell-versions still work with the new plumbing (well, 
mostly anyways). They just have problems with filenames and revisions 
with spaces and special chars and things like that, same as they've 
always had.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-26 10:45                                                                                 ` Erik Bågfors
@ 2006-10-26 11:48                                                                                 ` Jakub Narebski
  2006-10-26 11:54                                                                                   ` Nicholas Allen
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
  2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
  2006-10-26 13:47                                                                                 ` Aaron Bentley
  3 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-26 11:48 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Andreas Ericsson wrote:

> On a side-note, git has made my life easier, so I childishly want to 
> defend it and see it on top of every list in the world. Something I'm 
> sure I share with more people on this list and with some of the bazaar 
> users/devs. ;-)

Let us then review what started this thread, namely the comparison chart
between source control systems
  http://bazaar-vcs.org/RcsComparisons

1. Decentralized. O.K.

2. Disconnected Ops. O.K.

3. Simple Namespace. Should be named "Simple Rev Names" instead. Bazaar
should have a note that revnos work only for specific workflows
(star-topology); for Git it should perhaps be "Somewhat" here, as <ref>~<n>
(or <ref>@{<n>} if reflog is enabled) _are_ simple (if volatile for branch
<refs>). $(git-merge-base <ref1> <ref2>), usually "hidden" in the
<ref1>..<ref2> or <ref1>...<ref2> shortcut, is also simple, I think. There
was a huge discussion here about revnos, revids, workflows (development
topology), fast-forwards, empty merges etc. Bazaar-NG and Git put
emphasis on different things. Additionally, tag support removes some of the
perceived advantages of revnos; tags are simple.

4. Supports Renames. I could agree with "Somewhat" because of not yet
implemented --follow option to git-rev-list (and therefore all porcelain).
Perhaps it would be closer to truth to leave the marker (background color)
as for "Somewhat" and write "N/A" with note that Git has contents and
pathname based heuristic detection of renames, or just put "Detect" or
"Detection" here.

I would certainly change the description of what it means that an SCM doesn't
"Support Renames" or has it implemented partially. The current explanation
relies heavily on _implementation_. The correct wording of the current
definition would be that an SCM doesn't support renames if the history of a
file "as visible to the SCM" is broken into a before-rename and an
after-rename part, and that an SCM supports it partially if you can track the
history of a renamed file from its post-rename name but the history of the
pre-rename file is left in the void.
But with this definition Git _does_ "Support Renames".

I'd rather split "Supports Renames" into an engine part (does the SCM
remember/detect that a rename took place _as_ a rename, rather than
remembering/detecting it as copying+deletion, i.e. something other than a
rename) and a user interface part: can the user easily deal with renames
(this includes merging and viewing file history).

5 and 6. Needs Repository/Supports Repository. The name is very, very
unclean and stems from branch-centricness of Bazaar. Git should probably
have "Yes" here, as for Git branch is just reference to its tip in
revisions DAG (plus optionally branch tip history in reflog). On the other
hand Git _can_ share object database like branches can be gathered together
to share data into repository. You can have one-branch repositories, you
can clone whole repositories (perhaps Bazaar should have "Somewhat" for
Supports Repository as it doesn't support cloning of whole repository...
bzt, wrong, there is example plugin for that), and you can clone (using
Cogito) only one branch of repository and you can fetch only selected
branches of repository.

Thinking more about it those items should probably read "Support Individual
Branches" (as: can you get only the branch you are interested in, can SCM
support one-branch workflow) and "Support Branch Grouping" or "Support Data
Sharing" (as: can you share DAG between branches, can you share DAG between
repositories).

7. Checkouts (as a noun). This should probably read "Support Centralized and
Disconnected Centralized Workflow" but that is perhaps too wordy. Git would
have "No" for "Centralized" and "Somewhat" for "Disconnected Centralized",
meaning that you can set up a Git repository to be the equivalent of a
heavyweight checkout, and push changes to some given repository on commit.

8. Partial Checkouts (as a verb). Here Git should have perhaps "Minimal", as
you can have partial checkouts but only with care (and you still need whole
repository). "No?" is also correct (?).

9. Atomic Commits. O.K. You have to remember that there are consequences
of having Atomic Commits on the details of Partial Checkouts.

10. Cheap Branching Anywhere. Git should probably have "Yes! Yes! Yes!"
here ;-)

11. Smart Merge. O.K. Should probably be explained what constitutes smart
merging. Perhaps instead of "Yes" there should be name of default/smartest
merge strategy used?

12. Cherrypicks. What constitutes "Yes" here? Why "Somewhat" for Git?
It does have git-cherry-pick command for cherry picking...

13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
or "?" background color for Git. And add a note that it is easy to script up
a porcelain-ish command, and to add another merge strategy. There also was
an example plugin infrastructure for Cogito, so I'd opt for the "Somewhat"
marking.

14. Has Special Server. O.K.

15. Req. Dedicated Server. O.K.

16. Good Windows support. I'd put "Cygwin" instead of "No" for Git, although
with the same marking. And perhaps add note that Git relies heavily on
POSIX.

17 and 18. Fast Local Performance and Fast Network Performance. O.K.

19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
easy to use, but I don't have much experience with other SCMs. I wonder why
Bazaar has "No" there...


Too much rewriting to correct the page...


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:48                                                                                 ` Jakub Narebski
@ 2006-10-26 11:54                                                                                   ` Nicholas Allen
  2006-10-26 12:13                                                                                     ` Jakub Narebski
  2006-10-26 21:25                                                                                     ` Jeff King
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
  1 sibling, 2 replies; 1752+ messages in thread
From: Nicholas Allen @ 2006-10-26 11:54 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git


> 
> 4. Supports Renames. I could agree with "Somewhat" because of not yet
> implemented --follow option to git-rev-list (and therefore all porcelain).
> Perhaps it would be closer to truth to leave the marker (background color)
> as for "Somewhat" and write "N/A" with note that Git has contents and
> pathname based heuristic detection of renames, or just put "Detect" or
> "Detection" here.
> 
> I would certainly change description of what means that SCM doesn't "Support
> Renames" or has it implemented partially. Current explanation relies
> heavily on _implementation_. The correct wording of current definition
> would be that SCM doesn't support renames if history of a file "as visible
> to SCM" is broken into before rename and after rename part, and that SCM
> support it partially if you can track history of renamed file from
> post-rename name but there is left in void history of pre-rename file.
> But with this definition Git _does_ "Supports Renames".

I would have thought that supports renames would also involve flagging a 
conflict when merging a file that has been renamed on 2 separate 
branches. ie 2 branches rename the file to different names and then one 
branch is merged into the other. In this situation, the user should be 
told of a rename conflict. Bzr supports this as far as I know. Not sure 
about git though as I have never used it.




^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-26 10:45                                                                                 ` Erik Bågfors
  2006-10-26 11:48                                                                                 ` Jakub Narebski
@ 2006-10-26 12:12                                                                                 ` Matthew D. Fuller
  2006-10-26 12:18                                                                                   ` Jakub Narebski
  2006-10-26 13:47                                                                                 ` Aaron Bentley
  3 siblings, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-26 12:12 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: David Lang, bazaar-ng, git

On Thu, Oct 26, 2006 at 12:13:39PM +0200 I heard the voice of
Andreas Ericsson, and lo! it spake thus:
> Matthew D. Fuller wrote:
> >
> >In git, this is a non-issue because you don't get to CHOOSE which
> >way to work.
> 
> Yes they do.

Not where I was going with that section of the mail; I was looking at
just the merge vs fast-forward distinction.  In git, you don't get to
choose; in bzr you do.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:54                                                                                   ` Nicholas Allen
@ 2006-10-26 12:13                                                                                     ` Jakub Narebski
  2006-10-26 21:25                                                                                     ` Jeff King
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-26 12:13 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: bazaar-ng, git

Nicholas Allen wrote:
> Jakub Narebski wrote:
>> 
>> 4. Supports Renames. I could agree with "Somewhat" because of not yet
>> implemented --follow option to git-rev-list (and therefore all porcelain).
>> Perhaps it would be closer to truth to leave the marker (background color)
>> as for "Somewhat" and write "N/A" with note that Git has contents and
>> pathname based heuristic detection of renames, or just put "Detect" or
>> "Detection" here.
>> 
>> I would certainly change description of what means that SCM doesn't "Support
>> Renames" or has it implemented partially. Current explanation relies
>> heavily on _implementation_. The correct wording of current definition
>> would be that SCM doesn't support renames if history of a file "as visible
>> to SCM" is broken into before rename and after rename part, and that SCM
>> support it partially if you can track history of renamed file from
>> post-rename name but there is left in void history of pre-rename file.
>> But with this definition Git _does_ "Supports Renames".
> 
> I would have thought that supports renames would also involve flagging a 
> conflict when merging a file that has been renamed on 2 separate 
> branches. ie 2 branches rename the file to different names and then one 
> branch is merged into the other. In this situation, the user should be 
> told of a rename conflict. Bzr supports this as far as I know. Not sure 
> about git though as I have never used it.

If I remember correctly, Git usually resolves such a conflict. If it cannot
resolve it, it tells the user of the rename conflict (add/add conflict or
rename/add conflict).

Unfortunately, Git tutorial part 3 on merges is not yet written.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
@ 2006-10-26 12:18                                                                                   ` Jakub Narebski
  2006-10-26 15:06                                                                                     ` Matthew D. Fuller
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-26 12:18 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthew D. Fuller wrote:

> On Thu, Oct 26, 2006 at 12:13:39PM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
>> Matthew D. Fuller wrote:
>>>
>>>In git, this is a non-issue because you don't get to CHOOSE which
>>>way to work.
>> 
>> Yes they do.
> 
> Not where I was going with that section of the mail; I was looking at
> just the merge vs fast-forward distinction.  In git, you don't get to
> choose; in bzr you do.

You can get a similar workflow in git using the 'origin'/'master' pair, I think.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:13                                                                   ` Jeff King
  2006-10-26 11:15                                                                     ` Jeff King
@ 2006-10-26 12:33                                                                     ` Vincent Ladeuil
  2006-10-26 13:14                                                                       ` Rogan Dawes
  1 sibling, 1 reply; 1752+ messages in thread
From: Vincent Ladeuil @ 2006-10-26 12:33 UTC (permalink / raw)
  To: Jeff King; +Cc: bazaar-ng, git

>>>>> "Jeff" == Jeff King <peff@peff.net> writes:

    Jeff> On Thu, Oct 26, 2006 at 12:52:05PM +0200, Vincent Ladeuil wrote:
    >> Ok, so git make a distinction between the commit (code created by
    >> someone) and the tree (code only).

    Jeff> Yes (a commit is a tree, zero or more parents, commit message, and
    Jeff> author/committer info).

The parents of a tree are also trees or can/must they be commits ?

    >> Commits are defined by their parents.

    Jeff> Partially, yes.

I buy that this "partially" means "the other parts are irrelevant
to this discussion".

    >> Trees are defined by their content only ?

    Jeff> Yes.

So it is possible that : starting from a tree T,

- I make a patch A,
- you make the patch B,
- A and B are equal (stop watching above my shoulder please, or what is me ?),
- we both commit,
- we pull changes from each other repository.

We will end up with a tree T2 with a hash corresponding to both
T+A and T+B, but each of us will have a different commit id CA
and CB both pointing to T2, did I get it ?

    Vincent







^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 12:33                                                                     ` Vincent Ladeuil
@ 2006-10-26 13:14                                                                       ` Rogan Dawes
  0 siblings, 0 replies; 1752+ messages in thread
From: Rogan Dawes @ 2006-10-26 13:14 UTC (permalink / raw)
  To: Vincent Ladeuil; +Cc: bazaar-ng, git

Vincent Ladeuil wrote:
>>>>>> "Jeff" == Jeff King <peff@peff.net> writes:
> 
>     Jeff> On Thu, Oct 26, 2006 at 12:52:05PM +0200, Vincent Ladeuil wrote:
>     >> Ok, so git make a distinction between the commit (code created by
>     >> someone) and the tree (code only).
> 
>     Jeff> Yes (a commit is a tree, zero or more parents, commit message, and
>     Jeff> author/committer info).
> 
> The parents of a tree are also trees or can/must they be commits ?

This refers to the parents of a _commit_, not of a tree, and the parents 
must be _commits_. The parents allow us to determine what changed 
between the previous commit(s), and the current one. If there is more 
than one parent, then we have a merge commit.

So, a commit refers to a tree representing the state of the code at the 
time of the commit, as well as to any parent commit(s). If there are no 
parent commits, then the commit is an "initial commit" (i.e. the first 
checkin). A project can have multiple "initial commits", typically where 
two previously independent projects are merged together, c.f. gitk and git.

> 
>     >> Commits are defined by their parents.
> 
>     Jeff> Partially, yes.
> 
> I buy that this "partially" means "the other parts are irrelevant
> to this discussion".

Yes.

>     >> Trees are defined by their content only ?
> 
>     Jeff> Yes.
> 
> So it is possible that : starting from a tree T,
> 
> - I make a patch A,
> - you make the patch B,
> - A and B are equal (stop watching above my shoulder please, or what is me ?),
> - we both commit,
> - we pull changes from each other repository.
> 
> We will end up with a tree T2 with a hash corresponding to both
> T+A and T+B, but each of us will have a different commit id CA
> and CB both pointing to T2, did I get it ?
> 
>     Vincent

Yes. That is exactly right.

From there, we can either trivially merge CA and CB with a new merge 
commit referring to T2, but citing both CA and CB as parents, or simply 
discard one of the lines of development, depending on how much 
subsequent development cited CA or CB as parents.
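
A small Python sketch of the scenario above. It assumes a commit object is
a text record with tree, parent, author and committer lines followed by the
message, hashed behind a "commit <size>" header plus a NUL byte. The names,
addresses, timestamps and the stand-in tree ids are all made up.

```python
import hashlib


def git_object_id(obj_type: bytes, body: bytes) -> str:
    """Hex SHA-1 over the header '<type> <size>' + NUL, then the body."""
    header = obj_type + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, parents, author, committer, message):
    """Id of a commit built from its tree, parents, people and message."""
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]
    lines += ["author " + author, "committer " + committer, "", message]
    return git_object_id(b"commit", "\n".join(lines).encode("utf-8"))


# Stand-in tree ids (hypothetical): T is the starting tree, T2 is T plus the patch.
T = hashlib.sha1(b"tree T").hexdigest()
T2 = hashlib.sha1(b"tree T plus the patch").hexdigest()

vincent = "Vincent <vincent@example.invalid> 1161866000 +0200"
jeff = "Jeff <jeff@example.invalid> 1161866100 -0400"

base = commit_id(T, [], vincent, vincent, "Initial commit\n")

# Same parent, same resulting tree, but different author/committer lines.
CA = commit_id(T2, [base], vincent, vincent, "Apply fix\n")
CB = commit_id(T2, [base], jeff, jeff, "Apply fix\n")

# A trivial merge: the tree is still T2, and both CA and CB are parents.
merge = commit_id(T2, [CA, CB], vincent, vincent, "Merge\n")

print("CA   ", CA)
print("CB   ", CB)
print("merge", merge)
```

CA and CB differ even though both name T2, because the author, committer and
date lines are part of what gets hashed.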


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
                                                                                                   ` (2 preceding siblings ...)
  2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
@ 2006-10-26 13:47                                                                                 ` Aaron Bentley
  2006-10-26 13:53                                                                                   ` Jakub Narebski
  3 siblings, 1 reply; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-26 13:47 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Matthew D. Fuller, bazaar-ng, David Lang, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> The only issue I have with bzr's revno's and truly distributed setup is
> that, by looking at the table, it seems to claim that you have found
> some miraculous way to make revnos work without a central server. Since
> everyone agrees that they don't, this should IMO be listed as mutually
> exclusive features.

The "simple namespace" is both a URL and a revno.

And therefore, it's just as distributed and decentralized as the web.

There is very little difference between this:

http://example.com/mywebpage#5

And this:

http://example.com/mybranch 5

In fact, we've been planning to unify them into one identifier.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFQLxr0F+nu1YWqI0RAiVrAJ9rb+uylIuxqMo2VMelI3Qm6oNQOwCfeTAb
kOkp9kOkRl1YEVEP+G3y2SU=
=Zgsg

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 13:47                                                                                 ` Aaron Bentley
@ 2006-10-26 13:53                                                                                   ` Jakub Narebski
  2006-10-26 15:13                                                                                     ` Aaron Bentley
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-26 13:53 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Aaron Bentley wrote:

> Andreas Ericsson wrote:

>> The only issue I have with bzr's revno's and truly distributed setup is
>> that, by looking at the table, it seems to claim that you have found
>> some miraculous way to make revnos work without a central server. Since
>> everyone agrees that they don't, this should IMO be listed as mutually
>> exclusive features.
> 
> The "simple namespace" is both a URL and a revno.
> 
> And therefore, it's just as distributed and decentralized as the web.
> 
> There is very little difference between this:
> 
> http://example.com/mywebpage#5
> 
> And this:
> 
> http://example.com/mybranch 5
> 
> In fact, we've been planning to unify them into one identifier.

Well, then it is not much simpler than an 8-char sha1. And sha1 is more
decentralized, because you can use it when you don't have access to the net,
and when the _central_ revno server is down.
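
As a rough illustration of why no network is needed: an abbreviated sha1 can
be resolved purely against the object ids already stored locally. The Python
sketch below is an assumption about how such a lookup could work, not git's
implementation. The function name resolve is made up, and the three ids are
just sample values standing in for a local object store.

```python
def resolve(prefix: str, known_ids) -> str:
    """Return the single known id starting with `prefix`, else raise."""
    matches = [i for i in known_ids if i.startswith(prefix)]
    if len(matches) != 1:
        # Zero matches: unknown object. Several matches: prefix is ambiguous.
        raise LookupError(f"{prefix!r} matches {len(matches)} objects")
    return matches[0]


# Sample ids standing in for the objects present in a local repository.
local_ids = [
    "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
    "4b825dc642cb6eb9a060e54bf8d69288fbee4904",
    "ce013625030ba8dba906f756967f9e9ca394464a",
]

print(resolve("4b825dc6", local_ids))
```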
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
  2006-10-26 11:13                                                                   ` Jeff King
  2006-10-26 11:18                                                                   ` Jakub Narebski
@ 2006-10-26 15:05                                                                   ` Linus Torvalds
  2006-10-26 16:04                                                                     ` Vincent Ladeuil
  2 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-26 15:05 UTC (permalink / raw)
  To: Vincent Ladeuil
  Cc: Jeff King, James Henstridge, Junio C Hamano, bazaar-ng,
	Matthew D. Fuller, Carl Worth, Andreas Ericsson, git,
	Jakub Narebski



On Thu, 26 Oct 2006, Vincent Ladeuil wrote:
> 
> Ok, so git make a distinction between the commit (code created by
> someone) and the tree (code only).
> 
> Commits are defined by their parents.

Commits are defined by a _combination_ of:
 - the tree they commit (which is recursive, so the commit name indirectly 
   includes information about EVERY SINGLE BIT in the whole tree, in every 
   single file)
 - the parent(s) if any (which is also recursive, so the commit name 
   indirectly includes information about EVERY SINGLE BIT in not just the 
   current tree, but every tree in the history, and every commit that is 
   reachable from it)
 - the author, committer, and dates of each (and committer is actually 
   very often different from author)
 - the actual commit message

So a commit really names - uniquely and authoritatively - not just the 
commit itself, but everything ever associated with it.

> Trees are defined by their content only ?

Where "contents" does include names and permissions/types (eg execute bit 
and symlink etc).

> If that's the case, how do you proceed ? 

If you compare the commit name, and they are equal, you automatically know
 - the trees are 100% identical
 - the histories are 100% identical

If you only care about the actual tree, you compare the tree name for 
equality, ie you can do

	git-rev-parse commit1^{tree} commit2^{tree}

and compare the two: if and only if they are equal are the actual contents 
100% equal.

> Calculate a sha1 representing the content (or the content of the
> diff from parent) of all the files and dirs in the tree ?  Or
> from the sha1s of the files and dirs themselves recursively based
> on sha1s of the files and dirs they contain ?

The latter. 

> I ask because the later seems to provide some nice effects
> similar to what makes BDD
> (http://en.wikipedia.org/wiki/Binary_decision_diagram) so
> efficient: you can compare graphs of any complexity or size in
> O(1) by just comparing their signatures.

This is exactly what git does. You can compare entire trees (and 
subdirectories are just other trees) by just comparing 20 bytes of 
information.

How do you think we can do a diff between two arbitrary kernel revisions 
so fast? Why do you think we can afford to do a 

	git log drivers/usb include/linux/usb*

that literally picks out the history (by comparing state) for every commit 
in the tree?

I can do the above log-generation in less than ten _seconds_ for the last 
year and a half of the kernel. That's 20k+ lines of logs of commits that 
only touch those files and directories. And I _need_ it to be fast, 
because that's literally one of the most common operations I do.

And the reason it's fast is that we can compare 20,000 files (names, 
contents, permissions) by just comparing a _single_ 20-byte SHA1.
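
A toy Python sketch of that idea: if every directory's name is a hash of its
entries, two revisions can be compared top-down and any subtree whose hash is
unchanged is skipped without being opened. This is an illustration under
simplified assumptions (an in-memory store and a made-up hash payload format,
not git's real object encoding). The paths and helper names are hypothetical.

```python
import hashlib

TREES = {}  # tree id -> {name: (kind, id)}; a toy in-memory object store


def blob(content: bytes) -> str:
    """Id of a file, derived only from its contents."""
    return hashlib.sha1(b"blob\0" + content).hexdigest()


def tree(entries: dict) -> str:
    """Store a directory and return its id, derived only from its entries."""
    payload = "".join(
        f"{kind} {ident} {name}\n"
        for name, (kind, ident) in sorted(entries.items())
    )
    ident = hashlib.sha1(b"tree\0" + payload.encode("utf-8")).hexdigest()
    TREES[ident] = entries
    return ident


def changed_paths(a: str, b: str, prefix: str = "") -> list:
    """List the paths that differ between tree a and tree b."""
    if a == b:
        return []  # equal ids: everything below is identical, skip it
    old, new = TREES[a], TREES[b]
    out = []
    for name in sorted(set(old) | set(new)):
        ea, eb = old.get(name), new.get(name)
        if ea == eb:
            continue  # same kind and same id: unchanged entry
        if ea and eb and ea[0] == "tree" and eb[0] == "tree":
            out += changed_paths(ea[1], eb[1], prefix + name + "/")
        else:
            out.append(prefix + name)
    return out


def snapshot(hub_c: bytes) -> str:
    """Build a small hypothetical source tree; only hub.c varies."""
    usb = tree({"hub.c": ("blob", blob(hub_c))})
    net = tree({"e1000.c": ("blob", blob(b"net driver\n"))})
    drivers = tree({"usb": ("tree", usb), "net": ("tree", net)})
    include = tree({"usb.h": ("blob", blob(b"usb header\n"))})
    return tree({"drivers": ("tree", drivers), "include": ("tree", include)})


old_rev = snapshot(b"version 1\n")
new_rev = snapshot(b"version 2\n")
print(changed_paths(old_rev, new_rev))  # only the path under drivers/usb differs
```

Here drivers/net and include are never descended into, because their ids are
the same in both snapshots.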

In git, revision names (and _everything_ has a revision name: commits, 
trees, blobs, tags) really have meaning. They're not just random noise.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 12:18                                                                                   ` Jakub Narebski
@ 2006-10-26 15:06                                                                                     ` Matthew D. Fuller
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-26 15:06 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Thu, Oct 26, 2006 at 02:18:53PM +0200 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> 
> You can get similar workflow in git using 'origin'/'master' pair, I
> think.

Not the same, because as soon as your 'git pull' _can_ fast-forward, it
will.  You can't merge a set of changes from another branch that's a
strict superset of yours in, without it fast-forwarding them.

I suppose you could take great care to ensure that the other branch is
never in a position to be fast-forwarded onto yours, most simply just
by making forced commits before you do a pull, but that's revolting.
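
The rule I'm describing can be sketched in a few lines of Python. This is
only a model of the behaviour as stated above, not git's code; the commit
names and the function names are made up.

```python
def ancestors(graph, commit):
    """All commits reachable from `commit` via parent links, itself included."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(graph[c])
    return seen


def pull_action(graph, ours, theirs):
    """What a pull of `theirs` into `ours` does under the rule described above."""
    if theirs in ancestors(graph, ours):
        return "already up to date"
    if ours in ancestors(graph, theirs):
        # Their history is a strict superset of ours: just move our tip.
        return "fast-forward"
    return "merge commit"


# Hypothetical history: commit -> list of parents.
#   A <- B <- C     and     A <- D
graph = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"]}

print(pull_action(graph, "B", "C"))  # theirs contains all of ours
print(pull_action(graph, "C", "B"))  # ours already contains theirs
print(pull_action(graph, "C", "D"))  # histories have diverged
```

The forced commit I mentioned amounts to adding a commit on our side so that
the second test in pull_action can no longer succeed.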


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-26 13:53                                                                                   ` Jakub Narebski
@ 2006-10-26 15:13                                                                                     ` Aaron Bentley
  0 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-26 15:13 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>>There is very little difference between this:
>>
>>http://example.com/mywebpage#5
>>
>>And this:
>>
>>http://example.com/mybranch 5
>>
>>In fact, we've been planning to unify them into one identifier.
> 
> 
> Well, then it is not much simpler than 8-chars sha1. And sha1 is more
> decentralized, because you can use it when you don't have access to net,
> and when the _central_ revno server is down.

What do you mean by _central_ revno server?  example.com?  Does that
also apply to google.com?

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFQNCc0F+nu1YWqI0RAlslAJ0XJ8Fezxyn5Ty1oAcgAo4LdQEAvQCfbWk+
vVTmHwIuhyd7lhAxMm2uMZ8=
=c4pE


* Re: VCS comparison table
  2006-10-24  9:30                                                                     ` Jelmer Vernooij
@ 2006-10-26 15:22                                                                       ` Aaron Bentley
  0 siblings, 0 replies; 1752+ messages in thread
From: Aaron Bentley @ 2006-10-26 15:22 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git, Erik Bågfors

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jelmer Vernooij wrote:

> The graphical frontends to bzr, for example, don't know about revno's but 
> only about revids.

Gannotate shows revnos where appropriate.  Not sure about others.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFQNK60F+nu1YWqI0RAiGiAJ45IG/nHsl3/5rP23nxYLduopVj/QCfUX+9
E01mr0edaZld9aKMASRbo+o=
=YavT


* Re: VCS comparison table
  2006-10-26 15:05                                                                   ` Linus Torvalds
@ 2006-10-26 16:04                                                                     ` Vincent Ladeuil
  2006-10-26 16:21                                                                       ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Vincent Ladeuil @ 2006-10-26 16:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

>>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:

    Linus> On Thu, 26 Oct 2006, Vincent Ladeuil wrote:
    >> 
    >> Ok, so git makes a distinction between the commit (code created by
    >> someone) and the tree (code only).
    >> 
    >> Commits are defined by their parents.

    Linus> Commits are defined by a _combination_ of:

    Linus>  - the tree they commit (which is recursive, so the
    Linus>  commit name indirectly includes information about EVERY
    Linus>  SINGLE BIT in the whole tree, in every single file)

And here you keep that separate from any SCM related info,
right ?

    Linus>  - the parent(s) if any (which is also recursive, so
    Linus>  the commit name indirectly includes information about
    Linus>  EVERY SINGLE BIT in not just the current tree, but
    Linus>  every tree in the history, and every commit that is
    Linus>  reachable from it)

    Linus>  - the author, committer, and dates of each (and
    Linus>  committer is actually very often different from
    Linus>  author)

    Linus>  - the actual commit message

    Linus> So a commit really names - uniquely and authoritatively
    Linus> - not just the commit itself, but everything ever
    Linus> associated with it.

Thanks for the clarification. But no need to shout about EVERY
SINGLE BIT, the pointer to BDDs was already talking a bit about
bits :) 

But I agree, this is the important point that may be missed.

    >> Trees are defined by their content only ?

    Linus> Where "contents" does include names and
    Linus> permissions/types (eg execute bit and symlink etc).

Which can also be expressed as: "Everything the user can
manipulate outside the SCM context", right ?

    >> If that's the case, how do you proceed ? 

    Linus> If you compare the commit name, and they are equal,
    Linus> you automatically know

    Linus>  - the trees are 100% identical
    Linus>  - the histories are 100% identical

And that's the only info you can get, no ordering here. (Just
pointing out the obvious: as soon as you try to put more info into
the signature, the equality will vanish).

But for various optimizations this equality property is the only
needed one.

Do we agree ?

    Linus> If you only care about the actual tree, you compare
    Linus> the tree name for equality, ie you can do

    Linus> 	git-rev-parse commit1^{tree} commit2^{tree}

    Linus> and compare the two: if and only if they are equal are
    Linus> the actual contents 100% equal.

Actually, that's backwards:

"their actual contents are equal" implies "their signatures are
equal".

But, two totally different trees can have the same signature.

My god ! What a horror ! Not. I even wonder if I will live long
enough to see it occur... So we *can* pretend that:

"their signatures are equal" is equivalent to "their contents
are equal"

And that's all we care about :)

But I digressed, the question was about a detail on your tree
definition, once the signature is defined to be unique (as in
canonical), the property of comparing the signatures as if they
were the objects themselves follows. Thanks for the confirmation.

    >> Calculate a sha1 representing the content (or the content
    >> of the diff from parent) of all the files and dirs in the
    >> tree ?  Or from the sha1s of the files and dirs themselves
    >> recursively based on sha1s of the files and dirs they
    >> contain ?

    Linus> The latter. 

Thanks for providing the clarification. So of course, finding the
differences between the trees is quick, you can prune anywhere
the signatures equality is verified.
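
A minimal sketch of that pruning, assuming a toy in-memory tree where a file
is a bytes value and a directory is a dict. The hashing scheme and the
function names are illustrative only and are not git's object format; a real
implementation would also read stored names rather than recompute them on
every comparison.

```python
import hashlib


def name_of(node):
    """Toy content hash: a file is bytes, a directory is a dict of name -> node."""
    if isinstance(node, bytes):
        return hashlib.sha1(b"blob\0" + node).hexdigest()
    digest = hashlib.sha1(b"tree\0")
    for entry in sorted(node):
        # A directory's name depends on each entry name and each child's name.
        digest.update(entry.encode() + b"\0" + name_of(node[entry]).encode())
    return digest.hexdigest()


def diff(a, b, path=""):
    """Yield paths that differ, skipping any subtree whose names match."""
    if name_of(a) == name_of(b):
        return  # equal names: prune the whole subtree without descending
    if isinstance(a, bytes) or isinstance(b, bytes):
        yield path or "/"
        return
    for entry in sorted(set(a) | set(b)):
        child = f"{path}/{entry}"
        if entry not in a or entry not in b:
            yield child  # added or removed entry
        else:
            yield from diff(a[entry], b[entry], child)
```

Changing one file changes the name of every directory above it and nothing
else, so the walk only descends along the path to the change.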

    >> I ask because the latter seems to provide some nice effects
    >> similar to what makes BDD
    >> (http://en.wikipedia.org/wiki/Binary_decision_diagram) so
    >> efficient: you can compare graphs of any complexity or size in
    >> O(1) by just comparing their signatures.

    Linus> This is exactly what git does. You can compare entire
    Linus> trees (and subdirectories are just other trees) by
    Linus> just comparing 20 bytes of information.

I understand that, years ago even. I have a bit of practice with
BDDs and I am accustomed to that so lovely property. But without
that practice, I think most people will just wonder...

<snip/>

    Linus> And the reason it's fast is that we can compare 20,000
    Linus> files (names, contents, permissions) by just comparing
    Linus> a _single_ 20-byte SHA1.

Yeah, let's go further ! We can compare gazillions of files and
their history since epoch by comparing _two_ signatures ! :-)

    Linus> In git, revision names (and _everything_ has a
    Linus> revision name: commits, trees, blobs, tags) really
    Linus> have meaning. They're not just random noise.

I know that effect, but I understand people complaining that they
*look* like noise. 

I'm still searching for a parallel in nature, but the best I could
find is DNA, ever looked at DNA ?

Looks like noise, no ? No ordering either between parents and
children... But there is a way to identify a parent from the DNA
of a child...



* Re: VCS comparison table
  2006-10-26 16:04                                                                     ` Vincent Ladeuil
@ 2006-10-26 16:21                                                                       ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-26 16:21 UTC (permalink / raw)
  To: Vincent Ladeuil; +Cc: bazaar-ng, git



On Thu, 26 Oct 2006, Vincent Ladeuil wrote:

> >>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:
> 
>     Linus> Commits are defined by a _combination_ of:
> 
>     Linus>  - the tree they commit (which is recursive, so the
>     Linus>  commit name indirectly includes information about EVERY
>     Linus>  SINGLE BIT in the whole tree, in every single file)
> 
> And here you keep that separate from any SCM related info,
> right ?

I don't understand that question.

The commits contain the tree information. A raw commit in git (this is the 
true contents of the current top commit in my kernel tree, just added 
indentation and an empty line between the command I used to generate it 
and the output, to make it stand out better in the email) looks something 
like this:

   [torvalds@g5 linux]$ git-cat-file commit HEAD

   tree ba1ed8c744654ca91ee2b71b7cdee149c8edbef1
   parent 2a4f739dfc59edd52eaa37d63af1bd830ea42318
   parent 012d64ff68f304df1c35ce5902f5023dc14b643f
   author Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700
   committer Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700
   
   Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
   
   * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
     [SPARC64]: Fix memory corruption in pci_4u_free_consistent().
     [SPARC64]: Fix central/FHC bus handling on Ex000 systems.

where the _name_ of the commit is 

   [torvalds@g5 linux]$ git-rev-parse HEAD

   e80391500078b524083ba51c3df01bbaaecc94bb

ie the commit itself contains the exact tree name (and the name of the 
parents), and the name of the commit is literally the SHA1 of the contents 
of the commit (plus a git-specific header).
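
As a rough illustration of "the SHA1 of the contents plus a git-specific
header", the sketch below hashes an object body prefixed with a
"<type> <size>" header and a NUL byte. The function name is made up for this
example. The same rule applies to commits, with the type "commit" and the raw
commit text as the body.

```python
import hashlib


def git_object_name(obj_type: str, body: bytes) -> str:
    """SHA-1 over "<type> <size>" + NUL, followed by the raw object body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# The empty blob has a well-known name:
print(git_object_name("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the tree and parent names are part of the commit body, changing any
of them changes the commit's own name.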

>     >> Trees are defined by their content only ?
> 
>     Linus> Where "contents" does include names and
>     Linus> permissions/types (eg execute bit and symlink etc).
> 
> Which can also be expressed as: "Everything the user can
> manipulate outside the SCM context", right ?

Again, I'm not sure what you mean by that. The SCM does not track 
_everything_. It does not track user names and inode numbers, so in a 
sense a developer can change things that the SCM simply doesn't _care_ 
about and never tracks. But yes, the tree contents uniquely identify the 
exact contents that the user cares about.

>     Linus> If you compare the commit name, and they are equal,
>     Linus> you automatically know
> 
>     Linus>  - the trees are 100% identical
>     Linus>  - the histories are 100% identical
> 
> And that's the only info you can get, no ordering here.

No, there is ordering there too. But yes, the ordering is not in the name 
itself, you have to go look at the actual commit history to see it.

The name is just an identifier.

>     Linus> If you only care about the actual tree, you compare
>     Linus> the tree name for equality, ie you can do
> 
>     Linus> 	git-rev-parse commit1^{tree} commit2^{tree}
> 
>     Linus> and compare the two: if and only if they are equal are
>     Linus> the actual contents 100% equal.
> 
> Actually, that's backwards:
> 
> "their actual contents are equal" implies "their signatures are
> equal".

No. 

If the signatures are equal, the contents are equal, and vice versa. It 
really is a two-way thing.

> But, two totally different trees can have the same signature.

No. Don't even think that way. That just confuses you. The hash is 
cryptographic, and large enough, that you really can equate the contents 
with the hash. Anything else is just not even interesting.
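
For a sense of scale, here is a back-of-the-envelope sketch using the
standard birthday approximation. It assumes hash outputs behave like uniform
random 160-bit values, so it covers accidental collisions only and says
nothing about deliberate attacks on the hash function. The function name and
the object count used below are illustrative.

```python
from math import expm1

HASH_BITS = 160  # SHA-1 output size in bits


def collision_probability(num_objects: int, bits: int = HASH_BITS) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = num_objects * (num_objects - 1) / 2 ** (bits + 1)
    return -expm1(-exponent)  # expm1 keeps precision for tiny exponents


# Even a repository with 2**32 objects stays far below any practical threshold.
p = collision_probability(2**32)
```

Under these assumptions the probability for 2**32 objects is on the order of
1e-29, which is why equal names are treated as equal contents.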



* Re: VCS comparison table
  2006-10-26 11:15                                         ` Andreas Ericsson
@ 2006-10-26 16:30                                           ` David Lang
  2006-10-26 17:03                                             ` Nicolas Pitre
  0 siblings, 1 reply; 1752+ messages in thread
From: David Lang @ 2006-10-26 16:30 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: David Rientjes, Jeff King, Linus Torvalds, Lachlan Patrick,
	bazaar-ng, git

On Thu, 26 Oct 2006, Andreas Ericsson wrote:

>> 
>> There are _not_ scalability improvements.  There may be some slight 
>> performance improvements, but definitely not scalability.  If you have ever 
>> tried to use git to manage terabytes of data, you will see this becomes 
>> very clear.  And "rebasing with 3-way merge" is not something often used in 
>> industry anyway if you've followed the more common models for revision 
>> control within large companies with thousands of engineers.  Typically they 
>> all work off mainline.
>> 
>
> Actually, I don't see why git shouldn't be perfectly capable of handling a 
> repo containing several terabytes of data, provided you don't expect it to 
> turn up the full history for the project in a couple of seconds and you don't 
> actually *change* that amount of data in each revision. If you want a vcs 
> that handles that amount with any kind of speed, I think you'll find rsync 
> and raw rvs a suitable solution.

actually, there are some real problems in this area. the git pack format can't
be larger than 4G, and I wouldn't be surprised if there were other issues with
files larger than 4G (these all boil down to 32 bit limits). once these limits
are dealt with then you will be right.



* Re: VCS comparison table
  2006-10-26 16:30                                           ` David Lang
@ 2006-10-26 17:03                                             ` Nicolas Pitre
  2006-10-26 17:04                                               ` David Lang
  2006-10-26 17:45                                               ` Jakub Narebski
  0 siblings, 2 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-26 17:03 UTC (permalink / raw)
  To: David Lang
  Cc: Andreas Ericsson, David Rientjes, Jeff King, Linus Torvalds,
	Lachlan Patrick, bazaar-ng, git

On Thu, 26 Oct 2006, David Lang wrote:

> On Thu, 26 Oct 2006, Andreas Ericsson wrote:
> 
> > > 
> > > There are _not_ scalability improvements.  There may be some slight
> > > performance improvements, but definitely not scalability.  If you have
> > > ever tried to use git to manage terabytes of data, you will see this
> > > becomes very clear.  And "rebasing with 3-way merge" is not something
> > > often used in industry anyway if you've followed the more common models
> > > for revision control within large companies with thousands of engineers.
> > > Typically they all work off mainline.
> > > 
> >
> > Actually, I don't see why git shouldn't be perfectly capable of handling a
> > repo containing several terabytes of data, provided you don't expect it to
> > turn up the full history for the project in a couple of seconds and you
> > don't actually *change* that amount of data in each revision. If you want a
> > vcs that handles that amount with any kind of speed, I think you'll find
> > rsync and raw rvs a suitable solution.
> 
> actually, there are some real problems in this area. the git pack format can't
> be larger than 4G, and I wouldn't be surprised if there were other issues with
> files larger than 4G (these all boil down to 32 bit limits). once these limits
> are dealt with then you will be right.

There is no such limit on the pack format.  A pack itself can be as 
large as you want.  The 4G limit is in the tool not the format.

The actual pack limits are as follows:

	- a pack can have infinite size

	- a pack cannot have more than 4294967296 objects

	- each non-delta object can be of infinite size

	- delta objects can be of infinite size themselves but...

	- current delta encoding can use base objects no larger than 4G

The _code_ is currently limited to 4G though, especially on 32-bit 
architectures.  The delta issue could be resolved in a backward 
compatible way but it hasn't been formalized yet.

The pack index is actually limited to 32 bits, meaning it can cope with
packs no larger than 4G.  But the pack index is a local matter and not
part of the protocol, so it is not a big issue to define a new index
format and automatically convert existing indexes at that point.
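
To show where the 4G ceiling comes from, the sketch below stores an object's
starting offset in a 4-byte unsigned field. The big-endian layout and the
function name are assumptions made for illustration; the only point taken
from the discussion is that the index offsets are 32 bits wide.

```python
import struct


def encode_index_offset(offset: int) -> bytes:
    """Pack an object's starting offset as a 32-bit big-endian unsigned int."""
    return struct.pack(">I", offset)


# The largest representable starting offset is 2**32 - 1.
last_ok = encode_index_offset(2**32 - 1)

# One byte further and the field overflows: struct raises struct.error.
try:
    encode_index_offset(2**32)
    overflowed = False
except struct.error:
    overflowed = True
```

Since only the locally generated index stores these offsets, widening the
field affects the index file and not the pack data that travels over the
network.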




* Re: VCS comparison table
  2006-10-26 17:03                                             ` Nicolas Pitre
@ 2006-10-26 17:04                                               ` David Lang
  2006-10-26 17:16                                                 ` Linus Torvalds
  2006-10-26 17:24                                                 ` Nicolas Pitre
  2006-10-26 17:45                                               ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: David Lang @ 2006-10-26 17:04 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Andreas Ericsson, David Rientjes, Jeff King, Linus Torvalds,
	Lachlan Patrick, bazaar-ng, git

On Thu, 26 Oct 2006, Nicolas Pitre wrote:

> On Thu, 26 Oct 2006, David Lang wrote:
>
>> On Thu, 26 Oct 2006, Andreas Ericsson wrote:
>>
>>>>
>>>> There are _not_ scalability improvements.  There may be some slight
>>>> performance improvements, but definitely not scalability.  If you have
>>>> ever tried to use git to manage terabytes of data, you will see this
>>>> becomes very clear.  And "rebasing with 3-way merge" is not something
>>>> often used in industry anyway if you've followed the more common models
>>>> for revision control within large companies with thousands of engineers.
>>>> Typically they all work off mainline.
>>>>
>>>
>>> Actually, I don't see why git shouldn't be perfectly capable of handling a
>>> repo containing several terabytes of data, provided you don't expect it to
>>> turn up the full history for the project in a couple of seconds and you
>>> don't actually *change* that amount of data in each revision. If you want a
>>> vcs that handles that amount with any kind of speed, I think you'll find
>>> rsync and raw rvs a suitable solution.
>>
>> actually, there are some real problems in this area. the git pack format can't
>> be larger than 4G, and I wouldn't be surprised if there were other issues with
>> files larger than 4G (these all boil down to 32 bit limits). once these limits
>> are dealt with then you will be right.
>
> There is no such limit on the pack format.  A pack itself can be as
> large as you want.  The 4G limit is in the tool not the format.
>
> The actual pack limits are as follows:
>
> 	- a pack can have infinite size
>
> 	- a pack cannot have more than 4294967296 objects
>
> 	- each non-delta objects can be of infinite size
>
> 	- delta objects can be of infinite size themselves but...
>
> 	- current delta encoding can use base objects no larger than 4G
>
> The _code_ is currently limited to 4G though, especially on 32-bit
> architectures.  The delta issue could be resolved in a backward
> compatible way but it hasn't been formalized yet.
>
> The pack index is actually limited to 32-bits meaning it can cope with
> packs no larger than 4G.  But the pack index is a local matter and not
> part of the protocol so this is not a big issue to define a new index
> format and automatically convert existing indexes at that point.

the offset within a pack for the starting location of an object cannot be larger 
than 4G.



* Re: VCS comparison table
  2006-10-26 17:04                                               ` David Lang
@ 2006-10-26 17:16                                                 ` Linus Torvalds
  2006-10-26 17:24                                                 ` Nicolas Pitre
  1 sibling, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-10-26 17:16 UTC (permalink / raw)
  To: David Lang
  Cc: Nicolas Pitre, Andreas Ericsson, David Rientjes, Jeff King,
	Lachlan Patrick, bazaar-ng, git



On Thu, 26 Oct 2006, David Lang wrote:
> 
> the offset within a pack for the starting location of an object cannot be
> larger than 4G.

Well, strictly speaking, even that isn't actually a limit on the _pack_ 
format itself.  It's really just the (totally separate) index that 
currently uses 32-bit offsets.

For example, you can actually use the pack-file to transfer more than 4GB 
of data over the network. You'd not need to change the format at all. Only 
the local _index_ of the result needs to change - but we never transfer 
that at all (it's always generated locally), so that's really a separate 
issue.

It's not even hard to fix. It's just that right now, the biggest 
repository that we know about (mozilla) is not even close to the limit. 
And it took them ten years to get there. So if the mozilla people switch 
to git, and keep going at the same rate, we have about 70 years left 
before we need to fix the indexing ;)

(Of course, other projects, like the kernel, seem to grow faster, so it 
might be "only" a decade or two - but since the index format is a local 
thing, even that won't be too painful, since we don't really need a global 
flag-day once we decide to start supporting larger offsets in the index)



* Re: VCS comparison table
  2006-10-26 17:04                                               ` David Lang
  2006-10-26 17:16                                                 ` Linus Torvalds
@ 2006-10-26 17:24                                                 ` Nicolas Pitre
  1 sibling, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-26 17:24 UTC (permalink / raw)
  To: David Lang
  Cc: Andreas Ericsson, David Rientjes, Jeff King, Linus Torvalds,
	Lachlan Patrick, bazaar-ng, git

On Thu, 26 Oct 2006, David Lang wrote:

> On Thu, 26 Oct 2006, Nicolas Pitre wrote:
> 
> > The pack index is actually limited to 32-bits meaning it can cope with
> > packs no larger than 4G.
> 
> the offset within a pack for the starting location of an object cannot be
> > larger than 4G.

To be more exact, yes.  But I don't think we'll ever consider use 
scenarios with packs > 4G with the current index format.  There is 
simply no point.




* Re: VCS comparison table
  2006-10-26 17:03                                             ` Nicolas Pitre
  2006-10-26 17:04                                               ` David Lang
@ 2006-10-26 17:45                                               ` Jakub Narebski
  1 sibling, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-26 17:45 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Nicolas Pitre wrote:

> On Thu, 26 Oct 2006, David Lang wrote:
> 
>> actually, there are some real problems in this area. the git pack format can't
>> be larger than 4G, and I wouldn't be surprised if there were other issues with
>> files larger than 4G (these all boil down to 32 bit limits). once these limits
>> are dealt with then you will be right.
> 
> There is no such limit on the pack format.  A pack itself can be as 
> large as you want.  The 4G limit is in the tool not the format.
[...]
> The _code_ is currently limited to 4G though, especially on 32-bit 
> architectures.  The delta issue could be resolved in a backward 
> compatible way but it hasn't been formalized yet.
> 
> The pack index is actually limited to 32-bits meaning it can cope with 
> packs no larger than 4G.  But the pack index is a local matter and not 
> part of the protocol so this is not a big issue to define a new index 
> format and automatically convert existing indexes at that point.

If I remember correctly those issues are under development:
1. There is work on 64-bit index
2. There is work that would allow having multiple packs, repacking only one
   of the packs and treating the rest as 'archive packs' (which can be more
   aggressively packed). This solution is to split the pack into multiple packs.
3. There is work on mmapping only part of a pack, which would avoid the 4G limit
   even on 32-bit machines, if I understand it correctly.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: VCS comparison table
  2006-10-26 11:54                                                                                   ` Nicholas Allen
  2006-10-26 12:13                                                                                     ` Jakub Narebski
@ 2006-10-26 21:25                                                                                     ` Jeff King
  1 sibling, 0 replies; 1752+ messages in thread
From: Jeff King @ 2006-10-26 21:25 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Jakub Narebski, bazaar-ng, git

On Thu, Oct 26, 2006 at 01:54:38PM +0200, Nicholas Allen wrote:

> I would have thought that supports renames would also involve flagging a 
> conflict when merging a file that has been renamed on 2 separate 
> branches. ie 2 branches rename the file to different names and then one 
> branch is merged into the other. In this situation, the user should be 
> told of a rename conflict. Bzr supports this as far as I know. Not sure 
> about git though as I have never used it.

It works as you expect:

$ git-init-db
$ touch foo
$ git-add foo
$ git-commit -m foo
Committing initial tree 4d5fcadc293a348e88f777dc0920f11e7d71441c
$ git-checkout -b other
$ git-mv foo bar
$ git-commit -m bar
$ git-checkout master
$ git-mv foo baz
$ git-commit -m baz
$ git-pull . other
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with 5a1dfd32c56a24d0ef06f0e71d731fcd49d5dc6e
Merging:
76ac76ee3ce890d43648ebc009d278dc81a327e0 baz
5a1dfd32c56a24d0ef06f0e71d731fcd49d5dc6e bar
found 1 common ancestor(s):
c9e7e95de6fdbb2af06ea44cc60d1ac1a63eaad6 foo
CONFLICT (rename/rename): Rename foo->baz in branch HEAD rename foo->bar
in 5a1dfd32c56a24d0ef06f0e71d731fcd49d5dc6e
Automatic merge failed; fix conflicts and then commit the result.



* Re: VCS comparison table
  2006-10-26 11:48                                                                                 ` Jakub Narebski
  2006-10-26 11:54                                                                                   ` Nicholas Allen
@ 2006-10-27  2:02                                                                                   ` Horst H. von Brand
  2006-10-27  2:08                                                                                     ` Petr Baudis
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
  1 sibling, 2 replies; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-27  2:02 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> wrote:

[...]

> I'd rather split "Supports Renames" into engine part (does SCM
> remember/detect that rename took place _as_ rename, not remember/detect it
> as copying+deletion; something other than rename) and user interface part:
> can user easily deal with renames (this includes merging and viewing file
> history).

I think that what the tool does in its guts is completely irrelevant, what
is important is what the user sees. Sadly, it seems hard to describe
exactly what is meant/wanted here.

[...]

> 7. Checkouts (as a noun). This should probably read "Support Centralized and
> Disconnected Centralized Workflow" but that is perhaps too wordy. Git would
> have "No" for "Centralized"

Why? We could all agree that some repository is "central" and all push/pull
there. Or send patches by mail (or apply them via ssh). Sure, it's not CVS,
but...

[...]

> 13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
> or "?" background color for Git. And add a note that it is easy to script up
> a porcelain-ish command, and to add another merge strategy. There was also an
> example plugin infrastructure for Cogito, so I'd opt for a "Somewhat"
> marking.

Mostly an implementation detail for "extensible"...

[...]

> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
> easy to use, but I don't have much experience with other SCMs. I wonder why
> Bazaar has "No" there...

Extremely subjective. Easy to learn doesn't cut it either.



* Re: VCS comparison table
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
@ 2006-10-27  2:08                                                                                     ` Petr Baudis
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-10-27  2:08 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: bazaar-ng, git, Jakub Narebski

Dear diary, on Fri, Oct 27, 2006 at 04:02:32AM CEST, I got a letter
where "Horst H. von Brand" <vonbrand@inf.utfsm.cl> said that...
> Jakub Narebski <jnareb@gmail.com> wrote:
> > 7. Checkouts (as a noun). This should probably read "Support Centralized and
> > Disconnected Centralized Workflow" but that is perhaps too wordy. Git would
> > have "No" for "Centralized"
> 
> Why? We could all agree that some repository is "central" and all push/pull
> there. Or send patches by mail (or apply them via ssh). Sure, it's not CVS,
> but...

An ability to configure the tool so that the centralized workflow is
_enforced_ may be important for managers. It's stupid, but it's what is
meant there, I think.

> > 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
> > easy to use, but I don't have much experience with other SCMs. I wonder why
> > Bazaar has "No" there...
> 
> Extremely subjective. Easy to learn doesn't cut it either.

I don't think this column makes sense at all. I swear I've seen
*several* people that claimed GNU Arch was easy to learn/use for them!




* Re: VCS comparison table
  2006-10-21 21:04                       ` Linus Torvalds
  2006-10-21 23:58                         ` Linus Torvalds
  2006-10-22  0:09                         ` Erik Bågfors
@ 2006-10-27  4:51                         ` Jan Hudec
  2006-10-28 11:38                           ` Jakub Narebski
  2 siblings, 1 reply; 1752+ messages in thread
From: Jan Hudec @ 2006-10-27  4:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Erik Bågfors, Matthieu Moy, bazaar-ng, Sean, git, Jakub Narebski

On Sat, Oct 21, 2006 at 02:04:56PM -0700, Linus Torvalds wrote:
> On Sat, 21 Oct 2006, Erik Bågfors wrote:
> > bzr is a fully decentralized VCS. I've read this thread for quite some
> > time now and I really cannot understand why people come to this
> > conclusion.
> 
> Even the bzr people agree, so what's not to understand?
> 
> The revision numbers are totally unstable in a distributed environment 
> _unless_ you use a certain work-flow. And that work-flow is definitely not 
> "distributed" it's much closer to "disconnected centralized".
> 
> Now, you could be truly distributed: BK used the same revision numbering 
> thing, but was distributed. But BK didn't even try to claim that their 
> revision numbers were "simple" and that fast-forwarding is sometimes the 
> wrong thing to do.
> 
> So BK always fast-forwarded, and the revision numbers were just randomly 
> changing numbers. They weren't stable, they weren't simple, and nobody 
> claimed they were.
> 
> So bzr can bite the bullet and say: "revision numbers are changing and 
> meaningless, and we should just fast-forward on merges", or you should 
> just admit that bzr is really more about "disconnected operation" than 
> truly distributed.
> 
> You can't have your cake and eat it too. Truly distributed _cannot_ be 
> done with a stable dotted numbering scheme (unless the "dotted numbering 
> scheme" is just a way to show a hash like git does - so the numbering has 
> no _sequential_ meaning).
> 
> Btw, this isn't just an "opinion". This is a _fact_. It's something they 
> teach in any good introductory course to distributed algorithms. Usually 
> it's talked about in the context of "global clock". 
> 
> Anybody who thinks that there exists a globally ticking clock in the 
> system (and stably increasing dotted numbers are just one such thing) is 
> talking about some fantasy-world that doesn't exist, or a world that has 
> nothing to do with "distributed".
> 
> 			Linus

Actually bzr used to have a slightly different numbering scheme not long
ago. There was a revision-history in each branch listing the revisions
in the order in which they were committed or merged in. Some time ago it was
changed to numbering along the leftmost parent, which was, IIRC, deemed
simpler and a little more logical. But in the light of these arguments,
maybe the former system was better -- it was more dependent on the
actual location, but on the other hand it allowed (or could allow --
IIRC there was some problem with it) fast-forward merges while
_locally_ keeping the meaning of old revision numbers. In fact, the
revision-history used to be almost exactly the same as git reflog,
except it only stored the revids, not the times.
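
To make "numbering along the leftmost parent" concrete, here is a rough
sketch in C. It is not bzr's implementation; the struct and the function
names are invented for illustration, and it only handles revisions with at
most two parents.

```c
#include <stddef.h>

/* Illustrative only: a revision with at most two parents. */
struct rev {
	const char *id;
	const struct rev *first_parent;		/* leftmost parent (mainline) */
	const struct rev *second_parent;	/* merged-in side, or NULL */
};

/* Number of revisions on the leftmost-parent chain ending at tip. */
static int mainline_length(const struct rev *tip)
{
	int n = 0;

	for (; tip; tip = tip->first_parent)
		n++;
	return n;
}

/*
 * Revision number of target as seen from tip: its 1-based position on
 * tip's leftmost-parent chain, or 0 if target is only reachable through
 * a merged-in parent and so has no simple number in this branch.
 */
static int revno(const struct rev *tip, const struct rev *target)
{
	int n = mainline_length(tip);

	for (; tip; tip = tip->first_parent, n--)
		if (tip == target)
			return n;
	return 0;
}
```

With A <- B in one branch and A <- C in another, a merge whose leftmost
parent is B numbers B as 2 and gives C no mainline number; the mirror-image
merge made in the other branch does the opposite. That is the dependence on
location mentioned above.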

--------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
  2006-10-27  2:08                                                                                     ` Petr Baudis
@ 2006-10-27  9:34                                                                                     ` Andreas Ericsson
  2006-10-27 10:49                                                                                       ` Jakub Narebski
  2006-10-27 14:46                                                                                       ` J. Bruce Fields
  1 sibling, 2 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-27  9:34 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: Jakub Narebski, git, bazaar-ng

Horst H. von Brand wrote:
> Jakub Narebski <jnareb@gmail.com> wrote:
> 
> [...]
> 
>> I'd rather split "Supports Renames" into engine part (does SCM
>> remember/detect that rename took place _as_ rename, not remember/detect it
>> as copiying+deletion; something other than rename) and user interface part:
>> can user easily deal with renames (this includes merging and viewing file
>> history).
> 
> I think that what to tool does in its guts is completely irrelevant, what
> is important is what the user sees. Sadly, it seems hard to describe
> exactly what is meant/wanted here.
> 

Agreed. I'd rather make the definition "Can users, after a rename has 
taken place, follow the history of the file-contents across renames?". 
Mainly because this is clearly unambiguous, doesn't involve 
implementation details and only weighs what really counts: User-visible 
capabilities.

IMNSHO, I'd rather have all the features in the list be along the lines 
of "Can users/admins/random-boon do X?" and instead of "yes/no" list the 
number of commands/the amount of time required to achieve the desired 
effect. This would set a clear limit and put most terminology issues out 
of the way.

> 
>> 13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
>> or "?" background color for Git. And add note that it is easy to script up
>> porcelanish command, and to add another merge strategy. There also was
>> example plugin infrastructure for Cogito, so I'd opt for "Someahwt"
>> marking.
> 
> Mostly an implementation detail for "extensible"...
> 

Yup. Any fast-growing SCM can clearly be said to be "extensible", 
otherwise it wouldn't be extended ;-)

> [...]
> 
>> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
>> easy to use, but I have not much experiences with other SCM. I wonder why
>> Bazaar has "No" there...
> 
> Extremely subjective. Easy to learn doesn't cut it either.

This one just needs to go. Could possibly be replaced with "Has 
tutorial/documentation online" or some such. No SCM is really intuitive 
to users that haven't experienced any of them before, so the only thing 
that really matters is how much documentation one can find online and 
how up-to-date it is.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
@ 2006-10-27 10:49                                                                                       ` Jakub Narebski
  2006-10-27 11:41                                                                                         ` Andreas Ericsson
  2006-10-27 14:46                                                                                       ` J. Bruce Fields
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-27 10:49 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Horst H. von Brand, git, bazaar-ng

On 10/27/06, Andreas Ericsson <ae@op5.se> wrote:
> Horst H. von Brand wrote:
>> Jakub Narebski <jnareb@gmail.com> wrote:
>>
>> [...]
>>
>>> I'd rather split "Supports Renames" into engine part (does SCM
>>> remember/detect that rename took place _as_ rename, not remember/detect
>>> it as copiying+deletion; something other than rename) and user interface
>>> part: can user easily deal with renames (this includes merging and
>>> viewing file history).
>>
>> I think that what to tool does in its guts is completely irrelevant, what
>> is important is what the user sees. Sadly, it seems hard to describe
>> exactly what is meant/wanted here.
>
> Agreed. I'd rather make the definition "Can users, after a rename has
> taken place, follow the history of the file-contents across renames?".
> Mainly because this is clearly unambiguous, doesn't involve
> implementation details and only weighs what really counts: User-visible
> capabilities.

With this definition (for this part) it would be "Somewhat" for Git, because
the user can track the history of file contents across renames, but some
additional steps are required... until --follow=<pathname> gets implemented,
that is. Yet "tracking file contents across renames" depends on the specific
workflow used; for example, with Git you usually track [some part of] the
history of some subpart of a project, not the history of a single file.
(I'd name it "History Rename Support" or "Log Rename Support").

But equally important for the user is another question related to
"Supporting Renames". Namely, detection of renames during merge and
detection of conflicts during merge is what I would consider minimal
"Merge Renames Support". Causing information to be lost is having no
"Merge Renames Support". To have "Yes" in this column an SCM has to
resolve conflicts at least in the obvious cases, and "Yes!" if it can
remember the resolution of a merge conflict involving renames ;-).

> IMNSHO, I'd rather have all the features in the list be along the lines
> of "Can users/admins/random-boon do X?" and instead of "yes/no" list the
> number of commands/the amount of time required to achieve the desired
> effect. This would set a clear limit and put most terminology issues out
> of the way.

This would make the comparison table less clear, unfortunately.

>>> 13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
>>> or "?" background color for Git. And add note that it is easy to script up
>>> porcelanish command, and to add another merge strategy. There also was
>>> example plugin infrastructure for Cogito, so I'd opt for "Someahwt"
>>> marking.
>>
>> Mostly an implementation detail for "extensible"...
>>
>
> Yup. Any fast-growing SCM can clearly be said to be "extensible",
> otherwise it wouldn't be extended ;-)

I'd put "Easily Extensible" here, and put "Plugins (core+UI)" for Bazaar-NG,
and "Scriptable (UI+merge)" for Git, or something like that.

>> [...]
>>
>>> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
>>> easy to use, but I have not much experiences with other SCM. I wonder why
>>> Bazaar has "No" there...
>>
>> Extremely subjective. Easy to learn doesn't cut it either.
>
> This one just needs to go. Could possibly be replaced with "Has
> tutorial/documentation online" or some such. No SCM is really intuitive
> to users that haven't experienced any of them before, so the only thing
> that really matters is how much documentation one can find online and
> how up-to-date it is.

For example, an SCM can be easy to use, but at the cost of simplifications
and limited usefulness.

On the other hand, the basic concept behind some SCMs might be more
or less understandable...
-- 

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-27 10:49                                                                                       ` Jakub Narebski
@ 2006-10-27 11:41                                                                                         ` Andreas Ericsson
  0 siblings, 0 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-10-27 11:41 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Horst H. von Brand, git, bazaar-ng

Jakub Narebski wrote:
> On 10/27/06, Andreas Ericsson <ae@op5.se> wrote:
>> Horst H. von Brand wrote:
>>> Jakub Narebski <jnareb@gmail.com> wrote:
>>>
>>> [...]
>>>
>>>> I'd rather split "Supports Renames" into engine part (does SCM
>>>> remember/detect that rename took place _as_ rename, not remember/detect
>>>> it as copiying+deletion; something other than rename) and user interface
>>>> part: can user easily deal with renames (this includes merging and
>>>> viewing file history).
>>>
>>> I think that what to tool does in its guts is completely irrelevant, what
>>> is important is what the user sees. Sadly, it seems hard to describe
>>> exactly what is meant/wanted here.
>>
>> Agreed. I'd rather make the definition "Can users, after a rename has
>> taken place, follow the history of the file-contents across renames?".
>> Mainly because this is clearly unambiguous, doesn't involve
>> implementation details and only weighs what really counts: User-visible
>> capabilities.
> 

[...]

> But equally important for user is another question related to
> "Supporting Renames".
> Namely detection of renames during merge and detection of conflict during
> merge is what I would consider minimal "Merge Renames Support". Causing
> information to be lost is having no "Merge Renames Support". To have "Yes"
> in this column SCM have to resolve conflict at least in obvious cases, and
> "Yes!" if it can remember resolution of merge conflict involving renames ;-).
> 

True.

>> IMNSHO, I'd rather have all the features in the list be along the lines
>> of "Can users/admins/random-boon do X?" and instead of "yes/no" list the
>> number of commands/the amount of time required to achieve the desired
>> effect. This would set a clear limit and put most terminology issues out
>> of the way.
> 
> This would make the comparison table less clear, unfortunately.
> 

True that. Perhaps just stick with Yes/No and have a timing table to 
compare merge times, multi-parent merge times and stuff like that.

> 
>>> [...]
>>>
>>>> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
>>>> easy to use, but I have not much experiences with other SCM. I wonder why
>>>> Bazaar has "No" there...
>>>
>>> Extremely subjective. Easy to learn doesn't cut it either.
>>
>> This one just needs to go. Could possibly be replaced with "Has
>> tutorial/documentation online" or some such. No SCM is really intuitive
>> to users that haven't experienced any of them before, so the only thing
>> that really matters is how much documentation one can find online and
>> how up-to-date it is.
> 
> For example SCM can be easy to use but at the cost of simplifications
> and limited useness.
> 
> On the other side basic concept behind some SCM might be more
> or less understandable...

Yes, but it will always be based on personal opinion and that's why it
can never be measured in an unbiased way. It would be like playing
Trivial Pursuit and getting the question "Which 20th-century author
wrote the best books?". There are actually two problems with that
question, but the important one is that it can't be answered correctly
in this wonderful world we live in where everyone has their own opinion.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Creating new repos
@ 2006-10-27 12:29 Horst H. von Brand
  2006-10-27 12:39 ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-27 12:29 UTC (permalink / raw)
  To: git

I'm trying to set up git repos for remote access here. I set up git-daemon
and got it working (some older repositories do work fine), but now:

  $ mkdir /var/scm/user/SomeRepo.git
  $ cd /var/scm/user/SomeRepo.git
  $ git --bare init-db
  $ touch git-daemon-export-ok

All OK, but then, from a remote machine:

  $ git clone git://git-server/user/SomeRepo.git
  fatal: no matching remote head
  fetch-pack from 'git://git-server/user/Test.git' failed.

The empty repo created by init-db should be cloneable, so as to get the ball
rolling easily.

You can't push into such an empty repository either.

Any suggestion of how to set up a server into which users can create their
own repos /without/ giving out full shell accounts?

Also, the behaviour of git-daemon is different when using git or ssh+git:
you need to give the full path for the latter but not the former (given
--base-path=/var/scm):

  git://git-server/user/Test.git
  ssh+git://git-server/var/scm/user/Test.git

Is this intentional?
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Creating new repos
  2006-10-27 12:29 Creating new repos Horst H. von Brand
@ 2006-10-27 12:39 ` Petr Baudis
  2006-10-27 17:08   ` Horst H. von Brand
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-10-27 12:39 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: git

Dear diary, on Fri, Oct 27, 2006 at 02:29:10PM CEST, I got a letter
where "Horst H. von Brand" <vonbrand@inf.utfsm.cl> said that...
> I'm trying to set up git repos for remote access here. I set up git-daemon
> and got it working (some older repositories do work fine), but now:
> 
>   $ mkdir /var/scm/user/SomeRepo.git
>   $ cd /var/scm/user/SomeRepo.git
>   $ git --bare init-db
>   $ touch git-daemon-export-ok
> 
> All OK, but then, from a remote machine:
> 
>   $ git clone git://git-server/user/SomeRepo.git
>   fatal: no matching remote head
>   fetch-pack from 'git://git-server/user/Test.git' failed.
> 
> The empty repo created by init-db should be cloneable, so as to get the ball
> rolling easily.

Well there's really nothing to clone, so it's tough. :-) What would such
a clone be supposed to do? It has no branches to record as belonging to
origin, and note that Git's git-clone is long-term broken in the sense
that it won't pick new branches as they appear in the remote
repository. So a clone of an empty repository would be useless anyway.

> You can't push into such an empty repository either.

This is supposed to work. What error do you get?

> Any suggestion of how to set up a server into which users can create their
> own repos /without/ giving out full shell accounts?

Sure:

	http://repo.or.cz/w/repo.git

But it may be quite an overkill for you. ;-)

If you want them to be able to do it over ssh, you would need to provide
a trusted tool which would manage the repository setup, that means not
only doing init-db, but also managing the export-ok files, description
file, you'd likely want to enable the post-update hook (but probably not
any other hook or let the user edit it since at that point you've given
him full shell access), etc. And the tool would have to be carefully
reviewed since it's security-critical.
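
As one small piece of what such a tool would have to get right, here is a
sketch in C of checking a repository name supplied by an untrusted user.
The function name and the exact rules are assumptions made up for
illustration; a real tool would need far more than this (permissions, the
export-ok and description files, hooks).

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical check for a repository name supplied by an untrusted
 * user: a single path component made of letters, digits, '_', '-' and
 * '.', not starting with '.', and ending in ".git".
 * Returns 1 if the name is acceptable, 0 otherwise.
 */
static int repo_name_ok(const char *name)
{
	size_t len = strlen(name);
	size_t i;

	if (len <= 4 || strcmp(name + len - 4, ".git") != 0)
		return 0;	/* need at least one character before ".git" */
	if (name[0] == '.')
		return 0;	/* no hidden names, no leading "../" */
	for (i = 0; i < len; i++) {
		unsigned char ch = (unsigned char)name[i];

		if (!isalnum(ch) && ch != '_' && ch != '-' && ch != '.')
			return 0;	/* rejects '/', spaces, shell metacharacters */
	}
	return 1;
}
```

Rejecting anything outside a short whitelist is simpler to review than
trying to enumerate the dangerous characters.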

> Also, the behaviour of git-daemon is different when using git or ssh+git,
> you need to give the full path for the later but not the former (given
> --base-path=/var/scm):
> 
>   git://git-server/user/Test.git
>   ssh+git://git-server/var/scm/user/Test.git
> 
> Is this intentional?

git+ssh has nothing to do with git-daemon, it's executing other git
commands remotely.
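
Roughly, the difference between the two URLs can be sketched like this in
C. This is not git-daemon's actual code; the function name is invented, and
the sketch ignores all the checks a real daemon has to make on the
requested path.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical illustration, not git-daemon's actual code: map the path
 * taken from a git:// URL onto the filesystem by prefixing base_path.
 * Over ssh no such mapping happens; the path in the URL is handed to
 * the remote command as-is, so it has to be a real filesystem path.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int daemon_fs_path(const char *base_path, const char *url_path,
			  char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "%s%s",
			 base_path ? base_path : "", url_path);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With base_path "/var/scm", the git:// path "/user/Test.git" ends up at
/var/scm/user/Test.git, which is the path you have to spell out in full
when going over ssh.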

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
  2006-10-27 10:49                                                                                       ` Jakub Narebski
@ 2006-10-27 14:46                                                                                       ` J. Bruce Fields
  2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
  1 sibling, 1 reply; 1752+ messages in thread
From: J. Bruce Fields @ 2006-10-27 14:46 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Horst H. von Brand, Jakub Narebski, git, bazaar-ng

On Fri, Oct 27, 2006 at 11:34:09AM +0200, Andreas Ericsson wrote:
> Horst H. von Brand wrote:
> >Jakub Narebski <jnareb@gmail.com> wrote:
> >>19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
> >>easy to use, but I have not much experiences with other SCM. I wonder why
> >>Bazaar has "No" there...
> >
> >Extremely subjective. Easy to learn doesn't cut it either.
> 
> This one just needs to go.

It's certainly a hard question to answer, and will never be answered
completely, but unfortunately it's also a really *important* question.
The best SCM in the world isn't much use if I can't convince my
coworkers to learn the thing.

So I think it's helpful to attempt to find out whether we have a problem
here or not, even if the problem is more one of perception than reality.
Though obviously it would be more helpful to have something more
detailed than just a yes or no answer to "is git easy to use?"

> Could possibly be replaced with "Has tutorial/documentation online" or
> some such. No SCM is really intuitive to users that haven't
> experienced any of them before, so the only thing that really matters
> is how much documentation one can find online and how up-to-date it
> is.

Documentation helps, though sometimes extensive documentation is a sign
of a problem--it takes a lot more documentation to explain how to manage
a branch in CVS than it does in any sensible system....


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Creating new repos
  2006-10-27 12:39 ` Petr Baudis
@ 2006-10-27 17:08   ` Horst H. von Brand
  2006-10-28 14:19     ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-27 17:08 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Horst H. von Brand, git

Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Fri, Oct 27, 2006 at 02:29:10PM CEST, I got a letter
> where "Horst H. von Brand" <vonbrand@inf.utfsm.cl> said that...
> > I'm trying to set up git repos for remote access here. I set up git-daemon
> > and got it working (some older repositories do work fine), but now:
> > 
> >   $ mkdir /var/scm/user/SomeRepo.git
> >   $ cd /var/scm/user/SomeRepo.git
> >   $ git --bare init-db
> >   $ touch git-daemon-export-ok
> > 
> > All OK, but then, from a remote machine:
> > 
> >   $ git clone git://git-server/user/SomeRepo.git
> >   fatal: no matching remote head
> >   fetch-pack from 'git://git-server/user/Test.git' failed.

> > The empty repo created by init-db should be cloneable, so as to get the
> > ball rolling easily.

> Well there's really nothing to clone, so it's tough. :-) What would such
> a clone be supposed to do? It has no branches to record as belonging to
> origin, and note that Git's git-clone is long-term broken in the sense
> that it won't pick new branches as they appear in the remote
> repository. So a clone of an empty repository would be useless anyway.

As useless as the empty set? ;-)

> > You can't push into such an empty repository either.
> 
> This is supposed to work. What error do you get?

Pilot error. Sorry for the noise.

> > Any suggestion of how to set up a server into which users can create their
> > own repos /without/ giving out full shell accounts?
> 
> Sure:
> 
> 	http://repo.or.cz/w/repo.git

Cloning... 
"error: Can't lock ref" (?)

OK, got it; the repo is at git://repo.or.cz/repo.git. Better not to call it
*.git

> But it may be quite an overkill for you. ;-)

Will see.

> If you want them to be able to do it over ssh, you would need to provide
> a trusted tool which would manage the repository setup, that means not
> only doing init-db, but also managing the export-ok files, description
> file, you'd likely want to enable the post-update hook (but probably not
> any other hook or let the user edit it since at that point you've given
> him full shell access), etc. And the tool would have to be carefully
> reviewed since it's security-critical.

I was fearing something along these lines...

> > Also, the behaviour of git-daemon is different when using git or ssh+git,
> > you need to give the full path for the later but not the former (given
> > --base-path=/var/scm):
> > 
> >   git://git-server/user/Test.git
> >   ssh+git://git-server/var/scm/user/Test.git
> > 
> > Is this intentional?
> 
> git+ssh has nothing to do with git-daemon, it's executing other git
> commands remotely.

OK. But from a UI POV it is confusing.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Generating docu in 1.4.3.3.g01929
@ 2006-10-27 17:26 Horst H. von Brand
  2006-10-27 19:44 ` Sean
  2006-10-27 21:34 ` Junio C Hamano
  0 siblings, 2 replies; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-27 17:26 UTC (permalink / raw)
  To: git

I'm getting lots of these after today's pull:

asciidoc -b docbook -d manpage -f asciidoc.conf git-daemon.txt
xmlto -m callouts.xsl man git-daemon.xml
error : unterminated entity reference                
error : unterminated entity reference                
error : unterminated entity reference             ...
error : unterminated entity reference                
error : unterminated entity reference                
Writing git-daemon.1 for refentry
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-27 17:26 Generating docu in 1.4.3.3.g01929 Horst H. von Brand
@ 2006-10-27 19:44 ` Sean
       [not found]   ` <20061027154433.da9b29d7.seanlkml@sympatico.ca>
  2006-10-27 21:34 ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-27 19:44 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: git

On Fri, 27 Oct 2006 14:26:53 -0300
"Horst H. von Brand" <vonbrand@inf.utfsm.cl> wrote:

> I'm getting lots of these after today's pull:
> 
> asciidoc -b docbook -d manpage -f asciidoc.conf git-daemon.txt
> xmlto -m callouts.xsl man git-daemon.xml
> error : unterminated entity reference                
> error : unterminated entity reference                
> error : unterminated entity reference             ...
> error : unterminated entity reference                
> error : unterminated entity reference                
> Writing git-daemon.1 for refentry

Can't reproduce this here on master or on next with:
 asciidoc-7.1.2-0 and xmlto-0.0.18-13.1
Maybe this is an Asciidoc 8 issue, are you using it?


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: prune/prune-packed
  2006-10-23  3:27       ` prune/prune-packed Junio C Hamano
  2006-10-23 18:39         ` prune/prune-packed Petr Baudis
@ 2006-10-27 21:19         ` Jon Loeliger
  2006-10-27 21:55           ` prune/prune-packed Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Jon Loeliger @ 2006-10-27 21:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: gitzilla, J. Bruce Fields, Git List

On Sun, 2006-10-22 at 22:27, Junio C Hamano wrote:

> Sorry, but you are right and Linus is more right.  How about
> doing FRSX.
> 
> diff --git a/pager.c b/pager.c
> index 8bd33a1..4587fbb 100644
> --- a/pager.c
> +++ b/pager.c
> @@ -50,7 +50,7 @@ void setup_pager(void)
>  	close(fd[0]);
>  	close(fd[1]);
>  
> -	setenv("LESS", "FRS", 0);
> +	setenv("LESS", "FRSX", 0);
>  	run_pager(pager);
>  	die("unable to execute pager '%s'", pager);
>  	exit(255);

I'm a little confused by all this because I
set the LESS environment variable by myself
already.  And I use the value that I like.
Why change it or override the user's settings
like this?  Or did I miss something?

Thanks,
jdl


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-27 17:26 Generating docu in 1.4.3.3.g01929 Horst H. von Brand
  2006-10-27 19:44 ` Sean
@ 2006-10-27 21:34 ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-27 21:34 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: git

"Horst H. von Brand" <vonbrand@inf.utfsm.cl> writes:

> I'm getting lots of these after today's pull:
>
> asciidoc -b docbook -d manpage -f asciidoc.conf git-daemon.txt
> xmlto -m callouts.xsl man git-daemon.xml
> error : unterminated entity reference                
> error : unterminated entity reference                
> error : unterminated entity reference             ...
> error : unterminated entity reference                
> error : unterminated entity reference                
> Writing git-daemon.1 for refentry

Is it only with git-daemon.txt (as opposed to other files like
git-cat-file.txt), is it only with generating git-daemon.1 (as
opposed to generating git-daemon.html), and is it only with
today's pull (as opposed to 1.4.3.3)?

The point I am getting at is if it is only for you and if so
we might want to pinpoint where the breakage is.

I do not see it with my xmlto and asciidoc combination, either
on FC6 or on Debian testing.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: prune/prune-packed
  2006-10-27 21:19         ` prune/prune-packed Jon Loeliger
@ 2006-10-27 21:55           ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-27 21:55 UTC (permalink / raw)
  To: Jon Loeliger; +Cc: git

Jon Loeliger <jdl@freescale.com> writes:

> On Sun, 2006-10-22 at 22:27, Junio C Hamano wrote:
>
>> Sorry, but you are right and Linus is more right.  How about
>> doing FRSX.
>> 
>> diff --git a/pager.c b/pager.c
>> index 8bd33a1..4587fbb 100644
>> --- a/pager.c
>> +++ b/pager.c
>> @@ -50,7 +50,7 @@ void setup_pager(void)
>>  	close(fd[0]);
>>  	close(fd[1]);
>>  
>> -	setenv("LESS", "FRS", 0);
>> +	setenv("LESS", "FRSX", 0);
>>  	run_pager(pager);
>>  	die("unable to execute pager '%s'", pager);
>>  	exit(255);
>
> I'm a little confused by all this because I
> set the LESS environment variable by myself
> already.  And I use the value that I like.
> Why change it or override the user's settings
> like this?  Or did I miss something?

This is about "if user does not set it, use this default".
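
The third argument in the hunk above is what makes it a default rather than
an override. A minimal sketch in C, assuming a POSIX setenv(); the helper
name is made up for illustration and is not part of pager.c:

```c
#include <stdlib.h>

/*
 * Hypothetical helper, not part of git's pager.c.
 * Passing 0 as the third ("overwrite") argument to POSIX setenv()
 * leaves an existing value of LESS alone, so "FRSX" only takes
 * effect when the user has not set LESS at all.
 */
static const char *default_less(void)
{
	setenv("LESS", "FRSX", 0);	/* no-op if LESS is already set */
	return getenv("LESS");
}
```

So with LESS unset this returns "FRSX", and with LESS set in your
environment it returns whatever you put there.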


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
       [not found]   ` <20061027154433.da9b29d7.seanlkml@sympatico.ca>
@ 2006-10-27 23:12     ` Horst H. von Brand
  2006-10-28  4:24       ` Sean
  0 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-27 23:12 UTC (permalink / raw)
  To: Sean; +Cc: Horst H. von Brand, git

Sean <seanlkml@sympatico.ca> wrote:

> On Fri, 27 Oct 2006 14:26:53 -0300
> "Horst H. von Brand" <vonbrand@inf.utfsm.cl> wrote:
> 
> > I'm getting lots of these after today's pull:
> > 
> > asciidoc -b docbook -d manpage -f asciidoc.conf git-daemon.txt
> > xmlto -m callouts.xsl man git-daemon.xml
> > error : unterminated entity reference                
> > error : unterminated entity reference                
> > error : unterminated entity reference             ...
> > error : unterminated entity reference                
> > error : unterminated entity reference                
> > Writing git-daemon.1 for refentry
> 
> Can't reproduce this here on master or on next with:
>  asciidoc-7.1.2-0 and xmlto-0.0.18-13.1
> Maybe this is an Asciidoc 8 issue, are you using it?

Fedora rawhide i386, with:

  asciidoc-7.0.2-3.fc6
  xmlto-0.0.18-13.1

Perhaps too old, not too new...
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-27 23:12     ` Horst H. von Brand
@ 2006-10-28  4:24       ` Sean
  2006-10-28  5:45         ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-28  4:24 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: git

On Fri, 27 Oct 2006 20:12:50 -0300
"Horst H. von Brand" <vonbrand@inf.utfsm.cl> wrote:

> > > asciidoc -b docbook -d manpage -f asciidoc.conf git-daemon.txt
> > > xmlto -m callouts.xsl man git-daemon.xml
> > > error : unterminated entity reference                
> > > error : unterminated entity reference                
> > > error : unterminated entity reference             ...
> > > error : unterminated entity reference                
> > > error : unterminated entity reference                
> > > Writing git-daemon.1 for refentry
> > 
> > Can't reproduce this here on master or on next with:
> >  asciidoc-7.1.2-0 and xmlto-0.0.18-13.1
> > Maybe this is an Asciidoc 8 issue, are you using it?
> 
> Fedora rawhide i386, with:
> 
>   asciidoc-7.0.2-3.fc6
>   xmlto-0.0.18-13.1
> 
> Perhaps too old, not too new...

Can't imagine that it's too old.  You may have to bisect to figure
out what the culprit is. :o/


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-28  4:24       ` Sean
@ 2006-10-28  5:45         ` Junio C Hamano
  2006-10-28  6:07           ` Sean
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-28  5:45 UTC (permalink / raw)
  To: Sean; +Cc: git, Horst H. von Brand

Sean <seanlkml@sympatico.ca> writes:

>> > Can't reproduce this here on master or on next with:
>> >  asciidoc-7.1.2-0 and xmlto-0.0.18-13.1
>> > Maybe this is an Asciidoc 8 issue, are you using it?
>> 
>> Fedora rawhide i386, with:
>> 
>>   asciidoc-7.0.2-3.fc6
>>   xmlto-0.0.18-13.1
>> 
>> Perhaps too old, not too new...
>
> Can't imagine that it's too old.  You may have to bisect to figure
> out what the culprit is. :o/

Eh, do you mean bisecting asciidoc?  I am not seeing the problem
with these on a freshly installed FC6:

Name   : asciidoc
Arch   : noarch
Version: 7.0.2
Release: 3.fc6

Name   : xmlto
Arch   : i386
Version: 0.0.18
Release: 13.1

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-28  5:45         ` Junio C Hamano
@ 2006-10-28  6:07           ` Sean
  2006-10-28 19:04             ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-28  6:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Horst H. von Brand

On Fri, 27 Oct 2006 22:45:51 -0700
Junio C Hamano <junkio@cox.net> wrote:

> Eh, do you mean bisecting asciidoc?  I am not seeing the problem
> with these on a freshly installed FC6:
> 

Yeah.. don't see the problem here either.  But assuming there
is some strange interaction with Horst's environment, bisecting
would narrow it down.  Even though I don't really think bisecting
will turn up a problem in Git, it might identify the problem in
the environment.. 


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-27 14:46                                                                                       ` J. Bruce Fields
@ 2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
  2006-10-28 13:53                                                                                           ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Ilpo Nyyssönen @ 2006-10-28 11:18 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

"J. Bruce Fields" <bfields@fieldses.org> writes:

> Documentation helps, though sometimes extensive documentation is a sign
> of a problem--it takes a lot more documentation to explain how to manage
> a branch in CVS than it does in any sensible system....

Usability:

I have used bzr, bk for development and git very little for following
kernel development. I have followed this discussion quite well.

1. It is easier to start using something you are already familiar
with. (Just try to use Mac OS X with a Windows or Linux background.)

G: Something totally new and so no points from here. The way of using
git is just so different from any other similar software.

B: Quite clearly gets points from this. Normal branches work quite
like many other software, the checkout stuff works like CVS and SVN.

2. Finding commands.

G: Quite a large number of commands, some clear, but some not so. With all
the installed commands, it is even more confusing. What's the
difference between fetch and pull, and which one should I use? Same for
clone and branch.

B: A bit clearer I think, but pull and merge do cause confusion.
Also the checkout stuff could be better shown in the command line
help. With plugins like bzrtools the number of commands rises and
confusion increases. Maybe better separation for plugin commands in
the command line help?

3. Understanding output

G: Speaks a language of its own, hard to understand. No progress
reported for long lasting operations.

B: Could maybe speak a bit more. Progress reporting is quite good.

4. Misc stuff

G: You have only one workspace and this forces you to use git more or
to make several repositories. You can't just diff branchA/foo
branchB/foo. You can't just open a file from an old branch to check
something while you are developing in some new branch. Do I have to
commit my changes before changing a branch in the workspace?

G: What is this git repack thing and do I have to use it? If yes, why?
Nobody told me that I should run it, but I did notice Linus mentioning
it somewhere. Definitely causing harm for usability.

B: People might misuse the revnos and so be confused when things won't
work like they expected.

Conclusion: I would say that Bazaar is more usable than git.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-27  4:51                         ` Jan Hudec
@ 2006-10-28 11:38                           ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-28 11:38 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Linus Torvalds, Erik B?gfors, Matthieu Moy, bazaar-ng, Sean, git

Jan Hudec wrote:
> Actually bzr used to have slightly different numbering scheme not long
> ago. There was a revision-history in each branch listing the revisions
> in order in which they were committed or merged in. Some time ago it was
> changed to numbering along the leftmost parent, which was, IIRC, deemed
> simpler and a little more logical. But in the light of these arguments,
> maybe the former system was better -- it was more dependent on the
> actual location, but on the other hand it allowed (or could allow --
> IIRC there was some problem with it) to fast-forward merge while
> _locally_ keeping the meaning of old revision numbers. In fact, the
> revision-history used to be almost exactly the same as git reflog,
> except it only stored the revids, not the times.

Which is very fine if you don't modify the history (amending commits,
rewinding history to an earlier point, rebasing the branch, merging a branch
in and starting it anew, aka. the dovetail approach if I remember correctly),
and if you are not concerned with performance when fetching a larger
number of commits into a branch (as you have to assign numbers to them).

Which was perhaps why bzr changed from revnolog to leftmost/first parent
as a way to keep branch-as-path and assign revision numbers to revisions.
Which has its own disadvantages, as enumerated multiple times here
on the list.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
@ 2006-10-28 13:53                                                                                           ` Jakub Narebski
  2006-10-28 14:58                                                                                             ` Jakub Narebski
                                                                                                               ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-28 13:53 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Ilpo Nyyssönen wrote:

> "J. Bruce Fields" <bfields@fieldses.org> writes:
> 
>> Documentation helps, though sometimes extensive documentation is a sign
>> of a problem--it takes a lot more documentation to explain how to manage
>> a branch in CVS than it does in any sensible system....
> 
> Usability:
> 
> I have used bzr, bk for development and git very little for following
> kernel development. I have followed this discussion quite well.
> 
> 1. It is easier to start using something you are already familiar
> with. (Just try to use Mac OS X with a Windows or Linux background.)
> 
> G: Something totally new and so no points from here. The way of using
> git is just so different from any other similar software.
> 
> B: Quite clearly gets points from this. Normal branches work quite
> like many other software, the checkout stuff works like CVS and SVN.

I find, for example, the concept of branches in Git extremely easy to
understand. A Bazaar-NG "branch" is a mixture of a Git branch and a Git
repository (or clone of a repository). In bzr, "branch" refers to three
things at once: the abstract SCM concept of the part of the DAG of revisions
reachable from a given revision/head/tip (a git branch is very close to
this); another, distinct abstract SCM concept of a branch as "your" line
of development, i.e. the path in the DAG of revisions starting at a given
revision/head/tip and ending at an initial/parentless revision; and the
physical representation: working area, metainformation, and storage or a
pointer to storage (when branches share storage, forming a so-called bzr
"repository").

About checkout: Bazaar mixes here the "CVS checkout" model in the
"bzr checkout" command with the SCM concept of checking out, i.e. getting
files from the repository (or branch, in bzr) into the working area.

On the other hand, breaking with the traditional concepts of _centralized_
SCM in a _distributed_ SCM (one geared towards distributed usage) is IMVHO
a good idea. And breaking with the cruft of bad ideas from CVS is a very
good idea.

But I agree that in Git some terminology (and names of commands) could be
better. Some of it stems from the BitKeeper background, some from the way Git
was created: bottom-up, from repository layout to fully (or not ;-) fledged
SCM. For example, "pull" as "fetch + merge" is IIRC a BitKeeper legacy, while
the fact that the "merge" command is a low-level (or mid-level) command,
fairly poorly usable for the end user (who should use "pull ." to merge from
a local branch), comes from the bottom-up development.

> 2. Finding commands.
> 
> G: Quite big amount of commands, some clear, but some not so. With all
> the installed commands, it is even more confusing. What's the
> difference between fetch and pull and which one I should use? Same for
> clone and branch.
>
> B: A bit clearer I think, but the pull and merge does cause confusion. 
> Also the checkout stuff could be better shown in the command line
> help. With plugins like bzrtools the amount of command raises and
> confusion increases. Maybe better separation for plugin commands in
> the command line help?

In the Git Users Survey (http://git.or.cz/gitwiki/GitSurvey), "too many
commands" was the most common answer to question 6, "What did you find
hardest?" (the survey was based on the Mercurial survey:
http://www.selenic.com/mercurial/wiki/index.cgi/UserSurvey). It would
perhaps be better for Git to clearly divide commands into porcelain-ish (for
the end user), admin (whole-repository level) and plumbing (for use in
scripts).

But for example the git(7) man page lists git commands clearly divided into
low-level commands (plumbing): manipulation commands, interrogation
commands, synching commands; and high-level commands (porcelain): main
commands, ancillary commands. "git help" and "git --help" show the
most commonly used git commands with a short description of each command
("git help -a" shows all commands).

I can understand the confusion between "git pull" and "git fetch"; it is
addressed in the documentation. Although I think the confusion between
"bzr merge" and "bzr pull" is as great if not greater.

I don't understand the confusion between "git branch" and "git clone"
commands... unless you are confused by Bazaar-NG branch-centric approach
which mixes branch with repository.

> 3. Understanding output
> 
> G: Speaks a language of its own, hard to understand. No progress
> reported for long lasting operations.
> 
> B: Could maybe speak a bit more. Progress reporting is quite good.

Which long lasting operations lack progress bar/progress reporting?
"git clone" and "git fetch"/"git pull" both have progress report
for both "smart" git://, git+ssh:// and local protocols, and "dumb"
http://, https://, ftp://, rsync:// protocols. "git rebase" has
progress report. "git am" has progress report.

But I agree that Git tends to speak in its own jargon. This jargon is
very clear once you are familiar with Git, though. BTW, some of the worst
offenders, like <ent> (== <tree-ish>), have already been removed from the
documentation.

> 4. Misc stuff
> 
> G: You have only one workspace and this forces you to use git more or
> to make several repositories. 

This is your confusion stemming from Bazaar-NG branch-centricness. In Git
the working area is associated with the repository, not with a branch as in
bzr. Usually you have the repository embedded in the working area, in the
.git directory at the top level of the working area. The fact that you have
only one index (but you can specify an alternate index, or switch between
index files), and only one current branch marker, namely HEAD (you can
switch HEAD to another branch; if I remember correctly there is no way to
specify the current head any other way), makes working with multiple
working areas tied to one repository more difficult. But it is usually not
necessary in Git.
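
As an illustration of the "alternate index" remark, here is a minimal
sketch that writes the files of another branch into a scratch directory
without touching HEAD or the normal index. It relies on the
GIT_INDEX_FILE environment variable together with the plumbing commands
"git read-tree" and "git checkout-index"; the repository, branch names
and file contents are made up for the example.

```shell
#!/bin/sh
# Sketch only: throwaway repository with two branches that both contain "foo".
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "Demo User"
git symbolic-ref HEAD refs/heads/branchA       # name the first branch explicitly

echo "from branchA" > foo
git add foo
git commit -q -m "foo on branchA"

git checkout -q -b branchB
echo "from branchB" > foo
git commit -q -a -m "foo on branchB"

# Stay on branchB, but fill a scratch directory with branchA's files,
# using a scratch index file instead of .git/index.
scratch=$(mktemp -d)
GIT_INDEX_FILE="$scratch/index" git read-tree branchA
GIT_INDEX_FILE="$scratch/index" git checkout-index -a --prefix="$scratch/branchA/"

# The exported copy can now be opened or compared with ordinary tools.
exported=$(cat "$scratch/branchA/foo")
current=$(git symbolic-ref HEAD)
echo "exported copy: $exported (still on $current)"
```

The trailing slash in --prefix matters: without it the prefix is glued
directly onto the file names instead of naming a directory.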

In Bazaar-NG "repository" is just sharing the storage of "branches"; in Git
you can share the storage between repositories (although it is not the
default mode), or share common old history between repositories (more
common). 
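
A minimal sketch of sharing storage between two repositories, assuming
that the "-s" ("--shared") option of "git clone" is the kind of sharing
meant here. The clone records the path of the source object store in
.git/objects/info/alternates instead of copying the objects. Directory
names are invented for the example.

```shell
#!/bin/sh
# Sketch only: one source repository and one clone that borrows its objects.
set -e
base=$(mktemp -d)
git init -q "$base/origin-repo"
cd "$base/origin-repo"
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "Demo User"
echo "hello" > README
git add README
git commit -q -m "initial commit"

cd "$base"
# -s / --shared: do not copy objects, point at the source repository instead.
git clone -q -s origin-repo shared-clone

alternates=$(cat shared-clone/.git/objects/info/alternates)
origin_head=$(cd origin-repo && git rev-parse HEAD)
clone_head=$(cd shared-clone && git rev-parse HEAD)
echo "clone borrows objects from: $alternates"
```

The price of this setup is that the clone depends on the source
repository staying around with its objects intact.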

> You can't just diff branchA/foo branchB/foo.

You can: either using "git diff branchA branchB -- foo" which means
difference between branches branchA and branchB limited to the differences
on branch foo (where foo can be directory name or filename), or via
"extended SHA1 reference" using "git diff branchA:foo branchB:foo" which
means compare file/directory "foo" at revision "branchA" and file/directory
"foo" at revision "branchB".

You can even diff two different _repositories_ if they are on the same local
filesystem using pasky trick described in http://git.or.cz/gitwiki/GitTips.
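
The two "git diff" forms mentioned above can be tried in a throwaway
repository. This is only a sketch; the branch names and the file "foo"
follow the example in the question, the file contents are made up.

```shell
#!/bin/sh
# Sketch only: two branches that both contain a file called "foo".
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "Demo User"
git symbolic-ref HEAD refs/heads/branchA       # name the first branch explicitly

echo "one" > foo
git add foo
git commit -q -m "foo on branchA"

git checkout -q -b branchB
echo "two" > foo
git commit -q -a -m "foo changed on branchB"

# Form 1: diff the two branch tips, limited to pathname foo.
by_path=$(git diff branchA branchB -- foo)

# Form 2: "extended SHA1" syntax, naming foo as recorded in each revision.
by_blob=$(git diff branchA:foo branchB:foo)

printf '%s\n' "$by_path"
```

Both forms work from any checked-out branch, since they read the
committed contents and not the working area.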

> You can't just open file from old branch to check 
> something while you are developing in some new branch.

You can view file from old branch via "git cat-file -p old-branch:file".
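
A short sketch of that command in a throwaway repository (branch and file
names are taken from the example above, the contents are made up). The
working area stays on the new branch the whole time.

```shell
#!/bin/sh
# Sketch only: read a file from another branch without switching to it.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "Demo User"
git symbolic-ref HEAD refs/heads/old-branch    # name the first branch explicitly

echo "old contents" > file
git add file
git commit -q -m "file on old-branch"

git checkout -q -b new-branch
echo "new contents" > file
git commit -q -a -m "file on new-branch"

# Print the file as recorded on old-branch.
old_copy=$(git cat-file -p old-branch:file)
echo "$old_copy"

# To open it in an editor, write it to a scratch file first.
git cat-file -p old-branch:file > "$repo/file.from-old-branch"
```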

> Do I have to commit my changes before changing a branch
> in the workspace? 

You have to. But we have "git commit --amend", so if I need to do this
I usually do "git commit -m 'TEMPORARY COMMIT'" before switching to the other
branch. Or you can save the differences between the working area and the
current branch to a patch file. The "git-checkpoint" proposal addresses
that... in a rather heavy-handed fashion. There is also
"git-stash/git-unstash" floating somewhere in the git mailing list archives.
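
The temporary-commit workflow could look roughly like the sketch below.
The branch names "topic" and "other" and the file are invented; "-a" is
added so that all modified tracked files go into the temporary commit.

```shell
#!/bin/sh
# Sketch only: park unfinished work in a temporary commit, then amend it later.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "Demo User"
git symbolic-ref HEAD refs/heads/topic         # name the first branch explicitly

echo "base" > notes.txt
git add notes.txt
git commit -q -m "Start the notes"
git branch other

# Unfinished work on "topic": park it before switching away.
echo "half-finished" >> notes.txt
git commit -q -a -m 'TEMPORARY COMMIT'

git checkout -q other
# ... look around or do unrelated work on "other" ...
git checkout -q topic

# Finish the work and fold it into the temporary commit.
echo "finished" >> notes.txt
git commit -q -a --amend -m "Finish the notes"

subject=$(git log -1 --format=%s)
commits=$(git rev-list --count HEAD)
echo "$commits commits on topic, last one: $subject"
```

The amended commit replaces the temporary one, so the history of "topic"
never shows the placeholder message.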
 
> G: What is this git repack thing and do I have to use it? If yes, why? 
> Nobody told me that I should run it, but I did notice Linus mentioning
> it somewhere. Definetly causing harm for usability.

Hmm... perhaps "repack -a -d" should be shown in the "git help" list of
commonly used commands.
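
For the curious, a small sketch of what "repack -a -d" does to a
repository, using "git count-objects -v" to look at the object store
before and after. The repository and its three commits are made up.

```shell
#!/bin/sh
# Sketch only: pack all loose objects of a small repository into one pack.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.invalid"   # placeholder identity
git config user.name "Demo User"

for i in 1 2 3; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "revision $i"
done

# Before: every commit, tree and blob is a separate loose object.
loose_before=$(git count-objects -v | sed -n 's/^count: //p')

# -a: put everything into a single pack; -d: drop what became redundant.
git repack -a -d -q

loose_after=$(git count-objects -v | sed -n 's/^count: //p')
packs_after=$(git count-objects -v | sed -n 's/^packs: //p')
echo "loose objects: $loose_before -> $loose_after, packs: $packs_after"
```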

Having two separate formats in the repository, loose (but compressed) and
packed (in one file, deltified, compressed), has the following advantages:

0. Historical. It allowed git to be released (deployed) early,
originally as a fast content tracker and not a full SCM, and to add features
based on how people used it and scripted it. It also gave the Git design the
advantage of not being tailored to/based on some storage mechanism, which
resulted in an IMHO very clean design and concepts.

1. Security (together with format). It secures the repository against
corruption stemming from: corruption while saving a file, race conditions,
interruptions during an operation etc.; although it doesn't protect against
all possible errors. That is what sold Keith on choosing Git as the SCM for
X.Org: http://keithp.com/blog/Repository_Formats_Matter.html

2. Efficiency. The packed Git format is AFAIK both the densest repository
format among OSS SCMs, and very fast for accessing any given revision.

3. Net format. It allows using _exactly_ the same format for transmission
during clone and fetch; well, with the exception that for "smart" protocols
git can send a "thin" pack, with some deltas without bases. The latest work
in progress by Nicolas Pitre and others is to convert a thin pack to a full
pack without exploding it into loose objects in between.


The suggestion that SCMs based on Git, or Git porcelains (like Cogito),
should repack automatically comes up quite frequently. The latest work on
repack options, which would allow not just packing only loose objects or
repacking everything, but also repacking a given pack or repacking
everything except some archive packs, should help with that solution.

> B: People migth misuse the revnos and so be confused when things won't
> work like they expected.

Revnos work only with very specific workflows.

> Conclusion: I would say that Bazaar is more usable than git.

Conclusion: I would say that Git is more usable than Bazaar.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Creating new repos
  2006-10-27 17:08   ` Horst H. von Brand
@ 2006-10-28 14:19     ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-28 14:19 UTC (permalink / raw)
  To: git

Horst H. von Brand wrote:
> Petr Baudis <pasky@suse.cz> wrote:
>> Dear diary, on Fri, Oct 27, 2006 at 02:29:10PM CEST, I got a letter
>> where "Horst H. von Brand" <vonbrand@inf.utfsm.cl> said that...

>>>   git://git-server/user/Test.git
>>>   ssh+git://git-server/var/scm/user/Test.git
>>> 
>>> Is this intentional?
>> 
>> git+ssh has nothing to do with git-daemon, it's executing other git
>> commands remotely.
> 
> OK. But from a UI POV it is confusing.

You can use ssh:// instead of ssh+git:// if you like. ssh+git:// is to note
that you should use git for that...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-28 13:53                                                                                           ` Jakub Narebski
@ 2006-10-28 14:58                                                                                             ` Jakub Narebski
  2006-10-28 22:18                                                                                             ` Robin Rosenberg
                                                                                                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-28 14:58 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Jakub Narebski wrote:

>> You can't just diff branchA/foo branchB/foo.
> 
> You can: either using "git diff branchA branchB -- foo" which means
> difference between branches branchA and branchB limited to the differences
> on branch foo (where foo can be directory name or filename)

Sorry, it should be:

"limited to the differences on pathname foo (where foo can be directory name
or filename)"


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-28  6:07           ` Sean
@ 2006-10-28 19:04             ` Junio C Hamano
  2006-10-28 19:13               ` Sean
  2006-10-29 19:03               ` Horst H. von Brand
  0 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-28 19:04 UTC (permalink / raw)
  To: Sean; +Cc: git, Horst H. von Brand

Sean <seanlkml@sympatico.ca> writes:

> On Fri, 27 Oct 2006 22:45:51 -0700
> Junio C Hamano <junkio@cox.net> wrote:
>
>> Eh, do you mean bisecting asciidoc?  I am not seeing the problem
>> with these on a freshly installed FC6:
>
> Yeah.. don't see the problem here either.  But assuming there
> is some strange interaction with Horst's environment, bisecting
> would narrow it down.  Even though I don't really think bisecting
> will turn up a problem in Git, it might identify the problem in
> the environment.. 

Horst has a non-working combination that is:

 - tip of "master" of the day
 - Fedora rawhide i386 (whatever that is -- sorry I am new to RPM world)
 - asciidoc 7.0.2 3.fc6
 - xmlto 0.0.18 13.1

I have a working combination:

 - tip of "master" of the day
 - FC6 i386 (freshly installed)
 - asciidoc 7.0.2 3.fc6
 - xmlto 0.0.18 13.1

So the difference between me and Horst that can be bisected is
not what are listed above.  I wonder what other things come into
the picture.

"rpm -q --requires" tells us that:

 - asciidoc wants python >= 2.3
 - xmlto wants docbook-dtds, docbook-xsl >= 1.56.0, flex,
   libxslt, passivetex >= 1.11, util-linux, w3m

and here is what I have:

   asciidoc-7.0.2-3.fc6
   xmlto-0.0.18-13.1
   python-2.4.3-18.fc6
   docbook-dtds-1.0-30.1
   package docbook-xsl is not installed
   flex-2.5.4a-41.fc6
   libxslt-1.1.17-1.1
   passivetex-1.25-5.1.1
   util-linux-2.13-0.44.fc6
   w3m-0.5.1-14.1

"rpm -q --whatprovides docbook-xsl" says:

   docbook-style-xsl-1.69.1-5.1

and it is installed on the FC6 box.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-28 19:04             ` Junio C Hamano
@ 2006-10-28 19:13               ` Sean
  2006-10-28 19:22                 ` Junio C Hamano
  2006-10-29 19:03               ` Horst H. von Brand
  1 sibling, 1 reply; 1752+ messages in thread
From: Sean @ 2006-10-28 19:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Horst H. von Brand

On Sat, 28 Oct 2006 12:04:24 -0700
Junio C Hamano <junkio@cox.net> wrote:

> Horst has a non-working combination that is:
> 
>  - tip of "master" of the day
>  - Fedora rawhide i386 (whatever that is -- sorry I am new to RPM world)
>  - asciidoc 7.0.2 3.fc6
>  - xmlto 0.0.18 13.1
> 
> I have a working combination:
> 
>  - tip of "master" of the day
>  - FC6 i386 (freshly installed)
>  - asciidoc 7.0.2 3.fc6
>  - xmlto 0.0.18 13.1
> 
> So the difference between me and Horst that can be bisected is
> not what are listed above.  I wonder what other things come into
> the picture.

The thing is, Horst implied everything worked before a recent pull.
It's worth at least going back to see if that's true.  Quite likely
that older version will no longer work anymore either, but maybe it
will.  Of course, if an older version no longer works, there's no
need to bisect further, something in the environment has changed.
Either way, it'll help narrow things down a bit.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Generating docu in 1.4.3.3.g01929
  2006-10-28 19:13               ` Sean
@ 2006-10-28 19:22                 ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-10-28 19:22 UTC (permalink / raw)
  To: Sean; +Cc: git, Horst H. von Brand

Sean <seanlkml@sympatico.ca> writes:

>> So the difference between me and Horst that can be bisected is
>> not what are listed above.  I wonder what other things come into
>> the picture.
>
> The thing is, Horst implied everything worked before a recent pull.

Ah, Ok.

I explicitly asked about things that would help to narrow down
and Horst did not answer to any, so I took that "no info" as (0)
this is the first doc generation so it is unknown if older git
sources would generate docs correctly in the environment, (1)
not just git-daemon.1 but generating git-anything.1 is broken,
(2) not just git-daemon.1 but generating git-daemon.html is also
broken.

You interpreted the "no info" differently, which is valid.

> It's worth at least going back to see if that's true.  Quite likely
> that older version will no longer work anymore either, but maybe it
> will.  Of course, if an older version no longer works, there's no
> need to bisect further, something in the environment has changed.
> Either way, it'll help narrow things down a bit.

Very true.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-28 13:53                                                                                           ` Jakub Narebski
  2006-10-28 14:58                                                                                             ` Jakub Narebski
@ 2006-10-28 22:18                                                                                             ` Robin Rosenberg
  2006-10-28 22:46                                                                                               ` Jakub Narebski
  2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
  2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
  3 siblings, 1 reply; 1752+ messages in thread
From: Robin Rosenberg @ 2006-10-28 22:18 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

lördag 28 oktober 2006 15:53 skrev Jakub Narebski:
> But for example git(7) man page lists git commands clearly divided between
> low-level commands (plumbing): manipulation commands, interrogation
> commands, synching commands and high level commands (porcelain): main
> commands, ancillary commands. The "git help" and "git --help" shows the
> most commonly used git commands with short description of each command
> ("git help -a" show all commands).

I believe people tend to skim through documentation looking for pieces of 
information rather than read it from start to end. So they find themselves 
reading the plumbing documentation first. Simply reordering documentation to 
list the porcelain commands before the plumbing would make the git man page 
less scary to newcomers.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-28 22:18                                                                                             ` Robin Rosenberg
@ 2006-10-28 22:46                                                                                               ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-28 22:46 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git, bazaar-ng

Dnia niedziela 29. października 2006 00:18, Robin Rosenberg napisał:
> lördag 28 oktober 2006 15:53 skrev Jakub Narebski:
>>
>> But for example git(7) man page lists git commands clearly divided between
>> low-level commands (plumbing): manipulation commands, interrogation
>> commands, synching commands and high level commands (porcelain): main
>> commands, ancillary commands. The "git help" and "git --help" shows the
>> most commonly used git commands with short description of each command
>> ("git help -a" show all commands).
> 
> I believe people tend to skim through documentation looking for pieces of 
> information rather than read it from start to end. So they find themselves 
> reading the plumbing documentation first. Simply reordering documentation to 
> list the porcelain commands before the plumbing would make the git man page 
> less scary to newcomers.

Good idea. Thanks.

Current ordering in git(7) man page is probably the result of bottom-up
git development. First there were plumbing commands (well, first was
repository format AFAICT, but I digress...).

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-28 13:53                                                                                           ` Jakub Narebski
  2006-10-28 14:58                                                                                             ` Jakub Narebski
  2006-10-28 22:18                                                                                             ` Robin Rosenberg
@ 2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
  2006-10-29 12:01                                                                                               ` Jakub Narebski
  2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
  3 siblings, 1 reply; 1752+ messages in thread
From: Ilpo Nyyssönen @ 2006-10-29  6:54 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Ilpo Nyyssönen wrote:
>
>> Usability:
>> 
>> I have used bzr, bk for development and git very little for following
>> kernel development. I have followed this discussion quite well.
>> 
>> 1. It is easier to start using something you are already familiar
>> with. (Just try to use Mac OS X with a Windows or Linux background.)
>> 
>> G: Something totally new and so no points from here. The way of using
>> git is just so different from any other similar software.
>> 
>> B: Quite clearly gets points from this. Normal branches work quite
>> like many other software, the checkout stuff works like CVS and SVN.
>
> I find for example concept of branches in Git extremly easy to understand.

Might be, but the point was: Git is harder as it is not like the others.
On the other hand, one can see Bazaar as being like other distributed SCMs
and even like the non-distributed ones, as it has the checkout stuff.

You can give Bazaar to me, a bk user, and I can understand what to do
with the branches, which are like bk clones. (The repository stuff is a
later development and still optional.) Switching a CVS environment to a
Bazaar one can be done so that most of the users can just be told to
use bzr checkout and they don't have to care about pushing.

But with git, I clone some repository. Now it is totally new to
understand that I didn't clone only a single branch. It's like nothing
else and that's what I saw when I first looked at it. I might not
even have noticed the branch stuff and just cloned it further.

> On the other side breaking with traditional concepts of _centralized_ SCM
> in _distributed_ SCM (and geared towards distributed usage) is IMVHO a good
> idea. And breaking with the cruft of bad ideas of CVS is very good idea.

Breaking concepts can be a good idea and I somewhat think that git
needed to do what it did. But do remember that it came with a cost:
git is harder to understand and use. You first have to understand that
it is different and how it is different.

> I don't understand the confusion between "git branch" and "git clone"
> commands... unless you are confused by Bazaar-NG branch-centric approach
> which mixes branch with repository.

Those commands do such different things in different SCMs. Just look at
the differences between bk clone, git clone, git branch and bzr branch. You
have both. At the point where I didn't yet understand that I had cloned
more than one branch, git branch was a very odd-looking command.

> Which long lasting operations lack progress bar/progress reporting?
> "git clone" and "git fetch"/"git pull" both have progress report

First note that I didn't notice git repack until recently, so things
got slower until then.

At least at some points they just tell you that they are doing something,
but not how much of it has been done and how much is still to do. Look at
Bazaar and you'll see the difference: it has progress bars.

>> G: You have only one workspace and this forces you to use git more or
>> to make several repositories. 
>
> This is your confusion stemming from Bazaar-NG branch-centricness. In Git
> working area is associated with repository, not with branch as in bzr.

Exactly my point.

>> You can't just diff branchA/foo branchB/foo.
>
> You can: either using "git diff branchA branchB -- foo" which means

Exactly my point: it forces you to use git more. In Bazaar I can do
this without Bazaar commands. I could even do it with some Windows GUI
stuff, take two files or directories and compare.

As you need to use git commands more than bzr commands, git has bigger
requirements for usability.

>> You can't just open file from old branch to check 
>> something while you are developing in some new branch.
>
> You can view file from old branch via "git cat-file -p old-branch:file".

Same thing here: in Bazaar, I can just open the file from the other
branch. I can also compile and run one branch while I have the
other open.

Essentially I would need a separate git repository for each branch
anyway. In Bazaar I can use the same.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: VCS comparison table
  2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
@ 2006-10-29 12:01                                                                                               ` Jakub Narebski
  2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
  2006-10-30  0:10                                                                                                 ` Theodore Tso
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-29 12:01 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Ilpo Nyyssönen wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Ilpo Nyyssönen wrote:
>>
>>> Usability:
>>> 
>>> I have used bzr, bk for development and git very little for following
>>> kernel development. I have followed this discussion quite well.
>>> 
>>> 1. It is easier to start using something you are already familiar
>>> with. (Just try to use Mac OS X with a Windows or Linux background.)
>>> 
>>> G: Something totally new and so no points from here. The way of using
>>> git is just so different from any other similar software.
>>> 
>>> B: Quite clearly gets points from this. Normal branches work quite
>>> like many other software, the checkout stuff works like CVS and SVN.
>>
>> I find, for example, the concept of branches in Git extremely easy to
>> understand.
> 
> Might be, but the point was: Git is harder as it is not like others. 
> In other hand one can see Bazaar like other distributed SCMs and even
> like the not distributed ones as it has the checkout stuff.
> 
> You can give Bazaar for me, a bk user, and I can understand what to do
> with the branches that are like bk clones. (The repository stuff is
> later development and still optional.) Switching a CVS environment to
> Bazaar one can be done so that most of the users can be just told to
> use bzr checkout and they don't have to care about pushing.

That is of course because you were already familiar with a branch-centric
distributed SCM, namely BitKeeper, when you tried Bazaar-NG. IMHO the
branch-centric view is somewhat limiting: you can always use a
repository-centric SCM with a one-live-branch-per-repository paradigm to
emulate a branch-centric SCM, while the reverse is not (or not always)
possible. Branch-centric and repo-centric SCMs promote different workflows:
parallel uncommitted work on a few development branches for a
branch-centric SCM; one change per commit, with multiple temporary and
feature branches, for a repo-centric SCM.

Breaking from the stupid CVS update-then-commit model is IMHO a very, very
good idea, on a par with breaking from the CVS "model" of branches. In my
opinion CVS had one very good idea (perhaps it wasn't originally a CVS idea),
namely using merges instead of locking files for editing; well, that and the
fact that it tried (emphasis on tried) to treat a module as a whole, allowing
for multi-file commits.

Take, for example, word processors: if they all only emulated the UI of
the leading (most commonly used) one, no progress would be made.

> But with git, I clone some repository. Now it is totally new to
> understand that I didn't clone only single branch. It's like nothing
> else and that's what I saw when I first looked at it. I might have
> even not noticed the branch stuff and just cloned it further.

That's the shift of paradigm. Instead of one-branch-per-repository, and
one-branch-per-developer workflow which I think usually stems from that, we
have one-repository-per-developer (usually), and heavily nonlinear
development.

>> On the other side breaking with traditional concepts of _centralized_ SCM
>> in _distributed_ SCM (and geared towards distributed usage) is IMVHO a
>> good idea. And breaking with the cruft of bad ideas of CVS is very good
>> idea. 
> 
> Breaking concepts can be a good idea and I somewhat think that git
> needed to do what it did. But do remember that it came with a cost:
> git is harder to understand and use. You first have to understand that
> it is different and how it is different.

The same could be said about moving from MS-DOS or later MS Windows to the
world of UNIX.

But yes, I understand and agree that being different from the others can be
a disadvantage... and can be an advantage.

>> I don't understand the confusion between "git branch" and "git clone"
>> commands... unless you are confused by Bazaar-NG branch-centric approach
>> which mixes branch with repository.
> 
> Those commands do so different things in different SCMs. Just look at
> the differences bk clone, git clone, git branch and bzr branch. You
> have both. At the point where I didn't yet understand that I cloned
> more than a one branch, git branch is very odd looking command.

I, for example, didn't understand the "bzr branch" concept, being more
familiar with "git branch".

>> Which long lasting operations lack progress bar/progress reporting?
>> "git clone" and "git fetch"/"git pull" both have progress report
> 
> First note that I didn't notice git repack until recently so things
> got slower until that.
> 
> At least some points they just tell that they are doing something, but
> not how much of it has been done and how much is still to do. Look at
> Bazaar and you'll see the difference, it has progress bars.

Well, having progress bars for operations which are usually fast and
single-step is in my opinion a stupid idea, even if there are combinations
of options which make them slow (for example using the so-called pickaxe,
e.g. "git log -S'fragment' -- file", to find revisions which introduced
'fragment' into 'file').

I'll ask again: _which_ git commands do you find lacking progress reporting?

>>> You can't just diff branchA/foo branchB/foo.
>>
>> You can: either using "git diff branchA branchB -- foo" which means
> 
> Exactly my point: it forces you to use git more. In Bazaar I can do
> this without Bazaar commands. I could even do it with some Windows GUI
> stuff, take two files or directories and compare.
> 
> As you need to use git commands more than bzr commands, git has bigger
> requirements for usability.

But git commands are more powerful than the equivalent GNU commands. git-diff
is more powerful than GNU diff (for example it can detect renames and
copies, it shows mode changes, and it can show the diff for a merge using the
"combined diff" format); git-grep is more powerful than GNU grep (for example
Linus finds himself putting files into a git repository just to use git-grep
instead of a combination of GNU find and GNU grep).

And don't forget the _cost_ of doing it the way mentioned above, namely
having to keep two copies of the working area (differing in revision, of
course).

>>> You can't just open file from old branch to check 
>>> something while you are developing in some new branch.
>>
>> You can view file from old branch via "git cat-file -p old-branch:file".

Or you can "git commit -a -m 'TEMP'" to save your changes, "git checkout
<branch>" to switch to the other branch (perhaps with git-clean), hack; hack;
hack; commit the changes, switch back to your branch, and either amend the
commit or reset the index and HEAD (but not the working area).

> Same thing here, in Bazaar, I can just open the file from the other
> branch. I can also compile and run the other branch while I have the
> other open.

Do you really often compile and run one branch while developing on another?

> Essentially I would need a separate git repository for each branch
> anyway. In Bazaar I can use the same.

Well, it is a fact that git somewhat (but not completely) lacks support
for multiple independent working areas for the same repository (link+separate
index+separate HEAD), and somewhat (but not completely) lacks support for
sharing the object database between repositories, aka the bzr model (you have
to be very careful with pruning).

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: VCS comparison table
  2006-10-29 12:01                                                                                               ` Jakub Narebski
@ 2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
  2006-10-29 18:39                                                                                                   ` Jakub Narebski
  2006-10-30  0:10                                                                                                 ` Theodore Tso
  1 sibling, 1 reply; 1752+ messages in thread
From: Matthew D. Fuller @ 2006-10-29 18:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Sun, Oct 29, 2006 at 01:01:07PM +0100 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> 
> Branch-centric and repo-centric SCM promote different workflows,
> namely parallel uncommited work on few development branches for
> branch-centric SCM, one-change per-commit multiple temporary and
> feature branches for repo-centric SCM.

I don't think that follows at all.


> Do you really often compile and run other branch while developing on
> other?

Yes.  And I do the same with older revisions along a given branch too,
which is where [lightweight] checkouts come in handy.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/


* Re: VCS comparison table
  2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
@ 2006-10-29 18:39                                                                                                   ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-29 18:39 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

Matthew D. Fuller wrote:
> On Sun, Oct 29, 2006 at 01:01:07PM +0100 I heard the voice of
> Jakub Narebski, and lo! it spake thus:
>>
>> Do you really often compile and run other branch while developing on
>> other?
> 
> Yes.  And I do the same with older revisions along a given branch too,
> which is where [lightweight] checkouts come in handy.

Well, if you don't _work_ on the other branch, you can always check out
the other branch, or any given revision, into a separate directory
using
  git --git-dir=<path to repo> tar-tree <revision> | tar xf -
for example.
-- 
Jakub Narebski


* Re: Generating docu in 1.4.3.3.g01929
  2006-10-28 19:04             ` Junio C Hamano
  2006-10-28 19:13               ` Sean
@ 2006-10-29 19:03               ` Horst H. von Brand
  1 sibling, 0 replies; 1752+ messages in thread
From: Horst H. von Brand @ 2006-10-29 19:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Sean, git, Horst H. von Brand

Junio C Hamano <junkio@cox.net> wrote:

[...]

> and here is what I have:
> 
>    asciidoc-7.0.2-3.fc6
>    xmlto-0.0.18-13.1
>    python-2.4.3-18.fc6
>    docbook-dtds-1.0-30.1
>    package docbook-xsl is not installed
>    flex-2.5.4a-41.fc6
>    libxslt-1.1.17-1.1
>    passivetex-1.25-5.1.1
>    util-linux-2.13-0.44.fc6
>    w3m-0.5.1-14.1

I've got:

asciidoc-7.0.2-3.fc6
xmlto-0.0.18-13.1
python-2.4.4-1.fc7
docbook-dtds-1.0-30.1
package docbook-xsl is not installed
flex-2.5.4a-41.fc6
libxslt-1.1.18-1
passivetex-1.25-5.1.1
util-linux-2.13-0.44.fc6
w3m-0.5.1-14.1

> "rpm -q --whatprovides docbook-xsl" says:
> 
>    docbook-style-xsl-1.69.1-5.1

docbook-style-xsl-1.69.1-5.1

Differences are (mine (Junio's)):

python-2.4.4-1.fc7 (python-2.4.3-18.fc6)
libxslt-1.1.18-1 (libxslt-1.1.17-1.1)

libxslt requires libxml2:

libxml2-2.6.27-1 (Fedora 6 has libxml2-2.6.26-2.1.1)

Getting the Fedora 6 libxslt (Junio's) and redoing git gives no errors.

Judging from the libxslt changelog <http://xmlsoft.org/XSLT/news.html> they
tightened up the processing, so I'd guess asciidoc is generating fishy XML
or xmlto is broken. I've no clue here... somebody knowledgeable who can
take a closer look or otherwise lend me a hand?

Thanks!

PS: I get similar errors with tig...
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513



* Re: VCS comparison table
  2006-10-29 12:01                                                                                               ` Jakub Narebski
  2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
@ 2006-10-30  0:10                                                                                                 ` Theodore Tso
  1 sibling, 0 replies; 1752+ messages in thread
From: Theodore Tso @ 2006-10-30  0:10 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

On Sun, Oct 29, 2006 at 01:01:07PM +0100, Jakub Narebski wrote:
> > You can give Bazaar for me, a bk user, and I can understand what to do
> > with the branches that are like bk clones. (The repository stuff is
> > later development and still optional.) Switching a CVS environment to
> > Bazaar one can be done so that most of the users can be just told to
> > use bzr checkout and they don't have to care about pushing.
> 
> That is of course because you are familiar with branch-centric distributed
> SCM, namely BitKeeper, when trying Bazaar-NG. IMHO branch-centric view
> is somewhat limiting; you can always use repository-centric SCM with
> one-live-branch-per-repository paradigm and emulate branch-centric SCM,
> which is not (or not always) the case for branch-centric SCM. Branch-centric
> and repo-centric SCM promote different workflows, namely parallel uncommited
> work on few development branches for branch-centric SCM, one-change
> per-commit multiple temporary and feature branches for repo-centric SCM.

I've got to disagree here.  Being a former BitKeeper user myself, I
find BZR-NG to be nothing like bk.  In particular, BitKeeper is *not*
branch-centric the way that BZR is; in fact, bk is much closer to git
and hg both in terms of how it works and its terminology.  You can
have a non-linear set of history without using any "branches" in both
bk and mercurial, simply by creating two commits changing different
files in two different repositories (using the bk, git, and hg sense
of the word --- only bzr attaches a completely different definition to
the term "repository"), and then pulling them together.

With bzr, the only way you can do this is by explicitly
creating a separate branch and then merging the two branches together.
In bzr --- unlike bk, git, and hg --- when you are on a "branch" the
history must be completely linear.  The difference between bk, and git
and hg, is that bk enforces a restriction that there must be one
"head", or "tip" on a particular repository (in the bk, hg, and git
sense).  So if you start by cloning the repository A -> B, and then
make one or more commits in repository A, and then one or more commits
in repository B, when you pull from repository B to A, bk will enforce
the creation of a merge changeset on the resulting repository --- or
fail the merge.  (Actually, with BK there was the option to create
multiple tips using "lines of development", but it was never fully
developed or supported.)

With hg and git, you have the *option* of pulling the two lines of
commits together using a merge changeset *or* leaving the two "tips"
or "heads" unmerged.  But that's only a very minor difference between
bk and hg/git --- and if you are willing to always merge two heads
after pulling so that your git or hg repository only has one head/tip,
then conceptually the changeset history is just like bk.

In contrast, it's impossible to do this with bzr without leaving the
named branches around, so in this sense it's quite different from BK.

						- Ted

P.S.  I'm going to be teaching a class entitled "Bzr, Hg, and Git, Oh
my!" at the LISA conference in Washington, D.C.  It's only a half-day
tutorial intended to cover the basics of distributed SCM systems, so
most folks on this list will probably know everything I'm planning on
discussing, but if you have some colleagues who need a gentle
introduction, please feel free to tell them to head on over to the LISA
conference website at www.usenix.org.


* Progress reporting (was: VCS comparison table)
  2006-10-28 13:53                                                                                           ` Jakub Narebski
                                                                                                               ` (2 preceding siblings ...)
  2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
@ 2006-10-30 10:18                                                                                             ` Jakub Narebski
  2006-10-30 15:21                                                                                               ` Nicolas Pitre
  3 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-10-30 10:18 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jakub Narebski wrote:
> Ilpo Nyyssönen wrote:

>> 3. Understanding output
>> 
>> G: Speaks a language of its own, hard to understand. No progress
>> reported for long lasting operations.
>> 
>> B: Could maybe speak a bit more. Progress reporting is quite good.
> 
> Which long lasting operations lack progress bar/progress reporting?
> "git clone" and "git fetch"/"git pull" both have progress report
> for both "smart" git://, git+ssh:// and local protocols, and "dumb"
> http://, https://, ftp://, rsync:// protocols. "git rebase" has
> progress report. "git am" has progress report.

I was bitten lately by git's lack of progress reporting for git-push.
While it nicely reports local progress (generating data), it unfortunately
lacks wget-like, "curl -o"-like or scp-like pack upload progress
reporting. And while a push is usually fast, the initial push of a whole
project to an empty repository can be quite slow on a low-bandwidth link
(or a busy network).

git version 1.4.3.3 on local side, git+ssh:// protocol, git version
1.4.3.3.g9ab2 on the remote side (repo.or.cz).
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Progress reporting (was: VCS comparison table)
  2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
@ 2006-10-30 15:21                                                                                               ` Nicolas Pitre
  0 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-10-30 15:21 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Mon, 30 Oct 2006, Jakub Narebski wrote:

> I was bitten lately by git lack of progress reporting for git-push.
> While it nicely reports local progress (generating data) it unfortunately
> lacks wget like, "curl -o" like or scp like pack upload progress
> reporting. And while usually push is fast, initial push of whole
> project to empty repository can be quite slow on low-bandwidth link
> (or busy network).

What about this patch?

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 41e1e74..7f87ae8 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1524,6 +1524,10 @@ int cmd_pack_objects(int argc, const cha
 			progress = 1;
 			continue;
 		}
+		if (!strcmp("--all-progress", arg)) {
+			progress = 2;
+			continue;
+		}
 		if (!strcmp("--incremental", arg)) {
 			incremental = 1;
 			continue;
@@ -1641,7 +1645,7 @@ int cmd_pack_objects(int argc, const cha
 	else {
 		if (nr_result)
 			prepare_pack(window, depth);
-		if (progress && pack_to_stdout) {
+		if (progress == 1 && pack_to_stdout) {
 			/* the other end usually displays progress itself */
 			struct itimerval v = {{0,},};
 			setitimer(ITIMER_REAL, &v, NULL);
diff --git a/send-pack.c b/send-pack.c
index 0e90548..9280481 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -30,6 +30,7 @@ static void exec_pack_objects(void)
 {
 	static const char *args[] = {
 		"pack-objects",
+		"--all-progress",
 		"--stdout",
 		NULL
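
[A minimal sketch of the decision the second hunk seems to be aiming for,
assuming the context lines set progress = 1 for the existing progress
option and that --all-progress sets it to 2. The helper name
should_silence_progress() is hypothetical and is not part of
builtin-pack-objects.c.]

```c
#include <assert.h>

/*
 * Hypothetical helper: should pack-objects switch off its own progress
 * timer because the receiving end is expected to show progress?
 *
 *   progress == 0   no progress requested
 *   progress == 1   ordinary progress option
 *   progress == 2   --all-progress: keep reporting even on stdout
 */
int should_silence_progress(int progress, int pack_to_stdout)
{
	assert(progress >= 0 && progress <= 2);
	return progress == 1 && pack_to_stdout;
}
```

Under this reading, send-pack passing --all-progress together with
--stdout keeps the local progress output alive during the upload, while
a plain progress request combined with --stdout stays quiet as before.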


* Re: VCS comparison table
  2006-10-25 22:40                                                                           ` David Lang
  2006-10-25 23:53                                                                             ` Matthew D. Fuller
@ 2006-10-30 21:46                                                                             ` Jan Hudec
  1 sibling, 0 replies; 1752+ messages in thread
From: Jan Hudec @ 2006-10-30 21:46 UTC (permalink / raw)
  To: David Lang; +Cc: Matthew D. Fuller, Linus Torvalds, bazaar-ng, git

On Wed, Oct 25, 2006 at 03:40:00PM -0700, David Lang wrote:
> On Tue, 24 Oct 2006, Matthew D. Fuller wrote:
> >On Tue, Oct 24, 2006 at 11:03:20AM -0700 I heard the voice of
> >David Lang, and lo! it spake thus:
> >>
> >>it sounded like you were saying that the way to get the slices of
> >>the DAG was to use branches in bzr. [...]
> >
> >I'm not entirely sure I understand what you mean here, but I think
> >you're saying "Nobody's written the code in bzr to show arbitrary
> >slices of the DAG", which is true TTBOMK.
> 
> I think we are talking past each other here.
> 
> what I think was said was
> 
> G 'one feature of git is that you can view arbitrary slices trivially'
> 
> B 'bzr can do this too, you just use branches to define the slices'
> 
> G 'but this limits you because branches are defined as code is developed, 
> git lets you define slices at viewing time'
> 
> by the way, I think it's more than just saying 'well, the code could be 
> written to do this in $VCS' some decisions and standard ways of doing 
> things can impact how hard it is to implement a feature, and some decisions 
> can make it impossible (without doing unexpected things).

Since bzr branch is, and is ONLY, a pointer to a revision, I don't see
any design decision that would make this harder in bzr. The UI was only
implemented to take the revisions as branches.

> >>everyone agrees that bzr supports the Star topology. Most people
> >>(including bzr people) seem to agree that currently bzr does not
> >>support the Distributed topology.
> >
> >I think this statement arouses so much grumbling because (a) bzr does
> >support such a lot better than often seems implied, (b) where it
> >doesn't, the changes needed to do so are relatively minor (often
> >merely cosmetic), and (c) disagreement over whether some of the
> >qualifications included for 'distributed' are really fundamental.

The more I read this thread, the more I think bzr actually does support
distributed topologies as well as git does.

The whole difference is that bzr makes a distinction between the first
and other parents of a revision, while git does not. This distinction is
done in two places:

1. The log shows the first parent and then, as an indented subsection, the
   ancestry of other parents until the point where the ancestries meet
   again. This actually captures a pattern people usually use. When you
   merge, you usually put in the log something along the lines:

   "merged X, which bars and fixes foo."

   when you actually merge M, which you consider a "mainline" and
   therefore not worth mentioning and X. Linus does it this way too --
   he actually posted a log message as an example, that showed exactly
   this.

2. Assigns revision aliases in this same order (except the "major"
   number for the subsection is based on the common ancestor, not on the
   merge point). They are not a special thing that is generated at commit
   time; they are inferred from the shape of the DAG (and cached for
   performance reasons).
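
To illustrate how such an alias can be inferred from the DAG alone, here
is a minimal sketch. It assumes numbering starts at 1 at the root and
covers only the mainline (first-parent) numbers, not the dotted numbers
of merged subsections. struct rev and mainline_revno() are hypothetical
names for illustration, not bzr's real data model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical revision node: a first (leftmost) parent and, for
 * merges, one other parent. */
struct rev {
	const struct rev *first_parent;	/* NULL for the root revision */
	const struct rev *other_parent;	/* NULL unless this is a merge */
};

/*
 * Mainline revno of `tip`: the number of revisions on the first-parent
 * chain from the root up to and including `tip`. Other parents are
 * deliberately ignored, which is the distinction described above.
 */
int mainline_revno(const struct rev *tip)
{
	int n = 0;

	assert(tip != NULL);
	for (; tip; tip = tip->first_parent)
		n++;
	return n;
}
```

Nothing is stored at commit time in this sketch; swapping which parent
of a merge counts as the first one changes which chain gets walked.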

And the only issue I think is, that the bzr UI and documentation pushes
forward these aliases (revnos) more than appropriate for fully
distributed case and hides the real revision names (revids) too much for
that case.

> >>it's just fine for bzr to not support all possible topologies,
> >
> >I think there's a real intent for bzr TO support at least all common
> >topologies.  I'll buy that current development has focused more on
> >[relatively] simple topologies than the more wildly complex ones.  I
> >look forward to more addressing of the less common cases as the tool
> >matures, and I think a lot of this thread will be good material to
> >work with as that happens.  It's just the suggestion that providing
> >fruit for simple topologies _necessarily_ prejudices against complex
> >ones that I find so onerous.
> 
> one concern that the git people are voicing is that the things that work 
> for simple topologies (revno's) can't be used with the more complex ones 
> (where you need the revid's). especially the fact that users need to do 
> things significantly different when there are fairly subtle changes to the 
> topology.
> 
> the scenario that came up elsewhere today where you have
> 
>    Master
>    /    \
> dev1   dev2
> 
> and then dev1 and dev2 both start working on the same thing (without 
> knowing it), then discover they are working on the same thing. they now 
> have three options
> 
> 1. merge their stuff up to the master so that they can both pull it down.
>   but this puts broken, experimental stuff up in the master
> 
> 2. declare one of the dev trees to be the master
> 
> this changes the topology to
> 
> Master--dev1--dev2
> 
> 3. pull from each other frequently to keep in sync.
> 
> this changes the topology to
> 
>    Master
>    /   \
> dev1--dev2
> 
> if they do this with bzr then the revno's break, they each get extra 
> commits showing up (so they can never show the same history).

That's a deficiency of merge not telling you that a merge is pointless.
Actually I think that bzr merge *should* reduce to pull in all cases:

- If the common ancestor is on the leftmost path of the other branch,
  then the existing revnos as seen on this branch will not change in any
  case; only new ones are added. I think it's safe for merge to
  reduce to pull in this case, and I consider it a bug in bzr that it does
  not.
- If the common ancestor is not on the leftmost path of the other
  branch, then it is because the branch was merged with some other branch
  deemed "more important" (i.e. the "Master" above). In this case
  reducing to pull will change the old revnos, but IMO it's the correct
  thing to do, because the branch is now up to date with the latest revision
  of "Master" and its revnos should take precedence. Personally I'd just
  like merge to reduce to pull in this case as well, but maybe it'd be
  better to have it error out and ask the user to either "pull" or
  "merge --pointless".

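The test that separates the two cases above can be sketched as a walk
along first parents. This is only an illustration: struct rev and
on_leftmost_path() are hypothetical names, not bzr code, and the sketch
assumes the common ancestor has already been found by other means.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical revision node, as seen from the other branch. */
struct rev {
	const struct rev *first_parent;	/* leftmost parent, NULL at root */
	const struct rev *other_parent;	/* merged-in parent, or NULL */
};

/*
 * Return 1 if `ancestor` lies on the leftmost (first-parent) path
 * leading to `other_tip`, 0 otherwise. In the first case a pull would
 * only append new revnos; in the second it would renumber old ones.
 */
int on_leftmost_path(const struct rev *other_tip, const struct rev *ancestor)
{
	assert(other_tip != NULL && ancestor != NULL);
	for (; other_tip; other_tip = other_tip->first_parent)
		if (other_tip == ancestor)
			return 1;
	return 0;
}
```

A merge command could use such a check to decide between silently
reducing to pull and erroring out as suggested in the second case.
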
> in git this is a non-issue, they can pull back and forth and the only new 
> history to show up will be changes.
> 
> this is the situation that the kernel developers are in frequently. it 
> sounds as if you haven't needed to do this yet, so you haven't encountered 
> the problems.

Given that bzr is a considerably smaller project than the Linux kernel,
that's quite likely. And it's likely the reason why it has not been
thoroughly discussed in bzr yet (or at least I'm not aware that it was).

--------------------------------------------------------------------------------


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:15                                                   ` Jon Smirl
@ 2006-11-03  3:43                                                     ` Matthew Hannigan
  0 siblings, 0 replies; 1752+ messages in thread
From: Matthew Hannigan @ 2006-11-03  3:43 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git, Shawn Pearce

On Fri, Oct 20, 2006 at 02:15:15PM -0400, Jon Smirl wrote:
> [ ... ] 
> You could have a file of macro substitutions that is applied/expanded
> when files go in/out of git. The macros would replace the copyright
> notices improving the move/rename tracking and the reducing repository
> size. The macros could be recorded out of band to eliminate the need
> for escaping the file contents. Even simpler, the only valid place for
> the macro could be the beginning of the file.

That probably belongs in the class of transformations
best done outside the VCS such as the permissions 
and system config file idea Linus outlined earlier.


Matt


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 19:21                                                     ` Jakub Narebski
@ 2006-11-03  6:36                                                       ` Martin Langhoff
  0 siblings, 0 replies; 1752+ messages in thread
From: Martin Langhoff @ 2006-11-03  6:36 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Jan Hudec, Jeff King, bazaar-ng, git

On 10/22/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Lack of --follow is not a big issue because you can do this "by hand";
> you can use git-diff-tree -M at the end of file history to check if
> [git considers] it was moved from somewhere.

This 'by hand' can be done in shell. cg-log has a half-complete
implementation of it. Seems to be disabled now :-(

cheers,




* [PATCH] git-pickaxe -C -C -C
@ 2006-11-06  9:08 Junio C Hamano
  2006-11-06 16:46 ` Horst H. von Brand
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-06  9:08 UTC (permalink / raw)
  To: git

Three -C options make the command look for copied lines from _any_
existing file in the parent commit, not just changed files.

This is of course _very_ expensive.

Some numbers and observations.

* git-pickaxe -C revision.c
2.22user 0.02system 0:02.24elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+5263minor)pagefaults 0swaps
num read blob 486

* git-pickaxe -C -C -C revision.c
35.42user 0.27system 0:37.66elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (14major+115543minor)pagefaults 0swaps
num read blob 43277

Comparing the output from the above two, with this option, it
finds that some lines were copied from diff.c, diff-tree.c and
merge-cache.c; they are obvious patterns justifiably repeated.

 - list of parameters to a function (ll. 214-217, 247-249);

 - definitions of local variables (ll. 260-263);

 - loops over all cache entries (ll. 581-584).

This change probably falls into the category of "I did this not
because it is useful in practice but just because I could".

Nevertheless, looking at the output was very interesting.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 builtin-pickaxe.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/builtin-pickaxe.c b/builtin-pickaxe.c
index f12b2d4..619a8c6 100644
--- a/builtin-pickaxe.c
+++ b/builtin-pickaxe.c
@@ -19,7 +19,7 @@
 #include <sys/time.h>
 
 static char pickaxe_usage[] =
-"git-pickaxe [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [-M] [-C] [-C] [commit] [--] file\n"
+"git-pickaxe [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [-M] [-C] [-C] [-C] [commit] [--] file\n"
 "  -c, --compatibility Use the same output mode as git-annotate (Default: off)\n"
 "  -l, --long          Show long commit SHA1 (Default: off)\n"
 "  -t, --time          Show raw timestamp (Default: off)\n"
@@ -48,6 +48,7 @@ static int num_commits;
 #define PICKAXE_BLAME_MOVE		01
 #define PICKAXE_BLAME_COPY		02
 #define PICKAXE_BLAME_COPY_HARDER	04
+#define PICKAXE_BLAME_COPY_HARDEST	010
 
 /*
  * blame for a blame_entry with score lower than these thresholds
@@ -885,8 +886,9 @@ static int find_copy_in_parent(struct sc
 	 * and this code needs to be after diff_setup_done(), which
 	 * usually makes find-copies-harder imply copy detection.
 	 */
-	if ((opt & PICKAXE_BLAME_COPY_HARDER) &&
-	    (!porigin || strcmp(target->path, porigin->path)))
+	if (((opt & PICKAXE_BLAME_COPY_HARDER) &&
+	     (!porigin || strcmp(target->path, porigin->path))) ||
+	    (opt & PICKAXE_BLAME_COPY_HARDEST))
 		diff_opts.find_copies_harder = 1;
 
 	diff_tree_sha1(parent->tree->object.sha1,
@@ -1569,6 +1571,8 @@ int cmd_pickaxe(int argc, const char **a
 			blame_move_score = parse_score(arg+2);
 		}
 		else if (!strncmp("-C", arg, 2)) {
+			if (opt & PICKAXE_BLAME_COPY_HARDER)
+				opt |= PICKAXE_BLAME_COPY_HARDEST;
 			if (opt & PICKAXE_BLAME_COPY)
 				opt |= PICKAXE_BLAME_COPY_HARDER;
 			opt |= PICKAXE_BLAME_COPY | PICKAXE_BLAME_MOVE;
-- 
1.4.3.4.g9f05


^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: [PATCH] git-pickaxe -C -C -C
  2006-11-06  9:08 [PATCH] git-pickaxe -C -C -C Junio C Hamano
@ 2006-11-06 16:46 ` Horst H. von Brand
  2006-11-06 17:25   ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Horst H. von Brand @ 2006-11-06 16:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> wrote:
> Three -C options make the command look for copied lines from _any_
> existing file in the parent commit, not just changed files.

IMHO, this is horrible UI.

-C        is one thing
-C -C     is another
-C -C -C  is still another?
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513


* Re: [PATCH] git-pickaxe -C -C -C
  2006-11-06 16:46 ` Horst H. von Brand
@ 2006-11-06 17:25   ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-06 17:25 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: git

"Horst H. von Brand" <vonbrand@inf.utfsm.cl> writes:

> Junio C Hamano <junkio@cox.net> wrote:
>> Three -C options make the command look for copied lines from _any_
>> existing file in the parent commit, not just changed files.
>
> IMHO, this is horrible UI.
>
> -C        is one thing
> -C -C     is another
> -C -C -C  is still another?

I think of it as the "-v" vs "-v -v" vs "-v -v -v" that some programs
use to give you increasing levels of verbosity.

The triple-C version is a kind of joke and not to be integrated
(although it seems to work as advertised, it is impractically
slow), so the real choice is between -C and -C -C, but I am
certainly open to better ways of specifying the current -C/-C -C
options.
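
The escalation debated in this subthread can be sketched outside of C.
The following is a minimal Python model of the -C handling in the patch
above, not the builtin-pickaxe.c code itself: the flag values mirror the
PICKAXE_BLAME_* defines, while the function name parse_copy_flags and
the simplified argument handling are assumptions made for illustration.

```python
# Flag values mirror the octal PICKAXE_BLAME_* defines in the patch.
BLAME_MOVE = 0o1
BLAME_COPY = 0o2
BLAME_COPY_HARDER = 0o4
BLAME_COPY_HARDEST = 0o10


def parse_copy_flags(args):
    """Accumulate copy-detection flags from repeated -C options.

    Hypothetical helper: it models only the -C branch of the option
    loop and ignores every other argument.
    """
    opt = 0
    for arg in args:
        if arg.startswith("-C"):
            # Order matters: test the previous level before raising it,
            # so each additional -C climbs exactly one step.
            if opt & BLAME_COPY_HARDER:
                opt |= BLAME_COPY_HARDEST
            if opt & BLAME_COPY:
                opt |= BLAME_COPY_HARDER
            opt |= BLAME_COPY | BLAME_MOVE
    return opt


if __name__ == "__main__":
    for n in range(4):
        print(n, oct(parse_copy_flags(["-C"] * n)))
```

Checking the higher level first is what makes the scheme behave like
repeated "-v": a fourth -C changes nothing, because every bit is already
set.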


* [PATCH] commit: Steer new users toward "git commit -a" rather than update-index
@ 2006-11-14 16:42 Carl Worth
  2006-11-14 18:55 ` Andy Whitcroft
  0 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-11-14 16:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

As has been discussed recently, update-index isn't intended as a
"porcelain" command so the mention of it in the output of git-commit
does lead to some user confusion.
---
 wt-status.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 7dd6857..4edabcd 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -126,7 +126,7 @@ static void wt_status_print_changed_cb(s
 	int i;
 	if (q->nr)
 		wt_status_print_header("Changed but not updated",
-				"use git-update-index to mark for commit");
+				"use \"git commit <files>\" to commit or \"git commit -a\" for all");
 	for (i = 0; i < q->nr; i++)
 		wt_status_print_filepair(WT_STATUS_CHANGED, q->queue[i]);
 	if (q->nr)
--
1.4.3.3.gf040



* Re: [PATCH] commit: Steer new users toward "git commit -a" rather than update-index
  2006-11-14 16:42 [PATCH] commit: Steer new users toward "git commit -a" rather than update-index Carl Worth
@ 2006-11-14 18:55 ` Andy Whitcroft
  2006-11-14 19:22   ` Cleaning up git user-interface warts Carl Worth
  2006-11-14 23:30   ` [PATCH] commit: Steer new users toward "git commit -a" rather than update-index Junio C Hamano
  0 siblings, 2 replies; 1752+ messages in thread
From: Andy Whitcroft @ 2006-11-14 18:55 UTC (permalink / raw)
  To: Carl Worth; +Cc: Junio C Hamano, git

Carl Worth wrote:
> As has been discussed recently, update-index isn't intended as a
> "porcelain" command so the mention of it in the output of git-commit
> does lead to some user confusion.
> ---
>  wt-status.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/wt-status.c b/wt-status.c
> index 7dd6857..4edabcd 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -126,7 +126,7 @@ static void wt_status_print_changed_cb(s
>  	int i;
>  	if (q->nr)
>  		wt_status_print_header("Changed but not updated",
> -				"use git-update-index to mark for commit");
> +				"use \"git commit <files>\" to commit or \"git commit -a\" for all");
>  	for (i = 0; i < q->nr; i++)
>  		wt_status_print_filepair(WT_STATUS_CHANGED, q->queue[i]);
>  	if (q->nr)
> --
> 1.4.3.3.gf040

Are we sure this isn't porcelain-ish?  We need to use it in merge
conflict correction and the like?  You can't use git-commit there as a
replacement.  I'd expect it to be 'git update-index' rather than
'git-update-index' of course.



* Cleaning up git user-interface warts
  2006-11-14 18:55 ` Andy Whitcroft
@ 2006-11-14 19:22   ` Carl Worth
  2006-11-14 19:29     ` Shawn Pearce
                       ` (3 more replies)
  2006-11-14 23:30   ` [PATCH] commit: Steer new users toward "git commit -a" rather than update-index Junio C Hamano
  1 sibling, 4 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-14 19:22 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: Junio C Hamano, git

On Tue, 14 Nov 2006 18:55:51 +0000, Andy Whitcroft wrote:
> Carl Worth wrote:
> > As has been discussed recently, update-index isn't intended as a
> > "porcelain" command so the mention of it in the output of git-commit
> > does lead to some user confusion.
>
> Are we sure this isn't porcelain-ish?  We need to use it in merge
> conflict correction and the like?  You can't use git-commit there as a
> replacement.  I'd expect it to be 'git update-index' rather than
> 'git-update-index' of course.

It was Junio that recently said update-index is plumbing, not
porcelain.

So, the fact that conflict resolution still requires the use of
update-index would just be the next thing to fix. A name for a
replacement to use there could be "git resolve <paths>", (since the
old git-resolve is now officially deprecated). That's a name that
matches what hg uses in this situation, (another option is "resolved"
which is what stg uses, but I think verbs for commands work better in
general).

It would be really nice if none of the "common" commands had a hyphen
in them, for example.

And then, the next phase of my evil plan would be to introduce a -i
option for git-commit making it commit the state in the index. Then
git-commit with no options could work like "git-commit -a" does now,
(with the additional protection of not committing any unmerged
files---that is the new "git resolve" would be required before "git
commit" would work after a conflict). Users who really, really like
the current behavior of git-commit could use the new alias support to
pass the new -i option in order to maintain compatible behavior.
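
The proposed defaults can be sketched as a small decision function. This
is a toy Python model of the proposal in the paragraph above, not of any
existing git behavior: the function name files_to_commit, its
parameters, and the choice to refuse the commit in both modes while
unmerged paths remain are all assumptions.

```python
def files_to_commit(modified, staged, unmerged, use_index=False):
    """Return the sorted paths a commit would record under the proposal.

    modified  -- tracked paths changed in the working tree
    staged    -- paths already recorded in the index
    unmerged  -- paths still in conflict after a merge
    use_index -- models the proposed -i option (commit the index as is)
    """
    if unmerged:
        # Proposed safety net: conflicts must be marked resolved first.
        raise RuntimeError("unmerged paths: " + ", ".join(sorted(unmerged)))
    if use_index:
        return sorted(staged)
    # Default: behave like today's "git commit -a".
    return sorted(set(modified) | set(staged))


if __name__ == "__main__":
    print(files_to_commit({"a.c"}, {"b.c"}, set()))
    print(files_to_commit({"a.c"}, {"b.c"}, set(), use_index=True))
```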

Then, the last thing I'd really like to fix is to allow a usage of
"git merge <branch>" instead of the awkward "git pull . <branch>".

With that, most of the user-interface warts that I regularly run into
with git would be solved. Oh, except it would also be nice to
eliminate the "plumbing" commands in a couple of places:

 1) From the "man git" man page

 2) From git-<TAB>, (maybe the solution for this is to make
    "git <TAB>" work and only do tab-completion for the commands
    blessed enough to appear in "git --help"?). Also push the tab
    completion stuff out as a standard part of packages.

Anyway, now I've just gone and blown all my secret plans for changing
git in ways to make it less intimidating for new users.

For reference, the latest potential batch of new users that I'm
dealing with is the set of Fedora package maintainers who are looking
at replacing CVS for their tree of package-building scripts. They are
currently evaluating systems and liking the interface of hg. Here's
the top of the current thread:

https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00030.html

Here's the report about "git commit -a" confusion that led to my patch
above:

https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00141.html

And here's my reply where I suggest that git UI might still be
improved in these areas:

https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00149.html

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-14 19:22   ` Cleaning up git user-interface warts Carl Worth
@ 2006-11-14 19:29     ` Shawn Pearce
  2006-11-14 19:59       ` Carl Worth
  2006-11-14 19:47     ` Petr Baudis
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-11-14 19:29 UTC (permalink / raw)
  To: Carl Worth; +Cc: Andy Whitcroft, Junio C Hamano, git

Carl Worth <cworth@cworth.org> wrote:
>  2) From git-<TAB>, (maybe the solution for this is to make
>     "git <TAB>" work and only do tab-completion for the commands
>     blessed enough to appear in "git --help"?). Also push the tab
>     completion stuff out as a standard part of packages.

Uh, see contrib/completion/git-completion.bash.

"git <TAB>" completes commands.  It offers too many completions
for your taste it sounds like, as it also offers plumbing... but
that's fixable.  :-)

-- 


* Re: Cleaning up git user-interface warts
  2006-11-14 19:22   ` Cleaning up git user-interface warts Carl Worth
  2006-11-14 19:29     ` Shawn Pearce
@ 2006-11-14 19:47     ` Petr Baudis
  2006-11-14 20:56       ` Carl Worth
  2006-11-14 20:46     ` Karl Hasselström
  2006-11-14 20:52     ` Nicolas Pitre
  3 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-11-14 19:47 UTC (permalink / raw)
  To: Carl Worth; +Cc: Andy Whitcroft, Junio C Hamano, git

On Tue, Nov 14, 2006 at 08:22:39PM CET, Carl Worth wrote:
> For reference, the latest potential batch of new users that I'm
> dealing with is the set of Fedora package maintainers who are looking
> at replacing CVS for their tree of package-building scripts. They are
> currently evaluating systems and liking the interface of hg. Here's
> the top of the current thread:
> 
> https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00030.html
> 
> Here's the report about "git commit -a" confusion that led to my patch
> above:
> 
> https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00141.html
> 
> And here's my reply where I suggest that git UI might still be
> improved in these areas:
> 
> https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00149.html

Hmm, did they (not) consider Cogito? They wouldn't have those issues.
;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-14 19:29     ` Shawn Pearce
@ 2006-11-14 19:59       ` Carl Worth
  0 siblings, 0 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-14 19:59 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Andy Whitcroft, Junio C Hamano, git

On Tue, 14 Nov 2006 14:29:14 -0500, Shawn Pearce wrote:
> Uh, see contrib/completion/git-completion.bash.

Oops. I had seen this and thought I had installed it properly a while
ago, (copied it to /etc/bash_completion.d/git), but I hadn't realized
it wasn't active in the shell I used to test while composing that
email.

> "git <TAB>" completes commands.  It offers too many completions
> for your taste it sounds like, as it also offers plumbing... but
> that's fixable.  :-)

Yes, I think we'd all be better off if we could designate some subset
of the current git commands as not being intended for users to type on
the command line and pulled them out of the completion scripts.

It is tough though. Looking through what's available in the short list
from "git --help" I notice that update-index isn't there, and that's
currently still required, (as we've been discussing here). But even
something as "core plumbing" as git rev-list I find extremely useful on
the command line with simple pipelines.

On the other hand, there are definitely some commands I've never
typed, and are not intended to be typed by the user. Here are a few I
see as fairly obvious just from skimming the list:

	merge-*
	http-*
	ssh-*
	upload-*
	mktag
	mktree
	check-ref-format
	...

There are a bunch of others as well. Maybe it would be easier to start
with the list in git --help and see what should be added to that.

The documentation for some of the above commands has phrases such as
"Invoked by <other command>" and "usually not invoked by the end user"
which do make the distinction quite clear. So it would be nice if
git could keep these away from the user more.
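
As a rough illustration of pulling such commands out of completion, here
is a minimal Python sketch that filters a list of command names against
the patterns listed above. It is not how git-completion.bash works: the
names PLUMBING_PATTERNS and completion_candidates are hypothetical, and
the sample command list is made up for the example.

```python
from fnmatch import fnmatchcase

# Patterns taken from the list above; the elided "..." entries are
# deliberately not guessed at.
PLUMBING_PATTERNS = [
    "merge-*",
    "http-*",
    "ssh-*",
    "upload-*",
    "mktag",
    "mktree",
    "check-ref-format",
]


def completion_candidates(commands, hidden=PLUMBING_PATTERNS):
    """Return the commands that match none of the hidden patterns."""
    return [
        name
        for name in commands
        if not any(fnmatchcase(name, pattern) for pattern in hidden)
    ]


if __name__ == "__main__":
    sample = ["commit", "merge", "merge-base", "mktag", "rev-list"]
    print(completion_candidates(sample))
```

Note that "merge" survives the filter because "merge-*" requires the
hyphen, which matches the intent of hiding only the helper programs.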

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-14 19:22   ` Cleaning up git user-interface warts Carl Worth
  2006-11-14 19:29     ` Shawn Pearce
  2006-11-14 19:47     ` Petr Baudis
@ 2006-11-14 20:46     ` Karl Hasselström
  2006-11-14 20:52     ` Nicolas Pitre
  3 siblings, 0 replies; 1752+ messages in thread
From: Karl Hasselström @ 2006-11-14 20:46 UTC (permalink / raw)
  To: Carl Worth; +Cc: Andy Whitcroft, Junio C Hamano, git

On 2006-11-14 11:22:39 -0800, Carl Worth wrote:

> So, the fact that conflict resolution still requires the use of
> update-index would just be the next thing to fix. A name for a
> replacement to use there could be "git resolve <paths>", (since the
> old git-resolve is now officially deprecated). That's a name that
> matches what hg uses in this situation, (another option is
> "resolved" which is what stg uses, but I think verbs for commands
> work better in general).

Yes, "resolve" sounds better than "resolved". The latter is arguably
more correct, since you're telling git that you have already resolved
the file and not asking it to resolve it for you, but I still prefer
"resolve".

> And then, the next phase of my evil plan would be to introduce a -i
> option for git-commit making it commit the state in the index. Then
> git-commit with no options could work like "git-commit -a" does now,
> (with the additional protection of not committing any unmerged
> files---that is the new "git resolve" would be required before "git
> commit" would work after a conflict). Users who really, really like
> the current behavior of git-commit could use the new alias support
> to pass the new -i option in order to maintain compatible behavior.

Seems very sane. Default to simple behavior, and provide a switch to
get more complicated behavior.

> Then, the last thing I'd really like to fix is to allow a usage of
> "git merge <branch>" instead of the awkward "git pull . <branch>".

This should reduce newbie confusion a lot.

-- 
Karl Hasselström, kha@treskal.com


* Re: Cleaning up git user-interface warts
  2006-11-14 19:22   ` Cleaning up git user-interface warts Carl Worth
                       ` (2 preceding siblings ...)
  2006-11-14 20:46     ` Karl Hasselström
@ 2006-11-14 20:52     ` Nicolas Pitre
  2006-11-14 21:01       ` Jakub Narebski
  2006-11-14 21:10       ` Carl Worth
  3 siblings, 2 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-14 20:52 UTC (permalink / raw)
  To: Carl Worth; +Cc: Andy Whitcroft, Junio C Hamano, git

On Tue, 14 Nov 2006, Carl Worth wrote:

> Anyway, now I've just gone and blown all my secret plans for changing
> git in ways to make it less intimidating for new users.

I just cannot do otherwise than cheer this with applause.

Even if I have a clear preference for GIT's _technology_, I still think 
that the HG user interface is more convivial.  I have even been thinking
about writing something like an hg-like frontend to GIT from time to 
time just so that GIT could then be better compared to (and actually 
just used like) HG.

I still think that the GIT user interface sucks in many ways.  The 
confusion between pull, fetch and push is still my favorite, along with 
the local vs remote branch issue.  Maybe we'll better handle the branch
issue eventually, but it would be so much more intuitive to split branch
merging out of git-pull, and make git-pull be the same as git-fetch 
(maybe deprecating git-fetch in the process) so push and pull are really 
_only_ opposite of each other.

If the fetch+merge behavior (which I think should really be referred to
as pull+merge) is still desirable, then it should be called git-update
and be no more than a single shell script line such as

	git_pull && git_merge

This is really what most people expect from such a command name based 
on obvious historical reasons.  The lack of any branch argument to 
git-pull and git-merge could be defined as using the first defined 
remote branch by default.  But having git-pull performing merges is IMHO 
overloading the word and goes against most people's expectations.
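
The composition suggested above can be modeled in a few lines. This is a
minimal Python sketch of the "pull && merge" idea, with the two steps
passed in as callables returning exit statuses; the function name update
and the stand-in steps are assumptions, and no real git command is
invoked.

```python
def update(pull, merge):
    """Run pull, then merge only if pull succeeded (like `pull && merge`).

    Both arguments are callables returning a process-style exit status,
    where 0 means success. The first non-zero status is returned.
    """
    status = pull()
    if status != 0:
        return status
    return merge()


if __name__ == "__main__":
    steps = []

    def fake_pull():
        steps.append("pull")
        return 0

    def fake_merge():
        steps.append("merge")
        return 0

    print(update(fake_pull, fake_merge), steps)
```

Keeping the two steps as separate callables reflects the point of the
proposal: each command does one thing, and the combination is a trivial
wrapper.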




* Re: Cleaning up git user-interface warts
  2006-11-14 19:47     ` Petr Baudis
@ 2006-11-14 20:56       ` Carl Worth
  2006-11-15  0:31         ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-11-14 20:56 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Andy Whitcroft, Junio C Hamano, git

On Tue, 14 Nov 2006 20:47:07 +0100, Petr Baudis wrote:
> Hmm, did they (not) consider Cogito? They wouldn't have those issues.

I didn't ask.

Frankly, I don't see a lot of value in the git/cogito split right now.

When I first learned git and cogito (January 2006) and switched cairo
from cvs to git (the repository storage), I recommended cogito to
cairo programmers as a "more cvs-like" way to work with the new
repository.

Since then, having worked with git (the command-line program)
exclusively for my own work, and having introduced it to dozens of new
users, I don't bother recommending cogito anymore. It's just not that
hard to learn git itself, so there's not that much value in learning
cogito instead.

And this is particularly true since there's quite a large cost to
having to learn cogito _in addition to_ git. And I think that's what
most people would have to do anyway. For example, cogito doesn't wrap
all git commands. So users have to dip down into git for things like
git-bisect or else miss out on important functionality.

And for something like the Fedora transition, where I'm working with
the people who will be training the community in the new tools, the
trainers would have to learn both if they want to support a community
using both git and cogito. These trainers are already complaining
about the ~140 git commands, so adding 40 more cogito commands as well
doesn't make the story better.

It's great that git is written in a script-friendly way so that new
interfaces can be built on top of it. And I think the benefits of new
user interfaces are clear when they work in fundamentally different
ways, (say, being operated through a GUI). But where git and cogito
are both command-line utilities and have the same basic functionality,
I don't see how it's helpful to maintain both tools. (Certainly some of
my attitude here is due to the timing of my introduction to git
contrasted with the timing of the inception of cogito. I'm sure git
improved a lot between those two events.)

There are some things that cogito does that git does not that I would
like to have in git. One is having a "commit" command that commits
everything by default without an extra command-line option. Another
(that I _think_ cogito has) is a way to switch away from a branch with
dirty changes to a clean branch, do work there, and come back to the
original branch with the dirty stuff still there.

I don't see any defining difference that justifies cogito's
existence ("hide the index" maybe? let's just hide it a tiny bit more
in git). And I would like to help work to get the remaining good
stuff that has been proven in cogito---to get it pushed down into git
itself.

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-14 20:52     ` Nicolas Pitre
@ 2006-11-14 21:01       ` Jakub Narebski
  2006-11-14 21:32         ` Nicolas Pitre
  2006-11-14 21:10       ` Carl Worth
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-14 21:01 UTC (permalink / raw)
  To: git

Nicolas Pitre wrote:

> If the fetch+merge behavior (which I think should really be referred to
> as pull+merge) is still desirable, then it should be called git-update
> and be no more than a single shell script line such as
> 
>         git_pull && git_merge
> 
> This is really what most people expect from such a command name based 
> on obvious historical reasons.  The lack of any branch argument to 
> git-pull and git-merge could be defined as using the first defined 
> remote branch by default.  But having git-pull performing merges is IMHO 
> overloading the word and goes against most people's expectations.

By the way, is anyone doing _remote_ octopus pull (true pull, not with . as
repository)?

We can always have --merge arguments to git-pull, and --fetch argument to
git-merge.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Cleaning up git user-interface warts
  2006-11-14 20:52     ` Nicolas Pitre
  2006-11-14 21:01       ` Jakub Narebski
@ 2006-11-14 21:10       ` Carl Worth
  2006-11-14 21:30         ` Jakub Narebski
  2006-11-14 22:36         ` Junio C Hamano
  1 sibling, 2 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-14 21:10 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Andy Whitcroft, Junio C Hamano, git

On Tue, 14 Nov 2006 15:52:47 -0500 (EST), Nicolas Pitre wrote:
> Even if I have a clear preference for GIT's _technology_, I still think
> that the HG user interface is more convivial.  I have even been thinking
> about writing something like an hg-like frontend to GIT from time to
> time just so that GIT could then be better compared to (and actually
> just used like) HG.

I've actually been tempted to do the same myself. I really think that
the technology is a more important criterion than the UI so the
imagined hg-on-git interface would be an attempt to get people to look
past the interface differences and look at the technology when
deciding.

But, then, I'd be guilty of creating another cogito, and I just argued
against its existence in a separate thread. So I think we're better
off just fixing the git interface.

> I still think that the GIT user interface sucks in many ways.  The
> confusion between pull, fetch and push is still my favorite, along with
> the local vs remote branch issue.  Maybe we'll better handle the branch
> issue eventually,

The --use-separate-remotes thing is technology in the right direction
here. But I think it's another example of very useful stuff being
improperly hidden behind another command-line option. Making sure the
"remote-tracking branches" no longer show up as ordinary branches that
users can commit to should be a priority. And that should be the default for
everyone, not just people who happen to clone with this obscure
option.

Similarly, the reflog stuff was often trumpeted in the recent git
vs. bzr debate. Why is that very useful functionality buried in a
config file option and not just stored by default?

> This is really what most people expect from such a command name based
> on obvious historical reasons.  The lack of any branch argument to
> git-pull and git-merge could be defined as using the first defined
> remote branch by default.

Once again, there's lots of useful work on "branch configuration" that
allows for commands to be able to get the "right" default repository
for push and pull. I hope that that stuff can be enabled by default
and not require --use-separate-remotes or manual configuration for
people to benefit from it.

I apologize if I sound like I'm ranting here. I love to see the many
good improvements being made to git. It's just that there seems to be
a sort of shyness about new features, (perhaps a fear of changing
existing behavior?). When it improves the user experience, let's make
the improvement the default and not add any more

	--make-this-command-do-what-it-really-should-have-always-done

options.

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-14 21:10       ` Carl Worth
@ 2006-11-14 21:30         ` Jakub Narebski
  2006-11-14 21:34           ` Nicolas Pitre
  2006-11-14 22:36         ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-14 21:30 UTC (permalink / raw)
  To: git

I think the git interface refactoring should be the reason for a git 2.0.0
release...

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Cleaning up git user-interface warts
  2006-11-14 21:01       ` Jakub Narebski
@ 2006-11-14 21:32         ` Nicolas Pitre
  2006-11-14 22:04           ` Jakub Narebski
  0 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-14 21:32 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Tue, 14 Nov 2006, Jakub Narebski wrote:

> Nicolas Pitre wrote:
> 
> > If the fetch+merge behavior (which I think should really be referred to
> > as pull+merge) is still desirable, then it should be called git-update
> > and be no more than a single shell script line such as
> > 
> >         git_pull && git_merge
> > 
> > This is really what most people expect from such a command name based 
> > on obvious historical reasons.  The lack of any branch argument to 
> > git-pull and git-merge could be defined as using the first defined 
> > remote branch by default.  But having git-pull performing merges is IMHO 
> > overloading the word and goes against most people's expectations.
> 
> By the way, is anyone doing _remote_ octopus pull (true pull, not with . as
> repository)?
> 
> We can always have --merge arguments to git-pull, and --fetch argument to
> git-merge.

That would be a complete abomination if you want my opinion.

Please let git-pull actually pull stuff from a remote place, and 
git-merge actually merge stuff only.  Let's keep simple concepts mapped 
to simple commands please.  Nothing prevents _you_ from scripting more 
involved operations with a single command of your liking afterwards.


Nicolas


* Re: Cleaning up git user-interface warts
  2006-11-14 21:30         ` Jakub Narebski
@ 2006-11-14 21:34           ` Nicolas Pitre
  2006-11-14 22:56             ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-14 21:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Tue, 14 Nov 2006, Jakub Narebski wrote:

> I think the git interface refactoring should be the reason for a git 2.0.0
> release...

Good idea indeed.




* Re: Cleaning up git user-interface warts
  2006-11-14 21:32         ` Nicolas Pitre
@ 2006-11-14 22:04           ` Jakub Narebski
  2006-11-14 22:29             ` Nicolas Pitre
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-14 22:04 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre wrote:
> On Tue, 14 Nov 2006, Jakub Narebski wrote:

>> We can always have --merge arguments to git-pull, and --fetch argument to
>> git-merge.
> 
> That would be a complete abomination if you want my opinion.
> 
> Please let git-pull actually pull stuff from a remote place, and 
> git-merge actually merge stuff only.  Let's keep simple concepts mapped 
> to simple commands please.  Nothing prevents _you_ from scripting more 
> involved operations with a single command of your liking afterwards.

Do we want to completely abandon the "single-branch" workflow, where you
don't use a tracking branch and only merge directly into your working
branch? That is the reason for the (unused by most)
--fetch=<remote>[#<branch>] option of the future git-merge (the
replacement for git-pull .).

I'm not that sure about --merge option, but it could be useful, at least
to have current automatic "Merge branch '<branch>' of <URL>" commit message.
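
To make the suggested <remote>[#<branch>] notation concrete, here is a
minimal Python sketch of how such an argument might be split. No git
command accepts this option; the function name parse_fetch_spec and the
treatment of a missing or empty branch as "use the default" are
assumptions.

```python
def parse_fetch_spec(spec):
    """Split "<remote>[#<branch>]" into (remote, branch-or-None)."""
    remote, _, branch = spec.partition("#")
    if not remote:
        raise ValueError("missing remote name in %r" % spec)
    # An absent or empty branch part means "let the tool pick a default".
    return remote, (branch or None)


if __name__ == "__main__":
    print(parse_fetch_spec("origin#next"))
    print(parse_fetch_spec("origin"))
```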
-- 
Jakub Narebski


* Re: Cleaning up git user-interface warts
  2006-11-14 22:04           ` Jakub Narebski
@ 2006-11-14 22:29             ` Nicolas Pitre
  0 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-14 22:29 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Tue, 14 Nov 2006, Jakub Narebski wrote:

> Nicolas Pitre wrote:
> > On Tue, 14 Nov 2006, Jakub Narebski wrote:
> 
> >> We can always have --merge arguments to git-pull, and --fetch argument to
> >> git-merge.
> > 
> > That would be a complete abomination if you want my opinion.
> > 
> > Please let git-pull actually pull stuff from a remote place, and 
> > git-merge actually merge stuff only.  Let's keep simple concepts mapped 
> > to simple commands please.  Nothing prevents _you_ from scripting more 
> > involved operations with a single command of your liking afterwards.
> 
> Do we want to abandon completely "single-branch" workflow, where you
> don't use tracking branch, only merge directly into your working branch?

I really think we should.  Let's admit it: such a work flow has nothing 
to do with the tool.  It would certainly be much easier to teach new 
users about "this is a read-only view of the remote content that you can 
merge into your working branch" than trying to explain why the tool is 
so weird for the sake of supporting different work flows directly.

Again I think it is easier to grasp two simple commands than a single 
but complex one with multiple ramifications.

> That is the cause to (unused by most) future git-merge (replacement for
> git-pull .) --fetch=<remote>[#<branch>] option.
> 
> I'm not that sure about --merge option, but it could be useful, at least
> to have current automatic "Merge branch '<branch>' of <URL>" commit message.

A "remote" branch should obviously have a corresponding URL.  So if you 
do "git-merge remote" then you may as well prepare a commit message with 
that URL given the local name for that branch if you want.




* Re: Cleaning up git user-interface warts
  2006-11-14 21:10       ` Carl Worth
  2006-11-14 21:30         ` Jakub Narebski
@ 2006-11-14 22:36         ` Junio C Hamano
  2006-11-14 22:50           ` Junio C Hamano
  2006-11-16  5:12           ` Petr Baudis
  1 sibling, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-14 22:36 UTC (permalink / raw)
  To: Carl Worth; +Cc: git, Andy Whitcroft, Nicolas Pitre

Carl Worth <cworth@cworth.org> writes:

> On Tue, 14 Nov 2006 15:52:47 -0500 (EST), Nicolas Pitre wrote:
>> Even if I have a clear preference for GIT's _technology_, I still think
>> that the HG user interface is more convivial.  I even been thinking
>> about writing something like an hg-like frontend to GIT from time to
>> time just so that GIT could then be better compared to (and actually
>> just used like) HG.
>
> I've actually been tempted to do the same myself. I really think that
> the technology is a more important criterion than the UI so the
> imagined hg-on-git interface would be an attempt to get people to look
> past the interface differences and look at the technology when
> deciding.
>...
>> I still think that the GIT user interface sucks in many ways.  The
>...

I've actually been tempted to do that too, and my earlier "if I
were to redo git from scratch" message was the beginning of it
to summarize my preference about some of the issues raised in
this thread.

Commenting on the messages in this thread:

 - "resolve / resolved" are both confusing, when you are talking
   about "mark-resolved" operation.

 - "pull/push/fetch" have undesired confusion depending on where
   people learned the term.  I'd perhaps vote for replacing
   fetch with download and push with upload.

 - I think it would be sensible to make remote tracking branches
   less visible.  For example:

	git diff origin

   where origin is the shorthand for your upstream (e.g. you
   have .git/remotes/origin that records the URL and the branch
   you are tracking) should be easier to understand than

   	git diff remotes/origin/HEAD

   The latter is an implementation detail.  I could imagine we
   might even want to allow

	git diff origin#next

   to name the branch of the remote repository.  The notion of
   "where the tips of remote repository's branches are" would
   probably be updated by "git download" (in other words, the
   above "git diff" does not automatically initiate network
   transfer).

 - "git merge" to merge another branch into the current would
   make sense.  "git pull . remotes/origin/next" is showing too
   much implementation detail.  It should just be:

	git merge origin#next

And I agree with Pasky that fixing UI is hard unless you are
willing to get rid of historical warts.  The syntax of the command
line arguments that the current set of Porcelain-ish takes is
sometimes just horrible.  It may not be a bad idea to start
building the fixed UI from scratch, using different prefix than
"git" (say "gu" that stands for "git UI" or "gh" that stands for
"git for humans").

Of course, it could even be "cg" ;-).



* Re: Cleaning up git user-interface warts
  2006-11-14 22:36         ` Junio C Hamano
@ 2006-11-14 22:50           ` Junio C Hamano
  2006-11-15  4:32             ` Nicolas Pitre
  2006-11-16  5:12           ` Petr Baudis
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-14 22:50 UTC (permalink / raw)
  To: git, Andy Whitcroft, Nicolas Pitre, Carl Worth

Junio C Hamano <junkio@cox.net> writes:

>  - I think it would be sensible to make remote tracking branches
>    less visible.  For example:
>...
>  - "git merge" to merge another branch into the current would
>    make sense.  "git pull . remotes/origin/next" is showing too
>    much implementation detail.  It should just be:
>
> 	git merge origin#next

This and other examples in "making remote tracking branches less
visible" are hard to read because I used the word "origin" in
two different senses.  So here is a needed clarification.

If you have remotes/upstream that says:

	URL: git://git.xz/repo.git
        Pull: refs/heads/master:remotes/origin/master
        Pull: refs/heads/next:remotes/origin/next

Then, currently the users need to say:

	git diff remotes/origin/master
        git merge remotes/origin/next

By "making tracking branches less visible", what I mean is to
let the users say this instead:

	git diff upstream
        git merge upstream#next
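
As a rough illustration of the lookup described above, here is a minimal
sketch of how a shorthand such as "upstream" or "upstream#next" could be
mapped to a tracking branch by reading the remotes file.  The function
names are hypothetical, and treating a bare nickname as "the first Pull
line" is an assumption taken from the example, not something the proposal
spells out.

```python
# Contents of a hypothetical .git/remotes/upstream file, as in the example.
UPSTREAM = """\
URL: git://git.xz/repo.git
Pull: refs/heads/master:remotes/origin/master
Pull: refs/heads/next:remotes/origin/next
"""


def parse_remote(text):
    """Return (url, [(remote_ref, tracking_ref), ...]) from a remotes file."""
    url = None
    pulls = []
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # not a "Key: value" line
        value = value.strip()
        if key == "URL":
            url = value
        elif key == "Pull":
            src, _, dst = value.partition(":")
            pulls.append((src.lstrip("+"), dst))
    return url, pulls


def resolve_shorthand(name, remotes):
    """Map 'nick' or 'nick#branch' to a tracking ref.

    remotes maps a remote nickname to the text of its remotes file.
    Assumption: a bare nickname means the first Pull line.
    """
    nick, _, branch = name.partition("#")
    if nick not in remotes:
        return name  # not a remote nickname; leave it to normal ref lookup
    _, pulls = parse_remote(remotes[nick])
    if not pulls:
        raise ValueError("remote %r tracks no branches" % nick)
    if not branch:
        return pulls[0][1]
    for src, dst in pulls:
        if src == "refs/heads/" + branch:
            return dst
    raise ValueError("remote %r has no tracked branch %r" % (nick, branch))


print(resolve_shorthand("upstream", {"upstream": UPSTREAM}))
print(resolve_shorthand("upstream#next", {"upstream": UPSTREAM}))
```

With such a lookup in front of "git diff" and "git merge", the user would
never need to type the remotes/origin/... names directly.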





* Re: Cleaning up git user-interface warts
  2006-11-14 21:34           ` Nicolas Pitre
@ 2006-11-14 22:56             ` Junio C Hamano
  2006-11-15  1:48               ` Nicolas Pitre
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-14 22:56 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> On Tue, 14 Nov 2006, Jakub Narebski wrote:
>
>> The git interface refactoring should be I think the cause for git 2.0.0
>> release...
>
> Good idea indeed.

We need to avoid user confusion, so making a command that used
to do one thing to suddenly do something completely different is
a no-no.  However, I do not think it needs to wait for 2.0.0.
We can start with a separate namespace (or even a separate
"Improved git UI project") and introduce the "improved UI set"
in 1.5.0 timeframe.

If managed properly, the "improved git UI" can coexist with the
current set of tools and over time we can give an option not to
even install the older Porcelain-ish commands.


* Re: [PATCH] commit: Steer new users toward "git commit -a" rather than update-index
  2006-11-14 18:55 ` Andy Whitcroft
  2006-11-14 19:22   ` Cleaning up git user-interface warts Carl Worth
@ 2006-11-14 23:30   ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-14 23:30 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: git, Carl Worth

Andy Whitcroft <apw@shadowen.org> writes:

> Are we sure this isn't porcelain-ish?  We need to use it in merge
> conflict correction and the like?  You can't use git-commit there as a
> replacement.  I'd expect it to be 'git update-index' rather than
> 'git-update-index' of course.

I think status should be taken as Porcelain-ish so it should
notice more about the environment to see why the user has
changed but not updated files and recommend the possible action
depending on the context.

For that, you would need to enumerate what kind of 'context'
there could be with the current set of tools.  Here is a
strawman.

 1. None of the below.
 2. A merge was attempted and resulted in a conflict.
 3. An am or rebase without --merge was attempted and
    resulted in a conflict or patch rejection.
 4. A "rebase --merge" was attempted and resulted in a conflict.

In the normal case, the next user action would be:

 1-1. The user wants that change in the next commit, and should
      run "git update-index $that_path" to prepare the index for
      partial commit, or "git commit -a" to commit all the
      changes made to the working tree so far.  Carl's patch
      helps the user in this case.

 1-2. The user realizes that some of the changes in the
      working tree were not desirable, and runs "git checkout --
      $that_path" to revert them before continuing.  Before
      deciding to revert, the user may want to check what the
      difference is by running "git diff -- $that_path" so
      suggesting these two might also be helpful.

 1-3. The user wants to keep that change a strictly local change
      in the working tree (this is often very useful and making
      "commit -a" the default will not be acceptable unless
      there is a very compelling reason to do so).  This means
      the suggestion we would make should clearly be a
      _suggestion_.

The earlier wording was bad in that it suggested to use a
Plumbing command update-index, but was attempting to convey that
it was merely a conditional suggestion by saying "use it TO MARK
FOR COMMIT", implying that if the user does not want to mark
them for commit, it is Ok not to use update-index.

When a merge is in progress, we would have .git/MERGE_HEAD and
that would be the way to tell case 2.  In that case, the next
user action would be:

 2-1. The user resolves conflicts and marks them as resolved,
      with update-index (or "git mark-resolved"), to prepare the
      index for the merge commit.  But this is not done for
      "Changed but not updated" files but "unmerged" files.  We
      should strongly suggest not to do _anything_ to "Changed
      but not updated" files here.

 2-2. The user decides this conflict is too much to handle right
      now, and abandons the change by "git reset --hard".  This
      would lose the local changes ("Changed but not updated"),
      so we should suggest to save the change before doing so.

	If you are going to abandon this merge with "reset
	--hard", your changes to these files will be lost.  You
	can save them with "git diff HEAD -- $this_path
	$that_path..."

      which is probably too long for that part of the output but
      that is what we would want to say if we want to be
      helpful.

When either rebase without --merge or am is in progress, there
would be .dotest/ directory (whose name could be changed but I
think this was a mistake and we would be better off using fixed
names for this kind of application) for git-status to notice.
The next user action would be:

 3-1. The user resolves the conflict or manually applies the
      patch, runs update-index on the paths involved and proceeds with
      "rebase --continue" or "am --resolved".  "Changed but not
      updated" paths should not be touched in this case,
      similarly to 2-1.

 3-2. The user gives up.  Same as 2-2.

Designing for the "rebase --merge" case and coming up with other
cases are left as an exercise to the list for further discussion.
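
To make the strawman concrete, here is a minimal sketch of how a status
command could tell cases 1 to 3 apart using only the markers named above
(.git/MERGE_HEAD and a .dotest/ directory).  The function name and the
hint wording are hypothetical, the sketch assumes .dotest/ sits at the top
of the working tree, and case 4 is deliberately not handled.

```python
import os

# Hint text per context, paraphrasing the cases above (hypothetical wording).
HINTS = {
    "normal": 'use "git update-index <path>" or "git commit -a" to '
              'include these changes in the next commit',
    "merge": 'a merge is in progress; leave these files alone, or save '
             'them with "git diff HEAD -- <path>" before "git reset --hard"',
    "am-or-rebase": 'an am or rebase is in progress; leave these files '
                    'alone until it is continued or abandoned',
}


def detect_context(worktree):
    """Classify the working tree state for 'Changed but not updated' hints."""
    git_dir = os.path.join(worktree, ".git")
    # Case 2: a merge was attempted and left MERGE_HEAD behind.
    if os.path.exists(os.path.join(git_dir, "MERGE_HEAD")):
        return "merge"
    # Case 3: am, or rebase without --merge, left a .dotest/ directory.
    # Assumption: it lives at the top of the working tree.
    if os.path.isdir(os.path.join(worktree, ".dotest")):
        return "am-or-rebase"
    # Case 1: none of the above.
    return "normal"


def hint_for(worktree):
    """Return the suggestion to print under 'Changed but not updated'."""
    return HINTS[detect_context(worktree)]
```

The merge check comes first here only so that the result is deterministic
if both markers happen to exist; which one should win is a design choice
the strawman does not settle.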



* Re: Cleaning up git user-interface warts
  2006-11-14 20:56       ` Carl Worth
@ 2006-11-15  0:31         ` Junio C Hamano
  2006-11-15  4:08           ` Petr Baudis
  2006-11-15 20:51           ` Carl Worth
  0 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15  0:31 UTC (permalink / raw)
  To: Carl Worth; +Cc: git, Andy Whitcroft, Petr Baudis

Carl Worth <cworth@cworth.org> writes:

> On Tue, 14 Nov 2006 20:47:07 +0100, Petr Baudis wrote:
>> Hmm, did they (not) consider Cogito? They wouldn't have those issues.
>
> I didn't ask.
>
> Frankly, I don't see a lot of value in the git/cogito split right now.
> ...
> It's great that git is written in a script-friendly way so that new
> interfaces can be built on top of it. And I think the benefits of new
> user interfaces are clear when they work in fundamentally different
> ways, (say, being operated through a GUI). But where git and cogito
> are both command-line utilities and have the same basic functionality,
> ...
> There are some things that cogito does that git does not that I would
> like to have in git.
> ...
> I don't see any defining difference that justifies cogito's
> existence ("hide the index" maybe? let's just hide it a tiny bit more
> in git). And I would like to help work to get the remaining good
> stuff that has been proven in cogito---to get it pushed down into git
> itself.

I am of two minds here.

I do not think the Porcelain-ish UI that is shipped with git
should be taken with the same degree of "authority" as git
Plumbing.  The plumbing needed to have something that worked for
one particular workflow (namely, workflow of the people in the
integrator role of kernel-style project) and that is where the
current set of Porcelain-ish originates.  Linus works primarily
as an integrator so the toolsets he did tend to be more pleasant
to use for integrators and less so for contributors.  I started
as a contributor and added some commands like format-patch and
rebase that Linus never would have felt the need for.  I think
single isolated developers, contributors and CVS style shared
repository usage could be a lot improved because neither of us
was concentrating on their workflows.  This needs somebody
motivated enough to improve things in that area.  For example,
StGIT with its 'float' command is a great improvement over what
rebase does for people in the contributor role.

By now, perhaps git may be good enough for the kernel folks,
even for those not in the integrator role, but I have no doubt
that they have many dislikes about the way some commands work.
They and X.org folks are using git primarily because Linus and
Keith forced them to ;-), and being interoperable is more
important than having to tolerate sucky UI here and there.
Everybody knows that git Porcelain-ish sucks, and making it more
usable is a worthy goal.

But making it more usable for whom is a big question.  

Quite frankly, I do not think there can be _the_ single UI that
would satisfy different types of workflows for some of the
commands.  The commands related to software archaeology, in
which my main interest and strength lie, would easily be usable
across workflows, but commands to build commits locally and
propagate them to and from other repositories would be affected
by the workflow.

For example, fetching and merging from many places without
necessarily having corresponding tracking branches is a great
thing for people in the integrator role.  On the other hand, for
people doing CVS-style centralized repository interaction, it is
often more useful to have tracking branches.  You could support
both but it has been painful.

For another example, having a commit command to commit
everything by default is disastrous for people who allow their
workflows to often be interrupted.  When I respond to a message
from the list with an example patch, my repository is often in
the middle of doing something completely unrelated, and I edit
and make diff to send the message out and I do not necessarily
revert that change afterwards immediately.  For more organized
people it may not be a problem so you either support both types
of workflows or do a specialized toolset.

It is not just command line syntax and the defaults, but
concepts as well.  People in the integrator role often need to
deal with merges and you would need to be aware of the role of
the index and need to be able to manipulate the index, a lot
more often than people in the contributor role.  To satisfy
both kinds of workflows, you would either have switches, or do a
specialized toolset, like Cogito, that tries to hide the index.

A Porcelain that does a very similar thing in slightly different
way is obviously a waste, but otherwise I do not think it is a
problem to have different Porcelains.  StGIT does not compete
with the "sucky" Porcelain-ish shipped with git but makes the
user's life a lot more pleasant by complementing what the sucky
one does not do well.  It is not very useful while I am playing
the integrator role, but when I am doing my own thing it is a
great addition to my toolchest.

I am from the camp that does _not_ want to hide the index, so
obviously I do not see any value in its effort to hide the
index.  But other aspects of it, most notably being friendly to
simpler workflows, is a very good thing.



* Re: Cleaning up git user-interface warts
  2006-11-14 22:56             ` Junio C Hamano
@ 2006-11-15  1:48               ` Nicolas Pitre
  2006-11-15  2:10                 ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15  1:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 14 Nov 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Tue, 14 Nov 2006, Jakub Narebski wrote:
> >
> >> The git interface refactoring should be I think the cause for git 2.0.0
> >> release...
> >
> > Good idea indeed.
> 
> We need to avoid user confusion, so making a command that used
> to do one thing to suddenly do something completely different is
> a no-no.  However, I do not think it needs to wait for 2.0.0.
> We can start with a separate namespace (or even a separate
> "Improved git UI project") and introduce the "improved UI set"
> in 1.5.0 timeframe.

Dunno.  I feel this is a bit overboard.  Actually the naming problem is 
rather localized to one command, namely git-pull.  In my opinion going 
with yet another namespace would add to the confusion rather than 
clear it.

The best way to avoid user confusion is to remove the source of the 
confusion not let it live.  In other words I think we should _fix_ 
git-pull instead of replacing it.  People are already confused about it 
so simply fixing this command will have a net confusion reduction.  Yet 
we're not talking about "suddenly doing something completely different" 
either.  If git-pull doesn't merge automatically anymore it is easy to 
tell people to use git-merge after a pull.

"You pull the remote changes with 'git-pull upstream', then you can 
merge them in your current branch with 'git-merge upstream'."

Isn't it much simpler to understand (and to teach) that way?

Also I don't think using git-upload and git-download is much better.  
This adds yet more commands that do almost the same as existing ones but 
with a different name which is yet not necessarily fully adequate.  I 
for example would think that "download" is more like git-clone than 
git-fetch or git-pull.

Let's face it: HG got it right with pull and push and newbies have much 
less difficulty grokking it.  We screwed up by not using the most 
intuitive semantics for pull, and locking the word "pull" away is not the 
best solution, all things considered.  Why not just admit it and 
avoid being different from HG just for the sake of it?




* Re: Cleaning up git user-interface warts
  2006-11-15  1:48               ` Nicolas Pitre
@ 2006-11-15  2:10                 ` Junio C Hamano
  2006-11-15  2:27                   ` Michael K. Edwards
                                     ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15  2:10 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> "You pull the remote changes with 'git-pull upstream', then you can 
> merge them in your current branch with 'git-merge upstream'."
>
> Isn't it much simpler to understand (and to teach) that way?

If it were "you download the remote changes with 'git download
upstream' and then merge with 'git merge'", then perhaps, but if
you used the word "pull" or "fetch", I do not think so.

I would be all for changing the semantics of "pull" from one
thing to another, if the new semantics were (1) what everybody
welcomed, (2) what "pull" traditionally meant everywhere else.
In that case, we have been misusing it to be confusing to
outsiders and I agree it makes a lot of sense to remove the
source of confusion.  But I do not think CVS nor SVN ever used
the term, and I was told that BK was what introduced the term,
and the word meant something different from what you are
proposing.

You have to admit both pull and fetch have been contaminated
with loaded meanings from different backgrounds. I was talking
about killing the source of confusion in the longer term by
removing fetch/pull/push, so we are still on the same page.

That's where my "you download from the upstream and merge" comes
from.



* Re: Cleaning up git user-interface warts
  2006-11-15  2:10                 ` Junio C Hamano
@ 2006-11-15  2:27                   ` Michael K. Edwards
  2006-11-15  4:20                   ` Nicolas Pitre
  2006-11-15 20:12                   ` Petr Baudis
  2 siblings, 0 replies; 1752+ messages in thread
From: Michael K. Edwards @ 2006-11-15  2:27 UTC (permalink / raw)
  To: git

I would kind of like to see "git poll" -- visit all remote branches,
fetching objects and tags into the local repository, so that I can
inspect changes off-line and merge, cherry-pick, etc. to my heart's
content.  That would fit the platform integrator's workflow nicely --
"git poll" into a tracking tree, do some merges there (such as
backporting a subsystem to a "stable" base kernel), then merge this
backport branch to each platform working copy and cherry-pick other
changes as necessary.
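
A minimal sketch of what such a "git poll" might do, assuming each file
under .git/remotes/ names one remote and that running "git fetch <name>"
for each of them is enough to update the tracking branches.  "git poll"
itself does not exist; the function name is hypothetical, and the sketch
only builds the command lines rather than running them.

```python
import os


def poll_commands(git_dir):
    """Return one 'git fetch <name>' argv per file in <git_dir>/remotes."""
    remotes_dir = os.path.join(git_dir, "remotes")
    if not os.path.isdir(remotes_dir):
        return []  # no remotes configured, nothing to poll
    # Sort so the remotes are always visited in the same order.
    return [["git", "fetch", name] for name in sorted(os.listdir(remotes_dir))]
```

A caller could pass each list to subprocess.call() in turn, and then
inspect, merge or cherry-pick off-line from the updated tracking branches.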

Cheers,


* Sometimes "Failed to find remote refs" means "try git-fetch --no-tags"
@ 2006-11-15  3:53 Michael K. Edwards
  2006-11-15  4:05 ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Michael K. Edwards @ 2006-11-15  3:53 UTC (permalink / raw)
  To: git

Down inside git-ls-remote there is a die "Failed to find remote refs".
 This struck when I tried to fetch an http repository with a missing
info/refs file.  Using "git fetch --no-tags" succeeds because it
doesn't have to call git-ls-remote at all.  Does git-ls-remote have
any way of knowing who is calling it so that it can print a
context-appropriate error message?  If not, is it worth adding some
sort of "caller context" mechanism, perhaps at the boundary between
porcelain and plumbing?  Or should the error message include, "If you
were trying to do a 'git fetch', try --no-tags; you won't get tags but
you may get a good update of the branch content"?

Cheers,


* Re: Sometimes "Failed to find remote refs" means "try git-fetch --no-tags"
  2006-11-15  3:53 Sometimes "Failed to find remote refs" means "try git-fetch --no-tags" Michael K. Edwards
@ 2006-11-15  4:05 ` Junio C Hamano
  2006-11-15 21:13   ` Horst H. von Brand
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15  4:05 UTC (permalink / raw)
  To: Michael K. Edwards; +Cc: git

"Michael K. Edwards" <medwards.linux@gmail.com> writes:

> Down inside git-ls-remote there is a die "Failed to find remote refs".
> This struck when I tried to fetch an http repository with a missing
> info/refs file.  Using "git fetch --no-tags" succeeds because it
> doesn't have to call git-ls-remote at all.  Does git-ls-remote have
> any way of knowing who is calling it so that it can print a
> context-appropriate error message?  If not, is it worth adding some
> sort of "caller context" mechanism, perhaps at the boundary between
> porcelain and plumbing?

I think letting git-ls-remote know who called it makes sense for
better error reporting.  I am all for it.

However "fetch --no-tags" from http upstream is a band-aid to
hide that the upstream repository has stale info/refs, and I do
not think we would want to encourage the band-aid.  Rather, the
message should say "yell loudly at the repository owner" ;-).

Seriously, when people start using packed-refs that will be in
v1.4.4 scheduled for tomorrow on the public site, I think the
best way to adjust the commit walker clients is to have them
download info/refs and start traversing from the objects listed
there, instead of downloading .git/refs/heads/$branch and
.git/refs/tags/$tag files as we currently do, so the band-aid
would become less useful.
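
As an illustration of the approach suggested above, here is a minimal
sketch of a client reading a downloaded info/refs file and collecting the
object names to start traversing from.  It assumes the file holds one
"<40-hex object name><TAB><ref name>" entry per line; the function names
are hypothetical and this is not the commit walker's actual code.

```python
def parse_info_refs(text):
    """Return {ref_name: object_name} from the text of an info/refs file."""
    refs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        sha1, sep, name = line.partition("\t")
        if not sep or len(sha1) != 40:
            raise ValueError("malformed info/refs line: %r" % line)
        # Any entry is kept under the name it was listed with.
        refs[name] = sha1
    return refs


def walker_starting_points(refs):
    """Return the unique object names to start the traversal from."""
    return sorted(set(refs.values()))
```

Starting from this list means the client no longer needs the individual
files under refs/heads/ and refs/tags/, which may be absent once the
refs are packed.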



* Re: Cleaning up git user-interface warts
  2006-11-15  0:31         ` Junio C Hamano
@ 2006-11-15  4:08           ` Petr Baudis
  2006-11-15  4:33             ` Junio C Hamano
  2006-11-15 10:05             ` Jakub Narebski
  2006-11-15 20:51           ` Carl Worth
  1 sibling, 2 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-15  4:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carl Worth, git, Andy Whitcroft

On Wed, Nov 15, 2006 at 01:31:50AM CET, Junio C Hamano wrote:
> Carl Worth <cworth@cworth.org> writes:
> 
> > On Tue, 14 Nov 2006 20:47:07 +0100, Petr Baudis wrote:
> >> Hmm, did they (not) consider Cogito? They wouldn't have those issues.
> >
> > I didn't ask.
> >
> > Frankly, I don't see a lot of value in the git/cogito split right now.
> > ...
> > It's great that git is written in a script-friendly way so that new
> > interfaces can be built on top of it. And I think the benefits of new
> > user interfaces are clear when they work in fundamentally different
> > ways, (say, being operated through a GUI). But where git and cogito
> > are both command-line utilities and have the same basic functionality,
> > ...
> > There are some things that cogito does that git does not that I would
> > like to have in git.
> > ...
> > I don't see any defining difference that justifies cogito's
> > existence ("hide the index" maybe? let's just hide it a tiny bit more
> > in git). And I would like to help work to get the remaining good
> > stuff that has been proven in cogito---to get it pushed down into git
> > itself.
> 
> I am of two minds here.
> 
> I do not think the Porcelain-ish UI that is shipped with git
> should be taken with the same degree of "authority" as git
> Plumbing.
..snip passage about workflows..

Controversy's fun, so...

<Cogito maintainer hat _off_> (But yeah, it still looks silly that I'm
saying this.)

 From the current perspective, I think it has been a mistake that the
porcelain and plumbing were not kept separate in independent packages,
and perhaps even maintained separately (and perhaps not; at least having
a single tree with plumbing/ and porcelain/ directories and separate
packages in distributions might already help something), so that "git"
would be kept as a kind of library and then there would be a separate
package providing an interface to it. Or you could select one of several
packages. Not only would that make Cogito prevail in the world and bring
me a flood of marriage proposals, but look at how it would help the
general public:

  (i) Clearly divided porcelain/plumbing interface, so that you can
really isolate the two UI-wise; endless confusion reigns there now. Is
git-update-index porcelain or plumbing? _You_ call git-merge a proper
porcelain? From my perspective, git-update-ref is as plumbing as it
gets, but it's classified as porcelain. Etc, etc. This would be by far
the most important advantage.

  (ii) The plumbing and porcelain would not share the same namespace,
leading to clearer UI. (I'm just inflating (i).)

  (iii) The documentation would not be a strange mix of porcelain and
plumbing. (More (i) inflation.)

  (iv) (i) is troublesome because I have a feeling that Junio declared
several times that he doesn't care that much about stable API for
porcelain compared to the plumbing. But with the current mix it's
desirable to use some porcelain even in other porcelains and in scripts.

  (v) Git would be properly libified by now. If you wanted to convert
bits of porcelain to C, it would be at least much higher priority.

  (vi) You wouldn't need to make the gruesome choice on what is the
canonical workflow that _the_ Git porcelain supports (see the snipped
passage). Or you would, but it would have less impact.

  (vii) The world would be a happier place.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-15  2:10                 ` Junio C Hamano
  2006-11-15  2:27                   ` Michael K. Edwards
@ 2006-11-15  4:20                   ` Nicolas Pitre
  2006-11-15  4:58                     ` Junio C Hamano
  2006-11-15 18:03                     ` Linus Torvalds
  2006-11-15 20:12                   ` Petr Baudis
  2 siblings, 2 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15  4:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 14 Nov 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > "You pull the remote changes with 'git-pull upstream', then you can 
> > merge them in your current branch with 'git-merge upstream'."
> >
> > Isn't it much simpler to understand (and to teach) that way?
> 
> If it were "you download the remote changes with 'git download
> upstream' and then merge with 'git merge'", then perhaps, but if
> you used the word "pull" or "fetch", I do not think so.
> 
> I would be all for changing the semantics of "pull" from one
> thing to another, if the new semantics were (1) what everybody
> welcomed, (2) what "pull" traditionally meant everywhere else.
> In that case, we have been misusing it to be confusing to
> outsiders and I agree it makes a lot of sense to remove the
> source of confusion.  But I do not think CVS nor SVN ever used
> the term, and I was told that BK was what introduced the term,
> and the word meant something different from what you are
> proposing.
> 
> You have to admit both pull and fetch have been contaminated
> with loaded meanings from different backgrounds. I was talking
> about killing the source of confusion in the longer term by
> removing fetch/pull/push, so we are still on the same page.
> 
> That's where my "you download from the upstream and merge" comes
> from.

But the fact is that HG (which has a growing crowd of happy campers, 
maybe even larger than the BK crowd now) did work with and got used to a 
sensible definition of what a "pull" is.  This means that their 
definition is becoming rather more relevant with time than it used 
to be, and because it is a saner definition than what GIT has for the same 
word which HG users really have no issue with, I think we really should 
leverage the "common wisdom" and consider aligning ourselves with them 
in this case rather than trying to go into a totally different 
direction.  We simply won't gain anything trying to teach people "a pull 
in HG is a download in GIT".  If a pull becomes the same thing for both 
then it's one less oddball in the GIT interface.




* Re: Cleaning up git user-interface warts
  2006-11-14 22:50           ` Junio C Hamano
@ 2006-11-15  4:32             ` Nicolas Pitre
  2006-11-15  5:35               ` Junio C Hamano
                                 ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15  4:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Andy Whitcroft, Carl Worth

On Tue, 14 Nov 2006, Junio C Hamano wrote:

> Junio C Hamano <junkio@cox.net> writes:
> 
> >  - I think it would be sensible to make remote tracking branches
> >    less visible.  For example:
> >...
> >  - "git merge" to merge another branch into the current would
> >    make sense.  "git pull . remotes/origin/next" is showing too
> >    much implementation detail.  It should just be:
> >
> > 	git merge origin#next
> 
> This and other examples in "making remote tracking branches less
> visible" are hard to read because I used the word "origin" in
> two different sense.  So here is a needed clarification.
> 
> If you have remotes/upstream that says:
> 
> 	URL: git://git.xz/repo.git
>         Pull: refs/heads/master:remotes/origin/master
>         Pull: refs/heads/next:remotes/origin/next
> 
> Then, currently the users need to say:
> 
> 	git diff remotes/origin/master
>         git merge remotes/origin/next
> 
> By "making tracking branches less visible", what I mean is to
> let the users say this instead:
> 
> 	git diff upstream
>         git merge upstream#next

What is the point of hiding tracking branches?  Why not just make them 
easier to use instead?  There are currently so many ways to specify 
remote branches that even I get confused.

OK..... let's pretend this is my follow-up to your "If I were redoing 
git from scratch" query.  Actually I would not redo it from scratch 
since the vast majority of it is rather sane already.  But here are some 
changes that I would make:

1) make "git init" an alias for "git init-db".

What's the point of "-db"?  Sure we're initializing the GIT database.  
But who cares?  The user doesn't care if GIT uses a "database" or 
whatever.  And according to some people's definition of a "database" it 
could be argued that GIT doesn't use a database at all in the purist 
sense of it. What the user wants is to get started and "init" (without 
the "-db") is so much more to the point. Doesn't matter if incidentally 
it happens to be the same keyword HG uses for the same operation because 
we are not afflicted by the NIH disease, right? And it has 3 chars less 
to type which is for sure a premium improvement to the very first GIT 
user experience!

2) "pull" and "push" should be symmetrical operations

They are symmetrical in the dictionary and in people's mind.  OK but what 
if I merge content from another _local_ branch into the current one?  
Isn't that kind of a pull as well?  Answer: NO IT IS NOT!  Reason: 
because we already have "merge" for that very operation for damn sake!  
And because "merging" isn't a synonym for "pulling" at all, we cannot 
pretend it should sort of become more true if taken the other way 
around.

Actually, if we _merge_ stuff together, we certainly have to /pull/ some 
of it, meaning that "merge" might imply a "pull", even in real life 
situations outside of the GIT context (think merging Vodka and Kahlua in 
a glass where you might have to pull the Vodka bottle out of the freezer 
before you can merge it). And thankfully we got it right with git-merge 
which can take either a branch or a URL as argument, which in the latter 
case will perform a pull implicitly (OK currently a fetch but you know 
what I mean).

But trying to put in people's head that "pulling" implies a "merge"?  No 
that doesn't work really well.  OK if you pull too hard on the Vodka 
bottle that might imply a merge at some point but it would certainly be 
accidental.  And it is no coincidence that some people had 
accidental GIT merges by using git-pull.

Conclusion:  git-pull must not perform any merge.  It is the symmetrical 
operation of a push meaning that it pulls content from a remote branch 
and does no more.  People understand that pretty well.  This makes 
git-fetch redundant (or an alias to git-pull) in that case, and again we 
don't mind it becoming similar to HG because we admit HG was right 
about it.

3) remote branch handling should become more straight forward.

OK! Now that we've solved the pull issue and that everybody agrees with 
me (how can't you all agree with me anyway) let's have a look at remote 
branches.  It should be simple:

a)	git-pull git://repo.com/time_machine.git

This pulls every branch from the time_machine.git repository and 
creates identically named branches locally, except for the remote 
master becoming origin locally.  All those branches are marked read-only 
(i.e. cannot commit to them) and _each_ of those branches gets a URL 
associated with it somehow (the association is an implementation 
detail).

If then you do:

b)	git-pull origin

Then it will pull the git://repo.com/time_machine.git:master branch into 
the local "origin" branch.  IOW, local tracking branches become 
synonyms for their remote URLs after they've been pulled once.  If the 
remote branch "next" became a local "next" with the first pull (because 
it didn't specify any branch meaning that they were all pulled), doing 
a:

c)	git-pull next

would actually be the same as:

d)	git-pull git://repo.com/time_machine.git:next

Now to have different remote and local names for those tracking 
branches:

e)	git-pull git://repo.com/time_machine.git:master upstream

would be a variation where a remote branch gets a different local name. 
This pulls the remote master branch but calls it "upstream" locally.  
If that "upstream" branch does exist locally already then fail with 
appropriate error message, unless the local branch happens to have the 
same URL attribute already.  You then have two local branches tracking 
the same remote branch which is weird but still fine if someone wants
to have different views (today's pull and yesterday's pull).  This is 
not necessarily something to encourage but only a fallout of the branch 
semantic.  And again a simple:

f)	git-pull upstream

would update the "upstream" branch from the remote master branch.

I think the concept of "branch group" should be preserved too.  So if 
you create a group called "warp", then add "origin", "next", and 
"upstream" to it, then:

g)	git-pull warp

would pull all the included branches.  One way to create a branch group 
with the initial pull is not to specify the remote branch but only the 
repository URL, like:

h)	git-pull git://repo.com/time_machine.git warp

Because no specific branch in the remote repository was specified just 
like in (a) then all branches are pulled, but because a local name was 
provided then this becomes a branch group.

Branch groups could be used to extend the branch namespace as well to 
avoid clashes with different remote repositories.  In this case the 
branch groups could be a way to arrange branches in a hierarchy so 
"warp" refers to all branches included in the warp group while 
"warp/upstream" refers to only one branch. In this case "upstream" and 
"warp/upstream" would be the same branch if "upstream" was effectively 
added to the "warp" group, but it doesn't need to be so.  And branches 
in a group don't have to come from the same remote repository either 
since the source of each branch (the URL) is a per branch attribute.

To make it "easy" on the user, I think that any branch (or tag) down the 
hierarchy should be used without the "path" leading to it if there is no 
conflict.  We already do that with heads and tags.  So if, for example, the 
"warp" group contained a branch named "lightspeed" but no such branch 
(or tag) existed anywhere else then it could be referenced with simply 
"lightspeed" or "warp/lightspeed".

Then you don't need any strange scheme for diff and merge.  Just using 
"git-diff upstream" or "git-merge origin next" suffice.  Oh and I don't 
think it would be a good idea to have a completely separate namespace 
for local vs remote aka tracking branches.  Maybe in .git/refs/ they 
should be separate to distinguish which ones are read-only remote 
tracking ones and which ones are local, but that must not be forced on 
the UI.
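
The short-name rule above (use a branch or tag without its leading path 
when nothing clashes) could look roughly like the following.  This is 
only a sketch in Python for illustration: resolve_short_name is a made-up 
name, the flat list of names stands in for whatever git would really 
store, and letting an exact match win over a nested one is my assumption.

```python
from typing import Iterable


def resolve_short_name(name: str, refs: Iterable[str]) -> str:
    """Map a possibly shortened name to the one branch/tag it designates."""
    known = list(refs)
    # Assumption: a name that exists exactly as given always wins.
    if name in known:
        return name
    # Otherwise accept it as the trailing part of a longer path,
    # e.g. "lightspeed" for "warp/lightspeed".
    matches = [ref for ref in known if ref.endswith("/" + name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no branch or tag named {name!r}")
    raise LookupError(f"{name!r} is ambiguous: {', '.join(sorted(matches))}")
```

With names "origin", "next" and "warp/lightspeed" known, both 
"lightspeed" and "warp/lightspeed" would designate the same branch, while 
a short name found in two different groups is refused.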

Thinking about it some more, maybe (a) should create a default branch 
group if the remote repository has more than one branch, say "origin".  
This way, git-pull without any argument would be the same as 
"git-pull origin" by default.  If "origin" is a single branch then 
"git-pull" would pull only one branch, but if "origin" is a branch group 
then all included branches would be pulled.

This becomes formalized as:

	git-pull [<URL>] [<local_name>]

If <URL> includes a branch name then <local_name> is a single branch 
name.  If <URL> doesn't include any branch name then <local_name> 
becomes a local branch group name containing all branches in the remote 
repository. If <URL> is specified but not <local_name> then <local_name> 
is set to "origin" by default, unless it already exists in which case it 
is an error and the pull fails.  If <URL> is not specified then the URL 
attribute of the specified branch(es) is used.  If nothing is specified 
then "origin" is used for <local_name> by default and the URL attribute of 
the origin branch or the origin branch group is used.
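
To check that these rules hang together, here is a rough sketch of the 
argument resolution, in Python purely for illustration.  Everything named 
here (plan_pull, PullPlan, has_branch, the "known" table of URL 
attributes) is hypothetical, and so is the assumption that the branch is 
whatever follows a ":" after the "scheme://" part, which means URLs with 
a port number are not handled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PullPlan:
    url: str          # repository URL, possibly with a ":branch" suffix
    local_name: str   # local branch or branch group to update
    is_group: bool    # True when all remote branches are pulled


def has_branch(url: str) -> bool:
    """Assumed syntax: scheme://host/path.git[:branch], no port number."""
    _, _, rest = url.partition("://")
    return ":" in rest


def plan_pull(known: Dict[str, str], url: Optional[str] = None,
              local_name: Optional[str] = None) -> PullPlan:
    """Resolve the arguments; 'known' maps local names to URL attributes."""
    if url is None:
        # No URL given: reuse the URL attribute of the named branch or
        # group, falling back to "origin" when nothing at all was given.
        name = local_name or "origin"
        if name not in known:
            raise ValueError(f"no URL recorded for {name!r}")
        stored = known[name]
        return PullPlan(stored, name, not has_branch(stored))
    if local_name is None:
        # URL only: the local name defaults to "origin", which must be new.
        if "origin" in known:
            raise ValueError("'origin' already exists")
        local_name = "origin"
    elif local_name in known and known[local_name] != url:
        # An existing name may only be reused for the same URL attribute.
        raise ValueError(f"{local_name!r} already tracks another URL")
    return PullPlan(url, local_name, not has_branch(url))
```

A URL without a branch yields a group and a URL with one yields a single 
branch, so examples (a) through (h) all go through the same function.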

*****

OK I think this is enough for now. I know that parts of what I've said 
can already be found in GIT, but I wanted the explanation to be 
complete and therefore tentatively coherent.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:08           ` Petr Baudis
@ 2006-11-15  4:33             ` Junio C Hamano
  2006-11-15  4:46               ` Nicolas Pitre
  2006-11-15 20:39               ` Petr Baudis
  2006-11-15 10:05             ` Jakub Narebski
  1 sibling, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15  4:33 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis <pasky@suse.cz> writes:

> On Wed, Nov 15, 2006 at 01:31:50AM CET, Junio C Hamano wrote:
>> 
>> I am of two minds here.
>> 
>> I do not think the Porcelain-ish UI that is shipped with git
>> should be taken with the same degree of "authority" as git
>> Plumbing.
> ..snip passage about workflows..
>
> Controversy's fun, so...
>
> <Cogito maintainer hat _off_> (But yeah, it still looks silly that I'm
> saying this.)

It appears that you are not grumpy as you were anymore ;-).  I
mostly agree with what you said in your message.

> (i) Clearly divided porcelain/plumbing interface, so that you can
> really isolate the two UI-wise; endless confusion reigns there now. Is
> git-update-index porcelain or plumbing? _You_ call git-merge a proper
> porcelain? From my perspective, git-update-ref is as plumbing as it
> gets, but it's classified as porcelain. Etc, etc. This would be by far
> the most important advantage.

Yes.  The current "merge" started its life as Linus's porcelain
(we did not have fetch and pull infrastructure back then) but
quickly has become just a helper for pull to produce a merge
commit.  If anybody thinks its UI is good as a general end-user
level command, there is a need for "head examination".

As you say, update-ref is as plumbing as it gets and it should
not be listed as Porcelain; I am a bit surprised that it is
labelled as such myself.

No disagreement here, nor (ii) nor (iii).

>   (ii) The plumbing and porcelain would not share the same namespace,
> leading to clearer UI. (I'm just inflating (i).)
>
>   (iii) The documentation would not be a strange mix of porcelain and
> plumbing. (More (i) inflation.)
>
>   (iv) (i) is troublesome because I have a feeling that Junio declared
> several times that he doesn't care that much about stable API for
> porcelain compared to the plumbing. But with the current mix it's
> desirable to use some porcelain even in other porcelains and in scripts.

This is true and it is a problem.

While we encourage Porcelain writers to use plumbing in order to
give git Porcelain-ish more freedom to evolve to give better UI
for humans, not having a clear distinction between the two makes
it harder.

>   (v) Git would be properly libified by now. If you wanted to convert
> bits of porcelain to C, it would be at least much higher priority.

I am not sure about the "libified" part and I do not know what bits
of porcelain want to become C right now.  But I do not think
this point is an important part of your list.

>   (vi) You wouldn't need to make the gruesome choice on what is the
> canonical workflow that _the_ Git porcelain supports (see the snipped
> passage). Or you would, but it would have less impact.

Yes.  This is really important.

Linus and me having done Porcelain-ish that supports integrator
role workflow better than other workflows such as contributor
role should not discourage people from working on alternative or
complementary Porcelains to help other workflows better (see the
snipped passage).

StGIT sets a great example, and more efforts like it are
encouraged.

I think both Linus and myself tried to make it clear that the
purpose of Porcelain-ish that comes with core git is 50% to make
plumbing (perhaps minimally) usable and the other 50% to serve
as an example for Porcelain writers to learn how to use the
plumbing, but we should probably have stressed the latter
better.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:33             ` Junio C Hamano
@ 2006-11-15  4:46               ` Nicolas Pitre
  2006-11-15 10:09                 ` Jakub Narebski
  2006-11-15 20:39               ` Petr Baudis
  1 sibling, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15  4:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Petr Baudis, git

On Tue, 14 Nov 2006, Junio C Hamano wrote:

> Yes.  The current "merge" started its life as Linus's porcelain
> (we did not have fetch and pull infrastructure back then) but
> quickly has become just a helper for pull to produce a merge
> commit.  If anybody thinks its UI is good as a general end-user
> level command, there is a need for "head examination".

If you mean "git merge" it sure needs to be brought forward.  It can't 
be clearer than:

	git-merge the_other_branch

or

	git-merge git://repo.com/time_machine.git

to instantaneously understand what is going on.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:20                   ` Nicolas Pitre
@ 2006-11-15  4:58                     ` Junio C Hamano
  2006-11-15 18:03                     ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15  4:58 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> ...  We simply won't gain anything trying to teach people "a pull 
> in HG is a download in GIT".  If a pull becomes the same thing for both 
> then it's one less oddball in the GIT interface.

I personally do not have any issue with that, as long as you
would help us teach existing users that what was known as pull
is no longer available and that the new pull means fetching only.

If I recall correctly in this thread, you also advocated to
always have tracking branches.  I am a bit worried about losing
the promiscuous pull usage, which can easily become a regression
for people like Linus in the integrator role unless done with an
escape hatch.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:32             ` Nicolas Pitre
@ 2006-11-15  5:35               ` Junio C Hamano
  2006-11-15  6:18                 ` Shawn Pearce
  2006-11-15 14:01                 ` Johannes Schindelin
  2006-11-15  9:17               ` Andy Parkins
                                 ` (2 subsequent siblings)
  3 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15  5:35 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git, Andy Whitcroft, Carl Worth

Nicolas Pitre <nico@cam.org> writes:

> What is the point of hiding tracking branches?  Why not just make them 
> easier to use instead?  There are currently so many ways to specify 
> remote branches that even I get confused.

Ok, I think in essence we are saying the same thing except I
went overboard by suggesting to extend sha1_name to also look at
.git/remotes/$name which is not necessary, because we already
have the .git/refs/remotes/%s/HEAD magic there.  Consider the
suggestion of "upstream#next" syntax retracted, please.
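
To spell out what that magic buys us: the lookup order that current Git
documents for short ref names (see gitrevisions) ends with exactly the
refs/remotes/<name>/HEAD rule.  A minimal sketch in Python, where
dwim_ref and the set of existing refs are illustrative only and not
git's actual code; whether the sha1_name code of today matches this
list entry for entry is an assumption on my part.

```python
from typing import Iterable, Optional

# Candidate expansions, tried in order, for a short name such as "upstream".
CANDIDATES = [
    "{0}",
    "refs/{0}",
    "refs/tags/{0}",
    "refs/heads/{0}",
    "refs/remotes/{0}",
    "refs/remotes/{0}/HEAD",
]


def dwim_ref(name: str, existing: Iterable[str]) -> Optional[str]:
    """Return the first candidate expansion of name that exists, if any."""
    refs = set(existing)
    for pattern in CANDIDATES:
        full = pattern.format(name)
        if full in refs:
            return full
    return None
```

So with a refs/remotes/upstream/HEAD in place, a plain "upstream" already
names the default tracking branch and "upstream/next" names a specific
one, without any new syntax.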

> 1) make "git init" an alias for "git init-db".

Or even better, have "gh init".

> 2) "pull" and "push" should be symmetrical operations

I think that makes a lot of sense to have "gh pull" and "gh
push" as symmetric operations, and make "gh merge" do the
fast-forward and 3-way merge magic done in the current "git
pull".  These three words would have a lot saner meaning.

> 3) remote branch handling should become more straight forward.
>
> OK! Now that we've solved the pull issue and that everybody agrees with 
> me (how can't you all agree with me anyway) let's have a look at remote 
> branches.

I would probably prefer making the default namespace under
.git/refs/remotes/remote-name for the tracking branches this
proposal creates, but other than that I agree with the general
direction this proposal is taking us, including branch groups.
We have .git/refs/remotes/%s/HEAD magic so I do not think we
even need to treat a one-branch repository specially as you
suggested.

The reason I am suggesting "gh" instead of "git" is primarily to
deal with stale documentation people would find googling.  I can
easily see people getting confused by reading "pull = fetch + merge"
in either the mailing list archive or the Git cheat sheets various
projects seem to have developed.

It does not mean we need to redo _all_ UI.  I think most of the
archaeology commands have sane UI so during the transition
period (git 1.99) we can have "git log" and "gh log" which are
one and the same program, and perhaps git 2.0 can be shipped
with clear distinction between plumbing (i.e. git-update-index
and friends) and porcelain (e.g. "gh pull" that only fetches but
with the user friendliness you outlined here), with backward
compatibility wart to help old timers (e.g. "git pull" that
still does "git fetch" followed by "git merge").


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  5:35               ` Junio C Hamano
@ 2006-11-15  6:18                 ` Shawn Pearce
  2006-11-15  6:30                   ` Junio C Hamano
  2006-11-15 14:01                 ` Johannes Schindelin
  1 sibling, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-11-15  6:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git, Andy Whitcroft, Carl Worth

Junio C Hamano <junkio@cox.net> wrote:
> Or even better, have "gh init".

Why gh?  Is Git just Mercurial backwards?  :)

I'm all in favor of this discussion, and in particular of just
breaking the entire UI in 2.0 by using a new frontend command.
I'm just not sure that "Mercurial backwards" describes Git well.

-- 

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  6:18                 ` Shawn Pearce
@ 2006-11-15  6:30                   ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15  6:30 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

> Junio C Hamano <junkio@cox.net> wrote:
>> Or even better, have "gh init".
>
> Why gh?  Is Git just Mercurial backwards?  :)
>
> I'm all in favor of this discussion, and in particular of just
> breaking the entire UI in 2.0 by using a new frontend command.
> I'm just not sure that "Mercurial backwards" describes Git well.

I do not have any obsession with any name as long as it is
different from "git" to avoid confusion coming from older
documents that would be found by googling.  gh was just
shorthand for "git for humans" (and easy to type with index
fingers).  I think I listed a few other possibilities in my
previous message.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:32             ` Nicolas Pitre
  2006-11-15  5:35               ` Junio C Hamano
@ 2006-11-15  9:17               ` Andy Parkins
  2006-11-15  9:59                 ` Jakub Narebski
                                   ` (3 more replies)
  2006-11-15 12:15               ` Andreas Ericsson
  2006-11-16 13:58               ` Petr Baudis
  3 siblings, 4 replies; 1752+ messages in thread
From: Andy Parkins @ 2006-11-15  9:17 UTC (permalink / raw)
  To: git

On Wednesday 2006 November 15 04:32, Nicolas Pitre wrote:

> OK..... let's pretend this is my follow-up to your "If I were redoing

Personally, I agree with almost everything in this email.  Except the 
implementation of point 3.

> 3) remote branch handling should become more straight forward.

I was completely confused by this origin/master/clone stuff when I started 
with git.  In hindsight, now that I understand git a bit more, this is what I 
would have liked:

 * Don't use the name "origin" twice.  In fact, don't use it at all.  In a 
distributed system there is no such thing as a true origin.

 * .git/remotes/origin should be ".git/remotes/default".   "origin" is only 
special because it is the default to push and pull - it's very nice to have a 
default, but it should therefore be /called/ "default".

 * Whatever git-clone calls the remote, it should be matched by a directory 
in .git/refs/remotes.  So .git/remotes/$name contains "Pull"s to get all the 
remote branches to .git/refs/remotes/$name/*.   This implies that 
git /always/ does --use-separate-remote in clone.  If a branch is practically 
read-only it should be technically read-only too.

 * If clone really wants to have a non-read-only master, then that should 
be .git/refs/heads/master and will initialise 
to .git/refs/remotes/$name/master after cloning.  Personally I think this is 
dangerous because it assumes there is a "master" upstream - which git doesn't 
mandate at all.  Maybe it would be better to take the upstream HEAD and 
create a local branch for /that/ branch rather than require that it is 
called "master".

 * Ensuring we have /all/ upstream branches at a later date is hard, and not 
automatic.  Here is the .git/remotes/default file that should be possible:
    URL: git://host/project.git
    Pull: refs/heads/*:refs/remotes/default/*
   Now, every git-pull would check for new upstream branch refs and sync them 
into the local remotes list.  These are read-only so it'd be perfectly safe 
to delete any locally that no longer exist upstream.

 * git-clone should really just be a small wrapper around
    - git-init-db
    - create .git/remotes/default
    - maybe create specific .git/config
    - git-fetch default
   If git-clone does anything that can't be done with settings in the config 
and the remotes/default file then it's wrong.  The reason I say this is that 
as soon as git-clone has special capabilities (like --shared, --local 
and --reference) then you are prevented from doing magic with existing 
repositories.  For example; how do you create a repository that contains 
branches from two other local repositories that have the objects hard linked?
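
To make the wildcard Pull line suggested above concrete, here is a small 
Python sketch of how it could be expanded against the list of refs the 
remote advertises, and how tracking refs that vanished upstream could be 
spotted.  The function names are made up, and the sketch assumes the "*" 
only ever appears at the end of both sides of the Pull line, as in my 
example.

```python
from typing import Dict, Iterable, List


def expand_pull(spec: str, remote_refs: Iterable[str]) -> Dict[str, str]:
    """Map each matching remote ref to the local tracking ref it updates."""
    src, dst = spec.split(":", 1)
    if not (src.endswith("*") and dst.endswith("*")):
        # Plain Pull line: one remote ref, one local ref.
        return {src: dst}
    src_prefix, dst_prefix = src[:-1], dst[:-1]
    return {
        ref: dst_prefix + ref[len(src_prefix):]
        for ref in remote_refs
        if ref.startswith(src_prefix)
    }


def stale_tracking_refs(spec: str, remote_refs: Iterable[str],
                        local_refs: Iterable[str]) -> List[str]:
    """List local tracking refs under the pattern with no upstream ref left."""
    dst = spec.split(":", 1)[1]
    if not dst.endswith("*"):
        return []
    wanted = set(expand_pull(spec, remote_refs).values())
    prefix = dst[:-1]
    return sorted(r for r in local_refs
                  if r.startswith(prefix) and r not in wanted)
```

Because everything under refs/remotes/default/ is read-only, whatever 
stale_tracking_refs reports could be deleted without losing local work.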

While I'm writing wishes, I'd like to jump on Junio's integration with other 
fetch-backends wish.  I use git-svn, and it would be fantastic if I could 
replace:

git-svn init --id upstream/trunk svn://host/path/trunk
git-svn fetch --id upstream/trunk
git-svn init --id upstream/stable svn://host/path/branches/stable
git-svn fetch --id upstream/stable

With a .git/remotes/svn
 SVN-URL: svn://host/path
 Pull: trunk:refs/remotes/upstream/trunk
 Pull: branches/stable:refs/remotes/upstream/stable
and
 git fetch svn

Obviously, the syntax is just made up, but you get the idea.  Even better 
would be if it could cope with my "*" syntax suggested above:
 SVN-URL: svn://host/path
 Pull: trunk:refs/remotes/upstream/trunk
 Pull: branches/*:refs/remotes/upstream/*


There have been lots of "wishlist" posts lately; would it be useful if I tried 
to collect all these suggestions from various people into one place to try 
and get a picture of any consensus?



Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  9:17               ` Andy Parkins
@ 2006-11-15  9:59                 ` Jakub Narebski
  2006-11-15 10:33                   ` Andy Parkins
  2006-11-15 15:41                 ` Nicolas Pitre
                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-15  9:59 UTC (permalink / raw)
  To: git

Andy Parkins wrote:

>  * Don't use the name "origin" twice.  In fact, don't use it at all.  In a 
> distributed system there is no such thing as a true origin.

The remote 'origin' is the true origin of the repository: it is the
repository we cloned this repository from.

I agree that having a branch 'origin', at least in the most common case
of a multi-branch (multi-head) repository, is just confusing.

>  * Ensuring we have /all/ upstream branches at a later date is hard, and not 
> automatic.  Here is the .git/remotes/default file that should be possible:
>     URL: git://host/project.git
>     Pull: refs/heads/*:refs/remotes/default/*
>    Now, every git-pull would check for new upstream branch refs and sync them 
> into the local remotes list.  These are read-only so it'd be perfectly safe 
> to delete any locally that no longer exist upstream.

Very nice idea.
 
>  * git-clone should really just be a small wrapper around
>     - git-init-db
>     - create .git/remotes/default
>     - maybe create specific .git/config

I'm not sure about the "create .git/remotes/default" part. Isn't git moving
from the remotes file to having information about remotes (and branches) in
config?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:08           ` Petr Baudis
  2006-11-15  4:33             ` Junio C Hamano
@ 2006-11-15 10:05             ` Jakub Narebski
  2006-11-15 10:25               ` Karl Hasselström
  1 sibling, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-15 10:05 UTC (permalink / raw)
  To: git

Petr Baudis wrote:

>   (i) Clearly divided porcelain/plumbing interface, so that you can
> really isolate the two UI-wise; endless confusion reigns there now. Is
> git-update-index porcelain or plumbing? _You_ call git-merge a proper
> porcelain? From my perspective, git-update-ref is as plumbing as it
> gets, but it's classified as porcelain. Etc, etc. This would be by far
> the most important advantage.

The problem is that one man's plumbing is another man's porcelain.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:46               ` Nicolas Pitre
@ 2006-11-15 10:09                 ` Jakub Narebski
  2006-11-15 10:15                   ` Santi Béjar
  2006-11-15 14:56                   ` Nicolas Pitre
  0 siblings, 2 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-15 10:09 UTC (permalink / raw)
  To: git

Nicolas Pitre wrote:

> On Tue, 14 Nov 2006, Junio C Hamano wrote:
> 
>> Yes.  The current "merge" started its life as Linus's porcelain
>> (we did not have fetch and pull infrastructure back then) but
>> quickly has become just a helper for pull to produce a merge
>> commit.  If anybody thinks its UI is good as a general end-user
>> level command, there is a need for "head examination".
> 
> If you mean "git merge" it sure needs to be brought forward.  It can't 
> be clearer than:
> 
>       git-merge the_other_branch
> 
> or
> 
>       git-merge git://repo.com/time_machine.git
> 
> to instantaneously understand what is going on.

You mean

      git merge git://repo.com/time_machine.git#branch

don't you (perhaps with 'master' as default branch)?
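
A tiny sketch of that parsing rule, in Python just for illustration.
split_merge_source is a made-up name, and the default branch is left as
a parameter because which default is right is an open question:

```python
from typing import Tuple


def split_merge_source(arg: str, default_branch: str = "master") -> Tuple[str, str]:
    """Split "URL#branch" into (URL, branch), using a default when omitted."""
    url, sep, branch = arg.partition("#")
    if sep and branch:
        return url, branch
    return url, default_branch
```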

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 10:09                 ` Jakub Narebski
@ 2006-11-15 10:15                   ` Santi Béjar
  2006-11-15 10:28                     ` Jakub Narebski
  2006-11-15 14:56                   ` Nicolas Pitre
  1 sibling, 1 reply; 1752+ messages in thread
From: Santi Béjar @ 2006-11-15 10:15 UTC (permalink / raw)
  To: git

On 11/15/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Nicolas Pitre wrote:
>
> > On Tue, 14 Nov 2006, Junio C Hamano wrote:
> >
> >> Yes.  The current "merge" started its life as Linus's porcelain
> >> (we did not have fetch and pull infrastructure back then) but
> >> quickly has become just a helper for pull to produce a merge
> >> commit.  If anybody thinks its UI is good as a general end-user
> >> level command, there is a need for "head examination".
> >
> > If you mean "git merge" it sure needs to be brought forward.  It can't
> > be clearer than:
> >
> >       git-merge the_other_branch
> >
> > or
> >
> >       git-merge git://repo.com/time_machine.git
> >
> > to instantaneously understand what is going on.
>
> You mean
>
>       git merge git://repo.com/time_machine.git#branch
>
> don't you (perhaps with 'master' as default branch)?

perhaps with remote 'HEAD' as default branch?


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 10:05             ` Jakub Narebski
@ 2006-11-15 10:25               ` Karl Hasselström
  0 siblings, 0 replies; 1752+ messages in thread
From: Karl Hasselström @ 2006-11-15 10:25 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On 2006-11-15 11:05:26 +0100, Jakub Narebski wrote:

> The problem is that one man's plumbing is another man's porcelain.

No; that way lies insanitation.

-- 
Karl Hasselström, kha@treskal.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 10:15                   ` Santi Béjar
@ 2006-11-15 10:28                     ` Jakub Narebski
  2006-11-16  2:43                       ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-15 10:28 UTC (permalink / raw)
  To: git

Santi Béjar wrote:

> On 11/15/06, Jakub Narebski <jnareb@gmail.com> wrote:

>> You mean
>>
>>       git merge git://repo.com/time_machine.git#branch
>>
>> don't you (perhaps with 'master' as default branch)?
> 
> perhaps with remote 'HEAD' as default branch?

No! HEAD might change without your notice, and you want to know
which branch you merge. With remotes the default could be first
branch in the pull/fetch list, but with bare URL...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  9:59                 ` Jakub Narebski
@ 2006-11-15 10:33                   ` Andy Parkins
  2006-11-15 10:48                     ` Karl Hasselström
  0 siblings, 1 reply; 1752+ messages in thread
From: Andy Parkins @ 2006-11-15 10:33 UTC (permalink / raw)
  To: git

On Wednesday 2006 November 15 09:59, Jakub Narebski wrote:

> >  * Don't use the name "origin" twice.  In fact, don't use it at all.  In
> > a distributed system there is no such thing as a true origin.
>
> The remote 'origin' is the true origin of the repository: it is the
> repository we cloned this repository from.

But that is not necessarily /the/ original, and "origin" is the absolute 
reference in maths.  It doesn't bother me that much I suppose, it's just that 
as far as unambiguous names go, I'm not wild about it - it's got too 
many "central repository" connotations, which is of course anathema to git.


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 10:33                   ` Andy Parkins
@ 2006-11-15 10:48                     ` Karl Hasselström
  2006-11-15 11:28                       ` Andy Parkins
  0 siblings, 1 reply; 1752+ messages in thread
From: Karl Hasselström @ 2006-11-15 10:48 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

On 2006-11-15 11:33:55 +0100, Andy Parkins wrote:

> But that is not necessarily /the/ original, and "origin" is the
> absolute reference in maths. It doesn't bother me that much I
> suppose, it's just that as far as unambiguous names go, I'm not wild
> about it - it's got too many "central repository" connotations,
> which is of course anathema to git.

To me, "origin" just means "where <whatever we're talking about>
originated". If you think of it that way, it's perfectly obvious that
each repository can have its own origin.

-- 
Karl Hasselström, kha@treskal.com

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 10:48                     ` Karl Hasselström
@ 2006-11-15 11:28                       ` Andy Parkins
  0 siblings, 0 replies; 1752+ messages in thread
From: Andy Parkins @ 2006-11-15 11:28 UTC (permalink / raw)
  To: git

On Wednesday 2006 November 15 10:48, Karl Hasselström wrote:

> To me, "origin" just means "where <whatever we're talking about>
> originated". If you think of it that way, it's perfectly obvious that
> each repository can have its own origin.

Of course.  I wasn't saying that I didn't understand why origin was chosen.  
It's not a completely crazy name - it does have /a/ meaning.  However, it's 
not an unambiguous meaning.  What if the repository I clone was itself a 
clone?  What if the repository it cloned was pulling from three other 
repositories?  What if those three repositories pull/push from/to each other?

  * -- * -- *
   \   |   / \
    \  |  /  /
     \ | /  / 
       *   /
       |  / 
       | /
       * <--- "origin"
       |
       * <--- cloned repository

The name "origin" is too close to having an "ultimate source" feel to it IMO.  
In a distributed system, it's not the right idea to be pushing.  After the 
clone is complete, the "origin" is no more special than any other repository, 
and if you felt like it you could change the URL for "origin" and it would 
make very little difference to you.

In short: I don't think "origin" is wrong, I just think it's not right.


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIEE

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:32             ` Nicolas Pitre
  2006-11-15  5:35               ` Junio C Hamano
  2006-11-15  9:17               ` Andy Parkins
@ 2006-11-15 12:15               ` Andreas Ericsson
  2006-11-15 12:31                 ` Jakub Narebski
  2006-11-16 13:58               ` Petr Baudis
  3 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-11-15 12:15 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git, Andy Whitcroft, Carl Worth

Nicolas Pitre wrote:

[ axed a lot of stuff that I didn't fully grok ]

> 
> This becomes formalized as:
> 
> 	git_pull [<URL>] [<local_name>]
> 
> If <URL> includes a branch name then <local_name> is a single branch 
> name.  If <URL> doesn't include any branch name then <local_name> 
> becomes a local branch group name containing all branches in the remote 
> repository.

I would change that so "local_name" is always a branch group name, but 
branch group names can be used as refs. That is,

git pull startrek.com/kirk.git:master kirk

would always create the branch-head .git/refs/remote/kirk/master which 
for short can be referenced as just "kirk" (barring clashes ofc), so 
long as it only has one branch tracked.
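
A minimal sketch of that lookup rule, in Python rather than anything git 
actually ships. The function name resolve_ref and the dict layout are 
made up for illustration; the refs/remote/ prefix is taken from the path 
above.

```python
def resolve_ref(name, groups):
    """Resolve a short name to a full ref under refs/remote/.

    groups maps a branch group name to the list of branches it tracks,
    e.g. {"kirk": ["master"]}.  This layout is an assumption made for
    the sketch, not an existing git data structure.
    """
    if "/" in name:
        # Fully qualified "group/branch": must be a tracked branch.
        group, branch = name.split("/", 1)
        if branch not in groups.get(group, ()):
            raise LookupError(f"no such tracked branch: {name}")
        return f"refs/remote/{group}/{branch}"

    # Bare group name: only usable as a ref if exactly one branch is tracked.
    branches = groups.get(name, ())
    if len(branches) != 1:
        raise LookupError(f"'{name}' does not name exactly one branch")
    return f"refs/remote/{name}/{branches[0]}"
```

With a single tracked branch, "kirk" and "kirk/master" resolve to the 
same ref; once a second branch is tracked, the bare "kirk" is rejected 
as ambiguous.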

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 12:15               ` Andreas Ericsson
@ 2006-11-15 12:31                 ` Jakub Narebski
  0 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-15 12:31 UTC (permalink / raw)
  To: git

Andreas Ericsson wrote:

> Nicolas Pitre wrote:
> 
> [ axed a lot of stuff that I didn't fully grok ]
> 
>> 
>> This becomes formalized as:
>> 
>>      git_pull [<URL>] [<local_name>]
>> 
>> If <URL> includes a branch name then <local_name> is a single branch 
>> name.  If <URL> doesn't include any branch name then <local_name> 
>> becomes a local branch group name containing all branches in the remote 
>> repository.
> 
> I would change that so "local_name" is always a branch group name, but 
> branch group names can be used as refs. That is,
> 
> git pull startrek.com/kirk.git:master kirk

I'd rather use Cogito (not gitweb) notation startrek.com/kirk.git#master
This way we can change the name of local branch
   startrek.com/kirk.git#master:kirk
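
To make the notation concrete, here is a rough Python sketch of how such 
a spec could be split up. The name parse_spec is hypothetical, and 
falling back to 'master' when no branch is given is an assumption, not 
something Cogito is claimed to do.

```python
def parse_spec(spec):
    """Split 'URL#branch:local' into (url, branch, local_name).

    The '#' is split off first so that a ':' inside the URL itself
    (as in git://host/path) is left alone.
    """
    url, has_fragment, fragment = spec.partition("#")
    if not has_fragment:
        # No branch given: 'master' as the default is an assumption.
        return url, "master", None
    branch, _, local = fragment.partition(":")
    return url, branch or "master", local or None
```

So "startrek.com/kirk.git#master:kirk" gives the URL, the remote branch 
"master" and the local name "kirk", while a spec without ":name" leaves 
the local name unset.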
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  5:35               ` Junio C Hamano
  2006-11-15  6:18                 ` Shawn Pearce
@ 2006-11-15 14:01                 ` Johannes Schindelin
  2006-11-15 15:03                   ` Sean
  2006-11-15 15:10                   ` Nicolas Pitre
  1 sibling, 2 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-11-15 14:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git, Andy Whitcroft, Carl Worth

Hi,

On Tue, 14 Nov 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > 1) make "git init" an alias for "git init-db".
> 
> Or even better, have "gh init".

Please no. It only makes things even more confusing. "git init" is perfect 
as it is. We can always have internal aliases from "init-db" to "init" to 
account for older usages.

> > 2) "pull" and "push" should be symmetrical operations
> 
> I think that makes a lot of sense to have "gh pull" and "gh
> push" as symmetric operations, and make "gh merge" do the
> fast-forward and 3-way merge magic done in the current "git
> pull".  These three words would have a lot saner meaning.

I am really opposed to do "gh pull". Not only because of "gh" being 
completely confusing (we already _have_ "git", and for porcelains 
different TLAs), but "pull" _really_ is confusing by now. And Mercurial 
did not help one whit by insisting on their own interpretation.

Why not do something like "get/put" instead? It is

- easier to remember
- not bogus (AFAICT the meaning is not used in diametrical senses)
- shorter to type than download/upload

As for "git merge": Just by the number of arguments you can discern 
between the original usage and the new usage, so I am all in favour of 
replacing "git pull <blabla>" by "git merge <blabla>". Where "<blabla>" 
can be a branch or a remote or a URL (with cogito style #branchname).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 10:09                 ` Jakub Narebski
  2006-11-15 10:15                   ` Santi Béjar
@ 2006-11-15 14:56                   ` Nicolas Pitre
  1 sibling, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 14:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Wed, 15 Nov 2006, Jakub Narebski wrote:

> Nicolas Pitre wrote:
> 
> > On Tue, 14 Nov 2006, Junio C Hamano wrote:
> > 
> >> Yes.  The current "merge" started its life as Linus's porcelain
> >> (we did not have fetch and pull infrastructure back then) but
> >> quickly has become just a helper for pull to produce a merge
> >> commit.  If anybody thinks its UI is good as a general end-user
> >> level command, there is a need for "head examination".
> > 
> > If you mean "git merge" it sure needs to be brought forward.  It can't 
> > be clearer than:
> > 
> >       git-merge the_other_branch
> > 
> > or
> > 
> >       git-merge git://repo.com/time_machine.git
> > 
> > to instantaneously understand what is going on.
> 
> You mean
> 
>       git merge git://repo.com/time_machine.git#branch
> 
> don't you (perhaps with 'master' as default branch)?

Something like that.  I wanted to emphasize that the "merge" command 
should deal with, hey, merges.

I don't know if # is a good choice for branch indicator though.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 14:01                 ` Johannes Schindelin
@ 2006-11-15 15:03                   ` Sean
  2006-11-15 15:10                   ` Nicolas Pitre
  1 sibling, 0 replies; 1752+ messages in thread
From: Sean @ 2006-11-15 15:03 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Nicolas Pitre, git, Andy Whitcroft, Carl Worth

On Wed, 15 Nov 2006 15:01:47 +0100 (CET)
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> I am really opposed to do "gh pull". Not only because of "gh" being 
> completely confusing (we already _have_ "git", and for porcelains 
> different TLAs), but "pull" _really_ is confusing by now. And Mercurial 
> did not help one whit by insisting on their own interpretation.

This makes a lot of sense.  The "git" command isn't damaged so bad
that it can't be saved in a backward compatible way, at least for
a transition period.  Adding a new command name seems like a step
backward.
 
> Why not do something like "get/put" instead? It is
> 
> - easier to remember
> - not bogus (AFAICT the meaning is not used in diametrical senses)
> - shorter to type than download/upload
> 
> As for "git merge": Just by the number of arguments you can discern 
> between the original usage and the new usage, so I am all in favour of 
> replacing "git pull <blabla>" by "git merge <blabla>". Where "<blabla>" 
> can be a branch or a remote or a URL (with cogito style #branchname).

Both these ideas sound like a step in the right direction too.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 14:01                 ` Johannes Schindelin
  2006-11-15 15:03                   ` Sean
@ 2006-11-15 15:10                   ` Nicolas Pitre
  2006-11-15 18:16                     ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 15:10 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git, Andy Whitcroft, Carl Worth

On Wed, 15 Nov 2006, Johannes Schindelin wrote:

> On Tue, 14 Nov 2006, Junio C Hamano wrote:
> 
> > Nicolas Pitre <nico@cam.org> writes:
> > 
> > > 2) "pull" and "push" should be symmetrical operations
> > 
> > I think that makes a lot of sense to have "gh pull" and "gh
> > push" as symmetric operations, and make "gh merge" do the
> > fast-forward and 3-way merge magic done in the current "git
> > pull".  These three words would have a lot saner meaning.
> 
> I am really opposed to do "gh pull". Not only because of "gh" being 
> completely confusing (we already _have_ "git", and for porcelains 
> different TLAs), but "pull" _really_ is confusing by now. And Mercurial 
> did not help one whit by insisting on their own interpretation.

I completely agree that creating yet another command prefix for 
basically the same tools would be a disaster.  We have "git" already so 
let's stick to it and make its usage just more sane.

> Why not do something like "get/put" instead? It is
> 
> - easier to remember
> - not bogus (AFAICT the meaning is not used in diametrical senses)
> - shorter to type than download/upload

Well, of all compromises this is probably the best one so far.  I would 
have preferred to bite the bullet and fix "pull" instead of adding yet 
more commands.  But if the consensus is that there is no way on earth 
that "pull" can be salvaged then get/put is probably more enjoyable than 
download/upload.  This way pull/fetch/push could still be available 
(albeit buried somewhere out of sight).



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  9:17               ` Andy Parkins
  2006-11-15  9:59                 ` Jakub Narebski
@ 2006-11-15 15:41                 ` Nicolas Pitre
  2006-11-15 17:59                   ` Junio C Hamano
  2006-11-15 17:55                 ` Junio C Hamano
  2006-11-16  3:53                 ` Petr Baudis
  3 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 15:41 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

On Wed, 15 Nov 2006, Andy Parkins wrote:

> On Wednesday 2006 November 15 04:32, Nicolas Pitre wrote:
> 
> > OK..... let's pretend this is my follow-up to your "If I were redoing
> 
> Personally, I agree with almost everything in this email.  Except the 
> implementation of point 3.
> 
> > 3) remote branch handling should become more straight forward.
> 
> I was completely confused by this origin/master/clone stuff when I started 
> with git.  In hindsight, now I understand git a bit more, this is what I 
> would have liked:
> 
>  * Don't use the name "origin" twice.  In fact, don't use it at all.  In a 
> distributed system there is no such thing as a true origin.

I agree, sort of.  Not because "origin" is ambiguous as a name.  But 
rather because there is a magic translation from "master" to "origin", 
and I think this is wrong to do that.

As mentioned elsewhere (and let's start using "get" instead of "pull" as 
suggested by Johannes), a "get" should probably always create a branch 
group even if it contains only one branch.  This way the remote branch 
called "master" will still be called "master" locally, under the branch 
group used to represent the remote repository.  And if a local name is 
not provided then let's just call it "default".  This way, amongst the 
remote references, there would be a "default/master" that would be used 
when nothing else is provided by the user. So...

	git get repo.com/time_machine.git

would create a local branch named "remotes/default/master" if the remote 
repo has only a master branch.

Then, a simple:

	git merge

could be the same as

	git merge default

which would be equivalent to

	git merge default/master

Afterwards, because the "default" remote already exists, then:

	git get

would be the same as

	git get default

to get changes for all branches in the "default" remote branches, of 
which "master" might be the only one in the simple case.

But again I think it is important that the URL to use must be a per 
branch attribute i.e. attached to "default/master" and not just 
"default".  This way someone could add all branches of interest into the 
"default" group even if they're from different repositories, and a 
simple  get without any argument would get them all.
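
Roughly, in Python (only a sketch of the proposal, since neither "git 
get" nor these helpers exist; merge_target and get_plan are names 
invented here, and expanding a bare group to its "master" branch assumes 
the group has one):

```python
from collections import defaultdict


def merge_target(arg=None):
    """Expand the argument of the proposed 'git merge' to group/branch."""
    name = arg or "default"          # no argument means the "default" group
    return name if "/" in name else f"{name}/master"


def get_plan(group, branch_urls):
    """Work out what a bare 'git get <group>' would have to contact.

    branch_urls maps "group/branch" to the URL attached to that branch,
    which is the per-branch attribute argued for above.  Returns a dict
    of URL -> list of branches to get from it.
    """
    plan = defaultdict(list)
    for ref, url in branch_urls.items():
        ref_group, branch = ref.split("/", 1)
        if ref_group == group:
            plan[url].append(branch)
    return dict(plan)
```

With the URL stored per branch, one group can mix branches from several 
repositories and a single get of "default" still knows where each one 
comes from.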



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  9:17               ` Andy Parkins
  2006-11-15  9:59                 ` Jakub Narebski
  2006-11-15 15:41                 ` Nicolas Pitre
@ 2006-11-15 17:55                 ` Junio C Hamano
  2006-11-15 19:14                   ` Andy Parkins
  2006-11-16  3:53                 ` Petr Baudis
  3 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 17:55 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

>> 3) remote branch handling should become more straight forward.
>
> I was completely confused by this origin/master/clone stuff when I started 
> with git.  In hindsight, now I understand git a bit more, this is what I 
> would have liked:
>
>  * Don't use the name "origin" twice.  In fact, don't use it at all.  In a 
> distributed system there is no such thing as a true origin.
>
>  * .git/remotes/origin should be ".git/remotes/default".   "origin" is only 
> special because it is the default to push and pull - it's very nice to have a 
> default, but it should therefore be /called/ "default".

I think the naming is just a minor detail and can be overridden
with "clone --origin" already.  Renaming it to default is just
like making separate-remote the default to me -- it is fine as
long as it does not break people's expectations.

>  * If clone really wants to have a non-read-only master, then that should 
> be .git/refs/heads/master and will initialise 
> to .git/refs/remotes/$name/master after cloning.  Personally I think this is 
> dangerous because it assumes there is a "master" upstream - which git doesn't 
> mandate at all.  Maybe it would be better to take the upstream HEAD and 
> create a local branch for /that/ branch rather than require that it is 
> called "master".

I think the latter is what clone has done always; take remote's
HEAD and use that to initialize local master (there is no
confusion coming from multiple peer repositories because you
clone from only one place to initialize the repository -- that
one _is_ the origin), and we even keep the HEAD pointing at the
remote's master or whatever it points at at the remote.  Using
"$name" as an object name uses .git/refs/remotes/$name/HEAD.

>  * git-clone should really just be a small wrapper around
>...
> If git-clone does anything that can't be done with settings in the config 
> and the remotes/default file then it's wrong.  The reason I say this is that 
> as soon as git-clone has special capabilities (like --shared, --local 
> and --reference) then you are prevented from doing magic with existing 
> repositories.

That is not entirely true.  clone has convenience because people
asked.  It does not have to mean you are not allowed to give
similar convenience to other commands.  Patches?

> branches from two other local repositories that have the objects hard linked?

fetch by second local repository with git-local-fetch perhaps.

> There have been lots of "wishlist" posts lately; would it be
> useful if I tried to collect all these suggestions from
> various people into one place to try and get a picture of any
> consensus?

A list of common things wished by people certainly is a handy
thing to have.

A consensus would not write code and it generally does not take
technology into account to tell what is realistic and what is
not, so the result needs to be taken with a grain of salt,
though.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 15:41                 ` Nicolas Pitre
@ 2006-11-15 17:59                   ` Junio C Hamano
  2006-11-15 18:11                     ` Nicolas Pitre
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 17:59 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> But again I think it is important that the URL to use must be a per 
> branch attribute i.e. attached to "default/master" and not just 
> "default".  This way someone could add all branches of interest into the 
> "default" group even if they're from different repositories, and a 
> simple  get without any argument would get them all.

I think the "one group per one remote repository" model is a lot
easier to explain.  At least when I read your first "branch
group" proposal that was what I thought was going on and I found it
quite sensible (and it maps more or less straightforwardly to
the way existing .git/refs/remotes is set up by default).

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:20                   ` Nicolas Pitre
  2006-11-15  4:58                     ` Junio C Hamano
@ 2006-11-15 18:03                     ` Linus Torvalds
  2006-11-15 18:28                       ` Jakub Narebski
                                         ` (5 more replies)
  1 sibling, 6 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 18:03 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git



On Tue, 14 Nov 2006, Nicolas Pitre wrote:
> 
> But the fact is that HG (which has a growing crowd of happy campers, 
> maybe even larger than the BK crowd now) did work with and got used to a 
> sensible definition of what a "pull" is.

Guys, before you start thinking this way, the fact is, there's a lot of 
happy git users. 

So the reason for using "git pull" is

 - bk did it that way, and like it or not, bk was the first usable 
   distributed system. hg is totally uninteresting.

 - git itself has now done it that way for the last 18 months, and the 
   fact is, the people _complaining_ are a small subset of the people who 
   actually use git on a daily basis and don't complain.

So don't fall for the classic "second system syndrome". The classic reason 
for getting the second system wrong is because you focus on the issues 
people complain about, and not on the issues that work well (because the 
issues that work fine are obviously not getting a lot of attention).

If you think "pull" is confusing, I can guarantee you that _changing_ the 
name is a hell of a lot more confusing. In fact, I think a lot of the 
confusion comes from cogito, not from git - the fact that cogito used 
different names and different syntax was a mistake, I think.

And that '#' for branch naming in particular was (and is) total 
braindamage. The native git branch naming convention is just fundamentally 
much better, and allows you to very naturally fetch multiple branches at 
once, in a way that cogito's syntax does not.

So when I see suggestions of using that brain-damaged cogito syntax as an 
"improvement", I know for a fact that somebody hasn't thought things 
through, and only thinks it's a better syntax because of totally bogus 
reasons.

I do agree that we probably could/should re-use the "git merge" name. The 
current "git merge" is an esoteric internal routine, and I doubt a lot of 
people use it as-is. I don't think it would be a mistake to make "git 
merge" basically be an alias for "git pull", for example, and I doubt many 
people would really even notice.

But the fact is, git isn't really that hard to work out, and the commands 
aren't that complicated. There's no reason to rename them. We do have 
other problems:

 - default branch selection for merging is broken (it should definitely 
   take the current branch into account). When I do "git pull" with no 
   branch specification, and I happen to be on a branch that is associated 
   with something else than "master" in the remote, I shouldn't merge with 
   master.

 - I agree that having to create temporary branches to just look at a tag 
   that you don't want to actually develop on is just unnecessarily 
   bothersome.

But trying to rename "pull" (or the "git" name itself) is just going to 
cause more confusion than you fix.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 17:59                   ` Junio C Hamano
@ 2006-11-15 18:11                     ` Nicolas Pitre
  2006-11-16 13:21                       ` Karl Hasselström
  0 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 18:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, 15 Nov 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > But again I think it is important that the URL to use must be a per 
> > branch attribute i.e. attached to "default/master" and not just 
> > "default".  This way someone could add all branches of interest into the 
> > "default" group even if they're from different repositories, and a 
> > simple  get without any argument would get them all.
> 
> I think the "one group per one remote repository" model is a lot
> easier to explain.  At least when I read your first "branch
> group" proposal that was what I thought was going on and I found it
> quite sensible (and it maps more or less straightforwardly to
> the way existing .git/refs/remotes is set up by default).

I think one group per remote repo is how things should be by default 
too.  But we should not limit it to that if possible.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 15:10                   ` Nicolas Pitre
@ 2006-11-15 18:16                     ` Junio C Hamano
  2006-11-15 19:02                       ` Andy Parkins
  2006-11-16  0:23                       ` Han-Wen Nienhuys
  0 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 18:16 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

>> Why not do something like "get/put" instead? It is
>> 
>> - easier to remember
>> - not bogus (AFAICT the meaning is not used in diametrical senses)
>> - shorter to type than download/upload
>
> Well, of all compromises this is probably the best one so far.  I would 
> have preferred to bite the bullet and fix "pull" instead of adding yet 
> more commands.  But if the consensus is that there is no way on earth 
> that "pull" can be salvaged then get/put is probably more enjoyable than 
> download/upload.  This way pull/fetch/push could still be available 
> (albeit buried somewhere out of sight).

I still think in the long run you would be better off giving
separate names to Porcelains because I am sure you are going to
find the next command to "fix", you cannot suddenly change the
semantics of the command, and you soon run out of alternative
ways to name the action and you in addition have to explain the
differences between fetch and get to new users.  At least, with
"ig pull", you can dismiss all the broken git-x Porcelain-ish by
saying "Oh, git-x user-level commands had inconsistent semantics
and broken UI so do not use them anymore -- they are still there
only to help old timers transition.  The user level commands are
now called ig-x and ig stands for improved git".

But that's a very minor detail and can be fixed when we hit the
wall, so let's wait and see what happens.  Please consider my
gh/gu/cg/whatever dropped.

I think get/put is much better than suddenly changing what pull
means and is shorter to type than x-load; I am Ok with them.
Although I think these words are tainted by SCCS, I do not think
anybody cares.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:03                     ` Linus Torvalds
@ 2006-11-15 18:28                       ` Jakub Narebski
  2006-11-15 20:31                         ` Josef Weidendorfer
  2006-11-15 18:43                       ` Nicolas Pitre
                                         ` (4 subsequent siblings)
  5 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-15 18:28 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> But the fact is, git isn't really that hard to work out, and the commands 
> aren't that complicated. There's no reason to rename them. We do have 
> other problems:
> 
>  - default branch selection for merging is broken (it should definitely 
>    take the current branch into account). When I do "git pull" with no 
>    branch specification, and I happen to be on a branch that is associated 
>    with something else than "master" in the remote, I shouldn't merge with 
>    master.

This problem is _slightly_ mitigated by the branch.<name>.merge config
variable.  Slightly, because you have to specify the branch to merge,
instead of the merge being forbidden if you are not on a specific branch
(and you don't override it).
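
The difference between the two behaviours, as a small Python sketch. The 
config is modelled as a plain dict and pick_merge_ref is a made-up name; 
only the branch.<name>.merge key itself is real.

```python
def pick_merge_ref(current_branch, config, fallback=None):
    """Choose what a bare pull should merge into current_branch.

    config stands in for the repository config as a flat dict, e.g.
    {"branch.next.merge": "refs/heads/next"}.  If fallback is None the
    merge is refused when nothing is configured for the branch, instead
    of silently merging some default branch.
    """
    key = f"branch.{current_branch}.merge"
    if key in config:
        return config[key]
    if fallback is None:
        raise LookupError(f"no {key} configured; refusing to guess")
    return fallback
```

Passing a fallback such as "refs/heads/master" models always merging 
master when nothing is set; leaving it out models refusing the merge.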

>  - I agree that having to create temporary branches to just look at a tag 
>    that you don't want to actually develop on is just unnecessarily 
>    bothersome.

Agreed.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:03                     ` Linus Torvalds
  2006-11-15 18:28                       ` Jakub Narebski
@ 2006-11-15 18:43                       ` Nicolas Pitre
  2006-11-15 18:49                         ` Shawn Pearce
  2006-11-15 18:58                       ` Andy Parkins
                                         ` (3 subsequent siblings)
  5 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 18:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

On Wed, 15 Nov 2006, Linus Torvalds wrote:

> 
> 
> On Tue, 14 Nov 2006, Nicolas Pitre wrote:
> > 
> > But the fact is that HG (which has a growing crowd of happy campers, 
> > maybe even larger than the BK crowd now) did work with and got used to a 
> > sensible definition of what a "pull" is.
> 
> Guys, before you start thinking this way, the fact is, there's a lot of 
> happy git users. 
> 
> So the reason for using "git pull" is
> 
>  - bk did it that way, and like it or not, bk was the first usable 
>    distributed system. hg is totally uninteresting.
> 
>  - git itself has now done it that way for the last 18 months, and the 
>    fact is, the people _complaining_ are a small subset of the people who 
>    actually use git on a daily basis and don't complain.

Those arguments are somewhat flawed.  If we stick to "BK did it that way 
and it was first", then following that logic we would also carry a lot 
of CVS baggage because "CVS did it that way, and it was the most 
successful of its kind".  Still, we decided not to follow CVS nor BK in 
many ways already.

As for the fraction of people complaining being a small fraction of 
current GIT users: that is easily explainable by the fact that most 
people who would have grown the complainers group are simply not GIT 
users anymore since they were turned away by GIT's current user 
interface issues.  The only complainers remaining are those who see 
value in the GIT technology but who would like to bring more 
intuitiveness to the GIT interface instead of going for the alternative 
technology.  And those kind of people are always few.

> So don't fall for the classic "second system syndrome". The classic reason 
> for getting the second system wrong is because you focus on the issues 
> people complain about, and not on the issues that work well (because the 
> issues that work fine are obviously not getting a lot of attention).

The counterpart of that is the possibility to fall for the "ivory tower 
syndrome" where seasoned GIT users feel they are well satisfied with 
what is currently available and unwilling to consider changes that would 
reduce the barrier to entry for new users... simply because they are so 
used to the way things work that they can't see why others have problems 
with it.

> If you think "pull" is confusing, I can guarantee you that _changing_ the 
> name is a hell of a lot more confusing.

Agreed.  This is why the current discussion led to a proposition that 
allows for "pull" to remain as is but to have a "get" version that would 
be the alternate (saner) version.

> In fact, I think a lot of the 
> confusion comes from cogito, not from git - the fact that cogito used 
> different names and different syntax was a mistake, I think.
> 
> And that '#' for branch naming in particular was (and is) total 
> braindamage. The native git branch naming convention is just fundamentally 
> much better, and allows you to very naturally fetch multiple branches at 
> once, in a way that cogito's syntax does not.
> 
> So when I see suggestions of using that brain-damaged cogito syntax as an 
> "improvement", I know for a fact that somebody hasn't thought things 
> through, and only thinks it's a better syntax because of totally bogus 
> reasons.

Do you have comments on my proposed syntax (that would be implemented 
with a git-get command) which I think doesn't really look like cogito?

> I do agree that we probably could/should re-use the "git merge" name. The 
> current "git merge" is an esoteric internal routine, and I doubt a lot of 
> people use it as-is. I don't think it would be a mistake to make "git 
> merge" basically be an alias for "git pull", for example, and I doubt many 
> people would really even notice.

Agreed.

> But the fact is, git isn't really that hard to work out, and the commands 
> aren't that complicated.

I agree with you in general, except for the "pull" behavior which is 
really really odd.  Maybe it made sense in the BK context, maybe it is 
fine _once_ you get used to it, but otherwise it is really overloaded.

> But trying to rename "pull" (or the "git" name itself) is just going to 
> cause more confusion than you fix.

Agreed again.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:43                       ` Nicolas Pitre
@ 2006-11-15 18:49                         ` Shawn Pearce
  2006-11-15 19:05                           ` Marko Macek
  0 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-11-15 18:49 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Linus Torvalds, Junio C Hamano, git

Nicolas Pitre <nico@cam.org> wrote:
> As for the fraction of people complaining being a small fraction of 
> current GIT users: that is easily explainable by the fact that most 
> people who would have grown the complainers group are simply not GIT 
> users anymore since they were turned away by GIT's current user 
> interface issues.  The only complainers remaining are those who see 
> value in the GIT technology but who would like to bring more 
> intuitiveness to the GIT interface instead of going for the alternative 
> technology.  And those kind of people are always few.

Or they are by proxy.

*I* don't see that much of a problem with git pull; I can use it
without trouble at this point.  But I find it difficult to teach
to others.

My complaints about git pull/fetch/push are by proxy for about 10
other users who aren't on the mailing list but whom I interact with
through Git.  They don't like pull/fetch/push very much.

So count my complaints 10 times.  :)

Ok, that's still a drop in the bucket of current Git users.
But still, I'm sure there are others.  I think Carl was recently
talking about complaints from some Fedora folks...

-- 

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:03                     ` Linus Torvalds
  2006-11-15 18:28                       ` Jakub Narebski
  2006-11-15 18:43                       ` Nicolas Pitre
@ 2006-11-15 18:58                       ` Andy Parkins
  2006-11-15 19:18                         ` Linus Torvalds
  2006-11-15 19:32                         ` Junio C Hamano
  2006-11-16  1:14                       ` Theodore Tso
                                         ` (2 subsequent siblings)
  5 siblings, 2 replies; 1752+ messages in thread
From: Andy Parkins @ 2006-11-15 18:58 UTC (permalink / raw)
  To: git

On Wednesday 2006, November 15 18:03, Linus Torvalds wrote:

> Guys, before you start thinking this way, the fact is, there's a lot of
> happy git users.

I'm a happy user, doesn't mean I wouldn't like changes.  In fact, by that 
argument, that there are happy users means that there is no need to ever make 
changes.

>  - git itself has now done it that way for the last 18 months, and the
>    fact is, the people _complaining_ are a small subset of the people who
>    actually use git on a daily basis and don't complain.

That's awfully like the argument I hear off my bank whenever I complain to 
them too - "well lots of other people don't complain so we must be right".  
The people who complain are a subset of the people who have complaints.  I 
don't think never changing is a good argument - leaving aside the actual 
changes under discussion - in another 18 months let's say there are double the 
number of git users, and 18 months after that double again - in that case the 
potential new users' needs outweigh the current users' needs.

> If you think "pull" is confusing, I can guarantee you that _changing_ the
> name is a hell of a lot more confusing. In fact, I think a lot of the

> But the fact is, git isn't really that hard to work out, and the commands

On the one hand you're arguing that git syntax is easy to learn, and on the 
other that no one will be able to learn a new syntax just as easily.

> aren't that complicated. There's no reason to rename them. We do have
> other problems:

That there are other problems doesn't negate these problems.

> But trying to rename "pull" (or the "git" name itself) is just going to
> cause more confusion than you fix.

I don't think so.  Mainly because the proposed new git pull would be a subset 
of the existing git pull.  It's not changing function, it's just reducing in 
function.


Andy
-- 
Dr Andrew Parkins, M Eng (Hons), AMIEE

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:16                     ` Junio C Hamano
@ 2006-11-15 19:02                       ` Andy Parkins
  2006-11-15 19:41                         ` Junio C Hamano
  2006-11-16  0:23                       ` Han-Wen Nienhuys
  1 sibling, 1 reply; 1752+ messages in thread
From: Andy Parkins @ 2006-11-15 19:02 UTC (permalink / raw)
  To: git

On Wednesday 2006, November 15 18:16, Junio C Hamano wrote:

> I still think in the long run you would be better off giving
> separate names to Porcelains because I am sure you are going to

The problem I think with that is that the line between plumbing and porcelain 
is not clear.  If you have two names then for the ambiguous ones you are just 
making it more confusing because there is yet another variable to try before 
you get the function you want.



Andy

-- 
Dr Andrew Parkins, M Eng (Hons), AMIEE

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:49                         ` Shawn Pearce
@ 2006-11-15 19:05                           ` Marko Macek
  2006-11-15 20:41                             ` Junio C Hamano
                                               ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Marko Macek @ 2006-11-15 19:05 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Linus Torvalds, Junio C Hamano, git, cworth, pasky

Shawn Pearce wrote:
> Nicolas Pitre <nico@cam.org> wrote:
>> As for the fraction of people complaining being a small fraction of 
>> current GIT users: that is easily explainable by the fact that most 
>> people who would have grown the complainers group are simply not GIT 
>> users anymore since they were turned away by GIT's current user 
>> interface issues.  The only complainers remaining are those who see 
>> value in the GIT technology but who would like to bring more 
>> intuitiveness to the GIT interface instead of going for the alternative 
>> technology.  And those kind of people are always few.
> 
> Or they are by proxy.
> 
> *I* don't see that much of a problem with git pull; I can use it
> without trouble at this point.  But I find it difficult to teach
> to others.
> 
> My complaints about git pull/fetch/push are by proxy for about 10
> other users who aren't on the mailing list but whom I interact with
> through Git.  They don't like pull/fetch/push very much.
> 
> So count my complaints 10 times.  :)
> 
> Ok, that's still a drop in the bucket of current Git users.
> But still, I'm sure there are others.  I think Carl was recently
> talking about complaints from some Fedora folks...

Agreed. Personally, the first thing that I notice when trying to switch
from Subversion to git is the behavior of 'index', mainly in git-diff,
git-status and git-commit.

For people switching from CVS and SVN it would be much better if the index was hidden 
behind the scenes by using different defaults:

git-commit -a
git-status -a
git-diff HEAD

BTW, currently there's a minor bug: git-diff HEAD doesn't work before you 
make the first commit. Perhaps this should be special cased.

I could personally get used to this, but I'd surely get blank 
stares from people when teaching them the difference.
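
A minimal sketch of the difference in question, assuming a current git
and a throwaway repository (the directory and file names are invented
for the demo, and --no-color is only there to keep the output easy to
grep). One change is staged in the index and one is not, so the three
forms of diff each report something different:

```shell
#!/bin/sh
# Sketch: what the index changes about diff and commit.
# "demo" and "notes.txt" are hypothetical names for this example.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "initial"

echo "line 2" >> notes.txt   # first change...
git add notes.txt            # ...is staged in the index
echo "line 3" >> notes.txt   # second change stays in the working tree only

# Working tree vs index: only the unstaged change.
unstaged=$(git diff --no-color | grep -c '^+line')
# Index vs HEAD: only the staged change.
staged=$(git diff --no-color --cached | grep -c '^+line')
# Working tree vs HEAD: both changes, which is what a CVS/SVN user expects.
both=$(git diff --no-color HEAD | grep -c '^+line')
echo "unstaged=$unstaged staged=$staged both=$both"

# A plain commit records the index only; "commit -a" would take both.
git commit -q -m "staged change only"
left=$(git diff --no-color HEAD | grep -c '^+line')
echo "added lines still uncommitted: $left"
```

After the plain commit, "line 3" is still uncommitted, which is the
surprise a newcomer from CVS or SVN is likely to run into.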

I guess this is the reason that the GIT Tutorial for CVS/SVN users talks about _cogito_ instead
(which is very confusing for someone coming to the _git_ home page and trying to learn git).


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 17:55                 ` Junio C Hamano
@ 2006-11-15 19:14                   ` Andy Parkins
  0 siblings, 0 replies; 1752+ messages in thread
From: Andy Parkins @ 2006-11-15 19:14 UTC (permalink / raw)
  To: git

On Wednesday 2006, November 15 17:55, Junio C Hamano wrote:

> I think the latter is what clone has done always; take remote's
> HEAD and use that to initialize local master (there is no

It's this sort of thing that is confusing though - the remote HEAD branch 
could be anything, and yet that is made to be origin locally as a tracking 
branch and then master as the writable branch.  What if upstream /has/ a 
master but "next" is its HEAD?  You'd then get

 next:remotes/origin
 master:remotes/master

Then a local master which is actually upstream next!  Oh dear.

I may well have misunderstood what you've said above about clone always 
initialising master from remote's HEAD; if so please disregard what I'm 
saying.

> > that as soon as git-clone has special capabilities (like --shared,
> > --local and --reference) then you are prevented from doing magic with
> > existing repositories.
>
> That is not entirely true.  clone has convenience because people
> asked.  It does not have to mean you are not allowed to give
> similar convenience to other commands.  Patches?

Absolutely, that was why I said clone shouldn't have special abilities.  In 
fact, if you're willing you don't need clone at all; you just need 
git-init-db and to write the correct remotes file.  

> > branches from two other local repositories that have the objects hard
> > linked?
>
> fetch by second local repository with git-local-fetch perhaps.

Is that not plumbing?  I thought this was about porcelain.

> A consensus would not write code and it generally does not take
> technology into account to tell what is realistic and what is
> not, so the result needs to be taken with a grain of salt,
> though.

Of course, I only suggested it because the same suggestions were popping up 
multiple times.  Anyway; I put it in the GitWiki at 
http://git.or.cz/gitwiki/Wishlist

Andy

-- 
Dr Andrew Parkins, M Eng (Hons), AMIEE

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:58                       ` Andy Parkins
@ 2006-11-15 19:18                         ` Linus Torvalds
  2006-11-15 19:39                           ` Michael K. Edwards
  2006-11-16  1:40                           ` Anand Kumria
  2006-11-15 19:32                         ` Junio C Hamano
  1 sibling, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 19:18 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git



On Wed, 15 Nov 2006, Andy Parkins wrote:
>
> On the one hand you're arguing that git syntax is easy to learn, and on the 
> other that no one will be able to learn a new syntax just as easily.

I'm saying that people who are new to git will _have_ to learn new 
concepts ANYWAY.

I don't think the naming is the hard part. 

The fact is, git is one of the very few (essentially _only_) SCM's that 
make it very clear that all real operations are local and that if you want 
to work with other repositories, you have to "fetch" those into local 
branches first. The fact that "pull" exists at all is really just 
shorthand.

If people have trouble explaining this to others, and have trouble 
grasping "pull", then I will bet that the _real_ issue has nothing at all 
to do with naming at all, and the real issue is that people are being 
_taught_ the concepts in the wrong order.

Before you learn "pull", you should learn "fetch". Don't even _mention_ 
"pull" until the person got what "fetch" means. Because the fact is, 
"fetch" is really the much more fundamental operation, and once you 
really understand what "fetch" does, "pull" is obvious.

So I'll argue that the problem isn't naming, the "problem" is really that 
git has a few fundamental concepts that people aren't used to. The most 
fundamental of those is the notion of the local branch-space. EVERY other 
(broken) SCM has branches as being some kind of totally idiotic separate 
subdirectories, or doesn't really support branches at all (ie neither BK 
nor CVS really support "branches" - even if a concept of that name exists 
in CVS, it has nothing at all in common with the git model of branches).

But once you understand branches, and understand "fetch" (and it really 
isn't _that_ complicated: fetch really does exactly what the name says, so 
if you understand local branches, you will understand "fetch"), then it's 
a much smaller step to explain "pull = fetch + merge".

But I bet people don't teach it that way. They _start_ by teaching "pull". 
Right?
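
A minimal sketch of "pull = fetch + merge", assuming a current git and
throwaway repositories (all repository and file names are invented for
the demo, and --no-rebase is only there so that newer git versions do
not stop to ask how to reconcile branches). Two clones start from the
same state; one is updated step by step, the other with the shorthand:

```shell
#!/bin/sh
# Sketch: "pull" as shorthand for "fetch" followed by "merge".
# "upstream", "by-fetch" and "by-pull" are hypothetical names.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in upstream repository with one commit on master.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
cd ..

# Two clones that start from the same state.
git clone -q upstream by-fetch
git clone -q upstream by-pull

# Upstream moves ahead by one commit.
cd upstream
echo two >> file.txt
git commit -q -a -m "second"
upstream_head=$(git rev-parse HEAD)
cd ..

# Step by step: fetch updates the local copy of the remote branch,
# merge then brings that into the current branch.
cd by-fetch
git fetch -q origin
git merge -q origin/master
fetch_head=$(git rev-parse HEAD)
cd ..

# Shorthand: pull does both steps in one command.
cd by-pull
git pull -q --no-rebase origin master
pull_head=$(git rev-parse HEAD)
cd ..

echo "fetch+merge: $fetch_head"
echo "pull:        $pull_head"
```

Both clones end up on the same commit as upstream, which is the sense
in which "pull" adds no new concept once "fetch" is understood.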


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:58                       ` Andy Parkins
  2006-11-15 19:18                         ` Linus Torvalds
@ 2006-11-15 19:32                         ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 19:32 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

>> But trying to rename "pull" (or the "git" name itself) is just going to
>> cause more confusion than you fix.
>
> I don't think so.  Mainly because the proposed new git pull would be a subset 
> of the existing git pull.  It's not changing function, it's just reducing in 
> function.

We usually use the word "regression" to refer to that kind of
change.

I think it makes a lot of sense having command x that does
essentially the same thing as the current fetch but with more
usability enhancements and more convention as built-in defaults,
and another command y that does what the current 'pull .' does
but with more usability enhancements and more convention as
built-in defaults.  I agree that kind of UI improvements would
make it easier to explain to new people.  Calling x "pull",
however, breaks the existing users and documents, and causes
confusion.  I really do not think you can argue with that.

That's why we are talking about using an uncontaminated word
"get".  I think it is a good effort.

>> aren't that complicated. There's no reason to rename them. We do have
>> other problems:
>
> That there are other problems doesn't negate these problems.

And I think Linus is right in pointing out that there are other
problems that are equally or even more pressing than _renaming_
to break things for existing users.

I personally do not think the current fetch/pull confusing, and
I do see real downside in _renaming_ them, but I am open to the
current get/put discussion because I think the new commands'
semantics may be designed to match newcomers' expectation better
(it's to match tools to newcomers instead of teaching them the
new language of the land) and I do not think that approach would
break existing users and documents.

For some things "matching tools to newcomers" would not really
work, though.  For example, I do not think you can get away with
hiding index forever if you want your users to do real work in a
workflow that involves merging and cherry picking.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 19:18                         ` Linus Torvalds
@ 2006-11-15 19:39                           ` Michael K. Edwards
  2006-11-15 20:09                             ` Linus Torvalds
  2006-11-16  1:40                           ` Anand Kumria
  1 sibling, 1 reply; 1752+ messages in thread
From: Michael K. Edwards @ 2006-11-15 19:39 UTC (permalink / raw)
  To: git

On 11/15/06, Linus Torvalds <torvalds@osdl.org> wrote:
> But once you understand branches, and understand "fetch" (and it really
> isn't _that_ complicated: fetch really does exactly what the name says, so
> if you understand local branches, you will understand "fetch"), then it's
> a much smaller step to explain "pull = fetch + merge".
>
> But I bet people don't teach it that way. They _start_ by teaching "pull".
> Right?

"git fetch" is certainly the right thing for the platform integration
role, in which one is trying to maintain a series of integration
branches which track the bleeding edge of some subsystems while
keeping the core stable on each branch.  This is not as impossible as
people make it out to be, but there certainly isn't much place for
automatic merges to _persistent_ branches.

It's fundamentally a backporting and cherry-picking effort, and the
git workflow puts it where it belongs: in the local repository, where
_transient_ branches can and should be created and destroyed casually
to track exploratory efforts.  These may include automatic merges and
even cruder techniques (git diff, hack on patch, apply patch).  Once
you figure out which bits you actually want to backport, you go back
to a fresh branch and cherry-pick the same bits with the tool instead
of manually, so that there is less noise in future merges.  When
you've tested a little, you merge this branch to the persistent branch
that other repositories track.

Cheers,

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 19:02                       ` Andy Parkins
@ 2006-11-15 19:41                         ` Junio C Hamano
  2006-11-15 20:15                           ` Nicolas Pitre
  2006-11-15 20:19                           ` Carl Worth
  0 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 19:41 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

> On Wednesday 2006, November 15 18:16, Junio C Hamano wrote:
>
>> I still think in the long run you would be better off giving
>> separate names to Porcelains because I am sure you are going to
>
> The problem I think with that is that the line between plumbing and porcelain 
> is not clear.

This is moot because we (at least tentatively) agreed not to do
"gh" or "ig" or whatever, but I do not understand why you feel
so.

If we had a separate Porcelain namespace (say "ng" for "new
git") you would know "ng-commit" is not a Plumbing and when you
are writing a Porcelain script you would stay away from using it
in your script.

In the longer term, when the new Porcelain UI Nico and friends
are designing matures, and if it makes everybody (including
existing users who learned git-* Porcelain-ish during 18-months
process) happy, we could gradually deprecate and eventually
remove the git-* Porcelain-ish over time, at that point we would
have a very clear line between plumbing and porcelain.

But that would not be a flag-day change.  During the transition
period you cannot mechanically tell if git-foo is a plumbing or
a porcelain just like you cannot do so now.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 19:39                           ` Michael K. Edwards
@ 2006-11-15 20:09                             ` Linus Torvalds
  2006-11-15 20:21                               ` Nicolas Pitre
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 20:09 UTC (permalink / raw)
  To: Michael K. Edwards; +Cc: git



On Wed, 15 Nov 2006, Michael K. Edwards wrote:
> > 
> > But I bet people don't teach it that way. They _start_ by teaching "pull".
> > Right?
> 
> "git fetch" is certainly the right thing for the platform integration
> role

I'm saying that even if you _never_ end up using "git fetch" ever again 
(because in practice you always want to do a "fetch + merge == pull"), 
people who teach others the concepts and usage of git should probably 
start by talking about "git fetch".

Then, when the user says (and he obviously will say this) "but I don't 
want to just fetch the other persons work into some local branch, I want 
to actually get it into _my_ branch", you say "Ahhah!" and talk about how 
"pull" is a shorthand for first fetching and then merging the result into 
the current branch.

See? Once you explain "fetch" to somebody, I can pretty much guarantee 
that they'll explain "pull" to themselves without you having to even work 
at it. And then they'll probably happily use "pull" ever after, and never 
worry about fetch, but now they'll understand the _concepts_.

It's only if you start the other way around that "pull" vs "fetch" vs 
"push" become confusing. If you _start_ by explaining branches (and you 
might use "gitk --all" on a small project as a visualization tool), 
suddenly the concepts aren't all that complicated.

Sure, then you have to remember two words ("pull" vs "fetch"), but I'm 
pretty sure that the thing that makes people confused is not the words 
themselves, but their lack of understanding of the concepts behind them.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  2:10                 ` Junio C Hamano
  2006-11-15  2:27                   ` Michael K. Edwards
  2006-11-15  4:20                   ` Nicolas Pitre
@ 2006-11-15 20:12                   ` Petr Baudis
  2006-11-15 20:26                     ` Nicolas Pitre
  2 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-11-15 20:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git

On Wed, Nov 15, 2006 at 03:10:16AM CET, Junio C Hamano wrote:
> You have to admit both pull and fetch have been contaminated
> with loaded meanings from different backgrounds. I was talking
> about killing the source of confusion in the longer term by
> removing fetch/pull/push, so we are still on the same page.

How was/is fetch contaminated?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 19:41                         ` Junio C Hamano
@ 2006-11-15 20:15                           ` Nicolas Pitre
  2006-11-15 20:19                           ` Carl Worth
  1 sibling, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 20:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andy Parkins, git

On Wed, 15 Nov 2006, Junio C Hamano wrote:

> If we had a separate Porcelain namespace (say "ng" for "new
> git") you would know "ng-commit" is not a Plumbing and when you
> are writing a Porcelain script you would stay away from using it
> in your script.

There is merit in trying to segregate porcelain vs plumbing... at least 
in theory.  In practice though I don't think this is something we should 
absolutely strive for.

Why? Because something is always going to fail the categorization.  
Sure there are commands that are pure plumbing like git-commit-tree, 
etc.  Some are pure porcelain like git-commit or git-log.  Yet we use 
git-log's output for git-shortlog.  Does it mean that git-log is 
plumbing? Also I have a script here that uses git-commit directly 
because it is so much more convenient rather than futzing with the really 
bare plumbing.  I don't think git-commit should be prevented from being 
used within another script even if it is classified as porcelain.

So we have that notion of plumbing vs porcelain but in practice there is 
a whole spectrum between those two poles and I think it is a good thing.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 19:41                         ` Junio C Hamano
  2006-11-15 20:15                           ` Nicolas Pitre
@ 2006-11-15 20:19                           ` Carl Worth
  2006-11-15 21:13                             ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-11-15 20:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andy Parkins, git

[-- Attachment #1: Type: text/plain, Size: 1996 bytes --]

On Wed, 15 Nov 2006 11:41:20 -0800, Junio C Hamano wrote:
> Andy Parkins <andyparkins@gmail.com> writes:
> > On Wednesday 2006, November 15 18:16, Junio C Hamano wrote:
> >
> >> I still think in the long run you would be better off giving
> >> separate names to Porcelains because I am sure you are going to
> >
> > The problem I think with that is that the line between plumbing and porcelain
> > is not clear.
>
> This is moot because we (at least tentatively) agreed not to do
> "gh" or "ig" or whatever, but I do not understand why you feel
> so.

I'm not the original poster, but I feel the same way about the line
being unclear.

Here's a real-world example from last week.

For cairo I wrote a little script that takes two revspecs, (or one in
which case its first parent is used), and it goes off and checks out
both versions, builds each, runs a performance test on each, and then
generates a report showing the performance impact.

So now I can do things like:

	# What's the performance impact of my latest change:
	cairo-perf-diff HEAD

	# Have my last few changes helped as much as I'd hoped:
	cairo-perf-diff HEAD~3 HEAD

	# How has performance changed since our last stable release:
	cairo-perf-diff 1.2.6 HEAD

Anyway, when I announced this I also mentioned how easily someone
might generate an entire series of reports for a series of
commits. The command I gave as an example is:

	for rev in $(git rev-list 1.2.6..HEAD); do
	    cairo-perf-diff $rev
	done

I think that's a perfectly legitimate one-liner for users to use, and
it really shows off the easy-scriptability of git. But certainly, no
"new porcelain" author is going to consider rev-list to be porcelain
rather than plumbing, right? So as soon as I start teaching people to
do useful stuff like this, they might have to reach down into the
"scary" git interface.

I think we're much better off just having one "git" namespace for the
standard command-line interface, and then making it as easy to use as
possible.
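
The rev-list loop above can be tried without cairo-perf-diff. The
following is only a sketch under stated assumptions: the repository,
the "v1" tag and the file name are invented for the demo, and printing
each commit's subject stands in for whatever per-commit report a real
script would generate:

```shell
#!/bin/sh
# Sketch: run something once per commit in a range, using rev-list.
# "demo", "v1" and "data.txt" are hypothetical names.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# One tagged "release" commit followed by three more changes.
echo 0 > data.txt
git add data.txt
git commit -q -m "release"
git tag v1
for n in 1 2 3; do
    echo "$n" >> data.txt
    git commit -q -a -m "change $n"
done

# rev-list prints the commits in v1..HEAD, newest first.
count=0
subjects=""
for rev in $(git rev-list v1..HEAD); do
    subject=$(git log -1 --format=%s "$rev")
    echo "$rev $subject"
    subjects="$subjects$subject;"
    count=$((count + 1))
done
echo "visited $count commits"
```

The loop body is the only part that would change for a real report;
the range notation and rev-list do the rest.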

-Carl


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 20:09                             ` Linus Torvalds
@ 2006-11-15 20:21                               ` Nicolas Pitre
  2006-11-15 20:40                                 ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 20:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Michael K. Edwards, git

On Wed, 15 Nov 2006, Linus Torvalds wrote:

> I'm saying that even if you _never_ end up using "git fetch" ever again 
> (because in practice you always want to do a "fetch + merge == pull"), 
> people who teach others the concepts and usage of git should probably 
> start by talking about "git fetch".
> 
> Then, when the user says (and he obviously will say this) "but I don't 
> want to just fetch the other persons work into some local branch, I want 
> to actually get it into _my_ branch", you say "Ahhah!" and talk about how 
> "pull" is a shorthand for first fetching and then merging the result into 
> the current branch.

Actually I believe it would make things even clearer if "merge" was 
taught at that point.  Only when the user is comfortable with the 
separate notions of fetching and merging might the pull shorthand 
possibly be mentioned.



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 20:12                   ` Petr Baudis
@ 2006-11-15 20:26                     ` Nicolas Pitre
  2006-11-15 20:50                       ` Linus Torvalds
  2006-11-16  1:51                       ` Anand Kumria
  0 siblings, 2 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 20:26 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, git

On Wed, 15 Nov 2006, Petr Baudis wrote:

> On Wed, Nov 15, 2006 at 03:10:16AM CET, Junio C Hamano wrote:
> > You have to admit both pull and fetch have been contaminated
> > with loaded meanings from different backgrounds. I was talking
> > about killing the source of confusion in the longer term by
> > removing fetch/pull/push, so we are still on the same page.
> 
> How was/is fetch contaminated?

I think "fetch" is sane.  Its only problem is a missing symmetrical 
counterpart verb, like "get" and "put".



^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 18:28                       ` Jakub Narebski
@ 2006-11-15 20:31                         ` Josef Weidendorfer
  2006-11-15 20:35                           ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Josef Weidendorfer @ 2006-11-15 20:31 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Linus Torvalds, Nicolas Pitre, Junio C Hamano

On Wednesday 15 November 2006 19:28, you wrote:
> Linus Torvalds wrote:
> 
> > But the fact is, git isn't really that hard to work out, and the commands 
> > aren't that complicated. There's no reason to rename them. We do have 
> > other problems:
> > 
> >  - default branch selection for merging is broken (it should definitely 
> >    take the current branch into account). When I do "git pull" with no 
> >    branch specification, and I happen to be on a branch that is associated 
> >    with something else than "master" in the remote, I shouldn't merge with 
> >    master.
> 
> This problem is _slightly_ mitigated by branch.<name>.merge config variable.
> Slightly because you have to specify branch to merge, instead of forbidding
> merge if you are not on specific branch (and you don't override it).

We should change this.

The problem is that whatever is the first Pull line in remotes config gets
merged by default into current branch, which most often is not the right
thing to do.

Often, I find myself doing "git branch" just to make sure that I am on
"master", so that a following pull does not do a bogus merge.

Can we please disable this behavior, e.g. by allowing a fake first
Pull line like "Pull: (not-for-merge)" to prohibit any merge?

This even could be written by default in git-clone somewhere in the future,
and we suddenly get the behavior of pull being symmetric to push - at least
by default. And still, it is fully compatible to existing repositories.

To make pull do the right thing, we _have_ to configure branch.<name>.merge
whenever we create a new branch (which matters for git-clone, too).
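
A minimal sketch of configuring branch.<name>.merge by hand, assuming
a current git (the repository names and the branch names "next" and
"topic" are invented for the demo; --no-track is used so that git does
not write the configuration itself, and --no-rebase only keeps newer
versions from asking how to reconcile branches). With the two config
entries in place, a bare "git pull" on topic merges upstream's next
rather than master:

```shell
#!/bin/sh
# Sketch: make a bare "git pull" merge the matching remote branch.
# "upstream", "local", "next" and "topic" are hypothetical names.
set -e
work=$(mktemp -d)
cd "$work"

# Upstream with two branches, master and next.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch next
cd ..

git clone -q upstream local

# Upstream's next moves ahead; master stays where it was.
cd upstream
git checkout -q next
echo more >> file.txt
git commit -q -a -m "work on next"
next_head=$(git rev-parse next)
master_head=$(git rev-parse master)
cd ..

# Local branch based on origin/next, with the merge source set explicitly.
cd local
git checkout -q --no-track -b topic origin/next
git config branch.topic.remote origin
git config branch.topic.merge refs/heads/next

# No branch given: pull consults branch.topic.* to decide what to merge.
git pull -q --no-rebase
topic_head=$(git rev-parse topic)
cd ..
echo "topic is now at $topic_head"
```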

Josef

> 
> >  - I agree that having to create temporary branches to just look at a tag 
> >    that you don't want to actually develop on is just unnecessarily 
> >    bothersome.
> 
> Agreed.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 20:31                         ` Josef Weidendorfer
@ 2006-11-15 20:35                           ` Petr Baudis
  2006-11-15 21:12                             ` Josef Weidendorfer
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-11-15 20:35 UTC (permalink / raw)
  To: Josef Weidendorfer
  Cc: Jakub Narebski, git, Linus Torvalds, Nicolas Pitre,
	Junio C Hamano

On Wed, Nov 15, 2006 at 09:31:13PM CET, Josef Weidendorfer wrote:
> Often, I find myself doing "git branch" just to make sure that I am on
> "master", so that a following pull does not do a bogus merge.
> 
> Can we please disable this behavior, e.g. by allowing a fake first
> Pull line like "Pull: (not-for-merge)" to prohibit any merge?

Wait, if you don't want pull to merge, why do you pull and not fetch?

(Disclaimer: I'm not intimately familiar with git pull/fetch and I
didn't read the whole thread yet.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15  4:33             ` Junio C Hamano
  2006-11-15  4:46               ` Nicolas Pitre
@ 2006-11-15 20:39               ` Petr Baudis
  1 sibling, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-15 20:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Nov 15, 2006 at 05:33:03AM CET, Junio C Hamano wrote:
> Petr Baudis <pasky@suse.cz> writes:
> >   (v) Git would be properly libified by now. If you wanted to convert
> > bits of porcelain to C, it would be at least much higher priority.
> 
> I am not sure about "libified" part and I do not know what bits
> of porcelain wants to become C right now.  But I do not think
> this point is important part of your list.

Merge strategies. Or wait, is that already plumbing?

Or git-status. git-add. Plenty more.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-15 20:21                               ` Nicolas Pitre
@ 2006-11-15 20:40                                 ` Linus Torvalds
  2006-11-15 21:08                                   ` Carl Worth
  2006-11-16  4:26                                   ` Theodore Tso
  0 siblings, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 20:40 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Michael K. Edwards, git



On Wed, 15 Nov 2006, Nicolas Pitre wrote:
> 
> Actually I believe it would make things even clearer if "merge" was 
> taught at that point.  Only when the user is comfortable with the 
> separate notions of fetching and merging might the pull shorthand 
> possibly be mentioned.

I agree. I just expect that "merge" is such a simple concept that it 
doesn't really need a whole lot of explaining. 

People kind of expect merging to be hard, but I think it's because CVS et 
al have taught people that merging is _painful_. I don't think it's a very
complicated concept per se, especially if you have explained branches with 
gitk already.

But yes, the order should be:

 (a) explain what "branches" mean in git (and in that situation, "fetch" 
     is very natural - I think fetching itself is probably easier to 
     explain than "branches" are).
 (b) once you've explained branches, the notion of "merge" comes next, and 
     I _think_ that is very obvious. This is where UI issues come in, 
     because "git merge" is really a totally internal program with a 
     pretty horrid UI, but I think we could fix the syntax, and even with 
     the current syntax you can really just gloss it over, because nobody 
     is really going to care.
 (c) once "fetching branches" and "merging" have been explained, "pull" is 
     really pretty damn trivial, and in fact, if you then explain that 
     it's just easier to do "git pull . branchname" than to use "git 
     merge", I think people may just even agree with you.

I think I saw that particular discussion on #git: somebody didn't expect 
"git pull . branch" to be the way to merge. And again, I think it's 
not _really_ because "pull" is hard to understand, it's because people 
haven't been walked through the thing in this way.

Once you understand local branches, fetching and merging, it's actually 
_easier_ to explain why we merge even local branches with "git pull .": 
you just tell them that this way you can use the same command regardless 
of whether you're merging something local or something remote. Again, if 
it's explained that way, I bet a lot of people react with "ahh, that's 
clever", and _like_ the fact that they only really need to learn _one_ 
command, instead of learning two.
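
As a rough sketch of that equivalence (the throwaway repository, the
branch name "topic" and the commit() helper are assumptions made only
for illustration, and --no-rebase is a flag of current Git releases),
the two-step and one-step forms of a local merge look like this:

```shell
#!/bin/sh
# Sketch only: a throwaway repository with hypothetical branch names.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name

# Helper so the demo does not depend on a configured identity.
commit() {
    git -c user.name=Demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}

commit "base"
git branch topic

# Two-step form: fetch the local branch, then merge what was fetched.
git checkout -q topic
commit "first change on topic"
git checkout -q master
git fetch -q . topic
git merge -q FETCH_HEAD

# One-step form: "pull ." does the same fetch and merge in one command.
git checkout -q topic
commit "second change on topic"
git checkout -q master
git pull -q --no-rebase . topic
```

Both forms leave master pointing at the tip of topic; replacing "."
with a remote URL is the only change needed to merge a remote branch.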

See? Explain it that way: "pull" really is simple. By using "pull", you 
don't have to learn about "merge" syntax. You -can- use "merge" as a 
separate program if you want to, but the syntax isn't very nice, exactly 
because you're not really expected to.

But the real issue here is to explain local branches. I will happily admit 
that local branches are very VERY different from just about any other SCM, 
but I also claim that git is just much BETTER than other SCM's in this 
respect.

And yes, this is why you should NOT try to use the same naming as "hg", 
for example. Last I saw, hg still didn't even have local branches. To
mercurial, repository == branch, and that's it. It was what I came from 
too, and I used to argue for using git that way too. I've since seen the 
error of my ways, and git is simply BETTER. 

And the concept of local branches is exactly _why_ you have to have 
separate "fetch" and "pull", but why you do _not_ need a separate "merge" 
(because "pull ." does it for you).

If you don't understand local branches, you'll never understand git usage. 
And once you _do_ understand local branches, "fetch" vs "pull" actually is 
rather simple.



* Re: Cleaning up git user-interface warts
  2006-11-15 19:05                           ` Marko Macek
@ 2006-11-15 20:41                             ` Junio C Hamano
  2006-11-15 22:07                               ` Shawn Pearce
  2006-11-16  6:07                               ` Marko Macek
  2006-11-15 22:28                             ` Sean
       [not found]                             ` <20061115172834.0a328154.seanlkml@sympatico.ca>
  2 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 20:41 UTC (permalink / raw)
  To: Marko Macek; +Cc: Shawn Pearce, Linus Torvalds, git, cworth, pasky

Marko Macek <marko.macek@gmx.net> writes:

> For people switching from CVS and SVN it would be much better if the
> index was hidden behind the scenes by using different defaults:
>
> git-commit -a
> git-status -a
> git-diff HEAD
>
> BTW, currently there's a minor bug: git-diff HEAD doesn't work before
> you make the first commit. Perhaps this should be special cased.

That's only a _bug_ in your implementation of the synonym for
"svn diff" which blindly used "git diff HEAD".

"git diff HEAD" is not a synonym for "svn diff" when HEAD does
not exist yet, because you are asking "please give me a diff
between the tree in the HEAD commit and my working tree files
through the index".  So if you are doing "git-svnish-diff"
Porcelain script, it should notice that HEAD does not exist yet
and take an appropriate action.  We do something similar in
git-status; the porcelain notices and acts differently when HEAD
is not there yet.
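
A minimal sketch of such a check, assuming a hypothetical wrapper
function named svnish_diff (not an existing git command) and taking
"an appropriate action" to mean diffing against the empty tree when
no commit exists yet:

```shell
#!/bin/sh
# svnish_diff: hypothetical "svn diff"-like wrapper, for illustration only.
svnish_diff() {
    if git rev-parse --verify --quiet HEAD >/dev/null; then
        # Normal case: HEAD exists, so compare it with the working tree.
        git diff HEAD "$@"
    else
        # No commit yet: compare the working tree with the empty tree.
        # hash-object only computes the ID here; nothing is written.
        empty_tree=$(git hash-object -t tree /dev/null)
        git diff "$empty_tree" "$@"
    fi
}
```

Before the first commit this shows every file already added to the
index as new; afterwards it behaves exactly like "git diff HEAD".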

This "there is no HEAD yet" is not related to the index, but I
am skeptical about trying to hide the index from the end user.

You can make some things map more naturally to systems like SVN
and CVS than other things.  For example, Nico's proposal to
always use remote tracking branches and defaulting to use
refs/remotes/ would be a way to match UI of pull/push to another
existing system and that would work well (I am not agreeing to
the change to make 'pull' not to do the merge which would break
existing users -- I am just saying that the result would be self
consistent).  But things that have difference at the concept
level, I suspect no clever mapping to hide the differences would
work well.

The index is quite central to the way git works at the concept
level, and I think it is doing disservice to the end user to try
hiding it forever from them and failing to do so, rather than
being honest and teaching them the concept upfront.

But me thinking so does not necessarily mean you are forbidden
from trying.  Your efforts may result in a system where the
index is totally invisible and the end user never has to know
about it.


* Re: Cleaning up git user-interface warts
  2006-11-15 20:26                     ` Nicolas Pitre
@ 2006-11-15 20:50                       ` Linus Torvalds
  2006-11-15 21:18                         ` Nicolas Pitre
  2006-11-16  1:51                       ` Anand Kumria
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 20:50 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Petr Baudis, Junio C Hamano, git



On Wed, 15 Nov 2006, Nicolas Pitre wrote:
> 
> I think "fetch" is sane.  Its only problem is a missing symmetrical
> counterpart verb, like "get" and "put".

If you're a dog owner, the obvious counterpart for "fetch" is "throw" ;)

I think "get" and "put" would be bad, just because of confusion with
"sccs get" (i.e. it has that "get this file" connotation).

Maybe "fetch" and "push" aren't totally diametrically opposite, but 
really, I don't think they are that hard to understand either. We do have 
the BK legacy of "pull" implying a merge, and that's fairly fundamental. 

It's also true that in a lot of usage scenarios, what people actually
_use_ is "pull" and "push", and no, they aren't mirror images (since push 
will _not_ do the merge), but at the same time, from a _usage_ standpoint 
they really _are_ each others opposites. 

You "pull" to get other people's data into your branch (and once you've
internalized local branches and the merge thing, you know what this
means), and you "push" to push your changes out. It really _is_ the usage
scenario, and using "opposite" words really _does_ make sense.

It's true that _technically_ "fetch" is the opposite of "push", but at the 
same time, that really is about technology, not about usage models. You 
normally wouldn't do a "git fetch + git push" pair. You _can_ do so, but 
it's not the natural way to work - unless you're just doing a mirror 
service.



* Re: Cleaning up git user-interface warts
  2006-11-15  0:31         ` Junio C Hamano
  2006-11-15  4:08           ` Petr Baudis
@ 2006-11-15 20:51           ` Carl Worth
  2006-11-15 20:57             ` Jakub Narebski
  1 sibling, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-11-15 20:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Andy Whitcroft, Petr Baudis


On Tue, 14 Nov 2006 16:31:50 -0800, Junio C Hamano wrote:
> I do not think the Porcelain-ish UI that is shipped with git
> should be taken with the same degree of "authority" as git
> Plumbing.

I think we should fix this. "This is great technology with a crap
interface on top" really isn't a good story. I don't actually agree
with that---I don't think the git interface is really all that bad,
it's just got a few little things that tend to trip up new users in my
experience.

And what git does really well, (history exploring, allowing for
pipeline one-liners to iterate over revisions in A..B), are things that
don't even exist in other tools, nor even in the "alternate"
porcelains for git. This stuff is where git's interface is really
fantastic, and it would be a shame to write it off.

>                                                        I think
> single isolated developers, contributors and CVS style shared
> repository usage could be a lot improved because neither of us
> were concentrating in their workflows.  This needs somebody
> motivated enough to improve things in that area.  For example,
> StGIT with its 'float' command is a great improvement over what
> rebase does for people in the contributor role.

Yes, there are some specific workflow-oriented operations that git
doesn't handle as well as it could. Things like commit --amend are
certainly improvements. One that is still totally broken is "follow
all the development in another repository" where clone followed by
repeated fetch doesn't do the job as soon as the remote adds or
deletes a branch.

> But making it more usable for whom is a big question.
>
> Quite frankly, I do not think there can be _the_ single UI that
> would satisfy different types of workflows for some of the
> commands.

I strongly disagree. Or at least, I don't think we've tried hard
enough yet that we should give up on this.

I do agree that people in different roles will have different lists of
"most used operations" and that some operations won't appear on some
users lists at all, (someone who's just "watching" development won't
commit or merge, for example---[or so they think when they start]).

But I really don't think that for any given operation that different
roles impose a different desire on the behavior of the operation. We
have different people with different background and disagreement on
names and silly things like that, but I don't think that's related to
the roles in which they are working with the tool.

> For example, fetching and merging from many places without
> necessarily having corresponding tracking branches is a great
...

I don't think we've ever had this right in git. The new
--use-separate-remotes stuff or similar will start to help as it
becomes the default. I don't see how this won't benefit everybody.

> For another example, having a commit command to commit
> everything by default is disastrous for people who allow their
> workflows to often be interrupted.

Workflow-interruption is an important thing to support, but separating
update-index and commit really doesn't address it nearly as much as I
would like. The lack of really good workflow-interruption support has
been one of my longest-running annoyances with git, (perhaps because I
have a problem with trying to do too many things at once). Git can
create and change branches fast enough that it really should be able
to help me better with this. The only missing piece is being able to
stash the dirty stuff on the current branch, to be able to come back
to it later. I've talked a bit about what I would like in this area
before, and I really just need to code it up.

> It is not just command line syntax and the defaults, but
> concepts as well.  People in the integrator role often need to
> deal with merges and you would need to be aware of the role of
> the index and need to be able to manipulate the index, ...

Again, I think it's more that the specific operations bring in
concepts, (merge bringing in the index here). As such, someone never
doing a merge could easily get by not having to understand the index.

> A Porcelain that does a very similar thing in slightly different
> way is obviously a waste, but otherwise I do not think it is a
> problem to have different Porcelains.  StGIT does not compete
> with the "sucky" Porcelain-ish shipped with git but makes the
> user's life a lot more pleasant by complementing what the sucky
> one does not do well.  It is not very useful while I am playing
> the integrator role, but when I am doing my own thing it is a
> great addition to my toolchest.

But even here, there's a bunch of waste in StGit. For example, there
are a lot of commands in StGit whose only purpose is to translate back
and forth between the StGit and non-StGit views of the world, (init,
assimilate, commit, uncommit). Those could all be discarded if the
functionality of StGit were brought down into git itself. Then there
are a myriad of StGit commands which are basically just the same as
their git counterparts.

Now, StGit is a great tool, and I know that it works really well for
some people in the role of just maintaining a stack of changes against
some upstream, and can use StGit alone and never touch "git" the
command-line.

But for someone like me who already uses git regularly, and
occasionally just wants to pop back a few commits, amend it, and then
push again, StGit is not helpful, (the series of init, assimilate, and
uncommits just to get started is prohibitive compared to just working
out the awkward steps needed to make a temporary branch and
rebase). So I'd love to see just a couple of commands added to "git"
to support these kinds of operations more smoothly.

> I am from the camp that does _not_ want to hide the index, so
> obviously I do not see any value in its effort to hide the
> index.  But other aspects of it, most notably being friendly to
> simpler workflows, is a very good thing.

I don't think "hide or not-to-hide" is the right way to frame the
discussion about the index. I regularly use update-index to stage
partial commits, and I find that very useful. And obviously the index
is involved in resolving merge conflicts.
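
For illustration only, a partial commit staged with update-index might
look like the following sketch (the throwaway repository, the file
names and the commit() helper are assumptions, not part of any real
project):

```shell
#!/bin/sh
# Sketch only: throwaway repository with hypothetical file names.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Helper so the demo does not depend on a configured identity.
commit() {
    git -c user.name=Demo -c user.email=demo@example.com commit -q -m "$1"
}

echo one > a.txt
echo one > b.txt
git add a.txt b.txt
commit "initial"

# Modify both files, but stage only a.txt.
echo two >> a.txt
echo two >> b.txt
git update-index a.txt
commit "change a.txt only"

# b.txt is still modified in the working tree and was left out of the commit.
git diff --name-only
```

The second commit records only a.txt, while the change to b.txt stays
in the working tree for a later commit.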

But I don't think the user-interface for either of those operations
(partial commit, resolve conflicts), is ideal, and the current
requirement to use either "update-index <paths>" or "commit -a" after
modifying a file for the first time is demonstrably a hangup for a lot
of new users. So I really think it's possible to address both of these
at once.

Anyway, that's enough generic rambling from me without any specific
content. I'll try to keep future messages focused on specific
desirable operations that have problematic interfaces in git right
now, along with proposals for improving them.

-Carl



* Re: Cleaning up git user-interface warts
  2006-11-15 20:51           ` Carl Worth
@ 2006-11-15 20:57             ` Jakub Narebski
  2006-11-15 22:00               ` Shawn Pearce
  0 siblings, 1 reply; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-15 20:57 UTC (permalink / raw)
  To: git

Carl Worth wrote:
>On Tue, 14 Nov 2006 16:31:50 -0800, Junio C Hamano wrote:
>>
>> For another example, having a commit command to commit
>> everything by default is disastrous for people who allow their
>> workflows to often be interrupted.
> 
> Workflow-interruption is an important thing to support, but separating
> update-index and commit really doesn't address it nearly as much as I
> would like. The lack of really good workflow-interruption support has
> been one of my longest-running annoyances with git, (perhaps because I
> have a problem with trying to do too many things at once). Git can
> create and change branches fast enough that it really should be able
> to help me better with this. The only missing piece is being able to
> stash the dirty stuff on the current branch, to be able to come back
> to it later. I've talked a bit about what I would like in this area
> before, and I really just need to code it up.

There is git-stash/git-unstash floating somewhere in the archive.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Cleaning up git user-interface warts
  2006-11-15 20:40                                 ` Linus Torvalds
@ 2006-11-15 21:08                                   ` Carl Worth
  2006-11-15 21:31                                     ` Junio C Hamano
  2006-11-15 21:45                                     ` Linus Torvalds
  2006-11-16  4:26                                   ` Theodore Tso
  1 sibling, 2 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-15 21:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Michael K. Edwards, git


On Wed, 15 Nov 2006 12:40:43 -0800 (PST), Linus Torvalds wrote:
> On Wed, 15 Nov 2006, Nicolas Pitre wrote:
> >
> > Actually I believe it would make things even clearer if "merge" was
> > taught at that point.  Only when the user is comfortable with the
> > separate notions of fetching and merging might the pull shorthand
> > possibly be mentioned.
>
> I agree. I just expect that "merge" is such a simple concept that it
> doesn't really need a whole lot of explaining.

Well, one of the problems is that with current git I can teach, (and I
have), that there's a conceptual:

	pull = fetch + merge

But then shortly after I have to teach an interface notion:

	merge = pull .

So there's this goofy circular notion that people end up with
mentally. If we fix it so that a local merge really is performed with
"git merge <branch>" instead of "git pull . <branch>" then teaching
pull=fetch+merge really is a lot easier.

In the meantime, pull would still be useless to me, I think. But maybe
that's just the "default branch to merge" selection being broken. If
that were fixed, maybe I would start using pull.

>  (a) explain what "branches" mean in git (and in that situation, "fetch"
>      is very natural - I think fetching itself is probably easier to
>      explain than "branches" are).

There's a piece missing here, namely the mapping between remote and
local branch names and any notion of "tracking branches". I think a
sane story for that is still being invented, (or if it exists now, I
haven't seen it yet).

>  (c) once "fetching branches" and "merging" have been explained, "pull" is
>      really pretty damn trivial, and in fact, if you then explain that
>      it's just easier to do "git pull . branchname" than to use "git
>      merge", I think people may just even agree with you.

Well, they get pretty darn confused at this point, in my experience.

> Once you understand local branches, fetching and merging, it's actually
> _easier_ to explain why we merge even local branches with "git pull .":
> you just tell them that this way you can use the same command regardless
> of whether you're merging something local or something remote. Again, if
> it's explained that way, I bet a lot of people react with "ahh, that's
> clever", and _like_ the fact that they only really need to learn _one_
> command, instead of learning two.

No. It's really, really broken to use "pull ." for local merging. Not
a feature at all. We just got done establishing that pull is a
shorthand for doing fetch+merge, so reusing it when there is _no_
fetch at all is insane.

You just established quite clearly that git has a huge advantage over
all other systems by having a model that everything is fetched in
and then worked with locally. I agree that this is a major
selling-point of git, and I'm also baffled that systems like bzr and
hg try so hard to push every branch into a separate repository.

But I think that git's "work with everything locally" story is undercut
a bit by regular usage being to use a transfer-inducing command like
"pull" for a totally local merge.

Anyway, I think we all agree that we'd really rather have "git merge
<branch>" be usable for local merges, so let's get that in place and
users can pick whichever they like.

> But the real issue here is to explain local branches. I will happily admit
> that local branches are very VERY different from just about any other SCM,
> but I also claim that git is just much BETTER than other SCM's in this
> respect.

Totally agree.

-Carl



* Re: Cleaning up git user-interface warts
  2006-11-15 20:35                           ` Petr Baudis
@ 2006-11-15 21:12                             ` Josef Weidendorfer
  2006-11-15 21:31                               ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Josef Weidendorfer @ 2006-11-15 21:12 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Jakub Narebski, git, Linus Torvalds, Nicolas Pitre,
	Junio C Hamano

On Wednesday 15 November 2006 21:35, Petr Baudis wrote:
> On Wed, Nov 15, 2006 at 09:31:13PM CET, Josef Weidendorfer wrote:
> > Often, I find myself doing "git branch" just to make sure that I am on
> > "master", so that a following pull does not do a bogus merge.
> > 
> > Can we please disable this behavior, e.g. by allowing a fake first
> > Pull line like "Pull: (not-for-merge)" to prohibit any merge?
> 
> Wait, if you don't want pull to merge, why do you pull and not fetch?

I am not really opposed to pull doing a merge. It only should work in
a useful way: ie. only do the merge of updated origin branch when
current branch is master (given "Pull: master:origin").

I want "git pull" to be harmless if I find myself accidentally on a
branch != master. I always can do "git checkout master; git pull . origin"
afterwards.

For this to work, I currently need to specify a "branch.<name>.merge"
config for _every_ branch I have, as otherwise I get this bogus pull
merge behavior. This is not needed if there was a way to configure no
merge at all as default pull behavior.

I just noted that allowing such a config option would be kind of a
working compromise for all the people who want
pull to be the opposite of push.



* Re: Cleaning up git user-interface warts
  2006-11-15 20:19                           ` Carl Worth
@ 2006-11-15 21:13                             ` Junio C Hamano
  2006-11-15 22:36                               ` Carl Worth
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 21:13 UTC (permalink / raw)
  To: Carl Worth; +Cc: Andy Parkins, git

Carl Worth <cworth@cworth.org> writes:

> I'm not the original poster, but I feel the same way about the line
> being unclear.
>
> Here's a real-world example from last week.
>...
> Anyway, when I announced this I also mentioned how easily someone
> might generate an entire series of reports for a series of
> commits. The command I gave as an example is:
>
> 	for rev in $(git rev-list 1.2.6..HEAD); do
> 	    cairo-perf-diff $rev
> 	done
>
> I think that's a perfectly legitimate one-liner for users to use, and
> it really shows off the easy-scriptability of git. But certainly, no
> "new porcelain" author is going to consider rev-list to be porcelain
> rather than plumbing, right? So as soon as I start teaching people to
> do useful stuff like this, they might have to reach down into the
> "scary" git interface.

That is a very fine example, but I do not see why it is a
problem.  I do not think the goal of Porcelain is to make it
totally unnecessary for users to know about the plumbing.

The one-liner is essentially a new Porcelain command that is
useful in the cairo developers' workflow, and implementing it
with a plumbing command makes perfect sense.  The whole point of
git plumbing is to be friendly for scripted use.  If the user
who learns that one-liner from you gets curious why and how that
one-liner works, that would be a good gentle introduction to the
plumbing, but otherwise the user is not forced to know about it.
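
To make that concrete, here is one way the one-liner could be wrapped
up as a tiny Porcelain-style helper.  This is only a sketch:
for_each_rev is a hypothetical name, not a git command.

```shell
#!/bin/sh
# for_each_rev is a hypothetical helper name, not a git command.
# Usage: for_each_rev <range> <command> [args...]
for_each_rev() {
    range=$1
    shift
    # --reverse lists oldest first, so the command runs in commit order.
    git rev-list --reverse "$range" | while read -r rev; do
        # Redirect stdin so the command cannot swallow the revision list.
        "$@" "$rev" </dev/null
    done
}
```

With this in place the loop from the quoted message would become
"for_each_rev 1.2.6..HEAD cairo-perf-diff", with rev-list still doing
all the real work underneath.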

Also I do not see a problem if some plumbing commands happen to
be also useful by themselves ("[alias] less = -p cat-file -p"
comes to mind for example).

Some plumbing commands may be too deep magic and users do not
have to directly deal with them every day.  Some other plumbing
commands are so low-level that they need to be combined with others to
be of any use, and it is cumbersome to type the combination
every day.  For the latter kind, we have Porcelain commands that
implement the frequently used combination and the end users do
not have to know about them.

So it is true that by having a rich and usable set of Porcelain,
there is less need for the users to know about all the plumbing
details, but I consider that is a happy consequence.  It does
not have to be the goal of having a good Porcelain to hide the
whole plumbing.




* Re: Sometimes "Failed to find remote refs" means "try git-fetch --no-tags"
  2006-11-15  4:05 ` Junio C Hamano
@ 2006-11-15 21:13   ` Horst H. von Brand
  0 siblings, 0 replies; 1752+ messages in thread
From: Horst H. von Brand @ 2006-11-15 21:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael K. Edwards, git

Junio C Hamano <junkio@cox.net> wrote:

[...]

> However "fetch --no-tags" from http upstream is a band-aid to
> hide that the upstream repository has stale info/refs, and I do
> not think we would want to encourage the band-aid.  Rather, the
> message should say "yell loudly at the repository owner" ;-).

I'm seeing this gem here:

  [vonbrand@laptop13 git]$ git pull
  fatal: read error (Connection reset by peer)
  Fetch failure: git://git.kernel.org/pub/scm/git/git.git
  fatal: read error (Connection reset by peer)
  Failed to find remote refs
  No changes.

Who shall I yell at? ;-)

Seriously, this is broken. I get 4 different error messages, plus a
(reassuring?) "No changes". Yes, I know this is what I'll see if the
machine is overloaded.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239


* Re: Cleaning up git user-interface warts
  2006-11-15 20:50                       ` Linus Torvalds
@ 2006-11-15 21:18                         ` Nicolas Pitre
  0 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 21:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Petr Baudis, Junio C Hamano, git

On Wed, 15 Nov 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 15 Nov 2006, Nicolas Pitre wrote:
> > 
> > I think "fetch" is sane.  Its only problem is a missing symmetrical
> > counterpart verb, like "get" and "put".
> 
> If you're a dog owner, the obvious counterpart for "fetch" is "throw" ;)

Yeah.  You could always throw a branch to your dog.

Or maybe we should introduce the concept of "bones" to GIT in place of 
branches?  ;-)

> I think "get" and "put" would be bad, just because of confusion with
> "sccs get" (i.e. it has that "get this file" connotation).

Has SCCS really had a similar level of influence as BK or CVS in that
matter?

> Maybe "fetch" and "push" aren't totally diametrically opposite, but 
> really, I don't think they are that hard to understand either. We do have 
> the BK legacy of "pull" implying a merge, and that's fairly fundamental. 
> 
> It's also true that in a lot of usage scenarios, what people actually
> _use_ is "pull" and "push", and no, they aren't mirror images (since push 
> will _not_ do the merge), but at the same time, from a _usage_ standpoint 
> they really _are_ each others opposites. 

The problem is the "usage standpoint" distinction that has to be made.  
Exactly because in GIT it is a bit distorted from what most people 
expect from other standpoints.

> You "pull" to get other people's data into your branch (and once you've
> internalized local branches and the merge thing, you know what this
> means), and you "push" to push your changes out. It really _is_ the usage
> scenario, and using "opposite" words really _does_ make sense.

But that's exactly why newbies have problems.  Instead of simply 
understanding the bare operation (fetch data in a branch _then_ merge 
it) they sort of need to abstract the concept of branch away because a 
"pull" does it all automagically.  Which is fine as long as you're 
willing to ignore branch concepts altogether.  But once branches are 
back in the picture for more involved operations then the "pull" word 
simply feels odd.  Even more so with the local merge syntax.

When I say to someone "just merge branch weezee with your current 
branch" the most intuitive command would be:

	git merge weezee

But because "pull" mixes two concepts together this makes the thing more 
esoteric.  Unless, of course, you get used to the mental model you 
outlined above, but IMHO simply needing a mental model to explain the 
tool is a sign that something is mapped wrong.




* Re: Cleaning up git user-interface warts
  2006-11-15 21:08                                   ` Carl Worth
@ 2006-11-15 21:31                                     ` Junio C Hamano
  2006-11-15 21:40                                       ` Nicolas Pitre
  2006-11-15 21:45                                     ` Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 21:31 UTC (permalink / raw)
  To: Carl Worth; +Cc: Nicolas Pitre, Michael K. Edwards, Linus Torvalds, git

Carl Worth <cworth@cworth.org> writes:

> So there's this goofy circular notion that people end up with
> If we fix it so that a local merge really is performed with
> "git merge <branch>" instead of "git pull . <branch>" then teaching
> pull=fetch+merge really is a lot easier.

I am wondering if that could be "git merge <committish>..."
instead.  I do not care too much about the ... part (i.e. an
Octopus), but I often find myself doing:

	git checkout next
        git merge "Merge early part of branch 'foo'" HEAD foo~3

when the earlier parts of the "foo" topic are worthy of being in 'next' but
not the later ones.

> In the meantime, pull would still be useless to me, I think. But maybe
> that's just the "default branch to merge" selection being broken.

Have you looked into per-branch configuration for default merge
source recently?  It might not be documented well enough,
though, because I do not use it myself, but you should be able
to improve on that (meaning both documentation and setting up
the defaults upon cloning and fetching).
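
As a sketch of what that per-branch configuration looks like (the
throwaway repository, the branch name "topic" and the remote URL are
placeholders chosen only for illustration):

```shell
#!/bin/sh
# Throwaway repository; "topic" and the URL are hypothetical placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git remote add origin git://example.org/project.git

# When on "topic", a bare "git pull" should fetch from "origin" ...
git config branch.topic.remote origin
# ... and merge the remote's "topic" branch.
git config branch.topic.merge refs/heads/topic

# Show the resulting per-branch settings.
git config --get-regexp '^branch\.topic\.'
```

The two keys end up in the [branch "topic"] section of .git/config,
so they can also be edited there by hand.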


* Re: Cleaning up git user-interface warts
  2006-11-15 21:12                             ` Josef Weidendorfer
@ 2006-11-15 21:31                               ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 21:31 UTC (permalink / raw)
  To: Josef Weidendorfer
  Cc: Petr Baudis, Jakub Narebski, git, Nicolas Pitre, Junio C Hamano



On Wed, 15 Nov 2006, Josef Weidendorfer wrote:
> 
> I am not really opposed to pull doing a merge. It only should work in
> a useful way: ie. only do the merge of updated origin branch when
> current branch is master (given "Pull: master:origin").

I absolutely agree.

We should _only_ use the default head when pulling from the default head 
("master"). If we don't pull from within the default branch, we should 
either require an explicit head _or_ we should require that an explicit 
mapping has been set up in .git/config or in .git/remotes/..

So doing a "git pull" from any other branch than "master" should probably 
by default say "which branch do you want to pull from today"?
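
As a rough illustration of the rule above (not git's actual code), the
selection logic might be sketched in Python.  The function name
pick_merge_source, the branch_map dictionary standing in for an explicit
mapping in .git/config or .git/remotes/, and the error text are all
hypothetical.

```python
from typing import Dict, Optional

DEFAULT_BRANCH = "master"


def pick_merge_source(current_branch: str,
                      explicit_head: Optional[str] = None,
                      branch_map: Optional[Dict[str, str]] = None) -> str:
    """Return the remote head a bare "pull" should merge, or raise.

    branch_map is a hypothetical stand-in for an explicit per-branch
    mapping (local branch name -> remote head to merge).
    """
    branch_map = branch_map or {}
    # An explicit head on the command line always wins.
    if explicit_head is not None:
        return explicit_head
    # Next, honour an explicit mapping set up for this branch.
    if current_branch in branch_map:
        return branch_map[current_branch]
    # Only the default branch may fall back to the remote's default head.
    if current_branch == DEFAULT_BRANCH:
        return DEFAULT_BRANCH
    # Anywhere else, refuse to guess and ask the user instead.
    raise ValueError("which branch do you want to pull from today?")
```

Under these assumptions a bare pull on "master" merges the default head,
while a bare pull on any other unmapped branch fails and asks.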



* Re: Cleaning up git user-interface warts
  2006-11-15 21:31                                     ` Junio C Hamano
@ 2006-11-15 21:40                                       ` Nicolas Pitre
  2006-11-15 21:52                                         ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 21:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carl Worth, Michael K. Edwards, Linus Torvalds, git

On Wed, 15 Nov 2006, Junio C Hamano wrote:

> I am wondering if that could be "git merge <committish>..."
> instead.  I do not care too much about the ... part (i.e. an
> Octopus), but I often find myself doing:
> 
> 	git checkout next
>         git merge "Merge early part of branch 'foo'" HEAD foo~3
> 
> when earlier part of "foo" topic are worthy to be in 'next' but
> not the later ones.

Indeed !




* Re: Cleaning up git user-interface warts
  2006-11-15 21:08                                   ` Carl Worth
  2006-11-15 21:31                                     ` Junio C Hamano
@ 2006-11-15 21:45                                     ` Linus Torvalds
  2006-11-15 22:52                                       ` Carl Worth
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 21:45 UTC (permalink / raw)
  To: Carl Worth; +Cc: Nicolas Pitre, Michael K. Edwards, git



On Wed, 15 Nov 2006, Carl Worth wrote:
> 
> Well, one of the problems is that with current git I can teach, (and I
> have), that there's a conceptual:
> 
> 	pull = fetch + merge
> 
> But then shortly after I have to teach an interface notion:
> 
> 	merge = pull .

This is why I would suggest teaching the _concept_ of the "merge", and not 
the actual command.

I don't think you should basically ever use the "git merge" command 
itself, not in teaching, and not in real life. So after talking about 
branches and having taught people to use "git fetch", the next stage is 
not so much to teach people to use "git merge", but to explain to them the 
_concept_ of merging. 

I really think that's a fairly quick thing, partly exactly _because_ you 
shouldn't at that point need to worry about syntax or details or anything 
like that at all. You just tell them that there's a notion of "merging" 
two branches by joining them together and having the result have the 
changes from both branches. So it's a _conceptual_ issue, and that's why I 
said I think you should just totally gloss over the whole issue of "git 
merge" syntax.

Once you've explained the _concept_ of merging, you then introduce the 
command to actually _execute_ the merge: it's "git pull".

See? No circular thinking at all. One is a _concept_ ("join two branches 
together by including both in the result") and the other is a command 
("pull will fetch the remote data if any, and merge it into the current 
branch").

If you explain it that way, then _obviously_ if you don't need to fetch 
any remote data, doing "git pull . xyzzy" will merge the local branch 
"xyzzy" into the current branch.
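
The relationship can be shown with a toy model in Python.  This is only a
sketch under simplifying assumptions: the Repo class and its method names
are invented for illustration, commits are bare ids with parent tuples, and
every merge records a two-parent commit (no fast-forward, no conflicts).

```python
class Repo:
    """Toy repository: each commit id maps to a tuple of parent ids."""

    def __init__(self):
        self.commits = {}    # commit id -> tuple of parent ids
        self.branches = {}   # branch name -> tip commit id
        self.current = "master"

    def commit(self, commit_id):
        """Add a commit on top of the current branch."""
        tip = self.branches.get(self.current)
        self.commits[commit_id] = (tip,) if tip is not None else ()
        self.branches[self.current] = commit_id

    def fetch(self, source, branch):
        """Copy commits reachable from source's branch; return its tip."""
        tip = source.branches[branch]
        if source is not self:          # nothing to copy for "pull ."
            todo = [tip]
            while todo:
                cid = todo.pop()
                if cid not in self.commits:
                    self.commits[cid] = source.commits[cid]
                    todo.extend(source.commits[cid])
        return tip

    def merge(self, other_tip, merge_id):
        """Join the current branch and other_tip in one new commit."""
        ours = self.branches[self.current]
        self.commits[merge_id] = (ours, other_tip)
        self.branches[self.current] = merge_id

    def pull(self, source, branch, merge_id):
        """pull = fetch (if there is anything remote) + merge."""
        self.merge(self.fetch(source, branch), merge_id)
```

In this model, calling pull with the repository itself as the source plays
the role of "git pull . xyzzy": the fetch step has nothing to copy and only
the merge happens.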



* Re: Cleaning up git user-interface warts
  2006-11-15 21:40                                       ` Nicolas Pitre
@ 2006-11-15 21:52                                         ` Junio C Hamano
  2006-11-15 21:59                                           ` Nicolas Pitre
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-15 21:52 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> On Wed, 15 Nov 2006, Junio C Hamano wrote:
>
>> I am wondering if that could be "git merge <committish>..."
>> instead.  I do not care too much about the ... part (i.e. an
>> Octopus), but I often find myself doing:
>> 
>> 	git checkout next
>>         git merge "Merge early part of branch 'foo'" HEAD foo~3
>> 
>> when earlier part of "foo" topic are worthy to be in 'next' but
>> not the later ones.
>
> Indeed !

Indeed, what?

That means that updated "git merge" (not the current one) would
not be able to assume its parameter is a branch name, and still
has to come up with the merge message "Merge <branch>".

Merging only within the local branch namespace already has the
problem you need to solve to come up with a nicely formatted
"Merge <branch> of <remote repository>" some way.  I am not
saying that this is unsolvable (you can look at remotes/ files
to see what remote tracking branch the branch is about), but
it is something you need to keep in mind when implementing the
improved "git merge".
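
To make the problem concrete, here is a minimal Python sketch of how such a
message might be chosen.  The function name merge_message, the tracking
dictionary (standing in for what could be read from remotes/ files), the
example URL, and the fallback wording for a non-branch argument are all
assumptions made for illustration, not existing git behaviour.

```python
def merge_message(arg, local_branches, tracking):
    """Pick a default merge message for the argument the user typed.

    local_branches: set of plain local branch names.
    tracking: hypothetical map from a local tracking branch to a
              (remote branch name, remote repository URL) pair.
    """
    if arg in tracking:
        # A tracking branch: name the remote branch and where it lives.
        remote_branch, url = tracking[arg]
        return "Merge branch '%s' of %s" % (remote_branch, url)
    if arg in local_branches:
        # A purely local branch: only the branch name is known.
        return "Merge branch '%s'" % arg
    # Anything else (e.g. foo~3) is a committish, not a branch name.
    return "Merge commit '%s'" % arg
```

The last case is the one raised above: an argument such as foo~3 names no
branch, so some other wording has to be chosen or supplied by the user.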


* Re: Cleaning up git user-interface warts
  2006-11-15 21:52                                         ` Junio C Hamano
@ 2006-11-15 21:59                                           ` Nicolas Pitre
  0 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-15 21:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, 15 Nov 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Wed, 15 Nov 2006, Junio C Hamano wrote:
> >
> >> I am wondering if that could be "git merge <committish>..."
> >> instead.  I do not care too much about the ... part (i.e. an
> >> Octopus), but I often find myself doing:
> >> 
> >> 	git checkout next
> >>         git merge "Merge early part of branch 'foo'" HEAD foo~3
> >> 
> >> when earlier part of "foo" topic are worthy to be in 'next' but
> >> not the later ones.
> >
> > Indeed !
> 
> Indeed, what?

What you propose would be excellent indeed.

> That means that updated "git merge" (not the current one) would
> not be able to assume it's parameter is a branch name, and still
> has to come up with the merge message "Merge <branch>".
> 
> Merging only within the local branch namespace already has the
> problem you need to solve to come up with a nicely formatted
> "Merge <branch> of <remote repository>" some way.  I am not
> saying that this is unsolvable (you can look at remotes/ files
> to see what remote tracking branch the branch is about), but
> something you need to keep in mind when implementing the
> improved "git merge".

Right.  But that is an _implementation_ detail, not a usability issue.




* Re: Cleaning up git user-interface warts
  2006-11-15 20:57             ` Jakub Narebski
@ 2006-11-15 22:00               ` Shawn Pearce
  2006-11-15 22:17                 ` Carl Worth
  0 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-11-15 22:00 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> wrote:
> Carl Worth wrote:
> >On Tue, 14 Nov 2006 16:31:50 -0800, Junio C Hamano wrote:
> >>
> >> For another example, having a commit command to commit
> >> everything by default is disastrous for people who allow their
> >> workflows to often be interrupted.
> > 
> > Workflow-interruption is an important thing to support, but separating
> > update-index and commit really doesn't address it nearly as much as I
> > would like. The lack of really good workflow-interruption support has
> > been one of my longest-running annoyances with git, (perhaps because I
> > have a problem with trying to do too many things at once). Git can
> > create and change branches fast enough that it really should be able
> > to help me better with this. The only missing piece is being able to
> > stash the dirty stuff on the current branch, to be able to come back
> > to it later. I've talked a bit about what I would like in this area
> > before, and I really just need to code it up.
> 
> There is git-stash/git-unstash floating somewhere in the archive.

I find that a "git commit -a -m parked; git checkout -b ..." works
well to stash my current stuff off.  Then I just amend the commit
when I come back to that branch.


The problem I just ran into today was "git checkout" doesn't double
check the file stat data against the index before switching branches.
If the file is unchanged between the two branches there's no error.
So I switched branches with dirty files that I forgot to park on
the old branch.


* Re: Cleaning up git user-interface warts
  2006-11-15 20:41                             ` Junio C Hamano
@ 2006-11-15 22:07                               ` Shawn Pearce
  2006-11-16  6:07                               ` Marko Macek
  1 sibling, 0 replies; 1752+ messages in thread
From: Shawn Pearce @ 2006-11-15 22:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Marko Macek, Linus Torvalds, git, cworth, pasky

Junio C Hamano <junkio@cox.net> wrote:
> The index is quite central to the way git works at the concept
> level, and I think it is doing disservice to the end user to try
> hiding it forever from them and failing to do so, rather than
> being honest and teaching them the concept upfront.
> 
> But me thinking so does not necessarily mean you are forbidden
> from trying.  Your efforts may result in a system where the
> index is totally invisible and the end user never has to know
> about it.

I agree with what you are saying about the index.

But in git-gui I found myself writing code on Monday which tries to
hide the index from the user unless he/she requested that the index
be made visible.

The reason is there are some users who I'd like to give git-gui to
who I'm not sure I trust to make sure their index is in sync with
their working directory before they commit.  In some cases I'm lucky
that the user even knows what directory their file is stored in.  :-(
Yes, there really are computer users who are afraid of directories
and command lines.

I probably could try to teach them to make sure the final file
is included in the index before committing, but I think that for
most of them they would find this to be just another couple of
mouse clicks they have to perform before every commit, meaning it's
something that the #$@!*@!*@$# tool should just do for them.


* Re: Cleaning up git user-interface warts
  2006-11-15 22:00               ` Shawn Pearce
@ 2006-11-15 22:17                 ` Carl Worth
  0 siblings, 0 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-15 22:17 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Jakub Narebski, git

On Wed, 15 Nov 2006 17:00:54 -0500, Shawn Pearce wrote:
> > There is git-stash/git-unstash floating somewhere in the archive.

Yes, I did write those once upon a time. ;-)

It's the manual stash/unstash that I don't want though. I want to be
able to make this happen automatically when switching branches.

> I find that a "git commit -a -m parked; git checkout -b ..." works
> well to stash my current stuff off.  Then I just amend the commit
> when I come back to that branch.

Yes, I do stuff like that as well. And often "reset HEAD~" instead of
amend, (always with a moment's pause as reset justly deserves).

> The problem I just ran into today was "git checkout" doesn't double
> check the file stat data against the index before switching branches.
> If the file is unchanged between the two branches there's no error.
> So I switched branches with dirty files that I forgot to park on
> the old branch.

Right, so that's just more evidence that this approach is a little
awkward.

Anyway, the stashing thing I want is a minor thing that should be easy
to fix in git, (as is everything we're talking about here I think).

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-15 19:05                           ` Marko Macek
  2006-11-15 20:41                             ` Junio C Hamano
@ 2006-11-15 22:28                             ` Sean
       [not found]                             ` <20061115172834.0a328154.seanlkml@sympatico.ca>
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-11-15 22:28 UTC (permalink / raw)
  To: Marko Macek
  Cc: Shawn Pearce, Linus Torvalds, Junio C Hamano, git, cworth, pasky

On Wed, 15 Nov 2006 20:05:27 +0100
Marko Macek <marko.macek@gmx.net> wrote:

> I guess this is the reason that the GIT Tutorial for CVS/SVN users is talking about _cogito_ instead.
> (which is very confusing for someone coming to _git_ home page, trying to learn git).

IMHO this is really bad.  Pasky runs the Git web site and feels
that Cogito comes hand in hand with Git.  When I asked him about
it he mentioned that Junio had approved.  But it's very confusing
to click a link that purports to show you how to use Git and get
shown a bunch of Cogito stuff.

Git is confusing enough for new users without "Git" and "Cogito"
being mixed without comment on the Git webpage.  At the very
least, the links should be changed to "Cogito for CVS/SVN users".



* Re: Cleaning up git user-interface warts
  2006-11-15 21:13                             ` Junio C Hamano
@ 2006-11-15 22:36                               ` Carl Worth
  2006-11-16  3:21                                 ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-11-15 22:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andy Parkins, git

On Wed, 15 Nov 2006 13:13:11 -0800, Junio C Hamano wrote:
> That is a very fine example, but I do not see why it is a
> problem.  I do not think the goal of Porcelain is to make it
> totally unnecessary for users to know about the plumbing.

If not, then the promise of the porcelain fails. If cogito offers
"Here are 40 commands so you don't have to learn git's 140" and then
next says "Oh, and you'll still want to learn all those git commands
too", then its existence only makes the "too much stuff to learn"
problem worse, not better.

But I think you agree with me (for now) that fixing the git UI should
not involve creating a new primary command to replace "git".

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-15 21:45                                     ` Linus Torvalds
@ 2006-11-15 22:52                                       ` Carl Worth
  2006-11-15 23:02                                         ` Shawn Pearce
                                                           ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-15 22:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Michael K. Edwards, git

On Wed, 15 Nov 2006 13:45:58 -0800 (PST), Linus Torvalds wrote:
> On Wed, 15 Nov 2006, Carl Worth wrote:
> >
> > Well, one of the problems is that with current git I can teach, (and I
> > have), that there's a conceptual:
> >
> > 	pull = fetch + merge
> >
> > But then shortly after I have to teach an interface notion:
> >
> > 	merge = pull .
>
> This is why I would suggest teaching the _concept_ of the "merge", and not
> the actual command.
>
> I don't think you should basically ever use the "git merge" command
> itself, not in teaching, and not in real life.

I think that's just an accident of git-merge having such a bad
syntax, (requiring a merge message, not using -m for that, requiring
two heads instead of defaulting to current, etc.). So the result is
accepting another bad syntax "pull ." for an operation that really is
merge.

> Once you've explained the _concept_ of merging, you then introduce the
> command to actually _execute_ the merge: it's "git pull".

I think we'll be doing better when there is a stronger correlation
between the concepts of the operations and the command names for
carrying them out.

Plus, when I'm teaching "fetch everything first, then manipulate it
locally", (which is what I teach, since that's the only way I use
git), then the "." looks really out of place when I teach the 'merge'
command. I end up saying, "Oh, that's there because you could do the
fetch and merge all in one step if you really wanted, but I never do
that.".

And that's because I _do_ teach fetch first, as you've suggested.

> changes from both branches. So it's a _conceptual_ issue, and that's why I
> said I think you should just totally gloss over the whole issue of "git
> merge" syntax.

That doesn't work. I know I went looking at the git-merge
documentation when I started to learn git. "It can't really be this
hard, can it?" was my reaction to it. And then only after attending a
tutorial did I learn that "pull ." is the way it's really done.

That's nothing more than a user-interface trap for new users, plain
and simple.

The real fix is to stop glossing over git-merge and just give it a
usable syntax.

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-15 22:52                                       ` Carl Worth
@ 2006-11-15 23:02                                         ` Shawn Pearce
  2006-11-15 23:33                                           ` Linus Torvalds
  2006-11-15 23:07                                         ` Sean
       [not found]                                         ` <20061115180722.83ff8990.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-11-15 23:02 UTC (permalink / raw)
  To: Carl Worth; +Cc: Linus Torvalds, Nicolas Pitre, Michael K. Edwards, git

Carl Worth <cworth@cworth.org> wrote:
> Plus, when I'm teaching "fetch everything first, then manipulate it
> locally", (which is what I teach, since that's the only way I use
> git), then the "." looks really out of place when I teach the 'merge'
> command. I end up saying, "Oh, that's there because you could do the
> fetch and merge all in one step if you really wanted, but I never do
> that.".
> 
> And that's because I _do_ teach fetch first, as you've suggested.

Ditto.  In every way.

I've taught the same fetch first, then merge strategy.  Nobody I
know in meat-space pulls from a remote URL and merges in one shot;
they always fetch locally, look at the incoming changes, decide if
it's worthwhile/ok, *then* merge with "git pull . branch".

The "." looks out of place for everyone...


* Re: Cleaning up git user-interface warts
  2006-11-15 22:52                                       ` Carl Worth
  2006-11-15 23:02                                         ` Shawn Pearce
@ 2006-11-15 23:07                                         ` Sean
       [not found]                                         ` <20061115180722.83ff8990.seanlkml@sympatico.ca>
  2 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-11-15 23:07 UTC (permalink / raw)
  To: Carl Worth; +Cc: Linus Torvalds, Nicolas Pitre, Michael K. Edwards, git

On Wed, 15 Nov 2006 14:52:32 -0800
Carl Worth <cworth@cworth.org> wrote:

> The real fix is to stop glossing over git-merge and just give it a
> usable syntax.

Agreed 100%   There's just no good reason to hide the user level
merge command inside of pull.



* Re: Cleaning up git user-interface warts
       [not found]                                         ` <20061115180722.83ff8990.seanlkml@sympatico.ca>
@ 2006-11-15 23:15                                           ` Shawn Pearce
  2006-11-16  7:51                                             ` Richard CURNOW
  0 siblings, 1 reply; 1752+ messages in thread
From: Shawn Pearce @ 2006-11-15 23:15 UTC (permalink / raw)
  To: Sean; +Cc: Carl Worth, Linus Torvalds, Nicolas Pitre, Michael K. Edwards,
	git

Sean <seanlkml@sympatico.ca> wrote:
> On Wed, 15 Nov 2006 14:52:32 -0800
> Carl Worth <cworth@cworth.org> wrote:
> 
> > The real fix is to stop glossing over git-merge and just give it a
> > usable syntax.
> 
> Agreed 100%   There's just no good reason to hide the user level
> merge command inside of pull.

So what about making git-merge take a -m "msg" argument to supply
the commit message, in which case it does the current behavior
(and thus git-pull needs to change to supply -m); and then make
git-merge without any -m parameter invoke "git pull . $@" ?

A minor tweak to both apps, a minor breakage to git-merge, but one
that I think anyone who invokes it by hand today would find sane
(using -m like we do elsewhere) and since the vintage of both
git-pull and git-merge should always match shouldn't break anyone
who uses git-pull today.
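
A rough Python sketch of that dispatch, only to show the shape of the idea.
The function name dispatch_merge and the "legacy-merge" / "pull" labels are
made up for illustration; the real tools are shell scripts and nothing here
is taken from them.

```python
def dispatch_merge(argv):
    """Describe what the proposed git-merge would do with argv.

    Returns ("legacy-merge", message, heads) when -m is given, i.e. the
    current behaviour with an explicit message, and otherwise
    ("pull", ".", args), i.e. the equivalent of "git pull . <args>".
    """
    args = list(argv)
    if "-m" in args:
        i = args.index("-m")
        if i + 1 >= len(args):
            raise ValueError("-m requires a message")
        message = args[i + 1]
        # Everything except "-m <msg>" is passed through as heads.
        heads = args[:i] + args[i + 2:]
        return ("legacy-merge", message, heads)
    # No message supplied: behave like a local pull of the given heads.
    return ("pull", ".", args)
```

With this split, git-pull would have to pass -m itself, while a hand-typed
"git merge foo" would land on the local-pull path.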


* Re: Cleaning up git user-interface warts
  2006-11-15 23:02                                         ` Shawn Pearce
@ 2006-11-15 23:33                                           ` Linus Torvalds
  2006-11-16  0:08                                             ` Nicolas Pitre
                                                               ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-15 23:33 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Carl Worth, Nicolas Pitre, Michael K. Edwards, git



On Wed, 15 Nov 2006, Shawn Pearce wrote:
> 
> I've taught the same fetch first, then merge strategy.  Nobody I
> know in meat-space pulls from a remote URL and merges in one shot;

Actually, with different people involved it's _much_ better to do it in 
one shot.

Why? Because doing a separate "fetch to local space" + "merge from local 
space" actually loses the information on what you are merging.

It's a lot more useful to have a merge message like

	Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6

than one like

	Merge branch 'for-linus'

which is what you get if you fetched it first.

Of course, in a situation like git itself, where most of the merges are 
stuff that Junio has had pending in his own tree ('maint' branch etc), 
things are different. But in a system where people actually use separate 
trees, there really is an advantage to consider the fundamental operation 
to be the "pull", not the "merge".

Again, the kernel really is more distributed than most projects, but this 
is another thing people should recognize: git has been designed for "true 
distributed development". Not the "fake" kind. Not the "I merge mainly my 
own branches" kind of thing. Truly distributed.

And in a truly distributed situation, "pull" is strictly more powerful 
than a separate "fetch" + separate "merge".

In other words, an SCM that does "pull" is _better_ than an SCM that does 
"merge". You can implement "merge" as a special case of "pull" (which we 
do), but you cannot conveniently do it the other way around without having 
to tie them together some other way (ie you could have a "remember the 
last place we fetched this branch from in order to tie the fetch and the 
merge together" - but please realize that that is exactly what "pull" 
_is_).

So I will generally do a "git pull" (possibly followed by a "git reset 
--hard ORIG_HEAD" if I decided it wasn't good) over a "git fetch" + "git 
merge". Exactly because the "pull" operation is actually more powerful.

Maybe people who aren't in my position don't always appreciate the _power_ 
of git. The reason "merge" is a second-class citizen is simply because IT 
SHOULD BE.



* Re: Cleaning up git user-interface warts
  2006-11-15 23:33                                           ` Linus Torvalds
@ 2006-11-16  0:08                                             ` Nicolas Pitre
  2006-11-16  3:07                                               ` Linus Torvalds
  2006-11-16  3:02                                             ` Michael K. Edwards
  2006-11-16 16:37                                             ` Carl Worth
  2 siblings, 1 reply; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-16  0:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Shawn Pearce, Carl Worth, Michael K. Edwards, git

On Wed, 15 Nov 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 15 Nov 2006, Shawn Pearce wrote:
> > 
> > I've taught the same fetch first, then merge strategy.  Nobody I
> > know in meat-space pulls from a remote URL and merges in one shot;
> 
> Actually, with different people involved it's _much_ better to do it in 
> one shot.
> 
> Why? Because doing a separate "fetch to local space" + "merge from local 
> space" actually loses the information on what you are merging.
> 
> It's a lot more useful to have a merge message like
> 
> 	Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
> 
> than one like
> 
> 	Merge branch 'for-linus'

That is an implementation detail that should be easily overcome once the 
notion of tracking branch with URL attribute is implemented.  Then it 
will be really easy to notice whether the branch argument is a local 
branch or a tracking branch with remote reference.




* Re: Cleaning up git user-interface warts
  2006-11-15 18:16                     ` Junio C Hamano
  2006-11-15 19:02                       ` Andy Parkins
@ 2006-11-16  0:23                       ` Han-Wen Nienhuys
  1 sibling, 0 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16  0:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano escreveu:
> I still think in the long run you would be better off giving
> separate names to Porcelains because I am sure you are going to
> find the next command to "fix", you cannot suddenly change the

 > "ig pull", you can dismiss all the broken git-x Porcelain-ish by
 > saying "Oh, git-x user-level commands had inconsistent semantics
 > and broken UI so do not use them anymore -- they are still there
 > only to help old timers transition.  The user level commands are
 > now called ig-x and ig stands for improved git".


I think it would be good if there were different commands for 
porcelains. Not because fixing the current commands is too much work, 
but rather because it would clarify the structure of git.  GIT is a 
3-layer approach:

  - index+workdir+refs over
  - a DAG of commits over
  - a file based SHA1 database

at first sight it is difficult to tell for each command on which layer 
it operates. It would help understanding GIT a lot if each layer got 
its own command, e.g.

   git - sha1 content db
   gic - sequences of commits
   giu - UI

(Of course, these names are completely silly, but you get the idea)


> I think get/put is much better than suddenly changing what pull
> means and is shorter to type than x-load; I am Ok with them.
> Although I think these words are tainted by SCCS, I do not think
> anybody cares.

they're also tainted  by darcs, but that's a minor problem, I suppose.



* Re: Cleaning up git user-interface warts
  2006-11-15 18:03                     ` Linus Torvalds
                                         ` (2 preceding siblings ...)
  2006-11-15 18:58                       ` Andy Parkins
@ 2006-11-16  1:14                       ` Theodore Tso
  2006-11-16  4:21                         ` Junio C Hamano
  2006-11-16  1:20                       ` Han-Wen Nienhuys
  2006-11-16  4:30                       ` Petr Baudis
  5 siblings, 1 reply; 1752+ messages in thread
From: Theodore Tso @ 2006-11-16  1:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

On Wed, Nov 15, 2006 at 10:03:18AM -0800, Linus Torvalds wrote:
> So the reason for using "git pull" is
> 
>  - bk did it that way, and like it or not, bk was the first usable 
>    distributed system. hg is totally uninteresting.

Yes, "bk pull" had an implied merge.  But, the reason why bk pull was
never really a problem with Bitkeeper is because it didn't really have
support for multiple branches active within the same repository ---
what Larry called "lines of development".  Or rather, Larry started
down the path of implementing lines of development, and then never
fully supported it, mainly because making it easy for people to use
was the tricky part.   

So with Bitkeeper, with "bk pull" there was never any question about
which branch ("line of development") you would be merging into after
doing a "bk pull", since there was only one LOD, and given that BK had
the rule that within a LOD only one tip was allowed, a "bk pull"
_had_ to do a merge operation.

The moment you start supporting multiple unmerged tips in a repository
i.e., branches, it raises the question, "which branch should the pull
operation merge onto"?  And git's answer, "the current branch", is
often not the right one.  *That's* why always doing a merge isn't
always the right answer, and so in the git world, people are told, use
"git fetch" instead, and in the hg world, "hg pull" doesn't do the
merge.  IMO, it's a fundamental result of the fact that both git and
hg have chosen to support multiple LODs, whereas BK punted on the
concept.

If you are operating on your local development branch, the reality is
that merging is probably not the right answer in the general case,
which is why the hg world have omitted doing the merge.  And by
telling people, use "git fetch" instead, that's also an implicit
admission that merging onto the current branch is often not the Right
Thing.

The problem is that "pull" is a very evocative word, especially given
the existence of "push", and so in the git world we are reduced to
telling people, "you really don't want to use pull, trust me".  

Is this a major issue?  Not really; I can think of a number of other
issues that make git hard to learn, and why hg has a more gentle
learning curve, and the "don't use pull" is probably a relatively
minor annoyance in the grand scheme of things.

If people are looking for a simple way out, maybe it would be enough
to have an option where if "git pull" is called from an interactive
terminal, and the "novice user" option is enabled, "git pull" returns
a warning message, "You probably want to use 'git fetch' instead; are
you sure?"  If people are saying that we shouldn't be teaching "git
pull" until fairly late in the game, maybe we should have a way of
discouraging novices from using it, simply because they are used to
seeing "pull" from other distributed SCM's.



* Re: Cleaning up git user-interface warts
  2006-11-15 18:03                     ` Linus Torvalds
                                         ` (3 preceding siblings ...)
  2006-11-16  1:14                       ` Theodore Tso
@ 2006-11-16  1:20                       ` Han-Wen Nienhuys
  2006-11-16  1:53                         ` Jakub Narebski
                                           ` (2 more replies)
  2006-11-16  4:30                       ` Petr Baudis
  5 siblings, 3 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16  1:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds wrote:
>  - git itself has now done it that way for the last 18 months, and the 
>    fact is, the people _complaining_ are a small subset of the people who 
>    actually use git on a daily basis and don't complain.


that's not a good argument; the set of git users is a small subset of 
those that looked at git, and dismissed it because they couldn't wrap 
their heads around it.   It's worth trying to get those on board by 
fixing the annoying little issues that have popped up in this thread. 
The technical base for GIT is excellent, and the only reason for not 
using it is its arcane interface.

A version control system is often only tangentially related to the real 
work that needs to be done, so the incentive to learn it well is small, 
and a steep learning curve only makes it worse.

FWIW, I regularly mess up with the differences between fetching, pulling 
and merging.  In particular, having to do a two step process to get 
remote changes in,

   git pull url-to-server master:master
      ..error message about not being a fast-forward..

   git pull --update-head-ok url-to-server master:master
      ..still an error message about update not being a fast-forward..

       (sigh)

   git pull url-to-server master:scrap-branch

   git pull . scrap-branch:my-current-branch

       (make mental note of deleting scrap-branch)
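
The "not being a fast-forward" refusal above can be illustrated with a
toy model. This is a minimal sketch in Python, not git's actual
implementation: it assumes history is a plain dict mapping each commit
to its parents, and the names is_ancestor and can_update_ref are made
up for the illustration. A ref may move from old to new only if old is
an ancestor of new, unless the update is forced.

```python
def is_ancestor(parents, old, new):
    """Return True if commit `old` is reachable from `new` via parent links."""
    stack, seen = [new], set()
    while stack:
        commit = stack.pop()
        if commit == old:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def can_update_ref(parents, old, new, force=False):
    """A ref update is allowed if it is a fast-forward, or if it is forced."""
    return force or is_ancestor(parents, old, new)


# Hypothetical history: the remote has A <- B <- C, the local branch has A <- B <- X.
parents = {"A": [], "B": ["A"], "C": ["B"], "X": ["B"]}

print(can_update_ref(parents, "B", "C"))              # local still at B: fast-forward
print(can_update_ref(parents, "X", "C"))              # local has its own commit X: refused
print(can_update_ref(parents, "X", "C", force=True))  # forced update discards X
```

In this model, pulling master:master while the local master carries its
own commits is the refused case, which matches the error messages
quoted above.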


-- 


* Re: Cleaning up git user-interface warts
  2006-11-15 19:18                         ` Linus Torvalds
  2006-11-15 19:39                           ` Michael K. Edwards
@ 2006-11-16  1:40                           ` Anand Kumria
  1 sibling, 0 replies; 1752+ messages in thread
From: Anand Kumria @ 2006-11-16  1:40 UTC (permalink / raw)
  To: git

On Wed, 15 Nov 2006 11:18:36 -0800, Linus Torvalds wrote:

> On Wed, 15 Nov 2006, Andy Parkins wrote:
>>
>> On the one hand you're arguing that git syntax is easy to learn, and on the 
>> other that no one will be able to learn a new syntax just as easily.
> 
> I'm saying that people who are new to git will _have_ to learn new 
> concepts ANYWAY.
> 
> I don't think the naming is the hard part. 

It isn't - the unexpectedness of what happens is.

I've started by teaching how to do stuff locally, then "pushing" it out to
others (me).  All the while being able to point out how this is either all
local, or sends stuff (without any local modifications) to others.

Come up to 'pull' and here you have to point out that not only will you get
the remote changes but they are also merged into your repository. On the
wrong branch?

Too bad.

The problem with git-pull behaving illogically drove me to look at cogito
(an aside, perhaps cg-throw should be the corollary to cg-fetch?)
instead. Alas it has problems with a cogito branch not being something you
can mentally map back to a git branch.

> But I bet people don't teach it that way. They _start_ by teaching "pull". 
> Right?

Nope.

Anand


* Re: Cleaning up git user-interface warts
  2006-11-15 20:26                     ` Nicolas Pitre
  2006-11-15 20:50                       ` Linus Torvalds
@ 2006-11-16  1:51                       ` Anand Kumria
  1 sibling, 0 replies; 1752+ messages in thread
From: Anand Kumria @ 2006-11-16  1:51 UTC (permalink / raw)
  To: git

On Wed, 15 Nov 2006 15:26:44 -0500, Nicolas Pitre wrote:

> On Wed, 15 Nov 2006, Petr Baudis wrote:
> 
>> On Wed, Nov 15, 2006 at 03:10:16AM CET, Junio C Hamano wrote:
>> > You have to admit both pull and fetch have been contaminated
>> > with loaded meanings from different backgrounds. I was talking
>> > about killing the source of confusion in the longer term by
>> > removing fetch/pull/push, so we are still on the same page.
>> 
>> How was/is fetch contaminated?
> 
> I think "fetch" is sane.  Its only problem is a missing symmetrical 
> counterpart verb, like "get" and "put".

"throw" ?

But I think "I'll just 'throw' this set of patches at you" is a lot
harsher sounding than "I'll just 'push' this set of patches at you".

Anand


* Re: Cleaning up git user-interface warts
  2006-11-16  1:20                       ` Han-Wen Nienhuys
@ 2006-11-16  1:53                         ` Jakub Narebski
  2006-11-16  2:03                         ` Junio C Hamano
  2006-11-16  3:12                         ` Linus Torvalds
  2 siblings, 0 replies; 1752+ messages in thread
From: Jakub Narebski @ 2006-11-16  1:53 UTC (permalink / raw)
  To: git

Han-Wen Nienhuys wrote:

> FWIW, I regularly mess up with the differences between fetching, pulling 
> and merging.  In particular, having to do a two step process to get 
> remote changes in,
> 
>    git pull url-to-server master:master
>       ..error message about not being a fast-forward..
> 
>    git pull --update-head-ok url-to-server master:master
>       ..still an error message about update not being a fast-forward..

What about:

     git pull --update-head-ok url-to-server +master:master

(or --force, but be careful with that one)?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: Cleaning up git user-interface warts
  2006-11-16  1:20                       ` Han-Wen Nienhuys
  2006-11-16  1:53                         ` Jakub Narebski
@ 2006-11-16  2:03                         ` Junio C Hamano
  2006-11-16  2:30                           ` Han-Wen Nienhuys
  2006-11-16  3:12                         ` Linus Torvalds
  2 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16  2:03 UTC (permalink / raw)
  To: hanwen; +Cc: git

Han-Wen Nienhuys <hanwen@xs4all.nl> writes:

> FWIW, I regularly mess up with the differences between fetching,
> pulling and merging.  In particular, having to do a two step process
> to get remote changes in,
>
>   git pull url-to-server master:master
>      ..error message about not being a fast-forward..
>
>   git pull --update-head-ok url-to-server master:master
>      ..still an error message about update not being a fast-forward..
>
>       (sigh)

Sigh indeed.

Why don't you do the simple and obvious

	git pull url master

or "git pull url" if you already know the master is the branch
you are interested in.

The more advanced forms that use tracking branches are there, and the
documentation talks about them for completeness, but that does
not mean you have to use them.


* Re: Cleaning up git user-interface warts
  2006-11-16  2:03                         ` Junio C Hamano
@ 2006-11-16  2:30                           ` Han-Wen Nienhuys
  2006-11-16  3:27                             ` Junio C Hamano
  0 siblings, 1 reply; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16  2:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
>> FWIW, I regularly mess up with the differences between fetching,
>> pulling and merging.  In particular, having to do a two step process
>> to get remote changes in,
>>
>>   git pull url-to-server master:master
>>      ..error message about not being a fast-forward..
>>
>>   git pull --update-head-ok url-to-server master:master
>>      ..still an error message about update not being a fast-forward..
>>
>>       (sigh)
> 
> Sigh indeed.
> 
> Why don't you do the simple and obvious
> 
> 	git pull url master

It is not at all evident from the git-pull man-page that this is the 
obvious and most common usage.

> or "git pull url" if you already know the master is the branch
> you are interested in.

Because I usually replace verbose commands with shortcuts only when I 
understand exactly what the shortcut is.

To me it's very illogical that

   master:current-branch

doesn't work, but

   master:

does work, and does what I'd expect

   master:current-branch

to do. Interestingly, doing

   pull ..url.. master:HEAD

also doesn't merge into the current branch, but rather creates a bogus 
refs/heads/HEAD

I use the remote:local syntax, because I started using GIT in scripted 
compiles from copied branches of remote repositories. There the explicit 
remote:local statements are necessary because there is no default branch.
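
The remote:local notation discussed in this thread can be sketched as a
small parser. This is an illustration in Python under stated
assumptions, not git's own parsing code: the Refspec type and the
parse_refspec name are hypothetical, and only the three features
mentioned in the thread are modelled (an optional leading "+" to force
the update, a source, and an optional destination). An empty
destination means "do not store the result in a local ref", which is
why "master" and "master:" come out the same.

```python
from typing import NamedTuple, Optional


class Refspec(NamedTuple):
    force: bool          # leading "+": allow a non-fast-forward update
    src: str             # branch name on the remote side
    dst: Optional[str]   # local ref to update, or None for "just merge it"


def parse_refspec(spec: str) -> Refspec:
    """Split '[+]src[:dst]' into its parts; an empty dst becomes None."""
    force = spec.startswith("+")
    if force:
        spec = spec[1:]
    src, _sep, dst = spec.partition(":")
    return Refspec(force, src, dst or None)


for spec in ("master", "master:", "master:scrap-branch", "+master:master"):
    print(spec, "->", parse_refspec(spec))
```

Seen this way, "master:current-branch" asks the fetch step to overwrite
the branch that is checked out, while "master:" leaves every local ref
alone and hands the fetched commit to the merge step.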

-- 


* Re: Cleaning up git user-interface warts
  2006-11-15 10:28                     ` Jakub Narebski
@ 2006-11-16  2:43                       ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16  2:43 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Wed, Nov 15, 2006 at 11:28:27AM CET, Jakub Narebski wrote:
> Santi Béjar wrote:
> 
> > On 11/15/06, Jakub Narebski <jnareb@gmail.com> wrote:
> 
> >> You mean
> >>
> >>       git merge git://repo.com/time_machine.git#branch
> >>
> >> don't you (perhaps with 'master' as default branch)?
> > 
> > perhaps with remote 'HEAD' as default branch?
> 
> No! HEAD might change without your notice, and you want to know
> which branch you merge. With remotes the default could be first
> branch in the pull/fetch list, but with bare URL...

No! If HEAD changed without your notice, it means that the remote
repository admin _wants_ you to start fetching another branch now.
Imagine a setup of these branches:

	phooey-1.2	legacy lineage
	phooey-2.0	last stable
	phooey-3.0	current development (no releases yet)
	phooey-4.0	stash for futuristic functionality, heavily
			experimental

In this case, HEAD now points to phooey-3.0 but when it becomes stable,
it would change to phooey-4.0.
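
The effect of following the remote HEAD can be sketched with a toy
model. This is a minimal illustration in Python, assuming a remote is
just a dict with a "HEAD" entry naming one of its branches; the commit
ids ("c12", "c30", ...) and the resolve helper are made up, only the
branch names come from the setup above.

```python
# Hypothetical remote repository: HEAD names the branch the admin wants
# people to fetch by default.
remote = {
    "HEAD": "phooey-3.0",
    "branches": {
        "phooey-1.2": "c12",
        "phooey-2.0": "c20",
        "phooey-3.0": "c30",
        "phooey-4.0": "c40",
    },
}


def resolve(remote, branch=None):
    """Return (branch name, commit id); follow HEAD when no branch is named."""
    name = branch if branch is not None else remote["HEAD"]
    return name, remote["branches"][name]


print(resolve(remote))                 # follows HEAD to phooey-3.0
remote["HEAD"] = "phooey-4.0"          # the admin repoints HEAD
print(resolve(remote))                 # the same request now yields phooey-4.0
print(resolve(remote, "phooey-2.0"))   # an explicit branch is unaffected
```

The point of the sketch is only that an unqualified request tracks
whatever the remote admin designates, while a named branch stays fixed.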

The common practice of having 'master' pointing at whatever you
currently have now and "cutting out" the branches from it at random
times is something heavily influenced by CVS where this is the only sane
way of branching (the cutting out even hardcoded in numbering scheme).
In more advanced systems, you may want to be much more flexible wrt. this
(note that I'm not saying you necessarily _should_ be).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-15 23:33                                           ` Linus Torvalds
  2006-11-16  0:08                                             ` Nicolas Pitre
@ 2006-11-16  3:02                                             ` Michael K. Edwards
  2006-11-16 11:35                                               ` Andreas Ericsson
  2006-11-16 16:37                                             ` Carl Worth
  2 siblings, 1 reply; 1752+ messages in thread
From: Michael K. Edwards @ 2006-11-16  3:02 UTC (permalink / raw)
  To: git

On 11/15/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Actually, with different people involved it's _much_ better to do it in
> one shot.
>
> Why? Because doing a separate "fetch to local space" + "merge from local
> space" actually loses the information on what you are merging.
>
> It's a lot more useful to have a merge message like
>
>         Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
>
> than one like
>
>         Merge branch 'for-linus'
>
> which is what you get if you fetched it first.

Full ACK from a platform integrator's perspective.  Local merge is
great for trial runs but the history in a persistent branch should be
as self-contained and self-explanatory as possible.  It shouldn't
depend on what I name local tracking branches, which are just a
convenience so that I can still do trial runs when my connectivity is
broken.
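
The difference between the two merge messages quoted above can be
sketched in a few lines. This is an illustration in Python, not the
code git uses to build its merge messages; merge_message is a
hypothetical helper. The only point is that the URL can be recorded
only when the merge step still knows where the branch came from.

```python
def merge_message(branch, url=None):
    """Build a merge subject line; the URL is known only for a direct pull."""
    message = f"Merge branch '{branch}'"
    if url:
        message += f" of {url}"
    return message


# One-shot pull: the source URL is at hand when the merge commit is written.
print(merge_message("for-linus",
                    "git://one.firstfloor.org/home/andi/git/linux-2.6"))

# Fetch first, merge later from a local tracking branch: the URL is gone.
print(merge_message("for-linus"))
```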

I don't have to manually log the _mechanical_origin_ of a given delta;
git does that for me, and mostly just DTRT when the same delta arrives
via several paths.  When I use git pull from a remote branch (with or
without an entry in remotes/heads, which for this purpose is just
shorthand), I don't have to manually log what conflicts I have and
haven't resolved, either; I must have assimilated whatever I cared
about in the remote branch's history up to that point, because as long
as there are things in that remote branch that I haven't decided how
to handle, I stick to cherry-picking.

Obviously, fetch to local space is great (especially when you spend
some of your working hours behind a firewall that blocks outbound TCP
9418).  Fetch from local space is also great, when the local space you
are fetching from reflects local work (such as a sync point and
reconciliation of several upstream sources, which then needs to be
ported forward or back to the chosen core version for each platform).
Fetch from a local space that is just a tracker for remote work is not
great, because it doesn't capture the editorial decision implied by a
remote pull:  I looked at what the remote branch had to offer as of
this date, systematically decided which bits did and didn't belong in
the branch to which I was pulling, and pulled.

The record of that pull becomes a first-class object because it's
attached to an actual content delta in the target branch.  So it
propagates into branches that pull from it.  Pulling this delta into
another branch is different from cherry-picking a feature delta; it
implies acceptance of the reconciliation and editorial work associated
with the merge in the source branch.

Coming from me, this is all rather theoretical, as I haven't been
using this particular tool for the purpose long enough to have an
independent opinion.  But for what it's worth, the workflow Linus
describes isn't just for the guy at the top of the pyramid.

Cheers,


* Re: Cleaning up git user-interface warts
  2006-11-16  0:08                                             ` Nicolas Pitre
@ 2006-11-16  3:07                                               ` Linus Torvalds
  2006-11-16  3:43                                                 ` Nicolas Pitre
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16  3:07 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Shawn Pearce, Carl Worth, Michael K. Edwards, git



On Wed, 15 Nov 2006, Nicolas Pitre wrote:
> 
> That is an implementation detail that should be easily overcome once the 
> notion of tracking branch with URL attribute is implemented.

Nope.

I simply don't _have_ those branches.

Why? Because the kernel is _distributed_. There is no central place 
(certainly not my repository) that tracks all the possible branches that 
might get merged.

In other words, I repeat: in a TRULY DISTRIBUTED ENVIRONMENT it makes more 
sense to have a "pull" that fetches and merges, over something that 
fetches separately and then merges. Because in a truly distributed 
environment, you simply DO NOT HAVE static branches that you can associate 
with particular sources.

See?

And the thing is, I think the git design should be geared towards true 
distribution. It should NOT be geared toward a fairly static set of 
branches that all have a fairly static set of other repositories 
associated with them. Can you see the difference?

I'm personally convinced that one of the reasons people tend to use git in 
a centralized manner is just a mental disease that has its roots in how 
they used _other_ SCM's. I don't want git design to be polluted by such a 
centralized notion.

So to repeat: you can always make "pull" boil down to "pull from myself" 
(aka just "merge"), but you can _not_ make "fetch + merge" boil down to 
"pull" without making up extra state to track separately. In other words, 
"pull" really is the strictly more powerful operation.



* Re: Cleaning up git user-interface warts
       [not found]                             ` <20061115172834.0a328154.seanlkml@sympatico.ca>
@ 2006-11-16  3:07                               ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16  3:07 UTC (permalink / raw)
  To: Sean; +Cc: Marko Macek, Shawn Pearce, Linus Torvalds, Junio C Hamano, git,
	cworth

On Wed, Nov 15, 2006 at 11:28:34PM CET, Sean wrote:
> Git is confusing enough for new users without "Git" and "Cogito"
> being mixed without comment on the Git webpage.  At the very
> least, the links should be changed to "Cogito for CVS/SVN users".

It's not being mixed without comment, in the very first paragraph I'm
trying to explain what the difference is and why Cogito is used for
introduction to Git. I've tried to clear it up even more now.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-16  1:20                       ` Han-Wen Nienhuys
  2006-11-16  1:53                         ` Jakub Narebski
  2006-11-16  2:03                         ` Junio C Hamano
@ 2006-11-16  3:12                         ` Linus Torvalds
  2006-11-16 10:31                           ` Junio C Hamano
                                             ` (2 more replies)
  2 siblings, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16  3:12 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Thu, 16 Nov 2006, Han-Wen Nienhuys wrote:

> Linus Torvalds wrote:
> >  - git itself has now done it that way for the last 18 months, and the
> > fact is, the people _complaining_ are a small subset of the people who
> > actually use git on a daily basis and don't complain.
> 
> 
> that's not a good argument; the set of git users is a small subset of those
> that looked at git, and dismissed it because they couldn't wrap their heads
> around it. 

And I've said this again, and I'll say it once more: that has basically 
_nothing_ to do with whether you spell "pull" as "pull" or "merge".

The reason people have trouble wrapping their heads around git is because 
they have been braindamaged by CVS and SVN, and just don't understand the 
fairly fundamental new concepts and workflow.

That's totally different from then arguing about stupid naming issues.

People seem to believe that changing a few names or doing other totally 
_minimal_ UI changes would somehow magically make things understandable. I 
claim that isn't so at all. The fact is, git is different from CVS and 
SVN, and git _has_ to be different from CVS and SVN. It has to be 
different because the whole model of CVS and SVN is simply fundamentally 
BROKEN.

> It's worth trying to get those on board by fixing the annoying
> little issues that have popped up in this thread.

I claim that those "annoying little issues" are totally made up by people 
who had trouble wrapping their minds around git, and then make up reasons 
that have nothing to do with reality for why that might be so.

Let's face it, you could just alias "merge" to "pull", and it wouldn't 
really change ANYTHING. You'd still have to learn the new model. 



* Re: Cleaning up git user-interface warts
  2006-11-15 22:36                               ` Carl Worth
@ 2006-11-16  3:21                                 ` Petr Baudis
  2006-11-16 10:09                                   ` Robin Rosenberg
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16  3:21 UTC (permalink / raw)
  To: Carl Worth; +Cc: Junio C Hamano, Andy Parkins, git

On Wed, Nov 15, 2006 at 11:36:21PM CET, Carl Worth wrote:
> On Wed, 15 Nov 2006 13:13:11 -0800, Junio C Hamano wrote:
> > That is a very fine example, but I do not see why it is a
> > problem.  I do not think the goal of Porcelain is to make it
> > totally unnecessary for users to know about the plumbing.
> 
> If not, then the promise of the porcelain fails. If cogito offers
> "Here are 40 commands so you don't have to learn git's 140" and then
> next says "Oh, and you'll still want to learn all those git commands
> too", then its existence only makes the "too much stuff to learn"
> problem worse, not better.

I didn't get this argument before either - why do you need to learn "all
those git commands" too? You'll never have to learn "git add" or even
"git commit". If you want to pick specific git commands later (like "git
bisect", which even seeks in a Cogito-compatible way), that's fine, go
ahead! But you by no means have to learn _other_ commands than those you
need. If you want to bisect, you have to learn no other Git commands
than "git bisect".

Another point is, if using _just_ _git_ requires you to learn "all those
git commands too" from git-commit-tree up (yes it does! if you want your
authorship information to be correct), something is wrong.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-16  2:30                           ` Han-Wen Nienhuys
@ 2006-11-16  3:27                             ` Junio C Hamano
  2006-11-16  3:35                               ` Junio C Hamano
  2006-11-16  4:07                               ` Junio C Hamano
  0 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16  3:27 UTC (permalink / raw)
  To: hanwen; +Cc: git

Han-Wen Nienhuys <hanwen@xs4all.nl> writes:

> Junio C Hamano wrote:
>>...
>> Sigh indeed.
>>
>> Why don't you do the simple and obvious
>>
>> 	git pull url master
>
> It is not at all evident from the git-pull man-page that this is the
> obvious and most common usage.

In the git user poll a few months ago, many people recommended
"everyday git" as a good cheat sheet, and indeed it does not
say anything about directing the underlying git-fetch to
manipulate tracking branches by giving explicit refspec pairs to
git pull.  You are obviously tripped by both the overeager
manpage (but manpage should strive to be complete so you cannot
really blame it) and less than optimally organized tutorial
style documents.

I myself do prefer, when learning a new tool, to use longhand
until I understand the shorthand, but that attitude requires a
true commitment to learn the tool, and most people do not go
that route.  Tutorial style documents tend to give the commonly
used shorthand first for that exact reason.

Shorthand to give only the branch name to fetch and merge
immediately without using a tracking branch is equivalent to
longhand "branch:" as you found out, so if that was what was
desired then people with the attitude "before understanding what
longhand does I prefer using shorthand" like myself and you
would have liked to learn "git pull url branch:" notation from
Tutorial.  But I think we _are_ minority.  People would not want
to see that seemingly useless colon there.

> To me it's very illogical that
>
>   master:current-branch
>
> doesn't work,

That shows that you did not understand what fetch does.  Maybe
you do now, but a very natural consequence of directing fetch to
update tracking branches with the colon notation is:

 - "pull url master:master", while on master, is almost always
   wrong and not something you would want to do, ever.

   "fetch --update-head-ok url +master:master; reset --hard HEAD"

   may make sense but never "pull".

> I use the remote:local syntax, because I started using GIT in scripted
> compiles from copied branches of remote repositories. There the
> explicit remote:local statements are necessary because there is no
> default branch.

If you perhaps wanted to ask "is there a better way to do what
I've been doing?", then I am willing to think with you to come
up with an answer.  Unfortunately, however, I do not understand
the above paragraph, so I'd refrain from commenting on it in
this response.




* Re: Cleaning up git user-interface warts
  2006-11-16  3:27                             ` Junio C Hamano
@ 2006-11-16  3:35                               ` Junio C Hamano
  2006-11-16  4:07                               ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16  3:35 UTC (permalink / raw)
  To: hanwen; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

(not changing what I said but editorial)
> I myself do prefer, when learning a new tool, to use longhand
> until I understand the shorthand, but that attitude requires a
> true commitment to learn the tool, and most people do not go
> that route.  Tutorial style documents tend to give the commonly
> used shorthand first for that exact reason.

Eh, sorry, "prefer to use longhand until I understand what is
going on before using the shorthand" is what I wanted to say.

> Shorthand to give only the branch name to fetch and merge
> immediately without using a tracking branch is equivalent to
> longhand "branch:" as you found out, so if that was what was
> desired then people with the attitude "before understanding what
> longhand does I prefer using shorthand" like myself and you

"prefer not using shorthand", sorry again.

> would have liked to learn "git pull url branch:" notation from
> Tutorial.  But I think we _are_ minority.  People would not want
> to see that seemingly useless colon there.


* Re: Cleaning up git user-interface warts
  2006-11-16  3:07                                               ` Linus Torvalds
@ 2006-11-16  3:43                                                 ` Nicolas Pitre
  0 siblings, 0 replies; 1752+ messages in thread
From: Nicolas Pitre @ 2006-11-16  3:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Shawn Pearce, Carl Worth, Michael K. Edwards, git

On Wed, 15 Nov 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 15 Nov 2006, Nicolas Pitre wrote:
> > 
> > That is an implementation detail that should be easily overcome once the 
> > notion of tracking branch with URL attribute is implemented.
> 
> Nope.
> 
> I simply don't _have_ those branches.
> 
> Why? Because the kernel is _distributed_. There is no central place 
> (certainly not my repository) that tracks all the possible branches that 
> might get merged.
> 
> In other words, I repeat: in a TRULY DISTRIBUTED ENVIRONMENT it makes more 
> sense to have a "pull" that fetches and merges, over something that 
> fetches separately and then merges.
[...]

OK fine.  git-pull is there to stay and let's make sure it remains the 
same.

Let's see if, for example, git-merge can be made more useful in the mean 
time for those evidently inferior people that would prefer an interface 
that maps more closely to the actual operation that is being performed.  
And although I do understand what "pull" does, I think I should qualify 
myself as one of those inferior people nevertheless since "pull . blah" 
really irritates me.  OK I must be really dumb to let myself be 
disturbed by such an insignificant detail... but apparently I'm not 
alone.

But I promise to never change the "pull" behavior if I ever attempt to 
fix the "merge" command for the inferior mortals as myself.  All power 
to those with superior minds shall never be removed.

;-)




* Re: Cleaning up git user-interface warts
  2006-11-15  9:17               ` Andy Parkins
                                   ` (2 preceding siblings ...)
  2006-11-15 17:55                 ` Junio C Hamano
@ 2006-11-16  3:53                 ` Petr Baudis
  3 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16  3:53 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

On Wed, Nov 15, 2006 at 10:17:22AM CET, Andy Parkins wrote:
> On Wednesday 2006 November 15 04:32, Nicolas Pitre wrote:
> 
> > 3) remote branch handling should become more straight forward.
> 
> I was completely confused by this origin/master/clone stuff when I started 
> with git.  In hindsight, now I understand git a bit more, this is what I 
> would have liked:
> 
>  * Don't use the name "origin" twice.  In fact, don't use it at all.  In a 
> distributed system there is no such thing as a true origin.
> 
>  * .git/remotes/origin should be ".git/remotes/default".   "origin" is only 
> special because it is the default to push and pull - it's very nice to have a 
> default, but it should therefore be /called/ "default".

  But "default" is way too generic a name, it's much more confusing I
think. As the one guilty of inventing master and origin, I agree that
they are somewhat silly, but if I would have to pick which one to
replace with something "better", I'd much rather pick master.

  Yes, Git can operate in a completely distributed manner. People do use
it that way. And there are also people that have no origin branch in their
repository. But the vast (overwhelming!) majority of people _does_ work
in some kind of hierarchical setup, and for them origin does have a
meaning. And origin URL can even change over time!

>  * git-clone should really just be a small wrapper around
>     - git-init-db
>     - create .git/remotes/default
>     - maybe create specific .git/config
>     - git-fetch default
>    If git-clone does anything that can't be done with settings in the config 
> and the remotes/default file then it's wrong.  The reason I say this is that 
> as soon as git-clone has special capabilities (like --shared, --local 
> and --reference) then you are prevented from doing magic with existing 
> repositories.  For example; how do you create a repository that contains 
> branches from two other local repositories that have the objects hard linked?

  Here I think that modulo the lack of remotes support (which is not a
fundamental thing here), the general setup of how Cogito does stuff is
much saner than the current Git mess. It does basically exactly
what you've said above, and even the fetching itself is IMHO written
much more cleanly than in Git. In an ideal world, Git would just take
Cogito's code here. :-)

> While I'm writing wishes, I'd like to jump on Junio's integration with other 
> fetch-backends wish.  I use git-svn, and it would be fantastic if I could 
> replace:
> 
> git-svn init --id upstream/trunk svn://host/path/trunk
> git-svn fetch --id upstream/trunk
> git-svn init --id upstream/stable svn://host/path/branches/stable
> git-svn fetch --id upstream/stable
> 
> With a .git/remotes/svn
>  SVN-URL: svn://host/path
>  Pull: trunk:refs/remotes/upstream/trunk
>  Pull: branches/stable:refs/remotes/upstream/stable
> and
>  git fetch svn
> 
> Obviously, the syntax is just made up; but you get the idea.  Even better, 
> would be if it could cope with my "*" syntax suggested above:
>  SVN-URL: svn://host/path
>  Pull: trunk:refs/remotes/upstream/trunk
>  Pull: branches/*:refs/remotes/upstream/*

  It shouldn't be hard to do at all. Have the porcelain call "protocol
drivers" based on protocol in some generic way, like
/usr/lib/git/protocol/$proto.
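
The dispatch idea can be sketched as follows. This is a minimal
illustration in Python of picking a driver by URL scheme; no such
driver directory exists in git as far as this message is concerned, so
the directory path (taken from the paragraph above) and the driver_for
helper are both hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical driver directory, as floated in the message above.
DRIVER_DIR = "/usr/lib/git/protocol"


def driver_for(url, driver_dir=DRIVER_DIR):
    """Map a URL to the path of the protocol driver that would handle it."""
    scheme = urlsplit(url).scheme
    if not scheme:
        raise ValueError(f"no protocol in URL: {url!r}")
    return posixpath.join(driver_dir, scheme)


print(driver_for("svn://host/path/trunk"))
print(driver_for("git://repo.com/time_machine.git"))
```

A porcelain built this way would only need to execute the returned
program with the URL and the requested refs; adding a new transport
would mean dropping one more file into the directory.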

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-16  3:27                             ` Junio C Hamano
  2006-11-16  3:35                               ` Junio C Hamano
@ 2006-11-16  4:07                               ` Junio C Hamano
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16  4:07 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Han-Wen Nienhuys <hanwen@xs4all.nl> writes:
>
>> Junio C Hamano escreveu:
>>>...
>>> Sigh indeed.
>>>
>>> Why don't you do the simple and obvious
>>>
>>> 	git pull url master
>>
>> It is not all evident from the git-pull man-page that this is the
>> obvious and most common usage.
>
> In the git user poll a few months ago, many people recommended
> "everyday git" as a good cheat sheet, and indeed it does not
> talk anything about ...

Sorry, I must have been in a very grumpy mood when I wrote the
message (cf. Pasky's utterance on #git a few days ago).  What I
wrote was a bit incoherent, so here is an attempt to clarify.

I should point out that the colon separated refspec pairs you
can give to "pull" were designed with considerable thought; it is
not a mere convenience hack that "pull", which "fetches and
merges", accepts them.  Linus's and Michael's other messages in this
thread may seem to be saying that using tracking branches is not
a kosher way to use git, but I do not think that is a correct
interpretation of their messages.

The workflow that does not use any tracking branches is the
simplest and truly distributed way as Linus says.  The command
recommended in "everyday git" document:

	git pull $url $branchname

is the most natural way to express it, and the simplest variant, in
that you do not have to write a colon anywhere in it.

However that does not mean it is a bad practice to use tracking
branches.  Sometimes it is handy to be able to refer to what you
fetched from the remote the last time, which is possibly what
you merged into your branch if that last fetch was done via "git
pull", so that you can later examine its history without your
own development.  For that purpose, you need to store what you
fetched in your local refs/ namespace, and that is what tracking
branches are.

The workflow that fetches to tracking branches and then merges
within local repository as two separate steps loses the true
origin information ("Merge branch 'foo'" vs "Merge branch 'foo'
of git://git.bar.xz/foo.git").  That's the reason why not just
"git fetch" but also "git pull" take the colon separated refspec
pairs to direct git to update the tracking branches when "pull"
happens.  The longhands are cumbersome to type all the time,
and we have shorthand, both to store URL: and Pull: lines in
remotes/ hierarchy, and also $branchname alone is a shorthand
for saying "${branchname}:", meaning "do not use a tracking
branch to store this".
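
The shorthand rule can be sketched in shell as follows.  The function
names are made up for illustration, and a leading "+" (forced update)
is deliberately not handled.

```shell
#!/bin/sh

# Print the remote (source) half of a refspec.
refspec_src() {
	printf '%s\n' "${1%%:*}"
}

# Print the local (destination) half of a refspec.  A bare
# "$branchname" is treated like "${branchname}:", that is, an empty
# destination, meaning "do not store this in a tracking branch".
refspec_dst() {
	case "$1" in
	*:*) printf '%s\n' "${1#*:}" ;;
	*)   printf '\n' ;;
	esac
}

# Describe what fetching the given refspec would do with the result.
describe_refspec() {
	src=$(refspec_src "$1")
	dst=$(refspec_dst "$1")
	if [ -n "$dst" ]
	then
		echo "fetch $src, store in $dst"
	else
		echo "fetch $src, no tracking branch"
	fi
}
```

Both "master" and "master:" come out as "no tracking branch", while
"master:refs/heads/origin" names the place to store what was fetched.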

So you have options to use or not to use tracking branches.
After cloning we happen to default to tracking all remote branches
with corresponding local tracking branches, but that is only
because many people on the list wanted to make life easier for CVS
migrants, for whom following a mostly static set of branches is the
norm ("set" is the static part: I do not mean the branches stay
still).



* Re: Cleaning up git user-interface warts
  2006-11-16  1:14                       ` Theodore Tso
@ 2006-11-16  4:21                         ` Junio C Hamano
  2006-11-16 11:34                           ` Alexandre Julliard
  2006-11-16 16:07                           ` Theodore Tso
  0 siblings, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16  4:21 UTC (permalink / raw)
  To: Theodore Tso; +Cc: git, Nicolas Pitre, Linus Torvalds

Theodore Tso <tytso@mit.edu> writes:

> So with Bitkeeper, with "bk pull" there was never any question about
> which branch ("line of development") you would be merging into after
> doing a "bk pull", since there was only one LOD, and given that BK had
> the rule that a within a LOD only one tip was allowed, a "bk pull"
> _had_ to do do a merge operation.   

I've never used Bk and I really appreciate your comments here.

> If you are operating on your local development branch, the reality is
> that merging is probably not the right answer in the general case,

I agree, but I wonder why you are pulling/fetching (with or
without merge) if you are operating on your local development
branch (implying that you are in the middle of something else).

> ...  And by
> telling people, use "git fetch" instead, that's also an implicit
> admission that merging onto the current branch is often not the Right
> Thing.
>
> The problem is that "pull" is a very evocative word, especially given
> the existence "push", and so in the git world we are reduced to
> telling people, "you really don't want to use pull, trust me".  

I would rather say "use 'git branch' to check whether you are
ready to merge".  Who teaches not to use "git pull"?

> If people are looking for a simple way out, maybe it would be enough
> to have an option where if "git pull" is called from an interactive
> terminal, and the "novice user" option is enabled, "git pull" returns
> a warning message,

I have to disagree with this.  In the simplest CVS-like setup, a
central repository with a single branch, which is where many
"novice users" start out, there is almost no need for "git fetch"
or tracking branches.  You pull, resolve conflicts, attempt to push
back, perhaps get "oh, no fast forward, somebody pushed first",
pull again, then push back.  So I am not sure where "you really
do not want to use pull.  trust me" comes from.

It is a different story for people who _know_ git enough to know
what is going on.  They may be using multiple branches and
interacting with multiple remote branches, and there are times
you would want fetch and there are other times you would want
pull.  But for them, I do think the suggestion would never end
with "trust me" -- they would understand what the differences
are.





* Re: Cleaning up git user-interface warts
  2006-11-15 20:40                                 ` Linus Torvalds
  2006-11-15 21:08                                   ` Carl Worth
@ 2006-11-16  4:26                                   ` Theodore Tso
  2006-11-16 11:50                                     ` Andreas Ericsson
  1 sibling, 1 reply; 1752+ messages in thread
From: Theodore Tso @ 2006-11-16  4:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Michael K. Edwards, git

On Wed, Nov 15, 2006 at 12:40:43PM -0800, Linus Torvalds wrote:
> And yes, this is why you should NOT try to use the same naming as "hg", 
> for example. Last I saw, hg still didn't even have local branches, To 
> mercurial, repository == branch, and that's it. It was what I came from 
> too, and I used to argue for using git that way too. I've since seen the 
> error of my ways, and git is simply BETTER. 

Actually, that's not true.  Mercurial has local branches, just as git
does.  Some people choose not to *use* this particular feature, and
use the BK style repository == branch, but that's mainly because it's
conceptually easy for them, and a number of BK refugees are very
happily using Hg.  

It's probably because of the BK refugee population that after you do
an hg pull, it will warn you that you need to do an "hg update" in
order to merge the working directory up to the latest version that was
just pulled --- and this change was made precisely because Hg supports
local branches, and merging with the current branch isn't always the
right thing, unlike with BK.

> And the concept of local branches is exactly _why_ you have to have 
> separate "fetch" and "pull", but why you do _not_ need a separate "merge" 
> (because "pull ." does it for you).

It's just that the semantics are different, and many developers have
to use multiple DSCM's, depending on what project they happen to be
developing on.  So the reality is that there are people who have to
use bzr, git, and hg, all at the same time.  And while eventually
newbies will figure out and remember that "git pull ." == "merge", the
naming is simply confusing, that's all.  (What does "pull" have to do
with "merge"?  It's not at all obvious.)  

For someone who uses git full-time, and to the exclusion of all other
systems, I'm sure it's not a problem at all.
	


* Re: Cleaning up git user-interface warts
  2006-11-15 18:03                     ` Linus Torvalds
                                         ` (4 preceding siblings ...)
  2006-11-16  1:20                       ` Han-Wen Nienhuys
@ 2006-11-16  4:30                       ` Petr Baudis
  5 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16  4:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

On Wed, Nov 15, 2006 at 07:03:18PM CET, Linus Torvalds wrote:
> If you think "pull" is confusing, I can guarantee you that _changing_ the 
> name is a hell of a lot more confusing. In fact, I think a lot of the 
> confusion comes from cogito, not from git - the fact that cogito used 
> different names and different syntax was a mistake, I think.

  I would agree that having "pull" mean something different in Cogito
than in Git was a bad idea (explanation: historically, for some period
of time Cogito had cg-pull which meant the same as cg-fetch or hg pull;
later it got renamed to cg-fetch). But I'm also happy that Cogito just
does not use the "pull" expression at all currently: "updating" seems to
be a clear and unloaded enough concept for new people. Pull is really
_very_ confusing, with it meaning something different (but not different
enough) in _all_ other systems but BK (which is basically irrelevant
nowadays).

  That said, I agree with your argument that changing it in Git now
might just result in more confusion. I'm just trying to explain Cogito's
choice here, and I believe it does neither good nor harm to Core Git if
Cogito just uses a different name for the concept and avoids the original
name altogether (except for explaining in the docs that updating in
Cogito is what pulling is in Git).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-14 22:36         ` Junio C Hamano
  2006-11-14 22:50           ` Junio C Hamano
@ 2006-11-16  5:12           ` Petr Baudis
  2006-11-16 10:45             ` Junio C Hamano
                               ` (2 more replies)
  1 sibling, 3 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16  5:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carl Worth, git, Andy Whitcroft, Nicolas Pitre

On Tue, Nov 14, 2006 at 11:36:19PM CET, Junio C Hamano wrote:
> Commenting on the messages in this thread:
> 
>  - "resolve / resolved" are both confusing, when you are talking
>    about "mark-resolved" operation.

Well that's what "resolved" is saying. But speaking of which, it took me
_weeks_ of regular (though not extensive) usage to train my fingers to
write "stg resolved" and not "stg resolve".

>  - "pull/push/fetch" have undesired confusion depending on where
>    people learned the term.  I'd perhaps vote for replacing
>    fetch with download and push with upload.

It's too long. :-(

I think if some people have a real problem with something it's "pull",
not push or fetch. Without "pull" name, there's no confusion about
merging or not merging; and without it, there's also no confusion about
"push" and the fetch/push duality. I'm not saying that this is enough of
an argument to ditch pull from Git at this point.

>  - I think it would be sensible to make remote tracking branches
>    less visible.  For example:
> 
> 	git diff origin
> 
>    where origin is the shorthand for your upstream (e.g. you
>    have .git/remotes/origin that records the URL and the branch
>    you are tracking) should be easier to understand than
> 
>    	git diff remotes/origin/HEAD
> 
>    The latter is an implementation detail.

Hmm, wait. I didn't start using refs/remotes/ yet for obvious reasons,
but wasn't it generally agreed when implementing them that what you
wrote above would work? (That a ref not found in refs/{heads,tags}/ is
looked up in remotes and if it's a directory, /HEAD is appended.) So it
doesn't for some reason?
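
The lookup rule as described in this paragraph could be sketched like
this. It is only an illustration of that description: the function name
and the search order are assumptions, and it is not a claim about what
git itself does.

```shell
#!/bin/sh

# Print the ref file (relative to the repository directory $1) that the
# short name $2 would resolve to, or return non-zero if nothing matches.
resolve_short_ref() {
	gitdir=$1
	name=$2
	# First look among local branches and tags.
	for prefix in refs/heads refs/tags
	do
		if [ -f "$gitdir/$prefix/$name" ]
		then
			echo "$prefix/$name"
			return 0
		fi
	done
	# Then fall back to remotes; for a directory, append /HEAD.
	if [ -d "$gitdir/refs/remotes/$name" ] &&
	   [ -f "$gitdir/refs/remotes/$name/HEAD" ]
	then
		echo "refs/remotes/$name/HEAD"
		return 0
	fi
	if [ -f "$gitdir/refs/remotes/$name" ]
	then
		echo "refs/remotes/$name"
		return 0
	fi
	return 1
}
```

Under that rule "origin" would end up at refs/remotes/origin/HEAD
whenever no local branch or tag of that name exists.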

>    I could imagine we might even want to allow
> 
> 	git diff origin#next
> 
>    to name the branch of the remote repository.  The notion of
>    "where the tips of remote repository's branches are" is
>    probably be updated by "git download" (in other words, the
>    above "git diff" does not automatically initiate network
>    transfer).

Yes, that little syntax extension would be cute to have.

> Of course, it could even be "cg" ;-).

So, here is an arbitrary list of random reasons why cg commands are not
part of git yet:

(i) Naming issues. Example: "pull" vs. "update".

(ii) Namespace issues. Big selling point of Cogito is that it's
_simple_. A very important part of that is that your command set is
limited, so that even someone who wants to fully grok Cogito is not
overwhelmed and has just a few commands in front of him. I think we're
doing pretty well here, and I very carefully weigh adding another
command to the set (I'm actually pondering removing some now). The
same applies to the actual commands' usage, though certainly not so
heavily; and there are a few warts here.

But overall, I think this point is pretty much unsolvable and this is
where I actually think the main "incompatibleness" of Cogito and Git
with its free mix of high- and mid- and low- level commands lies. I
don't think the thread provided any solution to this either.

(ii) Behaviour issues. Example: Cogito tries to deal with uncommitted
local changes in your repository when doing stuff. It didn't shine at it
before recent improvements (post-v0.18), but it tried to preserve your
local uncommitted changes during various operations (merging,
fast-forwarding, switching branches, seeking, ...). I think historically
Git's stance to this was negative (it'd rather block the operation), I'm
not sure what the current situation is, though.

(iii) Output format issues. Example: "status" has a completely
different format in Git and in Cogito. I'm a die-hard fan of
Cogito's format but there're surely die-hard fans of Git's.

(iv) Control issues. I'm reluctant to give up the final word on what the
UI looks like, mostly for the reason of enforcing (ii) and proper
documentation. But this is not a blocker point.

(v) Library issues. Cogito has a pretty neat shell library which it
prizes; but that could be carried around. Also, Cogito requires
/bin/bash, but mostly for performance reasons (using builtins instead of
forking for external commands at some points); Git has the advantage of
simply putting that part in C, which is though something I should've
been doing more frequently too.

(vi) Coding issues. This is probably very subjective, but a blocker for
me. I have no issues about C here, but about the shell part of Git.
Well, how to say it... It's just fundamentally incompatible with me. I
*could* do things in/with it, but it's certainly something I wouldn't
_enjoy_ doing _at all_, on a deep level. I think the current shell code
is really hard to read, the ancient constructs are frequently strange at
best, etc. It's surely fine code at functional level and there'll be
people who hate _my_ style of coding and my shell code which isn't
perfect either, but it's just how it is with me.


Now, it would be absolutely awesome if we could start to bridge at least
some of these points, shuffle some functionality around and overall
reduce the code duplication, increase the feature count and improve the
general level of world happiness.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-15 20:41                             ` Junio C Hamano
  2006-11-15 22:07                               ` Shawn Pearce
@ 2006-11-16  6:07                               ` Marko Macek
  2006-11-16 10:36                                 ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Marko Macek @ 2006-11-16  6:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, Linus Torvalds, git, cworth, pasky

Junio C Hamano wrote:
> Marko Macek <marko.macek@gmx.net> writes:
> 
>> For people switching from CVS and SVN it would be much better if the
>> index was hidden behind the scenes by using different defaults:
>>
>> git-commit -a
>> git-status -a
>> git-diff HEAD
>>
>> BTW, currently there's a minor bug: git-diff HEAD doesn't work before
>> you make the first commit. Perhaps this should be special cased.
> 
> That's only a _bug_ in your implementation of the synonym for
> "svn diff" which blindly used "git diff HEAD".


My "implementation" is taken from git-diff man page. It seems obvious
that the situation before the first commit is just a special case if 
we consider git-diff to be Porcelain (which I do).

 
> This "there is no HEAD yet" is not related to the index, but I

I agree, this is a separate issue.



* Re: Cleaning up git user-interface warts
  2006-11-15 23:15                                           ` Shawn Pearce
@ 2006-11-16  7:51                                             ` Richard CURNOW
  2006-11-16 23:01                                               ` Johannes Schindelin
  0 siblings, 1 reply; 1752+ messages in thread
From: Richard CURNOW @ 2006-11-16  7:51 UTC (permalink / raw)
  To: git
  Cc: Shawn Pearce, Sean, Carl Worth, Linus Torvalds, Nicolas Pitre,
	Michael K. Edwards

* Shawn Pearce <spearce@spearce.org> [2006-11-15]:
> 
> So what about making git-merge take a -m "msg" argument to supply
> the commit message, in which case it does the current behavior
> (and thus git-pull needs to change to supply -m); and then make
> git-merge without any -m parameter invoke "git pull . $@" ?

Sounds good to me.

When I'm merging in my own projects, I currently always use merge
(possibly preceded by fetch) rather than pull.  Why?  Because I don't
want my history full of commit messages like

Merge branch "trial_hack" from "../scratch_dir_with_silly_name"

In contrast to Linus's case of wanting to record where the remote merge
came from, I expressly don't want to record that - I want the merge
commit to describe conceptually what was being merged with what.

OK, I could probably use pull with --no-commit, but I've already
trained my fingers to type out the merge syntax.  They'd be happier with
'git merge -m "Merge feature foo with fixes for bar" bar' though.
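
The proposal quoted at the top could be sketched as a dry-run front
end that only prints the command it would hand off to.  The function
name is made up, and "plumbing-merge" is a placeholder for the existing
git-merge behaviour, not a real command.

```shell
#!/bin/sh

# Print the command a "git merge" front end would delegate to.
#   merge_front_end -m "msg" <branch>...   keeps the current behaviour
#   merge_front_end <branch>...            behaves like "git pull . ..."
merge_front_end() {
	if [ "$1" = "-m" ]
	then
		if [ $# -lt 3 ]
		then
			echo "usage: merge_front_end [-m <msg>] <branch>..." >&2
			return 1
		fi
		msg=$2
		shift 2
		echo "plumbing-merge \"$msg\" $*"
	else
		echo "git pull . $*"
	fi
}
```

With -m the caller chooses the commit message; without it the merge
goes through the same path as a local pull.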


* Re: Cleaning up git user-interface warts
  2006-11-16  3:21                                 ` Petr Baudis
@ 2006-11-16 10:09                                   ` Robin Rosenberg
  2006-11-16 13:46                                     ` Petr Baudis
  0 siblings, 1 reply; 1752+ messages in thread
From: Robin Rosenberg @ 2006-11-16 10:09 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Carl Worth, Junio C Hamano, Andy Parkins, git

torsdag 16 november 2006 04:21 skrev Petr Baudis:
> Another point is, if using _just_ _git_ requires you to learn "all those
> git commands too" from git-commit-tree up (yes it does! if you want your
> authorship information to be correct), something is wrong.

When/why do I need git-commit-tree? Isn't git-commit enough?



* Re: Cleaning up git user-interface warts
  2006-11-16  3:12                         ` Linus Torvalds
@ 2006-11-16 10:31                           ` Junio C Hamano
  2006-11-16 10:45                           ` Han-Wen Nienhuys
  2006-11-16 23:00                           ` Johannes Schindelin
  2 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 10:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Han-Wen Nienhuys

Linus Torvalds <torvalds@osdl.org> writes:

> And I've said this again, and I'll say it once more: that has basically 
> _nothing_ to do with whether you spell "pull" as "pull" or "merge".
>
> The reason people have trouble wrapping their heads around git is because 
> they have been braindamaged by CVS and SVN, and just don't understand the 
> fairly fundamental new concepts and workflow.
> ...
> Let's face it, you could just alias "merge" to "pull", and it wouldn't 
> really change ANYTHING. You'd still have to learn the new model. 

I had a bit different feeling about yesterday's discussion
myself.

If somebody uses git like you do in a "truly distributed way", the
current pull behaviour and pull being an operational mirror to
push are a natural consequence of the model and concepts, and
there is nothing to fix (modulo "the default merge source per
branch" should be made easier to use).  Renaming the pull to
merge would not make it any easier to use unless the underlying
model is understood, and I fully agree with you on that.

But for people working in a project organized around central
repository in the CVS/SVN fashion, the workflow is quite
different.  CVS does not even let you "fetch" without either
merging (co) or throwing away your work (co -C), and we already
do support that model with:

	git clone
	git pull
	work work work; git commit
	git push
	: oops not fast forward?
	git pull
	resolve work; git commit
	git push

without ever using a local branch, any tracking branch, nor
use of git-fetch.  So we do support both extremes ("truly
distributed" and "not distributed at all") reasonably well.

The trouble starts when the users hear about this wonderful
"distributed" stuff git offers, and try to use it without
understanding the key concepts.  People tend to learn by doing
and there is a leap users need to make because now they need
to understand branched development, branches and fetching like
you explained if they want to use git the same way as you do.
Once they understand them, then the current set of tools offer
them a simple and very straightforward user interface (the tools
directly reflect the concepts and it is straightforward only
because we are talking about users who understood the concepts).

But we have to admit that this leap may rather be difficult for
people who are used to other models.  Telling them that our
model is different and it is different for a good reason does
not change the fact that the more different something is, the
more difficult it is to learn.

I suspect that there could be a way to use git, not like you or
I do.  Our workflows are already quite different (e.g. you
almost never do topic branch merges yourself in your repository,
but I have an abundance of them).  There is no reason to think
there won't be other workflows that are suitable for other
people.  Some workflows might be classified less distributed and
inferior compared to the "truly git way" from "truly
distributed is the point of git" point of view, but nevertheless
could be "good enough" for those people.  In other words, a
workflow that is a bit more advanced than just a single trunk
CVS/SVN usage could still take advantage of some of the features
to support distributed development model git has, while not
taking full advantage of truly distributed nature of git.

I think the complaints in yesterday's discussion are mostly
about frustration that, while we have reasonable support for
both extremes, we either do not know what that middle ground
workflow is, or, even if we do know, we do not support
it very well.

And I am not opposed to people exploring what that different
workflow would be, and if, while they do so, they come up with a
set of commands (get/put perhaps) to support that slightly
different workflow, that would be a very good thing.

Add foreign SCM importers in the mix and the situation becomes
more difficult and interesting.  cvsimport mostly works and
quacks like git-fetch with a set of tracking branches, which I
think is the right model for the importers, and would integrate
well with the current set of tools.  I believe svnimport is the
same way.  But I do not know about git-svn.


* Re: Cleaning up git user-interface warts
  2006-11-16  6:07                               ` Marko Macek
@ 2006-11-16 10:36                                 ` Junio C Hamano
  0 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 10:36 UTC (permalink / raw)
  To: Marko Macek; +Cc: git

Marko Macek <marko.macek@gmx.net> writes:

>>> BTW, currently there's a minor bug: git-diff HEAD doesn't work before
>>> you make the first commit. Perhaps this should be special cased.
>>
>> That's only a _bug_ in your implementation of the synonym for
>> "svn diff" which blindly used "git diff HEAD".
>
> My "implementation" is taken from git-diff man page. It seems obvious
> that the situation before the first commit is just a special case if
> we consider git-diff to be Porcelain (which I do).

Yes, "git diff" is a Porcelain.  No question about it.

I do not consider the current behaviour of "git diff HEAD" that
complains instead of giving runs of "foo is a new file and no
diff is available for it" a bug; you asked for diff from some
commit but the commit you gave was bogus (does not exist yet).
But if you feel strongly about it, it should be trivial to
special case the yet-to-be-born HEAD case and run the
equivalent of:

	git ls-files | sed -e 's/$/ is a new file, no diff is available./'

in such a case.  Or you could even go fancier and do an
equivalent of:

	git ls-files |
	while read path
	do
		l=`wc -l <"$path"`
		echo "diff --git a/$path b/$path"
		echo "--- a/$path"
		echo "+++ b/$path"
		echo "@@ -0,0 +1,$l @@"
		sed -e 's/^/+/' <"$path"
	done

and you can claim that it makes it consistent with the case
where you already have commits.

But I happen to think that consistency is only of academic
interest.  After all, how often would one create a true "root"
commit?  We are not talking about creating a new repository that
starts its life as a clone of something else, but a truly empty
one in which the initial commit is made.  And how often would
one want to view "diff" from void while preparing for that
initial commit?  Given both that low frequency _and_ the general
uselessness of the output from either of the above shell
scripts, would it be worth "fixing"?

I do not think it adds any real practical value, and does not
even have much to do with being user friendly.  I would put it
in the "when somebody is really bored and has nothing better to
do, then this _could_ be done" category.


* Re: Cleaning up git user-interface warts
  2006-11-16  3:12                         ` Linus Torvalds
  2006-11-16 10:31                           ` Junio C Hamano
@ 2006-11-16 10:45                           ` Han-Wen Nienhuys
  2006-11-16 11:11                             ` Junio C Hamano
  2006-11-16 16:23                             ` Linus Torvalds
  2006-11-16 23:00                           ` Johannes Schindelin
  2 siblings, 2 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16 10:45 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, git

Linus Torvalds escreveu:
>>>  - git itself has now done it that way for the last 18 months, and the
>>> fact is, the people _complaining_ are a small subset of the people who
>>> actually use git on a daily basis and don't complain.
>>
>> that's not a good argument; the set of git users is a small subset of those
>> that looked at git, and dismissed it because they couldn't wrap their heads
>> around it. 
> 
> And I've said this again, and I'll say it once more: that has basically 
> _nothing_ to do with whether you spell "pull" as "pull" or "merge".
> 
> The reason people have trouble wrapping their heads around git is because 
> they have been braindamaged by CVS and SVN, and just don't understand the 
> fairly fundamental new concepts and workflow.

> I claim that those "annoying little issues" are totally made up by
> people who had trouble wrapping their minds about git, and then make up
> reasons that have nothing to do with reality for why that might be so.
Let me put this more personally: I continue to be bitten by stupid 
naming issues, and the myriad of little mostly non-orthogonal commands.
My head is doing just fine otherwise, and has no problems wrapping itself
around the core of GIT.  I've also used Darcs, which is much less
overwhelming, for almost a year.

This is not about CVS or SVN, so don't put them up as a strawman.
If you want to argue that my brain is warped, use other distributed VCs 
as an example.

The following

   mkdir x y
   cd x
   hg init
   echo hoi > bla
   hg add
   hg commit -m 'yes, I am also too stupid to refuse explicit empty commit messages'
   cd ../y
   hg init
   hg pull ../x

pretty much works the same in Darcs, bzr and mercurial.

With GIT, this is what happens

[hanwen@haring y]$ git pull ../x
fatal: Needed a single revision
Pulling into a black hole?

[hanwen@haring y]$ git fetch ../x
warning: no common commits
remote: Generating pack...
Done counting 3 objects.
Deltifying 3 objects.
  100% (3/3) done
Total 3, wremote: ritten 3 (delta 0), reused 0 (delta 0)
Unpacking 3 objects
  100% (3/3) done

[hanwen@haring y]$ git checkout
fatal: ambiguous argument 'HEAD': unknown revision or path not in the 
working tree.
Use '--' to separate paths from revisions
fatal: Not a valid object name HEAD

[hanwen@haring y]$ git branch master
fatal: Needed a single revision

at this point, I resort to adding a bogus commit and/or editing 
.git/HEAD by hand. I'm sure there is a saner way of doing it, but I 
still haven't found out what it is.

This might not be typical GIT use, but it does show the typical GIT user 
experience, at least mine.

If you want to have another example of how not to design a 
user-interface, try the above on Monotone.

> That's totally different from then arguing about stupid naming issues.
> 
> People seem to believe that changing a few names or doing other totally 
> _minimal_ UI changes would somehow magically make things understandable. I 
> claim that isn't so at all. The fact is, git is different from CVS and 
> SVN, and git _has_ to be different from CVS and SVN. It has to be 
> different because the whole model of CVS and SVN is simply fundamentally 
> BROKEN.
> 
>> It's worth trying to get those on board by fixing the annoying
>> little issues that have popped up in this thread.
> 
> 
> Let's face it, you could just alias "merge" to "pull", and it wouldn't 
> really change ANYTHING.

I don't want ANYTHING to really change, I just want a sane interface to it.


-- 
  Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: Cleaning up git user-interface warts
  2006-11-16  5:12           ` Petr Baudis
@ 2006-11-16 10:45             ` Junio C Hamano
  2006-11-16 13:43               ` Petr Baudis
  2006-11-16 21:49             ` Junio C Hamano
  2006-11-17  0:11             ` Han-Wen Nienhuys
  2 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 10:45 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Carl Worth, git, Andy Whitcroft, Nicolas Pitre

Petr Baudis <pasky@suse.cz> writes:

> (v) Library issues...
> Git has the advantage of
> simply putting that part in C, which is though something I should've
> been doing more frequently too.

It should be stressed that git-core plumbing written in C is not
just for git Porcelain-ish, and it will continue to be a shared
service.  We would add core support for what Porcelains need and
we would try hard to keep them generic enough so that other
Porcelains can use them.  Keeping the core and Porcelain-ish in
the same project has made it easier to keep them in sync and to
find and add missing features that would benefit Porcelains (not
limited to git Porcelain-ish).  But that should not be mistaken
as plumbing somehow belongs more to git Porcelain-ish than to
Cogito or others.

I also think you should take credit for some core improvements
you did yourself (e.g "ls-files -t" format was originally added
for the sole purpose of helping Cogito, but now others use it,
too).


* Re: Cleaning up git user-interface warts
  2006-11-16 10:45                           ` Han-Wen Nienhuys
@ 2006-11-16 11:11                             ` Junio C Hamano
  2006-11-16 11:47                               ` Junio C Hamano
  2006-11-16 13:03                               ` Han-Wen Nienhuys
  2006-11-16 16:23                             ` Linus Torvalds
  1 sibling, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 11:11 UTC (permalink / raw)
  To: hanwen; +Cc: git

Han-Wen Nienhuys <hanwen@xs4all.nl> writes:

You claim it is _an interface_ issue but it is not.

> With GIT, this is what happens
>
> [hanwen@haring y]$ git pull ../x
> fatal: Needed a single revision
> Pulling into a black hole?

You asked it to fetch from the neighbour repository and merge it
into your current branch which does not exist (I presume that
you omitted to describe what you did in directory y/ and I am
assuming you did "mkdir y && cd y && git init-db" and nothing
else).  You are pulling into a black hole.

> [hanwen@haring y]$ git fetch ../x
>...
> [hanwen@haring y]$ git checkout

You fetched without telling it in which tracking branch to store
what you fetched, and as a result your HEAD is not updated, so
your current branch still does not exist.  A failure from
checking out nothingness is not an interface issue; expectation
for it to work is a concept level issue.

> [hanwen@haring y]$ git branch master
> fatal: Needed a single revision

You are not at any commit yet and you try to create a branch?

Of course, the "right" (in some sense of the word) thing is to
do "git clone x y" in the parent directory, without creating y
upfront.

If you have an empty y to begin with, then you can do this:

	$ git fetch ../x :origin
	$ git reset --hard origin

which would mirror a part of what "git clone" would have done
for you.  It copies from the other repository, stores the tip in
your tracking branch called "origin", and makes your HEAD
the same as origin.  After these two commands, you would have
two branches, origin and master, and you will be on master.

You can name 'origin' any way you want.  You might want to name
it 'x' to make it clear (to yourself) that it is used to track
what will happen in the neighboring repository 'x'.  Also, you
would most likely be fetching and merging from the same ../x
from now on, so it might be handy to set up the remotes for it:

	$ cat >.git/remotes/x <<EOF
	URL: ../x
	Pull: master:origin
	EOF

Then subsequent work of yours would be done on 'master' branch
(you have only two branches, and origin is a tracking branch so
you will never make commits on it, which means the above is a
logical consequence), and from time to time you would sync with
whoever is working in ../x

	$ git pull x

Here, 'x' is just a shorthand which looks up the URL: and Pull: line
through .git/remotes/x.  If your .git/remotes/ file was named origin
(not x), you could even have written:

	$ git pull

because pull defaults to 'origin' (without any other configuration).
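
The fetch-then-reset sequence above can be tried end to end with a
throwaway pair of repositories.  What follows is a minimal sketch, not
a definitive recipe: it assumes a present-day git (one that accepts
"git init <dir>" and allows "git reset --hard" on a branch that has no
commits yet).  The scratch directory, the file name and the identity
values are placeholders, and the refspec source is written out as HEAD,
which is an assumption about what ":origin" is meant to fetch.

```shell
#!/bin/sh
# Sketch only: populate an empty repository from a neighbouring one,
# following the fetch + reset steps described in the message.
set -eu

work=$(mktemp -d)            # scratch area; the names x and y follow the thread
cd "$work"

# Repository x with a single commit (identity values are placeholders).
git init -q x
cd x
echo hello >hello.txt
git add hello.txt
git -c user.name=Example -c user.email=example@example.invalid \
	commit -q -m "initial commit"
cd ..

# Repository y starts out empty: no commits, so no branch tip yet.
git init -q y
cd y

# Fetch the tip of x and store it in a local branch named "origin".
# The source side is spelled out as HEAD here (an assumption; the
# message writes the refspec as ":origin").
git fetch -q ../x HEAD:origin

# Point the current, still unborn, branch at that commit and check it out.
git reset -q --hard origin

git branch                   # lists "origin" plus the current branch
```

After the script finishes, y holds the same commit as x and has a
checked-out working tree, which is the part of "git clone" that the
two commands mirror.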

>> Let's face it, you could just alias "merge" to "pull", and it
>> wouldn't really change ANYTHING.
>
> I don't want ANYTHING to really change, I just want a sane interface to it.

I agree that you do not want to change anything.  You just
needed a bit of handholding, because you deviated from the
cookbook usage, to correct your course.


* Re: Cleaning up git user-interface warts
  2006-11-16  4:21                         ` Junio C Hamano
@ 2006-11-16 11:34                           ` Alexandre Julliard
  2006-11-16 14:01                             ` Petr Baudis
  2006-11-16 16:07                           ` Theodore Tso
  1 sibling, 1 reply; 1752+ messages in thread
From: Alexandre Julliard @ 2006-11-16 11:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Theodore Tso, git, Nicolas Pitre, Linus Torvalds

Junio C Hamano <junkio@cox.net> writes:

> I would rather say "use 'git branch' to make sure if you are
> ready to merge".  Who teaches not to use "git pull"?

We do that for Wine. The problem is that we recommend using git-rebase
to make it easier for occasional developers to keep a clean history,
and rebase and pull interfere badly.

The result is that we recommend always using fetch+rebase to keep up
to date, but this is confusing many people too, because git-fetch
appears to do a lot of work yet leaves the working tree completely
unchanged, and git-rebase doesn't do anything (since in most cases
they don't have commits to rebase) but has an apparently magical
side-effect of updating the working tree.
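
For readers who want to see the fetch+rebase routine in isolation, here
is a minimal sketch under stated assumptions: it uses a present-day git,
where a clone keeps upstream branches under "origin/<branch>" (which is
not necessarily the layout in use when this was written), and every
repository name, file name and identity below is a placeholder.

```shell
#!/bin/sh
# Sketch only: keep a contributor's branch up to date with fetch + rebase.
set -eu

work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the project's public repository.
git init -q upstream
cd upstream
git config user.name "Upstream Example"
git config user.email upstream@example.invalid
echo one >file.txt
git add file.txt
git commit -q -m "upstream: first commit"
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
cd ..

# The occasional developer clones and makes one local fix.
git clone -q upstream contrib
cd contrib
git config user.name "Contributor Example"
git config user.email contrib@example.invalid
echo fix >fix.txt
git add fix.txt
git commit -q -m "contrib: local fix"
cd ..

# Meanwhile upstream moves on.
cd upstream
echo two >>file.txt
git commit -q -a -m "upstream: second commit"
cd ..

cd contrib
# fetch only updates the remote-tracking branch; the working tree is untouched.
git fetch -q origin
# rebase replays the local fix on top of the new upstream tip and
# updates the working tree while doing so.
git rebase -q "origin/$branch"

git --no-pager log --oneline   # linear history, no merge commit
```

The point of the sketch is the last two commands: the fetch leaves the
checked-out files alone, and it is the rebase that brings them up to
date, which matches the confusion described above.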

Ideally it should be possible to have git-rebase do the right thing
even if the branch has been merged into; then we could tell people to
always use git-pull, and when they get confused by seeing merges in
their history have them do a git-rebase to clean things up.

-- 
Alexandre Julliard


* Re: Cleaning up git user-interface warts
  2006-11-16  3:02                                             ` Michael K. Edwards
@ 2006-11-16 11:35                                               ` Andreas Ericsson
  0 siblings, 0 replies; 1752+ messages in thread
From: Andreas Ericsson @ 2006-11-16 11:35 UTC (permalink / raw)
  To: Michael K. Edwards; +Cc: git

Michael K. Edwards wrote:
> On 11/15/06, Linus Torvalds <torvalds@osdl.org> wrote:
>> Actually, with different people involved it's _much_ better to do it in
>> one shot.
>>
>> Why? Because doing a separate "fetch to local space" + "merge from local
>> space" actually loses the information on what you are merging.
>>
>> It's a lot more useful to have a merge message like
>>
>>         Merge branch 'for-linus' of 
>> git://one.firstfloor.org/home/andi/git/linux-2.6
>>
>> than one like
>>
>>         Merge branch 'for-linus'
>>
>> which is what you get if you fetched it first.
> 
> Full ACK from a platform integrator's perspective.  Local merge is
> great for trial runs but the history in a persistent branch should be
> as self-contained and self-explanatory as possible.  It shouldn't
> depend on what I name local tracking branches, which are just a
> convenience so that I can still do trial runs when my connectivity is
> broken.
> 

[...]

> 
> Coming from me, this is all rather theoretical, as I haven't been
> using this particular tool for the purpose long enough to have an
> independent opinion.  But for what it's worth, the workflow Linus
> describes isn't just for the guy at the top of the pyramid.
> 

I think it's unfortunate that git was originally written by Linus, since 
he so obviously is "the guy at the top of the pyramid" in many more 
senses than just "Linus said this and that patch was OK to commit", 
since git was designed to work like king Arthur's round table; "Linus is 
in the same circle as me, so of course we help each other out".

All suggestions I've been reading about tracking branches, 
separate-remotes and whatnot have their merit. If any of it gets 
implemented, I'd still like to be able to do one-shot pulls from remote 
repos *without* creating specific tracking branches for it. It's 
extremely useful to fetch other people's topic branches into my own 
"master" (or topic-branch) when I trust their changes to be good. Please 
consider that when you're hacking away on whatever changes to do.
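
A one-shot pull of this kind can be sketched as follows.  This is an
illustration under assumptions, not a description of any particular
setup: it assumes a present-day git (the --no-rebase and --no-edit
options to "git pull" are used to keep the run non-interactive), and
the repository names, the branch name "topic" and the identities are
all hypothetical.

```shell
#!/bin/sh
# Sketch only: a one-shot pull of someone else's topic branch,
# without setting up a tracking branch for it.
set -eu

work=$(mktemp -d)
cd "$work"

git init -q theirs
cd theirs
git config user.name "Their Example"
git config user.email theirs@example.invalid
echo base >base.txt
git add base.txt
git commit -q -m "base"
cd ..

git clone -q theirs mine

# They start a topic branch after the clone was made.
cd theirs
git checkout -q -b topic
echo feature >feature.txt
git add feature.txt
git commit -q -m "topic: add feature"
cd ..

# Local work on our side, so that the pull produces a real merge commit.
cd mine
git config user.name "My Example"
git config user.email mine@example.invalid
echo local >local.txt
git add local.txt
git commit -q -m "local work"

# Merge their topic branch directly from its location.
git pull -q --no-rebase --no-edit "$work/theirs" topic

# The merge subject names both the branch and where it came from.
git --no-pager log -1 --format=%s
# No local ref for "topic" was created by the pull.
git for-each-ref --format='%(refname)'
```

The merge commit records the branch name and its source location in
its subject line, so the history stays self-explanatory even though no
tracking branch exists locally.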

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se


* Re: Cleaning up git user-interface warts
  2006-11-16 11:11                             ` Junio C Hamano
@ 2006-11-16 11:47                               ` Junio C Hamano
  2006-11-16 13:03                               ` Han-Wen Nienhuys
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 11:47 UTC (permalink / raw)
  To: hanwen; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Han-Wen Nienhuys <hanwen@xs4all.nl> writes:
>
>> [hanwen@haring y]$ git pull ../x
>> fatal: Needed a single revision
>> Pulling into a black hole?

Having said all that, I happen to think that this particular
case of pulling into void could deserve to be special cased to
pretend it is a fast forward (after all, nothingness is an
ancestor of anything), if only to make new people's first
experience more pleasant.

Working from nothingness is something not usually done in
everyday work, so from a practical and technical point of view it
does not add much _real_ value to the people who actually use
the system, but nevertheless, new people typically start
learning the system from either a cloned repository (which I
believe is covered by the existing tools fairly well) or
emptiness (which bit us here in a bad way), and making the
first experience more pleasant to new people has a positive
value of flattening the learning curve.

So please consider that this is classified as a bug.


* Re: Cleaning up git user-interface warts
  2006-11-16  4:26                                   ` Theodore Tso
@ 2006-11-16 11:50                                     ` Andreas Ericsson
  2006-11-16 16:30                                       ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Andreas Ericsson @ 2006-11-16 11:50 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Linus Torvalds, Nicolas Pitre, Michael K. Edwards, git

Theodore Tso wrote:
> On Wed, Nov 15, 2006 at 12:40:43PM -0800, Linus Torvalds wrote:
>> And yes, this is why you should NOT try to use the same naming as "hg", 
>> for example. Last I saw, hg still didn't even have local branches, To 
>> mercurial, repository == branch, and that's it. It was what I came from 
>> too, and I used to argue for using git that way too. I've since seen the 
>> error of my ways, and git is simply BETTER. 
> 
> Actually, that's not true.  Mercurial has local branches, just as git
> does.  Some people choose not to *use* this particular feature, and
> use the BK style repository == branch, but that's mainly because it's
> conceptually easy for them, and a number of BK refugees are very
> happily using Hg.  
> 
> It's probably because of the BK refugee population that after you do
> an hg pull, it will warn you that you need to do an "hg update" in
> order to merge the working directory up to the latest version that was
> just pulled --- and this change was made precisely because Hg supports
> local branches, and merging with the current branch isn't always the
> right thing, unlike with BK.
> 
>> And the concept of local branches is exactly _why_ you have to have 
>> separate "fetch" and "pull", but why you do _not_ need a separate "merge" 
>> (because "pull ." does it for you).
> 
> It's just that the semantics are different, and many developers have
> to use multiple DSCM's, depending on what project they happen to be
> developing on.  So the reality is that there are people who have to
> use bzr, git, and hg, all at the same time.  And while eventually
> newbies will figure out and remember that "git pull ." == "merge", the
> naming is simply confusing, that's all.  (What does "pull" have to do
> with "merge"?  It's not at all obvious.)  
> 
> For someone who uses git full-time, and to the exclusion of all other
> systems, I'm sure it's not a problem at all.


It seems we should, cheaply, be able to avoid a large part of the 
confusion by

* Mentioning git-fetch before git-pull in all documentation newborn 
gitizens are likely to come across. Most git-users aren't Linus, and for 
every successful project the maintainers are outnumbered 100 to 1 by the 
contributors. Those projects are successful *because* maintainers are 
heavily outnumbered so we should make it easier for contributors by 
teaching them the right things from the start and possibly have a 
separate man-page for maintainer (git-{maintainer,developer} man-pages, 
anyone?).
* Creating "git update" which might possibly be an internal alias to 
"git pull", except that it should read .git/remotes/* by default unless 
a specific remotes-file is specified.
* Renaming git-merge to git-merge-driver
* Implementing a git-merge that actually does what its name implies, 
possibly by making it an internal alias to pull, but with these differences:
   - It always merges into your current branch.
   - It understands "git merge branch" as well as "git merge . branch".

This is just the very low-hanging fruit. If we take these steps and let 
things cool down a bit, it would probably be proper to take a fresh look 
at this in a couple of months.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se


* Re: Cleaning up git user-interface warts
  2006-11-16 11:11                             ` Junio C Hamano
  2006-11-16 11:47                               ` Junio C Hamano
@ 2006-11-16 13:03                               ` Han-Wen Nienhuys
  2006-11-16 13:11                                 ` Han-Wen Nienhuys
  1 sibling, 1 reply; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16 13:03 UTC (permalink / raw)
  To: git; +Cc: git

Junio C Hamano wrote:
> You claim it is _an interface_ issue but it is not.

>> I don't want ANYTHING to really change, I just want a sane interface
>> to it.
>
> I agree that you do not want to change anything.  You just
> needed a bit of handholding, because you deviated from the
> cookbook usage, to correct your course.

Users (well, I do at least) start fiddling with systems to find out how 
they work.   Reading the manual is usually done as a last resort. I 
think this is pretty well documented in usability research.

I'm trying to show how GIT is badly suited to this. Your response is to 
explain to me what I should have done. That's nice, but that approach 
doesn't scale, because you don't reach the dozens of users out there who 
try the same, fail and give up.

If you really want to find out the weaknesses, you'd have to sit someone 
new to git in front of a computer, and let him figure how to operate it, 
while videotaping everything.

Writing a manual for newbies is also an effective (and simpler and 
cheaper) approach of figuring out what needs to be changed.



As another example:  annoyances regarding program invocation

  - option handling: "-x -f -z" is not equivalent to "-xfz", and
"--max-count 1" doesn't work; it needs an '=' ("--max-count=1")

  - git --help lists an unordered set, which is too long to scan quickly.
I'd expect that list to contain either everything or the minimum set for
daily use, i.e. the set introduced in a first tutorial.  Why are merge,
prune, verify-tag there?

Try "bzr help" for comparison.

  - --pretty option with wholly uninformative options full, medium, 
short, raw.  It's not even documented what each option does.


I can go on with listing idiosyncrasies, but my point is not to get help 
from you, but rather to show how git can be improved.


>> With GIT, this is what happens
>>
>> [hanwen@haring y]$ git pull ../x
>> fatal: Needed a single revision
>> Pulling into a black hole?
> 
> You asked it to fetch from the neighbour repository and merge it
> into your current branch which does not exist (I presume that
> you omitted to describe what you did in directory y/ and I am
> assuming you did "mkdir y && cd y && git init-db" and nothing
> else).  You are pulling into a black hole.

as you remark in the other reply, there is IMO no reason for not having 
an empty 'master' branch. If master + HEAD gets created on the first 
commit, it might as well be created on the init-db.

-- 
  Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: Cleaning up git user-interface warts
  2006-11-16 13:03                               ` Han-Wen Nienhuys
@ 2006-11-16 13:11                                 ` Han-Wen Nienhuys
  0 siblings, 0 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16 13:11 UTC (permalink / raw)
  To: git

Han-Wen Nienhuys wrote:

> I can go on with listing idiosyncrasies, but my point is not to get help 
> from you, but rather to show how git can be improved.

oh, and another annoying one: git's insistence on firing up a pager if 
there is nothing to page, eg. try

   git-log je-n-existe-pas

-- 
  Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: Cleaning up git user-interface warts
  2006-11-15 18:11                     ` Nicolas Pitre
@ 2006-11-16 13:21                       ` Karl Hasselström
  0 siblings, 0 replies; 1752+ messages in thread
From: Karl Hasselström @ 2006-11-16 13:21 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

On 2006-11-15 13:11:36 -0500, Nicolas Pitre wrote:

> On Wed, 15 Nov 2006, Junio C Hamano wrote:
>
> > Nicolas Pitre <nico@cam.org> writes:
> >
> > > But again I think it is important that the URL to use must be a
> > > per branch attribute i.e. attached to "default/master" and not
> > > just "default". This way someone could add all branches of
> > > interest into the "default" group even if they're from different
> > > repositories, and a simple get without any argument would get
> > > them all.
> >
> > I think the "one group per one remote repository" model is a lot
> > easier to explain. At least when I read your first "branch group"
> > proposal that was I thought was going on and I found it quite
> > sensible (and it maps more or less straightforwardly to the way
> > existing .git/refs/remotes is set up by default).
>
> I think one group per remote repo is how things should be by default
> too. But we should not limit it to that if possible.

Without the limitation, we risk name collisions when getting all
branches from the remote repository (that is, including any new
branches we previously didn't know about).

-- 
Karl Hasselström, kha@treskal.com


* Re: Cleaning up git user-interface warts
  2006-11-16 10:45             ` Junio C Hamano
@ 2006-11-16 13:43               ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16 13:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carl Worth, git, Andy Whitcroft, Nicolas Pitre

On Thu, Nov 16, 2006 at 11:45:46AM CET, Junio C Hamano wrote:
> Petr Baudis <pasky@suse.cz> writes:
> 
> > (v) Library issues...
> > Git has the advantage of
> > simply putting that part in C, which is though something I should've
> > been doing more frequently too.
> 
> It should be stressed that git-core plumbing written in C is not
> just for git Porcelain-ish, and it will continue to be shared
> service.  We would add core support for what Porcelains need and
> we would try hard to keep them generic enough so that other
> Porcelains can use them.  Keeping the core and Porcelain-ish in
> the same project has made it easier to keep them in sync and to
> find and add missing features that would benefit Porcelains (not
> limited to git Porcelain-ish).  But that should not be mistaken
> as plumbing somehow belongs more to git Porcelain-ish than to
> Cogito or others.

  Of course, I didn't mean to say that. I should more often do things
like adding --stdin to the fetchers. For one thing, I'm used to working
with a fixed set of system tools, and extending Git with the
functionality I want means changing my thinking mode and "jumping out of
the system" a bit. For another, I cannot use the improvements
in Cogito right away (at least not in the main branch) but have to
wait for the next Git release; but this is mostly just an excuse. :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-16 10:09                                   ` Robin Rosenberg
@ 2006-11-16 13:46                                     ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16 13:46 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Carl Worth, Junio C Hamano, Andy Parkins, git

On Thu, Nov 16, 2006 at 11:09:13AM CET, Robin Rosenberg wrote:
> On Thursday 16 November 2006 04:21, Petr Baudis wrote:
> > Another point is, if using _just_ _git_ requires you to learn "all those
> > git commands too" from git-commit-tree up (yes it does! if you want your
> > authorship information to be correct), something is wrong.
> 
> When/why do I need git-commit-tree? Isn't git-commit enough?

As I said, when you need to find out how to set up your authorship
information. It's documented as deep as on the git-commit-tree level.
BTW, the documentation is another important part of the
plumbing/porcelain separation; it's not only about the list of commands
but also that porcelain documentation should be reasonably
self-contained and not require users to peek at plumbing docs in order
to find out many things. It's also a consideration I take into account
when maintaining Cogito documentation.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-15  4:32             ` Nicolas Pitre
                                 ` (2 preceding siblings ...)
  2006-11-15 12:15               ` Andreas Ericsson
@ 2006-11-16 13:58               ` Petr Baudis
  3 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16 13:58 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git, Andy Whitcroft, Carl Worth

On Wed, Nov 15, 2006 at 05:32:06AM CET, Nicolas Pitre wrote:
> 1) make "git init" an alias for "git init-db".
> 
> What's the point of "-db"?  Sure we're initializing the GIT database.  
> But who cares?  The user doesn't care if GIT uses a "database" or 
> whatever.  And according to some people's definition of a "database" it 
> could be argued that GIT doesn't use a database at all in the purist 
> sense of it. What the user wants is to get started and "init" (without 
> the "-db" is so much more to the point. Doesn't matter if incidentally 
> it happens to be the same keyword HG uses for the same operation because 
> we are not afflicted by the NIH disease, right? And it has 3 chars less 
> to type which is for sure a premium improvement to the very first GIT 
> user experience!

(This is somewhat related to the HEAD issue, e.g.
<7v1wo3d6g4.fsf@assigned-by-dhcp.cox.net>, by virtue of basically
eliminating it.)

Let's see. If you are adding the alias, you can as well add some
porcelain stuffing in it, too.

What are the 99% of use cases when doing "init"?

(a) You are going to do an initial commit right away; the repository is
at this point basically useless for anything but initial commit. So you
might as well have "init" just perform it for you right away.

(b) You are setting up a bare repository on a server and you will push
to it in a minute. Cogito has a separate cg-admin-setuprepo command for
it, which will also prepare it for usage by dumb servers and optionally
for shared usage in a group of users. Git could have something similar.


> 2) "pull" and "push" should be symmetrical operations
..snip..
> Conclusion:  git-pull must not perform any merge.  It is the symmetrical 
> operation of a push meaning that it pulls content from a remote branch 
> and does no more.  People understand that pretty well.  This makes 
> git-fetch redundant (or an alias to git-pull) in that case, and again we 
> don't mind it becoming similar to HG because we admit HG was right 
> about it.

If you _really_ want to do it in Git, the only sensible way to do it is
to stop using the "pull" verb for a command name altogether for at least
some rather long period of time, otherwise that's a blatant backwards
compatibility breakage.

> 3) remote branch handling should become more straight forward.
> 
> OK! Now that we've solved the pull issue and that everybody agrees with 
> me (how can't you all agree with me anyway) let's have a look at remote 
> branches.  It should be simple:
..snip..

By the way, due to the way you describe it, it's not all that clear to
me how is this (in)compatible with the current way we do it, on other
than the usage and git-pull's auto-creation magic level.

Is it that what you are describing _is_ in fact what we do support now,
with "branch groups" meaning "remotes" etc, and you are only proposing
some enhancements to automatically create remotes in git-pull, or are
there some other differences I've missed?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-16 11:34                           ` Alexandre Julliard
@ 2006-11-16 14:01                             ` Petr Baudis
  2006-11-16 15:48                               ` Alexandre Julliard
  0 siblings, 1 reply; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16 14:01 UTC (permalink / raw)
  To: Alexandre Julliard
  Cc: Junio C Hamano, Theodore Tso, git, Nicolas Pitre, Linus Torvalds

On Thu, Nov 16, 2006 at 12:34:27PM CET, Alexandre Julliard wrote:
> Junio C Hamano <junkio@cox.net> writes:
> 
> > I would rather say "use 'git branch' to make sure if you are
> > ready to merge".  Who teaches not to use "git pull"?
> 
> We do that for Wine. The problem is that we recommend using git-rebase
> to make it easier for occasional developers to keep a clean history,
> and rebase and pull interfere badly.

How do those developers submit their changes? Do they push? If they do,
git-rebase can be saving one merge at most, and the merge is actually a
good thing (someone should write some nice standalone writeup about
that).

If they don't have push access and maintain their patches locally until
they get accepted, perhaps it would be far simpler for them to use
StGIT?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-16 14:01                             ` Petr Baudis
@ 2006-11-16 15:48                               ` Alexandre Julliard
  0 siblings, 0 replies; 1752+ messages in thread
From: Alexandre Julliard @ 2006-11-16 15:48 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Junio C Hamano, Theodore Tso, git, Nicolas Pitre, Linus Torvalds

Petr Baudis <pasky@suse.cz> writes:

> How do those developers submit their changes? Do they push? If they do,
> git-rebase can be saving one merge at most, and the merge is actually a
> good thing (someone should write some nice standalone writeup about
> that).

No, they use git-format-patch and mail them in.

> If they don't have push access and maintain their patches locally until
> they get accepted, perhaps it would be far simpler for them to use
> StGIT?

For regular developers, sure. But regular developers will need to
properly understand the git model anyway, and then they will be able to
make sense even of the standard git commands ;-)  The problem is that
there isn't a smooth progression to that point.

At first, a user will simply want to download and build the code, and
for that git-pull works great, it's a one-stop command to update their
tree.

Then after a while the user will fix a bug here and there, and at that
point git-rebase is IMO the best tool, it's reasonably easy to use,
doesn't require learning other commands, and once the patch is
accepted upstream it nicely gets the tree back to the state that the
user is familiar with.

The problem is that rebase doesn't work with pull, so the user needs
to un-learn git-pull and start using git-fetch; it's to avoid this
that we recommend using git-fetch from the start, which is unfortunate
since it makes things harder for beginners.

-- 
Alexandre Julliard


* Re: Cleaning up git user-interface warts
  2006-11-16  4:21                         ` Junio C Hamano
  2006-11-16 11:34                           ` Alexandre Julliard
@ 2006-11-16 16:07                           ` Theodore Tso
  2006-11-16 16:49                             ` Theodore Tso
  1 sibling, 1 reply; 1752+ messages in thread
From: Theodore Tso @ 2006-11-16 16:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Nicolas Pitre, Linus Torvalds

On Wed, Nov 15, 2006 at 08:21:36PM -0800, Junio C Hamano wrote:
> Theodore Tso <tytso@mit.edu> writes:
> > So with Bitkeeper, with "bk pull" there was never any question about
> > which branch ("line of development") you would be merging into after
> > doing a "bk pull", since there was only one LOD, and given that BK had
> > the rule that a within a LOD only one tip was allowed, a "bk pull"
> > _had_ to do a merge operation.   
> 
> I've never used Bk and I really appreciate your comments here.
> 
> > If you are operating on your local development branch, the reality is
> > that merging is probably not the right answer in the general case,
> 
> I agree, but I wonder why you are pulling/fetching (with or
> without merge) if you are operating on your local development
> branch (implying that you are in the middle of something else).

Well, when I was using BitKeeper, I never would.  Bitkeeper has what
Linus calls the broken "repository == branch" model.  So normally I
would have one repository where I would track the upstream branch, and
only do bk pull into that branch.  I would do my hacking in another
repository (i.e., branch), and periodically keep track of what was going
on in mainline by cd'ing to the mainline repository and doing the bk
pull there.  

The challenge when you put multiple branches into a single repository,
is you have to keep track of which branch you happen to be in.  In the
BK world, this was obvious because it would show up in my shell
prompt:

<tytso@candygram>       {/usr/src/linux-2.6}
2% 

(OK, obviously I'm in the Linux 2.6 upstream repository)

In a system where you need to keep track of what branch you are in via
an SCM-specific local state information, it's easy to get confused and
do a pull when you are in the "wrong" branch, or while you have local
state in your working directory.   

What I currently do (and I'm sure I'm being really horrible and need
to be say 100 "Hail, Linus"'es for penance for not adhering staying in
the one true distributed state of grace) is that I keep an entirely
separate Linux 2.6 git repository just to make sure I never get
confused about what branch I might happen to be in when I do the "git
pull" --- and yeah, I could have used "git fetch", but 3+ years of BK
usage plus Hg usage is hard to get away from.  I'm sure this is where
Linus would say that use of BK and Hg causes permanent brain damage,
a la Dijkstra's oft-quoted comment about use of Basic inducing
brain damage....

> I have to disagree with this.  In the simplest CVS-like central
> repository with single branch setup in which many "novice users"
> start out with, there is almost no need for "git fetch" nor
> tracking branch.  You pull, resolve conflicts, attempt to push
> back, perhaps gets "oh, no fast forward somebody pushed first",
> pull again, then push back.  So I am not sure where "you really
> do not want to use pull.  trust me" comes from.

I think the problem is the people who have had years of BK or Hg
experience.  Maybe it's more of a documentation problem; perhaps a
"git for BK" or "git for Hg" users is what's needed.  The problem
though is that while use of BK is definitely legacy, there are going
to be a lot of people who need to use both BK and Hg.   

> It is a different story for people who _know_ git enough to know
> what is going on.  They may be using multiple branches and
> interacting with multiple remote branches, and there are times
> you would want fetch and there are other times you would want
> pull.  But for them, I do think the suggestion would never end
> with "trust me" -- they would understand what the differences
> are.

Well, I think this is where git's learning curve challenges are.  Yes,
for users that are doing the stupidest, most simplistic usage models,
git is quite easy to use.  And I am willing to grant that for people
who are using the deepest, most complicated and most distributed
development, who understand multiple branches and the index, and all
of the deep git plumbing, there's also no problem.

The challenge is in between; to use a car analogy, git has a great
automatic transmission, and an extremely powerful "racing clutch".  But
for someone for whom the automatic transmission isn't good enough, as
they start to learn how to use the manual transmission, git's
extremely touchy "racing clutch" is much more difficult to master ---
especially in comparison to people who have learned to drive other,
more pedestrian "standard transmission" cars.  So people who try to
use git's racing clutch keep stalling out the car, and some give up in
frustration.

And maybe the problem is one that should be addressed only by lots of
training, but at the moment, that's the reason why I believe a number
of projects have chosen Hg instead of git; they need more than the
"stupid simple" git usage, but if they don't need the extreme power of
git, Hg is simpler for people to learn how to use.  The problem, of
course, comes when later on, the project finds out they really want
git's power, and now they have to deal with the repository conversion
as well as retraining their entire development community.

But hey, maybe this isn't a problem the git community wants to solve;
clearly git is optimized for the Linux kernel development, and maybe
it's too much to ask that it also work well for somewhat less
extremely distributed development models.  But in any case, that's why
I chose Hg for e2fsprogs.  At the time when I made my choice, git was
just too painful to learn how to use its more esoteric features, and
Hg was much closer to BK's model.  (Since then, Hg has added more
functionality, including better multiple branches in a repository
support, and it's gotten more complicated, but it's still much simpler
to teach someone how to use Hg than git.)

Regards,


* Re: Cleaning up git user-interface warts
  2006-11-16 10:45                           ` Han-Wen Nienhuys
  2006-11-16 11:11                             ` Junio C Hamano
@ 2006-11-16 16:23                             ` Linus Torvalds
  2006-11-16 16:42                               ` Han-Wen Nienhuys
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 16:23 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Thu, 16 Nov 2006, Han-Wen Nienhuys wrote:
> 
> This is not about CVS or SVN, so don't put them up as a strawman.
> If you want to argue that my brain is warped, use other distributed VCs as an
> example.

Your example has nothing at all to do with "pull" vs "fetch", though.

Your example is about something totally _different_, namely that under 
git, "git init-db" is _only_ for creating a _new_ project.

> The following
> 
>   mkdir x y
>   cd x
>   hg init
>   echo hoi > bla
>   hg add
>   hg commit -m 'yes, I am also too stupid to refuse explicit empty commit messages'
>   cd ../y
>   hg init
>   hg pull ../x
> 
> pretty much works the same in Darcs, bzr and mercurial.
> 
> With GIT, this is what happens
> 
> [hanwen@haring y]$ git pull ../x

Bzzt. This is where you went wrong, and you blamed "pull".

The way you do this in git is to NOT do "git init". Instead, you replace 
all the

	mkdir y
	cd ../y
	hg init
	hg pull ../x

with a simple

	git clone x y

and YOU ARE DONE.
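
A minimal sketch of that replacement, assuming a present-day git (where
"git init" is the usual spelling of "git init-db").  The scratch
directory and the committer identity are made up for the example:

```shell
set -eu
# Throwaway identity and scratch directory (both hypothetical).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)
cd "$work"

# The existing project "x": this is the one place where init belongs.
mkdir x
cd x
git init -q
echo hoi > bla
git add bla
git commit -q -m 'initial commit'
cd ..

# Tracking it needs no mkdir, no init and no pull: clone creates "y" itself.
git clone -q x y
```

Afterwards "y" has the same history as "x", a checked-out working tree,
and a recorded origin to pull from later.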

Now, we could certainly _make_ "git pull" work on an empty git project, 
but that has _nothing_ to do with what people have been talking about.

In fact, the fact that "git fetch" kind of works is not exactly accidental 
(because "git fetch" _is_ meant to add new local branches too), but all 
the problems you have with it are due to the SAME issue. You started 
without any branch at all, because you started with an empty git repo, and 
you're simply not _supposed_ to do that.

So current rule (and this is not new, it's always been true): the ONLY 
time you use "git init-db" is when you are going to start a totally new 
project. Never _ever_ otherwise. If you want to track another project, use 
"git clone".

> This might not be typical GIT use, but it does show the typical GIT user
> experience, at least mine.

It's not that it isn't typical, it's that you are using the wrong model. 
Maybe it's not well documented, I can easily give you that, but ALL your 
problems come from that fundamental starting point: you shouldn't have 
used "git init-db" in the first place.

Somebody want to document it?

Alternatively, we certainly _could_ make "git pull" just accept an empty 
git repo, and make it basically create the current branch.

(And we probably should improve the error message)

> I don't want ANYTHING to really change, I just want a sane interface to it.

The sane interface _exists_. It's called "git clone".


* Re: Cleaning up git user-interface warts
  2006-11-16 11:50                                     ` Andreas Ericsson
@ 2006-11-16 16:30                                       ` Linus Torvalds
  2006-11-16 17:01                                         ` Carl Worth
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 16:30 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Theodore Tso, Nicolas Pitre, Michael K. Edwards, git



On Thu, 16 Nov 2006, Andreas Ericsson wrote:
> 
> * Mentioning git-fetch before git-pull in all documentation newborn gitizens
> are likely to come across.

However, I also think it might make sense to talk about the _simple_ form 
of "git pull" first.

The form I use is actually a lot simpler (conceptually) than the "short" 
form.

When you do

	git pull <reponame> <branchname>

there are very few things that can confuse you (although trying to do it 
without a current branch at all is apparently one such thing ;). 

There are no local branches to worry about, and there aren't any issues 
about what the default repository or branchname on the remote side would 
be either.

So in many ways, if you use this format, you simply never have to worry. 
You may have to _type_ a bit more, so it's not the short or concise 
format, but it sure is the _simple_ format. There simply isn't anything to 
be confused about.

And yes, I actually tend to use this even for projects that I don't develop 
on, partly because the defaults for the short and concise formats are bad. 
For example, I follow the "modesetting" branch on the xorg intel graphics 
driver tree, and because I'm always on that branch, what I do is

	git pull origin modesetting

which works correctly (while "git pull" would _not_ have done the right 
thing: it would have picked the right repository, but it would have picked 
the "master" branch of that repository, not the "modesetting" branch).

And notice how I don't do _any_ development there, I just follow that 
branch. The "merge" will obviously always be a fast-forward, but that's 
exactly what I want. 
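
A small sketch of that follow-only workflow, assuming a present-day
git.  The "upstream" and "follower" directories and the identity are
stand-ins invented for the example; only the branch name comes from
the text above:

```shell
set -eu
# Throwaway identity and scratch directory (both hypothetical).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)
cd "$work"

# Stand-in upstream with a "modesetting" branch next to its default branch.
git init -q upstream
cd upstream
echo one > file
git add file
git commit -q -m 'first'
git branch modesetting
cd ..

# Follow only that branch; no local development happens here.
git clone -q upstream follower
cd follower
git checkout -q -b modesetting origin/modesetting

# Upstream moves the branch forward.
cd ../upstream
git checkout -q modesetting
echo two >> file
git commit -q -a -m 'second'

# Explicit repository and branch: nothing is left to the defaults.
cd ../follower
git pull -q origin modesetting
```

Because the follower never commits, the pull can only fast-forward.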

> Most git-users aren't Linus, and for every successful project the 
> maintainers are outnumbered 100 to 1 by the contributors.

Well, as mentioned, I think even for non-developers, doing pulls with 
explicit branchnames is actually perfectly sane.


* Re: Cleaning up git user-interface warts
  2006-11-15 23:33                                           ` Linus Torvalds
  2006-11-16  0:08                                             ` Nicolas Pitre
  2006-11-16  3:02                                             ` Michael K. Edwards
@ 2006-11-16 16:37                                             ` Carl Worth
  2006-11-16 17:57                                               ` Michael K. Edwards
  2 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-11-16 16:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Shawn Pearce, Nicolas Pitre, Michael K. Edwards, git

On Wed, 15 Nov 2006 15:33:43 -0800 (PST), Linus Torvalds wrote:
> It's a lot more useful to have a merge message like
>
> 	Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
>
> than one like
>
> 	Merge branch 'for-linus'

There's more information in the first, sure. But I absolutely don't
accept that it's necessarily more useful, and definitely not that this
is a good argument for using pull with a remote branch instead of
fetch followed by merge with a local branch.

First, the pull may just fast-forward in which case there's no message
at all. And we've been through that topic enough recently that we all
know that no important information is lost by not doing any separate
recording in that case.

So you can't turn around and argue that the remote URL information is
suddenly important when it just so happens that it's not a fast
forward.

> And in a truly distributed situation, "pull" is strictly more powerful
> than a separate "fetch" + separate "merge".

I don't buy it. In my usage, I have several different remote
repositories I'm interested in tracking, each with any number of
branches. What I really want is an easy command that fetches all of
those branches, (even new ones that I've never heard about---but never
any of their "tracking branches" that wouldn't be of interest to
me). And I want to do that once, to get the online-access-required
part over with and get all the data into my local repository where I
can start working with it.

As for the URL from which I'm fetching all this stuff, it's really not
interesting to me at all. The URL for "Keith's stuff" keeps changing
anyway---I have no interest in recording that. But I do think it's
worth recording that the commits came from Keith's repository. I do
that right now with a keith/ prefix for his branches. It could also be
done by bringing in his .git/description during the fetch and storing
it somewhere. But I honestly don't see how storing something like that
during a fetch would make the system any less distributed in any sense.
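
One way to sketch the "keith/" prefix idea, assuming a present-day git
with named remotes (which may differ from how the branches were
prefixed at the time).  The repository names and the identity below
are hypothetical:

```shell
set -eu
# Throwaway identity and scratch directory (both hypothetical).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)
cd "$work"

# Stand-in for Keith's repository, with an extra branch.
git init -q keith-repo
cd keith-repo
echo a > file
git add file
git commit -q -m 'base'
git branch experiment
cd ..

# My repository: the source is recorded as a short name, not as a URL
# in each merge message.
git init -q mine
cd mine
git remote add keith "$work/keith-repo"

# One online step: fetches every branch the remote has, new ones included.
git fetch -q keith

# Each tracking branch carries the "keith/" prefix.
git for-each-ref --format='%(refname:short)' refs/remotes/keith
```

If the URL moves, "git remote set-url keith <new-url>" updates it in
one place while the prefix stays the same.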

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-16 16:23                             ` Linus Torvalds
@ 2006-11-16 16:42                               ` Han-Wen Nienhuys
  2006-11-16 17:17                                 ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16 16:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds escreveu:

> So current rule (and this is not new, it's always been true): the ONLY 
> time you use "git init-db" is when you are going to start a totally new 
> project. Never _ever_ otherwise. If you want to track another project, use 
> "git clone".

Actually, only 2 weeks ago, you suggested that I share the website
and main source code for my project in a single repository for reasons
of organization.

In this setup I find it logical to do

  git init-db
  git pull ..url.. website/master

to wind up with just the 5mb website, instead of the complete 70mb
of packed source code with all of its branches and tags.

> It's not that it isn't typical, it's that you are using the wrong model. 
> Maybe it's not well documented, I can easily give you that, but ALL your 
> problems come from that fundamental starting point: you shouldn't have 
> used "git init-db" in the first place.
> 
> Somebody want to document it?
> 
> Alternatively, we certainly _could_ make "git pull" just accept an empty 
> git repo, and make it basically create the current branch.

Yes, I would like that.  


-- 

* Re: Cleaning up git user-interface warts
  2006-11-16 16:07                           ` Theodore Tso
@ 2006-11-16 16:49                             ` Theodore Tso
  0 siblings, 0 replies; 1752+ messages in thread
From: Theodore Tso @ 2006-11-16 16:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Nicolas Pitre, Linus Torvalds

On Thu, Nov 16, 2006 at 11:07:00AM -0500, Theodore Tso wrote:
> I think the problem is the people who have had years of BK or Hg
> experience.  Maybe it's more of a documentation problem; perhaps a
> "git for BK" or "git for Hg" users is what's needed.  The problem
> though is that while use of BK is definitely legacy, there are going
> to be a lot of people who need to use both BK and Hg.   

Err, what I meant to say is that there are going to be a lot of people
who will need to simultaneously use both git and Hg.


* Re: Cleaning up git user-interface warts
  2006-11-16 16:30                                       ` Linus Torvalds
@ 2006-11-16 17:01                                         ` Carl Worth
  2006-11-16 17:30                                           ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Carl Worth @ 2006-11-16 17:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andreas Ericsson, Theodore Tso, Nicolas Pitre, Michael K. Edwards,
	git

On Thu, 16 Nov 2006 08:30:55 -0800 (PST), Linus Torvalds wrote:
> The form I use is actually a lot simpler (conceptually) than the "short"
> form.
>
> When you do
>
> 	git pull <reponame> <branchname>

Yes, that's what the user almost always wants. The UI problem here is
that the conceptually simpler form is syntactically longer, (which
means users aren't likely to find it).

So if we can just get <reponame> and <branchname> to default
correctly, (based on the current branch name, and clone/fetch/pull
history), then the conceptually simple form ends up syntactically
simple as "git pull".

And I definitely don't have any problem with that. I'd love to be able
to teach that kind of simple thing to new users.

> driver tree, and because I'm always on that branch, what I do is
>
> 	git pull origin modesetting
...
> Well, as mentioned, I think even for non-developers, doing pulls with
> explicit branchnames is actually perfectly sane.

The behavior is sane, but having to always type the branch name
specifically because it never changes... that's a user-interface bug.

This is a good example of the kind of thing I wanted to hit when
starting this thread. I don't think there are any big conceptual
changes needed in git to make it easier for new users. But there are
little things that are problems that really should be fixed. Wouldn't
it be great to have the following exchange:

	User: How do I track on-going development in a branch?
	Master: Use "git pull"

Rather than:

	User: How do I track on-going development in a branch?
	Master: Use "git pull origin <name-of-branch-you-are-already-on>"

?

-Carl


* Re: Cleaning up git user-interface warts
  2006-11-16 16:42                               ` Han-Wen Nienhuys
@ 2006-11-16 17:17                                 ` Linus Torvalds
  2006-11-16 17:40                                   ` multi-project repos (was Re: Cleaning up git user-interface warts) Han-Wen Nienhuys
                                                     ` (2 more replies)
  0 siblings, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 17:17 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Thu, 16 Nov 2006, Han-Wen Nienhuys wrote:
> 
> Actually, only 2 weeks ago, you suggested that I share the website
> and main source code for my project in a single repository for reasons
> of organization.
> 
> In this setup I find it logical to do
> 
>   git init-db
>   git pull ..url.. website/master

I don't disagree per se. It should be easy to support, it's just that it's 
not traditionally been something we've ever done.

So the way you'd normally set up a single repo that contains multiple 
other existing repositories is to basically start with one ("git clone") 
and then add the other branches and "git fetch" them.

So again, instead of "git init-db" + "git pull", you'd just use "git 
clone" instead.

Note that there _is_ another difference between "git pull" and 
"fetch+merge". The difference being that "git pull" implicitly does the 
checkout for you (I say "implicitly", because that's the way the git 
merge conceptually works: we always merge in the working tree. That's not 
the only way it _could_ be done, though - for trivial merges, we could do 
them without any working tree at all, but we don't support that).

And that "git pull" semantic actually means that if you want a _bare_ 
repository, I think "git --bare init-db" + "git --bare fetch" actually 
does exactly the right thing right now too. But "git pull" would not be 
the right thing to use.

Btw, another normal way to generate a central "multi-headed repo" is 
to not use "pull" or "fetch" or "clone" at ALL, but I would likely do 
something like

	mkdir central-repo
	cd central-repo
	git --bare init-db

and that's it. You now have a central repository, and you _never_ touch it 
again in the central place except to repack it and do other "maintenance" 
(eg pruning, fsck, whatever).

Instead, from the _outside_, you'd probably just do

	git push central-repo mybranch:refs/heads/central-branch-name

(actually, you'd probably set up that branch-name translation of 
"mybranch:refs/heads/central-branch-name" in your remote description, but 
I'm writing it out in full as an example).
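
The two halves above can be sketched end to end as follows, assuming a
present-day git where "git init --bare" plays the role of
"git --bare init-db".  The "dev" directory and the identity are
invented for the example:

```shell
set -eu
# Throwaway identity and scratch directory (both hypothetical).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)
cd "$work"

# The central repository: created bare, then left alone.
git init -q --bare central-repo

# A developer repository with a local branch "mybranch".
git init -q dev
cd dev
echo hello > file
git add file
git commit -q -m 'work'
git branch mybranch

# From the outside: publish the branch under a different central name.
git push -q ../central-repo mybranch:refs/heads/central-branch-name
```

The refspec is "local-name:remote-ref", so any number of developers
can feed differently named heads into the same central repository.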

So there are many ways to do it. It just happens that "git init-db" 
followed by "git pull" is not one of them ;)

(And the real reason for that is simple: "git pull" simply wants to have 
something to _start_ with. It's not hugely fundamental, it's just how it 
was written).


* Re: Cleaning up git user-interface warts
  2006-11-16 17:01                                         ` Carl Worth
@ 2006-11-16 17:30                                           ` Linus Torvalds
  2006-11-16 17:44                                             ` Sean
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 17:30 UTC (permalink / raw)
  To: Carl Worth
  Cc: Andreas Ericsson, Theodore Tso, Nicolas Pitre, Michael K. Edwards,
	git



On Thu, 16 Nov 2006, Carl Worth wrote:
>
> On Thu, 16 Nov 2006 08:30:55 -0800 (PST), Linus Torvalds wrote:
> > The form I use is actually a lot simpler (conceptually) than the "short"
> > form.
> >
> > When you do
> >
> > 	git pull <reponame> <branchname>
> 
> Yes, that's what the user almost always wants. The UI problem here is
> that the conceptually simpler form is syntactically longer, (which
> means users aren't likely to find it).

Yeah. 

And this is something I absolutely agree with. Our default branches for 
"pull" are horrible. You can "fix" it, but you can only fix it by adding 
_explicit_ branches to your .git/config file by hand, so I don't think 
that's actually a real fix at all. We should just fix the default (where 
even a "I don't know what branch you want" _error_ would be preferable 
over the current situation).

Along with the "git checkout <tag>" thing, I think these two things are 
definitely worth just fixing.

> The behavior is sane, but having to always type the branch name
> specifically because it never changes... that's a user-interface bug.

Yeah. Each branch should

 (a) have a "default source" initialized on the initial "clone"

 (b) have a way to set the source afterwards

 (c) error out if you do just a "git pull" or "git pull remotename" if 
     there is no default branch for the current local branch for that 
     remote.

We actually have (b) in a weak form right now ("weak" because it requires 
you to manually edit the config file: we've got the mechanism, but not a 
nice UI for it), but (a) and (c) are just broken.
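
For (b), a sketch of setting the per-branch source from the command
line rather than in an editor.  It assumes a present-day git, where
"git config" writes the branch.<name>.remote and branch.<name>.merge
keys; the repositories and identity are made up for the example:

```shell
set -eu
# Throwaway identity and scratch directory (both hypothetical).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)
cd "$work"

git init -q upstream
cd upstream
echo one > file
git add file
git commit -q -m 'first'
git branch modesetting
cd ..

git clone -q upstream follower
cd follower
# Start the local branch with no default source at all.
git checkout -q -b modesetting --no-track origin/modesetting

# (b) record the default source for this branch.
git config branch.modesetting.remote origin
git config branch.modesetting.merge refs/heads/modesetting

# Upstream advances.
(cd ../upstream && git checkout -q modesetting && echo two >> file && git commit -q -a -m 'second')

# The short form now picks the right repository and the right branch.
git pull -q
```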

And yeah, we should allow pulling into a branch that hasn't been 
initialized.


* multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 17:17                                 ` Linus Torvalds
@ 2006-11-16 17:40                                   ` Han-Wen Nienhuys
  2006-11-16 18:21                                     ` Linus Torvalds
  2006-11-16 17:57                                   ` Cleaning up git user-interface warts Linus Torvalds
  2006-11-16 18:13                                   ` Carl Worth
  2 siblings, 1 reply; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16 17:40 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, git

Linus Torvalds escreveu:
> 
> On Thu, 16 Nov 2006, Han-Wen Nienhuys wrote:
>> Actually, only 2 weeks ago, you suggested that I share the website
>> and main source code for my project in a single repository for reasons
>> of organization.
>>
>> In this setup I find it logical to do
>>
>>   git init-db
>>   git pull ..url.. website/master
> 
> I don't disagree per se. It should be easy to support, it's just that it's 
> not traditionally been something we've ever done.
> 
> So the way you'd normally set up a single repo that contains multiple 
> other existing repositories is to basically start with one ("git clone") 

You're misunderstanding me: the multi-repo at git.sv.gnu.org is the
remote one. The example I gave was about locally creating a single
project repo from a remote multiproject repo. 

On a tangent: why is there no reverse-clone?  I have no shell access
to the machine, so when I created the remote repo, I had to push, and
ended up putting 1.2 Gb data on the server.

<looks at manpage>

is this send-pack? From UI perspective it would be nice if this could
also be done with clone,

  git clone . ssh+git://....

>And that "git pull" semantic actually means that if you want a _bare_ 
>repository, I think "git --bare init-db" + "git --bare fetch" actually

yes, this works. Two remarks:


* it needs

  website/master:master

otherwise you still don't have a branch.

* why are objects downloaded twice?  If I do

  git --bare fetch git://git.sv.gnu.org/lilypond.git web/master

it downloads stuff, but I don't get a branch. If I then do 

  git --bare fetch git://git.sv.gnu.org/lilypond.git web/master:master

it downloads the same stuff again. 
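
The first remark can be reproduced locally with a sketch like the one
below.  It assumes a present-day git and uses a made-up "multi"
repository in place of the real one; it only shows the branch
behaviour, not the repeated download:

```shell
set -eu
# Throwaway identity and scratch directory (both hypothetical).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)
cd "$work"

# Stand-in for the remote multi-project repository.
git init -q multi
cd multi
echo site > index.html
git add index.html
git commit -q -m 'website'
git branch web/master
cd ..

git init -q --bare website-only
cd website-only

# Source only: objects arrive and FETCH_HEAD is written, but no branch appears.
git fetch -q "$work/multi" web/master
git for-each-ref refs/heads      # prints nothing

# Source and destination: the local branch "master" is created.
git fetch -q "$work/multi" web/master:master
git for-each-ref refs/heads
```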

-- 
 Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen

* Re: Cleaning up git user-interface warts
  2006-11-16 17:30                                           ` Linus Torvalds
@ 2006-11-16 17:44                                             ` Sean
  0 siblings, 0 replies; 1752+ messages in thread
From: Sean @ 2006-11-16 17:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Carl Worth, Andreas Ericsson, Theodore Tso, Nicolas Pitre,
	Michael K. Edwards, git

On Thu, 16 Nov 2006 09:30:47 -0800 (PST)
Linus Torvalds <torvalds@osdl.org> wrote:

> Yeah. Each branch should
> 
>  (a) have a "default source" initialized on the initial "clone"
>
> (b) have a way to set the source afterwards
>
> (c) error out if you do just a "git pull" or "git pull remotename" if 
>     there is no default branch for the current local branch for that 
>     remote.

This would be _great_.  You just shouldn't have to hack at the
.git/config file to get reasonable default sources after a clone.
Or even for that matter after fetching a new branch into an
existing repo.


* Re: Cleaning up git user-interface warts
  2006-11-16 16:37                                             ` Carl Worth
@ 2006-11-16 17:57                                               ` Michael K. Edwards
  2006-11-16 18:23                                                 ` Carl Worth
  0 siblings, 1 reply; 1752+ messages in thread
From: Michael K. Edwards @ 2006-11-16 17:57 UTC (permalink / raw)
  To: Carl Worth; +Cc: git

On 11/16/06, Carl Worth <cworth@cworth.org> wrote:
> First, the pull may just fast-forward in which case there's no message
> at all. And we've been through that topic enough recently that we all
> know that no important information is lost by not doing any separate
> recording in that case.
>
> So you can't turn around and argue that the remote URL information is
> suddenly important when it just so happens that it's not a fast
> forward.

When it's a fast forward, the puller hasn't had to make any judgment
calls, so there's no editorial history to record.  When it's not, but
the puller chooses to retain the result on a persistent branch, that
_is_ an editorial decision (even if the result of the auto-merge is
clean); I like having that in the history.

> > And in a truly distributed situation, "pull" is strictly more powerful
> > than a separate "fetch" + separate "merge".
>
> I don't buy it. In my usage, I have several different remote
> repositories I'm interested in tracking, each with any number of
> branches. What I really want is an easy command that fetches all of
> those branches, (even new ones that I've never heard about---but never
> any of their "tracking branches" that wouldn't be of interest to
> me). And I want to do that once, to get the online-access-required
> part over with and get all the data into my local repository where I
> can start working with it.

What do you want all of those branches for?  They haven't been
published to you (that's a human interaction that doesn't go through
git), so for all you know they're just upstream experiments, and doing
things with them is probably shooting yourself in the foot.

I do agree that a robust form of "for b in .git/remotes/*; do git
fetch `basename $b`; done" would be a nice bit of porcelain.  The
entries in .git/remotes would probably need to grow a "Fetch-options:"
field so that you could choose whether or not to follow tags, etc.
Patch to follow.
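
A rough sketch of that loop, assuming a present-day git where remotes
live in the config file and "git remote" lists them (so this is not
the .git/remotes layout described above).  The "alpha" and "beta"
repositories and the identity are hypothetical:

```shell
set -eu
# Throwaway identity and scratch directory (both hypothetical).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)
cd "$work"

# Two stand-in upstream repositories.
for name in alpha beta
do
	git init -q "$name"
	(cd "$name" && echo "$name" > file && git add file && git commit -q -m "$name")
done

git init -q mine
cd mine
git remote add alpha "$work/alpha"
git remote add beta "$work/beta"

# Fetch from every configured remote, one after the other.
for r in $(git remote)
do
	git fetch -q "$r"
done
```

Current versions also offer "git fetch --all" and "git remote update",
which do the same thing without the explicit loop.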

Cheers,

* Re: Cleaning up git user-interface warts
  2006-11-16 17:17                                 ` Linus Torvalds
  2006-11-16 17:40                                   ` multi-project repos (was Re: Cleaning up git user-interface warts) Han-Wen Nienhuys
@ 2006-11-16 17:57                                   ` Linus Torvalds
  2006-11-16 18:27                                     ` Junio C Hamano
  2006-11-16 18:28                                     ` Linus Torvalds
  2006-11-16 18:13                                   ` Carl Worth
  2 siblings, 2 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 17:57 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Thu, 16 Nov 2006, Linus Torvalds wrote:
> 
> (And the real reason for that is simple: "git pull" simply wants to have 
> something to _start_ with. It's not hugely fundamental, it's just how it 
> was written).

Here's a very lightly tested patch that allows you to use "git pull" to 
populate an empty repository.

I'm not at all sure this is necessarily the nicest way to do it, but it's 
fairly straightforward.

Junio, what do you think?

		Linus

---
diff --git a/git-pull.sh b/git-pull.sh
index ed04e7d..7e5cee2 100755
--- a/git-pull.sh
+++ b/git-pull.sh
@@ -44,10 +44,10 @@ do
 	shift
 done
 
-orig_head=$(git-rev-parse --verify HEAD) || die "Pulling into a black hole?"
+orig_head=$(git-rev-parse --verify HEAD 2> /dev/null)
 git-fetch --update-head-ok --reflog-action=pull "$@" || exit 1
 
-curr_head=$(git-rev-parse --verify HEAD)
+curr_head=$(git-rev-parse --verify HEAD 2> /dev/null)
 if test "$curr_head" != "$orig_head"
 then
 	# The fetch involved updating the current branch.
@@ -80,6 +80,11 @@ case "$merge_head" in
 	exit 0
 	;;
 ?*' '?*)
+	if test -z "$orig_head"
+	then
+		echo >&2 "Cannot merge multiple branches into empty head"
+		exit 1
+	fi
 	var=`git-repo-config --get pull.octopus`
 	if test -n "$var"
 	then
@@ -95,6 +100,12 @@ case "$merge_head" in
 	;;
 esac
 
+if test -z "$orig_head"
+then
+	git-update-ref -m "initial pull" HEAD $merge_head "" || exit 1
+	exit
+fi
+
 case "$strategy_args" in
 '')

* Re: Cleaning up git user-interface warts
  2006-11-16 17:17                                 ` Linus Torvalds
  2006-11-16 17:40                                   ` multi-project repos (was Re: Cleaning up git user-interface warts) Han-Wen Nienhuys
  2006-11-16 17:57                                   ` Cleaning up git user-interface warts Linus Torvalds
@ 2006-11-16 18:13                                   ` Carl Worth
  2 siblings, 0 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-16 18:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Han-Wen Nienhuys, Junio C Hamano, git

On Thu, 16 Nov 2006 09:17:32 -0800 (PST), Linus Torvalds wrote:
> So the way you'd normally set up a single repo that contains multiple
> other existing repositories is to basically start with one ("git clone")
> and then add the other branches and "git fetch" them.

For that we'd also need a way for clone to be able to fetch just a
single branch, and not all of them as well.

There is some clone vs. fetch asymmetry here that has annoyed me for a
while, and that I don't think has been mentioned in this thread
yet. Namely:

clone: can only be executed once, fetches all branches, "remembers"
       URLs for later simplified use

fetch: can be executed many times, fetches only named branches,
       doesn't remember anything for later

I've often been in the situation where I cloned a long time ago, but
I'd like to be able to fetch everything that I would get if I were to
start a fresh clone.

-Carl


* Re: multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 17:40                                   ` multi-project repos (was Re: Cleaning up git user-interface warts) Han-Wen Nienhuys
@ 2006-11-16 18:21                                     ` Linus Torvalds
  2006-11-16 18:33                                       ` multi-project repos Junio C Hamano
                                                         ` (3 more replies)
  0 siblings, 4 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 18:21 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Thu, 16 Nov 2006, Han-Wen Nienhuys wrote:
> 
> You're misunderstanding me: the multi-repo at git.sv.gnu.org is the
> remote one. The example I gave was about locally creating a single
> project repo from a remote multiproject repo. 

Ahh.

Ok, try the patch I just sent out, and see if it works for you. It 
_should_ allow you to do exactly that

	mkdir new-repo
	cd new-repo
	git init-db
	git pull <remote> <onehead>

and now your "master" branch should be initialized to "onehead".
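
As a rough illustration of the sequence above, here is a minimal sketch that can be run locally. It assumes a Git recent enough to pull into an empty repository, uses `git init` rather than `init-db`, and replaces `<remote>` with a throwaway local repository; the paths, the identity, and the file name are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: pull one head from another repository into an empty one.
# Everything under $work is a hypothetical stand-in for a real remote.
set -e
GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

work=$(mktemp -d)

# Stand-in for the remote multi-project repository, with a head "onehead".
git init -q "$work/remote"
echo hello >"$work/remote/README"
git -C "$work/remote" add README
git -C "$work/remote" commit -q -m "initial commit"
git -C "$work/remote" branch onehead

# The empty repository: no commits yet, so HEAD names an unborn branch.
git init -q "$work/new-repo"
git -C "$work/new-repo" pull -q "$work/remote" onehead

remote_tip=$(git -C "$work/remote" rev-parse onehead)
local_tip=$(git -C "$work/new-repo" rev-parse HEAD)
echo "remote onehead: $remote_tip"
echo "local HEAD:     $local_tip"
```

If the pull behaves as described, both lines print the same commit ID and the working tree of `new-repo` contains `README`.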

Oh, except I just realized that I forgot to do a "git checkout" in my 
patch, so you'd need to add that (or do it by hand, but you really 
shouldn't need to, since the checkout is implied by the "pull").

The downside with this is that it does NOT populate your "remotes" 
information (like "git clone" would have done), so either we'd need to 
teach "git pull" to do that too, or you just have to do it by hand (so 
that you then can do the shorthand "git pull" to update in the future).

> On a tangent: why is there no reverse-clone?  I have no shell access
> to the machine, so when I created the remote repo, I had to push, and
> ended up putting 1.2 Gb data on the server.

Yeah, you're supposed to "init-db" and "push". Right now, that tends to 
unpack everything (which is bad), although that is hopefully getting fixed 
(ie the receiving end shouldn't unpack any more if it is recent. Junio?)

> <looks at manpage>
> 
> is this send-pack?

"git push" uses send-pack internally, you shouldn't ever need to use it 
yourself.

> From UI perspective it would be nice if this could also be done with clone,
> 
>   git clone . ssh+git://....

The creation of a new archive tends to need special rights (with _real_ 
ssh access and a shell you could do it, but "ssh+git" really means "git 
protocol over a connection that was opened with ssh, but doesn't 
necessarily have a real shell at the other end").

So for most protocols, you simply cannot (and shouldn't) do it. Think 
about services like the one that Pasky has set up, that allow you to set 
up a new git repo - the setup phase really _has_ to be separate (because 
you need to set up your keys etc).

So I think the above syntax is actually not a good one, because it cannot 
work in the general case. It's much better to get used to setting up a 
repo first, and then pushing into it, and just accepting that it's a 
two-phase thing.

Also, from a bandwidth standpoint, you can often (although obviously not 
always) make the setup start with something that is _closer_ to what you 
want to do. So, for example, you'd often do something like:

 (a) ssh to central repository
 (b) create the new repository by cloning it _locally_ at the central 
     place from some other repository that is related
 (c) then, from your local (non-central) repository, do a "git push --force"
     to force your changes (which now only needs the _incremental_ thing).

An example of this is again the "forking" thing that the repos at
http://git.or.cz/ already support.


> >And that "git pull" semantic actually means that if you want a _bare_ 
> >repository, I think "git --bare init-db" + "git --bare fetch" actually
> 
> yes, this works. Two remarks:
> 
> * it needs
> 
>   website/master:master
> 
> otherwise you still don't have a branch.

Right. In fact, you should probably do

	website/master:refs/heads/master

just to make it really explicit.

> * why are objects downloaded twice?  If I do
> 
>   git --bare fetch git://git.sv.gnu.org/lilypond.git web/master
> 
> it downloads stuff, but I don't get a branch.

A "fetch" by default won't actually generate a local branch unless you 
told it to. It just squirrels the end result into the magic FETCH_HEAD 
name, so that you can do

	# do the fetch
	git fetch git://git.sv.gnu.org/lilypond.git web/master

	# look at changes
	gitk ..FETCH_HEAD

	# If you're happy with them, merge them in
	git merge "merge new code" HEAD FETCH_HEAD

and you never actually created a real local branch at all.

If you want "git fetch" to fetch _into_ a branch, you need to tell it so, 
by using the full "src:dest" format. Otherwise it doesn't know what branch 
to fetch it into.

(And, of course, you can define that branch relationship in your remote 
configuration, so you don't actually have to say it explicitly every time)
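
The difference between the two forms of fetch can be sketched as follows. This is an illustration only: it uses a throwaway local repository in place of git.sv.gnu.org, a made-up identity and file name, and the `git -C` option of newer Git versions.

```shell
#!/bin/sh
# Sketch: fetch without and with an explicit src:dest refspec.
set -e
GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

work=$(mktemp -d)

# Hypothetical stand-in for the remote repository with a "web/master" head.
git init -q "$work/remote"
echo page >"$work/remote/index.html"
git -C "$work/remote" add index.html
git -C "$work/remote" commit -q -m "web content"
git -C "$work/remote" branch web/master

git init -q --bare "$work/local.git"

# Plain fetch: objects arrive and FETCH_HEAD is written, but no branch appears.
git -C "$work/local.git" fetch -q "$work/remote" web/master
heads_before=$(git -C "$work/local.git" for-each-ref refs/heads)
fetched=$(git -C "$work/local.git" rev-parse FETCH_HEAD)

# Fetch with src:dest: the same commit is now stored in a real branch.
git -C "$work/local.git" fetch -q "$work/remote" web/master:refs/heads/master
stored=$(git -C "$work/local.git" rev-parse refs/heads/master)

echo "branches after plain fetch: '${heads_before}'"
echo "FETCH_HEAD: $fetched"
echo "master:     $stored"
```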

> If I then do 
> 
>   git --bare fetch git://git.sv.gnu.org/lilypond.git web/master:master
> 
> it downloads the same stuff again. 

Right. So you can either

 (a) do it that way to begin with (because you now told it to put the 
     results in "master", so you never needed to do the second fetch in 
     the first place)

or

 (b) after you did the first fetch (into FETCH_HEAD), you could also have 
     just decided to do 

	git update-ref HEAD FETCH_HEAD ""

     (where that "" at the end is really not technically necessary, but it 
     tells "update-ref" that you _only_ want to do this if the old HEAD 
     was empty/undefined. Without it, "git update-ref" will just 
     overwrite HEAD without caring what it contained before, so it can be 
     a dangerous operation!)
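
The effect of the trailing "" can be tried out on a scratch ref. The sketch below is only an illustration: it assumes a current `git update-ref`, uses a hypothetical ref `refs/heads/demo` rather than HEAD, and builds a throwaway repository with a made-up identity.

```shell
#!/bin/sh
# Sketch: "update-ref <ref> <new> <old>" with an empty <old> only creates,
# it never overwrites.
set -e
GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

work=$(mktemp -d)
git init -q "$work/repo"
echo one >"$work/repo/file"
git -C "$work/repo" add file
git -C "$work/repo" commit -q -m "first"
first=$(git -C "$work/repo" rev-parse HEAD)
echo two >>"$work/repo/file"
git -C "$work/repo" commit -q -a -m "second"
second=$(git -C "$work/repo" rev-parse HEAD)

# refs/heads/demo does not exist yet, so the guarded create succeeds.
git -C "$work/repo" update-ref refs/heads/demo "$first" ""

# Now it exists, so the same guard refuses to overwrite it.
if git -C "$work/repo" update-ref refs/heads/demo "$second" "" 2>/dev/null
then
	guarded=overwritten
else
	guarded=refused
fi
after_guard=$(git -C "$work/repo" rev-parse refs/heads/demo)

# Without the old-value argument the ref is overwritten unconditionally.
git -C "$work/repo" update-ref refs/heads/demo "$second"
after_force=$(git -C "$work/repo" rev-parse refs/heads/demo)

echo "guarded update: $guarded"
```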

See?


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16 17:57                                               ` Michael K. Edwards
@ 2006-11-16 18:23                                                 ` Carl Worth
  0 siblings, 0 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-16 18:23 UTC (permalink / raw)
  To: Michael K. Edwards; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1244 bytes --]

On Thu, 16 Nov 2006 09:57:00 -0800, "Michael K. Edwards" wrote:
> What do you want all of those branches for?  They haven't been
> published to you (that's a human interaction that doesn't go through
> git), so for all you know they're just upstream experiments, and doing
> things with them is probably shooting yourself in the foot.

The same "what do you want them all for" question could be asked of
git-clone which also fetches all available branches. I really just
want to be able to easily watch what's going on in multiple
repositories.

I want to be able to just say "git update" (or whatever) and then be
able to list and browse and explore the stuff locally.

Yes, there's still outside communication that's necessary, but with
the ability to easily track all the remote branches that communication
can be even less formal if I can easily browse and explore things
locally. For example, I might not even know the name of the branch:

Me: Have you pushed a branch for your new work on the frob-widget?
Friend: Yes

And then I can "git fetch" and see "cool-new-frob" come in without
having to be told that name. Or I could have even just fetched
without the specific communication if I was already expecting it for
some reason.

-Carl

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16 17:57                                   ` Cleaning up git user-interface warts Linus Torvalds
@ 2006-11-16 18:27                                     ` Junio C Hamano
  2006-11-16 18:28                                     ` Linus Torvalds
  1 sibling, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 18:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Thu, 16 Nov 2006, Linus Torvalds wrote:
>> 
>> (And the real reason for that is simple: "git pull" simply wants to have 
>> something to _start_ with. It's not hugely fundamental, it's just how it 
>> was written).
>
> Here's a very lightly tested patch that allows you to use "git pull" to 
> populate an empty repository.
>
> I'm not at all sure this is necessarily the nicest way to do it, but it's 
> fairly straightforward.
>
> Junio, what do you think?

Yeah, I talked about making "merge" treat missing HEAD as a
special case of fast forward, but I like yours better.  It is a
lot cleaner and to the point.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16 17:57                                   ` Cleaning up git user-interface warts Linus Torvalds
  2006-11-16 18:27                                     ` Junio C Hamano
@ 2006-11-16 18:28                                     ` Linus Torvalds
  2006-11-16 19:47                                       ` Junio C Hamano
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 18:28 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Thu, 16 Nov 2006, Linus Torvalds wrote:
> @@ -95,6 +100,12 @@ case "$merge_head" in
>  	;;
>  esac
>  
> +if test -z "$orig_head"
> +then
> +	git-update-ref -m "initial pull" HEAD $merge_head "" || exit 1
> +	exit
> +fi
> +

So this is the place that probably wants a "git-checkout" before the 
exit, otherwise you'd (illogically) have to do it by hand for that 
particular case.

Of course, we should _not_ do it if the "--bare" flag has been set, so you 
might want to tweak the exact logic here.


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: multi-project repos
  2006-11-16 18:21                                     ` Linus Torvalds
@ 2006-11-16 18:33                                       ` Junio C Hamano
  2006-11-16 19:01                                       ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
                                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 18:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Han-Wen Nienhuys

Linus Torvalds <torvalds@osdl.org> writes:

> Yeah, you're supposed to "init-db" and "push". Right now, that tends to 
> unpack everything (which is bad), although that is hopefully getting fixed 
> (ie the receiving end shouldn't unpack any more if it is recent. Junio?)

Correct.

> See?
>
> 			Linus

Saw.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 18:21                                     ` Linus Torvalds
  2006-11-16 18:33                                       ` multi-project repos Junio C Hamano
@ 2006-11-16 19:01                                       ` Linus Torvalds
  2006-11-16 22:21                                       ` Johannes Schindelin
  2006-11-16 23:32                                       ` Han-Wen Nienhuys
  3 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 19:01 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Thu, 16 Nov 2006, Linus Torvalds wrote:
> 
> A "fetch" by default won't actually generate a local branch unless you 
> told it to. It just squirrels the end result into the magic FETCH_HEAD 
> name [...]

Btw, the magic heads are probably not all that well documented. They do 
come up in the man-pages, but I don't think there is any central place 
talking about them. We have:

 - "HEAD" itself, which is obviously the default pointer for a lot of 
   operations, and that specifies the current branch (ie it should 
   currently always be a symref, although we've talked about relaxing 
   that)

 - "ORIG_HEAD" is very useful indeed, and it's the head _before_ a merge 
   (or some other operations, like "git rebase" and "git reset": think of 
   it as an "original head before we did some uncontrolled operation
   where we otherwise can't use HEAD^ or similar")

   I use "gitk ORIG_HEAD.." a lot, and if I don't like something I see 
   when I do it, I end up doing "git reset --hard ORIG_HEAD" to undo a 
   pull I've done. This is important exactly because ORIG_HEAD is _not_ 
   the same as the first parent of a merge, since a merge could have been 
   just a fast-forward.

 - "FETCH_HEAD" as mentioned. Normally you'd only use this in scripting, I 
   suspect, but it's potentially useful if you prefer to do a fetch first 
   and then check it out (perhaps cherry-picking stuff instead of merging,
   for example).

   So you could do (for example)

	git fetch some-other-repo branch
	gitk ..FETCH_HEAD
	git cherry-pick <some-particular-commit-you-picked>

 - "MERGE_HEAD" is kind of the opposite of "ORIG_HEAD" when you're in 
   the middle of a merge: it's the "other" branch that you're merging.

   It's mainly useful for merge resolution, ie

	git log -p HEAD...MERGE_HEAD -- some/file/with/conflicts

   is a great way to see what happened along both branches (note the 
   _triple_ dot: it's a symmetric difference), to see _why_ the conflict
   happened.

Most of the above are used implicitly in various cases, not just HEAD. The 
"--merge" flag to git-rev-list (and thus git log and friends) is just 
shorthand for the above "HEAD...MERGE_HEAD" behaviour (with the addition 
of also limiting the result to just conflicting files), so

	git log -p --merge

is basically exactly the same as the above (except for _all_ files that 
have conflicts in them rather than just one hand-specified one).

Anyway, maybe somebody didn't know about these, and finds them useful. 
Normally, the only one you would _really_ use is "ORIG_HEAD" (which is 
described in several of the tutorials and examples, so people hopefully 
already know about it). Most of the others tend to mostly be used 
implicitly, not by explicitly naming them - although you _can_.
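
To make the ORIG_HEAD point concrete, here is a small sketch of a fast-forward merge being undone. It is an illustration under assumptions: a throwaway repository, a made-up identity, a hypothetical branch named `topic`, and a Git new enough to have `git -C` and `merge --ff-only`.

```shell
#!/bin/sh
# Sketch: after a fast-forward, ORIG_HEAD is the old tip, while HEAD^ is not.
set -e
GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

work=$(mktemp -d)
git init -q "$work/repo"
echo base >"$work/repo/file"
git -C "$work/repo" add file
git -C "$work/repo" commit -q -m "base"
trunk=$(git -C "$work/repo" symbolic-ref --short HEAD)
before=$(git -C "$work/repo" rev-parse HEAD)

# Two commits on a side branch, so merging it back is a pure fast-forward.
git -C "$work/repo" checkout -q -b topic
echo one >>"$work/repo/file"
git -C "$work/repo" commit -q -a -m "topic 1"
echo two >>"$work/repo/file"
git -C "$work/repo" commit -q -a -m "topic 2"
git -C "$work/repo" checkout -q "$trunk"
git -C "$work/repo" merge -q --ff-only topic

orig=$(git -C "$work/repo" rev-parse ORIG_HEAD)
first_parent=$(git -C "$work/repo" rev-parse HEAD^)

# Undo the whole merge in one step.
git -C "$work/repo" reset -q --hard ORIG_HEAD
after=$(git -C "$work/repo" rev-parse HEAD)

echo "tip before merge: $before"
echo "ORIG_HEAD:        $orig"
echo "HEAD^ after ff:   $first_parent"
```

Here `HEAD^` names the "topic 1" commit, which is why it cannot be used to get back to the pre-merge state.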


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16 18:28                                     ` Linus Torvalds
@ 2006-11-16 19:47                                       ` Junio C Hamano
  2006-11-16 19:53                                         ` Linus Torvalds
  0 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 19:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Han-Wen Nienhuys, git

Linus Torvalds <torvalds@osdl.org> writes:

> On Thu, 16 Nov 2006, Linus Torvalds wrote:
>> @@ -95,6 +100,12 @@ case "$merge_head" in
>>  	;;
>>  esac
>>  
>> +if test -z "$orig_head"
>> +then
>> +	git-update-ref -m "initial pull" HEAD $merge_head "" || exit 1
>> +	exit
>> +fi
>> +
>
> So this is the place that probably wants a "git-checkout" before the 
> exit, otherwise you'd (illogically) have to do it by hand for that 
> particular case.
>
> Of course, we should _not_ do it if the "--bare" flag has been set, so you 
> might want to tweak the exact logic here.

As you said, pull inherently involves a merge which implies the
existence of associated working tree, so I do not think there is
any room for --bare to get in the picture.  We already do the
checkout when we recover from a fetch that is used incorrectly
and updated the current branch head underneath us.

To give the list a summary of the discussion so far, here is a
consolidated patch.

-- >8 --
From: Linus Torvalds <torvalds@osdl.org>
Subject: git-pull: allow pulling into an empty repository

We used to complain that we cannot merge anything we fetched
with a local branch that does not exist yet.  Just treat the
case as a natural extension of fast forwarding and make the
local branch's tip point at the same commit we just fetched.
After all an empty repository without an initial commit is an
ancestor of any commit.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---
diff --git a/git-pull.sh b/git-pull.sh
index ed04e7d..e23beb6 100755
--- a/git-pull.sh
+++ b/git-pull.sh
@@ -44,10 +44,10 @@ do
 	shift
 done
 
-orig_head=$(git-rev-parse --verify HEAD) || die "Pulling into a black hole?"
+orig_head=$(git-rev-parse --verify HEAD 2>/dev/null)
 git-fetch --update-head-ok --reflog-action=pull "$@" || exit 1
 
-curr_head=$(git-rev-parse --verify HEAD)
+curr_head=$(git-rev-parse --verify HEAD 2>/dev/null)
 if test "$curr_head" != "$orig_head"
 then
 	# The fetch involved updating the current branch.
@@ -80,6 +80,11 @@ case "$merge_head" in
 	exit 0
 	;;
 ?*' '?*)
+	if test -z "$orig_head"
+	then
+		echo >&2 "Cannot merge multiple branches into empty head"
+		exit 1
+	fi
 	var=`git-repo-config --get pull.octopus`
 	if test -n "$var"
 	then
@@ -95,6 +100,13 @@ case "$merge_head" in
 	;;
 esac
 
+if test -z "$orig_head"
+then
+	git-update-ref -m "initial pull" HEAD $merge_head "" &&
+	git-read-tree --reset -u HEAD || exit 1
+	exit
+fi
+
 case "$strategy_args" in
 '')
 	strategy_args=$strategy_default_args

^ permalink raw reply related	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16 19:47                                       ` Junio C Hamano
@ 2006-11-16 19:53                                         ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 19:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Han-Wen Nienhuys, git



On Thu, 16 Nov 2006, Junio C Hamano wrote:
> 
> As you said, pull inherently involves a merge which implies the
> existence of associated working tree, so I do not think there is
> any room for --bare to get in the picture.

Fair enough. Feel free to add the signed-off-by from me too, 


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16  5:12           ` Petr Baudis
  2006-11-16 10:45             ` Junio C Hamano
@ 2006-11-16 21:49             ` Junio C Hamano
  2006-11-16 22:20               ` Petr Baudis
  2006-11-17  0:11             ` Han-Wen Nienhuys
  2 siblings, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 21:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Carl Worth, git, Andy Whitcroft, Nicolas Pitre

Petr Baudis <pasky@suse.cz> writes:

> (vi) Coding issues. This is probably very subjective, but a blocker for
> me. I have no issues about C here, but about the shell part of Git.
> Well, how to say it... It's just fundamentally incompatible with me. I
> *could* do things in/with it, but it's certainly something I wouldn't
> _enjoy_ doing _at all_, on a deep level. I think the current shell code
> is really hard to read, the ancient constructs are frequently strange at
> best, etc. It's surely fine code at functional level and there'll be
> people who hate _my_ style of coding and my shell code which isn't
> perfect either, but it's just how it is with me.

I've been thinking about revamping the style of shell scripts in
git-core Porcelain-ish for some time, and I have a feeling that
now may be a good time to do so, after one feature release is
out and the list is discussing UI improvements.

But before mentioning the specifics, let me mention one tangent.
I recently installed an OpenBSD bochs (it was actually a qemu
image) without knowing much about the way of the land, and after
adjusting myself to necessary glitches (like "make" being called
"gmake" there), I saw git properly built and pass its selftest.
I was pleasantly surprised when I noticed there was no 'bash' on
the system after all that.

I would like to keep it that way.

I'll list things I would want to and not want to change.
Comments from the list are very appreciated.  You can say things
in two ways:

 * I guarantee that the _default_ shell on all sane platforms we
   care about handles this construct correctly, although it was
   not in the original Bourne.  There is no reason to stay away
   from it these days.

or

 * You've stayed away from this construct but now you say you
   feel it is Ok to use it.  Don't.  It would break with the
   shell on my platform (or "it is a bad practice because of
   such and such reasons").

I do not think many people can say the former with authority
unless you have a portability lab (the company I work for used
to be like that and it was an interesting experience to learn
all about irritating implementation differences).  And "POSIX
says shell should behave that way" is _not_ what I want to hear
about.

But the latter should be a lot easier to say, and would be
appreciated because it would help us avoid regressions.

Things I would want to change:

 - One indent level is one tab and the tab-width is eight
   columns.  Some of our scripts tend to use less than eight
   spaces for indentation to avoid line wrapping.

 - More use of shell functions is fine.   Especially if the
   above change makes lines too long, the logic should be
   refactored.

 - It is so 80-ish to follow certain portability and performance
   wisdom.  The following should go:

   . Use "case" when you do not have to use "if test".

   . Avoid ${parameter##word} and friends and use `expr` instead
     to pick a string apart.

   . Avoid "export name=word", write "name=word; export name"
     instead.

   . Avoid ${parameter:-word} and friends when ${parameter-word}
     would do.

Things I do not want to change:

 - The shell scripts should start with #!/bin/sh, not
   #!/bin/bash (nor even worse "#!/usr/bin/env sh").

 - Shell functions are written as "name () { ... }" without 
   "function" noiseword.

 - 'foo && bar || exit' exits with the error code of what
   failed; no need to say 'exit $?'.

 - String equality check for "test" is a single =, not ==. 

 - Do not use locals.

 - Do not use shell arrays.

 - In general, if something does not behave the same way in ksh,
   bash and dash, don't use it (that does not mean these three
   are special; it just means if something is not even portable
   across these three, it is a definite no-no).

I do not think I need to list other common-sense shell idioms in
the latter category (e.g. 'using "test z$name = zexpected" when
we do not know what $name contains' falls into that).
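
A short script following the conventions listed above might look like the sketch below. It is only an illustration: the function names, the variable names, and the `GIT_DEMO_NAME` environment variable are hypothetical and do not come from any git script.

```shell
#!/bin/sh
# Sketch of the proposed style; all names here are hypothetical.

# Function definition without the "function" noiseword, and no locals:
# the argument is used through the positional parameter.
strip_heads () {
	# ${parameter#word} instead of spawning `expr` to pick the string apart.
	echo "${1#refs/heads/}"
}

# Prefix both sides with "z" because $1 may be empty or start with a dash;
# string equality for "test" is a single "=".
is_master () {
	test "z$1" = zmaster
}

# ${parameter:-word} is one of the constructs proposed as acceptable.
unset branch
name=$(strip_heads "${branch:-refs/heads/master}")

# So is "export name=word".
export GIT_DEMO_NAME="$name"

# 'foo && bar || exit' exits with the status of whichever command failed.
is_master "$name" && echo "on $name" || exit
```

Run as written, the script prints `on master`.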

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16 21:49             ` Junio C Hamano
@ 2006-11-16 22:20               ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-16 22:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carl Worth, git, Andy Whitcroft, Nicolas Pitre

On Thu, Nov 16, 2006 at 10:49:36PM CET, Junio C Hamano wrote:
> I would like to keep it that way.

I agree - I certainly don't want to infect Git with bash dependency.

> And "POSIX says shell should behave that way" is _not_ what I want to
> hear about.

Actually, which sane platforms we care about have /bin/sh that is NOT
POSIX compatible?

> Things I would want to change:

What about [ instead of test? And

	if foo; then

instead of

	if foo
	then

?


Am I the only one who hates

case "$log_given" in
tt*)
        die "Only one of -c/-C/-F can be used." ;;
*tm*|*mt*)
        die "Option -m cannot be combined with -c/-C/-F." ;;
esac

instead of having this stuff in explicit variables and writing out some
explicit boolean expressions? (There _are_ a few cases where the case is
cool, but they are rare.)
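
For readers who have not met the idiom being criticised, the sketch below shows how such a flag string can work. It rests on an assumption not shown in this message: that each of -c/-C/-F prepends a `t` and each -m prepends an `m` to `log_given`. The wrapper `check_log_given` is hypothetical, ignores option arguments, and echoes instead of calling `die` so that it can be run on its own.

```shell
#!/bin/sh
# Sketch: one letter per option is collected, then the string is matched.
check_log_given () {
	log_given=
	for opt
	do
		case "$opt" in
		-c|-C|-F) log_given=t$log_given ;;
		-m) log_given=m$log_given ;;
		esac
	done
	case "$log_given" in
	tt*)
		echo "Only one of -c/-C/-F can be used." ;;
	*tm*|*mt*)
		echo "Option -m cannot be combined with -c/-C/-F." ;;
	*)
		echo ok ;;
	esac
}

check_log_given -m       # prints "ok"
check_log_given -c -F    # prints "Only one of -c/-C/-F can be used."
check_log_given -F -m    # prints "Option -m cannot be combined with -c/-C/-F."
```

The patterns are compact, but a reader has to reconstruct what `tt*` and `*tm*|*mt*` mean from the way the string was built, which is the readability cost under discussion.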


It would be really great if Git would have something like Cogito's
optparse infrastructure. I'm not sure if you can implement it in Bourne
sh with reasonable performance, though...


I think addressing these three particular points would make the scripts
hugely more coder-friendly. (And well, I usually say that coding style
is not *that* important and is frequently overemphasised. But that holds
only to a certain point. ;-)


> Things I do not want to change:
..snip all those I agree with..
>  - Do not use locals.

It's a pity. :-( Which shell doesn't support them?

It's not that huge a deal, though.

>  - Do not use shell arrays.

This is a rather larger deal, I think; but the portability concerns are
very real, I guess. :|

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 18:21                                     ` Linus Torvalds
  2006-11-16 18:33                                       ` multi-project repos Junio C Hamano
  2006-11-16 19:01                                       ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
@ 2006-11-16 22:21                                       ` Johannes Schindelin
  2006-11-16 22:44                                         ` multi-project repos Junio C Hamano
  2006-11-16 22:49                                         ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
  2006-11-16 23:32                                       ` Han-Wen Nienhuys
  3 siblings, 2 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-11-16 22:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Han-Wen Nienhuys, Junio C Hamano, git

Hi,

On Thu, 16 Nov 2006, Linus Torvalds wrote:

> On Thu, 16 Nov 2006, Han-Wen Nienhuys wrote:
> > 
> > * why are objects downloaded twice?  If I do
> > 
> >   git --bare fetch git://git.sv.gnu.org/lilypond.git web/master
> > 
> > it downloads stuff, but I don't get a branch.
> 
> A "fetch" by default won't actually generate a local branch unless you 
> told it to.

This is actually a perfect example for

- a script that is porcelain as well as plumbing (you are supposed to use 
it directly, or via pull), and for

- a terrible UI.

_If_ you use git-fetch directly you virtually always want to store the 
result. I was tempted quite often to submit a patch which adds a command 
line switch --no-warn, which is passed to git-fetch by git-pull, and 
without which git-fetch complains if the branch-to-be-fetched is not 
stored right away (and refuses to go along).

_Also_, git-pull not storing the fetched branches at least temporarily 
often annoyed me: the pull did not work, and the SHA1 was so far away I 
could not even scroll to it. The result: I had to pull (and fetch!) the 
whole darned objects again. Again, I was tempted quite often to submit a 
patch which makes git-pull fetch the branches into refs/fetch-temp/* and 
only throw them away when the merge succeeded.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: multi-project repos
  2006-11-16 22:21                                       ` Johannes Schindelin
@ 2006-11-16 22:44                                         ` Junio C Hamano
  2006-11-17  0:29                                           ` Johannes Schindelin
  2006-11-16 22:49                                         ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
  1 sibling, 1 reply; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-16 22:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Linus Torvalds

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> _If_ you use git-fetch directly you virtually always want to store the 
> result. I was tempted quite often to submit a patch which adds a command 
> line switch --no-warn, which is passed to git-fetch by git-pull, and 
> without which git-fetch complains if the branch-to-be-fetched is not 
> stored right away (and refuses to go along).
>
> _Also_, git-pull not storing the fetched branches at least temporarily 
> often annoyed me: the pull did not work, and the SHA1 was so far away I 
> could not even scroll to it. The result: I had to pull (and fetch!) the 
> whole darned objects again. Again, I was tempted quite often to submit a 
> patch which makes git-pull fetch the branches into refs/fetch-temp/* and 
> only throw them away when the merge succeeded.

I think the earlier write-up by Linus on magic HEADs would help
documenting FETCH_HEAD better.

^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 22:21                                       ` Johannes Schindelin
  2006-11-16 22:44                                         ` multi-project repos Junio C Hamano
@ 2006-11-16 22:49                                         ` Linus Torvalds
  2006-11-16 23:08                                           ` Linus Torvalds
                                                             ` (2 more replies)
  1 sibling, 3 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 22:49 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Han-Wen Nienhuys, Junio C Hamano, git



On Thu, 16 Nov 2006, Johannes Schindelin wrote:
>
> - a terrible UI.

Why? We _do_ have the temporary branch. It's called FETCH_HEAD.

> _Also_, git-pull not storing the fetched branches at least temporarily 
> often annoyed me: the pull did not work, and the SHA1 was so far away I 
> could not even scroll to it.

Again, why didn't you use FETCH_HEAD?

If the user doesn't give us a head to write to, we clearly MUST NOT write 
to any long-term branch. That would be a _horrible_ mistake. 

So all your complaints seem totally misplaced. The UI is both usable and 
practical, and your complaint that git pull doesn't store the fetched 
branches is just NOT TRUE.

And your "solution" is obviously totally unusable. git ABSOLUTELY MUST NOT 
overwrite any existing branches unless explicitly told to do so by the 
user.

So I really don't see your point. 

A lot of the complaints seem to not be about the interfaces, but about 
people not _understanding_ and knowing what the interfaces do. If you were 
confused about something (like not realizing that FETCH_HEAD is there and 
very much usable), how about sending in a patch to make FETCH_HEAD use 
clearer in whatever docs you looked at and didn't find it mentioned in.

Now, there is no question that some of the interfaces can get a bit 
"interesting" to use. For example, if you really don't want to re-fetch 
for some reason, FETCH_HEAD actually does contain enough information that 
you should be able to just re-do a failed merge, for example, including 
the message generation. But at that point it really _does_ get a bit 
complicated, and you end up doing something like

	git merge "$(git fmt-merge-msg < .git/FETCH_HEAD)" HEAD FETCH_HEAD

which should _work_, but I'm not going to claim that it's all that easy to 
understand.

(That said, read that one-liner a few times, and suddenly it doesn't seem 
_that_ complicated any more, now does it? You can probably even guess what 
it's really going to do, even if you don't know git all that well. It's 
not unreadable line noise, is it?)

Of course, if I had a merge that failed (the most common reason being that 
I had some uncommitted patch in a file that wanted to be updated by the 
merge), I'd never actually do the above one-liner. I'd just re-do the 
pull. But if networking was _really_ slow, and I _really_ cared, maybe I'd 
do the above.

(And no, I didn't actually test the above one-liner. Maybe it doesn't work 
for some reason. Somebody should check, just for fun).
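
One way to check the idea without a network is the sketch below. It is not the exact one-liner above: it assumes a Git new enough to accept `git merge -m <msg> <commit>` in place of the `<msg> HEAD <commit>` form, and it uses throwaway local repositories, a made-up identity, and a hypothetical branch named `topic`.

```shell
#!/bin/sh
# Sketch: fetch once, then merge later from FETCH_HEAD without re-fetching.
set -e
GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

work=$(mktemp -d)
git init -q "$work/remote"
echo base >"$work/remote/base"
git -C "$work/remote" add base
git -C "$work/remote" commit -q -m "base"

# A local copy whose history then diverges from the remote.
git clone -q "$work/remote" "$work/local"
echo theirs >"$work/remote/theirs"
git -C "$work/remote" add theirs
git -C "$work/remote" commit -q -m "remote work"
git -C "$work/remote" branch topic
echo ours >"$work/local/ours"
git -C "$work/local" add ours
git -C "$work/local" commit -q -m "local work"

# Fetch once; the result is recorded in .git/FETCH_HEAD.
git -C "$work/local" fetch -q "$work/remote" topic
fetched=$(git -C "$work/local" rev-parse FETCH_HEAD)

# Later: generate the merge message from FETCH_HEAD and merge from it.
msg=$(git -C "$work/local" fmt-merge-msg <"$work/local/.git/FETCH_HEAD")
git -C "$work/local" merge -q -m "$msg" FETCH_HEAD
second_parent=$(git -C "$work/local" rev-parse HEAD^2)

echo "message: $msg"
```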


^ permalink raw reply	[flat|nested] 1752+ messages in thread

* Re: Cleaning up git user-interface warts
  2006-11-16  3:12                         ` Linus Torvalds
  2006-11-16 10:31                           ` Junio C Hamano
  2006-11-16 10:45                           ` Han-Wen Nienhuys
@ 2006-11-16 23:00                           ` Johannes Schindelin
  2006-11-16 23:22                             ` Linus Torvalds
  2 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-11-16 23:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Han-Wen Nienhuys, Junio C Hamano, git

Hi,

On Wed, 15 Nov 2006, Linus Torvalds wrote:

> People seem to believe that changing a few names or doing other totally
> _minimal_ UI changes would somehow magically make things understandable. 

Never ever underestimate pet peeves. If we give many people an obvious 
reason (however trivial and bike-shed-coloured) to complain, they will 
complain.

If we pull (pun intended) that reason away under their collective 
backsides, they will have to find another reason to complain. But by the 
time they found something, they will already be happy git users!

But since you just provided a patch to make life easier on non-gitters, I 
guess you agree with that already.

And hopefully you also agree that enhancing the syntax of git-merge to 
grok "git-merge [-m message] <branch>" and "git-merge [-m message] 
<url-or-remote> <branch>" would be a lovely thing, luring even more 
people into using git.

Maybe they even start complaining about subversion and CVS calling a merge 
"update", who knows?

Ciao,
Dscho


* Re: Cleaning up git user-interface warts
  2006-11-16  7:51                                             ` Richard CURNOW
@ 2006-11-16 23:01                                               ` Johannes Schindelin
  0 siblings, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-11-16 23:01 UTC (permalink / raw)
  To: Richard CURNOW; +Cc: git

Hi,

On Thu, 16 Nov 2006, Richard CURNOW wrote:

> In contrast to Linus's case of wanting to record where the remote merge
> came from, I expressly don't want to record that - I want the merge
> commit to describe conceptually what was being merged with what.
> 
> OK, I could probably use pull with --no-commit, but I've already
> trained my fingers to type out the merge syntax.  They'd be happier with
> 'git merge -m "Merge feature foo with fixes for bar" bar' though.

For the moment, if you forget --no-commit, you can always do a "git-commit 
--amend" -- even with merges.

Hth,
Dscho


* Re: multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 22:49                                         ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
@ 2006-11-16 23:08                                           ` Linus Torvalds
  2006-11-16 23:36                                           ` Johannes Schindelin
  2006-11-16 23:40                                           ` Han-Wen Nienhuys
  2 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 23:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Han-Wen Nienhuys, Junio C Hamano, git



On Thu, 16 Nov 2006, Linus Torvalds wrote:
>
> 	git merge "$(git fmt-merge-msg < .git/FETCH_HEAD)" HEAD FETCH_HEAD

Btw, I'd like to claim that this is a _great_ user interface.

Yeah, it's different from other SCMs. I don't think you'd really want to 
script a merge like this in CVS, especially not using standard UNIX 
pipelines etc. But it's an example of how a lot of git operations - even 
the "high level ones" are pretty scriptable, using very basic and very 
simple standard UNIX shell scripting.

So even though I'd not actually _do_ the above one-liner, I think it's a 
great example of how git really works, and how scriptable it can be, 
without a lot of huge problems.

So considering that "FETCH_HEAD" works pretty much everywhere, and that 
you can also use the totally non-scripting approach of doing "standard" 
SCM things like

	git diff ..FETCH_HEAD

or 

	gitk HEAD...FETCH_HEAD

to look at what got fetched (and in the latter case look at both the 
current HEAD _and_ FETCH_HEAD, and what was in one but not the other), I 
really think it's unfair to say that "git fetch" does not have a nice UI.

It's just that "git fetch" can be used two totally different ways:

 - "git fetch" to get something temporary: use FETCH_HEAD, and do _not_ 
   specify a destination branch

 - "git fetch" as a way to update the branches you already have, by either 
   using explicit branch specifiers (which would be unusual, but works), 
   or by just having the branch relationships listed in your .git/remotes/ 
   file or .git/config file.

both are actually very natural things to do.

What is probably _not_ that natural is to do the explicit branch 
specifier, ie

	git fetch somerepo remotebranch:localbranch

which obviously works, but you wouldn't want to actually do this very 
often. Either you do something once (and use FETCH_HEAD, which is actually 
nicer than a real branch in some respects: it also tells you where you 
fetched _from_, and it can contain data on merging from _multiple_ 
branches), or you set up a "real translation" in your configuration files.

So I would say that the natural thing to do is:

 - "git pull somerepo"

   This will _also_ fetch all the branches you've said you want to track, 
   of course.

 - "git fetch somerepo somebranch"

   Look at FETCH_HEAD, and be happy

 - "git fetch somerepo"

   This is kind of strange, but it can be useful if you are basically just 
   mirroring another repo, and want to fetch all the branches you've said 
   you want to track, but don't actually want to check them out.

while the "complicated" scenario like the following is something you 
should generally _avoid_, because it's just confusing and complex:

 - "git fetch somerepo branch1:mybranch1 branch2:mybranch2"

   This works, and I'm sure it's useful, and I've even used it (usually 
   with just one branch, though), but let's face it - it's too damn 
   complicated to be anything you want to do _normally_.

So git is definitely powerful, but I think some people have looked at the 
_complicated_ cases more than the simple cases (ie maybe people have 
looked too much at that last case, not realizing that there really isn't 
much reason to use it - and FETCH_HEAD is one big reason why you seldom 
need the complicated format).
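
The difference between the simple and the complicated forms comes down to whether the refspec has a destination after the colon. A small sketch of that rule follows; describe_refspec is a hypothetical helper written for illustration, not something git ships, and it ignores refinements such as a leading "+".

```shell
#!/bin/sh
# Describe where "git fetch <repo> <refspec>" would leave the fetched commit.
# describe_refspec is a hypothetical helper for illustration, not a git command.
describe_refspec() {
	spec=$1
	case $spec in
	*:?*)
		src=${spec%%:*}   # part before the first colon: the remote branch
		dst=${spec#*:}    # part after the first colon: the local branch
		echo "fetch '$src' and store it in local branch '$dst'"
		;;
	*)
		# No destination given: the result is only recorded in FETCH_HEAD.
		echo "fetch '${spec%:}' and record it only in FETCH_HEAD"
		;;
	esac
}

# Usage:
#   describe_refspec somebranch
#   describe_refspec branch1:mybranch1
```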



* Re: Cleaning up git user-interface warts
  2006-11-16 23:00                           ` Johannes Schindelin
@ 2006-11-16 23:22                             ` Linus Torvalds
  2006-11-17  0:05                               ` Han-Wen Nienhuys
  0 siblings, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-16 23:22 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Han-Wen Nienhuys, Junio C Hamano, git



On Fri, 17 Nov 2006, Johannes Schindelin wrote:
> 
> Never ever underestimate pet peeves. If we give many people an obvious 
> reason (however trivial and bike-shed-coloured) to complain, they will 
> complain.

I do actually think that this discussion has been informative, partly 
because I never even realized that some people would ever think to do 
"init-db" + "pull". 

Making things like that work is easy enough, it's just that I never saw 
any point until people complained. And when they complained, the initial 
complaint wasn't actually obvious. Only when Han-Wen actually gave 
something that didn't work, was it clear that the real issue wasn't so 
much _naming_, as just expectations about the _work_flow_.

> And hopefully you also agree that enhancing the syntax of git-merge to 
> grok "git-merge [-m message] <branch>" and "git-merge [-m message] 
> <url-or-remote> <branch>" would be a lovely thing, luring even more 
> people into using git.

I definitely think we can make "git merge" have a more pleasant syntax. 
I'm just still not sure that people should actually use it ;)

My real point was/is that usually it's really not the "naming details" 
that people _really_ have problems with. The real problems tend to be in 
learning a new workflow.

We can make some of those workflows easier, but I would heartily recommend 
that people not worry about naming of "pull" vs "fetch", because that's 
almost certainly not really the issue. Instead, if you have a problem, 
rather than concentrating on the names of the programs, say:

 - what do you want to get done.

   Most likely it's _trivial_ to do with git, it's just that somebody used 
   the wrong approach, and then it didn't work at all.

 - give actual examples of a workflow that didn't work or was complex.

   (again, the "init-db" + "pull" example). 

   And yes, in many cases, it might well be a case of "sure, we can make 
   that _other_ workflow work too". But somebody like me, who has used git 
   for a year and a half, and used BK before it, probably simply uses a 
   different workflow than somebody who comes from CVS. 

For example, I suspect that your gripe with "git fetch" was just from 
using it in a really awkward manner. Maybe we could make your workflow 
work with git too, but maybe it really already (and always) did, you just 
used a particular tool in a way that made the use be really really 
painful.

Sometimes it's just a question of "ok, use it like _this_, and now it's 
actually really simple". Other times it's "ok, I didn't even realize that 
you wanted to use it like _that_, and yeah, that's incredibly 
inconvenient, and we can change it".

I just got involved in this discussion because I thought people were 
talking about all the wrong things. Command naming really can't be _that_ 
big of a deal. I really don't believe that we should have some people use 
"gh" instead of "git" just because they think "pull" should mean not to 
merge or something.



* Re: multi-project repos (was Re: Cleaning up git user-interface  warts)
  2006-11-16 18:21                                     ` Linus Torvalds
                                                         ` (2 preceding siblings ...)
  2006-11-16 22:21                                       ` Johannes Schindelin
@ 2006-11-16 23:32                                       ` Han-Wen Nienhuys
  3 siblings, 0 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16 23:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git


Linus Torvalds escreveu:
>> You're misunderstanding me: the multi-repo at git.sv.gnu.org is the
>> remote one. The example I gave was about locally creating a single
>> project repo from a remote multiproject repo. 
> 
> Ahh.
> 
> Ok, try the patch I just sent out, and see if it works for you. It 
> _should_ allow you to do exactly that

I'm leaving for a short holiday tomorrow, but will do when I come back.

>> From UI perspective it would be nice if this could also be done with clone,
>>
>>   git clone . ssh+git://....
> 
> The creation of a new archive tends to need special rights (with _real_ 
> ssh access and a shell you could do it, but "ssh+git" really means "git 
> protocol over a connection that was opened with ssh, but doesn't 
> necessarily have a real shell at the other end").

What happens on savannah is that the sysadmins set up an empty GIT
repo with access, and leave it to you to push the stuff.  Of course,
if the initial import gets packed automatically, that's also ok.

> So I think the above syntax is actually not a good one, because it cannot 
> work in the general case. It's much better to get used to setting up a 
> repo first, and then pushing into it, and just accepting that it's a 
> two-phase thing.

Perhaps; from a UI viewpoint, it would be nice though, even if it
were aliased to a simple push. (Darcs has a get command analogous to
git-clone, but also a put command for which git lacks an equivalent).

>> * why are objects downloaded twice?  If I do
>>
>>   git --bare fetch git://git.sv.gnu.org/lilypond.git web/master
>>
>> it downloads stuff, but I don't get a branch.
> [..] 
>> If I then do 
>>
>>   git --bare fetch git://git.sv.gnu.org/lilypond.git web/master:master
>>
>> it downloads the same stuff again. 
> 
> Right. So you can either
> [..]
> See?

No, I don't understand. In the fetch all the objects with their SHA1s
were already downloaded. I'd expect that the fetch with a refspec
would simply write a HEAD and a refs/heads/master, and notice that all
the actual data was already downloaded, and doesn't download it again. 


-- 


* Re: multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 22:49                                         ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
  2006-11-16 23:08                                           ` Linus Torvalds
@ 2006-11-16 23:36                                           ` Johannes Schindelin
  2006-11-17  0:49                                             ` Linus Torvalds
  2006-11-16 23:40                                           ` Han-Wen Nienhuys
  2 siblings, 1 reply; 1752+ messages in thread
From: Johannes Schindelin @ 2006-11-16 23:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Han-Wen Nienhuys, Junio C Hamano, git

Hi,

On Thu, 16 Nov 2006, Linus Torvalds wrote:

> On Thu, 16 Nov 2006, Johannes Schindelin wrote:
> >
> > - a terrible UI.
> 
> Why? We _do_ have the temporary branch. It's called FETCH_HEAD.

It is a terrible UI, because it was not that obvious to me. And I consider 
myself not a git newbie.

Besides, it is not really a temporary branch. If it was, the pull would 
_not_ download all these objects again, would it?

> > _Also_, git-pull not storing the fetched branches at least temporarily 
> > often annoyed me: the pull did not work, and the SHA1 was so far away I 
> > could not even scroll to it.
> 
> Again, why didn't you use FETCH_HEAD?

Because I am a Jar-HEAD?

> If the user doesn't give us a head to write to, we clearly MUST NOT write 
> to any long-term branch. That would be a _horrible_ mistake. 

I was _not_ suggesting a long-term branch. Just a way to do-what-i-want 
and not waste bandwidth.

> And your "solution" is obviously totally unusable. git ABSOLUTELY MUST NOT 
> overwrite any existing branches unless explicitly told to do so by the 
> user.

Guess three times why I did not post the patches.

But the real problem is not necessarily the behaviour; it is the obscure 
fashion of the behaviour. You may not understand that problem, because you 
were there from the beginning. You saw the big-bang and how all the 
quarks formed all of a sudden, and how matter and eventually planets 
and suns came into being.

But others (me included) were not there. Or they did not really watch. And 
now they see all these creatures, and plants, and bacteria, and they do 
not understand how these are all connected, because of that. And now they 
think "wow that must have been some intelligent design, and really a 
miracle, and I cannot understand how it works." But that is not true 
(the latter part of course).

There is something to be said about the simplicity of Mercurial. Its 
inner workings may suck, but people get easily attracted by it.

I do not claim we should imitate Mercurial, or even hide the index (even 
if I sometimes wonder if the index is not just a clever way to accelerate 
commits, and nothing more).

> So I really don't see your point. 
> 
> A lot of the complaints seem to not be about the interfaces, but about 
> people not _understanding_ and knowing what the interfaces do.

But the interfaces should be usable interfaces! They should _explain_ what 
they do. Other software does so, it can't be _that_ hard.

> 	git merge "$(git fmt-merge-msg < .git/FETCH_HEAD)" HEAD FETCH_HEAD

I find that quite easy to understand. Why? Because I happen to _know_ the 
syntax of -merge and -fmt-merge-msg. For similar reasons I _understand_ 
why -pull behaves like it does. But others don't; they will shudder and 
then run.

Maybe it is not important that -pull fetches all objects all over again. 
But it _is_ important to make things like merging branches (local or 
remote) trivial. It _is_ important to make the user experience be fun.

Ciao,
Dscho


* Re: multi-project repos (was Re: Cleaning up git user-interface  warts)
  2006-11-16 22:49                                         ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
  2006-11-16 23:08                                           ` Linus Torvalds
  2006-11-16 23:36                                           ` Johannes Schindelin
@ 2006-11-16 23:40                                           ` Han-Wen Nienhuys
  2 siblings, 0 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-16 23:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds escreveu:
> A lot of the complaints seem to not be about the interfaces, but about 
> people not _understanding_ and knowing what the interfaces do. If you were 

From the point of view of a user, there is not really a difference
between the two.  As a user, you form a mental model of how things
work by looking at the interface. If the interface is bad, the user
creates a faulty model in his head, and starts doing things that
are perfectly logical in the faulty model, but stupid and silly when
you consider the actual internals.

A nice book about this is "The Design of Everyday Things" by Donald
Norman.

-- 


* Re: Cleaning up git user-interface warts
  2006-11-16 23:22                             ` Linus Torvalds
@ 2006-11-17  0:05                               ` Han-Wen Nienhuys
  2006-11-17  0:13                                 ` Junio C Hamano
  2006-11-17  0:39                                 ` Linus Torvalds
  0 siblings, 2 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-17  0:05 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, git



Linus Torvalds escreveu:
> My real point was/is that usually it's really not the "naming details" 
> that people _really_ have problems with. The real problems tend to be in 
> learning a new workflow.

I agree that discussions on naming may cloud the issue, but "learning
the workflow" implies that people should adapt to the limitations of
their tools.  That's only a viable stance when the tools are finished
and completely perfect.

Until that time, it would be a good goal to remove all idiosyncrasies,
all gratuitous asymmetries and needless limitations in the commands of
git, eg.

 - clone but not a put-clone,

 - pull = merge + fetch, but no command for merge + throw

 - clone for getting all branches of a repo, but no command for
   updating all branches of a repo.  

Of course, when all warts are fixed, backward compatibility will force
us to choose some new names. At that point, a discussion on naming is
in place.


-- 
 Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: Cleaning up git user-interface warts
  2006-11-16  5:12           ` Petr Baudis
  2006-11-16 10:45             ` Junio C Hamano
  2006-11-16 21:49             ` Junio C Hamano
@ 2006-11-17  0:11             ` Han-Wen Nienhuys
  2 siblings, 0 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-17  0:11 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Carl Worth, git, Andy Whitcroft, Nicolas Pitre

Petr Baudis escreveu:
> (vi) Coding issues. This is probably very subjective, but a blocker for
> me. I have no issues about C here, but about the shell part of Git.
> Well, how to say it... It's just fundamentally incompatible with me. I

(on a tangent)

I concur, but probably in a different way.

some 10 years ago I vowed never to write perl code again, and some 5
years ago, I made the same pledge for shell scripts, because I spent
inordinate amounts of time debugging them.

When I see the GIT shell scripts, my hands start to itch to make a
nice object oriented Python wrapper for it.

-- 


* Re: Cleaning up git user-interface warts
  2006-11-17  0:05                               ` Han-Wen Nienhuys
@ 2006-11-17  0:13                                 ` Junio C Hamano
  2006-11-17  0:27                                   ` Han-Wen Nienhuys
  2006-11-17  0:37                                   ` Carl Worth
  2006-11-17  0:39                                 ` Linus Torvalds
  1 sibling, 2 replies; 1752+ messages in thread
From: Junio C Hamano @ 2006-11-17  0:13 UTC (permalink / raw)
  To: hanwen; +Cc: git

Han-Wen Nienhuys <hanwen@xs4all.nl> writes:

>  - clone but not a put-clone,

What's put-clone?  Care to explain?

>  - pull = merge + fetch, but no command for merge + throw

What's merge+throw?  Care to explain?

>  - clone for getting all branches of a repo, but no command for
>    updating all branches of a repo.  

This one I can understand, but how would you propose to "update
all branches", in other words what's your design for mapping
remote branch names to local branch namespaces?

It would be nice if the design does not straightjacket different
repository layouts different people seem to like, but I think it
would be Ok to limit ourselves only to support the straight
one-to-one mapping and support only separate-remote layout.


* Re: Cleaning up git user-interface warts
  2006-11-17  0:13                                 ` Junio C Hamano
@ 2006-11-17  0:27                                   ` Han-Wen Nienhuys
  2006-11-17  0:35                                     ` Petr Baudis
  2006-11-17  0:37                                   ` Carl Worth
  1 sibling, 1 reply; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-17  0:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano escreveu:
> Han-Wen Nienhuys <hanwen@xs4all.nl> writes:
> 
>>  - clone but not a put-clone,
> 
> What's put-clone?  Care to explain?

put clone would be the putative inverse of clone, ie. make a clone of
a local repository on a remote server.

>>  - pull = merge + fetch, but no command for merge + throw
> 
> What's merge+throw?  Care to explain?

throw is the hypothetical opposite of fetch. I agree that this is
academic, because it's logical to only allow fast-forwards for
sending revisions.

>>  - clone for getting all branches of a repo, but no command for
>>    updating all branches of a repo.  
> 
> This one I can understand, but how would you propose to "update
> all branches", in other words what's your design for mapping
> remote branch names to local branch namespaces?
> 
> It would be nice if the design does not straightjacket different
> repository layouts different people seem to like, but I think it
> would be Ok to limit ourselves only to support the straight
> one-to-one mapping and support only separate-remote layout.

I think the whole clone design is a bit broken, in that the "master"
branch gets renamed or copied to "origin", but all of the other
branches remain unchanged in their names.

It's more logical for clone to either

 * leave all names unchanged

 * put all remote branches into a subdirectory.  This would also make
   it easier to track branches from multiple servers.

   At present,  I have in my build-daemon the following branches,

	cvs-head-repo.or.cz-lilypond.git
	hanwen-repo.or.cz-lilypond.git
	hwn-jcn-repo.or.cz-lilypond.git
	lilypond_1_0-repo.or.cz-lilypond.git
	lilypond_1_2-repo.or.cz-lilypond.git
	lilypond_1_4-repo.or.cz-lilypond.git
	lilypond_1_6-repo.or.cz-lilypond.git
	lilypond_1_8-repo.or.cz-lilypond.git
	lilypond_2_0-repo.or.cz-lilypond.git
	lilypond_2_2-repo.or.cz-lilypond.git
	lilypond_2_3_2b-repo.or.cz-lilypond.git
	lilypond_2_3_5b-repo.or.cz-lilypond.git
	lilypond_2_4-repo.or.cz-lilypond.git
	lilypond_2_6-repo.or.cz-lilypond.git
	lilypond_2_8-repo.or.cz-lilypond.git
	master-git.sv.gnu.org-lilypond.git
	master-hanwen
	master-repo.or.cz-lilypond.git
	origin-repo.or.cz-lilypond.git
	stable
	stable-2.10
	stable--2.10-git.sv.gnu.org-lilypond.git

  It would solve lots of problems for me if cloning and fetching would
  put branches into a subdirectory, ie.

    git clone git://repo.or.cz/lilypond.git

  leads to branches

    repo.or.cz/lilypond_2_8
    repo.or.cz/lilypond_2_6
    repo.or.cz/lilypond_2_4
    repo.or.cz/master
     (etc..)
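
  The proposed naming can be sketched as a small mapping from clone URL
  and remote branch to a host-prefixed local name. This only illustrates
  the suggestion above; namespaced_branch is a hypothetical helper, and
  the assumption that the prefix is the URL's host part comes from the
  example listing, not from any existing git behaviour.

```shell
#!/bin/sh
# Map a clone URL and a remote branch name to "<host>/<branch>".
# namespaced_branch is a hypothetical helper, not part of git.
namespaced_branch() {
	url=$1
	branch=$2
	host=${url#*://}   # drop the scheme, e.g. "git://"
	host=${host%%/*}   # keep everything up to the first "/"
	printf '%s/%s\n' "$host" "$branch"
}

# Usage:
#   namespaced_branch git://repo.or.cz/lilypond.git lilypond_2_8
```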

	
-- 


* Re: multi-project repos
  2006-11-16 22:44                                         ` multi-project repos Junio C Hamano
@ 2006-11-17  0:29                                           ` Johannes Schindelin
  0 siblings, 0 replies; 1752+ messages in thread
From: Johannes Schindelin @ 2006-11-17  0:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi,

On Thu, 16 Nov 2006, Junio C Hamano wrote:

> I think the earlier write-up by Linus on magic HEADs would help 
> documenting FETCH_HEAD better.

I am not sure that documenting FETCH_HEAD better would help. As Han-Wen 
pointed out (and some colleagues of mine who would never subscribe to a 
mailing list), people do not read the manual, but rather try to wrap their 
heads around the inner workings from the interface. And FETCH_HEAD just 
does not meet _any_ expectation a sane (read: untainted) user might have.

While I'm at it: the problem I pointed out with -pull may annoy just me.

But there is another problem with "git fetch": a common work flow is 
tracking other people's branches. And since git makes it so easy to 
have multiple branches, chances are that you track more than one 
branch per remote repository.

Now, an old gripe of mine was the lack of "git fetch --all". I wrote a 
script for that (Linus would be proud of me!), which just does "git 
ls-remote" and constructs a command line for "git fetch" from that.
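
The script itself was not posted, so the following is only a guess at its shape. It assumes "git ls-remote" prints one "<sha1> TAB <refname>" line per ref, and it maps every remote branch to a local branch of the same name; build_fetch_all is a hypothetical name. The function prints the command line instead of running it, so nothing gets overwritten by accident.

```shell
#!/bin/sh
# Read "git ls-remote" output on stdin and print one "git fetch" command line
# that copies every remote branch to a same-named local branch.
# build_fetch_all is a hypothetical helper; the original script was not posted.
build_fetch_all() {
	repo=$1
	awk -F '\t' -v repo="$repo" '
		# Only branches are wanted; HEAD and tags are skipped.
		$2 ~ /^refs\/heads\// {
			name = substr($2, length("refs/heads/") + 1)
			specs = specs " " name ":" name
		}
		END { if (specs != "") print "git fetch " repo specs }
	'
}

# Usage:
#   git ls-remote git://example.org/repo.git | build_fetch_all git://example.org/repo.git
```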

But even if you agree with the common story that you should specify the 
branches you want to track: it is hard!

If I were new to git, after reading some tutorials I would _expect_ "git 
fetch" to be the tool to track branches. (I posted a patch to at least be 
able to store the current "git fetch" command line under a nick IIRC). But 
it does not.

(Of course, after reading several documents, as a new user I would 
eventually find that I should edit .git/remotes/<nick>, or even 
edit/-repo-config the remotes information in the config, but I would fully 
expect a new user to give up before reaching that stage.)

But maybe I got it all wrong and this is not the common expectation...

Ciao,
Dscho


* Re: Cleaning up git user-interface warts
  2006-11-17  0:27                                   ` Han-Wen Nienhuys
@ 2006-11-17  0:35                                     ` Petr Baudis
  0 siblings, 0 replies; 1752+ messages in thread
From: Petr Baudis @ 2006-11-17  0:35 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git

On Fri, Nov 17, 2006 at 01:27:53AM CET, Han-Wen Nienhuys wrote:
> put clone would be the putative inverse of clone, ie. make a clone of
> a local repository on a remote server.

So effectively to tell git push not to unpack on the remote side, and to
push all branches and relevant tags.

..snip..
> It's more logical for clone to either
> 
>  * leave all names unchanged
> 
>  * put all remote branches into a subdirectory.  This would also make
>    it easier to track branches from multiple servers.
> 
>    At present,  I have in my build-daemon the following branches,
> 
> 	cvs-head-repo.or.cz-lilypond.git
> 	hanwen-repo.or.cz-lilypond.git
> 	hwn-jcn-repo.or.cz-lilypond.git
> 	lilypond_1_0-repo.or.cz-lilypond.git
> 	lilypond_1_2-repo.or.cz-lilypond.git
> 	lilypond_1_4-repo.or.cz-lilypond.git
> 	lilypond_1_6-repo.or.cz-lilypond.git
> 	lilypond_1_8-repo.or.cz-lilypond.git
> 	lilypond_2_0-repo.or.cz-lilypond.git
> 	lilypond_2_2-repo.or.cz-lilypond.git
> 	lilypond_2_3_2b-repo.or.cz-lilypond.git
> 	lilypond_2_3_5b-repo.or.cz-lilypond.git
> 	lilypond_2_4-repo.or.cz-lilypond.git
> 	lilypond_2_6-repo.or.cz-lilypond.git
> 	lilypond_2_8-repo.or.cz-lilypond.git
> 	master-git.sv.gnu.org-lilypond.git
> 	master-hanwen
> 	master-repo.or.cz-lilypond.git
> 	origin-repo.or.cz-lilypond.git
> 	stable
> 	stable-2.10
> 	stable--2.10-git.sv.gnu.org-lilypond.git
> 
>   It would solve lots of problems for me if cloning and fetching would
>   put branches into a subdirectory, ie.
> 
>     git clone git://repo.or.cz/lilypond.git
> 
>   leads to branches
> 
>     repo.or.cz/lilypond_2_8
>     repo.or.cz/lilypond_2_6
>     repo.or.cz/lilypond_2_4
>     repo.or.cz/master
>      (etc..)

That's basically exactly what git clone --use-separate-remote should do.
Now only if it would become the default... :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1


* Re: Cleaning up git user-interface warts
  2006-11-17  0:13                                 ` Junio C Hamano
  2006-11-17  0:27                                   ` Han-Wen Nienhuys
@ 2006-11-17  0:37                                   ` Carl Worth
  1 sibling, 0 replies; 1752+ messages in thread
From: Carl Worth @ 2006-11-17  0:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: hanwen, git


On Thu, 16 Nov 2006 16:13:44 -0800, Junio C Hamano wrote:
> >  - clone for getting all branches of a repo, but no command for
> >    updating all branches of a repo.

I want this one as well.

> This one I can understand, but how would you propose to "update
> all branches", in other words what's your design for mapping
> remote branch names to local branch namespaces?

As long as it's consistent with "clone" I'll be happy, (I think as part
of a separate topic we need to fix the mappings in clone, see
--use-separate-remotes as default and related).

The current case is really annoying where I have to use clone
into a new repository just to get everything, rather than just being
able to fetch everything into the repository I already have.

-Carl



* Re: Cleaning up git user-interface warts
  2006-11-17  0:05                               ` Han-Wen Nienhuys
  2006-11-17  0:13                                 ` Junio C Hamano
@ 2006-11-17  0:39                                 ` Linus Torvalds
  2006-11-17  0:52                                   ` Han-Wen Nienhuys
  1 sibling, 1 reply; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-17  0:39 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Junio C Hamano, git



On Fri, 17 Nov 2006, Han-Wen Nienhuys wrote:
> 
> Until that time, it would be a good goal to remove all idiosyncrasies,
> all gratuitous asymmetries and needless limitations in the commands of
> git, eg.

Well, a lot of the asymmetries aren't actually gratuitous at all.

>  - clone but not a put-clone,

As mentioned, in order to "put-clone", you generally have to "create" 
first, so the "put-clone" really makes no sense.

The _true_ reverse is really your

 - "git init-db" on both sides

 - "git pull" (your workflow ;) on receiving

 - "git push" on sending.

The fact that we can do "git clone" on the _receiving_ side is an 
asymmetry, but it's not gratuitous: when receiving we don't need any extra 
permissions or setup to create a new archive. In contrast, when sending, 
you do have to have that "get permission to create new archive" phase.

>  - pull = merge + fetch, but no command for merge + throw

Again, this is not gratuitous, and the reason is very similar: when you 
pull, you're pulling into something that _you_ control and _you_ have 
access to, namely your working directory. In order to merge you have to 
have the ability to fix up conflicts (whether automatically or manually), 
and this is something that you _fundamentally_ can only do when you own 
the repo space.

Again, when you do "push", the reason you can't merge is not a "gratuitous 
asymmetry", but a _fundamental_ asymmetry: by definition, you're pushing 
to a _remote_ thing, and as such you can't merge, because you can't fix up 
any merge problems.

See?

In many ways, if you want _symmetry_, you need to make sure that the 
_cases_ are symmetrical. If you have ssh shell access, you can often do 
that, and the "reverse" of a "git pull" is actually just another "git 
pull" from the other side:

	ssh other-side "cd repo ; git pull back"

Now they really _are_ symmetrical: "git pull" is really in many ways ITS 
OWN reverse operation. 

But "push" and "pull" _fundamentally_ aren't symmetric operations, and you 
simply cannot possibly make them symmetric. Any system that tries would be 
absolutely horrible to use, exactly because it would be either:

 - making local/remote operations totally equivalent

   This sounds like a "good" thing, but from a real user perspective it's 
   actually horribly horribly bad. Knowing the difference between local 
   and remote is what allows a lot of performance optimizations, and a lot 
   of security. Your local repo is _yours_, and nobody can take that away 
   from you, and that's a really fundamental reason for why the symmetry 
   cannot exist, and why local/remote operations MUST NOT be something 
   that you can mix without thinking about them,

 - limiting local operations in a way that makes them effectively unusable 
   and unscriptable.

   You'd basically have to do everything even _locally_ through some 
   server interface, and you'd not be allowed to ever touch your local 
   checked-out repository directly. Again: local repositories really _are_ 
   special, because you can touch the checked out copy. If you try to 
   suppress that, you're screwed.

>  - clone for getting all branches of a repo, but no command for
>    updating all branches of a repo.  

As in sending? Sure there is: use "git push --all". It will push out every 
branch (and tag) you have. Add "--force" if you want to make sure that it 
also pushes out branches even if the result isn't a strict superset (of 
course, the receiving end may actually end up refusing to take it: there's 
an option for the receiver to say "I will refuse any update that isn't a 
strict superset of what I had").
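
A minimal sketch of the sending side, for illustration only. push_all is 
a hypothetical wrapper (not a git command), and "origin" is just an 
assumed default remote name:

```shell
# push_all [remote] [force]
# Hypothetical helper: push every branch to the given remote.
# Pass "force" as the second argument to also push branches whose
# new tip is not a strict superset of what the remote already has.
push_all() {
    remote=${1:-origin}
    if [ "$2" = "force" ]; then
        git push --all --force "$remote"
    else
        git push --all "$remote"
    fi
}
```

The receiver-side refusal is, as far as I know, the 
receive.denyNonFastForwards setting, which would be enabled in the 
receiving repository with "git config receive.denyNonFastForwards true". 
With that set, even a forced push of a non-superset update is rejected.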

If you mean as in "receiving new branches", then yeah, you do have to 
script it, with some fairly trivial "git ls-remote" to make sure you get 
the new remote branches.
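
One way such a script could look, as a rough sketch rather than anything 
official. new_branch_refspecs is a hypothetical helper name. It reads the 
output of "git ls-remote --heads" on stdin, takes the names of the 
branches you already have as arguments, and prints one fetch refspec per 
branch that is new to you:

```shell
# new_branch_refspecs <known-branch>...
# stdin: lines of "<sha1><TAB>refs/heads/<name>", as printed by
#        "git ls-remote --heads <remote>"
# stdout: one "refs/heads/<name>:refs/heads/<name>" refspec for every
#         remote branch whose name is not among the arguments
new_branch_refspecs() {
    known=" $* "
    while read -r sha ref; do
        name=${ref#refs/heads/}
        # Skip anything that is not a branch head (tags, for example).
        [ "$name" = "$ref" ] && continue
        # Skip branches we already have locally.
        case "$known" in
        *" $name "*) continue ;;
        esac
        printf '%s:%s\n' "$ref" "$ref"
    done
}
```

Assuming a remote called "origin", the pieces could be wired together 
roughly like this (if no branch is new, skip the fetch rather than 
running it with no refspecs):

	git ls-remote --heads origin |
	new_branch_refspecs $(git for-each-ref --format='%(refname:short)' refs/heads) |
	xargs git fetch origin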



* Re: multi-project repos (was Re: Cleaning up git user-interface warts)
  2006-11-16 23:36                                           ` Johannes Schindelin
@ 2006-11-17  0:49                                             ` Linus Torvalds
  0 siblings, 0 replies; 1752+ messages in thread
From: Linus Torvalds @ 2006-11-17  0:49 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Han-Wen Nienhuys, Junio C Hamano, git



On Fri, 17 Nov 2006, Johannes Schindelin wrote:
> > Why? We _do_ have the temporary branch. It's called FETCH_HEAD.
> 
> It is a terrible UI, because it was not that obvious to me. And I consider 
> myself not a git newbie.

Heh. The "temporary branches" are actually the _original_ branches as far 
as git is concerned. The long-term branches only came later.

So in many ways, HEAD, FETCH_HEAD, MERGE_HEAD and ORIG_HEAD are more 
fundamental than any long-term branch has ever been, and maybe they should 
be taught first as such.

So you're newbie enough that you've only seen those new-fangled "real" 
branches.

When I was young, we had to walk to school up-hill in three feet of snow 
every day. And we _liked_ our FETCH_HEAD's.

> Besides, it is not really a temporary branch. If it was, the pull would 
> _not_ download all these objects again, would it?

Well, exactly because they are temporary, we can't actually trust the 
objects they point to. They have no "real" long-term life, so no, I'm 
afraid that we always will have to re-fetch the objects, because fetching 
them is the only way to know that we still have them. 

That said, we could certainly _make_ them be honored by things like "git 
prune" and friends. But yes, they really _are_ temporary branches right 
now, and part of the meaning of that "temporary" is exactly the fact that 
git fetch will not trust that you still have the objects. 

For example, if you used one of the old-fashioned commit walkers, maybe we 
got the initial commit, but we may not have gotten the whole _chain_. See?

Temporary branch indeed.

> > Again, why didn't you use FETCH_HEAD?
> 
> Because I am a Jar-HEAD?

Well, we clearly should document them better. Anybody?
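
As a starting point for such documentation, here is a small sketch of 
what FETCH_HEAD holds. After a fetch, .git/FETCH_HEAD contains one line 
per fetched ref in the form "<sha1><TAB><marker><TAB><description>", 
where the marker is either empty or "not-for-merge". The helper name 
fetch_head_for_merge is made up for this example:

```shell
# fetch_head_for_merge [file]
# Print the object names recorded in a FETCH_HEAD file that are meant
# to be merged, i.e. the lines whose second tab-separated field is not
# "not-for-merge". Defaults to .git/FETCH_HEAD in the current directory.
fetch_head_for_merge() {
    awk -F'\t' '$2 != "not-for-merge" { print $1 }' "${1:-.git/FETCH_HEAD}"
}
```

In day-to-day use you would not parse the file at all: after "git fetch 
<url> <branch>", something like "git log HEAD..FETCH_HEAD" shows what 
came in, and "git merge FETCH_HEAD" merges it, which is roughly what 
"git pull" does for you in one step.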



* Re: Cleaning up git user-interface warts
  2006-11-17  0:39                                 ` Linus Torvalds
@ 2006-11-17  0:52                                   ` Han-Wen Nienhuys
  0 siblings, 0 replies; 1752+ messages in thread
From: Han-Wen Nienhuys @ 2006-11-17  0:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds wrote:
> The fact that we can do "git clone" on the _receiving_ side is an 
> asymmetry, but it's not gratuitous: when receiving we don't need any extra 
> permissions or setup to create a new archive. In contrast, when sending, 
> you do have to have that "get permission to create new archive" phase.
> 
>>  - pull = merge + fetch, but no command for merge + throw
> 
> Again, this is not gratuitous, and the reason is very similar: when you 
> pull, you're pulling into something that _you_ control and _you_ have 

> But "push" and "pull" _fundamentally_ aren't symmetric operations, and you 
> simply cannot possibly make them symmetric. 

Point taken;  thank you. 

In that case, we've come full circle to the command naming issue. Push
and pull are fundamentally asymmetric operations, so a consistent UI
would dictate that they not be named symmetrically, as they are now.


-- 


end of thread, other threads:[~2006-11-17  0:52 UTC | newest]

Thread overview: 1752+ messages
2005-07-27 10:01 Git 1.0 Synopis (Draft v2) Ryan Anderson
2005-07-27 22:13 ` Junio C Hamano
2005-07-29  8:27   ` Ryan Anderson
2005-07-29  8:29 ` Git 1.0 Synopis (Draft v3 Ryan Anderson
2005-07-29 10:58   ` Johannes Schindelin
2005-07-29 21:26   ` Sam Ravnborg
2005-07-31 22:18     ` Horst von Brand
2005-07-31 22:15   ` Horst von Brand
2005-08-01 13:21     ` Horst von Brand
2005-08-15  4:55     ` Git 1.0 Synopis (Draft v4) Ryan Anderson
2005-08-15  5:09       ` Ryan Anderson
2005-08-15  5:19       ` Junio C Hamano
2005-08-15  6:58         ` Ryan Anderson
2005-08-15  7:17           ` Junio C Hamano
2005-08-15  8:02             ` Ryan Anderson
2005-08-15  8:17               ` Junio C Hamano
2005-08-15 18:59                 ` Daniel Barkalow
2005-08-16  7:28                   ` Junio C Hamano
2005-08-16 10:03                     ` Johannes Schindelin
2005-08-16 10:14                       ` Dongsheng Song
2005-08-16 10:17                       ` about git server & permissions Dongsheng Song
2005-08-16 15:31                     ` Git 1.0 Synopis (Draft v4) Johannes Schindelin
2005-08-16 15:47                       ` Daniel Barkalow
2005-08-16 15:39                     ` Daniel Barkalow
2005-08-16 19:41                     ` Horst von Brand
2005-08-16 20:41                       ` Johannes Schindelin
2005-08-18  9:27                       ` Matthias Urlichs
  -- strict thread matches above, loose matches on Subject: below --
2006-10-14 15:07 VCS comparison table Jon Smirl
2006-10-14 16:40 ` Jakub Narebski
2006-10-14 17:18   ` Jon Smirl
2006-10-14 17:42     ` Jakub Narebski
2006-10-16  3:53   ` Martin Pool
2006-10-22 15:50     ` Jakub Narebski
2006-10-16 22:26   ` Aaron Bentley
2006-10-16 22:35     ` Andy Whitcroft
2006-10-16 22:53       ` Jakub Narebski
2006-10-16 23:19     ` Jakub Narebski
2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
2006-10-17  4:56       ` Aaron Bentley
2006-10-17  5:20         ` Shawn Pearce
2006-10-17  8:21           ` Martin Pool
2006-10-17  8:15         ` Jakub Narebski
2006-10-17  8:16         ` Andreas Ericsson
2006-10-17 20:01           ` Aaron Bentley
2006-10-17 21:01             ` Jakub Narebski
2006-10-17 21:27               ` Aaron Bentley
2006-10-17 21:51                 ` Jakub Narebski
2006-10-17 22:28                   ` Aaron Bentley
2006-10-17 22:57                     ` Jakub Narebski
2006-10-17 22:59                       ` Jakub Narebski
2006-10-17 23:16                       ` Linus Torvalds
2006-10-18  5:36                         ` Jeff King
2006-10-18  5:57                           ` Junio C Hamano
2006-10-18 14:52                           ` Linus Torvalds
2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
2006-10-18 18:59                               ` Petr Baudis
2006-10-18 19:04                                 ` Junio C Hamano
2006-10-18 19:13                                   ` Nicolas Pitre
2006-10-18 19:18                                     ` Shawn Pearce
2006-10-18 19:33                                       ` Nicolas Pitre
2006-10-18 20:46                                         ` Shawn Pearce
2006-10-18 21:17                                           ` Linus Torvalds
2006-10-18 21:32                                             ` Shawn Pearce
2006-10-18 21:42                                               ` Junio C Hamano
2006-10-18 21:52                                                 ` Shawn Pearce
2006-10-18 22:02                                                   ` Junio C Hamano
2006-10-18 21:55                                               ` Linus Torvalds
2006-10-18 22:05                                                 ` Shawn Pearce
2006-10-18 22:07                                                 ` Junio C Hamano
2006-10-18 21:41                                             ` Nicolas Pitre
2006-10-18 21:41                                             ` Shawn Pearce
2006-10-18 22:00                                               ` Linus Torvalds
2006-10-18 22:11                                                 ` Shawn Pearce
2006-10-18 22:13                                               ` Junio C Hamano
2006-10-18 22:42                                                 ` Linus Torvalds
2006-10-18 22:48                                                   ` Junio C Hamano
2006-10-18 23:22                                                     ` Shawn Pearce
2006-10-18 23:18                                                   ` Nicolas Pitre
2006-10-18 23:50                                                     ` Johannes Schindelin
2006-10-19  0:07                                                     ` Linus Torvalds
2006-10-19  0:15                                                       ` Linus Torvalds
2006-10-19  0:31                                                       ` Johannes Schindelin
2006-10-19  0:46                                                         ` Linus Torvalds
2006-10-19  3:01                                                       ` Nicolas Pitre
2006-10-19  3:46                                                       ` Junio C Hamano
2006-10-19 14:27                                                         ` Nicolas Pitre
2006-10-19 14:55                                                         ` Linus Torvalds
2006-10-19 16:07                                                           ` Jan Harkes
2006-10-19 16:48                                                             ` Linus Torvalds
2006-10-20  0:20                                                               ` Jan Harkes
2006-10-20 14:41                                                                 ` Jeff King
2006-10-20  0:20                                                               ` [PATCH 1/2] Pass through unresolved deltas when writing a pack Jan Harkes
2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
2006-10-20  1:11                                                                 ` Nicolas Pitre
2006-10-20  1:35                                                                   ` Junio C Hamano
2006-10-20  2:27                                                                   ` Jan Harkes
2006-10-20  2:30                                                                     ` Junio C Hamano
2006-10-20  2:46                                                                       ` Jan Harkes
2006-10-20  3:36                                                                     ` Nicolas Pitre
2006-10-18 21:56                                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Junio C Hamano
2006-10-18 19:33                                     ` Junio C Hamano
2006-10-18 20:47                                       ` Shawn Pearce
2006-10-18 19:09                                 ` Nicolas Pitre
2006-10-18 20:08                                 ` Linus Torvalds
     [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
2006-10-18 19:57                                 ` Sean
2006-10-18 20:46                                 ` Petr Baudis
     [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
2006-10-18 20:53                                     ` Sean
2006-10-18 21:39                                     ` Petr Baudis
     [not found]                                       ` <20061018175443.50b728f6.seanlkml@sympatico.ca>
2006-10-18 21:54                                         ` Sean
2006-10-19  6:46                               ` Alexander Belchenko
     [not found]                                 ` <20061019064049.bec89582.seanlkml@sympatico.ca>
2006-10-19 10:40                                   ` Sean
2006-10-20 14:03                                     ` Aaron Bentley
2006-10-20 14:56                                       ` Jakub Narebski
2006-10-20 15:34                                         ` Aaron Bentley
2006-10-20 16:21                                           ` Jakub Narebski
2006-10-20 17:03                                             ` Aaron Bentley
2006-10-20 17:18                                               ` Linus Torvalds
2006-10-20 17:45                                                 ` Jakub Narebski
2006-10-20 17:59                                                   ` Linus Torvalds
2006-10-20 20:17                                                     ` Junio C Hamano
2006-10-20 20:40                                                       ` Jakub Narebski
2006-10-20 22:41                                                       ` [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring Junio C Hamano
2006-10-20 22:41                                                       ` [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks Junio C Hamano
2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
2006-10-20 18:06                                                   ` Linus Torvalds
2006-10-20 18:30                                                     ` Linus Torvalds
2006-10-20 19:04                                                       ` Aaron Bentley
2006-10-20 19:31                                                         ` Linus Torvalds
2006-10-20 20:12                                                           ` Aaron Bentley
2006-10-20 17:21                                               ` Shawn Pearce
2006-10-20 17:48                                                 ` Linus Torvalds
2006-10-20 17:58                                                   ` David Lang
2006-10-20 18:15                                                   ` Jon Smirl
2006-11-03  3:43                                                     ` Matthew Hannigan
2006-10-20 20:23                                                   ` Petr Baudis
2006-10-20 20:49                                                     ` David Lang
2006-10-20 20:53                                                       ` Petr Baudis
2006-10-20 20:55                                                         ` David Lang
2006-10-20 20:53                                                   ` Shawn Pearce
2006-10-20 18:12                                             ` Jan Hudec
2006-10-20 18:35                                               ` Jakub Narebski
2006-10-20 18:46                                                 ` Jakub Narebski
2006-10-20 18:47                                               ` Jakub Narebski
2006-10-20 19:00                                                 ` Linus Torvalds
2006-10-20 19:10                                                   ` Aaron Bentley
2006-10-20 19:46                                                     ` Linus Torvalds
2006-10-20 20:29                                                       ` Aaron Bentley
2006-10-20 20:57                                                         ` Linus Torvalds
2006-10-21  2:03                                                           ` git-merge-recursive, was " Johannes Schindelin
2006-10-21  2:17                                                             ` Junio C Hamano
2006-10-22 21:04                                                               ` [PATCH] threeway_merge: if file will not be touched, leave it alone Johannes Schindelin
2006-10-22 23:11                                                                 ` Junio C Hamano
2006-10-23  0:48                                                                   ` Johannes Schindelin
2006-10-23  4:17                                                                     ` Junio C Hamano
2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
2006-10-20 22:13                                                 ` Jeff Licquia
2006-10-20 23:05                                                   ` Robert Collins
2006-10-20 23:15                                                     ` Robert Collins
2006-10-20 23:39                                                       ` Jeff Licquia
2006-10-20 23:24                                                     ` Jakub Narebski
2006-10-20 23:28                                                       ` Petr Baudis
2006-10-20 23:59                                                   ` Linus Torvalds
2006-10-21  1:26                                                     ` Junio C Hamano
2006-10-21  8:40                                                       ` Jakub Narebski
2006-10-20 19:14                                               ` Jakub Narebski
2006-10-20 22:59                                               ` Jeff King
2006-10-21 17:40                                                 ` Jan Hudec
2006-10-21 17:51                                                   ` Jakub Narebski
2006-10-21 19:20                                                     ` Jan Hudec
2006-10-21 18:42                                                   ` Linus Torvalds
2006-10-21 19:21                                                     ` Jakub Narebski
2006-11-03  6:36                                                       ` Martin Langhoff
2006-10-20 22:40                                           ` Petr Baudis
2006-10-20 23:33                                             ` Aaron Bentley
2006-10-21  7:56                                         ` Matthieu Moy
2006-10-21  8:36                                           ` Jakub Narebski
2006-10-21 10:09                                             ` Matthieu Moy
2006-10-21 10:34                                               ` Jakub Narebski
     [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
2006-10-20 15:37                                         ` Sean
2006-10-20 15:37                                         ` Sean
2006-10-19 10:40                                   ` Sean
2006-10-18 21:20                             ` VCS comparison table Jeff King
2006-10-17 23:33                       ` Aaron Bentley
2006-10-18  8:13                         ` Andreas Ericsson
2006-10-18  6:22                   ` Matthieu Moy
     [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
2006-10-17 22:00                   ` Sean
2006-10-17 22:00                   ` Sean
2006-10-17 22:44                     ` Aaron Bentley
     [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
2006-10-17 22:56                         ` Sean
2006-10-17 22:56                         ` Sean
2006-10-17 23:11                           ` Jakub Narebski
2006-10-18 21:04                           ` Charles Duffy
     [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
2006-10-18 21:29                               ` Sean
2006-10-18 21:29                               ` Sean
2006-10-18 23:31                                 ` Charles Duffy
2006-10-18 23:48                                   ` Johannes Schindelin
2006-10-19  1:58                                     ` Charles Duffy
2006-10-19 11:01                                       ` Johannes Schindelin
2006-10-19 11:10                                         ` Charles Duffy
2006-10-19 11:24                                           ` Johannes Schindelin
2006-10-19 11:30                                             ` Charles Duffy
2006-10-20 11:38                                               ` Jakub Narebski
2006-10-18 23:48                                   ` Jakub Narebski
     [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
2006-10-18 23:49                                     ` Sean
2006-10-18 23:49                                     ` Sean
2006-10-18 21:37                               ` Shawn Pearce
     [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
2006-10-18 21:44                                   ` Sean
2006-10-18 21:52                                   ` Petr Baudis
2006-10-18 23:38                                 ` Johannes Schindelin
2006-10-18 23:54                                   ` Petr Baudis
2006-10-19  0:33                                     ` Johannes Schindelin
2006-10-18 21:51                         ` Petr Baudis
2006-10-20  9:43                     ` Matthieu Moy
2006-10-24  6:02                       ` Lachlan Patrick
2006-10-24  6:23                         ` Shawn Pearce
2006-10-24  6:31                         ` Linus Torvalds
2006-10-24  6:45                           ` David Rientjes
     [not found]                             ` <Pine.LNX.4.64.0610240812410.3962@g5.osdl.org>
2006-10-24 15:15                             ` Linus Torvalds
2006-10-24 20:12                               ` David Rientjes
2006-10-24 20:28                                 ` Jakub Narebski
2006-10-25  8:48                                 ` Jeff King
     [not found]                                   ` <Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
     [not found]                                     ` <20061025094900.GA26989@coredump.intra.peff.net>
2006-10-25  9:19                                   ` David Rientjes
2006-10-25  9:32                                     ` Jakub Narebski
2006-10-25  9:49                                     ` Jeff King
2006-10-25 13:49                                       ` Andreas Ericsson
2006-10-25 21:51                                         ` David Lang
2006-10-25 22:15                                           ` Shawn Pearce
2006-10-25 22:29                                             ` Jakub Narebski
2006-10-25 22:44                                               ` Petr Baudis
2006-10-25 23:15                                                 ` Jakub Narebski
2006-10-26  1:06                                                 ` Horst H. von Brand
2006-10-25 22:41                                             ` David Lang
2006-10-25 17:21                                       ` David Rientjes
2006-10-25 21:03                                         ` Jeff King
2006-10-26 11:15                                         ` Andreas Ericsson
2006-10-26 16:30                                           ` David Lang
2006-10-26 17:03                                             ` Nicolas Pitre
2006-10-26 17:04                                               ` David Lang
2006-10-26 17:16                                                 ` Linus Torvalds
2006-10-26 17:24                                                 ` Nicolas Pitre
2006-10-26 17:45                                               ` Jakub Narebski
2006-10-25 21:08                                   ` Junio C Hamano
2006-10-25 21:16                                     ` Jeff King
2006-10-25 21:32                                       ` Junio C Hamano
2006-10-25 21:50                                     ` Junio C Hamano
2006-10-26 11:25                                     ` Andreas Ericsson
2006-10-26  2:29                             ` Linus Torvalds
2006-10-17 22:03                 ` Linus Torvalds
2006-10-17 22:53                   ` Aaron Bentley
2006-10-17 23:09                     ` Linus Torvalds
2006-10-18  0:23                       ` Aaron Bentley
2006-10-18  0:46                         ` Jakub Narebski
     [not found]                         ` <200610180246.18758.jnareb@gmail.com>
2006-10-18  1:00                           ` Aaron Bentley
2006-10-18  1:25                             ` Carl Worth
2006-10-18  3:10                               ` Aaron Bentley
2006-10-18  8:39                                 ` Andreas Ericsson
2006-10-18  9:04                                   ` Peter Baumann
2006-10-18  9:07                                   ` Jakub Narebski
2006-10-18 10:32                                   ` Matthew D. Fuller
2006-10-18 11:19                                     ` Andreas Ericsson
2006-10-18 12:43                                       ` Matthew D. Fuller
     [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
2006-10-18 13:02                                           ` Sean
2006-10-18 13:02                                           ` Sean
2006-10-18 13:10                                         ` Jakub Narebski
2006-10-18 16:07                                         ` Linus Torvalds
2006-10-18 15:38                                 ` Carl Worth
2006-10-19  9:10                                   ` Matthew D. Fuller
2006-10-19 11:15                                     ` Andreas Ericsson
2006-10-19 12:04                                       ` Matthieu Moy
2006-10-19 12:33                                         ` Petr Baudis
2006-10-19 13:44                                           ` Matthieu Moy
2006-10-19 16:03                                             ` Carl Worth
2006-10-19 16:38                                               ` Matthieu Moy
2006-10-20 11:24                                                 ` Jakub Narebski
2006-10-20 11:50                                           ` Jakub Narebski
2006-10-20 13:26                                             ` Jakub Narebski
2006-10-20 23:19                                             ` Junio C Hamano
2006-10-21  0:07                                               ` Linus Torvalds
2006-10-21  1:09                                                 ` Junio C Hamano
2006-10-21  1:19                                                   ` Linus Torvalds
2006-10-21  1:27                                                     ` Junio C Hamano
2006-10-21  1:55                                                       ` Linus Torvalds
2006-10-21  8:32                                                         ` Jakub Narebski
2006-10-19 11:27                                     ` Karl Hasselström
2006-10-19 11:46                                       ` Petr Baudis
2006-10-19 16:01                                         ` Matthew D. Fuller
2006-10-19 17:06                                           ` Matthew D. Fuller
2006-10-18  3:35                             ` Linus Torvalds
2006-10-19  3:10                               ` Aaron Bentley
2006-10-19  5:21                                 ` Carl Worth
2006-10-19  5:56                                   ` Martin Pool
2006-10-19 14:58                                   ` Aaron Bentley
2006-10-19 16:59                                     ` Carl Worth
2006-10-19 23:01                                       ` Aaron Bentley
2006-10-19 23:42                                         ` Carl Worth
2006-10-20  1:06                                           ` Aaron Bentley
2006-10-20  5:05                                             ` Linus Torvalds
2006-10-20  7:47                                               ` Lachlan Patrick
2006-10-20  8:38                                                 ` Johannes Schindelin
2006-10-20 10:13                                                   ` Petr Baudis
2006-10-20 11:09                                                   ` Jakub Narebski
2006-10-20 11:37                                                     ` Johannes Schindelin
2006-10-20 12:03                                                       ` Jakub Narebski
2006-10-20 12:48                                                         ` Johannes Schindelin
2006-10-20 17:23                                                       ` David Lang
2006-10-20 10:16                                                 ` Petr Baudis
2006-10-20  9:57                                             ` Jakub Narebski
2006-10-20 10:02                                               ` Matthieu Moy
2006-10-20 10:45                                                 ` Andy Whitcroft
2006-10-20 10:45                                               ` James Henstridge
2006-10-20 12:01                                                 ` Jakub Narebski
2006-10-20 11:00                                             ` Jakub Narebski
2006-10-20 14:12                                             ` Jeff King
2006-10-20 14:40                                               ` Jakub Narebski
2006-10-20 14:52                                                 ` Johannes Schindelin
2006-10-20 15:34                                                   ` Jakub Narebski
2006-10-21 17:57                                               ` Aaron Bentley
2006-10-21 18:20                                                 ` Jakub Narebski
2006-10-22 14:27                                                   ` Matthieu Moy
2006-10-20 21:48                                             ` Carl Worth
2006-10-21 13:01                                               ` Matthew D. Fuller
2006-10-21 14:08                                                 ` Jakub Narebski
2006-10-21 16:31                                                   ` Erik Bågfors
2006-10-21 16:59                                                     ` Jakub Narebski
2006-10-21 17:41                                                       ` Jakub Narebski
2006-10-21 18:11                                                   ` Matthew D. Fuller
2006-10-21 19:19                                                     ` Jeff King
2006-10-21 19:30                                                       ` Jakub Narebski
2006-10-21 19:47                                                         ` Jan Hudec
2006-10-21 19:55                                                         ` Linus Torvalds
2006-10-21 20:19                                                           ` Jakub Narebski
2006-10-21 21:46                                                       ` Matthew D. Fuller
     [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
2006-10-21 22:06                                                           ` Sean
2006-10-21 22:25                                                         ` Jakub Narebski
2006-10-21 23:42                                                           ` Jeff Licquia
2006-10-21 23:49                                                             ` Carl Worth
2006-10-22  0:07                                                               ` Jeff Licquia
2006-10-22  0:47                                                                 ` Linus Torvalds
2006-10-22 16:02                                                               ` Petr Baudis
2006-10-25  9:52                                                               ` Andreas Ericsson
2006-10-21 19:41                                                     ` Jakub Narebski
2006-10-22 19:18                                                       ` David Clymer
2006-10-22 19:57                                                         ` Jakub Narebski
2006-10-22 20:06                                                         ` Jakub Narebski
2006-10-23 11:56                                                           ` David Clymer
2006-10-23 12:54                                                             ` Jakub Narebski
2006-10-23 15:01                                                               ` James Henstridge
2006-10-23 17:18                                                                 ` Aaron Bentley
2006-10-23 17:53                                                                   ` Jakub Narebski
2006-10-23 18:04                                                                     ` Linus Torvalds
2006-10-23 18:21                                                                       ` Jakub Narebski
2006-10-23 18:26                                                                         ` Jelmer Vernooij
2006-10-23 18:31                                                                           ` Jakub Narebski
2006-10-23 18:44                                                                             ` Jelmer Vernooij
2006-10-23 18:45                                                                             ` Linus Torvalds
2006-10-23 18:56                                                                               ` Jelmer Vernooij
2006-10-23 19:02                                                                                 ` Shawn Pearce
2006-10-23 19:12                                                                                 ` Jakub Narebski
2006-10-23 19:18                                                                                 ` Linus Torvalds
2006-10-23 18:34                                                                         ` Linus Torvalds
2006-10-23 20:06                                                                   ` Jeff King
2006-10-23 20:29                                                                     ` Jakub Narebski
2006-10-24  3:24                                                               ` David Clymer
2006-10-21 20:47                                                 ` Carl Worth
2006-10-21 20:55                                                   ` Jakub Narebski
2006-10-21 23:07                                                   ` Jeff Licquia
     [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
2006-10-21 23:25                                                       ` Sean
2006-10-21 23:25                                                       ` Sean
2006-10-22  0:46                                                       ` Jeff Licquia
     [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
2006-10-22  1:26                                                           ` Sean
2006-10-22  1:26                                                           ` Sean
2006-10-22  3:23                                                           ` Jeff Licquia
     [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
2006-10-22  3:30                                                               ` Sean
2006-10-22  3:30                                                               ` Sean
2006-10-22 10:00                                                               ` Matthew D. Fuller
     [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
2006-10-22 11:44                                                                   ` Sean
2006-10-22 11:44                                                                   ` Sean
2006-10-22 13:03                                                                   ` Matthew D. Fuller
     [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
2006-10-22 13:28                                                                       ` Sean
2006-10-22 13:28                                                                       ` Sean
2006-10-22 13:33                                                                       ` Matthew D. Fuller
     [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
2006-10-22 13:40                                                                           ` Sean
2006-10-22 13:40                                                                           ` Sean
2006-10-22 13:57                                                                           ` Matthew D. Fuller
     [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
2006-10-22 14:24                                                                               ` Sean
2006-10-22 14:24                                                                               ` Sean
2006-10-22 14:56                                                                               ` Matthew D. Fuller
2006-10-22 15:05                                                                                 ` Matthieu Moy
2006-10-22 12:46                                                   ` Matthew D. Fuller
2006-10-22 13:51                                                     ` Jakub Narebski
2006-10-22 19:36                                                   ` David Clymer
2006-10-25  9:35                                                 ` Andreas Ericsson
2006-10-25  9:46                                                   ` Jakub Narebski
2006-10-25 10:08                                                     ` James Henstridge
2006-10-25 15:54                                                       ` Carl Worth
2006-10-26  8:52                                                         ` James Henstridge
2006-10-26  9:33                                                           ` Junio C Hamano
2006-10-26  9:57                                                             ` James Henstridge
2006-10-26 10:10                                                               ` Jeff King
2006-10-26 10:52                                                                 ` Vincent Ladeuil
2006-10-26 11:13                                                                   ` Jeff King
2006-10-26 11:15                                                                     ` Jeff King
2006-10-26 12:33                                                                     ` Vincent Ladeuil
2006-10-26 13:14                                                                       ` Rogan Dawes
2006-10-26 11:18                                                                   ` Jakub Narebski
2006-10-26 15:05                                                                   ` Linus Torvalds
2006-10-26 16:04                                                                     ` Vincent Ladeuil
2006-10-26 16:21                                                                       ` Linus Torvalds
2006-10-26  9:50                                                           ` Andreas Ericsson
2006-10-25  9:57                                                   ` Matthieu Moy
2006-10-21 20:05                                               ` Aaron Bentley
2006-10-21 20:48                                                 ` Jakub Narebski
2006-10-21 22:52                                                   ` Edgar Toernig
2006-10-21 23:39                                                   ` Aaron Bentley
2006-10-22  0:04                                                     ` Carl Worth
2006-10-22  0:14                                                     ` Jakub Narebski
     [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
2006-10-21 20:53                                                   ` Sean
2006-10-21 21:10                                                     ` Linus Torvalds
2006-10-21 20:53                                                   ` Sean
2006-10-22  7:45                                                 ` Jan Hudec
2006-10-22  9:05                                                   ` Jakub Narebski
2006-10-22  9:56                                                     ` Erik Bågfors
2006-10-22 13:23                                                       ` Jakub Narebski
2006-10-22 14:11                                                         ` Erik Bågfors
2006-10-22 14:39                                                           ` Jakub Narebski
2006-10-22 14:25                                                       ` Carl Worth
2006-10-22 14:48                                                         ` Erik Bågfors
2006-10-22 15:04                                                           ` Jakub Narebski
2006-10-22 14:55                                                         ` Jakub Narebski
2006-10-22 18:53                                                         ` Matthew D. Fuller
2006-10-22 19:27                                                           ` Jakub Narebski
2006-10-23 16:57                                                           ` David Lang
2006-10-23 17:29                                                           ` Linus Torvalds
2006-10-23 22:21                                                             ` Matthew D. Fuller
2006-10-23 22:28                                                               ` David Lang
2006-10-23 22:44                                                               ` Linus Torvalds
2006-10-24  0:26                                                                 ` Matthew D. Fuller
2006-10-24 15:58                                                                   ` David Lang
2006-10-24 16:34                                                                     ` Matthew D. Fuller
2006-10-24 18:03                                                                       ` David Lang
2006-10-24 18:25                                                                         ` Jakub Narebski
2006-10-24 19:27                                                                           ` Petr Baudis
2006-10-25  0:27                                                                         ` Matthew D. Fuller
2006-10-25 22:40                                                                           ` David Lang
2006-10-25 23:53                                                                             ` Matthew D. Fuller
2006-10-26 10:13                                                                               ` Andreas Ericsson
2006-10-26 10:45                                                                                 ` Erik Bågfors
2006-10-26 11:48                                                                                 ` Jakub Narebski
2006-10-26 11:54                                                                                   ` Nicholas Allen
2006-10-26 12:13                                                                                     ` Jakub Narebski
2006-10-26 21:25                                                                                     ` Jeff King
2006-10-27  2:02                                                                                   ` Horst H. von Brand
2006-10-27  2:08                                                                                     ` Petr Baudis
2006-10-27  9:34                                                                                     ` Andreas Ericsson
2006-10-27 10:49                                                                                       ` Jakub Narebski
2006-10-27 11:41                                                                                         ` Andreas Ericsson
2006-10-27 14:46                                                                                       ` J. Bruce Fields
2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
2006-10-28 13:53                                                                                           ` Jakub Narebski
2006-10-28 14:58                                                                                             ` Jakub Narebski
2006-10-28 22:18                                                                                             ` Robin Rosenberg
2006-10-28 22:46                                                                                               ` Jakub Narebski
2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
2006-10-29 12:01                                                                                               ` Jakub Narebski
2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
2006-10-29 18:39                                                                                                   ` Jakub Narebski
2006-10-30  0:10                                                                                                 ` Theodore Tso
2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
2006-10-30 15:21                                                                                               ` Nicolas Pitre
2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
2006-10-26 12:18                                                                                   ` Jakub Narebski
2006-10-26 15:06                                                                                     ` Matthew D. Fuller
2006-10-26 13:47                                                                                 ` Aaron Bentley
2006-10-26 13:53                                                                                   ` Jakub Narebski
2006-10-26 15:13                                                                                     ` Aaron Bentley
2006-10-30 21:46                                                                             ` Jan Hudec
2006-10-23 22:45                                                               ` Jakub Narebski
2006-10-23 23:14                                                                 ` Erik Bågfors
2006-10-23 23:24                                                                   ` Linus Torvalds
2006-10-24  0:26                                                                     ` Matthew D. Fuller
2006-10-24  0:38                                                                       ` Matthew D. Fuller
2006-10-24  5:42                                                                         ` Linus Torvalds
2006-10-24  5:47                                                                           ` Shawn Pearce
2006-10-24 16:46                                                                           ` Matthew D. Fuller
2006-10-24  0:47                                                                       ` Carl Worth
2006-10-24  7:31                                                                         ` Erik Bågfors
2006-10-24 21:51                                                                         ` Erik Bågfors
2006-10-25 12:41                                                                           ` Andreas Ericsson
2006-10-25 13:15                                                                             ` Erik Bågfors
2006-10-24  0:39                                                                     ` Martin Langhoff
2006-10-24  7:52                                                                       ` Erik Bågfors
2006-10-24  8:37                                                                         ` Jakub Narebski
2006-10-24 10:11                                                                         ` Martin Langhoff
2006-10-24  9:30                                                                     ` Jelmer Vernooij
2006-10-26 15:22                                                                       ` Aaron Bentley
2006-10-25 18:41                                                                     ` Aaron Bentley
2006-10-24  9:51                                                               ` Matthieu Moy
2006-10-24 10:27                                                                 ` Jakub Narebski
2006-10-25 10:52                                                               ` Andreas Ericsson
2006-10-25 19:53                                                                 ` Junio C Hamano
2006-10-20  2:53                                           ` James Henstridge
2006-10-20  9:51                                             ` Jakub Narebski
2006-10-20 10:42                                               ` James Henstridge
2006-10-20 13:17                                                 ` Jakub Narebski
2006-10-20 13:36                                                   ` Petr Baudis
2006-10-20 14:12                                                     ` Jakub Narebski
2006-10-20 14:59                                                   ` James Henstridge
2006-10-20 22:50                                                     ` Jakub Narebski
2006-10-20 22:58                                                       ` Petr Baudis
2006-10-20 10:53                                         ` Jakub Narebski
2006-10-20 12:34                                           ` Matthieu Moy
2006-10-20 13:20                                             ` Jakub Narebski
2006-10-20 13:47                                               ` Petr Baudis
2006-10-19 17:01                                     ` Carl Worth
2006-10-19 17:14                                       ` J. Bruce Fields
2006-10-20 14:31                                         ` Jeff King
2006-10-20 15:33                                           ` J. Bruce Fields
2006-10-20 15:43                                             ` Jeff King
2006-10-19 15:25                                   ` Linus Torvalds
2006-10-19 16:13                                     ` Matthew D. Fuller
2006-10-19 16:49                                       ` Linus Torvalds
2006-10-19 18:30                                         ` Linus Torvalds
2006-10-19 18:54                                           ` Matthieu Moy
2006-10-19 20:47                                             ` Linus Torvalds
2006-10-21  5:49                                               ` Junio C Hamano
2006-10-19 23:28                                             ` Ryan Anderson
2006-10-19 19:16                                           ` Junio C Hamano
2006-10-20 10:51                                             ` Jakub Narebski
2006-10-20 15:58                                               ` Linus Torvalds
2006-10-19  5:33                                 ` Jan Hudec
2006-10-19  7:02                                 ` Erik Bågfors
2006-10-19  8:49                                   ` Christian MICHON
2006-10-19  8:58                                     ` Andreas Ericsson
2006-10-19  9:10                                       ` Matthieu Moy
2006-10-19 14:57                                         ` Tim Webster
2006-10-19 15:30                                           ` Aaron Bentley
2006-10-20  3:14                                             ` Tim Webster
2006-10-20  4:05                                               ` Aaron Bentley
2006-10-21 12:30                                                 ` Jan Hudec
2006-10-21 13:05                                                   ` Jakub Narebski
2006-10-21 13:15                                                     ` Jan Hudec
2006-10-21 13:29                                                       ` Jakub Narebski
2006-10-21 16:56                                                     ` Aaron Bentley
2006-10-21 17:03                                                       ` Jakub Narebski
2006-10-21 17:31                                                       ` Linus Torvalds
2006-10-21 17:38                                                         ` Linus Torvalds
2006-10-22  7:49                                                         ` Tim Webster
2006-10-22 17:12                                                           ` Linus Torvalds
2006-10-23  5:19                                                             ` Matthew Hannigan
2006-10-20 10:44                                             ` Jakub Narebski
2006-10-19 16:14                                           ` Matthieu Moy
2006-10-20  3:40                                             ` Tim Webster
2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
2006-10-20 10:40                                       ` Jakub Narebski
2006-10-20 13:36                                         ` Shawn Pearce
2006-10-21 12:30                                         ` Matthew D. Fuller
2006-10-19 11:37                                   ` Petr Baudis
2006-10-19 15:17                                     ` Matthew D. Fuller
2006-10-20 13:22                                 ` Horst H. von Brand
2006-10-20 13:46                                   ` Christian MICHON
2006-10-20 15:05                                     ` Jakub Narebski
2006-10-20 15:16                                       ` Johannes Schindelin
2006-10-20 15:28                                         ` Jakub Narebski
2006-10-20 15:39                                           ` Johannes Schindelin
2006-10-20 16:05                                             ` Jakub Narebski
2006-10-20 16:24                                               ` Jakub Narebski
2006-10-18  3:25                         ` Ryan Anderson
2006-10-17 23:24                     ` Jakub Narebski
2006-10-17 23:50                       ` Linus Torvalds
2006-10-17 23:35               ` Jakub Narebski
2006-10-17  9:20         ` Jakub Narebski
2006-10-17  9:40           ` Robert Collins
2006-10-17 10:08             ` Andreas Ericsson
2006-10-17 10:47               ` Matthieu Moy
2006-10-18  4:55               ` Robert Collins
2006-10-18  8:53                 ` Andreas Ericsson
2006-10-18 11:15                   ` Petr Baudis
2006-10-18 15:31                 ` Linus Torvalds
2006-10-18 15:50                   ` Jakub Narebski
2006-10-18 16:22                     ` Linus Torvalds
2006-10-17 16:41             ` Linus Torvalds
2006-10-17 22:27               ` Robert Collins
     [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
2006-10-17 23:18                   ` Sean
2006-10-17 23:18                   ` Sean
2006-10-17 23:33                   ` Petr Baudis
2006-10-18  5:26                     ` Robert Collins
2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
2006-10-18 22:14                         ` Jakub Narebski
2006-10-19  5:45                           ` Jan Hudec
2006-10-19  8:19                         ` Alexander Belchenko
2006-10-21 13:48                           ` Jan Hudec
2006-10-20  2:09                         ` Horst H. von Brand
2006-10-20  5:38                           ` Jan Hudec
2006-10-17  9:59           ` VCS comparison table Andreas Ericsson
2006-10-17  9:37       ` Robert Collins
     [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
2006-10-17 10:01           ` Sean
2006-10-17 10:01           ` Sean
2006-10-17 10:06         ` Jakub Narebski
2006-10-16 23:35     ` Linus Torvalds
2006-10-16 23:55       ` Jakub Narebski
2006-10-17  0:04         ` Johannes Schindelin
2006-10-17  0:23           ` Linus Torvalds
2006-10-17  0:36             ` Johannes Schindelin
2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
2006-10-17  7:26             ` Christian MICHON
2006-10-17  0:08         ` Linus Torvalds
2006-10-17  0:24           ` Jakub Narebski
2006-10-17  4:31           ` Aaron Bentley
2006-10-19 19:01             ` Nathaniel Smith
2006-10-20 10:32               ` Jakub Narebski
2006-10-17  0:29       ` Luben Tuikov
2006-10-17  4:24       ` Aaron Bentley
2006-10-17  7:50         ` Andreas Ericsson
2006-10-17 14:05           ` Aaron Bentley
     [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
2006-10-17 14:34               ` Sean
2006-10-17 15:05             ` Andreas Ericsson
2006-10-17 15:32               ` Matthieu Moy
2006-10-17 19:44               ` Aaron Bentley
2006-10-17 23:28                 ` Petr Baudis
2006-10-17 23:39                 ` Jakub Narebski
2006-10-18  0:24                   ` Aaron Bentley
2006-10-17  8:30         ` Jakub Narebski
2006-10-17 11:19           ` Matthieu Moy
     [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
2006-10-17 11:38               ` Sean
2006-10-17 12:03                 ` Matthieu Moy
2006-10-17 12:56                   ` Jakub Narebski
     [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
2006-10-17 12:57                     ` Sean
2006-10-17 13:44                       ` Matthieu Moy
     [not found]                         ` <20061017100150.b4919aac.seanlkml@sympatico.ca>
2006-10-17 14:01                           ` Sean
2006-10-17 14:01                           ` Sean
2006-10-17 14:19                             ` Matthieu Moy
     [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
2006-10-17 15:06                                 ` Sean
2006-10-17 15:06                                 ` Sean
2006-10-18  0:14                                 ` Petr Baudis
2006-10-18  1:36                                   ` Integrating gitweb and git-browser (was: Re: VCS comparison table) Jakub Narebski
2006-10-18  1:52                                     ` Petr Baudis
2006-10-18  1:58                                       ` Jakub Narebski
2006-10-18  2:02                                         ` Petr Baudis
2006-10-17 12:57                     ` VCS comparison table Sean
2006-10-18  0:25                   ` Petr Baudis
2006-10-18  0:38                     ` Aaron Bentley
     [not found]                     ` <4535778D.40006@utoronto.ca>
2006-10-18  0:42                       ` Petr Baudis
2006-10-18  0:48                       ` Jakub Narebski
     [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
2006-10-18  0:50                         ` Aaron Bentley
     [not found]                         ` <45357A6E.3050603@utoronto.ca>
2006-10-18  0:57                           ` Petr Baudis
2006-10-18  1:05                             ` Aaron Bentley
2006-10-18  1:11                   ` Petr Baudis
2006-10-18  6:44                     ` Matthieu Moy
2006-10-18  7:16                       ` Shawn Pearce
2006-10-17 11:38               ` Sean
2006-10-21 14:13               ` Jan Hudec
     [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
2006-10-21 14:23                   ` Sean
2006-10-21 14:23                   ` Sean
2006-10-21 16:19                     ` Erik Bågfors
2006-10-21 16:31                       ` Jakub Narebski
     [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
2006-10-21 16:35                         ` Erik Bågfors
     [not found]                           ` <BAYC1-PASMTP04FAD1FBB91BA4C07A5E79AE020@CEZ.ICE>
2006-10-21 17:33                             ` Erik Bågfors
2006-10-21 21:04                       ` Linus Torvalds
2006-10-21 23:58                         ` Linus Torvalds
2006-10-22  0:13                           ` Erik Bågfors
2006-10-22  0:22                             ` Jakub Narebski
2006-10-22  1:00                               ` Theodore Tso
2006-10-22  0:09                         ` Erik Bågfors
2006-10-27  4:51                         ` Jan Hudec
2006-10-28 11:38                           ` Jakub Narebski
2006-10-21 18:34                   ` Jan Hudec
     [not found]                     ` <20061021144704.71d75e83.seanlkml@sympatico.ca>
2006-10-21 18:47                       ` Sean
2006-10-21 18:47                       ` Sean
2006-10-17 11:45             ` Jakub Narebski
2006-10-17 12:02               ` Jakub Narebski
     [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
2006-10-17 12:07                 ` Sean
2006-10-21  8:27                   ` Jakub Narebski
2006-10-21  8:48                     ` Erik Bågfors
2006-10-17 12:07                 ` Sean
2006-10-17 13:33               ` Matthieu Moy
2006-10-17 12:00             ` Andreas Ericsson
2006-10-17 13:27               ` Matthieu Moy
2006-10-17 13:55                 ` Jakub Narebski
2006-10-17 14:08                   ` Matthieu Moy
2006-10-17 14:41                     ` Jakub Narebski
2006-10-18  0:00                       ` Petr Baudis
2006-10-18  0:30                         ` Aaron Bentley
2006-10-18  0:39                           ` Petr Baudis
2006-10-18  1:28                           ` Jakub Narebski
2006-10-18  1:44                             ` Carl Worth
2006-10-18  3:27                               ` Aaron Bentley
2006-10-18  9:20                                 ` Jakub Narebski
2006-10-18 16:31                                   ` Aaron Bentley
2006-10-21 15:56                                     ` Jan Hudec
2006-10-21 16:13                                       ` Jakub Narebski
     [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
2006-10-18  9:28                             ` Erik Bågfors
2006-10-18 11:08                               ` Petr Baudis
2006-10-18 11:17                                 ` Jakub Narebski
2006-10-18 13:09                                 ` Erik Bågfors
2006-10-18 18:03                   ` Jeff Licquia
2006-10-17 14:01                 ` Andreas Ericsson
2006-10-17 14:24                   ` Matthieu Moy
2006-10-17 14:19             ` Olivier Galibert
2006-10-17 15:37               ` Matthieu Moy
2006-10-18  1:46             ` Petr Baudis
     [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
2006-10-17 10:23           ` Sean
2006-10-17 10:23           ` Sean
2006-10-17 10:30             ` Johannes Schindelin
     [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
2006-10-17 10:35                 ` Sean
2006-10-17 10:35                 ` Sean
2006-10-17 10:45               ` Matthias Kestenholz
2006-10-17 13:48               ` Aaron Bentley
2006-10-17 19:51             ` Aaron Bentley
2006-10-21 18:58               ` Jan Hudec
     [not found]                 ` <20061021150233.c29e11c5.seanlkml@sympatico.ca>
2006-10-21 19:02                   ` Sean
2006-10-21 19:02                   ` Sean
2006-10-20  8:26             ` James Henstridge
2006-10-20 10:19               ` Jakub Narebski
2006-10-20  8:56             ` Erik Bågfors
2006-10-17 15:03         ` Linus Torvalds
2006-10-16 23:45     ` Johannes Schindelin
2006-10-17  2:40       ` Petr Baudis
2006-10-17  5:08       ` Aaron Bentley
2006-10-17  5:25         ` Carl Worth
2006-10-17  5:31         ` Shawn Pearce
2006-10-17  6:23         ` Junio C Hamano
2006-10-17 18:52           ` J. Bruce Fields
2006-10-17 19:12             ` Jakub Narebski
     [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
2006-10-17 10:23           ` Sean
2006-10-17 10:23           ` Sean
2006-10-18  6:33           ` Jeff King
2006-10-17  9:33       ` Robert Collins
2006-10-17  9:45         ` Jakub Narebski
2006-10-14 20:20 ` Jakub Narebski
2006-10-14 23:06   ` Jon Smirl
2006-10-14 23:34     ` Jakub Narebski
     [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
2006-10-15  0:03       ` Sean
2006-10-15  0:34         ` Jon Smirl
     [not found]           ` <20061014214452.8c2d2a5c.seanlkml@sympatico.ca>
2006-10-15  1:44             ` Sean
2006-10-15  0:53     ` Jakub Narebski
2006-10-15 15:37     ` Jakub Narebski
2006-10-15 18:23     ` Petr Baudis
     [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
2006-10-15 18:39         ` Sean
2006-10-15 19:24         ` Petr Baudis
2006-10-15 19:49       ` Jon Smirl
2006-10-16  3:23         ` Petr Baudis
2006-10-16  3:30           ` Jon Smirl
2006-10-17  3:52             ` Sam Vilain
2006-10-17 12:59               ` Jon Smirl
2006-10-18 23:53 [ANNOUNCE] GIT 1.4.3 Junio C Hamano
2006-10-20 12:31 ` Horst H. von Brand
2006-10-20 13:26 ` Peter Eriksen
2006-10-20 23:35 ` Junio C Hamano
2006-10-21  0:14   ` Linus Torvalds
2006-10-21  0:22     ` Petr Baudis
2006-10-21  0:31       ` Linus Torvalds
2006-10-21  9:53       ` Andreas Schwab
2006-10-22 21:09       ` Anders Larsen
2006-10-21  2:12     ` Al Viro
2006-10-21  5:29       ` Junio C Hamano
2006-10-21  5:40         ` Al Viro
2006-10-21 14:29       ` Rene Scharfe
2006-10-21  0:47   ` Nicolas Pitre
2006-10-23  0:53   ` prune/prune-packed J. Bruce Fields
2006-10-23  1:26     ` prune/prune-packed A Large Angry SCM
2006-10-23  2:36       ` [ANNOUNCE] GIT 1.4.3 J. Bruce Fields
2006-10-23  3:27       ` prune/prune-packed Junio C Hamano
2006-10-23 18:39         ` prune/prune-packed Petr Baudis
2006-10-27 21:19         ` prune/prune-packed Jon Loeliger
2006-10-27 21:55           ` prune/prune-packed Junio C Hamano
2006-10-20  9:04 Signed git-tag doesn't find default key Andy Parkins
2006-10-20 16:32 ` Linus Torvalds
2006-10-20 19:21   ` Andy Parkins
2006-10-21  0:52     ` Horst H. von Brand
2006-10-21  7:44       ` Andy Parkins
2006-10-22  3:59 prune/prune-packed J. Bruce Fields
2006-10-22  4:59 ` prune/prune-packed Junio C Hamano
2006-10-22 23:14   ` prune/prune-packed J. Bruce Fields
2006-10-25  0:07 [PATCH] xdiff: Do not consider lines starting by # hunkworthy Petr Baudis
2006-10-25  0:16 ` Junio C Hamano
2006-10-25  0:28   ` [PATCH] xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of lines Petr Baudis
2006-10-25  1:33     ` Horst H. von Brand
2006-10-25  2:18       ` Junio C Hamano
2006-10-25  0:17 ` [PATCH] xdiff: Do not consider lines starting by # hunkworthy Jakub Narebski
2006-10-25 22:22 Combined diff format documentation Jakub Narebski
2006-10-25 22:40 ` Junio C Hamano
2006-10-25 22:58   ` Jakub Narebski
2006-10-25 23:14     ` Junio C Hamano
2006-10-25 23:24     ` Junio C Hamano
2006-10-25 23:45   ` Jakub Narebski
2006-10-26  1:48   ` Horst H. von Brand
2006-10-26  3:04     ` Junio C Hamano
2006-10-26  3:44   ` [PATCH] diff-format.txt: Combined diff format documentation supplement Jakub Narebski
2006-10-26  6:15     ` Junio C Hamano
2006-10-26  7:05       ` Junio C Hamano
2006-10-26  7:10         ` Junio C Hamano
2006-10-27 12:29 Creating new repos Horst H. von Brand
2006-10-27 12:39 ` Petr Baudis
2006-10-27 17:08   ` Horst H. von Brand
2006-10-28 14:19     ` Jakub Narebski
2006-10-27 17:26 Generating docu in 1.4.3.3.g01929 Horst H. von Brand
2006-10-27 19:44 ` Sean
     [not found]   ` <20061027154433.da9b29d7.seanlkml@sympatico.ca>
2006-10-27 23:12     ` Horst H. von Brand
2006-10-28  4:24       ` Sean
2006-10-28  5:45         ` Junio C Hamano
2006-10-28  6:07           ` Sean
2006-10-28 19:04             ` Junio C Hamano
2006-10-28 19:13               ` Sean
2006-10-28 19:22                 ` Junio C Hamano
2006-10-29 19:03               ` Horst H. von Brand
2006-10-27 21:34 ` Junio C Hamano
2006-11-06  9:08 [PATCH] git-pickaxe -C -C -C Junio C Hamano
2006-11-06 16:46 ` Horst H. von Brand
2006-11-06 17:25   ` Junio C Hamano
2006-11-14 16:42 [PATCH] commit: Steer new users toward "git commit -a" rather than update-index Carl Worth
2006-11-14 18:55 ` Andy Whitcroft
2006-11-14 19:22   ` Cleaning up git user-interface warts Carl Worth
2006-11-14 19:29     ` Shawn Pearce
2006-11-14 19:59       ` Carl Worth
2006-11-14 19:47     ` Petr Baudis
2006-11-14 20:56       ` Carl Worth
2006-11-15  0:31         ` Junio C Hamano
2006-11-15  4:08           ` Petr Baudis
2006-11-15  4:33             ` Junio C Hamano
2006-11-15  4:46               ` Nicolas Pitre
2006-11-15 10:09                 ` Jakub Narebski
2006-11-15 10:15                   ` Santi Béjar
2006-11-15 10:28                     ` Jakub Narebski
2006-11-16  2:43                       ` Petr Baudis
2006-11-15 14:56                   ` Nicolas Pitre
2006-11-15 20:39               ` Petr Baudis
2006-11-15 10:05             ` Jakub Narebski
2006-11-15 10:25               ` Karl Hasselström
2006-11-15 20:51           ` Carl Worth
2006-11-15 20:57             ` Jakub Narebski
2006-11-15 22:00               ` Shawn Pearce
2006-11-15 22:17                 ` Carl Worth
2006-11-14 20:46     ` Karl Hasselström
2006-11-14 20:52     ` Nicolas Pitre
2006-11-14 21:01       ` Jakub Narebski
2006-11-14 21:32         ` Nicolas Pitre
2006-11-14 22:04           ` Jakub Narebski
2006-11-14 22:29             ` Nicolas Pitre
2006-11-14 21:10       ` Carl Worth
2006-11-14 21:30         ` Jakub Narebski
2006-11-14 21:34           ` Nicolas Pitre
2006-11-14 22:56             ` Junio C Hamano
2006-11-15  1:48               ` Nicolas Pitre
2006-11-15  2:10                 ` Junio C Hamano
2006-11-15  2:27                   ` Michael K. Edwards
2006-11-15  4:20                   ` Nicolas Pitre
2006-11-15  4:58                     ` Junio C Hamano
2006-11-15 18:03                     ` Linus Torvalds
2006-11-15 18:28                       ` Jakub Narebski
2006-11-15 20:31                         ` Josef Weidendorfer
2006-11-15 20:35                           ` Petr Baudis
2006-11-15 21:12                             ` Josef Weidendorfer
2006-11-15 21:31                               ` Linus Torvalds
2006-11-15 18:43                       ` Nicolas Pitre
2006-11-15 18:49                         ` Shawn Pearce
2006-11-15 19:05                           ` Marko Macek
2006-11-15 20:41                             ` Junio C Hamano
2006-11-15 22:07                               ` Shawn Pearce
2006-11-16  6:07                               ` Marko Macek
2006-11-16 10:36                                 ` Junio C Hamano
2006-11-15 22:28                             ` Sean
     [not found]                             ` <20061115172834.0a328154.seanlkml@sympatico.ca>
2006-11-16  3:07                               ` Petr Baudis
2006-11-15 18:58                       ` Andy Parkins
2006-11-15 19:18                         ` Linus Torvalds
2006-11-15 19:39                           ` Michael K. Edwards
2006-11-15 20:09                             ` Linus Torvalds
2006-11-15 20:21                               ` Nicolas Pitre
2006-11-15 20:40                                 ` Linus Torvalds
2006-11-15 21:08                                   ` Carl Worth
2006-11-15 21:31                                     ` Junio C Hamano
2006-11-15 21:40                                       ` Nicolas Pitre
2006-11-15 21:52                                         ` Junio C Hamano
2006-11-15 21:59                                           ` Nicolas Pitre
2006-11-15 21:45                                     ` Linus Torvalds
2006-11-15 22:52                                       ` Carl Worth
2006-11-15 23:02                                         ` Shawn Pearce
2006-11-15 23:33                                           ` Linus Torvalds
2006-11-16  0:08                                             ` Nicolas Pitre
2006-11-16  3:07                                               ` Linus Torvalds
2006-11-16  3:43                                                 ` Nicolas Pitre
2006-11-16  3:02                                             ` Michael K. Edwards
2006-11-16 11:35                                               ` Andreas Ericsson
2006-11-16 16:37                                             ` Carl Worth
2006-11-16 17:57                                               ` Michael K. Edwards
2006-11-16 18:23                                                 ` Carl Worth
2006-11-15 23:07                                         ` Sean
     [not found]                                         ` <20061115180722.83ff8990.seanlkml@sympatico.ca>
2006-11-15 23:15                                           ` Shawn Pearce
2006-11-16  7:51                                             ` Richard CURNOW
2006-11-16 23:01                                               ` Johannes Schindelin
2006-11-16  4:26                                   ` Theodore Tso
2006-11-16 11:50                                     ` Andreas Ericsson
2006-11-16 16:30                                       ` Linus Torvalds
2006-11-16 17:01                                         ` Carl Worth
2006-11-16 17:30                                           ` Linus Torvalds
2006-11-16 17:44                                             ` Sean
2006-11-16  1:40                           ` Anand Kumria
2006-11-15 19:32                         ` Junio C Hamano
2006-11-16  1:14                       ` Theodore Tso
2006-11-16  4:21                         ` Junio C Hamano
2006-11-16 11:34                           ` Alexandre Julliard
2006-11-16 14:01                             ` Petr Baudis
2006-11-16 15:48                               ` Alexandre Julliard
2006-11-16 16:07                           ` Theodore Tso
2006-11-16 16:49                             ` Theodore Tso
2006-11-16  1:20                       ` Han-Wen Nienhuys
2006-11-16  1:53                         ` Jakub Narebski
2006-11-16  2:03                         ` Junio C Hamano
2006-11-16  2:30                           ` Han-Wen Nienhuys
2006-11-16  3:27                             ` Junio C Hamano
2006-11-16  3:35                               ` Junio C Hamano
2006-11-16  4:07                               ` Junio C Hamano
2006-11-16  3:12                         ` Linus Torvalds
2006-11-16 10:31                           ` Junio C Hamano
2006-11-16 10:45                           ` Han-Wen Nienhuys
2006-11-16 11:11                             ` Junio C Hamano
2006-11-16 11:47                               ` Junio C Hamano
2006-11-16 13:03                               ` Han-Wen Nienhuys
2006-11-16 13:11                                 ` Han-Wen Nienhuys
2006-11-16 16:23                             ` Linus Torvalds
2006-11-16 16:42                               ` Han-Wen Nienhuys
2006-11-16 17:17                                 ` Linus Torvalds
2006-11-16 17:40                                   ` multi-project repos (was Re: Cleaning up git user-interface warts) Han-Wen Nienhuys
2006-11-16 18:21                                     ` Linus Torvalds
2006-11-16 18:33                                       ` multi-project repos Junio C Hamano
2006-11-16 19:01                                       ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
2006-11-16 22:21                                       ` Johannes Schindelin
2006-11-16 22:44                                         ` multi-project repos Junio C Hamano
2006-11-17  0:29                                           ` Johannes Schindelin
2006-11-16 22:49                                         ` multi-project repos (was Re: Cleaning up git user-interface warts) Linus Torvalds
2006-11-16 23:08                                           ` Linus Torvalds
2006-11-16 23:36                                           ` Johannes Schindelin
2006-11-17  0:49                                             ` Linus Torvalds
2006-11-16 23:40                                           ` Han-Wen Nienhuys
2006-11-16 23:32                                       ` Han-Wen Nienhuys
2006-11-16 17:57                                   ` Cleaning up git user-interface warts Linus Torvalds
2006-11-16 18:27                                     ` Junio C Hamano
2006-11-16 18:28                                     ` Linus Torvalds
2006-11-16 19:47                                       ` Junio C Hamano
2006-11-16 19:53                                         ` Linus Torvalds
2006-11-16 18:13                                   ` Carl Worth
2006-11-16 23:00                           ` Johannes Schindelin
2006-11-16 23:22                             ` Linus Torvalds
2006-11-17  0:05                               ` Han-Wen Nienhuys
2006-11-17  0:13                                 ` Junio C Hamano
2006-11-17  0:27                                   ` Han-Wen Nienhuys
2006-11-17  0:35                                     ` Petr Baudis
2006-11-17  0:37                                   ` Carl Worth
2006-11-17  0:39                                 ` Linus Torvalds
2006-11-17  0:52                                   ` Han-Wen Nienhuys
2006-11-16  4:30                       ` Petr Baudis
2006-11-15 20:12                   ` Petr Baudis
2006-11-15 20:26                     ` Nicolas Pitre
2006-11-15 20:50                       ` Linus Torvalds
2006-11-15 21:18                         ` Nicolas Pitre
2006-11-16  1:51                       ` Anand Kumria
2006-11-14 22:36         ` Junio C Hamano
2006-11-14 22:50           ` Junio C Hamano
2006-11-15  4:32             ` Nicolas Pitre
2006-11-15  5:35               ` Junio C Hamano
2006-11-15  6:18                 ` Shawn Pearce
2006-11-15  6:30                   ` Junio C Hamano
2006-11-15 14:01                 ` Johannes Schindelin
2006-11-15 15:03                   ` Sean
2006-11-15 15:10                   ` Nicolas Pitre
2006-11-15 18:16                     ` Junio C Hamano
2006-11-15 19:02                       ` Andy Parkins
2006-11-15 19:41                         ` Junio C Hamano
2006-11-15 20:15                           ` Nicolas Pitre
2006-11-15 20:19                           ` Carl Worth
2006-11-15 21:13                             ` Junio C Hamano
2006-11-15 22:36                               ` Carl Worth
2006-11-16  3:21                                 ` Petr Baudis
2006-11-16 10:09                                   ` Robin Rosenberg
2006-11-16 13:46                                     ` Petr Baudis
2006-11-16  0:23                       ` Han-Wen Nienhuys
2006-11-15  9:17               ` Andy Parkins
2006-11-15  9:59                 ` Jakub Narebski
2006-11-15 10:33                   ` Andy Parkins
2006-11-15 10:48                     ` Karl Hasselström
2006-11-15 11:28                       ` Andy Parkins
2006-11-15 15:41                 ` Nicolas Pitre
2006-11-15 17:59                   ` Junio C Hamano
2006-11-15 18:11                     ` Nicolas Pitre
2006-11-16 13:21                       ` Karl Hasselström
2006-11-15 17:55                 ` Junio C Hamano
2006-11-15 19:14                   ` Andy Parkins
2006-11-16  3:53                 ` Petr Baudis
2006-11-15 12:15               ` Andreas Ericsson
2006-11-15 12:31                 ` Jakub Narebski
2006-11-16 13:58               ` Petr Baudis
2006-11-16  5:12           ` Petr Baudis
2006-11-16 10:45             ` Junio C Hamano
2006-11-16 13:43               ` Petr Baudis
2006-11-16 21:49             ` Junio C Hamano
2006-11-16 22:20               ` Petr Baudis
2006-11-17  0:11             ` Han-Wen Nienhuys
2006-11-14 23:30   ` [PATCH] commit: Steer new users toward "git commit -a" rather than update-index Junio C Hamano
2006-11-15  3:53 Sometimes "Failed to find remote refs" means "try git-fetch --no-tags" Michael K. Edwards
2006-11-15  4:05 ` Junio C Hamano
2006-11-15 21:13   ` Horst H. von Brand

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git
