* LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
@ 2006-01-26 2:10 Martin Langhoff
2006-01-28 4:47 ` Linus Torvalds
0 siblings, 1 reply; 110+ messages in thread
From: Martin Langhoff @ 2006-01-26 2:10 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On 1/26/06, Linus Torvalds <torvalds@osdl.org> wrote:
> If we get an error parsing the arguments, exit.

This bug found thanks to the 'demo' effect. ;-)

The workshop had a 2hr slot -- after 2h 15m, I asked Linus if he
wanted to talk about the internals. He did, and the workshop went
on... for 2 hours more. It was actually hard to get people out of the
room.

Sadly, not many people actually played along on their laptop. Those
who did got an extra bit of help to migrate their preexisting CVS/SVN
repos ;-) (thanks to Sam Vilain for all the help!)

I'll upload the presentation material soon -- very similar to the
stuff I used @ Wellington Perl Mongers. Still text-based; given all
the talk about plumbing and porcelain, I steadfastly refuse to add
imagery.

During the presentation someone mentioned errors when running
git-cvsimport which I'm keen on hearing more about.

cheers,

m
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
2006-01-26 2:10 LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors) Martin Langhoff
@ 2006-01-28 4:47 ` Linus Torvalds
2006-01-28 5:33 ` Martin Langhoff
0 siblings, 1 reply; 110+ messages in thread
From: Linus Torvalds @ 2006-01-28 4:47 UTC (permalink / raw
To: Martin Langhoff; +Cc: Junio C Hamano, Git Mailing List

On Thu, 26 Jan 2006, Martin Langhoff wrote:
>
> During the presentation someone mentioned errors when running
> git-cvsimport which I'm keen on hearing more about.

Martin, I talked to Keith, and apparently you fixed some cvsimport problem
they had with Cairo during dinner last night? Was that something that
could have affected other people, or was it very specific to whatever
Cairo CVS insanity? I've not seen any messages from you on it..

Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
2006-01-28 4:47 ` Linus Torvalds
@ 2006-01-28 5:33 ` Martin Langhoff
2006-01-28 5:53 ` Linus Torvalds
2006-01-28 11:00 ` Keith Packard
0 siblings, 2 replies; 110+ messages in thread
From: Martin Langhoff @ 2006-01-28 5:33 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, keithp

On 1/28/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > During the presentation someone mentioned errors when running
> > git-cvsimport which I'm keen on hearing more about.
>
> Martin, I talked to Keith, and apparently you fixed some cvsimport problem
> they had with Cairo during dinner last night? Was that something that
> could have affected other people, or was it very specific to whatever
> Cairo CVS insanity? I've not seen any messages from you on it..

I've got a few small improvements to cvsimport on my laptop that I'll
push out for Junio to merge as soon as I get back to the office. I've
run "99% successful" imports of cairo and of x.org (modular and
monolithic) with all their branches and tags. It isn't literally the
20 years of commits Jim talked initially about -- cvs holds just the
last ~5 years.

The repos *are* a bit broken -- files missing (not moved, but really
missing) so some of the fixes are to make it easier to discover where
it is dying and work around it.

There are a few more things that I need to debug in cvsimport --
there's a small delta between what I should have and what I do have.
As soon as they are 100% right I'll put them on
http://locke.catalyst.net.nz/gitweb for the X.org team to have a look
at them -- and a cronjob to keep them up to date with official CVS.

BTW, have you still got that patch to git-merge to seed the commit msg
with conflicted files? ;-)

cheers,

m
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
2006-01-28 5:33 ` Martin Langhoff
@ 2006-01-28 5:53 ` Linus Torvalds
2006-01-28 6:32 ` Junio C Hamano
2006-01-29 10:12 ` Fredrik Kuivinen
2006-01-28 11:00 ` Keith Packard
1 sibling, 2 replies; 110+ messages in thread
From: Linus Torvalds @ 2006-01-28 5:53 UTC (permalink / raw
To: Martin Langhoff; +Cc: Junio C Hamano, Git Mailing List, keithp

On Sat, 28 Jan 2006, Martin Langhoff wrote:
>
> BTW, have you still got that patch to git-merge to seed the commit msg
> with conflicted files? ;-)

Nope. But it was something like the appended (totally untested, and
slightly improved).

The point being that we'd fill in a template that the committer will
hopefully edit to explain what he did to fix up the merge for each file
that had conflicts.

Linus

---
diff --git a/git-merge.sh b/git-merge.sh
index 0a158ef..9f828f3 100755
--- a/git-merge.sh
+++ b/git-merge.sh
@@ -301,5 +301,9 @@ then
 		"Automatic merge went well; stopped before committing as requested"
 	exit 0
 else
+	echo >>"$GIT_DIR/MERGE_MSG"
+	echo "Conflicts in" >>"$GIT_DIR/MERGE_MSG"
+	git-ls-files --unmerged | cut -f2 | uniq |
+		sed 's/^.*/ \0:/' >>"$GIT_DIR/MERGE_MSG"
 	die "Automatic merge failed; fix up by hand"
 fi
^ permalink raw reply related [flat|nested] 110+ messages in thread
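The pipeline in the patch can be tried on its own, outside git-merge.sh.
What follows is a minimal sketch, not Linus's code: `conflict_template`
and `sample_unmerged` are hypothetical names, the sample index entries
are made up, and `printf` stands in for the `sed` expression so that the
leading tab does not depend on a particular sed. It assumes the
`git ls-files --unmerged` line format of "<mode> <object> <stage>",
a tab, then the path.

```sh
# Hypothetical helper: read `git ls-files --unmerged` output on stdin and
# print the template that the patch appends to MERGE_MSG.
conflict_template() {
    echo
    echo "Conflicts in"
    # Field 2 (after the tab) is the path; the stage entries for one path
    # are adjacent, so uniq collapses them to a single line.
    cut -f2 | uniq | while IFS= read -r path; do
        printf '\t%s:\n' "$path"
    done
}

# Made-up index entries standing in for a real conflicted merge.
sample_unmerged() {
    printf '100644 1111111 1\tMakefile\n'
    printf '100644 2222222 2\tMakefile\n'
    printf '100644 3333333 3\tMakefile\n'
    printf '100644 4444444 2\tgit-merge.sh\n'
    printf '100644 5555555 3\tgit-merge.sh\n'
}

sample_unmerged | conflict_template
```

In a conflicted work tree the same function could presumably be fed from
`git ls-files --unmerged`, with its output appended to
`$GIT_DIR/MERGE_MSG` as the patch does.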
* Re: LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
2006-01-28 5:53 ` Linus Torvalds
@ 2006-01-28 6:32 ` Junio C Hamano
2006-01-29 10:12 ` Fredrik Kuivinen
1 sibling, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-01-28 6:32 UTC (permalink / raw
To: Linus Torvalds; +Cc: Martin Langhoff, Git Mailing List, keithp

Linus Torvalds <torvalds@osdl.org> writes:

> The point being that we'd fill in a template that the committer will
> hopefully edit to explain what he did to fix up the merge for each file
> that had conflicts.

That is a sound idea from the point of view of good practice.

While on the topic of conflicting merge, I've been wondering if
it would make sense to do the "combined diff" between stage 2,
stage 3 and the working tree file, in addition to the --ours and
--theirs enhancements you added lately. This would let you
sanity check the merge you _could_ commit, in the same format
you would see later when you examine the merge commit.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
2006-01-28 5:53 ` Linus Torvalds
2006-01-28 6:32 ` Junio C Hamano
@ 2006-01-29 10:12 ` Fredrik Kuivinen
2006-01-29 20:15 ` Junio C Hamano
1 sibling, 1 reply; 110+ messages in thread
From: Fredrik Kuivinen @ 2006-01-29 10:12 UTC (permalink / raw
To: Linus Torvalds; +Cc: Martin Langhoff, Junio C Hamano, Git Mailing List, keithp

On Sat, Jan 28, 2006 at 12:53:31AM -0500, Linus Torvalds wrote:
>
>
> On Sat, 28 Jan 2006, Martin Langhoff wrote:
> >
> > BTW, have you still got that patch to git-merge to seed the commit msg
> > with conflicted files? ;-)
>
> Nope. But it was something like the appended (totally untested, and
> slightly improved).
>
> The point being that we'd fill in a template that the committer will
> hopefully edit to explain what he did to fix up the merge for each file
> that had conflicts.
>

Would it make sense to add an optional

    mergeresult <tree>

line to merge commit objects? Here <tree> is supposed to be a SHA1 of
the tree object which corresponds to the result of the automatic part
of a merge. Hence, for a given merge commit which had conflicts
"git-diff-tree <commit SHA1> <mergeresult SHA1>" would give a diff
which shows the changes that were applied to resolve the conflict.

When the recursive merge strategy is used we actually write the
'mergeresult' tree object to the object database, so this thing should
be straightforward to implement in that case. If there is interest it
could be implemented for the resolve strategy too.

I think those mergeresult lines might be useful when implementing
git-annotate across merges too. It makes it easy to distinguish
changes which came from the merged branches and changes introduced in
the merge itself.

It would not be backwards compatible with the current git though...

- Fredrik
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
2006-01-29 10:12 ` Fredrik Kuivinen
@ 2006-01-29 20:15 ` Junio C Hamano
0 siblings, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-01-29 20:15 UTC (permalink / raw
To: Fredrik Kuivinen
Cc: Linus Torvalds, Martin Langhoff, Git Mailing List, keithp

Fredrik Kuivinen <freku045@student.liu.se> writes:

> Would it make sense to add an optional
>
>     mergeresult <tree>
>
> line to merge commit objects?

Two issues and a half.

 (1) Not all conflicting merge cases can write a sensible
     "conflicted intermediate auto-merge result". Look for cases
     where we punt in git-merge-one-file.

 (2) Modulo issue (1), it can be re-computed if and when needed,
     so this is akin to "storing rename information in the commit
     by detecting renames while merging".

 (3) Depending on the direction you pull, you would have
     logically the same "conflicted auto-merge result" that has
     <<< === >>> delimited hunks in reverse. Which one should you
     record?

And annotate would not be helped much -- if it is needed you
could recompute it at that point. Annotate needs to look at the
diff from each parent _anyway_ to assign blames.

By the way, I brought up the issue (3) because it relates to how
my latest toy "git rerere" works ;-).
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: LCA06 Cogito/GIT workshop - (Re: git-whatchanged: exit out early on errors)
2006-01-28 5:33 ` Martin Langhoff
2006-01-28 5:53 ` Linus Torvalds
@ 2006-01-28 11:00 ` Keith Packard
2006-01-28 21:08 ` [Census] So who uses git? Junio C Hamano
1 sibling, 1 reply; 110+ messages in thread
From: Keith Packard @ 2006-01-28 11:00 UTC (permalink / raw
To: Martin Langhoff; +Cc: keithp, Linus Torvalds, Junio C Hamano, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1757 bytes --]

On Sat, 2006-01-28 at 18:33 +1300, Martin Langhoff wrote:

> I've got a few small improvements to cvsimport in my laptop that I'll
> push out for Junio to merge as soon as I get back to the office. I've
> run "99% successful" imports of cairo and of x.org (modular and
> monolithic) with all their branches and tags. It isn't literally the
> 20 years of commits Jim talked initially about -- cvs holds just the
> last ~5 years.

Yeah, X CVS is a scattered mess at present. I think it would be better
to just leave that mess alone and grab a reasonably recent chunk of it
to put into a GIT repository. Save a bunch of space too. We also
haven't quite finished all of the recovery needed to span the whole
twenty years yet.

Carl and I hacked at the tool a bit to pull apart our ChangeLog-based
commit messages; extracting email addresses and separating the commit
messages from the (now useless) list of affected files. We're getting
clean cairo imports now; there are a few weirdnesses around branches
that we've seen -- one commit appears on both the branch and trunk for
some reason.

Once we're happy with the import, I'm pretty sure we'll just switch
cairo over to git and dump the CVS bits. X.org is a harder case, for
that I suspect we'll migrate individual modules over one at a time,
perhaps starting with the core X server pieces so that I can get my
work done, have it published in the main repository and not have it
also break everyone else's X server.

I'm not sure we'll need ongoing synchronization with existing X.org CVS
for long; there aren't any other developers doing any significant
changes to this part of the system, so we can abandon the losers with
no remorse.

--
keith.packard@intel.com
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* [Census] So who uses git?
2006-01-28 11:00 ` Keith Packard
@ 2006-01-28 21:08 ` Junio C Hamano
2006-01-29 2:14 ` Morten Welinder
` (3 more replies)
0 siblings, 4 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-01-28 21:08 UTC (permalink / raw
To: Keith Packard; +Cc: Martin Langhoff, Linus Torvalds, Git Mailing List

Keith Packard <keithp@keithp.com> writes:

> Once we're happy with the import, I'm pretty sure we'll just switch
> cairo over to git and dump the CVS bits. X.org is a harder case, for
> that I suspect we'll migrate individual modules over one at a time,
> perhaps starting with the core X server pieces so that I can get my work
> done, have it published in the main repository and not have it also
> break everyone else's X server.

Wow....... You are switching Cairo and X.org from CVS to git?

It could be that anything is better than CVS these days, but I
have to admit that my jaw dropped after reading this, primarily
because I've never touched anything as big as X.

Awestruck, dumbstruck,... Xstruck. Yeah, I know I should have
more faith in git. Earlier I heard Wine folks are running git
in parallel with CVS as their dual primary SCM now, and of
course git is the primary SCM for the Linux kernel project.

For things like the source code management, it takes a new
software to be at least 10 times as good as the one that has
been used, because switching _is_ a pain no matter how well the
tool helps the transition. You have to transition not just the
repository, but people who interact with it.

When the Linux kernel switched, it was not that hard to be
infinitely better than the previous one. Because the previous
one was no longer available to the kernel community; git did not
have to be 10 times better on technical merits alone when the
transition happened.

Can I hear experiences from other big projects that tried to use
git [*1*]? I suspect there are many that have tried, and I
would not be surprised at all if git did not work out well for
them. For projects that already run on a (free) SCM, I would be
very surprised if the developers find the current git 10 times
better than the SCM they have been using (probably with an
exception of CVS), unless they have a very specific need, such as
parallel development of distributed nature like the Linux
kernel.

I do not do mailing list search as often as I would like to be
doing, but I have seen some projects try git and go back to CVS.
We would learn much from our failures to support them -- what
those people found lacking.

[Footnote]

*1* Please limit yourselves to reasonably well-known "it is
surprising you haven't heard of this project" kind...
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-28 21:08 ` [Census] So who uses git? Junio C Hamano
@ 2006-01-29 2:14 ` Morten Welinder
2006-01-29 3:53 ` Junio C Hamano
2006-01-29 10:09 ` Keith Packard
` (2 subsequent siblings)
3 siblings, 1 reply; 110+ messages in thread
From: Morten Welinder @ 2006-01-29 2:14 UTC (permalink / raw
To: Junio C Hamano; +Cc: Git Mailing List

> Can I hear experiences from other big projects that tried to use
> git [*1*]? I suspect there are many that have tried, and I
> would not be surprised at all if git did not work out well for
> them.

I've been playing with Gnumeric under git.

-rw-rw-r-- 1 welinder research 270M Nov 5 09:46 gnumeric/.git/objects/pack/pack-91291de5477ddd06545b052460239b3dae89ad72.pack

270M is about 40% of the cvs repository size. Given
compression I would have expected bigger savings.

Conversion isn't perfect, probably because the cvs tree has
seen some hacking over the years. (I am not posting the URL
because I don't want to kill gnome.org.)

We haven't switched yet, but I expect that we will. We are
looking for (in no particular order):

1. Offline history.
2. Patch sets and other things that'll make it easier to maintain
   more than one branch.

In other words, pretty-much anything but cvs will fit the bill, :-./

M.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-29 2:14 ` Morten Welinder
@ 2006-01-29 3:53 ` Junio C Hamano
2006-01-29 14:19 ` Morten Welinder
0 siblings, 1 reply; 110+ messages in thread
From: Junio C Hamano @ 2006-01-29 3:53 UTC (permalink / raw
To: Morten Welinder; +Cc: git

Morten Welinder <mwelinder@gmail.com> writes:

>> Can I hear experiences from other big projects that tried to use
>> git [*1*]? I suspect there are many that have tried, and I
>> would not be surprised at all if git did not work out well for
>> them.
>
> I've been playing with Gnumeric under git.
> ...
> We haven't switched yet, but I expect that we will...

I might have sounded as if I was looking for failure report, but
success stories are of course welcome ;-). It's always good to
hear their git experiences first-hand from people in the top
echelon of public projects.

> 270M is about 40% of the cvs repository size. Given
> compression I would have expected bigger savings.

I think that 40% sounds about right. My understanding of the
underlying format CVS uses, RCS, is that it stores a full copy
of the tip of trunk uncompressed, and other versions of the file
are represented as incremental delta from that. The packed git
format does not favor particular version based on the distance
from the tip, and stores either a compressed full copy, or a
delta from some other revision (which may not necessarily be
represented as a full copy). When we store something as a delta
from something else, we limit the length of the delta chain to a
full copy to 10 (by default), so that you can get to a specific
object with at most 10 applications of delta on top of a full
copy.

Comparing these two formats for storage efficiency is tricky:

 - A full copy of the version at the tip in CVS is not
   compressed but in git a full copy is compressed -- zlib gives
   50% for typical text sources -- git has some advantage here.

 - Because of delta-length limit, we store full copy, albeit
   compressed [*1*], every ten or so versions. This trades off
   storage efficiency for run-time efficiency.

 - CVS storage records most things as delta for a long-lived
   project, and delta are less compressible (IOW, you could
   think of them as already compressed somewhat), so it is not
   _that_ inefficient to begin with.

 - Delta representation is used only when representing
   something as a delta from something else buys us enough space
   reduction over compressing it as a full copy in git. This is
   a pure improvement from the CVS format.

[Footnote]

*1* You could make different trade-off by using --depth flag
when running git-pack-objects.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-29 3:53 ` Junio C Hamano
@ 2006-01-29 14:19 ` Morten Welinder
2006-01-29 20:15 ` Junio C Hamano
0 siblings, 1 reply; 110+ messages in thread
From: Morten Welinder @ 2006-01-29 14:19 UTC (permalink / raw
To: Junio C Hamano; +Cc: git

> I think that 40% sounds about right. My understanding of the
> underlying format CVS uses, RCS, is that it stores a full copy
> of the tip of trunk uncompressed, and other versions of the file
> are represented as incremental delta from that. The packed git
> format does not favor particular version based on the distance
> from the tip, and stores either a compressed full copy, or a
> delta from some other revision (which may not necessarily be
> represented as a full copy). When we store something as a delta
> from something else, we limit the length of the delta chain to a
> full copy to 10 (by default), so that you can get to a specific
> object with at most 10 applications of delta on top of a full
> copy.

If I understand this right, that means that for a log file (in this
case a ChangeLog file) that is appended to linearly as a
function of revision number, we have...

cvs: O(n) archive size
git: O(n*n) archive size

At least that is what we get if revision N is always deltad over
revision N-1. A good deal could be saved if instead of dumping
a full copy every 10 revisions, that revision would instead be
deltad off an earlier revision, but I think it'll still be O(n*n).

(/me prepares for Linus chiming in and telling me I should not
keep ChangeLog files, :-)

M.
^ permalink raw reply [flat|nested] 110+ messages in thread
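Morten's estimate can be written out under his own simplifying
assumptions, which are not claims about how git actually packs: every
revision appends a bytes, so revision k is ka bytes long; a full copy is
stored every d revisions (d = 10 by default); each reverse delta in the
RCS file costs a small constant c; and sizes are counted before zlib
compression.

```latex
\begin{align*}
S_{\mathrm{cvs}}(n) &\approx \underbrace{n a}_{\text{full tip}}
    + \underbrace{(n-1)\,c}_{\text{reverse deltas}} = O(n), \\
S_{\mathrm{git}}(n) &\lesssim \underbrace{\sum_{j=1}^{n/d} (j d)\,a}_{\text{full copies}}
    + \underbrace{n a}_{\text{deltas}}
  = \frac{a d}{2}\cdot\frac{n}{d}\left(\frac{n}{d}+1\right) + n a
  = \frac{a n^{2}}{2d} + \frac{3 a n}{2}
  = O\!\left(\frac{n^{2}}{d}\right).
\end{align*}
```

With n = 1000 revisions and d = 10 the quadratic term is 50000a, against
1000a for the tip alone, so in this model the periodic full copies
dominate, and a larger d shrinks that term proportionally.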
* Re: [Census] So who uses git?
2006-01-29 14:19 ` Morten Welinder
@ 2006-01-29 20:15 ` Junio C Hamano
0 siblings, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-01-29 20:15 UTC (permalink / raw
To: Morten Welinder; +Cc: git

Morten Welinder <mwelinder@gmail.com> writes:

> If I understand this right, that means that for a log file (in this
> case a ChangeLog file) that is appended to linearly as a
> function of revision number, we have...
>
> cvs: O(n) archive size
> git: O(n*n) archive size
>
> At least that is what we get if revision N is always deltad over
> revision N-1. A good deal could be saved if instead of dumping
> a full copy every 10 revisions, that revision would instead be
> deltad off an earlier revision, but I think it'll still be O(n*n).

I have not counted O()rders, but it is not as simple as that,
because we do not really compare "versions". If version N
reverts a change version N-1 introduced since version N-2, we
would not even store a copy for version N and version N-2
separately. We just store a single copy, which may be delta
information against version N-1 (or the other way around and N-1
might be delta against N).

For the sake of math, let's say this project keeps only one
file, append only ChangeLog, with a straight line of development
without branches ("single strand of pearls"), and has revisions
1..N.

In RCS, you would have a full copy of the revision N, and
revision J is recorded as delta from revision J+1 for 1 <= J < N.
This delta is similar to "ed" script, and going backwards in the
history for the ChangeLog example means only line deletion is
involved, so what was removed is not recorded. It records how
many lines are removed from where. This is _very_ efficient and
compact.

In git, we would have a full copy of version N (because we favor
keeping larger blob associated with newer commits as a full
copy), and essentially the same thing as RCS happens. The only
difference is that our "delta" is binary delta, but in this
case, it just records "copy N bytes from here to here" which
results in about the same amount of information to represent
each delta. As you say, if (10 < N), we would have a full copy
every once in a while. You could use depth other than the
default to make this chaining longer and if you did so, your
repository would be *very* compactly compressed.

However, retrieving cost of version 1 is quite different. RCS
format is O(n) -- you start from the tip, extract and interpret
(N-1) deltas and apply them in turn to get to what you want.
The cost of extracting an arbitrary version is bounded in git
packfile, because you need to do such an "extract, interpret and
apply" at most $depth cycles.

This is primarily because we do not store "versions" but
individual objects, and do not apply "newer revisions are far
more likely to be accessed often" heuristics, which RCS format
is designed for.
^ permalink raw reply [flat|nested] 110+ messages in thread
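The retrieval-cost contrast can be sketched with a toy model. It is an
assumption-laden simplification rather than git's actual packing logic:
revisions form a single chain, only the tip is a full copy in the
RCS-style case, and in the pack-style case a full copy is assumed every
depth + 1 revisions counting back from the tip. The function names are
hypothetical.

```sh
# Deltas to apply to reach revision $2 when only the tip (revision $1)
# is stored as a full copy, as in the RCS-style chain described above.
rcs_delta_applications() {
    echo $(( $1 - $2 ))
}

# Same question when a full copy is assumed every ($3 + 1) revisions
# counting back from the tip $1, so no chain is longer than depth $3.
pack_delta_applications() {
    echo $(( ($1 - $2) % ($3 + 1) ))
}

n=1000      # number of revisions in the toy history
depth=10    # default chain limit mentioned in the thread

echo "RCS-style:  $(rcs_delta_applications "$n" 1) delta applications"   # 999
echo "pack-style: $(pack_delta_applications "$n" 1 "$depth") delta applications"   # 9
```

In this model the RCS-style cost grows with the distance from the tip,
while the pack-style cost never exceeds the chosen depth, whichever
revision is requested.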
* Re: [Census] So who uses git?
2006-01-28 21:08 ` [Census] So who uses git? Junio C Hamano
2006-01-29 2:14 ` Morten Welinder
@ 2006-01-29 10:09 ` Keith Packard
2006-01-29 11:18 ` Radoslaw Szkodzinski
2006-01-29 18:37 ` Dave Jones
2006-01-30 18:58 ` Carl Baldwin
3 siblings, 1 reply; 110+ messages in thread
From: Keith Packard @ 2006-01-29 10:09 UTC (permalink / raw
To: Junio C Hamano
Cc: cworth, keithp, Martin Langhoff, Linus Torvalds, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 4034 bytes --]

On Sat, 2006-01-28 at 13:08 -0800, Junio C Hamano wrote:
> Keith Packard <keithp@keithp.com> writes:
>
> > Once we're happy with the import, I'm pretty sure we'll just switch
> > cairo over to git and dump the CVS bits. X.org is a harder case, for
> > that I suspect we'll migrate individual modules over one at a time,
> > perhaps starting with the core X server pieces so that I can get my work
> > done, have it published in the main repository and not have it also
> > break everyone else's X server.
>
> Wow....... You are switching Cairo and X.org from CVS to git?

We're not switching 'X.org', we're switching the X server core. X.org
is now broken into many separate projects, and each one will get to
choose SCM on their own. I expect to migrate the ones I maintain and
use to git, but migration of the dead code is unlikely to ever happen
(and there's lots of dead code)

> It could be that anything is better than CVS these days, but I
> have to admit that my jaw dropped after reading this, primarily
> because I've never touched anything as big as X.
>
> Awestruck, dumbstruck,... Xstruck. Yeah, I know I should have
> more faith in git. Earlier I heard Wine folks are running git
> in parallel with CVS as their dual primary SCM now, and of
> course git is the primary SCM for the Linux kernel project.
>
> For things like the source code management, it takes a new
> software to be at least 10 times as good as the one that has
> been used, because switching _is_ a pain no matter how well the
> tool helps the transition. You have to transition not just the
> repository, but people who interact with it.

Fortunately, there are very few people involved with any specific piece
of the X.org distribution; there's really only one or two people
actively developing the X.org core server, so that part of the migration
will be easy. Our users will be stuck, but there aren't many of them
either, and git makes just sucking the current bits pretty easy.

> When the Linux kernel switched, it was not that hard to be
> infinitely better than the previous one. Because the previous
> one was no longer available to the kernel community; git did not
> have to be 10 times better on technical merits alone when the
> transition happened.

git really does look 10x better than CVS at this point; mostly social
issues are now blocking X development as weaker developers are refused
access to source code management to protect the project from damage. git
eliminates that barrier, and should let many new developers experiment
and share their results without affecting my work

> Can I hear experiences from other big projects that tried to use
> git [*1*]? I suspect there are many that have tried, and I
> would not be surprised at all if git did not work out well for
> them. For projects that already run on a (free) SCM, I would be
> very surprised if the developers find the current git 10 times
> better than the SCM they have been using (probably with an
> exception of CVS), unless they have a very specific need, such as
> parallel development of distributed nature like the Linux
> kernel.

Everyone *wants* parallel distributed development, CVS prevents it. And,
remember that this is *not* a huge project, the core X server is only 2M
lines of source code. We separate out all of the drivers, libraries and
applications. Doing the migration in pieces allows us to incrementally
affect developers, and repair issues without suspending all development.

I don't know of other huge projects moving to git; it's not all that
interesting as we know the tool is stable and will scale to support our
project already. Also, hg and bzr are not ready for production use in my
opinion; hg as it appears likely a flag day will be required before 1.0,
and bzr because they didn't focus on repository format, and have
suggested that they will switch to a hash-addressed scheme at some point
in the future...

--
keith.packard@intel.com
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-29 10:09 ` Keith Packard
@ 2006-01-29 11:18 ` Radoslaw Szkodzinski
2006-01-29 18:12 ` Greg KH
2006-01-30 22:51 ` Alex Riesen
0 siblings, 2 replies; 110+ messages in thread
From: Radoslaw Szkodzinski @ 2006-01-29 11:18 UTC (permalink / raw
To: Keith Packard
Cc: Junio C Hamano, cworth, Martin Langhoff, Linus Torvalds, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1846 bytes --]

Keith Packard wrote:
> Fortunately, there are very few people involved with any specific piece
> of the X.org distribution; there's really only one or two people
> actively developing the X.org core server, so that part of the migration
> will be easy. Our users will be stuck, but there aren't many of them
> either, and git makes just sucking the current bits pretty easy.
>

Not under Windows (bleh), but its support for Cygwin is getting better
and better.

> I don't know of other huge projects moving to git; it's not all that
> interesting as we know the tool is stable and will scale to support our
> project already. Also, hg and bzr are not ready for production use in my
> opinion; hg as it appears likely a flag day will be required before 1.0,

I haven't seen any such flag day since 0.3. Repository format seems
stable, except rename and modes support (these will be added in a
compatible way I think). 0.8 release is imminent (today or tomorrow).

I personally wouldn't mind git - it's great.

The only drawback is local cloning. This operation is like 4x slower
than plain copying of the repository. Probably because it works like an
ssh clone - creates a pack, copies it, then unpacks. This is just
inefficient on a local machine.

> and bzr because they didn't focus on repository format, and have
> suggested that they will switch to a hash-addressed scheme at some point
> in the future...
>

Not only that - they don't have an efficient network transfer protocol.
(they use HTTP walkers, not even supporting persistent connections and
also do too many DNS lookups)

This is very unfortunate, especially for large projects.
(branching Linux would take 3 days I think)

--
GPG Key id: 0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0 9105 9543 0453 D1F1 0BA2

AstralStorm
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-29 11:18 ` Radoslaw Szkodzinski
@ 2006-01-29 18:12 ` Greg KH
2006-01-31 18:33 ` Radoslaw Szkodzinski
2006-01-30 22:51 ` Alex Riesen
1 sibling, 1 reply; 110+ messages in thread
From: Greg KH @ 2006-01-29 18:12 UTC (permalink / raw
To: Radoslaw Szkodzinski
Cc: Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Linus Torvalds, Git Mailing List

On Sun, Jan 29, 2006 at 12:18:45PM +0100, Radoslaw Szkodzinski wrote:
>
> The only drawback is local cloning. This operation is like 4x slower
> than plain copying of the repository. Probably because it works like an
> ssh clone - creates a pack, copies it, then unpacks. This is just
> inefficient on a local machine.

Have you tried the "-l" option for cloning locally? It's _very_ fast,
even for my tiny little old laptop.

If you add a "-n" that will not checkout the source tree, so you can
compare the time of cloning with the checkout portion.

thanks,

greg k-h
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-29 18:12 ` Greg KH
@ 2006-01-31 18:33 ` Radoslaw Szkodzinski
2006-01-31 19:50 ` Radoslaw Szkodzinski
0 siblings, 1 reply; 110+ messages in thread
From: Radoslaw Szkodzinski @ 2006-01-31 18:33 UTC (permalink / raw
To: Greg KH
Cc: Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Linus Torvalds, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1115 bytes --]

Greg KH wrote:
> On Sun, Jan 29, 2006 at 12:18:45PM +0100, Radoslaw Szkodzinski wrote:
>> The only drawback is local cloning. This operation is like 4x slower
>> than plain copying of the repository. Probably because it works like an
>> ssh clone - creates a pack, copies it, then unpacks. This is just
>> inefficient on a local machine.
>
> Have you tried the "-l" option for cloning locally? It's _very_ fast,
> even for my tiny little old laptop.

Because it's cp -rl <one-tree> <second-tree> and some file modifications,
right? It's what I've been using already.

This -l option should be more prominent in the documentation.
Maybe it even already is. I taught myself to use git before 0.9.

Thank you. This helps a lot.

> If you add a "-n" that will not checkout the source tree, so you can
> compare the time of cloning with the checkout portion.

Cloning without -l option is much slower - some minutes vs below a minute.
I could have time(8)d it, but it's no use.

--
GPG Key id: 0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0 9105 9543 0453 D1F1 0BA2

AstralStorm
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 18:33 ` Radoslaw Szkodzinski
@ 2006-01-31 19:50 ` Radoslaw Szkodzinski
2006-01-31 20:43 ` Junio C Hamano
0 siblings, 1 reply; 110+ messages in thread
From: Radoslaw Szkodzinski @ 2006-01-31 19:50 UTC (permalink / raw
To: Greg KH
Cc: Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Linus Torvalds, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1284 bytes --]
Radoslaw Szkodzinski wrote:
> Cloning without -l option is much slower - some minutes vs below a minute.
> I could have time(8)d it, but it's no use.
>
Make that time(1)d.
Results for the kernel follow. Disc cache has been preheated with find.
git version: 5b2bcc7b2d546c636f79490655b3347acc91d17f
Filesystem: ext3 data=writeback
Kernel: 2.6.16-rc1-astorm2 (mostly -ck patchset with "hotfix")
Elevator: CFQ
time git clone linux-2.6.git linux-2.6.git.new
Packing 180025 objects
real 8m31.637s
user 3m19.571s
sys 0m42.211s
Extremely bad. The task is mostly cpu-bound.
Made some background applications swap out late in the process.
(that's the cause of the sys time)
time git clone -l linux-2.6.git linux-2.6.git.local
0 blocks
real 0m42.339s
user 0m2.818s
sys 0m4.040s
Good enough for me. Possibly cp -rl of objects and then a checkout.
time cp -rl linux-2.6.git linux-2.6.git.rl
real 0m18.333s
user 0m0.103s
sys 0m1.732s
Really fast, but requires additional file modification.
(namely .git/remotes/origin, removal of gitrc)
Also incompatible with apps having problems with hardlinks.
-- 
GPG Key id: 0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0 9105 9543 0453 D1F1 0BA2
AstralStorm
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 19:50 ` Radoslaw Szkodzinski
@ 2006-01-31 20:43 ` Junio C Hamano
2006-01-31 21:02 ` Radoslaw Szkodzinski
0 siblings, 1 reply; 110+ messages in thread
From: Junio C Hamano @ 2006-01-31 20:43 UTC (permalink / raw
To: Radoslaw Szkodzinski
Cc: Greg KH, Keith Packard, cworth, Martin Langhoff, Linus Torvalds, Git Mailing List
Radoslaw Szkodzinski <astralstorm@gorzow.mm.pl> writes:
> Radoslaw Szkodzinski wrote:
>> Cloning without -l option is much slower - some minutes vs below a minute.
>> I could have time(8)d it, but it's no use.
>>
>
> Make that time(1)d.
>
> Results for the kernel follow. Disc cache has been preheated with find.
While you are at it, "git clone -l -s -n" might be more interesting.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 20:43 ` Junio C Hamano
@ 2006-01-31 21:02 ` Radoslaw Szkodzinski
0 siblings, 0 replies; 110+ messages in thread
From: Radoslaw Szkodzinski @ 2006-01-31 21:02 UTC (permalink / raw
To: Junio C Hamano
Cc: Greg KH, Keith Packard, cworth, Martin Langhoff, Linus Torvalds, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1001 bytes --]
Junio C Hamano wrote:
> Radoslaw Szkodzinski <astralstorm@gorzow.mm.pl> writes:
>
>> Radoslaw Szkodzinski wrote:
>>> Cloning without -l option is much slower - some minutes vs below a minute.
>>> I could have time(8)d it, but it's no use.
>>>
>> Make that time(1)d.
>>
>> Results for the kernel follow. Disc cache has been preheated with find.
>
> While you are at it, "git clone -l -s -n" might be more interesting.
>
>
Sure:
time git clone -l -s -n linux-2.6.git linux-2.6.git.lsn
real 0m0.458s
user 0m0.020s
sys 0m0.027s
Speed demon. I'd use it, but I often need a checkout anyway, so...
time git clone -l -s linux-2.6.git linux-2.6.git.ls
real 0m35.752s
user 0m2.661s
sys 0m2.374s
Not really better than git clone -l and relies on the tools more.
However, it should make for easier repacking and pruning. I'll keep it.
-- 
GPG Key id: 0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0 9105 9543 0453 D1F1 0BA2
AstralStorm
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-29 11:18 ` Radoslaw Szkodzinski
2006-01-29 18:12 ` Greg KH
@ 2006-01-30 22:51 ` Alex Riesen
2006-01-31 21:25 ` Linus Torvalds
1 sibling, 1 reply; 110+ messages in thread
From: Alex Riesen @ 2006-01-30 22:51 UTC (permalink / raw
To: Radoslaw Szkodzinski
Cc: Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Linus Torvalds, Git Mailing List
Radoslaw Szkodzinski, Sun, Jan 29, 2006 12:18:45 +0100:
> > Fortunately, there are very few people involved with any specific piece
> > of the X.org distribution; there's really only one or two people
> > actively developing the X.org core server, so that part of the migration
> > will be easy. Our users will be stuck, but there aren't many of them
> > either, and git makes just sucking the current bits pretty easy.
>
> Not under Windows (bleh), but its support for Cygwin is getting better
> and better.
>
I use git in cygwin for a project with more than 17k files (almost 6M lines).
It's real slow on ntfs (on 3.2GHz PIV!), PITA on fat, and has some hiccups now
and then (of the kind: "windows unexpectedly does not have feature X, which
everything else has" or "windows broke a 20-year-old feature Y").
But it's more intuitive and more powerful than any alternatives here (Perforce,
SVN and CVS come to mind).
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-30 22:51 ` Alex Riesen
@ 2006-01-31 21:25 ` Linus Torvalds
2006-01-31 21:52 ` J. Bruce Fields
2006-01-31 22:01 ` Alex Riesen
0 siblings, 2 replies; 110+ messages in thread
From: Linus Torvalds @ 2006-01-31 21:25 UTC (permalink / raw
To: Alex Riesen
Cc: Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List
On Mon, 30 Jan 2006, Alex Riesen wrote:
>
> I use git in cygwin for a project with more than 17k files (almost 6M lines).
> It's real slow on ntfs (on 3.2GHz PIV!)
One thing that git does rely on is a fast "lstat()" system call. The index
file means that we almost never need to read the contents of a file to
compare, but git _does_ check that files haven't been modified, and doing
an "lstat()" on every single file it knows about is the way to do that.
Now, I suspect that you simply can't do basic filename lookups much faster
than Linux does them. The Linux VFS layer name caching reigns supreme: the
dentries are just incredibly powerful, and the reason Linux kicks ass on
many benchmarks.
And yes, git was designed for it. git is _really_ fast on Linux, but any
operating system that is so stupid that it has to call down to the
low-level filesystem for filename lookup (which is most of them, and from
what I have heard, the NT VFS layer is worse than most) will take a lot
longer.
This is sadly not something I think you can possibly avoid. Git is
literally being as fast as is humanly possible without doing explicit
locking.
You _can_ avoid the "lstat()" calls if you are willing to always
explicitly mark files that you have changed (so that the SCM can stat just
_those_ files and ignore all the others), but I personally much prefer
being able to use any random tools on the files without having to prepare
them some way.
So we could speed it up on cygwin (and yes, it would speed git up a lot
even on Linux, but since the cached lstat() case is so fast anyway, I
doubt a lot of Linux users care - the biggest win would be on a cold-cache
tree). But it would require that you explicitly _mark_ the files you edit
some way.
Btw, BK wanted that, and it wasn't _too_ painful. You had to do
	bk edit
to mark a file as being ready to be dirtied, and as a helper command you
would use
	bk editor
which would first do the "bk edit" thing and then start up your favourite
editor (the usual ${EDITOR:-${VISUAL:-vi}} rules applied) on it, and it
worked fine.
We _could_ do the same in git. I'd just prefer not to.
For small projects (or big projects with fairly few files), it really
shouldn't matter. Your 17k files example is hopefully fairly rare..
> But it's more intuitive and more powerful than any alternatives here (Perforce,
> SVN and CVS come to mind).
Good to know.
Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
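A minimal sketch of the stat-based check described in the message above: remember what lstat() reported when a file was last known to be clean, then compare a fresh lstat() against that record instead of reading the file. The struct and function names are hypothetical, and the field set is far smaller than what git's index really records; the only point is that the cost is one lstat() per tracked file.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* make lstat() visible under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical, simplified stand-in for the stat data cached per index entry. */
struct cached_stat {
	time_t mtime;
	off_t size;
	ino_t ino;
	mode_t mode;
};

/* Record what lstat() reports right now. Returns 0 on success, -1 on error. */
static int cached_stat_record(const char *path, struct cached_stat *cs)
{
	struct stat st;

	if (lstat(path, &st) < 0)
		return -1;
	cs->mtime = st.st_mtime;
	cs->size = st.st_size;
	cs->ino = st.st_ino;
	cs->mode = st.st_mode;
	return 0;
}

/*
 * Compare a fresh lstat() against the cached record without opening the file.
 * Returns 1 if anything differs, 0 if the file looks unchanged,
 * and -1 if lstat() fails (for example because the file is gone).
 */
static int cached_stat_changed(const char *path, const struct cached_stat *cs)
{
	struct stat st;

	if (lstat(path, &st) < 0)
		return -1;
	return st.st_mtime != cs->mtime ||
	       st.st_size != cs->size ||
	       st.st_ino != cs->ino ||
	       st.st_mode != cs->mode;
}
```

A caller would run cached_stat_record() once per file after checkout and cached_stat_changed() on every status-like operation. On a filesystem where each lstat() is a network round trip, that second step is where the time goes.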
* Re: [Census] So who uses git?
2006-01-31 21:25 ` Linus Torvalds
@ 2006-01-31 21:52 ` J. Bruce Fields
2006-01-31 22:01 ` Alex Riesen
1 sibling, 0 replies; 110+ messages in thread
From: J. Bruce Fields @ 2006-01-31 21:52 UTC (permalink / raw
To: Linus Torvalds
Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List
On Tue, Jan 31, 2006 at 01:25:08PM -0800, Linus Torvalds wrote:
> So we could speed it up on cygwin (and yes, it would speed git up a lot
> even on Linux, but since the cached lstat() case is so fast anyway, I
> doubt a lot of Linux users care - the biggest win would be on a cold-cache
> tree). But it would require that you explicitly _mark_ the files you edit
> some way.
You couldn't depend on a combination of lstat's and some kind of
filesystem change notifications?
--b.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 21:25 ` Linus Torvalds
2006-01-31 21:52 ` J. Bruce Fields
@ 2006-01-31 22:01 ` Alex Riesen
[not found] ` <20060201013901.GA16832@mail.com>
1 sibling, 1 reply; 110+ messages in thread
From: Alex Riesen @ 2006-01-31 22:01 UTC (permalink / raw
To: Linus Torvalds
Cc: Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List
Linus Torvalds, Tue, Jan 31, 2006 22:25:08 +0100:
> > I use git in cygwin for a project with more than 17k files (almost
> > 6M lines). It's real slow on ntfs (on 3.2GHz PIV!)
> ...
> So we could speed it up on cygwin (and yes, it would speed git up a lot
> even on Linux, but since the cached lstat() case is so fast anyway, I
> doubt a lot of Linux users care - the biggest win would be on a cold-cache
> tree). But it would require that you explicitly _mark_ the files you edit
> some way.
I'd hate to have to do that. The project in question is just stuffed up beyond
all reason, windows' VFS is a sorry piece of junk, and I care much more about
how comfortable the tool is.
> ...
> For small projects (or big projects with fairly few files), it really
> shouldn't matter. Your 17k files example is hopefully fairly rare..
I'd say it is fairly common. It's what projects in big companies, driven by
paranoia and suffering from chronic undereducation, usually end up with.
Frequently right from the start...
^ permalink raw reply [flat|nested] 110+ messages in thread
[parent not found: <20060201013901.GA16832@mail.com>]
* Re: [Census] So who uses git?
[not found] ` <20060201013901.GA16832@mail.com>
@ 2006-02-01 2:04 ` Linus Torvalds
2006-02-01 2:09 ` Linus Torvalds
` (4 more replies)
2006-02-01 2:52 ` Martin Langhoff
1 sibling, 5 replies; 110+ messages in thread
From: Linus Torvalds @ 2006-02-01 2:04 UTC (permalink / raw
To: Ray Lehtiniemi
Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List
On Tue, 31 Jan 2006, Ray Lehtiniemi wrote:
>
> for what it's worth, it's certainly true here... i'm using git to help
> me manage a similar project where i work.
Hmm. We _could_ actually fairly easily add a flag to the index which means
"don't even bother comparing - assume same", and then have specific
operations to clear that flag.
That would allow people with slow filesystems (not just Windows: even
under Linux, the cold-cache case is always going to be pretty slow) to
have a _choice_: they could continue to use git as it is done now (explicit
checks), _or_ they could mark all their index caches as "implicitly
up-to-date" and use a separate program to mark them as being potentially
edited.
We still have one unused bit in the cache-entry "ce_flags", so we wouldn't
even need to break any existing index files with it.
We'd just need to have two new (fast) operations:
 - mark one or more files as being "implicitly up-to-date"
   "git checkout" would do this if the proper flag was set in the
   .git/config file.
   "git-update-index --refresh" would do this for files that weren't
   already implicitly up-to-date _and_ the refresh actually showed it to
   match (and the .git/config file said so).
 - mark one or more files as _not_ being implicitly up-to-date:
   people would do this by hand when editing a file (or when just
   deciding that they want git to re-check everything again)
They're fast, because they are purely in the cache (well, git-update-index
obviously isn't, but the new op wouldn't be any _slower_ than the old
one).
Looks simple enough. The big thing to remember is to clear that
"implicitly up-to-date" flag whenever we make changes (ie we'd probably
make "add_cache_entry()" always clear it, possibly with a flag to add it
as "pre-verified" which would set it).
Comments? Junio, what do you think?
> we're working on a vendor supplied tree which is also hacked upon
> by various VAR companies. the tree in question has ~20,000 files
> totalling nearly 1.4 GB of source files, ms word docs, binary-only
> libraries for a wide array of processor variants, windows exe
> files, video clips, etc. (however, the amount of actual source code
> interspersed in there is only about 6000 files totaling about 112 MB)
>
> here's a repo sitting on the local linux filesystem with cold cache:
>
> reiserfs$ time git update-index --refresh
> real 0m17.422s
> user 0m0.025s
> sys 0m0.320s
.. somewhat painful, but with enough memory this is hopefully a pretty
rare case.
> and with hot cache
>
> reiserfs$ time git update-index --refresh
> real 0m0.151s
> user 0m0.020s
> sys 0m0.067s
This is how it _should_ look. But:
> for comparison, one of our sandboxes is sitting on an NTFS file system,
> accessed via SMB:
>
> smbfs$ time git update-index --refresh
> real 11m36.502s
> user 0m6.830s
> sys 0m5.086s
Ouch, ouch, ouch.
Sounds like every single stat() will go out over the wire. I forget what the
Linux NFS client does, but I _think_ it has a metadata timeout that avoids
this. But it might be as bad under NFS. Has anybody used git over NFS?
If it's this bad (or even close to), I guess the "mark files as up-to-date
in the index" approach is a really good idea..
Of course, the whole point of git is that you should keep your repository
close, but sometimes NFS - or similar - is enforced upon you by other
issues, like the fact that the powers-that-be want anonymous workstations
and everybody should work with a home-directory automounted over NFS..
Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
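To make the "one unused bit" remark concrete, here is a small sketch of a 16-bit flag word laid out the way the cache.h masks in this thread suggest: 12 bits of name length, 2 bits of merge stage, one update bit, and the top bit free for an "implicitly up-to-date" marker. The mask values and the name CE_VALID come from the patches in this thread; the helper functions are hypothetical, and they work in host byte order for simplicity, whereas the patches wrap the values in htons().

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as they appear in the cache.h hunks quoted in this thread. */
#define CE_NAMEMASK   0x0fff
#define CE_STAGEMASK  0x3000
#define CE_UPDATE     0x4000
#define CE_VALID      0x8000 /* the previously unused top bit */
#define CE_STAGESHIFT 12

/* Hypothetical helpers, host byte order only. */
static uint16_t flags_create(unsigned len, unsigned stage)
{
	return (uint16_t)((len & CE_NAMEMASK) |
			  ((stage << CE_STAGESHIFT) & CE_STAGEMASK));
}

static unsigned flags_namelen(uint16_t f)
{
	return f & CE_NAMEMASK;
}

static unsigned flags_stage(uint16_t f)
{
	return (unsigned)(f & CE_STAGEMASK) >> CE_STAGESHIFT;
}

/* "mark as implicitly up-to-date" */
static uint16_t flags_mark_valid(uint16_t f)
{
	return (uint16_t)(f | CE_VALID);
}

/* "mark as potentially edited" */
static uint16_t flags_unmark_valid(uint16_t f)
{
	return (uint16_t)(f & ~CE_VALID);
}

static int flags_is_valid(uint16_t f)
{
	return (f & CE_VALID) != 0;
}
```

Because the new bit does not overlap the name-length or stage fields, setting or clearing it leaves both readable as before, which is why existing index files would not need to change.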
* Re: [Census] So who uses git?
2006-02-01 2:04 ` Linus Torvalds
@ 2006-02-01 2:09 ` Linus Torvalds
2006-02-09 5:15 ` [PATCH] "Assume unchanged" git Junio C Hamano
2006-02-01 2:31 ` [Census] So who uses git? Junio C Hamano
` (3 subsequent siblings)
4 siblings, 1 reply; 110+ messages in thread
From: Linus Torvalds @ 2006-02-01 2:09 UTC (permalink / raw
To: Ray Lehtiniemi
Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List
On Tue, 31 Jan 2006, Linus Torvalds wrote:
>
> We still have one unused bit in the cache-entry "ce_flags", so we wouldn't
> even need to break any existing index files with it.
In case it wasn't clear, the _core_ of this optimization would be as
simple as something like the appended.
The real meat is just making sure that CE_VALID gets set/cleared properly.
(That's also the most complex part, of course, but this trivial patch
might help show the basic idea)
Linus
---
diff --git a/cache.h b/cache.h
index bdbe2d6..7adc2e6 100644
--- a/cache.h
+++ b/cache.h
@@ -91,6 +91,7 @@ struct cache_entry {
 #define CE_NAMEMASK (0x0fff)
 #define CE_STAGEMASK (0x3000)
 #define CE_UPDATE (0x4000)
+#define CE_VALID (0x8000)
 #define CE_STAGESHIFT 12
 
 #define create_ce_flags(len, stage) htons((len) | ((stage) << CE_STAGESHIFT))
diff --git a/read-cache.c b/read-cache.c
index c5474d4..738fe78 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -148,7 +148,16 @@ static int ce_match_stat_basic(struct ca
 
 int ce_match_stat(struct cache_entry *ce, struct stat *st)
 {
-	unsigned int changed = ce_match_stat_basic(ce, st);
+	unsigned int changed;
+
+	/*
+	 * If it's marked as always valid in the index, it's
+	 * valid whatever the checked-out copy says
+	 */
+	if (ce->ce_flags & htons(CE_VALID))
+		return 0;
+
+	changed = ce_match_stat_basic(ce, st);
 
 	/*
 	 * Within 1 second of this sequence:
^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCH] "Assume unchanged" git
2006-02-01 2:09 ` Linus Torvalds
@ 2006-02-09 5:15 ` Junio C Hamano
2006-02-09 5:49 ` [PATCH] "Assume unchanged" git: do not set CE_VALID with --refresh Junio C Hamano
2006-02-09 5:50 ` [PATCH] ls-files: debugging aid for CE_VALID changes Junio C Hamano
0 siblings, 2 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-09 5:15 UTC (permalink / raw
To: git
Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, cworth, Martin Langhoff, Linus Torvalds
Linus Torvalds <torvalds@osdl.org> writes:
> The real meat is just making sure that CE_VALID gets set/cleared properly.
Setting is the easier part. Deciding when to ignore/clear for the
sake of safety and usability is harder. I think I got the
basics right but we might want to pass "really" from more places.
This is _not_ 1.2 material, but I think it is ready to be tested
by people who asked for this feature. It applies on top of the
recent master branch.
-- >8 --
[PATCH] "Assume unchanged" git
This adds "assume unchanged" logic, started by this message in the list
discussion recently:
	<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>
This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of. On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.
You can use two new options to update-index to set and reset the
CE_VALID bit:
	git-update-index --assume-unchanged path...
	git-update-index --no-assume-unchanged path...
These forms manipulate only the CE_VALID bit; they do not change
the object name recorded in the index file. Nor do they add a new
entry to the index.
When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:
 - update-index to explicitly register the current object name to the
   index file.
 - when update-index --refresh finds the path to be up-to-date.
 - when tools like read-tree -u and apply --index update the working
   tree file and register the current object name to the index file.
The flag is dropped upon read-tree that does not check out the index
entry. This happens regardless of the core.ignorestat settings.
Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time. However, there are cases where the CE_VALID
bit is ignored for the sake of safety and usability:
 - while "git-read-tree -m" or git-apply need to make sure that the
   paths involved in the merge do not have local modifications. This
   sacrifices performance for safety.
 - when git-checkout-index -f -q -u -a tries to see if it needs to
   checkout the paths. Otherwise you can never check anything out ;-).
 - when git-update-index --really-refresh (a new flag) tries to see if
   the index entry is up to date. You can start with everything marked
   as CE_VALID and run this once to drop CE_VALID bit for paths that
   are modified.
Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modified a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".
This version is not expected to be perfect. I think diff between
index and/or tree and working files may need some adjustment,
and there are probably other cases where we should automatically
unmark paths that are marked to be CE_VALID.
But the basics seem to work, and it is ready to be tested by people
who asked for this feature.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 apply.c | 2 +-
 cache.h | 6 +++--
 checkout-index.c | 1 +
 config.c | 5 ++++
 diff-files.c | 2 +-
 diff-index.c | 2 +-
 diff.c | 2 +-
 entry.c | 2 +-
 environment.c | 1 +
 read-cache.c | 28 +++++++++++++++++++----
 read-tree.c | 2 +-
 update-index.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++------
 write-tree.c | 2 +-
 13 files changed, 99 insertions(+), 21 deletions(-)
b169290f100cfa67b785c361bcae83f807487f5e
diff --git a/apply.c b/apply.c
index 2ad47fb..35ae48e 100644
--- a/apply.c
+++ b/apply.c
@@ -1309,7 +1309,7 @@ static int check_patch(struct patch *pat
 					return -1;
 			}
 
-			changed = ce_match_stat(active_cache[pos], &st);
+			changed = ce_match_stat(active_cache[pos], &st, 1);
 			if (changed)
 				return error("%s: does not match index", old_name);
 
diff --git a/cache.h b/cache.h
index bdbe2d6..cd58fad 100644
--- a/cache.h
+++ b/cache.h
@@ -91,6 +91,7 @@ struct cache_entry {
 #define CE_NAMEMASK (0x0fff)
 #define CE_STAGEMASK (0x3000)
 #define CE_UPDATE (0x4000)
+#define CE_VALID (0x8000)
 #define CE_STAGESHIFT 12
 
 #define create_ce_flags(len, stage) htons((len) | ((stage) << CE_STAGESHIFT))
@@ -144,8 +145,8 @@ extern int add_cache_entry(struct cache_
 extern int remove_cache_entry_at(int pos);
 extern int remove_file_from_cache(const char *path);
 extern int ce_same_name(struct cache_entry *a, struct cache_entry *b);
-extern int ce_match_stat(struct cache_entry *ce, struct stat *st);
-extern int ce_modified(struct cache_entry *ce, struct stat *st);
+extern int ce_match_stat(struct cache_entry *ce, struct stat *st, int);
+extern int ce_modified(struct cache_entry *ce, struct stat *st, int);
 extern int ce_path_match(const struct cache_entry *ce, const char **pathspec);
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, const char *type);
 extern int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object);
@@ -161,6 +162,7 @@ extern int commit_index_file(struct cach
 extern void rollback_index_file(struct cache_file *);
 
 extern int trust_executable_bit;
+extern int assume_unchanged;
 extern int only_use_symrefs;
 extern int diff_rename_limit_default;
 extern int shared_repository;
diff --git a/checkout-index.c b/checkout-index.c
index 53dd8cb..957b4a8 100644
--- a/checkout-index.c
+++ b/checkout-index.c
@@ -116,6 +116,7 @@ int main(int argc, char **argv)
 	int all = 0;
 
 	prefix = setup_git_directory();
+	git_config(git_default_config);
 	prefix_length = prefix ? strlen(prefix) : 0;
 
 	if (read_cache() < 0) {
diff --git a/config.c b/config.c
index 8355224..7dbdce1 100644
--- a/config.c
+++ b/config.c
@@ -222,6 +222,11 @@ int git_default_config(const char *var,
 		return 0;
 	}
 
+	if (!strcmp(var, "core.ignorestat")) {
+		assume_unchanged = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp(var, "core.symrefsonly")) {
 		only_use_symrefs = git_config_bool(var, value);
 		return 0;
diff --git a/diff-files.c b/diff-files.c
index d24d11c..c96ad35 100644
--- a/diff-files.c
+++ b/diff-files.c
@@ -191,7 +191,7 @@ int main(int argc, const char **argv)
 			show_file('-', ce);
 			continue;
 		}
-		changed = ce_match_stat(ce, &st);
+		changed = ce_match_stat(ce, &st, 0);
 		if (!changed && !diff_options.find_copies_harder)
 			continue;
 		oldmode = ntohl(ce->ce_mode);
diff --git a/diff-index.c b/diff-index.c
index f8a102e..12a9418 100644
--- a/diff-index.c
+++ b/diff-index.c
@@ -33,7 +33,7 @@ static int get_stat_data(struct cache_en
 			}
 			return -1;
 		}
-		changed = ce_match_stat(ce, &st);
+		changed = ce_match_stat(ce, &st, 0);
 		if (changed) {
 			mode = create_ce_mode(st.st_mode);
 			if (!trust_executable_bit &&
diff --git a/diff.c b/diff.c
index ec51e7d..c72064e 100644
--- a/diff.c
+++ b/diff.c
@@ -311,7 +311,7 @@ static int work_tree_matches(const char
 	ce = active_cache[pos];
 	if ((lstat(name, &st) < 0) ||
 	    !S_ISREG(st.st_mode) || /* careful! */
-	    ce_match_stat(ce, &st) ||
+	    ce_match_stat(ce, &st, 0) ||
 	    memcmp(sha1, ce->sha1, 20))
 		return 0;
 	/* we return 1 only when we can stat, it is a regular file,
diff --git a/entry.c b/entry.c
index 6c47c3a..8fb99bc 100644
--- a/entry.c
+++ b/entry.c
@@ -123,7 +123,7 @@ int checkout_entry(struct cache_entry *c
 	strcpy(path + len, ce->name);
 
 	if (!lstat(path, &st)) {
-		unsigned changed = ce_match_stat(ce, &st);
+		unsigned changed = ce_match_stat(ce, &st, 1);
 		if (!changed)
 			return 0;
 		if (!state->force) {
diff --git a/environment.c b/environment.c
index 0596fc6..251e53c 100644
--- a/environment.c
+++ b/environment.c
@@ -12,6 +12,7 @@
 char git_default_email[MAX_GITNAME];
 char git_default_name[MAX_GITNAME];
 int trust_executable_bit = 1;
+int assume_unchanged = 0;
 int only_use_symrefs = 0;
 int repository_format_version = 0;
 char git_commit_encoding[MAX_ENCODING_LENGTH] = "utf-8";
diff --git a/read-cache.c b/read-cache.c
index c5474d4..efbb1be 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -27,6 +27,9 @@ void fill_stat_cache_info(struct cache_e
 	ce->ce_uid = htonl(st->st_uid);
 	ce->ce_gid = htonl(st->st_gid);
 	ce->ce_size = htonl(st->st_size);
+
+	if (assume_unchanged)
+		ce->ce_flags |= htons(CE_VALID);
 }
 
 static int ce_compare_data(struct cache_entry *ce, struct stat *st)
@@ -146,9 +149,18 @@ static int ce_match_stat_basic(struct ca
 	return changed;
 }
 
-int ce_match_stat(struct cache_entry *ce, struct stat *st)
+int ce_match_stat(struct cache_entry *ce, struct stat *st, int ignore_valid)
 {
-	unsigned int changed = ce_match_stat_basic(ce, st);
+	unsigned int changed;
+
+	/*
+	 * If it's marked as always valid in the index, it's
+	 * valid whatever the checked-out copy says.
+	 */
+	if (!ignore_valid && (ce->ce_flags & htons(CE_VALID)))
+		return 0;
+
+	changed = ce_match_stat_basic(ce, st);
 
 	/*
 	 * Within 1 second of this sequence:
@@ -164,7 +176,7 @@ int ce_match_stat(struct cache_entry *ce
 	 * effectively mean we can make at most one commit per second,
 	 * which is not acceptable. Instead, we check cache entries
 	 * whose mtime are the same as the index file timestamp more
-	 * careful than others.
+	 * carefully than others.
 	 */
 	if (!changed &&
 	    index_file_timestamp &&
@@ -174,10 +186,10 @@ int ce_match_stat(struct cache_entry *ce
 	return changed;
 }
 
-int ce_modified(struct cache_entry *ce, struct stat *st)
+int ce_modified(struct cache_entry *ce, struct stat *st, int really)
 {
 	int changed, changed_fs;
-	changed = ce_match_stat(ce, st);
+	changed = ce_match_stat(ce, st, really);
 	if (!changed)
 		return 0;
 	/*
@@ -233,6 +245,11 @@ int cache_name_compare(const char *name1
 		return -1;
 	if (len1 > len2)
 		return 1;
+
+	/* Differences between "assume up-to-date" should not matter. */
+	flags1 &= ~CE_VALID;
+	flags2 &= ~CE_VALID;
+
 	if (flags1 < flags2)
 		return -1;
 	if (flags1 > flags2)
@@ -430,6 +447,7 @@ int add_cache_entry(struct cache_entry *
 	int ok_to_add = option & ADD_CACHE_OK_TO_ADD;
 	int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
+
 	pos = cache_name_pos(ce->name, ntohs(ce->ce_flags));
 
 	/* existing match? Just replace it. */
diff --git a/read-tree.c b/read-tree.c
index 5580f15..52f06e3 100644
--- a/read-tree.c
+++ b/read-tree.c
@@ -349,7 +349,7 @@ static void verify_uptodate(struct cache
 		return;
 
 	if (!lstat(ce->name, &st)) {
-		unsigned changed = ce_match_stat(ce, &st);
+		unsigned changed = ce_match_stat(ce, &st, 1);
 		if (!changed)
 			return;
 		errno = 0;
diff --git a/update-index.c b/update-index.c
index afec98d..767fd49 100644
--- a/update-index.c
+++ b/update-index.c
@@ -23,6 +23,10 @@ static int quiet; /* --refresh needing u
 static int info_only;
 static int force_remove;
 static int verbose;
+static int mark_valid_only = 0;
+#define MARK_VALID 1
+#define UNMARK_VALID 2
+
 
 /* Three functions to allow overloaded pointer return; see linux/err.h */
 static inline void *ERR_PTR(long error)
@@ -53,6 +57,25 @@ static void report(const char *fmt, ...)
 	va_end(vp);
 }
 
+static int mark_valid(const char *path)
+{
+	int namelen = strlen(path);
+	int pos = cache_name_pos(path, namelen);
+	if (0 <= pos) {
+		switch (mark_valid_only) {
+		case MARK_VALID:
+			active_cache[pos]->ce_flags |= htons(CE_VALID);
+			break;
+		case UNMARK_VALID:
+			active_cache[pos]->ce_flags &= ~htons(CE_VALID);
+			break;
+		}
+		active_cache_changed = 1;
+		return 0;
+	}
+	return -1;
+}
+
 static int add_file_to_cache(const char *path)
 {
 	int size, namelen, option, status;
@@ -94,6 +117,7 @@ static int add_file_to_cache(const char
 	ce = xmalloc(size);
 	memset(ce, 0, size);
 	memcpy(ce->name, path, namelen);
+	ce->ce_flags = htons(namelen);
 	fill_stat_cache_info(ce, &st);
 
 	ce->ce_mode = create_ce_mode(st.st_mode);
@@ -105,7 +129,6 @@ static int add_file_to_cache(const char
 		if (0 <= pos)
 			ce->ce_mode = active_cache[pos]->ce_mode;
 	}
-	ce->ce_flags = htons(namelen);
 
 	if (index_path(ce->sha1, path, &st, !info_only))
 		return -1;
@@ -128,7 +151,7 @@ static int add_file_to_cache(const char
  * For example, you'd want to do this after doing a "git-read-tree",
  * to link up the stat cache details with the proper files.
  */
-static struct cache_entry *refresh_entry(struct cache_entry *ce)
+static struct cache_entry *refresh_entry(struct cache_entry *ce, int really)
 {
 	struct stat st;
 	struct cache_entry *updated;
@@ -137,21 +160,22 @@ static struct cache_entry *refresh_entry
 	if (lstat(ce->name, &st) < 0)
 		return ERR_PTR(-errno);
 
-	changed = ce_match_stat(ce, &st);
+	changed = ce_match_stat(ce, &st, really);
 	if (!changed)
 		return NULL;
 
-	if (ce_modified(ce, &st))
+	if (ce_modified(ce, &st, really))
 		return ERR_PTR(-EINVAL);
 
 	size = ce_size(ce);
 	updated = xmalloc(size);
 	memcpy(updated, ce, size);
 	fill_stat_cache_info(updated, &st);
+
 	return updated;
 }
 
-static int refresh_cache(void)
+static int refresh_cache(int really)
 {
 	int i;
 	int has_errors = 0;
@@ -171,12 +195,19 @@ static int refresh_cache(void)
 			continue;
 		}
 
-		new = refresh_entry(ce);
+		new = refresh_entry(ce, really);
 		if (!new)
 			continue;
 		if (IS_ERR(new)) {
 			if (not_new && PTR_ERR(new) == -ENOENT)
 				continue;
+			if (really && PTR_ERR(new) == -EINVAL) {
+				/* If we are doing --really-refresh that
+				 * means the index is not valid anymore.
+				 */
+				ce->ce_flags &= ~htons(CE_VALID);
+				active_cache_changed = 1;
+			}
 			if (quiet)
 				continue;
 			printf("%s: needs update\n", ce->name);
@@ -274,6 +305,8 @@ static int add_cacheinfo(unsigned int mo
 	memcpy(ce->name, path, len);
 	ce->ce_flags = create_ce_flags(len, stage);
 	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= htons(CE_VALID);
 	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
 	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
 	if (add_cache_entry(ce, option))
@@ -317,6 +350,12 @@ static void update_one(const char *path,
 		fprintf(stderr, "Ignoring path %s\n", path);
 		return;
 	}
+	if (mark_valid_only) {
+		if (mark_valid(p))
+			die("Unable to mark file %s", path);
+		return;
+	}
+
 	if (force_remove) {
 		if (remove_file_from_cache(p))
 			die("git-update-index: unable to remove %s", path);
@@ -467,7 +506,11 @@ int main(int argc, const char **argv)
 				continue;
 			}
 			if (!strcmp(path, "--refresh")) {
-				has_errors |= refresh_cache();
+				has_errors |= refresh_cache(0);
+				continue;
+			}
+			if (!strcmp(path, "--really-refresh")) {
+				has_errors |= refresh_cache(1);
 				continue;
 			}
 			if (!strcmp(path, "--cacheinfo")) {
@@ -493,6 +536,14 @@ int main(int argc, const char **argv)
 					die("git-update-index: %s cannot chmod %s", path, argv[i]);
 				continue;
 			}
+			if (!strcmp(path, "--assume-unchanged")) {
+				mark_valid_only = MARK_VALID;
+				continue;
+			}
+			if (!strcmp(path, "--no-assume-unchanged")) {
+				mark_valid_only = UNMARK_VALID;
+				continue;
+			}
 			if (!strcmp(path, "--info-only")) {
 				info_only = 1;
 				continue;
diff --git a/write-tree.c b/write-tree.c
index f866059..addb5de 100644
--- a/write-tree.c
+++ b/write-tree.c
@@ -111,7 +111,7 @@ int main(int argc, char **argv)
 	funny = 0;
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = active_cache[i];
-		if (ntohs(ce->ce_flags) & ~CE_NAMEMASK) {
+		if (ce_stage(ce)) {
 			if (10 < ++funny) {
 				fprintf(stderr, "...\n");
 				break;
-- 
1.1.6.gbb042
^ permalink raw reply related [flat|nested] 110+ messages in thread
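The effect of the extra argument to ce_match_stat() in the patch above can be modelled in a few lines. This is a toy model rather than git code: toy_entry and toy_match_stat are made-up names, and stat_differs stands in for the result of the real lstat() comparison. It only shows why a caller that passes 0 (such as diff-files in the patch) does not notice an edit to a marked path, while a caller that passes 1 (such as read-tree's verify_uptodate) still does.

```c
#include <assert.h>

/* Toy stand-in for an index entry; both fields are hypothetical. */
struct toy_entry {
	int valid_bit;    /* models CE_VALID being set on the entry */
	int stat_differs; /* models "the real stat comparison found a change" */
};

/*
 * Same shape as the patched ce_matchStat logic in the thread:
 * nonzero means "changed". A marked entry is trusted unless the
 * caller explicitly asks to ignore the mark.
 */
static int toy_match_stat(const struct toy_entry *e, int ignore_valid)
{
	if (!ignore_valid && e->valid_bit)
		return 0; /* trusted, no comparison performed */
	return e->stat_differs;
}
```

For an entry that is marked and has in fact been edited, toy_match_stat(&e, 0) reports no change and toy_match_stat(&e, 1) reports a change, which matches the commit message's warning that update-index --refresh will not notice such edits.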
* [PATCH] "Assume unchanged" git: do not set CE_VALID with --refresh
2006-02-09 5:15 ` [PATCH] "Assume unchanged" git Junio C Hamano
@ 2006-02-09 5:49 ` Junio C Hamano
2006-02-09 5:50 ` [PATCH] ls-files: debugging aid for CE_VALID changes Junio C Hamano
1 sibling, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-09 5:49 UTC (permalink / raw
To: git
Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, cworth, Martin Langhoff, Linus Torvalds
When working with automatic assume-unchanged mode using
core.ignorestat, setting CE_VALID after --refresh makes things
more cumbersome to use. Consider this scenario:

(1) the working tree is on a filesystem with slow lstat(2).
    The user sets core.ignorestat = true.

(2) "git checkout" to switch to a different branch (or initial
    checkout) updates all paths and the index starts out with
    "all clean".

(3) The user knows she wants to edit certain paths. She uses
    update-index --no-assume-unchanged (we could call it --edit;
    the name is immaterial) to mark these paths and starts
    editing.

(4) After editing half of the paths marked to be edited, she
    runs "git status". This runs "update-index --refresh" to
    reduce the false hits from diff-files.

(5) Now the other half of the paths, since she has not changed
    them, are found to match the index, and CE_VALID is set on
    them again.

For this reason, this commit makes update-index --refresh not
set CE_VALID even after paths without CE_VALID are verified to
be up to date. The user can still run --really-refresh to force
lstat() to match the index entries to reality.
Signed-off-by: Junio C Hamano <junkio@cox.net> --- update-index.c | 9 +++++++++ 1 files changed, 9 insertions(+), 0 deletions(-) fd4e57f17733d85ed5346d70005ea900cb80b9ff diff --git a/update-index.c b/update-index.c index 767fd49..bb73050 100644 --- a/update-index.c +++ b/update-index.c @@ -172,6 +172,15 @@ static struct cache_entry *refresh_entry memcpy(updated, ce, size); fill_stat_cache_info(updated, &st); + /* In this case, if really is not set, we should leave + * CE_VALID bit alone. Otherwise, paths marked with + * --no-assume-unchanged (i.e. things to be edited) will + * reacquire CE_VALID bit automatically, which is not + * really what we want. + */ + if (!really && assume_unchanged && !(ce->ce_flags & htons(CE_VALID))) + updated->ce_flags &= ~htons(CE_VALID); + return updated; } -- 1.1.6.gbb042 ^ permalink raw reply related [flat|nested] 110+ messages in thread
* [PATCH] ls-files: debugging aid for CE_VALID changes. 2006-02-09 5:15 ` [PATCH] "Assume unchanged" git Junio C Hamano 2006-02-09 5:49 ` [PATCH] "Assume unchanged" git: do not set CE_VALID with --refresh Junio C Hamano @ 2006-02-09 5:50 ` Junio C Hamano 1 sibling, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2006-02-09 5:50 UTC (permalink / raw To: git Cc: Alex Riesen, Radoslaw Szkodzinski, Keith Packard, cworth, Martin Langhoff, Linus Torvalds This is not really part of the proposed updates for CE_VALID, but with this change, ls-files -t shows CE_VALID paths with lowercase tag letters instead of the usual uppercase. Useful for checking out what is going on. Signed-off-by: Junio C Hamano <junkio@cox.net> --- ls-files.c | 18 +++++++++++++++++- 1 files changed, 17 insertions(+), 1 deletions(-) 775ca05ee2ba7e1f54ec4db1fed7069014364f2c diff --git a/ls-files.c b/ls-files.c index 6af3b09..3f06ece 100644 --- a/ls-files.c +++ b/ls-files.c @@ -447,6 +447,22 @@ static void show_ce_entry(const char *ta if (pathspec && !match(pathspec, ce->name, len)) return; + if (tag && *tag && (ce->ce_flags & htons(CE_VALID))) { + static char alttag[4]; + memcpy(alttag, tag, 3); + if (isalpha(tag[0])) + alttag[0] = tolower(tag[0]); + else if (tag[0] == '?') + alttag[0] = '!'; + else { + alttag[0] = 'v'; + alttag[1] = tag[0]; + alttag[2] = ' '; + alttag[3] = 0; + } + tag = alttag; + } + if (!show_stage) { fputs(tag, stdout); write_name_quoted("", 0, ce->name + offset, @@ -503,7 +519,7 @@ static void show_files(void) err = lstat(ce->name, &st); if (show_deleted && err) show_ce_entry(tag_removed, ce); - if (show_modified && ce_modified(ce, &st)) + if (show_modified && ce_modified(ce, &st, 0)) show_ce_entry(tag_modified, ce); } } -- 1.1.6.gbb042 ^ permalink raw reply related [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 2:04 ` Linus Torvalds 2006-02-01 2:09 ` Linus Torvalds @ 2006-02-01 2:31 ` Junio C Hamano 2006-02-01 3:43 ` Linus Torvalds [not found] ` <20060201045337.GC25753@mail.com> 2006-02-01 16:15 ` Jason Riedy ` (2 subsequent siblings) 4 siblings, 2 replies; 110+ messages in thread From: Junio C Hamano @ 2006-02-01 2:31 UTC (permalink / raw To: Linus Torvalds; +Cc: git Linus Torvalds <torvalds@osdl.org> writes: > They're fast, because they are purely in the cache (well, git-update-index > obviously isn't, but the new op wouldn't be any _slower_ than the old > one). > > Looks simple enough. The big thing to remember is to clear that > "implicitly up-to-date" flag whenever we make changes (ie we'd probably > make "add_cache_entry()" always clear it, possibly with a flag to add it > as "pre-verified" which would set it). > > Comments? Junio, what do you think? Somehow this reminds me of a "feature" we added quite a long time ago to support "update-index without working tree". I think this should work fine as a mechanism, but I am a bit worried about the convenience and safety aspect. It _might_ make sense to do what RCS does; check out read-only copy by default and set the "assume unchanged" flag, to prevent people from accidentally modifying the working tree copy without telling the index about it. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 2:31 ` [Census] So who uses git? Junio C Hamano
@ 2006-02-01 3:43 ` Linus Torvalds
2006-02-01 7:03 ` Junio C Hamano
[not found] ` <20060201045337.GC25753@mail.com>
1 sibling, 1 reply; 110+ messages in thread
From: Linus Torvalds @ 2006-02-01 3:43 UTC (permalink / raw
To: Junio C Hamano; +Cc: git
On Tue, 31 Jan 2006, Junio C Hamano wrote:
>
> I think this should work fine as a mechanism, but I am a bit
> worried about the convenience and safety aspect. It _might_
> make sense to do what RCS does; check out read-only copy by
> default and set the "assume unchanged" flag, to prevent people
> from accidentally modifying the working tree copy without
> telling the index about it.

Yes, I think the "assume unchanged" flag goes well together with making
sure that the checked-out file is non-writable at the time.

Of course, any number of editors and other actions won't care: if you do
anything like

	for i in *.c
	do
		sed 's/xyzzy/bas/g' < $i > $i.new
		mv $i.new $i
	done

you'll never have even noticed that the old file was marked read-only. So
it's obviously not in any way any guarantee, but it probably makes sense
as a crutch.

Your point that we discussed a similar flag for the "don't require a full
checkout" is a good one: we should try to make sure that it works for both
uses. Although maybe we decided for some reason that nobody cared about
the non-checked-out case?

Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 3:43 ` Linus Torvalds @ 2006-02-01 7:03 ` Junio C Hamano 0 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2006-02-01 7:03 UTC (permalink / raw To: Linus Torvalds; +Cc: git Linus Torvalds <torvalds@osdl.org> writes: > Your point that we discussed a similar flag for the "don't require a full > checkout" is a good one: we should try to make sure that it works for both > uses. Although maybe we decided for some reason that nobody cared about > the non-checked-out case? We gave them a way to add --cacheinfo but did not do any more than that, because they are independently coming up with some hash (not necessarily be a proper git blob object name), they did not have the huge blob data with the working tree anyway, and the only thing they cared about was which paths changed and they did not even want to see how the contents changed. I.e. "diff-tree -r" was the only thing they cared about. If we end up doing "assume unchanged", I should remember to do a sensible thing for "diff-index" without --cached. It should not look at the working tree file for paths marked as such. This implies one optimization in "diff-index -p" and "diff-tree -p" may need to be disabled. They cheat and avoid expanding blob objects when their cache entries are clean and required blobs are in the working tree. If "assume unchanged" path was actually changed, such a diff would show up as a confusing unexpected change. Well, the user is asking for it, so that confusion is not _my_ problem, though ;-). ^ permalink raw reply [flat|nested] 110+ messages in thread
[parent not found: <20060201045337.GC25753@mail.com>]
* Re: [Census] So who uses git? [not found] ` <20060201045337.GC25753@mail.com> @ 2006-02-01 5:04 ` Linus Torvalds 2006-02-01 5:42 ` Junio C Hamano 1 sibling, 0 replies; 110+ messages in thread From: Linus Torvalds @ 2006-02-01 5:04 UTC (permalink / raw To: Ray Lehtiniemi; +Cc: Junio C Hamano, git On Tue, 31 Jan 2006, Ray Lehtiniemi wrote: > > what if the user wants to change the mode bits of an assume-unchanged > file with the twiddled permissions, but forgets to clear the flag > first? seems like that change is likely to get lost, especially if the > new mode is read-only.... Remember - git only cares about execute permissions. The write permissions are entirely ignored by git .. Linus ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? [not found] ` <20060201045337.GC25753@mail.com> 2006-02-01 5:04 ` Linus Torvalds @ 2006-02-01 5:42 ` Junio C Hamano 1 sibling, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2006-02-01 5:42 UTC (permalink / raw To: Ray Lehtiniemi; +Cc: Linus Torvalds, git Ray Lehtiniemi <rayl@mail.com> writes: > what if the user wants to change the mode bits of an assume-unchanged > file with the twiddled permissions, but forgets to clear the flag > first? seems like that change is likely to get lost, especially if the > new mode is read-only.... No problem, since we only record u+x bit and nothing else. Most importantly, we do not record any of the +w bits. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 2:04 ` Linus Torvalds 2006-02-01 2:09 ` Linus Torvalds 2006-02-01 2:31 ` [Census] So who uses git? Junio C Hamano @ 2006-02-01 16:15 ` Jason Riedy 2006-02-01 19:20 ` Julian Phillips 2006-02-06 21:15 ` Chuck Lever 4 siblings, 0 replies; 110+ messages in thread From: Jason Riedy @ 2006-02-01 16:15 UTC (permalink / raw Cc: Git Mailing List And Linus Torvalds writes: - - Has anybody used git over NFS? If it's this bad (or even close to), I - guess the "mark files as up-to-date in the index" approach is a really - good idea.. My normal use is on NFS (Solaris and Linux) and IBM's GPFS (AIX and Linux). I haven't noticed any particular problems, and LAPACK and the reference BLAS make a moderately sized working set of around 3000 source files. Not kernel sized, but not tiny. However, I mostly use git over NFS on a relatively slow machine. NFS is faster than the local disk... Jason ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 2:04 ` Linus Torvalds
` (2 preceding siblings ...)
2006-02-01 16:15 ` Jason Riedy
@ 2006-02-01 19:20 ` Julian Phillips
2006-02-01 19:29 ` Linus Torvalds
2006-02-06 21:15 ` Chuck Lever
4 siblings, 1 reply; 110+ messages in thread
From: Julian Phillips @ 2006-02-01 19:20 UTC (permalink / raw
To: Linus Torvalds
Cc: Ray Lehtiniemi, Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List
On Tue, 31 Jan 2006, Linus Torvalds wrote:
> Sounds like every single stat() will go out the wire. I forget what the
> Linux NFS client does, but I _think_ it has a metadata timeout that avoids
> this. But it might be as bad under NFS.
>
> Has anybody used git over NFS? If it's this bad (or even close to), I
> guess the "mark files as up-to-date in the index" approach is a really
> good idea..

As it happens, yes ... I can't say that I've noticed git being particularly
slow, but then - I've not tried running git with a local repos ... ;)

using a recentish 2.6 kernel repos, directly on the server I get:

server: linux-2.6>time git update-index --refresh
real 0m0.067s
user 0m0.015s
sys 0m0.052s

then against the same repos over NFS, I get:

client: linux-2.6>time git update-index --refresh
real 0m1.578s
user 0m0.018s
sys 0m0.366s

and if I do it from the client again soon afterward I get:

client: linux-2.6>time git update-index --refresh
real 0m0.145s
user 0m0.012s
sys 0m0.118s

>
> Of course, the whole point of git is that you should keep your repository
> close, but sometimes NFS - or similar - is enforced upon you by other
> issues, like the fact that the powers-that-be want anonymous workstations
> and everybody should work with a home-directory automounted over NFS..
>

--
Julian

---
You know it's going to be a bad day when you want to put on the clothes
you wore home from the party and there aren't any.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 19:20 ` Julian Phillips @ 2006-02-01 19:29 ` Linus Torvalds 0 siblings, 0 replies; 110+ messages in thread From: Linus Torvalds @ 2006-02-01 19:29 UTC (permalink / raw To: Julian Phillips Cc: Ray Lehtiniemi, Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List On Wed, 1 Feb 2006, Julian Phillips wrote: > > As it happens, yes ... I can't say that I've noticed git being particularly > slow, but then - I've not tried running git with a local repos ... ;) Well, NFS seems to be ok. Which is not that surprising: NFS has gotten a _lot_ of attention in the caching area (I worked on it myself a couple of years back when the page cache transition happened during 2.3.x, but happily we've had very good NFS maintainership since, so I don't get involved any more). Your numbers show that NFS is fine (my "benchmark" is that I refuse to see the kinds of commit times that "cvs commit" does - easily several minutes for a big project. If it goes over 2 seconds, it's painful, and over ten seconds is totally unacceptable). Your numbers seem to say that at least with a good network/server, NFS on Linux is not a problem at all. CIFS is likely a very different animal. I suspect the cifs people have spent a whole lot more effort on strange Windows interaction issues than on trying to make sure that cached performance is top-notch. Linus ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 2:04 ` Linus Torvalds ` (3 preceding siblings ...) 2006-02-01 19:20 ` Julian Phillips @ 2006-02-06 21:15 ` Chuck Lever 4 siblings, 0 replies; 110+ messages in thread From: Chuck Lever @ 2006-02-06 21:15 UTC (permalink / raw To: Linus Torvalds Cc: Ray Lehtiniemi, Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Martin Langhoff, Git Mailing List, Trond Myklebust [-- Attachment #1: Type: text/plain, Size: 1654 bytes --] Linus Torvalds wrote: >>for comparison, one of our sandboxes is sitting on an NTFS file system, >>accessed via SMB: >> >> smbfs$ time git update-index --refresh >> real 11m36.502s >> user 0m6.830s >> sys 0m5.086s > > > Ouch, ouch, ouch. > > Sounds like every single stat() will go out the wire. I forget what the > Linux NFS client does, but I _think_ it has a metadata timeout that avoids > this. But it might be as bad under NFS. > > Has anybody used git over NFS? If it's this bad (or even close to), I > guess the "mark files as up-to-date in the index" approach is a really > good idea.. > > Of course, the whole point of git is that you should keep your repository > close, but sometimes NFS - or similar - is enforced upon you by other > issues, like the fact that the powers-that-be want anonymous workstations > and everybody should work with a home-directory automounted over NFS.. yes, i keep my Linux kernel repository in NFS (and my stgit and git repositories too). there are some things that are slow precisely because my think time is longer than the NFS client's attribute timeout, which means that all of git's lstat()s turn into GETATTRs. using the "noatime,nodiratime,actimeo=7200" mount options can have some benefit. however, i found that keeping the repository packed provides the greatest positive impact. that means that most of the objects are in a single file, and can be validated with just one GETATTR. 
one thing we might conclude from this is that making "packing" an efficient operation (or even an incremental one) would go a long way to helping performance on network file systems. [-- Attachment #2: cel.vcf --] [-- Type: text/x-vcard, Size: 451 bytes --] begin:vcard fn:Chuck Lever n:Lever;Charles org:Network Appliance, Incorporated;Open Source NFS Client Development adr:535 West William Street, Suite 3100;;Center for Information Technology Integration;Ann Arbor;MI;48103-4943;USA email;internet:cel@citi.umich.edu title:Member of Technical Staff tel;work:+1 734 763 4415 tel;fax:+1 734 763 4434 tel;home:+1 734 668 1089 x-mozilla-html:FALSE url:http://troy.citi.umich.edu/u/cel/ version:2.1 end:vcard ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? [not found] ` <20060201013901.GA16832@mail.com> 2006-02-01 2:04 ` Linus Torvalds @ 2006-02-01 2:52 ` Martin Langhoff 2006-02-01 3:48 ` Linus Torvalds 2006-02-01 14:55 ` Alex Riesen 1 sibling, 2 replies; 110+ messages in thread From: Martin Langhoff @ 2006-02-01 2:52 UTC (permalink / raw To: Ray Lehtiniemi Cc: Alex Riesen, Linus Torvalds, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Git Mailing List On 2/1/06, Ray Lehtiniemi <rayl@mail.com> wrote: > by various VAR companies. the tree in question has ~20,000 files > totalling nearly 1.4 GB ... > reiserfs$ time git update-index --refresh If you have such a tree, your workflow _must_ be such that you know exactly what files you have changed. Asking any tool to go out and "find which of my 20K files has changed" is doable, but it's just magic that it works on recent linuxes. > for comparison, one of our sandboxes is sitting on an NTFS file system, > accessed via SMB: you have the samba stack, network, SMB/CIFS stack and NTFS itself in the middle. Replace the ethernet with carrier pigeons for a more complete picture ;-) Perhaps a local git/cygwin on NTFS would be more reasonable to benchmark? cheers, martin ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 2:52 ` Martin Langhoff @ 2006-02-01 3:48 ` Linus Torvalds 2006-02-01 19:30 ` H. Peter Anvin 2006-02-01 14:55 ` Alex Riesen 1 sibling, 1 reply; 110+ messages in thread From: Linus Torvalds @ 2006-02-01 3:48 UTC (permalink / raw To: Martin Langhoff Cc: Ray Lehtiniemi, Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Git Mailing List On Wed, 1 Feb 2006, Martin Langhoff wrote: > > If you have such a tree, your workflow _must_ be such that you know > exactly what files you have changed. Asking any tool to go out and > "find which of my 20K files has changed" is doable, but it's just > magic that it works on recent linuxes. It's not magic, and it's not all that recent. Linux FS ops have always been pretty good, and the dentry cache was introduced in 2.0.x, I think, so you'd be hard-pressed to find a Linux system that doesn't have it. Now, I bet Linux will be better (often by a factor of 2-3) than most other systems, but that still doesn't mean that 20k files is totally unreasonable on other setups. I suspect cygwin is worse than most because (a) the NT VFS layer is piss-poor and you need a kernel service to get good performance and (b) cygwin probably adds its own overhead for handling symlinks, so the "lstat()" call is probably even more expensive. Now, the networked filesystems are a potential problem for everybody. Linus ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 3:48 ` Linus Torvalds @ 2006-02-01 19:30 ` H. Peter Anvin 0 siblings, 0 replies; 110+ messages in thread From: H. Peter Anvin @ 2006-02-01 19:30 UTC (permalink / raw To: Linus Torvalds Cc: Martin Langhoff, Ray Lehtiniemi, Alex Riesen, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Git Mailing List Linus Torvalds wrote: > > It's not magic, and it's not all that recent. Linux FS ops have always > been pretty good, and the dentry cache was introduced in 2.0.x, I think, > so you'd be hard-pressed to find a Linux system that doesn't have it. > 2.1.14, I seem to remember -- it was definitely 2.1.1x-ish. I mostly recall because autofs didn't just break horribly, it took adding several dcache hooks to make it work again :) -hpa ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 2:52 ` Martin Langhoff 2006-02-01 3:48 ` Linus Torvalds @ 2006-02-01 14:55 ` Alex Riesen 2006-02-01 16:25 ` Linus Torvalds 1 sibling, 1 reply; 110+ messages in thread From: Alex Riesen @ 2006-02-01 14:55 UTC (permalink / raw To: Martin Langhoff Cc: Ray Lehtiniemi, Linus Torvalds, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Git Mailing List On 2/1/06, Martin Langhoff <martin.langhoff@gmail.com> wrote: > Perhaps a local git/cygwin on NTFS would be more reasonable to benchmark? $ time git update-index --refresh real 0m21.500s user 0m0.358s sys 0m1.406s WinNT, NTFS, 13k files, hot cache. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 14:55 ` Alex Riesen
@ 2006-02-01 16:25 ` Linus Torvalds
2006-02-02 9:12 ` Alex Riesen
0 siblings, 1 reply; 110+ messages in thread
From: Linus Torvalds @ 2006-02-01 16:25 UTC (permalink / raw
To: Alex Riesen
Cc: Martin Langhoff, Ray Lehtiniemi, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Git Mailing List
On Wed, 1 Feb 2006, Alex Riesen wrote:
>
> $ time git update-index --refresh
>
> real 0m21.500s
> user 0m0.358s
> sys 0m1.406s
>
> WinNT, NTFS, 13k files, hot cache.

That's 25% fewer files than the Linux kernel, and I can do that operation
in 0m0.062s (0.012s user, 0.048s system).

So WinNT/cygwin is about 2.5 _orders_of_magnitude_ slower here, or 340
times slower.

Now, I'm tempted to say that NT is a piece of sh*t, but the fact is, your
CPU-times seem to indicate that most of it is IO (and the "real" cost is
just 1.7 seconds, much of which is system time, which in turn itself is
probably due to the IO costs too - so even that isn't comparable with
the ).

Which may mean that you simply don't have enough memory to cache the whole
thing. Which may be NT sucking, of course ("we don't like to use more than
10% of memory for caches"), but it might also be a tunable (which is sucky
in itself, of course), but finally, it might just be that you just don't
have a ton of memory. I've got 2GB in my machines, although 1GB is plenty
to cache the kernel.

Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 16:25 ` Linus Torvalds
@ 2006-02-02 9:12 ` Alex Riesen
0 siblings, 0 replies; 110+ messages in thread
From: Alex Riesen @ 2006-02-02 9:12 UTC (permalink / raw
To: Linus Torvalds
Cc: Martin Langhoff, Ray Lehtiniemi, Radoslaw Szkodzinski, Keith Packard, Junio C Hamano, cworth, Git Mailing List
On 2/1/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > $ time git update-index --refresh
> >
> > real 0m21.500s
> > user 0m0.358s
> > sys 0m1.406s
> >
> > WinNT, NTFS, 13k files, hot cache.
>
> That's 25% less files than the Linux kernel, and I can do that operation
> in 0m0.062s (0.012s user, 0.048s system).

Correction: it's 18k files, which is almost the same as 2.6.13-rc6.
But these files have *very* long names (the project is poisoned by
classical C++ education and breaks Windows' 255-character limit on
filename length from time to time). Refreshing the index in 2.6.13 is
actually consistently faster:

$ cd src/linux-2.6.13-rc6
$ time git update-index --refresh

real 0m1.344s
user 0m0.358s
sys 0m0.984s

> So WinNT/cygwin is about 2.5 _orders_of_magnitude_ slower here, or 340
> times slower.
>
> Now, I'm tempted to say that NT is a piece of sh*t, but the fact is, your
> CPU-times seem to indicate that most of it is IO (and the "real" cost is
> just 1.7 seconds, much of which is system time, which in turn itself is
> probably due to the IO costs too - so even that isn't comparable with
> the ).
>
> Which may mean that you simply don't have enough memory to cache the whole
> thing. Which may be NT sucking, of course ("we don't like to use more than
> 10% of memory for caches"), but it might also be a tunable (which is sucky
> in itself, of course), but finally, it might just be that you just don't
> have a ton of memory. I've got 2GB in my machines, although 1GB is plenty
> to cache the kernel.

I have 2Gb, the "System Cache" is around 1.5Gb, and this is PIV 3.2GHz.
There seem to be no tunables for any kind of system stuff (savin' on
support costs, do they?). You'd be very hard-pressed not to say that
Windows is a piece of sh*t.

The "benchmark", several times in a row:

$ time git update-index --refresh
real 0m1.766s
user 0m0.498s
sys 0m1.203s
$ time git update-index --refresh
real 0m1.766s
user 0m0.358s
sys 0m1.390s
$ time git update-index --refresh
real 0m1.781s
user 0m0.420s
sys 0m1.311s
$ time git update-index --refresh
real 0m1.875s
user 0m0.374s
sys 0m1.343s
$ time git update-index --refresh
real 0m1.766s
user 0m0.326s
sys 0m1.375s

It is always almost the same time. I don't think it's IO, looks more like
cache accesses. It is just that bad in this cygwin+win2k combination.
Besides, I don't trust "time <command>" on Windows much: it returned
sys time 0 for git-update-index in a directory which was read before.
Yes, there was disk activity, I can hear it real good with that Barracuda.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-28 21:08 ` [Census] So who uses git? Junio C Hamano 2006-01-29 2:14 ` Morten Welinder 2006-01-29 10:09 ` Keith Packard @ 2006-01-29 18:37 ` Dave Jones 2006-01-29 20:17 ` Daniel Barkalow 2006-01-30 18:58 ` Carl Baldwin 3 siblings, 1 reply; 110+ messages in thread From: Dave Jones @ 2006-01-29 18:37 UTC (permalink / raw To: Junio C Hamano Cc: Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List On Sat, Jan 28, 2006 at 01:08:54PM -0800, Junio C Hamano wrote: > Can I hear experiences from other big projects that tried to use > git [*1*]? I suspect there are many that have tried, and I > would not be surprised at all if git did not work out well for > them. For projects that already run on a (free) SCM, I would be > very surprised if the developers find the current git 10 times > better than the SCM they have been using (probably with an > exception of CVS), unless they have very specific need, such as > parallel development of distributed nature like the Linux > kernel. I've found switching from cvs->git even for small projects has made me more productive. In part because it's got me away from the 'check in to a centralised server like sourceforge' mentality, without the need to set up a local cvs server of my own. Adding changesets to a small project like x86info, now takes seconds, whereas it used to take minutes of thumb-twiddling whilst I waited for sf.net to do its thing. The ability to check in changesets locally whilst I'm travelling, and then push them when I have network connectivity again is also a massive productivity win over cvs. There's also another git usage that I doubt I'm alone in doing. I regularly use git to import cvs trees from sourceforge etc for random projects, because I now find browsing history of projects with tools like gitk much nicer than any cvs tool I've used. (cvs annotate is the only thing I really miss). 
What would be really cool, would be a web page pointing to public conversions of various projects cvs trees, so that everyone doesn't have to keep hammering various repos to do the conversions themselves. (Sort of a pseudo bkbits.net). Dave ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-29 18:37 ` Dave Jones @ 2006-01-29 20:17 ` Daniel Barkalow 2006-01-29 20:29 ` Martin Langhoff 2006-01-30 15:23 ` Mike McCormack 0 siblings, 2 replies; 110+ messages in thread From: Daniel Barkalow @ 2006-01-29 20:17 UTC (permalink / raw To: Dave Jones Cc: Junio C Hamano, Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List On Sun, 29 Jan 2006, Dave Jones wrote: > On Sat, Jan 28, 2006 at 01:08:54PM -0800, Junio C Hamano wrote: > > Can I hear experiences from other big projects that tried to use > > git [*1*]? I suspect there are many that have tried, and I > > would not be surprised at all if git did not work out well for > > them. For projects that already run on a (free) SCM, I would be > > very surprised if the developers find the current git 10 times > > better than the SCM they have been using (probably with an > > exception of CVS), unless they have very specific need, such as > > parallel development of distributed nature like the Linux > > kernel. > > I've found switching from cvs->git even for small projects has > made me more productive. In part because it's got me away from > the 'check in to a centralised server like sourceforge' mentality, > without the need to set up a local cvs server of my own. > Adding changesets to a small project like x86info, now takes > seconds, whereas it used to take minutes of thumb-twiddling whilst > I waited for sf.net to do its thing. The ability to check in > changesets locally whilst I'm travelling, and then push them when > I have network connectivity again is also a massive productivity > win over cvs. > > There's also another git usage that I doubt I'm alone in doing. > I regularly use git to import cvs trees from sourceforge etc for > random projects, because I now find browsing history of projects > with tools like gitk much nicer than any cvs tool I've used. > (cvs annotate is the only thing I really miss). 
I think this is the real driving factor for git adoption: it doesn't have to be 10x better for people to use it, because individuals can use it for interacting with CVS projects without causing anybody else any pain. It doesn't just enable distributed development, it enables a distributed choice of SCM, which means a much lower activation energy threshold. I think we'll see a lot more adoption when we have a CVS daemon interface (so projects can stop having a CVS repository, and support both sorts of users with a git repository and have better metadata), and also if someone sets up a place for putting git imports of CVS projects, so people will know that other people are using git. -Daniel *This .sig left intentionally blank* ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-29 20:17 ` Daniel Barkalow @ 2006-01-29 20:29 ` Martin Langhoff 2006-01-30 15:23 ` Mike McCormack 1 sibling, 0 replies; 110+ messages in thread From: Martin Langhoff @ 2006-01-29 20:29 UTC (permalink / raw To: Daniel Barkalow Cc: Dave Jones, Junio C Hamano, Keith Packard, Linus Torvalds, Git Mailing List On 1/30/06, Daniel Barkalow <barkalow@iabervon.org> wrote: > > There's also another git usage that I doubt I'm alone in doing. > > I regularly use git to import cvs trees from sourceforge etc for > > random projects, because I now find browsing history of projects > > with tools like gitk much nicer than any cvs tool I've used. > > (cvs annotate is the only thing I really miss). > > I think this is the real driving factor for git adoption: it doesn't have > to be 10x better for people to use it, because individuals can use it for > interacting with CVS projects without causing anybody else any pain. IMHO, this is a killer feature of GIT. From a CVS/SVN user point of view, it has vendor branches done right. At work, we do that with Moodle, Elgg, EPrints and GForge. And the list is growing. That's why I'm working on the toolchain to make interop with CVS smooth so I can land patches in upstream projects where I have cvs access. cheers, m ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-29 20:17 ` Daniel Barkalow 2006-01-29 20:29 ` Martin Langhoff @ 2006-01-30 15:23 ` Mike McCormack 1 sibling, 0 replies; 110+ messages in thread From: Mike McCormack @ 2006-01-30 15:23 UTC (permalink / raw To: Daniel Barkalow Cc: Dave Jones, Junio C Hamano, Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List Daniel Barkalow wrote: > I think we'll see a lot more adoption when we have a CVS daemon interface > (so projects can stop having a CVS repository, and support both sorts of > users with a git repository and have better metadata), and also if someone > sets up a place for putting git imports of CVS projects, so people will > know that other people are using git. The Wine project is using a GIT repository which is mirrored into CVS. Alexandre wrote scripts to mirror GIT commits into CVS, so developers can use whichever they're more comfortable with, and the CVS repository remains up to date. We've found that patch submitters using GIT tend to send multiple patches per day, and that those using CVS tend to send a patch or two occasionally or just keep up to date with the source. Mike ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-28 21:08 ` [Census] So who uses git? Junio C Hamano
` (2 preceding siblings ...)
2006-01-29 18:37 ` Dave Jones
@ 2006-01-30 18:58 ` Carl Baldwin
2006-01-31 10:27 ` Johannes Schindelin
2006-02-01 19:32 ` H. Peter Anvin
3 siblings, 2 replies; 110+ messages in thread
From: Carl Baldwin @ 2006-01-30 18:58 UTC (permalink / raw
To: Junio C Hamano
Cc: Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List
Junio,
You don't seem to give git enough credit.
I am a hardware engineer with many softwareish responsibilities. One of
those is to keep up to date with the many commercial and free SCM type
tools that are available. Git has become my SCM tool of choice for many
reasons.
- Anyone can install and fire it up without license/contract hassles.
- The infrastructure barriers to getting a project started with git are
  about as low as they can be.
- Geographically distributed teams even inside a corporation are
  becoming more common. Git's repository design meets this need
  perfectly.
- The repository is also designed to be inherently safe from data-loss
  and corruption even in the face of concurrent writes due to each
  object's immutable nature.
- While on the subject of the repository: good job keeping it simple. I
  was able to learn pretty much all there is to know from a technical
  stand-point about the objects and refs directories in an afternoon.
  It follows a principle I always work toward myself. "Make it simple
  enough that there are obviously no deficiencies rather than making it
  complicated so that there are no obvious deficiencies."
- In my opinion git is flexible enough to support just about any
  development/build/release flow that one can think of. Most of the
  free tools (including subversion and arch) make branching and merging
  --- on which most of these flows rely --- way too heavy-weight. Git
  shows how light-weight it can be.
  Not only can parallel development happen easily between
  users/repositories but parallel development is trivial even within
  the same repository. I think your 'pu' system illustrates how
  powerful it can be. I myself have had up to four concurrent branches
  where I implemented four different features in parallel in the same
  repository easily switching between them. It was almost too easy to
  bring them together using merge as each one finished.
  I was just reading through an article on how to choose an SCM last
  week and I kept thinking how git could be used to meet almost every
  one (if not all) of the needs discussed.
- Git supports enough network protocols to make it immediately useful
  in about any situation with firewalls and such. This is where it
  leaves monotone behind.
The biggest hurdle that I've seen in adopting git is training the users.
I myself took to it like a duck to water but I've found that even some
of my brightest colleagues have trouble wrapping their heads around it.
Currently, I'm trying to look at what parts they are having the most
trouble with.
In general, I think it is grasping the reason for the index file and how
git commands like git-commit and git-diff interact with it.
Even so, I've always appreciated those tools that may have a steeper
learning curve but that pay dividends over time. Also, I should mention
that this learning curve has been flattening over time as git has
developed and obtained more porcelainish commands.
Carl
On Sat, Jan 28, 2006 at 01:08:54PM -0800, Junio C Hamano wrote:
> Keith Packard <keithp@keithp.com> writes:
>
> Wow....... You are switching Cairo and X.org from CVS to git?
>
> It could be that anything is better than CVS these days, but I
> have to admit that my jaw dropped after reading this, primarily
> because I've have never touched anything as big as X.
>
> Awestruck, dumbstruck,... Xstruck. Yeah, I know I should have
> more faith in git. Earlier I heard Wine folks are running git
> in parallel with CVS as their dual primary SCM now, and of
> course git is the primary SCM for the Linux kernel project.
--
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Carl Baldwin                        RADCAD (R&D CAD)
Hewlett Packard Company
MS 88                               work: 970 898-1523
3404 E. Harmony Rd.                 work: Carl.N.Baldwin@hp.com
Fort Collins, CO 80525              home: Carl@ecBaldwin.net
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-30 18:58 ` Carl Baldwin
@ 2006-01-31 10:27 ` Johannes Schindelin
2006-01-31 15:24 ` Carl Baldwin
` (2 more replies)
2006-02-01 19:32 ` H. Peter Anvin
1 sibling, 3 replies; 110+ messages in thread
From: Johannes Schindelin @ 2006-01-31 10:27 UTC (permalink / raw
To: Carl Baldwin
Cc: Junio C Hamano, Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List
Hi,
On Mon, 30 Jan 2006, Carl Baldwin wrote:
> In general, I think it is grasping the reason for the index file and how
> git commands like git-commit and git-diff interact with it.
IMHO this is the one big showstopper. I had problems explaining the
concept myself.
For example, I had a hard time explaining to a friend why a git-add'ed
file is committed when saying "git commit some_other_file", but not
another (modified) file. Very unintuitive.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-31 10:27 ` Johannes Schindelin @ 2006-01-31 15:24 ` Carl Baldwin 2006-01-31 15:31 ` Johannes Schindelin 2006-01-31 17:30 ` Linus Torvalds 2006-01-31 23:16 ` Daniel Barkalow 2 siblings, 1 reply; 110+ messages in thread From: Carl Baldwin @ 2006-01-31 15:24 UTC (permalink / raw To: Johannes Schindelin Cc: Junio C Hamano, Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List Its difficult to explain because it breaks away from the precedent set by other SCMs. I wouldn't call it a show-stopper for this reason. In fact, some who have wrapped their heads around the concept might call it a valuable feature. I, myself, have found it a handy thing in certain circumstances. In other circumstances I simply bypass it by adding -a to the command-line. This doesn't fit my definition of a show-stopper. Carl On Tue, Jan 31, 2006 at 11:27:34AM +0100, Johannes Schindelin wrote: > Hi, > > On Mon, 30 Jan 2006, Carl Baldwin wrote: > > > In general, I think it is grasping the reason for the index file and how > > git commands like git-commit and git-diff interact with it. > > IMHO this is the one big showstopper. I had problems explaining the > concept myself. > > For example, I had a hard time explaining to a friend why a git-add'ed > file is committed when saying "git commit some_other_file", but not > another (modified) file. Very unintuitive. > > Ciao, > Dscho > > -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Carl Baldwin RADCAD (R&D CAD) Hewlett Packard Company MS 88 work: 970 898-1523 3404 E. Harmony Rd. work: Carl.N.Baldwin@hp.com Fort Collins, CO 80525 home: Carl@ecBaldwin.net - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 15:24 ` Carl Baldwin
@ 2006-01-31 15:31 ` Johannes Schindelin
0 siblings, 0 replies; 110+ messages in thread
From: Johannes Schindelin @ 2006-01-31 15:31 UTC (permalink / raw
To: Carl Baldwin
Cc: Junio C Hamano, Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List
Hi,
On Tue, 31 Jan 2006, Carl Baldwin wrote:
> Its difficult to explain because it breaks away from the precedent set
> by other SCMs. I wouldn't call it a show-stopper for this reason.
I don't. The strange concept from the user's perspective is that
    git commit -m "some message" file-a.txt
can commit file-b.txt also.
> [...] In other circumstances I simply bypass it by adding -a to the
> command-line.
This is a different thing.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-31 10:27 ` Johannes Schindelin 2006-01-31 15:24 ` Carl Baldwin @ 2006-01-31 17:30 ` Linus Torvalds 2006-01-31 18:12 ` J. Bruce Fields ` (2 more replies) 2006-01-31 23:16 ` Daniel Barkalow 2 siblings, 3 replies; 110+ messages in thread From: Linus Torvalds @ 2006-01-31 17:30 UTC (permalink / raw To: Johannes Schindelin Cc: Carl Baldwin, Junio C Hamano, Keith Packard, Martin Langhoff, Git Mailing List On Tue, 31 Jan 2006, Johannes Schindelin wrote: > > On Mon, 30 Jan 2006, Carl Baldwin wrote: > > > In general, I think it is grasping the reason for the index file and how > > git commands like git-commit and git-diff interact with it. > > IMHO this is the one big showstopper. I had problems explaining the > concept myself. > > For example, I had a hard time explaining to a friend why a git-add'ed > file is committed when saying "git commit some_other_file", but not > another (modified) file. Very unintuitive. I really think you should explain it one of two ways: - ignore it. Never _ever_ use git-update-index directly, and don't tell people about use individual filenames to git-commit. Maybe even add "-a" by default to the git-commit flags as a special installation addition. - talk about the index, and revel in it as a way to explain the staging area. This is what the old tutorial.txt did before it got simplified. The "ignore the index" approach is the simple one to explain. It's strictly less powerful, but hey, what else is new? Linus ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 17:30 ` Linus Torvalds
@ 2006-01-31 18:12 ` J. Bruce Fields
2006-01-31 19:33 ` Junio C Hamano
2006-01-31 19:01 ` Keith Packard
2006-02-01 19:34 ` H. Peter Anvin
2 siblings, 1 reply; 110+ messages in thread
From: J. Bruce Fields @ 2006-01-31 18:12 UTC (permalink / raw
To: Linus Torvalds
Cc: Johannes Schindelin, Carl Baldwin, Junio C Hamano, Keith Packard, Martin Langhoff, Git Mailing List
On Tue, Jan 31, 2006 at 09:30:48AM -0800, Linus Torvalds wrote:
> I really think you should explain it one of two ways:
>
> - ignore it. Never _ever_ use git-update-index directly, and don't tell
> people about use individual filenames to git-commit. Maybe even add
> "-a" by default to the git-commit flags as a special installation
> addition.
>
> - talk about the index, and revel in it as a way to explain the staging
> area. This is what the old tutorial.txt did before it got simplified.
>
> The "ignore the index" approach is the simple one to explain. It's
> strictly less powerful, but hey, what else is new?
Yeah, I do wonder what's likely to be the best approach for most users.
My goal with the new tutorial was to get a reader doing something fun
and useful as quickly as possible. So it just refers elsewhere for any
discussion of the index file or SHA1 names. But probably everyone needs
to pick up that stuff eventually anyway, and maybe it's better to get to
it a little sooner, I dunno.
Besides the git-add/git-commit thing, the other thing that caught me by
surprise was the behaviour of git reset. I expected there to be an
"inverse" to git commit -a, meaning that
1) the sequence
    git reset HEAD^
    git commit -a
would be a no-op, in the sense that the new commit would get the
same changes as the old one, and
2) the sequence
    git commit -a
    git reset HEAD^
would be a no-op, in the sense that "git diff" would report the
same diff before and after.
But there isn't, and explaining how --soft and --mixed actually work
requires referring to the index file. Is that something that can be
fixed in the tools or does the user fundamentally need to know about the
index file to do this kind of stuff?
--b.
^ permalink raw reply [flat|nested] 110+ messages in thread
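The difference between --soft and --mixed raised above can be seen in a scratch repository. What follows is only a sketch, assuming a present-day git (whose defaults are not necessarily those of early 2006); the file names, identity settings and temporary directory are made up for illustration.

```shell
set -e
repo=$(mktemp -d)                 # throw-away repository (illustrative)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"

printf 'one\n' > a.txt
git add a.txt
git commit -q -m "first"

# Second commit: modifies a.txt and adds a brand-new file.
printf 'two\n' >> a.txt
printf 'new\n' > b.txt
git add a.txt b.txt
git commit -q -m "second"
second=$(git rev-parse HEAD)

# --soft: only HEAD moves; the index still holds the second commit's content.
git reset -q --soft HEAD^
soft_status=$(git status --porcelain)

# Restore HEAD, then try --mixed: HEAD and the index move, files stay put.
git reset -q --soft "$second"
git reset -q --mixed HEAD^
mixed_status=$(git status --porcelain)

printf 'after --soft:\n%s\n' "$soft_status"
printf 'after --mixed:\n%s\n' "$mixed_status"
```

After --soft both paths are still staged, so a plain "git commit" would recreate the same tree. After --mixed the new file b.txt is untracked, so "git commit -a" would leave it out; that is one place where the index has to be mentioned to explain the result.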
* Re: [Census] So who uses git? 2006-01-31 18:12 ` J. Bruce Fields @ 2006-01-31 19:33 ` Junio C Hamano 2006-01-31 19:44 ` Jon Loeliger 2006-01-31 20:06 ` J. Bruce Fields 0 siblings, 2 replies; 110+ messages in thread From: Junio C Hamano @ 2006-01-31 19:33 UTC (permalink / raw To: J. Bruce Fields Cc: Linus Torvalds, Johannes Schindelin, Carl Baldwin, Keith Packard, Martin Langhoff, Git Mailing List "J. Bruce Fields" <bfields@fieldses.org> writes: > On Tue, Jan 31, 2006 at 09:30:48AM -0800, Linus Torvalds wrote: >> >> The "ignore the index" approach is the simple one to explain. It's >> strictly less powerful, but hey, what else is new? > > Yeah, I do wonder what's likely to be the best approach for most users. > My goal with the new tutorial was to get a reader doing something fun > and useful as quickly as possible. So it just refers elsewhere for any > discussion of the index file or SHA1 names. But probably everyone needs > to pick up that stuff eventually anyway, and maybe it's better to get to > it a little sooner, I dunno. I think many good stuff git offers would not be helpful to the users until index is understood as the third entity, in addition to the usual "committed state" and "working tree state". It might be better to talk about it sooner rather than later. And the tool is geared towards taking advantage of it, so until the user understands that, behaviour of some tools would feel unintuitive. You can have local throw-away modifications while applying patches and merging (I once broke merges by ignoring that it is perfectly valid to have index and working tree files be different and keep working that way. That was a hard lesson). The index file knows what working tree changes are meant to be committed. Another thing I find useful, which cannot be done without index, is to sanity check while developing. 
When "git diff" gives too many diffs, running update-index on paths that I think are more-or-less OK helps to reduce clutter, and I can view only further changes to those paths. In a sense, update-index can be thought of to check in the changes without committing. You can check in number of times, and the cumulative effect is committed later. "reset --mixed" is undoing these uncommitted check-ins. "reset --hard" undoes the last commit. ^ permalink raw reply [flat|nested] 110+ messages in thread
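The "check in without committing" use of update-index described above can be tried in a scratch repository. This is a minimal sketch assuming a present-day git; the file names and identity settings are invented for the example.

```shell
set -e
repo=$(mktemp -d)                 # throw-away repository (illustrative)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "initial"

# Modify both files in the working tree.
printf 'two\n' >> a.txt
printf 'two\n' >> b.txt
before=$(git diff --name-only)        # working tree vs index: both paths

# "Check in" a.txt to the index without committing.
git update-index a.txt
after=$(git diff --name-only)         # only b.txt is left in the plain diff

# A mixed reset (the default mode) undoes the uncommitted check-in.
git reset -q
after_reset=$(git diff --name-only)   # both paths show up again

printf 'before:\n%s\nafter update-index:\n%s\nafter reset:\n%s\n' \
    "$before" "$after" "$after_reset"
```

The working tree files are never touched here; only the index moves back and forth between the committed state and the working tree state.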
* Re: [Census] So who uses git? 2006-01-31 19:33 ` Junio C Hamano @ 2006-01-31 19:44 ` Jon Loeliger 2006-01-31 19:52 ` Junio C Hamano [not found] ` <7vd5i8w2nc.fsf@assigned-by-dhcp.cox.net> 2006-01-31 20:06 ` J. Bruce Fields 1 sibling, 2 replies; 110+ messages in thread From: Jon Loeliger @ 2006-01-31 19:44 UTC (permalink / raw To: Git List On Tue, 2006-01-31 at 13:33, Junio C Hamano wrote: > "J. Bruce Fields" <bfields@fieldses.org> writes: > I think many good stuff git offers would not be helpful to the > users until index is understood as the third entity, in addition > to the usual "committed state" and "working tree state". It > might be better to talk about it sooner rather than later. And > the tool is geared towards taking advantage of it, so until the > user understands that, behaviour of some tools would feel > unintuitive. Agreed. > You can have local throw-away modifications while applying > patches and merging (I once broke merges by ignoring that it is > perfectly valid to have index and working tree files be > different and keep working that way. That was a hard lesson). > The index file knows what working tree changes are meant to be > committed. Another thing I find useful, which cannot be done > without index, is to sanity check while developing. When "git > diff" gives too many diffs, running update-index on paths that I > think are more-or-less OK helps to reduce clutter, and I can > view only further changes to those paths. And right there is where people get caught by surprise. What "they" then want to do is actually pick certain files to commit. And when they do, they get caught off guard by the _additional_ files. I have done this style of "update-index on more-or-less OK files in order to clear up the diff. And it is also in that time frame that I start feeling that certain changes belong to "one commit" or another. The result is, I want to then pick the parts that get committed together. 
But _really_ being certain exactly which files, and _only_ those files, will really be committed is tough. jdl ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 19:44 ` Jon Loeliger
@ 2006-01-31 19:52 ` Junio C Hamano
[not found] ` <7vd5i8w2nc.fsf@assigned-by-dhcp.cox.net>
1 sibling, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-01-31 19:52 UTC (permalink / raw
To: Jon Loeliger; +Cc: git
Jon Loeliger <jdl@freescale.com> writes:
> I have done this style of "update-index on more-or-less OK
> files in order to clear up the diff. And it is also in that
> time frame that I start feeling that certain changes belong
> to "one commit" or another. The result is, I want to then
> pick the parts that get committed together. But _really_
> being certain exactly which files, and _only_ those files,
> will really be committed is tough.
    $ git diff --cached
would help. If you are _only_ committing either all changes or
no change per path, 'git diff --cached --name-status' would be
sufficient.
^ permalink raw reply [flat|nested] 110+ messages in thread
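As a rough illustration of the suggestion above, the following sketch lists what a plain "git commit" would record. It assumes a present-day git; the file names and identity settings are hypothetical.

```shell
set -e
repo=$(mktemp -d)                 # throw-away repository (illustrative)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "initial"

# Three changes in the working tree, only two of them staged.
printf 'two\n' >> a.txt
printf 'two\n' >> b.txt
printf 'new\n' > c.txt
git add a.txt c.txt

# Index vs HEAD: exactly the paths the next plain commit would contain.
to_commit=$(git diff --cached --name-status)
printf '%s\n' "$to_commit"
```

Each output line is a status letter, a tab, and a path. The modified but unstaged b.txt does not appear, which answers the "which files, and only those files" question before the commit is made.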
[parent not found: <7vd5i8w2nc.fsf@assigned-by-dhcp.cox.net>]
* Re: [Census] So who uses git? [not found] ` <7vd5i8w2nc.fsf@assigned-by-dhcp.cox.net> @ 2006-01-31 20:56 ` J. Bruce Fields 0 siblings, 0 replies; 110+ messages in thread From: J. Bruce Fields @ 2006-01-31 20:56 UTC (permalink / raw To: Junio C Hamano; +Cc: Jon Loeliger, git On Tue, Jan 31, 2006 at 12:41:59PM -0800, Junio C Hamano wrote: > On the tutorial front, maybe we could start teaching people to > always use "commit -a", and not tell them about update-index nor > "commit paths.." at all. Have them do "hello world", review > changes since the last commit with "git diff", and make commit > with "git commit -a". Next tell them about index, and after > they understand index, finally tell them "commit paths..." is > there merely to reduce typing. Yeah, I think that's approximately what you get right now if you read tutorial.txt followed by core-tutorial.txt, though the two currently may not really work together well as sequels. So I'm inclined to start by revising the two to make them read well as sequels, then maybe moving some of core-tutorial.txt into the earlier tutorial.txt. By the time we're done the two might end up being one document. Or they might still be two, but with the split being more clearly beginning/advanced instead of high-level/low-level. Feedback from people who'd actually worked through the two would obviously be useful. --b. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-31 19:33 ` Junio C Hamano 2006-01-31 19:44 ` Jon Loeliger @ 2006-01-31 20:06 ` J. Bruce Fields 1 sibling, 0 replies; 110+ messages in thread From: J. Bruce Fields @ 2006-01-31 20:06 UTC (permalink / raw To: Junio C Hamano Cc: Linus Torvalds, Johannes Schindelin, Carl Baldwin, Keith Packard, Martin Langhoff, Git Mailing List On Tue, Jan 31, 2006 at 11:33:21AM -0800, Junio C Hamano wrote: > I think many good stuff git offers would not be helpful to the > users until index is understood as the third entity, in addition > to the usual "committed state" and "working tree state". It > might be better to talk about it sooner rather than later. And > the tool is geared towards taking advantage of it, so until the > user understands that, behaviour of some tools would feel > unintuitive. Yeah, makes sense. But I'd like to introduce that while still introducing the higher-level tools earlier on than core-tutorial.txt does. I'll give some thought to how to move things in that direction, maybe this weekend.... --b. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 17:30 ` Linus Torvalds
2006-01-31 18:12 ` J. Bruce Fields
@ 2006-01-31 19:01 ` Keith Packard
2006-01-31 19:21 ` Linus Torvalds
2006-01-31 20:56 ` Sam Ravnborg
2006-02-01 19:34 ` H. Peter Anvin
2 siblings, 2 replies; 110+ messages in thread
From: Keith Packard @ 2006-01-31 19:01 UTC (permalink / raw
To: Linus Torvalds
Cc: keithp, Johannes Schindelin, Carl Baldwin, Junio C Hamano, Martin Langhoff, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 634 bytes --]
On Tue, 2006-01-31 at 09:30 -0800, Linus Torvalds wrote:
> - ignore it. Never _ever_ use git-update-index directly, and don't tell
> people about use individual filenames to git-commit. Maybe even add
> "-a" by default to the git-commit flags as a special installation
> addition.
As a newly initiated user, this would have been a more gentle
introduction to the system. But, it would be hard to make it entirely
invisible given the current interfaces. I'm not sure if obscuring the
presence of the index is a great plan; it's already hard enough to
figure out how it works.
--
keith.packard@intel.com
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-31 19:01 ` Keith Packard @ 2006-01-31 19:21 ` Linus Torvalds 2006-01-31 22:55 ` Joel Becker 2006-01-31 20:56 ` Sam Ravnborg 1 sibling, 1 reply; 110+ messages in thread From: Linus Torvalds @ 2006-01-31 19:21 UTC (permalink / raw To: Keith Packard Cc: Johannes Schindelin, Carl Baldwin, Junio C Hamano, Martin Langhoff, Git Mailing List On Tue, 31 Jan 2006, Keith Packard wrote: > On Tue, 2006-01-31 at 09:30 -0800, Linus Torvalds wrote: > > > - ignore it. Never _ever_ use git-update-index directly, and don't tell > > people about use individual filenames to git-commit. Maybe even add > > "-a" by default to the git-commit flags as a special installation > > addition. > > As a newly initiated user, this would have been a more gentle > introduction to the system. But, it would be hard to make it entirely > invisible given the current interfaces. I'm not sure if obscuring the > presense of the index is a great plan; it's already hard enough to > figure out how it works. Now, I do agree. I don't actually like hiding the index too much. Understanding the index is _invaluable_ whenever you're doing a merge with conflicts, and understanding what tools are available to you to resolve those conflicts. The index is also obviously very important when you do a partial commit, and it's something I do end up doing quite often. Again, maybe that's not something that a new git user should be encouraged to ever do, but it's a huge convenience feature for power-users. Understanding the index also allows people to understand certain performance-characteristics of git, and explains how "git add" (and remove, if we had one) actually works independently of the commit. So I'm actually of the "revel in the index" camp (as could probably be guessed by the original tutorial). 
My personal suggestion would be to introduce git "gently" by ignoring it, but by the time a person actually _works_ on a project (as opposed to just going through a tutorial or following another persons project), he/she should probably have been introduced to the index in order to understand what happens and to use its power. (In particular, the difference between "git diff" and "git diff HEAD" is an important one to understand eventually). Linus ^ permalink raw reply [flat|nested] 110+ messages in thread
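The difference between "git diff" and "git diff HEAD" mentioned above comes down to which pair of states is compared. A small sketch, assuming a present-day git and an invented one-file repository:

```shell
set -e
repo=$(mktemp -d)                 # throw-away repository (illustrative)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"

printf 'one\n' > a.txt
git add a.txt
git commit -q -m "initial"

printf 'two\n' >> a.txt
git add a.txt                     # first change goes into the index
printf 'three\n' >> a.txt         # second change stays in the working tree

# --numstat prints "<added> <deleted> <path>"; keep only the added count.
unstaged=$(git diff --numstat | cut -f1)          # working tree vs index
staged=$(git diff --cached --numstat | cut -f1)   # index vs HEAD
total=$(git diff HEAD --numstat | cut -f1)        # working tree vs HEAD

echo "git diff: $unstaged  git diff --cached: $staged  git diff HEAD: $total"
```

Plain "git diff" reports only the line added after staging, while "git diff HEAD" reports both added lines.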
* Re: [Census] So who uses git? 2006-01-31 19:21 ` Linus Torvalds @ 2006-01-31 22:55 ` Joel Becker 2006-02-01 14:43 ` Johannes Schindelin 0 siblings, 1 reply; 110+ messages in thread From: Joel Becker @ 2006-01-31 22:55 UTC (permalink / raw To: Linus Torvalds Cc: Keith Packard, Johannes Schindelin, Carl Baldwin, Junio C Hamano, Martin Langhoff, Git Mailing List On Tue, Jan 31, 2006 at 11:21:52AM -0800, Linus Torvalds wrote: > Now, I do agree. I don't actually like hiding the index too much. > Understanding the index is _invaluable_ whenever you're doing a merge with > conflicts, and understanding what tools are available to you to resolve > those conflicts. This is precisely the experience I've had explaining GIT to folks moving to it. The simplest workflow (clone; hack one file, commit one file) is so similar to CVS/Subversion/Anything that it's immediately understood. But when pull, push, merge, and any non-linear history are discussed, I have to describe the index and the commit/tree layout. Once I do, they get it. > So I'm actually of the "revel in the index" camp (as could probably be > guessed by the original tutorial). I'm going to second this, from a real-world "explain it to others" standpoint. Joel -- "Every day I get up and look through the Forbes list of the richest people in America. If I'm not there, I go to work." - Robert Orben Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-31 22:55 ` Joel Becker @ 2006-02-01 14:43 ` Johannes Schindelin 0 siblings, 0 replies; 110+ messages in thread From: Johannes Schindelin @ 2006-02-01 14:43 UTC (permalink / raw To: Joel Becker Cc: Linus Torvalds, Keith Packard, Carl Baldwin, Junio C Hamano, Martin Langhoff, Git Mailing List Hi, On Tue, 31 Jan 2006, Joel Becker wrote: > On Tue, Jan 31, 2006 at 11:21:52AM -0800, Linus Torvalds wrote: > > Now, I do agree. I don't actually like hiding the index too much. > > Understanding the index is _invaluable_ whenever you're doing a merge with > > conflicts, and understanding what tools are available to you to resolve > > those conflicts. > > This is precisely the experience I've had explaining GIT to > folks moving to it. The simplest workflow (clone; hack one file, commit > one file) is so similar to CVS/Subversion/Anything that it's immediately > understood. But when pull, push, merge, and any non-linear history are > discussed, I have to describe the index and the commit/tree layout. > Once I do, they get it. > > > So I'm actually of the "revel in the index" camp (as could probably be > > guessed by the original tutorial). > > I'm going to second this, from a real-world "explain it to > others" standpoint. How about talking about the index a bit at the end of tutorial.txt like this: -- snip -- For a number of (mostly technical) reasons, "git diff" does not show the changes of the current working directory with respect to the latest commit, but rather to an intermediate stage: the "index". Think of the index as a staging area just before committing: the commit object (and the tree and blob objects referenced from it) are assembled there. Also, when you checkout, the index is used to disassemble the commit object just before writing the corresponding files and directories. -- snap -- May this be worth the work? Ciao, Dscho ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 19:01 ` Keith Packard
2006-01-31 19:21 ` Linus Torvalds
@ 2006-01-31 20:56 ` Sam Ravnborg
2006-01-31 22:21 ` Junio C Hamano
1 sibling, 1 reply; 110+ messages in thread
From: Sam Ravnborg @ 2006-01-31 20:56 UTC (permalink / raw
To: Keith Packard
Cc: Linus Torvalds, keithp, Johannes Schindelin, Carl Baldwin, Junio C Hamano, Martin Langhoff, Git Mailing List
> As a newly initiated user, this would have been a more gentle
> introduction to the system. But, it would be hard to make it entirely
> invisible given the current interfaces. I'm not sure if obscuring the
> presense of the index is a great plan; it's already hard enough to
> figure out how it works.
I have found myself using a mixture of cogito and git commands lately.
Part of it is that my fingers type something like:
    rm `git ls-files -m`
    cg-restore
and I have not convinced them about
    git reset --hard
But the primary thing is cg-commit.
It gives you a list of modified files which can be edited, and
it has saved me a couple of times from committing too much.
And I get vi fired up, so no need to fiddle with command line arguments.
Sam
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-01-31 20:56 ` Sam Ravnborg @ 2006-01-31 22:21 ` Junio C Hamano 0 siblings, 0 replies; 110+ messages in thread From: Junio C Hamano @ 2006-01-31 22:21 UTC (permalink / raw To: Sam Ravnborg; +Cc: git "Sam Ravnborg" <sam@ravnborg.org> writes: > But the primary thing is cg-commit > I give you a list of files modified which can be edited and > it have saved me a couple of times commiting to much. > And I get vi fired up so no need to fiddle with command line argumetns. [this is what I sent in a separate message but I goofed up the destination headers and the message did not appear on the list, so I am reprinting.] I have always felt "git commit paths..." was a mistake; it encourages partial commits by individual developers. By "partial commit", I mean a commit that does not exactly match the state of the working tree when the commit is made. There are two kinds of "partial commits". Good ones and bad ones. Being able to make partial commits is handy for people whose primary role is to integrate many changes from trusted developers rather than testing each and every commit as a whole (read: Linus and subsystem maintainers). Integrators' job may include testing what have been merged as a whole by a compile and reboot cycle as the final "wrap-up" step, but the most important role they play is to sanity check the changes from architectural perspective. For that workflow to work effectively, however, the changes fed by individual developers to the integrators have to be clean and well tested. A partial commit records something that never existed in any working tree as a whole, so by definition it is an untested change. You would risk "sorry I forgot to commit the changes to these paths but without them it does not even compile", and end up wasting integrators' time. The integrators make commits out of their working trees using git-merge and git-apply to record changes made by others after reviewing them. 
These commands ignore unconflicting local changes (but notice
conflicting ones to operate correctly), and allow them to make
partial commits. This is a good thing; otherwise they would
have to reset their own changes in their working tree, only to
do merges and to accept patches.
However, people playing the integrator role rarely have reason
to use "git commit paths..." while merging from others to make
such a partial commit. Only after they resolve conflicts by
hand, perhaps. But that happens far less often than careless
individual developers making partial commits of the bad kind
using the same "git commit paths..." command.
This is the reason why I feel "git commit paths..." is a bad
feature. It helps to make bad partial commits, without having
much to do with making good partial commits. Many SCMs may have
the ability to do "commit paths...", but that does not change
the fact that it encourages carelessness for individual
developers, which is especially bad in a distributed development
workflow like the Linux kernel style [*1*].
But that was not my change ;-).
[Footnote]
*1* It could be argued that being able to do partial commit is a
good thing in other SCM systems where there is no equivalent to
our "index" file. It is one way for the developer to snapshot
their work-in-progress state where they might later come back to
if the approach they are currently pursuing does not pan out.
But for that, we have index file we can "check into" without
committing.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 17:30 ` Linus Torvalds
2006-01-31 18:12 ` J. Bruce Fields
2006-01-31 19:01 ` Keith Packard
@ 2006-02-01 19:34 ` H. Peter Anvin
2 siblings, 0 replies; 110+ messages in thread
From: H. Peter Anvin @ 2006-02-01 19:34 UTC (permalink / raw
To: Linus Torvalds
Cc: Johannes Schindelin, Carl Baldwin, Junio C Hamano, Keith Packard,
Martin Langhoff, Git Mailing List

Linus Torvalds wrote:
>>
>>For example, I had a hard time explaining to a friend why a git-add'ed
>>file is committed when saying "git commit some_other_file", but not
>>another (modified) file. Very unintuitive.
>
> I really think you should explain it one of two ways:
>
>  - ignore it. Never _ever_ use git-update-index directly, and don't tell
>    people about use individual filenames to git-commit. Maybe even add
>    "-a" by default to the git-commit flags as a special installation
>    addition.
>
>  - talk about the index, and revel in it as a way to explain the staging
>    area. This is what the old tutorial.txt did before it got simplified.
>
> The "ignore the index" approach is the simple one to explain. It's
> strictly less powerful, but hey, what else is new?
>

I think both of these are probably the wrong answer, and it's pretty
much a matter of the git model violating the principle of least
surprise. Perhaps added (or removed?) files need to be handled in a
different way than they currently are.

-hpa
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 10:27 ` Johannes Schindelin
2006-01-31 15:24 ` Carl Baldwin
2006-01-31 17:30 ` Linus Torvalds
@ 2006-01-31 23:16 ` Daniel Barkalow
2006-01-31 23:36 ` Petr Baudis
2006-01-31 23:47 ` Junio C Hamano
2 siblings, 2 replies; 110+ messages in thread
From: Daniel Barkalow @ 2006-01-31 23:16 UTC (permalink / raw
To: Johannes Schindelin
Cc: Carl Baldwin, Junio C Hamano, Keith Packard, Martin Langhoff,
Linus Torvalds, Git Mailing List

On Tue, 31 Jan 2006, Johannes Schindelin wrote:
> Hi,
>
> On Mon, 30 Jan 2006, Carl Baldwin wrote:
>
> > In general, I think it is grasping the reason for the index file and how
> > git commands like git-commit and git-diff interact with it.
>
> IMHO this is the one big showstopper. I had problems explaining the
> concept myself.
>
> For example, I had a hard time explaining to a friend why a git-add'ed
> file is committed when saying "git commit some_other_file", but not
> another (modified) file. Very unintuitive.

I sort of suspect that "git commit some_other_file" should really read
HEAD into a temporary index, update "some_other_file" in that (and the
main index), and commit it.

The concept of the index isn't hard (it's the preparation you've made so
far towards a commit), and plain "git commit" makes sense with it; "git
commit -a" also makes sense, since committing all changes is pretty
clear. The surprising thing is that "git commit path ..." means
"everything I've already mentioned, plus path..." not just "path ...",
and it's particularly surprising because people only tend to specify
paths when they've done something they don't want to commit.

-Daniel
*This .sig left intentionally blank*
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 23:16 ` Daniel Barkalow
@ 2006-01-31 23:36 ` Petr Baudis
2006-01-31 23:47 ` Junio C Hamano
1 sibling, 0 replies; 110+ messages in thread
From: Petr Baudis @ 2006-01-31 23:36 UTC (permalink / raw
To: Daniel Barkalow
Cc: Johannes Schindelin, Carl Baldwin, Junio C Hamano, Keith Packard,
Martin Langhoff, Linus Torvalds, Git Mailing List

Dear diary, on Wed, Feb 01, 2006 at 12:16:26AM CET, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> said that...
> On Tue, 31 Jan 2006, Johannes Schindelin wrote:
>
> > Hi,
> >
> > On Mon, 30 Jan 2006, Carl Baldwin wrote:
> >
> > > In general, I think it is grasping the reason for the index file and how
> > > git commands like git-commit and git-diff interact with it.
> >
> > IMHO this is the one big showstopper. I had problems explaining the
> > concept myself.
> >
> > For example, I had a hard time explaining to a friend why a git-add'ed
> > file is committed when saying "git commit some_other_file", but not
> > another (modified) file. Very unintuitive.
>
> I sort of suspect that "git commit some_other_file" should really read
> HEAD into a temporary index, update "some_other_file" in that (and the
> main index), and commit it.

FWIW, this is also what cg-commit does.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe. -- Douglas Adams
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 23:16 ` Daniel Barkalow
2006-01-31 23:36 ` Petr Baudis
@ 2006-01-31 23:47 ` Junio C Hamano
2006-02-01 0:38 ` Linus Torvalds
1 sibling, 1 reply; 110+ messages in thread
From: Junio C Hamano @ 2006-01-31 23:47 UTC (permalink / raw
To: Daniel Barkalow
Cc: Johannes Schindelin, Carl Baldwin, Keith Packard, Martin Langhoff,
Linus Torvalds, Git Mailing List

Daniel Barkalow <barkalow@iabervon.org> writes:
> I sort of suspect that "git commit some_other_file" should really read
> HEAD into a temporary index, update "some_other_file" in that (and the
> main index), and commit it.
> ...
> The surprising thing is that "git commit path ..." means
> "everything I've already mentioned, plus path..." not just
> "path ...", and it's particularly surprising because people
> only tend to specify paths when they've done something they
> don't want to commit.

Interesting idea, and a good point.

Not that I particularly would like to encourage people to make
partial commits by making it easier, but as long as we allow our
users to say "commit path...", your proposal would reduce the
confusion.

I wonder which is faster, to check if index differs from HEAD
and do the temporary index only when they differ, or always use
a temporary without checking? The former needs one diff-index
--cached, zero or one read-tree, one write-tree and one
commit-tree. The latter always needs one read-tree, one
write-tree and one commit-tree.

Wait. We already do diff-index --cached during git-commit
anyway (it is in git-status). Maybe with a bit of code
restructuring we can do the temporary index part optional.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-01-31 23:47 ` Junio C Hamano
@ 2006-02-01 0:38 ` Linus Torvalds
2006-02-01 0:52 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 110+ messages in thread
From: Linus Torvalds @ 2006-02-01 0:38 UTC (permalink / raw
To: Junio C Hamano
Cc: Daniel Barkalow, Johannes Schindelin, Carl Baldwin, Keith Packard,
Martin Langhoff, Git Mailing List

On Tue, 31 Jan 2006, Junio C Hamano wrote:
> Daniel Barkalow <barkalow@iabervon.org> writes:
>
> > I sort of suspect that "git commit some_other_file" should really read
> > HEAD into a temporary index, update "some_other_file" in that (and the
> > main index), and commit it.
> > ...
> > The surprising thing is that "git commit path ..." means
> > "everything I've already mentioned, plus path..." not just
> > "path ...", and it's particularly surprising because people
> > only tend to specify paths when they've done something they
> > don't want to commit.
>
> Interesting idea, and a good point.

One thing to be careful about is merges.

This actually happens to me:

	git pull ....
	.. uhhuh, trivial conflict in one file ..
	.. edit the/file/that/conflicted ..
	git commit the/file/that/conflicted

and there is no way that it would ever be correct to then just commit
that one file. The fact that it's a merge means that the rest of the
index - which is all from the merge, and correct - absolutely _must_ be
committed too.

And yes, I could use "git commit -a" (and I often do), but the thing is,
I surprisingly often have edits in unrelated files (stuff that the merge
never touched), and doing "git commit -a" would do the wrong thing.

So the current "git commit filename" behaviour is actually the only
possible correct one for a merge. Nothing else makes any sense
what-so-ever.

Now, I can hear people arguing that "ok, merges are special, and for
merges we always do it in the current index", but that makes "git commit
pathname" act very _differently_ for a merge than for a normal commit.
That just smells wrong to me.

So if you do this change (which may be the right one) then please make
sure that "git commit <filename>" doesn't work _at_all_ when a merge is in
progress (ie MERGE_HEAD exists), because it would do the wrong thing.

And yes, then I'll just have to force my fingers to do a simple

	git-update-index filename
	git commit

instead. I can do that.

Oh, one final suggestion: if you give a filename to "git commit", and you
do the new semantics which means something _different_ than "do a
git-update-index on that file and commit", then I'd really suggest that
the _old_ index for that filename should match the parent exactly.
Otherwise, you may have done a

	git diff filename

and you _thought_ you were committing just a two-line thing (because you
didn't understand about the index), but another, earlier, action caused
the index to be different from the file you had in HEAD, and in reality
you're actually committing a much bigger diff.

In other words: if you want "git commit <filename>" to _not_ care about
the current index, then it should make sure that the index at least
_matches_ the current HEAD in the files mentioned.

Ie "git-diff-index --cached HEAD <filespec>" should return empty. Or
something like that.

Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
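The two safety checks suggested in the message above (refuse while
MERGE_HEAD exists, and require that the index still matches HEAD for
the named paths) can be sketched as a small shell function. This is a
rough illustration only: check_partial_commit is a made-up name, not
something git ships, and it uses the dashless "git rev-parse" and
"git diff-index" spellings.

```shell
#!/bin/sh
# Illustrative only: check_partial_commit is a hypothetical helper,
# not a git command.

# Usage: check_partial_commit <path>...
# Returns 0 if a "commit only these paths" operation looks safe,
# 1 if a merge is in progress (or git itself failed),
# 2 if a named path already differs from HEAD in the index.
check_partial_commit() {
    git_dir=$(git rev-parse --git-dir) || return 1

    # Guard 1: a merge must commit the whole index, so refuse outright.
    if [ -f "$git_dir/MERGE_HEAD" ]; then
        echo "merge in progress: cannot commit individual paths" >&2
        return 1
    fi

    # Guard 2: for the named paths the index must still match HEAD,
    # otherwise the commit would be bigger than "git diff <path>" showed.
    staged=$(git diff-index --cached --name-only HEAD -- "$@") || return 1
    if [ -n "$staged" ]; then
        echo "already changed in the index:" >&2
        echo "$staged" >&2
        return 2
    fi
    return 0
}
```

A wrapper around "git commit <path>..." could call this first and stop
on a non-zero status; whether the second guard should be fatal or only
a warning is a policy choice the thread leaves open.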
* Re: [Census] So who uses git?
2006-02-01 0:38 ` Linus Torvalds
@ 2006-02-01 0:52 ` Junio C Hamano
2006-02-01 2:19 ` Daniel Barkalow
2006-02-01 6:42 ` Junio C Hamano
2 siblings, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-01 0:52 UTC (permalink / raw
To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:
> One thing to be careful about is merges.
> ...
> So the current "git commit filename" behaviour is actually the only
> possible correct one for a merge. Nothing else makes any sense
> what-so-ever.

Agreed 100%, and I kind of feel silly about not mentioning that
myself. It _might_ even make sense to reject explicit filenames
when MERGE_HEAD does not exist ;-).

> Oh, one final suggestion: if you give a filename to "git
> commit", and you do the new semantics which means something
> _different_ than "do a git-update-index on that file and
> commit", then I'd really suggest that the _old_ index for that
> filename should match the parent exactly.

That is also a good safety measure.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 0:38 ` Linus Torvalds
2006-02-01 0:52 ` Junio C Hamano
@ 2006-02-01 2:19 ` Daniel Barkalow
2006-02-01 6:42 ` Junio C Hamano
2 siblings, 0 replies; 110+ messages in thread
From: Daniel Barkalow @ 2006-02-01 2:19 UTC (permalink / raw
To: Linus Torvalds
Cc: Junio C Hamano, Johannes Schindelin, Carl Baldwin, Keith Packard,
Martin Langhoff, Git Mailing List

On Tue, 31 Jan 2006, Linus Torvalds wrote:
> So if you do this change (which may be the right one) then please make
> sure that "git commit <filename>" doesn't work _at_all_ when a merge is in
> progress (ie MERGE_HEAD exists), because it would do the wrong thing.

Agreed. I suppose it could accept doing a commit of only a few files
which weren't touched by the merge, but I don't think even you multitask
enough to want to do that; anyway, the user can just ditch the merge,
commit their stuff, and try the merge again.

(I bet this is a case where new users would be really surprised by the
behavior of "git commit filename", except that they wouldn't think it
would do anything other than give an error.)

> And yes, then I'll just have to force my fingers to do a simple
>
> git-update-index filename
> git commit
>
> instead. I can do that.
>
> Oh, one final suggestion: if you give a filename to "git commit", and you
> do the new semantics which means something _different_ than "do a
> git-update-index on that file and commit", then I'd really suggest that
> the _old_ index for that filename should match the parent exactly.
> Otherwise, you may have done a
>
> git diff filename
>
> and you _thought_ you were committing just a two-line thing (because you
> didn't understand about the index), but another, earlier, action caused
> the index to be different from the file you had in HEAD, and in reality
> you're actually committing a much bigger diff.
>
> In other words: if you want "git commit <filename>" to _not_ care about
> the current index, then it should make sure that the index at least
> _matches_ the current HEAD in the files mentioned.
>
> Ie "git-diff-index --cached HEAD <filespec>" should return empty. Or
> something like that.

Agreed here, too.

-Daniel
*This .sig left intentionally blank*
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 0:38 ` Linus Torvalds
2006-02-01 0:52 ` Junio C Hamano
2006-02-01 2:19 ` Daniel Barkalow
@ 2006-02-01 6:42 ` Junio C Hamano
2006-02-01 7:22 ` Carl Worth
` (2 more replies)
2 siblings, 3 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-01 6:42 UTC (permalink / raw
To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:
> Oh, one final suggestion: if you give a filename to "git commit", and you
> do the new semantics which means something _different_ than "do a
> git-update-index on that file and commit", then I'd really suggest that
> the _old_ index for that filename should match the parent exactly.
> Otherwise, you may have done a
>
> git diff filename
>
> and you _thought_ you were committing just a two-line thing (because you
> didn't understand about the index), but another, earlier, action caused
> the index to be different from the file you had in HEAD, and in reality
> you're actually committing a much bigger diff.

This "I thought I was only checking in the two-liner I did as
the last step but you committed the whole thing, stupid git!"
confusion feels to be a parallel of "I thought I was only
checking in the files I specified on the command line but you
also committed the files I earlier git-add'ed, stupid git!"
confusion.

Taken together with your "during a partially conflicted merge"
example, it feels to me that the simplest safety valve would be
to refuse "git commit paths..." if the index does not exactly
match HEAD. Not just mentioned paths but anywhere.

People who do not like this can set in their config file some
flag, say, 'core.index = understood', to get the current
behaviour.

The reason I am bringing this up is because of this command
sequence:

	# start from a clean tree, after 'git reset --hard'
	$ create a-new-file
	$ git add a-new-file
	$ edit existing-file
	$ edit another-file
	$ git commit existing-file

There is no question we do not commit "another-file" and we do
commit changes to the "existing-file" as a whole. What should
we do to "a-new-file", and how do we explain why we do so to
novices?

We can argue it either way. We could say we shouldn't because
"commit" argument does not mention it. We could say we should
because the user already told that he wants to add that file to
git. Either makes sort-of sense from what the end user did.

I think a file "cvs add"ed is committed if whole subdirectory
commit (similar to our "commit -a") is done or the file is
explicitly specified on the "cvs commit" command line, and that
may match people's expectations. That's an argument for not
committing "a-new-file".

But to be consistent with that, this should not commit anything:

	# the same clean tree.
	$ create a-new-file
	$ git add a-new-file
	$ git commit

Which is counterintuitive to me by now (because I played too
long with git).

We could make "git commit" without paths to mean the current
"-a" behaviour, which would match CVS behaviour more closely.
However, it would make commit after a merge conflict resolution
in a dirty working tree _very_ dangerous -- it may give more
familiar feel to CVS people, but it is not an improvement for
git people at all. I would rather not.

Right now, "git add" means "stage this for the next commit in
the index". If we change the semantics of "git add" to mean "I
am not adding it for the next commit yet; I am just letting you
know there is a file in the working tree so that you can keep an
eye on it for me", using the intent-to-add index entry I've
mentioned a couple of times, I think the above problem might
naturally be solved. For people who do not use update-index,
"commit -a" and "commit paths..." are the only two ways to
actually check-in anything to the index file for the next commit
("git add" alone does not count). "commit -a" would do the
equivalent of current "update all the not-up-to-date file to the
index and then commit", which would include the intent-to-add
paths.
^ permalink raw reply [flat|nested] 110+ messages in thread
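For readers who want to try the "just let git know the file exists"
idea from the message above: later git releases grew "git add -N"
(--intent-to-add), which records a path in the index without staging
its content, and whose documentation mentions committing such files
with "git commit -a". Whether that matches the proposal above in every
detail is an assumption; the sketch below only shows the "commit -a"
interaction, in a throwaway repository. demo_intent_to_add and the
user identity are invented for the example.

```shell
#!/bin/sh
# Sketch only: relies on "git add -N" from later git releases.
# demo_intent_to_add is a hypothetical name used for illustration.

# The function body is a subshell, so the "cd" and the "exit" calls
# below do not affect the caller.
demo_intent_to_add() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q . || exit 1
    git config user.name "Demo User"
    git config user.email demo@example.com

    # Start from a clean tree with one tracked file.
    echo base > existing-file
    git add existing-file
    git commit -q -m "initial" || exit 1

    # Announce the new path without staging its content.
    echo hello > a-new-file
    git add -N a-new-file

    # "commit -a" updates every path the index knows about, which now
    # includes the intent-to-add entry.
    echo more >> existing-file
    git commit -q -a -m "commit everything" || exit 1

    # List what ended up in the commit.
    git ls-tree -r --name-only HEAD
)
```

Calling demo_intent_to_add prints the paths recorded in the final
commit; the temporary repository is left behind for inspection.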
* Re: [Census] So who uses git?
2006-02-01 6:42 ` Junio C Hamano
@ 2006-02-01 7:22 ` Carl Worth
2006-02-01 8:26 ` Junio C Hamano
2006-02-01 17:11 ` Linus Torvalds
2006-02-01 17:18 ` Nicolas Pitre
2 siblings, 1 reply; 110+ messages in thread
From: Carl Worth @ 2006-02-01 7:22 UTC (permalink / raw
To: Junio C Hamano; +Cc: Linus Torvalds, git
[-- Attachment #1: Type: text/plain, Size: 2269 bytes --]

On Tue, 31 Jan 2006 22:42:05 -0800, Junio C Hamano wrote:
>
> There is no question we do not commit "another-file" and we do
> commit changes to the "existing-file" as a whole. What should
> we do to "a-new-file", and how do we explain why we do so to
> novices?

I'll offer a couple of ill-informed comments from a novice's
point-of-view if I may.

My first exposure to git (about 1 week ago) was "A short git tutorial"
[*] I found the discussion of the index, git-update-index, and the
subtle distinctions between the various git-diff commands rather
intimidating for an initial introduction.

After getting to know the system better over the past week, it seems
it should be possible to have a class of "novice ready" tools that
provide for common use cases and that never require any mention of the
index in their documentation. If so, that seems to me a useful goal to
work toward and a useful guide in this discussion.

> We could make "git commit" without paths to mean the current
> "-a" behaviour, which would match CVS behaviour more closely.

Again, my novice experience leads me to favor that change. After
reading the tutorial, I had the following sequence in mind for
committing an edited file:

	git update-index edited-file
	git commit

which seemed like more pain than strictly necessary. The next day,
when I went to the linux.conf.au tutorial and saw Linus use:

	git commit -a

for the same operation it was a breath of fresh air. I was left
scratching my head wondering why the -a behavior wasn't the default
for "git commit" with no paths.

> However, it would make commit after a merge conflict resolution
> in a dirty working tree _very_ dangerous -- it may give more
> familiar feel to CVS people, but it is not an improvement for
> git people at all. I would rather not.

I'm still not "git people" I guess. Could you explain what the danger
is here? And is it something the tool could detect and prevent?

-Carl

[*] http://www.kernel.org/pub/software/scm/git/docs/core-tutorial.html

[* A better initial introduction for me would likely have been "A
tutorial introduction to git":

	http://www.kernel.org/pub/software/scm/git/docs/tutorial.html

so a link to the latter from the first paragraph or so of the former
might be very helpful.
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 7:22 ` Carl Worth
@ 2006-02-01 8:26 ` Junio C Hamano
2006-02-01 9:59 ` Randal L. Schwartz
0 siblings, 1 reply; 110+ messages in thread
From: Junio C Hamano @ 2006-02-01 8:26 UTC (permalink / raw
To: Carl Worth; +Cc: Linus Torvalds, git

Carl Worth <cworth@cworth.org> writes:
> ... it seems it should be possible to have a class of
> "novice ready" tools that provide for common use cases and that never
> require any mention of the index in their documentation. If so, that
> seems to me a useful goal to work toward and a useful guide in this
> discussion.

I agree it is a worthy goal. Unfortunately I lost my git
virginity a long time ago, so a fresh perspective is really
appreciated in this discussion.

> ... Could you explain what the danger is here?

As Linus mentioned in an earlier message in this thread, one of
the important tasks for him is to take other people's trees and
merge them into his mainline. The workflow goes like this:

	$ git pull from-somewhere
	... oops there are conflicts
	$ edit conflicted/file
	$ edit more/conflicted/file
	... maybe compile test ...
	$ git diff -c ;# final sanity check
	$ git update-index conflicted/file
	$ git update-index more/conflicted/file
	$ git commit

He does *not* want to do "git commit -a" here, because he
usually has unrelated changes in his working tree he has not
done update-index on and does _not_ want to commit [*1*]. "git
commit" to imply "git commit -a" increases the risk of
accidentally committing those unrelated changes mixed in the
merge (eh, actually makes the risk 100%).

We _could_ detect that we were in the middle of a merge,
enumerate the paths touched by the merged branches. Then we can
say paths that are different between the index and the working
tree and not in the paths touched by the merge are his unrelated
changes. But it is conceivable he may need to modify a file
neither branch touches in order to _logically_ resolve the
merge, even when the merge physically does not conflict on a
textual diff basis, so while that heuristic may work pretty
well most of the time, doing so might make things even harder
to explain to other people.

[Footnotes]

*1* The reason he has unrelated changes while doing a merge is
because he works on things himself (I am speculating about
this), and for these modified paths he never runs git-add nor
git-update-index until he is ready to commit his changes (I am
not speculating about this). As long as he knows what he is
pulling in from outside does not overlap with what he has been
working on, he can merge and commit the result without worrying
about his own unrelated changes, and git is careful not to touch
anything in his working tree to cause information loss when the
changes do overlap [*2*].

He is committing something that he never tested himself in his
working tree as a whole. The tree resulting from the merge
never existed outside his index file, so there is no way he
could have even compile tested it properly. But for somebody
who is playing an integrator's role, it is not his primary job
to examine and test every change he merges in as a whole at
nitty-gritty level -- that is what the originator of the change
should have done. So having uncommitted changes in the working
tree for an integrator person is not a sign of bad discipline at
all, and supporting this workflow _is_ important for git.

The primary reason I first got involved in git was because I
wanted to help the workflow of the kernel people, especially
Linus and the subsystem maintainers. To be honest, I personally
still consider the kernel people the first tier customers for
me, and I stop and try to think twice when thinking about a
change or a new feature that may help individual developers and
newcomers, to make sure such a change does not make life less
convenient for the 'integrator' people. Helping integrators to
be more efficient is important because they can become
bottlenecks.

*2* I once got yelled at by Linus when I carelessly broke this
feature and changed 'git-merge' to require a clean working tree
without changes before starting a merge; it was quickly
reverted.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 8:26 ` Junio C Hamano
@ 2006-02-01 9:59 ` Randal L. Schwartz
2006-02-01 20:48 ` Junio C Hamano
0 siblings, 1 reply; 110+ messages in thread
From: Randal L. Schwartz @ 2006-02-01 9:59 UTC (permalink / raw
To: Junio C Hamano; +Cc: Carl Worth, Linus Torvalds, git

>>>>> "Junio" == Junio C Hamano <junkio@cox.net> writes:

Junio> *1* The reason he has unrelated changes while doing a merge is
Junio> because he works on things himself (I am speculating about
Junio> this),

You need to speculate that Linus works on things himself? :)

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 9:59 ` Randal L. Schwartz
@ 2006-02-01 20:48 ` Junio C Hamano
0 siblings, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-01 20:48 UTC (permalink / raw
To: Randal L. Schwartz; +Cc: git

merlyn@stonehenge.com (Randal L. Schwartz) writes:
>>>>>> "Junio" == Junio C Hamano <junkio@cox.net> writes:
>
> Junio> *1* The reason he has unrelated changes while doing a merge is
> Junio> because he works on things himself (I am speculating about
> Junio> this),
>
> You need to speculate that Linus works on things himself? :)

Forgot a smiley ;-).
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 6:42 ` Junio C Hamano
2006-02-01 7:22 ` Carl Worth
@ 2006-02-01 17:11 ` Linus Torvalds
2006-02-01 17:18 ` Nicolas Pitre
2 siblings, 0 replies; 110+ messages in thread
From: Linus Torvalds @ 2006-02-01 17:11 UTC (permalink / raw
To: Junio C Hamano; +Cc: git

On Tue, 31 Jan 2006, Junio C Hamano wrote:
>
> Taken together with your "during a partially conflicted merge"
> example, it feels to me that the simplest safety valve would be
> to refuse "git commit paths..." if the index does not exactly
> match HEAD. Not just mentioned paths but anywhere.

But at that point, the existing "git commit" semantics actually are the
ones we'd use, and the only difference ends up being that we error out if
the index doesn't match HEAD.

The problem with that is that it appears that some of the people who don't
like the current "git commit <filename>" thing _do_ actually understand
the index, but they want to commit just that one file.

So at least from my understanding, I think Dscho was arguing for the new
semantics of "git commit <file>" to _work_, but to only commit <file>,
even if he does understand the index perfectly well, and might have done a
"git add" or updated a file for some other reason..

Btw, one thing that _can_ be confusing is that you do

	git commit fileA

and then when you edit the commit message, you realize that you don't
actually want to do this at all, so you exit out of the editor without
changes (which aborts the commit).

Now "git commit" will not actually have done the commit, but it _will_
have done the "git-update-index" on that file. So next time, when you do

	git commit fileB

you'll currently commit _both_ fileA and fileB.

This is, in my opinion, the biggest argument for the suggested _new_
semantics: if you explicitly name a set of files, it should always do a

	# Verify current state
	parent=$(git-rev-parse --verify HEAD) || exit

	# Verify that the current index is ok in the named files
	a=$(git-diff-files --name-only --cached $parent "$@") || exit
	if [ "$a" ]; then
		echo -e >&2 "Files are changed in the index:\n $a"
		exit 2
	fi

	# create the new tree object
	export GIT_INDEX_FILE=tmpfile
	newtree=$(git-read-tree $parent && git-update-index "$@" && git-write-tree) || exit

	# edit message
	... edit message ..

	# do commit
	newhead=$(git-commit-tree -p $parent < msg)
	git-update-ref HEAD $newhead $parent

or similar. That has the advantage that if we _do_ decide to break out of
the commit, we will not have changed the current index (only the temporary
one).

Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
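A runnable variation on the outline in the message above, offered only
as a sketch under assumptions: commit_only_paths is a made-up helper
name, the commit message is passed as an argument instead of being
edited, the index check is done with "git diff-index --cached", the new
tree is handed to "git commit-tree" explicitly, and the temporary index
lives at a hypothetical path inside the git directory. The last step
(refreshing the real index for the committed paths) follows the "and
the main index" remark made earlier in the thread.

```shell
#!/bin/sh
# Illustrative sketch; commit_only_paths is a hypothetical helper.

# Usage: commit_only_paths <message> <path>...
commit_only_paths() {
    msg=$1
    shift

    # Verify current state.
    parent=$(git rev-parse --verify HEAD) || return 1
    git_dir=$(cd "$(git rev-parse --git-dir)" && pwd) || return 1

    # The named paths must still match HEAD in the real index.
    staged=$(git diff-index --cached --name-only "$parent" -- "$@") || return 1
    if [ -n "$staged" ]; then
        echo "Files are changed in the index: $staged" >&2
        return 2
    fi

    # Build the new tree in a temporary index seeded from HEAD. The
    # export happens inside the $(...) subshell, so the caller's
    # environment and the real index are untouched.
    tmp_index="$git_dir/partial-commit-index.$$"
    newtree=$(
        GIT_INDEX_FILE=$tmp_index
        export GIT_INDEX_FILE
        git read-tree "$parent" &&
        git update-index --add -- "$@" &&
        git write-tree
    )
    status=$?
    rm -f "$tmp_index"
    [ "$status" -eq 0 ] || return 1

    # Record the commit and move HEAD, but only if HEAD is still $parent.
    newhead=$(printf '%s\n' "$msg" |
        git commit-tree "$newtree" -p "$parent") || return 1
    git update-ref HEAD "$newhead" "$parent" || return 1

    # Bring the real index up to date for the committed paths only.
    git update-index --add -- "$@"
}
```

Because nothing touches the real index until after update-ref
succeeds, bailing out at any earlier step leaves previously staged
paths exactly as they were, which is the property argued for above.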
* Re: [Census] So who uses git?
2006-02-01 6:42 ` Junio C Hamano
2006-02-01 7:22 ` Carl Worth
2006-02-01 17:11 ` Linus Torvalds
@ 2006-02-01 17:18 ` Nicolas Pitre
2006-02-01 20:27 ` Junio C Hamano
2 siblings, 1 reply; 110+ messages in thread
From: Nicolas Pitre @ 2006-02-01 17:18 UTC (permalink / raw
To: Junio C Hamano; +Cc: Linus Torvalds, git

On Tue, 31 Jan 2006, Junio C Hamano wrote:

> This "I thought I was only checking in the two-liner I did as
> the last step but you committed the whole thing, stupid git!"
> confusion feels to be a parallel of "I thought I was only
> checking in the files I specified on the command line but you
> also committed the files I earlier git-add'ed, stupid git!"
> confusion.
>
> Taken together with your "during a partially conflicted merge"
> example, it feels to me that the simplest safety valve would be
> to refuse "git commit paths..." if the index does not exactly
> match HEAD. Not just mentioned paths but anywhere.
>
> People who do not like this can set in their config file some
> flag, say, 'core.index = understood', to get the current
> behaviour.

I'd avoid hidden config options that magically change behaviors and
semantics like that as much as possible. _This_ would pave the way to
even greater confusion and prevent the git user base from converging on
a unified knowledge of the semantics.

Better add a command line option, which has the virtue of being
visible, and name it such that it makes the intention explicit whether
the previous index state is preserved or not, something like
--current-index or the like.

> The reason I am bringing this up is because of this command
> sequence:
>
>     # start from a clean tree, after 'git reset --hard'
>     $ create a-new-file
>     $ git add a-new-file
>     $ edit existing-file
>     $ edit another-file
>     $ git commit existing-file
>
> There is no question we do not commit "another-file" and we do
> commit changes to the "existing-file" as a whole. What should
> we do to "a-new-file", and how do we explain why we do so to
> novices?
>
> We can argue it either way. We could say we shouldn't because
> "commit" argument does not mention it. We could say we should
> because the user already told that he wants to add that file to
> git. Either makes sort-of sense from what the end user did.

It is much more intuitive to expect that, if you specify path arguments
to commit, then only those paths are considered, even if you didn't do
a git add on some of them. If nothing is specified then the current
index (the default, including a-new-file) is considered.

> I think a file "cvs add"ed is committed if whole subdirectory
> commit (similar to our "commit -a") is done or the file is
> explicitly specified on the "cvs commit" command line, and that
> may match people's expectations. That's an argument for not
> committing "a-new-file".

Exactly.

> But to be consistent with that, this should not commit anything:
>
>     # the same clean tree.
>     $ create a-new-file
>     $ git add a-new-file
>     $ git commit
>
> Which is counterintuitive to me by now (because I played too
> long with git).

IMHO this should commit a-new-file simply because you added it to the
index, and a commit without any argument should commit the whole
(refreshed) index.

> We could make "git commit" without paths to mean the current
> "-a" behaviour, which would match CVS behaviour more closely.

Exactly.

> However, it would make commit after a merge conflict resolution
> in a dirty working tree _very_ dangerous -- it may give more
> familiar feel to CVS people, but it is not an improvement for
> git people at all. I would rather not.

For that case (assuming that -a would be the default), maybe something
meaning the opposite of -a could be specified on the commit argument
list, like I suggested earlier. And maybe it should always be the
default when committing a merge (in which case the -a would override
that and refresh everything, not only the merged files plus those
specified on the command line).

So to summarize:

- a non-merge commit without any argument would imply -a.

- a non-merge commit with path arguments implies _only_ those paths,
  regardless of whether they were previously "git add"ed or not.

- a non-merge commit with, say, --no-auto or --current-index or
  whatever would preserve the current behavior, with or without
  additional paths.

- a merge commit would imply that --no-auto behavior automatically.

- a merge commit could override the --no-auto with an explicit -a.

This might look complicated when presented like that, but I think that
the default behavior of each (non-merge vs merge) commit would more
closely fit most people's expectations. The merge commit creates a
shift in semantics of course, but committing a merge is already
something a bit more involved anyway. At that point git users should
have gained a bit more experience with the index concept, and the
default merge behavior is probably what most people will expect by then
as well.

Nicolas
^ permalink raw reply [flat|nested] 110+ messages in thread
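The five rules in the summary above can be read as a small decision
table. The following is a minimal Python sketch of that reading, not
git's actual implementation: the function name commit_mode, its
parameters, and the returned labels are all hypothetical, and the case
of "-a" combined with explicit paths is not covered by the proposal
(here "-a" simply wins).

```python
def commit_mode(is_merge, paths=(), no_auto=False, all_flag=False):
    """Classify what a commit would include under the five proposed rules.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    if all_flag:
        # An explicit -a refreshes everything, even for a merge commit.
        return "all-working-tree-changes"
    if is_merge or no_auto:
        # Merge commits imply --no-auto: keep the index as staged,
        # plus any paths named on the command line.
        return "index-plus-paths" if paths else "index-as-is"
    if paths:
        # Non-merge commit with paths: only those paths, "git add"ed or not.
        return "only-listed-paths"
    # Non-merge commit without arguments implies -a.
    return "all-working-tree-changes"
```

Written this way, the "shift in semantics" for merges is a single
branch: the only thing a merge changes is the default when neither -a
nor --no-auto is given.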
* Re: [Census] So who uses git?
2006-02-01 17:18 ` Nicolas Pitre
@ 2006-02-01 20:27 ` Junio C Hamano
2006-02-01 21:09 ` Linus Torvalds
2006-02-01 22:00 ` Joel Becker
0 siblings, 2 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-01 20:27 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Linus Torvalds, git

Nicolas Pitre <nico@cam.org> writes:

> On Tue, 31 Jan 2006, Junio C Hamano wrote:
>
>> People who do not like this can set in their config file some
>> flag, say, 'core.index = understood', to get the current
>> behaviour.
>
> I'd avoid hidden config options that magically change behaviors and
> semantics like that as much as possible....

I agree; it was a tongue-in-cheek sort of suggestion ;-)

> It is much more intuitive to expect that, if you specify path arguments
> to commit, then only those paths are considered, and even if you didn't
> do a git add on some of them. If nothing is specified then the current
> index (the default, including a-new-file) is considered.

Good thinking. I was not thinking about the case where you
explicitly list an untracked file to be added.

> - a non-merge commit without any argument would imply -a.
>
> - a non-merge commit with path arguments implies _only_ those paths,
> regardless if they were previously "git add"ed or not.
>
> - a non-merge commit with, say, --no-auto or --current-index or
> whatever would preserve the current behavior, with or without
> additional paths.
>
> - a merge commit ...
> - a merge commit ...
>
> This might look complicated when presented like that, but I think that
> the default behavior of each (non-merge vs merge) commit would more
> closely fit most people's expectations....

If I may correct what I said earlier, I now realize the
"automatic -a is dangerous" argument does not have anything to
do with merges. If the user usually works with a dirty working
tree, is aware of the index, and takes advantage of the index as
the staging area for the next commit, your --no-auto would be
needed to help her workflow.

I in principle agree with the first three items in the above
summary, except that I think it would make more sense to do that
for all commits. How about this:

- "git commit --also fileA..." means: update index at listed
  paths (add/remove if necessary) and then commit the tree
  described in index (the current behaviour with explicit paths).

- "git commit fileA..." means: create a temporary index from the
  current HEAD commit (or empty index if there is none), update
  it at listed paths (add/remove if necessary) and commit the
  resulting tree. Also update the real index at the listed
  paths (add/remove if necessary). In the original index file,
  the paths listed must be either empty or match exactly the
  HEAD commit -- otherwise we error out (Linus' suggestion).

- "git commit" means: update index with all local changes and
  then commit the tree described in index (current "-a"
  behaviour).

- In all cases, revert the index to the state before the
  command is run if we end up not making the commit
  (e.g. index unmerged, empty log message, pre-commit hook
  refusal).

Experienced git users would end up saying "--also" without
explicit paths to defeat the automatic -a behaviour all the
time, and while the flag --also makes perfect sense when used
with one or more paths, using it like this looks awkward:

    $ edit some-file
    $ git update-index some-file
    $ git commit --also

It's just a flag name, so we could make --no-auto a synonym for
--also.

A minor twist of the above to make it friendlier to the current
git users is to do this:

- "git commit fileA...", "git commit -a", and "git commit" keep
  the existing semantics.

- "git commit --only fileA..." does the new temporary index
  thing.

This has an advantage that existing use is not affected, and
another advantage is that internally it is more consistent
("git commit" is a natural extension of "git commit fileA..."
with zero paths). But one possible downside is that you need to
explicitly say --only when you want cvs-like "commit". Since we
are discussing that people find the existing interface to be
unintuitive, being consistent with the current usage may not
count as a big advantage after all..
^ permalink raw reply [flat|nested] 110+ messages in thread
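The "temporary index" behaviour described above for "git commit
fileA..." can be sketched with a toy model in which HEAD, the index,
and the working tree are plain dictionaries mapping path to content.
This is only an illustration under those assumptions, not how git
stores anything; the function name commit_paths_only is hypothetical,
and the safety check follows a literal reading of "either empty or
match exactly the HEAD commit".

```python
def commit_paths_only(head, index, worktree, paths):
    """Toy model of the proposed "git commit fileA..." semantics.

    head, index, worktree: dicts mapping path -> content.
    Returns (new_head, new_index); the inputs are left unmodified.
    """
    # Safety check: each listed path must be absent from the real index
    # or staged with exactly the content recorded in HEAD.
    for path in paths:
        if path in index and index[path] != head.get(path):
            raise ValueError(f"{path}: index does not match HEAD")

    temp_index = dict(head)    # temporary index starts from HEAD
    new_index = dict(index)    # the real index is updated at listed paths
    for path in paths:
        if path in worktree:
            temp_index[path] = worktree[path]
            new_index[path] = worktree[path]
        else:
            # Path removed from the working tree: remove it from both.
            temp_index.pop(path, None)
            new_index.pop(path, None)

    # The commit records the temporary index, not the real one.
    return temp_index, new_index


# The scenario from earlier in the thread: a-new-file was "git add"ed,
# two tracked files were edited, and only existing-file is named.
head = {"existing-file": "v1", "another-file": "v1"}
index = {"existing-file": "v1", "another-file": "v1", "a-new-file": "new"}
worktree = {"existing-file": "v2", "another-file": "v2", "a-new-file": "new"}
new_head, new_index = commit_paths_only(head, index, worktree, ["existing-file"])
```

In this model the new commit contains only the change to existing-file,
while a-new-file stays staged in the real index for a later commit.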
* Re: [Census] So who uses git? 2006-02-01 20:27 ` Junio C Hamano @ 2006-02-01 21:09 ` Linus Torvalds 2006-02-01 21:34 ` Nicolas Pitre 2006-02-01 21:59 ` Junio C Hamano 2006-02-01 22:00 ` Joel Becker 1 sibling, 2 replies; 110+ messages in thread From: Linus Torvalds @ 2006-02-01 21:09 UTC (permalink / raw To: Junio C Hamano; +Cc: Nicolas Pitre, git On Wed, 1 Feb 2006, Junio C Hamano wrote: > > How about this: > > - "git commit --also fileA..." means: update index at listed > paths (add/remove if necessary) and then commit the tree > described in index (the current behaviour with explicit paths). I'd suggest "--incremental" instead of "--also". > - "git commit fileA..." means: create a temporary index from the > current HEAD commit (or empty index if there is none), update > it at listed paths (add/remove if necessary) and commit the > resulting tree. Also update the real index at the listed > paths (add/remove if necessary). In the original index file, > the paths listed must be either empty or match exactly the > HEAD commit -- otherwise we error out (Linus' suggestion). Yes. > - "git commit" means: update index with all local changes and > then commit the tree described in index (current "-a" > behaviour). No. Please no. "git commit" should continue to do what it does now. Otherwise you can't do the two-stage thing in any sane way. Requiring "--incremental"/"--also" is very confusing. If somebody doesn't know about the index, he normally will never have index changes _anyway_, except for the "git add" case. In which case "git commit" does the right thing for him: it will either commit the added files, or it will say "nothing to commit". Linus ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 21:09 ` Linus Torvalds @ 2006-02-01 21:34 ` Nicolas Pitre 2006-02-01 21:59 ` Junio C Hamano 1 sibling, 0 replies; 110+ messages in thread From: Nicolas Pitre @ 2006-02-01 21:34 UTC (permalink / raw To: Linus Torvalds; +Cc: Junio C Hamano, git On Wed, 1 Feb 2006, Linus Torvalds wrote: > > > On Wed, 1 Feb 2006, Junio C Hamano wrote: > > > > How about this: > > > > - "git commit --also fileA..." means: update index at listed > > paths (add/remove if necessary) and then commit the tree > > described in index (the current behaviour with explicit paths). > > I'd suggest "--incremental" instead of "--also". > > > - "git commit fileA..." means: create a temporary index from the > > current HEAD commit (or empty index if there is none), update > > it at listed paths (add/remove if necessary) and commit the > > resulting tree. Also update the real index at the listed > > paths (add/remove if necessary). In the original index file, > > the paths listed must be either empty or match exactly the > > HEAD commit -- otherwise we error out (Linus' suggestion). > > Yes. Agreed. > > - "git commit" means: update index with all local changes and > > then commit the tree described in index (current "-a" > > behaviour). > > No. Please no. "git commit" should continue to do what it does now. > Otherwise you can't do the two-stage thing in any sane way. > > Requiring "--incremental"/"--also" is very confusing. > > If somebody doesn't know about the index, he normally will never have > index changes _anyway_, except for the "git add" case. In which case "git > commit" does the right thing for him: it will either commit the added > files, or it will say "nothing to commit". Sensible. As long as "commit files..." actually commits _only_ those files unless --index (or something) is specified to also explicitly include the index changes. 
What is really counter-intuitive is to have index changes merged by default when a single file is specified as argument to commit. Nicolas ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 21:09 ` Linus Torvalds
2006-02-01 21:34 ` Nicolas Pitre
@ 2006-02-01 21:59 ` Junio C Hamano
2006-02-01 22:25 ` Nicolas Pitre
` (2 more replies)
1 sibling, 3 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-01 21:59 UTC (permalink / raw
To: Linus Torvalds; +Cc: Nicolas Pitre, git

Linus Torvalds <torvalds@osdl.org> writes:

>> - "git commit" means: update index with all local changes and
>> then commit the tree described in index (current "-a"
>> behaviour).
>
> No. Please no. "git commit" should continue to do what it does now.
> Otherwise you can't do the two-stage thing in any sane way.
> Requiring "--incremental"/"--also" is very confusing.

I myself did not like it but...

> If somebody doesn't know about the index, he normally will never have
> index changes _anyway_, except for the "git add" case. In which case "git
> commit" does the right thing for him: it will either commit the added
> files, or it will say "nothing to commit".

... the original complaint was that "git commit" without
explicit paths does not quack like "cvs/svn commit" -- commit
all my changes in the working tree.

And actually the one you are responding to was my cunning move
to pull this exact reaction from you: "No, commit without
parameters should not imply -a".

I prefer the "minor twist" version in the same message myself.
To recap:

- "git commit fileA..." means: update index at listed paths
  (add/remove if necessary) and then commit the tree described
  in index (the same as the current behaviour with explicit
  paths).

- "git commit -a" means: update index with all local changes and
  then commit the tree described in index (the same as the
  current behaviour).

- "git commit" means: write out the current index and commit
  (the same as the current behaviour).

- "git commit --only fileA..." means: create a temporary index
  from the current HEAD commit (or empty index if there is
  none), update it at listed paths (add/remove if necessary)
  and commit the resulting tree. Also update the real index at
  the listed paths (add/remove if necessary). In the original
  index file, the paths listed must be either empty or match
  exactly the HEAD commit -- otherwise we error out (Linus'
  suggestion).

- In all cases, revert the index to the state before the
  command is run if we end up not making the commit
  (e.g. index unmerged, empty log message, pre-commit hook
  refusal).

With this, "git diff-files fileA" would show the same differences
after an aborted "git commit -a" or "git commit fileA" as it showed
before, which removes one common gripe.
^ permalink raw reply [flat|nested] 110+ messages in thread
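The last item of the recap (revert the index if no commit ends up being
made) amounts to a save-and-restore around the index update. A minimal
Python sketch, assuming a toy index represented as a dict of path to
content; commit_with_rollback, its parameters, and the hook callables
are hypothetical names used only for illustration.

```python
def commit_with_rollback(index, updates, log_message, hooks=()):
    """Apply staged updates, then commit or restore the index.

    index:   dict path -> content, modified in place on success.
    updates: dict of paths to refresh before committing.
    hooks:   callables taking the index and returning True to accept.
    Returns the committed tree (a dict) or None if the commit is aborted.
    """
    saved = dict(index)            # snapshot of the index before the command
    index.update(updates)          # what "-a" or explicit paths would stage

    accepted = bool(log_message.strip()) and all(hook(index) for hook in hooks)
    if not accepted:
        # Aborted: put the index back exactly as it was.
        index.clear()
        index.update(saved)
        return None
    return dict(index)
```

With the index restored on abort, a later comparison of the working
tree against the index gives the same answer as it did before the
failed commit, which is the gripe the recap wants to remove.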
* Re: [Census] So who uses git? 2006-02-01 21:59 ` Junio C Hamano @ 2006-02-01 22:25 ` Nicolas Pitre 2006-02-01 22:50 ` Junio C Hamano 2006-02-01 22:35 ` Linus Torvalds 2006-02-01 22:57 ` [Census] So who uses git? Daniel Barkalow 2 siblings, 1 reply; 110+ messages in thread From: Nicolas Pitre @ 2006-02-01 22:25 UTC (permalink / raw To: Junio C Hamano; +Cc: Linus Torvalds, git On Wed, 1 Feb 2006, Junio C Hamano wrote: > To recap: > > - "git commit fileA..." means: update index at listed paths > (add/remove if necessary) and then commit the tree described > in index (the same as the current behaviour with explicit > paths). No. > - "git commit -a" means: update index with all local changes and > then commit the tree described in index (the same as the > current behaviour). Sensible. > - "git commit" means: write out the current index and commit > (the same as the current behaviour). Sensible. > - "git commit --only fileA..." means: create a temporary index > from the current HEAD commit (or empty index if there is > none), update it at listed paths (add/remove if necessary) > and commit the resulting tree. Also update the real index at > the listed paths (add/remove if necessary). In the original > index file, the paths listed must be either empty or match > exactly the HEAD commit -- otherwise we error out (Linus' > suggestion). Actually, my opinion is that should be the behavior for your first item above (when only filenames are specified). If you want to _also_ include the index like you describe in your first item then an additional switch should be provided. In other words, the --only should become --with-index with the behavior swapped. The fact is that when you simply specify a filename, you really expect _only_ that filename will be affected and the rest be left alone. That's the most probable expectation for any tool. 
If you want _additional_ stuff to also be merged along with the files specified then it is logical to have an additional argument in that case, not the other way around. Nicolas ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 22:25 ` Nicolas Pitre @ 2006-02-01 22:50 ` Junio C Hamano 2006-02-02 14:59 ` Andreas Ericsson 0 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2006-02-01 22:50 UTC (permalink / raw To: Nicolas Pitre; +Cc: Linus Torvalds, git, Joel Becker Nicolas Pitre <nico@cam.org> writes: > Actually, my opinion is that should be the behavior for your first item > above (when only filenames are specified). If you want to _also_ > include the index like you describe in your first item then an > additional switch should be provided. OK, agreed. Sorry to be slow. So, to recap: git commit paths... (temporary index thing) git commit --incremental paths... (same as current w/o --incremental) git commit (same as current) git commit -a (same as current) And I agree with Joel that we should not automatically imply "git add" with or without --incremental. I do not particularly have much preference among --also, --with-index, or --incremental, but: - 'with-index' is precise but might be too technical; - 'incremental' is not really incremental -- you can use it only once. Because you do not have to say "git commit --also" without paths (which _is_ awkward) to get the traditional behaviour, maybe it is a good name for that flag (it is also the shortest). ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 22:50 ` Junio C Hamano
@ 2006-02-02 14:59 ` Andreas Ericsson
0 siblings, 0 replies; 110+ messages in thread
From: Andreas Ericsson @ 2006-02-02 14:59 UTC (permalink / raw
To: Junio C Hamano; +Cc: Nicolas Pitre, Linus Torvalds, git, Joel Becker

Junio C Hamano wrote:
>
> I do not particularly have much preference among --also,
> --with-index, or --incremental, but:
>
> - 'with-index' is precise but might be too technical;
> - 'incremental' is not really incremental -- you can use it
> only once.
>
> Because you do not have to say "git commit --also" without paths
> (which _is_ awkward) to get the traditional behaviour, maybe it
> is a good name for that flag (it is also the shortest).
>

Except that -a, which is the logical shorthand, is already taken. How
about --include (or --include-index, or --index) and -i?

commit being a fairly commonly used command, I think it's safe to
assume that most people will read the man-page or the help output if
there's something they don't understand.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git? 2006-02-01 21:59 ` Junio C Hamano 2006-02-01 22:25 ` Nicolas Pitre @ 2006-02-01 22:35 ` Linus Torvalds 2006-02-01 23:33 ` Two ideas for improving git's user interface Carl Worth 2006-02-01 22:57 ` [Census] So who uses git? Daniel Barkalow 2 siblings, 1 reply; 110+ messages in thread From: Linus Torvalds @ 2006-02-01 22:35 UTC (permalink / raw To: Junio C Hamano; +Cc: Nicolas Pitre, git On Wed, 1 Feb 2006, Junio C Hamano wrote: > > ... the original complaint was that "git commit" without > explicit paths does not quack like "cvs/svn commit" -- commit > all my changes in the working tree. Agreed. However, I think that one is pretty easy to explain, and conceptually it's not a problem to just tell people to use the "-a" flag if they want to get CVS/SVN semantics. After all, "git commit" will actually make it pretty obvious in the commit message status, _and_ if you haven't done any "git add" you'll get the "nothing to commit" thing, so it's not like this is hard to explain. The real _confusion_ I think came from the filename usage. Linus ^ permalink raw reply [flat|nested] 110+ messages in thread
* Two ideas for improving git's user interface
2006-02-01 22:35 ` Linus Torvalds
@ 2006-02-01 23:33 ` Carl Worth
2006-02-02 0:38 ` Junio C Hamano
` (3 more replies)
0 siblings, 4 replies; 110+ messages in thread
From: Carl Worth @ 2006-02-01 23:33 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Nicolas Pitre, git

[-- Attachment #1: Type: text/plain, Size: 3612 bytes --]

On Wed, 1 Feb 2006 14:35:33 -0800 (PST), Linus Torvalds wrote:
>
> Agreed. However, I think that one is pretty easy to explain, and
> conceptually it's not a problem to just tell people to use the "-a" flag
> if they want to get CVS/SVN semantics.

"Just use -a" is tempting, but I don't think it's a satisfying stance
to take. Consider the following operations:

    echo "original" > A; git add A; echo "modified" > A; git commit -a -m "add A";
    echo "original" > B; git add B; echo "modified" > B; git commit -m "add B" B;
    echo "original" > C; git add C; echo "modified" > C; git commit -m "add C";

After which we can see:

    $ git diff
    diff --git a/C b/C
    index 4b48dee..2e09960 100644
    --- a/C
    +++ b/C
    @@ -1 +1 @@
    -original
    +modified

To explain this, "just use -a" isn't enough, it would have to be
something like, "always use -a or else 'git commit' just won't work
and you can end up committing stale garbage". And perhaps "unless you
also add the filename to the commit line, then it will start working
again."

The explanation for the above behavior requires a rather careful
description of the index, the operations, the flags, and some rather
subtle interactions between them. I don't even think "embrace the
index" is enough to make the above behavior obvious---the variations
in the above behavior are a bit too subtle.

But I don't think git is doomed to be hard to learn or that its
behavior needs to be hard to predict. I think this should be fairly
easy to fix.

Here's a fundamental question I have, (and thanks to Keith Packard for
helping me to phrase it):

    Is it ever useful (reasonable, desirable) to commit file
    contents that differ from the contents of the working
    directory?

I don't think it is, (but please let me know if I've missed some
useful case).

Idea #1 (prevent the index from being used to commit stale data)
-------
If this isn't useful, then I think git would do well to make it
harder/impossible to perform this operation. For example, the index
could have a new notion of "use working directory contents" for a
given file in addition to the current "use this blob". This would
allow a user to use the index to stage subsequent file
additions/modifications for commit without introducing the various
opportunities for confusing commits of stale data.

I would think this would then naturally resolve the confusion around
the various diff operations, (diff-index, diff-index --cached, and
diff-files).

Idea #2 (make it easy to preview diffs of what will be committed)
-------
Independent of the above, I'd like to propose another change to help
prevent confusion and to help users learn git. There should be an
obvious "diff" operation that presents exactly the result of what any
"commit" operation will perform.

I assume that there currently exist appropriate diff operations for
any commit, but the correspondence certainly isn't obvious. For
example, the simplest of commit commands:

    git commit

seems to correspond to a rather complex diff command (which I may not
have completely correct yet---and if not, that would just demonstrate
the point even more):

    git diff-index -p --cached HEAD

What I would love to have is the ability to pass the same arguments to
git diff to get a preview of what any git commit would do. For
example, something like:

    git diff              # would be a preview of: git commit
    git diff -a           # would be a preview of: git commit -a
    git diff fileA fileB  # would be a preview of: git commit fileA fileB

etc.

Again, thanks for your consideration of these thoughts from a woefully
clueless and inexperienced user.

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 110+ messages in thread
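The A/B/C sequence in the message above can be replayed in a toy model
to see why only C ends up "stale". This is a hedged Python sketch of
the commit behaviour of the time as the thread describes it (paths and
-a refresh the index first, a bare commit uses the index as-is); the
ToyRepo class and its methods are hypothetical and model contents as
plain strings, not git objects.

```python
class ToyRepo:
    """HEAD, index and working tree as dicts mapping path -> content."""

    def __init__(self):
        self.head, self.index, self.worktree = {}, {}, {}

    def write(self, path, content):
        self.worktree[path] = content            # like: echo content > path

    def add(self, path):
        self.index[path] = self.worktree[path]   # like: git add path

    def commit(self, paths=(), all_tracked=False):
        if all_tracked:                          # like: git commit -a
            for path in list(self.index):
                if path in self.worktree:
                    self.index[path] = self.worktree[path]
                else:
                    del self.index[path]
        for path in paths:                       # like: git commit path...
            self.index[path] = self.worktree[path]
        self.head = dict(self.index)             # the commit records the index

    def diff(self):
        """Working tree versus index for tracked paths, like plain "git diff"."""
        return {p: (self.index[p], self.worktree.get(p))
                for p in self.index if self.worktree.get(p) != self.index[p]}


repo = ToyRepo()
for name, kwargs in (("A", {"all_tracked": True}),
                     ("B", {"paths": ["B"]}),
                     ("C", {})):
    repo.write(name, "original")
    repo.add(name)
    repo.write(name, "modified")
    repo.commit(**kwargs)
```

In this model HEAD ends with A and B at "modified" and C at "original",
and diff() reports only C, matching the "git diff" output quoted in the
message.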
* Re: Two ideas for improving git's user interface 2006-02-01 23:33 ` Two ideas for improving git's user interface Carl Worth @ 2006-02-02 0:38 ` Junio C Hamano 2006-02-02 1:16 ` Carl Worth 2006-02-02 1:23 ` Linus Torvalds ` (2 subsequent siblings) 3 siblings, 1 reply; 110+ messages in thread From: Junio C Hamano @ 2006-02-02 0:38 UTC (permalink / raw To: Carl Worth; +Cc: git, Nicolas Pitre, Linus Torvalds Carl Worth <cworth@cworth.org> writes: > To explain this, "just use -a" isn't enough, it would have to be > something like, "always use -a or else 'git commit' just won't work > and you can end up committing stale garbage". And perhaps "unless you > also add the filename to the commit line, then it will start working > again." I do not think you have to make it sound *that* negative. I agree it may be counterintuitive until the user groks the index. Let's assume that we will fix things to (1) require "--also" (or "--incremental") to get the current "git commit paths..." behaviour, (2) without any arguments we commit the index as is, (3) with explicit paths we commit clean HEAD plus only specified paths using a temporary index. I think a fairer way to say what you said would be: Always use -a, or explicit paths. With -a all of your changes in the working tree are committed. With paths, only changes to those paths are committed. Once you are comfortable with making commits this way, you might want to learn about index file and then start using 'git commit' without any argument. This works in a way that cannot be understood until you learn how the index file works, so stick to "-a or explicit paths" rule for now. That rule is good enough for everyday use. And you can probably go a long way without ever knowing about index. Initially when I wrote the above two paragraphs, I said "appreciated" instead of "understood". 
But depending on your workflow, you may not even need what "git commit" without arguments would give you, in which case there is nothing to appreciate about, so I changed the wording. Old-timer git people seem to like what it gives them but that does not mean everybody should marvel at what it does and adopt the workflow to take advantage of the index file. > Here's a fundamental question I have, (and thanks to Keith Packard for > helping me to phrase it): > > Is it ever useful (reasonable, desirable) to commit file > contents that differ from the contents of the working > directory? What that means is people should always do "git commit -a". Not even "git commit paths...". It matches _my_ sense of developer discipline, especially for individual developers, but it is a rather cumbersome straightjacket if enforced upon you in practice. It is a useful timesaver to be able to leave unrelated changes around in the working tree. > I don't think it is, (but please let me know if I've missed some > useful case). I think I've already done this a couple of times today. Your "git diff" is interesting, but I'd rather make them completely separate command from "git diff". Perhaps "git ndiff" and "git ncommit", that assumes there is nothing but "git commit -a" kind of commits. ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: Two ideas for improving git's user interface 2006-02-02 0:38 ` Junio C Hamano @ 2006-02-02 1:16 ` Carl Worth 2006-02-02 2:25 ` Junio C Hamano 0 siblings, 1 reply; 110+ messages in thread From: Carl Worth @ 2006-02-02 1:16 UTC (permalink / raw To: Junio C Hamano; +Cc: git, Nicolas Pitre, Linus Torvalds [-- Attachment #1: Type: text/plain, Size: 2754 bytes --] On Wed, 01 Feb 2006 16:38:45 -0800, Junio C Hamano wrote: > > I do not think you have to make it sound *that* negative. Sorry about that. I was just trying to emphasize the new-user confusion, and perhaps I went overboard. > It is a useful timesaver to be able to leave > unrelated changes around in the working tree. > > > I don't think it is, (but please let me know if I've missed some > > useful case). > > I think I've already done this a couple of times today. I'm sorry. I didn't succeed in phrasing the question the way I wanted. Yes, it is useful to be able to leave unrelated changes around in the working tree. So in that sense, it is clearly useful to be able to commit something that is different (in a repository-wide sense) than what is in the working tree. The question I was trying to ask is, for a _single file_ is it ever useful to commit contents that differ from the contents of the working directory? Let's call this a "skewed file" in the index. I haven't used git much yet, but I found two cases for when one might end up committing a skewed file: 1) Modification of working directory after git-update-index or git-add. There has been discussion in this thread already that the user can get a confusing commit in this case. 2) git-read-tree -m # without -u The git documentation already advertises that not using -u here leads to confusion. This one looks historical, and it's not obvious to me whether git-read-tree is used in practice without -u. So, in both of those cases the skewed files seem to lead only to confusion. 
Are there any non-confusing cases where it's useful to be able to commit a skewed file? If not, we should be able to simplify things since a lot of the UI complexity being discussed (-a vs. no -a, path names vs. no path names), hinges on the handling of skewed files. > Your "git diff" is interesting, but I'd rather make them > completely separate command from "git diff". Perhaps "git > ndiff" and "git ncommit", that assumes there is nothing but "git > commit -a" kind of commits. I'd be fine with some other name than "diff" if strictly necessary, but I'm not suggesting something that makes any assumption about "git commit -a" only. What I want is a simple way to take any "git commit" command and be able to examine the diff that it will be committing. My workflow has been to always perform a final review of such a diff while composing the commit message. I'd like to be able to do that with git. And I think this tool would make a very good learning tool for users trying to figure out the various commit operations, (particularly if we end up with different semantics for merge vs. non-merge, -a vs. no -a, path names vs. no path names, etc.). -Carl [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 110+ messages in thread
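One possible reading of a "skewed file" as defined above is a path
that a bare "git commit" would record with new content, while the
working tree holds something else for that same path. A minimal Python
sketch under that assumption, using the same toy dict model of HEAD,
index, and working tree; skewed_paths is a hypothetical name and staged
removals are deliberately ignored.

```python
def skewed_paths(head, index, worktree):
    """Return paths staged for commit whose working-tree content differs.

    head, index, worktree: dicts mapping path -> content.
    """
    # Paths that the next commit would change relative to HEAD.
    staged = [p for p in index if index[p] != head.get(p)]
    # Of those, the ones whose working-tree content is not what is staged.
    return sorted(p for p in staged if worktree.get(p) != index[p])


head = {"A": "1"}
index = {"A": "1", "C": "original"}      # C was "git add"ed, A is unstaged
worktree = {"A": "2", "C": "modified"}   # both files edited afterwards
skewed = skewed_paths(head, index, worktree)
```

Under this reading C is skewed, while A is merely an unrelated
uncommitted change: the commit does not touch A at all, so nothing
stale is recorded for it.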
* Re: Two ideas for improving git's user interface
2006-02-02 1:16 ` Carl Worth
@ 2006-02-02 2:25 ` Junio C Hamano
2006-02-03 23:57 ` Carl Worth
0 siblings, 1 reply; 110+ messages in thread
From: Junio C Hamano @ 2006-02-02 2:25 UTC (permalink / raw
To: Carl Worth; +Cc: git

Carl Worth <cworth@cworth.org> writes:

> If not, we should be able to simplify things since a lot of the
> UI complexity being discussed (-a vs. no -a, path names vs. no path
> names), hinges on the handling of skewed files.

I am in agreement with you that "skewed files" might lead to
confusion, but I do not see how that relates to "-a vs no -a" nor
"path names vs no path names" issues.

Let's say we try to detect and forbid committing skewed files. How
would we do that? For the sake of clarity, let's say we fixed the
commit command the way I said in the message you are responding to.
Now:

 1. "git commit" is the traditional one; it commits the current
    index. We enumerate paths that 'git-diff-index --cached
    --name-only HEAD' tells are different (they are the paths to be
    committed -- what about merges? Maybe take union from all
    parents?). Then we see if the paths from "git-diff-files
    --name-only" (locally modified files) overlap with them.
    Overlapping ones will be skewed if we make a commit.

 2. "git commit --also fileA..." updates fileA... on top of the
    current index and commits that. After doing "git update-index
    fileA...", the story is the same as the previous case.

 3. "git commit fileA..." initializes a temporary index from the
    current HEAD, updates fileA... and commits that. We would need a
    check to make sure index matches HEAD at specified paths, but
    after that check passes, there are no skewed files being
    committed and there is nothing more to check.

 4. "git commit -a" by definition would not have skewed files and
    there is nothing to check.

So what you say sounds doable. But I wonder if that really helps
much.

Let's say we want to give an interface to a class of users who do
_not_ want to worry about the presence of the index file. That means
they will _never_ run "git update-index" themselves, although "git
commit", "git add", and "git merge" may run update-index for them
internally. Essentially, you tell them to always use "git commit -a"
or "git commit fileA...", and do not teach them "git commit", "git
commit --also fileA...". IOW they will be doing only 3 or 4. In this
case, we do not need any of the "skewed files" check.

The extra checks in 1 and 2 would prevent index-unaware users from
making obvious mistakes, but if they do not understand index then
they would still be surprised anyway. For example, "git commit"
commits the files they previously ran "git add" on, but leaves other
modified files in the working tree uncommitted. This is different
from either 3 or 4 that they have learned so far. If they did "git
commit fileA", the file they ran "git add" on earlier is not
committed. If they did "git commit -a", files other than the added
files are also committed. So in that sense the above checks are
doable but I do not think they help that much to alleviate the
confusion.

These extra checks in 1 and 2 may protect index-aware users from
making mistakes, to a certain degree. I am not convinced enough
myself to pay the cost of extra checks, though, because my workflow
is to do the final review exactly like what you said below.

> My workflow has been to always perform a final review of such a diff
> while composing the commit message. I'd like to be able to do that
> with git.

That matches my workflow. I do either one of these (I never use "git
commit paths..."):

        $ work work work
        $ I may do update-index [--add|--remove] here
        $ git diff --cached
        $ git commit

        $ work work work
        $ I may do update-index [--add|--remove] here
        $ git diff HEAD
        $ git commit -a

In either case "skewed files" do not matter.

This can be summarized in a short paragraph:

    If you are going to commit with "git commit" (no parameters),
    check the final result with "git diff --cached". If you are
    going to commit with "git commit -a", check with "git diff
    HEAD".

I said why I do not do "git commit paths..." myself, but I think this
"skewed files" discussion adds another thing to be careful about if
you use it. If you do this (with the current tool, you drop --also):

        $ work on file A
        $ git diff A
        ... that looks fine so far ...
        $ git update-index A
        $ work more on file A
        $ git diff A
        ... incrementally that looks fine ...
        $ git commit --also A

you would end up committing something on which you have not done the
"final review". You need to have the final check before such a
commit:

        $ work on file A
        $ git diff A
        ... that looks fine so far ...
        $ git update-index A
        $ work more on file A
        $ git diff A
        ... incrementally that looks fine ...
  +++++ $ git diff HEAD
        $ git commit --also A

This includes all changes that are not in the index and are not going
to be included in the commit (i.e. changes to files other than A).
For that you may need to do something like:

    git-diff-index --cached HEAD  ;# already in index but do not look at A
    git-diff-index HEAD -- A      ;# and path A is taken from working tree

which is a bit cumbersome.

Without --also (the new semantics), the check would be
straightforward:

        $ work on file A
        $ git diff A
        ... that looks fine so far ...
        $ git update-index A
        $ work more on file A
        $ git diff A
        ... incrementally that looks fine ...
  +++++ $ git diff HEAD -- A
        $ git commit A

^ permalink raw reply [flat|nested] 110+ messages in thread
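
The four commit forms and the "skewed files" check described in the message above can be illustrated with a toy model. This is only a sketch in Python under loud assumptions, not how git is implemented: HEAD, the index and the working tree are modelled as plain path-to-content dicts, and the names `changed`, `skewed_paths` and `tree_to_commit` are hypothetical helpers invented for this illustration.

```python
def changed(old, new):
    """Paths whose content differs between two path->content snapshots."""
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


def skewed_paths(head, index, worktree):
    """Paths that are staged AND modified again in the working tree.

    Models the overlap of 'git-diff-index --cached --name-only HEAD'
    and 'git-diff-files --name-only' from case 1 (toy model only).
    """
    staged = changed(head, index)      # HEAD vs index
    dirty = changed(index, worktree)   # index vs working tree
    return sorted(staged & dirty)


def tree_to_commit(head, index, worktree, paths=(), also=False, all_=False):
    """Return the snapshot each proposed commit form would record."""
    if all_:
        # 4. "git commit -a": every tracked path is taken from the working tree.
        return {p: worktree[p] for p in index if p in worktree}
    if paths and not also:
        # 3. "git commit fileA...": start from HEAD; the index must match
        #    HEAD at the named paths before they are updated.
        for p in paths:
            if index.get(p) != head.get(p):
                raise ValueError(f"index does not match HEAD at {p}")
        tree = dict(head)
    else:
        # 1. "git commit" and 2. "git commit --also fileA...": start from the index.
        tree = dict(index)
    for p in paths:
        tree[p] = worktree[p]          # named paths come from the working tree
    return tree


head = {"A": "a1", "B": "b1"}
index = {"A": "a2", "B": "b1"}         # A was staged with update-index
worktree = {"A": "a3", "B": "b2"}      # A edited again, B edited but not staged

print(skewed_paths(head, index, worktree))
print(tree_to_commit(head, index, worktree))
print(tree_to_commit(head, index, worktree, paths=["B"], also=True))
print(tree_to_commit(head, index, worktree, paths=["B"]))
print(tree_to_commit(head, index, worktree, all_=True))
```

In this model only plain "git commit" and "git commit --also" can record content that differs from the working tree, which is where the skew check in cases 1 and 2 would apply.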
* Re: Two ideas for improving git's user interface
2006-02-02 2:25 ` Junio C Hamano
@ 2006-02-03 23:57 ` Carl Worth
0 siblings, 0 replies; 110+ messages in thread
From: Carl Worth @ 2006-02-03 23:57 UTC (permalink / raw
To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 4527 bytes --]

[I'm still hesitant to be jumping into this discussion with both feet
like this, so please imagine lots of disclaimers of ignorance before
any claims I make---I would not be surprised or offended to learn I'm
wildly wrong about how I think some things work.]

On Wed, 01 Feb 2006 18:25:46 -0800, Junio C Hamano wrote:
> Carl Worth <cworth@cworth.org> writes:
>
> > If not, we should be able to simplify things since a lot of the
> > UI complexity being discussed (-a vs. no -a, path names vs. no path
> > names), hinges on the handling of skewed files.
>
> I am in agreement with you that "skewed files" might lead to
> confusion, but I do not see how that relates to "-a vs no -a" nor
> "path names vs no path names" issues.

In the case of skewed files, "-a" commits the current file content,
while "no -a" commits the skewed content. Similarly, "path names"
commits the current contents while "no path names" commits the skewed
content.

> Let's say we try to detect and forbid committing skewed files. How
> would we do that?

I wasn't imagining adding extra checks (== more complexity). Instead
I was imagining something like a command that would mark a path to be
committed. I don't yet have a good suggestion for a short name for
the operation, but I'll call it "mark" for sake of discussion.

This mark operation would be used similarly to update-index but
instead of storing into the index an object created from the current
contents of the specified path, it would simply mark the path in the
index as to-be-committed. When committing such a path later, the
object would be created based on the contents of the path at that
time.

So I imagined eliminating skewed files first by providing operations
based around "mark" rather than update-index, (since "mark" avoids
all of the confusing oops-I-committed-stale-file-contents scenarios),
and second by making all commands that update the index from the
object DB also update the working directory, (effectively making
git-read-tree always act according to its current -u).

But as a prerequisite, this kind of plan would require the user to
never actually _want_ to stash skewed contents in the index. On a
separate branch of the current thread, Linus has said he likes to do
that, so I'll continue to discuss that there, and before the outcome
of that discussion, this idea need not even be considered further.

> 1. "git commit" is the traditional one; it commits the current index.
>
> 2. "git commit --also fileA..." updates fileA... on top of the current
>
> 3. "git commit fileA..." initializes a temporary index from the
>
> 4. "git commit -a" by definition would not have skewed files and there
> is nothing to check.

The one comment I have about this proposal is a certain lack of
orthogonality. Namely the base "commit" performs one operation,
(committing the contents of the index), and "commit --also" performs
that same operation plus something more (that much is good so far).

The problem starts with "commit file" which does not perform the base
operation at all, but just does something different. Similarly,
"commit -a" is also doing something different, (its behavior can be
described as an additional step performed _before_ the base "commit"
but could also be described as an operation independent of the
original state of the index, if I'm not mistaken).

Before "-a" existed, there was better orthogonality, but apparently
there wasn't a good fit with what some users wanted to do, (hence the
addition of "-a" and the recent proposal of yet more variations on
"commit").

>         $ git diff --cached
>         $ git commit
...
>         $ git diff HEAD
>         $ git commit -a
...
> For that you may need to do something like:
>
>     git-diff-index --cached HEAD  ;# already in index but do not look at A
>     git-diff-index HEAD -- A      ;# and path A is taken from working tree
>
> which is a bit cumbersome.
>
> Without --also (the new semantics), the check would be
> straightforward:
..
>   +++++ $ git diff HEAD -- A
>         $ git commit A

Thanks for the examples. If nothing else, I hope the above makes
clear that it's not always obvious how to achieve a preview diff of a
commit.

I would love to see the number of fundamental variations of "commit"
shrink rather than grow, but especially if it does grow, I think it
will always be important for users to be able to easily view "status"
and "diff" previews of commits, (preferably by providing the same
arguments to some 'preview' commands as will be passed to commit).

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 110+ messages in thread
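
The difference between update-index and the proposed "mark" operation in the message above can be made concrete with a small model. This is a hedged Python sketch of the idea only: "mark" is not an existing git command, and the class `ToyRepo` with its methods is hypothetical, invented for this illustration. Contents are plain strings keyed by path.

```python
class ToyRepo:
    """Toy model: HEAD, index and working tree as path->content dicts."""

    def __init__(self, head):
        self.head = dict(head)
        self.index = dict(head)
        self.worktree = dict(head)
        self.marked = set()          # paths flagged by the hypothetical "mark"

    def update_index(self, path):
        # Snapshot the content NOW; later edits are not picked up.
        self.index[path] = self.worktree[path]

    def mark(self, path):
        # Record only the path; content is read when the commit happens.
        self.marked.add(path)

    def commit(self):
        # Marked paths are refreshed from the working tree at commit time.
        for path in self.marked:
            self.index[path] = self.worktree[path]
        self.marked.clear()
        self.head = dict(self.index)


# update-index followed by more editing commits the stale snapshot.
r1 = ToyRepo({"A": "v1"})
r1.worktree["A"] = "v2"
r1.update_index("A")
r1.worktree["A"] = "v3"
r1.commit()
print(r1.head["A"])

# "mark" followed by more editing commits the latest content.
r2 = ToyRepo({"A": "v1"})
r2.worktree["A"] = "v2"
r2.mark("A")
r2.worktree["A"] = "v3"
r2.commit()
print(r2.head["A"])
```

The first repository ends up with a skewed file (committed content differs from the working tree); the second cannot, which is the property the proposal is after, at the cost of no longer being able to stage a deliberate snapshot.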
* Re: Two ideas for improving git's user interface
2006-02-01 23:33 ` Two ideas for improving git's user interface Carl Worth
2006-02-02 0:38 ` Junio C Hamano
@ 2006-02-02 1:23 ` Linus Torvalds
2006-02-02 1:44 ` Linus Torvalds
2006-02-04 0:20 ` Carl Worth
2006-02-02 12:31 ` Florian Weimer
2006-02-02 16:30 ` Carl Baldwin
3 siblings, 2 replies; 110+ messages in thread
From: Linus Torvalds @ 2006-02-02 1:23 UTC (permalink / raw
To: Carl Worth; +Cc: Junio C Hamano, Nicolas Pitre, git

On Wed, 1 Feb 2006, Carl Worth wrote:
>
> Here's a fundamental question I have, (and thanks to Keith Packard for
> helping me to phrase it):
>
> Is it ever useful (reasonable, desirable) to commit file
> contents that differ from the contents of the working
> directory?

Yes. I do it all the time.

I tend to have a certain fairly constant set of changes in my working
tree, namely every time a release is getting closer, I always tend to
have the "Makefile" already updated for the new version (but not
checked in: I do that just before I actually tag it, so that the tag
will match the commit that actually changes the version).

I do that largely for historical reasons, namely that I've forgotten
too many times to actually change the version number, and then I
usually get a bug report within minutes of cutting the release with a
snickering "hah, you forgot to change the version again".

So I do lots of commits with that Makefile being dirty, without ever
actually committing the Makefile changes themselves. "git commit -a"
as a default would be absolutely _horrible_ for me.

I occasionally have other things dirty too in my tree - just random
hacking. But the Makefile is dirty about 50% of the time for me, so
it's the common case.

And most of those commits are automated, either through pulls that
are successful, or just my email patch-application scripts, and both
of those cases actually check that the files that are _changed_ are
never dirty in the working directory.

However, if the question was an even stricter "do you ever commit
_changes_ to a particular file where the last HEAD, the index _and_
the working tree are all different", then the answer is actually
"Yes" to that too.

What has happened is that I have had merges that have content
conflicts that I fix up by hand, but exactly _because_ I fix them up
by hand, I actually want to re-compile the kernel and test my fixups.

And in that case, I will actually re-apply my manual Makefile change,
even if that file was part of the merge changes (in which case I had
had to first un-apply the change in order to do the merge).

So what happens is that I recompile with my trivial changes in place
_after_ I have fixed up any merge conflicts, reboot the thing to
test, and then commit the result if everything looks ok.

And notice how I commit the _merge_ without actually committing my
dirty state in the tree - and whether the files involved in my
standard dirty changes ("Makefile") are part of the state that the
merge changed or not is _totally_ irrelevant.

So I commit file contents that differ from my current working tree
all the time. ALMOST all of the time, the actual _changes_ that I
commit do not actually touch the files that I have dirty, but as
explained above, even that is not at all impossible.

The thing is, once you get used to the git "index" as a staging
place, it's really really powerful.

> Idea #2 (make it easy to preview diffs of what will be committed)
> -------
> Independent of the above, I'd like to propose another change to help
> prevent confusion and to help users learn git. There should be an
> obvious "diff" operation that presents exactly the result of what any
> "commit" operation will perform.

Actually, we do exactly that. Right now we expressly limit the
"preview" to just the filenames, but we literally do run

	git-diff-index -M --cached --name-status --diff-filter=MDTCRA HEAD

as part of "git status", and the eventual end result is what we will
populate the commit message file with for your editing pleasure.

And you can actually see that. So I would suggest that new git users
never be told about the "-m" flag to "git commit", so that they
always have to edit the commit message by hand, because that commit
message will contain exactly this information.

Not the patch itself, though. Maybe we could make it show part of it,
though, if somebody really wants to see it ;)

		Linus

^ permalink raw reply [flat|nested] 110+ messages in thread
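
The filename-only preview described in the message above compares HEAD with the index and reports one status letter per path. A rough Python sketch of that idea follows. It assumes HEAD and the index are plain path-to-content dicts, handles only the added (A), modified (M) and deleted (D) cases, and does not model rename detection (-M) or the other filter letters; `name_status` is a hypothetical helper, not git code.

```python
def name_status(head, index):
    """List (status, path) pairs for what a plain commit of the index would change."""
    result = []
    for path in sorted(head.keys() | index.keys()):
        if path not in head:
            result.append(("A", path))      # added: only in the index
        elif path not in index:
            result.append(("D", path))      # deleted: only in HEAD
        elif head[path] != index[path]:
            result.append(("M", path))      # modified: content differs
    return result


head = {"Makefile": "2.6.15", "a.c": "1", "gone.c": "1"}
index = {"Makefile": "2.6.15", "a.c": "2", "new.c": "1"}

# A Makefile that is dirty only in the working tree never shows up here,
# because the working tree is not consulted at all.
for status, path in name_status(head, index):
    print(status, path)
```

Because only HEAD and the index are compared, the listing matches what a plain "git commit" would record, regardless of any uncommitted edits left in the working tree.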
* Re: Two ideas for improving git's user interface
2006-02-02 1:23 ` Linus Torvalds
@ 2006-02-02 1:44 ` Linus Torvalds
2006-02-04 8:03 ` Alan Chandler
2006-02-04 0:20 ` Carl Worth
1 sibling, 1 reply; 110+ messages in thread
From: Linus Torvalds @ 2006-02-02 1:44 UTC (permalink / raw
To: Carl Worth; +Cc: Junio C Hamano, Nicolas Pitre, git

On Wed, 1 Feb 2006, Linus Torvalds wrote:
>
> And notice how I commit the _merge_ without actually committing my dirty
> state in the tree - and whether the files involved in my standard dirty
> changes ("Makefile") are part of the state that the merge changed or not
> is _totally_ irrelevant.

If you get the feeling that merging is special, then to some degree,
yes, you'd be right.

Merging (especially with conflicts) is the _one_ operation where you
absolutely have to know about the index. If you don't know about how
the index works, you can get the conflict resolution right kind of by
accident, simply because the default workflow of

	.. edit conflict to look ok ..
	git commit file/with/conflict

actually happens to do exactly the right thing (very much on purpose,
btw), but the fact is, to actually figure out more complicated
conflicts and to _understand_ what happens, you absolutely need to be
aware of the index. Not being aware of it just isn't an option for
any serious git user.

(Btw, I think this is where cogito falls down. Cogito tries to hide
the index file, but I don't think you really _can_ hide the index
file and also do merges well at the same time. Anybody who has
non-trivial merges should use raw git - not just because the
"recursive" strategy just works better, but exactly because of the
index file issue).

So when you work with a merge, the index file content really in a
very real way _is_ the merge. Yes, the index file is also technically
how git actually does all the merging complexity, but in this case,
there also is no "diff" to the parent, and the number of changed
files may be in the hundreds, yet "git diff" should be basically
empty when you finally commit your merge.

I say "basically empty", because as I've explained, at least I
personally have had dirty state in my tree at the time I commit a
merge - on _top_ of (and independently of) the state that I actually
commit.

So to recap:

 - you really do have to be aware of the index file at some point.
   Trying to hide it entirely is a huge mistake.

 - real git power users _will_ use their awareness of the index file
   when they commit. You will too, some day. Maybe it's only for
   merges, but I wouldn't be surprised if somebody at some point
   wants to take advantage of it even for "normal" working conditions
   (ie use "git-update-index" to "freeze" a certain state for
   committing, and then editing the file and _not_ committing those
   edits)

So making "-a" the default would be just a horrid horrid mistake. You
can only hide the index so far - don't even try to hide it more.

		Linus

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: Two ideas for improving git's user interface
2006-02-02 1:44 ` Linus Torvalds
@ 2006-02-04 8:03 ` Alan Chandler
2006-02-04 8:25 ` Junio C Hamano
0 siblings, 1 reply; 110+ messages in thread
From: Alan Chandler @ 2006-02-04 8:03 UTC (permalink / raw
To: git

On Thursday 02 February 2006 01:44, Linus Torvalds wrote:
> On Wed, 1 Feb 2006, Linus Torvalds wrote:
> > And notice how I commit the _merge_ without actually committing my dirty
> > state in the tree - and whether the files involved in my standard dirty
> > changes ("Makefile") are part of the state that the merge changed or not
> > is _totally_ irrelevant.
>
> If you get the feeling that merging is special, then to some degree, yes,
> you'd be right.
>
> Merging (especially with conflicts) is the _one_ operation where you
> absolutely have to know about the index. If you don't know about how the
> index works, you can get the conflict resolution right kind of by
> accident, simply because the default workflow of
>
> .. edit conflict to look ok ..
> git commit file/with/conflict
>
> actually happens to do exactly the right thing (very much on purpose,
> btw), but the fact is, to actually figure out more complicated conflicts
> and to _understand_ what happens, you absolutely need to be aware of the
> index. Not being aware of it just isn't an option for any serious git
> user.
>
> (Btw, I think this is where cogito falls down. Cogito tries to hide the
> index file, but I don't think you really _can_ hide the index file and
> also do merges well at the same time. Anybody who has non-trivial merges
> should use raw git - not just because the "recursive" strategy just works
> better, but exactly because of the index file issue).

Wow - light comes on.

I have been using git (or rather to be exact git with cg-add, cg-rm
and cg-commit) for about 6 months (bearing in mind I am only a part
time programmer in the evenings for fun - even though I work in the
computer industry the last time I was paid to write code was in 1979
- so I don't really need to be a power user).

Although I knew about the index file since the beginning I never
really grokked what it was about before. Of course I knew of its
existence, and I even knew that it could be used as a staging area,
but up to now I had always thought of it as a necessary inconvenience
to enable git to run as blazingly fast as it does - not as an
essential part of workflow in complex situations.

I think the problem is with three crucial bits of documentation.

Firstly, the documentation is full of "git doesn't do porcelain"
statements - pushing towards cogito which then hides the existence of
the index file. Git not doing porcelain was true at the very
beginning, but I don't think that it is true any longer.

Secondly the tutorial. The examples given start by using commands to
explicitly update the index and then they move on to show how you
don't need to do that by using the more advanced commands of git-add
and git-commit. So as I was trying to learn how to use git, I
followed through this and thought that you just try and avoid using
it directly. What's more, viewed in this light git-commit seemed to
be a rather poor implementation of cogito's superior cg-commit
command.

[Incidentally there is a use case that doesn't seem to have been
discussed in this thread which I use cg-commit all the time for and
will now have to see if there is an index file equivalent for. That
is, I am developing a web application and in the running version the
database framework (iBatis) is using Tomcat's connection pooling. In
order to run my JUnit test harness, I don't have tomcat, so I need to
define a different version of the iBatis configuration file to use
its own database connection. So I have created a test branch and
edited the configuration file in that branch, and I update both code
and tests in an edit/compile/fix and test loop until I have written
or changed both code and tests. I then do a cg-commit which lists the
files I have changed. I ONLY commit those in the test harness - by
deleting the others from cogito's list of files to commit - and then
repeat the commit, committing the rest. I then switch back to my
master branch and cherry pick the commit that is the code changes -
not the test harness]

Thirdly, "discussion" of the index file at the bottom end of the git
man page (The "index" aka "Current Directory Cache") really
concentrates on what it is and what operations you can perform with
it in the normal situation.

I tried looking at the core tutorial for what might be a way of
bringing this to the attention of the new learner of git and produced
the following (partial) patch to the core-tutorial (It needs a whole
set of examples on resolving merge problems which I have no idea at
the moment how to do - this has been the real area which I never
understood - basically because the tutorial itself says skip that
part).

--- a/Documentation/core-tutorial.txt
+++ b/Documentation/core-tutorial.txt
@@ -212,15 +212,22 @@ was just to show that `git-update-index`
 actually saved away the contents of your files into the git object
 database.
 
+The Index File
+--------------
+
 Updating the index did something else too: it created a `.git/index`
 file. This is the index that describes your current working tree, and
-something you should be very aware of. Again, you normally never worry
-about the index file itself, but you should be aware of the fact that
-you have not actually really "checked in" your files into git so far,
-you've only *told* git about them.
+something you should be very aware of. It is a staging area between your
+working tree and the object store described above.
+
+In normal circumstances you do not worry about the index file itself, but you
+should be aware of the fact that you have not actually really "checked in"
+your files into git so far, you've only *told* git about them. Later you
+will see how you can exploit the fact that there is this separate index
+file to undertake more complex operations.
 
-However, since git knows about them, you can now start using some of the
-most basic git commands to manipulate the files or look at their status.
+However, since git knows about these files, you can now start using some of
+the most basic git commands to manipulate them or look at their status.
 
 In particular, let's not even check in the two files into git yet, we'll
 start off by adding another line to `hello` first:
@@ -1188,8 +1195,8 @@ How does the merge work?
 We said this tutorial shows what plumbing does to help you cope
 with the porcelain that isn't flushing, but we so far did not
 talk about how the merge really works. If you are following
-this tutorial the first time, I'd suggest to skip to "Publishing
-your work" section and come back here later.
+this tutorial the first time, I'd suggest to skip to "Resolving Merge
+Problems" section and come back here later.
 
 OK, still with me? To give us an example to look at, let's go
 back to the earlier repository with "hello" and "example" file,
@@ -1332,6 +1339,10 @@ merge for you to resolve. Notice that t
 unmerged, and what you see with `git diff` at this point is
 differences since stage 2 (i.e. your version).
 
+Resolving Merge Problems
+------------------------
+
+NOT SURE WHAT GOES HERE
 
 Publishing your work
 --------------------

-- 
Alan Chandler
http://www.chandlerfamily.org.uk
Open Source. It's the difference between trust and antitrust.

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: Two ideas for improving git's user interface
2006-02-04 8:03 ` Alan Chandler
@ 2006-02-04 8:25 ` Junio C Hamano
2006-02-04 9:30 ` Alan Chandler
0 siblings, 1 reply; 110+ messages in thread
From: Junio C Hamano @ 2006-02-04 8:25 UTC (permalink / raw
To: Alan Chandler; +Cc: git

Alan Chandler <alan@chandlerfamily.org.uk> writes:

> Wow - light comes on.

That's good.

> -this tutorial the first time, I'd suggest to skip to "Publishing
> -your work" section and come back here later.
> +this tutorial the first time, I'd suggest to skip to "Resolving Merge
> +Problems" section and come back here later.

The changes before this look very good to me, but these two lines do
not make any sense. If you are going to talk about "Resolving Merge
Problems", you _need_ to know about index, so you cannot skip the
material.

I think having a section on manual merge resolution between the Index
File section and Publishing section makes sense. What kind of merges
did you have trouble figuring out when you were still git novice?
That would be a good starting point.

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: Two ideas for improving git's user interface
2006-02-04 8:25 ` Junio C Hamano
@ 2006-02-04 9:30 ` Alan Chandler
0 siblings, 0 replies; 110+ messages in thread
From: Alan Chandler @ 2006-02-04 9:30 UTC (permalink / raw
To: git; +Cc: Junio C Hamano

On Saturday 04 February 2006 08:25, Junio C Hamano wrote:
> Alan Chandler <alan@chandlerfamily.org.uk> writes:
> > Wow - light comes on.
>
> That's good.
>
> > -this tutorial the first time, I'd suggest to skip to "Publishing
> > -your work" section and come back here later.
> > +this tutorial the first time, I'd suggest to skip to "Resolving Merge
> > +Problems" section and come back here later.
>
> The changes before this look very good to me, but these two
> lines do not make any sense. If you are going to talk about
> "Resolving Merge Problems", you _need_ to know about index, so
> you cannot skip the material.

Maybe - since the light has just come on, I need to understand a lot
more about this area before I can really comment. The tutorial
invited one to skip it before, so I was just doing so again.

Even today when I tried to read this section again my eyes glazed
over. The long sha outputs from git-ls-files scream off the page
"don't bother this is detailed technical stuff" :-(

>
> I think having a section on manual merge resolution between the
> Index File section and Publishing section makes sense. What
> kind of merges did you have trouble figuring out when you were
> still git novice? That would be a good starting point.
>

I STILL come out in a cold sweat (actually that is a bit over the
top:-) ) as soon as a merge fails for whatever reason.

The problem is that I am not doing development full time, nor in a
team, so I probably hit one about once every 2 months. This means
that I don't remember what to do, and need to go and look it up. But
where - there is nothing in my main reference places (Everyday Git -
or before that the tutorial).

So I normally attempt to do what I think is sensible. Manually
searching for files that haven't merged. Edit the lines with the
>>>>>> ==== <<< markers in them until I think the resultant file is
what it should be and then try commit again (probably cg-commit
rather than git commit). But what happens next is then hit or miss -
sometimes it just works - sometimes it doesn't and I am in that place
where there was a long thread a couple of months ago entitled
something like "and what do I do now?"

I must admit it normally works OK - but I have come across situations
a couple of weeks later where a file is in an unexpected state -
seems to have been from the wrong branch, or missing a commit I
thought I had made.

-- 
Alan Chandler
http://www.chandlerfamily.org.uk
Open Source. It's the difference between trust and antitrust.

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: Two ideas for improving git's user interface
2006-02-02 1:23 ` Linus Torvalds
2006-02-02 1:44 ` Linus Torvalds
@ 2006-02-04 0:20 ` Carl Worth
2006-02-04 2:08 ` Linus Torvalds
1 sibling, 1 reply; 110+ messages in thread
From: Carl Worth @ 2006-02-04 0:20 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Nicolas Pitre, git

[-- Attachment #1: Type: text/plain, Size: 4048 bytes --]

On Wed, 1 Feb 2006 17:23:38 -0800 (PST), Linus Torvalds wrote:
>
> I tend to have a certain fairly constant set of changes in my working
> tree, namely every time a release is getting closer, I always tend to have
> the "Makefile" already updated for the new version (but not checked in: I
> do that just before I actually tag it, so that the tag will match the
> commit that actually changes the version).

OK. That use case I understand just fine.

> However, if the question was an even stricter "do you ever commit
> _changes_ to a particular file where the last HEAD, the index _and_ the
> working tree are all different", then the answer is actually "Yes" to that
> too.

Yes, this is the question I was trying to ask. Thanks for pretending
that I had actually asked it, and then answering it as well.

> What has happened is that I have had merges that have content conflicts
> that I fix up by hand, but exactly _because_ I fix them up by hand, I
> actually want to re-compile the kernel and test my fixups.

OK. I hadn't anticipated this use case, but I am interested in
exploring it more fully.

> And in that case, I will actually re-apply my manual Makefile change, even
> if that file was part of the merge changes (in which case I had had to
> first un-apply the change in order to do the merge).

Are the un-apply and re-apply operations here primarily manual? or
does git help you much with those (beyond alerting you that the merge
cannot take place before you un-apply things)?

> The thing is, once you get used to the git "index" as a staging place,
> it's really really powerful.

I believe that the staging operations you perform are quite
desirable, but I wonder if existing primitives in git might not
provide a more powerful basis for the kinds of operation you're
performing.

For example, in the case of the not-quite-ready-to-be-committed
changes that you want to carry along, couldn't you get additional
benefits if those changes could live on their own branch? I suppose
there may be a missing operator needed to allow you to easily merge
*and* unmerge that branch if needed. Would that seem at all feasible?

If so, could your not-ready changes be implemented as some branch
that is automatically unmerged prior to commit and then re-merged
afterwards? Or something like that?

I guess the feeling I get is that staging into the index feels
conceptually similar to a commit to a branch, but it's a uniquely
weak branch (only one revision per file). And this uniqueness also
introduces complexity (the various diff operations), as well as
possibilities of confusion when committing. Meanwhile the response to
the commit confusion seems to be to add yet more complexity to commit
which doesn't seem like an improvement to me.

[I'm maybe too far out on a limb at this point, since you've
definitely identified a use case for staging in the index, and all
I've offered as an alternative is hand-waving about "branches should
be able to do that". But if nothing else, I'm floating some ideas out
loud, and next I'll try experimenting more with possibilities for
non-index staging.]

I'm already having a lot of fun with git. It's a very impressive
tool, with a surprisingly simple/powerful core.

> Actually, we do exactly that. Right now we expressly limit the "preview"
> to just the filenames, but we literally do run
>
> git-diff-index -M --cached --name-status --diff-filter=MDTCRA HEAD
>
> as part of "git status", and the eventual end result is what we will
> populate the commit message file with for your editing pleasure.

Yes, that's a good thing to do. In my personal workflow, a
pre-populated commit message is a bit late, since I want to review
and convince myself I like things before I type the magic word
"commit".

And I'm not claiming that a preview patch is impossible to generate,
I'm just saying that it's currently rather hard to figure out the
correct correspondence between arguments to diff and arguments to
commit, (see more on this point in another branch of this thread).

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: Two ideas for improving git's user interface
2006-02-04 0:20 ` Carl Worth
@ 2006-02-04 2:08 ` Linus Torvalds
2006-02-06 23:42 ` Carl Worth
0 siblings, 1 reply; 110+ messages in thread
From: Linus Torvalds @ 2006-02-04 2:08 UTC (permalink / raw
To: Carl Worth; +Cc: Junio C Hamano, Nicolas Pitre, git
On Fri, 3 Feb 2006, Carl Worth wrote:
>
> > And in that case, I will actually re-apply my manual Makefile change, even
> > if that file was part of the merge changes (in which case I had had to
> > first un-apply the change in order to do the merge).
>
> Are the un-apply and re-apply operations here primarily manual? or
> does git help you much with those (beyond alerting you that the merge
> cannot take place before you un-apply things)?
They're purely manual. If the changes are more extensive, I just create a
temporary branch for them, which is easy enough:
    git checkout -b temp
    git commit
    git checkout master
before I do the real merge, but the fact is, most of the changes in my
tree tend to be pretty un-interesting. Most of the time it's literally
_just_ the Makefile change, sometimes it's a trial patch that I'm not
ready to commit and had just sent out to somebody for testing or similar.
> I believe that the staging operations you perform are quite desirable,
> but I wonder if existing primitives in git might not provide a more
> powerful basis for the kinds of operation you're performing.
No. The point is that they are trivial to do, and that they don't _need_
"powerful basis". What they need is _usability_.
And the git index _is_ that usability. It is incredibly powerful, and
incredibly easy to use.
When you argue against exposing the index, you argue against it from the
"let's not give them rope" angle. You argue against power and flexibility.
You argue for the clippy, the helper app that says
    Are you sure you want to do this?
    [Yes] [No] [Cancel]
while I'm trying to explain that it's actually part of the _power_ of git.
The fact that I can keep dirty state in my tree and continue to work with
it _without_ having to worry about it is a huge relief to me.
> If so, could your not-ready changes be implemented as some branch that
> is automatically unmerged prior to commit and then re-merged
> afterwards? Or something like that?
Sure. They could. You could make things more complicated, and they would
WORK. They'd be inconvenient and not offer any actual improvement.
The "index" file in git really is very important. Staging into the index
is _the_ most fundamental operation. You can't actually see it very well
in the history of git (because the first commit exists only after git
actually worked pretty fully), but the birth of git is really in the index
file. That actually came _before_ the object store, as the way to quickly
and efficiently track the notion of "changes".
So git itself started out very much with the index file being the staging
area for tracking the state of a working tree efficiently.
No git operation actually ever lets the working tree interact directly
with the object store. The notion of "diff this <tree> object against the
current working tree" comes closest, but even that actually really goes
through the index file: it's properly a "diff this <tree> object against
the index file, and check at the same time the index entry against the
working tree"
If you deny the index file, you really deny git itself.
Think of it this way: when you start a new process, in UNIX you do that in
two stages: first you fork() to create a copy, then you do exec() to
populate the copy with the new process.
Your argument is akin to saying "That's horribly wasteful: wouldn't it be
much more intuitive to just do 'spawn()' to do it all, and avoid the
unnecessary middle step".
But that "unnecessary" middle step - whether it's "fork()" or the git
"index" file - is actually the source of the flexibility. It's what allows
you to do the "fixups" in the middle when you switch file descriptors
around, or when you fix up merge conflicts.
And then occasionally, you do fork() _without_ doing an execve() at all.
The same way that sometimes you do operations on the index without
actually committing them to a tree.
That's flexibility. Revel in it, instead of trying to push it under the
rug.
Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
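The fork()/exec() analogy in the message above can be tried from a shell. What follows is only a rough sketch and not something taken from the thread: a parenthesized subshell stands in for fork(), a bare `exec` carrying nothing but a redirection is the file-descriptor fixup in the middle, and `exec command` is the final exec(). The file name out.txt and the message text are made up for illustration.

```shell
# Sketch only: out.txt and the message text are illustrative names.
tmp=$(mktemp -d)

# "fork()": the parentheses run their body in a child shell process.
(
    # The fixup in the middle: redirect the child's stdout.
    # The parent's file descriptors are not touched.
    exec > "$tmp/out.txt"
    # "exec()": replace the child shell with the external echo program.
    exec echo "from child"
)

# fork() without exec(): the child changes its own directory and exits,
# while the parent's working directory stays where it was.
parent_dir=$(pwd)
( cd "$tmp" )
after_dir=$(pwd)

cat "$tmp/out.txt"
```

A single spawn() call would have no place to put the redirection; here it is just one more line between the fork and the exec.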
* Re: Two ideas for improving git's user interface
2006-02-04 2:08 ` Linus Torvalds
@ 2006-02-06 23:42 ` Carl Worth
0 siblings, 0 replies; 110+ messages in thread
From: Carl Worth @ 2006-02-06 23:42 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Nicolas Pitre, git
On Fri, 3 Feb 2006 18:08:19 -0800 (PST), Linus Torvalds wrote:
>
> If you deny the index file, you really deny git itself.
>
[And the novice nearly reaches enlightenment.]
OK. I now thoroughly understand that this use of the index is by
design, not accident. So I won't propose modifying the index again.
But I'm still not sure how to reconcile my personal workflow with git.
I don't think I'm yet an index-embracer that would relish using
"update-index <file>; commit" for everything. At the same time, I can
appreciate the disdain for "commit -a" to the extent that it does deny
the index.
I suspect that what I want might fall somewhere in between, with
something like "mark <file>; commit-marked" where commit-marked would
update the index of all marked files, then commit. This would allow me
to still use update-index for the times that I actually need to take
advantage of what that provides.
So, maybe I'll try scripting up something for that myself (at the risk
of re-inventing cogito). Or maybe I'll just learn to live with (and
love?) what git provides already.
-Carl
^ permalink raw reply [flat|nested] 110+ messages in thread
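One way the "mark <file>; commit-marked" idea might be scripted is sketched below. This is a guess at what such a wrapper could look like, not anything posted in the thread: `mark` and `commit_marked` are hypothetical shell functions, and the list file `.git/marked` is an assumption. The sketch uses `git add`, which for an already tracked file does the same staging step the thread spells as "update-index <file>".

```shell
# Hypothetical helpers sketching "mark <file>; commit-marked".
# Neither is a real git command; the .git/marked list file is an assumption.
mark() {
    # Remember the given paths without touching the index yet.
    printf '%s\n' "$@" >> "$(git rev-parse --git-dir)/marked"
}

commit_marked() {
    list="$(git rev-parse --git-dir)/marked"
    # Stage every marked path, clear the list, then commit the index.
    while IFS= read -r path; do
        git add -- "$path"
    done < "$list"
    : > "$list"
    git commit -q "$@"
}

# Throwaway repository to try it out.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
echo one > a.txt
echo one > Makefile
git add a.txt Makefile
git commit -q -m "initial"

echo two >> a.txt        # a change we want to commit
echo local >> Makefile   # a private change we want to keep uncommitted

mark a.txt
commit_marked -m "update a.txt"

git status --short       # only the Makefile is still listed as modified
```

Because the marks live outside the index, plain update-index (or `git add`) remains available for the cases where staging by hand is the point.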
* Re: Two ideas for improving git's user interface
2006-02-01 23:33 ` Two ideas for improving git's user interface Carl Worth
2006-02-02 0:38 ` Junio C Hamano
2006-02-02 1:23 ` Linus Torvalds
@ 2006-02-02 12:31 ` Florian Weimer
2006-02-02 16:30 ` Carl Baldwin
3 siblings, 0 replies; 110+ messages in thread
From: Florian Weimer @ 2006-02-02 12:31 UTC (permalink / raw
To: git
* Carl Worth:
> Here's a fundamental question I have, (and thanks to Keith Packard for
> helping me to phrase it):
>
> Is it ever useful (reasonable, desirable) to commit file
> contents that differ from the contents of the working
> directory?
You mean like "darcs record"? 8-)
I think this is very useful functionality. Granted, it interferes
with a rigorous developer-side regression test policy ("all changes
must have been built and passed the test suite"). But it encourages
things like fixing typos in comments you spot while editing a file for
other reasons. And you can keep some ugly debugging code while
working on a series of changes.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: Two ideas for improving git's user interface
2006-02-01 23:33 ` Two ideas for improving git's user interface Carl Worth
` (2 preceding siblings ...)
2006-02-02 12:31 ` Florian Weimer
@ 2006-02-02 16:30 ` Carl Baldwin
3 siblings, 0 replies; 110+ messages in thread
From: Carl Baldwin @ 2006-02-02 16:30 UTC (permalink / raw
To: Carl Worth; +Cc: Linus Torvalds, Junio C Hamano, Nicolas Pitre, git
On Wed, Feb 01, 2006 at 03:33:44PM -0800, Carl Worth wrote:
> Is it ever useful (reasonable, desirable) to commit file
> contents that differ from the contents of the working
> directory?
What _is_ useful about the status quo is the ability to make some minor
change, update that change to the index when I've decided that it is a
good change and then use git diff to see what I've incrementally changed
in the same file since that update. That way new incremental changes
can be viewed independently of the change I've already decided was good.
> What I would love to have is the ability to pass the same arguments to
> git diff to get a preview of what any git commit would do. For
> example, something like:
>
> git diff # would be a preview of:
> git commit
'git diff --cached' does this.
> git diff -a # would be a preview of:
> git commit -a
'git diff HEAD' does this.
> git diff fileA fileB # would be a preview of:
> git commit fileA fileB
Paths can be specified in conjunction with the above commands.
Yes, these are idioms specific to git and are not immediately intuitive
to the new user. However, if the user has access to a good tutorial
that walks through these scenarios, it's not so bad.
Carl
--
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Carl Baldwin                        RADCAD (R&D CAD)
Hewlett Packard Company
MS 88                               work: 970 898-1523
3404 E. Harmony Rd.                 work: Carl.N.Baldwin@hp.com
Fort Collins, CO 80525              home: Carl@ecBaldwin.net
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
^ permalink raw reply [flat|nested] 110+ messages in thread
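The three previews named in the message above can be compared side by side in a scratch repository. This is a small sketch under assumed names (notes.txt and the helper function `added` are illustrative, not from the thread): one change is staged, a second is left in the working tree, and each diff variant is asked which lines it would add.

```shell
# Scratch repository; notes.txt and added() are illustrative names.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
echo base > notes.txt
git add notes.txt
git commit -q -m "initial"

echo staged >> notes.txt
git add notes.txt            # this change is now in the index
echo unstaged >> notes.txt   # this one exists only in the working tree

# Keep only the added lines of a diff, joined on one line.
added() { sed -n 's/^+\([^+].*\)/\1/p' | xargs; }

wt_vs_index=$(git diff | added)             # working tree against the index
index_vs_head=$(git diff --cached | added)  # index against HEAD: "git commit"
wt_vs_head=$(git diff HEAD | added)         # working tree against HEAD: "git commit -a"

echo "git diff:          $wt_vs_index"
echo "git diff --cached: $index_vs_head"
echo "git diff HEAD:     $wt_vs_head"
```

Plain `git diff` reports only the line that has not been staged yet, which is the incremental view described in the first paragraph of the message.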
* Re: [Census] So who uses git?
2006-02-01 21:59 ` Junio C Hamano
2006-02-01 22:25 ` Nicolas Pitre
2006-02-01 22:35 ` Linus Torvalds
@ 2006-02-01 22:57 ` Daniel Barkalow
2 siblings, 0 replies; 110+ messages in thread
From: Daniel Barkalow @ 2006-02-01 22:57 UTC (permalink / raw
To: Junio C Hamano; +Cc: Linus Torvalds, Nicolas Pitre, git
On Wed, 1 Feb 2006, Junio C Hamano wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > If somebody doesn't know about the index, he normally will never have
> > index changes _anyway_, except for the "git add" case. In which case "git
> > commit" does the right thing for him: it will either commit the added
> > files, or it will say "nothing to commit".
>
> ... the original complaint was that "git commit" without
> explicit paths does not quack like "cvs/svn commit" -- commit
> all my changes in the working tree.
Actually, the original complaint was about "git commit path ...", I
believe. That's the case where new users are finding that the behavior is
surprising, rather than just unfamiliar.
-Daniel
*This .sig left intentionally blank*
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 20:27 ` Junio C Hamano
2006-02-01 21:09 ` Linus Torvalds
@ 2006-02-01 22:00 ` Joel Becker
1 sibling, 0 replies; 110+ messages in thread
From: Joel Becker @ 2006-02-01 22:00 UTC (permalink / raw
To: Junio C Hamano; +Cc: Nicolas Pitre, Linus Torvalds, git
On Wed, Feb 01, 2006 at 12:27:17PM -0800, Junio C Hamano wrote:
> - "git commit fileA..." means: create a temporary index from the
> current HEAD commit (or empty index if there is none), update
> it at listed paths (add/remove if necessary) and commit the
Please don't do the add/remove automatically. I know, it's
pretty convenient if I explicitly say "git commit filetoadd", but what
happens if I say "git commit libfoo/*"? I know that I want all my
changes in libfoo/ to be committed, ignoring my changes in libbar/. But I
forgot that I created libfoo/testfoo.c to debug my changes, and now it's
in the repository -- and I might not even notice it for weeks.
CVS and Subversion require an explicit "add" for this very
reason. Even then, almost everyone gets an "import" or two wrong,
pulling in a couple built files (eg, "configure") they didn't mean to
get.
I guess you could query the user. "I noticed that you specified
filetoadd, and you never said 'git add'. Do you want to add it now
[Y/n]?"
Joel
--
"When I am working on a problem I never think about beauty. I
only think about how to solve the problem. But when I have finished, if
the solution is not beautiful, I know it is wrong."
- Buckminster Fuller
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 110+ messages in thread
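The quoted description of "git commit fileA..." (temporary index from HEAD, update the listed paths, commit) can be walked through by hand with plumbing commands. The sketch below is an approximation for illustration, not how git-commit is actually implemented; the file names and the tmp-index location are assumptions. It deliberately leaves out the automatic add/remove that the message above argues against.

```shell
# Scratch repository; fileA, fileB and tmp-index are illustrative names.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
echo old > fileA
echo old > fileB
git add fileA fileB
git commit -q -m "initial"

echo new > fileA   # the change named on the command line
echo new > fileB   # an unrelated change that must stay out of the commit

# 1. Create a temporary index from the current HEAD commit.
tmp_index="$(git rev-parse --git-dir)/tmp-index"
GIT_INDEX_FILE="$tmp_index" git read-tree HEAD
# 2. Update it at the listed paths only.
GIT_INDEX_FILE="$tmp_index" git update-index fileA
# 3. Write the tree and commit it on top of HEAD.
tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)
commit=$(echo "update fileA only" | git commit-tree "$tree" -p HEAD)
git update-ref HEAD "$commit"
rm -f "$tmp_index"
# Keep the real index in step for the committed path (an assumption about
# what a complete implementation would want to do).
git update-index fileA

git show HEAD:fileA   # new
git show HEAD:fileB   # old
```

Since step 2 runs update-index without --add, an untracked file such as the libfoo/testfoo.c from the message would be rejected instead of being pulled in silently.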
* Re: [Census] So who uses git?
2006-01-30 18:58 ` Carl Baldwin
2006-01-31 10:27 ` Johannes Schindelin
@ 2006-02-01 19:32 ` H. Peter Anvin
1 sibling, 0 replies; 110+ messages in thread
From: H. Peter Anvin @ 2006-02-01 19:32 UTC (permalink / raw
To: Carl Baldwin
Cc: Junio C Hamano, Keith Packard, Martin Langhoff, Linus Torvalds, Git Mailing List
Carl Baldwin wrote:
>
> - Anyone can install and fire it up without license/contract hassles.
>
For something like an SCM this is a big deal, and not just for the Open
Source world. In a company, it means not having to worry about having
enough licenses, and getting budget approval, etc, etc, before a new
person can join a project. Perhaps more importantly, it allows someone
who normally isn't *on* the project to look at it and participate.
-hpa
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
@ 2006-02-01 7:08 linux
2006-02-01 8:51 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 110+ messages in thread
From: linux @ 2006-02-01 7:08 UTC (permalink / raw
To: torvalds; +Cc: git, linux
> Yes, I think the "assume unchanged" flag goes well together with making
> sure that the checked-out file is non-writable at the time.
>
> Of course, any number of editors and other actions won't care: if you do
> anything like
>
> for i in *.c
> do
> sed 's/xyzzy/bas/g' < $i > $i.new
> mv $i.new $i
> done
>
> you'll never have even noticed that the old file was marked read-only. So
> it's obviously not in any way any guarantee, but it probably makes sense
> as a crutch.
At the risk of complicating something already very complicated, and
possibly breaking on Microsoft file systems, that case can be detected
by reading the directory and noticing that the inode number changed.
Would it be worth validating the inode numbers (which can be retrieved
in a batch) even if you don't do a full lstat()?
Or is that too Unix-centric and prone to performance problems on other
file systems? I'd think that, even if a file system used fake inode
numbers, they'd be pretty consistent if you didn't touch the file at all,
and being different would just cause a more expensive validation.
Which would be okay as long as it's infrequent.
^ permalink raw reply [flat|nested] 110+ messages in thread
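The inode observation in the message above is easy to check on a Unix file system. The following is a minimal sketch with made-up file names: it marks a file read-only, replaces it with the quoted sed-and-rename idiom, and compares the inode number before and after. `mv -f` is used here so the rename never stops to ask about the read-only target.

```shell
# Scratch directory; demo.c is an illustrative file name.
tmp=$(mktemp -d)
cd "$tmp"
printf 'xyzzy\n' > demo.c
chmod a-w demo.c                       # "checked out read-only"

before=$(ls -i demo.c | awk '{print $1}')

# The idiom from the quoted text: write a new file, rename it over the old.
sed 's/xyzzy/bas/g' < demo.c > demo.c.new
mv -f demo.c.new demo.c

after=$(ls -i demo.c | awk '{print $1}')

echo "inode before: $before, after: $after"
cat demo.c                             # content changed despite the read-only bit
```

The two numbers differ because demo.c.new was created while the old demo.c still existed, so the rename swaps in a different inode. As the replies point out, this holds only where inode numbers are real, which is not a given under Cygwin.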
* Re: [Census] So who uses git?
2006-02-01 7:08 linux
@ 2006-02-01 8:51 ` Junio C Hamano
2006-02-01 16:04 ` Linus Torvalds
2006-02-01 16:10 ` Alex Riesen
2 siblings, 0 replies; 110+ messages in thread
From: Junio C Hamano @ 2006-02-01 8:51 UTC (permalink / raw
To: linux; +Cc: git
linux@horizon.com writes:
> At the risk of complicating something already very complicated, and
> possibly breaking on Microsoft file systems, that case can be detected
> by reading the directory and noticing that the inode number changed.
>
> Would it be worth validating the inode numbers (which can be retrieved
> in a batch) even if you don't do a full lstat()?
I suspect that what you said about Microsoft filesystems is even
stronger. IIRC the latest Cygwin stopped giving d_ino
regardless of the filesystem type; you need to do a stat()
anyway.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 7:08 linux
2006-02-01 8:51 ` Junio C Hamano
@ 2006-02-01 16:04 ` Linus Torvalds
2006-02-01 16:10 ` Alex Riesen
2 siblings, 0 replies; 110+ messages in thread
From: Linus Torvalds @ 2006-02-01 16:04 UTC (permalink / raw
To: linux; +Cc: git
On Tue, 1 Feb 2006, linux@horizon.com wrote:
>
> At the risk of complicating something already very complicated, and
> possibly breaking on Microsoft file systems, that case can be detected
> by reading the directory and noticing that the inode number changed.
>
> Would it be worth validating the inode numbers (which can be retrieved
> in a batch) even if you don't do a full lstat()?
I don't think it's worth it. It's the unusual case anyway, and it doesn't
even really guarantee anything either (the person _could_ just have marked
the inode writable - not understanding what is going on, he could have
just done a "chmod +w" behind git's back).
Together with the fact that it might not work everywhere, and that I could
well imagine that "readdir()" is slow on cygwin too (how does it do
"d_ino"? Maybe it has to do a stat() to emulate unix behaviour?), I'm not
convinced it's worth it.
I think the whole "assume it's valid" is a crutch - but if we do it, we
should make it _really_ fast, because it's also useful for automated
procedures that _know_ which files they touch. So we should make it have
minimal impact.
Linus
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 7:08 linux
2006-02-01 8:51 ` Junio C Hamano
2006-02-01 16:04 ` Linus Torvalds
@ 2006-02-01 16:10 ` Alex Riesen
2006-02-01 21:27 ` linux
2 siblings, 1 reply; 110+ messages in thread
From: Alex Riesen @ 2006-02-01 16:10 UTC (permalink / raw
To: linux@horizon.com; +Cc: torvalds, git
On 1 Feb 2006 02:08:47 -0500, linux@horizon.com <linux@horizon.com> wrote:
> > Yes, I think the "assume unchanged" flag goes well together with making
> > sure that the checked-out file is non-writable at the time.
> >
> > Of course, any number of editors and other actions won't care: if you do
> > anything like
> >
> > for i in *.c
> > do
> > sed 's/xyzzy/bas/g' < $i > $i.new
> > mv $i.new $i
> > done
> >
> > you'll never have even noticed that the old file was marked read-only. So
> > it's obviously not in any way any guarantee, but it probably makes sense
> > as a crutch.
>
> At the risk of complicating something already very complicated, and
> possibly breaking on Microsoft file systems, that case can be detected
> by reading the directory and noticing that the inode number changed.
Inodes are either useless or dangerous in cygwin (hash of an
absolute pathname on FAT). They may not even change after rm+touch.
^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Census] So who uses git?
2006-02-01 16:10 ` Alex Riesen
@ 2006-02-01 21:27 ` linux
0 siblings, 0 replies; 110+ messages in thread
From: linux @ 2006-02-01 21:27 UTC (permalink / raw
To: linux, raa.lkml; +Cc: git, torvalds
> Inodes are either useless or dangerous in cygwin (hash of an
> absolute pathname on FAT). They may not even change after rm+touch.
Yes, I just looked it up and found that out. I was hoping they used
first block number like many Linux FSes have tried, in which case it
would have worked, but if it's a hash of the path name, it's guaranteed
not to change.
And Linus' point is excellent, too: this feature is also useful for
automated systems (like git-applypatch) that can be assumed to never
forget to warn git ahead of time.
^ permalink raw reply [flat|nested] 110+ messages in thread
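The "assume unchanged" flag discussed in this subthread is available in current git as `git update-index --assume-unchanged`. The sketch below, in a scratch repository with an illustrative file name, shows why the thread calls it a crutch: once the bit is set, an edit made behind git's back goes unreported until the bit is cleared again. How closely today's behaviour matches the 2006 patches is not something this sketch tries to establish.

```shell
# Scratch repository; tracked.txt is an illustrative file name.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
echo original > tracked.txt
git add tracked.txt
git commit -q -m "initial"

# Promise git that tracked.txt will not change behind its back.
git update-index --assume-unchanged tracked.txt

# Break the promise.
echo edited >> tracked.txt

flag=$(git ls-files -v tracked.txt)   # a lowercase "h" marks the bit
hidden=$(git status --short)          # empty: the edit goes unnoticed
echo "ls-files -v: $flag"
echo "status while flagged: '$hidden'"

# Withdraw the promise and git sees the change again.
git update-index --no-assume-unchanged tracked.txt
visible=$(git status --short)
echo "status after clearing: '$visible'"
```

An automated tool that knows exactly which files it touches can clear the bit for just those paths, which is the use the last message above has in mind.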