git@vger.kernel.org mailing list mirror (one of many)
* file name case-sensitivity issues
@ 2006-05-23 21:06 Alex Riesen
  2006-05-23 21:16 ` Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Alex Riesen @ 2006-05-23 21:06 UTC
  To: git; +Cc: Junio C Hamano, Linus Torvalds

Very simple to reproduce on FAT and NTFS, and under Windows, as usual,
when a problem is especially annoying. I seem to have no chance to
get my hands on this myself, so I at least let everyone know about the
problem.

The case goes as follows:

  $ mkdir case-sensitivity-test
  $ cd case-sensitivity-test
  $ git init-db
  defaulting to local storage area
  $ echo foo > foo
  $ echo bar > bar
  $ git add foo bar
  $ git commit -m initial\ commit
  Committing initial tree 89ff1a2aefcbff0f09197f0fd8beeb19a7b6e51c
  $ git checkout -b side
  $ echo bar-side >> bar
  $ git commit -m side\ commit -o bar
  $ git checkout master
  $ rm foo
  $ git update-index --remove foo
  $ echo FOO > FOO # note case change
  $ git add FOO
# This is on Linux, with vfat on a USB stick (mounted with the default case
# conversion, which is "lower"; that is why the file can't be found).
# I have no Windows at home. On Windows, FOO is created and "git add"
# just succeeds. Assume the file was added, as it would be there.
  git-ls-files: error: pathspec 'FOO' did not match any.
  Maybe you misspelled it?
  $ git commit -m case\ change
  $ git pull . side
  Trying really trivial in-index merge...
  git-read-tree: fatal: Untracked working tree file 'foo' would be overwritten by merge.
  Nope. Really trivial in-index merge is not possible.
  Merging HEAD with 7b0cad3a104487fa92afa06736294338acb84281
  Merging:
  7f6a8ba3e41683ef5b55921d050092e766aad4a5 case change
  7b0cad3a104487fa92afa06736294338acb84281 side commit
  found 1 common ancestor(s):
  f857aaf5f1d3716d25ca7751f12de30420d9b2aa initial commit
  git-read-tree: git-read-tree: fatal: Untracked working tree file 'foo' would be overwritten by merge.

  No merge strategy handled the merge.

Well, what now?

What I did was to replace that die() with error() in
read-tree.c:verify_absent, which of course is not acceptable.
I'll try to find a solution sometime later, but I really hope
someone will find one sooner (because it'll take me some time).
I hope it hasn't bitten anyone yet...


* Re: file name case-sensitivity issues
  2006-05-23 21:06 file name case-sensitivity issues Alex Riesen
@ 2006-05-23 21:16 ` Linus Torvalds
  2006-05-23 21:30   ` Linus Torvalds
  2006-05-23 22:43 ` Ben Clifford
  2006-05-23 22:57 ` Junio C Hamano
  2 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2006-05-23 21:16 UTC
  To: Alex Riesen; +Cc: git, Junio C Hamano



On Tue, 23 May 2006, Alex Riesen wrote:
>
> Very simple to reproduce on FAT and NTFS, and under Windows, as usual,
> when a problem is especially annoying. I seem to have no chance to
> get my hands on this myself, so I at least let everyone know about the
> problem.

I don't think we can fix it.

At least not in the short term.

The closest I can imagine is to add a config option like "core.lowercase", 
and that would make us always add files to the index in lower case. That, 
together with making sure that "setup_pathspec()" &co always also 
lower-case their arguments might get things limping along with minimal 
trouble.

But it won't ever do things _well_. Anything non-ascii would be just a 
nightmare.
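
A minimal sketch of the lower-casing idea, assuming a toy in-memory index.
The names `fold`, `Index`, `add` and `match` are hypothetical and are not git
functions. It shows how folding collapses "foo" and "FOO" into one entry, and
why non-ASCII names are painful: case mappings are not one-to-one.

```python
# Hypothetical illustration of a "core.lowercase"-style policy.
# Nothing here is git code; every name is made up for the example.

def fold(path):
    """Lower-case a path before it is stored or looked up."""
    return path.lower()


class Index:
    def __init__(self):
        self.entries = {}  # folded path -> file contents

    def add(self, path, contents):
        self.entries[fold(path)] = contents

    def match(self, pathspec):
        # Pathspecs are folded the same way, so "Foo" finds "foo".
        return fold(pathspec) in self.entries


index = Index()
index.add("foo", "foo\n")
index.add("FOO", "FOO\n")  # lands on the same key as "foo"

# Non-ASCII trouble: case conversion can change length or fail to round-trip.
sharp_s_upper = "\u00df".upper()    # German sharp s becomes "SS"
dotted_i_lower = "\u0130".lower()   # capital I with dot becomes two code points
round_trip_ok = "\u00df".upper().lower() == "\u00df"
```

With plain ASCII the folding is harmless, but the last three lines show that
no single "lower case" form exists that every platform would agree on.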

		Linus


* Re: file name case-sensitivity issues
  2006-05-23 21:16 ` Linus Torvalds
@ 2006-05-23 21:30   ` Linus Torvalds
  0 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2006-05-23 21:30 UTC
  To: Alex Riesen; +Cc: git, Junio C Hamano



On Tue, 23 May 2006, Linus Torvalds wrote:
> 
> The closest I can imagine is to add a config option like "core.lowercase", 
> and that would make us always add files to the index in lower case.

Side note: doing it by just changing the name compare functions to ignore 
case is _not_ a good thing to do, because that would generate tree 
objects that simply don't work (or fsck) correctly on any other machine. 

The index and tree objects are all sorted by pathname, and thus the 
sorting order has to be something that everybody agrees on, and any locale 
dependencies are not appropriate.
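
To illustrate the ordering point, here is a small sketch that compares a
bytewise sort with a case-ignoring sort. It is a simplification that assumes
plain file names only and is not how git itself is implemented; the variable
names are made up for the example.

```python
# Entry names as raw bytes, the way they would be stored.
names = [b"bar", b"FOO", b"foo"]

# Bytewise order: identical on every machine and in every locale.
bytewise = sorted(names)

# Case-ignoring order: what a case-insensitive compare function would produce.
case_blind = sorted(names, key=bytes.lower)

print(bytewise)    # [b'FOO', b'bar', b'foo']
print(case_blind)  # [b'bar', b'FOO', b'foo']
```

The two orders disagree as soon as upper-case and lower-case names are mixed,
so an object written with the second order would look mis-sorted to a reader
that expects the first.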

It might be worth asking the monotone guys what they do - they've worked 
on Windows for a long time.

		Linus


* Re: file name case-sensitivity issues
  2006-05-23 21:06 file name case-sensitivity issues Alex Riesen
  2006-05-23 21:16 ` Linus Torvalds
@ 2006-05-23 22:43 ` Ben Clifford
  2006-05-24  1:40   ` Junio C Hamano
  2006-05-23 22:57 ` Junio C Hamano
  2 siblings, 1 reply; 10+ messages in thread
From: Ben Clifford @ 2006-05-23 22:43 UTC
  To: Alex Riesen; +Cc: git, Junio C Hamano, Linus Torvalds


On OS X using whatever filesystem it comes with by default, I get the 
following, which doesn't seem right (but in a different way).

$ mkdir case-sensitivity-test
$ cd case-sensitivity-test
$ git init-db
defaulting to local storage area
$ echo foo > foo
$ echo bar > bar
$ git add foo bar
$ git commit -m initial\ commit
Committing initial tree 89ff1a2aefcbff0f09197f0fd8beeb19a7b6e51c
$ git checkout -b side
$ echo bar-side >> bar
$ git commit -m side\ commit -o bar
$ git checkout master
$ rm foo
$ git update-index --remove foo
$ echo FOO > FOO
$ git add FOO
$ git commit -m case\ change
$ ls
FOO bar
$ git pull . side
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with e1f1e78035b099fad2bbfb82af7ec31864d8e4c1
Merging: 
5d70969775bf595dd5144a2bacc25d32cc288352 case change 
e1f1e78035b099fad2bbfb82af7ec31864d8e4c1 side commit 
found 1 common ancestor(s): 
e35c42fad4f08c2ccf61d93409a0208e92028a51 initial commit 

Merge 98bf1cae75776c141ad3b61dc2cb938c71c303ef, made by recursive.
 bar |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
$ 
$ ls
bar
$ git ls-files -d
FOO
$ git ls-tree HEAD
100644 blob b7d6715e2df11b9c32b2341423273c6b3ad9ae8a    FOO
100644 blob 5f8b81e197a2cb27816112fb5a6b86b7031ffde8    bar

The checkout is losing the FOO file but the merged tree object has the 
merged FOO in it.

-- 


* Re: file name case-sensitivity issues
  2006-05-23 21:06 file name case-sensitivity issues Alex Riesen
  2006-05-23 21:16 ` Linus Torvalds
  2006-05-23 22:43 ` Ben Clifford
@ 2006-05-23 22:57 ` Junio C Hamano
  2006-05-25 15:47   ` Alex Riesen
  2 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-05-23 22:57 UTC
  To: Alex Riesen; +Cc: git

fork0@t-online.de (Alex Riesen) writes:

> Very simple to reproduce on FAT and NTFS, and under Windows, as usual,
> when a problem is especially annoying. I seem to have no chance to
> get my hands on this myself, so I at least let everyone know about the
> problem.

Isn't it like complaining that the following sequence loses your
precious file on a case-challenged filesystem?

	$ echo precious contents >foo
        $ rm -f FOO

Is it a problem for the user?  Certainly yes.  You lost your
precious file.

Is it a bug in the operating system and/or the filesystem?
Probably not; it is doing what it is asked to do -- its
definition of what string matches what file on the filesystem is
dubious, but that is how it sees the world and you accept that
view while you are on such a system.  Is it a bug in "rm"?
Probably not; it is doing what it is asked to do within the
context that you gave it.

I'd call that a PEBCAK.

If you _know_ you are working on a case challenged filesystem, I
think the best thing you can do is not to work on a project that
has files in different cases on such a filesystem.


* Re: file name case-sensitivity issues
  2006-05-23 22:43 ` Ben Clifford
@ 2006-05-24  1:40   ` Junio C Hamano
  2006-05-24  9:55     ` Ben Clifford
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-05-24  1:40 UTC
  To: Ben Clifford; +Cc: git

Ben Clifford <benc@hawaga.org.uk> writes:

> $ ls
> bar
> $ git ls-files -d
> FOO
> $ git ls-tree HEAD
> 100644 blob b7d6715e2df11b9c32b2341423273c6b3ad9ae8a    FOO
> 100644 blob 5f8b81e197a2cb27816112fb5a6b86b7031ffde8    bar
>
> The checkout is losing the FOO file but the merged tree object has the 
> merged FOO in it.

That's interesting.  I wonder how...  Does this sequence remove FOO
on that filesystem?

	$ date >FOO
        $ rm -f foo
        $ ls

Also if you do the final "git pull" using resolve strategy, does
it change the result (say "git pull -s resolve . side" instead)?


* Re: file name case-sensitivity issues
  2006-05-24  1:40   ` Junio C Hamano
@ 2006-05-24  9:55     ` Ben Clifford
  0 siblings, 0 replies; 10+ messages in thread
From: Ben Clifford @ 2006-05-24  9:55 UTC
  To: Junio C Hamano; +Cc: git



On Tue, 23 May 2006, Junio C Hamano wrote:

> That's interesting.  I wonder how...  Does this sequence remove FOO
> on that filesystem?
> 
> 	$ date >FOO
>         $ rm -f foo
>         $ ls

yes.

$ ls
$ date >FOO
$ ls
FOO
$ rm -f foo
$ ls



> Also if you do the final "git pull" using resolve strategy, does
> it change the result (say "git pull -s resolve . side" instead)?

Different result:

$ mkdir case-sensitivity-test
$ cd case-sensitivity-test
$ git init-db
defaulting to local storage area
$ echo foo > foo
$ echo bar > bar
$ git add foo bar
$ git commit -m initial\ commit
Committing initial tree 89ff1a2aefcbff0f09197f0fd8beeb19a7b6e51c
$ git checkout -b side
$ echo bar-side >> bar
$ git commit -m side\ commit -o bar
$ git checkout master
$ rm foo
$ git update-index --remove foo
$ echo FOO > FOO
$ git add FOO
$ git commit -m case\ change
$ ls
FOO bar
$ git pull -s resolve . side
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Trying simple merge.
Merge 06c11eeb08edefba8178b091287ec6d951d1ef1d, made by resolve.
 bar |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
$ ls
FOO bar
$ 


-- 


* Re: file name case-sensitivity issues
  2006-05-23 22:57 ` Junio C Hamano
@ 2006-05-25 15:47   ` Alex Riesen
  2006-05-25 18:17     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Riesen @ 2006-05-25 15:47 UTC
  To: Junio C Hamano; +Cc: git

Junio C Hamano, Wed, May 24, 2006 00:57:04 +0200:
> I'd call that a PEBCAK.

It is not solvable there though.

> If you _know_ you are working on a case challenged filesystem, I
> think the best thing you can do is not to work on a project that
> has files in different cases on such a filesystem.

That is seldom an acceptable suggestion. Besides, what about when you
don't _know_, like when cloning onto a USB stick mounted with
auto-detection? Will files whose names differ only in case just
overwrite each other?
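
One way to check in advance is to look for tracked paths that collide once
case is ignored. The sketch below assumes the paths are already available as
a list of strings; `case_collisions` is a hypothetical helper, and Python's
`str.casefold()` only approximates what a real filesystem considers equal.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that become identical when case is ignored.

    Returns a list of groups; each group holds two or more paths that a
    case-insensitive filesystem could not store side by side.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    return [names for names in groups.values() if len(names) > 1]


print(case_collisions(["foo", "FOO", "bar", "Makefile"]))  # [['foo', 'FOO']]
```

An empty result does not prove the checkout is safe on every filesystem,
since each one applies its own folding rules.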


* Re: file name case-sensitivity issues
  2006-05-25 15:47   ` Alex Riesen
@ 2006-05-25 18:17     ` Junio C Hamano
  2006-05-26  3:59       ` Christopher Faylor
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2006-05-25 18:17 UTC
  To: Alex Riesen; +Cc: git

fork0@t-online.de (Alex Riesen) writes:

> ... Besides, how about when you
> don't _know_, like when cloning onto a USB stick mounted with
> auto-detection? Will the files with case-different names just
> overwrite each other?

You _do_ realize that example is bogus, don't you?  At least I
hope you did after you sent it.

You are cloning a project that has mixed cases (say foo and FOO)
onto a case challenged filesystem but unfortunately you did not
know the filesystem was case challenged in advance.  So after
the cloning, your checkout results in only one file either foo
or FOO but not both, because you cannot have two files whose
names are different only in case on such a filesystem.

Tough.

There are some other problems on case challenged filesystems
that we _could_ solve but probably don't right now.  You
could concentrate on fixing those, instead of talking about
the unfixable.


There are probably two kinds of case-challenged-ness.  On
non-case-challenged filesystems, if I say "rm -f foo Foo; echo >foo;
echo >Foo", "ls" says "foo Foo".  On case-challenged systems,
one of the following would happen:

 * "ls" says "foo".  If I swap the order of the "echo", it says
   "Foo".  The filesystem does record the case but does not
   allow two names with only case difference.

 * "ls" says ef oh oh in a case different from both "foo" and
   "Foo".  Or it says "foo" but if I swap the order of the
   "echo", it still says "foo".  The filesystem does not record
   the case, and does not allow two names with only case
   difference.  readdir() may do some heuristics such as
   lowercasing the name, but the point is the returned string is
   unreliable.
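
The experiment described above can be sketched as a small probe. This is only
an illustration under stated assumptions: `classify` and `probe` are
hypothetical names, the three labels are made up for the example, and the
probe only describes the directory it is pointed at.

```python
import os
import tempfile


def classify(first_listing, swapped_listing):
    """Classify a filesystem from two directory listings.

    first_listing:   names seen after creating "foo" and then "Foo"
    swapped_listing: names seen after creating "Foo" and then "foo"
    """
    if sorted(first_listing) == ["Foo", "foo"]:
        return "case-sensitive"
    if first_listing == ["foo"] and swapped_listing == ["Foo"]:
        return "case-preserving"  # first kind: the case is recorded
    return "case-losing"          # second kind: the returned name is unreliable


def probe(parent=None):
    """Run both creation orders in scratch directories under `parent`."""
    listings = []
    for order in (("foo", "Foo"), ("Foo", "foo")):
        with tempfile.TemporaryDirectory(dir=parent) as scratch:
            for name in order:
                with open(os.path.join(scratch, name), "w") as handle:
                    handle.write("\n")
            listings.append(os.listdir(scratch))
    return classify(*listings)
```

Calling `probe("/path/to/mountpoint")` checks that particular mount; with no
argument it checks the default temporary directory.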

I have git installed on a Cygwin on NTFS at work, and I think it
is in the former category.  git seems to work as expected,
modulo that you obviously cannot have two files "foo" and "Foo"
in your git-managed project.  Probably a patch to delete "Foo"
and create "foo" (to make your project friendlier to Windows)
and a merge to do the same would work well, though I haven't
tried.

What breaks on filesystems in the latter category?  I suspect
not much.

update-index records the names given by the user (I am assuming
that at least the shell is case sensitive), uses that name to
stat() and open() to update and/or refresh the cache entry, so
that codepath should be OK.  Anything that goes from index to
find names and then goes to the filesystem with those names
(diff family, checkout-index and read-tree -u) should be fine.

ls-files -o/-i would have a hard time, since they need to work
with strings read from readdir(), as you found out.  That means
"git add" and "git clean" may not work.

I cannot think of anything else that is affected by readdir()
breakage offhand; the core is doing pretty fine as it is (I do
not consider ls-files -o/-i core -- that is a more Porcelainish
part of the whole package).

I honestly think that on Windows people would not even want to
use the core Porcelainish nor even Cogito.  They would want a
native Windows-ish UI that drives the core.  I do not think such
a program would internally call "git add" nor read from
"ls-files -o/-i".  It would instead do its own Folder hierarchy
traversal, and use "update-index --add --remove" to implement
its own "git add/rm" UI, and read from "ls-files" (not -o nor
-i) so that it can show tracked and untracked files differently
in its Explorer view.

So in that sense, I think ls-files -o/-i issue is quite low
priority.  It does not matter on sane filesystems, and in the
place where it matters the most, the desired solution does not
involve ls-files -o/-i working well there.

Having said that, I think you _could_ have a repository
configuration that says "this repository sits on a case
challenged filesystem", and update ls-files to munge what it
gets from readdir() by comparing them against what you have in
the index.  If your readdir() gives "foo" when you have "FOO" in
the index on such a filesystem, you do not say that "foo" is an
untracked file -- you just say you found "FOO" as you expected.
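
A rough sketch of that munging step, assuming the directory entries and the
index entries are plain lists of names. `untracked` and `report` are
hypothetical helpers, not part of ls-files, and `str.lower()` stands in for
whatever folding the filesystem really applies.

```python
def untracked(readdir_names, index_names):
    """Return directory entries that match no index entry, ignoring case."""
    known = {name.lower() for name in index_names}
    return [name for name in readdir_names if name.lower() not in known]


def report(readdir_names, index_names):
    """Replace each directory entry with the index spelling when one exists."""
    by_folded = {name.lower(): name for name in index_names}
    return [by_folded.get(name.lower(), name) for name in readdir_names]


# readdir() hands back "foo" although the index says "FOO".
print(untracked(["foo", "bar", "notes.txt"], ["FOO", "bar"]))  # ['notes.txt']
print(report(["foo", "bar"], ["FOO", "bar"]))                  # ['FOO', 'bar']
```

The lookup trusts the index spelling over the filesystem spelling, which is
the behaviour the paragraph above asks for on such a repository.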


* Re: file name case-sensitivity issues
  2006-05-25 18:17     ` Junio C Hamano
@ 2006-05-26  3:59       ` Christopher Faylor
  0 siblings, 0 replies; 10+ messages in thread
From: Christopher Faylor @ 2006-05-26  3:59 UTC
  To: git

On Thu, May 25, 2006 at 11:17:48AM -0700, Junio C Hamano wrote:
>I have git installed on a Cygwin on NTFS at work...

Maybe this has been mentioned already but I wanted to point out that
Cygwin's mount has a "managed" option: "mount -o managed c:/foo /foo"
which causes cygwin to encode "problem" characters into the filename.

This means that there is a possibility that you'll run into the Windows
260 character max filename limit sooner so many people don't like to use
this option.  However, since only uppercase characters and characters
like ">", ":", etc.  are encoded, in practice you wouldn't see path
length problems *from this* very often.  There is, of course, some
processing overhead involved in this, too, so using managed mode
will slow things down slightly.

We've been contemplating using Unicode functions in cygwin for a while
since those allow much longer path lengths but this is a massive change
and would potentially cause problems on Windows 9x.  There has also been
some discussion of using native NT calls which, I believe, allow case
preservation like linux.  However, those have a similar set of problems.

FYI,
cgf


Thread overview: 10+ messages
2006-05-23 21:06 file name case-sensitivity issues Alex Riesen
2006-05-23 21:16 ` Linus Torvalds
2006-05-23 21:30   ` Linus Torvalds
2006-05-23 22:43 ` Ben Clifford
2006-05-24  1:40   ` Junio C Hamano
2006-05-24  9:55     ` Ben Clifford
2006-05-23 22:57 ` Junio C Hamano
2006-05-25 15:47   ` Alex Riesen
2006-05-25 18:17     ` Junio C Hamano
2006-05-26  3:59       ` Christopher Faylor
