* [PATCH 0/7] Case-insensitive filesystem support, take 1
@ 2008-03-22 17:21 Linus Torvalds
2008-03-22 17:22 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Linus Torvalds
` (4 more replies)
0 siblings, 5 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:21 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
Ok, so I said it wasn't a high priority, but I had already done all the
really core support for this in that we have the name hashes now that I
wanted to use for looking up names case-insensitively, so I took it as a
challenge to do this cleanly. I already knew how I wanted to do it, so how
hard could it be?
First, a few caveats:
- I've tested this series, both on a case-sensitive one (using hardlinks
to test corner cases) and on a vfat filesystem under Linux (which is
case-insensitive and *really* odd wrt case preservation - it remembers
the name of removed files, so it preserves case even across removal and
re-creation!)
- HOWEVER. The testing has been very targeted, and I only convered a few
cases to really care. Things like case-renaming, for example, will be
trivial to do, but I didn't do it. So if you want to do a
git mv Abc abc
on a case-insensitive filesystem, you currently still have to do it as
the sequence
git mv Abc xyz
git mv xyz abc
because I did *not* make git-mv know about case-insensitivity.
- The only two operations that care about case-insensitivity after this
series of seven patches are
(a) "git status" and friends (like "git add") that use the directory
traversal code will know to ignore files that have a case-
insensitive version in the index. So if you have messed up the
case in the working tree (or the filesystem isn't even case-
preserving), then "git add" and "git status" won't show the
case-different file as being "unknown".
(b) merging trees (git read-tree) knows about unexpectedly found files
that are due to case-insensitive filesystems, and knows to ignore
them. This means that switching branches where the case of a file
changes works, and it means that going a merge across that case
also works.
- I made this all conditional on
[core]
ignorecase = true
because obviously I'm not at all interested in penalizing sane
filesystems. That said, I also worked at trying to make sure that it's
safe and possible to do this on a case-sensitive filesystem in case you
are working on a project that doesn't like case-sensitivity, so the
"git status" and "git add ." kind of operations won't add aliases
- Finally: the "case independence" rules could be anything, but right now
I *only* do the standard US-ASCII versions. This will *not* help with
the insane OS X cases of UTF-8 normalization: to actually get that you
need to make sure that the hash-function for names and the comparison
functions work correctly with those more complex cases.
So _conceptually_ this should all work for UTF-8 normalization 'case'
insensitivity too or on just generally utf-8 cases, but I didn't
actually do that more complex case. That's a separate area, and will
not affect the core logic.
And then one final caveat: I think the patch-series is fairly clean, and
each patch in itself is pretty simple, but my testing has been limited,
and not only haven't I extended the case-insensitivity to all operations,
but I would suggest some care from people who test this.
But if you care about the crazy OS X UTF-8 normalization (which apparently
can happen even on otherwise case-sensitive filesystems), or if you care
about the more regular case insensitivity of HFS+ or Windows, you need to
test this - even if you tests right now should be limited to US-ASCII
only. Because I'll happily fix issues, but I'm not using those crap
filesystems myself, nor will I be doing any more testing on VFAT unless
people actually point out issues to me.
So it's up to you users of crap OS's to test the cases and make good
reports, and I'll care just because I think it's a somewhat interesting
problem (it's why I did this series), but I'll never do anything about
this without prodding and good reports. Ok?
Junio: I think this is safe, if only because it's all so very
straight-forward, and I tried very hard to make each change trivial and
limited and fairly easy to understand. Some of the patches are pure
cleanups that I hit when looking at the code and wanted to fix before I
even made any other changes.
Linus
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields
2008-03-22 17:21 [PATCH 0/7] Case-insensitive filesystem support, take 1 Linus Torvalds
@ 2008-03-22 17:22 ` Linus Torvalds
2008-03-22 17:25 ` [PATCH 2/7] Move name hashing functions into a file of its own Linus Torvalds
2008-03-22 17:36 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Johannes Schindelin
2008-03-22 22:01 ` [PATCH] t0050: Set core.ignorecase case to activate case insensitivity Steffen Prohaska
` (3 subsequent siblings)
4 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:22 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Fri, 21 Mar 2008 13:14:47 -0700
Instead of wasting space with whole integers for a single bit.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
This is really unrelated to the rest of the series, except for the fact
that it irritated me when I was thinking about the required changes to
unpack-trees.
unpack-trees.h | 20 ++++++++++----------
1 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/unpack-trees.h b/unpack-trees.h
index 50453ed..ad8cc65 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -9,16 +9,16 @@ typedef int (*merge_fn_t)(struct cache_entry **src,
struct unpack_trees_options *options);
struct unpack_trees_options {
- int reset;
- int merge;
- int update;
- int index_only;
- int nontrivial_merge;
- int trivial_merges_only;
- int verbose_update;
- int aggressive;
- int skip_unmerged;
- int gently;
+ unsigned int reset:1,
+ merge:1,
+ update:1,
+ index_only:1,
+ nontrivial_merge:1,
+ trivial_merges_only:1,
+ verbose_update:1,
+ aggressive:1,
+ skip_unmerged:1,
+ gently:1;
const char *prefix;
int pos;
struct dir_struct *dir;
--
1.5.5.rc0.28.g61a0.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 2/7] Move name hashing functions into a file of its own
2008-03-22 17:22 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Linus Torvalds
@ 2008-03-22 17:25 ` Linus Torvalds
2008-03-22 17:28 ` [PATCH 3/7] Make "index_name_exists()" return the cache_entry it found Linus Torvalds
2008-03-22 17:36 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Johannes Schindelin
1 sibling, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:25 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Fri, 21 Mar 2008 13:16:24 -0700
It's really totally separate functionality, and if we want to start
doing case-insensitive hash lookups, I'd rather do it when it's
separated out.
It also renames "remove_index_entry()" to "remove_name_hash()", because
that really describes the thing better. It doesn't actually remove the
index entry, that's done by "remove_index_entry_at()", which is something
very different, despite the similarity in names.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
This makes no code changes what-so-ever apart from the movement/renaming,
just moves things around a bit, and makes a function that used to be
static (add_name_hash()) be exported because it's now in a different file.
Makefile | 1 +
builtin-read-tree.c | 2 +-
cache.h | 31 ++++++++++++----------
name-hash.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++
read-cache.c | 65 ++-------------------------------------------
5 files changed, 95 insertions(+), 77 deletions(-)
create mode 100644 name-hash.c
diff --git a/Makefile b/Makefile
index 7c70b00..6d35662 100644
--- a/Makefile
+++ b/Makefile
@@ -421,6 +421,7 @@ LIB_OBJS += log-tree.o
LIB_OBJS += mailmap.o
LIB_OBJS += match-trees.o
LIB_OBJS += merge-file.o
+LIB_OBJS += name-hash.o
LIB_OBJS += object.o
LIB_OBJS += pack-check.o
LIB_OBJS += pack-revindex.o
diff --git a/builtin-read-tree.c b/builtin-read-tree.c
index e9cfd2b..7ac3088 100644
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@ -40,7 +40,7 @@ static int read_cache_unmerged(void)
for (i = 0; i < active_nr; i++) {
struct cache_entry *ce = active_cache[i];
if (ce_stage(ce)) {
- remove_index_entry(ce);
+ remove_name_hash(ce);
if (last && !strcmp(ce->name, last->name))
continue;
cache_tree_invalidate_path(active_cache_tree, ce->name);
diff --git a/cache.h b/cache.h
index 2a1e7ec..2afc788 100644
--- a/cache.h
+++ b/cache.h
@@ -153,20 +153,6 @@ static inline void copy_cache_entry(struct cache_entry *dst, struct cache_entry
dst->ce_flags = (dst->ce_flags & ~CE_STATE_MASK) | state;
}
-/*
- * We don't actually *remove* it, we can just mark it invalid so that
- * we won't find it in lookups.
- *
- * Not only would we have to search the lists (simple enough), but
- * we'd also have to rehash other hash buckets in case this makes the
- * hash bucket empty (common). So it's much better to just mark
- * it.
- */
-static inline void remove_index_entry(struct cache_entry *ce)
-{
- ce->ce_flags |= CE_UNHASHED;
-}
-
static inline unsigned create_ce_flags(size_t len, unsigned stage)
{
if (len >= CE_NAMEMASK)
@@ -241,6 +227,23 @@ struct index_state {
extern struct index_state the_index;
+/* Name hashing */
+extern void add_name_hash(struct index_state *istate, struct cache_entry *ce);
+/*
+ * We don't actually *remove* it, we can just mark it invalid so that
+ * we won't find it in lookups.
+ *
+ * Not only would we have to search the lists (simple enough), but
+ * we'd also have to rehash other hash buckets in case this makes the
+ * hash bucket empty (common). So it's much better to just mark
+ * it.
+ */
+static inline void remove_name_hash(struct cache_entry *ce)
+{
+ ce->ce_flags |= CE_UNHASHED;
+}
+
+
#ifndef NO_THE_INDEX_COMPATIBILITY_MACROS
#define active_cache (the_index.cache)
#define active_nr (the_index.cache_nr)
diff --git a/name-hash.c b/name-hash.c
new file mode 100644
index 0000000..e56eb16
--- /dev/null
+++ b/name-hash.c
@@ -0,0 +1,73 @@
+/*
+ * name-hash.c
+ *
+ * Hashing names in the index state
+ *
+ * Copyright (C) 2008 Linus Torvalds
+ */
+#define NO_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+
+static unsigned int hash_name(const char *name, int namelen)
+{
+ unsigned int hash = 0x123;
+
+ do {
+ unsigned char c = *name++;
+ hash = hash*101 + c;
+ } while (--namelen);
+ return hash;
+}
+
+static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
+{
+ void **pos;
+ unsigned int hash;
+
+ if (ce->ce_flags & CE_HASHED)
+ return;
+ ce->ce_flags |= CE_HASHED;
+ ce->next = NULL;
+ hash = hash_name(ce->name, ce_namelen(ce));
+ pos = insert_hash(hash, ce, &istate->name_hash);
+ if (pos) {
+ ce->next = *pos;
+ *pos = ce;
+ }
+}
+
+static void lazy_init_name_hash(struct index_state *istate)
+{
+ int nr;
+
+ if (istate->name_hash_initialized)
+ return;
+ for (nr = 0; nr < istate->cache_nr; nr++)
+ hash_index_entry(istate, istate->cache[nr]);
+ istate->name_hash_initialized = 1;
+}
+
+void add_name_hash(struct index_state *istate, struct cache_entry *ce)
+{
+ ce->ce_flags &= ~CE_UNHASHED;
+ if (istate->name_hash_initialized)
+ hash_index_entry(istate, ce);
+}
+
+int index_name_exists(struct index_state *istate, const char *name, int namelen)
+{
+ unsigned int hash = hash_name(name, namelen);
+ struct cache_entry *ce;
+
+ lazy_init_name_hash(istate);
+ ce = lookup_hash(hash, &istate->name_hash);
+
+ while (ce) {
+ if (!(ce->ce_flags & CE_UNHASHED)) {
+ if (!cache_name_compare(name, namelen, ce->name, ce->ce_flags))
+ return 1;
+ }
+ ce = ce->next;
+ }
+ return 0;
+}
diff --git a/read-cache.c b/read-cache.c
index a92b25b..5dc998d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -23,80 +23,21 @@
struct index_state the_index;
-static unsigned int hash_name(const char *name, int namelen)
-{
- unsigned int hash = 0x123;
-
- do {
- unsigned char c = *name++;
- hash = hash*101 + c;
- } while (--namelen);
- return hash;
-}
-
-static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
-{
- void **pos;
- unsigned int hash;
-
- if (ce->ce_flags & CE_HASHED)
- return;
- ce->ce_flags |= CE_HASHED;
- ce->next = NULL;
- hash = hash_name(ce->name, ce_namelen(ce));
- pos = insert_hash(hash, ce, &istate->name_hash);
- if (pos) {
- ce->next = *pos;
- *pos = ce;
- }
-}
-
-static void lazy_init_name_hash(struct index_state *istate)
-{
- int nr;
-
- if (istate->name_hash_initialized)
- return;
- for (nr = 0; nr < istate->cache_nr; nr++)
- hash_index_entry(istate, istate->cache[nr]);
- istate->name_hash_initialized = 1;
-}
-
static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
{
- ce->ce_flags &= ~CE_UNHASHED;
istate->cache[nr] = ce;
- if (istate->name_hash_initialized)
- hash_index_entry(istate, ce);
+ add_name_hash(istate, ce);
}
static void replace_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
{
struct cache_entry *old = istate->cache[nr];
- remove_index_entry(old);
+ remove_name_hash(old);
set_index_entry(istate, nr, ce);
istate->cache_changed = 1;
}
-int index_name_exists(struct index_state *istate, const char *name, int namelen)
-{
- unsigned int hash = hash_name(name, namelen);
- struct cache_entry *ce;
-
- lazy_init_name_hash(istate);
- ce = lookup_hash(hash, &istate->name_hash);
-
- while (ce) {
- if (!(ce->ce_flags & CE_UNHASHED)) {
- if (!cache_name_compare(name, namelen, ce->name, ce->ce_flags))
- return 1;
- }
- ce = ce->next;
- }
- return 0;
-}
-
/*
* This only updates the "non-critical" parts of the directory
* cache, ie the parts that aren't tracked by GIT, and only used
@@ -438,7 +379,7 @@ int remove_index_entry_at(struct index_state *istate, int pos)
{
struct cache_entry *ce = istate->cache[pos];
- remove_index_entry(ce);
+ remove_name_hash(ce);
istate->cache_changed = 1;
istate->cache_nr--;
if (pos >= istate->cache_nr)
--
1.5.5.rc0.28.g61a0.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 3/7] Make "index_name_exists()" return the cache_entry it found
2008-03-22 17:25 ` [PATCH 2/7] Move name hashing functions into a file of its own Linus Torvalds
@ 2008-03-22 17:28 ` Linus Torvalds
2008-03-22 17:30 ` [PATCH 4/7] Make hash_name_lookup able to do case-independent lookups Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:28 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Fri, 21 Mar 2008 15:53:00 -0700
This allows verify_absent() in unpack_trees() to use the hash chains
rather than looking it up using the binary search.
Perhaps more imporantly, it's also going to be useful for the next phase,
where we actually start looking at the cache entry when we do
case-insensitive lookups and checking the result.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
No real change, except verify_absent() can now use the name hashing rather
than the binary search. But it's all still very much case-sensitive.
cache.h | 2 +-
name-hash.c | 6 +++---
unpack-trees.c | 8 ++++----
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/cache.h b/cache.h
index 2afc788..76d95d2 100644
--- a/cache.h
+++ b/cache.h
@@ -353,7 +353,7 @@ extern int write_index(const struct index_state *, int newfd);
extern int discard_index(struct index_state *);
extern int unmerged_index(const struct index_state *);
extern int verify_path(const char *path);
-extern int index_name_exists(struct index_state *istate, const char *name, int namelen);
+extern struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen);
extern int index_name_pos(const struct index_state *, const char *name, int namelen);
#define ADD_CACHE_OK_TO_ADD 1 /* Ok to add */
#define ADD_CACHE_OK_TO_REPLACE 2 /* Ok to replace file/directory */
diff --git a/name-hash.c b/name-hash.c
index e56eb16..2678148 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -54,7 +54,7 @@ void add_name_hash(struct index_state *istate, struct cache_entry *ce)
hash_index_entry(istate, ce);
}
-int index_name_exists(struct index_state *istate, const char *name, int namelen)
+struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen)
{
unsigned int hash = hash_name(name, namelen);
struct cache_entry *ce;
@@ -65,9 +65,9 @@ int index_name_exists(struct index_state *istate, const char *name, int namelen)
while (ce) {
if (!(ce->ce_flags & CE_UNHASHED)) {
if (!cache_name_compare(name, namelen, ce->name, ce->ce_flags))
- return 1;
+ return ce;
}
ce = ce->next;
}
- return 0;
+ return NULL;
}
diff --git a/unpack-trees.c b/unpack-trees.c
index a59f475..ca4c845 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -538,6 +538,7 @@ static int verify_absent(struct cache_entry *ce, const char *action,
if (!lstat(ce->name, &st)) {
int cnt;
int dtype = ce_to_dtype(ce);
+ struct cache_entry *result;
if (o->dir && excluded(o->dir, ce->name, &dtype))
/*
@@ -581,10 +582,9 @@ static int verify_absent(struct cache_entry *ce, const char *action,
* delete this path, which is in a subdirectory that
* is being replaced with a blob.
*/
- cnt = index_name_pos(&o->result, ce->name, strlen(ce->name));
- if (0 <= cnt) {
- struct cache_entry *ce = o->result.cache[cnt];
- if (ce->ce_flags & CE_REMOVE)
+ result = index_name_exists(&o->result, ce->name, ce_namelen(ce));
+ if (result) {
+ if (result->ce_flags & CE_REMOVE)
return 0;
}
--
1.5.5.rc0.28.g61a0.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 4/7] Make hash_name_lookup able to do case-independent lookups
2008-03-22 17:28 ` [PATCH 3/7] Make "index_name_exists()" return the cache_entry it found Linus Torvalds
@ 2008-03-22 17:30 ` Linus Torvalds
2008-03-22 17:33 ` [PATCH 5/7] Add 'core.ignorecase' option Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:30 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Fri, 21 Mar 2008 15:55:19 -0700
Right now nobody uses it, but "index_name_exists()" gets a flag so
you can enable it on a case-by-case basis.
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
---
Oooh.. We actually have some (admittedly stupid) case insensitivity code
starting to appear. So we now hash the names insensitively, and we have
the _capability_ to do case-insensitive lookups, but nobody actually uses
that insensitive lookup capability yet.
But things are now starting to get interesting.
cache.h | 4 ++--
dir.c | 2 +-
name-hash.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
unpack-trees.c | 2 +-
4 files changed, 52 insertions(+), 6 deletions(-)
diff --git a/cache.h b/cache.h
index 76d95d2..a9ddaa1 100644
--- a/cache.h
+++ b/cache.h
@@ -264,7 +264,7 @@ static inline void remove_name_hash(struct cache_entry *ce)
#define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL)
#define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options))
#define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options))
-#define cache_name_exists(name, namelen) index_name_exists(&the_index, (name), (namelen))
+#define cache_name_exists(name, namelen, igncase) index_name_exists(&the_index, (name), (namelen), (igncase))
#endif
enum object_type {
@@ -353,7 +353,7 @@ extern int write_index(const struct index_state *, int newfd);
extern int discard_index(struct index_state *);
extern int unmerged_index(const struct index_state *);
extern int verify_path(const char *path);
-extern struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen);
+extern struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen, int igncase);
extern int index_name_pos(const struct index_state *, const char *name, int namelen);
#define ADD_CACHE_OK_TO_ADD 1 /* Ok to add */
#define ADD_CACHE_OK_TO_REPLACE 2 /* Ok to replace file/directory */
diff --git a/dir.c b/dir.c
index edc458e..7362e83 100644
--- a/dir.c
+++ b/dir.c
@@ -371,7 +371,7 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathname, int len)
{
- if (cache_name_exists(pathname, len))
+ if (cache_name_exists(pathname, len, 0))
return NULL;
ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
diff --git a/name-hash.c b/name-hash.c
index 2678148..2253870 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -8,12 +8,25 @@
#define NO_THE_INDEX_COMPATIBILITY_MACROS
#include "cache.h"
+/*
+ * This removes bit 5 if bit 6 is set.
+ *
+ * That will make US-ASCII characters hash to their upper-case
+ * equivalent. We could easily do this one whole word at a time,
+ * but that's for future worries.
+ */
+static inline unsigned char icase_hash(unsigned char c)
+{
+ return c & ~((c & 0x40) >> 1);
+}
+
static unsigned int hash_name(const char *name, int namelen)
{
unsigned int hash = 0x123;
do {
unsigned char c = *name++;
+ c = icase_hash(c);
hash = hash*101 + c;
} while (--namelen);
return hash;
@@ -54,7 +67,40 @@ void add_name_hash(struct index_state *istate, struct cache_entry *ce)
hash_index_entry(istate, ce);
}
-struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen)
+static int slow_same_name(const char *name1, int len1, const char *name2, int len2)
+{
+ if (len1 != len2)
+ return 0;
+
+ while (len1) {
+ unsigned char c1 = *name1++;
+ unsigned char c2 = *name2++;
+ len1--;
+ if (c1 != c2) {
+ c1 = toupper(c1);
+ c2 = toupper(c2);
+ if (c1 != c2)
+ return 0;
+ }
+ }
+ return 1;
+}
+
+static int same_name(const struct cache_entry *ce, const char *name, int namelen, int icase)
+{
+ int len = ce_namelen(ce);
+
+ /*
+ * Always fo exact compare (even if we want a case-ignoring comparison
+ * we do the quick exact one first, because it will be the common case).
+ */
+ if (len == namelen && !cache_name_compare(name, namelen, ce->name, len))
+ return 1;
+
+ return icase && slow_same_name(name, namelen, ce->name, len);
+}
+
+struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen, int icase)
{
unsigned int hash = hash_name(name, namelen);
struct cache_entry *ce;
@@ -64,7 +110,7 @@ struct cache_entry *index_name_exists(struct index_state *istate, const char *na
while (ce) {
if (!(ce->ce_flags & CE_UNHASHED)) {
- if (!cache_name_compare(name, namelen, ce->name, ce->ce_flags))
+ if (same_name(ce, name, namelen, icase))
return ce;
}
ce = ce->next;
diff --git a/unpack-trees.c b/unpack-trees.c
index ca4c845..bf7d8f6 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -582,7 +582,7 @@ static int verify_absent(struct cache_entry *ce, const char *action,
* delete this path, which is in a subdirectory that
* is being replaced with a blob.
*/
- result = index_name_exists(&o->result, ce->name, ce_namelen(ce));
+ result = index_name_exists(&o->result, ce->name, ce_namelen(ce), 0);
if (result) {
if (result->ce_flags & CE_REMOVE)
return 0;
--
1.5.5.rc0.28.g61a0.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 5/7] Add 'core.ignorecase' option
2008-03-22 17:30 ` [PATCH 4/7] Make hash_name_lookup able to do case-independent lookups Linus Torvalds
@ 2008-03-22 17:33 ` Linus Torvalds
2008-03-22 17:38 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:33 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Fri, 21 Mar 2008 16:52:46 -0700
..and start using it for directory entry traversal (ie "git status" will
not consider entries that match an existing entry case-insensitively to
be a new file)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
Basically all of this patch is just setting up the "ignore_case" variable
(default to false), but this one-liner in "dir_add_name()" is the actual
meat of it all:
- if (cache_name_exists(pathname, len, 0))
+ if (cache_name_exists(pathname, len, ignore_case))
because it now means that we will ignore unknown directory entries that
match already-known files if they match case-insensitively and we have
core.ignorecase being set.
So this actually starts using the case insensitivity logic, but it's for a
really quite trivial and not very interesting case.
cache.h | 1 +
config.c | 5 +++++
dir.c | 2 +-
environment.c | 1 +
4 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/cache.h b/cache.h
index a9ddaa1..9bce723 100644
--- a/cache.h
+++ b/cache.h
@@ -407,6 +407,7 @@ extern int delete_ref(const char *, const unsigned char *sha1);
extern int trust_executable_bit;
extern int quote_path_fully;
extern int has_symlinks;
+extern int ignore_case;
extern int assume_unchanged;
extern int prefer_symlink_refs;
extern int log_all_ref_updates;
diff --git a/config.c b/config.c
index 0624494..3d51868 100644
--- a/config.c
+++ b/config.c
@@ -342,6 +342,11 @@ int git_default_config(const char *var, const char *value)
return 0;
}
+ if (!strcmp(var, "core.ignorecase")) {
+ ignore_case = git_config_bool(var, value);
+ return 0;
+ }
+
if (!strcmp(var, "core.bare")) {
is_bare_repository_cfg = git_config_bool(var, value);
return 0;
diff --git a/dir.c b/dir.c
index 7362e83..b5bfbca 100644
--- a/dir.c
+++ b/dir.c
@@ -371,7 +371,7 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathname, int len)
{
- if (cache_name_exists(pathname, len, 0))
+ if (cache_name_exists(pathname, len, ignore_case))
return NULL;
ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
diff --git a/environment.c b/environment.c
index 6739a3f..5fcd5b2 100644
--- a/environment.c
+++ b/environment.c
@@ -14,6 +14,7 @@ char git_default_name[MAX_GITNAME];
int trust_executable_bit = 1;
int quote_path_fully = 1;
int has_symlinks = 1;
+int ignore_case = 0;
int assume_unchanged;
int prefer_symlink_refs;
int is_bare_repository_cfg = -1; /* unspecified */
--
1.5.5.rc0.28.g61a0.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields
2008-03-22 17:22 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Linus Torvalds
2008-03-22 17:25 ` [PATCH 2/7] Move name hashing functions into a file of its own Linus Torvalds
@ 2008-03-22 17:36 ` Johannes Schindelin
2008-03-22 17:47 ` Linus Torvalds
1 sibling, 1 reply; 29+ messages in thread
From: Johannes Schindelin @ 2008-03-22 17:36 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Frank, Dmitry Potapov
Hi,
On Sat, 22 Mar 2008, Linus Torvalds wrote:
> From: Linus Torvalds <torvalds@woody.linux-foundation.org>
> Date: Fri, 21 Mar 2008 13:14:47 -0700
As this patch is really unrelated to the series, this comment is really
unrelated to the content of the patch ;-)
Any point in doing the "From:" and "Date:" inside the mail body?
Ciao,
Dscho
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems
2008-03-22 17:33 ` [PATCH 5/7] Add 'core.ignorecase' option Linus Torvalds
@ 2008-03-22 17:38 ` Linus Torvalds
2008-03-22 17:45 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Linus Torvalds
2008-03-23 6:13 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Junio C Hamano
0 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:38 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Sat, 22 Mar 2008 09:35:59 -0700
If we find an unexpected file, see if that filename perhaps exists in a
case-insensitive way in the index, and whether the file matches that. If
so, ignore it as a known pre-existing file of a different name.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
All right, this is *it*. This is the actual core code that does something
interesting.
I've tried to explain the behaviour in the comment, and let's face it, the
patch is really really simple (yeah, 26 new lines but they are all really
trivial and over half of them of them are actually the comments about
what is going on).
The core of the code itself is just two lines, really, but it's all
wrapped in a helper function and I tried to make it be really really
obvious what is going on!
The reason why "src_index" had to become non-const is stupid: it's not
because we actually do anything that really writes to the index, but the
lazy index name hashing code means that even just a name lookup will
possibly create the name hash in the index.
Oh well.
unpack-trees.c | 26 ++++++++++++++++++++++++++
unpack-trees.h | 2 +-
2 files changed, 27 insertions(+), 1 deletions(-)
diff --git a/unpack-trees.c b/unpack-trees.c
index bf7d8f6..95d3413 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -521,6 +521,22 @@ static int verify_clean_subdirectory(struct cache_entry *ce, const char *action,
}
/*
+ * This gets called when there was no index entry for the tree entry 'dst',
+ * but we found a file in the working tree that 'lstat()' said was fine,
+ * and we're on a case-insensitive filesystem.
+ *
+ * See if we can find a case-insensitive match in the index that also
+ * matches the stat information, and assume it's that other file!
+ */
+static int icase_exists(struct unpack_trees_options *o, struct cache_entry *dst, struct stat *st)
+{
+ struct cache_entry *src;
+
+ src = index_name_exists(o->src_index, dst->name, ce_namelen(dst), 1);
+ return src && !ie_match_stat(o->src_index, src, st, CE_MATCH_IGNORE_VALID);
+}
+
+/*
* We do not want to remove or overwrite a working tree file that
* is not tracked, unless it is ignored.
*/
@@ -540,6 +556,16 @@ static int verify_absent(struct cache_entry *ce, const char *action,
int dtype = ce_to_dtype(ce);
struct cache_entry *result;
+ /*
+ * It may be that the 'lstat()' succeeded even though
+ * target 'ce' was absent, because there is an old
+ * entry that is different only in case..
+ *
+ * Ignore that lstat() if it matches.
+ */
+ if (ignore_case && icase_exists(o, ce, &st))
+ return 0;
+
if (o->dir && excluded(o->dir, ce->name, &dtype))
/*
* ce->name is explicitly excluded, so it is Ok to
diff --git a/unpack-trees.h b/unpack-trees.h
index ad8cc65..d436d6c 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -31,7 +31,7 @@ struct unpack_trees_options {
void *unpack_data;
struct index_state *dst_index;
- const struct index_state *src_index;
+ struct index_state *src_index;
struct index_state result;
};
--
1.5.5.rc0.28.g61a0.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 7/7] Make unpack-tree update removed files before any updated files
2008-03-22 17:38 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Linus Torvalds
@ 2008-03-22 17:45 ` Linus Torvalds
2008-03-22 18:06 ` [PATCH 0/7] Final words Linus Torvalds
` (2 more replies)
2008-03-23 6:13 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Junio C Hamano
1 sibling, 3 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:45 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Sat, 22 Mar 2008 09:48:41 -0700
This is immaterial on sane filesystems, but if you have a broken (aka
case-insensitive) filesystem, and the objective is to remove the file
'abc' and replace it with the file 'Abc', then we must make sure to do
the removal first.
Otherwise, you'd first update the file 'Abc' - which would just
overwrite the file 'abc' due to the broken case-insensitive filesystem -
and then remove file 'abc' - which would now brokenly remove the just
updated file 'Abc' on that broken filesystem.
By doing removals first, this won't happen.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
Ok, this one looks - and is, really - trivial, but it's actually the only
one in the whole series that I'm even remotely nervous about. First off,
it actually does what it does regardless of that "core.ignorecase"
variable, but that wouldn't worry me if it wasn't for the fact that I
don't remember/understand what the heck that "last_symlink" logic was
there for.
I *think* the logic is only for removal, and that splitting up the single
loop to be two loops is totally safe and actually cleans things up, but I
really want somebody to take a look at this.
This patch is important, because without it you can't reliably switch
between branches with case-aliases on a case-insensitive filesystem, and
strictly speaking I should have put it before the previous patch, but I
put it last because of this worry of mine. Patches 1-6 I think are totally
obvious and ready to go at any point. This one I really want Junio to
double-check.
The patch itself is really really trivial. We used to do both removal and
file updates in one phase, we just split it into two. So I don't worry
about the code, I only worry about that "last_symlink" thing.
Anyway, this concludes the series. It's not _complete_ - the
case-independent compare function is a joke and only gets US-ASCII
correct, and there are other cases we will want to fix up. But in the end,
I think this series of seven trivial patches really does make git somewhat
aware of broken filesystems at a very core level.
unpack-trees.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/unpack-trees.c b/unpack-trees.c
index 95d3413..feae846 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -79,16 +79,21 @@ static int check_updates(struct unpack_trees_options *o)
for (i = 0; i < index->cache_nr; i++) {
struct cache_entry *ce = index->cache[i];
- if (ce->ce_flags & (CE_UPDATE | CE_REMOVE))
- display_progress(progress, ++cnt);
if (ce->ce_flags & CE_REMOVE) {
+ display_progress(progress, ++cnt);
if (o->update)
unlink_entry(ce->name, last_symlink);
remove_index_entry_at(&o->result, i);
i--;
continue;
}
+ }
+
+ for (i = 0; i < index->cache_nr; i++) {
+ struct cache_entry *ce = index->cache[i];
+
if (ce->ce_flags & CE_UPDATE) {
+ display_progress(progress, ++cnt);
ce->ce_flags &= ~CE_UPDATE;
if (o->update) {
errs |= checkout_entry(ce, &state, NULL);
--
1.5.5.rc0.28.g61a0.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields
2008-03-22 17:36 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Johannes Schindelin
@ 2008-03-22 17:47 ` Linus Torvalds
2008-03-22 17:57 ` Johannes Schindelin
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 17:47 UTC (permalink / raw
To: Johannes Schindelin
Cc: Junio C Hamano, Git Mailing List, Frank, Dmitry Potapov
On Sat, 22 Mar 2008, Johannes Schindelin wrote:
>
> Any point in doing the "From:" and "Date:" inside the mail body?
I encourage people to do that for the kernel ("From:" in particular),
because while git-am will pick them up from the email headers, that is
only true if the emails don't get passed around and commented upon by
others.
Now, in git, the chain-of-command is very short (everybody -> Junio), so
it really doesn't matter, but in the kernel, when people send out patches
like this, the patches may be forwarded by others (who add their sign-off
lines etc), and then it's really good to have that "From:" line in
particular in the body of the email, because it's more likely that it will
remain there (otherwise we depend on the person who signs off and forwards
it to add it!)
Linus
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields
2008-03-22 17:47 ` Linus Torvalds
@ 2008-03-22 17:57 ` Johannes Schindelin
0 siblings, 0 replies; 29+ messages in thread
From: Johannes Schindelin @ 2008-03-22 17:57 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Frank, Dmitry Potapov
Hi,
On Sat, 22 Mar 2008, Linus Torvalds wrote:
> On Sat, 22 Mar 2008, Johannes Schindelin wrote:
>
> > Any point in doing the "From:" and "Date:" inside the mail body?
>
> I encourage people to do that for the kernel ("From:" in particular),
> because while git-am will pick them up from the email headers, that is
> only true if the emails don't get passed around and commented upon by
> others.
Okay.
Thanks,
Dscho
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 0/7] Final words
2008-03-22 17:45 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Linus Torvalds
@ 2008-03-22 18:06 ` Linus Torvalds
2008-03-22 18:28 ` Linus Torvalds
2008-03-22 21:19 ` [PATCH 8/7] When adding files to the index, add support for case-independent matches Linus Torvalds
2008-03-23 5:49 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Junio C Hamano
2 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 18:06 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
So the whole patch series looks like this:
Makefile | 1 +
builtin-read-tree.c | 2 +-
cache.h | 36 +++++++++-------
config.c | 5 ++
dir.c | 2 +-
environment.c | 1 +
name-hash.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++++++
read-cache.c | 65 +--------------------------
unpack-trees.c | 43 ++++++++++++++++---
unpack-trees.h | 22 +++++-----
10 files changed, 199 insertions(+), 97 deletions(-)
create mode 100644 name-hash.c
and clearly does add more lines than it deletes, but it all really is
pretty simple, and none of this is rocket science or even very intrusive.
What took me longest to do was not the actual code itself, but to get
_just_ the right approach so that the end result would be as simple and
nonintrusive as possible. That core patch 6/7 was redone at least ten
times before I was happy with it.
Anyway, perhaps exactly because I tried very hard to make it all make
sense, I'm actually very very happy with the patch. I suspect it's too
late for v1.5.5 even if I think all the patches are really simple, but I'm
hoping it can go into at least "pu" and have people actually *test* it.
Talking about testing, the kind of safety I wanted to get with this patch
is perhaps best described by the tests I did not on case-insensitive
filesystems, but on regular *good* filesystems together with setting the
"core.ignorecase" config variable.
Here's an example of how that patch 6/7 works and tries to be really
careful even on a case-sensitive filesystem:
mkdir test-case
cd test-case
git init
git config core.ignorecase true
echo "File" > File
git add File
git commit -m "Create 'File'"
git checkout -b other
git rm File
echo "file" > file
git add file
git commit -m "Create 'file'"
echo "File" > File
git checkout master
and now it complains about
error: Untracked working tree file 'File' would be overwritten by merge.
which is correct, because while it is doing its case-insensitivity checks,
it also noticed that "File" did *not* match the stat information for
'file', so it really _is_ an untracked working tree file.
So it's actually trying to be a lot more careful than just saying "ok, we
already know about 'File'". See what happens next:
rm File
ln file File
git checkout master
and now it very happily did the switch to master, even though 'File' got
overwritten, because now it again found that untracked file 'File', but
now it could match it up *exactly* against the case-insensitive file
'file', so git was happy that it wasn't actually throwing away any info,
and the fact that it overwrite 'File' was ok, because it considered it the
same file as 'file'.
So the whole thing is not only able to handle these name aliases, it
actually handles them by checking that it's safe.
Final note: I also did notice that I didn't fix the 'git add" case like I
thought I did, it currently only fixes "git status". So I still want to
fix "git add" and "git mv" to do the right thing when there are case-
insensitive aliases, but that's a separate issue from this particular
series..
Linus
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/7] Final words
2008-03-22 18:06 ` [PATCH 0/7] Final words Linus Torvalds
@ 2008-03-22 18:28 ` Linus Torvalds
0 siblings, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 18:28 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
On Sat, 22 Mar 2008, Linus Torvalds wrote:
>
> Final note: I also did notice that I didn't fix the 'git add" case like I
> thought I did, it currently only fixes "git status". So I still want to
> fix "git add" and "git mv" to do the right thing when there are case-
> insensitive aliases, but that's a separate issue from this particular
> series..
git-add will want more than this, but this is an example of what we should
do - if 'ignore_case' is set, we probably should disallow adding the same
case-insensitive name twice to the index.
This does *not* guarantee that the index never would have aliases when
core.ignorecase is set, since the index might have been populated from a
tree that was generated on a sane filesystem, and we still allow that, but
things like this are probably good things to do for projects that want to
work case-insensitively.
So even if you have a case-sensitive filesystem, the goal (I think) should
be that you can set core.ignorecase to true, and that should help you work
with other people who may be stuck on case-insensitive crud.
Anyway, the reason "git add" didn't actually work with the simple change
to dir_add_name() is that "git add" doesn't load the index until *after*
it has done the directory traversal (because it actually *wants* to see
files that are already in the index).
Something like this at least disallows the dual add if the case has
changed.
Linus
----
read-cache.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/read-cache.c b/read-cache.c
index 5dc998d..6aee6e0 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -476,6 +476,13 @@ int add_file_to_index(struct index_state *istate, const char *path, int verbose)
return 0;
}
+ if (ignore_case) {
+ struct cache_entry *alias;
+ alias = index_name_exists(istate, ce->name, ce_namelen(ce), 1);
+ if (alias)
+ die("Will not add file alias '%s' ('%s' already exists in index)", ce->name, alias->name);
+ }
+
if (index_path(ce->sha1, path, &st, 1))
die("unable to index file %s", path);
if (add_index_entry(istate, ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 8/7] When adding files to the index, add support for case-independent matches
2008-03-22 17:45 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Linus Torvalds
2008-03-22 18:06 ` [PATCH 0/7] Final words Linus Torvalds
@ 2008-03-22 21:19 ` Linus Torvalds
2008-03-22 21:22 ` [PATCH 9/7] Make git-add behave more sensibly in a case-insensitive environment Linus Torvalds
2008-03-23 5:49 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Junio C Hamano
2 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 21:19 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Sat, 22 Mar 2008 13:19:49 -0700
This simplifies the matching case of "I already have this file and it is
up-to-date" and makes it do the right thing in the face of
case-insensitive aliases.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
This is patch 1/2 to make "git add" act better. Discard the previous
simplistic and overly die()-eager patch that I hadn't signed off on.
read-cache.c | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/read-cache.c b/read-cache.c
index 5dc998d..8c57adf 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -431,9 +431,9 @@ static int index_name_pos_also_unmerged(struct index_state *istate,
int add_file_to_index(struct index_state *istate, const char *path, int verbose)
{
- int size, namelen, pos;
+ int size, namelen;
struct stat st;
- struct cache_entry *ce;
+ struct cache_entry *ce, *alias;
unsigned ce_option = CE_MATCH_IGNORE_VALID|CE_MATCH_RACY_IS_DIRTY;
if (lstat(path, &st))
@@ -466,13 +466,11 @@ int add_file_to_index(struct index_state *istate, const char *path, int verbose)
ce->ce_mode = ce_mode_from_stat(ent, st.st_mode);
}
- pos = index_name_pos(istate, ce->name, namelen);
- if (0 <= pos &&
- !ce_stage(istate->cache[pos]) &&
- !ie_match_stat(istate, istate->cache[pos], &st, ce_option)) {
+ alias = index_name_exists(istate, ce->name, ce_namelen(ce), ignore_case);
+ if (alias && !ce_stage(alias) && !ie_match_stat(istate, alias, &st, ce_option)) {
/* Nothing changed, really */
free(ce);
- ce_mark_uptodate(istate->cache[pos]);
+ ce_mark_uptodate(alias);
return 0;
}
--
1.5.5.rc0.31.gdcfd.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 9/7] Make git-add behave more sensibly in a case-insensitive environment
2008-03-22 21:19 ` [PATCH 8/7] When adding files to the index, add support for case-independent matches Linus Torvalds
@ 2008-03-22 21:22 ` Linus Torvalds
0 siblings, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-22 21:22 UTC (permalink / raw
To: Junio C Hamano, Git Mailing List; +Cc: Frank, Dmitry Potapov
From: Linus Torvalds <torvalds@woody.linux-foundation.org>
Subject: [PATCH 2/2] Make git-add behave more sensibly in a case-insensitive environment
This expands on the previous patch, and allows "git add" to sanely handle
a filename that has changed case, keeping the case in the index constant,
and avoiding aliases.
In particular, if you have an index entry called "File", but the
checked-out tree is case-corrupted and has an entry called "file"
instead, doing a
git add .
(or naming "file" explicitly) will automatically notice that we have an
alias, and will replace the name "file" with the existing index
capitalization (ie "File").
However, if we actually have *both* a file called "File" and one called
"file", and they don't have the same lstat() information (ie we're on a
case-sensitive filesystem but have the "core.ignorecase" flag set), we
will error out if we try to add them both.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
The previous patch handled the "nothing changed" case, this one actually
handles the case of the data needing to be updated.
The CE_ADDED flag is an in-memory flag that just protects a single "git
add" invocation from changing the same alias twice. That would not be
right, but if you do separate
git add File
git add file
commands, the second one will happily update the information that the
first one added even if it was different - but
git add File file
would be an error if they don't have the same stat() information.
cache.h | 1 +
read-cache.c | 37 ++++++++++++++++++++++++++++++++++++-
2 files changed, 37 insertions(+), 1 deletions(-)
diff --git a/cache.h b/cache.h
index 9bce723..81727e4 100644
--- a/cache.h
+++ b/cache.h
@@ -133,6 +133,7 @@ struct cache_entry {
#define CE_UPDATE (0x10000)
#define CE_REMOVE (0x20000)
#define CE_UPTODATE (0x40000)
+#define CE_ADDED (0x80000)
#define CE_HASHED (0x100000)
#define CE_UNHASHED (0x200000)
diff --git a/read-cache.c b/read-cache.c
index 8c57adf..26ed644 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -429,6 +429,38 @@ static int index_name_pos_also_unmerged(struct index_state *istate,
return pos;
}
+static int different_name(struct cache_entry *ce, struct cache_entry *alias)
+{
+ int len = ce_namelen(ce);
+ return ce_namelen(alias) != len || memcmp(ce->name, alias->name, len);
+}
+
+/*
+ * If we add a filename that aliases in the cache, we will use the
+ * name that we already have - but we don't want to update the same
+ * alias twice, because that implies that there were actually two
+ * different files with aliasing names!
+ *
+ * So we use the CE_ADDED flag to verify that the alias was an old
+ * one before we accept it as
+ */
+static struct cache_entry *create_alias_ce(struct cache_entry *ce, struct cache_entry *alias)
+{
+ int len;
+ struct cache_entry *new;
+
+ if (alias->ce_flags & CE_ADDED)
+ die("Will not add file alias '%s' ('%s' already exists in index)", ce->name, alias->name);
+
+ /* Ok, create the new entry using the name of the existing alias */
+ len = ce_namelen(alias);
+ new = xcalloc(1, cache_entry_size(len));
+ memcpy(new->name, alias->name, len);
+ copy_cache_entry(new, ce);
+ free(ce);
+ return new;
+}
+
int add_file_to_index(struct index_state *istate, const char *path, int verbose)
{
int size, namelen;
@@ -471,11 +503,14 @@ int add_file_to_index(struct index_state *istate, const char *path, int verbose)
/* Nothing changed, really */
free(ce);
ce_mark_uptodate(alias);
+ alias->ce_flags |= CE_ADDED;
return 0;
}
-
if (index_path(ce->sha1, path, &st, 1))
die("unable to index file %s", path);
+ if (ignore_case && alias && different_name(ce, alias))
+ ce = create_alias_ce(ce, alias);
+ ce->ce_flags |= CE_ADDED;
if (add_index_entry(istate, ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
die("unable to add %s to index",path);
if (verbose)
--
1.5.5.rc0.31.gdcfd.dirty
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH] t0050: Set core.ignorecase case to activate case insensitivity
2008-03-22 17:21 [PATCH 0/7] Case-insensitive filesystem support, take 1 Linus Torvalds
2008-03-22 17:22 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Linus Torvalds
@ 2008-03-22 22:01 ` Steffen Prohaska
2008-03-25 6:57 ` [PATCH] git-init: autodetect core.ignorecase Dmitry Potapov
` (2 subsequent siblings)
4 siblings, 0 replies; 29+ messages in thread
From: Steffen Prohaska @ 2008-03-22 22:01 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Frank, Dmitry Potapov
Case insensitive file handling is only activated by
core.ignorecase = true. Hence, we need to set it to give the tests
in t0050 a chance to succeed.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
t/t0050-filesystem.sh | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
On Sat, 22 Mar 2008, Linus Torvalds wrote:
> - I made this all conditional on
>
> [core]
> ignorecase = true
>
> because obviously I'm not at all interested in penalizing sane
> filesystems. That said, I also worked at trying to make sure that it's
> safe and possible to do this on a case-sensitive filesystem in case you
> are working on a project that doesn't like case-sensitivity, so the
> "git status" and "git add ." kind of operations won't add aliases
With this commit applied test 2 of t0050 passes. This is the minimal
change to make t0050 useful. Eventually test_expect_failure should be
replaced with test_expect_success.
Steffen
diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index 3fbad77..cb109ff 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -34,6 +34,7 @@ test_expect_success 'see if we expect ' '
test_expect_success "setup case tests" '
+ git config core.ignorecase true &&
touch camelcase &&
git add camelcase &&
git commit -m "initial" &&
--
1.5.4.4.613.gaa46e5
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 7/7] Make unpack-tree update removed files before any updated files
2008-03-22 17:45 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Linus Torvalds
2008-03-22 18:06 ` [PATCH 0/7] Final words Linus Torvalds
2008-03-22 21:19 ` [PATCH 8/7] When adding files to the index, add support for case-independent matches Linus Torvalds
@ 2008-03-23 5:49 ` Junio C Hamano
2 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2008-03-23 5:49 UTC (permalink / raw
To: Linus Torvalds; +Cc: Git Mailing List, Frank, Dmitry Potapov
Linus Torvalds <torvalds@linux-foundation.org> writes:
> Ok, this one looks - and is, really - trivial, but it's actually the only
> one in the whole series that I'm even remotely nervous about. First off,
> it actually does what it does regardless of that "core.ignorecase"
> variable, but that wouldn't worry me if it wasn't for the fact that I
> don't remember/understand what the heck that "last_symlink" logic was
> there for.
last_symlink is just a cached information used by underlying
has_symlink_leading_path() function for optimization.
The motivation behind has_symlink_leading_path() is reasonably well
described in
- f859c84: Add has_symlink_leading_path() function., 2007-05-11,
- 64cab59: apply: do not get confused by symlinks in the middle,
2007-05-11, and
- 16a4c61: read-tree -m -u: avoid getting confused by intermediate
symlinks., 2007-05-10
The short version is that:
- sometimes we want to make sure a path a/b/c/d exists (or does not
exist) in the work tree;
- however, !lstat("a/b/c/d") is not quite it. if a/b is a symlink in the
work tree that points at somewhere that happens to have c/d underneath,
!lstat() says "yeah, there is", but that one is _different_ from what
checking out a cache entry a/b/c/d would produce (because in that case
we will remove a/b symlink, create a/b/ directory and deposit blob b
there).
- So we often need to see if a given path has symlink component in the
leading part in the work tree (e.g. given "a/b/c/d", we would need to
check if any of "a", "a/b", "a/b/c" is a symlink).
The function has_symlink_leading_path() answers that question, and its
second argument is a buffer to cache "the last work tree path found to be
a symlink", so if you call it with "a/b/c/d" and then with "a/b/c/e" in
the above example situation, the second call can re-use the information
the first call found out, which is "a/b is a symlink".
I do not think your patch breaks the passing around of last_symlink cached
information. Although the three commits I quoted above are all backed by
real-world breakage cases that they did fix, the issues they deal with are
indeed tricky cases. Although your patch (the change in 7/7) should not
make any difference to the issues, thinking about them is already making
me feel nervous.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems
2008-03-22 17:38 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Linus Torvalds
2008-03-22 17:45 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Linus Torvalds
@ 2008-03-23 6:13 ` Junio C Hamano
2008-03-23 15:41 ` Linus Torvalds
1 sibling, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2008-03-23 6:13 UTC (permalink / raw
To: Linus Torvalds; +Cc: Git Mailing List, Frank, Dmitry Potapov
Linus Torvalds <torvalds@linux-foundation.org> writes:
> @@ -540,6 +556,16 @@ static int verify_absent(struct cache_entry *ce, const char *action,
> int dtype = ce_to_dtype(ce);
> struct cache_entry *result;
>
> + /*
> + * It may be that the 'lstat()' succeeded even though
> + * target 'ce' was absent, because there is an old
> + * entry that is different only in case..
> + *
> + * Ignore that lstat() if it matches.
> + */
> + if (ignore_case && icase_exists(o, ce, &st))
> + return 0;
> +
It may well be the case that this lstat() returning success was caused by
a ghost match with a file with different case, and I think it is the right
thing to say "no, it does not exist" if that is the case.
I wonder what happens when the file with the same case does exist that we
are trying to make sure is missing?
As far as I can tell, icase_exists() does not ask "does a file with this
name in different case exist, and a file with this exact case doesn't?"
but asks "does a file with this name, or another name that is different
only in case, exist?".
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems
2008-03-23 6:13 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Junio C Hamano
@ 2008-03-23 15:41 ` Linus Torvalds
0 siblings, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-23 15:41 UTC (permalink / raw
To: Junio C Hamano; +Cc: Git Mailing List, Frank, Dmitry Potapov
On Sat, 22 Mar 2008, Junio C Hamano wrote:
>
> I wonder what happens when the file with the same case does exist that we
> are trying to make sure is missing?
Can't happen. This whole code-path only triggers if the entry didn't exist
in the index when we merge a tree.
So we know a priori that the source index didn't contain the thing.
> As far as I can tell, icase_exists() does not ask "does a file with this
> name in different case exist, and a file with this exact case doesn't?"
> but asks "does a file with this name, or another name that is different
> only in case, exist?".
Correct. But see the call chain - this thing is only called if index is
NULL, ie "there was no entry in the index".
So in this case, the other comment (above "icase_exists()") talks about
that:
This gets called when there was no index entry for the tree entry
'dst', but we found a file in the working tree that 'lstat()' said
was fine, [...]
and you can verify that "verify_absent()" only gets called by things where
"index" was NULL (only three callers, and two of them are expressly inside
a "if (!old)" case, and the third one is right after a "if (index) return"
statement.
[ There's _one_ special case: the "index" thing may have been NULL not
because there was no path in the source index, but because we didn't
even look at the index in the first place! So strictly speaking, we
should have a test for "o->merge" being set, but afaik that must always
be true if we have "o->update" set, and again, this logic only triggers
for that case.
So the only case that doesn't set "o->merge" to get the index is
"builtin-read-tree.c" when you do a plain tree-only merge, but that one
has
if ((opts.update||opts.index_only) && !opts.merge)
usage(read_tree_usage);
to make sure that you cannot update the working tree without taking the
index into account ]
Anyway, I think it's all good.
Linus
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH] git-init: autodetect core.ignorecase
2008-03-22 17:21 [PATCH 0/7] Case-insensitive filesystem support, take 1 Linus Torvalds
2008-03-22 17:22 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Linus Torvalds
2008-03-22 22:01 ` [PATCH] t0050: Set core.ignorecase case to activate case insensitivity Steffen Prohaska
@ 2008-03-25 6:57 ` Dmitry Potapov
2008-03-25 9:59 ` Johannes Schindelin
2008-03-25 8:14 ` [PATCH 0/7] Case-insensitive filesystem support, take 1 Dmitry Potapov
2008-03-25 11:39 ` Derek Fawcus
4 siblings, 1 reply; 29+ messages in thread
From: Dmitry Potapov @ 2008-03-25 6:57 UTC (permalink / raw
To: Junio C Hamano; +Cc: Git Mailing List, Linus Torvalds, Frank, Dmitry Potapov
We already detect if symbolic links are supported by the filesystem.
This patch adds autodetect for case-insensitive filesystems, such
as VFAT and others.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---
This patch goes on top of lt/case-insensitive
builtin-init-db.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/builtin-init-db.c b/builtin-init-db.c
index 79eaf8d..62f7c08 100644
--- a/builtin-init-db.c
+++ b/builtin-init-db.c
@@ -254,8 +254,8 @@ static int create_default_files(const char *git_dir, const char *template_path)
git_config_set("core.worktree", work_tree);
}
- /* Check if symlink is supported in the work tree */
if (!reinit) {
+ /* Check if symlink is supported in the work tree */
path[len] = 0;
strcpy(path + len, "tXXXXXX");
if (!close(xmkstemp(path)) &&
@@ -266,6 +266,12 @@ static int create_default_files(const char *git_dir, const char *template_path)
unlink(path); /* good */
else
git_config_set("core.symlinks", "false");
+
+ /* Check if the filesystem is case-insensitive */
+ path[len] = 0;
+ strcpy(path + len, "CoNfIg");
+ if (access(path, F_OK))
+ git_config_set("core.ignorecase", "true");
}
return reinit;
--
1.5.5.rc1.10.g3948
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 0/7] Case-insensitive filesystem support, take 1
2008-03-22 17:21 [PATCH 0/7] Case-insensitive filesystem support, take 1 Linus Torvalds
` (2 preceding siblings ...)
2008-03-25 6:57 ` [PATCH] git-init: autodetect core.ignorecase Dmitry Potapov
@ 2008-03-25 8:14 ` Dmitry Potapov
2008-03-25 21:04 ` Linus Torvalds
2008-03-25 11:39 ` Derek Fawcus
4 siblings, 1 reply; 29+ messages in thread
From: Dmitry Potapov @ 2008-03-25 8:14 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Frank
On Sat, Mar 22, 2008 at 10:21:05AM -0700, Linus Torvalds wrote:
>
> - I've tested this series, both on a case-sensitive one (using hardlinks
> to test corner cases) and on a vfat filesystem under Linux (which is
> case-insensitive and *really* odd wrt case preservation - it remembers
> the name of removed files, so it preserves case even across removal and
> re-creation!)
I also have observed this problem with VFAT on Linux, but the effect was not
stable. It looks like old information is preserved somewhere in caches...
Anyway, I have tested this series of patches a bit on Windows and so far
I have found the following:
- merge different branches were two file names are only differ by case
will cause that the result branch has two file names that differ only
by case and one of them will be overwritten by the other and shown as
modified in the worktree by git status.
- git status cares only about case-insensitivity only for files and not
for directories. Thus, if case of letters in a directory name is changed
then this directory will be shown as untracked.
- pattern specified in .gitignore are match as case-sensitive despite
core.ignorecase set to true.
Personally, I don't care about any of the above issues much as I rarely
work on Windows and when I do, I always check that all filenames are in
low case except Makefile (and a few more exceptions). So, I have never
had any problem with using Git on case-insensitive system...
Dmitry
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] git-init: autodetect core.ignorecase
2008-03-25 6:57 ` [PATCH] git-init: autodetect core.ignorecase Dmitry Potapov
@ 2008-03-25 9:59 ` Johannes Schindelin
2008-03-25 10:49 ` [PATCH v2] " Dmitry Potapov
2008-03-25 11:03 ` [PATCH] " Dmitry Potapov
0 siblings, 2 replies; 29+ messages in thread
From: Johannes Schindelin @ 2008-03-25 9:59 UTC (permalink / raw
To: Dmitry Potapov; +Cc: Junio C Hamano, Git Mailing List, Linus Torvalds, Frank
Hi,
On Tue, 25 Mar 2008, Dmitry Potapov wrote:
> diff --git a/builtin-init-db.c b/builtin-init-db.c
> index 79eaf8d..62f7c08 100644
> --- a/builtin-init-db.c
> +++ b/builtin-init-db.c
> @@ -254,8 +254,8 @@ static int create_default_files(const char *git_dir, const char *template_path)
> git_config_set("core.worktree", work_tree);
> }
>
> - /* Check if symlink is supported in the work tree */
> if (!reinit) {
> + /* Check if symlink is supported in the work tree */
> path[len] = 0;
> strcpy(path + len, "tXXXXXX");
> if (!close(xmkstemp(path)) &&
> @@ -266,6 +266,12 @@ static int create_default_files(const char *git_dir, const char *template_path)
> unlink(path); /* good */
> else
> git_config_set("core.symlinks", "false");
> +
> + /* Check if the filesystem is case-insensitive */
> + path[len] = 0;
> + strcpy(path + len, "CoNfIg");
> + if (access(path, F_OK))
> + git_config_set("core.ignorecase", "true");
Clever!
Last time I checked, the "HEAD" file on VFAT was converted to "head" when
the repository was initialised on Win32 (IIRC) and read on Linux (IIRC).
Maybe this problem has gone away, but if not, it should definitely be
fixed (depending on core.ignorecase).
Ciao,
Dscho
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v2] git-init: autodetect core.ignorecase
2008-03-25 9:59 ` Johannes Schindelin
@ 2008-03-25 10:49 ` Dmitry Potapov
2008-03-25 11:03 ` [PATCH] " Dmitry Potapov
1 sibling, 0 replies; 29+ messages in thread
From: Dmitry Potapov @ 2008-03-25 10:49 UTC (permalink / raw
To: Junio C Hamano
Cc: Johannes Schindelin, Git Mailing List, Linus Torvalds, Frank
We already detect if symbolic links are supported by the filesystem.
This patch adds autodetect for case-insensitive filesystems, such
as VFAT and others.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---
There was a stupid mistake in a previous patch... (Inadvertantly,
I sent the old version of my patch, which was incorrect, and did not
realize that until saw Johannes' reply to my email.)
builtin-init-db.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/builtin-init-db.c b/builtin-init-db.c
index 79eaf8d..62f7c08 100644
--- a/builtin-init-db.c
+++ b/builtin-init-db.c
@@ -254,8 +254,8 @@ static int create_default_files(const char *git_dir, const char *template_path)
git_config_set("core.worktree", work_tree);
}
- /* Check if symlink is supported in the work tree */
if (!reinit) {
+ /* Check if symlink is supported in the work tree */
path[len] = 0;
strcpy(path + len, "tXXXXXX");
if (!close(xmkstemp(path)) &&
@@ -266,6 +266,12 @@ static int create_default_files(const char *git_dir, const char *template_path)
unlink(path); /* good */
else
git_config_set("core.symlinks", "false");
+
+ /* Check if the filesystem is case-insensitive */
+ path[len] = 0;
+ strcpy(path + len, "CoNfIg");
+ if (!access(path, F_OK))
+ git_config_set("core.ignorecase", "true");
}
return reinit;
--
1.5.5.rc1.10.g3948
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] git-init: autodetect core.ignorecase
2008-03-25 9:59 ` Johannes Schindelin
2008-03-25 10:49 ` [PATCH v2] " Dmitry Potapov
@ 2008-03-25 11:03 ` Dmitry Potapov
1 sibling, 0 replies; 29+ messages in thread
From: Dmitry Potapov @ 2008-03-25 11:03 UTC (permalink / raw
To: Johannes Schindelin; +Cc: Git Mailing List
On Tue, Mar 25, 2008 at 10:59:00AM +0100, Johannes Schindelin wrote:
>
> Last time I checked, the "HEAD" file on VFAT was converted to "head" when
> the repository was initialised on Win32 (IIRC) and read on Linux (IIRC).
It is result of mounting the VFAT disk on Linux with shortname=lower,
which is the default :( Perhaps, it was a reasonable default for DOS
days, but now it only creates troubles. I usually mount VFAT disks with
shortname=winnt on Linux, though some people prefer shortname=mixed.
Dmitry
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/7] Case-insensitive filesystem support, take 1
2008-03-22 17:21 [PATCH 0/7] Case-insensitive filesystem support, take 1 Linus Torvalds
` (3 preceding siblings ...)
2008-03-25 8:14 ` [PATCH 0/7] Case-insensitive filesystem support, take 1 Dmitry Potapov
@ 2008-03-25 11:39 ` Derek Fawcus
2008-03-25 18:26 ` Jan Hudec
4 siblings, 1 reply; 29+ messages in thread
From: Derek Fawcus @ 2008-03-25 11:39 UTC (permalink / raw
To: git
On Sat, Mar 22, 2008 at 10:21:05AM -0700, Linus Torvalds wrote:
> ... and on a vfat filesystem under Linux (which is
> case-insensitive and *really* odd wrt case preservation - it remembers
> the name of removed files, so it preserves case even across removal and
> re-creation!)
Interesting.
That sounds a bit like the claimed windows 95 properly of 'tunneling' renames.
ISTR that it was to catch a move via 'shortname' which then preserved the longname.
However I'd have expected the Linux version to always use the long name...
DF
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/7] Case-insensitive filesystem support, take 1
2008-03-25 11:39 ` Derek Fawcus
@ 2008-03-25 18:26 ` Jan Hudec
0 siblings, 0 replies; 29+ messages in thread
From: Jan Hudec @ 2008-03-25 18:26 UTC (permalink / raw
To: Derek Fawcus; +Cc: git
On Tue, Mar 25, 2008 at 11:39:56 +0000, Derek Fawcus wrote:
> On Sat, Mar 22, 2008 at 10:21:05AM -0700, Linus Torvalds wrote:
> > ... and on a vfat filesystem under Linux (which is
> > case-insensitive and *really* odd wrt case preservation - it remembers
> > the name of removed files, so it preserves case even across removal and
> > re-creation!)
>
> Interesting.
> That sounds a bit like the claimed windows 95 properly of 'tunneling' renames.
> ISTR that it was to catch a move via 'shortname' which then preserved the longname.
>
> However I'd have expected the Linux version to always use the long name...
... if it's there -- which it might not. Linux may create it always, but
Windows will not if they think they don't need to.
--
Jan 'Bulb' Hudec <bulb@ucw.cz>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/7] Case-insensitive filesystem support, take 1
2008-03-25 8:14 ` [PATCH 0/7] Case-insensitive filesystem support, take 1 Dmitry Potapov
@ 2008-03-25 21:04 ` Linus Torvalds
2008-03-26 2:46 ` Dmitry Potapov
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-03-25 21:04 UTC (permalink / raw
To: Dmitry Potapov; +Cc: Junio C Hamano, Git Mailing List, Frank
On Tue, 25 Mar 2008, Dmitry Potapov wrote:
>
> - merge different branches were two file names are only differ by case
> will cause that the result branch has two file names that differ only
> by case and one of them will be overwritten by the other and shown as
> modified in the worktree by git status.
Ok. So there's two issues here:
- the git trees themselves had two different names
This is not something I'm *ever* planning on changing. All my "case
insensitive" patches were about the *working*tree*, not about core git
data structures themselves.
In other words: git itself is internally very much a case-sensitive
program, and the index and the trees are case-sensitive and will remain
so forever as far as I'm concerned. So when you do a tree-level merge
of two trees that have two different names that are equivalent in case,
git will create a result that has those two names. Because git itself
is not case-insensitive.
- HOWEVER - when checking things out, we should probably notice that
we're now writing the two different files out and over-writing one of
them, and fail at that stage. I don't know what a good failure
behaviour would be, though. I'll have to think about it.
IOW, all my case-insensitivity checking was very much designed to be about
the working tree, not about git internal representations. Put another way,
they should really only affect code that does "lstat()" to check whether
a file exists or code that does "open()" to open/create a file.
> - git status cares only about case-insensitivity only for files and not
> for directories. Thus, if case of letters in a directory name is changed
> then this directory will be shown as untracked.
Ahh, yes. This is sad. It comes about because while we can look up whole
names in the index case-insensitively, we have no equivalent way to look
up individual path components, so that still uses the "index_name_pos()"
model and then looking aroung the failure point to see if we hit a
subdirectory. Remember: the index doesn't actually contain directories at
all, just lists of files.
This will not be trivial to fix.
> - pattern specified in .gitignore are match as case-sensitive despite
> core.ignorecase set to true.
This should probably be fairly straightforward. All the logic here is in
the function "excluded_1()" in dir.c - and largely it would be about
changing that "strcmp()" into a "strcasecmp()" and using the FNM_CASEFOLD
flag to fnmatch().
The only half-way subtle issues here are
- do we really want to use strcasecmp() (which may match differently than
our hashing matches!) or do we want to expand on our icase_cmp() or
similar in hash-name.c (I think the latter is the right thing, but it
requires a bit more work)
- FNM_CASEFOLD has the same issue, but also adds the wrinkle of being a
GNU-only extension. Which is sad, since most systems that have glibc
would never need it in the first place. So then we get back to the
whole issue of maybe having to reimplement 'fnmatch()', or at least a
subset of it that git uses.
So that last issue is conceptually simple and straightforward to fix, but
fixing it right would almost certainly be a few hundred lines of code
(fnmatch() in particular is nasty if you want to do all the cases, but
perhaps just '*' is ok?).
The first two issues are nontrivial.
Linus
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/7] Case-insensitive filesystem support, take 1
2008-03-25 21:04 ` Linus Torvalds
@ 2008-03-26 2:46 ` Dmitry Potapov
2008-03-26 3:37 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Potapov @ 2008-03-26 2:46 UTC (permalink / raw
To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Frank
On Tue, Mar 25, 2008 at 02:04:58PM -0700, Linus Torvalds wrote:
>
> IOW, all my case-insensitivity checking was very much designed to be about
> the working tree, not about git internal representations. Put another way,
> they should really only affect code that does "lstat()" to check whether
> a file exists or code that does "open()" to open/create a file.
Of course, case-insensitivity is about the working tree only. But when
I merge another branch to the current one, git normally checks that it
is not going to overwrite existing files in the *work tree* and refuses
to do the merge if some files may be overwritten.
So if I work on a case-insensitive filesystem and have a file in a different
case and core.ignorecase=false, then the merge fails as expected!
But core.ignorecase=true, which is supposed to do a better job for case-
insensitive filesystems, actually causes the problem here.
Here is my test script:
====
mkdir git-test
cd git-test
git init
git config core.ignorecase true
echo foo > foo
git add foo
git commit -m 'initial commit'
git checkout -b other
echo file > file
git add file
git commit -m 'add file'
git checkout master
echo File > File
git add File
git commit -m 'add File'
# I expect merge to fail here... and it does fail if core.ignorecase
# is set to false, but with core.ignorecase = true, git will overwrite
# 'File'.
# git config core.ignorecase false
git merge other
===
Dmitry
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/7] Case-insensitive filesystem support, take 1
2008-03-26 2:46 ` Dmitry Potapov
@ 2008-03-26 3:37 ` Linus Torvalds
0 siblings, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2008-03-26 3:37 UTC (permalink / raw
To: Dmitry Potapov; +Cc: Junio C Hamano, Git Mailing List, Frank
On Wed, 26 Mar 2008, Dmitry Potapov wrote:
>
> Of course, case-insensitivity is about the working tree only. But when
> I merge another branch to the current one, git normally checks that it
> is not going to overwrite existing files in the *work tree* and refuses
> to do the merge if some files may be overwritten.
I do agree - but this is not really about the *merge*. It is about the
checkout.
Imagine that the merge had been done on a sane filesystem, and you're just
pulling it or cloning it. No merge on your machine, but the problem is the
same.
Linus
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2008-03-26 3:40 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-03-22 17:21 [PATCH 0/7] Case-insensitive filesystem support, take 1 Linus Torvalds
2008-03-22 17:22 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Linus Torvalds
2008-03-22 17:25 ` [PATCH 2/7] Move name hashing functions into a file of its own Linus Torvalds
2008-03-22 17:28 ` [PATCH 3/7] Make "index_name_exists()" return the cache_entry it found Linus Torvalds
2008-03-22 17:30 ` [PATCH 4/7] Make hash_name_lookup able to do case-independent lookups Linus Torvalds
2008-03-22 17:33 ` [PATCH 5/7] Add 'core.ignorecase' option Linus Torvalds
2008-03-22 17:38 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Linus Torvalds
2008-03-22 17:45 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Linus Torvalds
2008-03-22 18:06 ` [PATCH 0/7] Final words Linus Torvalds
2008-03-22 18:28 ` Linus Torvalds
2008-03-22 21:19 ` [PATCH 8/7] When adding files to the index, add support for case-independent matches Linus Torvalds
2008-03-22 21:22 ` [PATCH 9/7] Make git-add behave more sensibly in a case-insensitive environment Linus Torvalds
2008-03-23 5:49 ` [PATCH 7/7] Make unpack-tree update removed files before any updated files Junio C Hamano
2008-03-23 6:13 ` [PATCH 6/7] Make branch merging aware of underlying case-insensitive filsystems Junio C Hamano
2008-03-23 15:41 ` Linus Torvalds
2008-03-22 17:36 ` [PATCH 1/7] Make unpack_trees_options bit flags actual bitfields Johannes Schindelin
2008-03-22 17:47 ` Linus Torvalds
2008-03-22 17:57 ` Johannes Schindelin
2008-03-22 22:01 ` [PATCH] t0050: Set core.ignorecase case to activate case insensitivity Steffen Prohaska
2008-03-25 6:57 ` [PATCH] git-init: autodetect core.ignorecase Dmitry Potapov
2008-03-25 9:59 ` Johannes Schindelin
2008-03-25 10:49 ` [PATCH v2] " Dmitry Potapov
2008-03-25 11:03 ` [PATCH] " Dmitry Potapov
2008-03-25 8:14 ` [PATCH 0/7] Case-insensitive filesystem support, take 1 Dmitry Potapov
2008-03-25 21:04 ` Linus Torvalds
2008-03-26 2:46 ` Dmitry Potapov
2008-03-26 3:37 ` Linus Torvalds
2008-03-25 11:39 ` Derek Fawcus
2008-03-25 18:26 ` Jan Hudec
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).