git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	Joshua Jensen <jjensen@workspacewhiz.com>,
	Johannes Sixt <j6t@kdbg.org>, Brandon Casey <drafnel@gmail.com>
Subject: [PATCH/RFC v3 5/8] Add case insensitivity support for directories when using git status
Date: Sun,  3 Oct 2010 09:56:43 +0000	[thread overview]
Message-ID: <1286099806-25774-6-git-send-email-avarab@gmail.com> (raw)
In-Reply-To: <4CA847D5.4000903@workspacewhiz.com>

From: Joshua Jensen <jjensen@workspacewhiz.com>

When using a case preserving but case insensitive file system, directory
case can differ but still refer to the same physical directory.  git
status reports the directory with the alternate case as an Untracked
file.  (That is, when mydir/filea.txt is added to the repository and
then the directory on disk is renamed from mydir/ to MyDir/, git status
shows MyDir/ as being untracked.)

Support has been added in name-hash.c for hashing directories with a
terminating slash into the name hash. When index_name_exists() is called
with a directory (a name with a terminating slash), the name is not
found via the normal cache_name_compare() call, but it is found in the
slow_same_name() function.

Additionally, in dir.c, directory_exists_in_index_icase() allows newly
added directories deeper in the directory chain to be identified.

Ultimately, it would be better if the file list was read in case
insensitive alphabetical order from disk, but this change seems to
suffice for now.

The end result is the directory is looked up in a case insensitive
manner and does not show in the Untracked files list.

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 dir.c       |   40 ++++++++++++++++++++++++++++++++-
 name-hash.c |   72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index 02c2a82..cf8f65c 100644
--- a/dir.c
+++ b/dir.c
@@ -485,6 +485,39 @@ enum exist_status {
 };
 
 /*
+ * Do not use the alphabetically stored index to look up
+ * the directory name; instead, use the case insensitive
+ * name hash.
+ */
+static enum exist_status directory_exists_in_index_icase(const char *dirname, int len)
+{
+	struct cache_entry *ce = index_name_exists(&the_index, dirname, len + 1, ignore_case);
+	unsigned char endchar;
+
+	if (!ce)
+		return index_nonexistent;
+	endchar = ce->name[len];
+
+	/*
+	 * The cache_entry structure returned will contain this dirname
+	 * and possibly additional path components.
+	 */
+	if (endchar == '/')
+		return index_directory;
+
+	/*
+	 * If there are no additional path components, then this cache_entry
+	 * represents a submodule.  Submodules, despite being directories,
+	 * are stored in the cache without a closing slash.
+	 */
+	if (!endchar && S_ISGITLINK(ce->ce_mode))
+		return index_gitdir;
+
+	/* This should never be hit, but it exists just in case. */
+	return index_nonexistent;
+}
+
+/*
  * The index sorts alphabetically by entry name, which
  * means that a gitlink sorts as '\0' at the end, while
  * a directory (which is defined not as an entry, but as
@@ -493,7 +526,12 @@ enum exist_status {
  */
 static enum exist_status directory_exists_in_index(const char *dirname, int len)
 {
-	int pos = cache_name_pos(dirname, len);
+	int pos;
+
+	if (ignore_case)
+		return directory_exists_in_index_icase(dirname, len);
+
+	pos = cache_name_pos(dirname, len);
 	if (pos < 0)
 		pos = -pos-1;
 	while (pos < active_nr) {
diff --git a/name-hash.c b/name-hash.c
index 0031d78..c6b6a3f 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -32,6 +32,42 @@ static unsigned int hash_name(const char *name, int namelen)
 	return hash;
 }
 
+static void hash_index_entry_directories(struct index_state *istate, struct cache_entry *ce)
+{
+	/*
+	 * Throw each directory component in the hash for quick lookup
+	 * during a git status. Directory components are stored with their
+	 * closing slash.  Despite submodules being a directory, they never
+	 * reach this point, because they are stored without a closing slash
+	 * in the cache.
+	 *
+	 * Note that the cache_entry stored with the directory does not
+	 * represent the directory itself.  It is a pointer to an existing
+	 * filename, and its only purpose is to represent existence of the
+	 * directory in the cache.  It is very possible multiple directory
+	 * hash entries may point to the same cache_entry.
+	 */
+	unsigned int hash;
+	void **pos;
+
+	const char *ptr = ce->name;
+	while (*ptr) {
+		while (*ptr && *ptr != '/')
+			++ptr;
+		if (*ptr == '/') {
+			++ptr;
+			hash = hash_name(ce->name, ptr - ce->name);
+			if (!lookup_hash(hash, &istate->name_hash)) {
+				pos = insert_hash(hash, ce, &istate->name_hash);
+				if (pos) {
+					ce->next = *pos;
+					*pos = ce;
+				}
+			}
+		}
+	}
+}
+
 static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
 {
 	void **pos;
@@ -47,6 +83,9 @@ static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
 		ce->next = *pos;
 		*pos = ce;
 	}
+
+	if (ignore_case)
+		hash_index_entry_directories(istate, ce);
 }
 
 static void lazy_init_name_hash(struct index_state *istate)
@@ -97,7 +136,21 @@ static int same_name(const struct cache_entry *ce, const char *name, int namelen
 	if (len == namelen && !cache_name_compare(name, namelen, ce->name, len))
 		return 1;
 
-	return icase && slow_same_name(name, namelen, ce->name, len);
+	if (!icase)
+		return 0;
+
+	/*
+	 * If the entry we're comparing is a filename (no trailing slash), then compare
+	 * the lengths exactly.
+	 */
+	if (name[namelen - 1] != '/')
+		return slow_same_name(name, namelen, ce->name, len);
+
+	/*
+	 * For a directory, we point to an arbitrary cache_entry filename.  Just
+	 * make sure the directory portion matches.
+	 */
+	return slow_same_name(name, namelen, ce->name, namelen < len ? namelen : len);
 }
 
 struct cache_entry *index_name_exists(struct index_state *istate, const char *name, int namelen, int icase)
@@ -115,5 +168,22 @@ struct cache_entry *index_name_exists(struct index_state *istate, const char *na
 		}
 		ce = ce->next;
 	}
+
+	/*
+	 * Might be a submodule.  Despite submodules being directories,
+	 * they are stored in the name hash without a closing slash.
+	 * When ignore_case is 1, directories are stored in the name hash
+	 * with their closing slash.
+	 *
+	 * The side effect of this storage technique is we have need to
+	 * remove the slash from name and perform the lookup again without
+	 * the slash.  If a match is made, S_ISGITLINK(ce->mode) will be
+	 * true.
+	 */
+	if (icase && name[namelen - 1] == '/') {
+		ce = index_name_exists(istate, name, namelen - 1, icase);
+		if (ce && S_ISGITLINK(ce->ce_mode))
+			return ce;
+	}
 	return NULL;
 }
-- 
1.7.3.159.g610493

  parent reply	other threads:[~2010-10-03  9:58 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-03  4:32 [PATCH v2 0/6] Extensions of core.ignorecase=true support Joshua Jensen
2010-10-03  4:32 ` [PATCH v2 1/6] Add string comparison functions that respect the ignore_case variable Joshua Jensen
2010-10-03  8:30   ` Ævar Arnfjörð Bjarmason
2010-10-03  9:07     ` Joshua Jensen
2010-10-03  9:56       ` [PATCH/RFC v3 0/8] ab/icase-directory: jj/icase-directory with Makefile + configure checks Ævar Arnfjörð Bjarmason
2010-10-03  9:56       ` [PATCH/RFC v3 1/8] Makefile & configure: add a NO_FNMATCH flag Ævar Arnfjörð Bjarmason
2010-10-03  9:56       ` [PATCH/RFC v3 2/8] Makefile & configure: add a NO_FNMATCH_CASEFOLD flag Ævar Arnfjörð Bjarmason
2010-10-03 17:58         ` Johannes Sixt
2010-10-04  2:48           ` [PATCH/RFC v4 " Ævar Arnfjörð Bjarmason
2010-10-03  9:56       ` [PATCH/RFC v3 3/8] Add string comparison functions that respect the ignore_case variable Ævar Arnfjörð Bjarmason
2010-10-03  9:56       ` [PATCH/RFC v3 4/8] Case insensitivity support for .gitignore via core.ignorecase Ævar Arnfjörð Bjarmason
2010-10-03  9:56       ` Ævar Arnfjörð Bjarmason [this message]
2010-10-03  9:56       ` [PATCH/RFC v3 6/8] Add case insensitivity support when using git ls-files Ævar Arnfjörð Bjarmason
2010-10-03 11:54         ` Thomas Adam
2010-10-03 18:19           ` Johannes Sixt
2010-10-03 21:59             ` Thomas Adam
2010-10-04  7:49             ` Jonathan Nieder
2010-10-04  8:02               ` Ævar Arnfjörð Bjarmason
2010-10-04 14:03               ` Erik Faye-Lund
2010-10-04 14:58                 ` Joshua Jensen
2010-10-04 17:03                   ` Jonathan Nieder
2010-10-04 16:02         ` Robin Rosenberg
2010-10-04 16:41           ` Ævar Arnfjörð Bjarmason
2010-10-04 16:48           ` Erik Faye-Lund
2010-10-04 16:49           ` Joshua Jensen
2010-10-04 17:08             ` Jonathan Nieder
2010-10-04 17:53             ` Ævar Arnfjörð Bjarmason
2010-10-04 19:02           ` Johannes Sixt
2010-10-04 19:17             ` Ævar Arnfjörð Bjarmason
2010-10-03  9:56       ` [PATCH/RFC v3 7/8] Support case folding for git add when core.ignorecase=true Ævar Arnfjörð Bjarmason
2010-10-03  9:56       ` [PATCH/RFC v3 8/8] Support case folding in git fast-import " Ævar Arnfjörð Bjarmason
2010-10-07  4:13       ` [PATCH v2 1/6] Add string comparison functions that respect the ignore_case variable Junio C Hamano
2010-10-07  5:48         ` Joshua Jensen
2010-10-03  4:32 ` [PATCH v2 2/6] Case insensitivity support for .gitignore via core.ignorecase Joshua Jensen
2010-10-03  4:32 ` [PATCH v2 3/6] Add case insensitivity support for directories when using git status Joshua Jensen
2010-10-03  4:32 ` [PATCH v2 4/6] Add case insensitivity support when using git ls-files Joshua Jensen
2010-10-03  4:32 ` [PATCH v2 5/6] Support case folding for git add when core.ignorecase=true Joshua Jensen
2010-10-03  4:32 ` [PATCH v2 6/6] Support case folding in git fast-import " Joshua Jensen
2010-10-03 13:00   ` Sverre Rabbelier
2010-10-03  8:17 ` [PATCH v2 0/6] Extensions of core.ignorecase=true support Johannes Sixt
2010-10-03 23:34   ` Junio C Hamano
2010-10-03 11:48 ` Robert Buck
2010-10-03 18:12   ` Johannes Sixt
2010-10-06 22:04     ` Robert Buck
2010-10-06 22:46       ` Joshua Jensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1286099806-25774-6-git-send-email-avarab@gmail.com \
    --to=avarab@gmail.com \
    --cc=drafnel@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=jjensen@workspacewhiz.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).