git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* nd/attr-match-more-optim, nd/wildmatch and as/check-ignore
@ 2012-10-15  6:23 Nguyễn Thái Ngọc Duy
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
                   ` (3 more replies)
  0 siblings, 4 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:23 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

I promise I won't send anything dir.c-related till the end of this
month :) These three series all touch the same code in dir.c and cause
a bunch of conflicts. So I rebase nd/wildmatch and as/check-ignore
on top of nd/attr-match-more-optim and resolve all conflicts.

nd/attr-match-more-optim
------------------------

A lot of refactoring in dir.c leads to a cleaner last patch to port
many exclude optimizations to attr. We can extend EXC_FLAGS_ENDSWITH
optimization further, from "^*.[ch]" to "path/to/*.[ch]", but not in
this series.

Nguyễn Thái Ngọc Duy (6):
  exclude: stricten a length check in EXC_FLAG_ENDSWITH case
  exclude: split basename matching code into a separate function
  exclude: fix a bug in prefix compare optimization
  exclude: split pathname matching code into a separate function
  gitignore: make pattern parsing code a separate function
  attr: more matching optimizations from .gitignore

 Documentation/gitattributes.txt    |   1 +
 attr.c                             |  52 ++++++----
 dir.c                              | 192 ++++++++++++++++++++++++-------------
 dir.h                              |  13 ++-
 t/t0003-attributes.sh              |  10 ++
 t/t3001-ls-files-others-exclude.sh |   6 ++
 6 files changed, 186 insertions(+), 88 deletions(-)

nd/wildmatch
------------

new ctype patches that no longer introduce sane_ctype2[]. I also
re-indent wildmatch.c to follow Git's coding style with the intention
of making more changes in future (e.g. case insensitive support in
pathspec means we cannot rely on GNU extension FNM_CASEFOLD). Thanks
to nd/attr-match-more-optim we don't need to make any changes to
attr.c.

Depends on nd/attr-match-more-optim.

Nguyễn Thái Ngọc Duy (13):
  ctype: make sane_ctype[] const array
  ctype: support iscntrl, ispunct, isxdigit and isprint
  Import wildmatch from rsync
  wildmatch: remove unnecessary functions
  wildmatch: follow Git's coding convention
  Integrate wildmatch to git
  t3070: disable unreliable fnmatch tests
  wildmatch: make wildmatch's return value compatible with fnmatch
  wildmatch: remove static variable force_lower_case
  wildmatch: fix case-insensitive matching
  wildmatch: adjust "**" behavior
  wildmatch: make /**/ match zero or more directories
  Support "**" wildcard in .gitignore and .gitattributes

 .gitignore                         |   1 +
 Documentation/gitignore.txt        |  19 +++
 Makefile                           |   3 +
 ctype.c                            |  15 ++-
 dir.c                              |   4 +-
 git-compat-util.h                  |  15 ++-
 t/t0003-attributes.sh              |  37 ++++++
 t/t3001-ls-files-others-exclude.sh |  18 +++
 t/t3070-wildmatch.sh               | 195 +++++++++++++++++++++++++++++++
 test-wildmatch.c                   |  14 +++
 wildmatch.c                        | 234 +++++++++++++++++++++++++++++++++++++
 wildmatch.h                        |   9 ++
 12 files changed, 556 insertions(+), 8 deletions(-)
 create mode 100755 t/t3070-wildmatch.sh
 create mode 100644 test-wildmatch.c
 create mode 100644 wildmatch.c
 create mode 100644 wildmatch.h

as/check-ignore
---------------

Conflict resolution and cleanups. A lot of matching code sharing
between exclude and attr means we might be able to bring check-ignore
functionality to check-attr. But let's leave it for now.

Depends on nd/attr-match-more-optim.

Adam Spiers (12):
  dir.c: rename cryptic 'which' variable to more consistent name
  dir.c: rename path_excluded() to is_path_excluded()
  dir.c: rename excluded_from_list() to is_excluded_from_list()
  dir.c: rename excluded() to is_excluded()
  dir.c: refactor is_excluded_from_list()
  dir.c: refactor is_excluded()
  dir.c: refactor is_path_excluded()
  dir.c: keep track of where patterns came from
  dir.c: refactor treat_gitlinks()
  pathspec.c: move reusable code from builtin/add.c
  dir.c: provide free_directory() for reclaiming dir_struct memory
  Add git-check-ignore sub-command

 .gitignore                                        |   1 +
 Documentation/git-check-ignore.txt                |  85 ++++
 Documentation/gitignore.txt                       |   6 +-
 Documentation/technical/api-directory-listing.txt |   2 +
 Makefile                                          |   3 +
 attr.c                                            |   2 +-
 builtin.h                                         |   1 +
 builtin/add.c                                     |  84 +---
 builtin/check-ignore.c                            | 170 +++++++
 builtin/clean.c                                   |   2 +-
 builtin/ls-files.c                                |   5 +-
 command-list.txt                                  |   1 +
 contrib/completion/git-completion.bash            |   1 +
 dir.c                                             | 185 +++++--
 dir.h                                             |  21 +-
 git.c                                             |   1 +
 pathspec.c                                        |  98 ++++
 pathspec.h                                        |   6 +
 t/t0007-ignores.sh                                | 587 ++++++++++++++++++++++
 unpack-trees.c                                    |  10 +-
 20 files changed, 1135 insertions(+), 136 deletions(-)
 create mode 100644 Documentation/git-check-ignore.txt
 create mode 100644 builtin/check-ignore.c
 create mode 100644 pathspec.c
 create mode 100644 pathspec.h
 create mode 100755 t/t0007-ignores.sh
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case
  2012-10-15  6:23 nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:24 ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 2/6] exclude: split basename matching code into a separate function Nguyễn Thái Ngọc Duy
                     ` (4 more replies)
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  3 siblings, 5 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This block of code deals with the "basename" part only, which has the
length of "pathlen - (basename - pathname)". Stricten the length check
and remove "pathname" from the main expression to avoid confusion.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index 0015cc5..b0ae417 100644
--- a/dir.c
+++ b/dir.c
@@ -534,8 +534,9 @@ int excluded_from_list(const char *pathname,
 				if (!strcmp_icase(exclude, basename))
 					return to_exclude;
 			} else if (x->flags & EXC_FLAG_ENDSWITH) {
-				if (x->patternlen - 1 <= pathlen &&
-				    !strcmp_icase(exclude + 1, pathname + pathlen - x->patternlen + 1))
+				int len = pathlen - (basename - pathname);
+				if (x->patternlen - 1 <= len &&
+				    !strcmp_icase(exclude + 1, basename + len - x->patternlen + 1))
 					return to_exclude;
 			} else {
 				if (fnmatch_icase(exclude, basename, 0) == 0)
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 2/6] exclude: split basename matching code into a separate function
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:24   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 3/6] exclude: fix a bug in prefix compare optimization Nguyễn Thái Ngọc Duy
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/dir.c b/dir.c
index b0ae417..d9b5561 100644
--- a/dir.c
+++ b/dir.c
@@ -503,6 +503,25 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 	dir->basebuf[baselen] = '\0';
 }
 
+static int match_basename(const char *basename, int basenamelen,
+			  const char *pattern, int prefix, int patternlen,
+			  int flags)
+{
+	if (prefix == patternlen) {
+		if (!strcmp_icase(pattern, basename))
+			return 1;
+	} else if (flags & EXC_FLAG_ENDSWITH) {
+		if (patternlen - 1 <= basenamelen &&
+		    !strcmp_icase(pattern + 1,
+				  basename + basenamelen - patternlen + 1))
+			return 1;
+	} else {
+		if (fnmatch_icase(pattern, basename, 0) == 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* Scan the list and let the last match determine the fate.
  * Return 1 for exclude, 0 for include and -1 for undecided.
  */
@@ -529,19 +548,11 @@ int excluded_from_list(const char *pathname,
 		}
 
 		if (x->flags & EXC_FLAG_NODIR) {
-			/* match basename */
-			if (prefix == x->patternlen) {
-				if (!strcmp_icase(exclude, basename))
-					return to_exclude;
-			} else if (x->flags & EXC_FLAG_ENDSWITH) {
-				int len = pathlen - (basename - pathname);
-				if (x->patternlen - 1 <= len &&
-				    !strcmp_icase(exclude + 1, basename + len - x->patternlen + 1))
-					return to_exclude;
-			} else {
-				if (fnmatch_icase(exclude, basename, 0) == 0)
-					return to_exclude;
-			}
+			if (match_basename(basename,
+					   pathlen - (basename - pathname),
+					   exclude, prefix, x->patternlen,
+					   x->flags))
+				return to_exclude;
 			continue;
 		}
 
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 3/6] exclude: fix a bug in prefix compare optimization
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 2/6] exclude: split basename matching code into a separate function Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:24   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 4/6] exclude: split pathname matching code into a separate function Nguyễn Thái Ngọc Duy
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

When "namelen" becomes zero at this stage, we have matched the fixed
part, but whether it actually matches the pattern still depends on the
pattern in "exclude". As demonstrated in t3001, path "three/a.3"
exists and it matches the "three/a.3" part in pattern "three/a.3[abc]",
but that does not mean a true match.

Don't be too optimistic and let fnmatch() do the job.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c                              | 2 +-
 t/t3001-ls-files-others-exclude.sh | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index d9b5561..22d0b7b 100644
--- a/dir.c
+++ b/dir.c
@@ -585,7 +585,7 @@ int excluded_from_list(const char *pathname,
 			namelen -= prefix;
 		}
 
-		if (!namelen || !fnmatch_icase(exclude, name, FNM_PATHNAME))
+		if (!fnmatch_icase(exclude, name, FNM_PATHNAME))
 			return to_exclude;
 	}
 	return -1; /* undecided */
diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
index c8fe978..dc2f045 100755
--- a/t/t3001-ls-files-others-exclude.sh
+++ b/t/t3001-ls-files-others-exclude.sh
@@ -214,4 +214,10 @@ test_expect_success 'subdirectory ignore (l1)' '
 	test_cmp expect actual
 '
 
+test_expect_success 'pattern matches prefix completely' '
+	: >expect &&
+	git ls-files -i -o --exclude "/three/a.3[abc]" >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 4/6] exclude: split pathname matching code into a separate function
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 2/6] exclude: split basename matching code into a separate function Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 3/6] exclude: fix a bug in prefix compare optimization Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:24   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 5/6] gitignore: make pattern parsing code " Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 6/6] attr: more matching optimizations from .gitignore Nguyễn Thái Ngọc Duy
  4 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 85 ++++++++++++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 53 insertions(+), 32 deletions(-)

diff --git a/dir.c b/dir.c
index 22d0b7b..32d1c90 100644
--- a/dir.c
+++ b/dir.c
@@ -522,6 +522,53 @@ static int match_basename(const char *basename, int basenamelen,
 	return 0;
 }
 
+static int match_pathname(const char *pathname, int pathlen,
+			  const char *base, int baselen,
+			  const char *pattern, int prefix, int patternlen,
+			  int flags)
+{
+	const char *name;
+	int namelen;
+
+	/*
+	 * match with FNM_PATHNAME; the pattern has base implicitly
+	 * in front of it.
+	 */
+	if (*pattern == '/') {
+		pattern++;
+		prefix--;
+	}
+
+	/*
+	 * baselen does not count the trailing slash. base[] may or
+	 * may not end with a trailing slash though.
+	 */
+	if (pathlen < baselen + 1 ||
+	    (baselen && pathname[baselen] != '/') ||
+	    strncmp_icase(pathname, base, baselen))
+		return 0;
+
+	namelen = baselen ? pathlen - baselen - 1 : pathlen;
+	name = pathname + pathlen - namelen;
+
+	if (prefix) {
+		/*
+		 * if the non-wildcard part is longer than the
+		 * remaining pathname, surely it cannot match.
+		 */
+		if (prefix > namelen)
+			return 0;
+
+		if (strncmp_icase(pattern, name, prefix))
+			return 0;
+		pattern += prefix;
+		name    += prefix;
+		namelen -= prefix;
+	}
+
+	return fnmatch_icase(pattern, name, FNM_PATHNAME) == 0;
+}
+
 /* Scan the list and let the last match determine the fate.
  * Return 1 for exclude, 0 for include and -1 for undecided.
  */
@@ -536,9 +583,9 @@ int excluded_from_list(const char *pathname,
 
 	for (i = el->nr - 1; 0 <= i; i--) {
 		struct exclude *x = el->excludes[i];
-		const char *name, *exclude = x->pattern;
+		const char *exclude = x->pattern;
 		int to_exclude = x->to_exclude;
-		int namelen, prefix = x->nowildcardlen;
+		int prefix = x->nowildcardlen;
 
 		if (x->flags & EXC_FLAG_MUSTBEDIR) {
 			if (*dtype == DT_UNKNOWN)
@@ -556,36 +603,10 @@ int excluded_from_list(const char *pathname,
 			continue;
 		}
 
-		/* match with FNM_PATHNAME:
-		 * exclude has base (baselen long) implicitly in front of it.
-		 */
-		if (*exclude == '/') {
-			exclude++;
-			prefix--;
-		}
-
-		if (pathlen < x->baselen ||
-		    (x->baselen && pathname[x->baselen-1] != '/') ||
-		    strncmp_icase(pathname, x->base, x->baselen))
-			continue;
-
-		namelen = x->baselen ? pathlen - x->baselen : pathlen;
-		name = pathname + pathlen  - namelen;
-
-		/* if the non-wildcard part is longer than the
-		   remaining pathname, surely it cannot match */
-		if (prefix > namelen)
-			continue;
-
-		if (prefix) {
-			if (strncmp_icase(exclude, name, prefix))
-				continue;
-			exclude += prefix;
-			name    += prefix;
-			namelen -= prefix;
-		}
-
-		if (!fnmatch_icase(exclude, name, FNM_PATHNAME))
+		assert(x->baselen == 0 || x->base[x->baselen - 1] == '/');
+		if (match_pathname(pathname, pathlen,
+				   x->base, x->baselen ? x->baselen - 1 : 0,
+				   exclude, prefix, x->patternlen, x->flags))
 			return to_exclude;
 	}
 	return -1; /* undecided */
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 5/6] gitignore: make pattern parsing code a separate function
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
                     ` (2 preceding siblings ...)
  2012-10-15  6:24   ` [PATCH 4/6] exclude: split pathname matching code into a separate function Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:24   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:24   ` [PATCH 6/6] attr: more matching optimizations from .gitignore Nguyễn Thái Ngọc Duy
  4 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This function can later be reused by attr.c. Also turn to_exclude
field into a flag.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++---------------------
 dir.h |  2 +-
 2 files changed, 50 insertions(+), 23 deletions(-)

diff --git a/dir.c b/dir.c
index 32d1c90..c4e64a5 100644
--- a/dir.c
+++ b/dir.c
@@ -308,42 +308,69 @@ static int no_wildcard(const char *string)
 	return string[simple_length(string)] == '\0';
 }
 
+static void parse_exclude_pattern(const char **pattern,
+				  int *patternlen,
+				  int *flags,
+				  int *nowildcardlen)
+{
+	const char *p = *pattern;
+	size_t i, len;
+
+	*flags = 0;
+	if (*p == '!') {
+		*flags |= EXC_FLAG_NEGATIVE;
+		p++;
+	}
+	len = strlen(p);
+	if (len && p[len - 1] == '/') {
+		len--;
+		*flags |= EXC_FLAG_MUSTBEDIR;
+	}
+	for (i = 0; i < len; i++) {
+		if (p[i] == '/')
+			break;
+	}
+	if (i == len)
+		*flags |= EXC_FLAG_NODIR;
+	*nowildcardlen = simple_length(p);
+	/*
+	 * we should have excluded the trailing slash from 'p' too,
+	 * but that's one more allocation. Instead just make sure
+	 * nowildcardlen does not exceed real patternlen
+	 */
+	if (*nowildcardlen > len)
+		*nowildcardlen = len;
+	if (*p == '*' && no_wildcard(p + 1))
+		*flags |= EXC_FLAG_ENDSWITH;
+	*pattern = p;
+	*patternlen = len;
+}
+
 void add_exclude(const char *string, const char *base,
 		 int baselen, struct exclude_list *which)
 {
 	struct exclude *x;
-	size_t len;
-	int to_exclude = 1;
-	int flags = 0;
+	int patternlen;
+	int flags;
+	int nowildcardlen;
 
-	if (*string == '!') {
-		to_exclude = 0;
-		string++;
-	}
-	len = strlen(string);
-	if (len && string[len - 1] == '/') {
+	parse_exclude_pattern(&string, &patternlen, &flags, &nowildcardlen);
+	if (flags & EXC_FLAG_MUSTBEDIR) {
 		char *s;
-		x = xmalloc(sizeof(*x) + len);
+		x = xmalloc(sizeof(*x) + patternlen + 1);
 		s = (char *)(x+1);
-		memcpy(s, string, len - 1);
-		s[len - 1] = '\0';
-		string = s;
+		memcpy(s, string, patternlen);
+		s[patternlen] = '\0';
 		x->pattern = s;
-		flags = EXC_FLAG_MUSTBEDIR;
 	} else {
 		x = xmalloc(sizeof(*x));
 		x->pattern = string;
 	}
-	x->to_exclude = to_exclude;
-	x->patternlen = strlen(string);
+	x->patternlen = patternlen;
+	x->nowildcardlen = nowildcardlen;
 	x->base = base;
 	x->baselen = baselen;
 	x->flags = flags;
-	if (!strchr(string, '/'))
-		x->flags |= EXC_FLAG_NODIR;
-	x->nowildcardlen = simple_length(string);
-	if (*string == '*' && no_wildcard(string+1))
-		x->flags |= EXC_FLAG_ENDSWITH;
 	ALLOC_GROW(which->excludes, which->nr + 1, which->alloc);
 	which->excludes[which->nr++] = x;
 }
@@ -584,7 +611,7 @@ int excluded_from_list(const char *pathname,
 	for (i = el->nr - 1; 0 <= i; i--) {
 		struct exclude *x = el->excludes[i];
 		const char *exclude = x->pattern;
-		int to_exclude = x->to_exclude;
+		int to_exclude = x->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
 		int prefix = x->nowildcardlen;
 
 		if (x->flags & EXC_FLAG_MUSTBEDIR) {
diff --git a/dir.h b/dir.h
index 893465a..41ea32d 100644
--- a/dir.h
+++ b/dir.h
@@ -11,6 +11,7 @@ struct dir_entry {
 #define EXC_FLAG_NODIR 1
 #define EXC_FLAG_ENDSWITH 4
 #define EXC_FLAG_MUSTBEDIR 8
+#define EXC_FLAG_NEGATIVE 16
 
 struct exclude_list {
 	int nr;
@@ -21,7 +22,6 @@ struct exclude_list {
 		int nowildcardlen;
 		const char *base;
 		int baselen;
-		int to_exclude;
 		int flags;
 	} **excludes;
 };
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 6/6] attr: more matching optimizations from .gitignore
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
                     ` (3 preceding siblings ...)
  2012-10-15  6:24   ` [PATCH 5/6] gitignore: make pattern parsing code " Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:24   ` Nguyễn Thái Ngọc Duy
  4 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

.gitattributes and .gitignore share the same pattern syntax but has
separate matching implementation. Over the years, ignore's
implementation accumulates more optimizations while attr's stays the
same.

This patch reuses the core matching functions that are also used by
excluded_from_list. excluded_from_list and path_matches can't be
merged due to differences in exclude and attr, for example:

* "!pattern" syntax is forbidden in .gitattributes.  As an attribute
  can be unset (i.e. set to a special value "false") or made back to
  unspecified (i.e. not even set to "false"), "!pattern attr" is unclear
  which one it means.

* we support attaching attributes to directories, but git-core
  internally does not currently make use of attributes on
  directories.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/gitattributes.txt |  1 +
 attr.c                          | 52 ++++++++++++++++++++++++-----------------
 dir.c                           | 22 ++++++++---------
 dir.h                           | 11 +++++++++
 t/t0003-attributes.sh           | 10 ++++++++
 5 files changed, 64 insertions(+), 32 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 80120ea..b7c0e65 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -56,6 +56,7 @@ When more than one pattern matches the path, a later line
 overrides an earlier line.  This overriding is done per
 attribute.  The rules how the pattern matches paths are the
 same as in `.gitignore` files; see linkgit:gitignore[5].
+Unlike `.gitignore`, negative patterns are forbidden.
 
 When deciding what attributes are assigned to a path, git
 consults `$GIT_DIR/info/attributes` file (which has the highest
diff --git a/attr.c b/attr.c
index e7caee4..2fc6353 100644
--- a/attr.c
+++ b/attr.c
@@ -115,6 +115,13 @@ struct attr_state {
 	const char *setto;
 };
 
+struct pattern {
+	const char *pattern;
+	int patternlen;
+	int nowildcardlen;
+	int flags;		/* EXC_FLAG_* */
+};
+
 /*
  * One rule, as from a .gitattributes file.
  *
@@ -131,7 +138,7 @@ struct attr_state {
  */
 struct match_attr {
 	union {
-		char *pattern;
+		struct pattern pat;
 		struct git_attr *attr;
 	} u;
 	char is_macro;
@@ -241,9 +248,16 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	if (is_macro)
 		res->u.attr = git_attr_internal(name, namelen);
 	else {
-		res->u.pattern = (char *)&(res->state[num_attr]);
-		memcpy(res->u.pattern, name, namelen);
-		res->u.pattern[namelen] = 0;
+		char *p = (char *)&(res->state[num_attr]);
+		memcpy(p, name, namelen);
+		res->u.pat.pattern = p;
+		parse_exclude_pattern(&res->u.pat.pattern,
+				      &res->u.pat.patternlen,
+				      &res->u.pat.flags,
+				      &res->u.pat.nowildcardlen);
+		if (res->u.pat.flags & EXC_FLAG_NEGATIVE)
+			die(_("Negative patterns are forbidden in git attributes\n"
+			      "Use '\\!' for literal leading exclamation."));
 	}
 	res->is_macro = is_macro;
 	res->num_attr = num_attr;
@@ -640,25 +654,21 @@ static void prepare_attr_stack(const char *path)
 
 static int path_matches(const char *pathname, int pathlen,
 			const char *basename,
-			const char *pattern,
+			const struct pattern *pat,
 			const char *base, int baselen)
 {
-	if (!strchr(pattern, '/')) {
-		return (fnmatch_icase(pattern, basename, 0) == 0);
+	const char *pattern = pat->pattern;
+	int prefix = pat->nowildcardlen;
+
+	if (pat->flags & EXC_FLAG_NODIR) {
+		return match_basename(basename,
+				      pathlen - (basename - pathname),
+				      pattern, prefix,
+				      pat->patternlen, pat->flags);
 	}
-	/*
-	 * match with FNM_PATHNAME; the pattern has base implicitly
-	 * in front of it.
-	 */
-	if (*pattern == '/')
-		pattern++;
-	if (pathlen < baselen ||
-	    (baselen && pathname[baselen] != '/') ||
-	    strncmp(pathname, base, baselen))
-		return 0;
-	if (baselen != 0)
-		baselen++;
-	return fnmatch_icase(pattern, pathname + baselen, FNM_PATHNAME) == 0;
+	return match_pathname(pathname, pathlen,
+			      base, baselen,
+			      pattern, prefix, pat->patternlen, pat->flags);
 }
 
 static int macroexpand_one(int attr_nr, int rem);
@@ -696,7 +706,7 @@ static int fill(const char *path, int pathlen, const char *basename,
 		if (a->is_macro)
 			continue;
 		if (path_matches(path, pathlen, basename,
-				 a->u.pattern, base, stk->originlen))
+				 &a->u.pat, base, stk->originlen))
 			rem = fill_one("fill", a, rem);
 	}
 	return rem;
diff --git a/dir.c b/dir.c
index c4e64a5..ee8e711 100644
--- a/dir.c
+++ b/dir.c
@@ -308,10 +308,10 @@ static int no_wildcard(const char *string)
 	return string[simple_length(string)] == '\0';
 }
 
-static void parse_exclude_pattern(const char **pattern,
-				  int *patternlen,
-				  int *flags,
-				  int *nowildcardlen)
+void parse_exclude_pattern(const char **pattern,
+			   int *patternlen,
+			   int *flags,
+			   int *nowildcardlen)
 {
 	const char *p = *pattern;
 	size_t i, len;
@@ -530,9 +530,9 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 	dir->basebuf[baselen] = '\0';
 }
 
-static int match_basename(const char *basename, int basenamelen,
-			  const char *pattern, int prefix, int patternlen,
-			  int flags)
+int match_basename(const char *basename, int basenamelen,
+		   const char *pattern, int prefix, int patternlen,
+		   int flags)
 {
 	if (prefix == patternlen) {
 		if (!strcmp_icase(pattern, basename))
@@ -549,10 +549,10 @@ static int match_basename(const char *basename, int basenamelen,
 	return 0;
 }
 
-static int match_pathname(const char *pathname, int pathlen,
-			  const char *base, int baselen,
-			  const char *pattern, int prefix, int patternlen,
-			  int flags)
+int match_pathname(const char *pathname, int pathlen,
+		   const char *base, int baselen,
+		   const char *pattern, int prefix, int patternlen,
+		   int flags)
 {
 	const char *name;
 	int namelen;
diff --git a/dir.h b/dir.h
index 41ea32d..f5c89e3 100644
--- a/dir.h
+++ b/dir.h
@@ -81,6 +81,16 @@ extern int excluded_from_list(const char *pathname, int pathlen, const char *bas
 struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len);
 
 /*
+ * these implement the matching logic for dir.c:excluded_from_list and
+ * attr.c:path_matches()
+ */
+extern int match_basename(const char *, int,
+			  const char *, int, int, int);
+extern int match_pathname(const char *, int,
+			  const char *, int,
+			  const char *, int, int, int);
+
+/*
  * The excluded() API is meant for callers that check each level of leading
  * directory hierarchies with excluded() to avoid recursing into excluded
  * directories.  Callers that do not do so should use this API instead.
@@ -97,6 +107,7 @@ extern int path_excluded(struct path_exclude_check *, const char *, int namelen,
 extern int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen,
 					  char **buf_p, struct exclude_list *which, int check_index);
 extern void add_excludes_from_file(struct dir_struct *, const char *fname);
+extern void parse_exclude_pattern(const char **string, int *patternlen, int *flags, int *nowildcardlen);
 extern void add_exclude(const char *string, const char *base,
 			int baselen, struct exclude_list *which);
 extern void free_excludes(struct exclude_list *el);
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index 51f3045..f6c21ea 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -206,6 +206,16 @@ test_expect_success 'root subdir attribute test' '
 	attr_check subdir/a/i unspecified
 '
 
+test_expect_success 'negative patterns' '
+	echo "!f test=bar" >.gitattributes &&
+	test_must_fail git check-attr test -- f
+'
+
+test_expect_success 'patterns starting with exclamation' '
+	echo "\!f test=foo" >.gitattributes &&
+	attr_check "!f" foo
+'
+
 test_expect_success 'setup bare' '
 	git clone --bare . bare.git &&
 	cd bare.git
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 01/13] ctype: make sane_ctype[] const array
  2012-10-15  6:23 nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Nguyễn Thái Ngọc Duy
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25 ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 02/13] ctype: support iscntrl, ispunct, isxdigit and isprint Nguyễn Thái Ngọc Duy
                     ` (11 more replies)
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
  2012-10-15 22:13 ` nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Junio C Hamano
  3 siblings, 12 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 ctype.c           | 2 +-
 git-compat-util.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ctype.c b/ctype.c
index 9353271..faeaf34 100644
--- a/ctype.c
+++ b/ctype.c
@@ -14,7 +14,7 @@ enum {
 	P = GIT_PATHSPEC_MAGIC  /* other non-alnum, except for ] and } */
 };
 
-unsigned char sane_ctype[256] = {
+const unsigned char sane_ctype[256] = {
 	0, 0, 0, 0, 0, 0, 0, 0, 0, S, S, 0, 0, S, 0, 0,		/*   0.. 15 */
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,		/*  16.. 31 */
 	S, P, P, P, R, P, P, P, R, R, G, R, P, P, R, P,		/*  32.. 47 */
diff --git a/git-compat-util.h b/git-compat-util.h
index 5bd9ad7..80767ff 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -470,7 +470,7 @@ extern const char tolower_trans_tbl[256];
 #undef isupper
 #undef tolower
 #undef toupper
-extern unsigned char sane_ctype[256];
+extern const unsigned char sane_ctype[256];
 #define GIT_SPACE 0x01
 #define GIT_DIGIT 0x02
 #define GIT_ALPHA 0x04
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 02/13] ctype: support iscntrl, ispunct, isxdigit and isprint
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 03/13] Import wildmatch from rsync Nguyễn Thái Ngọc Duy
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 ctype.c           | 13 ++++++++-----
 git-compat-util.h | 13 +++++++++++++
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/ctype.c b/ctype.c
index faeaf34..0bfebb4 100644
--- a/ctype.c
+++ b/ctype.c
@@ -11,18 +11,21 @@ enum {
 	D = GIT_DIGIT,
 	G = GIT_GLOB_SPECIAL,	/* *, ?, [, \\ */
 	R = GIT_REGEX_SPECIAL,	/* $, (, ), +, ., ^, {, | */
-	P = GIT_PATHSPEC_MAGIC  /* other non-alnum, except for ] and } */
+	P = GIT_PATHSPEC_MAGIC, /* other non-alnum, except for ] and } */
+	X = GIT_CNTRL,
+	U = GIT_PUNCT,
+	Z = GIT_CNTRL | GIT_SPACE
 };
 
 const unsigned char sane_ctype[256] = {
-	0, 0, 0, 0, 0, 0, 0, 0, 0, S, S, 0, 0, S, 0, 0,		/*   0.. 15 */
-	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,		/*  16.. 31 */
+	X, X, X, X, X, X, X, X, X, Z, Z, X, X, Z, X, X,		/*   0.. 15 */
+	X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X,		/*  16.. 31 */
 	S, P, P, P, R, P, P, P, R, R, G, R, P, P, R, P,		/*  32.. 47 */
 	D, D, D, D, D, D, D, D, D, D, P, P, P, P, P, G,		/*  48.. 63 */
 	P, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,		/*  64.. 79 */
-	A, A, A, A, A, A, A, A, A, A, A, G, G, 0, R, P,		/*  80.. 95 */
+	A, A, A, A, A, A, A, A, A, A, A, G, G, U, R, P,		/*  80.. 95 */
 	P, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A,		/*  96..111 */
-	A, A, A, A, A, A, A, A, A, A, A, R, R, 0, P, 0,		/* 112..127 */
+	A, A, A, A, A, A, A, A, A, A, A, R, R, U, P, X,		/* 112..127 */
 	/* Nothing in the 128.. range */
 };
 
diff --git a/git-compat-util.h b/git-compat-util.h
index 80767ff..02f48f6 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -470,6 +470,10 @@ extern const char tolower_trans_tbl[256];
 #undef isupper
 #undef tolower
 #undef toupper
+#undef iscntrl
+#undef ispunct
+#undef isxdigit
+#undef isprint
 extern const unsigned char sane_ctype[256];
 #define GIT_SPACE 0x01
 #define GIT_DIGIT 0x02
@@ -477,6 +481,8 @@ extern const unsigned char sane_ctype[256];
 #define GIT_GLOB_SPECIAL 0x08
 #define GIT_REGEX_SPECIAL 0x10
 #define GIT_PATHSPEC_MAGIC 0x20
+#define GIT_CNTRL 0x40
+#define GIT_PUNCT 0x80
 #define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
 #define isascii(x) (((x) & ~0x7f) == 0)
 #define isspace(x) sane_istest(x,GIT_SPACE)
@@ -487,6 +493,13 @@ extern const unsigned char sane_ctype[256];
 #define isupper(x) sane_iscase(x, 0)
 #define is_glob_special(x) sane_istest(x,GIT_GLOB_SPECIAL)
 #define is_regex_special(x) sane_istest(x,GIT_GLOB_SPECIAL | GIT_REGEX_SPECIAL)
+#define iscntrl(x) (sane_istest(x,GIT_CNTRL))
+#define ispunct(x) sane_istest(x, GIT_PUNCT | GIT_REGEX_SPECIAL | \
+		GIT_GLOB_SPECIAL | GIT_PATHSPEC_MAGIC)
+#define isxdigit(x) (hexval_table[x] != -1)
+#define isprint(x) (sane_istest(x, GIT_ALPHA | GIT_DIGIT | GIT_SPACE | \
+		GIT_PUNCT | GIT_REGEX_SPECIAL | GIT_GLOB_SPECIAL | \
+		GIT_PATHSPEC_MAGIC))
 #define tolower(x) sane_case((unsigned char)(x), 0x20)
 #define toupper(x) sane_case((unsigned char)(x), 0)
 #define is_pathspec_magic(x) sane_istest(x,GIT_PATHSPEC_MAGIC)
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 03/13] Import wildmatch from rsync
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 02/13] ctype: support iscntrl, ispunct, isxdigit and isprint Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 04/13] wildmatch: remove unnecessary functions Nguyễn Thái Ngọc Duy
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 16393 bytes --]

These files are from rsync.git commit
f92f5b166e3019db42bc7fe1aa2f1a9178cd215d, which was the last commit
before rsync turned GPL-3. All files are imported as-is and
no-op. Adaptation is done in a separate patch.

rsync.git           ->  git.git
lib/wildmatch.[ch]      wildmatch.[ch]
wildtest.txt            t/t3070/wildtest.txt

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t3070/wildtest.txt | 165 +++++++++++++++++++++++
 wildmatch.c          | 368 +++++++++++++++++++++++++++++++++++++++++++++++++++
 wildmatch.h          |   6 +
 3 files changed, 539 insertions(+)
 create mode 100644 t/t3070/wildtest.txt
 create mode 100644 wildmatch.c
 create mode 100644 wildmatch.h

diff --git a/t/t3070/wildtest.txt b/t/t3070/wildtest.txt
new file mode 100644
index 0000000..42c1678
--- /dev/null
+++ b/t/t3070/wildtest.txt
@@ -0,0 +1,165 @@
+# Input is in the following format (all items white-space separated):
+#
+# The first two items are 1 or 0 indicating if the wildmat call is expected to
+# succeed and if fnmatch works the same way as wildmat, respectively.  After
+# that is a text string for the match, and a pattern string.  Strings can be
+# quoted (if desired) in either double or single quotes, as well as backticks.
+#
+# MATCH FNMATCH_SAME "text to match" 'pattern to use'
+
+# Basic wildmat features
+1 1 foo			foo
+0 1 foo			bar
+1 1 ''			""
+1 1 foo			???
+0 1 foo			??
+1 1 foo			*
+1 1 foo			f*
+0 1 foo			*f
+1 1 foo			*foo*
+1 1 foobar		*ob*a*r*
+1 1 aaaaaaabababab	*ab
+1 1 foo*		foo\*
+0 1 foobar		foo\*bar
+1 1 f\oo		f\\oo
+1 1 ball		*[al]?
+0 1 ten			[ten]
+1 1 ten			**[!te]
+0 1 ten			**[!ten]
+1 1 ten			t[a-g]n
+0 1 ten			t[!a-g]n
+1 1 ton			t[!a-g]n
+1 1 ton			t[^a-g]n
+1 1 a]b			a[]]b
+1 1 a-b			a[]-]b
+1 1 a]b			a[]-]b
+0 1 aab			a[]-]b
+1 1 aab			a[]a-]b
+1 1 ]			]
+
+# Extended slash-matching features
+0 1 foo/baz/bar		foo*bar
+1 1 foo/baz/bar		foo**bar
+0 1 foo/bar		foo?bar
+0 1 foo/bar		foo[/]bar
+0 1 foo/bar		f[^eiu][^eiu][^eiu][^eiu][^eiu]r
+1 1 foo-bar		f[^eiu][^eiu][^eiu][^eiu][^eiu]r
+0 1 foo			**/foo
+1 1 /foo		**/foo
+1 1 bar/baz/foo		**/foo
+0 1 bar/baz/foo		*/foo
+0 0 foo/bar/baz		**/bar*
+1 1 deep/foo/bar/baz	**/bar/*
+0 1 deep/foo/bar/baz/	**/bar/*
+1 1 deep/foo/bar/baz/	**/bar/**
+0 1 deep/foo/bar	**/bar/*
+1 1 deep/foo/bar/	**/bar/**
+1 1 foo/bar/baz		**/bar**
+1 1 foo/bar/baz/x	*/bar/**
+0 0 deep/foo/bar/baz/x	*/bar/**
+1 1 deep/foo/bar/baz/x	**/bar/*/*
+
+# Various additional tests
+0 1 acrt		a[c-c]st
+1 1 acrt		a[c-c]rt
+0 1 ]			[!]-]
+1 1 a			[!]-]
+0 1 ''			\
+0 1 \			\
+0 1 /\			*/\
+1 1 /\			*/\\
+1 1 foo			foo
+1 1 @foo		@foo
+0 1 foo			@foo
+1 1 [ab]		\[ab]
+1 1 [ab]		[[]ab]
+1 1 [ab]		[[:]ab]
+0 1 [ab]		[[::]ab]
+1 1 [ab]		[[:digit]ab]
+1 1 [ab]		[\[:]ab]
+1 1 ?a?b		\??\?b
+1 1 abc			\a\b\c
+0 1 foo			''
+1 1 foo/bar/baz/to	**/t[o]
+
+# Character class tests
+1 1 a1B		[[:alpha:]][[:digit:]][[:upper:]]
+0 1 a		[[:digit:][:upper:][:space:]]
+1 1 A		[[:digit:][:upper:][:space:]]
+1 1 1		[[:digit:][:upper:][:space:]]
+0 1 1		[[:digit:][:upper:][:spaci:]]
+1 1 ' '		[[:digit:][:upper:][:space:]]
+0 1 .		[[:digit:][:upper:][:space:]]
+1 1 .		[[:digit:][:punct:][:space:]]
+1 1 5		[[:xdigit:]]
+1 1 f		[[:xdigit:]]
+1 1 D		[[:xdigit:]]
+1 1 _		[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
+#1 1 …		[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
+1 1 \x7f		[^[:alnum:][:alpha:][:blank:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
+1 1 .		[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]
+1 1 5		[a-c[:digit:]x-z]
+1 1 b		[a-c[:digit:]x-z]
+1 1 y		[a-c[:digit:]x-z]
+0 1 q		[a-c[:digit:]x-z]
+
+# Additional tests, including some malformed wildmats
+1 1 ]		[\\-^]
+0 1 [		[\\-^]
+1 1 -		[\-_]
+1 1 ]		[\]]
+0 1 \]		[\]]
+0 1 \		[\]]
+0 1 ab		a[]b
+0 1 a[]b	a[]b
+0 1 ab[		ab[
+0 1 ab		[!
+0 1 ab		[-
+1 1 -		[-]
+0 1 -		[a-
+0 1 -		[!a-
+1 1 -		[--A]
+1 1 5		[--A]
+1 1 ' '		'[ --]'
+1 1 $		'[ --]'
+1 1 -		'[ --]'
+0 1 0		'[ --]'
+1 1 -		[---]
+1 1 -		[------]
+0 1 j		[a-e-n]
+1 1 -		[a-e-n]
+1 1 a		[!------]
+0 1 [		[]-a]
+1 1 ^		[]-a]
+0 1 ^		[!]-a]
+1 1 [		[!]-a]
+1 1 ^		[a^bc]
+1 1 -b]		[a-]b]
+0 1 \		[\]
+1 1 \		[\\]
+0 1 \		[!\\]
+1 1 G		[A-\\]
+0 1 aaabbb	b*a
+0 1 aabcaa	*ba*
+1 1 ,		[,]
+1 1 ,		[\\,]
+1 1 \		[\\,]
+1 1 -		[,-.]
+0 1 +		[,-.]
+0 1 -.]		[,-.]
+1 1 2		[\1-\3]
+1 1 3		[\1-\3]
+0 1 4		[\1-\3]
+1 1 \		[[-\]]
+1 1 [		[[-\]]
+1 1 ]		[[-\]]
+0 1 -		[[-\]]
+
+# Test recursion and the abort code (use "wildtest -i" to see iteration counts)
+1 1 -adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1	-*-*-*-*-*-*-12-*-*-*-m-*-*-*
+0 1 -adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1	-*-*-*-*-*-*-12-*-*-*-m-*-*-*
+0 1 -adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1	-*-*-*-*-*-*-12-*-*-*-m-*-*-*
+1 1 /adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1	/*/*/*/*/*/*/12/*/*/*/m/*/*/*
+0 1 /adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1	/*/*/*/*/*/*/12/*/*/*/m/*/*/*
+1 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt		**/*a*b*g*n*t
+0 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz		**/*a*b*g*n*t
diff --git a/wildmatch.c b/wildmatch.c
new file mode 100644
index 0000000..f3a1731
--- /dev/null
+++ b/wildmatch.c
@@ -0,0 +1,368 @@
+/*
+**  Do shell-style pattern matching for ?, \, [], and * characters.
+**  It is 8bit clean.
+**
+**  Written by Rich $alz, mirror!rs, Wed Nov 26 19:03:17 EST 1986.
+**  Rich $alz is now <rsalz@bbn.com>.
+**
+**  Modified by Wayne Davison to special-case '/' matching, to make '**'
+**  work differently than '*', and to fix the character-class code.
+*/
+
+#include "rsync.h"
+
+/* What character marks an inverted character class? */
+#define NEGATE_CLASS	'!'
+#define NEGATE_CLASS2	'^'
+
+#define FALSE 0
+#define TRUE 1
+#define ABORT_ALL -1
+#define ABORT_TO_STARSTAR -2
+
+#define CC_EQ(class, len, litmatch) ((len) == sizeof (litmatch)-1 \
+				    && *(class) == *(litmatch) \
+				    && strncmp((char*)class, litmatch, len) == 0)
+
+#if defined STDC_HEADERS || !defined isascii
+# define ISASCII(c) 1
+#else
+# define ISASCII(c) isascii(c)
+#endif
+
+#ifdef isblank
+# define ISBLANK(c) (ISASCII(c) && isblank(c))
+#else
+# define ISBLANK(c) ((c) == ' ' || (c) == '\t')
+#endif
+
+#ifdef isgraph
+# define ISGRAPH(c) (ISASCII(c) && isgraph(c))
+#else
+# define ISGRAPH(c) (ISASCII(c) && isprint(c) && !isspace(c))
+#endif
+
+#define ISPRINT(c) (ISASCII(c) && isprint(c))
+#define ISDIGIT(c) (ISASCII(c) && isdigit(c))
+#define ISALNUM(c) (ISASCII(c) && isalnum(c))
+#define ISALPHA(c) (ISASCII(c) && isalpha(c))
+#define ISCNTRL(c) (ISASCII(c) && iscntrl(c))
+#define ISLOWER(c) (ISASCII(c) && islower(c))
+#define ISPUNCT(c) (ISASCII(c) && ispunct(c))
+#define ISSPACE(c) (ISASCII(c) && isspace(c))
+#define ISUPPER(c) (ISASCII(c) && isupper(c))
+#define ISXDIGIT(c) (ISASCII(c) && isxdigit(c))
+
+#ifdef WILD_TEST_ITERATIONS
+int wildmatch_iteration_count;
+#endif
+
+static int force_lower_case = 0;
+
+/* Match pattern "p" against the a virtually-joined string consisting
+ * of "text" and any strings in array "a". */
+static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
+{
+    uchar p_ch;
+
+#ifdef WILD_TEST_ITERATIONS
+    wildmatch_iteration_count++;
+#endif
+
+    for ( ; (p_ch = *p) != '\0'; text++, p++) {
+	int matched, special;
+	uchar t_ch, prev_ch;
+	while ((t_ch = *text) == '\0') {
+	    if (*a == NULL) {
+		if (p_ch != '*')
+		    return ABORT_ALL;
+		break;
+	    }
+	    text = *a++;
+	}
+	if (force_lower_case && ISUPPER(t_ch))
+	    t_ch = tolower(t_ch);
+	switch (p_ch) {
+	  case '\\':
+	    /* Literal match with following character.  Note that the test
+	     * in "default" handles the p[1] == '\0' failure case. */
+	    p_ch = *++p;
+	    /* FALLTHROUGH */
+	  default:
+	    if (t_ch != p_ch)
+		return FALSE;
+	    continue;
+	  case '?':
+	    /* Match anything but '/'. */
+	    if (t_ch == '/')
+		return FALSE;
+	    continue;
+	  case '*':
+	    if (*++p == '*') {
+		while (*++p == '*') {}
+		special = TRUE;
+	    } else
+		special = FALSE;
+	    if (*p == '\0') {
+		/* Trailing "**" matches everything.  Trailing "*" matches
+		 * only if there are no more slash characters. */
+		if (!special) {
+		    do {
+			if (strchr((char*)text, '/') != NULL)
+			    return FALSE;
+		    } while ((text = *a++) != NULL);
+		}
+		return TRUE;
+	    }
+	    while (1) {
+		if (t_ch == '\0') {
+		    if ((text = *a++) == NULL)
+			break;
+		    t_ch = *text;
+		    continue;
+		}
+		if ((matched = dowild(p, text, a)) != FALSE) {
+		    if (!special || matched != ABORT_TO_STARSTAR)
+			return matched;
+		} else if (!special && t_ch == '/')
+		    return ABORT_TO_STARSTAR;
+		t_ch = *++text;
+	    }
+	    return ABORT_ALL;
+	  case '[':
+	    p_ch = *++p;
+#ifdef NEGATE_CLASS2
+	    if (p_ch == NEGATE_CLASS2)
+		p_ch = NEGATE_CLASS;
+#endif
+	    /* Assign literal TRUE/FALSE because of "matched" comparison. */
+	    special = p_ch == NEGATE_CLASS? TRUE : FALSE;
+	    if (special) {
+		/* Inverted character class. */
+		p_ch = *++p;
+	    }
+	    prev_ch = 0;
+	    matched = FALSE;
+	    do {
+		if (!p_ch)
+		    return ABORT_ALL;
+		if (p_ch == '\\') {
+		    p_ch = *++p;
+		    if (!p_ch)
+			return ABORT_ALL;
+		    if (t_ch == p_ch)
+			matched = TRUE;
+		} else if (p_ch == '-' && prev_ch && p[1] && p[1] != ']') {
+		    p_ch = *++p;
+		    if (p_ch == '\\') {
+			p_ch = *++p;
+			if (!p_ch)
+			    return ABORT_ALL;
+		    }
+		    if (t_ch <= p_ch && t_ch >= prev_ch)
+			matched = TRUE;
+		    p_ch = 0; /* This makes "prev_ch" get set to 0. */
+		} else if (p_ch == '[' && p[1] == ':') {
+		    const uchar *s;
+		    int i;
+		    for (s = p += 2; (p_ch = *p) && p_ch != ']'; p++) {} /*SHARED ITERATOR*/
+		    if (!p_ch)
+			return ABORT_ALL;
+		    i = p - s - 1;
+		    if (i < 0 || p[-1] != ':') {
+			/* Didn't find ":]", so treat like a normal set. */
+			p = s - 2;
+			p_ch = '[';
+			if (t_ch == p_ch)
+			    matched = TRUE;
+			continue;
+		    }
+		    if (CC_EQ(s,i, "alnum")) {
+			if (ISALNUM(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "alpha")) {
+			if (ISALPHA(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "blank")) {
+			if (ISBLANK(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "cntrl")) {
+			if (ISCNTRL(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "digit")) {
+			if (ISDIGIT(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "graph")) {
+			if (ISGRAPH(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "lower")) {
+			if (ISLOWER(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "print")) {
+			if (ISPRINT(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "punct")) {
+			if (ISPUNCT(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "space")) {
+			if (ISSPACE(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "upper")) {
+			if (ISUPPER(t_ch))
+			    matched = TRUE;
+		    } else if (CC_EQ(s,i, "xdigit")) {
+			if (ISXDIGIT(t_ch))
+			    matched = TRUE;
+		    } else /* malformed [:class:] string */
+			return ABORT_ALL;
+		    p_ch = 0; /* This makes "prev_ch" get set to 0. */
+		} else if (t_ch == p_ch)
+		    matched = TRUE;
+	    } while (prev_ch = p_ch, (p_ch = *++p) != ']');
+	    if (matched == special || t_ch == '/')
+		return FALSE;
+	    continue;
+	}
+    }
+
+    do {
+	if (*text)
+	    return FALSE;
+    } while ((text = *a++) != NULL);
+
+    return TRUE;
+}
+
+/* Match literal string "s" against the a virtually-joined string consisting
+ * of "text" and any strings in array "a". */
+static int doliteral(const uchar *s, const uchar *text, const uchar*const *a)
+{
+    for ( ; *s != '\0'; text++, s++) {
+	while (*text == '\0') {
+	    if ((text = *a++) == NULL)
+		return FALSE;
+	}
+	if (*text != *s)
+	    return FALSE;
+    }
+
+    do {
+	if (*text)
+	    return FALSE;
+    } while ((text = *a++) != NULL);
+
+    return TRUE;
+}
+
+/* Return the last "count" path elements from the concatenated string.
+ * We return a string pointer to the start of the string, and update the
+ * array pointer-pointer to point to any remaining string elements. */
+static const uchar *trailing_N_elements(const uchar*const **a_ptr, int count)
+{
+    const uchar*const *a = *a_ptr;
+    const uchar*const *first_a = a;
+
+    while (*a)
+	    a++;
+
+    while (a != first_a) {
+	const uchar *s = *--a;
+	s += strlen((char*)s);
+	while (--s >= *a) {
+	    if (*s == '/' && !--count) {
+		*a_ptr = a+1;
+		return s+1;
+	    }
+	}
+    }
+
+    if (count == 1) {
+	*a_ptr = a+1;
+	return *a;
+    }
+
+    return NULL;
+}
+
+/* Match the "pattern" against the "text" string. */
+int wildmatch(const char *pattern, const char *text)
+{
+    static const uchar *nomore[1]; /* A NULL pointer. */
+#ifdef WILD_TEST_ITERATIONS
+    wildmatch_iteration_count = 0;
+#endif
+    return dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+}
+
+/* Match the "pattern" against the forced-to-lower-case "text" string. */
+int iwildmatch(const char *pattern, const char *text)
+{
+    static const uchar *nomore[1]; /* A NULL pointer. */
+    int ret;
+#ifdef WILD_TEST_ITERATIONS
+    wildmatch_iteration_count = 0;
+#endif
+    force_lower_case = 1;
+    ret = dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+    force_lower_case = 0;
+    return ret;
+}
+
+/* Match pattern "p" against the a virtually-joined string consisting
+ * of all the pointers in array "texts" (which has a NULL pointer at the
+ * end).  The int "where" can be 0 (normal matching), > 0 (match only
+ * the trailing N slash-separated filename components of "texts"), or < 0
+ * (match the "pattern" at the start or after any slash in "texts"). */
+int wildmatch_array(const char *pattern, const char*const *texts, int where)
+{
+    const uchar *p = (const uchar*)pattern;
+    const uchar*const *a = (const uchar*const*)texts;
+    const uchar *text;
+    int matched;
+
+#ifdef WILD_TEST_ITERATIONS
+    wildmatch_iteration_count = 0;
+#endif
+
+    if (where > 0)
+	text = trailing_N_elements(&a, where);
+    else
+	text = *a++;
+    if (!text)
+	return FALSE;
+
+    if ((matched = dowild(p, text, a)) != TRUE && where < 0
+     && matched != ABORT_ALL) {
+	while (1) {
+	    if (*text == '\0') {
+		if ((text = (uchar*)*a++) == NULL)
+		    return FALSE;
+		continue;
+	    }
+	    if (*text++ == '/' && (matched = dowild(p, text, a)) != FALSE
+	     && matched != ABORT_TO_STARSTAR)
+		break;
+	}
+    }
+    return matched == TRUE;
+}
+
+/* Match literal string "s" against the a virtually-joined string consisting
+ * of all the pointers in array "texts" (which has a NULL pointer at the
+ * end).  The int "where" can be 0 (normal matching), or > 0 (match
+ * only the trailing N slash-separated filename components of "texts"). */
+int litmatch_array(const char *string, const char*const *texts, int where)
+{
+    const uchar *s = (const uchar*)string;
+    const uchar*const *a = (const uchar* const*)texts;
+    const uchar *text;
+
+    if (where > 0)
+	text = trailing_N_elements(&a, where);
+    else
+	text = *a++;
+    if (!text)
+	return FALSE;
+
+    return doliteral(s, text, a) == TRUE;
+}
diff --git a/wildmatch.h b/wildmatch.h
new file mode 100644
index 0000000..e7f1a35
--- /dev/null
+++ b/wildmatch.h
@@ -0,0 +1,6 @@
+/* wildmatch.h */
+
+int wildmatch(const char *pattern, const char *text);
+int iwildmatch(const char *pattern, const char *text);
+int wildmatch_array(const char *pattern, const char*const *texts, int where);
+int litmatch_array(const char *string, const char*const *texts, int where);
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 04/13] wildmatch: remove unnecessary functions
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 02/13] ctype: support iscntrl, ispunct, isxdigit and isprint Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 03/13] Import wildmatch from rsync Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 05/13] wildmatch: follow Git's coding convention Nguyễn Thái Ngọc Duy
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 wildmatch.c | 164 ++++--------------------------------------------------------
 wildmatch.h |   2 -
 2 files changed, 10 insertions(+), 156 deletions(-)

diff --git a/wildmatch.c b/wildmatch.c
index f3a1731..fae7397 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -53,33 +53,18 @@
 #define ISUPPER(c) (ISASCII(c) && isupper(c))
 #define ISXDIGIT(c) (ISASCII(c) && isxdigit(c))
 
-#ifdef WILD_TEST_ITERATIONS
-int wildmatch_iteration_count;
-#endif
-
 static int force_lower_case = 0;
 
-/* Match pattern "p" against the a virtually-joined string consisting
- * of "text" and any strings in array "a". */
-static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
+/* Match pattern "p" against "text" */
+static int dowild(const uchar *p, const uchar *text)
 {
     uchar p_ch;
 
-#ifdef WILD_TEST_ITERATIONS
-    wildmatch_iteration_count++;
-#endif
-
     for ( ; (p_ch = *p) != '\0'; text++, p++) {
 	int matched, special;
 	uchar t_ch, prev_ch;
-	while ((t_ch = *text) == '\0') {
-	    if (*a == NULL) {
-		if (p_ch != '*')
-		    return ABORT_ALL;
-		break;
-	    }
-	    text = *a++;
-	}
+	if ((t_ch = *text) == '\0' && p_ch != '*')
+		return ABORT_ALL;
 	if (force_lower_case && ISUPPER(t_ch))
 	    t_ch = tolower(t_ch);
 	switch (p_ch) {
@@ -107,21 +92,15 @@ static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
 		/* Trailing "**" matches everything.  Trailing "*" matches
 		 * only if there are no more slash characters. */
 		if (!special) {
-		    do {
 			if (strchr((char*)text, '/') != NULL)
 			    return FALSE;
-		    } while ((text = *a++) != NULL);
 		}
 		return TRUE;
 	    }
 	    while (1) {
-		if (t_ch == '\0') {
-		    if ((text = *a++) == NULL)
-			break;
-		    t_ch = *text;
-		    continue;
-		}
-		if ((matched = dowild(p, text, a)) != FALSE) {
+		if (t_ch == '\0')
+		    break;
+		if ((matched = dowild(p, text)) != FALSE) {
 		    if (!special || matched != ABORT_TO_STARSTAR)
 			return matched;
 		} else if (!special && t_ch == '/')
@@ -225,144 +204,21 @@ static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
 	}
     }
 
-    do {
-	if (*text)
-	    return FALSE;
-    } while ((text = *a++) != NULL);
-
-    return TRUE;
-}
-
-/* Match literal string "s" against the a virtually-joined string consisting
- * of "text" and any strings in array "a". */
-static int doliteral(const uchar *s, const uchar *text, const uchar*const *a)
-{
-    for ( ; *s != '\0'; text++, s++) {
-	while (*text == '\0') {
-	    if ((text = *a++) == NULL)
-		return FALSE;
-	}
-	if (*text != *s)
-	    return FALSE;
-    }
-
-    do {
-	if (*text)
-	    return FALSE;
-    } while ((text = *a++) != NULL);
-
-    return TRUE;
-}
-
-/* Return the last "count" path elements from the concatenated string.
- * We return a string pointer to the start of the string, and update the
- * array pointer-pointer to point to any remaining string elements. */
-static const uchar *trailing_N_elements(const uchar*const **a_ptr, int count)
-{
-    const uchar*const *a = *a_ptr;
-    const uchar*const *first_a = a;
-
-    while (*a)
-	    a++;
-
-    while (a != first_a) {
-	const uchar *s = *--a;
-	s += strlen((char*)s);
-	while (--s >= *a) {
-	    if (*s == '/' && !--count) {
-		*a_ptr = a+1;
-		return s+1;
-	    }
-	}
-    }
-
-    if (count == 1) {
-	*a_ptr = a+1;
-	return *a;
-    }
-
-    return NULL;
+    return *text ? FALSE : TRUE;
 }
 
 /* Match the "pattern" against the "text" string. */
 int wildmatch(const char *pattern, const char *text)
 {
-    static const uchar *nomore[1]; /* A NULL pointer. */
-#ifdef WILD_TEST_ITERATIONS
-    wildmatch_iteration_count = 0;
-#endif
-    return dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+    return dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
 }
 
 /* Match the "pattern" against the forced-to-lower-case "text" string. */
 int iwildmatch(const char *pattern, const char *text)
 {
-    static const uchar *nomore[1]; /* A NULL pointer. */
     int ret;
-#ifdef WILD_TEST_ITERATIONS
-    wildmatch_iteration_count = 0;
-#endif
     force_lower_case = 1;
-    ret = dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+    ret = dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
     force_lower_case = 0;
     return ret;
 }
-
-/* Match pattern "p" against the a virtually-joined string consisting
- * of all the pointers in array "texts" (which has a NULL pointer at the
- * end).  The int "where" can be 0 (normal matching), > 0 (match only
- * the trailing N slash-separated filename components of "texts"), or < 0
- * (match the "pattern" at the start or after any slash in "texts"). */
-int wildmatch_array(const char *pattern, const char*const *texts, int where)
-{
-    const uchar *p = (const uchar*)pattern;
-    const uchar*const *a = (const uchar*const*)texts;
-    const uchar *text;
-    int matched;
-
-#ifdef WILD_TEST_ITERATIONS
-    wildmatch_iteration_count = 0;
-#endif
-
-    if (where > 0)
-	text = trailing_N_elements(&a, where);
-    else
-	text = *a++;
-    if (!text)
-	return FALSE;
-
-    if ((matched = dowild(p, text, a)) != TRUE && where < 0
-     && matched != ABORT_ALL) {
-	while (1) {
-	    if (*text == '\0') {
-		if ((text = (uchar*)*a++) == NULL)
-		    return FALSE;
-		continue;
-	    }
-	    if (*text++ == '/' && (matched = dowild(p, text, a)) != FALSE
-	     && matched != ABORT_TO_STARSTAR)
-		break;
-	}
-    }
-    return matched == TRUE;
-}
-
-/* Match literal string "s" against the a virtually-joined string consisting
- * of all the pointers in array "texts" (which has a NULL pointer at the
- * end).  The int "where" can be 0 (normal matching), or > 0 (match
- * only the trailing N slash-separated filename components of "texts"). */
-int litmatch_array(const char *string, const char*const *texts, int where)
-{
-    const uchar *s = (const uchar*)string;
-    const uchar*const *a = (const uchar* const*)texts;
-    const uchar *text;
-
-    if (where > 0)
-	text = trailing_N_elements(&a, where);
-    else
-	text = *a++;
-    if (!text)
-	return FALSE;
-
-    return doliteral(s, text, a) == TRUE;
-}
diff --git a/wildmatch.h b/wildmatch.h
index e7f1a35..562faa3 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -2,5 +2,3 @@
 
 int wildmatch(const char *pattern, const char *text);
 int iwildmatch(const char *pattern, const char *text);
-int wildmatch_array(const char *pattern, const char*const *texts, int where);
-int litmatch_array(const char *string, const char*const *texts, int where);
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 05/13] wildmatch: follow Git's coding convention
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (2 preceding siblings ...)
  2012-10-15  6:25   ` [PATCH 04/13] wildmatch: remove unnecessary functions Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 06/13] Integrate wildmatch to git Nguyễn Thái Ngọc Duy
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

wildmatch's coding style is pretty close to Git's except the use of 4
space indentation instead of 8. This patch should produce empty diff
with "git diff -b"

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 wildmatch.c | 292 ++++++++++++++++++++++++++++++------------------------------
 1 file changed, 146 insertions(+), 146 deletions(-)

diff --git a/wildmatch.c b/wildmatch.c
index fae7397..4653dd6 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -58,167 +58,167 @@ static int force_lower_case = 0;
 /* Match pattern "p" against "text" */
 static int dowild(const uchar *p, const uchar *text)
 {
-    uchar p_ch;
+	uchar p_ch;
 
-    for ( ; (p_ch = *p) != '\0'; text++, p++) {
-	int matched, special;
-	uchar t_ch, prev_ch;
-	if ((t_ch = *text) == '\0' && p_ch != '*')
-		return ABORT_ALL;
-	if (force_lower_case && ISUPPER(t_ch))
-	    t_ch = tolower(t_ch);
-	switch (p_ch) {
-	  case '\\':
-	    /* Literal match with following character.  Note that the test
-	     * in "default" handles the p[1] == '\0' failure case. */
-	    p_ch = *++p;
-	    /* FALLTHROUGH */
-	  default:
-	    if (t_ch != p_ch)
-		return FALSE;
-	    continue;
-	  case '?':
-	    /* Match anything but '/'. */
-	    if (t_ch == '/')
-		return FALSE;
-	    continue;
-	  case '*':
-	    if (*++p == '*') {
-		while (*++p == '*') {}
-		special = TRUE;
-	    } else
-		special = FALSE;
-	    if (*p == '\0') {
-		/* Trailing "**" matches everything.  Trailing "*" matches
-		 * only if there are no more slash characters. */
-		if (!special) {
-			if (strchr((char*)text, '/') != NULL)
-			    return FALSE;
-		}
-		return TRUE;
-	    }
-	    while (1) {
-		if (t_ch == '\0')
-		    break;
-		if ((matched = dowild(p, text)) != FALSE) {
-		    if (!special || matched != ABORT_TO_STARSTAR)
-			return matched;
-		} else if (!special && t_ch == '/')
-		    return ABORT_TO_STARSTAR;
-		t_ch = *++text;
-	    }
-	    return ABORT_ALL;
-	  case '[':
-	    p_ch = *++p;
-#ifdef NEGATE_CLASS2
-	    if (p_ch == NEGATE_CLASS2)
-		p_ch = NEGATE_CLASS;
-#endif
-	    /* Assign literal TRUE/FALSE because of "matched" comparison. */
-	    special = p_ch == NEGATE_CLASS? TRUE : FALSE;
-	    if (special) {
-		/* Inverted character class. */
-		p_ch = *++p;
-	    }
-	    prev_ch = 0;
-	    matched = FALSE;
-	    do {
-		if (!p_ch)
-		    return ABORT_ALL;
-		if (p_ch == '\\') {
-		    p_ch = *++p;
-		    if (!p_ch)
+	for ( ; (p_ch = *p) != '\0'; text++, p++) {
+		int matched, special;
+		uchar t_ch, prev_ch;
+		if ((t_ch = *text) == '\0' && p_ch != '*')
 			return ABORT_ALL;
-		    if (t_ch == p_ch)
-			matched = TRUE;
-		} else if (p_ch == '-' && prev_ch && p[1] && p[1] != ']') {
-		    p_ch = *++p;
-		    if (p_ch == '\\') {
+		if (force_lower_case && ISUPPER(t_ch))
+			t_ch = tolower(t_ch);
+		switch (p_ch) {
+		case '\\':
+			/* Literal match with following character.  Note that the test
+			 * in "default" handles the p[1] == '\0' failure case. */
 			p_ch = *++p;
-			if (!p_ch)
-			    return ABORT_ALL;
-		    }
-		    if (t_ch <= p_ch && t_ch >= prev_ch)
-			matched = TRUE;
-		    p_ch = 0; /* This makes "prev_ch" get set to 0. */
-		} else if (p_ch == '[' && p[1] == ':') {
-		    const uchar *s;
-		    int i;
-		    for (s = p += 2; (p_ch = *p) && p_ch != ']'; p++) {} /*SHARED ITERATOR*/
-		    if (!p_ch)
-			return ABORT_ALL;
-		    i = p - s - 1;
-		    if (i < 0 || p[-1] != ':') {
-			/* Didn't find ":]", so treat like a normal set. */
-			p = s - 2;
-			p_ch = '[';
-			if (t_ch == p_ch)
-			    matched = TRUE;
+			/* FALLTHROUGH */
+		default:
+			if (t_ch != p_ch)
+				return FALSE;
+			continue;
+		case '?':
+			/* Match anything but '/'. */
+			if (t_ch == '/')
+				return FALSE;
 			continue;
-		    }
-		    if (CC_EQ(s,i, "alnum")) {
-			if (ISALNUM(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "alpha")) {
-			if (ISALPHA(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "blank")) {
-			if (ISBLANK(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "cntrl")) {
-			if (ISCNTRL(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "digit")) {
-			if (ISDIGIT(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "graph")) {
-			if (ISGRAPH(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "lower")) {
-			if (ISLOWER(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "print")) {
-			if (ISPRINT(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "punct")) {
-			if (ISPUNCT(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "space")) {
-			if (ISSPACE(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "upper")) {
-			if (ISUPPER(t_ch))
-			    matched = TRUE;
-		    } else if (CC_EQ(s,i, "xdigit")) {
-			if (ISXDIGIT(t_ch))
-			    matched = TRUE;
-		    } else /* malformed [:class:] string */
+		case '*':
+			if (*++p == '*') {
+				while (*++p == '*') {}
+				special = TRUE;
+			} else
+				special = FALSE;
+			if (*p == '\0') {
+				/* Trailing "**" matches everything.  Trailing "*" matches
+				 * only if there are no more slash characters. */
+				if (!special) {
+					if (strchr((char*)text, '/') != NULL)
+						return FALSE;
+				}
+				return TRUE;
+			}
+			while (1) {
+				if (t_ch == '\0')
+					break;
+				if ((matched = dowild(p, text)) != FALSE) {
+					if (!special || matched != ABORT_TO_STARSTAR)
+						return matched;
+				} else if (!special && t_ch == '/')
+					return ABORT_TO_STARSTAR;
+				t_ch = *++text;
+			}
 			return ABORT_ALL;
-		    p_ch = 0; /* This makes "prev_ch" get set to 0. */
-		} else if (t_ch == p_ch)
-		    matched = TRUE;
-	    } while (prev_ch = p_ch, (p_ch = *++p) != ']');
-	    if (matched == special || t_ch == '/')
-		return FALSE;
-	    continue;
+		case '[':
+			p_ch = *++p;
+#ifdef NEGATE_CLASS2
+			if (p_ch == NEGATE_CLASS2)
+				p_ch = NEGATE_CLASS;
+#endif
+			/* Assign literal TRUE/FALSE because of "matched" comparison. */
+			special = p_ch == NEGATE_CLASS? TRUE : FALSE;
+			if (special) {
+				/* Inverted character class. */
+				p_ch = *++p;
+			}
+			prev_ch = 0;
+			matched = FALSE;
+			do {
+				if (!p_ch)
+					return ABORT_ALL;
+				if (p_ch == '\\') {
+					p_ch = *++p;
+					if (!p_ch)
+						return ABORT_ALL;
+					if (t_ch == p_ch)
+						matched = TRUE;
+				} else if (p_ch == '-' && prev_ch && p[1] && p[1] != ']') {
+					p_ch = *++p;
+					if (p_ch == '\\') {
+						p_ch = *++p;
+						if (!p_ch)
+							return ABORT_ALL;
+					}
+					if (t_ch <= p_ch && t_ch >= prev_ch)
+						matched = TRUE;
+					p_ch = 0; /* This makes "prev_ch" get set to 0. */
+				} else if (p_ch == '[' && p[1] == ':') {
+					const uchar *s;
+					int i;
+					for (s = p += 2; (p_ch = *p) && p_ch != ']'; p++) {} /*SHARED ITERATOR*/
+					if (!p_ch)
+						return ABORT_ALL;
+					i = p - s - 1;
+					if (i < 0 || p[-1] != ':') {
+						/* Didn't find ":]", so treat like a normal set. */
+						p = s - 2;
+						p_ch = '[';
+						if (t_ch == p_ch)
+							matched = TRUE;
+						continue;
+					}
+					if (CC_EQ(s,i, "alnum")) {
+						if (ISALNUM(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "alpha")) {
+						if (ISALPHA(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "blank")) {
+						if (ISBLANK(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "cntrl")) {
+						if (ISCNTRL(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "digit")) {
+						if (ISDIGIT(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "graph")) {
+						if (ISGRAPH(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "lower")) {
+						if (ISLOWER(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "print")) {
+						if (ISPRINT(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "punct")) {
+						if (ISPUNCT(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "space")) {
+						if (ISSPACE(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "upper")) {
+						if (ISUPPER(t_ch))
+							matched = TRUE;
+					} else if (CC_EQ(s,i, "xdigit")) {
+						if (ISXDIGIT(t_ch))
+							matched = TRUE;
+					} else /* malformed [:class:] string */
+						return ABORT_ALL;
+					p_ch = 0; /* This makes "prev_ch" get set to 0. */
+				} else if (t_ch == p_ch)
+					matched = TRUE;
+			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
+			if (matched == special || t_ch == '/')
+				return FALSE;
+			continue;
+		}
 	}
-    }
 
-    return *text ? FALSE : TRUE;
+	return *text ? FALSE : TRUE;
 }
 
 /* Match the "pattern" against the "text" string. */
 int wildmatch(const char *pattern, const char *text)
 {
-    return dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
+	return dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
 }
 
 /* Match the "pattern" against the forced-to-lower-case "text" string. */
 int iwildmatch(const char *pattern, const char *text)
 {
-    int ret;
-    force_lower_case = 1;
-    ret = dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
-    force_lower_case = 0;
-    return ret;
+	int ret;
+	force_lower_case = 1;
+	ret = dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
+	force_lower_case = 0;
+	return ret;
 }
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 06/13] Integrate wildmatch to git
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (3 preceding siblings ...)
  2012-10-15  6:25   ` [PATCH 05/13] wildmatch: follow Git's coding convention Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 07/13] t3070: disable unreliable fnmatch tests Nguyễn Thái Ngọc Duy
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 13360 bytes --]

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 .gitignore           |   1 +
 Makefile             |   3 +
 t/t3070-wildmatch.sh | 188 +++++++++++++++++++++++++++++++++++++++++++++++++++
 t/t3070/wildtest.txt | 165 --------------------------------------------
 test-wildmatch.c     |  14 ++++
 wildmatch.c          |   5 +-
 6 files changed, 210 insertions(+), 166 deletions(-)
 create mode 100755 t/t3070-wildmatch.sh
 delete mode 100644 t/t3070/wildtest.txt
 create mode 100644 test-wildmatch.c

diff --git a/.gitignore b/.gitignore
index f1acd3e..1153a4b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -193,6 +193,7 @@
 /test-sigchain
 /test-subprocess
 /test-svn-fe
+/test-wildmatch
 /common-cmds.h
 *.tar.gz
 *.dsc
diff --git a/Makefile b/Makefile
index 13293d3..bc868d1 100644
--- a/Makefile
+++ b/Makefile
@@ -503,6 +503,7 @@ TEST_PROGRAMS_NEED_X += test-sha1
 TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
 
@@ -674,6 +675,7 @@ LIB_H += unpack-trees.h
 LIB_H += userdiff.h
 LIB_H += utf8.h
 LIB_H += varint.h
+LIB_H += wildmatch.h
 LIB_H += xdiff-interface.h
 LIB_H += xdiff/xdiff.h
 
@@ -803,6 +805,7 @@ LIB_OBJS += userdiff.o
 LIB_OBJS += utf8.o
 LIB_OBJS += varint.o
 LIB_OBJS += walker.o
+LIB_OBJS += wildmatch.o
 LIB_OBJS += wrapper.o
 LIB_OBJS += write_or_die.o
 LIB_OBJS += ws.o
diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
new file mode 100755
index 0000000..dbd3c8b
--- /dev/null
+++ b/t/t3070-wildmatch.sh
@@ -0,0 +1,188 @@
+#!/bin/sh
+
+test_description='wildmatch tests'
+
+. ./test-lib.sh
+
+match() {
+    if [ $1 = 1 ]; then
+	test_expect_success "wildmatch:    match '$3' '$4'" "
+	    test-wildmatch wildmatch '$3' '$4'
+	"
+    else
+	test_expect_success "wildmatch: no match '$3' '$4'" "
+	    ! test-wildmatch wildmatch '$3' '$4'
+	"
+    fi
+    if [ $2 = 1 ]; then
+	test_expect_success "fnmatch:      match '$3' '$4'" "
+	    test-wildmatch fnmatch '$3' '$4'
+	"
+    elif [ $2 = 0 ]; then
+	test_expect_success "fnmatch:   no match '$3' '$4'" "
+	    ! test-wildmatch fnmatch '$3' '$4'
+	"
+#    else
+#	test_expect_success BROKEN_FNMATCH "fnmatch:       '$3' '$4'" "
+#	    ! test-wildmatch fnmatch '$3' '$4'
+#	"
+    fi
+}
+
+# Basic wildmat features
+match 1 1 foo foo
+match 0 0 foo bar
+match 1 1 '' ""
+match 1 1 foo '???'
+match 0 0 foo '??'
+match 1 1 foo '*'
+match 1 1 foo 'f*'
+match 0 0 foo '*f'
+match 1 1 foo '*foo*'
+match 1 1 foobar '*ob*a*r*'
+match 1 1 aaaaaaabababab '*ab'
+match 1 1 'foo*' 'foo\*'
+match 0 0 foobar 'foo\*bar'
+match 1 1 'f\oo' 'f\\oo'
+match 1 1 ball '*[al]?'
+match 0 0 ten '[ten]'
+match 1 1 ten '**[!te]'
+match 0 0 ten '**[!ten]'
+match 1 1 ten 't[a-g]n'
+match 0 0 ten 't[!a-g]n'
+match 1 1 ton 't[!a-g]n'
+match 1 1 ton 't[^a-g]n'
+match 1 1 'a]b' 'a[]]b'
+match 1 1 a-b 'a[]-]b'
+match 1 1 'a]b' 'a[]-]b'
+match 0 0 aab 'a[]-]b'
+match 1 1 aab 'a[]a-]b'
+match 1 1 ']' ']'
+
+# Extended slash-matching features
+match 0 0 'foo/baz/bar' 'foo*bar'
+match 1 0 'foo/baz/bar' 'foo**bar'
+match 0 0 'foo/bar' 'foo?bar'
+match 0 0 'foo/bar' 'foo[/]bar'
+match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
+match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
+match 0 0 'foo' '**/foo'
+match 1 1 '/foo' '**/foo'
+match 1 0 'bar/baz/foo' '**/foo'
+match 0 0 'bar/baz/foo' '*/foo'
+match 0 0 'foo/bar/baz' '**/bar*'
+match 1 0 'deep/foo/bar/baz' '**/bar/*'
+match 0 0 'deep/foo/bar/baz/' '**/bar/*'
+match 1 0 'deep/foo/bar/baz/' '**/bar/**'
+match 0 0 'deep/foo/bar' '**/bar/*'
+match 1 0 'deep/foo/bar/' '**/bar/**'
+match 1 0 'foo/bar/baz' '**/bar**'
+match 1 0 'foo/bar/baz/x' '*/bar/**'
+match 0 0 'deep/foo/bar/baz/x' '*/bar/**'
+match 1 0 'deep/foo/bar/baz/x' '**/bar/*/*'
+
+# Various additional tests
+match 0 0 'acrt' 'a[c-c]st'
+match 1 1 'acrt' 'a[c-c]rt'
+match 0 0 ']' '[!]-]'
+match 1 1 'a' '[!]-]'
+match 0 0 '' '\'
+match 0 0 '\' '\'
+match 0 0 '/\' '*/\'
+match 1 1 '/\' '*/\\'
+match 1 1 'foo' 'foo'
+match 1 1 '@foo' '@foo'
+match 0 0 'foo' '@foo'
+match 1 1 '[ab]' '\[ab]'
+match 1 1 '[ab]' '[[]ab]'
+match 1 1 '[ab]' '[[:]ab]'
+match 0 0 '[ab]' '[[::]ab]'
+match 1 1 '[ab]' '[[:digit]ab]'
+match 1 1 '[ab]' '[\[:]ab]'
+match 1 1 '?a?b' '\??\?b'
+match 1 1 'abc' '\a\b\c'
+match 0 0 'foo' ''
+match 1 0 'foo/bar/baz/to' '**/t[o]'
+
+# Character class tests
+match 1 1 'a1B' '[[:alpha:]][[:digit:]][[:upper:]]'
+match 0 0 'a' '[[:digit:][:upper:][:space:]]'
+match 1 1 'A' '[[:digit:][:upper:][:space:]]'
+match 1 0 '1' '[[:digit:][:upper:][:space:]]'
+match 0 0 '1' '[[:digit:][:upper:][:spaci:]]'
+match 1 1 ' ' '[[:digit:][:upper:][:space:]]'
+match 0 0 '.' '[[:digit:][:upper:][:space:]]'
+match 1 1 '.' '[[:digit:][:punct:][:space:]]'
+match 1 1 '5' '[[:xdigit:]]'
+match 1 1 'f' '[[:xdigit:]]'
+match 1 1 'D' '[[:xdigit:]]'
+match 1 0 '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
+match 1 0 '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
+match 1 1 '.' '[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]'
+match 1 1 '5' '[a-c[:digit:]x-z]'
+match 1 1 'b' '[a-c[:digit:]x-z]'
+match 1 1 'y' '[a-c[:digit:]x-z]'
+match 0 0 'q' '[a-c[:digit:]x-z]'
+
+# Additional tests, including some malformed wildmats
+match 1 1 ']' '[\\-^]'
+match 0 0 '[' '[\\-^]'
+match 1 1 '-' '[\-_]'
+match 1 1 ']' '[\]]'
+match 0 0 '\]' '[\]]'
+match 0 0 '\' '[\]]'
+match 0 0 'ab' 'a[]b'
+match 0 1 'a[]b' 'a[]b'
+match 0 1 'ab[' 'ab['
+match 0 0 'ab' '[!'
+match 0 0 'ab' '[-'
+match 1 1 '-' '[-]'
+match 0 0 '-' '[a-'
+match 0 0 '-' '[!a-'
+match 1 1 '-' '[--A]'
+match 1 1 '5' '[--A]'
+match 1 1 ' ' '[ --]'
+match 1 1 '$' '[ --]'
+match 1 1 '-' '[ --]'
+match 0 0 '0' '[ --]'
+match 1 1 '-' '[---]'
+match 1 1 '-' '[------]'
+match 0 0 'j' '[a-e-n]'
+match 1 1 '-' '[a-e-n]'
+match 1 1 'a' '[!------]'
+match 0 0 '[' '[]-a]'
+match 1 1 '^' '[]-a]'
+match 0 0 '^' '[!]-a]'
+match 1 1 '[' '[!]-a]'
+match 1 1 '^' '[a^bc]'
+match 1 1 '-b]' '[a-]b]'
+match 0 0 '\' '[\]'
+match 1 1 '\' '[\\]'
+match 0 0 '\' '[!\\]'
+match 1 1 'G' '[A-\\]'
+match 0 0 'aaabbb' 'b*a'
+match 0 0 'aabcaa' '*ba*'
+match 1 1 ',' '[,]'
+match 1 1 ',' '[\\,]'
+match 1 1 '\' '[\\,]'
+match 1 1 '-' '[,-.]'
+match 0 0 '+' '[,-.]'
+match 0 0 '-.]' '[,-.]'
+match 1 1 '2' '[\1-\3]'
+match 1 1 '3' '[\1-\3]'
+match 0 0 '4' '[\1-\3]'
+match 1 1 '\' '[[-\]]'
+match 1 1 '[' '[[-\]]'
+match 1 1 ']' '[[-\]]'
+match 0 0 '-' '[[-\]]'
+
+# Test recursion and the abort code (use "wildtest -i" to see iteration counts)
+match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
+match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
+match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
+match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
+match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
+
+test_done
diff --git a/t/t3070/wildtest.txt b/t/t3070/wildtest.txt
deleted file mode 100644
index 42c1678..0000000
--- a/t/t3070/wildtest.txt
+++ /dev/null
@@ -1,165 +0,0 @@
-# Input is in the following format (all items white-space separated):
-#
-# The first two items are 1 or 0 indicating if the wildmat call is expected to
-# succeed and if fnmatch works the same way as wildmat, respectively.  After
-# that is a text string for the match, and a pattern string.  Strings can be
-# quoted (if desired) in either double or single quotes, as well as backticks.
-#
-# MATCH FNMATCH_SAME "text to match" 'pattern to use'
-
-# Basic wildmat features
-1 1 foo			foo
-0 1 foo			bar
-1 1 ''			""
-1 1 foo			???
-0 1 foo			??
-1 1 foo			*
-1 1 foo			f*
-0 1 foo			*f
-1 1 foo			*foo*
-1 1 foobar		*ob*a*r*
-1 1 aaaaaaabababab	*ab
-1 1 foo*		foo\*
-0 1 foobar		foo\*bar
-1 1 f\oo		f\\oo
-1 1 ball		*[al]?
-0 1 ten			[ten]
-1 1 ten			**[!te]
-0 1 ten			**[!ten]
-1 1 ten			t[a-g]n
-0 1 ten			t[!a-g]n
-1 1 ton			t[!a-g]n
-1 1 ton			t[^a-g]n
-1 1 a]b			a[]]b
-1 1 a-b			a[]-]b
-1 1 a]b			a[]-]b
-0 1 aab			a[]-]b
-1 1 aab			a[]a-]b
-1 1 ]			]
-
-# Extended slash-matching features
-0 1 foo/baz/bar		foo*bar
-1 1 foo/baz/bar		foo**bar
-0 1 foo/bar		foo?bar
-0 1 foo/bar		foo[/]bar
-0 1 foo/bar		f[^eiu][^eiu][^eiu][^eiu][^eiu]r
-1 1 foo-bar		f[^eiu][^eiu][^eiu][^eiu][^eiu]r
-0 1 foo			**/foo
-1 1 /foo		**/foo
-1 1 bar/baz/foo		**/foo
-0 1 bar/baz/foo		*/foo
-0 0 foo/bar/baz		**/bar*
-1 1 deep/foo/bar/baz	**/bar/*
-0 1 deep/foo/bar/baz/	**/bar/*
-1 1 deep/foo/bar/baz/	**/bar/**
-0 1 deep/foo/bar	**/bar/*
-1 1 deep/foo/bar/	**/bar/**
-1 1 foo/bar/baz		**/bar**
-1 1 foo/bar/baz/x	*/bar/**
-0 0 deep/foo/bar/baz/x	*/bar/**
-1 1 deep/foo/bar/baz/x	**/bar/*/*
-
-# Various additional tests
-0 1 acrt		a[c-c]st
-1 1 acrt		a[c-c]rt
-0 1 ]			[!]-]
-1 1 a			[!]-]
-0 1 ''			\
-0 1 \			\
-0 1 /\			*/\
-1 1 /\			*/\\
-1 1 foo			foo
-1 1 @foo		@foo
-0 1 foo			@foo
-1 1 [ab]		\[ab]
-1 1 [ab]		[[]ab]
-1 1 [ab]		[[:]ab]
-0 1 [ab]		[[::]ab]
-1 1 [ab]		[[:digit]ab]
-1 1 [ab]		[\[:]ab]
-1 1 ?a?b		\??\?b
-1 1 abc			\a\b\c
-0 1 foo			''
-1 1 foo/bar/baz/to	**/t[o]
-
-# Character class tests
-1 1 a1B		[[:alpha:]][[:digit:]][[:upper:]]
-0 1 a		[[:digit:][:upper:][:space:]]
-1 1 A		[[:digit:][:upper:][:space:]]
-1 1 1		[[:digit:][:upper:][:space:]]
-0 1 1		[[:digit:][:upper:][:spaci:]]
-1 1 ' '		[[:digit:][:upper:][:space:]]
-0 1 .		[[:digit:][:upper:][:space:]]
-1 1 .		[[:digit:][:punct:][:space:]]
-1 1 5		[[:xdigit:]]
-1 1 f		[[:xdigit:]]
-1 1 D		[[:xdigit:]]
-1 1 _		[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
-#1 1 …		[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
-1 1 \x7f		[^[:alnum:][:alpha:][:blank:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
-1 1 .		[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]
-1 1 5		[a-c[:digit:]x-z]
-1 1 b		[a-c[:digit:]x-z]
-1 1 y		[a-c[:digit:]x-z]
-0 1 q		[a-c[:digit:]x-z]
-
-# Additional tests, including some malformed wildmats
-1 1 ]		[\\-^]
-0 1 [		[\\-^]
-1 1 -		[\-_]
-1 1 ]		[\]]
-0 1 \]		[\]]
-0 1 \		[\]]
-0 1 ab		a[]b
-0 1 a[]b	a[]b
-0 1 ab[		ab[
-0 1 ab		[!
-0 1 ab		[-
-1 1 -		[-]
-0 1 -		[a-
-0 1 -		[!a-
-1 1 -		[--A]
-1 1 5		[--A]
-1 1 ' '		'[ --]'
-1 1 $		'[ --]'
-1 1 -		'[ --]'
-0 1 0		'[ --]'
-1 1 -		[---]
-1 1 -		[------]
-0 1 j		[a-e-n]
-1 1 -		[a-e-n]
-1 1 a		[!------]
-0 1 [		[]-a]
-1 1 ^		[]-a]
-0 1 ^		[!]-a]
-1 1 [		[!]-a]
-1 1 ^		[a^bc]
-1 1 -b]		[a-]b]
-0 1 \		[\]
-1 1 \		[\\]
-0 1 \		[!\\]
-1 1 G		[A-\\]
-0 1 aaabbb	b*a
-0 1 aabcaa	*ba*
-1 1 ,		[,]
-1 1 ,		[\\,]
-1 1 \		[\\,]
-1 1 -		[,-.]
-0 1 +		[,-.]
-0 1 -.]		[,-.]
-1 1 2		[\1-\3]
-1 1 3		[\1-\3]
-0 1 4		[\1-\3]
-1 1 \		[[-\]]
-1 1 [		[[-\]]
-1 1 ]		[[-\]]
-0 1 -		[[-\]]
-
-# Test recursion and the abort code (use "wildtest -i" to see iteration counts)
-1 1 -adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1	-*-*-*-*-*-*-12-*-*-*-m-*-*-*
-0 1 -adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1	-*-*-*-*-*-*-12-*-*-*-m-*-*-*
-0 1 -adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1	-*-*-*-*-*-*-12-*-*-*-m-*-*-*
-1 1 /adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1	/*/*/*/*/*/*/12/*/*/*/m/*/*/*
-0 1 /adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1	/*/*/*/*/*/*/12/*/*/*/m/*/*/*
-1 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt		**/*a*b*g*n*t
-0 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz		**/*a*b*g*n*t
diff --git a/test-wildmatch.c b/test-wildmatch.c
new file mode 100644
index 0000000..ac56420
--- /dev/null
+++ b/test-wildmatch.c
@@ -0,0 +1,14 @@
+#include "cache.h"
+#include "wildmatch.h"
+
+int main(int argc, char **argv)
+{
+	if (!strcmp(argv[1], "wildmatch"))
+		return wildmatch(argv[3], argv[2]) ? 0 : 1;
+	else if (!strcmp(argv[1], "iwildmatch"))
+		return iwildmatch(argv[3], argv[2]) ? 0 : 1;
+	else if (!strcmp(argv[1], "fnmatch"))
+		return !!fnmatch(argv[3], argv[2], FNM_PATHNAME);
+	else
+		return 1;
+}
diff --git a/wildmatch.c b/wildmatch.c
index 4653dd6..ac29471 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -9,7 +9,10 @@
 **  work differently than '*', and to fix the character-class code.
 */
 
-#include "rsync.h"
+#include "cache.h"
+#include "wildmatch.h"
+
+typedef unsigned char uchar;
 
 /* What character marks an inverted character class? */
 #define NEGATE_CLASS	'!'
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 07/13] t3070: disable unreliable fnmatch tests
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (4 preceding siblings ...)
  2012-10-15  6:25   ` [PATCH 06/13] Integrate wildmatch to git Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 08/13] wildmatch: make wildmatch's return value compatible with fnmatch Nguyễn Thái Ngọc Duy
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

These tests show different results on different fnmatch() versions. We
don't want to test fnmatch here. We want to make sure wildmatch
behavior matches fnmatch and that only makes sense in cases when
fnmatch() behaves consistently.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t3070-wildmatch.sh | 86 ++++++++++++++++++++++++++--------------------------
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index dbd3c8b..dd95b00 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -52,11 +52,11 @@ match 1 1 ten 't[a-g]n'
 match 0 0 ten 't[!a-g]n'
 match 1 1 ton 't[!a-g]n'
 match 1 1 ton 't[^a-g]n'
-match 1 1 'a]b' 'a[]]b'
-match 1 1 a-b 'a[]-]b'
-match 1 1 'a]b' 'a[]-]b'
-match 0 0 aab 'a[]-]b'
-match 1 1 aab 'a[]a-]b'
+match 1 x 'a]b' 'a[]]b'
+match 1 x a-b 'a[]-]b'
+match 1 x 'a]b' 'a[]-]b'
+match 0 x aab 'a[]-]b'
+match 1 x aab 'a[]a-]b'
 match 1 1 ']' ']'
 
 # Extended slash-matching features
@@ -67,7 +67,7 @@ match 0 0 'foo/bar' 'foo[/]bar'
 match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 0 0 'foo' '**/foo'
-match 1 1 '/foo' '**/foo'
+match 1 x '/foo' '**/foo'
 match 1 0 'bar/baz/foo' '**/foo'
 match 0 0 'bar/baz/foo' '*/foo'
 match 0 0 'foo/bar/baz' '**/bar*'
@@ -85,77 +85,77 @@ match 1 0 'deep/foo/bar/baz/x' '**/bar/*/*'
 match 0 0 'acrt' 'a[c-c]st'
 match 1 1 'acrt' 'a[c-c]rt'
 match 0 0 ']' '[!]-]'
-match 1 1 'a' '[!]-]'
+match 1 x 'a' '[!]-]'
 match 0 0 '' '\'
-match 0 0 '\' '\'
-match 0 0 '/\' '*/\'
-match 1 1 '/\' '*/\\'
+match 0 x '\' '\'
+match 0 x '/\' '*/\'
+match 1 x '/\' '*/\\'
 match 1 1 'foo' 'foo'
 match 1 1 '@foo' '@foo'
 match 0 0 'foo' '@foo'
 match 1 1 '[ab]' '\[ab]'
 match 1 1 '[ab]' '[[]ab]'
-match 1 1 '[ab]' '[[:]ab]'
-match 0 0 '[ab]' '[[::]ab]'
-match 1 1 '[ab]' '[[:digit]ab]'
-match 1 1 '[ab]' '[\[:]ab]'
+match 1 x '[ab]' '[[:]ab]'
+match 0 x '[ab]' '[[::]ab]'
+match 1 x '[ab]' '[[:digit]ab]'
+match 1 x '[ab]' '[\[:]ab]'
 match 1 1 '?a?b' '\??\?b'
 match 1 1 'abc' '\a\b\c'
 match 0 0 'foo' ''
 match 1 0 'foo/bar/baz/to' '**/t[o]'
 
 # Character class tests
-match 1 1 'a1B' '[[:alpha:]][[:digit:]][[:upper:]]'
-match 0 0 'a' '[[:digit:][:upper:][:space:]]'
-match 1 1 'A' '[[:digit:][:upper:][:space:]]'
-match 1 0 '1' '[[:digit:][:upper:][:space:]]'
-match 0 0 '1' '[[:digit:][:upper:][:spaci:]]'
-match 1 1 ' ' '[[:digit:][:upper:][:space:]]'
-match 0 0 '.' '[[:digit:][:upper:][:space:]]'
-match 1 1 '.' '[[:digit:][:punct:][:space:]]'
+match 1 x 'a1B' '[[:alpha:]][[:digit:]][[:upper:]]'
+match 0 x 'a' '[[:digit:][:upper:][:space:]]'
+match 1 x 'A' '[[:digit:][:upper:][:space:]]'
+match 1 x '1' '[[:digit:][:upper:][:space:]]'
+match 0 x '1' '[[:digit:][:upper:][:spaci:]]'
+match 1 x ' ' '[[:digit:][:upper:][:space:]]'
+match 0 x '.' '[[:digit:][:upper:][:space:]]'
+match 1 x '.' '[[:digit:][:punct:][:space:]]'
 match 1 1 '5' '[[:xdigit:]]'
 match 1 1 'f' '[[:xdigit:]]'
 match 1 1 'D' '[[:xdigit:]]'
-match 1 0 '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
-match 1 0 '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
-match 1 1 '.' '[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]'
-match 1 1 '5' '[a-c[:digit:]x-z]'
-match 1 1 'b' '[a-c[:digit:]x-z]'
-match 1 1 'y' '[a-c[:digit:]x-z]'
-match 0 0 'q' '[a-c[:digit:]x-z]'
+match 1 x '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
+match 1 x '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
+match 1 x '.' '[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]'
+match 1 x '5' '[a-c[:digit:]x-z]'
+match 1 x 'b' '[a-c[:digit:]x-z]'
+match 1 x 'y' '[a-c[:digit:]x-z]'
+match 0 x 'q' '[a-c[:digit:]x-z]'
 
 # Additional tests, including some malformed wildmats
-match 1 1 ']' '[\\-^]'
+match 1 x ']' '[\\-^]'
 match 0 0 '[' '[\\-^]'
-match 1 1 '-' '[\-_]'
-match 1 1 ']' '[\]]'
+match 1 x '-' '[\-_]'
+match 1 x ']' '[\]]'
 match 0 0 '\]' '[\]]'
 match 0 0 '\' '[\]]'
 match 0 0 'ab' 'a[]b'
-match 0 1 'a[]b' 'a[]b'
-match 0 1 'ab[' 'ab['
+match 0 x 'a[]b' 'a[]b'
+match 0 x 'ab[' 'ab['
 match 0 0 'ab' '[!'
 match 0 0 'ab' '[-'
 match 1 1 '-' '[-]'
 match 0 0 '-' '[a-'
 match 0 0 '-' '[!a-'
-match 1 1 '-' '[--A]'
-match 1 1 '5' '[--A]'
+match 1 x '-' '[--A]'
+match 1 x '5' '[--A]'
 match 1 1 ' ' '[ --]'
 match 1 1 '$' '[ --]'
 match 1 1 '-' '[ --]'
 match 0 0 '0' '[ --]'
-match 1 1 '-' '[---]'
-match 1 1 '-' '[------]'
+match 1 x '-' '[---]'
+match 1 x '-' '[------]'
 match 0 0 'j' '[a-e-n]'
-match 1 1 '-' '[a-e-n]'
-match 1 1 'a' '[!------]'
+match 1 x '-' '[a-e-n]'
+match 1 x 'a' '[!------]'
 match 0 0 '[' '[]-a]'
-match 1 1 '^' '[]-a]'
+match 1 x '^' '[]-a]'
 match 0 0 '^' '[!]-a]'
-match 1 1 '[' '[!]-a]'
+match 1 x '[' '[!]-a]'
 match 1 1 '^' '[a^bc]'
-match 1 1 '-b]' '[a-]b]'
+match 1 x '-b]' '[a-]b]'
 match 0 0 '\' '[\]'
 match 1 1 '\' '[\\]'
 match 0 0 '\' '[!\\]'
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 08/13] wildmatch: make wildmatch's return value compatible with fnmatch
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (5 preceding siblings ...)
  2012-10-15  6:25   ` [PATCH 07/13] t3070: disable unreliable fnmatch tests Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 09/13] wildmatch: remove static variable force_lower_case Nguyễn Thái Ngọc Duy
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

wildmatch returns non-zero if matched, zero otherwise. This patch
makes it return zero if matches, non-zero otherwise, like fnmatch().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 test-wildmatch.c |  4 ++--
 wildmatch.c      | 21 ++++++++++++---------
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/test-wildmatch.c b/test-wildmatch.c
index ac56420..77014e9 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -4,9 +4,9 @@
 int main(int argc, char **argv)
 {
 	if (!strcmp(argv[1], "wildmatch"))
-		return wildmatch(argv[3], argv[2]) ? 0 : 1;
+		return !!wildmatch(argv[3], argv[2]);
 	else if (!strcmp(argv[1], "iwildmatch"))
-		return iwildmatch(argv[3], argv[2]) ? 0 : 1;
+		return !!iwildmatch(argv[3], argv[2]);
 	else if (!strcmp(argv[1], "fnmatch"))
 		return !!fnmatch(argv[3], argv[2], FNM_PATHNAME);
 	else
diff --git a/wildmatch.c b/wildmatch.c
index ac29471..6d992d3 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -20,6 +20,9 @@ typedef unsigned char uchar;
 
 #define FALSE 0
 #define TRUE 1
+
+#define NOMATCH 1
+#define MATCH 0
 #define ABORT_ALL -1
 #define ABORT_TO_STARSTAR -2
 
@@ -78,12 +81,12 @@ static int dowild(const uchar *p, const uchar *text)
 			/* FALLTHROUGH */
 		default:
 			if (t_ch != p_ch)
-				return FALSE;
+				return NOMATCH;
 			continue;
 		case '?':
 			/* Match anything but '/'. */
 			if (t_ch == '/')
-				return FALSE;
+				return NOMATCH;
 			continue;
 		case '*':
 			if (*++p == '*') {
@@ -96,14 +99,14 @@ static int dowild(const uchar *p, const uchar *text)
 				 * only if there are no more slash characters. */
 				if (!special) {
 					if (strchr((char*)text, '/') != NULL)
-						return FALSE;
+						return NOMATCH;
 				}
-				return TRUE;
+				return MATCH;
 			}
 			while (1) {
 				if (t_ch == '\0')
 					break;
-				if ((matched = dowild(p, text)) != FALSE) {
+				if ((matched = dowild(p, text)) != NOMATCH) {
 					if (!special || matched != ABORT_TO_STARSTAR)
 						return matched;
 				} else if (!special && t_ch == '/')
@@ -202,18 +205,18 @@ static int dowild(const uchar *p, const uchar *text)
 					matched = TRUE;
 			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
 			if (matched == special || t_ch == '/')
-				return FALSE;
+				return NOMATCH;
 			continue;
 		}
 	}
 
-	return *text ? FALSE : TRUE;
+	return *text ? NOMATCH : MATCH;
 }
 
 /* Match the "pattern" against the "text" string. */
 int wildmatch(const char *pattern, const char *text)
 {
-	return dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
+	return dowild((const uchar*)pattern, (const uchar*)text);
 }
 
 /* Match the "pattern" against the forced-to-lower-case "text" string. */
@@ -221,7 +224,7 @@ int iwildmatch(const char *pattern, const char *text)
 {
 	int ret;
 	force_lower_case = 1;
-	ret = dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
+	ret = dowild((const uchar*)pattern, (const uchar*)text);
 	force_lower_case = 0;
 	return ret;
 }
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 09/13] wildmatch: remove static variable force_lower_case
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (6 preceding siblings ...)
  2012-10-15  6:25   ` [PATCH 08/13] wildmatch: make wildmatch's return value compatible with fnmatch Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:25   ` [PATCH 10/13] wildmatch: fix case-insensitive matching Nguyễn Thái Ngọc Duy
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

One place less to worry about thread safety. Also combine wildmatch
and iwildmatch into one.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 test-wildmatch.c |  4 ++--
 wildmatch.c      | 21 +++++----------------
 wildmatch.h      |  3 +--
 3 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/test-wildmatch.c b/test-wildmatch.c
index 77014e9..74c0864 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -4,9 +4,9 @@
 int main(int argc, char **argv)
 {
 	if (!strcmp(argv[1], "wildmatch"))
-		return !!wildmatch(argv[3], argv[2]);
+		return !!wildmatch(argv[3], argv[2], 0);
 	else if (!strcmp(argv[1], "iwildmatch"))
-		return !!iwildmatch(argv[3], argv[2]);
+		return !!wildmatch(argv[3], argv[2], FNM_CASEFOLD);
 	else if (!strcmp(argv[1], "fnmatch"))
 		return !!fnmatch(argv[3], argv[2], FNM_PATHNAME);
 	else
diff --git a/wildmatch.c b/wildmatch.c
index 6d992d3..eef7b13 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -59,10 +59,8 @@ typedef unsigned char uchar;
 #define ISUPPER(c) (ISASCII(c) && isupper(c))
 #define ISXDIGIT(c) (ISASCII(c) && isxdigit(c))
 
-static int force_lower_case = 0;
-
 /* Match pattern "p" against "text" */
-static int dowild(const uchar *p, const uchar *text)
+static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 {
 	uchar p_ch;
 
@@ -106,7 +104,7 @@ static int dowild(const uchar *p, const uchar *text)
 			while (1) {
 				if (t_ch == '\0')
 					break;
-				if ((matched = dowild(p, text)) != NOMATCH) {
+				if ((matched = dowild(p, text,  force_lower_case)) != NOMATCH) {
 					if (!special || matched != ABORT_TO_STARSTAR)
 						return matched;
 				} else if (!special && t_ch == '/')
@@ -214,17 +212,8 @@ static int dowild(const uchar *p, const uchar *text)
 }
 
 /* Match the "pattern" against the "text" string. */
-int wildmatch(const char *pattern, const char *text)
-{
-	return dowild((const uchar*)pattern, (const uchar*)text);
-}
-
-/* Match the "pattern" against the forced-to-lower-case "text" string. */
-int iwildmatch(const char *pattern, const char *text)
+int wildmatch(const char *pattern, const char *text, int flags)
 {
-	int ret;
-	force_lower_case = 1;
-	ret = dowild((const uchar*)pattern, (const uchar*)text);
-	force_lower_case = 0;
-	return ret;
+	return dowild((const uchar*)pattern, (const uchar*)text,
+		      flags & FNM_CASEFOLD ? 1 :0);
 }
diff --git a/wildmatch.h b/wildmatch.h
index 562faa3..e974f9a 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -1,4 +1,3 @@
 /* wildmatch.h */
 
-int wildmatch(const char *pattern, const char *text);
-int iwildmatch(const char *pattern, const char *text);
+int wildmatch(const char *pattern, const char *text, int flags);
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 10/13] wildmatch: fix case-insensitive matching
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (7 preceding siblings ...)
  2012-10-15  6:25   ` [PATCH 09/13] wildmatch: remove static variable force_lower_case Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:25   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:26   ` [PATCH 11/13] wildmatch: adjust "**" behavior Nguyễn Thái Ngọc Duy
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

dowild() does case insensitive matching by lower-casing the text. That
means lower case letters in patterns imply case-insensitive matching,
but upper case means exact matching.

We do not want that subtlety. Lower case pattern too so iwildmatch()
always does what we expect it to do.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 wildmatch.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/wildmatch.c b/wildmatch.c
index eef7b13..5469866 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -71,6 +71,8 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 			return ABORT_ALL;
 		if (force_lower_case && ISUPPER(t_ch))
 			t_ch = tolower(t_ch);
+		if (force_lower_case && ISUPPER(p_ch))
+			p_ch = tolower(p_ch);
 		switch (p_ch) {
 		case '\\':
 			/* Literal match with following character.  Note that the test
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 11/13] wildmatch: adjust "**" behavior
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (8 preceding siblings ...)
  2012-10-15  6:25   ` [PATCH 10/13] wildmatch: fix case-insensitive matching Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:26   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:26   ` [PATCH 12/13] wildmatch: make /**/ match zero or more directories Nguyễn Thái Ngọc Duy
  2012-10-15  6:26   ` [PATCH 13/13] Support "**" wildcard in .gitignore and .gitattributes Nguyễn Thái Ngọc Duy
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Standard wildmatch() sees consecutive asterisks as "*" that can also
match slashes. But that may be hard to explain to users as
"abc/**/def" can match "abcdef", "abcxyzdef", "abc/def", "abc/x/def",
"abc/x/y/def"...

This patch changes wildmatch so that users can do

- "**/def" -> all paths ending with file/directory 'def'
- "abc/**" - equivalent to "/abc/"
- "abc/**/def" -> "abc/x/def", "abc/x/y/def"...
- otherwise consider the pattern malformed if "**" is found

Basically the magic of "**" only remains if it's wrapped around by
slashes.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t3070-wildmatch.sh |  5 +++--
 wildmatch.c          | 13 +++++++------
 wildmatch.h          |  6 ++++++
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index dd95b00..15848d5 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -46,7 +46,7 @@ match 0 0 foobar 'foo\*bar'
 match 1 1 'f\oo' 'f\\oo'
 match 1 1 ball '*[al]?'
 match 0 0 ten '[ten]'
-match 1 1 ten '**[!te]'
+match 0 1 ten '**[!te]'
 match 0 0 ten '**[!ten]'
 match 1 1 ten 't[a-g]n'
 match 0 0 ten 't[!a-g]n'
@@ -61,7 +61,8 @@ match 1 1 ']' ']'
 
 # Extended slash-matching features
 match 0 0 'foo/baz/bar' 'foo*bar'
-match 1 0 'foo/baz/bar' 'foo**bar'
+match 0 0 'foo/baz/bar' 'foo**bar'
+match 0 1 'foobazbar' 'foo**bar'
 match 0 0 'foo/bar' 'foo?bar'
 match 0 0 'foo/bar' 'foo[/]bar'
 match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
diff --git a/wildmatch.c b/wildmatch.c
index 5469866..85bc0df 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -21,11 +21,6 @@ typedef unsigned char uchar;
 #define FALSE 0
 #define TRUE 1
 
-#define NOMATCH 1
-#define MATCH 0
-#define ABORT_ALL -1
-#define ABORT_TO_STARSTAR -2
-
 #define CC_EQ(class, len, litmatch) ((len) == sizeof (litmatch)-1 \
 				    && *(class) == *(litmatch) \
 				    && strncmp((char*)class, litmatch, len) == 0)
@@ -90,8 +85,14 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 			continue;
 		case '*':
 			if (*++p == '*') {
+				const uchar *prev_p = p - 2;
 				while (*++p == '*') {}
-				special = TRUE;
+				if ((prev_p == text || *prev_p == '/') ||
+				    (*p == '\0' || *p == '/' ||
+				     (p[0] == '\\' && p[1] == '/'))) {
+					special = TRUE;
+				} else
+					return ABORT_MALFORMED;
 			} else
 				special = FALSE;
 			if (*p == '\0') {
diff --git a/wildmatch.h b/wildmatch.h
index e974f9a..984a38c 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -1,3 +1,9 @@
 /* wildmatch.h */
 
+#define ABORT_MALFORMED 2
+#define NOMATCH 1
+#define MATCH 0
+#define ABORT_ALL -1
+#define ABORT_TO_STARSTAR -2
+
 int wildmatch(const char *pattern, const char *text, int flags);
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 12/13] wildmatch: make /**/ match zero or more directories
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (9 preceding siblings ...)
  2012-10-15  6:26   ` [PATCH 11/13] wildmatch: adjust "**" behavior Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:26   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:26   ` [PATCH 13/13] Support "**" wildcard in .gitignore and .gitattributes Nguyễn Thái Ngọc Duy
  11 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

"foo/**/bar" matches "foo/x/bar", "foo/x/y/bar"... but not
"foo/bar". We make a special case, when foo/**/ is detected (and
"foo/" part is already matched), try matching "bar" with the rest of
the string.

"Match one or more directories" semantics can be easily achieved using
"foo/*/**/bar".

This also makes "**/foo" match "foo" in addition to "x/foo",
"x/y/foo"..

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t3070-wildmatch.sh |  8 +++++++-
 wildmatch.c          | 12 ++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 15848d5..e6ad6f4 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -63,11 +63,17 @@ match 1 1 ']' ']'
 match 0 0 'foo/baz/bar' 'foo*bar'
 match 0 0 'foo/baz/bar' 'foo**bar'
 match 0 1 'foobazbar' 'foo**bar'
+match 1 1 'foo/baz/bar' 'foo/**/bar'
+match 1 0 'foo/baz/bar' 'foo/**/**/bar'
+match 1 0 'foo/b/a/z/bar' 'foo/**/bar'
+match 1 0 'foo/b/a/z/bar' 'foo/**/**/bar'
+match 1 0 'foo/bar' 'foo/**/bar'
+match 1 0 'foo/bar' 'foo/**/**/bar'
 match 0 0 'foo/bar' 'foo?bar'
 match 0 0 'foo/bar' 'foo[/]bar'
 match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
-match 0 0 'foo' '**/foo'
+match 1 0 'foo' '**/foo'
 match 1 x '/foo' '**/foo'
 match 1 0 'bar/baz/foo' '**/foo'
 match 0 0 'bar/baz/foo' '*/foo'
diff --git a/wildmatch.c b/wildmatch.c
index 85bc0df..3972e26 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -90,6 +90,18 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 				if ((prev_p == text || *prev_p == '/') ||
 				    (*p == '\0' || *p == '/' ||
 				     (p[0] == '\\' && p[1] == '/'))) {
+					/*
+					 * Assuming we already match 'foo/' and are at
+					 * <star star slash>, just assume it matches
+					 * nothing and go ahead match the rest of the
+					 * pattern with the remaining string. This
+					 * helps make foo/<*><*>/bar (<> because
+					 * otherwise it breaks C comment syntax) match
+					 * both foo/bar and foo/a/bar.
+					 */
+					if (p[0] == '/' &&
+					    dowild(p + 1, text, force_lower_case) == MATCH)
+						return MATCH;
 					special = TRUE;
 				} else
 					return ABORT_MALFORMED;
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 13/13] Support "**" wildcard in .gitignore and .gitattributes
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
                     ` (10 preceding siblings ...)
  2012-10-15  6:26   ` [PATCH 12/13] wildmatch: make /**/ match zero or more directories Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:26   ` Nguyễn Thái Ngọc Duy
  2012-11-04 21:00     ` [PATCH 14/13] wildmatch: fix tests that fail on Windows due to path mangling Johannes Sixt
  11 siblings, 1 reply; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/gitignore.txt        | 19 +++++++++++++++++++
 dir.c                              |  4 +++-
 t/t0003-attributes.sh              | 37 +++++++++++++++++++++++++++++++++++++
 t/t3001-ls-files-others-exclude.sh | 18 ++++++++++++++++++
 4 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index 2e7328b..d4747ce 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -96,6 +96,25 @@ PATTERN FORMAT
    For example, "/{asterisk}.c" matches "cat-file.c" but not
    "mozilla-sha1/sha1.c".
 
+Two consecutive asterisks ("`**`") in patterns matched against
+full pathname may have special meaning:
+
+ - A leading "`**`" followed by a slash means match in all
+   directories. For example, "`**/foo`" matches file or directory
+   "`foo`" anywhere, the same as pattern "`foo`". "**/foo/bar"
+   matches file or directory "`bar`" anywhere that is directly
+   under directory "`foo`".
+
+ - A trailing "/**" matches everything inside. For example,
+   "abc/**" matches all files inside directory "abc", relative
+   to the location of the `.gitignore` file, with infinite depth.
+
+ - A slash followed by two consecutive asterisks then a slash
+   matches zero or more directories. For example, "`a/**/b`"
+   matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
+
+ - Other consecutive asterisks are considered invalid.
+
 NOTES
 -----
 
diff --git a/dir.c b/dir.c
index ee8e711..cb7328b 100644
--- a/dir.c
+++ b/dir.c
@@ -8,6 +8,7 @@
 #include "cache.h"
 #include "dir.h"
 #include "refs.h"
+#include "wildmatch.h"
 
 struct path_simplify {
 	int len;
@@ -593,7 +594,8 @@ int match_pathname(const char *pathname, int pathlen,
 		namelen -= prefix;
 	}
 
-	return fnmatch_icase(pattern, name, FNM_PATHNAME) == 0;
+	return wildmatch(pattern, name,
+			 ignore_case ? FNM_CASEFOLD : 0) == 0;
 }
 
 /* Scan the list and let the last match determine the fate.
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f6c21ea..c962403 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -216,6 +216,43 @@ test_expect_success 'patterns starting with exclamation' '
 	attr_check "!f" foo
 '
 
+test_expect_success '"**" test' '
+	echo "**/f foo=bar" >.gitattributes &&
+	cat <<\EOF >expect &&
+f: foo: bar
+a/f: foo: bar
+a/b/f: foo: bar
+a/b/c/f: foo: bar
+EOF
+	git check-attr foo -- "f" >actual 2>err &&
+	git check-attr foo -- "a/f" >>actual 2>>err &&
+	git check-attr foo -- "a/b/f" >>actual 2>>err &&
+	git check-attr foo -- "a/b/c/f" >>actual 2>>err &&
+	test_cmp expect actual &&
+	test_line_count = 0 err
+'
+
+test_expect_success '"**" with no slashes test' '
+	echo "a**f foo=bar" >.gitattributes &&
+	git check-attr foo -- "f" >actual &&
+	cat <<\EOF >expect &&
+f: foo: unspecified
+af: foo: bar
+axf: foo: bar
+a/f: foo: unspecified
+a/b/f: foo: unspecified
+a/b/c/f: foo: unspecified
+EOF
+	git check-attr foo -- "f" >actual 2>err &&
+	git check-attr foo -- "af" >>actual 2>err &&
+	git check-attr foo -- "axf" >>actual 2>err &&
+	git check-attr foo -- "a/f" >>actual 2>>err &&
+	git check-attr foo -- "a/b/f" >>actual 2>>err &&
+	git check-attr foo -- "a/b/c/f" >>actual 2>>err &&
+	test_cmp expect actual &&
+	test_line_count = 0 err
+'
+
 test_expect_success 'setup bare' '
 	git clone --bare . bare.git &&
 	cd bare.git
diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
index dc2f045..efb7ebc 100755
--- a/t/t3001-ls-files-others-exclude.sh
+++ b/t/t3001-ls-files-others-exclude.sh
@@ -220,4 +220,22 @@ test_expect_success 'pattern matches prefix completely' '
 	test_cmp expect actual
 '
 
+test_expect_success 'ls-files with "**" patterns' '
+	cat <<\EOF >expect &&
+a.1
+one/a.1
+one/two/a.1
+three/a.1
+EOF
+	git ls-files -o -i --exclude "**/a.1" >actual
+	test_cmp expect actual
+'
+
+
+test_expect_success 'ls-files with "**" patterns and no slashes' '
+	: >expect &&
+	git ls-files -o -i --exclude "one**a.1" >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name
  2012-10-15  6:23 nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Nguyễn Thái Ngọc Duy
  2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
  2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:27 ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:27   ` [PATCH 02/12] dir.c: rename path_excluded() to is_path_excluded() Nguyễn Thái Ngọc Duy
                     ` (10 more replies)
  2012-10-15 22:13 ` nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Junio C Hamano
  3 siblings, 11 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

'el' is only *slightly* less cryptic, but is already used as the
variable name for a struct exclude_list pointer in numerous other
places, so this reduces the number of cryptic variable names in use by
one :-)

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 10 +++++-----
 dir.h |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index ee8e711..f7219b5 100644
--- a/dir.c
+++ b/dir.c
@@ -347,7 +347,7 @@ void parse_exclude_pattern(const char **pattern,
 }
 
 void add_exclude(const char *string, const char *base,
-		 int baselen, struct exclude_list *which)
+		 int baselen, struct exclude_list *el)
 {
 	struct exclude *x;
 	int patternlen;
@@ -371,8 +371,8 @@ void add_exclude(const char *string, const char *base,
 	x->base = base;
 	x->baselen = baselen;
 	x->flags = flags;
-	ALLOC_GROW(which->excludes, which->nr + 1, which->alloc);
-	which->excludes[which->nr++] = x;
+	ALLOC_GROW(el->excludes, el->nr + 1, el->alloc);
+	el->excludes[el->nr++] = x;
 }
 
 static void *read_skip_worktree_file_from_index(const char *path, size_t *size)
@@ -414,7 +414,7 @@ int add_excludes_from_file_to_list(const char *fname,
 				   const char *base,
 				   int baselen,
 				   char **buf_p,
-				   struct exclude_list *which,
+				   struct exclude_list *el,
 				   int check_index)
 {
 	struct stat st;
@@ -461,7 +461,7 @@ int add_excludes_from_file_to_list(const char *fname,
 		if (buf[i] == '\n') {
 			if (entry != buf + i && entry[0] != '#') {
 				buf[i - (i && buf[i-1] == '\r')] = 0;
-				add_exclude(entry, base, baselen, which);
+				add_exclude(entry, base, baselen, el);
 			}
 			entry = buf + i + 1;
 		}
diff --git a/dir.h b/dir.h
index f5c89e3..69cc7d2 100644
--- a/dir.h
+++ b/dir.h
@@ -105,11 +105,11 @@ extern int path_excluded(struct path_exclude_check *, const char *, int namelen,
 
 
 extern int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen,
-					  char **buf_p, struct exclude_list *which, int check_index);
+					  char **buf_p, struct exclude_list *el, int check_index);
 extern void add_excludes_from_file(struct dir_struct *, const char *fname);
 extern void parse_exclude_pattern(const char **string, int *patternlen, int *flags, int *nowildcardlen);
 extern void add_exclude(const char *string, const char *base,
-			int baselen, struct exclude_list *which);
+			int baselen, struct exclude_list *el);
 extern void free_excludes(struct exclude_list *el);
 extern int file_exists(const char *);
 
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 02/12] dir.c: rename path_excluded() to is_path_excluded()
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:27   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:27   ` [PATCH 03/12] dir.c: rename excluded_from_list() to is_excluded_from_list() Nguyễn Thái Ngọc Duy
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

Start adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was agreed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/add.c      | 2 +-
 builtin/ls-files.c | 2 +-
 dir.c              | 4 ++--
 dir.h              | 2 +-
 unpack-trees.c     | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 89dce56..c689f37 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -453,7 +453,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			    && !file_exists(pathspec[i])) {
 				if (ignore_missing) {
 					int dtype = DT_UNKNOWN;
-					if (path_excluded(&check, pathspec[i], -1, &dtype))
+					if (is_path_excluded(&check, pathspec[i], -1, &dtype))
 						dir_add_ignored(&dir, pathspec[i], strlen(pathspec[i]));
 				} else
 					die(_("pathspec '%s' did not match any files"),
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 31b3f2d..ef7f99a 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -203,7 +203,7 @@ static void show_ru_info(void)
 static int ce_excluded(struct path_exclude_check *check, struct cache_entry *ce)
 {
 	int dtype = ce_to_dtype(ce);
-	return path_excluded(check, ce->name, ce_namelen(ce), &dtype);
+	return is_path_excluded(check, ce->name, ce_namelen(ce), &dtype);
 }
 
 static void show_files(struct dir_struct *dir)
diff --git a/dir.c b/dir.c
index f7219b5..6dfb815 100644
--- a/dir.c
+++ b/dir.c
@@ -679,8 +679,8 @@ void path_exclude_check_clear(struct path_exclude_check *check)
  * A path to a directory known to be excluded is left in check->path to
  * optimize for repeated checks for files in the same excluded directory.
  */
-int path_excluded(struct path_exclude_check *check,
-		  const char *name, int namelen, int *dtype)
+int is_path_excluded(struct path_exclude_check *check,
+		     const char *name, int namelen, int *dtype)
 {
 	int i;
 	struct strbuf *path = &check->path;
diff --git a/dir.h b/dir.h
index 69cc7d2..d40aeea 100644
--- a/dir.h
+++ b/dir.h
@@ -101,7 +101,7 @@ struct path_exclude_check {
 };
 extern void path_exclude_check_init(struct path_exclude_check *, struct dir_struct *);
 extern void path_exclude_check_clear(struct path_exclude_check *);
-extern int path_excluded(struct path_exclude_check *, const char *, int namelen, int *dtype);
+extern int is_path_excluded(struct path_exclude_check *, const char *, int namelen, int *dtype);
 
 
 extern int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen,
diff --git a/unpack-trees.c b/unpack-trees.c
index 33a5819..3ac6370 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1372,7 +1372,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;
 
 	if (o->dir &&
-	    path_excluded(o->path_exclude_check, name, -1, &dtype))
+	    is_path_excluded(o->path_exclude_check, name, -1, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 03/12] dir.c: rename excluded_from_list() to is_excluded_from_list()
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
  2012-10-15  6:27   ` [PATCH 02/12] dir.c: rename path_excluded() to is_path_excluded() Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:27   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:27   ` [PATCH 04/12] dir.c: rename excluded() to is_excluded() Nguyễn Thái Ngọc Duy
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

Continue adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Also adjust their callers as necessary.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c          | 11 ++++++-----
 dir.h          |  4 ++--
 unpack-trees.c |  8 +++++---
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/dir.c b/dir.c
index 6dfb815..7e1a23b 100644
--- a/dir.c
+++ b/dir.c
@@ -599,9 +599,9 @@ int match_pathname(const char *pathname, int pathlen,
 /* Scan the list and let the last match determine the fate.
  * Return 1 for exclude, 0 for include and -1 for undecided.
  */
-int excluded_from_list(const char *pathname,
-		       int pathlen, const char *basename, int *dtype,
-		       struct exclude_list *el)
+int is_excluded_from_list(const char *pathname,
+			  int pathlen, const char *basename, int *dtype,
+			  struct exclude_list *el)
 {
 	int i;
 
@@ -648,8 +648,9 @@ static int excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
 
 	prep_exclude(dir, pathname, basename-pathname);
 	for (st = EXC_CMDL; st <= EXC_FILE; st++) {
-		switch (excluded_from_list(pathname, pathlen, basename,
-					   dtype_p, &dir->exclude_list[st])) {
+		switch (is_excluded_from_list(pathname, pathlen,
+					      basename, dtype_p,
+					      &dir->exclude_list[st])) {
 		case 0:
 			return 0;
 		case 1:
diff --git a/dir.h b/dir.h
index d40aeea..2fce4bf 100644
--- a/dir.h
+++ b/dir.h
@@ -76,8 +76,8 @@ extern int within_depth(const char *name, int namelen, int depth, int max_depth)
 extern int fill_directory(struct dir_struct *dir, const char **pathspec);
 extern int read_directory(struct dir_struct *, const char *path, int len, const char **pathspec);
 
-extern int excluded_from_list(const char *pathname, int pathlen, const char *basename,
-			      int *dtype, struct exclude_list *el);
+extern int is_excluded_from_list(const char *pathname, int pathlen, const char *basename,
+				 int *dtype, struct exclude_list *el);
 struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len);
 
 /*
diff --git a/unpack-trees.c b/unpack-trees.c
index 3ac6370..c0da094 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -836,7 +836,8 @@ static int clear_ce_flags_dir(struct cache_entry **cache, int nr,
 {
 	struct cache_entry **cache_end;
 	int dtype = DT_DIR;
-	int ret = excluded_from_list(prefix, prefix_len, basename, &dtype, el);
+	int ret = is_excluded_from_list(prefix, prefix_len,
+					basename, &dtype, el);
 
 	prefix[prefix_len++] = '/';
 
@@ -855,7 +856,7 @@ static int clear_ce_flags_dir(struct cache_entry **cache, int nr,
 	 * with ret (iow, we know in advance the incl/excl
 	 * decision for the entire directory), clear flag here without
 	 * calling clear_ce_flags_1(). That function will call
-	 * the expensive excluded_from_list() on every entry.
+	 * the expensive is_excluded_from_list() on every entry.
 	 */
 	return clear_ce_flags_1(cache, cache_end - cache,
 				prefix, prefix_len,
@@ -938,7 +939,8 @@ static int clear_ce_flags_1(struct cache_entry **cache, int nr,
 
 		/* Non-directory */
 		dtype = ce_to_dtype(ce);
-		ret = excluded_from_list(ce->name, ce_namelen(ce), name, &dtype, el);
+		ret = is_excluded_from_list(ce->name, ce_namelen(ce),
+					    name, &dtype, el);
 		if (ret < 0)
 			ret = defval;
 		if (ret > 0)
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 04/12] dir.c: rename excluded() to is_excluded()
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
  2012-10-15  6:27   ` [PATCH 02/12] dir.c: rename path_excluded() to is_path_excluded() Nguyễn Thái Ngọc Duy
  2012-10-15  6:27   ` [PATCH 03/12] dir.c: rename excluded_from_list() to is_excluded_from_list() Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:27   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:27   ` [PATCH 05/12] dir.c: refactor is_excluded_from_list() Nguyễn Thái Ngọc Duy
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

Continue adopting clearer names for exclude functions.  This is_*
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c |  2 +-
 dir.c  | 10 +++++-----
 dir.h  |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/attr.c b/attr.c
index 2fc6353..5362563 100644
--- a/attr.c
+++ b/attr.c
@@ -284,7 +284,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * (reading the file from top to bottom), .gitattribute of the root
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
- * This is exactly the same as what excluded() does in dir.c to deal with
+ * This is exactly the same as what is_excluded() does in dir.c to deal with
  * .gitignore
  */
 
diff --git a/dir.c b/dir.c
index 7e1a23b..50381f8 100644
--- a/dir.c
+++ b/dir.c
@@ -639,7 +639,7 @@ int is_excluded_from_list(const char *pathname,
 	return -1; /* undecided */
 }
 
-static int excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
+static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
 {
 	int pathlen = strlen(pathname);
 	int st;
@@ -689,7 +689,7 @@ int is_path_excluded(struct path_exclude_check *check,
 	/*
 	 * we allow the caller to pass namelen as an optimization; it
 	 * must match the length of the name, as we eventually call
-	 * excluded() on the whole name string.
+	 * is_excluded() on the whole name string.
 	 */
 	if (namelen < 0)
 		namelen = strlen(name);
@@ -706,7 +706,7 @@ int is_path_excluded(struct path_exclude_check *check,
 
 		if (ch == '/') {
 			int dt = DT_DIR;
-			if (excluded(check->dir, path->buf, &dt))
+			if (is_excluded(check->dir, path->buf, &dt))
 				return 1;
 		}
 		strbuf_addch(path, ch);
@@ -715,7 +715,7 @@ int is_path_excluded(struct path_exclude_check *check,
 	/* An entry in the index; cannot be a directory with subentries */
 	strbuf_setlen(path, 0);
 
-	return excluded(check->dir, name, dtype);
+	return is_excluded(check->dir, name, dtype);
 }
 
 static struct dir_entry *dir_entry_new(const char *pathname, int len)
@@ -1015,7 +1015,7 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 					  const struct path_simplify *simplify,
 					  int dtype, struct dirent *de)
 {
-	int exclude = excluded(dir, path->buf, &dtype);
+	int exclude = is_excluded(dir, path->buf, &dtype);
 	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
 	    && exclude_matches_pathspec(path->buf, path->len, simplify))
 		dir_add_ignored(dir, path->buf, path->len);
diff --git a/dir.h b/dir.h
index 2fce4bf..4b887cc 100644
--- a/dir.h
+++ b/dir.h
@@ -91,8 +91,8 @@ extern int match_pathname(const char *, int,
 			  const char *, int, int, int);
 
 /*
- * The excluded() API is meant for callers that check each level of leading
- * directory hierarchies with excluded() to avoid recursing into excluded
+ * The is_excluded() API is meant for callers that check each level of leading
+ * directory hierarchies with is_excluded() to avoid recursing into excluded
  * directories.  Callers that do not do so should use this API instead.
  */
 struct path_exclude_check {
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 05/12] dir.c: refactor is_excluded_from_list()
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (2 preceding siblings ...)
  2012-10-15  6:27   ` [PATCH 04/12] dir.c: rename excluded() to is_excluded() Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:27   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 06/12] dir.c: refactor is_excluded() Nguyễn Thái Ngọc Duy
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

The excluded function uses a new helper function called
last_exclude_matching_from_list() to perform the inner loop over all of
the exclude patterns.  The helper just tells us whether the path is
included, excluded, or undecided.

However, it may be useful to know _which_ pattern was triggered.  So
let's pass out the entire exclude match, which contains the status
information we were already passing out.

Further patches can make use of this.

This is a modified forward port of a patch from 2009 by Jeff King:
http://article.gmane.org/gmane.comp.version-control.git/108815

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/dir.c b/dir.c
index 50381f8..859e0f9 100644
--- a/dir.c
+++ b/dir.c
@@ -596,22 +596,26 @@ int match_pathname(const char *pathname, int pathlen,
 	return fnmatch_icase(pattern, name, FNM_PATHNAME) == 0;
 }
 
-/* Scan the list and let the last match determine the fate.
- * Return 1 for exclude, 0 for include and -1 for undecided.
+/*
+ * Scan the given exclude list in reverse to see whether pathname
+ * should be ignored.  The first match (i.e. the last on the list), if
+ * any, determines the fate.  Returns the exclude_list element which
+ * matched, or NULL for undecided.
  */
-int is_excluded_from_list(const char *pathname,
-			  int pathlen, const char *basename, int *dtype,
-			  struct exclude_list *el)
+static struct exclude *last_exclude_matching_from_list(const char *pathname,
+						       int pathlen,
+						       const char *basename,
+						       int *dtype,
+						       struct exclude_list *el)
 {
 	int i;
 
 	if (!el->nr)
-		return -1;	/* undefined */
+		return NULL;	/* undefined */
 
 	for (i = el->nr - 1; 0 <= i; i--) {
 		struct exclude *x = el->excludes[i];
 		const char *exclude = x->pattern;
-		int to_exclude = x->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
 		int prefix = x->nowildcardlen;
 
 		if (x->flags & EXC_FLAG_MUSTBEDIR) {
@@ -626,7 +630,7 @@ int is_excluded_from_list(const char *pathname,
 					   pathlen - (basename - pathname),
 					   exclude, prefix, x->patternlen,
 					   x->flags))
-				return to_exclude;
+				return x;
 			continue;
 		}
 
@@ -634,8 +638,23 @@ int is_excluded_from_list(const char *pathname,
 		if (match_pathname(pathname, pathlen,
 				   x->base, x->baselen ? x->baselen - 1 : 0,
 				   exclude, prefix, x->patternlen, x->flags))
-			return to_exclude;
+			return x;
 	}
+	return NULL; /* undecided */
+}
+
+/*
+ * Scan the list and let the last match determine the fate.
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+int is_excluded_from_list(const char *pathname,
+			  int pathlen, const char *basename, int *dtype,
+			  struct exclude_list *el)
+{
+	struct exclude *exclude;
+	exclude = last_exclude_matching_from_list(pathname, pathlen, basename, dtype, el);
+	if (exclude)
+		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
 	return -1; /* undecided */
 }
 
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 06/12] dir.c: refactor is_excluded()
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (3 preceding siblings ...)
  2012-10-15  6:27   ` [PATCH 05/12] dir.c: refactor is_excluded_from_list() Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:28   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 07/12] dir.c: refactor is_path_excluded() Nguyễn Thái Ngọc Duy
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:28 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching() which returns the last exclude_list
element which matched, or NULL if no match was found.  is_excluded()
becomes a wrapper around this, and just returns 0 or 1 depending on
whether any matching exclude_list element was found.

This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/dir.c b/dir.c
index 859e0f9..c91a2f6 100644
--- a/dir.c
+++ b/dir.c
@@ -658,24 +658,44 @@ int is_excluded_from_list(const char *pathname,
 	return -1; /* undecided */
 }
 
-static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
+/*
+ * Loads the exclude lists for the directory containing pathname, then
+ * scans all exclude lists to determine whether pathname is excluded.
+ * Returns the exclude_list element which matched, or NULL for
+ * undecided.
+ */
+static struct exclude *last_exclude_matching(struct dir_struct *dir,
+					     const char *pathname,
+					     int *dtype_p)
 {
 	int pathlen = strlen(pathname);
 	int st;
+	struct exclude *exclude;
 	const char *basename = strrchr(pathname, '/');
 	basename = (basename) ? basename+1 : pathname;
 
 	prep_exclude(dir, pathname, basename-pathname);
 	for (st = EXC_CMDL; st <= EXC_FILE; st++) {
-		switch (is_excluded_from_list(pathname, pathlen,
-					      basename, dtype_p,
-					      &dir->exclude_list[st])) {
-		case 0:
-			return 0;
-		case 1:
-			return 1;
-		}
+		exclude = last_exclude_matching_from_list(
+			pathname, pathlen, basename, dtype_p,
+			&dir->exclude_list[st]);
+		if (exclude)
+			return exclude;
 	}
+	return NULL;
+}
+
+/*
+ * Loads the exclude lists for the directory containing pathname, then
+ * scans all exclude lists to determine whether pathname is excluded.
+ * Returns 1 if true, otherwise 0.
+ */
+static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
+{
+	struct exclude *exclude =
+		last_exclude_matching(dir, pathname, dtype_p);
+	if (exclude)
+		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
 	return 0;
 }
 
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 07/12] dir.c: refactor is_path_excluded()
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (4 preceding siblings ...)
  2012-10-15  6:28   ` [PATCH 06/12] dir.c: refactor is_excluded() Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:28   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 08/12] dir.c: keep track of where patterns came from Nguyễn Thái Ngọc Duy
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:28 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching_path() which return the last
exclude_list element which matched, or NULL if no match was found.
is_path_excluded() becomes a wrapper around this, and just returns 0
or 1 depending on whether any matching exclude_list element was found.

This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 47 ++++++++++++++++++++++++++++++++++++++---------
 dir.h |  3 +++
 2 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/dir.c b/dir.c
index c91a2f6..84ab826 100644
--- a/dir.c
+++ b/dir.c
@@ -703,6 +703,7 @@ void path_exclude_check_init(struct path_exclude_check *check,
 			     struct dir_struct *dir)
 {
 	check->dir = dir;
+	check->exclude = NULL;
 	strbuf_init(&check->path, 256);
 }
 
@@ -712,18 +713,21 @@ void path_exclude_check_clear(struct path_exclude_check *check)
 }
 
 /*
- * Is this name excluded?  This is for a caller like show_files() that
- * do not honor directory hierarchy and iterate through paths that are
- * possibly in an ignored directory.
+ * For each subdirectory in name, starting with the top-most, checks
+ * to see if that subdirectory is excluded, and if so, returns the
+ * corresponding exclude structure.  Otherwise, checks whether name
+ * itself (which is presumably a file) is excluded.
  *
  * A path to a directory known to be excluded is left in check->path to
  * optimize for repeated checks for files in the same excluded directory.
  */
-int is_path_excluded(struct path_exclude_check *check,
-		     const char *name, int namelen, int *dtype)
+struct exclude *last_exclude_matching_path(struct path_exclude_check *check,
+					   const char *name, int namelen,
+					   int *dtype)
 {
 	int i;
 	struct strbuf *path = &check->path;
+	struct exclude *exclude;
 
 	/*
 	 * we allow the caller to pass namelen as an optimization; it
@@ -733,11 +737,17 @@ int is_path_excluded(struct path_exclude_check *check,
 	if (namelen < 0)
 		namelen = strlen(name);
 
+	/*
+	 * If path is non-empty, and name is equal to path or a
+	 * subdirectory of path, name should be excluded, because
+	 * it's inside a directory which is already known to be
+	 * excluded and was previously left in check->path.
+	 */
 	if (path->len &&
 	    path->len <= namelen &&
 	    !memcmp(name, path->buf, path->len) &&
 	    (!name[path->len] || name[path->len] == '/'))
-		return 1;
+		return check->exclude;
 
 	strbuf_setlen(path, 0);
 	for (i = 0; name[i]; i++) {
@@ -745,8 +755,12 @@ int is_path_excluded(struct path_exclude_check *check,
 
 		if (ch == '/') {
 			int dt = DT_DIR;
-			if (is_excluded(check->dir, path->buf, &dt))
-				return 1;
+			exclude = last_exclude_matching(check->dir,
+							path->buf, &dt);
+			if (exclude) {
+				check->exclude = exclude;
+				return exclude;
+			}
 		}
 		strbuf_addch(path, ch);
 	}
@@ -754,7 +768,22 @@ int is_path_excluded(struct path_exclude_check *check,
 	/* An entry in the index; cannot be a directory with subentries */
 	strbuf_setlen(path, 0);
 
-	return is_excluded(check->dir, name, dtype);
+	return last_exclude_matching(check->dir, name, dtype);
+}
+
+/*
+ * Is this name excluded?  This is for a caller like show_files() that
+ * do not honor directory hierarchy and iterate through paths that are
+ * possibly in an ignored directory.
+ */
+int is_path_excluded(struct path_exclude_check *check,
+		  const char *name, int namelen, int *dtype)
+{
+	struct exclude *exclude =
+		last_exclude_matching_path(check, name, namelen, dtype);
+	if (exclude)
+		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
+	return 0;
 }
 
 static struct dir_entry *dir_entry_new(const char *pathname, int len)
diff --git a/dir.h b/dir.h
index 4b887cc..02ac0bf 100644
--- a/dir.h
+++ b/dir.h
@@ -97,10 +97,13 @@ extern int match_pathname(const char *, int,
  */
 struct path_exclude_check {
 	struct dir_struct *dir;
+	struct exclude *exclude;
 	struct strbuf path;
 };
 extern void path_exclude_check_init(struct path_exclude_check *, struct dir_struct *);
 extern void path_exclude_check_clear(struct path_exclude_check *);
+extern struct exclude *last_exclude_matching_path(struct path_exclude_check *, const char *,
+						  int namelen, int *dtype);
 extern int is_path_excluded(struct path_exclude_check *, const char *, int namelen, int *dtype);
 
 
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 08/12] dir.c: keep track of where patterns came from
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (5 preceding siblings ...)
  2012-10-15  6:28   ` [PATCH 07/12] dir.c: refactor is_path_excluded() Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:28   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 09/12] dir.c: refactor treat_gitlinks() Nguyễn Thái Ngọc Duy
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:28 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

For exclude patterns read in from files, the filename is stored together
with the corresponding line number (counting starting at 1).

For exclude patterns provided on the command line, the sequence number
is negative, with counting starting at -1, so for example the 2nd
pattern provided via --exclude would be numbered -2.  This allows any
future consumers of that data to easily distinguish between exclude
patterns from files vs. from the CLI.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/clean.c    |  2 +-
 builtin/ls-files.c |  3 ++-
 dir.c              | 26 ++++++++++++++++++++------
 dir.h              |  5 ++++-
 4 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 0c7b3d0..f618231 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -99,7 +99,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 
 	for (i = 0; i < exclude_list.nr; i++)
 		add_exclude(exclude_list.items[i].string, "", 0,
-			    &dir.exclude_list[EXC_CMDL]);
+			    &dir.exclude_list[EXC_CMDL], "--exclude option", -(i+1));
 
 	pathspec = get_pathspec(prefix, argv);
 
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index ef7f99a..db3c081 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -35,6 +35,7 @@ static int error_unmatch;
 static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
+static int exclude_args;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -423,7 +424,7 @@ static int option_parse_exclude(const struct option *opt,
 	struct exclude_list *list = opt->value;
 
 	exc_given = 1;
-	add_exclude(arg, "", 0, list);
+	add_exclude(arg, "", 0, list, "--exclude option", --exclude_args);
 
 	return 0;
 }
diff --git a/dir.c b/dir.c
index 84ab826..81a5dd5 100644
--- a/dir.c
+++ b/dir.c
@@ -347,7 +347,8 @@ void parse_exclude_pattern(const char **pattern,
 }
 
 void add_exclude(const char *string, const char *base,
-		 int baselen, struct exclude_list *el)
+		 int baselen, struct exclude_list *el,
+		 const char *src, int srcpos)
 {
 	struct exclude *x;
 	int patternlen;
@@ -371,6 +372,8 @@ void add_exclude(const char *string, const char *base,
 	x->base = base;
 	x->baselen = baselen;
 	x->flags = flags;
+	x->src = src;
+	x->srcpos = srcpos;
 	ALLOC_GROW(el->excludes, el->nr + 1, el->alloc);
 	el->excludes[el->nr++] = x;
 }
@@ -418,7 +421,7 @@ int add_excludes_from_file_to_list(const char *fname,
 				   int check_index)
 {
 	struct stat st;
-	int fd, i;
+	int fd, i, lineno = 1;
 	size_t size = 0;
 	char *buf, *entry;
 
@@ -461,8 +464,10 @@ int add_excludes_from_file_to_list(const char *fname,
 		if (buf[i] == '\n') {
 			if (entry != buf + i && entry[0] != '#') {
 				buf[i - (i && buf[i-1] == '\r')] = 0;
-				add_exclude(entry, base, baselen, el);
+				add_exclude(entry, base, baselen, el,
+					    fname, lineno);
 			}
+			lineno++;
 			entry = buf + i + 1;
 		}
 	}
@@ -493,8 +498,10 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 		    !strncmp(dir->basebuf, base, stk->baselen))
 			break;
 		dir->exclude_stack = stk->prev;
-		while (stk->exclude_ix < el->nr)
-			free(el->excludes[--el->nr]);
+		while (stk->exclude_ix < el->nr) {
+			struct exclude *exclude = el->excludes[--el->nr];
+			free(exclude);
+		}
 		free(stk->filebuf);
 		free(stk);
 	}
@@ -521,7 +528,14 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 		memcpy(dir->basebuf + current, base + current,
 		       stk->baselen - current);
 		strcpy(dir->basebuf + stk->baselen, dir->exclude_per_dir);
-		add_excludes_from_file_to_list(dir->basebuf,
+
+		/*
+		 * dir->basebuf gets reused by the traversal, but we
+		 * need fname to remain unchanged to ensure the src
+		 * member of each struct exclude correctly back-references
+		 * its source file.
+		 */
+		add_excludes_from_file_to_list(strdup(dir->basebuf),
 					       dir->basebuf, stk->baselen,
 					       &stk->filebuf, el, 1);
 		dir->exclude_stack = stk;
diff --git a/dir.h b/dir.h
index 02ac0bf..3921aa9 100644
--- a/dir.h
+++ b/dir.h
@@ -23,6 +23,9 @@ struct exclude_list {
 		const char *base;
 		int baselen;
 		int flags;
+		const char *src;
+		int srcpos; /* counting starts from 1 for line numbers in ignore files,
+			       and from -1 decrementing for patterns from CLI (--exclude) */
 	} **excludes;
 };
 
@@ -112,7 +115,7 @@ extern int add_excludes_from_file_to_list(const char *fname, const char *base, i
 extern void add_excludes_from_file(struct dir_struct *, const char *fname);
 extern void parse_exclude_pattern(const char **string, int *patternlen, int *flags, int *nowildcardlen);
 extern void add_exclude(const char *string, const char *base,
-			int baselen, struct exclude_list *el);
+			int baselen, struct exclude_list *el, const char *src, int srcpos);
 extern void free_excludes(struct exclude_list *el);
 extern int file_exists(const char *);
 
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 09/12] dir.c: refactor treat_gitlinks()
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (6 preceding siblings ...)
  2012-10-15  6:28   ` [PATCH 08/12] dir.c: keep track of where patterns came from Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:28   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 10/12] pathspec.c: move reusable code from builtin/add.c Nguyễn Thái Ngọc Duy
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:28 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

Extract the body of the for loop in treat_gitlinks() into a separate
treat_gitlink() function so that it can be reused elsewhere.  This
paves the way for a new check-ignore sub-command.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/add.c | 49 +++++++++++++++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index c689f37..6d2fb0c 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -153,31 +153,44 @@ static char *prune_directory(struct dir_struct *dir, const char **pathspec, int
 	return seen;
 }
 
-static void treat_gitlinks(const char **pathspec)
+/*
+ * Check whether path refers to a submodule, or something inside a
+ * submodule.  If the former, returns the path with any trailing slash
+ * stripped.  If the latter, dies with an error message.
+ */
+const char *treat_gitlink(const char *path)
 {
-	int i;
-
-	if (!pathspec || !*pathspec)
-		return;
-
+	int i, path_len = strlen(path);
 	for (i = 0; i < active_nr; i++) {
 		struct cache_entry *ce = active_cache[i];
 		if (S_ISGITLINK(ce->ce_mode)) {
-			int len = ce_namelen(ce), j;
-			for (j = 0; pathspec[j]; j++) {
-				int len2 = strlen(pathspec[j]);
-				if (len2 <= len || pathspec[j][len] != '/' ||
-				    memcmp(ce->name, pathspec[j], len))
-					continue;
-				if (len2 == len + 1)
-					/* strip trailing slash */
-					pathspec[j] = xstrndup(ce->name, len);
-				else
-					die (_("Path '%s' is in submodule '%.*s'"),
-						pathspec[j], len, ce->name);
+			int ce_len = ce_namelen(ce);
+			if (path_len <= ce_len || path[ce_len] != '/' ||
+			    memcmp(ce->name, path, ce_len))
+				/* path does not refer to this
+				 * submodule or anything inside it */
+				continue;
+			if (path_len == ce_len + 1) {
+				/* path refers to submodule;
+				 * strip trailing slash */
+				return xstrndup(ce->name, ce_len);
+			} else {
+				die (_("Path '%s' is in submodule '%.*s'"),
+				     path, ce_len, ce->name);
 			}
 		}
 	}
+	return path;
+}
+
+void treat_gitlinks(const char **pathspec)
+{
+	if (!pathspec || !*pathspec)
+		return;
+
+	int i;
+	for (i = 0; pathspec[i]; i++)
+		pathspec[i] = treat_gitlink(pathspec[i]);
 }
 
 static void refresh(int verbose, const char **pathspec)
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 10/12] pathspec.c: move reusable code from builtin/add.c
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (7 preceding siblings ...)
  2012-10-15  6:28   ` [PATCH 09/12] dir.c: refactor treat_gitlinks() Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:28   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 11/12] dir.c: provide free_directory() for reclaiming dir_struct memory Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 12/12] Add git-check-ignore sub-command Nguyễn Thái Ngọc Duy
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:28 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

This is in preparation for reuse by a new git check-ignore command.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Makefile      |  2 ++
 builtin/add.c | 95 ++-------------------------------------------------------
 pathspec.c    | 98 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pathspec.h    |  6 ++++
 4 files changed, 109 insertions(+), 92 deletions(-)
 create mode 100644 pathspec.c
 create mode 100644 pathspec.h

diff --git a/Makefile b/Makefile
index 13293d3..48facad 100644
--- a/Makefile
+++ b/Makefile
@@ -645,6 +645,7 @@ LIB_H += pack-refs.h
 LIB_H += pack-revindex.h
 LIB_H += parse-options.h
 LIB_H += patch-ids.h
+LIB_H += pathspec.h
 LIB_H += pkt-line.h
 LIB_H += progress.h
 LIB_H += prompt.h
@@ -758,6 +759,7 @@ LIB_OBJS += parse-options-cb.o
 LIB_OBJS += patch-delta.o
 LIB_OBJS += patch-ids.o
 LIB_OBJS += path.o
+LIB_OBJS += pathspec.o
 LIB_OBJS += pkt-line.o
 LIB_OBJS += preload-index.o
 LIB_OBJS += pretty.o
diff --git a/builtin/add.c b/builtin/add.c
index 6d2fb0c..927b0f2 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -6,6 +6,7 @@
 #include "cache.h"
 #include "builtin.h"
 #include "dir.h"
+#include "pathspec.h"
 #include "exec_cmd.h"
 #include "cache-tree.h"
 #include "run-command.h"
@@ -97,39 +98,6 @@ int add_files_to_cache(const char *prefix, const char **pathspec, int flags)
 	return !!data.add_errors;
 }
 
-static void fill_pathspec_matches(const char **pathspec, char *seen, int specs)
-{
-	int num_unmatched = 0, i;
-
-	/*
-	 * Since we are walking the index as if we were walking the directory,
-	 * we have to mark the matched pathspec as seen; otherwise we will
-	 * mistakenly think that the user gave a pathspec that did not match
-	 * anything.
-	 */
-	for (i = 0; i < specs; i++)
-		if (!seen[i])
-			num_unmatched++;
-	if (!num_unmatched)
-		return;
-	for (i = 0; i < active_nr; i++) {
-		struct cache_entry *ce = active_cache[i];
-		match_pathspec(pathspec, ce->name, ce_namelen(ce), 0, seen);
-	}
-}
-
-static char *find_used_pathspec(const char **pathspec)
-{
-	char *seen;
-	int i;
-
-	for (i = 0; pathspec[i];  i++)
-		; /* just counting */
-	seen = xcalloc(i, 1);
-	fill_pathspec_matches(pathspec, seen, i);
-	return seen;
-}
-
 static char *prune_directory(struct dir_struct *dir, const char **pathspec, int prefix)
 {
 	char *seen;
@@ -153,46 +121,6 @@ static char *prune_directory(struct dir_struct *dir, const char **pathspec, int
 	return seen;
 }
 
-/*
- * Check whether path refers to a submodule, or something inside a
- * submodule.  If the former, returns the path with any trailing slash
- * stripped.  If the latter, dies with an error message.
- */
-const char *treat_gitlink(const char *path)
-{
-	int i, path_len = strlen(path);
-	for (i = 0; i < active_nr; i++) {
-		struct cache_entry *ce = active_cache[i];
-		if (S_ISGITLINK(ce->ce_mode)) {
-			int ce_len = ce_namelen(ce);
-			if (path_len <= ce_len || path[ce_len] != '/' ||
-			    memcmp(ce->name, path, ce_len))
-				/* path does not refer to this
-				 * submodule or anything inside it */
-				continue;
-			if (path_len == ce_len + 1) {
-				/* path refers to submodule;
-				 * strip trailing slash */
-				return xstrndup(ce->name, ce_len);
-			} else {
-				die (_("Path '%s' is in submodule '%.*s'"),
-				     path, ce_len, ce->name);
-			}
-		}
-	}
-	return path;
-}
-
-void treat_gitlinks(const char **pathspec)
-{
-	if (!pathspec || !*pathspec)
-		return;
-
-	int i;
-	for (i = 0; pathspec[i]; i++)
-		pathspec[i] = treat_gitlink(pathspec[i]);
-}
-
 static void refresh(int verbose, const char **pathspec)
 {
 	char *seen;
@@ -210,23 +138,6 @@ static void refresh(int verbose, const char **pathspec)
         free(seen);
 }
 
-static const char **validate_pathspec(int argc, const char **argv, const char *prefix)
-{
-	const char **pathspec = get_pathspec(prefix, argv);
-
-	if (pathspec) {
-		const char **p;
-		for (p = pathspec; *p; p++) {
-			if (has_symlink_leading_path(*p, strlen(*p))) {
-				int len = prefix ? strlen(prefix) : 0;
-				die(_("'%s' is beyond a symbolic link"), *p + len);
-			}
-		}
-	}
-
-	return pathspec;
-}
-
 int run_add_interactive(const char *revision, const char *patch_mode,
 			const char **pathspec)
 {
@@ -261,7 +172,7 @@ int interactive_add(int argc, const char **argv, const char *prefix, int patch)
 	const char **pathspec = NULL;
 
 	if (argc) {
-		pathspec = validate_pathspec(argc, argv, prefix);
+		pathspec = validate_pathspec(prefix, argv);
 		if (!pathspec)
 			return -1;
 	}
@@ -427,7 +338,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		fprintf(stderr, _("Maybe you wanted to say 'git add .'?\n"));
 		return 0;
 	}
-	pathspec = validate_pathspec(argc, argv, prefix);
+	pathspec = validate_pathspec(prefix, argv);
 
 	if (read_cache() < 0)
 		die(_("index file corrupt"));
diff --git a/pathspec.c b/pathspec.c
new file mode 100644
index 0000000..a9c5b5b
--- /dev/null
+++ b/pathspec.c
@@ -0,0 +1,98 @@
+#include "cache.h"
+#include "dir.h"
+
+void validate_path(const char *prefix, const char *path)
+{
+	if (has_symlink_leading_path(path, strlen(path))) {
+		int len = prefix ? strlen(prefix) : 0;
+		die(_("'%s' is beyond a symbolic link"), path + len);
+	}
+}
+
+const char **validate_pathspec(const char *prefix, const char **files)
+{
+	const char **pathspec = get_pathspec(prefix, files);
+
+	if (pathspec) {
+		const char **p;
+		for (p = pathspec; *p; p++) {
+			validate_path(prefix, *p);
+		}
+	}
+
+	return pathspec;
+}
+
+void fill_pathspec_matches(const char **pathspec, char *seen, int specs)
+{
+	int num_unmatched = 0, i;
+
+	/*
+	 * Since we are walking the index as if we were walking the directory,
+	 * we have to mark the matched pathspec as seen; otherwise we will
+	 * mistakenly think that the user gave a pathspec that did not match
+	 * anything.
+	 */
+	for (i = 0; i < specs; i++)
+		if (!seen[i])
+			num_unmatched++;
+	if (!num_unmatched)
+		return;
+	for (i = 0; i < active_nr; i++) {
+		struct cache_entry *ce = active_cache[i];
+		match_pathspec(pathspec, ce->name, ce_namelen(ce), 0, seen);
+	}
+}
+
+char *find_used_pathspec(const char **pathspec)
+{
+	char *seen;
+	int i;
+
+	for (i = 0; pathspec[i];  i++)
+		; /* just counting */
+	seen = xcalloc(i, 1);
+	fill_pathspec_matches(pathspec, seen, i);
+	return seen;
+}
+
+/*
+ * Check whether path refers to a submodule, or something inside a
+ * submodule.  If the former, returns the path with any trailing slash
+ * stripped.  If the latter, dies with an error message.
+ */
+const char *treat_gitlink(const char *path)
+{
+	int i, path_len = strlen(path);
+	for (i = 0; i < active_nr; i++) {
+		struct cache_entry *ce = active_cache[i];
+		if (S_ISGITLINK(ce->ce_mode)) {
+			int ce_len = ce_namelen(ce);
+			if (path_len <= ce_len || path[ce_len] != '/' ||
+			    memcmp(ce->name, path, ce_len))
+				/* path does not refer to this
+				 * submodule or anything inside it */
+				continue;
+			if (path_len == ce_len + 1) {
+				/* path refers to submodule;
+				 * strip trailing slash */
+				return xstrndup(ce->name, ce_len);
+			} else {
+				die (_("Path '%s' is in submodule '%.*s'"),
+				     path, ce_len, ce->name);
+			}
+		}
+	}
+	return path;
+}
+
+void treat_gitlinks(const char **pathspec)
+{
+	int i;
+
+	if (!pathspec || !*pathspec)
+		return;
+
+	for (i = 0; pathspec[i]; i++)
+		pathspec[i] = treat_gitlink(pathspec[i]);
+}
diff --git a/pathspec.h b/pathspec.h
new file mode 100644
index 0000000..b7c053a
--- /dev/null
+++ b/pathspec.h
@@ -0,0 +1,6 @@
+extern void validate_path(const char *prefix, const char *path);
+extern const char **validate_pathspec(const char *prefix, const char **files);
+extern char *find_used_pathspec(const char **pathspec);
+extern void fill_pathspec_matches(const char **pathspec, char *seen, int specs);
+extern const char *treat_gitlink(const char *path);
+extern void treat_gitlinks(const char **pathspec);
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 11/12] dir.c: provide free_directory() for reclaiming dir_struct memory
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (8 preceding siblings ...)
  2012-10-15  6:28   ` [PATCH 10/12] pathspec.c: move reusable code from builtin/add.c Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:28   ` Nguyễn Thái Ngọc Duy
  2012-10-15  6:28   ` [PATCH 12/12] Add git-check-ignore sub-command Nguyễn Thái Ngọc Duy
  10 siblings, 0 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:28 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/technical/api-directory-listing.txt |  2 ++
 dir.c                                             | 28 +++++++++++++++++++++--
 dir.h                                             |  1 +
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index add6f43..ca571a8 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -76,4 +76,6 @@ marked. If you to exclude files, make sure you have loaded index first.
 
 * Use `dir.entries[]`.
 
+* Call `free_directory()` when none of the contained elements are no longer in use.
+
 (JC)
diff --git a/dir.c b/dir.c
index 81a5dd5..d50634d 100644
--- a/dir.c
+++ b/dir.c
@@ -481,6 +481,16 @@ void add_excludes_from_file(struct dir_struct *dir, const char *fname)
 		die("cannot use %s as an exclude file", fname);
 }
 
+static void free_exclude_stack(struct exclude_stack *stk)
+{
+	free(stk->filebuf);
+	free(stk);
+}
+
+/*
+ * Loads the per-directory exclude list for the substring of base
+ * which has a char length of baselen.
+ */
 static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 {
 	struct exclude_list *el;
@@ -502,8 +512,7 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 			struct exclude *exclude = el->excludes[--el->nr];
 			free(exclude);
 		}
-		free(stk->filebuf);
-		free(stk);
+		free_exclude_stack(stk);
 	}
 
 	/* Read from the parent directories and push them down. */
@@ -1521,3 +1530,18 @@ void free_pathspec(struct pathspec *pathspec)
 	free(pathspec->items);
 	pathspec->items = NULL;
 }
+
+void free_directory(struct dir_struct *dir)
+{
+	struct exclude_stack *stk;
+	int st;
+	for (st = EXC_CMDL; st <= EXC_FILE; st++)
+		free_excludes(&dir->exclude_list[st]);
+
+	stk = dir->exclude_stack;
+	while (stk) {
+		struct exclude_stack *prev = stk->prev;
+		free_exclude_stack(stk);
+		stk = prev;
+	}
+}
diff --git a/dir.h b/dir.h
index 3921aa9..fb57c66 100644
--- a/dir.h
+++ b/dir.h
@@ -117,6 +117,7 @@ extern void parse_exclude_pattern(const char **string, int *patternlen, int *fla
 extern void add_exclude(const char *string, const char *base,
 			int baselen, struct exclude_list *el, const char *src, int srcpos);
 extern void free_excludes(struct exclude_list *el);
+extern void free_directory(struct dir_struct *dir);
 extern int file_exists(const char *);
 
 extern int is_inside_dir(const char *dir);
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
                     ` (9 preceding siblings ...)
  2012-10-15  6:28   ` [PATCH 11/12] dir.c: provide free_directory() for reclaiming dir_struct memory Nguyễn Thái Ngọc Duy
@ 2012-10-15  6:28   ` Nguyễn Thái Ngọc Duy
  2012-10-15 22:31     ` Junio C Hamano
  2012-11-04 21:07     ` [PATCH as/check-ignore] t0007: fix tests on Windows Johannes Sixt
  10 siblings, 2 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-15  6:28 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Adam Spiers,
	Nguyễn Thái Ngọc Duy

From: Adam Spiers <git@adamspiers.org>

This works in a similar manner to git-check-attr.  Some code
was reused from add.c by refactoring out into pathspec.c.

Thanks to Jeff King and Junio C Hamano for the idea:
http://thread.gmane.org/gmane.comp.version-control.git/108671/focus=108815

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 .gitignore                             |   1 +
 Documentation/git-check-ignore.txt     |  85 +++++
 Documentation/gitignore.txt            |   6 +-
 Makefile                               |   1 +
 builtin.h                              |   1 +
 builtin/check-ignore.c                 | 170 ++++++++++
 command-list.txt                       |   1 +
 contrib/completion/git-completion.bash |   1 +
 git.c                                  |   1 +
 t/t0007-ignores.sh                     | 587 +++++++++++++++++++++++++++++++++
 10 files changed, 852 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/git-check-ignore.txt
 create mode 100644 builtin/check-ignore.c
 create mode 100755 t/t0007-ignores.sh

diff --git a/.gitignore b/.gitignore
index f1acd3e..20ef4e8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -19,6 +19,7 @@
 /git-bundle
 /git-cat-file
 /git-check-attr
+/git-check-ignore
 /git-check-ref-format
 /git-checkout
 /git-checkout-index
diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
new file mode 100644
index 0000000..96fa7bc
--- /dev/null
+++ b/Documentation/git-check-ignore.txt
@@ -0,0 +1,85 @@
+git-check-ignore(1)
+===================
+
+NAME
+----
+git-check-ignore - Debug gitignore / exclude files
+
+
+SYNOPSIS
+--------
+[verse]
+'git check-ignore' [options] pathname...
+'git check-ignore' [options] --stdin < <list-of-paths>
+
+DESCRIPTION
+-----------
+
+For each pathname given via the command-line or from a file via
+`--stdin`, this command will list the first exclude pattern found (if
+any) which explicitly excludes or includes that pathname.  Note that
+within any given exclude file, later patterns take precedence over
+earlier ones, so any matching pattern which this command outputs may
+not be the one you would immediately expect.
+
+OPTIONS
+-------
+-q, --quiet::
+	Don't output anything, just set exit status.  This is only
+	valid with a single pathname.
+
+-v, --verbose::
+	Also output details about the matching pattern (if any)
+	for each given pathname.
+
+--stdin::
+	Read file names from stdin instead of from the command-line.
+
+-z::
+	The output format is modified to be machine-parseable (see
+	below).  If `--stdin` is also given, input paths are separated
+	with a NUL character instead of a linefeed character.
+
+OUTPUT
+------
+
+By default, any of the given pathnames which match an ignore pattern
+will be output, one per line.  If no pattern matches a given path,
+nothing will be output for that path; this means that path will not be
+ignored.
+
+If `--verbose` is specified, the output is a series of lines of the form:
+
+<source> <COLON> <linenum> <COLON> <pattern> <HT> <pathname>
+
+<pathname> is the path of a file being queried, <pattern> is the
+matching pattern, <source> is the pattern's source file, and <linenum>
+is the line number of the pattern within that source.  If the pattern
+contained a `!` prefix or `/` suffix, it will be preserved in the
+output.  <source> will be an absolute path when referring to the file
+configured by `core.excludesfile`, or relative to the repository root
+when referring to `.git/info/exclude` or a per-directory exclude file.
+
+If `-z` is specified, the output is a series of lines of the form:
+
+EXIT STATUS
+-----------
+
+0::
+	One or more of the provided paths is ignored.
+
+1::
+	None of the provided paths are ignored.
+
+128::
+	A fatal error was encountered.
+
+SEE ALSO
+--------
+linkgit:gitignore[5]
+linkgit:gitconfig[5]
+linkgit:git-ls-files[5]
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index 2e7328b..f401b8c 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -153,8 +153,10 @@ The second .gitignore prevents git from ignoring
 
 SEE ALSO
 --------
-linkgit:git-rm[1], linkgit:git-update-index[1],
-linkgit:gitrepository-layout[5]
+linkgit:git-rm[1],
+linkgit:git-update-index[1],
+linkgit:gitrepository-layout[5],
+linkgit:git-check-ignore[1]
 
 GIT
 ---
diff --git a/Makefile b/Makefile
index 48facad..8476fc8 100644
--- a/Makefile
+++ b/Makefile
@@ -822,6 +822,7 @@ BUILTIN_OBJS += builtin/branch.o
 BUILTIN_OBJS += builtin/bundle.o
 BUILTIN_OBJS += builtin/cat-file.o
 BUILTIN_OBJS += builtin/check-attr.o
+BUILTIN_OBJS += builtin/check-ignore.o
 BUILTIN_OBJS += builtin/check-ref-format.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
diff --git a/builtin.h b/builtin.h
index dffb34e..d57faf4 100644
--- a/builtin.h
+++ b/builtin.h
@@ -58,6 +58,7 @@ extern int cmd_cat_file(int argc, const char **argv, const char *prefix);
 extern int cmd_checkout(int argc, const char **argv, const char *prefix);
 extern int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 extern int cmd_check_attr(int argc, const char **argv, const char *prefix);
+extern int cmd_check_ignore(int argc, const char **argv, const char *prefix);
 extern int cmd_check_ref_format(int argc, const char **argv, const char *prefix);
 extern int cmd_cherry(int argc, const char **argv, const char *prefix);
 extern int cmd_cherry_pick(int argc, const char **argv, const char *prefix);
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
new file mode 100644
index 0000000..d446ade
--- /dev/null
+++ b/builtin/check-ignore.c
@@ -0,0 +1,170 @@
+#include "builtin.h"
+#include "cache.h"
+#include "dir.h"
+#include "quote.h"
+#include "pathspec.h"
+#include "parse-options.h"
+
+static int quiet, verbose, stdin_paths;
+static const char * const check_ignore_usage[] = {
+"git check-ignore [options] pathname...",
+"git check-ignore [options] --stdin < <list-of-paths>",
+NULL
+};
+
+static int null_term_line;
+
+static const struct option check_ignore_options[] = {
+	OPT__QUIET(&quiet, N_("suppress progress reporting")),
+	OPT__VERBOSE(&verbose, N_("be verbose")),
+	OPT_GROUP(""),
+	OPT_BOOLEAN(0, "stdin", &stdin_paths,
+		    N_("read file names from stdin")),
+	OPT_BOOLEAN('z', NULL, &null_term_line,
+		    N_("input paths are terminated by a null character")),
+	OPT_END()
+};
+
+static void output_exclude(const char *path, struct exclude *exclude)
+{
+	char *bang = exclude->flags & EXC_FLAG_NEGATIVE ? "!" : "";
+	char *dir  = (exclude->flags & EXC_FLAG_MUSTBEDIR) ? "/" : "";
+	if (!null_term_line) {
+		if (!verbose) {
+			write_name_quoted(path, stdout, '\n');
+		} else {
+			quote_c_style(exclude->src, NULL, stdout, 0);
+			printf(":%d:%s%s%s\t",
+			       exclude->srcpos,
+			       bang, exclude->pattern, dir);
+			quote_c_style(path, NULL, stdout, 0);
+			fputc('\n', stdout);
+		}
+	} else {
+		if (!verbose) {
+			printf("%s%c", path, '\0');
+		} else {
+			printf("%s%c%d%c%s%s%s%c%s%c",
+			       exclude->src, '\0',
+			       exclude->srcpos, '\0',
+			       bang, exclude->pattern, dir, '\0',
+			       path, '\0');
+		}
+	}
+}
+
+static int check_ignore(const char *prefix, const char **pathspec)
+{
+	struct dir_struct dir;
+	const char *path;
+	char *seen = NULL;
+	int num_ignored = 0;
+
+	/* read_cache() is only necessary so we can watch out for submodules. */
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	memset(&dir, 0, sizeof(dir));
+	dir.flags |= DIR_COLLECT_IGNORED;
+	setup_standard_excludes(&dir);
+
+	if (pathspec) {
+		int i;
+		struct path_exclude_check check;
+		struct exclude *exclude;
+
+		path_exclude_check_init(&check, &dir);
+		if (!seen)
+			seen = find_used_pathspec(pathspec);
+		for (i = 0; pathspec[i]; i++) {
+			const char *full_path;
+			path = pathspec[i];
+			full_path = prefix_path(prefix, prefix
+						? strlen(prefix) : 0, path);
+			full_path = treat_gitlink(full_path);
+			validate_path(prefix, full_path);
+			if (!seen[i] && path[0]) {
+				int dtype = DT_UNKNOWN;
+				exclude = last_exclude_matching_path(&check, full_path,
+								     -1, &dtype);
+				if (exclude) {
+					if (!quiet)
+						output_exclude(path, exclude);
+					num_ignored++;
+				}
+			}
+		}
+		free(seen);
+		free_directory(&dir);
+		path_exclude_check_clear(&check);
+	} else {
+		printf("no pathspec\n");
+	}
+	return num_ignored;
+}
+
+static int check_ignore_stdin_paths(const char *prefix)
+{
+	struct strbuf buf, nbuf;
+	char **pathspec = NULL;
+	size_t nr = 0, alloc = 0;
+	int line_termination = null_term_line ? 0 : '\n';
+	int num_ignored;
+
+	strbuf_init(&buf, 0);
+	strbuf_init(&nbuf, 0);
+	while (strbuf_getline(&buf, stdin, line_termination) != EOF) {
+		if (line_termination && buf.buf[0] == '"') {
+			strbuf_reset(&nbuf);
+			if (unquote_c_style(&nbuf, buf.buf, NULL))
+				die("line is badly quoted");
+			strbuf_swap(&buf, &nbuf);
+		}
+		ALLOC_GROW(pathspec, nr + 1, alloc);
+		pathspec[nr] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
+		strcpy(pathspec[nr++], buf.buf);
+	}
+	ALLOC_GROW(pathspec, nr + 1, alloc);
+	pathspec[nr] = NULL;
+	num_ignored = check_ignore(prefix, (const char **)pathspec);
+	maybe_flush_or_die(stdout, "attribute to stdout");
+	strbuf_release(&buf);
+	strbuf_release(&nbuf);
+	free(pathspec);
+	return num_ignored;
+}
+
+int cmd_check_ignore(int argc, const char **argv, const char *prefix)
+{
+	int num_ignored = 0;
+
+	git_config(git_default_config, NULL);
+
+	argc = parse_options(argc, argv, prefix, check_ignore_options,
+			     check_ignore_usage, 0);
+
+	if (stdin_paths) {
+		if (0 < argc)
+			die(_("cannot specify pathnames with --stdin"));
+	} else {
+		if (null_term_line)
+			die(_("-z only makes sense with --stdin"));
+		if (argc == 0)
+			die(_("no path specified"));
+	}
+	if (quiet) {
+		if (argc > 1)
+			die(_("--quiet is only valid with a single pathname"));
+		if (verbose)
+			die(_("cannot have both --quiet and --verbose"));
+	}
+
+	if (stdin_paths) {
+		num_ignored = check_ignore_stdin_paths(prefix);
+	} else {
+		num_ignored = check_ignore(prefix, argv);
+		maybe_flush_or_die(stdout, "ignore to stdout");
+	}
+
+	return num_ignored > 0 ? 0 : 1;
+}
diff --git a/command-list.txt b/command-list.txt
index 14ea67a..ef7f39c 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -12,6 +12,7 @@ git-branch                              mainporcelain common
 git-bundle                              mainporcelain
 git-cat-file                            plumbinginterrogators
 git-check-attr                          purehelpers
+git-check-ignore                        purehelpers
 git-checkout                            mainporcelain common
 git-checkout-index                      plumbingmanipulators
 git-check-ref-format                    purehelpers
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 2e1b5e1..1fb896b 100755
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -842,6 +842,7 @@ __git_list_porcelain_commands ()
 		archimport)       : import;;
 		cat-file)         : plumbing;;
 		check-attr)       : plumbing;;
+		check-ignore)     : plumbing;;
 		check-ref-format) : plumbing;;
 		checkout-index)   : plumbing;;
 		commit-tree)      : plumbing;;
diff --git a/git.c b/git.c
index d232de9..0b31e66 100644
--- a/git.c
+++ b/git.c
@@ -340,6 +340,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "bundle", cmd_bundle, RUN_SETUP_GENTLY },
 		{ "cat-file", cmd_cat_file, RUN_SETUP },
 		{ "check-attr", cmd_check_attr, RUN_SETUP },
+		{ "check-ignore", cmd_check_ignore, RUN_SETUP | NEED_WORK_TREE },
 		{ "check-ref-format", cmd_check_ref_format },
 		{ "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE },
 		{ "checkout-index", cmd_checkout_index,
diff --git a/t/t0007-ignores.sh b/t/t0007-ignores.sh
new file mode 100755
index 0000000..7fd7e53
--- /dev/null
+++ b/t/t0007-ignores.sh
@@ -0,0 +1,587 @@
+#!/bin/sh
+
+test_description=check-ignore
+
+. ./test-lib.sh
+
+init_vars () {
+	global_excludes="$HOME/global-excludes"
+}
+
+enable_global_excludes () {
+	init_vars
+	git config core.excludesfile "$global_excludes"
+}
+
+expect_in () {
+	dest="$HOME/expected-$1" text="$2"
+	if test -z "$text"
+	then
+		>"$dest" # avoid newline
+	else
+		echo "$text" >"$dest"
+	fi
+}
+
+expect () {
+	expect_in stdout "$1"
+}
+
+expect_from_stdin () {
+	cat >"$HOME/expected-stdout"
+}
+
+test_stderr () {
+	expected="$1"
+	expect_in stderr "$1" &&
+	test_cmp "$HOME/expected-stderr" "$HOME/stderr"
+}
+
+stderr_contains () {
+	regexp="$1"
+	if grep -q "$regexp" "$HOME/stderr"
+	then
+		return 0
+	else
+		echo "didn't find /$regexp/ in $HOME/stderr"
+		cat "$HOME/stderr"
+		return 1
+	fi
+}
+
+stderr_empty_on_success () {
+	expect_code="$1"
+	if test $expect_code = 0
+	then
+		test_stderr ""
+	else
+		# If we expect failure then stderr might or might not be empty
+		# due to --quiet - the caller can check its contents
+		return 0
+	fi
+}
+
+test_check_ignore () {
+	args="$1" expect_code="${2:-0}" global_args="$3"
+
+	init_vars &&
+	rm -f "$HOME/stdout" "$HOME/stderr" "$HOME/cmd" &&
+	echo $(which git) $global_args check-ignore $quiet_opt $verbose_opt $args \
+		>"$HOME/cmd" &&
+	pwd >"$HOME/pwd" &&
+	test_expect_code "$expect_code" \
+		git $global_args check-ignore $quiet_opt $verbose_opt $args \
+		>"$HOME/stdout" 2>"$HOME/stderr" &&
+	test_cmp "$HOME/expected-stdout" "$HOME/stdout" &&
+	stderr_empty_on_success "$expect_code"
+}
+
+test_expect_success_multi () {
+	testname="$1" expect_verbose="$2" code="$3"
+
+	expect=$( echo "$expect_verbose" | sed -e 's/.*	//' )
+
+	test_expect_success "$testname" "
+		expect '$expect' &&
+		$code
+	"
+
+	for quiet_opt in '-q' '--quiet'
+	do
+		test_expect_success "$testname${quiet_opt:+ with $quiet_opt}" "
+			expect '' &&
+			$code
+		"
+	done
+	quiet_opt=
+
+	for verbose_opt in '-v' '--verbose'
+	do
+		test_expect_success "$testname${verbose_opt:+ with $verbose_opt}" "
+			expect '$expect_verbose' &&
+			$code
+		"
+	done
+	verbose_opt=
+}
+
+test_expect_success 'setup' '
+	init_vars
+	mkdir -p a/b/ignored-dir a/submodule b &&
+	ln -s b a/symlink &&
+	(
+		cd a/submodule &&
+		git init &&
+		echo a > a &&
+		git add a &&
+		git commit -m"commit in submodule"
+	) &&
+	git add a/submodule &&
+	cat <<-\EOF >.gitignore &&
+		one
+	EOF
+	cat <<-\EOF >a/.gitignore &&
+		two*
+		*three
+	EOF
+	cat <<-\EOF >a/b/.gitignore &&
+		four
+		five
+		# this comment should affect the line numbers
+		six
+		ignored-dir/
+		# and so should this blank line:
+
+		!on*
+		!two
+	EOF
+	echo "seven" >a/b/ignored-dir/.gitignore &&
+	test -n "$HOME" &&
+	cat <<-\EOF >"$global_excludes" &&
+		globalone
+		!globaltwo
+		globalthree
+	EOF
+	cat <<-\EOF >>.git/info/exclude
+		per-repo
+	EOF
+'
+
+############################################################################
+#
+# test invalid inputs
+
+test_expect_success_multi 'empty command line' '' '
+	test_check_ignore "" 128 &&
+	stderr_contains "fatal: no path specified"
+'
+
+test_expect_success '-q with multiple args' '
+	expect "" &&
+	test_check_ignore "-q one two" 128 &&
+	stderr_contains "fatal: --quiet is only valid with a single pathname"
+'
+
+test_expect_success '--quiet with multiple args' '
+	expect "" &&
+	test_check_ignore "--quiet one two" 128 &&
+	stderr_contains "fatal: --quiet is only valid with a single pathname"
+'
+
+for verbose_opt in '-v' '--verbose'
+do
+	for quiet_opt in '-q' '--quiet'
+	do
+		test_expect_success "$quiet_opt $verbose_opt" "
+			expect '' &&
+			test_check_ignore '$quiet_opt $verbose_opt foo' 128 &&
+			stderr_contains 'fatal: cannot have both --quiet and --verbose'
+		"
+	done
+done
+
+test_expect_success '--quiet with multiple args' '
+	expect "" &&
+	test_check_ignore "--quiet one two" 128 &&
+	stderr_contains "fatal: --quiet is only valid with a single pathname"
+'
+
+test_expect_success_multi 'erroneous use of --' '' '
+	test_check_ignore "--" 128 &&
+	stderr_contains "fatal: no path specified"
+'
+
+test_expect_success_multi '--stdin with superfluous arg' '' '
+	test_check_ignore "--stdin foo" 128 &&
+	stderr_contains "fatal: cannot specify pathnames with --stdin"
+'
+
+test_expect_success_multi '--stdin -z with superfluous arg' '' '
+	test_check_ignore "--stdin -z foo" 128 &&
+	stderr_contains "fatal: cannot specify pathnames with --stdin"
+'
+
+test_expect_success_multi '-z without --stdin' '' '
+	test_check_ignore "-z" 128 &&
+	stderr_contains "fatal: -z only makes sense with --stdin"
+'
+
+test_expect_success_multi '-z without --stdin and superfluous arg' '' '
+	test_check_ignore "-z foo" 128 &&
+	stderr_contains "fatal: -z only makes sense with --stdin"
+'
+
+test_expect_success_multi 'needs work tree' '' '
+	(
+		cd .git &&
+		test_check_ignore "foo" 128
+	) &&
+	stderr_contains "fatal: This operation must be run in a work tree"
+'
+
+############################################################################
+#
+# test standard ignores
+
+test_expect_success_multi "top-level not ignored" '' '
+	test_check_ignore "foo" 1
+'
+
+test_expect_success_multi "top-level ignored" \
+	'.gitignore:1:one	one' '
+	test_check_ignore "one"
+'
+
+test_expect_success_multi 'sub-directory ignore from top' \
+	'.gitignore:1:one	a/one' '
+	test_check_ignore "a/one"
+'
+
+test_expect_success 'sub-directory local ignore' '
+	expect "a/3-three" &&
+	test_check_ignore "a/3-three a/three-not-this-one"
+'
+
+test_expect_success 'sub-directory local ignore with --verbose'  '
+	expect "a/.gitignore:2:*three	a/3-three" &&
+	test_check_ignore "--verbose a/3-three a/three-not-this-one"
+'
+
+test_expect_success 'local ignore inside a sub-directory' '
+	expect "3-three" &&
+	(
+		cd a &&
+		test_check_ignore "3-three three-not-this-one"
+	)
+'
+test_expect_success 'local ignore inside a sub-directory with --verbose' '
+	expect "a/.gitignore:2:*three	3-three" &&
+	(
+		cd a &&
+		test_check_ignore "--verbose 3-three three-not-this-one"
+	)
+'
+
+test_expect_success_multi 'nested include' \
+	'a/b/.gitignore:8:!on*	a/b/one' '
+	test_check_ignore "a/b/one"
+'
+
+############################################################################
+#
+# test ignored sub-directories
+
+test_expect_success_multi 'ignored sub-directory' \
+	'a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir' '
+	test_check_ignore "a/b/ignored-dir"
+'
+
+test_expect_success 'multiple files inside ignored sub-directory' '
+	expect_from_stdin <<-\EOF &&
+		a/b/ignored-dir/foo
+		a/b/ignored-dir/twoooo
+		a/b/ignored-dir/seven
+	EOF
+	test_check_ignore "a/b/ignored-dir/foo a/b/ignored-dir/twoooo a/b/ignored-dir/seven"
+'
+
+test_expect_success 'multiple files inside ignored sub-directory with -v' '
+	expect_from_stdin <<-\EOF &&
+		a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir/foo
+		a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir/twoooo
+		a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir/seven
+	EOF
+	test_check_ignore "-v a/b/ignored-dir/foo a/b/ignored-dir/twoooo a/b/ignored-dir/seven"
+'
+
+test_expect_success 'cd to ignored sub-directory' '
+	expect_from_stdin <<-\EOF &&
+		foo
+		twoooo
+		../one
+		seven
+		../../one
+	EOF
+	(
+		cd a/b/ignored-dir &&
+		test_check_ignore "foo twoooo ../one seven ../../one"
+	)
+'
+
+test_expect_success 'cd to ignored sub-directory with -v' '
+	expect_from_stdin <<-\EOF &&
+		a/b/.gitignore:5:ignored-dir/	foo
+		a/b/.gitignore:5:ignored-dir/	twoooo
+		a/b/.gitignore:8:!on*	../one
+		a/b/.gitignore:5:ignored-dir/	seven
+		.gitignore:1:one	../../one
+	EOF
+	(
+		cd a/b/ignored-dir &&
+		test_check_ignore "-v foo twoooo ../one seven ../../one"
+	)
+'
+
+############################################################################
+#
+# test handling of symlinks
+
+test_expect_success_multi 'symlink' '' '
+	test_check_ignore "a/symlink" 1
+'
+
+test_expect_success_multi 'beyond a symlink' '' '
+	test_check_ignore "a/symlink/foo" 128 &&
+	test_stderr "fatal: '\''a/symlink/foo'\'' is beyond a symbolic link"
+'
+
+test_expect_success_multi 'beyond a symlink from subdirectory' '' '
+	(
+		cd a &&
+		test_check_ignore "symlink/foo" 128
+	) &&
+	test_stderr "fatal: '\''symlink/foo'\'' is beyond a symbolic link"
+'
+
+############################################################################
+#
+# test handling of submodules
+
+test_expect_success_multi 'submodule' '' '
+	test_check_ignore "a/submodule/one" 128 &&
+	test_stderr "fatal: Path '\''a/submodule/one'\'' is in submodule '\''a/submodule'\''"
+'
+
+test_expect_success_multi 'submodule from subdirectory' '' '
+	(
+		cd a &&
+		test_check_ignore "submodule/one" 128
+	) &&
+	test_stderr "fatal: Path '\''a/submodule/one'\'' is in submodule '\''a/submodule'\''"
+'
+
+############################################################################
+#
+# test handling of global ignore files
+
+test_expect_success 'global ignore not yet enabled' '
+	expect_from_stdin <<-\EOF &&
+		.git/info/exclude:7:per-repo	per-repo
+		a/.gitignore:2:*three	a/globalthree
+		.git/info/exclude:7:per-repo	a/per-repo
+	EOF
+	test_check_ignore "-v globalone per-repo a/globalthree a/per-repo not-ignored a/globaltwo"
+'
+
+test_expect_success 'global ignore' '
+	enable_global_excludes &&
+	expect_from_stdin <<-\EOF &&
+		globalone
+		per-repo
+		globalthree
+		a/globalthree
+		a/per-repo
+		globaltwo
+	EOF
+	test_check_ignore "globalone per-repo globalthree a/globalthree a/per-repo not-ignored globaltwo"
+'
+
+test_expect_success 'global ignore with -v' '
+	enable_global_excludes &&
+	expect_from_stdin <<-EOF &&
+		$global_excludes:1:globalone	globalone
+		.git/info/exclude:7:per-repo	per-repo
+		$global_excludes:3:globalthree	globalthree
+		a/.gitignore:2:*three	a/globalthree
+		.git/info/exclude:7:per-repo	a/per-repo
+		$global_excludes:2:!globaltwo	globaltwo
+	EOF
+	test_check_ignore "-v globalone per-repo globalthree a/globalthree a/per-repo not-ignored globaltwo"
+'
+
+############################################################################
+#
+# test --stdin
+
+cat <<-\EOF >stdin
+	one
+	not-ignored
+	a/one
+	a/not-ignored
+	a/b/on
+	a/b/one
+	a/b/one one
+	"a/b/one two"
+	"a/b/one\"three"
+	a/b/not-ignored
+	a/b/two
+	a/b/twooo
+	globaltwo
+	a/globaltwo
+	a/b/globaltwo
+	b/globaltwo
+EOF
+cat <<-\EOF >expected-default
+	one
+	a/one
+	a/b/on
+	a/b/one
+	a/b/one one
+	a/b/one two
+	"a/b/one\"three"
+	a/b/two
+	a/b/twooo
+	globaltwo
+	a/globaltwo
+	a/b/globaltwo
+	b/globaltwo
+EOF
+cat <<-EOF >expected-verbose
+	.gitignore:1:one	one
+	.gitignore:1:one	a/one
+	a/b/.gitignore:8:!on*	a/b/on
+	a/b/.gitignore:8:!on*	a/b/one
+	a/b/.gitignore:8:!on*	a/b/one one
+	a/b/.gitignore:8:!on*	a/b/one two
+	a/b/.gitignore:8:!on*	"a/b/one\"three"
+	a/b/.gitignore:9:!two	a/b/two
+	a/.gitignore:1:two*	a/b/twooo
+	$global_excludes:2:!globaltwo	globaltwo
+	$global_excludes:2:!globaltwo	a/globaltwo
+	$global_excludes:2:!globaltwo	a/b/globaltwo
+	$global_excludes:2:!globaltwo	b/globaltwo
+EOF
+
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
+	tr "\n" "\0" >stdin0
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' expected-default | \
+	tr "\n" "\0" >expected-default0
+sed -e 's/	"/	/' -e 's/\\//' -e 's/"$//' expected-verbose | \
+	tr ":\t\n" "\0" >expected-verbose0
+
+test_expect_success '--stdin' '
+	expect_from_stdin <expected-default &&
+	test_check_ignore "--stdin" <stdin
+'
+
+test_expect_success '--stdin -q' '
+	expect "" &&
+	test_check_ignore "-q --stdin" <stdin
+'
+
+test_expect_success '--stdin -v' '
+	expect_from_stdin <expected-verbose &&
+	test_check_ignore "-v --stdin" <stdin
+'
+
+for opts in '--stdin -z' '-z --stdin'
+do
+	test_expect_success "$opts" "
+		expect_from_stdin <expected-default0 &&
+		test_check_ignore '$opts' <stdin0
+	"
+
+	test_expect_success "$opts -q" "
+		expect "" &&
+		test_check_ignore '-q $opts' <stdin0
+	"
+
+	test_expect_success "$opts -v" "
+		expect_from_stdin <expected-verbose0 &&
+		test_check_ignore '-v $opts' <stdin0
+	"
+done
+
+cat <<-\EOF >stdin
+	../one
+	../not-ignored
+	one
+	not-ignored
+	b/on
+	b/one
+	b/one one
+	"b/one two"
+	"b/one\"three"
+	b/two
+	b/not-ignored
+	b/twooo
+	../globaltwo
+	globaltwo
+	b/globaltwo
+	../b/globaltwo
+EOF
+cat <<-\EOF >expected-default
+	../one
+	one
+	b/on
+	b/one
+	b/one one
+	b/one two
+	"b/one\"three"
+	b/two
+	b/twooo
+	../globaltwo
+	globaltwo
+	b/globaltwo
+	../b/globaltwo
+EOF
+cat <<-EOF >expected-verbose
+	.gitignore:1:one	../one
+	.gitignore:1:one	one
+	a/b/.gitignore:8:!on*	b/on
+	a/b/.gitignore:8:!on*	b/one
+	a/b/.gitignore:8:!on*	b/one one
+	a/b/.gitignore:8:!on*	b/one two
+	a/b/.gitignore:8:!on*	"b/one\"three"
+	a/b/.gitignore:9:!two	b/two
+	a/.gitignore:1:two*	b/twooo
+	$global_excludes:2:!globaltwo	../globaltwo
+	$global_excludes:2:!globaltwo	globaltwo
+	$global_excludes:2:!globaltwo	b/globaltwo
+	$global_excludes:2:!globaltwo	../b/globaltwo
+EOF
+
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
+	tr "\n" "\0" >stdin0
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' expected-default | \
+	tr "\n" "\0" >expected-default0
+sed -e 's/	"/	/' -e 's/\\//' -e 's/"$//' expected-verbose | \
+	tr ":\t\n" "\0" >expected-verbose0
+
+test_expect_success '--stdin from subdirectory' '
+	expect_from_stdin <expected-default &&
+	(
+		cd a &&
+		test_check_ignore "--stdin" <../stdin
+	)
+'
+
+test_expect_success '--stdin from subdirectory with -v' '
+	expect_from_stdin <expected-verbose &&
+	(
+		cd a &&
+		test_check_ignore "--stdin -v" <../stdin
+	)
+'
+
+for opts in '--stdin -z' '-z --stdin'
+do
+	test_expect_success "$opts from subdirectory" '
+		expect_from_stdin <expected-default0 &&
+		(
+			cd a &&
+			test_check_ignore "'"$opts"'" <../stdin0
+		)
+	'
+
+	test_expect_success "$opts from subdirectory with -v" '
+		expect_from_stdin <expected-verbose0 &&
+		(
+			cd a &&
+			test_check_ignore "'"$opts"' -v" <../stdin0
+		)
+	'
+done
+
+
+test_done
-- 
1.8.0.rc0.29.g1fdd78f

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: nd/attr-match-more-optim, nd/wildmatch and as/check-ignore
  2012-10-15  6:23 nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
@ 2012-10-15 22:13 ` Junio C Hamano
  3 siblings, 0 replies; 55+ messages in thread
From: Junio C Hamano @ 2012-10-15 22:13 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Adam Spiers

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> These three series all touch the same code in dir.c and cause
> a bunch of conflicts. So I rebase nd/wildmatch and as/check-ignore
> on top of nd/attr-match-more-optim and resolve all conflicts.

Thanks for working on this.  Makes my life much easier.

If not "my life", it certainly will make it easier to shift blame to
others when things go wrong (it is called "easier to bisect" ;-)

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-15  6:28   ` [PATCH 12/12] Add git-check-ignore sub-command Nguyễn Thái Ngọc Duy
@ 2012-10-15 22:31     ` Junio C Hamano
  2012-10-16 11:08       ` Nguyen Thai Ngoc Duy
  2012-10-16 14:13       ` Adam Spiers
  2012-11-04 21:07     ` [PATCH as/check-ignore] t0007: fix tests on Windows Johannes Sixt
  1 sibling, 2 replies; 55+ messages in thread
From: Junio C Hamano @ 2012-10-15 22:31 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git, Nguyễn Thái Ngọc Duy

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> +For each pathname given via the command-line or from a file via
> +`--stdin`, this command will list the first exclude pattern found (if
> +any) which explicitly excludes or includes that pathname.  Note that
> +within any given exclude file, later patterns take precedence over
> +earlier ones, so any matching pattern which this command outputs may
> +not be the one you would immediately expect.

"The first exclude pattern" is very misleading, isn't it?  For
example, with these in $GIT_DIR/info/exclude, I would get:

	$ cat -n .git/info/exclude
	  1 *~
          2 Makefile~
	$ git check-ignore -v Makefile~
	.git/info/exclude:2:Makefile~	Makefile~

which is the correct result (the last one in a single source decides
the fate of the path), but it hardly is "first one found" and the
matching pattern in the output would not be something unexpected for
the users, either.

The reason it is "the first one found" is because the implementation
arranges the loop in such a way that it can stop early when it finds
a match---it simply checks matches from the end of the source.

But that is not visible to end-users, and they will find the above
description just wrong, no?


> +OUTPUT
> +------
> +
> +By default, any of the given pathnames which match an ignore pattern
> +will be output, one per line.  If no pattern matches a given path,
> +nothing will be output for that path; this means that path will not be
> +ignored.
> +
> +If `--verbose` is specified, the output is a series of lines of the form:
> +
> +<source> <COLON> <linenum> <COLON> <pattern> <HT> <pathname>
> +
> +<pathname> is the path of a file being queried, <pattern> is the
> +matching pattern, <source> is the pattern's source file, and <linenum>
> +is the line number of the pattern within that source.  If the pattern
> +contained a `!` prefix or `/` suffix, it will be preserved in the
> +output.  <source> will be an absolute path when referring to the file
> +configured by `core.excludesfile`, or relative to the repository root
> +when referring to `.git/info/exclude` or a per-directory exclude file.
> +
> +If `-z` is specified, the output is a series of lines of the form:
> +

Hmph... the remainder of the paragraph seems to have been chopped off.

> +EXIT STATUS
> +-----------
> +
> +0::
> +	One or more of the provided paths is ignored.
> +
> +1::
> +	None of the provided paths are ignored.
> +
> +128::
> +	A fatal error was encountered.
> +
> +SEE ALSO
> +--------
> +linkgit:gitignore[5]
> +linkgit:gitconfig[5]
> +linkgit:git-ls-files[5]
> +
> +GIT
> +---
> +Part of the linkgit:git[1] suite

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-15 22:31     ` Junio C Hamano
@ 2012-10-16 11:08       ` Nguyen Thai Ngoc Duy
  2012-10-16 14:09         ` Adam Spiers
  2012-10-16 14:13       ` Adam Spiers
  1 sibling, 1 reply; 55+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-10-16 11:08 UTC (permalink / raw)
  To: Junio C Hamano, Adam Spiers; +Cc: git

Adam, do you have time to continue this series? I can help polish it
for inclusion, but I don't want to step in your way if you are quietly
updating it.

On Tue, Oct 16, 2012 at 5:31 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> +
>> +If `-z` is specified, the output is a series of lines of the form:
>> +
>
> Hmph... the remainder of the paragraph seems to have been chopped off.

No, the last version [1] looks the same. I was worried there was a bug
somewhere in the toolchain that chopped it off.

[1] http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=206085

>> +EXIT STATUS
>> +-----------
>> +
>> +0::
>> +     One or more of the provided paths is ignored.
-- 
Duy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-16 11:08       ` Nguyen Thai Ngoc Duy
@ 2012-10-16 14:09         ` Adam Spiers
  2012-10-16 15:07           ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 55+ messages in thread
From: Adam Spiers @ 2012-10-16 14:09 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: Junio C Hamano, git

Hi again,

Firstly thanks very much for your recent work on this series!

On Tue, Oct 16, 2012 at 4:08 AM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> Adam, do you have time to continue this series? I can help polish it
> for inclusion, but I don't want to step in your way if you are quietly
> updating it.

I was *intending* to finish it off soon, but I have been really busy
with work and other commitments recently, which has prevented this.  I
don't currently have any unpublished changes which would conflict with
your recent work, and I'm at a conference this week, so feel free to
carry on polishing if you want.  However I will probably have some
responses on the discussion about current issues, so it would be good
if I was given a chance to catch up on this discussion before the
series makes its way to master.

Thanks again!

Adam

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-15 22:31     ` Junio C Hamano
  2012-10-16 11:08       ` Nguyen Thai Ngoc Duy
@ 2012-10-16 14:13       ` Adam Spiers
  2012-10-16 16:12         ` Junio C Hamano
  1 sibling, 1 reply; 55+ messages in thread
From: Adam Spiers @ 2012-10-16 14:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Nguyễn Thái Ngọc

On Mon, Oct 15, 2012 at 3:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> +For each pathname given via the command-line or from a file via
>> +`--stdin`, this command will list the first exclude pattern found (if
>> +any) which explicitly excludes or includes that pathname.  Note that
>> +within any given exclude file, later patterns take precedence over
>> +earlier ones, so any matching pattern which this command outputs may
>> +not be the one you would immediately expect.
>
> "The first exclude pattern" is very misleading, isn't it?

I don't think so, because of the second sentence.

> For example, with these in $GIT_DIR/info/exclude, I would get:
>
>         $ cat -n .git/info/exclude
>           1 *~
>           2 Makefile~
>         $ git check-ignore -v Makefile~
>         .git/info/exclude:2:Makefile~   Makefile~
>
> which is the correct result (the last one in a single source decides
> the fate of the path), but it hardly is "first one found" and the
> matching pattern in the output would not be something unexpected for
> the users, either.
>
> The reason it is "the first one found" is because the implementation
> arranges the loop in such a way that it can stop early when it finds
> a match---it simply checks matches from the end of the source.
>
> But that is not visible to end-users,

Correct; that's precisely why I wrote the second sentence which
explicitly explains this.

> and they will find the above description just wrong, no?

It's not wrong AFAICS, but suggestions for rewording this more clearly
are of course welcome.  Maybe s/immediately/intuitively/ ?

>> +OUTPUT
>> +------
>> +
>> +By default, any of the given pathnames which match an ignore pattern
>> +will be output, one per line.  If no pattern matches a given path,
>> +nothing will be output for that path; this means that path will not be
>> +ignored.
>> +
>> +If `--verbose` is specified, the output is a series of lines of the form:
>> +
>> +<source> <COLON> <linenum> <COLON> <pattern> <HT> <pathname>
>> +
>> +<pathname> is the path of a file being queried, <pattern> is the
>> +matching pattern, <source> is the pattern's source file, and <linenum>
>> +is the line number of the pattern within that source.  If the pattern
>> +contained a `!` prefix or `/` suffix, it will be preserved in the
>> +output.  <source> will be an absolute path when referring to the file
>> +configured by `core.excludesfile`, or relative to the repository root
>> +when referring to `.git/info/exclude` or a per-directory exclude file.
>> +
>> +If `-z` is specified, the output is a series of lines of the form:
>> +
>
> Hmph... the remainder of the paragraph seems to have been chopped off.

Yes, an earlier review also caught this but I have not had time to fix
it yet, sorry :-/

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-16 14:09         ` Adam Spiers
@ 2012-10-16 15:07           ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 55+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-10-16 15:07 UTC (permalink / raw)
  To: Adam Spiers; +Cc: Junio C Hamano, git

On Tue, Oct 16, 2012 at 9:09 PM, Adam Spiers <git@adamspiers.org> wrote:
> I was *intending* to finish it off soon, but I have been really busy
> with work and other commitments recently, which has prevented this.  I
> don't currently have any unpublished changes which would conflict with
> your recent work, and I'm at a conference this week, so feel free to
> carry on polishing if you want.  However I will probably have some
> responses on the discussion about current issues, so it would be good
> if I was given a chance to catch up on this discussion before the
> series makes its way to master.

No hurry. And thanks for your contribution.
-- 
Duy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-16 14:13       ` Adam Spiers
@ 2012-10-16 16:12         ` Junio C Hamano
  2012-12-17  0:10           ` Adam Spiers
  0 siblings, 1 reply; 55+ messages in thread
From: Junio C Hamano @ 2012-10-16 16:12 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git, Nguyễn Thái Ngọc

Adam Spiers <git@adamspiers.org> writes:

> On Mon, Oct 15, 2012 at 3:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>>
>>> +For each pathname given via the command-line or from a file via
>>> +`--stdin`, this command will list the first exclude pattern found (if
>>> +any) which explicitly excludes or includes that pathname.  Note that
>>> +within any given exclude file, later patterns take precedence over
>>> +earlier ones, so any matching pattern which this command outputs may
>>> +not be the one you would immediately expect.
>>
>> "The first exclude pattern" is very misleading, isn't it?
>
> I don't think so, because of the second sentence.
>
>> For example, with these in $GIT_DIR/info/exclude, I would get:
>>
>>         $ cat -n .git/info/exclude
>>           1 *~
>>           2 Makefile~
>>         $ git check-ignore -v Makefile~
>>         .git/info/exclude:2:Makefile~   Makefile~
>>
>> which is the correct result (the last one in a single source decides
>> the fate of the path), but it hardly is "first one found" and the
>> matching pattern in the output would not be something unexpected for
>> the users, either.
>>
>> The reason it is "the first one found" is because the implementation
>> arranges the loop in such a way that it can stop early when it finds
>> a match---it simply checks matches from the end of the source.
>>
>> But that is not visible to end-users,
>
> Correct; that's precisely why I wrote the second sentence which
> explicitly explains this.
>
>> and they will find the above description just wrong, no?
>
> It's not wrong AFAICS, but suggestions for rewording this more clearly
> are of course welcome.  Maybe s/immediately/intuitively/ ?

I think this is sufficient:

	For each pathname given via the command-line or from a file
	via `--stdin`, show the pattern from .gitignore (or other
	input files to the exclude mechanism) that decides if the
	pathname is excluded.

and without "Note that" at all.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 14/13] wildmatch: fix tests that fail on Windows due to path mangling
  2012-10-15  6:26   ` [PATCH 13/13] Support "**" wildcard in .gitignore and .gitattributes Nguyễn Thái Ngọc Duy
@ 2012-11-04 21:00     ` Johannes Sixt
  2012-11-06 12:47       ` Nguyen Thai Ngoc Duy
  2012-11-11 10:13       ` [PATCH 14/13] test-wildmatch: " Nguyễn Thái Ngọc Duy
  0 siblings, 2 replies; 55+ messages in thread
From: Johannes Sixt @ 2012-11-04 21:00 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano

Patterns beginning with a slash are converted to Windows paths before
test-wildmatch gets to see them. Use a different first character. This
does change the meaning of the test because the slash is special. But:

- The first pair of changed lines the test is about '*' matching an empty
  string, which does not need the slash.

- The second pair of changed lines the test is about a sequence of '*/',
  and I think we can afford to test without the leading slash.

One case is unchanged:

    match 1 x '/foo' '**/foo'

Even though the test is about ** matching zero levels at the beginning of
a path, on Windows, '**' actually matches something because /foo is
converted to c:\path\to\msys\foo. Let's be lazy and let this pass.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 After this change, there are still 3 failing tests that are in connection
 with [[:xdigit:]]. Don't know, yet, what's going on there.

 t/t3070-wildmatch.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index e6ad6f4..74c0f7c 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -95,8 +95,8 @@ match 0 0 ']' '[!]-]'
 match 1 x 'a' '[!]-]'
 match 0 0 '' '\'
 match 0 x '\' '\'
-match 0 x '/\' '*/\'
-match 1 x '/\' '*/\\'
+match 0 x '-\' '*-\'
+match 1 x '-\' '*-\\'
 match 1 1 'foo' 'foo'
 match 1 1 '@foo' '@foo'
 match 0 0 'foo' '@foo'
@@ -187,8 +187,8 @@ match 0 0 '-' '[[-\]]'
 match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
-match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
-match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 1 1 'adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 0 0 'adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '*/*/*/*/*/*/12/*/*/*/m/*/*/*'
 match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
 match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
 
-- 
1.8.0.rc0.45.g6c9d890

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH as/check-ignore] t0007: fix tests on Windows
  2012-10-15  6:28   ` [PATCH 12/12] Add git-check-ignore sub-command Nguyễn Thái Ngọc Duy
  2012-10-15 22:31     ` Junio C Hamano
@ 2012-11-04 21:07     ` Johannes Sixt
  2012-11-08 18:04       ` Jeff King
  1 sibling, 1 reply; 55+ messages in thread
From: Johannes Sixt @ 2012-11-04 21:07 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Adam Spiers

The value of $global_excludes is sometimes part of the output
that is tested for. Since git on Windows only sees DOS style paths,
we have to ensure that the "expected" values are constructed in
the same manner. To account for this, use $(pwd) to set the value
of global_excludes.

Additionally, add a SYMLINKS prerequisite to the tests involving
symbolic links.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 t/t0007-ignores.sh | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/t/t0007-ignores.sh b/t/t0007-ignores.sh
index 7fd7e53..5d2b8f2 100755
--- a/t/t0007-ignores.sh
+++ b/t/t0007-ignores.sh
@@ -5,7 +5,7 @@ test_description=check-ignore
 . ./test-lib.sh
 
 init_vars () {
-	global_excludes="$HOME/global-excludes"
+	global_excludes="$(pwd)/global-excludes"
 }
 
 enable_global_excludes () {
@@ -77,18 +77,24 @@ test_check_ignore () {
 }
 
 test_expect_success_multi () {
+	prereq=
+	if test $# -eq 4
+	then
+		prereq=$1
+		shift
+	fi
 	testname="$1" expect_verbose="$2" code="$3"
 
 	expect=$( echo "$expect_verbose" | sed -e 's/.*	//' )
 
-	test_expect_success "$testname" "
+	test_expect_success $prereq "$testname" "
 		expect '$expect' &&
 		$code
 	"
 
 	for quiet_opt in '-q' '--quiet'
 	do
-		test_expect_success "$testname${quiet_opt:+ with $quiet_opt}" "
+		test_expect_success $prereq "$testname${quiet_opt:+ with $quiet_opt}" "
 			expect '' &&
 			$code
 		"
@@ -97,7 +103,7 @@ test_expect_success_multi () {
 
 	for verbose_opt in '-v' '--verbose'
 	do
-		test_expect_success "$testname${verbose_opt:+ with $verbose_opt}" "
+		test_expect_success $prereq "$testname${verbose_opt:+ with $verbose_opt}" "
 			expect '$expect_verbose' &&
 			$code
 		"
@@ -108,7 +114,10 @@ test_expect_success_multi () {
 test_expect_success 'setup' '
 	init_vars
 	mkdir -p a/b/ignored-dir a/submodule b &&
-	ln -s b a/symlink &&
+	if test_have_prereq SYMLINKS
+	then
+		ln -s b a/symlink
+	fi &&
 	(
 		cd a/submodule &&
 		git init &&
@@ -326,16 +335,16 @@ test_expect_success 'cd to ignored sub-directory with -v' '
 #
 # test handling of symlinks
 
-test_expect_success_multi 'symlink' '' '
+test_expect_success_multi SYMLINKS 'symlink' '' '
 	test_check_ignore "a/symlink" 1
 '
 
-test_expect_success_multi 'beyond a symlink' '' '
+test_expect_success_multi SYMLINKS 'beyond a symlink' '' '
 	test_check_ignore "a/symlink/foo" 128 &&
 	test_stderr "fatal: '\''a/symlink/foo'\'' is beyond a symbolic link"
 '
 
-test_expect_success_multi 'beyond a symlink from subdirectory' '' '
+test_expect_success_multi SYMLINKS 'beyond a symlink from subdirectory' '' '
 	(
 		cd a &&
 		test_check_ignore "symlink/foo" 128
-- 
1.8.0.rc0.45.g6c9d890

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 14/13] wildmatch: fix tests that fail on Windows due to path mangling
  2012-11-04 21:00     ` [PATCH 14/13] wildmatch: fix tests that fail on Windows due to path mangling Johannes Sixt
@ 2012-11-06 12:47       ` Nguyen Thai Ngoc Duy
  2012-11-07 19:32         ` Johannes Sixt
  2012-11-11 10:13       ` [PATCH 14/13] test-wildmatch: " Nguyễn Thái Ngọc Duy
  1 sibling, 1 reply; 55+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-11-06 12:47 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, Junio C Hamano

On Mon, Nov 5, 2012 at 4:00 AM, Johannes Sixt <j6t@kdbg.org> wrote:
> Patterns beginning with a slash are converted to Windows paths before
> test-wildmatch gets to see them. Use a different first character.

Or we could prepend the paths with something, which is then cut out by
test-wildmatch. Not sure if it's intuitive to look at the tests
though.

>  After this change, there are still 3 failing tests that are in connection
>  with [[:xdigit:]]. Don't know, yet, what's going on there.

the wildmatch tests or fnmatch ones?
-- 
Duy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 14/13] wildmatch: fix tests that fail on Windows due to path mangling
  2012-11-06 12:47       ` Nguyen Thai Ngoc Duy
@ 2012-11-07 19:32         ` Johannes Sixt
  0 siblings, 0 replies; 55+ messages in thread
From: Johannes Sixt @ 2012-11-07 19:32 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, Junio C Hamano

Am 06.11.2012 13:47, schrieb Nguyen Thai Ngoc Duy:
> On Mon, Nov 5, 2012 at 4:00 AM, Johannes Sixt <j6t@kdbg.org> wrote:
>> Patterns beginning with a slash are converted to Windows paths before
>> test-wildmatch gets to see them. Use a different first character.
> 
> Or we could prepend the paths with something, which is then cut out by
> test-wildmatch. Not sure if it's intuitive to look at the tests
> though.

That would be a possibility, too.

>>  After this change, there are still 3 failing tests that are in connection
>>  with [[:xdigit:]]. Don't know, yet, what's going on there.
> 
> the wildmatch tests or fnmatch ones?

fnmatch ones:

ok 147 - wildmatch:    match '5' '[[:xdigit:]]'
not ok 148 - fnmatch:      match '5' '[[:xdigit:]]'
#
#                   test-wildmatch fnmatch '5' '[[:xdigit:]]'
#
ok 149 - wildmatch:    match 'f' '[[:xdigit:]]'
not ok 150 - fnmatch:      match 'f' '[[:xdigit:]]'
#
#                   test-wildmatch fnmatch 'f' '[[:xdigit:]]'
#
ok 151 - wildmatch:    match 'D' '[[:xdigit:]]'
not ok 152 - fnmatch:      match 'D' '[[:xdigit:]]'
#
#                   test-wildmatch fnmatch 'D' '[[:xdigit:]]'
#

She same tests fail on Linux, BTW, when built with NO_FNMATCH=1 and the
"#if defined _LIBC || !defined __GNU_LIBRARY__" brackets removed.

-- Hannes

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH as/check-ignore] t0007: fix tests on Windows
  2012-11-04 21:07     ` [PATCH as/check-ignore] t0007: fix tests on Windows Johannes Sixt
@ 2012-11-08 18:04       ` Jeff King
  0 siblings, 0 replies; 55+ messages in thread
From: Jeff King @ 2012-11-08 18:04 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Nguyễn Thái Ngọc Duy, git, Junio C Hamano,
	Adam Spiers

On Sun, Nov 04, 2012 at 10:07:54PM +0100, Johannes Sixt wrote:

> The value of $global_excludes is sometimes part of the output
> that is tested for. Since git on Windows only sees DOS style paths,
> we have to ensure that the "expected" values are constructed in
> the same manner. To account for this, use $(pwd) to set the value
> of global_excludes.
> 
> Additionally, add a SYMLINKS prerequisite to the tests involving
> symbolic links.
> 
> Signed-off-by: Johannes Sixt <j6t@kdbg.org>

Thanks. I put this on top of as/check-ignore in 'pu' for reference, but
I am still expecting a re-roll from Adam or Duy at some point.

-Peff

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 14/13] test-wildmatch: fix tests that fail on Windows due to path mangling
  2012-11-04 21:00     ` [PATCH 14/13] wildmatch: fix tests that fail on Windows due to path mangling Johannes Sixt
  2012-11-06 12:47       ` Nguyen Thai Ngoc Duy
@ 2012-11-11 10:13       ` Nguyễn Thái Ngọc Duy
  2012-11-11 10:13         ` [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check Nguyễn Thái Ngọc Duy
  2012-11-11 10:47         ` [PATCH 14/13] test-wildmatch: fix tests that fail on Windows due to path mangling Junio C Hamano
  1 sibling, 2 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-11-11 10:13 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Sixt,
	Nguyễn Thái Ngọc Duy

Patterns beginning with a slash are converted to Windows paths before
test-wildmatch gets to see them. Avoid this case by always writing
XXX/abc instead of /abc. The leading XXX will be removed by
test-wildmatch itself before processing.

Any patterns beginning with a forward slash is rejected by
test-wildmatch to avoid the same fault in future.

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 On top of nd/wildmatch, apparently.

 t/t3070-wildmatch.sh | 10 +++++-----
 test-wildmatch.c     |  8 ++++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index e6ad6f4..3155eab 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -74,7 +74,7 @@ match 0 0 'foo/bar' 'foo[/]bar'
 match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 0 'foo' '**/foo'
-match 1 x '/foo' '**/foo'
+match 1 x 'XXX/foo' '**/foo'
 match 1 0 'bar/baz/foo' '**/foo'
 match 0 0 'bar/baz/foo' '*/foo'
 match 0 0 'foo/bar/baz' '**/bar*'
@@ -95,8 +95,8 @@ match 0 0 ']' '[!]-]'
 match 1 x 'a' '[!]-]'
 match 0 0 '' '\'
 match 0 x '\' '\'
-match 0 x '/\' '*/\'
-match 1 x '/\' '*/\\'
+match 0 x 'XXX/\' '*/\'
+match 1 x 'XXX/\' '*/\\'
 match 1 1 'foo' 'foo'
 match 1 1 '@foo' '@foo'
 match 0 0 'foo' '@foo'
@@ -187,8 +187,8 @@ match 0 0 '-' '[[-\]]'
 match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
-match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
-match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 1 1 'XXX/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 0 0 'XXX/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
 match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
 match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
 
diff --git a/test-wildmatch.c b/test-wildmatch.c
index 74c0864..e384c8e 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -3,6 +3,14 @@
 
 int main(int argc, char **argv)
 {
+	int i;
+	for (i = 2; i < argc; i++) {
+		if (argv[i][0] == '/')
+			die("Forward slash is not allowed at the beginning of the\n"
+			    "pattern because Windows does not like it. Use `XXX/' instead.");
+		else if (!strncmp(argv[i], "XXX/", 4))
+			argv[i] += 3;
+	}
 	if (!strcmp(argv[1], "wildmatch"))
 		return !!wildmatch(argv[3], argv[2], 0);
 	else if (!strcmp(argv[1], "iwildmatch"))
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check
  2012-11-11 10:13       ` [PATCH 14/13] test-wildmatch: " Nguyễn Thái Ngọc Duy
@ 2012-11-11 10:13         ` Nguyễn Thái Ngọc Duy
  2012-11-13 18:07           ` Johannes Sixt
  2012-11-20  7:06           ` Johannes Sixt
  2012-11-11 10:47         ` [PATCH 14/13] test-wildmatch: fix tests that fail on Windows due to path mangling Junio C Hamano
  1 sibling, 2 replies; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-11-11 10:13 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Sixt,
	Nguyễn Thái Ngọc Duy

Character class "xdigit" is the only one that hits 6 character limit
defined by CHAR_CLASS_MAX_LENGTH. All other character classes are 5
character long and therefore never caught by this.

This should make xdigit tests in t3070 pass on Windows.

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 I tested with Linux (removing the ifdef __LIBC in fnmatch.c) but I
 think this should get an ACK from someone who actually uses it on
 Windows.

 We may want to tell upstream (who?) about this if they haven't fixed
 it already.

 compat/fnmatch/fnmatch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/fnmatch/fnmatch.c b/compat/fnmatch/fnmatch.c
index 9473aed..0ff1d27 100644
--- a/compat/fnmatch/fnmatch.c
+++ b/compat/fnmatch/fnmatch.c
@@ -345,7 +345,7 @@ internal_fnmatch (pattern, string, no_leading_period, flags)
 
 		    for (;;)
 		      {
-			if (c1 == CHAR_CLASS_MAX_LENGTH)
+			if (c1 > CHAR_CLASS_MAX_LENGTH)
 			  /* The name is too long and therefore the pattern
 			     is ill-formed.  */
 			  return FNM_NOMATCH;
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 14/13] test-wildmatch: fix tests that fail on Windows due to path mangling
  2012-11-11 10:13       ` [PATCH 14/13] test-wildmatch: " Nguyễn Thái Ngọc Duy
  2012-11-11 10:13         ` [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check Nguyễn Thái Ngọc Duy
@ 2012-11-11 10:47         ` Junio C Hamano
  2012-11-13 10:06           ` [PATCH 14/13] test-wildmatch: avoid Windows " Nguyễn Thái Ngọc Duy
  2012-11-20  7:02           ` Johannes Sixt
  1 sibling, 2 replies; 55+ messages in thread
From: Junio C Hamano @ 2012-11-11 10:47 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King, Johannes Sixt

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> Patterns beginning with a slash are converted to Windows paths before
> test-wildmatch gets to see them. Avoid this case by always writing
> XXX/abc instead of /abc. The leading XXX will be removed by
> test-wildmatch itself before processing.
>
> Any patterns beginning with a forward slash is rejected by
> test-wildmatch to avoid the same fault in future.

The title taken together with the above explanation makes it sound
as if wildmatch code does not work with the pattern /foo on Windows
at all and to avoid the issue (instead of fixing the breakage) this
patch removes such tests.

But I suspect that is not what is going on. Perhaps the title of the
change is more like "test-wildmatch: avoid Windows path mangling"
and explain with "On Windows, any command line argument that begins
with a slash is mangled as if it were a full pathname.  This causes
patterns beginning with a slash not to be passed to test-wildmatch
correctly" or something?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 14/13] test-wildmatch: avoid Windows path mangling
  2012-11-11 10:47         ` [PATCH 14/13] test-wildmatch: fix tests that fail on Windows due to path mangling Junio C Hamano
@ 2012-11-13 10:06           ` Nguyễn Thái Ngọc Duy
  2012-11-13 18:06             ` Johannes Sixt
  2012-11-20  7:02           ` Johannes Sixt
  1 sibling, 1 reply; 55+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-11-13 10:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Sixt,
	Nguyễn Thái Ngọc Duy

On Windows, arguments starting with a forward slash is mangled as if
it were full pathname. This causes the patterns beginning with a slash
not to be passed to test-wildmatch correctly. Avoid mangling by never
accepting patterns starting with a slash. Those arguments must be
rewritten with a leading "XXX" (e.g. "/abc" becomes "XXX/abc"), which
will be removed by test-wildmatch itself before feeding the patterns
to wildmatch() or fnmatch().

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 On Sun, Nov 11, 2012 at 5:47 PM, Junio C Hamano <gitster@pobox.com> wrote:
 > The title taken together with the above explanation makes it sound
 > as if wildmatch code does not work with the pattern /foo on Windows
 > at all and to avoid the issue (instead of fixing the breakage) this
 > patch removes such tests....

 OK how about this?

 t/t3070-wildmatch.sh | 10 +++++-----
 test-wildmatch.c     |  8 ++++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index e6ad6f4..3155eab 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -74,7 +74,7 @@ match 0 0 'foo/bar' 'foo[/]bar'
 match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 0 'foo' '**/foo'
-match 1 x '/foo' '**/foo'
+match 1 x 'XXX/foo' '**/foo'
 match 1 0 'bar/baz/foo' '**/foo'
 match 0 0 'bar/baz/foo' '*/foo'
 match 0 0 'foo/bar/baz' '**/bar*'
@@ -95,8 +95,8 @@ match 0 0 ']' '[!]-]'
 match 1 x 'a' '[!]-]'
 match 0 0 '' '\'
 match 0 x '\' '\'
-match 0 x '/\' '*/\'
-match 1 x '/\' '*/\\'
+match 0 x 'XXX/\' '*/\'
+match 1 x 'XXX/\' '*/\\'
 match 1 1 'foo' 'foo'
 match 1 1 '@foo' '@foo'
 match 0 0 'foo' '@foo'
@@ -187,8 +187,8 @@ match 0 0 '-' '[[-\]]'
 match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
-match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
-match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 1 1 'XXX/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 0 0 'XXX/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
 match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
 match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
 
diff --git a/test-wildmatch.c b/test-wildmatch.c
index 74c0864..e384c8e 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -3,6 +3,14 @@
 
 int main(int argc, char **argv)
 {
+	int i;
+	for (i = 2; i < argc; i++) {
+		if (argv[i][0] == '/')
+			die("Forward slash is not allowed at the beginning of the\n"
+			    "pattern because Windows does not like it. Use `XXX/' instead.");
+		else if (!strncmp(argv[i], "XXX/", 4))
+			argv[i] += 3;
+	}
 	if (!strcmp(argv[1], "wildmatch"))
 		return !!wildmatch(argv[3], argv[2], 0);
 	else if (!strcmp(argv[1], "iwildmatch"))
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 14/13] test-wildmatch: avoid Windows path mangling
  2012-11-13 10:06           ` [PATCH 14/13] test-wildmatch: avoid Windows " Nguyễn Thái Ngọc Duy
@ 2012-11-13 18:06             ` Johannes Sixt
  0 siblings, 0 replies; 55+ messages in thread
From: Johannes Sixt @ 2012-11-13 18:06 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Jeff King

Am 13.11.2012 11:06, schrieb Nguyễn Thái Ngọc Duy:
> On Windows, arguments starting with a forward slash is mangled as if
> it were full pathname. This causes the patterns beginning with a slash
> not to be passed to test-wildmatch correctly. Avoid mangling by never
> accepting patterns starting with a slash. Those arguments must be
> rewritten with a leading "XXX" (e.g. "/abc" becomes "XXX/abc"), which
> will be removed by test-wildmatch itself before feeding the patterns
> to wildmatch() or fnmatch().
> 
> Reported-by: Johannes Sixt <j6t@kdbg.org>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  On Sun, Nov 11, 2012 at 5:47 PM, Junio C Hamano <gitster@pobox.com> wrote:
>  > The title taken together with the above explanation makes it sound
>  > as if wildmatch code does not work with the pattern /foo on Windows
>  > at all and to avoid the issue (instead of fixing the breakage) this
>  > patch removes such tests....
> 
>  OK how about this?

Thanks, the patch lets the tests pass on Windows.

-- Hannes

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check
  2012-11-11 10:13         ` [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check Nguyễn Thái Ngọc Duy
@ 2012-11-13 18:07           ` Johannes Sixt
  2012-11-20  7:06           ` Johannes Sixt
  1 sibling, 0 replies; 55+ messages in thread
From: Johannes Sixt @ 2012-11-13 18:07 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Jeff King

Am 11.11.2012 11:13, schrieb Nguyễn Thái Ngọc Duy:
> -			if (c1 == CHAR_CLASS_MAX_LENGTH)
> +			if (c1 > CHAR_CLASS_MAX_LENGTH)

Nice catch! With this one and 14/13, all tests in t3070 pass on Windows.

-- Hannes

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 14/13] test-wildmatch: avoid Windows path mangling
  2012-11-11 10:47         ` [PATCH 14/13] test-wildmatch: fix tests that fail on Windows due to path mangling Junio C Hamano
  2012-11-13 10:06           ` [PATCH 14/13] test-wildmatch: avoid Windows " Nguyễn Thái Ngọc Duy
@ 2012-11-20  7:02           ` Johannes Sixt
  2012-11-20 20:11             ` Junio C Hamano
  1 sibling, 1 reply; 55+ messages in thread
From: Johannes Sixt @ 2012-11-20  7:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Nguyễn Thái Ngọc Duy

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

The MSYS bash mangles arguments that begin with a forward slash
when they are passed to test-wildmatch. This causes tests to fail.
Avoid mangling by prepending "XXX", which is removed by
test-wildmatch before further processing.

[J6t: reworded commit message]

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 Works well, and I'm fine with this work-around.

 -- Hannes

 t/t3070-wildmatch.sh | 10 +++++-----
 test-wildmatch.c     |  8 ++++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index e6ad6f4..3155eab 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -74,7 +74,7 @@ match 0 0 'foo/bar' 'foo[/]bar'
 match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 0 'foo' '**/foo'
-match 1 x '/foo' '**/foo'
+match 1 x 'XXX/foo' '**/foo'
 match 1 0 'bar/baz/foo' '**/foo'
 match 0 0 'bar/baz/foo' '*/foo'
 match 0 0 'foo/bar/baz' '**/bar*'
@@ -95,8 +95,8 @@ match 0 0 ']' '[!]-]'
 match 1 x 'a' '[!]-]'
 match 0 0 '' '\'
 match 0 x '\' '\'
-match 0 x '/\' '*/\'
-match 1 x '/\' '*/\\'
+match 0 x 'XXX/\' '*/\'
+match 1 x 'XXX/\' '*/\\'
 match 1 1 'foo' 'foo'
 match 1 1 '@foo' '@foo'
 match 0 0 'foo' '@foo'
@@ -187,8 +187,8 @@ match 0 0 '-' '[[-\]]'
 match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
-match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
-match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 1 1 'XXX/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 0 0 'XXX/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
 match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
 match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
 diff --git a/test-wildmatch.c b/test-wildmatch.c
index 74c0864..e384c8e 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -3,6 +3,14 @@
  int main(int argc, char **argv)
 {
+	int i;
+	for (i = 2; i < argc; i++) {
+		if (argv[i][0] == '/')
+			die("Forward slash is not allowed at the beginning of the\n"
+			    "pattern because Windows does not like it. Use `XXX/' instead.");
+		else if (!strncmp(argv[i], "XXX/", 4))
+			argv[i] += 3;
+	}
 	if (!strcmp(argv[1], "wildmatch"))
 		return !!wildmatch(argv[3], argv[2], 0);
 	else if (!strcmp(argv[1], "iwildmatch"))

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check
  2012-11-11 10:13         ` [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check Nguyễn Thái Ngọc Duy
  2012-11-13 18:07           ` Johannes Sixt
@ 2012-11-20  7:06           ` Johannes Sixt
  1 sibling, 0 replies; 55+ messages in thread
From: Johannes Sixt @ 2012-11-20  7:06 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Jeff King

Am 11/11/2012 11:13, schrieb Nguyễn Thái Ngọc Duy:
> Character class "xdigit" is the only one that hits 6 character limit
> defined by CHAR_CLASS_MAX_LENGTH. All other character classes are 5
> character long and therefore never caught by this.
> 
> This should make xdigit tests in t3070 pass on Windows.
> 
> Reported-by: Johannes Sixt <j6t@kdbg.org>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  I tested with Linux (removing the ifdef __LIBC in fnmatch.c) but I
>  think this should get an ACK from someone who actually uses it on
>  Windows.

Works well here on Windows.

This does not affect Windows alone, but all platforms that fall back to
compat/fnmatch. It's perhaps worth its own topic branch.

>  We may want to tell upstream (who?) about this if they haven't fixed
>  it already.
> 
>  compat/fnmatch/fnmatch.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/compat/fnmatch/fnmatch.c b/compat/fnmatch/fnmatch.c
> index 9473aed..0ff1d27 100644
> --- a/compat/fnmatch/fnmatch.c
> +++ b/compat/fnmatch/fnmatch.c
> @@ -345,7 +345,7 @@ internal_fnmatch (pattern, string, no_leading_period, flags)
>  
>  		    for (;;)
>  		      {
> -			if (c1 == CHAR_CLASS_MAX_LENGTH)
> +			if (c1 > CHAR_CLASS_MAX_LENGTH)
>  			  /* The name is too long and therefore the pattern
>  			     is ill-formed.  */
>  			  return FNM_NOMATCH;
> 

-- Hannes

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 14/13] test-wildmatch: avoid Windows path mangling
  2012-11-20  7:02           ` Johannes Sixt
@ 2012-11-20 20:11             ` Junio C Hamano
  2012-11-21  6:41               ` Johannes Sixt
  0 siblings, 1 reply; 55+ messages in thread
From: Junio C Hamano @ 2012-11-20 20:11 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, Jeff King, Nguyễn Thái Ngọc Duy

Johannes Sixt <j.sixt@viscovery.net> writes:

> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>
> The MSYS bash mangles arguments that begin with a forward slash
> when they are passed to test-wildmatch. This causes tests to fail.
> Avoid mangling by prepending "XXX", which is removed by
> test-wildmatch before further processing.
>
> [J6t: reworded commit message]
>
> Reported-by: Johannes Sixt <j6t@kdbg.org>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> Signed-off-by: Johannes Sixt <j6t@kdbg.org>
> ---
>  Works well, and I'm fine with this work-around.
>
>  -- Hannes

Thanks, but you need to fix your format-patch somehow.

> @@ -187,8 +187,8 @@ match 0 0 '-' '[[-\]]'
>  match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
>  match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
>  match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
> -match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
> -match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
> +match 1 1 'XXX/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
> +match 0 0 'XXX/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
>  match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
>  match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
>  diff --git a/test-wildmatch.c b/test-wildmatch.c

This hunk claims that there are 8 lines in preimage and postimage,
but it is not the case.  It has only 7 lines each.

You also have the first line of the next patch "diff --git" somehow
indented.  How did this happen?


> index 74c0864..e384c8e 100644
> --- a/test-wildmatch.c
> +++ b/test-wildmatch.c
> @@ -3,6 +3,14 @@
>   int main(int argc, char **argv)
>  {
> +	int i;
> +	for (i = 2; i < argc; i++) {
> +		if (argv[i][0] == '/')
> +			die("Forward slash is not allowed at the beginning of the\n"
> +			    "pattern because Windows does not like it. Use `XXX/' instead.");
> +		else if (!strncmp(argv[i], "XXX/", 4))
> +			argv[i] += 3;
> +	}
>  	if (!strcmp(argv[1], "wildmatch"))
>  		return !!wildmatch(argv[3], argv[2], 0);
>  	else if (!strcmp(argv[1], "iwildmatch"))

And again this claims that the preimage has 6 lines while the
postimage has 14.  Somebody is overcounting, or perhaps you removed
the first pre-context by hand without adjusting the line number?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 14/13] test-wildmatch: avoid Windows path mangling
  2012-11-20 20:11             ` Junio C Hamano
@ 2012-11-21  6:41               ` Johannes Sixt
  0 siblings, 0 replies; 55+ messages in thread
From: Johannes Sixt @ 2012-11-21  6:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Nguyễn Thái Ngọc Duy

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

The MSYS bash mangles arguments that begin with a forward slash
when they are passed to test-wildmatch. This causes tests to fail.
Avoid mangling by prepending "XXX", which is removed by
test-wildmatch before further processing.

[J6t: reworded commit message]

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
Am 11/20/2012 21:11, schrieb Junio C Hamano:
> Thanks, but you need to fix your format-patch somehow.

No, it was Thunderbird. It ate the blank lines when I said "Edit as new"
to forward the patch directly from the mailbox.

Sorry for the inconvenience.

 t/t3070-wildmatch.sh | 10 +++++-----
 test-wildmatch.c     |  8 ++++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index e6ad6f4..3155eab 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -74,7 +74,7 @@ match 0 0 'foo/bar' 'foo[/]bar'
 match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
 match 1 0 'foo' '**/foo'
-match 1 x '/foo' '**/foo'
+match 1 x 'XXX/foo' '**/foo'
 match 1 0 'bar/baz/foo' '**/foo'
 match 0 0 'bar/baz/foo' '*/foo'
 match 0 0 'foo/bar/baz' '**/bar*'
@@ -95,8 +95,8 @@ match 0 0 ']' '[!]-]'
 match 1 x 'a' '[!]-]'
 match 0 0 '' '\'
 match 0 x '\' '\'
-match 0 x '/\' '*/\'
-match 1 x '/\' '*/\\'
+match 0 x 'XXX/\' '*/\'
+match 1 x 'XXX/\' '*/\\'
 match 1 1 'foo' 'foo'
 match 1 1 '@foo' '@foo'
 match 0 0 'foo' '@foo'
@@ -187,8 +187,8 @@ match 0 0 '-' '[[-\]]'
 match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
 match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
-match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
-match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 1 1 'XXX/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 0 0 'XXX/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
 match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
 match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
 
diff --git a/test-wildmatch.c b/test-wildmatch.c
index 74c0864..e384c8e 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -3,6 +3,14 @@
 
 int main(int argc, char **argv)
 {
+	int i;
+	for (i = 2; i < argc; i++) {
+		if (argv[i][0] == '/')
+			die("Forward slash is not allowed at the beginning of the\n"
+			    "pattern because Windows does not like it. Use `XXX/' instead.");
+		else if (!strncmp(argv[i], "XXX/", 4))
+			argv[i] += 3;
+	}
 	if (!strcmp(argv[1], "wildmatch"))
 		return !!wildmatch(argv[3], argv[2], 0);
 	else if (!strcmp(argv[1], "iwildmatch"))
-- 
1.8.0.1417.gf6038d8

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 12/12] Add git-check-ignore sub-command
  2012-10-16 16:12         ` Junio C Hamano
@ 2012-12-17  0:10           ` Adam Spiers
  0 siblings, 0 replies; 55+ messages in thread
From: Adam Spiers @ 2012-12-17  0:10 UTC (permalink / raw)
  To: git

On Tue, Oct 16, 2012 at 09:12:58AM -0700, Junio C Hamano wrote:
> Adam Spiers <git@adamspiers.org> writes:
> > On Mon, Oct 15, 2012 at 3:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
> >>> +For each pathname given via the command-line or from a file via
> >>> +`--stdin`, this command will list the first exclude pattern found (if
> >>> +any) which explicitly excludes or includes that pathname.  Note that
> >>> +within any given exclude file, later patterns take precedence over
> >>> +earlier ones, so any matching pattern which this command outputs may
> >>> +not be the one you would immediately expect.
> >>
> >> "The first exclude pattern" is very misleading, isn't it?
> >
> > I don't think so, because of the second sentence.
> >
> >> For example, with these in $GIT_DIR/info/exclude, I would get:
> >>
> >>         $ cat -n .git/info/exclude
> >>           1 *~
> >>           2 Makefile~
> >>         $ git check-ignore -v Makefile~
> >>         .git/info/exclude:2:Makefile~   Makefile~
> >>
> >> which is the correct result (the last one in a single source decides
> >> the fate of the path), but it hardly is "first one found" and the
> >> matching pattern in the output would not be something unexpected for
> >> the users, either.
> >>
> >> The reason it is "the first one found" is because the implementation
> >> arranges the loop in such a way that it can stop early when it finds
> >> a match---it simply checks matches from the end of the source.
> >>
> >> But that is not visible to end-users,
> >
> > Correct; that's precisely why I wrote the second sentence which
> > explicitly explains this.
> >
> >> and they will find the above description just wrong, no?
> >
> > It's not wrong AFAICS, but suggestions for rewording this more clearly
> > are of course welcome.  Maybe s/immediately/intuitively/ ?
> 
> I think this is sufficient:
> 
> 	For each pathname given via the command-line or from a file
> 	via `--stdin`, show the pattern from .gitignore (or other
> 	input files to the exclude mechanism) that decides if the
> 	pathname is excluded.
> 
> and without "Note that" at all.

I don't think this is quite sufficient.  Firstly, it does not cover
negated patterns (my original text contained "or includes").

Secondly, I think there is still potential for this rewording to
result in confused users.  If the "backwards-ness" of the internal
algorithm is kept hidden from them, then in your example above, most
users would be more likely to intuitively expect check-ignore to
return the first line of .git/info/exclude ("*~").  When they saw it
returning the second, they might draw the conclusion that the first
line failed to match (e.g. by mistakenly thinking that the file format
requires regular expressions rather than globs), rather than that git
starts at the end of the file.

This is precisely why I chose not to hide this aspect of the
implementation when initially writing this documentation.
Unfortunately my wording managed to confuse several of you, so clearly
it was not adequate.  Therefore I propose an extension of your
version:

	For each pathname given via the command-line or from a file
	via `--stdin`, show the pattern from .gitignore (or other
	input files to the exclude mechanism) that decides if the
	pathname is excluded or included.  Later patterns within a
	file take precedence over earlier ones.

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2012-12-17  0:10 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-15  6:23 nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Nguyễn Thái Ngọc Duy
2012-10-15  6:24 ` [PATCH 1/6] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
2012-10-15  6:24   ` [PATCH 2/6] exclude: split basename matching code into a separate function Nguyễn Thái Ngọc Duy
2012-10-15  6:24   ` [PATCH 3/6] exclude: fix a bug in prefix compare optimization Nguyễn Thái Ngọc Duy
2012-10-15  6:24   ` [PATCH 4/6] exclude: split pathname matching code into a separate function Nguyễn Thái Ngọc Duy
2012-10-15  6:24   ` [PATCH 5/6] gitignore: make pattern parsing code " Nguyễn Thái Ngọc Duy
2012-10-15  6:24   ` [PATCH 6/6] attr: more matching optimizations from .gitignore Nguyễn Thái Ngọc Duy
2012-10-15  6:25 ` [PATCH 01/13] ctype: make sane_ctype[] const array Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 02/13] ctype: support iscntrl, ispunct, isxdigit and isprint Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 03/13] Import wildmatch from rsync Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 04/13] wildmatch: remove unnecessary functions Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 05/13] wildmatch: follow Git's coding convention Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 06/13] Integrate wildmatch to git Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 07/13] t3070: disable unreliable fnmatch tests Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 08/13] wildmatch: make wildmatch's return value compatible with fnmatch Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 09/13] wildmatch: remove static variable force_lower_case Nguyễn Thái Ngọc Duy
2012-10-15  6:25   ` [PATCH 10/13] wildmatch: fix case-insensitive matching Nguyễn Thái Ngọc Duy
2012-10-15  6:26   ` [PATCH 11/13] wildmatch: adjust "**" behavior Nguyễn Thái Ngọc Duy
2012-10-15  6:26   ` [PATCH 12/13] wildmatch: make /**/ match zero or more directories Nguyễn Thái Ngọc Duy
2012-10-15  6:26   ` [PATCH 13/13] Support "**" wildcard in .gitignore and .gitattributes Nguyễn Thái Ngọc Duy
2012-11-04 21:00     ` [PATCH 14/13] wildmatch: fix tests that fail on Windows due to path mangling Johannes Sixt
2012-11-06 12:47       ` Nguyen Thai Ngoc Duy
2012-11-07 19:32         ` Johannes Sixt
2012-11-11 10:13       ` [PATCH 14/13] test-wildmatch: " Nguyễn Thái Ngọc Duy
2012-11-11 10:13         ` [PATCH 15/13] compat/fnmatch: fix off-by-one character class's length check Nguyễn Thái Ngọc Duy
2012-11-13 18:07           ` Johannes Sixt
2012-11-20  7:06           ` Johannes Sixt
2012-11-11 10:47         ` [PATCH 14/13] test-wildmatch: fix tests that fail on Windows due to path mangling Junio C Hamano
2012-11-13 10:06           ` [PATCH 14/13] test-wildmatch: avoid Windows " Nguyễn Thái Ngọc Duy
2012-11-13 18:06             ` Johannes Sixt
2012-11-20  7:02           ` Johannes Sixt
2012-11-20 20:11             ` Junio C Hamano
2012-11-21  6:41               ` Johannes Sixt
2012-10-15  6:27 ` [PATCH 01/12] dir.c: rename cryptic 'which' variable to more consistent name Nguyễn Thái Ngọc Duy
2012-10-15  6:27   ` [PATCH 02/12] dir.c: rename path_excluded() to is_path_excluded() Nguyễn Thái Ngọc Duy
2012-10-15  6:27   ` [PATCH 03/12] dir.c: rename excluded_from_list() to is_excluded_from_list() Nguyễn Thái Ngọc Duy
2012-10-15  6:27   ` [PATCH 04/12] dir.c: rename excluded() to is_excluded() Nguyễn Thái Ngọc Duy
2012-10-15  6:27   ` [PATCH 05/12] dir.c: refactor is_excluded_from_list() Nguyễn Thái Ngọc Duy
2012-10-15  6:28   ` [PATCH 06/12] dir.c: refactor is_excluded() Nguyễn Thái Ngọc Duy
2012-10-15  6:28   ` [PATCH 07/12] dir.c: refactor is_path_excluded() Nguyễn Thái Ngọc Duy
2012-10-15  6:28   ` [PATCH 08/12] dir.c: keep track of where patterns came from Nguyễn Thái Ngọc Duy
2012-10-15  6:28   ` [PATCH 09/12] dir.c: refactor treat_gitlinks() Nguyễn Thái Ngọc Duy
2012-10-15  6:28   ` [PATCH 10/12] pathspec.c: move reusable code from builtin/add.c Nguyễn Thái Ngọc Duy
2012-10-15  6:28   ` [PATCH 11/12] dir.c: provide free_directory() for reclaiming dir_struct memory Nguyễn Thái Ngọc Duy
2012-10-15  6:28   ` [PATCH 12/12] Add git-check-ignore sub-command Nguyễn Thái Ngọc Duy
2012-10-15 22:31     ` Junio C Hamano
2012-10-16 11:08       ` Nguyen Thai Ngoc Duy
2012-10-16 14:09         ` Adam Spiers
2012-10-16 15:07           ` Nguyen Thai Ngoc Duy
2012-10-16 14:13       ` Adam Spiers
2012-10-16 16:12         ` Junio C Hamano
2012-12-17  0:10           ` Adam Spiers
2012-11-04 21:07     ` [PATCH as/check-ignore] t0007: fix tests on Windows Johannes Sixt
2012-11-08 18:04       ` Jeff King
2012-10-15 22:13 ` nd/attr-match-more-optim, nd/wildmatch and as/check-ignore Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).