* [PATCH 1/8] Import wildmatch from rsync
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 2/8] wildmatch: remove unnecessary functions Nguyễn Thái Ngọc Duy
` (6 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC
To: git; +Cc: Nguyễn Thái Ngọc Duy
These files are from rsync.git commit
f92f5b166e3019db42bc7fe1aa2f1a9178cd215d, which was the last commit
before rsync turned GPL-3. All files are imported as-is, so this patch
is a no-op: nothing builds or uses them yet. Adaptation is done in a
separate patch.
    rsync.git           ->  git.git
    lib/wildmatch.[ch]      wildmatch.[ch]
    wildtest.txt            t/t3070/wildtest.txt
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
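Note for readers (not part of the commit message): the slash handling
that separates wildmatch from a plain glob is easiest to see in
isolation. Below is a minimal sketch, assuming a hypothetical helper
named sketch_match() that handles only literals, '?', '*' and '**'.
It is not rsync's dowild(): it has no bracket expressions, no backslash
escapes, and none of the ABORT_ALL / ABORT_TO_STARSTAR short-circuiting,
so it may backtrack far more than the imported code.

```c
/* Hypothetical illustration only, not the imported dowild().
 * Returns 1 if pattern "p" matches all of text "t", 0 otherwise. */
int sketch_match(const char *p, const char *t)
{
	for (; *p; p++, t++) {
		if (*p == '*') {
			int starstar = 0;

			/* Skip the whole run of stars; two or more means "**". */
			while (*++p == '*')
				starstar = 1;

			/* Try the rest of the pattern at every position the
			 * star is allowed to reach. */
			for (;;) {
				if (sketch_match(p, t))
					return 1;
				/* A single '*' must not consume a '/'. */
				if (*t == '\0' || (!starstar && *t == '/'))
					return 0;
				t++;
			}
		}
		if (*t == '\0')
			return 0;
		if (*p == '?') {
			/* '?' matches any one character except '/'. */
			if (*t == '/')
				return 0;
		} else if (*p != *t) {
			return 0;
		}
	}
	/* Pattern exhausted: match only if the text is exhausted too. */
	return *t == '\0';
}
```

With this sketch, sketch_match("foo*bar", "foo/baz/bar") fails while
sketch_match("foo**bar", "foo/baz/bar") succeeds, mirroring the first
two entries under "Extended slash-matching features" in wildtest.txt.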
t/t3070/wildtest.txt | 165 +++++++++++++++++++++++
wildmatch.c | 368 +++++++++++++++++++++++++++++++++++++++++++++++++++
wildmatch.h | 6 +
3 files changed, 539 insertions(+)
create mode 100644 t/t3070/wildtest.txt
create mode 100644 wildmatch.c
create mode 100644 wildmatch.h
diff --git a/t/t3070/wildtest.txt b/t/t3070/wildtest.txt
new file mode 100644
index 0000000..42c1678
--- /dev/null
+++ b/t/t3070/wildtest.txt
@@ -0,0 +1,165 @@
+# Input is in the following format (all items white-space separated):
+#
+# The first two items are 1 or 0 indicating if the wildmat call is expected to
+# succeed and if fnmatch works the same way as wildmat, respectively. After
+# that is a text string for the match, and a pattern string. Strings can be
+# quoted (if desired) in either double or single quotes, as well as backticks.
+#
+# MATCH FNMATCH_SAME "text to match" 'pattern to use'
+
+# Basic wildmat features
+1 1 foo foo
+0 1 foo bar
+1 1 '' ""
+1 1 foo ???
+0 1 foo ??
+1 1 foo *
+1 1 foo f*
+0 1 foo *f
+1 1 foo *foo*
+1 1 foobar *ob*a*r*
+1 1 aaaaaaabababab *ab
+1 1 foo* foo\*
+0 1 foobar foo\*bar
+1 1 f\oo f\\oo
+1 1 ball *[al]?
+0 1 ten [ten]
+1 1 ten **[!te]
+0 1 ten **[!ten]
+1 1 ten t[a-g]n
+0 1 ten t[!a-g]n
+1 1 ton t[!a-g]n
+1 1 ton t[^a-g]n
+1 1 a]b a[]]b
+1 1 a-b a[]-]b
+1 1 a]b a[]-]b
+0 1 aab a[]-]b
+1 1 aab a[]a-]b
+1 1 ] ]
+
+# Extended slash-matching features
+0 1 foo/baz/bar foo*bar
+1 1 foo/baz/bar foo**bar
+0 1 foo/bar foo?bar
+0 1 foo/bar foo[/]bar
+0 1 foo/bar f[^eiu][^eiu][^eiu][^eiu][^eiu]r
+1 1 foo-bar f[^eiu][^eiu][^eiu][^eiu][^eiu]r
+0 1 foo **/foo
+1 1 /foo **/foo
+1 1 bar/baz/foo **/foo
+0 1 bar/baz/foo */foo
+0 0 foo/bar/baz **/bar*
+1 1 deep/foo/bar/baz **/bar/*
+0 1 deep/foo/bar/baz/ **/bar/*
+1 1 deep/foo/bar/baz/ **/bar/**
+0 1 deep/foo/bar **/bar/*
+1 1 deep/foo/bar/ **/bar/**
+1 1 foo/bar/baz **/bar**
+1 1 foo/bar/baz/x */bar/**
+0 0 deep/foo/bar/baz/x */bar/**
+1 1 deep/foo/bar/baz/x **/bar/*/*
+
+# Various additional tests
+0 1 acrt a[c-c]st
+1 1 acrt a[c-c]rt
+0 1 ] [!]-]
+1 1 a [!]-]
+0 1 '' \
+0 1 \ \
+0 1 /\ */\
+1 1 /\ */\\
+1 1 foo foo
+1 1 @foo @foo
+0 1 foo @foo
+1 1 [ab] \[ab]
+1 1 [ab] [[]ab]
+1 1 [ab] [[:]ab]
+0 1 [ab] [[::]ab]
+1 1 [ab] [[:digit]ab]
+1 1 [ab] [\[:]ab]
+1 1 ?a?b \??\?b
+1 1 abc \a\b\c
+0 1 foo ''
+1 1 foo/bar/baz/to **/t[o]
+
+# Character class tests
+1 1 a1B [[:alpha:]][[:digit:]][[:upper:]]
+0 1 a [[:digit:][:upper:][:space:]]
+1 1 A [[:digit:][:upper:][:space:]]
+1 1 1 [[:digit:][:upper:][:space:]]
+0 1 1 [[:digit:][:upper:][:spaci:]]
+1 1 ' ' [[:digit:][:upper:][:space:]]
+0 1 . [[:digit:][:upper:][:space:]]
+1 1 . [[:digit:][:punct:][:space:]]
+1 1 5 [[:xdigit:]]
+1 1 f [[:xdigit:]]
+1 1 D [[:xdigit:]]
+1 1 _ [[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
+#1 1 [^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
+1 1 \x7f [^[:alnum:][:alpha:][:blank:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
+1 1 . [^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]
+1 1 5 [a-c[:digit:]x-z]
+1 1 b [a-c[:digit:]x-z]
+1 1 y [a-c[:digit:]x-z]
+0 1 q [a-c[:digit:]x-z]
+
+# Additional tests, including some malformed wildmats
+1 1 ] [\\-^]
+0 1 [ [\\-^]
+1 1 - [\-_]
+1 1 ] [\]]
+0 1 \] [\]]
+0 1 \ [\]]
+0 1 ab a[]b
+0 1 a[]b a[]b
+0 1 ab[ ab[
+0 1 ab [!
+0 1 ab [-
+1 1 - [-]
+0 1 - [a-
+0 1 - [!a-
+1 1 - [--A]
+1 1 5 [--A]
+1 1 ' ' '[ --]'
+1 1 $ '[ --]'
+1 1 - '[ --]'
+0 1 0 '[ --]'
+1 1 - [---]
+1 1 - [------]
+0 1 j [a-e-n]
+1 1 - [a-e-n]
+1 1 a [!------]
+0 1 [ []-a]
+1 1 ^ []-a]
+0 1 ^ [!]-a]
+1 1 [ [!]-a]
+1 1 ^ [a^bc]
+1 1 -b] [a-]b]
+0 1 \ [\]
+1 1 \ [\\]
+0 1 \ [!\\]
+1 1 G [A-\\]
+0 1 aaabbb b*a
+0 1 aabcaa *ba*
+1 1 , [,]
+1 1 , [\\,]
+1 1 \ [\\,]
+1 1 - [,-.]
+0 1 + [,-.]
+0 1 -.] [,-.]
+1 1 2 [\1-\3]
+1 1 3 [\1-\3]
+0 1 4 [\1-\3]
+1 1 \ [[-\]]
+1 1 [ [[-\]]
+1 1 ] [[-\]]
+0 1 - [[-\]]
+
+# Test recursion and the abort code (use "wildtest -i" to see iteration counts)
+1 1 -adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1 -*-*-*-*-*-*-12-*-*-*-m-*-*-*
+0 1 -adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1 -*-*-*-*-*-*-12-*-*-*-m-*-*-*
+0 1 -adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1 -*-*-*-*-*-*-12-*-*-*-m-*-*-*
+1 1 /adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1 /*/*/*/*/*/*/12/*/*/*/m/*/*/*
+0 1 /adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1 /*/*/*/*/*/*/12/*/*/*/m/*/*/*
+1 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt **/*a*b*g*n*t
+0 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz **/*a*b*g*n*t
diff --git a/wildmatch.c b/wildmatch.c
new file mode 100644
index 0000000..f3a1731
--- /dev/null
+++ b/wildmatch.c
@@ -0,0 +1,368 @@
+/*
+** Do shell-style pattern matching for ?, \, [], and * characters.
+** It is 8bit clean.
+**
+** Written by Rich $alz, mirror!rs, Wed Nov 26 19:03:17 EST 1986.
+** Rich $alz is now <rsalz@bbn.com>.
+**
+** Modified by Wayne Davison to special-case '/' matching, to make '**'
+** work differently than '*', and to fix the character-class code.
+*/
+
+#include "rsync.h"
+
+/* What character marks an inverted character class? */
+#define NEGATE_CLASS '!'
+#define NEGATE_CLASS2 '^'
+
+#define FALSE 0
+#define TRUE 1
+#define ABORT_ALL -1
+#define ABORT_TO_STARSTAR -2
+
+#define CC_EQ(class, len, litmatch) ((len) == sizeof (litmatch)-1 \
+ && *(class) == *(litmatch) \
+ && strncmp((char*)class, litmatch, len) == 0)
+
+#if defined STDC_HEADERS || !defined isascii
+# define ISASCII(c) 1
+#else
+# define ISASCII(c) isascii(c)
+#endif
+
+#ifdef isblank
+# define ISBLANK(c) (ISASCII(c) && isblank(c))
+#else
+# define ISBLANK(c) ((c) == ' ' || (c) == '\t')
+#endif
+
+#ifdef isgraph
+# define ISGRAPH(c) (ISASCII(c) && isgraph(c))
+#else
+# define ISGRAPH(c) (ISASCII(c) && isprint(c) && !isspace(c))
+#endif
+
+#define ISPRINT(c) (ISASCII(c) && isprint(c))
+#define ISDIGIT(c) (ISASCII(c) && isdigit(c))
+#define ISALNUM(c) (ISASCII(c) && isalnum(c))
+#define ISALPHA(c) (ISASCII(c) && isalpha(c))
+#define ISCNTRL(c) (ISASCII(c) && iscntrl(c))
+#define ISLOWER(c) (ISASCII(c) && islower(c))
+#define ISPUNCT(c) (ISASCII(c) && ispunct(c))
+#define ISSPACE(c) (ISASCII(c) && isspace(c))
+#define ISUPPER(c) (ISASCII(c) && isupper(c))
+#define ISXDIGIT(c) (ISASCII(c) && isxdigit(c))
+
+#ifdef WILD_TEST_ITERATIONS
+int wildmatch_iteration_count;
+#endif
+
+static int force_lower_case = 0;
+
+/* Match pattern "p" against the a virtually-joined string consisting
+ * of "text" and any strings in array "a". */
+static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
+{
+ uchar p_ch;
+
+#ifdef WILD_TEST_ITERATIONS
+ wildmatch_iteration_count++;
+#endif
+
+ for ( ; (p_ch = *p) != '\0'; text++, p++) {
+ int matched, special;
+ uchar t_ch, prev_ch;
+ while ((t_ch = *text) == '\0') {
+ if (*a == NULL) {
+ if (p_ch != '*')
+ return ABORT_ALL;
+ break;
+ }
+ text = *a++;
+ }
+ if (force_lower_case && ISUPPER(t_ch))
+ t_ch = tolower(t_ch);
+ switch (p_ch) {
+ case '\\':
+ /* Literal match with following character. Note that the test
+ * in "default" handles the p[1] == '\0' failure case. */
+ p_ch = *++p;
+ /* FALLTHROUGH */
+ default:
+ if (t_ch != p_ch)
+ return FALSE;
+ continue;
+ case '?':
+ /* Match anything but '/'. */
+ if (t_ch == '/')
+ return FALSE;
+ continue;
+ case '*':
+ if (*++p == '*') {
+ while (*++p == '*') {}
+ special = TRUE;
+ } else
+ special = FALSE;
+ if (*p == '\0') {
+ /* Trailing "**" matches everything. Trailing "*" matches
+ * only if there are no more slash characters. */
+ if (!special) {
+ do {
+ if (strchr((char*)text, '/') != NULL)
+ return FALSE;
+ } while ((text = *a++) != NULL);
+ }
+ return TRUE;
+ }
+ while (1) {
+ if (t_ch == '\0') {
+ if ((text = *a++) == NULL)
+ break;
+ t_ch = *text;
+ continue;
+ }
+ if ((matched = dowild(p, text, a)) != FALSE) {
+ if (!special || matched != ABORT_TO_STARSTAR)
+ return matched;
+ } else if (!special && t_ch == '/')
+ return ABORT_TO_STARSTAR;
+ t_ch = *++text;
+ }
+ return ABORT_ALL;
+ case '[':
+ p_ch = *++p;
+#ifdef NEGATE_CLASS2
+ if (p_ch == NEGATE_CLASS2)
+ p_ch = NEGATE_CLASS;
+#endif
+ /* Assign literal TRUE/FALSE because of "matched" comparison. */
+ special = p_ch == NEGATE_CLASS? TRUE : FALSE;
+ if (special) {
+ /* Inverted character class. */
+ p_ch = *++p;
+ }
+ prev_ch = 0;
+ matched = FALSE;
+ do {
+ if (!p_ch)
+ return ABORT_ALL;
+ if (p_ch == '\\') {
+ p_ch = *++p;
+ if (!p_ch)
+ return ABORT_ALL;
+ if (t_ch == p_ch)
+ matched = TRUE;
+ } else if (p_ch == '-' && prev_ch && p[1] && p[1] != ']') {
+ p_ch = *++p;
+ if (p_ch == '\\') {
+ p_ch = *++p;
+ if (!p_ch)
+ return ABORT_ALL;
+ }
+ if (t_ch <= p_ch && t_ch >= prev_ch)
+ matched = TRUE;
+ p_ch = 0; /* This makes "prev_ch" get set to 0. */
+ } else if (p_ch == '[' && p[1] == ':') {
+ const uchar *s;
+ int i;
+ for (s = p += 2; (p_ch = *p) && p_ch != ']'; p++) {} /*SHARED ITERATOR*/
+ if (!p_ch)
+ return ABORT_ALL;
+ i = p - s - 1;
+ if (i < 0 || p[-1] != ':') {
+ /* Didn't find ":]", so treat like a normal set. */
+ p = s - 2;
+ p_ch = '[';
+ if (t_ch == p_ch)
+ matched = TRUE;
+ continue;
+ }
+ if (CC_EQ(s,i, "alnum")) {
+ if (ISALNUM(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "alpha")) {
+ if (ISALPHA(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "blank")) {
+ if (ISBLANK(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "cntrl")) {
+ if (ISCNTRL(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "digit")) {
+ if (ISDIGIT(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "graph")) {
+ if (ISGRAPH(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "lower")) {
+ if (ISLOWER(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "print")) {
+ if (ISPRINT(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "punct")) {
+ if (ISPUNCT(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "space")) {
+ if (ISSPACE(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "upper")) {
+ if (ISUPPER(t_ch))
+ matched = TRUE;
+ } else if (CC_EQ(s,i, "xdigit")) {
+ if (ISXDIGIT(t_ch))
+ matched = TRUE;
+ } else /* malformed [:class:] string */
+ return ABORT_ALL;
+ p_ch = 0; /* This makes "prev_ch" get set to 0. */
+ } else if (t_ch == p_ch)
+ matched = TRUE;
+ } while (prev_ch = p_ch, (p_ch = *++p) != ']');
+ if (matched == special || t_ch == '/')
+ return FALSE;
+ continue;
+ }
+ }
+
+ do {
+ if (*text)
+ return FALSE;
+ } while ((text = *a++) != NULL);
+
+ return TRUE;
+}
+
+/* Match literal string "s" against the a virtually-joined string consisting
+ * of "text" and any strings in array "a". */
+static int doliteral(const uchar *s, const uchar *text, const uchar*const *a)
+{
+ for ( ; *s != '\0'; text++, s++) {
+ while (*text == '\0') {
+ if ((text = *a++) == NULL)
+ return FALSE;
+ }
+ if (*text != *s)
+ return FALSE;
+ }
+
+ do {
+ if (*text)
+ return FALSE;
+ } while ((text = *a++) != NULL);
+
+ return TRUE;
+}
+
+/* Return the last "count" path elements from the concatenated string.
+ * We return a string pointer to the start of the string, and update the
+ * array pointer-pointer to point to any remaining string elements. */
+static const uchar *trailing_N_elements(const uchar*const **a_ptr, int count)
+{
+ const uchar*const *a = *a_ptr;
+ const uchar*const *first_a = a;
+
+ while (*a)
+ a++;
+
+ while (a != first_a) {
+ const uchar *s = *--a;
+ s += strlen((char*)s);
+ while (--s >= *a) {
+ if (*s == '/' && !--count) {
+ *a_ptr = a+1;
+ return s+1;
+ }
+ }
+ }
+
+ if (count == 1) {
+ *a_ptr = a+1;
+ return *a;
+ }
+
+ return NULL;
+}
+
+/* Match the "pattern" against the "text" string. */
+int wildmatch(const char *pattern, const char *text)
+{
+ static const uchar *nomore[1]; /* A NULL pointer. */
+#ifdef WILD_TEST_ITERATIONS
+ wildmatch_iteration_count = 0;
+#endif
+ return dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+}
+
+/* Match the "pattern" against the forced-to-lower-case "text" string. */
+int iwildmatch(const char *pattern, const char *text)
+{
+ static const uchar *nomore[1]; /* A NULL pointer. */
+ int ret;
+#ifdef WILD_TEST_ITERATIONS
+ wildmatch_iteration_count = 0;
+#endif
+ force_lower_case = 1;
+ ret = dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+ force_lower_case = 0;
+ return ret;
+}
+
+/* Match pattern "p" against the a virtually-joined string consisting
+ * of all the pointers in array "texts" (which has a NULL pointer at the
+ * end). The int "where" can be 0 (normal matching), > 0 (match only
+ * the trailing N slash-separated filename components of "texts"), or < 0
+ * (match the "pattern" at the start or after any slash in "texts"). */
+int wildmatch_array(const char *pattern, const char*const *texts, int where)
+{
+ const uchar *p = (const uchar*)pattern;
+ const uchar*const *a = (const uchar*const*)texts;
+ const uchar *text;
+ int matched;
+
+#ifdef WILD_TEST_ITERATIONS
+ wildmatch_iteration_count = 0;
+#endif
+
+ if (where > 0)
+ text = trailing_N_elements(&a, where);
+ else
+ text = *a++;
+ if (!text)
+ return FALSE;
+
+ if ((matched = dowild(p, text, a)) != TRUE && where < 0
+ && matched != ABORT_ALL) {
+ while (1) {
+ if (*text == '\0') {
+ if ((text = (uchar*)*a++) == NULL)
+ return FALSE;
+ continue;
+ }
+ if (*text++ == '/' && (matched = dowild(p, text, a)) != FALSE
+ && matched != ABORT_TO_STARSTAR)
+ break;
+ }
+ }
+ return matched == TRUE;
+}
+
+/* Match literal string "s" against the a virtually-joined string consisting
+ * of all the pointers in array "texts" (which has a NULL pointer at the
+ * end). The int "where" can be 0 (normal matching), or > 0 (match
+ * only the trailing N slash-separated filename components of "texts"). */
+int litmatch_array(const char *string, const char*const *texts, int where)
+{
+ const uchar *s = (const uchar*)string;
+ const uchar*const *a = (const uchar* const*)texts;
+ const uchar *text;
+
+ if (where > 0)
+ text = trailing_N_elements(&a, where);
+ else
+ text = *a++;
+ if (!text)
+ return FALSE;
+
+ return doliteral(s, text, a) == TRUE;
+}
diff --git a/wildmatch.h b/wildmatch.h
new file mode 100644
index 0000000..e7f1a35
--- /dev/null
+++ b/wildmatch.h
@@ -0,0 +1,6 @@
+/* wildmatch.h */
+
+int wildmatch(const char *pattern, const char *text);
+int iwildmatch(const char *pattern, const char *text);
+int wildmatch_array(const char *pattern, const char*const *texts, int where);
+int litmatch_array(const char *string, const char*const *texts, int where);
--
1.8.0.rc0.29.g1fdd78f
* [PATCH 2/8] wildmatch: remove unnecessary functions
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 1/8] Import wildmatch from rsync Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 3/8] Integrate wildmatch to git Nguyễn Thái Ngọc Duy
` (5 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC
To: git; +Cc: Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
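Note for readers (not part of the commit message): after this patch
dowild() walks a single NUL-terminated string instead of "text" plus
the virtually-joined array "a". A caller that still holds its text in
pieces would have to join them before matching. The following is a
rough sketch of that, assuming a hypothetical helper named
join_texts(); nothing in this series adds such a function, and git's
own callers may never need it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of this series: concatenate a
 * NULL-terminated array of strings into one malloc'd string.
 * Returns NULL on allocation failure; the caller frees the result. */
char *join_texts(const char *const *texts)
{
	const char *const *a;
	size_t len = 0;
	char *buf, *out;

	/* First pass: total length of all pieces. */
	for (a = texts; *a; a++)
		len += strlen(*a);

	buf = malloc(len + 1);
	if (!buf)
		return NULL;

	/* Second pass: copy the pieces back to back. */
	out = buf;
	for (a = texts; *a; a++) {
		size_t n = strlen(*a);
		memcpy(out, *a, n);
		out += n;
	}
	*out = '\0';
	return buf;
}
```

The joined buffer can then be handed to the two-argument wildmatch()
that remains in wildmatch.h, which covers what wildmatch_array() did
with where == 0. The where > 0 and where < 0 modes have no equivalent
here.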
wildmatch.c | 161 ++++--------------------------------------------------------
wildmatch.h | 2 -
2 files changed, 9 insertions(+), 154 deletions(-)
diff --git a/wildmatch.c b/wildmatch.c
index f3a1731..71dba76 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -53,33 +53,19 @@
#define ISUPPER(c) (ISASCII(c) && isupper(c))
#define ISXDIGIT(c) (ISASCII(c) && isxdigit(c))
-#ifdef WILD_TEST_ITERATIONS
-int wildmatch_iteration_count;
-#endif
-
static int force_lower_case = 0;
/* Match pattern "p" against the a virtually-joined string consisting
* of "text" and any strings in array "a". */
-static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
+static int dowild(const uchar *p, const uchar *text)
{
uchar p_ch;
-#ifdef WILD_TEST_ITERATIONS
- wildmatch_iteration_count++;
-#endif
-
for ( ; (p_ch = *p) != '\0'; text++, p++) {
int matched, special;
uchar t_ch, prev_ch;
- while ((t_ch = *text) == '\0') {
- if (*a == NULL) {
- if (p_ch != '*')
- return ABORT_ALL;
- break;
- }
- text = *a++;
- }
+ if ((t_ch = *text) == '\0' && p_ch != '*')
+ return ABORT_ALL;
if (force_lower_case && ISUPPER(t_ch))
t_ch = tolower(t_ch);
switch (p_ch) {
@@ -107,21 +93,15 @@ static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
/* Trailing "**" matches everything. Trailing "*" matches
* only if there are no more slash characters. */
if (!special) {
- do {
if (strchr((char*)text, '/') != NULL)
return FALSE;
- } while ((text = *a++) != NULL);
}
return TRUE;
}
while (1) {
- if (t_ch == '\0') {
- if ((text = *a++) == NULL)
- break;
- t_ch = *text;
- continue;
- }
- if ((matched = dowild(p, text, a)) != FALSE) {
+ if (t_ch == '\0')
+ break;
+ if ((matched = dowild(p, text)) != FALSE) {
if (!special || matched != ABORT_TO_STARSTAR)
return matched;
} else if (!special && t_ch == '/')
@@ -225,144 +205,21 @@ static int dowild(const uchar *p, const uchar *text, const uchar*const *a)
}
}
- do {
- if (*text)
- return FALSE;
- } while ((text = *a++) != NULL);
-
- return TRUE;
-}
-
-/* Match literal string "s" against the a virtually-joined string consisting
- * of "text" and any strings in array "a". */
-static int doliteral(const uchar *s, const uchar *text, const uchar*const *a)
-{
- for ( ; *s != '\0'; text++, s++) {
- while (*text == '\0') {
- if ((text = *a++) == NULL)
- return FALSE;
- }
- if (*text != *s)
- return FALSE;
- }
-
- do {
- if (*text)
- return FALSE;
- } while ((text = *a++) != NULL);
-
- return TRUE;
-}
-
-/* Return the last "count" path elements from the concatenated string.
- * We return a string pointer to the start of the string, and update the
- * array pointer-pointer to point to any remaining string elements. */
-static const uchar *trailing_N_elements(const uchar*const **a_ptr, int count)
-{
- const uchar*const *a = *a_ptr;
- const uchar*const *first_a = a;
-
- while (*a)
- a++;
-
- while (a != first_a) {
- const uchar *s = *--a;
- s += strlen((char*)s);
- while (--s >= *a) {
- if (*s == '/' && !--count) {
- *a_ptr = a+1;
- return s+1;
- }
- }
- }
-
- if (count == 1) {
- *a_ptr = a+1;
- return *a;
- }
-
- return NULL;
+ return *text ? FALSE : TRUE;
}
/* Match the "pattern" against the "text" string. */
int wildmatch(const char *pattern, const char *text)
{
- static const uchar *nomore[1]; /* A NULL pointer. */
-#ifdef WILD_TEST_ITERATIONS
- wildmatch_iteration_count = 0;
-#endif
- return dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+ return dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
}
/* Match the "pattern" against the forced-to-lower-case "text" string. */
int iwildmatch(const char *pattern, const char *text)
{
- static const uchar *nomore[1]; /* A NULL pointer. */
int ret;
-#ifdef WILD_TEST_ITERATIONS
- wildmatch_iteration_count = 0;
-#endif
force_lower_case = 1;
- ret = dowild((const uchar*)pattern, (const uchar*)text, nomore) == TRUE;
+ ret = dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
force_lower_case = 0;
return ret;
}
-
-/* Match pattern "p" against the a virtually-joined string consisting
- * of all the pointers in array "texts" (which has a NULL pointer at the
- * end). The int "where" can be 0 (normal matching), > 0 (match only
- * the trailing N slash-separated filename components of "texts"), or < 0
- * (match the "pattern" at the start or after any slash in "texts"). */
-int wildmatch_array(const char *pattern, const char*const *texts, int where)
-{
- const uchar *p = (const uchar*)pattern;
- const uchar*const *a = (const uchar*const*)texts;
- const uchar *text;
- int matched;
-
-#ifdef WILD_TEST_ITERATIONS
- wildmatch_iteration_count = 0;
-#endif
-
- if (where > 0)
- text = trailing_N_elements(&a, where);
- else
- text = *a++;
- if (!text)
- return FALSE;
-
- if ((matched = dowild(p, text, a)) != TRUE && where < 0
- && matched != ABORT_ALL) {
- while (1) {
- if (*text == '\0') {
- if ((text = (uchar*)*a++) == NULL)
- return FALSE;
- continue;
- }
- if (*text++ == '/' && (matched = dowild(p, text, a)) != FALSE
- && matched != ABORT_TO_STARSTAR)
- break;
- }
- }
- return matched == TRUE;
-}
-
-/* Match literal string "s" against the a virtually-joined string consisting
- * of all the pointers in array "texts" (which has a NULL pointer at the
- * end). The int "where" can be 0 (normal matching), or > 0 (match
- * only the trailing N slash-separated filename components of "texts"). */
-int litmatch_array(const char *string, const char*const *texts, int where)
-{
- const uchar *s = (const uchar*)string;
- const uchar*const *a = (const uchar* const*)texts;
- const uchar *text;
-
- if (where > 0)
- text = trailing_N_elements(&a, where);
- else
- text = *a++;
- if (!text)
- return FALSE;
-
- return doliteral(s, text, a) == TRUE;
-}
diff --git a/wildmatch.h b/wildmatch.h
index e7f1a35..562faa3 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -2,5 +2,3 @@
int wildmatch(const char *pattern, const char *text);
int iwildmatch(const char *pattern, const char *text);
-int wildmatch_array(const char *pattern, const char*const *texts, int where);
-int litmatch_array(const char *string, const char*const *texts, int where);
--
1.8.0.rc0.29.g1fdd78f
* [PATCH 3/8] Integrate wildmatch to git
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 1/8] Import wildmatch from rsync Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 2/8] wildmatch: remove unnecessary functions Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 4/8] wildmatch: remove static variable force_lower_case Nguyễn Thái Ngọc Duy
` (4 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC
To: git; +Cc: Nguyễn Thái Ngọc Duy
This makes wildmatch.c part of libgit.a and builds test-wildmatch; the
dependency on libpopt in the original has been replaced with the use
of our parse-options. Global variables in test-wildmatch are marked
static to avoid sparse warnings.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
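Note for readers (not part of the commit message): the match() helper
in t3070-wildmatch.sh invokes the test program as
"test-wildmatch <mode> <text> <pattern>" and relies on its exit status.
The sketch below shows the shape such a driver could take, assuming a
hypothetical function named run_match_mode(). Only the "fnmatch" mode
is implemented, because wildmatch() lives in git's tree, and the use of
FNM_PATHNAME is an assumption made so that '*' and '?' refuse to match
'/', as the expected results in the test table suggest.

```c
#include <fnmatch.h>
#include <string.h>

/* Hypothetical stand-in for a test driver, not the actual
 * test-wildmatch.c. Returns a shell-style exit status:
 * 0 when "text" matches "pattern", 1 otherwise. */
int run_match_mode(const char *mode, const char *text, const char *pattern)
{
	if (!strcmp(mode, "fnmatch")) {
		/* fnmatch() returns 0 on a match, non-zero otherwise.
		 * FNM_PATHNAME is assumed here, not taken from the patch. */
		return fnmatch(pattern, text, FNM_PATHNAME) ? 1 : 0;
	}
	/* A "wildmatch" mode would call git's wildmatch() here. */
	return 1; /* unknown mode counts as failure */
}
```

Note the argument order: the shell helper passes the text first and the
pattern second, while both fnmatch() and wildmatch() take the pattern
first.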
.gitignore | 1 +
Makefile | 3 +
t/t3070-wildmatch.sh | 178 +++++++++++++++++++++++++++++++++++++++++++++++++++
t/t3070/wildtest.txt | 165 -----------------------------------------------
test-wildmatch.c | 14 ++++
wildmatch.c | 8 ++-
6 files changed, 203 insertions(+), 166 deletions(-)
create mode 100755 t/t3070-wildmatch.sh
delete mode 100644 t/t3070/wildtest.txt
create mode 100644 test-wildmatch.c
diff --git a/.gitignore b/.gitignore
index a188a82..37c3507 100644
--- a/.gitignore
+++ b/.gitignore
@@ -197,6 +197,7 @@
/test-string-list
/test-subprocess
/test-svn-fe
+/test-wildmatch
/common-cmds.h
*.tar.gz
*.dsc
diff --git a/Makefile b/Makefile
index 8413606..9a97379 100644
--- a/Makefile
+++ b/Makefile
@@ -523,6 +523,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
TEST_PROGRAMS_NEED_X += test-string-list
TEST_PROGRAMS_NEED_X += test-subprocess
TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-wildmatch
TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -695,6 +696,7 @@ LIB_H += userdiff.h
LIB_H += utf8.h
LIB_H += varint.h
LIB_H += walker.h
+LIB_H += wildmatch.h
LIB_H += wt-status.h
LIB_H += xdiff-interface.h
LIB_H += xdiff/xdiff.h
@@ -826,6 +828,7 @@ LIB_OBJS += utf8.o
LIB_OBJS += varint.o
LIB_OBJS += version.o
LIB_OBJS += walker.o
+LIB_OBJS += wildmatch.o
LIB_OBJS += wrapper.o
LIB_OBJS += write_or_die.o
LIB_OBJS += ws.o
diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
new file mode 100755
index 0000000..bb92f8d
--- /dev/null
+++ b/t/t3070-wildmatch.sh
@@ -0,0 +1,178 @@
+#!/bin/sh
+
+test_description='wildmatch tests'
+
+. ./test-lib.sh
+
+match() {
+ test_expect_success "wildmatch $*" "
+ if [ $1 = 1 ]; then
+ test-wildmatch wildmatch '$3' '$4'
+ else
+ ! test-wildmatch wildmatch '$3' '$4'
+ fi &&
+ if [ $2 = 1 ]; then
+ test-wildmatch fnmatch '$3' '$4'
+ else
+ ! test-wildmatch fnmatch '$3' '$4'
+ fi
+ "
+}
+
+# Basic wildmat features
+match 1 1 foo foo
+match 0 0 foo bar
+match 1 1 '' ""
+match 1 1 foo '???'
+match 0 0 foo '??'
+match 1 1 foo '*'
+match 1 1 foo 'f*'
+match 0 0 foo '*f'
+match 1 1 foo '*foo*'
+match 1 1 foobar '*ob*a*r*'
+match 1 1 aaaaaaabababab '*ab'
+match 1 1 'foo*' 'foo\*'
+match 0 0 foobar 'foo\*bar'
+match 1 1 'f\oo' 'f\\oo'
+match 1 1 ball '*[al]?'
+match 0 0 ten '[ten]'
+match 1 1 ten '**[!te]'
+match 0 0 ten '**[!ten]'
+match 1 1 ten 't[a-g]n'
+match 0 0 ten 't[!a-g]n'
+match 1 1 ton 't[!a-g]n'
+match 1 1 ton 't[^a-g]n'
+match 1 1 'a]b' 'a[]]b'
+match 1 1 a-b 'a[]-]b'
+match 1 1 'a]b' 'a[]-]b'
+match 0 0 aab 'a[]-]b'
+match 1 1 aab 'a[]a-]b'
+match 1 1 ']' ']'
+
+# Extended slash-matching features
+match 0 0 'foo/baz/bar' 'foo*bar'
+match 1 0 'foo/baz/bar' 'foo**bar'
+match 0 0 'foo/bar' 'foo?bar'
+match 0 0 'foo/bar' 'foo[/]bar'
+match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
+match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
+match 0 0 'foo' '**/foo'
+match 1 1 '/foo' '**/foo'
+match 1 0 'bar/baz/foo' '**/foo'
+match 0 0 'bar/baz/foo' '*/foo'
+match 0 0 'foo/bar/baz' '**/bar*'
+match 1 0 'deep/foo/bar/baz' '**/bar/*'
+match 0 0 'deep/foo/bar/baz/' '**/bar/*'
+match 1 0 'deep/foo/bar/baz/' '**/bar/**'
+match 0 0 'deep/foo/bar' '**/bar/*'
+match 1 0 'deep/foo/bar/' '**/bar/**'
+match 1 0 'foo/bar/baz' '**/bar**'
+match 1 0 'foo/bar/baz/x' '*/bar/**'
+match 0 0 'deep/foo/bar/baz/x' '*/bar/**'
+match 1 0 'deep/foo/bar/baz/x' '**/bar/*/*'
+
+# Various additional tests
+match 0 0 'acrt' 'a[c-c]st'
+match 1 1 'acrt' 'a[c-c]rt'
+match 0 0 ']' '[!]-]'
+match 1 1 'a' '[!]-]'
+match 0 0 '' '\'
+match 0 0 '\' '\'
+match 0 0 '/\' '*/\'
+match 1 1 '/\' '*/\\'
+match 1 1 'foo' 'foo'
+match 1 1 '@foo' '@foo'
+match 0 0 'foo' '@foo'
+match 1 1 '[ab]' '\[ab]'
+match 1 1 '[ab]' '[[]ab]'
+match 1 1 '[ab]' '[[:]ab]'
+match 0 0 '[ab]' '[[::]ab]'
+match 1 1 '[ab]' '[[:digit]ab]'
+match 1 1 '[ab]' '[\[:]ab]'
+match 1 1 '?a?b' '\??\?b'
+match 1 1 'abc' '\a\b\c'
+match 0 0 'foo' ''
+match 1 0 'foo/bar/baz/to' '**/t[o]'
+
+# Character class tests
+match 1 1 'a1B' '[[:alpha:]][[:digit:]][[:upper:]]'
+match 0 0 'a' '[[:digit:][:upper:][:space:]]'
+match 1 1 'A' '[[:digit:][:upper:][:space:]]'
+match 1 0 '1' '[[:digit:][:upper:][:space:]]'
+match 0 0 '1' '[[:digit:][:upper:][:spaci:]]'
+match 1 1 ' ' '[[:digit:][:upper:][:space:]]'
+match 0 0 '.' '[[:digit:][:upper:][:space:]]'
+match 1 1 '.' '[[:digit:][:punct:][:space:]]'
+match 1 1 '5' '[[:xdigit:]]'
+match 1 1 'f' '[[:xdigit:]]'
+match 1 1 'D' '[[:xdigit:]]'
+match 1 0 '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
+match 1 0 '_' '[[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]'
+match 1 1 '.' '[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]'
+match 1 1 '5' '[a-c[:digit:]x-z]'
+match 1 1 'b' '[a-c[:digit:]x-z]'
+match 1 1 'y' '[a-c[:digit:]x-z]'
+match 0 0 'q' '[a-c[:digit:]x-z]'
+
+# Additional tests, including some malformed wildmats
+match 1 1 ']' '[\\-^]'
+match 0 0 '[' '[\\-^]'
+match 1 1 '-' '[\-_]'
+match 1 1 ']' '[\]]'
+match 0 0 '\]' '[\]]'
+match 0 0 '\' '[\]]'
+match 0 0 'ab' 'a[]b'
+match 0 1 'a[]b' 'a[]b'
+match 0 1 'ab[' 'ab['
+match 0 0 'ab' '[!'
+match 0 0 'ab' '[-'
+match 1 1 '-' '[-]'
+match 0 0 '-' '[a-'
+match 0 0 '-' '[!a-'
+match 1 1 '-' '[--A]'
+match 1 1 '5' '[--A]'
+match 1 1 ' ' '[ --]'
+match 1 1 '$' '[ --]'
+match 1 1 '-' '[ --]'
+match 0 0 '0' '[ --]'
+match 1 1 '-' '[---]'
+match 1 1 '-' '[------]'
+match 0 0 'j' '[a-e-n]'
+match 1 1 '-' '[a-e-n]'
+match 1 1 'a' '[!------]'
+match 0 0 '[' '[]-a]'
+match 1 1 '^' '[]-a]'
+match 0 0 '^' '[!]-a]'
+match 1 1 '[' '[!]-a]'
+match 1 1 '^' '[a^bc]'
+match 1 1 '-b]' '[a-]b]'
+match 0 0 '\' '[\]'
+match 1 1 '\' '[\\]'
+match 0 0 '\' '[!\\]'
+match 1 1 'G' '[A-\\]'
+match 0 0 'aaabbb' 'b*a'
+match 0 0 'aabcaa' '*ba*'
+match 1 1 ',' '[,]'
+match 1 1 ',' '[\\,]'
+match 1 1 '\' '[\\,]'
+match 1 1 '-' '[,-.]'
+match 0 0 '+' '[,-.]'
+match 0 0 '-.]' '[,-.]'
+match 1 1 '2' '[\1-\3]'
+match 1 1 '3' '[\1-\3]'
+match 0 0 '4' '[\1-\3]'
+match 1 1 '\' '[[-\]]'
+match 1 1 '[' '[[-\]]'
+match 1 1 ']' '[[-\]]'
+match 0 0 '-' '[[-\]]'
+
+# Test recursion and the abort code (use "wildtest -i" to see iteration counts)
+match 1 1 '-adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
+match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
+match 0 0 '-adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1' '-*-*-*-*-*-*-12-*-*-*-m-*-*-*'
+match 1 1 '/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 0 0 '/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' '/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
+match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
+match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
+
+test_done
diff --git a/t/t3070/wildtest.txt b/t/t3070/wildtest.txt
deleted file mode 100644
index 42c1678..0000000
--- a/t/t3070/wildtest.txt
+++ /dev/null
@@ -1,165 +0,0 @@
-# Input is in the following format (all items white-space separated):
-#
-# The first two items are 1 or 0 indicating if the wildmat call is expected to
-# succeed and if fnmatch works the same way as wildmat, respectively. After
-# that is a text string for the match, and a pattern string. Strings can be
-# quoted (if desired) in either double or single quotes, as well as backticks.
-#
-# MATCH FNMATCH_SAME "text to match" 'pattern to use'
-
-# Basic wildmat features
-1 1 foo foo
-0 1 foo bar
-1 1 '' ""
-1 1 foo ???
-0 1 foo ??
-1 1 foo *
-1 1 foo f*
-0 1 foo *f
-1 1 foo *foo*
-1 1 foobar *ob*a*r*
-1 1 aaaaaaabababab *ab
-1 1 foo* foo\*
-0 1 foobar foo\*bar
-1 1 f\oo f\\oo
-1 1 ball *[al]?
-0 1 ten [ten]
-1 1 ten **[!te]
-0 1 ten **[!ten]
-1 1 ten t[a-g]n
-0 1 ten t[!a-g]n
-1 1 ton t[!a-g]n
-1 1 ton t[^a-g]n
-1 1 a]b a[]]b
-1 1 a-b a[]-]b
-1 1 a]b a[]-]b
-0 1 aab a[]-]b
-1 1 aab a[]a-]b
-1 1 ] ]
-
-# Extended slash-matching features
-0 1 foo/baz/bar foo*bar
-1 1 foo/baz/bar foo**bar
-0 1 foo/bar foo?bar
-0 1 foo/bar foo[/]bar
-0 1 foo/bar f[^eiu][^eiu][^eiu][^eiu][^eiu]r
-1 1 foo-bar f[^eiu][^eiu][^eiu][^eiu][^eiu]r
-0 1 foo **/foo
-1 1 /foo **/foo
-1 1 bar/baz/foo **/foo
-0 1 bar/baz/foo */foo
-0 0 foo/bar/baz **/bar*
-1 1 deep/foo/bar/baz **/bar/*
-0 1 deep/foo/bar/baz/ **/bar/*
-1 1 deep/foo/bar/baz/ **/bar/**
-0 1 deep/foo/bar **/bar/*
-1 1 deep/foo/bar/ **/bar/**
-1 1 foo/bar/baz **/bar**
-1 1 foo/bar/baz/x */bar/**
-0 0 deep/foo/bar/baz/x */bar/**
-1 1 deep/foo/bar/baz/x **/bar/*/*
-
-# Various additional tests
-0 1 acrt a[c-c]st
-1 1 acrt a[c-c]rt
-0 1 ] [!]-]
-1 1 a [!]-]
-0 1 '' \
-0 1 \ \
-0 1 /\ */\
-1 1 /\ */\\
-1 1 foo foo
-1 1 @foo @foo
-0 1 foo @foo
-1 1 [ab] \[ab]
-1 1 [ab] [[]ab]
-1 1 [ab] [[:]ab]
-0 1 [ab] [[::]ab]
-1 1 [ab] [[:digit]ab]
-1 1 [ab] [\[:]ab]
-1 1 ?a?b \??\?b
-1 1 abc \a\b\c
-0 1 foo ''
-1 1 foo/bar/baz/to **/t[o]
-
-# Character class tests
-1 1 a1B [[:alpha:]][[:digit:]][[:upper:]]
-0 1 a [[:digit:][:upper:][:space:]]
-1 1 A [[:digit:][:upper:][:space:]]
-1 1 1 [[:digit:][:upper:][:space:]]
-0 1 1 [[:digit:][:upper:][:spaci:]]
-1 1 ' ' [[:digit:][:upper:][:space:]]
-0 1 . [[:digit:][:upper:][:space:]]
-1 1 . [[:digit:][:punct:][:space:]]
-1 1 5 [[:xdigit:]]
-1 1 f [[:xdigit:]]
-1 1 D [[:xdigit:]]
-1 1 _ [[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
-#1 1
[^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
-1 1 \x7f [^[:alnum:][:alpha:][:blank:][:digit:][:graph:][:lower:][:print:][:punct:][:space:][:upper:][:xdigit:]]
-1 1 . [^[:alnum:][:alpha:][:blank:][:cntrl:][:digit:][:lower:][:space:][:upper:][:xdigit:]]
-1 1 5 [a-c[:digit:]x-z]
-1 1 b [a-c[:digit:]x-z]
-1 1 y [a-c[:digit:]x-z]
-0 1 q [a-c[:digit:]x-z]
-
-# Additional tests, including some malformed wildmats
-1 1 ] [\\-^]
-0 1 [ [\\-^]
-1 1 - [\-_]
-1 1 ] [\]]
-0 1 \] [\]]
-0 1 \ [\]]
-0 1 ab a[]b
-0 1 a[]b a[]b
-0 1 ab[ ab[
-0 1 ab [!
-0 1 ab [-
-1 1 - [-]
-0 1 - [a-
-0 1 - [!a-
-1 1 - [--A]
-1 1 5 [--A]
-1 1 ' ' '[ --]'
-1 1 $ '[ --]'
-1 1 - '[ --]'
-0 1 0 '[ --]'
-1 1 - [---]
-1 1 - [------]
-0 1 j [a-e-n]
-1 1 - [a-e-n]
-1 1 a [!------]
-0 1 [ []-a]
-1 1 ^ []-a]
-0 1 ^ [!]-a]
-1 1 [ [!]-a]
-1 1 ^ [a^bc]
-1 1 -b] [a-]b]
-0 1 \ [\]
-1 1 \ [\\]
-0 1 \ [!\\]
-1 1 G [A-\\]
-0 1 aaabbb b*a
-0 1 aabcaa *ba*
-1 1 , [,]
-1 1 , [\\,]
-1 1 \ [\\,]
-1 1 - [,-.]
-0 1 + [,-.]
-0 1 -.] [,-.]
-1 1 2 [\1-\3]
-1 1 3 [\1-\3]
-0 1 4 [\1-\3]
-1 1 \ [[-\]]
-1 1 [ [[-\]]
-1 1 ] [[-\]]
-0 1 - [[-\]]
-
-# Test recursion and the abort code (use "wildtest -i" to see iteration counts)
-1 1 -adobe-courier-bold-o-normal--12-120-75-75-m-70-iso8859-1 -*-*-*-*-*-*-12-*-*-*-m-*-*-*
-0 1 -adobe-courier-bold-o-normal--12-120-75-75-X-70-iso8859-1 -*-*-*-*-*-*-12-*-*-*-m-*-*-*
-0 1 -adobe-courier-bold-o-normal--12-120-75-75-/-70-iso8859-1 -*-*-*-*-*-*-12-*-*-*-m-*-*-*
-1 1 /adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1 /*/*/*/*/*/*/12/*/*/*/m/*/*/*
-0 1 /adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1 /*/*/*/*/*/*/12/*/*/*/m/*/*/*
-1 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt **/*a*b*g*n*t
-0 1 abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz **/*a*b*g*n*t
diff --git a/test-wildmatch.c b/test-wildmatch.c
new file mode 100644
index 0000000..08962d5
--- /dev/null
+++ b/test-wildmatch.c
@@ -0,0 +1,14 @@
+#include "cache.h"
+#include "wildmatch.h"
+
+int main(int argc, char **argv)
+{
+ if (!strcmp(argv[1], "wildmatch"))
+ return wildmatch(argv[3], argv[2]) ? 0 : 1;
+ else if (!strcmp(argv[1], "iwildmatch"))
+ return iwildmatch(argv[3], argv[2]) ? 0 : 1;
+ else if (!strcmp(argv[1], "fnmatch"))
+ return fnmatch(argv[3], argv[2], FNM_PATHNAME);
+ else
+ return 1;
+}
diff --git a/wildmatch.c b/wildmatch.c
index 71dba76..7b64a6b 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -9,7 +9,13 @@
** work differently than '*', and to fix the character-class code.
*/
-#include "rsync.h"
+#include <stddef.h>
+#include <ctype.h>
+#include <string.h>
+
+#include "wildmatch.h"
+
+typedef unsigned char uchar;
/* What character marks an inverted character class? */
#define NEGATE_CLASS '!'
--
1.8.0.rc0.29.g1fdd78f
* [PATCH 4/8] wildmatch: remove static variable force_lower_case
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
` (2 preceding siblings ...)
2012-10-09 3:09 ` [PATCH 3/8] Integrate wildmatch to git Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 20:47 ` Junio C Hamano
2012-10-09 3:09 ` [PATCH 5/8] wildmatch: fix case-insensitive matching Nguyễn Thái Ngọc Duy
` (3 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy
One place less to worry about thread safety. Also combine wildmatch
and iwildmatch into one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
test-wildmatch.c | 4 ++--
wildmatch.c | 23 ++++++-----------------
wildmatch.h | 3 +--
3 files changed, 9 insertions(+), 21 deletions(-)
diff --git a/test-wildmatch.c b/test-wildmatch.c
index 08962d5..5c18cf8 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -4,9 +4,9 @@
int main(int argc, char **argv)
{
if (!strcmp(argv[1], "wildmatch"))
- return wildmatch(argv[3], argv[2]) ? 0 : 1;
+ return wildmatch(argv[3], argv[2], 0) ? 0 : 1;
else if (!strcmp(argv[1], "iwildmatch"))
- return iwildmatch(argv[3], argv[2]) ? 0 : 1;
+ return wildmatch(argv[3], argv[2], FNM_CASEFOLD) ? 0 : 1;
else if (!strcmp(argv[1], "fnmatch"))
return fnmatch(argv[3], argv[2], FNM_PATHNAME);
else
diff --git a/wildmatch.c b/wildmatch.c
index 7b64a6b..2382873 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -11,8 +11,8 @@
#include <stddef.h>
#include <ctype.h>
-#include <string.h>
+#include "cache.h"
#include "wildmatch.h"
typedef unsigned char uchar;
@@ -59,11 +59,9 @@ typedef unsigned char uchar;
#define ISUPPER(c) (ISASCII(c) && isupper(c))
#define ISXDIGIT(c) (ISASCII(c) && isxdigit(c))
-static int force_lower_case = 0;
-
/* Match pattern "p" against the a virtually-joined string consisting
* of "text" and any strings in array "a". */
-static int dowild(const uchar *p, const uchar *text)
+static int dowild(const uchar *p, const uchar *text, int force_lower_case)
{
uchar p_ch;
@@ -107,7 +105,7 @@ static int dowild(const uchar *p, const uchar *text)
while (1) {
if (t_ch == '\0')
break;
- if ((matched = dowild(p, text)) != FALSE) {
+ if ((matched = dowild(p, text, force_lower_case)) != FALSE) {
if (!special || matched != ABORT_TO_STARSTAR)
return matched;
} else if (!special && t_ch == '/')
@@ -215,17 +213,8 @@ static int dowild(const uchar *p, const uchar *text)
}
/* Match the "pattern" against the "text" string. */
-int wildmatch(const char *pattern, const char *text)
-{
- return dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
-}
-
-/* Match the "pattern" against the forced-to-lower-case "text" string. */
-int iwildmatch(const char *pattern, const char *text)
+int wildmatch(const char *pattern, const char *text, int flags)
{
- int ret;
- force_lower_case = 1;
- ret = dowild((const uchar*)pattern, (const uchar*)text) == TRUE;
- force_lower_case = 0;
- return ret;
+ return dowild((const uchar*)pattern, (const uchar*)text,
+ flags & FNM_CASEFOLD ? 1 : 0) == TRUE;
}
diff --git a/wildmatch.h b/wildmatch.h
index 562faa3..e974f9a 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -1,4 +1,3 @@
/* wildmatch.h */
-int wildmatch(const char *pattern, const char *text);
-int iwildmatch(const char *pattern, const char *text);
+int wildmatch(const char *pattern, const char *text, int flags);
--
1.8.0.rc0.29.g1fdd78f
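
The idea behind this patch, handing the case-folding choice down as an argument instead of keeping it in a file-scope static, can be sketched in isolation. This is only an illustration under stated assumptions, not the wildmatch code: match_simple, sketch_match and SKETCH_CASEFOLD are hypothetical names, and only literals, '?' and '*' are handled. As in the series at this point, only the text is folded.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical flag; stands in for FNM_CASEFOLD in this sketch. */
#define SKETCH_CASEFOLD 1

/*
 * Minimal recursive matcher: literals, '?' and '*' only.
 * The case-folding choice travels as an argument, so concurrent
 * callers never share mutable state.
 */
static int match_simple(const unsigned char *p, const unsigned char *t, int fold)
{
    for (; *p; p++, t++) {
        unsigned char t_ch = *t;

        if (fold && isupper(t_ch))
            t_ch = tolower(t_ch);
        if (*p == '*') {
            /* Try every suffix of the text, including the empty one. */
            for (;; t++) {
                if (match_simple(p + 1, t, fold))
                    return 1;
                if (*t == '\0')
                    return 0;
            }
        }
        if (t_ch == '\0')
            return 0;
        if (*p != '?' && *p != t_ch)
            return 0;
    }
    return *t == '\0';
}

/* Single entry point; returns non-zero on a match. */
static int sketch_match(const char *pattern, const char *text, int flags)
{
    assert(pattern && text);
    return match_simple((const unsigned char *)pattern,
                        (const unsigned char *)text,
                        (flags & SKETCH_CASEFOLD) ? 1 : 0);
}
```

With this shape, sketch_match("f*", "FOO", SKETCH_CASEFOLD) matches while the same call with flags 0 does not, and neither call reads or writes anything outside its own arguments.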
* Re: [PATCH 4/8] wildmatch: remove static variable force_lower_case
2012-10-09 3:09 ` [PATCH 4/8] wildmatch: remove static variable force_lower_case Nguyễn Thái Ngọc Duy
@ 2012-10-09 20:47 ` Junio C Hamano
2012-10-10 5:14 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-10-09 20:47 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> diff --git a/wildmatch.c b/wildmatch.c
> index 7b64a6b..2382873 100644
> --- a/wildmatch.c
> +++ b/wildmatch.c
> @@ -11,8 +11,8 @@
>
> #include <stddef.h>
> #include <ctype.h>
> -#include <string.h>
>
> +#include "cache.h"
> #include "wildmatch.h"
This is wrong; the includes from the system headers should have
been removed in the previous step where the series "integrated"
wildmatch to git, after which point the first include any C source
that is not at the platform-compatibility layer should be cache.h
or git-compat-util.h.
* Re: [PATCH 4/8] wildmatch: remove static variable force_lower_case
2012-10-09 20:47 ` Junio C Hamano
@ 2012-10-10 5:14 ` Nguyen Thai Ngoc Duy
2012-10-10 5:31 ` Junio C Hamano
0 siblings, 1 reply; 15+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-10-10 5:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Wed, Oct 10, 2012 at 3:47 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> diff --git a/wildmatch.c b/wildmatch.c
>> index 7b64a6b..2382873 100644
>> --- a/wildmatch.c
>> +++ b/wildmatch.c
>> @@ -11,8 +11,8 @@
>>
>> #include <stddef.h>
>> #include <ctype.h>
>> -#include <string.h>
>>
>> +#include "cache.h"
>> #include "wildmatch.h"
>
> This is wrong; the includes from the system headers should have
> been removed in the previous step where the series "integrated"
> wildmatch to git, after which point the first include any C source
> that is not at the platform-compatibility layer should be cache.h
> or git-compat-util.h.
Git's ctype does not seem to be complete for wildmatch's use so
ctype.h is required. But that can be easily fixed later on.
--
Duy
* Re: [PATCH 4/8] wildmatch: remove static variable force_lower_case
2012-10-10 5:14 ` Nguyen Thai Ngoc Duy
@ 2012-10-10 5:31 ` Junio C Hamano
2012-10-10 5:47 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2012-10-10 5:31 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git
Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
> Git's ctype does not seem to be complete for wildmatch's use so
> ctype.h is required. But that can be easily fixed later on.
Until "later on", I cannot even compile the series.
* Re: [PATCH 4/8] wildmatch: remove static variable force_lower_case
2012-10-10 5:31 ` Junio C Hamano
@ 2012-10-10 5:47 ` Nguyen Thai Ngoc Duy
0 siblings, 0 replies; 15+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-10-10 5:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Wed, Oct 10, 2012 at 12:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> Git's ctype does not seem to be complete for wildmatch's use so
>> ctype.h is required. But that can be easily fixed later on.
>
> Until "later on", I cannot even compile the series.
So that's why you noticed this patch :) It builds fine here. I'll fix
up and send an update later.
--
Duy
* [PATCH 5/8] wildmatch: fix case-insensitive matching
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
` (3 preceding siblings ...)
2012-10-09 3:09 ` [PATCH 4/8] wildmatch: remove static variable force_lower_case Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 6/8] wildmatch: adjust "**" behavior Nguyễn Thái Ngọc Duy
` (2 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy
dowild() does case insensitive matching by lower-casing the text. That
means lower case letters in patterns imply case-insensitive matching,
but upper case means exact matching.
We do not want that subtlety. Lower-case the pattern as well, so that
case-insensitive matching (FNM_CASEFOLD) always does what we expect.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
wildmatch.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/wildmatch.c b/wildmatch.c
index 2382873..fdb8cb1 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -72,6 +72,8 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
return ABORT_ALL;
if (force_lower_case && ISUPPER(t_ch))
t_ch = tolower(t_ch);
+ if (force_lower_case && ISUPPER(p_ch))
+ p_ch = tolower(p_ch);
switch (p_ch) {
case '\\':
/* Literal match with following character. Note that the test
--
1.8.0.rc0.29.g1fdd78f
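
The asymmetry this patch removes can be seen on single bytes. The following is a small sketch with hypothetical helper names (eq_fold_text_only, eq_fold_both, literal_match); it compares literal strings only and is not taken from wildmatch.c.

```c
#include <assert.h>
#include <ctype.h>

/* Compare one pattern byte with one text byte, folding only the text. */
static int eq_fold_text_only(unsigned char p_ch, unsigned char t_ch)
{
    if (isupper(t_ch))
        t_ch = tolower(t_ch);
    return p_ch == t_ch;
}

/* Same comparison, but folding both sides, as the two added lines do. */
static int eq_fold_both(unsigned char p_ch, unsigned char t_ch)
{
    if (isupper(t_ch))
        t_ch = tolower(t_ch);
    if (isupper(p_ch))
        p_ch = tolower(p_ch);
    return p_ch == t_ch;
}

/* Literal-only comparison of two strings using the given byte comparator. */
static int literal_match(const char *pattern, const char *text,
                         int (*eq)(unsigned char, unsigned char))
{
    assert(pattern && text && eq);
    for (; *pattern && *text; pattern++, text++)
        if (!eq((unsigned char)*pattern, (unsigned char)*text))
            return 0;
    return *pattern == '\0' && *text == '\0';
}
```

With text-only folding, the pattern "foo" matches the text "FOO", but the pattern "FOO" does not match "foo", because an upper-case pattern byte can never equal a folded text byte. Folding both sides makes the comparison symmetric.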
* [PATCH 6/8] wildmatch: adjust "**" behavior
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
` (4 preceding siblings ...)
2012-10-09 3:09 ` [PATCH 5/8] wildmatch: fix case-insensitive matching Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 7/8] wildmatch: make /**/ match zero or more directories Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 8/8] Support "**" wildcard in .gitignore and .gitattributes Nguyễn Thái Ngọc Duy
7 siblings, 0 replies; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy
Standard wildmatch() sees consecutive asterisks as "*" that can also
match slashes. But that may be hard to explain to users as
"abc/**/def" can match "abcdef", "abcxyzdef", "abc/def", "abc/x/def",
"abc/x/y/def"...
This patch changes wildmatch so that users can write
- "**/def" -> all paths ending with file/directory 'def'
- "abc/**" -> equivalent to "/abc/"
- "abc/**/def" -> "abc/x/def", "abc/x/y/def"...
- other "**" cases are downgraded to normal "*"
Basically, "**" keeps its magic only when it is delimited by
slashes.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
t/t3070-wildmatch.sh | 2 +-
wildmatch.c | 8 +++++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index bb92f8d..d320f84 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -51,7 +51,7 @@ match 1 1 ']' ']'
# Extended slash-matching features
match 0 0 'foo/baz/bar' 'foo*bar'
-match 1 0 'foo/baz/bar' 'foo**bar'
+match 0 0 'foo/baz/bar' 'foo**bar'
match 0 0 'foo/bar' 'foo?bar'
match 0 0 'foo/bar' 'foo[/]bar'
match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
diff --git a/wildmatch.c b/wildmatch.c
index fdb8cb1..1b39346 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -91,8 +91,14 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
continue;
case '*':
if (*++p == '*') {
+ const uchar *prev_p = p - 2;
while (*++p == '*') {}
- special = TRUE;
+ if ((prev_p == text || *prev_p == '/') ||
+ (*p == '\0' || *p == '/' ||
+ (p[0] == '\\' && p[1] == '/'))) {
+ special = TRUE;
+ } else
+ special = FALSE;
} else
special = FALSE;
if (*p == '\0') {
--
1.8.0.rc0.29.g1fdd78f
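
The condition added in the hunk above can be restated as a standalone predicate. This is a sketch under assumptions: starstar_is_special is a hypothetical name, it works on an index into the pattern rather than on the p/prev_p pointers, and it assumes the left-hand test is meant as "start of the pattern". It follows the hunk as written, where the left and right tests are joined with ||.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decide whether the run of '*' starting at pattern[i] keeps its
 * slash-matching behavior.  Returns 1 for special, 0 for plain '*'.
 */
static int starstar_is_special(const char *pattern, size_t i)
{
    size_t end = i;
    int left, right;

    assert(pattern && pattern[i] == '*');
    while (pattern[end] == '*')
        end++;
    if (end - i < 2)
        return 0;   /* a single '*' is never special */

    /* Left side: start of pattern (assumed intent) or a preceding '/'. */
    left = (i == 0 || pattern[i - 1] == '/');
    /* Right side: end of pattern, '/', or an escaped '/'. */
    right = (pattern[end] == '\0' || pattern[end] == '/' ||
             (pattern[end] == '\\' && pattern[end + 1] == '/'));

    return left || right;   /* the hunk joins the two sides with || */
}
```

Under this reading "foo**bar" is downgraded to a plain "*", while "**/foo", "abc/**" and "a/**/b" stay special. A pattern such as "foo/**bar" also stays special, because a slash on one side is enough for the || form.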
* [PATCH 7/8] wildmatch: make /**/ match zero or more directories
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
` (5 preceding siblings ...)
2012-10-09 3:09 ` [PATCH 6/8] wildmatch: adjust "**" behavior Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 3:09 ` [PATCH 8/8] Support "**" wildcard in .gitignore and .gitattributes Nguyễn Thái Ngọc Duy
7 siblings, 0 replies; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy
"foo/**/bar" matches "foo/x/bar", "foo/x/y/bar"... but not
"foo/bar". We make a special case, when foo/**/ is detected (and
"foo/" part is already matched), try matching "bar" with the rest of
the string.
"Match one or more directories" semantics can be easily achieved using
"foo/*/**/bar".
This also makes "**/foo" match "foo" in addition to "x/foo",
"x/y/foo"...
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
t/t3070-wildmatch.sh | 8 +++++++-
wildmatch.c | 17 +++++++++++++++++
2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index d320f84..a247a36 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -52,11 +52,17 @@ match 1 1 ']' ']'
# Extended slash-matching features
match 0 0 'foo/baz/bar' 'foo*bar'
match 0 0 'foo/baz/bar' 'foo**bar'
+match 1 1 'foo/baz/bar' 'foo/**/bar'
+match 1 0 'foo/baz/bar' 'foo/**/**/bar'
+match 1 0 'foo/b/a/z/bar' 'foo/**/bar'
+match 1 0 'foo/b/a/z/bar' 'foo/**/**/bar'
+match 1 0 'foo/bar' 'foo/**/bar'
+match 1 0 'foo/bar' 'foo/**/**/bar'
match 0 0 'foo/bar' 'foo?bar'
match 0 0 'foo/bar' 'foo[/]bar'
match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
-match 0 0 'foo' '**/foo'
+match 1 0 'foo' '**/foo'
match 1 1 '/foo' '**/foo'
match 1 0 'bar/baz/foo' '**/foo'
match 0 0 'bar/baz/foo' '*/foo'
diff --git a/wildmatch.c b/wildmatch.c
index 1b39346..4069b2d 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -96,6 +96,23 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
if ((prev_p == text || *prev_p == '/') ||
(*p == '\0' || *p == '/' ||
(p[0] == '\\' && p[1] == '/'))) {
+ /*
+ * Assuming we already match 'foo/' and are at
+ * <star star slash>, just assume it matches
+ * nothing and go ahead match the rest of the
+ * pattern with the remaining string. This
+ * helps make foo/<*><*>/bar (<> because
+ * otherwise it breaks C comment syntax) match
+ * both foo/bar and foo/a/bar.
+ *
+ * Crazy patterns like /<*><*>/<*><*>/ are
+ * treated like /<*><*>/. But undefined
+ * behavior is even appropriate for people
+ * writing such a pattern.
+ */
+ if (p[0] == '/' &&
+ dowild(p + 1, text, force_lower_case) == TRUE)
+ return TRUE;
special = TRUE;
} else
special = FALSE;
--
1.8.0.rc0.29.g1fdd78f
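
The "zero or more directories" semantics described in this patch can be sketched with a component-wise matcher. This is a simplified illustration, not the dowild() logic: match_component, component_len and match_path are hypothetical names, only literals, '*' within a component, and a "**" component followed by a slash are handled, and a trailing "/**" is not treated specially.

```c
#include <assert.h>
#include <string.h>

/* Match one path component (no '/' inside); supports literals and '*'. */
static int match_component(const char *p, size_t plen,
                           const char *t, size_t tlen)
{
    if (plen == 0)
        return tlen == 0;
    if (*p == '*')
        return match_component(p + 1, plen - 1, t, tlen) ||
               (tlen > 0 && match_component(p, plen, t + 1, tlen - 1));
    return tlen > 0 && *p == *t &&
           match_component(p + 1, plen - 1, t + 1, tlen - 1);
}

/* Length of the leading path component of s. */
static size_t component_len(const char *s)
{
    return strcspn(s, "/");
}

/* Match a slash-separated pattern against a slash-separated path. */
static int match_path(const char *p, const char *t)
{
    size_t plen, tlen;

    assert(p && t);
    plen = component_len(p);
    tlen = component_len(t);

    if (plen == 2 && p[0] == '*' && p[1] == '*' && p[2] == '/') {
        /* First assume the "**" component matches zero directories. */
        if (match_path(p + 3, t))
            return 1;
        /* Otherwise consume one directory of the path and retry. */
        if (t[tlen] == '/')
            return match_path(p, t + tlen + 1);
        return 0;
    }
    if (!match_component(p, plen, t, tlen))
        return 0;
    if (p[plen] == '\0' || t[tlen] == '\0')
        return p[plen] == '\0' && t[tlen] == '\0';
    return match_path(p + plen + 1, t + tlen + 1);
}
```

With this sketch "foo/**/bar" matches "foo/bar", "foo/baz/bar" and "foo/b/a/z/bar", "**/foo" matches plain "foo", and "foo/*/**/bar" gives the "one or more directories" behavior: it matches "foo/x/bar" but not "foo/bar".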
* [PATCH 8/8] Support "**" wildcard in .gitignore and .gitattributes
2012-10-09 3:08 [PATCH 0/8] wildmatch take 3 Nguyễn Thái Ngọc Duy
` (6 preceding siblings ...)
2012-10-09 3:09 ` [PATCH 7/8] wildmatch: make /**/ match zero or more directories Nguyễn Thái Ngọc Duy
@ 2012-10-09 3:09 ` Nguyễn Thái Ngọc Duy
2012-10-09 7:57 ` Michael Haggerty
7 siblings, 1 reply; 15+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-09 3:09 UTC (permalink / raw)
To: git; +Cc: Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
Documentation/gitignore.txt | 19 +++++++++++++++++++
attr.c | 4 +++-
dir.c | 4 +++-
t/t0003-attributes.sh | 38 ++++++++++++++++++++++++++++++++++++++
t/t3001-ls-files-others-exclude.sh | 19 +++++++++++++++++++
5 files changed, 82 insertions(+), 2 deletions(-)
diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index 96639e0..5a9c9f7 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -104,6 +104,25 @@ PATTERN FORMAT
For example, "/{asterisk}.c" matches "cat-file.c" but not
"mozilla-sha1/sha1.c".
+Two consecutive asterisks ("`**`") in patterns matched against
+full pathname may have special meaning:
+
+ - A leading "`**`" followed by a slash means match in all
+ directories. For example, "`**/foo`" matches file or directory
+ "`foo`" anywhere, the same as pattern "`foo`". "**/foo/bar"
+ matches file or directory "`bar`" anywhere that is directly
+ under directory "`foo`".
+
+ - A trailing "/**" matches everything inside. For example,
+ "abc/**" is equivalent to "`/abc/`".
+
+ - A slash followed by two consecutive asterisks then a slash
+ matches zero or more directories. For example, "`a/**/b`"
+ matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
+
+ - Consecutive asterisks otherwise are treated like normal
+ asterisk wildcards.
+
NOTES
-----
diff --git a/attr.c b/attr.c
index 887a9ae..e85e5ed 100644
--- a/attr.c
+++ b/attr.c
@@ -12,6 +12,7 @@
#include "exec_cmd.h"
#include "attr.h"
#include "dir.h"
+#include "wildmatch.h"
const char git_attr__true[] = "(builtin)true";
const char git_attr__false[] = "\0(builtin)false";
@@ -666,7 +667,8 @@ static int path_matches(const char *pathname, int pathlen,
return 0;
if (baselen != 0)
baselen++;
- return fnmatch_icase(pattern, pathname + baselen, FNM_PATHNAME) == 0;
+ return wildmatch(pattern, pathname + baselen,
+ ignore_case ? FNM_CASEFOLD : 0);
}
static int macroexpand_one(int attr_nr, int rem);
diff --git a/dir.c b/dir.c
index 4868339..dc721c0 100644
--- a/dir.c
+++ b/dir.c
@@ -8,6 +8,7 @@
#include "cache.h"
#include "dir.h"
#include "refs.h"
+#include "wildmatch.h"
struct path_simplify {
int len;
@@ -575,7 +576,8 @@ int excluded_from_list(const char *pathname,
namelen -= prefix;
}
- if (!namelen || !fnmatch_icase(exclude, name, FNM_PATHNAME))
+ if (!namelen ||
+ wildmatch(exclude, name, ignore_case ? FNM_CASEFOLD : 0))
return to_exclude;
}
return -1; /* undecided */
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index febc45c..67a5694 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -232,4 +232,42 @@ test_expect_success 'bare repository: test info/attributes' '
attr_check subdir/a/i unspecified
'
+test_expect_success '"**" test' '
+ cd .. &&
+ echo "**/f foo=bar" >.gitattributes &&
+ cat <<\EOF >expect &&
+f: foo: bar
+a/f: foo: bar
+a/b/f: foo: bar
+a/b/c/f: foo: bar
+EOF
+ git check-attr foo -- "f" >actual 2>err &&
+ git check-attr foo -- "a/f" >>actual 2>>err &&
+ git check-attr foo -- "a/b/f" >>actual 2>>err &&
+ git check-attr foo -- "a/b/c/f" >>actual 2>>err &&
+ test_cmp expect actual &&
+ test_line_count = 0 err
+'
+
+test_expect_success '"**" with no slashes test' '
+ echo "a**f foo=bar" >.gitattributes &&
+ git check-attr foo -- "f" >actual &&
+ cat <<\EOF >expect &&
+f: foo: unspecified
+af: foo: bar
+axf: foo: bar
+a/f: foo: unspecified
+a/b/f: foo: unspecified
+a/b/c/f: foo: unspecified
+EOF
+ git check-attr foo -- "f" >actual 2>err &&
+ git check-attr foo -- "af" >>actual 2>>err &&
+ git check-attr foo -- "axf" >>actual 2>>err &&
+ git check-attr foo -- "a/f" >>actual 2>>err &&
+ git check-attr foo -- "a/b/f" >>actual 2>>err &&
+ git check-attr foo -- "a/b/c/f" >>actual 2>>err &&
+ test_cmp expect actual &&
+ test_line_count = 0 err
+'
+
test_done
diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
index c8fe978..278315d 100755
--- a/t/t3001-ls-files-others-exclude.sh
+++ b/t/t3001-ls-files-others-exclude.sh
@@ -214,4 +214,23 @@ test_expect_success 'subdirectory ignore (l1)' '
test_cmp expect actual
'
+
+test_expect_success 'ls-files with "**" patterns' '
+ cat <<\EOF >expect &&
+a.1
+one/a.1
+one/two/a.1
+three/a.1
+EOF
+ git ls-files -o -i --exclude "**/a.1" >actual &&
+ test_cmp expect actual
+'
+
+
+test_expect_success 'ls-files with "**" patterns and no slashes' '
+ : >expect &&
+ git ls-files -o -i --exclude "one**a.1" >actual &&
+ test_cmp expect actual
+'
+
test_done
--
1.8.0.rc0.29.g1fdd78f
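
One detail of the attr.c and dir.c hunks is the return convention: the old calls compare the fnmatch-style result against 0 (or negate it), while the new wildmatch() calls use the result directly, since wildmatch() returns non-zero on a match. A small sketch of that inversion using POSIX fnmatch() follows; path_glob_matches is a hypothetical wrapper name, not something from git.

```c
#include <assert.h>
#include <fnmatch.h>

/* Boolean-style wrapper: non-zero means "matched". */
static int path_glob_matches(const char *pattern, const char *path)
{
    assert(pattern && path);
    /* fnmatch() reports a match with 0, so invert it here. */
    return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

With FNM_PATHNAME a single "*" does not cross a slash, so path_glob_matches("*.c", "cat-file.c") is true while path_glob_matches("*.c", "mozilla-sha1/sha1.c") is false, in line with the gitignore example quoted in the documentation hunk.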
* Re: [PATCH 8/8] Support "**" wildcard in .gitignore and .gitattributes
2012-10-09 3:09 ` [PATCH 8/8] Support "**" wildcard in .gitignore and .gitattributes Nguyễn Thái Ngọc Duy
@ 2012-10-09 7:57 ` Michael Haggerty
2012-10-10 5:40 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 15+ messages in thread
From: Michael Haggerty @ 2012-10-09 7:57 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git
I like how this series is going and it's going to be a nice new feature.
Some comments...
It would be helpful if you would use
--subject-prefix='PATCH v3'
etc. to help spectators keep track of the different versions of your
patch series.
On 10/09/2012 05:09 AM, Nguyễn Thái Ngọc Duy wrote:
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> Documentation/gitignore.txt | 19 +++++++++++++++++++
> attr.c | 4 +++-
> dir.c | 4 +++-
> t/t0003-attributes.sh | 38 ++++++++++++++++++++++++++++++++++++++
> t/t3001-ls-files-others-exclude.sh | 19 +++++++++++++++++++
> 5 files changed, 82 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
> index 96639e0..5a9c9f7 100644
> --- a/Documentation/gitignore.txt
> +++ b/Documentation/gitignore.txt
> @@ -104,6 +104,25 @@ PATTERN FORMAT
> For example, "/{asterisk}.c" matches "cat-file.c" but not
> "mozilla-sha1/sha1.c".
>
> +Two consecutive asterisks ("`**`") in patterns matched against
> +full pathname may have special meaning:
> +
> + - A leading "`**`" followed by a slash means match in all
> + directories. For example, "`**/foo`" matches file or directory
> + "`foo`" anywhere, the same as pattern "`foo`". "**/foo/bar"
> + matches file or directory "`bar`" anywhere that is directly
> + under directory "`foo`".
> +
> + - A trailing "/**" matches everything inside. For example,
> + "abc/**" is equivalent to "`/abc/`".
It seems odd that you add a leading slash in this example. I assume
that is because of the rule that a pattern containing a slash is
considered anchored at the current directory. But I find it confusing
because the addition of the leading slash is not part of the rule you
are trying to illustrate here, and is therefore a distraction. I
suggest that you write either
- A trailing "/**" matches everything inside. For example,
"/abc/**" is equivalent to "`/abc/`".
or
- A trailing "/**" matches everything inside. For example,
"abc/**" is equivalent to "`abc/`" (which is also equivalent
to "`/abc/`").
> +
> + - A slash followed by two consecutive asterisks then a slash
> + matches zero or more directories. For example, "`a/**/b`"
> + matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
> +
> + - Consecutive asterisks otherwise are treated like normal
> + asterisk wildcards.
> +
I don't like the last rule. (1) This construct is superfluous; why
wouldn't the user just use a single asterisk? (2) Allowing this
construct means that it could appear in .gitignore files, creating
unnecessary confusion: extrapolating from the other meanings of "**"
users would expect that it would somehow match slashes. (3) It is
conceivable (though admittedly unlikely) that we might want to assign a
distinct meaning to this construct in the future, and accepting it now
as a different way to spell "*" would prevent such a change.
Perhaps this rule was intended for backwards compatibility?
I think it would be preferable to say that other uses of consecutive
asterisks are undefined, and probably make them trigger a warning.
> NOTES
> -----
>
> diff --git a/attr.c b/attr.c
> index 887a9ae..e85e5ed 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -12,6 +12,7 @@
> #include "exec_cmd.h"
> #include "attr.h"
> #include "dir.h"
> +#include "wildmatch.h"
>
> const char git_attr__true[] = "(builtin)true";
> const char git_attr__false[] = "\0(builtin)false";
> @@ -666,7 +667,8 @@ static int path_matches(const char *pathname, int pathlen,
> return 0;
> if (baselen != 0)
> baselen++;
> - return fnmatch_icase(pattern, pathname + baselen, FNM_PATHNAME) == 0;
> + return wildmatch(pattern, pathname + baselen,
> + ignore_case ? FNM_CASEFOLD : 0);
> }
>
> static int macroexpand_one(int attr_nr, int rem);
> diff --git a/dir.c b/dir.c
> index 4868339..dc721c0 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -8,6 +8,7 @@
> #include "cache.h"
> #include "dir.h"
> #include "refs.h"
> +#include "wildmatch.h"
>
> struct path_simplify {
> int len;
> @@ -575,7 +576,8 @@ int excluded_from_list(const char *pathname,
> namelen -= prefix;
> }
>
> - if (!namelen || !fnmatch_icase(exclude, name, FNM_PATHNAME))
> + if (!namelen ||
> + wildmatch(exclude, name, ignore_case ? FNM_CASEFOLD : 0))
> return to_exclude;
> }
> return -1; /* undecided */
> diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
> index febc45c..67a5694 100755
> --- a/t/t0003-attributes.sh
> +++ b/t/t0003-attributes.sh
> @@ -232,4 +232,42 @@ test_expect_success 'bare repository: test info/attributes' '
> attr_check subdir/a/i unspecified
> '
>
> +test_expect_success '"**" test' '
> + cd .. &&
> + echo "**/f foo=bar" >.gitattributes &&
> + cat <<\EOF >expect &&
> +f: foo: bar
> +a/f: foo: bar
> +a/b/f: foo: bar
> +a/b/c/f: foo: bar
> +EOF
> + git check-attr foo -- "f" >actual 2>err &&
> + git check-attr foo -- "a/f" >>actual 2>>err &&
> + git check-attr foo -- "a/b/f" >>actual 2>>err &&
> + git check-attr foo -- "a/b/c/f" >>actual 2>>err &&
> + test_cmp expect actual &&
> + test_line_count = 0 err
> +'
> +
> +test_expect_success '"**" with no slashes test' '
> + echo "a**f foo=bar" >.gitattributes &&
> + git check-attr foo -- "f" >actual &&
> + cat <<\EOF >expect &&
> +f: foo: unspecified
> +af: foo: bar
> +axf: foo: bar
> +a/f: foo: unspecified
> +a/b/f: foo: unspecified
> +a/b/c/f: foo: unspecified
> +EOF
> + git check-attr foo -- "f" >actual 2>err &&
> + git check-attr foo -- "af" >>actual 2>err &&
> + git check-attr foo -- "axf" >>actual 2>err &&
> + git check-attr foo -- "a/f" >>actual 2>>err &&
> + git check-attr foo -- "a/b/f" >>actual 2>>err &&
> + git check-attr foo -- "a/b/c/f" >>actual 2>>err &&
> + test_cmp expect actual &&
> + test_line_count = 0 err
> +'
> +
> test_done
> diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
> index c8fe978..278315d 100755
> --- a/t/t3001-ls-files-others-exclude.sh
> +++ b/t/t3001-ls-files-others-exclude.sh
> @@ -214,4 +214,23 @@ test_expect_success 'subdirectory ignore (l1)' '
> test_cmp expect actual
> '
>
> +
> +test_expect_success 'ls-files with "**" patterns' '
> + cat <<\EOF >expect &&
> +a.1
> +one/a.1
> +one/two/a.1
> +three/a.1
> +EOF
> + git ls-files -o -i --exclude "**/a.1" >actual
> + test_cmp expect actual
> +'
> +
> +
> +test_expect_success 'ls-files with "**" patterns and no slashes' '
> + : >expect &&
> + git ls-files -o -i --exclude "one**a.1" >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
>
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
* Re: [PATCH 8/8] Support "**" wildcard in .gitignore and .gitattributes
2012-10-09 7:57 ` Michael Haggerty
@ 2012-10-10 5:40 ` Nguyen Thai Ngoc Duy
0 siblings, 0 replies; 15+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-10-10 5:40 UTC (permalink / raw)
To: Michael Haggerty; +Cc: git
On Tue, Oct 9, 2012 at 2:57 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> + - A trailing "/**" matches everything inside. For example,
>> + "abc/**" is equivalent to "`/abc/`".
>
> It seems odd that you add a leading slash in this example. I assume
> that is because of the rule that a pattern containing a slash is
> considered anchored at the current directory. But I find it confusing
> because the addition of the leading slash is not part of the rule you
> are trying to illustrate here, and is therefore a distraction. I
> suggest that you write either
>
> - A trailing "/**" matches everything inside. For example,
> "/abc/**" is equivalent to "`/abc/`".
>
> or
>
> - A trailing "/**" matches everything inside. For example,
> "abc/**" is equivalent to "`abc/`" (which is also equivalent
> to "`/abc/`").
The tricky thing in .gitignore is that a trailing '/' alone does not
anchor the pattern. So "abc/" means match _directory_ abc anywhere in
the worktree, and the former is probably better. I should also add a
note here (or in gitattributes.txt) about the difference between
"/abc/*" and "/abc/**". The former assigns attributes to all files
directly under abc (i.e. depth 1), the latter to files at any depth.
>> + - A slash followed by two consecutive asterisks then a slash
>> + matches zero or more directories. For example, "`a/**/b`"
>> + matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
>> +
>> + - Consecutive asterisks otherwise are treated like normal
>> + asterisk wildcards.
>> +
>
> I don't like the last rule. (1) This construct is superfluous; why
> wouldn't the user just use a single asterisk? (2) Allowing this
> construct means that it could appear in .gitignore files, creating
> unnecessary confusion: extrapolating from the other meanings of "**"
> users would expect that it would somehow match slashes. (3) It is
> conceivable (though admittedly unlikely) that we might want to assign a
> distinct meaning to this construct in the future, and accepting it now
> as a different way to spell "*" would prevent such a change.
>
> Perhaps this rule was intended for backwards compatibility?
We break backwards compatibility already. Existing "**/" or "/**"
patterns now behave differently.
> I think it would be preferable to say that other uses of consecutive
> asterisks are undefined, and probably make them trigger a warning.
Instead of undefined, we can reject the pattern as "broken". I have to
check how fnmatch/wildmatch deals with broken patterns (it must do so
somehow). If it returns a different code for broken patterns, then we
can warn users, and that would not be limited to "**" breakage.
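
A check along those lines could be a simple scan of the pattern. The sketch below is hypothetical (has_stray_starstar is not an existing git function) and assumes the strict reading, where "**" must have a slash or a pattern boundary on both sides. It does not parse bracket expressions, so asterisks inside "[...]" would be flagged too.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the pattern contains a run of two or more '*' that is
 * not delimited by '/' (or a pattern boundary) on both sides.
 */
static int has_stray_starstar(const char *pattern)
{
    size_t i = 0;

    assert(pattern);
    while (pattern[i]) {
        size_t start = i;

        if (pattern[i] == '\\' && pattern[i + 1]) {
            i += 2;     /* skip an escaped character */
            continue;
        }
        if (pattern[i] != '*') {
            i++;
            continue;
        }
        while (pattern[i] == '*')
            i++;
        if (i - start >= 2 &&
            !((start == 0 || pattern[start - 1] == '/') &&
              (pattern[i] == '\0' || pattern[i] == '/')))
            return 1;
    }
    return 0;
}
```

A caller could warn on, or reject, any pattern for which this returns 1, such as "a**f" or "one**a.1", while "**/foo", "abc/**" and "a/**/b" pass.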
--
Duy