git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH/RFC] attr: allow pattern escape using backslash
@ 2012-10-06  7:22 Nguyễn Thái Ngọc Duy
  2012-10-06  7:54 ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-06  7:22 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

.gitattributes pattern syntax is supposed to be the same as .gitignore
(except a few things that do not make sense in attr context, but
that's a different issue). .gitignore uses fnmatch() as the matching
machinery and "\" is accepted as an escape code. In theory the pattern
'foo\ bar' should match path 'foo bar' in .gitignore. Granted, no one
would write 'foo\ bar' in .gitignore when 'foo bar' should
suffice.

Regardless, 'foo\ bar attr' does not (but should) attach "attr" to
path "foo bar" because pattern/attr parse code does not understand
backslash escape. It parses the line as path 'foo\' and attributes
'bar' and 'attr'. This patch teaches attr code to recognize the
backslash in patterns (not macro names) and pass 'foo\ bar' down to
fnmatch().

This changes the attr behavior. "foo\ attr", if exists in the field,
would match nothing because path "foo\" is invalid in UNIX and is a
directory in Windows, which we do not accept attaching attributes
to. With this patch, that line becomes invalid and is rejected. So
it's not really bad (i.e. no silent changes in behavior).

Other subtle behavioral changes may happen. Still, I think we should
do this as it seems like a correct thing to do.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 We discussed the "spaces in path names in attr" before and I remember
 using quotes or double quotes, even substituting spaces with
 dots, were raised. I don't remember if backslashes were mentioned.

 attr.c                | 12 +++++++++++-
 t/t0003-attributes.sh |  5 +++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 887a9ae..173d51d 100644
--- a/attr.c
+++ b/attr.c
@@ -221,8 +221,18 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			return NULL;
 		}
 	}
-	else
+	else {
 		is_macro = 0;
+		namelen = 0;
+		while (name[namelen] != '\0' &&
+		       !strchr(blank, name[namelen])) {
+			if (name[namelen] == '\\' &&
+			    name[namelen + 1] != '\0')
+				namelen += 2;
+			else
+				namelen++;
+		}
+	}
 
 	states = name + namelen;
 	states += strspn(states, blank);
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index febc45c..16b419e 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -24,6 +24,7 @@ test_expect_success 'setup' '
 		echo "offon -test test"
 		echo "no notest"
 		echo "A/e/F test=A/e/F"
+		echo "A\\ b test=space"
 	) >.gitattributes &&
 	(
 		echo "g test=a/g" &&
@@ -196,6 +197,10 @@ test_expect_success 'root subdir attribute test' '
 	attr_check subdir/a/i unspecified
 '
 
+test_expect_success 'quoting in pattern' '
+	attr_check "A b" space
+'
+
 test_expect_success 'setup bare' '
 	git clone --bare . bare.git &&
 	cd bare.git
-- 
1.7.12.1.406.g6ab07c4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH/RFC] attr: allow pattern escape using backslash
  2012-10-06  7:22 [PATCH/RFC] attr: allow pattern escape using backslash Nguyễn Thái Ngọc Duy
@ 2012-10-06  7:54 ` Junio C Hamano
  2012-10-06  8:02   ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2012-10-06  7:54 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> .gitattributes pattern syntax is supposed to be the same as .gitignore
> (except a few things that do not make sense in attr context, but
> that's a different issue). .gitignore uses fnmatch() as the matching
> machinery and "\" is accepted as an escape code. In theory the pattern
> 'foo\ bar' should match path 'foo bar' in .gitignore. Granted, no one
> would write 'foo\ bar' in .gitignore when 'foo bar' should
> suffice.
>
> Regardless, 'foo\ bar attr' does not (but should) attach "attr" to
> path "foo bar" because pattern/attr parse code does not understand
> backslash escape. It parses the line as path 'foo\' and attributes
> 'bar' and 'attr'. This patch teaches attr code to recognize the
> backslash in patterns (not macro names) and pass 'foo\ bar' down to
> fnmatch().
>
> This changes the attr behavior. "foo\ attr", if exists in the field,
> would match nothing because path "foo\" is invalid in UNIX and is a
> directory in Windows, which we do not accept attaching attributes
> to. With this patch, that line becomes invalid and is rejected. So
> it's not really bad (i.e. no silent changes in behavior).
>
> Other subtle behavioral changes may happen. Still, I think we should
> do this as it seems like a correct thing to do.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  We discussed the "spaces in path names in attr" before and I remember
>  using quotes or double quotes, even substituting spaces with
>  dots, were raised. I don't remember if backslashes were mentioned.

My knee-jerk reaction, without thinking things through, is that this
makes sense and doing the same for .gitignore would, too (even
though the one-item-per-line nature of .gitignore makes "\ " and " "
practically equivalent).

Shouldn't we do the same for quoting fnmatch(3) metacharacters?  To
match a path component 'a' followed by an asterisk followed by 'b',
you could then write 'a\*b'.  Same for quoting the backslash itself.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH/RFC] attr: allow pattern escape using backslash
  2012-10-06  7:54 ` Junio C Hamano
@ 2012-10-06  8:02   ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 4+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-10-06  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Oct 6, 2012 at 2:54 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Shouldn't we do the same for quoting fnmatch(3) metacharacters?  To
> match a path component 'a' followed by an asterisk followed by 'b',
> you could then write 'a\*b'.  Same for quoting the backslash itself.

I think my patch does that too. The thing it does not do is optimize
this case so that we can do strcmp() instead of fnmatch() if the
entire pattern does not contain any metacharacters but backslashes. I
don't think it'll become popular enough to deserve an optimization.
-- 
Duy

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH/RFC] attr: allow pattern escape using backslash
@ 2012-11-11 10:32 Nguyễn Thái Ngọc Duy
  0 siblings, 0 replies; 4+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-11-11 10:32 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy, Jeff King, Junio C Hamano

.gitattributes pattern syntax is supposed to be the same as .gitignore
(except a few things that do not make sense in attr context, but
that's a different issue). .gitignore uses fnmatch() as the matching
machinery and "\" is accepted as an escape code. In theory the pattern
'foo\ bar' should match path 'foo bar' in .gitignore. Granted, no one
would write 'foo\ bar' in .gitignore when 'foo bar' should
suffice.

Regardless, 'foo\ bar attr' does not (but should) attach "attr" to
path "foo bar" because pattern/attr parse code does not understand
backslash escape. It parses the line as path 'foo\' and attributes
'bar' and 'attr'. This patch teaches attr code to recognize the
backslash in patterns (not macro names) and pass 'foo\ bar' down to
fnmatch().

This changes the attr behavior. "foo\ attr", if exists in the field,
would match nothing because path "foo\" is invalid in UNIX and is a
directory in Windows, which we do not accept attaching attributes
to. With this patch, that line becomes invalid and is rejected. So
it's not really bad (i.e. no silent changes in behavior).

Other subtle behavioral changes may happen. Still, I think we should
do this as it seems like a correct thing to do.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 It was posted [1] during rc cycles (I think) and initial feedback
 from Junio was "not utterly wrong". I still think this is worth
 doing, hence this resend for discussion.

 [1] http://thread.gmane.org/gmane.comp.version-control.git/207135

 attr.c                | 12 +++++++++++-
 t/t0003-attributes.sh |  5 +++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 887a9ae..173d51d 100644
--- a/attr.c
+++ b/attr.c
@@ -221,8 +221,18 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			return NULL;
 		}
 	}
-	else
+	else {
 		is_macro = 0;
+		namelen = 0;
+		while (name[namelen] != '\0' &&
+		       !strchr(blank, name[namelen])) {
+			if (name[namelen] == '\\' &&
+			    name[namelen + 1] != '\0')
+				namelen += 2;
+			else
+				namelen++;
+		}
+	}
 
 	states = name + namelen;
 	states += strspn(states, blank);
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index febc45c..16b419e 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -24,6 +24,7 @@ test_expect_success 'setup' '
 		echo "offon -test test"
 		echo "no notest"
 		echo "A/e/F test=A/e/F"
+		echo "A\\ b test=space"
 	) >.gitattributes &&
 	(
 		echo "g test=a/g" &&
@@ -196,6 +197,10 @@ test_expect_success 'root subdir attribute test' '
 	attr_check subdir/a/i unspecified
 '
 
+test_expect_success 'quoting in pattern' '
+	attr_check "A b" space
+'
+
 test_expect_success 'setup bare' '
 	git clone --bare . bare.git &&
 	cd bare.git
-- 
1.7.12.1.406.g6ab07c4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-11-11 10:32 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-06  7:22 [PATCH/RFC] attr: allow pattern escape using backslash Nguyễn Thái Ngọc Duy
2012-10-06  7:54 ` Junio C Hamano
2012-10-06  8:02   ` Nguyen Thai Ngoc Duy
  -- strict thread matches above, loose matches on Subject: below --
2012-11-11 10:32 Nguyễn Thái Ngọc Duy

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).