git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* RE: Bug - Status - Space in Filename
@ 2017-11-08 17:27 Joseph Strauss
  2017-11-09  3:26 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Strauss @ 2017-11-08 17:27 UTC (permalink / raw)
  To: git@vger.kernel.org

I believe I have found a bug in the way git status -s lists filenames.

According to the documentation:
  The fields (including the ->) are separated from each other by a single space. If a filename contains whitespace or other nonprintable characters,   that field will be quoted in the manner of a C string literal: surrounded by ASCII double quote (34) characters, and with interior special characters backslash-escaped.

While this is true in most situations, it does not seem to apply to merge conflicts. When a file has merge conflicts I am getting the following:
 $ git status -s
 UU some/path/with space/in/the/name
 M  "another/path/with space/in/the/name "

I found the same problem for the following versions:
. git version 2.15.0.windows.1
. git version 2.10.0



Joseph Kalman Strauss
Lifecycle Management Engineer
B&H Photo
212-239-7500 x2212
josephst@bhphoto.com


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug - Status - Space in Filename
  2017-11-08 17:27 Bug - Status - Space in Filename Joseph Strauss
@ 2017-11-09  3:26 ` Junio C Hamano
  2017-11-09 13:29   ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2017-11-09  3:26 UTC (permalink / raw)
  To: Joseph Strauss; +Cc: git@vger.kernel.org

Joseph Strauss <josephst@bhphoto.com> writes:

> I believe I have found a bug in the way git status -s lists filenames.
>
> According to the documentation:
>   The fields (including the ->) are separated from each other by a single space. If a filename contains whitespace or other nonprintable characters,   that field will be quoted in the manner of a C string literal: surrounded by ASCII double quote (34) characters, and with interior special characters backslash-escaped.
>
> While this is true in most situations, it does not seem to apply to merge conflicts. When a file has merge conflicts I am getting the following:
>  $ git status -s
>  UU some/path/with space/in/the/name
>  M  "another/path/with space/in/the/name "
>
> I found the same problem for the following versions:
> . git version 2.15.0.windows.1
> . git version 2.10.0

Thanks.

wt_shortstatus_status() has this code:

	if (s->null_termination) {
		fprintf(stdout, "%s%c", it->string, 0);
		if (d->head_path)
			fprintf(stdout, "%s%c", d->head_path, 0);
	} else {
		struct strbuf onebuf = STRBUF_INIT;
		const char *one;
		if (d->head_path) {
			one = quote_path(d->head_path, s->prefix, &onebuf);
			if (*one != '"' && strchr(one, ' ') != NULL) {
				putchar('"');
				strbuf_addch(&onebuf, '"');
				one = onebuf.buf;
			}
			printf("%s -> ", one);
			strbuf_release(&onebuf);
		}
		one = quote_path(it->string, s->prefix, &onebuf);
		if (*one != '"' && strchr(one, ' ') != NULL) {
			putchar('"');
			strbuf_addch(&onebuf, '"');
			one = onebuf.buf;
		}
		printf("%s\n", one);
		strbuf_release(&onebuf);
	}

But the corresponding part in wt_shortstatus_unmerged() reads like
this:

	if (s->null_termination) {
		fprintf(stdout, " %s%c", it->string, 0);
	} else {
		struct strbuf onebuf = STRBUF_INIT;
		const char *one;
		one = quote_path(it->string, s->prefix, &onebuf);
		printf(" %s\n", one);
		strbuf_release(&onebuf);
	}

The difference in d->head_path part is dealing about renames, which
is irrelevant for showing unmerged paths, but the key difference is
that the _unmerged() forgets to add the enclosing "" around the
result of quote_path().

It seems that wt_shortstatus_other() shares the same issue.  Perhaps
this will "fix" it?

Having said all that, the whole "quote path using c-style" was
designed very deliberately to treat SP (but not other kind of
whitespaces like HT) as something we do *not* have to quote (and
more importantly, do not *want* to quote, as pathnames with SP in it
are quite common for those who are used to "My Documents/" etc.), so
it is arguably that what is broken and incorrect is the extra
quoting with dq done in wt_shortstatus_status().  It of course is
too late to change the rules this late in the game, though.

Perhaps a better approach is to refactor the extra quoting I moved
to this emit_with_optional_dq() helper into quote_path() which
currently is just aliased to quote_path_relative().  It also is
possible that we may want to extend the "no_dq" parameter that is
given to quote_c_style() helper from a boolean to a set of flag
bits, and allow callers to request "I want SP added to the set of
bytes that triggers the quoting".


 wt-status.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index bedef256ce..472d53d596 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1671,6 +1671,20 @@ static void wt_longstatus_print(struct wt_status *s)
 		wt_longstatus_print_stash_summary(s);
 }
 
+static const char *emit_with_optional_dq(const char *in, const char *prefix, 
+					  struct strbuf *out)
+{
+	const char *one;
+
+	one = quote_path(in, prefix, out);
+	if (*one != '"' && strchr(one, ' ') != NULL) {
+		putchar('"');
+		strbuf_addch(out, '"');
+		one = out->buf;
+	}
+	return one;
+}
+
 static void wt_shortstatus_unmerged(struct string_list_item *it,
 			   struct wt_status *s)
 {
@@ -1692,8 +1706,9 @@ static void wt_shortstatus_unmerged(struct string_list_item *it,
 	} else {
 		struct strbuf onebuf = STRBUF_INIT;
 		const char *one;
-		one = quote_path(it->string, s->prefix, &onebuf);
-		printf(" %s\n", one);
+		putchar(' ');
+		one = emit_with_optional_dq(it->string, s->prefix, &onebuf);
+		printf("%s\n", one);
 		strbuf_release(&onebuf);
 	}
 }
@@ -1720,21 +1735,11 @@ static void wt_shortstatus_status(struct string_list_item *it,
 		struct strbuf onebuf = STRBUF_INIT;
 		const char *one;
 		if (d->head_path) {
-			one = quote_path(d->head_path, s->prefix, &onebuf);
-			if (*one != '"' && strchr(one, ' ') != NULL) {
-				putchar('"');
-				strbuf_addch(&onebuf, '"');
-				one = onebuf.buf;
-			}
+			one = emit_with_optional_dq(d->head_path, s->prefix, &onebuf);
 			printf("%s -> ", one);
 			strbuf_release(&onebuf);
 		}
-		one = quote_path(it->string, s->prefix, &onebuf);
-		if (*one != '"' && strchr(one, ' ') != NULL) {
-			putchar('"');
-			strbuf_addch(&onebuf, '"');
-			one = onebuf.buf;
-		}
+		one = emit_with_optional_dq(it->string, s->prefix, &onebuf);
 		printf("%s\n", one);
 		strbuf_release(&onebuf);
 	}
@@ -1748,9 +1753,10 @@ static void wt_shortstatus_other(struct string_list_item *it,
 	} else {
 		struct strbuf onebuf = STRBUF_INIT;
 		const char *one;
-		one = quote_path(it->string, s->prefix, &onebuf);
 		color_fprintf(s->fp, color(WT_STATUS_UNTRACKED, s), "%s", sign);
-		printf(" %s\n", one);
+		putchar(' ');
+		one = emit_with_optional_dq(it->string, s->prefix, &onebuf);
+		printf("%s\n", one);
 		strbuf_release(&onebuf);
 	}
 }

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: Bug - Status - Space in Filename
  2017-11-09  3:26 ` Junio C Hamano
@ 2017-11-09 13:29   ` Jeff King
  2017-11-10  0:52     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2017-11-09 13:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joseph Strauss, git@vger.kernel.org

On Thu, Nov 09, 2017 at 12:26:20PM +0900, Junio C Hamano wrote:

> The difference in d->head_path part is dealing about renames, which
> is irrelevant for showing unmerged paths, but the key difference is
> that the _unmerged() forgets to add the enclosing "" around the
> result of quote_path().
> 
> It seems that wt_shortstatus_other() shares the same issue.  Perhaps
> this will "fix" it?
> 
> Having said all that, the whole "quote path using c-style" was
> designed very deliberately to treat SP (but not other kind of
> whitespaces like HT) as something we do *not* have to quote (and
> more importantly, do not *want* to quote, as pathnames with SP in it
> are quite common for those who are used to "My Documents/" etc.), so
> it is arguably that what is broken and incorrect is the extra
> quoting with dq done in wt_shortstatus_status().  It of course is
> too late to change the rules this late in the game, though.

Yeah, I think the original sin is using " -> " in the --porcelain output
(which just got it from --short). That should have been a HT all along
to match the rest of the diff code. The --porcelain=v2 format gets this
right.

> Perhaps a better approach is to refactor the extra quoting I moved
> to this emit_with_optional_dq() helper into quote_path() which
> currently is just aliased to quote_path_relative().  It also is
> possible that we may want to extend the "no_dq" parameter that is
> given to quote_c_style() helper from a boolean to a set of flag
> bits, and allow callers to request "I want SP added to the set of
> bytes that triggers the quoting".

Yeah, I had the same thought upon digging into the code.

That said, if this is the only place that has this funny quoting, it may
not be worth polluting the rest of the code with the idea that quoting
spaces is a good thing to do.

> diff --git a/wt-status.c b/wt-status.c
> index bedef256ce..472d53d596 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -1671,6 +1671,20 @@ static void wt_longstatus_print(struct wt_status *s)
>  		wt_longstatus_print_stash_summary(s);
>  }
>  
> +static const char *emit_with_optional_dq(const char *in, const char *prefix, 
> +					  struct strbuf *out)
> +{
> +	const char *one;
> +
> +	one = quote_path(in, prefix, out);
> +	if (*one != '"' && strchr(one, ' ') != NULL) {
> +		putchar('"');
> +		strbuf_addch(out, '"');
> +		one = out->buf;
> +	}
> +	return one;
> +}

This puts part of its output to stdout, and the other part into a
strbuf. That works for these callers, but maybe just emitting the whole
thing to stdout would be less confusing (and I suspect would clean up
the callers a bit, who would not have to worry about the strbuf at all).

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug - Status - Space in Filename
  2017-11-09 13:29   ` Jeff King
@ 2017-11-10  0:52     ` Junio C Hamano
  2017-11-10  9:58       ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2017-11-10  0:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Joseph Strauss, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> Yeah, I think the original sin is using " -> " in the --porcelain output
> (which just got it from --short). That should have been a HT all along
> to match the rest of the diff code. The --porcelain=v2 format gets this
> right.

So we at lesat did something right, which is a consolation.

>> Perhaps a better approach is to refactor the extra quoting I moved
>> to this emit_with_optional_dq() helper into quote_path() which
>> currently is just aliased to quote_path_relative().  It also is
>> possible that we may want to extend the "no_dq" parameter that is
>> given to quote_c_style() helper from a boolean to a set of flag
>> bits, and allow callers to request "I want SP added to the set of
>> bytes that triggers the quoting".
>
> Yeah, I had the same thought upon digging into the code.
>
> That said, if this is the only place that has this funny quoting, it may
> not be worth polluting the rest of the code with the idea that quoting
> spaces is a good thing to do.

Sounds sane.  We can probably use a helper like this:

static char *quote_path_with_sp(const char *in, const char *prefix, struct strbuf *out)
{
	const char dq = '"';

	quote_path(in, prefix, out);
	if (out->buf[0] != dq && strchr(out->buf, ' ') != NULL) {
		strbuf_insert(out, 0, &dq, 1);
		strbuf_addch(out, dq);
	}
	return out->buf;
}

which allows the current users like shortstatus_status() to become a
lot shorter.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug - Status - Space in Filename
  2017-11-10  0:52     ` Junio C Hamano
@ 2017-11-10  9:58       ` Jeff King
  2017-11-10 18:13         ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2017-11-10  9:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joseph Strauss, git@vger.kernel.org

On Fri, Nov 10, 2017 at 09:52:16AM +0900, Junio C Hamano wrote:

> > That said, if this is the only place that has this funny quoting, it may
> > not be worth polluting the rest of the code with the idea that quoting
> > spaces is a good thing to do.
> 
> Sounds sane.  We can probably use a helper like this:
> 
> static char *quote_path_with_sp(const char *in, const char *prefix, struct strbuf *out)
> {
> 	const char dq = '"';
> 
> 	quote_path(in, prefix, out);
> 	if (out->buf[0] != dq && strchr(out->buf, ' ') != NULL) {
> 		strbuf_insert(out, 0, &dq, 1);
> 		strbuf_addch(out, dq);
> 	}
> 	return out->buf;
> }
> 
> which allows the current users like shortstatus_status() to become a
> lot shorter.

Are there callers who don't just print the result? If not, we could just
always emit. That's slightly more efficient since it drops the expensive
strbuf_insert (though there are already so many copies going on in
quote_path_relative that it hardly matters). But it also drops the need
for the caller to know about the strbuf at all.

Like:
diff --git a/wt-status.c b/wt-status.c
index 937a87bbd5..4f4706a6e2 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1703,6 +1703,18 @@ static void wt_shortstatus_unmerged(struct string_list_item *it,
 	}
 }
 
+static void emit_path(const char *in, const char *prefix)
+{
+	struct strbuf buf = STRBUF_INIT;
+	quote_path(in, prefix, &buf);
+	if (buf.buf[0] != '"' && strchr(buf.buf, ' ') != NULL) {
+		putchar('"');
+		strbuf_addch(&buf, '"');
+	}
+	fwrite(buf.buf, 1, buf.len, stdout);
+	strbuf_release(&buf);
+}
+
 static void wt_shortstatus_status(struct string_list_item *it,
 			 struct wt_status *s)
 {
@@ -1722,26 +1734,12 @@ static void wt_shortstatus_status(struct string_list_item *it,
 		if (d->head_path)
 			fprintf(stdout, "%s%c", d->head_path, 0);
 	} else {
-		struct strbuf onebuf = STRBUF_INIT;
-		const char *one;
 		if (d->head_path) {
-			one = quote_path(d->head_path, s->prefix, &onebuf);
-			if (*one != '"' && strchr(one, ' ') != NULL) {
-				putchar('"');
-				strbuf_addch(&onebuf, '"');
-				one = onebuf.buf;
-			}
-			printf("%s -> ", one);
-			strbuf_release(&onebuf);
+			emit_path(d->head_path, s->prefix);
+			printf(" -> ");
 		}
-		one = quote_path(it->string, s->prefix, &onebuf);
-		if (*one != '"' && strchr(one, ' ') != NULL) {
-			putchar('"');
-			strbuf_addch(&onebuf, '"');
-			one = onebuf.buf;
-		}
-		printf("%s\n", one);
-		strbuf_release(&onebuf);
+		emit_path(it->string, s->prefix);
+		putchar('\n');
 	}
 }
 

Though really I am fine with any solution that puts this pattern into a
helper function rather than repeating it inline.

-Peff

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: Bug - Status - Space in Filename
  2017-11-10  9:58       ` Jeff King
@ 2017-11-10 18:13         ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2017-11-10 18:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Joseph Strauss, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> Are there callers who don't just print the result? If not, we could just
> always emit. That's slightly more efficient since it drops the expensive
> strbuf_insert (though there are already so many copies going on in
> quote_path_relative that it hardly matters). But it also drops the need
> for the caller to know about the strbuf at all.

I am fine by that, too.  Consider that I'm still suffering from the
trauma caused by some patches (not in this area but others) that
changed output we have long been directly emiting to stdio to
instead capture to strings ;-)

This might also be a good bite-sized material for the weekend thing
;-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-11-10 18:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-08 17:27 Bug - Status - Space in Filename Joseph Strauss
2017-11-09  3:26 ` Junio C Hamano
2017-11-09 13:29   ` Jeff King
2017-11-10  0:52     ` Junio C Hamano
2017-11-10  9:58       ` Jeff King
2017-11-10 18:13         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).