From: Matheus Tavares <matheus.bernardino@usp.br>
To: greg@hurrell.net
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject:
Date: Tue, 14 Apr 2020 04:42:04 -0300 [thread overview]
Message-ID: <20200414074204.677139-1-matheus.bernardino@usp.br> (raw)
In-Reply-To: <CAOyLvt9=wRfpvGGJqLMi7=wLWu881pOur8c9qNEg+Xqhf8W2ww@mail.gmail.com>
On Mon, Apr 13, 2020 at 6:57 PM Greg Hurrell <greg@hurrell.net> wrote:
>
> It seems that `git-grep -lz` behaves differently depending on whether
> it is inside a subdirectory:
[...]
> $ git grep -lz content
> an "example".txt^@nested/other "example".txt^@
>
> Note that, as expected, the files are NUL-terminated and not wrapped
> in quotes. ("^@" represents NUL byte.)
>
> $ cd nested
> $ git grep -lz content
> "other \"example\".txt"^@
>
> As soon as we move into a subdirectory, files are wrapped in quotes
> and contain escapes, despite the "-z" switch.
I think this is a bug in "plain" git-grep, not in -z specifically. In some git
commands, -z has the effect of unquoting the output paths. For example, the
docs for git-ls-files says: "-z: \0 line termination on output and do not quote
filenames." In git-grep, however, the -z doc entry only says: "Output \0
instead of the character that normally follows a file name."
So -z does not seem to affect quoting in git-grep. But should we change this, to
quote unusual pathnames (relative or not) by default, and let -z fall back to
the old behavior? This would be more consistent with other commands such as
git-ls-files and to the core.quotePath setting.
However, perhaps output paths are never intended to be displayed quoted in
git-grep (with or without -z) in order to be consistent with GNU grep. I don't
know which of these options is correct (if any), so I would love to hear what
others have to say about it.
But if the second is correct, I think we can use the following patch to solve
the reported bug:
(I'm just wondering: should we also add the information at core.quotePath that
git-grep does not comply with this setting, to be consistent with GNU grep {or
for any other reason}?)
-- >8 --
Subject: [RFC PATCH] grep: fix inconsistent output of unusual pathnames
When grepping from the repository root, paths that contain unusual
characters are not output quoted. However, they are quoted when grepping
from a subdir without --full-name. For example:
$ echo content >'an "unusual" name.txt'
$ mkdir subdir
$ echo content >'subdir/another "unusual" name.txt'
$ git add .
$ git commit -m content
Then:
$ git grep content
an "unusual" name.txt:1:content
subdir/another "unusual" name.txt:1:content
But:
$ cd subdir
$ git grep content
"another \"unusual\" name.txt":1:content
Fix this inconsistency by not quoting unusual pathnames (relative or
not), which is also the default behavior in GNU grep. Also add some
tests to prevent regressions.
Note: some non-related tests that compare git-grep output against
git-ls-files also had to be modified. This is because the testing repo
now contains some unusual pathnames. And git-ls-files will quote such
paths by default, whereas git-grep won't.
Reported-by: Greg Hurrell <greg@hurrell.net>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
builtin/grep.c | 15 ++++++++++-----
t/t7810-grep.sh | 33 ++++++++++++++++++++++++++++++---
2 files changed, 40 insertions(+), 8 deletions(-)
diff --git a/builtin/grep.c b/builtin/grep.c
index 99e2685090..ca21f98315 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -303,8 +303,12 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
struct grep_source gs;
if (opt->relative && opt->prefix_length) {
- quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
- strbuf_insert(&pathbuf, 0, filename, tree_name_len);
+ struct strbuf rel_buf = STRBUF_INIT;
+ const char *rel_name = relative_path(filename + tree_name_len,
+ opt->prefix, &rel_buf);
+ strbuf_add(&pathbuf, filename, tree_name_len);
+ strbuf_addstr(&pathbuf, rel_name);
+ strbuf_release(&rel_buf);
} else {
strbuf_addstr(&pathbuf, filename);
}
@@ -333,13 +337,14 @@ static int grep_file(struct grep_opt *opt, const char *filename)
{
struct strbuf buf = STRBUF_INIT;
struct grep_source gs;
+ const char *gs_name;
if (opt->relative && opt->prefix_length)
- quote_path_relative(filename, opt->prefix, &buf);
+ gs_name = relative_path(filename, opt->prefix, &buf);
else
- strbuf_addstr(&buf, filename);
+ gs_name = filename;
- grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename);
+ grep_source_init(&gs, GREP_SOURCE_FILE, gs_name, filename, filename);
strbuf_release(&buf);
if (num_threads > 1) {
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 7d7b396c23..23911a3574 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -72,6 +72,8 @@ test_expect_success setup '
# Still a no-op.
function dummy() {}
EOF
+ echo unusual >"\"unusual\" pathname" &&
+ echo unusual >"t/\"unusual\" pathname2" &&
git add . &&
test_tick &&
git commit -m initial
@@ -481,6 +483,26 @@ do
git grep --count -h -e b $H -- ab >actual &&
test_cmp expected actual
'
+
+ test_expect_success "grep $L should not quote unusual pathnames" '
+ cat >expected <<-EOF &&
+ ${HC}"unusual" pathname:unusual
+ ${HC}t/"unusual" pathname2:unusual
+ EOF
+ git grep unusual $H >actual &&
+ test_cmp expected actual
+ '
+
+ test_expect_success "grep $L should not quote unusual relative pathnames" '
+ cat >expected <<-EOF &&
+ ${HC}"unusual" pathname2:unusual
+ EOF
+ (
+ cd t &&
+ git grep unusual $H
+ ) >actual &&
+ test_cmp expected actual
+ '
done
cat >expected <<EOF
@@ -500,7 +522,8 @@ test_expect_success 'grep -c -C' '
'
test_expect_success 'grep -L -C' '
- git ls-files >expected &&
+ git ls-files -z >ls-files-z &&
+ tr "\0" "\n" <ls-files-z >expected &&
git grep -L -C1 nonexistent_string >actual &&
test_cmp expected actual
'
@@ -1686,7 +1709,9 @@ test_expect_success 'grep does not report i-t-a with -L --cached' '
echo "intend to add this" >intend-to-add &&
git add -N intend-to-add &&
test_when_finished "git rm -f intend-to-add" &&
- git ls-files | grep -v "^intend-to-add\$" >expected &&
+ git ls-files -z >ls-files-z &&
+ tr "\0" "\n" <ls-files-z >files &&
+ grep -v "^intend-to-add\$" files >expected &&
git grep -L --cached "nonexistent_string" >actual &&
test_cmp expected actual
'
@@ -1696,7 +1721,9 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
git add -N intend-to-add-assume-unchanged &&
test_when_finished "git rm -f intend-to-add-assume-unchanged" &&
git update-index --assume-unchanged intend-to-add-assume-unchanged &&
- git ls-files | grep -v "^intend-to-add-assume-unchanged\$" >expected &&
+ git ls-files -z >ls-files-z &&
+ tr "\0" "\n" <ls-files-z >files &&
+ grep -v "^intend-to-add-assume-unchanged\$" files >expected &&
git grep -L "nonexistent_string" >actual &&
test_cmp expected actual
'
--
2.26.0
next prev parent reply other threads:[~2020-04-14 7:42 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-13 21:55 git-grep's "-z" option misbehaves in subdirectory Greg Hurrell
2020-04-13 23:33 ` Junio C Hamano
2020-04-14 7:42 ` Matheus Tavares [this message]
2020-04-16 18:59 ` Matheus Tavares Bernardino
2020-04-16 20:07 ` Junio C Hamano
2020-04-17 6:04 ` [PATCH] grep: follow conventions for printing paths w/ unusual chars Matheus Tavares
2020-04-17 6:45 ` Junio C Hamano
2020-04-17 21:19 ` Matheus Tavares Bernardino
2020-04-17 21:35 ` Junio C Hamano
2020-04-18 13:13 ` Johannes Schindelin
2020-04-18 14:56 ` Johannes Schindelin
2020-04-19 6:27 ` Matheus Tavares Bernardino
2020-04-19 6:33 ` [PATCH v2] " Matheus Tavares
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200414074204.677139-1-matheus.bernardino@usp.br \
--to=matheus.bernardino@usp.br \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=greg@hurrell.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).