git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Michael J Gruber <git@grubix.eu>,
	Matthieu Moy <git@matthieu-moy.fr>,
	John Keeping <john@keeping.me.uk>,
	Karthik Nayak <karthik.188@gmail.com>, Jeff King <peff@peff.net>,
	Alex Henrie <alexhenrie24@gmail.com>,
	Philippe Blain <levraiphilippeblain@gmail.com>,
	Philippe Blain <levraiphilippeblain@gmail.com>
Subject: [PATCH v2 2/3] ref-filter: fix the API to correctly handle CRLF
Date: Tue, 10 Mar 2020 02:24:52 +0000	[thread overview]
Message-ID: <aab1f45ba976d088a8c68573a21ed2458915d6a6.1583807093.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.576.v2.git.1583807093.gitgitgadget@gmail.com>

From: Philippe Blain <levraiphilippeblain@gmail.com>

The ref-filter API does not correctly handle commit or tag messages that
use CRLF as the line terminator. Such messages can be created with the
`--verbatim` option of `git commit` and `git tag`, or by using `git
commit-tree` directly.

This impacts the output `git branch`, `git tag` and `git for-each-ref`
when used with a `--format` argument containing the atoms
`%(contents:subject)` or `%(contents:body)`, as well as the output of
`git branch --verbose`, which uses `%(contents:subject)` internally.

The function find_subpos in ref-filter.c looks for two consecutive '\n'
to find the end of the subject line, a sequence which is absent in
messages using CRLF. This results in the whole message being parsed as
the subject line (`%(contents:subject)`), and the body of the message
(`%(contents:body)`)  being empty.

Moreover, in copy_subject, '\n' is replaced by space, but '\r' is
untouched, resulting in the escape sequence '^M' being output verbatim
in most terminal emulators:

    $ git branch --verbose
    * crlf    2113b0e Subject first line^M ^M Body first line^M Body second line

This bug is a regression for `git branch --verbose`, which
bisects down to 949af0684c (branch: use ref-filter printing APIs,
2017-01-10).

Fix this bug in ref-filter by hardening the logic in `copy_subject` and
`find_subpos` to correctly parse messages containing CRFL.

Add tests for `branch`, `tag` and `for-each-ref` using
lib-crlf-messages.sh.

The 'make commits' test at the beginning of t3203-branch-output.sh needs
to be modified since it did not use `test_tick` and thus the commit
hashes were not reproducible. For simplicity, use `test_commit` as the
content and name of the files created in this setup test are irrelevant
to the rest of the test script.

`test_cleanup_crlf_refs` is used in t3203-branch-output.sh and
t7004-tag.sh to avoid having to modify the expected output in later
tests.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
---
 ref-filter.c             | 19 +++++++++++++++++--
 t/t3203-branch-output.sh | 26 +++++++++++++++++++++-----
 t/t6300-for-each-ref.sh  |  5 +++++
 t/t7004-tag.sh           |  7 +++++++
 4 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 79bb5206783..537cc4de42c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1050,10 +1050,18 @@ static char *copy_subject(const char *buf, unsigned long len)
 {
 	char *r = xmemdupz(buf, len);
 	int i;
+	struct strbuf sb = STRBUF_INIT;
 
-	for (i = 0; i < len; i++)
+	strbuf_attach(&sb, r, len, len + 1);
+	for (i = 0; i < sb.len; i++) {
 		if (r[i] == '\n')
 			r[i] = ' ';
+		if (r[i] == '\r') {
+			strbuf_remove(&sb, i, 1);
+			i -= 1;
+		}
+	}
+	strbuf_detach(&sb, NULL);
 
 	return r;
 }
@@ -1184,15 +1192,22 @@ static void find_subpos(const char *buf,
 		eol = strchrnul(buf, '\n');
 		if (*eol)
 			eol++;
+		/*  protect against messages that might contain \r\n */
+		if (*eol == '\r')
+			eol++;
 		buf = eol;
 	}
 	*sublen = buf - *sub;
 	/* drop trailing newline, if present */
 	if (*sublen && (*sub)[*sublen - 1] == '\n')
 		*sublen -= 1;
+	/*  protect against commit messages that might contain \r\n */
+	else if (*sublen && (*sub)[*sublen - 1] == '\r')
+		*sublen -= 3; /* drop '\r\n\r' */
 
 	/* skip any empty lines */
-	while (*buf == '\n')
+	/* and protect against commit messages that might contain \r\n */
+	while (*buf == '\n' || *buf == '\r')
 		buf++;
 	*body = buf;
 	*bodylen = strlen(buf);
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index 71818b90f00..1235089619c 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -3,13 +3,11 @@
 test_description='git branch display tests'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
+. "$TEST_DIRECTORY"/lib-crlf-messages.sh
 
 test_expect_success 'make commits' '
-	echo content >file &&
-	git add file &&
-	git commit -m one &&
-	echo content >>file &&
-	git commit -a -m two
+	test_commit one &&
+	test_commit two
 '
 
 test_expect_success 'make branches' '
@@ -95,6 +93,24 @@ test_expect_success 'git branch --ignore-case --list -v pattern shows branch sum
 	awk "{print \$NF}" <tmp >actual &&
 	test_cmp expect actual
 '
+test_create_crlf_refs
+
+test_expect_success 'git branch -v works with CRLF input' '
+	cat >expect <<-EOF &&
+	  branch-one                     139b20d two
+	  branch-two                     d79ce16 one
+	  crlf                           2113b0e Subject first line
+	  crlf-empty-lines-after-subject 0a9530d Subject first line
+	  crlf-two-line-subject          f9ded1f Subject first line Subject second line
+	* master                         139b20d two
+	EOF
+	git branch -v >actual &&
+	test_cmp expect actual
+'
+
+test_crlf_subject_body_and_contents branch --list crlf*
+
+test_cleanup_crlf_refs
 
 test_expect_success 'git branch -v pattern does not show branch summaries' '
 	test_must_fail git branch -v branch*
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 9c910ce7467..467924de3df 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -8,6 +8,7 @@ test_description='for-each-ref test'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-gpg.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
+. "$TEST_DIRECTORY"/lib-crlf-messages.sh
 
 # Mon Jul 3 23:18:43 2006 +0000
 datestamp=1151968723
@@ -888,4 +889,8 @@ test_expect_success 'for-each-ref --ignore-case ignores case' '
 	test_cmp expect actual
 '
 
+test_create_crlf_refs
+
+test_crlf_subject_body_and_contents for-each-ref refs/heads/crlf*
+
 test_done
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 6db92bd3ba6..25f465f34ea 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -10,6 +10,7 @@ Tests for operations with tags.'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-gpg.sh
 . "$TEST_DIRECTORY"/lib-terminal.sh
+. "$TEST_DIRECTORY"/lib-crlf-messages.sh
 
 # creating and listing lightweight tags:
 
@@ -1969,6 +1970,12 @@ test_expect_success '--format should list tags as per format given' '
 	test_cmp expect actual
 '
 
+test_create_crlf_refs
+
+test_crlf_subject_body_and_contents tag --list tag-crlf*
+
+test_cleanup_crlf_refs
+
 test_expect_success "set up color tests" '
 	echo "<RED>v1.0<RESET>" >expect.color &&
 	echo "v1.0" >expect.bare &&
-- 
gitgitgadget


  parent reply	other threads:[~2020-03-10  2:25 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-08 18:29 [PATCH 0/3] Teach ref-filter API to correctly handle CRLF in messages Philippe Blain via GitGitGadget
2020-03-08 18:29 ` [PATCH 1/3] t: add lib-crlf-messages.sh for messages containing CRLF Philippe Blain via GitGitGadget
2020-03-08 18:29 ` [PATCH 2/3] ref-filter: teach the API to correctly handle CRLF Philippe Blain via GitGitGadget
2020-03-08 18:29 ` [PATCH 3/3] log: add tests for messages containing CRLF to t4202 Philippe Blain via GitGitGadget
2020-03-09 15:14 ` [PATCH 0/3] Teach ref-filter API to correctly handle CRLF in messages Junio C Hamano
2020-03-10  2:19   ` Philippe Blain
2020-03-10  2:24 ` [PATCH v2 " Philippe Blain via GitGitGadget
2020-03-10  2:24   ` [PATCH v2 1/3] t: add lib-crlf-messages.sh for messages containing CRLF Philippe Blain via GitGitGadget
2020-03-10  2:24   ` Philippe Blain via GitGitGadget [this message]
2020-03-10 17:50     ` [PATCH v2 2/3] ref-filter: fix the API to correctly handle CRLF Junio C Hamano
2020-03-10  2:24   ` [PATCH v2 3/3] log: add tests for messages containing CRLF to t4202 Philippe Blain via GitGitGadget
2020-03-10  3:31   ` [PATCH v2 0/3] Teach ref-filter API to correctly handle CRLF in messages Junio C Hamano
2020-03-10 17:24   ` Junio C Hamano
2020-10-12 18:09   ` [PATCH v3 0/3] ref-filter: handle CRLF at end-of-line more gracefully Philippe Blain via GitGitGadget
2020-10-12 18:09     ` [PATCH v3 1/3] t: add lib-crlf-messages.sh for messages containing CRLF Philippe Blain via GitGitGadget
2020-10-12 22:22       ` Junio C Hamano
2020-10-14 13:22         ` Philippe Blain
2020-10-12 22:47       ` Eric Sunshine
2020-10-14 13:20         ` Philippe Blain
2020-10-14 13:45           ` Eric Sunshine
2020-10-14 13:52             ` Philippe Blain
2020-10-14 23:01               ` Eric Sunshine
2020-10-22  3:09         ` Philippe Blain
2020-10-12 18:09     ` [PATCH v3 2/3] ref-filter: handle CRLF at end-of-line more gracefully Philippe Blain via GitGitGadget
2020-10-12 22:24       ` Junio C Hamano
2020-10-14 13:09         ` Philippe Blain
2020-10-12 18:09     ` [PATCH v3 3/3] log, show: add tests for messages containing CRLF Philippe Blain via GitGitGadget
2020-10-22  3:01     ` [PATCH v4 0/2] ref-filter: handle CRLF at end-of-line more gracefully Philippe Blain via GitGitGadget
2020-10-22  3:01       ` [PATCH v4 1/2] " Philippe Blain via GitGitGadget
2020-10-23  0:52         ` Junio C Hamano
2020-10-23  1:46           ` Philippe Blain
2020-10-28 20:24             ` Junio C Hamano
2020-10-29  1:29               ` Philippe Blain
2020-10-22  3:01       ` [PATCH v4 2/2] log, show: add tests for messages containing CRLF Philippe Blain via GitGitGadget
2020-10-22 19:24         ` Philippe Blain
2020-10-29 12:48       ` [PATCH v5 0/2] ref-filter: handle CRLF at end-of-line more gracefully Philippe Blain via GitGitGadget
2020-10-29 12:48         ` [PATCH v5 1/2] " Philippe Blain via GitGitGadget
2020-10-29 12:48         ` [PATCH v5 2/2] log, show: add tests for messages containing CRLF Philippe Blain via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aab1f45ba976d088a8c68573a21ed2458915d6a6.1583807093.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=alexhenrie24@gmail.com \
    --cc=git@grubix.eu \
    --cc=git@matthieu-moy.fr \
    --cc=git@vger.kernel.org \
    --cc=john@keeping.me.uk \
    --cc=karthik.188@gmail.com \
    --cc=levraiphilippeblain@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).