git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] interpret-trailers + commit -v bugfix
@ 2017-05-12  5:03 Brian Malehorn
  2017-05-12  5:03 ` [PATCH 1/3] mailinfo.c: is_scissors_line ends on newline Brian Malehorn
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Brian Malehorn @ 2017-05-12  5:03 UTC (permalink / raw)
  To: git


Hi all,

This patch series addresses a bug in interpret-trailers. If the commit
that is being editted is "verbose", it will contain a scissors string
("-- >8 --") and a diff. interpret-trailers doesn't interpret the
scissors and therefore places trailer information after the diff. A
simple reproduction is:

	git config commit.verbose true
	GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
		git commit --amend

This patch series fixes the issue by obeying scissors when computing
trailers location.

P.S. This is my first patch series to the git mailing list, to feel free
to point out any mistakes I made when submitting.

 commit.c                      | 65 ++++++++++++++++++++++++++++++++++++++++---
 commit.h                      |  1 +
 mailinfo.c                    | 53 +----------------------------------
 t/t7513-interpret-trailers.sh | 17 +++++++++++
 4 files changed, 80 insertions(+), 56 deletions(-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/3] mailinfo.c: is_scissors_line ends on newline
  2017-05-12  5:03 [PATCH 0/3] interpret-trailers + commit -v bugfix Brian Malehorn
@ 2017-05-12  5:03 ` Brian Malehorn
  2017-05-12  8:31   ` Jeff King
  2017-05-12  5:03 ` [PATCH 2/3] commit.c: add is_scissors_line Brian Malehorn
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Brian Malehorn @ 2017-05-12  5:03 UTC (permalink / raw)
  To: git; +Cc: Brian Malehorn

Needed to work with git interpret-trailers. Since "line" is, of course,
a line, it will always end with "\n\0" and therefore we can safely end
on "\n".
---
 mailinfo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mailinfo.c b/mailinfo.c
index a489d9d0f..eadd0597f 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -661,7 +661,7 @@ static int is_scissors_line(const char *line)
 	const char *first_nonblank = NULL, *last_nonblank = NULL;
 	int visible, perforation = 0, in_perforation = 0;
 
-	for (c = line; *c; c++) {
+	for (c = line; *c != '\n'; c++) {
 		if (isspace(*c)) {
 			if (in_perforation) {
 				perforation++;
-- 
2.12.3.3.g39c96af


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/3] commit.c: add is_scissors_line
  2017-05-12  5:03 [PATCH 0/3] interpret-trailers + commit -v bugfix Brian Malehorn
  2017-05-12  5:03 ` [PATCH 1/3] mailinfo.c: is_scissors_line ends on newline Brian Malehorn
@ 2017-05-12  5:03 ` Brian Malehorn
  2017-05-12  8:40   ` Jeff King
  2017-05-12  5:03 ` [PATCH 3/3] commit.c: skip scissors when computing trailers Brian Malehorn
  2017-05-12  9:00 ` [PATCH 0/3] interpret-trailers + commit -v bugfix Jeff King
  3 siblings, 1 reply; 24+ messages in thread
From: Brian Malehorn @ 2017-05-12  5:03 UTC (permalink / raw)
  To: git; +Cc: Brian Malehorn

Move is_scissors_line to commit.c and expose it through commit.h.
This is needed in commit.c, and mailinfo.c shouldn't really own it.
---
 commit.c   | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 commit.h   |  1 +
 mailinfo.c | 53 +----------------------------------------------------
 3 files changed, 54 insertions(+), 52 deletions(-)

diff --git a/commit.c b/commit.c
index fab826973..041cfa5a9 100644
--- a/commit.c
+++ b/commit.c
@@ -1646,6 +1646,58 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len
 	return NULL;
 }
 
+int is_scissors_line(const char *line)
+{
+	const char *c;
+	int scissors = 0, gap = 0;
+	const char *first_nonblank = NULL, *last_nonblank = NULL;
+	int visible, perforation = 0, in_perforation = 0;
+
+	for (c = line; *c != '\n'; c++) {
+		if (isspace(*c)) {
+			if (in_perforation) {
+				perforation++;
+				gap++;
+			}
+			continue;
+		}
+		last_nonblank = c;
+		if (first_nonblank == NULL)
+			first_nonblank = c;
+		if (*c == '-') {
+			in_perforation = 1;
+			perforation++;
+			continue;
+		}
+		if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
+		     !memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
+			in_perforation = 1;
+			perforation += 2;
+			scissors += 2;
+			c++;
+			continue;
+		}
+		in_perforation = 0;
+	}
+
+	/*
+	 * The mark must be at least 8 bytes long (e.g. "-- >8 --").
+	 * Even though there can be arbitrary cruft on the same line
+	 * (e.g. "cut here"), in order to avoid misidentification, the
+	 * perforation must occupy more than a third of the visible
+	 * width of the line, and dashes and scissors must occupy more
+	 * than half of the perforation.
+	 */
+
+	if (first_nonblank && last_nonblank)
+		visible = last_nonblank - first_nonblank + 1;
+	else
+		visible = 0;
+	return (scissors && 8 <= visible &&
+		visible < perforation * 3 &&
+		gap * 2 < perforation);
+}
+
 /*
  * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
diff --git a/commit.h b/commit.h
index 9c12abb91..58cbab1cd 100644
--- a/commit.h
+++ b/commit.h
@@ -353,6 +353,7 @@ extern void free_commit_extra_headers(struct commit_extra_header *extra);
  */
 extern const char *find_commit_header(const char *msg, const char *key,
 				      size_t *out_len);
+extern int is_scissors_line(const char *line);
 
 /* Find the end of the log message, the right place for a new trailer. */
 extern int ignore_non_trailer(const char *buf, size_t len);
diff --git a/mailinfo.c b/mailinfo.c
index eadd0597f..52af800a5 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "utf8.h"
 #include "strbuf.h"
+#include "commit.h"
 #include "mailinfo.h"
 
 static void cleanup_space(struct strbuf *sb)
@@ -654,58 +655,6 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }
 
-static int is_scissors_line(const char *line)
-{
-	const char *c;
-	int scissors = 0, gap = 0;
-	const char *first_nonblank = NULL, *last_nonblank = NULL;
-	int visible, perforation = 0, in_perforation = 0;
-
-	for (c = line; *c != '\n'; c++) {
-		if (isspace(*c)) {
-			if (in_perforation) {
-				perforation++;
-				gap++;
-			}
-			continue;
-		}
-		last_nonblank = c;
-		if (first_nonblank == NULL)
-			first_nonblank = c;
-		if (*c == '-') {
-			in_perforation = 1;
-			perforation++;
-			continue;
-		}
-		if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
-		     !memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
-			in_perforation = 1;
-			perforation += 2;
-			scissors += 2;
-			c++;
-			continue;
-		}
-		in_perforation = 0;
-	}
-
-	/*
-	 * The mark must be at least 8 bytes long (e.g. "-- >8 --").
-	 * Even though there can be arbitrary cruft on the same line
-	 * (e.g. "cut here"), in order to avoid misidentification, the
-	 * perforation must occupy more than a third of the visible
-	 * width of the line, and dashes and scissors must occupy more
-	 * than half of the perforation.
-	 */
-
-	if (first_nonblank && last_nonblank)
-		visible = last_nonblank - first_nonblank + 1;
-	else
-		visible = 0;
-	return (scissors && 8 <= visible &&
-		visible < perforation * 3 &&
-		gap * 2 < perforation);
-}
-
 static void flush_inbody_header_accum(struct mailinfo *mi)
 {
 	if (!mi->inbody_header_accum.len)
-- 
2.12.3.3.g39c96af


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/3] commit.c: skip scissors when computing trailers
  2017-05-12  5:03 [PATCH 0/3] interpret-trailers + commit -v bugfix Brian Malehorn
  2017-05-12  5:03 ` [PATCH 1/3] mailinfo.c: is_scissors_line ends on newline Brian Malehorn
  2017-05-12  5:03 ` [PATCH 2/3] commit.c: add is_scissors_line Brian Malehorn
@ 2017-05-12  5:03 ` Brian Malehorn
  2017-05-12  8:52   ` Jeff King
  2017-05-12  9:00 ` [PATCH 0/3] interpret-trailers + commit -v bugfix Jeff King
  3 siblings, 1 reply; 24+ messages in thread
From: Brian Malehorn @ 2017-05-12  5:03 UTC (permalink / raw)
  To: git; +Cc: Brian Malehorn

"scissors" ("----- >8 -----") can be automatically added to commit
messages by setting commit.verbose = true. Prevent this from interfering
with trailer calculations by automatically skipping over scissors,
instead of (usually) treating them as a comment.
---
 commit.c                      | 13 +++++++++----
 t/t7513-interpret-trailers.sh | 17 +++++++++++++++++
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/commit.c b/commit.c
index 041cfa5a9..9a7b41d09 100644
--- a/commit.c
+++ b/commit.c
@@ -1701,10 +1701,10 @@ int is_scissors_line(const char *line)
 /*
  * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
- * trailing comment lines and blank lines, and also the traditional
- * "Conflicts:" block that is not commented out, so that we can use
- * "git commit -s --amend" on an existing commit that forgot to remove
- * it.
+ * trailing comment lines and blank lines.  To support "git commit -s
+ * --amend" on an existing commit, we also ignore "Conflicts:".  To
+ * support "git commit -v", we truncate at "---- >8 ----" and similar
+ * scissors lines.
  *
  * Returns the number of bytes from the tail to ignore, to be fed as
  * the second parameter to append_signoff().
@@ -1723,6 +1723,11 @@ int ignore_non_trailer(const char *buf, size_t len)
 		else
 			next_line++;
 
+		if (is_scissors_line(&buf[bol])) {
+			if (!boc)
+				boc = bol;
+			break;
+		}
 		if (buf[bol] == comment_line_char || buf[bol] == '\n') {
 			/* is this the first of the run of comments? */
 			if (!boc)
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 4dd1d7c52..d88d4a4ff 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -1258,4 +1258,21 @@ test_expect_success 'with no command and no key' '
 	test_cmp expected actual
 '
 
+test_expect_success 'with scissors' '
+	cat >expected <<-EOF &&
+		my subject
+
+		review: Brian
+		sign: A U Thor <author@example.com>
+		# ------------------------ >8 ------------------------
+		ignore this
+	EOF
+	git interpret-trailers --trailer review:Brian >actual <<-EOF &&
+		my subject
+		# ------------------------ >8 ------------------------
+		ignore this
+	EOF
+	test_cmp expected actual
+'
+
 test_done
-- 
2.12.3.3.g39c96af


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] mailinfo.c: is_scissors_line ends on newline
  2017-05-12  5:03 ` [PATCH 1/3] mailinfo.c: is_scissors_line ends on newline Brian Malehorn
@ 2017-05-12  8:31   ` Jeff King
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff King @ 2017-05-12  8:31 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: git

On Thu, May 11, 2017 at 10:03:45PM -0700, Brian Malehorn wrote:

> Needed to work with git interpret-trailers. Since "line" is, of course,
> a line, it will always end with "\n\0" and therefore we can safely end
> on "\n".
> [...]
> -	for (c = line; *c; c++) {
> +	for (c = line; *c != '\n'; c++) {
>  		if (isspace(*c)) {

Hrm. So I don't think this is _wrong_ per se, but the first thing we do
in the loop is check isspace(), which a newline should match. And you
can't see it from the diff context, but it always hits a continue, which
would then advance to the NUL and exit the loop.

So I'm puzzled why this change is necessary. Am I missing something?
We would increase the gap and perforation if we're in them, so I guess
it could have an impact on the line-length heuristics that come at the
end of the function. Is that it?

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] commit.c: add is_scissors_line
  2017-05-12  5:03 ` [PATCH 2/3] commit.c: add is_scissors_line Brian Malehorn
@ 2017-05-12  8:40   ` Jeff King
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff King @ 2017-05-12  8:40 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: git

On Thu, May 11, 2017 at 10:03:46PM -0700, Brian Malehorn wrote:

> Move is_scissors_line to commit.c and expose it through commit.h.
> This is needed in commit.c, and mailinfo.c shouldn't really own it.

It was fine for mailinfo to own it until now, since it was the only
user. :) I think there are some unsaid bits in your rationale. Perhaps
something like:

 Subject: mailinfo: make is_scissors_line() public

 We parse scissors lines only when looking at emails, and thus the
 parsing function has always lived in mailinfo.c. However, we also
 generate scissors lines as part of "git commit -v". We should be able
 to reuse the same function to parse them.

 Once public, it doesn't really belong in mailinfo.c anymore. A better
 place is commit.c because...[I had trouble filling this in; it's really
 _not_ about commits in particular, but rather about the in-editor
 representation of the commit message used by git-commit].

Thinking on it more, though...are the scissors lines generated by "-v"
really the same as the ones parsed by mailinfo? In the case of mailinfo,
we are parsing an externally generated string according to a heuristic
microformat. But with "commit -v", we are the ones who wrote the cut
line in the first place. Shouldn't we be a bit more picky and make sure
we parse the exact same cut line?

And indeed, we _do_ actually have to parse these cut lines already, when
we read the commit message back in. The code is in
wt_status_truncate_message_at_cut_line(), and it does do an exact match.
We should probably be using that rather than making the mailinfo
is_scissors_line() public.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] commit.c: skip scissors when computing trailers
  2017-05-12  5:03 ` [PATCH 3/3] commit.c: skip scissors when computing trailers Brian Malehorn
@ 2017-05-12  8:52   ` Jeff King
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff King @ 2017-05-12  8:52 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: git

On Thu, May 11, 2017 at 10:03:47PM -0700, Brian Malehorn wrote:

> "scissors" ("----- >8 -----") can be automatically added to commit
> messages by setting commit.verbose = true. Prevent this from interfering
> with trailer calculations by automatically skipping over scissors,
> instead of (usually) treating them as a comment.

What's the sequence of commands where you end up with a scissors line in
your "commit -v" output and it gets fed to interpret-trailers? Is it
when you run interpret-trailers from a commit-msg hook? Or do we invoke
it as part of "commit -v" itself?

I ask because I think we can probably come up with a more realistic
test, which may impact what the solution looks like (see below).
Grepping for ignore_non_trailer(), it looks like the issue may be the
append_signoff() call in builtin/commit.c? I couldn't get "git commit -s
-v" to fail, though (it handles the signoff with the verbose bits
removed).

> diff --git a/commit.c b/commit.c
> index 041cfa5a9..9a7b41d09 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1701,10 +1701,10 @@ int is_scissors_line(const char *line)
>  /*
>   * Inspect the given string and determine the true "end" of the log message, in
>   * order to find where to put a new Signed-off-by: line.  Ignored are
> - * trailing comment lines and blank lines, and also the traditional
> - * "Conflicts:" block that is not commented out, so that we can use
> - * "git commit -s --amend" on an existing commit that forgot to remove
> - * it.
> + * trailing comment lines and blank lines.  To support "git commit -s
> + * --amend" on an existing commit, we also ignore "Conflicts:".  To
> + * support "git commit -v", we truncate at "---- >8 ----" and similar
> + * scissors lines.
>   *
>   * Returns the number of bytes from the tail to ignore, to be fed as
>   * the second parameter to append_signoff().
> @@ -1723,6 +1723,11 @@ int ignore_non_trailer(const char *buf, size_t len)
>  		else
>  			next_line++;
>  
> +		if (is_scissors_line(&buf[bol])) {
> +			if (!boc)
> +				boc = bol;
> +			break;
> +		}

This unconditionally ignores scissors lines. But that means if you have
any inside your commit message, we may quietly corrupt the commit
message. It would be better to remove the scissors lines only when we
know that we've added them.

And that's part of my question above. If this is a call happening inside
builtin/commit.c, then we definitely know when we've added scissors. But
if it's interpret-trailers being run by a hook, passing the information
down to this function is a little tricky.

Maybe I'm being overly picky. We already do a lot of gross heuristic
stuff here like skipping comments and old-style Conflicts blocks (that
we don't even generate anymore!).

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/3] interpret-trailers + commit -v bugfix
  2017-05-12  5:03 [PATCH 0/3] interpret-trailers + commit -v bugfix Brian Malehorn
                   ` (2 preceding siblings ...)
  2017-05-12  5:03 ` [PATCH 3/3] commit.c: skip scissors when computing trailers Brian Malehorn
@ 2017-05-12  9:00 ` Jeff King
  2017-05-14  3:39   ` Brian Malehorn
  3 siblings, 1 reply; 24+ messages in thread
From: Jeff King @ 2017-05-12  9:00 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: git

On Thu, May 11, 2017 at 10:03:44PM -0700, Brian Malehorn wrote:

> This patch series addresses a bug in interpret-trailers. If the commit
> that is being editted is "verbose", it will contain a scissors string
> ("-- >8 --") and a diff. interpret-trailers doesn't interpret the
> scissors and therefore places trailer information after the diff. A
> simple reproduction is:
> 
> 	git config commit.verbose true
> 	GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
> 		git commit --amend

Ah, I should have read your cover letter more carefully before
responding to the third patch (or alternatively, you should have put
your motivating example into the third patch's commit message :) ).

Your example definitely makes sense. But it does make me wonder if this
should be an option to interpret-trailers, so that it doesn't
accidentally trigger. Unfortunately you'd have to manually specify it
(even though you'd presumably have commit.verbose in your config and
have long forgotten about it). And at that point, you might as well just
say "--no-verbose" on the commit command-line.

> P.S. This is my first patch series to the git mailing list, to feel free
> to point out any mistakes I made when submitting.

My responses so far have all been critical, so let me be positive for a
moment. :)

Welcome to the community. Everything in your patches generally looks
well-formatted, etc. And I do think you're on the right track with your
solution.

As I said, I'm a little iffy on doing this unconditionally, but it may
be the least-bad solution. I'd just worry about collateral damage to
somebody who doesn't use commit.verbose, but has something scissors-like
in their commit message.

If you were to switch out is_scissors_line() for checking the exact
cut_line[] from wt-status.c, I think that would be a big improvement.
We'd still have the possibility of a false positive, but it would be
much less likely in practice.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/3] interpret-trailers + commit -v bugfix
  2017-05-12  9:00 ` [PATCH 0/3] interpret-trailers + commit -v bugfix Jeff King
@ 2017-05-14  3:39   ` Brian Malehorn
  2017-05-14  3:39     ` [PATCH] interpret-trailers: obey scissors lines Brian Malehorn
  0 siblings, 1 reply; 24+ messages in thread
From: Brian Malehorn @ 2017-05-14  3:39 UTC (permalink / raw)
  To: peff; +Cc: bmalehorn, git


> As I said, I'm a little iffy on doing this unconditionally, but it may
> be the least-bad solution. I'd just worry about collateral damage to
> somebody who doesn't use commit.verbose, but has something scissors-like
> in their commit message.
> 
> If you were to switch out is_scissors_line() for checking the exact
> cut_line[] from wt-status.c, I think that would be a big improvement.
> We'd still have the possibility of a false positive, but it would be
> much less likely in practice.

Yes, you're probably right. Using is_scissors_line() was the path of
least resistance to fix my bug, but wasn't really the Right Thing.

I've made a new patch that uses the wt-status.c code to determine
scissors lines. "git interpret-trailers" now uses the same logic as
"git commit". This patch replaces the original 3.

And yeah, this is yet another heuristic in interpret-trailers aimed at
git commit messages. But it's hardly the first heuristic we've added,
and I'd say it makes more sense for interpret-trailers and commit to
parse the same format.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  3:39   ` Brian Malehorn
@ 2017-05-14  3:39     ` Brian Malehorn
  2017-05-14  3:56       ` Jeff King
  2017-05-14  7:02       ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 24+ messages in thread
From: Brian Malehorn @ 2017-05-14  3:39 UTC (permalink / raw)
  To: peff; +Cc: bmalehorn, git

If a commit message is being editted as "verbose", it will contain a
scissors string ("-- >8 --") and a diff:

    my subject

    # ------------------------ >8 ------------------------
    # Do not touch the line above.
    # Everything below will be removed.
    diff --git a/foo.txt b/foo.txt
    index 5716ca5..7601807 100644
    --- a/foo.txt
    +++ b/foo.txt
    @@ -1 +1 @@
    -bar
    +baz

interpret-trailers doesn't interpret the scissors and therefore places
trailer information after the diff. A simple reproduction is:

    git config commit.verbose true
    GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
        git commit --amend

This commit resolves the issue by teaching "git interpret-trailers" to
obey scissors the same way "git commit" does.
---
 builtin/commit.c              |  3 ++-
 commit.c                      |  6 ++++--
 t/t7513-interpret-trailers.sh | 17 +++++++++++++++++
 wt-status.c                   | 11 ++++++-----
 wt-status.h                   |  2 +-
 5 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 2de5f6cc6..2ce9c339d 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1735,7 +1735,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 
 	if (verbose || /* Truncate the message just before the diff, if any. */
 	    cleanup_mode == CLEANUP_SCISSORS)
-		wt_status_truncate_message_at_cut_line(&sb);
+		strbuf_setlen(&sb,
+			      wt_status_last_nonscissors_index(sb.buf, sb.len));
 
 	if (cleanup_mode != CLEANUP_NONE)
 		strbuf_stripspace(&sb, cleanup_mode == CLEANUP_ALL);
diff --git a/commit.c b/commit.c
index fab826973..5d791b703 100644
--- a/commit.c
+++ b/commit.c
@@ -11,6 +11,7 @@
 #include "commit-slab.h"
 #include "prio-queue.h"
 #include "sha1-lookup.h"
+#include "wt-status.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -1662,8 +1663,9 @@ int ignore_non_trailer(const char *buf, size_t len)
 	int boc = 0;
 	int bol = 0;
 	int in_old_conflicts_block = 0;
+	size_t cutoff = wt_status_last_nonscissors_index(buf, len);
 
-	while (bol < len) {
+	while (bol < cutoff) {
 		const char *next_line = memchr(buf + bol, '\n', len - bol);
 
 		if (!next_line)
@@ -1689,5 +1691,5 @@ int ignore_non_trailer(const char *buf, size_t len)
 		}
 		bol = next_line - buf;
 	}
-	return boc ? len - boc : 0;
+	return boc ? len - boc : len - cutoff;
 }
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 4dd1d7c52..d88d4a4ff 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -1258,4 +1258,21 @@ test_expect_success 'with no command and no key' '
 	test_cmp expected actual
 '
 
+test_expect_success 'with scissors' '
+	cat >expected <<-EOF &&
+		my subject
+
+		review: Brian
+		sign: A U Thor <author@example.com>
+		# ------------------------ >8 ------------------------
+		ignore this
+	EOF
+	git interpret-trailers --trailer review:Brian >actual <<-EOF &&
+		my subject
+		# ------------------------ >8 ------------------------
+		ignore this
+	EOF
+	test_cmp expected actual
+'
+
 test_done
diff --git a/wt-status.c b/wt-status.c
index 4bb46781c..8b807d11f 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -883,17 +883,18 @@ static void wt_longstatus_print_other(struct wt_status *s,
 	status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
 }
 
-void wt_status_truncate_message_at_cut_line(struct strbuf *buf)
+size_t wt_status_last_nonscissors_index(const char *s, size_t len)
 {
 	const char *p;
 	struct strbuf pattern = STRBUF_INIT;
 
 	strbuf_addf(&pattern, "\n%c %s", comment_line_char, cut_line);
-	if (starts_with(buf->buf, pattern.buf + 1))
-		strbuf_setlen(buf, 0);
-	else if ((p = strstr(buf->buf, pattern.buf)))
-		strbuf_setlen(buf, p - buf->buf + 1);
+	if (starts_with(s, pattern.buf + 1))
+		len = 0;
+	else if ((p = strstr(s, pattern.buf)))
+		len = p - s + 1;
 	strbuf_release(&pattern);
+	return len;
 }
 
 void wt_status_add_cut_line(FILE *fp)
diff --git a/wt-status.h b/wt-status.h
index 54fec7703..36a21b492 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -112,7 +112,7 @@ struct wt_status_state {
 	unsigned char cherry_pick_head_sha1[20];
 };
 
-void wt_status_truncate_message_at_cut_line(struct strbuf *);
+size_t wt_status_last_nonscissors_index(const char *s, size_t len);
 void wt_status_add_cut_line(FILE *fp);
 void wt_status_prepare(struct wt_status *s);
 void wt_status_print(struct wt_status *s);
-- 
2.12.3.9.g050893b


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  3:39     ` [PATCH] interpret-trailers: obey scissors lines Brian Malehorn
@ 2017-05-14  3:56       ` Jeff King
  2017-05-14  8:33         ` Brian Malehorn
  2017-05-15  2:12         ` Junio C Hamano
  2017-05-14  7:02       ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 24+ messages in thread
From: Jeff King @ 2017-05-14  3:56 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: git

On Sat, May 13, 2017 at 08:39:23PM -0700, Brian Malehorn wrote:

> If a commit message is being editted as "verbose", it will contain a
> scissors string ("-- >8 --") and a diff:
> 
>     my subject
> 
>     # ------------------------ >8 ------------------------
>     # Do not touch the line above.
>     # Everything below will be removed.
>     diff --git a/foo.txt b/foo.txt
>     index 5716ca5..7601807 100644
>     --- a/foo.txt
>     +++ b/foo.txt
>     @@ -1 +1 @@
>     -bar
>     +baz
> 
> interpret-trailers doesn't interpret the scissors and therefore places
> trailer information after the diff. A simple reproduction is:
> 
>     git config commit.verbose true
>     GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
>         git commit --amend
> 
> This commit resolves the issue by teaching "git interpret-trailers" to
> obey scissors the same way "git commit" does.

Overall, this patch looks good to me. A few comments below.

The commit message explains the situation much better than the original.

> diff --git a/builtin/commit.c b/builtin/commit.c
> index 2de5f6cc6..2ce9c339d 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1735,7 +1735,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
>  
>  	if (verbose || /* Truncate the message just before the diff, if any. */
>  	    cleanup_mode == CLEANUP_SCISSORS)
> -		wt_status_truncate_message_at_cut_line(&sb);
> +		strbuf_setlen(&sb,
> +			      wt_status_last_nonscissors_index(sb.buf, sb.len));

This hunk surprised me at first (that we would need to touch commit.c at
all), but the refactoring makes sense.

> @@ -1662,8 +1663,9 @@ int ignore_non_trailer(const char *buf, size_t len)
>  	int boc = 0;
>  	int bol = 0;
>  	int in_old_conflicts_block = 0;
> +	size_t cutoff = wt_status_last_nonscissors_index(buf, len);
>  
> -	while (bol < len) {
> +	while (bol < cutoff) {
>  		const char *next_line = memchr(buf + bol, '\n', len - bol);
>  
>  		if (!next_line)
> @@ -1689,5 +1691,5 @@ int ignore_non_trailer(const char *buf, size_t len)
>  		}
>  		bol = next_line - buf;
>  	}
> -	return boc ? len - boc : 0;
> +	return boc ? len - boc : len - cutoff;
>  }

The change to interpret-trailers here ended up delightfully simple (and
looks right to me).

> diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
> index 4dd1d7c52..d88d4a4ff 100755
> --- a/t/t7513-interpret-trailers.sh
> +++ b/t/t7513-interpret-trailers.sh
> @@ -1258,4 +1258,21 @@ test_expect_success 'with no command and no key' '
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'with scissors' '
> +	cat >expected <<-EOF &&
> +		my subject
> +
> +		review: Brian
> +		sign: A U Thor <author@example.com>
> +		# ------------------------ >8 ------------------------
> +		ignore this
> +	EOF

Two minor style nits. One, we'd usually use "\EOF" here unless you
really do want to interpolate inside the here document. And two, we
usually indent the contents to the same level as the outer cat/EOF pair
(I actually don't mind at all how yours looks, but I just happened to
notice that it is slightly unlike our usual style).

> diff --git a/wt-status.c b/wt-status.c
> index 4bb46781c..8b807d11f 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -883,17 +883,18 @@ static void wt_longstatus_print_other(struct wt_status *s,
>  	status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
>  }
>  
> -void wt_status_truncate_message_at_cut_line(struct strbuf *buf)
> +size_t wt_status_last_nonscissors_index(const char *s, size_t len)

I can see how the refactor changes this function (and it makes sense),
but the "last nonscissors index" seems like a funny term. It is really
the length, isn't, and therefore one past the last nonscissors index (or
another way of putting it: it's the first index of the scissors).

I wonder if it makes sense to call it "length".

Another way to think of it is still as a truncation. Our strip_suffix()
helper behaves quite similarly to this (not actually writing into the
buffer, but returning the new length). Perhaps something like
"wt_status_strip_scissors" would work.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  3:39     ` [PATCH] interpret-trailers: obey scissors lines Brian Malehorn
  2017-05-14  3:56       ` Jeff King
@ 2017-05-14  7:02       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 24+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-14  7:02 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: Jeff King, Git Mailing List

On Sun, May 14, 2017 at 5:39 AM, Brian Malehorn <bmalehorn@gmail.com> wrote:
> If a commit message is being editted as "verbose", it will contain a

Typo, should be "edited": https://en.wiktionary.org/wiki/editted

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  3:56       ` Jeff King
@ 2017-05-14  8:33         ` Brian Malehorn
  2017-05-14  8:33           ` Brian Malehorn
  2017-05-15  3:08           ` Jeff King
  2017-05-15  2:12         ` Junio C Hamano
  1 sibling, 2 replies; 24+ messages in thread
From: Brian Malehorn @ 2017-05-14  8:33 UTC (permalink / raw)
  To: peff; +Cc: bmalehorn, git


> One, we'd usually use "\EOF" here unless you
> really do want to interpolate inside the here document.

Fixed, and I learned something new.

> And two, we usually indent the contents to the same level as the outer
> cat/EOF pair

Fixed.

I was indenting the same as the other tests in that file. But if the way
you described is the preferred way, then sure.

> Another way to think of it is still as a truncation. Our strip_suffix()
> helper behaves quite similarly to this (not actually writing into the
> buffer, but returning the new length). Perhaps something like
> "wt_status_strip_scissors" would work.

I agree the name was pretty awkward. I was trying to avoid using the
word "truncate" or "strip", since it doesn't make any change to the
buffer. But if strip_suffix() is already around it shouldn't be to
surprising.

I've now renamed it to "wt_status_strip_scissors".

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  8:33         ` Brian Malehorn
@ 2017-05-14  8:33           ` Brian Malehorn
  2017-05-15  3:55             ` Junio C Hamano
  2017-05-15  3:08           ` Jeff King
  1 sibling, 1 reply; 24+ messages in thread
From: Brian Malehorn @ 2017-05-14  8:33 UTC (permalink / raw)
  To: peff; +Cc: bmalehorn, git

If a commit message is being edited as "verbose", it will contain a
scissors string ("-- >8 --") and a diff:

    my subject

    # ------------------------ >8 ------------------------
    # Do not touch the line above.
    # Everything below will be removed.
    diff --git a/foo.txt b/foo.txt
    index 5716ca5..7601807 100644
    --- a/foo.txt
    +++ b/foo.txt
    @@ -1 +1 @@
    -bar
    +baz

interpret-trailers doesn't interpret the scissors and therefore places
trailer information after the diff. A simple reproduction is:

    git config commit.verbose true
    GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
        git commit --amend

This commit resolves the issue by teaching "git interpret-trailers" to
obey scissors the same way "git commit" does.
---
 builtin/commit.c              |  3 ++-
 commit.c                      | 13 +++++++------
 t/t7513-interpret-trailers.sh | 17 +++++++++++++++++
 wt-status.c                   | 11 ++++++-----
 wt-status.h                   |  2 +-
 5 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 2de5f6cc6..d223cf3ea 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1735,7 +1735,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 
 	if (verbose || /* Truncate the message just before the diff, if any. */
 	    cleanup_mode == CLEANUP_SCISSORS)
-		wt_status_truncate_message_at_cut_line(&sb);
+		strbuf_setlen(&sb,
+			      wt_status_strip_scissors(sb.buf, sb.len));
 
 	if (cleanup_mode != CLEANUP_NONE)
 		strbuf_stripspace(&sb, cleanup_mode == CLEANUP_ALL);
diff --git a/commit.c b/commit.c
index fab826973..6c26ac9b9 100644
--- a/commit.c
+++ b/commit.c
@@ -11,6 +11,7 @@
 #include "commit-slab.h"
 #include "prio-queue.h"
 #include "sha1-lookup.h"
+#include "wt-status.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -1649,10 +1650,9 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len
 /*
  * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
- * trailing comment lines and blank lines, and also the traditional
- * "Conflicts:" block that is not commented out, so that we can use
- * "git commit -s --amend" on an existing commit that forgot to remove
- * it.
+ * trailing comment lines and blank lines.  To support "git commit -s
+ * --amend" on an existing commit, we also ignore "Conflicts:".  To
+ * support "git commit -v", we truncate at scissor lines.
  *
  * Returns the number of bytes from the tail to ignore, to be fed as
  * the second parameter to append_signoff().
@@ -1662,8 +1662,9 @@ int ignore_non_trailer(const char *buf, size_t len)
 	int boc = 0;
 	int bol = 0;
 	int in_old_conflicts_block = 0;
+	size_t cutoff = wt_status_strip_scissors(buf, len);
 
-	while (bol < len) {
+	while (bol < cutoff) {
 		const char *next_line = memchr(buf + bol, '\n', len - bol);
 
 		if (!next_line)
@@ -1689,5 +1690,5 @@ int ignore_non_trailer(const char *buf, size_t len)
 		}
 		bol = next_line - buf;
 	}
-	return boc ? len - boc : 0;
+	return boc ? len - boc : len - cutoff;
 }
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 4dd1d7c52..b16d8c5d9 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -1258,4 +1258,21 @@ test_expect_success 'with no command and no key' '
 	test_cmp expected actual
 '
 
+test_expect_success 'with scissors' '
+	cat >expected <<-\EOF &&
+	my subject
+
+	review: Brian
+	sign: A U Thor <author@example.com>
+	# ------------------------ >8 ------------------------
+	ignore this
+	EOF
+	git interpret-trailers --trailer review:Brian >actual <<-\EOF &&
+	my subject
+	# ------------------------ >8 ------------------------
+	ignore this
+	EOF
+	test_cmp expected actual
+'
+
 test_done
diff --git a/wt-status.c b/wt-status.c
index 4bb46781c..2a6bba205 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -883,17 +883,18 @@ static void wt_longstatus_print_other(struct wt_status *s,
 	status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
 }
 
-void wt_status_truncate_message_at_cut_line(struct strbuf *buf)
+size_t wt_status_strip_scissors(const char *s, size_t len)
 {
 	const char *p;
 	struct strbuf pattern = STRBUF_INIT;
 
 	strbuf_addf(&pattern, "\n%c %s", comment_line_char, cut_line);
-	if (starts_with(buf->buf, pattern.buf + 1))
-		strbuf_setlen(buf, 0);
-	else if ((p = strstr(buf->buf, pattern.buf)))
-		strbuf_setlen(buf, p - buf->buf + 1);
+	if (starts_with(s, pattern.buf + 1))
+		len = 0;
+	else if ((p = strstr(s, pattern.buf)))
+		len = p - s + 1;
 	strbuf_release(&pattern);
+	return len;
 }
 
 void wt_status_add_cut_line(FILE *fp)
diff --git a/wt-status.h b/wt-status.h
index 54fec7703..52a05fd51 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -112,7 +112,7 @@ struct wt_status_state {
 	unsigned char cherry_pick_head_sha1[20];
 };
 
-void wt_status_truncate_message_at_cut_line(struct strbuf *);
+size_t wt_status_strip_scissors(const char *s, size_t len);
 void wt_status_add_cut_line(FILE *fp);
 void wt_status_prepare(struct wt_status *s);
 void wt_status_print(struct wt_status *s);
-- 
2.12.3.9.g050893b


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  3:56       ` Jeff King
  2017-05-14  8:33         ` Brian Malehorn
@ 2017-05-15  2:12         ` Junio C Hamano
  2017-05-15  3:07           ` Jeff King
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2017-05-15  2:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Brian Malehorn, git

Jeff King <peff@peff.net> writes:

>> If a commit message is being editted as "verbose", it will contain a
>> scissors string ("-- >8 --") and a diff:
>> 
>>     my subject
>> 
>>     # ------------------------ >8 ------------------------
>>     # Do not touch the line above.
>>     # Everything below will be removed.
>>     diff --git a/foo.txt b/foo.txt
>>     index 5716ca5..7601807 100644
>>     --- a/foo.txt
>>     +++ b/foo.txt
>>     @@ -1 +1 @@
>>     -bar
>>     +baz
>> 
>> interpret-trailers doesn't interpret the scissors and therefore places
>> trailer information after the diff. A simple reproduction is:
>> 
>>     git config commit.verbose true
>>     GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
>>         git commit --amend

So, "git commit --amend --verbose" prepares something like the above
example and passes it to the GIT_EDITOR.  The problem description
indicates that because interpret-trailers (which happens to be used
as the GIT_EDITOR) treats the cut-line used by "git commit" just
like any other random line, it tries to find the trailer block way
below, in the patch text.  I agree that it is a problem.  But...

>> diff --git a/builtin/commit.c b/builtin/commit.c
>> index 2de5f6cc6..2ce9c339d 100644
>> --- a/builtin/commit.c
>> +++ b/builtin/commit.c
>> @@ -1735,7 +1735,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
>>  
>>  	if (verbose || /* Truncate the message just before the diff, if any. */
>>  	    cleanup_mode == CLEANUP_SCISSORS)
>> -		wt_status_truncate_message_at_cut_line(&sb);
>> +		strbuf_setlen(&sb,
>> +			      wt_status_last_nonscissors_index(sb.buf, sb.len));
>
> This hunk surprised me at first (that we would need to touch commit.c at
> all), but the refactoring makes sense.

This still surprises me.  If the problem is in interpret-trailers,
why do we even need to touch cmd_commit()?  If GIT_EDITOR returns us

     my subject
 
     Acked-by: me
     # ------------------------ >8 ------------------------
     # Do not touch the line above.
     # Everything below will be removed.
     diff --git a/foo.txt b/foo.txt
     index 5716ca5..7601807 100644
     --- a/foo.txt
     +++ b/foo.txt
     @@ -1 +1 @@
     -bar
     +baz

after this patch fixes interpret-trailers, wouldn't the cmd_commit()
function work as expected without any change?

Puzzled.

The proposed log message calls the cut-line "scissors", but that is
probably a source of this confusion.  The cut-line and scissors do
not have much in commmon.  For one thing, scissors is a mechanism to
discard everything _ABOVE_ it.  The cut-line we see in this example,
on the other hand, is about discarding everything _BELOW_ it.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-15  2:12         ` Junio C Hamano
@ 2017-05-15  3:07           ` Jeff King
  2017-05-15  3:32             ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff King @ 2017-05-15  3:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brian Malehorn, git

On Mon, May 15, 2017 at 11:12:03AM +0900, Junio C Hamano wrote:

> >> diff --git a/builtin/commit.c b/builtin/commit.c
> >> index 2de5f6cc6..2ce9c339d 100644
> >> --- a/builtin/commit.c
> >> +++ b/builtin/commit.c
> >> @@ -1735,7 +1735,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
> >>  
> >>  	if (verbose || /* Truncate the message just before the diff, if any. */
> >>  	    cleanup_mode == CLEANUP_SCISSORS)
> >> -		wt_status_truncate_message_at_cut_line(&sb);
> >> +		strbuf_setlen(&sb,
> >> +			      wt_status_last_nonscissors_index(sb.buf, sb.len));
> >
> > This hunk surprised me at first (that we would need to touch commit.c at
> > all), but the refactoring makes sense.
> 
> This still surprises me.  If the problem is in interpret-trailers,
> why do we even need to touch cmd_commit()?  If GIT_EDITOR returns us

The behavior of cmd_commit() shouldn't be changed by the patch. But to
make the interface suitable for the new callsite (which doesn't have a
strbuf, but a ptr/len buffer), it needs to return the length rather than
shortening the strbuf. We could leave in place:

  void wt_status_truncate_message_at_cut_line(struct strbuf *sb)
  {
	strbuf_setlen(sb, wt_status_last_nonscissors_index(sb->buf, sb->len));
  }

but it would only have this one caller.

If I were doing the patch series, I'd probably have split that
refactoring into its own patch and discussed the reason separately. I
waffled on whether or not to ask Brian to do so (and obviously didn't in
the end).

> The proposed log message calls the cut-line "scissors", but that is
> probably a source of this confusion.  The cut-line and scissors do
> not have much in commmon.  For one thing, scissors is a mechanism to
> discard everything _ABOVE_ it.  The cut-line we see in this example,
> on the other hand, is about discarding everything _BELOW_ it.

I suspect the confusion comes from calling it CLEANUP_SCISSORS in the
quoted context above, then.  Certainly we use scissors for "snip
everything above this line" in mailinfo, but here it is "snip everything
after this line". I don't think it's wrong to refer to it as a scissors
line, even though the operation is different.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  8:33         ` Brian Malehorn
  2017-05-14  8:33           ` Brian Malehorn
@ 2017-05-15  3:08           ` Jeff King
  1 sibling, 0 replies; 24+ messages in thread
From: Jeff King @ 2017-05-15  3:08 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: git

On Sun, May 14, 2017 at 01:33:48AM -0700, Brian Malehorn wrote:

> > And two, we usually indent the contents to the same level as the outer
> > cat/EOF pair
> 
> Fixed.
> 
> I was indenting the same as the other tests in that file. But if the way
> you described is the preferred way, then sure.

Doh. I didn't notice that. That file _is_ unlike the rest of the code
base, but it's generally OK to match the style of surrounding code.

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-15  3:07           ` Jeff King
@ 2017-05-15  3:32             ` Junio C Hamano
  2017-05-15  3:33               ` Jeff King
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2017-05-15  3:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Brian Malehorn, git

Jeff King <peff@peff.net> writes:

> On Mon, May 15, 2017 at 11:12:03AM +0900, Junio C Hamano wrote:
>
>> >> diff --git a/builtin/commit.c b/builtin/commit.c
>> >> index 2de5f6cc6..2ce9c339d 100644
>> >> --- a/builtin/commit.c
>> >> +++ b/builtin/commit.c
>> >> @@ -1735,7 +1735,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
>> >>  
>> >>  	if (verbose || /* Truncate the message just before the diff, if any. */
>> >>  	    cleanup_mode == CLEANUP_SCISSORS)
>> >> -		wt_status_truncate_message_at_cut_line(&sb);
>> >> +		strbuf_setlen(&sb,
>> >> +			      wt_status_last_nonscissors_index(sb.buf, sb.len));
>> >
>> > This hunk surprised me at first (that we would need to touch commit.c at
>> > all), but the refactoring makes sense.
>> 
>> This still surprises me.  If the problem is in interpret-trailers,
>> why do we even need to touch cmd_commit()?  If GIT_EDITOR returns us
>
> The behavior of cmd_commit() shouldn't be changed by the patch. But to
> make the interface suitable for the new callsite (which doesn't have a
> strbuf, but a ptr/len buffer), it needs to return the length rather than
> shortening the strbuf. We could leave in place:
>
>   void wt_status_truncate_message_at_cut_line(struct strbuf *sb)
>   {
> 	strbuf_setlen(sb, wt_status_last_nonscissors_index(sb->buf, sb->len));
>   }
>
> but it would only have this one caller.
>
> If I were doing the patch series, I'd probably have split that
> refactoring into its own patch and discussed the reason separately. I
> waffled on whether or not to ask Brian to do so (and obviously didn't in
> the end).

I suspect that you would have just explained "since there is only
one caller, let's open-code it" in the log message, without making
this a two-patch series, and that would also have been perfectly
understandable (and in this case probably preferrable).

So the patch text would be OK; it was surprising to have the change
without being explained, that's all.

Thanks.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-15  3:32             ` Junio C Hamano
@ 2017-05-15  3:33               ` Jeff King
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff King @ 2017-05-15  3:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brian Malehorn, git

On Mon, May 15, 2017 at 12:32:24PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Mon, May 15, 2017 at 11:12:03AM +0900, Junio C Hamano wrote:
> >
> >> >> diff --git a/builtin/commit.c b/builtin/commit.c
> >> >> index 2de5f6cc6..2ce9c339d 100644
> >> >> --- a/builtin/commit.c
> >> >> +++ b/builtin/commit.c
> >> >> @@ -1735,7 +1735,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
> >> >>  
> >> >>  	if (verbose || /* Truncate the message just before the diff, if any. */
> >> >>  	    cleanup_mode == CLEANUP_SCISSORS)
> >> >> -		wt_status_truncate_message_at_cut_line(&sb);
> >> >> +		strbuf_setlen(&sb,
> >> >> +			      wt_status_last_nonscissors_index(sb.buf, sb.len));
> >> >
> >> > This hunk surprised me at first (that we would need to touch commit.c at
> >> > all), but the refactoring makes sense.
> >> 
> >> This still surprises me.  If the problem is in interpret-trailers,
> >> why do we even need to touch cmd_commit()?  If GIT_EDITOR returns us
> >
> > The behavior of cmd_commit() shouldn't be changed by the patch. But to
> > make the interface suitable for the new callsite (which doesn't have a
> > strbuf, but a ptr/len buffer), it needs to return the length rather than
> > shortening the strbuf. We could leave in place:
> >
> >   void wt_status_truncate_message_at_cut_line(struct strbuf *sb)
> >   {
> > 	strbuf_setlen(sb, wt_status_last_nonscissors_index(sb->buf, sb->len));
> >   }
> >
> > but it would only have this one caller.
> >
> > If I were doing the patch series, I'd probably have split that
> > refactoring into its own patch and discussed the reason separately. I
> > waffled on whether or not to ask Brian to do so (and obviously didn't in
> > the end).
> 
> I suspect that you would have just explained "since there is only
> one caller, let's open-code it" in the log message, without making
> this a two-patch series, and that would also have been perfectly
> understandable (and in this case probably preferrable).
> 
> So the patch text would be OK; it was surprising to have the change
> without being explained, that's all.

Yeah, I'd agree it could use a better explanation in the commit message.
I was surprised to see it, too. Since both of us were surprised, that's
probably a good indicator. :)

-Peff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-14  8:33           ` Brian Malehorn
@ 2017-05-15  3:55             ` Junio C Hamano
  2017-05-15  4:23               ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2017-05-15  3:55 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: peff, git

Brian Malehorn <bmalehorn@gmail.com> writes:

> If a commit message is being edited as "verbose", it will contain a
> scissors string ("-- >8 --") and a diff:
>
>     my subject
>
>     # ------------------------ >8 ------------------------
>     # Do not touch the line above.
>     # Everything below will be removed.
>     diff --git a/foo.txt b/foo.txt
>     index 5716ca5..7601807 100644
>     --- a/foo.txt
>     +++ b/foo.txt
>     @@ -1 +1 @@
>     -bar
>     +baz
>
> interpret-trailers doesn't interpret the scissors and therefore places
> trailer information after the diff. A simple reproduction is:
>
>     git config commit.verbose true
>     GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
>         git commit --amend
>
> This commit resolves the issue by teaching "git interpret-trailers" to
> obey scissors the same way "git commit" does.

I'd prefer to see s/obey scissors lines/honor the cut line/, and the
last paragraph rephrased somewhat, perhaps like:

        Subject: interpret-trailers: honor the cut line

        If a commit message is edited with the "verbose" option, the
        buffer will have a cut line and diff after the log message,
        like so:

             my subject

             # ------------------------ >8 ------------------------
             # Do not touch the line above.
             # Everything below will be removed.
             diff --git a/foo.txt b/foo.txt
             index 5716ca5..7601807 100644
             --- a/foo.txt
             +++ b/foo.txt
             @@ -1 +1 @@
             -bar
             +baz

        "git interpret-trailers" is unaware of the cut line, and
        assumes the trailer block would be at the end of the whole
        thing.  This can easily be seen with:

             $ GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
               git commit --amend -v

        Teach "git interpret-trailers" to notice the cut-line and
        ignore the remainder of the input when looking for a place
        to add new trailer block.  This makes it consistent with how
        "git commit -v -s" inserts a new Signed-off-by: line.

        This can be done by the same logic as the existing helper
        function, wt_status_truncate_message_at_cut_line(), uses,
        but it wants the caller to pass a strbuf to it.  Because the
        helper function ignore_non_trailer() used by the command
        takes a <pointer, length> pair, not a strbuf, steal the
        logic from wt_status_truncate_message_at_cut_line() to
        create a new wt_status_strip_scissors() helper function that
        takes <poiter, length> pair, and make ignore_non_trailer()
        call it to help "interpret-trailers".  Since there is only
        one caller of wt_status_truncate_message_at_cut_line() in
        cmd_commit(), rewrite it to call wt_status_strip_scissors()
        helper instead and remove the old helper that no longer has
        any caller.

The last paragraph would have saved me from getting puzzled.

The mention of "'commit -v -s' works that way, too" was my attempt
to justify why it is OK to make this change unconditionally to
intepret-trailers, but I am still not 100% convinced with that
reasoning (or your original log message) that it is a safe thing to
do.  It is not like its sole purpose is to serve as GIT_EDITOR for
the "git commit" command.

The patch text itself looks sensible.  

We still need to see your patch with your sign-off, though.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-15  3:55             ` Junio C Hamano
@ 2017-05-15  4:23               ` Junio C Hamano
  2017-05-16  6:06                 ` Brian Malehorn
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2017-05-15  4:23 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: peff, git

Junio C Hamano <gitster@pobox.com> writes:

>         This can be done by the same logic as the existing helper
>         function, wt_status_truncate_message_at_cut_line(), uses,
>         but it wants the caller to pass a strbuf to it.  Because the
>         helper function ignore_non_trailer() used by the command
>         takes a <pointer, length> pair, not a strbuf, steal the
>         logic from wt_status_truncate_message_at_cut_line() to
>         create a new wt_status_strip_scissors() helper function that
>         takes <poiter, length> pair, and make ignore_non_trailer()
>         call it to help "interpret-trailers".  Since there is only
>         one caller of wt_status_truncate_message_at_cut_line() in
>         cmd_commit(), rewrite it to call wt_status_strip_scissors()
>         helper instead and remove the old helper that no longer has
>         any caller.
>
> The last paragraph would have saved me from getting puzzled.

And re-reading the above, wt_status_strip_scissors() does not sound
right to me.  

Yes, I am bikeshedding at this point, but names matter.  I prefer to
keep the distinction between the two clear by not calling the
cut_line[] array "scissors", but in any case, the new function is
not about "finding" the cut-line or scissors, in other words,
wt_status_locate_scissors() is not a good name either (we do not say
"not found" when there is none).

We are locating the logical end of the commit log message.  It ends
at "---\n", if exists, in the output of "git format-patch", and it
ends at the cut-line, if exists, in "commit -v" editor buffer, and
if there aren't these funny End-Of-Message marks, then we return the
location of the last byte.  IOW, the helper function you added will
be the place to add more logic in the future if we ever found the
need to notice other kinds of "logical end" markers (which may be
ones we will invent in the future) while accepting a proposed commit
log message.

So how about calling it wt_status_locate_end() or something like
that, without limiting ourselves only to the "scissors"?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-15  4:23               ` Junio C Hamano
@ 2017-05-16  6:06                 ` Brian Malehorn
  2017-05-16  6:06                   ` [PATCH] interpret-trailers: honor the cut line Brian Malehorn
  2017-05-16  6:42                   ` [PATCH] interpret-trailers: obey scissors lines Junio C Hamano
  0 siblings, 2 replies; 24+ messages in thread
From: Brian Malehorn @ 2017-05-16  6:06 UTC (permalink / raw)
  To: gitster; +Cc: bmalehorn, git, peff


Alright, I've made a new patch based on this feedback:

* make indentation consistent with rest of test file again
* rename function to wt_status_locate_end()
* rephrase commit message; add paragraph about wt_status_locate_end()
* s/scissors/cut line/
* Signed-off-by: me

> The mention of "'commit -v -s' works that way, too" was my attempt
> to justify why it is OK to make this change unconditionally to
> intepret-trailers, but I am still not 100% convinced with that
> reasoning (or your original log message) that it is a safe thing to
> do.  It is not like its sole purpose is to serve as GIT_EDITOR for
> the "git commit" command.

Yup, this is a heuristic and it will sometimes hit false.  But a far
more grievous heuristic is ignoring all lines that begin with '#', yet
we do that anyway for its convenience in editing commit messages.  Since
we already *partially* honor commit message syntax in
interpret-trailers, we might as well *completely* honor it by parsing
cut lines as well.  At least, that's my reasoning.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] interpret-trailers: honor the cut line
  2017-05-16  6:06                 ` Brian Malehorn
@ 2017-05-16  6:06                   ` Brian Malehorn
  2017-05-16  6:42                   ` [PATCH] interpret-trailers: obey scissors lines Junio C Hamano
  1 sibling, 0 replies; 24+ messages in thread
From: Brian Malehorn @ 2017-05-16  6:06 UTC (permalink / raw)
  To: gitster; +Cc: bmalehorn, git, peff

If a commit message is edited with the "verbose" option, the buffer will
have a cut line and diff after the log message, like so:

    my subject

    # ------------------------ >8 ------------------------
    # Do not touch the line above.
    # Everything below will be removed.
    diff --git a/foo.txt b/foo.txt
    index 5716ca5..7601807 100644
    --- a/foo.txt
    +++ b/foo.txt
    @@ -1 +1 @@
    -bar
    +baz

"git interpret-trailers" is unaware of the cut line, and assumes the
trailer block would be at the end of the whole thing.  This can easily
be seen with:

     $ GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
       git commit --amend -v

Teach "git interpret-trailers" to notice the cut-line and ignore the
remainder of the input when looking for a place to add new trailer
block.  This makes it consistent with how "git commit -v -s" inserts a
new Signed-off-by: line.

This can be done by the same logic as the existing helper function,
wt_status_truncate_message_at_cut_line(), uses, but it wants the caller
to pass a strbuf to it.  Because the helper function
ignore_non_trailer() used by the command takes a <pointer, length> pair,
not a strbuf, steal the logic from
wt_status_truncate_message_at_cut_line() to create a new
wt_status_locate_end() helper function that takes <pointer, length>
pair, and make ignore_non_trailer() call it to help
"interpret-trailers".  Since there is only one caller of
wt_status_truncate_message_at_cut_line() in cmd_commit(), rewrite it to
call wt_status_locate_end() helper instead and remove the old helper
that no longer has any caller.

Signed-off-by: Brian Malehorn <bmalehorn@gmail.com>
---
 builtin/commit.c              |  2 +-
 commit.c                      | 13 +++++++------
 t/t7513-interpret-trailers.sh | 17 +++++++++++++++++
 wt-status.c                   | 11 ++++++-----
 wt-status.h                   |  2 +-
 5 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 2de5f6cc6..6c606d965 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1735,7 +1735,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 
 	if (verbose || /* Truncate the message just before the diff, if any. */
 	    cleanup_mode == CLEANUP_SCISSORS)
-		wt_status_truncate_message_at_cut_line(&sb);
+		strbuf_setlen(&sb, wt_status_locate_end(sb.buf, sb.len));
 
 	if (cleanup_mode != CLEANUP_NONE)
 		strbuf_stripspace(&sb, cleanup_mode == CLEANUP_ALL);
diff --git a/commit.c b/commit.c
index fab826973..fa353acac 100644
--- a/commit.c
+++ b/commit.c
@@ -11,6 +11,7 @@
 #include "commit-slab.h"
 #include "prio-queue.h"
 #include "sha1-lookup.h"
+#include "wt-status.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -1649,10 +1650,9 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len
 /*
  * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
- * trailing comment lines and blank lines, and also the traditional
- * "Conflicts:" block that is not commented out, so that we can use
- * "git commit -s --amend" on an existing commit that forgot to remove
- * it.
+ * trailing comment lines and blank lines.  To support "git commit -s
+ * --amend" on an existing commit, we also ignore "Conflicts:".  To
+ * support "git commit -v", we truncate at cut lines.
  *
  * Returns the number of bytes from the tail to ignore, to be fed as
  * the second parameter to append_signoff().
@@ -1662,8 +1662,9 @@ int ignore_non_trailer(const char *buf, size_t len)
 	int boc = 0;
 	int bol = 0;
 	int in_old_conflicts_block = 0;
+	size_t cutoff = wt_status_locate_end(buf, len);
 
-	while (bol < len) {
+	while (bol < cutoff) {
 		const char *next_line = memchr(buf + bol, '\n', len - bol);
 
 		if (!next_line)
@@ -1689,5 +1690,5 @@ int ignore_non_trailer(const char *buf, size_t len)
 		}
 		bol = next_line - buf;
 	}
-	return boc ? len - boc : 0;
+	return boc ? len - boc : len - cutoff;
 }
diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh
index 4dd1d7c52..0c6f91c43 100755
--- a/t/t7513-interpret-trailers.sh
+++ b/t/t7513-interpret-trailers.sh
@@ -1258,4 +1258,21 @@ test_expect_success 'with no command and no key' '
 	test_cmp expected actual
 '
 
+test_expect_success 'with cut line' '
+	cat >expected <<-\EOF &&
+		my subject
+
+		review: Brian
+		sign: A U Thor <author@example.com>
+		# ------------------------ >8 ------------------------
+		ignore this
+	EOF
+	git interpret-trailers --trailer review:Brian >actual <<-\EOF &&
+		my subject
+		# ------------------------ >8 ------------------------
+		ignore this
+	EOF
+	test_cmp expected actual
+'
+
 test_done
diff --git a/wt-status.c b/wt-status.c
index 4bb46781c..afff86c18 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -883,17 +883,18 @@ static void wt_longstatus_print_other(struct wt_status *s,
 	status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
 }
 
-void wt_status_truncate_message_at_cut_line(struct strbuf *buf)
+size_t wt_status_locate_end(const char *s, size_t len)
 {
 	const char *p;
 	struct strbuf pattern = STRBUF_INIT;
 
 	strbuf_addf(&pattern, "\n%c %s", comment_line_char, cut_line);
-	if (starts_with(buf->buf, pattern.buf + 1))
-		strbuf_setlen(buf, 0);
-	else if ((p = strstr(buf->buf, pattern.buf)))
-		strbuf_setlen(buf, p - buf->buf + 1);
+	if (starts_with(s, pattern.buf + 1))
+		len = 0;
+	else if ((p = strstr(s, pattern.buf)))
+		len = p - s + 1;
 	strbuf_release(&pattern);
+	return len;
 }
 
 void wt_status_add_cut_line(FILE *fp)
diff --git a/wt-status.h b/wt-status.h
index 54fec7703..889a8d682 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -112,7 +112,7 @@ struct wt_status_state {
 	unsigned char cherry_pick_head_sha1[20];
 };
 
-void wt_status_truncate_message_at_cut_line(struct strbuf *);
+size_t wt_status_locate_end(const char *s, size_t len);
 void wt_status_add_cut_line(FILE *fp);
 void wt_status_prepare(struct wt_status *s);
 void wt_status_print(struct wt_status *s);
-- 
2.12.3.9.g050893b


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] interpret-trailers: obey scissors lines
  2017-05-16  6:06                 ` Brian Malehorn
  2017-05-16  6:06                   ` [PATCH] interpret-trailers: honor the cut line Brian Malehorn
@ 2017-05-16  6:42                   ` Junio C Hamano
  1 sibling, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2017-05-16  6:42 UTC (permalink / raw)
  To: Brian Malehorn; +Cc: git, peff

Brian Malehorn <bmalehorn@gmail.com> writes:

>> The mention of "'commit -v -s' works that way, too" was my attempt
>> to justify why it is OK to make this change unconditionally to
>> intepret-trailers, but I am still not 100% convinced with that
>> reasoning (or your original log message) that it is a safe thing to
>> do.  It is not like its sole purpose is to serve as GIT_EDITOR for
>> the "git commit" command.
>
> Yup, this is a heuristic and it will sometimes hit false.  But a far
> more grievous heuristic is ignoring all lines that begin with '#', yet
> we do that anyway for its convenience in editing commit messages.  Since
> we already *partially* honor commit message syntax in
> interpret-trailers, we might as well *completely* honor it by parsing
> cut lines as well.  At least, that's my reasoning.

I was primarily wondering if we want to protect this behind an
option to "interpret-trailers", instead of making this change
unconditionally.  But your above reasoning makes sense to me.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-05-16  6:42 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-12  5:03 [PATCH 0/3] interpret-trailers + commit -v bugfix Brian Malehorn
2017-05-12  5:03 ` [PATCH 1/3] mailinfo.c: is_scissors_line ends on newline Brian Malehorn
2017-05-12  8:31   ` Jeff King
2017-05-12  5:03 ` [PATCH 2/3] commit.c: add is_scissors_line Brian Malehorn
2017-05-12  8:40   ` Jeff King
2017-05-12  5:03 ` [PATCH 3/3] commit.c: skip scissors when computing trailers Brian Malehorn
2017-05-12  8:52   ` Jeff King
2017-05-12  9:00 ` [PATCH 0/3] interpret-trailers + commit -v bugfix Jeff King
2017-05-14  3:39   ` Brian Malehorn
2017-05-14  3:39     ` [PATCH] interpret-trailers: obey scissors lines Brian Malehorn
2017-05-14  3:56       ` Jeff King
2017-05-14  8:33         ` Brian Malehorn
2017-05-14  8:33           ` Brian Malehorn
2017-05-15  3:55             ` Junio C Hamano
2017-05-15  4:23               ` Junio C Hamano
2017-05-16  6:06                 ` Brian Malehorn
2017-05-16  6:06                   ` [PATCH] interpret-trailers: honor the cut line Brian Malehorn
2017-05-16  6:42                   ` [PATCH] interpret-trailers: obey scissors lines Junio C Hamano
2017-05-15  3:08           ` Jeff King
2017-05-15  2:12         ` Junio C Hamano
2017-05-15  3:07           ` Jeff King
2017-05-15  3:32             ` Junio C Hamano
2017-05-15  3:33               ` Jeff King
2017-05-14  7:02       ` Ævar Arnfjörð Bjarmason

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).