git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 0/4] handle multiline in-body headers
@ 2016-09-19 21:08 Jonathan Tan
  2016-09-19 21:08 ` [PATCH v2 1/4] mailinfo: separate in-body header processing Jonathan Tan
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-19 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

This is an iteration of the patch set after the discussion in
 <cover.1474047135.git.jonathantanmy@google.com>.

Changes:
o largely rewritten to follow Junio's suggested design (refactor of
  check_header to separate out ">From" and "[PATCH]", and an
  is_inbody_header function performing an airtight check on whether a
  line is an in-body header)
o simpler try_convert_to_utf8 API
o one file of the expected output of t/t5100/*0015 is modified (instead
  of the input)
o t/t5100/*0018--no-inbody-headers test files added
o example in commit message improved following Peff's suggestion

Jonathan Tan (4):
  mailinfo: separate in-body header processing
  mailinfo: refactor to support utf8 decode attempts
  mailinfo: make is_scissors_line take plain char *
  mailinfo: handle in-body header continuations

 mailinfo.c                           | 164 ++++++++++++++++++++++++++---------
 mailinfo.h                           |   1 +
 t/t4150-am.sh                        |  23 +++++
 t/t5100-mailinfo.sh                  |   2 +-
 t/t5100/info0018                     |   5 ++
 t/t5100/info0018--no-inbody-headers  |   5 ++
 t/t5100/msg0015                      |   2 -
 t/t5100/msg0018                      |   2 +
 t/t5100/msg0018--no-inbody-headers   |   8 ++
 t/t5100/patch0018                    |   6 ++
 t/t5100/patch0018--no-inbody-headers |   6 ++
 t/t5100/sample.mbox                  |  19 ++++
 12 files changed, 198 insertions(+), 45 deletions(-)
 create mode 100644 t/t5100/info0018
 create mode 100644 t/t5100/info0018--no-inbody-headers
 create mode 100644 t/t5100/msg0018
 create mode 100644 t/t5100/msg0018--no-inbody-headers
 create mode 100644 t/t5100/patch0018
 create mode 100644 t/t5100/patch0018--no-inbody-headers

-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/4] mailinfo: separate in-body header processing
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
@ 2016-09-19 21:08 ` Jonathan Tan
  2016-09-19 21:08 ` [PATCH v2 2/4] mailinfo: refactor to support utf8 decode attempts Jonathan Tan
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-19 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

The check_header function contains logic specific to in-body headers,
although it is invoked during both the processing of actual headers and
in-body headers. Separate out the in-body header part into its own
function.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 mailinfo.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index e19abe3..0c4738a 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -495,21 +495,6 @@ static int check_header(struct mailinfo *mi,
 		goto check_header_out;
 	}
 
-	/* for inbody stuff */
-	if (starts_with(line->buf, ">From") && isspace(line->buf[5])) {
-		ret = is_format_patch_separator(line->buf + 1, line->len - 1);
-		goto check_header_out;
-	}
-	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
-		for (i = 0; header[i]; i++) {
-			if (!strcmp("Subject", header[i])) {
-				handle_header(&hdr_data[i], line);
-				ret = 1;
-				goto check_header_out;
-			}
-		}
-	}
-
 check_header_out:
 	strbuf_release(&sb);
 	return ret;
@@ -623,6 +608,22 @@ static int is_scissors_line(const struct strbuf *line)
 		gap * 2 < perforation);
 }
 
+static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
+{
+	if (starts_with(line->buf, ">From") && isspace(line->buf[5]))
+		return is_format_patch_separator(line->buf + 1, line->len - 1);
+	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
+		int i;
+		for (i = 0; header[i]; i++)
+			if (!strcmp("Subject", header[i])) {
+				handle_header(&mi->s_hdr_data[i], line);
+				return 1;
+			}
+		return 0;
+	}
+	return check_header(mi, line, mi->s_hdr_data, 0);
+}
+
 static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 {
 	assert(!mi->filter_stage);
@@ -633,7 +634,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	}
 
 	if (mi->use_inbody_headers && mi->header_stage) {
-		mi->header_stage = check_header(mi, line, mi->s_hdr_data, 0);
+		mi->header_stage = check_inbody_header(mi, line);
 		if (mi->header_stage)
 			return 0;
 	} else
-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/4] mailinfo: refactor to support utf8 decode attempts
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
  2016-09-19 21:08 ` [PATCH v2 1/4] mailinfo: separate in-body header processing Jonathan Tan
@ 2016-09-19 21:08 ` Jonathan Tan
  2016-09-19 21:08 ` [PATCH v2 3/4] mailinfo: make is_scissors_line take plain char * Jonathan Tan
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-19 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

mailinfo.c currently has a convert_to_utf8 function that overrides its
argument and prints an error message when the decoding fails. Refactor
it, creating another function that does not override anything and does
not print an error message when the decoding fails. This is to be used
by a subsequent patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 mailinfo.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 0c4738a..aadad09 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -340,23 +340,47 @@ static struct strbuf *decode_b_segment(const struct strbuf *b_seg)
 	return out;
 }
 
+/*
+ * Attempts to convert line into UTF-8.
+ *
+ * This differs from convert_to_utf8 in that no inputs are overridden, and that
+ * conversion non-success is not considered an error case - mi->input_error is
+ * not set, and no error message is printed.
+ *
+ * If the conversion is unnecessary, returns 1 and stores NULL in result.
+ *
+ * If the conversion is successful, returns 1 and stores the converted string
+ * in result.
+ *
+ * If the conversion is unsuccessful, returns 0 and stores NULL in result.
+ */
+static int try_convert_to_utf8(const struct mailinfo *mi, const char *line,
+			       const char *charset, char **result)
+{
+	if (!mi->metainfo_charset || !charset || !*charset ||
+	    same_encoding(mi->metainfo_charset, charset)) {
+		*result = NULL;
+		return 1;
+	}
+
+	*result = reencode_string(line, mi->metainfo_charset, charset);
+	return !!*result;
+}
+
+/*
+ * Converts line into UTF-8, setting mi->input_error to -1 upon failure.
+ */
 static int convert_to_utf8(struct mailinfo *mi,
 			   struct strbuf *line, const char *charset)
 {
 	char *out;
-
-	if (!mi->metainfo_charset || !charset || !*charset)
-		return 0;
-
-	if (same_encoding(mi->metainfo_charset, charset))
-		return 0;
-	out = reencode_string(line->buf, mi->metainfo_charset, charset);
-	if (!out) {
+	if (!try_convert_to_utf8(mi, line->buf, charset, &out)) {
 		mi->input_error = -1;
 		return error("cannot convert from %s to %s",
 			     charset, mi->metainfo_charset);
 	}
-	strbuf_attach(line, out, strlen(out), strlen(out));
+	if (out)
+		strbuf_attach(line, out, strlen(out), strlen(out));
 	return 0;
 }
 
-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/4] mailinfo: make is_scissors_line take plain char *
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
  2016-09-19 21:08 ` [PATCH v2 1/4] mailinfo: separate in-body header processing Jonathan Tan
  2016-09-19 21:08 ` [PATCH v2 2/4] mailinfo: refactor to support utf8 decode attempts Jonathan Tan
@ 2016-09-19 21:08 ` Jonathan Tan
  2016-09-19 21:08 ` [PATCH v2 4/4] mailinfo: handle in-body header continuations Jonathan Tan
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-19 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

The is_scissors_line takes a struct strbuf * when a char * would
suffice. Make it take char *.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 mailinfo.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index aadad09..1f487df 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -581,37 +581,35 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }
 
-static int is_scissors_line(const struct strbuf *line)
+static int is_scissors_line(const char *line)
 {
-	size_t i, len = line->len;
+	const char *c;
 	int scissors = 0, gap = 0;
-	int first_nonblank = -1;
-	int last_nonblank = 0, visible, perforation = 0, in_perforation = 0;
-	const char *buf = line->buf;
+	const char *first_nonblank = NULL, *last_nonblank = NULL;
+	int visible, perforation = 0, in_perforation = 0;
 
-	for (i = 0; i < len; i++) {
-		if (isspace(buf[i])) {
+	for (c = line; *c; c++) {
+		if (isspace(*c)) {
 			if (in_perforation) {
 				perforation++;
 				gap++;
 			}
 			continue;
 		}
-		last_nonblank = i;
-		if (first_nonblank < 0)
-			first_nonblank = i;
-		if (buf[i] == '-') {
+		last_nonblank = c;
+		if (first_nonblank == NULL)
+			first_nonblank = c;
+		if (*c == '-') {
 			in_perforation = 1;
 			perforation++;
 			continue;
 		}
-		if (i + 1 < len &&
-		    (!memcmp(buf + i, ">8", 2) || !memcmp(buf + i, "8<", 2) ||
-		     !memcmp(buf + i, ">%", 2) || !memcmp(buf + i, "%<", 2))) {
+		if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
+		     !memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;
-			i++;
+			c++;
 			continue;
 		}
 		in_perforation = 0;
@@ -626,7 +624,10 @@ static int is_scissors_line(const struct strbuf *line)
 	 * than half of the perforation.
 	 */
 
-	visible = last_nonblank - first_nonblank + 1;
+	if (first_nonblank && last_nonblank)
+		visible = last_nonblank - first_nonblank + 1;
+	else
+		visible = 0;
 	return (scissors && 8 <= visible &&
 		visible < perforation * 3 &&
 		gap * 2 < perforation);
@@ -671,7 +672,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	if (convert_to_utf8(mi, line, mi->charset.buf))
 		return 0; /* mi->input_error already set */
 
-	if (mi->use_scissors && is_scissors_line(line)) {
+	if (mi->use_scissors && is_scissors_line(line->buf)) {
 		int i;
 
 		strbuf_setlen(&mi->log_message, 0);
-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 4/4] mailinfo: handle in-body header continuations
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
                   ` (2 preceding siblings ...)
  2016-09-19 21:08 ` [PATCH v2 3/4] mailinfo: make is_scissors_line take plain char * Jonathan Tan
@ 2016-09-19 21:08 ` Jonathan Tan
  2016-09-19 21:39 ` [PATCH v2 0/4] handle multiline in-body headers Junio C Hamano
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-19 21:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

Mailinfo currently handles multi-line headers, but it does not handle
multi-line in-body headers. Teach it to handle such headers, for
example, for this input:

  From: author <author@example.com>
  Date: Fri, 9 Jun 2006 00:44:16 -0700
  Subject: a very long
   broken line

  Subject: another very long
   broken line

interpret the in-body subject to be "another very long broken line"
instead of "another very long".

An existing test (t/t5100/msg0015) has an indented line immediately
after an in-body header - it has been modified to reflect the new
functionality.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 mailinfo.c                           | 56 +++++++++++++++++++++++++++++++++++-
 mailinfo.h                           |  1 +
 t/t4150-am.sh                        | 23 +++++++++++++++
 t/t5100-mailinfo.sh                  |  2 +-
 t/t5100/info0018                     |  5 ++++
 t/t5100/info0018--no-inbody-headers  |  5 ++++
 t/t5100/msg0015                      |  2 --
 t/t5100/msg0018                      |  2 ++
 t/t5100/msg0018--no-inbody-headers   |  8 ++++++
 t/t5100/patch0018                    |  6 ++++
 t/t5100/patch0018--no-inbody-headers |  6 ++++
 t/t5100/sample.mbox                  | 19 ++++++++++++
 12 files changed, 131 insertions(+), 4 deletions(-)
 create mode 100644 t/t5100/info0018
 create mode 100644 t/t5100/info0018--no-inbody-headers
 create mode 100644 t/t5100/msg0018
 create mode 100644 t/t5100/msg0018--no-inbody-headers
 create mode 100644 t/t5100/patch0018
 create mode 100644 t/t5100/patch0018--no-inbody-headers

diff --git a/mailinfo.c b/mailinfo.c
index 1f487df..aec3c52 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -524,6 +524,21 @@ static int check_header(struct mailinfo *mi,
 	return ret;
 }
 
+/*
+ * Returns 1 if the given line or any line beginning with the given line is an
+ * in-body header (that is, check_header will succeed when passed
+ * mi->s_hdr_data).
+ */
+static int is_inbody_header(const struct mailinfo *mi,
+			    const struct strbuf *line)
+{
+	int i;
+	for (i = 0; header[i]; i++)
+		if (!mi->s_hdr_data[i] && cmp_header(line, header[i]))
+			return 1;
+	return 0;
+}
+
 static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 {
 	struct strbuf *ret;
@@ -633,8 +648,39 @@ static int is_scissors_line(const char *line)
 		gap * 2 < perforation);
 }
 
+static void flush_inbody_header_accum(struct mailinfo *mi)
+{
+	if (!mi->inbody_header_accum.len)
+		return;
+	assert(check_header(mi, &mi->inbody_header_accum, mi->s_hdr_data, 0));
+	strbuf_reset(&mi->inbody_header_accum);
+}
+
 static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 {
+	if (mi->inbody_header_accum.len &&
+	    (line->buf[0] == ' ' || line->buf[0] == '\t')) {
+		char *utf8 = NULL;
+		int is_scissors = mi->use_scissors && 
+				  try_convert_to_utf8(mi, line->buf,
+						      mi->charset.buf, &utf8) &&
+				  is_scissors_line(utf8 ? utf8 : line->buf);
+		free(utf8);
+		if (is_scissors) {
+			/*
+			 * This is a scissors line; do not consider this line
+			 * as a header continuation line.
+			 */
+			flush_inbody_header_accum(mi);
+			return 0;
+		}
+		strbuf_strip_suffix(&mi->inbody_header_accum, "\n");
+		strbuf_addbuf(&mi->inbody_header_accum, line);
+		return 1;
+	}
+
+	flush_inbody_header_accum(mi);
+
 	if (starts_with(line->buf, ">From") && isspace(line->buf[5]))
 		return is_format_patch_separator(line->buf + 1, line->len - 1);
 	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
@@ -646,7 +692,11 @@ static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 			}
 		return 0;
 	}
-	return check_header(mi, line, mi->s_hdr_data, 0);
+	if (is_inbody_header(mi, line)) {
+		strbuf_addbuf(&mi->inbody_header_accum, line);
+		return 1;
+	}
+	return 0;
 }
 
 static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
@@ -912,6 +962,8 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
 			break;
 	} while (!strbuf_getwholeline(line, mi->input, '\n'));
 
+	flush_inbody_header_accum(mi);
+
 handle_body_out:
 	strbuf_release(&prev);
 }
@@ -1027,6 +1079,7 @@ void setup_mailinfo(struct mailinfo *mi)
 	strbuf_init(&mi->email, 0);
 	strbuf_init(&mi->charset, 0);
 	strbuf_init(&mi->log_message, 0);
+	strbuf_init(&mi->inbody_header_accum, 0);
 	mi->header_stage = 1;
 	mi->use_inbody_headers = 1;
 	mi->content_top = mi->content;
@@ -1040,6 +1093,7 @@ void clear_mailinfo(struct mailinfo *mi)
 	strbuf_release(&mi->name);
 	strbuf_release(&mi->email);
 	strbuf_release(&mi->charset);
+	strbuf_release(&mi->inbody_header_accum);
 	free(mi->message_id);
 
 	for (i = 0; mi->p_hdr_data[i]; i++)
diff --git a/mailinfo.h b/mailinfo.h
index 93776a7..04a2535 100644
--- a/mailinfo.h
+++ b/mailinfo.h
@@ -27,6 +27,7 @@ struct mailinfo {
 	int patch_lines;
 	int filter_stage; /* still reading log or are we copying patch? */
 	int header_stage; /* still checking in-body headers? */
+	struct strbuf inbody_header_accum;
 	struct strbuf **p_hdr_data;
 	struct strbuf **s_hdr_data;
 
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 9ce9424..89a5bac 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -977,4 +977,27 @@ test_expect_success 'am --patch-format=mboxrd handles mboxrd' '
 	test_cmp msg out
 '
 
+test_expect_success 'am works with multi-line in-body headers' '
+	FORTY="String that has a length of more than forty characters" &&
+	LONG="$FORTY $FORTY" &&
+	rm -fr .git/rebase-apply &&
+	git checkout -f first &&
+	echo one >> file &&
+	git commit -am "$LONG" --author="$LONG <long@example.com>" &&
+	git format-patch --stdout -1 >patch &&
+	# bump from, date, and subject down to in-body header
+	perl -lpe "
+		if (/^From:/) {
+			print \"From: x <x\@example.com>\";
+			print \"Date: Sat, 1 Jan 2000 00:00:00 +0000\";
+			print \"Subject: x\n\";
+		}
+	" patch >msg &&
+	git checkout HEAD^ &&
+	git am msg &&
+	# Ensure that the author and full message are present
+	git cat-file commit HEAD | grep "^author.*long@example.com" &&
+	git cat-file commit HEAD | grep "^$LONG"
+'
+
 test_done
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 1a5a546..e173c33 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -11,7 +11,7 @@ test_expect_success 'split sample box' \
 	'git mailsplit -o. "$TEST_DIRECTORY"/t5100/sample.mbox >last &&
 	last=$(cat last) &&
 	echo total is $last &&
-	test $(cat last) = 17'
+	test $(cat last) = 18'
 
 check_mailinfo () {
 	mail=$1 opt=$2
diff --git a/t/t5100/info0018 b/t/t5100/info0018
new file mode 100644
index 0000000..d53e749
--- /dev/null
+++ b/t/t5100/info0018
@@ -0,0 +1,5 @@
+Author: Another Thor
+Email: a.thor@example.com
+Subject: This one contains a tab and a space
+Date: Fri, 9 Jun 2006 00:44:16 -0700
+
diff --git a/t/t5100/info0018--no-inbody-headers b/t/t5100/info0018--no-inbody-headers
new file mode 100644
index 0000000..30b17bd
--- /dev/null
+++ b/t/t5100/info0018--no-inbody-headers
@@ -0,0 +1,5 @@
+Author: A U Thor
+Email: a.u.thor@example.com
+Subject: check multiline inbody headers
+Date: Fri, 9 Jun 2006 00:44:16 -0700
+
diff --git a/t/t5100/msg0015 b/t/t5100/msg0015
index 4abb3d5..e69de29 100644
--- a/t/t5100/msg0015
+++ b/t/t5100/msg0015
@@ -1,2 +0,0 @@
-  - a list
-  - of stuff
diff --git a/t/t5100/msg0018 b/t/t5100/msg0018
new file mode 100644
index 0000000..56de83d
--- /dev/null
+++ b/t/t5100/msg0018
@@ -0,0 +1,2 @@
+a commit message
+
diff --git a/t/t5100/msg0018--no-inbody-headers b/t/t5100/msg0018--no-inbody-headers
new file mode 100644
index 0000000..b1e05d3
--- /dev/null
+++ b/t/t5100/msg0018--no-inbody-headers
@@ -0,0 +1,8 @@
+From: Another Thor
+ <a.thor@example.com>
+Subject: This one contains
+	a tab
+ and a space
+
+a commit message
+
diff --git a/t/t5100/patch0018 b/t/t5100/patch0018
new file mode 100644
index 0000000..789df6d
--- /dev/null
+++ b/t/t5100/patch0018
@@ -0,0 +1,6 @@
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++content
diff --git a/t/t5100/patch0018--no-inbody-headers b/t/t5100/patch0018--no-inbody-headers
new file mode 100644
index 0000000..789df6d
--- /dev/null
+++ b/t/t5100/patch0018--no-inbody-headers
@@ -0,0 +1,6 @@
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++content
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index 8b2ae06..6d4d0e4 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -699,3 +699,22 @@ index e69de29..d95f3ad 100644
 +++ b/foo
 @@ -0,0 +1 @@
 +New content
+From nobody Mon Sep 17 00:00:00 2001
+From: A U Thor <a.u.thor@example.com>
+Subject: check multiline inbody headers
+Date: Fri, 9 Jun 2006 00:44:16 -0700
+
+From: Another Thor
+ <a.thor@example.com>
+Subject: This one contains
+	a tab
+ and a space
+
+a commit message
+
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++content
-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/4] handle multiline in-body headers
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
                   ` (3 preceding siblings ...)
  2016-09-19 21:08 ` [PATCH v2 4/4] mailinfo: handle in-body header continuations Jonathan Tan
@ 2016-09-19 21:39 ` Junio C Hamano
  2016-09-20 17:17 ` [PATCH v3 0/3] " Jonathan Tan
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2016-09-19 21:39 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> This is an iteration of the patch set after the discussion in
>  <cover.1474047135.git.jonathantanmy@google.com>.
>
> Changes:
> o largely rewritten to follow Junio's suggested design (refactor of
>   check_header to separate out ">From" and "[PATCH]", and an
>   is_inbody_header function performing an airtight check on whether a
>   line is an in-body header)
> o simpler try_convert_to_utf8 API
> o one file of the expected output of t/t5100/*0015 is modified (instead
>   of the input)
> o t/t5100/*0018--no-inbody-headers test files added
> o example in commit message improved following Peff's suggestion

Overall it looks much nicer, but I still do not understand why we
want to do try-convert-to-utf8.  The _only_ caller of that helper
function is in 4/4 where it tries to convert the line to see if a
line is a scissors line.

It seems to me that the only thing doing so gives us is that you
could later enhance the definition of what a scissors-line looks
like by adding unicode characters [*1*], but I would strongly
suggest not doing that kind of enhancement [*2*].  With the current
definition of what a scissors-line looks like, it can be checked
before you do the UTF-8 conversion, I think, which would mean that
the helper is not strictly necessary.

So...


[Footnotes]

*1* E.g. ✂ encoded in non-UTF8 may have to be converted to UTF-8
first before being recognized as such.

*2* Encouraging people to use non-ASCII that older version of Git
did not take for structural things like "what is a scissors line"
merely for the looks is a bad trade-off; patch e-mails that use the
enhanced definition will break for those with older version of Git,
and the benefit of "prettier scissors" is not really worth it, I
would think.






^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3 0/3] handle multiline in-body headers
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
                   ` (4 preceding siblings ...)
  2016-09-19 21:39 ` [PATCH v2 0/4] handle multiline in-body headers Junio C Hamano
@ 2016-09-20 17:17 ` Jonathan Tan
  2016-09-20 23:06   ` Jeff King
  2016-09-21 17:24   ` Junio C Hamano
  2016-09-20 17:17 ` [PATCH v3 1/3] mailinfo: separate in-body header processing Jonathan Tan
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-20 17:17 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Changes since v2:
o Removed utf8 translation before scissors line check in
  check_inbody_header (I was thinking of support for encodings like
  UTF-16, but I guess those don't work with the current reencode_string
  anyway since it uses strlen internally)

With the above change, it is actually no longer necessary to make
is_scissors_line take plain char * (the second patch) - I think that
that patch still improves the code, but let me know if you want me to
remove it from this patch set.

Jonathan Tan (3):
  mailinfo: separate in-body header processing
  mailinfo: make is_scissors_line take plain char *
  mailinfo: handle in-body header continuations

 mailinfo.c                           | 116 +++++++++++++++++++++++++----------
 mailinfo.h                           |   1 +
 t/t4150-am.sh                        |  23 +++++++
 t/t5100-mailinfo.sh                  |   2 +-
 t/t5100/info0018                     |   5 ++
 t/t5100/info0018--no-inbody-headers  |   5 ++
 t/t5100/msg0015                      |   2 -
 t/t5100/msg0018                      |   2 +
 t/t5100/msg0018--no-inbody-headers   |   8 +++
 t/t5100/patch0018                    |   6 ++
 t/t5100/patch0018--no-inbody-headers |   6 ++
 t/t5100/sample.mbox                  |  19 ++++++
 12 files changed, 159 insertions(+), 36 deletions(-)
 create mode 100644 t/t5100/info0018
 create mode 100644 t/t5100/info0018--no-inbody-headers
 create mode 100644 t/t5100/msg0018
 create mode 100644 t/t5100/msg0018--no-inbody-headers
 create mode 100644 t/t5100/patch0018
 create mode 100644 t/t5100/patch0018--no-inbody-headers

-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3 1/3] mailinfo: separate in-body header processing
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
                   ` (5 preceding siblings ...)
  2016-09-20 17:17 ` [PATCH v3 0/3] " Jonathan Tan
@ 2016-09-20 17:17 ` Jonathan Tan
  2016-09-20 17:17 ` [PATCH v3 2/3] mailinfo: make is_scissors_line take plain char * Jonathan Tan
  2016-09-20 17:17 ` [PATCH v3 3/3] mailinfo: handle in-body header continuations Jonathan Tan
  8 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-20 17:17 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

The check_header function contains logic specific to in-body headers,
although it is invoked during both the processing of actual headers and
in-body headers. Separate out the in-body header part into its own
function.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 mailinfo.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index e19abe3..0c4738a 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -495,21 +495,6 @@ static int check_header(struct mailinfo *mi,
 		goto check_header_out;
 	}
 
-	/* for inbody stuff */
-	if (starts_with(line->buf, ">From") && isspace(line->buf[5])) {
-		ret = is_format_patch_separator(line->buf + 1, line->len - 1);
-		goto check_header_out;
-	}
-	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
-		for (i = 0; header[i]; i++) {
-			if (!strcmp("Subject", header[i])) {
-				handle_header(&hdr_data[i], line);
-				ret = 1;
-				goto check_header_out;
-			}
-		}
-	}
-
 check_header_out:
 	strbuf_release(&sb);
 	return ret;
@@ -623,6 +608,22 @@ static int is_scissors_line(const struct strbuf *line)
 		gap * 2 < perforation);
 }
 
+static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
+{
+	if (starts_with(line->buf, ">From") && isspace(line->buf[5]))
+		return is_format_patch_separator(line->buf + 1, line->len - 1);
+	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
+		int i;
+		for (i = 0; header[i]; i++)
+			if (!strcmp("Subject", header[i])) {
+				handle_header(&mi->s_hdr_data[i], line);
+				return 1;
+			}
+		return 0;
+	}
+	return check_header(mi, line, mi->s_hdr_data, 0);
+}
+
 static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 {
 	assert(!mi->filter_stage);
@@ -633,7 +634,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	}
 
 	if (mi->use_inbody_headers && mi->header_stage) {
-		mi->header_stage = check_header(mi, line, mi->s_hdr_data, 0);
+		mi->header_stage = check_inbody_header(mi, line);
 		if (mi->header_stage)
 			return 0;
 	} else
-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 2/3] mailinfo: make is_scissors_line take plain char *
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
                   ` (6 preceding siblings ...)
  2016-09-20 17:17 ` [PATCH v3 1/3] mailinfo: separate in-body header processing Jonathan Tan
@ 2016-09-20 17:17 ` Jonathan Tan
  2016-09-20 17:17 ` [PATCH v3 3/3] mailinfo: handle in-body header continuations Jonathan Tan
  8 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-20 17:17 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

The is_scissors_line takes a struct strbuf * when a char * would
suffice. Make it take char *.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 mailinfo.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 0c4738a..69391aa 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -557,37 +557,35 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }
 
-static int is_scissors_line(const struct strbuf *line)
+static int is_scissors_line(const char *line)
 {
-	size_t i, len = line->len;
+	const char *c;
 	int scissors = 0, gap = 0;
-	int first_nonblank = -1;
-	int last_nonblank = 0, visible, perforation = 0, in_perforation = 0;
-	const char *buf = line->buf;
+	const char *first_nonblank = NULL, *last_nonblank = NULL;
+	int visible, perforation = 0, in_perforation = 0;
 
-	for (i = 0; i < len; i++) {
-		if (isspace(buf[i])) {
+	for (c = line; *c; c++) {
+		if (isspace(*c)) {
 			if (in_perforation) {
 				perforation++;
 				gap++;
 			}
 			continue;
 		}
-		last_nonblank = i;
-		if (first_nonblank < 0)
-			first_nonblank = i;
-		if (buf[i] == '-') {
+		last_nonblank = c;
+		if (first_nonblank == NULL)
+			first_nonblank = c;
+		if (*c == '-') {
 			in_perforation = 1;
 			perforation++;
 			continue;
 		}
-		if (i + 1 < len &&
-		    (!memcmp(buf + i, ">8", 2) || !memcmp(buf + i, "8<", 2) ||
-		     !memcmp(buf + i, ">%", 2) || !memcmp(buf + i, "%<", 2))) {
+		if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
+		     !memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;
-			i++;
+			c++;
 			continue;
 		}
 		in_perforation = 0;
@@ -602,7 +600,10 @@ static int is_scissors_line(const struct strbuf *line)
 	 * than half of the perforation.
 	 */
 
-	visible = last_nonblank - first_nonblank + 1;
+	if (first_nonblank && last_nonblank)
+		visible = last_nonblank - first_nonblank + 1;
+	else
+		visible = 0;
 	return (scissors && 8 <= visible &&
 		visible < perforation * 3 &&
 		gap * 2 < perforation);
@@ -647,7 +648,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	if (convert_to_utf8(mi, line, mi->charset.buf))
 		return 0; /* mi->input_error already set */
 
-	if (mi->use_scissors && is_scissors_line(line)) {
+	if (mi->use_scissors && is_scissors_line(line->buf)) {
 		int i;
 
 		strbuf_setlen(&mi->log_message, 0);
-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 3/3] mailinfo: handle in-body header continuations
  2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
                   ` (7 preceding siblings ...)
  2016-09-20 17:17 ` [PATCH v3 2/3] mailinfo: make is_scissors_line take plain char * Jonathan Tan
@ 2016-09-20 17:17 ` Jonathan Tan
  8 siblings, 0 replies; 12+ messages in thread
From: Jonathan Tan @ 2016-09-20 17:17 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Mailinfo currently handles multi-line headers, but it does not handle
multi-line in-body headers. Teach it to handle such headers, for
example, for this input:

  From: author <author@example.com>
  Date: Fri, 9 Jun 2006 00:44:16 -0700
  Subject: a very long
   broken line

  Subject: another very long
   broken line

interpret the in-body subject to be "another very long broken line"
instead of "another very long".

An existing test (t/t5100/msg0015) has an indented line immediately
after an in-body header - it has been modified to reflect the new
functionality.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 mailinfo.c                           | 50 +++++++++++++++++++++++++++++++++++-
 mailinfo.h                           |  1 +
 t/t4150-am.sh                        | 23 +++++++++++++++++
 t/t5100-mailinfo.sh                  |  2 +-
 t/t5100/info0018                     |  5 ++++
 t/t5100/info0018--no-inbody-headers  |  5 ++++
 t/t5100/msg0015                      |  2 --
 t/t5100/msg0018                      |  2 ++
 t/t5100/msg0018--no-inbody-headers   |  8 ++++++
 t/t5100/patch0018                    |  6 +++++
 t/t5100/patch0018--no-inbody-headers |  6 +++++
 t/t5100/sample.mbox                  | 19 ++++++++++++++
 12 files changed, 125 insertions(+), 4 deletions(-)
 create mode 100644 t/t5100/info0018
 create mode 100644 t/t5100/info0018--no-inbody-headers
 create mode 100644 t/t5100/msg0018
 create mode 100644 t/t5100/msg0018--no-inbody-headers
 create mode 100644 t/t5100/patch0018
 create mode 100644 t/t5100/patch0018--no-inbody-headers

diff --git a/mailinfo.c b/mailinfo.c
index 69391aa..2275b28 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -500,6 +500,21 @@ static int check_header(struct mailinfo *mi,
 	return ret;
 }
 
+/*
+ * Returns 1 if the given line or any line beginning with the given line is an
+ * in-body header (that is, check_header will succeed when passed
+ * mi->s_hdr_data).
+ */
+static int is_inbody_header(const struct mailinfo *mi,
+			    const struct strbuf *line)
+{
+	int i;
+	for (i = 0; header[i]; i++)
+		if (!mi->s_hdr_data[i] && cmp_header(line, header[i]))
+			return 1;
+	return 0;
+}
+
 static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 {
 	struct strbuf *ret;
@@ -609,8 +624,33 @@ static int is_scissors_line(const char *line)
 		gap * 2 < perforation);
 }
 
+static void flush_inbody_header_accum(struct mailinfo *mi)
+{
+	if (!mi->inbody_header_accum.len)
+		return;
+	assert(check_header(mi, &mi->inbody_header_accum, mi->s_hdr_data, 0));
+	strbuf_reset(&mi->inbody_header_accum);
+}
+
 static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 {
+	if (mi->inbody_header_accum.len &&
+	    (line->buf[0] == ' ' || line->buf[0] == '\t')) {
+		if (mi->use_scissors && is_scissors_line(line->buf)) {
+			/*
+			 * This is a scissors line; do not consider this line
+			 * as a header continuation line.
+			 */
+			flush_inbody_header_accum(mi);
+			return 0;
+		}
+		strbuf_strip_suffix(&mi->inbody_header_accum, "\n");
+		strbuf_addbuf(&mi->inbody_header_accum, line);
+		return 1;
+	}
+
+	flush_inbody_header_accum(mi);
+
 	if (starts_with(line->buf, ">From") && isspace(line->buf[5]))
 		return is_format_patch_separator(line->buf + 1, line->len - 1);
 	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
@@ -622,7 +662,11 @@ static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 			}
 		return 0;
 	}
-	return check_header(mi, line, mi->s_hdr_data, 0);
+	if (is_inbody_header(mi, line)) {
+		strbuf_addbuf(&mi->inbody_header_accum, line);
+		return 1;
+	}
+	return 0;
 }
 
 static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
@@ -888,6 +932,8 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
 			break;
 	} while (!strbuf_getwholeline(line, mi->input, '\n'));
 
+	flush_inbody_header_accum(mi);
+
 handle_body_out:
 	strbuf_release(&prev);
 }
@@ -1003,6 +1049,7 @@ void setup_mailinfo(struct mailinfo *mi)
 	strbuf_init(&mi->email, 0);
 	strbuf_init(&mi->charset, 0);
 	strbuf_init(&mi->log_message, 0);
+	strbuf_init(&mi->inbody_header_accum, 0);
 	mi->header_stage = 1;
 	mi->use_inbody_headers = 1;
 	mi->content_top = mi->content;
@@ -1016,6 +1063,7 @@ void clear_mailinfo(struct mailinfo *mi)
 	strbuf_release(&mi->name);
 	strbuf_release(&mi->email);
 	strbuf_release(&mi->charset);
+	strbuf_release(&mi->inbody_header_accum);
 	free(mi->message_id);
 
 	for (i = 0; mi->p_hdr_data[i]; i++)
diff --git a/mailinfo.h b/mailinfo.h
index 93776a7..04a2535 100644
--- a/mailinfo.h
+++ b/mailinfo.h
@@ -27,6 +27,7 @@ struct mailinfo {
 	int patch_lines;
 	int filter_stage; /* still reading log or are we copying patch? */
 	int header_stage; /* still checking in-body headers? */
+	struct strbuf inbody_header_accum;
 	struct strbuf **p_hdr_data;
 	struct strbuf **s_hdr_data;
 
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 9ce9424..89a5bac 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -977,4 +977,27 @@ test_expect_success 'am --patch-format=mboxrd handles mboxrd' '
 	test_cmp msg out
 '
 
+test_expect_success 'am works with multi-line in-body headers' '
+	FORTY="String that has a length of more than forty characters" &&
+	LONG="$FORTY $FORTY" &&
+	rm -fr .git/rebase-apply &&
+	git checkout -f first &&
+	echo one >> file &&
+	git commit -am "$LONG" --author="$LONG <long@example.com>" &&
+	git format-patch --stdout -1 >patch &&
+	# bump from, date, and subject down to in-body header
+	perl -lpe "
+		if (/^From:/) {
+			print \"From: x <x\@example.com>\";
+			print \"Date: Sat, 1 Jan 2000 00:00:00 +0000\";
+			print \"Subject: x\n\";
+		}
+	" patch >msg &&
+	git checkout HEAD^ &&
+	git am msg &&
+	# Ensure that the author and full message are present
+	git cat-file commit HEAD | grep "^author.*long@example.com" &&
+	git cat-file commit HEAD | grep "^$LONG"
+'
+
 test_done
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 1a5a546..e173c33 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -11,7 +11,7 @@ test_expect_success 'split sample box' \
 	'git mailsplit -o. "$TEST_DIRECTORY"/t5100/sample.mbox >last &&
 	last=$(cat last) &&
 	echo total is $last &&
-	test $(cat last) = 17'
+	test $(cat last) = 18'
 
 check_mailinfo () {
 	mail=$1 opt=$2
diff --git a/t/t5100/info0018 b/t/t5100/info0018
new file mode 100644
index 0000000..d53e749
--- /dev/null
+++ b/t/t5100/info0018
@@ -0,0 +1,5 @@
+Author: Another Thor
+Email: a.thor@example.com
+Subject: This one contains a tab and a space
+Date: Fri, 9 Jun 2006 00:44:16 -0700
+
diff --git a/t/t5100/info0018--no-inbody-headers b/t/t5100/info0018--no-inbody-headers
new file mode 100644
index 0000000..30b17bd
--- /dev/null
+++ b/t/t5100/info0018--no-inbody-headers
@@ -0,0 +1,5 @@
+Author: A U Thor
+Email: a.u.thor@example.com
+Subject: check multiline inbody headers
+Date: Fri, 9 Jun 2006 00:44:16 -0700
+
diff --git a/t/t5100/msg0015 b/t/t5100/msg0015
index 4abb3d5..e69de29 100644
--- a/t/t5100/msg0015
+++ b/t/t5100/msg0015
@@ -1,2 +0,0 @@
-  - a list
-  - of stuff
diff --git a/t/t5100/msg0018 b/t/t5100/msg0018
new file mode 100644
index 0000000..56de83d
--- /dev/null
+++ b/t/t5100/msg0018
@@ -0,0 +1,2 @@
+a commit message
+
diff --git a/t/t5100/msg0018--no-inbody-headers b/t/t5100/msg0018--no-inbody-headers
new file mode 100644
index 0000000..b1e05d3
--- /dev/null
+++ b/t/t5100/msg0018--no-inbody-headers
@@ -0,0 +1,8 @@
+From: Another Thor
+ <a.thor@example.com>
+Subject: This one contains
+	a tab
+ and a space
+
+a commit message
+
diff --git a/t/t5100/patch0018 b/t/t5100/patch0018
new file mode 100644
index 0000000..789df6d
--- /dev/null
+++ b/t/t5100/patch0018
@@ -0,0 +1,6 @@
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++content
diff --git a/t/t5100/patch0018--no-inbody-headers b/t/t5100/patch0018--no-inbody-headers
new file mode 100644
index 0000000..789df6d
--- /dev/null
+++ b/t/t5100/patch0018--no-inbody-headers
@@ -0,0 +1,6 @@
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++content
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index 8b2ae06..6d4d0e4 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -699,3 +699,22 @@ index e69de29..d95f3ad 100644
 +++ b/foo
 @@ -0,0 +1 @@
 +New content
+From nobody Mon Sep 17 00:00:00 2001
+From: A U Thor <a.u.thor@example.com>
+Subject: check multiline inbody headers
+Date: Fri, 9 Jun 2006 00:44:16 -0700
+
+From: Another Thor
+ <a.thor@example.com>
+Subject: This one contains
+	a tab
+ and a space
+
+a commit message
+
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++content
-- 
2.10.0.rc2.20.g5b18e70


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 0/3] handle multiline in-body headers
  2016-09-20 17:17 ` [PATCH v3 0/3] " Jonathan Tan
@ 2016-09-20 23:06   ` Jeff King
  2016-09-21 17:24   ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff King @ 2016-09-20 23:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On Tue, Sep 20, 2016 at 10:17:50AM -0700, Jonathan Tan wrote:

> Changes since v2:
> o Removed utf8 translation before scissors line check in
>   check_inbody_header (I was thinking of support for encodings like
>   UTF-16, but I guess those don't work with the current reencode_string
>   anyway since it uses strlen internally)

Yeah, I'd be surprised if UTF-16 works very well with our code in
general. If we want to address that, though, the sanest thing is
probably to convert it internally to UTF-8 when we remove the transfer
encoding in handle_body().

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 0/3] handle multiline in-body headers
  2016-09-20 17:17 ` [PATCH v3 0/3] " Jonathan Tan
  2016-09-20 23:06   ` Jeff King
@ 2016-09-21 17:24   ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2016-09-21 17:24 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> With the above change, it is actually no longer necessary to make
> is_scissors_line take plain char * (the second patch) - I think that
> that patch still improves the code, but let me know if you want me to
> remove it from this patch set.

I agree with you that it is an independently good change.  Let's
keep it.

Overall looked very good.  Thanks, will queue.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-09-21 17:25 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-19 21:08 [PATCH v2 0/4] handle multiline in-body headers Jonathan Tan
2016-09-19 21:08 ` [PATCH v2 1/4] mailinfo: separate in-body header processing Jonathan Tan
2016-09-19 21:08 ` [PATCH v2 2/4] mailinfo: refactor to support utf8 decode attempts Jonathan Tan
2016-09-19 21:08 ` [PATCH v2 3/4] mailinfo: make is_scissors_line take plain char * Jonathan Tan
2016-09-19 21:08 ` [PATCH v2 4/4] mailinfo: handle in-body header continuations Jonathan Tan
2016-09-19 21:39 ` [PATCH v2 0/4] handle multiline in-body headers Junio C Hamano
2016-09-20 17:17 ` [PATCH v3 0/3] " Jonathan Tan
2016-09-20 23:06   ` Jeff King
2016-09-21 17:24   ` Junio C Hamano
2016-09-20 17:17 ` [PATCH v3 1/3] mailinfo: separate in-body header processing Jonathan Tan
2016-09-20 17:17 ` [PATCH v3 2/3] mailinfo: make is_scissors_line take plain char * Jonathan Tan
2016-09-20 17:17 ` [PATCH v3 3/3] mailinfo: handle in-body header continuations Jonathan Tan

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).