git@vger.kernel.org mailing list mirror (one of many)
* [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues
@ 2012-10-18 14:43 Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 1/7] utf8: fix off-by-one wrapping of text Jan H. Schönherr
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

Hi all.

[This is the second version of this series. If you still remember
the first version, you might want to jump directly to the summary
of changes below.]

The main point of this series is to teach git to encode my name
correctly, see patches 5+6, so that the decoded version is actually
my name, so that send-email does not insist on adding a wrong
superfluous From: line to the mail body.

The other patches are mostly by-products that fix other issues
I came across.

Patch 1 fixes an old off-by-one error, so that wrapped text may
now use all available columns.

Patches 2 and 3 make the wrapping of header lines more correct,
i. e., neither too early nor too late.

Patch 4 does some refactoring, which is too unrelated to be included
in one of the later patches.

Patch 5 improves RFC 2047 encoding; patch 6 removes an old workaround
that does not conform to the RFCs.

Patch 7 is more of an RFC; it seems to be a good idea from my point
of view. Indeed, I thought the current implementation was erroneous
until Junio C Hamano pointed out that this might be desired behavior.
So please make up your own mind about this one.


The series is currently based on the maint branch, but it applies
to master as well. It does also apply to next, but then my
implementation of isprint() has to be dropped from patch 5.


Changes in v2:
- patch 1 is new and is a result of the v1 discussion
- patch 5+6 split the old patch 4 into two patches
- use of constants for maximum line lengths
- even better adherence to RFC 2047 than v1
- updated commit messages/comments


Regards
Jan

Jan H. Schönherr (7):
  utf8: fix off-by-one wrapping of text
  format-patch: do not wrap non-rfc2047 headers too early
  format-patch: do not wrap rfc2047 encoded headers too late
  format-patch: introduce helper function last_line_length()
  format-patch: make rfc2047 encoding more strict
  format-patch: fix rfc2047 address encoding with respect to rfc822
    specials
  format-patch tests: check quoting/encoding in To: and Cc: headers

 git-compat-util.h       |   2 +
 pretty.c                | 149 +++++++++++++++++++++++--------
 t/t4014-format-patch.sh | 231 ++++++++++++++++++++++++++++++------------------
 t/t4202-log.sh          |   4 +-
 utf8.c                  |   2 +-
 5 files changed, 262 insertions(+), 126 deletions(-)

-- 
1.7.12


* [PATCH v2 1/7] utf8: fix off-by-one wrapping of text
  2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
@ 2012-10-18 14:43 ` Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 2/7] format-patch: do not wrap non-rfc2047 headers too early Jan H. Schönherr
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

From: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>

The wrapping logic in strbuf_add_wrapped_text() currently does not allow
lines that entirely fill the allowed width; instead, it wraps the line one
character too early.

For example, the text "This is the sixth commit." formatted via
"%w(11,1,2)" (wrap at 11 characters, 1 char indent of first line, 2 char
indent of following lines) results in four lines: " This is", "  the",
"  sixth", "  commit." This is wrong, because "  the sixth" is exactly
11 characters long, and thus allowed.

Fix this by allowing the character at position width+1 of a line to be a
valid wrapping point if it is a whitespace character.

Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
---
v2: new patch, result of v1 discussion
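
Illustration only, not part of the patch: a minimal sketch of the
off-by-one using a hypothetical greedy wrapper. wrap_words() and its
allow_full flag are made-up names and do not mirror the internals of
strbuf_add_wrapped_text(); the sketch only reproduces the line counts
from the commit message for "%w(11,1,2)".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical greedy word wrapper (not git's strbuf_add_wrapped_text()).
 * Wraps text into out[] and returns the number of lines, or -1 if out is
 * too small. With allow_full == 0 a line holds at most width - 1 columns
 * (the behavior described as wrong above); otherwise a line may use all
 * width columns.
 */
static int wrap_words(const char *text, int indent1, int indent2, int width,
		      int allow_full, char *out, size_t outsz)
{
	int limit = allow_full ? width : width - 1;
	int col = 0, lines = 0;
	size_t used = 0;

	if (outsz == 0)
		return -1;
	out[0] = '\0';
	for (;;) {
		int wlen, n;

		text += strspn(text, " ");	/* skip blanks between words */
		if (!*text)
			break;
		wlen = (int)strcspn(text, " ");

		if (lines == 0 || col + 1 + wlen > limit) {
			/* Start a new line with the matching indent. */
			int indent = lines ? indent2 : indent1;
			n = snprintf(out + used, outsz - used, "%s%*s%.*s",
				     lines ? "\n" : "", indent, "", wlen, text);
			col = indent + wlen;
			lines++;
		} else {
			/* The word still fits after a single space. */
			n = snprintf(out + used, outsz - used, " %.*s",
				     wlen, text);
			col += 1 + wlen;
		}
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
		text += wlen;
	}
	return lines;
}
```

With "This is the sixth commit.", indents 1 and 2, and width 11, the
strict variant yields four lines and the inclusive variant yields three,
because "  the sixth" is exactly 11 columns wide.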
---
 t/t4202-log.sh | 4 ++--
 utf8.c         | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index b3ac6be..584e3d8 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -72,9 +72,9 @@ cat > expect << EOF
   commit.
 EOF
 
-test_expect_success 'format %w(12,1,2)' '
+test_expect_success 'format %w(11,1,2)' '
 
-	git log -2 --format="%w(12,1,2)This is the %s commit." > actual &&
+	git log -2 --format="%w(11,1,2)This is the %s commit." > actual &&
 	test_cmp expect actual
 '
 
diff --git a/utf8.c b/utf8.c
index a544f15..28791a7 100644
--- a/utf8.c
+++ b/utf8.c
@@ -353,7 +353,7 @@ retry:
 
 		c = *text;
 		if (!c || isspace(c)) {
-			if (w < width || !space) {
+			if (w <= width || !space) {
 				const char *start = bol;
 				if (!c && text == start)
 					return w;
-- 
1.7.12


* [PATCH v2 2/7] format-patch: do not wrap non-rfc2047 headers too early
  2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 1/7] utf8: fix off-by-one wrapping of text Jan H. Schönherr
@ 2012-10-18 14:43 ` Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 3/7] format-patch: do not wrap rfc2047 encoded headers too late Jan H. Schönherr
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

From: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>

Do not wrap the second and later lines of non-rfc2047-encoded headers
substantially before the 78 character limit.

Instead of passing the remaining length of the first line as wrapping
width, use the correct maximum length and tell strbuf_add_wrapped_bytes()
how many characters of the first line are already used.

Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
---
v2:
- removed off-by-one correction now handled by first patch
- commit message clarifications
---
 pretty.c                |  2 +-
 t/t4014-format-patch.sh | 60 ++++++++++++++++++++++++++++---------------------
 2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/pretty.c b/pretty.c
index 8b1ea9f..71e4024 100644
--- a/pretty.c
+++ b/pretty.c
@@ -286,7 +286,7 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 		if ((i + 1 < len) && (ch == '=' && line[i+1] == '?'))
 			goto needquote;
 	}
-	strbuf_add_wrapped_bytes(sb, line, len, 0, 1, max_length - line_len);
+	strbuf_add_wrapped_bytes(sb, line, len, -line_len, 1, max_length);
 	return;
 
 needquote:
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 959aa26..d66e358 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -752,16 +752,14 @@ M64=$M8$M8$M8$M8$M8$M8$M8$M8
 M512=$M64$M64$M64$M64$M64$M64$M64$M64
 cat >expect <<'EOF'
 Subject: [PATCH] foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
- bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
- foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
- bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
- foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
- bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
- foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
- bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
- foo bar foo bar foo bar foo bar
+ bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
+ foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
+ bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
+ foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
+ bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
+ foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
 EOF
-test_expect_success 'format-patch wraps extremely long headers (ascii)' '
+test_expect_success 'format-patch wraps extremely long subject (ascii)' '
 	echo content >>file &&
 	git add file &&
 	git commit -m "$M512" &&
@@ -807,28 +805,12 @@ test_expect_success 'format-patch wraps extremely long headers (rfc2047)' '
 	test_cmp expect subject
 '
 
-M8="foo_bar_"
-M64=$M8$M8$M8$M8$M8$M8$M8$M8
-cat >expect <<EOF
-From: $M64
- <foobar@foo.bar>
-EOF
-test_expect_success 'format-patch wraps non-quotable headers' '
-	rm -rf patches/ &&
-	echo content >>file &&
-	git add file &&
-	git commit -mfoo --author "$M64 <foobar@foo.bar>" &&
-	git format-patch --stdout -1 >patch &&
-	sed -n "/^From: /p; /^ /p; /^$/q" <patch >from &&
-	test_cmp expect from
-'
-
 check_author() {
 	echo content >>file &&
 	git add file &&
 	GIT_AUTHOR_NAME=$1 git commit -m author-check &&
 	git format-patch --stdout -1 >patch &&
-	grep ^From: patch >actual &&
+	sed -n "/^From: /p; /^ /p; /^$/q" <patch >actual &&
 	test_cmp expect actual
 }
 
@@ -853,6 +835,32 @@ test_expect_success 'rfc2047-encoded headers also double-quote 822 specials' '
 	check_author "Föo B. Bar"
 '
 
+cat >expect <<EOF
+From: foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_
+ <author@example.com>
+EOF
+test_expect_success 'format-patch wraps moderately long from-header (ascii)' '
+	check_author "foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_foo_bar_"
+'
+
+cat >expect <<'EOF'
+From: Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar
+ Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo
+ Bar Foo Bar Foo Bar Foo Bar <author@example.com>
+EOF
+test_expect_success 'format-patch wraps extremely long from-header (ascii)' '
+	check_author "Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar"
+'
+
+cat >expect <<'EOF'
+From: "Foo.Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar
+ Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo
+ Bar Foo Bar Foo Bar Foo Bar" <author@example.com>
+EOF
+test_expect_success 'format-patch wraps extremely long from-header (rfc822)' '
+	check_author "Foo.Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar"
+'
+
 cat >expect <<'EOF'
 Subject: header with . in it
 EOF
-- 
1.7.12


* [PATCH v2 3/7] format-patch: do not wrap rfc2047 encoded headers too late
  2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 1/7] utf8: fix off-by-one wrapping of text Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 2/7] format-patch: do not wrap non-rfc2047 headers too early Jan H. Schönherr
@ 2012-10-18 14:43 ` Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 4/7] format-patch: introduce helper function last_line_length() Jan H. Schönherr
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

From: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>

Encoded characters add more than one character at once to an encoded
header. Include all characters that are about to be added in the length
calculation for wrapping.

Additionally, RFC 2047 imposes a maximum line length of 76 characters
if that line contains an rfc2047 encoded word.

Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
---
v2:
- use constants for both the 76 and the 78 character limits
- rephrase comment
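
Illustration only, not part of the patch: a self-contained sketch of the
length check described above, under simplifying assumptions. q_encode(),
needs_escape() and longest_line() are hypothetical helpers working on
plain C strings instead of a strbuf, and needs_escape() is a reduced
stand-in for is_rfc2047_special(). The point is that the fold decision
counts the full cost of the next character (1 or 3 columns) plus the 2
columns of the closing "?=".

```c
#include <stdio.h>
#include <string.h>

#define MAX_ENCODED_LENGTH 76	/* RFC 2047 limit for lines with encoded words */

/* Reduced stand-in for is_rfc2047_special(), good enough for subjects. */
static int needs_escape(unsigned char ch)
{
	return ch < 0x21 || ch > 0x7e || ch == '=' || ch == '?' || ch == '_';
}

/*
 * Q-encode in[] into out[], folding before a line would exceed the limit.
 * line_len is the number of columns already used on the current line.
 * Returns 0 on success, -1 if out[] is too small.
 */
static int q_encode(const char *in, const char *charset, int line_len,
		    char *out, size_t outsz)
{
	size_t used = 0;
	int n;

	n = snprintf(out, outsz, "=?%s?q?", charset);
	if (n < 0 || (size_t)n >= outsz)
		return -1;
	used = (size_t)n;
	line_len += n;

	for (; *in; in++) {
		unsigned char ch = (unsigned char)*in;
		int cost = needs_escape(ch) ? 3 : 1;

		/* Would this character plus the closing "?=" overflow? */
		if (line_len + 2 + cost > MAX_ENCODED_LENGTH) {
			n = snprintf(out + used, outsz - used,
				     "?=\n =?%s?q?", charset);
			if (n < 0 || (size_t)n >= outsz - used)
				return -1;
			used += (size_t)n;
			line_len = (int)strlen(charset) + 5 + 1; /* =??q? + SP */
		}

		if (cost == 3)
			n = snprintf(out + used, outsz - used, "=%02X",
				     (unsigned)ch);
		else
			n = snprintf(out + used, outsz - used, "%c", ch);
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
		line_len += cost;
	}

	n = snprintf(out + used, outsz - used, "?=");
	return (n < 0 || (size_t)n >= outsz - used) ? -1 : 0;
}

/* Length of the longest line in s, not counting the newline. */
static size_t longest_line(const char *s)
{
	size_t best = 0;

	while (*s) {
		size_t n = strcspn(s, "\n");
		if (n > best)
			best = n;
		s += n;
		if (*s == '\n')
			s++;
	}
	return best;
}
```

Called with line_len = 17 (the width of "Subject: [PATCH] ") and a
repeated "föö bar" input, the first fold lands after "=20f", which
matches the first expected line in the updated test below.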
---
 pretty.c                | 26 +++++++++++++---------
 t/t4014-format-patch.sh | 58 +++++++++++++++++++++++++++++--------------------
 2 files changed, 51 insertions(+), 33 deletions(-)

diff --git a/pretty.c b/pretty.c
index 71e4024..da75879 100644
--- a/pretty.c
+++ b/pretty.c
@@ -263,6 +263,9 @@ static void add_rfc822_quoted(struct strbuf *out, const char *s, int len)
 
 static int is_rfc2047_special(char ch)
 {
+	if (ch == ' ' || ch == '\n')
+		return 1;
+
 	return (non_ascii(ch) || (ch == '=') || (ch == '?') || (ch == '_'));
 }
 
@@ -270,6 +273,7 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 		       const char *encoding)
 {
 	static const int max_length = 78; /* per rfc2822 */
+	static const int max_encoded_length = 76; /* per rfc2047 */
 	int i;
 	int line_len;
 
@@ -295,23 +299,25 @@ needquote:
 	line_len += strlen(encoding) + 5; /* 5 for =??q? */
 	for (i = 0; i < len; i++) {
 		unsigned ch = line[i] & 0xFF;
+		int is_special = is_rfc2047_special(ch);
+
+		/*
+		 * According to RFC 2047, we could encode the special character
+		 * ' ' (space) with '_' (underscore) for readability. But many
+		 * programs do not understand this and just leave the
+		 * underscore in place. Thus, we do nothing special here, which
+		 * causes ' ' to be encoded as '=20', avoiding this problem.
+		 */
 
-		if (line_len >= max_length - 2) {
+		if (line_len + 2 + (is_special ? 3 : 1) > max_encoded_length) {
 			strbuf_addf(sb, "?=\n =?%s?q?", encoding);
 			line_len = strlen(encoding) + 5 + 1; /* =??q? plus SP */
 		}
 
-		/*
-		 * We encode ' ' using '=20' even though rfc2047
-		 * allows using '_' for readability.  Unfortunately,
-		 * many programs do not understand this and just
-		 * leave the underscore in place.
-		 */
-		if (is_rfc2047_special(ch) || ch == ' ' || ch == '\n') {
+		if (is_special) {
 			strbuf_addf(sb, "=%02X", ch);
 			line_len += 3;
-		}
-		else {
+		} else {
 			strbuf_addch(sb, ch);
 			line_len++;
 		}
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index d66e358..1d5636d 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -772,30 +772,31 @@ M8="föö bar "
 M64=$M8$M8$M8$M8$M8$M8$M8$M8
 M512=$M64$M64$M64$M64$M64$M64$M64$M64
 cat >expect <<'EOF'
-Subject: [PATCH] =?UTF-8?q?f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
- =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar?=
+Subject: [PATCH] =?UTF-8?q?f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f?=
+ =?UTF-8?q?=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar?=
+ =?UTF-8?q?=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20?=
+ =?UTF-8?q?bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6?=
+ =?UTF-8?q?=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3?=
+ =?UTF-8?q?=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3?=
+ =?UTF-8?q?=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f?=
+ =?UTF-8?q?=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar?=
+ =?UTF-8?q?=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20?=
+ =?UTF-8?q?bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6?=
+ =?UTF-8?q?=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3?=
+ =?UTF-8?q?=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3?=
+ =?UTF-8?q?=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f?=
+ =?UTF-8?q?=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar?=
+ =?UTF-8?q?=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20?=
+ =?UTF-8?q?bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6?=
+ =?UTF-8?q?=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3?=
+ =?UTF-8?q?=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3?=
+ =?UTF-8?q?=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f?=
+ =?UTF-8?q?=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar?=
 EOF
-test_expect_success 'format-patch wraps extremely long headers (rfc2047)' '
+test_expect_success 'format-patch wraps extremely long subject (rfc2047)' '
 	rm -rf patches/ &&
 	echo content >>file &&
 	git add file &&
@@ -862,6 +863,17 @@ test_expect_success 'format-patch wraps extremely long from-header (rfc822)' '
 '
 
 cat >expect <<'EOF'
+From: =?UTF-8?q?Fo=C3=B6=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo?=
+ =?UTF-8?q?=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20?=
+ =?UTF-8?q?Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar?=
+ =?UTF-8?q?=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20Foo=20Bar=20?=
+ =?UTF-8?q?Foo=20Bar=20Foo=20Bar?= <author@example.com>
+EOF
+test_expect_success 'format-patch wraps extremely long from-header (rfc2047)' '
+	check_author "Foö Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar"
+'
+
+cat >expect <<'EOF'
 Subject: header with . in it
 EOF
 test_expect_success 'subject lines do not have 822 atom-quoting' '
-- 
1.7.12


* [PATCH v2 4/7] format-patch: introduce helper function last_line_length()
  2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
                   ` (2 preceding siblings ...)
  2012-10-18 14:43 ` [PATCH v2 3/7] format-patch: do not wrap rfc2047 encoded headers too late Jan H. Schönherr
@ 2012-10-18 14:43 ` Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 5/7] format-patch: make rfc2047 encoding more strict Jan H. Schönherr
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

From: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>

Currently, an open-coded loop to calculate the length of the last
line of a string buffer is used in multiple places.

Move that code into a function of its own.

Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
---
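Illustration only, not part of the patch: the same backwards scan on a
plain buffer, for readers without the strbuf API at hand. The name
tail_line_length() and its (buf, len) signature are assumptions made
for this sketch.

```c
#include <stddef.h>

/*
 * Number of bytes after the last '\n' in buf[0..len), or len if the
 * buffer contains no newline at all.
 */
static size_t tail_line_length(const char *buf, size_t len)
{
	size_t i = len;

	/* Walk back until just after a newline or the buffer start. */
	while (i > 0 && buf[i - 1] != '\n')
		i--;
	return len - i;
}
```

A buffer that ends in a newline reports 0, so a caller that has just
started a new line sees no columns in use.
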
 pretty.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/pretty.c b/pretty.c
index da75879..482402d 100644
--- a/pretty.c
+++ b/pretty.c
@@ -240,6 +240,17 @@ static int has_rfc822_specials(const char *s, int len)
 	return 0;
 }
 
+static int last_line_length(struct strbuf *sb)
+{
+	int i;
+
+	/* How many bytes are already used on the last line? */
+	for (i = sb->len - 1; i >= 0; i--)
+		if (sb->buf[i] == '\n')
+			break;
+	return sb->len - (i + 1);
+}
+
 static void add_rfc822_quoted(struct strbuf *out, const char *s, int len)
 {
 	int i;
@@ -275,13 +286,7 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 	static const int max_length = 78; /* per rfc2822 */
 	static const int max_encoded_length = 76; /* per rfc2047 */
 	int i;
-	int line_len;
-
-	/* How many bytes are already used on the current line? */
-	for (i = sb->len - 1; i >= 0; i--)
-		if (sb->buf[i] == '\n')
-			break;
-	line_len = sb->len - (i+1);
+	int line_len = last_line_length(sb);
 
 	for (i = 0; i < len; i++) {
 		int ch = line[i];
@@ -346,7 +351,6 @@ void pp_user_info(const struct pretty_print_context *pp,
 	if (pp->fmt == CMIT_FMT_EMAIL) {
 		char *name_tail = strchr(line, '<');
 		int display_name_length;
-		int final_line;
 		if (!name_tail)
 			return;
 		while (line < name_tail && isspace(name_tail[-1]))
@@ -361,10 +365,7 @@ void pp_user_info(const struct pretty_print_context *pp,
 			add_rfc2047(sb, quoted.buf, quoted.len, encoding);
 			strbuf_release(&quoted);
 		}
-		for (final_line = 0; final_line < sb->len; final_line++)
-			if (sb->buf[sb->len - final_line - 1] == '\n')
-				break;
-		if (namelen - display_name_length + final_line > 78) {
+		if (namelen - display_name_length + last_line_length(sb) > 78) {
 			strbuf_addch(sb, '\n');
 			if (!isspace(name_tail[0]))
 				strbuf_addch(sb, ' ');
-- 
1.7.12


* [PATCH v2 5/7] format-patch: make rfc2047 encoding more strict
  2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
                   ` (3 preceding siblings ...)
  2012-10-18 14:43 ` [PATCH v2 4/7] format-patch: introduce helper function last_line_length() Jan H. Schönherr
@ 2012-10-18 14:43 ` Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 6/7] format-patch: fix rfc2047 address encoding with respect to rfc822 specials Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 7/7] format-patch tests: check quoting/encoding in To: and Cc: headers Jan H. Schönherr
  6 siblings, 0 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

From: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>

RFC 2047 requires more characters to be encoded than is currently done.
In particular, RFC 2047 distinguishes between the characters allowed in
encoded words in addresses (From, To, etc.) and those allowed in other
headers, such as Subject.

Make add_rfc2047() and is_rfc2047_special() location dependent and include
all non-allowed characters, so that the output hopefully conforms to
RFC 2047.

In particular, this fixes a problem where RFC 822 specials (e.g. ".") were
left unencoded in addresses. In the past this was solved with a
non-standard workaround, which is going to be removed in a follow-up patch.

Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
---
v2:
- part of restructured patch 4 of v1
- disallow even more characters in is_rfc2047_special()

The implementation of isprint() should later probably be substituted by
the one from Nguyen:
http://article.gmane.org/gmane.comp.version-control.git/207666
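
Illustration only, not part of the patch: a compact sketch of the
location-dependent check, assuming plain ASCII range tests instead of
git's sane_ctype table. The names header_kind and must_escape() are
hypothetical.

```c
#include <ctype.h>

enum header_kind {
	HEADER_TEXT,	/* unstructured headers such as Subject */
	HEADER_PHRASE	/* display names in From, To, Cc */
};

/* Returns 1 if ch has to be written as =XX inside a Q-encoded word. */
static int must_escape(unsigned char ch, enum header_kind kind)
{
	/* Control characters, DEL and 8-bit bytes are always escaped. */
	if (ch < 0x20 || ch > 0x7e)
		return 1;

	/* Printable, but reserved by RFC 2047 section 4.2. */
	if (ch == ' ' || ch == '=' || ch == '?' || ch == '_')
		return 1;

	if (kind != HEADER_PHRASE)
		return 0;

	/* RFC 2047 section 5.3: only a small set may stay literal. */
	return !(isalnum(ch) || ch == '!' || ch == '*' || ch == '+' ||
		 ch == '-' || ch == '/');
}
```

Under these rules a "." stays literal in a Subject but becomes "=2E" in
a From display name, which is what the new test expectation for
"Föo B. Bar" relies on.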
---
 git-compat-util.h       |  2 ++
 pretty.c                | 67 +++++++++++++++++++++++++++++++++++++++++++------
 t/t4014-format-patch.sh | 15 ++++++++---
 3 files changed, 72 insertions(+), 12 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 000042d..d4ea446 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -475,6 +475,7 @@ extern const char tolower_trans_tbl[256];
 #undef isdigit
 #undef isalpha
 #undef isalnum
+#undef isprint
 #undef islower
 #undef isupper
 #undef tolower
@@ -492,6 +493,7 @@ extern unsigned char sane_ctype[256];
 #define isdigit(x) sane_istest(x,GIT_DIGIT)
 #define isalpha(x) sane_istest(x,GIT_ALPHA)
 #define isalnum(x) sane_istest(x,GIT_ALPHA | GIT_DIGIT)
+#define isprint(x) ((x) >= 0x20 && (x) <= 0x7e)
 #define islower(x) sane_iscase(x, 1)
 #define isupper(x) sane_iscase(x, 0)
 #define is_glob_special(x) sane_istest(x,GIT_GLOB_SPECIAL)
diff --git a/pretty.c b/pretty.c
index 482402d..613e4ea 100644
--- a/pretty.c
+++ b/pretty.c
@@ -272,16 +272,65 @@ static void add_rfc822_quoted(struct strbuf *out, const char *s, int len)
 	strbuf_addch(out, '"');
 }
 
-static int is_rfc2047_special(char ch)
+enum rfc2047_type {
+	RFC2047_SUBJECT,
+	RFC2047_ADDRESS,
+};
+
+static int is_rfc2047_special(char ch, enum rfc2047_type type)
 {
-	if (ch == ' ' || ch == '\n')
+	/*
+	 * rfc2047, section 4.2:
+	 *
+	 *    8-bit values which correspond to printable ASCII characters other
+	 *    than "=", "?", and "_" (underscore), MAY be represented as those
+	 *    characters.  (But see section 5 for restrictions.)  In
+	 *    particular, SPACE and TAB MUST NOT be represented as themselves
+	 *    within encoded words.
+	 */
+
+	/*
+	 * rule out non-ASCII characters and non-printable characters (the
+	 * non-ASCII check should be redundant as isprint() is not localized
+	 * and only knows about ASCII, but be defensive about that)
+	 */
+	if (non_ascii(ch) || !isprint(ch))
+		return 1;
+
+	/*
+	 * rule out special printable characters (' ' should be the only
+	 * whitespace character considered printable, but be defensive and use
+	 * isspace())
+	 */
+	if (isspace(ch) || ch == '=' || ch == '?' || ch == '_')
 		return 1;
 
-	return (non_ascii(ch) || (ch == '=') || (ch == '?') || (ch == '_'));
+	/*
+	 * rfc2047, section 5.3:
+	 *
+	 *    As a replacement for a 'word' entity within a 'phrase', for example,
+	 *    one that precedes an address in a From, To, or Cc header.  The ABNF
+	 *    definition for 'phrase' from RFC 822 thus becomes:
+	 *
+	 *    phrase = 1*( encoded-word / word )
+	 *
+	 *    In this case the set of characters that may be used in a "Q"-encoded
+	 *    'encoded-word' is restricted to: <upper and lower case ASCII
+	 *    letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_"
+	 *    (underscore, ASCII 95.)>.  An 'encoded-word' that appears within a
+	 *    'phrase' MUST be separated from any adjacent 'word', 'text' or
+	 *    'special' by 'linear-white-space'.
+	 */
+
+	if (type != RFC2047_ADDRESS)
+		return 0;
+
+	/* '=' and '_' are special cases and have been checked above */
+	return !(isalnum(ch) || ch == '!' || ch == '*' || ch == '+' || ch == '-' || ch == '/');
 }
 
 static void add_rfc2047(struct strbuf *sb, const char *line, int len,
-		       const char *encoding)
+		       const char *encoding, enum rfc2047_type type)
 {
 	static const int max_length = 78; /* per rfc2822 */
 	static const int max_encoded_length = 76; /* per rfc2047 */
@@ -304,7 +353,7 @@ needquote:
 	line_len += strlen(encoding) + 5; /* 5 for =??q? */
 	for (i = 0; i < len; i++) {
 		unsigned ch = line[i] & 0xFF;
-		int is_special = is_rfc2047_special(ch);
+		int is_special = is_rfc2047_special(ch, type);
 
 		/*
 		 * According to RFC 2047, we could encode the special character
@@ -358,11 +407,13 @@ void pp_user_info(const struct pretty_print_context *pp,
 		display_name_length = name_tail - line;
 		strbuf_addstr(sb, "From: ");
 		if (!has_rfc822_specials(line, display_name_length)) {
-			add_rfc2047(sb, line, display_name_length, encoding);
+			add_rfc2047(sb, line, display_name_length,
+						encoding, RFC2047_ADDRESS);
 		} else {
 			struct strbuf quoted = STRBUF_INIT;
 			add_rfc822_quoted(&quoted, line, display_name_length);
-			add_rfc2047(sb, quoted.buf, quoted.len, encoding);
+			add_rfc2047(sb, quoted.buf, quoted.len,
+						encoding, RFC2047_ADDRESS);
 			strbuf_release(&quoted);
 		}
 		if (namelen - display_name_length + last_line_length(sb) > 78) {
@@ -1294,7 +1345,7 @@ void pp_title_line(const struct pretty_print_context *pp,
 	strbuf_grow(sb, title.len + 1024);
 	if (pp->subject) {
 		strbuf_addstr(sb, pp->subject);
-		add_rfc2047(sb, title.buf, title.len, encoding);
+		add_rfc2047(sb, title.buf, title.len, encoding, RFC2047_SUBJECT);
 	} else {
 		strbuf_addbuf(sb, &title);
 	}
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 1d5636d..727d606 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -818,21 +818,28 @@ check_author() {
 cat >expect <<'EOF'
 From: "Foo B. Bar" <author@example.com>
 EOF
-test_expect_success 'format-patch quotes dot in headers' '
+test_expect_success 'format-patch quotes dot in from-headers' '
 	check_author "Foo B. Bar"
 '
 
 cat >expect <<'EOF'
 From: "Foo \"The Baz\" Bar" <author@example.com>
 EOF
-test_expect_success 'format-patch quotes double-quote in headers' '
+test_expect_success 'format-patch quotes double-quote in from-headers' '
 	check_author "Foo \"The Baz\" Bar"
 '
 
 cat >expect <<'EOF'
-From: =?UTF-8?q?"F=C3=B6o=20B.=20Bar"?= <author@example.com>
+From: =?UTF-8?q?F=C3=B6o=20Bar?= <author@example.com>
 EOF
-test_expect_success 'rfc2047-encoded headers also double-quote 822 specials' '
+test_expect_success 'format-patch uses rfc2047-encoded from-headers when necessary' '
+	check_author "Föo Bar"
+'
+
+cat >expect <<'EOF'
+From: =?UTF-8?q?F=C3=B6o=20B=2E=20Bar?= <author@example.com>
+EOF
+test_expect_failure 'rfc2047-encoded from-headers leave no rfc822 specials' '
 	check_author "Föo B. Bar"
 '
 
-- 
1.7.12


* [PATCH v2 6/7] format-patch: fix rfc2047 address encoding with respect to rfc822 specials
  2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
                   ` (4 preceding siblings ...)
  2012-10-18 14:43 ` [PATCH v2 5/7] format-patch: make rfc2047 encoding more strict Jan H. Schönherr
@ 2012-10-18 14:43 ` Jan H. Schönherr
  2012-10-18 14:43 ` [PATCH v2 7/7] format-patch tests: check quoting/encoding in To: and Cc: headers Jan H. Schönherr
  6 siblings, 0 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

From: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>

According to RFC 2047 and RFC 822, rfc2047 encoded words and rfc822
quoted strings do not mix. Since add_rfc2047() no longer leaves RFC 822
specials behind, the quoting is also no longer necessary to create a
standards-conformant mail.

Remove the quoting when RFC 2047 encoding takes place. This requires
refactoring add_rfc2047() a bit, so that the different cases can be
distinguished.

With this patch, my own name gets correctly decoded as Jan H. Schönherr
(without quotes) and not as "Jan H. Schönherr" (with quotes).

Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
---
v2:
- part of restructured patch 4 of v1
- use constants for both the 76 and the 78 character limits
- select correct maximum length for possible final folding
- removed off-by-one correction now handled by first patch
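
Illustration only, not part of the patch: a sketch of the three-way
decision that the refactoring makes possible (encode, quote, or emit
as-is). choose_name_form() and the name_form enum are hypothetical; the
encoding test follows the loop in needs_rfc2047_encoding(), and the
quoting test uses the specials listed in RFC 822 rather than git's
is_rfc822_special().

```c
#include <string.h>

enum name_form { NAME_PLAIN, NAME_QUOTED, NAME_ENCODED };

/* Decide how a display name would have to be written in a From: header. */
static enum name_form choose_name_form(const char *name)
{
	size_t i, len = strlen(name);

	/* Encoding wins: 8-bit bytes, newlines, or a literal "=?". */
	for (i = 0; i < len; i++) {
		unsigned char ch = (unsigned char)name[i];

		if (ch >= 0x80 || ch == '\n')
			return NAME_ENCODED;
		if (ch == '=' && name[i + 1] == '?')
			return NAME_ENCODED;
	}

	/* Otherwise quote if any RFC 822 special is present. */
	if (strpbrk(name, "()<>@,;:\\\".[]"))
		return NAME_QUOTED;

	return NAME_PLAIN;
}
```

Checking for encoding first is what keeps quotes out of the encoded
word: a name that needs encoding never reaches the quoting branch.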
---
 pretty.c                | 49 ++++++++++++++++++++++++++++++++-----------------
 t/t4014-format-patch.sh |  2 +-
 2 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/pretty.c b/pretty.c
index 613e4ea..413e758 100644
--- a/pretty.c
+++ b/pretty.c
@@ -231,7 +231,7 @@ static int is_rfc822_special(char ch)
 	}
 }
 
-static int has_rfc822_specials(const char *s, int len)
+static int needs_rfc822_quoting(const char *s, int len)
 {
 	int i;
 	for (i = 0; i < len; i++)
@@ -329,25 +329,29 @@ static int is_rfc2047_special(char ch, enum rfc2047_type type)
 	return !(isalnum(ch) || ch == '!' || ch == '*' || ch == '+' || ch == '-' || ch == '/');
 }
 
-static void add_rfc2047(struct strbuf *sb, const char *line, int len,
-		       const char *encoding, enum rfc2047_type type)
+static int needs_rfc2047_encoding(const char *line, int len,
+				  enum rfc2047_type type)
 {
-	static const int max_length = 78; /* per rfc2822 */
-	static const int max_encoded_length = 76; /* per rfc2047 */
 	int i;
-	int line_len = last_line_length(sb);
 
 	for (i = 0; i < len; i++) {
 		int ch = line[i];
 		if (non_ascii(ch) || ch == '\n')
-			goto needquote;
+			return 1;
 		if ((i + 1 < len) && (ch == '=' && line[i+1] == '?'))
-			goto needquote;
+			return 1;
 	}
-	strbuf_add_wrapped_bytes(sb, line, len, -line_len, 1, max_length);
-	return;
 
-needquote:
+	return 0;
+}
+
+static void add_rfc2047(struct strbuf *sb, const char *line, int len,
+		       const char *encoding, enum rfc2047_type type)
+{
+	static const int max_encoded_length = 76; /* per rfc2047 */
+	int i;
+	int line_len = last_line_length(sb);
+
 	strbuf_grow(sb, len * 3 + strlen(encoding) + 100);
 	strbuf_addf(sb, "=?%s?q?", encoding);
 	line_len += strlen(encoding) + 5; /* 5 for =??q? */
@@ -383,6 +387,7 @@ void pp_user_info(const struct pretty_print_context *pp,
 		  const char *what, struct strbuf *sb,
 		  const char *line, const char *encoding)
 {
+	int max_length = 78; /* per rfc2822 */
 	char *date;
 	int namelen;
 	unsigned long time;
@@ -406,17 +411,21 @@ void pp_user_info(const struct pretty_print_context *pp,
 			name_tail--;
 		display_name_length = name_tail - line;
 		strbuf_addstr(sb, "From: ");
-		if (!has_rfc822_specials(line, display_name_length)) {
+		if (needs_rfc2047_encoding(line, display_name_length, RFC2047_ADDRESS)) {
 			add_rfc2047(sb, line, display_name_length,
 						encoding, RFC2047_ADDRESS);
-		} else {
+			max_length = 76; /* per rfc2047 */
+		} else if (needs_rfc822_quoting(line, display_name_length)) {
 			struct strbuf quoted = STRBUF_INIT;
 			add_rfc822_quoted(&quoted, line, display_name_length);
-			add_rfc2047(sb, quoted.buf, quoted.len,
-						encoding, RFC2047_ADDRESS);
+			strbuf_add_wrapped_bytes(sb, quoted.buf, quoted.len,
+							-6, 1, max_length);
 			strbuf_release(&quoted);
+		} else {
+			strbuf_add_wrapped_bytes(sb, line, display_name_length,
+							-6, 1, max_length);
 		}
-		if (namelen - display_name_length + last_line_length(sb) > 78) {
+		if (namelen - display_name_length + last_line_length(sb) > max_length) {
 			strbuf_addch(sb, '\n');
 			if (!isspace(name_tail[0]))
 				strbuf_addch(sb, ' ');
@@ -1336,6 +1345,7 @@ void pp_title_line(const struct pretty_print_context *pp,
 		   const char *encoding,
 		   int need_8bit_cte)
 {
+	static const int max_length = 78; /* per rfc2822 */
 	struct strbuf title;
 
 	strbuf_init(&title, 80);
@@ -1345,7 +1355,12 @@ void pp_title_line(const struct pretty_print_context *pp,
 	strbuf_grow(sb, title.len + 1024);
 	if (pp->subject) {
 		strbuf_addstr(sb, pp->subject);
-		add_rfc2047(sb, title.buf, title.len, encoding, RFC2047_SUBJECT);
+		if (needs_rfc2047_encoding(title.buf, title.len, RFC2047_SUBJECT))
+			add_rfc2047(sb, title.buf, title.len,
+						encoding, RFC2047_SUBJECT);
+		else
+			strbuf_add_wrapped_bytes(sb, title.buf, title.len,
+					 -last_line_length(sb), 1, max_length);
 	} else {
 		strbuf_addbuf(sb, &title);
 	}
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 727d606..e024eb8 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -839,7 +839,7 @@ test_expect_success 'format-patch uses rfc2047-encoded from-headers when necessa
 cat >expect <<'EOF'
 From: =?UTF-8?q?F=C3=B6o=20B=2E=20Bar?= <author@example.com>
 EOF
-test_expect_failure 'rfc2047-encoded from-headers leave no rfc822 specials' '
+test_expect_success 'rfc2047-encoded from-headers leave no rfc822 specials' '
 	check_author "Föo B. Bar"
 '
 
-- 
1.7.12


* [PATCH v2 7/7] format-patch tests: check quoting/encoding in To: and Cc: headers
  2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
                   ` (5 preceding siblings ...)
  2012-10-18 14:43 ` [PATCH v2 6/7] format-patch: fix rfc2047 address encoding with respect to rfc822 specials Jan H. Schönherr
@ 2012-10-18 14:43 ` Jan H. Schönherr
  6 siblings, 0 replies; 8+ messages in thread
From: Jan H. Schönherr @ 2012-10-18 14:43 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Jan H. Schönherr

From: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>

git-format-patch currently does not parse user-supplied extra header
values (e.g., --cc, --add-header) and just replays them. That forces
users to supply them already encoded in RFC 2822/2047-conformant form, e.g.

--cc '=?UTF-8?q?Jan=20H=2E=20Sch=C3=B6nherr?= <...>'

which is inconvenient. We would want to update git-format-patch to
accept human-readable input

--cc 'Jan H. Schönherr <...>'

and handle the encoding, wrapping and quoting internally in the future,
similar to what is already done in git-send-email. The necessary code
should mostly exist in the code paths that handle the From: and Subject:
headers.
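
To illustrate how the encoded form shown above is derived from the
human-readable one, here is a minimal sketch of "Q" encoding for a
display name. It is an assumption-laden example, not git's add_rfc2047():
the function name q_encode_name() is hypothetical, the input is assumed
to be UTF-8 already, and no folding at the 76 column limit is attempted.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the RFC 2047 "Q" encoded-word for `name`
 * into `out`. Returns 0 on success, -1 if `out` is too small.
 * Only alphanumerics and "!*+-/" are left literal, everything else
 * (including space and RFC 822 specials such as ".") becomes =XX.
 */
int q_encode_name(const char *name, const char *charset,
		  char *out, size_t outsz)
{
	size_t pos;
	int n = snprintf(out, outsz, "=?%s?q?", charset);

	if (n < 0 || (size_t)n >= outsz)
		return -1;
	pos = (size_t)n;

	for (; *name; name++) {
		unsigned char ch = (unsigned char)*name;
		int safe = ch < 128 &&
			   (isalnum(ch) || strchr("!*+-/", ch) != NULL);

		if (safe) {
			if (pos + 1 >= outsz)
				return -1;
			out[pos++] = (char)ch;
		} else {
			if (pos + 3 >= outsz)
				return -1;
			/* three characters plus the terminating NUL */
			snprintf(out + pos, outsz - pos, "=%02X", (unsigned)ch);
			pos += 3;
		}
	}

	if (pos + 2 >= outsz)
		return -1;
	memcpy(out + pos, "?=", 3); /* also copies the terminating NUL */
	return 0;
}
```

Because "." is escaped as =2E, the result contains no RFC 822 specials
and can be placed in front of the angle-bracketed address without
additional quoting.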

Whether we want to do this only for the git-format-patch options
--to and --cc (and the corresponding config options) or also for
user supplied headers via --add-header, is open for discussion.

For now, add test_expect_failure tests for To: and Cc: headers as a
reminder and fix tests that would otherwise fail should this get
implemented.

Signed-off-by: Jan H. Schönherr <schnhrr@cs.tu-berlin.de>
---
This patch is RFC material. There are a few reasons why this
is a good idea, and also a few why it is not:

Pro:
- current git-format-patch behavior differs from git-send-email
- we should be able to use the address format that git uses
  elsewhere (e. g., author and committer info)
- necessary code mostly exists

Con:
- changes current behavior
- makes the code more complex

(Feel free to add more.)

The first drawback can be mitigated by checking whether the
input is already properly encoded, so that we do not accidentally
double-encode things. git-send-email does that, but that's written
in Perl, so we would need even more code.
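
A rough idea of what such a check could look like in C is sketched
below. This is purely an assumption for discussion: the function name
looks_rfc2047_encoded() is hypothetical, it is not how git-send-email
does it, and it only tests for the "=?charset?q?text?=" shape without
validating the charset or the encoded text.

```c
#include <string.h>

/*
 * Hypothetical check: does `s` contain something shaped like an
 * RFC 2047 encoded-word, i.e. "=?" charset "?" (q|b) "?" text "?="
 * with a non-empty charset? Returns 1 if so, 0 otherwise.
 */
int looks_rfc2047_encoded(const char *s)
{
	const char *p = s;

	while ((p = strstr(p, "=?")) != NULL) {
		/* the next '?' ends the charset */
		const char *q1 = strchr(p + 2, '?');

		if (q1 && q1 > p + 2 &&
		    q1[1] != '\0' && strchr("qQbB", q1[1]) != NULL &&
		    q1[2] == '?' &&
		    strstr(q1 + 3, "?=") != NULL)
			return 1;
		p += 2;
	}
	return 0;
}
```

Input that passes this check would be replayed unchanged, and everything
else would go through the normal encoding and quoting path.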

For now, this is only about _addresses_ supplied to git-format-patch,
not _headers_. We could also validate/encode/wrap user supplied headers.
RFC 2822/2047 is specific enough to allow that. But there is no point
thinking about that without the intention of encoding addresses.

v2:
- updated commit message as suggested by Junio C Hamano
---
 t/t4014-format-patch.sh | 98 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 66 insertions(+), 32 deletions(-)

diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index e024eb8..ad9f69e 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -110,73 +110,107 @@ test_expect_success 'replay did not screw up the log message' '
 
 test_expect_success 'extra headers' '
 
-	git config format.headers "To: R. E. Cipient <rcipient@example.com>
+	git config format.headers "To: R E Cipient <rcipient@example.com>
 " &&
-	git config --add format.headers "Cc: S. E. Cipient <scipient@example.com>
+	git config --add format.headers "Cc: S E Cipient <scipient@example.com>
 " &&
 	git format-patch --stdout master..side > patch2 &&
 	sed -e "/^\$/q" patch2 > hdrs2 &&
-	grep "^To: R. E. Cipient <rcipient@example.com>\$" hdrs2 &&
-	grep "^Cc: S. E. Cipient <scipient@example.com>\$" hdrs2
+	grep "^To: R E Cipient <rcipient@example.com>\$" hdrs2 &&
+	grep "^Cc: S E Cipient <scipient@example.com>\$" hdrs2
 
 '
 
 test_expect_success 'extra headers without newlines' '
 
-	git config --replace-all format.headers "To: R. E. Cipient <rcipient@example.com>" &&
-	git config --add format.headers "Cc: S. E. Cipient <scipient@example.com>" &&
+	git config --replace-all format.headers "To: R E Cipient <rcipient@example.com>" &&
+	git config --add format.headers "Cc: S E Cipient <scipient@example.com>" &&
 	git format-patch --stdout master..side >patch3 &&
 	sed -e "/^\$/q" patch3 > hdrs3 &&
-	grep "^To: R. E. Cipient <rcipient@example.com>\$" hdrs3 &&
-	grep "^Cc: S. E. Cipient <scipient@example.com>\$" hdrs3
+	grep "^To: R E Cipient <rcipient@example.com>\$" hdrs3 &&
+	grep "^Cc: S E Cipient <scipient@example.com>\$" hdrs3
 
 '
 
 test_expect_success 'extra headers with multiple To:s' '
 
-	git config --replace-all format.headers "To: R. E. Cipient <rcipient@example.com>" &&
-	git config --add format.headers "To: S. E. Cipient <scipient@example.com>" &&
+	git config --replace-all format.headers "To: R E Cipient <rcipient@example.com>" &&
+	git config --add format.headers "To: S E Cipient <scipient@example.com>" &&
 	git format-patch --stdout master..side > patch4 &&
 	sed -e "/^\$/q" patch4 > hdrs4 &&
-	grep "^To: R. E. Cipient <rcipient@example.com>,\$" hdrs4 &&
-	grep "^ *S. E. Cipient <scipient@example.com>\$" hdrs4
+	grep "^To: R E Cipient <rcipient@example.com>,\$" hdrs4 &&
+	grep "^ *S E Cipient <scipient@example.com>\$" hdrs4
 '
 
-test_expect_success 'additional command line cc' '
+test_expect_success 'additional command line cc (ascii)' '
 
-	git config --replace-all format.headers "Cc: R. E. Cipient <rcipient@example.com>" &&
+	git config --replace-all format.headers "Cc: R E Cipient <rcipient@example.com>" &&
+	git format-patch --cc="S E Cipient <scipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch5 &&
+	grep "^Cc: R E Cipient <rcipient@example.com>,\$" patch5 &&
+	grep "^ *S E Cipient <scipient@example.com>\$" patch5
+'
+
+test_expect_failure 'additional command line cc (rfc822)' '
+
+	git config --replace-all format.headers "Cc: R E Cipient <rcipient@example.com>" &&
 	git format-patch --cc="S. E. Cipient <scipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch5 &&
-	grep "^Cc: R. E. Cipient <rcipient@example.com>,\$" patch5 &&
-	grep "^ *S. E. Cipient <scipient@example.com>\$" patch5
+	grep "^Cc: R E Cipient <rcipient@example.com>,\$" patch5 &&
+	grep "^ *\"S. E. Cipient\" <scipient@example.com>\$" patch5
 '
 
 test_expect_success 'command line headers' '
 
 	git config --unset-all format.headers &&
-	git format-patch --add-header="Cc: R. E. Cipient <rcipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch6 &&
-	grep "^Cc: R. E. Cipient <rcipient@example.com>\$" patch6
+	git format-patch --add-header="Cc: R E Cipient <rcipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch6 &&
+	grep "^Cc: R E Cipient <rcipient@example.com>\$" patch6
 '
 
 test_expect_success 'configuration headers and command line headers' '
 
-	git config --replace-all format.headers "Cc: R. E. Cipient <rcipient@example.com>" &&
-	git format-patch --add-header="Cc: S. E. Cipient <scipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch7 &&
-	grep "^Cc: R. E. Cipient <rcipient@example.com>,\$" patch7 &&
-	grep "^ *S. E. Cipient <scipient@example.com>\$" patch7
+	git config --replace-all format.headers "Cc: R E Cipient <rcipient@example.com>" &&
+	git format-patch --add-header="Cc: S E Cipient <scipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch7 &&
+	grep "^Cc: R E Cipient <rcipient@example.com>,\$" patch7 &&
+	grep "^ *S E Cipient <scipient@example.com>\$" patch7
 '
 
-test_expect_success 'command line To: header' '
+test_expect_success 'command line To: header (ascii)' '
 
 	git config --unset-all format.headers &&
+	git format-patch --to="R E Cipient <rcipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch8 &&
+	grep "^To: R E Cipient <rcipient@example.com>\$" patch8
+'
+
+test_expect_failure 'command line To: header (rfc822)' '
+
 	git format-patch --to="R. E. Cipient <rcipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch8 &&
-	grep "^To: R. E. Cipient <rcipient@example.com>\$" patch8
+	grep "^To: \"R. E. Cipient\" <rcipient@example.com>\$" patch8
+'
+
+test_expect_failure 'command line To: header (rfc2047)' '
+
+	git format-patch --to="R Ä Cipient <rcipient@example.com>" --stdout master..side | sed -e "/^\$/q" >patch8 &&
+	grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" patch8
 '
 
-test_expect_success 'configuration To: header' '
+test_expect_success 'configuration To: header (ascii)' '
+
+	git config format.to "R E Cipient <rcipient@example.com>" &&
+	git format-patch --stdout master..side | sed -e "/^\$/q" >patch9 &&
+	grep "^To: R E Cipient <rcipient@example.com>\$" patch9
+'
+
+test_expect_failure 'configuration To: header (rfc822)' '
 
 	git config format.to "R. E. Cipient <rcipient@example.com>" &&
 	git format-patch --stdout master..side | sed -e "/^\$/q" >patch9 &&
-	grep "^To: R. E. Cipient <rcipient@example.com>\$" patch9
+	grep "^To: \"R. E. Cipient\" <rcipient@example.com>\$" patch9
+'
+
+test_expect_failure 'configuration To: header (rfc2047)' '
+
+	git config format.to "R Ä Cipient <rcipient@example.com>" &&
+	git format-patch --stdout master..side | sed -e "/^\$/q" >patch9 &&
+	grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" patch9
 '
 
 # check_patch <patch>: Verify that <patch> looks like a half-sane
@@ -190,11 +224,11 @@ check_patch () {
 test_expect_success '--no-to overrides config.to' '
 
 	git config --replace-all format.to \
-		"R. E. Cipient <rcipient@example.com>" &&
+		"R E Cipient <rcipient@example.com>" &&
 	git format-patch --no-to --stdout master..side |
 	sed -e "/^\$/q" >patch10 &&
 	check_patch patch10 &&
-	! grep "^To: R. E. Cipient <rcipient@example.com>\$" patch10
+	! grep "^To: R E Cipient <rcipient@example.com>\$" patch10
 '
 
 test_expect_success '--no-to and --to replaces config.to' '
@@ -212,21 +246,21 @@ test_expect_success '--no-to and --to replaces config.to' '
 test_expect_success '--no-cc overrides config.cc' '
 
 	git config --replace-all format.cc \
-		"C. E. Cipient <rcipient@example.com>" &&
+		"C E Cipient <rcipient@example.com>" &&
 	git format-patch --no-cc --stdout master..side |
 	sed -e "/^\$/q" >patch12 &&
 	check_patch patch12 &&
-	! grep "^Cc: C. E. Cipient <rcipient@example.com>\$" patch12
+	! grep "^Cc: C E Cipient <rcipient@example.com>\$" patch12
 '
 
 test_expect_success '--no-add-header overrides config.headers' '
 
 	git config --replace-all format.headers \
-		"Header1: B. E. Cipient <rcipient@example.com>" &&
+		"Header1: B E Cipient <rcipient@example.com>" &&
 	git format-patch --no-add-header --stdout master..side |
 	sed -e "/^\$/q" >patch13 &&
 	check_patch patch13 &&
-	! grep "^Header1: B. E. Cipient <rcipient@example.com>\$" patch13
+	! grep "^Header1: B E Cipient <rcipient@example.com>\$" patch13
 '
 
 test_expect_success 'multiple files' '
-- 
1.7.12


end of thread, other threads:[~2012-10-18 14:51 UTC | newest]

Thread overview: 8+ messages
2012-10-18 14:43 [PATCH v2 0/7] Cure some format-patch wrapping and encoding issues Jan H. Schönherr
2012-10-18 14:43 ` [PATCH v2 1/7] utf8: fix off-by-one wrapping of text Jan H. Schönherr
2012-10-18 14:43 ` [PATCH v2 2/7] format-patch: do not wrap non-rfc2047 headers too early Jan H. Schönherr
2012-10-18 14:43 ` [PATCH v2 3/7] format-patch: do not wrap rfc2047 encoded headers too late Jan H. Schönherr
2012-10-18 14:43 ` [PATCH v2 4/7] format-patch: introduce helper function last_line_length() Jan H. Schönherr
2012-10-18 14:43 ` [PATCH v2 5/7] format-patch: make rfc2047 encoding more strict Jan H. Schönherr
2012-10-18 14:43 ` [PATCH v2 6/7] format-patch: fix rfc2047 address encoding with respect to rfc822 specials Jan H. Schönherr
2012-10-18 14:43 ` [PATCH v2 7/7] format-patch tests: check quoting/encoding in To: and Cc: headers Jan H. Schönherr
