From: "Đoàn Trần Công Danh" <congdanhqx@gmail.com>
To: git@vger.kernel.org
Cc: "Đoàn Trần Công Danh" <congdanhqx@gmail.com>
Subject: [PATCH] mailinfo: strip CR from base64/quoted-printable email
Date: Wed, 21 Apr 2021 08:34:04 +0700 [thread overview]
Message-ID: <20210421013404.17383-1-congdanhqx@gmail.com> (raw)
When an SMTP server receives an 8-bit email message, possibly with only
LF as line ending, some of those servers decide to change said LF to
CRLF.
Some other SMTP servers, when receives an 8-bit email message, decide to
encoding such message in base64 and/or quoted-printable instead.
If an email is transfered through those 2 email servers in order, the
final recipients will receive an email contains a patch mungled with
CRLF encoded inside another encoding. Thus, such CR couldn't be dropped
by mailsplit. Such accidents have been observed in the wild [1].
Let's guess if such CR was added automatically and strip them in
mailinfo.
[1]: https://nmbug.notmuchmail.org/nmweb/show/m2lf9ejegj.fsf%40guru.guru-group.fi
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
I'm not sure if guessing the heuristic to strip CR is a good approach.
I think it's better to pass --keep-cr down from git-am.
Let's say --keep-cr=<yes|no|auto>
mailinfo.c | 20 +++++++++++++++++---
t/t5100-mailinfo.sh | 5 +++++
t/t5100/cr-base64.mbox | 22 ++++++++++++++++++++++
t/t5100/info1000 | 5 +++++
t/t5100/msg1000 | 2 ++
t/t5100/patch1000 | 22 ++++++++++++++++++++++
6 files changed, 73 insertions(+), 3 deletions(-)
create mode 100644 t/t5100/cr-base64.mbox
create mode 100644 t/t5100/info1000
create mode 100644 t/t5100/msg1000
create mode 100644 t/t5100/patch1000
diff --git a/mailinfo.c b/mailinfo.c
index 5681d9130d..dbff867f42 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -988,16 +988,27 @@ static int handle_boundary(struct mailinfo *mi, struct strbuf *line)
}
static void handle_filter_flowed(struct mailinfo *mi, struct strbuf *line,
- struct strbuf *prev)
+ struct strbuf *prev, int *keep_cr)
{
size_t len = line->len;
const char *rest;
if (!mi->format_flowed) {
+ if (*keep_cr == -1 && len >= 2)
+ *keep_cr = !(line->buf[len - 2] == '\r' &&
+ line->buf[len - 1] == '\n');
+ if (!*keep_cr && len >= 2 &&
+ line->buf[len - 2] == '\r' &&
+ line->buf[len - 1] == '\n') {
+ strbuf_setlen(line, len - 2);
+ strbuf_addch(line, '\n');
+ len--;
+ }
handle_filter(mi, line);
return;
}
+ *keep_cr = 1;
if (line->buf[len - 1] == '\n') {
len--;
if (len && line->buf[len - 1] == '\r')
@@ -1036,6 +1047,7 @@ static void handle_filter_flowed(struct mailinfo *mi, struct strbuf *line,
static void handle_body(struct mailinfo *mi, struct strbuf *line)
{
struct strbuf prev = STRBUF_INIT;
+ int keep_cr = -1;
/* Skip up to the first boundary */
if (*(mi->content_top)) {
@@ -1081,7 +1093,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
strbuf_addbuf(&prev, sb);
break;
}
- handle_filter_flowed(mi, sb, &prev);
+ handle_filter_flowed(mi, sb, &prev, &keep_cr);
}
/*
* The partial chunk is saved in "prev" and will be
@@ -1091,7 +1103,9 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
break;
}
default:
- handle_filter_flowed(mi, line, &prev);
+ /* CR in plain message was processed in mailsplit */
+ keep_cr = 1;
+ handle_filter_flowed(mi, line, &prev, &keep_cr);
}
if (mi->input_error)
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 147e616533..9ccc11d16a 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -228,4 +228,9 @@ test_expect_success 'mailinfo handles unusual header whitespace' '
test_cmp expect actual
'
+test_expect_success 'mailinfo strip CR after decode base64' '
+ cp $DATA/cr-base64.mbox 1000 &&
+ check_mailinfo 1000 ""
+'
+
test_done
diff --git a/t/t5100/cr-base64.mbox b/t/t5100/cr-base64.mbox
new file mode 100644
index 0000000000..6ea9806a6b
--- /dev/null
+++ b/t/t5100/cr-base64.mbox
@@ -0,0 +1,22 @@
+From: A U Thor <mail@example.com>
+To: list@example.org
+Subject: [PATCH v2] sample
+Date: Mon, 3 Aug 2020 22:40:55 +0700
+Message-Id: <msg-id@example.com>
+Content-Type: text/plain; charset="utf-8"
+Content-Transfer-Encoding: base64
+
+T24gZGlmZmVyZW50IGRpc3RybywgcHl0ZXN0IGlzIHN1ZmZpeGVkIHdpdGggZGlmZmVyZW50IHBh
+dHRlcm5zLg0KDQotLS0NCiBjb25maWd1cmUgfCAyICstDQogMSBmaWxlIGNoYW5nZWQsIDEgaW5z
+ZXJ0aW9uKCspLCAxIGRlbGV0aW9uKC0pDQoNCmRpZmYgLS1naXQgYS9jb25maWd1cmUgYi9jb25m
+aWd1cmUNCmluZGV4IGRiMzUzOGIzLi5mN2MxYzA5NSAxMDA3NTUNCi0tLSBhL2NvbmZpZ3VyZQ0K
+KysrIGIvY29uZmlndXJlDQpAQCAtODE0LDcgKzgxNCw3IEBAIGlmIFsgJGhhdmVfcHl0aG9uMyAt
+ZXEgMSBdOyB0aGVuDQogICAgIHByaW50ZiAiQ2hlY2tpbmcgZm9yIHB5dGhvbjMgcHl0ZXN0ICg+
+PSAzLjApLi4uICINCiAgICAgY29uZj0kKG1rdGVtcCkNCiAgICAgcHJpbnRmICJbcHl0ZXN0XVxu
+bWludmVyc2lvbj0zLjBcbiIgPiAkY29uZg0KLSAgICBpZiBweXRlc3QtMyAtYyAkY29uZiAtLXZl
+cnNpb24gPi9kZXYvbnVsbCAyPiYxOyB0aGVuDQorICAgIGlmICIkcHl0aG9uIiAtbSBweXRlc3Qg
+LWMgJGNvbmYgLS12ZXJzaW9uID4vZGV2L251bGwgMj4mMTsgdGhlbg0KICAgICAgICAgcHJpbnRm
+ICJZZXMuXG4iDQogICAgICAgICBoYXZlX3B5dGhvbjNfcHl0ZXN0PTENCiAgICAgZWxzZQ0KLS0g
+DQoyLjI4LjANCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f
+CmV4YW1wbGUgbWFpbGluZyBsaXN0IC0tIGxpc3RAZXhhbXBsZS5vcmcKVG8gdW5zdWJzY3JpYmUg
+c2VuZCBhbiBlbWFpbCB0byBsaXN0LWxlYXZlQGV4YW1wbGUub3JnCg==
diff --git a/t/t5100/info1000 b/t/t5100/info1000
new file mode 100644
index 0000000000..dab2228b70
--- /dev/null
+++ b/t/t5100/info1000
@@ -0,0 +1,5 @@
+Author: A U Thor
+Email: mail@example.com
+Subject: sample
+Date: Mon, 3 Aug 2020 22:40:55 +0700
+
diff --git a/t/t5100/msg1000 b/t/t5100/msg1000
new file mode 100644
index 0000000000..5e8e860aae
--- /dev/null
+++ b/t/t5100/msg1000
@@ -0,0 +1,2 @@
+On different distro, pytest is suffixed with different patterns.
+
diff --git a/t/t5100/patch1000 b/t/t5100/patch1000
new file mode 100644
index 0000000000..51c4fb4cb5
--- /dev/null
+++ b/t/t5100/patch1000
@@ -0,0 +1,22 @@
+---
+ configure | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/configure b/configure
+index db3538b3..f7c1c095 100755
+--- a/configure
++++ b/configure
+@@ -814,7 +814,7 @@ if [ $have_python3 -eq 1 ]; then
+ printf "Checking for python3 pytest (>= 3.0)... "
+ conf=$(mktemp)
+ printf "[pytest]\nminversion=3.0\n" > $conf
+- if pytest-3 -c $conf --version >/dev/null 2>&1; then
++ if "$python" -m pytest -c $conf --version >/dev/null 2>&1; then
+ printf "Yes.\n"
+ have_python3_pytest=1
+ else
+--
+2.28.0
+_______________________________________________
+example mailing list -- list@example.org
+To unsubscribe send an email to list-leave@example.org
--
2.31.1.192.g0881477623
next reply other threads:[~2021-04-21 1:34 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-21 1:34 Đoàn Trần Công Danh [this message]
2021-04-21 2:09 ` [PATCH] mailinfo: strip CR from base64/quoted-printable email Junio C Hamano
2021-04-21 3:32 ` brian m. carlson
2021-04-21 12:07 ` Đoàn Trần Công Danh
2021-04-22 1:10 ` brian m. carlson
2021-05-04 17:19 ` [PATCH v2 0/5] Teach am/mailinfo to process quoted CR Đoàn Trần Công Danh
2021-05-04 17:19 ` [PATCH v2 1/5] mailinfo: avoid magic number in option parsing Đoàn Trần Công Danh
2021-05-04 17:19 ` [PATCH v2 2/5] mailinfo: warn if CR found in base64/quoted-printable email Đoàn Trần Công Danh
2021-05-05 3:41 ` Junio C Hamano
2021-05-04 17:20 ` [PATCH v2 3/5] mailinfo: skip quoted CR on user's wish Đoàn Trần Công Danh
2021-05-05 4:12 ` Junio C Hamano
2021-05-05 15:53 ` Đoàn Trần Công Danh
2021-05-04 17:20 ` [PATCH v2 4/5] mailinfo: strip quoted CR on users' wish Đoàn Trần Công Danh
2021-05-05 4:27 ` Junio C Hamano
2021-05-04 17:20 ` [PATCH v2 5/5] am: learn to process quoted lines that ends with CRLF Đoàn Trần Công Danh
2021-05-05 4:31 ` [PATCH v2 0/5] Teach am/mailinfo to process quoted CR Junio C Hamano
2021-05-06 15:02 ` [PATCH v3 0/6] " Đoàn Trần Công Danh
2021-05-06 15:02 ` [PATCH v3 1/6] mailinfo: load default metainfo_charset lazily Đoàn Trần Công Danh
2021-05-06 15:02 ` [PATCH v3 2/6] mailinfo: stop parsing options manually Đoàn Trần Công Danh
2021-05-08 10:44 ` Junio C Hamano
2021-05-06 15:02 ` [PATCH v3 3/6] mailinfo: warn if CR found in decoded base64/QP email Đoàn Trần Công Danh
2021-05-08 10:52 ` Junio C Hamano
2021-05-06 15:02 ` [PATCH v3 4/6] mailinfo: allow squelching quoted CR warning Đoàn Trần Công Danh
2021-05-06 15:02 ` [PATCH v3 5/6] mailinfo: allow stripping quoted CR without warning Đoàn Trần Công Danh
2021-05-06 15:02 ` [PATCH v3 6/6] am: learn to process quoted lines that ends with CRLF Đoàn Trần Công Danh
2021-05-08 10:57 ` [PATCH v3 0/6] Teach am/mailinfo to process quoted CR Junio C Hamano
[not found] ` <cover.1620309355.git.congdanhqx@gmail.com>
2021-05-06 15:02 ` [PATCH v3 2/6] mailinfo: stop parse options manually Đoàn Trần Công Danh
2021-05-06 15:19 ` Đoàn Trần Công Danh
2021-05-09 17:12 ` [PATCH v4 0/6] Teach am/mailinfo to process quoted CR Đoàn Trần Công Danh
2021-05-09 17:12 ` [PATCH v4 1/6] mailinfo: load default metainfo_charset lazily Đoàn Trần Công Danh
2021-05-09 17:12 ` [PATCH v4 2/6] mailinfo: stop parsing options manually Đoàn Trần Công Danh
2021-05-09 17:12 ` [PATCH v4 3/6] mailinfo: warn if CRLF found in decoded base64/QP email Đoàn Trần Công Danh
2021-05-09 17:12 ` [PATCH v4 4/6] mailinfo: allow squelching quoted CRLF warning Đoàn Trần Công Danh
2021-05-09 17:12 ` [PATCH v4 5/6] mailinfo: allow stripping quoted CR without warning Đoàn Trần Công Danh
2021-05-09 17:12 ` [PATCH v4 6/6] am: learn to process quoted lines that ends with CRLF Đoàn Trần Công Danh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210421013404.17383-1-congdanhqx@gmail.com \
--to=congdanhqx@gmail.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).