From: Krzysztof Mazur <krzysiek@podlesie.net>
To: Jeff King <peff@peff.net>
Cc: gitster@pobox.com, git@vger.kernel.org
Subject: Re: [PATCH] git-send-email: skip RFC2047 quoting for ASCII subjects
Date: Wed, 24 Oct 2012 19:10:36 +0200
Message-ID: <20121024171036.GA18880@shrek.podlesie.net>
In-Reply-To: <20121024084636.GA23500@sigill.intra.peff.net>
On Wed, Oct 24, 2012 at 04:46:36AM -0400, Jeff King wrote:
> On Wed, Oct 24, 2012 at 10:03:35AM +0200, Krzysztof Mazur wrote:
>
> > The git-send-email always use RFC2047 subject quoting for files
> > with "broken" encoding - non-ASCII files without Content-Transfer-Encoding,
> > even for ASCII subjects. Now for ASCII subjects the RFC2047 quoting will be
> > skipped.
> > [...]
> > - if ($broken_encoding{$t} && !is_rfc2047_quoted($subject)) {
> > + if ($broken_encoding{$t} && !is_rfc2047_quoted($subject) &&
> > + ($subject =~ /[^[:ascii:]]/)) {
>
> Is that test sufficient? We would also need to encode if it has rfc2047
> specials, no?
For the Subject this should be sufficient. According to RFC822,
"Subject:" is followed by "text" tokens,
--- from RFC822 ---
/ "Subject" ":" *text
--- from RFC822 ---
and text is defined as:
--- from RFC822 ---
text        =  <any CHAR, including bare    ; => atoms, specials,
                CR & bare LF, but NOT       ;  comments and
                including CRLF>             ;  quoted-strings are
                                            ;  NOT recognized.
--- from RFC822 ---
so only CRLF is not allowed in Subject.
So the problem exists only for broken RFC2047-like text, but I think
it's OK to pass such subjects through unchanged: in most cases the Subject
comes from an already formatted patch file. I think all we want to do here
is fix Subjects that have no encoding specified.
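A minimal sketch of that test, to show that RFC822 specials in a Subject do
not trigger quoting while non-ASCII bytes do. The helper name
has_non_ascii() is hypothetical and is not part of git-send-email.perl; only
the regex is taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper (not in git-send-email.perl): returns 1 if the
# string contains at least one non-ASCII character, 0 otherwise.
sub has_non_ascii {
    my ($s) = @_;
    return $s =~ /[^[:ascii:]]/ ? 1 : 0;
}

my @subjects = (
    '[PATCH] fix (a), b: "c" <d>',   # ASCII only, but full of specials
    "[PATCH] za\xC5\xBC\xC3\xB3\xC5\x82\xC4\x87",   # raw UTF-8 bytes
);

for my $subject (@subjects) {
    printf "%-40s needs quoting: %s\n",
        $subject, has_non_ascii($subject) ? 'yes' : 'no';
}
```

The first subject is passed through as is, because "text" in RFC822 allows
specials; only the second one would be handed to quote_rfc2047().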
In most cases, when git-send-email is used for patches generated
by "git format-patch", we simply don't want to corrupt the Subject.
"git format-patch" generates "broken" patches when the commit message
uses only ASCII characters but the patch contains some non-ASCII characters.
In that case the original git-send-email, without this patch, adds RFC2047
quoting to a pure-ASCII Subject.
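To illustrate the effect, here is a rough sketch of what unconditional
quoting does to an ASCII Subject. quote_rfc2047_sketch() is a simplified
stand-in written for this example; its name and the exact set of characters
it leaves unescaped are assumptions, not the real quote_rfc2047().

```perl
use strict;
use warnings;

# Simplified stand-in for quote_rfc2047(): always wraps the text in an
# RFC 2047 "Q" encoded-word. Name and unescaped character set are
# assumptions made for this sketch.
sub quote_rfc2047_sketch {
    my ($s, $encoding) = @_;
    $encoding ||= 'UTF-8';
    # Escape everything outside a small safe set as =XX (hex).
    $s =~ s{([^-a-zA-Z0-9!*+/])}{sprintf("=%02X", ord($1))}eg;
    return "=?${encoding}?q?${s}?=";
}

my $subject = '[PATCH] fix typo';

# Unconditional quoting, as done for "broken" files without the patch.
print "Subject: ", quote_rfc2047_sketch($subject), "\n";
# prints: Subject: =?UTF-8?q?=5BPATCH=5D=20fix=20typo?=

# Quoting only when a non-ASCII character is present.
my $out = $subject =~ /[^[:ascii:]]/
    ? quote_rfc2047_sketch($subject)
    : $subject;
print "Subject: $out\n";
# prints: Subject: [PATCH] fix typo
```

The first form is still valid mail, but it is needlessly unreadable in the
raw message; the second leaves the Subject exactly as format-patch wrote it.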
>
> It looks like we use the same regex elsewhere. Maybe this would be a
> good chance to abstract out a needs_rfc2047_quoting while we are in the
> area?
It's a good idea; however, the rules are different for Subject and addresses
(sanitize_address).
I think we can go even further and add quote_subject(),
which performs this test and calls quote_rfc2047() if necessary.
I'm sending a patch below that does that.
Krzysiek
--
From a1e6eef831275485ec1555d94ff0d9aac852dd12 Mon Sep 17 00:00:00 2001
From: Krzysztof Mazur <krzysiek@podlesie.net>
Date: Wed, 24 Oct 2012 19:08:57 +0200
Subject: [PATCH] git-send-email: introduce quote_subject()
quote_rfc2047() always adds RFC2047 quoting, so to avoid quoting ASCII
subjects, callers must test the subject for non-ASCII characters before
calling it. To avoid repeating this test, introduce a new quote_subject()
function, which performs the test and calls quote_rfc2047() only if
necessary.
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
---
git-send-email.perl | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index efeae4c..e9aec8d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -657,9 +657,7 @@ EOT
$initial_subject = $1;
my $subject = $initial_subject;
$_ = "Subject: " .
- ($subject =~ /[^[:ascii:]]/ ?
- quote_rfc2047($subject, $compose_encoding) :
- $subject) .
+ quote_subject($subject, $compose_encoding) .
"\n";
} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
$initial_reply_to = $1;
@@ -907,6 +905,22 @@ sub is_rfc2047_quoted {
$s =~ m/^(?:"[[:ascii:]]*"|=\?$token\?$token\?$encoded_text\?=)$/o;
}
+sub subject_needs_rfc2047_quoting {
+ my $s = shift;
+
+ return !is_rfc2047_quoted($s) && ($s =~ /[^[:ascii:]]/);
+}
+
+sub quote_subject {
+ local $subject = shift;
+ my $encoding = shift || 'UTF-8';
+
+ if (subject_needs_rfc2047_quoting($subject)) {
+ return quote_rfc2047($subject, $encoding);
+ }
+ return $subject;
+}
+
# use the simplest quoting being able to handle the recipient
sub sanitize_address {
my ($recipient) = @_;
@@ -1327,9 +1341,8 @@ foreach my $t (@files) {
$body_encoding = $auto_8bit_encoding;
}
- if ($broken_encoding{$t} && !is_rfc2047_quoted($subject) &&
- ($subject =~ /[^[:ascii:]]/)) {
- $subject = quote_rfc2047($subject, $auto_8bit_encoding);
+ if ($broken_encoding{$t}) {
+ $subject = quote_subject($subject, $auto_8bit_encoding);
}
if (defined $author and $author ne $sender) {
--
1.8.0.3.gf4c35fc