git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Krzysztof Mazur <krzysiek@podlesie.net>
To: Jeff King <peff@peff.net>
Cc: gitster@pobox.com, git@vger.kernel.org
Subject: Re: [PATCH] git-send-email: skip RFC2047 quoting for ASCII subjects
Date: Wed, 24 Oct 2012 19:10:36 +0200	[thread overview]
Message-ID: <20121024171036.GA18880@shrek.podlesie.net> (raw)
In-Reply-To: <20121024084636.GA23500@sigill.intra.peff.net>

On Wed, Oct 24, 2012 at 04:46:36AM -0400, Jeff King wrote:
> On Wed, Oct 24, 2012 at 10:03:35AM +0200, Krzysztof Mazur wrote:
> 
> > The git-send-email always use RFC2047 subject quoting for files
> > with "broken" encoding - non-ASCII files without Content-Transfer-Encoding,
> > even for ASCII subjects. Now for ASCII subjects the RFC2047 quoting will be
> > skipped.
> > [...]
> > -	if ($broken_encoding{$t} && !is_rfc2047_quoted($subject)) {
> > +	if ($broken_encoding{$t} && !is_rfc2047_quoted($subject) &&
> > +			($subject =~ /[^[:ascii:]]/)) {
> 
> Is that test sufficient? We would also need to encode if it has rfc2047
> specials, no?

For Subject this should be sufficient. According to RFC822 after
"Subject:" we have "text" token,

--- from RFC822 ---
                 /  "Subject"           ":"  *text
--- from RFC822 ---

and text is defined as:

--- from RFC822 ---
     text        =  <any CHAR, including bare    ; => atoms, specials,
                     CR & bare LF, but NOT       ;  comments and
                     including CRLF>             ;  quoted-strings are
                                                 ;  NOT recognized.
--- from RFC822 ---

so only CRLF is not allowed in Subject.


So the problem only exists for broken RFC2047-like texts, but I think
it's ok to just pass such subjects, in most cases the Subject comes
from already formatted patch file. I think that we just want to fix Subjects
without specified encoding here.


In most cases, when git-send-email is used for patches generated
by "git format-patch" we just don't want to corrupt Subject. The
"git format-patch" generates "broken" patches when commit message
uses only ASCII characters and patch contains some non-ASCII characters.
In this case original git-send-email, without this patch, adds RFC2047
quotation for pure ASCII Subject.

> 
> It looks like we use the same regex elsewhere. Maybe this would be a
> good chance to abstract out a needs_rfc2047_quoting while we are in the
> area?

It's a good idea, however rules are different for Subject and addresses
(sanitize_address).

I think we can go even further, we can just add quote_subject(),
which performs this test and calls quote_rfc2047() if necessary.
I'm sending bellow patch that does that.

Krzysiek
-- 
From a1e6eef831275485ec1555d94ff0d9aac852dd12 Mon Sep 17 00:00:00 2001
From: Krzysztof Mazur <krzysiek@podlesie.net>
Date: Wed, 24 Oct 2012 19:08:57 +0200
Subject: [PATCH] git-send-email: introduce quote_subject()

The quote_rfc2047() always adds RFC2047 quoting and to avoid quoting ASCII
subjects, before calling quote_rfc2047() subject must be tested for non-ASCII
characters. To avoid this new quote_subject() function is introduced.
The quote_subject() performs this test and calls quote_rfc2047() only if
necessary.

Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
---
 git-send-email.perl | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index efeae4c..e9aec8d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -657,9 +657,7 @@ EOT
 			$initial_subject = $1;
 			my $subject = $initial_subject;
 			$_ = "Subject: " .
-				($subject =~ /[^[:ascii:]]/ ?
-				 quote_rfc2047($subject, $compose_encoding) :
-				 $subject) .
+				quote_subject($subject, $compose_encoding) .
 				"\n";
 		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
 			$initial_reply_to = $1;
@@ -907,6 +905,22 @@ sub is_rfc2047_quoted {
 	$s =~ m/^(?:"[[:ascii:]]*"|=\?$token\?$token\?$encoded_text\?=)$/o;
 }
 
+sub subject_needs_rfc2047_quoting {
+	my $s = shift;
+
+	return !is_rfc2047_quoted($s) && ($s =~ /[^[:ascii:]]/);
+}
+
+sub quote_subject {
+ 	local $subject = shift;
+ 	my $encoding = shift || 'UTF-8';
+
+ 	if (subject_needs_rfc2047_quoting($subject)) {
+		return quote_rfc2047($subject, $encoding);
+ 	}
+ 	return $subject;
+}
+
 # use the simplest quoting being able to handle the recipient
 sub sanitize_address {
 	my ($recipient) = @_;
@@ -1327,9 +1341,8 @@ foreach my $t (@files) {
 		$body_encoding = $auto_8bit_encoding;
 	}
 
-	if ($broken_encoding{$t} && !is_rfc2047_quoted($subject) &&
-			($subject =~ /[^[:ascii:]]/)) {
-		$subject = quote_rfc2047($subject, $auto_8bit_encoding);
+	if ($broken_encoding{$t}) {
+		$subject = quote_subject($subject, $auto_8bit_encoding);
 	}
 
 	if (defined $author and $author ne $sender) {
-- 
1.8.0.3.gf4c35fc

  reply	other threads:[~2012-10-24 17:10 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-24  8:03 [PATCH] git-send-email: skip RFC2047 quoting for ASCII subjects Krzysztof Mazur
2012-10-24  8:46 ` Jeff King
2012-10-24 17:10   ` Krzysztof Mazur [this message]
2012-10-24 19:25     ` Jeff King
2012-10-24 21:08       ` Krzysztof Mazur
2012-10-24 21:28         ` [PATCH] git-send-email: add rfc2047 quoting for "=?" Krzysztof Mazur
2012-10-25  9:05           ` Jeff King
2012-10-25  9:01         ` [PATCH] git-send-email: skip RFC2047 quoting for ASCII subjects Jeff King
2012-10-25 10:08           ` Jeff King
2012-10-25 11:19             ` Krzysztof Mazur
2012-10-25 11:21               ` Jeff King
2012-10-25 11:12           ` Krzysztof Mazur

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121024171036.GA18880@shrek.podlesie.net \
    --to=krzysiek@podlesie.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).