git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Kevin Daudt <me@ikke.info>
Cc: git@vger.kernel.org, Swift Geek <swiftgeek@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] mailinfo: unescape quoted-pair in header fields
Date: Mon, 19 Sep 2016 20:57:11 -0700
Message-ID: <20160920035710.qw2byl3qeqwih7t5@sigill.intra.peff.net>
In-Reply-To: <20160919105133.GA10901@ikke.info>

On Mon, Sep 19, 2016 at 12:51:33PM +0200, Kevin Daudt wrote:

> > I didn't look in the RFC. Is:
> > 
> >   From: my \"name\" <foo@example.com>
> > 
> > really the same as:
> > 
> >   From: "my \\\"name\\\"" <foo@example.com>
> > 
> > ? That seems weird, but I think it may be that the former is simply
> > bogus (you are not supposed to use backslashes outside of the quoted
> > section at all).
> 
> Correct, the quoted-pair (escape sequence) can only occur in a quoted
> string or a comment. Even more so, the display name *needs* to be quoted
> when consisting of more than one word according to the RFC.

Hmm. So, I guess a follow-up question is: what would be OK to do if
we see a quoted-pair outside of quotes? If the top one above violates
the RFC, it seems like stripping the backslashes would be a reasonable
outcome.

So if that's the case, do we actually need to care if we see any
parenthesized comments? I think we should just leave comments in place
either way, so syntactically they are only interesting insofar as we
replace quoted pairs or not.

IOW, I wonder if:

	while ((c = *in++)) {
		switch (c) {
		case '\\':
			if (!*in)
				return 0; /* ignore trailing backslash */
			/* quoted pair */
			strbuf_addch(out, *in++);
			break;
		case '"':
			/*
			 * This may be starting or ending a quoted section,
			 * but we do not care whether we are in such a section.
			 * We _do_ need to remove the quotes, though, as they
			 * are syntactic.
			 */
			break;
		default:
			/*
			 * Anything else is a normal character we keep. These
			 * _might_ be violating the RFC if they are magic
			 * characters outside of a quoted section, but we'd
			 * rather be liberal and pass them through.
			 */
			strbuf_addch(out, c);
			break;
		}
	}

would work. I certainly do not mind following the RFC more closely, but
AFAICT the very simple code above gives a pretty forgiving outcome.
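
To make the effect of that loop concrete, here is a minimal standalone
sketch of it. This is not git's code: the names unquote_liberal() and
demo_unquote() are hypothetical, and a caller-supplied char buffer stands
in for the strbuf so that the snippet compiles on its own. It runs the
loop over the two From values quoted at the top of this message.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): the same loop as above, but
 * writing into a plain buffer instead of a strbuf. The caller must
 * provide room for at least strlen(in) + 1 bytes in "out".
 */
static int unquote_liberal(const char *in, char *out)
{
	char c;

	while ((c = *in++)) {
		switch (c) {
		case '\\':
			if (!*in) {
				/* ignore trailing backslash */
				*out = '\0';
				return 0;
			}
			/* quoted pair: keep only the escaped character */
			*out++ = *in++;
			break;
		case '"':
			/* quotes are syntactic, so drop them */
			break;
		default:
			/* be liberal and pass everything else through */
			*out++ = c;
			break;
		}
	}
	*out = '\0';
	return 0;
}

/* Usage example: the two header values discussed in this thread. */
void demo_unquote(void)
{
	char buf[64];

	/* raw input: my \"name\" <foo@example.com> */
	unquote_liberal("my \\\"name\\\" <foo@example.com>", buf);
	assert(!strcmp(buf, "my \"name\" <foo@example.com>"));

	/* raw input: "my \\\"name\\\"" <foo@example.com> */
	unquote_liberal("\"my \\\\\\\"name\\\\\\\"\" <foo@example.com>", buf);
	assert(!strcmp(buf, "my \\\"name\\\" <foo@example.com>"));
}
```

With this loop the two inputs do not come out the same: the unquoted form
loses its backslashes, while the quoted form keeps one literal backslash
in front of each inner quote.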

> > This is obviously getting pretty silly, but if we are going to follow
> > the RFC, I think you actually have to do a recursive parse, and keep
> > track of an arbitrary depth of context.
> > 
> > I dunno. This method probably covers most cases in practice, and it's
> > easy to reason about.
> 
> The problem is, how do you differentiate between nested comments, and
> escaped braces within a comment after one run?

I'm not sure what you mean. Escaped characters are always handled first
in your loop. Can you give an example (although if you agree with what I
wrote above, it may not be worth discussing further)?

-Peff

Thread overview: 32+ messages
2016-09-16 21:02 [PATCH] mailinfo: unescape quoted-pair in header fields Kevin Daudt
2016-09-16 22:22 ` Jeff King
2016-09-19 10:51   ` Kevin Daudt
2016-09-20  3:57     ` Jeff King [this message]
2016-09-21 16:07       ` Junio C Hamano
2016-09-19 18:54 ` [PATCH v2 0/2] Handle escape characters in From field Kevin Daudt
2016-09-25 21:08   ` [PATCH v3 1/2] t5100-mailinfo: replace common path prefix with variable Kevin Daudt
2016-09-25 21:08     ` [PATCH v3 2/2] mailinfo: unescape quoted-pair in header fields Kevin Daudt
2016-09-26 19:11       ` Junio C Hamano
2016-09-26 19:26         ` Junio C Hamano
2016-09-26 19:44           ` Kevin Daudt
2016-09-26 22:23             ` Junio C Hamano
2016-09-27 10:26               ` Kevin Daudt
2016-09-26 19:06     ` [PATCH v3 1/2] t5100-mailinfo: replace common path prefix with variable Junio C Hamano
2016-09-28 19:49     ` [PATCH v4 0/2] Handle RFC2822 quoted-pairs in From header Kevin Daudt
2016-09-28 19:52       ` [PATCH v4 1/2] t5100-mailinfo: replace common path prefix with variable Kevin Daudt
2016-09-28 20:21         ` Junio C Hamano
2016-09-28 20:27           ` Kevin Daudt
2016-09-28 19:52       ` [PATCH v4 2/2] mailinfo: unescape quoted-pair in header fields Kevin Daudt
2016-09-19 18:54 ` [PATCH v2 1/2] t5100-mailinfo: replace common path prefix with variable Kevin Daudt
2016-09-19 21:16   ` Junio C Hamano
2016-09-20  3:59     ` Jeff King
2016-09-19 18:54 ` [PATCH v2 2/2] mailinfo: unescape quoted-pair in header fields Kevin Daudt
2016-09-19 21:24   ` Junio C Hamano
2016-09-19 22:04     ` Junio C Hamano
2016-09-20  4:28   ` Jeff King
2016-09-21 11:09   ` Jeff King
2016-09-22 22:17     ` Junio C Hamano
2016-09-23  4:15       ` Jeff King
2016-09-25 20:17         ` Kevin Daudt
2016-09-25 22:38           ` Jakub Narębski
2016-09-26  5:02             ` Kevin Daudt
