git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* git send-email Cc with cruft not working as expected
@ 2017-08-22 23:15 Jacob Keller
  2017-08-22 23:18 ` Stefan Beller
  0 siblings, 1 reply; 13+ messages in thread
From: Jacob Keller @ 2017-08-22 23:15 UTC (permalink / raw)
  To: Git mailing list

Hi,

I recently found an issue with git-send-email where it does not
properly remove the cruft of an email address when sending using a Cc:
line.

The specific example is with a commit containing the following Cc line,

Cc: stable@vger.kernel.org # 4.10+

which is the standard way Linux upstream expects the stable Ccs to be,
and I saw several examples of this in the past.

However, this gets converted into a cc of
"stable@vger.kernel.org#4.10+" which isn't a valid address obviously.

This does work as expected if you remember to

Cc: <stable@vger.kernel.org> # 4.10+

I would have assumed that validate_address would kick in and let me
know that the address I'd given isn't valid, or something along those
lines.

I tried to come up with a test for this, but modifying t9001 seemed to
cause other failures and I couldn't detangle exactly how the tests fit
together.

Is this simply expected behavior and I need to remember to use <>
around the address?

Thanks,
Jake

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: git send-email Cc with cruft not working as expected
  2017-08-22 23:15 git send-email Cc with cruft not working as expected Jacob Keller
@ 2017-08-22 23:18 ` Stefan Beller
  2017-08-22 23:30   ` Jacob Keller
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Beller @ 2017-08-22 23:18 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Git mailing list

On Tue, Aug 22, 2017 at 4:15 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
> Hi,
>
> I recently found an issue with git-send-email where it does not
> properly remove the cruft of an email address when sending using a Cc:
> line.
>
> The specific example is with a commit containing the following Cc line,
>
> Cc: stable@vger.kernel.org # 4.10+

Please see and discuss at
https://public-inbox.org/git/20170216174924.GB2625@localhost/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: git send-email Cc with cruft not working as expected
  2017-08-22 23:18 ` Stefan Beller
@ 2017-08-22 23:30   ` Jacob Keller
  2017-08-22 23:36     ` Stefan Beller
  0 siblings, 1 reply; 13+ messages in thread
From: Jacob Keller @ 2017-08-22 23:30 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Git mailing list

On Tue, Aug 22, 2017 at 4:18 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Aug 22, 2017 at 4:15 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
>> Hi,
>>
>> I recently found an issue with git-send-email where it does not
>> properly remove the cruft of an email address when sending using a Cc:
>> line.
>>
>> The specific example is with a commit containing the following Cc line,
>>
>> Cc: stable@vger.kernel.org # 4.10+
>
> Please see and discuss at
> https://public-inbox.org/git/20170216174924.GB2625@localhost/

I read that thread, and it addressed the problem of

Cc: <stable@vger.kernel.org> # 4.10+

but did not fix this case without the <> around the email address.

Additionally I just discovered that the behavior here changes pretty
drastically if you have Email::Validate installed, now it splits the
address into multiple things:

stable@vger.kernel.org, #, 4.10+

Thanks,
Jake

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: git send-email Cc with cruft not working as expected
  2017-08-22 23:30   ` Jacob Keller
@ 2017-08-22 23:36     ` Stefan Beller
  2017-08-23 10:02       ` Matthieu Moy
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Beller @ 2017-08-22 23:36 UTC (permalink / raw)
  To: Jacob Keller, johan, Matthieu Moy; +Cc: Git mailing list

+cc people from that thread

On Tue, Aug 22, 2017 at 4:30 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
> On Tue, Aug 22, 2017 at 4:18 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Tue, Aug 22, 2017 at 4:15 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
>>> Hi,
>>>
>>> I recently found an issue with git-send-email where it does not
>>> properly remove the cruft of an email address when sending using a Cc:
>>> line.
>>>
>>> The specific example is with a commit containing the following Cc line,
>>>
>>> Cc: stable@vger.kernel.org # 4.10+
>>
>> Please see and discuss at
>> https://public-inbox.org/git/20170216174924.GB2625@localhost/
>
> I read that thread, and it addressed the problem of
>
> Cc: <stable@vger.kernel.org> # 4.10+
>
> but did not fix this case without the <> around the email address.
>
> Additionally I just discovered that the behavior here changes pretty
> drastically if you have Email::Validate installed, now it splits the
> address into multiple things:
>
> stable@vger.kernel.org, #, 4.10+
>
> Thanks,
> Jake

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: git send-email Cc with cruft not working as expected
  2017-08-22 23:36     ` Stefan Beller
@ 2017-08-23 10:02       ` Matthieu Moy
  2017-08-23 10:21         ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
  2017-08-23 22:49         ` git send-email Cc with cruft not working as expected Jacob Keller
  0 siblings, 2 replies; 13+ messages in thread
From: Matthieu Moy @ 2017-08-23 10:02 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jacob Keller, johan, Git mailing list

Stefan Beller <sbeller@google.com> writes:

> +cc people from that thread
>
> On Tue, Aug 22, 2017 at 4:30 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
>> On Tue, Aug 22, 2017 at 4:18 PM, Stefan Beller <sbeller@google.com> wrote:
>>> On Tue, Aug 22, 2017 at 4:15 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I recently found an issue with git-send-email where it does not
>>>> properly remove the cruft of an email address when sending using a Cc:
>>>> line.
>>>>
>>>> The specific example is with a commit containing the following Cc line,
>>>>
>>>> Cc: stable@vger.kernel.org # 4.10+
>>>
>>> Please see and discuss at
>>> https://public-inbox.org/git/20170216174924.GB2625@localhost/
>>
>> I read that thread, and it addressed the problem of
>>
>> Cc: <stable@vger.kernel.org> # 4.10+
>>
>> but did not fix this case without the <> around the email address.

Indeed. It detects garbage as "everything after >".

I feel really sorry that we need so many iterations to get back to a
correct behavior :-(.

>> Additionally I just discovered that the behavior here changes pretty
>> drastically if you have Email::Validate installed, now it splits the
>> address into multiple things:

(I'm assuming you mean Email::Address, there's also Email::Valid but I
don't think it would modify the behavior)

Hmm, I think we reached the point where we should just stop using
Email::Address.

Patch series follows and should address both points.

-- 
Matthieu Moy
http://matthieu-moy.fr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH 1/2] send-email: fix garbage removal after address
  2017-08-23 10:02       ` Matthieu Moy
@ 2017-08-23 10:21         ` Matthieu Moy
  2017-08-23 10:21           ` [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
                             ` (2 more replies)
  2017-08-23 22:49         ` git send-email Cc with cruft not working as expected Jacob Keller
  1 sibling, 3 replies; 13+ messages in thread
From: Matthieu Moy @ 2017-08-23 10:21 UTC (permalink / raw)
  To: git; +Cc: Matthieu Moy

This is a followup over 9d33439 (send-email: only allow one address
per body tag, 2017-02-20). The first iteration did allow writting

  Cc: <foo@example.com> # garbage

but did so by matching the regex ([^>]*>?), i.e. stop after the first
instance of '>'. However, it did not properly deal with

  Cc: foo@example.com # garbage

Fix this using a new function strip_garbage_one_address, which does
essentially what the old ([^>]*>?) was doing, but dealing with more
corner-cases. Since we've allowed

  Cc: "Foo # Bar" <foobar@example.com>

in previous versions, it makes sense to continue allowing it (but we
still remove any garbage after it). OTOH, when an address is given
without quoting, we just take the first word and ignore everything
after.

Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
---
Also available as: https://github.com/git/git/pull/398

 git-send-email.perl   | 26 ++++++++++++++++++++++++--
 t/t9001-send-email.sh |  4 ++++
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index fa6526986e..33a69ffe5d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1089,6 +1089,26 @@ sub sanitize_address {
 
 }
 
+sub strip_garbage_one_address {
+	my ($addr) = @_;
+	chomp $addr;
+	if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+		# "Foo Bar" <foobar@example.com> [possibly garbage here]
+		# Foo Bar <foobar@example.com> [possibly garbage here]
+		return $1;
+	}
+	if ($addr =~ /^(<[^>]*>).*/) {
+		# <foo@example.com> [possibly garbage here]
+		# if garbage contains other addresses, they are ignored.
+		return $1;
+	}
+	if ($addr =~ /^([^"#,\s]*)/) {
+		# address without quoting: remove anything after the address
+		return $1;
+	}
+	return $addr;
+}
+
 sub sanitize_address_list {
 	return (map { sanitize_address($_) } @_);
 }
@@ -1590,10 +1610,12 @@ foreach my $t (@files) {
 	# Now parse the message body
 	while(<$fh>) {
 		$message .=  $_;
-		if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {
+		if (/^(Signed-off-by|Cc): (.*)/i) {
 			chomp;
 			my ($what, $c) = ($1, $2);
-			chomp $c;
+			# strip garbage for the address we'll use:
+			$c = strip_garbage_one_address($c);
+			# sanitize a bit more to decide whether to suppress the address:
 			my $sc = sanitize_address($c);
 			if ($sc eq $sender) {
 				next if ($suppress_cc{'self'});
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index d1e4e8ad19..f30980895c 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -148,6 +148,8 @@ cat >expected-cc <<\EOF
 !two@example.com!
 !three@example.com!
 !four@example.com!
+!five@example.com!
+!six@example.com!
 EOF
 "
 
@@ -161,6 +163,8 @@ test_expect_success $PREREQ 'cc trailer with various syntax' '
 	Cc: <two@example.com> # trailing comments are ignored
 	Cc: <three@example.com>, <not.four@example.com> one address per line
 	Cc: "Some # Body" <four@example.com> [ <also.a.comment> ]
+	Cc: five@example.com # not.six@example.com
+	Cc: six@example.com, not.seven@example.com
 	EOF
 	clean_fake_sendmail &&
 	git send-email -1 --to=recipient@example.com \
-- 
2.14.0.rc0.dirty


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available
  2017-08-23 10:21         ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
@ 2017-08-23 10:21           ` Matthieu Moy
  2017-08-23 21:59           ` [RFC PATCH 1/2] send-email: fix garbage removal after address Jacob Keller
  2017-08-24 20:32           ` Junio C Hamano
  2 siblings, 0 replies; 13+ messages in thread
From: Matthieu Moy @ 2017-08-23 10:21 UTC (permalink / raw)
  To: git; +Cc: Matthieu Moy

Using Mail::Address made sense when we didn't have a proper parser. We
now have a reasonable address parser, and using Mail::Address
_if available_ causes much more trouble than it gives benefits:

* Developers typically test one version, not both.

* Users may not be aware that installing Mail::Address will change the
  behavior. They may complain about the behavior in one case without
  knowing that Mail::Address is involved.

* Having this optional Mail::Address makes it tempting to anwser "please
  install Mail::Address" to users instead of fixing our own code. We've
  reached the stage where bugs in our parser should be fixed, not worked
  around.

Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
---
 git-send-email.perl | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 33a69ffe5d..2208dcc213 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -155,7 +155,6 @@ sub format_2822_time {
 }
 
 my $have_email_valid = eval { require Email::Valid; 1 };
-my $have_mail_address = eval { require Mail::Address; 1 };
 my $smtp;
 my $auth;
 my $num_sent = 0;
@@ -490,11 +489,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-	if ($have_mail_address) {
-		return map { $_->format } Mail::Address->parse($_[0]);
-	} else {
-		return Git::parse_mailboxes($_[0]);
-	}
+	return Git::parse_mailboxes($_[0]);
 }
 
 sub split_addrs {
-- 
2.14.0.rc0.dirty


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] send-email: fix garbage removal after address
  2017-08-23 10:21         ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
  2017-08-23 10:21           ` [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
@ 2017-08-23 21:59           ` Jacob Keller
  2017-08-24 20:32           ` Junio C Hamano
  2 siblings, 0 replies; 13+ messages in thread
From: Jacob Keller @ 2017-08-23 21:59 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Git mailing list

On Wed, Aug 23, 2017 at 3:21 AM, Matthieu Moy <git@matthieu-moy.fr> wrote:
> This is a followup over 9d33439 (send-email: only allow one address
> per body tag, 2017-02-20). The first iteration did allow writting
>
>   Cc: <foo@example.com> # garbage
>
> but did so by matching the regex ([^>]*>?), i.e. stop after the first
> instance of '>'. However, it did not properly deal with
>
>   Cc: foo@example.com # garbage
>
> Fix this using a new function strip_garbage_one_address, which does
> essentially what the old ([^>]*>?) was doing, but dealing with more
> corner-cases. Since we've allowed
>
>   Cc: "Foo # Bar" <foobar@example.com>
>
> in previous versions, it makes sense to continue allowing it (but we
> still remove any garbage after it). OTOH, when an address is given
> without quoting, we just take the first word and ignore everything
> after.
>
> Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
> ---

I pulled this and tested it for my issue, and it fixes the problem for
me. I think the approach in the code was solid too, extracting out the
logic helps make the code more clear.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: git send-email Cc with cruft not working as expected
  2017-08-23 10:02       ` Matthieu Moy
  2017-08-23 10:21         ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
@ 2017-08-23 22:49         ` Jacob Keller
  1 sibling, 0 replies; 13+ messages in thread
From: Jacob Keller @ 2017-08-23 22:49 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Stefan Beller, johan, Git mailing list

On Wed, Aug 23, 2017 at 3:02 AM, Matthieu Moy <git@matthieu-moy.fr> wrote:
>> On Tue, Aug 22, 2017 at 4:30 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
>>> Additionally I just discovered that the behavior here changes pretty
>>> drastically if you have Email::Validate installed, now it splits the
>>> address into multiple things:
>
> (I'm assuming you mean Email::Address, there's also Email::Valid but I
> don't think it would modify the behavior)
>

No I actually definitely meant Email::Valid. I already had
Mail::Address installed, and I then installed Email::Valid, and it
changed behavior to split the cruft into multiple addresses.

I don't actually know why or how it did this, but I'm certain it was
presence of Email::Valid that did it.

However, your first patch addresses the issue since you remove the
cruft well before passing it into Email::Valid anyways.

> Hmm, I think we reached the point where we should just stop using
> Email::Address.

I do agree, I don't think we should use Mail::Address.

>
> Patch series follows and should address both points.
>
> --
> Matthieu Moy
> http://matthieu-moy.fr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] send-email: fix garbage removal after address
  2017-08-23 10:21         ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
  2017-08-23 10:21           ` [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
  2017-08-23 21:59           ` [RFC PATCH 1/2] send-email: fix garbage removal after address Jacob Keller
@ 2017-08-24 20:32           ` Junio C Hamano
  2017-08-25  9:11             ` Matthieu Moy
  2 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-08-24 20:32 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git

Matthieu Moy <git@matthieu-moy.fr> writes:

> This is a followup over 9d33439 (send-email: only allow one address
> per body tag, 2017-02-20). The first iteration did allow writting
>
>   Cc: <foo@example.com> # garbage
>
> but did so by matching the regex ([^>]*>?), i.e. stop after the first
> instance of '>'. However, it did not properly deal with
>
>   Cc: foo@example.com # garbage
>
> Fix this using a new function strip_garbage_one_address, which does
> essentially what the old ([^>]*>?) was doing, but dealing with more
> corner-cases. Since we've allowed
>
>   Cc: "Foo # Bar" <foobar@example.com>
>
> in previous versions, it makes sense to continue allowing it (but we
> still remove any garbage after it). OTOH, when an address is given
> without quoting, we just take the first word and ignore everything
> after.
>
> Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
> ---
> Also available as: https://github.com/git/git/pull/398
>
>  git-send-email.perl   | 26 ++++++++++++++++++++++++--
>  t/t9001-send-email.sh |  4 ++++
>  2 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index fa6526986e..33a69ffe5d 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1089,6 +1089,26 @@ sub sanitize_address {
>  
>  }
>  
> +sub strip_garbage_one_address {
> +	my ($addr) = @_;
> +	chomp $addr;
> +	if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
> +		# "Foo Bar" <foobar@example.com> [possibly garbage here]
> +		# Foo Bar <foobar@example.com> [possibly garbage here]
> +		return $1;
> +	}
> +	if ($addr =~ /^(<[^>]*>).*/) {
> +		# <foo@example.com> [possibly garbage here]
> +		# if garbage contains other addresses, they are ignored.
> +		return $1;
> +	}

Isn't this already covered by the first one, which allows an
optional "something", followed by an optional run of SPs, in front
of this exact pattern, so the case where the optional "something"
does not appear and the number of optional SP is zero would exactly
match the one this pattern is meant to cover.

> +	if ($addr =~ /^([^"#,\s]*)/) {
> +		# address without quoting: remove anything after the address
> +		return $1;
> +	}
> +	return $addr;
> +}

By the way, these three regexps smell like they were written
specifically to cover three cases you care about (perhaps the ones
in your proposed log message), but what will be our response when
somebody else comes next time to us and says that their favourite
formatting of "Cc:" line is not covered by these rules?  

Will we add yet another pattern?  Where will it end?  There will be
a point where we instead start telling them to update the convention
of their project so that it will be covered by one of the patterns
we have already developed, I would imagine.

So, from that point of view, I, with devil's advocate hat on, wonder
why we are not saying

	"Cc: s@k.org # cruft"?  Use "Cc: <s@k.org> # cruft" instead
	and you'd be fine.

right now, without this patch.

I do not _mind_ us trying to be extra nice for a while, and I
certainly do not mind _this_ particular patch that gives us a single
helper function that future "here is another way to spell cruft"
rules can go, but I feel that there should be some line that lets us
say that we've done enough.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/2] send-email: fix garbage removal after address
  2017-08-24 20:32           ` Junio C Hamano
@ 2017-08-25  9:11             ` Matthieu Moy
  2017-08-25  9:11               ` [PATCH v2 " Matthieu Moy
  0 siblings, 1 reply; 13+ messages in thread
From: Matthieu Moy @ 2017-08-25  9:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Matthieu Moy <git@matthieu-moy.fr> writes:
>
>> +sub strip_garbage_one_address {
>> +	my ($addr) = @_;
>> +	chomp $addr;
>> +	if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
>> +		# "Foo Bar" <foobar@example.com> [possibly garbage here]
>> +		# Foo Bar <foobar@example.com> [possibly garbage here]
>> +		return $1;
>> +	}
>> +	if ($addr =~ /^(<[^>]*>).*/) {
>> +		# <foo@example.com> [possibly garbage here]
>> +		# if garbage contains other addresses, they are ignored.
>> +		return $1;
>> +	}
>
> Isn't this already covered by the first one,

Oops, indeed. I just removed the second "if" (and added the appropriate
comment to the first):

+       if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+               # "Foo Bar" <foobar@example.com> [possibly garbage here]
+               # Foo Bar <foobar@example.com> [possibly garbage here]
+               # <foo@example.com> [possibly garbage here]
+               return $1;
+       }

> By the way, these three regexps smell like they were written
> specifically to cover three cases you care about (perhaps the ones
> in your proposed log message), but what will be our response when
> somebody else comes next time to us and says that their favourite
> formatting of "Cc:" line is not covered by these rules?

Well, actually the last one covers essentially everything. Just stop at
the first space, #, ',' or '"'. The first case is here to allow putting
a name in front of the address, which is something we've already allowed
and sounds reasonable from the user point of view.

OTOH, I didn't bother with real corner-cases like

  Cc: "Foo \"bar\"" <foobar@example.com>

> So, from that point of view, I, with devil's advocate hat on, wonder
> why we are not saying
>
> 	"Cc: s@k.org # cruft"?  Use "Cc: <s@k.org> # cruft" instead
> 	and you'd be fine.
>
> right now, without this patch.

I would agree if the broken case were an exotic one. But a plain adress
is really the simplest use-case I can think of, so it's hard to say
"don't do that" when we should say "sorry, we should obviously have
thought about this use-case".

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/2] send-email: fix garbage removal after address
  2017-08-25  9:11             ` Matthieu Moy
@ 2017-08-25  9:11               ` Matthieu Moy
  2017-08-25  9:12                 ` [PATCH v2 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
  0 siblings, 1 reply; 13+ messages in thread
From: Matthieu Moy @ 2017-08-25  9:11 UTC (permalink / raw)
  To: gitster; +Cc: git, Matthieu Moy

This is a followup over 9d33439 (send-email: only allow one address
per body tag, 2017-02-20). The first iteration did allow writting

  Cc: <foo@example.com> # garbage

but did so by matching the regex ([^>]*>?), i.e. stop after the first
instance of '>'. However, it did not properly deal with

  Cc: foo@example.com # garbage

Fix this using a new function strip_garbage_one_address, which does
essentially what the old ([^>]*>?) was doing, but dealing with more
corner-cases. Since we've allowed

  Cc: "Foo # Bar" <foobar@example.com>

in previous versions, it makes sense to continue allowing it (but we
still remove any garbage after it). OTOH, when an address is given
without quoting, we just take the first word and ignore everything
after.

Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
---
Change since v1: removed dead code as suggested by Junio.


 git-send-email.perl   | 22 ++++++++++++++++++++--
 t/t9001-send-email.sh |  4 ++++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index fa6526986e..dfd646ac5b 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1089,6 +1089,22 @@ sub sanitize_address {
 
 }
 
+sub strip_garbage_one_address {
+	my ($addr) = @_;
+	chomp $addr;
+	if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+		# "Foo Bar" <foobar@example.com> [possibly garbage here]
+		# Foo Bar <foobar@example.com> [possibly garbage here]
+		# <foo@example.com> [possibly garbage here]
+		return $1;
+	}
+	if ($addr =~ /^([^"#,\s]*)/) {
+		# address without quoting: remove anything after the address
+		return $1;
+	}
+	return $addr;
+}
+
 sub sanitize_address_list {
 	return (map { sanitize_address($_) } @_);
 }
@@ -1590,10 +1606,12 @@ foreach my $t (@files) {
 	# Now parse the message body
 	while(<$fh>) {
 		$message .=  $_;
-		if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {
+		if (/^(Signed-off-by|Cc): (.*)/i) {
 			chomp;
 			my ($what, $c) = ($1, $2);
-			chomp $c;
+			# strip garbage for the address we'll use:
+			$c = strip_garbage_one_address($c);
+			# sanitize a bit more to decide whether to suppress the address:
 			my $sc = sanitize_address($c);
 			if ($sc eq $sender) {
 				next if ($suppress_cc{'self'});
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index d1e4e8ad19..f30980895c 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -148,6 +148,8 @@ cat >expected-cc <<\EOF
 !two@example.com!
 !three@example.com!
 !four@example.com!
+!five@example.com!
+!six@example.com!
 EOF
 "
 
@@ -161,6 +163,8 @@ test_expect_success $PREREQ 'cc trailer with various syntax' '
 	Cc: <two@example.com> # trailing comments are ignored
 	Cc: <three@example.com>, <not.four@example.com> one address per line
 	Cc: "Some # Body" <four@example.com> [ <also.a.comment> ]
+	Cc: five@example.com # not.six@example.com
+	Cc: six@example.com, not.seven@example.com
 	EOF
 	clean_fake_sendmail &&
 	git send-email -1 --to=recipient@example.com \
-- 
2.14.0.rc0.dirty


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/2] send-email: don't use Mail::Address, even if available
  2017-08-25  9:11               ` [PATCH v2 " Matthieu Moy
@ 2017-08-25  9:12                 ` Matthieu Moy
  0 siblings, 0 replies; 13+ messages in thread
From: Matthieu Moy @ 2017-08-25  9:12 UTC (permalink / raw)
  To: gitster; +Cc: git, Matthieu Moy

Using Mail::Address made sense when we didn't have a proper parser. We
now have a reasonable address parser, and using Mail::Address
_if available_ causes much more trouble than it gives benefits:

* Developers typically test one version, not both.

* Users may not be aware that installing Mail::Address will change the
  behavior. They may complain about the behavior in one case without
  knowing that Mail::Address is involved.

* Having this optional Mail::Address makes it tempting to anwser "please
  install Mail::Address" to users instead of fixing our own code. We've
  reached the stage where bugs in our parser should be fixed, not worked
  around.

Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
---
 git-send-email.perl | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index dfd646ac5b..0061dbfab9 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -155,7 +155,6 @@ sub format_2822_time {
 }
 
 my $have_email_valid = eval { require Email::Valid; 1 };
-my $have_mail_address = eval { require Mail::Address; 1 };
 my $smtp;
 my $auth;
 my $num_sent = 0;
@@ -490,11 +489,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-	if ($have_mail_address) {
-		return map { $_->format } Mail::Address->parse($_[0]);
-	} else {
-		return Git::parse_mailboxes($_[0]);
-	}
+	return Git::parse_mailboxes($_[0]);
 }
 
 sub split_addrs {
-- 
2.14.0.rc0.dirty


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-08-25  9:12 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-22 23:15 git send-email Cc with cruft not working as expected Jacob Keller
2017-08-22 23:18 ` Stefan Beller
2017-08-22 23:30   ` Jacob Keller
2017-08-22 23:36     ` Stefan Beller
2017-08-23 10:02       ` Matthieu Moy
2017-08-23 10:21         ` [RFC PATCH 1/2] send-email: fix garbage removal after address Matthieu Moy
2017-08-23 10:21           ` [RFC PATCH 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
2017-08-23 21:59           ` [RFC PATCH 1/2] send-email: fix garbage removal after address Jacob Keller
2017-08-24 20:32           ` Junio C Hamano
2017-08-25  9:11             ` Matthieu Moy
2017-08-25  9:11               ` [PATCH v2 " Matthieu Moy
2017-08-25  9:12                 ` [PATCH v2 2/2] send-email: don't use Mail::Address, even if available Matthieu Moy
2017-08-23 22:49         ` git send-email Cc with cruft not working as expected Jacob Keller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).