git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Matthieu Moy <git@matthieu-moy.fr>
Cc: gitster@pobox.com, git@vger.kernel.org,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Thomas Adam" <thomas@xteddy.org>
Subject: Re: [PATCH v3 1/3] send-email: add and use a local copy of Mail::Address
Date: Mon, 08 Jan 2018 11:56:15 +0000	[thread overview]
Message-ID: <87incco97k.fsf@linaro.org> (raw)
In-Reply-To: <1515407674-5233-1-git-send-email-git@matthieu-moy.fr>


Matthieu Moy <git@matthieu-moy.fr> writes:

> We used to have two versions of the email parsing code. Our
> parse_mailboxes (in Git.pm), and Mail::Address which we used if
> installed. Unfortunately, both versions have different sets of bugs, and
> changing the behavior of git depending on whether Mail::Address is
> installed was a bad idea.
>
> A first attempt to solve this was cc90750 (send-email: don't use
> Mail::Address, even if available, 2017-08-23), but it turns out our
> parse_mailboxes is too buggy for some uses. For example the lack of
> nested comments support breaks get_maintainer.pl in the Linux kernel
> tree:
>
>   https://public-inbox.org/git/20171116154814.23785-1-alex.bennee@linaro.org/
>
> This patch goes the other way: use Mail::Address anyway, but have a
> local copy from CPAN as a fallback, when the system one is not
> available.
>
> The duplicated script is small (276 lines of code) and stable in time.
> Maintaining the local copy should not be an issue, and will certainly be
> less burden than maintaining our own parse_mailboxes.
>
> Another option would be to consider Mail::Address as a hard dependency,
> but it's easy enough to save the trouble of extra-dependency to the end
> user or packager.
>
> Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>


> ---
> No change since v2.
>
>  git-send-email.perl               |   3 +-
>  perl/Git/FromCPAN/Mail/Address.pm | 276 ++++++++++++++++++++++++++++++++++++++
>  perl/Git/Mail/Address.pm          |  24 ++++
>  3 files changed, 302 insertions(+), 1 deletion(-)
>  create mode 100644 perl/Git/FromCPAN/Mail/Address.pm
>  create mode 100755 perl/Git/Mail/Address.pm
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index edcc6d3..340b5c8 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -30,6 +30,7 @@ use Error qw(:try);
>  use Cwd qw(abs_path cwd);
>  use Git;
>  use Git::I18N;
> +use Git::Mail::Address;
>
>  Getopt::Long::Configure qw/ pass_through /;
>
> @@ -489,7 +490,7 @@ my ($repoauthor, $repocommitter);
>  ($repocommitter) = Git::ident_person(@repo, 'committer');
>
>  sub parse_address_line {
> -	return Git::parse_mailboxes($_[0]);
> +	return map { $_->format } Mail::Address->parse($_[0]);
>  }
>
>  sub split_addrs {
> diff --git a/perl/Git/FromCPAN/Mail/Address.pm b/perl/Git/FromCPAN/Mail/Address.pm
> new file mode 100644
> index 0000000..13b2ff7
> --- /dev/null
> +++ b/perl/Git/FromCPAN/Mail/Address.pm
> @@ -0,0 +1,276 @@
> +# Copyrights 1995-2017 by [Mark Overmeer <perl@overmeer.net>].
> +#  For other contributors see ChangeLog.
> +# See the manual pages for details on the licensing terms.
> +# Pod stripped from pm file by OODoc 2.02.
> +package Mail::Address;
> +use vars '$VERSION';
> +$VERSION = '2.19';
> +
> +use strict;
> +
> +use Carp;
> +
> +# use locale;   removed in version 1.78, because it causes taint problems
> +
> +sub Version { our $VERSION }
> +
> +
> +
> +# given a comment, attempt to extract a person's name
> +sub _extract_name
> +{   # This function can be called as method as well
> +    my $self = @_ && ref $_[0] ? shift : undef;
> +
> +    local $_ = shift
> +        or return '';
> +
> +    # Using encodings, too hard. See Mail::Message::Field::Full.
> +    return '' if m/\=\?.*?\?\=/;
> +
> +    # trim whitespace
> +    s/^\s+//;
> +    s/\s+$//;
> +    s/\s+/ /;
> +
> +    # Disregard numeric names (e.g. 123456.1234@compuserve.com)
> +    return "" if /^[\d ]+$/;
> +
> +    s/^\((.*)\)$/$1/; # remove outermost parenthesis
> +    s/^"(.*)"$/$1/;   # remove outer quotation marks
> +    s/\(.*?\)//g;     # remove minimal embedded comments
> +    s/\\//g;          # remove all escapes
> +    s/^"(.*)"$/$1/;   # remove internal quotation marks
> +    s/^([^\s]+) ?, ?(.*)$/$2 $1/; # reverse "Last, First M." if applicable
> +    s/,.*//;
> +
> +    # Change casing only when the name contains only upper or only
> +    # lower cased characters.
> +    unless( m/[A-Z]/ && m/[a-z]/ )
> +    {   # Set the case of the name to first char upper rest lower
> +        s/\b(\w+)/\L\u$1/igo;  # Upcase first letter on name
> +        s/\bMc(\w)/Mc\u$1/igo; # Scottish names such as 'McLeod'
> +        s/\bo'(\w)/O'\u$1/igo; # Irish names such as 'O'Malley, O'Reilly'
> +        s/\b(x*(ix)?v*(iv)?i*)\b/\U$1/igo; # Roman numerals, eg 'Level III Support'
> +    }
> +
> +    # some cleanup
> +    s/\[[^\]]*\]//g;
> +    s/(^[\s'"]+|[\s'"]+$)//g;
> +    s/\s{2,}/ /g;
> +
> +    $_;
> +}
> +
> +sub _tokenise
> +{   local $_ = join ',', @_;
> +    my (@words,$snippet,$field);
> +
> +    s/\A\s+//;
> +    s/[\r\n]+/ /g;
> +
> +    while ($_ ne '')
> +    {   $field = '';
> +        if(s/^\s*\(/(/ )    # (...)
> +        {   my $depth = 0;
> +
> +     PAREN: while(s/^(\(([^\(\)\\]|\\.)*)//)
> +            {   $field .= $1;
> +                $depth++;
> +                while(s/^(([^\(\)\\]|\\.)*\)\s*)//)
> +                {   $field .= $1;
> +                    last PAREN unless --$depth;
> +	            $field .= $1 if s/^(([^\(\)\\]|\\.)+)//;
> +                }
> +            }
> +
> +            carp "Unmatched () '$field' '$_'"
> +                if $depth;
> +
> +            $field =~ s/\s+\Z//;
> +            push @words, $field;
> +
> +            next;
> +        }
> +
> +        if( s/^("(?:[^"\\]+|\\.)*")\s*//       # "..."
> +         || s/^(\[(?:[^\]\\]+|\\.)*\])\s*//    # [...]
> +         || s/^([^\s()<>\@,;:\\".[\]]+)\s*//
> +         || s/^([()<>\@,;:\\".[\]])\s*//
> +          )
> +        {   push @words, $1;
> +            next;
> +        }
> +
> +        croak "Unrecognised line: $_";
> +    }
> +
> +    push @words, ",";
> +    \@words;
> +}
> +
> +sub _find_next
> +{   my ($idx, $tokens, $len) = @_;
> +
> +    while($idx < $len)
> +    {   my $c = $tokens->[$idx];
> +        return $c if $c eq ',' || $c eq ';' || $c eq '<';
> +        $idx++;
> +    }
> +
> +    "";
> +}
> +
> +sub _complete
> +{   my ($class, $phrase, $address, $comment) = @_;
> +
> +    @$phrase || @$comment || @$address
> +       or return undef;
> +
> +    my $o = $class->new(join(" ",@$phrase), join("",@$address), join(" ",@$comment));
> +    @$phrase = @$address = @$comment = ();
> +    $o;
> +}
> +
> +#------------
> +
> +sub new(@)
> +{   my $class = shift;
> +    bless [@_], $class;
> +}
> +
> +
> +sub parse(@)
> +{   my $class = shift;
> +    my @line  = grep {defined} @_;
> +    my $line  = join '', @line;
> +
> +    my (@phrase, @comment, @address, @objs);
> +    my ($depth, $idx) = (0, 0);
> +
> +    my $tokens  = _tokenise @line;
> +    my $len     = @$tokens;
> +    my $next    = _find_next $idx, $tokens, $len;
> +
> +    local $_;
> +    for(my $idx = 0; $idx < $len; $idx++)
> +    {   $_ = $tokens->[$idx];
> +
> +        if(substr($_,0,1) eq '(') { push @comment, $_ }
> +        elsif($_ eq '<')    { $depth++ }
> +        elsif($_ eq '>')    { $depth-- if $depth }
> +        elsif($_ eq ',' || $_ eq ';')
> +        {   warn "Unmatched '<>' in $line" if $depth;
> +            my $o = $class->_complete(\@phrase, \@address, \@comment);
> +            push @objs, $o if defined $o;
> +            $depth = 0;
> +            $next = _find_next $idx+1, $tokens, $len;
> +        }
> +        elsif($depth)       { push @address, $_ }
> +        elsif($next eq '<') { push @phrase,  $_ }
> +        elsif( /^[.\@:;]$/ || !@address || $address[-1] =~ /^[.\@:;]$/ )
> +        {   push @address, $_ }
> +        else
> +        {   warn "Unmatched '<>' in $line" if $depth;
> +            my $o = $class->_complete(\@phrase, \@address, \@comment);
> +            push @objs, $o if defined $o;
> +            $depth = 0;
> +            push @address, $_;
> +        }
> +    }
> +    @objs;
> +}
> +
> +#------------
> +
> +sub phrase  { shift->set_or_get(0, @_) }
> +sub address { shift->set_or_get(1, @_) }
> +sub comment { shift->set_or_get(2, @_) }
> +
> +sub set_or_get($)
> +{   my ($self, $i) = (shift, shift);
> +    @_ or return $self->[$i];
> +
> +    my $val = $self->[$i];
> +    $self->[$i] = shift if @_;
> +    $val;
> +}
> +
> +
> +my $atext = '[\-\w !#$%&\'*+/=?^`{|}~]';
> +sub format
> +{   my @addrs;
> +
> +    foreach (@_)
> +    {   my ($phrase, $email, $comment) = @$_;
> +        my @addr;
> +
> +        if(defined $phrase && length $phrase)
> +        {   push @addr
> +              , $phrase =~ /^(?:\s*$atext\s*)+$/o ? $phrase
> +              : $phrase =~ /(?<!\\)"/             ? $phrase
> +              :                                    qq("$phrase");
> +
> +            push @addr, "<$email>"
> +                if defined $email && length $email;
> +        }
> +        elsif(defined $email && length $email)
> +        {   push @addr, $email;
> +        }
> +
> +        if(defined $comment && $comment =~ /\S/)
> +        {   $comment =~ s/^\s*\(?/(/;
> +            $comment =~ s/\)?\s*$/)/;
> +        }
> +
> +        push @addr, $comment
> +            if defined $comment && length $comment;
> +
> +        push @addrs, join(" ", @addr)
> +            if @addr;
> +    }
> +
> +    join ", ", @addrs;
> +}
> +
> +#------------
> +
> +sub name
> +{   my $self   = shift;
> +    my $phrase = $self->phrase;
> +    my $addr   = $self->address;
> +
> +    $phrase    = $self->comment
> +        unless defined $phrase && length $phrase;
> +
> +    my $name   = $self->_extract_name($phrase);
> +
> +    # first.last@domain address
> +    if($name eq '' && $addr =~ /([^\%\.\@_]+([\._][^\%\.\@_]+)+)[\@\%]/)
> +    {   ($name  = $1) =~ s/[\._]+/ /g;
> +	$name   = _extract_name $name;
> +    }
> +
> +    if($name eq '' && $addr =~ m#/g=#i)    # X400 style address
> +    {   my ($f) = $addr =~ m#g=([^/]*)#i;
> +	my ($l) = $addr =~ m#s=([^/]*)#i;
> +	$name   = _extract_name "$f $l";
> +    }
> +
> +    length $name ? $name : undef;
> +}
> +
> +
> +sub host
> +{   my $addr = shift->address || '';
> +    my $i    = rindex $addr, '@';
> +    $i >= 0 ? substr($addr, $i+1) : undef;
> +}
> +
> +
> +sub user
> +{   my $addr = shift->address || '';
> +    my $i    = rindex $addr, '@';
> +    $i >= 0 ? substr($addr,0,$i) : $addr;
> +}
> +
> +1;
> diff --git a/perl/Git/Mail/Address.pm b/perl/Git/Mail/Address.pm
> new file mode 100755
> index 0000000..2ce3e84
> --- /dev/null
> +++ b/perl/Git/Mail/Address.pm
> @@ -0,0 +1,24 @@
> +package Git::Mail::Address;
> +use 5.008;
> +use strict;
> +use warnings;
> +
> +=head1 NAME
> +
> +Git::Mail::Address - Wrapper for the L<Mail::Address> module, in case it's not installed
> +
> +=head1 DESCRIPTION
> +
> +This module is only intended to be used for code shipping in the
> +C<git.git> repository. Use it for anything else at your peril!
> +
> +=cut
> +
> +eval {
> +    require Mail::Address;
> +    1;
> +} or do {
> +    require Git::FromCPAN::Mail::Address;
> +};
> +
> +1;


--
Alex Bennée

  parent reply	other threads:[~2018-01-08 11:56 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-04 18:55 [RFC PATCH 1/2] add a local copy of Mail::Address from CPAN Matthieu Moy
2018-01-04 18:55 ` [RFC PATCH 2/2] Remove now useless email-address parsing code Matthieu Moy
2018-01-04 22:11   ` Alex Bennée
2018-01-05  9:39     ` Matthieu Moy
2018-01-05 10:11       ` [PATCH] send-email: add test for Linux's get_maintainer.pl Matthieu Moy
2018-01-05 11:15         ` Alex Bennée
2018-01-05 11:36           ` Matthieu Moy
2018-01-05 20:16             ` Junio C Hamano
2018-01-04 21:02 ` [RFC PATCH 1/2] add a local copy of Mail::Address from CPAN Eric Sunshine
2018-01-05 14:19 ` Ævar Arnfjörð Bjarmason
2018-01-05 18:36 ` [PATCH v2 1/3] send-email: add and use a local copy of Mail::Address Matthieu Moy
2018-01-05 18:36   ` [PATCH v2 2/3] Remove now useless email-address parsing code Matthieu Moy
2018-01-05 18:36   ` [PATCH v2 3/3] send-email: add test for Linux's get_maintainer.pl Matthieu Moy
2018-01-05 18:59     ` Eric Sunshine
2018-01-08 10:30       ` Matthieu Moy
2018-01-08 10:34   ` [PATCH v3 1/3] send-email: add and use a local copy of Mail::Address Matthieu Moy
2018-01-08 10:34     ` [PATCH v3 2/3] Remove now useless email-address parsing code Matthieu Moy
2018-01-08 11:57       ` Alex Bennée
2018-01-08 10:34     ` [PATCH v3 3/3] send-email: add test for Linux's get_maintainer.pl Matthieu Moy
2018-01-08 18:45       ` Junio C Hamano
2018-01-08 11:56     ` Alex Bennée [this message]
2018-02-14 14:59   ` [PATCH v2 1/3] send-email: add and use a local copy of Mail::Address Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87incco97k.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=avarab@gmail.com \
    --cc=git@matthieu-moy.fr \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=sunshine@sunshineco.com \
    --cc=thomas@xteddy.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).