From: Remi Lespinet <remi.lespinet@ensimag.grenoble-inp.fr>
To: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org,
	Remi Galan <remi.galan-alfonso@ensimag.grenoble-inp.fr>,
	Guillaume Pages <guillaume.pages@ensimag.grenoble-inp.fr>,
	Louis-Alexandre Stuber 
	<louis--alexandre.stuber@ensimag.grenoble-inp.fr>,
	Antoine Delaite <antoine.delaite@ensimag.grenoble-inp.fr>
Subject: [PATCH/RFC v4 07/10] send-email: reduce dependancies impact on parse_address_line
Date: Thu, 18 Jun 2015 17:08:51 +0200 (CEST)
Message-ID: <1444764681.621777.1434640131682.JavaMail.zimbra@ensimag.grenoble-inp.fr>
In-Reply-To: <vpqh9q56yaf.fsf@anie.imag.fr>

> Remi Lespinet <remi.lespinet@ensimag.grenoble-inp.fr> writes:
> 
> > I've some more tests, maybe I should put them all in this post ?
> 
> Yes, please post as much as you have. Ideally, this should be
> automatically tested, but if you don't have time to write the automated
> tests, at least having a track of what you did on the list archives can
> help someone else to do it.

The output may be hard to read without colors, so I have included the
scripts at the end. You can change the tested inputs by editing the
lines after the "cat >.tmplist" line in testall.sh. (There are two
scripts: testall.sh and testone.perl.)

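Here are the test results ("Split" is the output of my split_addrs
function, "M::A" is the output of Mail::Address, and "Same" tells
whether the two match):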

Input: 
Split: 
M::A : 
Same : Yes
----------
Input: Jane
Split: Jane
M::A : Jane
Same : Yes
----------
Input: jdoe@example.com
Split: jdoe@example.com
M::A : jdoe@example.com
Same : Yes
----------
Input: <jdoe@example.com>
Split: jdoe@example.com
M::A : jdoe@example.com
Same : Yes
----------
Input: Jane <jdoe@example.com>
Split: Jane <jdoe@example.com>
M::A : Jane <jdoe@example.com>
Same : Yes
----------
Input: Jane Doe <jdoe@example.com>
Split: Jane Doe <jdoe@example.com>
M::A : Jane Doe <jdoe@example.com>
Same : Yes
----------
Input: Jane\ Doe <jdoe@example.com>
Split: "Jane\ Doe" <jdoe@example.com>
M::A : "Jane \ Doe" <jdoe@example.com>
Same : No
----------
Input: "Jane" <jdoe@example.com>
Split: "Jane" <jdoe@example.com>
M::A : "Jane" <jdoe@example.com>
Same : Yes
----------
Input: "Doe, Jane" <jdoe@example.com>
Split: "Doe, Jane" <jdoe@example.com>
M::A : "Doe, Jane" <jdoe@example.com>
Same : Yes
----------
Input: "Doe, Ja"ne <jdoe@example.com>
Split: "Doe, Ja ne" <jdoe@example.com>
M::A : "Doe, Ja" ne <jdoe@example.com>
Same : No
----------
Input: "Doe, Katarina" Jane <jdoe@example.com>
Split: "Doe, Katarina Jane" <jdoe@example.com>
M::A : "Doe, Katarina" Jane <jdoe@example.com>
Same : No
----------
Input: "Jane@:;\>.,()<Doe" <jdoe@example.com>
Split: "Jane@:;\>.,()<Doe" <jdoe@example.com>
M::A : "Jane@:;\>.,()<Doe" <jdoe@example.com>
Same : Yes
----------
Input: Jane@:;\.,()<>Doe <jdoe@example.com>
Split: Jane@:
     : "\."
     : Doe <jdoe@example.com> ()
M::A : Jane@:
     : \.
     : Doe <jdoe@example.com> ()
Same : No
----------
Input: Jane!#$%&'*+-/=?^_{|}~Doe' <jdoe@example.com>
Split: Jane!#$%&'*+-/=?^_{|}~Doe' <jdoe@example.com>
M::A : Jane!#$%&'*+-/=?^_{|}~Doe' <jdoe@example.com>
Same : Yes
----------
Input: "<jdoe@example.com>"
Split: "<jdoe@example.com>"
M::A : "<jdoe@example.com>"
Same : Yes
----------
Input: "Jane jdoe@example.com"
Split: "Jane jdoe@example.com"
M::A : "Jane jdoe@example.com"
Same : Yes
----------
Input: Jane Doe <jdoe    @   example.com  >
Split: Jane Doe <jdoe@example.com>
M::A : Jane Doe <jdoe@example.com>
Same : Yes
----------
Input: Jane       Doe <  jdoe@example.com  >
Split: Jane Doe <jdoe@example.com>
M::A : Jane Doe <jdoe@example.com>
Same : Yes
----------
Input: Jane @ Doe @ Jane @ Doe
Split: Jane@Doe@Jane@Doe
M::A : Jane@Doe@Jane@Doe
Same : Yes
----------
Input: Jane jdoe@example.com
Split: Janejdoe@example.com
M::A : Jane
     : jdoe@example.com
Same : No
----------
Input: <jdoe@example.com> Jane Doe
Split: jdoe@example.comJaneDoe
M::A : Jane Doe <jdoe@example.com>
Same : No
----------
Input: Jane <jdoe@example.com> Doe
Split: Jane <jdoe@example.comDoe>
M::A : Jane Doe <jdoe@example.com>
Same : No
----------
Input: "Jane, 'Doe'" <jdoe@example.com>
Split: "Jane, 'Doe'" <jdoe@example.com>
M::A : "Jane, 'Doe'" <jdoe@example.com>
Same : Yes
----------
Input: 'Doe, "Jane' <jdoe@example.com>
Split: 'Doe
     : " Jane' <jdoe@example.com>
M::A : 'Doe
     : " Jane' <jdoe@example.com>
Same : Yes
----------
Input: "Jane" "Do"e <jdoe@example.com>
Split: "Jane" "Do" e <jdoe@example.com>
M::A : "Jane" "Do" e <jdoe@example.com>
Same : Yes
----------
Input: "Jane' Doe" <jdoe@example.com>
Split: "Jane' Doe" <jdoe@example.com>
M::A : "Jane' Doe" <jdoe@example.com>
Same : Yes
----------
Input: "Jane Doe <jdoe@example.com>" <jdoe@example.com>
Split: "Jane Doe <jdoe@example.com>" <jdoe@example.com>
M::A : "Jane Doe <jdoe@example.com>" <jdoe@example.com>
Same : Yes
----------
Input: "Jane\" Doe" <jdoe@example.com>
Split: "Jane\" Doe" <jdoe@example.com>
M::A : "Jane\" Doe" <jdoe@example.com>
Same : Yes
----------
Input: Doe, jane <jdoe@example.com>
Split: Doe
     : jane <jdoe@example.com>
M::A : Doe
     : jane <jdoe@example.com>
Same : Yes
----------
Input: "Jane Doe <jdoe@example.com>
Split: " Jane Doe <jdoe@example.com>
M::A : " Jane Doe <jdoe@example.com>
Same : Yes
----------
Input: "Jane "Kat"a" ri"na" ",Doe" <jdoe@example.com>
Split: "Jane  Kat a ri na ,Doe" <jdoe@example.com>
M::A : "Jane " Kat "a" ri "na" ",Doe" <jdoe@example.com>
Same : No
----------
Input: Jane Doe
Split: Jane Doe
M::A : Jane
     : Doe
Same : No
----------
Input: Jane "Doe <jdoe@example.com>"
Split: "Jane Doe <jdoe@example.com>"
M::A : Jane
     : "Doe <jdoe@example.com>"
Same : No
----------
Input: \"Jane Doe <jdoe@example.com>
Split: "\"Jane Doe" <jdoe@example.com>
M::A : \ " Jane Doe <jdoe@example.com>
Same : No
----------
Input: Jane\"\" Doe <jdoe@example.com>
Split: "Jane\"\" Doe" <jdoe@example.com>
M::A : Jane \ " \ " Doe <jdoe@example.com>
Same : No
----------
Input: 'Jane 'Doe' <jdoe@example.com>
Split: 'Jane 'Doe' <jdoe@example.com>
M::A : 'Jane 'Doe' <jdoe@example.com>
Same : Yes
----------
Input: 'Jane "Katarina\" \' Doe' <jdoe@example.com>
Split: "'Jane  Katarina\" \' Doe'" <jdoe@example.com>
M::A : 'Jane " Katarina \ " \ ' Doe' <jdoe@example.com>
Same : No
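
Since the colored output is awkward to scan, here is a rough sketch of
how the results above could be summarized. It is not part of the two
scripts posted below: the function names strip_colors, tally and
mismatches are made up for illustration, and it assumes the output
format shown above ("Input: " and "Same : " line prefixes, with ANSI
color escapes possibly preceding them).

```shell
#!/bin/sh
# Hypothetical helpers (not part of the posted scripts): summarize the
# output of testall.sh read from stdin.

# Remove ANSI color escapes such as ESC[1;32m so the lines can be matched.
strip_colors() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g"
}

# Print how many cases match and how many differ.
tally() {
    strip_colors | awk '
        /^Same : Yes/ { yes++ }
        /^Same : No/  { no++ }
        END { printf "same=%d different=%d\n", yes, no }'
}

# Print only the inputs for which the two parsers disagree.
mismatches() {
    strip_colors | awk '
        /^Input: /   { input = substr($0, 8) }
        /^Same : No/ { print input }'
}
```

With these functions defined in the current shell, "./testall.sh | tally"
would print the two counts and "./testall.sh | mismatches" would list
only the inputs that need a closer look.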


**********************************************************************
*                          SCRIPTS PART                              *
**********************************************************************


---------------------------- testall.sh ----------------------------

#!/bin/sh

cat >.tmplist <<'EOF'

Jane
jdoe@example.com
<jdoe@example.com>
Jane <jdoe@example.com>
Jane Doe <jdoe@example.com>
Jane\ Doe <jdoe@example.com>
"Jane" <jdoe@example.com>
"Doe, Jane" <jdoe@example.com>
"Doe, Ja"ne <jdoe@example.com>
"Doe, Katarina" Jane <jdoe@example.com>
"Jane@:;\>.,()<Doe" <jdoe@example.com>
Jane@:;\.,()<>Doe <jdoe@example.com>
Jane!#$%&'*+-/=?^_{|}~Doe' <jdoe@example.com>
"<jdoe@example.com>"
"Jane jdoe@example.com"
Jane Doe <jdoe    @   example.com  >
Jane       Doe <  jdoe@example.com  >
Jane @ Doe @ Jane @ Doe
Jane jdoe@example.com
<jdoe@example.com> Jane Doe
Jane <jdoe@example.com> Doe
"Jane, 'Doe'" <jdoe@example.com>
'Doe, "Jane' <jdoe@example.com>
"Jane" "Do"e <jdoe@example.com>
"Jane' Doe" <jdoe@example.com>
"Jane Doe <jdoe@example.com>" <jdoe@example.com>
"Jane\" Doe" <jdoe@example.com>
Doe, jane <jdoe@example.com>
"Jane Doe <jdoe@example.com>
"Jane "Kat"a" ri"na" ",Doe" <jdoe@example.com>
Jane Doe
Jane "Doe <jdoe@example.com>"
\"Jane Doe <jdoe@example.com>
Jane\"\" Doe <jdoe@example.com>
'Jane 'Doe' <jdoe@example.com>
'Jane "Katarina\" \' Doe' <jdoe@example.com>
EOF


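while read -r line
do
    # printf rather than echo: some sh implementations expand
    # backslash sequences in echo arguments
    printf 'Input: %s\n' "$line"
    ./testone.perl "$line"
    echo ----------
done <.tmplist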

---------------------------- testone.perl ----------------------------

#!/usr/bin/perl

use strict;
use warnings;

use Term::ANSIColor;
use Mail::Address;
use Text::ParseWords;

my $string = $ARGV[0];

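sub split_addrs {
	# Token patterns: a parenthesized comment, a double-quoted string
	# (backslash escapes allowed), and a run of non-special characters.
	my $re_comment = qr/\((?:[^)]*)\)/;
	my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
	my $re_word = qr/(?:[^]["\s()<>:;@\\,.]|\\.)+/;
	my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/;

	# Tokenize every argument, then append a "," so that the last
	# mailbox is flushed by the loop below.
	my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_;
	push @tokens, ",";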

	my (@addr_list, @phrase, @address, @comment, @buffer) = ();
	foreach my $token (@tokens) {
		if ($token =~ /^[,;]$/) {
			if (@address) {
				push @address, @buffer;
			} else {
				push @phrase, @buffer;
			}
		
			my $str_phrase = join ' ', @phrase;
			my $str_address = join '', @address;
			my $str_comment = join ' ', @comment;
		
			if ($str_phrase =~ /[][()<>:;@\\,.\000-\037\177]/) {
				$str_phrase =~ s/(^|[^\\])"/$1/g;
				$str_phrase = qq["$str_phrase"];
			}
		
			if ($str_address ne "" && $str_phrase ne "") {
				$str_address = qq[<$str_address>];
			}
		
			my $str_mailbox = "$str_phrase $str_address $str_comment";
			$str_mailbox =~ s/^\s*|\s*$//g;
			push @addr_list, $str_mailbox if ($str_mailbox);
		
			@phrase = @address = @comment = @buffer = ();
		} elsif ($token =~ /^\(/) {
			push @comment, $token;
		} elsif ($token eq "<") {
			push @phrase, (splice @address), (splice @buffer);
		} elsif ($token eq ">") {
			push @address, (splice @buffer);
		} elsif ($token eq "@") {
			push @address, (splice @buffer), "@";
		} elsif ($token eq ".") {
			push @address, (splice @buffer), ".";
		} else {
			push @buffer, $token;
		}
	}

	return @addr_list;
}

sub old_split {
	quotewords('\s*,\s*', 1, $_[0]);
}

my @tab = split_addrs($string);
my @ref = map { $_->format } Mail::Address->parse($string);
# my @old = old_split($string);  #can be printed to see the difference

my $tabstring = join "\n", @tab;
my $refstring = join "\n", @ref;
my $same = ($tabstring eq $refstring);

$tabstring =~ s/\n/\n     : /g;
$refstring =~ s/\n/\n     : /g;

print color 'bold yellow';
print "Split: ", "$tabstring", "\n";

print color 'bold blue';
print "M::A : ", "$refstring", "\n";

if ($same) {
	print color 'bold green';
	print "Same : ", "Yes", "\n";
} else {
	print color 'bold red';
	print "Same : ", "No", "\n";
}

print color 'reset';


 

