[PATCH] send-email: extract email-parsing code into a subroutine

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH] send-email: extract email-parsing code into a subroutine
@ 2017-12-02 17:02 Payre Nathan
  2017-12-02 17:11 ` Nathan PAYRE
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Payre Nathan @ 2017-12-02 17:02 UTC (permalink / raw)
  To: git
  Cc: ryan, e, gitster, Nathan Payre, Nathan Payre, Matthieu Moy,
	Timothee Albertin, Daniel Bensoussan

From: Nathan Payre <second.payre@gmail.com>

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine 'parse_header_line()'. This improves the code readability
and make parse_header_line reusable in other place.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
---

This patch is a first step to implement a new feature.
See new feature discussion here: https://public-inbox.org/git/20171030223444.5052-1-nathan.payre@etu.univ-lyon1.fr/

 git-send-email.perl | 106 +++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 80 insertions(+), 26 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..98c2e461c 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -715,41 +715,64 @@ EOT3
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
+
+	my %parsed_email;
+	while (<$c>) {
 		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
+		parse_header_line($_, \%parsed_email);
+		if (/^\n$/i) {
+			while (my $row = <$c>) {
+				if (!($row =~ m/^GIT:/)) {
+					$parsed_email{'body'} = $parsed_email{'body'} . $row;
+				}
+			}
+		}
+	}
+	if ($parsed_email{'from'}) {
+		$sender = $parsed_email{'from'};
+	}
+	if ($parsed_email{'in_reply_to'}) {
+		$initial_reply_to = $parsed_email{'in_reply_to'};
+	}
+	if ($parsed_email{'subject'}) {
+		$initial_subject = $parsed_email{'subject'};
+		print $c2 "Subject: " .
+			quote_subject($parsed_email{'subject'}, $compose_encoding) .
+			"\n";
+	}
+	if ($parsed_email{'mime-version'}) {
+		$need_8bit_cte = 0;
+	}
+	if ($need_8bit_cte) {
+		if ($parsed_email{'content-type'}) {
+				print $c2 "MIME-Version: 1.0\n",
+					 "Content-Type: $parsed_email{'content-type'};",
+					 "Content-Transfer-Encoding: 8bit\n";
+			} else {
 				print $c2 "MIME-Version: 1.0\n",
 					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
+					 "charset=$compose_encoding\n",
 					 "Content-Transfer-Encoding: 8bit\n";
 			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
-		}
-		print $c2 $_;
 	}
+	if ($parsed_email{'body'}) {
+		$summary_empty = 0;
+		print $c2 "\n$parsed_email{'body'}\n";
+	}
+
 	close $c;
 	close $c2;
 
+	open $c2, "<", $compose_filename . ".final"
+		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
+
+	print "affichage : \n";
+	while (<$c2>) {
+		print $_;
+	}
+
+	close $c2;
+
 	if ($summary_empty) {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
@@ -792,6 +815,37 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+
+	foreach (split(/\n/, $lines)) {
+		if (/^From:\s*(.+)$/i) {
+			$parsed_line->{'from'} = $1;
+		} elsif (/^To:\s*(.+)$/i) {
+			$parsed_line->{'to'} = [ parse_address_line($1) ];
+		} elsif (/^Cc:\s*(.+)$/i) {
+			$parsed_line->{'cc'} = [ parse_address_line($1) ];
+		} elsif (/^Bcc:\s*(.+)$/i) {
+			$parsed_line->{'bcc'} = [ parse_address_line($1) ];
+		} elsif (/^Subject:\s*(.+)\s*$/i) {
+			$parsed_line->{'subject'} = $1;
+		} elsif (/^Date: (.*)/i) {
+			$parsed_line->{'date'} = $1;
+		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
+			$parsed_line->{'in_reply_to'} = $1;
+		} elsif (/^Message-ID: (.*)$/i) {
+			$parsed_line->{'message_id'} = $1;
+		} elsif (/^MIME-Version:$/i) {
+			$parsed_line->{'mime-version'} = $1;
+		} elsif (/^Content-Type:\s+(.*)\s*$/i) {
+			$parsed_line->{'content-type'} = $1;
+		} elsif (/^References:\s+(.*)/i) {
+			$parsed_line->{'references'} = $1;
+		}
+	}
+}
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-02 17:02 [PATCH] send-email: extract email-parsing code into a subroutine Payre Nathan
@ 2017-12-02 17:11 ` Nathan PAYRE
       [not found] ` <445d0838cf2a4107bad95d5cc2d38a05@BPMBX2013-01.univ-lyon1.fr>
  2017-12-03 22:00 ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 25+ messages in thread
From: Nathan PAYRE @ 2017-12-02 17:11 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Ryan Anderson, e, Junio C Hamano, Nathan Payre, Nathan Payre,
	Matthieu Moy, Timothee Albertin, Daniel Bensoussan

I found a mistake in my signed-off-by, please replace
<nathan.payre@etu.univ-lyon1.fr.> by <second.payre@gmail.com>

Excuse me.

2017-12-02 18:02 GMT+01:00 Payre Nathan <second.payre@gmail.com>:
> From: Nathan Payre <second.payre@gmail.com>
>
> The existing code mixes parsing of email header with regular
> expression and actual code. Extract the parsing code into a new
> subroutine 'parse_header_line()'. This improves the code readability
> and make parse_header_line reusable in other place.
>
> Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
> Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
> Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
> Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
> ---
>
> This patch is a first step to implement a new feature.
> See new feature discussion here: https://public-inbox.org/git/20171030223444.5052-1-nathan.payre@etu.univ-lyon1.fr/
>
>  git-send-email.perl | 106 +++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 80 insertions(+), 26 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 2208dcc21..98c2e461c 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -715,41 +715,64 @@ EOT3
>         if (!defined $compose_encoding) {
>                 $compose_encoding = "UTF-8";
>         }
> -       while(<$c>) {
> +
> +       my %parsed_email;
> +       while (<$c>) {
>                 next if m/^GIT:/;
> -               if ($in_body) {
> -                       $summary_empty = 0 unless (/^\n$/);
> -               } elsif (/^\n$/) {
> -                       $in_body = 1;
> -                       if ($need_8bit_cte) {
> +               parse_header_line($_, \%parsed_email);
> +               if (/^\n$/i) {
> +                       while (my $row = <$c>) {
> +                               if (!($row =~ m/^GIT:/)) {
> +                                       $parsed_email{'body'} = $parsed_email{'body'} . $row;
> +                               }
> +                       }
> +               }
> +       }
> +       if ($parsed_email{'from'}) {
> +               $sender = $parsed_email{'from'};
> +       }
> +       if ($parsed_email{'in_reply_to'}) {
> +               $initial_reply_to = $parsed_email{'in_reply_to'};
> +       }
> +       if ($parsed_email{'subject'}) {
> +               $initial_subject = $parsed_email{'subject'};
> +               print $c2 "Subject: " .
> +                       quote_subject($parsed_email{'subject'}, $compose_encoding) .
> +                       "\n";
> +       }
> +       if ($parsed_email{'mime-version'}) {
> +               $need_8bit_cte = 0;
> +       }
> +       if ($need_8bit_cte) {
> +               if ($parsed_email{'content-type'}) {
> +                               print $c2 "MIME-Version: 1.0\n",
> +                                        "Content-Type: $parsed_email{'content-type'};",
> +                                        "Content-Transfer-Encoding: 8bit\n";
> +                       } else {
>                                 print $c2 "MIME-Version: 1.0\n",
>                                          "Content-Type: text/plain; ",
> -                                          "charset=$compose_encoding\n",
> +                                        "charset=$compose_encoding\n",
>                                          "Content-Transfer-Encoding: 8bit\n";
>                         }
> -               } elsif (/^MIME-Version:/i) {
> -                       $need_8bit_cte = 0;
> -               } elsif (/^Subject:\s*(.+)\s*$/i) {
> -                       $initial_subject = $1;
> -                       my $subject = $initial_subject;
> -                       $_ = "Subject: " .
> -                               quote_subject($subject, $compose_encoding) .
> -                               "\n";
> -               } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
> -                       $initial_reply_to = $1;
> -                       next;
> -               } elsif (/^From:\s*(.+)\s*$/i) {
> -                       $sender = $1;
> -                       next;
> -               } elsif (/^(?:To|Cc|Bcc):/i) {
> -                       print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
> -                       next;
> -               }
> -               print $c2 $_;
>         }
> +       if ($parsed_email{'body'}) {
> +               $summary_empty = 0;
> +               print $c2 "\n$parsed_email{'body'}\n";
> +       }
> +
>         close $c;
>         close $c2;
>
> +       open $c2, "<", $compose_filename . ".final"
> +               or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
> +
> +       print "affichage : \n";
> +       while (<$c2>) {
> +               print $_;
> +       }
> +
> +       close $c2;
> +
>         if ($summary_empty) {
>                 print __("Summary email is empty, skipping it\n");
>                 $compose = -1;
> @@ -792,6 +815,37 @@ sub ask {
>         return;
>  }
>
> +sub parse_header_line {
> +       my $lines = shift;
> +       my $parsed_line = shift;
> +
> +       foreach (split(/\n/, $lines)) {
> +               if (/^From:\s*(.+)$/i) {
> +                       $parsed_line->{'from'} = $1;
> +               } elsif (/^To:\s*(.+)$/i) {
> +                       $parsed_line->{'to'} = [ parse_address_line($1) ];
> +               } elsif (/^Cc:\s*(.+)$/i) {
> +                       $parsed_line->{'cc'} = [ parse_address_line($1) ];
> +               } elsif (/^Bcc:\s*(.+)$/i) {
> +                       $parsed_line->{'bcc'} = [ parse_address_line($1) ];
> +               } elsif (/^Subject:\s*(.+)\s*$/i) {
> +                       $parsed_line->{'subject'} = $1;
> +               } elsif (/^Date: (.*)/i) {
> +                       $parsed_line->{'date'} = $1;
> +               } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
> +                       $parsed_line->{'in_reply_to'} = $1;
> +               } elsif (/^Message-ID: (.*)$/i) {
> +                       $parsed_line->{'message_id'} = $1;
> +               } elsif (/^MIME-Version:$/i) {
> +                       $parsed_line->{'mime-version'} = $1;
> +               } elsif (/^Content-Type:\s+(.*)\s*$/i) {
> +                       $parsed_line->{'content-type'} = $1;
> +               } elsif (/^References:\s+(.*)/i) {
> +                       $parsed_line->{'references'} = $1;
> +               }
> +       }
> +}
> +
>  my %broken_encoding;
>
>  sub file_declares_8bit_cte {
> --
> 2.15.1
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
       [not found] ` <445d0838cf2a4107bad95d5cc2d38a05@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-03 21:20   ` Matthieu Moy
  0 siblings, 0 replies; 25+ messages in thread
From: Matthieu Moy @ 2017-12-03 21:20 UTC (permalink / raw)
  To: Nathan PAYRE
  Cc: Git Mailing List, Ryan Anderson, e@80x24.org, Junio C Hamano,
	PAYRE NATHAN p1508475, ALBERTIN TIMOTHEE p1514771,
	BENSOUSSAN--BOHM DANIEL p1507430

Nathan PAYRE <second.payre@gmail.com> writes:

> I found a mistake in my signed-off-by, please replace
> <nathan.payre@etu.univ-lyon1.fr.> by <second.payre@gmail.com>

I think you want exactly the opposite of this. You're contributing as a
Lyon 1 student, hence your identity is @etu.univ-lyon1.fr. Your Gmail
adress is used only for technical reasons.

OTOH, you are missing the first line From: ... @..univ-lyon1.fr in your
message.

See how you did it:

https://public-inbox.org/git/20171012091727.30759-1-second.payre@gmail.com/

(The sign-off was wrong in this one, but the From was OK)

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-02 17:02 [PATCH] send-email: extract email-parsing code into a subroutine Payre Nathan
  2017-12-02 17:11 ` Nathan PAYRE
       [not found] ` <445d0838cf2a4107bad95d5cc2d38a05@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-03 22:00 ` Ævar Arnfjörð Bjarmason
  2017-12-03 23:41   ` Nathan PAYRE
  2 siblings, 1 reply; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-12-03 22:00 UTC (permalink / raw)
  To: Payre Nathan
  Cc: git, ryan, e, gitster, Nathan Payre, Matthieu Moy,
	Timothee Albertin, Daniel Bensoussan


On Sat, Dec 02 2017, Payre Nathan jotted:

> From: Nathan Payre <second.payre@gmail.com>
>
> The existing code mixes parsing of email header with regular
> expression and actual code. Extract the parsing code into a new
> subroutine 'parse_header_line()'. This improves the code readability
> and make parse_header_line reusable in other place.
>
> Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
> Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
> Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
> Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
> ---
>
> This patch is a first step to implement a new feature.
> See new feature discussion here: https://public-inbox.org/git/20171030223444.5052-1-nathan.payre@etu.univ-lyon1.fr/
>
>  git-send-email.perl | 106 +++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 80 insertions(+), 26 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 2208dcc21..98c2e461c 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -715,41 +715,64 @@ EOT3
>  	if (!defined $compose_encoding) {
>  		$compose_encoding = "UTF-8";
>  	}
> -	while(<$c>) {
> +
> +	my %parsed_email;
> +	while (<$c>) {
>  		next if m/^GIT:/;
> -		if ($in_body) {
> -			$summary_empty = 0 unless (/^\n$/);
> -		} elsif (/^\n$/) {
> -			$in_body = 1;
> -			if ($need_8bit_cte) {
> +		parse_header_line($_, \%parsed_email);
> +		if (/^\n$/i) {
> +			while (my $row = <$c>) {
> +				if (!($row =~ m/^GIT:/)) {
> +					$parsed_email{'body'} = $parsed_email{'body'} . $row;
> +				}
> +			}
> +		}
> +	}
> +	if ($parsed_email{'from'}) {
> +		$sender = $parsed_email{'from'};
> +	}
> +	if ($parsed_email{'in_reply_to'}) {
> +		$initial_reply_to = $parsed_email{'in_reply_to'};
> +	}
> +	if ($parsed_email{'subject'}) {
> +		$initial_subject = $parsed_email{'subject'};
> +		print $c2 "Subject: " .
> +			quote_subject($parsed_email{'subject'}, $compose_encoding) .
> +			"\n";
> +	}
> +	if ($parsed_email{'mime-version'}) {
> +		$need_8bit_cte = 0;
> +	}
> +	if ($need_8bit_cte) {
> +		if ($parsed_email{'content-type'}) {
> +				print $c2 "MIME-Version: 1.0\n",
> +					 "Content-Type: $parsed_email{'content-type'};",
> +					 "Content-Transfer-Encoding: 8bit\n";
> +			} else {
>  				print $c2 "MIME-Version: 1.0\n",
>  					 "Content-Type: text/plain; ",
> -					   "charset=$compose_encoding\n",
> +					 "charset=$compose_encoding\n",
>  					 "Content-Transfer-Encoding: 8bit\n";
>  			}
> -		} elsif (/^MIME-Version:/i) {
> -			$need_8bit_cte = 0;
> -		} elsif (/^Subject:\s*(.+)\s*$/i) {
> -			$initial_subject = $1;
> -			my $subject = $initial_subject;
> -			$_ = "Subject: " .
> -				quote_subject($subject, $compose_encoding) .
> -				"\n";
> -		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
> -			$initial_reply_to = $1;
> -			next;
> -		} elsif (/^From:\s*(.+)\s*$/i) {
> -			$sender = $1;
> -			next;
> -		} elsif (/^(?:To|Cc|Bcc):/i) {
> -			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
> -			next;
> -		}
> -		print $c2 $_;
>  	}
> +	if ($parsed_email{'body'}) {
> +		$summary_empty = 0;
> +		print $c2 "\n$parsed_email{'body'}\n";
> +	}
> +
>  	close $c;
>  	close $c2;
>
> +	open $c2, "<", $compose_filename . ".final"
> +		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
> +
> +	print "affichage : \n";
> +	while (<$c2>) {
> +		print $_;
> +	}
> +
> +	close $c2;
> +
>  	if ($summary_empty) {
>  		print __("Summary email is empty, skipping it\n");
>  		$compose = -1;
> @@ -792,6 +815,37 @@ sub ask {
>  	return;
>  }
>
> +sub parse_header_line {
> +	my $lines = shift;
> +	my $parsed_line = shift;
> +
> +	foreach (split(/\n/, $lines)) {
> +		if (/^From:\s*(.+)$/i) {
> +			$parsed_line->{'from'} = $1;
> +		} elsif (/^To:\s*(.+)$/i) {
> +			$parsed_line->{'to'} = [ parse_address_line($1) ];
> +		} elsif (/^Cc:\s*(.+)$/i) {
> +			$parsed_line->{'cc'} = [ parse_address_line($1) ];
> +		} elsif (/^Bcc:\s*(.+)$/i) {
> +			$parsed_line->{'bcc'} = [ parse_address_line($1) ];
> +		} elsif (/^Subject:\s*(.+)\s*$/i) {
> +			$parsed_line->{'subject'} = $1;
> +		} elsif (/^Date: (.*)/i) {
> +			$parsed_line->{'date'} = $1;
> +		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
> +			$parsed_line->{'in_reply_to'} = $1;
> +		} elsif (/^Message-ID: (.*)$/i) {
> +			$parsed_line->{'message_id'} = $1;
> +		} elsif (/^MIME-Version:$/i) {
> +			$parsed_line->{'mime-version'} = $1;
> +		} elsif (/^Content-Type:\s+(.*)\s*$/i) {
> +			$parsed_line->{'content-type'} = $1;
> +		} elsif (/^References:\s+(.*)/i) {
> +			$parsed_line->{'references'} = $1;
> +		}
> +	}
> +}
> +
>  my %broken_encoding;
>
>  sub file_declares_8bit_cte {

I haven't read the patches that follow. Completely untested, But just a
diff on top I came up with while reading this:

Rationale:

 * Once you start passing $_ to functions you should probably just give
   it a name.

 * !($x =~ m//) you can just write as $x !~ m//

 * There's a lot of copy/paste programming in parse_header_line() and an
   inconsistency between you seeing A-Header and turning it into either
   a_header or a-header. If you just stick with a-header and use dash
   you end up with just two cases.

   The resulting line is quite long, so it's worth doing:

   my $header_parsed   = join "|", qw(To CC ...);
   my $header_unparsed = join "|", qw(From Subject Message-ID ...);
   [...]
   if ($str =~ /^($header_unparsed)
   

diff --git a/git-send-email.perl b/git-send-email.perl
index 98c2e461cf..3696cad456 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -717,12 +717,12 @@ EOT3
 	}

 	my %parsed_email;
-	while (<$c>) {
-		next if m/^GIT:/;
-		parse_header_line($_, \%parsed_email);
-		if (/^\n$/i) {
+	while (my $line = <$c>) {
+		next if $line =~ m/^GIT:/;
+		parse_header_line($line, \%parsed_email);
+		if ($line =~ /^\n$/i) {
 			while (my $row = <$c>) {
-				if (!($row =~ m/^GIT:/)) {
+				if ($row !~ m/^GIT:/) {
 					$parsed_email{'body'} = $parsed_email{'body'} . $row;
 				}
 			}
@@ -731,7 +731,7 @@ EOT3
 	if ($parsed_email{'from'}) {
 		$sender = $parsed_email{'from'};
 	}
-	if ($parsed_email{'in_reply_to'}) {
+	if ($parsed_email{'in-reply-to'}) {
 		$initial_reply_to = $parsed_email{'in_reply_to'};
 	}
 	if ($parsed_email{'subject'}) {
@@ -820,28 +820,10 @@ sub parse_header_line {
 	my $parsed_line = shift;

 	foreach (split(/\n/, $lines)) {
-		if (/^From:\s*(.+)$/i) {
-			$parsed_line->{'from'} = $1;
-		} elsif (/^To:\s*(.+)$/i) {
-			$parsed_line->{'to'} = [ parse_address_line($1) ];
-		} elsif (/^Cc:\s*(.+)$/i) {
-			$parsed_line->{'cc'} = [ parse_address_line($1) ];
-		} elsif (/^Bcc:\s*(.+)$/i) {
-			$parsed_line->{'bcc'} = [ parse_address_line($1) ];
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$parsed_line->{'subject'} = $1;
-		} elsif (/^Date: (.*)/i) {
-			$parsed_line->{'date'} = $1;
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$parsed_line->{'in_reply_to'} = $1;
-		} elsif (/^Message-ID: (.*)$/i) {
-			$parsed_line->{'message_id'} = $1;
-		} elsif (/^MIME-Version:$/i) {
-			$parsed_line->{'mime-version'} = $1;
-		} elsif (/^Content-Type:\s+(.*)\s*$/i) {
-			$parsed_line->{'content-type'} = $1;
-		} elsif (/^References:\s+(.*)/i) {
-			$parsed_line->{'references'} = $1;
+		if (/^(To|Cc|Bcc):\s*(.+)$/i) {
+			$parsed_line->{lc $1} = [ parse_address_line($2) ];
+		} elsif (/^(From|Subject|Date|In-Reply-To|Message-ID|MIME-Version|Content-Type|References):\s*(.+)\s*$/i) {
+			$parsed_line->{lc $1} = $2;
 		}
 	}
 }

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-03 22:00 ` Ævar Arnfjörð Bjarmason
@ 2017-12-03 23:41   ` Nathan PAYRE
  2017-12-04 13:45     ` Junio C Hamano
  2017-12-06 15:32     ` [PATCH v2] " Nathan Payre
  0 siblings, 2 replies; 25+ messages in thread
From: Nathan PAYRE @ 2017-12-03 23:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Ryan Anderson, e, Junio C Hamano, Nathan Payre,
	Matthieu Moy, Timothee Albertin, Daniel Bensoussan

I've tested your code, and after few changes it's works perfectly!
The code looks better now.
Thanks a lot for your review.

2017-12-03 23:00 GMT+01:00 Ævar Arnfjörð Bjarmason <avarab@gmail.com>:
>
> On Sat, Dec 02 2017, Payre Nathan jotted:
>
>> From: Nathan Payre <second.payre@gmail.com>
>>
>> The existing code mixes parsing of email header with regular
>> expression and actual code. Extract the parsing code into a new
>> subroutine 'parse_header_line()'. This improves the code readability
>> and make parse_header_line reusable in other place.
>>
>> Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
>> Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
>> Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
>> Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
>> ---
>>
>> This patch is a first step to implement a new feature.
>> See new feature discussion here: https://public-inbox.org/git/20171030223444.5052-1-nathan.payre@etu.univ-lyon1.fr/
>>
>>  git-send-email.perl | 106 +++++++++++++++++++++++++++++++++++++++-------------
>>  1 file changed, 80 insertions(+), 26 deletions(-)
>>
>> diff --git a/git-send-email.perl b/git-send-email.perl
>> index 2208dcc21..98c2e461c 100755
>> --- a/git-send-email.perl
>> +++ b/git-send-email.perl
>> @@ -715,41 +715,64 @@ EOT3
>>       if (!defined $compose_encoding) {
>>               $compose_encoding = "UTF-8";
>>       }
>> -     while(<$c>) {
>> +
>> +     my %parsed_email;
>> +     while (<$c>) {
>>               next if m/^GIT:/;
>> -             if ($in_body) {
>> -                     $summary_empty = 0 unless (/^\n$/);
>> -             } elsif (/^\n$/) {
>> -                     $in_body = 1;
>> -                     if ($need_8bit_cte) {
>> +             parse_header_line($_, \%parsed_email);
>> +             if (/^\n$/i) {
>> +                     while (my $row = <$c>) {
>> +                             if (!($row =~ m/^GIT:/)) {
>> +                                     $parsed_email{'body'} = $parsed_email{'body'} . $row;
>> +                             }
>> +                     }
>> +             }
>> +     }
>> +     if ($parsed_email{'from'}) {
>> +             $sender = $parsed_email{'from'};
>> +     }
>> +     if ($parsed_email{'in_reply_to'}) {
>> +             $initial_reply_to = $parsed_email{'in_reply_to'};
>> +     }
>> +     if ($parsed_email{'subject'}) {
>> +             $initial_subject = $parsed_email{'subject'};
>> +             print $c2 "Subject: " .
>> +                     quote_subject($parsed_email{'subject'}, $compose_encoding) .
>> +                     "\n";
>> +     }
>> +     if ($parsed_email{'mime-version'}) {
>> +             $need_8bit_cte = 0;
>> +     }
>> +     if ($need_8bit_cte) {
>> +             if ($parsed_email{'content-type'}) {
>> +                             print $c2 "MIME-Version: 1.0\n",
>> +                                      "Content-Type: $parsed_email{'content-type'};",
>> +                                      "Content-Transfer-Encoding: 8bit\n";
>> +                     } else {
>>                               print $c2 "MIME-Version: 1.0\n",
>>                                        "Content-Type: text/plain; ",
>> -                                        "charset=$compose_encoding\n",
>> +                                      "charset=$compose_encoding\n",
>>                                        "Content-Transfer-Encoding: 8bit\n";
>>                       }
>> -             } elsif (/^MIME-Version:/i) {
>> -                     $need_8bit_cte = 0;
>> -             } elsif (/^Subject:\s*(.+)\s*$/i) {
>> -                     $initial_subject = $1;
>> -                     my $subject = $initial_subject;
>> -                     $_ = "Subject: " .
>> -                             quote_subject($subject, $compose_encoding) .
>> -                             "\n";
>> -             } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
>> -                     $initial_reply_to = $1;
>> -                     next;
>> -             } elsif (/^From:\s*(.+)\s*$/i) {
>> -                     $sender = $1;
>> -                     next;
>> -             } elsif (/^(?:To|Cc|Bcc):/i) {
>> -                     print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
>> -                     next;
>> -             }
>> -             print $c2 $_;
>>       }
>> +     if ($parsed_email{'body'}) {
>> +             $summary_empty = 0;
>> +             print $c2 "\n$parsed_email{'body'}\n";
>> +     }
>> +
>>       close $c;
>>       close $c2;
>>
>> +     open $c2, "<", $compose_filename . ".final"
>> +             or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
>> +
>> +     print "affichage : \n";
>> +     while (<$c2>) {
>> +             print $_;
>> +     }
>> +
>> +     close $c2;
>> +
>>       if ($summary_empty) {
>>               print __("Summary email is empty, skipping it\n");
>>               $compose = -1;
>> @@ -792,6 +815,37 @@ sub ask {
>>       return;
>>  }
>>
>> +sub parse_header_line {
>> +     my $lines = shift;
>> +     my $parsed_line = shift;
>> +
>> +     foreach (split(/\n/, $lines)) {
>> +             if (/^From:\s*(.+)$/i) {
>> +                     $parsed_line->{'from'} = $1;
>> +             } elsif (/^To:\s*(.+)$/i) {
>> +                     $parsed_line->{'to'} = [ parse_address_line($1) ];
>> +             } elsif (/^Cc:\s*(.+)$/i) {
>> +                     $parsed_line->{'cc'} = [ parse_address_line($1) ];
>> +             } elsif (/^Bcc:\s*(.+)$/i) {
>> +                     $parsed_line->{'bcc'} = [ parse_address_line($1) ];
>> +             } elsif (/^Subject:\s*(.+)\s*$/i) {
>> +                     $parsed_line->{'subject'} = $1;
>> +             } elsif (/^Date: (.*)/i) {
>> +                     $parsed_line->{'date'} = $1;
>> +             } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
>> +                     $parsed_line->{'in_reply_to'} = $1;
>> +             } elsif (/^Message-ID: (.*)$/i) {
>> +                     $parsed_line->{'message_id'} = $1;
>> +             } elsif (/^MIME-Version:$/i) {
>> +                     $parsed_line->{'mime-version'} = $1;
>> +             } elsif (/^Content-Type:\s+(.*)\s*$/i) {
>> +                     $parsed_line->{'content-type'} = $1;
>> +             } elsif (/^References:\s+(.*)/i) {
>> +                     $parsed_line->{'references'} = $1;
>> +             }
>> +     }
>> +}
>> +
>>  my %broken_encoding;
>>
>>  sub file_declares_8bit_cte {
>
> I haven't read the patches that follow. Completely untested, But just a
> diff on top I came up with while reading this:
>
> Rationale:
>
>  * Once you start passing $_ to functions you should probably just give
>    it a name.
>
>  * !($x =~ m//) you can just write as $x !~ m//
>
>  * There's a lot of copy/paste programming in parse_header_line() and an
>    inconsistency between you seeing A-Header and turning it into either
>    a_header or a-header. If you just stick with a-header and use dash
>    you end up with just two cases.
>
>    The resulting line is quite long, so it's worth doing:
>
>    my $header_parsed   = join "|", qw(To CC ...);
>    my $header_unparsed = join "|", qw(From Subject Message-ID ...);
>    [...]
>    if ($str =~ /^($header_unparsed)
>
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 98c2e461cf..3696cad456 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -717,12 +717,12 @@ EOT3
>         }
>
>         my %parsed_email;
> -       while (<$c>) {
> -               next if m/^GIT:/;
> -               parse_header_line($_, \%parsed_email);
> -               if (/^\n$/i) {
> +       while (my $line = <$c>) {
> +               next if $line =~ m/^GIT:/;
> +               parse_header_line($line, \%parsed_email);
> +               if ($line =~ /^\n$/i) {
>                         while (my $row = <$c>) {
> -                               if (!($row =~ m/^GIT:/)) {
> +                               if ($row !~ m/^GIT:/) {
>                                         $parsed_email{'body'} = $parsed_email{'body'} . $row;
>                                 }
>                         }
> @@ -731,7 +731,7 @@ EOT3
>         if ($parsed_email{'from'}) {
>                 $sender = $parsed_email{'from'};
>         }
> -       if ($parsed_email{'in_reply_to'}) {
> +       if ($parsed_email{'in-reply-to'}) {
>                 $initial_reply_to = $parsed_email{'in_reply_to'};
>         }
>         if ($parsed_email{'subject'}) {
> @@ -820,28 +820,10 @@ sub parse_header_line {
>         my $parsed_line = shift;
>
>         foreach (split(/\n/, $lines)) {
> -               if (/^From:\s*(.+)$/i) {
> -                       $parsed_line->{'from'} = $1;
> -               } elsif (/^To:\s*(.+)$/i) {
> -                       $parsed_line->{'to'} = [ parse_address_line($1) ];
> -               } elsif (/^Cc:\s*(.+)$/i) {
> -                       $parsed_line->{'cc'} = [ parse_address_line($1) ];
> -               } elsif (/^Bcc:\s*(.+)$/i) {
> -                       $parsed_line->{'bcc'} = [ parse_address_line($1) ];
> -               } elsif (/^Subject:\s*(.+)\s*$/i) {
> -                       $parsed_line->{'subject'} = $1;
> -               } elsif (/^Date: (.*)/i) {
> -                       $parsed_line->{'date'} = $1;
> -               } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
> -                       $parsed_line->{'in_reply_to'} = $1;
> -               } elsif (/^Message-ID: (.*)$/i) {
> -                       $parsed_line->{'message_id'} = $1;
> -               } elsif (/^MIME-Version:$/i) {
> -                       $parsed_line->{'mime-version'} = $1;
> -               } elsif (/^Content-Type:\s+(.*)\s*$/i) {
> -                       $parsed_line->{'content-type'} = $1;
> -               } elsif (/^References:\s+(.*)/i) {
> -                       $parsed_line->{'references'} = $1;
> +               if (/^(To|Cc|Bcc):\s*(.+)$/i) {
> +                       $parsed_line->{lc $1} = [ parse_address_line($2) ];
> +               } elsif (/^(From|Subject|Date|In-Reply-To|Message-ID|MIME-Version|Content-Type|References):\s*(.+)\s*$/i) {
> +                       $parsed_line->{lc $1} = $2;
>                 }
>         }
>  }

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-03 23:41   ` Nathan PAYRE
@ 2017-12-04 13:45     ` Junio C Hamano
  2017-12-06 15:38       ` [PATCH v2] " Nathan Payre
  2017-12-06 15:32     ` [PATCH v2] " Nathan Payre
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-12-04 13:45 UTC (permalink / raw)
  To: Nathan PAYRE
  Cc: Ævar Arnfjörð Bjarmason, Git Mailing List,
	Ryan Anderson, e, Nathan Payre, Matthieu Moy, Timothee Albertin,
	Daniel Bensoussan

Nathan PAYRE <second.payre@gmail.com> writes:

> I've tested your code, and after few changes it's works perfectly!
> The code looks better now.
> Thanks a lot for your review.

Thanks, both of you.  

Could you send in the final version later so that I can pick it up?
I agree with Matthieu's suggestion on what address to use on your
From: (authorship identity) and S-o-b:; as you are showing your work
done as a uni student, the authorship and sign-off should be done as
such.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2] send-email: extract email-parsing code into a subroutine
  2017-12-03 23:41   ` Nathan PAYRE
  2017-12-04 13:45     ` Junio C Hamano
@ 2017-12-06 15:32     ` Nathan Payre
  1 sibling, 0 replies; 25+ messages in thread
From: Nathan Payre @ 2017-12-06 15:32 UTC (permalink / raw)
  To: git; +Cc: Nathan Payre, Matthieu Moy, Timothee Albertin, Daniel Bensoussan

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine 'parse_header_line()'. This improves the code readability
and make parse_header_line reusable in other place.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 git-send-email.perl | 100 ++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 73 insertions(+), 27 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..db16e4dec 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -715,41 +715,73 @@ EOT3
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
-		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
+
+    my %parsed_email;
+	$parsed_email{'body'} = '';
+    while (my $line = <$c>) {
+	    next if $line =~ m/^GIT:/;
+	    parse_header_line($line, \%parsed_email);
+	    if ($line =~ /^\n$/i) {
+	        while (my $body_line = <$c>) {
+                if ($body_line !~ m/^GIT:/) {
+                    $parsed_email{'body'} = $parsed_email{'body'} . $body_line;
+                }
+	        }
+		}
+		print "la : $line\n";
+	}
+
+	if ($parsed_email{'from'}) {
+		$sender = $parsed_email{'from'};
+	}
+	if ($parsed_email{'in-reply-to'}) {
+		$initial_reply_to = $parsed_email{'in-reply-to'};
+	}
+	if ($parsed_email{'subject'}) {
+		$initial_subject = $parsed_email{'subject'};
+		print $c2 "Subject: " .
+			quote_subject($parsed_email{'subject'}, $compose_encoding) .
+			"\n";
+	}
+	if ($parsed_email{'mime-version'}) {
+		print "CASE 0\n";
+		$need_8bit_cte = 0;
+		print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
+					"Content-Type: $parsed_email{'content-type'};\n",
+					"Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
+	}
+	if ($need_8bit_cte) {
+		if ($parsed_email{'content-type'}) {
+				print "CASE 1\n";
+				print $c2 "MIME-Version: 1.0\n",
+					 "Content-Type: $parsed_email{'content-type'};",
+					 "Content-Transfer-Encoding: 8bit\n";
+			} else {
+				print "CASE 2\n";
 				print $c2 "MIME-Version: 1.0\n",
 					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
+					 "charset=$compose_encoding\n",
 					 "Content-Transfer-Encoding: 8bit\n";
 			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
-		}
-		print $c2 $_;
 	}
+	if ($parsed_email{'body'}) {
+		$summary_empty = 0;
+		print $c2 "\n$parsed_email{'body'}\n";
+	}
+
 	close $c;
 	close $c2;
 
+	open $c2, "<", $compose_filename . ".final"
+		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
+
+	print "affichage : \n";
+	while (<$c2>) {
+		print $_;
+	}
+
+	close $c2;
+
 	if ($summary_empty) {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
@@ -792,6 +824,20 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+
+	foreach (split(/\n/, $lines)) {
+		if (/^(To|Cc|Bcc):\s*(.+)$/i) {
+		        $parsed_line->{lc $1} = [ parse_address_line($2) ];
+		} elsif (/^(From|Subject|Date|In-Reply-To|Message-ID|MIME-Version|Content-Type|Content-Transfer-Encoding|References):\s*(.+)\s*$/i) {
+		        $parsed_line->{lc $1} = $2;
+		}
+	}
+}
+
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2] send-email: extract email-parsing code into a subroutine
  2017-12-04 13:45     ` Junio C Hamano
@ 2017-12-06 15:38       ` Nathan Payre
  2017-12-06 21:39         ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Nathan Payre @ 2017-12-06 15:38 UTC (permalink / raw)
  To: git; +Cc: Nathan Payre, Matthieu Moy, Timothee Albertin, Daniel Bensoussan

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine 'parse_header_line()'. This improves the code readability
and make parse_header_line reusable in other place.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 git-send-email.perl | 100 ++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 73 insertions(+), 27 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..db16e4dec 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -715,41 +715,73 @@ EOT3
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
-		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
+
+    my %parsed_email;
+	$parsed_email{'body'} = '';
+    while (my $line = <$c>) {
+	    next if $line =~ m/^GIT:/;
+	    parse_header_line($line, \%parsed_email);
+	    if ($line =~ /^\n$/i) {
+	        while (my $body_line = <$c>) {
+                if ($body_line !~ m/^GIT:/) {
+                    $parsed_email{'body'} = $parsed_email{'body'} . $body_line;
+                }
+	        }
+		}
+		print "la : $line\n";
+	}
+
+	if ($parsed_email{'from'}) {
+		$sender = $parsed_email{'from'};
+	}
+	if ($parsed_email{'in-reply-to'}) {
+		$initial_reply_to = $parsed_email{'in-reply-to'};
+	}
+	if ($parsed_email{'subject'}) {
+		$initial_subject = $parsed_email{'subject'};
+		print $c2 "Subject: " .
+			quote_subject($parsed_email{'subject'}, $compose_encoding) .
+			"\n";
+	}
+	if ($parsed_email{'mime-version'}) {
+		print "CASE 0\n";
+		$need_8bit_cte = 0;
+		print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
+					"Content-Type: $parsed_email{'content-type'};\n",
+					"Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
+	}
+	if ($need_8bit_cte) {
+		if ($parsed_email{'content-type'}) {
+				print "CASE 1\n";
+				print $c2 "MIME-Version: 1.0\n",
+					 "Content-Type: $parsed_email{'content-type'};",
+					 "Content-Transfer-Encoding: 8bit\n";
+			} else {
+				print "CASE 2\n";
 				print $c2 "MIME-Version: 1.0\n",
 					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
+					 "charset=$compose_encoding\n",
 					 "Content-Transfer-Encoding: 8bit\n";
 			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
-		}
-		print $c2 $_;
 	}
+	if ($parsed_email{'body'}) {
+		$summary_empty = 0;
+		print $c2 "\n$parsed_email{'body'}\n";
+	}
+
 	close $c;
 	close $c2;
 
+	open $c2, "<", $compose_filename . ".final"
+		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
+
+	print "affichage : \n";
+	while (<$c2>) {
+		print $_;
+	}
+
+	close $c2;
+
 	if ($summary_empty) {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
@@ -792,6 +824,20 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+
+	foreach (split(/\n/, $lines)) {
+		if (/^(To|Cc|Bcc):\s*(.+)$/i) {
+		        $parsed_line->{lc $1} = [ parse_address_line($2) ];
+		} elsif (/^(From|Subject|Date|In-Reply-To|Message-ID|MIME-Version|Content-Type|Content-Transfer-Encoding|References):\s*(.+)\s*$/i) {
+		        $parsed_line->{lc $1} = $2;
+		}
+	}
+}
+
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v2] send-email: extract email-parsing code into a subroutine
  2017-12-06 15:38       ` [PATCH v2] " Nathan Payre
@ 2017-12-06 21:39         ` Junio C Hamano
  2017-12-06 22:55           ` Nathan PAYRE
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2017-12-06 21:39 UTC (permalink / raw)
  To: Nathan Payre; +Cc: git, Matthieu Moy, Timothee Albertin, Daniel Bensoussan


Nathan Payre <nathan.payre@etu.univ-lyon1.fr> writes:

> The existing code mixes parsing of email header with regular
> expression and actual code. Extract the parsing code into a new
> subroutine 'parse_header_line()'. This improves the code readability
> and make parse_header_line reusable in other place.
>
> Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
> Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
> Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
> Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
> Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

Thanks, but... 

> +    my %parsed_email;
> +	$parsed_email{'body'} = '';
> +    while (my $line = <$c>) {
> +	    next if $line =~ m/^GIT:/;
> +	    parse_header_line($line, \%parsed_email);
> +	    if ($line =~ /^\n$/i) {
> +	        while (my $body_line = <$c>) {
> +                if ($body_line !~ m/^GIT:/) {
> +                    $parsed_email{'body'} = $parsed_email{'body'} . $body_line;
> +                }
> +	        }
> +		}
> +		print "la : $line\n";
> +	}

... throughout this patch, not limited to this section, indentation
is strange and there seem to be many "print" that show messages that
do not seem to be meant for end-user consumption.  I can see that
this aspires to improve the readability, but not quite yet ;-).

Also "reusable in other place" is by itself not an unconditional
plus, until readers can be convinced that that 'other place' really
wants to be able to call this function.  Is there some untold
motivation behind this change---such as a planned update to actually
use this helper subroutine?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2] send-email: extract email-parsing code into a subroutine
  2017-12-06 21:39         ` Junio C Hamano
@ 2017-12-06 22:55           ` Nathan PAYRE
  2017-12-06 23:02             ` [PATCH v3] " Nathan Payre
                               ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Nathan PAYRE @ 2017-12-06 22:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nathan Payre, Git Mailing List, Matthieu Moy, Timothee Albertin,
	Daniel Bensoussan

Junio C Hamano <gitster@pobox.com>: writes:

> ... throughout this patch, not limited to this section, indentation
> is strange and there seem to be many "print" that show messages that
> do not seem to be meant for end-user consumption.  I can see that
> this aspires to improve the readability, but not quite yet ;-).

Hmmm I'm wondering who place thoses print in my code !
I will fix it fast. :-)

> Also "reusable in other place" is by itself not an unconditional
> plus, until readers can be convinced that that 'other place' really
> wants to be able to call this function.  Is there some untold
> motivation behind this change---such as a planned update to actually
> use this helper subroutine?

This subroutine will be used to implement, initially a new option called
"--quote-email", but became "--cite" added after "--in-reply-to".
This will permit to the user to cite a mail and reply with a patch and keep
Cc, To ...
See discussion :
https://public-inbox.org/git/20171030223444.5052-1-nathan.payre@etu.univ-lyon1.fr/

And Daniel Timothee and I wanted to refactor an other part of the file
using parse_header_line(). Near Line 1570.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3] send-email: extract email-parsing code into a subroutine
  2017-12-06 22:55           ` Nathan PAYRE
@ 2017-12-06 23:02             ` Nathan Payre
  2017-12-06 23:12               ` Ævar Arnfjörð Bjarmason
                                 ` (2 more replies)
  2017-12-06 23:06             ` [PATCH v2] " Junio C Hamano
                               ` (4 subsequent siblings)
  5 siblings, 3 replies; 25+ messages in thread
From: Nathan Payre @ 2017-12-06 23:02 UTC (permalink / raw)
  To: git; +Cc: Nathan Payre, Matthieu Moy, Timothee Albertin, Daniel Bensoussan

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine 'parse_header_line()'. This improves the code readability
and make parse_header_line reusable in other place.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

Without the "print" used for testing. 

 git-send-email.perl | 90 +++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 63 insertions(+), 27 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..a10574a56 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -715,41 +715,63 @@ EOT3
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
-		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
+
+    my %parsed_email;
+	$parsed_email{'body'} = '';
+    while (my $line = <$c>) {
+	    next if $line =~ m/^GIT:/;
+	    parse_header_line($line, \%parsed_email);
+	    if ($line =~ /^\n$/i) {
+	        while (my $body_line = <$c>) {
+                if ($body_line !~ m/^GIT:/) {
+                    $parsed_email{'body'} = $parsed_email{'body'} . $body_line;
+                }
+	        }
+		}
+	}
+
+	if ($parsed_email{'from'}) {
+		$sender = $parsed_email{'from'};
+	}
+	if ($parsed_email{'in-reply-to'}) {
+		$initial_reply_to = $parsed_email{'in-reply-to'};
+	}
+	if ($parsed_email{'subject'}) {
+		$initial_subject = $parsed_email{'subject'};
+		print $c2 "Subject: " .
+			quote_subject($parsed_email{'subject'}, $compose_encoding) .
+			"\n";
+	}
+	if ($parsed_email{'mime-version'}) {
+		$need_8bit_cte = 0;
+		print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
+					"Content-Type: $parsed_email{'content-type'};\n",
+					"Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
+	}
+	if ($need_8bit_cte) {
+		if ($parsed_email{'content-type'}) {
+				print $c2 "MIME-Version: 1.0\n",
+					 "Content-Type: $parsed_email{'content-type'};",
+					 "Content-Transfer-Encoding: 8bit\n";
+			} else {
 				print $c2 "MIME-Version: 1.0\n",
 					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
+					 "charset=$compose_encoding\n",
 					 "Content-Transfer-Encoding: 8bit\n";
 			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
-		}
-		print $c2 $_;
 	}
+	if ($parsed_email{'body'}) {
+		$summary_empty = 0;
+		print $c2 "\n$parsed_email{'body'}\n";
+	}
+
 	close $c;
 	close $c2;
 
+	open $c2, "<", $compose_filename . ".final"
+		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
+	close $c2;
+
 	if ($summary_empty) {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
@@ -792,6 +814,20 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+
+	foreach (split(/\n/, $lines)) {
+		if (/^(To|Cc|Bcc):\s*(.+)$/i) {
+		        $parsed_line->{lc $1} = [ parse_address_line($2) ];
+		} elsif (/^(From|Subject|Date|In-Reply-To|Message-ID|MIME-Version|Content-Type|Content-Transfer-Encoding|References):\s*(.+)\s*$/i) {
+		        $parsed_line->{lc $1} = $2;
+		}
+	}
+}
+
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v2] send-email: extract email-parsing code into a subroutine
  2017-12-06 22:55           ` Nathan PAYRE
  2017-12-06 23:02             ` [PATCH v3] " Nathan Payre
@ 2017-12-06 23:06             ` Junio C Hamano
  2017-12-07  7:32             ` Eric Sunshine
                               ` (3 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2017-12-06 23:06 UTC (permalink / raw)
  To: Nathan PAYRE
  Cc: Nathan Payre, Git Mailing List, Matthieu Moy, Timothee Albertin,
	Daniel Bensoussan

Nathan PAYRE <second.payre@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com>: writes:
>
>> ... throughout this patch, not limited to this section, indentation
>> is strange and there seem to be many "print" that show messages that
>> do not seem to be meant for end-user consumption.  I can see that
>> this aspires to improve the readability, but not quite yet ;-).
>
> Hmmm I'm wondering who place thoses print in my code !
> I will fix it fast. :-)

Don't make waste by being hasty, though.  The print statements were
bad, but funny indentation was more distracting and will be worse
hindrance from the maintainabaility's point of view.

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] send-email: extract email-parsing code into a subroutine
  2017-12-06 23:02             ` [PATCH v3] " Nathan Payre
@ 2017-12-06 23:12               ` Ævar Arnfjörð Bjarmason
  2017-12-07 10:28               ` [PATCH v4] " Nathan Payre
       [not found]               ` <ff9066a7209b4e21867d933542f8eece@BPMBX2013-01.univ-lyon1.fr>
  2 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-12-06 23:12 UTC (permalink / raw)
  To: Nathan Payre
  Cc: Git Mailing List, Matthieu Moy, Timothee Albertin,
	Daniel Bensoussan

On Thu, Dec 7, 2017 at 12:02 AM, Nathan Payre
<nathan.payre@etu.univ-lyon1.fr> wrote:

> +sub parse_header_line {
> +       my $lines = shift;
> +       my $parsed_line = shift;
> +
> +       foreach (split(/\n/, $lines)) {
> +               if (/^(To|Cc|Bcc):\s*(.+)$/i) {
> +                       $parsed_line->{lc $1} = [ parse_address_line($2) ];
> +               } elsif (/^(From|Subject|Date|In-Reply-To|Message-ID|MIME-Version|Content-Type|Content-Transfer-Encoding|References):\s*(.+)\s*$/i) {
> +                       $parsed_line->{lc $1} = $2;
> +               }
> +       }
> +}

Nit: As noted in my earlier review this results in very long lines,
I'm just typing this pseudocode into an E-Mail client so not tested at
all, but this would be better:

   my $header_parsed   = join "|", qw(To Cc Bcc);
   my $header_unparsed = join "|", qw(From Subject Message-ID ...); #
line wrap this at some point
   foreach [...]
   if (/^($header_parsed)[...]
   } elsif (^/($header_unparsed)[...].

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2] send-email: extract email-parsing code into a subroutine
  2017-12-06 22:55           ` Nathan PAYRE
  2017-12-06 23:02             ` [PATCH v3] " Nathan Payre
  2017-12-06 23:06             ` [PATCH v2] " Junio C Hamano
@ 2017-12-07  7:32             ` Eric Sunshine
       [not found]             ` <2b59497271cd4fada4ff590a001446cf@BPMBX2013-01.univ-lyon1.fr>
                               ` (2 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Eric Sunshine @ 2017-12-07  7:32 UTC (permalink / raw)
  To: Nathan PAYRE
  Cc: Junio C Hamano, Nathan Payre, Git Mailing List, Matthieu Moy,
	Timothee Albertin, Daniel Bensoussan

On Wed, Dec 6, 2017 at 5:55 PM, Nathan PAYRE <second.payre@gmail.com> wrote:
> Junio C Hamano <gitster@pobox.com>: writes:
>> Also "reusable in other place" is by itself not an unconditional
>> plus, until readers can be convinced that that 'other place' really
>> wants to be able to call this function.  Is there some untold
>> motivation behind this change---such as a planned update to actually
>> use this helper subroutine?
>
> This subroutine will be used to implement, initially a new option called
> "--quote-email", but became "--cite" added after "--in-reply-to".
> This will permit to the user to cite a mail and reply with a patch and keep
> Cc, To ...
> See discussion :
> https://public-inbox.org/git/20171030223444.5052-1-nathan.payre@etu.univ-lyon1.fr/

Although he didn't state so explicitly, please take Junio's question
as a strong hint that you should update the commit message to include
the above rationale/explanation about why you consider this a good
change. The reason for including the motivation in the commit message
is that you want, not only to convince Junio now that this is a useful
change, but to convince future readers of the project history that
this change is desirable.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] send-email: extract email-parsing code into a subroutine
       [not found]             ` <2b59497271cd4fada4ff590a001446cf@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-07  7:57               ` Matthieu Moy
  0 siblings, 0 replies; 25+ messages in thread
From: Matthieu Moy @ 2017-12-07  7:57 UTC (permalink / raw)
  To: PAYRE NATHAN p1508475
  Cc: git@vger.kernel.org, ALBERTIN TIMOTHEE p1514771,
	BENSOUSSAN--BOHM DANIEL p1507430

PAYRE NATHAN p1508475 <nathan.payre@etu.univ-lyon1.fr> writes:

> Without the "print" used for testing. 

But still smoe broken indentation:

>  git-send-email.perl | 90 +++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 63 insertions(+), 27 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 2208dcc21..a10574a56 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -715,41 +715,63 @@ EOT3
>  	if (!defined $compose_encoding) {
>  		$compose_encoding = "UTF-8";
>  	}
> -	while(<$c>) {
> -		next if m/^GIT:/;
> -		if ($in_body) {
> -			$summary_empty = 0 unless (/^\n$/);
> -		} elsif (/^\n$/) {
> -			$in_body = 1;
> -			if ($need_8bit_cte) {
> +
> +    my %parsed_email;
> +	$parsed_email{'body'} = '';
> +    while (my $line = <$c>) {
> +	    next if $line =~ m/^GIT:/;
> +	    parse_header_line($line, \%parsed_email);
> +	    if ($line =~ /^\n$/i) {
> +	        while (my $body_line = <$c>) {
> +                if ($body_line !~ m/^GIT:/) {
> +                    $parsed_email{'body'} = $parsed_email{'body'} . $body_line;
> +                }
> +	        }
> +		}
> +	}

This may display properly in your text editor with your setting, but
appears broken at least with tab-width=8. Don't mix tabs and spaces. The
Git coding style is to indent with tabs.

To see what I mean, open the script in Emacs and type M-x
whitespace-mode RET.

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v4] send-email: extract email-parsing code into a subroutine
  2017-12-06 23:02             ` [PATCH v3] " Nathan Payre
  2017-12-06 23:12               ` Ævar Arnfjörð Bjarmason
@ 2017-12-07 10:28               ` Nathan Payre
       [not found]               ` <ff9066a7209b4e21867d933542f8eece@BPMBX2013-01.univ-lyon1.fr>
  2 siblings, 0 replies; 25+ messages in thread
From: Nathan Payre @ 2017-12-07 10:28 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Eric Sunshine, Nathan Payre, Matthieu Moy,
	Timothee Albertin, Daniel Bensoussan

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine "parse_header_line()". This improves the code readability
and make parse_header_line reusable in other place.

"parsed_header_line()" and "filter_body()" could be used for refactoring
the part of code which parses the header a last time to prepare the
email and send it.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

This patch fixes the indentation problem and reduce lines over 80 characters.

 git-send-email.perl | 102 ++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 75 insertions(+), 27 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..b64f8872d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -715,41 +715,60 @@ EOT3
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
-		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
+
+
+	my %parsed_email;
+	$parsed_email{'body'} = '';
+	while (my $line = <$c>) {
+		next if $line =~ m/^GIT:/;
+		parse_header_line($line, \%parsed_email);
+		if ($line =~ /^\n$/i) {
+			$parsed_email{'body'} = filter_body($c);
+		}
+	}
+
+	if ($parsed_email{'from'}) {
+		$sender = $parsed_email{'from'};
+	}
+	if ($parsed_email{'in-reply-to'}) {
+		$initial_reply_to = $parsed_email{'in-reply-to'};
+	}
+	if ($parsed_email{'subject'}) {
+		$initial_subject = $parsed_email{'subject'};
+		print $c2 "Subject: " .
+			quote_subject($parsed_email{'subject'}, $compose_encoding) .
+			"\n";
+	}
+	if ($parsed_email{'mime-version'}) {
+		$need_8bit_cte = 0;
+		print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
+				"Content-Type: $parsed_email{'content-type'};\n",
+				"Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
+	}
+	if ($need_8bit_cte) {
+		if ($parsed_email{'content-type'}) {
+				print $c2 "MIME-Version: 1.0\n",
+					 "Content-Type: $parsed_email{'content-type'};",
+					 "Content-Transfer-Encoding: 8bit\n";
+			} else {
 				print $c2 "MIME-Version: 1.0\n",
 					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
+					 "charset=$compose_encoding\n",
 					 "Content-Transfer-Encoding: 8bit\n";
 			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
-		}
-		print $c2 $_;
 	}
+	if ($parsed_email{'body'}) {
+		$summary_empty = 0;
+		print $c2 "\n$parsed_email{'body'}\n";
+	}
+
 	close $c;
 	close $c2;
 
+	open $c2, "<", $compose_filename . ".final"
+		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
+	close $c2;
+
 	if ($summary_empty) {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
@@ -792,6 +811,35 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+	my $pattern1 = join "|", qw(To Cc Bcc);
+	my $pattern2 = join "|",
+		qw(From Subject Date In-Reply-To Message-ID MIME-Version 
+			Content-Type Content-Transfer-Encoding References);
+	
+	foreach (split(/\n/, $lines)) {
+		if (/^($pattern1):\s*(.+)$/i) {
+		        $parsed_line->{lc $1} = [ parse_address_line($2) ];
+		} elsif (/^($pattern2):\s*(.+)\s*$/i) {
+		        $parsed_line->{lc $1} = $2;
+		}
+	}
+}
+
+sub filter_body {
+	my $c = shift;
+	my $body = "";
+	while (my $body_line = <$c>) {
+		if ($body_line !~ m/^GIT:/) {
+			$body = $body . $body_line;
+		}
+	}
+	return $body;
+}
+
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] send-email: extract email-parsing code into a subroutine
       [not found]               ` <ff9066a7209b4e21867d933542f8eece@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-07 13:22                 ` Matthieu Moy
  2017-12-07 14:14                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 25+ messages in thread
From: Matthieu Moy @ 2017-12-07 13:22 UTC (permalink / raw)
  To: PAYRE NATHAN p1508475
  Cc: git@vger.kernel.org, Junio C Hamano, Eric Sunshine,
	ALBERTIN TIMOTHEE p1514771, BENSOUSSAN--BOHM DANIEL p1507430

Not terribly important, but your patch has trailing newlines. "git diff
--staged --check" to see them. More below.

PAYRE NATHAN p1508475 <nathan.payre@etu.univ-lyon1.fr> writes:

> the part of code which parses the header a last time to prepare the
> email and send it.

The important point is not that it's the last time the code parses
headers, so I'd drop the "a last time".

> +	my %parsed_email;
> +	$parsed_email{'body'} = '';
> +	while (my $line = <$c>) {
> +		next if $line =~ m/^GIT:/;
> +		parse_header_line($line, \%parsed_email);
> +		if ($line =~ /^\n$/i) {

You don't need the /i (case-Insensitive) here, there are no letters to
match.

> +	if ($parsed_email{'mime-version'}) {
> +		$need_8bit_cte = 0;

This $need_8bit_cte is a leftover of the old code, which processed the
headers in the order it found them in the message and had to remember
the content of MIME-Version while parsing Content-Type.

I believe you can apply this on top of your patch:

--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -709,7 +709,6 @@ EOT3
        open $c, "<", $compose_filename
                or die sprintf(__("Failed to open %s: %s"), $compose_filename, $!);
 
-       my $need_8bit_cte = file_has_nonascii($compose_filename);
        my $in_body = 0;
        my $summary_empty = 1;
        if (!defined $compose_encoding) {
@@ -740,12 +739,10 @@ EOT3
                        "\n";
        }
        if ($parsed_email{'mime-version'}) {
-               $need_8bit_cte = 0;
                print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
                                "Content-Type: $parsed_email{'content-type'};\n",
                                "Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
-       }
-       if ($need_8bit_cte) {
+       } else if (file_has_nonascii($compose_filename)) {
                if ($parsed_email{'content-type'}) {
                                print $c2 "MIME-Version: 1.0\n",
                                         "Content-Type: $parsed_email{'content-type'};",

It reads much better: "If the original message already had a
MIME-Version header, then use that, else see if the file has non-ascii
characters and if so, use MIME-Version: 1.0".

Actually, you can even simplify further by factoring the if/else below:

> +		if ($parsed_email{'content-type'}) {
> +				print $c2 "MIME-Version: 1.0\n",
> +					 "Content-Type: $parsed_email{'content-type'};",

(Suspicious ";", and suspicious absence of "\n" here, I don't think it's
intentional and I'm fixing it below, but correct me if I'm wrong)

> +					 "Content-Transfer-Encoding: 8bit\n";
> +			} else {

(Broken indentation, this is not aligned with the "if" above)

>  				print $c2 "MIME-Version: 1.0\n",
>  					 "Content-Type: text/plain; ",
> -					   "charset=$compose_encoding\n",
> +					 "charset=$compose_encoding\n",
>  					 "Content-Transfer-Encoding: 8bit\n";
>  			}

This could become stg like (untested):

	} else if (file_has_nonascii($compose_filename)) {
        	my $content_type = ($parsed_email{'content-type'} or
                	"text/plain; charset=$compose_encoding");
		print $c2 "MIME-Version: 1.0\n",
			  "Content-Type: $content_type\n",
			  "Content-Transfer-Encoding: 8bit\n";
	}

> +	open $c2, "<", $compose_filename . ".final"
> +		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
> +	close $c2;

What is this? Cut-and-paste mistake?

> +sub parse_header_line {
> +	my $lines = shift;
> +	my $parsed_line = shift;
> +	my $pattern1 = join "|", qw(To Cc Bcc);
> +	my $pattern2 = join "|",
> +		qw(From Subject Date In-Reply-To Message-ID MIME-Version 
> +			Content-Type Content-Transfer-Encoding References);
> +	
> +	foreach (split(/\n/, $lines)) {
> +		if (/^($pattern1):\s*(.+)$/i) {
> +		        $parsed_line->{lc $1} = [ parse_address_line($2) ];
> +		} elsif (/^($pattern2):\s*(.+)\s*$/i) {
> +		        $parsed_line->{lc $1} = $2;
> +		}

I don't think you need to list the possibilities in the "else" branch.
Just matching /^([^:]*):\s*(.+)\s*$/i should do the trick.

> +			$body = $body . $body_line;

Or just: $body .= $body_line;

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] send-email: extract email-parsing code into a subroutine
  2017-12-07 13:22                 ` Matthieu Moy
@ 2017-12-07 14:14                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-12-07 14:14 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: PAYRE NATHAN p1508475, git@vger.kernel.org, Junio C Hamano,
	Eric Sunshine, ALBERTIN TIMOTHEE p1514771,
	BENSOUSSAN--BOHM DANIEL p1507430


On Thu, Dec 07 2017, Matthieu Moy jotted:

> Not terribly important, but your patch has trailing newlines. "git diff
> --staged --check" to see them. More below.
>
> PAYRE NATHAN p1508475 <nathan.payre@etu.univ-lyon1.fr> writes:
>
>> the part of code which parses the header a last time to prepare the
>> email and send it.
>
> The important point is not that it's the last time the code parses
> headers, so I'd drop the "a last time".
>
>> +	my %parsed_email;
>> +	$parsed_email{'body'} = '';
>> +	while (my $line = <$c>) {
>> +		next if $line =~ m/^GIT:/;
>> +		parse_header_line($line, \%parsed_email);
>> +		if ($line =~ /^\n$/i) {
>
> You don't need the /i (case-Insensitive) here, there are no letters to
> match.

Good catch, actually this can just be: /^$/. The $ syntax already
matches the ending newline, no need for /^\n$/.

>> +sub parse_header_line {
>> +	my $lines = shift;
>> +	my $parsed_line = shift;
>> +	my $pattern1 = join "|", qw(To Cc Bcc);
>> +	my $pattern2 = join "|",
>> +		qw(From Subject Date In-Reply-To Message-ID MIME-Version
>> +			Content-Type Content-Transfer-Encoding References);
>> +
>> +	foreach (split(/\n/, $lines)) {
>> +		if (/^($pattern1):\s*(.+)$/i) {
>> +		        $parsed_line->{lc $1} = [ parse_address_line($2) ];
>> +		} elsif (/^($pattern2):\s*(.+)\s*$/i) {
>> +		        $parsed_line->{lc $1} = $2;
>> +		}
>
> I don't think you need to list the possibilities in the "else" branch.
> Just matching /^([^:]*):\s*(.+)\s*$/i should do the trick.

Although you'll end up with a lot of stuff in the $parsed_line hash you
don't need, which makes dumping it for debugging verbose.

I also wonder about multi-line headers, but then again that probably
breaks already on e.g. Message-ID and Refererences, but that's an
existing bug unrelated to this patch...

>> +			$body = $body . $body_line;
>
> Or just: $body .= $body_line;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-06 22:55           ` Nathan PAYRE
                               ` (3 preceding siblings ...)
       [not found]             ` <2b59497271cd4fada4ff590a001446cf@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-09 15:37             ` Nathan Payre
       [not found]             ` <34c53164f4054ee88354f19fc38ae0c4@BPMBX2013-01.univ-lyon1.fr>
  5 siblings, 0 replies; 25+ messages in thread
From: Nathan Payre @ 2017-12-09 15:37 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Nathan Payre, Matthieu Moy, Timothee Albertin,
	Daniel Bensoussan, Junio C Hamano

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine "parse_header_line()". This improves the code readability
and make parse_header_line reusable in other place.

"parsed_header_line()" and "filter_body()" could be used for
refactoring the part of code which parses the header to prepare the
email and send it.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

I fixed the last reported problems and removed some other old
variable as $need_8bit_cte.

 git-send-email.perl | 110 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 73 insertions(+), 37 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..ac36c6aac 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -685,7 +685,7 @@ Lines beginning in "GIT:" will be removed.
 Consider including an overall diffstat or table of contents
 for the patch you are writing.
 
-Clear the body content if you don't wish to send a summary.
+Clear the body content if you dont wish to send a summary.
 EOT2
 From: $tpl_sender
 Subject: $tpl_subject
@@ -709,51 +709,61 @@ EOT3
 	open $c, "<", $compose_filename
 		or die sprintf(__("Failed to open %s: %s"), $compose_filename, $!);
 
-	my $need_8bit_cte = file_has_nonascii($compose_filename);
-	my $in_body = 0;
-	my $summary_empty = 1;
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
-		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
-				print $c2 "MIME-Version: 1.0\n",
-					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
-					 "Content-Transfer-Encoding: 8bit\n";
-			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
+
+
+	my %parsed_email;
+	$parsed_email{'body'} = '';
+	while (my $line = <$c>) {
+		next if $line =~ m/^GIT:/;
+		parse_header_line($line, \%parsed_email);
+		if ($line =~ /^$/) {
+			$parsed_email{'body'} = filter_body($c);
 		}
-		print $c2 $_;
 	}
-	close $c;
-	close $c2;
 
-	if ($summary_empty) {
+	if ($parsed_email{'from'}) {
+		$sender = $parsed_email{'from'};
+	}
+	if ($parsed_email{'in-reply-to'}) {
+		$initial_reply_to = $parsed_email{'in-reply-to'};
+	}
+	if ($parsed_email{'subject'}) {
+		$initial_subject = $parsed_email{'subject'};
+		print $c2 "Subject: " .
+			quote_subject($parsed_email{'subject'}, $compose_encoding) .
+			"\n";
+	}
+	if ($parsed_email{'mime-version'}) {
+		print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
+				"Content-Type: $parsed_email{'content-type'};\n",
+				"Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
+	}
+
+	if ($parsed_email{'content-type'}) {
+		print $c2 "MIME-Version: 1.0\n",
+			 "Content-Type: $parsed_email{'content-type'};\n",
+			 "Content-Transfer-Encoding: 8bit\n";
+	} elsif (file_has_nonascii($compose_filename)) {
+                my $content_type = ($parsed_email{'content-type'} or
+                        "text/plain; charset=$compose_encoding");
+                print $c2 "MIME-Version: 1.0\n",
+                          "Content-Type: $content_type\n",
+                          "Content-Transfer-Encoding: 8bit\n";
+        }
+
+	if ($parsed_email{'body'}) {
+		print $c2 "\n$parsed_email{'body'}\n";
+	} else {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
 	}
+
+	close $c;
+	close $c2;
+
 } elsif ($annotate) {
 	do_edit(@files);
 }
@@ -792,6 +802,32 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+	my $pattern = join "|", qw(To Cc Bcc);
+	
+	foreach (split(/\n/, $lines)) {
+		if (/^($pattern):\s*(.+)$/i) {
+		        $parsed_line->{lc $1} = [ parse_address_line($2) ];
+		} elsif (/^([^:]*):\s*(.+)\s*$/i) {
+		        $parsed_line->{lc $1} = $2;
+		}
+	}
+}
+
+sub filter_body {
+	my $c = shift;
+	my $body = "";
+	while (my $body_line = <$c>) {
+		if ($body_line !~ m/^GIT:/) {
+			$body .= $body_line;
+		}
+	}
+	return $body;
+}
+
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
       [not found]             ` <34c53164f4054ee88354f19fc38ae0c4@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-11 21:12               ` Matthieu Moy
  2017-12-14 10:06                 ` Nathan PAYRE
  0 siblings, 1 reply; 25+ messages in thread
From: Matthieu Moy @ 2017-12-11 21:12 UTC (permalink / raw)
  To: PAYRE NATHAN p1508475
  Cc: git, Eric Sunshine, ALBERTIN TIMOTHEE p1514771,
	BENSOUSSAN--BOHM DANIEL p1507430, Junio C Hamano

"PAYRE NATHAN p1508475" <nathan.payre@etu.univ-lyon1.fr> wrote:

> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -685,7 +685,7 @@ Lines beginning in "GIT:" will be removed.
>  Consider including an overall diffstat or table of contents
>  for the patch you are writing.
>  
> -Clear the body content if you don't wish to send a summary.
> +Clear the body content if you dont wish to send a summary.

This is not part of your patch. Use "git add -p" to specify
exactly which hunks should go into the patch and don't let this
kind of change end up in the version you send.

> +	my %parsed_email;
> +	$parsed_email{'body'} = '';
> +	while (my $line = <$c>) {
> +		next if $line =~ m/^GIT:/;
> +		parse_header_line($line, \%parsed_email);
> +		if ($line =~ /^$/) {
> +			$parsed_email{'body'} = filter_body($c);
>  		}
> -		print $c2 $_;

I didn't notice this at first, but you're modifying the behavior here:
the old code used to print to $c2 anything that didn't match any of
the if/else if branches.

To keep this behavior, you need to keep all these extra headers in
$parsed_email (you do, in this version) and print them after taking
care of all the known headers (AFAICT, you don't).

>  	}
> -	close $c;
> -	close $c2;

You'll still need $c2, but you don't need $c anymore, so I'd keep the
"close $c" here. OTOH, $c2 is not needed before this point (actually a
bit later), so it would make sense to move the "open" down a little.
This would materialize the "read input, then write output" scheme (as
opposed to "write output while reading input" in the previous code).
It's not a new issue in your patch, but giving variables meaningful
names (i.e. not $c and $c2) would help, too.

> +	if ($parsed_email{'mime-version'}) {
> +		print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
> +				"Content-Type: $parsed_email{'content-type'};\n",
> +				"Content-Transfer-Encoding: $parsed_email{'content-transfer-encoding'}\n";
> +	}
> +
> +	if ($parsed_email{'content-type'}) {
> +		print $c2 "MIME-Version: 1.0\n",
> +			 "Content-Type: $parsed_email{'content-type'};\n",
> +			 "Content-Transfer-Encoding: 8bit\n";

This "if ($parsed_email{'content-type'})" does not correspond to
anything in the old code, and ...

> +	} elsif (file_has_nonascii($compose_filename)) {
> +                my $content_type = ($parsed_email{'content-type'} or
> +                        "text/plain; charset=$compose_encoding");

Here, your're dealing explicitly with $parsed_email{'content-type'} !=
false (you're in the 'else' branch where it can only be false).

I think you just meant to drop the "if
($parsed_email{'content-type'})" part, and plug the "elseif" directly
after the "if ($parsed_email{'mime-version'})". That's what I
suggested in my earlier email.

> +                my $content_type =3D ($parsed_email{'content-type'} or
> +                        "text/plain; charset=3D$compose_encoding");
> +                print $c2 "MIME-Version: 1.0\n",
> +                          "Content-Type: $content_type\n",
> +                          "Content-Transfer-Encoding: 8bit\n";
> +        }

This part is indented with spaces, please use tabs.

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-11 21:12               ` Matthieu Moy
@ 2017-12-14 10:06                 ` Nathan PAYRE
  2017-12-14 11:12                   ` Nathan Payre
       [not found]                   ` <ce70816f94c24754bea9bc8175de4bc4@BPMBX2013-01.univ-lyon1.fr>
  0 siblings, 2 replies; 25+ messages in thread
From: Nathan PAYRE @ 2017-12-14 10:06 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: PAYRE NATHAN p1508475, Git Mailing List, Eric Sunshine,
	ALBERTIN TIMOTHEE p1514771, BENSOUSSAN--BOHM DANIEL p1507430,
	Junio C Hamano

2017-12-11 22:12 GMT+01:00 Matthieu Moy <matthieu.moy@univ-lyon1.fr>:

> "PAYRE NATHAN p1508475" <nathan.payre@etu.univ-lyon1.fr> wrote:
>> +     my %parsed_email;
>> +     $parsed_email{'body'} = '';
>> +     while (my $line = <$c>) {
>> +             next if $line =~ m/^GIT:/;
>> +             parse_header_line($line, \%parsed_email);
>> +             if ($line =~ /^$/) {
>> +                     $parsed_email{'body'} = filter_body($c);
>>               }
>> -             print $c2 $_;
>
> I didn't notice this at first, but you're modifying the behavior here:
> the old code used to print to $c2 anything that didn't match any of
> the if/else if branches.
>
> To keep this behavior, you need to keep all these extra headers in
> $parsed_email (you do, in this version) and print them after taking
> care of all the known headers (AFAICT, you don't).

This case is not that easy to correct because:
- It's could weigh the code.
- The refactoring may not be legitimate anymore.

I've found two way to resolve this:
.1) After every if($parsed_email{'key'}) remove the corresponding key
and just before closing $c2 create a new loop which add all the
remaining parts.

.2) Making a mix between the old and new code. Some parts of
my patch can improve the old code (like the removing of
$need_8bit_cte) then it will be kept and the while loop will be
similar the old code

I think that the first version will look like better than the second
one, easy to read, but it will change the order of the email header.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-14 10:06                 ` Nathan PAYRE
@ 2017-12-14 11:12                   ` Nathan Payre
       [not found]                   ` <ce70816f94c24754bea9bc8175de4bc4@BPMBX2013-01.univ-lyon1.fr>
  1 sibling, 0 replies; 25+ messages in thread
From: Nathan Payre @ 2017-12-14 11:12 UTC (permalink / raw)
  To: git
  Cc: Eric Sunshine, Nathan Payre, Matthieu Moy, Timothee Albertin,
	Daniel Bensoussan, Junio C Hamano

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine "parse_header_line()". This improves the code readability
and make parse_header_line reusable in other place.

"parsed_header_line()" and "filter_body()" could be used for
refactoring the part of code which parses the header to prepare the
email and send it.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

>> "PAYRE NATHAN p1508475" <nathan.payre@etu.univ-lyon1.fr> wrote:
>>> +     my %parsed_email;
>>> +     $parsed_email{'body'} = '';
>>> +     while (my $line = <$c>) {
>>> +             next if $line =~ m/^GIT:/;
>>> +             parse_header_line($line, \%parsed_email);
>>> +             if ($line =~ /^$/) {
>>> +                     $parsed_email{'body'} = filter_body($c);
>>>               }
>>> -             print $c2 $_;
>>
>> I didn't notice this at first, but you're modifying the behavior here:
>> the old code used to print to $c2 anything that didn't match any of
>> the if/else if branches.
>>
>> To keep this behavior, you need to keep all these extra headers in
>> $parsed_email (you do, in this version) and print them after taking
>> care of all the known headers (AFAICT, you don't).
>
> This case is not that easy to correct because:
> - It's could weigh the code.
> - The refactoring may not be legitimate anymore.
> 
> I've found two way to resolve this:
> .1) After every if($parsed_email{'key'}) remove the corresponding key
> and just before closing $c2 create a new loop which add all the
> remaining parts.
>
> .2) Making a mix between the old and new code. Some parts of
> my patch can improve the old code (like the removing of
> $need_8bit_cte) then it will be kept and the while loop will be
> similar the old code
>
> I think that the first version will look like better than the second
> one, easy to read, but it will change the order of the email header.

This is how I see the first choice of the two I've proposed in my last
email.

 git-send-email.perl | 116
 +++++++++++++++++++++++++++++++++++----------------- 1 file changed,
 78 insertions(+), 38 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..f942fc2a5 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -703,57 +703,71 @@ EOT3
 		do_edit($compose_filename);
 	}
 
-	open my $c2, ">", $compose_filename . ".final"
-		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
-
 	open $c, "<", $compose_filename
 		or die sprintf(__("Failed to open %s: %s"), $compose_filename, $!);
 
-	my $need_8bit_cte = file_has_nonascii($compose_filename);
-	my $in_body = 0;
-	my $summary_empty = 1;
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
-		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
-				print $c2 "MIME-Version: 1.0\n",
-					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
-					 "Content-Transfer-Encoding: 8bit\n";
-			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
+
+	my %parsed_email;
+	while (my $line = <$c>) {
+		next if $line =~ m/^GIT:/;
+		parse_header_line($line, \%parsed_email);
+		if ($line =~ /^$/) {
+			$parsed_email{'body'} = filter_body($c);
 		}
-		print $c2 $_;
 	}
+
 	close $c;
-	close $c2;
 
-	if ($summary_empty) {
+	open my $c2, ">", $compose_filename . ".final"
+	or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
+
+
+	if ($parsed_email{'From'}) {
+		$sender = delete($parsed_email{'From'});
+	}
+	if ($parsed_email{'In-Reply-To'}) {
+		$initial_reply_to = delete($parsed_email{'In-Reply-To'});
+	}
+	if ($parsed_email{'Subject'}) {
+		$initial_subject = delete($parsed_email{'Subject'});
+		print $c2 "Subject: " .
+			quote_subject($initial_subject, $compose_encoding) .
+			"\n";
+	}
+
+	if ($parsed_email{'MIME-Version'}) {
+		print $c2 "MIME-Version: $parsed_email{'MIME-Version'}\n",
+				"Content-Type: $parsed_email{'Content-Type'};\n",
+				"Content-Transfer-Encoding: $parsed_email{'Content-Transfer-Encoding'}\n";
+		delete($parsed_email{'MIME-Version'});
+		delete($parsed_email{'Content-Type'});
+		delete($parsed_email{'Content-Transfer-Encoding'});
+	} elsif (file_has_nonascii($compose_filename)) {
+		my $content_type = (delete($parsed_email{'Content-Type'}) or
+			"text/plain; charset=$compose_encoding");
+		print $c2 "MIME-Version: 1.0\n",
+			"Content-Type: $content_type\n",
+			"Content-Transfer-Encoding: 8bit\n";
+	}
+
+	foreach my $key (keys %parsed_email) {
+		next if $key == 'body';
+		print $c2 "$key: $parsed_email{$key}";
+	}
+
+	if ($parsed_email{'body'}) {
+		print $c2 "\n$parsed_email{'body'}\n";
+		delete($parsed_email{'body'});
+	} else {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
 	}
+
+	close $c2;
+
 } elsif ($annotate) {
 	do_edit(@files);
 }
@@ -792,6 +806,32 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+	my $pattern = join "|", qw(To Cc Bcc);
+	
+	foreach (split(/\n/, $lines)) {
+		if (/^($pattern):\s*(.+)$/i) {
+		        $parsed_line->{$1} = [ parse_address_line($2) ];
+		} elsif (/^([^:]*):\s*(.+)\s*$/i) {
+		        $parsed_line->{$1} = $2;
+		}
+	}
+}
+
+sub filter_body {
+	my $c = shift;
+	my $body = "";
+	while (my $body_line = <$c>) {
+		if ($body_line !~ m/^GIT:/) {
+			$body .= $body_line;
+		}
+	}
+	return $body;
+}
+
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
       [not found]                   ` <ce70816f94c24754bea9bc8175de4bc4@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-14 14:05                     ` Matthieu Moy
  2017-12-15 15:33                       ` Nathan Payre
       [not found]                       ` <3cafddfe825a4fb4a554f02aa3c025a3@BPMBX2013-01.univ-lyon1.fr>
  0 siblings, 2 replies; 25+ messages in thread
From: Matthieu Moy @ 2017-12-14 14:05 UTC (permalink / raw)
  To: PAYRE NATHAN p1508475
  Cc: git, Eric Sunshine, ALBERTIN TIMOTHEE p1514771,
	BENSOUSSAN--BOHM DANIEL p1507430, Junio C Hamano

"PAYRE NATHAN p1508475" <nathan.payre@etu.univ-lyon1.fr> wrote:

> -		print $c2 $_;
>  	}
> +
>  	close $c;

Nit: this added newline does not seem necessary to me. Nothing
serious, but this kind of thing tend to distract the reader when
reviewing the patch.

> +	foreach my $key (keys %parsed_email) {
> +		next if $key == 'body';
> +		print $c2 "$key: $parsed_email{$key}";
> +	}

I'd add a comment like

	# Preserve unknown headers

at the top of the loop to make it clear what we're doing.

On a side note: there's no comment in the code you're adding. This is
not necessarily a bad thing (beautifully written code does not need
comments to be readable), but you may re-read your code with the
question "did I explain everything well-enough?" in mind. The loop
above is a case where IMHO a short and sweet comment helps the reader.

Two potential issues not mentionned in your message but that we
discussed offlist is that 1) this doesn't preserve the order, and 2)
this strips duplicate headers. I believe this is not a problem here,
and trying to solve these points would make the code overkill, but
this would really deserve being mentionned in the commit message.
First, so that people reviewing your patch now can confirm (or not)
that you are taking the right decision by doing this, and also for
people in the future examining your patch (e.g. after a bisect).

> +sub parse_header_line {
> +	my $lines = shift;
> +	my $parsed_line = shift;
> +	my $pattern = join "|", qw(To Cc Bcc);

Nit: you may want to rename it to something more explicit, like
$addr_headers_pat.

None of my nit should block the patch inclusion, but I think the
commit message should be expanded to include a mention of the
"duplicate headers"/"header order" potential issue.

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] send-email: extract email-parsing code into a subroutine
  2017-12-14 14:05                     ` Matthieu Moy
@ 2017-12-15 15:33                       ` Nathan Payre
       [not found]                       ` <3cafddfe825a4fb4a554f02aa3c025a3@BPMBX2013-01.univ-lyon1.fr>
  1 sibling, 0 replies; 25+ messages in thread
From: Nathan Payre @ 2017-12-15 15:33 UTC (permalink / raw)
  To: git
  Cc: Nathan Payre, Matthieu Moy, Timothee Albertin, Daniel Bensoussan,
	Junio C Hamano

The existing code mixes parsing of email header with regular
expression and actual code. Extract the parsing code into a new
subroutine "parse_header_line()". This improves the code readability
and make parse_header_line reusable in other place.

"parsed_header_line()" and "filter_body()" could be used for
refactoring the part of code which parses the header to prepare the
email and send it.

In contrast to the previous version it doesn't keep the header order
and strip duplicate headers.

Signed-off-by: Nathan Payre <nathan.payre@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@univ-lyon1.fr>
Signed-off-by: Timothee Albertin <timothee.albertin@etu.univ-lyon1.fr>
Signed-off-by: Daniel Bensoussan <daniel.bensoussan--bohm@etu.univ-lyon1.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thanks-to: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

>> +sub parse_header_line {
>> +     my $lines = shift;
>> +     my $parsed_line = shift;
>> +     my $pattern = join "|", qw(To Cc Bcc);
>
> Nit: you may want to rename it to something more explicit, like
> $addr_headers_pat.

I find "$addr_headers_pat" too long that's why I've choose rename it
into "$addr_pat", in addition to that, because the variable is in the
subroutine "parse_header_line" it does not require to include
"headers" in the variable name.

 git-send-email.perl | 115 +++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 77 insertions(+), 38 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..e6e813041 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -703,57 +703,70 @@ EOT3
 		do_edit($compose_filename);
 	}
 
-	open my $c2, ">", $compose_filename . ".final"
-		or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
-
 	open $c, "<", $compose_filename
 		or die sprintf(__("Failed to open %s: %s"), $compose_filename, $!);
 
-	my $need_8bit_cte = file_has_nonascii($compose_filename);
-	my $in_body = 0;
-	my $summary_empty = 1;
 	if (!defined $compose_encoding) {
 		$compose_encoding = "UTF-8";
 	}
-	while(<$c>) {
-		next if m/^GIT:/;
-		if ($in_body) {
-			$summary_empty = 0 unless (/^\n$/);
-		} elsif (/^\n$/) {
-			$in_body = 1;
-			if ($need_8bit_cte) {
-				print $c2 "MIME-Version: 1.0\n",
-					 "Content-Type: text/plain; ",
-					   "charset=$compose_encoding\n",
-					 "Content-Transfer-Encoding: 8bit\n";
-			}
-		} elsif (/^MIME-Version:/i) {
-			$need_8bit_cte = 0;
-		} elsif (/^Subject:\s*(.+)\s*$/i) {
-			$initial_subject = $1;
-			my $subject = $initial_subject;
-			$_ = "Subject: " .
-				quote_subject($subject, $compose_encoding) .
-				"\n";
-		} elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-			$initial_reply_to = $1;
-			next;
-		} elsif (/^From:\s*(.+)\s*$/i) {
-			$sender = $1;
-			next;
-		} elsif (/^(?:To|Cc|Bcc):/i) {
-			print __("To/Cc/Bcc fields are not interpreted yet, they have been ignored\n");
-			next;
+
+	my %parsed_email;
+	while (my $line = <$c>) {
+		next if $line =~ m/^GIT:/;
+		parse_header_line($line, \%parsed_email);
+		if ($line =~ /^$/) {
+			$parsed_email{'body'} = filter_body($c);
 		}
-		print $c2 $_;
 	}
 	close $c;
-	close $c2;
 
-	if ($summary_empty) {
+	open my $c2, ">", $compose_filename . ".final"
+	or die sprintf(__("Failed to open %s.final: %s"), $compose_filename, $!);
+
+
+	if ($parsed_email{'From'}) {
+		$sender = delete($parsed_email{'From'});
+	}
+	if ($parsed_email{'In-Reply-To'}) {
+		$initial_reply_to = delete($parsed_email{'In-Reply-To'});
+	}
+	if ($parsed_email{'Subject'}) {
+		$initial_subject = delete($parsed_email{'Subject'});
+		print $c2 "Subject: " .
+			quote_subject($initial_subject, $compose_encoding) .
+			"\n";
+	}
+
+	if ($parsed_email{'MIME-Version'}) {
+		print $c2 "MIME-Version: $parsed_email{'MIME-Version'}\n",
+				"Content-Type: $parsed_email{'Content-Type'};\n",
+				"Content-Transfer-Encoding: $parsed_email{'Content-Transfer-Encoding'}\n";
+		delete($parsed_email{'MIME-Version'});
+		delete($parsed_email{'Content-Type'});
+		delete($parsed_email{'Content-Transfer-Encoding'});
+	} elsif (file_has_nonascii($compose_filename)) {
+		my $content_type = (delete($parsed_email{'Content-Type'}) or
+			"text/plain; charset=$compose_encoding");
+		print $c2 "MIME-Version: 1.0\n",
+			"Content-Type: $content_type\n",
+			"Content-Transfer-Encoding: 8bit\n";
+	}
+	# Preserve unknown headers
+	foreach my $key (keys %parsed_email) {
+		next if $key eq 'body';
+		print $c2 "$key: $parsed_email{$key}";
+	}
+
+	if ($parsed_email{'body'}) {
+		print $c2 "\n$parsed_email{'body'}\n";
+		delete($parsed_email{'body'});
+	} else {
 		print __("Summary email is empty, skipping it\n");
 		$compose = -1;
 	}
+
+	close $c2;
+
 } elsif ($annotate) {
 	do_edit(@files);
 }
@@ -792,6 +805,32 @@ sub ask {
 	return;
 }
 
+sub parse_header_line {
+	my $lines = shift;
+	my $parsed_line = shift;
+	my $addr_pat = join "|", qw(To Cc Bcc);
+	
+	foreach (split(/\n/, $lines)) {
+		if (/^($addr_pat):\s*(.+)$/i) {
+		        $parsed_line->{$1} = [ parse_address_line($2) ];
+		} elsif (/^([^:]*):\s*(.+)\s*$/i) {
+		        $parsed_line->{$1} = $2;
+		}
+	}
+}
+
+sub filter_body {
+	my $c = shift;
+	my $body = "";
+	while (my $body_line = <$c>) {
+		if ($body_line !~ m/^GIT:/) {
+			$body .= $body_line;
+		}
+	}
+	return $body;
+}
+
+
 my %broken_encoding;
 
 sub file_declares_8bit_cte {
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] send-email: extract email-parsing code into a subroutine
       [not found]                       ` <3cafddfe825a4fb4a554f02aa3c025a3@BPMBX2013-01.univ-lyon1.fr>
@ 2017-12-15 15:44                         ` Matthieu Moy
  0 siblings, 0 replies; 25+ messages in thread
From: Matthieu Moy @ 2017-12-15 15:44 UTC (permalink / raw)
  To: PAYRE NATHAN p1508475
  Cc: git@vger.kernel.org, ALBERTIN TIMOTHEE p1514771,
	BENSOUSSAN--BOHM DANIEL p1507430, Junio C Hamano

PAYRE NATHAN p1508475 <nathan.payre@etu.univ-lyon1.fr> writes:

>>> +sub parse_header_line {
>>> +     my $lines = shift;
>>> +     my $parsed_line = shift;
>>> +     my $pattern = join "|", qw(To Cc Bcc);
>>
>> Nit: you may want to rename it to something more explicit, like
>> $addr_headers_pat.
>
> I find "$addr_headers_pat" too long that's why I've choose rename it
> into "$addr_pat", in addition to that, because the variable is in the
> subroutine "parse_header_line" it does not require to include
> "headers" in the variable name.

I suggested this name because $addr_pat seems to imply that this matches
an address, while it matches the _name of headers_ containing address.
But that's not terribly important, the meaning is clear by the context
anyway.

All my previous remarks have been taken into account. This is now

Reviewed-by: Matthieu Moy <Matthieu.Moy@univ-lyon1.fr>

Thanks,

-- 
Matthieu Moy
https://matthieu-moy.fr/

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-12-15 15:45 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-02 17:02 [PATCH] send-email: extract email-parsing code into a subroutine Payre Nathan
2017-12-02 17:11 ` Nathan PAYRE
     [not found] ` <445d0838cf2a4107bad95d5cc2d38a05@BPMBX2013-01.univ-lyon1.fr>
2017-12-03 21:20   ` Matthieu Moy
2017-12-03 22:00 ` Ævar Arnfjörð Bjarmason
2017-12-03 23:41   ` Nathan PAYRE
2017-12-04 13:45     ` Junio C Hamano
2017-12-06 15:38       ` [PATCH v2] " Nathan Payre
2017-12-06 21:39         ` Junio C Hamano
2017-12-06 22:55           ` Nathan PAYRE
2017-12-06 23:02             ` [PATCH v3] " Nathan Payre
2017-12-06 23:12               ` Ævar Arnfjörð Bjarmason
2017-12-07 10:28               ` [PATCH v4] " Nathan Payre
     [not found]               ` <ff9066a7209b4e21867d933542f8eece@BPMBX2013-01.univ-lyon1.fr>
2017-12-07 13:22                 ` Matthieu Moy
2017-12-07 14:14                   ` Ævar Arnfjörð Bjarmason
2017-12-06 23:06             ` [PATCH v2] " Junio C Hamano
2017-12-07  7:32             ` Eric Sunshine
     [not found]             ` <2b59497271cd4fada4ff590a001446cf@BPMBX2013-01.univ-lyon1.fr>
2017-12-07  7:57               ` [PATCH v3] " Matthieu Moy
2017-12-09 15:37             ` [PATCH] " Nathan Payre
     [not found]             ` <34c53164f4054ee88354f19fc38ae0c4@BPMBX2013-01.univ-lyon1.fr>
2017-12-11 21:12               ` Matthieu Moy
2017-12-14 10:06                 ` Nathan PAYRE
2017-12-14 11:12                   ` Nathan Payre
     [not found]                   ` <ce70816f94c24754bea9bc8175de4bc4@BPMBX2013-01.univ-lyon1.fr>
2017-12-14 14:05                     ` Matthieu Moy
2017-12-15 15:33                       ` Nathan Payre
     [not found]                       ` <3cafddfe825a4fb4a554f02aa3c025a3@BPMBX2013-01.univ-lyon1.fr>
2017-12-15 15:44                         ` Matthieu Moy
2017-12-06 15:32     ` [PATCH v2] " Nathan Payre

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).