* [PATCH 2/4] mail_diff: match ContentHash EOL and EOM behavior more closely
2023-04-25 10:50 6% [PATCH 0/4] mail diff updates Eric Wong
@ 2023-04-25 10:50 7% ` Eric Wong
0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2023-04-25 10:50 UTC (permalink / raw)
To: meta
ContentHash currently doesn't convert CRCRLF to LF. Perhaps it
should, but for now, have diff behavior match the actual
comparison behavior used for dedupe and omit all trailing
whitespace for diff.
---
lib/PublicInbox/ContentHash.pm | 2 +-
lib/PublicInbox/MailDiff.pm | 5 +++--
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/ContentHash.pm b/lib/PublicInbox/ContentHash.pm
index a4f6196f..fc94257c 100644
--- a/lib/PublicInbox/ContentHash.pm
+++ b/lib/PublicInbox/ContentHash.pm
@@ -45,7 +45,7 @@ sub content_dig_i {
my $ct = $part->content_type || 'text/plain';
my ($s, undef) = msg_part_text($part, $ct);
if (defined $s) {
- $s =~ s/\r\n/\n/gs;
+ $s =~ s/\r\n/\n/gs; # TODO: consider \r+\n to match View
$s =~ s/\s*\z//s;
utf8::encode($s);
} else {
diff --git a/lib/PublicInbox/MailDiff.pm b/lib/PublicInbox/MailDiff.pm
index 7511144c..d9733ed4 100644
--- a/lib/PublicInbox/MailDiff.pm
+++ b/lib/PublicInbox/MailDiff.pm
@@ -11,7 +11,7 @@ use PublicInbox::GitAsyncCat;
sub write_part { # Eml->each_part callback
my ($ary, $self) = @_;
my ($part, $depth, $idx) = @$ary;
- if ($idx ne '1' || $self->{-raw_hdr}) {
+ if ($idx ne '1' || $self->{-raw_hdr}) { # lei mail-diff --raw-header
open my $fh, '>', "$self->{curdir}/$idx.hdr" or die "open: $!";
print $fh ${$part->{hdr}} or die "print $!";
close $fh or die "close $!";
@@ -20,7 +20,8 @@ sub write_part { # Eml->each_part callback
my ($s, $err) = msg_part_text($part, $ct);
my $sfx = defined($s) ? 'txt' : 'bin';
$s //= $part->body;
- $s =~ s/\r+\n/\n/sg;
+ $s =~ s/\r\n/\n/gs; # TODO: consider \r+\n to match View
+ $s =~ s/\s*\z//s;
open my $fh, '>:utf8', "$self->{curdir}/$idx.$sfx" or die "open: $!";
print $fh $s or die "print $!";
close $fh or die "close $!";
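The effect of the two substitutions above can be seen in isolation with a minimal sketch. The helper names `normalize_eol_eom` and `normalize_cr_run` are hypothetical and exist only for this illustration; only the regular expressions are taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper: the two substitutions this patch applies in
# MailDiff, which are the same ones ContentHash uses for dedupe.
sub normalize_eol_eom {
	my ($s) = @_;
	$s =~ s/\r\n/\n/gs; # CRLF -> LF; an extra CR before CRLF is kept
	$s =~ s/\s*\z//s;   # drop all trailing whitespace at end of message
	$s;
}

# Hypothetical helper: the substitution MailDiff used before this patch.
sub normalize_cr_run {
	my ($s) = @_;
	$s =~ s/\r+\n/\n/sg; # any run of CRs followed by LF becomes LF
	$s;
}

# A body with one CRCRLF line ending and trailing blank lines
my $body = "line one\r\r\nline two\r\n\r\n \n";
my $new = normalize_eol_eom($body);
my $old = normalize_cr_run($body);
```

With this input, `$new` still carries the stray CR from the CRCRLF sequence but has no trailing whitespace, while `$old` loses the stray CR and keeps the trailing blank lines. This is why the old diff output could disagree with what dedupe actually compared.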
* [PATCH 0/4] mail diff updates
@ 2023-04-25 10:50 6% Eric Wong
2023-04-25 10:50 7% ` [PATCH 2/4] mail_diff: match ContentHash EOL and EOM behavior more closely Eric Wong
0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2023-04-25 10:50 UTC (permalink / raw)
To: meta
A few things I noticed while reading cross-posted LKML messages.
These affect the /$INBOX/$MSGID/d/ WWW endpoint as well as
`lei mail-diff'.
I'm considering making tweaks to ContentHash to ignore the
name+comment parts of To/Cc headers and only rely on the
lowercased email address itself, too. That would affect
dedupe across the board for v2 and extindex...
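A rough sketch of the To/Cc idea, assuming a deliberately naive address pattern. The `addr_only` helper is hypothetical and is not part of ContentHash; real address parsing is more involved than this regular expression.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a To:/Cc: header value to its
# lowercased email addresses, ignoring display names and comments.
# The pattern is a simplification for illustration only.
sub addr_only {
	my ($hdr_val) = @_;
	map { lc($_) } ($hdr_val =~ /([\w.+-]+\@[\w.-]+\w)/g);
}

# Two header values which differ only in name, comment and case
my @a = addr_only('Eric Wong <E@80x24.org>, meta@public-inbox.org (list)');
my @b = addr_only('"E. Wong" <e@80x24.org>, <META@public-inbox.org>');
my $same = join(',', @a) eq join(',', @b);
```

Under this scheme both header values reduce to the same address list, so two copies of a message that differ only in how a mail client rendered the recipients would no longer hash differently.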
Eric Wong (4):
mid+contenthash: eliminate needless local variable captures
mail_diff: match ContentHash EOL and EOM behavior more closely
mail_diff: show headers differences in WWW /$MSGID/d/ view
content_digest_dbg: improve display of To:/Cc: diffs
lib/PublicInbox/ContentDigestDbg.pm | 7 +++++--
lib/PublicInbox/ContentHash.pm | 8 +++-----
lib/PublicInbox/MID.pm | 6 ++----
lib/PublicInbox/MailDiff.pm | 11 ++++-------
4 files changed, 14 insertions(+), 18 deletions(-)