From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id C2C4C1FAE2
	for ; Mon, 19 Mar 2018 08:14:59 +0000 (UTC)
From: "Eric Wong (Contractor, The Linux Foundation)"
To: meta@public-inbox.org
Subject: [PATCH 01/27] content_id: use Sender header if From is not available
Date: Mon, 19 Mar 2018 08:14:33 +0000
Message-Id: <20180319081459.10645-2-e@80x24.org>
In-Reply-To: <20180319081459.10645-1-e@80x24.org>
References: <20180319081459.10645-1-e@80x24.org>
List-Id:

We will be using Sender: in more places if the From: header is
not available, this is one of them.

Followup-to: ("import: fall back to Sender for extracting name and email")
---
 lib/PublicInbox/ContentId.pm | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/ContentId.pm b/lib/PublicInbox/ContentId.pm
index 8347de2..9082b76 100644
--- a/lib/PublicInbox/ContentId.pm
+++ b/lib/PublicInbox/ContentId.pm
@@ -11,9 +11,6 @@ use PublicInbox::MID qw(mids references);
 # not sure if less-widely supported hash families are worth bothering with
 use Digest::SHA;
 
-# Content-* headers are often no-ops, so maybe we don't need them
-my @ID_HEADERS = qw(Subject From Date To Cc);
-
 sub content_digest ($) {
 	my ($mime) = @_;
 	my $dig = Digest::SHA->new(256);
@@ -31,7 +28,18 @@ sub content_digest ($) {
 		next if $seen{$mid};
 		$dig->add('ref: '.$mid);
 	}
-	foreach my $h (@ID_HEADERS) {
+
+	# Only use Sender: if From is not present
+	foreach my $h (qw(From Sender)) {
+		my @v = $hdr->header_raw($h);
+		if (@v) {
+			$dig->add("$h: $_") foreach @v;
+			last;
+		}
+	}
+
+	# Content-* headers are often no-ops, so maybe we don't need them
+	foreach my $h (qw(Subject Date To Cc)) {
 		my @v = $hdr->header_raw($h);
 		$dig->add("$h: $_") foreach @v;
 	}
-- 
EW