From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 211BB1F4B9
	for ; Thu, 23 Jan 2020 23:06:00 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 1/6] contentid: use map to generate %seen for Message-Ids
Date: Thu, 23 Jan 2020 23:05:54 +0000
Message-Id: <20200123230559.16781-2-e@yhbt.net>
In-Reply-To: <20200123230559.16781-1-e@yhbt.net>
References: <20200123230559.16781-1-e@yhbt.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

This use of map {} is a common idiom as we no longer consider
the Message-ID as part of the digest.
---
 lib/PublicInbox/ContentId.pm | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/ContentId.pm b/lib/PublicInbox/ContentId.pm
index eb937a0e..0c4a8678 100644
--- a/lib/PublicInbox/ContentId.pm
+++ b/lib/PublicInbox/ContentId.pm
@@ -60,12 +60,9 @@ sub content_digest ($) {
 	# References: and In-Reply-To: get used interchangeably
 	# in some "duplicates" in LKML.  We treat them the same
 	# in SearchIdx, so treat them the same for this:
-	my %seen;
-	foreach my $mid (@{mids($hdr)}) {
-		# do NOT consider the Message-ID as part of the content_id
-		# if we got here, we've already got Message-ID reuse
-		$seen{$mid} = 1;
-	}
+	# do NOT consider the Message-ID as part of the content_id
+	# if we got here, we've already got Message-ID reuse
+	my %seen = map { $_ => 1 } @{mids($hdr)};
 	foreach my $mid (@{references($hdr)}) {
 		next if $seen{$mid};
 		$dig->add("ref\0$mid\0");
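
For readers less familiar with the idiom, here is a minimal standalone
sketch of why the loop and the map {} form should be interchangeable.
It does not use PublicInbox code: @mids and @refs are hypothetical
stand-ins for the lists returned by mids($hdr) and references($hdr),
and the example addresses are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for @{mids($hdr)}; note the repeated entry
my @mids = ('a@example.com', 'b@example.com', 'a@example.com');

# Old form: fill %seen with an explicit loop
my %seen_loop;
foreach my $mid (@mids) {
	$seen_loop{$mid} = 1;
}

# New form: map emits a (key => 1) pair per element; repeated keys collapse
my %seen_map = map { $_ => 1 } @mids;

# Compare the key sets of the two hashes
my $same = join("\0", sort keys %seen_loop) eq
	join("\0", sort keys %seen_map);
print $same ? "same keys\n" : "different keys\n";

# Hypothetical stand-in for @{references($hdr)}; skip anything in %seen,
# as the "next if $seen{$mid}" line in the patch does
my @refs = ('a@example.com', 'c@example.com');
my @kept = grep { !$seen_map{$_} } @refs;
print "@kept\n";
```

With these inputs the script prints "same keys" followed by
"c@example.com", since a@example.com is already in %seen.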