From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id C3DFA1F4B4;
	Mon, 21 Sep 2020 20:58:09 +0000 (UTC)
Date: Mon, 21 Sep 2020 20:58:09 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org
Subject: [PATCH] mda: match List-Id insensitively
Message-ID: <20200921205809.GA20588@dcvr>
References: <20200921180152.uyqluod7qxbwqubo@chatter.i7.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20200921180152.uyqluod7qxbwqubo@chatter.i7.local>
List-Id:

Konstantin Ryabitsev wrote:
> Hello:
>
> Attempting to subscribe radiotap@radiotap.org has highlighted two
> problems with list-id matching. When the email comes in from the mailing
> list, the header is set as:
>
> List-Id: radiotap.NetBSD.org
>
> Public-inbox doesn't find this because the above list-id header is not
> compliant with the RFC (it should be inside angle brackets). However,
> even when <> are added, the match still fails due to capitalization: the
> List-Id value from the email header is lc'd first before it is compared
> with the listid= value in the config file (which isn't lc'd). So, if the
> config file value is using capitalization, the match will never succeed.

The lack of lc for the -mda code path is definitely a bug and
also inconsistent with -watch behavior.  The patch below fixes
it, thanks.

> I think public-inbox should recognize this list-id header even though
> it's not compliant, and it should lc both values before comparing them,
> since the canonical value uses capitalization.

We'll have to think about that one...
It probably needs to be a case-insensitive match of the entire
header contents to avoid inadvertent substring matching.
RFC 2919 allows a phrase element before the list-id.

With -watch, using "watchheader=List-Id:radiotap.NetBSD.org"
(case-sensitive) can work around it, but -mda doesn't support
watchheader or anything like it right now...

-------8<-------
Subject: [PATCH] mda: match List-Id insensitively

This follows -watch commit b70473ab8296d31ebb600adb4fa8fe0ac5935ca8
to match List-Id headers case-insensitively.

Reported-by: Konstantin Ryabitsev
Link: https://public-inbox.org/meta/20200921180152.uyqluod7qxbwqubo@chatter.i7.local/
---
 lib/PublicInbox/Config.pm | 3 ++-
 t/mda.t                   | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index abc525db..d57c361a 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -424,8 +424,9 @@ EOF
 		$self->{-no_obfuscate}->{$lc_addr} = 1;
 	}
 	if (my $listids = $ibx->{listid}) {
+		# RFC2919 section 6 stipulates "case insensitive equality"
 		foreach my $list_id (@$listids) {
-			$self->{-by_list_id}->{$list_id} = $ibx;
+			$self->{-by_list_id}->{lc($list_id)} = $ibx;
 		}
 	}
 	if (my $ng = $ibx->{newsgroup}) {
diff --git a/t/mda.t b/t/mda.t
index c7caf3e0..c5b35eec 100644
--- a/t/mda.t
+++ b/t/mda.t
@@ -261,7 +261,7 @@ Subject: this message will be trained as spam
 Date: Thu, 01 Jan 1970 00:00:00 +0000
 
 EOF
-	xsys(qw(git config --file), $pi_config, "$cfgpfx.listid", $list_id);
+	xsys(qw(git config --file), $pi_config, "$cfgpfx.listid", uc $list_id);
 	$? == 0 or die "failed to set listid $?";
 	ok(run_script(['-mda'], undef, { 0 => \$in }),
 		'mda OK with List-Id match');
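
For illustration only, here is a minimal standalone sketch of the
lookup behavior the patch is after: lowercase the configured listid
values when building the table, lowercase the header value before
looking it up.  The helper names (build_list_id_map, extract_list_id,
lookup_inbox) and the "radiotap" inbox name are hypothetical and are
not public-inbox code.  The sketch assumes the list-id sits inside
angle brackets at the end of the header value, optionally preceded by
a phrase; the non-compliant bare form is deliberately left unmatched
since that question is still open.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of public-inbox: build a lookup table
# keyed by lowercased list-id, mirroring the lc() added in the patch.
sub build_list_id_map {
	my (%configured) = @_; # configured listid => inbox name
	my %by_list_id;
	$by_list_id{lc $_} = $configured{$_} for keys %configured;
	\%by_list_id;
}

# Hypothetical helper: pull the list-id out of a raw List-Id header
# value.  Assumes the id is in angle brackets at the end of the value,
# optionally preceded by a phrase; returns undef when no brackets.
sub extract_list_id {
	my ($hdr) = @_;
	$hdr =~ /<([^<>]+)>\s*\z/ ? lc $1 : undef;
}

# Case-insensitive lookup: both sides are lowercased before comparing.
sub lookup_inbox {
	my ($map, $hdr) = @_;
	my $id = extract_list_id($hdr);
	defined($id) ? $map->{$id} : undef;
}

# The config value uses capitalization, as in the original report.
my $map = build_list_id_map('radiotap.NetBSD.org' => 'radiotap');
for my $hdr ('<radiotap.netbsd.org>',
		'Radiotap <RADIOTAP.NetBSD.org>',
		'radiotap.NetBSD.org') {
	my $ibx = lookup_inbox($map, $hdr);
	printf "%-32s => %s\n", $hdr, $ibx // '(no match)';
}
```

The first two header values resolve to the same inbox regardless of
case; the last one (no angle brackets) does not match in this sketch.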