From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 2E9041F5AD; Sun, 12 Apr 2020 21:59:33 +0000 (UTC) Date: Sun, 12 Apr 2020 21:59:33 +0000 From: Eric Wong To: Kyle Meyer Cc: meta@public-inbox.org Subject: Re: [RFC PATCH] watchmaildir: support multiple watchheader values Message-ID: <20200412215933.GA63705@dcvr> References: <20200412044452.7524-1-kyle@kyleam.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200412044452.7524-1-kyle@kyleam.com> List-Id: Kyle Meyer wrote: > The watchheader key supports only a single value. In discussion [1] > of 8d3e3bd8 (doc: explain publicinbox..watchheader, 2019-10-09), > two concerns were raised about supporting multiple watchheader values: > > * doing so would invite confusion about whether the acceptance > condition is "all" or "any" > > * it's not clear there's a need > > The second point seems to be the crucial one because, provided there > is a need, the first can be addressed with documentation (as already > done for listid, which accepts the message if there are any matches). > > Here is one scenario where support for matching multiple headers would > be convenient. Someone wants to set up public-inbox archives for some > small projects but does _not_ want to run mailing lists for them, > instead allowing others to follow the project by any of the pull > mechanisms. Using a common underlying address, an address alias for > each project is set up via a third-party email provider, with messages > for each alias being exposed as a separate public-inbox archive. Yeah. This seems like it could be useful for organizing personal mail. > In this scenario, if all messages are downloaded to a common maildir, > the problem with using watchheader to match the project address is > that the address may be in either the To: or Cc: header. One way to > avoid this issue would be to filter things upstream so that messages > for different projects got to different maildirs, ideally in a way > that duplicates any cross-posted messages in each maildir. > > Teach watchheader to support multiple values, avoiding the need to > direct message with no List-ID to separate maildirs. > > [1] https://public-inbox.org/meta/20191010085118.r3amey4cayazfycb@dcvr/ > --- > > I'm not entirely convinced that the above provides a compelling > reason to make this change, but I think it's worth considering if it > makes it easier for some to set up archives for their projects > without setting up mailings lists. Thoughts? Seems reasonable for the 1.5 release (hoping to have 1.4 out soon). I think the "any" match is reasonable considering it's analogous to the behavior of the current `address' matching. > (If this ends up being a direction we decide to go, I'd be happy to > work on adding tests to this patch.) Yes, thanks in advance :) > diff --git a/Documentation/public-inbox-config.pod b/Documentation/public-inbox-config.pod > index 53926ef4..f9871d4b 100644 > --- a/Documentation/public-inbox-config.pod > +++ b/Documentation/public-inbox-config.pod > @@ -88,7 +88,8 @@ Default: none; only for L users > watchheader = List-Id: > > If specified, L will only process mail matching > -the given header. Multiple values are not currently supported. > +the given header. If specified multiple times, mail will be processed > +if it matches any of the values. > > Default: none; only for L users > > diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm > index 917939ca..458f29b2 100644 > --- a/lib/PublicInbox/Config.pm > +++ b/lib/PublicInbox/Config.pm > @@ -367,7 +367,7 @@ sub _fill { > my $ibx = {}; > > foreach my $k (qw(inboxdir filter newsgroup > - watch watchheader httpbackendmax > + watch httpbackendmax > replyto feedmax nntpserver indexlevel)) { > my $v = $self->{"$pfx.$k"}; > $ibx->{$k} = $v if defined $v; > @@ -388,7 +388,7 @@ sub _fill { > # TODO: more arrays, we should support multi-value for > # more things to encourage decentralization > foreach my $k (qw(address altid nntpmirror coderepo hide listid url > - infourl)) { > + infourl watchheader)) { > if (defined(my $v = $self->{"$pfx.$k"})) { > $ibx->{$k} = _array($v); > } > diff --git a/lib/PublicInbox/WatchMaildir.pm b/lib/PublicInbox/WatchMaildir.pm > index e2024640..bdaacf62 100644 > --- a/lib/PublicInbox/WatchMaildir.pm > +++ b/lib/PublicInbox/WatchMaildir.pm > @@ -59,9 +59,11 @@ sub new { > my $watch = $ibx->{watch} or return; > if (is_maildir($watch)) { > my $watch_hdrs = []; > - if (my $wh = $ibx->{watchheader}) { > - my ($k, $v) = split(/:/, $wh, 2); > - push @$watch_hdrs, [ $k, qr/\Q$v\E/ ]; > + if (my $whs = $ibx->{watchheader}) { > + for (@$whs) { > + my ($k, $v) = split(/:/, $_, 2); > + push @$watch_hdrs, [ $k, qr/\Q$v\E/ ]; > + } > } > if (my $list_ids = $ibx->{listid}) { > for (@$list_ids) { The change itself looks fine to my somewhat sleepy eyes.