user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
From: Eric Wong <e@80x24.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: meta@public-inbox.org
Subject: Re: [RFC][PATCH] Config.pm: Add support for mailing list information
Date: Sat, 18 May 2019 07:15:18 +0000	[thread overview]
Message-ID: <20190518071518.fqcwveehh726wxdu@dcvr> (raw)
In-Reply-To: <877eap26rx.fsf@xmission.com>

"Eric W. Biederman" <ebiederm@xmission.com> wrote:
> 
> The world has turned since I first started following mailing lists and
> to my surprise every mailling list that I am subscribed to properly
> sets the "List-ID:" mailing list header.  So instead of doing
> something clever and flexible I am adding support for looking up
> public inbox mailing lists by their mailing list name.

Agreed, and good to know List-Id is getting standardized on.

> That makes the work needed for each email trivial and easy to understand.
> - Parse the "List-ID:" header.
> - Lookup in the configuration which mailbox is connected to that
>   "List-ID:"
> - Deliver the mail to that mailbox.

This is a great idea.  One great side-effect would be avoiding
the the need to expose the mirror delivery address for server
admins who use -mda for mirroring lists:

https://public-inbox.org/meta/20180612170546.GA5945@chatter/

> To that end this change enhances PublicInbox to have an additional
> mailbox configuration parameter "list" that holds the mailling list
> name.

I think using "listId" in configs would be preferable, as I
think it's clear that we're matching the RFC2919 header and
not any other thing like "X-Mailing-List".

We'll use camelCase for user docs, and only alllowercase for
parsing (consistent with git config naming).  I actually hate
that naming convention of git config keys, so "list_id" for
internal variables.

Spelling: s/mailling/mailing/g (you seem to spell it both ways
throughout the email)

Also, as a Perl user, the word "list" alone is also a bit
ambiguous because it tends to get used interchangeably with
"array"; and it could be confused in my brain as a throwaway
variable.

> A method is added to the PublicInbox config object called
> lookup_list that given a mailing list name will return
> the PublicInbox in the configarion that is configured to
> handle that mailing list.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> 
> Some context.  I have my mailing list email coming in via imap, and I
> have a script that looks at List-ID and delivers them to the appropriate
> public-inbox.  I am hoping to get my script at least into the PublicInbox
> scripts directory.
> 
> I have looked and see that you a watchheader directive that effectively
> does the same thing that I am doing.  Even so, I think a mailing list
> having one or more mailling list names is more fundamental.  A mailing
> name is a well supported concept and it differs from an address
> in that a mailing list name does not have an "@".

Yes, for -watch, watchheader is a more flexible variant of this.
listId could be used as a shortcut.

> So as part of getting my small changes to your PublicInbox code
> to support my email fetching script.  Do you mind the addition
> of a mailing list name configuration parameter?
> If you don't mind can we apply the patch below?

Don't mind at all, but I would like some comments in this
email addressed, along with some tests for -mda, at least.

> Do I need to tweak
> public-inbox-watch to be able to use the list parameter?

I think translating the listId directive into a watchheader
regexp will be sufficient.

> I can't
> really test it as my setup is quite different but I am willing to
> work things through so that the code is harmonized and people
> can benefit from each others improvements.

Looks like watchheader is also missing tests :x

>  lib/PublicInbox/Config.pm | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
> index 09f9179b085a..70bd88f51556 100644
> --- a/lib/PublicInbox/Config.pm
> +++ b/lib/PublicInbox/Config.pm
> @@ -25,6 +25,7 @@ sub new {
>  
>  	# caches
>  	$self->{-by_addr} ||= {};
> +	$self->{-by_list} ||= {};

-by_list_id

>  	$self->{-by_name} ||= {};
>  	$self->{-by_newsgroup} ||= {};
>  	$self->{-no_obfuscate} ||= {};
> @@ -84,6 +85,32 @@ sub lookup {
>  	_fill($self, $pfx);
>  }
>  
> +sub lookup_list {

lookup_list_id

> +	my ($self, $list) = @_;
> +	my $inbox = $self->{-by_list}->{$list};
> +	return $inbox if $inbox;

"my $ibx"

I've been starting to standardize on $ibx as the identifier for
PublicInbox::Inbox references (and maybe leave $inbox for
human-readable names).

> +	foreach my $k (keys %$self) {
> +		$k =~ /\A(publicinbox\.[\w-]+)\.list\z/ or next;

                                              \.listid\z/

> +		my $v = $self->{$k};
> +		if (ref($v) eq "ARRAY") {
> +			foreach my $alias (@$v) {
> +				($alias eq $list) or next;
> +				$pfx = $1;
> +				last;
> +			}
> +		} else {
> +			($v eq $list) or next;
> +			$pfx = $1;
> +			last;
> +		}

It looks these "eq" comparisons should be case-insensitive:

	https://tools.ietf.org/html/rfc2919#section-6

> +	}
> +	defined $pfx or return;
> +	_fill($self, $pfx);
> +}
> +
>  sub lookup_name ($$) {
>  	my ($self, $name) = @_;
>  	$self->{-by_name}->{$name} || _fill($self, "publicinbox.$name");
> @@ -389,7 +416,7 @@ sub _fill {
>  	}
>  	# TODO: more arrays, we should support multi-value for
>  	# more things to encourage decentralization
> -	foreach my $k (qw(address altid nntpmirror coderepo hide)) {
> +	foreach my $k (qw(address altid nntpmirror coderepo hide list)) {
>  		if (defined(my $v = $self->{"$pfx.$k"})) {
>  			$ibx->{$k} = _array($v);
>  		}
> @@ -412,6 +439,10 @@ sub _fill {
>  		$self->{-by_addr}->{$lc_addr} = $ibx;
>  		$self->{-no_obfuscate}->{$lc_addr} = 1;
>  	}
> +	foreach (@{$ibx->{list}}) {
> +		my $list = $_;
> +		$self->{-by_list}->{$list} = $ibx;
> +	}

No sense in "my $list = $_"; either "foreach my $list_id" for
larger loops, or use $_ for tiny loops.

Anyways, great idea, needs some minor tweaks.  Thanks!

  reply	other threads:[~2019-05-18  7:15 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-17  1:45 [RFC][PATCH] Config.pm: Add support for mailing list information Eric W. Biederman
2019-05-18  7:15 ` Eric Wong [this message]
2019-05-18 15:22   ` Eric W. Biederman
2019-06-13  0:45     ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190518071518.fqcwveehh726wxdu@dcvr \
    --to=e@80x24.org \
    --cc=ebiederm@xmission.com \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).