user/dev discussion of public-inbox itself
 help / color / Atom feed
* [RFC][PATCH] Config.pm: Add support for mailing list information
@ 2019-05-17  1:45 ebiederm
  2019-05-18  7:15 ` Eric Wong
  0 siblings, 1 reply; 4+ messages in thread
From: ebiederm @ 2019-05-17  1:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta


The world has turned since I first started following mailing lists and
to my surprise every mailling list that I am subscribed to properly
sets the "List-ID:" mailing list header.  So instead of doing
something clever and flexible I am adding support for looking up
public inbox mailing lists by their mailing list name.

That makes the work needed for each email trivial and easy to understand.
- Parse the "List-ID:" header.
- Lookup in the configuration which mailbox is connected to that
  "List-ID:"
- Deliver the mail to that mailbox.

To that end this change enhances PublicInbox to have an additional
mailbox configuration parameter "list" that holds the mailling list
name.

A method is added to the PublicInbox config object called
lookup_list that given a mailing list name will return
the PublicInbox in the configarion that is configured to
handle that mailing list.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

Some context.  I have my mailing list email coming in via imap, and I
have a script that looks at List-ID and delivers them to the appropriate
public-inbox.  I am hoping to get my script at least into the PublicInbox
scripts directory.

I have looked and see that you a watchheader directive that effectively
does the same thing that I am doing.  Even so, I think a mailing list
having one or more mailling list names is more fundamental.  A mailing
name is a well supported concept and it differs from an address
in that a mailing list name does not have an "@".

So as part of getting my small changes to your PublicInbox code
to support my email fetching script.  Do you mind the addition
of a mailing list name configuration parameter?

If you don't mind can we apply the patch below?  Do I need to tweak
public-inbox-watch to be able to use the list parameter?  I can't
really test it as my setup is quite different but I am willing to
work things through so that the code is harmonized and people
can benefit from each others improvements.

 lib/PublicInbox/Config.pm | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index 09f9179b085a..70bd88f51556 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -25,6 +25,7 @@ sub new {
 
 	# caches
 	$self->{-by_addr} ||= {};
+	$self->{-by_list} ||= {};
 	$self->{-by_name} ||= {};
 	$self->{-by_newsgroup} ||= {};
 	$self->{-no_obfuscate} ||= {};
@@ -84,6 +85,32 @@ sub lookup {
 	_fill($self, $pfx);
 }
 
+sub lookup_list {
+	my ($self, $list) = @_;
+	my $inbox = $self->{-by_list}->{$list};
+	return $inbox if $inbox;
+
+	my $pfx;
+
+	foreach my $k (keys %$self) {
+		$k =~ /\A(publicinbox\.[\w-]+)\.list\z/ or next;
+		my $v = $self->{$k};
+		if (ref($v) eq "ARRAY") {
+			foreach my $alias (@$v) {
+				($alias eq $list) or next;
+				$pfx = $1;
+				last;
+			}
+		} else {
+			($v eq $list) or next;
+			$pfx = $1;
+			last;
+		}
+	}
+	defined $pfx or return;
+	_fill($self, $pfx);
+}
+
 sub lookup_name ($$) {
 	my ($self, $name) = @_;
 	$self->{-by_name}->{$name} || _fill($self, "publicinbox.$name");
@@ -389,7 +416,7 @@ sub _fill {
 	}
 	# TODO: more arrays, we should support multi-value for
 	# more things to encourage decentralization
-	foreach my $k (qw(address altid nntpmirror coderepo hide)) {
+	foreach my $k (qw(address altid nntpmirror coderepo hide list)) {
 		if (defined(my $v = $self->{"$pfx.$k"})) {
 			$ibx->{$k} = _array($v);
 		}
@@ -412,6 +439,10 @@ sub _fill {
 		$self->{-by_addr}->{$lc_addr} = $ibx;
 		$self->{-no_obfuscate}->{$lc_addr} = 1;
 	}
+	foreach (@{$ibx->{list}}) {
+		my $list = $_;
+		$self->{-by_list}->{$list} = $ibx;
+	}
 	if (my $ng = $ibx->{newsgroup}) {
 		$self->{-by_newsgroup}->{$ng} = $ibx;
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC][PATCH] Config.pm: Add support for mailing list information
  2019-05-17  1:45 [RFC][PATCH] Config.pm: Add support for mailing list information ebiederm
@ 2019-05-18  7:15 ` Eric Wong
  2019-05-18 15:22   ` ebiederm
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Wong @ 2019-05-18  7:15 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: meta

"Eric W. Biederman" <ebiederm@xmission.com> wrote:
> 
> The world has turned since I first started following mailing lists and
> to my surprise every mailling list that I am subscribed to properly
> sets the "List-ID:" mailing list header.  So instead of doing
> something clever and flexible I am adding support for looking up
> public inbox mailing lists by their mailing list name.

Agreed, and good to know List-Id is getting standardized on.

> That makes the work needed for each email trivial and easy to understand.
> - Parse the "List-ID:" header.
> - Lookup in the configuration which mailbox is connected to that
>   "List-ID:"
> - Deliver the mail to that mailbox.

This is a great idea.  One great side-effect would be avoiding
the the need to expose the mirror delivery address for server
admins who use -mda for mirroring lists:

https://public-inbox.org/meta/20180612170546.GA5945@chatter/

> To that end this change enhances PublicInbox to have an additional
> mailbox configuration parameter "list" that holds the mailling list
> name.

I think using "listId" in configs would be preferable, as I
think it's clear that we're matching the RFC2919 header and
not any other thing like "X-Mailing-List".

We'll use camelCase for user docs, and only alllowercase for
parsing (consistent with git config naming).  I actually hate
that naming convention of git config keys, so "list_id" for
internal variables.

Spelling: s/mailling/mailing/g (you seem to spell it both ways
throughout the email)

Also, as a Perl user, the word "list" alone is also a bit
ambiguous because it tends to get used interchangeably with
"array"; and it could be confused in my brain as a throwaway
variable.

> A method is added to the PublicInbox config object called
> lookup_list that given a mailing list name will return
> the PublicInbox in the configarion that is configured to
> handle that mailing list.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> 
> Some context.  I have my mailing list email coming in via imap, and I
> have a script that looks at List-ID and delivers them to the appropriate
> public-inbox.  I am hoping to get my script at least into the PublicInbox
> scripts directory.
> 
> I have looked and see that you a watchheader directive that effectively
> does the same thing that I am doing.  Even so, I think a mailing list
> having one or more mailling list names is more fundamental.  A mailing
> name is a well supported concept and it differs from an address
> in that a mailing list name does not have an "@".

Yes, for -watch, watchheader is a more flexible variant of this.
listId could be used as a shortcut.

> So as part of getting my small changes to your PublicInbox code
> to support my email fetching script.  Do you mind the addition
> of a mailing list name configuration parameter?
> If you don't mind can we apply the patch below?

Don't mind at all, but I would like some comments in this
email addressed, along with some tests for -mda, at least.

> Do I need to tweak
> public-inbox-watch to be able to use the list parameter?

I think translating the listId directive into a watchheader
regexp will be sufficient.

> I can't
> really test it as my setup is quite different but I am willing to
> work things through so that the code is harmonized and people
> can benefit from each others improvements.

Looks like watchheader is also missing tests :x

>  lib/PublicInbox/Config.pm | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
> index 09f9179b085a..70bd88f51556 100644
> --- a/lib/PublicInbox/Config.pm
> +++ b/lib/PublicInbox/Config.pm
> @@ -25,6 +25,7 @@ sub new {
>  
>  	# caches
>  	$self->{-by_addr} ||= {};
> +	$self->{-by_list} ||= {};

-by_list_id

>  	$self->{-by_name} ||= {};
>  	$self->{-by_newsgroup} ||= {};
>  	$self->{-no_obfuscate} ||= {};
> @@ -84,6 +85,32 @@ sub lookup {
>  	_fill($self, $pfx);
>  }
>  
> +sub lookup_list {

lookup_list_id

> +	my ($self, $list) = @_;
> +	my $inbox = $self->{-by_list}->{$list};
> +	return $inbox if $inbox;

"my $ibx"

I've been starting to standardize on $ibx as the identifier for
PublicInbox::Inbox references (and maybe leave $inbox for
human-readable names).

> +	foreach my $k (keys %$self) {
> +		$k =~ /\A(publicinbox\.[\w-]+)\.list\z/ or next;

                                              \.listid\z/

> +		my $v = $self->{$k};
> +		if (ref($v) eq "ARRAY") {
> +			foreach my $alias (@$v) {
> +				($alias eq $list) or next;
> +				$pfx = $1;
> +				last;
> +			}
> +		} else {
> +			($v eq $list) or next;
> +			$pfx = $1;
> +			last;
> +		}

It looks these "eq" comparisons should be case-insensitive:

	https://tools.ietf.org/html/rfc2919#section-6

> +	}
> +	defined $pfx or return;
> +	_fill($self, $pfx);
> +}
> +
>  sub lookup_name ($$) {
>  	my ($self, $name) = @_;
>  	$self->{-by_name}->{$name} || _fill($self, "publicinbox.$name");
> @@ -389,7 +416,7 @@ sub _fill {
>  	}
>  	# TODO: more arrays, we should support multi-value for
>  	# more things to encourage decentralization
> -	foreach my $k (qw(address altid nntpmirror coderepo hide)) {
> +	foreach my $k (qw(address altid nntpmirror coderepo hide list)) {
>  		if (defined(my $v = $self->{"$pfx.$k"})) {
>  			$ibx->{$k} = _array($v);
>  		}
> @@ -412,6 +439,10 @@ sub _fill {
>  		$self->{-by_addr}->{$lc_addr} = $ibx;
>  		$self->{-no_obfuscate}->{$lc_addr} = 1;
>  	}
> +	foreach (@{$ibx->{list}}) {
> +		my $list = $_;
> +		$self->{-by_list}->{$list} = $ibx;
> +	}

No sense in "my $list = $_"; either "foreach my $list_id" for
larger loops, or use $_ for tiny loops.

Anyways, great idea, needs some minor tweaks.  Thanks!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC][PATCH] Config.pm: Add support for mailing list information
  2019-05-18  7:15 ` Eric Wong
@ 2019-05-18 15:22   ` ebiederm
  2019-06-13  0:45     ` Eric Wong
  0 siblings, 1 reply; 4+ messages in thread
From: ebiederm @ 2019-05-18 15:22 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong <e@80x24.org> writes:

> "Eric W. Biederman" <ebiederm@xmission.com> wrote:
>> 
>> The world has turned since I first started following mailing lists and
>> to my surprise every mailling list that I am subscribed to properly
>> sets the "List-ID:" mailing list header.  So instead of doing
>> something clever and flexible I am adding support for looking up
>> public inbox mailing lists by their mailing list name.
>
> Agreed, and good to know List-Id is getting standardized on.
>
>> That makes the work needed for each email trivial and easy to understand.
>> - Parse the "List-ID:" header.
>> - Lookup in the configuration which mailbox is connected to that
>>   "List-ID:"
>> - Deliver the mail to that mailbox.
>
> This is a great idea.  One great side-effect would be avoiding
> the the need to expose the mirror delivery address for server
> admins who use -mda for mirroring lists:
>
> https://public-inbox.org/meta/20180612170546.GA5945@chatter/


Yes. I see.  I can easily have it do that.  For email mirrors
this works very well.

>> To that end this change enhances PublicInbox to have an additional
>> mailbox configuration parameter "list" that holds the mailling list
>> name.
>
> I think using "listId" in configs would be preferable, as I
> think it's clear that we're matching the RFC2919 header and
> not any other thing like "X-Mailing-List".
>
> We'll use camelCase for user docs, and only alllowercase for
> parsing (consistent with git config naming).  I actually hate
> that naming convention of git config keys, so "list_id" for
> internal variables.
>
> Spelling: s/mailling/mailing/g (you seem to spell it both ways
> throughout the email)

Got it.  I think I am suffering from a bit of new Daddy syndrome.
I will clean that up before I resubmit.

> Also, as a Perl user, the word "list" alone is also a bit
> ambiguous because it tends to get used interchangeably with
> "array"; and it could be confused in my brain as a throwaway
> variable.

Makes perfect sense.

>> A method is added to the PublicInbox config object called
>> lookup_list that given a mailing list name will return
>> the PublicInbox in the configarion that is configured to
>> handle that mailing list.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>> 
>> Some context.  I have my mailing list email coming in via imap, and I
>> have a script that looks at List-ID and delivers them to the appropriate
>> public-inbox.  I am hoping to get my script at least into the PublicInbox
>> scripts directory.
>> 
>> I have looked and see that you a watchheader directive that effectively
>> does the same thing that I am doing.  Even so, I think a mailing list
>> having one or more mailling list names is more fundamental.  A mailing
>> name is a well supported concept and it differs from an address
>> in that a mailing list name does not have an "@".
>
> Yes, for -watch, watchheader is a more flexible variant of this.
> listId could be used as a shortcut.

Yes.  I will se what I can arrange for updating -watch and -mda.

>> So as part of getting my small changes to your PublicInbox code
>> to support my email fetching script.  Do you mind the addition
>> of a mailing list name configuration parameter?
>> If you don't mind can we apply the patch below?
>
> Don't mind at all, but I would like some comments in this
> email addressed, along with some tests for -mda, at least.

Noted I will aim at getting that done.

>> Do I need to tweak
>> public-inbox-watch to be able to use the list parameter?
>
> I think translating the listId directive into a watchheader
> regexp will be sufficient.

Yes.  That should be straight forward.  I will do that.

>> I can't
>> really test it as my setup is quite different but I am willing to
>> work things through so that the code is harmonized and people
>> can benefit from each others improvements.
>
> Looks like watchheader is also missing tests :x
>
>>  lib/PublicInbox/Config.pm | 33 ++++++++++++++++++++++++++++++++-
>>  1 file changed, 32 insertions(+), 1 deletion(-)
>> 
>> diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
>> index 09f9179b085a..70bd88f51556 100644
>> --- a/lib/PublicInbox/Config.pm
>> +++ b/lib/PublicInbox/Config.pm
>> @@ -25,6 +25,7 @@ sub new {
>>  
>>  	# caches
>>  	$self->{-by_addr} ||= {};
>> +	$self->{-by_list} ||= {};
>
> -by_list_id

ack

>>  	$self->{-by_name} ||= {};
>>  	$self->{-by_newsgroup} ||= {};
>>  	$self->{-no_obfuscate} ||= {};
>> @@ -84,6 +85,32 @@ sub lookup {
>>  	_fill($self, $pfx);
>>  }
>>  
>> +sub lookup_list {
>
> lookup_list_id

ack

>> +	my ($self, $list) = @_;
>> +	my $inbox = $self->{-by_list}->{$list};
>> +	return $inbox if $inbox;
>
> "my $ibx"
>
> I've been starting to standardize on $ibx as the identifier for
> PublicInbox::Inbox references (and maybe leave $inbox for
> human-readable names).
>

ack.

>> +	foreach my $k (keys %$self) {
>> +		$k =~ /\A(publicinbox\.[\w-]+)\.list\z/ or next;
>
>                                               \.listid\z/
ack

>> +		my $v = $self->{$k};
>> +		if (ref($v) eq "ARRAY") {
>> +			foreach my $alias (@$v) {
>> +				($alias eq $list) or next;
>> +				$pfx = $1;
>> +				last;
>> +			}
>> +		} else {
>> +			($v eq $list) or next;
>> +			$pfx = $1;
>> +			last;
>> +		}
>
> It looks these "eq" comparisons should be case-insensitive:
>
> 	https://tools.ietf.org/html/rfc2919#section-6

Interesting.  I have yet to see it matter.  But it isn't hard
to change the code to do:
		(lc($v) eq lc($list).

>> +	}
>> +	defined $pfx or return;
>> +	_fill($self, $pfx);
>> +}
>> +
>>  sub lookup_name ($$) {
>>  	my ($self, $name) = @_;
>>  	$self->{-by_name}->{$name} || _fill($self, "publicinbox.$name");
>> @@ -389,7 +416,7 @@ sub _fill {
>>  	}
>>  	# TODO: more arrays, we should support multi-value for
>>  	# more things to encourage decentralization
>> -	foreach my $k (qw(address altid nntpmirror coderepo hide)) {
>> +	foreach my $k (qw(address altid nntpmirror coderepo hide list)) {
>>  		if (defined(my $v = $self->{"$pfx.$k"})) {
>>  			$ibx->{$k} = _array($v);
>>  		}
>> @@ -412,6 +439,10 @@ sub _fill {
>>  		$self->{-by_addr}->{$lc_addr} = $ibx;
>>  		$self->{-no_obfuscate}->{$lc_addr} = 1;
>>  	}
>> +	foreach (@{$ibx->{list}}) {
>> +		my $list = $_;
>> +		$self->{-by_list}->{$list} = $ibx;
>> +	}
>
> No sense in "my $list = $_"; either "foreach my $list_id" for
> larger loops, or use $_ for tiny loops.
>
> Anyways, great idea, needs some minor tweaks.  Thanks!

Will make them and get this posted again.

Eric

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC][PATCH] Config.pm: Add support for mailing list information
  2019-05-18 15:22   ` ebiederm
@ 2019-06-13  0:45     ` Eric Wong
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2019-06-13  0:45 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: meta

"Eric W. Biederman" <ebiederm@xmission.com> wrote:
> Will make them and get this posted again.

Hey, just wondering where you are with this List-Id work.

I'm planning to make some improvements to the -watch code in
a few days (I hope).  I'd probably use this feature myself,
so I can take it over and let you focus on other stuff :)

Thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-17  1:45 [RFC][PATCH] Config.pm: Add support for mailing list information ebiederm
2019-05-18  7:15 ` Eric Wong
2019-05-18 15:22   ` ebiederm
2019-06-13  0:45     ` Eric Wong

user/dev discussion of public-inbox itself

Archives are clonable:
	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.org/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox