public-inbox-init with minimal info

user/dev discussion of public-inbox itself

* public-inbox-init with minimal info
@ 2019-10-03 11:16 Alyssa Ross
  2019-10-04  2:45 ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Alyssa Ross @ 2019-10-03 11:16 UTC
  To: meta

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

In NixOS, the best way for us to provide a public-inbox module would be
to generate the configuration file ahead of time, and then initialize
inboxes that don't already exist at some reasonable time during boot.

public-inbox-init tries to write a config file in addition to
initializing an inbox.  My initial idea was to just eschew
public-inbox-init for doing git init --bare myself, which works great
for V1 repositories, but I'd really like to be generating V2 ones.

Since the V2 initialization isn't encapsulated in one easy command, I'm
wondering what the best way to accomplish initialization without writing
a config file or asking for unnecessary information is.  I could just
run public-inbox-init with PI_CONFIG=/dev/null, but then it's still not
clear to me what information about the mailbox the script requires to be
able to initialize the mailbox.  Looking at the code, I see that at
least the primary address is passed to PublicInbox::Inbox, but I'm not
sure what that would actually be used for inside the inbox.

So, what would the best thing for me to do here be?  To summarise, I'd
like to generate V2 inboxes while providing as little information about
the inbox as possible, and without writing a config file.
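A minimal sketch of the "initialize inboxes that don't already exist" step
described above, in Python.  It only builds the commands and does not run
them.  Assumptions: the -V2 flag and the NAME, INBOX_DIR, HTTP_URL, ADDRESS
argument order are taken from public-inbox-init(1) as I understand it;
treating "directory exists" as "already initialized" is a simplification;
and whether PI_CONFIG=/dev/null is acceptable to public-inbox-init is
exactly the open question in this message.  The function name and the dict
layout are hypothetical.

```python
import os


def init_commands(inboxes, exists=os.path.isdir):
    """Return (env, argv) pairs for inboxes whose directory is missing.

    `inboxes` maps an inbox name to a dict with "dir", "url" and
    "address" keys (a hypothetical layout, for illustration only).
    """
    cmds = []
    for name, box in sorted(inboxes.items()):
        if exists(box["dir"]):
            continue  # assumed to be initialized already; leave it alone
        argv = [
            "public-inbox-init", "-V2",
            name, box["dir"], box["url"], box["address"],
        ]
        # Assumption: pointing PI_CONFIG at /dev/null keeps the real,
        # pre-generated config file untouched.
        cmds.append(({"PI_CONFIG": "/dev/null"}, argv))
    return cmds
```

Each returned pair could then be handed to subprocess.run(argv, env=...)
from a boot-time or ExecStartPre hook, after merging the extra variable
into the existing environment.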

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]


* Re: public-inbox-init with minimal info
  2019-10-03 11:16 public-inbox-init with minimal info Alyssa Ross
@ 2019-10-04  2:45 ` Eric Wong
  2019-10-04 11:18   ` Alyssa Ross
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2019-10-04  2:45 UTC
  To: Alyssa Ross; +Cc: meta

Alyssa Ross <hi@alyssa.is> wrote:
> In NixOS, the best way for us to provide a public-inbox module would be
> to generate the configuration file ahead of time, and then initialize
> inboxes that don't already exist at some reasonable time during boot.
> 
> public-inbox-init tries to write a config file in addition to
> initializing an inbox.  My initial idea was to just eschew
> public-inbox-init for doing git init --bare myself, which works great
> for V1 repositories, but I'd really like to be generating V2 ones.
> 
> Since the V2 initialization isn't encapsulated in one easy command, I'm
> wondering what the best way to accomplish initialization without writing
> a config file or asking for unnecessary information is.  I could just
> run public-inbox-init with PI_CONFIG=/dev/null, but then it's still not
> clear to me what information about the mailbox the script requires to be
> able to initialize the mailbox.  Looking at the code, I see that at
> least the primary address is passed to PublicInbox::Inbox, but I'm not
> sure what that would actually be used for inside the inbox.

That address is used for making commits using the
public-inbox-{learn,edit,purge} commands, and also matching
with public-inbox-mda for delivery.

> So, what would the best thing for me to do here be?  To summarise, I'd
> like to generate V2 inboxes while providing as little information about
> the inbox as possible, and without writing a config file.

You want to provide that inbox during the post-install?

I'm not sure if I understand what's going on (but it's been a
long day :x).  I'm not sure if providing an inbox on package
installation is necessary or even a good thing...

Using git itself as an analogy: I wouldn't expect installing git
to auto-generate an empty git repo for me.  So same with
public-inbox, and stuff like cgit/gitweb...

Perhaps examples/public-inbox-config could add some newish
features such as nntpserver and css support and be packaged
up for users, though:

diff --git a/examples/public-inbox-config b/examples/public-inbox-config
index 7fcbe0ba..a6785a7c 100644
--- a/examples/public-inbox-config
+++ b/examples/public-inbox-config
@@ -1,5 +1,13 @@
 # this usually in ~/.public-inbox/config and parseable with git-config(1)
 # update t/config.t if changing this, that test relies on this
+[publicinbox]
+	nntpserver = news.example.com
+	css = /path/to/share/public-inbox/216dark.css media=screen
+	css = /path/to/share/public-inbox/216light.css media=print
+	css = /path/to/share/public-inbox/216light.css \
+		media='screen AND (prefers-color-scheme:light)'
+[publicinboxwatch]
+	spamcheck = spamc
 [publicinbox "test"]
 	address = try@public-inbox.org
 	address = sandbox@public-inbox.org


* Re: public-inbox-init with minimal info
  2019-10-04  2:45 ` Eric Wong
@ 2019-10-04 11:18   ` Alyssa Ross
  2019-10-05  5:14     ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Alyssa Ross @ 2019-10-04 11:18 UTC
  To: Eric Wong; +Cc: meta

[-- Attachment #1: Type: text/plain, Size: 3318 bytes --]

>> Since the V2 initialization isn't encapsulated in one easy command, I'm
>> wondering what the best way to accomplish initialization without writing
>> a config file or asking for unnecessary information is.  I could just
>> run public-inbox-init with PI_CONFIG=/dev/null, but then it's still not
>> clear to me what information about the mailbox the script requires to be
>> able to initialize the mailbox.  Looking at the code, I see that at
>> least the primary address is passed to PublicInbox::Inbox, but I'm not
>> sure what that would actually be used for inside the inbox.
>
> That address is used for making commits using the
> public-inbox-{learn,edit,purge} commands, and also matching
> with public-inbox-mda for delivery.

I'm guessing those read from the config file, though?  I'm trying to
figure out what configuration is stored in the inbox directory as
opposed to the config file.

> You want to provide that inbox during the post-install?
>
> I'm not sure if I understand what's going on (but it's been a
> long day :x).  I'm not sure if providing an inbox on package
> installation is necessary or even a good thing...
>
> Using git itself as an analogy: I wouldn't expect installing git
> to auto-generate an empty git repo for me.  So same with
> public-inbox, and stuff like cgit/gitweb...

I agree -- creating an inbox on package installation would be a terrible
idea, and is not what I'm proposing to do.

Some background: in the Nix ecosystem, we have a package repository,
Nixpkgs.  These packages are fairly typical except for unusual paths
(containing checksums of the "inputs" -- think dependencies -- of the
package) and the functional language they're written in.  We also have
NixOS, which is a GNU/Linux distribution, that uses those packages but
also adds the concept of "modules", written in the same language as the
packages.  These modules do things like configuring the users on your
system, setting up config files, etc.  The idea being that you can
generate a whole Linux system from pure (in the functional programming
sense) configuration, reproducibly, and have the system stuff be read
only at runtime to the maximum extent possible.  I'm working on both a
package _and_ a module for public-inbox.  The package will do exactly
what you'd expect a package to do, but the module will let you express
global and per-inbox configuration in the Nix language, generate a
read-only public-inbox config file and systemd units from those, and, at
boot time or configuration change time, initialize the inboxes defined
by the user.

As an example, Tor in NixOS looks like this:

    { ... }:

    {
      services.tor.enable = true;
      services.tor.hiddenServices = [
        {
          name = "public-inbox";
          map = [ { port = 80; destination = 8000; } ];
          version = 3;
        }
      ];
    }

This will generate a static tor.service and tor config file -- we do as
much as possible statically and purely because then we know that if a
configuration is rolled back, we can remove the service etc.  For state,
however, like the hidden service private key (in this case -- I could
have used a static one here if I'd wanted), it will be generated either
at boot time or in Tor's ExecStartPre.  This is the same mechanism I
would use to run public-inbox-init.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]


* Re: public-inbox-init with minimal info
  2019-10-04 11:18   ` Alyssa Ross
@ 2019-10-05  5:14     ` Eric Wong
  2019-10-05 13:05       ` Alyssa Ross
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2019-10-05  5:14 UTC
  To: Alyssa Ross; +Cc: meta

Alyssa Ross <hi@alyssa.is> wrote:
> >> Since the V2 initialization isn't encapsulated in one easy command, I'm
> >> wondering what the best way to accomplish initialization without writing
> >> a config file or asking for unnecessary information is.  I could just
> >> run public-inbox-init with PI_CONFIG=/dev/null, but then it's still not
> >> clear to me what information about the mailbox the script requires to be
> >> able to initialize the mailbox.  Looking at the code, I see that at
> >> least the primary address is passed to PublicInbox::Inbox, but I'm not
> >> sure what that would actually be used for inside the inbox.
> >
> > That address is used for making commits using the
> > public-inbox-{learn,edit,purge} commands, and also matching
> > with public-inbox-mda for delivery.
> 
> I'm guessing those read from the config file, though?  I'm trying to
> figure out what configuration is stored in the inbox directory as
> opposed to the config file.

Yes, they all read from the config file because they need to
operate on inboxes.  public-inbox-{edit,purge,index,xcpdb,compact}
can also operate on inboxes outside of the config file, but -mda
and -learn need the config.

There isn't any config stored in the inbox directories
themselves, although there's some metadata in SQLite for
incremental indexing and indexlevel for Xapian.

> > You want to provide that inbox during the post-install?
> >
> > I'm not sure if I understand what's going on (but it's been a
> > long day :x).  I'm not sure if providing an inbox on package
> > installation is necessary or even a good thing...
> >
> > Using git itself as an analogy: I wouldn't expect installing git
> > to auto-generate an empty git repo for me.  So same with
> > public-inbox, and stuff like cgit/gitweb...
> 
> I agree -- creating an inbox on package installation would be a terrible
> idea, and is not what I'm proposing to do.

Good to know :>

> Some background: in the Nix ecosystem, we have a package repository,
> Nixpkgs.  These packages are fairly typical except for unusual paths
> (containing checksums of the "inputs" -- think dependencies -- of the
> package) and the functional language they're written in.  We also have
> NixOS, which is a GNU/Linux distribution, that uses those packages but
> also adds the concept of "modules", written in the same language as the
> packages.  These modules do things like configuring the users on your
> system, setting up config files, etc.  The idea being that you can
> generate a whole Linux system from pure (in the functional programming
> sense) configuration, reproducibly, and have the system stuff be read
> only at runtime to the maximum extent possible.  I'm working on both a
> package _and_ a module for public-inbox.  The package will do exactly
> what you'd expect a package to do, but the module will let you express
> global and per-inbox configuration in the Nix language, generate a
> read-only public-inbox config file and systemd units from those, and, at
> boot time or configuration change time, initialize the inboxes defined
> by the user.

Thanks for that clarification, NixOS modules are a new concept
to me.

I'm not sure how the public-inbox config file can/should remain
read-only.  It's analogous to a config file for cgit or gitweb,
so maybe modules for those can offer some inspiration...

> As an example, Tor in NixOS looks like this:
> 
>     { ... }:
> 
>     {
>       services.tor.enable = true;
>       services.tor.hiddenServices = [
>         {
>           name = "public-inbox";
>           map = [ { port = 80; destination = 8000; } ];
>           version = 3;
>         }
>       ];
>     }
> 
> This will generate a static tor.service and tor config file -- we do as
> much as possible statically and purely because then we know that if a
> configuration is rolled back, we can remove the service etc.  For state,
> however, like the hidden service private key (in this case -- I could
> have used a static one here if I'd wanted), it will be generated either
> at boot time or in Tor's ExecStartPre.  This is the same mechanism I
> would use to run public-inbox-init.

So that means a new tor process is spawned for every hidden
service?  (instead of one tor process running multiple
services).  It works, but it's not memory-efficient (but
could be more secure if tor has bugs).

Like gitweb and cgit, public-inbox-{httpd,nntpd} usually expects
to serve multiple inboxes off one instance.  It's possible to
start an independent -httpd/-nntpd instance for every inbox,
each with its own config, but that's not efficient at all.

Both the WWW and NNTP code are able to scan Message-IDs across
multiple inboxes in the same process in case of cross-posts
between related inboxes.  At some point, I think it'll be useful
to define groups/relationships between inboxes for the WWW UI.

Likewise, public-inbox-watch would be expected to watch multiple
Maildirs and write to multiple inboxes, if needed.  Maybe
modules for mail downloading/notification tools such as
offlineimap/mbsync(isync) could serve as inspiration, there.

I tried to keep most of the daemon-specific config knobs in the
command-line, so it goes into .service files, which can be
read-only (well, unless somebody wants to change worker
processes via "-W $NUM").

But yeah, packaging services/modules for different systems and
use-cases is hard, everybody seems to do it differently and
want different things.  So I'm really not sure how packaging
a module would work, it seems like that's something each user
would want to manage on their own.


* Re: public-inbox-init with minimal info
  2019-10-05  5:14     ` Eric Wong
@ 2019-10-05 13:05       ` Alyssa Ross
  2019-10-05 19:58         ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Alyssa Ross @ 2019-10-05 13:05 UTC
  To: Eric Wong; +Cc: meta

[-- Attachment #1: Type: text/plain, Size: 2251 bytes --]

> There isn't any config stored in the inbox directories
> themselves, although there's some metadata in SQLite for
> incremental indexing and indexlevel for Xapian.

Cool, okay, I think this is what I wanted to know.  I'll generate an
inbox and see what's in sqlite, and omit anything that isn't used there.

> I'm not sure how the public-inbox config file can/should remain
> read-only.  It's analogous to a config file for cgit or gitweb,
> so maybe modules for those can offer some inspiration...

I've been using cgit fine with a readonly config for ages, fwiw.

>> As an example, Tor in NixOS looks like this:
>> 
>>     { ... }:
>> 
>>     {
>>       services.tor.enable = true;
>>       services.tor.hiddenServices = [
>>         {
>>           name = "public-inbox";
>>           map = [ { port = 80; destination = 8000; } ];
>>           version = 3;
>>         }
>>       ];
>>     }
>> 
>> This will generate a static tor.service and tor config file -- we do as
>> much as possible statically and purely because then we know that if a
>> configuration is rolled back, we can remove the service etc.  For state,
>> however, like the hidden service private key (in this case -- I could
>> have used a static one here if I'd wanted), it will be generated either
>> at boot time or in Tor's ExecStartPre.  This is the same mechanism I
>> would use to run public-inbox-init.
>
> So that means a new tor process is spawned for every hidden
> service?  (instead of one tor process running multiple
> services).  It works, but it's not memory-efficient (but
> could be more secure if tor has bugs).

No.  If I had added multiple hidden services above, Nix would still
generate one config file, one systemd service, etc.

> But yeah, packaging services/modules for different systems and
> use-cases is hard, everybody seems to do it differently and
> want different things.  So I'm really not sure how packaging
> a module would work, it seems like that's something each user
> would want to manage on their own.

Well, the idea is to provide sufficient configuration options that it
should be sufficient for almost everybody's needs.  Tor has way more
options than I showed above, for example.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]


* Re: public-inbox-init with minimal info
  2019-10-05 13:05       ` Alyssa Ross
@ 2019-10-05 19:58         ` Eric Wong
  2019-10-06  9:52           ` Alyssa Ross
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2019-10-05 19:58 UTC
  To: Alyssa Ross; +Cc: meta

Alyssa Ross <hi@alyssa.is> wrote:
> > There isn't any config stored in the inbox directories
> > themselves, although there's some metadata in SQLite for
> > incremental indexing and indexlevel for Xapian.
> 
> Cool, okay, I think this is what I wanted to know.  I'll generate an
> inbox and see what's in sqlite, and omit anything that isn't used there.
> 
> > I'm not sure how the public-inbox config file can/should remain
> > read-only.  It's analogous to a config file for cgit or gitweb,
> > so maybe modules for those can offer some inspiration...
> 
> I've been using cgit fine with a readonly config for ages, fwiw.

Is that possible without scan-path/project-list options in cgitrc?
public-inbox has nothing analogous to those options, right now.

> >> As an example, Tor in NixOS looks like this:
> >> 
> >>     { ... }:
> >> 
> >>     {
> >>       services.tor.enable = true;
> >>       services.tor.hiddenServices = [
> >>         {
> >>           name = "public-inbox";
> >>           map = [ { port = 80; destination = 8000; } ];
> >>           version = 3;
> >>         }
> >>       ];
> >>     }
> >> 
> >> This will generate a static tor.service and tor config file -- we do as
> >> much as possible statically and purely because then we know that if a
> >> configuration is rolled back, we can remove the service etc.  For state,
> >> however, like the hidden service private key (in this case -- I could
> >> have used a static one here if I'd wanted), it will be generated either
> >> at boot time or in Tor's ExecStartPre.  This is the same mechanism I
> >> would use to run public-inbox-init.
> >
> > So that means a new tor process is spawned for every hidden
> > service?  (instead of one tor process running multiple
> > services).  It works, but it's not memory-efficient (but
> > could be more secure if tor has bugs).
> 
> No.  If I had added multiple hidden services above, Nix would still
> generate one config file, one systemd service, etc.

one systemd service == one tor process, right?  I haven't looked
too deeply into systemd, though, so maybe there's some way to
add services to an existing tor process...

> > But yeah, packaging services/modules for different systems and
> > use-cases is hard, everybody seems to do it differently and
> > want different things.  So I'm really not sure how packaging
> > a module would work, it seems like that's something each user
> > would want to manage on their own.
> 
> Well, the idea is to provide sufficient configuration options that it
> should be sufficient for almost everybody's needs.  Tor has way more
> options than I showed above, for example.

I tried to make the defaults reasonable, so I don't think any
config is needed outside of what's required to map
inboxes/addresses to directories (which public-inbox-init does)


* Re: public-inbox-init with minimal info
  2019-10-05 19:58         ` Eric Wong
@ 2019-10-06  9:52           ` Alyssa Ross
  2019-10-06 12:01             ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Alyssa Ross @ 2019-10-06  9:52 UTC
  To: Eric Wong; +Cc: meta

[-- Attachment #1: Type: text/plain, Size: 2640 bytes --]

>> I've been using cgit fine with a readonly config for ages, fwiw.
>
> Is that possible without scan-path/project-list options in cgitrc?
> public-inbox has nothing analogous to those options, right now.

Yes, because I can generate my cgit configuration file with Nix.
There's no static config file or anything -- it's generated from the
configuration options provided by the user.

So, as a user of cgit in NixOS, I can define my repositories, and the
paths to them in Nix, and when I "rebuild"[1] my system the cgit
configuration file is generated based on that.

[1]: conceptually an update, but works by generating a new system
     configuration without referring to the current one, so it's more
     like rebuilding from scratch

> one systemd service == one tor process, right?  I haven't looked
> too deeply into systemd, though, so maybe there's some way to
> add services to an existing tor process...

At least in this case, yes, one systemd service == one tor process.  We
don't support more than one, AFAIK.  That would have to be done in a
container or something.

> I tried to make the defaults reasonable, so I don't think any
> config is needed outside of what's required to map
> inboxes/addresses to directories (which public-inbox-init does)

The only one I've really added so far that affects the public-inbox
config is whether to enable spam checking or not, but I suspect there
might be more.  There are also options for things like whether a service
should be generated to run public-inbox-httpd, etc.

Here's what my configuration looks like so far, using the module I'm
writing:

    services.public-inbox.enable = true;

    # Add spamassassin to the PATH of public-inbox-mda,
    # public-inbox-watch, etc.
    services.public-inbox.path = with pkgs; [ spamassassin ];

    services.public-inbox.mda.spamCheck = "spamc";
    services.public-inbox.http.mount = "/lists/archives/";
    services.public-inbox.inboxes = { [...] };

As you can see, it's in some ways like just writing a public-inbox
configuration file, but it can go beyond that too -- there can be
options like services.public-inbox.path that are more of a packaging concern but
that can be delegated to a user (by default, services on NixOS have
almost nothing in their PATH to ensure purity).  I'm probably not the
best person to explain why NixOS modules are good, or the benefits of
expressing all system configuration in a single functional programming
language, so suffice it to say that doing these things is the
fundamental goal of the distribution, and that it works extremely well.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]


* Re: public-inbox-init with minimal info
  2019-10-06  9:52           ` Alyssa Ross
@ 2019-10-06 12:01             ` Eric Wong
  2019-10-07 20:52               ` Alyssa Ross
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2019-10-06 12:01 UTC
  To: Alyssa Ross; +Cc: meta

Alyssa Ross <hi@alyssa.is> wrote:
> >> I've been using cgit fine with a readonly config for ages, fwiw.
> >
> > Is that possible without scan-path/project-list options in cgitrc?
> > public-inbox has nothing analogous to those options, right now.
> 
> Yes, because I can generate my cgit configuration file with Nix.
> There's no static config file or anything -- it's generated from the
> configuration options provided by the user.
> 
> So, as a user of cgit in NixOS, I can define my repositories, and the
> paths to them in Nix, and when I "rebuild"[1] my system the cgit
> configuration file is generated based on that.
> 
> [1]: conceptually an update, but works by generating a new system
>      configuration without referring to the current one, so it's more
>      like rebuilding from scratch

OK, it seems like you can "build" the full public-inbox config
file the same way you'd build your cgitrc.

> > one systemd service == one tor process, right?  I haven't looked
> > too deeply into systemd, though, so maybe there's some way to
> > add services to an existing tor process...
> 
> At least in this case, yes, one systemd service == one tor process.  We
> don't support more than one, AFAIK.  That would have to be done in a
> container or something.
> 
> > I tried to make the defaults reasonable, so I don't think any
> > config is needed outside of what's required to map
> > inboxes/addresses to directories (which public-inbox-init does)
> 
> The only one I've really added so far that affects the public-inbox
> config is whether to enable spam checking or not, but I suspect there
> might be more.  There are also options for things like whether a service
> should be generated to run public-inbox-httpd, etc.

Yeah, I run -httpd, -nntpd, and -watch as services (see examples/ dir);
so it makes sense to have those in a package.  -httpd/-nntpd
generally run on the same machines, but they don't need -watch
on the same server (they can be running off git-clone/fetch &&
public-inbox-index)

> Here's what my configuration looks like so far, using the module I'm
> writing:
> 
>     services.public-inbox.enable = true;
> 
>     # Add spamassassin to the PATH of public-inbox-mda,
>     # public-inbox-watch, etc.
>     services.public-inbox.path = with pkgs; [ spamassassin ];

They'll all need git, too (unless that's in the default path).
Also, httpd/nntpd don't need spamassassin.

>     services.public-inbox.mda.spamCheck = "spamc";

Note: one slight oddity is there's also a "publicinboxwatch.spamcheck"
in addition to "publicinboxmda.spamcheck"...  I figure some folks will
want differently-configured spamcheckers depending on whether the
mail hits -mda or -watch (so -watch defaults to not having a
spamchecker at all).

Does Nix allow users to set things in the config file directly?
(instead of going through the functional language).

I'm also not sure if you need to have
"services.public-inbox.mda.spamCheck" at all, since "spamc"
is the default value.

>     services.public-inbox.http.mount = "/lists/archives/";

I think all the services would want access to the same
directories, not just httpd (if I'm understanding that config
correctly).  Also, httpd/nntpd only need read-only access to their
mount points, in case that affects things...

>     services.public-inbox.inboxes = { [...] };
> 
> As you can see, it's in some ways like just writing a public-inbox
> configuration file, but it can go beyond that too -- there can be
> options like services.public-inbox.path that are more of a packaging concern but
> that can be delegated to a user (by default, services on NixOS have
> almost nothing in their PATH to ensure purity).  I'm probably not the
> best person to explain why NixOS modules are good, or the benefits of
> expressing all system configuration in a single functional programming
> language, so suffice it to say that doing these things is the
> fundamental goal of the distribution, and that it works extremely well.

The purity and rollback parts definitely sound good :)

However, I'm wondering what level of customization can be
supported by editing the public-inbox config directly? (instead
of using the Nix language)

Having both upstream and distro-specific ways to configure the
same thing could be confusing to both users and people trying to
help them.

I can agree with things like PATH, environment variables,
services-to-enable and mount-points being environment and/or
distro-specific.  The rest (everything in public-inbox-config(5)),
I'm not sure about; it would increase support/doc overhead.


* Re: public-inbox-init with minimal info
  2019-10-06 12:01             ` Eric Wong
@ 2019-10-07 20:52               ` Alyssa Ross
  2019-10-08  7:11                 ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Alyssa Ross @ 2019-10-07 20:52 UTC
  To: Eric Wong; +Cc: meta

[-- Attachment #1: Type: text/plain, Size: 5970 bytes --]

> OK, it seems like you can "build" the full public-inbox config
> file the same way you'd build your cgitrc.

Yep!  I've had that working for a couple of weeks now and it's working
great!

> Yeah, I run -httpd, -nntpd, and -watch as services (see examples/ dir);
> so it makes sense to have those in a package.  -httpd/-nntpd
> generally run on the same machines, but they don't need -watch
> on the same server (they can be running off git-clone/fetch &&
> public-inbox-index)

I haven't looked into watch yet -- I've been implementing for my use
case first, which is using public-inbox as an archiver for a mailing
list hosted on the same server.  But that does make sense, and I'll make
sure to account for it.

>>     # Add spamassassin to the PATH of public-inbox-mda,
>>     # public-inbox-watch, etc.
>>     services.public-inbox.path = with pkgs; [ spamassassin ];
>
> They'll all need git, too (unless that's in the default path).
> Also, httpd/nntpd don't need spamassassin.

Yeah, this is only for -mda and -watch.  git is just added to the PATH
of every public-inbox-* executable in the package using a wrapper script.

> Note: one slight oddity is there's also a "publicinboxwatch.spamcheck"
> in addition to "publicinboxmda.spamcheck"...  I figure some folks will
> want differently-configured spamcheckers depending on whether the
> mail hits -mda or -watch (so -watch defaults to not having a
> spamchecker at all).

Noticed that, and will account for it.  As above, I'm just not using
-watch in my setup.

> Does Nix allow users to set things in the config file directly?
> (instead of going through the functional language).

It does; see below.

> I'm also not sure if you need to have
> "services.public-inbox.mda.spamCheck" at all, since "spamc"
> is the default value.

Correct -- that's a remnant of when I had it disabled because I hadn't
set up spamassassin yet.  I should have deleted the line rather than
changing it.

>>     services.public-inbox.http.mount = "/lists/archives/";
>
> I think all the services would want access to the same
> directories, not just httpd (if I'm understanding that config
> correctly).  Also, httpd/nntpd only need read-only access to their
> mount points, in case that affects things...

"mount" here is in the PSGI sense, not the file system sense.  My
public-inboxes are at https://example.org/lists/archives/.  Maybe
there's a better name.

> The purity and rollback parts definitely sound good :)
>
> However, I'm wondering what level of customization can be
> supported by editing the public-inbox config directly? (instead
> of using the Nix language)
>
> Having both upstream and distro-specific ways to configure the
> same thing could be confusing to both users and people trying to
> help them.
>
> I can agree with things like PATH, environment variables,
> services-to-enable and mount-points being environment and/or
> distro-specific.  The rest (everything in public-inbox-config(5)),
> I'm not sure about; it would increase support/doc overhead.

We accept that we can't package every option and have two conventions
for making sure that our users can always use every option upstream
gives them.

One is to provide an option called extraConfig, which just adds a string
verbatim to the end of the configuration file.  The other is to provide
an option called config, which is structured Nix that gets turned into
the appropriate configuration.  The latter is preferred, because then
from our point of view the configuration is still structured, and can be
read back in Nix without having to parse a string, etc.  Since
public-inbox's configuration format is well-defined as git config,
there's no reason to support the former.  So, suppose you introduce a
new option, publicinbox.foo, and the NixOS module hasn't yet been
updated to support it natively.  In that case, a user could use the
config option as an escape hatch to set it anyway:

    services.public-inbox.config = {
      publicinbox.foo = "bar";
    };

This will then be compiled into git config using a function I've written
that essentially runs git config --add for each config option to build
up a configuration file.
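
As a rough illustration of that flattening step, here is a minimal Python
sketch (not the actual Nix function, which isn't shown in this thread).
The names flatten and git_config_commands and the output path are
hypothetical; the only real interface relied on is
"git config --file <path> --add <key> <value>".  The sketch only builds
the argument lists and does not run git.

```python
from typing import Any, Dict, Iterator, List, Tuple


def flatten(config: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted.key, value) pairs from a nested settings mapping."""
    for name, value in config.items():
        key = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Nested mappings extend the dotted key: section.subsection.name
            yield from flatten(value, key)
        elif isinstance(value, (list, tuple)):
            # Multi-valued keys become one --add invocation per value.
            for item in value:
                yield key, str(item)
        elif isinstance(value, bool):
            # git config spells booleans in lower case.
            yield key, "true" if value else "false"
        else:
            yield key, str(value)


def git_config_commands(config: Dict[str, Any], path: str) -> List[List[str]]:
    """Build one `git config --add` argument list per setting (hypothetical helper)."""
    return [
        ["git", "config", "--file", path, "--add", key, value]
        for key, value in flatten(config)
    ]


# Mirrors the escape-hatch example above: publicinbox.foo = "bar"
commands = git_config_commands({"publicinbox": {"foo": "bar"}}, "public-inbox.config")
```

Each resulting list could be handed to subprocess.run() to append one
entry to the generated file.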

By now I'm sure you're wondering "why bother adding individual NixOS
options for each setting at all, if you can do this?", and there are a
couple of reasons why we try to do it anyway.  One is that we can do
type checking -- setting publicinboxmda.spamcheck to "invalid" can be a
build-time failure rather than a runtime one.  The other is that we can
provide documentation for each option, and our users can see
documentation for every option available on their NixOS system in one
place at <https://nixos.org/nixos/options.html>.  We regularly hear from
users that this is one of their favourite things about NixOS: a single
place to search configuration options for every package they use.
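
To make the type-checking point concrete, here is a small Python sketch
of the idea of rejecting a bad value before the service ever starts.  It
is not how the NixOS module system is implemented; check_spamcheck and
ALLOWED_SPAMCHECK are hypothetical names, and the set of accepted values
("spamc" and "none") is an assumption based on public-inbox-config(5).

```python
# Assumed set of accepted values for publicinboxmda.spamcheck.
ALLOWED_SPAMCHECK = {"spamc", "none"}


def check_spamcheck(value: str) -> str:
    """Return the value unchanged, or fail early if it is not allowed."""
    if value not in ALLOWED_SPAMCHECK:
        raise ValueError(
            f"publicinboxmda.spamcheck: {value!r} is not one of "
            f"{sorted(ALLOWED_SPAMCHECK)}"
        )
    return value


checked = check_spamcheck("spamc")
```

Run during the build, a failure here stops the configuration from being
deployed at all, instead of surfacing when mail arrives.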

I hear your concerns about this being difficult for people trying to
help NixOS users with public-inbox.  It's absolutely not my goal to
fragment the ecosystem.  I'm not aware of this having been a significant
issue for any of the hundreds of modules we have so far.  Users seem to
be generally pretty good at figuring out what's a NixOS issue and what's
an upstream issue, and, if a NixOS user does need upstream support, it's
still easy enough for them to find the generated configuration file to
share with non-NixOS users.  Overall, we've found that the benefits of
"heavyweight" NixOS modules, for want of a better term, outweigh the
disadvantages.  Should this end up having a negative impact on the
public-inbox system, I'd be happy to review the approach.  But I think
that instead, we'll end up with public-inbox being accessible to more
people by being available with the advantages of NixOS --
declarative configuration, reproducibility, near-atomic updates and
rollbacks, etc.

* Re: public-inbox-init with minimal info
  2019-10-07 20:52               ` Alyssa Ross
@ 2019-10-08  7:11                 ` Eric Wong
  2019-10-09 12:09                   ` Alyssa Ross
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2019-10-08  7:11 UTC (permalink / raw)
  To: Alyssa Ross; +Cc: meta

Alyssa Ross <hi@alyssa.is> wrote:

<snip>

> Eric Wong wrote:
> > Does Nix allow users to set things in the config file directly?
> > (instead of going through the functional language).
> 
> It does; see below.

<snip>

> >>     services.public-inbox.http.mount = "/lists/archives/";
> >
> > I think all the services would want access to the same
> > directories, not just httpd (if I'm understanding that config
> > correctly).  Also, httpd/nntpd only need read-only access to their
> > mount points, in case that affects things...
> 
> "mount" here is in the PSGI sense, not the file system sense.  My
> public-inboxes are at https://example.org/lists/archives/.  Maybe
> there's a better name.

Ah, so that overrides the Plack::Builder DSL/language.  We also
have an analogous support problem for PSGI vs public-inbox-config,
so I've been avoiding any overlap between them.

Perhaps "psgi_mount" would be clearer? *shrug*

> > The purity and rollback parts definitely sound good :)
> >
> > However, I'm wondering what level of customization can be
> > supported by editing the public-inbox config directly? (instead
> > of using the Nix language)
> >
> > Having both an upstream and distro-specific ways to configure the
> > same thing could be confusing to both users and people trying to
> > help them.
> >
> > I can agree with things like PATH, environment variables,
> > services-to-enable and mount-points being environment and/or
> > distro-specific.  The rest (everything in public-inbox-config(5)),
> > I'm not sure about; it would increase support/doc overhead.
> 
> We accept that we can't package every option and have two conventions
> for making sure that our users can always use every option upstream
> gives them.
> 
> One is to provide an option called extraConfig, which just adds a string
> verbatim to the end of the configuration file.  The other is to provide
> an option called config, which is structured Nix that gets turned into
> the appropriate configuration.  The latter is preferred, because then
> from our point of view the configuration is still structured, and can be
> read back in Nix without having to parse a string, etc.  Since
> public-inbox's configuration format is well-defined as git config,
> there's no reason to support the former.  So, suppose you introduce a
> new option, publicinbox.foo, and the NixOS module hasn't yet been
> updated to support it natively.  In that case, a user could use the
> config option as an escape hatch to set it anyway:
> 
>     services.public-inbox.config = {
>       publicinbox.foo = "bar";
>     };
> 
> This will then be compiled into git config using a function I've written
> that essentially runs git config --add for each config option to build
> up a configuration file.
> 
> By now I'm sure you're wondering "why bother adding individual NixOS
> options for each setting at all, if you can do this?", and there are a
> couple of reasons why we try to do it anyway.  One is that we can do
> type checking -- setting publicinboxmda.spamcheck to "invalid" can be a
> build-time failure rather than a runtime one.  The other is that we can
> provide documentation for each option, and our users can see
> documentation for every option available on their NixOS system in one
> place at <https://nixos.org/nixos/options.html>.  We regularly hear from
> users that this is one of their favourite things about NixOS: a single
> place to search configuration options for every package they use.

Cool.  An upside for non-NixOS users is we get more experienced
and clueful maintainers reporting bugs to upstream as a result :)

Btw, would it be helpful if public-inbox provided a linter for
its own config file?

I'm actually thinking of doing some sort of graphing tool using
Graph::Easy to visualize all the relationships between various
components, and maybe it can parse the config file to show
people how things work; and it could do the linting as a side-effect.

> I hear your concerns about this being difficult for people trying to
> help NixOS users with public-inbox.  It's absolutely not my goal to
> fragment the ecosystem.  I'm not aware of this having been a significant
> issue for any of the hundreds of modules we have so far.  Users seem to
> be generally pretty good at figuring out what's a NixOS issue and what's
> an upstream issue, and, if a NixOS user does need upstream support, it's
> still easy enough for them to find the generated configuration file to
> share with non-NixOS users.  Overall, we've found that the benefits of
> "heavyweight" NixOS modules, for want of a better term, outweigh the
> disadvantages.  Should this end up having a negative impact on the
> public-inbox system, I'd be happy to review the approach.  But I think
> that instead, we'll end up with public-inbox being accessible to more
> people by being available with the advantages of NixOS --
> declarative configuration, reproducibility, near-atomic updates and
> rollbacks, etc.

Thanks, that is all very reassuring to read.

* Re: public-inbox-init with minimal info
  2019-10-08  7:11                 ` Eric Wong
@ 2019-10-09 12:09                   ` Alyssa Ross
  2019-10-10  8:19                     ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Alyssa Ross @ 2019-10-09 12:09 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

>> >>     services.public-inbox.http.mount = "/lists/archives/";
>> >
>> > I think all the services would want access to the same
>> > directories, not just httpd (if I'm understanding that config
>> > correctly).  Also, httpd/nntpd only need read-only access to their
>> > mount points, in case that affects things...
>> 
>> "mount" here is in the PSGI sense, not the file system sense.  My
>> public-inboxes are at https://example.org/lists/archives/.  Maybe
>> there's a better name.
>
> Ah, so that overrides the Plack::Builder DSL/language.  We also
> have an analogous support problem for PSGI vs public-inbox-config,
> so I've been avoiding any overlap between them.
>
> Perhaps "psgi_mount" would be clearer? *shrug*

I shied away from that because it would only be clearer if you know what
PSGI is, and the module takes care of all of that.  I also considered
httpMount, but since it's already in the http namespace that felt
redundant.

> Btw, would it be helpful if public-inbox provided a linter for
> its own config file?

Very much so!  That would let us lint config files at build time, and
fail the system build if they were invalid, meaning a system could
"never" have an invalid config file.  We already do this with nginx --
the linter doesn't catch everything, but it's wonderful when it catches
something that would otherwise have left you without a working web
server.

Here's an idea for a lint: I lost most of a day wondering what I had
done wrong, before realising that I was setting mainrepo to all.git,
rather than its parent directory.  The name "mainrepo" isn't great, IMO,
but a lint could have accommodated that.
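
A check along those lines might look like the following Python sketch.
It is purely hypothetical (lint_mainrepo is not part of public-inbox) and
only inspects the path string: it assumes that a mainrepo value whose
last component is all.git is a mistake and that the parent directory was
intended.

```python
from pathlib import PurePosixPath
from typing import List


def lint_mainrepo(mainrepo: str) -> List[str]:
    """Return warnings for a suspicious mainrepo value (hypothetical lint)."""
    warnings = []
    path = PurePosixPath(mainrepo)
    # Assumption: pointing at all.git instead of the directory that
    # contains it is the misconfiguration described above.
    if path.name == "all.git":
        warnings.append(
            f"mainrepo = {mainrepo} points at all.git; "
            f"did you mean its parent directory, {path.parent}?"
        )
    return warnings


problems = lint_mainrepo("/srv/inbox/all.git")
```

An empty list means the value passed this particular check; it says
nothing about whether the directory exists or is a valid inbox.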

* Re: public-inbox-init with minimal info
  2019-10-09 12:09                   ` Alyssa Ross
@ 2019-10-10  8:19                     ` Eric Wong
  2019-10-16 10:04                       ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2019-10-10  8:19 UTC (permalink / raw)
  To: Alyssa Ross; +Cc: meta

Alyssa Ross <hi@alyssa.is> wrote:
> >> >>     services.public-inbox.http.mount = "/lists/archives/";
> >> >
> >> > I think all the services would want access to the same
> >> > directories, not just httpd (if I'm understanding that config
> >> > correctly).  Also, httpd/nntpd only need read-only access to their
> >> > mount points, in case that affects things...
> >> 
> >> "mount" here is in the PSGI sense, not the file system sense.  My
> >> public-inboxes are at https://example.org/lists/archives/.  Maybe
> >> there's a better name.
> >
> > Ah, so that overrides the Plack::Builder DSL/language.  We also
> > have an analogous support problem for PSGI vs public-inbox-config,
> > so I've been avoiding any overlap between them.
> >
> > Perhaps "psgi_mount" would be clearer? *shrug*
> 
> I shied away from that because it would only be clearer if you know what
> PSGI is, and the module takes care of all of that.  I also considered
> httpMount, but since it's already in the http namespace that felt
> redundant.

Maybe "url_mount"?  Naming is one of the toughest problems :<

> > Btw, would it be helpful if public-inbox provided a linter for
> > its own config file?
> 
> Very much so!  That would let us lint config files at build time, and
> fail the system build if they were invalid, meaning a system could
> "never" have an invalid config file.  We already do this with nginx --
> the linter doesn't catch everything, but it's wonderful when it catches
> something that would otherwise have left you without a working web
> server.

OK, adding TODO item.

> Here's an idea for a lint: I lost most of a day wondering what I had
> done wrong, before realising that I was setting mainrepo to all.git,
> rather than its parent directory.  The name "mainrepo" isn't great, IMO,
> but a lint could have accommodated that.

Sorry about wasting your time!  Yeah, "mainrepo" is a bad name,
especially with v2 :<

I guess we can shift to "inboxdir" and keep the old alias
indefinitely for compatibility, since there's already INBOX_DIR
all over the documentation.
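
For illustration only, a lookup that prefers the new key and falls back
to the old alias could be sketched in Python as below.  This is not the
public-inbox implementation (which is Perl and lives in the patches, not
in this message); inbox_dir and the dict-based section are hypothetical,
and the precedence shown (new key wins) is an assumption.

```python
from typing import Dict, Optional


def inbox_dir(section: Dict[str, str]) -> Optional[str]:
    """Prefer "inboxdir"; fall back to the legacy "mainrepo" alias."""
    # Assumption: when both keys are set, the new name takes precedence.
    return section.get("inboxdir", section.get("mainrepo"))


old_style = inbox_dir({"mainrepo": "/srv/inbox"})
new_style = inbox_dir({"inboxdir": "/srv/inbox"})
```

Existing configs that only set mainrepo keep working unchanged under
this scheme.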

I originally intended for each inbox to have another repo
for rejected/spam messages; but just put it in PI_EMERGENCY,
instead.

* Re: public-inbox-init with minimal info
  2019-10-10  8:19                     ` Eric Wong
@ 2019-10-16 10:04                       ` Eric Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2019-10-16 10:04 UTC (permalink / raw)
  To: Alyssa Ross; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Alyssa Ross <hi@alyssa.is> wrote:
> > Here's an idea for a lint: I lost most of a day wondering what I had
> > done wrong, before realising that I was setting mainrepo to all.git,
> > rather than its parent directory.  The name "mainrepo" isn't great, IMO,
> > but a lint could have accommodated that.
> 
> Sorry about wasting your time!  Yeah, "mainrepo" is a bad name,
> especially with v2 :<
> 
> I guess we can shift to "inboxdir" and keep the old alias
> indefinitely for compatibility, since there's already INBOX_DIR
> all over the documentation.

Patches posted at:
  https://public-inbox.org/meta/20191016085955.23674-1-e@80x24.org/

(and I needed a 3/2 fixup after deploying :x)
Will merge to master soonish.  I think it's the last major thing
to do before 1.2.0 (and some more doc updates coming...)

Thread overview: 13+ messages
2019-10-03 11:16 public-inbox-init with minimal info Alyssa Ross
2019-10-04  2:45 ` Eric Wong
2019-10-04 11:18   ` Alyssa Ross
2019-10-05  5:14     ` Eric Wong
2019-10-05 13:05       ` Alyssa Ross
2019-10-05 19:58         ` Eric Wong
2019-10-06  9:52           ` Alyssa Ross
2019-10-06 12:01             ` Eric Wong
2019-10-07 20:52               ` Alyssa Ross
2019-10-08  7:11                 ` Eric Wong
2019-10-09 12:09                   ` Alyssa Ross
2019-10-10  8:19                     ` Eric Wong
2019-10-16 10:04                       ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git