user/dev discussion of public-inbox itself
* Contributing messages to an archive
@ 2018-07-03 16:09 Jonathan Nieder
  2018-07-03 16:39 ` Jeff King
  2018-07-03 19:54 ` Eric Wong
  0 siblings, 2 replies; 7+ messages in thread
From: Jonathan Nieder @ 2018-07-03 16:09 UTC (permalink / raw)
  To: meta; +Cc: Jeff King

Hi Eric et al,

I dug through the documentation at public-inbox.org and didn't see a
clear answer to this, so thought I'd ask to see whether my idea is
crazy.

https://public-inbox.org/git contains a copy of all messages sent to
the git mailing list, which is a useful resource for git developers.
But some messages don't reach there:

 1. When vger has a bad day, some messages might not reach
    public-inbox but they still reach any other developers that were
    cc-ed

 2. (My motivation) The git-security@googlegroups.com list receives
    messages about embargoed issues that should not be made public
    right away.  We would like to switch to a model where after the
    embargo expires, the discussion is made public.

Is there a way to inject messages into the public-inbox.org/git
archive?  E.g.  should I provide my own public-inbox style repository
to pull messages from, or is there an address that I can bounce
messages to?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Contributing messages to an archive
  2018-07-03 16:09 Contributing messages to an archive Jonathan Nieder
@ 2018-07-03 16:39 ` Jeff King
  2018-07-03 19:54 ` Eric Wong
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff King @ 2018-07-03 16:39 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: meta

On Tue, Jul 03, 2018 at 09:09:10AM -0700, Jonathan Nieder wrote:

>  2. (My motivation) The git-security@googlegroups.com list receives
>     messages about embargoed issues that should not be made public
>     right away.  We would like to switch to a model where after the
>     embargo expires, the discussion is made public.
> 
> Is there a way to inject messages into the public-inbox.org/git
> archive?  E.g.  should I provide my own public-inbox style repository
> to pull messages from, or is there an address that I can bounce
> messages to?

To give a sense of the scale here, I have a 326-message thread regarding
the recent submodules vulnerability that I'd like to make it into the
archive (now that the issue is un-embargoed). I _could_ just re-send
those messages individually to git@vger, but:

  1. It's kind of spammy to dump so many messages.

  2. The dates are now old, which means they'd likely get spam-blocked
     if used as-is.

  3. Ditto for the "From" field, which would violate things like SPF.

Points (2) and (3) could be addressed by "wrapping" the messages and
using in-body headers, like we do for patch authors/dates, but then that
would also fool tools like public-inbox.

Likewise, I could dump the whole thing as a gzipped mbox, but it would
be nice if the contents were more easily searchable.

So it would be really nice to inject directly into the archive somehow
(and then leave a single message on the list alerting people that it
happened).

-Peff


* Re: Contributing messages to an archive
  2018-07-03 16:09 Contributing messages to an archive Jonathan Nieder
  2018-07-03 16:39 ` Jeff King
@ 2018-07-03 19:54 ` Eric Wong
  2018-07-03 20:44   ` Jonathan Nieder
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Wong @ 2018-07-03 19:54 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: meta, Jeff King

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi Eric et al,
> 
> I dug through the documentation at public-inbox.org and didn't see a
> clear answer to this, so thought I'd ask to see whether my idea is
> crazy.
> 
> https://public-inbox.org/git contains a copy of all messages sent to
> the git mailing list, which is a useful resource for git developers.
> But some messages don't reach there:
> 
>  1. When vger has a bad day, some messages might not reach
>     public-inbox but they still reach any other developers that were
>     cc-ed

In that case, notifying the admin(s) privately with an mbox
or whatever which I can drop into my Maildir watched by
public-inbox-watch should work fine[1].

>  2. (My motivation) The git-security@googlegroups.com list receives
>     messages about embargoed issues that should not be made public
>     right away.  We would like to switch to a model where after the
>     embargo expires, the discussion is made public.

I suspect that is best suited for a second archive.

It's simpler for readers using NNTP and Atom feeds to not get
out-of-date messages, and filtering rules[1] would be different,
as I suspect googlegroups allows HTML :<

Since I'm not a part of git-security, it might be best for you
and/or Jeff to run that yourselves[2].  I can help out, of
course.  I'd use a public-inbox-watch setup to watch a
particular Maildir, and you'd move unembargoed messages into
that.


> Is there a way to inject messages into the public-inbox.org/git
> archive?  E.g.  should I provide my own public-inbox style repository
> to pull messages from, or is there an address that I can bounce
> messages to?

If you're using public-inbox-watch, then dropping messages into
a Maildir being watched ought to do the trick.
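
As a rough sketch of the Maildir drop described above, the following
uses Python's standard mailbox module to copy every message from an
mbox into a Maildir.  The function name mbox_to_maildir and both paths
are hypothetical, and it assumes the destination is the Maildir that
public-inbox-watch has been configured to watch; it is not part of
public-inbox itself.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count.

    Hypothetical helper: both paths are placeholders chosen by the caller.
    """
    # create=False: fail loudly if the mbox does not exist
    src = mailbox.mbox(str(mbox_path), create=False)
    # create=True: make the Maildir (tmp/, new/, cur/) if it is missing
    dest = mailbox.Maildir(str(maildir_path), create=True)
    copied = 0
    try:
        for message in src:
            # add() writes the file under tmp/ first, then moves it into
            # new/ (or cur/), so a watcher never sees a partial file
            dest.add(message)
            copied += 1
    finally:
        src.close()
        dest.close()
    return copied
```

Calling mbox_to_maildir("thread.mbox", "/path/to/watched/Maildir")
would return the number of messages copied; whether they then show up
in the archive still depends on the watch configuration and any spam
filtering in front of it.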


There's also some one-off scripts in the scripts/ directory of
the git repo (e.g. scripts/import_vger_from_mbox) which can
work after-the-fact.


[1] If configured, PublicInbox::SaPlugin::ListMirror actually
    prevents bounced messages from being injected.  The admin would
    have to edit the top Received: header.

[2] The partitioned v2 layout I wrote for kernel.org scales way
    better than the original 2/38 layout.  Cloning is a little
    more involved (see the bottom of <https://lore.kernel.org/lkml/>),
    but the only difference in setup is passing the "-V2" switch
    to public-inbox-init.  But yeah, I remember being appalled by
    hosting costs of git-scm.com Peff posted, and public-inbox.org
    has always run on a $20/month VPS which I also hack and run
    test suites from.


* Re: Contributing messages to an archive
  2018-07-03 19:54 ` Eric Wong
@ 2018-07-03 20:44   ` Jonathan Nieder
  2018-07-03 21:00     ` Jeff King
  2018-07-03 21:03     ` Eric Wong
  0 siblings, 2 replies; 7+ messages in thread
From: Jonathan Nieder @ 2018-07-03 20:44 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta, Jeff King

Hi,

Eric Wong wrote:
> Jonathan Nieder <jrnieder@gmail.com> wrote:

>>  1. When vger has a bad day, some messages might not reach
>>     public-inbox but they still reach any other developers that were
>>     cc-ed
>
> In that case, notifying the admin(s) privately with an mbox
> or whatever which I can drop into my Maildir watched by
> public-inbox-watch should work fine[1].

Thanks, good to know.

>>  2. (My motivation) The git-security@googlegroups.com list receives
>>     messages about embargoed issues that should not be made public
>>     right away.  We would like to switch to a model where after the
>>     embargo expires, the discussion is made public.
>
> I suspect that is best suited for a second archive.
>
> It's simpler for readers using NNTP and Atom feeds to not get
> out-of-date messages, and filtering rules[1] would be different,
> as I suspect googlegroups allows HTML :<

Hm, I was looking forward to having it in the main archive since
that's where people look for context on a patch.

Is there a way to configure the search to look at multiple archives?

> Since I'm not a part of git-security, it might be best for you
> and/or Jeff to run that yourselves[2].  I can help out, of
> course.  I'd use a public-inbox-watch setup to watch a
> particular Maildir, and you'd move unembargoed messages into
> that.

Would you be interested in helping with Git security bug reports and
reviewing the patches that address them?  If so, we'd be happy to add
you to the list.

Thanks,
Jonathan


* Re: Contributing messages to an archive
  2018-07-03 20:44   ` Jonathan Nieder
@ 2018-07-03 21:00     ` Jeff King
  2018-07-03 21:08       ` Eric Wong
  2018-07-03 21:03     ` Eric Wong
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff King @ 2018-07-03 21:00 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Eric Wong, meta

On Tue, Jul 03, 2018 at 01:44:00PM -0700, Jonathan Nieder wrote:

> > It's simpler for readers using NNTP and Atom feeds to not get
> > out-of-date messages, and filtering rules[1] would be different,
> > as I suspect googlegroups allows HTML :<
> 
> Hm, I was looking forward to having it in the main archive since
> that's where people look for context on a patch.
> 
> Is there a way to configure the search to look at multiple archives?

Yeah, that was my main impetus in putting them in the vger archive.
Those messages _would_ have gone to the list, if not for the embargo,
and ideally people digging into the history later would be able to find
them easily (either by message-id if referenced, or by keyword
searching).

-Peff


* Re: Contributing messages to an archive
  2018-07-03 20:44   ` Jonathan Nieder
  2018-07-03 21:00     ` Jeff King
@ 2018-07-03 21:03     ` Eric Wong
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Wong @ 2018-07-03 21:03 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: meta, Jeff King

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Eric Wong wrote:
> > I suspect that is best suited for a second archive.
> 
> Hm, I was looking forward to having it in the main archive since
> that's where people look for context on a patch.
> 
> Is there a way to configure the search to look at multiple archives?

Not yet, but v2 repos already use the feature of Xapian to
transparently tie different Xapian indices together.

The UI is the hard part; I suppose there should be a way to
group "related" repos hosted on the same server (as I have
a bunch of inboxes which are totally unrelated).

> > Since I'm not a part of git-security, it might be best for you
> > and/or Jeff to run that yourselves[2].  I can help out, of
> > course.  I'd use a public-inbox-watch setup to watch a
> > particular Maildir, and you'd move unembargoed messages into
> > that.
> 
> Would you be interested in helping with Git security bug reports and
> reviewing the patches that address them?  If so, we'd be happy to add
> you to the list.

Sure, I'd like to be able to scan the list.  But my time for
working on git.git and even public-inbox has been pretty limited
in recent months.


* Re: Contributing messages to an archive
  2018-07-03 21:00     ` Jeff King
@ 2018-07-03 21:08       ` Eric Wong
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2018-07-03 21:08 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, meta

Jeff King <peff@peff.net> wrote:
> On Tue, Jul 03, 2018 at 01:44:00PM -0700, Jonathan Nieder wrote:
> > 
> > Is there a way to configure the search to look at multiple archives?
> 
> Yeah, that was my main impetus in putting them in the vger archive.
> Those messages _would_ have gone to the list, if not for the embargo,
> and ideally people digging into the history later would be able to find
> them easily (either by message-id if referenced, or by keyword
> searching).

Yes, at least by Message-ID references it already finds a message
on the same server.  So Jonathan's original message to this list
using /git/:

https://public-inbox.org/git/20180703160910.GB51821@aiede.svl.corp.google.com/

...refers the reader to the archive for this list:

https://public-inbox.org/meta/20180703160910.GB51821@aiede.svl.corp.google.com/

(It needed to work that way for NNTP).
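
The URL pattern in the links above (server, inbox name, Message-ID,
trailing slash) can be sketched as a small helper.  The function name
message_url is hypothetical, and the percent-encoding rule is an
assumption made for illustration; public-inbox's own escaping of
unusual Message-IDs may differ.

```python
from urllib.parse import quote


def message_url(base_url, inbox, message_id):
    """Build a per-inbox URL for a Message-ID, following the links above."""
    mid = message_id.strip()
    # Drop the angle brackets used in mail headers, if present
    if mid.startswith("<") and mid.endswith(">"):
        mid = mid[1:-1]
    # Assumption: percent-encode anything unsafe in a path segment
    # (including "/"), but leave "@" readable as in the URLs above
    return "%s/%s/%s/" % (base_url.rstrip("/"), inbox, quote(mid, safe="@"))
```

For example, message_url("https://public-inbox.org", "meta",
"<20180703160910.GB51821@aiede.svl.corp.google.com>") rebuilds the
/meta/ link shown above, and passing "git" as the inbox gives the
/git/ form.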


Dealing with the UI and configuration for search ends up being a
bit tougher...


end of thread, other threads:[~2018-07-03 21:08 UTC | newest]

Thread overview: 7+ messages
2018-07-03 16:09 Contributing messages to an archive Jonathan Nieder
2018-07-03 16:39 ` Jeff King
2018-07-03 19:54 ` Eric Wong
2018-07-03 20:44   ` Jonathan Nieder
2018-07-03 21:00     ` Jeff King
2018-07-03 21:08       ` Eric Wong
2018-07-03 21:03     ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).