From: "W. Trevor King" <wking@tremily.us>
To: Eric Wong <e@80x24.org>
Cc: meta@public-inbox.org
Subject: Re: [RFC] ssoma-mda: Use the email subject as the commit message
Date: Sat, 18 Oct 2014 14:50:20 -0700
Message-ID: <20141018215020.GK17200@odin.tremily.us>
In-Reply-To: <20141018210400.GA2448@dcvr.yhbt.net>

On Sat, Oct 18, 2014 at 09:04:00PM +0000, Eric Wong wrote:
> W. Trevor King wrote:
> > This is more interesting than just using 'mda' all the time, but
> > it's harder to setup proper quoting around the message without
> > using third-party Perl modules (e.g. IPC::Run or
> > String::ShellQuote).  This proof-of-concept patch just assumes the
> > subject doesn't contain single-quotes (').  This patch also
> > doesn't handle the empty/missing subject case, which should
> > probably fall back to '<no subject>' or some such.
> 
> Right, carelessness here would open us up to command injection.

There's no shell to inject into if you're using a subprocess
launcher that's based on execve (see exec(3)) instead of going
through a shell.  The subject is passed as a single argument,
whatever quotes it contains.
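
As a rough sketch of what I mean (Python 3; the echo_argument and
commit_command helpers are hypothetical names for illustration, and
the 'git commit -m' argv is only an assumption about how an MDA might
commit, not what ssoma-mda does today):

```python
import subprocess
import sys


def echo_argument(value):
    """Hand value to a child process as one argv entry and read it back."""
    # A list of arguments with the default shell=False is executed
    # directly (exec-style), so no shell ever parses the string.
    child = [
        sys.executable, '-c',
        'import sys; sys.stdout.write(sys.argv[1])',
        value,
    ]
    return subprocess.check_output(child).decode('utf-8')


def commit_command(subject):
    """Build the argv for a hypothetical commit using the subject."""
    # Fall back to a placeholder for an empty or missing subject.
    return ['git', 'commit', '-m', subject or '<no subject>']


hostile = "it's here'; rm -rf ~; echo '"
round_trip = echo_argument(hostile)
command = commit_command(hostile)
```

The hostile subject comes back from the child unchanged, and
commit_command() keeps it as a single list element, so quoting never
enters the picture.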

> It would also need to work with internationalized subjects.  I
> considered it for public-inbox-mda; but decided it was not worth the
> trouble.

Python handles that out of the box without difficulty [1].  In Python
3:

  >>> import email.header
  >>> h = email.header.Header('p\xf6stal', 'iso-8859-1')
  >>> str(h)
  'pöstal'

In Python 2, you just need to import the unicode_literals future [2]
and use unicode() instead of str().  It's easy to bind the appropriate
to-Unicode function to a unicode_str helper depending on the Python
version if you want a code-base compatible with both.
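
For the incoming direction (decoding an encoded Subject from a
received message), something along these lines should work on both
versions.  This is only a sketch: decoded_subject is a hypothetical
helper name, the sample message is made up, and the '<no subject>'
fallback is just the placeholder suggested earlier in the thread.

```python
import email
import email.header
import sys

# Bind the to-Unicode function once, depending on the Python version.
unicode_str = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def decoded_subject(raw_message, default='<no subject>'):
    """Return the Subject of raw_message as Unicode text."""
    message = email.message_from_string(raw_message)
    raw = message.get('Subject')
    if not raw or not raw.strip():
        # Empty or missing subject: use the fallback.
        return default
    # decode_header() splits the header into (bytes, charset) chunks;
    # make_header() reassembles them into a Header object.
    parts = email.header.decode_header(raw)
    return unicode_str(email.header.make_header(parts))


raw = (
    'From: someone@example.com\n'
    'Subject: =?iso-8859-1?q?p=F6stal?=\n'
    '\n'
    'body\n'
)
subject = decoded_subject(raw)
```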

> > It would also be useful (I think) to set the GIT_AUTHOR_NAME,
> > GIT_AUTHOR_EMAIL, and GIT_AUTHOR_DATE environment variables from
> > the message header before committing.  I know how to do that using
> > Python's subprocess module, but I don't know the Perl incantation.
> 
> That's done in public-inbox-mda using
> 
> 	local $ENV{...} = ...
> 
> And more Email::* modules to properly decode various email addresses
> and internationalized names.  I wanted to keep ssoma as lean and
> dumb as possible.

It doesn't seem like *that* much more complication ;).  Can we make it
optional, and error out if it's enabled and the appropriate decoding
modules aren't present?  It seems like you'd want to handle input and
local browsing with ssoma and then point public-inbox at the resulting
Git archive.  Collecting the archive should be independent of serving
it over HTTP.
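
On the Python side, setting the author variables is roughly the
following.  It's a minimal sketch: author_environment is a
hypothetical helper, it does not decode internationalized names (that
would need the email.header handling as well), and it assumes Git
accepts the RFC 2822 Date header as-is for GIT_AUTHOR_DATE.

```python
import email
import email.utils
import os


def author_environment(message, base=None):
    """Return a copy of the environment with GIT_AUTHOR_* set from message."""
    env = dict(os.environ if base is None else base)
    # parseaddr() splits '"Real Name" <user@host>' into (name, address).
    name, address = email.utils.parseaddr(message.get('From', ''))
    if name:
        env['GIT_AUTHOR_NAME'] = name
    if address:
        env['GIT_AUTHOR_EMAIL'] = address
    date = message.get('Date')
    if date:
        env['GIT_AUTHOR_DATE'] = date
    return env


message = email.message_from_string(
    'From: "W. Trevor King" <wking@tremily.us>\n'
    'Date: Sat, 18 Oct 2014 14:50:20 -0700\n'
    'Subject: example\n'
    '\n'
    'body\n'
)
env = author_environment(message, base={})
```

The resulting dictionary can then be handed to the commit subprocess
through the env argument of subprocess.check_call() and friends.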

> > Is there any interest in a Python port of ssoma?  The subprocess
> > handling in Perl's standard libraries is not my favorite ;).  I
> > expect we could handle all of ssoma without leaving Python's
> > standard libraries.  For an example of a related Perl -> Python
> > rewrite that I just landed, see nmbug [1,2,3].
> 
> I think you're the only one who's shown any interest in ssoma at all
> :)
>
> I would love to have multiple implementations of ssoma and want a Ruby
> one, too.  However I don't think using Python/Ruby would increase it's
> ease-of-installation or adoption much (and most of my software is Ruby).

I would have tried it sooner if it had been written in a language I
liked ;).  I'm not familiar with Ruby's email-parsing modules, but I
am familiar with Python's.

What do you see as the ease-of-installation and adoption barriers?
I'd guess they're just “Gmane works pretty well”.

> Fwiw, the commit subject/message currently has no bearing on the way
> ssoma or public-inbox handles the mail data.  So another
> implementation is free to use more metadata in the commit message.

Right, but if you're going to put something into Git, you might as
well make the history pleasant to browse ;).

> I've considered adding fuzzy generation counters to commit messages to
> public-inbox to allow easier history traversals; but decided it's
> probably better to do in any out-of-band, easily-regenerated store
> using sqlite or similar (this may help with adding search support to
> the web UI as well).

Fuzzy generation counters?  For search, I'd just run a local notmuch
index [3].  It already has Python, Ruby, and Go bindings [4], although
I'm not sure how mature the non-Python ones are (219 commits touch
bindings/python, but only 38 and 20 touch bindings/ruby and
bindings/go).  Of course, you can always call the notmuch command-line
client as a subprocess if the bindings don't work for you.  Personally,
I'd rather use ssoma for aggregating and sharing the archive, and then
notmuch to handle threading and search, with a read-only web frontend
in front of notmuch that only hits the ssoma archive for message
bodies (and serves thread lists and such straight from notmuch,
hitting the Xapian database but not the ssoma archive).
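
Calling the notmuch command-line client as a subprocess could look
something like this.  Treat it as a sketch: the helper names are
hypothetical, it assumes a notmuch binary on the PATH with an
existing index, and it assumes that 'notmuch search --format=json
--output=threads' prints a JSON list of thread IDs.

```python
import json
import subprocess


def notmuch_search_command(query, limit=None):
    """Build the argv for a thread search; no shell is involved."""
    command = ['notmuch', 'search', '--format=json', '--output=threads']
    if limit is not None:
        command.append('--limit={}'.format(int(limit)))
    # The query travels as one argument, so it needs no quoting.
    command.append(query)
    return command


def search_threads(query, limit=None):
    """Run notmuch and return the parsed JSON result."""
    output = subprocess.check_output(notmuch_search_command(query, limit))
    return json.loads(output.decode('utf-8'))


command = notmuch_search_command('subject:ssoma', limit=5)
```

A web frontend could call search_threads() for thread lists and only
touch the ssoma archive when it needs a message body.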

I think ssoma + notmuch + nmbug is a good pairing for users too, since
you'll generally want the whole archive locally for that (although
with the in-flight ghost message series for notmuch [5,6], having the
whole archive locally will move from “you really want this” to “you
probably want this”).

Cheers,
Trevor

[1]: https://docs.python.org/3/library/email.header.html
[2]: https://docs.python.org/2/library/__future__.html
[3]: http://notmuchmail.org/
[4]: http://git.notmuchmail.org/git/notmuch/tree/HEAD:/bindings
[5]: http://notmuchmail.org/pipermail/notmuch/2014/019160.html
[6]: http://notmuchmail.org/pipermail/notmuch/2014/019235.html

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

