From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 18 Oct 2014 14:50:20 -0700
From: "W. Trevor King"
To: Eric Wong
Cc: meta@public-inbox.org
Subject: Re: [RFC] ssoma-mda: Use the email subject as the commit message
Message-ID: <20141018215020.GK17200@odin.tremily.us>
References: <20141018210400.GA2448@dcvr.yhbt.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="ee6FjwWxuMujAVRe";
	micalg="pgp-sha1"; protocol="application/pgp-signature"
Content-Disposition: inline
In-Reply-To: <20141018210400.GA2448@dcvr.yhbt.net>
OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8;
	url=http://tremily.us/pubkey.txt
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Content-Filtered-By: PublicInbox::Filter 0.0.1
List-Id:

--ee6FjwWxuMujAVRe
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Sat, Oct 18, 2014 at 09:04:00PM +0000, Eric Wong wrote:
> W. Trevor King wrote:
> > This is more interesting than just using 'mda' all the time, but
> > it's harder to setup proper quoting around the message without
> > using third-party Perl modules (e.g. IPC::Run or
> > String::ShellQuote). This proof-of-concept patch just assumes the
> > subject doesn't contain single-quotes ('). This patch also
> > doesn't handle the empty/missing subject case, which should
> > probably fall back to '' or some such.
>
> Right, carelessness here would open us up to command injection.

There's no chance of carelessness if you're using a subprocess
launcher that's based on execve (see exec(3)) instead of using a
shell.

> It would also need to work with internationalized subjects. I
> considered it for public-inbox-mda; but decided it was not worth the
> trouble.

Python handles that out of the box without difficulty [1]. In Python
3:

  >>> import email.header
  >>> h = email.header.Header('p\xf6stal', 'iso-8859-1')
  >>> str(h)
  'pöstal'

In Python 2, you just need to import the unicode_literals future [2]
and use unicode() instead of str(). It's easy to bind the appropriate
to-Unicode function to a unicode_str helper depending on the Python
version if you want a code-base compatible with both.

> > It would also be useful (I think) to set the GIT_AUTHOR_NAME,
> > GIT_AUTHOR_EMAIL, and GIT_AUTHOR_DATE environment variables from
> > the message header before committing. I know how to do that using
> > Python's subprocess module, but I don't know the Perl incantation.
>
> That's done in public-inbox-mda using
>
>   local $ENV{...} = ...
>
> And more Email::* modules to properly decode various email addresses
> and internationalized names. I wanted to keep ssoma as lean and
> dumb as possible.

It doesn't seem like *that* much more complication ;). Can we make it
optional, and error out if it's enabled and the appropriate decoding
modules aren't present?
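
A minimal Python 3 sketch of the pieces above, using only the
standard library: decode the subject and author from the message
headers, put the author details into GIT_AUTHOR_*, and launch git with
an argument list so no shell ever sees the subject. The function
names, the 'mda' fallback for an empty subject, and the plain
`git commit` call are assumptions for illustration, not how ssoma-mda
works today.

```python
import email
import email.policy
import email.utils
import os
import subprocess


def commit_details(raw_message, fallback_subject='mda'):
    """Return (subject, env) for a commit recording raw_message (bytes).

    fallback_subject is an assumed default for the empty/missing
    subject case.
    """
    # policy.default decodes RFC 2047 encoded-words in headers for us.
    msg = email.message_from_bytes(raw_message, policy=email.policy.default)
    subject = str(msg['Subject'] or '').strip() or fallback_subject
    env = dict(os.environ)
    sender = msg['From']
    if sender is not None and sender.addresses:
        author = sender.addresses[0]
        if author.display_name:
            env['GIT_AUTHOR_NAME'] = author.display_name
        if author.addr_spec:
            env['GIT_AUTHOR_EMAIL'] = author.addr_spec
    date = msg['Date']
    if date is not None and date.datetime is not None:
        # Git accepts RFC 2822 dates in GIT_AUTHOR_DATE.
        env['GIT_AUTHOR_DATE'] = email.utils.format_datetime(date.datetime)
    return subject, env


def commit(raw_message, repo):
    """Commit whatever is staged in repo (hypothetical stand-in for the
    real ssoma commit step), taking metadata from raw_message."""
    subject, env = commit_details(raw_message)
    # An argument list (no shell=True) is handed to exec directly, so
    # quotes in the subject need no escaping.
    subprocess.check_call(['git', 'commit', '-m', subject],
                          cwd=repo, env=env)
```

Because the subject travels as a single argv element, a subject
containing a single-quote needs no special handling.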
It seems like you'd want to handle input and local browsing with
ssoma and then point public-inbox at the resulting Git archive.
Collecting the archive should be independent of serving it over HTTP.

> > Is there any interest in a Python port of ssoma? The subprocess
> > handling in Perl's standard libraries is not my favorite ;). I
> > expect we could handle all of ssoma without leaving Python's
> > standard libraries. For an example of a related Perl -> Python
> > rewrite that I just landed, see nmbug [1,2,3].
>
> I think you're the only one who's shown any interest in ssoma at all
> :)
>
> I would love to have multiple implementations of ssoma and want a Ruby
> one, too. However I don't think using Python/Ruby would increase it's
> ease-of-installation or adoption much (and most of my software is Ruby).

I would have tried it sooner if it had been written in a language I
liked ;). I'm not familiar with Ruby's email-parsing modules, but I
am familiar with Python's. What do you see as the
ease-of-installation and adoption barriers? I'd guess they're just
“Gmane works pretty well”.

> Fwiw, the commit subject/message currently has no bearing on the way
> ssoma or public-inbox handles the mail data. So another
> implementation is free to use more metadata in the commit message.

Right, but if you're going to put something into Git, you might as
well make the history pleasant to browse ;).

> I've considered adding fuzzy generation counters to commit messages to
> public-inbox to allow easier history traversals; but decided it's
> probably better to do in any out-of-band, easily-regenerated store
> using sqlite or similar (this may help with adding search support to
> the web UI as well).

Fuzzy generation counters?

For search, I'd just run a local notmuch index [3].
It already has Python, Ruby, and Go bindings [4], although I'm not
sure how mature the non-Python ones are (219 commits touch
bindings/python, but only 38 and 20 touch bindings/ruby and
bindings/go). Of course, you can always call the notmuch command-line
client as a subprocess if the bindings don't work for you.

Personally, I'd rather use ssoma for aggregating and sharing the
archive, and then notmuch to handle threading and search, with a
read-only web frontend in front of notmuch that would just hit the
ssoma archive for message bodies (but serve thread lists and such
straight from notmuch, hitting the Xapian database but not the ssoma
archive). I think ssoma + notmuch + nmbug is a good pairing for users
too, since you'll generally want the whole archive locally for that
(although with the in-flight ghost message series for notmuch [5,6],
having the whole archive locally will move from “you really want
this” to “you probably want this”).

Cheers,
Trevor

[1]: https://docs.python.org/3/library/email.header.html
[2]: https://docs.python.org/2/library/__future__.html
[3]: http://notmuchmail.org/
[4]: http://git.notmuchmail.org/git/notmuch/tree/HEAD:/bindings
[5]: http://notmuchmail.org/pipermail/notmuch/2014/019160.html
[6]: http://notmuchmail.org/pipermail/notmuch/2014/019235.html

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

--ee6FjwWxuMujAVRe--