From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 4BEF420248;
	Thu, 14 Mar 2019 07:44:47 +0000 (UTC)
Date: Thu, 14 Mar 2019 07:44:47 +0000
From: Eric Wong
To: Bjorn Helgaas
Cc: meta@public-inbox.org
Subject: Re: Threading in git repo?
Message-ID: <20190314074447.GA8156@dcvr>
References: <20190313230707.GB210027@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190313230707.GB210027@google.com>
List-Id:

Bjorn Helgaas wrote:
> Hi Eric,
>
> As far as I can tell, pi git repos have no branching: each new message
> is added as a child commit of the most recent message, even if it is a
> response to an older message.  Have you considered making the new
> message a child of the message it is responding to?

Correct, there is no branching.  Doing threading in git does not
work because of out-of-order message delivery (which is common
in SMTP).

public-inbox-index scanning, along with notmuch and mairix, is
resilient to out-of-order message delivery when doing threading.

> I'm fiddling with making neomutt read a pi git repo.  Currently I only
> read the git log info (not the commit bodies).  It's pretty fast to
> read the author, date, and subject (since you conveniently stash them
> in the commit metadata), but since I'm not reading the mail headers,
> neomutt can't do all its threading magic.

neomutt could read the over.sqlite3 database...  However, I
can't guarantee its stability, either (since it's in the
"xap$VER" directory where $VER is 15, now).

Perhaps improving NNTP support in neomutt is the best way to go?
public-inbox-nntpd has room for improvement, too (see TODO).

> It seems like working out the threading could be done once at the time
> the message is added to the git repo, and threads could appear as
> branches in the repo.

Not really.  It'd still have to support "ghost" messages to
account for out-of-order message delivery; and the threading
logic can be improved and tweaked:

  https://public-inbox.org/meta/20190129075644.3917-1-e@80x24.org/

If the git commit messages all had key headers
(Message-ID/From/To/Cc/References/In-Reply-To/Subject), then
yes: a SQLite/Xapian-agnostic client could be taught to read
them and do threading based on that, with fewer git ODB accesses.

I don't think it's worth introducing at this time, though.
NNTP seems the best and least-fragile way forward.
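
To illustrate why "ghost" messages are needed, here is a minimal
sketch in Python.  It is not public-inbox's actual (Perl)
implementation: the Node and Threader names are hypothetical, and
for brevity it only follows In-Reply-To, whereas a real threader
would also walk the References header.  When a reply arrives
before its parent, a placeholder (ghost) node is created for the
parent and filled in once the real message shows up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Node:
    """One message in a thread tree (hypothetical structure)."""
    mid: str                          # Message-ID
    subject: Optional[str] = None     # None while the node is a ghost
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    @property
    def is_ghost(self) -> bool:
        return self.subject is None


class Threader:
    """Builds thread trees from messages arriving in any order."""

    def __init__(self) -> None:
        self.by_mid: Dict[str, Node] = {}

    def _get(self, mid: str) -> Node:
        # Return the node for this Message-ID, creating a ghost if unseen.
        node = self.by_mid.get(mid)
        if node is None:
            node = Node(mid)
            self.by_mid[mid] = node
        return node

    def add(self, mid: str, subject: str,
            in_reply_to: Optional[str] = None) -> Node:
        # Fill in a ghost created earlier, or create a fresh node.
        node = self._get(mid)
        node.subject = subject
        # Link to the parent once; ignore self-references.
        if in_reply_to and in_reply_to != mid and node.parent is None:
            parent = self._get(in_reply_to)   # may create a ghost
            node.parent = parent
            parent.children.append(node)
        return node

    def roots(self) -> List[Node]:
        return [n for n in self.by_mid.values() if n.parent is None]


if __name__ == "__main__":
    t = Threader()
    # The reply is delivered first; its parent exists only as a ghost.
    t.add("<b@example>", "Re: hello", in_reply_to="<a@example>")
    print(t.by_mid["<a@example>"].is_ghost)
    # The parent arrives later and replaces the ghost in place.
    t.add("<a@example>", "hello")
    print(t.by_mid["<a@example>"].is_ghost)
```

A git history cannot be amended this way without rewriting commits,
which is the reason the sketch keeps threading state outside of git.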