Hello Konstantin,

On Tue, Dec 22, 2020 at 11:28:28AM -0500, Konstantin Ryabitsev wrote:
> On Tue, Dec 22, 2020 at 08:37:04AM +0100, Uwe Kleine-König wrote:
> > I found that Konstantin Ryabitsev's tool to prepare an initial archive
> > from an already existing mailing list[1] filters some of these out, but
> > the instance on kernel.org has some of these details, too. (See for
> > example
> > https://lore.kernel.org/lkml/20201013082132.661993-1-u.kleine-koenig@pengutronix.de/raw;
> > there are Return-Path: and also some Received: headers that I consider
> > not-so-nice as they were added after the mail was processed by the
> > mailing list tool on vger.kernel.org.)
> >
> > Is it considered bad to filter these out? Or is it just that nobody
> > wanted this kind of cleanliness before in such a setup?
>
> The reason we don't do any filtering after receiving the mail on the archiver
> system is two-fold:
>
> 1. we don't know if any of the Received: lines are part of any DKIM/ARC
>    signatures (they shouldn't be -- it's wrong to include them, but I've seen
>    this happen).

Note I don't intend to throw away all Received lines, only the ones
concerning the hops after the mailing list server. These cannot be
signed using DKIM unless the mailing list subscription goes to an
address that is forwarded and the forwarding server signs the Received
lines.

> 2. the goal of lore.kernel.org is maximum transparency, so we include
>    everything that our own systems add to the headers in an attempt to show
>    that "there's nothing up our sleeves"
>
> > I could handcraft a preprocessor[2] but I assume that a solution in
> > public-inbox itself would find some users?!
>
> I don't know if this should be part of public-inbox -- a simple procmail
> script would work. I know procmail isn't very actively developed these days,
> but it's also extremely robust and handles almost anything you can throw at
> it, which is an important advantage when it comes to a format like email.

Procmail doesn't help here (unless I'm missing something). Well, it
allows calling a filter, but doesn't filter itself. Currently I'm
experimenting with formail (which is called by procmail), but formail
cannot throw away only selected Received lines.

Best regards and thanks for your input,
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
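
For illustration, the kind of preprocessor discussed above could look
roughly like the following Python sketch. It is not part of public-inbox
or procmail. The function name strip_post_list_hops, the marker string
"by vger.kernel.org" and the set of dropped header names are assumptions
made for this example. The idea is that trace headers are prepended, so
everything above the topmost Received line written by the list server
was added by a later hop. The sketch works on raw bytes and leaves all
remaining headers byte-for-byte untouched.

```python
# Assumption: the list server's own hop is recognisable by this substring.
LIST_HOP_MARKER = b"by vger.kernel.org"
# Assumption: only these headers are dropped when they sit above that hop.
DROP_NAMES = {b"received", b"return-path"}


def split_fields(header_lines):
    """Group raw header lines into fields, keeping folded lines together."""
    fields = []
    for line in header_lines:
        if line[:1] in (b" ", b"\t") and fields:
            fields[-1] += line  # continuation of the previous field
        else:
            fields.append(line)
    return fields


def field_name(field):
    """Return the lower-cased header name of a raw field."""
    return field.split(b":", 1)[0].strip().lower()


def strip_post_list_hops(raw):
    """Drop Received/Return-Path fields added after the list server hop."""
    lines = raw.splitlines(keepends=True)
    # The header block ends at the first empty line.
    end = next((i for i, l in enumerate(lines) if l in (b"\n", b"\r\n")),
               len(lines))
    fields = split_fields(lines[:end])

    # Topmost Received field written by the list server itself.
    # Whitespace is normalised so that folding does not hide the marker.
    hop = next((i for i, f in enumerate(fields)
                if field_name(f) == b"received"
                and LIST_HOP_MARKER in b" ".join(f.split())),
               None)
    if hop is None:
        return raw  # no list server hop found: leave the mail untouched

    kept = [f for i, f in enumerate(fields)
            if i >= hop or field_name(f) not in DROP_NAMES]
    return b"".join(kept) + b"".join(lines[end:])
```

Wrapped in a small script that reads sys.stdin.buffer and writes
sys.stdout.buffer, this could be called as a filter from procmail. If the
marker is not found, the mail is passed through unchanged.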