user/dev discussion of public-inbox itself
From: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
To: meta@public-inbox.org
Subject: Re: About header filtering
Date: Tue, 22 Dec 2020 23:21:18 +0100
Message-ID: <20201222222118.i4bioeo7l6iuf3pk@pengutronix.de>
In-Reply-To: <20201222162828.wir7sfelqmy2mzrr@chatter.i7.local>

Hello Konstantin,

On Tue, Dec 22, 2020 at 11:28:28AM -0500, Konstantin Ryabitsev wrote:
> On Tue, Dec 22, 2020 at 08:37:04AM +0100, Uwe Kleine-König wrote:
> > I found that Konstantin Ryabitsev's tool to prepare an initial archive
> > from an already existing mailing list[1] filters some of these out, but
> > the instance on kernel.org has some of these details, too. (See for
> > example
> > https://lore.kernel.org/lkml/20201013082132.661993-1-u.kleine-koenig@pengutronix.de/raw;
> > there are Return-Path: and also some Received: headers that I consider
> > not-so-nice as they were added after the mail was processed by the
> > mailing list tool on vger.kernel.org.)
> > 
> > Is it considered bad to filter these out? Or is it just that nobody
> > wanted this kind of cleanliness before in such a setup?
> 
> The reason we don't do any filtering after receiving the mail on the archiver
> system is two-fold:
> 
> 1. we don't know if any of the Received: lines are part of any DKIM/ARC
>    signatures (they shouldn't be -- it's wrong to include them, but I've seen
>    this happen).

Note that I don't intend to throw away all Received: lines, only those
added by hops after the mailing list server. These cannot be covered by
a DKIM signature unless the list subscription goes to an address that
is forwarded and the forwarding server signs over the Received: lines.
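
Whether a given signature covers Received: at all can be read from its
h= tag, which lists the signed header names separated by colons. A
minimal sketch of that check follows; the function name and the sample
signature value are made up for illustration, and only the h= tag is
inspected (ARC headers are not considered).

```python
def dkim_signed_headers(signature_value: str) -> list[str]:
    """Return the lower-cased header names from the h= tag of a
    DKIM-Signature header value (tags are separated by semicolons)."""
    for tag in signature_value.split(";"):
        name, sep, value = tag.partition("=")
        if sep and name.strip() == "h":
            # h= is a colon-separated list; folding whitespace is allowed.
            return [h.strip().lower() for h in value.split(":") if h.strip()]
    return []


# Hypothetical signature value, not taken from a real message.
sample = (
    "v=1; a=rsa-sha256; d=example.org; s=sel;\r\n"
    "\th=From:To:Subject:\r\n"
    "\t Received:Date; bh=abc=; b=def=="
)
covers_received = "received" in dkim_signed_headers(sample)
```

If `covers_received` is true for any signature on a message, removing
Received: lines risks invalidating that signature.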

> 2. the goal of lore.kernel.org is maximum transparency, so we include
>    everything that our own systems add to the headers in an attempt to show
>    that "there's nothing up our sleeves"
> 
> > I could handcraft a preprocessor[2] but I assume that a solution in
> > public-inbox itself would find some users?!
> 
> I don't know if this should be part of public-inbox -- a simple procmail
> script would work. I know procmail isn't very actively developed these days,
> but it's also extremely robust and handles almost anything you can throw at
> it, which is an important advantage when it comes to a format like email.

Procmail doesn't help here (unless I'm missing something). It can call
an external filter, but it doesn't filter headers itself. I'm currently
experimenting with formail (called from procmail), but formail cannot
discard only selected Received: lines.
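
What I have in mind could be sketched roughly as below. This is only an
illustration, not something public-inbox or formail provides. It relies
on each hop prepending its Received: line, so everything above the list
server's own Received: line was added after the list. The names
`drop_post_list_trace` and `TRACE_NAMES` are made up, and matching the
list server by a "by <host>" clause is an assumption that will not hold
for every setup.

```python
import re

# Header names treated as delivery trace; an assumption, adjust as needed.
TRACE_NAMES = {b"received", b"return-path"}


def split_fields(header_block: bytes) -> list[bytes]:
    """Split a raw header block into fields, keeping folded lines attached."""
    fields: list[bytes] = []
    for line in header_block.splitlines(keepends=True):
        if fields and line[:1] in (b" ", b"\t"):
            fields[-1] += line  # continuation of the previous field
        else:
            fields.append(line)
    return fields


def field_name(field: bytes) -> bytes:
    """Lower-cased header name of a raw field."""
    return field.split(b":", 1)[0].strip().lower()


def drop_post_list_trace(raw: bytes, list_host: bytes) -> bytes:
    """Drop trace headers that sit above the list server's Received: line."""
    # Split at the first blank line; the header block keeps its final newline.
    m = re.search(rb"(\r?\n)\r?\n", raw)
    head, rest = (raw[:m.end(1)], raw[m.end(1):]) if m else (raw, b"")
    fields = split_fields(head)

    by_list = re.compile(rb"\sby\s+" + re.escape(list_host) + rb"\b")
    cut = next(
        (i for i, f in enumerate(fields)
         if field_name(f) == b"received" and by_list.search(f)),
        None,
    )
    if cut is None:
        return raw  # list server not found: leave the message untouched

    kept = [f for i, f in enumerate(fields)
            if i >= cut or field_name(f) not in TRACE_NAMES]
    return b"".join(kept) + rest
```

A wrapper that reads the message from stdin and writes the result to
stdout would make this usable as the filter that procmail calls. The
message is left untouched when no matching Received: line is found, so
a misconfigured host name does not silently strip headers.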

Best regards and thanks for your input,
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


Thread overview: 5+ messages
2020-12-22  7:37 About header filtering Uwe Kleine-König
2020-12-22 16:28 ` Konstantin Ryabitsev
2020-12-22 22:21   ` Uwe Kleine-König [this message]
2020-12-22 23:11     ` Eric Wong
2020-12-23 17:57     ` Konstantin Ryabitsev
