Date: Tue, 22 Dec 2020 23:21:18 +0100
From: Uwe Kleine-König
To: meta@public-inbox.org
Subject: Re: About header filtering
Message-ID: <20201222222118.i4bioeo7l6iuf3pk@pengutronix.de>
In-Reply-To: <20201222162828.wir7sfelqmy2mzrr@chatter.i7.local>
References: <20201222073704.u7hacjk5m7mpuc52@pengutronix.de> <20201222162828.wir7sfelqmy2mzrr@chatter.i7.local>

Hello Konstantin,

On Tue, Dec 22, 2020 at 11:28:28AM -0500, Konstantin Ryabitsev wrote:
> On Tue, Dec 22, 2020 at 08:37:04AM +0100, Uwe Kleine-König wrote:
> > I found that Konstantin Ryabitsev's tool to prepare an initial archive
> > from an already existing mailing list[1] filters some of these out, but
> > the instance on kernel.org has some of these details, too. (See for
> > example
> > https://lore.kernel.org/lkml/20201013082132.661993-1-u.kleine-koenig@pengutronix.de/raw;
> > there are Return-Path: and also some Received: headers that I consider
> > not-so-nice as they were added after the mail was processed by the
> > mailing list tool on vger.kernel.org.)
> >
> > Is it considered bad to filter these out? Or is it just that nobody
> > wanted this kind of cleanliness before in such a setup?
>
> The reason we don't do any filtering after receiving the mail on the archiver
> system is two-fold:
>
> 1. we don't know if any of the Received: lines are part of any DKIM/ARC
>    signatures (they shouldn't be -- it's wrong to include them, but I've seen
>    this happen).

Note I don't intend to throw away all Received lines, only the ones
concerning the hops after the mailing list server. These cannot be
signed using DKIM unless the mailing list subscription goes to an
address that is forwarded and the forwarding server signs the Received
lines.

> 2. the goal of lore.kernel.org is maximum transparency, so we include
>    everything that our own systems add to the headers in an attempt to show
>    that "there's nothing up our sleeves"
>
> > I could handcraft a preprocessor[2] but I assume that a solution in
> > public-inbox itself would find some users?!
>
> I don't know if this should be part of public-inbox -- a simple procmail
> script would work. I know procmail isn't very actively developed these days,
> but it's also extremely robust and handles almost anything you can throw at
> it, which is an important advantage when it comes to a format like email.

Procmail doesn't help here (unless I'm missing something). Well, it
allows calling a filter, but doesn't filter itself. Currently I'm
experimenting with formail (which is called by procmail), but formail
cannot throw away only selected Received lines.

Best regards and thanks for your input,
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
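
To illustrate the kind of preprocessor discussed above (dropping only the
Received: lines for hops after the mailing list server, plus Return-Path:),
here is a minimal Python sketch. It is not part of public-inbox or procmail.
The function names and the list_host parameter are hypothetical, and it
assumes the message uses LF line endings and that the list server shows up as
"by <list_host>" in its own Received: line.

```python
import re


def split_headers(raw: bytes):
    """Split a raw message into header fields and the remainder.

    Folded continuation lines stay attached to their field. The remainder
    starts with the blank line that separates headers from the body.
    """
    head, sep, body = raw.partition(b"\n\n")
    fields = []
    for line in head.split(b"\n"):
        if line[:1] in (b" ", b"\t") and fields:
            fields[-1] += b"\n" + line
        else:
            fields.append(line)
    return fields, sep + body


def drop_post_list_hops(raw: bytes, list_host: bytes) -> bytes:
    """Remove Received:/Return-Path: fields added after the list server.

    Received: fields are prepended by each hop, so everything above the
    first field that was added "by <list_host>" is from a later hop.
    If no such field is found, the message is returned unchanged.
    """
    fields, rest = split_headers(raw)
    by_list = re.compile(
        rb"\bby\s+" + re.escape(list_host) + rb"(?![\w.-])", re.IGNORECASE
    )
    cut = None
    for i, field in enumerate(fields):
        if field.lower().startswith(b"received:") and by_list.search(field):
            cut = i
            break
    if cut is None:
        return raw
    unwanted = (b"received:", b"return-path:")
    kept = [
        field
        for i, field in enumerate(fields)
        if i >= cut or not field.lower().startswith(unwanted)
    ]
    return b"\n".join(kept) + rest
```

Such a function could be wrapped in a small script that reads the message from
sys.stdin.buffer and writes the result to sys.stdout.buffer, so that procmail
can call it as a filter. Working on raw bytes instead of a parsed message is a
deliberate choice here: all headers and the body that are kept stay
byte-for-byte identical, which matters for DKIM signatures.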