Date: Tue, 22 Dec 2020 11:28:28 -0500
From: Konstantin Ryabitsev
To: Uwe Kleine-König
Cc: meta@public-inbox.org
Subject: Re: About header filtering
Message-ID: <20201222162828.wir7sfelqmy2mzrr@chatter.i7.local>
Mail-Followup-To: Uwe Kleine-König, meta@public-inbox.org
References: <20201222073704.u7hacjk5m7mpuc52@pengutronix.de>
In-Reply-To: <20201222073704.u7hacjk5m7mpuc52@pengutronix.de>

On Tue, Dec 22, 2020 at 08:37:04AM +0100, Uwe Kleine-König wrote:
> I found that Konstantin Ryabitsev's tool to prepare an initial archive
> from an already existing mailing list[1] filters some of these out, but
> the instance on kernel.org has some of these details, too. (See for
> example
> https://lore.kernel.org/lkml/20201013082132.661993-1-u.kleine-koenig@pengutronix.de/raw;
> there are Return-Path: and also some Received: headers that I consider
> not-so-nice as they were added after the mail was processed by the
> mailing list tool on vger.kernel.org.)
>
> Is it considered bad to filter these out? Or is it just that nobody
> wanted this kind of cleanliness before in such a setup?

The reason we don't do any filtering after receiving the mail on the
archiver system is two-fold:

1. we don't know if any of the Received: lines are part of any DKIM/ARC
   signatures (they shouldn't be -- it's wrong to include them, but I've
   seen this happen).
2. the goal of lore.kernel.org is maximum transparency, so we include
   everything that our own systems add to the headers in an attempt to
   show that "there's nothing up our sleeves".

> I could handcraft a preprocessor[2] but I assume that a solution in
> public-inbox itself would find some users?!

I don't know if this should be part of public-inbox -- a simple procmail
script would work. I know procmail isn't very actively developed these
days, but it's also extremely robust and handles almost anything you can
throw at it, which is an important advantage when it comes to a format
like email.

-K
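
For illustration only, here is a minimal sketch of the kind of preprocessor discussed in this thread. It is written in Python with the standard library `email` package, not procmail, and it is not part of public-inbox or of the tooling used on lore.kernel.org. The function name `strip_headers` and the `DROP_HEADERS` list are hypothetical, chosen to match the Return-Path: and Received: headers mentioned above. As noted in point 1, a dropped header may be covered by a DKIM/ARC signature, in which case removing it would invalidate that signature, so treat this as a starting point under those assumptions.

```python
import io
from email import message_from_bytes, policy
from email.generator import BytesGenerator

# Hypothetical drop list: the two header names discussed in the thread.
DROP_HEADERS = ("Return-Path", "Received")


def strip_headers(raw: bytes, drop=DROP_HEADERS) -> bytes:
    """Return the message in `raw` with every header named in `drop` removed."""
    # compat32 keeps header values as plain strings and avoids the
    # structured re-parsing done by the newer policies.
    msg = message_from_bytes(raw, policy=policy.compat32)

    for name in drop:
        # Deleting by name removes all occurrences (matching is
        # case-insensitive) and does nothing if the header is absent.
        del msg[name]

    buf = io.BytesIO()
    # mangle_from_=False: do not rewrite body lines that start with "From ".
    gen = BytesGenerator(buf, mangle_from_=False, policy=policy.compat32)
    gen.flatten(msg)
    return buf.getvalue()
```

A filter like this could sit between the delivery agent and the archive, reading one message and writing the cleaned copy. Note that re-serialising a message may also re-fold long header lines, so it is worth checking the output against any signatures you care about before relying on it.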