Date: Thu, 27 Jun 2019 19:33:32 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org
Subject: Re: RFC: marking spam via refs/notes/spam to hide it
Message-ID: <20190627193332.gjzwkuiotp6fgmcf@whir>
References: <20190627184251.GC14570@chatter.i7.local> <20190627185236.2mwuoygclytf5m7x@whir> <20190627185723.GE14570@chatter.i7.local>
In-Reply-To: <20190627185723.GE14570@chatter.i7.local>

Konstantin Ryabitsev wrote:
> On Thu, Jun 27, 2019 at 06:52:36PM +0000, Eric Wong wrote:
> > > I'm reluctant to delete spam because it rebases the repository -- for large
> > > ones this can cause excessive downloads to mirrors. A thought occurred to me
> > > -- would it make sense to just hide spam from the frontend? E.g.:
> > >
> > >   public-inbox-hide linux-kernel message@id
> > >
> > > This would do the following:
> > >
> > > - remove that message from search databases
> > > - attach a refs/notes/spam git-note to that commit
> > > - tell public-inbox-init/reindex to ignore this commit in the future
> >
> > Aside from the git note, public-inbox-learn already does that:
> >
> >   public-inbox-learn spam
> >
> > (scans everything in ~/.public-inbox/config since spam is
> > frequently cross-posted)
>
> Ah, that shows how carefully I read docs, I guess. :) Is it possible to just
> specify a message-id, so that there's no extra step to dump the spam message
> into a file?

Not exactly with the Message-ID arg.
It would be dangerous: somebody malicious could get you to remove a
legit message by sending a spam message which reuses the Message-ID
of that legit message. I'd definitely want to verify a message is
what I want to remove, first.

In theory, you could:

  curl $URL_MESSAGE_ID/raw | public-inbox-learn spam

but that's still dangerous because there are/were legit bots (and
IIRC, old git-send-email) which reused Message-IDs, too.
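
A rough sketch of the "verify first" step, in case it helps: instead of
piping curl straight into public-inbox-learn, save the raw message to a
temporary file, page through it, and only train on an explicit "y".
The function names (confirm_learn, review_spam) and the LEARN variable
are made up for this example and are not part of public-inbox. It also
assumes "public-inbox-learn spam" reads the message on stdin, as the
pipeline above implies. It does not detect a reused Message-ID by
itself; the manual review is the only safeguard.

```shell
#!/bin/sh
# Hypothetical helper: review a message before training it as spam.

# Training command; overridable (e.g. LEARN=echo) for a dry run.
: "${LEARN:=public-inbox-learn}"

# confirm_learn FILE: show FILE, ask for confirmation, then train.
confirm_learn() {
    # Show the saved message so a legit mail is noticed before removal.
    ${PAGER:-less} "$1"
    printf 'Train as spam? [y/N] '
    read -r answer
    case $answer in
    y|Y) $LEARN spam <"$1" ;;
    *) echo 'skipped'; return 1 ;;
    esac
}

# review_spam URL: fetch URL/raw to a temporary file, then confirm_learn it.
review_spam() {
    tmp=$(mktemp) || return 1
    # -f makes curl fail on HTTP errors so an error page is never trained.
    if curl -fsS "$1/raw" >"$tmp"; then
        confirm_learn "$tmp"
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}
```

Usage would be something like: review_spam "$URL_MESSAGE_ID"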