user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: Konstantin Ryabitsev <mricon@kernel.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	tools@linux.kernel.org, stable@vger.kernel.org,
	meta@public-inbox.org, sashal@kernel.org,
	gregkh@linuxfoundation.org, krzk@kernel.org
Subject: Re: filtering stable patches in lore queries
Date: Wed, 8 May 2024 11:33:14 +0000
Message-ID: <20240508113314.M238016@dcvr>
In-Reply-To: <20240429-antique-hyena-of-glee-d9e4ac@lemur>

Konstantin Ryabitsev <mricon@kernel.org> wrote:
> On Sat, Apr 27, 2024 at 07:19:21AM GMT, Eric Wong wrote:
> > Correct, public-inbox currently won't index every header due to
> > cost, false positives, and otherwise lack of usefulness (general
> > gibberish from DKIM sigs, various UUIDs, etc).
> > 
> > So it doesn't currently know about "X-stable:"
> > 
> > I started working on making headers indexing configurable last
> > year, but didn't hear a response from the person that
> > potentially was interested:
> > 
> > https://public-inbox.org/meta/20231120032132.M610564@dcvr/
> > 
> > Right now, indexing new headers + validations can be maintained
> > as a Perl module in the public-inbox codebase.
> > 
> > For lore, it'd make sense to be able to configure a bunch (or
> > all) inboxes at once instead of the per-inbox configuration in
> > my proposed RFC.
> > 
> > At minimum, one would have to know:
> > 
> > 1) the mail header name (e.g. `X-stable')
> > 2) the search prefix to use (e.g. `xstable:') # can't use dash `-' AFAIK
> > 3) the type of header value (phrase, string, sortable numeric, etc...)
> 
> I'm whole-heartedly for this! This ties nicely to my b4 work where I'd 
> like to be able to identify code-review trailers sent for a specific 
> patch, even if that patch itself is not on lore. For example, this could 
> be a patch that is part of a pull-request on a git forge, but we'd still 
> like to be able to collect and find code-review trailers for it when a 
> maintainer applies it.

OK, a more configurable version is available on a per-inbox basis:

https://public-inbox.org/meta/20240508110957.3108196-1-e@80x24.org/

But that's a PITA to configure with hundreds of inboxes and
doesn't have extindex support, yet.

I made it share logic with the old altid code; so I'll also be
getting altid into extindex since ISTR users wanting to be able
to look up gmane stuff via extindex.

And it also works with the new C++ xap_helper process
(which I'll use for threadid: support (still working on that...)).

> I'm perfectly fine with it only being a string, honestly.

Yeah, though there's 3 ways of indexing strings, currently :x
I've decided to keep some options open and support boolean_term,
text, and phrase for now.

boolean_term is the cheapest and probably best for exactly
matching labels/enums and such.  The others may work better
for more complex texts (comma-delimited labels, maybe).
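
As a rough illustration of the difference between the three modes, here
is a toy in-memory index in Python.  It is not public-inbox or Xapian
code; `ToyIndex`, its methods, and the `xlabels` prefix are hypothetical
names made up for this sketch.  It assumes boolean_term stores the whole
header value as one exact-match term, text splits the value into words
without recording positions, and phrase splits it into words and records
positions so that word order can be matched.

```python
import re
from collections import defaultdict


class ToyIndex:
    """Hypothetical toy index; only mimics the three string modes."""

    def __init__(self):
        self.terms = defaultdict(set)       # term -> set of doc ids
        self.positions = defaultdict(dict)  # term -> {doc id: [positions]}

    def add(self, doc_id, prefix, mode, value):
        if mode == "boolean_term":
            # The whole value becomes a single exact-match term.
            self.terms[prefix + ":" + value.strip()].add(doc_id)
            return
        # "text" and "phrase" both split the value into lowercase words.
        words = re.findall(r"[a-z0-9]+", value.lower())
        for pos, word in enumerate(words):
            term = prefix + ":" + word
            self.terms[term].add(doc_id)
            if mode == "phrase":
                # Only "phrase" pays for storing word positions.
                self.positions[term].setdefault(doc_id, []).append(pos)

    def match_term(self, prefix, value):
        """Documents containing exactly this term."""
        return set(self.terms.get(prefix + ":" + value, ()))

    def match_phrase(self, prefix, phrase):
        """Documents where the words appear adjacent and in order."""
        words = re.findall(r"[a-z0-9]+", phrase.lower())
        if not words:
            return set()
        first = self.positions.get(prefix + ":" + words[0], {})
        hits = set()
        for doc_id, starts in first.items():
            for start in starts:
                if all(
                    start + i
                    in self.positions.get(prefix + ":" + w, {}).get(doc_id, [])
                    for i, w in enumerate(words)
                ):
                    hits.add(doc_id)
                    break
        return hits


idx = ToyIndex()
# (header name, search prefix, type) is the triple discussed above;
# the header values here are invented examples.
idx.add(1, "xstable", "boolean_term", "review")
idx.add(2, "xlabels", "text", "net, stable, urgent fix")
idx.add(3, "xlabels", "phrase", "net, stable, urgent fix")
```

In this sketch, `idx.match_term("xstable", "review")` finds message 1
only by its exact value, `idx.match_term("xlabels", "stable")` finds
both 2 and 3 by a single word, and `idx.match_phrase("xlabels",
"urgent fix")` finds only 3, since the text-indexed message 2 has no
positions to check word order against.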

> > So probably just supporting strings and/or phrases to start...
> > 
> > Validation to prevent poisoning by malicious/broken senders can
> > be useful in some cases (and the reason the RFC was a per use
> > case Perl module).  That said, I'm not sure if much validation
> > is necessary for X-stable: headers or if just any text is fine.
> 
> I'd let the consumer clients worry about it.

Agreed.


Thread overview: 6+ messages
2024-04-27  0:28 filtering stable patches in lore queries Jason A. Donenfeld
2024-04-27  7:19 ` Eric Wong
2024-04-29 14:27   ` Konstantin Ryabitsev
2024-05-08 11:33     ` Eric Wong [this message]
2024-05-08 17:01       ` Konstantin Ryabitsev
2024-05-08 17:09         ` Eric Wong
