user/dev discussion of public-inbox itself
* Re: [PATCH] searchidx: index "diff --git a/... b/..." headers
  2021-11-09  4:03 10%             ` Rob Herring
@ 2021-11-09  5:08  9%               ` Eric Wong
  0 siblings, 0 replies; 5+ results
From: Eric Wong @ 2021-11-09  5:08 UTC (permalink / raw)
  To: Rob Herring; +Cc: Konstantin Ryabitsev, meta

Rob Herring <robh@kernel.org> wrote:
> On Mon, Nov 8, 2021 at 9:12 PM Eric Wong <e@80x24.org> wrote:
> > I think 's:patch' should be sufficient, don't think there's
> > many false-positives on that front, actually.
> 
> It's at least 's:patch OR s:rfc OR s:resend'. That catches all but the
> few creative folks that come up with something else.
> 
> > With this fix, nq:"diff --git" should also be working across
> > https://yhbt.net/lore/ in about 40 hours (whenever reindex
> > finishes)
> 
> 'diff --git' should cover probably 99.9% of patches but there are
> still some non-git diffs from time to time.

OK, so maybe the combination of:

	s:patch OR s:rfc OR s:resend OR nq:"diff --git"

Is enough?  Maybe it would be good to support some form of alias
expansion in the Xapian query parser for "common" things like
that.  I know there's a few not-seriously-proposed patches which
may lack all of those, but perhaps they weren't meant to be
applied, either...
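
A minimal sketch of what such an alias expansion might look like if done
as a pre-parse string substitution. This is not how public-inbox or Xapian
does it: the "is:patch" alias name, the ALIASES table, and expand_aliases()
are all hypothetical, and only the expanded query comes from this thread.

```python
import re

# Hypothetical alias table; "is:patch" is an invented name for illustration.
# The expansion is the query combination proposed in this thread.
ALIASES = {
    "is:patch": '(s:patch OR s:rfc OR s:resend OR nq:"diff --git")',
}

# With one capturing group, re.split() returns unquoted text at even
# indices and double-quoted phrases at odd indices.
_QUOTED = re.compile(r'("[^"]*")')


def expand_aliases(query: str) -> str:
    """Expand whitespace-delimited aliases outside double-quoted phrases."""
    parts = _QUOTED.split(query)
    for i, part in enumerate(parts):
        if i % 2 == 1:
            continue  # quoted phrase: leave untouched
        for alias, expansion in ALIASES.items():
            # The alias must stand alone as a whitespace-delimited token.
            pattern = r"(?<!\S)" + re.escape(alias) + r"(?!\S)"
            # A callable replacement avoids backslash processing.
            part = re.sub(pattern, lambda _m, e=expansion: e, part)
        parts[i] = part
    return "".join(parts)
```

This shares the fragility of any string-substitution approach: for
instance, an alias written directly against a parenthesis, as in
"(is:patch)", is not expanded by this sketch.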

Fwiw, it would also be useful to have it expand .mailmap and
sendemail.aliasesfile entries, too.

Unfortunately, I think doing aliases cleanly requires dropping
down to C++ to supply custom routines to Xapian.  The current
approxidate parsing is all done via fragile string
substitutions; I'm not sure how that holds up...

> > I'm not sure if there needs to be a specific term to index
> > patches on; maybe there is.  There's still a lot of Xapian
> > we're not using, yet...
> 
> What I'm hoping to get to is a replacement for patchwork in my
> workflow. For that I want all patches which don't have either a
> Reviewed/Acked tag from me or a reply from me. I think the first part
> should be possible with lei, but I'd imagine the last part is some
> processing on top of the lei query.

Yes, exactly.  Powerful-enough local search should be able to
replace many web-based tools.  Patch indexing could take into
account git trailers, but maybe existing 'nq:' phrases are
enough *shrug*
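
The "processing on top of the lei query" part could be sketched roughly as
below, using only the Python standard library. Everything here is an
assumption made for illustration, not lei behavior: needs_attention() is a
hypothetical helper, a "patch" is taken to be any message whose body has a
line starting with "diff --git", a tag only counts if it appears in the
patch body itself, and replies are matched through In-Reply-To alone.

```python
import re
from email.message import EmailMessage
from email.utils import parseaddr


def _text(msg: EmailMessage) -> str:
    """Return the text/plain body of a message, or "" if there is none."""
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else ""


def needs_attention(messages, me):
    """Return Message-IDs of patches with no tag and no reply from `me`."""
    me = me.lower()
    # Reviewed-by:/Acked-by: trailer carrying my address
    tag_re = re.compile(
        r"^(?:Reviewed|Acked)-by:.*<" + re.escape(me) + r">",
        re.IGNORECASE | re.MULTILINE,
    )
    diff_re = re.compile(r"^diff --git ", re.MULTILINE)

    # First pass: collect the Message-IDs that I replied to directly.
    replied_to = set()
    for msg in messages:
        sender = parseaddr(str(msg.get("From", "")))[1].lower()
        parent = str(msg.get("In-Reply-To", "")).strip()
        if sender == me and parent:
            replied_to.add(parent)

    # Second pass: keep patches that have neither a tag nor a reply.
    pending = []
    for msg in messages:
        body = _text(msg)
        mid = str(msg.get("Message-ID", "")).strip()
        if not diff_re.search(body):
            continue  # not a patch under the assumption above
        if tag_re.search(body) or mid in replied_to:
            continue
        pending.append(mid)
    return pending
```

The messages could come from any mbox or Maildir that a search wrote out,
parsed with the modern email policy so that get_body() is available.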

lei is still primitive, but things like "lei p2q" can be
combined to do some patch tracking.  An example from the
lei-p2q(1) manpage:

         # to view unapplied patches for a given $FILE from the past year:
         echo \( rt:last.year.. AND dfn:$FILE \) AND NOT \( \
               $(git log -p --pretty=mboxrd --since=last.year $FILE |
                       lei p2q -F mboxrd ) \
               \) | lei q -o /tmp/unapplied

I do need some time to consider future features and non-bugfix
stuff to lei+public-inbox, though.  There's already a huge
amount of stuff in the TODO and a billion things in my life
that also need fixing :<

* Re: [PATCH] searchidx: index "diff --git a/... b/..." headers
  2021-11-09  3:12 10%           ` Eric Wong
@ 2021-11-09  4:03 10%             ` Rob Herring
  2021-11-09  5:08  9%               ` Eric Wong
  0 siblings, 1 reply; 5+ results
From: Rob Herring @ 2021-11-09  4:03 UTC (permalink / raw)
  To: Eric Wong; +Cc: Konstantin Ryabitsev, meta

On Mon, Nov 8, 2021 at 9:12 PM Eric Wong <e@80x24.org> wrote:
>
> Rob Herring <robh@kernel.org> wrote:
> > On Mon, Nov 8, 2021 at 3:27 PM Eric Wong <e@80x24.org> wrote:
> > >
> > > Rob Herring <robh@kernel.org> wrote:
> > > > On Mon, Nov 8, 2021 at 2:22 PM Konstantin Ryabitsev
> > > > > I think 's:patch AND nq:diff' is a good option here.
> > > >
> > > > Not even close really. That mainly finds my replies with 'diff' in
> > > > them. I'm not sure why, but it misses most actual patches:
> > > >
> > > > https://lore.kernel.org/all/?q=s%3Apatch+nq%3Adiff+f%3Arobh%40kernel.org
> > >
> > > Actually, it looks like nq:diff never works.  The diff indexer
> > > skips right over 'diff --git a/... b/...' lines :x
> >
> > Never works for 'diff' being a patch? Because it works very well
> > finding all the other cases.
>
> Yeah, the index_diff() code path ignored the "diff --git" phrase
> before this patch.
>
> > > The following should fix it, but reindexing is necessary.
> > > ---------8<----------
> > > Subject: [PATCH] searchidx: index "diff --git a/... b/..." headers
> > >
> > > While we do detailed indexing of git diffs, the header itself
> > > was failing and queries like 'nq:diff' would not work.
> >
> > Any thoughts on supporting an 'is a patch' type query?
>
> I think 's:patch' should be sufficient, don't think there's
> many false-positives on that front, actually.

It's at least 's:patch OR s:rfc OR s:resend'. That catches all but the
few creative folks that come up with something else.

> With this fix, nq:"diff --git" should also be working across
> https://yhbt.net/lore/ in about 40 hours (whenever reindex
> finishes)

'diff --git' should cover probably 99.9% of patches but there are
still some non-git diffs from time to time.

> I'm not sure if there needs to be a specific term to index
> patches on; maybe there is.  There's still a lot of Xapian
> we're not using, yet...

What I'm hoping to get to is a replacement for patchwork in my
workflow. For that I want all patches which don't have either a
Reviewed/Acked tag from me or a reply from me. I think the first part
should be possible with lei, but I'd imagine the last part is some
processing on top of the lei query.

Rob

* Re: [PATCH] searchidx: index "diff --git a/... b/..." headers
  2021-11-09  0:38 10%         ` Rob Herring
@ 2021-11-09  3:12 10%           ` Eric Wong
  2021-11-09  4:03 10%             ` Rob Herring
  0 siblings, 1 reply; 5+ results
From: Eric Wong @ 2021-11-09  3:12 UTC (permalink / raw)
  To: Rob Herring; +Cc: Konstantin Ryabitsev, meta

Rob Herring <robh@kernel.org> wrote:
> On Mon, Nov 8, 2021 at 3:27 PM Eric Wong <e@80x24.org> wrote:
> >
> > Rob Herring <robh@kernel.org> wrote:
> > > On Mon, Nov 8, 2021 at 2:22 PM Konstantin Ryabitsev
> > > > I think 's:patch AND nq:diff' is a good option here.
> > >
> > > Not even close really. That mainly finds my replies with 'diff' in
> > > them. I'm not sure why, but it misses most actual patches:
> > >
> > > https://lore.kernel.org/all/?q=s%3Apatch+nq%3Adiff+f%3Arobh%40kernel.org
> >
> > Actually, it looks like nq:diff never works.  The diff indexer
> > skips right over 'diff --git a/... b/...' lines :x
> 
> Never works for 'diff' being a patch? Because it works very well
> finding all the other cases.

Yeah, the index_diff() code path ignored the "diff --git" phrase
before this patch.

> > The following should fix it, but reindexing is necessary.
> > ---------8<----------
> > Subject: [PATCH] searchidx: index "diff --git a/... b/..." headers
> >
> > While we do detailed indexing of git diffs, the header itself
> > was failing and queries like 'nq:diff' would not work.
> 
> Any thoughts on supporting an 'is a patch' type query?

I think 's:patch' should be sufficient, don't think there's
many false-positives on that front, actually.

With this fix, nq:"diff --git" should also be working across
https://yhbt.net/lore/ in about 40 hours (whenever reindex
finishes)

I'm not sure if there needs to be a specific term to index
patches on; maybe there is.  There's still a lot of Xapian
we're not using, yet...

* Re: [PATCH] searchidx: index "diff --git a/... b/..." headers
  2021-11-08 21:27 14%       ` [PATCH] searchidx: index "diff --git a/... b/..." headers Eric Wong
@ 2021-11-09  0:38 10%         ` Rob Herring
  2021-11-09  3:12 10%           ` Eric Wong
  0 siblings, 1 reply; 5+ results
From: Rob Herring @ 2021-11-09  0:38 UTC (permalink / raw)
  To: Eric Wong; +Cc: Konstantin Ryabitsev, meta

On Mon, Nov 8, 2021 at 3:27 PM Eric Wong <e@80x24.org> wrote:
>
> Rob Herring <robh@kernel.org> wrote:
> > On Mon, Nov 8, 2021 at 2:22 PM Konstantin Ryabitsev
> > > I think 's:patch AND nq:diff' is a good option here.
> >
> > Not even close really. That mainly finds my replies with 'diff' in
> > them. I'm not sure why, but it misses most actual patches:
> >
> > https://lore.kernel.org/all/?q=s%3Apatch+nq%3Adiff+f%3Arobh%40kernel.org
>
> Actually, it looks like nq:diff never works.  The diff indexer
> skips right over 'diff --git a/... b/...' lines :x

Never works for 'diff' being a patch? Because it works very well
finding all the other cases.

> The following should fix it, but reindexing is necessary.
> ---------8<----------
> Subject: [PATCH] searchidx: index "diff --git a/... b/..." headers
>
> While we do detailed indexing of git diffs, the header itself
> was failing and queries like 'nq:diff' would not work.

Any thoughts on supporting an 'is a patch' type query?

Rob

* [PATCH] searchidx: index "diff --git a/... b/..." headers
  @ 2021-11-08 21:27 14%       ` Eric Wong
  2021-11-09  0:38 10%         ` Rob Herring
  0 siblings, 1 reply; 5+ results
From: Eric Wong @ 2021-11-08 21:27 UTC (permalink / raw)
  To: Rob Herring; +Cc: Konstantin Ryabitsev, meta

Rob Herring <robh@kernel.org> wrote:
> On Mon, Nov 8, 2021 at 2:22 PM Konstantin Ryabitsev
> > I think 's:patch AND nq:diff' is a good option here.
> 
> Not even close really. That mainly finds my replies with 'diff' in
> them. I'm not sure why, but it misses most actual patches:
> 
> https://lore.kernel.org/all/?q=s%3Apatch+nq%3Adiff+f%3Arobh%40kernel.org

Actually, it looks like nq:diff never works.  The diff indexer
skips right over 'diff --git a/... b/...' lines :x

The following should fix it, but reindexing is necessary.
---------8<----------
Subject: [PATCH] searchidx: index "diff --git a/... b/..." headers

While we do detailed indexing of git diffs, the header itself
was failing and queries like 'nq:diff' would not work.

Noticed-by: Rob Herring <robh@kernel.org>
---
 lib/PublicInbox/SearchIdx.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index b886ce78..6e2e614c 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -259,6 +259,7 @@ sub index_diff ($$$) {
 		} elsif (m!^diff --git "?[^/]+/.+ "?[^/]+/.+\z!) {
 			# wait until "---" and "+++" to capture filenames
 			$in_diff = 1;
+			push @xnq, $_;
 		# traditional diff:
 		} elsif (m/^diff -(.+) (\S+) (\S+)$/) {
 			my ($opt, $fa, $fb) = ($1, $2, $3);
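
To see which lines reach the branch the patch touches, here is a rough
Python transliteration of the two patterns visible in the hunk. classify()
and the constant names are hypothetical, and the rest of index_diff() is
not modeled; the only point is that a "diff --git" header takes the first
branch, which is where the patch now also pushes the line onto @xnq.

```python
import re

# Transliterations of the two Perl patterns shown in the hunk.
# Perl's \z (absolute end of string) is spelled \Z in Python.
GIT_DIFF = re.compile(r'^diff --git "?[^/]+/.+ "?[^/]+/.+\Z')
TRAD_DIFF = re.compile(r'^diff -(.+) (\S+) (\S+)$')


def classify(line: str) -> str:
    """Mimic the branch order of the hunk: git header first, then traditional."""
    if GIT_DIFF.match(line):
        return "git"
    if TRAD_DIFF.match(line):
        return "traditional"
    return "other"


if __name__ == "__main__":
    for sample in (
        "diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm",
        "diff -u old.txt new.txt",
        "--- a/lib/PublicInbox/SearchIdx.pm",
    ):
        print(classify(sample), sample, sep="\t")
```

Branch order matters here: a "diff --git a/x b/x" line would also satisfy
the traditional-diff pattern, so it has to be tested first.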

     [not found]     <lorelei.part1.202111051304.mdtebsxahljcrxak@meerkat.local>
     [not found]     ` <CAL_JsqJBh1O3H2-P07AHzVq0x89BoP_N6P=rT5up6=3QyF_B0Q@mail.gmail.com>
2021-11-08 20:22       ` lei: incorrect quoting on saved searches (was Re: lore+lei: getting started) Konstantin Ryabitsev
2021-11-08 20:53         ` Rob Herring
2021-11-08 21:27 14%       ` [PATCH] searchidx: index "diff --git a/... b/..." headers Eric Wong
2021-11-09  0:38 10%         ` Rob Herring
2021-11-09  3:12 10%           ` Eric Wong
2021-11-09  4:03 10%             ` Rob Herring
2021-11-09  5:08  9%               ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
