References: <20211108202204.q5zg6bachnvbjlnx@meerkat.local> <20211108212714.GA13642@dcvr> <20211109031233.GA19089@dcvr>
In-Reply-To: <20211109031233.GA19089@dcvr>
From: Rob Herring
Date: Mon, 8 Nov 2021 22:03:54 -0600
Subject: Re: [PATCH] searchidx: index "diff --git a/... b/..." headers
To: Eric Wong
Cc: Konstantin Ryabitsev, meta@public-inbox.org

On Mon, Nov 8, 2021 at 9:12 PM Eric Wong wrote:
>
> Rob Herring wrote:
> > On Mon, Nov 8, 2021 at 3:27 PM Eric Wong wrote:
> > >
> > > Rob Herring wrote:
> > > > On Mon, Nov 8, 2021 at 2:22 PM Konstantin Ryabitsev wrote:
> > > > > I think 's:patch AND nq:diff' is a good option here.
> > > >
> > > > Not even close really. That mainly finds my replies with 'diff' in
> > > > them. I'm not sure why, but it misses most actual patches:
> > > >
> > > > https://lore.kernel.org/all/?q=s%3Apatch+nq%3Adiff+f%3Arobh%40kernel.org
> > >
> > > Actually, it looks like nq:diff never works. The diff indexer
> > > skips right over 'diff --git a/... b/...' lines :x
> >
> > Never works for 'diff' being a patch? Because it works very well
> > finding all the other cases.
>
> Yeah, the index_diff() code path ignored the "diff --git" phrase
> before this patch.
>
> > > The following should fix it, but reindexing is necessary.
> > > ---------8<----------
> > > Subject: [PATCH] searchidx: index "diff --git a/... b/..." headers
> > >
> > > While we do detailed indexing of git diffs, the header itself
> > > was failing and queries like 'nq:diff' would not work.
> >
> > Any thoughts on supporting an 'is a patch' type query?
>
> I think 's:patch' should be sufficient, don't think there's
> many false-positives on that front, actually.

It's at least 's:patch OR s:rfc OR s:resend'. That catches all but the
few creative folks that come up with something else.

> With this fix, nq:"diff --git" should also be working across
> https://yhbt.net/lore/ in about 40 hours (whenever reindex
> finishes)

'diff --git' should cover probably 99.9% of patches but there are
still some non-git diffs from time to time.

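For reference, a minimal sketch of how queries like these map onto a lore search URL, assuming the `?q=` parameter seen in the link quoted earlier in this message. The helper name `lore_search_url` is made up for illustration and is not part of public-inbox or lei.

```python
from urllib.parse import urlencode

# Hypothetical helper: builds a search URL of the same shape as the
# lore.kernel.org link quoted earlier in this message.
def lore_search_url(query, base="https://lore.kernel.org/all/"):
    """URL-encode a search query string into the ?q= parameter."""
    return base + "?" + urlencode({"q": query})

# The query that missed most actual patches before the indexing fix.
print(lore_search_url("s:patch nq:diff f:robh@kernel.org"))

# The broader subject-based approximation of "is a patch".
print(lore_search_url("s:patch OR s:rfc OR s:resend"))

# The phrase query that should work once reindexing finishes.
print(lore_search_url('nq:"diff --git"'))
```

The encoding is plain form encoding (spaces become `+`, `:` becomes `%3A`), so the same strings can be typed directly into the search box instead.
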
> I'm not sure if there needs to be a specific term to index
> patches on; maybe there is. There's still a lot of Xapian
> we're not using, yet...

What I'm hoping to get to is a replacement for patchwork in my
workflow. For that I want all patches which don't have either a
Reviewed/Acked tag from me or a reply from me. I think the first part
should be possible with lei, but I'd imagine the last part is some
processing on top of the lei query.

Rob
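
A rough sketch of the kind of post-processing described above, assuming the query results have already been loaded into plain dicts. The keys (`id`, `in_reply_to`, `from`, `subject`, `body`) and the function names are hypothetical and do not reflect lei's actual output format; mapping real results onto this shape is left open.

```python
import re

# Subject-based "is a patch" test, mirroring 's:patch OR s:rfc OR s:resend'.
# Anchoring on a leading "[" skips "Re: [PATCH] ..." replies.
PATCH_SUBJECT = re.compile(r"^\[[^\]]*\b(PATCH|RFC|RESEND)\b", re.IGNORECASE)

def has_my_tag(body, me):
    """True if the body carries a Reviewed-by/Acked-by line naming `me`."""
    for line in body.splitlines():
        line = line.strip()
        if line.lower().startswith(("reviewed-by:", "acked-by:")) and me in line:
            return True
    return False

def pending_patches(messages, me):
    """Patches with neither a tag from `me` nor a direct reply from `me`."""
    # Message-IDs that `me` has replied to directly.
    answered = {m["in_reply_to"] for m in messages
                if me in m["from"] and m.get("in_reply_to")}
    return [m for m in messages
            if PATCH_SUBJECT.match(m["subject"])
            and m["id"] not in answered
            and not has_my_tag(m["body"], me)]

# Made-up sample data, for illustration only.
messages = [
    {"id": "<1@example>", "in_reply_to": None,
     "from": "Alice <alice@example.org>",
     "subject": "[PATCH] dt-bindings: add foo",
     "body": "diff --git a/x b/x\n"},
    {"id": "<2@example>", "in_reply_to": None,
     "from": "Bob <bob@example.org>",
     "subject": "[PATCH v2] dt-bindings: add bar",
     "body": "Reviewed-by: Rob Herring <robh@kernel.org>\n---\ndiff --git a/y b/y\n"},
    {"id": "<3@example>", "in_reply_to": None,
     "from": "Carol <carol@example.org>",
     "subject": "[RFC PATCH] baz",
     "body": "diff --git a/z b/z\n"},
    {"id": "<4@example>", "in_reply_to": "<3@example>",
     "from": "Rob Herring <robh@kernel.org>",
     "subject": "Re: [RFC PATCH] baz",
     "body": "Looks fine.\n"},
]

for m in pending_patches(messages, "robh@kernel.org"):
    print(m["id"], m["subject"])
```

Only direct replies are counted here; a reply further down the thread would need a walk up the `in_reply_to` chain, and non-git diffs with unusual subjects would still slip through the subject test.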