From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.1 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00,
    DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,
    T_SCC_BODY_TEXT_LINE shortcircuit=no autolearn=ham
    autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
    by dcvr.yhbt.net (Postfix) with ESMTP id 1AED71F727;
    Wed, 29 Jun 2022 17:27:43 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=80x24.org;
    s=selector1; t=1656523663;
    bh=IpOLXlJGch8NCCX3GvI82DinXwHQLSDyPnnuiVRvCcM=;
    h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
    b=aOCnyrPYfE0lUUxvZTK9h/mnOg5CaROxoJKcNJ1S6ErskyCFELQnmVMY6NEhPaV1o
    R1twpMZsqsTMYIAA0IxbaNxGdVz5o1WJfSw/6q2p30o91Bt1/wha9alr2fstPZgnKS
    NTWiYHTjqNGr9GDaaiJW1XWYlcCBM+kVsnPmvpn4=
Date: Wed, 29 Jun 2022 17:27:42 +0000
From: Eric Wong
To: Rob Herring
Cc: meta@public-inbox.org
Subject: Re: lei missing mails
Message-ID: <20220629172742.M978900@dcvr>
References: <20220629163033.GA14412@dcvr>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
List-Id:

Rob Herring wrote:
> On Wed, Jun 29, 2022 at 10:30 AM Eric Wong wrote:
> >
> > Rob Herring wrote:
> > > Hi,
> > >
> > > I'm using lei with lore where I have 2 queries which overlap. Really,
> > > one is a subset of the other. On those overlapping threads, I'm
> > > finding that sometimes new messages are written to one mailbox and not
> > > the other. (At least sometimes, the messages may be missing from all
> > > mailboxes sometimes too. I'm not certain.) Using --remote-fudge-time
> > > to force refetching seems to get the missing mails. I haven't found
> > > anything strange in timestamps of the missing mails, but otherwise am
> > > not sure how to debug this further.
> > > The queries are retrieving full
> > > threads and the missing mails are in the threads, but not direct
> > > matches to the queries. I realize that's not a lot of detail to go on.
> > > Suggestions on debugging this further?
> >
> > Is this with 1.8 or 1.7?
>
> Commit 68b53c888911 actually. So post 1.8.

OK, thanks for that info.

> > I forgot to note in the release notes, but there were some
> > SQLite usage-related fixes which could avoid missing messages.
> >
> > You'll need "lei daemon-kill" after upgrading to 1.8 to ensure
> > the new code is running.
>
> It's possible I haven't done that since updating though I do vaguely
> recall seeing something about needing to do that. Is there any way to
> tell before I restart it?

Not really, but it's pretty cheap to restart (assuming there are
no long-running jobs).

> > What might be interesting is to use the URLs lei prints and
> > comparing the results w/o lei.
> >
> > I'll have to double-check if overlapping affects things, but it
> > shouldn't; since the dedupe logic is per-output.
> >
> > Is this exclusively with HTTPS endpoints and writing to Maildirs
> > (or something else?)
>
> Yes. It's querying lore and writing to a maildir. Here's one of the queries:
>
> [lei]
>         q = (dfn:drivers OR dfn:arch OR dfn:Documentation/* OR
> dfn:include OR dfn:scripts) AND \
>         f:robh@kernel.org AND rt:6.month.ago..
> [lei "q"]
>         include = https://lore.kernel.org/all/
>         external = 1
>         local = 1
>         remote = 1
>         threads = 1
>         dedupe = mid
>         output = maildir:/home/rob/Mail/my-patches

Fwiw, dedupe based on mid could be vulnerable to spoofing, which
is why `content' is the default.

But yes, in the past, I've noticed some messages to
meta@public-inbox.org not showing up, though not recently
(I guess lack of activity here is a culprit :x)

I also just noticed an inotify-related bug deadlocking the
whole lei-daemon while looking into this :<

> > > It might be helpful if lei could print out message-ids of messages
> > > written to mailboxes.
> >
> > That could get very noisy, especially as mailboxes are written
> > in parallel.
>
> Verbose mode already is. Maybe specifying what info you want to be
> verbose would help. The network side is mostly uninteresting in this
> case for example.

Yes, I've been struggling with the verbosity, too; and many
other things :<

> Is there any tool to list new messages in a maildir? I could do that
> before and after. I've done the clearing the new flag in mutt between
> runs, but that's not really ideal.

I suppose `ls'. There are likely other tools more suited for
Maildirs but I'm not familiar with them off the top of my head.
Maybe lei could grow yet another command.
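
For the before-and-after comparison mentioned above, one possible
sketch (not part of lei, and not something proposed in this thread)
uses Python's standard `mailbox' module to collect the Message-IDs
present in a Maildir, so two snapshots can be diffed. The function
names maildir_message_ids and new_message_ids are made up for
illustration, and the sketch assumes every message of interest
carries a Message-ID header.

```python
import mailbox


def maildir_message_ids(path):
    """Return the set of Message-ID values found in the Maildir at `path`.

    Covers both new/ and cur/; messages without a Message-ID header
    are skipped (an assumption of this sketch).
    """
    md = mailbox.Maildir(path, create=False)
    try:
        ids = set()
        for msg in md:
            mid = msg["Message-ID"]
            if mid:
                ids.add(str(mid).strip())
        return ids
    finally:
        md.close()


def new_message_ids(before, path):
    """Return Message-IDs present in `path` now but absent from `before`."""
    return maildir_message_ids(path) - before


# Hypothetical usage, with the output path taken from the config above:
#   before = maildir_message_ids("/home/rob/Mail/my-patches")
#   ... refresh the saved search ...
#   for mid in sorted(new_message_ids(before, "/home/rob/Mail/my-patches")):
#       print(mid)
```

Running this against both overlapping outputs and comparing the two
sets would show which Message-IDs landed in one Maildir but not the
other, without relying on the "new" flag in the mail client.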