From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Eric Wong <e@80x24.org>
Cc: meta@public-inbox.org
Subject: Re: public-inbox + mlmmj best practices?
Date: Mon, 28 Dec 2020 11:22:18 -0500
Message-ID: <20201228162218.zcnqxkgwa2i3nt66@chatter.i7.local>
In-Reply-To: <20201222062808.GA4522@dcvr>
On Tue, Dec 22, 2020 at 06:28:08AM +0000, Eric Wong wrote:
> Eric Wong <e@80x24.org> wrote:
> >
> > There's scripts/ssoma-replay which was v1-only and dependent on
> > ssoma. I've been meaning to convert into something that reads
> > NNTP so it's not locked into public-inbox. Maybe it could be
> > part of `lei', too, for piping to arbitrary commands, dunno...
I wrote grok-pi-piper a while back for the purpose of piping from git to
patchwork.kernel.org. It's not complete yet, because we currently do not
handle situations with rewritten history, but it's been working well enough. I
have a write-up here:
https://people.kernel.org/monsieuricon/subscribing-to-lore-lists-with-grokmirror
What is the sanest way to recognize and handle history rewrites? Right now, we
only keep track of the latest tip hash. On each subsequent run, we iterate over
all commits between the recorded hash and the newest tip. My current thoughts
are:
- in addition to the latest tip hash, keep track of author, authordate and
  message-id of the last processed message
- if we no longer find the tracked hash in the repo, use author+authordate to
  find the new hash of the latest message we processed, and verify with
  message-id
- if we cannot find the exact match (i.e. our latest processed message is gone
  from history), find the first commit that happens before our recorded
  authordate and use that as the "latest processed" jump-off point
This should do the right thing in most situations except for when the message
that was deleted from history was sent with a bogus Date: header with a date
in the future. In this case, we can miss valid messages in the queue.
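The three fallback steps above can be sketched in Python roughly as follows.
This is only an illustration over an in-memory list of commits, not
grok-pi-piper's actual code: the Commit record and the find_resume_index() and
pending() helpers are hypothetical names, authordate is assumed to be an
integer timestamp, and in a real run the list would come from walking the git
history (oldest commit first).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical record of the fields tracked per processed message."""
    hash: str
    author: str
    authordate: int  # assumed: seconds since the epoch
    msgid: str


def find_resume_index(history: List[Commit], last: Optional[Commit]) -> int:
    """Return the index of the "latest processed" commit in history
    (oldest first), or -1 if processing should start from the beginning."""
    if last is None:
        return -1
    # Step 1: the recorded tip hash is still present, so nothing was rewritten.
    for i, c in enumerate(history):
        if c.hash == last.hash:
            return i
    # Step 2: the hash is gone; look for the same message under a new hash
    # using author+authordate, and verify with message-id.
    for i, c in enumerate(history):
        if (c.author, c.authordate) == (last.author, last.authordate) \
                and c.msgid == last.msgid:
            return i
    # Step 3: the message itself is gone; walk back from the tip and take
    # the first commit dated before the recorded authordate.
    for i in range(len(history) - 1, -1, -1):
        if history[i].authordate < last.authordate:
            return i
    return -1


def pending(history: List[Commit], last: Optional[Commit]) -> List[Commit]:
    """Commits that still need to be piped out."""
    return history[find_resume_index(history, last) + 1:]
```

With this sketch, a deleted message carrying a far-future authordate makes
step 3 pick the current tip as the jump-off point, so pending() returns
nothing even if unprocessed messages sit before it. That is the failure mode
described above.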
Any suggestions on how this can be improved?
-K