From: Konstantin Ryabitsev <firstname.lastname@example.org>
To: email@example.com
Subject: RFC: Using public-inbox v2 repos for distributed patch lifecycle tracking
Date: Sun, 27 Jan 2019 20:37:44 +0200
Message-ID: <CAMwyc-Td-AKNaLgxEw91ibBAkN-DdP1xhRSfs-oLkyh73yRQUw@mail.gmail.com>

Hi, all:

Here's something I've been mulling over for a while, and which sorta
goes hand-in-hand with what Eric has been doing with the diff
highlights work.

We have significant duplication of functionality on lore.kernel.org
between the public-inbox repository for LKML and the patchwork instance
for the same list (https://lore.kernel.org/patchwork). I am currently
working on a tool that would move a lot of patchwork's functionality to
the developer's workstation by using the public-inbox archive of the
mailing list that receives patches. Once it is done, the tool should be
able to do all of the following:

- Keep track of patches and patch series, including series revisions
- Allow developers to easily apply patch series directly from the
  public-inbox repository (by creating an mbox file of the series
  behind the scenes with adjusted git by-lines -- e.g. if someone
  replies to a patch with an "Acked-by: Foo Dev" or "Tested-by: Foo
  Bot", the original patch trailers are adjusted to reflect that new
  data)
- Automatically recognise when patches have been applied to a repo and
  auto-"accept" them
- Allow developers to see changes between series using interdiff

Currently, the tool sticks various patch tracking information into a
custom sqlite3 db, but I'm increasingly wondering if it makes more
sense to have this machine-parseable data available as part of the
repository itself -- say, as a git note on the commit-id of the message.
In other words, if the patch is in a message in abcd1234:m, a
refs/notes/patches entry for abcd1234 would have json-formatted data
with patch tracking information similar to what patchwork has for it
(see, for example,
https://lore.kernel.org/patchwork/api/1.1/patches/1035986/), but
without patchwork-specific bits and duplicate info like headers and
patch contents.

If we add these notes on lore.kernel.org itself, then we save a lot of
redundant data processing on the client end. A developer who has
mirror-cloned a public-inbox archive of a development mailing list
would be able to start applying patches and series without having to
parse tens of thousands of messages.

I have limited experience with notes, however, and I'm curious if they
are a good candidate for such a task. A public-inbox repo of LKML, even
after sharding, contains hundreds of thousands of messages. If many of
them carry such notes, would that significantly increase the repository
size and reduce its performance?

I have similar thoughts about publishing CI-related information as
notes to public-inbox commits, as well. This would allow centralised
archival of distributed CI efforts, which is a common problem in
distributed projects. Currently, bots flood lists with automated email
as patch follow-ups, but if they could publish their CI information as
refs/notes/ci/projectname in a public repository, we could mirror that
back to lore.kernel.org via regular pulls of that ref for each defined
project, and developers would be able to view CI reports from multiple
distributed projects inside the same interface.

Again, I'm not sure if git notes is the right tool for this. Another
way to go about it would be to use something like a custom blockchain
ledger, but I'm afraid that picking that would result in a lower
adoption rate due to the general unease people have about blockchain
(it's new, it's overhyped, and it's tainted by association with
cryptocoins).

I'd love to hear your thoughts.
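To make the refs/notes/patches idea above a little more concrete, here
is a minimal Python sketch of what one note payload, and the command to
attach it, might look like. The field names (state, series, revision,
trailers) and both helper functions are hypothetical and are not
anything the tool or patchwork defines; only the ref name and the
abcd1234 example come from the text above. The sketch builds the
`git notes` argument list but does not run it.

```python
import json

# Ref name taken from the proposal above.
NOTES_REF = "refs/notes/patches"


def build_patch_note(state, series_id, revision, trailers):
    """Return compact JSON tracking data for one patch message.

    All field names here are illustrative assumptions, not a defined schema.
    Headers and patch contents are deliberately left out, since they are
    already stored in the message blob itself.
    """
    note = {
        "state": state,          # e.g. "new" or "accepted" (assumed values)
        "series": series_id,     # hypothetical series identifier
        "revision": revision,    # series revision number (v1, v2, ...)
        "trailers": sorted(trailers),
    }
    # Compact separators keep each note small, which matters when
    # hundreds of thousands of messages may carry one.
    return json.dumps(note, sort_keys=True, separators=(",", ":"))


def notes_add_command(commit_id, note_json):
    """Return the argv that would attach note_json to commit_id."""
    return [
        "git", "notes", "--ref", NOTES_REF,
        "add", "-f", "-m", note_json, commit_id,
    ]


if __name__ == "__main__":
    payload = build_patch_note("new", "example-series", 2, ["Acked-by: Foo Dev"])
    print(payload)
    print(notes_add_command("abcd1234", payload))
```

A client with a mirror clone could then read the data back with
`git notes --ref refs/notes/patches show abcd1234` instead of parsing
the message, provided the notes ref is fetched along with the archive.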

One of my goals is to find a way to keep the distributed nature of
Linux Kernel development without locking it into a single vendor (like
GitHub) or a suite of tools (like GitLab, CircleCI, whatnot). We need
to find a way to preserve and archive the data generated by such tools
in a way that is easy to replicate and verify.

-K
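
As an illustration of the trailer adjustment mentioned in the feature
list earlier in this message, the following Python sketch folds
"Acked-by:" / "Tested-by:" style lines from replies into the original
patch's commit message. It is a rough sketch under stated assumptions,
not the tool's actual logic: the function names are hypothetical, the
set of recognised trailers is a guess, quoted lines (starting with ">")
are ignored, and the commit message is assumed to already end with its
trailer block.

```python
import re

# Trailers worth carrying over from replies; this list is an assumption.
TRAILER_RE = re.compile(r"^(Acked-by|Reviewed-by|Tested-by):\s*(.+)$")


def collect_trailers(reply_body):
    """Return trailer lines found in the unquoted part of a reply body."""
    found = []
    for line in reply_body.splitlines():
        # Quoted lines begin with ">" and therefore never match.
        match = TRAILER_RE.match(line.strip())
        if match:
            found.append(f"{match.group(1)}: {match.group(2).strip()}")
    return found


def add_trailers(commit_message, replies):
    """Append new trailers from replies to the end of commit_message.

    Assumes the message already ends with its trailer block (for example
    a Signed-off-by line), so new trailers can simply be appended.
    """
    lines = commit_message.rstrip("\n").split("\n")
    seen = {line.strip() for line in lines}
    for body in replies:
        for trailer in collect_trailers(body):
            if trailer not in seen:  # skip duplicates across replies
                lines.append(trailer)
                seen.add(trailer)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    message = "subsys: fix a thing\n\nSigned-off-by: A Dev <a@example.org>\n"
    replies = ["> quoted text\nAcked-by: Foo Dev\n", "Tested-by: Foo Bot\n"]
    print(add_trailers(message, replies))
```

The adjusted messages could then be written out as an mbox file and
applied with `git am`, which is the workflow the feature list describes.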
Thread overview: 2+ messages
2019-01-27 18:37 Konstantin Ryabitsev [this message]
2019-01-28  3:32 ` Eric Wong