user/dev discussion of public-inbox itself
 help / color / mirror / code / Atom feed
From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: meta@public-inbox.org
Subject: RFC: Using public-inbox v2 repos for distributed patch lifecycle tracking
Date: Sun, 27 Jan 2019 20:37:44 +0200	[thread overview]
Message-ID: <CAMwyc-Td-AKNaLgxEw91ibBAkN-DdP1xhRSfs-oLkyh73yRQUw@mail.gmail.com> (raw)

Hi, all:

Here's something I've been mulling over for a while, and sorta goes
hand-in-hand with what Eric has been doing with diff highlights work.
We have significant duplication of functionality on lore.kernel.org
between the public-inbox repository for LKML and the patchwork
instance for the same list (https://lore.kernel.org/patchwork). I am
currently working on a tool that would move a lot of patchwork's
functionality to the developers' workstation by using the public-inbox
archive of the mailing list that receives patches. After it is done,
the tool should be able to do all of the following:

- Keep track of patches and patch series, including series revisions
- Allow developers to easily apply patch series directly from the
public-inbox repository (by creating a mbox file of the series behind
the scenes with adjusted git by-llines -- e.g. if someone replies to a
patch with a "Acked-by: Foo Dev" or "Tested-by: Foo Bot", the original
patch trailers are adjusted to reflect that new data)
- Automatically recognising when patches have been applied to a repo
and auto-"accepting" them
- Allowing developers to see changes between series using interdiff

Currently, the tool sticks various patch tracking information into a
custom sqlite3 db, but I'm increasingly wondering if it makes more
sense to have this machine-parseable data available as part of the
repository itself -- say, as a git note on the commit-id of the
message. In other words, if the patch is in a message in abcd1234:m, a
refs/notes/patches entry for abcd1234 would have json-formatted data
with patch tracking information similar to what patchwork has for it
(see, for example,
https://lore.kernel.org/patchwork/api/1.1/patches/1035986/), but
without patchwork-specific bits and duplicate info like headers and
patch contents. If we add these notes on lore.kernel.org itself, then
we save a lot of redundant data processing on the client end. A
developer who has mirror-cloned a public-inbox archive of a
development mailing list would be able to start applying patches and
series without having to parse tens of thousands of messages.

I have limited experience with notes, however, and I'm curious if they
are a good candidate for such task. A public-inbox repo of LKML, even
after sharding, contains hundreds of thousands of messages. If many of
them carry such notes, would that significantly increase the
repository size and reduce its performance?

I have similar thoughts about publishing CI-related information as
notes to public-inbox commits, as well. This would allow centralised
archival of distributed CI efforts, which is a common problem in
distributed projects. Currently, bots flood lists with automated email
as patch follow-ups, but if they could publish their CI information as
refs/notes/ci/projectname in a public repository, we could mirror that
back to lore.kernel.org via regular pulls of that ref for each defined
projects and developers would be able to view CI reports from multiple
distributed projects inside the same interface.

Again, I'm not sure if git notes is the right tool for this. Another
way to go about it would be to use something like a custom blockchain
ledger, but I'm afraid that picking that would result in lower
adoption rate due to general unease people have about blockchain (it's
new, it's overhyped, and it's mired by association with cryptocoins).

I'd love to hear your thoughts. One of my goals is to find a way to
keep the distributed nature of Linux Kernel development without
locking it into a single vendor (like GitHub) or a suite of tools
(like GitLab, CircleCI, whatnot). We need to find a way to preserve
and archive the data generated by such tools in a way that is easy to
replicate and verify.

-K

             reply	other threads:[~2019-01-27 18:37 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-27 18:37 Konstantin Ryabitsev [this message]
2019-01-28  3:32 ` RFC: Using public-inbox v2 repos for distributed patch lifecycle tracking Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMwyc-Td-AKNaLgxEw91ibBAkN-DdP1xhRSfs-oLkyh73yRQUw@mail.gmail.com \
    --to=konstantin@linuxfoundation.org \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).