user/dev discussion of public-inbox itself
* RFC: Using public-inbox v2 repos for distributed patch lifecycle tracking
@ 2019-01-27 18:37 Konstantin Ryabitsev
  2019-01-28  3:32 ` Eric Wong
  0 siblings, 1 reply; 2+ messages in thread
From: Konstantin Ryabitsev @ 2019-01-27 18:37 UTC
  To: meta

Hi, all:

Here's something I've been mulling over for a while, and it sorta goes
hand-in-hand with what Eric has been doing with the diff highlighting work.
We have significant duplication of functionality on lore.kernel.org
between the public-inbox repository for LKML and the patchwork
instance for the same list (https://lore.kernel.org/patchwork). I am
currently working on a tool that would move a lot of patchwork's
functionality to the developers' workstation by using the public-inbox
archive of the mailing list that receives patches. After it is done,
the tool should be able to do all of the following:

- Keep track of patches and patch series, including series revisions
- Allow developers to easily apply patch series directly from the
public-inbox repository (by creating an mbox file of the series behind
the scenes with adjusted git by-lines -- e.g. if someone replies to a
patch with an "Acked-by: Foo Dev" or "Tested-by: Foo Bot", the original
patch trailers are adjusted to reflect that new data)
- Automatically recognise when patches have been applied to a repo
and auto-"accept" them
- Allow developers to see changes between series revisions using interdiff
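
The trailer adjustment from the second item could look roughly like the
sketch below. It is not the tool's actual code: the function names, the
set of recognised trailers, and the rule of skipping quoted lines are all
assumptions made for illustration.

```python
import re

# Trailer names harvested from replies; this set is an assumption for the sketch.
TRAILER_RE = re.compile(r"^(Acked-by|Reviewed-by|Tested-by):\s*(.+?)\s*$")


def collect_trailers(reply_bodies):
    """Return unique (name, value) trailers found in replies, in order seen."""
    seen = []
    for body in reply_bodies:
        for line in body.splitlines():
            # Skip quoted text so trailers quoted from the patch are not re-counted.
            if line.startswith(">"):
                continue
            m = TRAILER_RE.match(line)
            if m and (m.group(1), m.group(2)) not in seen:
                seen.append((m.group(1), m.group(2)))
    return seen


def apply_trailers(patch_message, trailers):
    """Append trailers just before the '---' line that ends the commit message."""
    head, sep, rest = patch_message.partition("\n---\n")
    existing = set(head.splitlines())
    new_lines = [f"{k}: {v}" for k, v in trailers if f"{k}: {v}" not in existing]
    if not new_lines:
        return patch_message
    return head + "\n" + "\n".join(new_lines) + sep + rest
```

Applying the same trailers twice leaves the message unchanged, so the
mbox could be regenerated whenever new replies arrive.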

Currently, the tool sticks various patch tracking information into a
custom sqlite3 db, but I'm increasingly wondering if it makes more
sense to have this machine-parseable data available as part of the
repository itself -- say, as a git note on the commit-id of the
message. In other words, if the patch is in a message in abcd1234:m, a
refs/notes/patches entry for abcd1234 would have json-formatted data
with patch tracking information similar to what patchwork has for it
(see, for example,
https://lore.kernel.org/patchwork/api/1.1/patches/1035986/), but
without patchwork-specific bits and duplicate info like headers and
patch contents. If we add these notes on lore.kernel.org itself, then
we save a lot of redundant data processing on the client end. A
developer who has mirror-cloned a public-inbox archive of a
development mailing list would be able to start applying patches and
series without having to parse tens of thousands of messages.
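
As a rough sketch of what writing such a note might involve: the JSON
field names below (state, series, revision, trailers) are made up for
illustration and are not patchwork's schema, and the helper names are
hypothetical. Only the refs/notes/patches ref and the standard
`git notes add` command come from the idea above.

```python
import json
import subprocess

NOTES_REF = "refs/notes/patches"  # ref name from the proposal above


def build_note(state, series_id, revision, trailers):
    """Serialise tracking data as key-sorted JSON so notes diff cleanly."""
    payload = {
        "state": state,          # hypothetical field, e.g. "new" or "accepted"
        "series": series_id,     # hypothetical series identifier
        "revision": revision,    # series revision number (v1, v2, ...)
        "trailers": trailers,    # trailers gathered from follow-ups
    }
    return json.dumps(payload, sort_keys=True)


def notes_add_cmd(commit, note):
    """Build the `git notes` invocation that attaches `note` to `commit`."""
    return ["git", "notes", f"--ref={NOTES_REF}", "add", "-f", "-m", note, commit]


def attach_note(repo_dir, commit, note):
    """Run the command inside a clone of the public-inbox repository."""
    cmd = ["git", "-C", repo_dir, *notes_add_cmd(commit, note)[1:]]
    subprocess.run(cmd, check=True)
```

A client could then read the data back with
`git notes --ref=refs/notes/patches show abcd1234` instead of parsing
the message itself.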

I have limited experience with notes, however, and I'm curious if they
are a good candidate for such a task. A public-inbox repo of LKML, even
after sharding, contains hundreds of thousands of messages. If many of
them carry such notes, would that significantly increase the
repository size and reduce its performance?

I have similar thoughts about publishing CI-related information as
notes to public-inbox commits, as well. This would allow centralised
archival of distributed CI efforts, which is a common problem in
distributed projects. Currently, bots flood lists with automated email
as patch follow-ups, but if they could publish their CI information as
refs/notes/ci/projectname in a public repository, we could mirror that
back to lore.kernel.org via regular pulls of that ref for each defined
project, and developers would be able to view CI reports from multiple
distributed projects inside the same interface.
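
The mirroring side could be as simple as fetching one notes ref per
project. A minimal sketch, assuming a hypothetical mapping of project
names to the repositories where their bots publish; the URLs and function
names are placeholders, not real endpoints or existing tooling.

```python
import subprocess


def ci_fetch_cmd(project, remote_url):
    """Build a `git fetch` that mirrors one project's CI notes ref."""
    ref = f"refs/notes/ci/{project}"
    return ["git", "fetch", remote_url, f"{ref}:{ref}"]


def mirror_ci_notes(repo_dir, projects):
    """Fetch the CI notes ref of every defined project into repo_dir."""
    for project, remote_url in sorted(projects.items()):
        cmd = ["git", "-C", repo_dir, *ci_fetch_cmd(project, remote_url)[1:]]
        subprocess.run(cmd, check=True)
```

Run periodically, e.g.
`mirror_ci_notes("/path/to/repo", {"projectname": "https://ci.example.org/notes.git"})`,
this keeps each project's reports under its own ref.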

Again, I'm not sure if git notes is the right tool for this. Another
way to go about it would be to use something like a custom blockchain
ledger, but I'm afraid that picking that would result in lower
adoption rate due to general unease people have about blockchain (it's
new, it's overhyped, and it's tainted by association with cryptocoins).

I'd love to hear your thoughts. One of my goals is to find a way to
keep the distributed nature of Linux Kernel development without
locking it into a single vendor (like GitHub) or a suite of tools
(like GitLab, CircleCI, whatnot). We need to find a way to preserve
and archive the data generated by such tools in a way that is easy to
replicate and verify.

-K

Archives are clonable:
	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://ou63pmih66umazou.onion/inbox.comp.mail.public-inbox.meta
	nntp://czquwvybam4bgbro.onion/inbox.comp.mail.public-inbox.meta
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.org/gmane.mail.public-inbox.general

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox