git@vger.kernel.org mailing list mirror (one of many)
From: Stephen Finucane <stephen@that.guru>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Feature request: provide a persistent IDs on a commit
Date: Tue, 19 Jul 2022 11:47:55 +0100
Message-ID: <61333be26339440d9bae8f12fd1a4faeb5e68ab6.camel@that.guru>
In-Reply-To: <220718.86ilnuw8jo.gmgdl@evledraar.gmail.com>

On Mon, 2022-07-18 at 20:50 +0200, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Jul 18 2022, Stephen Finucane wrote:
> 
> > ...to track evolution of a patch through time.
> > 
> > tl;dr: How hard would it be to retrofit a 'Change-ID' concept, à la the
> > 'Change-ID' trailer used by Gerrit, into git core?
> > 
> > Firstly, apologies in advance if this is the wrong forum to post a feature
> > request. I help maintain the Patchwork project [1], which is a web-based tool
> > that provides a mechanism to track the state of patches submitted to a mailing
> > list and make sure stuff doesn't slip through the cracks. One of our long-term
> > goals has been to track the evolution of an individual patch through multiple
> > revisions. This is a surprisingly hard goal because oftentimes there isn't a
> > whole lot to work with. One can try to guess whether things are the same by
> > inspecting the metadata of the commit (subject, author, commit message, and the
> > diff itself) but each of these metadata items is subject to arbitrary changes
> > and is therefore fallible.
> > 
> > One of the mechanisms I've seen used to address this is the 'Change-ID' trailer
> > used by Gerrit. For anyone that hasn't seen this, the Gerrit server provides a
> > git commit hook that you can install locally. When installed, this appends a
> > 'Change-ID' trailer to each and every commit message. In this way, the evolution
> > of a patch (or a "change", in Gerrit parlance) can be tracked through time since
> > the Change ID provides an authoritative answer to the question "is this still
> > the same patch". Unfortunately, there are still some obvious downsides to this
> > approach. Not only does this additional trailer clutter your commit messages but
> > it's also something the user must install themselves. While Gerrit can insist
> > that this is installed before pushing a change, this isn't an option for any of
> > the common forges nor is it something git-send-email supports.
> 
> git format-patch+send-email will send your trailers along as-is, so how
> does it not support Change-Id? Does it need some support that any other
> made-up trailer doesn't?

It supports sending the trailers, sure. What it doesn't support is insisting you
send this specific trailer (Change-Id). Only Gerrit can do this (server side,
thankfully, which means you don't need to ask all contributors to install this
hook if you want to rely on it for tooling, CI, etc.).
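
To make the server-side check concrete, here is a minimal sketch of how a tool
might look for the trailer in a commit message. It assumes the Gerrit
convention of a "Change-Id: I<40 hex digits>" line in the final paragraph of
the message. The function name extract_change_id is made up for illustration
and is not part of Gerrit, git, or Patchwork.

```python
import re
from typing import Optional

# Assumed format: Gerrit-style trailer, "I" followed by 40 lowercase hex digits.
CHANGE_ID_RE = re.compile(r"^Change-Id: (I[0-9a-f]{40})$", re.MULTILINE)


def extract_change_id(message: str) -> Optional[str]:
    """Return the Change-Id found in the last paragraph of a commit message.

    Returns None when no such trailer is present, which is the case a
    server-side check would reject.
    """
    # Trailers live in the final paragraph, separated from the body by a blank line.
    paragraphs = message.strip().split("\n\n")
    matches = CHANGE_ID_RE.findall(paragraphs[-1])
    # If the trailer is repeated, take the last occurrence.
    return matches[-1] if matches else None
```

A receiving tool could refuse anything for which this returns None, which is
the kind of enforcement that git-send-email has no way of doing on its own.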

> > I imagine most people working with mailing list based workflows have their own
> > client side tooling to support this while software forges like GitHub and GitLab
> > simply don't bother tracking version history between individual commits in a
> > pull/merge request.
> 
> It's far from ideal, but at least GitLab shows a diff on a push to a MR,
> including if it's force-pushed. I'm not sure about GitHub.

GitHub does not. Simply piling multiple additional "fix" commits onto the PR
branch results in a less horrible review experience since you can maintain
context, alas at the cost of a rotten git log. We don't need to debate the pros
and cons of the various forges though :)

> 
> > IMO, it would be fantastic if third-party tools
> > weren't necessary though. What I suspect we want is a persistent ID (or rather
> > UUID) that never changes regardless of how many times a patch is cherry-picked,
> > rebased, or otherwise modified, similar to the Author and AuthorDate fields.
> > Like Author and AuthorDate, it would be part of the core git commit metadata
> > rather than something in the commit message like Signed-Off-By or Change-ID.
> > 
> > Has such an idea ever been explored? Is it even possible? Would it be broadly
> > useful?
> 
> This has come up a bunch of times. I think that the thing git itself
> should be doing is to lean into the same notion that we use for tracking
> renames. I.e. we don't, we analyze history after-the-fact and spot the
> renames for you.

Any idea where I'd find previous discussions on this? I did look, and the only
proposal I found was an old one that seemed to suggest including the Change-Id
commit-msg hook with git itself which is not what I'm suggesting here.

> We have some of that in git already, as git-patch-id, and more recently
> git-range-diff. Both are flawed in a bunch of ways, and it's easy to run
> into edge cases where they don't spot something that they "should"
> have. Where "should" exists in the mind of the user.

That's a fair point and is of course what we (Patchwork) have to do currently.
Patchwork can track relations between individual patches but doesn't attempt to
generate these relations itself. Instead, we rely on third-party tooling. The
PaStA tool was one such example of a tool that could do this [1]. I can't
imagine a tool like Gerrit would ever work without this concept of an
authoritative (and arbitrary) identifier to track a patch's identity through
time, hence its reliance on the Change-Id trailer.
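
As a rough illustration of the after-the-fact approach, the sketch below
fingerprints a patch by hashing only its added and removed lines, with
whitespace stripped. This is a simplification in the spirit of git-patch-id,
not its actual algorithm, and diff_fingerprint is a hypothetical helper.

```python
import hashlib


def diff_fingerprint(diff_text: str) -> str:
    """Hash the added/removed lines of a unified diff, ignoring whitespace.

    Hunk headers ("@@ ... @@") and context lines are skipped, so the result
    does not depend on where in the file the change lands.
    """
    h = hashlib.sha1()
    for line in diff_text.splitlines():
        # Skip the "--- a/file" and "+++ b/file" headers. A real tool would
        # need to tell these apart from changed lines that start the same way.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            # Keep the +/- marker, drop all whitespace from the content.
            normalized = line[0] + "".join(line[1:].split())
            h.update(normalized.encode("utf-8") + b"\n")
    return h.hexdigest()
```

Two diffs that differ only in hunk offsets or spacing get the same
fingerprint, but changing a single word in a later revision produces a
different one. That is exactly the case where content-based matching stops
being authoritative and an arbitrary identifier would not.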

Perhaps we could flip this on its head. What would be the _downsides_ of
providing a persistent, arbitrary identifier on a commit similar to Author and
AuthorDate fields? There's obviously some work involved in implementing it but
assuming that was already done, what would break/be worse as a result?
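
For illustration, a minimal sketch of what such a field could look like. It
computes a commit object ID the way git does in SHA-1 repositories (SHA-1 of
"commit <size>\0" followed by the object body) and adds a "change-id" header.
That header is purely hypothetical and does not exist in git today, and the
function name and argument layout are assumptions made for the example.

```python
import hashlib
from typing import Optional


def commit_object_id(tree: str, parent: str, author: str, committer: str,
                     message: str, change_id: Optional[str] = None) -> str:
    """Compute the SHA-1 object ID of a single-parent commit object."""
    lines = [
        f"tree {tree}",
        f"parent {parent}",
        f"author {author}",
        f"committer {committer}",
    ]
    if change_id is not None:
        # Hypothetical extra header; not something git defines.
        lines.append(f"change-id {change_id}")
    body = ("\n".join(lines) + "\n\n" + message).encode("utf-8")
    # Git hashes "<type> <size>\0" followed by the raw object contents.
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()
```

Rebasing changes the parent and therefore the object ID, while the
hypothetical change-id value would be copied over unchanged, much like the
author line is today.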

Stephen

[1] https://rsarky.github.io/2020/08/10/pasta-patchwork.html
