Re: [RFC PATCH 2/2] docs: document a format for anonymous author and committer IDs

From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>,
	Emily Shaffer <emilyshaffer@google.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [RFC PATCH 2/2] docs: document a format for anonymous author and committer IDs
Date: Thu, 22 Sep 2022 00:08:27 +0000
Message-ID: <Yyune8ZNhWGWaTf2@tapette.crustytoothpaste.net>
In-Reply-To: <220920.86a66u5mnt.gmgdl@evledraar.gmail.com>

On 2022-09-20 at 10:51:39, Ævar Arnfjörð Bjarmason wrote:
> Wouldn't we get the same thing if *by convention* we just supported
> authorship like this, (which we already support):
> 
> 	UUID=$(get-some-uuid)
> 	git config user.name X
> 	git config user.email $UUID.uuid.git.example.org

You can indeed use a UUID if you want.  However, it's not deterministic.

Using a key hash also means account linking is trivially implemented in
forges.  If we use a UUID, then there's no way to prove ownership of the
identifier, which means that people can claim other people's commits.
Signed commits don't help here because you can't embed arbitrary
non-emails in X.509 (or in OpenPGP, because nobody will certify such an
ID), so you have no way of linking the commit identity to the key and
therefore signed commits are worse than before.  At least with an email
you can verify that the owner of the account owns the email address, but
you can't do that with a UUID.
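
As a rough illustration of the difference, here is a minimal Python sketch.
It is not the format from the patch: the `id.example.org` suffix, the
placeholder key bytes, and all function names are assumptions made up for
this example.  It only shows that an ID derived from a key hash can be
recomputed by anyone holding the public key, while a random UUID cannot be
tied back to anything.

```python
import hashlib
import uuid

# Hypothetical domain suffix; the real format is whatever the patch specifies.
ID_DOMAIN = "id.example.org"

# Placeholder bytes standing in for a public key; not a real key.
EXAMPLE_KEY = b"ssh-ed25519 example-public-key-bytes"


def key_hash_id(public_key):
    """Email-shaped ID derived from the SHA-256 of the public key."""
    return hashlib.sha256(public_key).hexdigest() + "@" + ID_DOMAIN


def uuid_id():
    """Email-shaped ID built from a random (version 4) UUID."""
    return str(uuid.uuid4()) + "@" + ID_DOMAIN


def id_matches_key(claimed_id, public_key):
    """True if the claimed ID is the one derived from this key."""
    return claimed_id == key_hash_id(public_key)
```

In this sketch, `id_matches_key` is only the lookup half of account linking.
A forge would presumably also check a signature made with the key before
accepting the claim.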

I want a design that works whether or not you use a forge, but realizing
that most developers use forges these days, I want to make the workflow as
simple and straightforward as possible for those who do.  I also want a
design which is going to be acceptable to forge implementers, and
working for one, I think this design is going to be easier to implement
and more likely to be accepted than an ID which requires extra work and
isn't verifiable.

For ease of use, I would implement tooling to make it easy to set this
from an existing user.signingkey or SSH key on the system.  I literally
envision this being as simple as something like `git id --set -f
~/.ssh/id_ed25519` or `git id --set --generate-ssh-key`.  (This is just
an example; we can argue about the details later.)

> So you'd end up with e.g.:
> 
> 	X <98ab8d66-38d2-11ed-a261-0242ac120002.uuid.git.example.com>
> 
> Or whatever, we could bikeshed about the format, but the point is that
> it's not codifying *how* that looks.

I do very much want to codify how this looks because people are
absolutely going to rely on it, whether we want them to or not.  People
already parse GitHub's fake no-reply emails for information.  People rely
on everything that Git does, whether we like it or not.

Keeping it in the form of an email maximizes compatibility for existing
implementations.
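
A small Python sketch of what that compatibility means in practice: an
off-the-shelf `Name <addr>` parser splits an email-shaped ID without any
special casing.  The ident string below is hypothetical, with `0123abcd`
standing in for a key hash and `id.example.org` for whatever suffix is
eventually chosen.

```python
from email.utils import parseaddr

# Hypothetical ident; the local part stands in for a key hash.
IDENT = "X <0123abcd@id.example.org>"


def split_ident(ident):
    """Split 'Name <addr>' into (name, addr) with the stdlib parser."""
    return parseaddr(ident)


name, addr = split_ident(IDENT)
# Tools that key on the local part or the domain keep working unchanged.
local_part, _, domain = addr.partition("@")
```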

> We'd then just support this refs/mailmap mechanism you're suggesting,
> where we'd have a mapping like:
> 
>       Ævar Arnfjörð Bjarmason <avarab@gmail.com> X <98ab8d66-38d2-11ed-a261-0242ac120002.uuid.git.example.com>
> 
> Which could be force-pushed.
> 
> I can see why you'd *also* want to formalize the ID generation, but I
> just don't see why we'd want to make that as one leaping change rather
> than something more incremental.

We can make it as incremental as folks want.  However, the longer we
have people embedding their real names and emails in an immutable Merkle
tree, the longer we're going to run into deadname problems.  Thus,
encouraging this new form of ID sooner means that people will adopt it
sooner.

If this is the only impediment, we can make it more gradual.

> I.e. even if you don't have opaque IDs in the first place this mechanism
> would allow you to maintain a "mailmap" ref on the remote, which would
> already be useful.
> 
> E.g. now if I use a hosting provider and have my .mailmap in various
> repos I need to maintain them in each repo, but this would allow for a
> magical ref which would keep it up-to-date in various repos...

That's part of the goal.

> I obviously see why you want the "force push" aspect of this (the
> deadnaming), but I still wonder if it's really a good trade-off for git
> as an SCM to make that the default.
> 
> We've been going in the other direction for e.g. tags semi-recently with
> my 0bc8d71b99e (fetch: stop clobbering existing tags without --force,
> 2018-08-31).
> 
> By having that force-push default we make it so that a plumbing command
> (that makes use of mailmap) will give you one result today, but a
> different one tomorrow, with no easy way to get back.

I think force-push semantics behave better for my use case,
but they're not essential.  If the mailmap is in a separate ref, then if I
work at $MEGACORP and need to update the mailmap because of a name
change, I can still just rewrite the history, and as long as we preserve
the force-fetch behaviour by default, then it will just work.

I _do_ think we should retain the force-fetch behaviour by default.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

Thread overview: 7+ messages
2022-09-19 14:52 [RFC PATCH 0/2] Opaque author and committer identifiers brian m. carlson
2022-09-19 14:52 ` [RFC PATCH 1/2] doc: specify a header for including arbitrary format-patch metadata brian m. carlson
2022-09-19 14:52 ` [RFC PATCH 2/2] docs: document a format for anonymous author and committer IDs brian m. carlson
2022-09-20 10:51   ` Ævar Arnfjörð Bjarmason
2022-09-22  0:08     ` brian m. carlson [this message]
2022-09-30 20:26   ` Gwyneth Morgan
2022-10-02  0:27     ` brian m. carlson
