Anonymous IDs
=============
Objective
---------
Provide a way for people to identify themselves without the need to associate a
fixed personal name or email.
Background
----------
People change their name and email many times over the course of their lives.
For example, people may marry or change jobs. In many cases, these changes can
be handled by the mailmap. However, for many transgender people, keeping the
old name in the mailmap is often undesirable.
This document proposes a new way to specify anonymous IDs based on an SSH key or
GnuPG key instead, along with a mailmap which is automatically downloaded from
the remote and which provides the correspondence between each ID and a name and
email. In this approach, all users are expected to specify an anonymous ID and a
mailmap entry.
This does not solve the problem for previous commits, but it does solve the
problem going forward if reasonably well adopted. It also avoids the weakness of
existing approaches that obscure the mailmap, which are defeated by simply
enumerating all entries in all commits.
Anonymous IDs
-------------
Git will implement a new form of email address which is acceptable to existing
implementations but is not valid according to RFC 1123. This takes the form of
an email address where the local-part contains the identifier and the domain
portion starts with `_.` and then a domain specifier which specifies an
authority and the meaning of the identifier.
In such a case, Git will specify the user's name as a single U+2060 (WORD
JOINER) in UTF-8 (the byte sequence 0xE2 0x81 0xA0), which is a zero width
non-breaking space. This is compatible with existing implementations.
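As a rough illustration of the resulting ident, the following Python sketch builds a `name <email>` string whose name is a single U+2060. The helper name `format_ident` is hypothetical and not part of Git; the example ID is the one used later in this document for a hosting provider.

```python
# Minimal sketch, not Git's implementation: format_ident is a hypothetical helper.
WORD_JOINER = "\u2060"  # U+2060, encoded in UTF-8 as 0xE2 0x81 0xA0


def format_ident(anonymous_id: str) -> str:
    """Return a 'name <email>' ident whose name is a single U+2060."""
    return f"{WORD_JOINER} <{anonymous_id}>"


if __name__ == "__main__":
    ident = format_ident("1234@_.user.example.com")
    # Show the raw bytes so that the invisible name is visible.
    print(ident.encode("utf-8"))
```

Printing the encoded bytes makes the otherwise invisible name field easy to check.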
The Git project will specify a set of identifiers under the domain
`id.git-scm.com`. The next component to the left is the type of key as
specified by the `gpg.format` setting, and then a component indicating the hash
type or version number as specified below.
This approach provides IDs which are simple and easy to create (almost all users
will have an SSH implementation which can generate keys with a single command),
opaque, completely deterministic, and not personally identifiable.
Other authorities, such as hosting providers, may use different IDs. For
example, the hosting provider example.com might issue the ID
`1234@_.user.example.com` for user ID 1234. Authorities are encouraged to use
database IDs or other unique IDs rather than usernames, since many usernames
contain human names or corporate affiliations, which defeats the point of this
feature.
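The address layout described in this section can be recognized mechanically. The sketch below assumes only what is stated above: the local-part is the identifier and the domain starts with `_.`, followed by the authority. The function name `parse_anonymous_id` is hypothetical.

```python
# Minimal sketch under the assumptions above; not part of Git.
from typing import Optional, Tuple


def parse_anonymous_id(address: str) -> Optional[Tuple[str, str]]:
    """Split an anonymous ID into (identifier, authority).

    Returns None when the address does not use the `_.` domain prefix.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain.startswith("_."):
        return None
    authority = domain[len("_."):]
    if not authority:
        return None
    return local, authority


if __name__ == "__main__":
    print(parse_anonymous_id("1234@_.user.example.com"))
    print(parse_anonymous_id("author@example.com"))
```

An ordinary email address yields `None`, so a caller can fall back to treating it as a regular email.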
In conjunction with a single, constantly rewritten mailmap reference and
`mailmap.blob`, this allows users to move their real identities out of commits
and into a mailmap which is constantly rewritten. If a user's real name or
email changes, they can submit an update to the mailmap and the ID, which will
be squashed into a single commit without history.
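To make the lookup concrete, here is a small Python sketch that reads mailmap lines of the form `Proper Name <proper@email> <commit@email>` and resolves an anonymous ID to a name and email. It handles only that one form of the mailmap syntax and matches IDs exactly; the function name and the sample entry are hypothetical.

```python
# Minimal sketch: handles only 'Name <real email> <commit email>' entries.
import re
from typing import Dict, Tuple

_ENTRY = re.compile(
    r"^(?P<name>[^<]*?)\s*<(?P<email>[^>]+)>\s*<(?P<commit_email>[^>]+)>\s*$"
)


def load_mailmap(text: str) -> Dict[str, Tuple[str, str]]:
    """Map each commit email (here, an anonymous ID) to (name, email)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _ENTRY.match(line)
        if match:
            mapping[match.group("commit_email")] = (
                match.group("name"),
                match.group("email"),
            )
    return mapping


if __name__ == "__main__":
    sample = "A U Thor <author@example.com> <1234@_.user.example.com>\n"
    print(load_mailmap(sample)["1234@_.user.example.com"])
```

Because the commit only ever contains the ID, changing the name or email later means editing this one entry rather than rewriting history.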
Specifications
~~~~~~~~~~~~~~
OpenPGP Keys
^^^^^^^^^^^^
If a user possesses a v4 OpenPGP key, then they may use the domain
`_.v4.openpgp.id.git-scm.com` using a lowercase hex form of the SHA-1
fingerprint as the local-part. For example, the key with the fingerprint
`da39a3ee5e6b4b0d3255bfef95601890afd80709` would have the email address
`da39a3ee5e6b4b0d3255bfef95601890afd80709@_.v4.openpgp.id.git-scm.com`.
Similarly, when RFC 4880 bis is implemented using v5 keys with SHA-256
fingerprints, the domain `_.v5.openpgp.id.git-scm.com` may be used with a
lowercase hex form of the SHA-256 fingerprint as the local-part.
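The v4 mapping can be sketched in Python as follows. The function name `openpgp_v4_id` is hypothetical, and the tolerance for whitespace and uppercase in the input is an assumption made for convenience, since tools often print fingerprints in groups.

```python
# Minimal sketch; openpgp_v4_id is a hypothetical helper, not part of Git.
import re


def openpgp_v4_id(fingerprint: str) -> str:
    """Build the anonymous ID for a v4 OpenPGP key from its SHA-1 fingerprint."""
    hex_fpr = re.sub(r"\s+", "", fingerprint).lower()
    if not re.fullmatch(r"[0-9a-f]{40}", hex_fpr):
        raise ValueError("expected a 40-digit hex SHA-1 fingerprint")
    return f"{hex_fpr}@_.v4.openpgp.id.git-scm.com"


if __name__ == "__main__":
    print(openpgp_v4_id("DA39A3EE 5E6B4B0D 3255BFEF 95601890 AFD80709"))
```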
SSH Keys
^^^^^^^^
If a user possesses an SSH key, then they may use the domain
`_.sha256.ssh.id.git-scm.com` using a base64url encoding (without padding) of
the SHA-256 fingerprint as the local-part. This is the RFC 4648 Base64 encoding
with URL and filename safe
alphabet without the padding character. For example, a user whose SSH key
fingerprint is `47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU` may use
`47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU@_.sha256.ssh.id.git-scm.com`.
It's intentional that no specification is provided for MD5 fingerprints. MD5 is
obsolete and should not be used in new protocols such as this.
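The conversion for SSH keys from the standard Base64 fingerprint to the base64url local-part is a character substitution, sketched below in Python. The function name `ssh_sha256_id` is hypothetical; accepting an optional `SHA256:` prefix and trailing `=` padding is an assumption made so that fingerprints copied from common tools can be passed in directly.

```python
# Minimal sketch; ssh_sha256_id is a hypothetical helper, not part of Git.
import re


def ssh_sha256_id(fingerprint: str) -> str:
    """Build the anonymous ID for an SSH key from its SHA-256 fingerprint."""
    b64 = fingerprint.strip()
    if b64.startswith("SHA256:"):
        b64 = b64[len("SHA256:"):]
    # Standard Base64 -> URL and filename safe alphabet, without padding.
    b64url = b64.rstrip("=").replace("+", "-").replace("/", "_")
    # A 32-byte SHA-256 digest is 43 characters in unpadded Base64.
    if not re.fullmatch(r"[A-Za-z0-9_-]{43}", b64url):
        raise ValueError("expected an unpadded Base64 SHA-256 fingerprint")
    return f"{b64url}@_.sha256.ssh.id.git-scm.com"


if __name__ == "__main__":
    print(ssh_sha256_id("SHA256:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU"))
```

Unlike the hex forms, this local-part is case-sensitive, so any tooling that compares these IDs needs to compare them exactly.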
X.509 Certificates
^^^^^^^^^^^^^^^^^^
If a user possesses an X.509 certificate, then they may use the domain
`_.sha256.x509.id.git-scm.com` using a lowercase hex form of the SHA-256
fingerprint of the certificate. For example, if the key fingerprint is
`e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855`, then the ID
would be
`e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855@_.sha256.x509.id.git-scm.com`.
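A sketch of the X.509 case in Python follows. It assumes that the fingerprint is the SHA-256 digest of the DER encoding of the whole certificate, which is how such fingerprints are commonly computed but is not spelled out in this document. The function name `x509_sha256_id` is hypothetical.

```python
# Minimal sketch; assumes the fingerprint covers the DER-encoded certificate.
import hashlib


def x509_sha256_id(der_certificate: bytes) -> str:
    """Build the anonymous ID for an X.509 certificate given its DER bytes."""
    fingerprint = hashlib.sha256(der_certificate).hexdigest()  # lowercase hex
    return f"{fingerprint}@_.sha256.x509.id.git-scm.com"


if __name__ == "__main__":
    import sys

    # Usage: python x509_id.py certificate.der
    with open(sys.argv[1], "rb") as handle:
        print(x509_sha256_id(handle.read()))
```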
Emission
~~~~~~~~
A user may specify, instead of `user.email`, a `user.signingkey` (or a suitable
protocol-specific setting). If `user.idFormat` is set to `email`, then the
user's email will be written into the commit; if it is instead set to `key`,
then the ID corresponding to the key is extracted from the signing program and
that is used instead. `id` can be used to specify the `user.id` value. An order
of items to try can be specified with a colon-separated list. The default, which
is subject to change, is `id:email:key`. This allows users to specify an ID
which is independent of their email.
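The fallback order can be sketched as a simple loop over the colon-separated list. This is only an illustration of the selection logic described above: the function name `resolve_ident_email`, the plain dictionary standing in for Git configuration, and the `key_id` argument standing in for the ID extracted from the signing program are all hypothetical.

```python
# Minimal sketch of the selection order; not Git's implementation.
from typing import Dict, Optional


def resolve_ident_email(
    id_format: str,
    config: Dict[str, str],
    key_id: Optional[str] = None,
) -> Optional[str]:
    """Return the first available value in the order given by id_format.

    key_id stands in for the ID derived from the configured signing key.
    """
    for source in id_format.split(":"):
        if source == "id" and config.get("user.id"):
            return config["user.id"]
        if source == "email" and config.get("user.email"):
            return config["user.email"]
        if source == "key" and key_id:
            return key_id
    return None


if __name__ == "__main__":
    config = {"user.email": "author@example.com"}
    print(resolve_ident_email("id:email:key", config, "1234@_.user.example.com"))
    print(resolve_ident_email("key:email", config, "1234@_.user.example.com"))
```

With the default order, a user who has set only `user.email` keeps today's behavior, while setting `user.id` or reordering the list opts in to an anonymous ID.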
For patches, a user may specify `format.id` as `as-is` to leave the data as is,
or as `mailmap` to use the mailmap value to rewrite it to the value in the
mailmap. If the user specifies `mailmap-metadata`, then an in-body `From:` line
in the patch is written to contain the author ID using the ID as written in the
commit, but a format-patch metadata header is written using the mailmap entry in
the commit.
Expected Mailmap Improvements
-----------------------------
Right now, the mailmap is included in a repository as part of a regular commit.
This means it has a history, which is undesirable if the user would like to
completely rewrite their identity.
This can be easily solved with some mailmap improvements. `git clone` will
learn an option, `--use-mailmap`, which will specifically fetch the ref
`refs/mailmap` from the remote and keep it up to date using force updates if
necessary. This option will also specify `mailmap.blob` to point to the
`.mailmap` file in this ref, which allows the user to automatically keep it up
to date with the remote.
`git am` or `git apply` can then apply the mailmap entry from the patch to the
appropriate ref with `--use-mailmap`. The default is `--use-mailmap=amend`,
which amends the existing commit. If a user would like to preserve a history
for some reason, they can use `--use-mailmap=commit`. Maintainers can
then push this ref using the normal push refspecs, or explicitly with
`--mailmap`, which is equivalent to `+refs/mailmap:refs/mailmap`.
The goal of this is to make interacting with the mailmap refs automatic and
transparent whenever other data is fetched or cloned from the remote.