git@vger.kernel.org mailing list mirror (one of many)
* Improving support for name changes in git
@ 2023-04-04 18:00 Bran Hagger
  2023-04-06  1:59 ` Junio C Hamano
  2023-04-26 18:35 ` Gwyneth Morgan
  0 siblings, 2 replies; 3+ messages in thread
From: Bran Hagger @ 2023-04-04 18:00 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Emily Shaffer

Hello Git community,

I'm interested in volunteering to help improve the process for users changing their name in Git.

I've seen the notes from the Git summit[1] and the old proposal to change the .mailmap to use hashes instead of plaintext names[2]. The problem with both approaches is that it is easy for other users to figure out the old name, which is a privacy concern for many people who change their names. Since the reverse of the hashes in the second case can be easily brute-forced, using hashes in the .mailmap provides no additional protection.

A system that prevents people from reverse-engineering the old name of a user who changes their name would require two key components:

1. The method of determining the current name of the author of a git commit cannot rely on any information derived from their old name.
2. The mapping to the current name of the author of a git commit cannot contain any history.

Solving the first problem seems reasonably doable. Instead of each commit having an author name and email, the author section could contain a hash that is used for the mapping. To maintain compatibility with older versions of git, the format could look something like:
Author: Hash #user.idHash <email@lookIn.newMailmap>
where user.idHash is a randomly generated number set in the .gitconfig the same way user.name and user.email currently are. A .newMailmap file (or whatever name we choose to give it) would then map from user id hashes to user names and emails.

The second problem of how to maintain a mapping of user.idHash without history is a radical departure from how git currently works. While handling such a file on the client side is probably not too technically complicated, it raises several questions:

* How can a git repository accept changes and protect against malicious actors modifying the .newMailmap file (or whatever we choose to name it)? Making pull requests to modify the file and keeping those pull requests around recreates the old issue of having a record of every name change.
* How are merge conflicts handled?
* How do we ensure users can only set the name and email for their own hashes? If the commits are signed this could be done via signing verification, but my understanding is that signing commits is relatively rare.

Has there been any further work done on supporting git name changes that I missed? Are there any existing files without git history that face similar issues?

[1] https://code.googlesource.com/git/summit/2020/+/main/notes.md
[2] https://lore.kernel.org/git/20210103211849.2691287-1-sandals@crustytoothpaste.net/

Thank you,
Bran (he/him)

P.S. Apologies for potentially double-sending this. My email client accidentally added HTML to the first copy.


* Re: Improving support for name changes in git
  2023-04-04 18:00 Improving support for name changes in git Bran Hagger
@ 2023-04-06  1:59 ` Junio C Hamano
  2023-04-26 18:35 ` Gwyneth Morgan
  1 sibling, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2023-04-06  1:59 UTC (permalink / raw)
  To: Bran Hagger; +Cc: git@vger.kernel.org, Emily Shaffer

Bran Hagger <brhagger@microsoft.com> writes:

> I'm interested in volunteering to help improve the process for
> users changing their name in Git.

To "improve", we need to understand what these users want when they
change their names.  Changing the names and changing the e-mail
addresses are both commonly done, and people, depending on
circumstances, want different things from the tool.  Some do not want
it to be known that the person who used to use that old name is the
same person who uses this new name.  Some do not mind their old name
or address being in the record but want to take credit for what
they did under both names.  There may be some position in between,
with various degrees of being realistic (e.g. "I do not want to be
associated with the old commit, but at the same time I do want to
take credit for it"---is that a reasonable desire?).

> Solving the first problem seems reasonably doable....

Up to this point, I found what you wrote to be reasoned very nicely.
However, ...

> The second problem of how to maintain a mapping of user.idHash
> without history is a radical departure from how git currently
> works.

... I think the above is an understatement.  No "radical departure"
would change the fundamental issue here: people need to be able to
map the random token X to the "current name" right now, and the
mechanism used to do so can be replayed at a later date because a
mapping will be distributed, copied and saved.  Or a much simpler
and more obvious source of the problem is that people have memories.
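
A minimal Python sketch of that replay, using made-up tokens and names and
assuming the mapping is a plain token-to-name table: anyone who kept an
older copy only has to compare it with the current one.

```python
def find_renames(old_snapshot: dict[str, str],
                 new_snapshot: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {token: (old_name, new_name)} for tokens whose name changed."""
    renames = {}
    for token, old_name in old_snapshot.items():
        new_name = new_snapshot.get(token)
        if new_name is not None and new_name != old_name:
            renames[token] = (old_name, new_name)
    return renames


# Hypothetical copies of the mapping, saved at two different times.
saved_last_year = {"3f9a": "Old Name", "77c1": "Sam Sample"}
fetched_today = {"3f9a": "New Name", "77c1": "Sam Sample"}

renames = find_renames(saved_last_year, fetched_today)
print(renames)
```

Dropping history from the mapping file itself does not help once a single
older copy exists anywhere.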

People change names and addresses over the course of their lives.
Employers may encourage their employees to use their corporate ident
when contributing to an external project, and often their employment
contract would make it clear that rights to the work belong to the
employers.  When employees move on, old contributions need to stay
"owned" by the original user ident.

	Side note: when changing an employer, people may more often
	change the address but not names.  But technically names and
	addresses as part of author/committer ident have the same
	characteristics in Git (e.g. being part of an etched-in-stone
	identity string).  Because the address is much less loaded
	emotionally, I'll talk about address change in this
	paragraph, but the same discussion applies to name change.

Some of these employees may not mind letting others know that the
person who made these old contributions and the person who is making
new contributions under different name and/or address are the same
person.  Others may be ashamed of their past association with the $EVIL
company and may want to start afresh, without their past employment
with them being known.

The mailmap mechanism works well for the former group of folks.
It allows them to group the contributions by such a person who had
multiple idents over time into a single bucket.  But the mechanism
may not be suited for other uses, including the latter.

Some folks, after changing their name and/or address, do not want it
to be known that they used to use that name and/or address (e.g.
they may be a victim of a crime, being stalked, etc.)  The mailmap
mechanism would not help, even with your "random token" redirection,
and it shouldn't, because for those folks, they do not want to be
associated with their old ident after they started using a new one.
The idea to use mailmap to somehow "link" the author of these old
commits (made under old ident) and the author of the new commits
(made under new ident of the same physical person who wants the
association with the old ident not to be known) _creates_ the
problem of "the linkage between two idents, which was made with
clever use of random token to make it irreversible, can be
recovered".

If "Such and such person used to work at $CORP and made these
contributions" was publicly known as a fact before the person
changed their name and/or address, it is impossible to force all
other people to forget.  Wouldn't the only practical solution be to
stick to your new ident, and not talk about the old ident you used
to use?  If you try to abuse mailmap for something it wasn't even
designed for and add an entry that links the old and new ident in
some way, isn't it backwards as a solution, when what you want is for
the linkage between the old ident and you not to be known?


* Re: Improving support for name changes in git
  2023-04-04 18:00 Improving support for name changes in git Bran Hagger
  2023-04-06  1:59 ` Junio C Hamano
@ 2023-04-26 18:35 ` Gwyneth Morgan
  1 sibling, 0 replies; 3+ messages in thread
From: Gwyneth Morgan @ 2023-04-26 18:35 UTC (permalink / raw)
  To: Bran Hagger; +Cc: git@vger.kernel.org, Emily Shaffer, brian m. carlson

On 2023-04-04 18:00:00+0000, Bran Hagger wrote:
> Has there been any further work done on supporting git name changes that I missed? Are there any existing files without git history that face similar issues?
> 
> [1] https://code.googlesource.com/git/summit/2020/+/main/notes.md
> [2] https://lore.kernel.org/git/20210103211849.2691287-1-sandals@crustytoothpaste.net/

There was another proposal posted by brian last year, using signing keys
the author controls instead of hashes:
https://lore.kernel.org/git/20220919145231.48245-1-sandals@crustytoothpaste.net/T/

A different VCS, Pijul, recently adopted a system that seems similar to
brian's proposal, and may provide some inspiration on the user
experience. I haven't seen documentation for it, but there are some
examples of commands here:
https://nest.pijul.com/pijul/pijul/discussions/706

