git@vger.kernel.org mailing list mirror (one of many)
* [PATCH v2 0/5] Hashed mailmap
@ 2021-01-03 21:18 brian m. carlson
  2021-01-03 21:18 ` [PATCH v2 1/5] mailmap: add a function to inspect the number of entries brian m. carlson
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: brian m. carlson @ 2021-01-03 21:18 UTC
  To: git; +Cc: Jeff King, Ævar Arnfjörð Bjarmason, Phillip Wood

Many people, through the course of their lives, will change either a
name or an email address.  For this reason, we have the mailmap, to map
from a user's former name or email address to their current, canonical
forms.  Normally, this works well as it is.

However, sometimes people change a name (or an email) and want to
completely cease use of the former name or email.  This could be because
a transgender person has transitioned, because a person has left an
abusive partner or broken ties with an abusive family member, or for any
number of other good and valuable reasons.  In these cases, placing the
former name in the .mailmap may be undesirable.

For those situations, let's introduce a hashed mailmap, where the user's
former name or email address can be in the form @sha256:<hash>.  This
obscures the former name or email.
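
As a rough illustration of the @sha256:<hash> form, here is a minimal
Python sketch.  It assumes the hash is the lowercase hex SHA-256 of the
exact UTF-8 bytes of the former name or email, with no trailing newline.
That detail and the helper name hashed_mailmap_key are assumptions made
for illustration; the documentation added in patch 5/5 is authoritative
on how to compute the value.

```python
import hashlib


def hashed_mailmap_key(value: str) -> str:
    """Return `value` in the @sha256:<hash> form (hypothetical helper)."""
    # Assumption: the exact UTF-8 bytes are hashed, with no trailing newline.
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return "@sha256:" + digest


if __name__ == "__main__":
    # Placeholder address, not taken from any real mailmap.
    print(hashed_mailmap_key("former@example.com"))
```

The printed string would then stand in for the former name or email in
the .mailmap entry.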

In the course of experimenting with some solutions for v2, I noticed
that our mailmap support has a bunch of problems with case sensitivity.
Notably, it treats local-parts of email addresses case-insensitively,
when RFC 5321 specifically says that they are case sensitive, and it
also treats names case-insensitively, but only for ASCII characters.
Both of those have been fixed here, and the commit messages explain in
lurid detail why, while incompatible, this is the correct behavior.
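
To make the difference concrete, here is a small Python sketch of the
two comparison styles.  It is not the code in mailmap.c.  The function
names are hypothetical, and it assumes that the domain part of an
address is still compared without regard to ASCII case, which the cover
letter does not state.

```python
def ascii_fold(s: str) -> str:
    """Lowercase only ASCII A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def old_names_equal(a: str, b: str) -> bool:
    # Behavior described as the old one: case-insensitive, ASCII only.
    return ascii_fold(a) == ascii_fold(b)


def new_names_equal(a: str, b: str) -> bool:
    # Behavior described as the new one: exact, case-sensitive match.
    return a == b


def new_emails_equal(a: str, b: str) -> bool:
    # Addresses without an "@" are compared exactly.
    if "@" not in a or "@" not in b:
        return a == b
    local_a, _, domain_a = a.rpartition("@")
    local_b, _, domain_b = b.rpartition("@")
    # Local-part is case sensitive; the domain comparison is an assumption.
    return local_a == local_b and ascii_fold(domain_a) == ascii_fold(domain_b)
```

With ASCII-only folding, "JOHN DOE" matches "john doe" but "Ævar" does
not match "ævar", which is the inconsistency for names described above.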

I've also added some performance numbers and explained some alternative
solutions in the commit message for the final patch.  That's in addition
to the performance improvements I've made so that the feature is both
cheaper for users and nearly invisible for non-users.  That isn't quite
the same as adding a perf test, which I haven't done, but I think this
explains the situation quite well.  If folks are still dying for a perf
test, I can add one in v3.

I will point out that fully hashing a mailmap isn't necessarily cheap,
but how expensive it is depends on the weighting of current and former
members of the project.  As mentioned in the original thread, I think a
hash rather than an encoding is the right choice here.  It is likely
that in a few iterations of hardware, all users will have accelerated
SHA-256 and the cost will end up being a handful of cycles per name
overall.
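
One way to see where the hashing cost comes from is the following
Python sketch of a lookup.  It is only an illustration under stated
assumptions, not the implementation in this series: the class
MailmapSketch and its methods are hypothetical, only email-to-email
mappings are modeled, and the hash is assumed to be the hex SHA-256 of
the UTF-8 bytes of the address.

```python
import hashlib

PREFIX = "@sha256:"


class MailmapSketch:
    """Toy mailmap with plain and hashed former addresses (hypothetical)."""

    def __init__(self):
        self.plain = {}   # former email -> canonical email
        self.hashed = {}  # hex SHA-256 of former email -> canonical email

    def add(self, former: str, canonical: str) -> None:
        if former.startswith(PREFIX):
            self.hashed[former[len(PREFIX):].lower()] = canonical
        else:
            self.plain[former] = canonical

    def lookup(self, email: str) -> str:
        # Plain entries are checked first and need no hashing.
        if email in self.plain:
            return self.plain[email]
        # With no hashed entries at all, nothing is ever hashed.
        if not self.hashed:
            return email
        # Only a plain miss with hashed entries present pays for SHA-256.
        digest = hashlib.sha256(email.encode("utf-8")).hexdigest()
        return self.hashed.get(digest, email)
```

In this sketch, a mailmap without hashed entries never computes a hash,
and one with hashed entries hashes only the addresses that miss the
plain table.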

Changes from v1:
* Fix case-sensitivity problems in the mailmap.
* Add documentation.
* Add explanation of how to compute the value.
* Add some optimizations to improve performance.
* Improve commit message to discuss performance numbers and explain
  rationale better.

brian m. carlson (5):
  mailmap: add a function to inspect the number of entries
  mailmap: switch to opaque struct
  t4203: add failing test for case-sensitive local-parts and names
  mailmap: use case-sensitive comparisons for local-parts and names
  mailmap: support hashed entries in mailmaps

 Documentation/mailmap.txt |  28 ++++++++
 builtin/blame.c           |   2 +-
 builtin/check-mailmap.c   |   4 +-
 builtin/commit.c          |   2 +-
 mailmap.c                 | 139 +++++++++++++++++++++++++++++++++-----
 mailmap.h                 |  15 ++--
 pretty.c                  |   4 +-
 pretty.h                  |   2 +-
 revision.c                |   2 +-
 revision.h                |   3 +-
 shortlog.h                |   3 +-
 t/t4203-mailmap.sh        |  64 +++++++++++++++++-
 12 files changed, 236 insertions(+), 32 deletions(-)



end of thread, other threads:[~2021-01-12 14:10 UTC | newest]

Thread overview: 19+ messages
2021-01-03 21:18 [PATCH v2 0/5] Hashed mailmap brian m. carlson
2021-01-03 21:18 ` [PATCH v2 1/5] mailmap: add a function to inspect the number of entries brian m. carlson
2021-01-04 15:14   ` Ævar Arnfjörð Bjarmason
2021-01-04 17:04   ` René Scharfe
2021-01-03 21:18 ` [PATCH v2 2/5] mailmap: switch to opaque struct brian m. carlson
2021-01-04 15:17   ` Ævar Arnfjörð Bjarmason
2021-01-03 21:18 ` [PATCH v2 3/5] t4203: add failing test for case-sensitive local-parts and names brian m. carlson
2021-01-03 21:18 ` [PATCH v2 4/5] mailmap: use case-sensitive comparisons for " brian m. carlson
2021-01-04 16:10   ` Ævar Arnfjörð Bjarmason
2021-01-06  0:46     ` Junio C Hamano
2021-01-12 14:08       ` Ævar Arnfjörð Bjarmason
2021-01-03 21:18 ` [PATCH v2 5/5] mailmap: support hashed entries in mailmaps brian m. carlson
2021-01-05 14:21   ` Ævar Arnfjörð Bjarmason
2021-01-06  0:24     ` brian m. carlson
2021-01-10 19:24       ` Ævar Arnfjörð Bjarmason
2021-01-10 21:26         ` brian m. carlson
2021-01-05 20:05   ` Junio C Hamano
2021-01-06  0:28     ` brian m. carlson
2021-01-06  1:50       ` Junio C Hamano
