git@vger.kernel.org mailing list mirror (one of many)
From: "Florine W. Dekker" <florine@fwdekker.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>
Cc: "René Scharfe" <l.s.r@web.de>, git@vger.kernel.org
Subject: Re: Wildcards in mailmap to hide transgender people's deadnames
Date: Tue, 20 Sep 2022 16:58:35 +0200
Message-ID: <2c29ca18-4b45-af44-5690-0b9804a81461@fwdekker.com>
In-Reply-To: <220920.86edw65ngv.gmgdl@evledraar.gmail.com>

On 20/09/2022 12:23, Ævar Arnfjörð Bjarmason wrote:
>> I'm happy to resurrect my SHA-256 hashed mailmap series if we're
>> all willing to agree to not implement trivial decoding features.
> I'd think you'd want to be really clear about what that forward promise
> would entail. E.g. I've sometimes wanted a way for "git log" to report
> when it munges commits due to adding notes, re-encoding the data etc. If
> someone submits that sort of feature should it always explicitly leave
> out mailmap-related rewrites?
>
> And even if it does, who do we think we're really helping in the end,
> given the trivial way you could get that with an external "diff" with
> the one-liner above?

I think the most important thing here is that the mailmap should not
allow for even more trivial ways to discover old names than already
exist. I've thought more about what you said, Ævar, and now I'm wary of
a mailmap implementation that puts my old and new information next to
each other, even if encoded (it doesn't matter whether it's URL-encoded
or base64-encoded). I think it's likely that some external data-mining
tool will decode the old address and place the two side by side, so
that searching for one email address in a search engine also shows the
other. I think a hash will prevent these automated miners from doing
that, since reversing a hash is too much effort for an untargeted
attack (right? if you disagree, how about a salted hash?).

Either way, I think any mailmap-based solution will allow the old and
new name to be linked to each other by an adversary, as you showed with
your neat one-liner. However, I think a (salted?) hash in the mailmap
will be sufficient for casual obfuscation, where harassment is unlikely
but the user still wants to prevent accidental disclosure or plain
linkage.
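
To make the distinction above concrete, here is a minimal sketch of
reversible encodings versus a hashed lookup key. It assumes a
hypothetical hashed table (digest of the old address mapped to the new
address); this is not the format of any existing mailmap series, and
the addresses, the salt, and the helper names are made up for
illustration.

```python
import base64
import hashlib
from urllib.parse import quote, unquote

OLD = "old.name@example.com"  # hypothetical old address
NEW = "new.name@example.com"  # hypothetical new address

# Reversible encodings: any scraper can decode these without guessing.
b64 = base64.b64encode(OLD.encode()).decode()
url = quote(OLD, safe="")
assert base64.b64decode(b64).decode() == OLD
assert unquote(url) == OLD

# Assumption: the salt would live in the repository, so it is public.
SALT = b"public-salt"


def hashed_key(address: str, salt: bytes = SALT) -> str:
    """Return the hex SHA-256 digest of salt + lowercased address."""
    return hashlib.sha256(salt + address.lower().encode()).hexdigest()


# Hypothetical hashed table: digest of the old address -> new address.
table = {hashed_key(OLD): NEW}


def lookup(address_in_commit: str) -> str:
    """Map an address found in a commit to its replacement, if any."""
    return table.get(hashed_key(address_in_commit), address_in_commit)


def unmask(candidates: list[str]) -> list[str]:
    """Return the candidate addresses whose digest appears in the table."""
    return [c for c in candidates if hashed_key(c) in table]
```

The table never contains the old address in readable form, so a tool
that merely decodes entries learns nothing. Someone who already holds a
list of candidate addresses (for example, every author address in the
commit history) can still call something like `unmask` and find the
match, and a public salt does not change that; it only rules out
precomputed digest tables.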

>> I also have an alternate proposal which I pitched to some folks at Git
>> Merge and which I just finished writing up that basically moves personal
>> names and emails out of commits, replacing them with opaque identifiers,
>> and using a constantly squashed mailmap commit in a special ref to store
>> the mapping.  This doesn't address changing identities in existing
>> commits, which as we've seen are nearly impossible to fix, but it does
>> address new ones.  I've sent it out at
>> https://lore.kernel.org/git/20220919145231.48245-1-sandals@crustytoothpaste.net/.
> As I understand the difference in this scenario a hypothetical future
> repo's Y commit's authorship would have been opaque in the first place
> using this mechanism, and via your "refs/mailmap" you'd have mapped
> Y=Bob.
>
> You then make a future X commit, and map X=Alice, and have a .mailmap
> entry which mapped Y=X, but that entry would refer to the opaque value.
>
> That certainly changes things in a fundamental way, and goes most or all
> of the way to mitigating what I've been pointing out as a flaw in these
> proposals.
>
> I'd still be very much on the fence about whether we'd ever want to
> recommend that to someone concerned with "harassment" and the like (as
> opposed to a milder social preference), as all it would take to get to
> that point is someone having a copy of the older "refs/mailmap" to
> unmask the previous "Y".

I first want to say that I really like your proposal, Brian! I didn't 
think this subject would get the attention it did, but I'm happy it's 
being picked up the way it is, and to see this lively discussion going 
on between y'all!

And Ævar, you're right that having an older copy would allow one to
discover a mapping from the old to the new name. But this will happen
however we implement this, because the adversary can always keep an
old copy of the entire repo, clone the new one, and compare the two
logs. (You can probably come up with a neat one-liner, but that's
beside the point ;-).) I think that the most appropriate
threat model here is to assume that everyone who has accessed the repo 
before the name change will notice the name change and will be able to 
create a mapping. Instead, our goal should be to create a system that 
ensures that people who first access the repo after the name change are 
unable to find the old name at all. I think Brian's proposal achieves 
this. This is analogous to the real world where people who knew me 
before my transition will probably never (completely) forget my old 
name, and it's useless to try to make that happen, but at least I can 
prevent new people I meet from finding out the old name.
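
The "compare the two logs" step in the threat model above can be
sketched as follows. This is only an illustration under assumptions:
each log is taken to be a mapping from commit ID to the author string
displayed for that commit, and the commit IDs are assumed to be
unchanged between the old and the new copy. The function name and the
sample data are hypothetical.

```python
def diff_authors(old_log: dict[str, str],
                 new_log: dict[str, str]) -> dict[str, set[str]]:
    """Map each old author string to the new strings that replaced it.

    Both arguments map a commit ID to the author shown for that commit.
    Only commits present in both logs are compared.
    """
    changes: dict[str, set[str]] = {}
    for commit_id, old_author in old_log.items():
        new_author = new_log.get(commit_id)
        if new_author is not None and new_author != old_author:
            changes.setdefault(old_author, set()).add(new_author)
    return changes


# Hypothetical sample data: one author changed, one did not.
old_copy = {
    "c1": "Old Name <old@example.com>",
    "c2": "Bob <bob@example.com>",
}
new_copy = {
    "c1": "New Name <new@example.com>",
    "c2": "Bob <bob@example.com>",
    "c3": "New Name <new@example.com>",
}
mapping = diff_authors(old_copy, new_copy)
```

Someone who first clones the repo after the change has only the
equivalent of `new_copy`, so there is nothing to diff against.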

- Florine




