git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: Oddidies in the .mailmap parser & future syntax extensions
Date: Fri, 10 Sep 2021 21:50:28 +0200	[thread overview]
Message-ID: <87h7esb3ig.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <xmqq1r5wti5a.fsf@gitster.g>


On Fri, Sep 10 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> [Changed subject]
>
> [jc: culled CC addresses]
>
>> I'd expect:
>>
>>     Foo <foo@example.com> Bar
>>
>> To be an alias/shorthand for:
>>
>>     Foo <foo@example.com> Bar <foo@example.com>
>
> OK.
>
>> More annoying is that this:
>>
>>     New <foo@example.com> <bar@example.com>
>>     <foo@example.com> <zar@example.com>
>>
>> Doesn't mean the same as:
>> ...
>> I.e. I'd expect the name to map to the empty string, *unless* we saw an
>> earlier address, i.e. just as we do for the first bar -> foo line (we
>> map it to a name of "New", we don't map it to an empty name).
>
> You expect the first one to map (anyname, <bar@example.com>) to
> ("New", <foo@example.com>) and you describe the second one does not
> map the human-readable part to "New", but it is unclear what the
> code does, or why you expect it to map to "" (or what your
> expectation is, for that matter, exactly---do you want an empty
> string, or do you want "New", or something else???).
>
> FWIW, if we were designing it from scratch, I'd expect the second
> one to map (anyname, <zar@example.com>) to ($1, <foo@example.com>),
> keeping the human-readable part as-is and only map the e-mail part.

Yes, that's what it does now. I.e. we'll create two mappings from those
lines, namely (in line-by-line regex-like pseudosyntax):

    [[(.*), bar@example.com>] => [New, foo@example.com]]
    [[(.*), zar@example.com>] => [$1, foo@example.com]]

I.e. for the second one we'll retain whatever name we found there.

> Or do you expect that when these two entries appear together, the
> first entry with "New" is carried over to the second entry?

Yes, I'd expect it to be stateful and implicitly do:

    [[(.*), bar@example.com>] => [New, foo@example.com]]
    [[(.*), zar@example.com>] => [find_last_name_for_address_in_mailmap(find_last_explicit_) || $1, foo@example.com]]

I.e. we'd map the entries for you in git.git like:
    
    diff --git a/.mailmap b/.mailmap
    index 9c6a446bdfb..2a884981e9d 100644
    --- a/.mailmap
    +++ b/.mailmap
    @@ -127,8 +127,8 @@ Julian Phillips <julian@quantumfyre.co.uk> <jp3@quantumfyre.co.uk>
     Junio C Hamano <gitster@pobox.com> <gitster@pobox.com>
    -Junio C Hamano <gitster@pobox.com> <junio@hera.kernel.org>
    -Junio C Hamano <gitster@pobox.com> <junio@kernel.org>
    -Junio C Hamano <gitster@pobox.com> <junio@pobox.com>
    -Junio C Hamano <gitster@pobox.com> <junio@twinsun.com>
    -Junio C Hamano <gitster@pobox.com> <junkio@cox.net>
    -Junio C Hamano <gitster@pobox.com> <junkio@twinsun.com>
    +<gitster@pobox.com> <junio@hera.kernel.org>
    +<gitster@pobox.com> <junio@kernel.org>
    +<gitster@pobox.com> <junio@pobox.com>
    +<gitster@pobox.com> <junio@twinsun.com>
    +<gitster@pobox.com> <junkio@cox.net>
    +<gitster@pobox.com> <junkio@twinsun.com>

Which are currently mostly redundant, except for one old commit under
junio@twinsun.com where the " C " wasn't present in your name field.

One might say that's a feature, and you'd like to be explicit about when
to map addresses and when to map names, i.e. if we were trying to
minimize the size of the .mailmap then this would be the most minimal
thing we can get away with currently:
    
    diff --git a/.mailmap b/.mailmap
    index 9c6a446bdfb..4668f9b32f2 100644
    --- a/.mailmap
    +++ b/.mailmap
    @@ -127,8 +127,8 @@ Julian Phillips <julian@quantumfyre.co.uk> <jp3@quantumfyre.co.uk>
     Junio C Hamano <gitster@pobox.com> <gitster@pobox.com>
    -Junio C Hamano <gitster@pobox.com> <junio@hera.kernel.org>
    -Junio C Hamano <gitster@pobox.com> <junio@kernel.org>
    -Junio C Hamano <gitster@pobox.com> <junio@pobox.com>
    +<gitster@pobox.com> <junio@hera.kernel.org>
    +<gitster@pobox.com> <junio@kernel.org>
    +<gitster@pobox.com> <junio@pobox.com>
     Junio C Hamano <gitster@pobox.com> <junio@twinsun.com>
    -Junio C Hamano <gitster@pobox.com> <junkio@cox.net>
    -Junio C Hamano <gitster@pobox.com> <junkio@twinsun.com>
    +<gitster@pobox.com> <junkio@cox.net>
    +<gitster@pobox.com> <junkio@twinsun.com>

But I just can't imagine cases where the proposed shorthand isn't useful
for everyone.

Who wants to use mailmap, *and* map one e-mail address to another, *and*
has an entry explicitly mapping the name, but *would* mind having git be
auto-smart and follow the chain of that explicit name mapping if there's
an entry after that with an an e-mail -> that-earlier-email mapping?

I think the answer is "nobody" and it would be unambiguously helpful,
but maybe I'm wrong.
    
>> Doing that would be strictly backwards compatible, i.e. now we'll
>> entirely ignore the 3rd E-Mail address. It does mean we also
>> accidentally support things like:
>>
>>     New <foo@example.com> <bar@example.com> # A comment, because we ignore everything after the 2nd address
>>
>> But don't tell anyone I told you that :) But that is something that
>> might technically have inadvertently closed the door to future syntax
>> extensions, but we could probably do them anyway, or at worst have some
>> heuristic.
>
> I vaguely recall that it was not an accident but a deliberate
> feature to allow comments, but don't tell anyone I told you that.

Which works, right until someone has an old name entry that looks like a
comment :)

  reply	other threads:[~2021-09-10 20:05 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-10 13:02 [PATCH] .mailmap: Update mailmap Fangyi Zhou
2021-09-10 15:22 ` Gwyneth Morgan
2021-09-10 15:31   ` Jeff King
2021-09-10 15:35     ` Sibi Siddharthan
2021-09-11  0:32     ` Junio C Hamano
2021-09-11  1:31       ` Ævar Arnfjörð Bjarmason
2021-09-11 14:52         ` Jeff King
2021-09-11 14:47       ` Jeff King
2021-09-10 16:48   ` Oddidies in the .mailmap parser & future syntax extensions Ævar Arnfjörð Bjarmason
2021-09-10 18:11     ` Junio C Hamano
2021-09-10 19:50       ` Ævar Arnfjörð Bjarmason [this message]
2021-09-10 20:20         ` Junio C Hamano
2021-09-13  4:02           ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7esb3ig.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).