git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "Jakub Narębski" <jnareb@gmail.com>
Cc: Git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 3/3] gitweb: Link to "git describe"'d commits in log messages
Date: Wed, 21 Sep 2016 19:58:18 +0200	[thread overview]
Message-ID: <CACBZZX5QF7FztmU-mqOEx40kC-PpQ9SqcL8+3HtuntcRZ1tWzA@mail.gmail.com> (raw)
In-Reply-To: <fadd75f3-3737-1eaf-30f3-46a2ef132b27@gmail.com>

On Wed, Sep 21, 2016 at 7:09 PM, Jakub Narębski <jnareb@gmail.com> wrote:
> W dniu 21.09.2016 o 13:44, Ævar Arnfjörð Bjarmason napisał:
>
>> Change the log formatting function to know about "git describe" output
>> like v2.8.0-4-g867ad08 in addition to just plain 867ad08.
>
> All right, that is a good plan.
>
>>
>> This also fixes a micro-regression in my change of the minimum SHA1
>> length from 8 to 7, which is that dated tags like
>> hadoop-20160921-113441-20-g094fb7d would start thinking the "20160921"
>> part was a commit.
>
> Actually 20160921 is 8 characters, so assuming that '-' is treated
> as word boundary by Perl, it is not a regression; this false positive
> was there.  The new feature would help, instead of linking false match
> it links whole git-describe output.
>
> So this paragraph needs to be changed wrt. the above.

Yeah I just miscounted there, will change that.

> Note that there are quite a bit of shortened SHA-1 that are composed
> entirely from digits, without a-f characters.

*nod*

>>
>> There are still many valid refnames that we don't link to
>> e.g. v2.10.0-rc1~2^2~1 is also a valid way to refer to
>> v2.8.0-4-g867ad08, but I'm not supporting that with this commit,
>> similarly it's trivially possible to create some refnames like
>> "æ/var-gf6727b0" or whatever which won't be picked up by this regex.
>
> Hopefully hierarchical tags are rare.  We need to reduce false
> positives.
>
>>
>> There's surely room for improvement here, but I just wanted to address
>> the very common case of sticking "git describe" output into commit
>> messages without trying to link to all possible refnames, that's going
>> to be a rather futile exercise given that this is free text, and it
>> would be prohibitively expensive to look up whether the references in
>> question exist in our repository.
>
> Note that we do not ask Git at the time of displaying commit message
> if the link is valid for performance reasons; we link it, and the link
> may be invalid if it was a false positive.
>
> Note that recommended way to refer to other commit in commit mesages
> is (see Documentation/SubmittingPatches):
>
>   If you want to reference a previous commit in the history of a stable
>   branch, use the format "abbreviated sha1 (subject, date)",
>   with the subject enclosed in a pair of double-quotes, like this:
>
>      Commit f86a374 ("pack-bitmap.c: fix a memleak", 2015-03-30)
>      noticed that ...
>
> Hmmm... this makes previous commit even more important.
>
>> ---
>>  gitweb/gitweb.perl | 18 ++++++++++++++++--
>>  1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
>> index 101dbc0..3a52bc7 100755
>> --- a/gitweb/gitweb.perl
>> +++ b/gitweb/gitweb.perl
>> @@ -2036,10 +2036,24 @@ sub format_log_line_html {
>>       my $line = shift;
>>
>>       $line = esc_html($line, -nbsp=>1);
>> -     $line =~ s{\b([0-9a-fA-F]{7,40})\b}{
>> +     $line =~ s{
>> +        \b
>> +        (
>> +            # The output of "git describe", e.g. v2.10.0-297-gf6727b0
>> +            # or hadoop-20160921-113441-20-g094fb7d
>
> All right, for more complex regular expressions using in-line comments
> (extended regexp in Perl) is a good idea.
>
>> +            (?<!-) # see strbuf_check_tag_ref(). Tags can't start with -
>> +            [A-Za-z0-9.-]+
>> +            (?!\.) # refs can't end with ".", see check_refname_format()
>
> If we can assume that tag name is at least two characters (instead of
> at least one character), we could get rid of those extended regexp
> lookaround assertions:
>
>   (?<!pattern) - zero-width negative lookbehind assertion
>   (?!pattern)  - zero-width negative lookahead  assertion
>
> That is:
>
>   +            [A-Za-z0-9.]   # see strbuf_check_tag_ref(). Tags can't start with -
>   +            [A-Za-z0-9.-]*
>   +            [A-Za-z0-9-]   # refs can't end with ".", see check_refname_format()

Why get rid of them? I'm all for improving the regex, there's bound to
be lots of bugs in it, but since it's perl we can freely use its
extended features.

> Also, the canonical documentation for what is allowed in refnames
> is git-check-ref-format(1)... though it does not look like it includes
> "tags cannot start with '-'".

Yeah, looks like that manpage needs to be patched.

> Anyway, perhaps 'is it valid refname' could be passed to a subroutine,
> or a named regexp (which might be more involved, like disallowing two
> consecutive dots, e.g. "(?!.*\.{2})" at beginning).
>
>> +            -g[0-9a-fA-F]{7,40}
>
> If we are limiting to git-describe output, we can get rid of A-F here.

Indeed.

>> +            |
>> +            # Just a normal looking Git SHA1
>> +            [0-9a-fA-F]{7,40}
>> +        )
>> +        \b
>> +    }{
>>               $cgi->a({-href => href(action=>"object", hash=>$1),
>>                                       -class => "text"}, $1);
>> -     }eg;
>> +     }egx;
>>
>>       return $line;
>>  }
>>
>
> Good work.
>
> I assume that you are using git-describe output in commit messages
> a lot, isn't it?

Yeah, and I got tired of gitweb not linking to any of the commits I
was referencing.

  reply	other threads:[~2016-09-21 17:58 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-21 11:44 [PATCH 1/3] gitweb: Fix an ancient typo in v1.7.7-rc1-1-g0866786 Ævar Arnfjörð Bjarmason
2016-09-21 11:44 ` [PATCH 2/3] gitweb: Link to 7-character SHA1SUMS in commit messages Ævar Arnfjörð Bjarmason
2016-09-21 16:26   ` Jakub Narębski
2016-09-21 18:04     ` Ævar Arnfjörð Bjarmason
2016-09-21 18:28       ` Jakub Narębski
2016-09-21 20:09         ` Ævar Arnfjörð Bjarmason
2016-09-21 11:44 ` [PATCH 3/3] gitweb: Link to "git describe"'d commits in log messages Ævar Arnfjörð Bjarmason
2016-09-21 16:50   ` Junio C Hamano
2016-09-21 17:49     ` Jakub Narębski
2016-09-21 18:01       ` Junio C Hamano
2016-09-21 17:09   ` Jakub Narębski
2016-09-21 17:58     ` Ævar Arnfjörð Bjarmason [this message]
2016-09-21 19:13       ` Jakub Narębski
2016-09-21 13:33 ` [PATCH 1/3] gitweb: Fix an ancient typo in v1.7.7-rc1-1-g0866786 Jakub Narębski
2016-09-21 14:17   ` Ævar Arnfjörð Bjarmason
2016-09-21 17:14     ` Jakub Narębski
2016-09-21 17:17       ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACBZZX5QF7FztmU-mqOEx40kC-PpQ9SqcL8+3HtuntcRZ1tWzA@mail.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).