git@vger.kernel.org mailing list mirror (one of many)
From: Johannes Sixt <j6t@kdbg.org>
To: Tassilo Horn <tsdh@gnu.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v4] userdiff: improve java hunk header regex
Date: Tue, 10 Aug 2021 22:57:47 +0200
Message-ID: <d3484278-8413-0d10-e6cd-59a7ff04564b@kdbg.org>
In-Reply-To: <20210810190937.305765-1-tsdh@gnu.org>

On 10.08.21 at 21:09, Tassilo Horn wrote:
> Currently, the git diff hunk headers show the wrong method signature if the
> method has a qualified return type, an array return type, or a generic return
> type because the regex doesn't allow dots (.), [], or < and > in the return
> type.  Also, type parameter declarations couldn't be matched.
> 
> Add several t4018 tests asserting the right hunk headers for increasingly
> complex method signatures:
> 
>   public String[] myMethod(String[] RIGHT)
>   public List<String> myMethod(String[] RIGHT)
>   public <T> List<T> myMethod(T[] RIGHT)
>   public <AType, B> Map<AType, B> myMethod(String[] RIGHT)
>   public <AType, B> java.util.Map<AType, Map<B, B[]>> myMethod(String[] RIGHT)
>   public List<? extends Comparable> myMethod(String[] RIGHT)
>   public <T extends Serializable & Comparable<T>> List<T> myMethod(String[] RIGHT)
> 
> Signed-off-by: Tassilo Horn <tsdh@gnu.org>
> ---
>  t/t4018/java-constructor             |  6 ++++++
>  t/t4018/java-enum-constant           |  6 ++++++
>  t/t4018/java-nested-field            |  6 ++++++
>  t/t4018/java-return-array            |  6 ++++++
>  t/t4018/java-return-generic          |  6 ++++++
>  t/t4018/java-return-generic-bounded  |  6 ++++++
>  t/t4018/java-return-generic-wildcart |  6 ++++++
>  t/t4018/java-return-generic2         |  6 ++++++
>  t/t4018/java-return-generic3         |  6 ++++++
>  t/t4018/java-return-generic4         |  6 ++++++
>  userdiff.c                           | 23 ++++++++++++++++++++++-
>  11 files changed, 82 insertions(+), 1 deletion(-)
>  create mode 100644 t/t4018/java-constructor
>  create mode 100644 t/t4018/java-enum-constant
>  create mode 100644 t/t4018/java-nested-field
>  create mode 100644 t/t4018/java-return-array
>  create mode 100644 t/t4018/java-return-generic
>  create mode 100644 t/t4018/java-return-generic-bounded
>  create mode 100644 t/t4018/java-return-generic-wildcart
>  create mode 100644 t/t4018/java-return-generic2
>  create mode 100644 t/t4018/java-return-generic3
>  create mode 100644 t/t4018/java-return-generic4
> 

These new tests are very much appreciated. You do not have to go wild
with that many return type tests; IMO, the simple one and the most
complicated one should do it. (And btw, s/cart/card/)

> diff --git a/t/t4018/java-return-array b/t/t4018/java-return-array
> new file mode 100644
> index 0000000000..747638b9a8
> --- /dev/null
> +++ b/t/t4018/java-return-array
> @@ -0,0 +1,6 @@
> +class MyExample {
> +    public String[] myMethod(String[] RIGHT) {
> +        // Whatever...
> +        return new; // ChangeMe
> +    }
> +}
> diff --git a/userdiff.c b/userdiff.c
> index 3c3bbe38b0..9bd751b7d2 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -142,7 +142,28 @@ PATTERNS("html",
>  	 "[^<>= \t]+"),
>  PATTERNS("java",
>  	 "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
> -	 "^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
> +         "^[ \t]*("
> +         /* Class, enum, and interface declarations: */
> +         /*   optional modifiers: public */
> +         "(([a-z]+[ \t]+)*"
> +         /*   the kind of declaration */
> +         "(class|enum|interface)[ \t]+"
> +         /*   the name */
> +         "[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)"
> +         /* Method & constructor signatures: */
> +         /*   optional modifiers: public static */
> +         "|(([a-z]+[ \t]+)*"
> +         /*   type params and return types for methods but not constructors */
> +         "("
> +         /*     optional type parameters: <A, B extends Comparable<B>> */
> +         "(<[A-Za-z0-9_,.&<> \t]+>[ \t]+)?"
> +         /*     return type: java.util.Map<A, B[]> or List<?> */
> +         "([A-Za-z_]([A-Za-z_0-9<>,.?]|\\[[ \t]*\\])*[ \t]+)+"
> +         /*   end of type params and return type */
> +         ")?"
> +         /*   the method name followed by the parameter list: myMethod(...) */
> +         "[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)"
> +         ")$",

I don't see the point in this complicated regex. Please recall that it
will be applied only to syntactically correct Java text. Therefore, you
do not have to implement all syntactical corner cases, just be
sufficiently permissive.

What is wrong with

	"^[ \t]*(([A-Za-z_][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",

i.e. take every "token" until an identifier followed by an opening
parenthesis is found. Can types in Java contain parentheses? That would
make my suggested simplified regex too permissive, but otherwise it
would do its job, I would think.
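
One way to check how permissive the simplified pattern is would be to
compile it as a POSIX extended regex and feed it single lines from the
t4018 samples. The following is only a rough sketch: the helper name
matches_simple_java_funcname() is made up for illustration and is not
part of userdiff.c, and the "!" exclusion line of the real java driver
is deliberately left out.

```c
#include <regex.h>
#include <stddef.h>

/*
 * The simplified pattern suggested above, written as a C string
 * literal (hence the doubled backslash before the parenthesis).
 */
const char *simple_java_funcname =
	"^[ \t]*(([A-Za-z_][][?&<>.,A-Za-z_0-9]*[ \t]+)+"
	"[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$";

/*
 * Hypothetical helper: returns 1 if the line matches the pattern,
 * 0 if it does not, and -1 if the pattern fails to compile.
 */
int matches_simple_java_funcname(const char *line)
{
	regex_t re;
	int rc;

	if (regcomp(&re, simple_java_funcname, REG_EXTENDED | REG_NOSUB))
		return -1;
	rc = regexec(&re, line, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

Under these assumptions, "    public String[] myMethod(String[] RIGHT) {"
is accepted and "        return new; // ChangeMe" is rejected. A line
such as "    public <T> List<T> myMethod(T[] RIGHT) {" is also rejected,
because every token before the method name has to begin with a letter
or an underscore, so a leading type parameter list would need "<" to be
allowed as a first character.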

-- Hannes
