git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Sixt <j6t@kdbg.org>
To: Tassilo Horn <tsdh@gnu.org>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v4] userdiff: improve java hunk header regex
Date: Wed, 11 Aug 2021 09:34:22 +0200	[thread overview]
Message-ID: <95ebb2cf-2e6e-912e-7d80-3947a8e3d9e4@kdbg.org> (raw)
In-Reply-To: <87zgtoh6bm.fsf@gnu.org>

Am 11.08.21 um 07:22 schrieb Tassilo Horn:
> Johannes Sixt <j6t@kdbg.org> writes:
>> These new tests are very much appreciated. You do not have to go wild
>> with that many return type tests; IMO, the simple one and the most
>> complicated one should do it. (And btw, s/cart/card/)
> 
> Well, they appeared naturally as a result during development and made it
> easier to spot errors when you know up to which level of complexity it
> still worked.  Is there a stronger reason to remove tests which might
> not be needed, e.g., runtime cost on some CI machines?

I totally understand how the test cases evolved. Having many of them is
not a big deal. It's just the disproportion of tests of this new feature
vs. the existing tests that your patch creates, in particular, when
earlier of the new tests are subsumed by later new tests.

> Another thing I've noticed (with my suggested patch) is that I should
> not try to match constructor signatures.  I think that's impossible
> because they are indistinguishable from method calls, e.g., in
> 
>   public class MyClass {
>       MyClass(String RIGHT) {
>           someMethodCall();
>           someOtherMethod(17)
>               .doThat();
>           // Whatever
>           // ChangeMe
>       }
>   }
> 
> there is no regex way to prefer MyClass(String RIGHT) over
> someOtherMethod().

Good find.

> So all in all, I'd propose this version in the next patch version:
> 
> --8<---------------cut here---------------start------------->8---
> PATTERNS("java",
> 	 "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
>          "^[ \t]*("
>          /* Class, enum, and interface declarations */
>          "(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)"
>          /* Method definitions; note that constructor signatures are not */
>          /* matched because they are indistinguishable from method calls. */
>          "|(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)"
>          ")$",
> 	 /* -- */
> 	 "[a-zA-Z_][a-zA-Z0-9_]*"
> 	 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
> 	 "|[-+*/<>%&^|=!]="
> 	 "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"),
> --8<---------------cut here---------------end--------------->8---

That looks fine.

One suggestion, though. You do not have to have all positive patterns
("class, enum, interface" and "method definitions") in a single pattern
separated by "|". You can place them on different "lines" (note the "\n"
at the end of the first pattern):

	/* Class, enum, and interface declarations */
	"^[ \t]*(...(class|enum|interface)...)$\n"
	/*
	 * Method definitions; note that constructor signatures are not
	 * matched because they are indistinguishable from method calls.
	 */
	"^[ \t]*(...[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*))$",

I don't think there is a technical difference, but I find this form
easier to understand because fewer open parentheses have to be tracked.

-- Hannes

  reply	other threads:[~2021-08-11  7:34 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-10 19:09 [PATCH v4] userdiff: improve java hunk header regex Tassilo Horn
2021-08-10 20:57 ` Johannes Sixt
2021-08-10 22:12   ` Re* " Junio C Hamano
2021-08-11  7:14     ` Johannes Sixt
2021-08-11 16:04       ` Junio C Hamano
2021-08-11 20:32         ` Johannes Sixt
2021-08-11  5:22   ` Tassilo Horn
2021-08-11  7:34     ` Johannes Sixt [this message]
2021-08-11  7:39       ` Tassilo Horn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=95ebb2cf-2e6e-912e-7d80-3947a8e3d9e4@kdbg.org \
    --to=j6t@kdbg.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=tsdh@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).