From: Junio C Hamano <gitster@pobox.com>
To: "Steven Jeuris via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Steven Jeuris" <steven.jeuris@gmail.com>,
	"Steven Jeuris" <steven.jeuris@3shape.com>
Subject: Re: [PATCH v2] userdiff: better method/property matching for C#
Date: Wed, 06 Mar 2024 18:11:31 -0800
Message-ID: <xmqqv85yokoc.fsf@gitster.g>
In-Reply-To: <pull.1682.v2.git.git.1709756493673.gitgitgadget@gmail.com> (Steven Jeuris via GitGitGadget's message of "Wed, 06 Mar 2024 20:21:33 +0000")

"Steven Jeuris via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Steven Jeuris <steven.jeuris@3shape.com>
>
> - Support multi-line methods by not requiring closing parenthesis.
> - Support multiple generics (comma was missing before).
> - Add missing `foreach`, `lock` and  `fixed` keywords to skip over.
> - Remove `instanceof` keyword, which isn't C#.
> - Also detect non-method keywords not positioned at the start of a line.
> - Added tests; none existed before.
>
> The overall strategy is to focus more on what isn't expected for
> method/property definitions, instead of what is, but is fully optional.
>

In other words, roughly: we assume that any file the end user throws
at us is a well-formed program, so instead of enumerating all valid
keywords and limiting the match to exactly those, we use a pattern
that would match any valid keyword (both the currently known ones
and anything the language might add in the future that we do not
know about), and so matches anything that is syntactically plausible
as a definition?

It does make sense to start by assuming that the end user data is a
valid C# program.
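
To make the "reject what cannot be a definition first, then accept
anything plausible" ordering concrete, here is a minimal sketch.  It
is not the code in xdiff-interface.c; it only assumes that the
patterns are tried in order and that a hit on a "!" pattern rejects
the line.  The two pattern strings are copied from the v2 patch
quoted below; the names skip_keywords, method_like, matches() and
is_hunk_header() are made up for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Negated ("!") pattern from the v2 patch: keywords that can be
 * followed by parentheses and would otherwise look like methods. */
static const char *const skip_keywords =
	"(^|[ \t]+)(do|while|for|foreach|if|else|new|default|return"
	"|switch|case|throw|catch|using|lock|fixed)([ \t(]+|$)";

/* Method/constructor pattern from the v2 patch. */
static const char *const method_like =
	"^[ \t]*(([][[:alnum:]@_<>.,]*[^=:{ \t][ \t]+"
	"[][[:alnum:]@_<>.,]*)+\\([^;]*)$";

/* Return 1 if `line` matches the POSIX extended regex `pattern`.
 * A pattern that fails to compile is treated as "no match" here,
 * which is a simplification made for this sketch. */
static int matches(const char *pattern, const char *line)
{
	regex_t re;
	int hit;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return 0;
	hit = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return hit;
}

/* The negated pattern is consulted first, so a line such as
 * "return Foo(x)" is rejected before the permissive method
 * pattern gets a chance to accept it. */
int is_hunk_header(const char *line)
{
	if (matches(skip_keywords, line))
		return 0;
	return matches(method_like, line);
}
```

With this sketch, a line like "    public void Foo(int a," (a
multi-line method without its closing parenthesis) is accepted,
while "    return Foo(x)" and "    x = Compute(1);" are not.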

> Signed-off-by: Steven Jeuris <steven.jeuris@gmail.com>
> ---

> diff --git a/userdiff.c b/userdiff.c
> index e399543823b..5a9e8a0ef55 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -89,12 +89,18 @@ PATTERNS("cpp",
>  	 "|\\.[0-9][0-9]*([Ee][-+]?[0-9]+)?[fFlL]?"
>  	 "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->\\*?|\\.\\*|<=>"),
>  PATTERNS("csharp",
> -	 /* Keywords */
> -	 "!^[ \t]*(do|while|for|if|else|instanceof|new|return|switch|case|throw|catch|using)\n"
> -	 /* Methods and constructors */
> -	 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
> +	 /*
> +	  * Jump over keywords not used by methods which can be followed by parentheses without special characters in between,
> +	  * making them look like methods.
> +	  */

These comment lines are overly long (I'll wrap them while queuing).

> +	 "!(^|[ \t]+)(do|while|for|foreach|if|else|new|default|return|switch|case|throw|catch|using|lock|fixed)([ \t(]+|$)\n"
> +	 /* Methods/constructors:
> +	  * the strategy is to identify a minimum of two groups (any combination of keywords/type/name),
> +	  * without intermediate or final characters which can't be part of method definitions before the opening parenthesis.
> +	  */
> +	 "^[ \t]*(([][[:alnum:]@_<>.,]*[^=:{ \t][ \t]+[][[:alnum:]@_<>.,]*)+\\([^;]*)$\n"
>  	 /* Properties */
> -	 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[@._[:alnum:]]+)[ \t]*$\n"
> +	 "^[ \t]*((([][[:alnum:]@_<>.,]+)[ \t]+[][[:alnum:]@_]*)+[^=:;,()]*)$\n"
>  	 /* Type definitions */
>  	 "^[ \t]*(((static|public|internal|private|protected|new|unsafe|sealed|abstract|partial)[ \t]+)*(class|enum|interface|struct|record)[ \t]+.*)$\n"
>  	 /* Namespace */
>
> base-commit: f41f85c9ec8d4d46de0fd5fded88db94d3ec8c11



