git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Atharva Raykar <raykar.ath@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [GSOC][PATCH] userdiff: add support for Scheme
Date: Sun, 28 Mar 2021 17:21:16 +0530	[thread overview]
Message-ID: <EBC020E6-BE8B-4332-8225-A988CB7CFA69@gmail.com> (raw)
In-Reply-To: <xmqq5z1cqki7.fsf@gitster.g>

On 28-Mar-2021, at 04:20, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Atharva Raykar <raykar.ath@gmail.com> writes:
> 
>> diff --git a/t/t4018/scheme-define-syntax b/t/t4018/scheme-define-syntax
>> new file mode 100644
>> index 0000000000..603b99cea4
>> --- /dev/null
>> +++ b/t/t4018/scheme-define-syntax
>> @@ -0,0 +1,8 @@
>> +(define-syntax define-test-suite RIGHT
>> +  (syntax-rules ()
>> +    ((_ suite-name (name test) ChangeMe ...)
>> +     (define suite-name
>> +       (let ((tests
>> +              `((name . ,test) ...)))
>> +         (lambda ()
>> +           (ChangeMe 'suite-name tests)))))))
>> \ No newline at end of file
> 
> Is there a good reason to leave the final line incomplete?  If there
> isn't, complete it (applies to other newly-created files in the patch).

Will do.

>> diff --git a/userdiff.c b/userdiff.c
>> index 3f81a2261c..c51a8c98ba 100644
>> --- a/userdiff.c
>> +++ b/userdiff.c
>> @@ -191,6 +191,14 @@ PATTERNS("rust",
>> 	 "[a-zA-Z_][a-zA-Z0-9_]*"
>> 	 "|[0-9][0-9_a-fA-Fiosuxz]*(\\.([0-9]*[eE][+-]?)?[0-9_fF]*)?"
>> 	 "|[-+*\\/<>%&^|=!:]=|<<=?|>>=?|&&|\\|\\||->|=>|\\.{2}=|\\.{3}|::"),
>> +PATTERNS("scheme",
>> +         "^[\t ]*(\\(define-?.*)$",
> 
> Didn't "git diff HEAD" before committing (or "git show") highlighted
> these whitespace errors?
> 
> .git/rebase-apply/patch:183: indent with spaces.
>         "^[\t ]*(\\(define-?.*)$",
> .git/rebase-apply/patch:184: trailing whitespace, indent with spaces.
>         /* 
> .git/rebase-apply/patch:185: indent with spaces.
>          * Scheme allows symbol names to have any character,
> .git/rebase-apply/patch:186: indent with spaces.
>          * as long as it is not a form of a parenthesis.
> .git/rebase-apply/patch:187: indent with spaces.
>          * The spaces must be escaped.
> warning: squelched 2 whitespace errors
> warning: 7 lines applied after fixing whitespace errors.

It did highlight the spaces (which I accidentally overlooked), but I
didn’t receive these warnings. It shows up with the --check flag though.
I'll recheck my configuration. Thanks for pointing this out.

> 
>> +         /* 
>> +          * Scheme allows symbol names to have any character,
>> +          * as long as it is not a form of a parenthesis.
>> +          * The spaces must be escaped.
>> +          */
>> +         "(\\.|[^][)(\\}\\{ ])+"),
> 
> One or more "dot or anything other than SP or parentheses"?  But
> a dot "." is neither a space or any {bra-ce} letter, so would the
> above be equivalent to
> 
> 	"[^][()\\{\\} \t]+"
> 
> I wonder...

A backslash is allowed in scheme identifiers, and I erroneously thought that
the first part handles the case for identifiers such as `component\new` or 
`\"id-with-quotes\"`. (I tested it with a regex engine that behaves differently
than the one git is using, my bad.)

> I am also trying to figure out what you wanted to achieve by
> mentioning "The spaces must be escaped.".  Did you mean something
> like (string->symbol "a symbol with SP in it") is a symbol?  Even
> so, I cannot quite guess the significance of that fact wrt the
> regexp you added here?

I initially tried using identifiers like `space\ separated` and they
seemed to work in my REPL, but turns out space separated identifiers in
scheme do not work with backslashes, and it was working because of the way
my terminal handled escaping. Space separated identifiers are declared like
`|space separated|` and this too only seems to work with Racket, not
the other Scheme implementations. So I stand corrected here, and it's better
to drop this feature altogether.

But somehow, the regexp you suggested, ie:

	"[^][()\\{\\} \t]+"

does not handle the case of make\foo -> make\bar (it will only diff on foo).
I am not too sure why it treats backslashes as delimiters.

This seems to actually do what I was going for:

	"(\\\\|[^][)(\\}\\{ ])+"

> As we are trying to catch program identifiers (symbols in scheme)
> and numeric literals, treating any group of non-whitespace letters
> that is delimited by one or more whitespaces as a "word" would be a
> good first-order approximation, but in addition, as can be seen in
> an example like (a(b(c))), parentheses can also serve as such "word
> delimiters" in addition to whitespaces.  So the regexp given above
> makes sense to me from that angle, especially if you do not limit
> the whitespace to only SP, but include HT (\t) as well.  But was
> that how you came up with the regexp?

Yes, this is exactly what I was trying to express. All words should be
delimited by either whitespace or a parenthesis, and all other special
characters should be accepted as part of the word.

  parent reply	other threads:[~2021-03-28 11:52 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-27 17:39 [GSOC][PATCH] userdiff: add support for Scheme Atharva Raykar
2021-03-27 22:50 ` Junio C Hamano
2021-03-27 23:09   ` Junio C Hamano
2021-03-28  3:16     ` Ævar Arnfjörð Bjarmason
2021-03-28  5:37       ` Junio C Hamano
2021-03-28 12:40       ` Atharva Raykar
2021-03-29 10:08         ` Phillip Wood
2021-03-30  6:41           ` Atharva Raykar
2021-03-30 12:56             ` Ævar Arnfjörð Bjarmason
2021-03-30 13:48               ` Atharva Raykar
2021-03-28 12:45     ` Atharva Raykar
2021-03-28 11:51   ` Atharva Raykar [this message]
2021-03-28 18:06     ` Junio C Hamano
2021-03-29  8:12       ` Atharva Raykar
2021-03-29 20:47         ` Junio C Hamano
2021-03-29 10:12     ` Phillip Wood
2021-03-27 23:46 ` Johannes Sixt
2021-03-28 12:23   ` Atharva Raykar
2021-03-29 10:18     ` Phillip Wood
2021-03-29 10:48       ` Johannes Sixt
2021-03-29 13:12         ` Ævar Arnfjörð Bjarmason
2021-03-29 14:06           ` Phillip Wood
2021-03-30  7:04       ` Atharva Raykar
2021-03-30 10:22         ` Atharva Raykar
2021-04-05 10:04           ` Phillip Wood
2021-04-05 17:58             ` Johannes Sixt
2021-04-06 12:29             ` Atharva Raykar
2021-04-06 19:10               ` Phillip Wood
2021-04-03 13:16 ` [GSoC][PATCH v2 0/1] userdiff: add support for scheme Atharva Raykar
2021-04-03 13:16   ` [GSoC][PATCH v2 1/1] " Atharva Raykar
2021-04-05 10:21     ` Phillip Wood
2021-04-06 10:32       ` Atharva Raykar
2021-04-08  9:14   ` [GSoC][PATCH v3 0/1] " Atharva Raykar
2021-04-08  9:14   ` [GSoC][PATCH v3 1/1] userdiff: add support for Scheme Atharva Raykar
2021-04-12 23:04     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=EBC020E6-BE8B-4332-8225-A988CB7CFA69@gmail.com \
    --to=raykar.ath@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).