git@vger.kernel.org mailing list mirror (one of many)
From: Sergius Nyah <sergiusnyah@gmail.com>
To: git@vger.kernel.org, "gitster@pobox.com" <gitster@pobox.com>
Subject: [GSOC][RFC] Add more builtin patterns for userdiff, as Mircroproject.
Date: Tue, 9 Jan 2024 20:55:44 +0100
Message-ID: <CANAnif95ux=vCNCKbVw0q_vYamQRkbFqSa9_-u6xRvK6r+2a+Q@mail.gmail.com>

Hello everyone,
I'm Sergius, a Computer Science undergraduate student, and I want to
begin contributing to the Git project. So far, I've gone through
Matheus' tutorial "First Steps Contributing to Git", which I found
very helpful. I've also read the contribution guidelines carefully and
built Git from source.

In accordance with the contributor guidelines, I came across this
Microproject idea at https://git.github.io/SoC-2022-Microprojects/
which I'm willing to work on. It involves enhancing Git's
"userdiff" feature in "userdiff.c", which is crucial for identifying
function names in various programming languages and thereby improves
the readability of "git diff" output.

From my understanding, the project involves extending the `userdiff`
feature to support additional programming languages that are not
currently covered, such as Shell, Swift, Go, and others.

Here is a sample of how a language is defined in `userdiff.c`:

> #define PATTERNS(lang, rx, wrx) { \
>     .name = lang, \
>     .binary = -1, \
>     .funcname = { \
>         .pattern = rx, \
>         .cflags = REG_EXTENDED, \
>     }, \
>     .word_regex = wrx "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+", \
>     .word_regex_multi_byte = wrx "|[^[:space:]]", \
> }

In this code, `lang` is the name of the language, `rx` is the regular
expression for identifying function names, and `wrx` is the word
regex, which determines what counts as a word for word-level diffs
such as "git diff --word-diff".
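
To make the expansion concrete, here is a minimal standalone sketch of
what one entry built with this macro looks like. It is not the actual
Git source: the two `*_sketch` structs are simplified stand-ins that
hold only the fields the macro sets, and the "toylang" name and both of
its regexes are made up purely for illustration.

```c
#include <regex.h>   /* REG_EXTENDED */
#include <stddef.h>

/* Simplified stand-ins (assumption): only the fields PATTERNS() sets. */
struct funcname_sketch {
    const char *pattern;
    int cflags;
};

struct driver_sketch {
    const char *name;
    int binary;
    struct funcname_sketch funcname;
    const char *word_regex;
    const char *word_regex_multi_byte;
};

/* The macro as quoted above; wrx is joined to the fallback alternatives
 * by compile-time string-literal concatenation. */
#define PATTERNS(lang, rx, wrx) { \
    .name = lang, \
    .binary = -1, \
    .funcname = { \
        .pattern = rx, \
        .cflags = REG_EXTENDED, \
    }, \
    .word_regex = wrx "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+", \
    .word_regex_multi_byte = wrx "|[^[:space:]]", \
}

/* Hypothetical entry: "toylang" and its regexes are illustrative only. */
static const struct driver_sketch toy_driver =
    PATTERNS("toylang",
             "^[ \t]*(def[ \t]+[A-Za-z_][A-Za-z0-9_]*)",
             "[A-Za-z_][A-Za-z0-9_]*");

/* Accessor so callers can inspect the expanded initializer. */
const struct driver_sketch *get_toy_driver(void)
{
    return &toy_driver;
}
```

Calling get_toy_driver() and printing its fields shows that the word
regex passed in ends up as the first alternative, followed by the
"any single non-space character" fallback that the macro appends.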

Approach: I identified the programming languages that are not
currently supported by the userdiff feature by reviewing the existing
patterns in userdiff.c and comparing them with some popular
programming languages.
For each unsupported language, I would define a regular expression
that identifies function names in that language. This would involve
researching each language's syntax and testing the expressions to
ensure that they work well.
I'd then add a new PATTERNS definition (or IPATTERN, for
case-insensitive languages) for each language to the "userdiff.c"
file, rebuild Git, and test the changes by creating a repo with files
in the newly supported languages and running "git diff" to check that
the @@ ... @@ hunk headers show the correct function names.
Finally, I'd submit a patch.
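
The "test the expressions" step can be tried outside of Git first. The
following is only a sketch under stated assumptions: it uses the POSIX
regcomp()/regexec() calls with REG_EXTENDED (the flag the PATTERNS
macro sets), and both SWIFT_FUNC_RX and matches_funcname() are
hypothetical names introduced here, not anything from userdiff.c. The
Swift pattern is a rough guess at what a candidate might look like, not
a complete or vetted one.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical candidate pattern for Swift-style "func" declarations:
 * optional leading modifiers, then "func" and an identifier. */
static const char SWIFT_FUNC_RX[] =
    "^[ \t]*((public|private|static|override)[ \t]+)*"
    "func[ \t]+[A-Za-z_][A-Za-z0-9_]*";

/*
 * Returns 1 if `line` matches `pattern`, 0 if it does not,
 * and -1 if the pattern fails to compile.
 */
int matches_funcname(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no capture offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Feeding this helper a handful of representative lines (declarations
that should match, plus ordinary statements and comments that should
not) gives a quick sanity check before the pattern goes into
"userdiff.c" and is exercised for real with "git diff".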

Best Regards!
Sergius.