git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Re: [PATCH 2/4] doc: allow literal and emphasis format in doc vs help tests
From: Jeff King @ 2024-03-28 10:06 UTC (permalink / raw)
  To: Jean-Noël Avila via GitGitGadget; +Cc: git, Jean-Noël Avila
In-Reply-To: <202ed891463c134904b89a0d746d85bb62338d52.1711318739.git.gitgitgadget@gmail.com>

On Sun, Mar 24, 2024 at 10:18:57PM +0000, Jean-Noël Avila via GitGitGadget wrote:

> From: =?UTF-8?q?Jean-No=C3=ABl=20Avila?= <jn.avila@free.fr>
> 
> As the new formatting of literal and placeholders is introduced,
> the synopsis in the man pages can now hold additional markup with
> respect to the command help.
> 
> Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
> ---
>  t/t0450-txt-doc-vs-help.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/t/t0450-txt-doc-vs-help.sh b/t/t0450-txt-doc-vs-help.sh
> index cd3969e852b..e47599cbf26 100755
> --- a/t/t0450-txt-doc-vs-help.sh
> +++ b/t/t0450-txt-doc-vs-help.sh
> @@ -59,7 +59,7 @@ txt_to_synopsis () {
>  		-e '/^\[verse\]$/,/^$/ {
>  			/^$/d;
>  			/^\[verse\]$/d;
> -
> +			s/{empty}\|_\|`//g;

It looks like this doesn't work in the macos CI jobs. E.g., this run:

  https://github.com/git/git/actions/runs/8460326247/job/23178326007

produced output from t0450 like:

  +++ diff -u txt help
  --- txt    2024-03-28 00:14:15
  +++ help   2024-03-28 00:14:15
  @@ -1,5 +1,5 @@
  -`git init` [`-q` | `--quiet`] [`--bare`] [`--template=`{empty}__<template-directory>__]
  -         [`--separate-git-dir` _<git-dir>_] [`--object-format=`{empty}__<format>__]
  -         [`--ref-format=`{empty}__<format>__]
  -         [`-b` _<branch-name>_ | `--initial-branch=`{empty}__<branch-name>__]
  -         [`--shared`[`=`{empty}__<permissions>__]] [_<directory>_]
  +git init [-q | --quiet] [--bare] [--template=<template-directory>]
  +         [--separate-git-dir <git-dir>] [--object-format=<format>]
  +         [--ref-format=<format>]
  +         [-b <branch-name> | --initial-branch=<branch-name>]
  +         [--shared[=<permissions>]] [<directory>]

I think the issue is that "|" alternation is not portable. In GNU sed,
doing "\|" with BRE is enough, but in BSD sed you need to enable ERE
with "-E". I'm not sure how portable that is (we do seem to have at
least one instance in t6030, so maybe it's OK). The most basic
alternative is just splitting it like:

  s/{empty}//g;
  s/[_`]//g;

-Peff


^ permalink raw reply

* Update message_advice_pull_before_push to mention how to push after rebasing
From: zhang kai @ 2024-03-28  9:54 UTC (permalink / raw)
  To: git

Hi all,

Our team uses git rebase a lot. A common hiccup we noticed during the
daily work is that new devs often didn't know how to push properly
after the rebase.

When someone try to push to the remote after rebasing the branch, the
push would mostly fail with the message_advice_pull_before_push:

> Updates were rejected because the tip of your current branch is behind\n" "its remote counterpart. If you want to integrate the remote changes,\n" "use 'git pull' before pushing again.\n" "See the 'Note about fast-forwards' in 'git push --help' for details.

However, in this situation, pull will "invalid" the rebasing. After
pull then push, devs often will be surprised by the merge/pull request
containing diffs that should not be in this branch. In this case, we
often should just do `git push --force-with-lease`.

If we add a message like "If you just rebased your branch, please
consider 'git push --force-with-lease'" to the
message_adivce_pull_before_push, we may save devs some time figuring
out what's wrong with their branch. If this is acceptable, we'd be
happy to prepare a patch.

Thanks,
Kai


^ permalink raw reply

* Re: [PATCH 17/16] config: add core.commentString
From: Jeff King @ 2024-03-28  9:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Kristoffer Haugsbakk, Dragan Simic, Manlio Perillo,
	René Scharfe, Phillip Wood, git
In-Reply-To: <xmqqa5mj3b6c.fsf@gitster.g>

On Wed, Mar 27, 2024 at 09:13:31AM -0700, Junio C Hamano wrote:

> > An alternative to using "$var cannot ..." in the error messages (if we
> > don't like the all-lowercase variable name) is to just say "comment
> > strings cannot ...". That vaguely covers both cases, and the message
> > printed by the config code itself does mention the actual variable name
> > that triggered the error.
> 
> OK, because the error() return from this function will trigger
> another die() in the caller, e.g.
> 
>     error: core.commentchar must have at least one character
>     fatal: bad config variable 'core.commentchar' in file '.git/config' at line 6
> 
> so we can afford to make the "error" side vague, except that the
> "fatal" one is also downcased already, so we are not really solving
> anything by making the message vague, I would think.  The posted
> patch as-is is prefectly fine.

Oh, right.  For some reason I thought the die() message would have the
variable as written by the user, but that obviously is not true. So I
agree it would not even be an improvement (and the normalizing in my new
error() message is something we've been living with all along anyway for
other messages).

> Side note:
>     I wonder if we would later want to somehow _merge_ these two
>     error messages, i.e. the lower-level will notice and record the
>     nature of the problem instead of calling error(), and the caller
>     will use the recorded information while composing the "fatal"
>     message to die with.  I actually do not know if it is a good
>     idea to begin with.  If we want to do it right, the "record"
>     part probably cannot be a simple "stringify into strbuf" that
>     will result in lego message that is harder for i18n folks.

Yeah, this is a general problem of accumulating errors. I had always
assumed in cases like this that we could have some language-independent
syntax like:

  die("%s:%d: error parsing '%s': %s",
      file, line_nr, var, err_from_callback);

It's certainly lego-like, but it avoids the worst lego cases where
we're literally composing sentences. But as somebody who does not do
translations, it's possible I'm just being optimistic. ;)

-Peff


^ permalink raw reply

* GSoC 2024 [RFC PATCH] Fix Git command exit code suppression in test script t2104-update-index-skip-worktree.sh
From: Aishwarya Narayanan @ 2024-03-28  8:40 UTC (permalink / raw)
  To: git

Dear Git Contributors,
Please find attached a patch addressing an issue in the test script
t2104-update-index-skip-worktree.sh. The issue pertains to the
inadvertent suppression of exit codes from Git commands when used in
pipelines, potentially leading to false positives in test results.

From a80ff00cda2445f93eac1510f0434095f342887b Mon Sep 17 00:00:00 2001
From: Aishwarya <your@email.com>
Date: Thu, 28 Mar 2024 13:54:35 +0530
Subject: [PATCH] This commit addresses an issue in our test scripts where the
 exit code of Git commands could be inadvertently suppressed when used in
 pipelines. Such suppression can lead to tests passing despite failures in Git
 commands, potentially masking bugs or regressions.

Changes made:

- Modified instances where `git ls-files -t` and similar Git commands
are used in pipelines, to capture their output in a variable first.
This ensures that the exit code of the Git command is correctly
evaluated.
- Applied checks for the exit code immediately after the command
execution, allowing the script to exit with an error if the Git
command fails.
- Adjusted related test cases to work with the new method of capturing
and evaluating Git command outputs.

These changes improve the robustness of our testing framework by
ensuring that the failure of a Git command in a test script is
correctly detected and reported. This is crucial for maintaining the
reliability and integrity of the Git project as we continue to evolve
and enhance its functionality.
---
 t/t2104-update-index-skip-worktree.sh | 66 ++++++++-------------------
 1 file changed, 20 insertions(+), 46 deletions(-)

diff --git a/t/t2104-update-index-skip-worktree.sh
b/t/t2104-update-index-skip-worktree.sh
index 0bab134d71..c552d2208e 100755
--- a/t/t2104-update-index-skip-worktree.sh
+++ b/t/t2104-update-index-skip-worktree.sh
@@ -3,65 +3,39 @@
 # Copyright (c) 2008 Nguyễn Thái Ngọc Duy
 #

-test_description='skip-worktree bit test'
-
-TEST_PASSES_SANITIZE_LEAK=true
-. ./test-lib.sh
-
-sane_unset GIT_TEST_SPLIT_INDEX
-
-test_set_index_version () {
-    GIT_INDEX_VERSION="$1"
-    export GIT_INDEX_VERSION
-}
-
-test_set_index_version 3
-
-cat >expect.full <<EOF
-H 1
-H 2
-H sub/1
-H sub/2
-EOF
-
-cat >expect.skip <<EOF
-S 1
-H 2
-S sub/1
-H sub/2
-EOF
-
 test_expect_success 'setup' '
    mkdir sub &&
    touch ./1 ./2 sub/1 sub/2 &&
    git add 1 2 sub/1 sub/2 &&
-   git ls-files -t | test_cmp expect.full -
-'
-
-test_expect_success 'index is at version 2' '
-   test "$(git update-index --show-index-version)" = 2
+   output=$(git ls-files -t)
+   echo "$output" | test_cmp expect.full -
+   if [ $? -ne 0 ]; then
+       exit 1
+   fi
 '

 test_expect_success 'update-index --skip-worktree' '
    git update-index --skip-worktree 1 sub/1 &&
-   git ls-files -t | test_cmp expect.skip -
-'
-
-test_expect_success 'index is at version 3 after having some
skip-worktree entries' '
-   test "$(git update-index --show-index-version)" = 3
+   output=$(git ls-files -t)
+   echo "$output" | test_cmp expect.skip -
+   if [ $? -ne 0 ]; then
+       exit 1
+   fi
 '

 test_expect_success 'ls-files -t' '
-   git ls-files -t | test_cmp expect.skip -
+   output=$(git ls-files -t)
+   echo "$output" | test_cmp expect.skip -
+   if [ $? -ne 0 ]; then
+       exit 1
+   fi
 '

 test_expect_success 'update-index --no-skip-worktree' '
    git update-index --no-skip-worktree 1 sub/1 &&
-   git ls-files -t | test_cmp expect.full -
+   output=$(git ls-files -t)
+   echo "$output" | test_cmp expect.full -
+   if [ $? -ne 0 ]; then
+       exit 1
+   fi
 '
-
-test_expect_success 'index version is back to 2 when there is no
skip-worktree entry' '
-   test "$(git update-index --show-index-version)" = 2
-'
-
-test_done
--


^ permalink raw reply related

* Re: [PATCH 5/6] fast-import: document C-style escapes for paths
From: Patrick Steinhardt @ 2024-03-28  8:21 UTC (permalink / raw)
  To: Thalia Archibald; +Cc: git, Elijah Newren
In-Reply-To: <20240322000304.76810-6-thalia@archibald.dev>

[-- Attachment #1: Type: text/plain, Size: 2571 bytes --]

On Fri, Mar 22, 2024 at 12:03:47AM +0000, Thalia Archibald wrote:
> Simply saying “C-style” string quoting is imprecise, as only a subset of
> C escapes are supported. Document the exact escapes.
> 
> Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> ---
>  Documentation/git-fast-import.txt | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
> index 271bd63a10..4aa8ccbefd 100644
> --- a/Documentation/git-fast-import.txt
> +++ b/Documentation/git-fast-import.txt
> @@ -630,18 +630,23 @@ in octal.  Git only supports the following modes:
>  In both formats `<path>` is the complete path of the file to be added
>  (if not already existing) or modified (if already existing).
>  
> -A `<path>` string must use UNIX-style directory separators (forward
> -slash `/`), may contain any byte other than `LF`, and must not
> -start with double quote (`"`).
> +A `<path>` string may contain any byte other than `LF`, and must not
> +start with double quote (`"`). It is interpreted as literal bytes
> +without escaping.

Paths also mustn't start with a space in many cases, right?

Patrick

>  A path can use C-style string quoting; this is accepted in all cases
>  and mandatory if the filename starts with double quote or contains
> -`LF`. In C-style quoting, the complete name should be surrounded with
> -double quotes, and any `LF`, backslash, or double quote characters
> -must be escaped by preceding them with a backslash (e.g.,
> -`"path/with\n, \\ and \" in it"`).
> +`LF`. In C-style quoting, the complete name is surrounded with
> +double quotes (`"`) and certain characters must be escaped by preceding
> +them with a backslash: `LF` is written as `\n`, backslash as `\\`, and
> +double quote as `\"`. Additionally, some characters may may optionally
> +be written with escape sequences: `\a` for bell, `\b` for backspace,
> +`\f` for form feed, `\n` for line feed, `\r` for carriage return, `\t`
> +for horizontal tab, and `\v` for vertical tab. Any byte can be written
> +with 3-digit octal codes (e.g., `\033`).
>  
> -The value of `<path>` must be in canonical form. That is it must not:
> +A `<path>` must use UNIX-style directory separators (forward slash `/`)
> +and must be in canonical form. That is it must not:
>  
>  * contain an empty directory component (e.g. `foo//bar` is invalid),
>  * end with a directory separator (e.g. `foo/` is invalid),
> -- 
> 2.44.0
> 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 4/6] fast-import: remove dead strbuf
From: Patrick Steinhardt @ 2024-03-28  8:21 UTC (permalink / raw)
  To: Thalia Archibald; +Cc: git, Elijah Newren
In-Reply-To: <20240322000304.76810-5-thalia@archibald.dev>

[-- Attachment #1: Type: text/plain, Size: 1582 bytes --]

On Fri, Mar 22, 2024 at 12:03:40AM +0000, Thalia Archibald wrote:
> The strbuf in `note_change_n` has been unused since the function was
> created in a8dd2e7d2b (fast-import: Add support for importing commit
> notes, 2009-10-09) and looks to be a fossil from adapting
> `note_change_m`. Remove it.

Just from inspecting the diff it's not clear that it is actually unused
given that we assign `p = uq.buf`. The message here should probably
mention the important detail that `p` is not actually used after the
assignment.

Patrick

> Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> ---
>  builtin/fast-import.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index d6f998f363..ae8494d0ac 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -2458,7 +2458,6 @@ static void file_change_cr(const char *p, struct branch *b, int rename)
>  
>  static void note_change_n(const char *p, struct branch *b, unsigned char *old_fanout)
>  {
> -	static struct strbuf uq = STRBUF_INIT;
>  	struct object_entry *oe;
>  	struct branch *s;
>  	struct object_id oid, commit_oid;
> @@ -2523,10 +2522,6 @@ static void note_change_n(const char *p, struct branch *b, unsigned char *old_fa
>  		die("Invalid ref name or SHA1 expression: %s", p);
>  
>  	if (inline_data) {
> -		if (p != uq.buf) {
> -			strbuf_addstr(&uq, p);
> -			p = uq.buf;
> -		}
>  		read_next_command();
>  		parse_and_store_blob(&last_blob, &oid, 0);
>  	} else if (oe) {
> -- 
> 2.44.0
> 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 3/6] fast-import: release unfreed strbufs
From: Patrick Steinhardt @ 2024-03-28  8:21 UTC (permalink / raw)
  To: Thalia Archibald; +Cc: git, Elijah Newren
In-Reply-To: <20240322000304.76810-4-thalia@archibald.dev>

[-- Attachment #1: Type: text/plain, Size: 4225 bytes --]

On Fri, Mar 22, 2024 at 12:03:33AM +0000, Thalia Archibald wrote:
> These strbufs are owned. Release them at the end of their scopes.
> 
> Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> ---
>  builtin/fast-import.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 1b3d6784c1..d6f998f363 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -2364,6 +2364,7 @@ static void file_change_m(const char *p, struct branch *b)
>  	/* Git does not track empty, non-toplevel directories. */
>  	if (S_ISDIR(mode) && is_empty_tree_oid(&oid) && *path.buf) {
>  		tree_content_remove(&b->branch_tree, path.buf, NULL, 0);
> +		strbuf_release(&path);
>  		return;
>  	}

Oh, now you get to my comment in the preceding patch. With this patch
we're now in a somewhat weird in-between state where the buffers are
still static, but we release their memory after each call. So we kind of
get the worst of both worlds: static variables without being able to
reuse the buffer's memory.

If we were to change this then we should definitely mark the buffers as
non-static. If so, it would be great to demonstrate that this does not
significantly impact performance.

The same is true for all the other instances.

Patrick

> @@ -2409,11 +2410,11 @@ static void file_change_m(const char *p, struct branch *b)
>  				command_buf.buf);
>  	}
>  
> -	if (!*path.buf) {
> +	if (*path.buf)
> +		tree_content_set(&b->branch_tree, path.buf, &oid, mode, NULL);
> +	else
>  		tree_content_replace(&b->branch_tree, &oid, mode, NULL);
> -		return;
> -	}
> -	tree_content_set(&b->branch_tree, path.buf, &oid, mode, NULL);
> +	strbuf_release(&path);
>  }
>  
>  static void file_change_d(const char *p, struct branch *b)
> @@ -2422,6 +2423,7 @@ static void file_change_d(const char *p, struct branch *b)
>  
>  	parse_path_eol(&path, p, "path");
>  	tree_content_remove(&b->branch_tree, path.buf, NULL, 1);
> +	strbuf_release(&path);
>  }
>  
>  static void file_change_cr(const char *p, struct branch *b, int rename)
> @@ -2440,17 +2442,18 @@ static void file_change_cr(const char *p, struct branch *b, int rename)
>  		tree_content_get(&b->branch_tree, source.buf, &leaf, 1);
>  	if (!leaf.versions[1].mode)
>  		die("Path %s not in branch", source.buf);
> -	if (!*dest.buf) {	/* C "path/to/subdir" "" */
> +	if (*dest.buf)
> +		tree_content_set(&b->branch_tree, dest.buf,
> +			&leaf.versions[1].oid,
> +			leaf.versions[1].mode,
> +			leaf.tree);
> +	else	/* C "path/to/subdir" "" */
>  		tree_content_replace(&b->branch_tree,
>  			&leaf.versions[1].oid,
>  			leaf.versions[1].mode,
>  			leaf.tree);
> -		return;
> -	}
> -	tree_content_set(&b->branch_tree, dest.buf,
> -		&leaf.versions[1].oid,
> -		leaf.versions[1].mode,
> -		leaf.tree);
> +	strbuf_release(&source);
> +	strbuf_release(&dest);
>  }
>  
>  static void note_change_n(const char *p, struct branch *b, unsigned char *old_fanout)
> @@ -2804,6 +2807,7 @@ static void parse_new_commit(const char *arg)
>  	free(author);
>  	free(committer);
>  	free(encoding);
> +	strbuf_release(&msg);
>  
>  	if (!store_object(OBJ_COMMIT, &new_data, NULL, &b->oid, next_mark))
>  		b->pack_id = pack_id;
> @@ -2886,6 +2890,7 @@ static void parse_new_tag(const char *arg)
>  	strbuf_addch(&new_data, '\n');
>  	strbuf_addbuf(&new_data, &msg);
>  	free(tagger);
> +	strbuf_release(&msg);
>  
>  	if (store_object(OBJ_TAG, &new_data, NULL, &t->oid, next_mark))
>  		t->pack_id = MAX_PACK_ID;
> @@ -3171,6 +3176,7 @@ static void print_ls(int mode, const unsigned char *hash, const char *path)
>  		strbuf_addch(&line, '\n');
>  	}
>  	cat_blob_write(line.buf, line.len);
> +	strbuf_release(&line);
>  }
>  
>  static void parse_ls(const char *p, struct branch *b)
> @@ -3206,6 +3212,7 @@ static void parse_ls(const char *p, struct branch *b)
>  		release_tree_content_recursive(leaf.tree);
>  	if (!b || root != &b->branch_tree)
>  		release_tree_entry(root);
> +	strbuf_release(&path);
>  }
>  
>  static void checkpoint(void)
> -- 
> 2.44.0
> 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 1/6] fast-import: tighten parsing of paths
From: Patrick Steinhardt @ 2024-03-28  8:21 UTC (permalink / raw)
  To: Thalia Archibald; +Cc: git, Elijah Newren
In-Reply-To: <20240322000304.76810-2-thalia@archibald.dev>

[-- Attachment #1: Type: text/plain, Size: 21317 bytes --]

On Fri, Mar 22, 2024 at 12:03:18AM +0000, Thalia Archibald wrote:
> Path parsing in fast-import is inconsistent and many unquoting errors
> are suppressed.
> 
> `<path>` appears in the grammar in these places:
> 
>     filemodify ::= 'M' SP <mode> (<dataref> | 'inline') SP <path> LF
>     filedelete ::= 'D' SP <path> LF
>     filecopy   ::= 'C' SP <path> SP <path> LF
>     filerename ::= 'R' SP <path> SP <path> LF
>     ls         ::= 'ls' SP <dataref> SP <path> LF
>     ls-commit  ::= 'ls' SP <path> LF
> 
> and fast-import.c parses them in five different ways:
> 
> 1. For filemodify and filedelete:
>    If `<path>` is a valid quoted string, unquote it;
>    otherwise, treat it as literal bytes (including any number of SP).
> 2. For filecopy (source) and filerename (source):
>    If `<path>` is a valid quoted string, unquote it;
>    otherwise, treat it as literal bytes until the next SP.
> 3. For filecopy (dest) and filerename (dest):
>    Like 1., but an unquoted empty string is an error.
> 4. For ls:
>    If `<path>` starts with `"`, unquote it and report parse errors;
>    otherwise, treat it as literal bytes (including any number of SP).
> 5. For ls-commit:
>    Unquote `<path>` and report parse errors.
>    (It must start with `"` to disambiguate from ls.)
> 
> In the first three, any errors from trying to unquote a string are
> suppressed, so a quoted string that contains invalid escapes would be
> interpreted as literal bytes. For example, `"\xff"` would fail to
> unquote (because hex escapes are not supported), and it would instead be
> interpreted as the byte sequence `"` `\` `x` `f` `f` `"`, which is
> certainly not intended. Some front-ends erroneously use their language's
> standard quoting routine and could silently introduce escapes that would
> be incorrectly parsed due to this.
> 
> The documentation states that “To use a source path that contains SP the
> path must be quoted.”, so it is expected that some implementations
> depend on spaces being allowed in paths in the final position. Thus we
> have two documented ways to parse paths, so simplify the implementation
> to that.
> 
> Now we have:
> 
> 1. `parse_path_eol` for filemodify, filedelete, filecopy (dest),
>    filerename (dest), ls, and ls-commit:
> 
>    If `<path>` starts with `"`, unquote it and report parse errors;
>    otherwise, treat it as literal bytes (including any number of SP).
>    Garbage after a quoted string or an unquoted empty string are errors.
>    (In ls-commit, it must be quoted to disambiguate from ls.)
> 
> 2. `parse_path_space` for filecopy (source) and filerename (source):
> 
>    If `<path>` starts with `"`, unquote it and report parse errors;
>    otherwise, treat it as literal bytes until the next SP.
>    It must be followed by a SP. An unquoted empty string is an error.
> 
> Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> ---
>  Documentation/git-fast-import.txt |   3 +-
>  builtin/fast-import.c             | 115 ++++++++------
>  t/t9300-fast-import.sh            | 252 +++++++++++++++++++++++++++++-
>  3 files changed, 316 insertions(+), 54 deletions(-)
> 
> diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
> index b2607366b9..271bd63a10 100644
> --- a/Documentation/git-fast-import.txt
> +++ b/Documentation/git-fast-import.txt
> @@ -649,7 +649,8 @@ The value of `<path>` must be in canonical form. That is it must not:
>  * contain the special component `.` or `..` (e.g. `foo/./bar` and
>    `foo/../bar` are invalid).
>  
> -The root of the tree can be represented by an empty string as `<path>`.
> +The root of the tree can be represented by a quoted empty string (`""`)
> +as `<path>`.

So this is part of the "filemodify" section with the following syntax:

    'M' SP <mode> SP <dataref> SP <path> LF

The way I interpret this change is that <path> could previously be empty
(`SP LF`), but now it needs to be quoted (`SP '"' '"' LF). This seems to
be related to cases (1) and (3) of your commit messages, where
"filemodify" could contain an unquoted empty string whereas "filecopy"
and "filerename" would complain about such an unquoted string.

In any case I don't see a strong argument why exactly it should be
forbidden to have an unquoted empty path here, and I do wonder whether
it would break existing writers of the format when we retroactively
tighten this case. Isn't it possible to instead loosen it such that all
three of the above actions know to handle unquoted empty paths?

>  It is recommended that `<path>` always be encoded using UTF-8.
>  
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 71a195ca22..b2adec8d9a 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -2224,7 +2224,7 @@ static int parse_mapped_oid_hex(const char *hex, struct object_id *oid, const ch
>   *
>   *   idnum ::= ':' bigint;
>   *
> - * Return the first character after the value in *endptr.
> + * Update *endptr to point to the first character after the value.

I think it would make sense to put these improvements for comments into
a separate commit. Otherwise it makes you wonder whether this behaviour
is new now.

>   * Complain if the following character is not what is expected,
>   * either a space or end of the string.
> @@ -2257,8 +2257,8 @@ static uintmax_t parse_mark_ref_eol(const char *p)
>  }
>  
>  /*
> - * Parse the mark reference, demanding a trailing space.  Return a
> - * pointer to the space.
> + * Parse the mark reference, demanding a trailing space. Update *p to
> + * point to the first character after the space.
>   */

Same.

>  static uintmax_t parse_mark_ref_space(const char **p)
>  {
> @@ -2272,10 +2272,57 @@ static uintmax_t parse_mark_ref_space(const char **p)
>  	return mark;
>  }
>  
> +/*
> + * Parse the path string into the strbuf. It may be quoted with escape sequences
> + * or unquoted without escape sequences. When unquoted, it may only contain a
> + * space if `allow_spaces` is nonzero.
> + */
> +static void parse_path(struct strbuf *sb, const char *p, const char **endp, int allow_spaces, const char *field)
> +{
> +	strbuf_reset(sb);

It's not all that customary in our codebase to have the function reset
the `strbuf` for the caller because it does make the function less
flexible. I would either keep the `strbuf_reset()` on the caller side or
at least document this behaviour in the comment.

> +	if (*p == '"') {
> +		if (unquote_c_style(sb, p, endp))
> +			die("Invalid %s: %s", field, command_buf.buf);
> +	} else {
> +		if (allow_spaces)
> +			*endp = p + strlen(p);

I wonder whether `stop_at_unquoted_space` might be more telling. It's
not like we disallow spaces here, it's that we treat them as the
separator to the next field.

> +		else
> +			*endp = strchr(p, ' ');
> +		if (*endp == p)
> +			die("Missing %s: %s", field, command_buf.buf);

Error messages should start with a lower-case letter and be
translateable. But these are simply moved over from the previous code,
so I don't mind much if you want to keep them as-is.

> +		strbuf_add(sb, p, *endp - p);
> +	}
> +}
> +
> +/*
> + * Parse the path string into the strbuf, and complain if this is not the end of
> + * the string. It may contain spaces even when unquoted.
> + */
> +static void parse_path_eol(struct strbuf *sb, const char *p, const char *field)
> +{
> +	const char *end;
> +
> +	parse_path(sb, p, &end, 1, field);
> +	if (*end)
> +		die("Garbage after %s: %s", field, command_buf.buf);
> +}
> +
> +/*
> + * Parse the path string into the strbuf, and ensure it is followed by a space.
> + * It may not contain spaces when unquoted. Update *endp to point to the first
> + * character after the space.
> + */
> +static void parse_path_space(struct strbuf *sb, const char *p, const char **endp, const char *field)
> +{
> +	parse_path(sb, p, endp, 0, field);
> +	if (**endp != ' ')
> +		die("Missing space after %s: %s", field, command_buf.buf);
> +	(*endp)++;
> +}
> +
>  static void file_change_m(const char *p, struct branch *b)
>  {
>  	static struct strbuf uq = STRBUF_INIT;
> -	const char *endp;
>  	struct object_entry *oe;
>  	struct object_id oid;
>  	uint16_t mode, inline_data = 0;
> @@ -2312,12 +2359,8 @@ static void file_change_m(const char *p, struct branch *b)
>  			die("Missing space after SHA1: %s", command_buf.buf);
>  	}
>  
> -	strbuf_reset(&uq);
> -	if (!unquote_c_style(&uq, p, &endp)) {
> -		if (*endp)
> -			die("Garbage after path in: %s", command_buf.buf);
> -		p = uq.buf;
> -	}
> +	parse_path_eol(&uq, p, "path");
> +	p = uq.buf;

This is loosening the condition so that we also accept unquoted paths
now. Okay.

>  
>  	/* Git does not track empty, non-toplevel directories. */
>  	if (S_ISDIR(mode) && is_empty_tree_oid(&oid) && *p) {
> @@ -2381,48 +2424,23 @@ static void file_change_m(const char *p, struct branch *b)
>  static void file_change_d(const char *p, struct branch *b)
>  {
>  	static struct strbuf uq = STRBUF_INIT;
> -	const char *endp;
>  
> -	strbuf_reset(&uq);
> -	if (!unquote_c_style(&uq, p, &endp)) {
> -		if (*endp)
> -			die("Garbage after path in: %s", command_buf.buf);
> -		p = uq.buf;
> -	}
> +	parse_path_eol(&uq, p, "path");
> +	p = uq.buf;
>  	tree_content_remove(&b->branch_tree, p, NULL, 1);
>  }

Same.

> -static void file_change_cr(const char *s, struct branch *b, int rename)
> +static void file_change_cr(const char *p, struct branch *b, int rename)
>  {
> -	const char *d;
> +	const char *s, *d;
>  	static struct strbuf s_uq = STRBUF_INIT;
>  	static struct strbuf d_uq = STRBUF_INIT;
> -	const char *endp;
>  	struct tree_entry leaf;
>  
> -	strbuf_reset(&s_uq);
> -	if (!unquote_c_style(&s_uq, s, &endp)) {
> -		if (*endp != ' ')
> -			die("Missing space after source: %s", command_buf.buf);
> -	} else {
> -		endp = strchr(s, ' ');
> -		if (!endp)
> -			die("Missing space after source: %s", command_buf.buf);
> -		strbuf_add(&s_uq, s, endp - s);
> -	}
> +	parse_path_space(&s_uq, p, &p, "source");
> +	parse_path_eol(&d_uq, p, "dest");
>  	s = s_uq.buf;
> -
> -	endp++;
> -	if (!*endp)
> -		die("Missing dest: %s", command_buf.buf);
> -
> -	d = endp;
> -	strbuf_reset(&d_uq);
> -	if (!unquote_c_style(&d_uq, d, &endp)) {
> -		if (*endp)
> -			die("Garbage after dest in: %s", command_buf.buf);
> -		d = d_uq.buf;
> -	}
> +	d = d_uq.buf;

Nice simplification. The source path should behave the same, and parsing
of the destination path has been loosened to also allow unquoted paths.

>  	memset(&leaf, 0, sizeof(leaf));
>  	if (rename)
> @@ -3168,6 +3186,7 @@ static void parse_ls(const char *p, struct branch *b)
>  {
>  	struct tree_entry *root = NULL;
>  	struct tree_entry leaf = {NULL};
> +	static struct strbuf uq = STRBUF_INIT;

I know the code had this as a static variable before, as well. But is
this really necessary? Can't we leave it as non-static and then release
the buffer at the end of this function?

>  	/* ls SP (<tree-ish> SP)? <path> */
>  	if (*p == '"') {
> @@ -3182,16 +3201,8 @@ static void parse_ls(const char *p, struct branch *b)
>  			root->versions[1].mode = S_IFDIR;
>  		load_tree(root);
>  	}
> -	if (*p == '"') {
> -		static struct strbuf uq = STRBUF_INIT;
> -		const char *endp;
> -		strbuf_reset(&uq);
> -		if (unquote_c_style(&uq, p, &endp))
> -			die("Invalid path: %s", command_buf.buf);
> -		if (*endp)
> -			die("Garbage after path in: %s", command_buf.buf);
> -		p = uq.buf;
> -	}
> +	parse_path_eol(&uq, p, "path");
> +	p = uq.buf;

And this case should behave the same.

>  	tree_content_get(root, p, &leaf, 1);
>  	/*
>  	 * A directory in preparation would have a sha1 of zero
> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index dbb5042b0b..ef04b43f46 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -2146,6 +2146,7 @@ test_expect_success 'Q: deny note on empty branch' '
>  	EOF
>  	test_must_fail git fast-import <input
>  '
> +
>  ###
>  ### series R (feature and option)
>  ###
> @@ -2794,7 +2795,7 @@ test_expect_success 'R: blob appears only once' '
>  '
>  
>  ###
> -### series S
> +### series S (mark and path parsing)
>  ###
>  #
>  # Make sure missing spaces and EOLs after mark references
> @@ -3064,6 +3065,255 @@ test_expect_success 'S: ls with garbage after sha1 must fail' '
>  	test_grep "space after tree-ish" err
>  '
>  
> +#
> +# Path parsing
> +#
> +# There are two sorts of ways a path can be parsed, depending on whether it is
> +# the last field on the line. Additionally, ls without a <tree-ish> has a
> +# special case. Test every occurrence of <path> in the grammar against every
> +# error case.
> +#
> +
> +#
> +# Valid paths at the end of a line: filemodify, filedelete, filecopy (dest),
> +# filerename (dest), and ls.
> +#
> +# commit :301 from root -- modify hello.c
> +# commit :302 from :301 -- modify $path
> +# commit :303 from :302 -- delete $path
> +# commit :304 from :301 -- copy hello.c $path
> +# commit :305 from :301 -- rename hello.c $path
> +# ls :305 $path
> +#
> +test_path_eol_success () {
> +	test="$1" path="$2" unquoted_path="$3"

Should these variables be local?

> +	test_expect_success "S: paths at EOL with $test must work" '
> +		git fast-import --export-marks=marks.out <<-EOF >out 2>err &&
> +		blob
> +		mark :401
> +		data <<BLOB
> +		hello world
> +		BLOB
> +
> +		blob
> +		mark :402
> +		data <<BLOB
> +		hallo welt
> +		BLOB
> +
> +		commit refs/heads/path-eol
> +		mark :301
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		initial commit
> +		COMMIT
> +		M 100644 :401 hello.c
> +
> +		commit refs/heads/path-eol
> +		mark :302
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit filemodify
> +		COMMIT
> +		from :301
> +		M 100644 :402 '"$path"'
> +
> +		commit refs/heads/path-eol
> +		mark :303
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit filedelete
> +		COMMIT
> +		from :302
> +		D '"$path"'
> +
> +		commit refs/heads/path-eol
> +		mark :304
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit filecopy dest
> +		COMMIT
> +		from :301
> +		C hello.c '"$path"'
> +
> +		commit refs/heads/path-eol
> +		mark :305
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit filerename dest
> +		COMMIT
> +		from :301
> +		R hello.c '"$path"'
> +
> +		ls :305 '"$path"'
> +		EOF
> +
> +		commit_m=$(grep :302 marks.out | cut -d\  -f2) &&
> +		commit_d=$(grep :303 marks.out | cut -d\  -f2) &&
> +		commit_c=$(grep :304 marks.out | cut -d\  -f2) &&
> +		commit_r=$(grep :305 marks.out | cut -d\  -f2) &&
> +		blob1=$(grep :401 marks.out | cut -d\  -f2) &&
> +		blob2=$(grep :402 marks.out | cut -d\  -f2) &&
> +
> +		( printf "100644 blob $blob2\t'"$unquoted_path"'\n" &&
> +		  printf "100644 blob $blob1\thello.c\n" ) | sort >tree_m.exp &&
> +		git ls-tree $commit_m | sort >tree_m.out &&
> +		test_cmp tree_m.exp tree_m.out &&
> +
> +		printf "100644 blob $blob1\thello.c\n" >tree_d.exp &&
> +		git ls-tree $commit_d >tree_d.out &&
> +		test_cmp tree_d.exp tree_d.out &&
> +
> +		( printf "100644 blob $blob1\t'"$unquoted_path"'\n" &&
> +		  printf "100644 blob $blob1\thello.c\n" ) | sort >tree_c.exp &&
> +		git ls-tree $commit_c | sort >tree_c.out &&
> +		test_cmp tree_c.exp tree_c.out &&
> +
> +		printf "100644 blob $blob1\t'"$unquoted_path"'\n" >tree_r.exp &&
> +		git ls-tree $commit_r >tree_r.out &&
> +		test_cmp tree_r.exp tree_r.out &&
> +
> +		test_cmp out tree_r.exp &&
> +
> +		git branch -D path-eol
> +	'
> +}
> +
> +test_path_eol_success 'quoted spaces'   '" hello world.c "' ' hello world.c '
> +test_path_eol_success 'unquoted spaces' ' hello world.c '   ' hello world.c '
> +
> +#
> +# Valid paths before a space: filecopy (source) and filerename (source).
> +#
> +# commit :301 from root -- modify $path
> +# commit :302 from :301 -- copy $path hello2.c
> +# commit :303 from :301 -- rename $path hello2.c
> +#
> +test_path_space_success () {
> +	test="$1" path="$2" unquoted_path="$3"

Same question here, should these be local?

> +	test_expect_success "S: paths before space with $test must work" '
> +		git fast-import --export-marks=marks.out <<-EOF 2>err &&
> +		blob
> +		mark :401
> +		data <<BLOB
> +		hello world
> +		BLOB
> +
> +		commit refs/heads/path-space
> +		mark :301
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		initial commit
> +		COMMIT
> +		M 100644 :401 '"$path"'
> +
> +		commit refs/heads/path-space
> +		mark :302
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit filecopy source
> +		COMMIT
> +		from :301
> +		C '"$path"' hello2.c
> +
> +		commit refs/heads/path-space
> +		mark :303
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit filerename source
> +		COMMIT
> +		from :301
> +		R '"$path"' hello2.c
> +
> +		EOF
> +
> +		commit_c=$(grep :302 marks.out | cut -d\  -f2) &&
> +		commit_r=$(grep :303 marks.out | cut -d\  -f2) &&
> +		blob=$(grep :401 marks.out | cut -d\  -f2) &&
> +
> +		( printf "100644 blob $blob\t'"$unquoted_path"'\n" &&
> +		  printf "100644 blob $blob\thello2.c\n" ) | sort >tree_c.exp &&
> +		git ls-tree $commit_c | sort >tree_c.out &&
> +		test_cmp tree_c.exp tree_c.out &&
> +
> +		printf "100644 blob $blob\thello2.c\n" >tree_r.exp &&
> +		git ls-tree $commit_r >tree_r.out &&
> +		test_cmp tree_r.exp tree_r.out &&
> +
> +		git branch -D path-space
> +	'
> +}
> +
> +test_path_space_success 'quoted spaces'      '" hello world.c "' ' hello world.c '
> +test_path_space_success 'no unquoted spaces' 'hello_world.c'     'hello_world.c'
> +
> +#
> +# Test a single commit change with an invalid path. Run it with all occurrences
> +# of <path> in the grammar against all error kinds.
> +#
> +test_path_fail () {
> +	what="$1" path="$2" err_grep="$3"

Same.

> +	test_expect_success "S: $change with $what must fail" '
> +		test_must_fail git fast-import <<-EOF 2>err &&
> +		blob
> +		mark :1
> +		data <<BLOB
> +		hello world
> +		BLOB
> +
> +		commit refs/heads/S-path-fail
> +		mark :2
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit setup
> +		COMMIT
> +		M 100644 :1 hello.c
> +
> +		commit refs/heads/S-path-fail
> +		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
> +		data <<COMMIT
> +		commit with bad path
> +		COMMIT
> +		from :2
> +		'"$prefix$path$suffix"'
> +		EOF
> +
> +		test_grep '"'$err_grep'"' err
> +	'
> +}
> +
> +test_path_base_fail () {
> +	test_path_fail 'unclosed " in '"$field"          '"hello.c'    "Invalid $field"
> +	test_path_fail "invalid escape in quoted $field" '"hello\xff"' "Invalid $field"
> +}
> +test_path_eol_quoted_fail () {
> +	test_path_base_fail
> +	test_path_fail "garbage after quoted $field" '"hello.c"x' "Garbage after $field"
> +	test_path_fail "space after quoted $field"   '"hello.c" ' "Garbage after $field"
> +}
> +test_path_eol_fail () {
> +	test_path_eol_quoted_fail
> +	test_path_fail 'empty unquoted path' '' "Missing $field"
> +}
> +test_path_space_fail () {
> +	test_path_base_fail
> +	test_path_fail 'empty unquoted path' '' "Missing $field"
> +	test_path_fail "missing space after quoted $field" '"hello.c"x' "Missing space after $field"
> +}
> +
> +change=filemodify       prefix='M 100644 :1 ' field=path   suffix=''         test_path_eol_fail
> +change=filedelete       prefix='D '           field=path   suffix=''         test_path_eol_fail
> +change=filecopy         prefix='C '           field=source suffix=' world.c' test_path_space_fail
> +change=filecopy         prefix='C hello.c '   field=dest   suffix=''         test_path_eol_fail
> +change=filerename       prefix='R '           field=source suffix=' world.c' test_path_space_fail
> +change=filerename       prefix='R hello.c '   field=dest   suffix=''         test_path_eol_fail
> +change='ls (in commit)' prefix='ls :2 '       field=path   suffix=''         test_path_eol_fail

This is quite confusing because you now mix two different styles, where
some of the functions take arguments while others pass arguments via
variables. I think it would be preferable to pass all arguments as
proper function arguments.

Patrick

> +# When 'ls' has no <tree-ish>, the <path> must be quoted.
> +change='ls (without tree-ish in commit)' prefix='ls ' field=path suffix='' \
> +test_path_eol_quoted_fail &&
> +test_path_fail 'empty unquoted path' '' "Invalid dataref"
> +
>  ###
>  ### series T (ls)
>  ###
> -- 
> 2.44.0
> 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 2/6] fast-import: directly use strbufs for paths
From: Patrick Steinhardt @ 2024-03-28  8:21 UTC (permalink / raw)
  To: Thalia Archibald; +Cc: git, Elijah Newren
In-Reply-To: <20240322000304.76810-3-thalia@archibald.dev>

[-- Attachment #1: Type: text/plain, Size: 6398 bytes --]

On Fri, Mar 22, 2024 at 12:03:25AM +0000, Thalia Archibald wrote:
> Previously, one case would not write the path to the strbuf: when the
> path is unquoted and at the end of the string. It was essentially
> copy-on-write. However, with the logic simplification of the previous
> commit, this case was eliminated and the strbuf is always populated.
> 
> Directly use the strbufs now instead of an alias.
> 
> Since this already changes all the lines that use the strbufs, rename
> them from `uq` to be more descriptive. That they are unquoted is not
> their most important property, so name them after what they carry.
> 
> Additionally, `file_change_m` no longer needs to copy the path before
> reading inline data.
> 
> Signed-off-by: Thalia Archibald <thalia@archibald.dev>
> ---
>  builtin/fast-import.c | 54 ++++++++++++++++++-------------------------
>  1 file changed, 22 insertions(+), 32 deletions(-)
> 
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index b2adec8d9a..1b3d6784c1 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -2322,7 +2322,7 @@ static void parse_path_space(struct strbuf *sb, const char *p, const char **endp
>  
>  static void file_change_m(const char *p, struct branch *b)
>  {
> -	static struct strbuf uq = STRBUF_INIT;
> +	static struct strbuf path = STRBUF_INIT;

I was about to propose that we should likely also change all of these
static variables to be local instead. I don't think that we use the
variables after the function calls. But now that I see that we do it
like this in all of these helpers I think what's going on is that this
is a memory optimization to avoid reallocating buffers all the time.

Ugly, but so be it. We could refactor the code to pass in scratch
buffers from the outside to remove those static variables. But that
certainly would be a bigger change and thus likely outside of the scope
of this patch series.

Patrick

>  	struct object_entry *oe;
>  	struct object_id oid;
>  	uint16_t mode, inline_data = 0;
> @@ -2359,12 +2359,11 @@ static void file_change_m(const char *p, struct branch *b)
>  			die("Missing space after SHA1: %s", command_buf.buf);
>  	}
>  
> -	parse_path_eol(&uq, p, "path");
> -	p = uq.buf;
> +	parse_path_eol(&path, p, "path");
>  
>  	/* Git does not track empty, non-toplevel directories. */
> -	if (S_ISDIR(mode) && is_empty_tree_oid(&oid) && *p) {
> -		tree_content_remove(&b->branch_tree, p, NULL, 0);
> +	if (S_ISDIR(mode) && is_empty_tree_oid(&oid) && *path.buf) {
> +		tree_content_remove(&b->branch_tree, path.buf, NULL, 0);
>  		return;
>  	}
>  
> @@ -2385,10 +2384,6 @@ static void file_change_m(const char *p, struct branch *b)
>  		if (S_ISDIR(mode))
>  			die("Directories cannot be specified 'inline': %s",
>  				command_buf.buf);
> -		if (p != uq.buf) {
> -			strbuf_addstr(&uq, p);
> -			p = uq.buf;
> -		}
>  		while (read_next_command() != EOF) {
>  			const char *v;
>  			if (skip_prefix(command_buf.buf, "cat-blob ", &v))
> @@ -2414,49 +2409,45 @@ static void file_change_m(const char *p, struct branch *b)
>  				command_buf.buf);
>  	}
>  
> -	if (!*p) {
> +	if (!*path.buf) {
>  		tree_content_replace(&b->branch_tree, &oid, mode, NULL);
>  		return;
>  	}
> -	tree_content_set(&b->branch_tree, p, &oid, mode, NULL);
> +	tree_content_set(&b->branch_tree, path.buf, &oid, mode, NULL);
>  }
>  
>  static void file_change_d(const char *p, struct branch *b)
>  {
> -	static struct strbuf uq = STRBUF_INIT;
> +	static struct strbuf path = STRBUF_INIT;
>  
> -	parse_path_eol(&uq, p, "path");
> -	p = uq.buf;
> -	tree_content_remove(&b->branch_tree, p, NULL, 1);
> +	parse_path_eol(&path, p, "path");
> +	tree_content_remove(&b->branch_tree, path.buf, NULL, 1);
>  }
>  
>  static void file_change_cr(const char *p, struct branch *b, int rename)
>  {
> -	const char *s, *d;
> -	static struct strbuf s_uq = STRBUF_INIT;
> -	static struct strbuf d_uq = STRBUF_INIT;
> +	static struct strbuf source = STRBUF_INIT;
> +	static struct strbuf dest = STRBUF_INIT;
>  	struct tree_entry leaf;
>  
> -	parse_path_space(&s_uq, p, &p, "source");
> -	parse_path_eol(&d_uq, p, "dest");
> -	s = s_uq.buf;
> -	d = d_uq.buf;
> +	parse_path_space(&source, p, &p, "source");
> +	parse_path_eol(&dest, p, "dest");
>  
>  	memset(&leaf, 0, sizeof(leaf));
>  	if (rename)
> -		tree_content_remove(&b->branch_tree, s, &leaf, 1);
> +		tree_content_remove(&b->branch_tree, source.buf, &leaf, 1);
>  	else
> -		tree_content_get(&b->branch_tree, s, &leaf, 1);
> +		tree_content_get(&b->branch_tree, source.buf, &leaf, 1);
>  	if (!leaf.versions[1].mode)
> -		die("Path %s not in branch", s);
> -	if (!*d) {	/* C "path/to/subdir" "" */
> +		die("Path %s not in branch", source.buf);
> +	if (!*dest.buf) {	/* C "path/to/subdir" "" */
>  		tree_content_replace(&b->branch_tree,
>  			&leaf.versions[1].oid,
>  			leaf.versions[1].mode,
>  			leaf.tree);
>  		return;
>  	}
> -	tree_content_set(&b->branch_tree, d,
> +	tree_content_set(&b->branch_tree, dest.buf,
>  		&leaf.versions[1].oid,
>  		leaf.versions[1].mode,
>  		leaf.tree);
> @@ -3186,7 +3177,7 @@ static void parse_ls(const char *p, struct branch *b)
>  {
>  	struct tree_entry *root = NULL;
>  	struct tree_entry leaf = {NULL};
> -	static struct strbuf uq = STRBUF_INIT;
> +	static struct strbuf path = STRBUF_INIT;
>  
>  	/* ls SP (<tree-ish> SP)? <path> */
>  	if (*p == '"') {
> @@ -3201,9 +3192,8 @@ static void parse_ls(const char *p, struct branch *b)
>  			root->versions[1].mode = S_IFDIR;
>  		load_tree(root);
>  	}
> -	parse_path_eol(&uq, p, "path");
> -	p = uq.buf;
> -	tree_content_get(root, p, &leaf, 1);
> +	parse_path_eol(&path, p, "path");
> +	tree_content_get(root, path.buf, &leaf, 1);
>  	/*
>  	 * A directory in preparation would have a sha1 of zero
>  	 * until it is saved.  Save, for simplicity.
> @@ -3211,7 +3201,7 @@ static void parse_ls(const char *p, struct branch *b)
>  	if (S_ISDIR(leaf.versions[1].mode))
>  		store_tree(&leaf);
>  
> -	print_ls(leaf.versions[1].mode, leaf.versions[1].oid.hash, p);
> +	print_ls(leaf.versions[1].mode, leaf.versions[1].oid.hash, path.buf);
>  	if (leaf.tree)
>  		release_tree_content_recursive(leaf.tree);
>  	if (!b || root != &b->branch_tree)
> -- 
> 2.44.0
> 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v3] userdiff: better method/property matching for C#
From: Steven Jeuris via GitGitGadget @ 2024-03-28  8:07 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, Linus Arver,
	Johannes Sixt, Steven Jeuris, Steven Jeuris
In-Reply-To: <pull.1682.v2.git.git.1709756493673.gitgitgadget@gmail.com>

From: Steven Jeuris <steven.jeuris@3shape.com>

- Support multi-line methods by not requiring closing parenthesis.
- Support multiple generics (comma was missing before).
- Add missing `foreach`, `lock` and  `fixed` keywords to skip over.
- Remove `instanceof` keyword, which isn't C#.
- Also detect non-method keywords not positioned at the start of a line.
- Added tests; none existed before.

The overall strategy is to focus more on what isn't expected for
method/property definitions, instead of what is, but is fully optional.

Signed-off-by: Steven Jeuris <steven.jeuris@gmail.com>
---
    userdiff: better method/property matching for C#
    
    Change since v1: I removed "from" from the list of keywords to skip.
    First, I considered adding "await", but I discovered both "await" and
    "from" are "contextual keywords", which unlike the other keywords
    currently listed, aren't reserved, and can thus cause false negatives.
    I.e., it is valid to have a method named "await" or "from". In edge
    cases, this may lead to false positives, but a different exclusion rule
    will need to be added to handle these.
    
    Change since v2:
    
     * Corrected comment formatting.
     * Added csharp-property-skip-body test.
     * Added comments in test code to explain sections not part of the test.
     * Elaborated regex comments.
     * Excluded math operators (+-*/%) in method pattern to not catch
       multiline operations, and tested for this in the -skip-body tests.
       Catching "-" only worked when it was defined at the end of the
       exclusion block for some reason. The regex matcher seems quite
       bugged.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1682%2FWhathecode%2Fmaster-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1682/Whathecode/master-v3
Pull-Request: https://github.com/git/git/pull/1682

Range-diff vs v2:

 1:  00315519014 ! 1:  9b6c76f5342 userdiff: better method/property matching for C#
     @@ t/t4018/csharp-method (new)
      @@
      +class Example
      +{
     -+	string Method(int RIGHT)
     -+	{
     -+		// Filler
     -+		// Filler
     -+		
     -+		return "ChangeMe";
     -+	}
     ++    string Method(int RIGHT)
     ++    {
     ++        // Filler
     ++        // Filler
     ++
     ++        return "ChangeMe";
     ++    }
      +}
      
       ## t/t4018/csharp-method-explicit (new) ##
     @@ t/t4018/csharp-method-explicit (new)
      +
      +class Example : IDisposable
      +{
     -+	void IDisposable.Dispose() // RIGHT
     -+	{
     -+		// Filler
     -+		// Filler
     -+		
     -+		// ChangeMe
     -+	}
     ++    void IDisposable.Dispose() // RIGHT
     ++    {
     ++        // Filler
     ++        // Filler
     ++        
     ++        // ChangeMe
     ++    }
      +}
      
       ## t/t4018/csharp-method-generics (new) ##
      @@
      +class Example<T1, T2>
      +{
     -+	Example<int, string> Method<TA, TB>(TA RIGHT, TB b)
     -+	{
     -+		// Filler
     -+		// Filler
     -+		
     -+		// ChangeMe
     -+		return null;
     -+	}
     ++    Example<int, string> Method<TA, TB>(TA RIGHT, TB b)
     ++    {
     ++        // Filler
     ++        // Filler
     ++        
     ++        // ChangeMe
     ++        return null;
     ++    }
      +}
      
       ## t/t4018/csharp-method-modifiers (new) ##
     @@ t/t4018/csharp-method-modifiers (new)
      +
      +class Example
      +{
     -+	static internal async Task Method(int RIGHT)
     -+	{
     -+		// Filler
     -+		// Filler
     -+		
     -+		// ChangeMe
     -+		await Task.Delay(1);
     -+	}
     ++    static internal async Task Method(int RIGHT)
     ++    {
     ++        // Filler
     ++        // Filler
     ++        
     ++        // ChangeMe
     ++        await Task.Delay(1);
     ++    }
      +}
      
       ## t/t4018/csharp-method-multiline (new) ##
      @@
      +class Example
      +{
     -+	string Method_RIGHT(
     -+		int a,
     -+		int b,
     -+		int c)
     -+	{
     -+		return "ChangeMe";
     -+	}
     ++    string Method_RIGHT(
     ++        int a,
     ++        int b,
     ++        int c)
     ++    {
     ++        return "ChangeMe";
     ++    }
      +}
      
       ## t/t4018/csharp-method-params (new) ##
      @@
      +class Example
      +{
     -+	string Method(int RIGHT, int b, int c = 42)
     -+	{
     -+		// Filler
     -+		// Filler
     -+		
     -+		return "ChangeMe";
     -+	}
     ++    string Method(int RIGHT, int b, int c = 42)
     ++    {
     ++        // Filler
     ++        // Filler
     ++        
     ++        return "ChangeMe";
     ++    }
      +}
      
       ## t/t4018/csharp-method-skip-body (new) ##
     @@ t/t4018/csharp-method-skip-body (new)
      +
      +class Example : IDisposable
      +{
     -+	string Method(int RIGHT)
     -+	{
     -+		// Method calls
     -+		MethodCall();
     -+		MethodCall(1, 2);
     -+		MethodCall(
     -+			1, 2);
     -+		
     -+		// Assignments
     -+		var constantAssignment = "test";
     -+		var methodAssignment = MethodCall();
     -+		var multiLineMethodAssignment = MethodCall(
     -+			);
     -+		
     -+		// Initializations/disposal
     -+		new Example();
     -+		new Example(
     -+			);
     -+		new Example { };
     -+		using (this) 
     -+		{
     -+		}
     -+		var def =
     -+			this is default(
     -+				Example);
     -+		
     -+		// Iteration statements
     -+		do { } while (true);
     -+		do MethodCall(
     -+			); while (true);
     -+		while (true);
     -+		while (true) {
     -+			break;
     -+		}
     -+		for (int i = 0; i < 10; ++i)
     -+		{
     -+		}
     -+		foreach (int i in Enumerable.Range(0, 10))
     -+		{
     -+		}
     -+		int[] numbers = [5, 4, 1, 3, 9, 8, 6, 7, 2, 0];
     -+		
     -+		// Control
     -+		if (false)
     -+		{
     -+			return "out";
     -+		}
     -+		else { }
     -+		if (true) MethodCall(
     -+			);
     -+		else MethodCall(
     -+			);
     -+		switch ("test")
     -+		{
     -+			case "one":
     -+				return MethodCall(
     -+					);
     -+			case "two":
     -+				break;
     -+		}
     -+		(int, int) tuple = (1, 4);
     -+		switch (tuple)
     -+		{
     -+			case (1, 4):
     -+				MethodCall();
     -+		}
     -+		
     -+		// Exceptions
     -+		try
     -+		{
     -+			throw new Exception("fail");
     -+		}
     -+		catch (Exception)
     -+		{
     -+		}
     -+		finally
     -+		{
     -+		}
     -+		try { } catch (Exception) {}
     -+		try
     -+		{
     -+			throw GetException(
     -+				);
     -+		}
     -+		catch (Exception) { }
     -+		
     -+		// Others
     -+		lock (this)
     -+		{
     -+		}
     -+		unsafe
     -+		{
     -+			byte[] bytes = [1, 2, 3];
     -+			fixed (byte* pointerToFirst = bytes)
     -+			{
     -+			}
     -+		}
     -+		
     -+		return "ChangeMe";
     -+	}
     -+	
     -+	public void Dispose() {}
     -+	
     -+	string MethodCall(int a = 0, int b = 0) => "test";
     -+	Exception GetException() => new Exception("fail");
     -+	int[] Numbers() => [0, 1];
     ++    string Method(int RIGHT)
     ++    {
     ++        // Method calls
     ++        MethodCall();
     ++        MethodCall(1, 2);
     ++        MethodCall(
     ++            1, 2);
     ++
     ++        // Assignments
     ++        var constantAssignment = "test";
     ++        var methodAssignment = MethodCall();
     ++        var multiLineMethodAssignment = MethodCall(
     ++        );
     ++        var multiLine = "first"
     ++            + MethodCall()
     ++            - MethodCall()
     ++            + MethodCall();
     ++
     ++        // Initializations/disposal
     ++        new Example();
     ++        new Example(
     ++            );
     ++        new Example { };
     ++        using (this)
     ++        {
     ++        }
     ++        var def =
     ++            this is default(
     ++                Example);
     ++
     ++        // Iteration statements
     ++        do { } while (true);
     ++        do MethodCall(
     ++        ); while (true);
     ++        while (true);
     ++        while (true) {
     ++            break;
     ++        }
     ++        for (int i = 0; i < 10; ++i)
     ++        {
     ++        }
     ++        foreach (int i in Enumerable.Range(0, 10))
     ++        {
     ++        }
     ++        int[] numbers = [5, 4, 1, 3, 9, 8, 6, 7, 2, 0];
     ++
     ++        // Control
     ++        if (false)
     ++        {
     ++            return "out";
     ++        }
     ++        else { }
     ++        if (true) MethodCall(
     ++        );
     ++        else MethodCall(
     ++        );
     ++        switch ("test")
     ++        {
     ++            case "one":
     ++            return MethodCall(
     ++            );
     ++            case "two":
     ++            break;
     ++        }
     ++        (int, int) tuple = (1, 4);
     ++        switch (tuple)
     ++        {
     ++            case (1, 4):
     ++            MethodCall();
     ++        }
     ++
     ++        // Exceptions
     ++        try
     ++        {
     ++            throw new Exception("fail");
     ++        }
     ++        catch (Exception)
     ++        {
     ++        }
     ++        finally
     ++        {
     ++        }
     ++        try { } catch (Exception) {}
     ++        try
     ++        {
     ++            throw GetException(
     ++            );
     ++        }
     ++        catch (Exception) { }
     ++
     ++        // Others
     ++        lock (this)
     ++        {
     ++        }
     ++        unsafe
     ++        {
     ++            byte[] bytes = [1, 2, 3];
     ++            fixed (byte* pointerToFirst = bytes)
     ++            {
     ++            }
     ++        }
     ++
     ++        return "ChangeMe";
     ++    }
     ++
     ++    // Supporting methods to make the tested method above valid C#.
     ++    public void Dispose() {}
     ++    string MethodCall(int a = 0, int b = 0) => "test";
     ++    Exception GetException() => new Exception("fail");
     ++    int[] Numbers() => [0, 1];
      +}
      
       ## t/t4018/csharp-method-special-chars (new) ##
      @@
      +class @Some_Type
      +{
     -+	@Some_Type @Method_With_Underscore(int RIGHT)
     -+	{
     -+		// Filler
     -+		// Filler
     -+		
     -+		// ChangeMe
     -+		return new @Some_Type();
     -+	}
     ++    @Some_Type @Method_With_Underscore(int RIGHT)
     ++    {
     ++        // Filler
     ++        // Filler
     ++        
     ++        // ChangeMe
     ++        return new @Some_Type();
     ++    }
      +}
      
       ## t/t4018/csharp-method-with-spacing (new) ##
     @@ t/t4018/csharp-method-with-spacing (new)
      +{
      +		string   Method 	( int 	RIGHT )
      +	{
     -+		// Filler
     -+		// Filler
     ++        // Filler
     ++        // Filler
      +		
     -+		return "ChangeMe";
     -+	}
     ++        return "ChangeMe";
     ++    }
      +}
      
       ## t/t4018/csharp-property (new) ##
      @@
      +class Example
      +{
     -+	public bool RIGHT
     ++    public bool RIGHT
      +    {
      +        get { return true; }
      +        set
     @@ t/t4018/csharp-property (new)
      +            // ChangeMe
      +        }
      +    }
     ++}
     +
     + ## t/t4018/csharp-property-skip-body (new) ##
     +@@
     ++using System.Linq;
     ++using System;
     ++
     ++class Example : IDisposable
     ++{
     ++    public string RIGHT
     ++    {
     ++        get
     ++        {
     ++            // Method calls
     ++            MethodCall();
     ++            MethodCall(1, 2);
     ++            MethodCall(
     ++                1, 2);
     ++
     ++            // Assignments
     ++            var constantAssignment = "test";
     ++            var methodAssignment = MethodCall();
     ++            var multiLineMethodAssignment = MethodCall(
     ++                );
     ++            var multiLine = "first"
     ++                + MethodCall()
     ++                - MethodCall()
     ++                + MethodCall();
     ++
     ++            // Initializations/disposal
     ++            new Example();
     ++            new Example(
     ++                );
     ++            new Example { };
     ++            using (this)
     ++            {
     ++            }
     ++            var def =
     ++                this is default(
     ++                    Example);
     ++
     ++            // Iteration statements
     ++            do { } while (true);
     ++            do MethodCall(
     ++            ); while (true);
     ++            while (true);
     ++            while (true) {
     ++                break;
     ++            }
     ++            for (int i = 0; i < 10; ++i)
     ++            {
     ++            }
     ++            foreach (int i in Enumerable.Range(0, 10))
     ++            {
     ++            }
     ++            int[] numbers = [5, 4, 1, 3, 9, 8, 6, 7, 2, 0];
     ++
     ++            // Control
     ++            if (false)
     ++            {
     ++                return "out";
     ++            }
     ++            else { }
     ++            if (true) MethodCall(
     ++            );
     ++            else MethodCall(
     ++            );
     ++            switch ("test")
     ++            {
     ++                case "one":
     ++                return MethodCall(
     ++                );
     ++                case "two":
     ++                break;
     ++            }
     ++            (int, int) tuple = (1, 4);
     ++            switch (tuple)
     ++            {
     ++                case (1, 4):
     ++                MethodCall();
     ++            }
     ++
     ++            // Exceptions
     ++            try
     ++            {
     ++                throw new Exception("fail");
     ++            }
     ++            catch (Exception)
     ++            {
     ++            }
     ++            finally
     ++            {
     ++            }
     ++            try { } catch (Exception) {}
     ++            try
     ++            {
     ++                throw GetException(
     ++                );
     ++            }
     ++            catch (Exception) { }
     ++
     ++            // Others
     ++            lock (this)
     ++            {
     ++            }
     ++            unsafe
     ++            {
     ++                byte[] bytes = [1, 2, 3];
     ++                fixed (byte* pointerToFirst = bytes)
     ++                {
     ++                }
     ++            }
     ++
     ++            return "ChangeMe";
     ++        }
     ++        set { }
     ++    }
     ++
     ++    // Supporting methods to make the tested property above valid C#.
     ++    public void Dispose() {}
     ++    string MethodCall(int a = 0, int b = 0) => "test";
     ++    Exception GetException() => new Exception("fail");
     ++    int[] Numbers() => [0, 1];
      +}
      
       ## userdiff.c ##
     @@ userdiff.c: PATTERNS("cpp",
      -	 "!^[ \t]*(do|while|for|if|else|instanceof|new|return|switch|case|throw|catch|using)\n"
      -	 /* Methods and constructors */
      -	 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
     +-	 /* Properties */
     +-	 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[@._[:alnum:]]+)[ \t]*$\n"
      +	 /*
     -+	  * Jump over keywords not used by methods which can be followed by parentheses without special characters in between,
     ++	  * Jump over reserved keywords which are illegal method names, but which
     ++	  * can be followed by parentheses without special characters in between,
      +	  * making them look like methods.
      +	  */
      +	 "!(^|[ \t]+)(do|while|for|foreach|if|else|new|default|return|switch|case|throw|catch|using|lock|fixed)([ \t(]+|$)\n"
     -+	 /* Methods/constructors:
     -+	  * the strategy is to identify a minimum of two groups (any combination of keywords/type/name),
     -+	  * without intermediate or final characters which can't be part of method definitions before the opening parenthesis.
     ++	 /*
     ++	  * Methods/constructors:
     ++	  * The strategy is to identify a minimum of two groups (any combination
     ++	  * of keywords/type/name), without intermediate characters which can't be
     ++	  * part of method definitions before the opening parenthesis, and without
     ++	  * final unexpected characters, normally only used in ordinary statements.
     ++	  * "=" is excluded to ignore assignments, but as a result rules out
     ++	  * methods with expression bodies. However, since those fit on 1-2 lines,
     ++	  * a chunk header isn't useful either way.
      +	  */
     -+	 "^[ \t]*(([][[:alnum:]@_<>.,]*[^=:{ \t][ \t]+[][[:alnum:]@_<>.,]*)+\\([^;]*)$\n"
     - 	 /* Properties */
     --	 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[@._[:alnum:]]+)[ \t]*$\n"
     -+	 "^[ \t]*((([][[:alnum:]@_<>.,]+)[ \t]+[][[:alnum:]@_]*)+[^=:;,()]*)$\n"
     ++	 "^[ \t]*(([][[:alnum:]@_<>.,]*[^=:{ \t+*%\\/\\-][ \t]+[][[:alnum:]@_<>.,]*)+\\([^;]*)$\n"
     ++	 /*
     ++	  * Properties:
     ++	  * As with methods, expect a minimum of two groups. And, the vast majority
     ++	  * of properties long enough to be worth showing a chunk header for don't
     ++	  * include "=:;,()" on the line they are defined.
     ++	  */
     ++	 "^[ \t]*([][[:alnum:]@_<>.,]+[ \t]+[][[:alnum:]@_.]+[^=:;,()]*)$\n"
       	 /* Type definitions */
       	 "^[ \t]*(((static|public|internal|private|protected|new|unsafe|sealed|abstract|partial)[ \t]+)*(class|enum|interface|struct|record)[ \t]+.*)$\n"
       	 /* Namespace */


 t/t4018/csharp-method               |  10 +++
 t/t4018/csharp-method-explicit      |  12 +++
 t/t4018/csharp-method-generics      |  11 +++
 t/t4018/csharp-method-modifiers     |  13 +++
 t/t4018/csharp-method-multiline     |  10 +++
 t/t4018/csharp-method-params        |  10 +++
 t/t4018/csharp-method-skip-body     | 116 +++++++++++++++++++++++++++
 t/t4018/csharp-method-special-chars |  11 +++
 t/t4018/csharp-method-with-spacing  |  10 +++
 t/t4018/csharp-property             |  11 +++
 t/t4018/csharp-property-skip-body   | 120 ++++++++++++++++++++++++++++
 userdiff.c                          |  30 +++++--
 12 files changed, 358 insertions(+), 6 deletions(-)
 create mode 100644 t/t4018/csharp-method
 create mode 100644 t/t4018/csharp-method-explicit
 create mode 100644 t/t4018/csharp-method-generics
 create mode 100644 t/t4018/csharp-method-modifiers
 create mode 100644 t/t4018/csharp-method-multiline
 create mode 100644 t/t4018/csharp-method-params
 create mode 100644 t/t4018/csharp-method-skip-body
 create mode 100644 t/t4018/csharp-method-special-chars
 create mode 100644 t/t4018/csharp-method-with-spacing
 create mode 100644 t/t4018/csharp-property
 create mode 100644 t/t4018/csharp-property-skip-body

diff --git a/t/t4018/csharp-method b/t/t4018/csharp-method
new file mode 100644
index 00000000000..16b367aca2b
--- /dev/null
+++ b/t/t4018/csharp-method
@@ -0,0 +1,10 @@
+class Example
+{
+    string Method(int RIGHT)
+    {
+        // Filler
+        // Filler
+
+        return "ChangeMe";
+    }
+}
diff --git a/t/t4018/csharp-method-explicit b/t/t4018/csharp-method-explicit
new file mode 100644
index 00000000000..5a710116cc4
--- /dev/null
+++ b/t/t4018/csharp-method-explicit
@@ -0,0 +1,12 @@
+using System;
+
+class Example : IDisposable
+{
+    void IDisposable.Dispose() // RIGHT
+    {
+        // Filler
+        // Filler
+        
+        // ChangeMe
+    }
+}
diff --git a/t/t4018/csharp-method-generics b/t/t4018/csharp-method-generics
new file mode 100644
index 00000000000..b3216bfb2a7
--- /dev/null
+++ b/t/t4018/csharp-method-generics
@@ -0,0 +1,11 @@
+class Example<T1, T2>
+{
+    Example<int, string> Method<TA, TB>(TA RIGHT, TB b)
+    {
+        // Filler
+        // Filler
+        
+        // ChangeMe
+        return null;
+    }
+}
diff --git a/t/t4018/csharp-method-modifiers b/t/t4018/csharp-method-modifiers
new file mode 100644
index 00000000000..caefa8ee99c
--- /dev/null
+++ b/t/t4018/csharp-method-modifiers
@@ -0,0 +1,13 @@
+using System.Threading.Tasks;
+
+class Example
+{
+    static internal async Task Method(int RIGHT)
+    {
+        // Filler
+        // Filler
+        
+        // ChangeMe
+        await Task.Delay(1);
+    }
+}
diff --git a/t/t4018/csharp-method-multiline b/t/t4018/csharp-method-multiline
new file mode 100644
index 00000000000..3983ff42f51
--- /dev/null
+++ b/t/t4018/csharp-method-multiline
@@ -0,0 +1,10 @@
+class Example
+{
+    string Method_RIGHT(
+        int a,
+        int b,
+        int c)
+    {
+        return "ChangeMe";
+    }
+}
diff --git a/t/t4018/csharp-method-params b/t/t4018/csharp-method-params
new file mode 100644
index 00000000000..3f00410ba1f
--- /dev/null
+++ b/t/t4018/csharp-method-params
@@ -0,0 +1,10 @@
+class Example
+{
+    string Method(int RIGHT, int b, int c = 42)
+    {
+        // Filler
+        // Filler
+        
+        return "ChangeMe";
+    }
+}
diff --git a/t/t4018/csharp-method-skip-body b/t/t4018/csharp-method-skip-body
new file mode 100644
index 00000000000..b3c7194eb74
--- /dev/null
+++ b/t/t4018/csharp-method-skip-body
@@ -0,0 +1,116 @@
+using System.Linq;
+using System;
+
+class Example : IDisposable
+{
+    string Method(int RIGHT)
+    {
+        // Method calls
+        MethodCall();
+        MethodCall(1, 2);
+        MethodCall(
+            1, 2);
+
+        // Assignments
+        var constantAssignment = "test";
+        var methodAssignment = MethodCall();
+        var multiLineMethodAssignment = MethodCall(
+        );
+        var multiLine = "first"
+            + MethodCall()
+            - MethodCall()
+            + MethodCall();
+
+        // Initializations/disposal
+        new Example();
+        new Example(
+            );
+        new Example { };
+        using (this)
+        {
+        }
+        var def =
+            this is default(
+                Example);
+
+        // Iteration statements
+        do { } while (true);
+        do MethodCall(
+        ); while (true);
+        while (true);
+        while (true) {
+            break;
+        }
+        for (int i = 0; i < 10; ++i)
+        {
+        }
+        foreach (int i in Enumerable.Range(0, 10))
+        {
+        }
+        int[] numbers = [5, 4, 1, 3, 9, 8, 6, 7, 2, 0];
+
+        // Control
+        if (false)
+        {
+            return "out";
+        }
+        else { }
+        if (true) MethodCall(
+        );
+        else MethodCall(
+        );
+        switch ("test")
+        {
+            case "one":
+            return MethodCall(
+            );
+            case "two":
+            break;
+        }
+        (int, int) tuple = (1, 4);
+        switch (tuple)
+        {
+            case (1, 4):
+            MethodCall();
+        }
+
+        // Exceptions
+        try
+        {
+            throw new Exception("fail");
+        }
+        catch (Exception)
+        {
+        }
+        finally
+        {
+        }
+        try { } catch (Exception) {}
+        try
+        {
+            throw GetException(
+            );
+        }
+        catch (Exception) { }
+
+        // Others
+        lock (this)
+        {
+        }
+        unsafe
+        {
+            byte[] bytes = [1, 2, 3];
+            fixed (byte* pointerToFirst = bytes)
+            {
+            }
+        }
+
+        return "ChangeMe";
+    }
+
+    // Supporting methods to make the tested method above valid C#.
+    public void Dispose() {}
+    string MethodCall(int a = 0, int b = 0) => "test";
+    Exception GetException() => new Exception("fail");
+    int[] Numbers() => [0, 1];
+}
diff --git a/t/t4018/csharp-method-special-chars b/t/t4018/csharp-method-special-chars
new file mode 100644
index 00000000000..e6c7bc01a18
--- /dev/null
+++ b/t/t4018/csharp-method-special-chars
@@ -0,0 +1,11 @@
+class @Some_Type
+{
+    @Some_Type @Method_With_Underscore(int RIGHT)
+    {
+        // Filler
+        // Filler
+        
+        // ChangeMe
+        return new @Some_Type();
+    }
+}
diff --git a/t/t4018/csharp-method-with-spacing b/t/t4018/csharp-method-with-spacing
new file mode 100644
index 00000000000..da0c9b7e60c
--- /dev/null
+++ b/t/t4018/csharp-method-with-spacing
@@ -0,0 +1,10 @@
+class Example
+{
+		string   Method 	( int 	RIGHT )
+	{
+        // Filler
+        // Filler
+		
+        return "ChangeMe";
+    }
+}
diff --git a/t/t4018/csharp-property b/t/t4018/csharp-property
new file mode 100644
index 00000000000..e56dfce34c1
--- /dev/null
+++ b/t/t4018/csharp-property
@@ -0,0 +1,11 @@
+class Example
+{
+    public bool RIGHT
+    {
+        get { return true; }
+        set
+        {
+            // ChangeMe
+        }
+    }
+}
diff --git a/t/t4018/csharp-property-skip-body b/t/t4018/csharp-property-skip-body
new file mode 100644
index 00000000000..ad9d96007a8
--- /dev/null
+++ b/t/t4018/csharp-property-skip-body
@@ -0,0 +1,120 @@
+using System.Linq;
+using System;
+
+class Example : IDisposable
+{
+    public string RIGHT
+    {
+        get
+        {
+            // Method calls
+            MethodCall();
+            MethodCall(1, 2);
+            MethodCall(
+                1, 2);
+
+            // Assignments
+            var constantAssignment = "test";
+            var methodAssignment = MethodCall();
+            var multiLineMethodAssignment = MethodCall(
+                );
+            var multiLine = "first"
+                + MethodCall()
+                - MethodCall()
+                + MethodCall();
+
+            // Initializations/disposal
+            new Example();
+            new Example(
+                );
+            new Example { };
+            using (this)
+            {
+            }
+            var def =
+                this is default(
+                    Example);
+
+            // Iteration statements
+            do { } while (true);
+            do MethodCall(
+            ); while (true);
+            while (true);
+            while (true) {
+                break;
+            }
+            for (int i = 0; i < 10; ++i)
+            {
+            }
+            foreach (int i in Enumerable.Range(0, 10))
+            {
+            }
+            int[] numbers = [5, 4, 1, 3, 9, 8, 6, 7, 2, 0];
+
+            // Control
+            if (false)
+            {
+                return "out";
+            }
+            else { }
+            if (true) MethodCall(
+            );
+            else MethodCall(
+            );
+            switch ("test")
+            {
+                case "one":
+                return MethodCall(
+                );
+                case "two":
+                break;
+            }
+            (int, int) tuple = (1, 4);
+            switch (tuple)
+            {
+                case (1, 4):
+                MethodCall();
+            }
+
+            // Exceptions
+            try
+            {
+                throw new Exception("fail");
+            }
+            catch (Exception)
+            {
+            }
+            finally
+            {
+            }
+            try { } catch (Exception) {}
+            try
+            {
+                throw GetException(
+                );
+            }
+            catch (Exception) { }
+
+            // Others
+            lock (this)
+            {
+            }
+            unsafe
+            {
+                byte[] bytes = [1, 2, 3];
+                fixed (byte* pointerToFirst = bytes)
+                {
+                }
+            }
+
+            return "ChangeMe";
+        }
+        set { }
+    }
+
+    // Supporting methods to make the tested property above valid C#.
+    public void Dispose() {}
+    string MethodCall(int a = 0, int b = 0) => "test";
+    Exception GetException() => new Exception("fail");
+    int[] Numbers() => [0, 1];
+}
diff --git a/userdiff.c b/userdiff.c
index e399543823b..5440ccf2de5 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -89,12 +89,30 @@ PATTERNS("cpp",
 	 "|\\.[0-9][0-9]*([Ee][-+]?[0-9]+)?[fFlL]?"
 	 "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->\\*?|\\.\\*|<=>"),
 PATTERNS("csharp",
-	 /* Keywords */
-	 "!^[ \t]*(do|while|for|if|else|instanceof|new|return|switch|case|throw|catch|using)\n"
-	 /* Methods and constructors */
-	 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe|async)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[<>@._[:alnum:]]+[ \t]*\\(.*\\))[ \t]*$\n"
-	 /* Properties */
-	 "^[ \t]*(((static|public|internal|private|protected|new|virtual|sealed|override|unsafe)[ \t]+)*[][<>@.~_[:alnum:]]+[ \t]+[@._[:alnum:]]+)[ \t]*$\n"
+	 /*
+	  * Jump over reserved keywords which are illegal method names, but which
+	  * can be followed by parentheses without special characters in between,
+	  * making them look like methods.
+	  */
+	 "!(^|[ \t]+)(do|while|for|foreach|if|else|new|default|return|switch|case|throw|catch|using|lock|fixed)([ \t(]+|$)\n"
+	 /*
+	  * Methods/constructors:
+	  * The strategy is to identify a minimum of two groups (any combination
+	  * of keywords/type/name), without intermediate characters which can't be
+	  * part of method definitions before the opening parenthesis, and without
+	  * final unexpected characters, normally only used in ordinary statements.
+	  * "=" is excluded to ignore assignments, but as a result rules out
+	  * methods with expression bodies. However, since those fit on 1-2 lines,
+	  * a chunk header isn't useful either way.
+	  */
+	 "^[ \t]*(([][[:alnum:]@_<>.,]*[^=:{ \t+*%\\/\\-][ \t]+[][[:alnum:]@_<>.,]*)+\\([^;]*)$\n"
+	 /*
+	  * Properties:
+	  * As with methods, expect a minimum of two groups. And, the vast majority
+	  * of properties long enough to be worth showing a chunk header for don't
+	  * include "=:;,()" on the line they are defined.
+	  */
+	 "^[ \t]*([][[:alnum:]@_<>.,]+[ \t]+[][[:alnum:]@_.]+[^=:;,()]*)$\n"
 	 /* Type definitions */
 	 "^[ \t]*(((static|public|internal|private|protected|new|unsafe|sealed|abstract|partial)[ \t]+)*(class|enum|interface|struct|record)[ \t]+.*)$\n"
 	 /* Namespace */

base-commit: f41f85c9ec8d4d46de0fd5fded88db94d3ec8c11
-- 
gitgitgadget


^ permalink raw reply related

* Re: Allow git bisect to auto-skip
From: Olliver Schinagl @ 2024-03-28  8:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christian Couder, Stefan Haller, git
In-Reply-To: <xmqqfrwb1nr8.fsf@gitster.g>

On 27-03-2024 20:24, Junio C Hamano wrote:
> Olliver Schinagl <oliver@schinagl.nl> writes:
>
>> Anyway, want I was thinking of, based a key somewhere in the message
>> body (GIT_BISECT_SKIP=1 for example), mark the commit in the list to
>> be skipped, as `git bisect skip` would. This so the skipped commit
>> actualyl ends up on the list of skipped commits
>> (`refs/bisect/skip-*`).
> I think it is very unlikely that we'd adopt any hardcoded scheme
> that allows only one supported way of marking commits as not to be
> tested.  Certainly not the kind that the committer is forced to know
> the commit must be skipped during a bisection session before
> creating the commit.

My goal is to make things work as automagically as possible, without 
special tooling or scripts. This to make the barrier for entry as low as 
possible. E.g. if someone says 'could you please bisect this for me', 
things should just work.

Having said that, I also realize that the use-cases where a commit 
message would be enough are limited. There's only a very rare cases 
where it is known before hand a commit will break bisect.

The `GIT_BISECT_SKIP=1` would obviously be configurable, so it could be 
set on a per-repo level, with a sane default, and probably wiser to use 
`git commit --skip-bisect` to make use of said flag.


Having said all that, right now, I'm just exploring how things work, how 
easy it would be, how logical it would be. So for thank you for bearing 
with me :)

>    So I am not sure how much good it will do to
> know that commit_list->item is a commit object on the list, or
> calling repo_get_commit_buffer() on it would give the contents of
> the commit object.
>
> Knowing that commit_list->item->object.oid is the object name of
> such a commit, and calling oid_to_hex() on such an object name
> stringifies it, should be sufficient to write a new code that calls
> out to a script specified via "git bisect --skip-when=<script>"

I had learned about `oid_to_hex()` already, and I am charmed by that 
idea for sure. So much in that I'll look into this first. In the end, 
both solutions could even be used as they are in nearly the same spot 
I'd think.


Four things I lack right now. The first one I hope to have figured out 
before your reply (but don't shy the reply).

1) What is the best way to parse the skip-when argument. I see 
`bisect_start()` does arg parsing, but trying to see what 'the git way' 
is to get the argument of the new flag.

2) Where to store the argument. `bisect_terms` is passed around 
everywhere to hold the good and bad states, but with bisect doesn't pass 
much of anything else around. Though `bisect_run` isn't storing 
`command` either, and you do mention so I'll delve into that and see how 
that maps into the `run_command()` API.

3) Using the `run_command()` API implies to do the check 'on checkout'; 
which was my first idea as well, but a) where does this happen? I have 
not found this yet, though `bisect_run` might be more revealing here,

3) Using this approach, can I still do a 'regular' `git bisect skip` in 
that the skip gets recorded (in refs/bisect/skip-*)? I think it would be 
nice to see what was skipped.


Thanks,

Olliver

> using the run_command() API to decide if the commit object should be
> skipped, and if you do want GIT_BISECT_SKIP=1 in the log message,
> your script given via --skip-when may be a single-liner:
>
>      #!/bin/sh
>      # usage: this-script $commit_object_name
>      # expected to exit with 0 when the commit should not be tested
>      git cat-file commit "$1" | sed -e '1,/^$/d' | grep GIT_BISECT_SKIP=1
>
> Thanks.




^ permalink raw reply

* Re: [PATCH 13/13] credential: add support for multistage credential rounds
From: M Hickford @ 2024-03-28  8:00 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Junio C Hamano, Matthew John Cheetham, M Hickford
In-Reply-To: <20240324011301.1553072-14-sandals@crustytoothpaste.net>

On Sun, 24 Mar 2024 at 01:13, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Over HTTP, NTLM and Kerberos require two rounds of authentication on the
> client side.  It's possible that there are custom authentication schemes
> that also implement this same approach.  Since these are tricky schemes
> to implement and the HTTP library in use may not always handle them
> gracefully on all systems, it would be helpful to allow the credential
> helper to implement them instead for increased portability and
> robustness.

Is this design sufficiently flexible for OAuth DPoP (RFC 9449), or at
least to make it work in future?

OAuth 2.0 Demonstrating Proof of Possession describes "a mechanism for
sender-constraining OAuth 2.0 tokens via a proof-of-possession
mechanism on the application level. This mechanism allows for the
detection of replay attacks with access and refresh tokens."
https://www.rfc-editor.org/rfc/rfc9449.html

Popular hosts GitHub, GitLab, Bitbucket and Gitea already support
OAuth. OAuth DPoP "provides a general defense in depth against the
impact of unanticipated token leakage". Motivated by a 2022 GitHub
attack involving stolen tokens
(https://github.blog/2022-04-15-security-alert-stolen-oauth-user-tokens/),
some hosts are already experimenting with it.
https://lore.kernel.org/git/20230128142827.17397-1-mirth.hickford@gmail.com/

In particular, the http request has to include both Authorization and
DPoP headers https://www.rfc-editor.org/rfc/rfc9449.html#name-the-dpop-authentication-sch.
The latter depends on timestamp and a server-optional challenge in a
DPoP-Nonce header.
https://www.rfc-editor.org/rfc/rfc9449.html#name-resource-server-provided-no.


On Sun, 24 Mar 2024 at 01:13, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Over HTTP, NTLM and Kerberos require two rounds of authentication on the
> client side.  It's possible that there are custom authentication schemes
> that also implement this same approach.  Since these are tricky schemes
> to implement and the HTTP library in use may not always handle them
> gracefully on all systems, it would be helpful to allow the credential
> helper to implement them instead for increased portability and
> robustness.
>
> To allow this to happen, add a boolean flag, continue, that indicates
> that instead of failing when we get a 401, we should retry another round
> of authentication.  However, this necessitates some changes in our
> current credential code so that we can make this work.
>
> Keep the state[] headers between iterations, but only use them to send
> to the helper and only consider the new ones we read from the credential
> helper to be valid on subsequent iterations.  That avoids us passing
> stale data when we finally approve or reject the credential.  Similarly,
> clear the multistage and wwwauth[] values appropriately so that we
> don't pass stale data or think we're trying a multiround response when
> we're not.  Remove the credential values so that we can actually fill a
> second time with new responses.
>
> Limit the number of iterations of reauthentication we do to 3.  This
> means that if there's a problem, we'll terminate with an error message
> instead of retrying indefinitely and not informing the user (and
> possibly conducting a DoS on the server).
>
> In our tests, handle creating multiple response output files from our
> helper so we can verify that each of the messages sent is correct.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  Documentation/git-credential.txt | 16 +++++-
>  builtin/credential.c             |  1 +
>  credential.c                     | 32 ++++++++++--
>  credential.h                     | 27 +++++++++-
>  http.c                           | 59 +++++++++++----------
>  t/t5563-simple-http-auth.sh      | 89 ++++++++++++++++++++++++++++++--
>  6 files changed, 187 insertions(+), 37 deletions(-)
>
> diff --git a/Documentation/git-credential.txt b/Documentation/git-credential.txt
> index 6b7e017066..160dee5c6a 100644
> --- a/Documentation/git-credential.txt
> +++ b/Documentation/git-credential.txt
> @@ -207,6 +207,19 @@ provided on input.
>  This value should not be sent unless the appropriate capability (see below) is
>  provided on input.
>
> +`continue`::
> +       This is a boolean value, which, if enabled, indicates that this
> +       authentication is a non-final part of a multistage authentication step. This
> +       is common in protocols such as NTLM and Kerberos, where two rounds of client
> +       authentication are required, and setting this flag allows the credential
> +       helper to implement the multistage authentication step.  This flag should
> +       only be sent if a further stage is required; that is, if another round of
> +       authentication is expected.
> ++
> +This value should not be sent unless the appropriate capability (see below) is
> +provided on input.  This attribute is 'one-way' from a credential helper to
> +pass information to Git (or other programs invoking `git credential`).
> +
>  `wwwauth[]`::
>
>         When an HTTP response is received by Git that includes one or more
> @@ -225,7 +238,8 @@ to pass additional information to credential helpers.
>  +
>  There are two currently supported capabilities.  The first is `authtype`, which
>  indicates that the `authtype` and `credential` values are understood.  The
> -second is `state`, which indicates that the `state[]` value is understood.
> +second is `state`, which indicates that the `state[]` and `continue` values are
> +understood.
>
>  It is not obligatory to use the additional features just because the capability
>  is supported, but they should not be provided without this capability.
> diff --git a/builtin/credential.c b/builtin/credential.c
> index 5123dabcf1..f14d1b5ed6 100644
> --- a/builtin/credential.c
> +++ b/builtin/credential.c
> @@ -22,6 +22,7 @@ int cmd_credential(int argc, const char **argv, const char *prefix UNUSED)
>
>         if (!strcmp(op, "fill")) {
>                 credential_fill(&c, 0);
> +               credential_next_state(&c);
>                 credential_write(&c, stdout, CREDENTIAL_OP_RESPONSE);
>         } else if (!strcmp(op, "approve")) {
>                 credential_approve(&c);
> diff --git a/credential.c b/credential.c
> index 0ca7c12895..9a08efe113 100644
> --- a/credential.c
> +++ b/credential.c
> @@ -31,10 +31,23 @@ void credential_clear(struct credential *c)
>         string_list_clear(&c->helpers, 0);
>         strvec_clear(&c->wwwauth_headers);
>         strvec_clear(&c->state_headers);
> +       strvec_clear(&c->state_headers_to_send);
>
>         credential_init(c);
>  }
>
> +void credential_next_state(struct credential *c)
> +{
> +       strvec_clear(&c->state_headers_to_send);
> +       strvec_swap(&c->state_headers, &c->state_headers_to_send);
> +}
> +
> +void credential_clear_secrets(struct credential *c)
> +{
> +       FREE_AND_NULL(c->password);
> +       FREE_AND_NULL(c->credential);
> +}
> +
>  static void credential_set_all_capabilities(struct credential *c)
>  {
>         c->capa_authtype.request_initial = 1;
> @@ -295,6 +308,8 @@ int credential_read(struct credential *c, FILE *fp, int op_type)
>                                 credential_set_capability(&c->capa_authtype, op_type);
>                         else if (!strcmp(value, "state"))
>                                 credential_set_capability(&c->capa_state, op_type);
> +               } else if (!strcmp(key, "continue")) {
> +                       c->multistage = !!git_config_bool("continue", value);
>                 } else if (!strcmp(key, "password_expiry_utc")) {
>                         errno = 0;
>                         c->password_expiry_utc = parse_timestamp(value, NULL, 10);
> @@ -359,8 +374,10 @@ void credential_write(const struct credential *c, FILE *fp, int op_type)
>         for (size_t i = 0; i < c->wwwauth_headers.nr; i++)
>                 credential_write_item(fp, "wwwauth[]", c->wwwauth_headers.v[i], 0);
>         if (credential_has_capability(&c->capa_state, op_type)) {
> -               for (size_t i = 0; i < c->state_headers.nr; i++)
> -                       credential_write_item(fp, "state[]", c->state_headers.v[i], 0);
> +               if (c->multistage)
> +                       credential_write_item(fp, "continue", "1", 0);
> +               for (size_t i = 0; i < c->state_headers_to_send.nr; i++)
> +                       credential_write_item(fp, "state[]", c->state_headers_to_send.v[i], 0);
>         }
>  }
>
> @@ -431,6 +448,9 @@ void credential_fill(struct credential *c, int all_capabilities)
>         if ((c->username && c->password) || c->credential)
>                 return;
>
> +       credential_next_state(c);
> +       c->multistage = 0;
> +
>         credential_apply_config(c);
>         if (all_capabilities)
>                 credential_set_all_capabilities(c);
> @@ -443,8 +463,10 @@ void credential_fill(struct credential *c, int all_capabilities)
>                         /* Reset expiry to maintain consistency */
>                         c->password_expiry_utc = TIME_MAX;
>                 }
> -               if ((c->username && c->password) || c->credential)
> +               if ((c->username && c->password) || c->credential) {
> +                       strvec_clear(&c->wwwauth_headers);
>                         return;
> +               }
>                 if (c->quit)
>                         die("credential helper '%s' told us to quit",
>                             c->helpers.items[i].string);
> @@ -464,6 +486,8 @@ void credential_approve(struct credential *c)
>         if (((!c->username || !c->password) && !c->credential) || c->password_expiry_utc < time(NULL))
>                 return;
>
> +       credential_next_state(c);
> +
>         credential_apply_config(c);
>
>         for (i = 0; i < c->helpers.nr; i++)
> @@ -475,6 +499,8 @@ void credential_reject(struct credential *c)
>  {
>         int i;
>
> +       credential_next_state(c);
> +
>         credential_apply_config(c);
>
>         for (i = 0; i < c->helpers.nr; i++)
> diff --git a/credential.h b/credential.h
> index e2021455fe..adb1fc370a 100644
> --- a/credential.h
> +++ b/credential.h
> @@ -143,10 +143,15 @@ struct credential {
>         struct strvec wwwauth_headers;
>
>         /**
> -        * A `strvec` of state headers from credential helpers.
> +        * A `strvec` of state headers received from credential helpers.
>          */
>         struct strvec state_headers;
>
> +       /**
> +        * A `strvec` of state headers to send to credential helpers.
> +        */
> +       struct strvec state_headers_to_send;
> +
>         /**
>          * Internal use only. Keeps track of if we previously matched against a
>          * WWW-Authenticate header line in order to re-fold future continuation
> @@ -156,6 +161,7 @@ struct credential {
>
>         unsigned approved:1,
>                  configured:1,
> +                multistage: 1,
>                  quit:1,
>                  use_http_path:1,
>                  username_from_proto:1;
> @@ -184,6 +190,7 @@ struct credential {
>         .password_expiry_utc = TIME_MAX, \
>         .wwwauth_headers = STRVEC_INIT, \
>         .state_headers = STRVEC_INIT, \
> +       .state_headers_to_send = STRVEC_INIT, \
>  }
>
>  /* Initialize a credential structure, setting all fields to empty. */
> @@ -229,6 +236,24 @@ void credential_approve(struct credential *);
>   */
>  void credential_reject(struct credential *);
>
> +/**
> + * Clear the secrets in this credential, but leave other data intact.
> + *
> + * This is useful for resetting credentials in preparation for a subsequent
> + * stage of filling.
> + */
> +void credential_clear_secrets(struct credential *c);
> +
> +/**
> + * Prepares the credential for the next iteration of the helper protocol by
> + * updating the state headers to send with the ones read by the last iteration
> + * of the protocol.
> + *
> + * Except for internal callers, this should be called exactly once between
> + * reading credentials with `credential_fill` and writing them.
> + */
> +void credential_next_state(struct credential *c);
> +
>  int credential_read(struct credential *, FILE *, int);
>  void credential_write(const struct credential *, FILE *, int);
>
> diff --git a/http.c b/http.c
> index f98c520924..9d373c6460 100644
> --- a/http.c
> +++ b/http.c
> @@ -1781,6 +1781,10 @@ static int handle_curl_result(struct slot_results *results)
>         else if (results->http_code == 401) {
>                 if ((http_auth.username && http_auth.password) ||\
>                     (http_auth.authtype && http_auth.credential)) {
> +                       if (http_auth.multistage) {
> +                               credential_clear_secrets(&http_auth);
> +                               return HTTP_REAUTH;
> +                       }
>                         credential_reject(&http_auth);
>                         return HTTP_NOAUTH;
>                 } else {
> @@ -2178,6 +2182,7 @@ static int http_request_reauth(const char *url,
>                                void *result, int target,
>                                struct http_get_options *options)
>  {
> +       int i = 3;
>         int ret = http_request(url, result, target, options);
>
>         if (ret != HTTP_OK && ret != HTTP_REAUTH)
> @@ -2191,35 +2196,35 @@ static int http_request_reauth(const char *url,
>                 }
>         }
>
> -       if (ret != HTTP_REAUTH)
> -               return ret;
> +       while (ret == HTTP_REAUTH && --i) {
> +               /*
> +                * The previous request may have put cruft into our output stream; we
> +                * should clear it out before making our next request.
> +                */
> +               switch (target) {
> +               case HTTP_REQUEST_STRBUF:
> +                       strbuf_reset(result);
> +                       break;
> +               case HTTP_REQUEST_FILE:
> +                       if (fflush(result)) {
> +                               error_errno("unable to flush a file");
> +                               return HTTP_START_FAILED;
> +                       }
> +                       rewind(result);
> +                       if (ftruncate(fileno(result), 0) < 0) {
> +                               error_errno("unable to truncate a file");
> +                               return HTTP_START_FAILED;
> +                       }
> +                       break;
> +               default:
> +                       BUG("Unknown http_request target");
> +               }
>
> -       /*
> -        * The previous request may have put cruft into our output stream; we
> -        * should clear it out before making our next request.
> -        */
> -       switch (target) {
> -       case HTTP_REQUEST_STRBUF:
> -               strbuf_reset(result);
> -               break;
> -       case HTTP_REQUEST_FILE:
> -               if (fflush(result)) {
> -                       error_errno("unable to flush a file");
> -                       return HTTP_START_FAILED;
> -               }
> -               rewind(result);
> -               if (ftruncate(fileno(result), 0) < 0) {
> -                       error_errno("unable to truncate a file");
> -                       return HTTP_START_FAILED;
> -               }
> -               break;
> -       default:
> -               BUG("Unknown http_request target");
> +               credential_fill(&http_auth, 1);
> +
> +               ret = http_request(url, result, target, options);
>         }
> -
> -       credential_fill(&http_auth, 1);
> -
> -       return http_request(url, result, target, options);
> +       return ret;
>  }
>
>  int http_get_strbuf(const char *url,
> diff --git a/t/t5563-simple-http-auth.sh b/t/t5563-simple-http-auth.sh
> index 515185ae00..5d5caa3f58 100755
> --- a/t/t5563-simple-http-auth.sh
> +++ b/t/t5563-simple-http-auth.sh
> @@ -21,9 +21,17 @@ test_expect_success 'setup_credential_helper' '
>         CREDENTIAL_HELPER="$TRASH_DIRECTORY/bin/git-credential-test-helper" &&
>         write_script "$CREDENTIAL_HELPER" <<-\EOF
>         cmd=$1
> -       teefile=$cmd-query.cred
> +       teefile=$cmd-query-temp.cred
>         catfile=$cmd-reply.cred
>         sed -n -e "/^$/q" -e "p" >>$teefile
> +       state=$(sed -ne "s/^state\[\]=helper://p" "$teefile")
> +       if test -z "$state"
> +       then
> +               mv "$teefile" "$cmd-query.cred"
> +       else
> +               mv "$teefile" "$cmd-query-$state.cred"
> +               catfile="$cmd-reply-$state.cred"
> +       fi
>         if test "$cmd" = "get"
>         then
>                 cat $catfile
> @@ -32,13 +40,15 @@ test_expect_success 'setup_credential_helper' '
>  '
>
>  set_credential_reply () {
> -       cat >"$TRASH_DIRECTORY/$1-reply.cred"
> +       local suffix="$(test -n "$2" && echo "-$2")"
> +       cat >"$TRASH_DIRECTORY/$1-reply$suffix.cred"
>  }
>
>  expect_credential_query () {
> -       cat >"$TRASH_DIRECTORY/$1-expect.cred" &&
> -       test_cmp "$TRASH_DIRECTORY/$1-expect.cred" \
> -                "$TRASH_DIRECTORY/$1-query.cred"
> +       local suffix="$(test -n "$2" && echo "-$2")"
> +       cat >"$TRASH_DIRECTORY/$1-expect$suffix.cred" &&
> +       test_cmp "$TRASH_DIRECTORY/$1-expect$suffix.cred" \
> +                "$TRASH_DIRECTORY/$1-query$suffix.cred"
>  }
>
>  per_test_cleanup () {
> @@ -479,4 +489,73 @@ test_expect_success 'access using bearer auth with invalid credentials' '
>         EOF
>  '
>
> +test_expect_success 'access using three-legged auth' '
> +       test_when_finished "per_test_cleanup" &&
> +
> +       set_credential_reply get <<-EOF &&
> +       capability[]=authtype
> +       capability[]=state
> +       authtype=Multistage
> +       credential=YS1naXQtdG9rZW4=
> +       state[]=helper:foobar
> +       continue=1
> +       EOF
> +
> +       set_credential_reply get foobar <<-EOF &&
> +       capability[]=authtype
> +       capability[]=state
> +       authtype=Multistage
> +       credential=YW5vdGhlci10b2tlbg==
> +       state[]=helper:bazquux
> +       EOF
> +
> +       cat >"$HTTPD_ROOT_PATH/custom-auth.valid" <<-EOF &&
> +       id=1 creds=Multistage YS1naXQtdG9rZW4=
> +       id=2 creds=Multistage YW5vdGhlci10b2tlbg==
> +       EOF
> +
> +       CHALLENGE="$HTTPD_ROOT_PATH/custom-auth.challenge" &&
> +
> +       cat >"$HTTPD_ROOT_PATH/custom-auth.challenge" <<-EOF &&
> +       id=1 status=401 response=WWW-Authenticate: Multistage challenge="456"
> +       id=1 status=401 response=WWW-Authenticate: Bearer authorize_uri="id.example.com" p=1 q=0
> +       id=2 status=200
> +       id=default response=WWW-Authenticate: Multistage challenge="123"
> +       id=default response=WWW-Authenticate: Bearer authorize_uri="id.example.com" p=1 q=0
> +       EOF
> +
> +       test_config_global credential.helper test-helper &&
> +       git ls-remote "$HTTPD_URL/custom_auth/repo.git" &&
> +
> +       expect_credential_query get <<-EOF &&
> +       capability[]=authtype
> +       capability[]=state
> +       protocol=http
> +       host=$HTTPD_DEST
> +       wwwauth[]=Multistage challenge="123"
> +       wwwauth[]=Bearer authorize_uri="id.example.com" p=1 q=0
> +       EOF
> +
> +       expect_credential_query get foobar <<-EOF &&
> +       capability[]=authtype
> +       capability[]=state
> +       authtype=Multistage
> +       protocol=http
> +       host=$HTTPD_DEST
> +       wwwauth[]=Multistage challenge="456"
> +       wwwauth[]=Bearer authorize_uri="id.example.com" p=1 q=0
> +       state[]=helper:foobar
> +       EOF
> +
> +       expect_credential_query store bazquux <<-EOF
> +       capability[]=authtype
> +       capability[]=state
> +       authtype=Multistage
> +       credential=YW5vdGhlci10b2tlbg==
> +       protocol=http
> +       host=$HTTPD_DEST
> +       state[]=helper:bazquux
> +       EOF
> +'
> +
>  test_done


^ permalink raw reply

* Ignore specific changes but not the entire file
From: Mohammad Al Zouabi @ 2024-03-28  7:46 UTC (permalink / raw)
  To: git

The problem with:

```sh
git update-index [--skip-worktree|--assume-unchanged]
```

Is that it ignores the entire file.

Is there a plan to support something that remembers what was exactly
ignored, and if there are changes to it, either un-ignore the file, or
display the hunks that were changed?

Something like committing the change but not actually committing it.


^ permalink raw reply

* Re: Fwd: [GSOC][Proposal] Move existing tests to a unit testing framework
From: Patrick Steinhardt @ 2024-03-28  7:32 UTC (permalink / raw)
  To: Sanchit Jindal; +Cc: git, christian.couder, kaartic.sivaraam
In-Reply-To: <CAN7Jk_0hyjx39rrO1PKbEcJQXLtCYkWP7A4mCv01DZu2ffGHyw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 11210 bytes --]

On Wed, Mar 27, 2024 at 08:06:06PM +0530, Sanchit Jindal wrote:
> Hello,
> 
> Please find my proposal for GSOC 2024 for the project "Move existing
> tests to a unit testing framework" at
> https://docs.google.com/document/d/1yWP5GFRksdQEOKtYIEXbNSVb1qlJ2szRvpTcXXVvgTk/edit?usp=sharing.
> I am also adding the text version for convenience.

Thanks a lot for your proposal!

> Thank You,
> Sanchit Jindal
> --------------------
> 
> Google Summer Of Code Proposal
> 
> GIT (Move existing tests to a unit testing framework)

Sorry to be nitpicky, but it's spelled "Git", not "GIT" :)

> ### Personal Information
> 
> Name:          Sanchit Jindal
> Email:         sanchit1053@gmail.com
> Qualification: B.Tech in Computer Science and Engineering
> 
> 
> Greetings,
> 
> My name is Sanchit Jindal, currently pursuing my B.E. in Computer
> Science at the Indian Institute of Technology, Bombay, I will complete
> my degree in May 2024. I am writing to put forward my proposal for the
> Git Project to transition the unit tests to the new Unit Testing
> Framework.
> 
> You can find my Resume at
> https://www.linkedin.com/in/sanchit-jindal-0a953621a/ in the featured
> section

I would recommend to include highlights of your resume in the proposal
directly, either instead or in addition to linking to it. Not everyone
has a LinkedIn account (I don't), and thus I'm unable to view most of
your profile.

If you already did that, then it might be worthwhile to point out that
you did include highlights already. Otherwise I'm left wondering whether
I miss out on any important information.

> Throughout my academic journey and internships, I've gained valuable
> experience across diverse domains, ranging from machine learning to
> fundamental system components like compilers and operating systems.
> This breadth of exposure has honed my ability to swiftly grasp new
> concepts and adapt to different codebases. Proficient in both C and
> C++, I am well-equipped to contribute effectively to the project.
> 
> For this GSOC project, I aspire to make meaningful contributions to
> the open-source community. My interest in software development drives
> me to create solutions that will offer utility to others.
> 
> 
> ### Overview
> 
> Proposed Abstract
> Git has a lot of test cases that need to be migrated to use a new unit
> testing framework. This typically involves moving code from both:
> * a “t/helper/test-*.c” test helper in C, and
> * a “t/*.sh” test script in shell that invokes the test helper
> 
> 
> over to a single “t/unit-tests/t-*.c” in C using the unit testing framework.

I guess these formatting issues here come from the fact that you copied
the text over from the linked doc? Not a huge issue in that case.

> ### Details
> 
> The Project entails porting the current testing framework (which is a
> collection of helper files written in c that provide various
> utilities, and various shell scripts) to using a new Unit-Testing
> Library. The shell scripts in the original code base setup the
> environment and , using the helper functions, test the functionalities
> provided by git. The new framework utilises the test-lib library that

s/git/Git

> will combine the functionality of both the helper functions and the
> shell scripts into one unit-test.
> Advantages of this approach
> * Unit Tests are a great way to check any program and to test all
> possible corner cases and functions
> * Using this new testing framework will provide a more modular
> approach to unit testing, allowing to create more extensive tests, and
> to provide better debugging features. Such as better debug Messages or

s/Messages/messages

> a better CLI to choose and run tests.
> * The tests will also improve the runtime as instead of running
> multiple process (with each use of shell commands), the program can
> compile and run a single process
> * The approach will also future proof the unit testing as having a c
> library will provide a better base than shell scripts to test
> libraries and other internal features

I wouldn't necessarily say "better" here. It's not like the unit tests
are expected to eventually replace all ouf our shell script tests. Those
test suites serve different purposes, and in general it is recommended
to continue writing shell script tests. They exercise the code at the
user level and thus directly mirror the experience a user would have,
which is generally more helpful than only asserting that one small part
of the codebase works. You can kind of think of these as integration or
end-to-end tests

So the unit testing framework isn't necessarily better or even preferred
in the general case. But it does allow us to exercise low-level
functionality or specific edge cases that were previously cumbersome to
test.

So I wouldn't say "better", but rather that those unit tests address a
different need than the scripts.

> ### Basic Structure of the Changes
> 
> * For each file in the helper directory I will create a corresponding
> unit-test file. (Some files which are being used by multiple test can
> be ported later)

Not all files in the helper directory are actually unit tests. Some of
them are, some of them expose functionality that would otherwise not be
exposed by Git in a direct way. "test-tool refstore" comes to mind,
which allows you to munge the ref store in ways that git-update-ref(1)
wouldn't allow you to do. There are others.

Furthermore, it may not be realistic to port _all_ existing tests over
to the new functionality. It is perfectly fine to say that you plan to
only address a subset of unit tests, and you don't necessarily have to
say which ones exactly you want to port over now.

> * Each test in the script will be corresponding to a function that
> will be called in the format
> 
> '''
> TEST(funtion_name(), “SUCCESS MESSAGE”);
> '''
> 
> * The function body will contain the main implementation of the test
> corresponding to the c file ( The helper function will ideally be the
> same with no or minimal changes just to allow it to be called as
> required )

With "helper function" you probably mean the test logic, right? I guess
it depends on the function whether it would be ideal to have only
minimal changes. Given that we also convert them to use our new testing
framework I do expect that some of them may change significantly, which
wouldn't be all that bad, either.

> * Where the assert usually given at the end of the test will be made
> using the utility functions such as check_int, check_str and others,
> (or creating more function as they are required)
> * The Make files will be updated to remove the use of shell script
> with compiling and running the new c programs
> 
> ### Previous Contributions
> 
> I have worked on the micro-project
> Use test_path_is_* functions in test scripts
> 
> Thread Link: https://lore.kernel.org/git/b8d0620d4104106210ecf6a34ada591adf01cff8.1711049963.git.gitgitgadget@gmail.com/
> Status: Open
> 
> I had some difficulties in understanding the usage of gitgitGadget and
> git send-email which are being employed as the Pull request
> alternatives. Having tried gitgitGadget I am confident that I will be
> able to utilise it hereafter. I am having some trouble using git
> send-email with the network at my institute but I am optimistic that I
> will be able to use it after the completion of my degree.

While many folks use git-send-email(1) directly it is perfectly fine if
you don't and want to use GitGitGadget instead. As long as you are
familiar with one of those workflows you don't necessarily have to set
up the other one.

> With the guidance of Eric Sunshine and Junio C Hamano, I have iterated
> through different versions of the pull request, and familiarised
> myself with the git commit pattern and formalities
> 
> 
> ### Deliverables
> 
> The plan is to port the helper functions from helper directory to the
> unit-test directory along with the test defined in the shell scripts.
> Some of the helper tools which have a more basic functionality and are
> being used by multiple test scripts can be left untouched.
> As a stretch goal I will try to port the shell scripts to the new
> testing interface creating new functions for the library as required
> I also plan to keep documenting all the changes made and to keep in
> constant contact with the mentors regarding any bugs or reviews
> regarding the code.
> 
> 
> ### Availability
> 
> I am expecting to be free from my academic responsibilities by 4th
> May. I have no further commitments for the summer and will be able to
> provide about 40-50 hours of work per week. I am confident that I
> should be able to complete about 2-3 files each week depending on the
> files.

I would like to caution you a bit about the "2-3 files each week" goal.
The Git project is comparatively slow-moving, and it very often takes
multiple weeks for a patch series until it lands in the "master" branch.
Furthermore, the quality bar is quite high in the Git project, which may
require you to undergo several revisions for a patch series before it is
accepted.

So I would recommend to account for all these facts. It may make sense
to compare to below related work as an estimate of how much work you can
get done realistically.

> Related Work
> The project has been worked on by Achu Luma for the Outreachy Internship
> Test-ctype:
> https://lore.kernel.org/git/20240112102743.1440-1-ach.lumap@gmail.com/#t
> Test-advise:
> https://lore.kernel.org/git/20240112102122.1422-1-ach.lumap@gmail.com/
> Test-data:
> https://lore.kernel.org/git/20240205162506.1835-2-ach.lumap@gmail.com/
> Test-hash:
> https://lore.kernel.org/git/20240229054004.3807-2-ach.lumap@gmail.com/
> Test-strcmp-offset:
> https://lore.kernel.org/git/20240310144819.4379-1-ach.lumap@gmail.com/
> 
> 
> Another testcase has also been handled by Ghanshyam Thakkar
> Test–oid-array:
> https://lore.kernel.org/git/20240223193257.9222-1-shyamthakkar001@gmail.com/
> 
> 
> 
> 
> Timeline (Tentative)
> 
> Community Bonding
> (1 May- 26 May)
> Be in contact with mentors and figure out the best way to migrate the
> test-cases, Familiarise myself with the library and previous work done
> during the outreachy program.
> 
> Phase I
> (27 May - 11 July)
> Begin tackling the harder or longer testcases that will require more
> knowledge of the implementation
> 
> Phase II
> (12 July - 18 Aug)
> Keep working on the testcases following the mentors feedback

Note that you'll not only get feedback from mentors, but also from the
community.

> Final Week
> (19 Aug - 26 Aug)
> Finish up the remaining testcases, fixing any bugs that may be
> discovered in the earlier implementations
> 
> ### Acknowledgement
> I would like to thank Eric Sunshine and Junio C Hamano for helping me
> with the microproject and for the guidance on the gitgitGadget and git
> send-email features, And also helping me get acquainted with the git
> PR guidelines.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [GSoC] [PATCH] builtin/gc: Replace run_command with C function refs_pack_refs
From: Junio C Hamano @ 2024-03-28  4:31 UTC (permalink / raw)
  To: vk; +Cc: git, ps
In-Reply-To: <0a149be978e68271c251e80f8eb571f4@vkabc.me>

vk <g@vkabc.me> writes:

> I forgot to add this --auto optional logic. Will add it in the
> following patch.

Please slow down a bit.  If you notice a problem in the patch you
just posted within 7 minutes after posting it, there is a good
chance that you are rushing it without carefully proofreading.

You'd be better off if you give yourself some time to think about
what you wrote, after drafting your patch.  Sleep on it, if needed.

There are issues in the patch.  Trying to spot them yourself would
give you a good exercise.  Documentation/SubmittingPatches and
Documentation/CodingGuidelines would be useful.

Thanks for your interest in Git development.


^ permalink raw reply

* Re: [GSoC] [PATCH] builtin/gc: Replace run_command with C function refs_pack_refs
From: vk @ 2024-03-28  4:12 UTC (permalink / raw)
  To: git; +Cc: ps
In-Reply-To: <20240328040010.40230-1-g@vkabc.me>

On 2024-03-28 12:00, vk wrote:
> Replace "git pack-refs" with C function refs_pack_refs
> for maintenance_task_pack_refs(). C function refs_pack_refs
> can be called directly, hence saving an external process.
> ---
>  builtin/gc.c | 29 +++++++++++++++++++++++------
>  1 file changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 3c874b248b..b55c5f5225 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -42,6 +42,8 @@
>  #include "setup.h"
>  #include "trace2.h"
> 
> +#include <revision.h>
> +
>  #define FAILED_RUN "failed to run %s"
> 
>  static const char * const builtin_gc_usage[] = {
> @@ -218,14 +220,29 @@ static int pack_refs_condition(void)
> 
>  static int maintenance_task_pack_refs(MAYBE_UNUSED struct 
> maintenance_run_opts *opts)
>  {
> -	struct child_process cmd = CHILD_PROCESS_INIT;
> 
> -	cmd.git_cmd = 1;
> -	strvec_pushl(&cmd.args, "pack-refs", "--all", "--prune", NULL);
> -	if (opts->auto_flag)
> -		strvec_push(&cmd.args, "--auto");

I forgot to add this --auto optional logic. Will add it in the following 
patch.

Thanks



^ permalink raw reply

* [GSoC] [PATCH] builtin/gc: Replace run_command with C function refs_pack_refs
From: vk @ 2024-03-28  4:00 UTC (permalink / raw)
  To: git; +Cc: ps, vk

Replace "git pack-refs" with C function refs_pack_refs
for maintenance_task_pack_refs(). C function refs_pack_refs
can be called directly, hence saving an external process.
---
 builtin/gc.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 3c874b248b..b55c5f5225 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -42,6 +42,8 @@
 #include "setup.h"
 #include "trace2.h"
 
+#include <revision.h>
+
 #define FAILED_RUN "failed to run %s"
 
 static const char * const builtin_gc_usage[] = {
@@ -218,14 +220,29 @@ static int pack_refs_condition(void)
 
 static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts)
 {
-	struct child_process cmd = CHILD_PROCESS_INIT;
 
-	cmd.git_cmd = 1;
-	strvec_pushl(&cmd.args, "pack-refs", "--all", "--prune", NULL);
-	if (opts->auto_flag)
-		strvec_push(&cmd.args, "--auto");
+	//following coding convention in pack-refs.c
+	int ret;
+
+	struct ref_exclusions excludes = REF_EXCLUSIONS_INIT;
+	struct string_list included_refs = STRING_LIST_INIT_NODUP;
+
+	//default flag is to prune
+	struct pack_refs_opts pack_refs_opts = {
+		.exclusions = &excludes,
+		.includes = &included_refs,
+		.flags = PACK_REFS_PRUNE,
+	};
+
+
+	//--all flag
+	string_list_append(pack_refs_opts.includes, "*");
+
+	ret = refs_pack_refs(get_main_ref_store(the_repository), &pack_refs_opts);
+
+	//same as run_command return since ret is directly returned in cmd_pack_refs
+	return ret;
 
-	return run_command(&cmd);
 }
 
 static int too_many_loose_objects(void)
-- 
2.39.3 (Apple Git-145)



^ permalink raw reply related

* Re: [RFC][GSoC] Proposal: Refactor git-bisect(1)
From: Kaartic Sivaraam @ 2024-03-28  3:23 UTC (permalink / raw)
  To: eugenio gigante, git; +Cc: karthik nayak, ps, Christian Couder
In-Reply-To: <CAFJh0PS_VB8yv7B1uM5+Ts9PtMFtU5mL4y8UUDORKQcYHSoxPA@mail.gmail.com>

Hi Eugenio,

On 22/03/24 15:56, eugenio gigante wrote:>
> Timeline
> ------------
> 
> May 1 - 26:
> Community bounding Period.
> Read Documentation.
> Write Backlog.
> 
> 
> May 27 - July 8:
> Implement a new layout for state files.
> Write tests.
> 
> July 12 - August 19:
> Assess and implement backward compatibility.
> 
> August 19 - Deadline :
> Write documentation for the project.
> 
> I can dedicate 3 hours a day during weekdays, and 5 hours during the weekends.
> > [ ... snip ... ]
> 
> 
> Biographical information
> ----------------------------------
> 
> I’m a former student of Logic and Philosophy who turned to coding
> after graduating.
> I have been working as a Developer for NTT Data Italia for a year.
> I hold a full-time job, but I've seen that it doesn't conflict with
> the rules of GSoC
> since I'm an open source beginner. I am fully capable of managing my
> workload independently,
> including my working hours. I know it is not ideal, but I can
> definitely work around
> my schedule and dedicate time to the project.
> 

Thank you for explicitly mentioning this. Apart from this, if there's 
any specific things you need to clarify regarding your availability, 
please do so. For instance, if you have any planned vacation periods 
during which you will be unavailable or some such.

--
Sivaraam


^ permalink raw reply

* [PATCH v3 2/2] add-patch: do not print hunks repeatedly
From: Rubén Justo @ 2024-03-28  1:12 UTC (permalink / raw)
  To: Git List; +Cc: Johannes Schindelin, Junio C Hamano, Phillip Wood
In-Reply-To: <60d978d0-f69a-4b65-b4ba-d30dac8f112a@gmail.com>

The interactive-patch is a sequential process where, on each step, we
print one hunk from a patch and then ask the user how to proceed.

There is a possibility of repeating a step, for example if the user
enters a non-applicable option, i.e: "s"

    $ git add -p
    diff --git a/add-patch.c b/add-patch.c
    index 52be1ddb15..8fb75e82e2 100644
    --- a/add-patch.c
    +++ b/add-patch.c
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]? s
    Sorry, cannot split this hunk
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?

... or an invalid option, i.e: "U"

    $ git add -p
    diff --git a/add-patch.c b/add-patch.c
    index 52be1ddb15..8fb75e82e2 100644
    --- a/add-patch.c
    +++ b/add-patch.c
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]? U
    y - stage this hunk
    n - do not stage this hunk
    q - quit; do not stage this hunk or any of the remaining ones
    a - stage this hunk and all later hunks in the file
    d - do not stage this hunk or any of the later hunks in the file
    j - leave this hunk undecided, see next undecided hunk
    J - leave this hunk undecided, see next hunk
    g - select a hunk to go to
    / - search for a hunk matching the given regex
    e - manually edit the current hunk
    p - print again the current hunk
    ? - print help
    @@ -1394,7 +1394,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
     static int patch_update_file(struct add_p_state *s,
     			     struct file_diff *file_diff)
     {
    -	size_t hunk_index = 0;
    +	size_t hunk_index = 0, prev_hunk_index = -1;
     	ssize_t i, undecided_previous, undecided_next;
     	struct hunk *hunk;
     	char ch;
    (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?

Printing the chunk again followed by the question can be confusing as
the user has to pay special attention to notice that the same chunk is
being reconsidered.

It can also be problematic if the chunk is longer than one screen height
because the result of the previous iteration is lost off the screen (the
help guide in the previous example).

To avoid such problems, stop printing the chunk if the iteration does
not advance to a different chunk.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 add-patch.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/add-patch.c b/add-patch.c
index 922c43378e..e0f4cd9b9b 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -1395,7 +1395,7 @@ static int patch_update_file(struct add_p_state *s,
 			     struct file_diff *file_diff)
 {
 	size_t hunk_index = 0;
-	ssize_t i, undecided_previous, undecided_next;
+	ssize_t i, undecided_previous, undecided_next, rendered_hunk_index = -1;
 	struct hunk *hunk;
 	char ch;
 	struct child_process cp = CHILD_PROCESS_INIT;
@@ -1448,10 +1448,15 @@ static int patch_update_file(struct add_p_state *s,
 
 		strbuf_reset(&s->buf);
 		if (file_diff->hunk_nr) {
-			render_hunk(s, hunk, 0, colored, &s->buf);
-			fputs(s->buf.buf, stdout);
+			if (rendered_hunk_index != hunk_index) {
+				render_hunk(s, hunk, 0, colored, &s->buf);
+				fputs(s->buf.buf, stdout);
+
+				rendered_hunk_index = hunk_index;
+			}
 
 			strbuf_reset(&s->buf);
+
 			if (undecided_previous >= 0) {
 				permitted |= ALLOW_GOTO_PREVIOUS_UNDECIDED_HUNK;
 				strbuf_addstr(&s->buf, ",k");
@@ -1649,10 +1654,12 @@ static int patch_update_file(struct add_p_state *s,
 			if (!(permitted & ALLOW_SPLIT))
 				err(s, _("Sorry, cannot split this hunk"));
 			else if (!split_hunk(s, file_diff,
-					     hunk - file_diff->hunk))
+					     hunk - file_diff->hunk)) {
 				color_fprintf_ln(stdout, s->s.header_color,
 						 _("Split into %d hunks."),
 						 (int)splittable_into);
+				rendered_hunk_index = -1;
+			}
 		} else if (s->answer.buf[0] == 'e') {
 			if (!(permitted & ALLOW_EDIT))
 				err(s, _("Sorry, cannot edit this hunk"));
@@ -1661,7 +1668,7 @@ static int patch_update_file(struct add_p_state *s,
 				goto soft_increment;
 			}
 		} else if (s->answer.buf[0] == 'p') {
-			/* nothing special is needed */
+			rendered_hunk_index = -1;
 		} else {
 			const char *p = _(help_patch_remainder), *eol = p;
 
-- 
2.44.0.370.gd7ad5e5424


^ permalink raw reply related

* [PATCH v3 1/2] add-patch: introduce 'p' in interactive-patch
From: Rubén Justo @ 2024-03-28  1:12 UTC (permalink / raw)
  To: Git List; +Cc: Johannes Schindelin, Junio C Hamano, Phillip Wood
In-Reply-To: <60d978d0-f69a-4b65-b4ba-d30dac8f112a@gmail.com>

Shortly we're going make interactive-patch stop printing automatically
the hunk under certain circumstances.

Let's introduce a new option to allow the user to explicitly request
the printing.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 Documentation/git-add.txt  |  1 +
 add-patch.c                |  4 ++++
 t/t3701-add-interactive.sh | 22 +++++++++++-----------
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 14a371fff3..2965cd0fb6 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -348,6 +348,7 @@ patch::
        K - leave this hunk undecided, see previous hunk
        s - split the current hunk into smaller hunks
        e - manually edit the current hunk
+       p - print the current hunk again
        ? - print help
 +
 After deciding the fate for all hunks, if there is any hunk
diff --git a/add-patch.c b/add-patch.c
index 68f525b35c..922c43378e 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -1388,6 +1388,7 @@ N_("j - leave this hunk undecided, see next undecided hunk\n"
    "/ - search for a hunk matching the given regex\n"
    "s - split the current hunk into smaller hunks\n"
    "e - manually edit the current hunk\n"
+   "p - print the current hunk again\n"
    "? - print help\n");
 
 static int patch_update_file(struct add_p_state *s,
@@ -1480,6 +1481,7 @@ static int patch_update_file(struct add_p_state *s,
 				permitted |= ALLOW_EDIT;
 				strbuf_addstr(&s->buf, ",e");
 			}
+			strbuf_addstr(&s->buf, ",p");
 		}
 		if (file_diff->deleted)
 			prompt_mode_type = PROMPT_DELETION;
@@ -1658,6 +1660,8 @@ static int patch_update_file(struct add_p_state *s,
 				hunk->use = USE_HUNK;
 				goto soft_increment;
 			}
+		} else if (s->answer.buf[0] == 'p') {
+			/* nothing special is needed */
 		} else {
 			const char *p = _(help_patch_remainder), *eol = p;
 
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 0b5339ac6c..bc55255b0a 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -325,9 +325,9 @@ test_expect_success 'different prompts for mode change/deleted' '
 	git -c core.filemode=true add -p >actual &&
 	sed -n "s/^\(([0-9/]*) Stage .*?\).*/\1/p" actual >actual.filtered &&
 	cat >expect <<-\EOF &&
-	(1/1) Stage deletion [y,n,q,a,d,?]?
-	(1/2) Stage mode change [y,n,q,a,d,j,J,g,/,?]?
-	(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,?]?
+	(1/1) Stage deletion [y,n,q,a,d,p,?]?
+	(1/2) Stage mode change [y,n,q,a,d,j,J,g,/,p,?]?
+	(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,p,?]?
 	EOF
 	test_cmp expect actual.filtered
 '
@@ -514,13 +514,13 @@ test_expect_success 'split hunk setup' '
 test_expect_success 'goto hunk' '
 	test_when_finished "git reset" &&
 	tr _ " " >expect <<-EOF &&
-	(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,?]? + 1:  -1,2 +1,3          +15
+	(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,p,?]? + 1:  -1,2 +1,3          +15
 	_ 2:  -2,4 +3,8          +21
 	go to which hunk? @@ -1,2 +1,3 @@
 	_10
 	+15
 	_20
-	(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,?]?_
+	(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?_
 	EOF
 	test_write_lines s y g 1 | git add -p >actual &&
 	tail -n 7 <actual >actual.trimmed &&
@@ -530,11 +530,11 @@ test_expect_success 'goto hunk' '
 test_expect_success 'navigate to hunk via regex' '
 	test_when_finished "git reset" &&
 	tr _ " " >expect <<-EOF &&
-	(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,?]? @@ -1,2 +1,3 @@
+	(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,p,?]? @@ -1,2 +1,3 @@
 	_10
 	+15
 	_20
-	(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,?]?_
+	(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]?_
 	EOF
 	test_write_lines s y /1,2 | git add -p >actual &&
 	tail -n 5 <actual >actual.trimmed &&
@@ -715,21 +715,21 @@ test_expect_success 'colors can be overridden' '
 	<BLUE>+<RESET><BLUE>new<RESET>
 	<CYAN> more-context<RESET>
 	<BLUE>+<RESET><BLUE>another-one<RESET>
-	<YELLOW>(1/1) Stage this hunk [y,n,q,a,d,s,e,?]? <RESET><BOLD>Split into 2 hunks.<RESET>
+	<YELLOW>(1/1) Stage this hunk [y,n,q,a,d,s,e,p,?]? <RESET><BOLD>Split into 2 hunks.<RESET>
 	<MAGENTA>@@ -1,3 +1,3 @@<RESET>
 	<CYAN> context<RESET>
 	<BOLD>-old<RESET>
 	<BLUE>+<RESET><BLUE>new<RESET>
 	<CYAN> more-context<RESET>
-	<YELLOW>(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,?]? <RESET><MAGENTA>@@ -3 +3,2 @@<RESET>
+	<YELLOW>(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]? <RESET><MAGENTA>@@ -3 +3,2 @@<RESET>
 	<CYAN> more-context<RESET>
 	<BLUE>+<RESET><BLUE>another-one<RESET>
-	<YELLOW>(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,?]? <RESET><MAGENTA>@@ -1,3 +1,3 @@<RESET>
+	<YELLOW>(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,p,?]? <RESET><MAGENTA>@@ -1,3 +1,3 @@<RESET>
 	<CYAN> context<RESET>
 	<BOLD>-old<RESET>
 	<BLUE>+new<RESET>
 	<CYAN> more-context<RESET>
-	<YELLOW>(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,?]? <RESET>
+	<YELLOW>(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]? <RESET>
 	EOF
 	test_cmp expect actual
 '
-- 
2.44.0.370.gd7ad5e5424


^ permalink raw reply related

* [PATCH v3 0/2] improve interactive-patch
From: Rubén Justo @ 2024-03-28  1:10 UTC (permalink / raw)
  To: Git List; +Cc: Johannes Schindelin, Junio C Hamano, Phillip Wood
In-Reply-To: <6f2ed406-2152-476b-b463-3010afe7e11e@gmail.com>

Let's reduce the verbosity in the interactive-patch process, in order to
make it less confusing.

Rubén Justo (2):
  add-patch: introduce 'p' in interactive-patch
  add-patch: do not print hunks repeatedly

 Documentation/git-add.txt  |  1 +
 add-patch.c                | 19 +++++++++++++++----
 t/t3701-add-interactive.sh | 22 +++++++++++-----------
 3 files changed, 27 insertions(+), 15 deletions(-)

Range-diff against v2:
1:  bbb3bbea1b ! 1:  ca2777cb12 add-patch: introduce 'p' in interactive-patch
    @@ Documentation/git-add.txt: patch::
             K - leave this hunk undecided, see previous hunk
             s - split the current hunk into smaller hunks
             e - manually edit the current hunk
    -+       p - print again the current hunk
    ++       p - print the current hunk again
             ? - print help
      +
      After deciding the fate for all hunks, if there is any hunk
    @@ add-patch.c: N_("j - leave this hunk undecided, see next undecided hunk\n"
         "/ - search for a hunk matching the given regex\n"
         "s - split the current hunk into smaller hunks\n"
         "e - manually edit the current hunk\n"
    -+   "p - print again the current hunk\n"
    ++   "p - print the current hunk again\n"
         "? - print help\n");
      
      static int patch_update_file(struct add_p_state *s,
2:  4d5626e1c4 ! 2:  d7ad5e5424 add-patch: do not print hunks repeatedly
    @@ Commit message
         Signed-off-by: Rubén Justo <rjusto@gmail.com>
     
      ## add-patch.c ##
    -@@ add-patch.c: N_("j - leave this hunk undecided, see next undecided hunk\n"
    - static int patch_update_file(struct add_p_state *s,
    +@@ add-patch.c: static int patch_update_file(struct add_p_state *s,
      			     struct file_diff *file_diff)
      {
    --	size_t hunk_index = 0;
    -+	size_t hunk_index = 0, prev_hunk_index = -1;
    - 	ssize_t i, undecided_previous, undecided_next;
    + 	size_t hunk_index = 0;
    +-	ssize_t i, undecided_previous, undecided_next;
    ++	ssize_t i, undecided_previous, undecided_next, rendered_hunk_index = -1;
      	struct hunk *hunk;
      	char ch;
    + 	struct child_process cp = CHILD_PROCESS_INIT;
     @@ add-patch.c: static int patch_update_file(struct add_p_state *s,
      
      		strbuf_reset(&s->buf);
      		if (file_diff->hunk_nr) {
     -			render_hunk(s, hunk, 0, colored, &s->buf);
     -			fputs(s->buf.buf, stdout);
    -+			if (prev_hunk_index != hunk_index) {
    ++			if (rendered_hunk_index != hunk_index) {
     +				render_hunk(s, hunk, 0, colored, &s->buf);
     +				fputs(s->buf.buf, stdout);
    -+				strbuf_reset(&s->buf);
     +
    -+				prev_hunk_index = hunk_index;
    ++				rendered_hunk_index = hunk_index;
     +			}
      
    --			strbuf_reset(&s->buf);
    + 			strbuf_reset(&s->buf);
    ++
      			if (undecided_previous >= 0) {
      				permitted |= ALLOW_GOTO_PREVIOUS_UNDECIDED_HUNK;
      				strbuf_addstr(&s->buf, ",k");
    @@ add-patch.c: static int patch_update_file(struct add_p_state *s,
      				color_fprintf_ln(stdout, s->s.header_color,
      						 _("Split into %d hunks."),
      						 (int)splittable_into);
    -+				prev_hunk_index = -1;
    ++				rendered_hunk_index = -1;
     +			}
      		} else if (s->answer.buf[0] == 'e') {
      			if (!(permitted & ALLOW_EDIT))
    @@ add-patch.c: static int patch_update_file(struct add_p_state *s,
      			}
      		} else if (s->answer.buf[0] == 'p') {
     -			/* nothing special is needed */
    -+			prev_hunk_index = -1;
    ++			rendered_hunk_index = -1;
      		} else {
      			const char *p = _(help_patch_remainder), *eol = p;
      
-- 
2.44.0.370.gd7ad5e5424


^ permalink raw reply

* Re: [PATCH v2 0/2] improve interactive-patch
From: Rubén Justo @ 2024-03-28  1:03 UTC (permalink / raw)
  To: Junio C Hamano, Phillip Wood; +Cc: Git List, Johannes Schindelin
In-Reply-To: <xmqqttkr3ckz.fsf@gitster.g>

On Wed, Mar 27, 2024 at 08:43:08AM -0700, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
> > ... My
> > reasoning was that 'p' does not do anything useful for the user, if
> > they press it they end up with exactly the same content being printed
> > to the screen ...
> 
> Actually I do not agree that it necessarily is useless that the same
> content is shown.  Especially since we do not page, it is plausible
> for a user, who saw a huge hunk, to want to tweak terminal setting
> and raise scrollbuffer size (which may be set to 0 for usual
> sessions, like me), and say "please show it again".  Or even in a
> more primitive environment, just say "please show it again" and
> immediately type \C-s to stop while the early part of the hunk is
> shown ;-).
> 
> Thinking about the name of the option again, if we are omitting to
> show a hunk in some situations, the request to ask for the current
> hunk to be shown is "please show me", not "please reshow me", so the
> verb 'print' may apply to wider situations than the verb 'reprint',
> strictly speaking.

Thanks for saying it much better than me.

I would add that I see this series in terms of reducing verbosity rather
than repetition.  I'm not trying to stop showing a hunk because we've
just shown it.  In fact, maybe we can find a way to not show the hunk
until requested.


^ permalink raw reply

* Re: [PATCH v2 2/2] add-patch: do not print hunks repeatedly
From: Rubén Justo @ 2024-03-28  0:39 UTC (permalink / raw)
  To: phillip.wood, Git List; +Cc: Johannes Schindelin, Junio C Hamano
In-Reply-To: <b578630c-08d5-4a85-85db-c0bdb24a8486@gmail.com>

On Wed, Mar 27, 2024 at 11:06:49AM +0000, Phillip Wood wrote:

> >       (1/4) Stage this hunk [y,n,q,a,d,j,J,g,/,e,p,?]? U
> >       Unknown option "U".  Use '?' for help.
> 
> Yes, I like that (though I'd use the same quotes for both parts of the
> message)

Yes, you're right.  Using the same quotes are the correct thing to do.
I don't know how I thought we should print the result of
git_read_line_interactively().  After thinking about it again, I see
that would be misleading, to say the least.

> > If find having two strbuf_reset()'s in a row confusing.  Maybe it is
> > just me not seeing that that second strbuf_reset is "close" to noop.
> 
> If we don't print the hunk then the second call to strbuf_reset is indeed a
> noop. In our code base it is common to see a call to strbuf_reset()
> immediately before adding new content to the buffer, rather than cleaning up
> ready for reuse after the buffer has been used. If you grep 'strbuf_reset'
> in this file you'll see all the calls come immediately before adding new
> content to the buffer. By moving the call inside the conditional we're
> moving from a pattern of cleaning up before adding new content to a pattern
> of cleaning up afterwards which I think is harder to follow given the way
> the rest of the code uses strbuf_reset()

I have no strong objection.  I'll reroll leaving that strbuf_reset
untouched.


^ permalink raw reply

* Re: [PATCH 0/1] quote: quote space
From: Junio C Hamano @ 2024-03-27 22:11 UTC (permalink / raw)
  To: git; +Cc: Han Young, Jeff King, Johannes Schindelin
In-Reply-To: <xmqqfrwlltjn.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> diff --git c/t/t4126-apply-empty.sh w/t/t4126-apply-empty.sh
> index ece9fae207..eaf0c5304a 100755
> --- c/t/t4126-apply-empty.sh
> +++ w/t/t4126-apply-empty.sh
> @@ -66,4 +66,26 @@ test_expect_success 'apply --index create' '
>  	git diff --exit-code
>  '
>  
> +test_expect_success 'apply with no-contents and a funny pathname' '
> +	mkdir "funny " &&
> +	>"funny /empty" &&
> +	git add "funny /empty" &&
> +	git diff HEAD "funny /" >sample.patch &&
> +	git diff -R HEAD "funny /" >elpmas.patch &&
> +	git reset --hard &&
> +	rm -fr "funny " &&
> +
> +	git apply --stat --check --apply sample.patch &&
> +	test_must_be_empty "funny /empty" &&
> +
> +	git apply --stat --check --apply elpmas.patch &&
> +	test_path_is_missing "funny /empty" &&
> +
> +	git apply -R --stat --check --apply elpmas.patch &&
> +	test_must_be_empty "funny /empty" &&
> +
> +	git apply -R --stat --check --apply sample.patch &&
> +	test_path_is_missing "funny /empty"
> +'
> +
>  test_done

This seems to fail only on Windows, and I have run out of my today's
allotment of time for this topic.

The earlier part that creates the directory with a trailing SP,
redirects to a file in such a directory to create an empty file, and
adds that path to the index, all succeed and follow the &&-chain,
but the step that runs "git diff" with "funny /" (i.e. the name of
the directory a trailing slash) as the pathspec produces an empty
patch, and "git apply" would of course choke on an empty file as an
input.

With the following band-aid, we can skip the test and the output
from "sh t4126-*.sh -i -v -x" might give us a clue that explains how
such a failure happens.  Unfortunately GitHub CI's win test does not
give us insight into a test that did not fail, so I did not get
anything useful from the "ls -l" down there (I already knew that
sample patches are empty files).

---- >8 ----
Date: Wed, 27 Mar 2024 14:41:26 -0700
Subject: [PATCH] t4126: make sure a directory with SP at the end is usable

If the platform is unable to properly create these sample
patches about a file that lives in a directory whose name
ends with a SP, there is no point testing how "git apply"
behaves there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t4126-apply-empty.sh | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
index eaf0c5304a..d2ac7a486f 100755
--- a/t/t4126-apply-empty.sh
+++ b/t/t4126-apply-empty.sh
@@ -66,14 +66,26 @@ test_expect_success 'apply --index create' '
 	git diff --exit-code
 '
 
-test_expect_success 'apply with no-contents and a funny pathname' '
+test_expect_success 'setup patches in dir ending in SP' '
+	test_when_finished "rm -fr \"funny \"" &&
 	mkdir "funny " &&
 	>"funny /empty" &&
 	git add "funny /empty" &&
-	git diff HEAD "funny /" >sample.patch &&
-	git diff -R HEAD "funny /" >elpmas.patch &&
+	git diff HEAD -- "funny /" >sample.patch &&
+	git diff -R HEAD -- "funny /" >elpmas.patch &&
 	git reset --hard &&
-	rm -fr "funny " &&
+
+	if  grep "a/funny /empty b/funny /empty" sample.patch &&
+	    grep "b/funny /empty a/funny /empty" elpmas.patch
+	then
+		test_set_prereq DIR_ENDS_WITH_SP
+	else
+		# Win test???
+		ls -l
+	fi
+'
+
+test_expect_success DIR_ENDS_WITH_SP 'apply with no-contents and a funny pathname' '
 
 	git apply --stat --check --apply sample.patch &&
 	test_must_be_empty "funny /empty" &&
-- 
2.44.0-368-gc75fd8d815



^ permalink raw reply related

* Re: [PATCH 2/3] treewide: switch to khashl for memory savings
From: Junio C Hamano @ 2024-03-27 21:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Wong, git
In-Reply-To: <20240327093710.GA847433@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Tue, Mar 26, 2024 at 10:48:40AM -0700, Junio C Hamano wrote:
>
>> $ make builtin/fast-import.sp ;# part of make sparse
>>     SP builtin/fast-import.c
>> builtin/fast-import.c: note: in included file (through oidset.h, packfile.h):
>> khashl.h:516:1: error: Using plain integer as NULL pointer
>> khashl.h:516:1: error: Using plain integer as NULL pointer
>> make: *** [Makefile:3237: builtin/fast-import.sp] Error 1
>> 
>> I found IMPL_GET and IMPL_DEL's use of (h->keys == 0) were giving
>> one of these two, and managed to reduce the error to just one with
>> the attached patch, but I don't know what the other error is coming
>> from.
>
> Probably:
>
> diff --git a/khashl.h b/khashl.h
> index 8fcebed237..1e724bbf88 100644
> --- a/khashl.h
> +++ b/khashl.h
> @@ -116,7 +116,7 @@ static kh_inline khint_t __kh_h2b(khint_t hash, khint_t bits) { return hash * 26
>  
>  #define __KHASHL_IMPL_RESIZE(SCOPE, HType, prefix, khkey_t, __hash_fn, __hash_eq) \
>  	SCOPE void prefix##_resize(HType *h, khint_t new_n_buckets) { \
> -		khint32_t *new_used = 0; \
> +		khint32_t *new_used = NULL; \
>  		khint_t j = 0, x = new_n_buckets, n_buckets, new_bits, new_mask; \
>  		while ((x >>= 1) != 0) ++j; \
>  		if (new_n_buckets & (new_n_buckets - 1)) ++j; \
>
> -Peff

Spot on.  With this (and the other two 0 -> NULL fixes), and with
the SWAP() thing in brian's credential series fixed, the tip of
'seen' passes the static-analysis and sparse CI jobs again.

Thanks.


^ permalink raw reply


Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).