* [PATCH v2 1/2] t3701: clean up hunk splitting tests
2022-01-11 11:12 ` [PATCH v2 0/2] " Phillip Wood via GitGitGadget
@ 2022-01-11 11:12 ` Phillip Wood via GitGitGadget
2022-01-11 11:12 ` [PATCH v2 2/2] builtin add -p: fix hunk splitting Phillip Wood via GitGitGadget
` (2 subsequent siblings)
3 siblings, 0 replies; 21+ messages in thread
From: Phillip Wood via GitGitGadget @ 2022-01-11 11:12 UTC (permalink / raw)
To: git
Cc: Johannes Schindelin, SZEDER Gábor,
Ævar Arnfjörð Bjarmason, Phillip Wood,
Phillip Wood
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Clean up some test constructs in preparation for extending the tests
in the next commit. There are three small changes; I've grouped them
together as they're so small it didn't seem worth creating three
separate commits.
1 - "cat file | sed expression" is better written as
"sed expression file".
2 - Follow our usual practice of redirecting the output of git
commands to a file rather than piping it into another command.
3 - Use test_write_lines rather than 'printf "%s\n"'.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
t/t3701-add-interactive.sh | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 207714655f2..77de0029ba5 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -347,7 +347,7 @@ test_expect_success 'setup patch' '
# Expected output, diff is similar to the patch but w/ diff at the top
test_expect_success 'setup expected' '
echo diff --git a/file b/file >expected &&
- cat patch |sed "/^index/s/ 100644/ 100755/" >>expected &&
+ sed "/^index/s/ 100644/ 100755/" patch >>expected &&
cat >expected-output <<-\EOF
--- a/file
+++ b/file
@@ -373,9 +373,9 @@ test_expect_success 'setup expected' '
test_expect_success 'add first line works' '
git commit -am "clear local changes" &&
git apply patch &&
- printf "%s\n" s y y | git add -p file 2>error |
- sed -n -e "s/^([1-2]\/[1-2]) Stage this hunk[^@]*\(@@ .*\)/\1/" \
- -e "/^[-+@ \\\\]"/p >output &&
+ test_write_lines s y y | git add -p file 2>error >raw-output &&
+ sed -n -e "s/^([1-2]\/[1-2]) Stage this hunk[^@]*\(@@ .*\)/\1/" \
+ -e "/^[-+@ \\\\]"/p raw-output >output &&
test_must_be_empty error &&
git diff --cached >diff &&
diff_cmp expected diff &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 2/2] builtin add -p: fix hunk splitting
2022-01-11 11:12 ` [PATCH v2 0/2] " Phillip Wood via GitGitGadget
2022-01-11 11:12 ` [PATCH v2 1/2] t3701: clean up hunk splitting tests Phillip Wood via GitGitGadget
@ 2022-01-11 11:12 ` Phillip Wood via GitGitGadget
2022-01-11 12:07 ` [PATCH v2 0/2] " Ævar Arnfjörð Bjarmason
2022-01-12 18:34 ` Junio C Hamano
3 siblings, 0 replies; 21+ messages in thread
From: Phillip Wood via GitGitGadget @ 2022-01-11 11:12 UTC (permalink / raw)
To: git
Cc: Johannes Schindelin, SZEDER Gábor,
Ævar Arnfjörð Bjarmason, Phillip Wood,
Phillip Wood
From: Phillip Wood <phillip.wood@dunelm.org.uk>
The C reimplementation of "add -p" fails to split the last hunk in a
file if that hunk ends with an addition or deletion without any post context
line unless it is the last file to be processed.
To determine whether a hunk can be split, a counter is incremented each
time a context line follows an insertion or deletion. If at the end of
the hunk the value of this counter is greater than one then the hunk
can be split into that number of smaller hunks. If the last hunk in a
file ends with an insertion or deletion then there is no following
context line and the counter will not be incremented. This case is
already handled at the end of the loop where the counter is incremented if
the last hunk ended with an insertion or deletion. Unfortunately there
is no similar check between files (likely because the perl version
only ever parses one diff at a time). Fix this by checking if the last
hunk ended with an insertion or deletion when we see the diff header
of a new file and extend the existing regression test.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
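The counting described in the commit message can be tried in isolation.
What follows is only a minimal sketch under stated assumptions, not the
code in add-patch.c: "struct hunk" is cut down to the one counter, and
count_splits(), sample_diff, demo_split_count() and the
check_between_files switch are hypothetical names made up for the
illustration. The sample diff is modelled on the patch used in t3701;
the header of its first hunk is an assumption.

```c
#include <assert.h>
#include <string.h>

#define MAX_HUNKS 8

/* Cut-down stand-in for the real struct hunk in add-patch.c. */
struct hunk {
	int splittable_into;
};

/* Same check as the helper added by this patch. */
static void complete_file(char marker, struct hunk *hunk)
{
	if (marker == '-' || marker == '+')
		hunk->splittable_into++;
}

/*
 * Hypothetical, simplified version of the loop in parse_diff().
 * Fills hunks[] in order of appearance and returns how many hunks
 * were seen. Passing check_between_files == 0 skips the call at the
 * "diff " header, which mimics the behaviour before this patch.
 */
static int count_splits(const char *const *lines, int nr,
			struct hunk *hunks, int check_between_files)
{
	struct hunk *hunk = NULL;
	char marker = '\0';
	int hunk_nr = 0, i;

	for (i = 0; i < nr; i++) {
		const char *p = lines[i];

		if (!strncmp(p, "diff ", 5)) {
			if (check_between_files && hunk)
				complete_file(marker, hunk);
			marker = '\0'; /* stop counting in the file header */
		} else if (!strncmp(p, "@@ ", 3)) {
			assert(hunk_nr < MAX_HUNKS);
			hunk = &hunks[hunk_nr++];
			hunk->splittable_into = 0;
			marker = *p; /* start counting for this hunk */
		}

		/* a context line right after an insertion or deletion */
		if ((marker == '-' || marker == '+') && *p == ' ')
			hunk->splittable_into++;
		if (marker && *p != '\\')
			marker = *p;
	}
	if (hunk)
		complete_file(marker, hunk);
	return hunk_nr;
}

/* Two-file diff; the first hunk header is assumed for illustration. */
static const char *const sample_diff[] = {
	"diff --git a/file b/file",
	"--- a/file",
	"+++ b/file",
	"@@ -1,2 +1,4 @@",
	"+firstline",
	" baseline",
	" content",
	"+lastline",
	"\\ No newline at end of file",
	"diff --git a/file2 b/file2",
	"--- a/file2",
	"+++ b/file2",
	"@@ -1,4 +1,5 @@",
	"-A",
	"+Z",
	" B",
	"+Y",
	" C",
	"-D",
	"+X",
};

/* Counter for hunk number idx of sample_diff, or -1 if out of range. */
int demo_split_count(int idx, int check_between_files)
{
	struct hunk hunks[MAX_HUNKS];
	int nr = count_splits(sample_diff,
			      sizeof(sample_diff) / sizeof(*sample_diff),
			      hunks, check_between_files);

	return idx >= 0 && idx < nr ? hunks[idx].splittable_into : -1;
}
```

With the check between files enabled, the hunk in "file" ends up with a
counter of 2 and can be split. With the check disabled, its counter
stays at 1, while the hunk in "file2" is unaffected because it is
finalized at the end of the loop either way.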
add-patch.c | 20 +++++++++++------
t/t3701-add-interactive.sh | 46 ++++++++++++++++++++++++++++++++++----
2 files changed, 55 insertions(+), 11 deletions(-)
diff --git a/add-patch.c b/add-patch.c
index 8c41cdfe39b..89ffda32b26 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -383,6 +383,17 @@ static int is_octal(const char *p, size_t len)
return 1;
}
+static void complete_file(char marker, struct hunk *hunk)
+{
+ if (marker == '-' || marker == '+')
+ /*
+ * Last hunk ended in non-context line (i.e. it
+ * appended lines to the file, so there are no
+ * trailing context lines).
+ */
+ hunk->splittable_into++;
+}
+
static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
{
struct strvec args = STRVEC_INIT;
@@ -472,6 +483,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
eol = pend;
if (starts_with(p, "diff ")) {
+ complete_file(marker, hunk);
ALLOC_GROW_BY(s->file_diff, s->file_diff_nr, 1,
file_diff_alloc);
file_diff = s->file_diff + s->file_diff_nr - 1;
@@ -598,13 +610,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
file_diff->hunk->colored_end = hunk->colored_end;
}
}
-
- if (marker == '-' || marker == '+')
- /*
- * Last hunk ended in non-context line (i.e. it appended lines
- * to the file, so there are no trailing context lines).
- */
- hunk->splittable_into++;
+ complete_file(marker, hunk);
/* non-colored shorter than colored? */
if (colored_p != colored_pend) {
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 77de0029ba5..94537a6b40a 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -326,7 +326,9 @@ test_expect_success 'correct message when there is nothing to do' '
test_expect_success 'setup again' '
git reset --hard &&
test_chmod +x file &&
- echo content >>file
+ echo content >>file &&
+ test_write_lines A B C D>file2 &&
+ git add file2
'
# Write the patch file with a new line at the top and bottom
@@ -341,13 +343,27 @@ test_expect_success 'setup patch' '
content
+lastline
\ No newline at end of file
+ diff --git a/file2 b/file2
+ index 8422d40..35b930a 100644
+ --- a/file2
+ +++ b/file2
+ @@ -1,4 +1,5 @@
+ -A
+ +Z
+ B
+ +Y
+ C
+ -D
+ +X
EOF
'
# Expected output, diff is similar to the patch but w/ diff at the top
test_expect_success 'setup expected' '
echo diff --git a/file b/file >expected &&
- sed "/^index/s/ 100644/ 100755/" patch >>expected &&
+ sed -e "/^index 180b47c/s/ 100644/ 100755/" \
+ -e /1,5/s//1,4/ \
+ -e /Y/d patch >>expected &&
cat >expected-output <<-\EOF
--- a/file
+++ b/file
@@ -366,6 +382,28 @@ test_expect_success 'setup expected' '
content
+lastline
\ No newline at end of file
+ --- a/file2
+ +++ b/file2
+ @@ -1,4 +1,5 @@
+ -A
+ +Z
+ B
+ +Y
+ C
+ -D
+ +X
+ @@ -1,2 +1,2 @@
+ -A
+ +Z
+ B
+ @@ -2,2 +2,3 @@
+ B
+ +Y
+ C
+ @@ -3,2 +4,2 @@
+ C
+ -D
+ +X
EOF
'
@@ -373,8 +411,8 @@ test_expect_success 'setup expected' '
test_expect_success 'add first line works' '
git commit -am "clear local changes" &&
git apply patch &&
- test_write_lines s y y | git add -p file 2>error >raw-output &&
- sed -n -e "s/^([1-2]\/[1-2]) Stage this hunk[^@]*\(@@ .*\)/\1/" \
+ test_write_lines s y y s y n y | git add -p 2>error >raw-output &&
+ sed -n -e "s/^([1-9]\/[1-9]) Stage this hunk[^@]*\(@@ .*\)/\1/" \
-e "/^[-+@ \\\\]"/p raw-output >output &&
test_must_be_empty error &&
git diff --cached >diff &&
--
gitgitgadget
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-11 11:12 ` [PATCH v2 0/2] " Phillip Wood via GitGitGadget
2022-01-11 11:12 ` [PATCH v2 1/2] t3701: clean up hunk splitting tests Phillip Wood via GitGitGadget
2022-01-11 11:12 ` [PATCH v2 2/2] builtin add -p: fix hunk splitting Phillip Wood via GitGitGadget
@ 2022-01-11 12:07 ` Ævar Arnfjörð Bjarmason
2022-01-11 18:57 ` Phillip Wood
2022-01-12 18:34 ` Junio C Hamano
3 siblings, 1 reply; 21+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-11 12:07 UTC (permalink / raw)
To: Phillip Wood via GitGitGadget
Cc: git, Johannes Schindelin, SZEDER Gábor, Phillip Wood
On Tue, Jan 11 2022, Phillip Wood via GitGitGadget wrote:
> Thanks to Junio and Ævar for their comments on V1. I've updated the commit
> message and added a helper function as suggested.
This v2 LGTM as far as the functionality of the end-state is concerned.
As a remaining nit the complete_file() helper you introduce in 2/2
changes 2/4 places that increment "hunk->splittable_into".
I grabbed this PR and came up with this amendment to it which adds a 2/3
step that converts 3/3 of them, followed by adding the 4th user in your
2/2 (now patch 3/3):
https://github.com/git/git/compare/master...avar:phillipwood-avar/wip/add-p-fix-hunk-splitting-v2.1
It changes nothing as far as the end-state is concerned, but I think it
makes this easier to read & follow. The actual behavior change becomes a
one-line addition to add-patch.c, instead of being mixed up with the
refactoring of adding the new helper.
If you'd like to pick that up & run with it as a v3 that's fine by me,
and if not that's also fine :) Just a suggestion.
A range-diff between your v2 here and that linked-to
phillipwood-avar/wip/add-p-fix-hunk-splitting-v2.1:
1: cc8639fc29d = 1: 34392397f04 t3701: clean up hunk splitting tests
-: ----------- > 2: c082176f8c5 add-file.c: use static helper to check marker == +|-
2: b698989e265 ! 3: defca0baba4 builtin add -p: fix hunk splitting
@@ Commit message
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
## add-patch.c ##
-@@ add-patch.c: static int is_octal(const char *p, size_t len)
- return 1;
- }
-
-+static void complete_file(char marker, struct hunk *hunk)
-+{
-+ if (marker == '-' || marker == '+')
-+ /*
-+ * Last hunk ended in non-context line (i.e. it
-+ * appended lines to the file, so there are no
-+ * trailing context lines).
-+ */
-+ hunk->splittable_into++;
-+}
-+
- static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
- {
- struct strvec args = STRVEC_INIT;
@@ add-patch.c: static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
eol = pend;
if (starts_with(p, "diff ")) {
-+ complete_file(marker, hunk);
++ complete_file(marker, &hunk->splittable_into);
ALLOC_GROW_BY(s->file_diff, s->file_diff_nr, 1,
file_diff_alloc);
file_diff = s->file_diff + s->file_diff_nr - 1;
-@@ add-patch.c: static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
- file_diff->hunk->colored_end = hunk->colored_end;
- }
- }
--
-- if (marker == '-' || marker == '+')
-- /*
-- * Last hunk ended in non-context line (i.e. it appended lines
-- * to the file, so there are no trailing context lines).
-- */
-- hunk->splittable_into++;
-+ complete_file(marker, hunk);
-
- /* non-colored shorter than colored? */
- if (colored_p != colored_pend) {
## t/t3701-add-interactive.sh ##
@@ t/t3701-add-interactive.sh: test_expect_success 'correct message when there is nothing to do' '
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-11 12:07 ` [PATCH v2 0/2] " Ævar Arnfjörð Bjarmason
@ 2022-01-11 18:57 ` Phillip Wood
2022-01-12 18:51 ` Junio C Hamano
0 siblings, 1 reply; 21+ messages in thread
From: Phillip Wood @ 2022-01-11 18:57 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason,
Phillip Wood via GitGitGadget
Cc: git, Johannes Schindelin, SZEDER Gábor, Phillip Wood
Hi Ævar
On 11/01/2022 12:07, Ævar Arnfjörð Bjarmason wrote:
>
> On Tue, Jan 11 2022, Phillip Wood via GitGitGadget wrote:
>
>> Thanks to Junio and Ævar for their comments on V1. I've updated the commit
>> message and added a helper function as suggested.
>
> This v2 LGTM as far as the functionality of the end-state is concerned.
Thanks for taking a look
> As a remaining nit the complete_file() helper you introduce in 2/2
> changes 2/4 places that increment "hunk->splittable_into".
>
> I grabbed this PR and came up with this amendment to it which adds a 2/3
> step that converts 3/3 of them, followed by adding the 4th user in your
> 2/2 (now patch 3/3):
> https://github.com/git/git/compare/master...avar:phillipwood-avar/wip/add-p-fix-hunk-splitting-v2.1
>
> It changes nothing as far as the end-state is concerned, but I think it
> makes this easier to read & follow. The actual behavior change becomes a
> one-line addition to add-patch.c, instead of being mixed up with the
> refactoring of adding the new helper.
>
> If you'd like to pick that up & run with it as a v3 that's fine by me,
> and if not that's also fine :) Just a suggestion.
I'm not sure I want to go with your extra changes. I've left some
comments on them below
> @@ -488,12 +499,12 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> else if (starts_with(p, "@@ ") ||
> (hunk == &file_diff->head &&
> (skip_prefix(p, "deleted file", &deleted)))) {
> - if (marker == '-' || marker == '+')
> - /*
> - * Should not happen; previous hunk did not end
> - * in a context line? Handle it anyway.
> - */
> + hunk->splittable_into++;
> + /*
> + * Should not increment "splittable_into";
> + * previous hunk did not end in a context
> + * line? Handle it anyway.
> + */
> + complete_file(marker, &hunk->splittable_into);
>
> ALLOC_GROW_BY(file_diff->hunk, file_diff->hunk_nr, 1,
> file_diff->hunk_alloc);
I deliberately left this alone as I think we should probably make this
BUG() out instead of silently accepting an invalid diff.
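As a sketch only of what that could look like: BUG() below is a local
stand-in for git's macro so that the example is self-contained, and
ended_without_context() and check_hunk_boundary() are hypothetical
names, not functions that exist in add-patch.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Local stand-in for git's BUG() so that the sketch is self-contained. */
#define BUG(msg) do { \
	fprintf(stderr, "BUG: %s\n", msg); \
	abort(); \
} while (0)

/*
 * Hypothetical predicate: did the previous line belong to an
 * insertion or deletion, i.e. is there no trailing context?
 */
int ended_without_context(char marker)
{
	return marker == '-' || marker == '+';
}

/*
 * Hypothetical check to run when a new "@@ " hunk header is seen.
 * Instead of bumping the counter and carrying on, refuse a diff
 * whose previous hunk did not end in a context line.
 */
void check_hunk_boundary(char marker)
{
	if (ended_without_context(marker))
		BUG("previous hunk did not end in a context line");
}
```

The end of a file would still need the counter bump, since the last
hunk of a file can legitimately end in an insertion or deletion.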
> @@ -566,8 +577,8 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> (int)(eol - (plain->buf + file_diff->head.start)),
> plain->buf + file_diff->head.start);
>
> - if ((marker == '-' || marker == '+') && *p == ' ')
> - hunk->splittable_into++;
> + if (*p == ' ')
> + complete_file(marker, &hunk->splittable_into);
> if (marker && *p != '\\')
> marker = *p;
Here you are calling complete_file() which has the following comment
/*
* Last hunk ended in non-context line (i.e. it
* appended lines to the file, so there are no
* trailing context lines).
*/
for all context lines so the function name and comment would need
updating.
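To illustrate, here is a minimal sketch that counts how often each form
would invoke the helper for a single hunk. All names in it (struct
tally, bump_if_change(), walk_hunk()) are hypothetical and only serve
the illustration; the helper takes a pointer to the counter as in the
proposed amendment.

```c
#include <assert.h>

/* Outcome of walking one hunk body. */
struct tally {
	int helper_calls;    /* times the helper was invoked */
	int splittable_into; /* resulting counter value */
};

/* Hypothetical helper taking a pointer to the counter. */
static void bump_if_change(char marker, int *splittable_into, int *calls)
{
	(*calls)++;
	if (marker == '-' || marker == '+')
		(*splittable_into)++;
}

/*
 * Walk the first characters of a hunk's lines (' ', '+' or '-').
 * With helper_on_context != 0 the helper runs on every context line,
 * otherwise the marker is tested inline as in the existing code.
 */
struct tally walk_hunk(const char *first_chars, int helper_on_context)
{
	struct tally t = { 0, 0 };
	char marker = '@'; /* value right after the hunk header */
	const char *p;

	for (p = first_chars; *p; p++) {
		if (helper_on_context) {
			if (*p == ' ')
				bump_if_change(marker, &t.splittable_into,
					       &t.helper_calls);
		} else if ((marker == '-' || marker == '+') && *p == ' ') {
			t.splittable_into++;
		}
		marker = *p;
	}
	return t;
}
```

For a hunk whose lines start with "  -+  " both forms leave the counter
at 1, but the helper form is entered four times, once per context line.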
Best Wishes
Phillip
> A range-diff between your v2 here and that linked-to
> phillipwood-avar/wip/add-p-fix-hunk-splitting-v2.1:
>
> 1: cc8639fc29d = 1: 34392397f04 t3701: clean up hunk splitting tests
> -: ----------- > 2: c082176f8c5 add-file.c: use static helper to check marker == +|-
> 2: b698989e265 ! 3: defca0baba4 builtin add -p: fix hunk splitting
> @@ Commit message
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> ## add-patch.c ##
> -@@ add-patch.c: static int is_octal(const char *p, size_t len)
> - return 1;
> - }
> -
> -+static void complete_file(char marker, struct hunk *hunk)
> -+{
> -+ if (marker == '-' || marker == '+')
> -+ /*
> -+ * Last hunk ended in non-context line (i.e. it
> -+ * appended lines to the file, so there are no
> -+ * trailing context lines).
> -+ */
> -+ hunk->splittable_into++;
> -+}
> -+
> - static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> - {
> - struct strvec args = STRVEC_INIT;
> @@ add-patch.c: static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> eol = pend;
>
> if (starts_with(p, "diff ")) {
> -+ complete_file(marker, hunk);
> ++ complete_file(marker, &hunk->splittable_into);
> ALLOC_GROW_BY(s->file_diff, s->file_diff_nr, 1,
> file_diff_alloc);
> file_diff = s->file_diff + s->file_diff_nr - 1;
> -@@ add-patch.c: static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> - file_diff->hunk->colored_end = hunk->colored_end;
> - }
> - }
> --
> -- if (marker == '-' || marker == '+')
> -- /*
> -- * Last hunk ended in non-context line (i.e. it appended lines
> -- * to the file, so there are no trailing context lines).
> -- */
> -- hunk->splittable_into++;
> -+ complete_file(marker, hunk);
> -
> - /* non-colored shorter than colored? */
> - if (colored_p != colored_pend) {
>
> ## t/t3701-add-interactive.sh ##
> @@ t/t3701-add-interactive.sh: test_expect_success 'correct message when there is nothing to do' '
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-11 18:57 ` Phillip Wood
@ 2022-01-12 18:51 ` Junio C Hamano
2022-01-19 20:01 ` Phillip Wood
0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2022-01-12 18:51 UTC (permalink / raw)
To: Phillip Wood
Cc: Ævar Arnfjörð Bjarmason,
Phillip Wood via GitGitGadget, git, Johannes Schindelin,
SZEDER Gábor, Phillip Wood
Phillip Wood <phillip.wood123@gmail.com> writes:
> I'm not sure I want to go with your extra changes. I've left some
> comments on them below
>
>> @@ -488,12 +499,12 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
>> else if (starts_with(p, "@@ ") ||
>> (hunk == &file_diff->head &&
>> (skip_prefix(p, "deleted file", &deleted)))) {
>> - if (marker == '-' || marker == '+')
>> - /*
>> - * Should not happen; previous hunk did not end
>> - * in a context line? Handle it anyway.
>> - */
>> + hunk->splittable_into++;
>> + /*
>> + * Should not increment "splittable_into";
>> + * previous hunk did not end in a context
>> + * line? Handle it anyway.
>> + */
>> + complete_file(marker, &hunk->splittable_into);
>> ALLOC_GROW_BY(file_diff->hunk, file_diff->hunk_nr, 1,
>> file_diff->hunk_alloc);
>
> I deliberately left this alone as I think we should probably make this
> BUG() out instead of silently accepting an invalid diff.
As we are reading our own output, I agree that such a data error is
a BUG().
In any case, a helper to see if the file ended without post-context
is one thing, and a helper that specifies what happens after we are
done with a single file, before we move on to the next file or
after processing the last file, is another thing. The latter may be
able to make use of the former, but the latter may want to do more
than that in the future.
As complete_file() is about finalizing the processing we have done
to the current file, it should be used for that purpose and nothing
else. I think the hunk I see at
https://github.com/git/git/commit/c082176f8c5a1fc1c8b2a93991ca28fd63aae73a
(reproduced below) is simply nonsense.
Stepping back a bit, though, is this helper really finalizing the
current file, or is it finalizing the current hunk? If it were the
latter, then its use in the hunk I called "nonsense" above actually
makes perfect sense. There may not be anything other than finalizing
the last hunk when we see the end of a file right now, so we may not
need to add a finalize_file() helper right now, and when we need to
do something more than finalizing the last hunk, we may need to capture
the distinction by adding one.
diff --git i/add-patch.c w/add-patch.c
index 89ffda32b2..6094290c86 100644
--- i/add-patch.c
+++ w/add-patch.c
@@ -578,8 +578,8 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
(int)(eol - (plain->buf + file_diff->head.start)),
plain->buf + file_diff->head.start);
- if ((marker == '-' || marker == '+') && *p == ' ')
- hunk->splittable_into++;
+ if (*p == ' ')
+ complete_file(marker, &hunk->splittable_into);
if (marker && *p != '\\')
marker = *p;
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-12 18:51 ` Junio C Hamano
@ 2022-01-19 20:01 ` Phillip Wood
2022-01-20 5:02 ` Junio C Hamano
2022-01-22 9:05 ` Johannes Schindelin
0 siblings, 2 replies; 21+ messages in thread
From: Phillip Wood @ 2022-01-19 20:01 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ævar Arnfjörð Bjarmason,
Phillip Wood via GitGitGadget, git, Johannes Schindelin,
SZEDER Gábor, Phillip Wood
Hi Junio
On 12/01/2022 18:51, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> I'm not sure I want to go with your extra changes. I've left some
>> comments on them below
>>
>>> @@ -488,12 +499,12 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
>>> else if (starts_with(p, "@@ ") ||
>>> (hunk == &file_diff->head &&
>>> (skip_prefix(p, "deleted file", &deleted)))) {
>>> - if (marker == '-' || marker == '+')
>>> - /*
>>> - * Should not happen; previous hunk did not end
>>> - * in a context line? Handle it anyway.
>>> - */
>>> + hunk->splittable_into++;
>>> + /*
>>> + * Should not increment "splittable_into";
>>> + * previous hunk did not end in a context
>>> + * line? Handle it anyway.
>>> + */
>>> + complete_file(marker, &hunk->splittable_into);
>>> ALLOC_GROW_BY(file_diff->hunk, file_diff->hunk_nr, 1,
>>> file_diff->hunk_alloc);
>>
>> I deliberately left this alone as I think we should probably make this
>> BUG() out instead of silently accepting an invalid diff.
>
> As we are reading our own output, I agree that such a data error is
> a BUG().
>
> In any case, a helper to see if the file ended without post-context
> is one thing, and a helper that specifies what happens after we are
> done with a single file, before we move on to the next file or
> after processing the last file, is another thing. The latter may be
> able to make use of the former, but the latter may want to do more
> than that in the future.
>
> As complete_file() is about finalizing the processing we have done
> to the current file, it should be used for that purpose and nothing
> else. I think the hunk I see at
> https://github.com/git/git/commit/c082176f8c5a1fc1c8b2a93991ca28fd63aae73a
> (reproduced below) is simply nonsense.
>
> Stepping back a bit, though, is this helper really finalizing the
> current file, or is it finalizing the current hunk? If it were the
> latter, then its use in the hunk I called "nonsense" above actually
> makes perfect sense.
Even if the helper is finalizing the current hunk then I think that
"nonsense" hunk would still be wrong as it would be calling finalize_hunk()
on _every_ context line in the hunk rather than just being called once
to finalize the hunk. We could call the function something like
update_splittable() but then we'd need to explain why we were calling
that function at the start of a diff and at the end of the loop.
> There may not be anything other than finalizing
> the last hunk when we see the end of a file right now, so we may not
> need to add a finalize_file() helper right now, and when we need to
> do something more than finalizing the last hunk, we may need to capture
> the distinction by adding one.
Yes, if you're happy let's leave this series as it is.
Best Wishes
Phillip
> diff --git i/add-patch.c w/add-patch.c
> index 89ffda32b2..6094290c86 100644
> --- i/add-patch.c
> +++ w/add-patch.c
> @@ -578,8 +578,8 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> (int)(eol - (plain->buf + file_diff->head.start)),
> plain->buf + file_diff->head.start);
>
> - if ((marker == '-' || marker == '+') && *p == ' ')
> - hunk->splittable_into++;
> + if (*p == ' ')
> + complete_file(marker, &hunk->splittable_into);
> if (marker && *p != '\\')
> marker = *p;
>
>
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-19 20:01 ` Phillip Wood
@ 2022-01-20 5:02 ` Junio C Hamano
2022-01-20 8:42 ` Ævar Arnfjörð Bjarmason
2022-01-22 9:05 ` Johannes Schindelin
1 sibling, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2022-01-20 5:02 UTC (permalink / raw)
To: Phillip Wood
Cc: Ævar Arnfjörð Bjarmason,
Phillip Wood via GitGitGadget, git, Johannes Schindelin,
SZEDER Gábor, Phillip Wood
Phillip Wood <phillip.wood123@gmail.com> writes:
> Even if the helper is finalizing the current hunk then I think that
> "nonsense" hunk would still be wrong as it would be calling
> finalize_hunk() on _every_ context line in the hunk rather than just
> being called once to finalize the hunk.
True; this triggers every time we finish reading the common context
lines and not at the end of hunk. In any case, I think what we
queued looks good for 'next'.
>> - if ((marker == '-' || marker == '+') && *p == ' ')
>> - hunk->splittable_into++;
>> + if (*p == ' ')
>> + complete_file(marker, &hunk->splittable_into);
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-20 5:02 ` Junio C Hamano
@ 2022-01-20 8:42 ` Ævar Arnfjörð Bjarmason
2022-01-20 19:13 ` Junio C Hamano
0 siblings, 1 reply; 21+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-20 8:42 UTC (permalink / raw)
To: Junio C Hamano
Cc: Phillip Wood, Phillip Wood via GitGitGadget, git,
Johannes Schindelin, SZEDER Gábor, Phillip Wood
On Wed, Jan 19 2022, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> Even if the helper is finalizing the current hunk then I think that
>> "nonsense" hunk would still be wrong as it would be calling
>> finalize_hunk() on _every_ context line in the hunk rather than just
>> being called once to finalize the hunk.
>
> True; this triggers every time we finish reading the common context
> lines and not at the end of hunk. In any case, I think what we
> queued looks good for 'next'.
For what it's worth (and as the person who started this side-thread) I
agree. This looks good as-is, thanks both!
>>> - if ((marker == '-' || marker == '+') && *p == ' ')
>>> - hunk->splittable_into++;
>>> + if (*p == ' ')
>>> + complete_file(marker, &hunk->splittable_into);
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-20 8:42 ` Ævar Arnfjörð Bjarmason
@ 2022-01-20 19:13 ` Junio C Hamano
0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2022-01-20 19:13 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Phillip Wood, Phillip Wood via GitGitGadget, git,
Johannes Schindelin, SZEDER Gábor, Phillip Wood
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> On Wed, Jan 19 2022, Junio C Hamano wrote:
>
>> Phillip Wood <phillip.wood123@gmail.com> writes:
>>
>>> Even if the helper is finalizing the current hunk then I think that
>>> "nonsense" hunk would still be wrong as it would be calling
>>> finalize_hunk() on _every_ context line in the hunk rather than just
>>> being called once to finalize the hunk.
>>
>> True; this triggers every time we finish reading the common context
>> lines and not at the end of hunk. In any case, I think what we
>> queued looks good for 'next'.
>
> For what it's worth (and as the person who started this side-thread) I
> agree. This looks good as-is, thanks both!
>
>>>> - if ((marker == '-' || marker == '+') && *p == ' ')
>>>> - hunk->splittable_into++;
>>>> + if (*p == ' ')
>>>> + complete_file(marker, &hunk->splittable_into);
Yup, thanks all. The fix is now in 'next' and I expect we can
safely merge it down as part of the first batch next cycle.
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-19 20:01 ` Phillip Wood
2022-01-20 5:02 ` Junio C Hamano
@ 2022-01-22 9:05 ` Johannes Schindelin
2022-01-24 11:10 ` Phillip Wood
1 sibling, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2022-01-22 9:05 UTC (permalink / raw)
To: Phillip Wood
Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
Phillip Wood via GitGitGadget, git, SZEDER Gábor
Hi Phillip,
first of all: thank you for these patches. I read over them and they have
my ACK.
On Wed, 19 Jan 2022, Phillip Wood wrote:
> On 12/01/2022 18:51, Junio C Hamano wrote:
> > Phillip Wood <phillip.wood123@gmail.com> writes:
> >
> > > I'm not sure I want to go with your extra changes. I've left some
> > > comments on them below
> > >
> > > > @@ -488,12 +499,12 @@ static int parse_diff(struct add_p_state *s, const
> > > > struct pathspec *ps)
> > > > else if (starts_with(p, "@@ ") ||
> > > > (hunk == &file_diff->head &&
> > > > (skip_prefix(p, "deleted file", &deleted)))) {
> > > > - if (marker == '-' || marker == '+')
> > > > - /*
> > > > - * Should not happen; previous hunk did not end
> > > > - * in a context line? Handle it anyway.
> > > > - */
> > > > + hunk->splittable_into++;
> > > > + /*
> > > > + * Should not increment "splittable_into";
> > > > + * previous hunk did not end in a context
> > > > + * line? Handle it anyway.
> > > > + */
> > > > + complete_file(marker, &hunk->splittable_into);
> > > > ALLOC_GROW_BY(file_diff->hunk, file_diff->hunk_nr, 1,
> > > > file_diff->hunk_alloc);
> > >
> > > I deliberately left this alone as I think we should probably make
> > > this BUG() out instead of silently accepting an invalid diff.
FWIW this was overzealous defensive programming on my part. More on that
below.
> > As we are reading our own output, I agree that such a data error is
> > a BUG().
Indeed. I was less worried about the output format changing, and more
concerned with bugs in my parser ;-)
Although, having said that, I had meant to verify that `git add -p` cannot
be asked to produce and consume diffs with `-U0` when I wrote that
comment. Now I did that, and I am now confident that there is no way to ask
`git add -p` to generate and use context line-free diffs: we neither add
`-U<n>` in https://github.com/git/git/blob/v2.34.1/add-patch.c#L398-L417
nor do we call the user-facing `git diff` command that would interpret
`diff.context`, but instead we use `git diff-index` and `git diff-files`
(which ignore that config setting).
> > In any case, a helper to see if the file ended without post-context
> > is one thing, and a helper that specifies what happens after we are
> > done with a single file, before we move on to the next file or
> > after processing the last file, is another thing. The latter may be
> > able to make use of the former, but the latter may want to do more
> > than that in the future.
If you are concerned about the name of the function: maybe a better name
would be `maybe_increment_splittable_hunk_count(marker)`.
> >
> > As complete_file() is about finalizing the processing we have done
> > to the current file, it should be used for that purpose and nothing
> > else. I think the hunk I see at
> > https://github.com/git/git/commit/c082176f8c5a1fc1c8b2a93991ca28fd63aae73a
> > (reproduced below) is simply nonsense.
> >
> > Stepping back a bit, though, is this helper really finalizing the
> > current file, or is it finalizing the current hunk? If it were the
> > latter, then its use in the hunk I called "nonsense" above actually
> > makes perfect sense.
>
> Even if the helper is finalizing the current hunk then I think that "nonsense"
> hunk would still be wrong as it would be calling finalize_hunk() on _every_
> context line in the hunk rather than just being called once to finalize the
> hunk. We could call the function something like update_splittable() but then
> we'd need to explain why we were calling that function at the start of a diff
> and at the end of the loop.
Right. The point of this check is to see whether we missed counting a
splittable hunk. Then it makes more sense to call it at the beginning of a
file, at the end of a file _and_ at a context line.
Having said all that, I am really fine with what landed in `next`.
Thank you,
Dscho
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-22 9:05 ` Johannes Schindelin
@ 2022-01-24 11:10 ` Phillip Wood
0 siblings, 0 replies; 21+ messages in thread
From: Phillip Wood @ 2022-01-24 11:10 UTC (permalink / raw)
To: Johannes Schindelin, Phillip Wood
Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
Phillip Wood via GitGitGadget, git, SZEDER Gábor
Hi Dscho
On 22/01/2022 09:05, Johannes Schindelin wrote:
> Hi Phillip,
>
> first of all: thank you for these patches. I read over them and they have
> my ACK.
Thanks
> On Wed, 19 Jan 2022, Phillip Wood wrote:
>
>> On 12/01/2022 18:51, Junio C Hamano wrote:
>>> Phillip Wood <phillip.wood123@gmail.com> writes:
>>>
>>>> I'm not sure I want to go with your extra changes. I've left some
>>>> comments on them below
>>>>
>>>>> @@ -488,12 +499,12 @@ static int parse_diff(struct add_p_state *s, const
>>>>> struct pathspec *ps)
>>>>> else if (starts_with(p, "@@ ") ||
>>>>> (hunk == &file_diff->head &&
>>>>> (skip_prefix(p, "deleted file", &deleted)))) {
>>>>> - if (marker == '-' || marker == '+')
>>>>> - /*
>>>>> - * Should not happen; previous hunk did not end
>>>>> - * in a context line? Handle it anyway.
>>>>> - */
>>>>> + hunk->splittable_into++;
>>>>> + /*
>>>>> + * Should not increment "splittable_into";
>>>>> + * previous hunk did not end in a context
>>>>> + * line? Handle it anyway.
>>>>> + */
>>>>> + complete_file(marker, &hunk->splittable_into);
>>>>> ALLOC_GROW_BY(file_diff->hunk, file_diff->hunk_nr, 1,
>>>>> file_diff->hunk_alloc);
>>>>
>>>> I deliberately left this alone as I think we should probably make
>>>> this BUG() out instead of silently accepting an invalid diff.
>
> FWIW this was overzealous defensive programming on my part. More on that
> below.
>
>>> As we are reading our own output, I agree that such a data error is
>>> a BUG().
>
> Indeed. I was less worried about the output format changing, and more
> concerned with bugs in my parser ;-)
>
> Although, having said that, I had meant to verify that `git add -p` cannot
> be asked to produce and consume diffs with `-U0` when I wrote that
> comment. Now I did that, and I am now confident that there is no way to ask
> `git add -p` to generate and use context line-free diffs: we neither add
> `-U<n>` in https://github.com/git/git/blob/v2.34.1/add-patch.c#L398-L417
> nor do we call the user-facing `git diff` command that would interpret
> `diff.context`, but instead we use `git diff-index` and `git diff-files`
> (which ignore that config setting).
I did think about zero context diffs but realized that they can never be
split so we don't need to worry about incrementing hunk->splittable_into
in that case. It does mean that hunk->splittable_into will be zero in
the -U0 case rather than one but I don't think that matters as we only
care if it is >1 for splitting.
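To make the zero-context argument concrete, here is a minimal sketch in C. It is a simplified model rather than the bookkeeping in add-patch.c: count_change_runs() and can_split() are hypothetical names, and the hunk body is assumed to be an array of lines without the "@@" header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of runs of '+'/'-' lines separated by context. */
static int count_change_runs(const char *const *body, size_t nr)
{
	int runs = 0;
	int in_run = 0;
	size_t i;

	assert(body || !nr);
	for (i = 0; i < nr; i++) {
		char c = body[i][0];

		if (c == '+' || c == '-') {
			if (!in_run)
				runs++;
			in_run = 1;
		} else if (c == ' ') {
			in_run = 0;	/* context ends the current run */
		}
		/* "\ No newline at end of file" lines leave the state alone */
	}
	return runs;
}

/* A hunk can only be split where context separates two runs of changes. */
static int can_split(const char *const *body, size_t nr)
{
	return count_change_runs(body, nr) > 1;
}
```

However the counter is kept, a hunk without any context lines has a single run of changes, so there is nowhere to split it.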
Best Wishes
Phillip
>>> In any case, a helper to see if the file ended without post-context
>>> is one thing, and a helper that specifies what happens after we are
>>> done with a single file, before we move on to the next file or
>>> after processing the last file, is another thing. The latter may be
>>> able to make use of the former, but the latter may want to do more
>>> than that in the future.
>
> If you are concerned about the name of the function: maybe a better name
> would be `maybe_increment_splittable_hunk_count(marker)`.
>
>>>
>>> As complete_file() is about finalizing the processing we have done
>>> to the current file, it should be used for that purpose, and nothing
>>> else. I think the hunk I see at
>>> https://github.com/git/git/commit/c082176f8c5a1fc1c8b2a93991ca28fd63aae73a
>>> (reproduced below) is simply nonsense.
>>>
>>> Stepping back a bit, though, is this helper really finalizing the
>>> current file, or is it finalizing the current hunk? If it were the
>>> latter, then its use in the hunk I called "nonsense" above actually
>>> makes perfect sense.
>>
>> Even if the helper is finalizing the current hunk then I think that "nonsense"
>> hunk would still be wrong as it would be calling finalize_hunk() on _every_
>> context line in the hunk rather than just being called once to finalize the
>> hunk. We could call the function something like update_splittable() but then
>> we'd need to explain why we were calling that function at the start of a diff
>> and at the end of the loop.
>
> Right. The point of this check is to see whether we missed counting a
> splittable hunk. Then it makes more sense to call it at the beginning of a
> file, at the end of a file _and_ at a context line.
>
> Having said all that, I am really fine with what landed in `next`.
>
> Thank you,
> Dscho
* Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting
2022-01-11 11:12 ` [PATCH v2 0/2] " Phillip Wood via GitGitGadget
` (2 preceding siblings ...)
2022-01-11 12:07 ` [PATCH v2 0/2] " Ævar Arnfjörð Bjarmason
@ 2022-01-12 18:34 ` Junio C Hamano
3 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2022-01-12 18:34 UTC (permalink / raw)
To: Phillip Wood via GitGitGadget
Cc: git, Johannes Schindelin, SZEDER Gábor,
Ævar Arnfjörð Bjarmason, Phillip Wood
"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Thanks to Junio and Ævar for their comments on V1. I've updated the commit
> message and added a helper function as suggested.
>
> V1 Cover Letter: Fix a small regression in the hunk splitting of the builtin
> version compared to the perl version. Thanks to Szeder for the
> easy-to-follow bug report.
Looking good. After comparing the output from
$ git grep -e 'finalize[_a-z]*(' -e 'complete[_a-z]*(' \*.c
I would have called the helper "finalize_file()", but in the context
of this file, the name complete_file() is not misleading enough to
require renaming.
Will queue. Thanks.