[PATCH 1/3] sequencer: break some long lines

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH 1/3] sequencer: break some long lines
@ 2019-03-13 18:26 Phillip Wood
  2019-03-13 18:26 ` [PATCH 2/3] cherry-pick: demonstrate option amnesia Phillip Wood
  2019-03-13 18:26 ` [PATCH 3/3] cherry-pick --continue: remember options Phillip Wood
  0 siblings, 2 replies; 10+ messages in thread
From: Phillip Wood @ 2019-03-13 18:26 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

reformat save_opts() to remove excessively long lines.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 sequencer.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 0db410d590..5e19b22f8f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2700,36 +2700,45 @@ static int save_opts(struct replay_opts *opts)
 	int res = 0;
 
 	if (opts->no_commit)
-		res |= git_config_set_in_file_gently(opts_file, "options.no-commit", "true");
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.no-commit", "true");
 	if (opts->edit)
-		res |= git_config_set_in_file_gently(opts_file, "options.edit", "true");
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.edit", "true");
 	if (opts->signoff)
-		res |= git_config_set_in_file_gently(opts_file, "options.signoff", "true");
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.signoff", "true");
 	if (opts->record_origin)
-		res |= git_config_set_in_file_gently(opts_file, "options.record-origin", "true");
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.record-origin", "true");
 	if (opts->allow_ff)
-		res |= git_config_set_in_file_gently(opts_file, "options.allow-ff", "true");
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.allow-ff", "true");
 	if (opts->mainline) {
 		struct strbuf buf = STRBUF_INIT;
 		strbuf_addf(&buf, "%d", opts->mainline);
-		res |= git_config_set_in_file_gently(opts_file, "options.mainline", buf.buf);
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.mainline", buf.buf);
 		strbuf_release(&buf);
 	}
 	if (opts->strategy)
-		res |= git_config_set_in_file_gently(opts_file, "options.strategy", opts->strategy);
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.strategy", opts->strategy);
 	if (opts->gpg_sign)
-		res |= git_config_set_in_file_gently(opts_file, "options.gpg-sign", opts->gpg_sign);
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.gpg-sign", opts->gpg_sign);
 	if (opts->xopts) {
 		int i;
 		for (i = 0; i < opts->xopts_nr; i++)
 			res |= git_config_set_multivar_in_file_gently(opts_file,
-							"options.strategy-option",
-							opts->xopts[i], "^$", 0);
+					"options.strategy-option",
+					opts->xopts[i], "^$", 0);
 	}
 	if (opts->allow_rerere_auto)
-		res |= git_config_set_in_file_gently(opts_file, "options.allow-rerere-auto",
-						     opts->allow_rerere_auto == RERERE_AUTOUPDATE ?
-						     "true" : "false");
+		res |= git_config_set_in_file_gently(opts_file,
+				"options.allow-rerere-auto",
+				opts->allow_rerere_auto == RERERE_AUTOUPDATE ?
+				"true" : "false");
 	return res;
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] cherry-pick: demonstrate option amnesia
  2019-03-13 18:26 [PATCH 1/3] sequencer: break some long lines Phillip Wood
@ 2019-03-13 18:26 ` Phillip Wood
  2019-03-13 18:26 ` [PATCH 3/3] cherry-pick --continue: remember options Phillip Wood
  1 sibling, 0 replies; 10+ messages in thread
From: Phillip Wood @ 2019-03-13 18:26 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

When cherry-pick stops for a conflict resolution it forgets
--allow-empty --allow-empty-message and --keep-redundant-commits.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 t/t3507-cherry-pick-conflict.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t3507-cherry-pick-conflict.sh b/t/t3507-cherry-pick-conflict.sh
index 0db166152a..79e994cffa 100755
--- a/t/t3507-cherry-pick-conflict.sh
+++ b/t/t3507-cherry-pick-conflict.sh
@@ -25,6 +25,11 @@ test_expect_success setup '
 	test_commit base foo b &&
 	test_commit picked foo c &&
 	test_commit --signoff picked-signed foo d &&
+	git checkout -b topic initial &&
+	test_commit redundant-pick foo c redundant &&
+	git commit --allow-empty --allow-empty-message &&
+	git tag empty &&
+	git checkout master &&
 	git config advice.detachedhead false
 
 '
@@ -405,4 +410,23 @@ test_expect_success 'cherry-pick preserves sparse-checkout' '
 	test_i18ngrep ! "Changes not staged for commit:" actual
 '
 
+test_expect_failure 'cherry-pick --continue remembers --keep-redundant-commits' '
+	test_when_finished "git cherry-pick --abort || :" &&
+	pristine_detach initial &&
+	test_must_fail git cherry-pick --keep-redundant-commits picked redundant &&
+	echo c >foo &&
+	git add foo &&
+	git cherry-pick --continue
+'
+
+test_expect_failure 'cherry-pick --continue remembers --allow-empty and --allow-empty-message' '
+	test_when_finished "git cherry-pick --abort || :" &&
+	pristine_detach initial &&
+	test_must_fail git cherry-pick --allow-empty --allow-empty-message \
+				       picked empty &&
+	echo c >foo &&
+	git add foo &&
+	git cherry-pick --continue
+'
+
 test_done
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-13 18:26 [PATCH 1/3] sequencer: break some long lines Phillip Wood
  2019-03-13 18:26 ` [PATCH 2/3] cherry-pick: demonstrate option amnesia Phillip Wood
@ 2019-03-13 18:26 ` Phillip Wood
  2019-03-13 22:45   ` Johannes Schindelin
  1 sibling, 1 reply; 10+ messages in thread
From: Phillip Wood @ 2019-03-13 18:26 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

Remember --allow-empty, --allow-empty-message and
--keep-redundant-commits when cherry-pick stops for a conflict
resolution.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 sequencer.c                     | 18 ++++++++++++++++++
 t/t3507-cherry-pick-conflict.sh |  4 ++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5e19b22f8f..1ad51aa498 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2307,6 +2307,15 @@ static int populate_opts_cb(const char *key, const char *value, void *data)
 		opts->no_commit = git_config_bool_or_int(key, value, &error_flag);
 	else if (!strcmp(key, "options.edit"))
 		opts->edit = git_config_bool_or_int(key, value, &error_flag);
+	else if (!strcmp(key, "options.allow-empty"))
+		opts->allow_empty =
+			git_config_bool_or_int(key, value, &error_flag);
+	else if (!strcmp(key, "options.allow-empty-message"))
+		opts->allow_empty_message =
+			git_config_bool_or_int(key, value, &error_flag);
+	else if (!strcmp(key, "options.keep-redundant-commits"))
+		opts->keep_redundant_commits =
+			git_config_bool_or_int(key, value, &error_flag);
 	else if (!strcmp(key, "options.signoff"))
 		opts->signoff = git_config_bool_or_int(key, value, &error_flag);
 	else if (!strcmp(key, "options.record-origin"))
@@ -2705,6 +2714,15 @@ static int save_opts(struct replay_opts *opts)
 	if (opts->edit)
 		res |= git_config_set_in_file_gently(opts_file,
 					"options.edit", "true");
+	if (opts->allow_empty)
+		res |= git_config_set_in_file_gently(opts_file,
+					"options.allow-empty", "true");
+	if (opts->allow_empty_message)
+		res |= git_config_set_in_file_gently(opts_file,
+				"options.allow-empty-message", "true");
+	if (opts->keep_redundant_commits)
+		res |= git_config_set_in_file_gently(opts_file,
+				"options.keep-redundant-commits", "true");
 	if (opts->signoff)
 		res |= git_config_set_in_file_gently(opts_file,
 					"options.signoff", "true");
diff --git a/t/t3507-cherry-pick-conflict.sh b/t/t3507-cherry-pick-conflict.sh
index 79e994cffa..1ef8e9d534 100755
--- a/t/t3507-cherry-pick-conflict.sh
+++ b/t/t3507-cherry-pick-conflict.sh
@@ -410,7 +410,7 @@ test_expect_success 'cherry-pick preserves sparse-checkout' '
 	test_i18ngrep ! "Changes not staged for commit:" actual
 '
 
-test_expect_failure 'cherry-pick --continue remembers --keep-redundant-commits' '
+test_expect_success 'cherry-pick --continue remembers --keep-redundant-commits' '
 	test_when_finished "git cherry-pick --abort || :" &&
 	pristine_detach initial &&
 	test_must_fail git cherry-pick --keep-redundant-commits picked redundant &&
@@ -419,7 +419,7 @@ test_expect_failure 'cherry-pick --continue remembers --keep-redundant-commits'
 	git cherry-pick --continue
 '
 
-test_expect_failure 'cherry-pick --continue remembers --allow-empty and --allow-empty-message' '
+test_expect_success 'cherry-pick --continue remembers --allow-empty and --allow-empty-message' '
 	test_when_finished "git cherry-pick --abort || :" &&
 	pristine_detach initial &&
 	test_must_fail git cherry-pick --allow-empty --allow-empty-message \
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-13 18:26 ` [PATCH 3/3] cherry-pick --continue: remember options Phillip Wood
@ 2019-03-13 22:45   ` Johannes Schindelin
  2019-03-14  5:01     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Johannes Schindelin @ 2019-03-13 22:45 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Git Mailing List, Junio C Hamano

Hi Phillip,

On Wed, 13 Mar 2019, Phillip Wood wrote:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> Remember --allow-empty, --allow-empty-message and
> --keep-redundant-commits when cherry-pick stops for a conflict
> resolution.
> 
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>

This whole patch series makes sense to me.

And it is especially nice that you make it easy to verify that there is a
bug in the first place, by separating the concern of demonstrating it from
the concern to fix it.

Thanks,
Dscho

> ---
>  sequencer.c                     | 18 ++++++++++++++++++
>  t/t3507-cherry-pick-conflict.sh |  4 ++--
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 5e19b22f8f..1ad51aa498 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -2307,6 +2307,15 @@ static int populate_opts_cb(const char *key, const char *value, void *data)
>  		opts->no_commit = git_config_bool_or_int(key, value, &error_flag);
>  	else if (!strcmp(key, "options.edit"))
>  		opts->edit = git_config_bool_or_int(key, value, &error_flag);
> +	else if (!strcmp(key, "options.allow-empty"))
> +		opts->allow_empty =
> +			git_config_bool_or_int(key, value, &error_flag);
> +	else if (!strcmp(key, "options.allow-empty-message"))
> +		opts->allow_empty_message =
> +			git_config_bool_or_int(key, value, &error_flag);
> +	else if (!strcmp(key, "options.keep-redundant-commits"))
> +		opts->keep_redundant_commits =
> +			git_config_bool_or_int(key, value, &error_flag);
>  	else if (!strcmp(key, "options.signoff"))
>  		opts->signoff = git_config_bool_or_int(key, value, &error_flag);
>  	else if (!strcmp(key, "options.record-origin"))
> @@ -2705,6 +2714,15 @@ static int save_opts(struct replay_opts *opts)
>  	if (opts->edit)
>  		res |= git_config_set_in_file_gently(opts_file,
>  					"options.edit", "true");
> +	if (opts->allow_empty)
> +		res |= git_config_set_in_file_gently(opts_file,
> +					"options.allow-empty", "true");
> +	if (opts->allow_empty_message)
> +		res |= git_config_set_in_file_gently(opts_file,
> +				"options.allow-empty-message", "true");
> +	if (opts->keep_redundant_commits)
> +		res |= git_config_set_in_file_gently(opts_file,
> +				"options.keep-redundant-commits", "true");
>  	if (opts->signoff)
>  		res |= git_config_set_in_file_gently(opts_file,
>  					"options.signoff", "true");
> diff --git a/t/t3507-cherry-pick-conflict.sh b/t/t3507-cherry-pick-conflict.sh
> index 79e994cffa..1ef8e9d534 100755
> --- a/t/t3507-cherry-pick-conflict.sh
> +++ b/t/t3507-cherry-pick-conflict.sh
> @@ -410,7 +410,7 @@ test_expect_success 'cherry-pick preserves sparse-checkout' '
>  	test_i18ngrep ! "Changes not staged for commit:" actual
>  '
>  
> -test_expect_failure 'cherry-pick --continue remembers --keep-redundant-commits' '
> +test_expect_success 'cherry-pick --continue remembers --keep-redundant-commits' '
>  	test_when_finished "git cherry-pick --abort || :" &&
>  	pristine_detach initial &&
>  	test_must_fail git cherry-pick --keep-redundant-commits picked redundant &&
> @@ -419,7 +419,7 @@ test_expect_failure 'cherry-pick --continue remembers --keep-redundant-commits'
>  	git cherry-pick --continue
>  '
>  
> -test_expect_failure 'cherry-pick --continue remembers --allow-empty and --allow-empty-message' '
> +test_expect_success 'cherry-pick --continue remembers --allow-empty and --allow-empty-message' '
>  	test_when_finished "git cherry-pick --abort || :" &&
>  	pristine_detach initial &&
>  	test_must_fail git cherry-pick --allow-empty --allow-empty-message \
> -- 
> 2.21.0
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-13 22:45   ` Johannes Schindelin
@ 2019-03-14  5:01     ` Junio C Hamano
  2019-03-14 14:09       ` separating regression test patches from fixes, was " Johannes Schindelin
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2019-03-14  5:01 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Phillip Wood, Git Mailing List

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Remember --allow-empty, --allow-empty-message and
>> --keep-redundant-commits when cherry-pick stops for a conflict
>> resolution.
>> 
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> This whole patch series makes sense to me.

Yes, the changes look sensible (provided if it is the sensible goal
to make --continue use the same set of options, that is).

> And it is especially nice that you make it easy to verify that there is a
> bug in the first place, by separating the concern of demonstrating it from
> the concern to fix it.

For a multi-patch series, breaking out the verification test out of
the fixing patch is OK.  It is a different story for a simple change
that could be made in a single step to artificially separate the
test into a separate step.  An early "test_expect_failure" would not
stop the test when applied to a different codebase as a standalone
"does the breakage exist in this unrelated version?" even if the
"test only" patch might appear easier to apply (as opposed to
applying only the t/ part of the patch that expects success and
seeing it actually fail).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* separating regression test patches from fixes, was Re: [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-14  5:01     ` Junio C Hamano
@ 2019-03-14 14:09       ` Johannes Schindelin
  2019-03-14 14:30         ` Duy Nguyen
  0 siblings, 1 reply; 10+ messages in thread
From: Johannes Schindelin @ 2019-03-14 14:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phillip Wood, Git Mailing List

Hi Junio,

On Thu, 14 Mar 2019, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > And it is especially nice that you make it easy to verify that there
> > is a bug in the first place, by separating the concern of
> > demonstrating it from the concern to fix it.
> 
> For a multi-patch series, breaking out the verification test out of the
> fixing patch is OK.  It is a different story for a simple change that
> could be made in a single step to artificially separate the test into a
> separate step.  An early "test_expect_failure" would not stop the test
> when applied to a different codebase as a standalone "does the breakage
> exist in this unrelated version?" even if the "test only" patch might
> appear easier to apply (as opposed to applying only the t/ part of the
> patch that expects success and seeing it actually fail).

This keeps coming up, and I feel that in the past, I have only managed to
upset you, not to convince you that my point of view has at least *some*
merit.

So please let me try again, this time with a cooler head.

As you know, I find it rather cumbersome to review code changes on the Git
mailing list. Unlike others, I do not suffer from issues revolving around
HTML parts and/or mailer issues, and I do try to just work with the
limitations of the medium. But sometimes the diff just does not give
enough context. Or I really would need to look at it with `--color-moved`
or `--color-words` or `--patience` or `-w`. Or yet other advanced diff
options.

So I would like to have the code changes locally, in a branch in my
worktree.

And at least for me, code review often does not end there. I need to dive
into the commit history to see e.g. why a line that a patch wants to
removed was added in the first place, i.e. read the commit message and see
how the surrounding code changed.

So I would like to have not only some re-applied changes locally, but
really a clone of the branch that the contributor themselves had locally.

And as my understanding of code review is first and foremost to ensure the
correctness of the code (which is why I want to leave the more tedious
parts of code review that do not really require a human to review, or even
fix, to automation), in some cases I really need to run the code (or the
regression test) locally, e.g. to see whether it is a Windows-specific
issue, or whether it affects this or that branch.

The necessity for this kind of thing was demonstrated rather well by
Hannes Sixt recently, where he ran a regression test that I added as part
of https://github.com/gitgitgadget/git/pull/96 (mail thread here:
https://public-inbox.org/git/pull.96.git.gitgitgadget@gmail.com/), and
figured out that this regression does not even exist in git.git's `master`
branch. Therefore it don't need no fixing there, either.

So now that we established how important it is to go back to the local Git
worktree for many in-depth reviews, let's see how easy/convenient that is
right now.

In our currently established process, you have to have mutt with your own
custom scripts. Kidding! But it is undeniable that without serious
scripting (that nobody seems to be able to do in a way that can be shared
between reviewers), it is super hard to do anything but a pure diff-based
review (which would not have caught the problem described above).

So how can we make this easier? How can we make it less painful to review
code changes beyond the bare minimum diff-based review which, let's face
it, encourages "white space review", i.e. focusing on problems that can be
seen from the diff, but that are really of little consequence when it
comes to the health of our code base?

What it boils down to: it needs to be easier to pull down patches and to
recreate local branches for them.

It is in this context that I find it much better to separate patches that
add regression test cases from patches that fix the regressions. How often
did I encounter problems to apply the diff from public-inbox? I cannot
count the number that happened/happens to me. And usually the diff does
not apply, many more times in the part that touches libgit.a code than in
the part that touches the test suite.

And how many times did `git am -3` not help me because I lacked the
prerequisite objects? Again, I cannot count that number, it
happend/happens *so* often.

But the regression test patches usually apply cleanly, or if they don't,
chances are that I have the prerequisite objects, because we rarely change
regression tests locally, we only add to them. And even if I do not have
those prerequisite objects, those patches usually *add* code, so it is
easier to resolve the problems.

Note: I sometimes want the regression test locally not so much to verify
that it demonstrates a current bug, but sometimes to look at the code
flow, to see which parts of the code are executed. That is why I like to
have that part locally much more often than I need the actual fix locally.

So yes, the easier I can make that process, the better.

Side note: for some time, our team tried to track all patches (whether
they made their way into a branch at https://github.com/gitster/git or
not) in the form of branches in an internal fork. Stolee drove that
project forward, and it worked for the most part. This might be something
we would actually want in a public fork, maybe even with a simple web app
to find the branch corresponding to a given Message-ID, or some such. But
I digress.

In any case, before we get better tooling to work around these issues, I
still think it makes a ton of sense to encourage proper separation of
concerns: to keep patches that introduce regression tests demonstrating a
breakage separate from patches that fix the breakage. It would certainly
help me (e.g. when staring at a range diff).

It certainly would make it easier for me to justify the time I spend on
reviews, because I could review more patch series in the same amount of
time.

So hopefully we can find some middle ground between your endeavor to keep
the contributions compact, and my desire to separate concerns better (even
at the expense of number of patches; I think commits are cheap, cheaper
than reviewers' time at least, for sure).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: separating regression test patches from fixes, was Re: [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-14 14:09       ` separating regression test patches from fixes, was " Johannes Schindelin
@ 2019-03-14 14:30         ` Duy Nguyen
  2019-03-18  0:46           ` Junio C Hamano
  2019-03-18  8:07           ` Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Duy Nguyen @ 2019-03-14 14:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Phillip Wood, Git Mailing List

On Thu, Mar 14, 2019 at 9:10 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> In any case, before we get better tooling to work around these issues, I
> still think it makes a ton of sense to encourage proper separation of
> concerns: to keep patches that introduce regression tests demonstrating a
> breakage separate from patches that fix the breakage. It would certainly
> help me (e.g. when staring at a range diff).

Then perhaps improve the tools now because these separate patches
enter 'master' and stay in the history forever. One of the problems I
have with separating tests from the actual code change is the
description of the problem often stays on the test commit, which I
can't see in git-log if I'm searching for the code change. And no
sometimes I can't just look at the parent commit if I filter code by
path (and with --full-diff)
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: separating regression test patches from fixes, was Re: [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-14 14:30         ` Duy Nguyen
@ 2019-03-18  0:46           ` Junio C Hamano
  2019-03-21 16:43             ` Johannes Schindelin
  2019-03-18  8:07           ` Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2019-03-18  0:46 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Johannes Schindelin, Phillip Wood, Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> On Thu, Mar 14, 2019 at 9:10 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> In any case, before we get better tooling to work around these issues, I
>> still think it makes a ton of sense to encourage proper separation of
>> concerns: to keep patches that introduce regression tests demonstrating a
>> breakage separate from patches that fix the breakage. It would certainly
>> help me (e.g. when staring at a range diff).
>
> Then perhaps improve the tools now because these separate patches
> enter 'master' and stay in the history forever. One of the problems I
> have with separating tests from the actual code change is the
> description of the problem often stays on the test commit, which I
> can't see in git-log if I'm searching for the code change. And no
> sometimes I can't just look at the parent commit if I filter code by
> path (and with --full-diff)

I do not think the phrase "separation of concerns" is applied
correctly here.

The concern of a simple single-patch is to fix a bug---the test that
shows what bug was fixed and protects the code by ensuring that a
reintroduction of the bug gets noticed is an integral part of it.
It does not make much sense to artificially split it into two.

It's like arguing for adding declarations for new functions and
global variables in *.h files in a step before a separate step adds
their implementations in *.c files, claiming that the first step
designs the interface (which is sufficient for writing the client
code) and the second one gives an implementation of the interface
(which can be replaced later), and they address two separate
concerns.

And then you would find that there are some compilers that warn
against unimplemented functions and undefined variables.  The
"solution" would be to enclose the whole thing that was added in the
first "*.h only patch" inside "#if 0/#endif" to hide it from the
compiler ;-) That in fact looks quite similar to how "test only
patch" marks the new tests as expecting failure in the beginning.
Neither is truly useful when applied to a context different from
where it is originally developed for.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: separating regression test patches from fixes, was Re: [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-14 14:30         ` Duy Nguyen
  2019-03-18  0:46           ` Junio C Hamano
@ 2019-03-18  8:07           ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2019-03-18  8:07 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Johannes Schindelin, Phillip Wood, Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> .... One of the problems I
> have with separating tests from the actual code change is the
> description of the problem often stays on the test commit, which I
> can't see in git-log if I'm searching for the code change.

In the message you are reponding to, Dscho made it sound as if I am
reviewing only from my MUA, but most of my reviews are done after
the patches are tentatively applied (it is a separate issue if the
result is found worth keeping and merged to 'pu'), so our workflows
are not so different.  It is not like "must have them separate" is
the need shared among those who prefer to review in-tree.  I do not
want a logically single patch split into two.

And I find your "find the other half" of a pair of patch that is
artificially split into two a real problem for me, too.  If you
split a single patch into two, depending on which half you find
first, finding the other half is either trivial (as you can just
follow the parent pointer to $THAT_THING^) or hard (you'd need to
have something that binds everything together, like 'pu', and grep
for "git log ..pu" trying to find what its most relevant child is).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: separating regression test patches from fixes, was Re: [PATCH 3/3] cherry-pick --continue: remember options
  2019-03-18  0:46           ` Junio C Hamano
@ 2019-03-21 16:43             ` Johannes Schindelin
  0 siblings, 0 replies; 10+ messages in thread
From: Johannes Schindelin @ 2019-03-21 16:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Phillip Wood, Git Mailing List

Hi Junio,

On Mon, 18 Mar 2019, Junio C Hamano wrote:

> Duy Nguyen <pclouds@gmail.com> writes:
>
> > On Thu, Mar 14, 2019 at 9:10 PM Johannes Schindelin
> > <Johannes.Schindelin@gmx.de> wrote:
> >> In any case, before we get better tooling to work around these issues, I
> >> still think it makes a ton of sense to encourage proper separation of
> >> concerns: to keep patches that introduce regression tests demonstrating a
> >> breakage separate from patches that fix the breakage. It would certainly
> >> help me (e.g. when staring at a range diff).
> >
> > Then perhaps improve the tools now because these separate patches
> > enter 'master' and stay in the history forever. One of the problems I
> > have with separating tests from the actual code change is the
> > description of the problem often stays on the test commit, which I
> > can't see in git-log if I'm searching for the code change. And no
> > sometimes I can't just look at the parent commit if I filter code by
> > path (and with --full-diff)
>
> I do not think the phrase "separation of concerns" is applied
> correctly here.

You might currently still think so, so let me give you material to
re-think that stance:

- If you truly care about contributors of all stripes, you also care about
  contributors who report bugs, contributors who turn bug reports into
  reproducible regression test cases, contributors who verify regressions,
  etc.

  And if you really want to give your respect to, say, the contributors
  who provide such regression test cases, you will accept them as full
  commit authors.

  And that fact alone makes it a separate concern, notwithstanding that in
  my case, the contributor who wrote the regression test case and the
  contributor who fixed the regression happened to be the same person: me.
  That fact simply does not invalidate that there are different concerns at
  play.

  You probably even know people who are good at one of those concerns, but
  not at the other.

- If it true what you say, that a bug fix should be in the same commit as
  the regression test case designed to prevent the bug from being
  reintroduced, then both should live in the same file, because it is
  really the same concern.

  But test code lives in a separate area of the source code, is not even
  "shipped with the product", *because* it is a separate concern.

- There is an entire industry built on books and training advertising
  Test Driven Development (TDD). If you really believe what you said, that
  regression tests are really the same concern as fixing the regressions,
  then you claim that all of them are wrong. Which would be a, how can I
  say it, erm, bold statement indeed.

- You did brush aside the example I gave: Hannes Sixt applied a patch
  that demonstrated the breakage, or more correctly: that *should* have
  demonstrated a breakage.

  Let's not brush aside this example. Let's not ignore it just like that.

  This commit addressed a very specific concern: it verified that there is
  a breakage. Or actually, that in this case, there was *no* breakage.

  By not separating those concerns, as you would have me do it, I would
  have made it harder to demonstrate that this bug fix was not even necessary
  in git.git's `master`. I would have made it harder to bisect *where* in
  Git for Windows' patch thicket the regression was introduced.

  You might now say that it should have been easy to separate the concern
  after the fact, to extract it from a commit that adds both fix and
  regression test case.

  But by that token, you should squash all contributions into a single, big,
  honking commit that do way too much "because it is easy to extract what
  you need". I agree, that sounds ridiculous.

  And even saying that it would be easy to extract the test case from the
  commit already admits that there is a good chunk that you can extract
  *that can live on its own*, i.e. is a separate concern.

- We introduced, specifically, the shell function `test_expect_failure` to
  allow demonstrating breakages. If we really thought that that has no
  value, if there should not be any commit that introduces a regression
  test case without fixing any bug at the same time, if that was not a
  concern, then we should get rid of that function. But experience tells us,
  of course, that we need to keep it.

- There is *semantic* meaning in `test_expect_failure`. Depriving the
  commit history from the commits that introduce such test cases
  demonstrating bugs, and from commits that fix those bugs, makes it
  impossible to perform otherwise simple queries along the lines "how many
  regressions did <insert-name-here> fix in September 2017?".

  I know you are a fan of software archaeology as much as I am (maybe even
  a bit more), so hopefully you find at least this argument convincing.

> The concern of a simple single-patch is to fix a bug---the test that
> shows what bug was fixed and protects the code by ensuring that a
> reintroduction of the bug gets noticed is an integral part of it.

By this reasoning, it should not be possible (or allowed) to write and
contribute a patch that merely demonstrates that something is wrong with
Git, but does not fix it.

That position is untenable.

> It does not make much sense to artificially split it into two.

Let's not brush aside the above-mentioned example, okay?

Let's consider how much easier it was for Hannes (and for that matter, how
much easier it is for *me*) to grab a patch that demonstrates a breakage,
apply it, and the report back whether I can confirm the breakage or not.

It would be *even* easier, of course, if I could just fetch a branch and
cherry-pick, but that is not possible in a project that is centered around
a mailing list. So that's just not an option here.

But back to the point: there is a *lot* of value in having an easy way to
reproduce breakages. Conflating this value with the value of a bug fix
does a disservice to everybody involved.

There was nothing artificial about splitting this patch into two.

Even worse: that commit was never artificially split to begin with! What
you assumed was incorrect: those two commits were always two commits. I
came up with the regression test case first, to convince myself that there
was something that needed fixing (much in the same way as Hannes wanted to
convince himself that there was something that needed fixing, and the fact
that that commit was separate from the fix made it very easy for Hannes to
point out that there was actually nothing in git.git yet that needed
fixing).

And if you are still averse to the idea of separating the concern of
demonstrating a breakage from the concern to fix it, I have really,
really, really bad news for you: a couple of my branches that I intend to
contribute as patch series, once they are finalized, *already* have those
commits that demonstrate breakages, *without* commits to fix them!

I am sorry if that made you shudder ;-)

You even saw at least one of them:
https://public-inbox.org/git/cab7bb36eb85dbe38ad95ee02b083f11f0820e24.1533421100.git.gitgitgadget@gmail.com/

But here is a silver lining for you: There is a real advantage of having
this regression test case in a separate patch *for you* (and other
reviewers): That regression test case will most likely not change in
subsequent iterations of the patch series, while the fix will most likely
look *very* different. And by having the regression test case in a
different patch than the regression fix, you will have a very easy time
verifying that it is unchanged from before, that you do not have to
re-review it.

(Although you might have to re-review it if you do not remember that patch
from the beginning of August, last year.)

> It's like arguing for adding declarations for new functions and
> global variables in *.h files in a step before a separate step adds
> their implementations in *.c files, claiming that the first step
> designs the interface (which is sufficient for writing the client
> code) and the second one gives an implementation of the interface
> (which can be replaced later), and they address two separate
> concerns.

No, that is an apples-to-oranges comparison because the declarations
themselves cannot really be used for anything, while the regression test
cases demonstrating a breakage *can* be used for:

- validating the claim that there is a breakage,

- bisecting,

- debugging,

- implementing a fix,

- verifying the claim that a given Git version has this breakage, or
  disproving it.

> And then you would find that there are some compilers that warn
> against unimplemented functions and undefined variables.

Indeed. And I often find myself between a rock and a hard place when I
want to wrap my patch series into nice, reviewable, logically separate
patches, and I cannot, because we specifically discourage file-local
(`static`) functions without users. Although we have not the faintest
problem with *global* functions without users (which makes no sense,
either disallow both file-local *and* global functions without users, or
allow both):

In b0fa1a3f9974 (test-lib-functions.sh: add generate_zero_bytes function,
2019-02-09), a function was introduced that was then consumed in
24b451e77c7d (t5318: replace use of /dev/zero with generate_zero_bytes,
2019-02-09).

In f1f5de442faf (revision: add mark_tree_uninteresting_sparse,
2019-01-16), we introduced a function that was consumed in the following
commit 4f6d26b16703 (list-objects: consume sparse tree walk, 2019-01-16).

In 47edb649973c (hex: introduce functions to print arbitrary hashes,
2018-11-14), the hash_to_hex*() functions were added, to be used only in
subsequent commits.

And the list goes on.

So in this instance, your argument is that we have to please the compiler,
even if it does not make any logical sense for humans.

After all, a patch series has a temporal order. It has a "story arc", so
to say, or in musical terms, it is a "slur". The compiler does not know
that a function will be used, or a regression test will be marked as
fixed, in a future patch. The human reviewer would. But we actively make
it harder on the human reviewer, by combining, say, the (rather large)
implementation of a linear assignment problem solver into the same commit
as its only user (also large), just to please the compiler.

Or at least we would have made it harder on the human reviewer if we had
not used a *trick* to *fool* the compiler into not complaining: the
implementation was put into the separate file file linear-assignment.c,
*even* if it has *but a single user*!

And since you actually committed this in that form, you already proved
that you agree with me. You also did not find it better to smoosh separate
concerns into the same commit. You had no objection to apply a patch that
had no value whatsoever on its own, that only realizes its value together
with a subsequent commit actually introducing a caller to that
functionality. Which is actually worse than a commit doing nothing else
than introducing a single regression test case that demonstrates a bug,
which does have value on its own.

> The "solution" would be to enclose the whole thing that was added in the
> first "*.h only patch" inside "#if 0/#endif" to hide it from the
> compiler ;-)

Yes, if we really wanted to cater to the compiler, rather than human
readers, then we would indeed do that.

Plus, let's not forget about the fact that this is an apples to oranges
comparison. More on that below.

> That in fact looks quite similar to how "test only patch" marks the new
> tests as expecting failure in the beginning. Neither is truly useful
> when applied to a context different from where it is originally
> developed for.

If you really want to hear what the equivalent to your `.h only commit`
would be, here you go. It would look something like:

-- snip --
diff --git t/...
@@ ... @@
 '

+test_expect_failure 'demonstrate a horrible bug' '
+	false # will be implemented later
+'
+
 test_done
-- snap --

*That* is the equivalent to your example of introducing just the function
signatures, leaving the actual implementation to a later commit.

You will note that this looks very different from what I am advertising,
where the actual implementation that does demonstrate a horrible bug (and
will hopefully gain a second life after the bug is fixed by keeping the
bug fixed) is added at the same time as the `test_expect_failure`.

And if you really, really, really needed another argument in favor of
accepting regression test-only patches as standalone commits: take the
unfortunate example of
https://public-inbox.org/git/CA+h-Bnuf6u=hkPBcxhMm06FbfkS+jtrozu+inqqmUY1cNkXrWQ@mail.gmail.com/

It might not have mattered in this case had you stopped discouraging
contributions of patches that merely add test cases demonstrating
breakages. But it could not have helped, for sure.

If we keep discouraging such patches (and note that I specifically
*encouraged* such a patch instead [*1*]), if we keep insisting that a
regression test case on its own is of no value, we will get even less of
those contributions.

And we will be repeating the same type of story again, where a bug was
reported, no patch was contributed (let alone applied) to demonstrate the
bug, the bug will be fixed, only to be broken *just in time* for the next
official Git version.

That is not what I want.

And that is why I keep considering it a bad practice to discourage
separating commits adding regression tests from the corresponding commits
that fix the regressions.

As I said, it might not have mattered in this instance. Or in the next
one. Or even in many cases.

But encouraging that separation of concerns, acknowledging that there is a
real, measurable value not only in knowing about a bug, but in addition
having an automated way to reproduce the bug (and to verify a bug fix),
*this* will get us more of those contributions rather than less.

Phew. What a long mail. I am glad you made it this far.

Let me end by saying that laying out all of the above, I do not intend to
annoy you or anything. I simply care about this project, and I want it to
run as smoothly as possible. And I really think that the separation of
concerns that we are talking about is one of the ways to make it run
smoother. To make it easier on contributors. To make it easier on
reviewers. To make it easier on developers validating bugs.

Ciao,
Dscho

Footnote *1*:
https://public-inbox.org/git/nycvar.QRO.7.76.6.1901091501320.41@tvgsbejvaqbjf.bet/

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-03-21 16:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-13 18:26 [PATCH 1/3] sequencer: break some long lines Phillip Wood
2019-03-13 18:26 ` [PATCH 2/3] cherry-pick: demonstrate option amnesia Phillip Wood
2019-03-13 18:26 ` [PATCH 3/3] cherry-pick --continue: remember options Phillip Wood
2019-03-13 22:45   ` Johannes Schindelin
2019-03-14  5:01     ` Junio C Hamano
2019-03-14 14:09       ` separating regression test patches from fixes, was " Johannes Schindelin
2019-03-14 14:30         ` Duy Nguyen
2019-03-18  0:46           ` Junio C Hamano
2019-03-21 16:43             ` Johannes Schindelin
2019-03-18  8:07           ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).