git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Emily Shaffer <emilyshaffer@google.com>,
	Victoria Dye <vdye@github.com>
Subject: Re: [PATCH v3 2/3] CodingGuidelines: hint why we value clearly written log messages
Date: Thu, 14 Apr 2022 16:04:59 +0200	[thread overview]
Message-ID: <220414.86lew7d7tb.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <xmqq8rs82m4f.fsf@gitster.g>


On Wed, Apr 13 2022, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Emily Shaffer <emilyshaffer@google.com> writes:
>>
>>>> + - Log messages to explain your changes are as important as the
>>>> +   changes themselves.  Clearly written code and in-code comments
>>>> +   explain how the code works and what is assumed from the surrounding
>>>> +   context.  The log messages explain what the changes wanted to
>>>> +   achieve and why the changes were necessary (more on this in the
>>>> +   accompanying SubmittingPatches document).
>>>> +
>>>
>>> One thing not listed here, that I often hope to find from the commit
>>> message (and don't), is "why we did it this way instead of <other way>".
>>> I am not sure how to phrase it in this document, though. Maybe:
>>>
>>>   The log messages explain what the changes wanted to achieve, any
>>>   decisions that were made between alternative approaches, and why the
>>>   changes were necessary (more on this in blah blah)
>>>
>>> Or maybe "...whether any alternative approaches were considered..." fits
>>> the form of the surrounding sentence better.
>>
>> Quite valid observation.
>>
>> Documentation/SubmittingPatches::meaningful-message makes a note on
>> these points, and the above may want to be more aligned to them.
>>
>> Patches welcome, as these have long been merged to 'master/main'.
>
> Another thing.  If you (not Emily, but figuratively) haven't watched
> Victoria's talk https://www.youtube.com/watch?v=4qLtKx9S9a8 on the
> topic of clearly written commits, you should drop everything you are
> doing and go watch it.
>
> And with what we learn from it, we may be able to rewrite this part
> of the documentation much more clearly.

The slides for it are at
https://vdye.github..io/2022/OS101-Writing-Commits.pdf (not in the video
description, but at the very end of the video).

It's easy to nitpick/improve existing examples, so here goes :)

The main commit message example in that talk starts as just "Make error
text more helpful", and ends with a better version as:

	git-portable.sh: make error text more helpful
	
	The message “Not a valid command: <invalid command>” is
	intended to notify the user that their subcommand is invalid.
	However, when no subcommand is given, the "empty" subcommand
	results in the same message: "Not a valid command:". This does
	not clearly guide the user to the correct behavior, so print
	"Please specify a command" when no subcommand is specified.

For our CodingGuidelines I think it would be useful to have some version
of "if you can explain something with prose or tests, prefer
tests".

I.e. other things being equal I'd much prefer this version
(pseudo-patch):

	git-portable.sh: don't conflate invalid and non-existing command

	 git-portable-test.sh | 2 +-
	 1 file changed, 1 insertion(+), 1 deletion(-)
	
	diff --git a/git-portable-test.sh b/git-portable-test.sh
	index c8bd464..e03f4a8 100644
	--- a/git-portable-test.sh
	+++ b/git-portable-test.sh
	@@ -5,7 +5,7 @@ test_expect_failure 'usage: invalid command' '
	 '
	 
	 test_expect_failure 'usage: no command' '
	-	test_expect_code_output 129 "Not a valid command: " ./gitportable.sh
	+	test_expect_code_output 129 "Please specify a command" ./gitportable.sh
	 '
	 
	 test_done

It ends up basically saying the same thing, but now we're saying it with
a regression test (test_expect_code_output doesn't exist, but let's
pretend it's test_expect_code + a test_cmp-alike).

What it does entirely omit is the "why".

Now I realize I'm nitpicking a slide shown at a conference, which by its
nature needs to show a small pseudo-example, but I think this applies in
general:

While "why" is a good rule of thumb I think it's just as important to
know when not to include explanations and when to include one.

For cases where something is straightforward enough (as in this case,
the RHS of ": " is clearly missing) I'd think omitting the explanation
would be better, as we should also be concerned about the overall signal
ratio.

(Now, if anyone glances at my own commit messages they'll see I'm
thoroughly in "throwing rocks from a glass house" territory here :) I'm
not saying I'm consistency practicing what I'm preaching).

But just like comments there's no right answer, when one person thinks
an explanation is different from another.

But it is unambiguously the case that we can often replace prose with
tests, and in those cases we should almost always prefer that.

It's also the case that even if everyone agrees that a "why" is needed
there's multiple ways to store that information. One is via commit
messages, another would e.g. be that same commit updating some shared
guidelines about goals/examples of CLI usage.

So in this case, if a Documentation/CodingGuidelines had clear examples
of preferred usage, we could just point briefly point to that as
rationale.

While git's commit messages are excellent, I think that's one area where
we really need improvement. It's rare to dig into some old code where no
rationale can be found for it, either in the commit itself, or in the
preceding ML discussion.

But it's unfortunately (at least in my experience) more often than not
the case that you really do need to consult those commit messages or ML
archives, even for things that have come up a *lot* of times, they were
just never documented in-tree.

There's all sorts of reasons for that which are not the result of any
person doing anything wrong, but I do think it's something we could and
should focus more on as a project.

The barriers of entry for adding documentation or adjusting existing
documentation are much higher than adding a one-off explanation in a
commit message.

Partially (and probably mostly) that's a really good thing, but I can't
help but wonder if we're getting that balance right given the (in my
subjective experience) end result of us often lacking good docs, while
we're not lacking if one searches for replacements for those docs in
commit messages or the ML archive.

One more thing that I think is not explicitly covered (I skimmed the
slides, but haven't gone throug the full back yet): Minimizing diffs.

E.g. the talk shows 287fd17e3a1 (sparse-index: prevent repo root from
becoming sparse, 2022-03-01) as an example, which has this hunk:
	
	diff --git a/dir.c b/dir.c
	index d91295f2bcd..a136377eb49 100644
	--- a/dir.c
	+++ b/dir.c
	@@ -1463,10 +1463,11 @@ static int path_in_sparse_checkout_1(const char *path,
	 	const char *end, *slash;
	 
	 	/*
	-	 * We default to accepting a path if there are no patterns or
	-	 * they are of the wrong type.
	+	 * We default to accepting a path if the path is empty, there are no
	+	 * patterns, or the patterns are of the wrong type.
	 	 */
	-	if (init_sparse_checkout_patterns(istate) ||
	+	if (!*path ||
	+	    init_sparse_checkout_patterns(istate) ||
	 	    (require_cone_mode &&
	 	     !istate->sparse_checkout_patterns->use_cone_patterns))
	 		return 1;

I think this is a worthwhile thing to consider as a replacement:
	
	diff --git a/dir.c b/dir.c
	index d91295f2bcd..93a2320ae57 100644
	--- a/dir.c
	+++ b/dir.c
	@@ -1466,7 +1466,8 @@ static int path_in_sparse_checkout_1(const char *path,
	 	 * We default to accepting a path if there are no patterns or
	 	 * they are of the wrong type.
	 	 */
	-	if (init_sparse_checkout_patterns(istate) ||
	+	if (!*path || /* we consider an empty pattern to be no pattern */
	+	    init_sparse_checkout_patterns(istate) ||
	 	    (require_cone_mode &&
	 	     !istate->sparse_checkout_patterns->use_cone_patterns))
	 		return 1;

I.e. trying to optimize for smaller diffs whenever possible. It this
case the word-diff for the original is:

        /*
         * We default to accepting a path if {+the path is empty,+} there are no
         {+*+} patterns{+,+} or [-* they-]{+the patterns+} are of the wrong type.
         */

Now, obviously another small isolated example that's not worth
nitpicking in itself, but just serves to make a larger point. It's clear
why the rephrasing was done in that case, because the patch adds the
"!*path" check, so it makes sense a-priory to have the comment reflect
that.

But one thing where advice about "narrative structure" and good prose
tends to break down when it comes to software development is that we're
much more focused on reviews of incremental additions than many other
fields, where it tends to be more about the final product.

  reply	other threads:[~2022-04-14 15:49 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-23 20:37 [RFC] Contributor doc: more on the proposed log message Junio C Hamano
2022-01-23 21:32 ` brian m. carlson
2022-01-26 11:07   ` Kaartic Sivaraam
2022-01-26 23:42 ` [PATCH v2 0/3] contributor doc update around log messages Junio C Hamano
2022-01-26 23:42   ` [PATCH v2 1/3] SubmittingPatches: write problem statement in the log in the present tense Junio C Hamano
2022-01-26 23:42   ` [PATCH v2 2/3] CodingGuidelines: hint why we value clearly written log messages Junio C Hamano
2022-01-27  7:31     ` Eric Sunshine
2022-01-27 18:35       ` Junio C Hamano
2022-01-26 23:42   ` [PATCH v2 3/3] SubmittingPatches: explain why we care about " Junio C Hamano
2022-01-27 19:02   ` [PATCH v3 0/3] contributor doc update around " Junio C Hamano
2022-03-04  0:12     ` Emily Shaffer
2022-01-27 19:02   ` [PATCH v3 1/3] SubmittingPatches: write problem statement in the log in the present tense Junio C Hamano
2022-03-03 23:59     ` Emily Shaffer
2022-03-04  0:23       ` Junio C Hamano
2022-03-04 23:41         ` Emily Shaffer
2022-01-27 19:02   ` [PATCH v3 2/3] CodingGuidelines: hint why we value clearly written log messages Junio C Hamano
2022-03-04  0:07     ` Emily Shaffer
2022-03-04  0:27       ` Junio C Hamano
2022-04-14  6:51         ` Junio C Hamano
2022-04-14 14:04           ` Ævar Arnfjörð Bjarmason [this message]
2022-04-19 22:53             ` Emily Shaffer
2022-04-20  8:23               ` Junio C Hamano
2022-01-27 19:02   ` [PATCH v3 3/3] SubmittingPatches: explain why we care about " Junio C Hamano
2022-03-04  0:10     ` Emily Shaffer
2022-03-04  0:29       ` Junio C Hamano
2022-03-04  9:52         ` log messages > comments (was: [PATCH v3 3/3] SubmittingPatches: explain why we care about log messages) Ævar Arnfjörð Bjarmason
2022-03-04 19:41           ` log messages > comments Junio C Hamano
2022-03-04 21:30             ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=220414.86lew7d7tb.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).