git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/2] i18n: improve translatability of ambiguous object output
@ 2021-10-04  1:42 Ævar Arnfjörð Bjarmason
  2021-10-04  1:42 ` [PATCH 1/2] object-name tests: tighten up advise() output test Ævar Arnfjörð Bjarmason
                   ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04  1:42 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

This series improves the translatability of the output we emit when an
ambiguous OID is given by not emitting it a line-at-a-time. This
likely won't matter in practice except for RTL languages (of which we
have no current translations), but it's good to be future-proof!

Ævar Arnfjörð Bjarmason (2):
  object-name tests: tighten up advise() output test
  object-name: make ambiguous object output translatable

 object-name.c                       | 53 ++++++++++++++++++++++++-----
 t/t1512-rev-parse-disambiguation.sh | 16 ++++-----
 2 files changed, 52 insertions(+), 17 deletions(-)

-- 
2.33.0.1404.g7bcfc82b295



* [PATCH 1/2] object-name tests: tighten up advise() output test
  2021-10-04  1:42 [PATCH 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
@ 2021-10-04  1:42 ` Ævar Arnfjörð Bjarmason
  2021-10-04  2:52   ` Eric Sunshine
  2021-10-04  7:05   ` Jeff King
  2021-10-04  1:42 ` [PATCH 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
  2021-10-04 14:27 ` [PATCH v2 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2 siblings, 2 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04  1:42 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Change tests added in 1ffa26c4614 (get_short_sha1: list ambiguous
objects on error, 2016-09-26) to only care about the OIDs that are
listed, which is what the test is trying to check for.

This isn't needed by the subsequent commit, which won't change any of
the output, but a mere tightening of the tests assertions to more
closely match what we really want to test for here.

Now if the advise() message itself were change the phrasing around the
list of OIDs we won't have this test break. We're assuming that such
output won't have a need to indent anything except the OIDs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1512-rev-parse-disambiguation.sh | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 7891a6becf3..d3a2d9188c7 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -334,16 +334,16 @@ test_expect_success 'ambiguity errors are not repeated (peel)' '
 
 test_expect_success 'ambiguity hints' '
 	test_must_fail git rev-parse 000000000 2>stderr &&
-	grep ^hint: stderr >hints &&
-	# 16 candidates, plus one intro line
-	test_line_count = 17 hints
+	grep "^hint:   " stderr >hints &&
+	# 16 candidates, minus surrounding prose
+	test_line_count = 16 hints
 '
 
 test_expect_success 'ambiguity hints respect type' '
 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
-	grep ^hint: stderr >hints &&
-	# 5 commits, 1 tag (which is a committish), plus intro line
-	test_line_count = 7 hints
+	grep "^hint:   " stderr >hints &&
+	# 5 commits, 1 tag (which is a committish), minus surrounding prose
+	test_line_count = 6 hints
 '
 
 test_expect_success 'failed type-selector still shows hint' '
@@ -352,8 +352,8 @@ test_expect_success 'failed type-selector still shows hint' '
 	echo 851 | git hash-object --stdin -w &&
 	echo 872 | git hash-object --stdin -w &&
 	test_must_fail git rev-parse ee3d^{commit} 2>stderr &&
-	grep ^hint: stderr >hints &&
-	test_line_count = 3 hints
+	grep "^hint:   " stderr >hints &&
+	test_line_count = 2 hints
 '
 
 test_expect_success 'core.disambiguate config can prefer types' '
-- 
2.33.0.1404.g7bcfc82b295



* [PATCH 2/2] object-name: make ambiguous object output translatable
  2021-10-04  1:42 [PATCH 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2021-10-04  1:42 ` [PATCH 1/2] object-name tests: tighten up advise() output test Ævar Arnfjörð Bjarmason
@ 2021-10-04  1:42 ` Ævar Arnfjörð Bjarmason
  2021-10-04  7:35   ` Jeff King
  2021-10-04 14:27 ` [PATCH v2 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04  1:42 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] to be more friendly to translators. By being able to
customize the sprintf formats we're even ready for RTL languages.

1. ef9b0370da6 (sha1-name.c: store and use repo in struct
   disambiguate_state, 2019-04-16)
2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 53 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 44 insertions(+), 9 deletions(-)

diff --git a/object-name.c b/object-name.c
index fdff4601b2c..7e7f671e337 100644
--- a/object-name.c
+++ b/object-name.c
@@ -351,9 +351,16 @@ static int init_object_disambiguation(struct repository *r,
 	return 0;
 }
 
+struct show_ambiguous_state {
+	const struct disambiguate_state *ds;
+	struct strbuf *advice;
+};
+
 static int show_ambiguous_object(const struct object_id *oid, void *data)
 {
-	const struct disambiguate_state *ds = data;
+	struct show_ambiguous_state *state = data;
+	const struct disambiguate_state *ds = state->ds;
+	struct strbuf *advice = state->advice;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 
@@ -366,18 +373,34 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, " %ad - %s", &desc, &pp);
+			format_commit_message(commit, _(" %ad - %s"), &desc, &pp);
 		}
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		if (!parse_tag(tag) && tag->tag)
-			strbuf_addf(&desc, " %s", tag->tag);
+			strbuf_addf(&desc, _(" %s"), tag->tag);
 	}
 
-	advise("  %s %s%s",
-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type) ? type_name(type) : "unknown type",
-	       desc.buf);
+	strbuf_addf(advice,
+		    /*
+		     * TRANSLATORS: This is a line of ambiguous object
+		     * output. E.g.:
+		     *
+		     *    "deadbeef commit 2021-01-01 - Some Commit Message\n"
+		     *    "deadbeef tag Some Tag Message\n"
+		     *    "deadbeef tree\n"
+		     *
+		     * I.e. the first argument is a short OID, the
+		     * second is the type name of the object, and the
+		     * third a description of the object, if it's a
+		     * commit or tag. In that case the " %ad - %s" and
+		     * " %s" formats above will be used for the third
+		     * argument.
+		     */
+		    _("  %s %s%s\n"),
+		    repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
+		    type_name(type) ? type_name(type) : "unknown type",
+		    desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
@@ -475,7 +498,12 @@ static enum get_oid_result get_short_oid(struct repository *r,
 	}
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
+		struct strbuf sb = STRBUF_INIT;
 		struct oid_array collect = OID_ARRAY_INIT;
+		struct show_ambiguous_state as = {
+			.ds = &ds,
+			.advice = &sb,
+		};
 
 		error(_("short object ID %s is ambiguous"), ds.hex_pfx);
 
@@ -488,12 +516,19 @@ static enum get_oid_result get_short_oid(struct repository *r,
 		if (!ds.ambiguous)
 			ds.fn = NULL;
 
-		advise(_("The candidates are:"));
 		repo_for_each_abbrev(r, ds.hex_pfx, collect_ambiguous, &collect);
 		sort_ambiguous_oid_array(r, &collect);
 
-		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+		if (oid_array_for_each(&collect, show_ambiguous_object, &as))
 			BUG("show_ambiguous_object shouldn't return non-zero");
+
+		/*
+		 * TRANSLATORS: The argument is the list of ambiguous
+		 * objects composed in show_ambiguous_object(). See
+		 * its "TRANSLATORS" comment for details.
+		 */
+		advise(_("The candidates are:\n\n%s"), sb.buf);
+
 		oid_array_clear(&collect);
 	}
 
-- 
2.33.0.1404.g7bcfc82b295



* Re: [PATCH 1/2] object-name tests: tighten up advise() output test
  2021-10-04  1:42 ` [PATCH 1/2] object-name tests: tighten up advise() output test Ævar Arnfjörð Bjarmason
@ 2021-10-04  2:52   ` Eric Sunshine
  2021-10-04  7:05   ` Jeff King
  1 sibling, 0 replies; 81+ messages in thread
From: Eric Sunshine @ 2021-10-04  2:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King

On Sun, Oct 3, 2021 at 9:43 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> Change tests added in 1ffa26c4614 (get_short_sha1: list ambiguous
> objects on error, 2016-09-26) to only care about the OIDs that are
> listed, which is what the test is trying to check for.
>
> This isn't needed by the subsequent commit, which won't change any of
> the output, but a mere tightening of the tests assertions to more
> closely match what we really want to test for here.
>
> Now if the advise() message itself were change the phrasing around the

s/were change/were to change/

> list of OIDs we won't have this test break. We're assuming that such
> output won't have a need to indent anything except the OIDs.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>


* Re: [PATCH 1/2] object-name tests: tighten up advise() output test
  2021-10-04  1:42 ` [PATCH 1/2] object-name tests: tighten up advise() output test Ævar Arnfjörð Bjarmason
  2021-10-04  2:52   ` Eric Sunshine
@ 2021-10-04  7:05   ` Jeff King
  1 sibling, 0 replies; 81+ messages in thread
From: Jeff King @ 2021-10-04  7:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano

On Mon, Oct 04, 2021 at 03:42:48AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Change tests added in 1ffa26c4614 (get_short_sha1: list ambiguous
> objects on error, 2016-09-26) to only care about the OIDs that are
> listed, which is what the test is trying to check for.
> 
> This isn't needed by the subsequent commit, which won't change any of
> the output, but a mere tightening of the tests assertions to more
> closely match what we really want to test for here.

I think the next commit does change the output. It adds an extra empty
line which would cause these tests to fail.

> Now if the advise() message itself were change the phrasing around the
> list of OIDs we won't have this test break. We're assuming that such
> output won't have a need to indent anything except the OIDs.

It feels like we're trading one assumption for another. :)

I admit that I don't care much either way, though.

-Peff


* Re: [PATCH 2/2] object-name: make ambiguous object output translatable
  2021-10-04  1:42 ` [PATCH 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-10-04  7:35   ` Jeff King
  2021-10-04  8:26     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 81+ messages in thread
From: Jeff King @ 2021-10-04  7:35 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano

On Mon, Oct 04, 2021 at 03:42:49AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Change the output of show_ambiguous_object() added in [1] and last
> tweaked in [2] to be more friendly to translators. By being able to
> customize the sprintf formats we're even ready for RTL languages.
> 
> 1. ef9b0370da6 (sha1-name.c: store and use repo in struct
>    disambiguate_state, 2019-04-16)
> 2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
>    then SHA-1, 2018-05-10)

I suspect you meant 1ffa26c461 (get_short_sha1: list ambiguous objects
on error, 2016-09-26) for the first one.

I had to stare at the patch for a while to understand the goal here. I
think this would have been a bit easier to review if "change" in your
first sentence was described a bit more. Perhaps:

  The list of candidates output by show_ambiguous_object() is not marked
  for translation. At the very least we want to allow the text "the
  candidates are" to be translated. But we also format individual
  candidate lines like:

      deadbeef commit 2021-01-01 - Some Commit Message

  by formatting the individual components, then using a printf-format to
  arrange them in the correct order. Even though there's no text here to
  be translated, the order and spacing is determined by the format
  string. Allowing that to be translated helps RTL languages.

I have a few comments on the patch itself. The biggest thing is that it
changes the format to add an extra newline (between "The candidates
are:" and the actual list). I don't have a strong opinion on including
that or not, but it seemed unintentional given the comment on the first
commit (and its lack of mention here).

The rest are mostly observations, not criticisms. You can take them with
the appropriate grain of salt given that I don't do translation work
myself, nor know any RTL languages.

> @@ -366,18 +373,34 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>  		if (commit) {
>  			struct pretty_print_context pp = {0};
>  			pp.date_mode.type = DATE_SHORT;
> -			format_commit_message(commit, " %ad - %s", &desc, &pp);
> +			format_commit_message(commit, _(" %ad - %s"), &desc, &pp);
>  		}

Is it OK to use non-printf expansions with the gettext code? Presumably
the translated string would have the same set of placeholders in it, but
my understanding is that gettext may sometimes munge the %-placeholders
(e.g., allowing numbered ones for re-ordering). I admit I don't know how
any of that works, but I just wonder if this "%ad" may cause confusion
(or even if not, if it is even possible to re-order it for an RTL
language).
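
For illustration only (this is not from the patch): a minimal C sketch of
the numbered-placeholder reordering mentioned above, assuming a POSIX
printf. format_candidate() is a hypothetical helper, and the format
strings stand in for what _() might return from a translation. In printf
itself "%a" is a floating-point conversion, so a pretty-format "%ad"
would not mean a date there; the sketch assumes the date has already
been rendered and is passed in as a plain string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: build one candidate line from
 * pieces that were rendered beforehand. "fmt" stands in for the string
 * a translation would supply; it may use POSIX numbered placeholders
 * ("%1$s", "%2$s", ...) to change the order of the fields.
 */
int format_candidate(char *out, size_t len, const char *fmt,
		     const char *oid, const char *date,
		     const char *subject)
{
	return snprintf(out, len, fmt, oid, date, subject);
}
```

With the untranslated format "  %s commit %s - %s" the fields come out in
argument order; a format such as "%3$s - %2$s commit %1$s" prints the
same three arguments in reverse without any change to the C caller.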

>  	} else if (type == OBJ_TAG) {
>  		struct tag *tag = lookup_tag(ds->repo, oid);
>  		if (!parse_tag(tag) && tag->tag)
> -			strbuf_addf(&desc, " %s", tag->tag);
> +			strbuf_addf(&desc, _(" %s"), tag->tag);
>  	}

I wonder whether " %s" is worthwhile as a translatable string. It does
seem to be unique among strings marked for translation, but there are a
ton of non-translated instances. Would context ever matter here?

My impression is that this kind of translation-lego is frowned upon, and
we might be better off repeating ourselves a bit more. I.e., something
like:

  if (commit) {
	  struct strbuf date = STRBUF_INIT;
	  struct strbuf subject = STRBUF_INIT;
	  format_commit_message(commit, "%ad", &date, &pp);
	  format_commit_message(commit, "%s", &subject, &pp);
	  strbuf_addf(advice, _("  %s commit %s - %s\n"),
		      repo_find_unique_abbrev(...),
		      date.buf, subject.buf);
	  strbuf_release(&date);
	  strbuf_release(&subject);
  } else if (type == OBJ_TAG) {
          ...
	  strbuf_addf(advice, _("  %s tag %s\n"),
	              repo_find_unique_abbrev(...), tag->tag);
  } else {
	  /* TRANSLATORS: the fields are abbreviated oid and type */
          strbuf_addf(advice, _("  %s %s\n"),
	              repo_find_unique_abbrev(...), type_name(type));
  }

Though that last one similarly has a real lack of context.
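
A self-contained version of that idea, as a rough sketch rather than
proposed patch text: describe_candidate(), enum obj_kind and the _()
macro below are stand-ins invented for the example (the real code would
use gettext's _() and git's own object types). Each object kind gets one
complete format string, so a translator sees the whole line at once.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for gettext's _() so the sketch compiles on its own. */
#define _(msgid) (msgid)

/* Hypothetical object kinds; git itself uses OBJ_COMMIT, OBJ_TAG, etc. */
enum obj_kind { KIND_COMMIT, KIND_TAG, KIND_OTHER };

/*
 * Hypothetical helper: one whole, translatable format per kind instead
 * of gluing " %s" fragments together.
 */
int describe_candidate(char *out, size_t len, enum obj_kind kind,
		       const char *oid, const char *type_name,
		       const char *date, const char *text)
{
	switch (kind) {
	case KIND_COMMIT:
		return snprintf(out, len, _("  %s commit %s - %s\n"),
				oid, date, text);
	case KIND_TAG:
		return snprintf(out, len, _("  %s tag %s\n"), oid, text);
	default:
		/* The fallback type name is marked for translation too. */
		return snprintf(out, len, _("  %s %s\n"), oid,
				type_name ? type_name : _("unknown type"));
	}
}
```

The cost is some repetition across the three branches; the gain is that
no format string depends on another one for its meaning.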

> -	advise("  %s %s%s",
> -	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
> -	       type_name(type) ? type_name(type) : "unknown type",
> -	       desc.buf);
> +	strbuf_addf(advice,
> +		    /*
> +		     * TRANSLATORS: This is a line of ambiguous object
> +		     * output. E.g.:
> +		     *
> +		     *    "deadbeef commit 2021-01-01 - Some Commit Message\n"
> +		     *    "deadbeef tag Some Tag Message\n"
> +		     *    "deadbeef tree\n"
> +		     *
> +		     * I.e. the first argument is a short OID, the
> +		     * second is the type name of the object, and the
> +		     * third a description of the object, if it's a
> +		     * commit or tag. In that case the " %ad - %s" and
> +		     * " %s" formats above will be used for the third
> +		     * argument.
> +		     */
> +		    _("  %s %s%s\n"),
> +		    repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
> +		    type_name(type) ? type_name(type) : "unknown type",
> +		    desc.buf);

Would you want to translate "unknown type" here, as well? It's probably
not that important in practice, but it seems like a funny omission.

> @@ -488,12 +516,19 @@ static enum get_oid_result get_short_oid(struct repository *r,
>  		if (!ds.ambiguous)
>  			ds.fn = NULL;
>  
> -		advise(_("The candidates are:"));
>  		repo_for_each_abbrev(r, ds.hex_pfx, collect_ambiguous, &collect);
>  		sort_ambiguous_oid_array(r, &collect);
>  
> -		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
> +		if (oid_array_for_each(&collect, show_ambiguous_object, &as))
>  			BUG("show_ambiguous_object shouldn't return non-zero");
> +
> +		/*
> +		 * TRANSLATORS: The argument is the list of ambiguous
> +		 * objects composed in show_ambiguous_object(). See
> +		 * its "TRANSLATORS" comment for details.
> +		 */
> +		advise(_("The candidates are:\n\n%s"), sb.buf);

Here's where the extra newline comes from.

I understand why the earlier ones were changed for RTL languages. But
this one is always line-oriented. Is the point to help bottom-to-top
languages? I can buy that, though it feels like that would be something
that the terminal would deal with (because even with this, you're still
getting the "error:" line printed separately, for example).

I don't think what this is doing is wrong (at first I wondered about the
"hint:" lines, but because advise() looks for embedded newlines, we're
OK). But if the translation doesn't need to reorder things across lines,
this extra format-into-a-strbuf step doesn't seem necessary. We can just
call advise() directly in show_ambiguous_object(), as before.

If it is necessary, then note that you leak "sb" here.
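
To make the embedded-newline point concrete, here is a minimal sketch of
the per-line prefixing. It is not git's advise() implementation;
prefix_hint_lines() is a hypothetical helper that writes into a caller
buffer so the result is easy to inspect. It splits the message on '\n'
and gives every line, including an empty one, its own "hint: " prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the line handling in advise(): copy "msg"
 * into "out", prefixing each line with "hint: ". Returns the number of
 * lines written, or -1 if "out" is too small.
 */
int prefix_hint_lines(char *out, size_t outlen, const char *msg)
{
	int lines = 0;
	size_t used = 0;
	const char *cp = msg;

	if (outlen)
		out[0] = '\0';
	while (*cp) {
		/* Length of the current line, excluding its newline. */
		size_t len = strcspn(cp, "\n");
		int n = snprintf(out + used, outlen - used,
				 "hint: %.*s\n", (int)len, cp);

		if (n < 0 || (size_t)n >= outlen - used)
			return -1; /* output buffer too small */
		used += (size_t)n;
		lines++;
		cp += len;
		if (*cp == '\n')
			cp++; /* step over the newline we split on */
	}
	return lines;
}
```

Fed "The candidates are:\n\n  deadbeef tree\n", the sketch yields three
prefixed lines, the middle one being an otherwise empty "hint: " line;
that is the extra line a "\n\n" in the format would introduce.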

-Peff


* Re: [PATCH 2/2] object-name: make ambiguous object output translatable
  2021-10-04  7:35   ` Jeff King
@ 2021-10-04  8:26     ` Ævar Arnfjörð Bjarmason
  2021-10-04  9:29       ` Jeff King
  0 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04  8:26 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano


On Mon, Oct 04 2021, Jeff King wrote:

> On Mon, Oct 04, 2021 at 03:42:49AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> Change the output of show_ambiguous_object() added in [1] and last
>> tweaked in [2] to be more friendly to translators. By being able to
>> customize the sprintf formats we're even ready for RTL languages.
>> 
>> 1. ef9b0370da6 (sha1-name.c: store and use repo in struct
>>    disambiguate_state, 2019-04-16)
>> 2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
>>    then SHA-1, 2018-05-10)
>
> I suspect you meant 1ffa26c461 (get_short_sha1: list ambiguous objects
> on error, 2016-09-26) for the first one.
>
> I had to stare at the patch for a while to understand the goal here. I
> think this would have been a bit easier to review if "change" in your
> first sentence was described a bit more. Perhaps:
>
>   The list of candidates output by show_ambiguous_output() is not marked
>   for translation. At the very least we want to allow the text "the
>   candidates are" to be translated. But we also format individual
>   candidate lines like:
>
>       deadbeef commit 2021-01-01 - Some Commit Message
>
>   by formatting the individual components, then using a printf-format to
>   arrange them in the correct order. Even though there's no text here to
>   be translated, the order and spacing is determined by the format
>   string. Allowing that to be translated helps RTL languages.
>
> I have a few comments on the patch itself. The biggest thing is that it
> changes the format to add an extra newline (between "The candidates
> are:" and the actual list). I don't have a strong opinion on including
> that or not, but it seemed unintentional given the comment on the first
> commit (and its lack of mention here).

That was unintentional, sorry. Will fix.

> The rest are mostly observations, not criticisms. You can take them with
> the appropriate grain of salt given that I don't do translation work
> myself, nor know any RTL languages.
>
>> @@ -366,18 +373,34 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>>  		if (commit) {
>>  			struct pretty_print_context pp = {0};
>>  			pp.date_mode.type = DATE_SHORT;
>> -			format_commit_message(commit, " %ad - %s", &desc, &pp);
>> +			format_commit_message(commit, _(" %ad - %s"), &desc, &pp);
>>  		}
>
> Is it OK to use non-printf expansions with the gettext code? Presumably
> the translated string would have the same set of placeholders in it, but
> my understanding is that gettext may sometimes munge the %-placeholders
> (e.g., allowing numbered ones for re-ordering). I admit I don't know how
> any of that works, but I just wonder if this "%ad" may cause confusion
> (or even if not, if it is even possible to re-order it for an RTL
> language).

It's not, oops. I missed that, blinders on for the "%ad". Will construct
it in advance and use %s interpolation separately.

>>  	} else if (type == OBJ_TAG) {
>>  		struct tag *tag = lookup_tag(ds->repo, oid);
>>  		if (!parse_tag(tag) && tag->tag)
>> -			strbuf_addf(&desc, " %s", tag->tag);
>> +			strbuf_addf(&desc, _(" %s"), tag->tag);
>>  	}
>
> I wonder whether " %s" is worthwhile as a translatable string. It does
> seem to be unique among strings marked for translation, but there are a
> ton of non-translated instances. Would context ever matter here?
>
> My impression is that this kind of translation-lego is frowned upon, and
> we might be better off repeating ourselves a bit more. I.e., something
> like:
>
>   if (commit) {
> 	  struct strbuf date = STRBUF_INIT;
> 	  struct strbuf subject = STRBUF_INIT;
> 	  format_commit_message(commit, "%ad", &date, &pp);
> 	  format_commit_message(commit, "%s", &subject, &pp);
> 	  strbuf_addf(advice, _("  %s commit %s - %s\n"),
> 		      repo_find_unique_abbrev(...),
> 		      date.buf, subject.buf);
> 	  strbuf_release(&date);
> 	  strbuf_release(&subject);
>   } else if (type == OBJ_TAG) {
>           ...
> 	  strbuf_addf(advice, _("  %s tag %s\n"),
> 	              repo_find_unique_abbrev(...), tag->tag);
>   } else {
> 	  /* TRANSLATORS: the fields are abbreviated oid and type */
>           strbuf_addf(advice, _("  %s %s\n"),
> 	              repo_find_unique_abbrev(...), type_name(type));
>   }
>
> Though that last one similarly has a real lack of context.

Yeah that's better. Will change it to something like that.

>> -	advise("  %s %s%s",
>> -	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
>> -	       type_name(type) ? type_name(type) : "unknown type",
>> -	       desc.buf);
>> +	strbuf_addf(advice,
>> +		    /*
>> +		     * TRANSLATORS: This is a line of ambiguous object
>> +		     * output. E.g.:
>> +		     *
>> +		     *    "deadbeef commit 2021-01-01 - Some Commit Message\n"
>> +		     *    "deadbeef tag Some Tag Message\n"
>> +		     *    "deadbeef tree\n"
>> +		     *
>> +		     * I.e. the first argument is a short OID, the
>> +		     * second is the type name of the object, and the
>> +		     * third a description of the object, if it's a
>> +		     * commit or tag. In that case the " %ad - %s" and
>> +		     * " %s" formats above will be used for the third
>> +		     * argument.
>> +		     */
>> +		    _("  %s %s%s\n"),
>> +		    repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
>> +		    type_name(type) ? type_name(type) : "unknown type",
>> +		    desc.buf);
>
> Would you want to translate "unknown type" here, as well? It's probably
> not that important in practice, but it seems like a funny omission.

Will do.

>> @@ -488,12 +516,19 @@ static enum get_oid_result get_short_oid(struct repository *r,
>>  		if (!ds.ambiguous)
>>  			ds.fn = NULL;
>>  
>> -		advise(_("The candidates are:"));
>>  		repo_for_each_abbrev(r, ds.hex_pfx, collect_ambiguous, &collect);
>>  		sort_ambiguous_oid_array(r, &collect);
>>  
>> -		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
>> +		if (oid_array_for_each(&collect, show_ambiguous_object, &as))
>>  			BUG("show_ambiguous_object shouldn't return non-zero");
>> +
>> +		/*
>> +		 * TRANSLATORS: The argument is the list of ambiguous
>> +		 * objects composed in show_ambiguous_object(). See
>> +		 * its "TRANSLATORS" comment for details.
>> +		 */
>> +		advise(_("The candidates are:\n\n%s"), sb.buf);
>
> Here's where the extra newline.
>
> I understand why the earlier ones were changed for RTL languages. But
> this one is always line-oriented. Is the point to help bottom-to-top
> languages? I can buy that, though it feels like that would be something
> that the terminal would deal with (because even with this, you're still
> getting the "error:" line printed separately, for example).
>
> I don't think what this is doing is wrong (at first I wondered about the
> "hint:" lines, but because advise() looks for embedded newlines, we're
> OK). But if the translation doesn't need to reorder things across lines,
> this extra format-into-a-strbuf step doesn't seem necessary. We can just
> call advise() directly in show_ambiguous_object(), as before.
>
> If it is necessary, then note that you leak "sb" here.

I'll keep that bit as-is, it's not strictly necessary, but it gives
translators a bit more context.


* Re: [PATCH 2/2] object-name: make ambiguous object output translatable
  2021-10-04  8:26     ` Ævar Arnfjörð Bjarmason
@ 2021-10-04  9:29       ` Jeff King
  2021-10-04 11:16         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 81+ messages in thread
From: Jeff King @ 2021-10-04  9:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano

On Mon, Oct 04, 2021 at 10:26:10AM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> +		/*
> >> +		 * TRANSLATORS: The argument is the list of ambiguous
> >> +		 * objects composed in show_ambiguous_object(). See
> >> +		 * its "TRANSLATORS" comment for details.
> >> +		 */
> >> +		advise(_("The candidates are:\n\n%s"), sb.buf);
> >
> > Here's where the extra newline.
> >
> > I understand why the earlier ones were changed for RTL languages. But
> > this one is always line-oriented. Is the point to help bottom-to-top
> > languages? I can buy that, though it feels like that would be something
> > that the terminal would deal with (because even with this, you're still
> > getting the "error:" line printed separately, for example).
> >
> > I don't think what this is doing is wrong (at first I wondered about the
> > "hint:" lines, but because advise() looks for embedded newlines, we're
> > OK). But if the translation doesn't need to reorder things across lines,
> > this extra format-into-a-strbuf step doesn't seem necessary. We can just
> > call advise() directly in show_ambiguous_object(), as before.
> >
> > If it is necessary, then note that you leak "sb" here.
> 
> I'll keep that bit as-is, it's not strictly necessary, but it gives
> translators a bit more context.

If it's just for the context, wouldn't this do the same thing:

  /*
   * TRANSLATORS: This is followed by the list of ambiguous
   * objects composed in show_ambiguous_object(). See its
   * "TRANSLATORS" comments for details.
   */
  advise(_("The candidates are:"));
  ...
  if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
     ...

I.e., leave the code as-is, and just add the extra comment. There is no
need for the extra struct or any change of ordering between this
advise() and the others.

I would think it is worthwhile if we are de-lego-ing a message that is
made in chunks, but in this case we have to construct an opaque "%s"
to represent the individual lines for each object, because we don't know
how many of them there will be.

-Peff

PS In my "something like this" commit message, I indicated that the
   "candidates" message was getting translated, but it actually is
   already translated in the pre-image. So I think we would not need to
   touch that line at all.


* Re: [PATCH 2/2] object-name: make ambiguous object output translatable
  2021-10-04  9:29       ` Jeff King
@ 2021-10-04 11:16         ` Ævar Arnfjörð Bjarmason
  2021-10-04 12:07           ` Jeff King
  0 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04 11:16 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano


On Mon, Oct 04 2021, Jeff King wrote:

> On Mon, Oct 04, 2021 at 10:26:10AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> >> +		/*
>> >> +		 * TRANSLATORS: The argument is the list of ambiguous
>> >> +		 * objects composed in show_ambiguous_object(). See
>> >> +		 * its "TRANSLATORS" comment for details.
>> >> +		 */
>> >> +		advise(_("The candidates are:\n\n%s"), sb.buf);
>> >
>> > Here's where the extra newline.
>> >
>> > I understand why the earlier ones were changed for RTL languages. But
>> > this one is always line-oriented. Is the point to help bottom-to-top
>> > languages? I can buy that, though it feels like that would be something
>> > that the terminal would deal with (because even with this, you're still
>> > getting the "error:" line printed separately, for example).
>> >
>> > I don't think what this is doing is wrong (at first I wondered about the
>> > "hint:" lines, but because advise() looks for embedded newlines, we're
>> > OK). But if the translation doesn't need to reorder things across lines,
>> > this extra format-into-a-strbuf step doesn't seem necessary. We can just
>> > call advise() directly in show_ambiguous_object(), as before.
>> >
>> > If it is necessary, then note that you leak "sb" here.
>> 
>> I'll keep that bit as-is, it's not strictly necessary, but it gives
>> translators a bit more context.
>
> If it's just for the context, wouldn't this do the same thing:
>
>   /*
>    * TRANSLATORS: This is followed by the list of ambiguous
>    * objects composed in show_ambiguous_object(). See its
>    * "TRANSLATORS" comments for details.
>    */
>   advise(_("The candidates are:"));
>   ...
>   if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
>      ...
>
> I.e., leave the code as-is, and just add the extra comment. There is no
> need for the extra struct or any change of ordering between this
> advise() and the others.
>
> I would think it is worthwhile if we are de-lego-ing a message that is
> made in chunks, but in this case the we have to construct an opaque "%s"
> to represent the individual lines for each object, because we don't know
> how many of them there will be.
>
> -Peff
>
> PS In my "something like this" commit message, I indicated that the
>    "candidates" message was getting translated, but it actually is
>    already translated in the pre-image. So I think we would not need to
>    touch that line at all.

Yes you're right. You've got me, I guess :)

An unstated motivation of mine here is that I've got a series that
changes the advise() function itself so that it automatically adds the
"and run xyz to disable this message".

Now some don't emit it, some don't even have associated configuration or
documentation. It's a mess.

I originally hacked this up because this is the one in-tree user of
advise() that constructs output incrementally. So for that improvement
to advise() it either needs to be changed to not do so (this patch), or
I'd need an advise_no_template() or advise_hint_line() or whatever as a
workaround.
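
To illustrate the problem, here's a rough stand-alone sketch. It is not
the real advise(): advise_with_trailer() and its trailer text are made
up for illustration, and I'm assuming the templated version would append
its "disable this message" line once per call. Calling it per candidate
repeats the trailer; accumulating into a buffer first emits it once:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a templated advise(): writes every line
 * of "msg" to "out" with a "hint: " prefix, then one trailer line.
 * Returns the number of lines written. Not Git's real advise().
 */
static int advise_with_trailer(FILE *out, const char *msg)
{
	int lines = 0;
	const char *p = msg;

	while (*p) {
		size_t len = strcspn(p, "\n");
		fprintf(out, "hint: %.*s\n", (int)len, p);
		lines++;
		p += len;
		if (*p == '\n')
			p++;
	}
	fputs("hint: (run xyz to disable this message)\n", out);
	return lines + 1;
}

/* Incremental: one call per candidate, so one trailer per candidate. */
static int show_incrementally(FILE *out, const char **cand, size_t nr)
{
	int lines = advise_with_trailer(out, "The candidates are:");
	size_t i;

	for (i = 0; i < nr; i++)
		lines += advise_with_trailer(out, cand[i]);
	return lines;
}

/* Accumulate first, then make a single call: exactly one trailer. */
static int show_accumulated(FILE *out, const char **cand, size_t nr)
{
	char buf[1024] = "The candidates are:\n";
	size_t i;

	for (i = 0; i < nr; i++) {
		size_t used = strlen(buf);
		/* Fixed-size buffer for the sketch; real code would use a strbuf. */
		snprintf(buf + used, sizeof(buf) - used, "%s\n", cand[i]);
	}
	return advise_with_trailer(out, buf);
}
```

With two candidates the incremental variant writes three trailers and
the accumulated one writes a single trailer, which is the behavior the
accumulate/print pattern is after.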

I didn't mean to be too subterfuge-y about it. It's just hard to find a
balance between a single long series & a few shorter ones, and when to
distract reviewers with "this design choice is also because of XYZ
tangentially related end-goal".

Anyway, now that we're here I'm not sure what the best way forward
is. One is to just address the pointed-out bugs and keep that
accumulate/print pattern I instituted, which would help that subsequent
series. But I agree that while I think it is a bit better to translate
the "foo:\n\n%s" message (it gives a bit more context about what sort of
message it is), it's not really worth it just in the context of this
patch.

What do you think? Could we let this pass for now, or should I drop
this and try to revisit it as part of some larger topic? The
improvement to advise() I mentioned depends on this plus another series
of advise() fixes I submitted in parallel at [1].

1. https://lore.kernel.org/git/cover-0.5-00000000000-20211004T015432Z-avarab@gmail.com/T/#t


* Re: [PATCH 2/2] object-name: make ambiguous object output translatable
  2021-10-04 11:16         ` Ævar Arnfjörð Bjarmason
@ 2021-10-04 12:07           ` Jeff King
  0 siblings, 0 replies; 81+ messages in thread
From: Jeff King @ 2021-10-04 12:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano

On Mon, Oct 04, 2021 at 01:16:24PM +0200, Ævar Arnfjörð Bjarmason wrote:

> An unstated motivation of mine here is that I've got a series that
> changes the advise() function itself so that it automatically adds the
> "and run xyz to disable this message".
> 
> Now some don't emit it, some don't even have associated configuration or
> documentation. It's a mess.
> 
> I originally hacked this up because this is the one in-tree user of
> advise() that constructs output incrementally. So for that improvement
> to advise() it either needs to be changed to not do so (this patch), or
> I'd need an advise_no_template() or advise_hint_line() or whatever as a
> workaround.

OK, that makes sense. In general, I think it's much easier if those
motivations _aren't_ unstated. Both for the benefit of reviewers, but
also folks reading commit messages later who wonder "hey, it looks like
we didn't need this hunk, and it is causing a hassle, so why can't I
just revert it".

But...

> I didn't mean to be too subterfuge-y about it. It's just hard to find a
> balance between a single long series & a few shorter ones, and when to
> distract reviewers with "this design choice is also because of XYZ
> tangentially related end-goal".

...yeah, if you have patches that say "do X, because later maybe we'll
do Y", then it is often hard to evaluate them if Y is not in the same
series.  And _especially_ so if there is some other Z happening in the
current series with X, because even talking about X is muddling things.

So in an ideal world, you'd not do X at all (in this case, touch the
advise() lines), and leave Y (rolling up a buf to hand to a single
advise() line) as a preparatory patch in a series that does Z (your
change to advise() to print the extra stuff).

Things don't always break down that way, but I think they do here.
Nothing you want to do here is semantically related to the later change
to advise() you want to make. There are textual dependencies, which
means you'll want to wait for one series to graduate before the other,
but that's already the case if you stuff the preparatory patch in this
series.

-Peff


* [PATCH v2 0/2] i18n: improve translatability of ambiguous object output
  2021-10-04  1:42 [PATCH 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2021-10-04  1:42 ` [PATCH 1/2] object-name tests: tighten up advise() output test Ævar Arnfjörð Bjarmason
  2021-10-04  1:42 ` [PATCH 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-10-04 14:27 ` Ævar Arnfjörð Bjarmason
  2021-10-04 14:27   ` [PATCH v2 1/2] object.[ch]: mark object type names for translation Ævar Arnfjörð Bjarmason
                     ` (2 more replies)
  2 siblings, 3 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04 14:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

A mostly-rewritten version in response to the discussion concluding at
http://lore.kernel.org/git/YVrudGOcUxblsfPY@coredump.intra.peff.net;
thanks a lot for the thorough review, Jeff!

Ævar Arnfjörð Bjarmason (2):
  object.[ch]: mark object type names for translation
  object-name: make ambiguous object output translatable

 object-name.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++-----
 object.c      | 27 ++++++++++++++++---
 object.h      |  1 +
 3 files changed, 90 insertions(+), 10 deletions(-)

Range-diff against v1:
1:  7085f951a12 ! 1:  55bde16aa23 object-name tests: tighten up advise() output test
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    object-name tests: tighten up advise() output test
    +    object.[ch]: mark object type names for translation
     
    -    Change tests added in 1ffa26c4614 (get_short_sha1: list ambiguous
    -    objects on error, 2016-09-26) to only care about the OIDs that are
    -    listed, which is what the test is trying to check for.
    +    Mark the "commit", "tree", "blob" and "tag" types for translation, and
    +    add an extern "unknown type" string for the OBJ_NONE case.
     
    -    This isn't needed by the subsequent commit, which won't change any of
    -    the output, but a mere tightening of the tests assertions to more
    -    closely match what we really want to test for here.
    +    It is usually bad practice to translate individual words like this,
    +    but for e.g. the list list output emitted by the "short object ID dead
    +    is ambiguous" advice it makes sense.
     
    -    Now if the advise() message itself were change the phrasing around the
    -    list of OIDs we won't have this test break. We're assuming that such
    -    output won't have a need to indent anything except the OIDs.
    +    A subsequent commit will make that output translatable, and use these
    +    translation markings to do so. Well, we won't use "commit", but let's
    +    mark it up anyway for consistency. It'll probably come in handy sooner
    +    than later to have it already be translated, and it's to much of a
    +    burden to place on translators if they're translating the other three
    +    object types anyway.
    +
    +    Aside: I think it would probably make sense to change the "NULL" entry
    +    for type_name() to be the "unknown type". I've ran into cases where
    +    type_name() was unconditionally interpolated in e.g. an sprintf()
    +    format, but let's leave that for #leftoverbits as that would be
    +    changing the behavior of the type_name() function.
    +
    +    All of these will be new in the git.pot file, except "blob" which will
    +    be shared with a "cat-file" command-line option, see
    +    7bcf3414535 (cat-file --textconv/--filters: allow specifying the path
    +    separately, 2016-09-09) for its introduction.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    - ## t/t1512-rev-parse-disambiguation.sh ##
    -@@ t/t1512-rev-parse-disambiguation.sh: test_expect_success 'ambiguity errors are not repeated (peel)' '
    + ## object.c ##
    +@@ object.c: struct object *get_indexed_object(unsigned int idx)
      
    - test_expect_success 'ambiguity hints' '
    - 	test_must_fail git rev-parse 000000000 2>stderr &&
    --	grep ^hint: stderr >hints &&
    --	# 16 candidates, plus one intro line
    --	test_line_count = 17 hints
    -+	grep "^hint:   " stderr >hints &&
    -+	# 16 candidates, minus surrounding prose
    -+	test_line_count = 16 hints
    - '
    + static const char *object_type_strings[] = {
    + 	NULL,		/* OBJ_NONE = 0 */
    +-	"commit",	/* OBJ_COMMIT = 1 */
    +-	"tree",		/* OBJ_TREE = 2 */
    +-	"blob",		/* OBJ_BLOB = 3 */
    +-	"tag",		/* OBJ_TAG = 4 */
    ++	/*
    ++	 * TRANSLATORS: "commit", "tree", "blob" and "tag" are the
    ++	 * name of Git's object types. These names are interpolated
    ++	 * stand-alone when doing so is unambiguous for translation
    ++	 * and doesn't require extra context. E.g. as part of an
    ++	 * already-translated string that needs to have a type name
    ++	 * quoted verbatim, or the short description of a command-line
    ++	 * option expecting a given type.
    ++	 */
    ++	N_("commit"),	/* OBJ_COMMIT = 1 */
    ++	N_("tree"),	/* OBJ_TREE = 2 */
    ++	N_("blob"),	/* OBJ_BLOB = 3 */
    ++	N_("tag"),	/* OBJ_TAG = 4 */
    + };
      
    - test_expect_success 'ambiguity hints respect type' '
    - 	test_must_fail git rev-parse 000000000^{commit} 2>stderr &&
    --	grep ^hint: stderr >hints &&
    --	# 5 commits, 1 tag (which is a committish), plus intro line
    --	test_line_count = 7 hints
    -+	grep "^hint:   " stderr >hints &&
    -+	# 5 commits, 1 tag (which is a committish), minus surrounding prose
    -+	test_line_count = 6 hints
    - '
    - 
    - test_expect_success 'failed type-selector still shows hint' '
    -@@ t/t1512-rev-parse-disambiguation.sh: test_expect_success 'failed type-selector still shows hint' '
    - 	echo 851 | git hash-object --stdin -w &&
    - 	echo 872 | git hash-object --stdin -w &&
    - 	test_must_fail git rev-parse ee3d^{commit} 2>stderr &&
    --	grep ^hint: stderr >hints &&
    --	test_line_count = 3 hints
    -+	grep "^hint:   " stderr >hints &&
    -+	test_line_count = 2 hints
    - '
    ++/*
    ++ * TRANSLATORS: This is the short type name of an object that's not
    ++ * one of Git's known object types, as opposed to "commit", "tree",
    ++ * "blob" and "tag" above.
    ++ *
    ++ * A user is unlikely to ever encounter these, but they can be
    ++ * manually created with "git hash-object --literally".
    ++ */
    ++const char *unknown_type = N_("unknown type");
    ++
    + const char *type_name(unsigned int type)
    + {
    + 	if (type >= ARRAY_SIZE(object_type_strings))
    +
    + ## object.h ##
    +@@ object.h: struct object {
    + 	struct object_id oid;
    + };
      
    - test_expect_success 'core.disambiguate config can prefer types' '
    ++extern const char *unknown_type;
    + const char *type_name(unsigned int type);
    + int type_from_string_gently(const char *str, ssize_t, int gentle);
    + #define type_from_string(str) type_from_string_gently(str, -1, 0)
2:  b6136380c28 ! 2:  c0e873543f5 object-name: make ambiguous object output translatable
    @@ Commit message
         tweaked in [2] to be more friendly to translators. By being able to
         customize the sprintf formats we're even ready for RTL languages.
     
    -    1. ef9b0370da6 (sha1-name.c: store and use repo in struct
    -       disambiguate_state, 2019-04-16)
    +    The "unknown type" message here is unreachable, and has been since
    +    [1], i.e. that code has never worked. If we craft an object of a bogus
    +    type with a conflicting prefix we'll just die:
    +
    +        $ git rev-parse 8315
    +        error: short object ID 8315 is ambiguous
    +        hint: The candidates are:
    +        fatal: invalid object type
    +
    +    But let's continue to pretend that this works, we can eventually use
    +    the API improvements in my ab/fsck-unexpected-type (once it lands) to
    +    inspect these objects and emit the actual type here, or at least not
    +    die as we emit "unknown type".
    +
    +    1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
    +       2016-09-26)
         2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
            then SHA-1, 2018-05-10)
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## object-name.c ##
    -@@ object-name.c: static int init_object_disambiguation(struct repository *r,
    - 	return 0;
    - }
    - 
    -+struct show_ambiguous_state {
    -+	const struct disambiguate_state *ds;
    -+	struct strbuf *advice;
    -+};
    -+
    - static int show_ambiguous_object(const struct object_id *oid, void *data)
    +@@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
      {
    --	const struct disambiguate_state *ds = data;
    -+	struct show_ambiguous_state *state = data;
    -+	const struct disambiguate_state *ds = state->ds;
    -+	struct strbuf *advice = state->advice;
    + 	const struct disambiguate_state *ds = data;
      	struct strbuf desc = STRBUF_INIT;
    ++	struct strbuf ci_ad = STRBUF_INIT;
    ++	struct strbuf ci_s = STRBUF_INIT;
      	int type;
    ++	const char *tag_desc = NULL;
    ++	const char *abbrev;
      
    + 	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
    + 		return 0;
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
      		if (commit) {
      			struct pretty_print_context pp = {0};
      			pp.date_mode.type = DATE_SHORT;
     -			format_commit_message(commit, " %ad - %s", &desc, &pp);
    -+			format_commit_message(commit, _(" %ad - %s"), &desc, &pp);
    ++			format_commit_message(commit, "%ad", &ci_ad, &pp);
    ++			format_commit_message(commit, "%s", &ci_s, &pp);
      		}
      	} else if (type == OBJ_TAG) {
      		struct tag *tag = lookup_tag(ds->repo, oid);
      		if (!parse_tag(tag) && tag->tag)
     -			strbuf_addf(&desc, " %s", tag->tag);
    -+			strbuf_addf(&desc, _(" %s"), tag->tag);
    ++			tag_desc = tag->tag;
      	}
      
     -	advise("  %s %s%s",
     -	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
     -	       type_name(type) ? type_name(type) : "unknown type",
     -	       desc.buf);
    -+	strbuf_addf(advice,
    -+		    /*
    -+		     * TRANSLATORS: This is a line of ambiguous object
    -+		     * output. E.g.:
    -+		     *
    -+		     *    "deadbeef commit 2021-01-01 - Some Commit Message\n"
    -+		     *    "deadbeef tag Some Tag Message\n"
    -+		     *    "deadbeef tree\n"
    -+		     *
    -+		     * I.e. the first argument is a short OID, the
    -+		     * second is the type name of the object, and the
    -+		     * third a description of the object, if it's a
    -+		     * commit or tag. In that case the " %ad - %s" and
    -+		     * " %s" formats above will be used for the third
    -+		     * argument.
    -+		     */
    -+		    _("  %s %s%s\n"),
    -+		    repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
    -+		    type_name(type) ? type_name(type) : "unknown type",
    -+		    desc.buf);
    - 
    - 	strbuf_release(&desc);
    - 	return 0;
    -@@ object-name.c: static enum get_oid_result get_short_oid(struct repository *r,
    - 	}
    - 
    - 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
    -+		struct strbuf sb = STRBUF_INIT;
    - 		struct oid_array collect = OID_ARRAY_INIT;
    -+		struct show_ambiguous_state as = {
    -+			.ds = &ds,
    -+			.advice = &sb,
    -+		};
    - 
    - 		error(_("short object ID %s is ambiguous"), ds.hex_pfx);
    - 
    -@@ object-name.c: static enum get_oid_result get_short_oid(struct repository *r,
    - 		if (!ds.ambiguous)
    - 			ds.fn = NULL;
    - 
    --		advise(_("The candidates are:"));
    - 		repo_for_each_abbrev(r, ds.hex_pfx, collect_ambiguous, &collect);
    - 		sort_ambiguous_oid_array(r, &collect);
    - 
    --		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
    -+		if (oid_array_for_each(&collect, show_ambiguous_object, &as))
    - 			BUG("show_ambiguous_object shouldn't return non-zero");
    -+
    ++	abbrev = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
    ++	if (type == OBJ_COMMIT) {
     +		/*
    -+		 * TRANSLATORS: The argument is the list of ambiguous
    -+		 * objects composed in show_ambiguous_object(). See
    -+		 * its "TRANSLATORS" comment for details.
    ++		 * TRANSLATORS: This is a line of ambiguous commit
    ++		 * object output. E.g.:
    ++		 *
    ++		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
    ++		 *
    ++		 * The second argument is the "commit" string from
    ++		 * object.c, it should (hopefully) already be
    ++		 * translated.
     +		 */
    -+		advise(_("The candidates are:\n\n%s"), sb.buf);
    ++		strbuf_addf(&desc, _("%s %s %s - %s"), abbrev,
    ++			    _(type_name(type)), ci_ad.buf, ci_s.buf);
    ++	} else if (tag_desc) {
    ++		/*
    ++		 * TRANSLATORS: This is a line of
    ++		 * ambiguous tag object output. E.g.:
    ++		 *
    ++		 *    "deadbeef tag Some Tag Message"
    ++		 *
    ++		 * The second argument is the "tag" string from
    ++		 * object.c, it should (hopefully) already be
    ++		 * translated.
    ++		 */
    ++		strbuf_addf(&desc, _("%s %s %s"), abbrev, _(type_name(type)),
    ++			    tag_desc);
    ++	} else {
    ++		const char *tname = type_name(type) ? _(type_name(type)) :
    ++			_(unknown_type);
    ++		/*
    ++		 * TRANSLATORS: This is a line of ambiguous <type>
    ++		 * object output. Where <type> is one of the object
    ++		 * types of "tree", "blob", "tag" ("commit" is handled
    ++		 * above).
    ++		 *
    ++		 *    "deadbeef tree"
    ++		 *    "deadbeef blob"
    ++		 *    "deadbeef tag"
    ++		 *    "deadbeef unknown type"
    ++		 *
    ++		 * Note that annotated tags use a separate format
    ++		 * outlined above.
    ++		 *
    ++		 * The second argument is the "tree", "blob" or "tag"
    ++		 * string from object.c, or the "unknown type" string
    ++		 * in the case of an unknown type. All of them should
    ++		 * (hopefully) already be translated.
    ++		 */
    ++		strbuf_addf(&desc, _("%s %s"), abbrev, tname);
    ++	}
     +
    - 		oid_array_clear(&collect);
    - 	}
    ++	/*
    ++	 * TRANSLATORS: This is line item of ambiguous object output,
    ++	 * translated above.
    ++	 */
    ++	advise(_("  %s\n"), desc.buf);
    + 
    + 	strbuf_release(&desc);
    ++	strbuf_release(&ci_ad);
    ++	strbuf_release(&ci_s);
    + 	return 0;
    + }
      
-- 
2.33.0.1409.ge73c1ecc5b4



* [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-04 14:27 ` [PATCH v2 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
@ 2021-10-04 14:27   ` Ævar Arnfjörð Bjarmason
  2021-10-04 18:54     ` Eric Sunshine
                       ` (2 more replies)
  2021-10-04 14:27   ` [PATCH v2 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
  2021-10-08 19:34   ` [PATCH v3 0/3] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2 siblings, 3 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04 14:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Mark the "commit", "tree", "blob" and "tag" types for translation, and
add an extern "unknown type" string for the OBJ_NONE case.

It is usually bad practice to translate individual words like this,
but for e.g. the list list output emitted by the "short object ID dead
is ambiguous" advice it makes sense.

A subsequent commit will make that output translatable, and use these
translation markings to do so. Well, we won't use "commit", but let's
mark it up anyway for consistency. It'll probably come in handy sooner
than later to have it already be translated, and it's to much of a
burden to place on translators if they're translating the other three
object types anyway.

Aside: I think it would probably make sense to change the "NULL" entry
for type_name() to be the "unknown type". I've ran into cases where
type_name() was unconditionally interpolated in e.g. an sprintf()
format, but let's leave that for #leftoverbits as that would be
changing the behavior of the type_name() function.

All of these will be new in the git.pot file, except "blob" which will
be shared with a "cat-file" command-line option, see
7bcf3414535 (cat-file --textconv/--filters: allow specifying the path
separately, 2016-09-09) for its introduction.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object.c | 27 +++++++++++++++++++++++----
 object.h |  1 +
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/object.c b/object.c
index 4e85955a941..47dbe0d8a2a 100644
--- a/object.c
+++ b/object.c
@@ -22,12 +22,31 @@ struct object *get_indexed_object(unsigned int idx)
 
 static const char *object_type_strings[] = {
 	NULL,		/* OBJ_NONE = 0 */
-	"commit",	/* OBJ_COMMIT = 1 */
-	"tree",		/* OBJ_TREE = 2 */
-	"blob",		/* OBJ_BLOB = 3 */
-	"tag",		/* OBJ_TAG = 4 */
+	/*
+	 * TRANSLATORS: "commit", "tree", "blob" and "tag" are the
+	 * name of Git's object types. These names are interpolated
+	 * stand-alone when doing so is unambiguous for translation
+	 * and doesn't require extra context. E.g. as part of an
+	 * already-translated string that needs to have a type name
+	 * quoted verbatim, or the short description of a command-line
+	 * option expecting a given type.
+	 */
+	N_("commit"),	/* OBJ_COMMIT = 1 */
+	N_("tree"),	/* OBJ_TREE = 2 */
+	N_("blob"),	/* OBJ_BLOB = 3 */
+	N_("tag"),	/* OBJ_TAG = 4 */
 };
 
+/*
+ * TRANSLATORS: This is the short type name of an object that's not
+ * one of Git's known object types, as opposed to "commit", "tree",
+ * "blob" and "tag" above.
+ *
+ * A user is unlikely to ever encounter these, but they can be
+ * manually created with "git hash-object --literally".
+ */
+const char *unknown_type = N_("unknown type");
+
 const char *type_name(unsigned int type)
 {
 	if (type >= ARRAY_SIZE(object_type_strings))
diff --git a/object.h b/object.h
index 549f2d256bc..0510dc4b3ea 100644
--- a/object.h
+++ b/object.h
@@ -91,6 +91,7 @@ struct object {
 	struct object_id oid;
 };
 
+extern const char *unknown_type;
 const char *type_name(unsigned int type);
 int type_from_string_gently(const char *str, ssize_t, int gentle);
 #define type_from_string(str) type_from_string_gently(str, -1, 0)
-- 
2.33.0.1409.ge73c1ecc5b4



* [PATCH v2 2/2] object-name: make ambiguous object output translatable
  2021-10-04 14:27 ` [PATCH v2 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2021-10-04 14:27   ` [PATCH v2 1/2] object.[ch]: mark object type names for translation Ævar Arnfjörð Bjarmason
@ 2021-10-04 14:27   ` Ævar Arnfjörð Bjarmason
  2021-10-06 19:11     ` Jeff King
  2021-10-08 19:34   ` [PATCH v3 0/3] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-04 14:27 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] to be more friendly to translators. By being able to
customize the sprintf formats we're even ready for RTL languages.

The "unknown type" message here is unreachable, and has been since
[1], i.e. that code has never worked. If we craft an object of a bogus
type with a conflicting prefix we'll just die:

    $ git rev-parse 8315
    error: short object ID 8315 is ambiguous
    hint: The candidates are:
    fatal: invalid object type

But let's continue to pretend that this works, we can eventually use
the API improvements in my ab/fsck-unexpected-type (once it lands) to
inspect these objects and emit the actual type here, or at least not
die as we emit "unknown type".

1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)
2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 66 insertions(+), 6 deletions(-)

diff --git a/object-name.c b/object-name.c
index fdff4601b2c..73c946f1117 100644
--- a/object-name.c
+++ b/object-name.c
@@ -355,7 +355,11 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 {
 	const struct disambiguate_state *ds = data;
 	struct strbuf desc = STRBUF_INIT;
+	struct strbuf ci_ad = STRBUF_INIT;
+	struct strbuf ci_s = STRBUF_INIT;
 	int type;
+	const char *tag_desc = NULL;
+	const char *abbrev;
 
 	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
 		return 0;
@@ -366,20 +370,76 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, " %ad - %s", &desc, &pp);
+			format_commit_message(commit, "%ad", &ci_ad, &pp);
+			format_commit_message(commit, "%s", &ci_s, &pp);
 		}
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		if (!parse_tag(tag) && tag->tag)
-			strbuf_addf(&desc, " %s", tag->tag);
+			tag_desc = tag->tag;
 	}
 
-	advise("  %s %s%s",
-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type) ? type_name(type) : "unknown type",
-	       desc.buf);
+	abbrev = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
+	if (type == OBJ_COMMIT) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous commit
+		 * object output. E.g.:
+		 *
+		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
+		 *
+		 * The second argument is the "commit" string from
+		 * object.c, it should (hopefully) already be
+		 * translated.
+		 */
+		strbuf_addf(&desc, _("%s %s %s - %s"), abbrev,
+			    _(type_name(type)), ci_ad.buf, ci_s.buf);
+	} else if (tag_desc) {
+		/*
+		 * TRANSLATORS: This is a line of
+		 * ambiguous tag object output. E.g.:
+		 *
+		 *    "deadbeef tag Some Tag Message"
+		 *
+		 * The second argument is the "tag" string from
+		 * object.c, it should (hopefully) already be
+		 * translated.
+		 */
+		strbuf_addf(&desc, _("%s %s %s"), abbrev, _(type_name(type)),
+			    tag_desc);
+	} else {
+		const char *tname = type_name(type) ? _(type_name(type)) :
+			_(unknown_type);
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. Where <type> is one of the object
+		 * types of "tree", "blob", "tag" ("commit" is handled
+		 * above).
+		 *
+		 *    "deadbeef tree"
+		 *    "deadbeef blob"
+		 *    "deadbeef tag"
+		 *    "deadbeef unknown type"
+		 *
+		 * Note that annotated tags use a separate format
+		 * outlined above.
+		 *
+		 * The second argument is the "tree", "blob" or "tag"
+		 * string from object.c, or the "unknown type" string
+		 * in the case of an unknown type. All of them should
+		 * (hopefully) already be translated.
+		 */
+		strbuf_addf(&desc, _("%s %s"), abbrev, tname);
+	}
+
+	/*
+	 * TRANSLATORS: This is line item of ambiguous object output,
+	 * translated above.
+	 */
+	advise(_("  %s\n"), desc.buf);
 
 	strbuf_release(&desc);
+	strbuf_release(&ci_ad);
+	strbuf_release(&ci_s);
 	return 0;
 }
 
-- 
2.33.0.1409.ge73c1ecc5b4



* Re: [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-04 14:27   ` [PATCH v2 1/2] object.[ch]: mark object type names for translation Ævar Arnfjörð Bjarmason
@ 2021-10-04 18:54     ` Eric Sunshine
  2021-10-05  9:37     ` Bagas Sanjaya
  2021-10-06 19:05     ` Jeff King
  2 siblings, 0 replies; 81+ messages in thread
From: Eric Sunshine @ 2021-10-04 18:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King

On Mon, Oct 4, 2021 at 10:27 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Mark the "commit", "tree", "blob" and "tag" types for translation, and
> add an extern "unknown type" string for the OBJ_NONE case.
>
> It is usually bad practice to translate individual words like this,
> but for e.g. the list list output emitted by the "short object ID dead

"list list"?

> is ambiguous" advice it makes sense.
>
> A subsequent commit will make that output translatable, and use these
> translation markings to do so. Well, we won't use "commit", but let's
> mark it up anyway for consistency. It'll probably come in handy sooner
> than later to have it already be translated, and it's to much of a
> burden to place on translators if they're translating the other three
> object types anyway.

At first I thought you meant s/to much/too much/, but that doesn't
seem to make sense (unless I'm misunderstanding), so perhaps you mean
s/to/not/.

> Aside: I think it would probably make sense to change the "NULL" entry
> for type_name() to be the "unknown type". I've ran into cases where
> type_name() was unconditionally interpolated in e.g. an sprintf()
> format, but let's leave that for #leftoverbits as that would be
> changing the behavior of the type_name() function.
>
> All of these will be new in the git.pot file, except "blob" which will
> be shared with a "cat-file" command-line option, see
> 7bcf3414535 (cat-file --textconv/--filters: allow specifying the path
> separately, 2016-09-09) for its introduction.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>


* Re: [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-04 14:27   ` [PATCH v2 1/2] object.[ch]: mark object type names for translation Ævar Arnfjörð Bjarmason
  2021-10-04 18:54     ` Eric Sunshine
@ 2021-10-05  9:37     ` Bagas Sanjaya
  2021-10-05 15:52       ` Ævar Arnfjörð Bjarmason
  2021-10-06 19:05     ` Jeff King
  2 siblings, 1 reply; 81+ messages in thread
From: Bagas Sanjaya @ 2021-10-05  9:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git; +Cc: Junio C Hamano, Jeff King

On 04/10/21 21.27, Ævar Arnfjörð Bjarmason wrote:
>   static const char *object_type_strings[] = {
>   	NULL,		/* OBJ_NONE = 0 */
> -	"commit",	/* OBJ_COMMIT = 1 */
> -	"tree",		/* OBJ_TREE = 2 */
> -	"blob",		/* OBJ_BLOB = 3 */
> -	"tag",		/* OBJ_TAG = 4 */
> +	/*
> +	 * TRANSLATORS: "commit", "tree", "blob" and "tag" are the
> +	 * name of Git's object types. These names are interpolated
> +	 * stand-alone when doing so is unambiguous for translation
> +	 * and doesn't require extra context. E.g. as part of an
> +	 * already-translated string that needs to have a type name
> +	 * quoted verbatim, or the short description of a command-line
> +	 * option expecting a given type.
> +	 */
> +	N_("commit"),	/* OBJ_COMMIT = 1 */
> +	N_("tree"),	/* OBJ_TREE = 2 */
> +	N_("blob"),	/* OBJ_BLOB = 3 */
> +	N_("tag"),	/* OBJ_TAG = 4 */
>   };
>   

Are these object type names safe for translating? (e.g. can they be 
translatable without affecting private API string, which aren't 
translatable)?

> +/*
> + * TRANSLATORS: This is the short type name of an object that's not
> + * one of Git's known object types, as opposed to "commit", "tree",
> + * "blob" and "tag" above.
> + *
> + * A user is unlikely to ever encounter these, but they can be
> + * manually created with "git hash-object --literally".
> + */
> +const char *unknown_type = N_("unknown type");
> +
>   const char *type_name(unsigned int type)

Did you mean that "unknown type" is generic shorthand?

-- 
An old man doll... just what I always wanted! - Clara


* Re: [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-05  9:37     ` Bagas Sanjaya
@ 2021-10-05 15:52       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-05 15:52 UTC (permalink / raw)
  To: Bagas Sanjaya; +Cc: git, Junio C Hamano, Jeff King


On Tue, Oct 05 2021, Bagas Sanjaya wrote:

> On 04/10/21 21.27, Ævar Arnfjörð Bjarmason wrote:
>>   static const char *object_type_strings[] = {
>>   	NULL,		/* OBJ_NONE = 0 */
>> -	"commit",	/* OBJ_COMMIT = 1 */
>> -	"tree",		/* OBJ_TREE = 2 */
>> -	"blob",		/* OBJ_BLOB = 3 */
>> -	"tag",		/* OBJ_TAG = 4 */
>> +	/*
>> +	 * TRANSLATORS: "commit", "tree", "blob" and "tag" are the
>> +	 * name of Git's object types. These names are interpolated
>> +	 * stand-alone when doing so is unambiguous for translation
>> +	 * and doesn't require extra context. E.g. as part of an
>> +	 * already-translated string that needs to have a type name
>> +	 * quoted verbatim, or the short description of a command-line
>> +	 * option expecting a given type.
>> +	 */
>> +	N_("commit"),	/* OBJ_COMMIT = 1 */
>> +	N_("tree"),	/* OBJ_TREE = 2 */
>> +	N_("blob"),	/* OBJ_BLOB = 3 */
>> +	N_("tag"),	/* OBJ_TAG = 4 */
>>   };
>>   
>
> Are these object type names safe to translate? (E.g. can they be made
> translatable without affecting private API strings, which aren't
> translatable?)

Yes, the N_() macro is always a noop. It's just there so the i18n
tooling knows to pick up these strings and drop them into
po/git.pot. See po/README.md for details.

It does change the behavior of any code that later does
_(type_name(type)), as the string will then (potentially) be found in
the *.mo files, but as shown in 2/2 that needs to be added to each
callsite manually. So we're not going to translate "ls-tree" output or
whatever just because it has "tree" etc. in it.
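
As a minimal sketch of that point: the block below is a stand-alone mock,
not Git's actual gettext.h. mock_gettext() and its one-entry "catalog" are
hypothetical stand-ins for the real lookup in the *.mo files. It shows that
marking with N_() leaves type_name() untouched, and that only a call site
which opts in with _() sees a translation.

```c
#include <stddef.h>
#include <string.h>

/* N_() only marks a string for extraction; it expands to its argument. */
#define N_(msgid) (msgid)

/* Hypothetical stand-in for gettext(): a one-entry Spanish "catalog". */
const char *mock_gettext(const char *msgid)
{
	if (!strcmp(msgid, "tree"))
		return "árbol";
	return msgid; /* anything not in the catalog comes back unchanged */
}
#define _(msgid) mock_gettext(msgid)

static const char *object_type_strings[] = {
	NULL,		/* OBJ_NONE = 0 */
	N_("commit"),	/* OBJ_COMMIT = 1 */
	N_("tree"),	/* OBJ_TREE = 2 */
	N_("blob"),	/* OBJ_BLOB = 3 */
	N_("tag"),	/* OBJ_TAG = 4 */
};

/* Machine-readable name: never translated, whatever the catalog says. */
const char *type_name(unsigned int type)
{
	size_t nr = sizeof(object_type_strings) / sizeof(object_type_strings[0]);

	return type < nr ? object_type_strings[type] : NULL;
}
```

With this mock, type_name(2) is still "tree", while _(type_name(2)) goes
through the catalog. Code that never wraps the result in _() keeps emitting
the untranslated name.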

>> +/*
>> + * TRANSLATORS: This is the short type name of an object that's not
>> + * one of Git's known object types, as opposed to "commit", "tree",
>> + * "blob" and "tag" above.
>> + *
>> + * A user is unlikely to ever encounter these, but they can be
>> + * manually created with "git hash-object --literally".
>> + */
>> +const char *unknown_type = N_("unknown type");
>> +
>>   const char *type_name(unsigned int type)
>
> Did you mean that "unknown type" is generic shorthand?

Yes, we could get the actual type name here, but it's a bit of a pain,
and as noted in 2/2 this code doesn't work anyway (which pre-dates this
series).

I'll see if I remember to loop back to fixing it once my fsck/object
library fixes related to this land, but for now I think just marking
this for translation makes sense.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-04 14:27   ` [PATCH v2 1/2] object.[ch]: mark object type names for translation Ævar Arnfjörð Bjarmason
  2021-10-04 18:54     ` Eric Sunshine
  2021-10-05  9:37     ` Bagas Sanjaya
@ 2021-10-06 19:05     ` Jeff King
  2021-10-06 19:46       ` Junio C Hamano
  2 siblings, 1 reply; 81+ messages in thread
From: Jeff King @ 2021-10-06 19:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano

On Mon, Oct 04, 2021 at 04:27:01PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Mark the "commit", "tree", "blob" and "tag" types for translation, and
> add an extern "unknown type" string for the OBJ_NONE case.
> 
> It is usually bad practice to translate individual words like this,
> but for e.g. the list output emitted by the "short object ID dead
> is ambiguous" advice it makes sense.

We already seem to have a translatable string for "commit", but if I
look at say es.po, the translation is "confirmar", which is considering
it a verb. Now my Spanish is pretty rusty, so it's possible this works
as a noun, too. But if I look at other messages, like:

  #: builtin/commit.c:1623
  msgid "override date for commit"
  msgstr "sobrescribe la fecha del commit"

  #: builtin/commit.c:1626
  msgid "reuse message from specified commit"
  msgstr "reusar el mensaje de un commit específico"

then it's clear that "commit" as a noun is translated as "commit". I'm
not sure what facilities (if any) there are in gettext for having the
same string in different contexts.

I do note that this is already a problem. Of the five spots listed:

  #: builtin/commit.c:1625 builtin/commit.c:1626 builtin/commit.c:1632
  #: parse-options.h:329 ref-filter.h:90
  msgid "commit"
  msgstr "confirmar"

They all appear to want it as a noun. So maybe this is just
mis-translated for Spanish. It does feel like an accident in the making,
though.

> A subsequent commit will make that output translatable, and use these
> translation markings to do so. Well, we won't use "commit", but let's
> mark it up anyway for consistency. It'll probably come in handy sooner
> rather than later to have it already be translated, and it's not too much
> of a burden to place on translators if they're translating the other three
> object types anyway.

I do wonder how useful it is to translate these type names in general.
Especially as used in this series, they're really technical terms, and
you are not going to escape the name "git commit" as a command. But I
don't ever use translated Git, so I'm not sure my opinion is all that
meaningful there.

> Aside: I think it would probably make sense to change the "NULL" entry
> for type_name() to be the "unknown type". I've run into cases where
> type_name() was unconditionally interpolated in e.g. an sprintf()
> format, but let's leave that for #leftoverbits as that would be
> changing the behavior of the type_name() function.

IMHO this would be a bad idea. Even if there is a spot that uses the
result without checking for NULL, I'd much rather have Git segfault than,
say, write out an object with a bogus name (as it would in index_mem(),
for example). So you really have to look over every caller, at which
point you may as well adjust the ones that aren't checking for NULL.

Now if you introduced type_name_human(), which auto-translated and
converted NULL to "unknown", then that would be easy to plug in
appropriately as you audited the callers.
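
A rough sketch of what such a wrapper could look like is below. This is an
assumption-laden illustration, not code from the series: type_name_human()
does not exist in Git at this point, and mock_translate() is a hypothetical
stand-in for _() so that the example runs without a gettext setup.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for Git's _(): a one-entry mock catalog, so the
 * sketch runs without real *.mo files.
 */
const char *mock_translate(const char *msgid)
{
	if (!strcmp(msgid, "unknown"))
		return "desconocido";
	return msgid;
}

static const char *object_type_strings[] = {
	NULL,		/* OBJ_NONE = 0 */
	"commit",	/* OBJ_COMMIT = 1 */
	"tree",		/* OBJ_TREE = 2 */
	"blob",		/* OBJ_BLOB = 3 */
	"tag",		/* OBJ_TAG = 4 */
};

/* Machine-facing name: still NULL for anything that is not a known type. */
const char *type_name(unsigned int type)
{
	size_t nr = sizeof(object_type_strings) / sizeof(object_type_strings[0]);

	return type < nr ? object_type_strings[type] : NULL;
}

/* Human-facing name (hypothetical helper): never NULL, always translated. */
const char *type_name_human(unsigned int type)
{
	const char *name = type_name(type);

	return mock_translate(name ? name : "unknown");
}
```

The point of keeping two functions is that type_name() keeps returning NULL,
so callers that write object headers still fail loudly, while callers that
only print messages can switch to type_name_human() one at a time.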

>  static const char *object_type_strings[] = {
>  	NULL,		/* OBJ_NONE = 0 */
> -	"commit",	/* OBJ_COMMIT = 1 */
> -	"tree",		/* OBJ_TREE = 2 */
> -	"blob",		/* OBJ_BLOB = 3 */
> -	"tag",		/* OBJ_TAG = 4 */
> +	/*
> +	 * TRANSLATORS: "commit", "tree", "blob" and "tag" are the
> +	 * name of Git's object types. These names are interpolated
> +	 * stand-alone when doing so is unambiguous for translation
> +	 * and doesn't require extra context. E.g. as part of an
> +	 * already-translated string that needs to have a type name
> +	 * quoted verbatim, or the short description of a command-line
> +	 * option expecting a given type.
> +	 */
> +	N_("commit"),	/* OBJ_COMMIT = 1 */
> +	N_("tree"),	/* OBJ_TREE = 2 */
> +	N_("blob"),	/* OBJ_BLOB = 3 */
> +	N_("tag"),	/* OBJ_TAG = 4 */
>  };

This does make me feel slightly uneasy, just because so many parts of
Git rely on these _not_ being translated. But I see in your other
response that N_() really does nothing. So aside from possibly
misleading readers of the code, I think this is probably OK.

-Peff

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 2/2] object-name: make ambiguous object output translatable
  2021-10-04 14:27   ` [PATCH v2 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-10-06 19:11     ` Jeff King
  0 siblings, 0 replies; 81+ messages in thread
From: Jeff King @ 2021-10-06 19:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano

On Mon, Oct 04, 2021 at 04:27:02PM +0200, Ævar Arnfjörð Bjarmason wrote:

> +	abbrev = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
> +	if (type == OBJ_COMMIT) {
> +		/*
> +		 * TRANSLATORS: This is a line of ambiguous commit
> +		 * object output. E.g.:
> +		 *
> +		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
> +		 *
> +		 * The second argument is the "commit" string from
> +		 * object.c, it should (hopefully) already be
> +		 * translated.
> +		 */
> +		strbuf_addf(&desc, _("%s %s %s - %s"), abbrev, ci_ad.buf,
> +			    _(type_name(type)), ci_s.buf);
> +	} else if (tag_desc) {
> [...]

OK, this all looks reasonable to me. I'd probably have ditched "desc"
altogether in favor of just calling advise(), to give translators even
more information about what we're trying to output, but I admit I don't
care that much either way.

I'm still not sure if translating the object types is a good idea or
not, per my other response.

> +	/*
> +	 * TRANSLATORS: This is line item of ambiguous object output,
> +	 * translated above.
> +	 */
> +	advise(_("  %s\n"), desc.buf);

The "\n" here isn't necessary (and wasn't present in the original), but
it doesn't hurt, as advise()'s algorithm gobbles any newlines as it
splits. I guess it helps make this otherwise un-notable string more
unique for translation, but just stuffing the indentation into the
earlier calls would do an even better job of that.

-Peff

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-06 19:05     ` Jeff King
@ 2021-10-06 19:46       ` Junio C Hamano
  2021-10-06 20:38         ` Jeff King
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2021-10-06 19:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, git

Jeff King <peff@peff.net> writes:

> They all appear to want it as a noun. So maybe this is just
> mis-translated for Spanish. It does feel like an accident in the making,
> though.

Probably we need pgettext().

https://www.gnu.org/software/gettext/manual/html_node/Contexts.html

> I do wonder how useful it is to translate these type names in general.
> Especially as used in this series, they're really technical terms, and
> you are not going to escape the name "git commit" as a command.

I share the same feeling (I do not use translated git, either).

> Now if you introduced type_name_human(), which auto-translated and
> converted NULL to "unknown", then that would be easy to plug in
> appropriately as you audited the callers.

Yes.

>
>>  static const char *object_type_strings[] = {
>> ...
>> +	N_("commit"),	/* OBJ_COMMIT = 1 */
>> +	N_("tree"),	/* OBJ_TREE = 2 */
>> +	N_("blob"),	/* OBJ_BLOB = 3 */
>> +	N_("tag"),	/* OBJ_TAG = 4 */
>>  };
>
> This does make me feel slightly uneasy, just because so many parts of
> Git rely on these _not_ being translated. But I see in your other
> response that N_() really does nothing. So aside from possibly
> misleading readers of the code, I think this is probably OK.

Yes, this may look scary, but it is the least risky part of this
patch, as N_() is a no-op at runtime ;-).


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-06 19:46       ` Junio C Hamano
@ 2021-10-06 20:38         ` Jeff King
  2021-10-07 18:06           ` Junio C Hamano
  0 siblings, 1 reply; 81+ messages in thread
From: Jeff King @ 2021-10-06 20:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git

On Wed, Oct 06, 2021 at 12:46:12PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > They all appear to want it as a noun. So maybe this is just
> > mis-translated for Spanish. It does feel like an accident in the making,
> > though.
> 
> Probably we need pgettext().
> 
> https://www.gnu.org/software/gettext/manual/html_node/Contexts.html

Yeah, that makes sense. I'm not sure how it interacts with N_(), though.
I.e., I'd expect the "context" to ride along with the original string,
but I guess it is really in the caller who's translating it. So the real
spot becomes:

  printf(_("my type is %s"), pgettext("object-type", type_name(type)));

It's a little unfortunate that every caller has to do it rather than
putting it near the source string. But I guess a type_name_human() would
solve that, too. ;)
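
To make the context idea concrete, here is a small mock (not the real
gettext implementation). GNU gettext's pgettext() looks a message up under
the key msgctxt "\004" msgid, and the sketch imitates that with an in-memory
table. The catalog entries, mock_gettext() and mock_pgettext() are all
hypothetical names for illustration. As far as I know the stock pgettext()
macro expects string literals and pgettext_expr() is the variant for
run-time strings, so a real caller passing type_name(type) would likely
need the latter.

```c
#include <stdio.h>
#include <string.h>

struct entry {
	const char *key;
	const char *msgstr;
};

/* Hypothetical catalog; the entries mirror the es.po example above. */
static const struct entry catalog[] = {
	{ "commit", "confirmar" },		/* no context: taken as a verb */
	{ "object-type\004commit", "commit" },	/* with context: the noun */
};

static const char *lookup(const char *key)
{
	size_t i;

	for (i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++)
		if (!strcmp(catalog[i].key, key))
			return catalog[i].msgstr;
	return NULL;
}

/* Context-free lookup; falls back to the msgid when nothing matches. */
const char *mock_gettext(const char *msgid)
{
	const char *found = lookup(msgid);

	return found ? found : msgid;
}

/* Context lookup: the key is "<context>\004<msgid>", as in GNU gettext. */
const char *mock_pgettext(const char *ctx, const char *msgid)
{
	char key[128];
	const char *found;
	int len = snprintf(key, sizeof(key), "%s\004%s", ctx, msgid);

	if (len < 0 || (size_t)len >= sizeof(key))
		return msgid; /* key did not fit: give up on translating */
	found = lookup(key);
	return found ? found : msgid;
}
```

Here mock_gettext("commit") yields the verb, while
mock_pgettext("object-type", "commit") yields the noun, so the same msgid
can carry two translations.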

-Peff

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v2 1/2] object.[ch]: mark object type names for translation
  2021-10-06 20:38         ` Jeff King
@ 2021-10-07 18:06           ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2021-10-07 18:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, git

Jeff King <peff@peff.net> writes:

> On Wed, Oct 06, 2021 at 12:46:12PM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > They all appear to want it as a noun. So maybe this is just
>> > mis-translated for Spanish. It does feel like an accident in the making,
>> > though.
>> 
>> Probably we need pgettext().
>> 
>> https://www.gnu.org/software/gettext/manual/html_node/Contexts.html
>
> Yeah, that makes sense. I'm not sure how it interacts with N_(), though.
> I.e., I'd expect the "context" to ride along with the original string,
> but I guess it is really in the caller who's translating it. So the real
> spot becomes:
>
>   printf(_("my type is %s"), pgettext("object-type", type_name(type)));
>
> It's a little unfortunate that every caller has to do it rather than
> putting it near the source string. But I guess a type_name_human() would
> solve that, too. ;)

Yes, I agree the need for pgettext() is annoying but I do not see an
easy alternative.  Introducing a wrapper like type_name_human() to
limit the damage sounds like the best we could do.

Thanks.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v3 0/3] i18n: improve translatability of ambiguous object output
  2021-10-04 14:27 ` [PATCH v2 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2021-10-04 14:27   ` [PATCH v2 1/2] object.[ch]: mark object type names for translation Ævar Arnfjörð Bjarmason
  2021-10-04 14:27   ` [PATCH v2 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-10-08 19:34   ` Ævar Arnfjörð Bjarmason
  2021-10-08 19:34     ` [PATCH v3 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
                       ` (3 more replies)
  2 siblings, 4 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 19:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Since v2 the "commit", "tag" etc. types in object.c are no longer
marked for translation.

There's a new 1/3 where we lead with an assert() and commit message
showing that the existing "unknown type" code is gone, which makes
what comes after simpler.

In 2/3 we no longer have to deal with special-cases related to corrupt
or otherwise bad objects, which makes for less work for translators.

In 3/3 I added the tag date to ambiguous tag objects, which is now
consistent with how commit objects are shown.

Ævar Arnfjörð Bjarmason (3):
  object-name: remove unreachable "unknown type" handling
  object-name: make ambiguous object output translatable
  object-name: show date for ambiguous tag objects

 object-name.c | 68 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 61 insertions(+), 7 deletions(-)

Range-diff against v2:
1:  55bde16aa23 < -:  ----------- object.[ch]: mark object type names for translation
-:  ----------- > 1:  fb29e10ee35 object-name: remove unreachable "unknown type" handling
2:  c0e873543f5 ! 2:  587a5717e47 object-name: make ambiguous object output translatable
    @@ Commit message
         object-name: make ambiguous object output translatable
     
         Change the output of show_ambiguous_object() added in [1] and last
    -    tweaked in [2] to be more friendly to translators. By being able to
    -    customize the sprintf formats we're even ready for RTL languages.
    -
    -    The "unknown type" message here is unreachable, and has been since
    -    [1], i.e. that code has never worked. If we craft an object of a bogus
    -    type with a conflicting prefix we'll just die:
    -
    -        $ git rev-parse 8315
    -        error: short object ID 8315 is ambiguous
    -        hint: The candidates are:
    -        fatal: invalid object type
    -
    -    But let's continue to pretend that this works, we can eventually use
    -    the API improvements in my ab/fsck-unexpected-type (once it lands) to
    -    inspect these objects and emit the actual type here, or at least not
    -    die as we emit "unknown type".
    +    tweaked in [2] and the preceding commit to be more friendly to
    +    translators. By being able to customize the "<SP><SP>%s\n" format
    +    we're even ready for RTL languages, who'd presumably like to change
    +    that to "%s<SP><SP>\n".
     
         1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
            2016-09-26)
    @@ Commit message
     
      ## object-name.c ##
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    - {
      	const struct disambiguate_state *ds = data;
      	struct strbuf desc = STRBUF_INIT;
    -+	struct strbuf ci_ad = STRBUF_INIT;
    -+	struct strbuf ci_s = STRBUF_INIT;
      	int type;
    -+	const char *tag_desc = NULL;
    -+	const char *abbrev;
    ++	const char *hash;
      
      	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
      		return 0;
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    + 	type = oid_object_info(ds->repo, oid, NULL);
    + 	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
    + 	       type == OBJ_BLOB || type == OBJ_TAG);
    ++	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
    ++
    + 	if (type == OBJ_COMMIT) {
    ++		struct strbuf ad = STRBUF_INIT;
    ++		struct strbuf s = STRBUF_INIT;
    + 		struct commit *commit = lookup_commit(ds->repo, oid);
    ++
      		if (commit) {
      			struct pretty_print_context pp = {0};
      			pp.date_mode.type = DATE_SHORT;
     -			format_commit_message(commit, " %ad - %s", &desc, &pp);
    -+			format_commit_message(commit, "%ad", &ci_ad, &pp);
    -+			format_commit_message(commit, "%s", &ci_s, &pp);
    ++			format_commit_message(commit, "%ad", &ad, &pp);
    ++			format_commit_message(commit, "%s", &s, &pp);
      		}
    - 	} else if (type == OBJ_TAG) {
    - 		struct tag *tag = lookup_tag(ds->repo, oid);
    - 		if (!parse_tag(tag) && tag->tag)
    --			strbuf_addf(&desc, " %s", tag->tag);
    -+			tag_desc = tag->tag;
    - 	}
    - 
    --	advise("  %s %s%s",
    --	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
    --	       type_name(type) ? type_name(type) : "unknown type",
    --	       desc.buf);
    -+	abbrev = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
    -+	if (type == OBJ_COMMIT) {
    ++
     +		/*
     +		 * TRANSLATORS: This is a line of ambiguous commit
     +		 * object output. E.g.:
     +		 *
     +		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
    -+		 *
    -+		 * The second argument is the "commit" string from
    -+		 * object.c, it should (hopefully) already be
    -+		 * translated.
     +		 */
    -+		strbuf_addf(&desc, _("%s %s %s - %s"), abbrev, ci_ad.buf,
    -+			    _(type_name(type)), ci_s.buf);
    -+	} else if (tag_desc) {
    ++		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
    ++
    ++		strbuf_release(&ad);
    ++		strbuf_release(&s);
    + 	} else if (type == OBJ_TAG) {
    + 		struct tag *tag = lookup_tag(ds->repo, oid);
    ++		const char *tag_tag = "";
    ++
    + 		if (!parse_tag(tag) && tag->tag)
    +-			strbuf_addf(&desc, " %s", tag->tag);
    ++			tag_tag = tag->tag;
    ++
     +		/*
     +		 * TRANSLATORS: This is a line of
     +		 * ambiguous tag object output. E.g.:
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
     +		 * object.c, it should (hopefully) already be
     +		 * translated.
     +		 */
    -+		strbuf_addf(&desc, _("%s %s %s"), abbrev, _(type_name(type)),
    -+			    tag_desc);
    -+	} else {
    -+		const char *tname = type_name(type) ? _(type_name(type)) :
    -+			_(unknown_type);
    ++		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
    ++	} else if (type == OBJ_TREE) {
     +		/*
     +		 * TRANSLATORS: This is a line of ambiguous <type>
    -+		 * object output. Where <type> is one of the object
    -+		 * types of "tree", "blob", "tag" ("commit" is handled
    -+		 * above).
    -+		 *
    -+		 *    "deadbeef tree"
    -+		 *    "deadbeef blob"
    -+		 *    "deadbeef tag"
    -+		 *    "deadbeef unknown type"
    -+		 *
    -+		 * Note that annotated tags use a separate format
    -+		 * outlined above.
    -+		 *
    -+		 * The second argument is the "tree", "blob" or "tag"
    -+		 * string from object.c, or the "unknown type" string
    -+		 * in the case of an unknown type. All of them should
    -+		 * (hopefully) already be translated.
    ++		 * object output. E.g. "deadbeef tree".
     +		 */
    -+		strbuf_addf(&desc, _("%s %s"), abbrev, tname);
    -+	}
    -+
    ++		strbuf_addf(&desc, _("%s tree"), hash);
    ++	} else if (type == OBJ_BLOB) {
    ++		/*
    ++		 * TRANSLATORS: This is a line of ambiguous <type>
    ++		 * object output. E.g. "deadbeef blob".
    ++		 */
    ++		strbuf_addf(&desc, _("%s blob"), hash);
    ++	} else {
    ++		BUG("unreachable");
    + 	}
    + 
    +-	advise("  %s %s%s",
    +-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
    +-	       type_name(type), desc.buf);
     +	/*
     +	 * TRANSLATORS: This is line item of ambiguous object output,
     +	 * translated above.
     +	 */
    -+	advise(_("  %s\n"), desc.buf);
    ++	advise(_("  %s"), desc.buf);
      
      	strbuf_release(&desc);
    -+	strbuf_release(&ci_ad);
    -+	strbuf_release(&ci_s);
      	return 0;
    - }
    - 
-:  ----------- > 3:  8bde4e174b7 object-name: show date for ambiguous tag objects
-- 
2.33.0.1492.g76eb1af92bc


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v3 1/3] object-name: remove unreachable "unknown type" handling
  2021-10-08 19:34   ` [PATCH v3 0/3] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
@ 2021-10-08 19:34     ` Ævar Arnfjörð Bjarmason
  2021-10-08 19:34     ` [PATCH v3 2/3] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 19:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Remove the "unknown type" handling when displaying the ambiguous
object list. See [1] for the current output, and [2] for the commit
that added the "unknown type" handling.

The reason this code wasn't reachable is because we're not passing in
OBJECT_INFO_ALLOW_UNKNOWN_TYPE, so we'll just die in sort_ambiguous()
before we get to show_ambiguous_object():

    $ git rev-parse 8315
    error: short object ID 8315 is ambiguous
    hint: The candidates are:
    fatal: invalid object type

We should do better here, but let's leave that for some future
improvement. In a subsequent commit I'll improve the output we do
show, and not having to handle the "unknown type" case simplifies that
change.

Even though we know that this isn't reachable let's back that up with
an assert() both for self-documentation and sanity checking.

1. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)
2. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/object-name.c b/object-name.c
index fdff4601b2c..59e934262e7 100644
--- a/object-name.c
+++ b/object-name.c
@@ -361,6 +361,8 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		return 0;
 
 	type = oid_object_info(ds->repo, oid, NULL);
+	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
+	       type == OBJ_BLOB || type == OBJ_TAG);
 	if (type == OBJ_COMMIT) {
 		struct commit *commit = lookup_commit(ds->repo, oid);
 		if (commit) {
@@ -376,8 +378,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 
 	advise("  %s %s%s",
 	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type) ? type_name(type) : "unknown type",
-	       desc.buf);
+	       type_name(type), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
-- 
2.33.0.1492.g76eb1af92bc


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v3 2/3] object-name: make ambiguous object output translatable
  2021-10-08 19:34   ` [PATCH v3 0/3] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2021-10-08 19:34     ` [PATCH v3 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
@ 2021-10-08 19:34     ` Ævar Arnfjörð Bjarmason
  2021-10-08 19:34     ` [PATCH v3 3/3] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
  2021-11-22 17:53     ` [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  3 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 19:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] and the preceding commit to be more friendly to
translators. By being able to customize the "<SP><SP>%s\n" format
we're even ready for RTL languages, who'd presumably like to change
that to "%s<SP><SP>\n".

1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)
2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/object-name.c b/object-name.c
index 59e934262e7..7a5355b4cf7 100644
--- a/object-name.c
+++ b/object-name.c
@@ -356,6 +356,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	const struct disambiguate_state *ds = data;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
+	const char *hash;
 
 	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
 		return 0;
@@ -363,22 +364,69 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	type = oid_object_info(ds->repo, oid, NULL);
 	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
 	       type == OBJ_BLOB || type == OBJ_TAG);
+	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
+
 	if (type == OBJ_COMMIT) {
+		struct strbuf ad = STRBUF_INIT;
+		struct strbuf s = STRBUF_INIT;
 		struct commit *commit = lookup_commit(ds->repo, oid);
+
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, " %ad - %s", &desc, &pp);
+			format_commit_message(commit, "%ad", &ad, &pp);
+			format_commit_message(commit, "%s", &s, &pp);
 		}
+
+		/*
+		 * TRANSLATORS: This is a line of ambiguous commit
+		 * object output. E.g.:
+		 *
+		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
+		 */
+		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
+
+		strbuf_release(&ad);
+		strbuf_release(&s);
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
+		const char *tag_tag = "";
+
 		if (!parse_tag(tag) && tag->tag)
-			strbuf_addf(&desc, " %s", tag->tag);
+			tag_tag = tag->tag;
+
+		/*
+		 * TRANSLATORS: This is a line of
+		 * ambiguous tag object output. E.g.:
+		 *
+		 *    "deadbeef tag Some Tag Message"
+		 *
+		 * The second argument is the "tag" string from
+		 * object.c, it should (hopefully) already be
+		 * translated.
+		 */
+		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+	} else if (type == OBJ_TREE) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef tree".
+		 */
+		strbuf_addf(&desc, _("%s tree"), hash);
+	} else if (type == OBJ_BLOB) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef blob".
+		 */
+		strbuf_addf(&desc, _("%s blob"), hash);
+	} else {
+		BUG("unreachable");
 	}
 
-	advise("  %s %s%s",
-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type), desc.buf);
+	/*
+	 * TRANSLATORS: This is line item of ambiguous object output,
+	 * translated above.
+	 */
+	advise(_("  %s"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
-- 
2.33.0.1492.g76eb1af92bc


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v3 3/3] object-name: show date for ambiguous tag objects
  2021-10-08 19:34   ` [PATCH v3 0/3] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
  2021-10-08 19:34     ` [PATCH v3 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
  2021-10-08 19:34     ` [PATCH v3 2/3] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-10-08 19:34     ` Ævar Arnfjörð Bjarmason
  2021-11-22 17:53     ` [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  3 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 19:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Make the ambiguous tag object output nicer in the case of tag objects
such as ebf3c04b262 (Git 2.32, 2021-06-06) by including the date in
the "tagger" header. I.e.:

    $ git rev-parse b7e68
    error: short object ID b7e68 is ambiguous
    hint: The candidates are:
    hint:   b7e68c41d92 tag 2021-06-06 - v2.32.0
    hint:   b7e68ae18e0 commit 2019-12-23 - bisect: use the standard 'if (!var)' way to check for 0
    hint:   b7e68f6b413 tree
    hint:   b7e68490b97 blob
    b7e68
    [...]

Before this we'd emit a "tag" line of:

    hint:   b7e68c41d92 tag v2.32.0

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/object-name.c b/object-name.c
index 7a5355b4cf7..29859d3eebe 100644
--- a/object-name.c
+++ b/object-name.c
@@ -391,9 +391,12 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		const char *tag_tag = "";
+		timestamp_t tag_date = 0;
 
-		if (!parse_tag(tag) && tag->tag)
+		if (!parse_tag(tag) && tag->tag) {
 			tag_tag = tag->tag;
+			tag_date = tag->date;
+		}
 
 		/*
 		 * TRANSLATORS: This is a line of
@@ -405,7 +408,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * object.c, it should (hopefully) already be
 		 * translated.
 		 */
-		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+			    show_date(tag_date, 0, DATE_MODE(SHORT)),
+			    tag_tag);
 	} else if (type == OBJ_TREE) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
-- 
2.33.0.1492.g76eb1af92bc


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date
  2021-10-08 19:34   ` [PATCH v3 0/3] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2021-10-08 19:34     ` [PATCH v3 3/3] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
@ 2021-11-22 17:53     ` Ævar Arnfjörð Bjarmason
  2021-11-22 17:53       ` [PATCH v4 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
                         ` (3 more replies)
  3 siblings, 4 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-22 17:53 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

This topic improves the output we emit on ambiguous objects as noted
in 3/3, and makes it translatable. See [1] for v3.

The only changes since v3 are minor commit message improvements
spotted while re-rolling this. I think reviewers were happy with it in
v3, but it fell through the cracks.

1. https://lore.kernel.org/git/cover-v3-0.3-00000000000-20211008T193041Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (3):
  object-name: remove unreachable "unknown type" handling
  object-name: make ambiguous object output translatable
  object-name: show date for ambiguous tag objects

 object-name.c | 68 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 61 insertions(+), 7 deletions(-)

Range-diff against v3:
1:  fb29e10ee35 ! 1:  2e7090c09f9 object-name: remove unreachable "unknown type" handling
    @@ Metadata
      ## Commit message ##
         object-name: remove unreachable "unknown type" handling
     
    -    Remove the "unknown type" handling when displaying the ambiguous
    -    object list. See [1] for the current output, and [1] for the commit
    -    that added the "unknown type" handling.
    +    Remove unreachable "unknown type" handling in the code that displays
    +    the ambiguous object list. See [1] for the current output, and [1] for
    +    the commit that added the "unknown type" handling.
     
         The reason this code wasn't reachable is because we're not passing in
    -    OBJECT_INFO_ALLOW_UNKNOWN_TYPE, so we'll just die in sort_ambiguous()
    +    OBJECT_INFO_ALLOW_UNKNOWN_TYPE, so we'll die in sort_ambiguous()
         before we get to show_ambiguous_object():
     
             $ git rev-parse 8315
2:  587a5717e47 ! 2:  00d84faeb1d object-name: make ambiguous object output translatable
    @@ Commit message
     
         Change the output of show_ambiguous_object() added in [1] and last
         tweaked in [2] and the preceding commit to be more friendly to
    -    translators. By being able to customize the "<SP><SP>%s\n" format
    -    we're even ready for RTL languages, who'd presumably like to change
    -    that to "%s<SP><SP>\n".
    +    translators.
    +
    +    By being able to customize the "<SP><SP>%s\n" format we're even ready
    +    for RTL languages, who'd presumably like to change that to
    +    "%s<SP><SP>\n".
     
         1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
            2016-09-26)
3:  8bde4e174b7 = 3:  9d24bab635d object-name: show date for ambiguous tag objects
-- 
2.34.0.822.gc64b680fd55


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v4 1/3] object-name: remove unreachable "unknown type" handling
  2021-11-22 17:53     ` [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
@ 2021-11-22 17:53       ` Ævar Arnfjörð Bjarmason
  2021-11-22 22:37         ` Jeff King
  2021-11-22 17:53       ` [PATCH v4 2/3] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-22 17:53 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Remove unreachable "unknown type" handling in the code that displays
the ambiguous object list. See [1] for the current output, and [2] for
the commit that added the "unknown type" handling.

The reason this code wasn't reachable is because we're not passing in
OBJECT_INFO_ALLOW_UNKNOWN_TYPE, so we'll die in sort_ambiguous()
before we get to show_ambiguous_object():

    $ git rev-parse 8315
    error: short object ID 8315 is ambiguous
    hint: The candidates are:
    fatal: invalid object type

We should do better here, but let's leave that for some future
improvement. In a subsequent commit I'll improve the output we do
show, and not having to handle the "unknown type" case simplifies that
change.

Even though we know that this isn't reachable let's back that up with
an assert() both for self-documentation and sanity checking.

1. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)
2. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/object-name.c b/object-name.c
index fdff4601b2c..59e934262e7 100644
--- a/object-name.c
+++ b/object-name.c
@@ -361,6 +361,8 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		return 0;
 
 	type = oid_object_info(ds->repo, oid, NULL);
+	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
+	       type == OBJ_BLOB || type == OBJ_TAG);
 	if (type == OBJ_COMMIT) {
 		struct commit *commit = lookup_commit(ds->repo, oid);
 		if (commit) {
@@ -376,8 +378,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 
 	advise("  %s %s%s",
 	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type) ? type_name(type) : "unknown type",
-	       desc.buf);
+	       type_name(type), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
-- 
2.34.0.822.gc64b680fd55


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v4 2/3] object-name: make ambiguous object output translatable
  2021-11-22 17:53     ` [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2021-11-22 17:53       ` [PATCH v4 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
@ 2021-11-22 17:53       ` Ævar Arnfjörð Bjarmason
  2021-11-22 17:53       ` [PATCH v4 3/3] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  3 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-22 17:53 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] and the preceding commit to be more friendly to
translators.

By being able to customize the "<SP><SP>%s\n" format we're even ready
for RTL languages, who'd presumably like to change that to
"%s<SP><SP>\n".

1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)
2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/object-name.c b/object-name.c
index 59e934262e7..7a5355b4cf7 100644
--- a/object-name.c
+++ b/object-name.c
@@ -356,6 +356,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	const struct disambiguate_state *ds = data;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
+	const char *hash;
 
 	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
 		return 0;
@@ -363,22 +364,69 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	type = oid_object_info(ds->repo, oid, NULL);
 	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
 	       type == OBJ_BLOB || type == OBJ_TAG);
+	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
+
 	if (type == OBJ_COMMIT) {
+		struct strbuf ad = STRBUF_INIT;
+		struct strbuf s = STRBUF_INIT;
 		struct commit *commit = lookup_commit(ds->repo, oid);
+
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, " %ad - %s", &desc, &pp);
+			format_commit_message(commit, "%ad", &ad, &pp);
+			format_commit_message(commit, "%s", &s, &pp);
 		}
+
+		/*
+		 * TRANSLATORS: This is a line of ambiguous commit
+		 * object output. E.g.:
+		 *
+		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
+		 */
+		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
+
+		strbuf_release(&ad);
+		strbuf_release(&s);
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
+		const char *tag_tag = "";
+
 		if (!parse_tag(tag) && tag->tag)
-			strbuf_addf(&desc, " %s", tag->tag);
+			tag_tag = tag->tag;
+
+		/*
+		 * TRANSLATORS: This is a line of
+		 * ambiguous tag object output. E.g.:
+		 *
+		 *    "deadbeef tag Some Tag Message"
+		 *
+		 * The second argument is the "tag" string from
+		 * object.c, it should (hopefully) already be
+		 * translated.
+		 */
+		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+	} else if (type == OBJ_TREE) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef tree".
+		 */
+		strbuf_addf(&desc, _("%s tree"), hash);
+	} else if (type == OBJ_BLOB) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef blob".
+		 */
+		strbuf_addf(&desc, _("%s blob"), hash);
+	} else {
+		BUG("unreachable");
 	}
 
-	advise("  %s %s%s",
-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type), desc.buf);
+	/*
+	 * TRANSLATORS: This is line item of ambiguous object output,
+	 * translated above.
+	 */
+	advise(_("  %s"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
-- 
2.34.0.822.gc64b680fd55


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v4 3/3] object-name: show date for ambiguous tag objects
  2021-11-22 17:53     ` [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2021-11-22 17:53       ` [PATCH v4 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
  2021-11-22 17:53       ` [PATCH v4 2/3] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-11-22 17:53       ` Ævar Arnfjörð Bjarmason
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  3 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-22 17:53 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Make the ambiguous tag object output nicer in the case of tag objects
such as ebf3c04b262 (Git 2.32, 2021-06-06) by including the date in
the "tagger" header. I.e.:

    $ git rev-parse b7e68
    error: short object ID b7e68 is ambiguous
    hint: The candidates are:
    hint:   b7e68c41d92 tag 2021-06-06 - v2.32.0
    hint:   b7e68ae18e0 commit 2019-12-23 - bisect: use the standard 'if (!var)' way to check for 0
    hint:   b7e68f6b413 tree
    hint:   b7e68490b97 blob
    b7e68
    [...]

Before this we'd emit a "tag" line of:

    hint:   b7e68c41d92 tag v2.32.0

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/object-name.c b/object-name.c
index 7a5355b4cf7..29859d3eebe 100644
--- a/object-name.c
+++ b/object-name.c
@@ -391,9 +391,12 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		const char *tag_tag = "";
+		timestamp_t tag_date = 0;
 
-		if (!parse_tag(tag) && tag->tag)
+		if (!parse_tag(tag) && tag->tag) {
 			tag_tag = tag->tag;
+			tag_date = tag->date;
+		}
 
 		/*
 		 * TRANSLATORS: This is a line of
@@ -405,7 +408,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * object.c, it should (hopefully) already be
 		 * translated.
 		 */
-		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+			    show_date(tag_date, 0, DATE_MODE(SHORT)),
+			    tag_tag);
 	} else if (type == OBJ_TREE) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
-- 
2.34.0.822.gc64b680fd55


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v4 1/3] object-name: remove unreachable "unknown type" handling
  2021-11-22 17:53       ` [PATCH v4 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
@ 2021-11-22 22:37         ` Jeff King
  0 siblings, 0 replies; 81+ messages in thread
From: Jeff King @ 2021-11-22 22:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Bagas Sanjaya

On Mon, Nov 22, 2021 at 06:53:23PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Remove unreachable "unknown type" handling in the code that displays
> the ambiguous object list. See [1] for the current output, and [1] for
> the commit that added the "unknown type" handling.
> 
> The reason this code wasn't reachable is because we're not passing in
> OBJECT_INFO_ALLOW_UNKNOWN_TYPE, so we'll die in sort_ambiguous()
> before we get to show_ambiguous_object():
> 
>     $ git rev-parse 8315
>     error: short object ID 8315 is ambiguous
>     hint: The candidates are:
>     fatal: invalid object type

I'm not so sure about this reasoning. In the code we are getting the
type fresh from oid_object_info():

> @@ -361,6 +361,8 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>  		return 0;
>  
>  	type = oid_object_info(ds->repo, oid, NULL);
> +	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
> +	       type == OBJ_BLOB || type == OBJ_TAG);

so at the very least we have to worry about the answer changing between
the two spots. You talk above about ALLOW_UNKNOWN_TYPE, but can't we
just get a straight "-1" if there's an error opening the object?

I'm also confused about the mention of die in sort_ambiguous(). It looks
like it would just produce a funny sort order in that case.

Here's a case that triggers the difference:

  git init repo
  cd repo

  one=$(echo 851 | git hash-object -w --stdin)
  two=$(echo 872 | git hash-object -w --stdin)
  oid=$(echo $two | cut -c1-4)

  fn=.git/objects/$(echo $two | perl -pe 's{..}{$&/}')
  chmod +w $fn
  echo broken >$fn

  git show $oid

Without your patch, it produces:

  error: short object ID ee3d is ambiguous
  hint: The candidates are:
  error: inflate: data stream error (incorrect header check)
  error: unable to unpack ee3d8abaa95a7395b373892b2593de2f426814e2 header
  error: inflate: data stream error (incorrect header check)
  error: unable to unpack ee3d8abaa95a7395b373892b2593de2f426814e2 header
  hint:   ee3d8ab unknown type
  hint:   ee3de99 blob

With your patch:

  error: short object ID ee3d is ambiguous
  hint: The candidates are:
  error: inflate: data stream error (incorrect header check)
  error: unable to unpack ee3d8abaa95a7395b373892b2593de2f426814e2 header
  error: inflate: data stream error (incorrect header check)
  error: unable to unpack ee3d8abaa95a7395b373892b2593de2f426814e2 header
  git: object-name.c:364: show_ambiguous_object: Assertion `type == OBJ_TREE || type == OBJ_COMMIT || type == OBJ_BLOB || type == OBJ_TAG' failed.
  Aborted
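
For reference, the loose object path in the reproduction above is the
OID with a "/" after its first two hex digits. A minimal sketch of that
mapping without perl, where oid_to_path is a made-up helper name (git's
own test suite has test_oid_to_path for the same purpose):

```shell
# oid_to_path <oid>: print "<first two hex digits>/<rest of the OID>",
# i.e. the path of a loose object relative to .git/objects.
# Hypothetical helper, for illustration only.
oid_to_path () {
	printf '%s/%s\n' \
		"$(printf '%s' "$1" | cut -c1-2)" \
		"$(printf '%s' "$1" | cut -c3-)"
}

oid_to_path ee3d8abaa95a7395b373892b2593de2f426814e2
```

With that, fn could be set to .git/objects/$(oid_to_path "$two").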

-Peff

^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date
  2021-11-22 17:53     ` [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                         ` (2 preceding siblings ...)
  2021-11-22 17:53       ` [PATCH v4 3/3] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
@ 2021-11-25 22:03       ` Ævar Arnfjörð Bjarmason
  2021-11-25 22:03         ` [PATCH v5 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
                           ` (6 more replies)
  3 siblings, 7 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-25 22:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

This topic improves the output we emit on ambiguous objects as noted
in 4/6, and makes it translatable, see 3/6. See [1] for v4.

This addresses the feedback Jeff King had on v4. There weren't any
tests for cases where we'd return -1 when parsing objects, and I was
focused on different object types in earlier iterations, and missed
that case.

So this v5 leads with some exhaustive testing of the existing
functionality to address that and other blind spots.

I then resurrected the patch from an earlier iteration to buffer the
output for a single advise() call at the end. As the exhaustive tests
we now have show, if we call error() while parsing our N objects
(which can and will happen several times on invalid objects), we'll
split up the header and body of the advise() output. By buffering it
up we're guaranteed to print the errors and the payload separately.
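
The buffering idea can be sketched in shell. This is only an
illustration of the ordering, not the C code in 5/6: describe_candidate
and show_candidates are hypothetical names, and the "bad*" rule is a
stand-in for an object that fails to parse.

```shell
# Hypothetical stand-in for inspecting one candidate object: it may
# complain on stderr before printing its one-line description on stdout.
describe_candidate () {
	case "$1" in
	bad*)
		echo "error: unable to unpack $1 header" >&2
		echo "$1 [bad object]"
		;;
	*)
		echo "$1 blob"
		;;
	esac
}

show_candidates () {
	body=
	for oid in "$@"
	do
		# Errors go to stderr right away; the hint line is only collected.
		line=$(describe_candidate "$oid")
		body="$body  $line
"
	done
	# Header and payload are printed together, after all the errors.
	printf 'hint: The candidates are:\n'
	printf '%s' "$body" | sed 's/^/hint: /'
}

show_candidates bad1 cafe
```

Because nothing is printed as a hint until the loop is done, any
"error:" lines can no longer end up between the header and the list.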

1. https://lore.kernel.org/git/cover-v4-0.3-00000000000-20211122T175219Z-avarab@gmail.com

Ævar Arnfjörð Bjarmason (6):
  object-name tests: add tests for ambiguous object blind spots
  object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  object-name: make ambiguous object output translatable
  object-name: show date for ambiguous tag objects
  object-name: iterate ambiguous objects before showing header
  object-name: re-use "struct strbuf" in show_ambiguous_object()

 object-name.c                       | 111 +++++++++++++++++++++++++---
 t/t1512-rev-parse-disambiguation.sh |  83 +++++++++++++++++++++
 2 files changed, 182 insertions(+), 12 deletions(-)

Range-diff against v4:
-:  ----------- > 1:  767165d096d object-name tests: add tests for ambiguous object blind spots
1:  2e7090c09f9 ! 2:  ee86912f1c1 object-name: remove unreachable "unknown type" handling
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    object-name: remove unreachable "unknown type" handling
    +    object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
     
    -    Remove unreachable "unknown type" handling in the code that displays
    -    the ambiguous object list. See [1] for the current output, and [1] for
    -    the commit that added the "unknown type" handling.
    +    Amend the "unknown type" handling in the code that displays the
    +    ambiguous object list to assert() that we're either going to get the
    +    "real" object types we can pass to type_name(), or a -1 (OBJ_BAD)
    +    return value from oid_object_info().
     
    -    The reason this code wasn't reachable is because we're not passing in
    -    OBJECT_INFO_ALLOW_UNKNOWN_TYPE, so we'll die in sort_ambiguous()
    -    before we get to show_ambiguous_object():
    +    See [1] for the current output, and [1] for the commit that added the
    +    "unknown type" handling.
     
    -        $ git rev-parse 8315
    -        error: short object ID 8315 is ambiguous
    -        hint: The candidates are:
    -        fatal: invalid object type
    +    We are never going to get an "unknown type" in the sense of custom
    +    types crafted with "hash-object --literally", since we're not using
    +    the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag.
     
    -    We should do better here, but let's leave that for some future
    -    improvement. In a subsequent commit I'll improve the output we do
    -    show, and not having to handle the "unknown type" case simplifies that
    -    change.
    +    If we manage to otherwise unpack such an object without errors we'll
    +    die() in parse_loose_header_extended() called by sort_ambiguous()
    +    before we get to show_ambiguous_object(), as is asserted by the test
    +    added in the preceding commit.
     
    -    Even though we know that this isn't reachable let's back that up with
    -    an assert() both for self-documentation and sanity checking.
    +    So saying "unknown type" here was always misleading, we really meant
    +    to say that we had a failure parsing the object at all, if the problem
    +    is only that it's type is unknown we won't reach this code.
    +
    +    So let's emit a generic "[bad object]" instead. As our tests added in
    +    the preceding commit show, we'll have emitted various "error" output
    +    already in those cases.
    +
    +    We should do better in the truly "unknown type" cases, which we'd need
    +    to handle if we were passing down the OBJECT_INFO_ALLOW_UNKNOWN_TYPE
    +    flag. But let's leave that for some future improvement. In a
    +    subsequent commit I'll improve the output we do show, and not having
    +    to handle the "unknown type" (as in OBJECT_INFO_ALLOW_UNKNOWN_TYPE)
    +    simplifies that change.
     
         1. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
            then SHA-1, 2018-05-10)
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
      		return 0;
      
      	type = oid_object_info(ds->repo, oid, NULL);
    ++
    ++	if (type < 0) {
    ++		strbuf_addstr(&desc, "[bad object]");
    ++		goto out;
    ++	}
    ++
     +	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
     +	       type == OBJ_BLOB || type == OBJ_TAG);
    ++	strbuf_addstr(&desc, type_name(type));
    ++
      	if (type == OBJ_COMMIT) {
      		struct commit *commit = lookup_commit(ds->repo, oid);
      		if (commit) {
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    + 			strbuf_addf(&desc, " %s", tag->tag);
    + 	}
      
    - 	advise("  %s %s%s",
    +-	advise("  %s %s%s",
    ++out:
    ++	advise("  %s %s",
      	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
     -	       type_name(type) ? type_name(type) : "unknown type",
    --	       desc.buf);
    -+	       type_name(type), desc.buf);
    + 	       desc.buf);
      
      	strbuf_release(&desc);
    - 	return 0;
    +
    + ## t/t1512-rev-parse-disambiguation.sh ##
    +@@ t/t1512-rev-parse-disambiguation.sh: test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
    + 	error: unable to unpack cafe... header
    + 	error: inflate: data stream error (incorrect header check)
    + 	error: unable to unpack cafe... header
    +-	hint:   cafe... unknown type
    ++	hint:   cafe... [bad object]
    + 	hint:   cafe... blob
    + 	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
    + 	Use '\''--'\'' to separate paths from revisions, like this:
2:  00d84faeb1d ! 3:  b79964483e8 object-name: make ambiguous object output translatable
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
      
      	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
      		return 0;
    -@@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    + 
    ++	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
      	type = oid_object_info(ds->repo, oid, NULL);
    + 
    + 	if (type < 0) {
    +-		strbuf_addstr(&desc, "[bad object]");
    ++		/*
    ++		 * TRANSLATORS: This is a line of ambiguous object
    ++		 * output shown when we cannot look up or parse the
    ++		 * object in question. E.g. "deadbeef [bad object]".
    ++		 */
    ++		strbuf_addf(&desc, _("%s [bad object]"), hash);
    + 		goto out;
    + 	}
    + 
      	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
      	       type == OBJ_BLOB || type == OBJ_TAG);
    -+	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
    -+
    +-	strbuf_addstr(&desc, type_name(type));
    + 
      	if (type == OBJ_COMMIT) {
     +		struct strbuf ad = STRBUF_INIT;
     +		struct strbuf s = STRBUF_INIT;
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
     +		 * object output. E.g. "deadbeef blob".
     +		 */
     +		strbuf_addf(&desc, _("%s blob"), hash);
    -+	} else {
    -+		BUG("unreachable");
      	}
      
    --	advise("  %s %s%s",
    ++
    + out:
    +-	advise("  %s %s",
     -	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
    --	       type_name(type), desc.buf);
    +-	       desc.buf);
     +	/*
    -+	 * TRANSLATORS: This is line item of ambiguous object output,
    -+	 * translated above.
    ++	 * TRANSLATORS: This is line item of ambiguous object output
    ++	 * from describe_ambiguous_object() above.
     +	 */
     +	advise(_("  %s"), desc.buf);
      
3:  9d24bab635d ! 4:  36b6b440c37 object-name: show date for ambiguous tag objects
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
      
      		/*
      		 * TRANSLATORS: This is a line of
    -@@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    + 		 * ambiguous tag object output. E.g.:
    + 		 *
    +-		 *    "deadbeef tag Some Tag Message"
    ++		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
    + 		 *
    + 		 * The second argument is the "tag" string from
      		 * object.c, it should (hopefully) already be
      		 * translated.
      		 */
-:  ----------- > 5:  8880c283559 object-name: iterate ambiguous objects before showing header
-:  ----------- > 6:  78bb0995f08 object-name: re-use "struct strbuf" in show_ambiguous_object()
-- 
2.34.1.838.g779e9098efb


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v5 1/6] object-name tests: add tests for ambiguous object blind spots
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
@ 2021-11-25 22:03         ` Ævar Arnfjörð Bjarmason
  2021-12-23 21:51           ` Josh Steadmon
  2021-11-25 22:03         ` [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
                           ` (5 subsequent siblings)
  6 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-25 22:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Extend the tests for ambiguous objects to check how we handle objects
where we return OBJ_BAD when trying to parse them. As noted in [1] we
have a blind spot when it comes to this behavior.

Since we need to add new test data here, let's make these new tests
run under SHA-256 as well. In d7a2fc82491 (t1512: skip test if not
using SHA-1, 2018-05-13) all of the existing tests were skipped when
not using SHA-1, as they rely on specific SHA-1 object IDs.

For these tests it only matters that the first 4 characters of the OID
prefix are the same for both SHA-1 and SHA-256. This uses strings that
I mined, which have the same prefix when hashed with both.
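
For the curious, a rough sketch of how such strings could be mined.
This is not the tool used to find the strings in this patch; it assumes
sha1sum and sha256sum from coreutils are available, and the names
blob_oid and mine_prefix are made up for the illustration. It relies
only on a blob's OID being the hash of "blob <size>\0<content>".

```shell
# blob_oid <hash-program> <string>: the object ID of a blob holding
# "<string>\n", i.e. what "echo <string> | git hash-object --stdin"
# would hash. Assumes ASCII input, so ${#2} is the byte length.
blob_oid () {
	{
		printf 'blob %d\0' "$((${#2} + 1))"
		printf '%s\n' "$2"
	} | "$1" | cut -d' ' -f1
}

# mine_prefix <hex-prefix> <max-tries>: print the first candidate
# string whose SHA-1 and SHA-256 blob IDs both start with <hex-prefix>.
mine_prefix () {
	i=0
	while [ "$i" -lt "$2" ]
	do
		candidate="seed$i"
		case "$(blob_oid sha1sum "$candidate")" in
		"$1"*)
			case "$(blob_oid sha256sum "$candidate")" in
			"$1"*)
				echo "$candidate"
				return 0
				;;
			esac
			;;
		esac
		i=$((i + 1))
	done
	return 1
}

# A one-digit prefix needs about 256 tries on average.
mine_prefix d 5000
```

A four-digit prefix such as "dead" has to match 32 bits across the two
hashes, so it takes billions of tries on average; a shell loop like
this is far too slow for that and only shows the principle.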

1. https://lore.kernel.org/git/YZwbphPpfGk78w2f@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1512-rev-parse-disambiguation.sh | 84 +++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 7891a6becf3..ae1c0cf2b21 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -25,6 +25,90 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+test_cmp_failed_rev_parse () {
+	dir=$1
+	rev=$2
+	shift
+
+	test_must_fail git -C "$dir" rev-parse "$rev" 2>actual.raw &&
+	sed "s/\($rev\)[0-9a-f]*/\1.../g" <actual.raw >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ambiguous blob output' '
+	git init --bare blob.prefix &&
+	(
+		cd blob.prefix &&
+
+		# Both start with "dead..", under both SHA-1 and SHA-256
+		echo brocdnra | git hash-object -w --stdin &&
+		echo brigddsv | git hash-object -w --stdin &&
+
+		# Both start with "beef.."
+		echo 1agllotbh | git hash-object -w --stdin &&
+		echo 1bbfctrkc | git hash-object -w --stdin
+	) &&
+
+	cat >expect <<-\EOF &&
+	error: short object ID beef... is ambiguous
+	hint: The candidates are:
+	hint:   beef... blob
+	hint:   beef... blob
+	fatal: ambiguous argument '\''beef...'\'': unknown revision or path not in the working tree.
+	Use '\''--'\'' to separate paths from revisions, like this:
+	'\''git <command> [<revision>...] -- [<file>...]'\''
+	EOF
+	test_cmp_failed_rev_parse blob.prefix beef
+'
+
+test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '
+	git init --bare blob.bad &&
+	(
+		cd blob.bad &&
+
+		# Both have the prefix "bad0"
+		echo xyzfaowcoh | git hash-object -t bad -w --stdin --literally &&
+		echo xyzhjpyvwl | git hash-object -t bad -w --stdin --literally
+	) &&
+
+	cat >expect <<-\EOF &&
+	error: short object ID bad0... is ambiguous
+	hint: The candidates are:
+	fatal: invalid object type
+	EOF
+	test_cmp_failed_rev_parse blob.bad bad0
+'
+
+test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
+	git init --bare blob.corrupt &&
+	(
+		cd blob.corrupt &&
+
+		# Both have the prefix "cafe"
+		echo bnkxmdwz | git hash-object -w --stdin &&
+		oid=$(echo bmwsjxzi | git hash-object -w --stdin) &&
+
+		oidf=objects/$(test_oid_to_path "$oid") &&
+		chmod 755 $oidf &&
+		echo broken >$oidf
+	) &&
+
+	cat >expect <<-\EOF &&
+	error: short object ID cafe... is ambiguous
+	hint: The candidates are:
+	error: inflate: data stream error (incorrect header check)
+	error: unable to unpack cafe... header
+	error: inflate: data stream error (incorrect header check)
+	error: unable to unpack cafe... header
+	hint:   cafe... unknown type
+	hint:   cafe... blob
+	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
+	Use '\''--'\'' to separate paths from revisions, like this:
+	'\''git <command> [<revision>...] -- [<file>...]'\''
+	EOF
+	test_cmp_failed_rev_parse blob.corrupt cafe
+'
+
 if ! test_have_prereq SHA1
 then
 	skip_all='not using SHA-1 for objects'
-- 
2.34.1.838.g779e9098efb


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2021-11-25 22:03         ` [PATCH v5 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
@ 2021-11-25 22:03         ` Ævar Arnfjörð Bjarmason
  2021-12-23 21:51           ` Josh Steadmon
  2021-12-23 22:42           ` Junio C Hamano
  2021-11-25 22:03         ` [PATCH v5 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
                           ` (4 subsequent siblings)
  6 siblings, 2 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-25 22:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Amend the "unknown type" handling in the code that displays the
ambiguous object list to assert() that we're either going to get the
"real" object types we can pass to type_name(), or a -1 (OBJ_BAD)
return value from oid_object_info().

See [1] for the current output, and [2] for the commit that added the
"unknown type" handling.

We are never going to get an "unknown type" in the sense of custom
types crafted with "hash-object --literally", since we're not using
the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag.

If we manage to otherwise unpack such an object without errors we'll
die() in parse_loose_header_extended() called by sort_ambiguous()
before we get to show_ambiguous_object(), as is asserted by the test
added in the preceding commit.

So saying "unknown type" here was always misleading. We really meant
to say that we failed to parse the object at all; if the problem is
only that its type is unknown we won't reach this code.

So let's emit a generic "[bad object]" instead. As our tests added in
the preceding commit show, we'll have emitted various "error" output
already in those cases.

We should do better in the truly "unknown type" cases, which we'd need
to handle if we were passing down the OBJECT_INFO_ALLOW_UNKNOWN_TYPE
flag. But let's leave that for some future improvement. In a
subsequent commit I'll improve the output we do show, and not having
to handle the "unknown type" (as in OBJECT_INFO_ALLOW_UNKNOWN_TYPE)
simplifies that change.

1. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)
2. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c                       | 14 ++++++++++++--
 t/t1512-rev-parse-disambiguation.sh |  2 +-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/object-name.c b/object-name.c
index fdff4601b2c..9750634ee76 100644
--- a/object-name.c
+++ b/object-name.c
@@ -361,6 +361,16 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		return 0;
 
 	type = oid_object_info(ds->repo, oid, NULL);
+
+	if (type < 0) {
+		strbuf_addstr(&desc, "[bad object]");
+		goto out;
+	}
+
+	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
+	       type == OBJ_BLOB || type == OBJ_TAG);
+	strbuf_addstr(&desc, type_name(type));
+
 	if (type == OBJ_COMMIT) {
 		struct commit *commit = lookup_commit(ds->repo, oid);
 		if (commit) {
@@ -374,9 +384,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 			strbuf_addf(&desc, " %s", tag->tag);
 	}
 
-	advise("  %s %s%s",
+out:
+	advise("  %s %s",
 	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type) ? type_name(type) : "unknown type",
 	       desc.buf);
 
 	strbuf_release(&desc);
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index ae1c0cf2b21..f1948980dff 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -100,7 +100,7 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
 	error: unable to unpack cafe... header
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
-	hint:   cafe... unknown type
+	hint:   cafe... [bad object]
 	hint:   cafe... blob
 	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
 	Use '\''--'\'' to separate paths from revisions, like this:
-- 
2.34.1.838.g779e9098efb



* [PATCH v5 3/6] object-name: make ambiguous object output translatable
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2021-11-25 22:03         ` [PATCH v5 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
  2021-11-25 22:03         ` [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
@ 2021-11-25 22:03         ` Ævar Arnfjörð Bjarmason
  2021-12-23 21:54           ` [PATCH] fixup! " Josh Steadmon
  2021-11-25 22:03         ` [PATCH v5 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
                           ` (3 subsequent siblings)
  6 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-25 22:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] and the preceding commit to be more friendly to
translators.

By being able to customize the "<SP><SP>%s\n" format we're even ready
for RTL languages, whose translators would presumably like to change
that to "%s<SP><SP>\n".

1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)
2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)
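
To make the point about the customizable line-item format concrete,
here is a small standalone sketch. It does not use gettext or git's
strbuf; format_line_item() is a hypothetical helper, and "item_fmt"
stands in for whatever string a translation would supply for the
"  %s\n" message (it is assumed to contain exactly one "%s").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for formatting one line item. The caller
 * passes the (possibly translated) item format, so the indentation
 * can sit before or after the already-composed description.
 */
static int format_line_item(char *out, size_t n, const char *item_fmt,
			    const char *desc)
{
	assert(out && n > 0);
	return snprintf(out, n, item_fmt, desc);
}
```

Calling it with "  %s\n" yields an indented line, while a translation
that supplies "%s  \n" moves the padding to the other side without any
change to the code that composes the description.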

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 64 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 57 insertions(+), 7 deletions(-)

diff --git a/object-name.c b/object-name.c
index 9750634ee76..1dcbba7fa76 100644
--- a/object-name.c
+++ b/object-name.c
@@ -356,38 +356,88 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	const struct disambiguate_state *ds = data;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
+	const char *hash;
 
 	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
 		return 0;
 
+	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
 	type = oid_object_info(ds->repo, oid, NULL);
 
 	if (type < 0) {
-		strbuf_addstr(&desc, "[bad object]");
+		/*
+		 * TRANSLATORS: This is a line of ambiguous object
+		 * output shown when we cannot look up or parse the
+		 * object in question. E.g. "deadbeef [bad object]".
+		 */
+		strbuf_addf(&desc, _("%s [bad object]"), hash);
 		goto out;
 	}
 
 	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
 	       type == OBJ_BLOB || type == OBJ_TAG);
-	strbuf_addstr(&desc, type_name(type));
 
 	if (type == OBJ_COMMIT) {
+		struct strbuf ad = STRBUF_INIT;
+		struct strbuf s = STRBUF_INIT;
 		struct commit *commit = lookup_commit(ds->repo, oid);
+
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, " %ad - %s", &desc, &pp);
+			format_commit_message(commit, "%ad", &ad, &pp);
+			format_commit_message(commit, "%s", &s, &pp);
 		}
+
+		/*
+		 * TRANSLATORS: This is a line of ambiguous commit
+		 * object output. E.g.:
+		 *
+		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
+		 */
+		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
+
+		strbuf_release(&ad);
+		strbuf_release(&s);
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
+		const char *tag_tag = "";
+
 		if (!parse_tag(tag) && tag->tag)
-			strbuf_addf(&desc, " %s", tag->tag);
+			tag_tag = tag->tag;
+
+		/*
+		 * TRANSLATORS: This is a line of
+		 * ambiguous tag object output. E.g.:
+		 *
+		 *    "deadbeef tag Some Tag Message"
+		 *
+		 * The second argument is the "tag" string from
+		 * object.c, it should (hopefully) already be
+		 * translated.
+		 */
+		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+	} else if (type == OBJ_TREE) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef tree".
+		 */
+		strbuf_addf(&desc, _("%s tree"), hash);
+	} else if (type == OBJ_BLOB) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef blob".
+		 */
+		strbuf_addf(&desc, _("%s blob"), hash);
 	}
 
+
 out:
-	advise("  %s %s",
-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       desc.buf);
+	/*
+	 * TRANSLATORS: This is line item of ambiguous object output
+	 * from describe_ambiguous_object() above.
+	 */
+	advise(_("  %s"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
-- 
2.34.1.838.g779e9098efb



* [PATCH v5 4/6] object-name: show date for ambiguous tag objects
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                           ` (2 preceding siblings ...)
  2021-11-25 22:03         ` [PATCH v5 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-11-25 22:03         ` Ævar Arnfjörð Bjarmason
  2021-11-25 22:03         ` [PATCH v5 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
                           ` (2 subsequent siblings)
  6 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-25 22:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Make the ambiguous tag object output nicer in the case of tag objects
such as ebf3c04b262 (Git 2.32, 2021-06-06) by including the date in
the "tagger" header. I.e.:

    $ git rev-parse b7e68
    error: short object ID b7e68 is ambiguous
    hint: The candidates are:
    hint:   b7e68c41d92 tag 2021-06-06 - v2.32.0
    hint:   b7e68ae18e0 commit 2019-12-23 - bisect: use the standard 'if (!var)' way to check for 0
    hint:   b7e68f6b413 tree
    hint:   b7e68490b97 blob
    b7e68
    [...]

Before this we'd emit a "tag" line of:

    hint:   b7e68c41d92 tag v2.32.0

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/object-name.c b/object-name.c
index 1dcbba7fa76..707480ed191 100644
--- a/object-name.c
+++ b/object-name.c
@@ -402,21 +402,26 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		const char *tag_tag = "";
+		timestamp_t tag_date = 0;
 
-		if (!parse_tag(tag) && tag->tag)
+		if (!parse_tag(tag) && tag->tag) {
 			tag_tag = tag->tag;
+			tag_date = tag->date;
+		}
 
 		/*
 		 * TRANSLATORS: This is a line of
 		 * ambiguous tag object output. E.g.:
 		 *
-		 *    "deadbeef tag Some Tag Message"
+		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
 		 *
 		 * The second argument is the "tag" string from
 		 * object.c, it should (hopefully) already be
 		 * translated.
 		 */
-		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+			    show_date(tag_date, 0, DATE_MODE(SHORT)),
+			    tag_tag);
 	} else if (type == OBJ_TREE) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
-- 
2.34.1.838.g779e9098efb



* [PATCH v5 5/6] object-name: iterate ambiguous objects before showing header
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                           ` (3 preceding siblings ...)
  2021-11-25 22:03         ` [PATCH v5 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
@ 2021-11-25 22:03         ` Ævar Arnfjörð Bjarmason
  2021-11-25 22:03         ` [PATCH v5 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-25 22:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Change the "The candidates are" header that's shown for ambiguous
objects to be shown after we've iterated over all of the objects.

If we get any errors while doing so we don't want to split up the
header and the list as a result. The two will now be printed together,
as shown in the updated testcase.

As we're accumulating the lines into a "struct strbuf" before
emitting them we need to add a trailing newline to the call in
show_ambiguous_object(). This and the change from "The candidates
are:" to "The candidates are:\n%s" helps to give translators more
context.
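
The accumulate-then-emit idea above can be sketched outside of git as
follows. This is only an illustration under stated assumptions:
"struct buf", buf_addf() and format_candidates() are hypothetical
stand-ins written for this example, not git's strbuf API, and the
result is returned in a buffer rather than passed to advise().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal stand-in for a growable string buffer. */
struct buf {
	char *s;
	size_t len;
};

/* Append printf-style formatted text, growing the buffer as needed. */
static void buf_addf(struct buf *b, const char *fmt, ...)
{
	va_list ap;
	int n;

	/* First pass: measure how many bytes the formatted text needs. */
	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	assert(n >= 0);

	b->s = realloc(b->s, b->len + n + 1);
	assert(b->s);

	/* Second pass: write the text after the existing contents. */
	va_start(ap, fmt);
	vsnprintf(b->s + b->len, n + 1, fmt, ap);
	va_end(ap);
	b->len += n;
}

/* Collect every line item first, then build header and list together. */
static void format_candidates(struct buf *out, const char **desc, size_t nr)
{
	struct buf advice = { NULL, 0 };
	size_t i;

	for (i = 0; i < nr; i++)
		buf_addf(&advice, "  %s\n", desc[i]);

	buf_addf(out, "The candidates are:\n%s", advice.s ? advice.s : "");
	free(advice.s);
}
```

Because the header is only added once all items have been collected,
anything printed while producing an item cannot end up between the
header and the list.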

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c                       | 27 +++++++++++++++++++++++----
 t/t1512-rev-parse-disambiguation.sh |  3 +--
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/object-name.c b/object-name.c
index 707480ed191..fd8b9244b5e 100644
--- a/object-name.c
+++ b/object-name.c
@@ -351,9 +351,16 @@ static int init_object_disambiguation(struct repository *r,
 	return 0;
 }
 
+struct ambiguous_output {
+	const struct disambiguate_state *ds;
+	struct strbuf advice;
+};
+
 static int show_ambiguous_object(const struct object_id *oid, void *data)
 {
-	const struct disambiguate_state *ds = data;
+	struct ambiguous_output *state = data;
+	const struct disambiguate_state *ds = state->ds;
+	struct strbuf *advice = &state->advice;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 	const char *hash;
@@ -442,7 +449,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	 * TRANSLATORS: This is line item of ambiguous object output
 	 * from describe_ambiguous_object() above.
 	 */
-	advise(_("  %s"), desc.buf);
+	strbuf_addf(advice, _("  %s\n"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
@@ -541,6 +548,10 @@ static enum get_oid_result get_short_oid(struct repository *r,
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
 		struct oid_array collect = OID_ARRAY_INIT;
+		struct ambiguous_output out = {
+			.ds = &ds,
+			.advice = STRBUF_INIT,
+		};
 
 		error(_("short object ID %s is ambiguous"), ds.hex_pfx);
 
@@ -553,13 +564,21 @@ static enum get_oid_result get_short_oid(struct repository *r,
 		if (!ds.ambiguous)
 			ds.fn = NULL;
 
-		advise(_("The candidates are:"));
 		repo_for_each_abbrev(r, ds.hex_pfx, collect_ambiguous, &collect);
 		sort_ambiguous_oid_array(r, &collect);
 
-		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+		if (oid_array_for_each(&collect, show_ambiguous_object, &out))
 			BUG("show_ambiguous_object shouldn't return non-zero");
+
+		/*
+		 * TRANSLATORS: The argument is the list of ambiguous
+		 * objects composed in show_ambiguous_object(). See
+		 * its "TRANSLATORS" comments for details.
+		 */
+		advise(_("The candidates are:\n%s"), out.advice.buf);
+
 		oid_array_clear(&collect);
+		strbuf_release(&out.advice);
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index f1948980dff..9e67231cdbf 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -73,7 +73,6 @@ test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '
 
 	cat >expect <<-\EOF &&
 	error: short object ID bad0... is ambiguous
-	hint: The candidates are:
 	fatal: invalid object type
 	EOF
 	test_cmp_failed_rev_parse blob.bad bad0
@@ -95,11 +94,11 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
 
 	cat >expect <<-\EOF &&
 	error: short object ID cafe... is ambiguous
-	hint: The candidates are:
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
+	hint: The candidates are:
 	hint:   cafe... [bad object]
 	hint:   cafe... blob
 	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
-- 
2.34.1.838.g779e9098efb



* [PATCH v5 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object()
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                           ` (4 preceding siblings ...)
  2021-11-25 22:03         ` [PATCH v5 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
@ 2021-11-25 22:03         ` Ævar Arnfjörð Bjarmason
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-25 22:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason

Reduce the allocations done by show_ambiguous_object() by moving the
"desc" strbuf into the "struct ambiguous_output" introduced in the
preceding commit.

This doesn't matter for optimization purposes, but since we're
accumulating a "struct strbuf advice" anyway let's follow that pattern
and add a "struct strbuf sb". We can then strbuf_reset() it rather
than calling strbuf_release() for each call to
show_ambiguous_object().
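
For readers unfamiliar with the reset-versus-release distinction, here
is a standalone sketch of the pattern. The "struct buf" type and the
buf_addstr(), buf_reset() and buf_release() helpers are hypothetical
and written only for this example; they are assumed to behave roughly
like the strbuf functions named above, not to match their
implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer that tracks its allocation size. */
struct buf {
	char *s;
	size_t len, alloc;
};

/* Append a NUL-terminated string, reallocating only when needed. */
static void buf_addstr(struct buf *b, const char *str)
{
	size_t n = strlen(str);

	if (b->len + n + 1 > b->alloc) {
		b->alloc = (b->len + n + 1) * 2;
		b->s = realloc(b->s, b->alloc);
		assert(b->s);
	}
	memcpy(b->s + b->len, str, n + 1);
	b->len += n;
}

/* Empty the buffer but keep its allocation for the next use. */
static void buf_reset(struct buf *b)
{
	b->len = 0;
	if (b->s)
		b->s[0] = '\0';
}

/* Free the allocation; done once, after the last use. */
static void buf_release(struct buf *b)
{
	free(b->s);
	b->s = NULL;
	b->len = b->alloc = 0;
}
```

A callback invoked once per candidate would call buf_reset() at the end
of each call, and the owner of the buffer would call buf_release() once
after the iteration is finished.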

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/object-name.c b/object-name.c
index fd8b9244b5e..f96552e7af7 100644
--- a/object-name.c
+++ b/object-name.c
@@ -354,6 +354,7 @@ static int init_object_disambiguation(struct repository *r,
 struct ambiguous_output {
 	const struct disambiguate_state *ds;
 	struct strbuf advice;
+	struct strbuf sb;
 };
 
 static int show_ambiguous_object(const struct object_id *oid, void *data)
@@ -361,7 +362,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	struct ambiguous_output *state = data;
 	const struct disambiguate_state *ds = state->ds;
 	struct strbuf *advice = &state->advice;
-	struct strbuf desc = STRBUF_INIT;
+	struct strbuf *sb = &state->sb;
 	int type;
 	const char *hash;
 
@@ -377,7 +378,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * output shown when we cannot look up or parse the
 		 * object in question. E.g. "deadbeef [bad object]".
 		 */
-		strbuf_addf(&desc, _("%s [bad object]"), hash);
+		strbuf_addf(sb, _("%s [bad object]"), hash);
 		goto out;
 	}
 
@@ -402,7 +403,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 *
 		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
 		 */
-		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
+		strbuf_addf(sb, _("%s commit %s - %s"), hash, ad.buf, s.buf);
 
 		strbuf_release(&ad);
 		strbuf_release(&s);
@@ -426,7 +427,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * object.c, it should (hopefully) already be
 		 * translated.
 		 */
-		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+		strbuf_addf(sb, _("%s tag %s - %s"), hash,
 			    show_date(tag_date, 0, DATE_MODE(SHORT)),
 			    tag_tag);
 	} else if (type == OBJ_TREE) {
@@ -434,13 +435,13 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * TRANSLATORS: This is a line of ambiguous <type>
 		 * object output. E.g. "deadbeef tree".
 		 */
-		strbuf_addf(&desc, _("%s tree"), hash);
+		strbuf_addf(sb, _("%s tree"), hash);
 	} else if (type == OBJ_BLOB) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
 		 * object output. E.g. "deadbeef blob".
 		 */
-		strbuf_addf(&desc, _("%s blob"), hash);
+		strbuf_addf(sb, _("%s blob"), hash);
 	}
 
 
@@ -449,9 +450,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	 * TRANSLATORS: This is line item of ambiguous object output
 	 * from describe_ambiguous_object() above.
 	 */
-	strbuf_addf(advice, _("  %s\n"), desc.buf);
+	strbuf_addf(advice, _("  %s\n"), sb->buf);
 
-	strbuf_release(&desc);
+	strbuf_reset(sb);
 	return 0;
 }
 
@@ -550,6 +551,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
 		struct oid_array collect = OID_ARRAY_INIT;
 		struct ambiguous_output out = {
 			.ds = &ds,
+			.sb = STRBUF_INIT,
 			.advice = STRBUF_INIT,
 		};
 
@@ -579,6 +581,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
 
 		oid_array_clear(&collect);
 		strbuf_release(&out.advice);
+		strbuf_release(&out.sb);
 	}
 
 	return status;
-- 
2.34.1.838.g779e9098efb



* Re: [PATCH v5 1/6] object-name tests: add tests for ambiguous object blind spots
  2021-11-25 22:03         ` [PATCH v5 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
@ 2021-12-23 21:51           ` Josh Steadmon
  0 siblings, 0 replies; 81+ messages in thread
From: Josh Steadmon @ 2021-12-23 21:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Bagas Sanjaya

On 2021.11.25 23:03, Ævar Arnfjörð Bjarmason wrote:
> Extend the tests for ambiguous objects to check how we handle objects
> where we return OBJ_BAD when trying to parse them. As noted in [1] we
> have a blindspot when it comes to this behavior.
> 
> Since we need to add new test data here let's extend these tests to be
> tested under SHA-256, in d7a2fc82491 (t1512: skip test if not using
> SHA-1, 2018-05-13) all of the existing tests were skipped, as they
> rely on specific SHA-1 object IDs.
> 
> For these tests it only matters that the first 4 characters of the OID
> prefix are the same for both SHA-1 and SHA-256. This uses strings that
> I mined, and have the same prefix when hashed with both.
> 
> 1. https://lore.kernel.org/git/YZwbphPpfGk78w2f@coredump.intra.peff.net/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  t/t1512-rev-parse-disambiguation.sh | 84 +++++++++++++++++++++++++++++
>  1 file changed, 84 insertions(+)
> 
> diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
> index 7891a6becf3..ae1c0cf2b21 100755
> --- a/t/t1512-rev-parse-disambiguation.sh
> +++ b/t/t1512-rev-parse-disambiguation.sh
> @@ -25,6 +25,90 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>  
>  . ./test-lib.sh
>  
> +test_cmp_failed_rev_parse () {
> +	dir=$1
> +	rev=$2
> +	shift
> +
> +	test_must_fail git -C "$dir" rev-parse "$rev" 2>actual.raw &&
> +	sed "s/\($rev\)[0-9a-f]*/\1.../g" <actual.raw >actual &&
> +	test_cmp expect actual
> +}
> +
> +test_expect_success 'ambiguous blob output' '
> +	git init --bare blob.prefix &&
> +	(
> +		cd blob.prefix &&
> +
> +		# Both start with "dead..", under both SHA-1 and SHA-256
> +		echo brocdnra | git hash-object -w --stdin &&
> +		echo brigddsv | git hash-object -w --stdin &&

These "dead.." objects don't seem to be used later, unless I've missed
something.


> +		# Both start with "beef.."
> +		echo 1agllotbh | git hash-object -w --stdin &&
> +		echo 1bbfctrkc | git hash-object -w --stdin
> +	) &&
> +
> +	cat >expect <<-\EOF &&
> +	error: short object ID beef... is ambiguous
> +	hint: The candidates are:
> +	hint:   beef... blob
> +	hint:   beef... blob
> +	fatal: ambiguous argument '\''beef...'\'': unknown revision or path not in the working tree.
> +	Use '\''--'\'' to separate paths from revisions, like this:
> +	'\''git <command> [<revision>...] -- [<file>...]'\''
> +	EOF
> +	test_cmp_failed_rev_parse blob.prefix beef
> +'

Rather than comparing the entire output (which can be brittle), can we
just grep for the important parts of the error message and compare
those?


> +test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '
> +	git init --bare blob.bad &&
> +	(
> +		cd blob.bad &&
> +
> +		# Both have the prefix "bad0"
> +		echo xyzfaowcoh | git hash-object -t bad -w --stdin --literally &&
> +		echo xyzhjpyvwl | git hash-object -t bad -w --stdin --literally
> +	) &&
> +
> +	cat >expect <<-\EOF &&
> +	error: short object ID bad0... is ambiguous
> +	hint: The candidates are:
> +	fatal: invalid object type
> +	EOF
> +	test_cmp_failed_rev_parse blob.bad bad0
> +'
> +
> +test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
> +	git init --bare blob.corrupt &&
> +	(
> +		cd blob.corrupt &&
> +
> +		# Both have the prefix "cafe"
> +		echo bnkxmdwz | git hash-object -w --stdin &&
> +		oid=$(echo bmwsjxzi | git hash-object -w --stdin) &&
> +
> +		oidf=objects/$(test_oid_to_path "$oid") &&
> +		chmod 755 $oidf &&
> +		echo broken >$oidf
> +	) &&
> +
> +	cat >expect <<-\EOF &&
> +	error: short object ID cafe... is ambiguous
> +	hint: The candidates are:
> +	error: inflate: data stream error (incorrect header check)
> +	error: unable to unpack cafe... header
> +	error: inflate: data stream error (incorrect header check)
> +	error: unable to unpack cafe... header
> +	hint:   cafe... unknown type
> +	hint:   cafe... blob
> +	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
> +	Use '\''--'\'' to separate paths from revisions, like this:
> +	'\''git <command> [<revision>...] -- [<file>...]'\''
> +	EOF
> +	test_cmp_failed_rev_parse blob.corrupt cafe
> +'
> +
>  if ! test_have_prereq SHA1
>  then
>  	skip_all='not using SHA-1 for objects'
> -- 
> 2.34.1.838.g779e9098efb
> 


* Re: [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  2021-11-25 22:03         ` [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
@ 2021-12-23 21:51           ` Josh Steadmon
  2021-12-23 22:42           ` Junio C Hamano
  1 sibling, 0 replies; 81+ messages in thread
From: Josh Steadmon @ 2021-12-23 21:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Bagas Sanjaya

On 2021.11.25 23:03, Ævar Arnfjörð Bjarmason wrote:
> Amend the "unknown type" handling in the code that displays the
> ambiguous object list to assert() that we're either going to get the
> "real" object types we can pass to type_name(), or a -1 (OBJ_BAD)
> return value from oid_object_info().
> 
> See [1] for the current output, and [1] for the commit that added the
> "unknown type" handling.
> 
> We are never going to get an "unknown type" in the sense of custom
> types crafted with "hash-object --literally", since we're not using
> the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag.
> 
> If we manage to otherwise unpack such an object without errors we'll
> die() in parse_loose_header_extended() called by sort_ambiguous()
> before we get to show_ambiguous_object(), as is asserted by the test
> added in the preceding commit.
> 
> So saying "unknown type" here was always misleading, we really meant
> to say that we had a failure parsing the object at all, if the problem
> is only that it's type is unknown we won't reach this code.

Are there situations other than repo corruption where this could happen?
Maybe it would be more useful to just die() at this point and give the
user advice on how to investigate / fix the corruption, rather than
trying to disambiguate the objects involved.


> So let's emit a generic "[bad object]" instead. As our tests added in
> the preceding commit show, we'll have emitted various "error" output
> already in those cases.
> 
> We should do better in the truly "unknown type" cases, which we'd need
> to handle if we were passing down the OBJECT_INFO_ALLOW_UNKNOWN_TYPE
> flag. But let's leave that for some future improvement. In a
> subsequent commit I'll improve the output we do show, and not having
> to handle the "unknown type" (as in OBJECT_INFO_ALLOW_UNKNOWN_TYPE)
> simplifies that change.
> 
> 1. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
>    then SHA-1, 2018-05-10)
> 2. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
>    2016-09-26)
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  object-name.c                       | 14 ++++++++++++--
>  t/t1512-rev-parse-disambiguation.sh |  2 +-
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/object-name.c b/object-name.c
> index fdff4601b2c..9750634ee76 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -361,6 +361,16 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>  		return 0;
>  
>  	type = oid_object_info(ds->repo, oid, NULL);
> +
> +	if (type < 0) {
> +		strbuf_addstr(&desc, "[bad object]");
> +		goto out;
> +	}
> +
> +	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
> +	       type == OBJ_BLOB || type == OBJ_TAG);
> +	strbuf_addstr(&desc, type_name(type));
> +
>  	if (type == OBJ_COMMIT) {
>  		struct commit *commit = lookup_commit(ds->repo, oid);
>  		if (commit) {
> @@ -374,9 +384,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>  			strbuf_addf(&desc, " %s", tag->tag);
>  	}
>  
> -	advise("  %s %s%s",
> +out:
> +	advise("  %s %s",
>  	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
> -	       type_name(type) ? type_name(type) : "unknown type",
>  	       desc.buf);
>  
>  	strbuf_release(&desc);
> diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
> index ae1c0cf2b21..f1948980dff 100755
> --- a/t/t1512-rev-parse-disambiguation.sh
> +++ b/t/t1512-rev-parse-disambiguation.sh
> @@ -100,7 +100,7 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
>  	error: unable to unpack cafe... header
>  	error: inflate: data stream error (incorrect header check)
>  	error: unable to unpack cafe... header
> -	hint:   cafe... unknown type
> +	hint:   cafe... [bad object]
>  	hint:   cafe... blob
>  	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
>  	Use '\''--'\'' to separate paths from revisions, like this:
> -- 
> 2.34.1.838.g779e9098efb
> 


* [PATCH] fixup! object-name: make ambiguous object output translatable
  2021-11-25 22:03         ` [PATCH v5 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-12-23 21:54           ` Josh Steadmon
  2021-12-23 22:48             ` Junio C Hamano
  0 siblings, 1 reply; 81+ messages in thread
From: Josh Steadmon @ 2021-12-23 21:54 UTC (permalink / raw)
  To: git; +Cc: avarab, gitster, peff, bagasdotme

A nitpick, but the "ad" and "s" strbuf names here are not very friendly
for readers who don't know offhand what the format_commit_message fields
expand to. This makes them more self-descriptive.

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 object-name.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/object-name.c b/object-name.c
index 1dcbba7fa7..dcf3ab9999 100644
--- a/object-name.c
+++ b/object-name.c
@@ -378,15 +378,15 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	       type == OBJ_BLOB || type == OBJ_TAG);
 
 	if (type == OBJ_COMMIT) {
-		struct strbuf ad = STRBUF_INIT;
-		struct strbuf s = STRBUF_INIT;
+		struct strbuf date = STRBUF_INIT;
+		struct strbuf msg = STRBUF_INIT;
 		struct commit *commit = lookup_commit(ds->repo, oid);
 
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, "%ad", &ad, &pp);
-			format_commit_message(commit, "%s", &s, &pp);
+			format_commit_message(commit, "%ad", &date, &pp);
+			format_commit_message(commit, "%s", &msg, &pp);
 		}
 
 		/*
@@ -395,10 +395,11 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 *
 		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
 		 */
-		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
+		strbuf_addf(&desc, _("%s commit %s - %s"),
+			    hash, date.buf, msg.buf);
 
-		strbuf_release(&ad);
-		strbuf_release(&s);
+		strbuf_release(&date);
+		strbuf_release(&msg);
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		const char *tag_tag = "";

base-commit: ea5019ecd7a405d7d5f6527054d0aaca2d3b4bcd
-- 
2.34.1.448.ga2b2bfdf31-goog



* Re: [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  2021-11-25 22:03         ` [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
  2021-12-23 21:51           ` Josh Steadmon
@ 2021-12-23 22:42           ` Junio C Hamano
  1 sibling, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2021-12-23 22:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King, Bagas Sanjaya

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
> index ae1c0cf2b21..f1948980dff 100755
> --- a/t/t1512-rev-parse-disambiguation.sh
> +++ b/t/t1512-rev-parse-disambiguation.sh
> @@ -100,7 +100,7 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
>  	error: unable to unpack cafe... header
>  	error: inflate: data stream error (incorrect header check)
>  	error: unable to unpack cafe... header
> -	hint:   cafe... unknown type
> +	hint:   cafe... [bad object]
>  	hint:   cafe... blob
>  	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
>  	Use '\''--'\'' to separate paths from revisions, like this:

OK.


* Re: [PATCH] fixup! object-name: make ambiguous object output translatable
  2021-12-23 21:54           ` [PATCH] fixup! " Josh Steadmon
@ 2021-12-23 22:48             ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2021-12-23 22:48 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: git, avarab, peff, bagasdotme

Josh Steadmon <steadmon@google.com> writes:

> A nitpick, but the "ad" and "s" strbuf names here are not very friendly
> for readers who don't know offhand what the format_commit_message fields
> expand to. This makes them more self-descriptive.

Sounds like a sensible change.  It seems that this thread hasn't
gathered much interest from others (not many review comments), or
from its author (no ack to a suggestion like this), so perhaps I
should put it in cold storage and expect an update when the list is a
bit more quiescent.

Thanks.

> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
>  object-name.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/object-name.c b/object-name.c
> index 1dcbba7fa7..dcf3ab9999 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -378,15 +378,15 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>  	       type == OBJ_BLOB || type == OBJ_TAG);
>  
>  	if (type == OBJ_COMMIT) {
> -		struct strbuf ad = STRBUF_INIT;
> -		struct strbuf s = STRBUF_INIT;
> +		struct strbuf date = STRBUF_INIT;
> +		struct strbuf msg = STRBUF_INIT;
>  		struct commit *commit = lookup_commit(ds->repo, oid);
>  
>  		if (commit) {
>  			struct pretty_print_context pp = {0};
>  			pp.date_mode.type = DATE_SHORT;
> -			format_commit_message(commit, "%ad", &ad, &pp);
> -			format_commit_message(commit, "%s", &s, &pp);
> +			format_commit_message(commit, "%ad", &date, &pp);
> +			format_commit_message(commit, "%s", &msg, &pp);
>  		}
>  
>  		/*
> @@ -395,10 +395,11 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>  		 *
>  		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
>  		 */
> -		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
> +		strbuf_addf(&desc, _("%s commit %s - %s"),
> +			    hash, date.buf, msg.buf);
>  
> -		strbuf_release(&ad);
> -		strbuf_release(&s);
> +		strbuf_release(&date);
> +		strbuf_release(&msg);
>  	} else if (type == OBJ_TAG) {
>  		struct tag *tag = lookup_tag(ds->repo, oid);
>  		const char *tag_tag = "";
>
> base-commit: ea5019ecd7a405d7d5f6527054d0aaca2d3b4bcd


* [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date
  2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                           ` (5 preceding siblings ...)
  2021-11-25 22:03         ` [PATCH v5 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
@ 2021-12-28 14:34         ` Ævar Arnfjörð Bjarmason
  2021-12-28 14:34           ` [PATCH v6 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
                             ` (7 more replies)
  6 siblings, 8 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 14:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

This topic improves the output we emit on ambiguous objects as noted
in 4/6, and makes it translatable, see 3/6. See [1] for v5.

This iteration addresses various small points of feedback from Josh
Steadmon. I've incorporated a variable-rename fixup here, and
hopefully answered small questions on the v5 thread with amended
commit messages.

For the case of "dead" prefixed objects being unused but "beef" being
used I just added a test for the "dead" objects. They're not strictly
needed, but having them for the "dead...beef" symmetry and for use in
future tests is probably better, so I kept them in.

1. http://lore.kernel.org/git/cover-v5-0.6-00000000000-20211125T215529Z-avarab@gmail.com

Ævar Arnfjörð Bjarmason (6):
  object-name tests: add tests for ambiguous object blind spots
  object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  object-name: make ambiguous object output translatable
  object-name: show date for ambiguous tag objects
  object-name: iterate ambiguous objects before showing header
  object-name: re-use "struct strbuf" in show_ambiguous_object()

 object-name.c                       | 112 +++++++++++++++++++++++++---
 t/t1512-rev-parse-disambiguation.sh |  84 +++++++++++++++++++++
 2 files changed, 184 insertions(+), 12 deletions(-)

Range-diff against v5:
1:  767165d096d ! 1:  27f267ad555 object-name tests: add tests for ambiguous object blind spots
    @@ Commit message
         prefix are the same for both SHA-1 and SHA-256. This uses strings that
         I mined, and have the same prefix when hashed with both.
     
    +    We "test_cmp" the full output to guard against any future regressions,
    +    and because a subsequent commit will tweak it. Showing a diff of how
    +    the output changes is helpful to explain those subsequent commits.
    +
         1. https://lore.kernel.org/git/YZwbphPpfGk78w2f@coredump.intra.peff.net/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    @@ t/t1512-rev-parse-disambiguation.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +		echo 1bbfctrkc | git hash-object -w --stdin
     +	) &&
     +
    ++	test_must_fail git -C blob.prefix rev-parse dead &&
     +	cat >expect <<-\EOF &&
     +	error: short object ID beef... is ambiguous
     +	hint: The candidates are:
2:  ee86912f1c1 ! 2:  c78243dc701 object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
    @@ Commit message
         added in the preceding commit.
     
         So saying "unknown type" here was always misleading, we really meant
    -    to say that we had a failure parsing the object at all, if the problem
    -    is only that it's type is unknown we won't reach this code.
    +    to say that we had a failure parsing the object at all, i.e. that we
    +    had repository corruption. If the problem is only that it's type is
    +    unknown we won't reach this code.
     
         So let's emit a generic "[bad object]" instead. As our tests added in
         the preceding commit show, we'll have emitted various "error" output
3:  b79964483e8 ! 3:  daebc95542c object-name: make ambiguous object output translatable
    @@ Commit message
            then SHA-1, 2018-05-10)
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    Signed-off-by: Josh Steadmon <steadmon@google.com>
     
      ## object-name.c ##
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
     -	strbuf_addstr(&desc, type_name(type));
      
      	if (type == OBJ_COMMIT) {
    -+		struct strbuf ad = STRBUF_INIT;
    -+		struct strbuf s = STRBUF_INIT;
    ++		struct strbuf date = STRBUF_INIT;
    ++		struct strbuf msg = STRBUF_INIT;
      		struct commit *commit = lookup_commit(ds->repo, oid);
     +
      		if (commit) {
      			struct pretty_print_context pp = {0};
      			pp.date_mode.type = DATE_SHORT;
     -			format_commit_message(commit, " %ad - %s", &desc, &pp);
    -+			format_commit_message(commit, "%ad", &ad, &pp);
    -+			format_commit_message(commit, "%s", &s, &pp);
    ++			format_commit_message(commit, "%ad", &date, &pp);
    ++			format_commit_message(commit, "%s", &msg, &pp);
      		}
     +
     +		/*
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
     +		 *
     +		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
     +		 */
    -+		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
    ++		strbuf_addf(&desc, _("%s commit %s - %s"),
    ++			    hash, date.buf, msg.buf);
     +
    -+		strbuf_release(&ad);
    -+		strbuf_release(&s);
    ++		strbuf_release(&date);
    ++		strbuf_release(&msg);
      	} else if (type == OBJ_TAG) {
      		struct tag *tag = lookup_tag(ds->repo, oid);
     +		const char *tag_tag = "";
4:  36b6b440c37 = 4:  b5aa6e266f6 object-name: show date for ambiguous tag objects
5:  8880c283559 = 5:  644b076b2a6 object-name: iterate ambiguous objects before showing header
6:  78bb0995f08 ! 6:  6a31cfcfc29 object-name: re-use "struct strbuf" in show_ambiguous_object()
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
      		 *
      		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
      		 */
    --		strbuf_addf(&desc, _("%s commit %s - %s"), hash, ad.buf, s.buf);
    -+		strbuf_addf(sb, _("%s commit %s - %s"), hash, ad.buf, s.buf);
    +-		strbuf_addf(&desc, _("%s commit %s - %s"),
    +-			    hash, date.buf, msg.buf);
    ++		strbuf_addf(sb, _("%s commit %s - %s"), hash, date.buf,
    ++			    msg.buf);
      
    - 		strbuf_release(&ad);
    - 		strbuf_release(&s);
    + 		strbuf_release(&date);
    + 		strbuf_release(&msg);
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
      		 * object.c, it should (hopefully) already be
      		 * translated.
-- 
2.34.1.1257.g2af47340c7b



* [PATCH v6 1/6] object-name tests: add tests for ambiguous object blind spots
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
@ 2021-12-28 14:34           ` Ævar Arnfjörð Bjarmason
  2021-12-30 23:36             ` Junio C Hamano
  2021-12-28 14:34           ` [PATCH v6 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
                             ` (6 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 14:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Extend the tests for ambiguous objects to check how we handle objects
where we return OBJ_BAD when trying to parse them. As noted in [1] we
have a blind spot when it comes to this behavior.

Since we need to add new test data here, let's extend these tests to
also run under SHA-256. In d7a2fc82491 (t1512: skip test if not using
SHA-1, 2018-05-13) all of the existing tests were skipped, as they
rely on specific SHA-1 object IDs.

For these tests it only matters that the first 4 characters of the OID
prefix are the same for both SHA-1 and SHA-256. This uses strings that
I mined so that they have the same prefix when hashed with both.

We "test_cmp" the full output to guard against any future regressions,
and because a subsequent commit will tweak it. Showing a diff of how
the output changes is helpful to explain those subsequent commits.

1. https://lore.kernel.org/git/YZwbphPpfGk78w2f@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1512-rev-parse-disambiguation.sh | 85 +++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)

diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 7891a6becf3..60d2a457067 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -25,6 +25,91 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+test_cmp_failed_rev_parse () {
+	dir=$1
+	rev=$2
+	shift
+
+	test_must_fail git -C "$dir" rev-parse "$rev" 2>actual.raw &&
+	sed "s/\($rev\)[0-9a-f]*/\1.../g" <actual.raw >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ambiguous blob output' '
+	git init --bare blob.prefix &&
+	(
+		cd blob.prefix &&
+
+		# Both start with "dead..", under both SHA-1 and SHA-256
+		echo brocdnra | git hash-object -w --stdin &&
+		echo brigddsv | git hash-object -w --stdin &&
+
+		# Both start with "beef.."
+		echo 1agllotbh | git hash-object -w --stdin &&
+		echo 1bbfctrkc | git hash-object -w --stdin
+	) &&
+
+	test_must_fail git -C blob.prefix rev-parse dead &&
+	cat >expect <<-\EOF &&
+	error: short object ID beef... is ambiguous
+	hint: The candidates are:
+	hint:   beef... blob
+	hint:   beef... blob
+	fatal: ambiguous argument '\''beef...'\'': unknown revision or path not in the working tree.
+	Use '\''--'\'' to separate paths from revisions, like this:
+	'\''git <command> [<revision>...] -- [<file>...]'\''
+	EOF
+	test_cmp_failed_rev_parse blob.prefix beef
+'
+
+test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '
+	git init --bare blob.bad &&
+	(
+		cd blob.bad &&
+
+		# Both have the prefix "bad0"
+		echo xyzfaowcoh | git hash-object -t bad -w --stdin --literally &&
+		echo xyzhjpyvwl | git hash-object -t bad -w --stdin --literally
+	) &&
+
+	cat >expect <<-\EOF &&
+	error: short object ID bad0... is ambiguous
+	hint: The candidates are:
+	fatal: invalid object type
+	EOF
+	test_cmp_failed_rev_parse blob.bad bad0
+'
+
+test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
+	git init --bare blob.corrupt &&
+	(
+		cd blob.corrupt &&
+
+		# Both have the prefix "cafe"
+		echo bnkxmdwz | git hash-object -w --stdin &&
+		oid=$(echo bmwsjxzi | git hash-object -w --stdin) &&
+
+		oidf=objects/$(test_oid_to_path "$oid") &&
+		chmod 755 $oidf &&
+		echo broken >$oidf
+	) &&
+
+	cat >expect <<-\EOF &&
+	error: short object ID cafe... is ambiguous
+	hint: The candidates are:
+	error: inflate: data stream error (incorrect header check)
+	error: unable to unpack cafe... header
+	error: inflate: data stream error (incorrect header check)
+	error: unable to unpack cafe... header
+	hint:   cafe... unknown type
+	hint:   cafe... blob
+	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
+	Use '\''--'\'' to separate paths from revisions, like this:
+	'\''git <command> [<revision>...] -- [<file>...]'\''
+	EOF
+	test_cmp_failed_rev_parse blob.corrupt cafe
+'
+
 if ! test_have_prereq SHA1
 then
 	skip_all='not using SHA-1 for objects'
-- 
2.34.1.1257.g2af47340c7b



* [PATCH v6 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2021-12-28 14:34           ` [PATCH v6 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
@ 2021-12-28 14:34           ` Ævar Arnfjörð Bjarmason
  2021-12-28 14:34           ` [PATCH v6 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
                             ` (5 subsequent siblings)
  7 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 14:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Amend the "unknown type" handling in the code that displays the
ambiguous object list to assert() that we're either going to get the
"real" object types we can pass to type_name(), or a -1 (OBJ_BAD)
return value from oid_object_info().

See [1] for the current output, and [2] for the commit that added the
"unknown type" handling.

We are never going to get an "unknown type" in the sense of custom
types crafted with "hash-object --literally", since we're not using
the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag.

If we manage to otherwise unpack such an object without errors we'll
die() in parse_loose_header_extended() called by sort_ambiguous()
before we get to show_ambiguous_object(), as is asserted by the test
added in the preceding commit.

So saying "unknown type" here was always misleading; we really meant
to say that we had a failure parsing the object at all, i.e. that we
had repository corruption. If the problem is only that its type is
unknown we won't reach this code.

So let's emit a generic "[bad object]" instead. As our tests added in
the preceding commit show, we'll have emitted various "error" output
already in those cases.

We should do better in the truly "unknown type" cases, which we'd need
to handle if we were passing down the OBJECT_INFO_ALLOW_UNKNOWN_TYPE
flag. But let's leave that for some future improvement. In a
subsequent commit I'll improve the output we do show, and not having
to handle the "unknown type" (as in OBJECT_INFO_ALLOW_UNKNOWN_TYPE)
simplifies that change.
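
As a rough, standalone illustration of the control flow described
above (not the code in object-name.c), the sketch below maps a type
code to the text shown for a candidate. The enum and describe_type()
are hypothetical names made up for this example; only the numeric
values are meant to mirror git's object types.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for git's object type constants. */
enum sketch_type {
	SKETCH_BAD = -1,
	SKETCH_COMMIT = 1,
	SKETCH_TREE = 2,
	SKETCH_BLOB = 3,
	SKETCH_TAG = 4,
};

/* Return the text shown after the abbreviated object ID. */
static const char *describe_type(int type)
{
	/* Any negative value means the lookup or parse failed. */
	if (type < 0)
		return "[bad object]";

	/* Past this point only the four "real" types are possible. */
	assert(type == SKETCH_COMMIT || type == SKETCH_TREE ||
	       type == SKETCH_BLOB || type == SKETCH_TAG);

	switch (type) {
	case SKETCH_COMMIT:
		return "commit";
	case SKETCH_TREE:
		return "tree";
	case SKETCH_BLOB:
		return "blob";
	default:
		return "tag";
	}
}
```

The point of the early return is that nothing after it has to cope
with a failed lookup, which is what lets the later per-type branches
stay simple.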

1. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)
2. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c                       | 14 ++++++++++++--
 t/t1512-rev-parse-disambiguation.sh |  2 +-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/object-name.c b/object-name.c
index fdff4601b2c..9750634ee76 100644
--- a/object-name.c
+++ b/object-name.c
@@ -361,6 +361,16 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		return 0;
 
 	type = oid_object_info(ds->repo, oid, NULL);
+
+	if (type < 0) {
+		strbuf_addstr(&desc, "[bad object]");
+		goto out;
+	}
+
+	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
+	       type == OBJ_BLOB || type == OBJ_TAG);
+	strbuf_addstr(&desc, type_name(type));
+
 	if (type == OBJ_COMMIT) {
 		struct commit *commit = lookup_commit(ds->repo, oid);
 		if (commit) {
@@ -374,9 +384,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 			strbuf_addf(&desc, " %s", tag->tag);
 	}
 
-	advise("  %s %s%s",
+out:
+	advise("  %s %s",
 	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type) ? type_name(type) : "unknown type",
 	       desc.buf);
 
 	strbuf_release(&desc);
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 60d2a457067..d68c411bfc7 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -101,7 +101,7 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
 	error: unable to unpack cafe... header
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
-	hint:   cafe... unknown type
+	hint:   cafe... [bad object]
 	hint:   cafe... blob
 	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
 	Use '\''--'\'' to separate paths from revisions, like this:
-- 
2.34.1.1257.g2af47340c7b



* [PATCH v6 3/6] object-name: make ambiguous object output translatable
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2021-12-28 14:34           ` [PATCH v6 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
  2021-12-28 14:34           ` [PATCH v6 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
@ 2021-12-28 14:34           ` Ævar Arnfjörð Bjarmason
  2021-12-30 23:46             ` Junio C Hamano
  2021-12-28 14:35           ` [PATCH v6 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
                             ` (4 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 14:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] and the preceding commit to be more friendly to
translators.

By being able to customize the "<SP><SP>%s\n" format we're even ready
for RTL languages, whose translators would presumably like to change
that to "%s<SP><SP>\n".
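
To make the idea concrete outside of git, here is a minimal sketch in
plain C. The _() macro below is only a stand-in that returns its
argument (a real build would use gettext), and describe_commit() is a
hypothetical helper, not a function from object-name.c. What it shows
is that each line is built from one complete format string, so a
translation can reorder or re-punctuate the whole line rather than
receiving it in fragments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for gettext's _(); it returns the message unchanged. */
#define _(msgid) (msgid)

/*
 * Format one commit candidate into "out". Both the description and
 * the indentation come from whole, translatable format strings.
 */
static void describe_commit(char *out, size_t len, const char *hash,
			    const char *date, const char *msg)
{
	char desc[256];

	snprintf(desc, sizeof(desc), _("%s commit %s - %s"),
		 hash, date, msg);
	/* A translation could turn this into e.g. "%s  ". */
	snprintf(out, len, _("  %s"), desc);
}
```

With the arguments "deadbeef", "2021-01-01" and "Some Commit Message"
this produces the line used as the example in the TRANSLATORS comment,
with two leading spaces.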

1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)
2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 object-name.c | 65 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 58 insertions(+), 7 deletions(-)

diff --git a/object-name.c b/object-name.c
index 9750634ee76..dcf3ab99990 100644
--- a/object-name.c
+++ b/object-name.c
@@ -356,38 +356,89 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	const struct disambiguate_state *ds = data;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
+	const char *hash;
 
 	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
 		return 0;
 
+	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
 	type = oid_object_info(ds->repo, oid, NULL);
 
 	if (type < 0) {
-		strbuf_addstr(&desc, "[bad object]");
+		/*
+		 * TRANSLATORS: This is a line of ambiguous object
+		 * output shown when we cannot look up or parse the
+		 * object in question. E.g. "deadbeef [bad object]".
+		 */
+		strbuf_addf(&desc, _("%s [bad object]"), hash);
 		goto out;
 	}
 
 	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
 	       type == OBJ_BLOB || type == OBJ_TAG);
-	strbuf_addstr(&desc, type_name(type));
 
 	if (type == OBJ_COMMIT) {
+		struct strbuf date = STRBUF_INIT;
+		struct strbuf msg = STRBUF_INIT;
 		struct commit *commit = lookup_commit(ds->repo, oid);
+
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, " %ad - %s", &desc, &pp);
+			format_commit_message(commit, "%ad", &date, &pp);
+			format_commit_message(commit, "%s", &msg, &pp);
 		}
+
+		/*
+		 * TRANSLATORS: This is a line of ambiguous commit
+		 * object output. E.g.:
+		 *
+		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
+		 */
+		strbuf_addf(&desc, _("%s commit %s - %s"),
+			    hash, date.buf, msg.buf);
+
+		strbuf_release(&date);
+		strbuf_release(&msg);
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
+		const char *tag_tag = "";
+
 		if (!parse_tag(tag) && tag->tag)
-			strbuf_addf(&desc, " %s", tag->tag);
+			tag_tag = tag->tag;
+
+		/*
+		 * TRANSLATORS: This is a line of
+		 * ambiguous tag object output. E.g.:
+		 *
+		 *    "deadbeef tag Some Tag Message"
+		 *
+		 * The second argument is the "tag" string from
+		 * object.c, it should (hopefully) already be
+		 * translated.
+		 */
+		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+	} else if (type == OBJ_TREE) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef tree".
+		 */
+		strbuf_addf(&desc, _("%s tree"), hash);
+	} else if (type == OBJ_BLOB) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef blob".
+		 */
+		strbuf_addf(&desc, _("%s blob"), hash);
 	}
 
+
 out:
-	advise("  %s %s",
-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       desc.buf);
+	/*
+	 * TRANSLATORS: This is line item of ambiguous object output
+	 * from describe_ambiguous_object() above.
+	 */
+	advise(_("  %s"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
-- 
2.34.1.1257.g2af47340c7b



* [PATCH v6 4/6] object-name: show date for ambiguous tag objects
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                             ` (2 preceding siblings ...)
  2021-12-28 14:34           ` [PATCH v6 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-12-28 14:35           ` Ævar Arnfjörð Bjarmason
  2021-12-30 21:43             ` Junio C Hamano
  2021-12-28 14:35           ` [PATCH v6 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
                             ` (3 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 14:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Make the ambiguous tag object output nicer in the case of tag objects
such as ebf3c04b262 (Git 2.32, 2021-06-06) by including the date in
the "tagger" header. I.e.:

    $ git rev-parse b7e68
    error: short object ID b7e68 is ambiguous
    hint: The candidates are:
    hint:   b7e68c41d92 tag 2021-06-06 - v2.32.0
    hint:   b7e68ae18e0 commit 2019-12-23 - bisect: use the standard 'if (!var)' way to check for 0
    hint:   b7e68f6b413 tree
    hint:   b7e68490b97 blob
    b7e68
    [...]

Before this we'd emit a "tag" line of:

    hint:   b7e68c41d92 tag v2.32.0
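
For readers unfamiliar with the short date format, the following is a
small libc-only sketch of turning a timestamp into "YYYY-MM-DD". It
does not use git's show_date(); short_date() is a hypothetical helper,
and it assumes UTC, which is an assumption of this example only.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Write "when" as YYYY-MM-DD (UTC) into "out". */
static void short_date(time_t when, char *out, size_t len)
{
	struct tm *tm = gmtime(&when);

	if (!tm || !strftime(out, len, "%Y-%m-%d", tm)) {
		/* Conversion failed or buffer too small: emit nothing. */
		if (len)
			out[0] = '\0';
	}
}
```

In this sketch the timestamp 1622937600 formats as "2021-06-06", and
a timestamp of 0 (the value tag_date is initialized to in the patch)
formats as "1970-01-01".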

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/object-name.c b/object-name.c
index dcf3ab99990..990f384129e 100644
--- a/object-name.c
+++ b/object-name.c
@@ -403,21 +403,26 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		const char *tag_tag = "";
+		timestamp_t tag_date = 0;
 
-		if (!parse_tag(tag) && tag->tag)
+		if (!parse_tag(tag) && tag->tag) {
 			tag_tag = tag->tag;
+			tag_date = tag->date;
+		}
 
 		/*
 		 * TRANSLATORS: This is a line of
 		 * ambiguous tag object output. E.g.:
 		 *
-		 *    "deadbeef tag Some Tag Message"
+		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
 		 *
 		 * The second argument is the "tag" string from
 		 * object.c, it should (hopefully) already be
 		 * translated.
 		 */
-		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+			    show_date(tag_date, 0, DATE_MODE(SHORT)),
+			    tag_tag);
 	} else if (type == OBJ_TREE) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
-- 
2.34.1.1257.g2af47340c7b



* [PATCH v6 5/6] object-name: iterate ambiguous objects before showing header
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                             ` (3 preceding siblings ...)
  2021-12-28 14:35           ` [PATCH v6 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
@ 2021-12-28 14:35           ` Ævar Arnfjörð Bjarmason
  2021-12-28 14:35           ` [PATCH v6 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
                             ` (2 subsequent siblings)
  7 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 14:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Change the "The candidates are" header that's shown for ambiguous
objects to be shown after we've iterated over all of the objects.

If we get any errors while doing so we don't want to split up the
header and the list as a result. The two will now be printed together,
as shown in the updated testcase.

As we're accumulating the lines into a "struct strbuf" before
emitting them we need to add a trailing newline to the call in
show_ambiguous_object(). This and the change from "The candidates
are:" to "The candidates are:\n%s" helps to give translators more
context.
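
The accumulate-then-emit pattern can be sketched in isolation as
follows. This is a simplified illustration with a fixed-size buffer
and hypothetical names (struct advice_buf, add_candidate(),
emit_advice()); the patch itself uses "struct strbuf" and advise().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size accumulator standing in for a strbuf. */
struct advice_buf {
	char buf[1024];
	size_t len;
};

/* Append one "  <desc>\n" line; drop it if it would not fit. */
static void add_candidate(struct advice_buf *a, const char *desc)
{
	size_t room = sizeof(a->buf) - a->len;
	int n = snprintf(a->buf + a->len, room, "  %s\n", desc);

	if (n > 0 && (size_t)n < room)
		a->len += (size_t)n;
	else
		a->buf[a->len] = '\0'; /* undo a truncated write */
}

/* Emit header and list together, after all candidates were added. */
static void emit_advice(const struct advice_buf *a, char *out, size_t len)
{
	snprintf(out, len, "The candidates are:\n%s", a->buf);
}
```

Because nothing is printed while the candidates are being collected,
any error output produced during iteration ends up before the header
instead of between the header and the list.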

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c                       | 27 +++++++++++++++++++++++----
 t/t1512-rev-parse-disambiguation.sh |  3 +--
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/object-name.c b/object-name.c
index 990f384129e..743d272800d 100644
--- a/object-name.c
+++ b/object-name.c
@@ -351,9 +351,16 @@ static int init_object_disambiguation(struct repository *r,
 	return 0;
 }
 
+struct ambiguous_output {
+	const struct disambiguate_state *ds;
+	struct strbuf advice;
+};
+
 static int show_ambiguous_object(const struct object_id *oid, void *data)
 {
-	const struct disambiguate_state *ds = data;
+	struct ambiguous_output *state = data;
+	const struct disambiguate_state *ds = state->ds;
+	struct strbuf *advice = &state->advice;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 	const char *hash;
@@ -443,7 +450,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	 * TRANSLATORS: This is line item of ambiguous object output
 	 * from describe_ambiguous_object() above.
 	 */
-	advise(_("  %s"), desc.buf);
+	strbuf_addf(advice, _("  %s\n"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
@@ -542,6 +549,10 @@ static enum get_oid_result get_short_oid(struct repository *r,
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
 		struct oid_array collect = OID_ARRAY_INIT;
+		struct ambiguous_output out = {
+			.ds = &ds,
+			.advice = STRBUF_INIT,
+		};
 
 		error(_("short object ID %s is ambiguous"), ds.hex_pfx);
 
@@ -554,13 +565,21 @@ static enum get_oid_result get_short_oid(struct repository *r,
 		if (!ds.ambiguous)
 			ds.fn = NULL;
 
-		advise(_("The candidates are:"));
 		repo_for_each_abbrev(r, ds.hex_pfx, collect_ambiguous, &collect);
 		sort_ambiguous_oid_array(r, &collect);
 
-		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+		if (oid_array_for_each(&collect, show_ambiguous_object, &out))
 			BUG("show_ambiguous_object shouldn't return non-zero");
+
+		/*
+		 * TRANSLATORS: The argument is the list of ambiguous
+		 * objects composed in show_ambiguous_object(). See
+		 * its "TRANSLATORS" comments for details.
+		 */
+		advise(_("The candidates are:\n%s"), out.advice.buf);
+
 		oid_array_clear(&collect);
+		strbuf_release(&out.advice);
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index d68c411bfc7..cb8ee3d65ed 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -74,7 +74,6 @@ test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '
 
 	cat >expect <<-\EOF &&
 	error: short object ID bad0... is ambiguous
-	hint: The candidates are:
 	fatal: invalid object type
 	EOF
 	test_cmp_failed_rev_parse blob.bad bad0
@@ -96,11 +95,11 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
 
 	cat >expect <<-\EOF &&
 	error: short object ID cafe... is ambiguous
-	hint: The candidates are:
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
+	hint: The candidates are:
 	hint:   cafe... [bad object]
 	hint:   cafe... blob
 	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
-- 
2.34.1.1257.g2af47340c7b



* [PATCH v6 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object()
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                             ` (4 preceding siblings ...)
  2021-12-28 14:35           ` [PATCH v6 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
@ 2021-12-28 14:35           ` Ævar Arnfjörð Bjarmason
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 14:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Reduce the allocations done by show_ambiguous_object() by moving the
"desc" strbuf into the "struct ambiguous_output" introduced in the
preceding commit.

This doesn't matter for optimization purposes, but since we're already
accumulating a "struct strbuf advice", let's follow that pattern and
add a "struct strbuf sb". We can then strbuf_reset() it at the end of
each call to show_ambiguous_object(), rather than calling
strbuf_release() every time.
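
To illustrate the reset-versus-release distinction outside of git,
here is a minimal sketch. It assumes a hypothetical "struct buf" with
buf_add(), buf_reset() and buf_release() helpers that only imitate the
strbuf API; none of these names exist in git, and describe_all() is an
invented stand-in for the per-object callback.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's "struct strbuf"; not the real API. */
struct buf {
	char *data;   /* NUL-terminated contents, NULL before first use */
	size_t len;   /* bytes used, excluding the trailing NUL */
	size_t alloc; /* bytes allocated */
};

/* Append a string, growing the allocation only when it is too small. */
static void buf_add(struct buf *b, const char *s)
{
	size_t n = strlen(s);

	assert(b && s);
	if (b->len + n + 1 > b->alloc) {
		size_t want = (b->len + n + 1) * 2;
		char *p = realloc(b->data, want);

		if (!p)
			abort();
		b->data = p;
		b->alloc = want;
	}
	memcpy(b->data + b->len, s, n + 1);
	b->len += n;
}

/* Imitates strbuf_reset(): forget the contents, keep the allocation. */
static void buf_reset(struct buf *b)
{
	b->len = 0;
	if (b->data)
		b->data[0] = '\0';
}

/* Imitates strbuf_release(): give the allocation back. */
static void buf_release(struct buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = b->alloc = 0;
}

/*
 * Format each item into the shared "scratch" buffer, append the
 * result to "advice", then reset (not release) the scratch buffer so
 * that the next iteration can reuse its allocation.
 */
static void describe_all(struct buf *advice, struct buf *scratch,
			 const char **items, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		buf_add(scratch, items[i]);
		buf_add(scratch, " blob");

		buf_add(advice, "  ");
		buf_add(advice, scratch->data);
		buf_add(advice, "\n");

		buf_reset(scratch);
	}
}
```

The caller owns both buffers and releases them once after the loop,
which mirrors the strbuf_release(&out.sb) added to get_short_oid().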

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/object-name.c b/object-name.c
index 743d272800d..2d60e5177d3 100644
--- a/object-name.c
+++ b/object-name.c
@@ -354,6 +354,7 @@ static int init_object_disambiguation(struct repository *r,
 struct ambiguous_output {
 	const struct disambiguate_state *ds;
 	struct strbuf advice;
+	struct strbuf sb;
 };
 
 static int show_ambiguous_object(const struct object_id *oid, void *data)
@@ -361,7 +362,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	struct ambiguous_output *state = data;
 	const struct disambiguate_state *ds = state->ds;
 	struct strbuf *advice = &state->advice;
-	struct strbuf desc = STRBUF_INIT;
+	struct strbuf *sb = &state->sb;
 	int type;
 	const char *hash;
 
@@ -377,7 +378,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * output shown when we cannot look up or parse the
 		 * object in question. E.g. "deadbeef [bad object]".
 		 */
-		strbuf_addf(&desc, _("%s [bad object]"), hash);
+		strbuf_addf(sb, _("%s [bad object]"), hash);
 		goto out;
 	}
 
@@ -402,8 +403,8 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 *
 		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
 		 */
-		strbuf_addf(&desc, _("%s commit %s - %s"),
-			    hash, date.buf, msg.buf);
+		strbuf_addf(sb, _("%s commit %s - %s"), hash, date.buf,
+			    msg.buf);
 
 		strbuf_release(&date);
 		strbuf_release(&msg);
@@ -427,7 +428,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * object.c, it should (hopefully) already be
 		 * translated.
 		 */
-		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+		strbuf_addf(sb, _("%s tag %s - %s"), hash,
 			    show_date(tag_date, 0, DATE_MODE(SHORT)),
 			    tag_tag);
 	} else if (type == OBJ_TREE) {
@@ -435,13 +436,13 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * TRANSLATORS: This is a line of ambiguous <type>
 		 * object output. E.g. "deadbeef tree".
 		 */
-		strbuf_addf(&desc, _("%s tree"), hash);
+		strbuf_addf(sb, _("%s tree"), hash);
 	} else if (type == OBJ_BLOB) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
 		 * object output. E.g. "deadbeef blob".
 		 */
-		strbuf_addf(&desc, _("%s blob"), hash);
+		strbuf_addf(sb, _("%s blob"), hash);
 	}
 
 
@@ -450,9 +451,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	 * TRANSLATORS: This is line item of ambiguous object output
 	 * from describe_ambiguous_object() above.
 	 */
-	strbuf_addf(advice, _("  %s\n"), desc.buf);
+	strbuf_addf(advice, _("  %s\n"), sb->buf);
 
-	strbuf_release(&desc);
+	strbuf_reset(sb);
 	return 0;
 }
 
@@ -551,6 +552,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
 		struct oid_array collect = OID_ARRAY_INIT;
 		struct ambiguous_output out = {
 			.ds = &ds,
+			.sb = STRBUF_INIT,
 			.advice = STRBUF_INIT,
 		};
 
@@ -580,6 +582,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
 
 		oid_array_clear(&collect);
 		strbuf_release(&out.advice);
+		strbuf_release(&out.sb);
 	}
 
 	return status;
-- 
2.34.1.1257.g2af47340c7b


* [PATCH v8 0/7] progress: test fixes / cleanup
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                             ` (5 preceding siblings ...)
  2021-12-28 14:35           ` [PATCH v6 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:18           ` Ævar Arnfjörð Bjarmason
  2021-12-28 15:18             ` [PATCH v8 1/7] leak tests: fix a memory leak in "test-progress" helper Ævar Arnfjörð Bjarmason
                               ` (7 more replies)
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  7 siblings, 8 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

Various test, leak and other fixes for the progress.c code and its
tests. This v8 addresses feedback on v7[1] by Johannes
Altmanninger. For that round I accidentally broke the In-Reply-To
chain, so I'm replying to the v6 here to attach it to the original
thread again.

1. https://lore.kernel.org/git/cover-v7-0.7-00000000000-20211217T041945Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (7):
  leak tests: fix a memory leak in "test-progress" helper
  progress.c test helper: add missing braces
  progress.c tests: make start/stop commands on stdin
  progress.c tests: test some invalid usage
  progress.c: add temporary variable from progress struct
  pack-bitmap-write.c: don't return without stop_progress()
  *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO)

 builtin/bisect--helper.c    |  2 +-
 builtin/bundle.c            |  2 +-
 compat/mingw.c              |  2 +-
 pack-bitmap-write.c         |  6 +--
 progress.c                  | 14 +++---
 t/helper/test-progress.c    | 52 +++++++++++++++-----
 t/t0500-progress-display.sh | 94 ++++++++++++++++++++++++++++---------
 7 files changed, 126 insertions(+), 46 deletions(-)

Range-diff against v7:
1:  5367293ee84 ! 1:  aa08dab654d leak tests: fix a memory leaks in "test-progress" helper
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    leak tests: fix a memory leaks in "test-progress" helper
    +    leak tests: fix a memory leak in "test-progress" helper
     
         Fix a memory leak in the test-progress helper, and mark the
         corresponding "t0500-progress-display.sh" test as being leak-free
2:  81788101763 = 2:  3ecdab074b6 progress.c test helper: add missing braces
3:  d685c248686 ! 3:  271f6d7ec3b progress.c tests: make start/stop commands on stdin
    @@ t/helper/test-progress.c
      #include "progress.h"
      #include "strbuf.h"
     +#include "string-list.h"
    -+
    -+/*
    -+ * We can't use "end + 1" as an argument to start_progress() below, it
    -+ * doesn't xstrdup() its "title" argument. We need to hold onto a
    -+ * valid "char *" for it until the end.
    -+ */
    -+static char *dup_title(struct string_list *titles, const char *title)
    -+{
    -+	return string_list_insert(titles, title)->string;
    -+}
      
      int cmd__progress(int argc, const char **argv)
      {
    @@ t/helper/test-progress.c
     -		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
     +		if (skip_prefix(line.buf, "start ", (const char **) &end)) {
     +			uint64_t total = strtoull(end, &end, 10);
    -+			if (*end == '\0')
    -+				progress = start_progress(default_title, total);
    ++			const char *title;
    ++			const char *str;
    ++
    ++			/*
    ++			 * We can't use "end + 1" as an argument to
    ++			 * start_progress(), it doesn't xstrdup() its
    ++			 * "title" argument. We need to hold onto a
    ++			 * valid "char *" for it until the end.
    ++			 */
    ++			if (!*end)
    ++				title = default_title;
     +			else if (*end == ' ')
    -+				progress = start_progress(dup_title(&titles,
    -+								    end + 1),
    -+							  total);
    ++				title = string_list_insert(&titles, end + 1)->string;
     +			else
     +				die("invalid input: '%s'\n", line.buf);
    ++
    ++			str = title ? title : default_title;
    ++			progress = start_progress(str, total);
     +		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
      			uint64_t item_count = strtoull(end, &end, 10);
      			if (*end != '\0')
4:  40e446da277 = 4:  7c1b8b287c5 progress.c tests: test some invalid usage
5:  c2303bfd130 ! 5:  72a31bd7191 progress.c: add temporary variable from progress struct
    @@ Metadata
      ## Commit message ##
         progress.c: add temporary variable from progress struct
     
    -    Add a temporary "progress" variable for the dereferenced p_progress
    -    pointer to a "struct progress *". Before 98a13647408 (trace2: log
    -    progress time and throughput, 2020-05-12) we didn't dereference
    -    "p_progress" in this function, now that we do it's easier to read the
    -    code if we work with a "progress" struct pointer like everywhere else,
    -    instead of a pointer to a pointer.
    +    Since 98a13647408 (trace2: log progress time and throughput,
    +    2020-05-12) stop_progress() dereferences a "struct progress **"
    +    parameter in several places. Extract a dereferenced variable (like in
    +    stop_progress_msg()) to reduce clutter and make it clearer who needs
    +    to write to this parameter.
    +
    +    Now instead of using "*p_progress" several times in stop_progress() we
    +    check it once for NULL and then use a dereferenced "progress" variable
    +    thereafter. This uses the same pattern as the adjacent
    +    stop_progress_msg() function, see ac900fddb7f (progress: don't
    +    dereference before checking for NULL, 2020-08-10).
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## progress.c ##
    -@@ progress.c: void stop_progress(struct progress **p_progress)
    - 	finish_if_sparse(*p_progress);
    +@@ progress.c: static void finish_if_sparse(struct progress *progress)
    + 
    + void stop_progress(struct progress **p_progress)
    + {
    ++	struct progress *progress;
    + 	if (!p_progress)
    + 		BUG("don't provide NULL to stop_progress");
    ++	progress = *p_progress;
    + 
    +-	finish_if_sparse(*p_progress);
    ++	finish_if_sparse(progress);
      
    - 	if (*p_progress) {
    -+		struct progress *progress = *p_progress;
    +-	if (*p_progress) {
    ++	if (progress) {
      		trace2_data_intmax("progress", the_repository, "total_objects",
    - 				   (*p_progress)->total);
    +-				   (*p_progress)->total);
    ++				   progress->total);
      
    - 		if ((*p_progress)->throughput)
    +-		if ((*p_progress)->throughput)
    ++		if (progress->throughput)
      			trace2_data_intmax("progress", the_repository,
      					   "total_bytes",
     -					   (*p_progress)->throughput->curr_total);
6:  776362de897 ! 6:  0bd08e1b018 pack-bitmap-write.c: don't return without stop_progress()
    @@ Commit message
         reached the early exit in this function.
     
         We could call stop_progress() before we return, but better yet is to
    -    defer calling start_progress() until we need it.
    -
    -    This will matter in a subsequent commit where we BUG(...) out if this
    -    happens, and matters now e.g. because we don't have a corresponding
    -    "region_end" for the progress trace2 event.
    +    defer calling start_progress() until we need it. For now this only
    +    matters in practice because we'd previously omit the "region_leave"
    +    for the progress trace2 event.
     
         Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
7:  0670d1aa5f2 ! 7:  060483fb5ce various *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO)
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    various *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO)
    +    *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO)
     
         We have over 50 uses of "isatty(1)" and "isatty(2)" in the codebase,
    -    and around 10 "isatty(0)", but these used the
    +    and around 10 "isatty(0)", but three callers used the
         {STDIN_FILENO,STD{OUT,ERR}_FILENO} macros in "unistd.h" to refer to
         them.
     
    -    Let's change these for consistency, and because another commit that
    -    would like to be based on top of this one[1] has a recipe to change
    -    all of these for ad-hoc testing, not needing to match these with that
    -    ad-hoc regex will make things easier to explain. Only one of these is
    -    related to the "struct progress" code which it discusses, but let's
    -    change all of these while we're at it.
    +    Let's change these for consistency.  This makes it easier to change
    +    all calls to isatty() at a whim, which is useful to test some
    +    scenarios[1].
     
         1. https://lore.kernel.org/git/patch-v6-8.8-bff919994b5-20211102T122507Z-avarab@gmail.com/
     
-- 
2.34.1.1257.g2af47340c7b


* [PATCH v8 1/7] leak tests: fix a memory leak in "test-progress" helper
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:18             ` Ævar Arnfjörð Bjarmason
  2021-12-28 15:18             ` [PATCH v8 2/7] progress.c test helper: add missing braces Ævar Arnfjörð Bjarmason
                               ` (6 subsequent siblings)
  7 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

Fix a memory leak in the test-progress helper, and mark the
corresponding "t0500-progress-display.sh" test as being leak-free
under SANITIZE=leak. This fixes a leak added in 2bb74b53a4 (Test the
progress display, 2019-09-16).

My 48f68715b14 (tr2: stop leaking "thread_name" memory, 2021-08-27)
had fixed another memory leak in this test (as it did some trace2
testing).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 1 +
 t/t0500-progress-display.sh | 1 +
 2 files changed, 2 insertions(+)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 5d05cbe7894..9265e6ab7cf 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -69,6 +69,7 @@ int cmd__progress(int argc, const char **argv)
 			die("invalid input: '%s'\n", line.buf);
 	}
 	stop_progress(&progress);
+	strbuf_release(&line);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503ac..f37cf2eb9c9 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -2,6 +2,7 @@
 
 test_description='progress display'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 show_cr () {
-- 
2.34.1.1257.g2af47340c7b


* [PATCH v8 2/7] progress.c test helper: add missing braces
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-12-28 15:18             ` [PATCH v8 1/7] leak tests: fix a memory leak in "test-progress" helper Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:18             ` Ævar Arnfjörð Bjarmason
  2021-12-28 15:18             ` [PATCH v8 3/7] progress.c tests: make start/stop commands on stdin Ævar Arnfjörð Bjarmason
                               ` (5 subsequent siblings)
  7 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

If one arm of an if/else has braces, all of them should have braces,
per the CodingGuidelines' "When there are multiple arms to a
conditional[...]" advice. This formatting change makes a subsequent
commit smaller.
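
As a small illustration of that rule, here is a sketch using a
hypothetical classify() function and "enum cmd", neither of which is
part of the helper. Only the first arm strictly needs braces, but the
single-statement arms get them too so that the chain stays uniform.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical command kinds, for illustration only. */
enum cmd { CMD_THROUGHPUT, CMD_UPDATE, CMD_INVALID };

static enum cmd classify(const char *line)
{
	enum cmd result;

	assert(line);
	if (!strncmp(line, "throughput ", 11)) {
		/* Two statements: this arm needs braces... */
		line += 11;
		result = *line ? CMD_THROUGHPUT : CMD_INVALID;
	} else if (!strcmp(line, "update")) {
		/* ...so the single-statement arms get them as well. */
		result = CMD_UPDATE;
	} else {
		result = CMD_INVALID;
	}
	return result;
}
```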

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 9265e6ab7cf..50fd3be3dad 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -63,10 +63,11 @@ int cmd__progress(int argc, const char **argv)
 				die("invalid input: '%s'\n", line.buf);
 			progress_test_ns = test_ms * 1000 * 1000;
 			display_throughput(progress, byte_count);
-		} else if (!strcmp(line.buf, "update"))
+		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
-		else
+		} else {
 			die("invalid input: '%s'\n", line.buf);
+		}
 	}
 	stop_progress(&progress);
 	strbuf_release(&line);
-- 
2.34.1.1257.g2af47340c7b


* [PATCH v8 3/7] progress.c tests: make start/stop commands on stdin
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
  2021-12-28 15:18             ` [PATCH v8 1/7] leak tests: fix a memory leak in "test-progress" helper Ævar Arnfjörð Bjarmason
  2021-12-28 15:18             ` [PATCH v8 2/7] progress.c test helper: add missing braces Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:18             ` Ævar Arnfjörð Bjarmason
  2021-12-28 16:25               ` Johannes Altmanninger
  2021-12-28 15:19             ` [PATCH v8 4/7] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
                               ` (4 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

Change the usage of the "test-tool progress" helper introduced in
2bb74b53a49 (Test the progress display, 2019-09-16) to take commands
like "start" and "stop" on stdin, instead of running them implicitly.

This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such tests.
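
The "start <total>[ <title>]" syntax can be sketched in isolation as
below. This is a simplified, hypothetical parse_start() and "struct
start_cmd", not the helper's code: it returns an error instead of
calling die(), rejects a missing number (an extra check that is an
assumption of this sketch), and leaves copying of the title to the
caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing one "start" line; not part of git. */
struct start_cmd {
	uint64_t total;
	const char *title; /* points into the line, or at default_title */
};

static const char default_title[] = "Working hard";

/*
 * Parse "start <total>[ <title>]". Returns 0 on success and -1 on
 * invalid input.
 *
 * On success "title" may point into "line", so a caller that reuses
 * the line buffer has to duplicate it before reading the next line.
 */
static int parse_start(const char *line, struct start_cmd *out)
{
	static const char prefix[] = "start ";
	char *end;

	assert(line && out);
	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return -1;
	line += sizeof(prefix) - 1;

	out->total = strtoull(line, &end, 10);
	if (end == line)
		return -1; /* no digits at all; stricter than the helper */

	if (*end == '\0')
		out->title = default_title;
	else if (*end == ' ')
		out->title = end + 1;
	else
		return -1;
	return 0;
}
```

The lifetime note in the comment is why the patch keeps titles in a
"struct string_list" initialized with STRING_LIST_INIT_DUP: the line
buffer is overwritten by the next strbuf_getline() call.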

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/helper/test-progress.c    | 46 ++++++++++++++++++++++-------
 t/t0500-progress-display.sh | 58 +++++++++++++++++++++++--------------
 2 files changed, 72 insertions(+), 32 deletions(-)

diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
index 50fd3be3dad..becc163375f 100644
--- a/t/helper/test-progress.c
+++ b/t/helper/test-progress.c
@@ -3,6 +3,9 @@
  *
  * Reads instructions from standard input, one instruction per line:
  *
+ *   "start <total>[ <title>]" - Call start_progress(title, total),
+ *                               Uses the default title of "Working hard"
+ *                               if the " <title>" is omitted.
  *   "progress <items>" - Call display_progress() with the given item count
  *                        as parameter.
  *   "throughput <bytes> <millis> - Call display_throughput() with the given
@@ -10,6 +13,7 @@
  *                                  specify the time elapsed since the
  *                                  start_progress() call.
  *   "update" - Set the 'progress_update' flag.
+ *   "stop" - Call stop_progress().
  *
  * See 't0500-progress-display.sh' for examples.
  */
@@ -19,34 +23,52 @@
 #include "parse-options.h"
 #include "progress.h"
 #include "strbuf.h"
+#include "string-list.h"
 
 int cmd__progress(int argc, const char **argv)
 {
-	int total = 0;
-	const char *title;
+	const char *const default_title = "Working hard";
+	struct string_list titles = STRING_LIST_INIT_DUP;
 	struct strbuf line = STRBUF_INIT;
-	struct progress *progress;
+	struct progress *progress = NULL;
 
 	const char *usage[] = {
-		"test-tool progress [--total=<n>] <progress-title>",
+		"test-tool progress <stdin",
 		NULL
 	};
 	struct option options[] = {
-		OPT_INTEGER(0, "total", &total, "total number of items"),
 		OPT_END(),
 	};
 
 	argc = parse_options(argc, argv, NULL, options, usage, 0);
-	if (argc != 1)
-		die("need a title for the progress output");
-	title = argv[0];
+	if (argc)
+		usage_with_options(usage, options);
 
 	progress_testing = 1;
-	progress = start_progress(title, total);
 	while (strbuf_getline(&line, stdin) != EOF) {
 		char *end;
 
-		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
+		if (skip_prefix(line.buf, "start ", (const char **) &end)) {
+			uint64_t total = strtoull(end, &end, 10);
+			const char *title;
+			const char *str;
+
+			/*
+			 * We can't use "end + 1" as an argument to
+			 * start_progress(), it doesn't xstrdup() its
+			 * "title" argument. We need to hold onto a
+			 * valid "char *" for it until the end.
+			 */
+			if (!*end)
+				title = default_title;
+			else if (*end == ' ')
+				title = string_list_insert(&titles, end + 1)->string;
+			else
+				die("invalid input: '%s'\n", line.buf);
+
+			str = title ? title : default_title;
+			progress = start_progress(str, total);
+		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
 			uint64_t item_count = strtoull(end, &end, 10);
 			if (*end != '\0')
 				die("invalid input: '%s'\n", line.buf);
@@ -65,12 +87,14 @@ int cmd__progress(int argc, const char **argv)
 			display_throughput(progress, byte_count);
 		} else if (!strcmp(line.buf, "update")) {
 			progress_test_force_update();
+		} else if (!strcmp(line.buf, "stop")) {
+			stop_progress(&progress);
 		} else {
 			die("invalid input: '%s'\n", line.buf);
 		}
 	}
-	stop_progress(&progress);
 	strbuf_release(&line);
+	string_list_clear(&titles, 0);
 
 	return 0;
 }
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index f37cf2eb9c9..27ab4218b01 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -18,6 +18,7 @@ test_expect_success 'simple progress display' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	update
 	progress 1
 	update
@@ -26,8 +27,9 @@ test_expect_success 'simple progress display' '
 	progress 4
 	update
 	progress 5
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -42,11 +44,13 @@ test_expect_success 'progress display with total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 3
 	progress 1
 	progress 2
 	progress 3
+	stop
 	EOF
-	test-tool progress --total=3 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -63,14 +67,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 100
 	progress 1000
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -89,16 +93,16 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	update
 	progress 1
 	update
 	progress 2
 	progress 10000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -117,14 +121,14 @@ Working hard.......2.........3.........4.........5.........6:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6" \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -141,14 +145,14 @@ Working hard.......2.........3.........4.........5.........6.........7.........:
 EOF
 
 	cat >in <<-\EOF &&
+	start 100000 Working hard.......2.........3.........4.........5.........6.........7.........
 	progress 25000
 	progress 50000
 	progress 75000
 	progress 100000
+	stop
 	EOF
-	test-tool progress --total=100000 \
-		"Working hard.......2.........3.........4.........5.........6.........7........." \
-		<in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -165,12 +169,14 @@ test_expect_success 'progress shortens - crazy caller' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 1000
 	progress 100
 	progress 200
 	progress 1
 	progress 1000
+	stop
 	EOF
-	test-tool progress --total=1000 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -186,6 +192,7 @@ test_expect_success 'progress display with throughput' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	throughput 102400 1000
 	update
 	progress 10
@@ -198,8 +205,9 @@ test_expect_success 'progress display with throughput' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -215,6 +223,7 @@ test_expect_success 'progress display with throughput and total' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	progress 10
 	throughput 204800 2000
@@ -223,8 +232,9 @@ test_expect_success 'progress display with throughput and total' '
 	progress 30
 	throughput 409600 4000
 	progress 40
+	stop
 	EOF
-	test-tool progress --total=40 "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -240,6 +250,7 @@ test_expect_success 'cover up after throughput shortens' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	throughput 409600 1000
 	update
 	progress 1
@@ -252,8 +263,9 @@ test_expect_success 'cover up after throughput shortens' '
 	throughput 1638400 4000
 	update
 	progress 4
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -268,6 +280,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	EOF
 
 	cat >in <<-\EOF &&
+	start 0
 	throughput 1 1000
 	update
 	progress 1
@@ -277,8 +290,9 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	throughput 3145728 3000
 	update
 	progress 3
+	stop
 	EOF
-	test-tool progress "Working hard" <in 2>stderr &&
+	test-tool progress <in 2>stderr &&
 
 	show_cr <stderr >out &&
 	test_cmp expect out
@@ -286,6 +300,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 
 test_expect_success 'progress generates traces' '
 	cat >in <<-\EOF &&
+	start 40
 	throughput 102400 1000
 	update
 	progress 10
@@ -298,10 +313,11 @@ test_expect_success 'progress generates traces' '
 	throughput 409600 4000
 	update
 	progress 40
+	stop
 	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
-		"Working hard" <in 2>stderr &&
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress \
+		<in 2>stderr &&
 
 	# t0212/parse_events.perl intentionally omits regions and data.
 	test_region progress "Working hard" trace.event &&
-- 
2.34.1.1257.g2af47340c7b


* [PATCH v8 4/7] progress.c tests: test some invalid usage
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
                               ` (2 preceding siblings ...)
  2021-12-28 15:18             ` [PATCH v8 3/7] progress.c tests: make start/stop commands on stdin Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:19             ` Ævar Arnfjörð Bjarmason
  2021-12-28 16:33               ` Johannes Altmanninger
  2021-12-28 15:19             ` [PATCH v8 5/7] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
                               ` (3 subsequent siblings)
  7 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

Test what happens when we "stop" without a "start", omit the "stop"
after a "start", or try to start two concurrent progress bars. This
extends the trace2 tests added in 98a13647408 (trace2: log progress
time and throughput, 2020-05-12).

These tests don't merely exercise the helper: they cover invalid usage
that can happen whenever the progress.c API is misused.

The "without stop" test will leak under SANITIZE=leak, since this
buggy use of the API will leak memory. But let's not skip it entirely,
or use the "!SANITIZE_LEAK" prerequisite check as we'd do with tests
that we're skipping due to leaks we haven't fixed yet. Instead
annotate the specific command that should skip leak checking with
custom $LSAN_OPTIONS[1].

1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0500-progress-display.sh | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 27ab4218b01..59e9f226ea4 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -325,4 +325,39 @@ test_expect_success 'progress generates traces' '
 	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
 '
 
+test_expect_success 'progress generates traces: stop / start' '
+	cat >in <<-\EOF &&
+	start 0
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-startstop.event" test-tool progress \
+		<in 2>stderr &&
+	test_region progress "Working hard" trace-startstop.event
+'
+
+test_expect_success 'progress generates traces: start without stop' '
+	cat >in <<-\EOF &&
+	start 0
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-start.event" \
+	LSAN_OPTIONS=detect_leaks=0 \
+	test-tool progress \
+		<in 2>stderr &&
+	grep region_enter.*progress trace-start.event &&
+	! grep region_leave.*progress trace-start.event
+'
+
+test_expect_success 'progress generates traces: stop without start' '
+	cat >in <<-\EOF &&
+	stop
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace-stop.event" test-tool progress \
+		<in 2>stderr &&
+	! grep region_enter.*progress trace-stop.event &&
+	! grep region_leave.*progress trace-stop.event
+'
+
 test_done
-- 
2.34.1.1257.g2af47340c7b


* [PATCH v8 5/7] progress.c: add temporary variable from progress struct
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
                               ` (3 preceding siblings ...)
  2021-12-28 15:19             ` [PATCH v8 4/7] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:19             ` Ævar Arnfjörð Bjarmason
  2021-12-28 16:05               ` René Scharfe
  2021-12-28 16:13               ` Johannes Altmanninger
  2021-12-28 15:19             ` [PATCH v8 6/7] pack-bitmap-write.c: don't return without stop_progress() Ævar Arnfjörð Bjarmason
                               ` (2 subsequent siblings)
  7 siblings, 2 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

Since 98a13647408 (trace2: log progress time and throughput,
2020-05-12) stop_progress() dereferences a "struct progress **"
parameter in several places. Extract a dereferenced variable (like in
stop_progress_msg()) to reduce clutter and make it clearer who needs
to write to this parameter.

Now instead of using "*p_progress" several times in stop_progress() we
check it once for NULL and then use a dereferenced "progress" variable
thereafter. This continues the same pattern used in the above
stop_progress() function, see ac900fddb7f (progress: don't dereference
before checking for NULL, 2020-08-10).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 progress.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/progress.c b/progress.c
index 680c6a8bf93..688749648be 100644
--- a/progress.c
+++ b/progress.c
@@ -319,21 +319,23 @@ static void finish_if_sparse(struct progress *progress)
 
 void stop_progress(struct progress **p_progress)
 {
+	struct progress *progress;
 	if (!p_progress)
 		BUG("don't provide NULL to stop_progress");
+	progress = *p_progress;
 
-	finish_if_sparse(*p_progress);
+	finish_if_sparse(progress);
 
-	if (*p_progress) {
+	if (progress) {
 		trace2_data_intmax("progress", the_repository, "total_objects",
-				   (*p_progress)->total);
+				   progress->total);
 
-		if ((*p_progress)->throughput)
+		if (progress->throughput)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
-					   (*p_progress)->throughput->curr_total);
+					   progress->throughput->curr_total);
 
-		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+		trace2_region_leave("progress", progress->title, the_repository);
 	}
 
 	stop_progress_msg(p_progress, _("done"));
-- 
2.34.1.1257.g2af47340c7b


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v8 6/7] pack-bitmap-write.c: don't return without stop_progress()
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
                               ` (4 preceding siblings ...)
  2021-12-28 15:19             ` [PATCH v8 5/7] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:19             ` Ævar Arnfjörð Bjarmason
  2021-12-28 15:19             ` [PATCH v8 7/7] *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO) Ævar Arnfjörð Bjarmason
  2022-01-08  0:45             ` [PATCH v8 0/7] progress: test fixes / cleanup Junio C Hamano
  7 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

Fix a bug that's been here since 7cc8f971085 (pack-objects: implement
bitmap writing, 2013-12-21): we did not call stop_progress() if we
reached the early exit in this function.

We could call stop_progress() before we return, but better yet is to
defer calling start_progress() until we need it. For now this only
matters in practice because we'd previously omit the "region_leave"
for the progress trace2 event.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack-bitmap-write.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 9c55c1531e1..cab3eaa2acd 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -575,15 +575,15 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 
 	QSORT(indexed_commits, indexed_commits_nr, date_compare);
 
-	if (writer.show_progress)
-		writer.progress = start_progress("Selecting bitmap commits", 0);
-
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
 			push_bitmapped_commit(indexed_commits[i]);
 		return;
 	}
 
+	if (writer.show_progress)
+		writer.progress = start_progress("Selecting bitmap commits", 0);
+
 	for (;;) {
 		struct commit *chosen = NULL;
 
-- 
2.34.1.1257.g2af47340c7b


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v8 7/7] *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO)
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
                               ` (5 preceding siblings ...)
  2021-12-28 15:19             ` [PATCH v8 6/7] pack-bitmap-write.c: don't return without stop_progress() Ævar Arnfjörð Bjarmason
@ 2021-12-28 15:19             ` Ævar Arnfjörð Bjarmason
  2021-12-28 16:47               ` René Scharfe
  2022-01-08  0:45             ` [PATCH v8 0/7] progress: test fixes / cleanup Junio C Hamano
  7 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 15:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, SZEDER Gábor, René Scharfe,
	Johannes Altmanninger, Ævar Arnfjörð Bjarmason

We have over 50 uses of "isatty(1)" and "isatty(2)" in the codebase,
and around 10 "isatty(0)", but three callers used the
{STDIN_FILENO,STD{OUT,ERR}_FILENO} macros in "unistd.h" to refer to
them.

Let's change these for consistency.  This makes it easier to change
all calls to isatty() at a whim, which is useful to test some
scenarios[1].

1. https://lore.kernel.org/git/patch-v6-8.8-bff919994b5-20211102T122507Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/bisect--helper.c | 2 +-
 builtin/bundle.c         | 2 +-
 compat/mingw.c           | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 28a2e6a5750..21360a4e70b 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -830,7 +830,7 @@ static int bisect_autostart(struct bisect_terms *terms)
 	fprintf_ln(stderr, _("You need to start by \"git bisect "
 			  "start\"\n"));
 
-	if (!isatty(STDIN_FILENO))
+	if (!isatty(0))
 		return -1;
 
 	/*
diff --git a/builtin/bundle.c b/builtin/bundle.c
index 5a85d7cd0fe..df69c651753 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -56,7 +56,7 @@ static int parse_options_cmd_bundle(int argc,
 
 static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {
 	int all_progress_implied = 0;
-	int progress = isatty(STDERR_FILENO);
+	int progress = isatty(2);
 	struct strvec pack_opts;
 	int version = -1;
 	int ret;
diff --git a/compat/mingw.c b/compat/mingw.c
index e14f2d5f77c..7c55d0f0414 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2376,7 +2376,7 @@ int mingw_raise(int sig)
 	switch (sig) {
 	case SIGALRM:
 		if (timer_fn == SIG_DFL) {
-			if (isatty(STDERR_FILENO))
+			if (isatty(2))
 				fputs("Alarm clock\n", stderr);
 			exit(128 + SIGALRM);
 		} else if (timer_fn != SIG_IGN)
-- 
2.34.1.1257.g2af47340c7b


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v8 5/7] progress.c: add temporary variable from progress struct
  2021-12-28 15:19             ` [PATCH v8 5/7] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
@ 2021-12-28 16:05               ` René Scharfe
  2021-12-28 16:13               ` Johannes Altmanninger
  1 sibling, 0 replies; 81+ messages in thread
From: René Scharfe @ 2021-12-28 16:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, SZEDER Gábor, Johannes Altmanninger

Am 28.12.21 um 16:19 schrieb Ævar Arnfjörð Bjarmason:
> Since 98a13647408 (trace2: log progress time and throughput,
> 2020-05-12) stop_progress() dereferences a "struct progress **"
> parameter in several places. Extract a dereferenced variable (like in
> stop_progress_msg()) to reduce clutter and make it clearer who needs
> to write to this parameter.
>
> Now instead of using "*p_progress" several times in stop_progress() we
> check it once for NULL and then use a dereferenced "progress" variable
> thereafter. This continues the same pattern used in the above
> stop_progress() function, see ac900fddb7f (progress: don't dereference
> before checking for NULL, 2020-08-10).
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  progress.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/progress.c b/progress.c
> index 680c6a8bf93..688749648be 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -319,21 +319,23 @@ static void finish_if_sparse(struct progress *progress)
>
>  void stop_progress(struct progress **p_progress)
>  {
> +	struct progress *progress;
>  	if (!p_progress)
>  		BUG("don't provide NULL to stop_progress");
> +	progress = *p_progress;
>
> -	finish_if_sparse(*p_progress);
> +	finish_if_sparse(progress);
>
> -	if (*p_progress) {
> +	if (progress) {
>  		trace2_data_intmax("progress", the_repository, "total_objects",
> -				   (*p_progress)->total);
> +				   progress->total);
>
> -		if ((*p_progress)->throughput)
> +		if (progress->throughput)
>  			trace2_data_intmax("progress", the_repository,
>  					   "total_bytes",
> -					   (*p_progress)->throughput->curr_total);
> +					   progress->throughput->curr_total);
>
> -		trace2_region_leave("progress", (*p_progress)->title, the_repository);
> +		trace2_region_leave("progress", progress->title, the_repository);
>  	}
>
>  	stop_progress_msg(p_progress, _("done"));

This patch is trivially correct, but I wonder why all that code is here
instead of in stop_progress_msg().  I would expect stop_progress() to be
a thin wrapper that just provides a default message, but actually it
handles sparse progress and tracing.  Aren't both necessary even with a
custom message?

In any case, moving the code there becomes easier after this patch.
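
A minimal sketch of that arrangement, not the actual progress.c code:
"struct progress" is reduced to two fields, region_leave_count stands
in for the trace2_region_leave() call, and assert() stands in for
BUG().  All of the work sits in stop_progress_msg(), and
stop_progress() only supplies the default message:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's struct progress. */
struct progress {
	const char *title;
	unsigned long total;
};

/* Hypothetical counter standing in for trace2_region_leave() events. */
static int region_leave_count;

/* Does the tracing and prints the final message, then frees and clears. */
static void stop_progress_msg(struct progress **p_progress, const char *msg)
{
	struct progress *progress;

	assert(p_progress); /* stand-in for BUG() on a NULL argument */
	progress = *p_progress;
	if (!progress)
		return;
	*p_progress = NULL;

	region_leave_count++; /* emitted for custom messages too */
	printf("%s: %lu, %s.\n", progress->title, progress->total, msg);
	free(progress);
}

/* Thin wrapper: only provides the default message. */
static void stop_progress(struct progress **p_progress)
{
	stop_progress_msg(p_progress, "done");
}
```

With this shape a caller that passes a custom message gets the same
tracing as one that uses the default.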

René

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v8 5/7] progress.c: add temporary variable from progress struct
  2021-12-28 15:19             ` [PATCH v8 5/7] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
  2021-12-28 16:05               ` René Scharfe
@ 2021-12-28 16:13               ` Johannes Altmanninger
  1 sibling, 0 replies; 81+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 16:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Dec 28, 2021 at 04:19:01PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Since 98a13647408 (trace2: log progress time and throughput,
> 2020-05-12) stop_progress() dereferences a "struct progress **"
> parameter in several places. Extract a dereferenced variable (like in
> stop_progress_msg()) to reduce clutter and make it clearer who needs

The "(like in stop_progress_msg())" can probably go because you explain the
added consistency in the next paragraph.

> to write to this parameter.
> 
> Now instead of using "*p_progress" several times in stop_progress() we
> check it once for NULL and then use a dereferenced "progress" variable
> thereafter. This continues the same pattern used in the above
> stop_progress() function, see ac900fddb7f (progress: don't dereference

"above stop_progress" should be "below stop_progress_msg",
because stop_progress is the one you're modifying?

> before checking for NULL, 2020-08-10).
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  progress.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/progress.c b/progress.c
> index 680c6a8bf93..688749648be 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -319,21 +319,23 @@ static void finish_if_sparse(struct progress *progress)
>  
>  void stop_progress(struct progress **p_progress)
>  {
> +	struct progress *progress;

nit: in stop_progress_msg we have a blank line here, the inconsistency is
mildly surprising

>  	if (!p_progress)
>  		BUG("don't provide NULL to stop_progress");
> +	progress = *p_progress;
>  
> -	finish_if_sparse(*p_progress);
> +	finish_if_sparse(progress);
>  
> -	if (*p_progress) {
> +	if (progress) {
>  		trace2_data_intmax("progress", the_repository, "total_objects",
> -				   (*p_progress)->total);
> +				   progress->total);
>  
> -		if ((*p_progress)->throughput)
> +		if (progress->throughput)
>  			trace2_data_intmax("progress", the_repository,
>  					   "total_bytes",
> -					   (*p_progress)->throughput->curr_total);
> +					   progress->throughput->curr_total);
>  
> -		trace2_region_leave("progress", (*p_progress)->title, the_repository);
> +		trace2_region_leave("progress", progress->title, the_repository);
>  	}
>  
>  	stop_progress_msg(p_progress, _("done"));
> -- 
> 2.34.1.1257.g2af47340c7b
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v8 3/7] progress.c tests: make start/stop commands on stdin
  2021-12-28 15:18             ` [PATCH v8 3/7] progress.c tests: make start/stop commands on stdin Ævar Arnfjörð Bjarmason
@ 2021-12-28 16:25               ` Johannes Altmanninger
  0 siblings, 0 replies; 81+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 16:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Dec 28, 2021 at 04:18:59PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Change the usage of the "test-tool progress" introduced in
> 2bb74b53a49 (Test the progress display, 2019-09-16) to take command
> like "start" and "stop" on stdin, instead of running them implicitly.
> 
> This makes for tests that are easier to read, since the recipe will
> mirror the API usage, and allows for easily testing invalid usage that
> would yield (or should yield) a BUG(), e.g. providing two "start"
> calls in a row. A subsequent commit will add such tests.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  t/helper/test-progress.c    | 46 ++++++++++++++++++++++-------
>  t/t0500-progress-display.sh | 58 +++++++++++++++++++++++--------------
>  2 files changed, 72 insertions(+), 32 deletions(-)
> 
> diff --git a/t/helper/test-progress.c b/t/helper/test-progress.c
> index 50fd3be3dad..becc163375f 100644
> --- a/t/helper/test-progress.c
> +++ b/t/helper/test-progress.c
> @@ -3,6 +3,9 @@
>   *
>   * Reads instructions from standard input, one instruction per line:
>   *
> + *   "start <total>[ <title>]" - Call start_progress(title, total),
> + *                               Uses the default title of "Working hard"
> + *                               if the " <title>" is omitted.
>   *   "progress <items>" - Call display_progress() with the given item count
>   *                        as parameter.
>   *   "throughput <bytes> <millis> - Call display_throughput() with the given
> @@ -10,6 +13,7 @@
>   *                                  specify the time elapsed since the
>   *                                  start_progress() call.
>   *   "update" - Set the 'progress_update' flag.
> + *   "stop" - Call stop_progress().
>   *
>   * See 't0500-progress-display.sh' for examples.
>   */
> @@ -19,34 +23,52 @@
>  #include "parse-options.h"
>  #include "progress.h"
>  #include "strbuf.h"
> +#include "string-list.h"
>  
>  int cmd__progress(int argc, const char **argv)
>  {
> -	int total = 0;
> -	const char *title;
> +	const char *const default_title = "Working hard";
> +	struct string_list titles = STRING_LIST_INIT_DUP;
>  	struct strbuf line = STRBUF_INIT;
> -	struct progress *progress;
> +	struct progress *progress = NULL;
>  
>  	const char *usage[] = {
> -		"test-tool progress [--total=<n>] <progress-title>",
> +		"test-tool progress <stdin",
>  		NULL
>  	};
>  	struct option options[] = {
> -		OPT_INTEGER(0, "total", &total, "total number of items"),
>  		OPT_END(),
>  	};
>  
>  	argc = parse_options(argc, argv, NULL, options, usage, 0);
> -	if (argc != 1)
> -		die("need a title for the progress output");
> -	title = argv[0];
> +	if (argc)
> +		usage_with_options(usage, options);
>  
>  	progress_testing = 1;
> -	progress = start_progress(title, total);
>  	while (strbuf_getline(&line, stdin) != EOF) {
>  		char *end;
>  
> -		if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
> +		if (skip_prefix(line.buf, "start ", (const char **) &end)) {
> +			uint64_t total = strtoull(end, &end, 10);
> +			const char *title;
> +			const char *str;
> +
> +			/*
> +			 * We can't use "end + 1" as an argument to
> +			 * start_progress(), it doesn't xstrdup() its
> +			 * "title" argument. We need to hold onto a
> +			 * valid "char *" for it until the end.
> +			 */
> +			if (!*end)
> +				title = default_title;
> +			else if (*end == ' ')
> +				title = string_list_insert(&titles, end + 1)->string;
> +			else
> +				die("invalid input: '%s'\n", line.buf);
> +
> +			str = title ? title : default_title;

I don't think title is ever NULL, so we should be able to elide this variable.
(Did you want to fall back to the default title when the input is "start "?)
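
One reading of that question, as a standalone sketch rather than the
test-tool code: parse_start_args() is a hypothetical helper that takes
the text after "start ", and it treats a separator followed by an empty
title as a request for the default.  The check for missing digits is an
addition that the patch does not have.

```c
#include <assert.h>
#include <stdlib.h>

static const char default_title[] = "Working hard";

/*
 * Parse "<total>[ <title>]". Returns the title to use, or NULL on
 * malformed input. The returned pointer is either default_title or
 * points into "args", so "args" must outlive its use (the patch keeps
 * a copy in a string_list for that reason).
 */
static const char *parse_start_args(const char *args,
				    unsigned long long *total)
{
	char *end;

	*total = strtoull(args, &end, 10);
	if (end == args)
		return NULL; /* no digits at all */
	if (!*end)
		return default_title; /* "<total>" only */
	if (*end != ' ')
		return NULL; /* junk right after the number */
	/* "<total> " with nothing after the space: use the default */
	return end[1] ? end + 1 : default_title;
}
```

Since the function never returns NULL on a valid line, the caller needs
no second "title ? title : default_title" fallback.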

> +			progress = start_progress(str, total);
> +		} else if (skip_prefix(line.buf, "progress ", (const char **) &end)) {
>  			uint64_t item_count = strtoull(end, &end, 10);
>  			if (*end != '\0')
>  				die("invalid input: '%s'\n", line.buf);

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v8 4/7] progress.c tests: test some invalid usage
  2021-12-28 15:19             ` [PATCH v8 4/7] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
@ 2021-12-28 16:33               ` Johannes Altmanninger
  0 siblings, 0 replies; 81+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 16:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, SZEDER Gábor, René Scharfe

On Tue, Dec 28, 2021 at 04:19:00PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Test what happens when we "stop" without a "start", omit the "stop"
> after a "start", or try to start two concurrent progress bars. This

I think there is still no test for the two concurrent progress bars,
but you mention it here, and also in the previous patch's message,
which is misleading.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v8 7/7] *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO)
  2021-12-28 15:19             ` [PATCH v8 7/7] *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO) Ævar Arnfjörð Bjarmason
@ 2021-12-28 16:47               ` René Scharfe
  2021-12-28 23:56                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 81+ messages in thread
From: René Scharfe @ 2021-12-28 16:47 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, SZEDER Gábor, Johannes Altmanninger

Am 28.12.21 um 16:19 schrieb Ævar Arnfjörð Bjarmason:
> We have over 50 uses of "isatty(1)" and "isatty(2)" in the codebase,
> and around 10 "isatty(0)", but three callers used the
> {STDIN_FILENO,STD{OUT,ERR}_FILENO} macros in "unistd.h" to refer to
> them.
>
> Let's change these for consistency.  This makes it easier to change
> all calls to isatty() at a whim, which is useful to test some
> scenarios[1].

Hmm.  Matching e.g. "(0|STDIN_FILENO)" instead of "0" is harder, of
course, but not much.

Shouldn't we use these macros more to reduce the number of magic values?
The code is slightly easier to read before this patch because it doesn't
require the reader to know the meaning of these numbers.

Reducing the constants to their numerical values is easy to automate in
general; the opposite direction is harder.  Coccinelle can help us take
such a step with a semantic patch like this:

	@@
	@@
	  isatty(
	(
	- 0
	+ STDIN_FILENO
	|
	- 1
	+ STDOUT_FILENO
	|
	- 2
	+ STDERR_FILENO
	)
	  )

>
> 1. https://lore.kernel.org/git/patch-v6-8.8-bff919994b5-20211102T122507Z-avarab@gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/bisect--helper.c | 2 +-
>  builtin/bundle.c         | 2 +-
>  compat/mingw.c           | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
> index 28a2e6a5750..21360a4e70b 100644
> --- a/builtin/bisect--helper.c
> +++ b/builtin/bisect--helper.c
> @@ -830,7 +830,7 @@ static int bisect_autostart(struct bisect_terms *terms)
>  	fprintf_ln(stderr, _("You need to start by \"git bisect "
>  			  "start\"\n"));
>
> -	if (!isatty(STDIN_FILENO))
> +	if (!isatty(0))
>  		return -1;
>
>  	/*
> diff --git a/builtin/bundle.c b/builtin/bundle.c
> index 5a85d7cd0fe..df69c651753 100644
> --- a/builtin/bundle.c
> +++ b/builtin/bundle.c
> @@ -56,7 +56,7 @@ static int parse_options_cmd_bundle(int argc,
>
>  static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {
>  	int all_progress_implied = 0;
> -	int progress = isatty(STDERR_FILENO);
> +	int progress = isatty(2);
>  	struct strvec pack_opts;
>  	int version = -1;
>  	int ret;
> diff --git a/compat/mingw.c b/compat/mingw.c
> index e14f2d5f77c..7c55d0f0414 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -2376,7 +2376,7 @@ int mingw_raise(int sig)
>  	switch (sig) {
>  	case SIGALRM:
>  		if (timer_fn == SIG_DFL) {
> -			if (isatty(STDERR_FILENO))
> +			if (isatty(2))
>  				fputs("Alarm clock\n", stderr);
>  			exit(128 + SIGALRM);
>  		} else if (timer_fn != SIG_IGN)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v8 7/7] *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO)
  2021-12-28 16:47               ` René Scharfe
@ 2021-12-28 23:56                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28 23:56 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Junio C Hamano, SZEDER Gábor, Johannes Altmanninger


On Tue, Dec 28 2021, René Scharfe wrote:

> Am 28.12.21 um 16:19 schrieb Ævar Arnfjörð Bjarmason:
>> We have over 50 uses of "isatty(1)" and "isatty(2)" in the codebase,
>> and around 10 "isatty(0)", but three callers used the
>> {STDIN_FILENO,STD{OUT,ERR}_FILENO} macros in "unistd.h" to refer to
>> them.
>>
>> Let's change these for consistency.  This makes it easier to change
>> all calls to isatty() at a whim, which is useful to test some
>> scenarios[1].
>
> Hmm.  Matching e.g. "(0|STDIN_FILENO)" instead of "0" is harder, of
> course, but not much.
>
> Shouldn't we use these macros more to reduce the number of magic values?
> The code is slightly easier to read before this patch because it doesn't
> require the reader to know the meaning of these numbers.
>
> Reducing the constants to their numerical values is easy to automate in
> general; the opposite direction is harder.  Coccinelle can help us take
> such a step with a semantic patch like this:
>
> 	@@
> 	@@
> 	  isatty(
> 	(
> 	- 0
> 	+ STDIN_FILENO
> 	|
> 	- 1
> 	+ STDOUT_FILENO
> 	|
> 	- 2
> 	+ STDERR_FILENO
> 	)
> 	  )

We don't bother with EXIT_SUCCESS and EXIT_FAILURE, and for those (VMS)
there is a reason to not use the constants, as EXIT_FAILURE may differ.

But for these I personally think these symbolic names are rather
useless.

They never differ, and when working on POSIX systems you're going to
need to know that 1 is stdout, 2 is stderr. You're also going to have to
maintain shellscripts that use ">&2" or whatever. Those aren't using a
hypothetical ">&$STDERR_FILENO".

But in any case, this change isn't even trying to make the argument that
we *should* use one over the other, just that the constants are used
much more than *_FILENO, so changing them to make a subsequent (now
ejected out of this series) change easier to explain is worth it.

So I'd think we can just take this small change, and argue separately
whether it's worth it to apply that coccinelle rule.
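
To illustrate the "they never differ" point: POSIX defines the three
macros in <unistd.h> with the values 0, 1 and 2, so both spellings name
the same descriptor.  A small check, where isatty_literals_agree() is a
hypothetical name used only for this sketch:

```c
#include <assert.h>
#include <unistd.h>

/*
 * Returns 1 if isatty() gives the same answer for the literal and the
 * symbolic spelling of each standard descriptor.
 */
static int isatty_literals_agree(void)
{
	/* Values fixed by POSIX in <unistd.h>. */
	assert(STDIN_FILENO == 0);
	assert(STDOUT_FILENO == 1);
	assert(STDERR_FILENO == 2);

	return isatty(0) == isatty(STDIN_FILENO) &&
	       isatty(1) == isatty(STDOUT_FILENO) &&
	       isatty(2) == isatty(STDERR_FILENO);
}
```

This says nothing about which spelling reads better, only that the
choice cannot change behavior on a POSIX system.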

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 4/6] object-name: show date for ambiguous tag objects
  2021-12-28 14:35           ` [PATCH v6 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
@ 2021-12-30 21:43             ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2021-12-30 21:43 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> diff --git a/object-name.c b/object-name.c
> index dcf3ab99990..990f384129e 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -403,21 +403,26 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
>  	} else if (type == OBJ_TAG) {
>  		struct tag *tag = lookup_tag(ds->repo, oid);
>  		const char *tag_tag = "";
> +		timestamp_t tag_date = 0;
>  
> -		if (!parse_tag(tag) && tag->tag)
> +		if (!parse_tag(tag) && tag->tag) {
>  			tag_tag = tag->tag;
> +			tag_date = tag->date;
> +		}
>  
>  		/*
>  		 * TRANSLATORS: This is a line of
>  		 * ambiguous tag object output. E.g.:
>  		 *
> -		 *    "deadbeef tag Some Tag Message"
> +		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
>  		 *
>  		 * The second argument is the "tag" string from
>  		 * object.c, it should (hopefully) already be
>  		 * translated.
>  		 */
> -		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
> +		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
> +			    show_date(tag_date, 0, DATE_MODE(SHORT)),
> +			    tag_tag);

So, when parse_tag() errors out, we show "" and epoch?  We should be
able to do a better error reporting than that; tag_tag and tag_date
are both local and they do not have to be used to store sentinel values
like that.  Instead perhaps remember that we failed to parse_tag(),
and _omit_ unavailable piece of information from the output?  I dunno.
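
Something along these lines, as a rough standalone sketch and not a
proposed patch: describe_tag() is a hypothetical helper, "parsed"
records whether parse_tag() succeeded, the bracketed fallback wording
is made up, and gmtime()/strftime() stand in for show_date().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Format one line of ambiguous tag output into buf. When the tag could
 * not be parsed, say so instead of printing an empty name and the epoch.
 */
static void describe_tag(char *buf, size_t len, const char *hash,
			 int parsed, time_t date, const char *name)
{
	char datebuf[11]; /* "YYYY-MM-DD" plus NUL */
	struct tm *tm;

	if (!parsed || !name) {
		snprintf(buf, len, "%s [bad tag, could not parse it]", hash);
		return;
	}

	tm = gmtime(&date);
	if (!tm || !strftime(datebuf, sizeof(datebuf), "%Y-%m-%d", tm))
		strcpy(datebuf, "unknown");
	snprintf(buf, len, "%s tag %s - %s", hash, datebuf, name);
}
```

The failure case then needs no sentinel values for the name or the date.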

>  	} else if (type == OBJ_TREE) {
>  		/*
>  		 * TRANSLATORS: This is a line of ambiguous <type>

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 1/6] object-name tests: add tests for ambiguous object blind spots
  2021-12-28 14:34           ` [PATCH v6 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
@ 2021-12-30 23:36             ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2021-12-30 23:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +test_cmp_failed_rev_parse () {
> +	dir=$1
> +	rev=$2
> +	shift

What are we shifting away?

> +	test_must_fail git -C "$dir" rev-parse "$rev" 2>actual.raw &&
> +	sed "s/\($rev\)[0-9a-f]*/\1.../g" <actual.raw >actual &&

I wonder if we need to ensure not to mistakenly produce second hit
in an object name that has $rev twice, e.g. "cafe123cafe..."?

> +	test_cmp expect actual
> +}

It is a bit confusing to _depend_ on the caller to prepare a
fixed-name file, like this.  We've avoided such confusion in
different ways in other tests, like (A) make the helper take
the expected output from its standard input, or (B) make the
helper take the name of the file that has expected output as
its argument.

> +test_expect_success 'ambiguous blob output' '
> +	git init --bare blob.prefix &&
> +	(
> +		cd blob.prefix &&
> +
> +		# Both start with "dead..", under both SHA-1 and SHA-256
> +		echo brocdnra | git hash-object -w --stdin &&
> +		echo brigddsv | git hash-object -w --stdin &&
> +
> +		# Both start with "beef.."
> +		echo 1agllotbh | git hash-object -w --stdin &&
> +		echo 1bbfctrkc | git hash-object -w --stdin
> +	) &&
> +
> +	test_must_fail git -C blob.prefix rev-parse dead &&
> +	cat >expect <<-\EOF &&
> +	error: short object ID beef... is ambiguous
> +	hint: The candidates are:
> +	hint:   beef... blob
> +	hint:   beef... blob
> +	fatal: ambiguous argument '\''beef...'\'': unknown revision or path not in the working tree.
> +	Use '\''--'\'' to separate paths from revisions, like this:
> +	'\''git <command> [<revision>...] -- [<file>...]'\''
> +	EOF
> +	test_cmp_failed_rev_parse blob.prefix beef
> +'
> +
> +test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '

"loose bad object", as they aren't even blobs, perhaps?

> +	git init --bare blob.bad &&
> +	(
> +		cd blob.bad &&
> +
> +		# Both have the prefix "bad0"
> +		echo xyzfaowcoh | git hash-object -t bad -w --stdin --literally &&
> +		echo xyzhjpyvwl | git hash-object -t bad -w --stdin --literally
> +	) &&
> +
> +	cat >expect <<-\EOF &&
> +	error: short object ID bad0... is ambiguous
> +	hint: The candidates are:
> +	fatal: invalid object type

That indeed is not very nice.

> +	EOF
> +	test_cmp_failed_rev_parse blob.bad bad0
> +'
> +
> +test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
> +	git init --bare blob.corrupt &&
> +	(
> +		cd blob.corrupt &&
> +
> +		# Both have the prefix "cafe"
> +		echo bnkxmdwz | git hash-object -w --stdin &&
> +		oid=$(echo bmwsjxzi | git hash-object -w --stdin) &&
> +
> +		oidf=objects/$(test_oid_to_path "$oid") &&
> +		chmod 755 $oidf &&
> +		echo broken >$oidf
> +	) &&
> +
> +	cat >expect <<-\EOF &&
> +	error: short object ID cafe... is ambiguous
> +	hint: The candidates are:
> +	error: inflate: data stream error (incorrect header check)
> +	error: unable to unpack cafe... header
> +	error: inflate: data stream error (incorrect header check)
> +	error: unable to unpack cafe... header
> +	hint:   cafe... unknown type
> +	hint:   cafe... blob

This is an interesting one.  I _think_ it is clear enough for the
readers that the inflate errors are for the object that immediately
follows them, so as long as we show these hints one by one, the
above output is perfectly fine.  But we'll see.

> +	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
> +	Use '\''--'\'' to separate paths from revisions, like this:
> +	'\''git <command> [<revision>...] -- [<file>...]'\''
> +	EOF
> +	test_cmp_failed_rev_parse blob.corrupt cafe
> +'
> +
>  if ! test_have_prereq SHA1
>  then
>  	skip_all='not using SHA-1 for objects'

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v6 3/6] object-name: make ambiguous object output translatable
  2021-12-28 14:34           ` [PATCH v6 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2021-12-30 23:46             ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2021-12-30 23:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +		/*
> +		 * TRANSLATORS: This is a line of
> +		 * ambiguous tag object output. E.g.:
> +		 *
> +		 *    "deadbeef tag Some Tag Message"
> +		 *
> +		 * The second argument is the "tag" string from
> +		 * object.c, it should (hopefully) already be
> +		 * translated.
> +		 */
> +		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);

It is better to lose ", it should (hopefully) already be translated"
near the end of the comment.

> +	} else if (type == OBJ_TREE) {
> +		/*
> +		 * TRANSLATORS: This is a line of ambiguous <type>
> +		 * object output. E.g. "deadbeef tree".
> +		 */
> +		strbuf_addf(&desc, _("%s tree"), hash);
> +	} else if (type == OBJ_BLOB) {
> +		/*
> +		 * TRANSLATORS: This is a line of ambiguous <type>
> +		 * object output. E.g. "deadbeef blob".
> +		 */
> +		strbuf_addf(&desc, _("%s blob"), hash);
>  	}
>  
> +
>  out:
> -	advise("  %s %s",
> -	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
> -	       desc.buf);
> +	/*
> +	 * TRANSLATORS: This is line item of ambiguous object output
> +	 * from describe_ambiguous_object() above.
> +	 */
> +	advise(_("  %s"), desc.buf);

What do we expect the translators to do here?  Swap order of the
leading space and the string around?

All the other sentence legos we see in the earlier part of this
patch (omitted) looked quite sensibly done.  Especially the part
that shows a commit object, which lets the translators take the
object name, the date string, and the message and combine them into
a single string in an order of their choice is nice.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v8 0/7] progress: test fixes / cleanup
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
                               ` (6 preceding siblings ...)
  2021-12-28 15:19             ` [PATCH v8 7/7] *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO) Ævar Arnfjörð Bjarmason
@ 2022-01-08  0:45             ` Junio C Hamano
  7 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2022-01-08  0:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, SZEDER Gábor, René Scharfe, Johannes Altmanninger

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Various test, leak and other fixes for the progress.c code and its
> tests. This v8 addresses feedback on v7[1] by Johannes
> Altmanninger. For that round I accidentally broke the In-Reply-To
> chain, so I'm replying to the v6 here to attach it to the original
> thread again.

Is this replying to v6 of a totally unrelated topic that is about
ambiguous object name?


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date
  2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                             ` (6 preceding siblings ...)
  2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
@ 2022-01-12 12:39           ` Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
                               ` (5 more replies)
  7 siblings, 6 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-12 12:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

This topic improves the output we emit on ambiguous objects as noted
in 4/6, and makes it translatable, see 3/6. See [1] for v6.

This v7 addresses all the feedback on v6 from Junio. Note also that
there's an unrelated v8[2] in reply to the v6 from another topic,
because I mixed up the In-Reply-To for the two while submitting a
re-roll of it, sorry about that.

1. https://lore.kernel.org/git/cover-v6-0.6-00000000000-20211228T143223Z-avarab@gmail.com/
2. https://lore.kernel.org/git/cover-v8-0.7-00000000000-20211228T150728Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (6):
  object-name tests: add tests for ambiguous object blind spots
  object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  object-name: make ambiguous object output translatable
  object-name: show date for ambiguous tag objects
  object-name: iterate ambiguous objects before showing header
  object-name: re-use "struct strbuf" in show_ambiguous_object()

 object-name.c                       | 113 +++++++++++++++++++++++++---
 t/t1512-rev-parse-disambiguation.sh |  78 +++++++++++++++++++
 2 files changed, 179 insertions(+), 12 deletions(-)

Range-diff against v6:
1:  27f267ad555 ! 1:  28c01b7f8a5 object-name tests: add tests for ambiguous object blind spots
    @@ t/t1512-rev-parse-disambiguation.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      . ./test-lib.sh
      
     +test_cmp_failed_rev_parse () {
    -+	dir=$1
    -+	rev=$2
    -+	shift
    -+
    -+	test_must_fail git -C "$dir" rev-parse "$rev" 2>actual.raw &&
    -+	sed "s/\($rev\)[0-9a-f]*/\1.../g" <actual.raw >actual &&
    ++	cat >expect &&
    ++	test_must_fail git -C "$1" rev-parse "$2" 2>actual.raw &&
    ++	sed "s/\($2\)[0-9a-f]*/\1.../" <actual.raw >actual &&
     +	test_cmp expect actual
     +}
     +
    @@ t/t1512-rev-parse-disambiguation.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +	) &&
     +
     +	test_must_fail git -C blob.prefix rev-parse dead &&
    -+	cat >expect <<-\EOF &&
    ++	test_cmp_failed_rev_parse blob.prefix beef <<-\EOF
     +	error: short object ID beef... is ambiguous
     +	hint: The candidates are:
     +	hint:   beef... blob
    @@ t/t1512-rev-parse-disambiguation.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +	Use '\''--'\'' to separate paths from revisions, like this:
     +	'\''git <command> [<revision>...] -- [<file>...]'\''
     +	EOF
    -+	test_cmp_failed_rev_parse blob.prefix beef
     +'
     +
    -+test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '
    ++test_expect_success 'ambiguous loose bad object parsed as OBJ_BAD' '
     +	git init --bare blob.bad &&
     +	(
     +		cd blob.bad &&
    @@ t/t1512-rev-parse-disambiguation.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +		echo xyzhjpyvwl | git hash-object -t bad -w --stdin --literally
     +	) &&
     +
    -+	cat >expect <<-\EOF &&
    ++	test_cmp_failed_rev_parse blob.bad bad0 <<-\EOF
     +	error: short object ID bad0... is ambiguous
     +	hint: The candidates are:
     +	fatal: invalid object type
     +	EOF
    -+	test_cmp_failed_rev_parse blob.bad bad0
     +'
     +
     +test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
    @@ t/t1512-rev-parse-disambiguation.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +		echo broken >$oidf
     +	) &&
     +
    -+	cat >expect <<-\EOF &&
    ++	test_cmp_failed_rev_parse blob.corrupt cafe <<-\EOF
     +	error: short object ID cafe... is ambiguous
     +	hint: The candidates are:
     +	error: inflate: data stream error (incorrect header check)
    @@ t/t1512-rev-parse-disambiguation.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +	Use '\''--'\'' to separate paths from revisions, like this:
     +	'\''git <command> [<revision>...] -- [<file>...]'\''
     +	EOF
    -+	test_cmp_failed_rev_parse blob.corrupt cafe
     +'
     +
      if ! test_have_prereq SHA1
2:  c78243dc701 = 2:  b7027dfc843 object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
3:  daebc95542c ! 3:  65801f2c890 object-name: make ambiguous object output translatable
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
     +		 *    "deadbeef tag Some Tag Message"
     +		 *
     +		 * The second argument is the "tag" string from
    -+		 * object.c, it should (hopefully) already be
    -+		 * translated.
    ++		 * object.c.
     +		 */
     +		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
     +	} else if (type == OBJ_TREE) {
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
     -	       desc.buf);
     +	/*
     +	 * TRANSLATORS: This is line item of ambiguous object output
    -+	 * from describe_ambiguous_object() above.
    ++	 * from describe_ambiguous_object() above. For RTL languages
    ++	 * you'll probably want to swap the "%s" and leading " " space
    ++	 * around.
     +	 */
     +	advise(_("  %s"), desc.buf);
      
4:  b5aa6e266f6 ! 4:  2e5511c9fa5 object-name: show date for ambiguous tag objects
    @@ Commit message
     
             hint:   b7e68c41d92 tag v2.32.0
     
    +    As with OBJ_COMMIT we punt on the cases where the date in the object
    +    is nonsensical, and other cases where parse_tag() might fail. For
    +    those we'll use our default date of "0" and tag message of
    +    "". E.g. for some of the corrupt tags created by t3800-mktag.sh we'd
    +    emit a line like:
    +
    +        hint:   8d62cb0b06 tag 1970-01-01 -
    +
    +    We could detect that and emit a "%s [bad tag object]" message (to go
    +    with the existing generic "%s [bad object]"), but I don't think it's
    +    worth the effort. Users are unlikely to ever run into cases where
    +    they've got a broken object that's also ambiguous, and in case they do
    +    output that's a bit nonsensical beats wasting translator time on this
    +    obscure edge case.
    +
    +    We should instead change parse_tag_buffer() to be more eager to emit
    +    an error() instead of silently aborting with "return -1;". In the case
    +    of "t3800-mktag.sh" it takes the "size < the_hash_algo->hexsz + 24"
    +    branch.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## object-name.c ##
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
     +		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
      		 *
      		 * The second argument is the "tag" string from
    - 		 * object.c, it should (hopefully) already be
    - 		 * translated.
    + 		 * object.c.
      		 */
     -		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
     +		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
5:  644b076b2a6 ! 5:  2c03cdd3c1e object-name: iterate ambiguous objects before showing header
    @@ object-name.c: static int init_object_disambiguation(struct repository *r,
      	int type;
      	const char *hash;
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    - 	 * TRANSLATORS: This is line item of ambiguous object output
    - 	 * from describe_ambiguous_object() above.
    + 	 * you'll probably want to swap the "%s" and leading " " space
    + 	 * around.
      	 */
     -	advise(_("  %s"), desc.buf);
     +	strbuf_addf(advice, _("  %s\n"), desc.buf);
    @@ object-name.c: static enum get_oid_result get_short_oid(struct repository *r,
      	return status;
     
      ## t/t1512-rev-parse-disambiguation.sh ##
    -@@ t/t1512-rev-parse-disambiguation.sh: test_expect_success 'ambiguous loose blob parsed as OBJ_BAD' '
    +@@ t/t1512-rev-parse-disambiguation.sh: test_expect_success 'ambiguous loose bad object parsed as OBJ_BAD' '
      
    - 	cat >expect <<-\EOF &&
    + 	test_cmp_failed_rev_parse blob.bad bad0 <<-\EOF
      	error: short object ID bad0... is ambiguous
     -	hint: The candidates are:
      	fatal: invalid object type
      	EOF
    - 	test_cmp_failed_rev_parse blob.bad bad0
    + '
     @@ t/t1512-rev-parse-disambiguation.sh: test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
      
    - 	cat >expect <<-\EOF &&
    + 	test_cmp_failed_rev_parse blob.corrupt cafe <<-\EOF
      	error: short object ID cafe... is ambiguous
     -	hint: The candidates are:
      	error: inflate: data stream error (incorrect header check)
6:  6a31cfcfc29 ! 6:  bf226f67099 object-name: re-use "struct strbuf" in show_ambiguous_object()
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
      		strbuf_release(&date);
      		strbuf_release(&msg);
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    - 		 * object.c, it should (hopefully) already be
    - 		 * translated.
    + 		 * The second argument is the "tag" string from
    + 		 * object.c.
      		 */
     -		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
     +		strbuf_addf(sb, _("%s tag %s - %s"), hash,
    @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, voi
      
      
     @@ object-name.c: static int show_ambiguous_object(const struct object_id *oid, void *data)
    - 	 * TRANSLATORS: This is line item of ambiguous object output
    - 	 * from describe_ambiguous_object() above.
    + 	 * you'll probably want to swap the "%s" and leading " " space
    + 	 * around.
      	 */
     -	strbuf_addf(advice, _("  %s\n"), desc.buf);
     +	strbuf_addf(advice, _("  %s\n"), sb->buf);
-- 
2.34.1.1373.g062f5534af2


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
@ 2022-01-12 12:39             ` Ævar Arnfjörð Bjarmason
  2022-01-13 22:39               ` Junio C Hamano
  2022-01-12 12:39             ` [PATCH v7 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
                               ` (4 subsequent siblings)
  5 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-12 12:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Extend the tests for ambiguous objects to check how we handle objects
where we return OBJ_BAD when trying to parse them. As noted in [1] we
have a blindspot when it comes to this behavior.

Since we need to add new test data here, let's extend these tests to be
tested under SHA-256. In d7a2fc82491 (t1512: skip test if not using
SHA-1, 2018-05-13) all of the existing tests were skipped, as they
rely on specific SHA-1 object IDs.

For these tests it only matters that the first 4 characters of the OID
prefix are the same for both SHA-1 and SHA-256. This uses strings that
I mined, which have the same prefix when hashed with both.

We "test_cmp" the full output to guard against any future regressions,
and because a subsequent commit will tweak it. Showing a diff of how
the output changes is helpful to explain those subsequent commits.
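
As a rough illustration of the normalization that makes such a full
"test_cmp" possible under both hash algorithms, here is a minimal sketch
in C of what the sed expression in test_cmp_failed_rev_parse() does to
one line: the first occurrence of the prefix, plus any lowercase hex
digits that follow it, becomes "<prefix>...". The helper name
mask_oid_suffix() and its buffer-based interface are hypothetical and not
part of git; leaving a line unchanged for an empty prefix is likewise an
assumption, since the tests never pass one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return non-zero for the characters matched by sed's [0-9a-f]. */
static int is_lower_hex(char c)
{
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

/*
 * Hypothetical helper: copy one line from "in" to "out", replacing the
 * first occurrence of "prefix" plus any following lowercase hex digits
 * with "<prefix>...". Lines without the prefix (or an empty prefix) are
 * copied unchanged. Returns 0 on success, -1 if "out" is too small.
 */
static int mask_oid_suffix(const char *in, const char *prefix,
			   char *out, size_t outsz)
{
	const char *hit, *rest;
	size_t plen, head;

	assert(in && prefix && out);

	plen = strlen(prefix);
	hit = plen ? strstr(in, prefix) : NULL;
	if (!hit) {
		if (strlen(in) + 1 > outsz)
			return -1;
		strcpy(out, in);
		return 0;
	}

	/* Everything up to and including the prefix is kept as-is. */
	head = (size_t)(hit - in) + plen;

	/* Skip the remainder of the object ID. */
	rest = hit + plen;
	while (is_lower_hex(*rest))
		rest++;

	if (head + 3 + strlen(rest) + 1 > outsz)
		return -1;

	memcpy(out, in, head);
	memcpy(out + head, "...", 3);
	strcpy(out + head + 3, rest);
	return 0;
}
```

With this, a line such as "hint:   beef1234abcd blob" and a prefix of
"beef" would come out as "hint:   beef... blob", whichever hash algorithm
produced the rest of the object ID.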

1. https://lore.kernel.org/git/YZwbphPpfGk78w2f@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1512-rev-parse-disambiguation.sh | 79 +++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index b0119bf8bc8..01feeeafb72 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -25,6 +25,85 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+test_cmp_failed_rev_parse () {
+	cat >expect &&
+	test_must_fail git -C "$1" rev-parse "$2" 2>actual.raw &&
+	sed "s/\($2\)[0-9a-f]*/\1.../" <actual.raw >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'ambiguous blob output' '
+	git init --bare blob.prefix &&
+	(
+		cd blob.prefix &&
+
+		# Both start with "dead..", under both SHA-1 and SHA-256
+		echo brocdnra | git hash-object -w --stdin &&
+		echo brigddsv | git hash-object -w --stdin &&
+
+		# Both start with "beef.."
+		echo 1agllotbh | git hash-object -w --stdin &&
+		echo 1bbfctrkc | git hash-object -w --stdin
+	) &&
+
+	test_must_fail git -C blob.prefix rev-parse dead &&
+	test_cmp_failed_rev_parse blob.prefix beef <<-\EOF
+	error: short object ID beef... is ambiguous
+	hint: The candidates are:
+	hint:   beef... blob
+	hint:   beef... blob
+	fatal: ambiguous argument '\''beef...'\'': unknown revision or path not in the working tree.
+	Use '\''--'\'' to separate paths from revisions, like this:
+	'\''git <command> [<revision>...] -- [<file>...]'\''
+	EOF
+'
+
+test_expect_success 'ambiguous loose bad object parsed as OBJ_BAD' '
+	git init --bare blob.bad &&
+	(
+		cd blob.bad &&
+
+		# Both have the prefix "bad0"
+		echo xyzfaowcoh | git hash-object -t bad -w --stdin --literally &&
+		echo xyzhjpyvwl | git hash-object -t bad -w --stdin --literally
+	) &&
+
+	test_cmp_failed_rev_parse blob.bad bad0 <<-\EOF
+	error: short object ID bad0... is ambiguous
+	hint: The candidates are:
+	fatal: invalid object type
+	EOF
+'
+
+test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
+	git init --bare blob.corrupt &&
+	(
+		cd blob.corrupt &&
+
+		# Both have the prefix "cafe"
+		echo bnkxmdwz | git hash-object -w --stdin &&
+		oid=$(echo bmwsjxzi | git hash-object -w --stdin) &&
+
+		oidf=objects/$(test_oid_to_path "$oid") &&
+		chmod 755 $oidf &&
+		echo broken >$oidf
+	) &&
+
+	test_cmp_failed_rev_parse blob.corrupt cafe <<-\EOF
+	error: short object ID cafe... is ambiguous
+	hint: The candidates are:
+	error: inflate: data stream error (incorrect header check)
+	error: unable to unpack cafe... header
+	error: inflate: data stream error (incorrect header check)
+	error: unable to unpack cafe... header
+	hint:   cafe... unknown type
+	hint:   cafe... blob
+	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
+	Use '\''--'\'' to separate paths from revisions, like this:
+	'\''git <command> [<revision>...] -- [<file>...]'\''
+	EOF
+'
+
 if ! test_have_prereq SHA1
 then
 	skip_all='not using SHA-1 for objects'
-- 
2.34.1.1373.g062f5534af2


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v7 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
@ 2022-01-12 12:39             ` Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
                               ` (3 subsequent siblings)
  5 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-12 12:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Amend the "unknown type" handling in the code that displays the
ambiguous object list to assert() that we're either going to get the
"real" object types we can pass to type_name(), or a -1 (OBJ_BAD)
return value from oid_object_info().

See [1] for the current output, and [2] for the commit that added the
"unknown type" handling.

We are never going to get an "unknown type" in the sense of custom
types crafted with "hash-object --literally", since we're not using
the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag.

If we manage to otherwise unpack such an object without errors we'll
die() in parse_loose_header_extended() called by sort_ambiguous()
before we get to show_ambiguous_object(), as is asserted by the test
added in the preceding commit.

So saying "unknown type" here was always misleading; we really meant
to say that we had a failure parsing the object at all, i.e. that we
had repository corruption. If the problem is only that its type is
unknown we won't reach this code.

So let's emit a generic "[bad object]" instead. As our tests added in
the preceding commit show, we'll have emitted various "error" output
already in those cases.
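
The resulting decision can be sketched outside of git roughly as follows.
This is an illustrative, self-contained approximation only: the
sketch_type enum, sketch_type_name() and describe_candidate() are
hypothetical stand-ins, and the real change to show_ambiguous_object() is
in the diff below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local stand-ins for git's object type constants; illustrative only. */
enum sketch_type {
	SKETCH_BAD = -1,
	SKETCH_COMMIT = 1,
	SKETCH_TREE,
	SKETCH_BLOB,
	SKETCH_TAG
};

/* Hypothetical stand-in for type_name(): NULL for anything unexpected. */
static const char *sketch_type_name(enum sketch_type type)
{
	switch (type) {
	case SKETCH_COMMIT:
		return "commit";
	case SKETCH_TREE:
		return "tree";
	case SKETCH_BLOB:
		return "blob";
	case SKETCH_TAG:
		return "tag";
	default:
		return NULL;
	}
}

/*
 * Format one candidate line. A negative type means the object could not
 * be looked up or parsed; any other value must be one of the four real
 * object types, which is what the assert() documents.
 */
static int describe_candidate(const char *hash, int type,
			      char *out, size_t outsz)
{
	const char *name;

	if (type < 0)
		return snprintf(out, outsz, "  %s [bad object]", hash);

	name = sketch_type_name((enum sketch_type)type);
	assert(name);
	return snprintf(out, outsz, "  %s %s", hash, name);
}
```

The point of the sketch is that there is no third "unknown type" branch:
a candidate is either unparseable or one of the four known types.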

We should do better in the truly "unknown type" cases, which we'd need
to handle if we were passing down the OBJECT_INFO_ALLOW_UNKNOWN_TYPE
flag. But let's leave that for some future improvement. In a
subsequent commit I'll improve the output we do show, and not having
to handle the "unknown type" (as in OBJECT_INFO_ALLOW_UNKNOWN_TYPE)
simplifies that change.

1. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)
2. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c                       | 14 ++++++++++++--
 t/t1512-rev-parse-disambiguation.sh |  2 +-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/object-name.c b/object-name.c
index fdff4601b2c..9750634ee76 100644
--- a/object-name.c
+++ b/object-name.c
@@ -361,6 +361,16 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		return 0;
 
 	type = oid_object_info(ds->repo, oid, NULL);
+
+	if (type < 0) {
+		strbuf_addstr(&desc, "[bad object]");
+		goto out;
+	}
+
+	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
+	       type == OBJ_BLOB || type == OBJ_TAG);
+	strbuf_addstr(&desc, type_name(type));
+
 	if (type == OBJ_COMMIT) {
 		struct commit *commit = lookup_commit(ds->repo, oid);
 		if (commit) {
@@ -374,9 +384,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 			strbuf_addf(&desc, " %s", tag->tag);
 	}
 
-	advise("  %s %s%s",
+out:
+	advise("  %s %s",
 	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       type_name(type) ? type_name(type) : "unknown type",
 	       desc.buf);
 
 	strbuf_release(&desc);
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 01feeeafb72..5ed7e49edc7 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -96,7 +96,7 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
 	error: unable to unpack cafe... header
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
-	hint:   cafe... unknown type
+	hint:   cafe... [bad object]
 	hint:   cafe... blob
 	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
 	Use '\''--'\'' to separate paths from revisions, like this:
-- 
2.34.1.1373.g062f5534af2


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v7 3/6] object-name: make ambiguous object output translatable
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
@ 2022-01-12 12:39             ` Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
                               ` (2 subsequent siblings)
  5 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-12 12:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] and the preceding commit to be more friendly to
translators.

By being able to customize the "<SP><SP>%s\n" format we're even ready
for RTL languages, who'd presumably like to change that to
"%s<SP><SP>\n".

1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
   2016-09-26)
2. 5cc044e0257 (get_short_oid: sort ambiguous objects by type,
   then SHA-1, 2018-05-10)
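
To illustrate why passing the whole line-item format through the
translation layer matters, here is a minimal sketch in C. The function
sketch_translate() is a hypothetical one-entry stand-in for a message
catalog, not gettext itself, and the "translated" format is an assumption
about what a locale might choose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for _(): a one-entry catalog for an imaginary
 * locale that wants the indentation after the item instead of before it.
 */
static const char *sketch_translate(const char *msgid, int use_catalog)
{
	if (use_catalog && !strcmp(msgid, "  %s"))
		return "%s  ";
	return msgid;
}

/* Format one line item, with or without the imaginary translation. */
static int format_line_item(const char *desc, int use_catalog,
			    char *out, size_t outsz)
{
	assert(desc && out);
	return snprintf(out, outsz, sketch_translate("  %s", use_catalog),
			desc);
}
```

Because the indentation is part of the translatable string rather than
hardcoded next to an untranslated "%s", a catalog can move it without any
change to the calling code.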

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 object-name.c | 66 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 59 insertions(+), 7 deletions(-)

diff --git a/object-name.c b/object-name.c
index 9750634ee76..743f346842a 100644
--- a/object-name.c
+++ b/object-name.c
@@ -356,38 +356,90 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	const struct disambiguate_state *ds = data;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
+	const char *hash;
 
 	if (ds->fn && !ds->fn(ds->repo, oid, ds->cb_data))
 		return 0;
 
+	hash = repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV);
 	type = oid_object_info(ds->repo, oid, NULL);
 
 	if (type < 0) {
-		strbuf_addstr(&desc, "[bad object]");
+		/*
+		 * TRANSLATORS: This is a line of ambiguous object
+		 * output shown when we cannot look up or parse the
+		 * object in question. E.g. "deadbeef [bad object]".
+		 */
+		strbuf_addf(&desc, _("%s [bad object]"), hash);
 		goto out;
 	}
 
 	assert(type == OBJ_TREE || type == OBJ_COMMIT ||
 	       type == OBJ_BLOB || type == OBJ_TAG);
-	strbuf_addstr(&desc, type_name(type));
 
 	if (type == OBJ_COMMIT) {
+		struct strbuf date = STRBUF_INIT;
+		struct strbuf msg = STRBUF_INIT;
 		struct commit *commit = lookup_commit(ds->repo, oid);
+
 		if (commit) {
 			struct pretty_print_context pp = {0};
 			pp.date_mode.type = DATE_SHORT;
-			format_commit_message(commit, " %ad - %s", &desc, &pp);
+			format_commit_message(commit, "%ad", &date, &pp);
+			format_commit_message(commit, "%s", &msg, &pp);
 		}
+
+		/*
+		 * TRANSLATORS: This is a line of ambiguous commit
+		 * object output. E.g.:
+		 *
+		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
+		 */
+		strbuf_addf(&desc, _("%s commit %s - %s"),
+			    hash, date.buf, msg.buf);
+
+		strbuf_release(&date);
+		strbuf_release(&msg);
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
+		const char *tag_tag = "";
+
 		if (!parse_tag(tag) && tag->tag)
-			strbuf_addf(&desc, " %s", tag->tag);
+			tag_tag = tag->tag;
+
+		/*
+		 * TRANSLATORS: This is a line of
+		 * ambiguous tag object output. E.g.:
+		 *
+		 *    "deadbeef tag Some Tag Message"
+		 *
+		 * The second argument is the "tag" string from
+		 * object.c.
+		 */
+		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+	} else if (type == OBJ_TREE) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef tree".
+		 */
+		strbuf_addf(&desc, _("%s tree"), hash);
+	} else if (type == OBJ_BLOB) {
+		/*
+		 * TRANSLATORS: This is a line of ambiguous <type>
+		 * object output. E.g. "deadbeef blob".
+		 */
+		strbuf_addf(&desc, _("%s blob"), hash);
 	}
 
+
 out:
-	advise("  %s %s",
-	       repo_find_unique_abbrev(ds->repo, oid, DEFAULT_ABBREV),
-	       desc.buf);
+	/*
+	 * TRANSLATORS: This is line item of ambiguous object output
+	 * from describe_ambiguous_object() above. For RTL languages
+	 * you'll probably want to swap the "%s" and leading " " space
+	 * around.
+	 */
+	advise(_("  %s"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
-- 
2.34.1.1373.g062f5534af2


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v7 4/6] object-name: show date for ambiguous tag objects
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                               ` (2 preceding siblings ...)
  2022-01-12 12:39             ` [PATCH v7 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
@ 2022-01-12 12:39             ` Ævar Arnfjörð Bjarmason
  2022-01-13 22:46               ` Junio C Hamano
  2022-01-12 12:39             ` [PATCH v7 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
  5 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-12 12:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Make the ambiguous tag object output nicer in the case of tag objects
such as ebf3c04b262 (Git 2.32, 2021-06-06) by including the date in
the "tagger" header. I.e.:

    $ git rev-parse b7e68
    error: short object ID b7e68 is ambiguous
    hint: The candidates are:
    hint:   b7e68c41d92 tag 2021-06-06 - v2.32.0
    hint:   b7e68ae18e0 commit 2019-12-23 - bisect: use the standard 'if (!var)' way to check for 0
    hint:   b7e68f6b413 tree
    hint:   b7e68490b97 blob
    b7e68
    [...]

Before this we'd emit a "tag" line of:

    hint:   b7e68c41d92 tag v2.32.0

As with OBJ_COMMIT we punt on the cases where the date in the object
is nonsensical, and other cases where parse_tag() might fail. For
those we'll use our default date of "0" and tag message of
"". E.g. for some of the corrupt tags created by t3800-mktag.sh we'd
emit a line like:

    hint:   8d62cb0b06 tag 1970-01-01 -

We could detect that and emit a "%s [bad tag object]" message (to go
with the existing generic "%s [bad object]"), but I don't think it's
worth the effort. Users are unlikely to ever run into cases where
they've got a broken object that's also ambiguous, and in case they do,
output that's a bit nonsensical beats wasting translator time on this
obscure edge case.

We should instead change parse_tag_buffer() to be more eager to emit
an error() instead of silently aborting with "return -1;". In the case
of "t3800-mktag.sh" it takes the "size < the_hash_algo->hexsz + 24"
branch.
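
The "1970-01-01" in the corrupt-tag example above follows from formatting
the default timestamp of 0. A minimal sketch in C, using only the
standard library and assuming UTC, is shown here; format_tag_line() is a
hypothetical helper and not how git's show_date() is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: format "<hash> tag <YYYY-MM-DD> - <name>" for a
 * tag with the given timestamp (seconds since the epoch, taken as UTC).
 * Returns -1 if the date cannot be formatted.
 */
static int format_tag_line(const char *hash, time_t date, const char *name,
			   char *out, size_t outsz)
{
	char day[16];
	struct tm *tm;

	assert(hash && name && out);

	tm = gmtime(&date);
	if (!tm || !strftime(day, sizeof(day), "%Y-%m-%d", tm))
		return -1;

	return snprintf(out, outsz, "%s tag %s - %s", hash, day, name);
}
```

Called with a date of 0 and an empty name, this yields the same
"tag 1970-01-01 - " shape as the hint line above, which is why a tag that
fails to parse is still displayed, just with placeholder values.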

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/object-name.c b/object-name.c
index 743f346842a..7c6cb60ceff 100644
--- a/object-name.c
+++ b/object-name.c
@@ -403,20 +403,25 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	} else if (type == OBJ_TAG) {
 		struct tag *tag = lookup_tag(ds->repo, oid);
 		const char *tag_tag = "";
+		timestamp_t tag_date = 0;
 
-		if (!parse_tag(tag) && tag->tag)
+		if (!parse_tag(tag) && tag->tag) {
 			tag_tag = tag->tag;
+			tag_date = tag->date;
+		}
 
 		/*
 		 * TRANSLATORS: This is a line of
 		 * ambiguous tag object output. E.g.:
 		 *
-		 *    "deadbeef tag Some Tag Message"
+		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
 		 *
 		 * The second argument is the "tag" string from
 		 * object.c.
 		 */
-		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
+		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+			    show_date(tag_date, 0, DATE_MODE(SHORT)),
+			    tag_tag);
 	} else if (type == OBJ_TREE) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
-- 
2.34.1.1373.g062f5534af2


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v7 5/6] object-name: iterate ambiguous objects before showing header
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                               ` (3 preceding siblings ...)
  2022-01-12 12:39             ` [PATCH v7 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
@ 2022-01-12 12:39             ` Ævar Arnfjörð Bjarmason
  2022-01-12 12:39             ` [PATCH v7 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-12 12:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Change the "The candidates are" header that's shown for ambiguous
objects to be shown after we've iterated over all of the objects.

If we get any errors while doing so we don't want to split up the
header and the list as a result. The two will now be printed together,
as shown in the updated testcase.

As we're accumulating the lines into a "struct strbuf" before
emitting them we need to add a trailing newline to the call in
show_ambiguous_object(). This and the change from "The candidates
are:" to "The candidates are:\n%s" helps to give translators more
context.
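
The accumulate-then-emit pattern can be sketched in isolation like this.
It is a simplified illustration: struct sketch_buf is a hypothetical,
minimal stand-in for "struct strbuf", and the "hint: " prefix that
advise() adds to each line is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal stand-in for git's "struct strbuf". */
struct sketch_buf {
	char *buf;
	size_t len;
};

/* Append one "  <item>\n" line; nothing is printed at this point. */
static void sketch_buf_add_line(struct sketch_buf *sb, const char *item)
{
	size_t add = strlen(item) + 3;	/* two spaces, item, newline */
	char *grown = realloc(sb->buf, sb->len + add + 1);

	assert(grown);
	sb->buf = grown;
	snprintf(sb->buf + sb->len, add + 1, "  %s\n", item);
	sb->len += add;
}

/* Produce the header and the accumulated list as one message. */
static int sketch_emit(const struct sketch_buf *sb, char *out, size_t outsz)
{
	return snprintf(out, outsz, "The candidates are:\n%s",
			sb->buf ? sb->buf : "");
}
```

Any error printed while the candidates are being collected then appears
before the header, so the header and the list stay together.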

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object-name.c                       | 27 +++++++++++++++++++++++----
 t/t1512-rev-parse-disambiguation.sh |  3 +--
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/object-name.c b/object-name.c
index 7c6cb60ceff..71236ed1c16 100644
--- a/object-name.c
+++ b/object-name.c
@@ -351,9 +351,16 @@ static int init_object_disambiguation(struct repository *r,
 	return 0;
 }
 
+struct ambiguous_output {
+	const struct disambiguate_state *ds;
+	struct strbuf advice;
+};
+
 static int show_ambiguous_object(const struct object_id *oid, void *data)
 {
-	const struct disambiguate_state *ds = data;
+	struct ambiguous_output *state = data;
+	const struct disambiguate_state *ds = state->ds;
+	struct strbuf *advice = &state->advice;
 	struct strbuf desc = STRBUF_INIT;
 	int type;
 	const char *hash;
@@ -444,7 +451,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	 * you'll probably want to swap the "%s" and leading " " space
 	 * around.
 	 */
-	advise(_("  %s"), desc.buf);
+	strbuf_addf(advice, _("  %s\n"), desc.buf);
 
 	strbuf_release(&desc);
 	return 0;
@@ -543,6 +550,10 @@ static enum get_oid_result get_short_oid(struct repository *r,
 
 	if (!quietly && (status == SHORT_NAME_AMBIGUOUS)) {
 		struct oid_array collect = OID_ARRAY_INIT;
+		struct ambiguous_output out = {
+			.ds = &ds,
+			.advice = STRBUF_INIT,
+		};
 
 		error(_("short object ID %s is ambiguous"), ds.hex_pfx);
 
@@ -555,13 +566,21 @@ static enum get_oid_result get_short_oid(struct repository *r,
 		if (!ds.ambiguous)
 			ds.fn = NULL;
 
-		advise(_("The candidates are:"));
 		repo_for_each_abbrev(r, ds.hex_pfx, collect_ambiguous, &collect);
 		sort_ambiguous_oid_array(r, &collect);
 
-		if (oid_array_for_each(&collect, show_ambiguous_object, &ds))
+		if (oid_array_for_each(&collect, show_ambiguous_object, &out))
 			BUG("show_ambiguous_object shouldn't return non-zero");
+
+		/*
+		 * TRANSLATORS: The argument is the list of ambiguous
+		 * objects composed in show_ambiguous_object(). See
+		 * its "TRANSLATORS" comments for details.
+		 */
+		advise(_("The candidates are:\n%s"), out.advice.buf);
+
 		oid_array_clear(&collect);
+		strbuf_release(&out.advice);
 	}
 
 	return status;
diff --git a/t/t1512-rev-parse-disambiguation.sh b/t/t1512-rev-parse-disambiguation.sh
index 5ed7e49edc7..9c43699d3ae 100755
--- a/t/t1512-rev-parse-disambiguation.sh
+++ b/t/t1512-rev-parse-disambiguation.sh
@@ -70,7 +70,6 @@ test_expect_success 'ambiguous loose bad object parsed as OBJ_BAD' '
 
 	test_cmp_failed_rev_parse blob.bad bad0 <<-\EOF
 	error: short object ID bad0... is ambiguous
-	hint: The candidates are:
 	fatal: invalid object type
 	EOF
 '
@@ -91,11 +90,11 @@ test_expect_success POSIXPERM 'ambigous zlib corrupt loose blob' '
 
 	test_cmp_failed_rev_parse blob.corrupt cafe <<-\EOF
 	error: short object ID cafe... is ambiguous
-	hint: The candidates are:
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
 	error: inflate: data stream error (incorrect header check)
 	error: unable to unpack cafe... header
+	hint: The candidates are:
 	hint:   cafe... [bad object]
 	hint:   cafe... blob
 	fatal: ambiguous argument '\''cafe...'\'': unknown revision or path not in the working tree.
-- 
2.34.1.1373.g062f5534af2


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH v7 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object()
  2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
                               ` (4 preceding siblings ...)
  2022-01-12 12:39             ` [PATCH v7 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
@ 2022-01-12 12:39             ` Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-12 12:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Bagas Sanjaya, Josh Steadmon,
	Ævar Arnfjörð Bjarmason

Reduce the allocations done by show_ambiguous_object() by moving the
"desc" strbuf into the "struct ambiguous_output" introduced in the
preceding commit.

This doesn't matter for optimization purposes, but since we're
accumulating a "struct strbuf advice" anyway, let's follow that pattern
and add a "struct strbuf sb". We can then strbuf_reset() it rather
than calling strbuf_release() on each call to
show_ambiguous_object().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
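
The difference between resetting and releasing a buffer can be
sketched in isolation. This is an illustration under assumptions, not
git's strbuf implementation: "struct minibuf" and its three helpers
are hypothetical names, and the growth policy is arbitrary. Resetting
keeps the allocation for the next call, while releasing frees it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature growable buffer; git's real strbuf differs. */
struct minibuf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* Append a string, reallocating only when the capacity is exceeded. */
static void minibuf_addstr(struct minibuf *mb, const char *s)
{
	size_t n;

	assert(mb && s);
	n = strlen(s);
	if (mb->len + n + 1 > mb->alloc) {
		size_t want = (mb->len + n + 1) * 2;
		char *grown = realloc(mb->buf, want);

		if (!grown)
			abort();
		mb->buf = grown;
		mb->alloc = want;
	}
	memcpy(mb->buf + mb->len, s, n + 1); /* copies the NUL too */
	mb->len += n;
}

/* Forget the contents but keep the allocation for the next caller. */
static void minibuf_reset(struct minibuf *mb)
{
	assert(mb);
	mb->len = 0;
	if (mb->buf)
		mb->buf[0] = '\0';
}

/* Free the allocation and return to the initial empty state. */
static void minibuf_release(struct minibuf *mb)
{
	assert(mb);
	free(mb->buf);
	mb->buf = NULL;
	mb->len = mb->alloc = 0;
}
```

With this shape, a per-object callback resets the shared buffer before
returning, and only the caller that owns it releases it once at the
end.
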
 object-name.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/object-name.c b/object-name.c
index 71236ed1c16..bce3f42356a 100644
--- a/object-name.c
+++ b/object-name.c
@@ -354,6 +354,7 @@ static int init_object_disambiguation(struct repository *r,
 struct ambiguous_output {
 	const struct disambiguate_state *ds;
 	struct strbuf advice;
+	struct strbuf sb;
 };
 
 static int show_ambiguous_object(const struct object_id *oid, void *data)
@@ -361,7 +362,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	struct ambiguous_output *state = data;
 	const struct disambiguate_state *ds = state->ds;
 	struct strbuf *advice = &state->advice;
-	struct strbuf desc = STRBUF_INIT;
+	struct strbuf *sb = &state->sb;
 	int type;
 	const char *hash;
 
@@ -377,7 +378,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * output shown when we cannot look up or parse the
 		 * object in question. E.g. "deadbeef [bad object]".
 		 */
-		strbuf_addf(&desc, _("%s [bad object]"), hash);
+		strbuf_addf(sb, _("%s [bad object]"), hash);
 		goto out;
 	}
 
@@ -402,8 +403,8 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 *
 		 *    "deadbeef commit 2021-01-01 - Some Commit Message"
 		 */
-		strbuf_addf(&desc, _("%s commit %s - %s"),
-			    hash, date.buf, msg.buf);
+		strbuf_addf(sb, _("%s commit %s - %s"), hash, date.buf,
+			    msg.buf);
 
 		strbuf_release(&date);
 		strbuf_release(&msg);
@@ -426,7 +427,7 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * The second argument is the "tag" string from
 		 * object.c.
 		 */
-		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
+		strbuf_addf(sb, _("%s tag %s - %s"), hash,
 			    show_date(tag_date, 0, DATE_MODE(SHORT)),
 			    tag_tag);
 	} else if (type == OBJ_TREE) {
@@ -434,13 +435,13 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 		 * TRANSLATORS: This is a line of ambiguous <type>
 		 * object output. E.g. "deadbeef tree".
 		 */
-		strbuf_addf(&desc, _("%s tree"), hash);
+		strbuf_addf(sb, _("%s tree"), hash);
 	} else if (type == OBJ_BLOB) {
 		/*
 		 * TRANSLATORS: This is a line of ambiguous <type>
 		 * object output. E.g. "deadbeef blob".
 		 */
-		strbuf_addf(&desc, _("%s blob"), hash);
+		strbuf_addf(sb, _("%s blob"), hash);
 	}
 
 
@@ -451,9 +452,9 @@ static int show_ambiguous_object(const struct object_id *oid, void *data)
 	 * you'll probably want to swap the "%s" and leading " " space
 	 * around.
 	 */
-	strbuf_addf(advice, _("  %s\n"), desc.buf);
+	strbuf_addf(advice, _("  %s\n"), sb->buf);
 
-	strbuf_release(&desc);
+	strbuf_reset(sb);
 	return 0;
 }
 
@@ -552,6 +553,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
 		struct oid_array collect = OID_ARRAY_INIT;
 		struct ambiguous_output out = {
 			.ds = &ds,
+			.sb = STRBUF_INIT,
 			.advice = STRBUF_INIT,
 		};
 
@@ -581,6 +583,7 @@ static enum get_oid_result get_short_oid(struct repository *r,
 
 		oid_array_clear(&collect);
 		strbuf_release(&out.advice);
+		strbuf_release(&out.sb);
 	}
 
 	return status;
-- 
2.34.1.1373.g062f5534af2


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots
  2022-01-12 12:39             ` [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
@ 2022-01-13 22:39               ` Junio C Hamano
  2022-01-14 12:07                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2022-01-13 22:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +test_cmp_failed_rev_parse () {
> +	cat >expect &&
> +	test_must_fail git -C "$1" rev-parse "$2" 2>actual.raw &&
> +	sed "s/\($2\)[0-9a-f]*/\1.../" <actual.raw >actual &&
> +	test_cmp expect actual
> +}

That's dense, especially without a comment (or named variable) that
hints readers what the arguments to this helper (and its standard
input) ought to be.

As long as messages from rev-parse on the error stream never have
more than one abbreviated object name on a single line, the above
should give us a copy of the message with expected object name
abbreviated to $2; otherwise we might be missing a /g in the sed
script.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v7 4/6] object-name: show date for ambiguous tag objects
  2022-01-12 12:39             ` [PATCH v7 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
@ 2022-01-13 22:46               ` Junio C Hamano
  2022-01-14 12:05                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2022-01-13 22:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

>  	} else if (type == OBJ_TAG) {
>  		struct tag *tag = lookup_tag(ds->repo, oid);
>  		const char *tag_tag = "";
> +		timestamp_t tag_date = 0;

How about leaving these two uninitialized and introducing one extra
bool,
		int tag_info_valid = 0;

and then

>  
> -		if (!parse_tag(tag) && tag->tag)
> +		if (!parse_tag(tag) && tag->tag) {
>  			tag_tag = tag->tag;
> +			tag_date = tag->date;

			tag_info_valid = 1;

> +		}
>  
>  		/*
>  		 * TRANSLATORS: This is a line of
>  		 * ambiguous tag object output. E.g.:
>  		 *
> -		 *    "deadbeef tag Some Tag Message"
> +		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
>  		 *
>  		 * The second argument is the "tag" string from
>  		 * object.c.
>  		 */
> -		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
> +		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
> +			    show_date(tag_date, 0, DATE_MODE(SHORT)),
> +			    tag_tag);

Then this part can use tag_info_valid to conditionally use tag_date
and tag_tag:

		if (tag_info_valid)
			strbuf_addf(&desc, ... <hash,date,tag>);
		else
			strbuf_addf(&desc, _("%s tag [bad]"), hash);

without throwing a misleading "In 1970 this happened".
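
The shape of that suggestion, sketched outside of git so it can be
compiled on its own. Everything here is an assumption made for
illustration: describe_tag() is a hypothetical helper, the date is
formatted with gmtime() and strftime() rather than git's show_date(),
and the strings are not wrapped in _().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Format one line of ambiguous tag output. When the tag could not be
 * parsed, say so instead of printing a default-initialized date.
 */
static void describe_tag(char *out, size_t outsz, const char *hash,
			 int tag_info_valid, time_t tag_date,
			 const char *tag_name)
{
	char date[sizeof("YYYY-MM-DD")];
	struct tm *tm;

	assert(out && hash);
	if (!tag_info_valid) {
		snprintf(out, outsz, "%s tag [bad]", hash);
		return;
	}
	tm = gmtime(&tag_date);
	if (!tm || !strftime(date, sizeof(date), "%Y-%m-%d", tm)) {
		/* Treat an unformattable date like an unparsable tag. */
		snprintf(out, outsz, "%s tag [bad]", hash);
		return;
	}
	snprintf(out, outsz, "%s tag %s - %s", hash, date, tag_name);
}
```

With the flag unset, the date and name arguments are never read, so
they can stay uninitialized in the caller, as proposed.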

>  	} else if (type == OBJ_TREE) {
>  		/*
>  		 * TRANSLATORS: This is a line of ambiguous <type>

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v7 4/6] object-name: show date for ambiguous tag objects
  2022-01-13 22:46               ` Junio C Hamano
@ 2022-01-14 12:05                 ` Ævar Arnfjörð Bjarmason
  2022-01-14 19:04                   ` Junio C Hamano
  0 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-14 12:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon


On Thu, Jan 13 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>>  	} else if (type == OBJ_TAG) {
>>  		struct tag *tag = lookup_tag(ds->repo, oid);
>>  		const char *tag_tag = "";
>> +		timestamp_t tag_date = 0;
>
> How about leaving these two uninitialized and introduce one extra
> bool,
> 		int tag_info_valid = 0;
>
> and then
>
>>  
>> -		if (!parse_tag(tag) && tag->tag)
>> +		if (!parse_tag(tag) && tag->tag) {
>>  			tag_tag = tag->tag;
>> +			tag_date = tag->date;
>
> 			tag_info_valid = 1;
>
>> +		}
>>  
>>  		/*
>>  		 * TRANSLATORS: This is a line of
>>  		 * ambiguous tag object output. E.g.:
>>  		 *
>> -		 *    "deadbeef tag Some Tag Message"
>> +		 *    "deadbeef tag 2021-01-01 - Some Tag Message"
>>  		 *
>>  		 * The second argument is the "tag" string from
>>  		 * object.c.
>>  		 */
>> -		strbuf_addf(&desc, _("%s tag %s"), hash, tag_tag);
>> +		strbuf_addf(&desc, _("%s tag %s - %s"), hash,
>> +			    show_date(tag_date, 0, DATE_MODE(SHORT)),
>> +			    tag_tag);
>
> Then this part can use tag_info_valid to conditionally use tag_date
> and tag_tag:
>
> 		if (tag_info_valid)
> 			strbuf_addf(&desc, ... <hash,date,tag>);
> 		else
> 			strbuf_addf(&desc, _("%s tag [bad]"), hash);
>
> without throwing a misleading "In 1970 this happened".

I still think the trade-off of not doing that discussed in the commit
message is better, i.e. (to quote upthread):
    
    We could detect that and emit a "%s [bad tag object]" message (to go
    with the existing generic "%s [bad object]"), but I don't think it's
    worth the effort. Users are unlikely to ever run into cases where
    they've got a broken object that's also ambiguous, and in case they do
    output that's a bit nonsensical beats wasting translator time on this
    obscure edge case.
    
    We should instead change parse_tag_buffer() to be more eager to emit
    an error() instead of silently aborting with "return -1;". In the case
    of "t3800-mktag.sh" it takes the "size < the_hash_algo->hexsz + 24"
    branch.

This really is so obscure that I don't think it warrants having N
translators re-translate this message users are very likely never to
see, ever.

And to the extent that they will see anything I've got some
planned/upcoming changes to make some of the underlying object machinery
emit better diagnostic messages on these bad objects, which would hint
in the general case about what's going wrong, instead of needing
ambiguous-object-display-specific messaging.
    
>>  	} else if (type == OBJ_TREE) {
>>  		/*
>>  		 * TRANSLATORS: This is a line of ambiguous <type>


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots
  2022-01-13 22:39               ` Junio C Hamano
@ 2022-01-14 12:07                 ` Ævar Arnfjörð Bjarmason
  2022-01-14 18:45                   ` Junio C Hamano
  0 siblings, 1 reply; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-14 12:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon


On Thu, Jan 13 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> +test_cmp_failed_rev_parse () {
>> +	cat >expect &&
>> +	test_must_fail git -C "$1" rev-parse "$2" 2>actual.raw &&
>> +	sed "s/\($2\)[0-9a-f]*/\1.../" <actual.raw >actual &&
>> +	test_cmp expect actual
>> +}
>
> That's dense, especially without a comment (or named variable) that
> hints readers what the arguments to this helper (and its standard
> input) ought to be.

I got rid of the named variables from v6 in response to a "shift" that
shifted the wrong number, but perhaps I should have just removed the
"shift"?

> As long as messages from rev-parse on the error stream never has
> more than one abbreviated object name on a single line, the above
> should give us a copy of the message with expected object name
> abbreviated to $2; otherwise we might be missing a /g in the sed
> script.

In v6 you rightly commented that the /g that was there previously was
not needed :)

So I dropped it, in this case we can rely on only getting the
abbreviated output.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots
  2022-01-14 12:07                 ` Ævar Arnfjörð Bjarmason
@ 2022-01-14 18:45                   ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2022-01-14 18:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, Jan 13 2022, Junio C Hamano wrote:
>
>> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>>
>>> +test_cmp_failed_rev_parse () {
>>> +	cat >expect &&
>>> +	test_must_fail git -C "$1" rev-parse "$2" 2>actual.raw &&
>>> +	sed "s/\($2\)[0-9a-f]*/\1.../" <actual.raw >actual &&
>>> +	test_cmp expect actual
>>> +}
>>
>> That's dense, especially without a comment (or named variable) that
>> hints readers what the arguments to this helper (and its standard
>> input) ought to be.
>
> I got rid of the named variables from v6 in response to a "shift" that
> shifted the wrong number, but perhaps I should have just removed the
> "shift"?

I agree that is a more sensible thing you could have done.

>> As long as messages from rev-parse on the error stream never has
>> more than one abbreviated object name on a single line, the above
>> should give us a copy of the message with expected object name
>> abbreviated to $2; otherwise we might be missing a /g in the sed
>> script.
>
> In the v6 you rightly commented on the /g that was there previously not
> being needed :)
>
> So I dropped it, in this case we can rely on only getting the
> abbreviated output.

I do not care either way, as long as it is clearly stated why /g is
there (or why /g is missing) for the future developers.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v7 4/6] object-name: show date for ambiguous tag objects
  2022-01-14 12:05                 ` Ævar Arnfjörð Bjarmason
@ 2022-01-14 19:04                   ` Junio C Hamano
  2022-01-14 19:35                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2022-01-14 19:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I still think the trade-off of not doing that discussed in the commit
> message is better, i.e. (to quote upthread):
>     
>     We could detect that and emit a "%s [bad tag object]" message (to go
>     with the existing generic "%s [bad object]"), but I don't think it's
>     worth the effort. Users are unlikely to ever run into cases where
>     they've got a broken object that's also ambiguous, and in case they do
>     output that's a bit nonsensical beats wasting translator time on this
>     obscure edge case.

Writing the above (and quoting it again to make me respond to it)
has already wasted a lot more time than a better solution that does
not lead to a misleading output, especially given that it was given
for free to you already.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH v7 4/6] object-name: show date for ambiguous tag objects
  2022-01-14 19:04                   ` Junio C Hamano
@ 2022-01-14 19:35                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 81+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-14 19:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Bagas Sanjaya, Josh Steadmon


On Fri, Jan 14 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> I still think the trade-off of not doing that discussed in the commit
>> message is better, i.e. (to quote upthread):
>>     
>>     We could detect that and emit a "%s [bad tag object]" message (to go
>>     with the existing generic "%s [bad object]"), but I don't think it's
>>     worth the effort. Users are unlikely to ever run into cases where
>>     they've got a broken object that's also ambiguous, and in case they do
>>     output that's a bit nonsensical beats wasting translator time on this
>>     obscure edge case.
>
> Writing the above (and quoting it again to make me respond to it)
> have already wasted a lot more time than a better solution that does
> not lead to a misleading output, especially given that it was given
> for free to you already.

I don't mind changing it, but the reason I re-quoted it is because your
reply seemed to suggest that you had skimmed past that part before
making your original comment, not to merely repeat myself.

I.e. it's basically suggesting "how about?..." without addressing the "I
intentionally didn't do this, because..." argument in the commit
message.

But sure, I'll add a translatable message for this edge case in a
re-roll.

^ permalink raw reply	[flat|nested] 81+ messages in thread

end of thread, other threads:[~2022-01-14 19:37 UTC | newest]

Thread overview: 81+ messages
2021-10-04  1:42 [PATCH 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
2021-10-04  1:42 ` [PATCH 1/2] object-name tests: tighten up advise() output test Ævar Arnfjörð Bjarmason
2021-10-04  2:52   ` Eric Sunshine
2021-10-04  7:05   ` Jeff King
2021-10-04  1:42 ` [PATCH 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
2021-10-04  7:35   ` Jeff King
2021-10-04  8:26     ` Ævar Arnfjörð Bjarmason
2021-10-04  9:29       ` Jeff King
2021-10-04 11:16         ` Ævar Arnfjörð Bjarmason
2021-10-04 12:07           ` Jeff King
2021-10-04 14:27 ` [PATCH v2 0/2] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
2021-10-04 14:27   ` [PATCH v2 1/2] object.[ch]: mark object type names for translation Ævar Arnfjörð Bjarmason
2021-10-04 18:54     ` Eric Sunshine
2021-10-05  9:37     ` Bagas Sanjaya
2021-10-05 15:52       ` Ævar Arnfjörð Bjarmason
2021-10-06 19:05     ` Jeff King
2021-10-06 19:46       ` Junio C Hamano
2021-10-06 20:38         ` Jeff King
2021-10-07 18:06           ` Junio C Hamano
2021-10-04 14:27   ` [PATCH v2 2/2] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
2021-10-06 19:11     ` Jeff King
2021-10-08 19:34   ` [PATCH v3 0/3] i18n: improve translatability of ambiguous object output Ævar Arnfjörð Bjarmason
2021-10-08 19:34     ` [PATCH v3 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
2021-10-08 19:34     ` [PATCH v3 2/3] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
2021-10-08 19:34     ` [PATCH v3 3/3] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
2021-11-22 17:53     ` [PATCH v2 0/3] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
2021-11-22 17:53       ` [PATCH v4 1/3] object-name: remove unreachable "unknown type" handling Ævar Arnfjörð Bjarmason
2021-11-22 22:37         ` Jeff King
2021-11-22 17:53       ` [PATCH v4 2/3] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
2021-11-22 17:53       ` [PATCH v4 3/3] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
2021-11-25 22:03       ` [PATCH v5 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
2021-11-25 22:03         ` [PATCH v5 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
2021-12-23 21:51           ` Josh Steadmon
2021-11-25 22:03         ` [PATCH v5 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
2021-12-23 21:51           ` Josh Steadmon
2021-12-23 22:42           ` Junio C Hamano
2021-11-25 22:03         ` [PATCH v5 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
2021-12-23 21:54           ` [PATCH] fixup! " Josh Steadmon
2021-12-23 22:48             ` Junio C Hamano
2021-11-25 22:03         ` [PATCH v5 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
2021-11-25 22:03         ` [PATCH v5 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
2021-11-25 22:03         ` [PATCH v5 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
2021-12-28 14:34         ` [PATCH v6 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
2021-12-28 14:34           ` [PATCH v6 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
2021-12-30 23:36             ` Junio C Hamano
2021-12-28 14:34           ` [PATCH v6 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
2021-12-28 14:34           ` [PATCH v6 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
2021-12-30 23:46             ` Junio C Hamano
2021-12-28 14:35           ` [PATCH v6 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
2021-12-30 21:43             ` Junio C Hamano
2021-12-28 14:35           ` [PATCH v6 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
2021-12-28 14:35           ` [PATCH v6 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason
2021-12-28 15:18           ` [PATCH v8 0/7] progress: test fixes / cleanup Ævar Arnfjörð Bjarmason
2021-12-28 15:18             ` [PATCH v8 1/7] leak tests: fix a memory leak in "test-progress" helper Ævar Arnfjörð Bjarmason
2021-12-28 15:18             ` [PATCH v8 2/7] progress.c test helper: add missing braces Ævar Arnfjörð Bjarmason
2021-12-28 15:18             ` [PATCH v8 3/7] progress.c tests: make start/stop commands on stdin Ævar Arnfjörð Bjarmason
2021-12-28 16:25               ` Johannes Altmanninger
2021-12-28 15:19             ` [PATCH v8 4/7] progress.c tests: test some invalid usage Ævar Arnfjörð Bjarmason
2021-12-28 16:33               ` Johannes Altmanninger
2021-12-28 15:19             ` [PATCH v8 5/7] progress.c: add temporary variable from progress struct Ævar Arnfjörð Bjarmason
2021-12-28 16:05               ` René Scharfe
2021-12-28 16:13               ` Johannes Altmanninger
2021-12-28 15:19             ` [PATCH v8 6/7] pack-bitmap-write.c: don't return without stop_progress() Ævar Arnfjörð Bjarmason
2021-12-28 15:19             ` [PATCH v8 7/7] *.c: use isatty(0|2), not isatty(STDIN_FILENO|STDERR_FILENO) Ævar Arnfjörð Bjarmason
2021-12-28 16:47               ` René Scharfe
2021-12-28 23:56                 ` Ævar Arnfjörð Bjarmason
2022-01-08  0:45             ` [PATCH v8 0/7] progress: test fixes / cleanup Junio C Hamano
2022-01-12 12:39           ` [PATCH v7 0/6] object-name: make ambiguous object output translatable + show tag date Ævar Arnfjörð Bjarmason
2022-01-12 12:39             ` [PATCH v7 1/6] object-name tests: add tests for ambiguous object blind spots Ævar Arnfjörð Bjarmason
2022-01-13 22:39               ` Junio C Hamano
2022-01-14 12:07                 ` Ævar Arnfjörð Bjarmason
2022-01-14 18:45                   ` Junio C Hamano
2022-01-12 12:39             ` [PATCH v7 2/6] object-name: explicitly handle OBJ_BAD in show_ambiguous_object() Ævar Arnfjörð Bjarmason
2022-01-12 12:39             ` [PATCH v7 3/6] object-name: make ambiguous object output translatable Ævar Arnfjörð Bjarmason
2022-01-12 12:39             ` [PATCH v7 4/6] object-name: show date for ambiguous tag objects Ævar Arnfjörð Bjarmason
2022-01-13 22:46               ` Junio C Hamano
2022-01-14 12:05                 ` Ævar Arnfjörð Bjarmason
2022-01-14 19:04                   ` Junio C Hamano
2022-01-14 19:35                     ` Ævar Arnfjörð Bjarmason
2022-01-12 12:39             ` [PATCH v7 5/6] object-name: iterate ambiguous objects before showing header Ævar Arnfjörð Bjarmason
2022-01-12 12:39             ` [PATCH v7 6/6] object-name: re-use "struct strbuf" in show_ambiguous_object() Ævar Arnfjörð Bjarmason

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git
