git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Krzysztof Żelechowski" <giecrilj@stegny.2a.pl>, git@vger.kernel.org
Subject: Re: git log --encoding=HTML is not supported
Date: Fri, 27 Aug 2021 14:30:15 -0400	[thread overview]
Message-ID: <YSkvNyR4uT52de13@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqeeahjxj4.fsf@gitster.g>

On Wed, Aug 25, 2021 at 09:31:59AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > We feed the encoding "HTML" to iconv_open(), which of course has no idea
> > what that is. It's unfortunate, though, that we don't even print a
> > warning, and instead just quietly leave the text intact. I wonder if we
> > should do something like:
> [...]
> This addition sounds quite sensible to me.
> 
> "git log --encoding=bogus" would issue this warning for each and
> every commit and that may be a bit irritating, but being irritating
> may be a good characteristic for a warning message that is given to
> an easily correctable condition.
> 
> I originally thought that the warning would be lost to the pager,
> but apparently I forgot what I did eons ago at 61b80509 (sending
> errors to stdout under $PAGER, 2008-02-16) ;-).

Here it is polished into a real commit.

-- >8 --
Subject: [PATCH] logmsg_reencode(): warn when iconv() fails

If the user asks for a pretty-printed commit to be converted (either
explicitly with --encoding=foo, or implicitly because the commit is
non-utf8 and we want to convert it), we pass it through iconv(). If that
fails, we fall back to showing the input verbatim, but don't tell the
user that the output may be bogus.

Let's add a warning to do so, along with a mention in the documentation
for --encoding. Two things to note about the implementation:

  - we could produce the warning closer to the call to iconv() in
    reencode_string_len(), which would let us relay the value of errno.
    But this is not actually very helpful. reencode_string_len() does
    not know we are operating on a commit, and indeed does not know that
    the caller won't produce an error of its own. And the errno values
    from iconv() are seldom helpful (iconv_open() only ever produces
    EINVAL; perhaps EILSEQ from iconv() might be illuminating, but it
    can also return EINVAL for incomplete sequences).

  - if the reason for the failure is that the output charset is not
    supported, then the user will see this warning for every commit we
    try to display. That might be ugly and overwhelming, but on the
    other hand it is making it clear that every one of them has not been
    converted (and the likely outcome anyway is to re-try the command
    with a supported output encoding).

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/pretty-options.txt | 4 +++-
 pretty.c                         | 6 +++++-
 t/t4210-log-i18n.sh              | 7 +++++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/Documentation/pretty-options.txt b/Documentation/pretty-options.txt
index 27ddaf84a1..42b227bc40 100644
--- a/Documentation/pretty-options.txt
+++ b/Documentation/pretty-options.txt
@@ -40,7 +40,9 @@ people using 80-column terminals.
 	defaults to UTF-8. Note that if an object claims to be encoded
 	in `X` and we are outputting in `X`, we will output the object
 	verbatim; this means that invalid sequences in the original
-	commit may be copied to the output.
+	commit may be copied to the output. Likewise, if iconv(3) fails
+	to convert the commit, we will output the original object
+	verbatim, along with a warning.
 
 --expand-tabs=<n>::
 --expand-tabs::
diff --git a/pretty.c b/pretty.c
index 9631529c10..73b5ead509 100644
--- a/pretty.c
+++ b/pretty.c
@@ -671,7 +671,11 @@ const char *repo_logmsg_reencode(struct repository *r,
 	 * If the re-encoding failed, out might be NULL here; in that
 	 * case we just return the commit message verbatim.
 	 */
-	return out ? out : msg;
+	if (!out) {
+		warning("unable to reencode commit to '%s'", output_encoding);
+		return msg;
+	}
+	return out;
 }
 
 static int mailmap_name(const char **email, size_t *email_len,
diff --git a/t/t4210-log-i18n.sh b/t/t4210-log-i18n.sh
index d2dfcf164e..0141f36e33 100755
--- a/t/t4210-log-i18n.sh
+++ b/t/t4210-log-i18n.sh
@@ -131,4 +131,11 @@ do
 	fi
 done
 
+test_expect_success 'log shows warning when conversion fails' '
+	enc=this-encoding-does-not-exist &&
+	git log -1 --encoding=$enc 2>err &&
+	echo "warning: unable to reencode commit to ${SQ}${enc}${SQ}" >expect &&
+	test_cmp expect err
+'
+
 test_done
-- 
2.33.0.396.g72f622fe47


  reply	other threads:[~2021-08-27 18:31 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-24  9:00 git log --encoding=HTML is not supported Krzysztof Żelechowski
2021-08-24 10:31 ` Bagas Sanjaya
2021-08-24 10:33   ` Krzysztof Żelechowski
2021-08-24 10:46     ` Bagas Sanjaya
2021-08-24 19:11       ` Junio C Hamano
2021-08-25  0:57 ` Jeff King
2021-08-25 16:31   ` Junio C Hamano
2021-08-27 18:30     ` Jeff King [this message]
2021-08-27 18:32       ` Jeff King
2021-08-27 19:47         ` Junio C Hamano
2021-10-09  0:58       ` *Really* noisy encoding warnings post-v2.33.0 Ævar Arnfjörð Bjarmason
2021-10-09  1:29         ` Ævar Arnfjörð Bjarmason
2021-10-09  2:36         ` Jeff King
2021-10-09  2:42           ` Jeff King
2021-10-09 13:47             ` Ævar Arnfjörð Bjarmason
2021-10-27 11:03               ` Jeff King
2021-10-29 10:47                 ` Ævar Arnfjörð Bjarmason
2021-10-29 20:40                   ` Jeff King
2021-10-29 20:45                     ` Junio C Hamano
2021-10-29 20:52                       ` Junio C Hamano
2021-10-29 21:10                         ` Jeff King
2021-10-22 22:58             ` Ævar Arnfjörð Bjarmason
2021-10-10 13:53           ` Johannes Sixt
2021-10-10 15:43             ` Ævar Arnfjörð Bjarmason
2021-08-25 23:00   ` git log --encoding=HTML is not supported Krzysztof Żelechowski
2021-08-27 18:33     ` Jeff King
2021-08-25 23:28   ` Krzysztof Żelechowski
2021-08-25 23:47     ` Bryan Turner
2021-08-26 15:37       ` Junio C Hamano
2021-08-26 20:52         ` Krzysztof Żelechowski
2021-08-27 15:59           ` Junio C Hamano
2021-08-27 18:37             ` Jeff King
2021-08-27 21:51               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YSkvNyR4uT52de13@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=giecrilj@stegny.2a.pl \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).