git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>, git@vger.kernel.org
Subject: Re: pre-v2.34.0-rc0 regressions: 'git log' has a noisy iconv() warning
Date: Thu, 28 Oct 2021 13:30:25 -0400	[thread overview]
Message-ID: <YXreMYZrTCUkpHXz@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqmtmuwdh9.fsf@gitster.g>

On Wed, Oct 27, 2021 at 11:04:50AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > The firehose of warnings for "git log --encoding=nonsense" was known and
> > discussed in fd680bc558 (logmsg_reencode(): warn when iconv() fails,
> > 2021-08-27). It's ugly for sure, but I'm still OK with it for the
> > reasoning there: your next step is to fix the --encoding argument you
> > gave. Whether you saw one line of warning or many is not that important,
> > IMHO. Giving a single more sensible warning ("your encoding 'nonsense'
> > isn't valid") would be better, but I think it's hard to do without
> > creating other problems.
> >
> > But the most compelling argument against warning at all is the case you
> > gave earlier: that there may be historical garbage commits, and you
> > can't get rid of them, so being warned constantly that we're not going
> > to show or grep them correctly is just annoying. And that is true
> > whether the user sees one warning or a hundred.
> 
> Is it really a "firehose"?  I won't use the word for one warning
> message per commit in the output of "git log --encoding=nonsense".
> 
> If you are running "git log --oneline", it may indeed be annoying to
> double the number of lines shown, and indeed
> 
>     $ git log --oneline --encoding=US-ASCII -4 ab/doc-lint
>     warning: unable to reencode commit to 'US-ASCII'
>     414abf159f docs: fix linting issues due to incorrect relative section order
>     warning: unable to reencode commit to 'US-ASCII'
>     ea8b9271b1 doc lint: lint relative section order
>     warning: unable to reencode commit to 'US-ASCII'
>     cafd9828e8 doc lint: lint and fix missing "GIT" end sections
>     warning: unable to reencode commit to 'US-ASCII'
>     d2c9908076 doc lint: fix bugs in, simplify and improve lint script

It's a bit more than that. You get similar warnings for commits which we
--grep but don't show (and which _might_ have been shown if the encoding
conversion had been different). Try "git log --grep=foo --encoding=bar".

And of course the interleaved output you see above looks OK in a pager.
But if you're sending the output of log (or diff-tree, etc) elsewhere,
you're just going to get a stream of messages on stderr. That would be a
bit less egregious if the message mentioned the commit oid, so they
weren't strict duplicates.

> is indeed annoying, as everything that is _shown_ ought to be
> presentable in US-ASCII.  This observation makes us realize an
> obvious approach to improve over the current behaviour without
> losing the warning when it matters, but I think the required code
> change, to first split the commit message into pieces (which roughly
> corresponds to the atoms in the --format= placeholder language) and
> reencode only these pieces that will be shown, may be too involved
> to be worth the effort.

Yeah, I think that would complicate things significantly, with the way
the code is currently structured. It also means parsing commits that are
in arbitrary encodings, which is not possible in most general sense.
E.g., imagine an encoding which doesn't have ASCII as subset, like
UTF-16.  Though I suspect such encodings probably do not work for
commits anyway (there is a chicken-and-egg with reading the encoding
header in the first place).

> > So while I do hate to have Git just silently ignore errors, probably the
> > original behavior is the least-bad thing, and we should just revert
> > fd680bc558 (logmsg_reencode(): warn when iconv() fails, 2021-08-27). We
> > probably want to salvage the documentation change (minus the "along with
> > a warning") part.
> 
> I am all for making it convenient to squelch, but it would be sad to
> lose the convenient way to notice possible misencoding in recent
> commits.  Or can we have a command line option and pass it through
> the callchain, or would that be too involved?

Do you mean a command-line option to squelch the warnings? I think it
would not be too hard to do it as a config option (which is probably
what you'd want anyway, since historical commits would come up over and
over again).

-Peff

  reply	other threads:[~2021-10-28 17:30 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-26  3:48 What's cooking in git.git (Oct 2021, #06; Mon, 25) Junio C Hamano
2021-10-26  5:25 ` Jeff King
2021-10-27 17:42   ` Junio C Hamano
2021-10-31 18:36   ` Kaartic Sivaraam
2021-11-01  4:04     ` Jeff King
2021-10-26 11:02 ` pre-v2.34.0-rc0 regressions: t7900-maintenance.sh broken due to 'systemd-analyze' (was: What's cooking in git.git (Oct 2021, #06; Mon, 25)) Ævar Arnfjörð Bjarmason
2021-10-26 15:34   ` Ævar Arnfjörð Bjarmason
2021-10-26 18:43     ` Eric Sunshine
2021-11-02 14:24   ` [PATCH] maintenance tests: fix systemd v2.34.0-rc* test regression Ævar Arnfjörð Bjarmason
2021-11-03  5:09     ` Eric Sunshine
2021-11-10  3:52     ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2021-11-10 13:36       ` Johannes Schindelin
2021-11-10 16:22         ` Ævar Arnfjörð Bjarmason
2021-11-11 17:45           ` Junio C Hamano
2021-10-26 11:15 ` pre-v2.34.0-rc0 regressions: 'git log' has a noisy iconv() warning (was: What's cooking in git.git (Oct 2021, #06; Mon, 25)) Ævar Arnfjörð Bjarmason
2021-10-27 11:14   ` Jeff King
2021-10-27 18:04     ` pre-v2.34.0-rc0 regressions: 'git log' has a noisy iconv() warning Junio C Hamano
2021-10-28 17:30       ` Jeff King [this message]
2021-10-28 19:02         ` Junio C Hamano
2021-10-28 19:17           ` Jeff King
2021-10-26 12:13 ` tb/plug-pack-bitmap-leaks (was: What's cooking in git.git (Oct 2021, #06; Mon, 25)) Ævar Arnfjörð Bjarmason
2021-10-26 21:04   ` Taylor Blau
2021-10-26 12:17 ` jc/branch-copy-doc " Ævar Arnfjörð Bjarmason
2021-10-26 12:42 ` What's cooking in git.git (Oct 2021, #06; Mon, 25) Derrick Stolee
2021-10-26 14:55   ` Ævar Arnfjörð Bjarmason
2021-10-26 17:27     ` Victoria Dye
2021-10-26 21:29       ` Junio C Hamano
2021-10-26 21:28   ` Junio C Hamano
2021-10-26 21:54     ` Derrick Stolee
2021-10-26 22:21 ` regression in ns/tmp-objdir and ns/batched-fsync Neeraj Singh
2021-10-27 19:17 ` What's cooking in git.git (Oct 2021, #06; Mon, 25) Martin Ågren
2021-10-28  0:06   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YXreMYZrTCUkpHXz@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).