git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Michael J Gruber <git@grubix.eu>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Lars Schneider <larsxschneider@gmail.com>,
	Jiang Xin <worldhello.net@gmail.com>
Subject: Re: [PATCH] tests: remove the GETTEXT_POISON compile-time option to test i18n marking
Date: Wed, 26 Apr 2017 16:27:35 +0200	[thread overview]
Message-ID: <CACBZZX4opSbALKGQXcXsp2=698rxbx8+1GBSJ6H0TvRriaNk9A@mail.gmail.com> (raw)
In-Reply-To: <4bd88c83-1892-8fc2-981f-f31cfd1b4c87@grubix.eu>

On Wed, Apr 26, 2017 at 11:17 AM, Michael J Gruber <git@grubix.eu> wrote:
> [Turns out I still can't operate gmail's web interface. Sorry for the dupe.]
>
> 2017-04-24 13:04 GMT+02:00 Ævar Arnfjörð Bjarmason <avarab@gmail.com>:
>> Remove the GETTEXT_POISON=YesPlease compile-time which turns all of
>> git's LC_*=C output into strings like "# GETTEXT POISON #" instead of
>> gettext(msgid).
>>
>> See commit bb946bba76 ("i18n: add GETTEXT_POISON to simulate
>> unfriendly translator", 2011-02-22) for what this was originally
>> intended for.
>>
>> This facility has been broken for quite a while and has been subjected
>> to frequent bitrot. The initial idea behind it back when it was added
>> in 2011 was to prevent the accidental translation of plumbing
>> messages.
>>
>> This isn't a big concern anymore as git isn't mass-adding i18n
>> messages for a newly developed i18n facility as it was back then,
>> maintaining this facility incurs a burden, and in actuality this has
>> often been broken long enough for potential plumbing messages to be
>> translated & make their way into major releases anyway.
>>
>> Most of this patch consists of search/replacing the test suite for:
>>
>>     test_i18ngrep ! -> ! grep
>>     test_i18ngrep   -> grep
>>     test_i18ncmp    -> test_cmp
>>
>> 1. <AANLkTi=5MrU-JyeQ3UVNbVwzn-8FbstUXafgcQaLWXDB@mail.gmail.com>
>>    (https://public-inbox.org/git/AANLkTi=5MrU-JyeQ3UVNbVwzn-8FbstUXafgcQaLWXDB@mail.gmail.com/)
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>
>> On Mon, Apr 24, 2017 at 3:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Michael J Gruber <git@grubix.eu> writes:
>>>
>>>> Ævar Arnfjörð Bjarmason venit, vidit, dixit 20.04.2017 23:58:
>>>>> As a refresh of everyone's memory (because mine needed it). This is a
>>>>> feature I added back in 2011 when the i18n support was initially
>>>>> added.
>>>>>
>>>>> There was concern at the time that we would inadvertently mark
>>>>> plumbing messages for translation, particularly something in a shared
>>>>> code path, and this was a way to hopefully smoke out those issues with
>>>>> the test suite.
>>>>>
>>>>> However compiling with it breaks a couple of dozen tests, I stopped
>>>>> digging when I saw some broke back in 2014.
>>>>>
>>>>> What should be done about this? I think if we're going to keep them
>>>>> they need to be run regularly by something like Travis (Lars CC'd),
>>>>> however empirical evidence suggests that not running them is just fine
>>>>> too, so should we just remove support for this test mode?
>>>>>
>>>>> I don't care, but I can come up with the patch either way, but would
>>>>> only be motivated to write the one-time fix for it if some CI system
>>>>> is actually running them regularly, otherwise they'll just be subject
>>>>> to bitrotting again.
>>>>
>>>> I use that switch when I change something that involves l10n, but
>>>> usually I run specific tests only. To be honest: I have to make sure not
>>>> to get confused by (nor forget one of) the build flag GETTEXT_POISON and
>>>> the environment variable GIT_GETTEXT_POISON. I'm not sure I always
>>>> tested what I meant to test...
>>>
>>> To be quite honest, I have always felt that we are just as likely
>>> inadvertently use test_i18ncmp when we should use test_cmp (and vice
>>> versa) as we would mark plumbing messages with _() by mistake with
>>> this approach, and even with constant monitoring by something like
>>> Travis, GETTEXT_POISON may be able to catch mistakes only some of
>>> the time (i.e. when we do not make mistakes in writing our tests).
>>> Without constant monitoring, I agree that the mechanism does not
>>> work well to catch our mistakes.
>>
>> Here's an alternate patch to just remove it entirely. I think we
>> should apply this instead, the only reason I sent the patch to fix it
>> up was because of Michael's comment that he was occasionally using it.
>
> Yes, I think test_i18ngrep and test_i18ncmp gave the impression that
> they are i18n-aware grep and cmp, whereas in fact they turned off
> these tese test lines completely.
> Combined with the fact that GETTEXT_POSON builds turned on
> GIT_GETTEXT_POISON, this sounds somewhat dangerous - we test more
> aspects of plumbing commands but turn off (some) tests for porcelain.

It gives us no less test coverage because we still run the tests
without GETTEXT_POSON. Thus running first without that & then with
gives 100% coverage.

The entire point of the GETTEXT_POSON mode is that we've already
manually gone over things that broke in the past, and are skipping
them with C_LOCALE_OUTPUT or one of these i18n functions, and when
adding new translations we keep an eye out for new breakages that
haven't been marked yet.

These are then a candidate either for marking to skip, if they're
porcelain commands using our new gettext()-utilizing code, or we've
made a mistake and translated some plumbing, in which case that would
hopefully be caught by list review or Travis.

I still think it's better to just get rid of it, but this aspect of it
is working exactly as intended.

> I'm still wondering wether we couldn't generate a test locale
> automatically by mangling the english strings (but preserving format
> specifiers). That way, tests that test porcelain output could require
> LANG=C while others could run in the mangled locale. (tt_TT is taken,
> though ;) )

If we wanted to keep GETTEXT_POSON this would be going further in the
wrong direction. The reason the message starts with a "#", has no
trailing \n & most importantly ignores any format specifiers the real
translation would have is exactly because we'd like as many things as
can break break under this mode.

If anything we should consider changing "# GETTEXT_POISON #" to
"<random binary garbage>" if we keep it.

Including format specifiers would just reduce the set of things that
would break, consider e.g. a test that merely greps for some substring
that would be emitted by a "%s branch %d commits ahead" format, now we
change that to "%s GIBBER %d ISH", but the test still passes because
it just does $(awk '{print $1}'), but it might be a plumbing message
that could be translated as e.g. "%d WORD1 %s WORD2 WORD3", breaking
the API.

  reply	other threads:[~2017-04-26 14:28 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-20 21:58 [BUG] test suite broken with GETTEXT_POISON=YesPlease Ævar Arnfjörð Bjarmason
2017-04-21 14:47 ` Michael J Gruber
2017-04-21 17:28   ` Ævar Arnfjörð Bjarmason
2017-04-24  1:18   ` Junio C Hamano
     [not found]     ` <20170424110434.27689-1-avarab@gmail.com>
2017-04-26  9:17       ` [PATCH] tests: remove the GETTEXT_POISON compile-time option to test i18n marking Michael J Gruber
2017-04-26 14:27         ` Ævar Arnfjörð Bjarmason [this message]
2017-04-21 14:54 ` [BUG] test suite broken with GETTEXT_POISON=YesPlease Lars Schneider
2017-04-21 18:05   ` Ævar Arnfjörð Bjarmason
2017-04-21 18:57   ` [PATCH] tests: fix tests broken under GETTEXT_POISON=YesPlease Ævar Arnfjörð Bjarmason
2017-04-24  1:15     ` Junio C Hamano
2017-04-24  8:18       ` Ævar Arnfjörð Bjarmason
2017-04-25  4:11         ` Junio C Hamano
2017-04-25  8:56           ` Ævar Arnfjörð Bjarmason
2017-04-26  1:58             ` Junio C Hamano
2017-04-26  7:56               ` Ævar Arnfjörð Bjarmason
2017-04-24 13:22   ` [BUG] test suite broken with GETTEXT_POISON=YesPlease Duy Nguyen
2017-04-24 20:37     ` Jeff King
2017-04-25  8:51       ` Lars Schneider
2017-04-25 10:01         ` Jeff King
2017-04-28  8:42           ` Lars Schneider
2017-04-28  8:44             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACBZZX4opSbALKGQXcXsp2=698rxbx8+1GBSJ6H0TvRriaNk9A@mail.gmail.com' \
    --to=avarab@gmail.com \
    --cc=git@grubix.eu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=worldhello.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).