git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org, "SZEDER Gábor" <szeder.dev@gmail.com>,
	"Vasco Almeida" <vascomalmeida@sapo.pt>,
	"Jiang Xin" <worldhello.net@gmail.com>
Subject: Re: [PATCH] Poison gettext with the Ook language
Date: Mon, 22 Oct 2018 23:46:05 +0200	[thread overview]
Message-ID: <875zxtd59e.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <878t2pd6yu.fsf@evledraar.gmail.com>


On Mon, Oct 22 2018, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Oct 22 2018, Nguyễn Thái Ngọc Duy wrote:
>
>> The current gettext() function just replaces all strings with
>> '# GETTEXT POISON #' including format strings and hides the things
>> that we should be allowed to grep (like branch names, or some other
>> codes) even when gettext is poisoned.
>>
>> This patch implements the poisoned _() with a universal and totally
>> legit language called Ook [1]. We could actually grep stuff even in
>> with this because format strings are preserved.
>>
>> Long term, we could implement an "ook translator" for test_i18ngrep
>> and friends so that they translate English to Ook, allowing us to
>> match full text while making sure the text in the code is still marked
>> for translation.
>
> Replying to both this patch, and SZEDER's series for brevity. Thanks
> both for working on this. It's obvious that this stuff needs some
> polishing, and it's great that's being done, and sorry for my part of
> having left this in the current state.
>
> a)
>
> Both this patch and SZEDER's 7/8 are addressing the issue that the
> poison mode doesn't preserve format specifiers.
>
> I haven't tried to dig this up in the list archive, and maybe it only
> exists in my mind, but I seem to remember that this was explicitly
> discussed as a goal at the time.
>
> I.e. we were expecting the lack of this to break tests, and that was
> considered a feature as it would spot plumbing messages we shouldn't be
> translating.
>
> However, it's my opinion that this was just a bad idea to begin with,
> I've CC'd a couple of prolific markers of i18n from my log grepping (and
> feel free to add more) to see if they disagre.
>
> I think it probably helped me a bit in the beginning when i18n was
> bootstrapping and there was a *lot* to mark for translation, but we've
> long since stabilized and aren't doing that anymore, so it's much easier
> to just review the patches to see if they translate plumbing.
>
> All of which is to say that I think something like your patch here is
> fine to just accept, and the only improvement I'd suggest is some note
> in the commit message saying that this was always intentional, but
> nobody can name a case where it helped, so let's just drop it.
>
> In SZEDER's case that we shouldn't have GIT_GETTEXT_POISON=scrambled,
> let's just make it boolean and make "scrambled" (i.e. preserving format
> specifiecs) the default.

[...]

> b)
>
> SZEDER's series, and your patch (although it's smaller) still preserve
> the notion that there's such a thing as a GETTEXT_POISON *build* and you
> actually need to compile git with that flag to make use of it.
>
> This, as I recall, was just paranoia on my part back in 2010/2011. I
> wanted to be able to present a version of this i18n stuff where people
> who didn't like it could be assured that if they didn't opt-in to it
> they wouldn't get any of the code (sans a few trivial and obviously
> correct wrapper functions).
>
> So I think the only reason to keep it compile-time is performance, but I
> don't think that matters. It's not like we're printing gigabytes of _()
> formatted output. Everything where formatting matters is plumbing which
> doesn't use this API. These messages are always for human consumption.
>
> So shouldn't we just drop this notion of GETTEXT_POISON at compile-time
> entirely, and make GIT_GETTEXT_POISON=<boolean> a test flag that all
> builds of git are aware of? I think so.

Something like this, untested except I compiled it and ran one test
with/without GIT_GETTEXT_POISON, so it may have bugs:

 Makefile                | 10 +---------
 gettext.c               |  4 +---
 gettext.h               |  4 ----
 po/README               | 13 ++++---------
 t/README                |  5 +++++
 t/helper/test-tool.c    |  1 +
 t/helper/test-tool.h    |  1 +
 t/test-lib-functions.sh |  8 ++++----
 t/test-lib.sh           | 12 +++---------
 9 files changed, 20 insertions(+), 38 deletions(-)

diff --git a/Makefile b/Makefile
index d18ab0fe78..78190ee9d9 100644
--- a/Makefile
+++ b/Makefile
@@ -362,11 +362,6 @@ all::
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
 # user.
 #
-# Define GETTEXT_POISON if you are debugging the choice of strings marked
-# for translation.  In a GETTEXT_POISON build, you can turn all strings marked
-# for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
-#
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
 #
@@ -714,6 +709,7 @@ TEST_BUILTINS_OBJS += test-dump-split-index.o
 TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-genrandom.o
+TEST_BUILTINS_OBJS += test-gettext-poison.o
 TEST_BUILTINS_OBJS += test-hashmap.o
 TEST_BUILTINS_OBJS += test-index-version.o
 TEST_BUILTINS_OBJS += test-json-writer.o
@@ -1439,9 +1435,6 @@ endif
 ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
-ifdef GETTEXT_POISON
-	BASIC_CFLAGS += -DGETTEXT_POISON
-endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT
 	USE_GETTEXT_SCHEME ?= fallthrough
@@ -2591,7 +2584,6 @@ ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
 	@echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@+
 endif
 	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
-	@echo GETTEXT_POISON=\''$(subst ','\'',$(subst ','\'',$(GETTEXT_POISON)))'\' >>$@+
 ifdef GIT_PERF_REPEAT_COUNT
 	@echo GIT_PERF_REPEAT_COUNT=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_REPEAT_COUNT)))'\' >>$@+
 endif
diff --git a/gettext.c b/gettext.c
index 7272771c8e..d71e394284 100644
--- a/gettext.c
+++ b/gettext.c
@@ -46,15 +46,13 @@ const char *get_preferred_languages(void)
 	return NULL;
 }

-#ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
 	static int poison_requested = -1;
 	if (poison_requested == -1)
-		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+		poison_requested = git_env_bool("GIT_GETTEXT_POISON", 0);
 	return poison_requested;
 }
-#endif

 #ifndef NO_GETTEXT
 static int test_vsnprintf(const char *fmt, ...)
diff --git a/gettext.h b/gettext.h
index 7eee64a34f..4c492d9f57 100644
--- a/gettext.h
+++ b/gettext.h
@@ -41,11 +41,7 @@ static inline int gettext_width(const char *s)
 }
 #endif

-#ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
-#endif

 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
diff --git a/po/README b/po/README
index fef4c0f0b5..9dad277e33 100644
--- a/po/README
+++ b/po/README
@@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
 version of the strings, e.g. to grep some error message or other
 output.

-To smoke out issues like these Git can be compiled with gettext poison
-support, at the top-level:
+To smoke out issues like these Git tested with a translation mode that
+emits gibberish on every call to gettext. To use it run the test suite
+with it, e.g.:

-    make GETTEXT_POISON=YesPlease
-
-That'll give you a git which emits gibberish on every call to
-gettext. It's obviously not meant to be installed, but you should run
-the test suite with it:
-
-    cd t && prove -j 9 ./t[0-9]*.sh
+    cd t && GIT_GETTEXT_POISON=true prove -j 9 ./t[0-9]*.sh

 If tests break with it you should inspect them manually and see if
 what you're translating is sane, i.e. that you're not translating
diff --git a/t/README b/t/README
index 8847489640..b3b2fa0b11 100644
--- a/t/README
+++ b/t/README
@@ -301,6 +301,11 @@ that cannot be easily covered by a few specific test cases. These
 could be enabled by running the test suite with correct GIT_TEST_
 environment set.

+GIT_GETTEXT_POISON=<boolean> turns all strings marked for translation
+into gibberish. Used for spotting those tests that need to be marked
+with a C_LOCALE_OUTPUT prerequisite when adding more strings for
+translation. See "Testing marked strings" in po/README for details.
+
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.

diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 6b5836dc1b..3e75672a37 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -19,6 +19,7 @@ static struct test_cmd cmds[] = {
 	{ "dump-untracked-cache", cmd__dump_untracked_cache },
 	{ "example-decorate", cmd__example_decorate },
 	{ "genrandom", cmd__genrandom },
+	{ "gettext-poison", cmd__gettext_poison },
 	{ "hashmap", cmd__hashmap },
 	{ "index-version", cmd__index_version },
 	{ "json-writer", cmd__json_writer },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index e4890566da..04f033b7fc 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -15,6 +15,7 @@ int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
+int cmd__gettext_poison(int argc, const char **argv);
 int cmd__hashmap(int argc, const char **argv);
 int cmd__index_version(int argc, const char **argv);
 int cmd__json_writer(int argc, const char **argv);
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78d8c3783b..4e8683addd 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -755,16 +755,16 @@ test_cmp_bin() {

 # Use this instead of test_cmp to compare files that contain expected and
 # actual output from git commands that can be translated.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ncmp () {
-	test -n "$GETTEXT_POISON" || test_cmp "$@"
+	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
 }

 # Use this instead of "grep expected-string actual" to see if the
 # output from a git command that can be translated either contains an
 # expected string, or does not contain an unwanted one.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ngrep () {
 	eval "last_arg=\${$#}"
@@ -779,7 +779,7 @@ test_i18ngrep () {
 		error "bug in the test script: too few parameters to test_i18ngrep"
 	fi

-	if test -n "$GETTEXT_POISON"
+	if ! test_have_prereq C_LOCALE_OUTPUT
 	then
 		# pretend success
 		return 0
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 897e6fcc94..f82d2149a2 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -105,6 +105,7 @@ unset VISUAL EMAIL LANGUAGE COLUMNS $("$PERL_PATH" -e '
 		TRACE
 		DEBUG
 		TEST
+		GETTEXT_POISON
 		.*_TEST
 		PROVE
 		VALGRIND
@@ -1104,15 +1105,8 @@ test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT

-# Can we rely on git's output in the C locale?
-if test -n "$GETTEXT_POISON"
-then
-	GIT_GETTEXT_POISON=YesPlease
-	export GIT_GETTEXT_POISON
-	test_set_prereq GETTEXT_POISON
-else
-	test_set_prereq C_LOCALE_OUTPUT
-fi
+test_lazy_prereq GETTEXT_POISON 'test-tool gettext-poison'
+test_lazy_prereq C_LOCALE_OUTPUT '! test-tool gettext-poison'

 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then

  reply	other threads:[~2018-10-22 21:46 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-22 15:36 [PATCH] Poison gettext with the Ook language Nguyễn Thái Ngọc Duy
2018-10-22 20:22 ` SZEDER Gábor
2018-10-22 20:22   ` [PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment SZEDER Gábor
2018-10-22 20:22   ` [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty SZEDER Gábor
2018-10-22 20:38     ` Ævar Arnfjörð Bjarmason
2018-10-22 20:22   ` [PATCH 3/8] lib-rebase: loosen GETTEXT_POISON check in fake editor SZEDER Gábor
2018-10-22 20:22   ` [PATCH 4/8] gettext: #ifdef away GETTEXT POISON-related code from _() and Q_() SZEDER Gábor
2018-10-22 20:22   ` [PATCH 5/8] gettext: put "# GETTEXT POISON #" string literal into a macro SZEDER Gábor
2018-10-22 20:22   ` [PATCH 6/8] gettext: use an enum for the mode of GETTEXT POISONing SZEDER Gábor
2018-10-22 20:22   ` [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled SZEDER Gábor
2018-10-23 14:44     ` Duy Nguyen
2018-10-22 20:22   ` [PATCH 8/8] travis-ci: run GETTEXT POISON build job in scrambled mode, too SZEDER Gábor
2018-10-23 14:37   ` [PATCH] Poison gettext with the Ook language Duy Nguyen
2018-10-27 10:11   ` Jakub Narebski
2018-10-22 20:52 ` Johannes Schindelin
2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
2018-10-22 21:46   ` Ævar Arnfjörð Bjarmason [this message]
2018-10-22 23:04   ` Junio C Hamano
2018-10-23 21:01     ` [PATCH] i18n: make GETTEXT_POISON a runtime option Ævar Arnfjörð Bjarmason
2018-10-24  5:45       ` Junio C Hamano
2018-10-24  7:44         ` Jeff King
2018-10-25  1:00           ` Junio C Hamano
2018-10-25  1:09             ` Jeff King
2018-10-25  1:24               ` Ramsay Jones
2018-10-25 21:23                 ` Jeff King
2018-10-26 19:20                   ` Ævar Arnfjörð Bjarmason
2018-10-27  6:59                     ` Jeff King
2018-10-27 10:42                       ` Ævar Arnfjörð Bjarmason
2018-10-24 11:47         ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
2018-11-02  3:11             ` Junio C Hamano
2018-11-02 16:37             ` SZEDER Gábor
2018-11-08 20:26               ` Ævar Arnfjörð Bjarmason
2018-11-08 20:51                 ` SZEDER Gábor
2018-11-08  3:24             ` Junio C Hamano
2018-11-08 19:12               ` Eric Sunshine
2018-11-08 21:15             ` [PATCH v4 0/2] " Ævar Arnfjörð Bjarmason
2018-11-08 21:15             ` [PATCH v4 1/2] " Ævar Arnfjörð Bjarmason
2018-11-08 21:15             ` [PATCH v4 2/2] Makefile: ease dynamic-gettext-poison transition Ævar Arnfjörð Bjarmason
2018-10-23  9:30   ` [PATCH] Poison gettext with the Ook language Johannes Schindelin
2018-10-23 10:17     ` Ævar Arnfjörð Bjarmason
2018-10-23 11:07       ` Johannes Schindelin
2018-10-23 15:00       ` Duy Nguyen
2018-10-23 16:45         ` Ævar Arnfjörð Bjarmason
2018-10-24 14:41           ` Duy Nguyen
2018-10-24 17:54             ` Ævar Arnfjörð Bjarmason
2018-10-25  3:52             ` Junio C Hamano
2018-10-25  6:20               ` Jeff King
2018-10-27  6:55                 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875zxtd59e.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=szeder.dev@gmail.com \
    --cc=vascomalmeida@sapo.pt \
    --cc=worldhello.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).