git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] Poison gettext with the Ook language
@ 2018-10-22 15:36 Nguyễn Thái Ngọc Duy
  2018-10-22 20:22 ` SZEDER Gábor
                   ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-10-22 15:36 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð,
	Nguyễn Thái Ngọc Duy

The current gettext() function just replaces all strings with
'# GETTEXT POISON #' including format strings and hides the things
that we should be allowed to grep (like branch names, or some other
codes) even when gettext is poisoned.

This patch implements the poisoned _() with a universal and totally
legit language called Ook [1]. We could actually grep stuff even in
with this because format strings are preserved.

Long term, we could implement an "ook translator" for test_i18ngrep
and friends so that they translate English to Ook, allowing us to
match full text while making sure the text in the code is still marked
for translation.

[1] https://en.wikipedia.org/wiki/Unseen_University#Librarian

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 This started out as something fun to do while running the test suite
 last weekend. But it turns out actually working! If this patch ends
 up in git.git, the Librarian would be so proud!

 gettext.c       | 54 +++++++++++++++++++++++++++++++++++++++++++++++++
 gettext.h       |  7 ++++---
 t/lib-rebase.sh |  2 +-
 3 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/gettext.c b/gettext.c
index 7272771c8e..29901e1ddd 100644
--- a/gettext.c
+++ b/gettext.c
@@ -56,6 +56,60 @@ int use_gettext_poison(void)
 }
 #endif
 
+const char *gettext_poison(const char *msgid)
+{
+	/*
+	 * gettext() returns a string that is always valid. We would
+	 * need a hash map for that but let's stay simple and keep the
+	 * last 64 gettext() results. Should be more than enough.
+	 */
+	static char *bufs[64];
+	static int i;
+	struct strbuf sb = STRBUF_INIT;
+	char *buf;
+	const char *p;
+	const char *type_specifiers = "diouxXeEfFgGaAcsCSpnm%";
+
+	if (!strchr(msgid, '%'))
+		return "Eek!";
+
+	p = msgid;
+	while (*p) {
+		const char *type;
+		switch (*p) {
+		case '%':
+			/*
+			 * No strict parsing. We simply look for the end of a
+			 * format string
+			 */
+			type = p + 1;
+			while (*type && !strchr(type_specifiers, *type))
+				type++;
+			if (*type)
+				type++;
+			strbuf_add(&sb, p, (int)(type - p));
+			p = type;
+			break;
+		default:
+			if (!isalpha(*p)) {
+				strbuf_addch(&sb, *p);
+				p++;
+				break;
+			}
+			if (isupper(*p))
+				strbuf_addstr(&sb, "Ook");
+			else
+				strbuf_addstr(&sb, "ook");
+			while (isalpha(*p))
+				p++;
+		}
+	}
+	buf = bufs[(i++) % ARRAY_SIZE(bufs)];
+	free(buf);
+	buf = strbuf_detach(&sb, NULL);
+	return buf;
+}
+
 #ifndef NO_GETTEXT
 static int test_vsnprintf(const char *fmt, ...)
 {
diff --git a/gettext.h b/gettext.h
index 7eee64a34f..dc9851a06a 100644
--- a/gettext.h
+++ b/gettext.h
@@ -41,8 +41,9 @@ static inline int gettext_width(const char *s)
 }
 #endif
 
+const char *gettext_poison(const char *);
 #ifdef GETTEXT_POISON
-extern int use_gettext_poison(void);
+int use_gettext_poison(void);
 #else
 #define use_gettext_poison() 0
 #endif
@@ -51,14 +52,14 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
 	if (!*msgid)
 		return "";
-	return use_gettext_poison() ? "# GETTEXT POISON #" : gettext(msgid);
+	return use_gettext_poison() ? gettext_poison(msgid) : gettext(msgid);
 }
 
 static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
 	if (use_gettext_poison())
-		return "# GETTEXT POISON #";
+		return gettext_poison(msgid);
 	return ngettext(msgid, plu, n);
 }
 
diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh
index 2ca9fb69d6..1e8440e935 100644
--- a/t/lib-rebase.sh
+++ b/t/lib-rebase.sh
@@ -29,7 +29,7 @@ set_fake_editor () {
 	*/COMMIT_EDITMSG)
 		test -z "$EXPECT_HEADER_COUNT" ||
 			test "$EXPECT_HEADER_COUNT" = "$(sed -n '1s/^# This is a combination of \(.*\) commits\./\1/p' < "$1")" ||
-			test "# # GETTEXT POISON #" = "$(sed -n '1p' < "$1")" ||
+			test "$EXPECT_HEADER_COUNT" = "$(sed -n '1s/^# Ook ook ook ook ook \(.*\) ook\./\1/p' < "$1")" ||
 			exit
 		test -z "$FAKE_COMMIT_MESSAGE" || echo "$FAKE_COMMIT_MESSAGE" > "$1"
 		test -z "$FAKE_COMMIT_AMEND" || echo "$FAKE_COMMIT_AMEND" >> "$1"
-- 
2.19.1.647.g708186aaf9


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 15:36 [PATCH] Poison gettext with the Ook language Nguyễn Thái Ngọc Duy
@ 2018-10-22 20:22 ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment SZEDER Gábor
                     ` (9 more replies)
  2018-10-22 20:52 ` Johannes Schindelin
  2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
  2 siblings, 10 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git, Nguyễn Thái Ngọc Duy
  Cc: Ævar Arnfjörð, SZEDER Gábor

On Mon, Oct 22, 2018 at 05:36:33PM +0200, Nguyễn Thái Ngọc Duy wrote:
> The current gettext() function just replaces all strings with
> '# GETTEXT POISON #' including format strings and hides the things
> that we should be allowed to grep (like branch names, or some other
> codes) even when gettext is poisoned.
> 
> This patch implements the poisoned _() with a universal and totally
> legit language called Ook [1]. We could actually grep stuff even in
> with this because format strings are preserved.

Once upon a time a GETTEXT_POISON build job failed on me, and the
error message:

  error: # GETTEXT POISON #

was not particularly useful.  Ook wouldn't help with that...

So I came up with the following couple of patches that implement a
"scrambled" format that makes the poisoned output legible for humans
but still gibberish for machine consumption (i.e. grep-ing the text
part would still fail):

  error:  U.n.a.b.l.e. .t.o. .c.r.e.a.t.e. .'./home/szeder/src/git/t/trash directory.t1404-update-ref-errors/.git/packed-refs...l.o.c.k.'.:. .File exists...

I have been running GETTEXT_POISON builds with this series for some
months now, but haven't submitted it yet, because I haven't decided
yet whether including strings (paths, refs, etc.) in the output as
they are is a feature or a flaw.  And because it embarrassingly leaks
every single translated string... :)


SZEDER Gábor (8):
  test-lib.sh: preserve GIT_GETTEXT_POISON from the environment
  gettext: don't poison if GIT_GETTEXT_POISON is set but empty
  lib-rebase: loosen GETTEXT_POISON check in fake editor
  gettext: #ifdef away GETTEXT POISON-related code from _() and Q_()
  gettext: put "# GETTEXT POISON #" string literal into a macro
  gettext: use an enum for the mode of GETTEXT POISONing
  gettext: introduce GIT_GETTEXT_POISON=scrambled
  travis-ci: run GETTEXT POISON build job in scrambled mode, too

 Makefile                  |  2 +-
 ci/lib-travisci.sh        |  1 +
 ci/run-build-and-tests.sh | 10 +++++--
 gettext.c                 | 63 +++++++++++++++++++++++++++++++++++----
 gettext.h                 | 33 +++++++++++++++-----
 t/lib-rebase.sh           |  2 +-
 t/test-lib.sh             | 17 ++++++++++-
 7 files changed, 110 insertions(+), 18 deletions(-)

-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment
  2018-10-22 20:22 ` SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty SZEDER Gábor
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

Setting GIT_GETTEXT_POISON can be useful when testing translated
builds.

However, preserving its value from the environment is not as simple as
adding it to the list of GIT_* variables that should not be scrubbed
from the environment:

  - GIT_GETTEXT_POISON should not influence git commands executed
    during initialization of test-lib and the test repo.

    Save its value before it gets scrubbed from the environment, so it
    will be unset for the duration of the initialization, and restore
    its original value after initialization is finished.

  - When testing a GETTEXT_POISON build, 'test-lib.sh' always sets
    GIT_GETTEXT_POISON to 'YesPlease'.  This was fine while all that
    mattered was whether it's set or not.  However, the following
    patches will introduce meaningful values (e.g. "set but empty" to
    disable poisoning even in a GETTEXT_POISON build), so only set it
    like this if GIT_GETTEXT_POISON was not set in the environment.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/test-lib.sh | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index ea2bbaaa7a..282c05110d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -95,6 +95,16 @@ PAGER=cat
 TZ=UTC
 export LANG LC_ALL PAGER TZ
 EDITOR=:
+
+# GIT_GETTEXT_POISON should not influence git commands executed during
+# initialization of test-lib and the test repo.
+# Back it up, unset and then restore after initialization is finished.
+if test -n "${GIT_GETTEXT_POISON-set}"
+then
+	git_gettext_poison_backup=$GIT_GETTEXT_POISON
+	unset GIT_GETTEXT_POISON
+fi
+
 # A call to "unset" with no arguments causes at least Solaris 10
 # /usr/xpg4/bin/sh and /bin/ksh to bail out.  So keep the unsets
 # deriving from the command substitution clustered with the other
@@ -1073,7 +1083,12 @@ test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 # Can we rely on git's output in the C locale?
 if test -n "$GETTEXT_POISON"
 then
-	GIT_GETTEXT_POISON=YesPlease
+	if test -n "${git_gettext_poison_backup-set}"
+	then
+		GIT_GETTEXT_POISON=$git_gettext_poison_backup
+	else
+		GIT_GETTEXT_POISON=YesPlease
+	fi
 	export GIT_GETTEXT_POISON
 	test_set_prereq GETTEXT_POISON
 else
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty
  2018-10-22 20:22 ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-22 20:38     ` Ævar Arnfjörð Bjarmason
  2018-10-22 20:22   ` [PATCH 3/8] lib-rebase: loosen GETTEXT_POISON check in fake editor SZEDER Gábor
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

This allows us to run test with non-GETTEXT POISON-ed behavior even in
a GETTEXT POISON build by running:

  GIT_GETTEXT_POISON= ./t1234-foo.sh

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 Makefile  | 2 +-
 gettext.c | 9 +++++++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index ad880d1fc5..7a165445cd 100644
--- a/Makefile
+++ b/Makefile
@@ -365,7 +365,7 @@ all::
 # Define GETTEXT_POISON if you are debugging the choice of strings marked
 # for translation.  In a GETTEXT_POISON build, you can turn all strings marked
 # for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
+# to a non-empty value in your environment.
 #
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
diff --git a/gettext.c b/gettext.c
index 7272771c8e..a9509a5df3 100644
--- a/gettext.c
+++ b/gettext.c
@@ -50,8 +50,13 @@ const char *get_preferred_languages(void)
 int use_gettext_poison(void)
 {
 	static int poison_requested = -1;
-	if (poison_requested == -1)
-		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+	if (poison_requested == -1) {
+		const char *v = getenv("GIT_GETTEXT_POISON");
+		if (v && *v)
+			poison_requested = 1;
+		else
+			poison_requested = 0;
+	}
 	return poison_requested;
 }
 #endif
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 3/8] lib-rebase: loosen GETTEXT_POISON check in fake editor
  2018-10-22 20:22 ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 4/8] gettext: #ifdef away GETTEXT POISON-related code from _() and Q_() SZEDER Gábor
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

The fake editor script created by 't/lib-rebase.sh' recognizes GETTEXT
POSION output when the first line of the file to be edited consists
solely of the GETTEXT POISON magic string as a comment.  However, a
later patch will include additional text after that magic string, so
that check won't work anymore.

So instead of expecting an exact match in the first line, check
whether there are any lines starting with the commented out magic
string.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/lib-rebase.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh
index 25a77ee5cb..530f8ec0a8 100644
--- a/t/lib-rebase.sh
+++ b/t/lib-rebase.sh
@@ -29,7 +29,7 @@ set_fake_editor () {
 	*/COMMIT_EDITMSG)
 		test -z "$EXPECT_HEADER_COUNT" ||
 			test "$EXPECT_HEADER_COUNT" = "$(sed -n '1s/^# This is a combination of \(.*\) commits\./\1/p' < "$1")" ||
-			test "# # GETTEXT POISON #" = "$(sed -n '1p' < "$1")" ||
+			! grep -q "^# # GETTEXT POISON #" ||
 			exit
 		test -z "$FAKE_COMMIT_MESSAGE" || echo "$FAKE_COMMIT_MESSAGE" > "$1"
 		test -z "$FAKE_COMMIT_AMEND" || echo "$FAKE_COMMIT_AMEND" >> "$1"
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 4/8] gettext: #ifdef away GETTEXT POISON-related code from _() and Q_()
  2018-10-22 20:22 ` SZEDER Gábor
                     ` (2 preceding siblings ...)
  2018-10-22 20:22   ` [PATCH 3/8] lib-rebase: loosen GETTEXT_POISON check in fake editor SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 5/8] gettext: put "# GETTEXT POISON #" string literal into a macro SZEDER Gábor
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

The gettext wrapper functions _() and Q_() contain a GETTEXT
POISON-related conditional construct even in non-GETTEXT POISON
builds, though both of those conditions are #define-d to be false
already at compile time.  Both constructs will grow in a later patch,
using a GETTEXT POISON-specific enum type and calling another GETTEXT
POISON-specific function.

Prepare for those future changes and hide the GETTEXT POISON-related
parts of those functions behind an #ifdef.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 gettext.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gettext.h b/gettext.h
index 7eee64a34f..c658942f7d 100644
--- a/gettext.h
+++ b/gettext.h
@@ -43,22 +43,26 @@ static inline int gettext_width(const char *s)
 
 #ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
 #endif
 
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
 	if (!*msgid)
 		return "";
-	return use_gettext_poison() ? "# GETTEXT POISON #" : gettext(msgid);
+#ifdef GETTEXT_POISON
+	if (use_gettext_poison())
+		return "# GETTEXT POISON #";
+#endif
+	return gettext(msgid);
 }
 
 static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
+#ifdef GETTEXT_POISON
 	if (use_gettext_poison())
 		return "# GETTEXT POISON #";
+#endif
 	return ngettext(msgid, plu, n);
 }
 
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 5/8] gettext: put "# GETTEXT POISON #" string literal into a macro
  2018-10-22 20:22 ` SZEDER Gábor
                     ` (3 preceding siblings ...)
  2018-10-22 20:22   ` [PATCH 4/8] gettext: #ifdef away GETTEXT POISON-related code from _() and Q_() SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 6/8] gettext: use an enum for the mode of GETTEXT POISONing SZEDER Gábor
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

The "# GETTEXT POISON #" string literal is currently used in two
functions in 'gettext.h'.  A later patch will add a function to
'gettext.c' using this string literal as well.

Avoid this duplication and put that string literal into a macro which
is only available in GETTEXT POISON builds.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 gettext.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gettext.h b/gettext.h
index c658942f7d..8e279622f6 100644
--- a/gettext.h
+++ b/gettext.h
@@ -43,6 +43,8 @@ static inline int gettext_width(const char *s)
 
 #ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
+
+#define GETTEXT_POISON_MAGIC "# GETTEXT POISON #"
 #endif
 
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
@@ -51,7 +53,7 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 		return "";
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison())
-		return "# GETTEXT POISON #";
+		return GETTEXT_POISON_MAGIC;
 #endif
 	return gettext(msgid);
 }
@@ -61,7 +63,7 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison())
-		return "# GETTEXT POISON #";
+		return GETTEXT_POISON_MAGIC;
 #endif
 	return ngettext(msgid, plu, n);
 }
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 6/8] gettext: use an enum for the mode of GETTEXT POISONing
  2018-10-22 20:22 ` SZEDER Gábor
                     ` (4 preceding siblings ...)
  2018-10-22 20:22   ` [PATCH 5/8] gettext: put "# GETTEXT POISON #" string literal into a macro SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-22 20:22   ` [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled SZEDER Gábor
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

The next patch will add a different mode of GETTEXT POISON-ing,
therefore named constants will be better than magic numbers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 gettext.c | 12 ++++++------
 gettext.h | 12 +++++++++---
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/gettext.c b/gettext.c
index a9509a5df3..c50d1e0377 100644
--- a/gettext.c
+++ b/gettext.c
@@ -47,17 +47,17 @@ const char *get_preferred_languages(void)
 }
 
 #ifdef GETTEXT_POISON
-int use_gettext_poison(void)
+enum poison_mode use_gettext_poison(void)
 {
-	static int poison_requested = -1;
-	if (poison_requested == -1) {
+	static enum poison_mode poison_mode = poison_mode_uninitialized;
+	if (poison_mode == poison_mode_uninitialized) {
 		const char *v = getenv("GIT_GETTEXT_POISON");
 		if (v && *v)
-			poison_requested = 1;
+			poison_mode = poison_mode_default;
 		else
-			poison_requested = 0;
+			poison_mode = poison_mode_none;
 	}
-	return poison_requested;
+	return poison_mode;
 }
 #endif
 
diff --git a/gettext.h b/gettext.h
index 8e279622f6..fcb6bfaa2c 100644
--- a/gettext.h
+++ b/gettext.h
@@ -42,7 +42,13 @@ static inline int gettext_width(const char *s)
 #endif
 
 #ifdef GETTEXT_POISON
-extern int use_gettext_poison(void);
+enum poison_mode {
+	poison_mode_uninitialized = -1,
+	poison_mode_none = 0,
+	poison_mode_default
+};
+
+extern enum poison_mode use_gettext_poison(void);
 
 #define GETTEXT_POISON_MAGIC "# GETTEXT POISON #"
 #endif
@@ -52,7 +58,7 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 	if (!*msgid)
 		return "";
 #ifdef GETTEXT_POISON
-	if (use_gettext_poison())
+	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
 #endif
 	return gettext(msgid);
@@ -62,7 +68,7 @@ static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
 #ifdef GETTEXT_POISON
-	if (use_gettext_poison())
+	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
 #endif
 	return ngettext(msgid, plu, n);
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled
  2018-10-22 20:22 ` SZEDER Gábor
                     ` (5 preceding siblings ...)
  2018-10-22 20:22   ` [PATCH 6/8] gettext: use an enum for the mode of GETTEXT POISONing SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-23 14:44     ` Duy Nguyen
  2018-10-22 20:22   ` [PATCH 8/8] travis-ci: run GETTEXT POISON build job in scrambled mode, too SZEDER Gábor
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

Sometimes tests run with GETTEXT POISON fail because of a reason other
than a translated string that should not have been translated.  In
such a case an error message from a git command in the test's verbose
output is usually, well, less than helpful:

  error: # GETTEXT POISON #

or

  fatal: # GETTEXT POISON #: No such file or directory

It's especially annoying on those rare occasions when a heisenbug
decides it's a good time to suddenly reveal its presence during a
GETTEXT POISON test run, and all we get is an error message like these
(yes, I did actually see both of the above error messages only once).

Make builtin commands' GETTEXT POISON-ed error messages more useful
for debugging failures by introducing a new mode of poisoning: if
$GIT_GETTEXT_POISON is set to 'scrambled', then include the original
untranslated message after that "# GETTEXT_POISON #" string in a
scrambled form, interspersing a '.' after each character.  This way
the messages will remain gibberish enough for machine consumption as
they were before, but at the same time they will be relatively easily
legible for humans.  Take extra care to preserve printf() format
conversion specifiers unaltered when inserting those dots.

Leave 'git-sh-i18n.sh' unchanged, because translatable messages in
scripts often include shell variables, and they could (though
currently they don't) include printf format specifiers, parameter
expansions, command substitutions and whatnot, too.  Dealing with
those in a shell script would be too much hassle without its worth.

There is an additional benefit: as this change considerably increases
the size of translated messages, it could detect cases when we try to
format a translated string into a too small buffer.  E.g. this change
applied on old versions causes test failures because of the bug that
was fixed in 2cfa83574c (bisect_next_all: convert xsnprintf to
xstrfmt, 2017-02-16).

[TODO: Fallout?
       A 'printf(_("foo: %s"), var);' call includes the contents of
       'var' unscrambled in the output.  Could that hide the
       translation of a string that should not have been translated?
       I'm afraid yes: to check the output of that printf() a sloppy
       test could do:

         git plumbing-cmd >out && grep "var's content" out

       which would fail in a regular GETTEXT_POISON test run, but
       would succeed in a scrambled test run.  Does this matter in
       practice, do we care at all?

       Does gettext_scramble() need a FORMAT_PRESERVING annotation?
       Seems to work fine without it so far...]

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 gettext.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 gettext.h | 11 +++++++++--
 2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/gettext.c b/gettext.c
index c50d1e0377..8ba7fd0bea 100644
--- a/gettext.c
+++ b/gettext.c
@@ -52,13 +52,61 @@ enum poison_mode use_gettext_poison(void)
 	static enum poison_mode poison_mode = poison_mode_uninitialized;
 	if (poison_mode == poison_mode_uninitialized) {
 		const char *v = getenv("GIT_GETTEXT_POISON");
-		if (v && *v)
-			poison_mode = poison_mode_default;
-		else
+		if (v && *v) {
+			if (!strcmp(v, "scrambled"))
+				poison_mode = poison_mode_scrambled;
+			else
+				poison_mode = poison_mode_default;
+		} else
 			poison_mode = poison_mode_none;
 	}
 	return poison_mode;
 }
+
+static int conversion_specifier_len(const char *s)
+{
+	const char printf_conversion_specifiers[] = "diouxXeEfFgGaAcsCSpnm%";
+	const char *format_end;
+
+	if (*s != '%')
+		return 0;
+
+	format_end = strpbrk(s + 1, printf_conversion_specifiers);
+	if (format_end)
+		return format_end - s;
+	else
+		return 0;
+}
+
+const char *gettext_scramble(const char *msg)
+{
+	struct strbuf sb;
+
+	strbuf_init(&sb,
+		    /* "# GETTEXT_POISON #" + ' ' + "m.e.s.s.a.g.e." + '\0' */
+		    strlen(GETTEXT_POISON_MAGIC) + 1 + 2 * strlen(msg) + 1);
+
+	strbuf_addch(&sb, ' ');
+	while (*msg) {
+		if (*msg == '\n') {
+			strbuf_addch(&sb, *(msg++));
+			continue;
+		} else if (*msg == '%') {
+			int spec_len = conversion_specifier_len(msg);
+			if (spec_len) {
+				strbuf_add(&sb, msg, spec_len);
+				msg += spec_len;
+				continue;
+			}
+		}
+
+		strbuf_addch(&sb, *(msg++));
+		strbuf_addch(&sb, '.');
+	}
+
+	/* This will be leaked... */
+	return strbuf_detach(&sb, NULL);
+}
 #endif
 
 #ifndef NO_GETTEXT
diff --git a/gettext.h b/gettext.h
index fcb6bfaa2c..d21346d9fa 100644
--- a/gettext.h
+++ b/gettext.h
@@ -45,10 +45,12 @@ static inline int gettext_width(const char *s)
 enum poison_mode {
 	poison_mode_uninitialized = -1,
 	poison_mode_none = 0,
-	poison_mode_default
+	poison_mode_default,
+	poison_mode_scrambled
 };
 
 extern enum poison_mode use_gettext_poison(void);
+extern const char *gettext_scramble(const char *msg);
 
 #define GETTEXT_POISON_MAGIC "# GETTEXT POISON #"
 #endif
@@ -60,6 +62,8 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
+	else if (use_gettext_poison() == poison_mode_scrambled)
+		return gettext_scramble(gettext(msgid));
 #endif
 	return gettext(msgid);
 }
@@ -67,11 +71,14 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
+	const char *msg = ngettext(msgid, plu, n);
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
+	else if (use_gettext_poison() == poison_mode_scrambled)
+		return gettext_scramble(msg);
 #endif
-	return ngettext(msgid, plu, n);
+	return msg;
 }
 
 /* Mark msgid for translation but do not translate it. */
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 8/8] travis-ci: run GETTEXT POISON build job in scrambled mode, too
  2018-10-22 20:22 ` SZEDER Gábor
                     ` (6 preceding siblings ...)
  2018-10-22 20:22   ` [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled SZEDER Gábor
@ 2018-10-22 20:22   ` SZEDER Gábor
  2018-10-23 14:37   ` [PATCH] Poison gettext with the Ook language Duy Nguyen
  2018-10-27 10:11   ` Jakub Narebski
  9 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-10-22 20:22 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð, SZEDER Gábor

Run the test suite twice in the GETTEXT POISON build: first with
GIT_GETTEXT_POISON=scrambled and then with "regular" poisoning, to see
whether the scrambled mode hid any mis-translations.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/lib-travisci.sh        |  1 +
 ci/run-build-and-tests.sh | 10 +++++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 109ef280da..fdfa4e035b 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -122,5 +122,6 @@ osx-clang|osx-gcc)
 	;;
 GETTEXT_POISON)
 	export GETTEXT_POISON=YesPlease
+	export GIT_GETTEXT_POISON=scrambled
 	;;
 esac
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 3735ce413f..74ba05e152 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -9,10 +9,14 @@ ln -s "$cache_dir/.prove" t/.prove
 
 make --jobs=2
 make --quiet test
-if test "$jobname" = "linux-gcc"
-then
+case "$jobname" in
+linux-gcc)
 	GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
-fi
+	;;
+GETTEXT_POISON)
+	GIT_GETTEXT_POISON=YesPlease make --quiet test
+	;;
+esac
 
 check_unignored_build_artifacts
 
-- 
2.19.1.681.g6bd79da3f5


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty
  2018-10-22 20:22   ` [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty SZEDER Gábor
@ 2018-10-22 20:38     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-22 20:38 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Nguyễn Thái Ngọc Duy


On Mon, Oct 22 2018, SZEDER Gábor wrote:

> This allows us to run test with non-GETTEXT POISON-ed behavior even in
> a GETTEXT POISON build by running:
>
>   GIT_GETTEXT_POISON= ./t1234-foo.sh
>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
>  Makefile  | 2 +-
>  gettext.c | 9 +++++++--
>  2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index ad880d1fc5..7a165445cd 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -365,7 +365,7 @@ all::
>  # Define GETTEXT_POISON if you are debugging the choice of strings marked
>  # for translation.  In a GETTEXT_POISON build, you can turn all strings marked
>  # for translation into gibberish by setting the GIT_GETTEXT_POISON variable
> -# (to any value) in your environment.
> +# to a non-empty value in your environment.
>  #
>  # Define JSMIN to point to JavaScript minifier that functions as
>  # a filter to have gitweb.js minified.
> diff --git a/gettext.c b/gettext.c
> index 7272771c8e..a9509a5df3 100644
> --- a/gettext.c
> +++ b/gettext.c
> @@ -50,8 +50,13 @@ const char *get_preferred_languages(void)
>  int use_gettext_poison(void)
>  {
>  	static int poison_requested = -1;
> -	if (poison_requested == -1)
> -		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
> +	if (poison_requested == -1) {
> +		const char *v = getenv("GIT_GETTEXT_POISON");
> +		if (v && *v)
> +			poison_requested = 1;
> +		else
> +			poison_requested = 0;
> +	}
>  	return poison_requested;
>  }
>  #endif

Fixing this is good. But when the initial support for conditional runs
was added in 309552295a ("i18n: do not poison translations unless
GIT_GETTEXT_POISON envvar is set", 2011-02-22) we didn't have the
convention of using git_env_bool() for these, which a few recent
follow-up patches have also done.

So let's just use git_env_bool() here. It's less code, and consistent
with the rest. See 4c2db93807 ("read-cache.c: make $GIT_TEST_SPLIT_INDEX
boolean", 2018-04-14). Also makes sense to document this in the "Running
tests with special setups" setup that patch added.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 15:36 [PATCH] Poison gettext with the Ook language Nguyễn Thái Ngọc Duy
  2018-10-22 20:22 ` SZEDER Gábor
@ 2018-10-22 20:52 ` Johannes Schindelin
  2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 49+ messages in thread
From: Johannes Schindelin @ 2018-10-22 20:52 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Ævar Arnfjörð

[-- Attachment #1: Type: text/plain, Size: 4450 bytes --]

Hi Duy,

On Mon, 22 Oct 2018, Nguyễn Thái Ngọc Duy wrote:

> The current gettext() function just replaces all strings with
> '# GETTEXT POISON #' including format strings and hides the things
> that we should be allowed to grep (like branch names, or some other
> codes) even when gettext is poisoned.
> 
> This patch implements the poisoned _() with a universal and totally
> legit language called Ook [1]. We could actually grep stuff even in
> with this because format strings are preserved.
> 
> Long term, we could implement an "ook translator" for test_i18ngrep
> and friends so that they translate English to Ook, allowing us to
> match full text while making sure the text in the code is still marked
> for translation.
> 
> [1] https://en.wikipedia.org/wiki/Unseen_University#Librarian
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  This started out as something fun to do while running the test suite
>  last weekend. But it turns out actually working! If this patch ends
>  up in git.git, the Librarian would be so proud!

Cute.
Dscho

> 
>  gettext.c       | 54 +++++++++++++++++++++++++++++++++++++++++++++++++
>  gettext.h       |  7 ++++---
>  t/lib-rebase.sh |  2 +-
>  3 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/gettext.c b/gettext.c
> index 7272771c8e..29901e1ddd 100644
> --- a/gettext.c
> +++ b/gettext.c
> @@ -56,6 +56,60 @@ int use_gettext_poison(void)
>  }
>  #endif
>  
> +const char *gettext_poison(const char *msgid)
> +{
> +	/*
> +	 * gettext() returns a string that is always valid. We would
> +	 * need a hash map for that but let's stay simple and keep the
> +	 * last 64 gettext() results. Should be more than enough.
> +	 */
> +	static char *bufs[64];
> +	static int i;
> +	struct strbuf sb = STRBUF_INIT;
> +	char *buf;
> +	const char *p;
> +	const char *type_specifiers = "diouxXeEfFgGaAcsCSpnm%";
> +
> +	if (!strchr(msgid, '%'))
> +		return "Eek!";
> +
> +	p = msgid;
> +	while (*p) {
> +		const char *type;
> +		switch (*p) {
> +		case '%':
> +			/*
> +			 * No strict parsing. We simply look for the end of a
> +			 * format string
> +			 */
> +			type = p + 1;
> +			while (*type && !strchr(type_specifiers, *type))
> +				type++;
> +			if (*type)
> +				type++;
> +			strbuf_add(&sb, p, (int)(type - p));
> +			p = type;
> +			break;
> +		default:
> +			if (!isalpha(*p)) {
> +				strbuf_addch(&sb, *p);
> +				p++;
> +				break;
> +			}
> +			if (isupper(*p))
> +				strbuf_addstr(&sb, "Ook");
> +			else
> +				strbuf_addstr(&sb, "ook");
> +			while (isalpha(*p))
> +				p++;
> +		}
> +	}
> +	buf = bufs[(i++) % ARRAY_SIZE(bufs)];
> +	free(buf);
> +	buf = strbuf_detach(&sb, NULL);
> +	return buf;
> +}
> +
>  #ifndef NO_GETTEXT
>  static int test_vsnprintf(const char *fmt, ...)
>  {
> diff --git a/gettext.h b/gettext.h
> index 7eee64a34f..dc9851a06a 100644
> --- a/gettext.h
> +++ b/gettext.h
> @@ -41,8 +41,9 @@ static inline int gettext_width(const char *s)
>  }
>  #endif
>  
> +const char *gettext_poison(const char *);
>  #ifdef GETTEXT_POISON
> -extern int use_gettext_poison(void);
> +int use_gettext_poison(void);
>  #else
>  #define use_gettext_poison() 0
>  #endif
> @@ -51,14 +52,14 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
>  {
>  	if (!*msgid)
>  		return "";
> -	return use_gettext_poison() ? "# GETTEXT POISON #" : gettext(msgid);
> +	return use_gettext_poison() ? gettext_poison(msgid) : gettext(msgid);
>  }
>  
>  static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
>  const char *Q_(const char *msgid, const char *plu, unsigned long n)
>  {
>  	if (use_gettext_poison())
> -		return "# GETTEXT POISON #";
> +		return gettext_poison(msgid);
>  	return ngettext(msgid, plu, n);
>  }
>  
> diff --git a/t/lib-rebase.sh b/t/lib-rebase.sh
> index 2ca9fb69d6..1e8440e935 100644
> --- a/t/lib-rebase.sh
> +++ b/t/lib-rebase.sh
> @@ -29,7 +29,7 @@ set_fake_editor () {
>  	*/COMMIT_EDITMSG)
>  		test -z "$EXPECT_HEADER_COUNT" ||
>  			test "$EXPECT_HEADER_COUNT" = "$(sed -n '1s/^# This is a combination of \(.*\) commits\./\1/p' < "$1")" ||
> -			test "# # GETTEXT POISON #" = "$(sed -n '1p' < "$1")" ||
> +			test "$EXPECT_HEADER_COUNT" = "$(sed -n '1s/^# Ook ook ook ook ook \(.*\) ook\./\1/p' < "$1")" ||
>  			exit
>  		test -z "$FAKE_COMMIT_MESSAGE" || echo "$FAKE_COMMIT_MESSAGE" > "$1"
>  		test -z "$FAKE_COMMIT_AMEND" || echo "$FAKE_COMMIT_AMEND" >> "$1"
> -- 
> 2.19.1.647.g708186aaf9
> 
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 15:36 [PATCH] Poison gettext with the Ook language Nguyễn Thái Ngọc Duy
  2018-10-22 20:22 ` SZEDER Gábor
  2018-10-22 20:52 ` Johannes Schindelin
@ 2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
  2018-10-22 21:46   ` Ævar Arnfjörð Bjarmason
                     ` (2 more replies)
  2 siblings, 3 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-22 21:09 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, SZEDER Gábor, Vasco Almeida, Jiang Xin


On Mon, Oct 22 2018, Nguyễn Thái Ngọc Duy wrote:

> The current gettext() function just replaces all strings with
> '# GETTEXT POISON #' including format strings and hides the things
> that we should be allowed to grep (like branch names, or some other
> codes) even when gettext is poisoned.
>
> This patch implements the poisoned _() with a universal and totally
> legit language called Ook [1]. We could actually grep stuff even in
> with this because format strings are preserved.
>
> Long term, we could implement an "ook translator" for test_i18ngrep
> and friends so that they translate English to Ook, allowing us to
> match full text while making sure the text in the code is still marked
> for translation.

Replying to both this patch, and SZEDER's series for brevity. Thanks
both for working on this. It's obvious that this stuff needs some
polishing, and it's great that's being done, and sorry for my part of
having left this in the current state.

a)

Both this patch and SZEDER's 7/8 are addressing the issue that the
poison mode doesn't preserve format specifiers.

I haven't tried to dig this up in the list archive, and maybe it only
exists in my mind, but I seem to remember that this was explicitly
discussed as a goal at the time.

I.e. we were expecting the lack of this to break tests, and that was
considered a feature as it would spot plumbing messages we shouldn't be
translating.

However, it's my opinion that this was just a bad idea to begin with,
I've CC'd a couple of prolific markers of i18n from my log grepping (and
feel free to add more) to see if they disagre.

I think it probably helped me a bit in the beginning when i18n was
bootstrapping and there was a *lot* to mark for translation, but we've
long since stabilized and aren't doing that anymore, so it's much easier
to just review the patches to see if they translate plumbing.

All of which is to say that I think something like your patch here is
fine to just accept, and the only improvement I'd suggest is some note
in the commit message saying that this was always intentional, but
nobody can name a case where it helped, so let's just drop it.

In SZEDER's case that we shouldn't have GIT_GETTEXT_POISON=scrambled,
let's just make it boolean and make "scrambled" (i.e. preserving format
specifiecs) the default.

b)

SZEDER's series, and your patch (although it's smaller) still preserve
the notion that there's such a thing as a GETTEXT_POISON *build* and you
actually need to compile git with that flag to make use of it.

This, as I recall, was just paranoia on my part back in 2010/2011. I
wanted to be able to present a version of this i18n stuff where people
who didn't like it could be assured that if they didn't opt-in to it
they wouldn't get any of the code (sans a few trivial and obviously
correct wrapper functions).

So I think the only reason to keep it compile-time is performance, but I
don't think that matters. It's not like we're printing gigabytes of _()
formatted output. Everything where formatting matters is plumbing which
doesn't use this API. These messages are always for human consumption.

So shouldn't we just drop this notion of GETTEXT_POISON at compile-time
entirely, and make GIT_GETTEXT_POISON=<boolean> a test flag that all
builds of git are aware of? I think so.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
@ 2018-10-22 21:46   ` Ævar Arnfjörð Bjarmason
  2018-10-22 23:04   ` Junio C Hamano
  2018-10-23  9:30   ` [PATCH] Poison gettext with the Ook language Johannes Schindelin
  2 siblings, 0 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-22 21:46 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, SZEDER Gábor, Vasco Almeida, Jiang Xin


On Mon, Oct 22 2018, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Oct 22 2018, Nguyễn Thái Ngọc Duy wrote:
>
>> The current gettext() function just replaces all strings with
>> '# GETTEXT POISON #' including format strings and hides the things
>> that we should be allowed to grep (like branch names, or some other
>> codes) even when gettext is poisoned.
>>
>> This patch implements the poisoned _() with a universal and totally
>> legit language called Ook [1]. We could actually grep stuff even in
>> with this because format strings are preserved.
>>
>> Long term, we could implement an "ook translator" for test_i18ngrep
>> and friends so that they translate English to Ook, allowing us to
>> match full text while making sure the text in the code is still marked
>> for translation.
>
> Replying to both this patch, and SZEDER's series for brevity. Thanks
> both for working on this. It's obvious that this stuff needs some
> polishing, and it's great that's being done, and sorry for my part of
> having left this in the current state.
>
> a)
>
> Both this patch and SZEDER's 7/8 are addressing the issue that the
> poison mode doesn't preserve format specifiers.
>
> I haven't tried to dig this up in the list archive, and maybe it only
> exists in my mind, but I seem to remember that this was explicitly
> discussed as a goal at the time.
>
> I.e. we were expecting the lack of this to break tests, and that was
> considered a feature as it would spot plumbing messages we shouldn't be
> translating.
>
> However, it's my opinion that this was just a bad idea to begin with,
> I've CC'd a couple of prolific markers of i18n from my log grepping (and
> feel free to add more) to see if they disagre.
>
> I think it probably helped me a bit in the beginning when i18n was
> bootstrapping and there was a *lot* to mark for translation, but we've
> long since stabilized and aren't doing that anymore, so it's much easier
> to just review the patches to see if they translate plumbing.
>
> All of which is to say that I think something like your patch here is
> fine to just accept, and the only improvement I'd suggest is some note
> in the commit message saying that this was always intentional, but
> nobody can name a case where it helped, so let's just drop it.
>
> In SZEDER's case that we shouldn't have GIT_GETTEXT_POISON=scrambled,
> let's just make it boolean and make "scrambled" (i.e. preserving format
> specifiecs) the default.

[...]

> b)
>
> SZEDER's series, and your patch (although it's smaller) still preserve
> the notion that there's such a thing as a GETTEXT_POISON *build* and you
> actually need to compile git with that flag to make use of it.
>
> This, as I recall, was just paranoia on my part back in 2010/2011. I
> wanted to be able to present a version of this i18n stuff where people
> who didn't like it could be assured that if they didn't opt-in to it
> they wouldn't get any of the code (sans a few trivial and obviously
> correct wrapper functions).
>
> So I think the only reason to keep it compile-time is performance, but I
> don't think that matters. It's not like we're printing gigabytes of _()
> formatted output. Everything where formatting matters is plumbing which
> doesn't use this API. These messages are always for human consumption.
>
> So shouldn't we just drop this notion of GETTEXT_POISON at compile-time
> entirely, and make GIT_GETTEXT_POISON=<boolean> a test flag that all
> builds of git are aware of? I think so.

Something like this, untested except I compiled it and ran one test
with/without GIT_GETTEXT_POISON, so it may have bugs:

 Makefile                | 10 +---------
 gettext.c               |  4 +---
 gettext.h               |  4 ----
 po/README               | 13 ++++---------
 t/README                |  5 +++++
 t/helper/test-tool.c    |  1 +
 t/helper/test-tool.h    |  1 +
 t/test-lib-functions.sh |  8 ++++----
 t/test-lib.sh           | 12 +++---------
 9 files changed, 20 insertions(+), 38 deletions(-)

diff --git a/Makefile b/Makefile
index d18ab0fe78..78190ee9d9 100644
--- a/Makefile
+++ b/Makefile
@@ -362,11 +362,6 @@ all::
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
 # user.
 #
-# Define GETTEXT_POISON if you are debugging the choice of strings marked
-# for translation.  In a GETTEXT_POISON build, you can turn all strings marked
-# for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
-#
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
 #
@@ -714,6 +709,7 @@ TEST_BUILTINS_OBJS += test-dump-split-index.o
 TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-genrandom.o
+TEST_BUILTINS_OBJS += test-gettext-poison.o
 TEST_BUILTINS_OBJS += test-hashmap.o
 TEST_BUILTINS_OBJS += test-index-version.o
 TEST_BUILTINS_OBJS += test-json-writer.o
@@ -1439,9 +1435,6 @@ endif
 ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
-ifdef GETTEXT_POISON
-	BASIC_CFLAGS += -DGETTEXT_POISON
-endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT
 	USE_GETTEXT_SCHEME ?= fallthrough
@@ -2591,7 +2584,6 @@ ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
 	@echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@+
 endif
 	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
-	@echo GETTEXT_POISON=\''$(subst ','\'',$(subst ','\'',$(GETTEXT_POISON)))'\' >>$@+
 ifdef GIT_PERF_REPEAT_COUNT
 	@echo GIT_PERF_REPEAT_COUNT=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_REPEAT_COUNT)))'\' >>$@+
 endif
diff --git a/gettext.c b/gettext.c
index 7272771c8e..d71e394284 100644
--- a/gettext.c
+++ b/gettext.c
@@ -46,15 +46,13 @@ const char *get_preferred_languages(void)
 	return NULL;
 }

-#ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
 	static int poison_requested = -1;
 	if (poison_requested == -1)
-		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+		poison_requested = git_env_bool("GIT_GETTEXT_POISON", 0);
 	return poison_requested;
 }
-#endif

 #ifndef NO_GETTEXT
 static int test_vsnprintf(const char *fmt, ...)
diff --git a/gettext.h b/gettext.h
index 7eee64a34f..4c492d9f57 100644
--- a/gettext.h
+++ b/gettext.h
@@ -41,11 +41,7 @@ static inline int gettext_width(const char *s)
 }
 #endif

-#ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
-#endif

 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
diff --git a/po/README b/po/README
index fef4c0f0b5..9dad277e33 100644
--- a/po/README
+++ b/po/README
@@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
 version of the strings, e.g. to grep some error message or other
 output.

-To smoke out issues like these Git can be compiled with gettext poison
-support, at the top-level:
+To smoke out issues like these Git tested with a translation mode that
+emits gibberish on every call to gettext. To use it run the test suite
+with it, e.g.:

-    make GETTEXT_POISON=YesPlease
-
-That'll give you a git which emits gibberish on every call to
-gettext. It's obviously not meant to be installed, but you should run
-the test suite with it:
-
-    cd t && prove -j 9 ./t[0-9]*.sh
+    cd t && GIT_GETTEXT_POISON=true prove -j 9 ./t[0-9]*.sh

 If tests break with it you should inspect them manually and see if
 what you're translating is sane, i.e. that you're not translating
diff --git a/t/README b/t/README
index 8847489640..b3b2fa0b11 100644
--- a/t/README
+++ b/t/README
@@ -301,6 +301,11 @@ that cannot be easily covered by a few specific test cases. These
 could be enabled by running the test suite with correct GIT_TEST_
 environment set.

+GIT_GETTEXT_POISON=<boolean> turns all strings marked for translation
+into gibberish. Used for spotting those tests that need to be marked
+with a C_LOCALE_OUTPUT prerequisite when adding more strings for
+translation. See "Testing marked strings" in po/README for details.
+
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.

diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 6b5836dc1b..3e75672a37 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -19,6 +19,7 @@ static struct test_cmd cmds[] = {
 	{ "dump-untracked-cache", cmd__dump_untracked_cache },
 	{ "example-decorate", cmd__example_decorate },
 	{ "genrandom", cmd__genrandom },
+	{ "gettext-poison", cmd__gettext_poison },
 	{ "hashmap", cmd__hashmap },
 	{ "index-version", cmd__index_version },
 	{ "json-writer", cmd__json_writer },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index e4890566da..04f033b7fc 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -15,6 +15,7 @@ int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
+int cmd__gettext_poison(int argc, const char **argv);
 int cmd__hashmap(int argc, const char **argv);
 int cmd__index_version(int argc, const char **argv);
 int cmd__json_writer(int argc, const char **argv);
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78d8c3783b..4e8683addd 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -755,16 +755,16 @@ test_cmp_bin() {

 # Use this instead of test_cmp to compare files that contain expected and
 # actual output from git commands that can be translated.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ncmp () {
-	test -n "$GETTEXT_POISON" || test_cmp "$@"
+	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
 }

 # Use this instead of "grep expected-string actual" to see if the
 # output from a git command that can be translated either contains an
 # expected string, or does not contain an unwanted one.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ngrep () {
 	eval "last_arg=\${$#}"
@@ -779,7 +779,7 @@ test_i18ngrep () {
 		error "bug in the test script: too few parameters to test_i18ngrep"
 	fi

-	if test -n "$GETTEXT_POISON"
+	if ! test_have_prereq C_LOCALE_OUTPUT
 	then
 		# pretend success
 		return 0
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 897e6fcc94..f82d2149a2 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -105,6 +105,7 @@ unset VISUAL EMAIL LANGUAGE COLUMNS $("$PERL_PATH" -e '
 		TRACE
 		DEBUG
 		TEST
+		GETTEXT_POISON
 		.*_TEST
 		PROVE
 		VALGRIND
@@ -1104,15 +1105,8 @@ test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT

-# Can we rely on git's output in the C locale?
-if test -n "$GETTEXT_POISON"
-then
-	GIT_GETTEXT_POISON=YesPlease
-	export GIT_GETTEXT_POISON
-	test_set_prereq GETTEXT_POISON
-else
-	test_set_prereq C_LOCALE_OUTPUT
-fi
+test_lazy_prereq GETTEXT_POISON 'test-tool gettext-poison'
+test_lazy_prereq C_LOCALE_OUTPUT '! test-tool gettext-poison'

 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
  2018-10-22 21:46   ` Ævar Arnfjörð Bjarmason
@ 2018-10-22 23:04   ` Junio C Hamano
  2018-10-23 21:01     ` [PATCH] i18n: make GETTEXT_POISON a runtime option Ævar Arnfjörð Bjarmason
  2018-10-23  9:30   ` [PATCH] Poison gettext with the Ook language Johannes Schindelin
  2 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2018-10-22 23:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Nguyễn Thái Ngọc Duy, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> So I think the only reason to keep it compile-time is performance, but I
> don't think that matters. It's not like we're printing gigabytes of _()
> formatted output. Everything where formatting matters is plumbing which
> doesn't use this API. These messages are always for human consumption.
>
> So shouldn't we just drop this notion of GETTEXT_POISON at compile-time
> entirely, and make GIT_GETTEXT_POISON=<boolean> a test flag that all
> builds of git are aware of? I think so.

Haven't thought things through, but that sounds like a good thing to
aim for.  Keep the ideas coming...

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
  2018-10-22 21:46   ` Ævar Arnfjörð Bjarmason
  2018-10-22 23:04   ` Junio C Hamano
@ 2018-10-23  9:30   ` Johannes Schindelin
  2018-10-23 10:17     ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 49+ messages in thread
From: Johannes Schindelin @ 2018-10-23  9:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Nguyễn Thái Ngọc Duy, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

[-- Attachment #1: Type: text/plain, Size: 1105 bytes --]

Hi Ævar,

On Mon, 22 Oct 2018, Ævar Arnfjörð Bjarmason wrote:

> So I think the only reason to keep it [GETTEXT_POISON] compile-time is
> performance, but I don't think that matters. It's not like we're
> printing gigabytes of _() formatted output. Everything where formatting
> matters is plumbing which doesn't use this API. These messages are
> always for human consumption.

Well, let's make sure that your impression is correct before going too
far. I, too, had the impression that gettext cannot possibly be expensive,
especifally in Git for Windows' settings, where we do not even ship
translations. Yet see the commit message of cc5e1bf99247 (gettext: avoid
initialization if the locale dir is not present, 2018-04-21):

	The runtime of a simple `git.exe version` call on Windows is
	currently dominated by the gettext setup, adding a whopping ~150ms
	to the ~210ms total.

I would be in favor of your change to make this a runtime option, of
course, as long as it does not affect performance greatly (in particular
on Windows, where we fight an uphill battle to make Git faster).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-23  9:30   ` [PATCH] Poison gettext with the Ook language Johannes Schindelin
@ 2018-10-23 10:17     ` Ævar Arnfjörð Bjarmason
  2018-10-23 11:07       ` Johannes Schindelin
  2018-10-23 15:00       ` Duy Nguyen
  0 siblings, 2 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-23 10:17 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Nguyễn Thái Ngọc Duy, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin


On Tue, Oct 23 2018, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 22 Oct 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> So I think the only reason to keep it [GETTEXT_POISON] compile-time is
>> performance, but I don't think that matters. It's not like we're
>> printing gigabytes of _() formatted output. Everything where formatting
>> matters is plumbing which doesn't use this API. These messages are
>> always for human consumption.
>
> Well, let's make sure that your impression is correct before going too
> far. I, too, had the impression that gettext cannot possibly be expensive,
> especifally in Git for Windows' settings, where we do not even ship
> translations. Yet see the commit message of cc5e1bf99247 (gettext: avoid
> initialization if the locale dir is not present, 2018-04-21):
>
> 	The runtime of a simple `git.exe version` call on Windows is
> 	currently dominated by the gettext setup, adding a whopping ~150ms
> 	to the ~210ms total.
>
> I would be in favor of your change to make this a runtime option, of
> course, as long as it does not affect performance greatly (in particular
> on Windows, where we fight an uphill battle to make Git faster).

How expensive gettext() may or may not be isn't relevant to the
GETTEXT_POISON compile-time option.

The effect of what I'm suggesting here, and which my WIP patch in
<875zxtd59e.fsf@evledraar.gmail.com> implements is that we'd do a
one-time getenv() for each process that prints a _() message that we
aren't doing now, and for each message call a function that would check
a boolean "are we in poison mode" static global variable.

Perhaps some of that's expensive on Windows, but given the recent
patches by Microsoft employees to add GIT_TEST_* env options I assumed
not, but in any case it won't have anything to do with how expensive
gettext may or may not be, you'll already be paying that cost now (or
not, with NO_GETTEXT).

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-23 10:17     ` Ævar Arnfjörð Bjarmason
@ 2018-10-23 11:07       ` Johannes Schindelin
  2018-10-23 15:00       ` Duy Nguyen
  1 sibling, 0 replies; 49+ messages in thread
From: Johannes Schindelin @ 2018-10-23 11:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Nguyễn Thái Ngọc Duy, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

[-- Attachment #1: Type: text/plain, Size: 2273 bytes --]

Hi Ævar,

On Tue, 23 Oct 2018, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Oct 23 2018, Johannes Schindelin wrote:
> 
> > On Mon, 22 Oct 2018, Ævar Arnfjörð Bjarmason wrote:
> >
> >> So I think the only reason to keep it [GETTEXT_POISON] compile-time is
> >> performance, but I don't think that matters. It's not like we're
> >> printing gigabytes of _() formatted output. Everything where formatting
> >> matters is plumbing which doesn't use this API. These messages are
> >> always for human consumption.
> >
> > Well, let's make sure that your impression is correct before going too
> > far. I, too, had the impression that gettext cannot possibly be expensive,
> > especifally in Git for Windows' settings, where we do not even ship
> > translations. Yet see the commit message of cc5e1bf99247 (gettext: avoid
> > initialization if the locale dir is not present, 2018-04-21):
> >
> > 	The runtime of a simple `git.exe version` call on Windows is
> > 	currently dominated by the gettext setup, adding a whopping ~150ms
> > 	to the ~210ms total.
> >
> > I would be in favor of your change to make this a runtime option, of
> > course, as long as it does not affect performance greatly (in particular
> > on Windows, where we fight an uphill battle to make Git faster).
> 
> How expensive gettext() may or may not be isn't relevant to the
> GETTEXT_POISON compile-time option.
> 
> The effect of what I'm suggesting here, and which my WIP patch in
> <875zxtd59e.fsf@evledraar.gmail.com> implements is that we'd do a
> one-time getenv() for each process that prints a _() message that we
> aren't doing now, and for each message call a function that would check
> a boolean "are we in poison mode" static global variable.

Yep, you are right, we are *already* going through _() as a function,
whether gettext() is initialized or not. My bad.

> Perhaps some of that's expensive on Windows, but given the recent
> patches by Microsoft employees to add GIT_TEST_* env options I assumed
> not, but in any case it won't have anything to do with how expensive
> gettext may or may not be, you'll already be paying that cost now (or
> not, with NO_GETTEXT).

Indeed, we want to measure performance better, and that's what those
environment variables are for.

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 20:22 ` SZEDER Gábor
                     ` (7 preceding siblings ...)
  2018-10-22 20:22   ` [PATCH 8/8] travis-ci: run GETTEXT POISON build job in scrambled mode, too SZEDER Gábor
@ 2018-10-23 14:37   ` Duy Nguyen
  2018-10-27 10:11   ` Jakub Narebski
  9 siblings, 0 replies; 49+ messages in thread
From: Duy Nguyen @ 2018-10-23 14:37 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Git Mailing List, Ævar Arnfjörð Bjarmason

On Mon, Oct 22, 2018 at 10:23 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> Once upon a time a GETTEXT_POISON build job failed on me, and the
> error message:
>
>   error: # GETTEXT POISON #
>
> was not particularly useful.  Ook wouldn't help with that...

Oook?

> So I came up with the following couple of patches that implement a
> "scrambled" format that makes the poisoned output legible for humans
> but still gibberish for machine consumption (i.e. grep-ing the text
> part would still fail):
>
>   error:  U.n.a.b.l.e. .t.o. .c.r.e.a.t.e. .'./home/szeder/src/git/t/trash directory.t1404-update-ref-errors/.git/packed-refs...l.o.c.k.'.:. .File exists...
>
> I have been running GETTEXT_POISON builds with this series for some
> months now, but haven't submitted it yet, because I haven't decided
> yet whether including strings (paths, refs, etc.) in the output as
> they are is a feature or a flaw.

A feature to me. The only thing that got through and but should not be
grep-able is strerror(). But that's orthogonal to this scrambling
stuff.

> And because it embarrassingly leaks every single translated string... :)

Easy to fix. I hope this means you have no more excuse to not push
this forward ;-)
-- 
Duy

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled
  2018-10-22 20:22   ` [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled SZEDER Gábor
@ 2018-10-23 14:44     ` Duy Nguyen
  0 siblings, 0 replies; 49+ messages in thread
From: Duy Nguyen @ 2018-10-23 14:44 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Git Mailing List, Ævar Arnfjörð Bjarmason

On Mon, Oct 22, 2018 at 10:23 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> [TODO: Fallout?
>        A 'printf(_("foo: %s"), var);' call includes the contents of
>        'var' unscrambled in the output.  Could that hide the
>        translation of a string that should not have been translated?
>        I'm afraid yes: to check the output of that printf() a sloppy
>        test could do:
>
>          git plumbing-cmd >out && grep "var's content" out
>
>        which would fail in a regular GETTEXT_POISON test run, but
>        would succeed in a scrambled test run.  Does this matter in
>        practice, do we care at all?

If var is supposed to be translated, _() must have been called before
the final string is stored in var and the content is already
scrambled. Whatever left unscrambled is machine-generated and should
be ok to grep, or we have found new strings that should be _() but
not.

PS. Another thing I'd like to have is to mark the beginning and end of
a scrambled text. For example, _("foo") produces "[f.o.o]" or
something like that. If we have it, it's easier to see "text legos"
(instead of full sentences) that makes translator's life harder. But
it could interfere with stuff (e.g. some strings must start with '#')
so let's forget it for now.

>
>        Does gettext_scramble() need a FORMAT_PRESERVING annotation?
>        Seems to work fine without it so far...]

I don't think you can. _() can be called on plain strings that just
happen to have '%' in them.
-- 
Duy

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-23 10:17     ` Ævar Arnfjörð Bjarmason
  2018-10-23 11:07       ` Johannes Schindelin
@ 2018-10-23 15:00       ` Duy Nguyen
  2018-10-23 16:45         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 49+ messages in thread
From: Duy Nguyen @ 2018-10-23 15:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Git Mailing List, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

On Tue, Oct 23, 2018 at 12:17 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Oct 23 2018, Johannes Schindelin wrote:
>
> > Hi Ævar,
> >
> > On Mon, 22 Oct 2018, Ævar Arnfjörð Bjarmason wrote:
> >
> >> So I think the only reason to keep it [GETTEXT_POISON] compile-time is
> >> performance, but I don't think that matters. It's not like we're
> >> printing gigabytes of _() formatted output. Everything where formatting
> >> matters is plumbing which doesn't use this API. These messages are
> >> always for human consumption.
> >
> > Well, let's make sure that your impression is correct before going too
> > far. I, too, had the impression that gettext cannot possibly be expensive,
> > especifally in Git for Windows' settings, where we do not even ship
> > translations. Yet see the commit message of cc5e1bf99247 (gettext: avoid
> > initialization if the locale dir is not present, 2018-04-21):
> >
> >       The runtime of a simple `git.exe version` call on Windows is
> >       currently dominated by the gettext setup, adding a whopping ~150ms
> >       to the ~210ms total.
> >
> > I would be in favor of your change to make this a runtime option, of
> > course, as long as it does not affect performance greatly (in particular
> > on Windows, where we fight an uphill battle to make Git faster).
>
> How expensive gettext() may or may not be isn't relevant to the
> GETTEXT_POISON compile-time option.

If a user requests NO_GETTEXT, they should have zero (or near zero)
cost related to gettext. Which is true until now (the inline _()
version is expanded to pretty much no-op with the exception of NULL
string)

> The effect of what I'm suggesting here, and which my WIP patch in
> <875zxtd59e.fsf@evledraar.gmail.com> implements is that we'd do a
> one-time getenv() for each process that prints a _() message that we
> aren't doing now, and for each message call a function that would check
> a boolean "are we in poison mode" static global variable.

Just don't do the getenv() inside _() even if you just do it one time.
getenv() may invalidate whatever value from the previous call. I would
not want to find out my code broke because of an getenv() inside some
innocent _()... And we should still respect NO_GETTEXT, not doing any
extra work when it's defined.
-- 
Duy

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-23 15:00       ` Duy Nguyen
@ 2018-10-23 16:45         ` Ævar Arnfjörð Bjarmason
  2018-10-24 14:41           ` Duy Nguyen
  0 siblings, 1 reply; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-23 16:45 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Johannes Schindelin, Git Mailing List, SZEDER Gábor,
	Vasco Almeida, Jiang Xin


On Tue, Oct 23 2018, Duy Nguyen wrote:

> On Tue, Oct 23, 2018 at 12:17 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Tue, Oct 23 2018, Johannes Schindelin wrote:
>>
>> > Hi Ævar,
>> >
>> > On Mon, 22 Oct 2018, Ævar Arnfjörð Bjarmason wrote:
>> >
>> >> So I think the only reason to keep it [GETTEXT_POISON] compile-time is
>> >> performance, but I don't think that matters. It's not like we're
>> >> printing gigabytes of _() formatted output. Everything where formatting
>> >> matters is plumbing which doesn't use this API. These messages are
>> >> always for human consumption.
>> >
>> > Well, let's make sure that your impression is correct before going too
>> > far. I, too, had the impression that gettext cannot possibly be expensive,
>> > especifally in Git for Windows' settings, where we do not even ship
>> > translations. Yet see the commit message of cc5e1bf99247 (gettext: avoid
>> > initialization if the locale dir is not present, 2018-04-21):
>> >
>> >       The runtime of a simple `git.exe version` call on Windows is
>> >       currently dominated by the gettext setup, adding a whopping ~150ms
>> >       to the ~210ms total.
>> >
>> > I would be in favor of your change to make this a runtime option, of
>> > course, as long as it does not affect performance greatly (in particular
>> > on Windows, where we fight an uphill battle to make Git faster).
>>
>> How expensive gettext() may or may not be isn't relevant to the
>> GETTEXT_POISON compile-time option.
>
> If a user requests NO_GETTEXT, they should have zero (or near zero)
> cost related to gettext. Which is true until now (the inline _()
> version is expanded to pretty much no-op with the exception of NULL
> string)

I'm assuming by "until now" you mean until my RFC patch in
https://public-inbox.org/git/875zxtd59e.fsf@evledraar.gmail.com/

Yeah it's more expensive, but I don't see how it matters. We'd be
nano-optimizing a thing that's never going to become a
bottleneck. I.e. the patch is basically taking the commented-out
codepath in this test program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int have_flag(void)
    {
        return 0;
        /*
        static int flag = -1;
        if (flag == -1)
            flag = strlen(getenv("GIT_TEST_FOO"));
        return flag;
        */
    }


    int main(void)
    {
        while (1) {
            const char *out = have_flag() ? "foo" : "bar";
            puts(out);
        }
    }

When I test that on my laptop with gcc 8.2.0 I get ~230MB/s of output on
both versions, i.e. where have_flag() can be trivially constant folded,
and where we need to call getenv() for both of:

    GIT_TEST_FOO= ./main | pv >/dev/null
    GIT_TEST_FOO=1 ./main | pv >/dev/null

We don't have any codepaths where we're calling gettext() to output more
than a screenfull or two of output, so optimizing this doesn't make
sense.

>> The effect of what I'm suggesting here, and which my WIP patch in
>> <875zxtd59e.fsf@evledraar.gmail.com> implements is that we'd do a
>> one-time getenv() for each process that prints a _() message that we
>> aren't doing now, and for each message call a function that would check
>> a boolean "are we in poison mode" static global variable.
>
> Just don't do the getenv() inside _() even if you just do it one time.
> getenv() may invalidate whatever value from the previous call. I would
> not want to find out my code broke because of an getenv() inside some
> innocent _()...

How would any code break? We have various getenv("GIT_TEST_*...")
already that work the same way. Unless you set that in the environment
the function is idempotent, and I don't see how anyone would do that
accidentally.

> And we should still respect NO_GETTEXT, not doing any extra work when
> it's defined.

This is not how any of our NO_* defines work. *Sometimes* defining them
guarantees you do no extra work, but in other cases we've decided that
bending over backwards with #ifdef in various places isn't worth it.

E.g. grep_opt in grep.h is a good example of this. Even if you don't
compile with PCRE you're pointlessly memzero-ing out members of the
struct that'll never be used in your config, because it's easier to
maintain the code that way (types always checked at compile-time).

(And no, despite my involvement in PCRE I didn't introduce that
particular pattern)

For the reasons noted above I think NO_GETTEXT is another such
case. Just like GIT_TEST_SPLIT_INDEX (added in your d6e3c181bc
("read-cache: force split index mode with GIT_TEST_SPLIT_INDEX",
2014-06-13)) etc.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-22 23:04   ` Junio C Hamano
@ 2018-10-23 21:01     ` Ævar Arnfjörð Bjarmason
  2018-10-24  5:45       ` Junio C Hamano
  0 siblings, 1 reply; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-23 21:01 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, SZEDER Gábor, Vasco Almeida,
	Jiang Xin, Ævar Arnfjörð Bjarmason

Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
test parameter to only be a GIT_TEST_GETTEXT_POISON=<boolean> runtime
parameter, to be consistent with other parameters documented in
"Running tests with special setups" in t/README.

When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
to simulate unfriendly translator", 2011-02-22) I was concerned with
ensuring that the _() function would get constant folded if NO_GETTEXT
was defined or if it wasn't and GETTEXT_POISON wasn't defined.

But as the benchmark in my [1] shows doing a one-off runtime
getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
originally added this has become the common idiom in the codebase for
turning on special test setups.

So change GETTEXT_POISON to work the same way. Now the
GETTEXT_POISON=YesPlease compile-time option is gone, and running the
tests with GIT_TEST_GETTEXT_POISON=[true|false] can be toggled on/off
without recompiling.

This allows for conditionally amending tests to test with/without
poison, similar to what 859fdc0c3c ("commit-graph: define
GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH, but
this patch doesn't change any of the existing tests to work like that.

Notes on the implementation:

 * The only reason we need a new "git-sh-i18n--helper" and the
   corresponding "test-tool gettext-poison" is to expose
   git_env_bool() to shellscripts, since git-sh-i18n and friends need
   to inspect the $GIT_TEST_GETTEXT_POISON variable.

   We only call these if $GIT_TEST_GETTEXT_POISON is set, or in the
   test suite, and this code can go away for non-test code once the
   last i18n-using shellscript is rewritten in C.

 * We still compile a dedicated GETTEXT_POISON build in Travis CI,
   this is probably the wrong thing to do and should be followed-up
   with something similar to ae59a4e44f ("travis: run tests with
   GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
   for running in the GIT_TEST_GETTEXT_POISON mode.

 * The reason for not doing:

       test_lazy_prereq GETTEXT_POISON 'test-tool gettext-poison'
       test_lazy_prereq C_LOCALE_OUTPUT '! test-tool gettext-poison'

   In test-lib.sh is because there's some interpolation problem with
   that syntax which makes t6040-tracking-info.sh fail. Hence using
   the simpler test_set_prereq.

See also
https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/ for
more discussion.

1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

On Mon, Oct 22 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> So I think the only reason to keep it compile-time is performance, but I
>> don't think that matters. It's not like we're printing gigabytes of _()
>> formatted output. Everything where formatting matters is plumbing which
>> doesn't use this API. These messages are always for human consumption.
>>
>> So shouldn't we just drop this notion of GETTEXT_POISON at compile-time
>> entirely, and make GIT_GETTEXT_POISON=<boolean> a test flag that all
>> builds of git are aware of? I think so.
>
> Haven't thought things through, but that sounds like a good thing to
> aim for.  Keep the ideas coming...

Here's a polished version of that. I think it makes sense to queue
this up before any other refactoring of GETTEXT_POISON, and some patch
to unconditionally preserve format specifiers as I suggested upthread
could go on top of this.

 .gitignore                     |  1 +
 .travis.yml                    |  2 +-
 Makefile                       | 11 ++---------
 builtin.h                      |  1 +
 builtin/sh-i18n--helper.c      | 27 +++++++++++++++++++++++++++
 ci/lib-travisci.sh             |  4 ++--
 gettext.c                      |  5 ++---
 gettext.h                      |  4 ----
 git-sh-i18n.sh                 |  3 ++-
 git.c                          |  1 +
 po/README                      | 13 ++++---------
 t/README                       |  6 ++++++
 t/helper/test-gettext-poison.c |  9 +++++++++
 t/helper/test-tool.c           |  1 +
 t/helper/test-tool.h           |  1 +
 t/t0000-basic.sh               |  2 +-
 t/t3406-rebase-message.sh      |  2 +-
 t/t7201-co.sh                  |  5 ++++-
 t/test-lib-functions.sh        |  8 ++++----
 t/test-lib.sh                  | 11 ++---------
 20 files changed, 72 insertions(+), 45 deletions(-)
 create mode 100644 builtin/sh-i18n--helper.c
 create mode 100644 t/helper/test-gettext-poison.c

diff --git a/.gitignore b/.gitignore
index 9d1363a1eb..f7b7977910 100644
--- a/.gitignore
+++ b/.gitignore
@@ -148,6 +148,7 @@
 /git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
+/git-sh-i18n--helper
 /git-sh-setup
 /git-sh-i18n
 /git-shell
diff --git a/.travis.yml b/.travis.yml
index 4d4e26c9df..4523a2e5ec 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -26,7 +26,7 @@ addons:
 
 matrix:
   include:
-    - env: jobname=GETTEXT_POISON
+    - env: jobname=GIT_TEST_GETTEXT_POISON
       os: linux
       compiler:
       addons:
diff --git a/Makefile b/Makefile
index d18ab0fe78..684dec4d39 100644
--- a/Makefile
+++ b/Makefile
@@ -362,11 +362,6 @@ all::
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
 # user.
 #
-# Define GETTEXT_POISON if you are debugging the choice of strings marked
-# for translation.  In a GETTEXT_POISON build, you can turn all strings marked
-# for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
-#
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
 #
@@ -714,6 +709,7 @@ TEST_BUILTINS_OBJS += test-dump-split-index.o
 TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-genrandom.o
+TEST_BUILTINS_OBJS += test-gettext-poison.o
 TEST_BUILTINS_OBJS += test-hashmap.o
 TEST_BUILTINS_OBJS += test-index-version.o
 TEST_BUILTINS_OBJS += test-json-writer.o
@@ -1099,6 +1095,7 @@ BUILTIN_OBJS += builtin/revert.o
 BUILTIN_OBJS += builtin/rm.o
 BUILTIN_OBJS += builtin/send-pack.o
 BUILTIN_OBJS += builtin/serve.o
+BUILTIN_OBJS += builtin/sh-i18n--helper.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-index.o
@@ -1439,9 +1436,6 @@ endif
 ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
-ifdef GETTEXT_POISON
-	BASIC_CFLAGS += -DGETTEXT_POISON
-endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT
 	USE_GETTEXT_SCHEME ?= fallthrough
@@ -2591,7 +2585,6 @@ ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
 	@echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@+
 endif
 	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
-	@echo GETTEXT_POISON=\''$(subst ','\'',$(subst ','\'',$(GETTEXT_POISON)))'\' >>$@+
 ifdef GIT_PERF_REPEAT_COUNT
 	@echo GIT_PERF_REPEAT_COUNT=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_REPEAT_COUNT)))'\' >>$@+
 endif
diff --git a/builtin.h b/builtin.h
index 962f0489ab..a40c56e7a2 100644
--- a/builtin.h
+++ b/builtin.h
@@ -219,6 +219,7 @@ extern int cmd_revert(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_serve(int argc, const char **argv, const char *prefix);
+extern int cmd_sh_i18n__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
diff --git a/builtin/sh-i18n--helper.c b/builtin/sh-i18n--helper.c
new file mode 100644
index 0000000000..1fba8b902b
--- /dev/null
+++ b/builtin/sh-i18n--helper.c
@@ -0,0 +1,27 @@
+#include "builtin.h"
+#include "cache.h"
+#include "config.h"
+#include "parse-options.h"
+
+static const char * const builtin_sh_i18n_helper_usage[] = {
+	N_("git sh-i18n--helper [<options>]"),
+	NULL
+};
+
+int cmd_sh_i18n__helper(int argc, const char **argv, const char *prefix)
+{
+	int poison = -1;
+	struct option options[] = {
+		OPT_BOOL(0, "git-test-gettext-poison", &poison,
+			 N_("is GIT_TEST_GETTEXT_POISON in effect?")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, NULL, options,
+			     builtin_sh_i18n_helper_usage, PARSE_OPT_KEEP_ARGV0);
+
+	if (poison != -1)
+		return !git_env_bool("GIT_TEST_GETTEXT_POISON", 0);
+
+	usage_with_options(builtin_sh_i18n_helper_usage, options);
+}
diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 06970f7213..6a89d0d7d8 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -123,7 +123,7 @@ osx-clang|osx-gcc)
 	# Travis CI OS X
 	export GIT_SKIP_TESTS="t9810 t9816"
 	;;
-GETTEXT_POISON)
-	export GETTEXT_POISON=YesPlease
+GIT_TEST_GETTEXT_POISON)
+	export GIT_TEST_GETTEXT_POISON=true
 	;;
 esac
diff --git a/gettext.c b/gettext.c
index 7272771c8e..722a2f726c 100644
--- a/gettext.c
+++ b/gettext.c
@@ -7,6 +7,7 @@
 #include "gettext.h"
 #include "strbuf.h"
 #include "utf8.h"
+#include "config.h"
 
 #ifndef NO_GETTEXT
 #	include <locale.h>
@@ -46,15 +47,13 @@ const char *get_preferred_languages(void)
 	return NULL;
 }
 
-#ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
 	static int poison_requested = -1;
 	if (poison_requested == -1)
-		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+		poison_requested = git_env_bool("GIT_TEST_GETTEXT_POISON", 0);
 	return poison_requested;
 }
-#endif
 
 #ifndef NO_GETTEXT
 static int test_vsnprintf(const char *fmt, ...)
diff --git a/gettext.h b/gettext.h
index 7eee64a34f..4c492d9f57 100644
--- a/gettext.h
+++ b/gettext.h
@@ -41,11 +41,7 @@ static inline int gettext_width(const char *s)
 }
 #endif
 
-#ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
-#endif
 
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
diff --git a/git-sh-i18n.sh b/git-sh-i18n.sh
index 9d065fb4bf..c0713b1ee9 100644
--- a/git-sh-i18n.sh
+++ b/git-sh-i18n.sh
@@ -17,7 +17,8 @@ export TEXTDOMAINDIR
 
 # First decide what scheme to use...
 GIT_INTERNAL_GETTEXT_SH_SCHEME=fallthrough
-if test -n "$GIT_GETTEXT_POISON"
+if test -n "$GIT_TEST_GETTEXT_POISON" &&
+	    git sh-i18n--helper --git-test-gettext-poison
 then
 	GIT_INTERNAL_GETTEXT_SH_SCHEME=poison
 elif test -n "@@USE_GETTEXT_SCHEME@@"
diff --git a/git.c b/git.c
index 5920f8019b..125c523720 100644
--- a/git.c
+++ b/git.c
@@ -539,6 +539,7 @@ static struct cmd_struct commands[] = {
 	{ "rm", cmd_rm, RUN_SETUP },
 	{ "send-pack", cmd_send_pack, RUN_SETUP },
 	{ "serve", cmd_serve, RUN_SETUP },
+	{ "sh-i18n--helper", cmd_sh_i18n__helper, 0 },
 	{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
 	{ "show", cmd_show, RUN_SETUP },
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
diff --git a/po/README b/po/README
index fef4c0f0b5..dba46c4a40 100644
--- a/po/README
+++ b/po/README
@@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
 version of the strings, e.g. to grep some error message or other
 output.
 
-To smoke out issues like these Git can be compiled with gettext poison
-support, at the top-level:
+To smoke out issues like these Git tested with a translation mode that
+emits gibberish on every call to gettext. To use it run the test suite
+with it, e.g.:
 
-    make GETTEXT_POISON=YesPlease
-
-That'll give you a git which emits gibberish on every call to
-gettext. It's obviously not meant to be installed, but you should run
-the test suite with it:
-
-    cd t && prove -j 9 ./t[0-9]*.sh
+    cd t && GIT_TEST_GETTEXT_POISON=true prove -j 9 ./t[0-9]*.sh
 
 If tests break with it you should inspect them manually and see if
 what you're translating is sane, i.e. that you're not translating
diff --git a/t/README b/t/README
index 8847489640..53c3dee7a9 100644
--- a/t/README
+++ b/t/README
@@ -301,6 +301,12 @@ that cannot be easily covered by a few specific test cases. These
 could be enabled by running the test suite with correct GIT_TEST_
 environment set.
 
+GIT_TEST_GETTEXT_POISON=<boolean> turns all strings marked for
+translation into gibberish. Used for spotting those tests that need to
+be marked with a C_LOCALE_OUTPUT prerequisite when adding more strings
+for translation. See "Testing marked strings" in po/README for
+details.
+
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
diff --git a/t/helper/test-gettext-poison.c b/t/helper/test-gettext-poison.c
new file mode 100644
index 0000000000..476be95da5
--- /dev/null
+++ b/t/helper/test-gettext-poison.c
@@ -0,0 +1,9 @@
+#include "test-tool.h"
+#include "git-compat-util.h"
+#include "thread-utils.h"
+#include "gettext.h"
+
+int cmd__gettext_poison(int argc, const char **argv)
+{
+	return use_gettext_poison() ? 0 : 1;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 6b5836dc1b..3e75672a37 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -19,6 +19,7 @@ static struct test_cmd cmds[] = {
 	{ "dump-untracked-cache", cmd__dump_untracked_cache },
 	{ "example-decorate", cmd__example_decorate },
 	{ "genrandom", cmd__genrandom },
+	{ "gettext-poison", cmd__gettext_poison },
 	{ "hashmap", cmd__hashmap },
 	{ "index-version", cmd__index_version },
 	{ "json-writer", cmd__json_writer },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index e4890566da..04f033b7fc 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -15,6 +15,7 @@ int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
+int cmd__gettext_poison(int argc, const char **argv);
 int cmd__hashmap(int argc, const char **argv);
 int cmd__index_version(int argc, const char **argv);
 int cmd__json_writer(int argc, const char **argv);
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 4d23373526..b6566003dd 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -274,7 +274,7 @@ test_expect_success 'pretend we have a mix of all possible results' "
 	EOF
 "
 
-test_expect_success 'test --verbose' '
+test_expect_success C_LOCALE_OUTPUT 'test --verbose' '
 	test_must_fail run_sub_test_lib_test \
 		test-verbose "test verbose" --verbose <<-\EOF &&
 	test_expect_success "passing test" true
diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
index 0392e36d23..2bdcf83808 100755
--- a/t/t3406-rebase-message.sh
+++ b/t/t3406-rebase-message.sh
@@ -77,7 +77,7 @@ test_expect_success 'rebase -n overrides config rebase.stat config' '
 #     "Does not point to a valid commit: invalid-ref"
 #
 # NEEDSWORK: This "grep" is fine in real non-C locales, but
-# GETTEXT_POISON poisons the refname along with the enclosing
+# GIT_TEST_GETTEXT_POISON poisons the refname along with the enclosing
 # error message.
 test_expect_success 'rebase --onto outputs the invalid ref' '
 	test_must_fail git rebase --onto invalid-ref HEAD HEAD 2>err &&
diff --git a/t/t7201-co.sh b/t/t7201-co.sh
index 826987ca80..cb2c8cf3f3 100755
--- a/t/t7201-co.sh
+++ b/t/t7201-co.sh
@@ -256,7 +256,10 @@ test_expect_success 'checkout to detach HEAD' '
 	git checkout -f renamer && git clean -f &&
 	git checkout renamer^ 2>messages &&
 	test_i18ngrep "HEAD is now at 7329388" messages &&
-	(test_line_count -gt 1 messages || test -n "$GETTEXT_POISON") &&
+	(
+		test_line_count -gt 1 messages ||
+		test_have_prereq GETTEXT_POISON
+	) &&
 	H=$(git rev-parse --verify HEAD) &&
 	M=$(git show-ref -s --verify refs/heads/master) &&
 	test "z$H" = "z$M" &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78d8c3783b..f46e21cfa0 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -755,16 +755,16 @@ test_cmp_bin() {
 
 # Use this instead of test_cmp to compare files that contain expected and
 # actual output from git commands that can be translated.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ncmp () {
-	test -n "$GETTEXT_POISON" || test_cmp "$@"
+	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
 }
 
 # Use this instead of "grep expected-string actual" to see if the
 # output from a git command that can be translated either contains an
 # expected string, or does not contain an unwanted one.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ngrep () {
 	eval "last_arg=\${$#}"
@@ -779,7 +779,7 @@ test_i18ngrep () {
 		error "bug in the test script: too few parameters to test_i18ngrep"
 	fi
 
-	if test -n "$GETTEXT_POISON"
+	if ! test_have_prereq C_LOCALE_OUTPUT
 	then
 		# pretend success
 		return 0
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 897e6fcc94..ec77aa57d4 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1104,15 +1104,8 @@ test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
-# Can we rely on git's output in the C locale?
-if test -n "$GETTEXT_POISON"
-then
-	GIT_GETTEXT_POISON=YesPlease
-	export GIT_GETTEXT_POISON
-	test_set_prereq GETTEXT_POISON
-else
-	test_set_prereq C_LOCALE_OUTPUT
-fi
+test-tool gettext-poison && test_set_prereq GETTEXT_POISON
+test-tool gettext-poison || test_set_prereq C_LOCALE_OUTPUT
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.19.1.568.g152ad8e336


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-23 21:01     ` [PATCH] i18n: make GETTEXT_POISON a runtime option Ævar Arnfjörð Bjarmason
@ 2018-10-24  5:45       ` Junio C Hamano
  2018-10-24  7:44         ` Jeff King
  2018-10-24 11:47         ` [PATCH v2] " Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-24  5:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, SZEDER Gábor, Vasco Almeida, Jiang Xin

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Notes on the implementation:
>
>  * The only reason we need a new "git-sh-i18n--helper" and the
>    corresponding "test-tool gettext-poison" is to expose
>    git_env_bool() to shellscripts, since git-sh-i18n and friends need
>    to inspect the $GIT_TEST_GETTEXT_POISON variable.
>
>    We only call these if $GIT_TEST_GETTEXT_POISON is set, or in the
>    test suite, and this code can go away for non-test code once the
>    last i18n-using shellscript is rewritten in C.

Makes me wonder if we want to teach "git config" a new "--env-bool"
option that can be used by third-party scripts.  Or would it be just
the matter of writing

	git config --default=false --type=bool "$GIT_WHATEVER_ENV"

in these third-party scripts and we do not need to add such a thing?

>  * The reason for not doing:
>
>        test_lazy_prereq GETTEXT_POISON 'test-tool gettext-poison'
>        test_lazy_prereq C_LOCALE_OUTPUT '! test-tool gettext-poison'
>
>    In test-lib.sh is because there's some interpolation problem with
>    that syntax which makes t6040-tracking-info.sh fail. Hence using
>    the simpler test_set_prereq.

s/In/in/, as you haven't finished the sentence yet.  But more
importantly, what is this "some interpolation problem"?  Are you
saying that test_lazy_prereq implementation is somehow broken and
cannot take certain strings?  If so, perhaps we want to fix that,
and people other than you can help to do so, once you let them know
what the problem is.

> See also
> https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/ for
> more discussion.

"See also [*1*] for more discussion" as you've already have
reference below.

>
> 1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>
> Here's a polished version of that. I think it makes sense to queue
> this up before any other refactoring of GETTEXT_POISON, and some patch
> to unconditionally preserve format specifiers as I suggested upthread
> could go on top of this.
> ...
> +int cmd_sh_i18n__helper(int argc, const char **argv, const char *prefix)
> +{
> +	int poison = -1;
> +	struct option options[] = {
> +		OPT_BOOL(0, "git-test-gettext-poison", &poison,
> +			 N_("is GIT_TEST_GETTEXT_POISON in effect?")),
> +		OPT_END()
> +	};
> +
> +	argc = parse_options(argc, argv, NULL, options,
> +			     builtin_sh_i18n_helper_usage, PARSE_OPT_KEEP_ARGV0);
> +
> +	if (poison != -1)
> +		return !git_env_bool("GIT_TEST_GETTEXT_POISON", 0);

Hmm?  If "--[no-]git-test-gettext-poison" is given, poison is either
0 or 1, and we return the value we read from the environment?  What
convoluted way to implement the option is that, or is there anything
subtle that I am not getting?

If the "default" parameter to git_env_bool() were poison, and then
the option was renamed to "--get-git-text-gettext-poison", then I
sort of understand the code, though (it's like "git config" with
"--default" option).

But if there is nothing subtle, it may make sense to implement this
as an opt-cmdmode instead.

> diff --git a/po/README b/po/README
> index fef4c0f0b5..dba46c4a40 100644
> --- a/po/README
> +++ b/po/README
> @@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
>  version of the strings, e.g. to grep some error message or other
>  output.
>  
> -To smoke out issues like these Git can be compiled with gettext poison
> -support, at the top-level:
> +To smoke out issues like these Git tested with a translation mode that
> +emits gibberish on every call to gettext. To use it run the test suite
> +with it, e.g.:

s/these Git tested/these, Git can be tested/; even though the comma
is not a new issue you introduced.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-24  5:45       ` Junio C Hamano
@ 2018-10-24  7:44         ` Jeff King
  2018-10-25  1:00           ` Junio C Hamano
  2018-10-24 11:47         ` [PATCH v2] " Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 49+ messages in thread
From: Jeff King @ 2018-10-24  7:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

On Wed, Oct 24, 2018 at 02:45:49PM +0900, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> 
> > Notes on the implementation:
> >
> >  * The only reason we need a new "git-sh-i18n--helper" and the
> >    corresponding "test-tool gettext-poison" is to expose
> >    git_env_bool() to shellscripts, since git-sh-i18n and friends need
> >    to inspect the $GIT_TEST_GETTEXT_POISON variable.
> >
> >    We only call these if $GIT_TEST_GETTEXT_POISON is set, or in the
> >    test suite, and this code can go away for non-test code once the
> >    last i18n-using shellscript is rewritten in C.
> 
> Makes me wonder if we want to teach "git config" a new "--env-bool"
> option that can be used by third-party scripts.  Or would it be just
> the matter of writing
> 
> 	git config --default=false --type=bool "$GIT_WHATEVER_ENV"
> 
> in these third-party scripts and we do not need to add such a thing?

The config command only takes names, not values. So you'd have to write
something like:

  git -c env.bool="$GIT_WHATEVER_ENV" config --type=bool env.bool

but then you lose the default handling. I think if we added a new
option, it would either be:

  # interpret a value directly; use default on empty, I guess?
  git config --default=false --type=bool --interpret-value "$GIT_WHATEVER_ENV"

or

  # less flexible, but the --default semantics are more obvious
  git config --default=false --type=bool --get-env GIT_WHATEVER_ENV

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2] i18n: make GETTEXT_POISON a runtime option
  2018-10-24  5:45       ` Junio C Hamano
  2018-10-24  7:44         ` Jeff King
@ 2018-10-24 11:47         ` Ævar Arnfjörð Bjarmason
  2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-24 11:47 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, SZEDER Gábor, Vasco Almeida,
	Jiang Xin, Ævar Arnfjörð Bjarmason

Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
test parameter to only be a GIT_TEST_GETTEXT_POISON=<boolean> runtime
parameter, to be consistent with other parameters documented in
"Running tests with special setups" in t/README.

When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
to simulate unfriendly translator", 2011-02-22) I was concerned with
ensuring that the _() function would get constant folded if NO_GETTEXT
was defined or if it wasn't and GETTEXT_POISON wasn't defined.

But as the benchmark in my [1] shows doing a one-off runtime
getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
originally added this has become the common idiom in the codebase for
turning on special test setups.

So change GETTEXT_POISON to work the same way. Now the
GETTEXT_POISON=YesPlease compile-time option is gone, and running the
tests with GIT_TEST_GETTEXT_POISON=[true|false] can be toggled on/off
without recompiling.

This allows for conditionally amending tests to test with/without
poison, similar to what 859fdc0c3c ("commit-graph: define
GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
some of that, now we e.g. always run the t0205-gettext-poison.sh test.

I did enough there to remove the GETTEXT_POISON prerequisite, but its
inverse C_LOCALE_OUTPUT is still around, and surely some tests using
it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=false.

Notes on the implementation:

 * The only reason we need a new "git-sh-i18n--helper" is to expose
   git_env_bool() to shellscripts, since git-sh-i18n and friends need
   to inspect the $GIT_TEST_GETTEXT_POISON variable.

   We only call it if $GIT_TEST_GETTEXT_POISON is set, or in the test
   suite. This code can go away for non-test code once the last
   i18n-using shellscript is rewritten in C, or if "git config" learns
   to combine --bool and a value fed into it, see [2].

 * We still compile a dedicated GETTEXT_POISON build in Travis CI,
   this is probably the wrong thing to do and should be followed-up
   with something similar to ae59a4e44f ("travis: run tests with
   GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
   for running in the GIT_TEST_GETTEXT_POISON mode.

 * The reason for not doing:

       test_lazy_prereq C_LOCALE_OUTPUT [...]

   in test-lib.sh is because there's some interpolation problem with
   that syntax which makes t6040-tracking-info.sh fail. Hence using
   the simpler test_set_prereq. It'll fail with:

       + git branch -vv
       + sed -n -e
       mkdir -p "$TRASH_DIRECTORY/prereq-test-dir" &&
       (
               cd "$TRASH_DIRECTORY/prereq-test-dir" &&! git sh-i18n--helper --git-test-gettext-poison
       )
       sed: -e expression #1, char 2: unknown command: `m'
       error: last command exited with $?=1
       not ok 3 - branch -vv

   This is some interpolation problem with how test_lazy_prereq works
   that doesn't affect test_set_prereq.

 * We now skip a test in t0000-basic.sh under
   GIT_TEST_GETTEXT_POISON=true that wasn't skipped before. This test
   relies on C locale output, but due to an edge case in how the
   previous implementation of GETTEXT_POISON worked (reading it from
   GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
   and needs to be skipped.

See also
https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/ for
more discussion.

1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/

2. https://public-inbox.org/git/20181024074400.GA31239@sigill.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

On Wed, Oct 24 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Notes on the implementation:
>>
>>  * The only reason we need a new "git-sh-i18n--helper" and the
>>    corresponding "test-tool gettext-poison" is to expose
>>    git_env_bool() to shellscripts, since git-sh-i18n and friends need
>>    to inspect the $GIT_TEST_GETTEXT_POISON variable.
>>
>>    We only call these if $GIT_TEST_GETTEXT_POISON is set, or in the
>>    test suite, and this code can go away for non-test code once the
>>    last i18n-using shellscript is rewritten in C.
>
> Makes me wonder if we want to teach "git config" a new "--env-bool"
> option that can be used by third-party scripts.  Or would it be just
> the matter of writing
>
> 	git config --default=false --type=bool "$GIT_WHATEVER_ENV"
>
> in these third-party scripts and we do not need to add such a thing?

As Jeff notes in a follow-up reply git-config can't quite do this
yet. So in lieu of implementing that I have a more minimal helper,
which could be migrated to something like the --get-env mode Jeff is
talking about if we add that.

>>  * The reason for not doing:
>>
>>        test_lazy_prereq GETTEXT_POISON 'test-tool gettext-poison'
>>        test_lazy_prereq C_LOCALE_OUTPUT '! test-tool gettext-poison'
>>
>>    In test-lib.sh is because there's some interpolation problem with
>>    that syntax which makes t6040-tracking-info.sh fail. Hence using
>>    the simpler test_set_prereq.
>
> s/In/in/, as you haven't finished the sentence yet.  But more
> importantly, what is this "some interpolation problem"?  Are you
> saying that test_lazy_prereq implementation is somehow broken and
> cannot take certain strings?  If so, perhaps we want to fix that,
> and people other than you can help to do so, once you let them know
> what the problem is.

I haven't fixed (don't know what the fix is) this issue, but the new
version elaborates on what the error is.

>> See also
>> https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/ for
>> more discussion.
>
> "See also [*1*] for more discussion" as you've already have
> reference below.

Thanks, mispasted E-Mail ID. Fixed.

>>
>> 1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>
>> Here's a polished version of that. I think it makes sense to queue
>> this up before any other refactoring of GETTEXT_POISON, and some patch
>> to unconditionally preserve format specifiers as I suggested upthread
>> could go on top of this.
>> ...
>> +int cmd_sh_i18n__helper(int argc, const char **argv, const char *prefix)
>> +{
>> +	int poison = -1;
>> +	struct option options[] = {
>> +		OPT_BOOL(0, "git-test-gettext-poison", &poison,
>> +			 N_("is GIT_TEST_GETTEXT_POISON in effect?")),
>> +		OPT_END()
>> +	};
>> +
>> +	argc = parse_options(argc, argv, NULL, options,
>> +			     builtin_sh_i18n_helper_usage, PARSE_OPT_KEEP_ARGV0);
>> +
>> +	if (poison != -1)
>> +		return !git_env_bool("GIT_TEST_GETTEXT_POISON", 0);
>
> Hmm?  If "--[no-]git-test-gettext-poison" is given, poison is either
> 0 or 1, and we return the value we read from the environment?  What
> convoluted way to implement the option is that, or is there anything
> subtle that I am not getting?
>
> If the "default" parameter to git_env_bool() were poison, and then
> the option was renamed to "--get-git-text-gettext-poison", then I
> sort of understand the code, though (it's like "git config" with
> "--default" option).

Yeah this didn't make much sense. The test helper is now gone, and the
main helper doesn't use git_env_bool() but the utility function in
gettext.h.

> But if there is nothing subtle, it may make sense to implement this
> as an opt-cmdmode instead.

You mean something that works like "helper foo" instead of an option
via "helper --foo"? Seemed easier just to use the opt parse mechanism.

>> diff --git a/po/README b/po/README
>> index fef4c0f0b5..dba46c4a40 100644
>> --- a/po/README
>> +++ b/po/README
>> @@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
>>  version of the strings, e.g. to grep some error message or other
>>  output.
>>  
>> -To smoke out issues like these Git can be compiled with gettext poison
>> -support, at the top-level:
>> +To smoke out issues like these Git tested with a translation mode that
>> +emits gibberish on every call to gettext. To use it run the test suite
>> +with it, e.g.:
>
> s/these Git tested/these, Git can be tested/; even though the comma
> is not a new issue you introduced.

Fixed. Range-diff of the whole thing below:

1:  c93bf2f23f ! 1:  7bcb95a82d i18n: make GETTEXT_POISON a runtime option
    @@ -24,19 +24,23 @@
     
         This allows for conditionally amending tests to test with/without
         poison, similar to what 859fdc0c3c ("commit-graph: define
    -    GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH, but
    -    this patch doesn't change any of the existing tests to work like that.
    +    GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
    +    some of that, now we e.g. always run the t0205-gettext-poison.sh test.
    +
    +    I did enough there to remove the GETTEXT_POISON prerequisite, but its
    +    inverse C_LOCALE_OUTPUT is still around, and surely some tests using
    +    it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=false.
     
         Notes on the implementation:
     
    -     * The only reason we need a new "git-sh-i18n--helper" and the
    -       corresponding "test-tool gettext-poison" is to expose
    +     * The only reason we need a new "git-sh-i18n--helper" is to expose
            git_env_bool() to shellscripts, since git-sh-i18n and friends need
            to inspect the $GIT_TEST_GETTEXT_POISON variable.
     
    -       We only call these if $GIT_TEST_GETTEXT_POISON is set, or in the
    -       test suite, and this code can go away for non-test code once the
    -       last i18n-using shellscript is rewritten in C.
    +       We only call it if $GIT_TEST_GETTEXT_POISON is set, or in the test
    +       suite. This code can go away for non-test code once the last
    +       i18n-using shellscript is rewritten in C, or if "git config" learns
    +       to combine --bool and a value fed into it, see [2].
     
          * We still compile a dedicated GETTEXT_POISON build in Travis CI,
            this is probably the wrong thing to do and should be followed-up
    @@ -46,19 +50,40 @@
     
          * The reason for not doing:
     
    -           test_lazy_prereq GETTEXT_POISON 'test-tool gettext-poison'
    -           test_lazy_prereq C_LOCALE_OUTPUT '! test-tool gettext-poison'
    +           test_lazy_prereq C_LOCALE_OUTPUT [...]
     
    -       In test-lib.sh is because there's some interpolation problem with
    +       in test-lib.sh is because there's some interpolation problem with
            that syntax which makes t6040-tracking-info.sh fail. Hence using
    -       the simpler test_set_prereq.
    +       the simpler test_set_prereq. It'll fail with:
    +
    +           + git branch -vv
    +           + sed -n -e
    +           mkdir -p "$TRASH_DIRECTORY/prereq-test-dir" &&
    +           (
    +                   cd "$TRASH_DIRECTORY/prereq-test-dir" &&! git sh-i18n--helper --git-test-gettext-poison
    +           )
    +           sed: -e expression #1, char 2: unknown command: `m'
    +           error: last command exited with $?=1
    +           not ok 3 - branch -vv
    +
    +       This is some interpolation problem with how test_lazy_prereq works
    +       that doesn't affect test_set_prereq.
    +
    +     * We now skip a test in t0000-basic.sh under
    +       GIT_TEST_GETTEXT_POISON=true that wasn't skipped before. This test
    +       relies on C locale output, but due to an edge case in how the
    +       previous implementation of GETTEXT_POISON worked (reading it from
    +       GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
    +       and needs to be skipped.
     
         See also
    -    https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/ for
    +    https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/ for
         more discussion.
     
         1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
     
    +    2. https://public-inbox.org/git/20181024074400.GA31239@sigill.intra.peff.net/
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      diff --git a/.gitignore b/.gitignore
    @@ -101,14 +126,6 @@
      # Define JSMIN to point to JavaScript minifier that functions as
      # a filter to have gitweb.js minified.
      #
    -@@
    - TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
    - TEST_BUILTINS_OBJS += test-example-decorate.o
    - TEST_BUILTINS_OBJS += test-genrandom.o
    -+TEST_BUILTINS_OBJS += test-gettext-poison.o
    - TEST_BUILTINS_OBJS += test-hashmap.o
    - TEST_BUILTINS_OBJS += test-index-version.o
    - TEST_BUILTINS_OBJS += test-json-writer.o
     @@
      BUILTIN_OBJS += builtin/rm.o
      BUILTIN_OBJS += builtin/send-pack.o
    @@ -167,8 +184,9 @@
     +{
     +	int poison = -1;
     +	struct option options[] = {
    -+		OPT_BOOL(0, "git-test-gettext-poison", &poison,
    -+			 N_("is GIT_TEST_GETTEXT_POISON in effect?")),
    ++		OPT_BOOL_F(0, "git-test-gettext-poison", &poison,
    ++			   N_("is GIT_TEST_GETTEXT_POISON in effect?"),
    ++			   PARSE_OPT_NONEG),
     +		OPT_END()
     +	};
     +
    @@ -176,7 +194,7 @@
     +			     builtin_sh_i18n_helper_usage, PARSE_OPT_KEEP_ARGV0);
     +
     +	if (poison != -1)
    -+		return !git_env_bool("GIT_TEST_GETTEXT_POISON", 0);
    ++		return !use_gettext_poison();
     +
     +	usage_with_options(builtin_sh_i18n_helper_usage, options);
     +}
    @@ -275,7 +293,7 @@
      
     -To smoke out issues like these Git can be compiled with gettext poison
     -support, at the top-level:
    -+To smoke out issues like these Git tested with a translation mode that
    ++To smoke out issues like these, Git tested with a translation mode that
     +emits gibberish on every call to gettext. To use it run the test suite
     +with it, e.g.:
      
    @@ -308,44 +326,18 @@
      test suite. Accept any boolean values that are accepted by git-config.
      
     
    - diff --git a/t/helper/test-gettext-poison.c b/t/helper/test-gettext-poison.c
    - new file mode 100644
    - --- /dev/null
    - +++ b/t/helper/test-gettext-poison.c
    + diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh
    + --- a/t/lib-gettext.sh
    + +++ b/t/lib-gettext.sh
     @@
    -+#include "test-tool.h"
    -+#include "git-compat-util.h"
    -+#include "thread-utils.h"
    -+#include "gettext.h"
    -+
    -+int cmd__gettext_poison(int argc, const char **argv)
    -+{
    -+	return use_gettext_poison() ? 0 : 1;
    -+}
    -
    - diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
    - --- a/t/helper/test-tool.c
    - +++ b/t/helper/test-tool.c
    -@@
    - 	{ "dump-untracked-cache", cmd__dump_untracked_cache },
    - 	{ "example-decorate", cmd__example_decorate },
    - 	{ "genrandom", cmd__genrandom },
    -+	{ "gettext-poison", cmd__gettext_poison },
    - 	{ "hashmap", cmd__hashmap },
    - 	{ "index-version", cmd__index_version },
    - 	{ "json-writer", cmd__json_writer },
    -
    - diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
    - --- a/t/helper/test-tool.h
    - +++ b/t/helper/test-tool.h
    -@@
    - int cmd__dump_untracked_cache(int argc, const char **argv);
    - int cmd__example_decorate(int argc, const char **argv);
    - int cmd__genrandom(int argc, const char **argv);
    -+int cmd__gettext_poison(int argc, const char **argv);
    - int cmd__hashmap(int argc, const char **argv);
    - int cmd__index_version(int argc, const char **argv);
    - int cmd__json_writer(int argc, const char **argv);
    + 
    + . "$GIT_BUILD_DIR"/git-sh-i18n
    + 
    +-if test_have_prereq GETTEXT && ! test_have_prereq GETTEXT_POISON
    ++if test_have_prereq GETTEXT && test_have_prereq C_LOCALE_OUTPUT
    + then
    + 	# is_IS.UTF-8 on Solaris and FreeBSD, is_IS.utf8 on Debian
    + 	is_IS_locale=$(locale -a 2>/dev/null |
     
      diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
      --- a/t/t0000-basic.sh
    @@ -360,6 +352,37 @@
      		test-verbose "test verbose" --verbose <<-\EOF &&
      	test_expect_success "passing test" true
     
    + diff --git a/t/t0205-gettext-poison.sh b/t/t0205-gettext-poison.sh
    + --- a/t/t0205-gettext-poison.sh
    + +++ b/t/t0205-gettext-poison.sh
    +@@
    + 
    + test_description='Gettext Shell poison'
    + 
    ++GIT_TEST_GETTEXT_POISON=true
    ++export GIT_TEST_GETTEXT_POISON
    + . ./lib-gettext.sh
    + 
    +-test_expect_success GETTEXT_POISON 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
    ++test_expect_success 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
    +     test "$GIT_INTERNAL_GETTEXT_SH_SCHEME" = "poison"
    + '
    + 
    +-test_expect_success GETTEXT_POISON 'gettext: our gettext() fallback has poison semantics' '
    ++test_expect_success 'gettext: our gettext() fallback has poison semantics' '
    +     printf "# GETTEXT POISON #" >expect &&
    +     gettext "test" >actual &&
    +     test_cmp expect actual &&
    +@@
    +     test_cmp expect actual
    + '
    + 
    +-test_expect_success GETTEXT_POISON 'eval_gettext: our eval_gettext() fallback has poison semantics' '
    ++test_expect_success 'eval_gettext: our eval_gettext() fallback has poison semantics' '
    +     printf "# GETTEXT POISON #" >expect &&
    +     eval_gettext "test" >actual &&
    +     test_cmp expect actual &&
    +
      diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
      --- a/t/t3406-rebase-message.sh
      +++ b/t/t3406-rebase-message.sh
    @@ -377,18 +400,33 @@
      --- a/t/t7201-co.sh
      +++ b/t/t7201-co.sh
     @@
    + test_expect_success 'checkout to detach HEAD' '
    + 	git config advice.detachedHead true &&
      	git checkout -f renamer && git clean -f &&
    - 	git checkout renamer^ 2>messages &&
    - 	test_i18ngrep "HEAD is now at 7329388" messages &&
    +-	git checkout renamer^ 2>messages &&
    +-	test_i18ngrep "HEAD is now at 7329388" messages &&
     -	(test_line_count -gt 1 messages || test -n "$GETTEXT_POISON") &&
    -+	(
    -+		test_line_count -gt 1 messages ||
    -+		test_have_prereq GETTEXT_POISON
    -+	) &&
    ++	GIT_TEST_GETTEXT_POISON=false git checkout renamer^ 2>messages &&
    ++	grep "HEAD is now at 7329388" messages &&
    ++	test_line_count -gt 1 messages &&
      	H=$(git rev-parse --verify HEAD) &&
      	M=$(git show-ref -s --verify refs/heads/master) &&
      	test "z$H" = "z$M" &&
     
    + diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
    + --- a/t/t9902-completion.sh
    + +++ b/t/t9902-completion.sh
    +@@
    + 	verbose test -z "$__git_all_commands"
    + '
    + 
    +-test_expect_success !GETTEXT_POISON 'sourcing the completion script clears cached merge strategies' '
    ++test_expect_success 'sourcing the completion script clears cached merge strategies' '
    ++	GIT_TEST_GETTEXT_POISON=false &&
    + 	__git_compute_merge_strategies &&
    + 	verbose test -n "$__git_merge_strategies" &&
    + 	. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
    +
      diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
      --- a/t/test-lib-functions.sh
      +++ b/t/test-lib-functions.sh
    @@ -417,7 +455,7 @@
      	fi
      
     -	if test -n "$GETTEXT_POISON"
    -+	if ! test_have_prereq C_LOCALE_OUTPUT
    ++	if test_have_prereq !C_LOCALE_OUTPUT
      	then
      		# pretend success
      		return 0
    @@ -438,8 +476,9 @@
     -else
     -	test_set_prereq C_LOCALE_OUTPUT
     -fi
    -+test-tool gettext-poison && test_set_prereq GETTEXT_POISON
    -+test-tool gettext-poison || test_set_prereq C_LOCALE_OUTPUT
    ++## Fails in the 'branch -vv' test in t6040-tracking-info.sh
    ++#test_lazy_prereq C_LOCALE_OUTPUT '! git sh-i18n--helper --git-test-gettext-poison'
    ++git sh-i18n--helper --git-test-gettext-poison || test_set_prereq C_LOCALE_OUTPUT
      
      if test -z "$GIT_TEST_CHECK_CACHE_TREE"
      then

 .gitignore                |  1 +
 .travis.yml               |  2 +-
 Makefile                  | 10 +---------
 builtin.h                 |  1 +
 builtin/sh-i18n--helper.c | 28 ++++++++++++++++++++++++++++
 ci/lib-travisci.sh        |  4 ++--
 gettext.c                 |  5 ++---
 gettext.h                 |  4 ----
 git-sh-i18n.sh            |  3 ++-
 git.c                     |  1 +
 po/README                 | 13 ++++---------
 t/README                  |  6 ++++++
 t/lib-gettext.sh          |  2 +-
 t/t0000-basic.sh          |  2 +-
 t/t0205-gettext-poison.sh |  8 +++++---
 t/t3406-rebase-message.sh |  2 +-
 t/t7201-co.sh             |  6 +++---
 t/t9902-completion.sh     |  3 ++-
 t/test-lib-functions.sh   |  8 ++++----
 t/test-lib.sh             | 12 +++---------
 20 files changed, 69 insertions(+), 52 deletions(-)
 create mode 100644 builtin/sh-i18n--helper.c

diff --git a/.gitignore b/.gitignore
index 9d1363a1eb..f7b7977910 100644
--- a/.gitignore
+++ b/.gitignore
@@ -148,6 +148,7 @@
 /git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
+/git-sh-i18n--helper
 /git-sh-setup
 /git-sh-i18n
 /git-shell
diff --git a/.travis.yml b/.travis.yml
index 4d4e26c9df..4523a2e5ec 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -26,7 +26,7 @@ addons:
 
 matrix:
   include:
-    - env: jobname=GETTEXT_POISON
+    - env: jobname=GIT_TEST_GETTEXT_POISON
       os: linux
       compiler:
       addons:
diff --git a/Makefile b/Makefile
index d18ab0fe78..086600b099 100644
--- a/Makefile
+++ b/Makefile
@@ -362,11 +362,6 @@ all::
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
 # user.
 #
-# Define GETTEXT_POISON if you are debugging the choice of strings marked
-# for translation.  In a GETTEXT_POISON build, you can turn all strings marked
-# for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
-#
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
 #
@@ -1099,6 +1094,7 @@ BUILTIN_OBJS += builtin/revert.o
 BUILTIN_OBJS += builtin/rm.o
 BUILTIN_OBJS += builtin/send-pack.o
 BUILTIN_OBJS += builtin/serve.o
+BUILTIN_OBJS += builtin/sh-i18n--helper.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-index.o
@@ -1439,9 +1435,6 @@ endif
 ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
-ifdef GETTEXT_POISON
-	BASIC_CFLAGS += -DGETTEXT_POISON
-endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT
 	USE_GETTEXT_SCHEME ?= fallthrough
@@ -2591,7 +2584,6 @@ ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
 	@echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@+
 endif
 	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
-	@echo GETTEXT_POISON=\''$(subst ','\'',$(subst ','\'',$(GETTEXT_POISON)))'\' >>$@+
 ifdef GIT_PERF_REPEAT_COUNT
 	@echo GIT_PERF_REPEAT_COUNT=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_REPEAT_COUNT)))'\' >>$@+
 endif
diff --git a/builtin.h b/builtin.h
index 962f0489ab..a40c56e7a2 100644
--- a/builtin.h
+++ b/builtin.h
@@ -219,6 +219,7 @@ extern int cmd_revert(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_serve(int argc, const char **argv, const char *prefix);
+extern int cmd_sh_i18n__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
diff --git a/builtin/sh-i18n--helper.c b/builtin/sh-i18n--helper.c
new file mode 100644
index 0000000000..aa62787514
--- /dev/null
+++ b/builtin/sh-i18n--helper.c
@@ -0,0 +1,28 @@
+#include "builtin.h"
+#include "cache.h"
+#include "config.h"
+#include "parse-options.h"
+
+static const char * const builtin_sh_i18n_helper_usage[] = {
+	N_("git sh-i18n--helper [<options>]"),
+	NULL
+};
+
+int cmd_sh_i18n__helper(int argc, const char **argv, const char *prefix)
+{
+	int poison = -1;
+	struct option options[] = {
+		OPT_BOOL_F(0, "git-test-gettext-poison", &poison,
+			   N_("is GIT_TEST_GETTEXT_POISON in effect?"),
+			   PARSE_OPT_NONEG),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, NULL, options,
+			     builtin_sh_i18n_helper_usage, PARSE_OPT_KEEP_ARGV0);
+
+	if (poison != -1)
+		return !use_gettext_poison();
+
+	usage_with_options(builtin_sh_i18n_helper_usage, options);
+}
diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 06970f7213..6a89d0d7d8 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -123,7 +123,7 @@ osx-clang|osx-gcc)
 	# Travis CI OS X
 	export GIT_SKIP_TESTS="t9810 t9816"
 	;;
-GETTEXT_POISON)
-	export GETTEXT_POISON=YesPlease
+GIT_TEST_GETTEXT_POISON)
+	export GIT_TEST_GETTEXT_POISON=true
 	;;
 esac
diff --git a/gettext.c b/gettext.c
index 7272771c8e..722a2f726c 100644
--- a/gettext.c
+++ b/gettext.c
@@ -7,6 +7,7 @@
 #include "gettext.h"
 #include "strbuf.h"
 #include "utf8.h"
+#include "config.h"
 
 #ifndef NO_GETTEXT
 #	include <locale.h>
@@ -46,15 +47,13 @@ const char *get_preferred_languages(void)
 	return NULL;
 }
 
-#ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
 	static int poison_requested = -1;
 	if (poison_requested == -1)
-		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+		poison_requested = git_env_bool("GIT_TEST_GETTEXT_POISON", 0);
 	return poison_requested;
 }
-#endif
 
 #ifndef NO_GETTEXT
 static int test_vsnprintf(const char *fmt, ...)
diff --git a/gettext.h b/gettext.h
index 7eee64a34f..4c492d9f57 100644
--- a/gettext.h
+++ b/gettext.h
@@ -41,11 +41,7 @@ static inline int gettext_width(const char *s)
 }
 #endif
 
-#ifdef GETTEXT_POISON
 extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
-#endif
 
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
diff --git a/git-sh-i18n.sh b/git-sh-i18n.sh
index 9d065fb4bf..c0713b1ee9 100644
--- a/git-sh-i18n.sh
+++ b/git-sh-i18n.sh
@@ -17,7 +17,8 @@ export TEXTDOMAINDIR
 
 # First decide what scheme to use...
 GIT_INTERNAL_GETTEXT_SH_SCHEME=fallthrough
-if test -n "$GIT_GETTEXT_POISON"
+if test -n "$GIT_TEST_GETTEXT_POISON" &&
+	    git sh-i18n--helper --git-test-gettext-poison
 then
 	GIT_INTERNAL_GETTEXT_SH_SCHEME=poison
 elif test -n "@@USE_GETTEXT_SCHEME@@"
diff --git a/git.c b/git.c
index 5920f8019b..125c523720 100644
--- a/git.c
+++ b/git.c
@@ -539,6 +539,7 @@ static struct cmd_struct commands[] = {
 	{ "rm", cmd_rm, RUN_SETUP },
 	{ "send-pack", cmd_send_pack, RUN_SETUP },
 	{ "serve", cmd_serve, RUN_SETUP },
+	{ "sh-i18n--helper", cmd_sh_i18n__helper, 0 },
 	{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
 	{ "show", cmd_show, RUN_SETUP },
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
diff --git a/po/README b/po/README
index fef4c0f0b5..07595d369b 100644
--- a/po/README
+++ b/po/README
@@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
 version of the strings, e.g. to grep some error message or other
 output.
 
-To smoke out issues like these Git can be compiled with gettext poison
-support, at the top-level:
+To smoke out issues like these, Git tested with a translation mode that
+emits gibberish on every call to gettext. To use it run the test suite
+with it, e.g.:
 
-    make GETTEXT_POISON=YesPlease
-
-That'll give you a git which emits gibberish on every call to
-gettext. It's obviously not meant to be installed, but you should run
-the test suite with it:
-
-    cd t && prove -j 9 ./t[0-9]*.sh
+    cd t && GIT_TEST_GETTEXT_POISON=true prove -j 9 ./t[0-9]*.sh
 
 If tests break with it you should inspect them manually and see if
 what you're translating is sane, i.e. that you're not translating
diff --git a/t/README b/t/README
index 8847489640..53c3dee7a9 100644
--- a/t/README
+++ b/t/README
@@ -301,6 +301,12 @@ that cannot be easily covered by a few specific test cases. These
 could be enabled by running the test suite with correct GIT_TEST_
 environment set.
 
+GIT_TEST_GETTEXT_POISON=<boolean> turns all strings marked for
+translation into gibberish. Used for spotting those tests that need to
+be marked with a C_LOCALE_OUTPUT prerequisite when adding more strings
+for translation. See "Testing marked strings" in po/README for
+details.
+
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh
index eec757f104..755f421431 100644
--- a/t/lib-gettext.sh
+++ b/t/lib-gettext.sh
@@ -12,7 +12,7 @@ export GIT_TEXTDOMAINDIR GIT_PO_PATH
 
 . "$GIT_BUILD_DIR"/git-sh-i18n
 
-if test_have_prereq GETTEXT && ! test_have_prereq GETTEXT_POISON
+if test_have_prereq GETTEXT && test_have_prereq C_LOCALE_OUTPUT
 then
 	# is_IS.UTF-8 on Solaris and FreeBSD, is_IS.utf8 on Debian
 	is_IS_locale=$(locale -a 2>/dev/null |
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 4d23373526..b6566003dd 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -274,7 +274,7 @@ test_expect_success 'pretend we have a mix of all possible results' "
 	EOF
 "
 
-test_expect_success 'test --verbose' '
+test_expect_success C_LOCALE_OUTPUT 'test --verbose' '
 	test_must_fail run_sub_test_lib_test \
 		test-verbose "test verbose" --verbose <<-\EOF &&
 	test_expect_success "passing test" true
diff --git a/t/t0205-gettext-poison.sh b/t/t0205-gettext-poison.sh
index 438e778d6a..1675d3e171 100755
--- a/t/t0205-gettext-poison.sh
+++ b/t/t0205-gettext-poison.sh
@@ -5,13 +5,15 @@
 
 test_description='Gettext Shell poison'
 
+GIT_TEST_GETTEXT_POISON=true
+export GIT_TEST_GETTEXT_POISON
 . ./lib-gettext.sh
 
-test_expect_success GETTEXT_POISON 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
+test_expect_success 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
     test "$GIT_INTERNAL_GETTEXT_SH_SCHEME" = "poison"
 '
 
-test_expect_success GETTEXT_POISON 'gettext: our gettext() fallback has poison semantics' '
+test_expect_success 'gettext: our gettext() fallback has poison semantics' '
     printf "# GETTEXT POISON #" >expect &&
     gettext "test" >actual &&
     test_cmp expect actual &&
@@ -20,7 +22,7 @@ test_expect_success GETTEXT_POISON 'gettext: our gettext() fallback has poison s
     test_cmp expect actual
 '
 
-test_expect_success GETTEXT_POISON 'eval_gettext: our eval_gettext() fallback has poison semantics' '
+test_expect_success 'eval_gettext: our eval_gettext() fallback has poison semantics' '
     printf "# GETTEXT POISON #" >expect &&
     eval_gettext "test" >actual &&
     test_cmp expect actual &&
diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
index 0392e36d23..2bdcf83808 100755
--- a/t/t3406-rebase-message.sh
+++ b/t/t3406-rebase-message.sh
@@ -77,7 +77,7 @@ test_expect_success 'rebase -n overrides config rebase.stat config' '
 #     "Does not point to a valid commit: invalid-ref"
 #
 # NEEDSWORK: This "grep" is fine in real non-C locales, but
-# GETTEXT_POISON poisons the refname along with the enclosing
+# GIT_TEST_GETTEXT_POISON poisons the refname along with the enclosing
 # error message.
 test_expect_success 'rebase --onto outputs the invalid ref' '
 	test_must_fail git rebase --onto invalid-ref HEAD HEAD 2>err &&
diff --git a/t/t7201-co.sh b/t/t7201-co.sh
index 826987ca80..71e8e0167f 100755
--- a/t/t7201-co.sh
+++ b/t/t7201-co.sh
@@ -254,9 +254,9 @@ test_expect_success 'checkout to detach HEAD (with advice declined)' '
 test_expect_success 'checkout to detach HEAD' '
 	git config advice.detachedHead true &&
 	git checkout -f renamer && git clean -f &&
-	git checkout renamer^ 2>messages &&
-	test_i18ngrep "HEAD is now at 7329388" messages &&
-	(test_line_count -gt 1 messages || test -n "$GETTEXT_POISON") &&
+	GIT_TEST_GETTEXT_POISON=false git checkout renamer^ 2>messages &&
+	grep "HEAD is now at 7329388" messages &&
+	test_line_count -gt 1 messages &&
 	H=$(git rev-parse --verify HEAD) &&
 	M=$(git show-ref -s --verify refs/heads/master) &&
 	test "z$H" = "z$M" &&
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 175f83d704..537f5fdada 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1697,7 +1697,8 @@ test_expect_success 'sourcing the completion script clears cached commands' '
 	verbose test -z "$__git_all_commands"
 '
 
-test_expect_success !GETTEXT_POISON 'sourcing the completion script clears cached merge strategies' '
+test_expect_success 'sourcing the completion script clears cached merge strategies' '
+	GIT_TEST_GETTEXT_POISON=false &&
 	__git_compute_merge_strategies &&
 	verbose test -n "$__git_merge_strategies" &&
 	. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78d8c3783b..2f42b3653c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -755,16 +755,16 @@ test_cmp_bin() {
 
 # Use this instead of test_cmp to compare files that contain expected and
 # actual output from git commands that can be translated.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ncmp () {
-	test -n "$GETTEXT_POISON" || test_cmp "$@"
+	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
 }
 
 # Use this instead of "grep expected-string actual" to see if the
 # output from a git command that can be translated either contains an
 # expected string, or does not contain an unwanted one.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ngrep () {
 	eval "last_arg=\${$#}"
@@ -779,7 +779,7 @@ test_i18ngrep () {
 		error "bug in the test script: too few parameters to test_i18ngrep"
 	fi
 
-	if test -n "$GETTEXT_POISON"
+	if test_have_prereq !C_LOCALE_OUTPUT
 	then
 		# pretend success
 		return 0
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 897e6fcc94..892591ac84 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1104,15 +1104,9 @@ test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
-# Can we rely on git's output in the C locale?
-if test -n "$GETTEXT_POISON"
-then
-	GIT_GETTEXT_POISON=YesPlease
-	export GIT_GETTEXT_POISON
-	test_set_prereq GETTEXT_POISON
-else
-	test_set_prereq C_LOCALE_OUTPUT
-fi
+## Fails in the 'branch -vv' test in t6040-tracking-info.sh
+#test_lazy_prereq C_LOCALE_OUTPUT '! git sh-i18n--helper --git-test-gettext-poison'
+git sh-i18n--helper --git-test-gettext-poison || test_set_prereq C_LOCALE_OUTPUT
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.19.1.568.g152ad8e336


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-23 16:45         ` Ævar Arnfjörð Bjarmason
@ 2018-10-24 14:41           ` Duy Nguyen
  2018-10-24 17:54             ` Ævar Arnfjörð Bjarmason
  2018-10-25  3:52             ` Junio C Hamano
  0 siblings, 2 replies; 49+ messages in thread
From: Duy Nguyen @ 2018-10-24 14:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Git Mailing List, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

On Tue, Oct 23, 2018 at 6:45 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> >> The effect of what I'm suggesting here, and which my WIP patch in
> >> <875zxtd59e.fsf@evledraar.gmail.com> implements is that we'd do a
> >> one-time getenv() for each process that prints a _() message that we
> >> aren't doing now, and for each message call a function that would check
> >> a boolean "are we in poison mode" static global variable.
> >
> > Just don't do the getenv() inside _() even if you just do it one time.
> > getenv() may invalidate whatever value from the previous call. I would
> > not want to find out my code broke because of an getenv() inside some
> > innocent _()...
>
> How would any code break? We have various getenv("GIT_TEST_*...")
> already that work the same way. Unless you set that in the environment
> the function is idempotent, and I don't see how anyone would do that
> accidentally.

I didn't check those GIT_TEST_ but I think the difference is in
complexity. When you write

 var = getenv("foo");
 complexFunction();

you probably start to feel scary and wrap getenv() with a strdup()
because you usually don't know exactly what complexFunction() can call
(and even if you do, you can't be sure what it may call in the
future).

The person who writes

 printf(_("%s"), getenv("foo"));

may not go through the same thought process as with complexFunction().
If _() calls getenv(), because you the order of parameter evaluation
is unspecified, you cannot be sure if getenv("foo") will be called
before or after the one inside _(). One of them may screw up the
other.

> > And we should still respect NO_GETTEXT, not doing any extra work when
> > it's defined.
>
> This is not how any of our NO_* defines work. *Sometimes* defining them
> guarantees you do no extra work, but in other cases we've decided that
> bending over backwards with #ifdef in various places isn't worth it.

I guess it boils down to "worth it". For me #ifdef NO_GETTEXT would be
limited to gettext.h and it's not that much work. But you do the
actual work. You decide.
-- 
Duy

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-24 14:41           ` Duy Nguyen
@ 2018-10-24 17:54             ` Ævar Arnfjörð Bjarmason
  2018-10-25  3:52             ` Junio C Hamano
  1 sibling, 0 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-24 17:54 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Johannes Schindelin, Git Mailing List, SZEDER Gábor,
	Vasco Almeida, Jiang Xin


On Wed, Oct 24 2018, Duy Nguyen wrote:

> On Tue, Oct 23, 2018 at 6:45 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> >> The effect of what I'm suggesting here, and which my WIP patch in
>> >> <875zxtd59e.fsf@evledraar.gmail.com> implements is that we'd do a
>> >> one-time getenv() for each process that prints a _() message that we
>> >> aren't doing now, and for each message call a function that would check
>> >> a boolean "are we in poison mode" static global variable.
>> >
>> > Just don't do the getenv() inside _() even if you just do it one time.
>> > getenv() may invalidate whatever value from the previous call. I would
>> > not want to find out my code broke because of an getenv() inside some
>> > innocent _()...
>>
>> How would any code break? We have various getenv("GIT_TEST_*...")
>> already that work the same way. Unless you set that in the environment
>> the function is idempotent, and I don't see how anyone would do that
>> accidentally.
>
> I didn't check those GIT_TEST_ but I think the difference is in
> complexity. When you write
>
>  var = getenv("foo");
>  complexFunction();
>
> you probably start to feel scary and wrap getenv() with a strdup()
> because you usually don't know exactly what complexFunction() can call
> (and even if you do, you can't be sure what it may call in the
> future).
>
> The person who writes
>
>  printf(_("%s"), getenv("foo"));
>
> may not go through the same thought process as with complexFunction().
> If _() calls getenv(), because you the order of parameter evaluation
> is unspecified, you cannot be sure if getenv("foo") will be called
> before or after the one inside _(). One of them may screw up the
> other.

Ah, you mean because getenv() is not reentrant such a construct may
cause us to return something else entirely for getenv("bar") (e.g. in
this case the value for getenv("bar")).

Is that something you or anyone else has seen in practice? My intuition
is that while POSIX doesn't make that promise it isn't likely that
there's any implementation that would mutate the env purely on a
getenv(), but setenv() (munging some static "env" area in-place) might
be another matter.

But we could always call use_gettext_poison() really early from
git_setup_gettext() (called from our main()) if we're worried about
this, it would then set the static "poison_requested" variable and we
wouldn't call getenv() again, e.g. if we're later calling it in the
middle of multithreaded code, or within the same same sequence point.

>> > And we should still respect NO_GETTEXT, not doing any extra work when
>> > it's defined.
>>
>> This is not how any of our NO_* defines work. *Sometimes* defining them
>> guarantees you do no extra work, but in other cases we've decided that
>> bending over backwards with #ifdef in various places isn't worth it.
>
> I guess it boils down to "worth it". For me #ifdef NO_GETTEXT would be
> limited to gettext.h and it's not that much work. But you do the
> actual work. You decide.

Yeah the ifdef is pretty small and not really worth worrynig about, the
main benefit is being able to run tests in this mode without
recompiling.

I hadn't been running with GETTEXT_POISON when I build git because I
hadn't been bothered to build twice just for it, but am now running it
alongside other GIT_TEST_* modes.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-24  7:44         ` Jeff King
@ 2018-10-25  1:00           ` Junio C Hamano
  2018-10-25  1:09             ` Jeff King
  0 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2018-10-25  1:00 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

Jeff King <peff@peff.net> writes:

> but then you lose the default handling. I think if we added a new
> option, it would either be:
>
>   # interpret a value directly; use default on empty, I guess?
>   git config --default=false --type=bool --interpret-value "$GIT_WHATEVER_ENV"
>
> or
>
>   # less flexible, but the --default semantics are more obvious
>   git config --default=false --type=bool --get-env GIT_WHATEVER_ENV

Yeah, my thinko.  The latter would be closer to what this patch
wants to have, but obviously the former would be more flexible and
useful in wider context.  Both have the "Huh?" factor---what they
are doing has little to do with "config", but I did not think of a
better kitchen-sink (and our default kitchen-sink "rev-parse" is
even further than "config", I would think, for this one).

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-25  1:00           ` Junio C Hamano
@ 2018-10-25  1:09             ` Jeff King
  2018-10-25  1:24               ` Ramsay Jones
  0 siblings, 1 reply; 49+ messages in thread
From: Jeff King @ 2018-10-25  1:09 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

On Thu, Oct 25, 2018 at 10:00:31AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > but then you lose the default handling. I think if we added a new
> > option, it would either be:
> >
> >   # interpret a value directly; use default on empty, I guess?
> >   git config --default=false --type=bool --interpret-value "$GIT_WHATEVER_ENV"
> >
> > or
> >
> >   # less flexible, but the --default semantics are more obvious
> >   git config --default=false --type=bool --get-env GIT_WHATEVER_ENV
> 
> Yeah, my thinko.  The latter would be closer to what this patch
> wants to have, but obviously the former would be more flexible and
> useful in wider context.  Both have the "Huh?" factor---what they
> are doing has little to do with "config", but I did not think of a
> better kitchen-sink (and our default kitchen-sink "rev-parse" is
> even further than "config", I would think, for this one).

Heh, I thought through the exact sequence in your paragraph when writing
my other message. That's probably a good sign that we should probably
not pursue this further unless we see the use case come up again a few
more times (and if we do, then consider "config" the least-bad place to
do it).

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-25  1:09             ` Jeff King
@ 2018-10-25  1:24               ` Ramsay Jones
  2018-10-25 21:23                 ` Jeff King
  0 siblings, 1 reply; 49+ messages in thread
From: Ramsay Jones @ 2018-10-25  1:24 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin



On 25/10/2018 02:09, Jeff King wrote:
> On Thu, Oct 25, 2018 at 10:00:31AM +0900, Junio C Hamano wrote:
> 
>> Jeff King <peff@peff.net> writes:
>>
>>> but then you lose the default handling. I think if we added a new
>>> option, it would either be:
>>>
>>>   # interpret a value directly; use default on empty, I guess?
>>>   git config --default=false --type=bool --interpret-value "$GIT_WHATEVER_ENV"
>>>
>>> or
>>>
>>>   # less flexible, but the --default semantics are more obvious
>>>   git config --default=false --type=bool --get-env GIT_WHATEVER_ENV
>>
>> Yeah, my thinko.  The latter would be closer to what this patch
>> wants to have, but obviously the former would be more flexible and
>> useful in wider context.  Both have the "Huh?" factor---what they
>> are doing has little to do with "config", but I did not think of a
>> better kitchen-sink (and our default kitchen-sink "rev-parse" is
>> even further than "config", I would think, for this one).
> 
> Heh, I thought through the exact sequence in your paragraph when writing
> my other message. That's probably a good sign that we should probably
> not pursue this further unless we see the use case come up again a few
> more times (and if we do, then consider "config" the least-bad place to
> do it).

I was thinking:

  $ git var -e GIT_WHATEVER_ENV

[-e for environment].

... but that is really no different than git-config. ;-)

ATB,
Ramsay Jones




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-24 14:41           ` Duy Nguyen
  2018-10-24 17:54             ` Ævar Arnfjörð Bjarmason
@ 2018-10-25  3:52             ` Junio C Hamano
  2018-10-25  6:20               ` Jeff King
  1 sibling, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2018-10-25  3:52 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Git Mailing List, SZEDER Gábor, Vasco Almeida, Jiang Xin

Duy Nguyen <pclouds@gmail.com> writes:

> The person who writes
>
>  printf(_("%s"), getenv("foo"));
>
> may not go through the same thought process as with complexFunction().
> If _() calls getenv(), because you the order of parameter evaluation
> is unspecified, you cannot be sure if getenv("foo") will be called
> before or after the one inside _(). One of them may screw up the
> other.

Yup, sometimes we've been sloppy but we should strive to mimick
efforts like f4ef5173 ("determine_author_info(): copy getenv
output", 2014-08-27).

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-25  3:52             ` Junio C Hamano
@ 2018-10-25  6:20               ` Jeff King
  2018-10-27  6:55                 ` Junio C Hamano
  0 siblings, 1 reply; 49+ messages in thread
From: Jeff King @ 2018-10-25  6:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Git Mailing List, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

On Thu, Oct 25, 2018 at 12:52:55PM +0900, Junio C Hamano wrote:

> Duy Nguyen <pclouds@gmail.com> writes:
> 
> > The person who writes
> >
> >  printf(_("%s"), getenv("foo"));
> >
> > may not go through the same thought process as with complexFunction().
> > If _() calls getenv(), because you the order of parameter evaluation
> > is unspecified, you cannot be sure if getenv("foo") will be called
> > before or after the one inside _(). One of them may screw up the
> > other.
> 
> Yup, sometimes we've been sloppy but we should strive to mimick
> efforts like f4ef5173 ("determine_author_info(): copy getenv
> output", 2014-08-27).

I've wondered about this before. Even calling:

  foo = xstrdup(getenv("bar"));

is not necessarily correct, because xstrdup() relies on xmalloc(), which
may check GIT_ALLOC_LIMIT (we do cache that result, but it can happen on
the first malloc).

I also wouldn't be surprised if there are cases in our threaded code
that use getenv() without taking a lock.

I've definitely run into setenv()/getenv() races on Linux (inside Git,
even, though it was while working on custom code). But I wonder how
common it is for getenv() to be invalidated by another getenv() call.
Certainly POSIX allows it, but every implementation I've seen (which is
admittedly few) is passing back pointers to a chunk of environment
memory.

I.e., could we mostly ignore this problem as not applying to most modern
systems? And if there is such a system, give it a fallback like:

  /*
   * For systems that use a single buffer for getenv(), this hacks
   * around it by giving it _four_ buffers. That's just punting on
   * the problem, but it at least gives enough breathing room for
   * the caller to do something sane like use non-trivial functions
   * to copy the string. It still does nothing for threading, but
   * hopefully such systems don't support pthreads in the first place. ;)
   */
  const char *xgetenv(const char *key)
  {
	static struct strbuf bufs[] = {
		STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
	};
	static unsigned int cur;
	struct strbuf *buf;
	const char *value;
	size_t len;

	value = getenv(key);
	if (!value)
		return NULL;

	buf = bufs[cur++];
	cur %= ARRAY_SIZE(bufs);

	/*
	 * We have to do this length check ourselves, because allocating
	 * the strbuf may invalidate "value"!
	 */
	len = strlen(value);
	if (buf->alloc <= len) {
		strbuf_grow(buf, len);
		value = getenv(key);
		if (!value)
			return NULL; /* whoops, it went away! */
		len = strlen(value); /* paranoia that it didn't change */
	}

	strbuf_reset(buf);
	strbuf_add(buf, value, len);

	return buf->buf;
  }

I dunno. Maybe I am being overly optimistic. But I strongly suspect we
have such bugs already in our code base, and nobody has run into them
(OTOH, they are quite finicky due to things like the caching I
mentioned).

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-25  1:24               ` Ramsay Jones
@ 2018-10-25 21:23                 ` Jeff King
  2018-10-26 19:20                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 49+ messages in thread
From: Jeff King @ 2018-10-25 21:23 UTC (permalink / raw)
  To: Ramsay Jones
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, git,
	SZEDER Gábor, Vasco Almeida, Jiang Xin

On Thu, Oct 25, 2018 at 02:24:41AM +0100, Ramsay Jones wrote:

> >> Yeah, my thinko.  The latter would be closer to what this patch
> >> wants to have, but obviously the former would be more flexible and
> >> useful in wider context.  Both have the "Huh?" factor---what they
> >> are doing has little to do with "config", but I did not think of a
> >> better kitchen-sink (and our default kitchen-sink "rev-parse" is
> >> even further than "config", I would think, for this one).
> > 
> > Heh, I thought through the exact sequence in your paragraph when writing
> > my other message. That's probably a good sign that we should probably
> > not pursue this further unless we see the use case come up again a few
> > more times (and if we do, then consider "config" the least-bad place to
> > do it).
> 
> I was thinking:
> 
>   $ git var -e GIT_WHATEVER_ENV
> 
> [-e for environment].
> 
> ... but that is really no different than git-config. ;-)

Actually, "git var" already does pull bits from the environment. It
doesn't know about all of the type-specific parsing that git-config
does, but it might be a reasonable path forward to teach it that. (But I
still think we should do nothing for now and see how often this comes
up).

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-25 21:23                 ` Jeff King
@ 2018-10-26 19:20                   ` Ævar Arnfjörð Bjarmason
  2018-10-27  6:59                     ` Jeff King
  0 siblings, 1 reply; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-26 19:20 UTC (permalink / raw)
  To: Jeff King
  Cc: Ramsay Jones, Junio C Hamano, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin


On Thu, Oct 25 2018, Jeff King wrote:

> On Thu, Oct 25, 2018 at 02:24:41AM +0100, Ramsay Jones wrote:
>
>> >> Yeah, my thinko.  The latter would be closer to what this patch
>> >> wants to have, but obviously the former would be more flexible and
>> >> useful in wider context.  Both have the "Huh?" factor---what they
>> >> are doing has little to do with "config", but I did not think of a
>> >> better kitchen-sink (and our default kitchen-sink "rev-parse" is
>> >> even further than "config", I would think, for this one).
>> >
>> > Heh, I thought through the exact sequence in your paragraph when writing
>> > my other message. That's probably a good sign that we should probably
>> > not pursue this further unless we see the use case come up again a few
>> > more times (and if we do, then consider "config" the least-bad place to
>> > do it).
>>
>> I was thinking:
>>
>>   $ git var -e GIT_WHATEVER_ENV
>>
>> [-e for environment].
>>
>> ... but that is really no different than git-config. ;-)
>
> Actually, "git var" already does pull bits from the environment. It
> doesn't know about all of the type-specific parsing that git-config
> does, but it might be a reasonable path forward to teach it that. (But I
> still think we should do nothing for now and see how often this comes
> up).

For myself / Junio picking this up: Does that mean you've read v2 and
think it's OK to pick up in its current form? I think it is, just
looking for some Acks on that since it's not in the latest "What's
Cooking".

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-25  6:20               ` Jeff King
@ 2018-10-27  6:55                 ` Junio C Hamano
  0 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-10-27  6:55 UTC (permalink / raw)
  To: Jeff King
  Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Git Mailing List, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

Jeff King <peff@peff.net> writes:

> I.e., could we mostly ignore this problem as not applying to most modern
> systems? And if there is such a system, give it a fallback like:
>
>   /*
>    * For systems that use a single buffer for getenv(), this hacks
>    * around it by giving it _four_ buffers. That's just punting on
>    * the problem, but it at least gives enough breathing room for
>    * the caller to do something sane like use non-trivial functions
>    * to copy the string. It still does nothing for threading, but
>    * hopefully such systems don't support pthreads in the first place. ;)
>    */

Tempting.

> I dunno. Maybe I am being overly optimistic. But I strongly suspect we
> have such bugs already in our code base, and nobody has run into them

Yeah, I tend to agree that I would not be surprised at all if we
have many such "bugs" that triggers for nobody on modern platforms.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-26 19:20                   ` Ævar Arnfjörð Bjarmason
@ 2018-10-27  6:59                     ` Jeff King
  2018-10-27 10:42                       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 49+ messages in thread
From: Jeff King @ 2018-10-27  6:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Ramsay Jones, Junio C Hamano, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin

On Fri, Oct 26, 2018 at 09:20:56PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> I was thinking:
> >>
> >>   $ git var -e GIT_WHATEVER_ENV
> >>
> >> [-e for environment].
> >>
> >> ... but that is really no different than git-config. ;-)
> >
> > Actually, "git var" already does pull bits from the environment. It
> > doesn't know about all of the type-specific parsing that git-config
> > does, but it might be a reasonable path forward to teach it that. (But I
> > still think we should do nothing for now and see how often this comes
> > up).
> 
> For myself / Junio picking this up: Does that mean you've read v2 and
> think it's OK to pick up in its current form? I think it is, just
> looking for some Acks on that since it's not in the latest "What's
> Cooking".

I'm not sure if you're asking whether I looked at the rest of the patch.
If so, then no, not really (so no objection, but I also did not review
it).

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] Poison gettext with the Ook language
  2018-10-22 20:22 ` SZEDER Gábor
                     ` (8 preceding siblings ...)
  2018-10-23 14:37   ` [PATCH] Poison gettext with the Ook language Duy Nguyen
@ 2018-10-27 10:11   ` Jakub Narebski
  9 siblings, 0 replies; 49+ messages in thread
From: Jakub Narebski @ 2018-10-27 10:11 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð

SZEDER Gábor <szeder.dev@gmail.com> writes:
> On Mon, Oct 22, 2018 at 05:36:33PM +0200, Nguyễn Thái Ngọc Duy wrote:
>>
>> The current gettext() function just replaces all strings with
>> '# GETTEXT POISON #' including format strings and hides the things
>> that we should be allowed to grep (like branch names, or some other
>> codes) even when gettext is poisoned.
>> 
>> This patch implements the poisoned _() with a universal and totally
>> legit language called Ook [1]. We could actually grep stuff even in
>> with this because format strings are preserved.
>
> Once upon a time a GETTEXT_POISON build job failed on me, and the
> error message:
>
>   error: # GETTEXT POISON #
>
> was not particularly useful.  Ook wouldn't help with that...
>
> So I came up with the following couple of patches that implement a
> "scrambled" format that makes the poisoned output legible for humans
> but still gibberish for machine consumption (i.e. grep-ing the text
> part would still fail):
>
>   error:  U.n.a.b.l.e. .t.o. .c.r.e.a.t.e. .'./home/szeder/src/git/t/trash directory.t1404-update-ref-errors/.git/packed-refs...l.o.c.k.'.:. .File exists...
>
> I have been running GETTEXT_POISON builds with this series for some
> months now, but haven't submitted it yet, because I haven't decided
> yet whether including strings (paths, refs, etc.) in the output as
> they are is a feature or a flaw.  And because it embarrassingly leaks
> every single translated string... :)

There is similar technique called "pseudolocalization", meant for
testing i18n aspect of software.

In one of most common forms, the string

  Edit program settings

woukd be translated to

  [!!! εÐiţ Þr0ģЯãm səTτıИğ§ !!!]

(possibly using mirrored locale, i.e. right-to-left order).

The brackets [!!! ... !!!] are used as a "poison", to detect
translatable text, and to spot issues with truncation; it also helps
with finding "lego" translation.

It would also stress-test Unicode handling...

Regards,
-- 
Jakub Narębski


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] i18n: make GETTEXT_POISON a runtime option
  2018-10-27  6:59                     ` Jeff King
@ 2018-10-27 10:42                       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-27 10:42 UTC (permalink / raw)
  To: Jeff King
  Cc: Ramsay Jones, Junio C Hamano, git, SZEDER Gábor,
	Vasco Almeida, Jiang Xin


On Sat, Oct 27 2018, Jeff King wrote:

> On Fri, Oct 26, 2018 at 09:20:56PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> >> I was thinking:
>> >>
>> >>   $ git var -e GIT_WHATEVER_ENV
>> >>
>> >> [-e for environment].
>> >>
>> >> ... but that is really no different than git-config. ;-)
>> >
>> > Actually, "git var" already does pull bits from the environment. It
>> > doesn't know about all of the type-specific parsing that git-config
>> > does, but it might be a reasonable path forward to teach it that. (But I
>> > still think we should do nothing for now and see how often this comes
>> > up).
>>
>> For myself / Junio picking this up: Does that mean you've read v2 and
>> think it's OK to pick up in its current form? I think it is, just
>> looking for some Acks on that since it's not in the latest "What's
>> Cooking".
>
> I'm not sure if you're asking whether I looked at the rest of the patch.
> If so, then no, not really (so no objection, but I also did not review
> it).

Yeah I'm fishing for a more general review than just the problem of how
we turn an env variable into a boolean, so if you or anyone else is up
for it it would be most welcome :)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v3] i18n: make GETTEXT_POISON a runtime option
  2018-10-24 11:47         ` [PATCH v2] " Ævar Arnfjörð Bjarmason
@ 2018-11-01 19:31           ` Ævar Arnfjörð Bjarmason
  2018-11-02  3:11             ` Junio C Hamano
                               ` (5 more replies)
  0 siblings, 6 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-01 19:31 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, SZEDER Gábor, Vasco Almeida,
	Jiang Xin, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð Bjarmason

Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
test parameter to only be a GIT_TEST_GETTEXT_POISON=<non-empty?>
runtime parameter, to be consistent with other parameters documented
in "Running tests with special setups" in t/README.

When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
to simulate unfriendly translator", 2011-02-22) I was concerned with
ensuring that the _() function would get constant folded if NO_GETTEXT
was defined, and likewise that GETTEXT_POISON would be compiled out
unless it was defined.

But as the benchmark in my [1] shows doing a one-off runtime
getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
originally added the GIT_TEST_* env variables have become the common
idiom for turning on special test setups.

So change GETTEXT_POISON to work the same way. Now the
GETTEXT_POISON=YesPlease compile-time option is gone, and running the
tests with GIT_TEST_GETTEXT_POISON=[YesPlease|] can be toggled on/off
without recompiling.

This allows for conditionally amending tests to test with/without
poison, similar to what 859fdc0c3c ("commit-graph: define
GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
some of that, now we e.g. always run the t0205-gettext-poison.sh test.

I did enough there to remove the GETTEXT_POISON prerequisite, but its
inverse C_LOCALE_OUTPUT is still around, and surely some tests using
it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=.

Notes on the implementation:

 * We still compile a dedicated GETTEXT_POISON build in Travis CI.
   This is probably the wrong thing to do and should be followed-up
   with something similar to ae59a4e44f ("travis: run tests with
   GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
   for running in the GIT_TEST_GETTEXT_POISON mode.

 * We now skip a test in t0000-basic.sh under
   GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This
   test relies on C locale output, but due to an edge case in how the
   previous implementation of GETTEXT_POISON worked (reading it from
   GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
   and needs to be skipped.

 * The getenv() function is not reentrant, so out of paranoia about
   code of the form:

       printf(_("%s"), getenv("some-env"));

   call use_gettext_poison() in our early setup in git_setup_gettext()
   so we populate the "poison_requested" variable in a codepath that's
   won't suffer from that race condition.

See also [3] for more on the motivation behind this patch, and the
history of the GETTEXT_POISON facility.

1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
2. https://public-inbox.org/git/87woq7b58k.fsf@evledraar.gmail.com/
3. https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

Now:

 * The new i18n helper is gone. We just use "test -n" semantics for
   $GIT_TEST_GETTEXT_POISON

 * We error out in the Makefile if you're still saying
   GETTEXT_POISON=YesPlease.

   This makes more sense than just making it a synonym since now this
   also needs to be defined at runtime.

 * The caveat with avoiding test_lazy_prereq is gone (although there's
   still some unrelated bug there worth looking into).

 * We call use_gettext_poison() really early to avoid any reentrancy
   issue with getenv().


 .travis.yml               |  2 +-
 Makefile                  |  8 +-------
 ci/lib-travisci.sh        |  4 ++--
 gettext.c                 | 11 +++++++----
 gettext.h                 |  9 +++------
 git-sh-i18n.sh            |  2 +-
 po/README                 | 13 ++++---------
 t/README                  |  6 ++++++
 t/lib-gettext.sh          |  2 +-
 t/t0000-basic.sh          |  2 +-
 t/t0205-gettext-poison.sh |  8 +++++---
 t/t3406-rebase-message.sh |  2 +-
 t/t7201-co.sh             |  6 +++---
 t/t9902-completion.sh     |  3 ++-
 t/test-lib-functions.sh   |  8 ++++----
 t/test-lib.sh             |  6 +-----
 16 files changed, 43 insertions(+), 49 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 4d4e26c9df..4523a2e5ec 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -26,7 +26,7 @@ addons:
 
 matrix:
   include:
-    - env: jobname=GETTEXT_POISON
+    - env: jobname=GIT_TEST_GETTEXT_POISON
       os: linux
       compiler:
       addons:
diff --git a/Makefile b/Makefile
index b08d5ea258..3a08626db0 100644
--- a/Makefile
+++ b/Makefile
@@ -362,11 +362,6 @@ all::
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
 # user.
 #
-# Define GETTEXT_POISON if you are debugging the choice of strings marked
-# for translation.  In a GETTEXT_POISON build, you can turn all strings marked
-# for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
-#
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
 #
@@ -1450,7 +1445,7 @@ ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
 ifdef GETTEXT_POISON
-	BASIC_CFLAGS += -DGETTEXT_POISON
+$(error The GETTEXT_POISON option has been removed in favor of runtime GIT_TEST_GETTEXT_POISON. See t/README!)
 endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT
@@ -2602,7 +2597,6 @@ ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
 	@echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@+
 endif
 	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
-	@echo GETTEXT_POISON=\''$(subst ','\'',$(subst ','\'',$(GETTEXT_POISON)))'\' >>$@+
 ifdef GIT_PERF_REPEAT_COUNT
 	@echo GIT_PERF_REPEAT_COUNT=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_REPEAT_COUNT)))'\' >>$@+
 endif
diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 06970f7213..69dff4d1ec 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -123,7 +123,7 @@ osx-clang|osx-gcc)
 	# Travis CI OS X
 	export GIT_SKIP_TESTS="t9810 t9816"
 	;;
-GETTEXT_POISON)
-	export GETTEXT_POISON=YesPlease
+GIT_TEST_GETTEXT_POISON)
+	export GIT_TEST_GETTEXT_POISON=YesPlease
 	;;
 esac
diff --git a/gettext.c b/gettext.c
index 7272771c8e..d4021d690c 100644
--- a/gettext.c
+++ b/gettext.c
@@ -7,6 +7,7 @@
 #include "gettext.h"
 #include "strbuf.h"
 #include "utf8.h"
+#include "config.h"
 
 #ifndef NO_GETTEXT
 #	include <locale.h>
@@ -46,15 +47,15 @@ const char *get_preferred_languages(void)
 	return NULL;
 }
 
-#ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
 	static int poison_requested = -1;
-	if (poison_requested == -1)
-		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+	if (poison_requested == -1) {
+		const char *v = getenv("GIT_TEST_GETTEXT_POISON");
+		poison_requested = v && strlen(v) ? 1 : 0;
+	}
 	return poison_requested;
 }
-#endif
 
 #ifndef NO_GETTEXT
 static int test_vsnprintf(const char *fmt, ...)
@@ -164,6 +165,8 @@ void git_setup_gettext(void)
 	if (!podir)
 		podir = p = system_path(GIT_LOCALE_PATH);
 
+	use_gettext_poison(); /* getenv() reentrancy paranoia */
+
 	if (!is_directory(podir)) {
 		free(p);
 		return;
diff --git a/gettext.h b/gettext.h
index 7eee64a34f..43d991a2df 100644
--- a/gettext.h
+++ b/gettext.h
@@ -28,12 +28,15 @@
 
 #define FORMAT_PRESERVING(n) __attribute__((format_arg(n)))
 
+extern int use_gettext_poison(void);
+
 #ifndef NO_GETTEXT
 extern void git_setup_gettext(void);
 extern int gettext_width(const char *s);
 #else
 static inline void git_setup_gettext(void)
 {
+	use_gettext_poison();; /* getenv() reentrancy paranoia */
 }
 static inline int gettext_width(const char *s)
 {
@@ -41,12 +44,6 @@ static inline int gettext_width(const char *s)
 }
 #endif
 
-#ifdef GETTEXT_POISON
-extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
-#endif
-
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
 	if (!*msgid)
diff --git a/git-sh-i18n.sh b/git-sh-i18n.sh
index 9d065fb4bf..e1d917fd27 100644
--- a/git-sh-i18n.sh
+++ b/git-sh-i18n.sh
@@ -17,7 +17,7 @@ export TEXTDOMAINDIR
 
 # First decide what scheme to use...
 GIT_INTERNAL_GETTEXT_SH_SCHEME=fallthrough
-if test -n "$GIT_GETTEXT_POISON"
+if test -n "$GIT_TEST_GETTEXT_POISON"
 then
 	GIT_INTERNAL_GETTEXT_SH_SCHEME=poison
 elif test -n "@@USE_GETTEXT_SCHEME@@"
diff --git a/po/README b/po/README
index fef4c0f0b5..aa704ffcb7 100644
--- a/po/README
+++ b/po/README
@@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
 version of the strings, e.g. to grep some error message or other
 output.
 
-To smoke out issues like these Git can be compiled with gettext poison
-support, at the top-level:
+To smoke out issues like these, Git tested with a translation mode that
+emits gibberish on every call to gettext. To use it run the test suite
+with it, e.g.:
 
-    make GETTEXT_POISON=YesPlease
-
-That'll give you a git which emits gibberish on every call to
-gettext. It's obviously not meant to be installed, but you should run
-the test suite with it:
-
-    cd t && prove -j 9 ./t[0-9]*.sh
+    cd t && GIT_TEST_GETTEXT_POISON=YesPlease prove -j 9 ./t[0-9]*.sh
 
 If tests break with it you should inspect them manually and see if
 what you're translating is sane, i.e. that you're not translating
diff --git a/t/README b/t/README
index 8847489640..25c4ba3419 100644
--- a/t/README
+++ b/t/README
@@ -301,6 +301,12 @@ that cannot be easily covered by a few specific test cases. These
 could be enabled by running the test suite with correct GIT_TEST_
 environment set.
 
+GIT_TEST_GETTEXT_POISON=<non-empty?> turns all strings marked for
+translation into gibberish if non-empty (think "test -n"). Used for
+spotting those tests that need to be marked with a C_LOCALE_OUTPUT
+prerequisite when adding more strings for translation. See "Testing
+marked strings" in po/README for details.
+
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh
index eec757f104..755f421431 100644
--- a/t/lib-gettext.sh
+++ b/t/lib-gettext.sh
@@ -12,7 +12,7 @@ export GIT_TEXTDOMAINDIR GIT_PO_PATH
 
 . "$GIT_BUILD_DIR"/git-sh-i18n
 
-if test_have_prereq GETTEXT && ! test_have_prereq GETTEXT_POISON
+if test_have_prereq GETTEXT && test_have_prereq C_LOCALE_OUTPUT
 then
 	# is_IS.UTF-8 on Solaris and FreeBSD, is_IS.utf8 on Debian
 	is_IS_locale=$(locale -a 2>/dev/null |
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 4d23373526..b6566003dd 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -274,7 +274,7 @@ test_expect_success 'pretend we have a mix of all possible results' "
 	EOF
 "
 
-test_expect_success 'test --verbose' '
+test_expect_success C_LOCALE_OUTPUT 'test --verbose' '
 	test_must_fail run_sub_test_lib_test \
 		test-verbose "test verbose" --verbose <<-\EOF &&
 	test_expect_success "passing test" true
diff --git a/t/t0205-gettext-poison.sh b/t/t0205-gettext-poison.sh
index 438e778d6a..a06269f38a 100755
--- a/t/t0205-gettext-poison.sh
+++ b/t/t0205-gettext-poison.sh
@@ -5,13 +5,15 @@
 
 test_description='Gettext Shell poison'
 
+GIT_TEST_GETTEXT_POISON=YesPlease
+export GIT_TEST_GETTEXT_POISON
 . ./lib-gettext.sh
 
-test_expect_success GETTEXT_POISON 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
+test_expect_success 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
     test "$GIT_INTERNAL_GETTEXT_SH_SCHEME" = "poison"
 '
 
-test_expect_success GETTEXT_POISON 'gettext: our gettext() fallback has poison semantics' '
+test_expect_success 'gettext: our gettext() fallback has poison semantics' '
     printf "# GETTEXT POISON #" >expect &&
     gettext "test" >actual &&
     test_cmp expect actual &&
@@ -20,7 +22,7 @@ test_expect_success GETTEXT_POISON 'gettext: our gettext() fallback has poison s
     test_cmp expect actual
 '
 
-test_expect_success GETTEXT_POISON 'eval_gettext: our eval_gettext() fallback has poison semantics' '
+test_expect_success 'eval_gettext: our eval_gettext() fallback has poison semantics' '
     printf "# GETTEXT POISON #" >expect &&
     eval_gettext "test" >actual &&
     test_cmp expect actual &&
diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
index 0392e36d23..2bdcf83808 100755
--- a/t/t3406-rebase-message.sh
+++ b/t/t3406-rebase-message.sh
@@ -77,7 +77,7 @@ test_expect_success 'rebase -n overrides config rebase.stat config' '
 #     "Does not point to a valid commit: invalid-ref"
 #
 # NEEDSWORK: This "grep" is fine in real non-C locales, but
-# GETTEXT_POISON poisons the refname along with the enclosing
+# GIT_TEST_GETTEXT_POISON poisons the refname along with the enclosing
 # error message.
 test_expect_success 'rebase --onto outputs the invalid ref' '
 	test_must_fail git rebase --onto invalid-ref HEAD HEAD 2>err &&
diff --git a/t/t7201-co.sh b/t/t7201-co.sh
index 826987ca80..72b9b375ba 100755
--- a/t/t7201-co.sh
+++ b/t/t7201-co.sh
@@ -254,9 +254,9 @@ test_expect_success 'checkout to detach HEAD (with advice declined)' '
 test_expect_success 'checkout to detach HEAD' '
 	git config advice.detachedHead true &&
 	git checkout -f renamer && git clean -f &&
-	git checkout renamer^ 2>messages &&
-	test_i18ngrep "HEAD is now at 7329388" messages &&
-	(test_line_count -gt 1 messages || test -n "$GETTEXT_POISON") &&
+	GIT_TEST_GETTEXT_POISON= git checkout renamer^ 2>messages &&
+	grep "HEAD is now at 7329388" messages &&
+	test_line_count -gt 1 messages &&
 	H=$(git rev-parse --verify HEAD) &&
 	M=$(git show-ref -s --verify refs/heads/master) &&
 	test "z$H" = "z$M" &&
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 175f83d704..3c6b185b60 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1697,7 +1697,8 @@ test_expect_success 'sourcing the completion script clears cached commands' '
 	verbose test -z "$__git_all_commands"
 '
 
-test_expect_success !GETTEXT_POISON 'sourcing the completion script clears cached merge strategies' '
+test_expect_success 'sourcing the completion script clears cached merge strategies' '
+	GIT_TEST_GETTEXT_POISON= &&
 	__git_compute_merge_strategies &&
 	verbose test -n "$__git_merge_strategies" &&
 	. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78d8c3783b..2f42b3653c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -755,16 +755,16 @@ test_cmp_bin() {
 
 # Use this instead of test_cmp to compare files that contain expected and
 # actual output from git commands that can be translated.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ncmp () {
-	test -n "$GETTEXT_POISON" || test_cmp "$@"
+	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
 }
 
 # Use this instead of "grep expected-string actual" to see if the
 # output from a git command that can be translated either contains an
 # expected string, or does not contain an unwanted one.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ngrep () {
 	eval "last_arg=\${$#}"
@@ -779,7 +779,7 @@ test_i18ngrep () {
 		error "bug in the test script: too few parameters to test_i18ngrep"
 	fi
 
-	if test -n "$GETTEXT_POISON"
+	if test_have_prereq !C_LOCALE_OUTPUT
 	then
 		# pretend success
 		return 0
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 897e6fcc94..370a4821e1 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1105,12 +1105,8 @@ test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-if test -n "$GETTEXT_POISON"
+if test -z "$GIT_TEST_GETTEXT_POISON"
 then
-	GIT_GETTEXT_POISON=YesPlease
-	export GIT_GETTEXT_POISON
-	test_set_prereq GETTEXT_POISON
-else
 	test_set_prereq C_LOCALE_OUTPUT
 fi
 
-- 
2.19.1.899.g0250525e69


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option
  2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
@ 2018-11-02  3:11             ` Junio C Hamano
  2018-11-02 16:37             ` SZEDER Gábor
                               ` (4 subsequent siblings)
  5 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2018-11-02  3:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, SZEDER Gábor, Vasco Almeida, Jiang Xin,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
> test parameter to only be a GIT_TEST_GETTEXT_POISON=<non-empty?>
> runtime parameter, to be consistent with other parameters documented
> in "Running tests with special setups" in t/README.
> ...
>  #ifndef NO_GETTEXT
>  extern void git_setup_gettext(void);
>  extern int gettext_width(const char *s);
>  #else
>  static inline void git_setup_gettext(void)
>  {
> +	use_gettext_poison();; /* getenv() reentrancy paranoia */
>  }

Too many "case/esac" statement in shell scripting?


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option
  2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
  2018-11-02  3:11             ` Junio C Hamano
@ 2018-11-02 16:37             ` SZEDER Gábor
  2018-11-08 20:26               ` Ævar Arnfjörð Bjarmason
  2018-11-08  3:24             ` Junio C Hamano
                               ` (3 subsequent siblings)
  5 siblings, 1 reply; 49+ messages in thread
From: SZEDER Gábor @ 2018-11-02 16:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Vasco Almeida, Jiang Xin,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy

On Thu, Nov 01, 2018 at 07:31:15PM +0000, Ævar Arnfjörð Bjarmason wrote:
> Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
> test parameter to only be a GIT_TEST_GETTEXT_POISON=<non-empty?>
> runtime parameter, to be consistent with other parameters documented
> in "Running tests with special setups" in t/README.
> 
> When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
> to simulate unfriendly translator", 2011-02-22) I was concerned with
> ensuring that the _() function would get constant folded if NO_GETTEXT
> was defined, and likewise that GETTEXT_POISON would be compiled out
> unless it was defined.
> 
> But as the benchmark in my [1] shows doing a one-off runtime
> getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
> originally added the GIT_TEST_* env variables have become the common
> idiom for turning on special test setups.
> 
> So change GETTEXT_POISON to work the same way. Now the
> GETTEXT_POISON=YesPlease compile-time option is gone, and running the
> tests with GIT_TEST_GETTEXT_POISON=[YesPlease|] can be toggled on/off
> without recompiling.
> 
> This allows for conditionally amending tests to test with/without
> poison, similar to what 859fdc0c3c ("commit-graph: define
> GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
> some of that, now we e.g. always run the t0205-gettext-poison.sh test.
> 
> I did enough there to remove the GETTEXT_POISON prerequisite, but its
> inverse C_LOCALE_OUTPUT is still around, and surely some tests using
> it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=.
> 
> Notes on the implementation:
> 
>  * We still compile a dedicated GETTEXT_POISON build in Travis CI.
>    This is probably the wrong thing to do and should be followed-up
>    with something similar to ae59a4e44f ("travis: run tests with
>    GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
>    for running in the GIT_TEST_GETTEXT_POISON mode.

I'm of two minds about this.  Sure, now it's not a compile time
option, so we can spare a dedicated compilation, and sparing resources
is good, of course.

However, checking test failures in the 'linux-gcc' build job is always
a bit of a hassle, because it's not enough to look at the output of
the failed test, but I also have to check which one of the two test
runs failed (i.e. the "regular" or one with all the GIT_TEST_* knobs
turned on).  Adding a second test run with GIT_TEST_GETTEXT_POISON
enabled to an other build job (e.g. 'linux-clang') would then bring
this hassle to that build job as well.

Furthermore, occasionally a build job is slower than usual (whatever
the reason might be), which can be an issue when running the test
suite twice, as it can exceed the time limit.

Anyway, we can think about this later.

In any case, GIT_TEST_GETTEXT_POISON definitely should not be enabled
in the same "catch-all" test run with all other GIT_TEST_* variables,
because it would mess up any translated error messages that might pop
up in a test failure, and then we wouldn't have any idea about what
went wrong.

>  * We now skip a test in t0000-basic.sh under
>    GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This
>    test relies on C locale output, but due to an edge case in how the
>    previous implementation of GETTEXT_POISON worked (reading it from
>    GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
>    and needs to be skipped.
> 
>  * The getenv() function is not reentrant, so out of paranoia about
>    code of the form:
> 
>        printf(_("%s"), getenv("some-env"));
> 
>    call use_gettext_poison() in our early setup in git_setup_gettext()
>    so we populate the "poison_requested" variable in a codepath that's
>    won't suffer from that race condition.
> 
> See also [3] for more on the motivation behind this patch, and the
> history of the GETTEXT_POISON facility.
> 
> 1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
> 2. https://public-inbox.org/git/87woq7b58k.fsf@evledraar.gmail.com/
> 3. https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
> 
> Now:
> 
>  * The new i18n helper is gone. We just use "test -n" semantics for
>    $GIT_TEST_GETTEXT_POISON
> 
>  * We error out in the Makefile if you're still saying
>    GETTEXT_POISON=YesPlease.
> 
>    This makes more sense than just making it a synonym since now this
>    also needs to be defined at runtime.

OK, I think erroring out is better than silently ignoring
GETTEXT_POISON=YesPlease.  However, the commit message only mentions
that GETTEXT_POISON is gone, but perhaps this should be mentioned
there as well.

>  * The caveat with avoiding test_lazy_prereq is gone (although there's
>    still some unrelated bug there worth looking into).
> 
>  * We call use_gettext_poison() really early to avoid any reentrancy
>    issue with getenv().

> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> index 175f83d704..3c6b185b60 100755
> --- a/t/t9902-completion.sh
> +++ b/t/t9902-completion.sh
> @@ -1697,7 +1697,8 @@ test_expect_success 'sourcing the completion script clears cached commands' '
>  	verbose test -z "$__git_all_commands"
>  '
>  
> -test_expect_success !GETTEXT_POISON 'sourcing the completion script clears cached merge strategies' '
> +test_expect_success 'sourcing the completion script clears cached merge strategies' '
> +	GIT_TEST_GETTEXT_POISON= &&
>  	__git_compute_merge_strategies &&

OK, makes sense.

>  	verbose test -n "$__git_merge_strategies" &&
>  	. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&

> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 78d8c3783b..2f42b3653c 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -755,16 +755,16 @@ test_cmp_bin() {
>  
>  # Use this instead of test_cmp to compare files that contain expected and
>  # actual output from git commands that can be translated.  When running
> -# under GETTEXT_POISON this pretends that the command produced expected
> +# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
>  # results.
>  test_i18ncmp () {
> -	test -n "$GETTEXT_POISON" || test_cmp "$@"
> +	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
>  }

>  test_i18ngrep () {
>  	eval "last_arg=\${$#}"
> @@ -779,7 +779,7 @@ test_i18ngrep () {
>  		error "bug in the test script: too few parameters to test_i18ngrep"
>  	fi
>  
> -	if test -n "$GETTEXT_POISON"
> +	if test_have_prereq !C_LOCALE_OUTPUT

Why do these two helper functions call test_have_prereq (a function
that doesn't even fit on my screen) instead of a simple

  test -n "$GIT_TEST_GETTEXT_POISON"

>  	then
>  		# pretend success
>  		return 0
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 897e6fcc94..370a4821e1 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1105,12 +1105,8 @@ test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
>  test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
>  
>  # Can we rely on git's output in the C locale?
> -if test -n "$GETTEXT_POISON"
> +if test -z "$GIT_TEST_GETTEXT_POISON"
>  then
> -	GIT_GETTEXT_POISON=YesPlease
> -	export GIT_GETTEXT_POISON
> -	test_set_prereq GETTEXT_POISON
> -else
>  	test_set_prereq C_LOCALE_OUTPUT
>  fi

GIT_TEST_GETTEXT_POISON=1 will influence even those git commands that
are executed during initialization of test-lib and the test repo:

  $ GIT_TEST_GETTEXT_POISON=1 ./t0000-basic.sh -v
  # GETTEXT POISON #expecting success: 
  <....>

It should say:

  Initialized empty Git repository in <... path ...>
  expecting success:

See https://public-inbox.org/git/20181022202241.18629-2-szeder.dev@gmail.com/
for a bit of code that you might find worthy to steal.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option
  2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
  2018-11-02  3:11             ` Junio C Hamano
  2018-11-02 16:37             ` SZEDER Gábor
@ 2018-11-08  3:24             ` Junio C Hamano
  2018-11-08 19:12               ` Eric Sunshine
  2018-11-08 21:15             ` [PATCH v4 0/2] " Ævar Arnfjörð Bjarmason
                               ` (2 subsequent siblings)
  5 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2018-11-08  3:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, SZEDER Gábor, Vasco Almeida, Jiang Xin,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

>  * We error out in the Makefile if you're still saying
>    GETTEXT_POISON=YesPlease.

I expected this would be irritating, but it turns out it is worse
than mere irritation but is a severe hinderance to affect my
performance, as I (and my bots) keep building and testing different
versions of Git under different configurations.

I know it was done to help those who only ever build a single track
at a time and mostly moving forward, but I'm very much tempted to
remove this part, perhaps demote it to a warning of some sort.


>  ifdef GETTEXT_POISON
> -	BASIC_CFLAGS += -DGETTEXT_POISON
> +$(error The GETTEXT_POISON option has been removed in favor of runtime GIT_TEST_GETTEXT_POISON. See t/README!)
>  endif

-- >8 --
Makefile: ease dynamic-gettext-poison transition

Earlier we made the entire build to fail when GETTEXT_POISON=Yes is
given to make, to notify those who did not notice that text poisoning
is now a runtime behaviour.

It turns out that this too irritating for those who need to build
and test different versions of Git that cross the boundary between
history with and without this topic to switch between two
environment variables.  Demote the error to a warning, so that you
can say something like

	make GETTEXT_POISON=Yes GIT_TEST_GETTEXT_POISON test

during the transition period, without having to worry about whether
exact version you are testing has or does not have this topic.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f3a9995e50..6b492f44a6 100644
--- a/Makefile
+++ b/Makefile
@@ -1447,7 +1447,7 @@ ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
 ifdef GETTEXT_POISON
-$(error The GETTEXT_POISON option has been removed in favor of runtime GIT_TEST_GETTEXT_POISON. See t/README!)
+$(warning The GETTEXT_POISON option has been removed in favor of runtime GIT_TEST_GETTEXT_POISON. See t/README!)
 endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option
  2018-11-08  3:24             ` Junio C Hamano
@ 2018-11-08 19:12               ` Eric Sunshine
  0 siblings, 0 replies; 49+ messages in thread
From: Eric Sunshine @ 2018-11-08 19:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Git List, Jeff King,
	SZEDER Gábor, Vasco Almeida, Jiang Xin, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy

On Wed, Nov 7, 2018 at 10:24 PM Junio C Hamano <gitster@pobox.com> wrote:
> Makefile: ease dynamic-gettext-poison transition
>
> Earlier we made the entire build to fail when GETTEXT_POISON=Yes is
> given to make, to notify those who did not notice that text poisoning
> is now a runtime behaviour.
>
> It turns out that this too irritating for those who need to build

s/too/is &/

> and test different versions of Git that cross the boundary between
> history with and without this topic to switch between two
> environment variables.  Demote the error to a warning, so that you
> can say something like
>
>         make GETTEXT_POISON=Yes GIT_TEST_GETTEXT_POISON test
>
> during the transition period, without having to worry about whether
> exact version you are testing has or does not have this topic.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option
  2018-11-02 16:37             ` SZEDER Gábor
@ 2018-11-08 20:26               ` Ævar Arnfjörð Bjarmason
  2018-11-08 20:51                 ` SZEDER Gábor
  0 siblings, 1 reply; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-08 20:26 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, Junio C Hamano, Jeff King, Vasco Almeida, Jiang Xin,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy


On Fri, Nov 02 2018, SZEDER Gábor wrote:

> On Thu, Nov 01, 2018 at 07:31:15PM +0000, Ævar Arnfjörð Bjarmason wrote:
>> Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
>> test parameter to only be a GIT_TEST_GETTEXT_POISON=<non-empty?>
>> runtime parameter, to be consistent with other parameters documented
>> in "Running tests with special setups" in t/README.
>>
>> When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
>> to simulate unfriendly translator", 2011-02-22) I was concerned with
>> ensuring that the _() function would get constant folded if NO_GETTEXT
>> was defined, and likewise that GETTEXT_POISON would be compiled out
>> unless it was defined.
>>
>> But as the benchmark in my [1] shows doing a one-off runtime
>> getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
>> originally added the GIT_TEST_* env variables have become the common
>> idiom for turning on special test setups.
>>
>> So change GETTEXT_POISON to work the same way. Now the
>> GETTEXT_POISON=YesPlease compile-time option is gone, and running the
>> tests with GIT_TEST_GETTEXT_POISON=[YesPlease|] can be toggled on/off
>> without recompiling.
>>
>> This allows for conditionally amending tests to test with/without
>> poison, similar to what 859fdc0c3c ("commit-graph: define
>> GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
>> some of that, now we e.g. always run the t0205-gettext-poison.sh test.
>>
>> I did enough there to remove the GETTEXT_POISON prerequisite, but its
>> inverse C_LOCALE_OUTPUT is still around, and surely some tests using
>> it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=.
>>
>> Notes on the implementation:
>>
>>  * We still compile a dedicated GETTEXT_POISON build in Travis CI.
>>    This is probably the wrong thing to do and should be followed-up
>>    with something similar to ae59a4e44f ("travis: run tests with
>>    GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
>>    for running in the GIT_TEST_GETTEXT_POISON mode.
>
> I'm of two minds about this.  Sure, now it's not a compile time
> option, so we can spare a dedicated compilation, and sparing resources
> is good, of course.
>
> However, checking test failures in the 'linux-gcc' build job is always
> a bit of a hassle, because it's not enough to look at the output of
> the failed test, but I also have to check which one of the two test
> runs failed (i.e. the "regular" or one with all the GIT_TEST_* knobs
> turned on).  Adding a second test run with GIT_TEST_GETTEXT_POISON
> enabled to an other build job (e.g. 'linux-clang') would then bring
> this hassle to that build job as well.
>
> Furthermore, occasionally a build job is slower than usual (whatever
> the reason might be), which can be an issue when running the test
> suite twice, as it can exceed the time limit.
>
> Anyway, we can think about this later.
>
> In any case, GIT_TEST_GETTEXT_POISON definitely should not be enabled
> in the same "catch-all" test run with all other GIT_TEST_* variables,
> because it would mess up any translated error messages that might pop
> up in a test failure, and then we wouldn't have any idea about what
> went wrong.

Will clarify this in v3. I.e. "let's think about this...".

>>  * We now skip a test in t0000-basic.sh under
>>    GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This
>>    test relies on C locale output, but due to an edge case in how the
>>    previous implementation of GETTEXT_POISON worked (reading it from
>>    GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
>>    and needs to be skipped.
>>
>>  * The getenv() function is not reentrant, so out of paranoia about
>>    code of the form:
>>
>>        printf(_("%s"), getenv("some-env"));
>>
>>    call use_gettext_poison() in our early setup in git_setup_gettext()
>>    so we populate the "poison_requested" variable in a codepath that's
>>    won't suffer from that race condition.
>>
>> See also [3] for more on the motivation behind this patch, and the
>> history of the GETTEXT_POISON facility.
>>
>> 1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
>> 2. https://public-inbox.org/git/87woq7b58k.fsf@evledraar.gmail.com/
>> 3. https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>
>> Now:
>>
>>  * The new i18n helper is gone. We just use "test -n" semantics for
>>    $GIT_TEST_GETTEXT_POISON
>>
>>  * We error out in the Makefile if you're still saying
>>    GETTEXT_POISON=YesPlease.
>>
>>    This makes more sense than just making it a synonym since now this
>>    also needs to be defined at runtime.
>
> OK, I think erroring out is better than silently ignoring
> GETTEXT_POISON=YesPlease.  However, the commit message only mentions
> that GETTEXT_POISON is gone, but perhaps this should be mentioned
> there as well.

Will mention.

>>  * The caveat with avoiding test_lazy_prereq is gone (although there's
>>    still some unrelated bug there worth looking into).
>>
>>  * We call use_gettext_poison() really early to avoid any reentrancy
>>    issue with getenv().
>
>> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
>> index 175f83d704..3c6b185b60 100755
>> --- a/t/t9902-completion.sh
>> +++ b/t/t9902-completion.sh
>> @@ -1697,7 +1697,8 @@ test_expect_success 'sourcing the completion script clears cached commands' '
>>  	verbose test -z "$__git_all_commands"
>>  '
>>
>> -test_expect_success !GETTEXT_POISON 'sourcing the completion script clears cached merge strategies' '
>> +test_expect_success 'sourcing the completion script clears cached merge strategies' '
>> +	GIT_TEST_GETTEXT_POISON= &&
>>  	__git_compute_merge_strategies &&
>
> OK, makes sense.
>
>>  	verbose test -n "$__git_merge_strategies" &&
>>  	. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
>
>> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
>> index 78d8c3783b..2f42b3653c 100644
>> --- a/t/test-lib-functions.sh
>> +++ b/t/test-lib-functions.sh
>> @@ -755,16 +755,16 @@ test_cmp_bin() {
>>
>>  # Use this instead of test_cmp to compare files that contain expected and
>>  # actual output from git commands that can be translated.  When running
>> -# under GETTEXT_POISON this pretends that the command produced expected
>> +# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
>>  # results.
>>  test_i18ncmp () {
>> -	test -n "$GETTEXT_POISON" || test_cmp "$@"
>> +	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
>>  }
>
>>  test_i18ngrep () {
>>  	eval "last_arg=\${$#}"
>> @@ -779,7 +779,7 @@ test_i18ngrep () {
>>  		error "bug in the test script: too few parameters to test_i18ngrep"
>>  	fi
>>
>> -	if test -n "$GETTEXT_POISON"
>> +	if test_have_prereq !C_LOCALE_OUTPUT
>
> Why do these two helper functions call test_have_prereq (a function
> that doesn't even fit on my screen) instead of a simple
>
>   test -n "$GIT_TEST_GETTEXT_POISON"

I'm going to keep the call to test_have_prereq, it's consistent with
other use of the helper in the test lib, and doesn't confuse the reader
by giving the impression that we're in some really early setup where we
haven't set the prereq at this point (we have).

>>  	then
>>  		# pretend success
>>  		return 0
>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>> index 897e6fcc94..370a4821e1 100644
>> --- a/t/test-lib.sh
>> +++ b/t/test-lib.sh
>> @@ -1105,12 +1105,8 @@ test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
>>  test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
>>
>>  # Can we rely on git's output in the C locale?
>> -if test -n "$GETTEXT_POISON"
>> +if test -z "$GIT_TEST_GETTEXT_POISON"
>>  then
>> -	GIT_GETTEXT_POISON=YesPlease
>> -	export GIT_GETTEXT_POISON
>> -	test_set_prereq GETTEXT_POISON
>> -else
>>  	test_set_prereq C_LOCALE_OUTPUT
>>  fi
>
> GIT_TEST_GETTEXT_POISON=1 will influence even those git commands that
> are executed during initialization of test-lib and the test repo:
>
>   $ GIT_TEST_GETTEXT_POISON=1 ./t0000-basic.sh -v
>   # GETTEXT POISON #expecting success:
>   <....>
>
> It should say:
>
>   Initialized empty Git repository in <... path ...>
>   expecting success:
>
> See https://public-inbox.org/git/20181022202241.18629-2-szeder.dev@gmail.com/
> for a bit of code that you might find worthy to steal.

Thanks. Fixing.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3] i18n: make GETTEXT_POISON a runtime option
  2018-11-08 20:26               ` Ævar Arnfjörð Bjarmason
@ 2018-11-08 20:51                 ` SZEDER Gábor
  0 siblings, 0 replies; 49+ messages in thread
From: SZEDER Gábor @ 2018-11-08 20:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Vasco Almeida, Jiang Xin,
	Johannes Schindelin, Nguyễn Thái Ngọc Duy

On Thu, Nov 08, 2018 at 09:26:19PM +0100, Ævar Arnfjörð Bjarmason wrote:
> On Fri, Nov 02 2018, SZEDER Gábor wrote:

> >>  * We error out in the Makefile if you're still saying
> >>    GETTEXT_POISON=YesPlease.
> >>
> >>    This makes more sense than just making it a synonym since now this
> >>    also needs to be defined at runtime.
> >
> > OK, I think erroring out is better than silently ignoring
> > GETTEXT_POISON=YesPlease.  However, the commit message only mentions
> > that GETTEXT_POISON is gone, but perhaps this should be mentioned
> > there as well.
> 
> Will mention.

With the benefit of hindsight gained from looking into a failing
GETTEXT_POISON build [1], I have to agree with Junio that flat-out
erroring out when GETTEXT_POISON=YesPlease is set is really a hassle
[2].

[1] https://public-inbox.org/git/20181107130950.GA30222@szeder.dev/
    (the failure is not related to this patch)
[2] https://public-inbox.org/git/xmqqpnvg8d5z.fsf@gitster-ct.c.googlers.com/

> >> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> >> index 78d8c3783b..2f42b3653c 100644
> >> --- a/t/test-lib-functions.sh
> >> +++ b/t/test-lib-functions.sh
> >> @@ -755,16 +755,16 @@ test_cmp_bin() {
> >>
> >>  # Use this instead of test_cmp to compare files that contain expected and
> >>  # actual output from git commands that can be translated.  When running
> >> -# under GETTEXT_POISON this pretends that the command produced expected
> >> +# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
> >>  # results.
> >>  test_i18ncmp () {
> >> -	test -n "$GETTEXT_POISON" || test_cmp "$@"
> >> +	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
> >>  }
> >
> >>  test_i18ngrep () {
> >>  	eval "last_arg=\${$#}"
> >> @@ -779,7 +779,7 @@ test_i18ngrep () {
> >>  		error "bug in the test script: too few parameters to test_i18ngrep"
> >>  	fi
> >>
> >> -	if test -n "$GETTEXT_POISON"
> >> +	if test_have_prereq !C_LOCALE_OUTPUT
> >
> > Why do these two helper functions call test_have_prereq (a function
> > that doesn't even fit on my screen) instead of a simple
> >
> >   test -n "$GIT_TEST_GETTEXT_POISON"
> 
> I'm going to keep the call to test_have_prereq, it's consistent with
> other use of the helper in the test lib, and doesn't confuse the reader
> by giving the impression that we're in some really early setup where we
> haven't set the prereq at this point (we have).

Using the prereq in the first place is what confused (and is still
confusing...) the reader.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v4 0/2] i18n: make GETTEXT_POISON a runtime option
  2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
                               ` (2 preceding siblings ...)
  2018-11-08  3:24             ` Junio C Hamano
@ 2018-11-08 21:15             ` Ævar Arnfjörð Bjarmason
  2018-11-08 21:15             ` [PATCH v4 1/2] " Ævar Arnfjörð Bjarmason
  2018-11-08 21:15             ` [PATCH v4 2/2] Makefile: ease dynamic-gettext-poison transition Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-08 21:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, SZEDER Gábor, Vasco Almeida,
	Jiang Xin, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Addresses all the feedback against v3. Includes a patch by Junio
sitting in "pu" (and I fixed the grammar error Eric pointed out in its
commit message).

Range-diff:

1:  cc24ba8de8 ! 1:  2210ee8bd9 i18n: make GETTEXT_POISON a runtime option
    @@ -34,11 +34,11 @@
     
         Notes on the implementation:
     
    -     * We still compile a dedicated GETTEXT_POISON build in Travis CI.
    -       This is probably the wrong thing to do and should be followed-up
    -       with something similar to ae59a4e44f ("travis: run tests with
    -       GIT_TEST_SPLIT_INDEX", 2018-01-07) to re-use an existing test setup
    -       for running in the GIT_TEST_GETTEXT_POISON mode.
    +     * We still compile a dedicated GETTEXT_POISON build in Travis
    +       CI. Perhaps this should be revisited and integrated into the
    +       "linux-gcc" build, see ae59a4e44f ("travis: run tests with
    +       GIT_TEST_SPLIT_INDEX", 2018-01-07) for prior art in that area. Then
    +       again maybe not, see [2].
     
          * We now skip a test in t0000-basic.sh under
            GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This
    @@ -56,12 +56,22 @@
            so we populate the "poison_requested" variable in a codepath that's
            won't suffer from that race condition.
     
    -    See also [3] for more on the motivation behind this patch, and the
    +     * We error out in the Makefile if you're still saying
    +       GETTEXT_POISON=YesPlease to prompt users to change their
    +       invocation.
    +
    +     * We should not print out poisoned messages during the test
    +       initialization itself to keep it more readable, so the test library
    +       hides the variable if set in $GIT_TEST_GETTEXT_POISON_ORIG during
    +       setup. See [3].
    +
    +    See also [4] for more on the motivation behind this patch, and the
         history of the GETTEXT_POISON facility.
     
         1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
    -    2. https://public-inbox.org/git/87woq7b58k.fsf@evledraar.gmail.com/
    -    3. https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/
    +    2. https://public-inbox.org/git/20181102163725.GY30222@szeder.dev/
    +    3. https://public-inbox.org/git/20181022202241.18629-2-szeder.dev@gmail.com/
    +    4. https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ -181,7 +191,7 @@
      #else
      static inline void git_setup_gettext(void)
      {
    -+	use_gettext_poison();; /* getenv() reentrancy paranoia */
    ++	use_gettext_poison(); /* getenv() reentrancy paranoia */
      }
      static inline int gettext_width(const char *s)
      {
    @@ -392,8 +402,32 @@
      --- a/t/test-lib.sh
      +++ b/t/test-lib.sh
     @@
    + TZ=UTC
    + export LANG LC_ALL PAGER TZ
    + EDITOR=:
    ++
    ++# GIT_TEST_GETTEXT_POISON should not influence git commands executed
    ++# during initialization of test-lib and the test repo. Back it up,
    ++# unset and then restore after initialization is finished.
    ++if test -n "$GIT_TEST_GETTEXT_POISON"
    ++then
    ++	GIT_TEST_GETTEXT_POISON_ORIG=$GIT_TEST_GETTEXT_POISON
    ++	unset GIT_TEST_GETTEXT_POISON
    ++fi
    ++
    + # A call to "unset" with no arguments causes at least Solaris 10
    + # /usr/xpg4/bin/sh and /bin/ksh to bail out.  So keep the unsets
    + # deriving from the command substitution clustered with the other
    +@@
    + test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
      test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
      
    ++if test -n "$GIT_TEST_GETTEXT_POISON_ORIG"
    ++then
    ++	GIT_TEST_GETTEXT_POISON=$GIT_TEST_GETTEXT_POISON_ORIG
    ++	unset GIT_TEST_GETTEXT_POISON_ORIG
    ++fi
    ++
      # Can we rely on git's output in the C locale?
     -if test -n "$GETTEXT_POISON"
     +if test -z "$GIT_TEST_GETTEXT_POISON"
-:  ---------- > 2:  a6948d936a Makefile: ease dynamic-gettext-poison transition


Junio C Hamano (1):
  Makefile: ease dynamic-gettext-poison transition

Ævar Arnfjörð Bjarmason (1):
  i18n: make GETTEXT_POISON a runtime option

 .travis.yml               |  2 +-
 Makefile                  |  8 +-------
 ci/lib-travisci.sh        |  4 ++--
 gettext.c                 | 11 +++++++----
 gettext.h                 |  9 +++------
 git-sh-i18n.sh            |  2 +-
 po/README                 | 13 ++++---------
 t/README                  |  6 ++++++
 t/lib-gettext.sh          |  2 +-
 t/t0000-basic.sh          |  2 +-
 t/t0205-gettext-poison.sh |  8 +++++---
 t/t3406-rebase-message.sh |  2 +-
 t/t7201-co.sh             |  6 +++---
 t/t9902-completion.sh     |  3 ++-
 t/test-lib-functions.sh   |  8 ++++----
 t/test-lib.sh             | 22 +++++++++++++++++-----
 16 files changed, 59 insertions(+), 49 deletions(-)

-- 
2.19.1.930.g4563a0d9d0


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v4 1/2] i18n: make GETTEXT_POISON a runtime option
  2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
                               ` (3 preceding siblings ...)
  2018-11-08 21:15             ` [PATCH v4 0/2] " Ævar Arnfjörð Bjarmason
@ 2018-11-08 21:15             ` Ævar Arnfjörð Bjarmason
  2018-11-08 21:15             ` [PATCH v4 2/2] Makefile: ease dynamic-gettext-poison transition Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-08 21:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, SZEDER Gábor, Vasco Almeida,
	Jiang Xin, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON
test parameter to only be a GIT_TEST_GETTEXT_POISON=<non-empty?>
runtime parameter, to be consistent with other parameters documented
in "Running tests with special setups" in t/README.

When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON
to simulate unfriendly translator", 2011-02-22) I was concerned with
ensuring that the _() function would get constant folded if NO_GETTEXT
was defined, and likewise that GETTEXT_POISON would be compiled out
unless it was defined.

But as the benchmark in my [1] shows doing a one-off runtime
getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was
originally added the GIT_TEST_* env variables have become the common
idiom for turning on special test setups.

So change GETTEXT_POISON to work the same way. Now the
GETTEXT_POISON=YesPlease compile-time option is gone, and running the
tests with GIT_TEST_GETTEXT_POISON=[YesPlease|] can be toggled on/off
without recompiling.

This allows for conditionally amending tests to test with/without
poison, similar to what 859fdc0c3c ("commit-graph: define
GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do
some of that, now we e.g. always run the t0205-gettext-poison.sh test.

I did enough there to remove the GETTEXT_POISON prerequisite, but its
inverse C_LOCALE_OUTPUT is still around, and surely some tests using
it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=.

Notes on the implementation:

 * We still compile a dedicated GETTEXT_POISON build in Travis
   CI. Perhaps this should be revisited and integrated into the
   "linux-gcc" build, see ae59a4e44f ("travis: run tests with
   GIT_TEST_SPLIT_INDEX", 2018-01-07) for prior art in that area. Then
   again maybe not, see [2].

 * We now skip a test in t0000-basic.sh under
   GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This
   test relies on C locale output, but due to an edge case in how the
   previous implementation of GETTEXT_POISON worked (reading it from
   GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does,
   and needs to be skipped.

 * The getenv() function is not reentrant, so out of paranoia about
   code of the form:

       printf(_("%s"), getenv("some-env"));

   call use_gettext_poison() in our early setup in git_setup_gettext()
   so we populate the "poison_requested" variable in a codepath that's
   won't suffer from that race condition.

 * We error out in the Makefile if you're still saying
   GETTEXT_POISON=YesPlease to prompt users to change their
   invocation.

 * We should not print out poisoned messages during the test
   initialization itself to keep it more readable, so the test library
   hides the variable if set in $GIT_TEST_GETTEXT_POISON_ORIG during
   setup. See [3].

See also [4] for more on the motivation behind this patch, and the
history of the GETTEXT_POISON facility.

1. https://public-inbox.org/git/871s8gd32p.fsf@evledraar.gmail.com/
2. https://public-inbox.org/git/20181102163725.GY30222@szeder.dev/
3. https://public-inbox.org/git/20181022202241.18629-2-szeder.dev@gmail.com/
4. https://public-inbox.org/git/878t2pd6yu.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 .travis.yml               |  2 +-
 Makefile                  |  8 +-------
 ci/lib-travisci.sh        |  4 ++--
 gettext.c                 | 11 +++++++----
 gettext.h                 |  9 +++------
 git-sh-i18n.sh            |  2 +-
 po/README                 | 13 ++++---------
 t/README                  |  6 ++++++
 t/lib-gettext.sh          |  2 +-
 t/t0000-basic.sh          |  2 +-
 t/t0205-gettext-poison.sh |  8 +++++---
 t/t3406-rebase-message.sh |  2 +-
 t/t7201-co.sh             |  6 +++---
 t/t9902-completion.sh     |  3 ++-
 t/test-lib-functions.sh   |  8 ++++----
 t/test-lib.sh             | 22 +++++++++++++++++-----
 16 files changed, 59 insertions(+), 49 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 8d2499739e..a329a0add6 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -24,7 +24,7 @@ addons:
 
 matrix:
   include:
-    - env: jobname=GETTEXT_POISON
+    - env: jobname=GIT_TEST_GETTEXT_POISON
       os: linux
       compiler:
       addons:
diff --git a/Makefile b/Makefile
index bbfbb4292d..f3a9995e50 100644
--- a/Makefile
+++ b/Makefile
@@ -362,11 +362,6 @@ all::
 # Define HAVE_DEV_TTY if your system can open /dev/tty to interact with the
 # user.
 #
-# Define GETTEXT_POISON if you are debugging the choice of strings marked
-# for translation.  In a GETTEXT_POISON build, you can turn all strings marked
-# for translation into gibberish by setting the GIT_GETTEXT_POISON variable
-# (to any value) in your environment.
-#
 # Define JSMIN to point to JavaScript minifier that functions as
 # a filter to have gitweb.js minified.
 #
@@ -1452,7 +1447,7 @@ ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
 ifdef GETTEXT_POISON
-	BASIC_CFLAGS += -DGETTEXT_POISON
+$(error The GETTEXT_POISON option has been removed in favor of runtime GIT_TEST_GETTEXT_POISON. See t/README!)
 endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT
@@ -2603,7 +2598,6 @@ ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
 	@echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@+
 endif
 	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
-	@echo GETTEXT_POISON=\''$(subst ','\'',$(subst ','\'',$(GETTEXT_POISON)))'\' >>$@+
 ifdef GIT_PERF_REPEAT_COUNT
 	@echo GIT_PERF_REPEAT_COUNT=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_REPEAT_COUNT)))'\' >>$@+
 endif
diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 06970f7213..69dff4d1ec 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -123,7 +123,7 @@ osx-clang|osx-gcc)
 	# Travis CI OS X
 	export GIT_SKIP_TESTS="t9810 t9816"
 	;;
-GETTEXT_POISON)
-	export GETTEXT_POISON=YesPlease
+GIT_TEST_GETTEXT_POISON)
+	export GIT_TEST_GETTEXT_POISON=YesPlease
 	;;
 esac
diff --git a/gettext.c b/gettext.c
index 7272771c8e..d4021d690c 100644
--- a/gettext.c
+++ b/gettext.c
@@ -7,6 +7,7 @@
 #include "gettext.h"
 #include "strbuf.h"
 #include "utf8.h"
+#include "config.h"
 
 #ifndef NO_GETTEXT
 #	include <locale.h>
@@ -46,15 +47,15 @@ const char *get_preferred_languages(void)
 	return NULL;
 }
 
-#ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
 	static int poison_requested = -1;
-	if (poison_requested == -1)
-		poison_requested = getenv("GIT_GETTEXT_POISON") ? 1 : 0;
+	if (poison_requested == -1) {
+		const char *v = getenv("GIT_TEST_GETTEXT_POISON");
+		poison_requested = v && strlen(v) ? 1 : 0;
+	}
 	return poison_requested;
 }
-#endif
 
 #ifndef NO_GETTEXT
 static int test_vsnprintf(const char *fmt, ...)
@@ -164,6 +165,8 @@ void git_setup_gettext(void)
 	if (!podir)
 		podir = p = system_path(GIT_LOCALE_PATH);
 
+	use_gettext_poison(); /* getenv() reentrancy paranoia */
+
 	if (!is_directory(podir)) {
 		free(p);
 		return;
diff --git a/gettext.h b/gettext.h
index 7eee64a34f..71255e503e 100644
--- a/gettext.h
+++ b/gettext.h
@@ -28,12 +28,15 @@
 
 #define FORMAT_PRESERVING(n) __attribute__((format_arg(n)))
 
+extern int use_gettext_poison(void);
+
 #ifndef NO_GETTEXT
 extern void git_setup_gettext(void);
 extern int gettext_width(const char *s);
 #else
 static inline void git_setup_gettext(void)
 {
+	use_gettext_poison(); /* getenv() reentrancy paranoia */
 }
 static inline int gettext_width(const char *s)
 {
@@ -41,12 +44,6 @@ static inline int gettext_width(const char *s)
 }
 #endif
 
-#ifdef GETTEXT_POISON
-extern int use_gettext_poison(void);
-#else
-#define use_gettext_poison() 0
-#endif
-
 static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 {
 	if (!*msgid)
diff --git a/git-sh-i18n.sh b/git-sh-i18n.sh
index 9d065fb4bf..e1d917fd27 100644
--- a/git-sh-i18n.sh
+++ b/git-sh-i18n.sh
@@ -17,7 +17,7 @@ export TEXTDOMAINDIR
 
 # First decide what scheme to use...
 GIT_INTERNAL_GETTEXT_SH_SCHEME=fallthrough
-if test -n "$GIT_GETTEXT_POISON"
+if test -n "$GIT_TEST_GETTEXT_POISON"
 then
 	GIT_INTERNAL_GETTEXT_SH_SCHEME=poison
 elif test -n "@@USE_GETTEXT_SCHEME@@"
diff --git a/po/README b/po/README
index fef4c0f0b5..aa704ffcb7 100644
--- a/po/README
+++ b/po/README
@@ -289,16 +289,11 @@ something in the test suite might still depend on the US English
 version of the strings, e.g. to grep some error message or other
 output.
 
-To smoke out issues like these Git can be compiled with gettext poison
-support, at the top-level:
+To smoke out issues like these, Git tested with a translation mode that
+emits gibberish on every call to gettext. To use it run the test suite
+with it, e.g.:
 
-    make GETTEXT_POISON=YesPlease
-
-That'll give you a git which emits gibberish on every call to
-gettext. It's obviously not meant to be installed, but you should run
-the test suite with it:
-
-    cd t && prove -j 9 ./t[0-9]*.sh
+    cd t && GIT_TEST_GETTEXT_POISON=YesPlease prove -j 9 ./t[0-9]*.sh
 
 If tests break with it you should inspect them manually and see if
 what you're translating is sane, i.e. that you're not translating
diff --git a/t/README b/t/README
index 2e9bef2852..bb2db49615 100644
--- a/t/README
+++ b/t/README
@@ -302,6 +302,12 @@ that cannot be easily covered by a few specific test cases. These
 could be enabled by running the test suite with correct GIT_TEST_
 environment set.
 
+GIT_TEST_GETTEXT_POISON=<non-empty?> turns all strings marked for
+translation into gibberish if non-empty (think "test -n"). Used for
+spotting those tests that need to be marked with a C_LOCALE_OUTPUT
+prerequisite when adding more strings for translation. See "Testing
+marked strings" in po/README for details.
+
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh
index eec757f104..755f421431 100644
--- a/t/lib-gettext.sh
+++ b/t/lib-gettext.sh
@@ -12,7 +12,7 @@ export GIT_TEXTDOMAINDIR GIT_PO_PATH
 
 . "$GIT_BUILD_DIR"/git-sh-i18n
 
-if test_have_prereq GETTEXT && ! test_have_prereq GETTEXT_POISON
+if test_have_prereq GETTEXT && test_have_prereq C_LOCALE_OUTPUT
 then
 	# is_IS.UTF-8 on Solaris and FreeBSD, is_IS.utf8 on Debian
 	is_IS_locale=$(locale -a 2>/dev/null |
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 4d23373526..b6566003dd 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -274,7 +274,7 @@ test_expect_success 'pretend we have a mix of all possible results' "
 	EOF
 "
 
-test_expect_success 'test --verbose' '
+test_expect_success C_LOCALE_OUTPUT 'test --verbose' '
 	test_must_fail run_sub_test_lib_test \
 		test-verbose "test verbose" --verbose <<-\EOF &&
 	test_expect_success "passing test" true
diff --git a/t/t0205-gettext-poison.sh b/t/t0205-gettext-poison.sh
index 438e778d6a..a06269f38a 100755
--- a/t/t0205-gettext-poison.sh
+++ b/t/t0205-gettext-poison.sh
@@ -5,13 +5,15 @@
 
 test_description='Gettext Shell poison'
 
+GIT_TEST_GETTEXT_POISON=YesPlease
+export GIT_TEST_GETTEXT_POISON
 . ./lib-gettext.sh
 
-test_expect_success GETTEXT_POISON 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
+test_expect_success 'sanity: $GIT_INTERNAL_GETTEXT_SH_SCHEME" is poison' '
     test "$GIT_INTERNAL_GETTEXT_SH_SCHEME" = "poison"
 '
 
-test_expect_success GETTEXT_POISON 'gettext: our gettext() fallback has poison semantics' '
+test_expect_success 'gettext: our gettext() fallback has poison semantics' '
     printf "# GETTEXT POISON #" >expect &&
     gettext "test" >actual &&
     test_cmp expect actual &&
@@ -20,7 +22,7 @@ test_expect_success GETTEXT_POISON 'gettext: our gettext() fallback has poison s
     test_cmp expect actual
 '
 
-test_expect_success GETTEXT_POISON 'eval_gettext: our eval_gettext() fallback has poison semantics' '
+test_expect_success 'eval_gettext: our eval_gettext() fallback has poison semantics' '
     printf "# GETTEXT POISON #" >expect &&
     eval_gettext "test" >actual &&
     test_cmp expect actual &&
diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
index 0392e36d23..2bdcf83808 100755
--- a/t/t3406-rebase-message.sh
+++ b/t/t3406-rebase-message.sh
@@ -77,7 +77,7 @@ test_expect_success 'rebase -n overrides config rebase.stat config' '
 #     "Does not point to a valid commit: invalid-ref"
 #
 # NEEDSWORK: This "grep" is fine in real non-C locales, but
-# GETTEXT_POISON poisons the refname along with the enclosing
+# GIT_TEST_GETTEXT_POISON poisons the refname along with the enclosing
 # error message.
 test_expect_success 'rebase --onto outputs the invalid ref' '
 	test_must_fail git rebase --onto invalid-ref HEAD HEAD 2>err &&
diff --git a/t/t7201-co.sh b/t/t7201-co.sh
index 826987ca80..72b9b375ba 100755
--- a/t/t7201-co.sh
+++ b/t/t7201-co.sh
@@ -254,9 +254,9 @@ test_expect_success 'checkout to detach HEAD (with advice declined)' '
 test_expect_success 'checkout to detach HEAD' '
 	git config advice.detachedHead true &&
 	git checkout -f renamer && git clean -f &&
-	git checkout renamer^ 2>messages &&
-	test_i18ngrep "HEAD is now at 7329388" messages &&
-	(test_line_count -gt 1 messages || test -n "$GETTEXT_POISON") &&
+	GIT_TEST_GETTEXT_POISON= git checkout renamer^ 2>messages &&
+	grep "HEAD is now at 7329388" messages &&
+	test_line_count -gt 1 messages &&
 	H=$(git rev-parse --verify HEAD) &&
 	M=$(git show-ref -s --verify refs/heads/master) &&
 	test "z$H" = "z$M" &&
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 175f83d704..3c6b185b60 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1697,7 +1697,8 @@ test_expect_success 'sourcing the completion script clears cached commands' '
 	verbose test -z "$__git_all_commands"
 '
 
-test_expect_success !GETTEXT_POISON 'sourcing the completion script clears cached merge strategies' '
+test_expect_success 'sourcing the completion script clears cached merge strategies' '
+	GIT_TEST_GETTEXT_POISON= &&
 	__git_compute_merge_strategies &&
 	verbose test -n "$__git_merge_strategies" &&
 	. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78d8c3783b..2f42b3653c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -755,16 +755,16 @@ test_cmp_bin() {
 
 # Use this instead of test_cmp to compare files that contain expected and
 # actual output from git commands that can be translated.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ncmp () {
-	test -n "$GETTEXT_POISON" || test_cmp "$@"
+	! test_have_prereq C_LOCALE_OUTPUT || test_cmp "$@"
 }
 
 # Use this instead of "grep expected-string actual" to see if the
 # output from a git command that can be translated either contains an
 # expected string, or does not contain an unwanted one.  When running
-# under GETTEXT_POISON this pretends that the command produced expected
+# under GIT_TEST_GETTEXT_POISON this pretends that the command produced expected
 # results.
 test_i18ngrep () {
 	eval "last_arg=\${$#}"
@@ -779,7 +779,7 @@ test_i18ngrep () {
 		error "bug in the test script: too few parameters to test_i18ngrep"
 	fi
 
-	if test -n "$GETTEXT_POISON"
+	if test_have_prereq !C_LOCALE_OUTPUT
 	then
 		# pretend success
 		return 0
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 47a99aa0ed..b50f934dcd 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -95,6 +95,16 @@ PAGER=cat
 TZ=UTC
 export LANG LC_ALL PAGER TZ
 EDITOR=:
+
+# GIT_TEST_GETTEXT_POISON should not influence git commands executed
+# during initialization of test-lib and the test repo. Back it up,
+# unset and then restore after initialization is finished.
+if test -n "$GIT_TEST_GETTEXT_POISON"
+then
+	GIT_TEST_GETTEXT_POISON_ORIG=$GIT_TEST_GETTEXT_POISON
+	unset GIT_TEST_GETTEXT_POISON
+fi
+
 # A call to "unset" with no arguments causes at least Solaris 10
 # /usr/xpg4/bin/sh and /bin/ksh to bail out.  So keep the unsets
 # deriving from the command substitution clustered with the other
@@ -1104,13 +1114,15 @@ test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
+if test -n "$GIT_TEST_GETTEXT_POISON_ORIG"
+then
+	GIT_TEST_GETTEXT_POISON=$GIT_TEST_GETTEXT_POISON_ORIG
+	unset GIT_TEST_GETTEXT_POISON_ORIG
+fi
+
 # Can we rely on git's output in the C locale?
-if test -n "$GETTEXT_POISON"
+if test -z "$GIT_TEST_GETTEXT_POISON"
 then
-	GIT_GETTEXT_POISON=YesPlease
-	export GIT_GETTEXT_POISON
-	test_set_prereq GETTEXT_POISON
-else
 	test_set_prereq C_LOCALE_OUTPUT
 fi
 
-- 
2.19.1.930.g4563a0d9d0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v4 2/2] Makefile: ease dynamic-gettext-poison transition
  2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
                               ` (4 preceding siblings ...)
  2018-11-08 21:15             ` [PATCH v4 1/2] " Ævar Arnfjörð Bjarmason
@ 2018-11-08 21:15             ` Ævar Arnfjörð Bjarmason
  5 siblings, 0 replies; 49+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-11-08 21:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, SZEDER Gábor, Vasco Almeida,
	Jiang Xin, Johannes Schindelin,
	Nguyễn Thái Ngọc Duy, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

From: Junio C Hamano <gitster@pobox.com>

Earlier we made the entire build to fail when GETTEXT_POISON=Yes is
given to make, to notify those who did not notice that text poisoning
is now a runtime behaviour.

It turns out that this is too irritating for those who need to build
and test different versions of Git that cross the boundary between
history with and without this topic to switch between two
environment variables.  Demote the error to a warning, so that you
can say something like

    make GETTEXT_POISON=Yes GIT_TEST_GETTEXT_POISON=Yes test

during the transition period, without having to worry about whether
exact version you are testing has or does not have this topic.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f3a9995e50..6b492f44a6 100644
--- a/Makefile
+++ b/Makefile
@@ -1447,7 +1447,7 @@ ifdef NO_SYMLINK_HEAD
 	BASIC_CFLAGS += -DNO_SYMLINK_HEAD
 endif
 ifdef GETTEXT_POISON
-$(error The GETTEXT_POISON option has been removed in favor of runtime GIT_TEST_GETTEXT_POISON. See t/README!)
+$(warning The GETTEXT_POISON option has been removed in favor of runtime GIT_TEST_GETTEXT_POISON. See t/README!)
 endif
 ifdef NO_GETTEXT
 	BASIC_CFLAGS += -DNO_GETTEXT
-- 
2.19.1.930.g4563a0d9d0


^ permalink raw reply related	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2018-11-08 21:15 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-22 15:36 [PATCH] Poison gettext with the Ook language Nguyễn Thái Ngọc Duy
2018-10-22 20:22 ` SZEDER Gábor
2018-10-22 20:22   ` [PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment SZEDER Gábor
2018-10-22 20:22   ` [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty SZEDER Gábor
2018-10-22 20:38     ` Ævar Arnfjörð Bjarmason
2018-10-22 20:22   ` [PATCH 3/8] lib-rebase: loosen GETTEXT_POISON check in fake editor SZEDER Gábor
2018-10-22 20:22   ` [PATCH 4/8] gettext: #ifdef away GETTEXT POISON-related code from _() and Q_() SZEDER Gábor
2018-10-22 20:22   ` [PATCH 5/8] gettext: put "# GETTEXT POISON #" string literal into a macro SZEDER Gábor
2018-10-22 20:22   ` [PATCH 6/8] gettext: use an enum for the mode of GETTEXT POISONing SZEDER Gábor
2018-10-22 20:22   ` [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled SZEDER Gábor
2018-10-23 14:44     ` Duy Nguyen
2018-10-22 20:22   ` [PATCH 8/8] travis-ci: run GETTEXT POISON build job in scrambled mode, too SZEDER Gábor
2018-10-23 14:37   ` [PATCH] Poison gettext with the Ook language Duy Nguyen
2018-10-27 10:11   ` Jakub Narebski
2018-10-22 20:52 ` Johannes Schindelin
2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
2018-10-22 21:46   ` Ævar Arnfjörð Bjarmason
2018-10-22 23:04   ` Junio C Hamano
2018-10-23 21:01     ` [PATCH] i18n: make GETTEXT_POISON a runtime option Ævar Arnfjörð Bjarmason
2018-10-24  5:45       ` Junio C Hamano
2018-10-24  7:44         ` Jeff King
2018-10-25  1:00           ` Junio C Hamano
2018-10-25  1:09             ` Jeff King
2018-10-25  1:24               ` Ramsay Jones
2018-10-25 21:23                 ` Jeff King
2018-10-26 19:20                   ` Ævar Arnfjörð Bjarmason
2018-10-27  6:59                     ` Jeff King
2018-10-27 10:42                       ` Ævar Arnfjörð Bjarmason
2018-10-24 11:47         ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
2018-11-02  3:11             ` Junio C Hamano
2018-11-02 16:37             ` SZEDER Gábor
2018-11-08 20:26               ` Ævar Arnfjörð Bjarmason
2018-11-08 20:51                 ` SZEDER Gábor
2018-11-08  3:24             ` Junio C Hamano
2018-11-08 19:12               ` Eric Sunshine
2018-11-08 21:15             ` [PATCH v4 0/2] " Ævar Arnfjörð Bjarmason
2018-11-08 21:15             ` [PATCH v4 1/2] " Ævar Arnfjörð Bjarmason
2018-11-08 21:15             ` [PATCH v4 2/2] Makefile: ease dynamic-gettext-poison transition Ævar Arnfjörð Bjarmason
2018-10-23  9:30   ` [PATCH] Poison gettext with the Ook language Johannes Schindelin
2018-10-23 10:17     ` Ævar Arnfjörð Bjarmason
2018-10-23 11:07       ` Johannes Schindelin
2018-10-23 15:00       ` Duy Nguyen
2018-10-23 16:45         ` Ævar Arnfjörð Bjarmason
2018-10-24 14:41           ` Duy Nguyen
2018-10-24 17:54             ` Ævar Arnfjörð Bjarmason
2018-10-25  3:52             ` Junio C Hamano
2018-10-25  6:20               ` Jeff King
2018-10-27  6:55                 ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).