git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: git@vger.kernel.org
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Ævar Arnfjörð" <avarab@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled
Date: Mon, 22 Oct 2018 22:22:40 +0200	[thread overview]
Message-ID: <20181022202241.18629-8-szeder.dev@gmail.com> (raw)
In-Reply-To: <20181022202241.18629-1-szeder.dev@gmail.com>

Sometimes tests run with GETTEXT POISON fail because of a reason other
than a translated string that should not have been translated.  In
such a case an error message from a git command in the test's verbose
output is usually, well, less than helpful:

  error: # GETTEXT POISON #

or

  fatal: # GETTEXT POISON #: No such file or directory

It's especially annoying on those rare occasions when a heisenbug
decides it's a good time to suddenly reveal its presence during a
GETTEXT POISON test run, and all we get is an error message like these
(yes, I did actually see both of the above error messages only once).

Make builtin commands' GETTEXT POISON-ed error messages more useful
for debugging failures by introducing a new mode of poisoning: if
$GIT_GETTEXT_POISON is set to 'scrambled', then include the original
untranslated message after that "# GETTEXT_POISON #" string in a
scrambled form, interspersing a '.' after each character.  This way
the messages will remain gibberish enough for machine consumption as
they were before, but at the same time they will be relatively easily
legible for humans.  Take extra care to preserve printf() format
conversion specifiers unaltered when inserting those dots.

Leave 'git-sh-i18n.sh' unchanged, because translatable messages in
scripts often include shell variables, and they could (though
currently they don't) include printf format specifiers, parameter
expansions, command substitutions and whatnot, too.  Dealing with
those in a shell script would be too much hassle without its worth.

There is an additional benefit: as this change considerably increases
the size of translated messages, it could detect cases when we try to
format a translated string into a too small buffer.  E.g. this change
applied on old versions causes test failures because of the bug that
was fixed in 2cfa83574c (bisect_next_all: convert xsnprintf to
xstrfmt, 2017-02-16).

[TODO: Fallout?
       A 'printf(_("foo: %s"), var);' call includes the contents of
       'var' unscrambled in the output.  Could that hide the
       translation of a string that should not have been translated?
       I'm afraid yes: to check the output of that printf() a sloppy
       test could do:

         git plumbing-cmd >out && grep "var's content" out

       which would fail in a regular GETTEXT_POISON test run, but
       would succeed in a scrambled test run.  Does this matter in
       practice, do we care at all?

       Does gettext_scramble() need a FORMAT_PRESERVING annotation?
       Seems to work fine without it so far...]

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 gettext.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 gettext.h | 11 +++++++++--
 2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/gettext.c b/gettext.c
index c50d1e0377..8ba7fd0bea 100644
--- a/gettext.c
+++ b/gettext.c
@@ -52,13 +52,61 @@ enum poison_mode use_gettext_poison(void)
 	static enum poison_mode poison_mode = poison_mode_uninitialized;
 	if (poison_mode == poison_mode_uninitialized) {
 		const char *v = getenv("GIT_GETTEXT_POISON");
-		if (v && *v)
-			poison_mode = poison_mode_default;
-		else
+		if (v && *v) {
+			if (!strcmp(v, "scrambled"))
+				poison_mode = poison_mode_scrambled;
+			else
+				poison_mode = poison_mode_default;
+		} else
 			poison_mode = poison_mode_none;
 	}
 	return poison_mode;
 }
+
+static int conversion_specifier_len(const char *s)
+{
+	const char printf_conversion_specifiers[] = "diouxXeEfFgGaAcsCSpnm%";
+	const char *format_end;
+
+	if (*s != '%')
+		return 0;
+
+	format_end = strpbrk(s + 1, printf_conversion_specifiers);
+	if (format_end)
+		return format_end - s;
+	else
+		return 0;
+}
+
+const char *gettext_scramble(const char *msg)
+{
+	struct strbuf sb;
+
+	strbuf_init(&sb,
+		    /* "# GETTEXT_POISON #" + ' ' + "m.e.s.s.a.g.e." + '\0' */
+		    strlen(GETTEXT_POISON_MAGIC) + 1 + 2 * strlen(msg) + 1);
+
+	strbuf_addch(&sb, ' ');
+	while (*msg) {
+		if (*msg == '\n') {
+			strbuf_addch(&sb, *(msg++));
+			continue;
+		} else if (*msg == '%') {
+			int spec_len = conversion_specifier_len(msg);
+			if (spec_len) {
+				strbuf_add(&sb, msg, spec_len);
+				msg += spec_len;
+				continue;
+			}
+		}
+
+		strbuf_addch(&sb, *(msg++));
+		strbuf_addch(&sb, '.');
+	}
+
+	/* This will be leaked... */
+	return strbuf_detach(&sb, NULL);
+}
 #endif
 
 #ifndef NO_GETTEXT
diff --git a/gettext.h b/gettext.h
index fcb6bfaa2c..d21346d9fa 100644
--- a/gettext.h
+++ b/gettext.h
@@ -45,10 +45,12 @@ static inline int gettext_width(const char *s)
 enum poison_mode {
 	poison_mode_uninitialized = -1,
 	poison_mode_none = 0,
-	poison_mode_default
+	poison_mode_default,
+	poison_mode_scrambled
 };
 
 extern enum poison_mode use_gettext_poison(void);
+extern const char *gettext_scramble(const char *msg);
 
 #define GETTEXT_POISON_MAGIC "# GETTEXT POISON #"
 #endif
@@ -60,6 +62,8 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
+	else if (use_gettext_poison() == poison_mode_scrambled)
+		return gettext_scramble(gettext(msgid));
 #endif
 	return gettext(msgid);
 }
@@ -67,11 +71,14 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid)
 static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2)
 const char *Q_(const char *msgid, const char *plu, unsigned long n)
 {
+	const char *msg = ngettext(msgid, plu, n);
 #ifdef GETTEXT_POISON
 	if (use_gettext_poison() == poison_mode_default)
 		return GETTEXT_POISON_MAGIC;
+	else if (use_gettext_poison() == poison_mode_scrambled)
+		return gettext_scramble(msg);
 #endif
-	return ngettext(msgid, plu, n);
+	return msg;
 }
 
 /* Mark msgid for translation but do not translate it. */
-- 
2.19.1.681.g6bd79da3f5


  parent reply	other threads:[~2018-10-22 20:23 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-22 15:36 [PATCH] Poison gettext with the Ook language Nguyễn Thái Ngọc Duy
2018-10-22 20:22 ` SZEDER Gábor
2018-10-22 20:22   ` [PATCH 1/8] test-lib.sh: preserve GIT_GETTEXT_POISON from the environment SZEDER Gábor
2018-10-22 20:22   ` [PATCH 2/8] gettext: don't poison if GIT_GETTEXT_POISON is set but empty SZEDER Gábor
2018-10-22 20:38     ` Ævar Arnfjörð Bjarmason
2018-10-22 20:22   ` [PATCH 3/8] lib-rebase: loosen GETTEXT_POISON check in fake editor SZEDER Gábor
2018-10-22 20:22   ` [PATCH 4/8] gettext: #ifdef away GETTEXT POISON-related code from _() and Q_() SZEDER Gábor
2018-10-22 20:22   ` [PATCH 5/8] gettext: put "# GETTEXT POISON #" string literal into a macro SZEDER Gábor
2018-10-22 20:22   ` [PATCH 6/8] gettext: use an enum for the mode of GETTEXT POISONing SZEDER Gábor
2018-10-22 20:22   ` SZEDER Gábor [this message]
2018-10-23 14:44     ` [PATCH 7/8] gettext: introduce GIT_GETTEXT_POISON=scrambled Duy Nguyen
2018-10-22 20:22   ` [PATCH 8/8] travis-ci: run GETTEXT POISON build job in scrambled mode, too SZEDER Gábor
2018-10-23 14:37   ` [PATCH] Poison gettext with the Ook language Duy Nguyen
2018-10-27 10:11   ` Jakub Narebski
2018-10-22 20:52 ` Johannes Schindelin
2018-10-22 21:09 ` Ævar Arnfjörð Bjarmason
2018-10-22 21:46   ` Ævar Arnfjörð Bjarmason
2018-10-22 23:04   ` Junio C Hamano
2018-10-23 21:01     ` [PATCH] i18n: make GETTEXT_POISON a runtime option Ævar Arnfjörð Bjarmason
2018-10-24  5:45       ` Junio C Hamano
2018-10-24  7:44         ` Jeff King
2018-10-25  1:00           ` Junio C Hamano
2018-10-25  1:09             ` Jeff King
2018-10-25  1:24               ` Ramsay Jones
2018-10-25 21:23                 ` Jeff King
2018-10-26 19:20                   ` Ævar Arnfjörð Bjarmason
2018-10-27  6:59                     ` Jeff King
2018-10-27 10:42                       ` Ævar Arnfjörð Bjarmason
2018-10-24 11:47         ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2018-11-01 19:31           ` [PATCH v3] " Ævar Arnfjörð Bjarmason
2018-11-02  3:11             ` Junio C Hamano
2018-11-02 16:37             ` SZEDER Gábor
2018-11-08 20:26               ` Ævar Arnfjörð Bjarmason
2018-11-08 20:51                 ` SZEDER Gábor
2018-11-08  3:24             ` Junio C Hamano
2018-11-08 19:12               ` Eric Sunshine
2018-11-08 21:15             ` [PATCH v4 0/2] " Ævar Arnfjörð Bjarmason
2018-11-08 21:15             ` [PATCH v4 1/2] " Ævar Arnfjörð Bjarmason
2018-11-08 21:15             ` [PATCH v4 2/2] Makefile: ease dynamic-gettext-poison transition Ævar Arnfjörð Bjarmason
2018-10-23  9:30   ` [PATCH] Poison gettext with the Ook language Johannes Schindelin
2018-10-23 10:17     ` Ævar Arnfjörð Bjarmason
2018-10-23 11:07       ` Johannes Schindelin
2018-10-23 15:00       ` Duy Nguyen
2018-10-23 16:45         ` Ævar Arnfjörð Bjarmason
2018-10-24 14:41           ` Duy Nguyen
2018-10-24 17:54             ` Ævar Arnfjörð Bjarmason
2018-10-25  3:52             ` Junio C Hamano
2018-10-25  6:20               ` Jeff King
2018-10-27  6:55                 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181022202241.18629-8-szeder.dev@gmail.com \
    --to=szeder.dev@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).