git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] t3900: ISO-2022-JP has more than one popular variants
From: Junio C Hamano @ 2009-05-12  9:29 UTC
  To: git

A conversion from another encoding (e.g. EUC-JP or UTF-8) can produce any
of several subtly different variants of ISO-2022-JP, all of which are
valid.  At the end of a line, or when a run of text switches to 1-byte
characters, either ESC ( B can be used to switch to ASCII or ESC ( J can be
used to switch to ISO 646:JP (JIS X 0201); the two are essentially the
same character set and are used interchangeably.  Similarly, ESC $ @
(which switches to JIS X 0208-1978) and ESC $ B (which switches to JIS X
0208-1983) are in practice used interchangeably.
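
To make the variants concrete, here is a minimal sketch that writes the same
two-character text (漢字) with both sets of escape sequences and shows that a
byte-wise comparison treats them as different.  The scratch directory and the
file names v1.txt and v2.txt are hypothetical and are not part of this patch.

```shell
#!/bin/sh
# Hypothetical scratch location; not part of the patch.
tmp=$(mktemp -d) || exit 1

# \033 is ESC.  "4A;z" is the 2-byte JIS X 0208 code for 漢字
# (0x3441 0x3B7A).

# Variant 1: ESC $ B (JIS X 0208-1983), then ESC ( B (ASCII).
printf '\033$B4A;z\033(B\n' >"$tmp/v1.txt"

# Variant 2: ESC $ @ (JIS X 0208-1978), then ESC ( J (JIS X 0201).
printf '\033$@4A;z\033(J\n' >"$tmp/v2.txt"

# A byte-wise comparison only sees that the escape sequences differ.
if cmp -s "$tmp/v1.txt" "$tmp/v2.txt"
then
	echo "identical bytes"
else
	echo "different bytes"
fi
```

Both files are 11 bytes long and differ only in the final byte of each escape
sequence, which is enough for a plain diff to report a mismatch.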

Depending on the iconv library and the locale definition on the system, a
program that converts from another encoding to ISO-2022-JP can produce a
different byte sequence, and GIT_TEST_CMP (aka "diff -u") will report the
difference as a failure.

Fix this by converting the expected and the actual output to UTF-8 before
comparing when the end result is ISO-2022-JP.  The test vector string in
t3900/ISO-2022-JP.txt is expressed with ASCII and JIS X 0208-1983, but it
can be expressed with any other possible variant, and when converted back
to UTF-8, these variants produce identical byte sequences.
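
The idea can be sketched outside the test suite as follows.  This is only an
illustration under assumptions: compare_as_utf8 is a hypothetical helper
loosely modeled on the patched compare_with, the scratch files are made up,
and the result depends on the local iconv accepting the name ISO-2022-JP and
all four escape sequences.

```shell
#!/bin/sh
# Hypothetical scratch files holding two spellings of 漢字.
tmp=$(mktemp -d) || exit 1
printf '\033$B4A;z\033(B\n' >"$tmp/expect"   # JIS X 0208-1983 + ASCII
printf '\033$@4A;z\033(J\n' >"$tmp/current"  # JIS X 0208-1978 + JIS X 0201

# Hypothetical helper: convert both files from encoding $1 to UTF-8
# and compare the converted bytes instead of the raw ones.
compare_as_utf8 () {
	iconv -f "$1" -t UTF-8 <"$2" >"$2.utf8" &&
	iconv -f "$1" -t UTF-8 <"$3" >"$3.utf8" &&
	cmp -s "$2.utf8" "$3.utf8"
}

if compare_as_utf8 ISO-2022-JP "$tmp/expect" "$tmp/current"
then
	echo "same text after conversion"
else
	echo "still different, or iconv lacks ISO-2022-JP support"
fi
```

Converting both sides through the same iconv means the comparison no longer
depends on which escape sequences that iconv prefers to emit.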

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * I had to spend some time on an OpenSolaris box at work today X-< so I
   played with git there while waiting for something else to finish.  The
   system there seems to produce ESC ( J instead of ESC ( B to switch out
   of the 2-byte Japanese.

   I had to "pkg install lang-support-japanese" to pass this test.

 t/t3900-i18n-commit.sh |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh
index 784c31a..5dbbcb6 100755
--- a/t/t3900-i18n-commit.sh
+++ b/t/t3900-i18n-commit.sh
@@ -9,7 +9,15 @@ test_description='commit and log output encodings'
 
 compare_with () {
 	git show -s $1 | sed -e '1,/^$/d' -e 's/^    //' >current &&
-	test_cmp current "$2"
+	case "$3" in
+	'')
+		test_cmp "$2" current ;;
+	?*)
+		iconv -f "$3" -t utf8 >current.utf8 <current &&
+		iconv -f "$3" -t utf8 >expect.utf8 <"$2" &&
+		test_cmp expect.utf8 current.utf8
+		;;
+	esac
 }
 
 test_expect_success setup '
@@ -103,11 +111,17 @@ done
 
 for J in EUCJP ISO-2022-JP
 do
+	if test "$J" = ISO-2022-JP
+	then
+		ICONV=$J
+	else
+		ICONV=
+	fi
 	git config i18n.logoutputencoding $J
 	for H in EUCJP ISO-2022-JP
 	do
 		test_expect_success "$H should be shown in $J now" '
-			compare_with '$H' "$TEST_DIRECTORY"/t3900/'$J'.txt
+			compare_with '$H' "$TEST_DIRECTORY"/t3900/'$J'.txt $ICONV
 		'
 	done
 done
-- 
1.6.3.9.g6345d
