From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH] t3900: ISO-2022-JP has more than one popular variants
Date: Tue, 12 May 2009 02:29:10 -0700
Message-ID: <7vljp28yah.fsf@alter.siamese.dyndns.org>
When converting from other encodings (e.g. EUC-JP or UTF-8), a converter
can produce subtly different variants of ISO-2022-JP, all of which are
valid. At the end of a line, or when a run of text switches to a 1-byte
sequence, either ESC ( B can be used to switch to ASCII or ESC ( J can be
used to switch to ISO 646:JP (JIS X 0201); the two are essentially the
same character set and are used interchangeably. Similarly, the 2-byte
sets selected by ESC $ @ (JIS X 0208-1978) and ESC $ B (JIS X 0208-1983)
are in practice used interchangeably.
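To make the difference concrete, the sketch below spells one character (U+65E5, chosen only for illustration) with each pair of escape sequences and dumps the bytes as hex. The variable names are hypothetical and nothing here is part of the patch.

```shell
#!/bin/sh
# Illustration only: two ISO-2022-JP spellings of the same single character.
# \033 is ESC; "F|" is the two-byte JIS X 0208 code 0x46 0x7C (U+65E5).

# ESC $ B (JIS X 0208-1983) ... ESC ( B (back to ASCII)
variant_1983=$(printf '\033$BF|\033(B' | od -An -tx1 | tr -d ' \n')

# ESC $ @ (JIS X 0208-1978) ... ESC ( J (back to JIS X 0201)
variant_1978=$(printf '\033$@F|\033(J' | od -An -tx1 | tr -d ' \n')

echo "ESC \$ B ... ESC ( B: $variant_1983"
echo "ESC \$ @ ... ESC ( J: $variant_1978"

# A byte-wise comparison sees two different strings.
if test "$variant_1983" = "$variant_1978"
then
	bytes_differ=no
else
	bytes_differ=yes
fi
echo "byte-for-byte different: $bytes_differ"
```

Only the third byte of each escape sequence differs, which is enough to make a byte-wise comparison fail.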
Depending on the iconv library and the locale definition on the system, a
program that converts from another encoding to ISO-2022-JP can produce
different byte sequences, and GIT_TEST_CMP (aka "diff -u") will report the
difference as a failure.
Fix this by converting the expected and the actual output to UTF-8 before
comparing when the end result is ISO-2022-JP. The test vector string in
t3900/ISO-2022-JP.txt is expressed with ASCII and JIS X 0208-1983, but it
can be expressed with any other possible variant, and when converted back
to UTF-8, these variants produce identical byte sequences.
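The idea can be tried outside the test suite with a small sketch. The helper name same_text_as_utf8 and the sample bytes are assumptions made for illustration, not code from this patch. The sketch also assumes the local iconv accepts both variants under the name ISO-2022-JP and accepts "UTF-8" as the target encoding name.

```shell
#!/bin/sh
# Sketch only: same_text_as_utf8 is a hypothetical helper, not part of git.
# usage: same_text_as_utf8 <encoding> <expected-file> <actual-file>
same_text_as_utf8 () {
	iconv -f "$1" -t UTF-8 <"$2" >"$2.utf8" &&
	iconv -f "$1" -t UTF-8 <"$3" >"$3.utf8" &&
	cmp -s "$2.utf8" "$3.utf8"
}

demo_dir=$(mktemp -d)

# The same one-character line, written with two sets of escape sequences.
printf '\033$BF|\033(B\n' >"$demo_dir/expect"	# JIS X 0208-1983, then ASCII
printf '\033$@F|\033(J\n' >"$demo_dir/actual"	# JIS X 0208-1978, then JIS X 0201

# Comparing the raw bytes.
if cmp -s "$demo_dir/expect" "$demo_dir/actual"
then
	raw_result=same
else
	raw_result=different
fi

# Comparing after both sides are converted to UTF-8.
if same_text_as_utf8 ISO-2022-JP "$demo_dir/expect" "$demo_dir/actual"
then
	utf8_result=same
else
	utf8_result=different
fi

echo "raw bytes: $raw_result, after conversion: $utf8_result"
rm -rf "$demo_dir"
```

Converting both sides, rather than only the actual output, keeps the comparison independent of which variant the expected file happens to use.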
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* I had to spend some time on an OpenSolaris box at work today X-< so
I played with git there while waiting for something else to finish. They
seem to produce ESC ( J instead of ESC ( B to switch out of the 2-byte
Japanese character set.
I had to "pkg install lang-support-japanese" to pass this test.
t/t3900-i18n-commit.sh | 18 ++++++++++++++++--
1 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh
index 784c31a..5dbbcb6 100755
--- a/t/t3900-i18n-commit.sh
+++ b/t/t3900-i18n-commit.sh
@@ -9,7 +9,15 @@ test_description='commit and log output encodings'
 
 compare_with () {
 	git show -s $1 | sed -e '1,/^$/d' -e 's/^    //' >current &&
-	test_cmp current "$2"
+	case "$3" in
+	'')
+		test_cmp "$2" current ;;
+	?*)
+		iconv -f "$3" -t utf8 >current.utf8 <current &&
+		iconv -f "$3" -t utf8 >expect.utf8 <"$2" &&
+		test_cmp expect.utf8 current.utf8
+		;;
+	esac
 }
 
 test_expect_success setup '
@@ -103,11 +111,17 @@ done
 
 for J in EUCJP ISO-2022-JP
 do
+	if test "$J" = ISO-2022-JP
+	then
+		ICONV=$J
+	else
+		ICONV=
+	fi
 	git config i18n.logoutputencoding $J
 	for H in EUCJP ISO-2022-JP
 	do
 		test_expect_success "$H should be shown in $J now" '
-			compare_with '$H' "$TEST_DIRECTORY"/t3900/'$J'.txt
+			compare_with '$H' "$TEST_DIRECTORY"/t3900/'$J'.txt $ICONV
 		'
 	done
 done
--
1.6.3.9.g6345d