From: Johannes Sixt <j.sixt@viscovery.net>
To: Alexey Shumkin <Alex.Crezoff@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v7 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding
Date: Mon, 01 Jul 2013 09:00:55 +0200
Message-ID: <51D12927.5050000@viscovery.net>
In-Reply-To: <9e83de68067be7548c0119d6e99caa905fab0c0f.1372240999.git.Alex.Crezoff@gmail.com>

On 6/26/2013 12:19, Alexey Shumkin wrote:
> One can set an alias
> 	$ git config alias.lg "log --graph --pretty=format:'%Cred%h%Creset
> 	-%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)<%an>%Creset'
> 	--abbrev-commit --date=local"
> 
> to see the log as a pretty tree (like *gitk* but in a terminal).
> 
> However, log messages written in an encoding i18n.commitEncoding which differs
> from terminal encoding are shown corrupted even when i18n.logOutputEncoding
> and terminal encoding are the same (e.g. log messages committed on a Cygwin box
> with Windows-1251 encoding seen on a Linux box with a UTF-8 encoding and vice versa).
> 
> To simplify an example we can say the following two commands are expected
> to give the same output to a terminal:
> 
> 	$ git log --oneline --no-color
> 	$ git log --pretty=format:'%h %s'
> 
> However, the former pays attention to i18n.logOutputEncoding
> configuration, while the latter does not when it formats "%s".
> 
> The same corruption is true for
> 	$ git diff --submodule=log
> and
> 	$ git rev-list --pretty=format:%s HEAD
> and
> 	$ git reset --hard
> 
> This patch adds failing tests for the next patch that fixes them.
> 
> Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com>

> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> index 73ba5e8..6b62da2 100755
> --- a/t/t4205-log-pretty-formats.sh
> +++ b/t/t4205-log-pretty-formats.sh
...
> +commit_msg() {
> +	# String "initial. initial" partly in German (translated with Google Translate),
> +	# encoded in UTF-8, used as a commit log message below.
> +	msg=$(printf "initial. anf\303\244nglich")
> +	if test -n "$1"
> +	then
> +		msg=$(echo $msg | iconv -f utf-8 -t $1)
> +	fi
> +	if test -n "$2" -a -n "$3"
> +	then
> +		# cut string, replace cut part with two dots
> +		# $2 - chars count from the beginning of the string
> +		# $3 - "trailing" chars
> +		# LC_ALL is set to make `sed` interpret "." as a UTF-8 char not a byte
> +		# as it does with C locale
> +		msg=$(echo $msg | LC_ALL=en_US.UTF-8 sed -e "s/^\(.\{$2\}\)$3/\1../")

This does not work as expected on Windows because sed ignores the .UTF-8
part of the locale specifier. (We don't even have en_US; we have de, but
with de.UTF-8 this doesn't work, either.)

I don't yet have an idea how to work around it.

> +	fi
> +	echo $msg
> +}

> -test_expect_success 'left alignment formatting with mtrunc' '
> -	git log --pretty="format:%<(10,mtrunc)%s" >actual &&
> +test_expect_failure 'left alignment formatting with mtrunc' "
> +	git log --pretty='format:%<(10,mtrunc)%s' >actual &&
>  	# complete the incomplete line at the end
>  	echo >>actual &&
>  	qz_to_tab_space <<\EOF >expected &&
>  mess.. two
>  mess.. one
>  add bar  Z
> -initial  Z
> +$(commit_msg "" "4" ".\{11\}")
>  EOF
>  	test_cmp expected actual
> -'
> +"

This is the failing test case.

BTW, if you re-roll, there would be fewer changes needed if you kept the
test code single-quoted, but changed <<\EOF to <<EOF where needed.
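
As a minimal standalone sketch of the difference (not taken from the test
suite; `commit_msg_stub` is a hypothetical stand-in for the patch's helper):
with a quoted delimiter the here-document body is kept literal, while with
an unquoted delimiter command substitutions in the body are expanded.

```shell
#!/bin/sh
# Hypothetical stand-in for the patch's commit_msg helper.
commit_msg_stub () {
	echo "init..lich"
}

# Quoted delimiter (<<\EOF): the body is literal, $(...) is NOT expanded.
show_literal () {
	cat <<\EOF
$(commit_msg_stub)
EOF
}

# Unquoted delimiter (<<EOF): $(...) is expanded when the body is read.
show_expanded () {
	cat <<EOF
$(commit_msg_stub)
EOF
}

show_literal
show_expanded
```

Inside a single-quoted test body, the unquoted form is expanded when the
test body is evaluated, so the surrounding quotes would not need to change.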

> diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
> index cc1008d..c66a07f 100755
> --- a/t/t6006-rev-list-format.sh
> +++ b/t/t6006-rev-list-format.sh
...
>  test_expect_success 'setup' '
>  	: >foo &&
>  	git add foo &&
> -	git commit -m "added foo" &&
> +	git config i18n.commitEncoding iso-8859-1 &&

Perhaps
	test_config i18n.commitEncoding iso-8859-1 &&

Also, it is "iso-8859-1" here, but we see "iso8859-1" already used later.
It's probably wise to use the same encoding name everywhere because we
can be very sure that the latter is already understood on all supported
platforms.

> +	git commit -m "$added_iso88591" &&
>  	head1=$(git rev-parse --verify HEAD) &&
>  	head1_short=$(git rev-parse --verify --short $head1) &&
>  	tree1=$(git rev-parse --verify HEAD:) &&
>  	tree1_short=$(git rev-parse --verify --short $tree1) &&
> -	echo changed >foo &&
> -	git commit -a -m "changed foo" &&
> +	echo "$changed" > foo &&
> +	git commit -a -m "$changed_iso88591" &&
>  	head2=$(git rev-parse --verify HEAD) &&
>  	head2_short=$(git rev-parse --verify --short $head2) &&
>  	tree2=$(git rev-parse --verify HEAD:) &&
>  	tree2_short=$(git rev-parse --verify --short $tree2)
> +	git config --unset i18n.commitEncoding
>  '
>  
> -# usage: test_format name format_string <expected_output
> +# usage: test_format [failure] name format_string <expected_output
>  test_format () {
> +	must_fail=0
> +	# if parameters count is more than 2 then test must fail
> +	if test $# -gt 2
> +	then
> +		must_fail=1
> +		# remove first parameter which is flag for test failure
> +		shift
> +	fi
>  	cat >expect.$1
> -	test_expect_success "format $1" "
> -		git rev-list --pretty=format:'$2' master >output.$1 &&
> -		test_cmp expect.$1 output.$1
> -	"
> +	name="format $1"
> +	command="git rev-list --pretty=format:'$2' master >output.$1 &&
> +		test_cmp expect.$1 output.$1"
> +	if test $must_fail -eq 1
> +	then
> +		test_expect_failure "$name" "$command"
> +	else
> +		test_expect_success "$name" "$command"
> +	fi
>  }

This function would be much shorter with the optional "failure" token as
$3 (untested):

test_format () {
	cat >expect.$1
	test_expect_${3:-success} "format $1" "
		git rev-list --pretty=format:'$2' master >output.$1 &&
		test_cmp expect.$1 output.$1
	"
}
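
The `${3:-success}` expansion above can be tried in isolation. The
following is only a sketch under stated assumptions: `run_expect_success`,
`run_expect_failure` and `dispatch` are hypothetical names standing in for
the test-lib functions and for `test_format`.

```shell
#!/bin/sh
# Hypothetical stand-ins for test_expect_success / test_expect_failure.
run_expect_success () { echo "success: $1"; }
run_expect_failure () { echo "failure: $1"; }

# usage: dispatch name format_string [failure]
dispatch () {
	# ${3:-success} yields "success" when $3 is unset or empty,
	# so the command name becomes run_expect_success by default.
	run_expect_${3:-success} "format $1"
}

dispatch percent %h
dispatch subject %s failure
```

With this shape the optional token moves to the end of the argument list,
so callers and the usage comment would have to be adjusted accordingly.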

>  test_expect_success 'setup complex body' '
>  	git config i18n.commitencoding iso8859-1 &&
>  	echo change2 >foo && git commit -a -F commit-msg &&
>  	head3=$(git rev-parse --verify HEAD) &&
> -	head3_short=$(git rev-parse --short $head3)
> +	head3_short=$(git rev-parse --short $head3) &&
> +	# unset commit encoding config
> +	# otherwise %e does not print encoding value
> +	# and following test fails

I don't understand this comment. The test vector below already shows that
an encoding is printed. Why would this suddenly be different with the
updated tests?

Assuming that this change doesn't sweep a deeper problem under the rug,
it's better to use test_config a few lines earlier.

> +	git config --unset i18n.commitEncoding
> +
>  '
>  
>  test_format complex-encoding %e <<EOF
>  commit $head3
>  iso8859-1

This is the encoding that I mean.

>  commit $head2
> +iso-8859-1
>  commit $head1
> +iso-8859-1
>  EOF

> diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
> index 05dfb27..72e364e 100755
> --- a/t/t7102-reset.sh
> +++ b/t/t7102-reset.sh
> @@ -9,6 +9,17 @@ Documented tests for git reset'
>  
>  . ./test-lib.sh
>  
> +commit_msg() {
> +	# String "modify 2nd file (changed)" partly in German(translated with Google Translate),
> +	# encoded in UTF-8, used as a commit log message below.
> +	msg=$(printf "modify 2nd file (ge\303\244ndert)")
> +	if test -n "$1"
> +	then
> +		msg=$(echo $msg | iconv -f utf-8 -t $1)
> +	fi
> +	echo $msg
> +}

If you wanted to, you could write this as

commit_msg () {
	# String "modify 2nd file (changed)" partly in German
	# (translated with Google Translate),
	# encoded in UTF-8, used as a commit log message below.
	printf "modify 2nd file (ge\303\244ndert)" |
	if test -n "$1"
	then
		iconv -f utf-8 -t $1
	else
		cat
	fi
}

but I'm not sure whether it's a lot better.
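
The shape of that rewrite, a pipeline feeding an `if` compound command with
`cat` as the pass-through branch, can be sketched without depending on
iconv. In this hypothetical example `maybe_upper` and the use of `tr` are
illustrative assumptions, not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper: pass the message through a filter only when
# an argument is given; otherwise copy it unchanged with cat.
maybe_upper () {
	printf 'modify 2nd file\n' |
	if test -n "$1"
	then
		tr '[:lower:]' '[:upper:]'
	else
		cat
	fi
}

maybe_upper
maybe_upper yes
```

The `if` statement as a whole reads the pipe, so whichever branch runs
consumes the output of `printf`; no temporary variable is needed.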

-- Hannes
