From: Alexey Shumkin <alex.crezoff@gmail.com>
To: Johannes Sixt <j.sixt@viscovery.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v7 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding
Date: Tue, 2 Jul 2013 02:50:13 +0400
Message-ID: <20130701225013.GA17377@ashu.dyn1.rarus.ru>
In-Reply-To: <51D12927.5050000@viscovery.net>
On Mon, Jul 01, 2013 at 09:00:55AM +0200, Johannes Sixt wrote:
> Am 6/26/2013 12:19, schrieb Alexey Shumkin:
> > One can set an alias
> > $ git config alias.lg "log --graph --pretty=format:'%Cred%h%Creset
> > -%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)<%an>%Creset'
> > --abbrev-commit --date=local"
> >
> > to see the log as a pretty tree (like *gitk* but in a terminal).
> >
> > However, log messages written in an encoding i18n.commitEncoding which differs
> > from terminal encoding are shown corrupted even when i18n.logOutputEncoding
> > and terminal encoding are the same (e.g. log messages committed on a Cygwin box
> > with Windows-1251 encoding seen on a Linux box with a UTF-8 encoding and vice versa).
> >
> > To simplify an example we can say the following two commands are expected
> > to give the same output to a terminal:
> >
> > $ git log --oneline --no-color
> > $ git log --pretty=format:'%h %s'
> >
> > However, the former pays attention to i18n.logOutputEncoding
> > configuration, while the latter does not when it formats "%s".
> >
> > The same corruption is true for
> > $ git diff --submodule=log
> > and
> > $ git rev-list --pretty=format:%s HEAD
> > and
> > $ git reset --hard
> >
> > This patch adds failing tests for the next patch that fixes them.
> >
> > Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com>
>
> > diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> > index 73ba5e8..6b62da2 100755
> > --- a/t/t4205-log-pretty-formats.sh
> > +++ b/t/t4205-log-pretty-formats.sh
> ...
> > +commit_msg() {
> > + # String "initial. initial" partly in German (translated with Google Translate),
> > + # encoded in UTF-8, used as a commit log message below.
> > + msg=$(printf "initial. anf\303\244nglich")
> > + if test -n "$1"
> > + then
> > + msg=$(echo $msg | iconv -f utf-8 -t $1)
> > + fi
> > + if test -n "$2" -a -n "$3"
> > + then
> > + # cut string, replace cut part with two dots
> > + # $2 - chars count from the beginning of the string
> > + # $3 - "trailing" chars
> > + # LC_ALL is set to make `sed` interpret "." as a UTF-8 char not a byte
> > + # as it does with C locale
> > + msg=$(echo $msg | LC_ALL=en_US.UTF-8 sed -e "s/^\(.\{$2\}\)$3/\1../")
>
> This does not work as expected on Windows because sed ignores the .UTF-8
> part of the locale specifier. (We don't even have en_US; we have de, but
> with de.UTF-8 this doesn't work, either.)
>
> I don't have an idea, yet, how to work it around.
>
Hmm. I have Cygwin v1.7 (on Windows 7 and Windows Server 2003 R2)
with many locales installed, including en_US.UTF-8.
So far today I could not find a way to simulate a system without the
en_US.UTF-8 locale, so I could not reproduce your failure.
> > + fi
> > + echo $msg
> > +}
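A minimal sketch of why the locale matters for this `sed` call, assuming a sed that matches byte-by-byte under LC_ALL=C (as GNU sed does). The helper `truncated_msg` is hypothetical and not part of the patch; it only shows that the expected string can be assembled without asking any tool to count multi-byte characters:

```shell
# Subject from the patch: "initial. anfänglich", with "ä" as two UTF-8 bytes.
msg=$(printf 'initial. anf\303\244nglich')

# Under the C locale "." matches one byte, so ".\{11\}" consumes 11 bytes,
# which is only 10 characters here: the tail left over is "glich", not "lich".
byte_trunc=$(printf '%s\n' "$msg" |
	LC_ALL=C sed -e 's/^\(.\{4\}\).\{11\}/\1../')

# Hypothetical helper: spell out the head and tail that should survive,
# so the result does not depend on which locales are installed.
truncated_msg () {
	# $1 - leading part to keep, $2 - trailing part to keep
	printf '%s..%s\n' "$1" "$2"
}
char_trunc=$(truncated_msg "init" "lich")

echo "$byte_trunc"
echo "$char_trunc"
```

If sed silently falls back to byte semantics (which seems to be what happens on Windows), the test would see the first form instead of the second.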
>
> > -test_expect_success 'left alignment formatting with mtrunc' '
> > - git log --pretty="format:%<(10,mtrunc)%s" >actual &&
> > +test_expect_failure 'left alignment formatting with mtrunc' "
> > + git log --pretty='format:%<(10,mtrunc)%s' >actual &&
> > # complete the incomplete line at the end
> > echo >>actual &&
> > qz_to_tab_space <<\EOF >expected &&
> > mess.. two
> > mess.. one
> > add bar Z
> > -initial Z
> > +$(commit_msg "" "4" ".\{11\}")
> > EOF
> > test_cmp expected actual
> > -'
> > +"
>
> This is the failing test case.
Hmm, for me all these tests pass on both the Linux box and the Cygwin
boxes mentioned above.
>
> BTW, if you re-roll, there would be fewer changes needed if you kept the
> test code single-quoted, but changed <<\EOF to <<EOF where needed.
Yep, thanks for your correction
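To illustrate the quoting point: whether a here-document expands `$(...)` depends only on whether its delimiter is quoted, so the test body itself can stay single-quoted. A small sketch; `subject` is a hypothetical stand-in for commit_msg:

```shell
# Hypothetical stand-in for commit_msg from the patch.
subject () {
	printf 'init..lich\n'
}

# Quoted delimiter: the body is taken literally, nothing is expanded.
literal () {
	cat <<\EOF
$(subject)
EOF
}

# Unquoted delimiter: command substitution runs inside the body.
expanded () {
	cat <<EOF
$(subject)
EOF
}

literal
expanded
```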
>
> > diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
> > index cc1008d..c66a07f 100755
> > --- a/t/t6006-rev-list-format.sh
> > +++ b/t/t6006-rev-list-format.sh
> ...
> > test_expect_success 'setup' '
> > : >foo &&
> > git add foo &&
> > - git commit -m "added foo" &&
> > + git config i18n.commitEncoding iso-8859-1 &&
>
> Perhaps
> test_config i18n.commitEncoding iso-8859-1 &&
>
> Also, it is "iso-8859-1" here, but we see "iso8859-1" already used later.
> It's probably wise to use that same encoding name everywhere because we
> can be very sure that the latter is already understood on all supported
> platforms.
You're right (I've looked at the explanation in
3994e8a98dc7bbf67e61d23c8125f44383499a1f; I had thought ISO-8859-1 was the
common name).
>
> > + git commit -m "$added_iso88591" &&
> > head1=$(git rev-parse --verify HEAD) &&
> > head1_short=$(git rev-parse --verify --short $head1) &&
> > tree1=$(git rev-parse --verify HEAD:) &&
> > tree1_short=$(git rev-parse --verify --short $tree1) &&
> > - echo changed >foo &&
> > - git commit -a -m "changed foo" &&
> > + echo "$changed" > foo &&
> > + git commit -a -m "$changed_iso88591" &&
> > head2=$(git rev-parse --verify HEAD) &&
> > head2_short=$(git rev-parse --verify --short $head2) &&
> > tree2=$(git rev-parse --verify HEAD:) &&
> > tree2_short=$(git rev-parse --verify --short $tree2)
> > + git config --unset i18n.commitEncoding
> > '
> >
> > -# usage: test_format name format_string <expected_output
> > +# usage: test_format [failure] name format_string <expected_output
> > test_format () {
> > + must_fail=0
> > + # if parameters count is more than 2 then test must fail
> > + if test $# -gt 2
> > + then
> > + must_fail=1
> > + # remove first parameter which is flag for test failure
> > + shift
> > + fi
> > cat >expect.$1
> > - test_expect_success "format $1" "
> > - git rev-list --pretty=format:'$2' master >output.$1 &&
> > - test_cmp expect.$1 output.$1
> > - "
> > + name="format $1"
> > + command="git rev-list --pretty=format:'$2' master >output.$1 &&
> > + test_cmp expect.$1 output.$1"
> > + if test $must_fail -eq 1
> > + then
> > + test_expect_failure "$name" "$command"
> > + else
> > + test_expect_success "$name" "$command"
> > + fi
> > }
>
> This function would be much shorter with the optional "failure" token as
> $3 (untested):
>
> test_format () {
> cat >expect.$1
> test_expect_${3:-success} "format $1" "
> git rev-list --pretty=format:'$2' master >output.$1 &&
> test_cmp expect.$1 output.$1
> "
> }
Thank you for your suggestion.
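A runnable sketch of how the `${3:-success}` trick picks the function name. The two `test_expect_*` functions below are stand-ins (an assumption for illustration; the real ones live in test-lib.sh and actually run the test body), and the body of `test_format` is reduced to the dispatch alone:

```shell
# Stand-ins for the test-lib.sh functions: they only report which one
# was chosen and with what test name.
test_expect_success () { echo "success: $1"; }
test_expect_failure () { echo "failure: $1"; }

# usage: test_format name format_string [failure]
test_format () {
	# An unset or empty $3 falls back to "success", so the command
	# name becomes test_expect_success or test_expect_failure.
	test_expect_${3:-success} "format $1" "(test body omitted)"
}

test_format percent %%
test_format subject %s failure
```

With the optional token last, existing callers that pass only two arguments keep working unchanged.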
>
> > test_expect_success 'setup complex body' '
> > git config i18n.commitencoding iso8859-1 &&
> > echo change2 >foo && git commit -a -F commit-msg &&
> > head3=$(git rev-parse --verify HEAD) &&
> > - head3_short=$(git rev-parse --short $head3)
> > + head3_short=$(git rev-parse --short $head3) &&
> > + # unset commit encoding config
> > + # otherwise %e does not print encoding value
> > + # and following test fails
>
> I don't understand this comment. The test vector below already shows that
> an encoding is printed. Why would this suddenly be different with the
> updated tests?
I've changed the tests: I reverted these ones and added
new ones with no i18n.commitEncoding set.
>
> Assuming that this change doesn't sweep a deeper problem under the rug,
> it's better to use test_config a few lines earlier.
>
> > + git config --unset i18n.commitEncoding
> > +
> > '
> >
> > test_format complex-encoding %e <<EOF
> > commit $head3
> > iso8859-1
>
> This is the encoding that I mean.
These encodings "have appeared" because we've changed 'setup':
the commits there are now made with i18n.commitEncoding set.
>
> > commit $head2
> > +iso-8859-1
> > commit $head1
> > +iso-8859-1
> > EOF
>
> > diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
> > index 05dfb27..72e364e 100755
> > --- a/t/t7102-reset.sh
> > +++ b/t/t7102-reset.sh
> > @@ -9,6 +9,17 @@ Documented tests for git reset'
> >
> > . ./test-lib.sh
> >
> > +commit_msg() {
> > + # String "modify 2nd file (changed)" partly in German(translated with Google Translate),
> > + # encoded in UTF-8, used as a commit log message below.
> > + msg=$(printf "modify 2nd file (ge\303\244ndert)")
> > + if test -n "$1"
> > + then
> > + msg=$(echo $msg | iconv -f utf-8 -t $1)
> > + fi
> > + echo $msg
> > +}
>
> If you wanted to, you could write this as
>
> commit_msg () {
> # String "modify 2nd file (changed)" partly in German
> #(translated with Google Translate),
> # encoded in UTF-8, used as a commit log message below.
> printf "modify 2nd file (ge\303\244ndert)" |
> if test -n "$1"
> then
> iconv -f utf-8 -t $1
> else
> cat
> fi
> }
>
> but I'm not sure whether it's a lot better.
It looks more readable. Thank you
>
> -- Hannes
--
Alexey Shumkin