From: Jeff King <peff@peff.net>
To: Bryan Turner <bturner@atlassian.com>
Cc: git@vger.kernel.org
Subject: Re: git rev-list %an, %ae, %at bug in v1.7.10.1 and beyond
Date: Tue, 22 May 2012 00:32:21 -0400
Message-ID: <20120522043221.GA6859@sigill.intra.peff.net>
In-Reply-To: <CAGyf7-G3nNTTP1bKdd9HLKEn-8+LoxCeY2R08x9gKZwS0L_N6g@mail.gmail.com>

On Mon, May 21, 2012 at 06:01:50PM +1000, Bryan Turner wrote:

> Note that the author name, e-mail and timestamp values are all missing
> (the three |'s in a row at the end).
> [...]
> Built from 0dbe6592ccbd1a394a69a52074e3729d546fe952, the parent of
> 4b340cf, and in previous versions of git (1.7.10 and earlier), I got
> output like this:
> [...]
> Note that the author name, e-mail and timestamp are all present. The
> "a" appears as ASCII here, but it's actually a UTF-16LE character (the
> terminal on the Mac is being "helpful").

I'm not too surprised this is broken (in fact, I'm surprised it ever
really worked). UTF-16, especially when representing pure ASCII
characters, will have embedded NULs. Most of the code assumes that things
like names and emails are NUL-terminated and ASCII-compatible (so ASCII,
or some ASCII-superset encoding like UTF-8, ISO-8859-1, etc). You can
store a commit message (and name) in UTF-16 if you tell git that you are
doing so, but we should be re-encoding it before handling it.
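
To illustrate the embedded-NUL point, here is a minimal Python sketch. It
is not git's code: the helper c_string_view is hypothetical and only
mimics how a NUL-terminated string reader stops at the first zero byte,
and the two-letter name is made up for the example.

```python
def c_string_view(buf: bytes) -> bytes:
    """Return the bytes a NUL-terminated string reader would see."""
    end = buf.find(b"\x00")
    return buf if end == -1 else buf[:end]


name = "ab"  # hypothetical two-letter author name

as_utf8 = name.encode("utf-8")       # one byte per ASCII character
as_utf16 = name.encode("utf-16-le")  # each ASCII character gets a trailing 0x00

seen_utf8 = c_string_view(as_utf8)    # the whole name survives
seen_utf16 = c_string_view(as_utf16)  # cut off at the first embedded NUL

print(as_utf8, seen_utf8)
print(as_utf16, seen_utf16)
```

With an ASCII-superset encoding the reader sees the full name; with
UTF-16LE it sees only the first byte, which is the kind of truncation
that would upset code expecting a NUL-terminated string.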

> ================================ Output =====================================
> aphrael:qa-resources.git bturner$ git cat-file -p
> 5c1ccdec5f84aa149a4978f729fdda70769f942f
> tree dd173cb70baaac07bdf405f4e3db110e7fafd180
> parent 02c78bc39ac6192623bf080e3e2ac892a4f5764c
> author a <farmas@atlassian.com> 1327876222 +1100
> committer a <farmas@atlassian.com> 1327876222 +1100
> 
> commit with unicode name
> ================================ End ========================================

There's no encoding header, so having a UTF-16 character there is wrong.
How did you make such a commit in the first place, though? I believe
that git-commit treats the name as a string and would terminate on a NUL
(or am I wrong in thinking that this "a" is really U+0061, and is it
actually some other Unicode character that _looks_ like "a" but doesn't
contain a NUL?).
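
As a rough sketch of both checks (again Python, not anything git ships):
the helper commit_headers is hypothetical, and the raw bytes below are
only modelled on the quoted object, with the NUL after "a" being an
assumption about what the terminal hid. The last two lines compare
U+0061 with a lookalike, Cyrillic U+0430, chosen purely as an example.

```python
def commit_headers(raw: bytes) -> dict:
    """Map header names to the first value seen, stopping at the blank line."""
    head, _, _ = raw.partition(b"\n\n")
    headers = {}
    for line in head.split(b"\n"):
        key, _, value = line.partition(b" ")
        headers.setdefault(key.decode("ascii", "replace"), value)
    return headers


# Hypothetical raw commit bytes; the \x00 after "a" is assumed, not observed.
raw = (
    b"tree dd173cb70baaac07bdf405f4e3db110e7fafd180\n"
    b"parent 02c78bc39ac6192623bf080e3e2ac892a4f5764c\n"
    b"author a\x00 <farmas@atlassian.com> 1327876222 +1100\n"
    b"committer a\x00 <farmas@atlassian.com> 1327876222 +1100\n"
    b"\n"
    b"commit with unicode name\n"
)

headers = commit_headers(raw)
has_encoding = "encoding" in headers           # is an encoding header present?
author_has_nul = b"\x00" in headers["author"]  # does the author line hold a NUL?

latin_a = "\u0061".encode("utf-16-le")     # ASCII "a": second byte is 0x00
cyrillic_a = "\u0430".encode("utf-16-le")  # lookalike: bytes 0x30 0x04, no NUL

print(has_encoding, author_has_nul, latin_a, cyrillic_a)
```

If the name really were a lookalike such as U+0430, its UTF-16LE bytes
would contain no NUL and a string-based code path would not truncate it.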

Did you create it by piping straight to git-hash-object? What does
piping the above through "xxd" or "cat -A" show?
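
If xxd is not handy, something like the following Python sketch gives a
similar view. The hexdump helper is hypothetical and does not reproduce
xxd's exact layout, and the author line is an assumed example, not the
real object's bytes.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex pairs, and printable ASCII ('.' otherwise)."""
    pad = width * 3 - 1  # room for `width` hex pairs separated by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(pad)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part}  {text_part}")
    return "\n".join(lines)


# Assumed author line with a NUL after the "a"; not taken from the real commit.
author_line = b"author a\x00 <farmas@atlassian.com> 1327876222 +1100"
dump = hexdump(author_line)
print(dump)
```

A "61 00" pair in the hex column (and a "." right after the "a" in the
text column) would confirm the embedded NUL.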

-Peff

