From: Shin Kojima <shin@kojima.org>
To: Jakub Narebski <jnareb@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, Shin Kojima <shin@kojima.org>,
git@vger.kernel.org
Subject: Re: [PATCH] gitweb: Measure offsets against UTF-8 flagged string
Date: Fri, 4 May 2018 00:16:29 +0900
Message-ID: <20180503151627.45pt2veqcjzbk44q@skmbp>
In-Reply-To: <86k1skzzc4.fsf@gmail.com>
> One solution would be to force conversion to UTF-8 on input via "open"
> pragma (e.g. "use open ':encoding(UTF-8)';"). But there is no
> UTF-8-with_fallback encoding available - we would have to write one, and
> install it as a module (or fake it via Perl trickery). This mechanism is
> almost the same as what we currently use in gitweb.
Yes, I tried using `Encode::Guess` with the "open" pragma, but had no luck.
https://perldoc.perl.org/Encode/Guess.html

I'm also afraid that the "open" pragma would not work properly with
git_blame_common(). Suppose someone with non-ASCII characters in
his/her name commits content that is not encoded in UTF-8. git-blame will
put both on the same line. The following is an example:
    $ git blame dummy | xxd
    00000000: 3461 6464 3565 6331 2028 e585 90e5 b3b6  4add5ec1 (......
    00000010: 20e6 96b0 2032 3031 382d 3035 2d30 3320   ... 2018-05-03
    00000020: 3232 3a34 383a 3432 202b 3039 3030 2031  22:48:42 +0900 1
    00000030: 2920 8367 8389 8343 0a                   ) .g...C.
* e585 90e5 b3b6 20e6 96b0 : my name, encoded with UTF-8
* 8367 8389 8343 : "トライ" encoded with Shift_JIS
This means I would need to split each line of git-blame output first,
and then decode the first half as UTF-8 and the second half as
Shift_JIS.
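
To illustrate the split-then-decode idea, here is a minimal sketch in
Python (not gitweb's Perl, and not a proposed patch). The bytes are copied
from the xxd dump above. The helper name decode_blame_line, the choice to
split at the first ") ", and the hard-coded Shift_JIS content encoding are
all assumptions made for this example; a real author name could itself
contain ") ", and the content encoding would have to be known or guessed.

```python
# Bytes copied from the xxd dump of `git blame dummy`.
line = bytes.fromhex(
    "3461 6464 3565 6331 2028 e585 90e5 b3b6"
    "20e6 96b0 2032 3031 382d 3035 2d30 3320"
    "3232 3a34 383a 3432 202b 3039 3030 2031"
    "2920 8367 8389 8343 0a"
)


def decode_blame_line(raw, content_encoding="shift_jis"):
    """Decode one blame line whose two halves use different encodings.

    Assumption: the metadata (hash, author, date, line number) ends at
    the first ") " and is UTF-8; everything after it is file content in
    content_encoding.
    """
    meta, sep, content = raw.partition(b") ")
    if not sep:
        raise ValueError("no ') ' separator found in blame line")
    return (
        meta.decode("utf-8")
        + sep.decode("ascii")
        + content.decode(content_encoding)
    )


# Decoding the whole line as UTF-8 fails, because the Shift_JIS bytes
# are not valid UTF-8.
try:
    line.decode("utf-8")
    whole_line_is_utf8 = True
except UnicodeDecodeError:
    whole_line_is_utf8 = False

decoded = decode_blame_line(line)
print(repr(decoded))
```

Splitting on raw bytes before decoding is the point here: once a single
encoding layer has been applied to the whole line, one of the two halves
is already damaged.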
Sincerely,
--
Shin Kojima