git@vger.kernel.org mailing list mirror (one of many)
From: Alexey Shumkin <alex.crezoff@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: git@vger.kernel.org, Jakub Narebski <jnareb@gmail.com>,
	Alexander Gerasiov <gq@debian.org>
Subject: Re: [PATCH 2/3] hooks/post-receive-email: force log messages in UTF-8
Date: Mon, 5 Aug 2013 12:45:43 +0400
Message-ID: <20130805084543.GA11270@ashu.dyn1.rarus.ru>
In-Reply-To: <20130804181440.GA2894@elie.Belkin>

On Sun, Aug 04, 2013 at 11:14:40AM -0700, Jonathan Nieder wrote:
> Alexey Shumkin wrote:
> > On Fri, Aug 02, 2013 at 04:23:38PM -0700, Jonathan Nieder wrote:
> 
> >>  1. Log messages use the configured log output encoding, which is
> >>     meant to be whatever encoding works best with local terminals
> >>     (and does not have much to do with what encoding should be used
> >>     for email)
> >>
> >>  2. Filenames are left as is: on Linux, usually UTF-8, and in the Mingw
> >>     port (which uses Unicode filesystem APIs), always UTF-8
> >
> > I cannot say exactly if it makes sense for THIS patch, but I'd like to
> > remind about Cygwin port, which definitely does not use UTF-8 encoding
> > (in my case it is Windows-1251) for filenames.
> >
> >> 
> >>  3. The "This is an automated email" preface uses a project description
> >>     from .git/description, which is typically in UTF-8 to support
> >>     gitweb.
> 
> Thanks for clarifying.  So in the context you describe, (1) is
> configurable, (2) is Windows-1251, (3) is unconfigurably UTF-8, and
> there is no way with current git facilities to force the email to use
> a single encoding unless (3) happens to contain no special characters.
> 
> What is the value of the "[i18n] commitEncoding" setting in your
> project?
commitEncoding is the same as the filename encoding: Windows-1251, of course.

> What encoding do the raw commit messages (shown with
> "git log --format=raw") use for their text, and what do they declare
> with an in-commit 'encoding' header, if any?
Well, despite what `git log --help` says:
--8<--
raw
           The raw format shows the entire commit exactly as stored in
           the commit object
--8<--
on a Linux box (UTF-8) I see "readable" commit messages even though
they are stored in Windows-1251 (so they are converted to UTF-8). To
be sure, I checked their actual content with `git cat-file commit`.
To be honest, I usually run a modified version of Git (see
ecaee8050cec23eb4cf082512e907e3e52c20b57) from the 'next' branch, which
could affect the results, so I also checked `git log --format=raw` with
unmodified Git v1.8.3.3.

But back to your question: the commit encoding stored as a header in
the raw commit objects is 'Windows-1251'.
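
To illustrate what I mean by the in-commit header, here is a minimal
Python sketch (not how Git itself does it) that reads the 'encoding'
header from the bytes printed by `git cat-file commit` and decodes the
message accordingly. The function name and the sample object are made
up for this example, and falling back to UTF-8 when the header is
absent is my assumption about the default.

```python
def decode_commit_message(raw: bytes) -> tuple[str, str]:
    """Return (declared_encoding, message) for a raw commit object body."""
    # Headers and message are separated by the first blank line.
    header_block, _, message = raw.partition(b"\n\n")
    encoding = "utf-8"  # assumed default when no 'encoding' header exists
    for line in header_block.split(b"\n"):
        if line.startswith(b"encoding "):
            encoding = line[len(b"encoding "):].decode("ascii")
            break
    # Undecodable bytes are replaced rather than raising an error.
    return encoding, message.decode(encoding, errors="replace")


# Hypothetical commit object, built by hand for illustration only.
sample = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1375692343 +0400\n"
    b"committer A U Thor <author@example.com> 1375692343 +0400\n"
    b"encoding Windows-1251\n"
    b"\n"
    + "Привет\n".encode("cp1251")
)
print(decode_commit_message(sample))
```

On a real repository the input would be the stdout of
`git cat-file commit <sha>` read as bytes, with no text decoding in
between.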
> 
> Does everyone on this project use Cygwin?
This is a "closed" (commercial) project and every developer uses Cygwin,
except me. I use a Linux box as a desktop (mail, IM, web browsing; but
development happens on Cygwin). Sometimes I run utility scripts
included in that project on my desktop (since Linux works with files
much faster than Cygwin does ;))
Also, the Git server is a coLinux box (http://www.colinux.org/) on
Windows Server 2003, but I guess that does not matter much here.
>  That should be fine, but
> I'd expect there to be problems as soon as someone wants to try the
> Mingw port ("Git for Windows").
Yep, one of our developers tried to use a modern version of TortoiseGit
with the MinGW port of Git. That was a failure, because since v1.7.9 the
MinGW port transcodes filenames to store them internally in UTF-8. This
problem could be solved by converting those non-ASCII filenames to
UTF-8 once, but I do not want to use the MinGW port. I like the Cygwin
"infrastructure", which is more Linux-like than MinGW.
> 
> I wonder if there should be an "[i18n] repositoryPathEncoding"
> configuration item to support this kind of repository.  Then git could
> be aware of the intended encoding of paths, could recode them for
> display to a terminal, and at least on Linux and Mingw could recode
> them for use in filenames on disk.  "repositoryPathEncoding = none"
> would mean the current behavior of treating paths as raw sequences of
> bytes.
I'd be happy if such a setting existed. It could solve many problems
in cross-platform projects with non-ASCII filenames.
Indeed, the MinGW port does resolve that problem somehow!
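
Just to make the idea concrete, a rough Python sketch of how such a
setting might behave. Everything here is hypothetical: the setting does
not exist in Git, the function names are invented, and `None` stands in
for the proposed "repositoryPathEncoding = none".

```python
from typing import Optional


def display_path(raw: bytes, path_encoding: Optional[str]) -> str:
    """Recode a stored path for display on a terminal."""
    if path_encoding is None:
        # Models "none": the path is an opaque byte sequence, so any
        # non-ASCII byte is shown as an escape instead of being guessed.
        return raw.decode("ascii", errors="backslashreplace")
    return raw.decode(path_encoding)


def to_utf8_path(raw: bytes, path_encoding: str) -> bytes:
    """Recode a stored path to UTF-8, e.g. for use as an on-disk name."""
    return raw.decode(path_encoding).encode("utf-8")


# A path as our repository stores it today (Windows-1251 bytes).
stored = "файл.txt".encode("cp1251")
print(display_path(stored, "cp1251"))
print(display_path(stored, None))
print(to_utf8_path(stored, "cp1251"))
```

The second function is also roughly what the one-time conversion of
existing filenames to UTF-8 would amount to for each path.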
> 
> What do you think?
> Jonathan

-- 
Alexey Shumkin

Thread overview: 7+ messages
2013-08-02 23:21 [PATCH/RFC 0/3] post-receive-email: explicitly set Content-Type header Jonathan Nieder
2013-08-02 23:22 ` [PATCH 1/3] hooks/post-receive-email: use plumbing instead of git log/show Jonathan Nieder
2013-08-02 23:23 ` [PATCH 2/3] hooks/post-receive-email: force log messages in UTF-8 Jonathan Nieder
2013-08-04 14:54   ` Alexey Shumkin
2013-08-04 18:14     ` Jonathan Nieder
2013-08-05  8:45       ` Alexey Shumkin [this message]
2013-08-02 23:24 ` [PATCH 3/3] hooks/post-receive-email: set declared encoding to utf-8 Jonathan Nieder
