git@vger.kernel.org mailing list mirror (one of many)
From: "Jakub Narębski" <jnareb@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
Date: Sun, 2 Apr 2017 13:40:58 +0200
Message-ID: <3e35eda2-0240-8504-b56e-f66092cc1775@gmail.com>
In-Reply-To: <20170402074522.4qhannjus4ynwx4i@sigill.intra.peff.net>

On 02.04.2017 at 09:45, Jeff King wrote:
> On Sat, Apr 01, 2017 at 08:31:27PM +0200, Jakub Narębski wrote:
> 
>> On 01.04.2017 at 08:08, Jeff King wrote:
>>> On Fri, Mar 31, 2017 at 03:24:48PM +0200, Jakub Narębski wrote:
>>>
>>>>> I suspect in the normal case that git is doing line-ending conversion,
>>>>> but it's suppressed when textconv is in use.
>>>>
>>>> I would not consider this a bug if not for the fact that there is no ^M
>>>> without using iconv as textconv.
>>>
>>> I don't think it's a bug, though. You have told Git that you will
>>> convert the contents (whatever their format) into the canonical format,
>>> but your program to do so includes a CR.
>>
>> Well, I have not declared file binary with "binary = true" in diff driver
>> definition, isn't it?
> 
> I don't think binary has anything to do with it. A textconv filter takes
> input (binary or not) and delivers a normalized representation to feed
> to the diff algorithm. There's no further post-processing, and it's the
> responsibility of the filter to deliver the bytes it wants diffed.
> 
> Like I said, I could see an argument for treating the filter output as
> text to be pre-processed, but that's not how it works (and I don't think
> it is a good idea to change it now, unless by adding an option to the
> diff filter).

I think there actually is something wrong.

If textconv really gets a normalized representation of the pre-image (the
index version) and the post-image (the filesystem version), as I think it
should, then both pre-image lines ('-') and post-image lines ('+') should
use CRLF, so there should be no warning, i.e. no ^M.

Or the textconv filter does get a normalized representation (it looks this
way when examining the diff result saved to a file with
`git diff test.tex >test.diff`; I was unable to use `tr '\r' 'Q'`: either
I got "fatal: bad config line" from Git, or "tr: extra operand" from tr),
and Git somehow misjudges what is happening and writes those ^M.
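
The `tr` step can be sidestepped by doing the substitution on the saved
file's bytes instead. A minimal sketch in Python, assuming the diff was
saved as above; `show_cr` and the sample bytes are made up for illustration:

```python
def show_cr(diff_bytes: bytes, marker: bytes = b"Q") -> bytes:
    # Replace every CR byte with a visible marker, in the spirit of
    # the tr command mentioned above.
    return diff_bytes.replace(b"\r", marker)


# Hypothetical sample: only the added line ends in CRLF.
sample = b" context\n-old line\n+new line\r\n"
print(show_cr(sample).decode("ascii"), end="")
```

To inspect the real file, pass the bytes of test.diff (opened in binary
mode) to `show_cr`.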

If I understand correctly, if the pre-image, post-image and context
all use the same eol, there should be no warning, right?
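
One way to check that premise on a saved diff is to tally line endings per
line kind. The following is only a sketch: the helper names `eol_by_kind`
and `same_eol` and the sample bytes are hypothetical, and it merely measures
the diff text; it says nothing about how Git decides when to show ^M.

```python
from collections import Counter


def eol_by_kind(diff_bytes: bytes) -> dict:
    """Count CRLF vs LF endings for removed, added and context lines."""
    kinds = {b"-": "pre-image", b"+": "post-image", b" ": "context"}
    tally = {name: Counter() for name in kinds.values()}
    for line in diff_bytes.split(b"\n"):
        # Skip empty pieces and the ---/+++ file header lines.
        # Limitation: a removed line whose text starts with "-- " looks
        # like a header and is skipped too.
        if not line or line.startswith((b"--- ", b"+++ ")):
            continue
        name = kinds.get(line[:1])
        if name is not None:
            tally[name]["CRLF" if line.endswith(b"\r") else "LF"] += 1
    return tally


def same_eol(tally: dict) -> bool:
    """True if all line kinds that occur share a single line ending."""
    seen = set()
    for counts in tally.values():
        seen.update(counts)
    return len(seen) <= 1


# Hypothetical diff: only the added line carries a CR.
sample = (b"--- a/test.tex\n+++ b/test.tex\n@@ -1,2 +1,2 @@\n"
          b" context\n-old\n+new\r\n")
print(eol_by_kind(sample), same_eol(eol_by_kind(sample)))
```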

> 
>> P.S. What do you think about Git supporting 'encoding' attribute (or
>> 'core.encoding' config) plus 'core.outputEncoding' in-core?
> 
> Supporting an "encoding" attribute to normalize file encodings in diffs
> seems reasonable to me. But it would have to be enabled only for
> human-readable diffs, as the result could not be applied (so the same as
> textconv).

I was thinking about human readable diffs, and 'git show <blob>', same
as with textconv.

> 
> I don't think core.outputEncoding is necessarily a good idea. We are not
> really equipped anything that isn't an ascii superset, as we intermingle
> the bytes with ascii diff headers (though I think that is true of the
> commitEncoding stuff; I assume everything breaks horribly if you tried
> to set that to UTF-16, but I've never tried it).

Well, the understanding would be that the same limitation applies as for
i18n.logOutputEncoding (it should be documented if it isn't): only
ASCII-compatible encodings are supported.
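
What "ASCII compatible" means here can be approximated with a quick check
that the 128 ASCII characters encode to the same 128 bytes. A rough sketch,
with `is_ascii_superset` as a made-up helper name; it is a necessary
condition only, since some multi-byte encodings reuse ASCII byte values
inside longer sequences:

```python
def is_ascii_superset(encoding: str) -> bool:
    """Check that ASCII text encodes to the very same bytes."""
    ascii_bytes = bytes(range(128))
    try:
        return ascii_bytes.decode("ascii").encode(encoding) == ascii_bytes
    except UnicodeEncodeError:
        # The codec cannot represent some ASCII character at all.
        return False


for name in ("utf-8", "iso-8859-2", "utf-16"):
    print(name, is_ascii_superset(name))
```

UTF-16 fails this check, which matches the worry about mixing its bytes
with ASCII diff headers.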
 
-- 
Jakub Narębski



Thread overview: 10+ messages
2017-03-30 19:35 [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows Jakub Narębski
2017-03-30 20:00 ` Jeff King
2017-03-31 13:24   ` Jakub Narębski
2017-04-01  6:08     ` Jeff King
2017-04-01 18:31       ` Jakub Narębski
2017-04-02  7:45         ` Jeff King
2017-04-02 11:40           ` Jakub Narębski [this message]
2017-03-31 12:38 ` Torsten Bögershausen
2017-03-31 19:44   ` Jakub Narębski
2017-04-02  4:34     ` Torsten Bögershausen
