git@vger.kernel.org mailing list mirror (one of many)
From: "Torsten Bögershausen" <tboegi@web.de>
To: Rich Felker <dalias@libc.org>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Kevin Daudt <git@lists.ikke.info>,
	git@vger.kernel.org, larsxschneider@gmail.com
Subject: Re: t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux)
Date: Sat, 9 Feb 2019 09:09:40 +0100
Message-ID: <0c3b77bf-2903-1cd0-0fce-2ec01be91d84@web.de>
In-Reply-To: <20190208060403.GA29788@brightrain.aerifal.cx>

On 08.02.19 07:04, Rich Felker wrote:
> On Fri, Feb 08, 2019 at 12:17:05AM +0000, brian m. carlson wrote:

[]
>> Even if Git were to produce a BOM to work around this issue, then we'd
>> still have the problem that any program using musl will write data in
>> UTF-16 without a BOM. Moreover, because musl, in violation of the RFC,
>> doesn't read and process BOMs, someone using little-endian UTF-16 (with
>> a proper BOM) with musl and Git will have their data corrupted,
>> according to my reading of the musl website.
>
> That information is outdated and someone from our side should update
> it; since 1.1.19, musl treats "UTF-16" input as ambiguous endianness
> determined by BOM, defaulting to big if there's no BOM. However output
> is always big endian, such that processes conforming to the Unicode
> SHOULD clause will interpret it correctly.
>
> The portable way to get little endian with a BOM is to open a
> conversion descriptor for "UTF-16LE" (which should not add any BOM)
> and write a BOM manually.
>

That will be possible in the upcoming version of Git:

commit 0fa3cc77ee9fb3b6bb53c73688c9b7500f996b83
Merge: cfd9167c15 aab2a1ae48
Author: Junio C Hamano <gitster@pobox.com>
Date:   Wed Feb 6 22:05:21 2019 -0800

    Merge branch 'tb/utf-16-le-with-explicit-bom'

    A new encoding UTF-16LE-BOM has been invented to force encoding to
    UTF-16 with BOM in little endian byte order, which cannot be directly
    generated by using iconv.

    * tb/utf-16-le-with-explicit-bom:
      Support working-tree-encoding "UTF-16LE-BOM"



