git@vger.kernel.org mailing list mirror (one of many)
From: Kevin Daudt <me@ikke.info>
To: Junio C Hamano <gitster@pobox.com>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
	Rich Felker <dalias@libc.org>,
	git@vger.kernel.org, larsxschneider@gmail.com
Subject: Re: t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux)
Date: Fri, 8 Feb 2019 21:23:36 +0100
Message-ID: <20190208202336.GA5284@alpha>
In-Reply-To: <xmqqr2cikw4w.fsf@gitster-ct.c.googlers.com>

On Fri, Feb 08, 2019 at 09:50:07AM -0800, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> >> So would you suggest that we just skip this test on Alpine Linux?
> >
> > That's not exactly what I said. If Alpine Linux users are never going to
> > use this functionality and don't care that it's broken, then that's a
> > fine solution.
> >
> > As originally mentioned, musl could change its libiconv to write a BOM,
> > which would make it compatible with other known iconv implementations.
> >
> > There's also the possibility of defining NO_ICONV. That basically means
> > that your system won't support encodings, and then this test shouldn't
> > matter.
> >
> > Finally, you could try applying a patch to the test to make it write the
> > BOM for UTF-16 since your iconv doesn't. I expect that the test will
> > fail again later on once you've done that, though.
> 
> Sorry for being late to the party, but is the crux of the issue this
> piece early in the test?
> 
>     printf "$text" | iconv -f UTF-8 -t UTF-16 >test.utf16.raw &&
>     ...
>     cp test.utf16.raw test.utf16 &&
>     ...
>     git add .gitattributes test.utf16 test.utf16lebom &&
> 
> where we expect "iconv -t UTF-16" means "write UTF16 in whatever
> byteorder of your choice, but do write BOM", and iconv
> implementations we have seen so far are in line with that
> expectation, but the one on Alpine writes UTF16 in big endian
> without BOM?

Firstly, the tests expect iconv -t UTF-16 to output a BOM, which it
indeed does not do on Alpine. Secondly, git itself also expects the BOM
to be present when the encoding is set to UTF-16, otherwise it will
complain.
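
A quick way to see which behaviour a given iconv has is to look at the
first two bytes of its output. The following is only a minimal sketch:
the helper name has_utf16_bom and the file name sample.utf16 are made
up for illustration and are not part of the test suite.

```shell
# Hypothetical helper: succeed if the file starts with a UTF-16 BOM,
# either big endian (FE FF) or little endian (FF FE).
has_utf16_bom () {
	first=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
	test "$first" = "feff" || test "$first" = "fffe"
}

# Convert a short sample and report what this system's iconv did.
if printf 'hi' | iconv -f UTF-8 -t UTF-16 >sample.utf16 &&
	has_utf16_bom sample.utf16
then
	echo "iconv -t UTF-16 writes a BOM"
else
	echo "no BOM detected (or iconv failed)"
fi
```

The result depends on the iconv implementation, so no expected output
is given here.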

> 
> If that is the case, I think it is our expectation that is at fault
> in this case, as I think the most natural interpretation of "UTF-16"
> without any modifiers (like "BE") ought to be "UTF16 stream
> expressed in any way of the writer's choice, as long as it is readable
> by standards-compliant readers", in other words, "write UTF16 in
> whatever byteorder of your choice, with or without BOM, but if you
> omit BOM, you SHOULD write in big endian".  So
> 
>  - If our later test assumes that test.utf16 is UTF16 with BOM, that
>    already assumes too much;
> 
>  - If our later test assumes that test.utf16 is UTF16 in big endian,
>    that assumes too much, too.
> 
> As suggested earlier in the thread, the easiest workaround would be
> to update the preparation of test.utf16.raw to force big endian
> with BOM by prepending BE-BOM by hand before "iconv -t UTF-16BE"
> output (I am assuming that UTF-16BE will remain "big endian
> without BOM" in the future).  That would make sure that the
> assumption later tests make about test.utf16 holds true.

I tried changing the test to manually inject a BOM into the file (and
setting iconv to UTF-16LE / UTF-16BE), which lets the first test go
through, but test 3 then fails, because git itself outputs the file
without a BOM, presumably because it's passed through iconv.

So I'm not sure if it's a matter of just fixing the tests.
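
For reference, the manual BOM injection described above can be
sketched roughly as follows. This is an illustration only, not the
exact change made to the test: the sample text is hypothetical, and it
assumes that "iconv -t UTF-16BE" writes big endian without a BOM.

```shell
# Hypothetical sample text; the real test uses its own $text.
text="hello"

# Write the big-endian BOM (FE FF) by hand, then append the UTF-16BE
# conversion, so the result does not depend on whether
# "iconv -t UTF-16" emits a BOM on this system.
{
	printf '\376\377' &&
	printf '%s' "$text" | iconv -f UTF-8 -t UTF-16BE
} >test.utf16.raw
```

This only fixes the input side. Anything git writes out itself still
goes through the system's iconv, which is where test 3 trips.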


Thread overview: 30+ messages
2019-02-07 21:59 t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux) Kevin Daudt
2019-02-08  0:17 ` brian m. carlson
2019-02-08  6:04   ` Rich Felker
2019-02-08 11:45     ` brian m. carlson
2019-02-08 11:55       ` Kevin Daudt
2019-02-08 13:51         ` brian m. carlson
2019-02-08 17:50           ` Junio C Hamano
2019-02-08 20:23             ` Kevin Daudt [this message]
2019-02-08 20:42               ` brian m. carlson
2019-02-08 23:12                 ` Junio C Hamano
2019-02-09  0:24                   ` brian m. carlson
2019-02-09 14:57                 ` Kevin Daudt
2019-02-09 20:08                   ` [PATCH] utf8: handle systems that don't write BOM for UTF-16 brian m. carlson
2019-02-10  1:45                     ` Eric Sunshine
2019-02-10 18:14                       ` brian m. carlson
2019-02-10  8:04                     ` Torsten Bögershausen
2019-02-10 18:55                       ` brian m. carlson
2019-02-11 17:14                         ` Junio C Hamano
2019-02-11  0:23                     ` [PATCH v2] " brian m. carlson
2019-02-11  1:16                       ` Eric Sunshine
2019-02-11  1:20                         ` brian m. carlson
2019-02-11  1:26                     ` [PATCH v3] " brian m. carlson
2019-02-11 21:43                       ` Kevin Daudt
2019-02-11 23:58                         ` brian m. carlson
2019-02-12  0:31                           ` Junio C Hamano
2019-02-12  0:53                             ` brian m. carlson
2019-02-12  2:43                               ` Junio C Hamano
2019-02-12  0:52                     ` [PATCH v4] " brian m. carlson
2019-02-08 16:13         ` t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux) Rich Felker
2019-02-09  8:09     ` Torsten Bögershausen
