git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: Git List <git@vger.kernel.org>,
	Lars Schneider <larsxschneider@gmail.com>,
	Rich Felker <dalias@libc.org>, Junio C Hamano <gitster@pobox.com>,
	Kevin Daudt <me@ikke.info>
Subject: Re: [PATCH] utf8: handle systems that don't write BOM for UTF-16
Date: Sun, 10 Feb 2019 18:14:30 +0000	[thread overview]
Message-ID: <20190210181430.GA28510@genre.crustytoothpaste.net> (raw)
In-Reply-To: <CAPig+cRyzZMOM19ztgR_wqvk68P_1eNNVBBj5pbY=MhQm08WAw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1957 bytes --]

On Sat, Feb 09, 2019 at 08:45:16PM -0500, Eric Sunshine wrote:
> On Sat, Feb 9, 2019 at 3:08 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > [...]
> > Add a Makefile and #define knob, ICONV_NEEDS_BOM, that can be set if the
> > iconv implementation has this behavior. When set, Git will write a BOM
> > manually for UTF-16 and UTF-32 and then force the data to be written in
> > UTF-16BE or UTF-32BE. We choose big-endian behavior here because the
> > tests use the raw "UTF-16" encoding, which will be big-endian when the
> > implementation requires this knob to be set.
> 
> The name ICONV_NEEDS_BOM makes it sound as if we must feed a BOM
> _into_ 'iconv', which is quite confusing since the actual intention is
> that 'iconv' doesn't emit a BOM and we need to make up for the
> deficiency. Using a name such as ICONV_OMITS_BOM or ICONV_NEGLECTS_BOM
> makes it somewhat clearer that there is some deficiency with which we
> need to deal.

That does sound like a better name.

> > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> > ---
> > diff --git a/Makefile b/Makefile
> > @@ -259,6 +259,9 @@ all::
> > +# Define ICONV_NEEDS_BOM if your iconv implementation does not write a
> > +# byte-order mark (BOM) when writing UTF-16 or UTF-32.
> 
> Not a big deal, but I wonder if it would be helpful to tack on "...,
> in which case it outputs big-endian unconditionally." or something.

Sure, I can add that.

> Stray blank line before the closing brace.

Will fix.

> It's probably doesn't matter much with these two tiny functions, but I
> was wondering if it would make sense to maintain the &&-chain, perhaps
> like this:
> 
>     if test test_have_prereq NO_UTF32_BOM
>     then
>         printf '\x00\x00\xfe\xff'
>     fi &&
>     iconv -f UTF-8 -t UTF-32

Yeah, that sounds like a good idea.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

  reply	other threads:[~2019-02-10 18:14 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-07 21:59 t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux) Kevin Daudt
2019-02-08  0:17 ` brian m. carlson
2019-02-08  6:04   ` Rich Felker
2019-02-08 11:45     ` brian m. carlson
2019-02-08 11:55       ` Kevin Daudt
2019-02-08 13:51         ` brian m. carlson
2019-02-08 17:50           ` Junio C Hamano
2019-02-08 20:23             ` Kevin Daudt
2019-02-08 20:42               ` brian m. carlson
2019-02-08 23:12                 ` Junio C Hamano
2019-02-09  0:24                   ` brian m. carlson
2019-02-09 14:57                 ` Kevin Daudt
2019-02-09 20:08                   ` [PATCH] utf8: handle systems that don't write BOM for UTF-16 brian m. carlson
2019-02-10  1:45                     ` Eric Sunshine
2019-02-10 18:14                       ` brian m. carlson [this message]
2019-02-10  8:04                     ` Torsten Bögershausen
2019-02-10 18:55                       ` brian m. carlson
2019-02-11 17:14                         ` Junio C Hamano
2019-02-11  0:23                     ` [PATCH v2] " brian m. carlson
2019-02-11  1:16                       ` Eric Sunshine
2019-02-11  1:20                         ` brian m. carlson
2019-02-11  1:26                     ` [PATCH v3] " brian m. carlson
2019-02-11 21:43                       ` Kevin Daudt
2019-02-11 23:58                         ` brian m. carlson
2019-02-12  0:31                           ` Junio C Hamano
2019-02-12  0:53                             ` brian m. carlson
2019-02-12  2:43                               ` Junio C Hamano
2019-02-12  0:52                     ` [PATCH v4] " brian m. carlson
2019-02-08 16:13         ` t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux) Rich Felker
2019-02-09  8:09     ` Torsten Bögershausen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190210181430.GA28510@genre.crustytoothpaste.net \
    --to=sandals@crustytoothpaste.net \
    --cc=dalias@libc.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=me@ikke.info \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).