From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: "Torsten Bögershausen" <tboegi@web.de>
Cc: git@vger.kernel.org, larsxschneider@gmail.com,
Rich Felker <dalias@libc.org>, Junio C Hamano <gitster@pobox.com>,
Kevin Daudt <me@ikke.info>
Subject: Re: [PATCH] utf8: handle systems that don't write BOM for UTF-16
Date: Sun, 10 Feb 2019 18:55:24 +0000
Message-ID: <20190210185523.GB28510@genre.crustytoothpaste.net>
In-Reply-To: <20190210080413.u56vr3fgoejjzjfm@tb-raspi4>
On Sun, Feb 10, 2019 at 08:04:13AM +0000, Torsten Bögershausen wrote:
> On Sat, Feb 09, 2019 at 08:08:01PM +0000, brian m. carlson wrote:
> > Preserve the existing behavior for systems which do not have this knob
> > enabled, since they may use optimized implementations, including
> > defaulting to the native endianness, to gain improved performance, which
> > can be significant with large checkouts.
>
> Is this based on measurements on a real system?
No, I haven't done any performance measurements. However, swapping bytes
is a (IIRC 1-cycle) instruction on x86, which would be executed for each
iteration of the loop. My intuition tells me that will be a significant
expense when there are a lot of files, but I can omit that phrase since
I haven't measured.
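
For illustration only (this is not code from the patch): a minimal sketch of the kind of per-code-unit byte swap the paragraph above refers to. The names swap16 and swap_utf16_units are hypothetical, and nothing here measures or demonstrates the actual cost; it only shows that one swap runs per loop iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of a single UTF-16 code unit. */
static inline uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Byte-swap n UTF-16 code units in place.
 * Hypothetical helper: one swap is executed per loop iteration.
 */
static void swap_utf16_units(uint16_t *buf, size_t n)
{
	size_t i;

	assert(buf != NULL || n == 0);
	for (i = 0; i < n; i++)
		buf[i] = swap16(buf[i]);
}
```

Compilers commonly lower a swap like this to a single instruction, but whether the loop matters for large checkouts would still need measuring.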
> I think we agree that Git will write UTF-16 always as big endian with BOM,
> following the tradition of iconv/libiconv.
> If yes, we can reduce the lines of code/#ifdefs somewhat, have the knob always on,
> and reduce the maintenance burden a little bit, giving a simpler patch.
No, I don't think it will. libiconv will always write big-endian, but
glibc has a separate iconv implementation which writes the native
endianness. (I believe FreeBSD's does the same thing as glibc's.) I
think it's useful for us to know that we can handle UTF-16 using the
system behavior where possible, since that's what the system is going to
produce.
> What do you think ?
While I like the simplicity of the approach, as I mentioned above, and I
did consider this originally, I'd rather test the behavior of the system
we're operating on, provided it's suitable for our needs.
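
As a rough sketch of what "testing the behavior of the system" could look like (an assumption for illustration, not the check used in the patch): convert a single ASCII character to "UTF-16" with the system iconv(3) and look at the first two bytes of the output. The enum and both function names are hypothetical.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

enum utf16_prefix {
	UTF16_BOM_BE,    /* output starts with FE FF */
	UTF16_BOM_LE,    /* output starts with FF FE */
	UTF16_NO_BOM,    /* at least two bytes, but no BOM */
	UTF16_TOO_SHORT  /* fewer than two bytes available */
};

/* Classify the first two bytes of a buffer holding UTF-16 output. */
static enum utf16_prefix classify_utf16_prefix(const unsigned char *buf,
					       size_t len)
{
	assert(buf != NULL || len == 0);
	if (len < 2)
		return UTF16_TOO_SHORT;
	if (buf[0] == 0xFE && buf[1] == 0xFF)
		return UTF16_BOM_BE;
	if (buf[0] == 0xFF && buf[1] == 0xFE)
		return UTF16_BOM_LE;
	return UTF16_NO_BOM;
}

/*
 * Convert "a" from UTF-8 to "UTF-16" using the system iconv and
 * classify what it wrote. Returns an enum utf16_prefix value, or -1
 * if the conversion is unavailable or fails.
 */
static int probe_system_utf16(void)
{
	char in[] = "a";
	char out[8];
	char *inp = in, *outp = out;
	size_t inleft = 1, outleft = sizeof(out);
	size_t rc;
	iconv_t cd = iconv_open("UTF-16", "UTF-8");

	if (cd == (iconv_t)-1)
		return -1;
	rc = iconv(cd, &inp, &inleft, &outp, &outleft);
	iconv_close(cd);
	if (rc == (size_t)-1)
		return -1;
	return (int)classify_utf16_prefix((const unsigned char *)out,
					  sizeof(out) - outleft);
}
```

Going by the descriptions earlier in this message, libiconv would be expected to yield UTF16_BOM_BE, while glibc's result would depend on the machine's native endianness; a system that writes no BOM would yield UTF16_NO_BOM. Some platforms need -liconv at link time.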
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204