On Sun, Feb 10, 2019 at 08:04:13AM +0000, Torsten Bögershausen wrote:
> On Sat, Feb 09, 2019 at 08:08:01PM +0000, brian m. carlson wrote:
> > Preserve the existing behavior for systems which do not have this knob
> > enabled, since they may use optimized implementations, including
> > defaulting to the native endianness, to gain improved performance, which
> > can be significant with large checkouts.
>
> Is this based on measurements on a real system ?

No, I haven't done any performance measurements. However, swapping bytes
is a (IIRC 1-cycle) instruction on x86, which would be executed for each
iteration of the loop. My intuition tells me that will be a significant
expense when there are a lot of files, but I can omit that phrase since
I haven't measured.

> I think we agree that Git will write UTF-16 always as big endian with BOM,
> following the tradition of iconv/libiconv.
> If yes, we can reduce the lines of code/#ifdefs somewhat, have the knob always on,
> and reduce the maintenance burden a little bit, giving a simpler patch.

No, I don't think it will. libiconv will always write big-endian, but
glibc has a separate iconv implementation which writes the native
endianness. (I believe FreeBSD's does the same thing as glibc's.)

I think it's useful for us to know that we can handle UTF-16 using the
system behavior where possible, since that's what the system is going to
produce.

> What do you think ?

I like the simplicity of the approach, and I did consider it originally.
But as I mentioned above, I'd rather test the behavior of the system
we're operating on, provided it's suitable for our needs.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
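
For readers following the thread, here is a minimal sketch of the two operations under discussion: checking which byte order a converter actually produced by looking at the BOM, and the per-code-unit byte swap whose cost is being debated. It is an illustration only, not the code in the patch; the names utf16_bom_order, utf16_swap_in_place and the enum values are hypothetical and do not come from Git or from any iconv implementation.

```c
#include <stddef.h>

/* Hypothetical result codes; these names are not taken from Git. */
enum utf16_order {
	UTF16_ORDER_UNKNOWN = 0, /* buffer too short or no BOM present */
	UTF16_ORDER_BE,          /* buffer starts with FE FF */
	UTF16_ORDER_LE           /* buffer starts with FF FE */
};

/*
 * Classify a UTF-16 buffer by its byte order mark. A build-time or
 * run-time probe could feed this the output of the system converter
 * to learn whether it emits big-endian or native-endian data.
 */
enum utf16_order utf16_bom_order(const unsigned char *buf, size_t len)
{
	if (len < 2)
		return UTF16_ORDER_UNKNOWN;
	if (buf[0] == 0xFE && buf[1] == 0xFF)
		return UTF16_ORDER_BE;
	if (buf[0] == 0xFF && buf[1] == 0xFE)
		return UTF16_ORDER_LE;
	return UTF16_ORDER_UNKNOWN;
}

/*
 * Swap the two bytes of every 16-bit code unit in place. This is the
 * extra per-iteration work that forcing a fixed byte order would add
 * on a system whose converter emits the other order. A trailing odd
 * byte, if any, is left untouched.
 */
void utf16_swap_in_place(unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		unsigned char tmp = buf[i];
		buf[i] = buf[i + 1];
		buf[i + 1] = tmp;
	}
}
```

A caller would run utf16_bom_order on the converted output and call utf16_swap_in_place only when the result differs from the order it needs. Whether that swap is measurable on large checkouts is exactly the open question above; this sketch makes no claim either way.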