On 2022-09-30 at 13:53:16, Ævar Arnfjörð Bjarmason wrote:
> You might find ASCII-only sufficient, but note that even if you get this
> working you won't catch the more complex Unicode normalization rules
> various filesystems perform, see the fsck code we carefully crafted to
> make sure we don't get something those FS's will mistake for a ".git"
> directory in-tree.

What's even worse is that different OSes case-fold differently and the
behaviour differs based on the version of the OS that formatted the file
system (which is of course not exposed to userspace), so in general it's
impossible to know exactly how case folding works on a particular
system.

It might be possible to implement some general rules that are
overzealous (in that they will catch patterns that will case-fold on
_some_ system), but doing this well is very difficult.  The rules will
also almost certainly change with newer versions of Unicode.
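
As a rough illustration of what such an overzealous rule could look
like, here is a minimal Python sketch.  The helper names
(conservative_key, might_collide) are made up for this example, and the
choice of NFKC plus full case folding is an assumption, not what Git or
any particular file system does.  It deliberately treats names as
colliding in more cases than any single file system is likely to.

```python
import unicodedata


def conservative_key(name: str) -> str:
    """Hypothetical comparison key: normalize, case-fold, normalize again.

    NFKC and str.casefold() are deliberately aggressive choices, so this
    folds together more names than any one file system probably would.
    """
    folded = unicodedata.normalize("NFKC", name).casefold()
    # Case folding can produce text that is no longer normalized.
    return unicodedata.normalize("NFKC", folded)


def might_collide(a: str, b: str) -> bool:
    """True if two distinct names could be the same file on *some* system."""
    return a != b and conservative_key(a) == conservative_key(b)


if __name__ == "__main__":
    pairs = [
        ("README", "readme"),            # plain ASCII case difference
        ("caf\u00e9", "cafe\u0301"),     # precomposed vs. combining accent
        ("Stra\u00dfe", "STRASSE"),      # sharp s folds to "ss"
        ("Makefile", "makefile.bak"),    # unrelated names
    ]
    for a, b in pairs:
        print(repr(a), repr(b), might_collide(a, b))
```

The result depends on the Unicode tables shipped with the Python
interpreter, which is exactly the versioning problem mentioned above.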

I'll also point out that there is no locale-independent way to correctly
case-fold Unicode text.  Correct case-folding is sensitive to the
language, script, and region.
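
The usual example is the Turkish dotted and dotless "i".  The sketch
below contrasts Python's built-in, locale-independent str.lower() with
a hypothetical lower_turkish() helper that hard-codes a simplified
version of the Turkish rule for the two capital letters involved.  The
helper is an illustration only and does not cover the full special
casing rules.

```python
def lower_default(s: str) -> str:
    """Locale-independent lowercasing, as provided by str.lower()."""
    return s.lower()


def lower_turkish(s: str) -> str:
    """Hypothetical, simplified Turkish lowercasing.

    In Turkish, capital I (U+0049) lowercases to dotless i (U+0131) and
    capital dotted I (U+0130) lowercases to plain i (U+0069).
    """
    s = s.replace("\u0130", "i").replace("I", "\u0131")
    return s.lower()


if __name__ == "__main__":
    for text in ["I", "\u0130", "TITLE"]:
        print(ascii(text), ascii(lower_default(text)), ascii(lower_turkish(text)))
```

The same input string lowercases to different results depending on
which language's rules are applied, so two names that match under one
rule do not necessarily match under the other.
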
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA