From: Mark Amery <email@example.com>
To: "Torsten Bögershausen" <firstname.lastname@example.org>
Cc: Junio C Hamano <email@example.com>, firstname.lastname@example.org
Subject: Re: Bug: Changing folder case with `git mv` crashes on case-insensitive file system
Date: Thu, 6 May 2021 10:12:40 +0100
Message-ID: <CAD8jeghZKDcp=weHtcMZ4z8KaO1jQJqfPqaRtYgtiwrX-1+NNg@mail.gmail.com>
In-Reply-To: <20210506043429.zqgzxjrj643avrns@tb-raspi4>

So, I'm just a dumb Git user who doesn't even write C, so much of this
discussion is over my head, but I have a few thoughts that may be
helpful:

• The mv utility on Mac is capable of doing `mv bär.txt bÄr.txt` just
  fine. Maybe `git mv` can learn something from whatever `mv` does?

• On a case-insensitive file system, `git mv somedir sOMEdir` is a
  rename. But on a case-sensitive file system, it might NOT be a
  rename; it might be the case that `somedir` and `sOMEdir` both exist
  and that the command should put `somedir` inside `sOMEdir`. I mention
  this because I can imagine a naive fix for the original bug, one that
  compares the two names case-insensitively, breaking this behaviour on
  case-sensitive file systems by wrongly treating such a command as a
  rename. It's probably worth having a test that this scenario gets
  handled cleanly on case-sensitive file systems? (I haven't checked
  whether Torsten's proposed diff falls into this trap or not.)

• Above, Torsten mentions that there are filesystem-specific rules
  about what names are equal to each other that Git can't easily
  handle, because they go beyond just ASCII case changes. In that case,
  maybe the right solution is to always defer the question to the
  filesystem rather than Git trying to figure out the answer "in its
  head"? That is: first check the inode or file ID of the src and dst
  passed to `git mv`. If they are different and the second one is a
  folder, move src inside the existing folder. If either they are the
  same or the second one is not a folder, then do a rename.

  It seems to me that this approach automatically handles stuff like
  `git mv bär.txt bÄr.txt` plus any other rules about names being equal
  (like two different sequences of code points that both express "à"),
  all without Git ever needing to explicitly check whether two names
  are case-insensitively equal. Am I missing something?

Sorry if any of the above is dumb or if I'm reiterating things others
have already said without realising it.

On Thu, May 6, 2021 at 5:34 AM Torsten Bögershausen <email@example.com> wrote:
>
> On Wed, May 05, 2021 at 09:23:05AM +0900, Junio C Hamano wrote:
> > Torsten Bögershausen <firstname.lastname@example.org> writes:
> >
> > > To my understanding we try to rename
> > > foo/ into FOO/.
> > > But because FOO/ already "exists" as directory,
> > > Git tries to move foo/ into FOO/foo, which fails.
> > >
> > > And no, the problem is probably not restricted to macOS,
> > > Windows and all case-insensitive file systems should show
> > > the same, but I haven't tested yet, so it's more a suspicion.
> > >
> > > The following diff allows to move foo/ into FOO/
> > > If someone wants to make a patch out of it, that would be good.
> >
> > Is strcasecmp() sufficient for macOS whose filesystem has not just
> > case insensitivity but UTF-8 normalization issues?
> >
>
> Strictly speaking: no.
>
> The Git code doesn't handle UTF-8 upper/lower case at all:
> git mv bar.txt BAR.TXT works because strcasecmp() is catching it.
>
> git mv bär.txt BÄR.TXT needs the long way:
> git mv bär.txt baer.txt && git mv baer.txt BÄR.TXT
>
> We have been restricting the case-change-is-allowed to ASCII filenames
> all the time.
> There is no information in Git about which code points map onto each
> other, since this is all file system dependent.
> NTFS has one way, HFS+, APFS another, VFAT a third one, and if I expose
> ext4 via Samba we probably have another one.
> Not to mention that ext4 can be used case-insensitively on later Linux
> kernels, which sticks to Unicode.
> Or Git repos running on machines using ISO-8859-1, those should be rare
> these days.
>
> That said, people are renaming files in ASCII only and are happy,
> and in that sense renaming directories in ASCII can be supported
> without major hassle.
>
> And the inode approach mentioned as well:
> This could go on top of strcasecmp() to cover non-ASCII filenames
> or other oddities, if someone implements it.
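
For illustration, the inode check proposed in the message above might be
sketched in C roughly as follows. This is only a sketch under stated
assumptions, not Git's code and not the diff discussed in this thread:
`classify_move` and `enum mv_action` are hypothetical names, and the
sketch assumes a POSIX system where `st_dev` and `st_ino` together
identify a file, which may not hold on every platform.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical result type for this sketch; not part of Git. */
enum mv_action { MV_RENAME, MV_INTO_DIR, MV_ERROR };

/*
 * Hypothetical helper: decide what "mv src dst" should mean by asking
 * the filesystem instead of comparing the two names as strings.
 */
static enum mv_action classify_move(const char *src, const char *dst)
{
	struct stat src_st, dst_st;

	/* src must exist; symlinks are not followed. */
	if (lstat(src, &src_st) != 0)
		return MV_ERROR;

	/* Simplification: any lstat() failure on dst is treated as "absent". */
	if (lstat(dst, &dst_st) != 0)
		return MV_RENAME;

	/* Same device and inode: both names refer to the same object. */
	if (src_st.st_dev == dst_st.st_dev && src_st.st_ino == dst_st.st_ino)
		return MV_RENAME;

	/* A different object that is a directory: move src inside it. */
	if (S_ISDIR(dst_st.st_mode))
		return MV_INTO_DIR;

	/* A different object that is not a directory: rename over it. */
	return MV_RENAME;
}
```

On a case-insensitive file system, `classify_move("foo", "FOO")` would
see the same device and inode for both names and report a rename. On a
case-sensitive one where both directories exist, it would report a move
into the existing directory. One caveat: two hard links to the same file
also share an inode, so a real implementation would need to tell that
case apart from a pure case change.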