* Re: BUG FOLLOWUP: Case insensitivity in worktrees [not found] <EEA65ED1-2BE0-41AD-84CC-780A9F4D9215@strongestfamilies.com> @ 2020-07-23 15:20 ` Casey Meijer 2020-07-24 1:19 ` brian m. carlson 0 siblings, 1 reply; 8+ messages in thread From: Casey Meijer @ 2020-07-23 15:20 UTC (permalink / raw) To: git@vger.kernel.org This just bit me; it seems quite old, and I wanted to propose an alternative solution (maybe it doesn’t work for some reason I’m unaware of): https://marc.info/?l=git&m=154473525401677&w=2 Why not just preserve the existing semantics of the main worktree by checking the worktree refs first unconditionally and only fall back to the main refs when the ref doesn’t exist locally in the worktree? This would have the added benefit of allowing power users to override refs in their worktrees and would, if I’m not mistaken, preserve the semantics of the main worktree in case-insensitive and case-sensitive filesystems. Anywho, just a thought. I could work on a patch if this approach makes sense at least as an intermediary until there’s a pluggable storage backend for non-FS stores 😉 (I'd also be somewhat interested in implementing a postgres/sql storage backend if this project is moving forwards). Best, Casey Meijer ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG FOLLOWUP: Case insensitivity in worktrees 2020-07-23 15:20 ` BUG FOLLOWUP: Case insensitivity in worktrees Casey Meijer @ 2020-07-24 1:19 ` brian m. carlson 2020-07-24 1:25 ` Junio C Hamano 2020-07-24 18:14 ` Casey Meijer 0 siblings, 2 replies; 8+ messages in thread From: brian m. carlson @ 2020-07-24 1:19 UTC (permalink / raw) To: Casey Meijer; +Cc: git@vger.kernel.org On 2020-07-23 at 15:20:50, Casey Meijer wrote: > This just bit me; it seems quite old, and I wanted to propose an alternative solution (maybe it doesn’t work for some reason I’m unaware of): > https://marc.info/?l=git&m=154473525401677&w=2 > > Why not just preserve the existing semantics of the main worktree by checking the worktree refs first unconditionally and only fall back to the main refs when the ref doesn’t exist locally in the worktree? > > This would have the added benefit of allowing power users to override refs in their worktrees and would, if I’m not mistaken, preserve the semantics of the main worktree in case-insensitive and case-sensitive filesystems. It isn't clear to me exactly what you're suggesting. Are you suggesting that we allow "head" instead of "HEAD" in worktrees, or that we allow refs in general to be case insensitive, or something else? > Anywho, just a thought. I could work on a patch if this approach makes sense at least as an intermediary until there’s a pluggable storage backend for non-FS stores 😉 (I'd also be somewhat interested in implementing a postgres/sql storage backend if this project is moving forwards). There is a proposal for a ref storage backend called "reftable" which will not store the ref names in the file system, and work is being done on it. There has been a suggestion for an SQLite store in the past, but that causes problems for certain implementations, such as JGit, which do not want to have C bindings. -- brian m. carlson: Houston, Texas, US ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG FOLLOWUP: Case insensitivity in worktrees 2020-07-24 1:19 ` brian m. carlson @ 2020-07-24 1:25 ` Junio C Hamano 2020-07-24 18:07 ` Casey Meijer 2020-07-24 18:17 ` Casey Meijer 2020-07-24 18:14 ` Casey Meijer 1 sibling, 2 replies; 8+ messages in thread From: Junio C Hamano @ 2020-07-24 1:25 UTC (permalink / raw) To: brian m. carlson; +Cc: Casey Meijer, git@vger.kernel.org "brian m. carlson" <sandals@crustytoothpaste.net> writes: > It isn't clear to me exactly what you're suggesting. Are you suggesting > that we allow "head" instead of "HEAD" in worktrees, or that we allow > refs in general to be case insensitive, or something else? > There is a proposal for a ref storage backend called "reftable" which > will not store the ref names in the file system, and work is being done > on it. There has been a suggestion for an SQLite store in the past, but > that causes problems for certain implementations, such as JGit, which do > not want to have C bindings. Yes, another important thing to point out is that one shared goal of these efforts is so that users, even those on case insensitive filesystems, can name their refs foo and FOO and have the system treat these as two distinct refs. IOW, wanting to enhance "support" for case insensitive treatment of refs will not fly---asking for "head" and getting contents of "HEAD" on certain platforms is a bug, induced by the limited filesystems these platforms use, and it is being fixed. Thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG FOLLOWUP: Case insensitivity in worktrees 2020-07-24 1:25 ` Junio C Hamano @ 2020-07-24 18:07 ` Casey Meijer 2020-07-24 18:17 ` Casey Meijer 1 sibling, 0 replies; 8+ messages in thread From: Casey Meijer @ 2020-07-24 18:07 UTC (permalink / raw) To: Junio C Hamano, brian m. carlson; +Cc: git@vger.kernel.org It's definitely a bug and it's kind of amazing that it's been floating about for 2 years. I'm not suggesting anything really change except the way you determine whether a ref is "work-tree local" or not. This way, on case sensitive filesystems only HEAD will be accepted, and on case insensitive filesystems both head and HEAD will be valid (and will refer to the same file/ref), replicating the semantics of the primary worktree. Namely, instead of checking explicitly for "HEAD" (or going through some hoops to determine if the filesystem *is* case sensitive), just look in the worktree refs. If it's in there, then it's worktree local. If not, then not. Like I said, maybe there are some problems with this approach that I'm not aware of, but if so, I think it's worth thinking about whether those problems are resolvable 😊 As far as alternate storage engines go, I'd be more interested in seeing core git built out to support a pluggable storage engine than any specific implementation. Take a look at PostgreSQL's work on Table Access Methods for an example in this vein. I think this idea plays well with my proposal above as well because it delegates the responsibility of case sensitivity to the storage backend (in this case, the filesystem). Best, Casey ^ permalink raw reply [flat|nested] 8+ messages in thread
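A minimal sketch of the lookup order proposed in the message above: consult the worktree-local refs first and fall back to the main refs only when the name is absent locally. This is a toy model, not Git's refs code. `RefStore`, `resolve`, and `demo` are hypothetical names, the ref values are made up, and `str.lower` is only a crude stand-in for how a case-insensitive filesystem compares names.

```python
from typing import Callable, Dict, Optional, Tuple


class RefStore:
    """Toy ref store; `normalize` models how the backend compares names."""

    def __init__(self, normalize: Callable[[str], str]) -> None:
        self._normalize = normalize
        self._refs: Dict[str, str] = {}

    def write(self, name: str, value: str) -> None:
        self._refs[self._normalize(name)] = value

    def read(self, name: str) -> Optional[str]:
        return self._refs.get(self._normalize(name))


def resolve(name: str, worktree: RefStore, main: RefStore) -> Optional[str]:
    """Proposed order: worktree-local refs first, then the main refs."""
    value = worktree.read(name)
    return value if value is not None else main.read(name)


def demo(normalize: Callable[[str], str]) -> Tuple[Optional[str], Optional[str]]:
    main = RefStore(normalize)
    worktree = RefStore(normalize)
    main.write("HEAD", "ref: refs/heads/master")      # made-up values
    main.write("refs/heads/master", "1111111")
    worktree.write("HEAD", "ref: refs/heads/topic")
    # Ask for lowercase "head" and for a ref that lives only in the main store.
    return (resolve("head", worktree, main),
            resolve("refs/heads/master", worktree, main))


case_sensitive = demo(lambda name: name)   # backend compares names exactly
case_insensitive = demo(str.lower)         # backend ignores ASCII case

print(case_sensitive)
print(case_insensitive)
```

In this model, a case-sensitive backend finds no ref named "head" at all, while the case-insensitive one returns the worktree's own HEAD rather than the main worktree's. Whether Git should accept "head" in the first place is a separate question that the rest of the thread disputes.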
* Re: BUG FOLLOWUP: Case insensitivity in worktrees 2020-07-24 1:25 ` Junio C Hamano 2020-07-24 18:07 ` Casey Meijer @ 2020-07-24 18:17 ` Casey Meijer 2020-07-24 19:36 ` Junio C Hamano 1 sibling, 1 reply; 8+ messages in thread From: Casey Meijer @ 2020-07-24 18:17 UTC (permalink / raw) To: Junio C Hamano, brian m. carlson; +Cc: git@vger.kernel.org Sorry, I got mixed up; that last message should have been addressed to Junio. My apologies. To put it very simply, I'm asking that git respect the separation of concerns between itself and its storage engine (regardless of whether that's pluggable, or just the current filesystem, which I guess is technically pluggable, lol). Best, Casey ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG FOLLOWUP: Case insensitivity in worktrees 2020-07-24 18:17 ` Casey Meijer @ 2020-07-24 19:36 ` Junio C Hamano 0 siblings, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2020-07-24 19:36 UTC (permalink / raw) To: Casey Meijer; +Cc: brian m. carlson, git@vger.kernel.org Casey Meijer <cmeijer@strongestfamilies.com> writes: > Sorry I got mixed up,, that last message should have been > addressed to Junio. > > My apologies. > > To put it very simply, I'm asking that git respect the separation > of concerns between itself and its storage engine (regardless of > whether that's pluggable, or just the current filesystem, which I > guess is technically pluggable, lol). If "git" is told to store ref 'foo' pointing at object X and then ref 'Foo' pointing at object Y by the end user, after claiming to have done these two operations, if it is then asked about the value of 'foo', it must say that 'foo' points at object X and not Y. If a ref backend is based on case insensitive filesystem, there are only two options available. (1) ignore case and violate the expectation of end user. (2) come up with a way to "defeat" the limitation of case insensitivity imposed by the filesystem (e.g. your ref backend implementation _could_ URLencode/decode the ref before using it as a filename on such a filesystem). Doing (2) would be transparent to the rest of Git (i.e. the rest of Git does not have to care that each ref is stored in a file, whose filename is encoded version of the refname) and gives us a good separation of concerns between it and the storage backend. Those who ported Git to case insensitive filesystems didn't and chose (1). As (1) violates end-user expectation, I would think it is fair to declare it a bug. ^ permalink raw reply [flat|nested] 8+ messages in thread
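A minimal sketch of option (2) above, assuming a percent-encoding scheme chosen for illustration; it is not something Git implements. Plain URL encoding would leave uppercase ASCII letters untouched, so this variant escapes every byte outside a lowercase-only safe set. The encoded filename then contains no uppercase characters, and two refnames that differ only in case can no longer collide on a case-insensitive filesystem. `SAFE`, `encode_refname`, and `decode_refname` are hypothetical names.

```python
# Bytes stored as-is; everything else (uppercase, '%', non-ASCII) is escaped.
SAFE = frozenset(b"abcdefghijklmnopqrstuvwxyz0123456789/-_.")


def encode_refname(refname: bytes) -> str:
    """Map a refname to a filename that uses only lowercase-safe characters."""
    parts = []
    for byte in refname:
        if byte in SAFE:
            parts.append(chr(byte))
        else:
            parts.append(f"%{byte:02x}")  # two lowercase hex digits
    return "".join(parts)


def decode_refname(filename: str) -> bytes:
    """Inverse of encode_refname."""
    out = bytearray()
    i = 0
    while i < len(filename):
        if filename[i] == "%":
            out.append(int(filename[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(filename[i]))
            i += 1
    return bytes(out)


lower = encode_refname(b"refs/heads/foo")   # "refs/heads/foo"
mixed = encode_refname(b"refs/heads/Foo")   # "refs/heads/%46oo"
print(lower, mixed)
```

Because the scheme works on bytes, it also covers refnames that are not valid UTF-8, and the rest of the program only ever sees decoded refnames.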
* Re: BUG FOLLOWUP: Case insensitivity in worktrees 2020-07-24 1:19 ` brian m. carlson 2020-07-24 1:25 ` Junio C Hamano @ 2020-07-24 18:14 ` Casey Meijer 2020-07-24 21:09 ` brian m. carlson 1 sibling, 1 reply; 8+ messages in thread From: Casey Meijer @ 2020-07-24 18:14 UTC (permalink / raw) To: brian m. carlson; +Cc: git@vger.kernel.org I think I misunderstood your claim, actually, Brian. What is a bug is asking for worktree A's head and getting the main worktree's head. A super dangerous bug. I certainly disagree with your assertion that asking for head and not getting HEAD (or HeaD or hEAd) on a case-insensitive storage engine isn't a bug, and it certainly shouldn't be a bug once extensible storage engines are in place: the storage engine should have final say on how objects are stored and retrieved, not git-core. Best, Casey ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG FOLLOWUP: Case insensitivity in worktrees 2020-07-24 18:14 ` Casey Meijer @ 2020-07-24 21:09 ` brian m. carlson 0 siblings, 0 replies; 8+ messages in thread From: brian m. carlson @ 2020-07-24 21:09 UTC (permalink / raw) To: Casey Meijer; +Cc: git@vger.kernel.org On 2020-07-24 at 18:14:03, Casey Meijer wrote: > I think I misunderstood your claim actually Brian. What is a bug is > asking for worktree A's head and getting the main worktree's head. A > super dangerous bug. > > I certainly disagree with your assertion that asking for head and not > getting HEAD (or HeaD or hEAd) on a case-insensitive storage engine > isn't a bug and it certainly shouldn't be a bug once extensible > storage engines are in place: the storage engine should have final say > on how objects are stored and retrieved, not git-core. If you want to refer to HEAD, writing it "head" is always wrong. "head" is not a special ref to Git, and on a case-sensitive system, I am fully entitled to create a branch, tag, or other ref with that name that is independent from HEAD. It's wrong because regardless of operating system, you don't intrinsically know whether the repository is case sensitive. Windows 10 permits case-sensitive directories and macOS has case-sensitive file systems, so you cannot assume that "head" and "HEAD" are the same without knowing the setting of "core.ignorecase" and the properties of the file system. So when you write "head", you are not asking for HEAD in any worktree or repository at all. We are fully aware that Git cannot consistently store refs differing in case on case-insensitive file systems, and we agree that's a bug. Reftable will fix that, and as I mentioned, it is being worked on. It is not, however, a deficiency that refs are intrinsically case sensitive, and let me explain why. First, Git does not require that refs are in any particular encoding. Specifically, they need not be in Unicode or UTF-8. It is valid to have many characters in a ref name, including 0xff. That means any type of case folding is not possible, since a ref need not correspond to actual text. Second, even if we did require them to be UTF-8, it is impossible to consistently fold case in a way that works for all locales. Turkish and other Turkic languages have a dotted I and a dotless I[0]. The ASCII uppercase I would fold to a dotless lowercase I for Turkish and to the ASCII (dotted) lowercase I for English. Similarly, the ASCII lowercase I is dotted, and folds to a dotted uppercase I in Turkish and an ASCII (dotless) uppercase I in English. It is literally not possible to correctly perform case-folding in a locale-independent way. Every attempt to do so will get at least this case wrong (not to mention other cases that occur), and Turkic languages are spoken by 200 million people, so ignoring their needs is not only harmful, but also impacts a massive number of people. That major OS designers have made this mistake doesn't mean that we should as well. We wouldn't perform ASCII-only case folding for all of the reasons mentioned above and because it's Anglocentric. As someone who speaks both Spanish and French, I would find that unsuitable and the results bizarre. So I understand that you may expect that on Windows or macOS you can write "head" and get HEAD and be surprised when that doesn't work in all cases. But that is not, and never has been, expected to work, nor is it a bug that it doesn't. [0] https://en.wikipedia.org/wiki/Dotted_and_dotless_I -- brian m. carlson: Houston, Texas, US ^ permalink raw reply [flat|nested] 8+ messages in thread
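The two objections above can be illustrated with a short sketch. It assumes only Python's built-in string and bytes behavior. `turkish_lower` and `is_valid_utf8` are hypothetical helpers written for this illustration; the first handles just the two Turkish letters discussed in the message and is not a complete locale implementation.

```python
DOTLESS_LOWER_I = "\u0131"  # ı, the Turkish lowercase of ASCII "I"
DOTTED_UPPER_I = "\u0130"   # İ, the Turkish uppercase of ASCII "i"


def turkish_lower(text: str) -> str:
    """Lowercase following the Turkish I/ı and İ/i pairing (illustration only)."""
    text = text.replace("I", DOTLESS_LOWER_I).replace(DOTTED_UPPER_I, "i")
    return text.lower()


def is_valid_utf8(refname: bytes) -> bool:
    """Report whether the bytes of a refname decode as UTF-8 text."""
    try:
        refname.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same ASCII input folds to different strings depending on the locale rule.
default_result = "TITLE".lower()          # locale-independent: "title"
turkish_result = turkish_lower("TITLE")   # Turkish rule: "tıtle"
print(default_result, turkish_result, default_result == turkish_result)

# A refname containing the byte 0xff is not text at all, so there is no
# character-level case folding to apply to it.
print(is_valid_utf8(b"refs/heads/\xff"))
```

Any single folding rule built into the ref layer would have to pick one of the two results for "TITLE", and it would have nothing meaningful to do with the second refname.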