git@vger.kernel.org mailing list mirror (one of many)
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Casey Meijer <cmeijer@strongestfamilies.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: BUG FOLLOWUP: Case insensitivity in worktrees
Date: Fri, 24 Jul 2020 21:09:40 +0000
Message-ID: <20200724210940.GE1758454@crustytoothpaste.net>
In-Reply-To: <EE35569F-6029-4659-86B3-29FBAAD7C491@strongestfamilies.com>


On 2020-07-24 at 18:14:03, Casey Meijer wrote:
> I think I actually misunderstood your claim, Brian.  What is a bug is
> asking for worktree A's head and getting the main worktree's head.  A
> super dangerous bug.
> 
> I certainly disagree with your assertion that asking for head and not
> getting HEAD (or HeaD or hEAd) on a case-insensitive storage engine
> isn't a bug and it certainly shouldn't be a bug once extensible
> storage engines are in place: the storage engine should have final say
> on how objects are stored and retrieved, not git-core.

If you want to refer to HEAD, writing it "head" is always wrong.  "head"
is not a special ref to Git, and on a case-sensitive system, I am fully
entitled to create a branch, tag, or other ref with that name that is
independent from HEAD.

It's wrong because, regardless of operating system, you don't
intrinsically know whether the repository is case sensitive.  Windows 10
permits case-sensitive directories and macOS supports case-sensitive
file systems, so you cannot assume that "head" and "HEAD" are the same
without knowing the setting of "core.ignorecase" and the properties of
the file system.
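
As a minimal sketch of what "the properties of the file system" means
in practice, the Python snippet below probes whether a given directory
treats names differing only in case as the same file.  The helper name
directory_is_case_insensitive is hypothetical and this is not Git's own
code; it only illustrates that the answer depends on the directory being
probed, not on the operating system.

```python
import os
import tempfile


def directory_is_case_insensitive(path):
    """Return True if `path` treats names differing only in case as one file.

    Hypothetical helper for illustration; not part of Git.
    """
    with tempfile.TemporaryDirectory(dir=path) as probe_dir:
        upper = os.path.join(probe_dir, "HEAD")
        lower = os.path.join(probe_dir, "head")
        with open(upper, "w") as handle:
            handle.write("probe\n")
        # On a case-insensitive file system the lower-case name resolves
        # to the file just created; on a case-sensitive one it is absent.
        return os.path.exists(lower)


if __name__ == "__main__":
    print(directory_is_case_insensitive(os.getcwd()))
```

Two directories on the same machine can give different answers, which
is why the result cannot be inferred from the platform alone.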

So when you write "head", you are not asking for HEAD in any worktree or
repository at all.

We are fully aware that Git cannot consistently store refs differing in
case on case-insensitive file systems, and we agree that's a bug.
Reftable will fix that, and as I mentioned, it is being worked on.  It
is not, however, a deficiency that refs are intrinsically case
sensitive, and let me explain why.

First, Git does not require that refs are in any particular encoding.
Specifically, they need not be in Unicode or UTF-8.  Many byte values
are valid in a ref name, including 0xff.  That means no kind of case
folding is possible in general, since a ref name need not correspond to
actual text.
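
To make the encoding point concrete, here is a small Python sketch that
treats ref names as raw bytes.  The ref names and the helper
decodes_as_utf8 are made up for illustration; the only fact relied on is
that the byte 0xff never occurs in valid UTF-8.

```python
def decodes_as_utf8(ref_name: bytes) -> bool:
    """Return True if the byte string is valid UTF-8 text."""
    try:
        ref_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical ref names, stored as bytes rather than text.
ascii_ref = b"refs/heads/main"
raw_ref = b"refs/heads/\xff"

print(decodes_as_utf8(ascii_ref))  # True
print(decodes_as_utf8(raw_ref))    # False
```

A name like the second one has no characters to fold, only bytes, so
any text-based case rule has nothing well-defined to operate on.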

Second, even if we did require them to be UTF-8, it is impossible to
consistently fold case in a way that works for all locales.  Turkish and
other Turkic languages have a dotted I and a dotless I[0].  The ASCII
uppercase I folds to a dotless lowercase ı in Turkish and to the ASCII
(dotted) lowercase i in English.  Similarly, the ASCII lowercase i is
dotted, and folds to a dotted uppercase İ in Turkish and to the ASCII
(dotless) uppercase I in English.
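
The conflict can be sketched in Python, whose built-in str.lower() and
str.upper() are locale-independent.  The function lower_turkish below is
a hypothetical, simplified stand-in for Turkish lowercasing written only
for this example; it is not a library API and handles nothing beyond the
two I letters.

```python
DOTLESS_LOWER_I = "\u0131"  # ı
DOTTED_UPPER_I = "\u0130"   # İ


def lower_turkish(text):
    """Simplified Turkish-style lowercasing (hypothetical, for illustration)."""
    # In Turkish, I lowercases to dotless ı and dotted İ lowercases to i.
    text = text.replace("I", DOTLESS_LOWER_I).replace(DOTTED_UPPER_I, "i")
    return text.lower()


print("FIX".lower())         # fix
print(lower_turkish("FIX"))  # fıx

# The locale-independent mapping also fails to round-trip the dotless ı:
# it uppercases to ASCII I, which then lowercases to dotted i.
print(DOTLESS_LOWER_I.upper().lower() == DOTLESS_LOWER_I)  # False
```

The same input yields two different lowercase forms depending on which
convention is applied, so no single folding rule can serve both.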

It is literally not possible to correctly perform case-folding in a
locale-independent way.  Every attempt to do so will get at least this
case wrong (not to mention other cases that occur), and Turkic languages
are spoken by 200 million people, so ignoring their needs is not only
harmful, but also impacts a massive number of people.  That major OS
designers have made this mistake doesn't mean that we should as well.

We wouldn't perform ASCII-only case folding for all of the reasons
mentioned above and because it's Anglocentric.  As someone who speaks
both Spanish and French, I would find that unsuitable and the results
bizarre.
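
As a rough illustration of those results, the Python sketch below uses
bytes.lower(), which changes only ASCII letters, as a stand-in for
ASCII-only folding applied to UTF-8 data.  The helper ascii_only_lower
and the sample words are assumptions made for this example and do not
reflect anything Git does.

```python
def ascii_only_lower(text):
    """Lowercase only the ASCII letters of `text` (illustrative helper)."""
    # bytes.lower() leaves every non-ASCII byte untouched, so accented
    # capitals encoded in UTF-8 pass through unchanged.
    return text.encode("utf-8").lower().decode("utf-8")


print(ascii_only_lower("ÉCOLE"))  # École
print(ascii_only_lower("ÑANDÚ"))  # ÑandÚ
print("ÉCOLE".lower())            # école
```

Under ASCII-only folding, "ÉCOLE" and "école" would still count as
different names while "ECOLE" and "ecole" would not.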

So I understand that on Windows or macOS you may expect to be able to
write "head" and get HEAD, and be surprised when that doesn't work in
all cases.  But that is not, and never has been, expected to work, nor
is it a bug that it doesn't.

[0] https://en.wikipedia.org/wiki/Dotted_and_dotless_I
-- 
brian m. carlson: Houston, Texas, US


Thread overview: 8+ messages
     [not found] <EEA65ED1-2BE0-41AD-84CC-780A9F4D9215@strongestfamilies.com>
2020-07-23 15:20 ` BUG FOLLOWUP: Case insensitivity in worktrees Casey Meijer
2020-07-24  1:19   ` brian m. carlson
2020-07-24  1:25     ` Junio C Hamano
2020-07-24 18:07       ` Casey Meijer
2020-07-24 18:17       ` Casey Meijer
2020-07-24 19:36         ` Junio C Hamano
2020-07-24 18:14     ` Casey Meijer
2020-07-24 21:09       ` brian m. carlson [this message]
