git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Should "head" also work for "HEAD" on case-insensitive FS?
@ 2017-07-03 22:00 Ævar Arnfjörð Bjarmason
  2017-07-04  7:19 ` Konstantin Khomoutov
  0 siblings, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-07-03 22:00 UTC (permalink / raw)
  To: Git ML

I don't have a OSX box, but was helping a co-worker over Jabber the
other day, and he pasted something like:

    $ git merge-base github/master head

Which didn't work for me, and I thought he had a local "head" branch
until realizing that of course we were just resolving HEAD on the FS.

Has this come up before? I think it makes sense to warn/error about
these magic /HEAD/ revisions if they're not upper-case.

This is likely unintentional and purely some emergent effect of how it's
implemented, and leads to unportable git invocations.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-03 22:00 Should "head" also work for "HEAD" on case-insensitive FS? Ævar Arnfjörð Bjarmason
@ 2017-07-04  7:19 ` Konstantin Khomoutov
  2017-07-04  8:24   ` Ævar Arnfjörð Bjarmason
  2017-07-06  3:35   ` Kenneth Hsu
  0 siblings, 2 replies; 11+ messages in thread
From: Konstantin Khomoutov @ 2017-07-04  7:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git ML

On Tue, Jul 04, 2017 at 12:00:49AM +0200, Ævar Arnfjörð Bjarmason wrote:

> I don't have a OSX box, but was helping a co-worker over Jabber the
> other day, and he pasted something like:
> 
>     $ git merge-base github/master head
> 
> Which didn't work for me, and I thought he had a local "head" branch
> until realizing that of course we were just resolving HEAD on the FS.
> 
> Has this come up before? I think it makes sense to warn/error about
> these magic /HEAD/ revisions if they're not upper-case.
> 
> This is likely unintentional and purely some emergent effect of how it's
> implemented, and leads to unportable git invocations.

JFTR this is one common case of confusion on Windows as well.
To the point that I saw people purposedly using "head" on StackOverflow
questions.  That is, they appear to think (for some reason) that
branches in Git have case-insensitive names and prefer to spell "head"
since it (supposedly) easier to type.

I don't know what to do about it.
Ideally we'd just have a way to perform a final check on the file into
which a ref name was resolved to see its "real" name but I don't know
whether all popular filesystems are case preserving (HFS+ and NTFS are,
IIRC) and even if they are, whether the appropriate platform-specific
APIs exists to perform such a check.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-04  7:19 ` Konstantin Khomoutov
@ 2017-07-04  8:24   ` Ævar Arnfjörð Bjarmason
  2017-07-05  8:36     ` Jeff King
  2017-07-06  3:35   ` Kenneth Hsu
  1 sibling, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-07-04  8:24 UTC (permalink / raw)
  To: Konstantin Khomoutov; +Cc: Git ML


On Tue, Jul 04 2017, Konstantin Khomoutov jotted:

> On Tue, Jul 04, 2017 at 12:00:49AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> I don't have a OSX box, but was helping a co-worker over Jabber the
>> other day, and he pasted something like:
>>
>>     $ git merge-base github/master head
>>
>> Which didn't work for me, and I thought he had a local "head" branch
>> until realizing that of course we were just resolving HEAD on the FS.
>>
>> Has this come up before? I think it makes sense to warn/error about
>> these magic /HEAD/ revisions if they're not upper-case.
>>
>> This is likely unintentional and purely some emergent effect of how it's
>> implemented, and leads to unportable git invocations.
>
> JFTR this is one common case of confusion on Windows as well.
> To the point that I saw people purposedly using "head" on StackOverflow
> questions.  That is, they appear to think (for some reason) that
> branches in Git have case-insensitive names and prefer to spell "head"
> since it (supposedly) easier to type.
>
> I don't know what to do about it.
> Ideally we'd just have a way to perform a final check on the file into
> which a ref name was resolved to see its "real" name but I don't know
> whether all popular filesystems are case preserving (HFS+ and NTFS are,
> IIRC) and even if they are, whether the appropriate platform-specific
> APIs exists to perform such a check.

I think there's no easy way do this in the general case with the current
ref backend, because we rely on the FS to store the refs.

But I'm thinking of the more specific case where you specify
{HEAD,FETCH_HEAD,ORIG_HEAD,MERGE_HEAD,CHERRY_PICK_HEAD} as non-upper
case, and we resolve it from .git/$NAME.

So the detection would not be checking whether the file on-disk has the
same casing, but knowing that if we resolve anything from .git/$NAME
then the string provided on the command-line must be upper-case.

Although there is this:

    $ git rev-parse HEAD
    051ee1e7dd2c7b8bdc20f237eea3c7d5b1314280
    $ git rev-parse WHATEVER
    WHATEVER
    fatal: ambiguous argument 'WHATEVER': unknown revision or path not in the working tree.
    $ cp .git/{HEAD,WHATEVER}
    $ git rev-parse WHATEVER
    051ee1e7dd2c7b8bdc20f237eea3c7d5b1314280

I.e. we allow any arbitrary ref sitting in .git/, but presumably we
could just record the original string the user provided so that this
dies on OSX/Windows too:

    $ cp .git/{HEAD,Whatever}
    $ git rev-parse wHATEVER
    wHATEVER
    fatal: ambiguous argument 'wHATEVER': unknown revision or path not in the working tree.

But this may be a much deeper rabbit hole than I initially thought, I
was fishing to see if someone knew of a place in the code or WIP patch
that dealt with these special refs, but between the low-level machinery
& sha1_name.c (and others) there may be no easy one place to do this...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-04  8:24   ` Ævar Arnfjörð Bjarmason
@ 2017-07-05  8:36     ` Jeff King
  2017-07-05 17:07       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2017-07-05  8:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Konstantin Khomoutov, Git ML

On Tue, Jul 04, 2017 at 10:24:30AM +0200, Ævar Arnfjörð Bjarmason wrote:

> I.e. we allow any arbitrary ref sitting in .git/, but presumably we
> could just record the original string the user provided so that this
> dies on OSX/Windows too:
> 
>     $ cp .git/{HEAD,Whatever}
>     $ git rev-parse wHATEVER
>     wHATEVER
>     fatal: ambiguous argument 'wHATEVER': unknown revision or path not in the working tree.
> 
> But this may be a much deeper rabbit hole than I initially thought, I
> was fishing to see if someone knew of a place in the code or WIP patch
> that dealt with these special refs, but between the low-level machinery
> & sha1_name.c (and others) there may be no easy one place to do this...

I think we talked at one point about allowing only [A-Z_] for top-level
refs. My recollection is that it generally seemed like a good idea, but
I don't think we ever had patches.

I think it would work to enforce it via check_refname_format(). That
would catch reading via dwim_ref(), which is what your example is
hitting. But it should also prevent people from writing ".git/foo" (or
worse, ".git/config") as a ref.

I do think that's the tip of the iceberg for case-sensitivity problems
with refs, though. Because packed-refs is case-sensitive, I think you
can create some pretty confusing states on case-insensitive filesystems.
For example:

  http://public-inbox.org/git/20150825052123.GA523@sigill.intra.peff.net/

Ultimately I think the path forward is to have a ref backend that
behaves uniformly (either because it avoids the filesystem, or because
it encodes around the differences). See:

  http://public-inbox.org/git/xmqqvb4udyf9.fsf@gitster.mtv.corp.google.com/

and its reply.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-05  8:36     ` Jeff King
@ 2017-07-05 17:07       ` Junio C Hamano
  2017-07-06 22:34         ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2017-07-05 17:07 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Konstantin Khomoutov,
	Git ML

Jeff King <peff@peff.net> writes:

> Ultimately I think the path forward is to have a ref backend that
> behaves uniformly (either because it avoids the filesystem, or because
> it encodes around the differences). See:
>
>   http://public-inbox.org/git/xmqqvb4udyf9.fsf@gitster.mtv.corp.google.com/
>
> and its reply.

Once Michael's packed-refs backend stabilizes, we may have a nice
calm period in the refs subsystem and I expect that this will become
a good medium-sized project for a contributor who does not have to 
be so experienced (but not a complete newbie).

It needs to:

 - add icase-files-backend, preferrably sharing as much code as the
   existing files-backend, in refs/.

 - design a mechanism to configure which refs backend to use at
   runtime; as this has to be done fairly early in the control flow,
   this will likely to use early configuration mechanism and will
   probably need to be done in the set-up code, but doing it lazy
   may even be nicer, as not all subcommands need access to refs.

Thanks for a pointer to the archive.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-04  7:19 ` Konstantin Khomoutov
  2017-07-04  8:24   ` Ævar Arnfjörð Bjarmason
@ 2017-07-06  3:35   ` Kenneth Hsu
  1 sibling, 0 replies; 11+ messages in thread
From: Kenneth Hsu @ 2017-07-06  3:35 UTC (permalink / raw)
  To: Konstantin Khomoutov; +Cc: Ævar Arnfjörð Bjarmason, Git ML

On Tue, Jul 04, 2017 at 10:19:09AM +0300, Konstantin Khomoutov wrote:
> On Tue, Jul 04, 2017 at 12:00:49AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> > I don't have a OSX box, but was helping a co-worker over Jabber the
> > other day, and he pasted something like:
> > 
> >     $ git merge-base github/master head
> > 
> > Which didn't work for me, and I thought he had a local "head" branch
> > until realizing that of course we were just resolving HEAD on the FS.
> > 
> > Has this come up before? I think it makes sense to warn/error about
> > these magic /HEAD/ revisions if they're not upper-case.
> > 
> > This is likely unintentional and purely some emergent effect of how it's
> > implemented, and leads to unportable git invocations.
> 
> JFTR this is one common case of confusion on Windows as well.
> To the point that I saw people purposedly using "head" on StackOverflow
> questions.  That is, they appear to think (for some reason) that
> branches in Git have case-insensitive names and prefer to spell "head"
> since it (supposedly) easier to type.

The use of "head" also appears to be leading to some strange behavior
when resolving refs on Windows.  See the following issue in the
git-for-windows project:

https://github.com/git-for-windows/git/issues/1225

In summary, it seems that head and HEAD can resolve to different
revisions when in a worktree.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-05 17:07       ` Junio C Hamano
@ 2017-07-06 22:34         ` Junio C Hamano
       [not found]           ` <CAMy9T_FmE=8xzjRJJRxLwQjoMStJx5sYO_xtODv2OEZm54DurA@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2017-07-06 22:34 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Konstantin Khomoutov,
	Michael Haggerty, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Once Michael's packed-refs backend stabilizes, we may have a nice
> calm period in the refs subsystem and I expect that this will become
> a good medium-sized project for a contributor who does not have to 
> be so experienced (but not a complete newbie).
>
> It needs to:
>
>  - add icase-files-backend, preferrably sharing as much code as the
>    existing files-backend, in refs/.
>
>  - design a mechanism to configure which refs backend to use at
>    runtime; as this has to be done fairly early in the control flow,
>    this will likely to use early configuration mechanism and will
>    probably need to be done in the set-up code, but doing it lazy
>    may even be nicer, as not all subcommands need access to refs.
>
> Thanks for a pointer to the archive.

So here is an early WIP/illustration I did to see how involved such
a change would be, which should apply cleanly on top of 'pu'.

I was pleasantly surprised how cleanly refs/files-backend.c
separates the notion of "path" and "refname".  Only two functions,
files_reflog_path() and files_ref_path(), are responsible for taking
the refname and turning it to the pathname of an filesystem entity.
One one function, loose_fill_ref_dir(), is responsible for running
readdir() to find pathname, and turning the result into a refname.
So in theory, these three are the only things that need to know
about the "encoding".

The exact detail of the encoding used here is immaterial, but I just
used "encode uppercase letters and % as % followed by two hex",
which was simple enough.  Usual refs/heads/master and friends will
not have to be touched when encoded this way.  Perhaps the decoding
side should be tweaked so that uppercase letters it sees needs to be
downcased to avoid "refs/heads/Foo" getting returned as "Foo" branch,
as a "Foo" branch should have been encoded as "refs/heads/%46oo".

Having said that, this patch alone does not quite work yet.

 * In the repository discovery code, we have some logic that
   hard-codes the path in the directory (which is a candidate for
   being a repository) to check, like "refs/" and "HEAD".  In the
   attached illustration patch, files_path_encode() special cases
   "HEAD" so that it is not munged, which is a bit of ugly
   workaround for this.

 * I haven't figured out why, but what refs.c calls "pseudo refs"
   bypasses the files backend layer for some but not all operations,
   which causes t1405-main-ref-store to fail.  The test creates a
   "pseudo ref" FOO and then tries to remove it.  Creation seems to
   follow the files-backend.c and thusly goes through the escaping;
   refs.::refs_delete_ref() however does not consult files-backend.c
   and fails to find and delete .git/FOO, because the creation side
   encoded it as ".git/%46%4F%4F".

Michael is CC'ed as I thought it would be simpler to just ask about
the latter bullet point than digging it further myself ;-)

Thanks.

 refs/files-backend.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 923e481e06..5bde77cbf8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -23,6 +23,7 @@ struct ref_lock {
 struct files_ref_store {
 	struct ref_store base;
 	unsigned int store_flags;
+	int encode_names;
 
 	char *gitdir;
 	char *gitcommondir;
@@ -54,6 +55,9 @@ static struct ref_store *files_ref_store_create(const char *gitdir,
 	base_ref_store_init(ref_store, &refs_be_files);
 	refs->store_flags = flags;
 
+	/* git_config_get_bool("core.escapeLooseRefNames", &refs->encode_names); */
+	refs->encode_names = 1;
+
 	refs->gitdir = xstrdup(gitdir);
 	get_common_dir_noenv(&sb, gitdir);
 	refs->gitcommondir = strbuf_detach(&sb, NULL);
@@ -102,6 +106,49 @@ static struct files_ref_store *files_downcast(struct ref_store *ref_store,
 	return refs;
 }
 
+static void files_path_encode(struct files_ref_store *refs,
+			      struct strbuf *sb, const char *refname)
+{
+	if (!refs->encode_names || !strcmp(refname, "HEAD")) {
+		strbuf_addstr(sb, refname);
+	} else {
+		const char *cp;
+
+		for (cp = refname; *cp; cp++) {
+			int ch = *cp;
+			if (('A' <= ch && ch <= 'Z') || (ch == '%'))
+				strbuf_addf(sb, "%%%02x", ch);
+			else
+				strbuf_addch(sb, ch);
+		}
+	}
+}
+
+static int files_path_decode(struct files_ref_store *refs,
+			     struct strbuf *sb, const char *name, int namelen)
+{
+	if (!refs->encode_names) {
+		strbuf_add(sb, name, namelen);
+	} else {
+		size_t origlen = sb->len;
+
+		while (namelen--) {
+			int ch = *name++;
+
+			if (ch == '%') {
+				if (namelen < 2 || (ch = hex2chr(name)) < 0) {
+					strbuf_setlen(sb, origlen);
+					return -1;
+				}
+				namelen -= 2;
+				name += 2;
+			}
+			strbuf_addch(sb, ch);
+		}
+	}
+	return 0;
+}
+
 static void files_reflog_path(struct files_ref_store *refs,
 			      struct strbuf *sb,
 			      const char *refname)
@@ -118,15 +165,16 @@ static void files_reflog_path(struct files_ref_store *refs,
 	switch (ref_type(refname)) {
 	case REF_TYPE_PER_WORKTREE:
 	case REF_TYPE_PSEUDOREF:
-		strbuf_addf(sb, "%s/logs/%s", refs->gitdir, refname);
+		strbuf_addf(sb, "%s/logs/", refs->gitdir);
 		break;
 	case REF_TYPE_NORMAL:
-		strbuf_addf(sb, "%s/logs/%s", refs->gitcommondir, refname);
+		strbuf_addf(sb, "%s/logs/", refs->gitcommondir);
 		break;
 	default:
 		die("BUG: unknown ref type %d of ref %s",
 		    ref_type(refname), refname);
 	}
+	files_path_encode(refs, sb, refname);
 }
 
 static void files_ref_path(struct files_ref_store *refs,
@@ -136,15 +184,16 @@ static void files_ref_path(struct files_ref_store *refs,
 	switch (ref_type(refname)) {
 	case REF_TYPE_PER_WORKTREE:
 	case REF_TYPE_PSEUDOREF:
-		strbuf_addf(sb, "%s/%s", refs->gitdir, refname);
+		strbuf_addf(sb, "%s/", refs->gitdir);
 		break;
 	case REF_TYPE_NORMAL:
-		strbuf_addf(sb, "%s/%s", refs->gitcommondir, refname);
+		strbuf_addf(sb, "%s/", refs->gitcommondir);
 		break;
 	default:
 		die("BUG: unknown ref type %d of ref %s",
 		    ref_type(refname), refname);
 	}
+	files_path_encode(refs, sb, refname);
 }
 
 /*
@@ -174,7 +223,7 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 	}
 
 	strbuf_init(&refname, dirnamelen + 257);
-	strbuf_add(&refname, dirname, dirnamelen);
+	files_path_decode(refs, &refname, dirname, dirnamelen);
 
 	while ((de = readdir(d)) != NULL) {
 		struct object_id oid;
@@ -185,7 +234,8 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 			continue;
 		if (ends_with(de->d_name, ".lock"))
 			continue;
-		strbuf_addstr(&refname, de->d_name);
+		if (files_path_decode(refs, &refname, de->d_name, strlen(de->d_name)))
+			continue;
 		strbuf_addstr(&path, de->d_name);
 		if (stat(path.buf, &st) < 0) {
 			; /* silently ignore */





^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Fwd: Should "head" also work for "HEAD" on case-insensitive FS?
       [not found]           ` <CAMy9T_FmE=8xzjRJJRxLwQjoMStJx5sYO_xtODv2OEZm54DurA@mail.gmail.com>
@ 2017-07-27  0:23             ` Michael Haggerty
       [not found]             ` <xmqqa84c6v41.fsf@gitster.mtv.corp.google.com>
  1 sibling, 0 replies; 11+ messages in thread
From: Michael Haggerty @ 2017-07-27  0:23 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Ævar Arnfjörð Bjarmason, Konstantin Khomoutov,
	Jeff King

Dang, I just noticed that I hit "reply" rather than "reply-to-all" on
the below email (stupid GMail default). Junio, your response to this
email accordingly went only to me.

Michael

---------- Forwarded message ----------
From: Michael Haggerty <mhagger@alum.mit.edu>
Date: Mon, Jul 10, 2017 at 7:52 AM
Subject: Re: Should "head" also work for "HEAD" on case-insensitive FS?
To: Junio C Hamano <gitster@pobox.com>


On Fri, Jul 7, 2017 at 12:34 AM, Junio C Hamano <gitster@pobox.com> wrote:
> [...]
> The exact detail of the encoding used here is immaterial, but I just
> used "encode uppercase letters and % as % followed by two hex",
> which was simple enough.  Usual refs/heads/master and friends will
> not have to be touched when encoded this way.  Perhaps the decoding
> side should be tweaked so that uppercase letters it sees needs to be
> downcased to avoid "refs/heads/Foo" getting returned as "Foo" branch,
> as a "Foo" branch should have been encoded as "refs/heads/%46oo".
>
> Having said that, this patch alone does not quite work yet.
>
>  * In the repository discovery code, we have some logic that
>    hard-codes the path in the directory (which is a candidate for
>    being a repository) to check, like "refs/" and "HEAD".  In the
>    attached illustration patch, files_path_encode() special cases
>    "HEAD" so that it is not munged, which is a bit of ugly
>    workaround for this.
>
>  * I haven't figured out why, but what refs.c calls "pseudo refs"
>    bypasses the files backend layer for some but not all operations,
>    which causes t1405-main-ref-store to fail.  The test creates a
>    "pseudo ref" FOO and then tries to remove it.  Creation seems to
>    follow the files-backend.c and thusly goes through the escaping;
>    refs.::refs_delete_ref() however does not consult files-backend.c
>    and fails to find and delete .git/FOO, because the creation side
>    encoded it as ".git/%46%4F%4F".

I think the most natural thing would be to use different encoding
rules for pseudo-refs (references like "HEAD" and "FETCH_HEAD") and
for other references (those starting with "refs/").

Pseudo-refs (with the partial exception of "HEAD") are quite peculiar
beasts. They sometimes include other information besides the reference
value and IIRC the refs code doesn't have any idea how to write or
read those extra contents. I believe that "HEAD" is the only pseudo
ref for which reflogs are ever written. Pseudo-refs have to match
/[A-Z_]+/ (see https://github.com/git/git/blob/8b2efe2a0fd93b8721879f796d848a9ce785647f/refs.c#L169-L173),
so ISTM that there is no need to encode such references' filenames at
all. (It's possible that the pattern could be made even stricter, like
/[A-Z_]+HEAD/.) Moreover, IIRC, such references are never scanned for
(as in for-each-refs) but rather are always asked for by name. So
their names might never have to be *de*coded, either. On the other
hand, when trying to look them up, it would be a good idea to verify
that the requested name satisfies the above naming rule. Other than
that, I believe it would be preferable to leave pseudo-refs untouched
by your new encoding/decoding code.

Whereas other references are typically lower-case, so it makes sense
for lower-case letters to be the ones that are passed through
transparently in such references' filenames (as in your scheme).

But...since we are talking about introducing a new loose reference
filename encoding, I think it would be a good idea to address a couple
of related issues at the same time:

* Some filesystems natively use Unicode, and insist on a particular
Unicode normalization (NFC vs NFD), which might differ from the
"upstream" normalization. So such reference names get munged when
written as loose references. I'm not enough of an expert in Unicode to
know what the best solution is, except for the strong feeling that it
would require some cooperation from the rest of Git to ensure a good
user experience.

* Another bad effect of our current loose reference encoding is that
it prohibits references that D/F conflict with each other (like
"refs/heads/foo" and "refs/heads/foo/bar") because
"$GIT_DIR/refs/heads/foo" can't be a file and a directory at the same
time. Even if we don't want to support that, this problem also
prevents us from storing reflogs for deleted references, which is a
serious flaw. We could solve this problem by encoding "directory"
components of reference names differently than "leaf" components; for
example, the above references could be encoded as
"refs/heads.d/foo.ref" and "refs/heads.d/foo.d/bar.ref".

It'd be nice to solve all of these related problems at the same time,
because whatever encoding we choose now will have to be supported
forever.

Michael

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Fwd: Should "head" also work for "HEAD" on case-insensitive FS?
       [not found]             ` <xmqqa84c6v41.fsf@gitster.mtv.corp.google.com>
@ 2017-07-27  0:49               ` Junio C Hamano
  2017-07-27 14:35                 ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2017-07-27  0:49 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Michael Haggerty

Heh, then I'll forward my response and we are even ;-)


---------- Forwarded message ----------
From: Junio C Hamano <gitster@pobox.com>
Date: Mon, Jul 10, 2017 at 10:48 AM
Subject: Re: Should "head" also work for "HEAD" on case-insensitive FS?
To: Michael Haggerty <mhagger@alum.mit.edu>


Michael Haggerty <mhagger@alum.mit.edu> writes:

> I think the most natural thing would be to use different encoding
> rules for pseudo-refs (references like "HEAD" and "FETCH_HEAD") and
> for other references (those starting with "refs/").
>
> Pseudo-refs (with the partial exception of "HEAD") are quite peculiar
> beasts....

I agree with the reasoning, but what I am worried about is that
their handling in the existing refs.c code may be leaky and/or
inconsistent.

What I saw was that a test have ended up with .git/%46%4F%4F when it
was told to create a ref "FOO" (which indicates that "FOO" was
passed to the files backend), which later failed to read it back
because the pseudo_ref handling refs.c wanted to see ".git/FOO" on
the reading side.

Perhaps it is only a bug in t/t1405-main-ref-store.sh?

> But...since we are talking about introducing a new loose reference
> filename encoding, ...

Yes, but that is an encoding detail I do not have to get involved
and folks with platform needs can add more on top---we need to make
sure that the places that encode and decode are identified in the
code first, and the things like "FOO is encoded upon writing because
files-backend is asked to write it, but not decoded because refs.c
thinks it is pseudo-ref and does not give a say to files-backend"
shouldn't be happening before we can start working on the details of
the encoding.  Making a conscious decision that pseudo-refs are left
as-is is OK, but we need to see both reading and writing side
following the same codepath to make that decision, which does not
seem to be the case in the current code.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-27  0:49               ` Junio C Hamano
@ 2017-07-27 14:35                 ` Jeff King
  2017-07-27 15:26                   ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2017-07-27 14:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Michael Haggerty

On Wed, Jul 26, 2017 at 05:49:47PM -0700, Junio C Hamano wrote:

> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
> > I think the most natural thing would be to use different encoding
> > rules for pseudo-refs (references like "HEAD" and "FETCH_HEAD") and
> > for other references (those starting with "refs/").
> >
> > Pseudo-refs (with the partial exception of "HEAD") are quite peculiar
> > beasts....
> 
> I agree with the reasoning, but what I am worried about is that
> their handling in the existing refs.c code may be leaky and/or
> inconsistent.
> 
> What I saw was that a test have ended up with .git/%46%4F%4F when it
> was told to create a ref "FOO" (which indicates that "FOO" was
> passed to the files backend), which later failed to read it back
> because the pseudo_ref handling refs.c wanted to see ".git/FOO" on
> the reading side.
> 
> Perhaps it is only a bug in t/t1405-main-ref-store.sh?

An interesting related issue for pseudo-refs: if you encode HEAD as
.git/%48%45%41%44, how will we recognize that directory as a git
repository? Detecting (and doing a sanity check on) "HEAD" is one of the
key mechanisms for deciding whether we are in a git repository.

Obviously an older version of git that doesn't know about the new
encoding scheme wouldn't work on this repository anyway. But:

  1. It should say "this is a git repo, but not a vintage I understand".
     Not "this isn't a git repo, I'll keep looking".

  2. How does a git version of the correct vintage decide "this is a git
     repo, so I'll check its config for extensions.refBackend, and a-ha,
     they _do_ have a HEAD". There's a chicken-and-egg problem.

Obviously for (2) we could teach that mechanism to look for the encoded
HEAD file, too. But this is just one backend. What about a reftable or
other non-filesystem store that keeps "HEAD" inside a file?

I kind of wonder if more exotic ref storage backends should always just
place a dummy "HEAD" file that is enough to bootstrap the "this is a git
repo" process (for both new and old versions).

This is orthogonal to the rest of the pseudo-refs discussion, but just
something I thought of while reading the thread.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fwd: Should "head" also work for "HEAD" on case-insensitive FS?
  2017-07-27 14:35                 ` Jeff King
@ 2017-07-27 15:26                   ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2017-07-27 15:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Michael Haggerty

Jeff King <peff@peff.net> writes:

> On Wed, Jul 26, 2017 at 05:49:47PM -0700, Junio C Hamano wrote:
>
>> What I saw was that a test have ended up with .git/%46%4F%4F when it
>> was told to create a ref "FOO" (which indicates that "FOO" was
>> passed to the files backend), which later failed to read it back
>> because the pseudo_ref handling refs.c wanted to see ".git/FOO" on
>> the reading side.
>> 
>> Perhaps it is only a bug in t/t1405-main-ref-store.sh?
>
> An interesting related issue for pseudo-refs: if you encode HEAD as
> .git/%48%45%41%44, how will we recognize that directory as a git
> repository?

Yes, that is a valid point.  I may have forgot to explain why the
sample change in my message upthread special cases "HEAD" and leaves
it untouched, but it is done for this exact reason.

>   1. It should say "this is a git repo, but not a vintage I understand".
>      Not "this isn't a git repo, I'll keep looking".
>
>   2. How does a git version of the correct vintage decide "this is a git
>      repo, so I'll check its config for extensions.refBackend, and a-ha,
>      they _do_ have a HEAD". There's a chicken-and-egg problem.

Yes, exactly.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-07-27 15:26 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-03 22:00 Should "head" also work for "HEAD" on case-insensitive FS? Ævar Arnfjörð Bjarmason
2017-07-04  7:19 ` Konstantin Khomoutov
2017-07-04  8:24   ` Ævar Arnfjörð Bjarmason
2017-07-05  8:36     ` Jeff King
2017-07-05 17:07       ` Junio C Hamano
2017-07-06 22:34         ` Junio C Hamano
     [not found]           ` <CAMy9T_FmE=8xzjRJJRxLwQjoMStJx5sYO_xtODv2OEZm54DurA@mail.gmail.com>
2017-07-27  0:23             ` Fwd: " Michael Haggerty
     [not found]             ` <xmqqa84c6v41.fsf@gitster.mtv.corp.google.com>
2017-07-27  0:49               ` Junio C Hamano
2017-07-27 14:35                 ` Jeff King
2017-07-27 15:26                   ` Junio C Hamano
2017-07-06  3:35   ` Kenneth Hsu

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).