git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Konstantin Khomoutov" <kostix+git@007spb.ru>,
	"Michael Haggerty" <mhagger@alum.mit.edu>,
	"Jeff King" <peff@peff.net>
Subject: Re: Should "head" also work for "HEAD" on case-insensitive FS?
Date: Thu, 06 Jul 2017 15:34:34 -0700	[thread overview]
Message-ID: <xmqqo9sxdwjp.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <xmqqshiaizhz.fsf@gitster.mtv.corp.google.com> (Junio C. Hamano's message of "Wed, 05 Jul 2017 10:07:20 -0700")

Junio C Hamano <gitster@pobox.com> writes:

> Once Michael's packed-refs backend stabilizes, we may have a nice
> calm period in the refs subsystem and I expect that this will become
> a good medium-sized project for a contributor who does not have to 
> be so experienced (but not a complete newbie).
>
> It needs to:
>
>  - add icase-files-backend, preferrably sharing as much code as the
>    existing files-backend, in refs/.
>
>  - design a mechanism to configure which refs backend to use at
>    runtime; as this has to be done fairly early in the control flow,
>    this will likely to use early configuration mechanism and will
>    probably need to be done in the set-up code, but doing it lazy
>    may even be nicer, as not all subcommands need access to refs.
>
> Thanks for a pointer to the archive.

So here is an early WIP/illustration I did to see how involved such
a change would be, which should apply cleanly on top of 'pu'.

I was pleasantly surprised how cleanly refs/files-backend.c
separates the notion of "path" and "refname".  Only two functions,
files_reflog_path() and files_ref_path(), are responsible for taking
the refname and turning it to the pathname of an filesystem entity.
One one function, loose_fill_ref_dir(), is responsible for running
readdir() to find pathname, and turning the result into a refname.
So in theory, these three are the only things that need to know
about the "encoding".

The exact detail of the encoding used here is immaterial, but I just
used "encode uppercase letters and % as % followed by two hex",
which was simple enough.  Usual refs/heads/master and friends will
not have to be touched when encoded this way.  Perhaps the decoding
side should be tweaked so that uppercase letters it sees needs to be
downcased to avoid "refs/heads/Foo" getting returned as "Foo" branch,
as a "Foo" branch should have been encoded as "refs/heads/%46oo".

Having said that, this patch alone does not quite work yet.

 * In the repository discovery code, we have some logic that
   hard-codes the path in the directory (which is a candidate for
   being a repository) to check, like "refs/" and "HEAD".  In the
   attached illustration patch, files_path_encode() special cases
   "HEAD" so that it is not munged, which is a bit of ugly
   workaround for this.

 * I haven't figured out why, but what refs.c calls "pseudo refs"
   bypasses the files backend layer for some but not all operations,
   which causes t1405-main-ref-store to fail.  The test creates a
   "pseudo ref" FOO and then tries to remove it.  Creation seems to
   follow the files-backend.c and thusly goes through the escaping;
   refs.::refs_delete_ref() however does not consult files-backend.c
   and fails to find and delete .git/FOO, because the creation side
   encoded it as ".git/%46%4F%4F".

Michael is CC'ed as I thought it would be simpler to just ask about
the latter bullet point than digging it further myself ;-)

Thanks.

 refs/files-backend.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 923e481e06..5bde77cbf8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -23,6 +23,7 @@ struct ref_lock {
 struct files_ref_store {
 	struct ref_store base;
 	unsigned int store_flags;
+	int encode_names;
 
 	char *gitdir;
 	char *gitcommondir;
@@ -54,6 +55,9 @@ static struct ref_store *files_ref_store_create(const char *gitdir,
 	base_ref_store_init(ref_store, &refs_be_files);
 	refs->store_flags = flags;
 
+	/* git_config_get_bool("core.escapeLooseRefNames", &refs->encode_names); */
+	refs->encode_names = 1;
+
 	refs->gitdir = xstrdup(gitdir);
 	get_common_dir_noenv(&sb, gitdir);
 	refs->gitcommondir = strbuf_detach(&sb, NULL);
@@ -102,6 +106,49 @@ static struct files_ref_store *files_downcast(struct ref_store *ref_store,
 	return refs;
 }
 
+static void files_path_encode(struct files_ref_store *refs,
+			      struct strbuf *sb, const char *refname)
+{
+	if (!refs->encode_names || !strcmp(refname, "HEAD")) {
+		strbuf_addstr(sb, refname);
+	} else {
+		const char *cp;
+
+		for (cp = refname; *cp; cp++) {
+			int ch = *cp;
+			if (('A' <= ch && ch <= 'Z') || (ch == '%'))
+				strbuf_addf(sb, "%%%02x", ch);
+			else
+				strbuf_addch(sb, ch);
+		}
+	}
+}
+
+static int files_path_decode(struct files_ref_store *refs,
+			     struct strbuf *sb, const char *name, int namelen)
+{
+	if (!refs->encode_names) {
+		strbuf_add(sb, name, namelen);
+	} else {
+		size_t origlen = sb->len;
+
+		while (namelen--) {
+			int ch = *name++;
+
+			if (ch == '%') {
+				if (namelen < 2 || (ch = hex2chr(name)) < 0) {
+					strbuf_setlen(sb, origlen);
+					return -1;
+				}
+				namelen -= 2;
+				name += 2;
+			}
+			strbuf_addch(sb, ch);
+		}
+	}
+	return 0;
+}
+
 static void files_reflog_path(struct files_ref_store *refs,
 			      struct strbuf *sb,
 			      const char *refname)
@@ -118,15 +165,16 @@ static void files_reflog_path(struct files_ref_store *refs,
 	switch (ref_type(refname)) {
 	case REF_TYPE_PER_WORKTREE:
 	case REF_TYPE_PSEUDOREF:
-		strbuf_addf(sb, "%s/logs/%s", refs->gitdir, refname);
+		strbuf_addf(sb, "%s/logs/", refs->gitdir);
 		break;
 	case REF_TYPE_NORMAL:
-		strbuf_addf(sb, "%s/logs/%s", refs->gitcommondir, refname);
+		strbuf_addf(sb, "%s/logs/", refs->gitcommondir);
 		break;
 	default:
 		die("BUG: unknown ref type %d of ref %s",
 		    ref_type(refname), refname);
 	}
+	files_path_encode(refs, sb, refname);
 }
 
 static void files_ref_path(struct files_ref_store *refs,
@@ -136,15 +184,16 @@ static void files_ref_path(struct files_ref_store *refs,
 	switch (ref_type(refname)) {
 	case REF_TYPE_PER_WORKTREE:
 	case REF_TYPE_PSEUDOREF:
-		strbuf_addf(sb, "%s/%s", refs->gitdir, refname);
+		strbuf_addf(sb, "%s/", refs->gitdir);
 		break;
 	case REF_TYPE_NORMAL:
-		strbuf_addf(sb, "%s/%s", refs->gitcommondir, refname);
+		strbuf_addf(sb, "%s/", refs->gitcommondir);
 		break;
 	default:
 		die("BUG: unknown ref type %d of ref %s",
 		    ref_type(refname), refname);
 	}
+	files_path_encode(refs, sb, refname);
 }
 
 /*
@@ -174,7 +223,7 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 	}
 
 	strbuf_init(&refname, dirnamelen + 257);
-	strbuf_add(&refname, dirname, dirnamelen);
+	files_path_decode(refs, &refname, dirname, dirnamelen);
 
 	while ((de = readdir(d)) != NULL) {
 		struct object_id oid;
@@ -185,7 +234,8 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 			continue;
 		if (ends_with(de->d_name, ".lock"))
 			continue;
-		strbuf_addstr(&refname, de->d_name);
+		if (files_path_decode(refs, &refname, de->d_name, strlen(de->d_name)))
+			continue;
 		strbuf_addstr(&path, de->d_name);
 		if (stat(path.buf, &st) < 0) {
 			; /* silently ignore */





  reply	other threads:[~2017-07-06 22:34 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-03 22:00 Should "head" also work for "HEAD" on case-insensitive FS? Ævar Arnfjörð Bjarmason
2017-07-04  7:19 ` Konstantin Khomoutov
2017-07-04  8:24   ` Ævar Arnfjörð Bjarmason
2017-07-05  8:36     ` Jeff King
2017-07-05 17:07       ` Junio C Hamano
2017-07-06 22:34         ` Junio C Hamano [this message]
     [not found]           ` <CAMy9T_FmE=8xzjRJJRxLwQjoMStJx5sYO_xtODv2OEZm54DurA@mail.gmail.com>
2017-07-27  0:23             ` Fwd: " Michael Haggerty
     [not found]             ` <xmqqa84c6v41.fsf@gitster.mtv.corp.google.com>
2017-07-27  0:49               ` Junio C Hamano
2017-07-27 14:35                 ` Jeff King
2017-07-27 15:26                   ` Junio C Hamano
2017-07-06  3:35   ` Kenneth Hsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqo9sxdwjp.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=kostix+git@007spb.ru \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).