git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Sixt <j6t@kdbg.org>, Klaus Ethgen <Klaus@Ethgen.ch>,
	git@vger.kernel.org
Subject: [PATCH 1/2] alternates: accept double-quoted paths
Date: Mon, 12 Dec 2016 14:52:22 -0500	[thread overview]
Message-ID: <20161212195222.rxnabok6amklt2zf@sigill.intra.peff.net> (raw)
In-Reply-To: <20161212194929.bdcihf7orjabzb2h@sigill.intra.peff.net>

We read lists of alternates from objects/info/alternates
files (delimited by newline), as well as from the
GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable
(delimited by colon or semi-colon, depending on the
platform).

There's no mechanism for quoting the delimiters, so it's
impossible to specify an alternate path that contains a
colon in the environment, or one that contains a newline in
a file. We've lived with that restriction for ages because
both alternates and filenames with colons are relatively
rare, and it's only a problem when the two meet. But since
722ff7f87 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03), which builds on the
alternates system, every push causes the receiver to set
GIT_ALTERNATE_OBJECT_DIRECTORIES internally.

It would be convenient to have some way to quote the
delimiter so that we can represent arbitrary paths.

The simplest thing would be an escape character before a
quoted delimiter (e.g., "\:" as a literal colon). But that
creates a backwards compatibility problem: any path which
uses that escape character is now broken, and we've just
shifted the problem. We could choose an unlikely escape
character (e.g., something from the non-printable ASCII
range), but that's awkward to use.

Instead, let's treat names as unquoted unless they begin
with a double-quote, in which case they are interpreted via
our usual C-stylke quoting rules. This also breaks
backwards-compatibility, but in a smaller way: it only
matters if your file has a double-quote as the very _first_
character in the path (whereas an escape character is a
problem anywhere in the path).  It's also consistent with
many other parts of git, which accept either a bare pathname
or a double-quoted one, and the sender can choose to quote
or not as required.

Signed-off-by: Jeff King <peff@peff.net>
---
This also lets you specify paths that start with "#", or ones that
contain newlines in an alternates file, though I doubt anybody really
cares much in practice. I didn't even bother to add them to the test
suite, mostly because of the compatibility hassle (though I guess we
could just mark them with !MINGW and be OK).

 Documentation/git.txt    |  6 ++++++
 sha1_file.c              | 47 ++++++++++++++++++++++++++++++++++++-----------
 t/t5615-alternate-env.sh | 18 ++++++++++++++++++
 3 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index af191c51b..98033302b 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -871,6 +871,12 @@ Git so take care if using a foreign front-end.
 	specifies a ":" separated (on Windows ";" separated) list
 	of Git object directories which can be used to search for Git
 	objects. New objects will not be written to these directories.
++
+	Entries that begin with `"` (double-quote) will be interpreted
+	as C-style quoted paths, removing leading and trailing
+	double-quotes and respecting backslash escapes. E.g., the value
+	`"path-with-\"-and-:-in-it":vanilla-path` has two paths:
+	`path-with-"-and-:-in-it` and `vanilla-path`.
 
 `GIT_DIR`::
 	If the `GIT_DIR` environment variable is set then it
diff --git a/sha1_file.c b/sha1_file.c
index 9c86d1924..117307185 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -26,6 +26,7 @@
 #include "mru.h"
 #include "list.h"
 #include "mergesort.h"
+#include "quote.h"
 
 #ifndef O_NOATIME
 #if defined(__linux__) && (defined(__i386__) || defined(__PPC__))
@@ -329,13 +330,40 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base,
 	return 0;
 }
 
+static const char *parse_alt_odb_entry(const char *string,
+				       int sep,
+				       struct strbuf *out)
+{
+	const char *end;
+
+	strbuf_reset(out);
+
+	if (*string == '#') {
+		/* comment; consume up to next separator */
+		end = strchrnul(string, sep);
+	} else if (*string == '"' && !unquote_c_style(out, string, &end)) {
+		/*
+		 * quoted path; unquote_c_style has copied the
+		 * data for us and set "end". Broken quoting (e.g.,
+		 * an entry that doesn't end with a quote) falls
+		 * back to the unquoted case below.
+		 */
+	} else {
+		/* normal, unquoted path */
+		end = strchrnul(string, sep);
+		strbuf_add(out, string, end - string);
+	}
+
+	if (*end)
+		end++;
+	return end;
+}
+
 static void link_alt_odb_entries(const char *alt, int len, int sep,
 				 const char *relative_base, int depth)
 {
-	struct string_list entries = STRING_LIST_INIT_NODUP;
-	char *alt_copy;
-	int i;
 	struct strbuf objdirbuf = STRBUF_INIT;
+	struct strbuf entry = STRBUF_INIT;
 
 	if (depth > 5) {
 		error("%s: ignoring alternate object stores, nesting too deep.",
@@ -348,16 +376,13 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
 		die("unable to normalize object directory: %s",
 		    objdirbuf.buf);
 
-	alt_copy = xmemdupz(alt, len);
-	string_list_split_in_place(&entries, alt_copy, sep, -1);
-	for (i = 0; i < entries.nr; i++) {
-		const char *entry = entries.items[i].string;
-		if (entry[0] == '\0' || entry[0] == '#')
+	while (*alt) {
+		alt = parse_alt_odb_entry(alt, sep, &entry);
+		if (!entry.len)
 			continue;
-		link_alt_odb_entry(entry, relative_base, depth, objdirbuf.buf);
+		link_alt_odb_entry(entry.buf, relative_base, depth, objdirbuf.buf);
 	}
-	string_list_clear(&entries, 0);
-	free(alt_copy);
+	strbuf_release(&entry);
 	strbuf_release(&objdirbuf);
 }
 
diff --git a/t/t5615-alternate-env.sh b/t/t5615-alternate-env.sh
index eec4137ca..69fd8f891 100755
--- a/t/t5615-alternate-env.sh
+++ b/t/t5615-alternate-env.sh
@@ -68,4 +68,22 @@ test_expect_success 'access alternate via relative path (subdir)' '
 	EOF
 '
 
+# set variables outside test to avoid quote insanity; the \057 is '/',
+# which doesn't need quoting, but just confirms that de-quoting
+# is working.
+quoted='"one.git\057objects"'
+unquoted='two.git/objects'
+test_expect_success 'mix of quoted and unquoted alternates' '
+	check_obj "$quoted:$unquoted" <<-EOF
+	$one blob
+	$two blob
+'
+
+test_expect_success 'broken quoting falls back to interpreting raw' '
+	mv one.git \"one.git &&
+	check_obj \"one.git/objects <<-EOF
+	$one blob
+	EOF
+'
+
 test_done
-- 
2.11.0.203.g6657065


  reply	other threads:[~2016-12-12 19:52 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-09 14:02 [BUG] Colon in remote urls Klaus Ethgen
2016-12-09 15:22 ` Jeff King
2016-12-09 19:07   ` Junio C Hamano
2016-12-10  8:51     ` Jeff King
2016-12-12 19:49       ` [PATCH 0/2] handling alternates paths with colons Jeff King
2016-12-12 19:52         ` Jeff King [this message]
2016-12-13 11:30           ` [PATCH 1/2] alternates: accept double-quoted paths Duy Nguyen
2016-12-13 11:52             ` Jeff King
2016-12-13 18:08             ` Junio C Hamano
2016-12-17  7:56               ` Duy Nguyen
2016-12-12 19:53         ` [PATCH 2/2] tmp-objdir: quote paths we add to alternates Jeff King
2016-12-12 20:53           ` Johannes Sixt
2016-12-13 11:44             ` Jeff King
2016-12-13 18:10               ` Junio C Hamano
2016-12-13 18:15                 ` Jeff King
2016-12-13 20:10                   ` Junio C Hamano
2016-12-13 19:09           ` [PATCH 3/2] t5547-push-quarantine: run the path separator test on Windows, too Johannes Sixt
2016-12-13 19:12             ` Jeff King
2016-12-13 19:23             ` Junio C Hamano
2016-12-21 21:33             ` [PATCH 4/2] t5615-alternate-env: double-quotes in file names do not work on Windows Johannes Sixt
2016-12-21 22:42               ` Jeff King
2016-12-22  6:13                 ` Johannes Sixt
2016-12-22 19:06                   ` Johannes Sixt
2016-12-22 21:38                     ` Johannes Schindelin
2016-12-22 22:19                     ` Jeff King
2016-12-12 22:37         ` [PATCH 0/2] handling alternates paths with colons Junio C Hamano
2016-12-13 11:50           ` Jeff King
2016-12-13 18:10             ` Junio C Hamano
2016-12-13 18:17               ` Jeff King
2016-12-13 18:42                 ` Junio C Hamano
2016-12-13 18:47                   ` Jeff King
2016-12-10  9:29     ` [BUG] Colon in remote urls Klaus Ethgen
2016-12-10 10:24       ` Jeff King
2016-12-10 10:46         ` Klaus Ethgen
2016-12-09 21:32   ` Johannes Sixt
2016-12-10  8:26     ` Jeff King
2016-12-10  9:41       ` Klaus Ethgen
2016-12-10 10:26         ` Jeff King
2016-12-10 10:51           ` Klaus Ethgen
2016-12-10  9:32     ` Klaus Ethgen
2016-12-10 18:18       ` Johannes Schindelin
2016-12-11 11:02         ` Klaus Ethgen
2016-12-11 14:51           ` Philip Oakley
2016-12-12 11:03           ` Johannes Schindelin
2016-12-12 11:50             ` Klaus Ethgen
2016-12-12 14:05               ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161212195222.rxnabok6amklt2zf@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=Klaus@Ethgen.ch \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).