git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 0/2] rev-parse options for absolute or relative paths
@ 2020-10-09 19:15 brian m. carlson
  2020-10-09 19:15 ` [PATCH v2 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: brian m. carlson @ 2020-10-09 19:15 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, SZEDER Gábor

There are a bunch of different situations in which one would like to
have an absolute and canonical or a relative path from Git.  In many of
these cases, these values are already available from git rev-parse, but
some values only come in one form or another.

Many operating systems, such as macOS, lack a built-in realpath command
that can canonicalize paths properly, and additionally some programming
languages, like Go, currently do as well.  It's therefore helpful for us
to provide a generic way to request that a path is fully canonicalized
before using it.  Since users may wish for a relative path, we can
provide one of those as well.

Changes from v1:

* Add a function to handle missing trailing components when
  canonicalizing paths and use it.
* Improve commit messages.
* Fix broken && chain.
* Fix situation where relative paths are not relative.

brian m. carlson (2):
  abspath: add a function to resolve paths with missing components
  rev-parse: add option for absolute or relative path formatting

 Documentation/git-rev-parse.txt |  71 ++++++++++++---------
 abspath.c                       |  50 +++++++++++++--
 builtin/rev-parse.c             | 105 ++++++++++++++++++++++++++++----
 cache.h                         |   1 +
 t/t1500-rev-parse.sh            |  57 ++++++++++++++++-
 5 files changed, 237 insertions(+), 47 deletions(-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/2] abspath: add a function to resolve paths with missing components
  2020-10-09 19:15 [PATCH v2 0/2] rev-parse options for absolute or relative paths brian m. carlson
@ 2020-10-09 19:15 ` brian m. carlson
  2020-10-09 21:10   ` Junio C Hamano
  2020-11-09 13:55   ` Johannes Schindelin
  2020-10-09 19:15 ` [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 13+ messages in thread
From: brian m. carlson @ 2020-10-09 19:15 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, SZEDER Gábor

We'd like to canonicalize paths such that we can preserve any number of
trailing components that may be missing.  Let's add a function to do
that that calls strbuf_realpath to find the canonical path for the
portion we do have and then append the missing part.  We adjust
strip_last_component to return us the component it has stripped and use
that to help us accumulate the missing part.

Note that it is intentional that we invoke strbuf_realpath here,
repeatedly if necessary, because on Windows that function is replaced
with a version that uses the proper system semantics for
canonicalization.  Trying to adjust strbuf_realpath to perform this kind
of canonicalization with an additional option would fail to work
properly on Windows.  The present approach is equivalent to
strbuf_realpath for cases where the path exists, and the only other
cases where we will use this function the additional overhead of
multiple invocations is not significant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 abspath.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
 cache.h   |  1 +
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/abspath.c b/abspath.c
index 6f15a418bb..092bb33b64 100644
--- a/abspath.c
+++ b/abspath.c
@@ -11,8 +11,12 @@ int is_directory(const char *path)
 	return (!stat(path, &st) && S_ISDIR(st.st_mode));
 }
 
-/* removes the last path component from 'path' except if 'path' is root */
-static void strip_last_component(struct strbuf *path)
+/*
+ * Removes the last path component from 'path' except if 'path' is root.
+ *
+ * If last is not NULL, the last path component is copied to last.
+ */
+static void strip_last_component(struct strbuf *path, struct strbuf *last)
 {
 	size_t offset = offset_1st_component(path->buf);
 	size_t len = path->len;
@@ -20,6 +24,10 @@ static void strip_last_component(struct strbuf *path)
 	/* Find start of the last component */
 	while (offset < len && !is_dir_sep(path->buf[len - 1]))
 		len--;
+
+	if (last)
+		strbuf_addstr(last, path->buf + len);
+
 	/* Skip sequences of multiple path-separators */
 	while (offset < len && is_dir_sep(path->buf[len - 1]))
 		len--;
@@ -118,7 +126,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
 			continue; /* '.' component */
 		} else if (next.len == 2 && !strcmp(next.buf, "..")) {
 			/* '..' component; strip the last path component */
-			strip_last_component(resolved);
+			strip_last_component(resolved, NULL);
 			continue;
 		}
 
@@ -169,7 +177,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
 				 * strip off the last component since it will
 				 * be replaced with the contents of the symlink
 				 */
-				strip_last_component(resolved);
+				strip_last_component(resolved, NULL);
 			}
 
 			/*
@@ -202,6 +210,40 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
 	return retval;
 }
 
+/*
+ * Like strbuf_realpath, but trailing components which do not exist are copied
+ * through.
+ */
+char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
+{
+	struct strbuf remaining = STRBUF_INIT;
+	struct strbuf trailing = STRBUF_INIT;
+	struct strbuf component = STRBUF_INIT;
+
+	strbuf_addstr(&remaining, path);
+
+	while (remaining.len) {
+		if (strbuf_realpath(resolved, remaining.buf, 0)) {
+			strbuf_addbuf(resolved, &trailing);
+
+			strbuf_release(&component);
+			strbuf_release(&remaining);
+			strbuf_release(&trailing);
+
+			return resolved->buf;
+		}
+		strip_last_component(&remaining, &component);
+		strbuf_insertstr(&trailing, 0, "/");
+		strbuf_insertstr(&trailing, 1, component.buf);
+		strbuf_reset(&component);
+	}
+
+	strbuf_release(&component);
+	strbuf_release(&remaining);
+	strbuf_release(&trailing);
+	return NULL;
+}
+
 char *real_pathdup(const char *path, int die_on_error)
 {
 	struct strbuf realpath = STRBUF_INIT;
diff --git a/cache.h b/cache.h
index c0072d43b1..e1e17e108e 100644
--- a/cache.h
+++ b/cache.h
@@ -1320,6 +1320,7 @@ static inline int is_absolute_path(const char *path)
 int is_directory(const char *);
 char *strbuf_realpath(struct strbuf *resolved, const char *path,
 		      int die_on_error);
+char *strbuf_realpath_missing(struct strbuf *resolved, const char *path);
 char *real_pathdup(const char *path, int die_on_error);
 const char *absolute_path(const char *path);
 char *absolute_pathdup(const char *path);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting
  2020-10-09 19:15 [PATCH v2 0/2] rev-parse options for absolute or relative paths brian m. carlson
  2020-10-09 19:15 ` [PATCH v2 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
@ 2020-10-09 19:15 ` brian m. carlson
  2020-11-09 14:46   ` Johannes Schindelin
  2020-11-04 23:01 ` [PATCH v2 0/2] rev-parse options for absolute or relative paths Emily Shaffer
  2020-11-09 13:33 ` Johannes Schindelin
  3 siblings, 1 reply; 13+ messages in thread
From: brian m. carlson @ 2020-10-09 19:15 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, SZEDER Gábor

git rev-parse has several options which print various paths.  Some of
these paths are printed relative to the current working directory, and
some are absolute.

Normally, this is not a problem, but there are times when one wants
paths entirely in one format or another.  This can be done trivially if
the paths are canonical, but canonicalizing paths is not possible on
some shell scripting environments which lack realpath(1) and also in Go,
which lacks functions that properly canonicalize paths on Windows.

To help out the scripter, let's provide an option which turns most of
the paths printed by git rev-parse to be either relative to the current
working directory or absolute and canonical.  Document which options are
affected and which are not so that users are not confused.

This approach is cleaner and tidier than providing duplicates of
existing options which are either relative or absolute.

Note that if the user needs both forms, it is possible to pass an
additional option in the middle of the command line which changes the
behavior of subsequent operations.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 Documentation/git-rev-parse.txt |  71 ++++++++++++---------
 builtin/rev-parse.c             | 105 ++++++++++++++++++++++++++++----
 t/t1500-rev-parse.sh            |  57 ++++++++++++++++-
 3 files changed, 190 insertions(+), 43 deletions(-)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 19b12b6d43..6b95292559 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -208,6 +208,15 @@ Options for Files
 	Only the names of the variables are listed, not their value,
 	even if they are set.
 
+--path-format=(absolute|relative)::
+	Controls the behavior of certain other options following it on the command
+	line. If specified as absolute, the paths printed by those options will be
+	absolute and canonical. If specified as relative, the paths will be relative
+	to the current working directory if that is possible.  The default is option
+	specific.
+
+The following options are modified by `--path-format`:
+
 --git-dir::
 	Show `$GIT_DIR` if defined. Otherwise show the path to
 	the .git directory. The path shown, when relative, is
@@ -217,13 +226,42 @@ If `$GIT_DIR` is not defined and the current directory
 is not detected to lie in a Git repository or work tree
 print a message to stderr and exit with nonzero status.
 
+--git-common-dir::
+	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
+
+--resolve-git-dir <path>::
+	Check if <path> is a valid repository or a gitfile that
+	points at a valid repository, and print the location of the
+	repository.  If <path> is a gitfile then the resolved path
+	to the real repository is printed.
+
+--git-path <path>::
+	Resolve "$GIT_DIR/<path>" and takes other path relocation
+	variables such as $GIT_OBJECT_DIRECTORY,
+	$GIT_INDEX_FILE... into account. For example, if
+	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
+	--git-path objects/abc" returns /foo/bar/abc.
+
+--show-toplevel::
+	Show the (by default, absolute) path of the top-level directory
+	of the working tree. If there is no working tree, report an error.
+
+--show-superproject-working-tree::
+	Show the absolute path of the root of the superproject's
+	working tree (if exists) that uses the current repository as
+	its submodule.  Outputs nothing if the current repository is
+	not used as a submodule by any project.
+
+--shared-index-path::
+	Show the path to the shared index file in split index mode, or
+	empty if not in split-index mode.
+
+The following options are unaffected by `--path-format`:
+
 --absolute-git-dir::
 	Like `--git-dir`, but its output is always the canonicalized
 	absolute path.
 
---git-common-dir::
-	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
-
 --is-inside-git-dir::
 	When the current working directory is below the repository
 	directory print "true", otherwise "false".
@@ -238,19 +276,6 @@ print a message to stderr and exit with nonzero status.
 --is-shallow-repository::
 	When the repository is shallow print "true", otherwise "false".
 
---resolve-git-dir <path>::
-	Check if <path> is a valid repository or a gitfile that
-	points at a valid repository, and print the location of the
-	repository.  If <path> is a gitfile then the resolved path
-	to the real repository is printed.
-
---git-path <path>::
-	Resolve "$GIT_DIR/<path>" and takes other path relocation
-	variables such as $GIT_OBJECT_DIRECTORY,
-	$GIT_INDEX_FILE... into account. For example, if
-	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
-	--git-path objects/abc" returns /foo/bar/abc.
-
 --show-cdup::
 	When the command is invoked from a subdirectory, show the
 	path of the top-level directory relative to the current
@@ -261,20 +286,6 @@ print a message to stderr and exit with nonzero status.
 	path of the current directory relative to the top-level
 	directory.
 
---show-toplevel::
-	Show the absolute path of the top-level directory of the working
-	tree. If there is no working tree, report an error.
-
---show-superproject-working-tree::
-	Show the absolute path of the root of the superproject's
-	working tree (if exists) that uses the current repository as
-	its submodule.  Outputs nothing if the current repository is
-	not used as a submodule by any project.
-
---shared-index-path::
-	Show the path to the shared index file in split index mode, or
-	empty if not in split-index mode.
-
 --show-object-format[=(storage|input|output)]::
 	Show the object format (hash algorithm) used for the repository
 	for storage inside the `.git` directory, input, or output. For
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index ed200c8af1..ec62b4cd16 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -583,6 +583,76 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
 	clear_ref_exclusion(&ref_excludes);
 }
 
+enum format_type {
+	/* We would like a relative path. */
+	FORMAT_RELATIVE,
+	/* We would like a canonical absolute path. */
+	FORMAT_CANONICAL,
+	/* We would like the default behavior. */
+	FORMAT_DEFAULT,
+};
+
+enum default_type {
+	/* Our default is a relative path. */
+	DEFAULT_RELATIVE,
+	/* Our default is a relative path if there's a shared root. */
+	DEFAULT_RELATIVE_IF_SHARED,
+	/* Our default is a canonical absolute path. */
+	DEFAULT_CANONICAL,
+	/* Our default is not to modify the item. */
+	DEFAULT_UNMODIFIED,
+};
+
+static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
+{
+	char *cwd = NULL;
+	/*
+	 * We don't ever produce a relative path if prefix is NULL, so set the
+	 * prefix to the current directory so that we can produce a relative
+	 * path whenever possible.  If we're using RELATIVE_IF_SHARED mode, then
+	 * we want an absolute path unless the two share a common prefix, so don't
+	 * set it in that case, since doing so causes a relative path to always
+	 * be produced if possible.
+	 */
+	if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
+		prefix = cwd = xgetcwd();
+	if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
+		puts(path);
+	} else if (format == FORMAT_RELATIVE ||
+		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE)) {
+		/*
+		 * In order for relative_path to work as expected, we need to
+		 * make sure that both paths are absolute paths.  If we don't,
+		 * we can end up with an unexpected absolute path that the user
+		 * didn't want.
+		 */
+		struct strbuf buf = STRBUF_INIT, realbuf = STRBUF_INIT, prefixbuf = STRBUF_INIT;
+		if (!is_absolute_path(path)) {
+			if (!strbuf_realpath_missing(&realbuf, path))
+				die(_("Unable to resolve path '%s'"), path);
+			path = realbuf.buf;
+		}
+		if (!is_absolute_path(prefix)) {
+			if (!strbuf_realpath_missing(&prefixbuf, prefix))
+				die(_("Unable to resolve path '%s'"), path);
+			prefix = prefixbuf.buf;
+		}
+		puts(relative_path(path, prefix, &buf));
+		strbuf_release(&buf);
+	} else if (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED) {
+		struct strbuf buf = STRBUF_INIT;
+		puts(relative_path(path, prefix, &buf));
+		strbuf_release(&buf);
+	} else {
+		struct strbuf buf = STRBUF_INIT;
+		if (!strbuf_realpath_missing(&buf, path))
+			die(_("Unable to resolve path '%s'"), path);
+		puts(buf.buf);
+		strbuf_release(&buf);
+	}
+	free(cwd);
+}
+
 int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 {
 	int i, as_is = 0, verify = 0, quiet = 0, revs_count = 0, type = 0;
@@ -595,6 +665,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 	struct object_context unused;
 	struct strbuf buf = STRBUF_INIT;
 	const int hexsz = the_hash_algo->hexsz;
+	enum format_type format = FORMAT_DEFAULT;
 
 	if (argc > 1 && !strcmp("--parseopt", argv[1]))
 		return cmd_parseopt(argc - 1, argv + 1, prefix);
@@ -650,8 +721,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!argv[i + 1])
 				die("--git-path requires an argument");
 			strbuf_reset(&buf);
-			puts(relative_path(git_path("%s", argv[i + 1]),
-					   prefix, &buf));
+			print_path(git_path("%s", argv[i + 1]), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
 			i++;
 			continue;
 		}
@@ -683,6 +753,16 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 					show_file(arg, 0);
 				continue;
 			}
+			if (opt_with_value(arg, "--path-format", &arg)) {
+				if (!strcmp(arg, "absolute")) {
+					format = FORMAT_CANONICAL;
+				} else if (!strcmp(arg, "relative")) {
+					format = FORMAT_RELATIVE;
+				} else {
+					die("unknown argument to --path-format: %s", arg);
+				}
+				continue;
+			}
 			if (!strcmp(arg, "--default")) {
 				def = argv[++i];
 				if (!def)
@@ -803,7 +883,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!strcmp(arg, "--show-toplevel")) {
 				const char *work_tree = get_git_work_tree();
 				if (work_tree)
-					puts(work_tree);
+					print_path(work_tree, prefix, format, DEFAULT_CANONICAL);
 				else
 					die("this operation must be run in a work tree");
 				continue;
@@ -811,7 +891,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!strcmp(arg, "--show-superproject-working-tree")) {
 				struct strbuf superproject = STRBUF_INIT;
 				if (get_superproject_working_tree(&superproject))
-					puts(superproject.buf);
+					print_path(superproject.buf, prefix, format, DEFAULT_CANONICAL);
 				strbuf_release(&superproject);
 				continue;
 			}
@@ -846,16 +926,18 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
 				char *cwd;
 				int len;
+				enum format_type wanted = format;
 				if (arg[2] == 'g') {	/* --git-dir */
 					if (gitdir) {
-						puts(gitdir);
+						print_path(gitdir, prefix, format, DEFAULT_UNMODIFIED);
 						continue;
 					}
 					if (!prefix) {
-						puts(".git");
+						print_path(".git", prefix, format, DEFAULT_UNMODIFIED);
 						continue;
 					}
 				} else {		/* --absolute-git-dir */
+					wanted = FORMAT_CANONICAL;
 					if (!gitdir && !prefix)
 						gitdir = ".git";
 					if (gitdir) {
@@ -868,14 +950,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				}
 				cwd = xgetcwd();
 				len = strlen(cwd);
-				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
+				strbuf_reset(&buf);
+				strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");
 				free(cwd);
+				print_path(buf.buf, prefix, wanted, DEFAULT_CANONICAL);
 				continue;
 			}
 			if (!strcmp(arg, "--git-common-dir")) {
-				strbuf_reset(&buf);
-				puts(relative_path(get_git_common_dir(),
-						   prefix, &buf));
+				print_path(get_git_common_dir(), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
 				continue;
 			}
 			if (!strcmp(arg, "--is-inside-git-dir")) {
@@ -905,8 +987,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				if (the_index.split_index) {
 					const struct object_id *oid = &the_index.split_index->base_oid;
 					const char *path = git_path("sharedindex.%s", oid_to_hex(oid));
-					strbuf_reset(&buf);
-					puts(relative_path(path, prefix, &buf));
+					print_path(path, prefix, format, DEFAULT_RELATIVE);
 				}
 				continue;
 			}
diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
index 408b97d5af..51d7d40ec1 100755
--- a/t/t1500-rev-parse.sh
+++ b/t/t1500-rev-parse.sh
@@ -3,6 +3,16 @@
 test_description='test git rev-parse'
 . ./test-lib.sh
 
+test_one () {
+	dir="$1" &&
+	expect="$2" &&
+	shift &&
+	shift &&
+	echo "$expect" >expect &&
+	git -C "$dir" rev-parse "$@" >actual &&
+	test_cmp expect actual
+}
+
 # usage: [options] label is-bare is-inside-git is-inside-work prefix git-dir absolute-git-dir
 test_rev_parse () {
 	d=
@@ -60,7 +70,13 @@ ROOT=$(pwd)
 
 test_expect_success 'setup' '
 	mkdir -p sub/dir work &&
-	cp -R .git repo.git
+	cp -R .git repo.git &&
+	git checkout -B main &&
+	test_commit abc &&
+	git checkout -b side &&
+	test_commit def &&
+	git checkout main &&
+	git worktree add worktree side
 '
 
 test_rev_parse toplevel false false true '' .git "$ROOT/.git"
@@ -88,6 +104,45 @@ test_rev_parse -C work -g ../repo.git -b t 'GIT_DIR=../repo.git, core.bare = tru
 
 test_rev_parse -C work -g ../repo.git -b u 'GIT_DIR=../repo.git, core.bare undefined' false false true ''
 
+test_expect_success 'rev-parse --path-format=absolute' '
+	test_one "." "$ROOT/.git" --path-format=absolute --git-dir &&
+	test_one "." "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "sub/dir" "$ROOT/.git" --path-format=absolute --git-dir &&
+	test_one "sub/dir" "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "worktree" "$ROOT/.git/worktrees/worktree" --path-format=absolute --git-dir &&
+	test_one "worktree" "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "." "$ROOT" --path-format=absolute --show-toplevel &&
+	test_one "." "$ROOT/.git/objects" --path-format=absolute --git-path objects &&
+	test_one "." "$ROOT/.git/objects/foo/bar/baz" --path-format=absolute --git-path objects/foo/bar/baz
+'
+
+test_expect_success 'rev-parse --path-format=relative' '
+	test_one "." ".git" --path-format=relative --git-dir &&
+	test_one "." ".git" --path-format=relative --git-common-dir &&
+	test_one "sub/dir" "../../.git" --path-format=relative --git-dir &&
+	test_one "sub/dir" "../../.git" --path-format=relative --git-common-dir &&
+	test_one "worktree" "../.git/worktrees/worktree" --path-format=relative --git-dir &&
+	test_one "worktree" "../.git" --path-format=relative --git-common-dir &&
+	test_one "." "./" --path-format=relative --show-toplevel &&
+	test_one "." ".git/objects" --path-format=relative --git-path objects &&
+	test_one "." ".git/objects/foo/bar/baz" --path-format=relative --git-path objects/foo/bar/baz
+'
+
+test_expect_success '--path-format=relative does not affect --absolute-git-dir' '
+	git rev-parse --path-format=relative --absolute-git-dir >actual &&
+	echo "$ROOT/.git" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success '--path-format can change in the middle of the command line' '
+	git rev-parse --path-format=absolute --git-dir --path-format=relative --git-path objects/foo/bar >actual &&
+	cat >expect <<-EOF &&
+	$ROOT/.git
+	.git/objects/foo/bar
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'git-common-dir from worktree root' '
 	echo .git >expect &&
 	git rev-parse --git-common-dir >actual &&

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] abspath: add a function to resolve paths with missing components
  2020-10-09 19:15 ` [PATCH v2 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
@ 2020-10-09 21:10   ` Junio C Hamano
  2020-10-10  1:10     ` brian m. carlson
  2020-11-09 13:55   ` Johannes Schindelin
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2020-10-09 21:10 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Johannes Schindelin, SZEDER Gábor

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> We'd like to canonicalize paths such that we can preserve any number of
> trailing components that may be missing.

Sorry, but at least to me, the above gives no clue what kind of
operation is desired to be done on paths.  How would one preserve
what does not exist (i.e. are missing)?

Do you mean some leading components in a path point at existing
directories and after some point a component names a directory
that does not exist, so everything after that does not yet exist
until you "mkdir -p" them?

I guess my confusion comes primarily from the fuzziness of the verb
"canonicalize" in the sentence.  We want to handle a/b/../c/d and
there are various combinations of missng and existing directories,
e.g. a/b may not exist or a/b may but a/c may not, etc.  Is that
what is going on?  Makes me wonder if it makes sense to canonicalize
a/b/../c/d into a/c/d when a/b does not exist in the first place,
though.

> Let's add a function to do
> that that calls strbuf_realpath to find the canonical path for the
> portion we do have and then append the missing part.  We adjust
> strip_last_component to return us the component it has stripped and use
> that to help us accumulate the missing part.

OK, so if we have a/b/c/d and know a/b/c/d does not exist on the
filesystem, we start by splitting it to a/b/c and d, see if a/b/c
exists, and if not, do the same recursively to a/b/c to split it
into a/b and c, and prefix the latter to 'd' that we split earlier
(i.e. now we have a/b and c/d), until we have an existing directory
on the first half?

> Note that it is intentional that we invoke strbuf_realpath here,
> repeatedly if necessary, because on Windows that function is replaced
> with a version that uses the proper system semantics for
> canonicalization.  Trying to adjust strbuf_realpath to perform this kind
> of canonicalization with an additional option would fail to work
> properly on Windows.  The present approach is equivalent to
> strbuf_realpath for cases where the path exists, and the only other
> cases where we will use this function the additional overhead of
> multiple invocations is not significant.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  abspath.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
>  cache.h   |  1 +
>  2 files changed, 47 insertions(+), 4 deletions(-)
>
> diff --git a/abspath.c b/abspath.c
> index 6f15a418bb..092bb33b64 100644
> --- a/abspath.c
> +++ b/abspath.c
> @@ -11,8 +11,12 @@ int is_directory(const char *path)
>  	return (!stat(path, &st) && S_ISDIR(st.st_mode));
>  }
>  
> -/* removes the last path component from 'path' except if 'path' is root */
> -static void strip_last_component(struct strbuf *path)
> +/*
> + * Removes the last path component from 'path' except if 'path' is root.
> + *
> + * If last is not NULL, the last path component is copied to last.
> + */
> +static void strip_last_component(struct strbuf *path, struct strbuf *last)
>  {
>  	size_t offset = offset_1st_component(path->buf);
>  	size_t len = path->len;
> @@ -20,6 +24,10 @@ static void strip_last_component(struct strbuf *path)
>  	/* Find start of the last component */
>  	while (offset < len && !is_dir_sep(path->buf[len - 1]))
>  		len--;
> +
> +	if (last)
> +		strbuf_addstr(last, path->buf + len);
> +
>  	/* Skip sequences of multiple path-separators */
>  	while (offset < len && is_dir_sep(path->buf[len - 1]))
>  		len--;
> @@ -118,7 +126,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  			continue; /* '.' component */
>  		} else if (next.len == 2 && !strcmp(next.buf, "..")) {
>  			/* '..' component; strip the last path component */
> -			strip_last_component(resolved);
> +			strip_last_component(resolved, NULL);
>  			continue;
>  		}
>  
> @@ -169,7 +177,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  				 * strip off the last component since it will
>  				 * be replaced with the contents of the symlink
>  				 */
> -				strip_last_component(resolved);
> +				strip_last_component(resolved, NULL);
>  			}
>  
>  			/*
> @@ -202,6 +210,40 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  	return retval;
>  }
>  
> +/*
> + * Like strbuf_realpath, but trailing components which do not exist are copied
> + * through.
> + */
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> +{
> +	struct strbuf remaining = STRBUF_INIT;
> +	struct strbuf trailing = STRBUF_INIT;
> +	struct strbuf component = STRBUF_INIT;
> +
> +	strbuf_addstr(&remaining, path);
> +
> +	while (remaining.len) {
> +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> +			strbuf_addbuf(resolved, &trailing);
> +
> +			strbuf_release(&component);
> +			strbuf_release(&remaining);
> +			strbuf_release(&trailing);
> +
> +			return resolved->buf;
> +		}
> +		strip_last_component(&remaining, &component);
> +		strbuf_insertstr(&trailing, 0, "/");
> +		strbuf_insertstr(&trailing, 1, component.buf);

I may be utterly confused, but is this where

    - we started with a/b/c/d, pushed 'd' into trailing and decided
      to redo with a/b/c

    - now we split the a/b/c into a/b and c, and adjusting what is
      in trailing from 'd' to 'c/d'

happens place?  It's a bit sad that we need to repeatedly use
insertstr to prepend in front, instead of appending.

> +		strbuf_reset(&component);
> +	}
> +
> +	strbuf_release(&component);
> +	strbuf_release(&remaining);
> +	strbuf_release(&trailing);
> +	return NULL;
> +}
> +
>  char *real_pathdup(const char *path, int die_on_error)
>  {
>  	struct strbuf realpath = STRBUF_INIT;
> diff --git a/cache.h b/cache.h
> index c0072d43b1..e1e17e108e 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1320,6 +1320,7 @@ static inline int is_absolute_path(const char *path)
>  int is_directory(const char *);
>  char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  		      int die_on_error);
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path);
>  char *real_pathdup(const char *path, int die_on_error);
>  const char *absolute_path(const char *path);
>  char *absolute_pathdup(const char *path);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] abspath: add a function to resolve paths with missing components
  2020-10-09 21:10   ` Junio C Hamano
@ 2020-10-10  1:10     ` brian m. carlson
  2020-11-09 13:57       ` Johannes Schindelin
  0 siblings, 1 reply; 13+ messages in thread
From: brian m. carlson @ 2020-10-10  1:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes Schindelin, SZEDER Gábor

[-- Attachment #1: Type: text/plain, Size: 3602 bytes --]

On 2020-10-09 at 21:10:04, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> > We'd like to canonicalize paths such that we can preserve any number of
> > trailing components that may be missing.
> 
> Sorry, but at least to me, the above gives no clue what kind of
> operation is desired to be done on paths.  How would one preserve
> what does not exist (i.e. are missing)?
> 
> Do you mean some leading components in a path point at existing
> directories and after some point a component names a directory
> that does not exist, so everything after that does not yet exist
> until you "mkdir -p" them?
> 
> I guess my confusion comes primarily from the fuzziness of the verb
> "canonicalize" in the sentence.  We want to handle a/b/../c/d and
> there are various combinations of missng and existing directories,
> e.g. a/b may not exist or a/b may but a/c may not, etc.  Is that
> what is going on?  Makes me wonder if it makes sense to canonicalize
> a/b/../c/d into a/c/d when a/b does not exist in the first place,
> though.

The behavior that I'm proposing is the realpath -m behavior.  If the
path we're canonicalizing doesn't exist, we find the closest parent that
does exist, canonicalize it (à la realpath(3)), and then append the
components that don't exist to the canonicalized portion.

> > Let's add a function to do
> > that that calls strbuf_realpath to find the canonical path for the
> > portion we do have and then append the missing part.  We adjust
> > strip_last_component to return us the component it has stripped and use
> > that to help us accumulate the missing part.
> 
> OK, so if we have a/b/c/d and know a/b/c/d does not exist on the
> filesystem, we start by splitting it to a/b/c and d, see if a/b/c
> exists, and if not, do the same recursively to a/b/c to split it
> into a/b and c, and prefix the latter to 'd' that we split earlier
> (i.e. now we have a/b and c/d), until we have an existing directory
> on the first half?

Correct.

> > +/*
> > + * Like strbuf_realpath, but trailing components which do not exist are copied
> > + * through.
> > + */
> > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> > +{
> > +	struct strbuf remaining = STRBUF_INIT;
> > +	struct strbuf trailing = STRBUF_INIT;
> > +	struct strbuf component = STRBUF_INIT;
> > +
> > +	strbuf_addstr(&remaining, path);
> > +
> > +	while (remaining.len) {
> > +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> > +			strbuf_addbuf(resolved, &trailing);
> > +
> > +			strbuf_release(&component);
> > +			strbuf_release(&remaining);
> > +			strbuf_release(&trailing);
> > +
> > +			return resolved->buf;
> > +		}
> > +		strip_last_component(&remaining, &component);
> > +		strbuf_insertstr(&trailing, 0, "/");
> > +		strbuf_insertstr(&trailing, 1, component.buf);
> 
> I may be utterly confused, but is this where
> 
>     - we started with a/b/c/d, pushed 'd' into trailing and decided
>       to redo with a/b/c
> 
>     - now we split the a/b/c into a/b and c, and adjusting what is
>       in trailing from 'd' to 'c/d'
> 
> happens place?  It's a bit sad that we need to repeatedly use
> insertstr to prepend in front, instead of appending.

Yes, that's true.  It really isn't avoidable, though, with the functions
the way that they are.  We can't use the original path and keep track of
the offset because it may contain multiple path separators and we don't
want to include those in the path.
-- 
brian m. carlson: Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/2] rev-parse options for absolute or relative paths
  2020-10-09 19:15 [PATCH v2 0/2] rev-parse options for absolute or relative paths brian m. carlson
  2020-10-09 19:15 ` [PATCH v2 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
  2020-10-09 19:15 ` [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
@ 2020-11-04 23:01 ` Emily Shaffer
  2020-11-05  3:20   ` brian m. carlson
  2020-11-09 13:33 ` Johannes Schindelin
  3 siblings, 1 reply; 13+ messages in thread
From: Emily Shaffer @ 2020-11-04 23:01 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Johannes Schindelin, SZEDER Gábor

Hiya, we took a look in the review club, but I'll try to keep it simple.

On Fri, Oct 09, 2020 at 07:15:09PM +0000, brian m. carlson wrote:
> 
> There are a bunch of different situations in which one would like to
> have an absolute and canonical or a relative path from Git.

I think specifically you are interested in this situation:
https://github.com/git-lfs/git-lfs/issues/4012

I think this would have been useful to see in the cover letter :) There
was a lot of "but why" during the review club meeting.

Looking at the git-lfs issue in question, the problem seems to be "how
to make sure the path Git gives me will match the path Go gives me" -
and like you mentioned in the issue, there are two ways to fix it.

Also, there was some brief wondering: if this option is useful in 'git
rev-parse', would it be useful in other commands too? I thought maybe it
would be more useful as an arg to 'git' itself, but on inspection, not
really - because you wouldn't be able to switch the format in the middle
of a bunch of args, like you can now. The way it's written here,
though, should mean a smooth transition to something like OPT__VERBOSE
or OPT__QUIET if we discover lots of folks would find use.

Thanks for sending it. I'll try and do a detailed review of the patches
themselves. To me, the concept seems sound.

 - Emily

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/2] rev-parse options for absolute or relative paths
  2020-11-04 23:01 ` [PATCH v2 0/2] rev-parse options for absolute or relative paths Emily Shaffer
@ 2020-11-05  3:20   ` brian m. carlson
  0 siblings, 0 replies; 13+ messages in thread
From: brian m. carlson @ 2020-11-05  3:20 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git, Johannes Schindelin, SZEDER Gábor

[-- Attachment #1: Type: text/plain, Size: 2003 bytes --]

On 2020-11-04 at 23:01:57, Emily Shaffer wrote:
> Hiya, we took a look in the review club, but I'll try to keep it simple.
> 
> On Fri, Oct 09, 2020 at 07:15:09PM +0000, brian m. carlson wrote:
> > 
> > There are a bunch of different situations in which one would like to
> > have an absolute and canonical or a relative path from Git.
> 
> I think specifically you are interested in this situation:
> https://github.com/git-lfs/git-lfs/issues/4012

Yes, that's the case I'm interested in.

> I think this would have been useful to see in the cover letter :) There
> was a lot of "but why" during the review club meeting.

Sorry about that.  When I'm sending patches to Git that benefit Git LFS,
I want to be careful to build features that are generally applicable and
don't just try to shove in features that benefit the project I maintain,
so I'm often a little hesitant to mention my particular use case and try
to let the patch stand on its own.  I know that there are several
competing projects in this space, and I want to be sensitive to not
privileging my own just because I'm a frequent contributor.

I'll try to strike a better balance here in the future.

> Also, there was some brief wondering: if this option is useful in 'git
> rev-parse', would it be useful in other commands too? I thought maybe it
> would be more useful as an arg to 'git' itself, but on inspection, not
> really - because you wouldn't be able to switch the format in the middle
> of a bunch of args, like you can now. The way it's written here,
> though, should mean a smooth transition to something like OPT__VERBOSE
> or OPT__QUIET if we discover lots of folks would find use.

Yeah, I think this could be a thing we add in the future, but I'm
generally comfortable with this being just in rev-parse, since that's
the command that folks tend to use when scripting, but we'll see if
other folks think differently.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/2] rev-parse options for absolute or relative paths
  2020-10-09 19:15 [PATCH v2 0/2] rev-parse options for absolute or relative paths brian m. carlson
                   ` (2 preceding siblings ...)
  2020-11-04 23:01 ` [PATCH v2 0/2] rev-parse options for absolute or relative paths Emily Shaffer
@ 2020-11-09 13:33 ` Johannes Schindelin
  3 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2020-11-09 13:33 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, SZEDER Gábor

Hi brian,

On Fri, 9 Oct 2020, brian m. carlson wrote:

> There are a bunch of different situations in which one would like to
> have an absolute and canonical or a relative path from Git.  In many of
> these cases, these values are already available from git rev-parse, but
> some values only come in one form or another.
>
> Many operating systems, such as macOS, lack a built-in realpath command
> that can canonicalize paths properly, and additionally some programming
> languages, like Go, currently do as well.  It's therefore helpful for us
> to provide a generic way to request that a path is fully canonicalized
> before using it.  Since users may wish for a relative path, we can
> provide one of those as well.

I am very much in favor of this patch series' goal. Windows also does not
_really_ have anything in the way of `realpath`. Besides, it would be good
to have a way to ask _Git_ what it's idea of the real path is (no
guessing!).

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] abspath: add a function to resolve paths with missing components
  2020-10-09 19:15 ` [PATCH v2 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
  2020-10-09 21:10   ` Junio C Hamano
@ 2020-11-09 13:55   ` Johannes Schindelin
  2020-11-16  2:21     ` brian m. carlson
  1 sibling, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2020-11-09 13:55 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, SZEDER Gábor

Hi brian,

On Fri, 9 Oct 2020, brian m. carlson wrote:

> We'd like to canonicalize paths such that we can preserve any number of
> trailing components that may be missing.  Let's add a function to do
> that that calls strbuf_realpath to find the canonical path for the
> portion we do have and then append the missing part.  We adjust
> strip_last_component to return us the component it has stripped and use
> that to help us accumulate the missing part.
>
> Note that it is intentional that we invoke strbuf_realpath here,
> repeatedly if necessary, because on Windows that function is replaced
> with a version that uses the proper system semantics for
> canonicalization.  Trying to adjust strbuf_realpath to perform this kind
> of canonicalization with an additional option would fail to work
> properly on Windows.  The present approach is equivalent to
> strbuf_realpath for cases where the path exists, and the only other
> cases where we will use this function the additional overhead of
> multiple invocations is not significant.

Thank you for being so considerate. Yes, on Windows, we use (wherever
possible) a shortcut that tells us the canonicalized path of existing
entries.

Technically, it is not `strbuf_realpath()` that we override, but we take a
shortcut _in_ that function. That's semantics, though.

More importantly, we recently fixed a bug in our code to allow for a quirk
in the `strbuf_realpath()` function: `strbuf_realpath()` allows the last
path component to not exist. If that is the case, now it's time to try
without last component.

In a sense, this is a 1-level version of your infinite-level
`strbuf_realpath_missing()` function.

An idea that immediately crosses my mind is whether that level could be
something we want to pass directly into `strbuf_realpath()` as a parameter
(it would be 1 to imitate the current behavior and -1 for the
infinite-level case). What do you think? Does that make sense?

In any case, I think this `_missing()` functionality should be implemented
a bit more tightly with the `strbuf_realpath()` function because of the
logic that already allows the last component to be missing:

                if (lstat(resolved->buf, &st)) {
                        /* error out unless this was the last component */
                        if (errno != ENOENT || remaining.len) {
                                if (die_on_error)
                                        die_errno("Invalid path '%s'",
                                                  resolved->buf);
                                else
                                        goto error_out;
                        }

See https://github.com/git/git/blob/v2.29.2/abspath.c#L130-L138 for the
exact code and context.

Seeing as we _already_ have some code to allow for _some_ missing
component, it should be possible to extend the logic to allow for
different levels (e.g. using `count_slashes()` if we want to allow more
than just the last component to be missing).

> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  abspath.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
>  cache.h   |  1 +
>  2 files changed, 47 insertions(+), 4 deletions(-)
>
> diff --git a/abspath.c b/abspath.c
> index 6f15a418bb..092bb33b64 100644
> --- a/abspath.c
> +++ b/abspath.c
> @@ -11,8 +11,12 @@ int is_directory(const char *path)
>  	return (!stat(path, &st) && S_ISDIR(st.st_mode));
>  }
>
> -/* removes the last path component from 'path' except if 'path' is root */
> -static void strip_last_component(struct strbuf *path)
> +/*
> + * Removes the last path component from 'path' except if 'path' is root.
> + *
> + * If last is not NULL, the last path component is copied to last.
> + */
> +static void strip_last_component(struct strbuf *path, struct strbuf *last)
>  {
>  	size_t offset = offset_1st_component(path->buf);
>  	size_t len = path->len;
> @@ -20,6 +24,10 @@ static void strip_last_component(struct strbuf *path)
>  	/* Find start of the last component */
>  	while (offset < len && !is_dir_sep(path->buf[len - 1]))
>  		len--;
> +
> +	if (last)
> +		strbuf_addstr(last, path->buf + len);
> +
>  	/* Skip sequences of multiple path-separators */
>  	while (offset < len && is_dir_sep(path->buf[len - 1]))
>  		len--;
> @@ -118,7 +126,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  			continue; /* '.' component */
>  		} else if (next.len == 2 && !strcmp(next.buf, "..")) {
>  			/* '..' component; strip the last path component */
> -			strip_last_component(resolved);
> +			strip_last_component(resolved, NULL);
>  			continue;
>  		}
>
> @@ -169,7 +177,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  				 * strip off the last component since it will
>  				 * be replaced with the contents of the symlink
>  				 */
> -				strip_last_component(resolved);
> +				strip_last_component(resolved, NULL);
>  			}
>
>  			/*
> @@ -202,6 +210,40 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  	return retval;
>  }
>
> +/*
> + * Like strbuf_realpath, but trailing components which do not exist are copied
> + * through.
> + */
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> +{
> +	struct strbuf remaining = STRBUF_INIT;
> +	struct strbuf trailing = STRBUF_INIT;
> +	struct strbuf component = STRBUF_INIT;
> +
> +	strbuf_addstr(&remaining, path);
> +
> +	while (remaining.len) {
> +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> +			strbuf_addbuf(resolved, &trailing);
> +
> +			strbuf_release(&component);
> +			strbuf_release(&remaining);
> +			strbuf_release(&trailing);
> +
> +			return resolved->buf;
> +		}
> +		strip_last_component(&remaining, &component);
> +		strbuf_insertstr(&trailing, 0, "/");
> +		strbuf_insertstr(&trailing, 1, component.buf);

Not that it matters a lot, but this could be written shorter via

		strbuf_insertf(&trailing, "/%s", component.buf);

But as I said above, I think we should be able to fold the logic _into_
`strbuf_realpath()` (even if this makes my job harder to maintain the
Windows-specific shortcut).

Thanks,
Dscho

> +		strbuf_reset(&component);
> +	}
> +
> +	strbuf_release(&component);
> +	strbuf_release(&remaining);
> +	strbuf_release(&trailing);
> +	return NULL;
> +}
> +
>  char *real_pathdup(const char *path, int die_on_error)
>  {
>  	struct strbuf realpath = STRBUF_INIT;
> diff --git a/cache.h b/cache.h
> index c0072d43b1..e1e17e108e 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1320,6 +1320,7 @@ static inline int is_absolute_path(const char *path)
>  int is_directory(const char *);
>  char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  		      int die_on_error);
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path);
>  char *real_pathdup(const char *path, int die_on_error);
>  const char *absolute_path(const char *path);
>  char *absolute_pathdup(const char *path);
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] abspath: add a function to resolve paths with missing components
  2020-10-10  1:10     ` brian m. carlson
@ 2020-11-09 13:57       ` Johannes Schindelin
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2020-11-09 13:57 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, git, SZEDER Gábor

[-- Attachment #1: Type: text/plain, Size: 4113 bytes --]

Hi brian,

On Sat, 10 Oct 2020, brian m. carlson wrote:

> On 2020-10-09 at 21:10:04, Junio C Hamano wrote:
> > "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> >
> > > We'd like to canonicalize paths such that we can preserve any number of
> > > trailing components that may be missing.
> >
> > Sorry, but at least to me, the above gives no clue what kind of
> > operation is desired to be done on paths.  How would one preserve
> > what does not exist (i.e. are missing)?
> >
> > Do you mean some leading components in a path point at existing
> > directories and after some point a component names a directory
> > that does not exist, so everything after that does not yet exist
> > until you "mkdir -p" them?
> >
> > I guess my confusion comes primarily from the fuzziness of the verb
> > "canonicalize" in the sentence.  We want to handle a/b/../c/d and
> > there are various combinations of missng and existing directories,
> > e.g. a/b may not exist or a/b may but a/c may not, etc.  Is that
> > what is going on?  Makes me wonder if it makes sense to canonicalize
> > a/b/../c/d into a/c/d when a/b does not exist in the first place,
> > though.
>
> The behavior that I'm proposing is the realpath -m behavior.  If the
> path we're canonicalizing doesn't exist, we find the closest parent that
> does exist, canonicalize it (à la realpath(3)), and then append the
> components that don't exist to the canonicalized portion.

FWIW I was immediately able to think of a handful scenarios where this
functionality would come in handy, but I am probably not a typical example
for the median reader. So maybe a concrete example or two why this could
be handy could be shown in the cover letter?

Thanks,
Dscho

>
> > > Let's add a function to do
> > > that that calls strbuf_realpath to find the canonical path for the
> > > portion we do have and then append the missing part.  We adjust
> > > strip_last_component to return us the component it has stripped and use
> > > that to help us accumulate the missing part.
> >
> > OK, so if we have a/b/c/d and know a/b/c/d does not exist on the
> > filesystem, we start by splitting it to a/b/c and d, see if a/b/c
> > exists, and if not, do the same recursively to a/b/c to split it
> > into a/b and c, and prefix the latter to 'd' that we split earlier
> > (i.e. now we have a/b and c/d), until we have an existing directory
> > on the first half?
>
> Correct.
>
> > > +/*
> > > + * Like strbuf_realpath, but trailing components which do not exist are copied
> > > + * through.
> > > + */
> > > +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path)
> > > +{
> > > +	struct strbuf remaining = STRBUF_INIT;
> > > +	struct strbuf trailing = STRBUF_INIT;
> > > +	struct strbuf component = STRBUF_INIT;
> > > +
> > > +	strbuf_addstr(&remaining, path);
> > > +
> > > +	while (remaining.len) {
> > > +		if (strbuf_realpath(resolved, remaining.buf, 0)) {
> > > +			strbuf_addbuf(resolved, &trailing);
> > > +
> > > +			strbuf_release(&component);
> > > +			strbuf_release(&remaining);
> > > +			strbuf_release(&trailing);
> > > +
> > > +			return resolved->buf;
> > > +		}
> > > +		strip_last_component(&remaining, &component);
> > > +		strbuf_insertstr(&trailing, 0, "/");
> > > +		strbuf_insertstr(&trailing, 1, component.buf);
> >
> > I may be utterly confused, but is this where
> >
> >     - we started with a/b/c/d, pushed 'd' into trailing and decided
> >       to redo with a/b/c
> >
> >     - now we split the a/b/c into a/b and c, and adjusting what is
> >       in trailing from 'd' to 'c/d'
> >
> > happens place?  It's a bit sad that we need to repeatedly use
> > insertstr to prepend in front, instead of appending.
>
> Yes, that's true.  It really isn't avoidable, though, with the functions
> the way that they are.  We can't use the original path and keep track of
> the offset because it may contain multiple path separators and we don't
> want to include those in the path.
> --
> brian m. carlson: Houston, Texas, US
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting
  2020-10-09 19:15 ` [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
@ 2020-11-09 14:46   ` Johannes Schindelin
  2020-11-16  2:15     ` brian m. carlson
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2020-11-09 14:46 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, SZEDER Gábor

Hi brian,

On Fri, 9 Oct 2020, brian m. carlson wrote:

> git rev-parse has several options which print various paths.  Some of
> these paths are printed relative to the current working directory, and
> some are absolute.
>
> Normally, this is not a problem, but there are times when one wants
> paths entirely in one format or another.  This can be done trivially if
> the paths are canonical, but canonicalizing paths is not possible on
> some shell scripting environments which lack realpath(1) and also in Go,
> which lacks functions that properly canonicalize paths on Windows.
>
> To help out the scripter, let's provide an option which turns most of
> the paths printed by git rev-parse to be either relative to the current
> working directory or absolute and canonical.  Document which options are
> affected and which are not so that users are not confused.
>
> This approach is cleaner and tidier than providing duplicates of
> existing options which are either relative or absolute.
>
> Note that if the user needs both forms, it is possible to pass an
> additional option in the middle of the command line which changes the
> behavior of subsequent operations.

Good.

> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> index ed200c8af1..ec62b4cd16 100644
> --- a/builtin/rev-parse.c
> +++ b/builtin/rev-parse.c
> @@ -583,6 +583,76 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
>  	clear_ref_exclusion(&ref_excludes);
>  }
>
> +enum format_type {
> +	/* We would like a relative path. */
> +	FORMAT_RELATIVE,
> +	/* We would like a canonical absolute path. */
> +	FORMAT_CANONICAL,
> +	/* We would like the default behavior. */
> +	FORMAT_DEFAULT,
> +};
> +
> +enum default_type {
> +	/* Our default is a relative path. */
> +	DEFAULT_RELATIVE,
> +	/* Our default is a relative path if there's a shared root. */
> +	DEFAULT_RELATIVE_IF_SHARED,
> +	/* Our default is a canonical absolute path. */
> +	DEFAULT_CANONICAL,
> +	/* Our default is not to modify the item. */
> +	DEFAULT_UNMODIFIED,
> +};

I wonder whether it would make sense to consolidate these two enums into a
single one.

> +static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
> +{
> +	char *cwd = NULL;
> +	/*
> +	 * We don't ever produce a relative path if prefix is NULL, so set the
> +	 * prefix to the current directory so that we can produce a relative
> +	 * path whenever possible.  If we're using RELATIVE_IF_SHARED mode, then
> +	 * we want an absolute path unless the two share a common prefix, so don't
> +	 * set it in that case, since doing so causes a relative path to always
> +	 * be produced if possible.
> +	 */
> +	if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
> +		prefix = cwd = xgetcwd();
> +	if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
> +		puts(path);
> +	} else if (format == FORMAT_RELATIVE ||
> +		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE)) {
> +		/*
> +		 * In order for relative_path to work as expected, we need to
> +		 * make sure that both paths are absolute paths.  If we don't,
> +		 * we can end up with an unexpected absolute path that the user
> +		 * didn't want.
> +		 */
> +		struct strbuf buf = STRBUF_INIT, realbuf = STRBUF_INIT, prefixbuf = STRBUF_INIT;
> +		if (!is_absolute_path(path)) {
> +			if (!strbuf_realpath_missing(&realbuf, path))
> +				die(_("Unable to resolve path '%s'"), path);
> +			path = realbuf.buf;
> +		}
> +		if (!is_absolute_path(prefix)) {
> +			if (!strbuf_realpath_missing(&prefixbuf, prefix))
> +				die(_("Unable to resolve path '%s'"), path);
> +			prefix = prefixbuf.buf;
> +		}
> +		puts(relative_path(path, prefix, &buf));
> +		strbuf_release(&buf);
> +	} else if (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED) {
> +		struct strbuf buf = STRBUF_INIT;
> +		puts(relative_path(path, prefix, &buf));
> +		strbuf_release(&buf);
> +	} else {
> +		struct strbuf buf = STRBUF_INIT;
> +		if (!strbuf_realpath_missing(&buf, path))
> +			die(_("Unable to resolve path '%s'"), path);
> +		puts(buf.buf);
> +		strbuf_release(&buf);
> +	}
> +	free(cwd);
> +}
> +
>  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  {
>  	int i, as_is = 0, verify = 0, quiet = 0, revs_count = 0, type = 0;
> @@ -595,6 +665,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  	struct object_context unused;
>  	struct strbuf buf = STRBUF_INIT;
>  	const int hexsz = the_hash_algo->hexsz;
> +	enum format_type format = FORMAT_DEFAULT;
>
>  	if (argc > 1 && !strcmp("--parseopt", argv[1]))
>  		return cmd_parseopt(argc - 1, argv + 1, prefix);
> @@ -650,8 +721,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!argv[i + 1])
>  				die("--git-path requires an argument");
>  			strbuf_reset(&buf);
> -			puts(relative_path(git_path("%s", argv[i + 1]),
> -					   prefix, &buf));
> +			print_path(git_path("%s", argv[i + 1]), prefix, format, DEFAULT_RELATIVE_IF_SHARED);

One thing that the original code did was to reuse the same `strbuf`. Not
sure whether this matters in practice.

>  			i++;
>  			continue;
>  		}
> @@ -683,6 +753,16 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  					show_file(arg, 0);
>  				continue;
>  			}
> +			if (opt_with_value(arg, "--path-format", &arg)) {
> +				if (!strcmp(arg, "absolute")) {
> +					format = FORMAT_CANONICAL;
> +				} else if (!strcmp(arg, "relative")) {
> +					format = FORMAT_RELATIVE;
> +				} else {
> +					die("unknown argument to --path-format: %s", arg);
> +				}
> +				continue;
> +			}
>  			if (!strcmp(arg, "--default")) {
>  				def = argv[++i];
>  				if (!def)
> @@ -803,7 +883,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!strcmp(arg, "--show-toplevel")) {
>  				const char *work_tree = get_git_work_tree();
>  				if (work_tree)
> -					puts(work_tree);
> +					print_path(work_tree, prefix, format, DEFAULT_CANONICAL);

The way `print_path()`'s code is structured, it is not immediately obvious
to me whether the patch changes behavior here. I _suspect_ that we're now
calling `strbuf_realpath_missing()` and react to its return value, which
is different from before.

Wouldn't make `DEFAULT_UNMODIFIED` make more sense here?

>  				else
>  					die("this operation must be run in a work tree");
>  				continue;
> @@ -811,7 +891,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!strcmp(arg, "--show-superproject-working-tree")) {
>  				struct strbuf superproject = STRBUF_INIT;
>  				if (get_superproject_working_tree(&superproject))
> -					puts(superproject.buf);
> +					print_path(superproject.buf, prefix, format, DEFAULT_CANONICAL);

Shouldn't this be `DEFAULT_UNMODIFIED`?

>  				strbuf_release(&superproject);
>  				continue;
>  			}
> @@ -846,16 +926,18 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
>  				char *cwd;
>  				int len;
> +				enum format_type wanted = format;
>  				if (arg[2] == 'g') {	/* --git-dir */
>  					if (gitdir) {
> -						puts(gitdir);
> +						print_path(gitdir, prefix, format, DEFAULT_UNMODIFIED);
>  						continue;
>  					}
>  					if (!prefix) {
> -						puts(".git");
> +						print_path(".git", prefix, format, DEFAULT_UNMODIFIED);
>  						continue;
>  					}
>  				} else {		/* --absolute-git-dir */
> +					wanted = FORMAT_CANONICAL;
>  					if (!gitdir && !prefix)
>  						gitdir = ".git";
>  					if (gitdir) {
> @@ -868,14 +950,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				}
>  				cwd = xgetcwd();
>  				len = strlen(cwd);
> -				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
> +				strbuf_reset(&buf);
> +				strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");

So `DEFAULT_CANONICAL` ensures a trailing `/`? I do not see that in
`print_path()`'s implementation, and also, I would love to see a different
name for that ("canonical", from my Java past, suggests something like
"real path" to me).

Thanks,
Dscho

>  				free(cwd);
> +				print_path(buf.buf, prefix, wanted, DEFAULT_CANONICAL);
>  				continue;
>  			}
>  			if (!strcmp(arg, "--git-common-dir")) {
> -				strbuf_reset(&buf);
> -				puts(relative_path(get_git_common_dir(),
> -						   prefix, &buf));
> +				print_path(get_git_common_dir(), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
>  				continue;
>  			}
>  			if (!strcmp(arg, "--is-inside-git-dir")) {
> @@ -905,8 +987,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				if (the_index.split_index) {
>  					const struct object_id *oid = &the_index.split_index->base_oid;
>  					const char *path = git_path("sharedindex.%s", oid_to_hex(oid));
> -					strbuf_reset(&buf);
> -					puts(relative_path(path, prefix, &buf));
> +					print_path(path, prefix, format, DEFAULT_RELATIVE);
>  				}
>  				continue;
>  			}
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 408b97d5af..51d7d40ec1 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -3,6 +3,16 @@
>  test_description='test git rev-parse'
>  . ./test-lib.sh
>
> +test_one () {
> +	dir="$1" &&
> +	expect="$2" &&
> +	shift &&
> +	shift &&
> +	echo "$expect" >expect &&
> +	git -C "$dir" rev-parse "$@" >actual &&
> +	test_cmp expect actual
> +}
> +
>  # usage: [options] label is-bare is-inside-git is-inside-work prefix git-dir absolute-git-dir
>  test_rev_parse () {
>  	d=
> @@ -60,7 +70,13 @@ ROOT=$(pwd)
>
>  test_expect_success 'setup' '
>  	mkdir -p sub/dir work &&
> -	cp -R .git repo.git
> +	cp -R .git repo.git &&
> +	git checkout -B main &&
> +	test_commit abc &&
> +	git checkout -b side &&
> +	test_commit def &&
> +	git checkout main &&
> +	git worktree add worktree side
>  '
>
>  test_rev_parse toplevel false false true '' .git "$ROOT/.git"
> @@ -88,6 +104,45 @@ test_rev_parse -C work -g ../repo.git -b t 'GIT_DIR=../repo.git, core.bare = tru
>
>  test_rev_parse -C work -g ../repo.git -b u 'GIT_DIR=../repo.git, core.bare undefined' false false true ''
>
> +test_expect_success 'rev-parse --path-format=absolute' '
> +	test_one "." "$ROOT/.git" --path-format=absolute --git-dir &&
> +	test_one "." "$ROOT/.git" --path-format=absolute --git-common-dir &&
> +	test_one "sub/dir" "$ROOT/.git" --path-format=absolute --git-dir &&
> +	test_one "sub/dir" "$ROOT/.git" --path-format=absolute --git-common-dir &&
> +	test_one "worktree" "$ROOT/.git/worktrees/worktree" --path-format=absolute --git-dir &&
> +	test_one "worktree" "$ROOT/.git" --path-format=absolute --git-common-dir &&
> +	test_one "." "$ROOT" --path-format=absolute --show-toplevel &&
> +	test_one "." "$ROOT/.git/objects" --path-format=absolute --git-path objects &&
> +	test_one "." "$ROOT/.git/objects/foo/bar/baz" --path-format=absolute --git-path objects/foo/bar/baz
> +'
> +
> +test_expect_success 'rev-parse --path-format=relative' '
> +	test_one "." ".git" --path-format=relative --git-dir &&
> +	test_one "." ".git" --path-format=relative --git-common-dir &&
> +	test_one "sub/dir" "../../.git" --path-format=relative --git-dir &&
> +	test_one "sub/dir" "../../.git" --path-format=relative --git-common-dir &&
> +	test_one "worktree" "../.git/worktrees/worktree" --path-format=relative --git-dir &&
> +	test_one "worktree" "../.git" --path-format=relative --git-common-dir &&
> +	test_one "." "./" --path-format=relative --show-toplevel &&
> +	test_one "." ".git/objects" --path-format=relative --git-path objects &&
> +	test_one "." ".git/objects/foo/bar/baz" --path-format=relative --git-path objects/foo/bar/baz
> +'
> +
> +test_expect_success '--path-format=relative does not affect --absolute-git-dir' '
> +	git rev-parse --path-format=relative --absolute-git-dir >actual &&
> +	echo "$ROOT/.git" >expect &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--path-format can change in the middle of the command line' '
> +	git rev-parse --path-format=absolute --git-dir --path-format=relative --git-path objects/foo/bar >actual &&
> +	cat >expect <<-EOF &&
> +	$ROOT/.git
> +	.git/objects/foo/bar
> +	EOF
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'git-common-dir from worktree root' '
>  	echo .git >expect &&
>  	git rev-parse --git-common-dir >actual &&
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting
  2020-11-09 14:46   ` Johannes Schindelin
@ 2020-11-16  2:15     ` brian m. carlson
  0 siblings, 0 replies; 13+ messages in thread
From: brian m. carlson @ 2020-11-16  2:15 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, SZEDER Gábor

[-- Attachment #1: Type: text/plain, Size: 5418 bytes --]

On 2020-11-09 at 14:46:13, Johannes Schindelin wrote:
> On Fri, 9 Oct 2020, brian m. carlson wrote:
> > diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> > index ed200c8af1..ec62b4cd16 100644
> > --- a/builtin/rev-parse.c
> > +++ b/builtin/rev-parse.c
> > @@ -583,6 +583,76 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
> >  	clear_ref_exclusion(&ref_excludes);
> >  }
> >
> > +enum format_type {
> > +	/* We would like a relative path. */
> > +	FORMAT_RELATIVE,
> > +	/* We would like a canonical absolute path. */
> > +	FORMAT_CANONICAL,
> > +	/* We would like the default behavior. */
> > +	FORMAT_DEFAULT,
> > +};
> > +
> > +enum default_type {
> > +	/* Our default is a relative path. */
> > +	DEFAULT_RELATIVE,
> > +	/* Our default is a relative path if there's a shared root. */
> > +	DEFAULT_RELATIVE_IF_SHARED,
> > +	/* Our default is a canonical absolute path. */
> > +	DEFAULT_CANONICAL,
> > +	/* Our default is not to modify the item. */
> > +	DEFAULT_UNMODIFIED,
> > +};
> 
> I wonder whether it would make sense to consolidate these two enums into a
> single one.

Technically, we can, but because there are cases in each one which don't
make sense in the other, we end up with a situation that is hard to
reason about in print_path, which is, by this point, already a little
complex.  So I think I'd prefer not to consolidate them.

> >  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
> >  {
> >  	int i, as_is = 0, verify = 0, quiet = 0, revs_count = 0, type = 0;
> > @@ -595,6 +665,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
> >  	struct object_context unused;
> >  	struct strbuf buf = STRBUF_INIT;
> >  	const int hexsz = the_hash_algo->hexsz;
> > +	enum format_type format = FORMAT_DEFAULT;
> >
> >  	if (argc > 1 && !strcmp("--parseopt", argv[1]))
> >  		return cmd_parseopt(argc - 1, argv + 1, prefix);
> > @@ -650,8 +721,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
> >  			if (!argv[i + 1])
> >  				die("--git-path requires an argument");
> >  			strbuf_reset(&buf);
> > -			puts(relative_path(git_path("%s", argv[i + 1]),
> > -					   prefix, &buf));
> > +			print_path(git_path("%s", argv[i + 1]), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
> 
> One thing that the original code did was to reuse the same `strbuf`. Not
> sure whether this matters in practice.

I don't think it does.  I'll make sure to free it, though, since
strbuf_reset doesn't do that.

> >  			i++;
> >  			continue;
> >  		}
> > @@ -683,6 +753,16 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
> >  					show_file(arg, 0);
> >  				continue;
> >  			}
> > +			if (opt_with_value(arg, "--path-format", &arg)) {
> > +				if (!strcmp(arg, "absolute")) {
> > +					format = FORMAT_CANONICAL;
> > +				} else if (!strcmp(arg, "relative")) {
> > +					format = FORMAT_RELATIVE;
> > +				} else {
> > +					die("unknown argument to --path-format: %s", arg);
> > +				}
> > +				continue;
> > +			}
> >  			if (!strcmp(arg, "--default")) {
> >  				def = argv[++i];
> >  				if (!def)
> > @@ -803,7 +883,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
> >  			if (!strcmp(arg, "--show-toplevel")) {
> >  				const char *work_tree = get_git_work_tree();
> >  				if (work_tree)
> > -					puts(work_tree);
> > +					print_path(work_tree, prefix, format, DEFAULT_CANONICAL);
> 
> The way `print_path()`'s code is structured, it is not immediately obvious
> to me whether the patch changes behavior here. I _suspect_ that we're now
> calling `strbuf_realpath_missing()` and react to its return value, which
> is different from before.
> 
> Wouldn't make `DEFAULT_UNMODIFIED` make more sense here?

It's documented to show the absolute path of the top of the repository,
so it should be safe to do either one.  Will switch.

> >  				else
> >  					die("this operation must be run in a work tree");
> >  				continue;
> > @@ -811,7 +891,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
> >  			if (!strcmp(arg, "--show-superproject-working-tree")) {
> >  				struct strbuf superproject = STRBUF_INIT;
> >  				if (get_superproject_working_tree(&superproject))
> > -					puts(superproject.buf);
> > +					print_path(superproject.buf, prefix, format, DEFAULT_CANONICAL);
> 
> Shouldn't this be `DEFAULT_UNMODIFIED`?

Same thing as above.  Will change.

> > @@ -868,14 +950,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
> >  				}
> >  				cwd = xgetcwd();
> >  				len = strlen(cwd);
> > -				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
> > +				strbuf_reset(&buf);
> > +				strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");
> 
> So `DEFAULT_CANONICAL` ensures a trailing `/`? I do not see that in
> `print_path()`'s implementation, and also, I would love to see a different
> name for that ("canonical", from my Java past, suggests something like
> "real path" to me).

I don't think that's what's happening here.  I believe the intent is to
insert a slash between the current working directory and the ".git"
component if and only if the former lacks one.  My code doesn't change
that.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] abspath: add a function to resolve paths with missing components
  2020-11-09 13:55   ` Johannes Schindelin
@ 2020-11-16  2:21     ` brian m. carlson
  0 siblings, 0 replies; 13+ messages in thread
From: brian m. carlson @ 2020-11-16  2:21 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, SZEDER Gábor

[-- Attachment #1: Type: text/plain, Size: 3392 bytes --]

On 2020-11-09 at 13:55:53, Johannes Schindelin wrote:
> Hi brian,
> 
> On Fri, 9 Oct 2020, brian m. carlson wrote:
> 
> > We'd like to canonicalize paths such that we can preserve any number of
> > trailing components that may be missing.  Let's add a function to do
> > that that calls strbuf_realpath to find the canonical path for the
> > portion we do have and then append the missing part.  We adjust
> > strip_last_component to return us the component it has stripped and use
> > that to help us accumulate the missing part.
> >
> > Note that it is intentional that we invoke strbuf_realpath here,
> > repeatedly if necessary, because on Windows that function is replaced
> > with a version that uses the proper system semantics for
> > canonicalization.  Trying to adjust strbuf_realpath to perform this kind
> > of canonicalization with an additional option would fail to work
> > properly on Windows.  The present approach is equivalent to
> > strbuf_realpath for cases where the path exists, and the only other
> > cases where we will use this function the additional overhead of
> > multiple invocations is not significant.
> 
> Thank you for being so considerate. Yes, on Windows, we use (wherever
> possible) a shortcut that tells us the canonicalized path of existing
> entries.
> 
> Technically, it is not `strbuf_realpath()` that we override, but we take a
> shortcut _in_ that function. That's semantics, though.
> 
> More importantly, we recently fixed a bug in our code to allow for a quirk
> in the `strbuf_realpath()` function: `strbuf_realpath()` allows the last
> path component to not exist. If that is the case, now it's time to try
> without last component.
> 
> In a sense, this is a 1-level version of your infinite-level
> `strbuf_realpath_missing()` function.
> 
> An idea that immediately crosses my mind is whether that level could be
> something we want to pass directly into `strbuf_realpath()` as a parameter
> (it would be 1 to imitate the current behavior and -1 for the
> infinite-level case). What do you think? Does that make sense?
> 
> In any case, I think this `_missing()` functionality should be implemented
> a bit more tightly with the `strbuf_realpath()` function because of the
> logic that already allows the last component to be missing:
> 
>                 if (lstat(resolved->buf, &st)) {
>                         /* error out unless this was the last component */
>                         if (errno != ENOENT || remaining.len) {
>                                 if (die_on_error)
>                                         die_errno("Invalid path '%s'",
>                                                   resolved->buf);
>                                 else
>                                         goto error_out;
>                         }
> 
> See https://github.com/git/git/blob/v2.29.2/abspath.c#L130-L138 for the
> exact code and context.
> 
> Seeing as we _already_ have some code to allow for _some_ missing
> component, it should be possible to extend the logic to allow for
> different levels (e.g. using `count_slashes()` if we want to allow more
> than just the last component to be missing).

Okay, if you'd prefer to do it that way, that's fine with me.  I'll
reroll with that change.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-11-16  2:26 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-09 19:15 [PATCH v2 0/2] rev-parse options for absolute or relative paths brian m. carlson
2020-10-09 19:15 ` [PATCH v2 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
2020-10-09 21:10   ` Junio C Hamano
2020-10-10  1:10     ` brian m. carlson
2020-11-09 13:57       ` Johannes Schindelin
2020-11-09 13:55   ` Johannes Schindelin
2020-11-16  2:21     ` brian m. carlson
2020-10-09 19:15 ` [PATCH v2 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
2020-11-09 14:46   ` Johannes Schindelin
2020-11-16  2:15     ` brian m. carlson
2020-11-04 23:01 ` [PATCH v2 0/2] rev-parse options for absolute or relative paths Emily Shaffer
2020-11-05  3:20   ` brian m. carlson
2020-11-09 13:33 ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).