git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] submodule: Port resolve_relative_url from shell to C
@ 2015-12-10  1:07 Stefan Beller
  2015-12-10  6:48 ` Johannes Sixt
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Stefan Beller @ 2015-12-10  1:07 UTC (permalink / raw)
  To: gitster; +Cc: git, jens.lehmann, Stefan Beller

This reimplements the helper function `resolve_relative_url` in shell
in C. This functionality is needed in C for introducing the groups
feature later on. When using groups, the user should not need to run
`git submodule init`, but it should be implicit at all appropriate places,
which are all in C code. As the we would not just call out to `git
submodule init`, but do a more fine grained structure there, we actually
need all the init functionality in C before attempting the groups
feature. To get the init functionality in C, rewriting the
resolve_relative_url subfunction is a major step.

This also improves the performance:
(Best out of 3) time ./t7400-submodule-basic.sh
Before:
real	0m9.575s
user	0m2.683s
sys	0m6.773s
After:
real	0m9.293s
user	0m2.691s
sys	0m6.549s

Signed-off-by: Stefan Beller <sbeller@google.com>
---

 This applies on origin/master, and I'd carry as its own feature branch
 as I am nowhere near done with the groups feature after reading Jens feedback.
 (It took me a while to identify this as a next best step.)
 
 Thanks,
 Stefan

 builtin/submodule--helper.c | 120 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  81 ++----------------------------
 2 files changed, 124 insertions(+), 77 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..f48b5b5 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -9,6 +9,125 @@
 #include "submodule-config.h"
 #include "string-list.h"
 #include "run-command.h"
+#include "remote.h"
+#include "refs.h"
+
+static const char *get_default_remote(void)
+{
+	char *dest = NULL;
+	unsigned char sha1[20];
+	int flag;
+	struct strbuf sb = STRBUF_INIT;
+	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
+
+	if (!refname)
+		die("No such ref: HEAD");
+
+	refname = shorten_unambiguous_ref(refname, 0);
+	strbuf_addf(&sb, "branch.%s.remote", refname);
+	if (git_config_get_string(sb.buf, &dest))
+		return "origin";
+	else
+		return xstrdup(dest);
+}
+
+/*
+ * The function takes at most 2 arguments. The first argument is the
+ * URL that navigates to the submodule origin repo. When relative, this URL
+ * is relative to the superproject origin URL repo. The second up_path
+ * argument, if specified, is the relative path that navigates
+ * from the submodule working tree to the superproject working tree.
+ *
+ * The output of the function is the origin URL of the submodule.
+ *
+ * The output will either be an absolute URL or filesystem path (if the
+ * superproject origin URL is an absolute URL or filesystem path,
+ * respectively) or a relative file system path (if the superproject
+ * origin URL is a relative file system path).
+ *
+ * When the output is a relative file system path, the path is either
+ * relative to the submodule working tree, if up_path is specified, or to
+ * the superproject working tree otherwise.
+ */
+static const char *relative_url(const char *url, const char *up_path)
+{
+	int is_relative = 0;
+	size_t len;
+	char *remoteurl = NULL;
+	char *sep = "/";
+	const char *out;
+	struct strbuf sb = STRBUF_INIT;
+	const char *remote = get_default_remote();
+	strbuf_addf(&sb, "remote.%s.url", remote);
+
+	if (git_config_get_string(sb.buf, &remoteurl))
+		/* the repository is its own authoritative upstream */
+		remoteurl = xgetcwd();
+
+	if (strip_suffix(remoteurl, "/", &len))
+		remoteurl[len] = '\0';
+
+	if (strchr(remoteurl, ':') || skip_prefix(remoteurl, "/", &out))
+		is_relative = 0;
+	else if (skip_prefix(remoteurl, "./", &out) ||
+		    skip_prefix(remoteurl, "../", &out))
+		is_relative = 1;
+	else {
+		is_relative = 1;
+		strbuf_reset(&sb);
+		strbuf_addf(&sb, "./%s", remoteurl);
+		remoteurl = strbuf_detach(&sb, NULL);
+	}
+
+	while (url) {
+		if (skip_prefix(url, "../", &out)) {
+			char *rfind;
+			url = out;
+
+			rfind = strrchr(remoteurl, '/');
+			if (rfind)
+				*rfind = '\0';
+			else {
+				rfind = strrchr(remoteurl, ':');
+				if (rfind) {
+					*rfind = '\0';
+					sep = ":";
+				} else {
+					if (is_relative || !strcmp(".", remoteurl))
+						die(N_("cannot strip one component off url '%s'"), remoteurl);
+					else
+						remoteurl = ".";
+				}
+			}
+		} else if (skip_prefix(url, "./", &out))
+			url = out;
+		else
+			break;
+	}
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s%s%s", remoteurl, sep, url);
+
+	if (!skip_prefix(sb.buf, "./", &out))
+		out = sb.buf;
+	out = xstrdup(out);
+
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s%s", is_relative && up_path ? up_path : "", out);
+
+	free((char*)out);
+	return strbuf_detach(&sb, NULL);
+}
+
+static int resolve_relative_url(int argc, const char **argv, const char *prefix)
+{
+	if (argc == 2)
+		printf("%s\n", relative_url(argv[1], NULL));
+	else if (argc == 3)
+		printf("%s\n", relative_url(argv[1], argv[2]));
+	else
+		die("BUG: resolve_relative_url only accepts one or two arguments");
+	return 0;
+}
 
 struct module_list {
 	const struct cache_entry **entries;
@@ -264,6 +383,7 @@ static struct cmd_struct commands[] = {
 	{"list", module_list},
 	{"name", module_name},
 	{"clone", module_clone},
+	{"resolve_relative_url", resolve_relative_url},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 9bc5c5f..6a7a3e4 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -46,79 +46,6 @@ prefix=
 custom_name=
 depth=
 
-# The function takes at most 2 arguments. The first argument is the
-# URL that navigates to the submodule origin repo. When relative, this URL
-# is relative to the superproject origin URL repo. The second up_path
-# argument, if specified, is the relative path that navigates
-# from the submodule working tree to the superproject working tree.
-#
-# The output of the function is the origin URL of the submodule.
-#
-# The output will either be an absolute URL or filesystem path (if the
-# superproject origin URL is an absolute URL or filesystem path,
-# respectively) or a relative file system path (if the superproject
-# origin URL is a relative file system path).
-#
-# When the output is a relative file system path, the path is either
-# relative to the submodule working tree, if up_path is specified, or to
-# the superproject working tree otherwise.
-resolve_relative_url ()
-{
-	remote=$(get_default_remote)
-	remoteurl=$(git config "remote.$remote.url") ||
-		remoteurl=$(pwd) # the repository is its own authoritative upstream
-	url="$1"
-	remoteurl=${remoteurl%/}
-	sep=/
-	up_path="$2"
-
-	case "$remoteurl" in
-	*:*|/*)
-		is_relative=
-		;;
-	./*|../*)
-		is_relative=t
-		;;
-	*)
-		is_relative=t
-		remoteurl="./$remoteurl"
-		;;
-	esac
-
-	while test -n "$url"
-	do
-		case "$url" in
-		../*)
-			url="${url#../}"
-			case "$remoteurl" in
-			*/*)
-				remoteurl="${remoteurl%/*}"
-				;;
-			*:*)
-				remoteurl="${remoteurl%:*}"
-				sep=:
-				;;
-			*)
-				if test -z "$is_relative" || test "." = "$remoteurl"
-				then
-					die "$(eval_gettext "cannot strip one component off url '\$remoteurl'")"
-				else
-					remoteurl=.
-				fi
-				;;
-			esac
-			;;
-		./*)
-			url="${url#./}"
-			;;
-		*)
-			break;;
-		esac
-	done
-	remoteurl="$remoteurl$sep${url%/}"
-	echo "${is_relative:+${up_path}}${remoteurl#./}"
-}
-
 # Resolve a path to be relative to another path.  This is intended for
 # converting submodule paths when git-submodule is run in a subdirectory
 # and only handles paths where the directory separator is '/'.
@@ -281,7 +208,7 @@ cmd_add()
 		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
 
 		# dereference source url relative to parent's url
-		realrepo=$(resolve_relative_url "$repo") || exit
+		realrepo=$(git submodule--helper resolve_relative_url "$repo") || exit
 		;;
 	*:*|/*)
 		# absolute url
@@ -485,7 +412,7 @@ cmd_init()
 			# Possibly a url relative to parent
 			case "$url" in
 			./*|../*)
-				url=$(resolve_relative_url "$url") || exit
+				url=$(git submodule--helper resolve_relative_url "$url") || exit
 				;;
 			esac
 			git config submodule."$name".url "$url" ||
@@ -1190,9 +1117,9 @@ cmd_sync()
 			# guarantee a trailing /
 			up_path=${up_path%/}/ &&
 			# path from submodule work tree to submodule origin repo
-			sub_origin_url=$(resolve_relative_url "$url" "$up_path") &&
+			sub_origin_url=$(git submodule--helper resolve_relative_url "$url" "$up_path") &&
 			# path from superproject work tree to submodule origin repo
-			super_config_url=$(resolve_relative_url "$url") || exit
+			super_config_url=$(git submodule--helper resolve_relative_url "$url") || exit
 			;;
 		*)
 			sub_origin_url="$url"
-- 
2.6.3.470.g4b82c23.dirty

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2015-12-10  1:07 [PATCH] submodule: Port resolve_relative_url from shell to C Stefan Beller
@ 2015-12-10  6:48 ` Johannes Sixt
  2015-12-16 22:36   ` Stefan Beller
  2015-12-17  0:26 ` [PATCHv2] Porting " Stefan Beller
  2015-12-17  0:26 ` [PATCHv2] submodule: Port " Stefan Beller
  2 siblings, 1 reply; 23+ messages in thread
From: Johannes Sixt @ 2015-12-10  6:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, git, jens.lehmann

Am 10.12.2015 um 02:07 schrieb Stefan Beller:
> This reimplements the helper function `resolve_relative_url` in shell
> in C. This functionality is needed in C for introducing the groups
> feature later on. When using groups, the user should not need to run
> `git submodule init`, but it should be implicit at all appropriate places,
> which are all in C code. As the we would not just call out to `git
> submodule init`, but do a more fine grained structure there, we actually
> need all the init functionality in C before attempting the groups
> feature. To get the init functionality in C, rewriting the
> resolve_relative_url subfunction is a major step.

I see lots of '/', but no is_dir_sep() in the C version. Did you 
consider that local URLs can use a backslash as path separator on 
Windows? In the shell version, this did not matter because bash converts 
the backslashes to forward slashes for us. But when rewritten in C, this 
does not happen.

Valid URLs are

   D:\foo\bar.git
   \\server\share\foo\bar
   ..\..\foo\bar

and all of them with some or all backslashes replaced by forward slashes.

See also connect.c:url_is_local_not_ssh, which ensures that the first 
example above is considered a local path with a drive letter, not a 
remote ssh path.

>
> This also improves the performance:
> (Best out of 3) time ./t7400-submodule-basic.sh
> Before:
> real	0m9.575s
> user	0m2.683s
> sys	0m6.773s
> After:
> real	0m9.293s
> user	0m2.691s
> sys	0m6.549s
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>
>   This applies on origin/master, and I'd carry as its own feature branch
>   as I am nowhere near done with the groups feature after reading Jens feedback.
>   (It took me a while to identify this as a next best step.)
>
>   Thanks,
>   Stefan
>
>   builtin/submodule--helper.c | 120 ++++++++++++++++++++++++++++++++++++++++++++
>   git-submodule.sh            |  81 ++----------------------------
>   2 files changed, 124 insertions(+), 77 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index f4c3eff..f48b5b5 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -9,6 +9,125 @@
>   #include "submodule-config.h"
>   #include "string-list.h"
>   #include "run-command.h"
> +#include "remote.h"
> +#include "refs.h"
> +
> +static const char *get_default_remote(void)
> +{
> +	char *dest = NULL;
> +	unsigned char sha1[20];
> +	int flag;
> +	struct strbuf sb = STRBUF_INIT;
> +	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
> +
> +	if (!refname)
> +		die("No such ref: HEAD");
> +
> +	refname = shorten_unambiguous_ref(refname, 0);
> +	strbuf_addf(&sb, "branch.%s.remote", refname);
> +	if (git_config_get_string(sb.buf, &dest))
> +		return "origin";
> +	else
> +		return xstrdup(dest);
> +}
> +
> +/*
> + * The function takes at most 2 arguments. The first argument is the
> + * URL that navigates to the submodule origin repo. When relative, this URL
> + * is relative to the superproject origin URL repo. The second up_path
> + * argument, if specified, is the relative path that navigates
> + * from the submodule working tree to the superproject working tree.
> + *
> + * The output of the function is the origin URL of the submodule.
> + *
> + * The output will either be an absolute URL or filesystem path (if the
> + * superproject origin URL is an absolute URL or filesystem path,
> + * respectively) or a relative file system path (if the superproject
> + * origin URL is a relative file system path).
> + *
> + * When the output is a relative file system path, the path is either
> + * relative to the submodule working tree, if up_path is specified, or to
> + * the superproject working tree otherwise.
> + */
> +static const char *relative_url(const char *url, const char *up_path)
> +{
> +	int is_relative = 0;
> +	size_t len;
> +	char *remoteurl = NULL;
> +	char *sep = "/";
> +	const char *out;
> +	struct strbuf sb = STRBUF_INIT;
> +	const char *remote = get_default_remote();
> +	strbuf_addf(&sb, "remote.%s.url", remote);
> +
> +	if (git_config_get_string(sb.buf, &remoteurl))
> +		/* the repository is its own authoritative upstream */
> +		remoteurl = xgetcwd();
> +
> +	if (strip_suffix(remoteurl, "/", &len))
> +		remoteurl[len] = '\0';
> +
> +	if (strchr(remoteurl, ':') || skip_prefix(remoteurl, "/", &out))
> +		is_relative = 0;
> +	else if (skip_prefix(remoteurl, "./", &out) ||
> +		    skip_prefix(remoteurl, "../", &out))
> +		is_relative = 1;
> +	else {
> +		is_relative = 1;
> +		strbuf_reset(&sb);
> +		strbuf_addf(&sb, "./%s", remoteurl);
> +		remoteurl = strbuf_detach(&sb, NULL);
> +	}
> +
> +	while (url) {
> +		if (skip_prefix(url, "../", &out)) {
> +			char *rfind;
> +			url = out;
> +
> +			rfind = strrchr(remoteurl, '/');
> +			if (rfind)
> +				*rfind = '\0';
> +			else {
> +				rfind = strrchr(remoteurl, ':');
> +				if (rfind) {
> +					*rfind = '\0';
> +					sep = ":";
> +				} else {
> +					if (is_relative || !strcmp(".", remoteurl))
> +						die(N_("cannot strip one component off url '%s'"), remoteurl);
> +					else
> +						remoteurl = ".";
> +				}
> +			}
> +		} else if (skip_prefix(url, "./", &out))
> +			url = out;
> +		else
> +			break;
> +	}
> +	strbuf_reset(&sb);
> +	strbuf_addf(&sb, "%s%s%s", remoteurl, sep, url);
> +
> +	if (!skip_prefix(sb.buf, "./", &out))
> +		out = sb.buf;
> +	out = xstrdup(out);
> +
> +	strbuf_reset(&sb);
> +	strbuf_addf(&sb, "%s%s", is_relative && up_path ? up_path : "", out);
> +
> +	free((char*)out);
> +	return strbuf_detach(&sb, NULL);
> +}
> +
> +static int resolve_relative_url(int argc, const char **argv, const char *prefix)
> +{
> +	if (argc == 2)
> +		printf("%s\n", relative_url(argv[1], NULL));
> +	else if (argc == 3)
> +		printf("%s\n", relative_url(argv[1], argv[2]));
> +	else
> +		die("BUG: resolve_relative_url only accepts one or two arguments");
> +	return 0;
> +}
>
>   struct module_list {
>   	const struct cache_entry **entries;
> @@ -264,6 +383,7 @@ static struct cmd_struct commands[] = {
>   	{"list", module_list},
>   	{"name", module_name},
>   	{"clone", module_clone},
> +	{"resolve_relative_url", resolve_relative_url},
>   };
>
>   int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 9bc5c5f..6a7a3e4 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -46,79 +46,6 @@ prefix=
>   custom_name=
>   depth=
>
> -# The function takes at most 2 arguments. The first argument is the
> -# URL that navigates to the submodule origin repo. When relative, this URL
> -# is relative to the superproject origin URL repo. The second up_path
> -# argument, if specified, is the relative path that navigates
> -# from the submodule working tree to the superproject working tree.
> -#
> -# The output of the function is the origin URL of the submodule.
> -#
> -# The output will either be an absolute URL or filesystem path (if the
> -# superproject origin URL is an absolute URL or filesystem path,
> -# respectively) or a relative file system path (if the superproject
> -# origin URL is a relative file system path).
> -#
> -# When the output is a relative file system path, the path is either
> -# relative to the submodule working tree, if up_path is specified, or to
> -# the superproject working tree otherwise.
> -resolve_relative_url ()
> -{
> -	remote=$(get_default_remote)
> -	remoteurl=$(git config "remote.$remote.url") ||
> -		remoteurl=$(pwd) # the repository is its own authoritative upstream
> -	url="$1"
> -	remoteurl=${remoteurl%/}
> -	sep=/
> -	up_path="$2"
> -
> -	case "$remoteurl" in
> -	*:*|/*)
> -		is_relative=
> -		;;
> -	./*|../*)
> -		is_relative=t
> -		;;
> -	*)
> -		is_relative=t
> -		remoteurl="./$remoteurl"
> -		;;
> -	esac
> -
> -	while test -n "$url"
> -	do
> -		case "$url" in
> -		../*)
> -			url="${url#../}"
> -			case "$remoteurl" in
> -			*/*)
> -				remoteurl="${remoteurl%/*}"
> -				;;
> -			*:*)
> -				remoteurl="${remoteurl%:*}"
> -				sep=:
> -				;;
> -			*)
> -				if test -z "$is_relative" || test "." = "$remoteurl"
> -				then
> -					die "$(eval_gettext "cannot strip one component off url '\$remoteurl'")"
> -				else
> -					remoteurl=.
> -				fi
> -				;;
> -			esac
> -			;;
> -		./*)
> -			url="${url#./}"
> -			;;
> -		*)
> -			break;;
> -		esac
> -	done
> -	remoteurl="$remoteurl$sep${url%/}"
> -	echo "${is_relative:+${up_path}}${remoteurl#./}"
> -}
> -
>   # Resolve a path to be relative to another path.  This is intended for
>   # converting submodule paths when git-submodule is run in a subdirectory
>   # and only handles paths where the directory separator is '/'.
> @@ -281,7 +208,7 @@ cmd_add()
>   		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
>
>   		# dereference source url relative to parent's url
> -		realrepo=$(resolve_relative_url "$repo") || exit
> +		realrepo=$(git submodule--helper resolve_relative_url "$repo") || exit
>   		;;
>   	*:*|/*)
>   		# absolute url
> @@ -485,7 +412,7 @@ cmd_init()
>   			# Possibly a url relative to parent
>   			case "$url" in
>   			./*|../*)
> -				url=$(resolve_relative_url "$url") || exit
> +				url=$(git submodule--helper resolve_relative_url "$url") || exit
>   				;;
>   			esac
>   			git config submodule."$name".url "$url" ||
> @@ -1190,9 +1117,9 @@ cmd_sync()
>   			# guarantee a trailing /
>   			up_path=${up_path%/}/ &&
>   			# path from submodule work tree to submodule origin repo
> -			sub_origin_url=$(resolve_relative_url "$url" "$up_path") &&
> +			sub_origin_url=$(git submodule--helper resolve_relative_url "$url" "$up_path") &&
>   			# path from superproject work tree to submodule origin repo
> -			super_config_url=$(resolve_relative_url "$url") || exit
> +			super_config_url=$(git submodule--helper resolve_relative_url "$url") || exit
>   			;;
>   		*)
>   			sub_origin_url="$url"
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2015-12-10  6:48 ` Johannes Sixt
@ 2015-12-16 22:36   ` Stefan Beller
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2015-12-16 22:36 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git@vger.kernel.org, Jens Lehmann

On Wed, Dec 9, 2015 at 10:48 PM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 10.12.2015 um 02:07 schrieb Stefan Beller:
>>
>> This reimplements the helper function `resolve_relative_url` in shell
>> in C. This functionality is needed in C for introducing the groups
>> feature later on. When using groups, the user should not need to run
>> `git submodule init`, but it should be implicit at all appropriate places,
>> which are all in C code. As the we would not just call out to `git
>> submodule init`, but do a more fine grained structure there, we actually
>> need all the init functionality in C before attempting the groups
>> feature. To get the init functionality in C, rewriting the
>> resolve_relative_url subfunction is a major step.
>
>
> I see lots of '/', but no is_dir_sep() in the C version. Did you consider
> that local URLs can use a backslash as path separator on Windows? In the
> shell version, this did not matter because bash converts the backslashes to
> forward slashes for us. But when rewritten in C, this does not happen.

I see. That's a pity. :(

>
> Valid URLs are
>
>   D:\foo\bar.git
>   \\server\share\foo\bar
>   ..\..\foo\bar

I am staring at the code in desperation as backslashes
in Linux are valid for file names, i.e.:
"/tmp/testfile\" is a valid filename

Now look at the code where the first slash occurs, it's

    if (strip_suffix(remoteurl, "/", &len))
        remoteurl[len] = '\0';

The intention is to strip off the last character if it is a directory separator.
So in the unix world we want to keep "/tmp/testfile\" as it is,
whereas in Windows
we want to chop off the backslash (because there is no file with a
backslash allowed,
it is the directory separator?)

So what I think I am going to do for next round is something like

    static int has_same_dir_prefix(const char *str, const char **out)
    {
    #ifdef GIT_WINDOWS_NATIVE
        return skip_prefix(str, "./", out)
           || skip_prefix(str, ".\\", out);
    #else
        return skip_prefix(str, "./", out);
    #endif
    }

    static int has_upper_dir_prefix(const char *str, const char **out)
    {
    #ifdef GIT_WINDOWS_NATIVE
        return skip_prefix(str, "../", out)
           || skip_prefix(str, "..\\", out);
    #else
        return skip_prefix(str, "../", out);
    #endif
    }

in the submodule helper function, or alternatively in the wrapper.c ?
and then rely on these functions being accurate.

Would that approach make sense?

Thanks,
Stefan


>
> and all of them with some or all backslashes replaced by forward slashes.
>
> See also connect.c:url_is_local_not_ssh, which ensures that the first
> example above is considered a local path with a drive letter, not a remote
> ssh path.
>
>
>>
>> This also improves the performance:
>> (Best out of 3) time ./t7400-submodule-basic.sh
>> Before:
>> real    0m9.575s
>> user    0m2.683s
>> sys     0m6.773s
>> After:
>> real    0m9.293s
>> user    0m2.691s
>> sys     0m6.549s
>>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>
>>   This applies on origin/master, and I'd carry as its own feature branch
>>   as I am nowhere near done with the groups feature after reading Jens
>> feedback.
>>   (It took me a while to identify this as a next best step.)
>>
>>   Thanks,
>>   Stefan
>>
>>   builtin/submodule--helper.c | 120
>> ++++++++++++++++++++++++++++++++++++++++++++
>>   git-submodule.sh            |  81 ++----------------------------
>>   2 files changed, 124 insertions(+), 77 deletions(-)
>>
>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>> index f4c3eff..f48b5b5 100644
>> --- a/builtin/submodule--helper.c
>> +++ b/builtin/submodule--helper.c
>> @@ -9,6 +9,125 @@
>>   #include "submodule-config.h"
>>   #include "string-list.h"
>>   #include "run-command.h"
>> +#include "remote.h"
>> +#include "refs.h"
>> +
>> +static const char *get_default_remote(void)
>> +{
>> +       char *dest = NULL;
>> +       unsigned char sha1[20];
>> +       int flag;
>> +       struct strbuf sb = STRBUF_INIT;
>> +       const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
>> +
>> +       if (!refname)
>> +               die("No such ref: HEAD");
>> +
>> +       refname = shorten_unambiguous_ref(refname, 0);
>> +       strbuf_addf(&sb, "branch.%s.remote", refname);
>> +       if (git_config_get_string(sb.buf, &dest))
>> +               return "origin";
>> +       else
>> +               return xstrdup(dest);
>> +}
>> +
>> +/*
>> + * The function takes at most 2 arguments. The first argument is the
>> + * URL that navigates to the submodule origin repo. When relative, this
>> URL
>> + * is relative to the superproject origin URL repo. The second up_path
>> + * argument, if specified, is the relative path that navigates
>> + * from the submodule working tree to the superproject working tree.
>> + *
>> + * The output of the function is the origin URL of the submodule.
>> + *
>> + * The output will either be an absolute URL or filesystem path (if the
>> + * superproject origin URL is an absolute URL or filesystem path,
>> + * respectively) or a relative file system path (if the superproject
>> + * origin URL is a relative file system path).
>> + *
>> + * When the output is a relative file system path, the path is either
>> + * relative to the submodule working tree, if up_path is specified, or to
>> + * the superproject working tree otherwise.
>> + */
>> +static const char *relative_url(const char *url, const char *up_path)
>> +{
>> +       int is_relative = 0;
>> +       size_t len;
>> +       char *remoteurl = NULL;
>> +       char *sep = "/";
>> +       const char *out;
>> +       struct strbuf sb = STRBUF_INIT;
>> +       const char *remote = get_default_remote();
>> +       strbuf_addf(&sb, "remote.%s.url", remote);
>> +
>> +       if (git_config_get_string(sb.buf, &remoteurl))
>> +               /* the repository is its own authoritative upstream */
>> +               remoteurl = xgetcwd();
>> +
>> +       if (strip_suffix(remoteurl, "/", &len))
>> +               remoteurl[len] = '\0';
>> +
>> +       if (strchr(remoteurl, ':') || skip_prefix(remoteurl, "/", &out))
>> +               is_relative = 0;
>> +       else if (skip_prefix(remoteurl, "./", &out) ||
>> +                   skip_prefix(remoteurl, "../", &out))
>> +               is_relative = 1;
>> +       else {
>> +               is_relative = 1;
>> +               strbuf_reset(&sb);
>> +               strbuf_addf(&sb, "./%s", remoteurl);
>> +               remoteurl = strbuf_detach(&sb, NULL);
>> +       }
>> +
>> +       while (url) {
>> +               if (skip_prefix(url, "../", &out)) {
>> +                       char *rfind;
>> +                       url = out;
>> +
>> +                       rfind = strrchr(remoteurl, '/');
>> +                       if (rfind)
>> +                               *rfind = '\0';
>> +                       else {
>> +                               rfind = strrchr(remoteurl, ':');
>> +                               if (rfind) {
>> +                                       *rfind = '\0';
>> +                                       sep = ":";
>> +                               } else {
>> +                                       if (is_relative || !strcmp(".",
>> remoteurl))
>> +                                               die(N_("cannot strip one
>> component off url '%s'"), remoteurl);
>> +                                       else
>> +                                               remoteurl = ".";
>> +                               }
>> +                       }
>> +               } else if (skip_prefix(url, "./", &out))
>> +                       url = out;
>> +               else
>> +                       break;
>> +       }
>> +       strbuf_reset(&sb);
>> +       strbuf_addf(&sb, "%s%s%s", remoteurl, sep, url);
>> +
>> +       if (!skip_prefix(sb.buf, "./", &out))
>> +               out = sb.buf;
>> +       out = xstrdup(out);
>> +
>> +       strbuf_reset(&sb);
>> +       strbuf_addf(&sb, "%s%s", is_relative && up_path ? up_path : "",
>> out);
>> +
>> +       free((char*)out);
>> +       return strbuf_detach(&sb, NULL);
>> +}
>> +
>> +static int resolve_relative_url(int argc, const char **argv, const char
>> *prefix)
>> +{
>> +       if (argc == 2)
>> +               printf("%s\n", relative_url(argv[1], NULL));
>> +       else if (argc == 3)
>> +               printf("%s\n", relative_url(argv[1], argv[2]));
>> +       else
>> +               die("BUG: resolve_relative_url only accepts one or two
>> arguments");
>> +       return 0;
>> +}
>>
>>   struct module_list {
>>         const struct cache_entry **entries;
>> @@ -264,6 +383,7 @@ static struct cmd_struct commands[] = {
>>         {"list", module_list},
>>         {"name", module_name},
>>         {"clone", module_clone},
>> +       {"resolve_relative_url", resolve_relative_url},
>>   };
>>
>>   int cmd_submodule__helper(int argc, const char **argv, const char
>> *prefix)
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index 9bc5c5f..6a7a3e4 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -46,79 +46,6 @@ prefix=
>>   custom_name=
>>   depth=
>>
>> -# The function takes at most 2 arguments. The first argument is the
>> -# URL that navigates to the submodule origin repo. When relative, this
>> URL
>> -# is relative to the superproject origin URL repo. The second up_path
>> -# argument, if specified, is the relative path that navigates
>> -# from the submodule working tree to the superproject working tree.
>> -#
>> -# The output of the function is the origin URL of the submodule.
>> -#
>> -# The output will either be an absolute URL or filesystem path (if the
>> -# superproject origin URL is an absolute URL or filesystem path,
>> -# respectively) or a relative file system path (if the superproject
>> -# origin URL is a relative file system path).
>> -#
>> -# When the output is a relative file system path, the path is either
>> -# relative to the submodule working tree, if up_path is specified, or to
>> -# the superproject working tree otherwise.
>> -resolve_relative_url ()
>> -{
>> -       remote=$(get_default_remote)
>> -       remoteurl=$(git config "remote.$remote.url") ||
>> -               remoteurl=$(pwd) # the repository is its own authoritative
>> upstream
>> -       url="$1"
>> -       remoteurl=${remoteurl%/}
>> -       sep=/
>> -       up_path="$2"
>> -
>> -       case "$remoteurl" in
>> -       *:*|/*)
>> -               is_relative=
>> -               ;;
>> -       ./*|../*)
>> -               is_relative=t
>> -               ;;
>> -       *)
>> -               is_relative=t
>> -               remoteurl="./$remoteurl"
>> -               ;;
>> -       esac
>> -
>> -       while test -n "$url"
>> -       do
>> -               case "$url" in
>> -               ../*)
>> -                       url="${url#../}"
>> -                       case "$remoteurl" in
>> -                       */*)
>> -                               remoteurl="${remoteurl%/*}"
>> -                               ;;
>> -                       *:*)
>> -                               remoteurl="${remoteurl%:*}"
>> -                               sep=:
>> -                               ;;
>> -                       *)
>> -                               if test -z "$is_relative" || test "." =
>> "$remoteurl"
>> -                               then
>> -                                       die "$(eval_gettext "cannot strip
>> one component off url '\$remoteurl'")"
>> -                               else
>> -                                       remoteurl=.
>> -                               fi
>> -                               ;;
>> -                       esac
>> -                       ;;
>> -               ./*)
>> -                       url="${url#./}"
>> -                       ;;
>> -               *)
>> -                       break;;
>> -               esac
>> -       done
>> -       remoteurl="$remoteurl$sep${url%/}"
>> -       echo "${is_relative:+${up_path}}${remoteurl#./}"
>> -}
>> -
>>   # Resolve a path to be relative to another path.  This is intended for
>>   # converting submodule paths when git-submodule is run in a subdirectory
>>   # and only handles paths where the directory separator is '/'.
>> @@ -281,7 +208,7 @@ cmd_add()
>>                 die "$(gettext "Relative path can only be used from the
>> toplevel of the working tree")"
>>
>>                 # dereference source url relative to parent's url
>> -               realrepo=$(resolve_relative_url "$repo") || exit
>> +               realrepo=$(git submodule--helper resolve_relative_url
>> "$repo") || exit
>>                 ;;
>>         *:*|/*)
>>                 # absolute url
>> @@ -485,7 +412,7 @@ cmd_init()
>>                         # Possibly a url relative to parent
>>                         case "$url" in
>>                         ./*|../*)
>> -                               url=$(resolve_relative_url "$url") || exit
>> +                               url=$(git submodule--helper
>> resolve_relative_url "$url") || exit
>>                                 ;;
>>                         esac
>>                         git config submodule."$name".url "$url" ||
>> @@ -1190,9 +1117,9 @@ cmd_sync()
>>                         # guarantee a trailing /
>>                         up_path=${up_path%/}/ &&
>>                         # path from submodule work tree to submodule
>> origin repo
>> -                       sub_origin_url=$(resolve_relative_url "$url"
>> "$up_path") &&
>> +                       sub_origin_url=$(git submodule--helper
>> resolve_relative_url "$url" "$up_path") &&
>>                         # path from superproject work tree to submodule
>> origin repo
>> -                       super_config_url=$(resolve_relative_url "$url") ||
>> exit
>> +                       super_config_url=$(git submodule--helper
>> resolve_relative_url "$url") || exit
>>                         ;;
>>                 *)
>>                         sub_origin_url="$url"
>>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCHv2] Porting resolve_relative_url from shell to C
  2015-12-10  1:07 [PATCH] submodule: Port resolve_relative_url from shell to C Stefan Beller
  2015-12-10  6:48 ` Johannes Sixt
@ 2015-12-17  0:26 ` Stefan Beller
  2015-12-17  7:47   ` Torsten Bögershausen
  2015-12-17  0:26 ` [PATCHv2] submodule: Port " Stefan Beller
  2 siblings, 1 reply; 23+ messages in thread
From: Stefan Beller @ 2015-12-17  0:26 UTC (permalink / raw)
  To: sbeller, git; +Cc: gitster, jens.lehmann, j6t

A new version of the patch, which spells out more its intent and
may actually work in Windows.

Any comment welcome,
Thanks,
Stefan

Stefan Beller (1):
  submodule: Port resolve_relative_url from shell to C

 builtin/submodule--helper.c | 151 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  81 ++----------------------
 2 files changed, 155 insertions(+), 77 deletions(-)

interdiff to previous version:

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f48b5b5..b925bed 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -31,6 +31,36 @@ static const char *get_default_remote(void)
 		return xstrdup(dest);
 }
 
+static int has_same_dir_prefix(const char *str, const char **out)
+{
+#ifdef GIT_WINDOWS_NATIVE
+	return skip_prefix(str, "./", out)
+		|| skip_prefix(str, ".\\", out);
+#else
+	return skip_prefix(str, "./", out);
+#endif
+}
+
+static int has_upper_dir_prefix(const char *str, const char **out)
+{
+#ifdef GIT_WINDOWS_NATIVE
+	return skip_prefix(str, "../", out)
+		|| skip_prefix(str, "..\\", out);
+#else
+	return skip_prefix(str, "../", out);
+#endif
+}
+
+static char *last_dir_separator(const char *str)
+{
+#ifdef GIT_WINDOWS_NATIVE
+	return strrchr(str, "/")
+		|| strrchr(str, "\\");
+#else
+	return strrchr(str, '/');
+#endif
+}
+
 /*
  * The function takes at most 2 arguments. The first argument is the
  * URL that navigates to the submodule origin repo. When relative, this URL
@@ -64,13 +94,14 @@ static const char *relative_url(const char *url, const char *up_path)
 		/* the repository is its own authoritative upstream */
 		remoteurl = xgetcwd();
 
-	if (strip_suffix(remoteurl, "/", &len))
+	len = strlen(remoteurl);
+	if (is_dir_sep(remoteurl[len]))
 		remoteurl[len] = '\0';
 
-	if (strchr(remoteurl, ':') || skip_prefix(remoteurl, "/", &out))
+	if (strchr(remoteurl, ':') || is_dir_sep(*remoteurl))
 		is_relative = 0;
-	else if (skip_prefix(remoteurl, "./", &out) ||
-		    skip_prefix(remoteurl, "../", &out))
+	else if (has_same_dir_prefix(remoteurl, &out) ||
+		    has_upper_dir_prefix(remoteurl, &out))
 		is_relative = 1;
 	else {
 		is_relative = 1;
@@ -80,11 +111,11 @@ static const char *relative_url(const char *url, const char *up_path)
 	}
 
 	while (url) {
-		if (skip_prefix(url, "../", &out)) {
+		if (has_upper_dir_prefix(url, &out)) {
 			char *rfind;
 			url = out;
 
-			rfind = strrchr(remoteurl, '/');
+			rfind = last_dir_separator(remoteurl);
 			if (rfind)
 				*rfind = '\0';
 			else {
@@ -99,7 +130,7 @@ static const char *relative_url(const char *url, const char *up_path)
 						remoteurl = ".";
 				}
 			}
-		} else if (skip_prefix(url, "./", &out))
+		} else if (has_same_dir_prefix(url, &out))
 			url = out;
 		else
 			break;
@@ -107,7 +138,7 @@ static const char *relative_url(const char *url, const char *up_path)
 	strbuf_reset(&sb);
 	strbuf_addf(&sb, "%s%s%s", remoteurl, sep, url);
 
-	if (!skip_prefix(sb.buf, "./", &out))
+	if (!has_same_dir_prefix(sb.buf, &out))
 		out = sb.buf;
 	out = xstrdup(out);
 

-- 
2.7.0.rc1.2.gfc39790.dirty

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCHv2] submodule: Port resolve_relative_url from shell to C
  2015-12-10  1:07 [PATCH] submodule: Port resolve_relative_url from shell to C Stefan Beller
  2015-12-10  6:48 ` Johannes Sixt
  2015-12-17  0:26 ` [PATCHv2] Porting " Stefan Beller
@ 2015-12-17  0:26 ` Stefan Beller
  2015-12-17  8:17   ` Eric Sunshine
  2015-12-17 18:55   ` Johannes Sixt
  2 siblings, 2 replies; 23+ messages in thread
From: Stefan Beller @ 2015-12-17  0:26 UTC (permalink / raw)
  To: sbeller, git; +Cc: gitster, jens.lehmann, j6t

This reimplements the helper function `resolve_relative_url` in shell
in C. This functionality is needed in C for introducing the groups
feature later on. When using groups, the user should not need to run
`git submodule init`, but it should be implicit at all appropriate places,
which are all in C code. As the we would not just call out to `git
submodule init`, but do a more fine grained structure there, we actually
need all the init functionality in C before attempting the groups
feature. To get the init functionality in C, rewriting the
resolve_relative_url subfunction is a major step.

This also improves the performance:
(Best out of 3) time ./t7400-submodule-basic.sh
Before:
real	0m9.575s
user	0m2.683s
sys	0m6.773s
After:
real	0m9.293s
user	0m2.691s
sys	0m6.549s

That's about 3%.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 151 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  81 ++----------------------
 2 files changed, 155 insertions(+), 77 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..b925bed 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -9,6 +9,156 @@
 #include "submodule-config.h"
 #include "string-list.h"
 #include "run-command.h"
+#include "remote.h"
+#include "refs.h"
+
+static const char *get_default_remote(void)
+{
+	char *dest = NULL;
+	unsigned char sha1[20];
+	int flag;
+	struct strbuf sb = STRBUF_INIT;
+	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
+
+	if (!refname)
+		die("No such ref: HEAD");
+
+	refname = shorten_unambiguous_ref(refname, 0);
+	strbuf_addf(&sb, "branch.%s.remote", refname);
+	if (git_config_get_string(sb.buf, &dest))
+		return "origin";
+	else
+		return xstrdup(dest);
+}
+
+static int has_same_dir_prefix(const char *str, const char **out)
+{
+#ifdef GIT_WINDOWS_NATIVE
+	return skip_prefix(str, "./", out)
+		|| skip_prefix(str, ".\\", out);
+#else
+	return skip_prefix(str, "./", out);
+#endif
+}
+
+static int has_upper_dir_prefix(const char *str, const char **out)
+{
+#ifdef GIT_WINDOWS_NATIVE
+	return skip_prefix(str, "../", out)
+		|| skip_prefix(str, "..\\", out);
+#else
+	return skip_prefix(str, "../", out);
+#endif
+}
+
+static char *last_dir_separator(const char *str)
+{
+#ifdef GIT_WINDOWS_NATIVE
+	return strrchr(str, "/")
+		|| strrchr(str, "\\");
+#else
+	return strrchr(str, '/');
+#endif
+}
+
+/*
+ * The function takes at most 2 arguments. The first argument is the
+ * URL that navigates to the submodule origin repo. When relative, this URL
+ * is relative to the superproject origin URL repo. The second up_path
+ * argument, if specified, is the relative path that navigates
+ * from the submodule working tree to the superproject working tree.
+ *
+ * The output of the function is the origin URL of the submodule.
+ *
+ * The output will either be an absolute URL or filesystem path (if the
+ * superproject origin URL is an absolute URL or filesystem path,
+ * respectively) or a relative file system path (if the superproject
+ * origin URL is a relative file system path).
+ *
+ * When the output is a relative file system path, the path is either
+ * relative to the submodule working tree, if up_path is specified, or to
+ * the superproject working tree otherwise.
+ */
+static const char *relative_url(const char *url, const char *up_path)
+{
+	int is_relative = 0;
+	size_t len;
+	char *remoteurl = NULL;
+	char *sep = "/";
+	const char *out;
+	struct strbuf sb = STRBUF_INIT;
+	const char *remote = get_default_remote();
+	strbuf_addf(&sb, "remote.%s.url", remote);
+
+	if (git_config_get_string(sb.buf, &remoteurl))
+		/* the repository is its own authoritative upstream */
+		remoteurl = xgetcwd();
+
+	len = strlen(remoteurl);
+	if (is_dir_sep(remoteurl[len]))
+		remoteurl[len] = '\0';
+
+	if (strchr(remoteurl, ':') || is_dir_sep(*remoteurl))
+		is_relative = 0;
+	else if (has_same_dir_prefix(remoteurl, &out) ||
+		    has_upper_dir_prefix(remoteurl, &out))
+		is_relative = 1;
+	else {
+		is_relative = 1;
+		strbuf_reset(&sb);
+		strbuf_addf(&sb, "./%s", remoteurl);
+		remoteurl = strbuf_detach(&sb, NULL);
+	}
+
+	while (url) {
+		if (has_upper_dir_prefix(url, &out)) {
+			char *rfind;
+			url = out;
+
+			rfind = last_dir_separator(remoteurl);
+			if (rfind)
+				*rfind = '\0';
+			else {
+				rfind = strrchr(remoteurl, ':');
+				if (rfind) {
+					*rfind = '\0';
+					sep = ":";
+				} else {
+					if (is_relative || !strcmp(".", remoteurl))
+						die(N_("cannot strip one component off url '%s'"), remoteurl);
+					else
+						remoteurl = ".";
+				}
+			}
+		} else if (has_same_dir_prefix(url, &out))
+			url = out;
+		else
+			break;
+	}
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s%s%s", remoteurl, sep, url);
+
+	if (!has_same_dir_prefix(sb.buf, &out))
+		out = sb.buf;
+	out = xstrdup(out);
+
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s%s", is_relative && up_path ? up_path : "", out);
+
+	free((char*)out);
+	return strbuf_detach(&sb, NULL);
+}
+
+static int resolve_relative_url(int argc, const char **argv, const char *prefix)
+{
+	if (argc == 2)
+		printf("%s\n", relative_url(argv[1], NULL));
+	else if (argc == 3)
+		printf("%s\n", relative_url(argv[1], argv[2]));
+	else
+		die("BUG: resolve_relative_url only accepts one or two arguments");
+	return 0;
+}
 
 struct module_list {
 	const struct cache_entry **entries;
@@ -264,6 +414,7 @@ static struct cmd_struct commands[] = {
 	{"list", module_list},
 	{"name", module_name},
 	{"clone", module_clone},
+	{"resolve_relative_url", resolve_relative_url},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 9bc5c5f..6a7a3e4 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -46,79 +46,6 @@ prefix=
 custom_name=
 depth=
 
-# The function takes at most 2 arguments. The first argument is the
-# URL that navigates to the submodule origin repo. When relative, this URL
-# is relative to the superproject origin URL repo. The second up_path
-# argument, if specified, is the relative path that navigates
-# from the submodule working tree to the superproject working tree.
-#
-# The output of the function is the origin URL of the submodule.
-#
-# The output will either be an absolute URL or filesystem path (if the
-# superproject origin URL is an absolute URL or filesystem path,
-# respectively) or a relative file system path (if the superproject
-# origin URL is a relative file system path).
-#
-# When the output is a relative file system path, the path is either
-# relative to the submodule working tree, if up_path is specified, or to
-# the superproject working tree otherwise.
-resolve_relative_url ()
-{
-	remote=$(get_default_remote)
-	remoteurl=$(git config "remote.$remote.url") ||
-		remoteurl=$(pwd) # the repository is its own authoritative upstream
-	url="$1"
-	remoteurl=${remoteurl%/}
-	sep=/
-	up_path="$2"
-
-	case "$remoteurl" in
-	*:*|/*)
-		is_relative=
-		;;
-	./*|../*)
-		is_relative=t
-		;;
-	*)
-		is_relative=t
-		remoteurl="./$remoteurl"
-		;;
-	esac
-
-	while test -n "$url"
-	do
-		case "$url" in
-		../*)
-			url="${url#../}"
-			case "$remoteurl" in
-			*/*)
-				remoteurl="${remoteurl%/*}"
-				;;
-			*:*)
-				remoteurl="${remoteurl%:*}"
-				sep=:
-				;;
-			*)
-				if test -z "$is_relative" || test "." = "$remoteurl"
-				then
-					die "$(eval_gettext "cannot strip one component off url '\$remoteurl'")"
-				else
-					remoteurl=.
-				fi
-				;;
-			esac
-			;;
-		./*)
-			url="${url#./}"
-			;;
-		*)
-			break;;
-		esac
-	done
-	remoteurl="$remoteurl$sep${url%/}"
-	echo "${is_relative:+${up_path}}${remoteurl#./}"
-}
-
 # Resolve a path to be relative to another path.  This is intended for
 # converting submodule paths when git-submodule is run in a subdirectory
 # and only handles paths where the directory separator is '/'.
@@ -281,7 +208,7 @@ cmd_add()
 		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
 
 		# dereference source url relative to parent's url
-		realrepo=$(resolve_relative_url "$repo") || exit
+		realrepo=$(git submodule--helper resolve_relative_url "$repo") || exit
 		;;
 	*:*|/*)
 		# absolute url
@@ -485,7 +412,7 @@ cmd_init()
 			# Possibly a url relative to parent
 			case "$url" in
 			./*|../*)
-				url=$(resolve_relative_url "$url") || exit
+				url=$(git submodule--helper resolve_relative_url "$url") || exit
 				;;
 			esac
 			git config submodule."$name".url "$url" ||
@@ -1190,9 +1117,9 @@ cmd_sync()
 			# guarantee a trailing /
 			up_path=${up_path%/}/ &&
 			# path from submodule work tree to submodule origin repo
-			sub_origin_url=$(resolve_relative_url "$url" "$up_path") &&
+			sub_origin_url=$(git submodule--helper resolve_relative_url "$url" "$up_path") &&
 			# path from superproject work tree to submodule origin repo
-			super_config_url=$(resolve_relative_url "$url") || exit
+			super_config_url=$(git submodule--helper resolve_relative_url "$url") || exit
 			;;
 		*)
 			sub_origin_url="$url"
-- 
2.7.0.rc1.2.gfc39790.dirty

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCHv2] Porting resolve_relative_url from shell to C
  2015-12-17  0:26 ` [PATCHv2] Porting " Stefan Beller
@ 2015-12-17  7:47   ` Torsten Bögershausen
  0 siblings, 0 replies; 23+ messages in thread
From: Torsten Bögershausen @ 2015-12-17  7:47 UTC (permalink / raw)
  To: Stefan Beller, git; +Cc: gitster, jens.lehmann, j6t

On 17.12.15 01:26, Stefan Beller wrote:
> A new version of the patch, which spells out more its intent and
> may actually work in Windows.
> 
> Any comment welcome,
> Thanks,
> Stefan
> 
> Stefan Beller (1):
>   submodule: Port resolve_relative_url from shell to C
> 
>  builtin/submodule--helper.c | 151 ++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            |  81 ++----------------------
>  2 files changed, 155 insertions(+), 77 deletions(-)
> 
> interdiff to previous version:
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index f48b5b5..b925bed 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -31,6 +31,36 @@ static const char *get_default_remote(void)
>  		return xstrdup(dest);
>  }
>  
> +static int has_same_dir_prefix(const char *str, const char **out)
> +{
> +#ifdef GIT_WINDOWS_NATIVE
Should that be
if defined(GIT_WINDOWS_NATIVE) || defined(__CYGWIN__)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2] submodule: Port resolve_relative_url from shell to C
  2015-12-17  0:26 ` [PATCHv2] submodule: Port " Stefan Beller
@ 2015-12-17  8:17   ` Eric Sunshine
  2015-12-17 18:55   ` Johannes Sixt
  1 sibling, 0 replies; 23+ messages in thread
From: Eric Sunshine @ 2015-12-17  8:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Git List, Junio C Hamano, Jens Lehmann, Johannes Sixt

A rather superficial review...

On Wed, Dec 16, 2015 at 7:26 PM, Stefan Beller <sbeller@google.com> wrote:
> This reimplements the helper function `resolve_relative_url` in shell

s/This reimplements/Reimplement/

> in C. This functionality is needed in C for introducing the groups
> feature later on. When using groups, the user should not need to run
> `git submodule init`, but it should be implicit at all appropriate places,
> which are all in C code. As the we would not just call out to `git

I guess you mean "As then we..." or something?

> submodule init`, but do a more fine grained structure there, we actually
> need all the init functionality in C before attempting the groups
> feature. To get the init functionality in C, rewriting the
> resolve_relative_url subfunction is a major step.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> @@ -9,6 +9,156 @@
> +static const char *get_default_remote(void)
> +{
> +       char *dest = NULL;
> +       unsigned char sha1[20];
> +       int flag;
> +       struct strbuf sb = STRBUF_INIT;
> +       const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
> +
> +       if (!refname)
> +               die("No such ref: HEAD");
> +
> +       refname = shorten_unambiguous_ref(refname, 0);
> +       strbuf_addf(&sb, "branch.%s.remote", refname);
> +       if (git_config_get_string(sb.buf, &dest))
> +               return "origin";
> +       else
> +               return xstrdup(dest);

Leaking the strbuf at both returns.

And, leaking the strdup'd dest (in the caller), but I suppose that's
intentional?

> +}
> +
> +static int has_same_dir_prefix(const char *str, const char **out)
> +{
> +#ifdef GIT_WINDOWS_NATIVE
> +       return skip_prefix(str, "./", out)
> +               || skip_prefix(str, ".\\", out);
> +#else
> +       return skip_prefix(str, "./", out);
> +#endif
> +}
> +
> +static int has_upper_dir_prefix(const char *str, const char **out)
> +{
> +#ifdef GIT_WINDOWS_NATIVE
> +       return skip_prefix(str, "../", out)
> +               || skip_prefix(str, "..\\", out);
> +#else
> +       return skip_prefix(str, "../", out);
> +#endif
> +}
> +
> +static char *last_dir_separator(const char *str)
> +{
> +#ifdef GIT_WINDOWS_NATIVE
> +       return strrchr(str, "/")
> +               || strrchr(str, "\\");
> +#else
> +       return strrchr(str, '/');
> +#endif
> +}

Cleaner would be to move the #if's outside the functions:

    #ifdef GIT_WINDOWS_NATIVE

    /* Windows implementations */
    static int has_same_dir_prefix(...) {...}
    static int has_upper_dir_prefix(...) {...}
    static char *last_dir_separator(...) {...}

    #else

    /* POSIX implementations */
    static int has_same_dir_prefix(...) {...}
    static int has_upper_dir_prefix(...) {...}
    static char *last_dir_separator(...) {...}

    #endif

> +static const char *relative_url(const char *url, const char *up_path)
> +{
> +       int is_relative = 0;
> +       size_t len;
> +       char *remoteurl = NULL;
> +       char *sep = "/";

'sep' only ever holds a single character, so why not declare it 'char'
rather than 'char *'? (And, adjust the format string of strbuf_addf(),
of course.)

> +       const char *out;
> +       [...]
> +       while (url) {
> +               if (has_upper_dir_prefix(url, &out)) {
> +                       [...]
> +                       if (rfind)
> +                               *rfind = '\0';
> +                       else {
> +                               rfind = strrchr(remoteurl, ':');
> +                               if (rfind) {
> +                                       *rfind = '\0';
> +                                       sep = ":";
> +                               } else {
> +                                       [...]
> +                               }
> +                       }
> +               } else if (has_same_dir_prefix(url, &out))
> +                       url = out;
> +               else
> +                       break;
> +       }
> +       strbuf_reset(&sb);
> +       strbuf_addf(&sb, "%s%s%s", remoteurl, sep, url);
> +
> +       if (!has_same_dir_prefix(sb.buf, &out))
> +               out = sb.buf;
> +       out = xstrdup(out);
> +
> +       strbuf_reset(&sb);
> +       strbuf_addf(&sb, "%s%s", is_relative && up_path ? up_path : "", out);
> +
> +       free((char*)out);

Why declare 'out' const if you know you will be freeing it?

> +       return strbuf_detach(&sb, NULL);
> +}
> +
> +static int resolve_relative_url(int argc, const char **argv, const char *prefix)
> +{
> +       if (argc == 2)
> +               printf("%s\n", relative_url(argv[1], NULL));
> +       else if (argc == 3)
> +               printf("%s\n", relative_url(argv[1], argv[2]));
> +       else
> +               die("BUG: resolve_relative_url only accepts one or two arguments");
> +       return 0;
> +}
>
>  struct module_list {
>         const struct cache_entry **entries;
> @@ -264,6 +414,7 @@ static struct cmd_struct commands[] = {
>         {"list", module_list},
>         {"name", module_name},
>         {"clone", module_clone},
> +       {"resolve_relative_url", resolve_relative_url},

Can we please use hyphens rather than underscores, and name this
"resolve-relative-url" instead? Quoting from a review[1] of an earlier
version of git-submodule--helper:

    ... these subcommands would be better spelled with a hyphen than
    an underscore. If I recall correctly, the arguments for using
    underscore were (1) a less noisy diff, (2) these are internal
    commands nobody will be typing anyhow. However, (1) the diff
    noise will be the same with hyphens, and (2) people will want to
    test these commands manually anyhow, and its easier to type
    hyphens and easier to remember them since the precedent for
    hyphens in command-names is already well established.

    Also, the reason that the original shell code used underscores
    was because hyphens are not valid characters in shell function
    names, but that's an implementation detail which shouldn't be
    allowed to bleed over to user-facing interface design (and these
    subcommands are user-facing).

[1]: http://article.gmane.org/gmane.comp.version-control.git/276947

>  };

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2] submodule: Port resolve_relative_url from shell to C
  2015-12-17  0:26 ` [PATCHv2] submodule: Port " Stefan Beller
  2015-12-17  8:17   ` Eric Sunshine
@ 2015-12-17 18:55   ` Johannes Sixt
  2015-12-17 19:17     ` Johannes Sixt
                       ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Johannes Sixt @ 2015-12-17 18:55 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, gitster, jens.lehmann

Am 17.12.2015 um 01:26 schrieb Stefan Beller:
> This reimplements the helper function `resolve_relative_url` in shell
> in C. This functionality is needed in C for introducing the groups
> feature later on. When using groups, the user should not need to run
> `git submodule init`, but it should be implicit at all appropriate places,
> which are all in C code. As the we would not just call out to `git
> submodule init`, but do a more fine grained structure there, we actually
> need all the init functionality in C before attempting the groups
> feature. To get the init functionality in C, rewriting the
> resolve_relative_url subfunction is a major step.
>
> This also improves the performance:
> (Best out of 3) time ./t7400-submodule-basic.sh
> Before:
> real	0m9.575s
> user	0m2.683s
> sys	0m6.773s
> After:
> real	0m9.293s
> user	0m2.691s
> sys	0m6.549s
>
> That's about 3%.

I appreciate this effort as it should help us on Windows. Although the
numbers (and my own timings) suggest that this is only a small step
forward. That's not surprising as the patch removes only two forks.

As to the implementation, find a patch below that removes the ifdefs
and a few other suggestions. It is a mechanical conversion without
understanding what relative_url() does. I have the gut feeling that the
two strbuf_addf towards the end of the function can be contracted and
the temporarily allocate copy in 'out' can be removed.

If there were a few examples in the comment above the function, it
would be much simpler to understand.

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index b925bed..8ec0975 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -11,6 +11,7 @@
 #include "run-command.h"
 #include "remote.h"
 #include "refs.h"
+#include "connect.h"
 
 static const char *get_default_remote(void)
 {
@@ -31,34 +32,23 @@ static const char *get_default_remote(void)
 		return xstrdup(dest);
 }
 
-static int has_same_dir_prefix(const char *str, const char **out)
+static int starts_with_dot_slash(const char *str)
 {
-#ifdef GIT_WINDOWS_NATIVE
-	return skip_prefix(str, "./", out)
-		|| skip_prefix(str, ".\\", out);
-#else
-	return skip_prefix(str, "./", out);
-#endif
+	return str[0] == '.' && is_dir_sep(str[1]);
 }
 
-static int has_upper_dir_prefix(const char *str, const char **out)
+static int starts_with_dot_dot_slash(const char *str)
 {
-#ifdef GIT_WINDOWS_NATIVE
-	return skip_prefix(str, "../", out)
-		|| skip_prefix(str, "..\\", out);
-#else
-	return skip_prefix(str, "../", out);
-#endif
+	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
 }
 
-static char *last_dir_separator(const char *str)
+static char *last_dir_separator(char *str)
 {
-#ifdef GIT_WINDOWS_NATIVE
-	return strrchr(str, "/")
-		|| strrchr(str, "\\");
-#else
-	return strrchr(str, '/');
-#endif
+	char* p = str + strlen(str);
+	while (p-- != str)
+		if (is_dir_sep(*p))
+			return p;
+	return NULL;
 }
 
 /*
@@ -85,9 +75,10 @@ static const char *relative_url(const char *url, const char *up_path)
 	size_t len;
 	char *remoteurl = NULL;
 	char *sep = "/";
-	const char *out;
+	char *out;
 	struct strbuf sb = STRBUF_INIT;
 	const char *remote = get_default_remote();
+
 	strbuf_addf(&sb, "remote.%s.url", remote);
 
 	if (git_config_get_string(sb.buf, &remoteurl))
@@ -98,22 +89,23 @@ static const char *relative_url(const char *url, const char *up_path)
 	if (is_dir_sep(remoteurl[len]))
 		remoteurl[len] = '\0';
 
-	if (strchr(remoteurl, ':') || is_dir_sep(*remoteurl))
+	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
 		is_relative = 0;
-	else if (has_same_dir_prefix(remoteurl, &out) ||
-		    has_upper_dir_prefix(remoteurl, &out))
+	else if (starts_with_dot_slash(remoteurl) ||
+		    starts_with_dot_dot_slash(remoteurl))
 		is_relative = 1;
 	else {
 		is_relative = 1;
 		strbuf_reset(&sb);
 		strbuf_addf(&sb, "./%s", remoteurl);
+		free(remoteurl);
 		remoteurl = strbuf_detach(&sb, NULL);
 	}
 
 	while (url) {
-		if (has_upper_dir_prefix(url, &out)) {
+		if (starts_with_dot_dot_slash(url)) {
 			char *rfind;
-			url = out;
+			url += 3;
 
 			rfind = last_dir_separator(remoteurl);
 			if (rfind)
@@ -130,22 +122,23 @@ static const char *relative_url(const char *url, const char *up_path)
 						remoteurl = ".";
 				}
 			}
-		} else if (has_same_dir_prefix(url, &out))
-			url = out;
-		else
+		} else if (starts_with_dot_slash(url)) {
+			url += 2;
+		} else
 			break;
 	}
 	strbuf_reset(&sb);
 	strbuf_addf(&sb, "%s%s%s", remoteurl, sep, url);
 
-	if (!has_same_dir_prefix(sb.buf, &out))
-		out = sb.buf;
-	out = xstrdup(out);
+	if (starts_with_dot_slash(sb.buf))
+		out = strdup(sb.buf + 2);
+	else
+		out = xstrdup(sb.buf);
 
 	strbuf_reset(&sb);
 	strbuf_addf(&sb, "%s%s", is_relative && up_path ? up_path : "", out);
 
-	free((char*)out);
+	free(out);
 	return strbuf_detach(&sb, NULL);
 }
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCHv2] submodule: Port resolve_relative_url from shell to C
  2015-12-17 18:55   ` Johannes Sixt
@ 2015-12-17 19:17     ` Johannes Sixt
  2015-12-18  3:27     ` Jeff King
  2016-01-11 20:17     ` Stefan Beller
  2 siblings, 0 replies; 23+ messages in thread
From: Johannes Sixt @ 2015-12-17 19:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, gitster, jens.lehmann

Am 17.12.2015 um 19:55 schrieb Johannes Sixt:
> As to the implementation, find a patch below that removes the ifdefs
> and a few other suggestions.

The word "offers" is missing in this last half-sentence ;)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2] submodule: Port resolve_relative_url from shell to C
  2015-12-17 18:55   ` Johannes Sixt
  2015-12-17 19:17     ` Johannes Sixt
@ 2015-12-18  3:27     ` Jeff King
  2016-01-11 20:17     ` Stefan Beller
  2 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2015-12-18  3:27 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Stefan Beller, git, gitster, jens.lehmann

On Thu, Dec 17, 2015 at 07:55:43PM +0100, Johannes Sixt wrote:

> -static int has_same_dir_prefix(const char *str, const char **out)
> +static int starts_with_dot_slash(const char *str)
>  {
> -#ifdef GIT_WINDOWS_NATIVE
> -	return skip_prefix(str, "./", out)
> -		|| skip_prefix(str, ".\\", out);
> -#else
> -	return skip_prefix(str, "./", out);
> -#endif
> +	return str[0] == '.' && is_dir_sep(str[1]);
>  }
>  
> -static int has_upper_dir_prefix(const char *str, const char **out)
> +static int starts_with_dot_dot_slash(const char *str)
>  {
> -#ifdef GIT_WINDOWS_NATIVE
> -	return skip_prefix(str, "../", out)
> -		|| skip_prefix(str, "..\\", out);
> -#else
> -	return skip_prefix(str, "../", out);
> -#endif
> +	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
>  }

As the set of prefixes you are looking is probably bounded, it may not
be worth generalizing this. But I wondered if something like:

  /*
   * Like skip_prefix, but consider any "/" in the prefix as a
   * directory separator for the platform.
   */
  int skip_prefix_fs(const char *str, const char *prefix, const char **out)
  {
	while (1) {
		if (!*prefix) {
			*out = str;
			return 1;
		} else if (*prefix == '/') {
			if (!is_dir_sep(*str))
				return 0;
		} else {
			if (*str != *prefix)
				return 0;
		}
		str++;
		prefix++;
	}
  }

  ...
  /* works on all platforms! */
  if (skip_prefix_fs(foo, "./", &out))
	...

would be helpful. I don't know if there are other opportunities in the
code base that could make use of this. If it's just these two sites,
it's probably not worth it.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCHv2] submodule: Port resolve_relative_url from shell to C
  2015-12-17 18:55   ` Johannes Sixt
  2015-12-17 19:17     ` Johannes Sixt
  2015-12-18  3:27     ` Jeff King
@ 2016-01-11 20:17     ` Stefan Beller
  2 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2016-01-11 20:17 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git@vger.kernel.org, Junio C Hamano, Jens Lehmann

I forgot to send out the draft last year.

On Thu, Dec 17, 2015 at 10:55 AM, Johannes Sixt <j6t@kdbg.org> wrote:
>> That's about 3%.
>
> I appreciate this effort as it should help us on Windows. Although the
> numbers (and my own timings) suggest that this is only a small step
> forward. That's not surprising as the patch removes only two forks.

ok. Probably the timings are not as important here anyways.

>
> As to the implementation, find a patch below that removes the ifdefs
> and a few other suggestions. It is a mechanical conversion without
> understanding what relative_url() does. I have the gut feeling that the
> two strbuf_addf towards the end of the function can be contracted and
> the temporarily allocate copy in 'out' can be removed.

I have the gut feeling, too. But I could not quite write down the solution.

>
> If there were a few examples in the comment above the function, it
> would be much simpler to understand.

I agree. So rewording the comment in the next reroll.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] submodule: Port resolve_relative_url from shell to C
@ 2016-01-13 18:15 Stefan Beller
  2016-01-13 22:03 ` Junio C Hamano
  2016-01-13 22:03 ` Eric Sunshine
  0 siblings, 2 replies; 23+ messages in thread
From: Stefan Beller @ 2016-01-13 18:15 UTC (permalink / raw)
  To: git, gitster; +Cc: j6t, peff, jens.lehmann, Stefan Beller

Later on we want to deprecate the `git submodule init` command and make
it implicit in other submodule commands. As these other commands are
written in C already, we'd need the init functionality in C, too.
The `resolve_relative_url` function is a rather large part of that init
functionality, so start by porting this function to C.

As I was porting the functionality I noticed some odds with the inputs.
To fully understand the situation I added some logging to the function
temporarily to capture all calls to the function throughout the test
suite. Duplicates have been removed and all unique testing inputs have
been recorded into t0060.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
> up_path seems to be ignored when remoteurl is absolute. Is that
> combination an invalid use case?

Yes, that is invalid. See the original:
 
    echo "${is_relative:+${up_path}}${remoteurl#./}"

This also only adds up_path in case of is_relative being true.
Did you mean to say that fact should be documented?

> I think that you strike a good balance between a direct rewrite
> of the shell function and possible optimizations. Therefore,
> further improvements should go into separate patches.

ok.

> In these two cases, it is unclear whether the "bar" in the 4th
> argument is copied from the 2nd or the 3rd argument. I suggest to
> use a different token:
> ...

I just produced all possible test cases I could find in the test suite,
which looked like they are a good fit. (I did some deduplication there already,
the test suite produced over 100 different mostly subtly reworded test cases)

I wonder if I should slim down this more as the test suite already takes very long
to run. However these tests don't take very long on their own.

Thanks for the review,
Stefan

interdiff to v2:
	diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
	index 3dd8008..3e58b5d 100644
	--- a/builtin/submodule--helper.c
	+++ b/builtin/submodule--helper.c
	@@ -117,9 +117,9 @@ static char *relative_url(const char *remote_url,
						colonsep = 1;
					} else {
						if (is_relative || !strcmp(".", remoteurl))
	-						die(N_("cannot strip one component off url '%s'"), remoteurl);
	+						die(_("cannot strip one component off url '%s'"), remoteurl);
						else
	-						remoteurl = ".";
	+						remoteurl = xstrdup(".");
					}
				}
			} else if (starts_with_dot_slash(url)) {
	@@ -139,12 +139,10 @@ static char *relative_url(const char *remote_url,
		free(remoteurl);
		if (!up_path || !is_relative)
			return out;
	-	else {
	-		strbuf_addf(&sb, "%s%s", up_path, out);
	 
	-		free(out);
	-		return strbuf_detach(&sb, NULL);
	-	}
	+	strbuf_addf(&sb, "%s%s", up_path, out);
	+	free(out);
	+	return strbuf_detach(&sb, NULL);
	 }
	 
	 static int resolve_relative_url(int argc, const char **argv, const char *prefix)
	diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
	index 2ae1bbd..8a1579c 100755
	--- a/t/t0060-path-utils.sh
	+++ b/t/t0060-path-utils.sh
	@@ -20,9 +20,10 @@ relative_path() {
	 }
	 
	 test_submodule_relative_url() {
	-	expected="$4"
	-	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" \
	-	"test \"\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3')\" = '$expected'"
	+	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" "
	+		actual=\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3') &&
	+		test \"\$actual\" = '$4'
	+	"
	 }
	 
	 test_git_path() {
	@@ -292,8 +293,8 @@ test_git_path GIT_COMMON_DIR=bar config                   bar/config
	 test_git_path GIT_COMMON_DIR=bar packed-refs              bar/packed-refs
	 test_git_path GIT_COMMON_DIR=bar shallow                  bar/shallow
	 
	-test_submodule_relative_url "(null)" "../foo/bar" "../bar/a/b/c" "../foo/bar/a/b/c"
	-test_submodule_relative_url "../../../" "../foo/bar" "../bar/a/b/c" "../../../../foo/bar/a/b/c"
	+test_submodule_relative_url "(null)" "../foo/bar" "../sub/a/b/c" "../foo/sub/a/b/c"
	+test_submodule_relative_url "../../../" "../foo/bar" "../sub/a/b/c" "../../../../foo/sub/a/b/c"
	 test_submodule_relative_url "(null)" "../foo/bar" "../submodule" "../foo/submodule"
	 test_submodule_relative_url "../" "../foo/bar" "../submodule" "../../foo/submodule"
	 test_submodule_relative_url "(null)" "../foo/submodule" "../submodule" "../foo/submodule"

I tried considering an alternative implementation for `relative_url`,
which is not a direct translation of the shell code. There are some
advanced path/url functions, such as `normalize_path_copy`, however
using that function is not straightforward as it seems. The idea would
have been to use that on a concatenation of remoteurl and url, however
there are cases like ("foo/." "../.") to result in "foo/.", so we really
need to count the slashes ourselves.

---
 builtin/submodule--helper.c | 189 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  81 +------------------
 t/t0060-path-utils.sh       |  42 ++++++++++
 3 files changed, 235 insertions(+), 77 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..3e58b5d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -9,6 +9,193 @@
 #include "submodule-config.h"
 #include "string-list.h"
 #include "run-command.h"
+#include "remote.h"
+#include "refs.h"
+#include "connect.h"
+
+static char *get_default_remote(void)
+{
+	char *dest = NULL, *ret;
+	unsigned char sha1[20];
+	int flag;
+	struct strbuf sb = STRBUF_INIT;
+	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
+
+	if (!refname)
+		die("No such ref: HEAD");
+
+	refname = shorten_unambiguous_ref(refname, 0);
+	strbuf_addf(&sb, "branch.%s.remote", refname);
+	if (git_config_get_string(sb.buf, &dest))
+		ret = xstrdup("origin");
+	else
+		ret = xstrdup(dest);
+
+	strbuf_release(&sb);
+	return ret;
+}
+
+static int starts_with_dot_slash(const char *str)
+{
+	return str[0] == '.' && is_dir_sep(str[1]);
+}
+
+static int starts_with_dot_dot_slash(const char *str)
+{
+	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
+}
+
+static char *last_dir_separator(char *str)
+{
+	char* p = str + strlen(str);
+	while (p-- != str)
+		if (is_dir_sep(*p))
+			return p;
+	return NULL;
+}
+
+/*
+ * The `url` argument is the URL that navigates to the submodule origin
+ * repo. When relative, this URL is relative to the superproject origin
+ * URL repo. The `up_path` argument, if specified, is the relative
+ * path that navigates from the submodule working tree to the superproject
+ * working tree. Returns the origin URL of the submodule.
+ *
+ * Return either an absolute URL or filesystem path (if the superproject
+ * origin URL is an absolute URL or filesystem path, respectively) or a
+ * relative file system path (if the superproject origin URL is a relative
+ * file system path).
+ *
+ * When the output is a relative file system path, the path is either
+ * relative to the submodule working tree, if up_path is specified, or to
+ * the superproject working tree otherwise.
+ */
+static char *relative_url(const char *remote_url,
+				const char *url,
+				const char *up_path)
+{
+	int is_relative = 0;
+	int colonsep = 0;
+	char *out;
+	char *remoteurl = xstrdup(remote_url);
+	struct strbuf sb = STRBUF_INIT;
+	size_t len;
+
+	len = strlen(remoteurl);
+	if (is_dir_sep(remoteurl[len]))
+		remoteurl[len] = '\0';
+
+	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
+		is_relative = 0;
+	else {
+		is_relative = 1;
+
+		/* Prepend a './' to ensure all relative remoteurls start
+		 * with './' or '../'. */
+		if (!starts_with_dot_slash(remoteurl) &&
+		    !starts_with_dot_dot_slash(remoteurl)) {
+			strbuf_reset(&sb);
+			strbuf_addf(&sb, "./%s", remoteurl);
+			free(remoteurl);
+			remoteurl = strbuf_detach(&sb, NULL);
+		}
+	}
+	/* When the url starts with '../', remove that and the
+	 * last directory in remoteurl. */
+	while (url) {
+		if (starts_with_dot_dot_slash(url)) {
+			char *rfind;
+			url += 3;
+
+			rfind = last_dir_separator(remoteurl);
+			if (rfind)
+				*rfind = '\0';
+			else {
+				rfind = strrchr(remoteurl, ':');
+				if (rfind) {
+					*rfind = '\0';
+					colonsep = 1;
+				} else {
+					if (is_relative || !strcmp(".", remoteurl))
+						die(_("cannot strip one component off url '%s'"), remoteurl);
+					else
+						remoteurl = xstrdup(".");
+				}
+			}
+		} else if (starts_with_dot_slash(url)) {
+			url += 2;
+		} else
+			break;
+	}
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
+
+	if (starts_with_dot_slash(sb.buf))
+		out = xstrdup(sb.buf + 2);
+	else
+		out = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	free(remoteurl);
+	if (!up_path || !is_relative)
+		return out;
+
+	strbuf_addf(&sb, "%s%s", up_path, out);
+	free(out);
+	return strbuf_detach(&sb, NULL);
+}
+
+static int resolve_relative_url(int argc, const char **argv, const char *prefix)
+{
+	char *remoteurl = NULL;
+	char *remote = get_default_remote();
+	const char *up_path = NULL;
+	char *res;
+	const char *url;
+	struct strbuf sb = STRBUF_INIT;
+
+	if (argc != 2 && argc != 3)
+		die("BUG: resolve_relative_url only accepts one or two arguments");
+
+	url = argv[1];
+	strbuf_addf(&sb, "remote.%s.url", remote);
+	free(remote);
+
+	if (git_config_get_string(sb.buf, &remoteurl))
+		/* the repository is its own authoritative upstream */
+		remoteurl = xgetcwd();
+
+	if (argc == 3)
+		up_path = argv[2];
+
+	res = relative_url(remoteurl, url, up_path);
+	printf("%s\n", res);
+
+	free(res);
+	return 0;
+}
+
+static int resolve_relative_url_test(int argc, const char **argv, const char *prefix)
+{
+	char *remoteurl, *res;
+	const char *up_path, *url;
+
+	if (argc != 4)
+		die("BUG: resolve_relative_url only accepts three arguments: <up_path> <remoteurl> <url>");
+
+	up_path = argv[1];
+	remoteurl = xstrdup(argv[2]);
+	url = argv[3];
+
+	if (!strcmp(up_path, "(null)"))
+		up_path = NULL;
+
+	res = relative_url(remoteurl, url, up_path);
+	printf("%s\n", res);
+
+	free(res);
+	return 0;
+}
 
 struct module_list {
 	const struct cache_entry **entries;
@@ -264,6 +451,8 @@ static struct cmd_struct commands[] = {
 	{"list", module_list},
 	{"name", module_name},
 	{"clone", module_clone},
+	{"resolve-relative-url", resolve_relative_url},
+	{"resolve-relative-url-test", resolve_relative_url_test},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 9bc5c5f..3e409af 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -46,79 +46,6 @@ prefix=
 custom_name=
 depth=
 
-# The function takes at most 2 arguments. The first argument is the
-# URL that navigates to the submodule origin repo. When relative, this URL
-# is relative to the superproject origin URL repo. The second up_path
-# argument, if specified, is the relative path that navigates
-# from the submodule working tree to the superproject working tree.
-#
-# The output of the function is the origin URL of the submodule.
-#
-# The output will either be an absolute URL or filesystem path (if the
-# superproject origin URL is an absolute URL or filesystem path,
-# respectively) or a relative file system path (if the superproject
-# origin URL is a relative file system path).
-#
-# When the output is a relative file system path, the path is either
-# relative to the submodule working tree, if up_path is specified, or to
-# the superproject working tree otherwise.
-resolve_relative_url ()
-{
-	remote=$(get_default_remote)
-	remoteurl=$(git config "remote.$remote.url") ||
-		remoteurl=$(pwd) # the repository is its own authoritative upstream
-	url="$1"
-	remoteurl=${remoteurl%/}
-	sep=/
-	up_path="$2"
-
-	case "$remoteurl" in
-	*:*|/*)
-		is_relative=
-		;;
-	./*|../*)
-		is_relative=t
-		;;
-	*)
-		is_relative=t
-		remoteurl="./$remoteurl"
-		;;
-	esac
-
-	while test -n "$url"
-	do
-		case "$url" in
-		../*)
-			url="${url#../}"
-			case "$remoteurl" in
-			*/*)
-				remoteurl="${remoteurl%/*}"
-				;;
-			*:*)
-				remoteurl="${remoteurl%:*}"
-				sep=:
-				;;
-			*)
-				if test -z "$is_relative" || test "." = "$remoteurl"
-				then
-					die "$(eval_gettext "cannot strip one component off url '\$remoteurl'")"
-				else
-					remoteurl=.
-				fi
-				;;
-			esac
-			;;
-		./*)
-			url="${url#./}"
-			;;
-		*)
-			break;;
-		esac
-	done
-	remoteurl="$remoteurl$sep${url%/}"
-	echo "${is_relative:+${up_path}}${remoteurl#./}"
-}
-
 # Resolve a path to be relative to another path.  This is intended for
 # converting submodule paths when git-submodule is run in a subdirectory
 # and only handles paths where the directory separator is '/'.
@@ -281,7 +208,7 @@ cmd_add()
 		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
 
 		# dereference source url relative to parent's url
-		realrepo=$(resolve_relative_url "$repo") || exit
+		realrepo=$(git submodule--helper resolve-relative-url "$repo") || exit
 		;;
 	*:*|/*)
 		# absolute url
@@ -485,7 +412,7 @@ cmd_init()
 			# Possibly a url relative to parent
 			case "$url" in
 			./*|../*)
-				url=$(resolve_relative_url "$url") || exit
+				url=$(git submodule--helper resolve-relative-url "$url") || exit
 				;;
 			esac
 			git config submodule."$name".url "$url" ||
@@ -1190,9 +1117,9 @@ cmd_sync()
 			# guarantee a trailing /
 			up_path=${up_path%/}/ &&
 			# path from submodule work tree to submodule origin repo
-			sub_origin_url=$(resolve_relative_url "$url" "$up_path") &&
+			sub_origin_url=$(git submodule--helper resolve-relative-url "$url" "$up_path") &&
 			# path from superproject work tree to submodule origin repo
-			super_config_url=$(resolve_relative_url "$url") || exit
+			super_config_url=$(git submodule--helper resolve-relative-url "$url") || exit
 			;;
 		*)
 			sub_origin_url="$url"
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 627ef85..8a1579c 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -19,6 +19,13 @@ relative_path() {
 	"test \"\$(test-path-utils relative_path '$1' '$2')\" = '$expected'"
 }
 
+test_submodule_relative_url() {
+	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" "
+		actual=\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3') &&
+		test \"\$actual\" = '$4'
+	"
+}
+
 test_git_path() {
 	test_expect_success "git-path $1 $2 => $3" "
 		$1 git rev-parse --git-path $2 >actual &&
@@ -286,4 +293,39 @@ test_git_path GIT_COMMON_DIR=bar config                   bar/config
 test_git_path GIT_COMMON_DIR=bar packed-refs              bar/packed-refs
 test_git_path GIT_COMMON_DIR=bar shallow                  bar/shallow
 
+test_submodule_relative_url "(null)" "../foo/bar" "../sub/a/b/c" "../foo/sub/a/b/c"
+test_submodule_relative_url "../../../" "../foo/bar" "../sub/a/b/c" "../../../../foo/sub/a/b/c"
+test_submodule_relative_url "(null)" "../foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "../" "../foo/bar" "../submodule" "../../foo/submodule"
+test_submodule_relative_url "(null)" "../foo/submodule" "../submodule" "../foo/submodule"
+test_submodule_relative_url "../" "../foo/submodule" "../submodule" "../../foo/submodule"
+test_submodule_relative_url "(null)" "../foo" "../submodule" "../submodule"
+test_submodule_relative_url "../" "../foo" "../submodule" "../../submodule"
+test_submodule_relative_url "(null)" "./foo/bar" "../submodule" "foo/submodule"
+test_submodule_relative_url "../" "./foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "(null)" "./foo" "../submodule" "submodule"
+test_submodule_relative_url "../" "./foo" "../submodule" "../submodule"
+test_submodule_relative_url "(null)" "//somewhere else/repo" "../subrepo" "//somewhere else/subrepo"
+test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/subsuper_update_r" "../subsubsuper_update_r" "/u//trash directory.t7406-submodule-update/subsubsuper_update_r"
+test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/super_update_r2" "../subsuper_update_r" "/u//trash directory.t7406-submodule-update/subsuper_update_r"
+test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm/." "../." "/u/trash directory.t3600-rm/."
+test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm" "./." "/u/trash directory.t3600-rm/."
+test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
+test_submodule_relative_url "../" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
+test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic" "./å äö" "/u/trash directory.t7400-submodule-basic/å äö"
+test_submodule_relative_url "(null)" "/u/trash directory.t7403-submodule-sync/." "../submodule" "/u/trash directory.t7403-submodule-sync/submodule"
+test_submodule_relative_url "(null)" "/u/trash directory.t7407-submodule-foreach/submodule" "../submodule" "/u/trash directory.t7407-submodule-foreach/submodule"
+test_submodule_relative_url "(null)" "/u/trash directory.t7409-submodule-detached-worktree/home2/../remote" "../bundle1" "/u/trash directory.t7409-submodule-detached-worktree/home2/../bundle1"
+test_submodule_relative_url "(null)" "/u/trash directory.t7613-merge-submodule/submodule_update_repo" "./." "/u/trash directory.t7613-merge-submodule/submodule_update_repo/."
+test_submodule_relative_url "(null)" "file:///tmp/repo" "../subrepo" "file:///tmp/subrepo"
+test_submodule_relative_url "(null)" "foo/bar" "../submodule" "foo/submodule"
+test_submodule_relative_url "../" "foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "(null)" "foo" "../submodule" "submodule"
+test_submodule_relative_url "../" "foo" "../submodule" "../submodule"
+test_submodule_relative_url "(null)" "helper:://hostname/repo" "../subrepo" "helper:://hostname/subrepo"
+test_submodule_relative_url "(null)" "ssh://hostname/repo" "../subrepo" "ssh://hostname/subrepo"
+test_submodule_relative_url "(null)" "ssh://hostname:22/repo" "../subrepo" "ssh://hostname:22/subrepo"
+test_submodule_relative_url "(null)" "user@host:path/to/repo" "../subrepo" "user@host:path/to/subrepo"
+test_submodule_relative_url "(null)" "user@host:repo" "../subrepo" "user@host:subrepo"
+
 test_done
-- 
2.7.0.1.g33e69b5.dirty

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-13 18:15 [PATCH] " Stefan Beller
@ 2016-01-13 22:03 ` Junio C Hamano
  2016-01-13 22:47   ` Stefan Beller
  2016-01-14 20:57   ` Johannes Sixt
  2016-01-13 22:03 ` Eric Sunshine
  1 sibling, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2016-01-13 22:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, j6t, peff, jens.lehmann

Stefan Beller <sbeller@google.com> writes:

> Later on we want to deprecate the `git submodule init` command and make
> it implicit in other submodule commands.

I doubt there is a concensus for "deprecate" part to warrant the use
of "we want" here.  I tend to think that the latter half of the
sentence is uncontroversial, i.e. it is a good idea to make other
"submodule" subcommands internally call it when it makes sense, and
also make knobs available to other commands like "clone" and
possibly "checkout" so that the users do not have to do the
"submodule init" as a separate step, though.

> As I was porting the functionality I noticed some odds with the inputs.

I can parse but cannot quite grok.  You found some strange things in
the input?  Whose input, that comes from where given by whom?

> To fully understand the situation I added some logging to the function
> temporarily to capture all calls to the function throughout the test
> suite. Duplicates have been removed and all unique testing inputs have
> been recorded into t0060.

I can also parse this, but it is unclear what you did to the
temporary debugging help at the end.  If you left it, then that is
no longer a temporary but is part of the final product.  It is also
unclear what "Duplicates" you are talking about here.

Do you mean that you found some of the existing tests were odd, and
after examination with help from a temporary hack which does not
remain in this patch, you determined that some tests were duplicated,
which you removed, while adding new ones?

>  builtin/submodule--helper.c | 189 ++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            |  81 +------------------
>  t/t0060-path-utils.sh       |  42 ++++++++++
>  3 files changed, 235 insertions(+), 77 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index f4c3eff..3e58b5d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -9,6 +9,193 @@
>  #include "submodule-config.h"
>  #include "string-list.h"
>  #include "run-command.h"
> +#include "remote.h"
> +#include "refs.h"
> +#include "connect.h"
> +
> +static char *get_default_remote(void)
> +{
> +	char *dest = NULL, *ret;
> +	unsigned char sha1[20];
> +	int flag;
> +	struct strbuf sb = STRBUF_INIT;
> +	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
> +
> +	if (!refname)
> +		die("No such ref: HEAD");
> +
> +	refname = shorten_unambiguous_ref(refname, 0);
> +	strbuf_addf(&sb, "branch.%s.remote", refname);

Is it correct to use shorten_unambiguous_ref() here like this?  The
function is meant to be used when you want heads/frotz because you
have both refs/heads/frotz and refs/tags/frotz at the same time.  I
think you want to say branch.frotz.remote even in such a case.  IOW,
shouldn't it be skip_prefix() with refs/heads/, together with die()
if the prefix is something else?

> +	if (git_config_get_string(sb.buf, &dest))
> +		ret = xstrdup("origin");
> +	else
> +		ret = xstrdup(dest);
> +
> +	strbuf_release(&sb);
> +	return ret;
> +}
> +
> +static int starts_with_dot_slash(const char *str)
> +{
> +	return str[0] == '.' && is_dir_sep(str[1]);
> +}
> +
> +static int starts_with_dot_dot_slash(const char *str)
> +{
> +	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
> +}
> +
> +static char *last_dir_separator(char *str)
> +{
> +	char* p = str + strlen(str);

Asterisk sticks to the variable, not the type.

> +	while (p-- != str)

It is preferable to use '>' not '!=' here, because you know p
approaches str from the larger side, for readability.

> +		if (is_dir_sep(*p))
> +			return p;
> +	return NULL;
> +}

(a useless comment) This is one of the rare places where I wish
there were a version of strcspn() that scans from the right.

> +static char *relative_url(const char *remote_url,
> +				const char *url,
> +				const char *up_path)
> +{
> +	int is_relative = 0;
> +	int colonsep = 0;
> +	char *out;
> +	char *remoteurl = xstrdup(remote_url);
> +	struct strbuf sb = STRBUF_INIT;
> +	size_t len;
> +
> +	len = strlen(remoteurl);

Nothing wrong here, but it looked somewhat inconsistent to see this
assignment, while remoteurl is done as an initialization [*1*]


[Footnote]

*1* as a personal preference, I tend to prefer seeing only simple
operations in initialization and heavyweight ones as a separate
assignment to an otherwise uninitialized variable, and strlen() is
lighter-weight than xstrdup() in my dictionary.



> +	if (is_dir_sep(remoteurl[len]))
> +		remoteurl[len] = '\0';
> +
> +	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
> +		is_relative = 0;
> +	else {
> +		is_relative = 1;
> +
> +		/* Prepend a './' to ensure all relative remoteurls start
> +		 * with './' or '../'. */

Adjust the style, and perhaps remove the final full-stop to make the
last string literal easier to see?  I.e.

	/*
         * Prepend a './' to ensure all relative remoteurls
         * start with './' or '../'
         */

would be easier to see what it is said.

> +		if (!starts_with_dot_slash(remoteurl) &&
> +		    !starts_with_dot_dot_slash(remoteurl)) {
> +			strbuf_reset(&sb);
> +			strbuf_addf(&sb, "./%s", remoteurl);
> +			free(remoteurl);
> +			remoteurl = strbuf_detach(&sb, NULL);
> +		}
> +	}
> +	/* When the url starts with '../', remove that and the
> +	 * last directory in remoteurl. */

Style.

> +	while (url) {
> +		if (starts_with_dot_dot_slash(url)) {
> +			char *rfind;
> +			url += 3;
> +
> +			rfind = last_dir_separator(remoteurl);
> +			if (rfind)
> +				*rfind = '\0';
> +			else {
> +				rfind = strrchr(remoteurl, ':');
> +				if (rfind) {
> +					*rfind = '\0';
> +					colonsep = 1;
> +				} else {
> +					if (is_relative || !strcmp(".", remoteurl))
> +						die(_("cannot strip one component off url '%s'"), remoteurl);
> +					else
> +						remoteurl = xstrdup(".");
> +				}
> +			}

It is somewhat hard to see how this avoids stripping one (or both)
slashes just after "http:" in remoteurl="http://site/path/", leaving
just "http:/" (or "http:").

This codepath has overly deep nesting levels.  Is this the simplest
we can do?

The final else { if .. else } can be made into else if .. else to
dedent the overlong die() by one level, but I am wondering if the
deep nesting is just a symptom of logic being unnecessarily complex.

> +		} else if (starts_with_dot_slash(url)) {
> +			url += 2;
> +		} else
> +			break;
> +	}
> +	strbuf_reset(&sb);
> +	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
> +
> +	if (starts_with_dot_slash(sb.buf))
> +		out = xstrdup(sb.buf + 2);
> +	else
> +		out = xstrdup(sb.buf);
> +	strbuf_reset(&sb);
> +
> +	free(remoteurl);
> +	if (!up_path || !is_relative)
> +		return out;
> +
> +	strbuf_addf(&sb, "%s%s", up_path, out);
> +	free(out);
> +	return strbuf_detach(&sb, NULL);
> +}

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-13 18:15 [PATCH] " Stefan Beller
  2016-01-13 22:03 ` Junio C Hamano
@ 2016-01-13 22:03 ` Eric Sunshine
  2016-01-13 23:37   ` Stefan Beller
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Sunshine @ 2016-01-13 22:03 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Junio C Hamano, Johannes Sixt, Jeff King, Jens Lehmann

On Wed, Jan 13, 2016 at 1:15 PM, Stefan Beller <sbeller@google.com> wrote:
> Later on we want to deprecate the `git submodule init` command and make
> it implicit in other submodule commands. As these other commands are
> written in C already, we'd need the init functionality in C, too.
> The `resolve_relative_url` function is a rather large part of that init
> functionality, so start by porting this function to C.
>
> As I was porting the functionality I noticed some odds with the inputs.
> To fully understand the situation I added some logging to the function
> temporarily to capture all calls to the function throughout the test
> suite. Duplicates have been removed and all unique testing inputs have
> been recorded into t0060.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---

No time presently for a proper review, so just a few superficial comments...

> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> @@ -9,6 +9,193 @@
> +static int resolve_relative_url(int argc, const char **argv, const char *prefix)
> +{
> +       char *remoteurl = NULL;
> +       char *remote = get_default_remote();
> +       const char *up_path = NULL;
> +       char *res;
> +       const char *url;
> +       struct strbuf sb = STRBUF_INIT;
> +
> +       if (argc != 2 && argc != 3)
> +               die("BUG: resolve_relative_url only accepts one or two arguments");

This is diagnosing incorrect arguments, so:
s/resolve_relative_url/resolve-relative-url/

Also, I'm not convinced that this deserves a "BUG:" prefix, as this is
now a user-accessible command (even though it's plumbing).

> +       url = argv[1];
> +       strbuf_addf(&sb, "remote.%s.url", remote);
> +       free(remote);
> +
> +       if (git_config_get_string(sb.buf, &remoteurl))
> +               /* the repository is its own authoritative upstream */
> +               remoteurl = xgetcwd();
> +
> +       if (argc == 3)
> +               up_path = argv[2];
> +
> +       res = relative_url(remoteurl, url, up_path);
> +       printf("%s\n", res);
> +
> +       free(res);
> +       return 0;
> +}
> +
> +static int resolve_relative_url_test(int argc, const char **argv, const char *prefix)
> +{
> +       char *remoteurl, *res;
> +       const char *up_path, *url;
> +
> +       if (argc != 4)
> +               die("BUG: resolve_relative_url only accepts three arguments: <up_path> <remoteurl> <url>");

s/resolve_relative_url/resolve-relative-url/

Ditto observation about "BUG:" prefix.

> +       up_path = argv[1];
> +       remoteurl = xstrdup(argv[2]);
> +       url = argv[3];
> +
> +       if (!strcmp(up_path, "(null)"))
> +               up_path = NULL;
> +
> +       res = relative_url(remoteurl, url, up_path);
> +       printf("%s\n", res);

This could be:

     puts(res);

though I don't care strongly.

> +       free(res);
> +       return 0;
> +}

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-13 22:03 ` Junio C Hamano
@ 2016-01-13 22:47   ` Stefan Beller
  2016-01-14 20:50     ` Jens Lehmann
  2016-01-15 17:37     ` Junio C Hamano
  2016-01-14 20:57   ` Johannes Sixt
  1 sibling, 2 replies; 23+ messages in thread
From: Stefan Beller @ 2016-01-13 22:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Johannes Sixt, Jeff King, Jens Lehmann

On Wed, Jan 13, 2016 at 2:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> Later on we want to deprecate the `git submodule init` command and make
>> it implicit in other submodule commands.
>
> I doubt there is a concensus for "deprecate" part to warrant the use
> of "we want" here.  I tend to think that the latter half of the
> sentence is uncontroversial, i.e. it is a good idea to make other
> "submodule" subcommands internally call it when it makes sense, and
> also make knobs available to other commands like "clone" and
> possibly "checkout" so that the users do not have to do the
> "submodule init" as a separate step, though.

Maybe I need to rethink my strategy here and deliver a patch series
which includes a complete port of `submodule init`, and maybe even
options in checkout (and clone) to run `submodule init`. That way the
immediate benefit would be clear on why the series is a good idea.

The current wording is mostly arguing to Jens, how to do the submodule
groups thing later on, but skipping the immediate steps.

>
>> As I was porting the functionality I noticed some odds with the inputs.
>
> I can parse but cannot quite grok.  You found some strange things in
> the input?  Whose input, that comes from where given by whom?
>
>> To fully understand the situation I added some logging to the function
>> temporarily to capture all calls to the function throughout the test
>> suite. Duplicates have been removed and all unique testing inputs have
>> been recorded into t0060.
>
> I can also parse this, but it is unclear what you did to the
> temporary debugging help at the end.  If you left it, then that is
> no longer a temporary but is part of the final product.  It is also
> unclear what "Duplicates" you are talking about here.

So in v1 somebody complained it's not clear what kind of input you'd get into
the relative_url(up_path, remoteurl, url) function. I did not know either, as it
was a straight port, passing the test suite. So I wanted to add tests.

To come up with reasonable tests I added a section to the code similar as this:

    {
        FILE *f = fopen("/tmp/testcases", "a");
        fprintf(f, "%s|%s|%s|%s\n", up_path, remoteurl, url, result);
        fclose(f);
    }

Then I run the whole test suite with the relative_url instrumented.
This gave me a file "/tmp/testcases" containing 500 lines with valid
in and output for the `relative_url` function.
However I run these 500 lines through sort|uniq to get about 90 lines
of unduplicated tests.

but in these 90 lines there were still syntactic duplicates where one
line may look like the other line just with
    s/trash directory.tXXXX/trash directory.tYYYY/
so I removed these lines manually, too.

And that's how I came up with the set of tests.
The logging function to "/tmp/testcases" was temporary and is not part
of the final product, but by mentioning that, some issues may be clear
to the reader, such as:
 * why there are tests with /u/trash directory-t7400.../...
 * the tests are as exhaustive as the test suite before.
 * there are no tests to test failure though, only test for good tests

>
> Do you mean that you found some of the existing tests were odd, and
> after examination with help from a temporary hack which does not
> remain in this patch, you determined that some tests were duplicated,
> which you removed, while adding new ones?

Yes, this.

>
>>  builtin/submodule--helper.c | 189 ++++++++++++++++++++++++++++++++++++++++++++
>>  git-submodule.sh            |  81 +------------------
>>  t/t0060-path-utils.sh       |  42 ++++++++++
>>  3 files changed, 235 insertions(+), 77 deletions(-)
>>
>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>> index f4c3eff..3e58b5d 100644
>> --- a/builtin/submodule--helper.c
>> +++ b/builtin/submodule--helper.c
>> @@ -9,6 +9,193 @@
>>  #include "submodule-config.h"
>>  #include "string-list.h"
>>  #include "run-command.h"
>> +#include "remote.h"
>> +#include "refs.h"
>> +#include "connect.h"
>> +
>> +static char *get_default_remote(void)
>> +{
>> +     char *dest = NULL, *ret;
>> +     unsigned char sha1[20];
>> +     int flag;
>> +     struct strbuf sb = STRBUF_INIT;
>> +     const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
>> +
>> +     if (!refname)
>> +             die("No such ref: HEAD");
>> +
>> +     refname = shorten_unambiguous_ref(refname, 0);
>> +     strbuf_addf(&sb, "branch.%s.remote", refname);
>
> Is it correct to use shorten_unambiguous_ref() here like this?  The
> function is meant to be used when you want heads/frotz because you
> have both refs/heads/frotz and refs/tags/frotz at the same time.  I
> think you want to say branch.frotz.remote even in such a case.  IOW,
> shouldn't it be skip_prefix() with refs/heads/, together with die()
> if the prefix is something else?

Right.

>
>> +     if (git_config_get_string(sb.buf, &dest))
>> +             ret = xstrdup("origin");
>> +     else
>> +             ret = xstrdup(dest);
>> +
>> +     strbuf_release(&sb);
>> +     return ret;
>> +}
>> +
>> +static int starts_with_dot_slash(const char *str)
>> +{
>> +     return str[0] == '.' && is_dir_sep(str[1]);
>> +}
>> +
>> +static int starts_with_dot_dot_slash(const char *str)
>> +{
>> +     return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
>> +}
>> +
>> +static char *last_dir_separator(char *str)
>> +{
>> +     char* p = str + strlen(str);
>
> Asterisk sticks to the variable, not the type.

ok

>
>> +     while (p-- != str)
>
> It is preferable to use '>' not '!=' here, because you know p
> approaches str from the larger side, for readability.

Also known as the limes operator (p--> str, "p goes to str")
just kidding :)

>
>> +             if (is_dir_sep(*p))
>> +                     return p;
>> +     return NULL;
>> +}
>
> (a useless comment) This is one of the rare places where I wish
> there were a version of strcspn() that scans from the right.
>
>> +static char *relative_url(const char *remote_url,
>> +                             const char *url,
>> +                             const char *up_path)
>> +{
>> +     int is_relative = 0;
>> +     int colonsep = 0;
>> +     char *out;
>> +     char *remoteurl = xstrdup(remote_url);
>> +     struct strbuf sb = STRBUF_INIT;
>> +     size_t len;
>> +
>> +     len = strlen(remoteurl);
>
> Nothing wrong here, but it looked somewhat inconsistent to see this
> assignment, while remoteurl is done as an initialization [*1*]

ok, noted.

>
>
> [Footnote]
>
> *1* as a personal preference, I tend to prefer seeing only simple
> operations in initialization and heavyweight ones as a separate
> assignment to an otherwise uninitialized variable, and strlen() is
> lighter-weight than xstrdup() in my dictionary.
>
>
>
>> +     if (is_dir_sep(remoteurl[len]))
>> +             remoteurl[len] = '\0';
>> +
>> +     if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
>> +             is_relative = 0;
>> +     else {
>> +             is_relative = 1;
>> +
>> +             /* Prepend a './' to ensure all relative remoteurls start
>> +              * with './' or '../'. */
>
> Adjust the style, and perhaps remove the final full-stop to make the
> last string literal easier to see?  I.e.
>
>         /*
>          * Prepend a './' to ensure all relative remoteurls
>          * start with './' or '../'
>          */
>
> would be easier to see what it is said.

ok

>
>> +             if (!starts_with_dot_slash(remoteurl) &&
>> +                 !starts_with_dot_dot_slash(remoteurl)) {
>> +                     strbuf_reset(&sb);
>> +                     strbuf_addf(&sb, "./%s", remoteurl);
>> +                     free(remoteurl);
>> +                     remoteurl = strbuf_detach(&sb, NULL);
>> +             }
>> +     }
>> +     /* When the url starts with '../', remove that and the
>> +      * last directory in remoteurl. */
>
> Style.

ok

>
>> +     while (url) {
>> +             if (starts_with_dot_dot_slash(url)) {
>> +                     char *rfind;
>> +                     url += 3;
>> +
>> +                     rfind = last_dir_separator(remoteurl);
>> +                     if (rfind)
>> +                             *rfind = '\0';
>> +                     else {
>> +                             rfind = strrchr(remoteurl, ':');
>> +                             if (rfind) {
>> +                                     *rfind = '\0';
>> +                                     colonsep = 1;
>> +                             } else {
>> +                                     if (is_relative || !strcmp(".", remoteurl))
>> +                                             die(_("cannot strip one component off url '%s'"), remoteurl);
>> +                                     else
>> +                                             remoteurl = xstrdup(".");
>> +                             }
>> +                     }
>
> It is somewhat hard to see how this avoids stripping one (or both)
> slashes just after "http:" in remoteurl="http://site/path/", leaving
> just "http:/" (or "http:").

it would leave just 'http:/' if url were to be ../../some/where/else,
such that the constructed url below would be http://some/where/else.

>
> This codepath has overly deep nesting levels.  Is this the simplest
> we can do?

it's a direct translation from shell. I could imagine the inside of
    if (starts_with_dot_dot_slash(url)) {
        ...
    }

may go to its own function, such that it becomes:

    while (url) {
        if (starts_with_dot_dot_slash(url)) {
            adjust_remoteurl_and_url(&url, &remoteurl)
        else if (starts_with_dot_slash(url))
            url += 2;
        else
            break;
    }


with a proper name for adjust_remoteurl_and_url of course.

>
> The final else { if .. else } can be made into else if .. else to
> dedent the overlong die() by one level, but I am wondering if the
> deep nesting is just a symptom of logic being unnecessarily complex.

I don't think it's unnecessary complex, but results from a direct
shell->C translation.

>
>> +             } else if (starts_with_dot_slash(url)) {
>> +                     url += 2;
>> +             } else
>> +                     break;
>> +     }
>> +     strbuf_reset(&sb);
>> +     strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
>> +
>> +     if (starts_with_dot_slash(sb.buf))
>> +             out = xstrdup(sb.buf + 2);
>> +     else
>> +             out = xstrdup(sb.buf);
>> +     strbuf_reset(&sb);
>> +
>> +     free(remoteurl);
>> +     if (!up_path || !is_relative)
>> +             return out;
>> +
>> +     strbuf_addf(&sb, "%s%s", up_path, out);
>> +     free(out);
>> +     return strbuf_detach(&sb, NULL);
>> +}
>
> Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-13 22:03 ` Eric Sunshine
@ 2016-01-13 23:37   ` Stefan Beller
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2016-01-13 23:37 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Junio C Hamano, Johannes Sixt, Jeff King, Jens Lehmann

On Wed, Jan 13, 2016 at 2:03 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> No time presently for a proper review, so just a few superficial comments...

Ok, I incorporated all your suggestions, too. Thanks for the superficial review!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-13 22:47   ` Stefan Beller
@ 2016-01-14 20:50     ` Jens Lehmann
  2016-01-14 23:43       ` Stefan Beller
  2016-01-15 17:37     ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Jens Lehmann @ 2016-01-14 20:50 UTC (permalink / raw)
  To: Stefan Beller, Junio C Hamano
  Cc: git@vger.kernel.org, Johannes Sixt, Jeff King

Am 13.01.2016 um 23:47 schrieb Stefan Beller:
> On Wed, Jan 13, 2016 at 2:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Stefan Beller <sbeller@google.com> writes:
>>
>>> Later on we want to deprecate the `git submodule init` command and make
>>> it implicit in other submodule commands.
>>
>> I doubt there is a concensus for "deprecate" part to warrant the use
>> of "we want" here.  I tend to think that the latter half of the
>> sentence is uncontroversial, i.e. it is a good idea to make other
>> "submodule" subcommands internally call it when it makes sense, and
>> also make knobs available to other commands like "clone" and
>> possibly "checkout" so that the users do not have to do the
>> "submodule init" as a separate step, though.
>
> Maybe I need to rethink my strategy here and deliver a patch series
> which includes a complete port of `submodule init`, and maybe even
> options in checkout (and clone) to run `submodule init`. That way the
> immediate benefit would be clear on why the series is a good idea.

I think that makes lots of sense. It looks to me like clone already
has that option (as --recurse-submodules must init the submodules),
but it might make sense to add such an option to checkout to init
(and then also update) all newly appearing submodules (just like
"git submodule update" has the --init option for the same purpose).

> The current wording is mostly arguing to Jens, how to do the submodule
> groups thing later on, but skipping the immediate steps.

I really believe that in the future a lot of users will hop on to the
automatically-init-and-update-submodules train once we have it (and I
think users of the groups feature want to be on that train by default).

But I also believe we'll have to support the old school init-manually
and update-when-I-want-to use cases for a very long time, as lots of
work flows are built around that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-13 22:03 ` Junio C Hamano
  2016-01-13 22:47   ` Stefan Beller
@ 2016-01-14 20:57   ` Johannes Sixt
  2016-01-14 22:49     ` Stefan Beller
  1 sibling, 1 reply; 23+ messages in thread
From: Johannes Sixt @ 2016-01-14 20:57 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: git, peff, jens.lehmann

Am 13.01.2016 um 23:03 schrieb Junio C Hamano:
> Stefan Beller <sbeller@google.com> writes:
>> +	while (url) {
>> +		if (starts_with_dot_dot_slash(url)) {
>> +			char *rfind;
>> +			url += 3;
>> +
>> +			rfind = last_dir_separator(remoteurl);
>> +			if (rfind)
>> +				*rfind = '\0';
>> +			else {
>> +				rfind = strrchr(remoteurl, ':');
>> +				if (rfind) {
>> +					*rfind = '\0';
>> +					colonsep = 1;
>> +				} else {
>> +					if (is_relative || !strcmp(".", remoteurl))
>> +						die(_("cannot strip one component off url '%s'"), remoteurl);
>> +					else
>> +						remoteurl = xstrdup(".");
>> +				}
>> +			}
>
> It is somewhat hard to see how this avoids stripping one (or both)
> slashes just after "http:" in remoteurl="http://site/path/", leaving
> just "http:/" (or "http:").
>
> This codepath has overly deep nesting levels.  Is this the simplest
> we can do?

The code as written is quite easy to follow when compared to the 
original shell code. I think that is a reasonable goal, and improvements 
can into separate patches.

>
> The final else { if .. else } can be made into else if .. else to
> dedent the overlong die() by one level, but I am wondering if the
> deep nesting is just a symptom of logic being unnecessarily complex.
>
>> +		} else if (starts_with_dot_slash(url)) {
>> +			url += 2;
>> +		} else
>> +			break;
>> +	}

For example, the section that begins here...

>> +	strbuf_reset(&sb);
>> +	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
>> +
>> +	if (starts_with_dot_slash(sb.buf))
>> +		out = xstrdup(sb.buf + 2);
>> +	else
>> +		out = xstrdup(sb.buf);
>> +	strbuf_reset(&sb);
>> +
>> +	free(remoteurl);
>> +	if (!up_path || !is_relative)
>> +		return out;
>> +
>> +	strbuf_addf(&sb, "%s%s", up_path, out);
>> +	free(out);
>> +	return strbuf_detach(&sb, NULL);


... and ends here can easily be rewritten to become a single 
strbuf_addf() without the xstrdup()s and without the early exit (at the 
cost of some additional ?: conditionals in the arguments).

-- Hannes

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-14 20:57   ` Johannes Sixt
@ 2016-01-14 22:49     ` Stefan Beller
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2016-01-14 22:49 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Junio C Hamano, git@vger.kernel.org, Jeff King, Jens Lehmann

> (at the cost of some additional ?: conditionals in the arguments).

I tried that once last year and I could not bring a version which was
easier to read.
I'll try again.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-14 20:50     ` Jens Lehmann
@ 2016-01-14 23:43       ` Stefan Beller
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2016-01-14 23:43 UTC (permalink / raw)
  To: Jens Lehmann
  Cc: Junio C Hamano, git@vger.kernel.org, Johannes Sixt, Jeff King

On Thu, Jan 14, 2016 at 12:50 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
> Am 13.01.2016 um 23:47 schrieb Stefan Beller:
>>
>> On Wed, Jan 13, 2016 at 2:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> Stefan Beller <sbeller@google.com> writes:
>>>
>>>> Later on we want to deprecate the `git submodule init` command and make
>>>> it implicit in other submodule commands.
>>>
>>>
>>> I doubt there is a concensus for "deprecate" part to warrant the use
>>> of "we want" here.  I tend to think that the latter half of the
>>> sentence is uncontroversial, i.e. it is a good idea to make other
>>> "submodule" subcommands internally call it when it makes sense, and
>>> also make knobs available to other commands like "clone" and
>>> possibly "checkout" so that the users do not have to do the
>>> "submodule init" as a separate step, though.
>>
>>
>> Maybe I need to rethink my strategy here and deliver a patch series
>> which includes a complete port of `submodule init`, and maybe even
>> options in checkout (and clone) to run `submodule init`. That way the
>> immediate benefit would be clear on why the series is a good idea.
>
>
> I think that makes lots of sense. It looks to me like clone already
> has that option (as --recurse-submodules must init the submodules),
> but it might make sense to add such an option to checkout to init
> (and then also update) all newly appearing submodules (just like
> "git submodule update" has the --init option for the same purpose).

The next series I'll send out will replace the shell part of `git
submodule init`
with a small wrapper for `git submodule--helper init` which will then have
the functionality to initialize submodules. I'll break that C code in a way
that we'll end up having a function like:

    void init_submodule(const char *path);

After that is donw, I'll try to call this from all the places which currently
do setup a child process for init or `update --init`.

>
>> The current wording is mostly arguing to Jens, how to do the submodule
>> groups thing later on, but skipping the immediate steps.
>
>
> I really believe that in the future a lot of users will hop on to the
> automatically-init-and-update-submodules train once we have it (and I
> think users of the groups feature want to be on that train by default).

Rereading old mail I wonder if we had a miss understanding on the groups
feature or rather the  automatically-init-submodules feature.

As far as I understand initializing git submodules, you can do it multiple times
without hurting yourself, i.e. an implementation of update could look like

update()
{
    auto-init-subs = { }
    if groups selected:
        auto-init-subs = {subs selected by groups}
    foreach uninitialized submodule:
        if submodule has set auto-init (in superprojects .gitmodule I'd guess)
            auto-init-subs += {that submodule}
    if auto-init-subs not empty:
        git submodule init <auto-init-subs>
    update-as-we-know-it
}

and then multiple calls to update() would not hurt.
That way we would not need to add any logic to the init sub command as my
first patch series had. There it was more like:

update()
{
    if groups selected:
        git submodule init --groups # have the logic inside of init
    update-as-we-know-it
}

I think I'll redo the groups patch series as the former now.

>
> But I also believe we'll have to support the old school init-manually
> and update-when-I-want-to use cases for a very long time, as lots of
> work flows are built around that.

Sure, the "submodule init" command is not going away. I just want to have
an easy way to access it from within C code, hence the rewrite effort.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-13 22:47   ` Stefan Beller
  2016-01-14 20:50     ` Jens Lehmann
@ 2016-01-15 17:37     ` Junio C Hamano
  2016-01-15 22:58       ` Stefan Beller
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2016-01-15 17:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Johannes Sixt, Jeff King, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

> On Wed, Jan 13, 2016 at 2:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Stefan Beller <sbeller@google.com> writes:
>>> +     while (url) {
>>> +             if (starts_with_dot_dot_slash(url)) {
>>> +                     char *rfind;
>>> +                     url += 3;
>>> +
>>> +                     rfind = last_dir_separator(remoteurl);
>>> +                     if (rfind)
>>> +                             *rfind = '\0';
>>> +                     else {
>>> +                             rfind = strrchr(remoteurl, ':');
>>> +                             if (rfind) {
>>> +                                     *rfind = '\0';
>>> +                                     colonsep = 1;
>>> +                             } else {
>>> +                                     if (is_relative || !strcmp(".", remoteurl))
>>> +                                             die(_("cannot strip one component off url '%s'"), remoteurl);
>>> +                                     else
>>> +                                             remoteurl = xstrdup(".");
>>> +                             }
>>> +                     }
>>
>> It is somewhat hard to see how this avoids stripping one (or both)
>> slashes just after "http:" in remoteurl="http://site/path/", leaving
>> just "http:/" (or "http:").
>
> it would leave just 'http:/' if url were to be ../../some/where/else,
> such that the constructed url below would be http://some/where/else.

Is that a good outcome, though?  Isn't it something we would want to
catch as an error?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-15 17:37     ` Junio C Hamano
@ 2016-01-15 22:58       ` Stefan Beller
  2016-01-15 23:03         ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Beller @ 2016-01-15 22:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Johannes Sixt, Jeff King, Jens Lehmann

On Fri, Jan 15, 2016 at 9:37 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Wed, Jan 13, 2016 at 2:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Stefan Beller <sbeller@google.com> writes:
>>>> +     while (url) {
>>>> +             if (starts_with_dot_dot_slash(url)) {
>>>> +                     char *rfind;
>>>> +                     url += 3;
>>>> +
>>>> +                     rfind = last_dir_separator(remoteurl);
>>>> +                     if (rfind)
>>>> +                             *rfind = '\0';
>>>> +                     else {
>>>> +                             rfind = strrchr(remoteurl, ':');
>>>> +                             if (rfind) {
>>>> +                                     *rfind = '\0';
>>>> +                                     colonsep = 1;
>>>> +                             } else {
>>>> +                                     if (is_relative || !strcmp(".", remoteurl))
>>>> +                                             die(_("cannot strip one component off url '%s'"), remoteurl);
>>>> +                                     else
>>>> +                                             remoteurl = xstrdup(".");
>>>> +                             }
>>>> +                     }
>>>
>>> It is somewhat hard to see how this avoids stripping one (or both)
>>> slashes just after "http:" in remoteurl="http://site/path/", leaving
>>> just "http:/" (or "http:").
>>
>> it would leave just 'http:/' if url were to be ../../some/where/else,
>> such that the constructed url below would be http://some/where/else.
>
> Is that a good outcome, though?  Isn't it something we would want to
> catch as an error?

I would want to add theses checks later and for now
just port over the code from shell to C. (The same issue
is found in the shell code and nobody seems to bother so far)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] submodule: Port resolve_relative_url from shell to C
  2016-01-15 22:58       ` Stefan Beller
@ 2016-01-15 23:03         ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2016-01-15 23:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Johannes Sixt, Jeff King, Jens Lehmann

Stefan Beller <sbeller@google.com> writes:

> On Fri, Jan 15, 2016 at 9:37 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> It is somewhat hard to see how this avoids stripping one (or both)
>>>> slashes just after "http:" in remoteurl="http://site/path/", leaving
>>>> just "http:/" (or "http:").
>>>
>>> it would leave just 'http:/' if url were to be ../../some/where/else,
>>> such that the constructed url below would be http://some/where/else.
>>
>> Is that a good outcome, though?  Isn't it something we would want to
>> catch as an error?
>
> I would want to add theses checks later and for now
> just port over the code from shell to C. (The same issue
> is found in the shell code and nobody seems to bother so far)

Understood and I think that is a good direction to go.  Perhaps
leave a comment in the area to document it as a known bug (or a
NEEDSWORK) to make it more obvious and to help remember it?

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2016-01-15 23:03 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-10  1:07 [PATCH] submodule: Port resolve_relative_url from shell to C Stefan Beller
2015-12-10  6:48 ` Johannes Sixt
2015-12-16 22:36   ` Stefan Beller
2015-12-17  0:26 ` [PATCHv2] Porting " Stefan Beller
2015-12-17  7:47   ` Torsten Bögershausen
2015-12-17  0:26 ` [PATCHv2] submodule: Port " Stefan Beller
2015-12-17  8:17   ` Eric Sunshine
2015-12-17 18:55   ` Johannes Sixt
2015-12-17 19:17     ` Johannes Sixt
2015-12-18  3:27     ` Jeff King
2016-01-11 20:17     ` Stefan Beller
  -- strict thread matches above, loose matches on Subject: below --
2016-01-13 18:15 [PATCH] " Stefan Beller
2016-01-13 22:03 ` Junio C Hamano
2016-01-13 22:47   ` Stefan Beller
2016-01-14 20:50     ` Jens Lehmann
2016-01-14 23:43       ` Stefan Beller
2016-01-15 17:37     ` Junio C Hamano
2016-01-15 22:58       ` Stefan Beller
2016-01-15 23:03         ` Junio C Hamano
2016-01-14 20:57   ` Johannes Sixt
2016-01-14 22:49     ` Stefan Beller
2016-01-13 22:03 ` Eric Sunshine
2016-01-13 23:37   ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).