git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs
@ 2017-06-16 19:28 SZEDER Gábor
  2017-06-16 19:28 ` [PATCH 1/5] remote: don't use remote->{fetch,push}_refspec SZEDER Gábor
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: SZEDER Gábor @ 2017-06-16 19:28 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Jonathan Nieder, SZEDER Gábor


'struct remote' stores refspecs twice: once in their original string
form in remote->{fetch,push}_refspecs and once in their parsed form in
remote->{fetch,push}.  This is necessary, because we need the refspecs
for lazy parsing after we finished reading the configuration: we don't
want to die() on a bogus refspec while reading the configuration of a
remote we are not going to access at all.

However, storing refspecs in both forms has some drawbacks:

  - The same information is stored twice, wasting memory.
  - remote->{fetch,push}_refspecs, i.e. the string arrays are
    conveniently ALLOC_GROW()-able with associated
    {fetch,push}_refspec_{nr,alloc} fields, but remote->{fetch,push}
    are not.
  - Wherever remote->{fetch,push} are accessed, the number of parsed
    refspecs in there is specified by remote->{fetch,push}_refspec_nr.
    This requires us to keep the two arrays in sync and makes adding
    additional refspecs cumbersome and error prone.

  - And worst of all, it pissed me off while working on
    sg/clone-refspec-from-command-line-config ;)


So here is my crack at getting rid of them.

The idea is to parse refspecs gently while reading the configuration:
this way we won't need to store all refspecs as strings, and won't
die() on a bogus refspec right away.  A bogus refspec, if there's one,
will be stored in the remote it belongs to, so it will be available
later when that remote is accessed and can be used in the error
message.

At the end of the series the remote API will have public functions
add_{fetch,push}_refspec() for, well, adding refspecs, including
parsing, to those that were already present in the configuration.  I'm
not sure what the use case for adding extra push refspecs could be,
but sg/clone-refspec-from-command-line-config has shown that there is
a need for adding fetch refspecs, and this way they are at least
symmetrical.  Though I don't see any other use case for adding extra
fetch refspecs, either.  Anyway, there's less subtlety without the
need to keep the two arrays ->fetch and ->fetch_refspec in sync.


The first three patches are preparation.
Patch 3 was done with --patience, otherwise it's unreadable.  -w will
help even more to see what's going on.
The last patch's commit message is rather terse... but I deemed it
sufficient for an initial RFC; will expand it later, if there's an
agreement that this approach is worth pursuing.


This applies on top of a merge of master and the fresh reroll (v5) of
sg/clone-refspec-from-command-line-config:

  http://public-inbox.org/git/20170616173849.8071-1-szeder.dev@gmail.com/T/#m7f9c81fb415920acc2af93f432aecfa52477e6b8

It's also available at:

  https://github.com/szeder/git remote-no-string-refspecs


Best,
Gábor


SZEDER Gábor (5):
  remote: don't use remote->{fetch,push}_refspec
  remote.c: don't pass copies of refspecs to add_{fetch,push}_refspec()
  remote.c: extract a helper function to parse a single refspec
  remote.c: eliminate remote->fetch_refspec
  remote.c: eliminate remote->push_refspec

 builtin/clone.c  |   7 +-
 builtin/fetch.c  |  20 ++--
 builtin/push.c   |  12 ++-
 builtin/remote.c |  47 ++++++---
 remote.c         | 316 +++++++++++++++++++++++++++++++------------------------
 remote.h         |  31 ++++--
 6 files changed, 248 insertions(+), 185 deletions(-)

-- 
2.13.1.505.g7cc9fcafb


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] remote: don't use remote->{fetch,push}_refspec
  2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
@ 2017-06-16 19:28 ` SZEDER Gábor
  2017-06-16 22:00   ` Junio C Hamano
  2017-06-16 19:28 ` [PATCH 2/5] remote.c: don't pass copies of refspecs to add_{fetch,push}_refspec() SZEDER Gábor
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: SZEDER Gábor @ 2017-06-16 19:28 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Jonathan Nieder, SZEDER Gábor

builtin/remote.c uses remote->fetch_refspec and remote->push_refspec,
i.e. refspecs as strings, in a few places, e.g. in an error message or
to set configuration variables.

Since we are about to eliminate remote->{fetch,push}_refspec, recreate
those strings from the corresponding remote->{fetch,push} entries.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/remote.c | 22 +++++++++++++++-------
 remote.c         | 20 ++++++++++++++++++++
 remote.h         |  2 ++
 3 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/builtin/remote.c b/builtin/remote.c
index f1a88fe26..7f0072fe5 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -334,7 +334,7 @@ static int get_ref_states(const struct ref *remote_refs, struct ref_states *stat
 	for (i = 0; i < states->remote->fetch_refspec_nr; i++)
 		if (get_fetch_map(remote_refs, states->remote->fetch + i, &tail, 1))
 			die(_("Could not get fetch map for refspec %s"),
-				states->remote->fetch_refspec[i]);
+			      refspec_to_string(&states->remote->fetch[i]));
 
 	states->new.strdup_strings = 1;
 	states->tracked.strdup_strings = 1;
@@ -576,7 +576,7 @@ static int read_remote_branches(const char *refname,
 
 static int migrate_file(struct remote *remote)
 {
-	struct strbuf buf = STRBUF_INIT;
+	struct strbuf buf = STRBUF_INIT, refspec = STRBUF_INIT;
 	int i;
 
 	strbuf_addf(&buf, "remote.%s.url", remote->name);
@@ -584,17 +584,25 @@ static int migrate_file(struct remote *remote)
 		git_config_set_multivar(buf.buf, remote->url[i], "^$", 0);
 	strbuf_reset(&buf);
 	strbuf_addf(&buf, "remote.%s.push", remote->name);
-	for (i = 0; i < remote->push_refspec_nr; i++)
-		git_config_set_multivar(buf.buf, remote->push_refspec[i], "^$", 0);
+	for (i = 0; i < remote->push_refspec_nr; i++) {
+		strbuf_add_refspec(&refspec, &remote->push[i]);
+		git_config_set_multivar(buf.buf, refspec.buf, "^$", 0);
+		strbuf_reset(&refspec);
+	}
 	strbuf_reset(&buf);
 	strbuf_addf(&buf, "remote.%s.fetch", remote->name);
-	for (i = 0; i < remote->fetch_refspec_nr; i++)
-		git_config_set_multivar(buf.buf, remote->fetch_refspec[i], "^$", 0);
+	for (i = 0; i < remote->fetch_refspec_nr; i++) {
+		strbuf_add_refspec(&refspec, &remote->fetch[i]);
+		git_config_set_multivar(buf.buf, refspec.buf, "^$", 0);
+		strbuf_reset(&refspec);
+	}
 	if (remote->origin == REMOTE_REMOTES)
 		unlink_or_warn(git_path("remotes/%s", remote->name));
 	else if (remote->origin == REMOTE_BRANCHES)
 		unlink_or_warn(git_path("branches/%s", remote->name));
 
+	strbuf_release(&buf);
+	strbuf_release(&refspec);
 	return 0;
 }
 
@@ -647,7 +655,7 @@ static int mv(int argc, const char **argv)
 		char *ptr;
 
 		strbuf_reset(&buf2);
-		strbuf_addstr(&buf2, oldremote->fetch_refspec[i]);
+		strbuf_add_refspec(&buf2, &oldremote->fetch[i]);
 		ptr = strstr(buf2.buf, old_remote_context.buf);
 		if (ptr) {
 			refspec_updated = 1;
diff --git a/remote.c b/remote.c
index 336db8298..a021decee 100644
--- a/remote.c
+++ b/remote.c
@@ -919,6 +919,26 @@ char *apply_refspecs(struct refspec *refspecs, int nr_refspec,
 	return query.dst;
 }
 
+void strbuf_add_refspec(struct strbuf *sb, const struct refspec *refspec)
+{
+	if (refspec->force)
+		strbuf_addch(sb, '+');
+	if (refspec->src)
+		strbuf_addstr(sb, refspec->src);
+	if (refspec->dst) {
+		strbuf_addch(sb, ':');
+		strbuf_addstr(sb, refspec->dst);
+	} else if (!refspec->src)
+		strbuf_addch(sb, ':');
+}
+
+char *refspec_to_string(const struct refspec *refspec)
+{
+	struct strbuf sb = STRBUF_INIT;
+	strbuf_add_refspec(&sb, refspec);
+	return strbuf_detach(&sb, NULL);
+}
+
 int remote_find_tracking(struct remote *remote, struct refspec *refspec)
 {
 	return query_refspecs(remote->fetch, remote->fetch_refspec_nr, refspec);
diff --git a/remote.h b/remote.h
index 9619f94dd..ee6c432d0 100644
--- a/remote.h
+++ b/remote.h
@@ -177,6 +177,8 @@ void free_refspec(int nr_refspec, struct refspec *refspec);
 extern int query_refspecs(struct refspec *specs, int nr, struct refspec *query);
 char *apply_refspecs(struct refspec *refspecs, int nr_refspec,
 		     const char *name);
+void strbuf_add_refspec(struct strbuf *sb, const struct refspec *refspec);
+char *refspec_to_string(const struct refspec *refspec);
 
 int check_push_refs(struct ref *src, int nr_refspec, const char **refspec);
 int match_push_refs(struct ref *src, struct ref **dst,
-- 
2.13.1.505.g7cc9fcafb


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] remote.c: don't pass copies of refspecs to add_{fetch,push}_refspec()
  2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
  2017-06-16 19:28 ` [PATCH 1/5] remote: don't use remote->{fetch,push}_refspec SZEDER Gábor
@ 2017-06-16 19:28 ` SZEDER Gábor
  2017-06-16 19:28 ` [PATCH 3/5] remote.c: extract a helper function to parse a single refspec SZEDER Gábor
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: SZEDER Gábor @ 2017-06-16 19:28 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Jonathan Nieder, SZEDER Gábor

Currently neither add_fetch_refspec() nor add_push_refspec() duplicate
the refspecs before appending them to an array of refspecs, therefore
all their callers pass them copies of refspecs.  Soon we won't store
refspecs as strings, therefore passing duplicated strings to these
functions will not be necessary.

Perform the copying inside these functions and modify their callers to
either not pass duplicates or free() them to avoid leaks.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/clone.c |  3 +--
 remote.c        | 21 ++++++++++++++-------
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 869e093ff..5b72d853f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1054,8 +1054,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	remote = remote_get(option_origin);
 	strbuf_addf(&default_refspec, "+%s*:%s*", src_ref_prefix,
 		    branch_top.buf);
-	add_and_parse_fetch_refspec(remote,
-				    strbuf_detach(&default_refspec, NULL));
+	add_and_parse_fetch_refspec(remote, default_refspec.buf);
 
 	transport = transport_get(remote, remote->url[0]);
 	transport_set_verbosity(transport, option_verbosity, option_progress);
diff --git a/remote.c b/remote.c
index a021decee..d23518afd 100644
--- a/remote.c
+++ b/remote.c
@@ -91,7 +91,7 @@ static void add_push_refspec(struct remote *remote, const char *ref)
 	ALLOC_GROW(remote->push_refspec,
 		   remote->push_refspec_nr + 1,
 		   remote->push_refspec_alloc);
-	remote->push_refspec[remote->push_refspec_nr++] = ref;
+	remote->push_refspec[remote->push_refspec_nr++] = strdup(ref);
 }
 
 static void add_fetch_refspec(struct remote *remote, const char *ref)
@@ -99,7 +99,7 @@ static void add_fetch_refspec(struct remote *remote, const char *ref)
 	ALLOC_GROW(remote->fetch_refspec,
 		   remote->fetch_refspec_nr + 1,
 		   remote->fetch_refspec_alloc);
-	remote->fetch_refspec[remote->fetch_refspec_nr++] = ref;
+	remote->fetch_refspec[remote->fetch_refspec_nr++] = strdup(ref);
 }
 
 static void add_url(struct remote *remote, const char *url)
@@ -265,9 +265,9 @@ static void read_remotes_file(struct remote *remote)
 		if (skip_prefix(buf.buf, "URL:", &v))
 			add_url_alias(remote, xstrdup(skip_spaces(v)));
 		else if (skip_prefix(buf.buf, "Push:", &v))
-			add_push_refspec(remote, xstrdup(skip_spaces(v)));
+			add_push_refspec(remote, skip_spaces(v));
 		else if (skip_prefix(buf.buf, "Pull:", &v))
-			add_fetch_refspec(remote, xstrdup(skip_spaces(v)));
+			add_fetch_refspec(remote, skip_spaces(v));
 	}
 	strbuf_release(&buf);
 	fclose(f);
@@ -306,15 +306,20 @@ static void read_branches_file(struct remote *remote)
 		frag = "master";
 
 	add_url_alias(remote, strbuf_detach(&buf, NULL));
-	add_fetch_refspec(remote, xstrfmt("refs/heads/%s:refs/heads/%s",
-					  frag, remote->name));
+
+	strbuf_addf(&buf, "refs/heads/%s:refs/heads/%s", frag, remote->name);
+	add_fetch_refspec(remote, buf.buf);
+	strbuf_reset(&buf);
 
 	/*
 	 * Cogito compatible push: push current HEAD to remote #branch
 	 * (master if missing)
 	 */
-	add_push_refspec(remote, xstrfmt("HEAD:refs/heads/%s", frag));
+	strbuf_addf(&buf, "HEAD:refs/heads/%s", frag);
+	add_push_refspec(remote, buf.buf);
 	remote->fetch_tags = 1; /* always auto-follow */
+
+	strbuf_release(&buf);
 }
 
 static int handle_config(const char *key, const char *value, void *cb)
@@ -398,11 +403,13 @@ static int handle_config(const char *key, const char *value, void *cb)
 		if (git_config_string(&v, key, value))
 			return -1;
 		add_push_refspec(remote, v);
+		free((char*)v);
 	} else if (!strcmp(subkey, "fetch")) {
 		const char *v;
 		if (git_config_string(&v, key, value))
 			return -1;
 		add_fetch_refspec(remote, v);
+		free((char*)v);
 	} else if (!strcmp(subkey, "receivepack")) {
 		const char *v;
 		if (git_config_string(&v, key, value))
-- 
2.13.1.505.g7cc9fcafb


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] remote.c: extract a helper function to parse a single refspec
  2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
  2017-06-16 19:28 ` [PATCH 1/5] remote: don't use remote->{fetch,push}_refspec SZEDER Gábor
  2017-06-16 19:28 ` [PATCH 2/5] remote.c: don't pass copies of refspecs to add_{fetch,push}_refspec() SZEDER Gábor
@ 2017-06-16 19:28 ` SZEDER Gábor
  2017-06-16 19:28 ` [PATCH 4/5] remote.c: eliminate remote->fetch_refspec SZEDER Gábor
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: SZEDER Gábor @ 2017-06-16 19:28 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Jonathan Nieder, SZEDER Gábor

Fetch refspecs read from the configuration are currently parsed
lazily: they are collected into a string array for each remote while
reading the configuration and then refspecs of a particular remote are
parsed together later when the remote is accessed by remote_get() or
for_each_remote().

We are about to parse refspecs _while_ reading them from the
configuration, one by one, and a function parsing a single refspec
into a memory location provided by the caller will be quite useful
then.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 remote.c | 231 +++++++++++++++++++++++++++++++++------------------------------
 1 file changed, 122 insertions(+), 109 deletions(-)

diff --git a/remote.c b/remote.c
index d23518afd..fc1d3cf7a 100644
--- a/remote.c
+++ b/remote.c
@@ -484,123 +484,136 @@ static void read_config(void)
 	alias_all_urls();
 }
 
+static int parse_one_refspec(struct refspec *rs, const char *refspec,
+			     int fetch, int gently)
+{
+	size_t llen;
+	int is_glob;
+	const char *lhs, *rhs;
+	int flags;
+
+	memset(rs, 0, sizeof(*rs));
+
+	is_glob = 0;
+
+	lhs = refspec;
+	if (*lhs == '+') {
+		rs->force = 1;
+		lhs++;
+	}
+
+	rhs = strrchr(lhs, ':');
+
+	/*
+	 * Before going on, special case ":" (or "+:") as a refspec
+	 * for pushing matching refs.
+	 */
+	if (!fetch && rhs == lhs && rhs[1] == '\0') {
+		rs->matching = 1;
+		return 0;
+	}
+
+	if (rhs) {
+		size_t rlen = strlen(++rhs);
+		is_glob = (1 <= rlen && strchr(rhs, '*'));
+		rs->dst = xstrndup(rhs, rlen);
+	}
+
+	llen = (rhs ? (rhs - lhs - 1) : strlen(lhs));
+	if (1 <= llen && memchr(lhs, '*', llen)) {
+		if ((rhs && !is_glob) || (!rhs && fetch))
+			goto invalid;
+		is_glob = 1;
+	} else if (rhs && is_glob) {
+		goto invalid;
+	}
+
+	rs->pattern = is_glob;
+	rs->src = xstrndup(lhs, llen);
+	flags = REFNAME_ALLOW_ONELEVEL | (is_glob ? REFNAME_REFSPEC_PATTERN : 0);
+
+	if (fetch) {
+		struct object_id unused;
+
+		/* LHS */
+		if (!*rs->src)
+			; /* empty is ok; it means "HEAD" */
+		else if (llen == GIT_SHA1_HEXSZ && !get_oid_hex(rs->src, &unused))
+			rs->exact_sha1 = 1; /* ok */
+		else if (!check_refname_format(rs->src, flags))
+			; /* valid looking ref is ok */
+		else
+			goto invalid;
+		/* RHS */
+		if (!rs->dst)
+			; /* missing is ok; it is the same as empty */
+		else if (!*rs->dst)
+			; /* empty is ok; it means "do not store" */
+		else if (!check_refname_format(rs->dst, flags))
+			; /* valid looking ref is ok */
+		else
+			goto invalid;
+	} else {
+		/*
+		 * LHS
+		 * - empty is allowed; it means delete.
+		 * - when wildcarded, it must be a valid looking ref.
+		 * - otherwise, it must be an extended SHA-1, but
+		 *   there is no existing way to validate this.
+		 */
+		if (!*rs->src)
+			; /* empty is ok */
+		else if (is_glob) {
+			if (check_refname_format(rs->src, flags))
+				goto invalid;
+		}
+		else
+			; /* anything goes, for now */
+		/*
+		 * RHS
+		 * - missing is allowed, but LHS then must be a
+		 *   valid looking ref.
+		 * - empty is not allowed.
+		 * - otherwise it must be a valid looking ref.
+		 */
+		if (!rs->dst) {
+			if (check_refname_format(rs->src, flags))
+				goto invalid;
+		} else if (!*rs->dst) {
+			goto invalid;
+		} else {
+			if (check_refname_format(rs->dst, flags))
+				goto invalid;
+		}
+	}
+
+	return 0;
+
+ invalid:
+	if (gently) {
+		free(rs->src);
+		free(rs->dst);
+		return -1;
+	}
+	die("Invalid refspec '%s'", refspec);
+}
+
 static struct refspec *parse_refspec_internal(int nr_refspec, const char **refspec, int fetch, int verify)
 {
 	int i;
-	struct refspec *rs = xcalloc(nr_refspec, sizeof(*rs));
+	struct refspec *rs;
+
+	ALLOC_ARRAY(rs, nr_refspec);
 
 	for (i = 0; i < nr_refspec; i++) {
-		size_t llen;
-		int is_glob;
-		const char *lhs, *rhs;
-		int flags;
-
-		is_glob = 0;
-
-		lhs = refspec[i];
-		if (*lhs == '+') {
-			rs[i].force = 1;
-			lhs++;
-		}
-
-		rhs = strrchr(lhs, ':');
-
-		/*
-		 * Before going on, special case ":" (or "+:") as a refspec
-		 * for pushing matching refs.
-		 */
-		if (!fetch && rhs == lhs && rhs[1] == '\0') {
-			rs[i].matching = 1;
-			continue;
-		}
-
-		if (rhs) {
-			size_t rlen = strlen(++rhs);
-			is_glob = (1 <= rlen && strchr(rhs, '*'));
-			rs[i].dst = xstrndup(rhs, rlen);
-		}
-
-		llen = (rhs ? (rhs - lhs - 1) : strlen(lhs));
-		if (1 <= llen && memchr(lhs, '*', llen)) {
-			if ((rhs && !is_glob) || (!rhs && fetch))
-				goto invalid;
-			is_glob = 1;
-		} else if (rhs && is_glob) {
-			goto invalid;
-		}
-
-		rs[i].pattern = is_glob;
-		rs[i].src = xstrndup(lhs, llen);
-		flags = REFNAME_ALLOW_ONELEVEL | (is_glob ? REFNAME_REFSPEC_PATTERN : 0);
-
-		if (fetch) {
-			struct object_id unused;
-
-			/* LHS */
-			if (!*rs[i].src)
-				; /* empty is ok; it means "HEAD" */
-			else if (llen == GIT_SHA1_HEXSZ && !get_oid_hex(rs[i].src, &unused))
-				rs[i].exact_sha1 = 1; /* ok */
-			else if (!check_refname_format(rs[i].src, flags))
-				; /* valid looking ref is ok */
-			else
-				goto invalid;
-			/* RHS */
-			if (!rs[i].dst)
-				; /* missing is ok; it is the same as empty */
-			else if (!*rs[i].dst)
-				; /* empty is ok; it means "do not store" */
-			else if (!check_refname_format(rs[i].dst, flags))
-				; /* valid looking ref is ok */
-			else
-				goto invalid;
-		} else {
-			/*
-			 * LHS
-			 * - empty is allowed; it means delete.
-			 * - when wildcarded, it must be a valid looking ref.
-			 * - otherwise, it must be an extended SHA-1, but
-			 *   there is no existing way to validate this.
-			 */
-			if (!*rs[i].src)
-				; /* empty is ok */
-			else if (is_glob) {
-				if (check_refname_format(rs[i].src, flags))
-					goto invalid;
-			}
-			else
-				; /* anything goes, for now */
-			/*
-			 * RHS
-			 * - missing is allowed, but LHS then must be a
-			 *   valid looking ref.
-			 * - empty is not allowed.
-			 * - otherwise it must be a valid looking ref.
-			 */
-			if (!rs[i].dst) {
-				if (check_refname_format(rs[i].src, flags))
-					goto invalid;
-			} else if (!*rs[i].dst) {
-				goto invalid;
-			} else {
-				if (check_refname_format(rs[i].dst, flags))
-					goto invalid;
-			}
+		if (parse_one_refspec(&rs[i], refspec[i], fetch, verify) < 0) {
+			/* verify != 0 here, because parse_one_refspec()
+			 * would have already die()d otherwise. */
+			free_refspec(i, rs);
+			return NULL;
 		}
 	}
 	return rs;
-
- invalid:
-	if (verify) {
-		/*
-		 * nr_refspec must be greater than zero and i must be valid
-		 * since it is only possible to reach this point from within
-		 * the for loop above.
-		 */
-		free_refspec(i+1, rs);
-		return NULL;
-	}
-	die("Invalid refspec '%s'", refspec[i]);
 }
 
 int valid_fetch_refspec(const char *fetch_refspec_str)
-- 
2.13.1.505.g7cc9fcafb


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] remote.c: eliminate remote->fetch_refspec
  2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
                   ` (2 preceding siblings ...)
  2017-06-16 19:28 ` [PATCH 3/5] remote.c: extract a helper function to parse a single refspec SZEDER Gábor
@ 2017-06-16 19:28 ` SZEDER Gábor
  2017-06-16 19:28 ` [PATCH 5/5] remote.c: eliminate remote->push_refspec SZEDER Gábor
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: SZEDER Gábor @ 2017-06-16 19:28 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Jonathan Nieder, SZEDER Gábor

'struct remote' stores fetch refspecs twice: once in their original
string form in remote->fetch_refspecs and once in their parsed form in
remote->fetch.

The main reason for this is that we want to read the configuration
only once, but we don't want to error out while doing so because of a
bogus refspec in a remote that we aren't accessing at all.  Therefore,
when the configuration is read all refspecs are simply stored in
string arrays for each remote, and then refspecs of a particular
remote are parsed lazily, only when that remote is actually accessed.

However, storing refspecs in both forms has some drawbacks:

  - The same information is stored twice, wasting memory.
  - remote->fetch_refspecs, i.e. the string array is conveniently
    ALLOC_GROW()-able with associated 'fetch_refspec_{nr,alloc}'
    fields, but remote->fetch is not.
  - Wherever remote->fetch are accessed, the number of parsed refspecs
    in there is specified by remote->fetch_refspec_nr.  This requires
    us to keep the two arrays in sync and makes adding additional
    refspecs cumbersome and error prone.

So eliminate remote->fetch_refspec and parse fetch refspecs right away
while they are being read from the configuration.  However, to avoid
erroring out on bogus refspecs in "uninteresting" remotes, parse them
gently: instead of die()ing, store the problematic refspec in
remote->bogus_refspec, so it will be available later when that remote
is actually accessed and we can use it in the error message.  Make
remote->fetch ALLOC_GROW()-able.  Add a new add_fetch_refspec()
function to the remote API, replacing an old function with the same
name and add_and_parse_fetch_refspec(), to encompass memory
management of remote->fetch and parsing the new refspec (gently or
otherwise).  Update all sites accessing fetch-refspec-related fields
to use the new field names.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/clone.c  |  6 +++---
 builtin/fetch.c  | 20 ++++++++++----------
 builtin/remote.c | 18 +++++++++---------
 remote.c         | 55 +++++++++++++++++++++++++++++--------------------------
 remote.h         | 23 ++++++++++++++++++-----
 5 files changed, 69 insertions(+), 53 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 5b72d853f..05cc57f79 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1054,7 +1054,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	remote = remote_get(option_origin);
 	strbuf_addf(&default_refspec, "+%s*:%s*", src_ref_prefix,
 		    branch_top.buf);
-	add_and_parse_fetch_refspec(remote, default_refspec.buf);
+	add_fetch_refspec(remote, default_refspec.buf, 0);
 
 	transport = transport_get(remote, remote->url[0]);
 	transport_set_verbosity(transport, option_verbosity, option_progress);
@@ -1106,8 +1106,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	refs = transport_get_remote_refs(transport);
 
 	if (refs) {
-		mapped_refs = wanted_peer_refs(refs, remote->fetch,
-					       remote->fetch_refspec_nr);
+		mapped_refs = wanted_peer_refs(refs, remote->fetch.rs,
+					       remote->fetch.nr);
 		/*
 		 * transport_get_remote_refs() may return refs with null sha-1
 		 * in mapped_refs (see struct transport->get_refs_list
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 100248c5a..0f791a12f 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -360,8 +360,8 @@ static struct ref *get_ref_map(struct transport *transport,
 			fetch_refspec = parse_fetch_refspec(refmap_nr, refmap_array);
 			fetch_refspec_nr = refmap_nr;
 		} else {
-			fetch_refspec = transport->remote->fetch;
-			fetch_refspec_nr = transport->remote->fetch_refspec_nr;
+			fetch_refspec = transport->remote->fetch.rs;
+			fetch_refspec_nr = transport->remote->fetch.nr;
 		}
 
 		for (i = 0; i < fetch_refspec_nr; i++)
@@ -374,16 +374,16 @@ static struct ref *get_ref_map(struct transport *transport,
 		struct branch *branch = branch_get(NULL);
 		int has_merge = branch_has_merge_config(branch);
 		if (remote &&
-		    (remote->fetch_refspec_nr ||
+		    (remote->fetch.nr ||
 		     /* Note: has_merge implies non-NULL branch->remote_name */
 		     (has_merge && !strcmp(branch->remote_name, remote->name)))) {
-			for (i = 0; i < remote->fetch_refspec_nr; i++) {
-				get_fetch_map(remote_refs, &remote->fetch[i], &tail, 0);
-				if (remote->fetch[i].dst &&
-				    remote->fetch[i].dst[0])
+			for (i = 0; i < remote->fetch.nr; i++) {
+				get_fetch_map(remote_refs, &remote->fetch.rs[i], &tail, 0);
+				if (remote->fetch.rs[i].dst &&
+				    remote->fetch.rs[i].dst[0])
 					*autotags = 1;
 				if (!i && !has_merge && ref_map &&
-				    !remote->fetch[0].pattern)
+				    !remote->fetch.rs[0].pattern)
 					ref_map->fetch_head_status = FETCH_HEAD_MERGE;
 			}
 			/*
@@ -1117,8 +1117,8 @@ static int do_fetch(struct transport *transport,
 		if (ref_count) {
 			prune_refs(refs, ref_count, ref_map, transport->url);
 		} else {
-			prune_refs(transport->remote->fetch,
-				   transport->remote->fetch_refspec_nr,
+			prune_refs(transport->remote->fetch.rs,
+				   transport->remote->fetch.nr,
 				   ref_map,
 				   transport->url);
 		}
diff --git a/builtin/remote.c b/builtin/remote.c
index 7f0072fe5..d61daa5e8 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -331,10 +331,10 @@ static int get_ref_states(const struct ref *remote_refs, struct ref_states *stat
 	struct ref *ref, *stale_refs;
 	int i;
 
-	for (i = 0; i < states->remote->fetch_refspec_nr; i++)
-		if (get_fetch_map(remote_refs, states->remote->fetch + i, &tail, 1))
+	for (i = 0; i < states->remote->fetch.nr; i++)
+		if (get_fetch_map(remote_refs, states->remote->fetch.rs + i, &tail, 1))
 			die(_("Could not get fetch map for refspec %s"),
-			      refspec_to_string(&states->remote->fetch[i]));
+			      refspec_to_string(&states->remote->fetch.rs[i]));
 
 	states->new.strdup_strings = 1;
 	states->tracked.strdup_strings = 1;
@@ -345,8 +345,8 @@ static int get_ref_states(const struct ref *remote_refs, struct ref_states *stat
 		else
 			string_list_append(&states->tracked, abbrev_branch(ref->name));
 	}
-	stale_refs = get_stale_heads(states->remote->fetch,
-				     states->remote->fetch_refspec_nr, fetch_map);
+	stale_refs = get_stale_heads(states->remote->fetch.rs,
+				     states->remote->fetch.nr, fetch_map);
 	for (ref = stale_refs; ref; ref = ref->next) {
 		struct string_list_item *item =
 			string_list_append(&states->stale, abbrev_branch(ref->name));
@@ -591,8 +591,8 @@ static int migrate_file(struct remote *remote)
 	}
 	strbuf_reset(&buf);
 	strbuf_addf(&buf, "remote.%s.fetch", remote->name);
-	for (i = 0; i < remote->fetch_refspec_nr; i++) {
-		strbuf_add_refspec(&refspec, &remote->fetch[i]);
+	for (i = 0; i < remote->fetch.nr; i++) {
+		strbuf_add_refspec(&refspec, &remote->fetch.rs[i]);
 		git_config_set_multivar(buf.buf, refspec.buf, "^$", 0);
 		strbuf_reset(&refspec);
 	}
@@ -651,11 +651,11 @@ static int mv(int argc, const char **argv)
 	strbuf_addf(&buf, "remote.%s.fetch", rename.new);
 	git_config_set_multivar(buf.buf, NULL, NULL, 1);
 	strbuf_addf(&old_remote_context, ":refs/remotes/%s/", rename.old);
-	for (i = 0; i < oldremote->fetch_refspec_nr; i++) {
+	for (i = 0; i < oldremote->fetch.nr; i++) {
 		char *ptr;
 
 		strbuf_reset(&buf2);
-		strbuf_add_refspec(&buf2, &oldremote->fetch[i]);
+		strbuf_add_refspec(&buf2, &oldremote->fetch.rs[i]);
 		ptr = strstr(buf2.buf, old_remote_context.buf);
 		if (ptr) {
 			refspec_updated = 1;
diff --git a/remote.c b/remote.c
index fc1d3cf7a..952000b61 100644
--- a/remote.c
+++ b/remote.c
@@ -94,14 +94,6 @@ static void add_push_refspec(struct remote *remote, const char *ref)
 	remote->push_refspec[remote->push_refspec_nr++] = strdup(ref);
 }
 
-static void add_fetch_refspec(struct remote *remote, const char *ref)
-{
-	ALLOC_GROW(remote->fetch_refspec,
-		   remote->fetch_refspec_nr + 1,
-		   remote->fetch_refspec_alloc);
-	remote->fetch_refspec[remote->fetch_refspec_nr++] = strdup(ref);
-}
-
 static void add_url(struct remote *remote, const char *url)
 {
 	ALLOC_GROW(remote->url, remote->url_nr + 1, remote->url_alloc);
@@ -267,7 +259,7 @@ static void read_remotes_file(struct remote *remote)
 		else if (skip_prefix(buf.buf, "Push:", &v))
 			add_push_refspec(remote, skip_spaces(v));
 		else if (skip_prefix(buf.buf, "Pull:", &v))
-			add_fetch_refspec(remote, skip_spaces(v));
+			add_fetch_refspec(remote, skip_spaces(v), 1);
 	}
 	strbuf_release(&buf);
 	fclose(f);
@@ -308,7 +300,7 @@ static void read_branches_file(struct remote *remote)
 	add_url_alias(remote, strbuf_detach(&buf, NULL));
 
 	strbuf_addf(&buf, "refs/heads/%s:refs/heads/%s", frag, remote->name);
-	add_fetch_refspec(remote, buf.buf);
+	add_fetch_refspec(remote, buf.buf, 1);
 	strbuf_reset(&buf);
 
 	/*
@@ -408,7 +400,7 @@ static int handle_config(const char *key, const char *value, void *cb)
 		const char *v;
 		if (git_config_string(&v, key, value))
 			return -1;
-		add_fetch_refspec(remote, v);
+		add_fetch_refspec(remote, v, 1);
 		free((char*)v);
 	} else if (!strcmp(subkey, "receivepack")) {
 		const char *v;
@@ -630,17 +622,28 @@ struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec)
 	return parse_refspec_internal(nr_refspec, refspec, 1, 0);
 }
 
-void add_and_parse_fetch_refspec(struct remote *remote, const char *refspec)
+static int add_refspec(struct remote *remote, const char *refspec,
+		       int fetch, int gently)
 {
-	struct refspec *rs;
+	struct refspec_array *rsa = fetch ? &remote->fetch : NULL;
 
-	add_fetch_refspec(remote, refspec);
-	rs = parse_fetch_refspec(1, &refspec);
-	REALLOC_ARRAY(remote->fetch, remote->fetch_refspec_nr);
-	remote->fetch[remote->fetch_refspec_nr - 1] = *rs;
+	ALLOC_GROW(rsa->rs, rsa->nr + 1, rsa->alloc);
 
-	/* Not free_refspecs(), as we copied its pointers above */
-	free(rs);
+	if (parse_one_refspec(&rsa->rs[rsa->nr], refspec, fetch, gently) < 0) {
+		if (gently) {
+			if (!remote->bogus_refspec)
+				remote->bogus_refspec = strdup(refspec);
+			return -1;
+		} else
+			die("Invalid refspec: '%s'", refspec);
+	}
+	rsa->nr++;
+	return 0;
+}
+
+int add_fetch_refspec(struct remote *remote, const char *refspec, int gently)
+{
+	return add_refspec(remote, refspec, 1, gently);
 }
 
 struct refspec *parse_push_refspec(int nr_refspec, const char **refspec)
@@ -725,7 +728,8 @@ static struct remote *remote_get_1(const char *name,
 		add_url_alias(ret, name);
 	if (!valid_remote(ret))
 		return NULL;
-	ret->fetch = parse_fetch_refspec(ret->fetch_refspec_nr, ret->fetch_refspec);
+	if (ret->bogus_refspec)
+		die("Invalid refspec '%s'", ret->bogus_refspec);
 	ret->push = parse_push_refspec(ret->push_refspec_nr, ret->push_refspec);
 	return ret;
 }
@@ -757,9 +761,8 @@ int for_each_remote(each_remote_fn fn, void *priv)
 		struct remote *r = remotes[i];
 		if (!r)
 			continue;
-		if (!r->fetch)
-			r->fetch = parse_fetch_refspec(r->fetch_refspec_nr,
-						       r->fetch_refspec);
+		if (r->bogus_refspec)
+			die("Invalid refspec '%s'", r->bogus_refspec);
 		if (!r->push)
 			r->push = parse_push_refspec(r->push_refspec_nr,
 						     r->push_refspec);
@@ -961,7 +964,7 @@ char *refspec_to_string(const struct refspec *refspec)
 
 int remote_find_tracking(struct remote *remote, struct refspec *refspec)
 {
-	return query_refspecs(remote->fetch, remote->fetch_refspec_nr, refspec);
+	return query_refspecs(remote->fetch.rs, remote->fetch.nr, refspec);
 }
 
 static struct ref *alloc_ref_with_prefix(const char *prefix, size_t prefixlen,
@@ -1756,7 +1759,7 @@ static const char *tracking_for_push_dest(struct remote *remote,
 {
 	char *ret;
 
-	ret = apply_refspecs(remote->fetch, remote->fetch_refspec_nr, refname);
+	ret = apply_refspecs(remote->fetch.rs, remote->fetch.nr, refname);
 	if (!ret)
 		return error_buf(err,
 				 _("push destination '%s' on remote '%s' has no local tracking branch"),
@@ -2373,7 +2376,7 @@ static int remote_tracking(struct remote *remote, const char *refname,
 {
 	char *dst;
 
-	dst = apply_refspecs(remote->fetch, remote->fetch_refspec_nr, refname);
+	dst = apply_refspecs(remote->fetch.rs, remote->fetch.nr, refname);
 	if (!dst)
 		return -1; /* no tracking ref for refname at remote */
 	if (read_ref(dst, oid->hash))
diff --git a/remote.h b/remote.h
index ee6c432d0..416a08501 100644
--- a/remote.h
+++ b/remote.h
@@ -11,6 +11,11 @@ enum {
 	REMOTE_BRANCHES
 };
 
+struct refspec_array {
+	struct refspec *rs;
+	int nr, alloc;
+};
+
 struct remote {
 	struct hashmap_entry ent;  /* must be first */
 
@@ -32,10 +37,10 @@ struct remote {
 	int push_refspec_nr;
 	int push_refspec_alloc;
 
-	const char **fetch_refspec;
-	struct refspec *fetch;
-	int fetch_refspec_nr;
-	int fetch_refspec_alloc;
+	struct refspec_array fetch;
+
+	/* Copy of the first bogus fetch refspec we couldn't parse */
+	const char *bogus_refspec;
 
 	/*
 	 * -1 to never fetch tags
@@ -169,7 +174,15 @@ struct ref *ref_remove_duplicates(struct ref *ref_map);
 
 int valid_fetch_refspec(const char *refspec);
 struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec);
-void add_and_parse_fetch_refspec(struct remote *remote, const char *refspec);
+/*
+ * Add a fetch refspec to a remote.
+ * If the refspec cannot be parsed successfully and
+ *  - gently=0: die().
+ *  - gently=1: store the refspec in remote->bogus_refspec and return with -1.
+ *              If there are more than one bogus refspecs in the same remote,
+ *              then only the first one will be stored.
+ */
+int add_fetch_refspec(struct remote *remote, const char *refspec, int gently);
 extern struct refspec *parse_push_refspec(int nr_refspec, const char **refspec);
 
 void free_refspec(int nr_refspec, struct refspec *refspec);
-- 
2.13.1.505.g7cc9fcafb


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] remote.c: eliminate remote->push_refspec
  2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
                   ` (3 preceding siblings ...)
  2017-06-16 19:28 ` [PATCH 4/5] remote.c: eliminate remote->fetch_refspec SZEDER Gábor
@ 2017-06-16 19:28 ` SZEDER Gábor
  2017-06-16 21:55 ` [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs Junio C Hamano
  2017-06-17 11:55 ` Jeff King
  6 siblings, 0 replies; 12+ messages in thread
From: SZEDER Gábor @ 2017-06-16 19:28 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Jonathan Nieder, SZEDER Gábor

Do the same as in the previous patch, but for push refspecs, i.e. with
remote->push_refspec, remote->push and add_push_refspec().

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 builtin/push.c   | 12 +++++++-----
 builtin/remote.c | 19 ++++++++++++-------
 remote.c         | 29 +++++++++++------------------
 remote.h         | 10 ++++------
 4 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/builtin/push.c b/builtin/push.c
index 258648d5f..9a721fe8a 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -47,11 +47,11 @@ static const char *map_refspec(const char *ref,
 	if (count_refspec_match(ref, local_refs, &matched) != 1)
 		return ref;
 
-	if (remote->push) {
+	if (remote->push.rs) {
 		struct refspec query;
 		memset(&query, 0, sizeof(struct refspec));
 		query.src = matched->name;
-		if (!query_refspecs(remote->push, remote->push_refspec_nr, &query) &&
+		if (!query_refspecs(remote->push.rs, remote->push.nr, &query) &&
 		    query.dst) {
 			struct strbuf buf = STRBUF_INIT;
 			strbuf_addf(&buf, "%s%s:%s",
@@ -401,9 +401,11 @@ static int do_push(const char *repo, int flags,
 	}
 
 	if (!refspec && !(flags & TRANSPORT_PUSH_ALL)) {
-		if (remote->push_refspec_nr) {
-			refspec = remote->push_refspec;
-			refspec_nr = remote->push_refspec_nr;
+		if (remote->push.nr) {
+			ALLOC_ARRAY(refspec, remote->push.nr);
+			for (i = 0; i < remote->push.nr; i++)
+				refspec[i] = refspec_to_string(&remote->push.rs[i]);
+			refspec_nr = remote->push.nr;
 		} else if (!(flags & TRANSPORT_PUSH_MIRROR))
 			setup_default_push_refspecs(remote);
 	}
diff --git a/builtin/remote.c b/builtin/remote.c
index d61daa5e8..c04dec0e1 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -380,14 +380,19 @@ static int get_push_ref_states(const struct ref *remote_refs,
 {
 	struct remote *remote = states->remote;
 	struct ref *ref, *local_refs, *push_map;
+	const char **refspec;
+	int i;
 	if (remote->mirror)
 		return 0;
 
 	local_refs = get_local_heads();
 	push_map = copy_ref_list(remote_refs);
 
-	match_push_refs(local_refs, &push_map, remote->push_refspec_nr,
-			remote->push_refspec, MATCH_REFS_NONE);
+	ALLOC_ARRAY(refspec, remote->push.nr);
+	for (i = 0; i < remote->push.nr; i++)
+		refspec[i] = refspec_to_string(&remote->push.rs[i]);
+	match_push_refs(local_refs, &push_map, remote->push.nr,
+			refspec, MATCH_REFS_NONE);
 
 	states->push.strdup_strings = 1;
 	for (ref = push_map; ref; ref = ref->next) {
@@ -433,14 +438,14 @@ static int get_push_ref_states_noquery(struct ref_states *states)
 		return 0;
 
 	states->push.strdup_strings = 1;
-	if (!remote->push_refspec_nr) {
+	if (!remote->push.nr) {
 		item = string_list_append(&states->push, _("(matching)"));
 		info = item->util = xcalloc(1, sizeof(struct push_info));
 		info->status = PUSH_STATUS_NOTQUERIED;
 		info->dest = xstrdup(item->string);
 	}
-	for (i = 0; i < remote->push_refspec_nr; i++) {
-		struct refspec *spec = remote->push + i;
+	for (i = 0; i < remote->push.nr; i++) {
+		struct refspec *spec = remote->push.rs + i;
 		if (spec->matching)
 			item = string_list_append(&states->push, _("(matching)"));
 		else if (strlen(spec->src))
@@ -584,8 +589,8 @@ static int migrate_file(struct remote *remote)
 		git_config_set_multivar(buf.buf, remote->url[i], "^$", 0);
 	strbuf_reset(&buf);
 	strbuf_addf(&buf, "remote.%s.push", remote->name);
-	for (i = 0; i < remote->push_refspec_nr; i++) {
-		strbuf_add_refspec(&refspec, &remote->push[i]);
+	for (i = 0; i < remote->push.nr; i++) {
+		strbuf_add_refspec(&refspec, &remote->push.rs[i]);
 		git_config_set_multivar(buf.buf, refspec.buf, "^$", 0);
 		strbuf_reset(&refspec);
 	}
diff --git a/remote.c b/remote.c
index 952000b61..9dfe3e9a6 100644
--- a/remote.c
+++ b/remote.c
@@ -86,14 +86,6 @@ static const char *alias_url(const char *url, struct rewrites *r)
 	return xstrfmt("%s%s", r->rewrite[longest_i]->base, url + longest->len);
 }
 
-static void add_push_refspec(struct remote *remote, const char *ref)
-{
-	ALLOC_GROW(remote->push_refspec,
-		   remote->push_refspec_nr + 1,
-		   remote->push_refspec_alloc);
-	remote->push_refspec[remote->push_refspec_nr++] = strdup(ref);
-}
-
 static void add_url(struct remote *remote, const char *url)
 {
 	ALLOC_GROW(remote->url, remote->url_nr + 1, remote->url_alloc);
@@ -257,7 +249,7 @@ static void read_remotes_file(struct remote *remote)
 		if (skip_prefix(buf.buf, "URL:", &v))
 			add_url_alias(remote, xstrdup(skip_spaces(v)));
 		else if (skip_prefix(buf.buf, "Push:", &v))
-			add_push_refspec(remote, skip_spaces(v));
+			add_push_refspec(remote, skip_spaces(v), 1);
 		else if (skip_prefix(buf.buf, "Pull:", &v))
 			add_fetch_refspec(remote, skip_spaces(v), 1);
 	}
@@ -308,7 +300,7 @@ static void read_branches_file(struct remote *remote)
 	 * (master if missing)
 	 */
 	strbuf_addf(&buf, "HEAD:refs/heads/%s", frag);
-	add_push_refspec(remote, buf.buf);
+	add_push_refspec(remote, buf.buf, 1);
 	remote->fetch_tags = 1; /* always auto-follow */
 
 	strbuf_release(&buf);
@@ -394,7 +386,7 @@ static int handle_config(const char *key, const char *value, void *cb)
 		const char *v;
 		if (git_config_string(&v, key, value))
 			return -1;
-		add_push_refspec(remote, v);
+		add_push_refspec(remote, v, 1);
 		free((char*)v);
 	} else if (!strcmp(subkey, "fetch")) {
 		const char *v;
@@ -625,7 +617,7 @@ struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec)
 static int add_refspec(struct remote *remote, const char *refspec,
 		       int fetch, int gently)
 {
-	struct refspec_array *rsa = fetch ? &remote->fetch : NULL;
+	struct refspec_array *rsa = fetch ? &remote->fetch : &remote->push;
 
 	ALLOC_GROW(rsa->rs, rsa->nr + 1, rsa->alloc);
 
@@ -651,6 +643,11 @@ struct refspec *parse_push_refspec(int nr_refspec, const char **refspec)
 	return parse_refspec_internal(nr_refspec, refspec, 0, 0);
 }
 
+int add_push_refspec(struct remote *remote, const char *refspec, int gently)
+{
+	return add_refspec(remote, refspec, 0, gently);
+}
+
 void free_refspec(int nr_refspec, struct refspec *refspec)
 {
 	int i;
@@ -730,7 +727,6 @@ static struct remote *remote_get_1(const char *name,
 		return NULL;
 	if (ret->bogus_refspec)
 		die("Invalid refspec '%s'", ret->bogus_refspec);
-	ret->push = parse_push_refspec(ret->push_refspec_nr, ret->push_refspec);
 	return ret;
 }
 
@@ -763,9 +759,6 @@ int for_each_remote(each_remote_fn fn, void *priv)
 			continue;
 		if (r->bogus_refspec)
 			die("Invalid refspec '%s'", r->bogus_refspec);
-		if (!r->push)
-			r->push = parse_push_refspec(r->push_refspec_nr,
-						     r->push_refspec);
 		result = fn(r, priv);
 	}
 	return result;
@@ -1777,11 +1770,11 @@ static const char *branch_get_push_1(struct branch *branch, struct strbuf *err)
 				 _("branch '%s' has no remote for pushing"),
 				 branch->name);
 
-	if (remote->push_refspec_nr) {
+	if (remote->push.nr) {
 		char *dst;
 		const char *ret;
 
-		dst = apply_refspecs(remote->push, remote->push_refspec_nr,
+		dst = apply_refspecs(remote->push.rs, remote->push.nr,
 				     branch->refname);
 		if (!dst)
 			return error_buf(err,
diff --git a/remote.h b/remote.h
index 416a08501..eba06bacb 100644
--- a/remote.h
+++ b/remote.h
@@ -32,14 +32,10 @@ struct remote {
 	int pushurl_nr;
 	int pushurl_alloc;
 
-	const char **push_refspec;
-	struct refspec *push;
-	int push_refspec_nr;
-	int push_refspec_alloc;
-
+	struct refspec_array push;
 	struct refspec_array fetch;
 
-	/* Copy of the first bogus fetch refspec we couldn't parse */
+	/* Copy of the first bogus refspec we couldn't parse */
 	const char *bogus_refspec;
 
 	/*
@@ -184,6 +180,8 @@ struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec);
  */
 int add_fetch_refspec(struct remote *remote, const char *refspec, int gently);
 extern struct refspec *parse_push_refspec(int nr_refspec, const char **refspec);
+/* The same as add_fetch_refspec() above, but for push refspecs. */
+int add_push_refspec(struct remote *remote, const char *refspec, int gently);
 
 void free_refspec(int nr_refspec, struct refspec *refspec);
 
-- 
2.13.1.505.g7cc9fcafb


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs
  2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
                   ` (4 preceding siblings ...)
  2017-06-16 19:28 ` [PATCH 5/5] remote.c: eliminate remote->push_refspec SZEDER Gábor
@ 2017-06-16 21:55 ` Junio C Hamano
  2017-06-17 12:25   ` SZEDER Gábor
  2017-06-17 11:55 ` Jeff King
  6 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2017-06-16 21:55 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Jeff King, Jonathan Nieder

SZEDER Gábor <szeder.dev@gmail.com> writes:

> 'struct remote' stores refspecs twice: once in their original string
> form in remote->{fetch,push}_refspecs and once in their parsed form in
> remote->{fetch,push}.  This is necessary, because we need the refspecs
> for lazy parsing after we finished reading the configuration: we don't
> want to die() on a bogus refspec while reading the configuration of a
> remote we are not going to access at all.
>
> However, storing refspecs in both forms has some drawbacks:
>
>   - The same information is stored twice, wasting memory.

True (but a few hundred bytes is nothing among friends ;-)

>   - remote->{fetch,push}_refspecs, i.e. the string arrays are
>     conveniently ALLOC_GROW()-able with associated
>     {fetch,push}_refspec_{nr,alloc} fields, but remote->{fetch,push}
>     are not.

This is a more real issue.

>   - Wherever remote->{fetch,push} are accessed, the number of parsed
>     refspecs in there is specified by remote->{fetch,push}_refspec_nr.
>     This requires us to keep the two arrays in sync and makes adding
>     additional refspecs cumbersome and error prone.

You haven't told us which way you want to dedup.  Are you keeping
the original and removing the pre-parsed?  or are you only keeping
the pre-parsed ones?  As long as you want ALLOC_GROW() ability, you
need to maintain the invariants in three-tuple (foo, foo_alloc,
foo_nr).

>   - And worst of all, it pissed me off while working on
>     sg/clone-refspec-from-command-line-config ;)

Your feelings (or mine) do not count ;-).

I do not think we would terribly mind if you only kept a list of
pre-parsed form, with some mechanism to keep an "error" entry in
that list with its original, so that an error can be reported with
the refspec as the user originally gave us (which may mean the
"error" entry may have to keep the original form, since it wasn't
correctly parsable in the first place for it to trigger an error).

> So here is my crack at getting rid of them.

You still haven't told us what "them" are.  Parsed form, or the
original?  Let's find out by reading on....

> The idea is to parse refspecs gently while reading the configuration:
> this way we won't need to store all refspecs as strings, and won't
> die() on a bogus refspec right away.  A bogus refspec, if there's one,
> will be stored in the remote it belongs to, so it will be available
> later when that remote is accessed and can be used in the error
> message.

So normally we only have a list of parsed ones, but optionally there
is a list of malformed originals that are before attempted (and
failed) parsing used for error reporting?  That sounds sensible,
especially given that we can recreate the original textual form from
correctly parsed result (which allows us to report on other kinds
of errors as necessary).

> This applies on top of a merge of master and the fresh reroll (v5) of
> sg/clone-refspec-from-command-line-config:

Thanks.  Will take a look (but not immediately).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/5] remote: don't use remote->{fetch,push}_refspec
  2017-06-16 19:28 ` [PATCH 1/5] remote: don't use remote->{fetch,push}_refspec SZEDER Gábor
@ 2017-06-16 22:00   ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2017-06-16 22:00 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Jeff King, Jonathan Nieder

SZEDER Gábor <szeder.dev@gmail.com> writes:

> diff --git a/remote.c b/remote.c
> index 336db8298..a021decee 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -919,6 +919,26 @@ char *apply_refspecs(struct refspec *refspecs, int nr_refspec,
>  	return query.dst;
>  }
>  
> +void strbuf_add_refspec(struct strbuf *sb, const struct refspec *refspec)
> +{
> +	if (refspec->force)
> +		strbuf_addch(sb, '+');
> +	if (refspec->src)
> +		strbuf_addstr(sb, refspec->src);
> +	if (refspec->dst) {
> +		strbuf_addch(sb, ':');
> +		strbuf_addstr(sb, refspec->dst);
> +	} else if (!refspec->src)
> +		strbuf_addch(sb, ':');
> +}

Hmph, don't we support wildcarding (aka refspec->pattern)?

    ... goes and looks ...

Ah, even when we set the .pattern bit, we retain '*' in the string,
so the above is correct.

The last one looks a bit sloppy; shouldn't it be inspecting the
.matching bit?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs
  2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
                   ` (5 preceding siblings ...)
  2017-06-16 21:55 ` [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs Junio C Hamano
@ 2017-06-17 11:55 ` Jeff King
  2017-06-19 20:07   ` Junio C Hamano
  6 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2017-06-17 11:55 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Jonathan Nieder

On Fri, Jun 16, 2017 at 09:28:32PM +0200, SZEDER Gábor wrote:

> 'struct remote' stores refspecs twice: once in their original string
> form in remote->{fetch,push}_refspecs and once in their parsed form in
> remote->{fetch,push}.  This is necessary, because we need the refspecs
> for lazy parsing after we finished reading the configuration: we don't
> want to die() on a bogus refspec while reading the configuration of a
> remote we are not going to access at all.
> 
> However, storing refspecs in both forms has some drawbacks:
> 
>   - The same information is stored twice, wasting memory.

I don't think this is really that important. It's rare to have more than
a handful of refspecs, and they're short strings.

>   - remote->{fetch,push}_refspecs, i.e. the string arrays are
>     conveniently ALLOC_GROW()-able with associated
>     {fetch,push}_refspec_{nr,alloc} fields, but remote->{fetch,push}
>     are not.
>   - Wherever remote->{fetch,push} are accessed, the number of parsed
>     refspecs in there is specified by remote->{fetch,push}_refspec_nr.
>     This requires us to keep the two arrays in sync and makes adding
>     additional refspecs cumbersome and error prone.

I think the two-arrays things is more important, because keeping them in
sync is error-prone. On the other hand, if the array are manipulated by
accessors, that keeps the complexity in one spot.

The lazy parsing is an additional headache on top of that, but that's
orthogonal. Without the lazy parsing, then a single add_refspec() would
be simple and keep everything in sync.

>   - And worst of all, it pissed me off while working on
>     sg/clone-refspec-from-command-line-config ;)

That I can sympathize with. ;)

> The idea is to parse refspecs gently while reading the configuration:
> this way we won't need to store all refspecs as strings, and won't
> die() on a bogus refspec right away.  A bogus refspec, if there's one,
> will be stored in the remote it belongs to, so it will be available
> later when that remote is accessed and can be used in the error
> message.

The "turn a parsed refspec struct back into a refspec string" makes me a
little nervous, just because it needs to be the inverse of the parsing
function. So now we have two new things that need to be kept in sync.

If we forget the "storing it twice" argument, would it make sense to
convert the parallel arrays of items into a single array-of-struct?
I.e.:

  struct configured_refspec {
	const char *string;
	struct refspec refspec;
	unsigned parsed:1;
  }

I guess that may run into problems where we really need an
array-of-refspec to pass into sub-functions. So going further, could we
just have "struct refspec" store the text form it was parsed from?

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs
  2017-06-16 21:55 ` [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs Junio C Hamano
@ 2017-06-17 12:25   ` SZEDER Gábor
  2017-06-17 12:33     ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: SZEDER Gábor @ 2017-06-17 12:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git mailing list, Jeff King, Jonathan Nieder

On Fri, Jun 16, 2017 at 11:55 PM, Junio C Hamano <gitster@pobox.com> wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
>
>> 'struct remote' stores refspecs twice: once in their original string
>> form in remote->{fetch,push}_refspecs and once in their parsed form in
>> remote->{fetch,push}.  This is necessary, because we need the refspecs
>> for lazy parsing after we finished reading the configuration: we don't
>> want to die() on a bogus refspec while reading the configuration of a
>> remote we are not going to access at all.
>>
>> However, storing refspecs in both forms has some drawbacks:
>>
>>   - The same information is stored twice, wasting memory.
>
> True (but a few hundred bytes is nothing among friends ;-)

Indeed.  Even in my repos with close to 10k remotes the amount of
memory wasted by the duplicated refspecs is not an problem, there are
more pressing issues there.

>>   - remote->{fetch,push}_refspecs, i.e. the string arrays are
>>     conveniently ALLOC_GROW()-able with associated
>>     {fetch,push}_refspec_{nr,alloc} fields, but remote->{fetch,push}
>>     are not.
>
> This is a more real issue.
>
>>   - Wherever remote->{fetch,push} are accessed, the number of parsed
>>     refspecs in there is specified by remote->{fetch,push}_refspec_nr.
>>     This requires us to keep the two arrays in sync and makes adding
>>     additional refspecs cumbersome and error prone.
>
> You haven't told us which way you want to dedup.

Well, I actually did, right at the beginning.  The Subject:
specifically mentions which fields will be removed, and the first one
and a half line says in more usual terms what their roles are.

Anyway, made a note to use more natural language in the subjects (and
elsewhere) when we get there, maybe "remote.c: don't store refspecs as
strings in 'struct remote'" or something.

>  Are you keeping
> the original and removing the pre-parsed?  or are you only keeping
> the pre-parsed ones?  As long as you want ALLOC_GROW() ability, you
> need to maintain the invariants in three-tuple (foo, foo_alloc,
> foo_nr).
>
>>   - And worst of all, it pissed me off while working on
>>     sg/clone-refspec-from-command-line-config ;)
>
> Your feelings (or mine) do not count ;-).

Feelings have nothing to do with it, "it pissed me off" is a concise
way of saying "it made me waste quite some time debugging segfaults
and other nonsense resulting from ALLOC_GROW()ing remote->fetch,
misled by the previous point and the confusing order in which these
fields are listed in 'struct remote's definition".

> I do not think we would terribly mind if you only kept a list of
> pre-parsed form, with some mechanism to keep an "error" entry in
> that list with its original, so that an error can be reported with
> the refspec as the user originally gave us (which may mean the
> "error" entry may have to keep the original form, since it wasn't
> correctly parsable in the first place for it to trigger an error).
>
>> So here is my crack at getting rid of them.
>
> You still haven't told us what "them" are.  Parsed form, or the
> original?  Let's find out by reading on....
>
>> The idea is to parse refspecs gently while reading the configuration:
>> this way we won't need to store all refspecs as strings, and won't
>> die() on a bogus refspec right away.  A bogus refspec, if there's one,
>> will be stored in the remote it belongs to, so it will be available
>> later when that remote is accessed and can be used in the error
>> message.
>
> So normally we only have a list of parsed ones, but optionally there
> is a list of malformed originals that are before attempted (and
> failed) parsing used for error reporting?

For each remote there are two arrays of parsed refspecs, one for fetch
and one for push, and a single malformed original as string.

The reason for storing only a single malformed refspec per remote is
that I didn't want to noteworthily change the behaviour: the current
implementation die()s on the first malformed refspec it encounters
while parsing and reports only that one malformed refspec in the error
message.  This series essentially does the same as far as observable
behaviour goes, though it might happen that a different malformed
refspec is reported in the error message (if there are more than one,
depending on their order in the configuration).

Of course, if we want to, then this could be extended to record all
malformed refspecs while reading the configuration and report all of
them in the error message.  But that's a behaviour change which I
think should come on top as a separate patch.

>  That sounds sensible,
> especially given that we can recreate the original textual form from
> correctly parsed result (which allows us to report on other kinds
> of errors as necessary).
>
>> This applies on top of a merge of master and the fresh reroll (v5) of
>> sg/clone-refspec-from-command-line-config:
>
> Thanks.  Will take a look (but not immediately).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs
  2017-06-17 12:25   ` SZEDER Gábor
@ 2017-06-17 12:33     ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2017-06-17 12:33 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, Git mailing list, Jonathan Nieder

On Sat, Jun 17, 2017 at 02:25:39PM +0200, SZEDER Gábor wrote:

> >>   - The same information is stored twice, wasting memory.
> >
> > True (but a few hundred bytes is nothing among friends ;-)
> 
> Indeed.  Even in my repos with close to 10k remotes the amount of
> memory wasted by the duplicated refspecs is not an problem, there are
> more pressing issues there.

This is sort of a side note, but I'm curious why you need 10k remotes.

We used to do this at GitHub as part of our fork storage (the shared
repository had each fork as a remote). We fixed the quadratic issues
like d0da003d5 (use a hashmap to make remotes faster, 2014-07-29). But
even the linear work to read the config is noticeable (and hits you on
every command, even ones that don't care about remotes).  Now instead we
pass the refspecs directly to fetch whenever move objects between the
storage repos. They were the same for every remote anyway (and I'd guess
that is true, too, of your 10k remotes).

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs
  2017-06-17 11:55 ` Jeff King
@ 2017-06-19 20:07   ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2017-06-19 20:07 UTC (permalink / raw)
  To: Jeff King; +Cc: SZEDER Gábor, git, Jonathan Nieder

Jeff King <peff@peff.net> writes:

> If we forget the "storing it twice" argument, would it make sense to
> convert the parallel arrays of items into a single array-of-struct?
> I.e.:
>
>   struct configured_refspec {
> 	const char *string;
> 	struct refspec refspec;
> 	unsigned parsed:1;
>   }
>
> I guess that may run into problems where we really need an
> array-of-refspec to pass into sub-functions. So going further, could we
> just have "struct refspec" store the text form it was parsed from?

I find this a sensible suggestion.  

I think the original "parallel" structure was anticipating that a
textual input we get from the user (either from the command line or
from the configuration) could expand to multiple parsed refspecs for
easier use by the code, but it appears that we hadn't seen a need
for that, so I think it is safe to unify them into a single struct.

Of course, that still anticipates that a parsed refspec may not be
unambiguously turned back to textual original input form; if that
will never be the case, then the approach taken by the posted
patches should also be OK.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-06-19 20:07 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-16 19:28 [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs SZEDER Gábor
2017-06-16 19:28 ` [PATCH 1/5] remote: don't use remote->{fetch,push}_refspec SZEDER Gábor
2017-06-16 22:00   ` Junio C Hamano
2017-06-16 19:28 ` [PATCH 2/5] remote.c: don't pass copies of refspecs to add_{fetch,push}_refspec() SZEDER Gábor
2017-06-16 19:28 ` [PATCH 3/5] remote.c: extract a helper function to parse a single refspec SZEDER Gábor
2017-06-16 19:28 ` [PATCH 4/5] remote.c: eliminate remote->fetch_refspec SZEDER Gábor
2017-06-16 19:28 ` [PATCH 5/5] remote.c: eliminate remote->push_refspec SZEDER Gábor
2017-06-16 21:55 ` [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs Junio C Hamano
2017-06-17 12:25   ` SZEDER Gábor
2017-06-17 12:33     ` Jeff King
2017-06-17 11:55 ` Jeff King
2017-06-19 20:07   ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).