git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH 0/3] am/rebase: share read_author_script()
@ 2018-09-12 10:10 Phillip Wood
  2018-09-12 10:10 ` [PATCH 1/3] am: rename read_author_script() Phillip Wood
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Phillip Wood @ 2018-09-12 10:10 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin
  Cc: Git Mailing List, Eric Sunshine, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

This is a follow up to pw/rebase-i-author-script-fix, it reduces code
duplication and improves rebase's parsing of the author script. After
this I'll do another series to share the code to write the author
script.

Phillip Wood (3):
  am: rename read_author_script()
  add read_author_script() to libgit
  sequencer: use read_author_script()

 builtin/am.c |  61 ++------------------
 sequencer.c  | 160 ++++++++++++++++++++++++++++-----------------------
 sequencer.h  |   3 +
 3 files changed, 96 insertions(+), 128 deletions(-)

-- 
2.18.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] am: rename read_author_script()
  2018-09-12 10:10 [PATCH 0/3] am/rebase: share read_author_script() Phillip Wood
@ 2018-09-12 10:10 ` Phillip Wood
  2018-09-12 10:10 ` [PATCH 2/3] add read_author_script() to libgit Phillip Wood
  2018-09-12 10:10 ` [PATCH 3/3] sequencer: use read_author_script() Phillip Wood
  2 siblings, 0 replies; 7+ messages in thread
From: Phillip Wood @ 2018-09-12 10:10 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin
  Cc: Git Mailing List, Eric Sunshine, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

Rename read_author_script() in preparation for adding a shared
read_author_script() function to libgit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 builtin/am.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 5e866d17c7..8c165f747b 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -302,7 +302,7 @@ static int parse_key_value_squoted(char *buf, struct string_list *list)
  * script, and thus if the file differs from what this function expects, it is
  * better to bail out than to do something that the user does not expect.
  */
-static int read_author_script(struct am_state *state)
+static int read_am_author_script(struct am_state *state)
 {
 	const char *filename = am_path(state, "author-script");
 	struct strbuf buf = STRBUF_INIT;
@@ -411,7 +411,7 @@ static void am_load(struct am_state *state)
 		BUG("state file 'last' does not exist");
 	state->last = strtol(sb.buf, NULL, 10);
 
-	if (read_author_script(state) < 0)
+	if (read_am_author_script(state) < 0)
 		die(_("could not parse author script"));
 
 	read_commit_msg(state);
-- 
2.18.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/3] add read_author_script() to libgit
  2018-09-12 10:10 [PATCH 0/3] am/rebase: share read_author_script() Phillip Wood
  2018-09-12 10:10 ` [PATCH 1/3] am: rename read_author_script() Phillip Wood
@ 2018-09-12 10:10 ` Phillip Wood
  2018-09-13 23:49   ` Eric Sunshine
  2018-09-12 10:10 ` [PATCH 3/3] sequencer: use read_author_script() Phillip Wood
  2 siblings, 1 reply; 7+ messages in thread
From: Phillip Wood @ 2018-09-12 10:10 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin
  Cc: Git Mailing List, Eric Sunshine, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

Add read_author_script() to sequencer.c based on the implementation in
builtin/am.c and update read_am_author_script() to use
read_author_script(). The sequencer code that reads the author script
will be updated in the next commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 builtin/am.c | 57 ++++-------------------------------------------
 sequencer.c  | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 sequencer.h  |  3 +++
 3 files changed, 69 insertions(+), 53 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 8c165f747b..aa5de0ee73 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -260,32 +260,6 @@ static int read_state_file(struct strbuf *sb, const struct am_state *state,
 	die_errno(_("could not read '%s'"), am_path(state, file));
 }
 
-/**
- * Take a series of KEY='VALUE' lines where VALUE part is
- * sq-quoted, and append <KEY, VALUE> at the end of the string list
- */
-static int parse_key_value_squoted(char *buf, struct string_list *list)
-{
-	while (*buf) {
-		struct string_list_item *item;
-		char *np;
-		char *cp = strchr(buf, '=');
-		if (!cp)
-			return -1;
-		np = strchrnul(cp, '\n');
-		*cp++ = '\0';
-		item = string_list_append(list, buf);
-
-		buf = np + (*np == '\n');
-		*np = '\0';
-		cp = sq_dequote(cp);
-		if (!cp)
-			return -1;
-		item->util = xstrdup(cp);
-	}
-	return 0;
-}
-
 /**
  * Reads and parses the state directory's "author-script" file, and sets
  * state->author_name, state->author_email and state->author_date accordingly.
@@ -305,39 +279,16 @@ static int parse_key_value_squoted(char *buf, struct string_list *list)
 static int read_am_author_script(struct am_state *state)
 {
 	const char *filename = am_path(state, "author-script");
-	struct strbuf buf = STRBUF_INIT;
-	struct string_list kv = STRING_LIST_INIT_DUP;
-	int retval = -1; /* assume failure */
-	int fd;
 
 	assert(!state->author_name);
 	assert(!state->author_email);
 	assert(!state->author_date);
 
-	fd = open(filename, O_RDONLY);
-	if (fd < 0) {
-		if (errno == ENOENT)
-			return 0;
-		die_errno(_("could not open '%s' for reading"), filename);
-	}
-	strbuf_read(&buf, fd, 0);
-	close(fd);
-	if (parse_key_value_squoted(buf.buf, &kv))
-		goto finish;
+	if (read_author_script(filename, &state->author_name,
+			       &state->author_email, &state->author_date, 1))
+		exit(128);
 
-	if (kv.nr != 3 ||
-	    strcmp(kv.items[0].string, "GIT_AUTHOR_NAME") ||
-	    strcmp(kv.items[1].string, "GIT_AUTHOR_EMAIL") ||
-	    strcmp(kv.items[2].string, "GIT_AUTHOR_DATE"))
-		goto finish;
-	state->author_name = kv.items[0].util;
-	state->author_email = kv.items[1].util;
-	state->author_date = kv.items[2].util;
-	retval = 0;
-finish:
-	string_list_clear(&kv, !!retval);
-	strbuf_release(&buf);
-	return retval;
+	return 0;
 }
 
 /**
diff --git a/sequencer.c b/sequencer.c
index dc2c58d464..5d0ff8f1c1 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -660,6 +660,68 @@ static int write_author_script(const char *message)
 	return res;
 }
 
+/**
+ * Take a series of KEY='VALUE' lines where VALUE part is
+ * sq-quoted, and append <KEY, VALUE> at the end of the string list
+ */
+static int parse_key_value_squoted(char *buf, struct string_list *list)
+{
+	while (*buf) {
+		struct string_list_item *item;
+		char *np;
+		char *cp = strchr(buf, '=');
+		if (!cp)
+			return -1;
+		np = strchrnul(cp, '\n');
+		*cp++ = '\0';
+		item = string_list_append(list, buf);
+
+		buf = np + (*np == '\n');
+		*np = '\0';
+		cp = sq_dequote(cp);
+		if (!cp)
+			return -1;
+		item->util = xstrdup(cp);
+	}
+	return 0;
+}
+
+int read_author_script(const char *path, char **name, char **email, char **date,
+		       int allow_missing)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct string_list kv = STRING_LIST_INIT_DUP;
+	int retval = -1;
+
+	if (strbuf_read_file(&buf, path, 256) <= 0) {
+		strbuf_release(&buf);
+		if (errno == ENOENT && allow_missing)
+			return 0;
+		else
+			return error_errno(_("could not open '%s' for reading"),
+					   path);
+	}
+
+	if (parse_key_value_squoted(buf.buf, &kv)) {
+		error(_("unable to parse '%s'"), path);
+		goto finish;
+	}
+	if (kv.nr != 3 ||
+	    strcmp(kv.items[0].string, "GIT_AUTHOR_NAME") ||
+	    strcmp(kv.items[1].string, "GIT_AUTHOR_EMAIL") ||
+	    strcmp(kv.items[2].string, "GIT_AUTHOR_DATE")) {
+		error(_("unable to parse '%s'"), path);
+		goto finish;
+	}
+	*name = kv.items[0].util;
+	*email = kv.items[1].util;
+	*date = kv.items[2].util;
+	retval = 0;
+finish:
+	string_list_clear(&kv, !!retval);
+	strbuf_release(&buf);
+	return retval;
+}
 
 /*
  * write_author_script() used to fail to terminate the last line with a "'" and
diff --git a/sequencer.h b/sequencer.h
index c751c9d6e4..3713f955f5 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -107,4 +107,7 @@ void commit_post_rewrite(const struct commit *current_head,
 #define SUMMARY_SHOW_AUTHOR_DATE (1 << 1)
 void print_commit_summary(const char *prefix, const struct object_id *oid,
 			  unsigned int flags);
+
+int read_author_script(const char *path, char **name, char **email, char **date,
+		       int allow_missing);
 #endif
-- 
2.18.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 3/3] sequencer: use read_author_script()
  2018-09-12 10:10 [PATCH 0/3] am/rebase: share read_author_script() Phillip Wood
  2018-09-12 10:10 ` [PATCH 1/3] am: rename read_author_script() Phillip Wood
  2018-09-12 10:10 ` [PATCH 2/3] add read_author_script() to libgit Phillip Wood
@ 2018-09-12 10:10 ` Phillip Wood
  2018-09-12 12:14   ` Eric Sunshine
  2018-09-14  0:31   ` Eric Sunshine
  2 siblings, 2 replies; 7+ messages in thread
From: Phillip Wood @ 2018-09-12 10:10 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin
  Cc: Git Mailing List, Eric Sunshine, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

Use the new function to read the author script, updating
read_env_script() and read_author_ident(). This means we now have a
single code path that reads the author script and uses sq_dequote() to
dequote it. This fixes potential problems with user edited scripts
as read_env_script() which did not track quotes properly.

This commit also removes the fallback code for checking for a broken
author script after git is upgraded when a rebase is stopped. Now that
the parsing uses sq_dequote() it will reliably return an error if the
quoting is broken and the user will have to abort the rebase and
restart. This isn't ideal but it's a corner case and the detection of
the broken quoting could be confused by user edited author scripts.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 sequencer.c | 108 +++++++++++++++-------------------------------------
 1 file changed, 30 insertions(+), 78 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5d0ff8f1c1..630741cfe0 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -723,54 +723,35 @@ int read_author_script(const char *path, char **name, char **email, char **date,
 	return retval;
 }
 
-/*
- * write_author_script() used to fail to terminate the last line with a "'" and
- * also escaped "'" incorrectly as "'\\\\''" rather than "'\\''". We check for
- * the terminating "'" on the last line to see how "'" has been escaped in case
- * git was upgraded while rebase was stopped.
- */
-static int quoting_is_broken(const char *s, size_t n)
-{
-	/* Skip any empty lines in case the file was hand edited */
-	while (n > 0 && s[--n] == '\n')
-		; /* empty */
-	if (n > 0 && s[n] != '\'')
-		return 1;
-
-	return 0;
-}
-
 /*
  * Read a list of environment variable assignments (such as the author-script
  * file) into an environment block. Returns -1 on error, 0 otherwise.
  */
 static int read_env_script(struct argv_array *env)
 {
 	struct strbuf script = STRBUF_INIT;
-	int i, count = 0, sq_bug;
-	const char *p2;
-	char *p;
+	char *name, *email, *date;
 
-	if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
+	if (read_author_script(rebase_path_author_script(),
+			       &name, &email, &date, 0))
 		return -1;
-	/* write_author_script() used to quote incorrectly */
-	sq_bug = quoting_is_broken(script.buf, script.len);
-	for (p = script.buf; *p; p++)
-		if (sq_bug && skip_prefix(p, "'\\\\''", &p2))
-			strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
-		else if (skip_prefix(p, "'\\''", &p2))
-			strbuf_splice(&script, p - script.buf, p2 - p, "'", 1);
-		else if (*p == '\'')
-			strbuf_splice(&script, p-- - script.buf, 1, "", 0);
-		else if (*p == '\n') {
-			*p = '\0';
-			count++;
-		}
 
-	for (i = 0, p = script.buf; i < count; i++) {
-		argv_array_push(env, p);
-		p += strlen(p) + 1;
-	}
+	strbuf_addstr(&script, "GIT_AUTHOR_NAME=");
+	strbuf_addstr(&script, name);
+	argv_array_push(env, script.buf);
+	strbuf_reset(&script);
+	strbuf_addstr(&script, "GIT_AUTHOR_EMAIL=");
+	strbuf_addstr(&script, email);
+	argv_array_push(env, script.buf);
+	strbuf_reset(&script);
+	strbuf_addstr(&script, "GIT_AUTHOR_DATE=");
+	strbuf_addstr(&script, date);
+	argv_array_push(env, script.buf);
+	strbuf_release(&script);
+
+	free(name);
+	free(email);
+	free(date);
 
 	return 0;
 }
@@ -790,54 +771,25 @@ static char *get_author(const char *message)
 /* Read author-script and return an ident line (author <email> timestamp) */
 static const char *read_author_ident(struct strbuf *buf)
 {
-	const char *keys[] = {
-		"GIT_AUTHOR_NAME=", "GIT_AUTHOR_EMAIL=", "GIT_AUTHOR_DATE="
-	};
-	struct strbuf out = STRBUF_INIT;
-	char *in, *eol;
-	const char *val[3];
-	int i = 0;
+	char *name, *email, *date;
 
-	if (strbuf_read_file(buf, rebase_path_author_script(), 256) <= 0)
+	if (read_author_script(rebase_path_author_script(),
+			       &name, &email, &date, 0))
 		return NULL;
 
-	/* dequote values and construct ident line in-place */
-	for (in = buf->buf; i < 3 && in - buf->buf < buf->len; i++) {
-		if (!skip_prefix(in, keys[i], (const char **)&in)) {
-			warning(_("could not parse '%s' (looking for '%s')"),
-				rebase_path_author_script(), keys[i]);
-			return NULL;
-		}
-
-		eol = strchrnul(in, '\n');
-		*eol = '\0';
-		if (!sq_dequote(in)) {
-			warning(_("bad quoting on %s value in '%s'"),
-				keys[i], rebase_path_author_script());
-			return NULL;
-		}
-		val[i] = in;
-		in = eol + 1;
-	}
-
-	if (i < 3) {
-		warning(_("could not parse '%s' (looking for '%s')"),
-			rebase_path_author_script(), keys[i]);
-		return NULL;
-	}
-
 	/* validate date since fmt_ident() will die() on bad value */
-	if (parse_date(val[2], &out)){
+	if (parse_date(date, buf)){
 		warning(_("invalid date format '%s' in '%s'"),
-			val[2], rebase_path_author_script());
-		strbuf_release(&out);
+			date, rebase_path_author_script());
+		strbuf_release(buf);
 		return NULL;
 	}
 
-	strbuf_reset(&out);
-	strbuf_addstr(&out, fmt_ident(val[0], val[1], val[2], 0));
-	strbuf_swap(buf, &out);
-	strbuf_release(&out);
+	strbuf_reset(buf);
+	strbuf_addstr(buf, fmt_ident(name, email, date, 0));
+	free(name);
+	free(email);
+	free(date);
 	return buf->buf;
 }
 
-- 
2.18.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] sequencer: use read_author_script()
  2018-09-12 10:10 ` [PATCH 3/3] sequencer: use read_author_script() Phillip Wood
@ 2018-09-12 12:14   ` Eric Sunshine
  2018-09-14  0:31   ` Eric Sunshine
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2018-09-12 12:14 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Junio C Hamano, Johannes Schindelin, Git List

On Wed, Sep 12, 2018 at 6:11 AM Phillip Wood <phillip.wood@talktalk.net> wrote:
> Use the new function to read the author script, updating
> read_env_script() and read_author_ident(). This means we now have a
> single code path that reads the author script and uses sq_dequote() to
> dequote it. This fixes potential problems with user edited scripts
> as read_env_script() which did not track quotes properly.
> [...]
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> diff --git a/sequencer.c b/sequencer.c
> @@ -723,54 +723,35 @@ int read_author_script(const char *path, char **name, char **email, char **date,
>  static int read_env_script(struct argv_array *env)
>  {
> +       strbuf_addstr(&script, "GIT_AUTHOR_NAME=");
> +       strbuf_addstr(&script, name);
> +       argv_array_push(env, script.buf);
> +       strbuf_reset(&script);
> +       strbuf_addstr(&script, "GIT_AUTHOR_EMAIL=");
> +       strbuf_addstr(&script, email);
> +       argv_array_push(env, script.buf);
> +       strbuf_reset(&script);
> +       strbuf_addstr(&script, "GIT_AUTHOR_DATE=");
> +       strbuf_addstr(&script, date);
> +       argv_array_push(env, script.buf);
> +       strbuf_release(&script);

I haven't read this series closely yet, but this caught my eye while
scanning it quickly. The above noisy code can all be collapsed to the
simpler:

    argv_array_pushf(env, "GIT_AUTHOR_NAME=%s", name);
    argv_array_pushf(env, "GIT_AUTHOR_EMAIL =%s", email);
    argv_array_pushf(env, "GIT_AUTHOR_DATE =%s", date);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/3] add read_author_script() to libgit
  2018-09-12 10:10 ` [PATCH 2/3] add read_author_script() to libgit Phillip Wood
@ 2018-09-13 23:49   ` Eric Sunshine
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2018-09-13 23:49 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Junio C Hamano, Johannes Schindelin, Git List

On Wed, Sep 12, 2018 at 6:11 AM Phillip Wood <phillip.wood@talktalk.net> wrote:
> Add read_author_script() to sequencer.c based on the implementation in
> builtin/am.c and update read_am_author_script() to use
> read_author_script(). The sequencer code that reads the author script
> will be updated in the next commit.
>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> diff --git a/builtin/am.c b/builtin/am.c
> @@ -305,39 +279,16 @@ static int parse_key_value_squoted(char *buf, struct string_list *list)
>  static int read_am_author_script(struct am_state *state)
>  {

This function returns 'int'...

>         const char *filename = am_path(state, "author-script");
> +       if (read_author_script(filename, &state->author_name,
> +                              &state->author_email, &state->author_date, 1))
> +               exit(128);

...but only ever exit()s...

> +       return 0;

... or returns 0.

>  }

It's a little disturbing that it now invokes exit() directly rather
than calling die() since die() may involve extra functionality before
actually exiting.

What if, instead of exit()ing directly, you drop the conditional and
instead return the value of read_author_script():

    return read_author_script(...);

and let the caller deal with the zero or non-zero return value as
usual? (True, you'd get two error messages upon failure rather than
just one, but that's not necessarily a bad thing.)

A possibly valid response is that such a change is outside the scope
of this series since the original code shared this odd architecture of
sometimes returning 0, sometimes -1, and sometimes die()ing.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] sequencer: use read_author_script()
  2018-09-12 10:10 ` [PATCH 3/3] sequencer: use read_author_script() Phillip Wood
  2018-09-12 12:14   ` Eric Sunshine
@ 2018-09-14  0:31   ` Eric Sunshine
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2018-09-14  0:31 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Junio C Hamano, Johannes Schindelin, Git List

On Wed, Sep 12, 2018 at 6:11 AM Phillip Wood <phillip.wood@talktalk.net> wrote:
> Use the new function to read the author script, updating
> read_env_script() and read_author_ident(). This means we now have a
> single code path that reads the author script and uses sq_dequote() to
> dequote it. This fixes potential problems with user edited scripts
> as read_env_script() which did not track quotes properly.
> [...]
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
>  /*
>   * Read a list of environment variable assignments (such as the author-script
>   * file) into an environment block. Returns -1 on error, 0 otherwise.
>   */

According to this comment, this function is capable of parsing a file
of arbitrary "NAME=Value" lines, and indeed the original code does
just that, but...

>  static int read_env_script(struct argv_array *env)
>  {
> +       char *name, *email, *date;
>
> -       if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
> +       if (read_author_script(rebase_path_author_script(),
> +                              &name, &email, &date, 0))

...the new implementation is able to handle only GIT_AUTHOR_NAME,
GIT_AUTHOR_EMAIL, and GIT_AUTHOR_DATE, in exactly that order.

This seems like a pretty serious (and possibly buggy) change of
behavior, and makes the function much less useful (in general). Is it
true that it will only ever be used for files containing that limited
set of names? If so, the behavior change deserves mention in the
commit message, the function comment needs updating, and the function
itself probably ought to be renamed.

> +       strbuf_addstr(&script, "GIT_AUTHOR_NAME=");
> +       strbuf_addstr(&script, name);
> +       argv_array_push(env, script.buf);
> +       strbuf_reset(&script);
> +       strbuf_addstr(&script, "GIT_AUTHOR_EMAIL=");
> +       strbuf_addstr(&script, email);
> +       argv_array_push(env, script.buf);
> +       strbuf_reset(&script);
> +       strbuf_addstr(&script, "GIT_AUTHOR_DATE=");
> +       strbuf_addstr(&script, date);
> +       argv_array_push(env, script.buf);
> +       strbuf_release(&script);

Mentioned earlier[1], this can all collapse down to:

argv_array_pushf(env, "GIT_AUTHOR_NAME=%s", name);
argv_array_pushf(env, "GIT_AUTHOR_EMAIL=%s", email);
argv_array_pushf(env, "GIT_AUTHOR_DATE=%s", date);

However, it's unfortunate that this manual and hard-coded
reconstruction is needed at all. If you restructure the factoring of
this patch series, such ugliness can be avoided altogether. For
instance, the series could be structured like this:

1. Introduce a general-purpose function for reading a file containing
arbitrary "NAME=Value" lines (not carrying about specific key names or
their order) and returning them in some data structure (perhaps via
'string_list' as parse_key_value_squoted() in patch 2/3 does).

2. Build read_author_script() atop #1, making it expect and extract
GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, and GIT_AUTHOR_DATE (possibly in
that order, or possibly not if we don't care).

3. Retrofit existing parsers to call one of those two functions (this
step may happen over several patches). So, for instance,
read_env_script() would call the generic parser from #1, whereas
sequencer.c:read_author_ident() would call the more specific parser
from #2.

More below...

> @@ -790,54 +771,25 @@ static char *get_author(const char *message)
>  /* Read author-script and return an ident line (author <email> timestamp) */
>  static const char *read_author_ident(struct strbuf *buf)
>  {
> -       const char *keys[] = {
> -               "GIT_AUTHOR_NAME=", "GIT_AUTHOR_EMAIL=", "GIT_AUTHOR_DATE="
> -       };
> -       if (strbuf_read_file(buf, rebase_path_author_script(), 256) <= 0)
> +       if (read_author_script(rebase_path_author_script(),
> +                              &name, &email, &date, 0))
>                 return NULL;
> -       /* dequote values and construct ident line in-place */
> -       for (in = buf->buf; i < 3 && in - buf->buf < buf->len; i++) {
> -               if (!skip_prefix(in, keys[i], (const char **)&in)) {
> -                       warning(_("could not parse '%s' (looking for '%s')"),
> -                               rebase_path_author_script(), keys[i]);
> -                       return NULL;
> -               }
> -               if (!sq_dequote(in)) {
> -                       warning(_("bad quoting on %s value in '%s'"),
> -                               keys[i], rebase_path_author_script());
> -                       return NULL;
> -               }
> -       if (i < 3) {
> -               warning(_("could not parse '%s' (looking for '%s')"),
> -                       rebase_path_author_script(), keys[i]);
> -               return NULL;
> -       }

The parsing code being thrown away here does a better job of
diagnosing problems (thus helping the user figure out what went wrong)
than the new shared parser introduced by patch 2/3. The shared
function only ever reports a generic "unable to parse", whereas the
above code gets specific, saying that it was looking for a particular
key or that quoting was broken. I'd have expected the new shared
parser to encompass the best features of the existing parsers (such as
presenting better error messages).

>         /* validate date since fmt_ident() will die() on bad value */
> -       if (parse_date(val[2], &out)){
> +       if (parse_date(date, buf)){

Re-purposing the strbuf 'buf', which is passed into this function,
binds this function too tightly with its caller by assuming that the
caller will never need the original content of 'buf' anymore. Thus, it
would be better for this code continue using its own local strbuf
'out' rather than re-purposing the incoming 'buf'.

>                 warning(_("invalid date format '%s' in '%s'"),
> -                       val[2], rebase_path_author_script());
> -               strbuf_release(&out);
> +                       date, rebase_path_author_script());
> +               strbuf_release(buf);

Likewise, it's doubly odd to see this function releasing 'buf' which
it does not own.

>                 return NULL;
>         }

[1]: https://public-inbox.org/git/CAPig+cRvUr26GZyW6ecYhpwABueBqaEfZH1+JjLaqZo8+RTD6Q@mail.gmail.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-12 10:10 [PATCH 0/3] am/rebase: share read_author_script() Phillip Wood
2018-09-12 10:10 ` [PATCH 1/3] am: rename read_author_script() Phillip Wood
2018-09-12 10:10 ` [PATCH 2/3] add read_author_script() to libgit Phillip Wood
2018-09-13 23:49   ` Eric Sunshine
2018-09-12 10:10 ` [PATCH 3/3] sequencer: use read_author_script() Phillip Wood
2018-09-12 12:14   ` Eric Sunshine
2018-09-14  0:31   ` Eric Sunshine

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox