[PATCH/RFC 00/10] ref-filter: use parsing functions

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH/RFC 00/10] ref-filter: use parsing functions
@ 2015-11-11 19:44 Karthik Nayak
  2015-11-11 19:44 ` [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom Karthik Nayak
                   ` (10 more replies)
  0 siblings, 11 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Carried over
http://thread.gmane.org/gmane.comp.version-control.git/279226/focus=279352. Where
we were talking about pre-parsing most of the atoms so that we do not
have to parse them in ref-filter:populate_value(), where we could now
instead only fill in necessary values. This series aims to introduce
parsing functions for atoms so that they maybe parsed before hand and
the necessary values maybe stored the introduce used_atom structure.

Karthik Nayak (10):
  ref-filter: introduce a parsing function for each atom in valid_atom
  ref-filter: introduce struct used_atom
  ref-fitler: bump match_atom() name to the top
  ref-filter: skip deref specifier in match_atom_name()
  ref-filter: introduce color_atom_parser()
  strbuf: introduce strbuf_split_str_without_term()
  ref-filter: introduce align_atom_parser()
  ref-filter: introduce remote_ref_atom_parser()
  ref-filter: introduce contents_atom_parser()
  ref-filter: introduce objectname_atom_parser()

 Documentation/git-for-each-ref.txt |  18 +-
 ref-filter.c                       | 483 ++++++++++++++++++++++---------------
 strbuf.c                           |  17 +-
 strbuf.h                           |  14 +-
 t/t6302-for-each-ref-filter.sh     |   4 +-
 5 files changed, 328 insertions(+), 208 deletions(-)

-- 
2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-11-23 23:44   ` Eric Sunshine
  2015-11-11 19:44 ` [PATCH/RFC 02/10] ref-filter: introduce struct used_atom Karthik Nayak
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Introduce a parsing function for each atom in valid_atom. Using this
we can define special parsing functions for each of the atoms. Since
we have a third field in valid_atom structure, we now fill out missing
cmp_type values.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 59 ++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index e205dd2..fbbda17 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -19,42 +19,43 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
 static struct {
 	const char *name;
 	cmp_type cmp_type;
+	void (*parser)(struct used_atom *atom);
 } valid_atom[] = {
-	{ "refname" },
-	{ "objecttype" },
+	{ "refname", FIELD_STR },
+	{ "objecttype", FIELD_STR },
 	{ "objectsize", FIELD_ULONG },
-	{ "objectname" },
-	{ "tree" },
-	{ "parent" },
+	{ "objectname", FIELD_STR },
+	{ "tree", FIELD_STR },
+	{ "parent", FIELD_STR },
 	{ "numparent", FIELD_ULONG },
-	{ "object" },
-	{ "type" },
-	{ "tag" },
-	{ "author" },
-	{ "authorname" },
-	{ "authoremail" },
+	{ "object", FIELD_STR },
+	{ "type", FIELD_STR },
+	{ "tag", FIELD_STR },
+	{ "author", FIELD_STR },
+	{ "authorname", FIELD_STR },
+	{ "authoremail", FIELD_STR },
 	{ "authordate", FIELD_TIME },
-	{ "committer" },
-	{ "committername" },
-	{ "committeremail" },
+	{ "committer", FIELD_STR },
+	{ "committername", FIELD_STR },
+	{ "committeremail", FIELD_STR },
 	{ "committerdate", FIELD_TIME },
-	{ "tagger" },
-	{ "taggername" },
-	{ "taggeremail" },
+	{ "tagger", FIELD_STR },
+	{ "taggername", FIELD_STR },
+	{ "taggeremail", FIELD_STR },
 	{ "taggerdate", FIELD_TIME },
-	{ "creator" },
+	{ "creator", FIELD_STR },
 	{ "creatordate", FIELD_TIME },
-	{ "subject" },
-	{ "body" },
-	{ "contents" },
-	{ "upstream" },
-	{ "push" },
-	{ "symref" },
-	{ "flag" },
-	{ "HEAD" },
-	{ "color" },
-	{ "align" },
-	{ "end" },
+	{ "subject", FIELD_STR },
+	{ "body", FIELD_STR },
+	{ "contents", FIELD_STR },
+	{ "upstream", FIELD_STR },
+	{ "push", FIELD_STR },
+	{ "symref", FIELD_STR },
+	{ "flag", FIELD_STR },
+	{ "HEAD", FIELD_STR },
+	{ "color", FIELD_STR },
+	{ "align", FIELD_STR },
+	{ "end", FIELD_STR },
 };
 
 #define REF_FORMATTING_STATE_INIT  { 0, NULL }
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 02/10] ref-filter: introduce struct used_atom
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
  2015-11-11 19:44 ` [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-12-01 23:00   ` Eric Sunshine
  2015-11-11 19:44 ` [PATCH/RFC 03/10] ref-fitler: bump match_atom() name to the top Karthik Nayak
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Introduce the 'used_array' structure which would replace the existing
implementation of 'used_array' (which a list of atoms). This helps us
parse atom's before hand and store required details into the
'used_array' for future usage.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 62 +++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 32 insertions(+), 30 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index fbbda17..748a2fe 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -16,6 +16,23 @@
 
 typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
 
+/*
+ * An atom is a valid field atom listed below, possibly prefixed with
+ * a "*" to denote deref_tag().
+ *
+ * We parse given format string and sort specifiers, and make a list
+ * of properties that we need to extract out of objects.  ref_array_item
+ * structure will hold an array of values extracted that can be
+ * indexed with the "atom number", which is an index into this
+ * array.
+ */
+static struct used_atom {
+	const char *str;
+	cmp_type type;
+} *used_atom;
+static int used_atom_cnt, need_tagged, need_symref;
+static int need_color_reset_at_eol;
+
 static struct {
 	const char *name;
 	cmp_type cmp_type;
@@ -93,21 +110,6 @@ struct atom_value {
 };
 
 /*
- * An atom is a valid field atom listed above, possibly prefixed with
- * a "*" to denote deref_tag().
- *
- * We parse given format string and sort specifiers, and make a list
- * of properties that we need to extract out of objects.  ref_array_item
- * structure will hold an array of values extracted that can be
- * indexed with the "atom number", which is an index into this
- * array.
- */
-static const char **used_atom;
-static cmp_type *used_atom_type;
-static int used_atom_cnt, need_tagged, need_symref;
-static int need_color_reset_at_eol;
-
-/*
  * Used to parse format string and sort specifiers
  */
 int parse_ref_filter_atom(const char *atom, const char *ep)
@@ -123,8 +125,8 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
 
 	/* Do we have the atom already used elsewhere? */
 	for (i = 0; i < used_atom_cnt; i++) {
-		int len = strlen(used_atom[i]);
-		if (len == ep - atom && !memcmp(used_atom[i], atom, len))
+		int len = strlen(used_atom[i].str);
+		if (len == ep - atom && !memcmp(used_atom[i].str, atom, len))
 			return i;
 	}
 
@@ -151,12 +153,11 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
 	at = used_atom_cnt;
 	used_atom_cnt++;
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
-	REALLOC_ARRAY(used_atom_type, used_atom_cnt);
-	used_atom[at] = xmemdupz(atom, ep - atom);
-	used_atom_type[at] = valid_atom[i].cmp_type;
+	used_atom[at].str = xmemdupz(atom, ep - atom);
+	used_atom[at].type = valid_atom[i].cmp_type;
 	if (*atom == '*')
 		need_tagged = 1;
-	if (!strcmp(used_atom[at], "symref"))
+	if (!strcmp(used_atom[at].str, "symref"))
 		need_symref = 1;
 	return at;
 }
@@ -316,7 +317,7 @@ int verify_ref_format(const char *format)
 		at = parse_ref_filter_atom(sp + 2, ep);
 		cp = ep + 1;
 
-		if (skip_prefix(used_atom[at], "color:", &color))
+		if (skip_prefix(used_atom[at].str, "color:", &color))
 			need_color_reset_at_eol = !!strcmp(color, "reset");
 	}
 	return 0;
@@ -360,7 +361,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct object
 	int i;
 
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i];
+		const char *name = used_atom[i].str;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
@@ -384,7 +385,7 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 	struct tag *tag = (struct tag *) obj;
 
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i];
+		const char *name = used_atom[i].str;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
@@ -406,7 +407,7 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 	struct commit *commit = (struct commit *) obj;
 
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i];
+		const char *name = used_atom[i].str;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
@@ -536,7 +537,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, stru
 	const char *wholine = NULL;
 
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i];
+		const char *name = used_atom[i].str;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
@@ -574,7 +575,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, stru
 	if (!wholine)
 		return;
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i];
+		const char *name = used_atom[i].str;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
@@ -664,7 +665,7 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct obj
 	unsigned long sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
 
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i];
+		const char *name = used_atom[i].str;
 		struct atom_value *v = &val[i];
 		const char *valp = NULL;
 		if (!!deref != (*name == '*'))
@@ -787,7 +788,8 @@ static void populate_value(struct ref_array_item *ref)
 
 	/* Fill in specials first */
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i];
+		struct used_atom *atom = &used_atom[i];
+		const char *name = atom->str;
 		struct atom_value *v = &ref->value[i];
 		int deref = 0;
 		const char *refname;
@@ -1446,7 +1448,7 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 {
 	struct atom_value *va, *vb;
 	int cmp;
-	cmp_type cmp_type = used_atom_type[s->atom];
+	cmp_type cmp_type = used_atom[s->atom].type;
 
 	get_ref_atom_value(a, s->atom, &va);
 	get_ref_atom_value(b, s->atom, &vb);
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 03/10] ref-fitler: bump match_atom() name to the top
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
  2015-11-11 19:44 ` [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom Karthik Nayak
  2015-11-11 19:44 ` [PATCH/RFC 02/10] ref-filter: introduce struct used_atom Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-11-11 19:44 ` [PATCH/RFC 04/10] ref-filter: skip deref specifier in match_atom_name() Karthik Nayak
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Bump match_atom() to the top for usage in further patches.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 748a2fe..1542f5f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -33,6 +33,22 @@ static struct used_atom {
 static int used_atom_cnt, need_tagged, need_symref;
 static int need_color_reset_at_eol;
 
+static int match_atom_name(const char *name, const char *atom_name, const char **val)
+{
+	const char *body;
+
+	if (!skip_prefix(name, atom_name, &body))
+		return 0; /* doesn't even begin with "atom_name" */
+	if (!body[0]) {
+		*val = NULL; /* %(atom_name) and no customization */
+		return 1;
+	}
+	if (body[0] != ':')
+		return 0; /* "atom_namefoo" is not "atom_name" or "atom_name:..." */
+	*val = body + 1; /* "atom_name:val" */
+	return 1;
+}
+
 static struct {
 	const char *name;
 	cmp_type cmp_type;
@@ -260,22 +276,6 @@ static void end_atom_handler(struct atom_value *atomv, struct ref_formatting_sta
 	pop_stack_element(&state->stack);
 }
 
-static int match_atom_name(const char *name, const char *atom_name, const char **val)
-{
-	const char *body;
-
-	if (!skip_prefix(name, atom_name, &body))
-		return 0; /* doesn't even begin with "atom_name" */
-	if (!body[0]) {
-		*val = NULL; /* %(atom_name) and no customization */
-		return 1;
-	}
-	if (body[0] != ':')
-		return 0; /* "atom_namefoo" is not "atom_name" or "atom_name:..." */
-	*val = body + 1; /* "atom_name:val" */
-	return 1;
-}
-
 /*
  * In a format string, find the next occurrence of %(atom).
  */
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 04/10] ref-filter: skip deref specifier in match_atom_name()
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (2 preceding siblings ...)
  2015-11-11 19:44 ` [PATCH/RFC 03/10] ref-fitler: bump match_atom() name to the top Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-12-01 23:11   ` Eric Sunshine
  2015-11-11 19:44 ` [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser() Karthik Nayak
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index 1542f5f..4af28ef 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -37,6 +37,10 @@ static int match_atom_name(const char *name, const char *atom_name, const char *
 {
 	const char *body;
 
+	/*  skip the deref specifier*/
+	if (name[0] == '*')
+		name++;
+
 	if (!skip_prefix(name, atom_name, &body))
 		return 0; /* doesn't even begin with "atom_name" */
 	if (!body[0]) {
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser()
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (3 preceding siblings ...)
  2015-11-11 19:44 ` [PATCH/RFC 04/10] ref-filter: skip deref specifier in match_atom_name() Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-12-01 23:27   ` Eric Sunshine
  2015-11-11 19:44 ` [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term() Karthik Nayak
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Introduce color_atom_parser() which will parse a "color" atom and
store its color in the "use_atom" structure for further usage in
'populate_value()'.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4af28ef..0523d54 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -29,6 +29,9 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
 static struct used_atom {
 	const char *str;
 	cmp_type type;
+	union {
+		const char *color;
+	} u;
 } *used_atom;
 static int used_atom_cnt, need_tagged, need_symref;
 static int need_color_reset_at_eol;
@@ -53,6 +56,13 @@ static int match_atom_name(const char *name, const char *atom_name, const char *
 	return 1;
 }
 
+void color_atom_parser(struct used_atom *atom)
+{
+	match_atom_name(atom->str, "color", &atom->u.color);
+	if (!atom->u.color)
+		die(_("expected format: %%(color:<color>)"));
+}
+
 static struct {
 	const char *name;
 	cmp_type cmp_type;
@@ -90,7 +100,7 @@ static struct {
 	{ "symref", FIELD_STR },
 	{ "flag", FIELD_STR },
 	{ "HEAD", FIELD_STR },
-	{ "color", FIELD_STR },
+	{ "color", FIELD_STR, color_atom_parser },
 	{ "align", FIELD_STR },
 	{ "end", FIELD_STR },
 };
@@ -175,6 +185,9 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
 	used_atom[at].str = xmemdupz(atom, ep - atom);
 	used_atom[at].type = valid_atom[i].cmp_type;
+	memset(&used_atom[at].u, 0, sizeof(used_atom[at].u));
+	if (valid_atom[i].parser)
+		valid_atom[i].parser(&used_atom[at]);
 	if (*atom == '*')
 		need_tagged = 1;
 	if (!strcmp(used_atom[at].str, "symref"))
@@ -833,11 +846,10 @@ static void populate_value(struct ref_array_item *ref)
 			refname = branch_get_push(branch, NULL);
 			if (!refname)
 				continue;
-		} else if (match_atom_name(name, "color", &valp)) {
+		} else if (starts_with(name, "color")) {
 			char color[COLOR_MAXLEN] = "";
+			const char *valp = atom->u.color;
 
-			if (!valp)
-				die(_("expected format: %%(color:<color>)"));
 			if (color_parse(valp, color) < 0)
 				die(_("unable to parse format"));
 			v->s = xstrdup(color);
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term()
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (4 preceding siblings ...)
  2015-11-11 19:44 ` [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser() Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-12-02  8:04   ` Eric Sunshine
  2015-11-11 19:44 ` [PATCH/RFC 07/10] ref-filter: introduce align_atom_parser() Karthik Nayak
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

The current implementation of 'strbuf_split_buf()' includes the
terminator at the end of each strbuf post splitting. Include an option
wherein we can drop the terminator if required. In this context
introduce a wrapper function 'strbuf_split_str_without_term()' which
splits a given string into strbufs without including the terminator.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c |  5 +----
 strbuf.c     | 17 +++++++++++++----
 strbuf.h     | 14 ++++++++++----
 3 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 0523d54..4e8b3c9 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -892,14 +892,11 @@ static void populate_value(struct ref_array_item *ref)
 			 * TODO: Implement a function similar to strbuf_split_str()
 			 * which would omit the separator from the end of each value.
 			 */
-			s = to_free = strbuf_split_str(valp, ',', 0);
+			s = to_free = strbuf_split_str_without_term(valp, ',', 0);
 
 			align->position = ALIGN_LEFT;
 
 			while (*s) {
-				/*  Strip trailing comma */
-				if (s[1])
-					strbuf_setlen(s[0], s[0]->len - 1);
 				if (!strtoul_ui(s[0]->buf, 10, (unsigned int *)&width))
 					;
 				else if (!strcmp(s[0]->buf, "left"))
diff --git a/strbuf.c b/strbuf.c
index b552a13..d31336f 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -115,7 +115,7 @@ void strbuf_tolower(struct strbuf *sb)
 }
 
 struct strbuf **strbuf_split_buf(const char *str, size_t slen,
-				 int terminator, int max)
+				 int terminator, int max, int with_term)
 {
 	struct strbuf **ret = NULL;
 	size_t nr = 0, alloc = 0;
@@ -123,18 +123,27 @@ struct strbuf **strbuf_split_buf(const char *str, size_t slen,
 
 	while (slen) {
 		int len = slen;
+		int term = with_term;
 		if (max <= 0 || nr + 1 < max) {
 			const char *end = memchr(str, terminator, slen);
 			if (end)
-				len = end - str + 1;
+				len = end - str + term;
+			else
+				/*  When no terminator present, we must add the last character */
+				term = 1;
 		}
 		t = xmalloc(sizeof(struct strbuf));
 		strbuf_init(t, len);
 		strbuf_add(t, str, len);
 		ALLOC_GROW(ret, nr + 2, alloc);
 		ret[nr++] = t;
-		str += len;
-		slen -= len;
+		if (!term) {
+			str += len + 1;
+			slen -= len + 1;
+		} else {
+			str += len;
+			slen -= len;
+		}
 	}
 	ALLOC_GROW(ret, nr + 1, alloc); /* In case string was empty */
 	ret[nr] = NULL;
diff --git a/strbuf.h b/strbuf.h
index 6ae7a72..63e1e69 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -465,19 +465,25 @@ static inline int strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
  * For lighter-weight alternatives, see string_list_split() and
  * string_list_split_in_place().
  */
-extern struct strbuf **strbuf_split_buf(const char *, size_t,
-					int terminator, int max);
+extern struct strbuf **strbuf_split_buf(const char *str, size_t slen,
+					int terminator, int max, int with_term);
+
+static inline struct strbuf **strbuf_split_str_without_term(const char *str,
+							    int terminator, int max)
+{
+	return strbuf_split_buf(str, strlen(str), terminator, max, 0);
+}
 
 static inline struct strbuf **strbuf_split_str(const char *str,
 					       int terminator, int max)
 {
-	return strbuf_split_buf(str, strlen(str), terminator, max);
+	return strbuf_split_buf(str, strlen(str), terminator, max, 1);
 }
 
 static inline struct strbuf **strbuf_split_max(const struct strbuf *sb,
 						int terminator, int max)
 {
-	return strbuf_split_buf(sb->buf, sb->len, terminator, max);
+	return strbuf_split_buf(sb->buf, sb->len, terminator, max, 1);
 }
 
 static inline struct strbuf **strbuf_split(const struct strbuf *sb,
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 07/10] ref-filter: introduce align_atom_parser()
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (5 preceding siblings ...)
  2015-11-11 19:44 ` [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term() Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-12-02 21:23   ` Eric Sunshine
  2015-11-11 19:44 ` [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser() Karthik Nayak
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Introduce align_atom_parser() which will parse 'align' atoms and store
the required width and position into the 'used_atom' structure. While
we're here, add support for the usage of 'width=' and 'position=' when
using the 'align' atom (e.g. %(align:position=middle,width=30)).

Add documentation and modify the existing tests in t6302 to reflect
the same.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 Documentation/git-for-each-ref.txt |  18 ++++---
 ref-filter.c                       | 102 +++++++++++++++++++++++--------------
 t/t6302-for-each-ref-filter.sh     |   4 +-
 3 files changed, 75 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index c6f073c..56ffdc1 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -129,14 +129,16 @@ color::
 
 align::
 	Left-, middle-, or right-align the content between
-	%(align:...) and %(end). The "align:" is followed by `<width>`
-	and `<position>` in any order separated by a comma, where the
-	`<position>` is either left, right or middle, default being
-	left and `<width>` is the total length of the content with
-	alignment. If the contents length is more than the width then
-	no alignment is performed. If used with '--quote' everything
-	in between %(align:...) and %(end) is quoted, but if nested
-	then only the topmost level performs quoting.
+	%(align:...) and %(end). The "align:" is followed by
+	`width=<width>` and `position=<position>` in any order
+	separated by a comma, where the `<position>` is either left,
+	right or middle, default being left and `<width>` is the total
+	length of the content with alignment. The prefix for the
+	arguments is not mandatory. If the contents length is more
+	than the width then no alignment is performed. If used with
+	'--quote' everything in between %(align:...) and %(end) is
+	quoted, but if nested then only the topmost level performs
+	quoting.
 
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
diff --git a/ref-filter.c b/ref-filter.c
index 4e8b3c9..049e6b9 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -16,6 +16,11 @@
 
 typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
 
+struct align {
+	align_type position;
+	unsigned int width;
+};
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -31,6 +36,7 @@ static struct used_atom {
 	cmp_type type;
 	union {
 		const char *color;
+		struct align align;
 	} u;
 } *used_atom;
 static int used_atom_cnt, need_tagged, need_symref;
@@ -63,6 +69,61 @@ void color_atom_parser(struct used_atom *atom)
 		die(_("expected format: %%(color:<color>)"));
 }
 
+static align_type get_align_position(const char *type)
+{
+	if (!strcmp(type, "right"))
+		return ALIGN_RIGHT;
+	else if (!strcmp(type, "middle"))
+		return ALIGN_MIDDLE;
+	else if (!strcmp(type, "left"))
+		return ALIGN_LEFT;
+	return -1;
+}
+
+void align_atom_parser(struct used_atom *atom)
+{
+	struct align *align = &atom->u.align;
+	const char *buf = NULL;
+	struct strbuf **s, **to_free;
+	int width = -1;
+
+	match_atom_name(atom->str, "align", &buf);
+	if (!buf)
+		die(_("expected format: %%(align:<width>,<position>)"));
+	s = to_free = strbuf_split_str_without_term(buf, ',', 0);
+
+	/*  By default align to ALGIN_LEFT */
+	align->position = ALIGN_LEFT;
+
+	while (*s) {
+		int position;
+		buf = s[0]->buf;
+
+		position = get_align_position(buf);
+
+		if (skip_prefix(buf, "position=", &buf)) {
+			position = get_align_position(buf);
+			if (position == -1)
+				die(_("improper format entered align:%s"), s[0]->buf);
+			align->position = position;
+		} else if (skip_prefix(buf, "width=", &buf)) {
+			if (strtoul_ui(buf, 10, (unsigned int *)&width))
+				die(_("improper format entered align:%s"), s[0]->buf);
+		} else if (!strtoul_ui(buf, 10, (unsigned int *)&width))
+				;
+		else if (position != -1)
+			align->position = position;
+		else
+			die(_("improper format entered align:%s"), s[0]->buf);
+		s++;
+	}
+
+	if (width < 0)
+		die(_("positive width expected with the %%(align) atom"));
+	align->width = width;
+	strbuf_list_free(to_free);
+}
+
 static struct {
 	const char *name;
 	cmp_type cmp_type;
@@ -101,17 +162,12 @@ static struct {
 	{ "flag", FIELD_STR },
 	{ "HEAD", FIELD_STR },
 	{ "color", FIELD_STR, color_atom_parser },
-	{ "align", FIELD_STR },
+	{ "align", FIELD_STR, align_atom_parser },
 	{ "end", FIELD_STR },
 };
 
 #define REF_FORMATTING_STATE_INIT  { 0, NULL }
 
-struct align {
-	align_type position;
-	unsigned int width;
-};
-
 struct contents {
 	unsigned int lines;
 	struct object_id oid;
@@ -881,39 +937,7 @@ static void populate_value(struct ref_array_item *ref)
 				v->s = " ";
 			continue;
 		} else if (match_atom_name(name, "align", &valp)) {
-			struct align *align = &v->u.align;
-			struct strbuf **s, **to_free;
-			int width = -1;
-
-			if (!valp)
-				die(_("expected format: %%(align:<width>,<position>)"));
-
-			/*
-			 * TODO: Implement a function similar to strbuf_split_str()
-			 * which would omit the separator from the end of each value.
-			 */
-			s = to_free = strbuf_split_str_without_term(valp, ',', 0);
-
-			align->position = ALIGN_LEFT;
-
-			while (*s) {
-				if (!strtoul_ui(s[0]->buf, 10, (unsigned int *)&width))
-					;
-				else if (!strcmp(s[0]->buf, "left"))
-					align->position = ALIGN_LEFT;
-				else if (!strcmp(s[0]->buf, "right"))
-					align->position = ALIGN_RIGHT;
-				else if (!strcmp(s[0]->buf, "middle"))
-					align->position = ALIGN_MIDDLE;
-				else
-					die(_("improper format entered align:%s"), s[0]->buf);
-				s++;
-			}
-
-			if (width < 0)
-				die(_("positive width expected with the %%(align) atom"));
-			align->width = width;
-			strbuf_list_free(to_free);
+			v->u.align = atom->u.align;
 			v->handler = align_atom_handler;
 			continue;
 		} else if (!strcmp(name, "end")) {
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index fe4796c..688751e 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -97,7 +97,7 @@ test_expect_success 'left alignment is default' '
 	refname is refs/tags/three    |refs/tags/three
 	refname is refs/tags/two      |refs/tags/two
 	EOF
-	git for-each-ref --format="%(align:30)refname is %(refname)%(end)|%(refname)" >actual &&
+	git for-each-ref --format="%(align:width=30)refname is %(refname)%(end)|%(refname)" >actual &&
 	test_cmp expect actual
 '
 
@@ -113,7 +113,7 @@ test_expect_success 'middle alignment' '
 	|  refname is refs/tags/three  |refs/tags/three
 	|   refname is refs/tags/two   |refs/tags/two
 	EOF
-	git for-each-ref --format="|%(align:middle,30)refname is %(refname)%(end)|%(refname)" >actual &&
+	git for-each-ref --format="|%(align:position=middle,30)refname is %(refname)%(end)|%(refname)" >actual &&
 	test_cmp expect actual
 '
 
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser()
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (6 preceding siblings ...)
  2015-11-11 19:44 ` [PATCH/RFC 07/10] ref-filter: introduce align_atom_parser() Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-12-13  0:53   ` Eric Sunshine
  2015-11-11 19:44 ` [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser() Karthik Nayak
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Introduce remote_ref_atom_parser() which will parse the '%(upstream)'
and '%(push)' atoms and store information into the 'used_atom'
structure based on the modifiers used along with the corresponding
atom.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 107 ++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 66 insertions(+), 41 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 049e6b9..802629b 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -37,6 +37,11 @@ static struct used_atom {
 	union {
 		const char *color;
 		struct align align;
+		struct {
+			unsigned int shorten : 1,
+				track : 1,
+				trackshort : 1;
+		} remote_ref;
 	} u;
 } *used_atom;
 static int used_atom_cnt, need_tagged, need_symref;
@@ -69,6 +74,24 @@ void color_atom_parser(struct used_atom *atom)
 		die(_("expected format: %%(color:<color>)"));
 }
 
+void remote_ref_atom_parser(struct used_atom *atom)
+{
+	const char *buf;
+
+	buf = strchr(atom->str, ':');
+	if (!buf)
+		return;
+	buf++;
+	if (!strcmp(buf, "short"))
+		atom->u.remote_ref.shorten = 1;
+	else if (!strcmp(buf, "track"))
+		atom->u.remote_ref.track = 1;
+	else if (!strcmp(buf, "trackshort"))
+		atom->u.remote_ref.trackshort = 1;
+	else
+		die(_("improper format entered align:%s"), buf);
+}
+
 static align_type get_align_position(const char *type)
 {
 	if (!strcmp(type, "right"))
@@ -156,8 +179,8 @@ static struct {
 	{ "subject", FIELD_STR },
 	{ "body", FIELD_STR },
 	{ "contents", FIELD_STR },
-	{ "upstream", FIELD_STR },
-	{ "push", FIELD_STR },
+	{ "upstream", FIELD_STR, remote_ref_atom_parser },
+	{ "push", FIELD_STR, remote_ref_atom_parser },
 	{ "symref", FIELD_STR },
 	{ "flag", FIELD_STR },
 	{ "HEAD", FIELD_STR },
@@ -838,6 +861,42 @@ static inline char *copy_advance(char *dst, const char *src)
 	return dst;
 }
 
+static void fill_remote_ref_details(struct used_atom *atom, const char *refname,
+				    struct branch *branch, const char **s)
+{
+	int num_ours, num_theirs;
+	if (atom->u.remote_ref.shorten)
+		*s = shorten_unambiguous_ref(refname, warn_ambiguous_refs);
+	else if (atom->u.remote_ref.track) {
+		if (stat_tracking_info(branch, &num_ours,
+				       &num_theirs, NULL))
+			return;
+		if (!num_ours && !num_theirs)
+			*s = "";
+		else if (!num_ours)
+			*s = xstrfmt("[behind %d]", num_theirs);
+		else if (!num_theirs)
+			*s = xstrfmt("[ahead %d]", num_ours);
+		else
+			*s = xstrfmt("[ahead %d, behind %d]",
+				     num_ours, num_theirs);
+	} else if (atom->u.remote_ref.trackshort) {
+		if (stat_tracking_info(branch, &num_ours,
+				       &num_theirs, NULL))
+			return;
+
+		if (!num_ours && !num_theirs)
+			*s = "=";
+		else if (!num_ours)
+			*s = "<";
+		else if (!num_theirs)
+			*s = ">";
+		else
+			*s = "<>";
+	} else
+		*s = refname;
+}
+
 /*
  * Parse the object referred by ref, and grab needed value.
  */
@@ -892,6 +951,8 @@ static void populate_value(struct ref_array_item *ref)
 			refname = branch_get_upstream(branch, NULL);
 			if (!refname)
 				continue;
+			fill_remote_ref_details(atom, refname, branch, &v->s);
+			continue;
 		} else if (starts_with(name, "push")) {
 			const char *branch_name;
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -902,6 +963,8 @@ static void populate_value(struct ref_array_item *ref)
 			refname = branch_get_push(branch, NULL);
 			if (!refname)
 				continue;
+			fill_remote_ref_details(atom, refname, branch, &v->s);
+			continue;
 		} else if (starts_with(name, "color")) {
 			char color[COLOR_MAXLEN] = "";
 			const char *valp = atom->u.color;
@@ -948,49 +1011,11 @@ static void populate_value(struct ref_array_item *ref)
 
 		formatp = strchr(name, ':');
 		if (formatp) {
-			int num_ours, num_theirs;
-
 			formatp++;
 			if (!strcmp(formatp, "short"))
 				refname = shorten_unambiguous_ref(refname,
 						      warn_ambiguous_refs);
-			else if (!strcmp(formatp, "track") &&
-				 (starts_with(name, "upstream") ||
-				  starts_with(name, "push"))) {
-
-				if (stat_tracking_info(branch, &num_ours,
-						       &num_theirs, NULL))
-					continue;
-
-				if (!num_ours && !num_theirs)
-					v->s = "";
-				else if (!num_ours)
-					v->s = xstrfmt("[behind %d]", num_theirs);
-				else if (!num_theirs)
-					v->s = xstrfmt("[ahead %d]", num_ours);
-				else
-					v->s = xstrfmt("[ahead %d, behind %d]",
-						       num_ours, num_theirs);
-				continue;
-			} else if (!strcmp(formatp, "trackshort") &&
-				   (starts_with(name, "upstream") ||
-				    starts_with(name, "push"))) {
-				assert(branch);
-
-				if (stat_tracking_info(branch, &num_ours,
-							&num_theirs, NULL))
-					continue;
-
-				if (!num_ours && !num_theirs)
-					v->s = "=";
-				else if (!num_ours)
-					v->s = "<";
-				else if (!num_theirs)
-					v->s = ">";
-				else
-					v->s = "<>";
-				continue;
-			} else
+			else
 				die("unknown %.*s format %s",
 				    (int)(formatp - name), name, formatp);
 		}
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser()
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (7 preceding siblings ...)
  2015-11-11 19:44 ` [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser() Karthik Nayak
@ 2015-11-11 19:44 ` Karthik Nayak
  2015-12-13  3:10   ` Eric Sunshine
  2015-11-24 21:48 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Jeff King
  2015-12-11 22:49 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Junio C Hamano
  10 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-11 19:44 UTC (permalink / raw)
  To: git; +Cc: matthieu.moy, gitster, Karthik Nayak

Introduce contents_atom_parser() which will parse the '%(contents)'
atom and store information into the 'used_atom' structure based on the
modifiers used along with the atom.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 66 ++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 802629b..117bbbb 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -42,6 +42,14 @@ static struct used_atom {
 				track : 1,
 				trackshort : 1;
 		} remote_ref;
+		struct {
+			unsigned int subject : 1,
+				body : 1,
+				signature : 1,
+				all : 1,
+				lines : 1,
+				no_lines;
+		} contents;
 	} u;
 } *used_atom;
 static int used_atom_cnt, need_tagged, need_symref;
@@ -92,6 +100,29 @@ void remote_ref_atom_parser(struct used_atom *atom)
 		die(_("improper format entered align:%s"), buf);
 }
 
+void contents_atom_parser(struct used_atom *atom)
+{
+	const char * buf;
+
+	if (match_atom_name(atom->str, "contents", &buf))
+		atom->u.contents.all = 1;
+
+	if (!buf)
+		return;
+	if (!strcmp(buf, "body"))
+		atom->u.contents.body = 1;
+	else if (!strcmp(buf, "signature"))
+		atom->u.contents.signature = 1;
+	else if (!strcmp(buf, "subject"))
+		atom->u.contents.subject = 1;
+	else if (skip_prefix(buf, "lines=", &buf)) {
+		atom->u.contents.lines = 1;
+		if (strtoul_ui(buf, 10, &atom->u.contents.no_lines))
+			die(_("positive value expected contents:lines=%s"), buf);
+	} else
+		die(_("improper format entered contents:%s"), buf);
+}
+
 static align_type get_align_position(const char *type)
 {
 	if (!strcmp(type, "right"))
@@ -178,7 +209,7 @@ static struct {
 	{ "creatordate", FIELD_TIME },
 	{ "subject", FIELD_STR },
 	{ "body", FIELD_STR },
-	{ "contents", FIELD_STR },
+	{ "contents", FIELD_STR, contents_atom_parser },
 	{ "upstream", FIELD_STR, remote_ref_atom_parser },
 	{ "push", FIELD_STR, remote_ref_atom_parser },
 	{ "symref", FIELD_STR },
@@ -191,11 +222,6 @@ static struct {
 
 #define REF_FORMATTING_STATE_INIT  { 0, NULL }
 
-struct contents {
-	unsigned int lines;
-	struct object_id oid;
-};
-
 struct ref_formatting_stack {
 	struct ref_formatting_stack *prev;
 	struct strbuf output;
@@ -212,7 +238,6 @@ struct atom_value {
 	const char *s;
 	union {
 		struct align align;
-		struct contents contents;
 	} u;
 	void (*handler)(struct atom_value *atomv, struct ref_formatting_state *state);
 	unsigned long ul; /* used for sorting when not FIELD_STR */
@@ -761,20 +786,16 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct obj
 	unsigned long sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
 
 	for (i = 0; i < used_atom_cnt; i++) {
-		const char *name = used_atom[i].str;
+		struct used_atom *atom = &used_atom[i];
+		const char *name = atom->str;
 		struct atom_value *v = &val[i];
-		const char *valp = NULL;
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 		if (strcmp(name, "subject") &&
 		    strcmp(name, "body") &&
-		    strcmp(name, "contents") &&
-		    strcmp(name, "contents:subject") &&
-		    strcmp(name, "contents:body") &&
-		    strcmp(name, "contents:signature") &&
-		    !starts_with(name, "contents:lines="))
+		    !atom->u.contents.all)
 			continue;
 		if (!subpos)
 			find_subpos(buf, sz,
@@ -784,26 +805,23 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct obj
 
 		if (!strcmp(name, "subject"))
 			v->s = copy_subject(subpos, sublen);
-		else if (!strcmp(name, "contents:subject"))
+		else if (atom->u.contents.subject)
 			v->s = copy_subject(subpos, sublen);
 		else if (!strcmp(name, "body"))
 			v->s = xmemdupz(bodypos, bodylen);
-		else if (!strcmp(name, "contents:body"))
+		else if (atom->u.contents.body)
 			v->s = xmemdupz(bodypos, nonsiglen);
-		else if (!strcmp(name, "contents:signature"))
+		else if (atom->u.contents.signature)
 			v->s = xmemdupz(sigpos, siglen);
-		else if (!strcmp(name, "contents"))
-			v->s = xstrdup(subpos);
-		else if (skip_prefix(name, "contents:lines=", &valp)) {
+		else if (atom->u.contents.lines) {
 			struct strbuf s = STRBUF_INIT;
 			const char *contents_end = bodylen + bodypos - siglen;
 
-			if (strtoul_ui(valp, 10, &v->u.contents.lines))
-				die(_("positive value expected contents:lines=%s"), valp);
 			/*  Size is the length of the message after removing the signature */
-			append_lines(&s, subpos, contents_end - subpos, v->u.contents.lines);
+			append_lines(&s, subpos, contents_end - subpos, atom->u.contents.no_lines);
 			v->s = strbuf_detach(&s, NULL);
-		}
+		} else /*  For %(contents) without modifiers */
+			v->s = xstrdup(subpos);
 	}
 }
 
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom
  2015-11-11 19:44 ` [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom Karthik Nayak
@ 2015-11-23 23:44   ` Eric Sunshine
  2015-11-25 12:10     ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-11-23 23:44 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Introduce a parsing function for each atom in valid_atom. Using this
> we can define special parsing functions for each of the atoms. Since
> we have a third field in valid_atom structure, we now fill out missing
> cmp_type values.

I don't get it. Why do you need to "fill out missing cmp_type values"
considering that you're never assigning the third field in this patch?
Are you planning on filling in the third field in a future patch?

> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -19,42 +19,43 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
>  static struct {
>         const char *name;
>         cmp_type cmp_type;
> +       void (*parser)(struct used_atom *atom);

Compiler diagnostic:

    warning: declaration of 'struct used_atom' will not be
        visible outside of this function [-Wvisibility]

Indeed, it seems rather odd to introduce the new field in this patch
but never actually do anything with it. It's difficult to understand
the intention.

>  } valid_atom[] = {
> -       { "refname" },
> -       { "objecttype" },
> +       { "refname", FIELD_STR },
> +       { "objecttype", FIELD_STR },
>         { "objectsize", FIELD_ULONG },
> -       { "objectname" },
> -       { "tree" },
> -       { "parent" },
> +       { "objectname", FIELD_STR },
> +       { "tree", FIELD_STR },
> +       { "parent", FIELD_STR },
>         { "numparent", FIELD_ULONG },
> -       { "object" },
> -       { "type" },
> -       { "tag" },
> -       { "author" },
> -       { "authorname" },
> -       { "authoremail" },
> +       { "object", FIELD_STR },
> +       { "type", FIELD_STR },
> +       { "tag", FIELD_STR },
> +       { "author", FIELD_STR },
> +       { "authorname", FIELD_STR },
> +       { "authoremail", FIELD_STR },
>         { "authordate", FIELD_TIME },
> -       { "committer" },
> -       { "committername" },
> -       { "committeremail" },
> +       { "committer", FIELD_STR },
> +       { "committername", FIELD_STR },
> +       { "committeremail", FIELD_STR },
>         { "committerdate", FIELD_TIME },
> -       { "tagger" },
> -       { "taggername" },
> -       { "taggeremail" },
> +       { "tagger", FIELD_STR },
> +       { "taggername", FIELD_STR },
> +       { "taggeremail", FIELD_STR },
>         { "taggerdate", FIELD_TIME },
> -       { "creator" },
> +       { "creator", FIELD_STR },
>         { "creatordate", FIELD_TIME },
> -       { "subject" },
> -       { "body" },
> -       { "contents" },
> -       { "upstream" },
> -       { "push" },
> -       { "symref" },
> -       { "flag" },
> -       { "HEAD" },
> -       { "color" },
> -       { "align" },
> -       { "end" },
> +       { "subject", FIELD_STR },
> +       { "body", FIELD_STR },
> +       { "contents", FIELD_STR },
> +       { "upstream", FIELD_STR },
> +       { "push", FIELD_STR },
> +       { "symref", FIELD_STR },
> +       { "flag", FIELD_STR },
> +       { "HEAD", FIELD_STR },
> +       { "color", FIELD_STR },
> +       { "align", FIELD_STR },
> +       { "end", FIELD_STR },
>  };
>
>  #define REF_FORMATTING_STATE_INIT  { 0, NULL }
> --
> 2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (8 preceding siblings ...)
  2015-11-11 19:44 ` [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser() Karthik Nayak
@ 2015-11-24 21:48 ` Jeff King
  2015-11-25 12:07   ` Karthik Nayak
  2015-11-25 13:44   ` [PATCH/RFC 10/10] ref-filter: introduce objectname_atom_parser() Karthik Nayak
  2015-12-11 22:49 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Junio C Hamano
  10 siblings, 2 replies; 50+ messages in thread
From: Jeff King @ 2015-11-24 21:48 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, matthieu.moy, gitster

On Thu, Nov 12, 2015 at 01:14:26AM +0530, Karthik Nayak wrote:

> Karthik Nayak (10):
>   ref-filter: introduce a parsing function for each atom in valid_atom
>   ref-filter: introduce struct used_atom
>   ref-fitler: bump match_atom() name to the top
>   ref-filter: skip deref specifier in match_atom_name()
>   ref-filter: introduce color_atom_parser()
>   strbuf: introduce strbuf_split_str_without_term()
>   ref-filter: introduce align_atom_parser()
>   ref-filter: introduce remote_ref_atom_parser()
>   ref-filter: introduce contents_atom_parser()
>   ref-filter: introduce objectname_atom_parser()

Hmm, your patch 10 does not seem to have made it to the list (at least I
did not ever get it, and gmane seems to be down, so I cannot check there).

-Peff

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-11-24 21:48 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Jeff King
@ 2015-11-25 12:07   ` Karthik Nayak
  2015-11-25 13:44   ` [PATCH/RFC 10/10] ref-filter: introduce objectname_atom_parser() Karthik Nayak
  1 sibling, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-11-25 12:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Git, Matthieu Moy, Junio C Hamano

On Wed, Nov 25, 2015 at 3:18 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Nov 12, 2015 at 01:14:26AM +0530, Karthik Nayak wrote:
>
>> Karthik Nayak (10):
>>   ref-filter: introduce a parsing function for each atom in valid_atom
>>   ref-filter: introduce struct used_atom
>>   ref-fitler: bump match_atom() name to the top
>>   ref-filter: skip deref specifier in match_atom_name()
>>   ref-filter: introduce color_atom_parser()
>>   strbuf: introduce strbuf_split_str_without_term()
>>   ref-filter: introduce align_atom_parser()
>>   ref-filter: introduce remote_ref_atom_parser()
>>   ref-filter: introduce contents_atom_parser()
>>   ref-filter: introduce objectname_atom_parser()
>
> Hmm, your patch 10 does not seem to have made it to the list (at least I
> did not ever get it, and gmane seems to be down, so I cannot check there).
>
> -Peff

That's weird, I'll reply to this mail with patch 10.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom
  2015-11-23 23:44   ` Eric Sunshine
@ 2015-11-25 12:10     ` Karthik Nayak
  2015-11-25 19:41       ` Eric Sunshine
  0 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-25 12:10 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Tue, Nov 24, 2015 at 5:14 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> Introduce a parsing function for each atom in valid_atom. Using this
>> we can define special parsing functions for each of the atoms. Since
>> we have a third field in valid_atom structure, we now fill out missing
>> cmp_type values.
>
> I don't get it. Why do you need to "fill out missing cmp_type values"
> considering that you're never assigning the third field in this patch?
> Are you planning on filling in the third field in a future patch?
>

I plan on filling that in upcoming patches. Probably, should mention that in
the commit message.

>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>> ---
>> diff --git a/ref-filter.c b/ref-filter.c
>> @@ -19,42 +19,43 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
>>  static struct {
>>         const char *name;
>>         cmp_type cmp_type;
>> +       void (*parser)(struct used_atom *atom);
>
> Compiler diagnostic:
>
>     warning: declaration of 'struct used_atom' will not be
>         visible outside of this function [-Wvisibility]
>
> Indeed, it seems rather odd to introduce the new field in this patch
> but never actually do anything with it. It's difficult to understand
> the intention.
>

This is to make way for upcoming patches. But the compiler error is
accurate used_atom only becomes a structure in the next patch.
Should change that.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH/RFC 10/10] ref-filter: introduce objectname_atom_parser()
  2015-11-24 21:48 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Jeff King
  2015-11-25 12:07   ` Karthik Nayak
@ 2015-11-25 13:44   ` Karthik Nayak
  2015-12-13  4:49     ` Eric Sunshine
  1 sibling, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-25 13:44 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Karthik Nayak

Introduce objectname_atom_parser() which will parse the
'%(objectname)' atom and store information into the 'used_atom'
structure based on the modifiers used along with the atom.

Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
---
 ref-filter.c | 42 +++++++++++++++++++++++++++++++-----------
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 117bbbb..f2add19 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -50,6 +50,10 @@ static struct used_atom {
 				lines : 1,
 				no_lines;
 		} contents;
+		struct {
+			unsigned int shorten : 1,
+				full : 1;
+		} objectname;
 	} u;
 } *used_atom;
 static int used_atom_cnt, need_tagged, need_symref;
@@ -123,6 +127,21 @@ void contents_atom_parser(struct used_atom *atom)
 		die(_("improper format entered contents:%s"), buf);
 }
 
+void objectname_atom_parser(struct used_atom *atom)
+{
+	const char * buf;
+
+	if (match_atom_name(atom->str, "objectname", &buf))
+		atom->u.objectname.full = 1;
+
+	if (!buf)
+		return;
+	if (!strcmp(buf, "short"))
+		atom->u.objectname.shorten = 1;
+	else
+		die(_("improper format entered objectname:%s"), buf);
+}
+
 static align_type get_align_position(const char *type)
 {
 	if (!strcmp(type, "right"))
@@ -186,7 +205,7 @@ static struct {
 	{ "refname", FIELD_STR },
 	{ "objecttype", FIELD_STR },
 	{ "objectsize", FIELD_ULONG },
-	{ "objectname", FIELD_STR },
+	{ "objectname", FIELD_STR, objectname_atom_parser },
 	{ "tree", FIELD_STR },
 	{ "parent", FIELD_STR },
 	{ "numparent", FIELD_ULONG },
@@ -463,15 +482,16 @@ static void *get_obj(const unsigned char *sha1, struct object **obj, unsigned lo
 }
 
 static int grab_objectname(const char *name, const unsigned char *sha1,
-			    struct atom_value *v)
+			   struct atom_value *v, struct used_atom *atom)
 {
-	if (!strcmp(name, "objectname")) {
-		v->s = xstrdup(sha1_to_hex(sha1));
-		return 1;
-	}
-	if (!strcmp(name, "objectname:short")) {
-		v->s = xstrdup(find_unique_abbrev(sha1, DEFAULT_ABBREV));
-		return 1;
+	if (starts_with(name, "objectname")) {
+		if (atom->u.objectname.shorten) {
+			v->s = xstrdup(find_unique_abbrev(sha1, DEFAULT_ABBREV));
+			return 1;
+		} else if (atom->u.objectname.full) {
+			v->s = xstrdup(sha1_to_hex(sha1));
+			return 1;
+		}
 	}
 	return 0;
 }
@@ -495,7 +515,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct object
 			v->s = xstrfmt("%lu", sz);
 		}
 		else if (deref)
-			grab_objectname(name, obj->sha1, v);
+			grab_objectname(name, obj->sha1, v, &used_atom[i]);
 	}
 }
 
@@ -1004,7 +1024,7 @@ static void populate_value(struct ref_array_item *ref)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_objectname(name, ref->objectname, v)) {
+		} else if (!deref && grab_objectname(name, ref->objectname, v, atom)) {
 			continue;
 		} else if (!strcmp(name, "HEAD")) {
 			const char *head;
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom
  2015-11-25 12:10     ` Karthik Nayak
@ 2015-11-25 19:41       ` Eric Sunshine
  2015-11-26 18:01         ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-11-25 19:41 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 25, 2015 at 7:10 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> On Tue, Nov 24, 2015 at 5:14 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>> Introduce a parsing function for each atom in valid_atom. Using this
>>> we can define special parsing functions for each of the atoms. Since
>>> we have a third field in valid_atom structure, we now fill out missing
>>> cmp_type values.
>>
>> I don't get it. Why do you need to "fill out missing cmp_type values"
>> considering that you're never assigning the third field in this patch?
>> Are you planning on filling in the third field in a future patch?
>
> I plan on filling that in upcoming patches. Probably, should mention that in
> the commit message.

Making it clear that this patch is preparatory for introduction of
'valid_atom' is a good idea, however, adding the unused 'valid_atom'
field in this patch is not recommended. It would be better to
introduce 'valid_atom' in the patch which actually needs it.

>>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>>> ---
>>> diff --git a/ref-filter.c b/ref-filter.c
>>> @@ -19,42 +19,43 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
>>>  static struct {
>>>         const char *name;
>>>         cmp_type cmp_type;
>>> +       void (*parser)(struct used_atom *atom);
>>
>> Compiler diagnostic:
>>
>>     warning: declaration of 'struct used_atom' will not be
>>         visible outside of this function [-Wvisibility]
>>
>> Indeed, it seems rather odd to introduce the new field in this patch
>> but never actually do anything with it. It's difficult to understand
>> the intention.
>
> This is to make way for upcoming patches. But the compiler error is
> accurate used_atom only becomes a structure in the next patch.
> Should change that.

This problem will go away if you introduce the 'valid_atom' field in
the patch which actually needs it (as suggested above) rather than in
this patch.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom
  2015-11-25 19:41       ` Eric Sunshine
@ 2015-11-26 18:01         ` Karthik Nayak
  2015-12-11 22:18           ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-11-26 18:01 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Thu, Nov 26, 2015 at 1:11 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 25, 2015 at 7:10 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> On Tue, Nov 24, 2015 at 5:14 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>>> Introduce a parsing function for each atom in valid_atom. Using this
>>>> we can define special parsing functions for each of the atoms. Since
>>>> we have a third field in valid_atom structure, we now fill out missing
>>>> cmp_type values.
>>>
>>> I don't get it. Why do you need to "fill out missing cmp_type values"
>>> considering that you're never assigning the third field in this patch?
>>> Are you planning on filling in the third field in a future patch?
>>
>> I plan on filling that in upcoming patches. Probably, should mention that in
>> the commit message.
>
> Making it clear that this patch is preparatory for introduction of
> 'valid_atom' is a good idea, however, adding the unused 'valid_atom'
> field in this patch is not recommended. It would be better to
> introduce 'valid_atom' in the patch which actually needs it.
>

I get your point, will do as you suggested.

>>>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>>>> ---
>>>> diff --git a/ref-filter.c b/ref-filter.c
>>>> @@ -19,42 +19,43 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
>>>>  static struct {
>>>>         const char *name;
>>>>         cmp_type cmp_type;
>>>> +       void (*parser)(struct used_atom *atom);
>>>
>>> Compiler diagnostic:
>>>
>>>     warning: declaration of 'struct used_atom' will not be
>>>         visible outside of this function [-Wvisibility]
>>>
>>> Indeed, it seems rather odd to introduce the new field in this patch
>>> but never actually do anything with it. It's difficult to understand
>>> the intention.
>>
>> This is to make way for upcoming patches. But the compiler error is
>> accurate used_atom only becomes a structure in the next patch.
>> Should change that.
>
> This problem will go away if you introduce the 'valid_atom' field in
> the patch which actually needs it (as suggested above) rather than in
> this patch.

Yup, agreed.
Thanks for your suggestions.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 02/10] ref-filter: introduce struct used_atom
  2015-11-11 19:44 ` [PATCH/RFC 02/10] ref-filter: introduce struct used_atom Karthik Nayak
@ 2015-12-01 23:00   ` Eric Sunshine
  0 siblings, 0 replies; 50+ messages in thread
From: Eric Sunshine @ 2015-12-01 23:00 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Introduce the 'used_array' structure which would replace the existing

I guess you meant s/used_array/used_atom/ or something?

Also, s/which would/to/

> implementation of 'used_array' (which a list of atoms). This helps us

s/which a/which is a/

> parse atom's before hand and store required details into the

s/atom's/atoms/
s/before hand/beforehand/

> 'used_array' for future usage.
>
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -16,6 +16,23 @@
> +/*
> + * An atom is a valid field atom listed below, possibly prefixed with
> + * a "*" to denote deref_tag().
> + *
> + * We parse given format string and sort specifiers, and make a list
> + * of properties that we need to extract out of objects.  ref_array_item
> + * structure will hold an array of values extracted that can be
> + * indexed with the "atom number", which is an index into this
> + * array.
> + */
> +static struct used_atom {
> +       const char *str;

This is really the atom's name, isn't it? If so, perhaps "name" would
be a better field name.

> +       cmp_type type;
> +} *used_atom;
> +static int used_atom_cnt, need_tagged, need_symref;
> +static int need_color_reset_at_eol;
> +
>  static struct {
>         const char *name;
>         cmp_type cmp_type;
> @@ -93,21 +110,6 @@ struct atom_value {
>  };
>
>  /*
> - * An atom is a valid field atom listed above, possibly prefixed with
> - * a "*" to denote deref_tag().
> - *
> - * We parse given format string and sort specifiers, and make a list
> - * of properties that we need to extract out of objects.  ref_array_item
> - * structure will hold an array of values extracted that can be
> - * indexed with the "atom number", which is an index into this
> - * array.
> - */
> -static const char **used_atom;
> -static cmp_type *used_atom_type;
> -static int used_atom_cnt, need_tagged, need_symref;
> -static int need_color_reset_at_eol;

You're moving this block of declarations up above the valid_atom[]
array because the previous patch added a new field named "parser" to
valid_atom[] which references 'struct used_atom' added by this patch
(2). I wonder if this movement should be done as a separate
preparatory patch to make it easier to review since, as it stands, the
reviewer has to read much more carefully to detect changes in the
moved block.

> -/*
>   * Used to parse format string and sort specifiers
>   */
>  int parse_ref_filter_atom(const char *atom, const char *ep)
> @@ -787,7 +788,8 @@ static void populate_value(struct ref_array_item *ref)
>
>         /* Fill in specials first */
>         for (i = 0; i < used_atom_cnt; i++) {
> -               const char *name = used_atom[i];
> +               struct used_atom *atom = &used_atom[i];
> +               const char *name = atom->str;

Why not just:

    const char *name = used_atom[i].str;

?

>                 struct atom_value *v = &ref->value[i];
>                 int deref = 0;
>                 const char *refname;

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 04/10] ref-filter: skip deref specifier in match_atom_name()
  2015-11-11 19:44 ` [PATCH/RFC 04/10] ref-filter: skip deref specifier in match_atom_name() Karthik Nayak
@ 2015-12-01 23:11   ` Eric Sunshine
  2015-12-03  6:34     ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-01 23:11 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>

A bit of explanation about why this change is desirable would be
welcome. I'm guessing it's because a future patch is going to make
calls to match_atom_name() with the '*' deref indicator still attached
to the name, whereas existing code does not do so.

> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -37,6 +37,10 @@ static int match_atom_name(const char *name, const char *atom_name, const char *
>  {
>         const char *body;
>
> +       /*  skip the deref specifier*/
> +       if (name[0] == '*')
> +               name++;
> +
>         if (!skip_prefix(name, atom_name, &body))
>                 return 0; /* doesn't even begin with "atom_name" */
>         if (!body[0]) {
> --
> 2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser()
  2015-11-11 19:44 ` [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser() Karthik Nayak
@ 2015-12-01 23:27   ` Eric Sunshine
  2015-12-03 13:35     ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-01 23:27 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Introduce color_atom_parser() which will parse a "color" atom and
> store its color in the "use_atom" structure for further usage in

s/use_atom/used_atom/

> 'populate_value()'.
>
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -29,6 +29,9 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
>  static struct used_atom {
>         const char *str;
>         cmp_type type;
> +       union {
> +               const char *color;
> +       } u;
>  } *used_atom;
> @@ -53,6 +56,13 @@ static int match_atom_name(const char *name, const char *atom_name, const char *
> +void color_atom_parser(struct used_atom *atom)
> +{
> +       match_atom_name(atom->str, "color", &atom->u.color);
> +       if (!atom->u.color)
> +               die(_("expected format: %%(color:<color>)"));
> +}
> +
> @@ -175,6 +185,9 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
>         REALLOC_ARRAY(used_atom, used_atom_cnt);
>         used_atom[at].str = xmemdupz(atom, ep - atom);
>         used_atom[at].type = valid_atom[i].cmp_type;
> +       memset(&used_atom[at].u, 0, sizeof(used_atom[at].u));
> +       if (valid_atom[i].parser)
> +               valid_atom[i].parser(&used_atom[at]);
>         if (*atom == '*')
>                 need_tagged = 1;
>         if (!strcmp(used_atom[at].str, "symref"))
> @@ -833,11 +846,10 @@ static void populate_value(struct ref_array_item *ref)
>                         refname = branch_get_push(branch, NULL);
>                         if (!refname)
>                                 continue;
> -               } else if (match_atom_name(name, "color", &valp)) {
> +               } else if (starts_with(name, "color")) {

Hmm, so this will also match "colorize". Is that desirable?

>                         char color[COLOR_MAXLEN] = "";
> +                       const char *valp = atom->u.color;
>
> -                       if (!valp)
> -                               die(_("expected format: %%(color:<color>)"));
>                         if (color_parse(valp, color) < 0)

Rather than declaring variable 'valp', why not just say:

    if (color_parse(atom->u.color, color) < 0)

?

>                                 die(_("unable to parse format"));
>                         v->s = xstrdup(color);
> --
> 2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term()
  2015-11-11 19:44 ` [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term() Karthik Nayak
@ 2015-12-02  8:04   ` Eric Sunshine
  2015-12-03 18:12     ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-02  8:04 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> The current implementation of 'strbuf_split_buf()' includes the
> terminator at the end of each strbuf post splitting. Include an option

s/Include an/Add an/

> wherein we can drop the terminator if required. In this context

s/required/desired/

> introduce a wrapper function 'strbuf_split_str_without_term()' which
> splits a given string into strbufs without including the terminator.
>
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -892,14 +892,11 @@ static void populate_value(struct ref_array_item *ref)
>                          * TODO: Implement a function similar to strbuf_split_str()
>                          * which would omit the separator from the end of each value.
>                          */
> -                       s = to_free = strbuf_split_str(valp, ',', 0);
> +                       s = to_free = strbuf_split_str_without_term(valp, ',', 0);
>
>                         align->position = ALIGN_LEFT;
>
>                         while (*s) {
> -                               /*  Strip trailing comma */
> -                               if (s[1])
> -                                       strbuf_setlen(s[0], s[0]->len - 1);

I'd prefer to see this ref-filter.c change split out as a separate
patch so as not to pollute the otherwise single-purpose change
introduced by this patch (i.e. capability to omit the terminator).

Also, it might make sense to move this patch to the head of the
series, since it's conceptually distinct from the rest of the patches,
and could conceivably prove useful on its own, regardless of how the
rest of the series fares.

>                                 if (!strtoul_ui(s[0]->buf, 10, (unsigned int *)&width))
>                                         ;
>                                 else if (!strcmp(s[0]->buf, "left"))
> diff --git a/strbuf.c b/strbuf.c
> @@ -115,7 +115,7 @@ void strbuf_tolower(struct strbuf *sb)
>  }
>
>  struct strbuf **strbuf_split_buf(const char *str, size_t slen,
> -                                int terminator, int max)
> +                                int terminator, int max, int with_term)

"with_term" might undesirably be interpreted as meaning "use this
particular term". Perhaps a different name, such as "include_term",
"drop_term", or "omit_term" would be a bit less ambiguous. (I think I
prefer "omit_term".)

>  {
>         struct strbuf **ret = NULL;
>         size_t nr = 0, alloc = 0;
> @@ -123,18 +123,27 @@ struct strbuf **strbuf_split_buf(const char *str, size_t slen,
>
>         while (slen) {
>                 int len = slen;
> +               int term = with_term;

"term" is not a great variable name, and is easily confused with the
existing "terminator" input argument. This is really being used as a
length adjustment, so perhaps a name such as 'term_adjust' or
'len_adjust' or something better would be preferable.

Also, since the value of 'with_term' never changes, then 'term' will
have the same value each time through the loop, thus you could
(should) hoist the declaration and initialization of 'term' outside of
the loop.

Due to the way you're using this variable ('term'), you want its value
always to be 0 or 1 but you don't do anything to ensure that. What if
the user passes in 42 rather than 0 or 1? That would mess up your
(below) calculations. Worse, what if the user passes in -42? That
would be particularly alarming. To turn this into a boolean value (0
or 1), do this instead:

    int term = !!with_term;

>                 if (max <= 0 || nr + 1 < max) {
>                         const char *end = memchr(str, terminator, slen);
>                         if (end)
> -                               len = end - str + 1;
> +                               len = end - str + term;
> +                       else
> +                               /*  When no terminator present, we must add the last character */
> +                               term = 1;
>                 }
>                 t = xmalloc(sizeof(struct strbuf));
>                 strbuf_init(t, len);
>                 strbuf_add(t, str, len);
>                 ALLOC_GROW(ret, nr + 2, alloc);
>                 ret[nr++] = t;
> -               str += len;
> -               slen -= len;
> +               if (!term) {
> +                       str += len + 1;
> +                       slen -= len + 1;
> +               } else {
> +                       str += len;
> +                       slen -= len;
> +               }

This new logic is complex and confusing, thus difficult to review for
correctness. Rather than messing with 'len' and the existing logic,
how about instead, just adjusting the amount you store in the strbuf?
That is, instead of all the above changes, you might be able to get by
with one little change, something like this (untested):

    - strbuf_add(t, str, len);
    + strbuf_add(t, str, len - !!end * !!with_term);

>         }
>         ALLOC_GROW(ret, nr + 1, alloc); /* In case string was empty */
>         ret[nr] = NULL;
> diff --git a/strbuf.h b/strbuf.h
> @@ -465,19 +465,25 @@ static inline int strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
>   * For lighter-weight alternatives, see string_list_split() and
>   * string_list_split_in_place().
>   */
> -extern struct strbuf **strbuf_split_buf(const char *, size_t,
> -                                       int terminator, int max);
> +extern struct strbuf **strbuf_split_buf(const char *str, size_t slen,
> +                                       int terminator, int max, int with_term);

You also need to update the comment block above this declaration since
it still says that each substring includes the terminator. It also
fails to mention the new 'with_term' argument added by this patch.

> +static inline struct strbuf **strbuf_split_str_without_term(const char *str,
> +                                                           int terminator, int max)

This is an uncomfortably long function name. Unfortunately, short and
sweet strbuf_split() is already taken. Perhaps
strbuf_split_str_drop_term()? strbuf_split_str_omit_term()?
strbuf_split_str_no_term()? strbuf_split_noterm()?

> +{
> +       return strbuf_split_buf(str, strlen(str), terminator, max, 0);
> +}
>
>  static inline struct strbuf **strbuf_split_str(const char *str,
>                                                int terminator, int max)
>  {
> -       return strbuf_split_buf(str, strlen(str), terminator, max);
> +       return strbuf_split_buf(str, strlen(str), terminator, max, 1);
>  }
>
>  static inline struct strbuf **strbuf_split_max(const struct strbuf *sb,
>                                                 int terminator, int max)
>  {
> -       return strbuf_split_buf(sb->buf, sb->len, terminator, max);
> +       return strbuf_split_buf(sb->buf, sb->len, terminator, max, 1);
>  }
>
>  static inline struct strbuf **strbuf_split(const struct strbuf *sb,
> --
> 2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 07/10] ref-filter: introduce align_atom_parser()
  2015-11-11 19:44 ` [PATCH/RFC 07/10] ref-filter: introduce align_atom_parser() Karthik Nayak
@ 2015-12-02 21:23   ` Eric Sunshine
  2015-12-07 17:00     ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-02 21:23 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Introduce align_atom_parser() which will parse 'align' atoms and store
> the required width and position into the 'used_atom' structure. While
> we're here, add support for the usage of 'width=' and 'position=' when
> using the 'align' atom (e.g. %(align:position=middle,width=30)).

This patch is doing too much by both moving code around and modifying
that code (somewhat dramatically), thus it is difficult for reviewers
to compare the old and new behaviors. It deserves to be split apart
into at least two patches. First, the code movement patch which
introduces align_atom_parser() (and possibly get_align_position())
without any behavior or logical change; then the patch which changes
behavior to recognize the spelled-out forms "width=" and "position=".
You may even want to spilt it into more patches, for instance by doing
the get_align_position() extraction in its own patch.

> Add documentation and modify the existing tests in t6302 to reflect
> the same.
>
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
> @@ -129,14 +129,16 @@ color::
>  align::
>         Left-, middle-, or right-align the content between
> -       %(align:...) and %(end). The "align:" is followed by `<width>`
> -       and `<position>` in any order separated by a comma, where the
> -       `<position>` is either left, right or middle, default being
> -       left and `<width>` is the total length of the content with
> -       alignment. If the contents length is more than the width then
> -       no alignment is performed. If used with '--quote' everything
> -       in between %(align:...) and %(end) is quoted, but if nested
> -       then only the topmost level performs quoting.
> +       %(align:...) and %(end). The "align:" is followed by
> +       `width=<width>` and `position=<position>` in any order
> +       separated by a comma, where the `<position>` is either left,
> +       right or middle, default being left and `<width>` is the total
> +       length of the content with alignment. The prefix for the
> +       arguments is not mandatory. If the contents length is more

This paragraph is so bulky that it's very easy to overlook the bit
about the "prefix for the arguments" being optional, and it's not
necessarily even clear to the casual reader what that means. It might,
therefore, be a good idea to spell it out explicitly. For instance,
you might say something like:

    For brevity, the "width=" and/or "position=" prefixes may be
    omitted, and bare <width> and <position> used instead.
    For instance, `%(align:<width>,<position>)`.

> +       than the width then no alignment is performed. If used with
> +       '--quote' everything in between %(align:...) and %(end) is
> +       quoted, but if nested then only the topmost level performs
> +       quoting.
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -63,6 +69,61 @@ void color_atom_parser(struct used_atom *atom)
> +static align_type get_align_position(const char *type)

Taken in context of the callers, this isn't a great function name, as
it implies that it is retrieving some value, when in fact it is
parsing the input argument. A better name might be
parse_align_position().

Likewise, 'type' isn't necessarily a great argument name. You might
instead call it 'pos' or even just short and sweet 's'.

> +{
> +       if (!strcmp(type, "right"))
> +               return ALIGN_RIGHT;
> +       else if (!strcmp(type, "middle"))
> +               return ALIGN_MIDDLE;
> +       else if (!strcmp(type, "left"))
> +               return ALIGN_LEFT;
> +       return -1;
> +}
> +
> +void align_atom_parser(struct used_atom *atom)
> +{
> +       struct align *align = &atom->u.align;
> +       const char *buf = NULL;
> +       struct strbuf **s, **to_free;
> +       int width = -1;
> +
> +       match_atom_name(atom->str, "align", &buf);
> +       if (!buf)
> +               die(_("expected format: %%(align:<width>,<position>)"));

Is this still the way you want this error message to appear, or should
it show the long-form of the arguments? (I don't care strongly.)

> +       s = to_free = strbuf_split_str_without_term(buf, ',', 0);
> +
> +       /*  By default align to ALGIN_LEFT */

What is ALGIN? Regardless of the answer, this comment is not
particularly useful since it merely repeats what the code itself
already states clearly.

> +       align->position = ALIGN_LEFT;
> +
> +       while (*s) {
> +               int position;
> +               buf = s[0]->buf;
> +
> +               position = get_align_position(buf);

Why is this assignment way up here rather than down below in the
penultimate 'else' arm where its result is actually being checked? By
moving it closer to the point of use, the logic becomes easier to
understand.

> +               if (skip_prefix(buf, "position=", &buf)) {
> +                       position = get_align_position(buf);
> +                       if (position == -1)
> +                               die(_("improper format entered align:%s"), s[0]->buf);

At this point, you can give a better error message since you know that
you were parsing a "position=" argument. Maybe something like
"unrecognized position: %s".

> +                       align->position = position;
> +               } else if (skip_prefix(buf, "width=", &buf)) {
> +                       if (strtoul_ui(buf, 10, (unsigned int *)&width))
> +                               die(_("improper format entered align:%s"), s[0]->buf);

Ditto regarding better error message.

> +               } else if (!strtoul_ui(buf, 10, (unsigned int *)&width))
> +                               ;
> +               else if (position != -1)
> +                       align->position = position;
> +               else
> +                       die(_("improper format entered align:%s"), s[0]->buf);

Here too, it would be more user-friendly to say "unrecognized
%%(align) argument: %s".

> +               s++;
> +       }
> +
> +       if (width < 0)
> +               die(_("positive width expected with the %%(align) atom"));
> +       align->width = width;
> +       strbuf_list_free(to_free);
> +}
> diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
> @@ -97,7 +97,7 @@ test_expect_success 'left alignment is default' '
>         refname is refs/tags/three    |refs/tags/three
>         refname is refs/tags/two      |refs/tags/two
>         EOF
> -       git for-each-ref --format="%(align:30)refname is %(refname)%(end)|%(refname)" >actual &&
> +       git for-each-ref --format="%(align:width=30)refname is %(refname)%(end)|%(refname)" >actual &&
>         test_cmp expect actual
>  '
>
> @@ -113,7 +113,7 @@ test_expect_success 'middle alignment' '
>         |  refname is refs/tags/three  |refs/tags/three
>         |   refname is refs/tags/two   |refs/tags/two
>         EOF
> -       git for-each-ref --format="|%(align:middle,30)refname is %(refname)%(end)|%(refname)" >actual &&
> +       git for-each-ref --format="|%(align:position=middle,30)refname is %(refname)%(end)|%(refname)" >actual &&
>         test_cmp expect actual
>  '

While it may sometimes be reasonable to re-purpose existing tests like
this, this probably is not one of those cases. Instead, you should be
adding new tests to check all the permutations of the new argument
handling. For instance:

    %(align:42)
    %(align:middle,42)
    %(align:42,middle)
    %(align:position=middle,42)
    %(align:42,position=middle)
    %(align:middle,width=42)
    %(align:width=42,middle)
    %(align:position=middle,width=42)
    %(align:width=42,position=middle)

And, it wouldn't hurt to test handling of redundant or extra position
and width arguments. Should multiple arguments of the same type result
in an error, or should "last one wins (sliently)" be the policy? Once
you decide upon a policy, add tests to check that that policy works as
expected.

In this case, "last one wins (silently)" may be more friendly to
script writers, so it might be the better choice. You'd want to add
appropriate tests, using the various permutations. For instance:

    %(align:42,width=43)
    %(align:width=43,42)
    %(align:42,position=middle,right)
    %(align:42,right,position=middle)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 04/10] ref-filter: skip deref specifier in match_atom_name()
  2015-12-01 23:11   ` Eric Sunshine
@ 2015-12-03  6:34     ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-03  6:34 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Dec 2, 2015 at 4:41 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>
> A bit of explanation about why this change is desirable would be
> welcome. I'm guessing it's because a future patch is going to make
> calls to match_atom_name() with the '*' deref indicator still attached
> to the name, whereas existing code does not do so.
>

Yes, you're correct!
Will add this in, thanks.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser()
  2015-12-01 23:27   ` Eric Sunshine
@ 2015-12-03 13:35     ` Karthik Nayak
  2015-12-13  6:05       ` Eric Sunshine
  0 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-12-03 13:35 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Dec 2, 2015 at 4:57 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> Introduce color_atom_parser() which will parse a "color" atom and
>> store its color in the "use_atom" structure for further usage in
>
> s/use_atom/used_atom/
>

will change.

>> 'populate_value()'.
>>
>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>> ---
>> diff --git a/ref-filter.c b/ref-filter.c
>> @@ -29,6 +29,9 @@ typedef enum { FIELD_STR, FIELD_ULONG, FIELD_TIME } cmp_type;
>>  static struct used_atom {
>>         const char *str;
>>         cmp_type type;
>> +       union {
>> +               const char *color;
>> +       } u;
>>  } *used_atom;
>> @@ -53,6 +56,13 @@ static int match_atom_name(const char *name, const char *atom_name, const char *
>> +void color_atom_parser(struct used_atom *atom)
>> +{
>> +       match_atom_name(atom->str, "color", &atom->u.color);
>> +       if (!atom->u.color)
>> +               die(_("expected format: %%(color:<color>)"));
>> +}
>> +
>> @@ -175,6 +185,9 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
>>         REALLOC_ARRAY(used_atom, used_atom_cnt);
>>         used_atom[at].str = xmemdupz(atom, ep - atom);
>>         used_atom[at].type = valid_atom[i].cmp_type;
>> +       memset(&used_atom[at].u, 0, sizeof(used_atom[at].u));
>> +       if (valid_atom[i].parser)
>> +               valid_atom[i].parser(&used_atom[at]);
>>         if (*atom == '*')
>>                 need_tagged = 1;
>>         if (!strcmp(used_atom[at].str, "symref"))
>> @@ -833,11 +846,10 @@ static void populate_value(struct ref_array_item *ref)
>>                         refname = branch_get_push(branch, NULL);
>>                         if (!refname)
>>                                 continue;
>> -               } else if (match_atom_name(name, "color", &valp)) {
>> +               } else if (starts_with(name, "color")) {
>
> Hmm, so this will also match "colorize". Is that desirable?
>

Well the error checking is done when we parse the atom in color_atom_parser()
so here we don't need to worry about something like this.

>>                         char color[COLOR_MAXLEN] = "";
>> +                       const char *valp = atom->u.color;
>>
>> -                       if (!valp)
>> -                               die(_("expected format: %%(color:<color>)"));
>>                         if (color_parse(valp, color) < 0)
>
> Rather than declaring variable 'valp', why not just say:
>
>     if (color_parse(atom->u.color, color) < 0)
>
> ?

Yes, will change, thanks.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term()
  2015-12-02  8:04   ` Eric Sunshine
@ 2015-12-03 18:12     ` Karthik Nayak
  2015-12-11 22:31       ` Junio C Hamano
  0 siblings, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-12-03 18:12 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Dec 2, 2015 at 1:34 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> The current implementation of 'strbuf_split_buf()' includes the
>> terminator at the end of each strbuf post splitting. Include an option
>
> s/Include an/Add an/
>
>> wherein we can drop the terminator if required. In this context
>
> s/required/desired/
>

will change.

>> introduce a wrapper function 'strbuf_split_str_without_term()' which
>> splits a given string into strbufs without including the terminator.
>>
>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>> ---
>> diff --git a/ref-filter.c b/ref-filter.c
>> @@ -892,14 +892,11 @@ static void populate_value(struct ref_array_item *ref)
>>                          * TODO: Implement a function similar to strbuf_split_str()
>>                          * which would omit the separator from the end of each value.
>>                          */
>> -                       s = to_free = strbuf_split_str(valp, ',', 0);
>> +                       s = to_free = strbuf_split_str_without_term(valp, ',', 0);
>>
>>                         align->position = ALIGN_LEFT;
>>
>>                         while (*s) {
>> -                               /*  Strip trailing comma */
>> -                               if (s[1])
>> -                                       strbuf_setlen(s[0], s[0]->len - 1);
>
> I'd prefer to see this ref-filter.c change split out as a separate
> patch so as not to pollute the otherwise single-purpose change
> introduced by this patch (i.e. capability to omit the terminator).
>
> Also, it might make sense to move this patch to the head of the
> series, since it's conceptually distinct from the rest of the patches,
> and could conceivably prove useful on its own, regardless of how the
> rest of the series fares.
>

I guess it makes sense to split this into two separate patches. I'll do that and
push it to the top of the series.

>>                                 if (!strtoul_ui(s[0]->buf, 10, (unsigned int *)&width))
>>                                         ;
>>                                 else if (!strcmp(s[0]->buf, "left"))
>> diff --git a/strbuf.c b/strbuf.c
>> @@ -115,7 +115,7 @@ void strbuf_tolower(struct strbuf *sb)
>>  }
>>
>>  struct strbuf **strbuf_split_buf(const char *str, size_t slen,
>> -                                int terminator, int max)
>> +                                int terminator, int max, int with_term)
>
> "with_term" might undesirably be interpreted as meaning "use this
> particular term". Perhaps a different name, such as "include_term",
> "drop_term", or "omit_term" would be a bit less ambiguous. (I think I
> prefer "omit_term".)
>

True. I too prefer "omit_term" from what you mentioned I'll stick to that.

>>  {
>>         struct strbuf **ret = NULL;
>>         size_t nr = 0, alloc = 0;
>> @@ -123,18 +123,27 @@ struct strbuf **strbuf_split_buf(const char *str, size_t slen,
>>
>>         while (slen) {
>>                 int len = slen;
>> +               int term = with_term;
>
> "term" is not a great variable name, and is easily confused with the
> existing "terminator" input argument. This is really being used as a
> length adjustment, so perhaps a name such as 'term_adjust' or
> 'len_adjust' or something better would be preferable.

'len_adjust' seems better, will change.

>
> Also, since the value of 'with_term' never changes, then 'term' will
> have the same value each time through the loop, thus you could
> (should) hoist the declaration and initialization of 'term' outside of
> the loop.
>

True, will move it out of the loop.

> Due to the way you're using this variable ('term'), you want its value
> always to be 0 or 1 but you don't do anything to ensure that. What if
> the user passes in 42 rather than 0 or 1? That would mess up your
> (below) calculations. Worse, what if the user passes in -42? That
> would be particularly alarming. To turn this into a boolean value (0
> or 1), do this instead:
>
>     int term = !!with_term;

Makes sense, will change.

>
>>                 if (max <= 0 || nr + 1 < max) {
>>                         const char *end = memchr(str, terminator, slen);
>>                         if (end)
>> -                               len = end - str + 1;
>> +                               len = end - str + term;
>> +                       else
>> +                               /*  When no terminator present, we must add the last character */
>> +                               term = 1;
>>                 }
>>                 t = xmalloc(sizeof(struct strbuf));
>>                 strbuf_init(t, len);
>>                 strbuf_add(t, str, len);
>>                 ALLOC_GROW(ret, nr + 2, alloc);
>>                 ret[nr++] = t;
>> -               str += len;
>> -               slen -= len;
>> +               if (!term) {
>> +                       str += len + 1;
>> +                       slen -= len + 1;
>> +               } else {
>> +                       str += len;
>> +                       slen -= len;
>> +               }
>
> This new logic is complex and confusing, thus difficult to review for
> correctness. Rather than messing with 'len' and the existing logic,
> how about instead, just adjusting the amount you store in the strbuf?
> That is, instead of all the above changes, you might be able to get by
> with one little change, something like this (untested):
>
>     - strbuf_add(t, str, len);
>     + strbuf_add(t, str, len - !!end * !!with_term);
>

That seems about right, this should be easier to understand.

>>         }
>>         ALLOC_GROW(ret, nr + 1, alloc); /* In case string was empty */
>>         ret[nr] = NULL;
>> diff --git a/strbuf.h b/strbuf.h
>> @@ -465,19 +465,25 @@ static inline int strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
>>   * For lighter-weight alternatives, see string_list_split() and
>>   * string_list_split_in_place().
>>   */
>> -extern struct strbuf **strbuf_split_buf(const char *, size_t,
>> -                                       int terminator, int max);
>> +extern struct strbuf **strbuf_split_buf(const char *str, size_t slen,
>> +                                       int terminator, int max, int with_term);
>
> You also need to update the comment block above this declaration since
> it still says that each substring includes the terminator. It also
> fails to mention the new 'with_term' argument added by this patch.
>

Will do :)

>> +static inline struct strbuf **strbuf_split_str_without_term(const char *str,
>> +                                                           int terminator, int max)
>
> This is an uncomfortably long function name. Unfortunately, short and
> sweet strbuf_split() is already taken. Perhaps
> strbuf_split_str_drop_term()? strbuf_split_str_omit_term()?
> strbuf_split_str_no_term()? strbuf_split_noterm()?
>

strbuf_split_str_omit_term() would be nice keeping it consistent with the
variable name omit_term.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 07/10] ref-filter: introduce align_atom_parser()
  2015-12-02 21:23   ` Eric Sunshine
@ 2015-12-07 17:00     ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-07 17:00 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Thu, Dec 3, 2015 at 2:53 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> Introduce align_atom_parser() which will parse 'align' atoms and store
>> the required width and position into the 'used_atom' structure. While
>> we're here, add support for the usage of 'width=' and 'position=' when
>> using the 'align' atom (e.g. %(align:position=middle,width=30)).
>
> This patch is doing too much by both moving code around and modifying
> that code (somewhat dramatically), thus it is difficult for reviewers
> to compare the old and new behaviors. It deserves to be split apart
> into at least two patches. First, the code movement patch which
> introduces align_atom_parser() (and possibly get_align_position())
> without any behavior or logical change; then the patch which changes
> behavior to recognize the spelled-out forms "width=" and "position=".
> You may even want to spilt it into more patches, for instance by doing
> the get_align_position() extraction in its own patch.
>

I split it into two separate patches:
1. add align_atom_parser()
2. introduce "width=" and "position=" prefixes.

I think now it seems easier to follow.

>> Add documentation and modify the existing tests in t6302 to reflect
>> the same.
>>
>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>> ---
>> diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
>> @@ -129,14 +129,16 @@ color::
>>  align::
>>         Left-, middle-, or right-align the content between
>> -       %(align:...) and %(end). The "align:" is followed by `<width>`
>> -       and `<position>` in any order separated by a comma, where the
>> -       `<position>` is either left, right or middle, default being
>> -       left and `<width>` is the total length of the content with
>> -       alignment. If the contents length is more than the width then
>> -       no alignment is performed. If used with '--quote' everything
>> -       in between %(align:...) and %(end) is quoted, but if nested
>> -       then only the topmost level performs quoting.
>> +       %(align:...) and %(end). The "align:" is followed by
>> +       `width=<width>` and `position=<position>` in any order
>> +       separated by a comma, where the `<position>` is either left,
>> +       right or middle, default being left and `<width>` is the total
>> +       length of the content with alignment. The prefix for the
>> +       arguments is not mandatory. If the contents length is more
>
> This paragraph is so bulky that it's very easy to overlook the bit
> about the "prefix for the arguments" being optional, and it's not
> necessarily even clear to the casual reader what that means. It might,
> therefore, be a good idea to spell it out explicitly. For instance,
> you might say something like:
>
>     For brevity, the "width=" and/or "position=" prefixes may be
>     omitted, and bare <width> and <position> used instead.
>     For instance, `%(align:<width>,<position>)`.
>

Added this in.

>> +       than the width then no alignment is performed. If used with
>> +       '--quote' everything in between %(align:...) and %(end) is
>> +       quoted, but if nested then only the topmost level performs
>> +       quoting.
>> diff --git a/ref-filter.c b/ref-filter.c
>> @@ -63,6 +69,61 @@ void color_atom_parser(struct used_atom *atom)
>> +static align_type get_align_position(const char *type)
>
> Taken in context of the callers, this isn't a great function name, as
> it implies that it is retrieving some value, when in fact it is
> parsing the input argument. A better name might be
> parse_align_position().
>
> Likewise, 'type' isn't necessarily a great argument name. You might
> instead call it 'pos' or even just short and sweet 's'.
>

Will change.

>> +{
>> +       if (!strcmp(type, "right"))
>> +               return ALIGN_RIGHT;
>> +       else if (!strcmp(type, "middle"))
>> +               return ALIGN_MIDDLE;
>> +       else if (!strcmp(type, "left"))
>> +               return ALIGN_LEFT;
>> +       return -1;
>> +}
>> +
>> +void align_atom_parser(struct used_atom *atom)
>> +{
>> +       struct align *align = &atom->u.align;
>> +       const char *buf = NULL;
>> +       struct strbuf **s, **to_free;
>> +       int width = -1;
>> +
>> +       match_atom_name(atom->str, "align", &buf);
>> +       if (!buf)
>> +               die(_("expected format: %%(align:<width>,<position>)"));
>
> Is this still the way you want this error message to appear, or should
> it show the long-form of the arguments? (I don't care strongly.)
>

I think it still holds good, not too keen on changing it either.

>> +       s = to_free = strbuf_split_str_without_term(buf, ',', 0);
>> +
>> +       /*  By default align to ALGIN_LEFT */
>
> What is ALGIN? Regardless of the answer, this comment is not
> particularly useful since it merely repeats what the code itself
> already states clearly.
>

will do.

>> +       align->position = ALIGN_LEFT;
>> +
>> +       while (*s) {
>> +               int position;
>> +               buf = s[0]->buf;
>> +
>> +               position = get_align_position(buf);
>
> Why is this assignment way up here rather than down below in the
> penultimate 'else' arm where its result is actually being checked? By
> moving it closer to the point of use, the logic becomes easier to
> understand.
>

makes sense, will do.

>> +               if (skip_prefix(buf, "position=", &buf)) {
>> +                       position = get_align_position(buf);
>> +                       if (position == -1)
>> +                               die(_("improper format entered align:%s"), s[0]->buf);
>
> At this point, you can give a better error message since you know that
> you were parsing a "position=" argument. Maybe something like
> "unrecognized position: %s".
>

thanks, will add this in.

>> +                       align->position = position;
>> +               } else if (skip_prefix(buf, "width=", &buf)) {
>> +                       if (strtoul_ui(buf, 10, (unsigned int *)&width))
>> +                               die(_("improper format entered align:%s"), s[0]->buf);
>
> Ditto regarding better error message.
>

will do.

>> +               } else if (!strtoul_ui(buf, 10, (unsigned int *)&width))
>> +                               ;
>> +               else if (position != -1)
>> +                       align->position = position;
>> +               else
>> +                       die(_("improper format entered align:%s"), s[0]->buf);
>
> Here too, it would be more user-friendly to say "unrecognized
> %%(align) argument: %s".
>

will change.

>> +               s++;
>> +       }
>> +
>> +       if (width < 0)
>> +               die(_("positive width expected with the %%(align) atom"));
>> +       align->width = width;
>> +       strbuf_list_free(to_free);
>> +}
>> diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
>> @@ -97,7 +97,7 @@ test_expect_success 'left alignment is default' '
>>         refname is refs/tags/three    |refs/tags/three
>>         refname is refs/tags/two      |refs/tags/two
>>         EOF
>> -       git for-each-ref --format="%(align:30)refname is %(refname)%(end)|%(refname)" >actual &&
>> +       git for-each-ref --format="%(align:width=30)refname is %(refname)%(end)|%(refname)" >actual &&
>>         test_cmp expect actual
>>  '
>>
>> @@ -113,7 +113,7 @@ test_expect_success 'middle alignment' '
>>         |  refname is refs/tags/three  |refs/tags/three
>>         |   refname is refs/tags/two   |refs/tags/two
>>         EOF
>> -       git for-each-ref --format="|%(align:middle,30)refname is %(refname)%(end)|%(refname)" >actual &&
>> +       git for-each-ref --format="|%(align:position=middle,30)refname is %(refname)%(end)|%(refname)" >actual &&
>>         test_cmp expect actual
>>  '
>
> While it may sometimes be reasonable to re-purpose existing tests like
> this, this probably is not one of those cases. Instead, you should be
> adding new tests to check all the permutations of the new argument
> handling. For instance:
>
>     %(align:42)
>     %(align:middle,42)
>     %(align:42,middle)
>     %(align:position=middle,42)
>     %(align:42,position=middle)
>     %(align:middle,width=42)
>     %(align:width=42,middle)
>     %(align:position=middle,width=42)
>     %(align:width=42,position=middle)
>
> And, it wouldn't hurt to test handling of redundant or extra position
> and width arguments. Should multiple arguments of the same type result
> in an error, or should "last one wins (sliently)" be the policy? Once
> you decide upon a policy, add tests to check that that policy works as
> expected.
>

Currently like you said, it works on a "last one wins(silently)" policy.
I think this is what I'd stick with.
So like you said I'll implement the combinations of tests as suggested
and those suggested below too.

> In this case, "last one wins (silently)" may be more friendly to
> script writers, so it might be the better choice. You'd want to add
> appropriate tests, using the various permutations. For instance:
>
>     %(align:42,width=43)
>     %(align:width=43,42)
>     %(align:42,position=middle,right)
>     %(align:42,right,position=middle)

Thanks for the review :)

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom
  2015-11-26 18:01         ` Karthik Nayak
@ 2015-12-11 22:18           ` Junio C Hamano
  2015-12-12 16:05             ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2015-12-11 22:18 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Eric Sunshine, Git List, Matthieu Moy

Karthik Nayak <karthik.188@gmail.com> writes:

>> This problem will go away if you introduce the 'valid_atom' field in
>> the patch which actually needs it (as suggested above) rather than in
>> this patch.
>
> Yup, agreed.
> Thanks for your suggestions.

In addition, most of the lines in this patch should become
unnecessary even after you start using the "parser" field.

As the majority of fields are compared as strings, and only a
selected few fields need custom parsers, you do not have to
explicitly write FIELD_STR everywhere, as long as you make sure the
value of FIELD_STR is zero (and use parser==NULL as a sign that no
custom parser is used, which I think you already do).

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term()
  2015-12-03 18:12     ` Karthik Nayak
@ 2015-12-11 22:31       ` Junio C Hamano
  2015-12-12 16:04         ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2015-12-11 22:31 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Eric Sunshine, Git List, Matthieu Moy

Karthik Nayak <karthik.188@gmail.com> writes:

>>> diff --git a/ref-filter.c b/ref-filter.c
>>> @@ -892,14 +892,11 @@ static void populate_value(struct ref_array_item *ref)
>>>                          * TODO: Implement a function similar to strbuf_split_str()
>>>                          * which would omit the separator from the end of each value.
>>>                          */
>>> -                       s = to_free = strbuf_split_str(valp, ',', 0);
>>> +                       s = to_free = strbuf_split_str_without_term(valp, ',', 0);
>>>
>>>                         align->position = ALIGN_LEFT;
>>>
>>>                         while (*s) {
>>> -                               /*  Strip trailing comma */
>>> -                               if (s[1])
>>> -                                       strbuf_setlen(s[0], s[0]->len - 1);
>>
>> I'd prefer to see this ref-filter.c change split out as a separate
>> patch so as not to pollute the otherwise single-purpose change
>> introduced by this patch (i.e. capability to omit the terminator).
>>
>> Also, it might make sense to move this patch to the head of the
>> series, since it's conceptually distinct from the rest of the patches,
>> and could conceivably prove useful on its own, regardless of how the
>> rest of the series fares.
>>
>
> I guess it makes sense to split this into two separate patches. I'll do that and
> push it to the top of the series.

Sounds good.  I also notice that the "TODO: Implement a function
similar to..." we see in the precontext can now be removed, as that
is what is done in this patch?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
                   ` (9 preceding siblings ...)
  2015-11-24 21:48 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Jeff King
@ 2015-12-11 22:49 ` Junio C Hamano
  2015-12-13  5:40   ` Eric Sunshine
  10 siblings, 1 reply; 50+ messages in thread
From: Junio C Hamano @ 2015-12-11 22:49 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, matthieu.moy

Karthik Nayak <karthik.188@gmail.com> writes:

> Karthik Nayak (10):
>   ref-filter: introduce a parsing function for each atom in valid_atom
>   ref-filter: introduce struct used_atom
>   ref-fitler: bump match_atom() name to the top
>   ref-filter: skip deref specifier in match_atom_name()
>   ref-filter: introduce color_atom_parser()
>   strbuf: introduce strbuf_split_str_without_term()
>   ref-filter: introduce align_atom_parser()
>   ref-filter: introduce remote_ref_atom_parser()
>   ref-filter: introduce contents_atom_parser()
>   ref-filter: introduce objectname_atom_parser()

It seems that this series had seen quite a good inputs, mostly from
Eric.  I finished reading it over and I didn't find anything more to
add.  The patches are mostly good and would be ready once these
points raised during the review are addressed, I think

Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term()
  2015-12-11 22:31       ` Junio C Hamano
@ 2015-12-12 16:04         ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-12 16:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Sunshine, Git List, Matthieu Moy

On Sat, Dec 12, 2015 at 4:01 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
>>>> diff --git a/ref-filter.c b/ref-filter.c
>>>> @@ -892,14 +892,11 @@ static void populate_value(struct ref_array_item *ref)
>>>>                          * TODO: Implement a function similar to strbuf_split_str()
>>>>                          * which would omit the separator from the end of each value.
>>>>                          */
>>>> -                       s = to_free = strbuf_split_str(valp, ',', 0);
>>>> +                       s = to_free = strbuf_split_str_without_term(valp, ',', 0);
>>>>
>>>>                         align->position = ALIGN_LEFT;
>>>>
>>>>                         while (*s) {
>>>> -                               /*  Strip trailing comma */
>>>> -                               if (s[1])
>>>> -                                       strbuf_setlen(s[0], s[0]->len - 1);
>>>
>>> I'd prefer to see this ref-filter.c change split out as a separate
>>> patch so as not to pollute the otherwise single-purpose change
>>> introduced by this patch (i.e. capability to omit the terminator).
>>>
>>> Also, it might make sense to move this patch to the head of the
>>> series, since it's conceptually distinct from the rest of the patches,
>>> and could conceivably prove useful on its own, regardless of how the
>>> rest of the series fares.
>>>
>>
>> I guess it makes sense to split this into two separate patches. I'll do that and
>> push it to the top of the series.
>
> Sounds good.  I also notice that the "TODO: Implement a function
> similar to..." we see in the precontext can now be removed, as that
> is what is done in this patch?

Yes I noticed that when going through the changes Eric suggested. I have
removed it.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom
  2015-12-11 22:18           ` Junio C Hamano
@ 2015-12-12 16:05             ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-12 16:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Sunshine, Git List, Matthieu Moy

On Sat, Dec 12, 2015 at 3:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
>>> This problem will go away if you introduce the 'valid_atom' field in
>>> the patch which actually needs it (as suggested above) rather than in
>>> this patch.
>>
>> Yup, agreed.
>> Thanks for your suggestions.
>
> In addition, most of the lines in this patch should become
> unnecessary even after you start using the "parser" field.
>
> As the majority of fields are compared as strings, and only a
> selected few fields need custom parsers, you do not have to
> explicitly write FIELD_STR everywhere, as long as you make sure the
> value of FIELD_STR is zero (and use parser==NULL as a sign that no
> custom parser is used, which I think you already do).

True, currently I've changed it to only add the new parser function values
only wherever required.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser()
  2015-11-11 19:44 ` [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser() Karthik Nayak
@ 2015-12-13  0:53   ` Eric Sunshine
  2015-12-13  6:02     ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  0:53 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Introduce remote_ref_atom_parser() which will parse the '%(upstream)'
> and '%(push)' atoms and store information into the 'used_atom'
> structure based on the modifiers used along with the corresponding
> atom.
>
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -37,6 +37,11 @@ static struct used_atom {
>         union {
>                 const char *color;
>                 struct align align;
> +               struct {
> +                       unsigned int shorten : 1,
> +                               track : 1,
> +                               trackshort : 1;
> +               } remote_ref;

Are 'shorten', 'track', and 'trackshort' mutually exclusive? If so, a
simple enum would be clearer than bitfields:

    union {
        const char *color;
        struct align align;
        enum { RR_PLAIN, RR_SHORTEN, RR_TRACK, RR_TRACKSHORT }
            remote_ref;
    };

Or something.

>         } u;
>  } *used_atom;
>  static int used_atom_cnt, need_tagged, need_symref;
> @@ -69,6 +74,24 @@ void color_atom_parser(struct used_atom *atom)
> +void remote_ref_atom_parser(struct used_atom *atom)
> +{
> +       const char *buf;
> +
> +       buf = strchr(atom->str, ':');
> +       if (!buf)
> +               return;
> +       buf++;
> +       if (!strcmp(buf, "short"))
> +               atom->u.remote_ref.shorten = 1;
> +       else if (!strcmp(buf, "track"))
> +               atom->u.remote_ref.track = 1;
> +       else if (!strcmp(buf, "trackshort"))
> +               atom->u.remote_ref.trackshort = 1;
> +       else
> +               die(_("improper format entered align:%s"), buf);

"align:"? Also, how about a more grammatically-friendly error message?

> +}
> +
> @@ -838,6 +861,42 @@ static inline char *copy_advance(char *dst, const char *src)
> +static void fill_remote_ref_details(struct used_atom *atom, const char *refname,
> +                                   struct branch *branch, const char **s)
> +{
> +       int num_ours, num_theirs;
> +       if (atom->u.remote_ref.shorten)
> +               *s = shorten_unambiguous_ref(refname, warn_ambiguous_refs);
> +       else if (atom->u.remote_ref.track) {
> +               if (stat_tracking_info(branch, &num_ours,
> +                                      &num_theirs, NULL))
> +                       return;
> +               if (!num_ours && !num_theirs)
> +                       *s = "";
> +               else if (!num_ours)
> +                       *s = xstrfmt("[behind %d]", num_theirs);
> +               else if (!num_theirs)
> +                       *s = xstrfmt("[ahead %d]", num_ours);
> +               else
> +                       *s = xstrfmt("[ahead %d, behind %d]",
> +                                    num_ours, num_theirs);

Tangent: These xstrfmt()'d strings are getting leaked, right? Is that
something that we need to worry about (if, for instance, a repository
contains a lot of tracking refs)? Should there be a NEEDSWORK comment
here regarding the issue?

> +       } else if (atom->u.remote_ref.trackshort) {
> +               if (stat_tracking_info(branch, &num_ours,
> +                                      &num_theirs, NULL))
> +                       return;
> +
> +               if (!num_ours && !num_theirs)
> +                       *s = "=";
> +               else if (!num_ours)
> +                       *s = "<";
> +               else if (!num_theirs)
> +                       *s = ">";
> +               else
> +                       *s = "<>";
> +       } else
> +               *s = refname;
> +}
> +
>  /*
>   * Parse the object referred by ref, and grab needed value.
>   */
> @@ -948,49 +1011,11 @@ static void populate_value(struct ref_array_item *ref)
>
>                 formatp = strchr(name, ':');
>                 if (formatp) {
> -                       int num_ours, num_theirs;
> -
>                         formatp++;
>                         if (!strcmp(formatp, "short"))
>                                 refname = shorten_unambiguous_ref(refname,
>                                                       warn_ambiguous_refs);

Is this duplicating work already done by fill_remote_ref_details()?

> -                       else if (!strcmp(formatp, "track") &&
> -                                (starts_with(name, "upstream") ||
> -                                 starts_with(name, "push"))) {
> -
> -                               if (stat_tracking_info(branch, &num_ours,
> -                                                      &num_theirs, NULL))
> -                                       continue;
> -
> -                               if (!num_ours && !num_theirs)
> -                                       v->s = "";
> -                               else if (!num_ours)
> -                                       v->s = xstrfmt("[behind %d]", num_theirs);
> -                               else if (!num_theirs)
> -                                       v->s = xstrfmt("[ahead %d]", num_ours);
> -                               else
> -                                       v->s = xstrfmt("[ahead %d, behind %d]",
> -                                                      num_ours, num_theirs);
> -                               continue;
> -                       } else if (!strcmp(formatp, "trackshort") &&
> -                                  (starts_with(name, "upstream") ||
> -                                   starts_with(name, "push"))) {
> -                               assert(branch);
> -
> -                               if (stat_tracking_info(branch, &num_ours,
> -                                                       &num_theirs, NULL))
> -                                       continue;
> -
> -                               if (!num_ours && !num_theirs)
> -                                       v->s = "=";
> -                               else if (!num_ours)
> -                                       v->s = "<";
> -                               else if (!num_theirs)
> -                                       v->s = ">";
> -                               else
> -                                       v->s = "<>";
> -                               continue;
> -                       } else
> +                       else
>                                 die("unknown %.*s format %s",
>                                     (int)(formatp - name), name, formatp);
>                 }
> --
> 2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser()
  2015-11-11 19:44 ` [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser() Karthik Nayak
@ 2015-12-13  3:10   ` Eric Sunshine
  2015-12-13  4:41     ` Eric Sunshine
  2015-12-13 19:33     ` Karthik Nayak
  0 siblings, 2 replies; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  3:10 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Introduce contents_atom_parser() which will parse the '%(contents)'
> atom and store information into the 'used_atom' structure based on the
> modifiers used along with the atom.
>
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> index 802629b..117bbbb 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -42,6 +42,14 @@ static struct used_atom {
>                                 track : 1,
>                                 trackshort : 1;
>                 } remote_ref;
> +               struct {
> +                       unsigned int subject : 1,
> +                               body : 1,
> +                               signature : 1,
> +                               all : 1,
> +                               lines : 1,
> +                               no_lines;
> +               } contents;

Same question as patch 8/10: With the exception of non-bitfield
'no_lines', are these 'contents' flags mutually exclusive? If so, an
enum would be a more natural representation than bitfields.

>         } u;
>  } *used_atom;
>  static int used_atom_cnt, need_tagged, need_symref;
> @@ -92,6 +100,29 @@ void remote_ref_atom_parser(struct used_atom *atom)
> +void contents_atom_parser(struct used_atom *atom)
> +{
> +       const char * buf;
> +
> +       if (match_atom_name(atom->str, "contents", &buf))
> +               atom->u.contents.all = 1;
> +
> +       if (!buf)
> +               return;

Hmm, I'd have placed the blank line after the 'if (!buf) return;'
rather than before it.

> +       if (!strcmp(buf, "body"))
> +               atom->u.contents.body = 1;
> +       else if (!strcmp(buf, "signature"))
> +               atom->u.contents.signature = 1;
> +       else if (!strcmp(buf, "subject"))
> +               atom->u.contents.subject = 1;
> +       else if (skip_prefix(buf, "lines=", &buf)) {
> +               atom->u.contents.lines = 1;
> +               if (strtoul_ui(buf, 10, &atom->u.contents.no_lines))
> +                       die(_("positive value expected contents:lines=%s"), buf);
> +       } else
> +               die(_("improper format entered contents:%s"), buf);

Perhaps a more grammatically-friendly error message?

> +}
> +
>  static align_type get_align_position(const char *type)
>  {
>         if (!strcmp(type, "right"))
> @@ -761,20 +786,16 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct obj
>         unsigned long sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
>
>         for (i = 0; i < used_atom_cnt; i++) {
> -               const char *name = used_atom[i].str;
> +               struct used_atom *atom = &used_atom[i];
> +               const char *name = atom->str;
>                 struct atom_value *v = &val[i];
> -               const char *valp = NULL;
>                 if (!!deref != (*name == '*'))
>                         continue;
>                 if (deref)
>                         name++;
>                 if (strcmp(name, "subject") &&
>                     strcmp(name, "body") &&
> -                   strcmp(name, "contents") &&
> -                   strcmp(name, "contents:subject") &&
> -                   strcmp(name, "contents:body") &&
> -                   strcmp(name, "contents:signature") &&
> -                   !starts_with(name, "contents:lines="))
> +                   !atom->u.contents.all)
>                         continue;
>                 if (!subpos)
>                         find_subpos(buf, sz,
> @@ -784,26 +805,23 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct obj
>
>                 if (!strcmp(name, "subject"))
>                         v->s = copy_subject(subpos, sublen);
> -               else if (!strcmp(name, "contents:subject"))
> +               else if (atom->u.contents.subject)
>                         v->s = copy_subject(subpos, sublen);

With the disclaimer that I haven't fully digested the existing logic,
is there a reason that you don't also preprocess bare "subject" as you
preprocess "contents:subject"? Isn't "subject" just historic aliases
for "contents:subject"?

A similar observation may be made about "body" and "contents:body",
although I see they mean slightly different things (for, I suppose,
historical reasons).

>                 else if (!strcmp(name, "body"))
>                         v->s = xmemdupz(bodypos, bodylen);
> -               else if (!strcmp(name, "contents:body"))
> +               else if (atom->u.contents.body)
>                         v->s = xmemdupz(bodypos, nonsiglen);
> -               else if (!strcmp(name, "contents:signature"))
> +               else if (atom->u.contents.signature)
>                         v->s = xmemdupz(sigpos, siglen);
> -               else if (!strcmp(name, "contents"))
> -                       v->s = xstrdup(subpos);
> -               else if (skip_prefix(name, "contents:lines=", &valp)) {
> +               else if (atom->u.contents.lines) {
>                         struct strbuf s = STRBUF_INIT;
>                         const char *contents_end = bodylen + bodypos - siglen;
>
> -                       if (strtoul_ui(valp, 10, &v->u.contents.lines))
> -                               die(_("positive value expected contents:lines=%s"), valp);
>                         /*  Size is the length of the message after removing the signature */
> -                       append_lines(&s, subpos, contents_end - subpos, v->u.contents.lines);
> +                       append_lines(&s, subpos, contents_end - subpos, atom->u.contents.no_lines);
>                         v->s = strbuf_detach(&s, NULL);
> -               }
> +               } else /*  For %(contents) without modifiers */

Too many blanks following '/*'; downcase 'for' or drop it altogether:

    /* bare %(contents) */

> +                       v->s = xstrdup(subpos);
>         }
>  }
>
> --
> 2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser()
  2015-12-13  3:10   ` Eric Sunshine
@ 2015-12-13  4:41     ` Eric Sunshine
  2015-12-13 19:36       ` Karthik Nayak
  2015-12-13 19:33     ` Karthik Nayak
  1 sibling, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  4:41 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sat, Dec 12, 2015 at 10:10 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> +void contents_atom_parser(struct used_atom *atom)
>> +{
>> +       const char * buf;
>> +
>> +       if (match_atom_name(atom->str, "contents", &buf))
>> +               atom->u.contents.all = 1;
>> +
>> +       if (!buf)
>> +               return;

Also, isn't this logic somewhat bogus? u.contents.all is set to 1 for
both bare %(contents) and decorated %(contents:whatever). Then, below,
you also set one of .body, .signature, .subject, or .lines if a
decoration is specified. So, now you have both .all and one of the
other attributes set to 1, which is rather nonsensical (if I
understand correctly).

If you change this to an enum as suggested in my previous email, then
the problem goes away.

>> +       if (!strcmp(buf, "body"))
>> +               atom->u.contents.body = 1;
>> +       else if (!strcmp(buf, "signature"))
>> +               atom->u.contents.signature = 1;
>> +       else if (!strcmp(buf, "subject"))
>> +               atom->u.contents.subject = 1;
>> +       else if (skip_prefix(buf, "lines=", &buf)) {
>> +               atom->u.contents.lines = 1;
>> +               if (strtoul_ui(buf, 10, &atom->u.contents.no_lines))
>> +                       die(_("positive value expected contents:lines=%s"), buf);
>> +       } else
>> +               die(_("improper format entered contents:%s"), buf);
>> +}

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 10/10] ref-filter: introduce objectname_atom_parser()
  2015-11-25 13:44   ` [PATCH/RFC 10/10] ref-filter: introduce objectname_atom_parser() Karthik Nayak
@ 2015-12-13  4:49     ` Eric Sunshine
  2015-12-13 19:40       ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  4:49 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Junio C Hamano, Jeff King

On Wed, Nov 25, 2015 at 8:44 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Introduce objectname_atom_parser() which will parse the
> '%(objectname)' atom and store information into the 'used_atom'
> structure based on the modifiers used along with the atom.
>
> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -50,6 +50,10 @@ static struct used_atom {
>                                 lines : 1,
>                                 no_lines;
>                 } contents;
> +               struct {
> +                       unsigned int shorten : 1,
> +                               full : 1;
> +               } objectname;

Same comment as in my patch 8 and 9 reviews: If 'shorten' and 'full'
are mutually exclusive, then an enum would be clearer. In fact, if
there are only these two states (full and short), then this could be a
simple boolean named 'shorten'.

>         } u;
>  } *used_atom;
>  static int used_atom_cnt, need_tagged, need_symref;
> @@ -123,6 +127,21 @@ void contents_atom_parser(struct used_atom *atom)
> +void objectname_atom_parser(struct used_atom *atom)
> +{
> +       const char * buf;
> +
> +       if (match_atom_name(atom->str, "objectname", &buf))
> +               atom->u.objectname.full = 1;

Same comment about bogus logic as in patch 9 review: u.objectname.full
and u.objectname.shorten are both set to 1 for %(objectname:short).

> +
> +       if (!buf)
> +               return;

Same comment about misplaced blank line: Put the blank line after the
conditional rather than before or drop it altogether.

> +       if (!strcmp(buf, "short"))
> +               atom->u.objectname.shorten = 1;
> +       else
> +               die(_("improper format entered objectname:%s"), buf);

Maybe just "unrecognized objectname:%s" or something?

> +}
> +
> @@ -463,15 +482,16 @@ static void *get_obj(const unsigned char *sha1, struct object **obj, unsigned lo
>  }
>
>  static int grab_objectname(const char *name, const unsigned char *sha1,
> -                           struct atom_value *v)
> +                          struct atom_value *v, struct used_atom *atom)
>  {
> -       if (!strcmp(name, "objectname")) {
> -               v->s = xstrdup(sha1_to_hex(sha1));
> -               return 1;
> -       }
> -       if (!strcmp(name, "objectname:short")) {
> -               v->s = xstrdup(find_unique_abbrev(sha1, DEFAULT_ABBREV));
> -               return 1;
> +       if (starts_with(name, "objectname")) {
> +               if (atom->u.objectname.shorten) {
> +                       v->s = xstrdup(find_unique_abbrev(sha1, DEFAULT_ABBREV));
> +                       return 1;
> +               } else if (atom->u.objectname.full) {
> +                       v->s = xstrdup(sha1_to_hex(sha1));
> +                       return 1;
> +               }
>         }
>         return 0;
>  }
> @@ -495,7 +515,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct object
>                         v->s = xstrfmt("%lu", sz);
>                 }
>                 else if (deref)
> -                       grab_objectname(name, obj->sha1, v);
> +                       grab_objectname(name, obj->sha1, v, &used_atom[i]);
>         }
>  }
>
> @@ -1004,7 +1024,7 @@ static void populate_value(struct ref_array_item *ref)
>                                 v->s = xstrdup(buf + 1);
>                         }
>                         continue;
> -               } else if (!deref && grab_objectname(name, ref->objectname, v)) {
> +               } else if (!deref && grab_objectname(name, ref->objectname, v, atom)) {
>                         continue;
>                 } else if (!strcmp(name, "HEAD")) {
>                         const char *head;
> --
> 2.6.2

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-12-11 22:49 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Junio C Hamano
@ 2015-12-13  5:40   ` Eric Sunshine
  2015-12-13  9:31     ` Karthik Nayak
  2015-12-14 19:06     ` Junio C Hamano
  0 siblings, 2 replies; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  5:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Karthik Nayak, Git List, Matthieu Moy

On Fri, Dec 11, 2015 at 5:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
>>   ref-filter: introduce a parsing function for each atom in valid_atom
>>   ref-filter: introduce struct used_atom
>>   ref-fitler: bump match_atom() name to the top
>>   ref-filter: skip deref specifier in match_atom_name()
>>   ref-filter: introduce color_atom_parser()
>>   strbuf: introduce strbuf_split_str_without_term()
>>   ref-filter: introduce align_atom_parser()
>>   ref-filter: introduce remote_ref_atom_parser()
>>   ref-filter: introduce contents_atom_parser()
>>   ref-filter: introduce objectname_atom_parser()
>
> It seems that this series had seen quite a good inputs, mostly from
> Eric.  I finished reading it over and I didn't find anything more to
> add.  The patches are mostly good and would be ready once these
> points raised during the review are addressed, I think

I'm still a bit fuzzy about what this series is trying to achieve. It
feels like it wants to avoid doing repeated processing of unchanging
bits of %(foo:bar) atoms for each ref processed, but it only partly
achieves that goal. For instance, there are still an awful lot of
strcmp()s and starts_with()s in that inner loop, and even the
unchanging %(color:) argument gets re-evaulated repeatedly, which is
probably quite expensive.

If the intention is to rid that inner loop of much of the expensive
processing, then wouldn't we want to introduce an enum of valid atoms
which is to be a member of 'struct used_atom', and have
populate_value() switch on the enum value rather than doing all the
expensive strcmp()s and starts_with()?

    enum atom_type {
        AT_REFNAME,
        AT_OBJECTTYPE,
        ...
        AT_COLOR,
        AT_ALIGN
    };

    static struct used_atom {
        enum atom_type atom;
        cmp_type cmp;
        union {
            char *color; /* parsed color */
            struct align align;
            enum { ... } remote_ref;
            struct {
                enum { ... } portion;
                unsigned int nlines;
            } contents;
            int short_objname;
        } u;
    } *used_atom;

In fact, the 'cmp_type cmp' field can be dropped altogether since it
can just as easily be looked up when needed by keeping 'enum
atom_type' and valid_atoms[] in-sync and indexing into valid_atoms[]
by atom_type:

    struct used_atom *a = ...;
    cmp_type cmp = valid_atoms[a->atom].cmp_type;

As a bonus, an 'enum atom_type' would resolve the problem of how
starts_with() is abused in populate_value() for certain cases
(assuming I'm understanding the logic), such as how matching of
"color" could incorrectly match some yet-to-be-added atom named
"colorize".

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser()
  2015-12-13  0:53   ` Eric Sunshine
@ 2015-12-13  6:02     ` Karthik Nayak
  2015-12-13  6:15       ` Eric Sunshine
  2015-12-13  8:32       ` Karthik Nayak
  0 siblings, 2 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13  6:02 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 6:23 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> Introduce remote_ref_atom_parser() which will parse the '%(upstream)'
>> and '%(push)' atoms and store information into the 'used_atom'
>> structure based on the modifiers used along with the corresponding
>> atom.
>>
>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>> ---
>> diff --git a/ref-filter.c b/ref-filter.c
>> @@ -37,6 +37,11 @@ static struct used_atom {
>>         union {
>>                 const char *color;
>>                 struct align align;
>> +               struct {
>> +                       unsigned int shorten : 1,
>> +                               track : 1,
>> +                               trackshort : 1;
>> +               } remote_ref;
>
> Are 'shorten', 'track', and 'trackshort' mutually exclusive? If so, a
> simple enum would be clearer than bitfields:
>
>     union {
>         const char *color;
>         struct align align;
>         enum { RR_PLAIN, RR_SHORTEN, RR_TRACK, RR_TRACKSHORT }
>             remote_ref;
>     };
>
> Or something.
>

Sure, will do that.

>>         } u;
>>  } *used_atom;
>>  static int used_atom_cnt, need_tagged, need_symref;
>> @@ -69,6 +74,24 @@ void color_atom_parser(struct used_atom *atom)
>> +void remote_ref_atom_parser(struct used_atom *atom)
>> +{
>> +       const char *buf;
>> +
>> +       buf = strchr(atom->str, ':');
>> +       if (!buf)
>> +               return;
>> +       buf++;
>> +       if (!strcmp(buf, "short"))
>> +               atom->u.remote_ref.shorten = 1;
>> +       else if (!strcmp(buf, "track"))
>> +               atom->u.remote_ref.track = 1;
>> +       else if (!strcmp(buf, "trackshort"))
>> +               atom->u.remote_ref.trackshort = 1;
>> +       else
>> +               die(_("improper format entered align:%s"), buf);
>
> "align:"? Also, how about a more grammatically-friendly error message?
>

Bad copy-paste, will change that.

>> +}
>> +
>> @@ -838,6 +861,42 @@ static inline char *copy_advance(char *dst, const char *src)
>> +static void fill_remote_ref_details(struct used_atom *atom, const char *refname,
>> +                                   struct branch *branch, const char **s)
>> +{
>> +       int num_ours, num_theirs;
>> +       if (atom->u.remote_ref.shorten)
>> +               *s = shorten_unambiguous_ref(refname, warn_ambiguous_refs);
>> +       else if (atom->u.remote_ref.track) {
>> +               if (stat_tracking_info(branch, &num_ours,
>> +                                      &num_theirs, NULL))
>> +                       return;
>> +               if (!num_ours && !num_theirs)
>> +                       *s = "";
>> +               else if (!num_ours)
>> +                       *s = xstrfmt("[behind %d]", num_theirs);
>> +               else if (!num_theirs)
>> +                       *s = xstrfmt("[ahead %d]", num_ours);
>> +               else
>> +                       *s = xstrfmt("[ahead %d, behind %d]",
>> +                                    num_ours, num_theirs);
>
> Tangent: These xstrfmt()'d strings are getting leaked, right? Is that
> something that we need to worry about (if, for instance, a repository
> contains a lot of tracking refs)? Should there be a NEEDSWORK comment
> here regarding the issue?
>

This is sort of a problem with most of the values in ref-filter, we dynamically
allocate memory and do not free it, since the program exits soon after and
we leave it to the Operating System to do the garbage collection.

Not sure if we'd want to work on this though.

>> +       } else if (atom->u.remote_ref.trackshort) {
>> +               if (stat_tracking_info(branch, &num_ours,
>> +                                      &num_theirs, NULL))
>> +                       return;
>> +
>> +               if (!num_ours && !num_theirs)
>> +                       *s = "=";
>> +               else if (!num_ours)
>> +                       *s = "<";
>> +               else if (!num_theirs)
>> +                       *s = ">";
>> +               else
>> +                       *s = "<>";
>> +       } else
>> +               *s = refname;
>> +}
>> +
>>  /*
>>   * Parse the object referred by ref, and grab needed value.
>>   */
>> @@ -948,49 +1011,11 @@ static void populate_value(struct ref_array_item *ref)
>>
>>                 formatp = strchr(name, ':');
>>                 if (formatp) {
>> -                       int num_ours, num_theirs;
>> -
>>                         formatp++;
>>                         if (!strcmp(formatp, "short"))
>>                                 refname = shorten_unambiguous_ref(refname,
>>                                                       warn_ambiguous_refs);
>
> Is this duplicating work already done by fill_remote_ref_details()?
>

No, this is only activated when using "refname". fill_remote_ref_details()
is used by the %(upstream) and %(push) atoms, both of which skip the
loop using "continue" in populate_value().

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser()
  2015-12-03 13:35     ` Karthik Nayak
@ 2015-12-13  6:05       ` Eric Sunshine
  2015-12-13 18:46         ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  6:05 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Thu, Dec 3, 2015 at 8:35 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> On Wed, Dec 2, 2015 at 4:57 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>> @@ -833,11 +846,10 @@ static void populate_value(struct ref_array_item *ref)
>>>                         refname = branch_get_push(branch, NULL);
>>>                         if (!refname)
>>>                                 continue;
>>> -               } else if (match_atom_name(name, "color", &valp)) {
>>> +               } else if (starts_with(name, "color")) {
>>
>> Hmm, so this will also match "colorize". Is that desirable?
>
> Well the error checking is done when we parse the atom in color_atom_parser()
> so here we don't need to worry about something like this.

I'm not sure that I understand your response. Let's say that, in the
future, someone adds a new atom named "colorize" (which may or may not
have its own parser in the valid_atom[] table). color_atom_parser()
will never see that atom, thus error checking in color_atom_parser()
is not relevant to this case. What is relevant is that the original
code:

    } if (match_atom_name(name, "color", &valp)) {

only matched %(color) or %(color:whatever). It did not match
%(colorize). However, the new code:

    } else if (starts_with(name, "color")) {

is far looser and will match %(colorize) and %(color) and
%(color:whatever) and %(coloranything), which is potentially
undesirable. It's true that the person adding %(colorize) could be
careful and ensure that the if/else chain checks %(colorize) first:

    } else if (!strcmp(name, "colorize") {
        ...
    } else if (starts_with(name, "color")) {
        ...
    } else ...

but that places a certain extra burden on that person. Alternately,
you can tighten the matching so that it is as strict as the original:

    } else if (!strcmp(name, "color") || starts_with(name, "color:")) {
        ...
    } else ...

Or perhaps upgrade match_atom_name() to make the 'val' argument
optional, in which case you might be able to do something like this:

    } else if (match_atom_name(name, "color", NULL) {

(However, if you introduce an 'enum atom_type' as suggested in my
response to the cover letter, then this problem goes away because
you'd be switching on the enum value, which is determinate, rather
than on a partial string, which may be ambiguous, as illustrated.)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser()
  2015-12-13  6:02     ` Karthik Nayak
@ 2015-12-13  6:15       ` Eric Sunshine
  2015-12-13  8:32       ` Karthik Nayak
  1 sibling, 0 replies; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  6:15 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 1:02 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> On Sun, Dec 13, 2015 at 6:23 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>> +               if (!num_ours && !num_theirs)
>>> +                       *s = "";
>>> +               else if (!num_ours)
>>> +                       *s = xstrfmt("[behind %d]", num_theirs);
>>> +               else if (!num_theirs)
>>> +                       *s = xstrfmt("[ahead %d]", num_ours);
>>> +               else
>>> +                       *s = xstrfmt("[ahead %d, behind %d]",
>>> +                                    num_ours, num_theirs);
>>
>> Tangent: These xstrfmt()'d strings are getting leaked, right? Is that
>> something that we need to worry about (if, for instance, a repository
>> contains a lot of tracking refs)? Should there be a NEEDSWORK comment
>> here regarding the issue?
>
> This is sort of a problem with most of the values in ref-filter, we dynamically
> allocate memory and do not free it, since the program exits soon after and
> we leave it to the Operating System to do the garbage collection.

I'm not worried about memory dynamically allocated for the used_atom[]
array being leaked (and cleaned up automatically at program exit), but
rather about memory being leaked for each processed reference, which
might become substantial for a project with a lot of references.

> Not sure if we'd want to work on this though.

It's likely outside the scope of the current patch series anyhow, and
probably not something that needs to be tackled right away (or perhaps
ever), which is why a NEEDSWORK comment might be appropriate, as a
reminder that the situation could be improved.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser()
  2015-12-13  6:02     ` Karthik Nayak
  2015-12-13  6:15       ` Eric Sunshine
@ 2015-12-13  8:32       ` Karthik Nayak
  2015-12-13  8:45         ` Eric Sunshine
  1 sibling, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13  8:32 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 11:32 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> On Sun, Dec 13, 2015 at 6:23 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>> Introduce remote_ref_atom_parser() which will parse the '%(upstream)'
>>> and '%(push)' atoms and store information into the 'used_atom'
>>> structure based on the modifiers used along with the corresponding
>>> atom.
>>>
>>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>>> ---
>>> diff --git a/ref-filter.c b/ref-filter.c
>>> @@ -37,6 +37,11 @@ static struct used_atom {
>>>         union {
>>>                 const char *color;
>>>                 struct align align;
>>> +               struct {
>>> +                       unsigned int shorten : 1,
>>> +                               track : 1,
>>> +                               trackshort : 1;
>>> +               } remote_ref;
>>
>> Are 'shorten', 'track', and 'trackshort' mutually exclusive? If so, a
>> simple enum would be clearer than bitfields:
>>
>>     union {
>>         const char *color;
>>         struct align align;
>>         enum { RR_PLAIN, RR_SHORTEN, RR_TRACK, RR_TRACKSHORT }
>>             remote_ref;
>>     };
>>
>> Or something.
>>
>
> Sure, will do that.
>

There's also a slight problem with using enum's with the current implementation.
The problem is the enum is set to 0 by default (since we use memset).
so the first
value is set by default, not something we'd want. So either we stick
to the structure
with unsigned bits or we introduce a pseudo value in the enum. I
prefer the former.



On Sun, Dec 13, 2015 at 11:45 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sun, Dec 13, 2015 at 1:02 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> On Sun, Dec 13, 2015 at 6:23 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>>> +               if (!num_ours && !num_theirs)
>>>> +                       *s = "";
>>>> +               else if (!num_ours)
>>>> +                       *s = xstrfmt("[behind %d]", num_theirs);
>>>> +               else if (!num_theirs)
>>>> +                       *s = xstrfmt("[ahead %d]", num_ours);
>>>> +               else
>>>> +                       *s = xstrfmt("[ahead %d, behind %d]",
>>>> +                                    num_ours, num_theirs);
>>>
>>> Tangent: These xstrfmt()'d strings are getting leaked, right? Is that
>>> something that we need to worry about (if, for instance, a repository
>>> contains a lot of tracking refs)? Should there be a NEEDSWORK comment
>>> here regarding the issue?
>>
>> This is sort of a problem with most of the values in ref-filter, we dynamically
>> allocate memory and do not free it, since the program exits soon after and
>> we leave it to the Operating System to do the garbage collection.
>
> I'm not worried about memory dynamically allocated for the used_atom[]
> array being leaked (and cleaned up automatically at program exit), but
> rather about memory being leaked for each processed reference, which
> might become substantial for a project with a lot of references.
>
>> Not sure if we'd want to work on this though.
>
> It's likely outside the scope of the current patch series anyhow, and
> probably not something that needs to be tackled right away (or perhaps
> ever), which is why a NEEDSWORK comment might be appropriate, as a
> reminder that the situation could be improved.

Yes I got what you're saying. I'm talking about other values which are
dynamically
allocated in ref-filter itself.

For e.g.

When we use the deref option
v->s = xstrfmt("%s^{}", refname);

or the color option
v->s = xstrdup(color);

So seems like we need a way to go around all of these.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser()
  2015-12-13  8:32       ` Karthik Nayak
@ 2015-12-13  8:45         ` Eric Sunshine
  2015-12-13  8:53           ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13  8:45 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 3:32 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> On Sun, Dec 13, 2015 at 6:23 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>>> Introduce remote_ref_atom_parser() which will parse the '%(upstream)'
>>>> and '%(push)' atoms and store information into the 'used_atom'
>>>> structure based on the modifiers used along with the corresponding
>>>> atom.
>>>>
>>>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>>>> ---
>>>> diff --git a/ref-filter.c b/ref-filter.c
>>>> @@ -37,6 +37,11 @@ static struct used_atom {
>>>>         union {
>>>>                 const char *color;
>>>>                 struct align align;
>>>> +               struct {
>>>> +                       unsigned int shorten : 1,
>>>> +                               track : 1,
>>>> +                               trackshort : 1;
>>>> +               } remote_ref;
>>>
>>> Are 'shorten', 'track', and 'trackshort' mutually exclusive? If so, a
>>> simple enum would be clearer than bitfields:
>>>
>>>     union {
>>>         const char *color;
>>>         struct align align;
>>>         enum { RR_PLAIN, RR_SHORTEN, RR_TRACK, RR_TRACKSHORT }
>>>             remote_ref;
>>>     };
>>>
>>> Or something.
>
> There's also a slight problem with using enum's with the current implementation.
> The problem is the enum is set to 0 by default (since we use memset).
> so the first value is set by default, not something we'd want.

I'm afraid I don't see the problem. Doesn't the RR_PLAIN in the
example cover this case?

> So either we stick to the structure
> with unsigned bits or we introduce a pseudo value in the enum. I
> prefer the former.

It's not a pseudo-value, but rather just one of the mutually exclusive states.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser()
  2015-12-13  8:45         ` Eric Sunshine
@ 2015-12-13  8:53           ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13  8:53 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 2:15 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sun, Dec 13, 2015 at 3:32 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>> On Sun, Dec 13, 2015 at 6:23 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>>>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>>>> Introduce remote_ref_atom_parser() which will parse the '%(upstream)'
>>>>> and '%(push)' atoms and store information into the 'used_atom'
>>>>> structure based on the modifiers used along with the corresponding
>>>>> atom.
>>>>>
>>>>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>>>>> ---
>>>>> diff --git a/ref-filter.c b/ref-filter.c
>>>>> @@ -37,6 +37,11 @@ static struct used_atom {
>>>>>         union {
>>>>>                 const char *color;
>>>>>                 struct align align;
>>>>> +               struct {
>>>>> +                       unsigned int shorten : 1,
>>>>> +                               track : 1,
>>>>> +                               trackshort : 1;
>>>>> +               } remote_ref;
>>>>
>>>> Are 'shorten', 'track', and 'trackshort' mutually exclusive? If so, a
>>>> simple enum would be clearer than bitfields:
>>>>
>>>>     union {
>>>>         const char *color;
>>>>         struct align align;
>>>>         enum { RR_PLAIN, RR_SHORTEN, RR_TRACK, RR_TRACKSHORT }
>>>>             remote_ref;
>>>>     };
>>>>
>>>> Or something.
>>
>> There's also a slight problem with using enum's with the current implementation.
>> The problem is the enum is set to 0 by default (since we use memset).
>> so the first value is set by default, not something we'd want.
>
> I'm afraid I don't see the problem. Doesn't the RR_PLAIN in the
> example cover this case?
>
>> So either we stick to the structure
>> with unsigned bits or we introduce a pseudo value in the enum. I
>> prefer the former.
>
> It's not a pseudo-value, but rather just one of the mutually exclusive states.

This example is actually fine, the next one "%(contents)" is more of
the problem,
the check is done in grab_sub_body_contents() where previously "contents.all"
would be enough to check if we need to add contents value. Now the first enum
value is selected. Maybe have a "NOT_VALID" field in the enum.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-12-13  5:40   ` Eric Sunshine
@ 2015-12-13  9:31     ` Karthik Nayak
  2015-12-13 21:55       ` Eric Sunshine
  2015-12-14 19:06     ` Junio C Hamano
  1 sibling, 1 reply; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13  9:31 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, Git List, Matthieu Moy

On Sun, Dec 13, 2015 at 11:10 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Fri, Dec 11, 2015 at 5:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>>   ref-filter: introduce a parsing function for each atom in valid_atom
>>>   ref-filter: introduce struct used_atom
>>>   ref-fitler: bump match_atom() name to the top
>>>   ref-filter: skip deref specifier in match_atom_name()
>>>   ref-filter: introduce color_atom_parser()
>>>   strbuf: introduce strbuf_split_str_without_term()
>>>   ref-filter: introduce align_atom_parser()
>>>   ref-filter: introduce remote_ref_atom_parser()
>>>   ref-filter: introduce contents_atom_parser()
>>>   ref-filter: introduce objectname_atom_parser()
>>
>> It seems that this series had seen quite a good inputs, mostly from
>> Eric.  I finished reading it over and I didn't find anything more to
>> add.  The patches are mostly good and would be ready once these
>> points raised during the review are addressed, I think
>
> I'm still a bit fuzzy about what this series is trying to achieve. It
> feels like it wants to avoid doing repeated processing of unchanging
> bits of %(foo:bar) atoms for each ref processed, but it only partly
> achieves that goal. For instance, there are still an awful lot of
> strcmp()s and starts_with()s in that inner loop, and even the
> unchanging %(color:) argument gets re-evaulated repeatedly, which is
> probably quite expensive.
>

Yes, you're right. It started off as per Junio's suggestion which you
can find here: http://thread.gmane.org/gmane.comp.version-control.git/279254

> If the intention is to rid that inner loop of much of the expensive
> processing, then wouldn't we want to introduce an enum of valid atoms
> which is to be a member of 'struct used_atom', and have
> populate_value() switch on the enum value rather than doing all the
> expensive strcmp()s and starts_with()?
>
>     enum atom_type {
>         AT_REFNAME,
>         AT_OBJECTTYPE,
>         ...
>         AT_COLOR,
>         AT_ALIGN
>     };
>
>     static struct used_atom {
>         enum atom_type atom;
>         cmp_type cmp;
>         union {
>             char *color; /* parsed color */
>             struct align align;
>             enum { ... } remote_ref;
>             struct {
>                 enum { ... } portion;
>                 unsigned int nlines;
>             } contents;
>             int short_objname;
>         } u;
>     } *used_atom;
>
> In fact, the 'cmp_type cmp' field can be dropped altogether since it
> can just as easily be looked up when needed by keeping 'enum
> atom_type' and valid_atoms[] in-sync and indexing into valid_atoms[]
> by atom_type:
>
>     struct used_atom *a = ...;
>     cmp_type cmp = valid_atoms[a->atom].cmp_type;
>
> As a bonus, an 'enum atom_type' would resolve the problem of how
> starts_with() is abused in populate_value() for certain cases
> (assuming I'm understanding the logic), such as how matching of
> "color" could incorrectly match some yet-to-be-added atom named
> "colorize".

This actually makes sense to me, especially how we can eliminate most of
the starts_with() in populate_value(). Well then I guess I'll work on this and
see what I can come up with, Thanks for the review :)

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser()
  2015-12-13  6:05       ` Eric Sunshine
@ 2015-12-13 18:46         ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13 18:46 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 11:35 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Thu, Dec 3, 2015 at 8:35 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> On Wed, Dec 2, 2015 at 4:57 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>>> @@ -833,11 +846,10 @@ static void populate_value(struct ref_array_item *ref)
>>>>                         refname = branch_get_push(branch, NULL);
>>>>                         if (!refname)
>>>>                                 continue;
>>>> -               } else if (match_atom_name(name, "color", &valp)) {
>>>> +               } else if (starts_with(name, "color")) {
>>>
>>> Hmm, so this will also match "colorize". Is that desirable?
>>
>> Well the error checking is done when we parse the atom in color_atom_parser()
>> so here we don't need to worry about something like this.
>
> I'm not sure that I understand your response. Let's say that, in the
> future, someone adds a new atom named "colorize" (which may or may not
> have its own parser in the valid_atom[] table). color_atom_parser()
> will never see that atom, thus error checking in color_atom_parser()
> is not relevant to this case. What is relevant is that the original
> code:
>
>     } if (match_atom_name(name, "color", &valp)) {
>
> only matched %(color) or %(color:whatever). It did not match
> %(colorize). However, the new code:
>
>     } else if (starts_with(name, "color")) {
>
> is far looser and will match %(colorize) and %(color) and
> %(color:whatever) and %(coloranything), which is potentially
> undesirable. It's true that the person adding %(colorize) could be
> careful and ensure that the if/else chain checks %(colorize) first:
>
>     } else if (!strcmp(name, "colorize") {
>         ...
>     } else if (starts_with(name, "color")) {
>         ...
>     } else ...
>
> but that places a certain extra burden on that person. Alternately,
> you can tighten the matching so that it is as strict as the original:
>
>     } else if (!strcmp(name, "color") || starts_with(name, "color:")) {
>         ...
>     } else ...
>
> Or perhaps upgrade match_atom_name() to make the 'val' argument
> optional, in which case you might be able to do something like this:
>
>     } else if (match_atom_name(name, "color", NULL) {
>
> (However, if you introduce an 'enum atom_type' as suggested in my
> response to the cover letter, then this problem goes away because
> you'd be switching on the enum value, which is determinate, rather
> than on a partial string, which may be ambiguous, as illustrated.)

Ah, Thanks for putting it out. I understand, I'll work on the enum atom_type
now. that should take care of this :)

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser()
  2015-12-13  3:10   ` Eric Sunshine
  2015-12-13  4:41     ` Eric Sunshine
@ 2015-12-13 19:33     ` Karthik Nayak
  1 sibling, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13 19:33 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 8:40 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> Introduce contents_atom_parser() which will parse the '%(contents)'
>> atom and store information into the 'used_atom' structure based on the
>> modifiers used along with the atom.
>>
>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>> ---
>> diff --git a/ref-filter.c b/ref-filter.c
>> index 802629b..117bbbb 100644
>> --- a/ref-filter.c
>> +++ b/ref-filter.c
>> @@ -42,6 +42,14 @@ static struct used_atom {
>>                                 track : 1,
>>                                 trackshort : 1;
>>                 } remote_ref;
>> +               struct {
>> +                       unsigned int subject : 1,
>> +                               body : 1,
>> +                               signature : 1,
>> +                               all : 1,
>> +                               lines : 1,
>> +                               no_lines;
>> +               } contents;
>
> Same question as patch 8/10: With the exception of non-bitfield
> 'no_lines', are these 'contents' flags mutually exclusive? If so, an
> enum would be a more natural representation than bitfields.
>

Like I said, an enum would cause problems here.
If you see the code flow when we check for contents we currently (as
per this patch)
use the 'all' bit to denote usage of contents, but when using an enum we would
set the enum to the first value when using memset.

But this should go away with the 'enum atom_type' implementation which
you suggested.

>>         } u;
>>  } *used_atom;
>>  static int used_atom_cnt, need_tagged, need_symref;
>> @@ -92,6 +100,29 @@ void remote_ref_atom_parser(struct used_atom *atom)
>> +void contents_atom_parser(struct used_atom *atom)
>> +{
>> +       const char * buf;
>> +
>> +       if (match_atom_name(atom->str, "contents", &buf))
>> +               atom->u.contents.all = 1;
>> +
>> +       if (!buf)
>> +               return;
>
> Hmm, I'd have placed the blank line after the 'if (!buf) return;'
> rather than before it.
>

Will change that.

>> +       if (!strcmp(buf, "body"))
>> +               atom->u.contents.body = 1;
>> +       else if (!strcmp(buf, "signature"))
>> +               atom->u.contents.signature = 1;
>> +       else if (!strcmp(buf, "subject"))
>> +               atom->u.contents.subject = 1;
>> +       else if (skip_prefix(buf, "lines=", &buf)) {
>> +               atom->u.contents.lines = 1;
>> +               if (strtoul_ui(buf, 10, &atom->u.contents.no_lines))
>> +                       die(_("positive value expected contents:lines=%s"), buf);
>> +       } else
>> +               die(_("improper format entered contents:%s"), buf);
>
> Perhaps a more grammatically-friendly error message?
>

Will do.

>> +}
>> +
>>  static align_type get_align_position(const char *type)
>>  {
>>         if (!strcmp(type, "right"))
>> @@ -761,20 +786,16 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct obj
>>         unsigned long sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
>>
>>         for (i = 0; i < used_atom_cnt; i++) {
>> -               const char *name = used_atom[i].str;
>> +               struct used_atom *atom = &used_atom[i];
>> +               const char *name = atom->str;
>>                 struct atom_value *v = &val[i];
>> -               const char *valp = NULL;
>>                 if (!!deref != (*name == '*'))
>>                         continue;
>>                 if (deref)
>>                         name++;
>>                 if (strcmp(name, "subject") &&
>>                     strcmp(name, "body") &&
>> -                   strcmp(name, "contents") &&
>> -                   strcmp(name, "contents:subject") &&
>> -                   strcmp(name, "contents:body") &&
>> -                   strcmp(name, "contents:signature") &&
>> -                   !starts_with(name, "contents:lines="))
>> +                   !atom->u.contents.all)
>>                         continue;
>>                 if (!subpos)
>>                         find_subpos(buf, sz,
>> @@ -784,26 +805,23 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct obj
>>
>>                 if (!strcmp(name, "subject"))
>>                         v->s = copy_subject(subpos, sublen);
>> -               else if (!strcmp(name, "contents:subject"))
>> +               else if (atom->u.contents.subject)
>>                         v->s = copy_subject(subpos, sublen);
>
> With the disclaimer that I haven't fully digested the existing logic,
> is there a reason that you don't also preprocess bare "subject" as you
> preprocess "contents:subject"? Isn't "subject" just historic aliases
> for "contents:subject"?
>
> A similar observation may be made about "body" and "contents:body",
> although I see they mean slightly different things (for, I suppose,
> historical reasons).
>

Actually I missed that out, now that you mention it, i'll see how I
can come around this.

>>                 else if (!strcmp(name, "body"))
>>                         v->s = xmemdupz(bodypos, bodylen);
>> -               else if (!strcmp(name, "contents:body"))
>> +               else if (atom->u.contents.body)
>>                         v->s = xmemdupz(bodypos, nonsiglen);
>> -               else if (!strcmp(name, "contents:signature"))
>> +               else if (atom->u.contents.signature)
>>                         v->s = xmemdupz(sigpos, siglen);
>> -               else if (!strcmp(name, "contents"))
>> -                       v->s = xstrdup(subpos);
>> -               else if (skip_prefix(name, "contents:lines=", &valp)) {
>> +               else if (atom->u.contents.lines) {
>>                         struct strbuf s = STRBUF_INIT;
>>                         const char *contents_end = bodylen + bodypos - siglen;
>>
>> -                       if (strtoul_ui(valp, 10, &v->u.contents.lines))
>> -                               die(_("positive value expected contents:lines=%s"), valp);
>>                         /*  Size is the length of the message after removing the signature */
>> -                       append_lines(&s, subpos, contents_end - subpos, v->u.contents.lines);
>> +                       append_lines(&s, subpos, contents_end - subpos, atom->u.contents.no_lines);
>>                         v->s = strbuf_detach(&s, NULL);
>> -               }
>> +               } else /*  For %(contents) without modifiers */
>
> Too many blanks following '/*'; downcase 'for' or drop it altogether:
>
>     /* bare %(contents) */
>

Will do thanks.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser()
  2015-12-13  4:41     ` Eric Sunshine
@ 2015-12-13 19:36       ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13 19:36 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Matthieu Moy, Junio C Hamano

On Sun, Dec 13, 2015 at 10:11 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sat, Dec 12, 2015 at 10:10 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@gmail.com> wrote:
>>> +void contents_atom_parser(struct used_atom *atom)
>>> +{
>>> +       const char * buf;
>>> +
>>> +       if (match_atom_name(atom->str, "contents", &buf))
>>> +               atom->u.contents.all = 1;
>>> +
>>> +       if (!buf)
>>> +               return;
>
> Also, isn't this logic somewhat bogus? u.contents.all is set to 1 for
> both bare %(contents) and decorated %(contents:whatever). Then, below,
> you also set one of .body, .signature, .subject, or .lines if a
> decoration is specified. So, now you have both .all and one of the
> other attributes set to 1, which is rather nonsensical (if I
> understand correctly).
>

The problem is its not mutually exclusive here, the 'all' was supposed to
act as a way of checking if its a contents atom, since populate_value() doesn't
really check that again.

So if any of the others were selected we would implement %(contents:<value>)
else the bare %(contents:<value>) would be selected.

> If you change this to an enum as suggested in my previous email, then
> the problem goes away.
>

Yup. Will do. Thanks

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 10/10] ref-filter: introduce objectname_atom_parser()
  2015-12-13  4:49     ` Eric Sunshine
@ 2015-12-13 19:40       ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-13 19:40 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Junio C Hamano, Jeff King

On Sun, Dec 13, 2015 at 10:19 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Nov 25, 2015 at 8:44 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> Introduce objectname_atom_parser() which will parse the
>> '%(objectname)' atom and store information into the 'used_atom'
>> structure based on the modifiers used along with the atom.
>>
>> Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
>> ---
>> diff --git a/ref-filter.c b/ref-filter.c
>> @@ -50,6 +50,10 @@ static struct used_atom {
>>                                 lines : 1,
>>                                 no_lines;
>>                 } contents;
>> +               struct {
>> +                       unsigned int shorten : 1,
>> +                               full : 1;
>> +               } objectname;
>
> Same comment as in my patch 8 and 9 reviews: If 'shorten' and 'full'
> are mutually exclusive, then an enum would be clearer. In fact, if
> there are only these two states (full and short), then this could be a
> simple boolean named 'shorten'.
>
>>         } u;
>>  } *used_atom;
>>  static int used_atom_cnt, need_tagged, need_symref;
>> @@ -123,6 +127,21 @@ void contents_atom_parser(struct used_atom *atom)
>> +void objectname_atom_parser(struct used_atom *atom)
>> +{
>> +       const char * buf;
>> +
>> +       if (match_atom_name(atom->str, "objectname", &buf))
>> +               atom->u.objectname.full = 1;
>
> Same comment about bogus logic as in patch 9 review: u.objectname.full
> and u.objectname.shorten are both set to 1 for %(objectname:short).
>

I guess I responded to the same issue, will work on it.

>> +
>> +       if (!buf)
>> +               return;
>
> Same comment about misplaced blank line: Put the blank line after the
> conditional rather than before or drop it altogether.
>

Will change.

>> +       if (!strcmp(buf, "short"))
>> +               atom->u.objectname.shorten = 1;
>> +       else
>> +               die(_("improper format entered objectname:%s"), buf);
>
> Maybe just "unrecognized objectname:%s" or something?
>

die(_("unrecognized %%(objectname) argument: %s"), buf);
to keep things consistent

>> +}
>> +
>> @@ -463,15 +482,16 @@ static void *get_obj(const unsigned char *sha1, struct object **obj, unsigned lo
>>  }
>>
>>  static int grab_objectname(const char *name, const unsigned char *sha1,
>> -                           struct atom_value *v)
>> +                          struct atom_value *v, struct used_atom *atom)
>>  {
>> -       if (!strcmp(name, "objectname")) {
>> -               v->s = xstrdup(sha1_to_hex(sha1));
>> -               return 1;
>> -       }
>> -       if (!strcmp(name, "objectname:short")) {
>> -               v->s = xstrdup(find_unique_abbrev(sha1, DEFAULT_ABBREV));
>> -               return 1;
>> +       if (starts_with(name, "objectname")) {
>> +               if (atom->u.objectname.shorten) {
>> +                       v->s = xstrdup(find_unique_abbrev(sha1, DEFAULT_ABBREV));
>> +                       return 1;
>> +               } else if (atom->u.objectname.full) {
>> +                       v->s = xstrdup(sha1_to_hex(sha1));
>> +                       return 1;
>> +               }
>>         }
>>         return 0;
>>  }
>> @@ -495,7 +515,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct object
>>                         v->s = xstrfmt("%lu", sz);
>>                 }
>>                 else if (deref)
>> -                       grab_objectname(name, obj->sha1, v);
>> +                       grab_objectname(name, obj->sha1, v, &used_atom[i]);
>>         }
>>  }
>>
>> @@ -1004,7 +1024,7 @@ static void populate_value(struct ref_array_item *ref)
>>                                 v->s = xstrdup(buf + 1);
>>                         }
>>                         continue;
>> -               } else if (!deref && grab_objectname(name, ref->objectname, v)) {
>> +               } else if (!deref && grab_objectname(name, ref->objectname, v, atom)) {
>>                         continue;
>>                 } else if (!strcmp(name, "HEAD")) {
>>                         const char *head;
>> --
>> 2.6.2



-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-12-13  9:31     ` Karthik Nayak
@ 2015-12-13 21:55       ` Eric Sunshine
  2015-12-16 14:45         ` Karthik Nayak
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sunshine @ 2015-12-13 21:55 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Junio C Hamano, Git List, Matthieu Moy

On Sun, Dec 13, 2015 at 4:31 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> On Sun, Dec 13, 2015 at 11:10 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> If the intention is to rid that inner loop of much of the expensive
>> processing, then wouldn't we want to introduce an enum of valid atoms
>> which is to be a member of 'struct used_atom', and have
>> populate_value() switch on the enum value rather than doing all the
>> expensive strcmp()s and starts_with()?
>>
>>     enum atom_type {
>>         AT_REFNAME,
>>         AT_OBJECTTYPE,
>>         ...
>>         AT_COLOR,
>>         AT_ALIGN
>>     };
>>
>>     static struct used_atom {
>>         enum atom_type atom;
>>         cmp_type cmp;
>>         union {
>>             char *color; /* parsed color */
>>             struct align align;
>>             enum { ... } remote_ref;
>>             struct {
>>                 enum { ... } portion;
>>                 unsigned int nlines;
>>             } contents;
>>             int short_objname;
>>         } u;
>>     } *used_atom;
>>
>> In fact, the 'cmp_type cmp' field can be dropped altogether since it
>> can just as easily be looked up when needed by keeping 'enum
>> atom_type' and valid_atoms[] in-sync and indexing into valid_atoms[]
>> by atom_type:
>>
>>     struct used_atom *a = ...;
>>     cmp_type cmp = valid_atoms[a->atom].cmp_type;
>>
>> As a bonus, an 'enum atom_type' would resolve the problem of how
>> starts_with() is abused in populate_value() for certain cases
>> (assuming I'm understanding the logic), such as how matching of
>> "color" could incorrectly match some yet-to-be-added atom named
>> "colorize".
>
> This actually makes sense to me, especially how we can eliminate most of
> the starts_with() in populate_value(). Well then I guess I'll work on this and
> see what I can come up with, Thanks for the review :)

The 'enum atom_type' approach doesn't necessarily need to be part of
this patch series; it may make sense to layer it atop the current
series, which is already long enough to make review onerous. Patches
in the current series are things you would want to do anyhow for the
'enum atom_type' approach, and there isn't really anything in the
current series which should present a roadblock for layering 'enum
atom_type' atop. Plus, by layering 'enum atom_type' atop, you get a
chance to polish the current series (based upon review comments) and
let it settle down, which should make working on 'enum atom_type'
easier.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-12-13  5:40   ` Eric Sunshine
  2015-12-13  9:31     ` Karthik Nayak
@ 2015-12-14 19:06     ` Junio C Hamano
  1 sibling, 0 replies; 50+ messages in thread
From: Junio C Hamano @ 2015-12-14 19:06 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Karthik Nayak, Git List, Matthieu Moy

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Fri, Dec 11, 2015 at 5:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>>   ref-filter: introduce a parsing function for each atom in valid_atom
>>>   ref-filter: introduce struct used_atom
>>>   ref-fitler: bump match_atom() name to the top
>>>   ref-filter: skip deref specifier in match_atom_name()
>>>   ref-filter: introduce color_atom_parser()
>>>   strbuf: introduce strbuf_split_str_without_term()
>>>   ref-filter: introduce align_atom_parser()
>>>   ref-filter: introduce remote_ref_atom_parser()
>>>   ref-filter: introduce contents_atom_parser()
>>>   ref-filter: introduce objectname_atom_parser()
>>
>> It seems that this series had seen quite a good inputs, mostly from
>> Eric.  I finished reading it over and I didn't find anything more to
>> add.  The patches are mostly good and would be ready once these
>> points raised during the review are addressed, I think
>
> I'm still a bit fuzzy about what this series is trying to achieve. It
> feels like it wants to avoid doing repeated processing of unchanging
> bits of %(foo:bar) atoms for each ref processed, but it only partly
> achieves that goal.

That's very true.

It seems you two already have hashed it out in the downthread, and I
think that is in line with an earlier suggestion by Matthieu to
fully pre-parse in the earlier thread, which was made in response to
(and is much better than) my "let's start with a half-way solution"
in $gmane/279254.

Thanks.

> strcmp()s and starts_with()s in that inner loop, and even the
> unchanging %(color:) argument gets re-evaulated repeatedly, which is
> probably quite expensive.
>
> If the intention is to rid that inner loop of much of the expensive
> processing, then wouldn't we want to introduce an enum of valid atoms
> which is to be a member of 'struct used_atom', and have
> populate_value() switch on the enum value rather than doing all the
> expensive strcmp()s and starts_with()?
>
>     enum atom_type {
>         AT_REFNAME,
>         AT_OBJECTTYPE,
>         ...
>         AT_COLOR,
>         AT_ALIGN
>     };
>
>     static struct used_atom {
>         enum atom_type atom;
>         cmp_type cmp;
>         union {
>             char *color; /* parsed color */
>             struct align align;
>             enum { ... } remote_ref;
>             struct {
>                 enum { ... } portion;
>                 unsigned int nlines;
>             } contents;
>             int short_objname;
>         } u;
>     } *used_atom;
>
> In fact, the 'cmp_type cmp' field can be dropped altogether since it
> can just as easily be looked up when needed by keeping 'enum
> atom_type' and valid_atoms[] in-sync and indexing into valid_atoms[]
> by atom_type:
>
>     struct used_atom *a = ...;
>     cmp_type cmp = valid_atoms[a->atom].cmp_type;
>
> As a bonus, an 'enum atom_type' would resolve the problem of how
> starts_with() is abused in populate_value() for certain cases
> (assuming I'm understanding the logic), such as how matching of
> "color" could incorrectly match some yet-to-be-added atom named
> "colorize".

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH/RFC 00/10] ref-filter: use parsing functions
  2015-12-13 21:55       ` Eric Sunshine
@ 2015-12-16 14:45         ` Karthik Nayak
  0 siblings, 0 replies; 50+ messages in thread
From: Karthik Nayak @ 2015-12-16 14:45 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, Git List, Matthieu Moy

On Mon, Dec 14, 2015 at 3:25 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sun, Dec 13, 2015 at 4:31 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
>> On Sun, Dec 13, 2015 at 11:10 AM, Eric Sunshine <sunshine@sunshineco.com> wrote:
>>> If the intention is to rid that inner loop of much of the expensive
>>> processing, then wouldn't we want to introduce an enum of valid atoms
>>> which is to be a member of 'struct used_atom', and have
>>> populate_value() switch on the enum value rather than doing all the
>>> expensive strcmp()s and starts_with()?
>>>
>>>     enum atom_type {
>>>         AT_REFNAME,
>>>         AT_OBJECTTYPE,
>>>         ...
>>>         AT_COLOR,
>>>         AT_ALIGN
>>>     };
>>>
>>>     static struct used_atom {
>>>         enum atom_type atom;
>>>         cmp_type cmp;
>>>         union {
>>>             char *color; /* parsed color */
>>>             struct align align;
>>>             enum { ... } remote_ref;
>>>             struct {
>>>                 enum { ... } portion;
>>>                 unsigned int nlines;
>>>             } contents;
>>>             int short_objname;
>>>         } u;
>>>     } *used_atom;
>>>
>>> In fact, the 'cmp_type cmp' field can be dropped altogether since it
>>> can just as easily be looked up when needed by keeping 'enum
>>> atom_type' and valid_atoms[] in-sync and indexing into valid_atoms[]
>>> by atom_type:
>>>
>>>     struct used_atom *a = ...;
>>>     cmp_type cmp = valid_atoms[a->atom].cmp_type;
>>>
>>> As a bonus, an 'enum atom_type' would resolve the problem of how
>>> starts_with() is abused in populate_value() for certain cases
>>> (assuming I'm understanding the logic), such as how matching of
>>> "color" could incorrectly match some yet-to-be-added atom named
>>> "colorize".
>>
>> This actually makes sense to me, especially how we can eliminate most of
>> the starts_with() in populate_value(). Well then I guess I'll work on this and
>> see what I can come up with, Thanks for the review :)
>
> The 'enum atom_type' approach doesn't necessarily need to be part of
> this patch series; it may make sense to layer it atop the current
> series, which is already long enough to make review onerous. Patches
> in the current series are things you would want to do anyhow for the
> 'enum atom_type' approach, and there isn't really anything in the
> current series which should present a roadblock for layering 'enum
> atom_type' atop. Plus, by layering 'enum atom_type' atop, you get a
> chance to polish the current series (based upon review comments) and
> let it settle down, which should make working on 'enum atom_type'
> easier.

Ya could do that, I have the series ready then, I was gonna work on the
'enum atom_type' but I'll work on top of this, like you suggested.

Thanks.

-- 
Regards,
Karthik Nayak

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2015-12-16 14:45 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-11 19:44 [PATCH/RFC 00/10] ref-filter: use parsing functions Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 01/10] ref-filter: introduce a parsing function for each atom in valid_atom Karthik Nayak
2015-11-23 23:44   ` Eric Sunshine
2015-11-25 12:10     ` Karthik Nayak
2015-11-25 19:41       ` Eric Sunshine
2015-11-26 18:01         ` Karthik Nayak
2015-12-11 22:18           ` Junio C Hamano
2015-12-12 16:05             ` Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 02/10] ref-filter: introduce struct used_atom Karthik Nayak
2015-12-01 23:00   ` Eric Sunshine
2015-11-11 19:44 ` [PATCH/RFC 03/10] ref-fitler: bump match_atom() name to the top Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 04/10] ref-filter: skip deref specifier in match_atom_name() Karthik Nayak
2015-12-01 23:11   ` Eric Sunshine
2015-12-03  6:34     ` Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 05/10] ref-filter: introduce color_atom_parser() Karthik Nayak
2015-12-01 23:27   ` Eric Sunshine
2015-12-03 13:35     ` Karthik Nayak
2015-12-13  6:05       ` Eric Sunshine
2015-12-13 18:46         ` Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term() Karthik Nayak
2015-12-02  8:04   ` Eric Sunshine
2015-12-03 18:12     ` Karthik Nayak
2015-12-11 22:31       ` Junio C Hamano
2015-12-12 16:04         ` Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 07/10] ref-filter: introduce align_atom_parser() Karthik Nayak
2015-12-02 21:23   ` Eric Sunshine
2015-12-07 17:00     ` Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 08/10] ref-filter: introduce remote_ref_atom_parser() Karthik Nayak
2015-12-13  0:53   ` Eric Sunshine
2015-12-13  6:02     ` Karthik Nayak
2015-12-13  6:15       ` Eric Sunshine
2015-12-13  8:32       ` Karthik Nayak
2015-12-13  8:45         ` Eric Sunshine
2015-12-13  8:53           ` Karthik Nayak
2015-11-11 19:44 ` [PATCH/RFC 09/10] ref-filter: introduce contents_atom_parser() Karthik Nayak
2015-12-13  3:10   ` Eric Sunshine
2015-12-13  4:41     ` Eric Sunshine
2015-12-13 19:36       ` Karthik Nayak
2015-12-13 19:33     ` Karthik Nayak
2015-11-24 21:48 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Jeff King
2015-11-25 12:07   ` Karthik Nayak
2015-11-25 13:44   ` [PATCH/RFC 10/10] ref-filter: introduce objectname_atom_parser() Karthik Nayak
2015-12-13  4:49     ` Eric Sunshine
2015-12-13 19:40       ` Karthik Nayak
2015-12-11 22:49 ` [PATCH/RFC 00/10] ref-filter: use parsing functions Junio C Hamano
2015-12-13  5:40   ` Eric Sunshine
2015-12-13  9:31     ` Karthik Nayak
2015-12-13 21:55       ` Eric Sunshine
2015-12-16 14:45         ` Karthik Nayak
2015-12-14 19:06     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).