git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters)
@ 2021-06-05  9:13 ZheNing Hu via GitGitGadget
  2021-06-05  9:13 ` [PATCH 1/6] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-05  9:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu

In order to let git cat-file --batch reuse ref-filter logic, This patch,
%(rest), %(raw:textconv), %(raw:filters) atoms and --rest=<rest> option are
added to ref-filter.

 * %(rest) int the format will be replaced by the <rest> in --rest=<rest>.
 * the <rest> in --rest=<rest> can also be used as the <path> for
   %(raw:textconv) and %(raw:filters).
 * %(raw:textconv) can show the object's contents as transformed by a
   textconv filter.
 * %(raw:filters) can show the content as converted by the filters
   configured in the current working tree for the given <path> (i.e. smudge
   filters, end-of-line conversion, etc).

The current series is based on 0efed9435 ([GSOC] ref-filter: add %(raw)
atom)
https://lore.kernel.org/git/pull.966.v2.git.1622808751.gitgitgadget@gmail.com/
If necessary, "%(rest)" part can be an independent patch later.

ZheNing Hu (6):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom
  [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  [GSOC] ref-filter: add %(rest) atom and --rest option
  [GSOC] ref-filter: teach grab_sub_body_contents() return value and err
  [GSOC] ref-filter: add %(raw:textconv) and %(raw:filters)

 Documentation/git-branch.txt       |   6 +-
 Documentation/git-for-each-ref.txt |  29 ++-
 Documentation/git-tag.txt          |   4 +
 Documentation/git-verify-tag.txt   |   6 +-
 builtin/branch.c                   |   5 +
 builtin/for-each-ref.c             |  17 +-
 builtin/tag.c                      |   8 +-
 builtin/verify-tag.c               |   1 +
 ref-filter.c                       | 296 ++++++++++++++++++++++-------
 ref-filter.h                       |   9 +-
 t/t3203-branch-output.sh           |  14 ++
 t/t6300-for-each-ref.sh            | 294 ++++++++++++++++++++++++++++
 t/t7004-tag.sh                     |  10 +
 t/t7030-verify-tag.sh              |   8 +
 14 files changed, 633 insertions(+), 74 deletions(-)


base-commit: 1197f1a46360d3ae96bd9c15908a3a6f8e562207
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-972%2Fadlternative%2Fref-filter-texconv-filters-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-972/adlternative/ref-filter-texconv-filters-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/972
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/6] [GSOC] ref-filter: add obj-type check in grab contents
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
@ 2021-06-05  9:13 ` ZheNing Hu via GitGitGadget
  2021-06-05  9:13 ` [PATCH 2/6] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-05  9:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase.  We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag.

Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4db0e40ff4c6..5cee6512fbaf 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 /* See grab_values */
-static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
+static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
 {
 	int i;
 	const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
 	size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
+	void *buf = data->content;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
@@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "body") &&
-		    !starts_with(name, "subject") &&
-		    !starts_with(name, "trailers") &&
-		    !starts_with(name, "contents"))
+
+		if ((data->type != OBJ_TAG &&
+		     data->type != OBJ_COMMIT) ||
+		    (strcmp(name, "body") &&
+		     !starts_with(name, "subject") &&
+		     !starts_with(name, "trailers") &&
+		     !starts_with(name, "contents")))
 			continue;
 		if (!subpos)
 			find_subpos(buf,
@@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
+static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
 {
+	void *buf = data->content;
+
 	switch (obj->type) {
 	case OBJ_TAG:
 		grab_tag_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("tagger", val, deref, buf);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		break;
@@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
 					       oid_to_hex(&oi->oid), ref->refname);
 		}
-		grab_values(ref->value, deref, *obj, oi->content);
+		grab_values(ref->value, deref, *obj, oi);
 	}
 
 	grab_common_values(ref->value, deref, oi);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/6] [GSOC] ref-filter: add %(raw) atom
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
  2021-06-05  9:13 ` [PATCH 1/6] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-06-05  9:13 ` ZheNing Hu via GitGitGadget
  2021-06-08  5:07   ` Junio C Hamano
  2021-06-05  9:13 ` [PATCH 3/6] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-05  9:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.

The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depands on the output of the atom being
text (specifically, no embedded NULs in it).

E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.

Therefore, add a new member in `struct atom_value`: `s_size`, which
can record raw object size, it can help us add raw object data to
the buffer or compare two buffers which contain raw object data.

Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, `--perl` because if our binary raw data is passed to a variable
in the host language, the host language may not support arbitrary binary
data in the variables of its string type.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |   9 ++
 ref-filter.c                       | 140 +++++++++++++++----
 t/t6300-for-each-ref.sh            | 207 +++++++++++++++++++++++++++++
 3 files changed, 328 insertions(+), 28 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ae2478de706..8f8d8cd1e04f 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -235,6 +235,15 @@ and `date` to extract the named component.  For email fields (`authoremail`,
 without angle brackets, and `:localpart` to get the part before the `@` symbol
 out of the trimmed email.
 
+The raw data in a object is `raw`.
+
+raw:size::
+	The raw data size of the object.
+
+Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
+`--perl` because the host language may not support arbitrary binary data in the
+variables of its string type.
+
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
 
diff --git a/ref-filter.c b/ref-filter.c
index 5cee6512fbaf..46aec291de62 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -144,6 +144,7 @@ enum atom_type {
 	ATOM_BODY,
 	ATOM_TRAILERS,
 	ATOM_CONTENTS,
+	ATOM_RAW,
 	ATOM_UPSTREAM,
 	ATOM_PUSH,
 	ATOM_SYMREF,
@@ -189,6 +190,9 @@ static struct used_atom {
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
+		struct {
+			enum { RAW_BARE, RAW_LENGTH } option;
+		} raw_data;
 		struct {
 			cmp_status cmp_status;
 			const char *str;
@@ -426,6 +430,18 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.raw_data.option = RAW_BARE;
+	else if (!strcmp(arg, "size"))
+		atom->u.raw_data.option = RAW_LENGTH;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
+	return 0;
+}
+
 static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
@@ -586,6 +602,7 @@ static struct {
 	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
 	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
 	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_RAW] = { "raw", SOURCE_OBJ, FIELD_STR, raw_atom_parser },
 	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -620,12 +637,15 @@ struct ref_formatting_state {
 
 struct atom_value {
 	const char *s;
+	size_t s_size;
 	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
 		       struct strbuf *err);
 	uintmax_t value; /* used for sorting when not FIELD_STR */
 	struct used_atom *atom;
 };
 
+#define ATOM_VALUE_S_SIZE_INIT (-1)
+
 /*
  * Used to parse format string and sort specifiers
  */
@@ -644,13 +664,6 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
 				       (int)(ep-atom), atom);
 
-	/* Do we have the atom already used elsewhere? */
-	for (i = 0; i < used_atom_cnt; i++) {
-		int len = strlen(used_atom[i].name);
-		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
-			return i;
-	}
-
 	/*
 	 * If the atom name has a colon, strip it and everything after
 	 * it off - it specifies the format for this entry, and
@@ -660,6 +673,13 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	arg = memchr(sp, ':', ep - sp);
 	atom_len = (arg ? arg : ep) - sp;
 
+	/* Do we have the atom already used elsewhere? */
+	for (i = 0; i < used_atom_cnt; i++) {
+		int len = strlen(used_atom[i].name);
+		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
+			return i;
+	}
+
 	/* Is the atom a valid one? */
 	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
 		int len = strlen(valid_atom[i].name);
@@ -709,11 +729,14 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	return at;
 }
 
-static void quote_formatting(struct strbuf *s, const char *str, int quote_style)
+static void quote_formatting(struct strbuf *s, const char *str, size_t len, int quote_style)
 {
 	switch (quote_style) {
 	case QUOTE_NONE:
-		strbuf_addstr(s, str);
+		if (len != ATOM_VALUE_S_SIZE_INIT)
+			strbuf_add(s, str, len);
+		else
+			strbuf_addstr(s, str);
 		break;
 	case QUOTE_SHELL:
 		sq_quote_buf(s, str);
@@ -740,9 +763,12 @@ static int append_atom(struct atom_value *v, struct ref_formatting_state *state,
 	 * encountered.
 	 */
 	if (!state->stack->prev)
-		quote_formatting(&state->stack->output, v->s, state->quote_style);
+		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
 	else
-		strbuf_addstr(&state->stack->output, v->s);
+		if (v->s_size != ATOM_VALUE_S_SIZE_INIT)
+			strbuf_add(&state->stack->output, v->s, v->s_size);
+		else
+			strbuf_addstr(&state->stack->output, v->s);
 	return 0;
 }
 
@@ -842,21 +868,23 @@ static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state
 	return 0;
 }
 
-static int is_empty(const char *s)
+static int is_empty(struct strbuf *buf)
 {
-	while (*s != '\0') {
-		if (!isspace(*s))
-			return 0;
-		s++;
-	}
-	return 1;
-}
+	const char *cur = buf->buf;
+	const char *end = buf->buf + buf->len;
+
+	while (cur != end && (isspace(*cur)))
+		cur++;
+
+	return cur == end;
+ }
 
 static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
 			     struct strbuf *err)
 {
 	struct ref_formatting_stack *cur = state->stack;
 	struct if_then_else *if_then_else = NULL;
+	size_t str_len = 0;
 
 	if (cur->at_end == if_then_else_handler)
 		if_then_else = (struct if_then_else *)cur->at_end_data;
@@ -867,18 +895,22 @@ static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_sta
 	if (if_then_else->else_atom_seen)
 		return strbuf_addf_ret(err, -1, _("format: %%(then) atom used after %%(else)"));
 	if_then_else->then_atom_seen = 1;
+	if (if_then_else->str)
+		str_len = strlen(if_then_else->str);
 	/*
 	 * If the 'equals' or 'notequals' attribute is used then
 	 * perform the required comparison. If not, only non-empty
 	 * strings satisfy the 'if' condition.
 	 */
 	if (if_then_else->cmp_status == COMPARE_EQUAL) {
-		if (!strcmp(if_then_else->str, cur->output.buf))
+		if (str_len == cur->output.len &&
+		    !memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
 	} else if (if_then_else->cmp_status == COMPARE_UNEQUAL) {
-		if (strcmp(if_then_else->str, cur->output.buf))
+		if (str_len != cur->output.len ||
+		    memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
-	} else if (cur->output.len && !is_empty(cur->output.buf))
+	} else if (cur->output.len && !is_empty(&cur->output))
 		if_then_else->condition_satisfied = 1;
 	strbuf_reset(&cur->output);
 	return 0;
@@ -924,7 +956,7 @@ static int end_atom_handler(struct atom_value *atomv, struct ref_formatting_stat
 	 * only on the topmost supporting atom.
 	 */
 	if (!current->prev->prev) {
-		quote_formatting(&s, current->output.buf, state->quote_style);
+		quote_formatting(&s, current->output.buf, current->output.len, state->quote_style);
 		strbuf_swap(&current->output, &s);
 	}
 	strbuf_release(&s);
@@ -974,6 +1006,10 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
+		    used_atom[at].u.raw_data.option == RAW_BARE)
+			die(_("--format=%.*s cannot be used with"
+			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
@@ -1362,17 +1398,29 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 	const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
 	size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
 	void *buf = data->content;
+	unsigned long buf_size = data->size;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
 		const char *name = atom->name;
 		struct atom_value *v = &val[i];
+		enum atom_type atom_type = atom->atom_type;
 
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
+		if (atom_type == ATOM_RAW) {
+			if (atom->u.raw_data.option == RAW_BARE) {
+				v->s = xmemdupz(buf, buf_size);
+				v->s_size = buf_size;
+			} else if (atom->u.raw_data.option == RAW_LENGTH) {
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
+			}
+			continue;
+		}
+
 		if ((data->type != OBJ_TAG &&
 		     data->type != OBJ_COMMIT) ||
 		    (strcmp(name, "body") &&
@@ -1460,9 +1508,11 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	case OBJ_BLOB:
 		/* grab_blob_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	default:
 		die("Eh?  Object of type %d?", obj->type);
@@ -1765,7 +1815,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 		int deref = 0;
 		const char *refname;
 		struct branch *branch = NULL;
-
+		v->s_size = ATOM_VALUE_S_SIZE_INIT;
 		v->handler = append_atom;
 		v->atom = atom;
 
@@ -2369,6 +2419,19 @@ static int compare_detached_head(struct ref_array_item *a, struct ref_array_item
 	return 0;
 }
 
+static int memcasecmp(const void *vs1, const void *vs2, size_t n)
+{
+	const char *s1 = vs1, *s2 = vs2;
+	const char *end = s1 + n;
+
+	for (; s1 < end; s1++, s2++) {
+		int diff = tolower(*s1) - tolower(*s2);
+		if (diff)
+			return diff;
+	}
+	return 0;
+}
+
 static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
 {
 	struct atom_value *va, *vb;
@@ -2389,10 +2452,30 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 	} else if (s->sort_flags & REF_SORTING_VERSION) {
 		cmp = versioncmp(va->s, vb->s);
 	} else if (cmp_type == FIELD_STR) {
-		int (*cmp_fn)(const char *, const char *);
-		cmp_fn = s->sort_flags & REF_SORTING_ICASE
-			? strcasecmp : strcmp;
-		cmp = cmp_fn(va->s, vb->s);
+		if (va->s_size == ATOM_VALUE_S_SIZE_INIT &&
+		    vb->s_size == ATOM_VALUE_S_SIZE_INIT) {
+			int (*cmp_fn)(const char *, const char *);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? strcasecmp : strcmp;
+			cmp = cmp_fn(va->s, vb->s);
+		} else {
+			int (*cmp_fn)(const void *, const void *, size_t);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? memcasecmp : memcmp;
+			size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
+					strlen(va->s) : va->s_size;
+			size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
+					strlen(vb->s) : vb->s_size;
+
+			cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
+				     a_size : b_size);
+			if (!cmp) {
+				if (a_size > b_size)
+					cmp = 1;
+				else if (a_size < b_size)
+					cmp = -1;
+			}
+		}
 	} else {
 		if (va->value < vb->value)
 			cmp = -1;
@@ -2492,6 +2575,7 @@ int format_ref_array_item(struct ref_array_item *info,
 	}
 	if (format->need_color_reset_at_eol) {
 		struct atom_value resetv;
+		resetv.s_size = ATOM_VALUE_S_SIZE_INIT;
 		resetv.s = GIT_COLOR_RESET;
 		if (append_atom(&resetv, &state, error_buf)) {
 			pop_stack_element(&state.stack);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 9e0214076b4d..5f66d933ace0 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -130,6 +130,8 @@ test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
+test_atom head raw "$(git cat-file commit refs/heads/main)
+"
 test_atom head '*objectname' ''
 test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
@@ -221,6 +223,15 @@ test_atom tag contents 'Tagging at 1151968727
 '
 test_atom tag HEAD ' '
 
+test_expect_success 'basic atom: refs/tags/testtag *raw' '
+	git cat-file commit refs/tags/testtag^{} >expected &&
+	git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	echo "" >>expected.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'Check invalid atoms names are errors' '
 	test_must_fail git for-each-ref --format="%(INVALID)" refs/heads
 '
@@ -686,6 +697,15 @@ test_atom refs/tags/signed-empty contents:body ''
 test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
+test_expect_success 'basic atom: refs/tags/signed-empty raw' '
+	git cat-file tag refs/tags/signed-empty >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-empty >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	echo "" >>expected.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-short subject 'subject line'
 test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
@@ -695,6 +715,15 @@ test_atom refs/tags/signed-short contents:signature "$sig"
 test_atom refs/tags/signed-short contents "subject line
 $sig"
 
+test_expect_success 'basic atom: refs/tags/signed-short raw' '
+	git cat-file tag refs/tags/signed-short >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-short >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	echo "" >>expected.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-long subject 'subject line'
 test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
@@ -708,6 +737,15 @@ test_atom refs/tags/signed-long contents "subject line
 body contents
 $sig"
 
+test_expect_success 'basic atom: refs/tags/signed-long raw' '
+	git cat-file tag refs/tags/signed-long >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-long >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	echo "" >>expected.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'set up refs pointing to tree and blob' '
 	git update-ref refs/mytrees/first refs/heads/main^{tree} &&
 	git update-ref refs/myblobs/first refs/heads/main:one
@@ -720,6 +758,16 @@ test_atom refs/mytrees/first contents:body ""
 test_atom refs/mytrees/first contents:signature ""
 test_atom refs/mytrees/first contents ""
 
+test_expect_success 'basic atom: refs/mytrees/first raw' '
+	git cat-file tree refs/mytrees/first >expected &&
+	echo "" >>expected &&
+	git for-each-ref --format="%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/mytrees/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_atom refs/myblobs/first subject ""
 test_atom refs/myblobs/first contents:subject ""
 test_atom refs/myblobs/first body ""
@@ -727,6 +775,165 @@ test_atom refs/myblobs/first contents:body ""
 test_atom refs/myblobs/first contents:signature ""
 test_atom refs/myblobs/first contents ""
 
+test_expect_success 'basic atom: refs/myblobs/first raw' '
+	git cat-file blob refs/myblobs/first >expected &&
+	echo "" >>expected &&
+	git for-each-ref --format="%(raw)" refs/myblobs/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/myblobs/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/myblobs/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'set up refs pointing to binary blob' '
+	printf "%b" "a\0b\0c" >blob1 &&
+	printf "%b" "a\0c\0b" >blob2 &&
+	printf "%b" "\0a\0b\0c" >blob3 &&
+	printf "%b" "abc" >blob4 &&
+	printf "%b" "\0 \0 \0 " >blob5 &&
+	printf "%b" "\0 \0a\0 " >blob6 &&
+	printf "%b" "  " >blob7 &&
+	>blob8 &&
+	git hash-object blob1 -w | xargs git update-ref refs/myblobs/blob1 &&
+	git hash-object blob2 -w | xargs git update-ref refs/myblobs/blob2 &&
+	git hash-object blob3 -w | xargs git update-ref refs/myblobs/blob3 &&
+	git hash-object blob4 -w | xargs git update-ref refs/myblobs/blob4 &&
+	git hash-object blob5 -w | xargs git update-ref refs/myblobs/blob5 &&
+	git hash-object blob6 -w | xargs git update-ref refs/myblobs/blob6 &&
+	git hash-object blob7 -w | xargs git update-ref refs/myblobs/blob7 &&
+	git hash-object blob8 -w | xargs git update-ref refs/myblobs/blob8
+'
+
+test_expect_success 'Verify sorts with raw' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob3
+	refs/myblobs/blob7
+	refs/mytrees/first
+	refs/myblobs/first
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob4
+	refs/heads/main
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'Verify sorts with raw:size' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/first
+	refs/myblobs/blob7
+	refs/heads/main
+	refs/myblobs/blob4
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/mytrees/first
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw:size \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:equals)' '
+	cat >expected <<-EOF &&
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	refs/myblobs/blob4
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	EOF
+	git for-each-ref --format="%(if:equals=abc)%(raw)%(then)%(refname)%(else)not equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+test_expect_success 'validate raw atom with %(if:notequals)' '
+	cat >expected <<-EOF &&
+	refs/heads/ambiguous
+	refs/heads/main
+	refs/heads/newtag
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	equals
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob7
+	refs/myblobs/blob8
+	refs/myblobs/first
+	EOF
+	git for-each-ref --format="%(if:notequals=abc)%(raw)%(then)%(refname)%(else)equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'empty raw refs with %(if)' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob1 not empty
+	refs/myblobs/blob2 not empty
+	refs/myblobs/blob3 not empty
+	refs/myblobs/blob4 not empty
+	refs/myblobs/blob5 not empty
+	refs/myblobs/blob6 not empty
+	refs/myblobs/blob7 empty
+	refs/myblobs/blob8 empty
+	refs/myblobs/first not empty
+	EOF
+	git for-each-ref --format="%(refname) %(if)%(raw)%(then)not empty%(else)empty%(end)" \
+		refs/myblobs/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success '%(raw) with --python must failed' '
+	test_must_fail git for-each-ref --format="%(raw)" --python
+'
+
+test_expect_success '%(raw) with --tcl must failed' '
+	test_must_fail git for-each-ref --format="%(raw)" --tcl
+'
+
+test_expect_success '%(raw) with --perl must failed' '
+	test_must_fail git for-each-ref --format="%(raw)" --perl
+'
+
+test_expect_success '%(raw) with --shell must failed' '
+	test_must_fail git for-each-ref --format="%(raw)" --shell
+'
+
+test_expect_success '%(raw) with --shell and --sort=raw must failed' '
+	test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
+'
+
+test_expect_success '%(raw:size) with --shell' '
+	git for-each-ref --format="%(raw:size)" | while read line
+	do
+		echo "'\''$line'\''" >>expect
+	done &&
+	git for-each-ref --format="%(raw:size)" --shell >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'for-each-ref --format compare with cat-file --batch' '
+	git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
+	git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)
+%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'set up multiple-sort tags' '
 	for when in 100000 200000
 	do
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/6] [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
  2021-06-05  9:13 ` [PATCH 1/6] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
  2021-06-05  9:13 ` [PATCH 2/6] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-06-05  9:13 ` ZheNing Hu via GitGitGadget
  2021-06-05  9:13 ` [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option ZheNing Hu via GitGitGadget
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-05  9:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/tag.c |  2 +-
 ref-filter.c  | 44 ++++++++++++++++++++++----------------------
 ref-filter.h  |  4 ++--
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 82fcfc098242..452558ec9575 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -146,7 +146,7 @@ static int verify_tag(const char *name, const char *ref,
 		      const struct object_id *oid, void *cb_data)
 {
 	int flags;
-	const struct ref_format *format = cb_data;
+	struct ref_format *format = cb_data;
 	flags = GPG_VERIFY_VERBOSE;
 
 	if (format->format)
diff --git a/ref-filter.c b/ref-filter.c
index 46aec291de62..608e38aa4160 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -226,7 +226,7 @@ static int strbuf_addf_ret(struct strbuf *sb, int ret, const char *fmt, ...)
 	return ret;
 }
 
-static int color_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int color_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *color_value, struct strbuf *err)
 {
 	if (!color_value)
@@ -264,7 +264,7 @@ static int refname_atom_parser_internal(struct refname_atom *atom, const char *a
 	return 0;
 }
 
-static int remote_ref_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int remote_ref_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	struct string_list params = STRING_LIST_INIT_DUP;
@@ -311,7 +311,7 @@ static int remote_ref_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objecttype_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objecttype_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -323,7 +323,7 @@ static int objecttype_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objectsize_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objectsize_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -343,7 +343,7 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int deltabase_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int deltabase_atom_parser(struct ref_format *format, struct used_atom *atom,
 				 const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -355,7 +355,7 @@ static int deltabase_atom_parser(const struct ref_format *format, struct used_at
 	return 0;
 }
 
-static int body_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int body_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -364,7 +364,7 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 	return 0;
 }
 
-static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int subject_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -376,7 +376,7 @@ static int subject_atom_parser(const struct ref_format *format, struct used_atom
 	return 0;
 }
 
-static int trailers_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int trailers_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	atom->u.contents.trailer_opts.no_divider = 1;
@@ -402,7 +402,7 @@ static int trailers_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int contents_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int contents_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -430,7 +430,7 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int raw_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -442,7 +442,7 @@ static int raw_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int oid_atom_parser(struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -461,7 +461,7 @@ static int oid_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int person_email_atom_parser(struct ref_format *format, struct used_atom *atom,
 				    const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -475,7 +475,7 @@ static int person_email_atom_parser(const struct ref_format *format, struct used
 	return 0;
 }
 
-static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int refname_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	return refname_atom_parser_internal(&atom->u.refname, arg, atom->name, err);
@@ -492,7 +492,7 @@ static align_type parse_align_position(const char *s)
 	return -1;
 }
 
-static int align_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int align_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *arg, struct strbuf *err)
 {
 	struct align *align = &atom->u.align;
@@ -544,7 +544,7 @@ static int align_atom_parser(const struct ref_format *format, struct used_atom *
 	return 0;
 }
 
-static int if_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 			  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -559,7 +559,7 @@ static int if_atom_parser(const struct ref_format *format, struct used_atom *ato
 	return 0;
 }
 
-static int head_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
 	atom->u.head = resolve_refdup("HEAD", RESOLVE_REF_READING, NULL, NULL);
@@ -570,7 +570,7 @@ static struct {
 	const char *name;
 	info_source source;
 	cmp_type cmp_type;
-	int (*parser)(const struct ref_format *format, struct used_atom *atom,
+	int (*parser)(struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
 	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -649,7 +649,7 @@ struct atom_value {
 /*
  * Used to parse format string and sort specifiers
  */
-static int parse_ref_filter_atom(const struct ref_format *format,
+static int parse_ref_filter_atom(struct ref_format *format,
 				 const char *atom, const char *ep,
 				 struct strbuf *err)
 {
@@ -2545,9 +2545,9 @@ static void append_literal(const char *cp, const char *ep, struct ref_formatting
 }
 
 int format_ref_array_item(struct ref_array_item *info,
-			   const struct ref_format *format,
-			   struct strbuf *final_buf,
-			   struct strbuf *error_buf)
+			  struct ref_format *format,
+			  struct strbuf *final_buf,
+			  struct strbuf *error_buf)
 {
 	const char *cp, *sp, *ep;
 	struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
@@ -2592,7 +2592,7 @@ int format_ref_array_item(struct ref_array_item *info,
 }
 
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format)
+		      struct ref_format *format)
 {
 	struct ref_array_item *ref_item;
 	struct strbuf output = STRBUF_INIT;
diff --git a/ref-filter.h b/ref-filter.h
index baf72a718965..74fb423fc89f 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -116,7 +116,7 @@ void ref_array_sort(struct ref_sorting *sort, struct ref_array *array);
 void ref_sorting_set_sort_flags_all(struct ref_sorting *sorting, unsigned int mask, int on);
 /*  Based on the given format and quote_style, fill the strbuf */
 int format_ref_array_item(struct ref_array_item *info,
-			  const struct ref_format *format,
+			  struct ref_format *format,
 			  struct strbuf *final_buf,
 			  struct strbuf *error_buf);
 /*  Parse a single sort specifier and add it to the list */
@@ -137,7 +137,7 @@ void setup_ref_filter_porcelain_msg(void);
  * name must be a fully qualified refname.
  */
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format);
+		      struct ref_format *format);
 
 /*
  * Push a single ref onto the array; this can be used to construct your own
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
                   ` (2 preceding siblings ...)
  2021-06-05  9:13 ` [PATCH 3/6] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
@ 2021-06-05  9:13 ` ZheNing Hu via GitGitGadget
  2021-06-05 15:20   ` Hariom verma
  2021-06-07  5:52   ` Junio C Hamano
  2021-06-05  9:13 ` [PATCH 5/6] [GSOC] ref-filter: teach grab_sub_body_contents() return value and err ZheNing Hu via GitGitGadget
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-05  9:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In order to let "cat-file --batch=%(rest)" use the ref-filter
interface, add %(rest) atom for ref-filter and --rest option
for "git for-each-ref", "git branch", "git tag" and "git verify-tag".
`--rest` specify a string to replace %(rest) placeholders of
the --format option.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-branch.txt       |  6 +++++-
 Documentation/git-for-each-ref.txt | 10 +++++++++-
 Documentation/git-tag.txt          |  4 ++++
 Documentation/git-verify-tag.txt   |  6 +++++-
 builtin/branch.c                   |  5 +++++
 builtin/for-each-ref.c             |  6 ++++++
 builtin/tag.c                      |  6 ++++++
 builtin/verify-tag.c               |  1 +
 ref-filter.c                       | 21 +++++++++++++++++++++
 ref-filter.h                       |  5 ++++-
 t/t3203-branch-output.sh           | 14 ++++++++++++++
 t/t6300-for-each-ref.sh            | 24 ++++++++++++++++++++++++
 t/t7004-tag.sh                     | 10 ++++++++++
 t/t7030-verify-tag.sh              |  8 ++++++++
 14 files changed, 122 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 94dc9a54f2d7..29fa579e3d8a 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -15,7 +15,7 @@ SYNOPSIS
 	[--contains [<commit>]] [--no-contains [<commit>]]
 	[--points-at <object>] [--format=<format>]
 	[(-r | --remotes) | (-a | --all)]
-	[--list] [<pattern>...]
+	[--list] [<pattern>...] [--rest=<rest>]
 'git branch' [--track | --no-track] [-f] <branchname> [<start-point>]
 'git branch' (--set-upstream-to=<upstream> | -u <upstream>) [<branchname>]
 'git branch' --unset-upstream [<branchname>]
@@ -298,6 +298,10 @@ start-point is either a local or remote-tracking branch.
 	and the object it points at.  The format is the same as
 	that of linkgit:git-for-each-ref[1].
 
+--rest=<rest>::
+	If given, the `%(rest)` placeholders in the `--format` option
+	will be replaced.
+
 CONFIGURATION
 -------------
 `pager.branch` is only respected when listing branches, i.e., when
diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 8f8d8cd1e04f..e85b7b19a530 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git for-each-ref' [--count=<count>] [--shell|--perl|--python|--tcl]
 		   [(--sort=<key>)...] [--format=<format>] [<pattern>...]
-		   [--points-at=<object>]
+		   [--points-at=<object>] [--rest=<rest>]
 		   [--merged[=<object>]] [--no-merged[=<object>]]
 		   [--contains[=<object>]] [--no-contains[=<object>]]
 
@@ -74,6 +74,10 @@ OPTIONS
 --points-at=<object>::
 	Only list refs which points at the given object.
 
+--rest=<rest>::
+	If given, the `%(rest)` placeholders in the `--format` option
+	will be replaced by its contents.
+
 --merged[=<object>]::
 	Only list refs whose tips are reachable from the
 	specified commit (HEAD if not specified).
@@ -235,6 +239,10 @@ and `date` to extract the named component.  For email fields (`authoremail`,
 without angle brackets, and `:localpart` to get the part before the `@` symbol
 out of the trimmed email.
 
+rest::
+	The placeholder for `--rest` specified string. If no `--rest`, nothing
+	is printed.
+
 The raw data in a object is `raw`.
 
 raw:size::
diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 31a97a1b6c5b..3bf6ae7216de 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -200,6 +200,10 @@ This option is only applicable when listing tags without annotation lines.
 	that of linkgit:git-for-each-ref[1].  When unspecified,
 	defaults to `%(refname:strip=2)`.
 
+--rest=<rest>::
+	If given, the `%(rest)` placeholders in the `--format` option
+	will be replaced.
+
 <tagname>::
 	The name of the tag to create, delete, or describe.
 	The new tag name must pass all checks defined by
diff --git a/Documentation/git-verify-tag.txt b/Documentation/git-verify-tag.txt
index 0b8075dad965..7ba4a6941ab1 100644
--- a/Documentation/git-verify-tag.txt
+++ b/Documentation/git-verify-tag.txt
@@ -8,7 +8,7 @@ git-verify-tag - Check the GPG signature of tags
 SYNOPSIS
 --------
 [verse]
-'git verify-tag' [--format=<format>] <tag>...
+'git verify-tag' [--format=<format>] [--rest=<rest>] <tag>...
 
 DESCRIPTION
 -----------
@@ -27,6 +27,10 @@ OPTIONS
 <tag>...::
 	SHA-1 identifiers of Git tag objects.
 
+--rest=<rest>::
+	If given, the `%(rest)` placeholders in the `--format` option
+	will be replaced.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/branch.c b/builtin/branch.c
index b23b1d1752af..b03a4c49c1d9 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -439,6 +439,10 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
 	if (verify_ref_format(format))
 		die(_("unable to parse format string"));
 
+	if (format->use_rest)
+		for (i = 0; i < array.nr; i++)
+			array.items[i]->rest = format->rest;
+
 	ref_array_sort(sorting, &array);
 
 	for (i = 0; i < array.nr; i++) {
@@ -670,6 +674,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
 			N_("print only branches of the object"), parse_opt_object_name),
 		OPT_BOOL('i', "ignore-case", &icase, N_("sorting and filtering are case insensitive")),
 		OPT_STRING(  0 , "format", &format.format, N_("format"), N_("format to use for the output")),
+		OPT_STRING(0, "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
 		OPT_END(),
 	};
 
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 89cb6307d46f..fac7777fd2c0 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -37,6 +37,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 
 		OPT_GROUP(""),
 		OPT_INTEGER( 0 , "count", &maxcount, N_("show only <n> matched refs")),
+		OPT_STRING(  0 , "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
 		OPT_STRING(  0 , "format", &format.format, N_("format"), N_("format to use for the output")),
 		OPT__COLOR(&format.use_color, N_("respect format colors")),
 		OPT_REF_SORT(sorting_tail),
@@ -78,6 +79,11 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	filter.name_patterns = argv;
 	filter.match_as_path = 1;
 	filter_refs(&array, &filter, FILTER_REFS_ALL | FILTER_REFS_INCLUDE_BROKEN);
+
+	if (format.use_rest)
+		for (i = 0; i < array.nr; i++)
+			array.items[i]->rest = format.rest;
+
 	ref_array_sort(sorting, &array);
 
 	if (!maxcount || array.nr < maxcount)
diff --git a/builtin/tag.c b/builtin/tag.c
index 452558ec9575..9e52b3eacb16 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -63,6 +63,11 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
 		die(_("unable to parse format string"));
 	filter->with_commit_tag_algo = 1;
 	filter_refs(&array, filter, FILTER_REFS_TAGS);
+
+	if (format->use_rest)
+		for (i = 0; i < array.nr; i++)
+			array.items[i]->rest = format->rest;
+
 	ref_array_sort(sorting, &array);
 
 	for (i = 0; i < array.nr; i++) {
@@ -481,6 +486,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 		OPT_STRING(  0 , "format", &format.format, N_("format"),
 			   N_("format to use for the output")),
 		OPT__COLOR(&format.use_color, N_("respect format colors")),
+		OPT_STRING(0, "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
 		OPT_BOOL('i', "ignore-case", &icase, N_("sorting and filtering are case insensitive")),
 		OPT_END()
 	};
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index f45136a06ba7..00be4899678d 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -36,6 +36,7 @@ int cmd_verify_tag(int argc, const char **argv, const char *prefix)
 		OPT__VERBOSE(&verbose, N_("print tag contents")),
 		OPT_BIT(0, "raw", &flags, N_("print raw gpg status output"), GPG_VERIFY_RAW),
 		OPT_STRING(0, "format", &format.format, N_("format"), N_("format to use for the output")),
+		OPT_STRING(0, "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
 		OPT_END()
 	};
 
diff --git a/ref-filter.c b/ref-filter.c
index 608e38aa4160..695f6f55e3e3 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -157,6 +157,7 @@ enum atom_type {
 	ATOM_IF,
 	ATOM_THEN,
 	ATOM_ELSE,
+	ATOM_REST,
 };
 
 /*
@@ -559,6 +560,15 @@ static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 	return 0;
 }
 
+static int rest_atom_parser(struct ref_format *format, struct used_atom *atom,
+			    const char *arg, struct strbuf *err)
+{
+	if (arg)
+		return strbuf_addf_ret(err, -1, _("%%(rest) does not take arguments"));
+	format->use_rest = 1;
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
@@ -615,6 +625,7 @@ static struct {
 	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
 	[ATOM_THEN] = { "then", SOURCE_NONE },
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
+	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -1919,6 +1930,12 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
+		} else if (atom_type == ATOM_REST) {
+			if (ref->rest)
+				v->s = xstrdup(ref->rest);
+			else
+				v->s = xstrdup("");
+			continue;
 		} else
 			continue;
 
@@ -2136,6 +2153,7 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 
 	FLEX_ALLOC_STR(ref, refname, refname);
 	oidcpy(&ref->objectname, oid);
+	ref->rest = NULL;
 
 	return ref;
 }
@@ -2600,6 +2618,9 @@ void pretty_print_ref(const char *name, const struct object_id *oid,
 
 	ref_item = new_ref_array_item(name, oid);
 	ref_item->kind = ref_kind_from_refname(name);
+	if (format->use_rest)
+		ref_item->rest = format->rest;
+
 	if (format_ref_array_item(ref_item, format, &output, &err))
 		die("%s", err.buf);
 	fwrite(output.buf, 1, output.len, stdout);
diff --git a/ref-filter.h b/ref-filter.h
index 74fb423fc89f..9dc07476a584 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -38,6 +38,7 @@ struct ref_sorting {
 
 struct ref_array_item {
 	struct object_id objectname;
+	const char *rest;
 	int flag;
 	unsigned int kind;
 	const char *symref;
@@ -76,14 +77,16 @@ struct ref_format {
 	 * verify_ref_format() afterwards to finalize.
 	 */
 	const char *format;
+	const char *rest;
 	int quote_style;
+	int use_rest;
 	int use_color;
 
 	/* Internal state to ref-filter */
 	int need_color_reset_at_eol;
 };
 
-#define REF_FORMAT_INIT { NULL, 0, -1 }
+#define REF_FORMAT_INIT { NULL, NULL, 0, 0, -1 }
 
 /*  Macros for checking --merged and --no-merged options */
 #define _OPT_MERGED_NO_MERGED(option, filter, h) \
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index 5325b9f67a00..dc6bf2557088 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -340,6 +340,20 @@ test_expect_success 'git branch --format option' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git branch with --format and --rest option' '
+	cat >expect <<-\EOF &&
+	@Refname is (HEAD detached from fromtag)
+	@Refname is refs/heads/ambiguous
+	@Refname is refs/heads/branch-one
+	@Refname is refs/heads/branch-two
+	@Refname is refs/heads/main
+	@Refname is refs/heads/ref-to-branch
+	@Refname is refs/heads/ref-to-remote
+	EOF
+	git branch --rest="@" --format="%(rest)Refname is %(refname)" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'worktree colors correct' '
 	cat >expect <<-EOF &&
 	* <GREEN>(HEAD detached from fromtag)<RESET>
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 5f66d933ace0..e908b2ca0522 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1187,6 +1187,30 @@ test_expect_success 'basic atom: head contents:trailers' '
 	test_cmp expect actual.clean
 '
 
+test_expect_success 'bacic atom: rest with --rest' '
+	git for-each-ref --format="###refname=%(refname)
+###oid=%(objectname)
+###type=%(objecttype)
+###size=%(objectsize)" refs/heads/main refs/tags/subject-body >expect &&
+	git for-each-ref --rest="###" --format="%(rest)refname=%(refname)
+%(rest)oid=%(objectname)
+%(rest)type=%(objecttype)
+%(rest)size=%(objectsize)" refs/heads/main refs/tags/subject-body >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'bacic atom: *rest with --rest' '
+	git for-each-ref --format="###refname=%(refname)
+###oid=%(objectname)
+###type=%(objecttype)
+###size=%(objectsize)" refs/heads/main refs/tags/subject-body >expect &&
+	git for-each-ref --rest="###" --format="%(*rest)refname=%(refname)
+%(rest)oid=%(objectname)
+%(rest)type=%(objecttype)
+%(rest)size=%(objectsize)" refs/heads/main refs/tags/subject-body >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'trailer parsing not fooled by --- line' '
 	git commit --allow-empty -F - <<-\EOF &&
 	this is the subject
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2f72c5c6883e..93c4366ff52d 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1998,6 +1998,16 @@ test_expect_success '--format should list tags as per format given' '
 	test_cmp expect actual
 '
 
+test_expect_success 'tag -l with --format and --rest' '
+	cat >expect <<-\EOF &&
+	#refname : refs/tags/v1.0
+	#refname : refs/tags/v1.0.1
+	#refname : refs/tags/v1.1.3
+	EOF
+	git tag -l --rest="#" --format="%(rest)refname : %(refname)" "v1*" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success "set up color tests" '
 	echo "<RED>v1.0<RESET>" >expect.color &&
 	echo "v1.0" >expect.bare &&
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 3cefde9602bf..008df00c32d6 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -194,6 +194,14 @@ test_expect_success GPG 'verifying tag with --format' '
 	test_cmp expect actual
 '
 
+test_expect_success GPG 'verifying tag with --format and --rest' '
+	cat >expect <<-\EOF &&
+	#tagname : fourth-signed
+	EOF
+	git verify-tag --rest="#" --format="%(rest)tagname : %(tag)" "fourth-signed" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success GPG 'verifying a forged tag with --format should fail silently' '
 	test_must_fail git verify-tag --format="tagname : %(tag)" $(cat forged1.tag) >actual-forged &&
 	test_must_be_empty actual-forged
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 5/6] [GSOC] ref-filter: teach grab_sub_body_contents() return value and err
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
                   ` (3 preceding siblings ...)
  2021-06-05  9:13 ` [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option ZheNing Hu via GitGitGadget
@ 2021-06-05  9:13 ` ZheNing Hu via GitGitGadget
  2021-06-05  9:13 ` [PATCH 6/6] [GSOC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-05  9:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Teach grab_sub_body_contents() return value and err can help us
report more useful error messages when when add %(raw:textconv)
and %(raw:filter) later.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 43 ++++++++++++++++++++++++++++---------------
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 695f6f55e3e3..a859a94aa8e0 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1403,7 +1403,8 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 /* See grab_values */
-static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
+static int grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data,
+				  struct strbuf *err)
 {
 	int i;
 	const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
@@ -1478,6 +1479,7 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 
 	}
 	free((void *)sigpos);
+	return 0;
 }
 
 /*
@@ -1501,33 +1503,39 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
+static int grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data, struct strbuf *err)
 {
 	void *buf = data->content;
+	int ret = 0;
 
 	switch (obj->type) {
 	case OBJ_TAG:
 		grab_tag_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, data);
+		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+			return ret;
 		grab_person("tagger", val, deref, buf);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, data);
+		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+			return ret;
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
-		grab_sub_body_contents(val, deref, data);
+		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+			return ret;
 		break;
 	case OBJ_BLOB:
 		/* grab_blob_values(val, deref, obj, buf, sz); */
-		grab_sub_body_contents(val, deref, data);
+		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+			return ret;
 		break;
 	default:
 		die("Eh?  Object of type %d?", obj->type);
 	}
+	return ret;
 }
 
 static inline char *copy_advance(char *dst, const char *src)
@@ -1725,6 +1733,8 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 {
 	/* parse_object_buffer() will set eaten to 0 if free() will be needed */
 	int eaten = 1;
+	int ret = 0;
+
 	if (oi->info.contentp) {
 		/* We need to know that to use parse_object_buffer properly */
 		oi->info.sizep = &oi->size;
@@ -1745,13 +1755,13 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
 					       oid_to_hex(&oi->oid), ref->refname);
 		}
-		grab_values(ref->value, deref, *obj, oi);
+		ret = grab_values(ref->value, deref, *obj, oi, err);
 	}
 
 	grab_common_values(ref->value, deref, oi);
 	if (!eaten)
 		free(oi->content);
-	return 0;
+	return ret;
 }
 
 static void populate_worktree_map(struct hashmap *map, struct worktree **worktrees)
@@ -1805,7 +1815,7 @@ static char *get_worktree_path(const struct used_atom *atom, const struct ref_ar
 static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 {
 	struct object *obj;
-	int i;
+	int i, ret = 0;
 	struct object_info empty = OBJECT_INFO_INIT;
 
 	CALLOC_ARRAY(ref->value, used_atom_cnt);
@@ -1961,8 +1971,8 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 
 
 	oi.oid = ref->objectname;
-	if (get_object(ref, 0, &obj, &oi, err))
-		return -1;
+	if ((ret = get_object(ref, 0, &obj, &oi, err)))
+		return ret;
 
 	/*
 	 * If there is no atom that wants to know about tagged
@@ -1993,9 +2003,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 static int get_ref_atom_value(struct ref_array_item *ref, int atom,
 			      struct atom_value **v, struct strbuf *err)
 {
+	int ret = 0;
+
 	if (!ref->value) {
-		if (populate_value(ref, err))
-			return -1;
+		if ((ret = populate_value(ref, err)))
+			return ret;
 		fill_missing_values(ref->value);
 	}
 	*v = &ref->value[atom];
@@ -2568,6 +2580,7 @@ int format_ref_array_item(struct ref_array_item *info,
 			  struct strbuf *error_buf)
 {
 	const char *cp, *sp, *ep;
+	int ret = 0;
 	struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
 
 	state.quote_style = format->quote_style;
@@ -2581,10 +2594,10 @@ int format_ref_array_item(struct ref_array_item *info,
 		if (cp < sp)
 			append_literal(cp, sp, &state);
 		pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
-		if (pos < 0 || get_ref_atom_value(info, pos, &atomv, error_buf) ||
+		if (pos < 0 || ((ret = get_ref_atom_value(info, pos, &atomv, error_buf))) ||
 		    atomv->handler(atomv, &state, error_buf)) {
 			pop_stack_element(&state.stack);
-			return -1;
+			return ret ? ret : -1;
 		}
 	}
 	if (*cp) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 6/6] [GSOC] ref-filter: add %(raw:textconv) and %(raw:filters)
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
                   ` (4 preceding siblings ...)
  2021-06-05  9:13 ` [PATCH 5/6] [GSOC] ref-filter: teach grab_sub_body_contents() return value and err ZheNing Hu via GitGitGadget
@ 2021-06-05  9:13 ` ZheNing Hu via GitGitGadget
  2021-06-05 10:29 ` [PATCH 0/6] [GSOC][RFC] " Bagas Sanjaya
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-06-05  9:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In order to let `git cat-file --batch --filter` and
`git cat-file --batch --textconv` use ref-filter
interface, `%(raw:textconv)` and `%(raw:filters)` are
added to ref-filter.

`--rest` contents is used as the `<path>` for them.

`%(raw:textconv)` can show the object' contents as transformed
by a textconv filter.

`%(raw:filters)` can show the content as converted by the filters
configured in the current working tree for the given `<path>`
(i.e. smudge filters, end-of-line conversion, etc).

In addition, they cannot be used with `--python`, `--tcl`,
`--shell`, `--perl`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt | 18 ++++++--
 builtin/for-each-ref.c             | 11 ++++-
 ref-filter.c                       | 68 ++++++++++++++++++++++++------
 t/t6300-for-each-ref.sh            | 63 +++++++++++++++++++++++++++
 4 files changed, 141 insertions(+), 19 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index e85b7b19a530..e49b8861f474 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -76,7 +76,8 @@ OPTIONS
 
 --rest=<rest>::
 	If given, the `%(rest)` placeholders in the `--format` option
-	will be replaced by its contents.
+	will be replaced by its contents. At the same time, its contents
+	is used as the `<path>` for `%(raw:textconv)` and `%(raw:filter)`.
 
 --merged[=<object>]::
 	Only list refs whose tips are reachable from the
@@ -248,9 +249,18 @@ The raw data in a object is `raw`.
 raw:size::
 	The raw data size of the object.
 
-Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
-`--perl` because the host language may not support arbitrary binary data in the
-variables of its string type.
+raw:textconv::
+	Show the content as transformed by a textconv filter. It must be used
+	with `--rest`.
+
+raw:filters::
+	Show the content as converted by the filters configured in
+	the current working tree for the given `<path>` (i.e. smudge filters,
+	end-of-line conversion, etc). It must be used with `--rest`.
+
+Note that `--format=%(raw)`, `--format=%(raw:textconv)`, `--format=%(raw:filter)`
+can not be used with `--python`, `--shell`, `--tcl`, `--perl` because the host
+language may not support arbitrary binary data in the variables of its string type.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index fac7777fd2c0..4be5ddacbac1 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -5,6 +5,7 @@
 #include "object.h"
 #include "parse-options.h"
 #include "ref-filter.h"
+#include "userdiff.h"
 
 static char const * const for_each_ref_usage[] = {
 	N_("git for-each-ref [<options>] [<pattern>]"),
@@ -14,6 +15,14 @@ static char const * const for_each_ref_usage[] = {
 	NULL
 };
 
+static int git_for_each_ref_config(const char *var, const char *value, void *cb)
+{
+	if (userdiff_config(var, value) < 0)
+		return -1;
+
+	return git_default_config(var, value, cb);
+}
+
 int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -57,7 +66,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 
 	format.format = "%(objectname) %(objecttype)\t%(refname)";
 
-	git_config(git_default_config, NULL);
+	git_config(git_for_each_ref_config, NULL);
 
 	parse_options(argc, argv, prefix, opts, for_each_ref_usage, 0);
 	if (maxcount < 0) {
diff --git a/ref-filter.c b/ref-filter.c
index a859a94aa8e0..ae684924b363 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1,3 +1,4 @@
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
 #include "cache.h"
 #include "parse-options.h"
@@ -192,7 +193,7 @@ static struct used_atom {
 			unsigned int nlines;
 		} contents;
 		struct {
-			enum { RAW_BARE, RAW_LENGTH } option;
+			enum { RAW_BARE, RAW_FILTERS, RAW_LENGTH, RAW_TEXT_CONV } option;
 		} raw_data;
 		struct {
 			cmp_status cmp_status;
@@ -434,12 +435,19 @@ static int contents_atom_parser(struct ref_format *format, struct used_atom *ato
 static int raw_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
-	if (!arg)
+	if (!arg) {
 		atom->u.raw_data.option = RAW_BARE;
-	else if (!strcmp(arg, "size"))
+	} else if (!strcmp(arg, "size")) {
 		atom->u.raw_data.option = RAW_LENGTH;
-	else
+	} else if (!strcmp(arg, "filters")) {
+		atom->u.raw_data.option = RAW_FILTERS;
+		format->use_rest = 1;
+	} else if (!strcmp(arg, "textconv")) {
+		atom->u.raw_data.option = RAW_TEXT_CONV;
+		format->use_rest = 1;
+	} else {
 		return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
+	}
 	return 0;
 }
 
@@ -1018,7 +1026,9 @@ int verify_ref_format(struct ref_format *format)
 		if (at < 0)
 			die("%s", err.buf);
 		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
-		    used_atom[at].u.raw_data.option == RAW_BARE)
+		    (used_atom[at].u.raw_data.option == RAW_BARE ||
+		     used_atom[at].u.raw_data.option == RAW_FILTERS ||
+		     used_atom[at].u.raw_data.option == RAW_TEXT_CONV))
 			die(_("--format=%.*s cannot be used with"
 			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
@@ -1403,7 +1413,7 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 /* See grab_values */
-static int grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data,
+static int grab_sub_body_contents(struct ref_array_item *ref, int deref, struct expand_data *data,
 				  struct strbuf *err)
 {
 	int i;
@@ -1411,6 +1421,7 @@ static int grab_sub_body_contents(struct atom_value *val, int deref, struct expa
 	size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
 	void *buf = data->content;
 	unsigned long buf_size = data->size;
+	struct atom_value *val = ref->value;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
@@ -1425,12 +1436,39 @@ static int grab_sub_body_contents(struct atom_value *val, int deref, struct expa
 
 		if (atom_type == ATOM_RAW) {
 			if (atom->u.raw_data.option == RAW_BARE) {
-				v->s = xmemdupz(buf, buf_size);
-				v->s_size = buf_size;
+				goto bare;
 			} else if (atom->u.raw_data.option == RAW_LENGTH) {
 				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
+			} else if (atom->u.raw_data.option == RAW_FILTERS ||
+				   atom->u.raw_data.option == RAW_TEXT_CONV) {
+				if (!ref->rest)
+					return strbuf_addf_ret(err, -1, _("missing path for '%s'"),
+							       oid_to_hex(&data->oid));
+				if (data->type != OBJ_BLOB)
+					goto bare;
+				if (atom->u.raw_data.option == RAW_FILTERS) {
+					struct strbuf strbuf = STRBUF_INIT;
+					struct checkout_metadata meta;
+
+					init_checkout_metadata(&meta, NULL, NULL, &data->oid);
+					if (convert_to_working_tree(&the_index, ref->rest, buf, data->size, &strbuf, &meta)) {
+						v->s_size = strbuf.len;
+						v->s = strbuf_detach(&strbuf, NULL);
+					} else {
+						goto bare;
+					}
+				} else if (atom->u.raw_data.option == RAW_TEXT_CONV) {
+					if (!textconv_object(the_repository,
+							ref->rest, 0100644, &data->oid,
+							1, (char **)(&v->s), &v->s_size))
+						goto bare;
+				}
 			}
 			continue;
+bare:
+			v->s = xmemdupz(buf, buf_size);
+			v->s_size = buf_size;
+			continue;
 		}
 
 		if ((data->type != OBJ_TAG &&
@@ -1503,33 +1541,35 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static int grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data, struct strbuf *err)
+static int grab_values(struct ref_array_item *ref, int deref, struct object *obj,
+		       struct expand_data *data, struct strbuf *err)
 {
 	void *buf = data->content;
+	struct atom_value *val = ref->value;
 	int ret = 0;
 
 	switch (obj->type) {
 	case OBJ_TAG:
 		grab_tag_values(val, deref, obj);
-		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+		if ((ret = grab_sub_body_contents(ref, deref, data, err)))
 			return ret;
 		grab_person("tagger", val, deref, buf);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
-		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+		if ((ret = grab_sub_body_contents(ref, deref, data, err)))
 			return ret;
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
-		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+		if ((ret = grab_sub_body_contents(ref, deref, data, err)))
 			return ret;
 		break;
 	case OBJ_BLOB:
 		/* grab_blob_values(val, deref, obj, buf, sz); */
-		if ((ret = grab_sub_body_contents(val, deref, data, err)))
+		if ((ret = grab_sub_body_contents(ref, deref, data, err)))
 			return ret;
 		break;
 	default:
@@ -1755,7 +1795,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
 					       oid_to_hex(&oi->oid), ref->refname);
 		}
-		ret = grab_values(ref->value, deref, *obj, oi, err);
+		ret = grab_values(ref, deref, *obj, oi, err);
 	}
 
 	grab_common_values(ref->value, deref, oi);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index e908b2ca0522..4ba2aa2dcd73 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1365,6 +1365,69 @@ test_expect_success 'for-each-ref --ignore-case works on multiple sort keys' '
 	test_cmp expect actual
 '
 
+test_expect_success 'ref-filter raw:textconv and raw:filter setup ' '
+	echo "*.txt eol=crlf diff=txt" >.gitattributes &&
+	echo "hello" | append_cr >world.txt &&
+	git add .gitattributes world.txt &&
+	test_tick &&
+	git commit -m "Initial commit" &&
+	git update-ref refs/myblobs/world_blob HEAD:world.txt
+'
+
+test_expect_success 'basic atom raw:filters with --rest' '
+	sha1=$(git rev-parse -q --verify HEAD:world.txt) &&
+	git for-each-ref --format="%(objectname) %(raw:filters)" --rest="HEAD:world.txt" \
+	    refs/myblobs/world_blob >actual &&
+	printf "%s hello\r\n\n" $sha1 >expect &&
+	test_cmp expect actual &&
+	git for-each-ref --format="%(objectname) %(raw:filters)" --rest="HEAD:world.txt2" \
+	    refs/myblobs/world_blob >actual &&
+	printf "%s hello\n\n" $sha1 >expect &&
+	test_cmp expect actual
+
+'
+
+test_expect_success 'basic atom raw:textconv with --rest' '
+	sha1=$(git rev-parse -q --verify HEAD:world.txt) &&
+	git -c diff.txt.textconv="tr A-Za-z N-ZA-Mn-za-m <" \
+	    for-each-ref --format="%(objectname) %(raw:textconv)" --rest="HEAD:world.txt" \
+	    refs/myblobs/world_blob >actual &&
+	printf "%s uryyb\r\n\n" $sha1 >expect &&
+	test_cmp expect actual &&
+	git for-each-ref --format="%(objectname) %(raw:textconv)" --rest="HEAD:world.txt2" \
+	    refs/myblobs/world_blob >actual &&
+	printf "%s hello\n\n" $sha1 >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success '%(raw:textconv) without --rest must failed' '
+	test_must_fail git for-each-ref --format="%(raw:textconv)"
+'
+
+test_expect_success '%(raw:filters) without --rest must failed' '
+	test_must_fail git for-each-ref --format="%(raw:filters)"
+'
+
+test_expect_success '%(raw:textconv) with --shell must failed' '
+	test_must_fail git for-each-ref --format="%(raw:textconv)" \
+			   --shell --rest="HEAD:world.txt"
+'
+
+test_expect_success '%(raw:filters) with --shell must failed' '
+	test_must_fail git for-each-ref --format="%(raw:filters)" \
+			   --shell --rest="HEAD:world.txt"
+'
+
+test_expect_success '%(raw:textconv) with --shell and --sort=raw:textconv must failed' '
+	test_must_fail git for-each-ref --format="%(raw:textconv)" \
+			   --sort=raw:textconv --shell --rest="HEAD:world.txt"
+'
+
+test_expect_success '%(raw:filters) with --shell and --sort=raw:filters must failed' '
+	test_must_fail git for-each-ref --format="%(raw:filters)" \
+			   --sort=raw:filters --shell --rest="HEAD:world.txt"
+'
+
 test_expect_success 'for-each-ref reports broken tags' '
 	git tag -m "good tag" broken-tag-good HEAD &&
 	git cat-file tag broken-tag-good >good &&
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters)
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
                   ` (5 preceding siblings ...)
  2021-06-05  9:13 ` [PATCH 6/6] [GSOC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
@ 2021-06-05 10:29 ` Bagas Sanjaya
  2021-06-05 13:19   ` ZheNing Hu
  2021-06-07  5:55 ` Junio C Hamano
  2021-06-08  6:42 ` Junio C Hamano
  8 siblings, 1 reply; 27+ messages in thread
From: Bagas Sanjaya @ 2021-06-05 10:29 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, ZheNing Hu

Hi,

On 05/06/21 16.13, ZheNing Hu via GitGitGadget wrote:
> In order to let git cat-file --batch reuse ref-filter logic, This patch,
> %(rest), %(raw:textconv), %(raw:filters) atoms and --rest=<rest> option are
> added to ref-filter.
> 

Better say "Add ... atoms and --rest=<rest> option to ref-filter, in 
order to let git cat-file reuse ref-filter logic."

>   * %(rest) int the format will be replaced by the <rest> in --rest=<rest>.
>   * the <rest> in --rest=<rest> can also be used as the <path> for
>     %(raw:textconv) and %(raw:filters).

s/int/in/

Did you mean that %(rest) atom can also be used as <path> for the latter?

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters)
  2021-06-05 10:29 ` [PATCH 0/6] [GSOC][RFC] " Bagas Sanjaya
@ 2021-06-05 13:19   ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-05 13:19 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma

Bagas Sanjaya <bagasdotme@gmail.com> 于2021年6月5日周六 下午6:29写道:
>
> Hi,
>
> On 05/06/21 16.13, ZheNing Hu via GitGitGadget wrote:
> > In order to let git cat-file --batch reuse ref-filter logic, This patch,
> > %(rest), %(raw:textconv), %(raw:filters) atoms and --rest=<rest> option are
> > added to ref-filter.
> >
>
> Better say "Add ... atoms and --rest=<rest> option to ref-filter, in
> order to let git cat-file reuse ref-filter logic."
>

OK.

> >   * %(rest) int the format will be replaced by the <rest> in --rest=<rest>.
> >   * the <rest> in --rest=<rest> can also be used as the <path> for
> >     %(raw:textconv) and %(raw:filters).
>
> s/int/in/
>
> Did you mean that %(rest) atom can also be used as <path> for the latter?
>

No. just the <rest> in `--rest=<rest>` will be treated as <path> for
%(raw:textconv)
and %(raw:filters).

> --
> An old man doll... just what I always wanted! - Clara

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-05  9:13 ` [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option ZheNing Hu via GitGitGadget
@ 2021-06-05 15:20   ` Hariom verma
  2021-06-06  4:58     ` ZheNing Hu
  2021-06-07  5:52   ` Junio C Hamano
  1 sibling, 1 reply; 27+ messages in thread
From: Hariom verma @ 2021-06-05 15:20 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, ZheNing Hu

Hi,

On Sat, Jun 5, 2021 at 2:43 PM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -670,6 +674,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
>                         N_("print only branches of the object"), parse_opt_object_name),
>                 OPT_BOOL('i', "ignore-case", &icase, N_("sorting and filtering are case insensitive")),
>                 OPT_STRING(  0 , "format", &format.format, N_("format"), N_("format to use for the output")),
> +               OPT_STRING(0, "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
>                 OPT_END(),
>         };
>

Although it's not related to this patch. But I just noticed an unusual
extra space(s) before the first argument of `OPT_STRING()`. (above the
line you added)

> --- a/builtin/for-each-ref.c
> +++ b/builtin/for-each-ref.c
> @@ -37,6 +37,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>
>                 OPT_GROUP(""),
>                 OPT_INTEGER( 0 , "count", &maxcount, N_("show only <n> matched refs")),
> +               OPT_STRING(  0 , "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
>                 OPT_STRING(  0 , "format", &format.format, N_("format"), N_("format to use for the output")),
>                 OPT__COLOR(&format.use_color, N_("respect format colors")),
>                 OPT_REF_SORT(sorting_tail),

Here too in `OPT_INTEGER()` and `OPT_INTEGER()`.

Also, I don't think these extra space(s) are intended. So you don't
need to imitate them.

> --- a/builtin/tag.c
> +++ b/builtin/tag.c
> @@ -481,6 +486,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
>                 OPT_STRING(  0 , "format", &format.format, N_("format"),
>                            N_("format to use for the output")),
>                 OPT__COLOR(&format.use_color, N_("respect format colors")),
> +               OPT_STRING(0, "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
>                 OPT_BOOL('i', "ignore-case", &icase, N_("sorting and filtering are case insensitive")),
>                 OPT_END()
>         };

Here too in the first line.

Thanks,
Hariom

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-05 15:20   ` Hariom verma
@ 2021-06-06  4:58     ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-06  4:58 UTC (permalink / raw)
  To: Hariom verma
  Cc: ZheNing Hu via GitGitGadget, git, Junio C Hamano,
	Christian Couder

Hi,

Hariom verma <hariom18599@gmail.com> 于2021年6月5日周六 下午11:20写道:
>
> Hi,
>
> On Sat, Jun 5, 2021 at 2:43 PM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > --- a/builtin/branch.c
> > +++ b/builtin/branch.c
> > @@ -670,6 +674,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
> >                         N_("print only branches of the object"), parse_opt_object_name),
> >                 OPT_BOOL('i', "ignore-case", &icase, N_("sorting and filtering are case insensitive")),
> >                 OPT_STRING(  0 , "format", &format.format, N_("format"), N_("format to use for the output")),
> > +               OPT_STRING(0, "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
> >                 OPT_END(),
> >         };
> >
>
> Although it's not related to this patch. But I just noticed an unusual
> extra space(s) before the first argument of `OPT_STRING()`. (above the
> line you added)
>

Yeah, I noticed it too.

> > --- a/builtin/for-each-ref.c
> > +++ b/builtin/for-each-ref.c
> > @@ -37,6 +37,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
> >
> >                 OPT_GROUP(""),
> >                 OPT_INTEGER( 0 , "count", &maxcount, N_("show only <n> matched refs")),
> > +               OPT_STRING(  0 , "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
> >                 OPT_STRING(  0 , "format", &format.format, N_("format"), N_("format to use for the output")),
> >                 OPT__COLOR(&format.use_color, N_("respect format colors")),
> >                 OPT_REF_SORT(sorting_tail),
>
> Here too in `OPT_INTEGER()` and `OPT_INTEGER()`.
>
> Also, I don't think these extra space(s) are intended. So you don't
> need to imitate them.
>

Maybe... I find it's begin at c3428da8 (Make builtin-for-each-ref.c
use parse-opts.)
This is something from 2007. So It may be wrong to imitate its format.
But this format fix work may be can put in good-first-issue. :)

> > --- a/builtin/tag.c
> > +++ b/builtin/tag.c
> > @@ -481,6 +486,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
> >                 OPT_STRING(  0 , "format", &format.format, N_("format"),
> >                            N_("format to use for the output")),
> >                 OPT__COLOR(&format.use_color, N_("respect format colors")),
> > +               OPT_STRING(0, "rest", &format.rest, N_("rest"), N_("specify %(rest) contents")),
> >                 OPT_BOOL('i', "ignore-case", &icase, N_("sorting and filtering are case insensitive")),
> >                 OPT_END()
> >         };
>
> Here too in the first line.
>
> Thanks,
> Hariom

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-05  9:13 ` [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option ZheNing Hu via GitGitGadget
  2021-06-05 15:20   ` Hariom verma
@ 2021-06-07  5:52   ` Junio C Hamano
  2021-06-07 13:02     ` ZheNing Hu
  1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-06-07  5:52 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Hariom Verma, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> In order to let "cat-file --batch=%(rest)" use the ref-filter
> interface, add %(rest) atom for ref-filter and --rest option
> for "git for-each-ref", "git branch", "git tag" and "git verify-tag".
> `--rest` specify a string to replace %(rest) placeholders of
> the --format option.

I cannot think of a sane reason why we need to allow "%(rest)" in
anithing but "cat-file --batch", where a natural source of %(rest)
exists in its input stream (i.e. each input record begins with an
object name to be processed, and the rest of the record can become
"%(rest)").

The "cat-file --batch" thing is much more understandable.  You could
for example:

    git ls-files -s |
    sed -e 's/^[0-7]* \([0-9a-f]*\) [0-3]	/\1 /' |
    git cat-file --batch='%(objectname) %(objecttype) %(rest)'

to massage output from "ls-files -s" like this

    100644 c2f5fe385af1bbc161f6c010bdcf0048ab6671ed 0	.cirrus.yml
    100644 c592dda681fecfaa6bf64fb3f539eafaf4123ed8 0	.clang-format
    100644 f9d819623d832113014dd5d5366e8ee44ac9666a 0	.editorconfig
    ...

into recods of "<objectname> <path>", and each output record will
replay the <path> part from each corresponding input record.

Unless for-each-ref family of commands read the list of refs that it
shows from their standard input (they do not, and I do not think it
makes any sense to teach them to), there is no place to feed the
"rest" information that is associated with each output record.  The
only thing the commands taught about %(rest) by this patch can do is
to parrot the same string into each and every output record.  I am
not seeing what this new feature is attempting to give us.

If anything, I would imagine that it would be a very useful addition
to teach the ref-filter machinery an ability to optionally error out
depending on the caller when the caller attempts to use certain
placeholder.  Then, we can reject "git branch --sort=rest" sensibly,
instead of accepting "git branch --sort=rest --rest=constant", which
is not technically wrong per-se, but smells like a total nonsense from
practical usefulness's point of view.

> -	[--list] [<pattern>...]
> +	[--list] [<pattern>...] [--rest=<rest>]
>  'git branch' [--track | --no-track] [-f] <branchname> [<start-point>]
>  'git branch' (--set-upstream-to=<upstream> | -u <upstream>) [<branchname>]
>  'git branch' --unset-upstream [<branchname>]
> @@ -298,6 +298,10 @@ start-point is either a local or remote-tracking branch.
>  	and the object it points at.  The format is the same as
>  	that of linkgit:git-for-each-ref[1].
>  
> +--rest=<rest>::
> +	If given, the `%(rest)` placeholders in the `--format` option
> +	will be replaced.

If not given, what happens?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters)
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
                   ` (6 preceding siblings ...)
  2021-06-05 10:29 ` [PATCH 0/6] [GSOC][RFC] " Bagas Sanjaya
@ 2021-06-07  5:55 ` Junio C Hamano
  2021-06-07 13:06   ` ZheNing Hu
  2021-06-08  6:42 ` Junio C Hamano
  8 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-06-07  5:55 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Hariom Verma, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> The current series is based on 0efed9435 ([GSOC] ref-filter: add %(raw)
> atom)

I do not have that commit object, but these six patches include the
two commits that are %(raw) and %(raw:size), so I'll just discard
the old round that wasn't based on the atom-type stuff and queue
these six as a single series.

As I already said, I am not sure how %(rest) makes any sense outside
the context of "cat-file --batch"; I suspect it would make more sense
to make it easier to arrange certain placeholders to error out when
used in a context where they do not make sense (e.g. use of --rest
in "git branch --list").


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-07  5:52   ` Junio C Hamano
@ 2021-06-07 13:02     ` ZheNing Hu
  2021-06-07 13:18       ` ZheNing Hu
  2021-06-08  6:50       ` Junio C Hamano
  0 siblings, 2 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-07 13:02 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年6月7日周一 下午1:52写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > In order to let "cat-file --batch=%(rest)" use the ref-filter
> > interface, add %(rest) atom for ref-filter and --rest option
> > for "git for-each-ref", "git branch", "git tag" and "git verify-tag".
> > `--rest` specify a string to replace %(rest) placeholders of
> > the --format option.
>
> I cannot think of a sane reason why we need to allow "%(rest)" in
> anithing but "cat-file --batch", where a natural source of %(rest)
> exists in its input stream (i.e. each input record begins with an
> object name to be processed, and the rest of the record can become
> "%(rest)").
>

First of all, although %(rest) is meaningless in ordinary circumstances,
ref-filter must learn %(rest), it is impossible for us to leave the parsing
of %(rest) in cat-file.c alone.

Then, `--rest` is a strategy that make %(rest) can use in `git for-each-ref`
or `git branch -l`. As you said, it is just a boring placeholder used for string
replacement. We can make it output only empty content, If we really don’t
need `--rest`.

> The "cat-file --batch" thing is much more understandable.  You could
> for example:
>
>     git ls-files -s |
>     sed -e 's/^[0-7]* \([0-9a-f]*\) [0-3]       /\1 /' |
>     git cat-file --batch='%(objectname) %(objecttype) %(rest)'
>

s/[0-3]       /[0-3]\t/

> to massage output from "ls-files -s" like this
>
>     100644 c2f5fe385af1bbc161f6c010bdcf0048ab6671ed 0   .cirrus.yml
>     100644 c592dda681fecfaa6bf64fb3f539eafaf4123ed8 0   .clang-format
>     100644 f9d819623d832113014dd5d5366e8ee44ac9666a 0   .editorconfig
>     ...
>
> into recods of "<objectname> <path>", and each output record will
> replay the <path> part from each corresponding input record.
>

Yeah, the <path> in the input will be treated as "rest".

> Unless for-each-ref family of commands read the list of refs that it
> shows from their standard input (they do not, and I do not think it
> makes any sense to teach them to), there is no place to feed the
> "rest" information that is associated with each output record.  The
> only thing the commands taught about %(rest) by this patch can do is
> to parrot the same string into each and every output record.  I am
> not seeing what this new feature is attempting to give us.
>

"parrot the same string"? I think we should use an empty string here,
"parrot the same string" more like what the "git log --format" family does.

> If anything, I would imagine that it would be a very useful addition
> to teach the ref-filter machinery an ability to optionally error out
> depending on the caller when the caller attempts to use certain
> placeholder.  Then, we can reject "git branch --sort=rest" sensibly,
> instead of accepting "git branch --sort=rest --rest=constant", which
> is not technically wrong per-se, but smells like a total nonsense from
> practical usefulness's point of view.
>

This sounds like it might help `cat-file` to reject some useless atoms
like %(refname). So something like:

$ git for-each-ref --format="%(objectname) %(objectsize)"
--refject-atoms="%(objectsize) %(objectname)"

will fail.

"git for-each-ref" family use hardcoded to reject %(rest).
I can try to achieve this function.

> > -     [--list] [<pattern>...]
> > +     [--list] [<pattern>...] [--rest=<rest>]
> >  'git branch' [--track | --no-track] [-f] <branchname> [<start-point>]
> >  'git branch' (--set-upstream-to=<upstream> | -u <upstream>) [<branchname>]
> >  'git branch' --unset-upstream [<branchname>]
> > @@ -298,6 +298,10 @@ start-point is either a local or remote-tracking branch.
> >       and the object it points at.  The format is the same as
> >       that of linkgit:git-for-each-ref[1].
> >
> > +--rest=<rest>::
> > +     If given, the `%(rest)` placeholders in the `--format` option
> > +     will be replaced.
>
> If not given, what happens?

Will output an empty string.

Hope we can reach an agreement:
delete `--rest` and add `--reject-atoms`. ;-)

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters)
  2021-06-07  5:55 ` Junio C Hamano
@ 2021-06-07 13:06   ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-07 13:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年6月7日周一 下午1:56写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > The current series is based on 0efed9435 ([GSOC] ref-filter: add %(raw)
> > atom)
>
> I do not have that commit object, but these six patches include the
> two commits that are %(raw) and %(raw:size), so I'll just discard
> the old round that wasn't based on the atom-type stuff and queue
> these six as a single series.
>

Well, it is a commit that has not been sent to the mailing list. But it’s okay
to treat the new 6 commits as a new patch series.

> As I already said, I am not sure how %(rest) makes any sense outside
> the context of "cat-file --batch"; I suspect it would make more sense
> to make it easier to arrange certain placeholders to error out when
> used in a context where they do not make sense (e.g. use of --rest
> in "git branch --list").
>

I agree.

--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-07 13:02     ` ZheNing Hu
@ 2021-06-07 13:18       ` ZheNing Hu
  2021-06-08  6:16         ` ZheNing Hu
  2021-06-08  6:50       ` Junio C Hamano
  1 sibling, 1 reply; 27+ messages in thread
From: ZheNing Hu @ 2021-06-07 13:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

ZheNing Hu <adlternative@gmail.com> 于2021年6月7日周一 下午9:02写道:
>
> Hope we can reach an agreement:
> delete `--rest` and add `--reject-atoms`. ;-)
>

I forget one thing: %(raw:textconv) and %(raw:filters) can use
the value of "--rest" as their <path>. But now if we want delete --rest,
they can not be used for "for-each-ref" family, Git will die with
"missing path for 'xxx'".

> Thanks.
> --
> ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/6] [GSOC] ref-filter: add %(raw) atom
  2021-06-05  9:13 ` [PATCH 2/6] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-06-08  5:07   ` Junio C Hamano
  2021-06-08  6:10     ` ZheNing Hu
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-06-08  5:07 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Hariom Verma, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
>  {
>  	struct atom_value *va, *vb;
> @@ -2389,10 +2452,30 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
>  	} else if (s->sort_flags & REF_SORTING_VERSION) {
>  		cmp = versioncmp(va->s, vb->s);
>  	} else if (cmp_type == FIELD_STR) {
> -		int (*cmp_fn)(const char *, const char *);
> -		cmp_fn = s->sort_flags & REF_SORTING_ICASE
> -			? strcasecmp : strcmp;
> -		cmp = cmp_fn(va->s, vb->s);
> +		if (va->s_size == ATOM_VALUE_S_SIZE_INIT &&
> +		    vb->s_size == ATOM_VALUE_S_SIZE_INIT) {
> +			int (*cmp_fn)(const char *, const char *);
> +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
> +				? strcasecmp : strcmp;
> +			cmp = cmp_fn(va->s, vb->s);
> +		} else {
> +			int (*cmp_fn)(const void *, const void *, size_t);
> +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
> +				? memcasecmp : memcmp;
> +			size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
> +					strlen(va->s) : va->s_size;
> +			size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
> +					strlen(vb->s) : vb->s_size;

This breaks -Wdecl-after-stmt.  A possible fix below.

diff --git a/ref-filter.c b/ref-filter.c
index 46aec291de..648f9cabff 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2459,13 +2459,13 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 				? strcasecmp : strcmp;
 			cmp = cmp_fn(va->s, vb->s);
 		} else {
-			int (*cmp_fn)(const void *, const void *, size_t);
-			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+			size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT
+					? strlen(va->s) : va->s_size;
+			size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT
+					? strlen(vb->s) : vb->s_size;
+			int (*cmp_fn)(const void *, const void *, size_t) =
+				s->sort_flags & REF_SORTING_ICASE
 				? memcasecmp : memcmp;
-			size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
-					strlen(va->s) : va->s_size;
-			size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
-					strlen(vb->s) : vb->s_size;
 
 			cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
 				     a_size : b_size);

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/6] [GSOC] ref-filter: add %(raw) atom
  2021-06-08  5:07   ` Junio C Hamano
@ 2021-06-08  6:10     ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-08  6:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年6月8日周二 下午1:07写道:
>
> This breaks -Wdecl-after-stmt.  A possible fix below.
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 46aec291de..648f9cabff 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2459,13 +2459,13 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
>                                 ? strcasecmp : strcmp;
>                         cmp = cmp_fn(va->s, vb->s);
>                 } else {
> -                       int (*cmp_fn)(const void *, const void *, size_t);
> -                       cmp_fn = s->sort_flags & REF_SORTING_ICASE
> +                       size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT
> +                                       ? strlen(va->s) : va->s_size;
> +                       size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT
> +                                       ? strlen(vb->s) : vb->s_size;
> +                       int (*cmp_fn)(const void *, const void *, size_t) =
> +                               s->sort_flags & REF_SORTING_ICASE
>                                 ? memcasecmp : memcmp;
> -                       size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
> -                                       strlen(va->s) : va->s_size;
> -                       size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
> -                                       strlen(vb->s) : vb->s_size;
>
>                         cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
>                                      a_size : b_size);

You are right.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-07 13:18       ` ZheNing Hu
@ 2021-06-08  6:16         ` ZheNing Hu
  2021-06-08  6:59           ` Junio C Hamano
  0 siblings, 1 reply; 27+ messages in thread
From: ZheNing Hu @ 2021-06-08  6:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

ZheNing Hu <adlternative@gmail.com> 于2021年6月7日周一 下午9:18写道:
>
> ZheNing Hu <adlternative@gmail.com> 于2021年6月7日周一 下午9:02写道:
> >
> > Hope we can reach an agreement:
> > delete `--rest` and add `--reject-atoms`. ;-)
> >
>
> I forget one thing: %(raw:textconv) and %(raw:filters) can use
> the value of "--rest" as their <path>. But now if we want delete --rest,
> they can not be used for "for-each-ref" family, Git will die with
> "missing path for 'xxx'".
>

If we actually delete "--rest", we will have no way to test %(raw:textconv)
and %(raw:filters)... So now I think we can keep --rest (or use
another name --path)
and let "git for-each-ref" family reject %(rest) by default.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters)
  2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
                   ` (7 preceding siblings ...)
  2021-06-07  5:55 ` Junio C Hamano
@ 2021-06-08  6:42 ` Junio C Hamano
  2021-06-08 12:52   ` ZheNing Hu
  8 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-06-08  6:42 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Hariom Verma, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> ZheNing Hu (6):
>   [GSOC] ref-filter: add obj-type check in grab contents
>   [GSOC] ref-filter: add %(raw) atom
>   [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
>   [GSOC] ref-filter: add %(rest) atom and --rest option
>   [GSOC] ref-filter: teach grab_sub_body_contents() return value and err
>   [GSOC] ref-filter: add %(raw:textconv) and %(raw:filters)

I haven't gotten around looking at anything after the %(rest) one,
but 

https://github.com/git/git/runs/2770688471?check_suite_focus=true

seems to tell us that there is "size_t *" vs "ulong *" type
confusion, possibly around the textconv thing.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-07 13:02     ` ZheNing Hu
  2021-06-07 13:18       ` ZheNing Hu
@ 2021-06-08  6:50       ` Junio C Hamano
  2021-06-08 12:32         ` ZheNing Hu
  1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-06-08  6:50 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

ZheNing Hu <adlternative@gmail.com> writes:

> First of all, although %(rest) is meaningless in ordinary circumstances,
> ref-filter must learn %(rest), it is impossible for us to leave the parsing
> of %(rest) in cat-file.c alone.

Oh, there is no question about that.

> Then, `--rest` is a strategy that make %(rest) can use in `git for-each-ref`
> or `git branch -l`.

If there is no need to expose %(rest) to the users who write
--format for these two commands, it would be much better to detect
attempted use of %(rest) and error out.

> This sounds like it might help `cat-file` to reject some useless atoms
> like %(refname).

Yes.

> So something like:
>
> $ git for-each-ref --format="%(objectname) %(objectsize)"
> --refject-atoms="%(objectsize) %(objectname)"
>
> will fail.

I don't understand.  Why do you even need to add --reject?  Why
would any user would want to use it with for-each-ref?

Without any end-user input, %(rest) for for-each-ref would not make
sense, and %(refname) for cat-file --batch would not make sense, I
would imagine, so there is no need to be able to tell --reject=rest
to for-each-ref.  It is not like giving --no-reject=rest to for-each-ref
and make it interpolate to an empty string is a useful feature anyway,
so I do not see a need for such an option.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-08  6:16         ` ZheNing Hu
@ 2021-06-08  6:59           ` Junio C Hamano
  2021-06-08 12:39             ` ZheNing Hu
  2021-06-09  7:00             ` Junio C Hamano
  0 siblings, 2 replies; 27+ messages in thread
From: Junio C Hamano @ 2021-06-08  6:59 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

ZheNing Hu <adlternative@gmail.com> writes:

> ZheNing Hu <adlternative@gmail.com> 于2021年6月7日周一 下午9:18写道:
>>
>> ZheNing Hu <adlternative@gmail.com> 于2021年6月7日周一 下午9:02写道:
>> >
>> > Hope we can reach an agreement:
>> > delete `--rest` and add `--reject-atoms`. ;-)
>> >
>>
>> I forget one thing: %(raw:textconv) and %(raw:filters) can use
>> the value of "--rest" as their <path>. But now if we want delete --rest,
>> they can not be used for "for-each-ref" family, Git will die with
>> "missing path for 'xxx'".
>>
>
> If we actually delete "--rest", we will have no way to test %(raw:textconv)
> and %(raw:filters)... So now I think we can keep --rest (or use
> another name --path)
> and let "git for-each-ref" family reject %(rest) by default.

I didn't read beyond the %(rest) thing, but do we even need
%(raw:textconv) to begin with?  It is totally useless in the context
of for-each-ref because textconv by its nature is tied to attributes
that by definition needs a blob that is sitting at a path, but the
objects for-each-ref and friends visit are mostly commits and tags,
and even for refs that point at a blob, there isn't any "path"
information to pull attribute for.

Is that what you want to add to give "cat-file --batch"?  Even in
the context of "cat-file --batch", you can throw an object name for
a blob to the command, but there is no path for the blob (a blob can
appear at different places in different trees---think "rename), so I
am not sure what benefit you are trying to derive from it.

Thanks.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-08  6:50       ` Junio C Hamano
@ 2021-06-08 12:32         ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-08 12:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年6月8日周二 下午2:50写道:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> > First of all, although %(rest) is meaningless in ordinary circumstances,
> > ref-filter must learn %(rest), it is impossible for us to leave the parsing
> > of %(rest) in cat-file.c alone.
>
> Oh, there is no question about that.
>

OK.

>
> I don't understand.  Why do you even need to add --reject?  Why
> would any user would want to use it with for-each-ref?
>
> Without any end-user input, %(rest) for for-each-ref would not make
> sense, and %(refname) for cat-file --batch would not make sense, I
> would imagine, so there is no need to be able to tell --reject=rest
> to for-each-ref.  It is not like giving --no-reject=rest to for-each-ref
> and make it interpolate to an empty string is a useful feature anyway,
> so I do not see a need for such an option.

Yeah... I might have thought it complicated before. We can reject %(rest) in
verify_ref_format() easily. It's like we refused --<lang> before. :)

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-08  6:59           ` Junio C Hamano
@ 2021-06-08 12:39             ` ZheNing Hu
  2021-06-09  7:00             ` Junio C Hamano
  1 sibling, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-08 12:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年6月8日周二 下午2:59写道:
>
> >
> > If we actually delete "--rest", we will have no way to test %(raw:textconv)
> > and %(raw:filters)... So now I think we can keep --rest (or use
> > another name --path)
> > and let "git for-each-ref" family reject %(rest) by default.
>
> I didn't read beyond the %(rest) thing, but do we even need
> %(raw:textconv) to begin with?  It is totally useless in the context
> of for-each-ref because textconv by its nature is tied to attributes
> that by definition needs a blob that is sitting at a path, but the
> objects for-each-ref and friends visit are mostly commits and tags,
> and even for refs that point at a blob, there isn't any "path"
> information to pull attribute for.
>

After thinking about your words, now I think maybe we can leave
%(raw:textconv) and %(raw:filter) after cat-file --batch start using
ref-filter logic, so that we can provide them with suitable tests,
and we don't need `--rest` anymore.

> Is that what you want to add to give "cat-file --batch"?  Even in
> the context of "cat-file --batch", you can throw an object name for
> a blob to the command, but there is no path for the blob (a blob can
> appear at different places in different trees---think "rename), so I
> am not sure what benefit you are trying to derive from it.
>

So I will remove the last two commits.

> Thanks.
>
>

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters)
  2021-06-08  6:42 ` Junio C Hamano
@ 2021-06-08 12:52   ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-08 12:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年6月8日周二 下午2:42写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > ZheNing Hu (6):
> >   [GSOC] ref-filter: add obj-type check in grab contents
> >   [GSOC] ref-filter: add %(raw) atom
> >   [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
> >   [GSOC] ref-filter: add %(rest) atom and --rest option
> >   [GSOC] ref-filter: teach grab_sub_body_contents() return value and err
> >   [GSOC] ref-filter: add %(raw:textconv) and %(raw:filters)
>
> I haven't gotten around looking at anything after the %(rest) one,
> but
>
> https://github.com/git/git/runs/2770688471?check_suite_focus=true
>
> seems to tell us that there is "size_t *" vs "ulong *" type
> confusion, possibly around the textconv thing.
>
>

ok, I will change it when we use %(raw:textconv) next time.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-08  6:59           ` Junio C Hamano
  2021-06-08 12:39             ` ZheNing Hu
@ 2021-06-09  7:00             ` Junio C Hamano
  2021-06-09 12:47               ` ZheNing Hu
  1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-06-09  7:00 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> writes:

> Is that what you want to add to give "cat-file --batch"?  Even in
> the context of "cat-file --batch", you can throw an object name for
> a blob to the command, but there is no path for the blob (a blob can
> appear at different places in different trees---think "rename), so I
> am not sure what benefit you are trying to derive from it.

I think I kind-of see what is going on here.  There is

    git cat-file blob --textconv --path="$path" "$blob_object_name"

that allows a blob to be fed to the command, pretend as if it
appears at $path in a tree object and grab attribute for it, and
show the blob contents converted using the textconv filter.  If we
were to mimic it by extending the format based substitutions, a
design consistent with the behaviour is to teach --format=%(raw)
to show the contents after applying the textconv filter instead of
the raw blob contents.

And there is a corresponding

    git cat-file --batch --textconv

The "--path=$path" parameter is omitted when using --batch, as each
object would sit at different path in the tree (so the input stream
would be given as a run of "<blob> <path>" to give each item its own
path).

So to answer my question in the previous message, yes, this is an
attempt to support the "cat-file --textconv".  So in the context of
that command, something may need to be added.  But I do not think it
makes any sense to expose that to for-each-ref and friends, even if
we were to share the internal machinery (after all, sharing of the
internal machinery is a mere means to an end that is to make it
easier to give the same syntax and same behaviour to end users and
is not a goal itself; "because we use the same machinery, the users
have to tolerate that irrelevant %(atoms) are accepted by the parser"
is not making a good excuse for a sloppy implementation).

Having said all that, I somehow doubt that the "--batch=<format>"
was designed to interact sensibly with the "--textconv" option.
builtin/cat-file.c::expand_atom() does not know anything at all that
the data could be modified from the raw contents of the blob, so
--batch="%(contents) %(size)" --textconv, if existed, may show the
conveted contents with size of blob before conversion, or something
incoherent like that.  And if your rewrite using the shared internal
machinery results in a more coherent behaviour, that would be
excellent.  For example, we could imagine that the machinery, when
textconv (or filter) is in use, would first grab the blob contents
and run the requested conversion, and then work solely on that
conveted contents when computing what to fill with %(raw:size) and
other blob-related atoms.

Thanks.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option
  2021-06-09  7:00             ` Junio C Hamano
@ 2021-06-09 12:47               ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-06-09 12:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年6月9日周三 下午3:00写道:
>
> I think I kind-of see what is going on here.  There is
>
>     git cat-file blob --textconv --path="$path" "$blob_object_name"
>
> that allows a blob to be fed to the command, pretend as if it
> appears at $path in a tree object and grab attribute for it, and
> show the blob contents converted using the textconv filter.  If we
> were to mimic it by extending the format based substitutions, a
> design consistent with the behaviour is to teach --format=%(raw)
> to show the contents after applying the textconv filter instead of
> the raw blob contents.
>

Yes, this is exactly what cat-file --textconv does.

> And there is a corresponding
>
>     git cat-file --batch --textconv
>
> The "--path=$path" parameter is omitted when using --batch, as each
> object would sit at different path in the tree (so the input stream
> would be given as a run of "<blob> <path>" to give each item its own
> path).
>

Just like let --batch omitted --path, --rest is meaningless for "for-ech-ref".

> So to answer my question in the previous message, yes, this is an
> attempt to support the "cat-file --textconv".  So in the context of
> that command, something may need to be added.  But I do not think it
> makes any sense to expose that to for-each-ref and friends, even if
> we were to share the internal machinery (after all, sharing of the
> internal machinery is a mere means to an end that is to make it
> easier to give the same syntax and same behaviour to end users and
> is not a goal itself; "because we use the same machinery, the users
> have to tolerate that irrelevant %(atoms) are accepted by the parser"
> is not making a good excuse for a sloppy implementation).
>

Because "git cat-file --batch" will only print the contents of the object once,
so when implements the function of textconv/filters in ref-filter,
we should really consider whether we should let something like
"%(raw) %(raw) %(raw) %(raw:size)" all pass the conversion of textconv/filters.
If it is my previous %(raw:textconv) or %(raw:filter), they can only print
the converted content separately, and we need:

$ git for-ecah-ref --format="%(raw:filters) %(raw:filters)
%(raw:filters) %(raw:filters:size)"

As you said it might be too complicated for the user....

> Having said all that, I somehow doubt that the "--batch=<format>"
> was designed to interact sensibly with the "--textconv" option.
> builtin/cat-file.c::expand_atom() does not know anything at all that
> the data could be modified from the raw contents of the blob, so
> --batch="%(contents) %(size)" --textconv, if existed, may show the
> conveted contents with size of blob before conversion, or something
> incoherent like that.  And if your rewrite using the shared internal
> machinery results in a more coherent behaviour, that would be
> excellent.  For example, we could imagine that the machinery, when
> textconv (or filter) is in use, would first grab the blob contents
> and run the requested conversion, and then work solely on that
> conveted contents when computing what to fill with %(raw:size) and
> other blob-related atoms.
>

And with your --textconv/--filters, we only need:

$ git for-ecah-ref --format="%(raw) %(raw) %(raw) %(raw:size)" --filters

This will be more concise for users. I will try to build --filters, --textconv
for ref-filter . But as stated in the previous reply, it needs to be placed
after the transplant of cat-file --batch (Because --path is useless for
for-each-ref, and at the same time we need proper testing for
--filter/--textconv.)

Thanks, your reply is very reasonable,
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-06-09 12:47 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-05  9:13 [PATCH 0/6] [GSOC][RFC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
2021-06-05  9:13 ` [PATCH 1/6] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
2021-06-05  9:13 ` [PATCH 2/6] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
2021-06-08  5:07   ` Junio C Hamano
2021-06-08  6:10     ` ZheNing Hu
2021-06-05  9:13 ` [PATCH 3/6] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
2021-06-05  9:13 ` [PATCH 4/6] [GSOC] ref-filter: add %(rest) atom and --rest option ZheNing Hu via GitGitGadget
2021-06-05 15:20   ` Hariom verma
2021-06-06  4:58     ` ZheNing Hu
2021-06-07  5:52   ` Junio C Hamano
2021-06-07 13:02     ` ZheNing Hu
2021-06-07 13:18       ` ZheNing Hu
2021-06-08  6:16         ` ZheNing Hu
2021-06-08  6:59           ` Junio C Hamano
2021-06-08 12:39             ` ZheNing Hu
2021-06-09  7:00             ` Junio C Hamano
2021-06-09 12:47               ` ZheNing Hu
2021-06-08  6:50       ` Junio C Hamano
2021-06-08 12:32         ` ZheNing Hu
2021-06-05  9:13 ` [PATCH 5/6] [GSOC] ref-filter: teach grab_sub_body_contents() return value and err ZheNing Hu via GitGitGadget
2021-06-05  9:13 ` [PATCH 6/6] [GSOC] ref-filter: add %(raw:textconv) and %(raw:filters) ZheNing Hu via GitGitGadget
2021-06-05 10:29 ` [PATCH 0/6] [GSOC][RFC] " Bagas Sanjaya
2021-06-05 13:19   ` ZheNing Hu
2021-06-07  5:55 ` Junio C Hamano
2021-06-07 13:06   ` ZheNing Hu
2021-06-08  6:42 ` Junio C Hamano
2021-06-08 12:52   ` ZheNing Hu

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).