git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/5] [GSoC] Improvements to ref-filter
@ 2020-07-27 20:43 Hariom Verma via GitGitGadget
  2020-07-27 20:43 ` [PATCH 1/5] ref-filter: support different email formats Hariom Verma via GitGitGadget
                   ` (5 more replies)
  0 siblings, 6 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-07-27 20:43 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma

This is the first patch series that introduces some improvements and
features to file ref-filter.{c,h}. These changes are useful to ref-filter,
but in near future also will allow us to use ref-filter's logic in pretty.c

I plan to add more to format-support.{c,h} in the upcoming patch series.
That will lead to more improved and feature-rich ref-filter.c

Hariom Verma (5):
  ref-filter: support different email formats
  ref-filter: add `short` option for 'tree' and 'parent'
  pretty: refactor `format_sanitized_subject()`
  format-support: move `format_sanitized_subject()` from pretty
  ref-filter: add `sanitize` option for 'subject' atom

 Makefile                |  1 +
 format-support.c        | 43 +++++++++++++++++++++++++
 format-support.h        |  6 ++++
 pretty.c                | 40 +++---------------------
 ref-filter.c            | 69 ++++++++++++++++++++++++++++++++++-------
 t/t6300-for-each-ref.sh | 27 ++++++++++++++++
 6 files changed, 138 insertions(+), 48 deletions(-)
 create mode 100644 format-support.c
 create mode 100644 format-support.h


base-commit: 5c06d60fc55d2213c089f63c282468080f812686
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-684%2Fharry-hov%2Fonly-rf6-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-684/harry-hov/only-rf6-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/684
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/5] ref-filter: support different email formats
  2020-07-27 20:43 [PATCH 0/5] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
@ 2020-07-27 20:43 ` Hariom Verma via GitGitGadget
  2020-07-27 22:51   ` Junio C Hamano
  2020-07-28 13:58   ` Đoàn Trần Công Danh
  2020-07-27 20:43 ` [PATCH 2/5] ref-filter: add `short` option for 'tree' and 'parent' Hariom Verma via GitGitGadget
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-07-27 20:43 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, ref-filter only supports printing email with arrow brackets.

Let's add support for two more email options.
- trim : print email without arrow brackets.
- localpart : prints the part before the @ sign

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c            | 36 ++++++++++++++++++++++++++++++++----
 t/t6300-for-each-ref.sh | 16 ++++++++++++++++
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 8447cb09be..8563088eb1 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -102,6 +102,10 @@ static struct ref_to_worktree_map {
 	struct worktree **worktrees;
 } ref_to_worktree_map;
 
+static struct email_option{
+	enum { EO_INVALID, EO_RAW, EO_TRIM, EO_LOCALPART } option;
+} email_option;
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -1040,10 +1044,26 @@ static const char *copy_email(const char *buf)
 	const char *eoemail;
 	if (!email)
 		return xstrdup("");
-	eoemail = strchr(email, '>');
+	switch (email_option.option) {
+	case EO_RAW:
+		eoemail = strchr(email, '>') + 1;
+		break;
+	case EO_TRIM:
+		email++;
+		eoemail = strchr(email, '>');
+		break;
+	case EO_LOCALPART:
+		email++;
+		eoemail = strchr(email, '@');
+		break;
+	case EO_INVALID:
+	default:
+		return xstrdup("");
+	}
+
 	if (!eoemail)
 		return xstrdup("");
-	return xmemdupz(email, eoemail + 1 - email);
+	return xmemdupz(email, eoemail - email);
 }
 
 static char *copy_subject(const char *buf, unsigned long len)
@@ -1113,7 +1133,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			continue;
 		if (name[wholen] != 0 &&
 		    strcmp(name + wholen, "name") &&
-		    strcmp(name + wholen, "email") &&
+		    !starts_with(name + wholen, "email") &&
 		    !starts_with(name + wholen, "date"))
 			continue;
 		if (!wholine)
@@ -1124,8 +1144,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			v->s = copy_line(wholine);
 		else if (!strcmp(name + wholen, "name"))
 			v->s = copy_name(wholine);
-		else if (!strcmp(name + wholen, "email"))
+		else if (starts_with(name + wholen, "email")) {
+			email_option.option = EO_INVALID;
+			if (!strcmp(name + wholen, "email"))
+				email_option.option = EO_RAW;
+			if (!strcmp(name + wholen, "email:trim"))
+				email_option.option = EO_TRIM;
+			if (!strcmp(name + wholen, "email:localpart"))
+				email_option.option = EO_LOCALPART;
 			v->s = copy_email(wholine);
+		}
 		else if (starts_with(name + wholen, "date"))
 			grab_date(wholine, v, name);
 	}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index da59fadc5d..b8a2cb8439 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -106,15 +106,21 @@ test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
 test_atom head authorname 'A U Thor'
 test_atom head authoremail '<author@example.com>'
+test_atom head authoremail:trim 'author@example.com'
+test_atom head authoremail:localpart 'author'
 test_atom head authordate 'Tue Jul 4 01:18:44 2006 +0200'
 test_atom head committer 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head committername 'C O Mitter'
 test_atom head committeremail '<committer@example.com>'
+test_atom head committeremail:trim 'committer@example.com'
+test_atom head committeremail:localpart 'committer'
 test_atom head committerdate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head tag ''
 test_atom head tagger ''
 test_atom head taggername ''
 test_atom head taggeremail ''
+test_atom head taggeremail:trim ''
+test_atom head taggeremail:localpart ''
 test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
@@ -151,15 +157,21 @@ test_atom tag '*objecttype' 'commit'
 test_atom tag author ''
 test_atom tag authorname ''
 test_atom tag authoremail ''
+test_atom tag authoremail:trim ''
+test_atom tag authoremail:localpart ''
 test_atom tag authordate ''
 test_atom tag committer ''
 test_atom tag committername ''
 test_atom tag committeremail ''
+test_atom tag committeremail:trim ''
+test_atom tag committeremail:localpart ''
 test_atom tag committerdate ''
 test_atom tag tag 'testtag'
 test_atom tag tagger 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag taggername 'C O Mitter'
 test_atom tag taggeremail '<committer@example.com>'
+test_atom tag taggeremail:trim 'committer@example.com'
+test_atom tag taggeremail:localpart 'committer'
 test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
@@ -545,10 +557,14 @@ test_atom refs/tags/taggerless tag 'taggerless'
 test_atom refs/tags/taggerless tagger ''
 test_atom refs/tags/taggerless taggername ''
 test_atom refs/tags/taggerless taggeremail ''
+test_atom refs/tags/taggerless taggeremail:trim ''
+test_atom refs/tags/taggerless taggeremail:localpart ''
 test_atom refs/tags/taggerless taggerdate ''
 test_atom refs/tags/taggerless committer ''
 test_atom refs/tags/taggerless committername ''
 test_atom refs/tags/taggerless committeremail ''
+test_atom refs/tags/taggerless committeremail:trim ''
+test_atom refs/tags/taggerless committeremail:localpart ''
 test_atom refs/tags/taggerless committerdate ''
 test_atom refs/tags/taggerless subject 'Broken tag'
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 2/5] ref-filter: add `short` option for 'tree' and 'parent'
  2020-07-27 20:43 [PATCH 0/5] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
  2020-07-27 20:43 ` [PATCH 1/5] ref-filter: support different email formats Hariom Verma via GitGitGadget
@ 2020-07-27 20:43 ` Hariom Verma via GitGitGadget
  2020-07-27 23:21   ` Junio C Hamano
  2020-07-27 20:43 ` [PATCH 3/5] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-07-27 20:43 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Sometimes while using 'parent' and 'tree' atom, user might
want to see abbrev hash instead of full 40 character hash.

'objectname' and 'refname' already supports printing abbrev hash.
It might be convenient for users to have the same option for printing
'parent' and 'tree' hash.

Let's introduce `short` option to 'parent' and 'tree' atom.

`tree:short` - for abbrev tree hash
`parent:short` - for abbrev parent hash

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c            | 12 ++++++++----
 t/t6300-for-each-ref.sh |  4 ++++
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 8563088eb1..d5d5ff6a9d 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -983,21 +983,25 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tree")) {
+		if (!strcmp(name, "tree"))
 			v->s = xstrdup(oid_to_hex(get_commit_tree_oid(commit)));
-		}
+		else if (!strcmp(name, "tree:short"))
+			v->s = xstrdup(find_unique_abbrev(get_commit_tree_oid(commit), DEFAULT_ABBREV));
 		else if (!strcmp(name, "numparent")) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (!strcmp(name, "parent")) {
+		else if (starts_with(name, "parent")) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
 				struct commit *parent = parents->item;
 				if (parents != commit->parents)
 					strbuf_addch(&s, ' ');
-				strbuf_addstr(&s, oid_to_hex(&parent->object.oid));
+				if (!strcmp(name, "parent"))
+					strbuf_addstr(&s, oid_to_hex(&parent->object.oid));
+				else if (!strcmp(name, "parent:short"))
+					strbuf_add_unique_abbrev(&s, &parent->object.oid, DEFAULT_ABBREV);
 			}
 			v->s = strbuf_detach(&s, NULL);
 		}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index b8a2cb8439..533827d297 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -97,7 +97,9 @@ test_atom head objectname:short $(git rev-parse --short refs/heads/master)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom head tree $(git rev-parse refs/heads/master^{tree})
+test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
 test_atom head parent ''
+test_atom head parent:short ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
@@ -148,7 +150,9 @@ test_atom tag objectname:short $(git rev-parse --short refs/tags/testtag)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom tag tree ''
+test_atom tag tree:short ''
 test_atom tag parent ''
+test_atom tag parent:short ''
 test_atom tag numparent ''
 test_atom tag object $(git rev-parse refs/tags/testtag^0)
 test_atom tag type 'commit'
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 3/5] pretty: refactor `format_sanitized_subject()`
  2020-07-27 20:43 [PATCH 0/5] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
  2020-07-27 20:43 ` [PATCH 1/5] ref-filter: support different email formats Hariom Verma via GitGitGadget
  2020-07-27 20:43 ` [PATCH 2/5] ref-filter: add `short` option for 'tree' and 'parent' Hariom Verma via GitGitGadget
@ 2020-07-27 20:43 ` Hariom Verma via GitGitGadget
  2020-07-27 20:43 ` [PATCH 4/5] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-07-27 20:43 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

This commit refactors `format_sanitized_subject()` in the
hope to use same logic in ref-filter.c

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 pretty.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/pretty.c b/pretty.c
index 2a3d46bf42..8d08e8278a 100644
--- a/pretty.c
+++ b/pretty.c
@@ -839,24 +839,29 @@ static int istitlechar(char c)
 		(c >= '0' && c <= '9') || c == '.' || c == '_';
 }
 
-static void format_sanitized_subject(struct strbuf *sb, const char *msg)
+static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
 {
+	char *r = xmemdupz(msg, len);
 	size_t trimlen;
 	size_t start_len = sb->len;
 	int space = 2;
+	int i;
 
-	for (; *msg && *msg != '\n'; msg++) {
-		if (istitlechar(*msg)) {
+	for (i = 0; i < len; i++) {
+		if (r[i] == '\n')
+			r[i] = ' ';
+		if (istitlechar(r[i])) {
 			if (space == 1)
 				strbuf_addch(sb, '-');
 			space = 0;
-			strbuf_addch(sb, *msg);
-			if (*msg == '.')
-				while (*(msg+1) == '.')
-					msg++;
+			strbuf_addch(sb, r[i]);
+			if (r[i] == '.')
+				while (r[i+1] == '.')
+					i++;
 		} else
 			space |= 1;
 	}
+	free(r);
 
 	/* trim any trailing '.' or '-' characters */
 	trimlen = 0;
@@ -1155,7 +1160,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 	const struct commit *commit = c->commit;
 	const char *msg = c->message;
 	struct commit_list *p;
-	const char *arg;
+	const char *arg, *eol;
 	size_t res;
 	char **slot;
 
@@ -1405,7 +1410,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		format_subject(sb, msg + c->subject_off, " ");
 		return 1;
 	case 'f':	/* sanitized subject */
-		format_sanitized_subject(sb, msg + c->subject_off);
+		eol = strchrnul(msg + c->subject_off, '\n');
+		format_sanitized_subject(sb, msg + c->subject_off, eol - (msg + c->subject_off));
 		return 1;
 	case 'b':	/* body */
 		strbuf_addstr(sb, msg + c->body_off);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 4/5] format-support: move `format_sanitized_subject()` from pretty
  2020-07-27 20:43 [PATCH 0/5] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-07-27 20:43 ` [PATCH 3/5] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
@ 2020-07-27 20:43 ` Hariom Verma via GitGitGadget
  2020-07-27 20:43 ` [PATCH 5/5] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
  5 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-07-27 20:43 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

In hope of some new features in `subject` atom, move funtion
`format_sanitized_subject()` and all the function it uses
to new file format-support.[c/h].

Consider this new file as a common interface between functions that
pretty.c and ref-filter.c shares.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Makefile         |  1 +
 format-support.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 format-support.h |  6 ++++++
 pretty.c         | 40 +---------------------------------------
 4 files changed, 51 insertions(+), 39 deletions(-)
 create mode 100644 format-support.c
 create mode 100644 format-support.h

diff --git a/Makefile b/Makefile
index 372139f1f2..4dfc384b49 100644
--- a/Makefile
+++ b/Makefile
@@ -882,6 +882,7 @@ LIB_OBJS += exec-cmd.o
 LIB_OBJS += fetch-negotiator.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fmt-merge-msg.o
+LIB_OBJS += format-support.o
 LIB_OBJS += fsck.o
 LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
diff --git a/format-support.c b/format-support.c
new file mode 100644
index 0000000000..d693aa1744
--- /dev/null
+++ b/format-support.c
@@ -0,0 +1,43 @@
+#include "diff.h"
+#include "log-tree.h"
+#include "color.h"
+#include "format-support.h"
+
+static int istitlechar(char c)
+{
+	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
+		(c >= '0' && c <= '9') || c == '.' || c == '_';
+}
+
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
+{
+	char *r = xmemdupz(msg, len);
+	size_t trimlen;
+	size_t start_len = sb->len;
+	int space = 2;
+	int i;
+
+	for (i = 0; i < len; i++) {
+		if (r[i] == '\n')
+			r[i] = ' ';
+		if (istitlechar(r[i])) {
+			if (space == 1)
+				strbuf_addch(sb, '-');
+			space = 0;
+			strbuf_addch(sb, r[i]);
+			if (r[i] == '.')
+				while (r[i+1] == '.')
+					i++;
+		} else
+			space |= 1;
+	}
+	free(r);
+
+	/* trim any trailing '.' or '-' characters */
+	trimlen = 0;
+	while (sb->len - trimlen > start_len &&
+		(sb->buf[sb->len - 1 - trimlen] == '.'
+		|| sb->buf[sb->len - 1 - trimlen] == '-'))
+		trimlen++;
+	strbuf_remove(sb, sb->len - trimlen, trimlen);
+}
diff --git a/format-support.h b/format-support.h
new file mode 100644
index 0000000000..c344ccbc33
--- /dev/null
+++ b/format-support.h
@@ -0,0 +1,6 @@
+#ifndef FORMAT_SUPPORT_H
+#define FORMAT_SUPPORT_H
+
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len);
+
+#endif /* FORMAT_SUPPORT_H */
diff --git a/pretty.c b/pretty.c
index 8d08e8278a..2de01b7115 100644
--- a/pretty.c
+++ b/pretty.c
@@ -12,6 +12,7 @@
 #include "reflog-walk.h"
 #include "gpg-interface.h"
 #include "trailer.h"
+#include "format-support.h"
 
 static char *user_format;
 static struct cmt_fmt_map {
@@ -833,45 +834,6 @@ static void parse_commit_header(struct format_commit_context *context)
 	context->commit_header_parsed = 1;
 }
 
-static int istitlechar(char c)
-{
-	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
-		(c >= '0' && c <= '9') || c == '.' || c == '_';
-}
-
-static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
-{
-	char *r = xmemdupz(msg, len);
-	size_t trimlen;
-	size_t start_len = sb->len;
-	int space = 2;
-	int i;
-
-	for (i = 0; i < len; i++) {
-		if (r[i] == '\n')
-			r[i] = ' ';
-		if (istitlechar(r[i])) {
-			if (space == 1)
-				strbuf_addch(sb, '-');
-			space = 0;
-			strbuf_addch(sb, r[i]);
-			if (r[i] == '.')
-				while (r[i+1] == '.')
-					i++;
-		} else
-			space |= 1;
-	}
-	free(r);
-
-	/* trim any trailing '.' or '-' characters */
-	trimlen = 0;
-	while (sb->len - trimlen > start_len &&
-		(sb->buf[sb->len - 1 - trimlen] == '.'
-		|| sb->buf[sb->len - 1 - trimlen] == '-'))
-		trimlen++;
-	strbuf_remove(sb, sb->len - trimlen, trimlen);
-}
-
 const char *format_subject(struct strbuf *sb, const char *msg,
 			   const char *line_separator)
 {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 5/5] ref-filter: add `sanitize` option for 'subject' atom
  2020-07-27 20:43 [PATCH 0/5] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                   ` (3 preceding siblings ...)
  2020-07-27 20:43 ` [PATCH 4/5] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
@ 2020-07-27 20:43 ` Hariom Verma via GitGitGadget
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
  5 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-07-27 20:43 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, subject does not take any arguments. This commit introduce
`sanitize` formatting option to 'subject' atom.

`subject:sanitize` - print sanitized subject line, suitable for a filename.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c            | 21 +++++++++++++++++----
 t/t6300-for-each-ref.sh |  7 +++++++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index d5d5ff6a9d..5f8fc65b68 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -23,6 +23,7 @@
 #include "worktree.h"
 #include "hashmap.h"
 #include "argv-array.h"
+#include "format-support.h"
 
 static struct ref_msg {
 	const char *gone;
@@ -131,7 +132,7 @@ static struct used_atom {
 			unsigned int nobracket : 1, push : 1, push_remote : 1;
 		} remote_ref;
 		struct {
-			enum { C_BARE, C_BODY, C_BODY_DEP, C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
+			enum { C_BARE, C_BODY, C_BODY_DEP, C_LINES, C_SIG, C_SUB, C_SUB_SANITIZE, C_TRAILERS } option;
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
@@ -301,8 +302,14 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
-	if (arg)
-		return strbuf_addf_ret(err, -1, _("%%(subject) does not take arguments"));
+	if (arg) {
+		if (!strcmp(arg, "sanitize")) {
+			atom->u.contents.option = C_SUB_SANITIZE;
+			return 0;
+		} else {
+			return strbuf_addf_ret(err, -1, _("unrecognized %%(subject) argument: %s"), arg);
+		}
+	}
 	atom->u.contents.option = C_SUB;
 	return 0;
 }
@@ -1266,11 +1273,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 		struct used_atom *atom = &used_atom[i];
 		const char *name = atom->name;
 		struct atom_value *v = &val[i];
+
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 		if (strcmp(name, "subject") &&
+		    strcmp(name, "subject:sanitize") &&
 		    strcmp(name, "body") &&
 		    !starts_with(name, "trailers") &&
 		    !starts_with(name, "contents"))
@@ -1283,7 +1292,11 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 
 		if (atom->u.contents.option == C_SUB)
 			v->s = copy_subject(subpos, sublen);
-		else if (atom->u.contents.option == C_BODY_DEP)
+		else if (atom->u.contents.option == C_SUB_SANITIZE) {
+			struct strbuf sb = STRBUF_INIT;
+			format_sanitized_subject(&sb, subpos, sublen);
+			v->s = strbuf_detach(&sb, NULL);
+		} else if (atom->u.contents.option == C_BODY_DEP)
 			v->s = xmemdupz(bodypos, bodylen);
 		else if (atom->u.contents.option == C_BODY)
 			v->s = xmemdupz(bodypos, nonsiglen);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 533827d297..e2dd410356 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -127,6 +127,7 @@ test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head subject 'Initial'
+test_atom head subject:sanitize 'Initial'
 test_atom head contents:subject 'Initial'
 test_atom head body ''
 test_atom head contents:body ''
@@ -180,6 +181,7 @@ test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag subject 'Tagging at 1151968727'
+test_atom tag subject:sanitize 'Tagging-at-1151968727'
 test_atom tag contents:subject 'Tagging at 1151968727'
 test_atom tag body ''
 test_atom tag contents:body ''
@@ -592,6 +594,7 @@ test_expect_success 'create tag with subject and body content' '
 	git tag -F msg subject-body
 '
 test_atom refs/tags/subject-body subject 'the subject line'
+test_atom refs/tags/subject-body subject:sanitize 'the-subject-line'
 test_atom refs/tags/subject-body body 'first body line
 second body line
 '
@@ -612,6 +615,7 @@ test_expect_success 'create tag with multiline subject' '
 	git tag -F msg multiline
 '
 test_atom refs/tags/multiline subject 'first subject line second subject line'
+test_atom refs/tags/multiline subject:sanitize 'first-subject-line-second-subject-line'
 test_atom refs/tags/multiline contents:subject 'first subject line second subject line'
 test_atom refs/tags/multiline body 'first body line
 second body line
@@ -644,6 +648,7 @@ sig='-----BEGIN PGP SIGNATURE-----
 
 PREREQ=GPG
 test_atom refs/tags/signed-empty subject ''
+test_atom refs/tags/signed-empty subject:sanitize ''
 test_atom refs/tags/signed-empty contents:subject ''
 test_atom refs/tags/signed-empty body "$sig"
 test_atom refs/tags/signed-empty contents:body ''
@@ -651,6 +656,7 @@ test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
 test_atom refs/tags/signed-short subject 'subject line'
+test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
 test_atom refs/tags/signed-short body "$sig"
 test_atom refs/tags/signed-short contents:body ''
@@ -659,6 +665,7 @@ test_atom refs/tags/signed-short contents "subject line
 $sig"
 
 test_atom refs/tags/signed-long subject 'subject line'
+test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
 test_atom refs/tags/signed-long body "body contents
 $sig"
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/5] ref-filter: support different email formats
  2020-07-27 20:43 ` [PATCH 1/5] ref-filter: support different email formats Hariom Verma via GitGitGadget
@ 2020-07-27 22:51   ` Junio C Hamano
  2020-07-28 20:31     ` Hariom verma
  2020-07-28 13:58   ` Đoàn Trần Công Danh
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2020-07-27 22:51 UTC (permalink / raw)
  To: Hariom Verma via GitGitGadget; +Cc: git, Hariom Verma

"Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Hariom Verma <hariom18599@gmail.com>
>
> Currently, ref-filter only supports printing email with arrow brackets.

This is the first time I heard the term "arrow bracket".  Aren't
they more commonly called angle brackets?

> Let's add support for two more email options.
> - trim : print email without arrow brackets.

Why would this be useful?

> - localpart : prints the part before the @ sign

Meaning I'd get "<gitster" for me?

Building small pieces of new feature in each patch is good, and
adding tests to each step is also good.  Why not do the same for
docs?

> +static struct email_option{

Missing SP.

> +	enum { EO_INVALID, EO_RAW, EO_TRIM, EO_LOCALPART } option;
> +} email_option;
> +
> @@ -1040,10 +1044,26 @@ static const char *copy_email(const char *buf)
>  	const char *eoemail;
>  	if (!email)
>  		return xstrdup("");
> -	eoemail = strchr(email, '>');

The original code prepares to see NULL from this strchr(); that is
why it checks eoemail for NULL and returns an empty string---the
data is broken (i.e. not an address in angle brackets), which this
code cannot do anything about---in the later part of the code.

> +	switch (email_option.option) {
> +	case EO_RAW:
> +		eoemail = strchr(email, '>') + 1;

And this breaks the carefully laid out error handling by the
original code.  Adding 1 to NULL is quite undefined.

> +		break;
> +	case EO_TRIM:
> +		email++;
> +		eoemail = strchr(email, '>');
> +		break;
> +	case EO_LOCALPART:
> +		email++;
> +		eoemail = strchr(email, '@');

The undocumented design here is that you want to return "hariom" for
"<hariom@gmail.com>", i.e. out of the "trimmed" e-mail, the part
before the at-sign is returned.

If the data were "<hariom>", you'd still want to return "hariom" no?
But because you do not check for NULL, you end up returning an empty
string.

I think you want to cut at the first '@' or '>', whichever comes
first.


> +		break;
> +	case EO_INVALID:
> +	default:

Invalid and unhandled ones are silently ignored and not treated as
an error?  I would have thought that at least the "default" one
would be a BUG(), as you covered all the possible values for the
enum with case arms.  I wouldn't be surprised if seeing EO_INVALID
is also a BUG(), i.e. the control flow that led to the caller to
call this function with EO_INVALID in email_option.option is likely
to be broken.  It's not like you return "" to protect yourself when
fed a bad data from objects---a bad value in .option can only come
here if the parser you wrote for "--format=<string>" produced a
wrong result.

> +		return xstrdup("");
> +	}
> +
>  	if (!eoemail)
>  		return xstrdup("");
> -	return xmemdupz(email, eoemail + 1 - email);
> +	return xmemdupz(email, eoemail - email);
>  }
>  
>  static char *copy_subject(const char *buf, unsigned long len)
> @@ -1113,7 +1133,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
>  			continue;
>  		if (name[wholen] != 0 &&
>  		    strcmp(name + wholen, "name") &&
> -		    strcmp(name + wholen, "email") &&
> +		    !starts_with(name + wholen, "email") &&
>  		    !starts_with(name + wholen, "date"))
>  			continue;
>  		if (!wholine)
> @@ -1124,8 +1144,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
>  			v->s = copy_line(wholine);
>  		else if (!strcmp(name + wholen, "name"))
>  			v->s = copy_name(wholine);
> -		else if (!strcmp(name + wholen, "email"))
> +		else if (starts_with(name + wholen, "email")) {
> +			email_option.option = EO_INVALID;
> +			if (!strcmp(name + wholen, "email"))
> +				email_option.option = EO_RAW;
> +			if (!strcmp(name + wholen, "email:trim"))
> +				email_option.option = EO_TRIM;
> +			if (!strcmp(name + wholen, "email:localpart"))
> +				email_option.option = EO_LOCALPART;

The ref-filter formatting language already knows many "colon plus
modifier" suffix like "refname:short" and "contents:body", but I do
not think we have ugly repetition like the above code to parse them.
Perhaps the addition for "email:<whatever>" can benefit from
studying and mimicking existing practices a bit more?


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/5] ref-filter: add `short` option for 'tree' and 'parent'
  2020-07-27 20:43 ` [PATCH 2/5] ref-filter: add `short` option for 'tree' and 'parent' Hariom Verma via GitGitGadget
@ 2020-07-27 23:21   ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2020-07-27 23:21 UTC (permalink / raw)
  To: Hariom Verma via GitGitGadget; +Cc: git, Hariom Verma

"Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -		if (!strcmp(name, "tree")) {
> +		if (!strcmp(name, "tree"))
>  			v->s = xstrdup(oid_to_hex(get_commit_tree_oid(commit)));
> -		}
> +		else if (!strcmp(name, "tree:short"))
> +			v->s = xstrdup(find_unique_abbrev(get_commit_tree_oid(commit), DEFAULT_ABBREV));

Again, isn't this going in totally unacceptable direction?  

By the time grab_foo() helper functions are reached, the requested
format should have been parsed to atom->u.foo.option and the only
thing grab_foo() helper functions should look at are the option.

Perhaps studying how "objectname" and its ":"-modified forms are
handled before writing this series would be beneficial.

 - objectname_atom_parser() is called when the parser for --format
   notices "objectname:modifier"; it is responsible for setting up
   atom->u.objectname.option.  Note that this is done only once at
   the very begining of processing

 - grab_objectname() is called for each and every object ref-filter
   iterates over and "objectname" and/or its modified form
   (e.g. "objectname:short") is requested.  Since the modifier is
   already parsed, it can do a simple switch on the value in
   .option.

I do not know if patches [3-5/5] follow the pattern used in [1-2/5],
but if they do, then they all need to be fixed, I think.

Thanks.



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/5] ref-filter: support different email formats
  2020-07-27 20:43 ` [PATCH 1/5] ref-filter: support different email formats Hariom Verma via GitGitGadget
  2020-07-27 22:51   ` Junio C Hamano
@ 2020-07-28 13:58   ` Đoàn Trần Công Danh
  2020-07-28 16:45     ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Đoàn Trần Công Danh @ 2020-07-28 13:58 UTC (permalink / raw)
  To: Hariom Verma via GitGitGadget; +Cc: git, Hariom Verma

[this is a resent, my previous mail couldn't reach the archive]

On 2020-07-27 20:43:04+0000, Hariom Verma via GitGitGadget <gitgitgadget@gmail.com> wrote:
> From: Hariom Verma <hariom18599@gmail.com>
> 
> Currently, ref-filter only supports printing email with arrow brackets.
> 
> Let's add support for two more email options.
> - trim : print email without arrow brackets.
> - localpart : prints the part before the @ sign
> 
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Mentored-by: Heba Waly <heba.waly@gmail.com>
> Signed-off-by: Hariom Verma <hariom18599@gmail.com>
> ---
>  ref-filter.c            | 36 ++++++++++++++++++++++++++++++++----
>  t/t6300-for-each-ref.sh | 16 ++++++++++++++++
>  2 files changed, 48 insertions(+), 4 deletions(-)
> 
> diff --git a/ref-filter.c b/ref-filter.c
> index 8447cb09be..8563088eb1 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -102,6 +102,10 @@ static struct ref_to_worktree_map {
>  	struct worktree **worktrees;
>  } ref_to_worktree_map;
>  
> +static struct email_option{
> +	enum { EO_INVALID, EO_RAW, EO_TRIM, EO_LOCALPART } option;
> +} email_option;
> +
>  /*
>   * An atom is a valid field atom listed below, possibly prefixed with
>   * a "*" to denote deref_tag().
> @@ -1040,10 +1044,26 @@ static const char *copy_email(const char *buf)
>  	const char *eoemail;
>  	if (!email)
>  		return xstrdup("");
> -	eoemail = strchr(email, '>');
> +	switch (email_option.option) {
> +	case EO_RAW:
> +		eoemail = strchr(email, '>') + 1;
> +		break;
> +	case EO_TRIM:
> +		email++;
> +		eoemail = strchr(email, '>');
> +		break;
> +	case EO_LOCALPART:
> +		email++;
> +		eoemail = strchr(email, '@');
> +		break;


This is not correct.
RFC-822 allows @ in local part,
albeit, that localpart must be quoted:

        addr-spec       =       local-part "@" domain
        local-part      =       dot-atom / quoted-string / obs-local-part
        quoted-string   =       [CFWS]
                                DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                                [CFWS]
        qcontent        =       qtext / quoted-pair
        qtext           =       NO-WS-CTL /     ; Non white space
        qtext           =       NO-WS-CTL /     ; Non white space controls
                                %d33 /          ; The rest of the US-ASCII
                                %d35-91 /       ;  characters not including "\"
                                %d93-126        ;  or the quote character
        quoted-pair     =       ("\" text) / obs-qp

IOW, those below email addresses are valid email address,
and the local part is `quoted@local'

        "quoted@local"@example.com
        quoted\@local@example.com

Anyway, it seems like current Git strips first `"'
from `"quoted@local"@example.com'


-- 
Danh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/5] ref-filter: support different email formats
  2020-07-28 13:58   ` Đoàn Trần Công Danh
@ 2020-07-28 16:45     ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2020-07-28 16:45 UTC (permalink / raw)
  To: Đoàn Trần Công Danh
  Cc: Hariom Verma via GitGitGadget, git, Hariom Verma

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> This is not correct.
> RFC-822 allows @ in local part,
> albeit, that localpart must be quoted:

We do care about truly local e-mail addresses, without '@' anywhere
inside <braket>, simply because they are common in the result of SCM
conversion from CVS/SVN.

But I do not think we are pedantic/academic enough to have cared
about any local part that is unusual enough to require quoting; we
instead relied on the fact that we live in real world with practical
people who would avoid such an address ;-).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/5] ref-filter: support different email formats
  2020-07-27 22:51   ` Junio C Hamano
@ 2020-07-28 20:31     ` Hariom verma
  2020-07-28 20:43       ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Hariom verma @ 2020-07-28 20:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Hi,

On Tue, Jul 28, 2020 at 4:21 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Hariom Verma <hariom18599@gmail.com>
> >
> > Currently, ref-filter only supports printing email with arrow brackets.
>
> This is the first time I heard the term "arrow bracket".  Aren't
> they more commonly called angle brackets?

Yeah. Sorry about that.

> > Let's add support for two more email options.
> > - trim : print email without arrow brackets.
>
> Why would this be useful?

It might be useful for using the ref-filter's logic in pretty.c
(especially for `--pretty` formats like `%an` and `%cn`)

> > - localpart : prints the part before the @ sign
>
> Meaning I'd get "<gitster" for me?

No, you'll get "gitster".

> Building small pieces of new feature in each patch is good, and
> adding tests to each step is also good.  Why not do the same for
> docs?

Yeah, I agree with you. I should have focused on documentation too.

> > +static struct email_option{
>
> Missing SP.

I'll fix it.

> > +     enum { EO_INVALID, EO_RAW, EO_TRIM, EO_LOCALPART } option;
> > +} email_option;
> > +
> > @@ -1040,10 +1044,26 @@ static const char *copy_email(const char *buf)
> >       const char *eoemail;
> >       if (!email)
> >               return xstrdup("");
> > -     eoemail = strchr(email, '>');
>
> The original code prepares to see NULL from this strchr(); that is
> why it checks eoemail for NULL and returns an empty string---the
> data is broken (i.e. not an address in angle brackets), which this
> code cannot do anything about---in the later part of the code.

I think this commit still takes care of NULL.
After the switch-case statements, code consists of:
```
if (!eoemail)
    return xstrdup("");
```
Which checks eoemail for NULL. And will return empty string if address
is not in angle brackets.
Same applies for local-part too. If '@' is not present in email
address, it will still return empty string.

> > +     switch (email_option.option) {
> > +     case EO_RAW:
> > +             eoemail = strchr(email, '>') + 1;
>
> And this breaks the carefully laid out error handling by the
> original code.  Adding 1 to NULL is quite undefined.

Yeah. I'll take care of it in the next version.

> > +             break;
> > +     case EO_TRIM:
> > +             email++;
> > +             eoemail = strchr(email, '>');
> > +             break;
> > +     case EO_LOCALPART:
> > +             email++;
> > +             eoemail = strchr(email, '@');
>
> The undocumented design here is that you want to return "hariom" for
> "<hariom@gmail.com>", i.e. out of the "trimmed" e-mail, the part
> before the at-sign is returned.
>
> If the data were "<hariom>", you'd still want to return "hariom" no?
> But because you do not check for NULL, you end up returning an empty
> string.

I never heard of email address without '@' symbol. Thats why I
returned empty string.

Will fix that too.

> I think you want to cut at the first '@' or '>', whichever comes
> first.

If email data can be without '@' symbol, then I guess "yes".

> > +             break;
> > +     case EO_INVALID:
> > +     default:
>
> Invalid and unhandled ones are silently ignored and not treated as
> an error?  I would have thought that at least the "default" one
> would be a BUG(), as you covered all the possible values for the
> enum with case arms.  I wouldn't be surprised if seeing EO_INVALID
> is also a BUG(), i.e. the control flow that led to the caller to
> call this function with EO_INVALID in email_option.option is likely
> to be broken.  It's not like you return "" to protect yourself when
> fed a bad data from objects---a bad value in .option can only come
> here if the parser you wrote for "--format=<string>" produced a
> wrong result.

Christian <chriscool@tuxfamily.org> also suggested me the same. Will
fix this too.

BTW, on master "{author,committer,tagger}email:<xyz>" does not print any error.

> > +             return xstrdup("");
> > +     }
> > +
> >       if (!eoemail)
> >               return xstrdup("");
> > -     return xmemdupz(email, eoemail + 1 - email);
> > +     return xmemdupz(email, eoemail - email);
> >  }
> >
> >  static char *copy_subject(const char *buf, unsigned long len)
> > @@ -1113,7 +1133,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
> >                       continue;
> >               if (name[wholen] != 0 &&
> >                   strcmp(name + wholen, "name") &&
> > -                 strcmp(name + wholen, "email") &&
> > +                 !starts_with(name + wholen, "email") &&
> >                   !starts_with(name + wholen, "date"))
> >                       continue;
> >               if (!wholine)
> > @@ -1124,8 +1144,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
> >                       v->s = copy_line(wholine);
> >               else if (!strcmp(name + wholen, "name"))
> >                       v->s = copy_name(wholine);
> > -             else if (!strcmp(name + wholen, "email"))
> > +             else if (starts_with(name + wholen, "email")) {
> > +                     email_option.option = EO_INVALID;
> > +                     if (!strcmp(name + wholen, "email"))
> > +                             email_option.option = EO_RAW;
> > +                     if (!strcmp(name + wholen, "email:trim"))
> > +                             email_option.option = EO_TRIM;
> > +                     if (!strcmp(name + wholen, "email:localpart"))
> > +                             email_option.option = EO_LOCALPART;
>
> The ref-filter formatting language already knows many "colon plus
> modifier" suffix like "refname:short" and "contents:body", but I do
> not think we have ugly repetition like the above code to parse them.
> Perhaps the addition for "email:<whatever>" can benefit from
> studying and mimicking existing practices a bit more?
>

For "email:<whatever>",
even If I parse that <whatever>. I still make comparison something like:
```
if (!modifier)
    email_option.option = EO_RAW;
else if (!strcmp(modifier, "trim"))
    email_option.option = EO_TRIM;
else if (!strcmp(arg, "localpart"))
    email_option.option = EO_LOCALPART;
```

Thanks,
Hariom

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/5] ref-filter: support different email formats
  2020-07-28 20:31     ` Hariom verma
@ 2020-07-28 20:43       ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2020-07-28 20:43 UTC (permalink / raw)
  To: Hariom verma
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Hariom verma <hariom18599@gmail.com> writes:

>> The ref-filter formatting language already knows many "colon plus
>> modifier" suffix like "refname:short" and "contents:body", but I do
>> not think we have ugly repetition like the above code to parse them.
>> Perhaps the addition for "email:<whatever>" can benefit from
>> studying and mimicking existing practices a bit more?
>>
>
> For "email:<whatever>",
> even If I parse that <whatever>. I still make comparison something like:
> ```
> if (!modifier)
>     email_option.option = EO_RAW;
> else if (!strcmp(modifier, "trim"))
>     email_option.option = EO_TRIM;
> else if (!strcmp(arg, "localpart"))
>     email_option.option = EO_LOCALPART;
> ```

Somebody needs to do a comparison, but it should be done at parsing
phase when the --format is grokked, not in grab phase that is run
for each and every ref to be shown.

These patches should only be done after looking at existing
"<basicatom>:<modifiers>" like "objectname:short" etc are handled,
not before, I think.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 0/9] [GSoC] Improvements to ref-filter
  2020-07-27 20:43 [PATCH 0/5] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                   ` (4 preceding siblings ...)
  2020-07-27 20:43 ` [PATCH 5/5] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
@ 2020-08-05 21:51 ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
                     ` (10 more replies)
  5 siblings, 11 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma

This is the first patch series that introduces some improvements and
features to file ref-filter.{c,h}. These changes are useful to ref-filter,
but in near future also will allow us to use ref-filter's logic in pretty.c

I plan to add more to format-support.{c,h} in the upcoming patch series.
That will lead to more improved and feature-rich ref-filter.c

Hariom Verma (9):
  ref-filter: support different email formats
  ref-filter: refactor `grab_objectname()`
  ref-filter: modify error messages in `grab_objectname()`
  ref-filter: rename `objectname` related functions and fields
  ref-filter: add `short` modifier to 'tree' atom
  ref-filter: add `short` modifier to 'parent' atom
  pretty: refactor `format_sanitized_subject()`
  format-support: move `format_sanitized_subject()` from pretty
  ref-filter: add `sanitize` option for 'subject' atom

 Documentation/git-for-each-ref.txt |  10 +-
 Makefile                           |   1 +
 format-support.c                   |  43 ++++++++
 format-support.h                   |   6 ++
 pretty.c                           |  40 +-------
 ref-filter.c                       | 159 +++++++++++++++++++----------
 t/t6300-for-each-ref.sh            |  35 +++++++
 7 files changed, 202 insertions(+), 92 deletions(-)
 create mode 100644 format-support.c
 create mode 100644 format-support.h


base-commit: dc04167d378fb29d30e1647ff6ff51dd182bc9a3
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-684%2Fharry-hov%2Fonly-rf6-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-684/harry-hov/only-rf6-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/684

Range-diff vs v1:

  1:  aeb116c5aa !  1:  78e69032df ref-filter: support different email formats
     @@ Metadata
       ## Commit message ##
          ref-filter: support different email formats
      
     -    Currently, ref-filter only supports printing email with arrow brackets.
     +    Currently, ref-filter only supports printing email with angle brackets.
      
          Let's add support for two more email options.
     -    - trim : print email without arrow brackets.
     -    - localpart : prints the part before the @ sign
     +    - trim : for email without angle brackets.
     +    - localpart : for the part before the @ sign out of trimmed email
      
          Mentored-by: Christian Couder <chriscool@tuxfamily.org>
          Mentored-by: Heba Waly <heba.waly@gmail.com>
          Signed-off-by: Hariom Verma <hariom18599@gmail.com>
      
     + ## Documentation/git-for-each-ref.txt ##
     +@@ Documentation/git-for-each-ref.txt: These are intended for working on a mix of annotated and lightweight tags.
     + 
     + Fields that have name-email-date tuple as its value (`author`,
     + `committer`, and `tagger`) can be suffixed with `name`, `email`,
     +-and `date` to extract the named component.
     ++and `date` to extract the named component.  For email fields (`authoremail`,
     ++`committeremail` and `taggeremail`), `:trim` can be appended to get the email
     ++without angle brackets, and `:localpart` to get the part before the `@` symbol
     ++out of the trimmed email.
     + 
     + The message in a commit or a tag object is `contents`, from which
     + `contents:<part>` can be used to extract various parts out of:
     +
       ## ref-filter.c ##
     -@@ ref-filter.c: static struct ref_to_worktree_map {
     - 	struct worktree **worktrees;
     - } ref_to_worktree_map;
     +@@ ref-filter.c: static struct used_atom {
     + 			enum { O_FULL, O_LENGTH, O_SHORT } option;
     + 			unsigned int length;
     + 		} objectname;
     ++		struct email_option {
     ++			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
     ++		} email_option;
     + 		struct refname_atom refname;
     + 		char *head;
     + 	} u;
     +@@ ref-filter.c: static int objectname_atom_parser(const struct ref_format *format, struct used_a
     + 	return 0;
     + }
       
     -+static struct email_option{
     -+	enum { EO_INVALID, EO_RAW, EO_TRIM, EO_LOCALPART } option;
     -+} email_option;
     ++static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
     ++				    const char *arg, struct strbuf *err)
     ++{
     ++	if (!arg)
     ++		atom->u.email_option.option = EO_RAW;
     ++	else if (!strcmp(arg, "trim"))
     ++		atom->u.email_option.option = EO_TRIM;
     ++	else if (!strcmp(arg, "localpart"))
     ++		atom->u.email_option.option = EO_LOCALPART;
     ++	else
     ++		return strbuf_addf_ret(err, -1, _("unrecognized email option: %s"), arg);
     ++	return 0;
     ++}
      +
     - /*
     -  * An atom is a valid field atom listed below, possibly prefixed with
     -  * a "*" to denote deref_tag().
     -@@ ref-filter.c: static const char *copy_email(const char *buf)
     + static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
     + 			       const char *arg, struct strbuf *err)
     + {
     +@@ ref-filter.c: static struct {
     + 	{ "tag", SOURCE_OBJ },
     + 	{ "author", SOURCE_OBJ },
     + 	{ "authorname", SOURCE_OBJ },
     +-	{ "authoremail", SOURCE_OBJ },
     ++	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     + 	{ "authordate", SOURCE_OBJ, FIELD_TIME },
     + 	{ "committer", SOURCE_OBJ },
     + 	{ "committername", SOURCE_OBJ },
     +-	{ "committeremail", SOURCE_OBJ },
     ++	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     + 	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
     + 	{ "tagger", SOURCE_OBJ },
     + 	{ "taggername", SOURCE_OBJ },
     +-	{ "taggeremail", SOURCE_OBJ },
     ++	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     + 	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
     + 	{ "creator", SOURCE_OBJ },
     + 	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
     +@@ ref-filter.c: static const char *copy_name(const char *buf)
     + 	return xstrdup("");
     + }
     + 
     +-static const char *copy_email(const char *buf)
     ++static const char *copy_email(const char *buf, struct used_atom *atom)
     + {
     + 	const char *email = strchr(buf, '<');
       	const char *eoemail;
       	if (!email)
       		return xstrdup("");
      -	eoemail = strchr(email, '>');
     -+	switch (email_option.option) {
     ++	switch (atom->u.email_option.option) {
      +	case EO_RAW:
     -+		eoemail = strchr(email, '>') + 1;
     ++		eoemail = strchr(email, '>');
     ++		if (eoemail)
     ++			eoemail++;
      +		break;
      +	case EO_TRIM:
      +		email++;
     @@ ref-filter.c: static const char *copy_email(const char *buf)
      +	case EO_LOCALPART:
      +		email++;
      +		eoemail = strchr(email, '@');
     ++		if (!eoemail)
     ++			eoemail = strchr(email, '>');
      +		break;
     -+	case EO_INVALID:
      +	default:
     -+		return xstrdup("");
     ++		BUG("unknown email option");
      +	}
      +
       	if (!eoemail)
     @@ ref-filter.c: static void grab_person(const char *who, struct atom_value *val, i
       		else if (!strcmp(name + wholen, "name"))
       			v->s = copy_name(wholine);
      -		else if (!strcmp(name + wholen, "email"))
     -+		else if (starts_with(name + wholen, "email")) {
     -+			email_option.option = EO_INVALID;
     -+			if (!strcmp(name + wholen, "email"))
     -+				email_option.option = EO_RAW;
     -+			if (!strcmp(name + wholen, "email:trim"))
     -+				email_option.option = EO_TRIM;
     -+			if (!strcmp(name + wholen, "email:localpart"))
     -+				email_option.option = EO_LOCALPART;
     - 			v->s = copy_email(wholine);
     -+		}
     +-			v->s = copy_email(wholine);
     ++		else if (starts_with(name + wholen, "email"))
     ++			v->s = copy_email(wholine, &used_atom[i]);
       		else if (starts_with(name + wholen, "date"))
       			grab_date(wholine, v, name);
       	}
  -:  ---------- >  2:  b6b6acab9a ref-filter: refactor `grab_objectname()`
  -:  ---------- >  3:  65fee332a3 ref-filter: modify error messages in `grab_objectname()`
  -:  ---------- >  4:  976f2041a4 ref-filter: rename `objectname` related functions and fields
  -:  ---------- >  5:  dda7400b14 ref-filter: add `short` modifier to 'tree' atom
  2:  49344f1b55 !  6:  764bb23b59 ref-filter: add `short` option for 'tree' and 'parent'
     @@ Metadata
      Author: Hariom Verma <hariom18599@gmail.com>
      
       ## Commit message ##
     -    ref-filter: add `short` option for 'tree' and 'parent'
     +    ref-filter: add `short` modifier to 'parent' atom
      
     -    Sometimes while using 'parent' and 'tree' atom, user might
     -    want to see abbrev hash instead of full 40 character hash.
     +    Sometimes while using 'parent' atom, user might want to see abbrev hash
     +    instead of full 40 character hash.
      
     -    'objectname' and 'refname' already supports printing abbrev hash.
     -    It might be convenient for users to have the same option for printing
     -    'parent' and 'tree' hash.
     +    Just like 'objectname', it might be convenient for users to have the
     +    `:short` and `:short=<length>` option for printing 'parent' hash.
      
     -    Let's introduce `short` option to 'parent' and 'tree' atom.
     -
     -    `tree:short` - for abbrev tree hash
     -    `parent:short` - for abbrev parent hash
     +    Let's introduce `short` option to 'parent' atom.
      
          Mentored-by: Christian Couder <chriscool@tuxfamily.org>
          Mentored-by: Heba Waly <heba.waly@gmail.com>
          Signed-off-by: Hariom Verma <hariom18599@gmail.com>
      
     + ## Documentation/git-for-each-ref.txt ##
     +@@ Documentation/git-for-each-ref.txt: worktreepath::
     + In addition to the above, for commit and tag objects, the header
     + field names (`tree`, `parent`, `object`, `type`, and `tag`) can
     + be used to specify the value in the header field.
     +-Field `tree` can also be used with modifier `:short` and
     ++Fields `tree` and `parent` can also be used with modifier `:short` and
     + `:short=<length>` just like `objectname`.
     + 
     + For commit and tag objects, the special `creatordate` and `creator`
     +
       ## ref-filter.c ##
     +@@ ref-filter.c: static struct {
     + 	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
     + 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
     + 	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
     +-	{ "parent", SOURCE_OBJ },
     ++	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
     + 	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
     + 	{ "object", SOURCE_OBJ },
     + 	{ "type", SOURCE_OBJ },
      @@ ref-filter.c: static void grab_commit_values(struct atom_value *val, int deref, struct object
     - 			continue;
     - 		if (deref)
     - 			name++;
     --		if (!strcmp(name, "tree")) {
     -+		if (!strcmp(name, "tree"))
     - 			v->s = xstrdup(oid_to_hex(get_commit_tree_oid(commit)));
     --		}
     -+		else if (!strcmp(name, "tree:short"))
     -+			v->s = xstrdup(find_unique_abbrev(get_commit_tree_oid(commit), DEFAULT_ABBREV));
     - 		else if (!strcmp(name, "numparent")) {
       			v->value = commit_list_count(commit->parents);
       			v->s = xstrfmt("%lu", (unsigned long)v->value);
       		}
     @@ ref-filter.c: static void grab_commit_values(struct atom_value *val, int deref,
       			struct commit_list *parents;
       			struct strbuf s = STRBUF_INIT;
       			for (parents = commit->parents; parents; parents = parents->next) {
     - 				struct commit *parent = parents->item;
     +-				struct commit *parent = parents->item;
     ++				struct object_id *oid = &parents->item->object.oid;
       				if (parents != commit->parents)
       					strbuf_addch(&s, ' ');
      -				strbuf_addstr(&s, oid_to_hex(&parent->object.oid));
     -+				if (!strcmp(name, "parent"))
     -+					strbuf_addstr(&s, oid_to_hex(&parent->object.oid));
     -+				else if (!strcmp(name, "parent:short"))
     -+					strbuf_add_unique_abbrev(&s, &parent->object.oid, DEFAULT_ABBREV);
     ++				strbuf_addstr(&s, do_grab_oid("parent", oid, &used_atom[i]));
       			}
       			v->s = strbuf_detach(&s, NULL);
       		}
      
       ## t/t6300-for-each-ref.sh ##
     -@@ t/t6300-for-each-ref.sh: test_atom head objectname:short $(git rev-parse --short refs/heads/master)
     - test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
     - test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
     - test_atom head tree $(git rev-parse refs/heads/master^{tree})
     -+test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
     +@@ t/t6300-for-each-ref.sh: test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
     + test_atom head tree:short=1 $(git rev-parse --short=1 refs/heads/master^{tree})
     + test_atom head tree:short=10 $(git rev-parse --short=10 refs/heads/master^{tree})
       test_atom head parent ''
      +test_atom head parent:short ''
     ++test_atom head parent:short=1 ''
     ++test_atom head parent:short=10 ''
       test_atom head numparent 0
       test_atom head object ''
       test_atom head type ''
     -@@ t/t6300-for-each-ref.sh: test_atom tag objectname:short $(git rev-parse --short refs/tags/testtag)
     - test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
     - test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
     - test_atom tag tree ''
     -+test_atom tag tree:short ''
     +@@ t/t6300-for-each-ref.sh: test_atom tag tree:short ''
     + test_atom tag tree:short=1 ''
     + test_atom tag tree:short=10 ''
       test_atom tag parent ''
      +test_atom tag parent:short ''
     ++test_atom tag parent:short=1 ''
     ++test_atom tag parent:short=10 ''
       test_atom tag numparent ''
       test_atom tag object $(git rev-parse refs/tags/testtag^0)
       test_atom tag type 'commit'
  3:  69b9d221c0 !  7:  95035765a0 pretty: refactor `format_sanitized_subject()`
     @@ Metadata
       ## Commit message ##
          pretty: refactor `format_sanitized_subject()`
      
     -    This commit refactors `format_sanitized_subject()` in the
     -    hope to use same logic in ref-filter.c
     +    The function 'format_sanitized_subject()' is responsible for
     +    sanitized subject line in pretty.c
     +    e.g.
     +    the subject line
     +    the-sanitized-subject-line
     +
     +    It would be a nice enhancement to `subject` atom to have the
     +    same feature. So in the later commits, we plan to add this feature
     +    to ref-filter.
     +
     +    Refactor `format_sanitized_subject()`, so it can be reused in
     +    ref-filter.c for adding new modifier `sanitize` to "subject" atom.
     +
     +    Currently, the loop inside `format_sanitized_subject()` runs
     +    until `\n` is found. But now, we stored the first occurrence
     +    of `\n` in a variable `eol` and passed it in
     +    `format_sanitized_subject()`. And the loop runs upto `eol`.
     +
     +    But this change isn't sufficient to reuse this function in
     +    ref-filter.c because there exist tags with multiline subject.
     +
     +    It's wise to replace `\n` with ' ', if `format_sanitized_subject()`
     +    encounters `\n` before end of subject line, just like `copy_subject()`.
     +    Because we'll be only using `format_sanitized_subject()` for
     +    "%(subject:sanitize)", instead of `copy_subject()` and
     +    `format_sanitized_subject()` both. So, added the code:
     +    ```
     +    if (char == '\n') /* never true if called inside pretty.c */
     +        char = ' ';
     +    ```
     +
     +    Now, it's ready to be reused in ref-filter.c
      
          Mentored-by: Christian Couder <chriscool@tuxfamily.org>
          Mentored-by: Heba Waly <heba.waly@gmail.com>
  4:  9dc619b448 !  8:  1c43f55d7c format-support: move `format_sanitized_subject()` from pretty
     @@ Commit message
      
          In hope of some new features in `subject` atom, move funtion
          `format_sanitized_subject()` and all the function it uses
     -    to new file format-support.[c/h].
     +    to new file format-support.{c,h}.
      
          Consider this new file as a common interface between functions that
          pretty.c and ref-filter.c shares.
  5:  7b9103cbad !  9:  feace82752 ref-filter: add `sanitize` option for 'subject' atom
     @@ Commit message
      
          `subject:sanitize` - print sanitized subject line, suitable for a filename.
      
     +    e.g.
     +    %(subject): "the subject line"
     +    %(subject:sanitize): "the-subject-line"
     +
          Mentored-by: Christian Couder <chriscool@tuxfamily.org>
          Mentored-by: Heba Waly <heba.waly@gmail.com>
          Signed-off-by: Hariom Verma <hariom18599@gmail.com>
      
     + ## Documentation/git-for-each-ref.txt ##
     +@@ Documentation/git-for-each-ref.txt: contents:subject::
     + 	The first paragraph of the message, which typically is a
     + 	single line, is taken as the "subject" of the commit or the
     + 	tag message.
     ++	Instead of `contents:subject`, field `subject` can also be used to
     ++	obtain same results. `:sanitize` can be appended to `subject` for
     ++	subject line suitable for filename.
     + 
     + contents:body::
     + 	The remainder of the commit or the tag message that follows
     +
       ## ref-filter.c ##
      @@
       #include "worktree.h"
     @@ ref-filter.c: static struct used_atom {
       			unsigned int nobracket : 1, push : 1, push_remote : 1;
       		} remote_ref;
       		struct {
     --			enum { C_BARE, C_BODY, C_BODY_DEP, C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
     -+			enum { C_BARE, C_BODY, C_BODY_DEP, C_LINES, C_SIG, C_SUB, C_SUB_SANITIZE, C_TRAILERS } option;
     +-			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH,
     +-			       C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
     ++			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH, C_LINES,
     ++			       C_SIG, C_SUB, C_SUB_SANITIZE, C_TRAILERS } option;
       			struct process_trailer_options trailer_opts;
       			unsigned int nlines;
       		} contents;
     @@ ref-filter.c: static int body_atom_parser(const struct ref_format *format, struc
       {
      -	if (arg)
      -		return strbuf_addf_ret(err, -1, _("%%(subject) does not take arguments"));
     -+	if (arg) {
     -+		if (!strcmp(arg, "sanitize")) {
     -+			atom->u.contents.option = C_SUB_SANITIZE;
     -+			return 0;
     -+		} else {
     -+			return strbuf_addf_ret(err, -1, _("unrecognized %%(subject) argument: %s"), arg);
     -+		}
     -+	}
     - 	atom->u.contents.option = C_SUB;
     +-	atom->u.contents.option = C_SUB;
     ++	if (!arg)
     ++		atom->u.contents.option = C_SUB;
     ++	else if (!strcmp(arg, "sanitize"))
     ++		atom->u.contents.option = C_SUB_SANITIZE;
     ++	else
     ++		return strbuf_addf_ret(err, -1, _("unrecognized %%(subject) argument: %s"), arg);
       	return 0;
       }
     + 
      @@ ref-filter.c: static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
     - 		struct used_atom *atom = &used_atom[i];
     - 		const char *name = atom->name;
     - 		struct atom_value *v = &val[i];
     -+
     - 		if (!!deref != (*name == '*'))
       			continue;
       		if (deref)
       			name++;
     - 		if (strcmp(name, "subject") &&
     -+		    strcmp(name, "subject:sanitize") &&
     - 		    strcmp(name, "body") &&
     +-		if (strcmp(name, "subject") &&
     +-		    strcmp(name, "body") &&
     ++		if (strcmp(name, "body") &&
     ++		    !starts_with(name, "subject") &&
       		    !starts_with(name, "trailers") &&
       		    !starts_with(name, "contents"))
     + 			continue;
      @@ ref-filter.c: static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
       
       		if (atom->u.contents.option == C_SUB)
     @@ ref-filter.c: static void grab_sub_body_contents(struct atom_value *val, int der
      +			v->s = strbuf_detach(&sb, NULL);
      +		} else if (atom->u.contents.option == C_BODY_DEP)
       			v->s = xmemdupz(bodypos, bodylen);
     - 		else if (atom->u.contents.option == C_BODY)
     - 			v->s = xmemdupz(bodypos, nonsiglen);
     + 		else if (atom->u.contents.option == C_LENGTH)
     + 			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)strlen(subpos));
      
       ## t/t6300-for-each-ref.sh ##
      @@ t/t6300-for-each-ref.sh: test_atom head taggerdate ''

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 1/9] ref-filter: support different email formats
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 2/9] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, ref-filter only supports printing email with angle brackets.

Let's add support for two more email options.
- trim : for email without angle brackets.
- localpart : for the part before the @ sign out of trimmed email

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt |  5 ++-
 ref-filter.c                       | 54 +++++++++++++++++++++++++-----
 t/t6300-for-each-ref.sh            | 16 +++++++++
 3 files changed, 65 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ea71c5f6c..e6ce8af612 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -230,7 +230,10 @@ These are intended for working on a mix of annotated and lightweight tags.
 
 Fields that have name-email-date tuple as its value (`author`,
 `committer`, and `tagger`) can be suffixed with `name`, `email`,
-and `date` to extract the named component.
+and `date` to extract the named component.  For email fields (`authoremail`,
+`committeremail` and `taggeremail`), `:trim` can be appended to get the email
+without angle brackets, and `:localpart` to get the part before the `@` symbol
+out of the trimmed email.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/ref-filter.c b/ref-filter.c
index f2b078db11..307069219f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -140,6 +140,9 @@ static struct used_atom {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
 		} objectname;
+		struct email_option {
+			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
+		} email_option;
 		struct refname_atom refname;
 		char *head;
 	} u;
@@ -377,6 +380,20 @@ static int objectname_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
+static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				    const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.email_option.option = EO_RAW;
+	else if (!strcmp(arg, "trim"))
+		atom->u.email_option.option = EO_TRIM;
+	else if (!strcmp(arg, "localpart"))
+		atom->u.email_option.option = EO_LOCALPART;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized email option: %s"), arg);
+	return 0;
+}
+
 static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
@@ -488,15 +505,15 @@ static struct {
 	{ "tag", SOURCE_OBJ },
 	{ "author", SOURCE_OBJ },
 	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ },
+	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "authordate", SOURCE_OBJ, FIELD_TIME },
 	{ "committer", SOURCE_OBJ },
 	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ },
+	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
 	{ "tagger", SOURCE_OBJ },
 	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ },
+	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
 	{ "creator", SOURCE_OBJ },
 	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
@@ -1037,16 +1054,35 @@ static const char *copy_name(const char *buf)
 	return xstrdup("");
 }
 
-static const char *copy_email(const char *buf)
+static const char *copy_email(const char *buf, struct used_atom *atom)
 {
 	const char *email = strchr(buf, '<');
 	const char *eoemail;
 	if (!email)
 		return xstrdup("");
-	eoemail = strchr(email, '>');
+	switch (atom->u.email_option.option) {
+	case EO_RAW:
+		eoemail = strchr(email, '>');
+		if (eoemail)
+			eoemail++;
+		break;
+	case EO_TRIM:
+		email++;
+		eoemail = strchr(email, '>');
+		break;
+	case EO_LOCALPART:
+		email++;
+		eoemail = strchr(email, '@');
+		if (!eoemail)
+			eoemail = strchr(email, '>');
+		break;
+	default:
+		BUG("unknown email option");
+	}
+
 	if (!eoemail)
 		return xstrdup("");
-	return xmemdupz(email, eoemail + 1 - email);
+	return xmemdupz(email, eoemail - email);
 }
 
 static char *copy_subject(const char *buf, unsigned long len)
@@ -1116,7 +1152,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			continue;
 		if (name[wholen] != 0 &&
 		    strcmp(name + wholen, "name") &&
-		    strcmp(name + wholen, "email") &&
+		    !starts_with(name + wholen, "email") &&
 		    !starts_with(name + wholen, "date"))
 			continue;
 		if (!wholine)
@@ -1127,8 +1163,8 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			v->s = copy_line(wholine);
 		else if (!strcmp(name + wholen, "name"))
 			v->s = copy_name(wholine);
-		else if (!strcmp(name + wholen, "email"))
-			v->s = copy_email(wholine);
+		else if (starts_with(name + wholen, "email"))
+			v->s = copy_email(wholine, &used_atom[i]);
 		else if (starts_with(name + wholen, "date"))
 			grab_date(wholine, v, name);
 	}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index a83579fbdf..64fbc91146 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -125,15 +125,21 @@ test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
 test_atom head authorname 'A U Thor'
 test_atom head authoremail '<author@example.com>'
+test_atom head authoremail:trim 'author@example.com'
+test_atom head authoremail:localpart 'author'
 test_atom head authordate 'Tue Jul 4 01:18:44 2006 +0200'
 test_atom head committer 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head committername 'C O Mitter'
 test_atom head committeremail '<committer@example.com>'
+test_atom head committeremail:trim 'committer@example.com'
+test_atom head committeremail:localpart 'committer'
 test_atom head committerdate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head tag ''
 test_atom head tagger ''
 test_atom head taggername ''
 test_atom head taggeremail ''
+test_atom head taggeremail:trim ''
+test_atom head taggeremail:localpart ''
 test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
@@ -170,15 +176,21 @@ test_atom tag '*objecttype' 'commit'
 test_atom tag author ''
 test_atom tag authorname ''
 test_atom tag authoremail ''
+test_atom tag authoremail:trim ''
+test_atom tag authoremail:localpart ''
 test_atom tag authordate ''
 test_atom tag committer ''
 test_atom tag committername ''
 test_atom tag committeremail ''
+test_atom tag committeremail:trim ''
+test_atom tag committeremail:localpart ''
 test_atom tag committerdate ''
 test_atom tag tag 'testtag'
 test_atom tag tagger 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag taggername 'C O Mitter'
 test_atom tag taggeremail '<committer@example.com>'
+test_atom tag taggeremail:trim 'committer@example.com'
+test_atom tag taggeremail:localpart 'committer'
 test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
@@ -564,10 +576,14 @@ test_atom refs/tags/taggerless tag 'taggerless'
 test_atom refs/tags/taggerless tagger ''
 test_atom refs/tags/taggerless taggername ''
 test_atom refs/tags/taggerless taggeremail ''
+test_atom refs/tags/taggerless taggeremail:trim ''
+test_atom refs/tags/taggerless taggeremail:localpart ''
 test_atom refs/tags/taggerless taggerdate ''
 test_atom refs/tags/taggerless committer ''
 test_atom refs/tags/taggerless committername ''
 test_atom refs/tags/taggerless committeremail ''
+test_atom refs/tags/taggerless committeremail:trim ''
+test_atom refs/tags/taggerless committeremail:localpart ''
 test_atom refs/tags/taggerless committerdate ''
 test_atom refs/tags/taggerless subject 'Broken tag'
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 2/9] ref-filter: refactor `grab_objectname()`
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 3/9] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Prepares `grab_objectname()` for more generic usage.
This change will allow us to reuse `grab_objectname()` for
the `tree` and `parent` atoms in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 307069219f..d078f893ff 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -918,21 +918,27 @@ int verify_ref_format(struct ref_format *format)
 	return 0;
 }
 
-static int grab_objectname(const char *name, const struct object_id *oid,
+static const char *do_grab_objectname(const char *field, const struct object_id *oid,
+				      struct used_atom *atom)
+{
+	switch (atom->u.objectname.option) {
+	case O_FULL:
+		return oid_to_hex(oid);
+	case O_LENGTH:
+		return find_unique_abbrev(oid, atom->u.objectname.length);
+	case O_SHORT:
+		return find_unique_abbrev(oid, DEFAULT_ABBREV);
+	default:
+		BUG("unknown %%(%s) option", field);
+	}
+}
+
+static int grab_objectname(const char *name, const char *field, const struct object_id *oid,
 			   struct atom_value *v, struct used_atom *atom)
 {
-	if (starts_with(name, "objectname")) {
-		if (atom->u.objectname.option == O_SHORT) {
-			v->s = xstrdup(find_unique_abbrev(oid, DEFAULT_ABBREV));
-			return 1;
-		} else if (atom->u.objectname.option == O_FULL) {
-			v->s = xstrdup(oid_to_hex(oid));
-			return 1;
-		} else if (atom->u.objectname.option == O_LENGTH) {
-			v->s = xstrdup(find_unique_abbrev(oid, atom->u.objectname.length));
-			return 1;
-		} else
-			BUG("unknown %%(objectname) option");
+	if (starts_with(name, field)) {
+		v->s = xstrdup(do_grab_objectname(field, oid, atom));
+		return 1;
 	}
 	return 0;
 }
@@ -960,7 +966,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-			grab_objectname(name, &oi->oid, v, &used_atom[i]);
+			grab_objectname(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
 
@@ -1740,7 +1746,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_objectname(name, &ref->objectname, v, atom)) {
+		} else if (!deref && grab_objectname(name, "objectname", &ref->objectname, v, atom)) {
 			continue;
 		} else if (!strcmp(name, "HEAD")) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 3/9] ref-filter: modify error messages in `grab_objectname()`
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 2/9] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 4/9] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

As we plan to use `grab_objectname()` for `tree` and `parent` atom,
it's better to parameterize the error messages in the function
`grab_objectname()` where "objectname" is hard coded.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index d078f893ff..4183ee2797 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -372,11 +372,11 @@ static int objectname_atom_parser(const struct ref_format *format, struct used_a
 		atom->u.objectname.option = O_LENGTH;
 		if (strtoul_ui(arg, 10, &atom->u.objectname.length) ||
 		    atom->u.objectname.length == 0)
-			return strbuf_addf_ret(err, -1, _("positive value expected objectname:short=%s"), arg);
+			return strbuf_addf_ret(err, -1, _("positive value expected '%s' in %%(%s)"), arg, atom->name);
 		if (atom->u.objectname.length < MINIMUM_ABBREV)
 			atom->u.objectname.length = MINIMUM_ABBREV;
 	} else
-		return strbuf_addf_ret(err, -1, _("unrecognized %%(objectname) argument: %s"), arg);
+		return strbuf_addf_ret(err, -1, _("unrecognized argument '%s' in %%(%s)"), arg, atom->name);
 	return 0;
 }
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 4/9] ref-filter: rename `objectname` related functions and fields
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-08-05 21:51   ` [PATCH v2 3/9] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 5/9] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

In previous commits, we prepared some `objectname` related functions
for more generic usage, so that these functions can be used for `tree`
and `parent` atom.

But the name of some functions and fields may mislead someone.
For ex: function `objectname_atom_parser()` implies that it is
for atom `objectname`.

Let's rename all such functions and fields.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4183ee2797..cc544eadaf 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -139,7 +139,7 @@ static struct used_atom {
 		struct {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
-		} objectname;
+		} oid;
 		struct email_option {
 			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
 		} email_option;
@@ -361,20 +361,20 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int objectname_atom_parser(const struct ref_format *format, struct used_atom *atom,
-				  const char *arg, struct strbuf *err)
+static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
-		atom->u.objectname.option = O_FULL;
+		atom->u.oid.option = O_FULL;
 	else if (!strcmp(arg, "short"))
-		atom->u.objectname.option = O_SHORT;
+		atom->u.oid.option = O_SHORT;
 	else if (skip_prefix(arg, "short=", &arg)) {
-		atom->u.objectname.option = O_LENGTH;
-		if (strtoul_ui(arg, 10, &atom->u.objectname.length) ||
-		    atom->u.objectname.length == 0)
+		atom->u.oid.option = O_LENGTH;
+		if (strtoul_ui(arg, 10, &atom->u.oid.length) ||
+		    atom->u.oid.length == 0)
 			return strbuf_addf_ret(err, -1, _("positive value expected '%s' in %%(%s)"), arg, atom->name);
-		if (atom->u.objectname.length < MINIMUM_ABBREV)
-			atom->u.objectname.length = MINIMUM_ABBREV;
+		if (atom->u.oid.length < MINIMUM_ABBREV)
+			atom->u.oid.length = MINIMUM_ABBREV;
 	} else
 		return strbuf_addf_ret(err, -1, _("unrecognized argument '%s' in %%(%s)"), arg, atom->name);
 	return 0;
@@ -495,7 +495,7 @@ static struct {
 	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
 	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
 	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, objectname_atom_parser },
+	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
 	{ "tree", SOURCE_OBJ },
 	{ "parent", SOURCE_OBJ },
@@ -918,14 +918,14 @@ int verify_ref_format(struct ref_format *format)
 	return 0;
 }
 
-static const char *do_grab_objectname(const char *field, const struct object_id *oid,
-				      struct used_atom *atom)
+static const char *do_grab_oid(const char *field, const struct object_id *oid,
+			       struct used_atom *atom)
 {
-	switch (atom->u.objectname.option) {
+	switch (atom->u.oid.option) {
 	case O_FULL:
 		return oid_to_hex(oid);
 	case O_LENGTH:
-		return find_unique_abbrev(oid, atom->u.objectname.length);
+		return find_unique_abbrev(oid, atom->u.oid.length);
 	case O_SHORT:
 		return find_unique_abbrev(oid, DEFAULT_ABBREV);
 	default:
@@ -933,11 +933,11 @@ static const char *do_grab_objectname(const char *field, const struct object_id
 	}
 }
 
-static int grab_objectname(const char *name, const char *field, const struct object_id *oid,
-			   struct atom_value *v, struct used_atom *atom)
+static int grab_oid(const char *name, const char *field, const struct object_id *oid,
+		    struct atom_value *v, struct used_atom *atom)
 {
 	if (starts_with(name, field)) {
-		v->s = xstrdup(do_grab_objectname(field, oid, atom));
+		v->s = xstrdup(do_grab_oid(field, oid, atom));
 		return 1;
 	}
 	return 0;
@@ -966,7 +966,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-			grab_objectname(name, "objectname", &oi->oid, v, &used_atom[i]);
+			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
 
@@ -1746,7 +1746,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_objectname(name, "objectname", &ref->objectname, v, atom)) {
+		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
 			continue;
 		} else if (!strcmp(name, "HEAD")) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 5/9] ref-filter: add `short` modifier to 'tree' atom
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (3 preceding siblings ...)
  2020-08-05 21:51   ` [PATCH v2 4/9] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 6/9] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Sometimes while using 'tree' atom, user might want to see abbrev hash
instead of full 40 character hash.

Just like 'objectname', it might be convenient for users to have the
`:short` and `:short=<length>` option for printing 'tree' hash.

Let's introduce `short` option to 'tree' atom.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt | 2 ++
 ref-filter.c                       | 9 ++++-----
 t/t6300-for-each-ref.sh            | 6 ++++++
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index e6ce8af612..40ebdfcc41 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -222,6 +222,8 @@ worktreepath::
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
 be used to specify the value in the header field.
+Field `tree` can also be used with modifier `:short` and
+`:short=<length>` just like `objectname`.
 
 For commit and tag objects, the special `creatordate` and `creator`
 fields will correspond to the appropriate date or name-email-date tuple
diff --git a/ref-filter.c b/ref-filter.c
index cc544eadaf..f9d85661eb 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -497,7 +497,7 @@ static struct {
 	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
 	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ },
+	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
 	{ "parent", SOURCE_OBJ },
 	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
 	{ "object", SOURCE_OBJ },
@@ -1005,10 +1005,9 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tree")) {
-			v->s = xstrdup(oid_to_hex(get_commit_tree_oid(commit)));
-		}
-		else if (!strcmp(name, "numparent")) {
+		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+			continue;
+		if (!strcmp(name, "numparent")) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 64fbc91146..e30bbff6d9 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -116,6 +116,9 @@ test_atom head objectname:short $(git rev-parse --short refs/heads/master)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom head tree $(git rev-parse refs/heads/master^{tree})
+test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
+test_atom head tree:short=1 $(git rev-parse --short=1 refs/heads/master^{tree})
+test_atom head tree:short=10 $(git rev-parse --short=10 refs/heads/master^{tree})
 test_atom head parent ''
 test_atom head numparent 0
 test_atom head object ''
@@ -167,6 +170,9 @@ test_atom tag objectname:short $(git rev-parse --short refs/tags/testtag)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom tag tree ''
+test_atom tag tree:short ''
+test_atom tag tree:short=1 ''
+test_atom tag tree:short=10 ''
 test_atom tag parent ''
 test_atom tag numparent ''
 test_atom tag object $(git rev-parse refs/tags/testtag^0)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 6/9] ref-filter: add `short` modifier to 'parent' atom
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (4 preceding siblings ...)
  2020-08-05 21:51   ` [PATCH v2 5/9] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 7/9] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Sometimes while using 'parent' atom, user might want to see abbrev hash
instead of full 40 character hash.

Just like 'objectname', it might be convenient for users to have the
`:short` and `:short=<length>` option for printing 'parent' hash.

Let's introduce `short` option to 'parent' atom.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt | 2 +-
 ref-filter.c                       | 8 ++++----
 t/t6300-for-each-ref.sh            | 6 ++++++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 40ebdfcc41..dd09763e7d 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -222,7 +222,7 @@ worktreepath::
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
 be used to specify the value in the header field.
-Field `tree` can also be used with modifier `:short` and
+Fields `tree` and `parent` can also be used with modifier `:short` and
 `:short=<length>` just like `objectname`.
 
 For commit and tag objects, the special `creatordate` and `creator`
diff --git a/ref-filter.c b/ref-filter.c
index f9d85661eb..6d5bbb14a2 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -498,7 +498,7 @@ static struct {
 	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
 	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ },
+	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
 	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
 	{ "object", SOURCE_OBJ },
 	{ "type", SOURCE_OBJ },
@@ -1011,14 +1011,14 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (!strcmp(name, "parent")) {
+		else if (starts_with(name, "parent")) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
-				struct commit *parent = parents->item;
+				struct object_id *oid = &parents->item->object.oid;
 				if (parents != commit->parents)
 					strbuf_addch(&s, ' ');
-				strbuf_addstr(&s, oid_to_hex(&parent->object.oid));
+				strbuf_addstr(&s, do_grab_oid("parent", oid, &used_atom[i]));
 			}
 			v->s = strbuf_detach(&s, NULL);
 		}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index e30bbff6d9..79d5b29387 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -120,6 +120,9 @@ test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
 test_atom head tree:short=1 $(git rev-parse --short=1 refs/heads/master^{tree})
 test_atom head tree:short=10 $(git rev-parse --short=10 refs/heads/master^{tree})
 test_atom head parent ''
+test_atom head parent:short ''
+test_atom head parent:short=1 ''
+test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
@@ -174,6 +177,9 @@ test_atom tag tree:short ''
 test_atom tag tree:short=1 ''
 test_atom tag tree:short=10 ''
 test_atom tag parent ''
+test_atom tag parent:short ''
+test_atom tag parent:short=1 ''
+test_atom tag parent:short=10 ''
 test_atom tag numparent ''
 test_atom tag object $(git rev-parse refs/tags/testtag^0)
 test_atom tag type 'commit'
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (5 preceding siblings ...)
  2020-08-05 21:51   ` [PATCH v2 6/9] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 8/9] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

The function 'format_sanitized_subject()' is responsible for
sanitized subject line in pretty.c
e.g.
the subject line
the-sanitized-subject-line

It would be a nice enhancement to `subject` atom to have the
same feature. So in the later commits, we plan to add this feature
to ref-filter.

Refactor `format_sanitized_subject()`, so it can be reused in
ref-filter.c for adding new modifier `sanitize` to "subject" atom.

Currently, the loop inside `format_sanitized_subject()` runs
until `\n` is found. But now, we stored the first occurrence
of `\n` in a variable `eol` and passed it in
`format_sanitized_subject()`. And the loop runs upto `eol`.

But this change isn't sufficient to reuse this function in
ref-filter.c because there exist tags with multiline subject.

It's wise to replace `\n` with ' ', if `format_sanitized_subject()`
encounters `\n` before end of subject line, just like `copy_subject()`.
Because we'll be only using `format_sanitized_subject()` for
"%(subject:sanitize)", instead of `copy_subject()` and
`format_sanitized_subject()` both. So, added the code:
```
if (char == '\n') /* never true if called inside pretty.c */
    char = ' ';
```

Now, it's ready to be reused in ref-filter.c

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 pretty.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/pretty.c b/pretty.c
index 2a3d46bf42..8d08e8278a 100644
--- a/pretty.c
+++ b/pretty.c
@@ -839,24 +839,29 @@ static int istitlechar(char c)
 		(c >= '0' && c <= '9') || c == '.' || c == '_';
 }
 
-static void format_sanitized_subject(struct strbuf *sb, const char *msg)
+static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
 {
+	char *r = xmemdupz(msg, len);
 	size_t trimlen;
 	size_t start_len = sb->len;
 	int space = 2;
+	int i;
 
-	for (; *msg && *msg != '\n'; msg++) {
-		if (istitlechar(*msg)) {
+	for (i = 0; i < len; i++) {
+		if (r[i] == '\n')
+			r[i] = ' ';
+		if (istitlechar(r[i])) {
 			if (space == 1)
 				strbuf_addch(sb, '-');
 			space = 0;
-			strbuf_addch(sb, *msg);
-			if (*msg == '.')
-				while (*(msg+1) == '.')
-					msg++;
+			strbuf_addch(sb, r[i]);
+			if (r[i] == '.')
+				while (r[i+1] == '.')
+					i++;
 		} else
 			space |= 1;
 	}
+	free(r);
 
 	/* trim any trailing '.' or '-' characters */
 	trimlen = 0;
@@ -1155,7 +1160,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 	const struct commit *commit = c->commit;
 	const char *msg = c->message;
 	struct commit_list *p;
-	const char *arg;
+	const char *arg, *eol;
 	size_t res;
 	char **slot;
 
@@ -1405,7 +1410,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		format_subject(sb, msg + c->subject_off, " ");
 		return 1;
 	case 'f':	/* sanitized subject */
-		format_sanitized_subject(sb, msg + c->subject_off);
+		eol = strchrnul(msg + c->subject_off, '\n');
+		format_sanitized_subject(sb, msg + c->subject_off, eol - (msg + c->subject_off));
 		return 1;
 	case 'b':	/* body */
 		strbuf_addstr(sb, msg + c->body_off);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 8/9] format-support: move `format_sanitized_subject()` from pretty
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (6 preceding siblings ...)
  2020-08-05 21:51   ` [PATCH v2 7/9] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 21:51   ` [PATCH v2 9/9] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

In hope of some new features in `subject` atom, move funtion
`format_sanitized_subject()` and all the function it uses
to new file format-support.{c,h}.

Consider this new file as a common interface between functions that
pretty.c and ref-filter.c shares.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Makefile         |  1 +
 format-support.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 format-support.h |  6 ++++++
 pretty.c         | 40 +---------------------------------------
 4 files changed, 51 insertions(+), 39 deletions(-)
 create mode 100644 format-support.c
 create mode 100644 format-support.h

diff --git a/Makefile b/Makefile
index 372139f1f2..4dfc384b49 100644
--- a/Makefile
+++ b/Makefile
@@ -882,6 +882,7 @@ LIB_OBJS += exec-cmd.o
 LIB_OBJS += fetch-negotiator.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fmt-merge-msg.o
+LIB_OBJS += format-support.o
 LIB_OBJS += fsck.o
 LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
diff --git a/format-support.c b/format-support.c
new file mode 100644
index 0000000000..d693aa1744
--- /dev/null
+++ b/format-support.c
@@ -0,0 +1,43 @@
+#include "diff.h"
+#include "log-tree.h"
+#include "color.h"
+#include "format-support.h"
+
+static int istitlechar(char c)
+{
+	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
+		(c >= '0' && c <= '9') || c == '.' || c == '_';
+}
+
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
+{
+	char *r = xmemdupz(msg, len);
+	size_t trimlen;
+	size_t start_len = sb->len;
+	int space = 2;
+	int i;
+
+	for (i = 0; i < len; i++) {
+		if (r[i] == '\n')
+			r[i] = ' ';
+		if (istitlechar(r[i])) {
+			if (space == 1)
+				strbuf_addch(sb, '-');
+			space = 0;
+			strbuf_addch(sb, r[i]);
+			if (r[i] == '.')
+				while (r[i+1] == '.')
+					i++;
+		} else
+			space |= 1;
+	}
+	free(r);
+
+	/* trim any trailing '.' or '-' characters */
+	trimlen = 0;
+	while (sb->len - trimlen > start_len &&
+		(sb->buf[sb->len - 1 - trimlen] == '.'
+		|| sb->buf[sb->len - 1 - trimlen] == '-'))
+		trimlen++;
+	strbuf_remove(sb, sb->len - trimlen, trimlen);
+}
diff --git a/format-support.h b/format-support.h
new file mode 100644
index 0000000000..c344ccbc33
--- /dev/null
+++ b/format-support.h
@@ -0,0 +1,6 @@
+#ifndef FORMAT_SUPPORT_H
+#define FORMAT_SUPPORT_H
+
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len);
+
+#endif /* FORMAT_SUPPORT_H */
diff --git a/pretty.c b/pretty.c
index 8d08e8278a..2de01b7115 100644
--- a/pretty.c
+++ b/pretty.c
@@ -12,6 +12,7 @@
 #include "reflog-walk.h"
 #include "gpg-interface.h"
 #include "trailer.h"
+#include "format-support.h"
 
 static char *user_format;
 static struct cmt_fmt_map {
@@ -833,45 +834,6 @@ static void parse_commit_header(struct format_commit_context *context)
 	context->commit_header_parsed = 1;
 }
 
-static int istitlechar(char c)
-{
-	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
-		(c >= '0' && c <= '9') || c == '.' || c == '_';
-}
-
-static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
-{
-	char *r = xmemdupz(msg, len);
-	size_t trimlen;
-	size_t start_len = sb->len;
-	int space = 2;
-	int i;
-
-	for (i = 0; i < len; i++) {
-		if (r[i] == '\n')
-			r[i] = ' ';
-		if (istitlechar(r[i])) {
-			if (space == 1)
-				strbuf_addch(sb, '-');
-			space = 0;
-			strbuf_addch(sb, r[i]);
-			if (r[i] == '.')
-				while (r[i+1] == '.')
-					i++;
-		} else
-			space |= 1;
-	}
-	free(r);
-
-	/* trim any trailing '.' or '-' characters */
-	trimlen = 0;
-	while (sb->len - trimlen > start_len &&
-		(sb->buf[sb->len - 1 - trimlen] == '.'
-		|| sb->buf[sb->len - 1 - trimlen] == '-'))
-		trimlen++;
-	strbuf_remove(sb, sb->len - trimlen, trimlen);
-}
-
 const char *format_subject(struct strbuf *sb, const char *msg,
 			   const char *line_separator)
 {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 9/9] ref-filter: add `sanitize` option for 'subject' atom
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (7 preceding siblings ...)
  2020-08-05 21:51   ` [PATCH v2 8/9] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
@ 2020-08-05 21:51   ` Hariom Verma via GitGitGadget
  2020-08-05 22:04   ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Junio C Hamano
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-05 21:51 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, subject does not take any arguments. This commit introduce
`sanitize` formatting option to 'subject' atom.

`subject:sanitize` - print sanitized subject line, suitable for a filename.

e.g.
%(subject): "the subject line"
%(subject:sanitize): "the-subject-line"

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt |  3 +++
 ref-filter.c                       | 24 ++++++++++++++++--------
 t/t6300-for-each-ref.sh            |  7 +++++++
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index dd09763e7d..616ce46087 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -247,6 +247,9 @@ contents:subject::
 	The first paragraph of the message, which typically is a
 	single line, is taken as the "subject" of the commit or the
 	tag message.
+	Instead of `contents:subject`, field `subject` can also be used to
+	obtain same results. `:sanitize` can be appended to `subject` for
+	subject line suitable for filename.
 
 contents:body::
 	The remainder of the commit or the tag message that follows
diff --git a/ref-filter.c b/ref-filter.c
index 6d5bbb14a2..016a03ef20 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -23,6 +23,7 @@
 #include "worktree.h"
 #include "hashmap.h"
 #include "argv-array.h"
+#include "format-support.h"
 
 static struct ref_msg {
 	const char *gone;
@@ -127,8 +128,8 @@ static struct used_atom {
 			unsigned int nobracket : 1, push : 1, push_remote : 1;
 		} remote_ref;
 		struct {
-			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH,
-			       C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
+			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH, C_LINES,
+			       C_SIG, C_SUB, C_SUB_SANITIZE, C_TRAILERS } option;
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
@@ -301,9 +302,12 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
-	if (arg)
-		return strbuf_addf_ret(err, -1, _("%%(subject) does not take arguments"));
-	atom->u.contents.option = C_SUB;
+	if (!arg)
+		atom->u.contents.option = C_SUB;
+	else if (!strcmp(arg, "sanitize"))
+		atom->u.contents.option = C_SUB_SANITIZE;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(subject) argument: %s"), arg);
 	return 0;
 }
 
@@ -1282,8 +1286,8 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "subject") &&
-		    strcmp(name, "body") &&
+		if (strcmp(name, "body") &&
+		    !starts_with(name, "subject") &&
 		    !starts_with(name, "trailers") &&
 		    !starts_with(name, "contents"))
 			continue;
@@ -1295,7 +1299,11 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 
 		if (atom->u.contents.option == C_SUB)
 			v->s = copy_subject(subpos, sublen);
-		else if (atom->u.contents.option == C_BODY_DEP)
+		else if (atom->u.contents.option == C_SUB_SANITIZE) {
+			struct strbuf sb = STRBUF_INIT;
+			format_sanitized_subject(&sb, subpos, sublen);
+			v->s = strbuf_detach(&sb, NULL);
+		} else if (atom->u.contents.option == C_BODY_DEP)
 			v->s = xmemdupz(bodypos, bodylen);
 		else if (atom->u.contents.option == C_LENGTH)
 			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)strlen(subpos));
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 79d5b29387..220ff5c3c2 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -150,6 +150,7 @@ test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head subject 'Initial'
+test_atom head subject:sanitize 'Initial'
 test_atom head contents:subject 'Initial'
 test_atom head body ''
 test_atom head contents:body ''
@@ -207,6 +208,7 @@ test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag subject 'Tagging at 1151968727'
+test_atom tag subject:sanitize 'Tagging-at-1151968727'
 test_atom tag contents:subject 'Tagging at 1151968727'
 test_atom tag body ''
 test_atom tag contents:body ''
@@ -619,6 +621,7 @@ test_expect_success 'create tag with subject and body content' '
 	git tag -F msg subject-body
 '
 test_atom refs/tags/subject-body subject 'the subject line'
+test_atom refs/tags/subject-body subject:sanitize 'the-subject-line'
 test_atom refs/tags/subject-body body 'first body line
 second body line
 '
@@ -639,6 +642,7 @@ test_expect_success 'create tag with multiline subject' '
 	git tag -F msg multiline
 '
 test_atom refs/tags/multiline subject 'first subject line second subject line'
+test_atom refs/tags/multiline subject:sanitize 'first-subject-line-second-subject-line'
 test_atom refs/tags/multiline contents:subject 'first subject line second subject line'
 test_atom refs/tags/multiline body 'first body line
 second body line
@@ -671,6 +675,7 @@ sig='-----BEGIN PGP SIGNATURE-----
 
 PREREQ=GPG
 test_atom refs/tags/signed-empty subject ''
+test_atom refs/tags/signed-empty subject:sanitize ''
 test_atom refs/tags/signed-empty contents:subject ''
 test_atom refs/tags/signed-empty body "$sig"
 test_atom refs/tags/signed-empty contents:body ''
@@ -678,6 +683,7 @@ test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
 test_atom refs/tags/signed-short subject 'subject line'
+test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
 test_atom refs/tags/signed-short body "$sig"
 test_atom refs/tags/signed-short contents:body ''
@@ -686,6 +692,7 @@ test_atom refs/tags/signed-short contents "subject line
 $sig"
 
 test_atom refs/tags/signed-long subject 'subject line'
+test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
 test_atom refs/tags/signed-long body "body contents
 $sig"
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 0/9] [GSoC] Improvements to ref-filter
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (8 preceding siblings ...)
  2020-08-05 21:51   ` [PATCH v2 9/9] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
@ 2020-08-05 22:04   ` Junio C Hamano
  2020-08-06 13:47     ` Hariom verma
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
  10 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2020-08-05 22:04 UTC (permalink / raw)
  To: Hariom Verma via GitGitGadget; +Cc: git, Hariom Verma

"Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:

>      -@@ ref-filter.c: static struct ref_to_worktree_map {
>      - 	struct worktree **worktrees;
>      - } ref_to_worktree_map;
>      +@@ ref-filter.c: static struct used_atom {
>      + 			enum { O_FULL, O_LENGTH, O_SHORT } option;
>      + 			unsigned int length;
>      + 		} objectname;
>      ++		struct email_option {
>      ++			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
>      ++		} email_option;
>      + 		struct refname_atom refname;
>      + 		char *head;
>      + 	} u;

I'll try to find enough time to read the body of the series sometime
later this week, but this interdiff alone smells that this is much
closer to being correct (no, I am not saying I spotted a bug, but it
certainly looks liek it is on the right track, relative to what I
saw the last time, to be right).

A good test for this new feature may be to try using

	"<%(authoremail:localpart)> <%(committeremail:trim)>"

as a format to make sure e-mail options are done per-atom.

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 0/9] [GSoC] Improvements to ref-filter
  2020-08-05 22:04   ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Junio C Hamano
@ 2020-08-06 13:47     ` Hariom verma
  0 siblings, 0 replies; 51+ messages in thread
From: Hariom verma @ 2020-08-06 13:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Hi,

On Thu, Aug 6, 2020 at 3:34 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >      -@@ ref-filter.c: static struct ref_to_worktree_map {
> >      -        struct worktree **worktrees;
> >      - } ref_to_worktree_map;
> >      +@@ ref-filter.c: static struct used_atom {
> >      +                        enum { O_FULL, O_LENGTH, O_SHORT } option;
> >      +                        unsigned int length;
> >      +                } objectname;
> >      ++               struct email_option {
> >      ++                       enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
> >      ++               } email_option;
> >      +                struct refname_atom refname;
> >      +                char *head;
> >      +        } u;
>
> I'll try to find enough time to read the body of the series sometime
> later this week, but this interdiff alone smells that this is much
> closer to being correct (no, I am not saying I spotted a bug, but it
> certainly looks liek it is on the right track, relative to what I
> saw the last time, to be right).

Thanks. I'll wait for your review.

> A good test for this new feature may be to try using
>
>         "<%(authoremail:localpart)> <%(committeremail:trim)>"
>
> as a format to make sure e-mail options are done per-atom.

Yeah. This test will surely make its way to t6300 in the next version
of this patch series.

Thanks,
Hariom

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 0/9] [Resend][GSoC] Improvements to ref-filter
  2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
                     ` (9 preceding siblings ...)
  2020-08-05 22:04   ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Junio C Hamano
@ 2020-08-17 18:10   ` Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
                       ` (10 more replies)
  10 siblings, 11 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma

This is the first patch series that introduces some improvements and
features to file ref-filter.{c,h}. These changes are useful to ref-filter,
but in near future also will allow us to use ref-filter's logic in pretty.c

I plan to add more to format-support.{c,h} in the upcoming patch series.
That will lead to more improved and feature-rich ref-filter.c


----------------------------------------------------------------------------

I just rebased the branch with master and fixed some merge conflicts. Only
the version number has been incremented, there are no changes in the
patches. Original patch series: 
https://public-inbox.org/git/pull.684.v2.git.1596664305.gitgitgadget@gmail.com/#t

Hariom Verma (9):
  ref-filter: support different email formats
  ref-filter: refactor `grab_objectname()`
  ref-filter: modify error messages in `grab_objectname()`
  ref-filter: rename `objectname` related functions and fields
  ref-filter: add `short` modifier to 'tree' atom
  ref-filter: add `short` modifier to 'parent' atom
  pretty: refactor `format_sanitized_subject()`
  format-support: move `format_sanitized_subject()` from pretty
  ref-filter: add `sanitize` option for 'subject' atom

 Documentation/git-for-each-ref.txt |  10 +-
 Makefile                           |   1 +
 format-support.c                   |  43 ++++++++
 format-support.h                   |   6 ++
 pretty.c                           |  40 +-------
 ref-filter.c                       | 159 +++++++++++++++++++----------
 t/t6300-for-each-ref.sh            |  35 +++++++
 7 files changed, 202 insertions(+), 92 deletions(-)
 create mode 100644 format-support.c
 create mode 100644 format-support.h


base-commit: 878e727637ec5815ccb3301eb994a54df95b21b8
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-684%2Fharry-hov%2Fonly-rf6-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-684/harry-hov/only-rf6-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/684

Range-diff vs v2:

  1:  78e69032df =  1:  3e6fc66a46 ref-filter: support different email formats
  2:  b6b6acab9a =  2:  5268b973da ref-filter: refactor `grab_objectname()`
  3:  65fee332a3 =  3:  4a12ff8210 ref-filter: modify error messages in `grab_objectname()`
  4:  976f2041a4 =  4:  d53ca56778 ref-filter: rename `objectname` related functions and fields
  5:  dda7400b14 =  5:  fd4ed82e80 ref-filter: add `short` modifier to 'tree' atom
  6:  764bb23b59 =  6:  7a039823de ref-filter: add `short` modifier to 'parent' atom
  7:  95035765a0 =  7:  0ad22c7cdd pretty: refactor `format_sanitized_subject()`
  8:  1c43f55d7c =  8:  7a64495f99 format-support: move `format_sanitized_subject()` from pretty
  9:  feace82752 !  9:  1ab35e9251 ref-filter: add `sanitize` option for 'subject' atom
     @@ ref-filter.c
      @@
       #include "worktree.h"
       #include "hashmap.h"
     - #include "argv-array.h"
     + #include "strvec.h"
      +#include "format-support.h"
       
       static struct ref_msg {

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 1/9] ref-filter: support different email formats
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 2/9] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
                       ` (9 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, ref-filter only supports printing email with angle brackets.

Let's add support for two more email options.
- trim : for email without angle brackets.
- localpart : for the part before the @ sign out of trimmed email

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt |  5 ++-
 ref-filter.c                       | 54 +++++++++++++++++++++++++-----
 t/t6300-for-each-ref.sh            | 16 +++++++++
 3 files changed, 65 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ea71c5f6c..e6ce8af612 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -230,7 +230,10 @@ These are intended for working on a mix of annotated and lightweight tags.
 
 Fields that have name-email-date tuple as its value (`author`,
 `committer`, and `tagger`) can be suffixed with `name`, `email`,
-and `date` to extract the named component.
+and `date` to extract the named component.  For email fields (`authoremail`,
+`committeremail` and `taggeremail`), `:trim` can be appended to get the email
+without angle brackets, and `:localpart` to get the part before the `@` symbol
+out of the trimmed email.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/ref-filter.c b/ref-filter.c
index ba85869755..e60765f156 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -140,6 +140,9 @@ static struct used_atom {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
 		} objectname;
+		struct email_option {
+			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
+		} email_option;
 		struct refname_atom refname;
 		char *head;
 	} u;
@@ -377,6 +380,20 @@ static int objectname_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
+static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				    const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.email_option.option = EO_RAW;
+	else if (!strcmp(arg, "trim"))
+		atom->u.email_option.option = EO_TRIM;
+	else if (!strcmp(arg, "localpart"))
+		atom->u.email_option.option = EO_LOCALPART;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized email option: %s"), arg);
+	return 0;
+}
+
 static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
@@ -488,15 +505,15 @@ static struct {
 	{ "tag", SOURCE_OBJ },
 	{ "author", SOURCE_OBJ },
 	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ },
+	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "authordate", SOURCE_OBJ, FIELD_TIME },
 	{ "committer", SOURCE_OBJ },
 	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ },
+	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
 	{ "tagger", SOURCE_OBJ },
 	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ },
+	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
 	{ "creator", SOURCE_OBJ },
 	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
@@ -1037,16 +1054,35 @@ static const char *copy_name(const char *buf)
 	return xstrdup("");
 }
 
-static const char *copy_email(const char *buf)
+static const char *copy_email(const char *buf, struct used_atom *atom)
 {
 	const char *email = strchr(buf, '<');
 	const char *eoemail;
 	if (!email)
 		return xstrdup("");
-	eoemail = strchr(email, '>');
+	switch (atom->u.email_option.option) {
+	case EO_RAW:
+		eoemail = strchr(email, '>');
+		if (eoemail)
+			eoemail++;
+		break;
+	case EO_TRIM:
+		email++;
+		eoemail = strchr(email, '>');
+		break;
+	case EO_LOCALPART:
+		email++;
+		eoemail = strchr(email, '@');
+		if (!eoemail)
+			eoemail = strchr(email, '>');
+		break;
+	default:
+		BUG("unknown email option");
+	}
+
 	if (!eoemail)
 		return xstrdup("");
-	return xmemdupz(email, eoemail + 1 - email);
+	return xmemdupz(email, eoemail - email);
 }
 
 static char *copy_subject(const char *buf, unsigned long len)
@@ -1116,7 +1152,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			continue;
 		if (name[wholen] != 0 &&
 		    strcmp(name + wholen, "name") &&
-		    strcmp(name + wholen, "email") &&
+		    !starts_with(name + wholen, "email") &&
 		    !starts_with(name + wholen, "date"))
 			continue;
 		if (!wholine)
@@ -1127,8 +1163,8 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			v->s = copy_line(wholine);
 		else if (!strcmp(name + wholen, "name"))
 			v->s = copy_name(wholine);
-		else if (!strcmp(name + wholen, "email"))
-			v->s = copy_email(wholine);
+		else if (starts_with(name + wholen, "email"))
+			v->s = copy_email(wholine, &used_atom[i]);
 		else if (starts_with(name + wholen, "date"))
 			grab_date(wholine, v, name);
 	}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index a83579fbdf..64fbc91146 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -125,15 +125,21 @@ test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
 test_atom head authorname 'A U Thor'
 test_atom head authoremail '<author@example.com>'
+test_atom head authoremail:trim 'author@example.com'
+test_atom head authoremail:localpart 'author'
 test_atom head authordate 'Tue Jul 4 01:18:44 2006 +0200'
 test_atom head committer 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head committername 'C O Mitter'
 test_atom head committeremail '<committer@example.com>'
+test_atom head committeremail:trim 'committer@example.com'
+test_atom head committeremail:localpart 'committer'
 test_atom head committerdate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head tag ''
 test_atom head tagger ''
 test_atom head taggername ''
 test_atom head taggeremail ''
+test_atom head taggeremail:trim ''
+test_atom head taggeremail:localpart ''
 test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
@@ -170,15 +176,21 @@ test_atom tag '*objecttype' 'commit'
 test_atom tag author ''
 test_atom tag authorname ''
 test_atom tag authoremail ''
+test_atom tag authoremail:trim ''
+test_atom tag authoremail:localpart ''
 test_atom tag authordate ''
 test_atom tag committer ''
 test_atom tag committername ''
 test_atom tag committeremail ''
+test_atom tag committeremail:trim ''
+test_atom tag committeremail:localpart ''
 test_atom tag committerdate ''
 test_atom tag tag 'testtag'
 test_atom tag tagger 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag taggername 'C O Mitter'
 test_atom tag taggeremail '<committer@example.com>'
+test_atom tag taggeremail:trim 'committer@example.com'
+test_atom tag taggeremail:localpart 'committer'
 test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
@@ -564,10 +576,14 @@ test_atom refs/tags/taggerless tag 'taggerless'
 test_atom refs/tags/taggerless tagger ''
 test_atom refs/tags/taggerless taggername ''
 test_atom refs/tags/taggerless taggeremail ''
+test_atom refs/tags/taggerless taggeremail:trim ''
+test_atom refs/tags/taggerless taggeremail:localpart ''
 test_atom refs/tags/taggerless taggerdate ''
 test_atom refs/tags/taggerless committer ''
 test_atom refs/tags/taggerless committername ''
 test_atom refs/tags/taggerless committeremail ''
+test_atom refs/tags/taggerless committeremail:trim ''
+test_atom refs/tags/taggerless committeremail:localpart ''
 test_atom refs/tags/taggerless committerdate ''
 test_atom refs/tags/taggerless subject 'Broken tag'
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 2/9] ref-filter: refactor `grab_objectname()`
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 3/9] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Prepares `grab_objectname()` for more generic usage.
This change will allow us to reuse `grab_objectname()` for
the `tree` and `parent` atoms in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index e60765f156..9bf92db6df 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -918,21 +918,27 @@ int verify_ref_format(struct ref_format *format)
 	return 0;
 }
 
-static int grab_objectname(const char *name, const struct object_id *oid,
+static const char *do_grab_objectname(const char *field, const struct object_id *oid,
+				      struct used_atom *atom)
+{
+	switch (atom->u.objectname.option) {
+	case O_FULL:
+		return oid_to_hex(oid);
+	case O_LENGTH:
+		return find_unique_abbrev(oid, atom->u.objectname.length);
+	case O_SHORT:
+		return find_unique_abbrev(oid, DEFAULT_ABBREV);
+	default:
+		BUG("unknown %%(%s) option", field);
+	}
+}
+
+static int grab_objectname(const char *name, const char *field, const struct object_id *oid,
 			   struct atom_value *v, struct used_atom *atom)
 {
-	if (starts_with(name, "objectname")) {
-		if (atom->u.objectname.option == O_SHORT) {
-			v->s = xstrdup(find_unique_abbrev(oid, DEFAULT_ABBREV));
-			return 1;
-		} else if (atom->u.objectname.option == O_FULL) {
-			v->s = xstrdup(oid_to_hex(oid));
-			return 1;
-		} else if (atom->u.objectname.option == O_LENGTH) {
-			v->s = xstrdup(find_unique_abbrev(oid, atom->u.objectname.length));
-			return 1;
-		} else
-			BUG("unknown %%(objectname) option");
+	if (starts_with(name, field)) {
+		v->s = xstrdup(do_grab_objectname(field, oid, atom));
+		return 1;
 	}
 	return 0;
 }
@@ -960,7 +966,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-			grab_objectname(name, &oi->oid, v, &used_atom[i]);
+			grab_objectname(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
 
@@ -1740,7 +1746,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_objectname(name, &ref->objectname, v, atom)) {
+		} else if (!deref && grab_objectname(name, "objectname", &ref->objectname, v, atom)) {
 			continue;
 		} else if (!strcmp(name, "HEAD")) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 3/9] ref-filter: modify error messages in `grab_objectname()`
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 2/9] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 4/9] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

As we plan to use `grab_objectname()` for `tree` and `parent` atom,
it's better to parameterize the error messages in the function
`grab_objectname()` where "objectname" is hard coded.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 9bf92db6df..4f4591cad0 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -372,11 +372,11 @@ static int objectname_atom_parser(const struct ref_format *format, struct used_a
 		atom->u.objectname.option = O_LENGTH;
 		if (strtoul_ui(arg, 10, &atom->u.objectname.length) ||
 		    atom->u.objectname.length == 0)
-			return strbuf_addf_ret(err, -1, _("positive value expected objectname:short=%s"), arg);
+			return strbuf_addf_ret(err, -1, _("positive value expected '%s' in %%(%s)"), arg, atom->name);
 		if (atom->u.objectname.length < MINIMUM_ABBREV)
 			atom->u.objectname.length = MINIMUM_ABBREV;
 	} else
-		return strbuf_addf_ret(err, -1, _("unrecognized %%(objectname) argument: %s"), arg);
+		return strbuf_addf_ret(err, -1, _("unrecognized argument '%s' in %%(%s)"), arg, atom->name);
 	return 0;
 }
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 4/9] ref-filter: rename `objectname` related functions and fields
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (2 preceding siblings ...)
  2020-08-17 18:10     ` [PATCH v3 3/9] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 5/9] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

In previous commits, we prepared some `objectname` related functions
for more generic usage, so that these functions can be used for `tree`
and `parent` atom.

But the name of some functions and fields may mislead someone.
For ex: function `objectname_atom_parser()` implies that it is
for atom `objectname`.

Let's rename all such functions and fields.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4f4591cad0..066975b306 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -139,7 +139,7 @@ static struct used_atom {
 		struct {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
-		} objectname;
+		} oid;
 		struct email_option {
 			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
 		} email_option;
@@ -361,20 +361,20 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int objectname_atom_parser(const struct ref_format *format, struct used_atom *atom,
-				  const char *arg, struct strbuf *err)
+static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
-		atom->u.objectname.option = O_FULL;
+		atom->u.oid.option = O_FULL;
 	else if (!strcmp(arg, "short"))
-		atom->u.objectname.option = O_SHORT;
+		atom->u.oid.option = O_SHORT;
 	else if (skip_prefix(arg, "short=", &arg)) {
-		atom->u.objectname.option = O_LENGTH;
-		if (strtoul_ui(arg, 10, &atom->u.objectname.length) ||
-		    atom->u.objectname.length == 0)
+		atom->u.oid.option = O_LENGTH;
+		if (strtoul_ui(arg, 10, &atom->u.oid.length) ||
+		    atom->u.oid.length == 0)
 			return strbuf_addf_ret(err, -1, _("positive value expected '%s' in %%(%s)"), arg, atom->name);
-		if (atom->u.objectname.length < MINIMUM_ABBREV)
-			atom->u.objectname.length = MINIMUM_ABBREV;
+		if (atom->u.oid.length < MINIMUM_ABBREV)
+			atom->u.oid.length = MINIMUM_ABBREV;
 	} else
 		return strbuf_addf_ret(err, -1, _("unrecognized argument '%s' in %%(%s)"), arg, atom->name);
 	return 0;
@@ -495,7 +495,7 @@ static struct {
 	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
 	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
 	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, objectname_atom_parser },
+	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
 	{ "tree", SOURCE_OBJ },
 	{ "parent", SOURCE_OBJ },
@@ -918,14 +918,14 @@ int verify_ref_format(struct ref_format *format)
 	return 0;
 }
 
-static const char *do_grab_objectname(const char *field, const struct object_id *oid,
-				      struct used_atom *atom)
+static const char *do_grab_oid(const char *field, const struct object_id *oid,
+			       struct used_atom *atom)
 {
-	switch (atom->u.objectname.option) {
+	switch (atom->u.oid.option) {
 	case O_FULL:
 		return oid_to_hex(oid);
 	case O_LENGTH:
-		return find_unique_abbrev(oid, atom->u.objectname.length);
+		return find_unique_abbrev(oid, atom->u.oid.length);
 	case O_SHORT:
 		return find_unique_abbrev(oid, DEFAULT_ABBREV);
 	default:
@@ -933,11 +933,11 @@ static const char *do_grab_objectname(const char *field, const struct object_id
 	}
 }
 
-static int grab_objectname(const char *name, const char *field, const struct object_id *oid,
-			   struct atom_value *v, struct used_atom *atom)
+static int grab_oid(const char *name, const char *field, const struct object_id *oid,
+		    struct atom_value *v, struct used_atom *atom)
 {
 	if (starts_with(name, field)) {
-		v->s = xstrdup(do_grab_objectname(field, oid, atom));
+		v->s = xstrdup(do_grab_oid(field, oid, atom));
 		return 1;
 	}
 	return 0;
@@ -966,7 +966,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-			grab_objectname(name, "objectname", &oi->oid, v, &used_atom[i]);
+			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
 
@@ -1746,7 +1746,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_objectname(name, "objectname", &ref->objectname, v, atom)) {
+		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
 			continue;
 		} else if (!strcmp(name, "HEAD")) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 5/9] ref-filter: add `short` modifier to 'tree' atom
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (3 preceding siblings ...)
  2020-08-17 18:10     ` [PATCH v3 4/9] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 6/9] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Sometimes while using 'tree' atom, user might want to see abbrev hash
instead of full 40 character hash.

Just like 'objectname', it might be convenient for users to have the
`:short` and `:short=<length>` option for printing 'tree' hash.

Let's introduce `short` option to 'tree' atom.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt | 2 ++
 ref-filter.c                       | 9 ++++-----
 t/t6300-for-each-ref.sh            | 6 ++++++
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index e6ce8af612..40ebdfcc41 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -222,6 +222,8 @@ worktreepath::
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
 be used to specify the value in the header field.
+Field `tree` can also be used with modifier `:short` and
+`:short=<length>` just like `objectname`.
 
 For commit and tag objects, the special `creatordate` and `creator`
 fields will correspond to the appropriate date or name-email-date tuple
diff --git a/ref-filter.c b/ref-filter.c
index 066975b306..3449fe45d8 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -497,7 +497,7 @@ static struct {
 	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
 	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ },
+	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
 	{ "parent", SOURCE_OBJ },
 	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
 	{ "object", SOURCE_OBJ },
@@ -1005,10 +1005,9 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tree")) {
-			v->s = xstrdup(oid_to_hex(get_commit_tree_oid(commit)));
-		}
-		else if (!strcmp(name, "numparent")) {
+		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+			continue;
+		if (!strcmp(name, "numparent")) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 64fbc91146..e30bbff6d9 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -116,6 +116,9 @@ test_atom head objectname:short $(git rev-parse --short refs/heads/master)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom head tree $(git rev-parse refs/heads/master^{tree})
+test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
+test_atom head tree:short=1 $(git rev-parse --short=1 refs/heads/master^{tree})
+test_atom head tree:short=10 $(git rev-parse --short=10 refs/heads/master^{tree})
 test_atom head parent ''
 test_atom head numparent 0
 test_atom head object ''
@@ -167,6 +170,9 @@ test_atom tag objectname:short $(git rev-parse --short refs/tags/testtag)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom tag tree ''
+test_atom tag tree:short ''
+test_atom tag tree:short=1 ''
+test_atom tag tree:short=10 ''
 test_atom tag parent ''
 test_atom tag numparent ''
 test_atom tag object $(git rev-parse refs/tags/testtag^0)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 6/9] ref-filter: add `short` modifier to 'parent' atom
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (4 preceding siblings ...)
  2020-08-17 18:10     ` [PATCH v3 5/9] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 18:10     ` [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Sometimes while using 'parent' atom, user might want to see abbrev hash
instead of full 40 character hash.

Just like 'objectname', it might be convenient for users to have the
`:short` and `:short=<length>` option for printing 'parent' hash.

Let's introduce `short` option to 'parent' atom.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt | 2 +-
 ref-filter.c                       | 8 ++++----
 t/t6300-for-each-ref.sh            | 6 ++++++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 40ebdfcc41..dd09763e7d 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -222,7 +222,7 @@ worktreepath::
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
 be used to specify the value in the header field.
-Field `tree` can also be used with modifier `:short` and
+Fields `tree` and `parent` can also be used with modifier `:short` and
 `:short=<length>` just like `objectname`.
 
 For commit and tag objects, the special `creatordate` and `creator`
diff --git a/ref-filter.c b/ref-filter.c
index 3449fe45d8..c7d81088e4 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -498,7 +498,7 @@ static struct {
 	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
 	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ },
+	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
 	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
 	{ "object", SOURCE_OBJ },
 	{ "type", SOURCE_OBJ },
@@ -1011,14 +1011,14 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (!strcmp(name, "parent")) {
+		else if (starts_with(name, "parent")) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
-				struct commit *parent = parents->item;
+				struct object_id *oid = &parents->item->object.oid;
 				if (parents != commit->parents)
 					strbuf_addch(&s, ' ');
-				strbuf_addstr(&s, oid_to_hex(&parent->object.oid));
+				strbuf_addstr(&s, do_grab_oid("parent", oid, &used_atom[i]));
 			}
 			v->s = strbuf_detach(&s, NULL);
 		}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index e30bbff6d9..79d5b29387 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -120,6 +120,9 @@ test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
 test_atom head tree:short=1 $(git rev-parse --short=1 refs/heads/master^{tree})
 test_atom head tree:short=10 $(git rev-parse --short=10 refs/heads/master^{tree})
 test_atom head parent ''
+test_atom head parent:short ''
+test_atom head parent:short=1 ''
+test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
@@ -174,6 +177,9 @@ test_atom tag tree:short ''
 test_atom tag tree:short=1 ''
 test_atom tag tree:short=10 ''
 test_atom tag parent ''
+test_atom tag parent:short ''
+test_atom tag parent:short=1 ''
+test_atom tag parent:short=10 ''
 test_atom tag numparent ''
 test_atom tag object $(git rev-parse refs/tags/testtag^0)
 test_atom tag type 'commit'
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (5 preceding siblings ...)
  2020-08-17 18:10     ` [PATCH v3 6/9] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 19:29       ` Junio C Hamano
  2020-08-17 18:10     ` [PATCH v3 8/9] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
                       ` (3 subsequent siblings)
  10 siblings, 1 reply; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

The function 'format_sanitized_subject()' is responsible for
sanitized subject line in pretty.c
e.g.
the subject line
the-sanitized-subject-line

It would be a nice enhancement to `subject` atom to have the
same feature. So in the later commits, we plan to add this feature
to ref-filter.

Refactor `format_sanitized_subject()`, so it can be reused in
ref-filter.c for adding new modifier `sanitize` to "subject" atom.

Currently, the loop inside `format_sanitized_subject()` runs
until `\n` is found. But now, we stored the first occurrence
of `\n` in a variable `eol` and passed it in
`format_sanitized_subject()`. And the loop runs upto `eol`.

But this change isn't sufficient to reuse this function in
ref-filter.c because there exist tags with multiline subject.

It's wise to replace `\n` with ' ', if `format_sanitized_subject()`
encounters `\n` before end of subject line, just like `copy_subject()`.
Because we'll be only using `format_sanitized_subject()` for
"%(subject:sanitize)", instead of `copy_subject()` and
`format_sanitized_subject()` both. So, added the code:
```
if (char == '\n') /* never true if called inside pretty.c */
    char = ' ';
```

Now, it's ready to be reused in ref-filter.c

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 pretty.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/pretty.c b/pretty.c
index 2a3d46bf42..8d08e8278a 100644
--- a/pretty.c
+++ b/pretty.c
@@ -839,24 +839,29 @@ static int istitlechar(char c)
 		(c >= '0' && c <= '9') || c == '.' || c == '_';
 }
 
-static void format_sanitized_subject(struct strbuf *sb, const char *msg)
+static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
 {
+	char *r = xmemdupz(msg, len);
 	size_t trimlen;
 	size_t start_len = sb->len;
 	int space = 2;
+	int i;
 
-	for (; *msg && *msg != '\n'; msg++) {
-		if (istitlechar(*msg)) {
+	for (i = 0; i < len; i++) {
+		if (r[i] == '\n')
+			r[i] = ' ';
+		if (istitlechar(r[i])) {
 			if (space == 1)
 				strbuf_addch(sb, '-');
 			space = 0;
-			strbuf_addch(sb, *msg);
-			if (*msg == '.')
-				while (*(msg+1) == '.')
-					msg++;
+			strbuf_addch(sb, r[i]);
+			if (r[i] == '.')
+				while (r[i+1] == '.')
+					i++;
 		} else
 			space |= 1;
 	}
+	free(r);
 
 	/* trim any trailing '.' or '-' characters */
 	trimlen = 0;
@@ -1155,7 +1160,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 	const struct commit *commit = c->commit;
 	const char *msg = c->message;
 	struct commit_list *p;
-	const char *arg;
+	const char *arg, *eol;
 	size_t res;
 	char **slot;
 
@@ -1405,7 +1410,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		format_subject(sb, msg + c->subject_off, " ");
 		return 1;
 	case 'f':	/* sanitized subject */
-		format_sanitized_subject(sb, msg + c->subject_off);
+		eol = strchrnul(msg + c->subject_off, '\n');
+		format_sanitized_subject(sb, msg + c->subject_off, eol - (msg + c->subject_off));
 		return 1;
 	case 'b':	/* body */
 		strbuf_addstr(sb, msg + c->body_off);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 8/9] format-support: move `format_sanitized_subject()` from pretty
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (6 preceding siblings ...)
  2020-08-17 18:10     ` [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 19:33       ` Junio C Hamano
  2020-08-17 18:10     ` [PATCH v3 9/9] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
                       ` (2 subsequent siblings)
  10 siblings, 1 reply; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

In hope of some new features in `subject` atom, move funtion
`format_sanitized_subject()` and all the function it uses
to new file format-support.{c,h}.

Consider this new file as a common interface between functions that
pretty.c and ref-filter.c shares.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Makefile         |  1 +
 format-support.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 format-support.h |  6 ++++++
 pretty.c         | 40 +---------------------------------------
 4 files changed, 51 insertions(+), 39 deletions(-)
 create mode 100644 format-support.c
 create mode 100644 format-support.h

diff --git a/Makefile b/Makefile
index 65f8cfb236..4ead4a256c 100644
--- a/Makefile
+++ b/Makefile
@@ -881,6 +881,7 @@ LIB_OBJS += exec-cmd.o
 LIB_OBJS += fetch-negotiator.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fmt-merge-msg.o
+LIB_OBJS += format-support.o
 LIB_OBJS += fsck.o
 LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
diff --git a/format-support.c b/format-support.c
new file mode 100644
index 0000000000..d693aa1744
--- /dev/null
+++ b/format-support.c
@@ -0,0 +1,43 @@
+#include "diff.h"
+#include "log-tree.h"
+#include "color.h"
+#include "format-support.h"
+
+static int istitlechar(char c)
+{
+	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
+		(c >= '0' && c <= '9') || c == '.' || c == '_';
+}
+
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
+{
+	char *r = xmemdupz(msg, len);
+	size_t trimlen;
+	size_t start_len = sb->len;
+	int space = 2;
+	int i;
+
+	for (i = 0; i < len; i++) {
+		if (r[i] == '\n')
+			r[i] = ' ';
+		if (istitlechar(r[i])) {
+			if (space == 1)
+				strbuf_addch(sb, '-');
+			space = 0;
+			strbuf_addch(sb, r[i]);
+			if (r[i] == '.')
+				while (r[i+1] == '.')
+					i++;
+		} else
+			space |= 1;
+	}
+	free(r);
+
+	/* trim any trailing '.' or '-' characters */
+	trimlen = 0;
+	while (sb->len - trimlen > start_len &&
+		(sb->buf[sb->len - 1 - trimlen] == '.'
+		|| sb->buf[sb->len - 1 - trimlen] == '-'))
+		trimlen++;
+	strbuf_remove(sb, sb->len - trimlen, trimlen);
+}
diff --git a/format-support.h b/format-support.h
new file mode 100644
index 0000000000..c344ccbc33
--- /dev/null
+++ b/format-support.h
@@ -0,0 +1,6 @@
+#ifndef FORMAT_SUPPORT_H
+#define FORMAT_SUPPORT_H
+
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len);
+
+#endif /* FORMAT_SUPPORT_H */
diff --git a/pretty.c b/pretty.c
index 8d08e8278a..2de01b7115 100644
--- a/pretty.c
+++ b/pretty.c
@@ -12,6 +12,7 @@
 #include "reflog-walk.h"
 #include "gpg-interface.h"
 #include "trailer.h"
+#include "format-support.h"
 
 static char *user_format;
 static struct cmt_fmt_map {
@@ -833,45 +834,6 @@ static void parse_commit_header(struct format_commit_context *context)
 	context->commit_header_parsed = 1;
 }
 
-static int istitlechar(char c)
-{
-	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
-		(c >= '0' && c <= '9') || c == '.' || c == '_';
-}
-
-static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
-{
-	char *r = xmemdupz(msg, len);
-	size_t trimlen;
-	size_t start_len = sb->len;
-	int space = 2;
-	int i;
-
-	for (i = 0; i < len; i++) {
-		if (r[i] == '\n')
-			r[i] = ' ';
-		if (istitlechar(r[i])) {
-			if (space == 1)
-				strbuf_addch(sb, '-');
-			space = 0;
-			strbuf_addch(sb, r[i]);
-			if (r[i] == '.')
-				while (r[i+1] == '.')
-					i++;
-		} else
-			space |= 1;
-	}
-	free(r);
-
-	/* trim any trailing '.' or '-' characters */
-	trimlen = 0;
-	while (sb->len - trimlen > start_len &&
-		(sb->buf[sb->len - 1 - trimlen] == '.'
-		|| sb->buf[sb->len - 1 - trimlen] == '-'))
-		trimlen++;
-	strbuf_remove(sb, sb->len - trimlen, trimlen);
-}
-
 const char *format_subject(struct strbuf *sb, const char *msg,
 			   const char *line_separator)
 {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 9/9] ref-filter: add `sanitize` option for 'subject' atom
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (7 preceding siblings ...)
  2020-08-17 18:10     ` [PATCH v3 8/9] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
@ 2020-08-17 18:10     ` Hariom Verma via GitGitGadget
  2020-08-17 19:37     ` [PATCH v3 0/9] [Resend][GSoC] Improvements to ref-filter Junio C Hamano
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
  10 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-17 18:10 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, subject does not take any arguments. This commit introduce
`sanitize` formatting option to 'subject' atom.

`subject:sanitize` - print sanitized subject line, suitable for a filename.

e.g.
%(subject): "the subject line"
%(subject:sanitize): "the-subject-line"

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt |  3 +++
 ref-filter.c                       | 24 ++++++++++++++++--------
 t/t6300-for-each-ref.sh            |  7 +++++++
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index dd09763e7d..616ce46087 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -247,6 +247,9 @@ contents:subject::
 	The first paragraph of the message, which typically is a
 	single line, is taken as the "subject" of the commit or the
 	tag message.
+	Instead of `contents:subject`, field `subject` can also be used to
+	obtain same results. `:sanitize` can be appended to `subject` for
+	subject line suitable for filename.
 
 contents:body::
 	The remainder of the commit or the tag message that follows
diff --git a/ref-filter.c b/ref-filter.c
index c7d81088e4..7f03444767 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -23,6 +23,7 @@
 #include "worktree.h"
 #include "hashmap.h"
 #include "strvec.h"
+#include "format-support.h"
 
 static struct ref_msg {
 	const char *gone;
@@ -127,8 +128,8 @@ static struct used_atom {
 			unsigned int nobracket : 1, push : 1, push_remote : 1;
 		} remote_ref;
 		struct {
-			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH,
-			       C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
+			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH, C_LINES,
+			       C_SIG, C_SUB, C_SUB_SANITIZE, C_TRAILERS } option;
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
@@ -301,9 +302,12 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
-	if (arg)
-		return strbuf_addf_ret(err, -1, _("%%(subject) does not take arguments"));
-	atom->u.contents.option = C_SUB;
+	if (!arg)
+		atom->u.contents.option = C_SUB;
+	else if (!strcmp(arg, "sanitize"))
+		atom->u.contents.option = C_SUB_SANITIZE;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(subject) argument: %s"), arg);
 	return 0;
 }
 
@@ -1282,8 +1286,8 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "subject") &&
-		    strcmp(name, "body") &&
+		if (strcmp(name, "body") &&
+		    !starts_with(name, "subject") &&
 		    !starts_with(name, "trailers") &&
 		    !starts_with(name, "contents"))
 			continue;
@@ -1295,7 +1299,11 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 
 		if (atom->u.contents.option == C_SUB)
 			v->s = copy_subject(subpos, sublen);
-		else if (atom->u.contents.option == C_BODY_DEP)
+		else if (atom->u.contents.option == C_SUB_SANITIZE) {
+			struct strbuf sb = STRBUF_INIT;
+			format_sanitized_subject(&sb, subpos, sublen);
+			v->s = strbuf_detach(&sb, NULL);
+		} else if (atom->u.contents.option == C_BODY_DEP)
 			v->s = xmemdupz(bodypos, bodylen);
 		else if (atom->u.contents.option == C_LENGTH)
 			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)strlen(subpos));
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 79d5b29387..220ff5c3c2 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -150,6 +150,7 @@ test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head subject 'Initial'
+test_atom head subject:sanitize 'Initial'
 test_atom head contents:subject 'Initial'
 test_atom head body ''
 test_atom head contents:body ''
@@ -207,6 +208,7 @@ test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag subject 'Tagging at 1151968727'
+test_atom tag subject:sanitize 'Tagging-at-1151968727'
 test_atom tag contents:subject 'Tagging at 1151968727'
 test_atom tag body ''
 test_atom tag contents:body ''
@@ -619,6 +621,7 @@ test_expect_success 'create tag with subject and body content' '
 	git tag -F msg subject-body
 '
 test_atom refs/tags/subject-body subject 'the subject line'
+test_atom refs/tags/subject-body subject:sanitize 'the-subject-line'
 test_atom refs/tags/subject-body body 'first body line
 second body line
 '
@@ -639,6 +642,7 @@ test_expect_success 'create tag with multiline subject' '
 	git tag -F msg multiline
 '
 test_atom refs/tags/multiline subject 'first subject line second subject line'
+test_atom refs/tags/multiline subject:sanitize 'first-subject-line-second-subject-line'
 test_atom refs/tags/multiline contents:subject 'first subject line second subject line'
 test_atom refs/tags/multiline body 'first body line
 second body line
@@ -671,6 +675,7 @@ sig='-----BEGIN PGP SIGNATURE-----
 
 PREREQ=GPG
 test_atom refs/tags/signed-empty subject ''
+test_atom refs/tags/signed-empty subject:sanitize ''
 test_atom refs/tags/signed-empty contents:subject ''
 test_atom refs/tags/signed-empty body "$sig"
 test_atom refs/tags/signed-empty contents:body ''
@@ -678,6 +683,7 @@ test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
 test_atom refs/tags/signed-short subject 'subject line'
+test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
 test_atom refs/tags/signed-short body "$sig"
 test_atom refs/tags/signed-short contents:body ''
@@ -686,6 +692,7 @@ test_atom refs/tags/signed-short contents "subject line
 $sig"
 
 test_atom refs/tags/signed-long subject 'subject line'
+test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
 test_atom refs/tags/signed-long body "body contents
 $sig"
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-17 18:10     ` [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
@ 2020-08-17 19:29       ` Junio C Hamano
  2020-08-19 13:36         ` Hariom verma
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2020-08-17 19:29 UTC (permalink / raw)
  To: Hariom Verma via GitGitGadget; +Cc: git, Hariom Verma

"Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -static void format_sanitized_subject(struct strbuf *sb, const char *msg)
> +static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
>  {
> +	char *r = xmemdupz(msg, len);
>  	size_t trimlen;
>  	size_t start_len = sb->len;
>  	int space = 2;
> +	int i;
>  
> -	for (; *msg && *msg != '\n'; msg++) {
> -		if (istitlechar(*msg)) {
> +	for (i = 0; i < len; i++) {
> +		if (r[i] == '\n')
> +			r[i] = ' ';

Copying the whole string only for this one looks very wasteful.
Can't you do

	for (i = 0; i < len; i++) {
		char r = msg[i];
		if (isspace(r))
			r = ' ';
		if (istitlechar(r)) {
			...
	}

or something like that instead?  

> +		if (istitlechar(r[i])) {
>  			if (space == 1)
>  				strbuf_addch(sb, '-');
>  			space = 0;
> -			strbuf_addch(sb, *msg);
> -			if (*msg == '.')
> -				while (*(msg+1) == '.')
> -					msg++;
> +			strbuf_addch(sb, r[i]);
> +			if (r[i] == '.')
> +				while (r[i+1] == '.')
> +					i++;
>  		} else
>  			space |= 1;
>  	}
> +	free(r);

Also, because neither LF or SP is a titlechar(), wouldn't the "if
r[i] is LF, replace it with SP" a no-op wrt what will be in sb at
the end?


>  	case 'f':	/* sanitized subject */
> -		format_sanitized_subject(sb, msg + c->subject_off);
> +		eol = strchrnul(msg + c->subject_off, '\n');
> +		format_sanitized_subject(sb, msg + c->subject_off, eol - (msg + c->subject_off));

This original caller expected the helper to stop reading at the end
of the first line, but the updated helper needs to be told where to
stop, so we do so with some extra computation.  Makes sense.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 8/9] format-support: move `format_sanitized_subject()` from pretty
  2020-08-17 18:10     ` [PATCH v3 8/9] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
@ 2020-08-17 19:33       ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2020-08-17 19:33 UTC (permalink / raw)
  To: Hariom Verma via GitGitGadget; +Cc: git, Hariom Verma

"Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Hariom Verma <hariom18599@gmail.com>
>
> In hope of some new features in `subject` atom, move funtion
> `format_sanitized_subject()` and all the function it uses
> to new file format-support.{c,h}.
>
> Consider this new file as a common interface between functions that
> pretty.c and ref-filter.c shares.

Sorry, I do not see a point.  Let's not do this and keep or add them
in pretty.[ch] if you are making some of the static ones public.

Thanks.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 0/9] [Resend][GSoC] Improvements to ref-filter
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (8 preceding siblings ...)
  2020-08-17 18:10     ` [PATCH v3 9/9] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
@ 2020-08-17 19:37     ` Junio C Hamano
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
  10 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2020-08-17 19:37 UTC (permalink / raw)
  To: Hariom Verma via GitGitGadget; +Cc: git, Hariom Verma

"Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This is the first patch series that introduces some improvements and
> features to file ref-filter.{c,h}. These changes are useful to ref-filter,
> but in near future also will allow us to use ref-filter's logic in pretty.c

Overall the series was a pleasant read, except for some places I
found small questionable things.

Thanks, will queue.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-17 19:29       ` Junio C Hamano
@ 2020-08-19 13:36         ` Hariom verma
  2020-08-19 16:01           ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Hariom verma @ 2020-08-19 13:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Hi Junio,

On Tue, Aug 18, 2020 at 12:59 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > -static void format_sanitized_subject(struct strbuf *sb, const char *msg)
> > +static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
> >  {
> > +     char *r = xmemdupz(msg, len);
> >       size_t trimlen;
> >       size_t start_len = sb->len;
> >       int space = 2;
> > +     int i;
> >
> > -     for (; *msg && *msg != '\n'; msg++) {
> > -             if (istitlechar(*msg)) {
> > +     for (i = 0; i < len; i++) {
> > +             if (r[i] == '\n')
> > +                     r[i] = ' ';
>
> Copying the whole string only for this one looks very wasteful.
> Can't you do
>
>         for (i = 0; i < len; i++) {
>                 char r = msg[i];
>                 if (isspace(r))
>                         r = ' ';
>                 if (istitlechar(r)) {
>                         ...
>         }
>
> or something like that instead?

Ok, that sounds better. Noted for the next version.

> > +             if (istitlechar(r[i])) {
> >                       if (space == 1)
> >                               strbuf_addch(sb, '-');
> >                       space = 0;
> > -                     strbuf_addch(sb, *msg);
> > -                     if (*msg == '.')
> > -                             while (*(msg+1) == '.')
> > -                                     msg++;
> > +                     strbuf_addch(sb, r[i]);
> > +                     if (r[i] == '.')
> > +                             while (r[i+1] == '.')
> > +                                     i++;
> >               } else
> >                       space |= 1;
> >       }
> > +     free(r);
>
> Also, because neither LF or SP is a titlechar(), wouldn't the "if
> r[i] is LF, replace it with SP" a no-op wrt what will be in sb at
> the end?

Maybe its better to directly replace LF with hyphen? [Instead of first
replacing LF with SP and then replacing SP with '-'.]

> >       case 'f':       /* sanitized subject */
> > -             format_sanitized_subject(sb, msg + c->subject_off);
> > +             eol = strchrnul(msg + c->subject_off, '\n');
> > +             format_sanitized_subject(sb, msg + c->subject_off, eol - (msg + c->subject_off));
>
> This original caller expected the helper to stop reading at the end
> of the first line, but the updated helper needs to be told where to
> stop, so we do so with some extra computation.  Makes sense.

Yeah.

Thanks,
Hariom

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-19 13:36         ` Hariom verma
@ 2020-08-19 16:01           ` Junio C Hamano
  2020-08-19 16:08             ` Junio C Hamano
  2020-08-20 17:27             ` Hariom verma
  0 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2020-08-19 16:01 UTC (permalink / raw)
  To: Hariom verma
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Hariom verma <hariom18599@gmail.com> writes:

>> Also, because neither LF or SP is a titlechar(), wouldn't the "if
>> r[i] is LF, replace it with SP" a no-op wrt what will be in sb at
>> the end?
>
> Maybe its better to directly replace LF with hyphen? [Instead of first
> replacing LF with SP and then replacing SP with '-'.]

Why do you think LF is so special?

Everything other than titlechar() including HT, '#', '*', SP is
treated in the same way as the body of that loop.  It does not
directly contribute to the final contents of sb, but just leaves
the marker in the variable "space" the fact that when adding the
next titlechar() to the resulting sb, we need a SP to wordbreak.

LF now happens to be in the set due to the way you extended the
function (it wasn't fed to this function by its sole caller), but
other than that, it is no more special than HT, SP or '*'.  And they
are not replaced with SP or replaced with '-'.

So it would be the most sensible to just drop 'if LF, replace it
with SP before doing anything else' you added.  The existing 'if
titlechar, add it to sb but if we saw non-title, add a SP before
doing so to wordbreak, and if not titlechar, just remember the fact
that we saw one' should work fine as-is without special casing LF at
all.

Or am I missing something subtle?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-19 16:01           ` Junio C Hamano
@ 2020-08-19 16:08             ` Junio C Hamano
  2020-08-20 17:33               ` Hariom verma
  2020-08-20 17:27             ` Hariom verma
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2020-08-19 16:08 UTC (permalink / raw)
  To: Hariom verma
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Junio C Hamano <gitster@pobox.com> writes:

> Hariom verma <hariom18599@gmail.com> writes:
>
>>> Also, because neither LF or SP is a titlechar(), wouldn't the "if
>>> r[i] is LF, replace it with SP" a no-op wrt what will be in sb at
>>> the end?
>>
>> Maybe its better to directly replace LF with hyphen? [Instead of first
>> replacing LF with SP and then replacing SP with '-'.]
>
> Why do you think LF is so special?
>
> Everything other than titlechar() including HT, '#', '*', SP is
> treated in the same way as the body of that loop.  It does not
> directly contribute to the final contents of sb, but just leaves
> the marker in the variable "space" the fact that when adding the
> next titlechar() to the resulting sb, we need a SP to wordbreak.

I was undecided between mentioning and not mentioning the variable
name "space" here.  On one hand, one _could_ argue that "space" is
used to remember we saw "space and the like" and if it were named
"seen_non_title_char", then such a confusion to treat LF so
specially might not have occurred.  But on the other hand, "space"
is what the variable exactly keeps track of; it is just the need for
space on the output side, i.e. we remember that "space needed before
the next output" with that variable.

I am inclined not to suggest renaming "space" at all, but it won't
be the end of the world if it were renamed to "need_space" (before
the next output), or "seen_nontitle".  If we were to actually
rename, I have moderately strong preference to the "need_space" over
"seen_nontitle", as it won't have to be renamed again when the logic
to require a space before the next output has to be updated to
include cases other than just "we saw a nontitle character".

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-19 16:01           ` Junio C Hamano
  2020-08-19 16:08             ` Junio C Hamano
@ 2020-08-20 17:27             ` Hariom verma
  1 sibling, 0 replies; 51+ messages in thread
From: Hariom verma @ 2020-08-20 17:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Hi,

On Wed, Aug 19, 2020 at 9:31 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Hariom verma <hariom18599@gmail.com> writes:
>
> >> Also, because neither LF or SP is a titlechar(), wouldn't the "if
> >> r[i] is LF, replace it with SP" a no-op wrt what will be in sb at
> >> the end?
> >
> > Maybe its better to directly replace LF with hyphen? [Instead of first
> > replacing LF with SP and then replacing SP with '-'.]
>
> Why do you think LF is so special?
>
> Everything other than titlechar() including HT, '#', '*', SP is
> treated in the same way as the body of that loop.  It does not
> directly contribute to the final contents of sb, but just leaves
> the marker in the variable "space" the fact that when adding the
> next titlechar() to the resulting sb, we need a SP to wordbreak.
>
> LF now happens to be in the set due to the way you extended the
> function (it wasn't fed to this function by its sole caller), but
> other than that, it is no more special than HT, SP or '*'.  And they
> are not replaced with SP or replaced with '-'.
>
> So it would be the most sensible to just drop 'if LF, replace it
> with SP before doing anything else' you added.  The existing 'if
> titlechar, add it to sb but if we saw non-title, add a SP before
> doing so to wordbreak, and if not titlechar, just remember the fact
> that we saw one' should work fine as-is without special casing LF at
> all.
>
> Or am I missing something subtle?

You actually got it all right. Thanks for the insight.

Now, I got it. There is no need to give special attention to LF. I
missed to see that `titlechar()` is already taking care of everything.

Thanks,
Hariom

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()`
  2020-08-19 16:08             ` Junio C Hamano
@ 2020-08-20 17:33               ` Hariom verma
  0 siblings, 0 replies; 51+ messages in thread
From: Hariom verma @ 2020-08-20 17:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Hariom Verma via GitGitGadget, git, Christian Couder, Heba Waly

Hi,

On Wed, Aug 19, 2020 at 9:38 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Hariom verma <hariom18599@gmail.com> writes:
> >
> >>> Also, because neither LF or SP is a titlechar(), wouldn't the "if
> >>> r[i] is LF, replace it with SP" a no-op wrt what will be in sb at
> >>> the end?
> >>
> >> Maybe its better to directly replace LF with hyphen? [Instead of first
> >> replacing LF with SP and then replacing SP with '-'.]
> >
> > Why do you think LF is so special?
> >
> > Everything other than titlechar() including HT, '#', '*', SP is
> > treated in the same way as the body of that loop.  It does not
> > directly contribute to the final contents of sb, but just leaves
> > the marker in the variable "space" the fact that when adding the
> > next titlechar() to the resulting sb, we need a SP to wordbreak.
>
> I was undecided between mentioning and not mentioning the variable
> name "space" here.  On one hand, one _could_ argue that "space" is
> used to remember we saw "space and the like" and if it were named
> "seen_non_title_char", then such a confusion to treat LF so
> specially might not have occurred.  But on the other hand, "space"
> is what the variable exactly keeps track of; it is just the need for
> space on the output side, i.e. we remember that "space needed before
> the next output" with that variable.
>
> I am inclined not to suggest renaming "space" at all, but it won't
> be the end of the world if it were renamed to "need_space" (before
> the next output), or "seen_nontitle".  If we were to actually
> rename, I have moderately strong preference to the "need_space" over
> "seen_nontitle", as it won't have to be renamed again when the logic
> to require a space before the next output has to be updated to
> include cases other than just "we saw a nontitle character".

Yeah, if it was named "seen_non_title_char", I might not get confused.
But now as you have already explained its working pretty well, "space"
makes more sense to me.
Well, I'm okay with both "space" and "need_space".

I wonder what others have to say on this? "space" or "need_space"?

Thanks,
Hariom

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 0/8] [GSoC] Improvements to ref-filter
  2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
                       ` (9 preceding siblings ...)
  2020-08-17 19:37     ` [PATCH v3 0/9] [Resend][GSoC] Improvements to ref-filter Junio C Hamano
@ 2020-08-21 21:41     ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 1/8] ref-filter: support different email formats Hariom Verma via GitGitGadget
                         ` (7 more replies)
  10 siblings, 8 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma

This is the first patch series that introduces some improvements and
features to file ref-filter.{c,h}. These changes are useful to ref-filter,
but in near future also will allow us to use ref-filter's logic in pretty.c

I plan to add more to format-support.{c,h} in the upcoming patch series.
That will lead to more improved and feature-rich ref-filter.c

Hariom Verma (8):
  ref-filter: support different email formats
  ref-filter: refactor `grab_objectname()`
  ref-filter: modify error messages in `grab_objectname()`
  ref-filter: rename `objectname` related functions and fields
  ref-filter: add `short` modifier to 'tree' atom
  ref-filter: add `short` modifier to 'parent' atom
  pretty: refactor `format_sanitized_subject()`
  ref-filter: add `sanitize` option for 'subject' atom

 Documentation/git-for-each-ref.txt |  10 +-
 pretty.c                           |  20 ++--
 pretty.h                           |   3 +
 ref-filter.c                       | 158 +++++++++++++++++++----------
 t/t6300-for-each-ref.sh            |  35 +++++++
 5 files changed, 161 insertions(+), 65 deletions(-)


base-commit: 675a4aaf3b226c0089108221b96559e0baae5de9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-684%2Fharry-hov%2Fonly-rf6-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-684/harry-hov/only-rf6-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/684

Range-diff vs v3:

  1:  3e6fc66a46 =  1:  55618fe4c1 ref-filter: support different email formats
  2:  5268b973da =  2:  c508c96eb8 ref-filter: refactor `grab_objectname()`
  3:  4a12ff8210 =  3:  582f00ace6 ref-filter: modify error messages in `grab_objectname()`
  4:  d53ca56778 =  4:  503a1874ce ref-filter: rename `objectname` related functions and fields
  5:  fd4ed82e80 =  5:  6b97166796 ref-filter: add `short` modifier to 'tree' atom
  6:  7a039823de =  6:  5ed5ac259d ref-filter: add `short` modifier to 'parent' atom
  7:  0ad22c7cdd !  7:  6105046d96 pretty: refactor `format_sanitized_subject()`
     @@ Commit message
          of `\n` in a variable `eol` and passed it in
          `format_sanitized_subject()`. And the loop runs upto `eol`.
      
     -    But this change isn't sufficient to reuse this function in
     -    ref-filter.c because there exist tags with multiline subject.
     -
     -    It's wise to replace `\n` with ' ', if `format_sanitized_subject()`
     -    encounters `\n` before end of subject line, just like `copy_subject()`.
     -    Because we'll be only using `format_sanitized_subject()` for
     -    "%(subject:sanitize)", instead of `copy_subject()` and
     -    `format_sanitized_subject()` both. So, added the code:
     -    ```
     -    if (char == '\n') /* never true if called inside pretty.c */
     -        char = ' ';
     -    ```
     -
     -    Now, it's ready to be reused in ref-filter.c
     -
          Mentored-by: Christian Couder <chriscool@tuxfamily.org>
          Mentored-by: Heba Waly <heba.waly@gmail.com>
          Signed-off-by: Hariom Verma <hariom18599@gmail.com>
     @@ pretty.c: static int istitlechar(char c)
       }
       
      -static void format_sanitized_subject(struct strbuf *sb, const char *msg)
     -+static void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
     ++void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
       {
     -+	char *r = xmemdupz(msg, len);
       	size_t trimlen;
       	size_t start_len = sb->len;
       	int space = 2;
     @@ pretty.c: static int istitlechar(char c)
      -	for (; *msg && *msg != '\n'; msg++) {
      -		if (istitlechar(*msg)) {
      +	for (i = 0; i < len; i++) {
     -+		if (r[i] == '\n')
     -+			r[i] = ' ';
     -+		if (istitlechar(r[i])) {
     ++		if (istitlechar(msg[i])) {
       			if (space == 1)
       				strbuf_addch(sb, '-');
       			space = 0;
     @@ pretty.c: static int istitlechar(char c)
      -			if (*msg == '.')
      -				while (*(msg+1) == '.')
      -					msg++;
     -+			strbuf_addch(sb, r[i]);
     -+			if (r[i] == '.')
     -+				while (r[i+1] == '.')
     ++			strbuf_addch(sb, msg[i]);
     ++			if (msg[i] == '.')
     ++				while (msg[i+1] == '.')
      +					i++;
       		} else
       			space |= 1;
       	}
     -+	free(r);
     - 
     - 	/* trim any trailing '.' or '-' characters */
     - 	trimlen = 0;
      @@ pretty.c: static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
       	const struct commit *commit = c->commit;
       	const char *msg = c->message;
     @@ pretty.c: static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
       		return 1;
       	case 'b':	/* body */
       		strbuf_addstr(sb, msg + c->body_off);
     +
     + ## pretty.h ##
     +@@ pretty.h: const char *format_subject(struct strbuf *sb, const char *msg,
     + /* Check if "cmit_fmt" will produce an empty output. */
     + int commit_format_is_empty(enum cmit_fmt);
     + 
     ++/* Make subject of commit message suitable for filename */
     ++void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len);
     ++
     + #endif /* PRETTY_H */
  8:  7a64495f99 <  -:  ---------- format-support: move `format_sanitized_subject()` from pretty
  9:  1ab35e9251 !  8:  7cba8d7a28 ref-filter: add `sanitize` option for 'subject' atom
     @@ Documentation/git-for-each-ref.txt: contents:subject::
       	The remainder of the commit or the tag message that follows
      
       ## ref-filter.c ##
     -@@
     - #include "worktree.h"
     - #include "hashmap.h"
     - #include "strvec.h"
     -+#include "format-support.h"
     - 
     - static struct ref_msg {
     - 	const char *gone;
      @@ ref-filter.c: static struct used_atom {
       			unsigned int nobracket : 1, push : 1, push_remote : 1;
       		} remote_ref;

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 1/8] ref-filter: support different email formats
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 2/8] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, ref-filter only supports printing email with angle brackets.

Let's add support for two more email options.
- trim : for email without angle brackets.
- localpart : for the part before the @ sign out of trimmed email

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt |  5 ++-
 ref-filter.c                       | 54 +++++++++++++++++++++++++-----
 t/t6300-for-each-ref.sh            | 16 +++++++++
 3 files changed, 65 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ea71c5f6c..e6ce8af612 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -230,7 +230,10 @@ These are intended for working on a mix of annotated and lightweight tags.
 
 Fields that have name-email-date tuple as its value (`author`,
 `committer`, and `tagger`) can be suffixed with `name`, `email`,
-and `date` to extract the named component.
+and `date` to extract the named component.  For email fields (`authoremail`,
+`committeremail` and `taggeremail`), `:trim` can be appended to get the email
+without angle brackets, and `:localpart` to get the part before the `@` symbol
+out of the trimmed email.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/ref-filter.c b/ref-filter.c
index ba85869755..e60765f156 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -140,6 +140,9 @@ static struct used_atom {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
 		} objectname;
+		struct email_option {
+			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
+		} email_option;
 		struct refname_atom refname;
 		char *head;
 	} u;
@@ -377,6 +380,20 @@ static int objectname_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
+static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				    const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.email_option.option = EO_RAW;
+	else if (!strcmp(arg, "trim"))
+		atom->u.email_option.option = EO_TRIM;
+	else if (!strcmp(arg, "localpart"))
+		atom->u.email_option.option = EO_LOCALPART;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized email option: %s"), arg);
+	return 0;
+}
+
 static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
@@ -488,15 +505,15 @@ static struct {
 	{ "tag", SOURCE_OBJ },
 	{ "author", SOURCE_OBJ },
 	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ },
+	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "authordate", SOURCE_OBJ, FIELD_TIME },
 	{ "committer", SOURCE_OBJ },
 	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ },
+	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
 	{ "tagger", SOURCE_OBJ },
 	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ },
+	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
 	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
 	{ "creator", SOURCE_OBJ },
 	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
@@ -1037,16 +1054,35 @@ static const char *copy_name(const char *buf)
 	return xstrdup("");
 }
 
-static const char *copy_email(const char *buf)
+static const char *copy_email(const char *buf, struct used_atom *atom)
 {
 	const char *email = strchr(buf, '<');
 	const char *eoemail;
 	if (!email)
 		return xstrdup("");
-	eoemail = strchr(email, '>');
+	switch (atom->u.email_option.option) {
+	case EO_RAW:
+		eoemail = strchr(email, '>');
+		if (eoemail)
+			eoemail++;
+		break;
+	case EO_TRIM:
+		email++;
+		eoemail = strchr(email, '>');
+		break;
+	case EO_LOCALPART:
+		email++;
+		eoemail = strchr(email, '@');
+		if (!eoemail)
+			eoemail = strchr(email, '>');
+		break;
+	default:
+		BUG("unknown email option");
+	}
+
 	if (!eoemail)
 		return xstrdup("");
-	return xmemdupz(email, eoemail + 1 - email);
+	return xmemdupz(email, eoemail - email);
 }
 
 static char *copy_subject(const char *buf, unsigned long len)
@@ -1116,7 +1152,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			continue;
 		if (name[wholen] != 0 &&
 		    strcmp(name + wholen, "name") &&
-		    strcmp(name + wholen, "email") &&
+		    !starts_with(name + wholen, "email") &&
 		    !starts_with(name + wholen, "date"))
 			continue;
 		if (!wholine)
@@ -1127,8 +1163,8 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 			v->s = copy_line(wholine);
 		else if (!strcmp(name + wholen, "name"))
 			v->s = copy_name(wholine);
-		else if (!strcmp(name + wholen, "email"))
-			v->s = copy_email(wholine);
+		else if (starts_with(name + wholen, "email"))
+			v->s = copy_email(wholine, &used_atom[i]);
 		else if (starts_with(name + wholen, "date"))
 			grab_date(wholine, v, name);
 	}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index a83579fbdf..64fbc91146 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -125,15 +125,21 @@ test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
 test_atom head authorname 'A U Thor'
 test_atom head authoremail '<author@example.com>'
+test_atom head authoremail:trim 'author@example.com'
+test_atom head authoremail:localpart 'author'
 test_atom head authordate 'Tue Jul 4 01:18:44 2006 +0200'
 test_atom head committer 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head committername 'C O Mitter'
 test_atom head committeremail '<committer@example.com>'
+test_atom head committeremail:trim 'committer@example.com'
+test_atom head committeremail:localpart 'committer'
 test_atom head committerdate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head tag ''
 test_atom head tagger ''
 test_atom head taggername ''
 test_atom head taggeremail ''
+test_atom head taggeremail:trim ''
+test_atom head taggeremail:localpart ''
 test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
@@ -170,15 +176,21 @@ test_atom tag '*objecttype' 'commit'
 test_atom tag author ''
 test_atom tag authorname ''
 test_atom tag authoremail ''
+test_atom tag authoremail:trim ''
+test_atom tag authoremail:localpart ''
 test_atom tag authordate ''
 test_atom tag committer ''
 test_atom tag committername ''
 test_atom tag committeremail ''
+test_atom tag committeremail:trim ''
+test_atom tag committeremail:localpart ''
 test_atom tag committerdate ''
 test_atom tag tag 'testtag'
 test_atom tag tagger 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag taggername 'C O Mitter'
 test_atom tag taggeremail '<committer@example.com>'
+test_atom tag taggeremail:trim 'committer@example.com'
+test_atom tag taggeremail:localpart 'committer'
 test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
@@ -564,10 +576,14 @@ test_atom refs/tags/taggerless tag 'taggerless'
 test_atom refs/tags/taggerless tagger ''
 test_atom refs/tags/taggerless taggername ''
 test_atom refs/tags/taggerless taggeremail ''
+test_atom refs/tags/taggerless taggeremail:trim ''
+test_atom refs/tags/taggerless taggeremail:localpart ''
 test_atom refs/tags/taggerless taggerdate ''
 test_atom refs/tags/taggerless committer ''
 test_atom refs/tags/taggerless committername ''
 test_atom refs/tags/taggerless committeremail ''
+test_atom refs/tags/taggerless committeremail:trim ''
+test_atom refs/tags/taggerless committeremail:localpart ''
 test_atom refs/tags/taggerless committerdate ''
 test_atom refs/tags/taggerless subject 'Broken tag'
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 2/8] ref-filter: refactor `grab_objectname()`
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 1/8] ref-filter: support different email formats Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 3/8] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Prepares `grab_objectname()` for more generic usage.
This change will allow us to reuse `grab_objectname()` for
the `tree` and `parent` atoms in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index e60765f156..9bf92db6df 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -918,21 +918,27 @@ int verify_ref_format(struct ref_format *format)
 	return 0;
 }
 
-static int grab_objectname(const char *name, const struct object_id *oid,
+static const char *do_grab_objectname(const char *field, const struct object_id *oid,
+				      struct used_atom *atom)
+{
+	switch (atom->u.objectname.option) {
+	case O_FULL:
+		return oid_to_hex(oid);
+	case O_LENGTH:
+		return find_unique_abbrev(oid, atom->u.objectname.length);
+	case O_SHORT:
+		return find_unique_abbrev(oid, DEFAULT_ABBREV);
+	default:
+		BUG("unknown %%(%s) option", field);
+	}
+}
+
+static int grab_objectname(const char *name, const char *field, const struct object_id *oid,
 			   struct atom_value *v, struct used_atom *atom)
 {
-	if (starts_with(name, "objectname")) {
-		if (atom->u.objectname.option == O_SHORT) {
-			v->s = xstrdup(find_unique_abbrev(oid, DEFAULT_ABBREV));
-			return 1;
-		} else if (atom->u.objectname.option == O_FULL) {
-			v->s = xstrdup(oid_to_hex(oid));
-			return 1;
-		} else if (atom->u.objectname.option == O_LENGTH) {
-			v->s = xstrdup(find_unique_abbrev(oid, atom->u.objectname.length));
-			return 1;
-		} else
-			BUG("unknown %%(objectname) option");
+	if (starts_with(name, field)) {
+		v->s = xstrdup(do_grab_objectname(field, oid, atom));
+		return 1;
 	}
 	return 0;
 }
@@ -960,7 +966,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-			grab_objectname(name, &oi->oid, v, &used_atom[i]);
+			grab_objectname(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
 
@@ -1740,7 +1746,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_objectname(name, &ref->objectname, v, atom)) {
+		} else if (!deref && grab_objectname(name, "objectname", &ref->objectname, v, atom)) {
 			continue;
 		} else if (!strcmp(name, "HEAD")) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 3/8] ref-filter: modify error messages in `grab_objectname()`
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 1/8] ref-filter: support different email formats Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 2/8] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 4/8] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
                         ` (4 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

As we plan to use `grab_objectname()` for `tree` and `parent` atom,
it's better to parameterize the error messages in the function
`grab_objectname()` where "objectname" is hard coded.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 9bf92db6df..4f4591cad0 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -372,11 +372,11 @@ static int objectname_atom_parser(const struct ref_format *format, struct used_a
 		atom->u.objectname.option = O_LENGTH;
 		if (strtoul_ui(arg, 10, &atom->u.objectname.length) ||
 		    atom->u.objectname.length == 0)
-			return strbuf_addf_ret(err, -1, _("positive value expected objectname:short=%s"), arg);
+			return strbuf_addf_ret(err, -1, _("positive value expected '%s' in %%(%s)"), arg, atom->name);
 		if (atom->u.objectname.length < MINIMUM_ABBREV)
 			atom->u.objectname.length = MINIMUM_ABBREV;
 	} else
-		return strbuf_addf_ret(err, -1, _("unrecognized %%(objectname) argument: %s"), arg);
+		return strbuf_addf_ret(err, -1, _("unrecognized argument '%s' in %%(%s)"), arg, atom->name);
 	return 0;
 }
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 4/8] ref-filter: rename `objectname` related functions and fields
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
                         ` (2 preceding siblings ...)
  2020-08-21 21:41       ` [PATCH v4 3/8] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 5/8] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

In previous commits, we prepared some `objectname` related functions
for more generic usage, so that these functions can be used for `tree`
and `parent` atom.

But the name of some functions and fields may mislead someone.
For ex: function `objectname_atom_parser()` implies that it is
for atom `objectname`.

Let's rename all such functions and fields.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 ref-filter.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4f4591cad0..066975b306 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -139,7 +139,7 @@ static struct used_atom {
 		struct {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
-		} objectname;
+		} oid;
 		struct email_option {
 			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
 		} email_option;
@@ -361,20 +361,20 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int objectname_atom_parser(const struct ref_format *format, struct used_atom *atom,
-				  const char *arg, struct strbuf *err)
+static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
-		atom->u.objectname.option = O_FULL;
+		atom->u.oid.option = O_FULL;
 	else if (!strcmp(arg, "short"))
-		atom->u.objectname.option = O_SHORT;
+		atom->u.oid.option = O_SHORT;
 	else if (skip_prefix(arg, "short=", &arg)) {
-		atom->u.objectname.option = O_LENGTH;
-		if (strtoul_ui(arg, 10, &atom->u.objectname.length) ||
-		    atom->u.objectname.length == 0)
+		atom->u.oid.option = O_LENGTH;
+		if (strtoul_ui(arg, 10, &atom->u.oid.length) ||
+		    atom->u.oid.length == 0)
 			return strbuf_addf_ret(err, -1, _("positive value expected '%s' in %%(%s)"), arg, atom->name);
-		if (atom->u.objectname.length < MINIMUM_ABBREV)
-			atom->u.objectname.length = MINIMUM_ABBREV;
+		if (atom->u.oid.length < MINIMUM_ABBREV)
+			atom->u.oid.length = MINIMUM_ABBREV;
 	} else
 		return strbuf_addf_ret(err, -1, _("unrecognized argument '%s' in %%(%s)"), arg, atom->name);
 	return 0;
@@ -495,7 +495,7 @@ static struct {
 	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
 	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
 	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, objectname_atom_parser },
+	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
 	{ "tree", SOURCE_OBJ },
 	{ "parent", SOURCE_OBJ },
@@ -918,14 +918,14 @@ int verify_ref_format(struct ref_format *format)
 	return 0;
 }
 
-static const char *do_grab_objectname(const char *field, const struct object_id *oid,
-				      struct used_atom *atom)
+static const char *do_grab_oid(const char *field, const struct object_id *oid,
+			       struct used_atom *atom)
 {
-	switch (atom->u.objectname.option) {
+	switch (atom->u.oid.option) {
 	case O_FULL:
 		return oid_to_hex(oid);
 	case O_LENGTH:
-		return find_unique_abbrev(oid, atom->u.objectname.length);
+		return find_unique_abbrev(oid, atom->u.oid.length);
 	case O_SHORT:
 		return find_unique_abbrev(oid, DEFAULT_ABBREV);
 	default:
@@ -933,11 +933,11 @@ static const char *do_grab_objectname(const char *field, const struct object_id
 	}
 }
 
-static int grab_objectname(const char *name, const char *field, const struct object_id *oid,
-			   struct atom_value *v, struct used_atom *atom)
+static int grab_oid(const char *name, const char *field, const struct object_id *oid,
+		    struct atom_value *v, struct used_atom *atom)
 {
 	if (starts_with(name, field)) {
-		v->s = xstrdup(do_grab_objectname(field, oid, atom));
+		v->s = xstrdup(do_grab_oid(field, oid, atom));
 		return 1;
 	}
 	return 0;
@@ -966,7 +966,7 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-			grab_objectname(name, "objectname", &oi->oid, v, &used_atom[i]);
+			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
 
@@ -1746,7 +1746,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_objectname(name, "objectname", &ref->objectname, v, atom)) {
+		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
 			continue;
 		} else if (!strcmp(name, "HEAD")) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 5/8] ref-filter: add `short` modifier to 'tree' atom
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
                         ` (3 preceding siblings ...)
  2020-08-21 21:41       ` [PATCH v4 4/8] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 6/8] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Sometimes while using 'tree' atom, user might want to see abbrev hash
instead of full 40 character hash.

Just like 'objectname', it might be convenient for users to have the
`:short` and `:short=<length>` option for printing 'tree' hash.

Let's introduce `short` option to 'tree' atom.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt | 2 ++
 ref-filter.c                       | 9 ++++-----
 t/t6300-for-each-ref.sh            | 6 ++++++
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index e6ce8af612..40ebdfcc41 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -222,6 +222,8 @@ worktreepath::
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
 be used to specify the value in the header field.
+Field `tree` can also be used with modifier `:short` and
+`:short=<length>` just like `objectname`.
 
 For commit and tag objects, the special `creatordate` and `creator`
 fields will correspond to the appropriate date or name-email-date tuple
diff --git a/ref-filter.c b/ref-filter.c
index 066975b306..3449fe45d8 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -497,7 +497,7 @@ static struct {
 	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
 	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ },
+	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
 	{ "parent", SOURCE_OBJ },
 	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
 	{ "object", SOURCE_OBJ },
@@ -1005,10 +1005,9 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tree")) {
-			v->s = xstrdup(oid_to_hex(get_commit_tree_oid(commit)));
-		}
-		else if (!strcmp(name, "numparent")) {
+		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+			continue;
+		if (!strcmp(name, "numparent")) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 64fbc91146..e30bbff6d9 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -116,6 +116,9 @@ test_atom head objectname:short $(git rev-parse --short refs/heads/master)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom head tree $(git rev-parse refs/heads/master^{tree})
+test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
+test_atom head tree:short=1 $(git rev-parse --short=1 refs/heads/master^{tree})
+test_atom head tree:short=10 $(git rev-parse --short=10 refs/heads/master^{tree})
 test_atom head parent ''
 test_atom head numparent 0
 test_atom head object ''
@@ -167,6 +170,9 @@ test_atom tag objectname:short $(git rev-parse --short refs/tags/testtag)
 test_atom head objectname:short=1 $(git rev-parse --short=1 refs/heads/master)
 test_atom head objectname:short=10 $(git rev-parse --short=10 refs/heads/master)
 test_atom tag tree ''
+test_atom tag tree:short ''
+test_atom tag tree:short=1 ''
+test_atom tag tree:short=10 ''
 test_atom tag parent ''
 test_atom tag numparent ''
 test_atom tag object $(git rev-parse refs/tags/testtag^0)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 6/8] ref-filter: add `short` modifier to 'parent' atom
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
                         ` (4 preceding siblings ...)
  2020-08-21 21:41       ` [PATCH v4 5/8] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 7/8] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 8/8] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Sometimes while using 'parent' atom, user might want to see abbrev hash
instead of full 40 character hash.

Just like 'objectname', it might be convenient for users to have the
`:short` and `:short=<length>` option for printing 'parent' hash.

Let's introduce `short` option to 'parent' atom.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt | 2 +-
 ref-filter.c                       | 8 ++++----
 t/t6300-for-each-ref.sh            | 6 ++++++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 40ebdfcc41..dd09763e7d 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -222,7 +222,7 @@ worktreepath::
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
 be used to specify the value in the header field.
-Field `tree` can also be used with modifier `:short` and
+Fields `tree` and `parent` can also be used with modifier `:short` and
 `:short=<length>` just like `objectname`.
 
 For commit and tag objects, the special `creatordate` and `creator`
diff --git a/ref-filter.c b/ref-filter.c
index 3449fe45d8..c7d81088e4 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -498,7 +498,7 @@ static struct {
 	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
 	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
 	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ },
+	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
 	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
 	{ "object", SOURCE_OBJ },
 	{ "type", SOURCE_OBJ },
@@ -1011,14 +1011,14 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (!strcmp(name, "parent")) {
+		else if (starts_with(name, "parent")) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
-				struct commit *parent = parents->item;
+				struct object_id *oid = &parents->item->object.oid;
 				if (parents != commit->parents)
 					strbuf_addch(&s, ' ');
-				strbuf_addstr(&s, oid_to_hex(&parent->object.oid));
+				strbuf_addstr(&s, do_grab_oid("parent", oid, &used_atom[i]));
 			}
 			v->s = strbuf_detach(&s, NULL);
 		}
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index e30bbff6d9..79d5b29387 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -120,6 +120,9 @@ test_atom head tree:short $(git rev-parse --short refs/heads/master^{tree})
 test_atom head tree:short=1 $(git rev-parse --short=1 refs/heads/master^{tree})
 test_atom head tree:short=10 $(git rev-parse --short=10 refs/heads/master^{tree})
 test_atom head parent ''
+test_atom head parent:short ''
+test_atom head parent:short=1 ''
+test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
@@ -174,6 +177,9 @@ test_atom tag tree:short ''
 test_atom tag tree:short=1 ''
 test_atom tag tree:short=10 ''
 test_atom tag parent ''
+test_atom tag parent:short ''
+test_atom tag parent:short=1 ''
+test_atom tag parent:short=10 ''
 test_atom tag numparent ''
 test_atom tag object $(git rev-parse refs/tags/testtag^0)
 test_atom tag type 'commit'
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 7/8] pretty: refactor `format_sanitized_subject()`
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
                         ` (5 preceding siblings ...)
  2020-08-21 21:41       ` [PATCH v4 6/8] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  2020-08-21 21:41       ` [PATCH v4 8/8] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

The function 'format_sanitized_subject()' is responsible for
sanitized subject line in pretty.c
e.g.
the subject line
the-sanitized-subject-line

It would be a nice enhancement to `subject` atom to have the
same feature. So in the later commits, we plan to add this feature
to ref-filter.

Refactor `format_sanitized_subject()`, so it can be reused in
ref-filter.c for adding new modifier `sanitize` to "subject" atom.

Currently, the loop inside `format_sanitized_subject()` runs
until `\n` is found. But now, we stored the first occurrence
of `\n` in a variable `eol` and passed it in
`format_sanitized_subject()`. And the loop runs upto `eol`.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 pretty.c | 20 +++++++++++---------
 pretty.h |  3 +++
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/pretty.c b/pretty.c
index 2a3d46bf42..7a7708a0ea 100644
--- a/pretty.c
+++ b/pretty.c
@@ -839,21 +839,22 @@ static int istitlechar(char c)
 		(c >= '0' && c <= '9') || c == '.' || c == '_';
 }
 
-static void format_sanitized_subject(struct strbuf *sb, const char *msg)
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len)
 {
 	size_t trimlen;
 	size_t start_len = sb->len;
 	int space = 2;
+	int i;
 
-	for (; *msg && *msg != '\n'; msg++) {
-		if (istitlechar(*msg)) {
+	for (i = 0; i < len; i++) {
+		if (istitlechar(msg[i])) {
 			if (space == 1)
 				strbuf_addch(sb, '-');
 			space = 0;
-			strbuf_addch(sb, *msg);
-			if (*msg == '.')
-				while (*(msg+1) == '.')
-					msg++;
+			strbuf_addch(sb, msg[i]);
+			if (msg[i] == '.')
+				while (msg[i+1] == '.')
+					i++;
 		} else
 			space |= 1;
 	}
@@ -1155,7 +1156,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 	const struct commit *commit = c->commit;
 	const char *msg = c->message;
 	struct commit_list *p;
-	const char *arg;
+	const char *arg, *eol;
 	size_t res;
 	char **slot;
 
@@ -1405,7 +1406,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		format_subject(sb, msg + c->subject_off, " ");
 		return 1;
 	case 'f':	/* sanitized subject */
-		format_sanitized_subject(sb, msg + c->subject_off);
+		eol = strchrnul(msg + c->subject_off, '\n');
+		format_sanitized_subject(sb, msg + c->subject_off, eol - (msg + c->subject_off));
 		return 1;
 	case 'b':	/* body */
 		strbuf_addstr(sb, msg + c->body_off);
diff --git a/pretty.h b/pretty.h
index 071f2fb8e4..7ce6c0b437 100644
--- a/pretty.h
+++ b/pretty.h
@@ -139,4 +139,7 @@ const char *format_subject(struct strbuf *sb, const char *msg,
 /* Check if "cmit_fmt" will produce an empty output. */
 int commit_format_is_empty(enum cmit_fmt);
 
+/* Make subject of commit message suitable for filename */
+void format_sanitized_subject(struct strbuf *sb, const char *msg, size_t len);
+
 #endif /* PRETTY_H */
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v4 8/8] ref-filter: add `sanitize` option for 'subject' atom
  2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
                         ` (6 preceding siblings ...)
  2020-08-21 21:41       ` [PATCH v4 7/8] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
@ 2020-08-21 21:41       ` Hariom Verma via GitGitGadget
  7 siblings, 0 replies; 51+ messages in thread
From: Hariom Verma via GitGitGadget @ 2020-08-21 21:41 UTC (permalink / raw)
  To: git; +Cc: Hariom Verma, Hariom Verma

From: Hariom Verma <hariom18599@gmail.com>

Currently, subject does not take any arguments. This commit introduce
`sanitize` formatting option to 'subject' atom.

`subject:sanitize` - print sanitized subject line, suitable for a filename.

e.g.
%(subject): "the subject line"
%(subject:sanitize): "the-subject-line"

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 Documentation/git-for-each-ref.txt |  3 +++
 ref-filter.c                       | 23 +++++++++++++++--------
 t/t6300-for-each-ref.sh            |  7 +++++++
 3 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index dd09763e7d..616ce46087 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -247,6 +247,9 @@ contents:subject::
 	The first paragraph of the message, which typically is a
 	single line, is taken as the "subject" of the commit or the
 	tag message.
+	Instead of `contents:subject`, field `subject` can also be used to
+	obtain same results. `:sanitize` can be appended to `subject` for
+	subject line suitable for filename.
 
 contents:body::
 	The remainder of the commit or the tag message that follows
diff --git a/ref-filter.c b/ref-filter.c
index c7d81088e4..12bb78ce06 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -127,8 +127,8 @@ static struct used_atom {
 			unsigned int nobracket : 1, push : 1, push_remote : 1;
 		} remote_ref;
 		struct {
-			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH,
-			       C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
+			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH, C_LINES,
+			       C_SIG, C_SUB, C_SUB_SANITIZE, C_TRAILERS } option;
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
@@ -301,9 +301,12 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
-	if (arg)
-		return strbuf_addf_ret(err, -1, _("%%(subject) does not take arguments"));
-	atom->u.contents.option = C_SUB;
+	if (!arg)
+		atom->u.contents.option = C_SUB;
+	else if (!strcmp(arg, "sanitize"))
+		atom->u.contents.option = C_SUB_SANITIZE;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(subject) argument: %s"), arg);
 	return 0;
 }
 
@@ -1282,8 +1285,8 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "subject") &&
-		    strcmp(name, "body") &&
+		if (strcmp(name, "body") &&
+		    !starts_with(name, "subject") &&
 		    !starts_with(name, "trailers") &&
 		    !starts_with(name, "contents"))
 			continue;
@@ -1295,7 +1298,11 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 
 		if (atom->u.contents.option == C_SUB)
 			v->s = copy_subject(subpos, sublen);
-		else if (atom->u.contents.option == C_BODY_DEP)
+		else if (atom->u.contents.option == C_SUB_SANITIZE) {
+			struct strbuf sb = STRBUF_INIT;
+			format_sanitized_subject(&sb, subpos, sublen);
+			v->s = strbuf_detach(&sb, NULL);
+		} else if (atom->u.contents.option == C_BODY_DEP)
 			v->s = xmemdupz(bodypos, bodylen);
 		else if (atom->u.contents.option == C_LENGTH)
 			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)strlen(subpos));
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 79d5b29387..220ff5c3c2 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -150,6 +150,7 @@ test_atom head taggerdate ''
 test_atom head creator 'C O Mitter <committer@example.com> 1151968723 +0200'
 test_atom head creatordate 'Tue Jul 4 01:18:43 2006 +0200'
 test_atom head subject 'Initial'
+test_atom head subject:sanitize 'Initial'
 test_atom head contents:subject 'Initial'
 test_atom head body ''
 test_atom head contents:body ''
@@ -207,6 +208,7 @@ test_atom tag taggerdate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag creator 'C O Mitter <committer@example.com> 1151968725 +0200'
 test_atom tag creatordate 'Tue Jul 4 01:18:45 2006 +0200'
 test_atom tag subject 'Tagging at 1151968727'
+test_atom tag subject:sanitize 'Tagging-at-1151968727'
 test_atom tag contents:subject 'Tagging at 1151968727'
 test_atom tag body ''
 test_atom tag contents:body ''
@@ -619,6 +621,7 @@ test_expect_success 'create tag with subject and body content' '
 	git tag -F msg subject-body
 '
 test_atom refs/tags/subject-body subject 'the subject line'
+test_atom refs/tags/subject-body subject:sanitize 'the-subject-line'
 test_atom refs/tags/subject-body body 'first body line
 second body line
 '
@@ -639,6 +642,7 @@ test_expect_success 'create tag with multiline subject' '
 	git tag -F msg multiline
 '
 test_atom refs/tags/multiline subject 'first subject line second subject line'
+test_atom refs/tags/multiline subject:sanitize 'first-subject-line-second-subject-line'
 test_atom refs/tags/multiline contents:subject 'first subject line second subject line'
 test_atom refs/tags/multiline body 'first body line
 second body line
@@ -671,6 +675,7 @@ sig='-----BEGIN PGP SIGNATURE-----
 
 PREREQ=GPG
 test_atom refs/tags/signed-empty subject ''
+test_atom refs/tags/signed-empty subject:sanitize ''
 test_atom refs/tags/signed-empty contents:subject ''
 test_atom refs/tags/signed-empty body "$sig"
 test_atom refs/tags/signed-empty contents:body ''
@@ -678,6 +683,7 @@ test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
 test_atom refs/tags/signed-short subject 'subject line'
+test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
 test_atom refs/tags/signed-short body "$sig"
 test_atom refs/tags/signed-short contents:body ''
@@ -686,6 +692,7 @@ test_atom refs/tags/signed-short contents "subject line
 $sig"
 
 test_atom refs/tags/signed-long subject 'subject line'
+test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
 test_atom refs/tags/signed-long body "body contents
 $sig"
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2020-08-21 21:42 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-27 20:43 [PATCH 0/5] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
2020-07-27 20:43 ` [PATCH 1/5] ref-filter: support different email formats Hariom Verma via GitGitGadget
2020-07-27 22:51   ` Junio C Hamano
2020-07-28 20:31     ` Hariom verma
2020-07-28 20:43       ` Junio C Hamano
2020-07-28 13:58   ` Đoàn Trần Công Danh
2020-07-28 16:45     ` Junio C Hamano
2020-07-27 20:43 ` [PATCH 2/5] ref-filter: add `short` option for 'tree' and 'parent' Hariom Verma via GitGitGadget
2020-07-27 23:21   ` Junio C Hamano
2020-07-27 20:43 ` [PATCH 3/5] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
2020-07-27 20:43 ` [PATCH 4/5] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
2020-07-27 20:43 ` [PATCH 5/5] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
2020-08-05 21:51 ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 2/9] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 3/9] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 4/9] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 5/9] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 6/9] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 7/9] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 8/9] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
2020-08-05 21:51   ` [PATCH v2 9/9] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
2020-08-05 22:04   ` [PATCH v2 0/9] [GSoC] Improvements to ref-filter Junio C Hamano
2020-08-06 13:47     ` Hariom verma
2020-08-17 18:10   ` [PATCH v3 0/9] [Resend][GSoC] " Hariom Verma via GitGitGadget
2020-08-17 18:10     ` [PATCH v3 1/9] ref-filter: support different email formats Hariom Verma via GitGitGadget
2020-08-17 18:10     ` [PATCH v3 2/9] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
2020-08-17 18:10     ` [PATCH v3 3/9] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
2020-08-17 18:10     ` [PATCH v3 4/9] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
2020-08-17 18:10     ` [PATCH v3 5/9] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
2020-08-17 18:10     ` [PATCH v3 6/9] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
2020-08-17 18:10     ` [PATCH v3 7/9] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
2020-08-17 19:29       ` Junio C Hamano
2020-08-19 13:36         ` Hariom verma
2020-08-19 16:01           ` Junio C Hamano
2020-08-19 16:08             ` Junio C Hamano
2020-08-20 17:33               ` Hariom verma
2020-08-20 17:27             ` Hariom verma
2020-08-17 18:10     ` [PATCH v3 8/9] format-support: move `format_sanitized_subject()` from pretty Hariom Verma via GitGitGadget
2020-08-17 19:33       ` Junio C Hamano
2020-08-17 18:10     ` [PATCH v3 9/9] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget
2020-08-17 19:37     ` [PATCH v3 0/9] [Resend][GSoC] Improvements to ref-filter Junio C Hamano
2020-08-21 21:41     ` [PATCH v4 0/8] [GSoC] " Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 1/8] ref-filter: support different email formats Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 2/8] ref-filter: refactor `grab_objectname()` Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 3/8] ref-filter: modify error messages in `grab_objectname()` Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 4/8] ref-filter: rename `objectname` related functions and fields Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 5/8] ref-filter: add `short` modifier to 'tree' atom Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 6/8] ref-filter: add `short` modifier to 'parent' atom Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 7/8] pretty: refactor `format_sanitized_subject()` Hariom Verma via GitGitGadget
2020-08-21 21:41       ` [PATCH v4 8/8] ref-filter: add `sanitize` option for 'subject' atom Hariom Verma via GitGitGadget

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).