git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	Jacob Stopak <jacob@initialcommit.io>,
	git@vger.kernel.org, martin.agren@gmail.com
Subject: Re: [RFC PATCH v2] shortlog: add group-by options for year and month
Date: Wed, 5 Oct 2022 17:43:20 -0400	[thread overview]
Message-ID: <Yz36eFeGyQ3ha1pw@nand.local> (raw)
In-Reply-To: <Yy4sIAHdvp6yRql+@coredump.intra.peff.net>

On Fri, Sep 23, 2022 at 05:58:56PM -0400, Jeff King wrote:
> On Fri, Sep 23, 2022 at 09:17:10AM -0700, Junio C Hamano wrote:
>
> > Not a suggestion to use a different implementation or add a new
> > feature on top of this --group-by-time-range idea, but I wonder if
> > it is a more flexible and generalizeable approach to say "formulate
> > this value given by the --format=<format> string, apply this regular
> > expression match, and group by the subexpression value".  E.g.
> >
> >     git shortlog \
> > 	--group-by-value="%cI" \
> > 	--group-by-regexp="^(\d{4}-\d{2})"
>
> Heh, I was about to make the exact same suggestion. The existing
> "--group=author" could really just be "--group='%an <%ae>'" (or variants
> depending on the "-e" flag).

This caught my attention, so I wanted to see how hard it would be to
implement. It actually is quite straightforward, and gets us most of the
way to being able to get the same functionality as in Jacob's patch
(minus being able to do the for-each-ref-style sub-selectors, like
`%(authordate:format=%Y-%m)`).

Here's the patch:

--- >8 ---

diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 7a1e1fe7c0..68880e8867 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -200,6 +200,29 @@ static void insert_records_from_trailers(struct shortlog *log,
 	unuse_commit_buffer(commit, commit_buffer);
 }

+static void insert_record_from_pretty(struct shortlog *log,
+				      struct strset *dups,
+				      struct commit *commit,
+				      struct pretty_print_context *ctx,
+				      const char *oneline)
+{
+	struct strbuf ident = STRBUF_INIT;
+	size_t i;
+
+	for (i = 0; i < log->pretty.nr; i++) {
+		if (i)
+			strbuf_addch(&ident, ' ');
+
+		format_commit_message(commit, log->pretty.items[i].string,
+				      &ident, ctx);
+	}
+
+	if (strset_add(dups, ident.buf))
+		insert_one_record(log, ident.buf, oneline);
+
+	strbuf_release(&ident);
+}
+
 void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 {
 	struct strbuf ident = STRBUF_INIT;
@@ -243,6 +266,8 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 	if (log->groups & SHORTLOG_GROUP_TRAILER) {
 		insert_records_from_trailers(log, &dups, commit, &ctx, oneline_str);
 	}
+	if (log->groups & SHORTLOG_GROUP_PRETTY)
+		insert_record_from_pretty(log, &dups, commit, &ctx, oneline_str);

 	strset_clear(&dups);
 	strbuf_release(&ident);
@@ -321,8 +346,10 @@ static int parse_group_option(const struct option *opt, const char *arg, int uns
 	else if (skip_prefix(arg, "trailer:", &field)) {
 		log->groups |= SHORTLOG_GROUP_TRAILER;
 		string_list_append(&log->trailers, field);
-	} else
-		return error(_("unknown group type: %s"), arg);
+	} else {
+		log->groups |= SHORTLOG_GROUP_PRETTY;
+		string_list_append(&log->pretty, arg);
+	}

 	return 0;
 }
@@ -340,6 +367,7 @@ void shortlog_init(struct shortlog *log)
 	log->in2 = DEFAULT_INDENT2;
 	log->trailers.strdup_strings = 1;
 	log->trailers.cmp = strcasecmp;
+	log->pretty.strdup_strings = 1;
 }

 int cmd_shortlog(int argc, const char **argv, const char *prefix)
diff --git a/shortlog.h b/shortlog.h
index 3f7e9aabca..d7caecb76f 100644
--- a/shortlog.h
+++ b/shortlog.h
@@ -20,8 +20,10 @@ struct shortlog {
 		SHORTLOG_GROUP_AUTHOR = (1 << 0),
 		SHORTLOG_GROUP_COMMITTER = (1 << 1),
 		SHORTLOG_GROUP_TRAILER = (1 << 2),
+		SHORTLOG_GROUP_PRETTY = (1 << 3),
 	} groups;
 	struct string_list trailers;
+	struct string_list pretty;

 	int email;
 	struct string_list mailmap;

--- 8< ---

> I don't think you even really need the regexp. If we respect --date,
> then you should be able to ask for --date=format:%Y-%m. Unfortunately
> there's no way to specify the format as part of the placeholder. The
> for-each-ref formatter understands this, like:
>
>   %(authordate:format:%Y-%m)
>
> I wouldn't be opposed to teaching the git-log formatter something
> similar.

Yeah, I think having a similar mechanism there would be useful, and
certainly a prerequisite to being able to achieve what Jacob has done
here with the more general approach.

I think you could also do some cleanup on top, like replacing the
SHORTLOG_GROUP_AUTHOR mode with adding either "%aN <%aE>" (or "%aN",
without --email) as an entry in the `pretty` string_list.

Thanks,
Taylor

  parent reply	other threads:[~2022-10-05 21:43 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-22  6:18 [RFC PATCH] shortlog: add group-by options for year and month Jacob Stopak
2022-09-22 15:46 ` Martin Ågren
2022-09-22 23:25 ` [RFC PATCH v2] " Jacob Stopak
2022-09-23 16:17   ` Junio C Hamano
2022-09-23 21:19     ` Jacob Stopak
2022-09-23 21:58     ` Jeff King
2022-09-23 22:06       ` Junio C Hamano
2022-09-24  4:38       ` Jacob Stopak
2022-10-05 22:14         ` Jeff King
2022-10-05 21:43       ` Taylor Blau [this message]
2022-10-05 22:26         ` Jeff King
2022-10-07  0:48           ` Jacob Stopak
2022-10-07 21:59             ` Taylor Blau
2022-10-11  0:59             ` Jeff King
2022-10-07 22:24           ` Taylor Blau
2022-10-11  1:00             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yz36eFeGyQ3ha1pw@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob@initialcommit.io \
    --cc=martin.agren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).