git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jacob Stopak <jacob@initialcommit.io>
To: Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>, Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, martin.agren@gmail.com
Subject: Re: [RFC PATCH v2] shortlog: add group-by options for year and month
Date: Thu, 6 Oct 2022 17:48:06 -0700	[thread overview]
Message-ID: <Yz93RjrJ00A5QvNe.jacob@initialcommit.io> (raw)
In-Reply-To: <Yz4EsT8noIoygk9b@coredump.intra.peff.net>

On Wed, Oct 05, 2022 at 06:26:57PM -0400, Jeff King wrote:
> On Wed, Oct 05, 2022 at 05:43:20PM -0400, Taylor Blau wrote:
> 
> > This caught my attention, so I wanted to see how hard it would be to
> > implement. It actually is quite straightforward, and gets us most of the
> > way to being able to get the same functionality as in Jacob's patch
> > (minus being able to do the for-each-ref-style sub-selectors, like
> > `%(authordate:format=%Y-%m)`).

Thanks Taylor!! This looks awesome and helped me understand how the pretty
context stuff works. I was able to apply your patch locally and test,
and plan to continue working off of this :D. Like Peff mentioned seems to
be a few usage details to hammer out.

> The date thing I think can be done with --date; I just sent a sketch in
> another part of the thread.

Peff - I applied your --date sketch onto Taylor's patch and it worked first try.

> So here you're allowing multiple pretty options. But really, once we
> allow the user an arbitrary format, is there any reason for them to do:
> 
>   git shortlog --group=%an --group=%ad
> 
> versus just:
> 
>   git shortlog --group='%an %ad'
> 
> ?

Yes I can't think of an advantage of having multiple custom-formatted group
fields. Also see my note on this below related to your comment on specifying
multiple groups.

> >  void shortlog_add_commit(struct shortlog *log, struct commit *commit)
> >  {
> >  	struct strbuf ident = STRBUF_INIT;
> > @@ -243,6 +266,8 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
> >  	if (log->groups & SHORTLOG_GROUP_TRAILER) {
> >  		insert_records_from_trailers(log, &dups, commit, &ctx, oneline_str);
> >  	}
> > +	if (log->groups & SHORTLOG_GROUP_PRETTY)
> > +		insert_record_from_pretty(log, &dups, commit, &ctx, oneline_str);
> 
> I was puzzled at first that this was a bitwise check. But I forgot that
> we added support for --group options already, in 63d24fa0b0 (shortlog:
> allow multiple groups to be specified, 2020-09-27).
> 
> So a plan like:
> 
>   git shortlog --group=author --group=date
> 
> (as in the original patch in this thread) doesn't quite work, I think.

My first patch addressed this by specifically handling cases where the new
grouping options were passed in-tandem with existing options, and making sure
only a single shortlog group was generated. But if we're generalizing the custom
group format then it might be unecessary to even allow the custom group in tandem
with most other options (like 'author' and 'committer'), since those options can be
included in the custom group format. The trailer option might be an exception, but
that could possibly just be handled as a special case.

> We probably want to insist that the format contains a "%" sign, and/or
> git it a keyword like "format:". Otherwise a typo like:
> 
>   git shortlog --format=autor
> 
> stops being an error we detect, and just returns nonsense results
> (every commit has the same ident).

Small aside: I like how Taylor re-used the --group option for the custom format.
IMO it hammers home that this is a grouping option and not just formatting or
filtering which can be confusing to users sometimes when doing data analytics.

But your points here all still apply. Maybe detecting a "%" sign can be the way
that git identifies the custom format being passed in. In conjuction with the %
identifiers, this would still enable users to add in some arbitrary constant label
like --group='Year: %cd' --date='%Y', without affecting the grouped/sorted results
since all entries would then include the "Years: " prefix (or postfix or however
they decide to write it).

The one case that should probably be handled is when no % sign is used and no other
matching flags like --author or --committer are either, because currently that will
just group all commits under 1 nonsensical group name like "autor" as you mentioned.

> I think you'd want to detect SHORTLOG_GROUP_PRETTY in the
> read_from_stdin() path, too. And probably just die() with "not
> supported", like we do for trailers.

Glad you said this because I applied this in my original patch with the explicit
--year and --month groups. Didn't seem to be an obvious use-case with the stdin 
even though my original year and month values could possibly be read from the git
log output supplied as stdin. But for a generalized group format seems even more
far-fetched to try and make it jive with the stdin version of the command.

-Jack

  reply	other threads:[~2022-10-07  0:48 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-22  6:18 [RFC PATCH] shortlog: add group-by options for year and month Jacob Stopak
2022-09-22 15:46 ` Martin Ågren
2022-09-22 23:25 ` [RFC PATCH v2] " Jacob Stopak
2022-09-23 16:17   ` Junio C Hamano
2022-09-23 21:19     ` Jacob Stopak
2022-09-23 21:58     ` Jeff King
2022-09-23 22:06       ` Junio C Hamano
2022-09-24  4:38       ` Jacob Stopak
2022-10-05 22:14         ` Jeff King
2022-10-05 21:43       ` Taylor Blau
2022-10-05 22:26         ` Jeff King
2022-10-07  0:48           ` Jacob Stopak [this message]
2022-10-07 21:59             ` Taylor Blau
2022-10-11  0:59             ` Jeff King
2022-10-07 22:24           ` Taylor Blau
2022-10-11  1:00             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yz93RjrJ00A5QvNe.jacob@initialcommit.io \
    --to=jacob@initialcommit.io \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=martin.agren@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).