git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Hamza Mahfooz <someguy@effective-light.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v6 2/2] pretty: colorize pattern matches in commit messages
Date: Mon, 20 Sep 2021 21:24:08 -0400	[thread overview]
Message-ID: <YUk0OEXg36QXrkDm@coredump.intra.peff.net> (raw)
In-Reply-To: <20210921003050.641393-2-someguy@effective-light.com>

On Mon, Sep 20, 2021 at 08:30:50PM -0400, Hamza Mahfooz wrote:

> Teach the pretty-printer code to highlight matches from the
> "--grep=<pattern>", "--author=<pattern>" and "--committer=<pattern>"
> options (to view the last one, you may have to ask for --pretty=fuller).
> 
> Also, it must be noted that we are effectively greping the content twice,
> however it only slows down "git log --author=^H" on this repository by
> around 1-2% (compared to v2.33.0), so it should be a small enough slow
> down to justify the addition of the feature.

This might or might not be related, but one thing I noticed is that your
earlier patch causes us to grep a lot more lines than we mean to (even
if we are looking for "author" lines, it greps every header line). That
might contribute to the slowdown. Likewise, it calls strip_timestamp()
on every line, even if it does not start with "author").

> +static inline void strbuf_add_with_color(struct strbuf *sb, const char *color,
> +					 char *buf, size_t buflen)
> +{
> +	strbuf_addstr(sb, color);
> +	strbuf_add(sb, buf, buflen);
> +	if (*color)
> +		strbuf_addstr(sb, GIT_COLOR_RESET);
> +}

You could take "buf" as a "const char *" here. That doesn't matter too
much for now, but see below.

> +static void append_line_with_color(struct strbuf *sb, struct grep_opt *opt,
> +				   const char *line, size_t linelen,
> +				   int color, enum grep_context ctx,
> +				   enum grep_header_field field)
> +{
> +	char *buf, *eol;
> +	const char *line_color, *match_color;
> +	regmatch_t match;
> +	int eflags = 0;
> +
> +	if (!opt || !want_color(color) || opt->invert) {
> +		strbuf_add(sb, line, linelen);
> +		return;
> +	}
> +
> +	buf = (char *)line;
> +	eol = buf + linelen;

OK, so we got rid of the copy of "line", which is nice. But we are
casting away const-ness, which is a potential red flag (is somebody
going to modify this string, even though we promised our caller we would
not?). We'd probably want a comment to explain why we are doing so, and
why it is OK (e.g., if somebody in the call stack modifies it
temporarily).

More on this in a moment.

> +	while (grep_next_match(opt, buf, eol, ctx, &match, field, eflags)) {
> +		if (match.rm_so == match.rm_eo)
> +			break;
> +
> +		strbuf_grow(sb, strlen(line_color) + strlen(match_color) +
> +			    (2 * strlen(GIT_COLOR_RESET)));
> +		strbuf_add_with_color(sb, line_color, buf, match.rm_so);
> +		strbuf_add_with_color(sb, match_color, buf + match.rm_so,
> +				      match.rm_eo - match.rm_so);

As Eric mentioned, these strbuf_grow() calls can go away. The whole
point of strbuf is that we do not have to clutter the code with manual
size computations, because it will do the right thing automatically.

Sometimes you can get extra performance by pre-sizing the strbuf, but:

  1. I'd be surprised if we did in this case. We're writing into a
     strbuf that will receive the whole per-commit output, so any growth
     cost due to a couple of short strings here would be amortized
     anyway.

  2. The computation here doesn't represent the needed growth anyway.
     When we call strbuf_add_with_color(), it's going to add not just
     the colors but all of the data for the line itself.

So at best it's doing nothing, and at worst it is making the code harder
to understand.

> +	if (eflags) {
> +		strbuf_grow(sb, strlen(line_color) + strlen(GIT_COLOR_RESET));
> +		strbuf_add_with_color(sb, line_color, buf, eol - buf);
> +	} else
> +		strbuf_add(sb, buf, eol - buf);
> +}

Ditto here (we grow for the colors, but also end up adding "eol - buf"
bytes).

-Peff

  reply	other threads:[~2021-09-21  1:26 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21  0:30 [PATCH v6 1/2] grep: refactor next_match() and match_one_pattern() for external use Hamza Mahfooz
2021-09-21  0:30 ` [PATCH v6 2/2] pretty: colorize pattern matches in commit messages Hamza Mahfooz
2021-09-21  1:24   ` Jeff King [this message]
2021-09-21  1:39     ` Jeff King
2021-09-21  1:41       ` [PATCH 1/2] grep: stop modifying buffer in strip_timestamp Jeff King
2021-09-21  1:43       ` [PATCH 2/2] grep: mark "haystack" buffers as const Jeff King
2021-09-21  2:05         ` Jeff King
2021-09-21  2:38       ` [PATCH v6 2/2] pretty: colorize pattern matches in commit messages Hamza Mahfooz
2021-09-21  3:15         ` Jeff King
2021-09-21  1:15 ` [PATCH v6 1/2] grep: refactor next_match() and match_one_pattern() for external use Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YUk0OEXg36QXrkDm@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=someguy@effective-light.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).