git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Rene Scharfe <l.s.r@web.de>, git@vger.kernel.org
Subject: Re: [PATCH 27/34] shortlog: release strbuf after use in insert_one_record()
Date: Fri, 8 Sep 2017 00:36:34 -0400
Message-ID: <20170908043633.smytugbn7ge4twlm@sigill.intra.peff.net>
In-Reply-To: <20170908035648.jhm6ypxkwwms4bqu@sigill.intra.peff.net>

On Thu, Sep 07, 2017 at 11:56:48PM -0400, Jeff King wrote:

> > True; I do not think string_list API does.  But for this particular
> > application, I suspect that we can by looking at the util field of
> > the item returned.  A newly created one has NULL, but we always make
> > it non-NULL before leaving this function.
> 
> Yeah, I agree that would work here.
> 
> I also wondered if we could get away with avoiding the malloc entirely
> here. Especially in the "shortlog -n" case, it is identical to the name
> field we already have in ident.name. So ideally we'd do a lookup to see
> if we have the entry before allocating anything (since we do one lookup
> per commit, but only insert once per unique author).
> 
> But that doesn't quite work, because ident.name doesn't point to a
> NUL-terminated string, and string_list only handles strings.

I happened to look at this more while digging on an unrelated shortlog
bug. I think the whole thing could actually be reorganized a bit.

We call insert_one_record() from shortlog_add_commit(). The latter
formats "%an <%ae>", only to have the former parse it back to its
constituent parts. That seems rather silly.
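
To make the roundtrip concrete, here is a minimal standalone sketch (not
git's actual code). The names ident_parts, split_name_mail(), roundtrip()
and direct() are hypothetical stand-ins for illustration; only the general
shape (format "name <mail>", split it again, rebuild the key) follows what
is described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the pointer/length pairs of an ident split. */
struct ident_parts {
	const char *name;
	size_t namelen;
	const char *mail;
	size_t maillen;
};

/* Split "Name <mail>" into its parts; returns -1 on malformed input. */
static int split_name_mail(struct ident_parts *out, const char *line)
{
	const char *lt = strchr(line, '<');
	const char *gt = lt ? strchr(lt, '>') : NULL;
	const char *end;

	if (!lt || !gt)
		return -1;

	/* trim the spaces between the name and '<' */
	end = lt;
	while (end > line && end[-1] == ' ')
		end--;

	out->name = line;
	out->namelen = end - line;
	out->mail = lt + 1;
	out->maillen = gt - (lt + 1);
	return 0;
}

/* The roundabout way: format, parse it back, then format the key again. */
static int roundtrip(char *out, size_t outlen,
		     const char *name, const char *mail, int want_email)
{
	char tmp[256];
	struct ident_parts p;
	int n;

	assert(out && outlen > 0);
	n = snprintf(tmp, sizeof(tmp), "%s <%s>", name, mail);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;
	if (split_name_mail(&p, tmp) < 0)
		return -1;

	if (want_email)
		n = snprintf(out, outlen, "%.*s <%.*s>",
			     (int)p.namelen, p.name, (int)p.maillen, p.mail);
	else
		n = snprintf(out, outlen, "%.*s", (int)p.namelen, p.name);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* The direct way: format the final key once. */
static int direct(char *out, size_t outlen,
		  const char *name, const char *mail, int want_email)
{
	int n;

	assert(out && outlen > 0);
	n = want_email ? snprintf(out, outlen, "%s <%s>", name, mail)
		       : snprintf(out, outlen, "%s", name);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Both helpers produce the same key for well-formed input; the roundtrip
version just does an extra format and parse to get there.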

This is an artifact of shortlog's original mode, which was to parse "git
log" output. But for an internal traversal, we can just format the
correct item right off the bat. That part of insert_one_record() is also
where we handle the mailmap mapping. But again, the internal traversal
can just use "%aE" to format that correctly in the first place.
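
As a rough sketch of that idea, the format choice can be written as a small
pure function. The helper name pick_ident_format() is hypothetical; the
placeholders are git's mailmap-respecting pretty-format specifiers (%aN,
%aE, %cN, %cE), and the committer/email flags are assumed to correspond to
the shortlog options of the same names.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: pick a pretty-format string that already applies
 * the mailmap, so the result can be used as the final key without any
 * re-parsing.
 */
static const char *pick_ident_format(int committer, int email)
{
	if (committer)
		return email ? "%cN <%cE>" : "%cN";
	return email ? "%aN <%aE>" : "%aN";
}
```

The email part is only included in the format when it is wanted, so nothing
needs to be stripped afterwards.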

IOW, something like the patch below, which pushes the re-parsing out to
the stdin code-path, and lets the internal traversal format directly
into the final buffer. It seems to be about 3% faster than the existing
code, and fixes the leak (by dropping that variable entirely).

-Peff

---
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 43c4799ea9..e29875b843 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -52,26 +52,8 @@ static void insert_one_record(struct shortlog *log,
 			      const char *oneline)
 {
 	struct string_list_item *item;
-	const char *mailbuf, *namebuf;
-	size_t namelen, maillen;
-	struct strbuf namemailbuf = STRBUF_INIT;
-	struct ident_split ident;
 
-	if (split_ident_line(&ident, author, strlen(author)))
-		return;
-
-	namebuf = ident.name_begin;
-	mailbuf = ident.mail_begin;
-	namelen = ident.name_end - ident.name_begin;
-	maillen = ident.mail_end - ident.mail_begin;
-
-	map_user(&log->mailmap, &mailbuf, &maillen, &namebuf, &namelen);
-	strbuf_add(&namemailbuf, namebuf, namelen);
-
-	if (log->email)
-		strbuf_addf(&namemailbuf, " <%.*s>", (int)maillen, mailbuf);
-
-	item = string_list_insert(&log->list, namemailbuf.buf);
+	item = string_list_insert(&log->list, author);
 
 	if (log->summary)
 		item->util = (void *)(UTIL_TO_INT(item) + 1);
@@ -114,9 +96,33 @@ static void insert_one_record(struct shortlog *log,
 	}
 }
 
+static int parse_stdin_author(struct shortlog *log,
+			       struct strbuf *out, const char *in)
+{
+	const char *mailbuf, *namebuf;
+	size_t namelen, maillen;
+	struct ident_split ident;
+
+	if (split_ident_line(&ident, in, strlen(in)))
+		return -1;
+
+	namebuf = ident.name_begin;
+	mailbuf = ident.mail_begin;
+	namelen = ident.name_end - ident.name_begin;
+	maillen = ident.mail_end - ident.mail_begin;
+
+	map_user(&log->mailmap, &mailbuf, &maillen, &namebuf, &namelen);
+	strbuf_add(out, namebuf, namelen);
+	if (log->email)
+		strbuf_addf(out, " <%.*s>", (int)maillen, mailbuf);
+
+	return 0;
+}
+
 static void read_from_stdin(struct shortlog *log)
 {
 	struct strbuf author = STRBUF_INIT;
+	struct strbuf mapped_author = STRBUF_INIT;
 	struct strbuf oneline = STRBUF_INIT;
 	static const char *author_match[2] = { "Author: ", "author " };
 	static const char *committer_match[2] = { "Commit: ", "committer " };
@@ -134,9 +140,15 @@ static void read_from_stdin(struct shortlog *log)
 		while (strbuf_getline_lf(&oneline, stdin) != EOF &&
 		       !oneline.len)
 			; /* discard blanks */
-		insert_one_record(log, v, oneline.buf);
+
+		strbuf_reset(&mapped_author);
+		if (parse_stdin_author(log, &mapped_author, v) < 0)
+			continue;
+
+		insert_one_record(log, mapped_author.buf, oneline.buf);
 	}
 	strbuf_release(&author);
+	strbuf_release(&mapped_author);
 	strbuf_release(&oneline);
 }
 
@@ -153,7 +165,9 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 	ctx.date_mode.type = DATE_NORMAL;
 	ctx.output_encoding = get_log_output_encoding();
 
-	fmt = log->committer ? "%cn <%ce>" : "%an <%ae>";
+	fmt = log->committer ?
+		(log->email ? "%cN <%cE>" : "%cN") :
+		(log->email ? "%aN <%aE>" : "%aN");
 
 	format_commit_message(commit, fmt, &author, &ctx);
 	if (!log->summary) {
