git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "René Scharfe" <l.s.r@web.de>, "Git List" <git@vger.kernel.org>
Subject: Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
Date: Fri, 8 Dec 2017 05:14:56 -0500
Message-ID: <20171208101455.GC1899@sigill.intra.peff.net>
In-Reply-To: <xmqq4lp2cisd.fsf@gitster.mtv.corp.google.com>

On Thu, Dec 07, 2017 at 01:47:14PM -0800, Junio C Hamano wrote:

> > diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
> > index 22034f87e7..8e8a15ea4a 100644
> > --- a/builtin/fmt-merge-msg.c
> > +++ b/builtin/fmt-merge-msg.c
> > @@ -377,7 +377,8 @@ static void shortlog(const char *name,
> >  			string_list_append(&subjects,
> >  					   oid_to_hex(&commit->object.oid));
> >  		else
> > -			string_list_append(&subjects, strbuf_detach(&sb, NULL));
> > +			string_list_append_nodup(&subjects,
> > +						 strbuf_detach(&sb, NULL));
> >  	}
> >  
> >  	if (opts->credit_people)
> 
> What is leaked comes from strbuf, so the title is not a lie, but I
> tend to think that this leak is caused by a somewhat strange
> string_list API.  The subjects string-list is initialized as a "dup"
> kind, but a caller that wants to avoid leaking can (and should) use
> _nodup() call to add a string without duping.  It all feels a bit
> too convoluted.

I'm not sure it's string-list's fault. Many callers (including this one)
have _some_ entries whose strings must be duplicated and others which do
not.

So either:

  1. The list gets marked as "nodup", and we add an extra xstrdup() to the
     oid_to_hex call above. We would also need to remember to free() the
     strings later, since the list does not own them.

or

  2. We mark it as "dup" and incur an extra allocation and copy, like:

       string_list_append(&subjects, sb.buf);
       strbuf_release(&sb);
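
To make the ownership question in the two options concrete, here is a
minimal sketch. It uses a toy list type, not Git's actual string_list;
the names toy_list, toy_append(), toy_append_nodup() and toy_clear() are
made up for illustration, and the real API differs in detail. The point
is only that a "dup" list can still take ownership of an already
allocated string through the nodup entry point, while strings from a
reused static buffer must be copied.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string list; NOT Git's string_list. */
struct toy_list {
	char **items;
	size_t nr, alloc;
	int dup_strings;	/* nonzero: the list owns its strings */
};

static char *toy_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);
	assert(p);
	memcpy(p, s, n);
	return p;
}

/* Store the pointer as-is, without copying the string. */
static void toy_append_nodup(struct toy_list *l, char *s)
{
	if (l->nr == l->alloc) {
		l->alloc = l->alloc ? 2 * l->alloc : 4;
		l->items = realloc(l->items, l->alloc * sizeof(*l->items));
		assert(l->items);
	}
	l->items[l->nr++] = s;
}

/* Copy the string first if the list is the "dup" kind, then store it. */
static void toy_append(struct toy_list *l, const char *s)
{
	toy_append_nodup(l, l->dup_strings ? toy_strdup(s) : (char *)s);
}

/*
 * Free the pointer array, plus the strings if the list owns them.
 * Returns how many strings were freed.
 */
static size_t toy_clear(struct toy_list *l)
{
	size_t freed = 0;
	if (l->dup_strings)
		for (size_t i = 0; i < l->nr; i++, freed++)
			free(l->items[i]);
	free(l->items);
	l->items = NULL;
	l->nr = l->alloc = 0;
	return freed;
}

/* A "dup" list holding one copied and one handed-over string. */
size_t demo_mixed_entries(void)
{
	struct toy_list subjects = { NULL, 0, 0, 1 };
	char *detached = toy_strdup("subject line");	/* heap-owned */

	toy_append(&subjects, "deadbeef");	 /* copied by the list */
	toy_append_nodup(&subjects, detached);	 /* ownership handed over */
	return toy_clear(&subjects);		 /* frees both strings */
}
```

Passing the heap-owned string to toy_append() instead would copy it and
leave the original pointer with nobody to free it, which is the shape of
the leak the patch fixes.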

So I'd really blame the caller, which doesn't want to do (2) out of a
sense of optimization. It could also perhaps write it as:

  while ((commit = get_revision(rev)) != NULL) {
	strbuf_reset(&sb);
	/* ... maybe put some stuff in sb ... */
	if (!sb.len)
		string_list_append(&subjects,
				   oid_to_hex(&commit->object.oid));
	else
		string_list_append(&subjects, sb.buf);
  }
  strbuf_release(&sb);

which at least avoids the extra allocations.

By the way, I think there's another quite subtle leak in this function.
We do this:

  format_commit_message(commit, "%s", &sb, &ctx);
  strbuf_ltrim(&sb);

and then only use "sb" if sb.len is non-zero. But we may have actually
allocated memory to hold our zero-length string (e.g., if we had a strbuf
full of spaces and trimmed them all off). Since we reuse "sb" over and
over as we loop, this will actually only leak once for the whole loop,
not once per iteration. So it's probably not a big deal, but writing it
with the explicit reset/release pattern fixes that (and is more
idiomatic for our code base, I think).
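
A small sketch of why a zero-length buffer can still hold an allocation,
and how the reset/release pattern covers it. This is a toy growable
buffer, not Git's strbuf; toy_buf and its helpers, as well as
count_nonblank(), are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable buffer; a simplified stand-in, NOT Git's strbuf. */
struct toy_buf {
	char *buf;	/* NULL until the first allocation */
	size_t len, alloc;
};

static void toy_buf_addstr(struct toy_buf *b, const char *s)
{
	size_t n = strlen(s);
	if (b->len + n + 1 > b->alloc) {
		b->alloc = 2 * (b->len + n + 1);
		b->buf = realloc(b->buf, b->alloc);
		assert(b->buf);
	}
	memcpy(b->buf + b->len, s, n + 1);	/* includes the NUL */
	b->len += n;
}

/* Drop leading whitespace; the allocation stays even if len becomes 0. */
static void toy_buf_ltrim(struct toy_buf *b)
{
	size_t skip = 0;
	while (skip < b->len && isspace((unsigned char)b->buf[skip]))
		skip++;
	if (!skip)
		return;
	memmove(b->buf, b->buf + skip, b->len - skip + 1);
	b->len -= skip;
}

/* Empty the buffer but keep its memory for the next iteration. */
static void toy_buf_reset(struct toy_buf *b)
{
	if (b->buf)
		b->buf[0] = '\0';
	b->len = 0;
}

/* Give the memory back, whatever the current length is. */
static void toy_buf_release(struct toy_buf *b)
{
	free(b->buf);
	b->buf = NULL;
	b->len = b->alloc = 0;
}

/* Count subjects that are non-empty after trimming, reusing one buffer. */
size_t count_nonblank(const char **subjects, size_t n)
{
	struct toy_buf sb = { NULL, 0, 0 };
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		toy_buf_reset(&sb);
		toy_buf_addstr(&sb, subjects[i]);
		toy_buf_ltrim(&sb);
		if (sb.len)	/* all-blank input: len == 0, alloc != 0 */
			used++;
	}
	toy_buf_release(&sb);	/* unconditional, so nothing is left behind */
	return used;
}
```

Because the release happens once after the loop and does not depend on
sb.len, an all-blank final entry cannot leave an allocation behind.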

-Peff

Thread overview: 16+ messages
2017-12-07 20:22 [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog() René Scharfe
2017-12-07 21:27 ` Jeff King
2017-12-08 17:29   ` René Scharfe
2017-12-08 18:44     ` Junio C Hamano
2017-12-08 20:10       ` René Scharfe
2017-12-08 21:11     ` Jeff King
2017-12-07 21:47 ` Junio C Hamano
2017-12-08 10:14   ` Jeff King [this message]
2017-12-08 17:29     ` René Scharfe
2017-12-08 18:37       ` Junio C Hamano
2017-12-08 21:28         ` Jeff King
2017-12-18 19:18           ` René Scharfe
2017-12-19 11:38             ` Jeff King
2017-12-19 18:26               ` René Scharfe
2017-12-20 13:05                 ` Jeff King
2017-12-08 21:17       ` Jeff King
