git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "René Scharfe" <l.s.r@web.de>, "Git List" <git@vger.kernel.org>
Subject: Re: [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog()
Date: Fri, 8 Dec 2017 16:28:33 -0500	[thread overview]
Message-ID: <20171208212832.GC7355@sigill.intra.peff.net> (raw)
In-Reply-To: <xmqqd13p83sb.fsf@gitster.mtv.corp.google.com>

On Fri, Dec 08, 2017 at 10:37:08AM -0800, Junio C Hamano wrote:

> > The two modes (dup/nodup) make string_list code tricky.  Not sure
> > how far we'd get with something simpler (e.g. an array of char pointers),
> > but having the caller do all string allocations would make the code
> > easier to analyze.
> 
> Yes.
> 
> It probably would have been more sensible if the API did not have
> two modes (instead, have the caller pass whatever string to be
> stored, *and* make the caller responsible for freeing them *if* it
> passed an allocated string).

I'd actually argue the other way: the simplest interface is one where
the string list owns all of its pointers. That keeps the
ownership/lifetime issues clear, and it's one less step for the caller
to have to remember to do at the end (they do have to clear() the list,
but they must do that anyway to free the array of items).

It does mean that some callers may have to remember to free a temporary
buffer right after adding its contents to the list. But that's a lesser
evil, I think, since the memory ownership issues are all clearly
resolved at the time of add.

The big cost is just extra copies/allocations.

I dunno. I actually do not mind the "nodup" version of append being used
on a "dup" list. It is just a way of letting each call decide on whether
to hand over ownership, and I think most sites are pretty clear.

> For the push_refs_with_push() patch you sent, another possible fix
> would be to make cas_options a nodup kind so that the result of
> strbuf_detach() does not get an extra strdup to be lost when placed
> in cas_options.  With the current string-list API that would not
> quite work, because freeing done in _release() is tied to the
> "dup/nodup" ness of the string list.  I think there even is a
> codepath that initializes a string_list as nodup kind, stuffs string
> in it giving the ownership, and then flips it into dup kind just
> before calling _release() only to have it free the strings, or
> something silly/ugly like that.

Yes, the first grep hit for NODUP is bisect_clean_state(), which does
this. I think it would be more clear if we could do:

diff --git a/bisect.c b/bisect.c
index 0fca17c02b..7c59408a13 100644
--- a/bisect.c
+++ b/bisect.c
@@ -1060,8 +1060,7 @@ static int mark_for_removal(const char *refname, const struct object_id *oid,
 			    int flag, void *cb_data)
 {
 	struct string_list *refs = cb_data;
-	char *ref = xstrfmt("refs/bisect%s", refname);
-	string_list_append(refs, ref);
+	string_list_appendf(refs, "refs/bisect%s", refname);
 	return 0;
 }
 
@@ -1070,11 +1069,10 @@ int bisect_clean_state(void)
 	int result = 0;
 
 	/* There may be some refs packed during bisection */
-	struct string_list refs_for_removal = STRING_LIST_INIT_NODUP;
+	struct string_list refs_for_removal = STRING_LIST_INIT_DUP;
 	for_each_ref_in("refs/bisect", mark_for_removal, (void *) &refs_for_removal);
 	string_list_append(&refs_for_removal, xstrdup("BISECT_HEAD"));
 	result = delete_refs("bisect: remove", &refs_for_removal, REF_NO_DEREF);
-	refs_for_removal.strdup_strings = 1;
 	string_list_clear(&refs_for_removal, 0);
 	unlink_or_warn(git_path_bisect_expected_rev());
 	unlink_or_warn(git_path_bisect_ancestors_ok());


Having a "format into a string" wrapper doesn't cover _every_ string you
might want to add to a list, but my experience with argv_array_pushf
leads me to believe that it covers quite a lot of cases.

I dunno. I am not so bothered by the current dual-nature that I think it
is worth going on a crusade to eliminate one side. But I'm OK if
somebody else wants to do so.

-Peff

  reply	other threads:[~2017-12-08 21:28 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-07 20:22 [PATCH] fmt-merge-msg: avoid leaking strbuf in shortlog() René Scharfe
2017-12-07 21:27 ` Jeff King
2017-12-08 17:29   ` René Scharfe
2017-12-08 18:44     ` Junio C Hamano
2017-12-08 20:10       ` René Scharfe
2017-12-08 21:11     ` Jeff King
2017-12-07 21:47 ` Junio C Hamano
2017-12-08 10:14   ` Jeff King
2017-12-08 17:29     ` René Scharfe
2017-12-08 18:37       ` Junio C Hamano
2017-12-08 21:28         ` Jeff King [this message]
2017-12-18 19:18           ` René Scharfe
2017-12-19 11:38             ` Jeff King
2017-12-19 18:26               ` René Scharfe
2017-12-20 13:05                 ` Jeff King
2017-12-08 21:17       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171208212832.GC7355@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).