git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v3 0/6] revisions API: fix more memory leaks
Date: Tue,  2 Aug 2022 17:33:10 +0200	[thread overview]
Message-ID: <cover-v3-0.6-00000000000-20220802T152925Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-v2-0.6-00000000000-20220729T082919Z-avarab@gmail.com>

This series fixes various leaking users of the revisions API.

Changes since v2:

* I think the rewrite here of fixing the leak should address Jeff
  King's concerns. I was somewhat mentally stuck on trying to change
  the code to make the leak fix easier, and then having the leak fix.

  But I think as the new 3/6 shows fixing the leak first is much
  easier, and more straightforward to explain.

* A new 4/6 then follows-up and rewrites the variable clobbering that
  was needed for the pre-image of 3/6.

* Don't reference the "XXX" comment in 6/6.

* A typo fix, pointed out by Eric Sunshine.

Thanks all for the review on the v2!

Ævar Arnfjörð Bjarmason (6):
  bisect.c: add missing "goto" for release_revisions()
  test-fast-rebase helper: use release_revisions() (again)
  log: fix a memory leak in "git show <revision>..."
  log: refactor "rev.pending" code in cmd_show()
  bisect.c: partially fix bisect_rev_setup() memory leak
  revisions API: don't leak memory on argv elements that need free()-ing

 bisect.c                                      | 28 ++++++++++-------
 builtin/log.c                                 | 31 +++++++++++++------
 builtin/submodule--helper.c                   |  5 ++-
 remote.c                                      |  5 ++-
 revision.c                                    |  2 ++
 revision.h                                    |  3 +-
 t/helper/test-fast-rebase.c                   |  2 --
 t/t0203-gettext-setlocale-sanity.sh           |  1 +
 t/t1020-subdirectory.sh                       |  1 +
 t/t2020-checkout-detach.sh                    |  1 +
 t/t3307-notes-man.sh                          |  1 +
 t/t3920-crlf-messages.sh                      |  2 ++
 t/t4069-remerge-diff.sh                       |  1 +
 t/t7007-show.sh                               |  1 +
 ...3-pre-commit-and-pre-merge-commit-hooks.sh |  1 +
 t/t9122-git-svn-author.sh                     |  1 -
 t/t9162-git-svn-dcommit-interactive.sh        |  1 -
 17 files changed, 59 insertions(+), 28 deletions(-)

Range-diff against v2:
1:  12a4a20c59f = 1:  12a4a20c59f bisect.c: add missing "goto" for release_revisions()
2:  bbd3a7e5ecc ! 2:  a04fade9d9d test-fast-rebase helper: use release_revisions() (again)
    @@ Commit message
                   ... when != \( I \| &I \)
     
         That rule should arguably use only &I, not I (as we might be passed a
    -    pointer). He distinction would matter if anyone cared about the
    +    pointer). The distinction would matter if anyone cared about the
         side-effects of a memset() followed by release() of a pointer to a
         variable passed into the function.
     
4:  54b632c1124 ! 3:  83fc1835fe5 log: fix common "rev.pending" memory leak in "git show"
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    log: fix common "rev.pending" memory leak in "git show"
    -
    -    Fix a very common memory leak introduced in 5d7eeee2ac6 (git-show: grok blobs,
    -    trees and tags, too, 2006-12-14).
    -
    -    When "git show" displays commits it needs to temporarily clobbers the
    -    "rev.pending" array, but by doing so we'll fail to
    -    release_revisions(), as we have for most other uses of "struct
    -    rev_info" since 2da81d1efb0 (Merge branch 'ab/plug-leak-in-revisions',
    -    2022-06-07).
    -
    -    In the preceding commit this code was made to use a more extendable
    -    pattern, which we can now complete. Once we've clobbered our
    -    "rev.pending" and invoked "cmd_log_walk_no_free()" we need to
    -    "object_array_clear()" our old "rev.pending".
    -
    -    So in addition to the now-populated &pending array, to avoid leaking
    -    the memory related to the one member array we've created. The
    -    &rev.pending was already being object_array_clear()'d by the
    -    release_revisions() added in f6bfea0ad01 (revisions API users: use
    -    release_revisions() in builtin/log.c, 2022-04-13), now we'll also
    -    correctly free the previous data (f6bfea0ad01 noted this memory leak
    -    as an outstanding TODO).
    -
    -    This way of doing it is making assumptions about the internals of
    -    "struct rev_info", in particular that the "pending" member is
    -    stand-alone, and that no other state is referring to it. But we've
    -    been making those assumptions ever since 5d7eeee2ac6 (git-show: grok
    -    blobs, trees and tags, too, 2006-12-14).
    +    log: fix a memory leak in "git show <revision>..."
    +
    +    Fix a memory leak in code added in 5d7eeee2ac6 (git-show: grok blobs,
    +    trees and tags, too, 2006-12-14). As we iterate over a "<revision>..."
    +    command-line and encounter ad OBJ_COMMIT we want to use our "struct
    +    rev_info", but with a "pending" array of one element: the one commit
    +    we're showing in the loop.
    +
    +    To do this 5d7eeee2ac6 saved away a pointer to rev.pending.objects and
    +    rev.pending.nr for its iteration. We'd then clobber those (and alloc)
    +    when we needed to show an OBJ_COMMIT.
    +
    +    We'd therefore leak the "rev.pending" we started out with, and only
    +    free the new "rev.pending" in the "OBJ_COMMIT" case arm as
    +    prepare_revision_walk() would draw it down.
    +
    +    Let's fix this memory leak. Now when we encounter an OBJ_COMMIT we
    +    save away the "rev.pending" before clearing it. We then add a single
    +    commit to it, which our indirect invocation of prepare_revision_walk()
    +    will remove. After that we restore the "rev.pending".
    +
    +    Our "rev.pending" will then get free'd by the release_revisions()
    +    added in f6bfea0ad01 (revisions API users: use release_revisions() in
    +    builtin/log.c, 2022-04-13)
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/log.c ##
    -@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
    - 		return cmd_log_deinit(cmd_log_walk(&rev), &rev);
    - 
    - 	memcpy(&pending, &rev.pending, sizeof(rev.pending));
    -+	memcpy(&rev.pending, &blank, sizeof(rev.pending));
    - 	rev.diffopt.no_free = 1;
    - 	for (i = 0; i < pending.nr && !ret; i++) {
    - 		struct object *o = pending.objects[i].item;
     @@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
      			rev.shown_one = 1;
      			break;
      		case OBJ_COMMIT:
    --			memcpy(&rev.pending, &blank, sizeof(rev.pending));
    ++		{
    ++			struct object_array old;
    ++
    ++			memcpy(&old, &rev.pending, sizeof(old));
    + 			rev.pending.nr = rev.pending.alloc = 0;
    + 			rev.pending.objects = NULL;
      			add_object_array(o, name, &rev.pending);
      			ret = cmd_log_walk_no_free(&rev);
    ++			memcpy(&rev.pending, &old, sizeof(rev.pending));
      			break;
    -@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
    ++		}
    + 		default:
      			ret = error(_("unknown type: %d"), o->type);
      		}
    - 	}
    -+	object_array_clear(&pending);
    - 
    - 	rev.diffopt.no_free = 0;
    - 	diff_free(&rev.diffopt);
     
      ## t/t0203-gettext-setlocale-sanity.sh ##
     @@
3:  1629299f883 ! 4:  fd474666e7c log: make the intent of cmd_show()'s "rev.pending" juggling clearer
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    log: make the intent of cmd_show()'s "rev.pending" juggling clearer
    +    log: refactor "rev.pending" code in cmd_show()
     
    -    Adjust code added in 5d7eeee2ac6 (git-show: grok blobs, trees and
    -    tags, too, 2006-12-14) to use the "memcpy a &blank" idiom introduced
    -    in 5726a6b4012 (*.c *_init(): define in terms of corresponding *_INIT
    -    macro, 2021-07-01).
    +    Refactor the juggling of "rev.pending" and our replacement for it
    +    amended in the preceding commit so that:
     
    -    Now the types in play are guaranteed to correspond, i.e. we used "int"
    -    here for the "count" before, but the corresponding "nr" is an
    -    "unsigned int". By using a "blank" object we almost entirely bypass
    -    that, we'll only need to declare our own "unsigned int i".
    +     * We use an "unsigned int" instead of an "int" for "i", this matches
    +       the types of "struct rev_info" itself.
     
    -    There are no functional changes here aside from potential overflow
    -    guard rails, the structure only has these three members ("nr", "alloc"
    -    and "objects"), but now we're obviously future-proof against assuming
    -    that.
    +     * We don't need the "count" and "objects" variables introduced in
    +       5d7eeee2ac6 (git-show: grok blobs, trees and tags, too, 2006-12-14).
    +
    +       They were originally added since we'd clobber rev.pending in the
    +       loop without restoring it. Since the preceding commit we are
    +       restoring it when we handle OBJ_COMMIT, so the main for-loop can
    +       refer to "rev.pending" didrectly.
    +
    +     * We use the "memcpy a &blank" idiom introduced in
    +       5726a6b4012 (*.c *_init(): define in terms of corresponding *_INIT
    +       macro, 2021-07-01).
    +
    +       This is more obvious than relying on us enumerating all of the
    +       relevant members of the "struct object_array" that we need to
    +       clear.
    +
    +     * We comment on why we don't need an object_array_clear() here, see
    +       the analysis in [1].
    +
    +    1. https://lore.kernel.org/git/YuQtJ2DxNKX%2Fy70N@coredump.intra.peff.net/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    Helped-by: Jeff King <peff@peff.net>
     
      ## builtin/log.c ##
     @@ builtin/log.c: static void show_setup_revisions_tweak(struct rev_info *rev,
    @@ builtin/log.c: static void show_setup_revisions_tweak(struct rev_info *rev,
      {
      	struct rev_info rev;
     -	struct object_array_entry *objects;
    -+	struct object_array blank = OBJECT_ARRAY_INIT;
    -+	struct object_array pending = OBJECT_ARRAY_INIT;
     +	unsigned int i;
      	struct setup_revision_opt opt;
      	struct pathspec match_all;
    @@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
      
     -	count = rev.pending.nr;
     -	objects = rev.pending.objects;
    -+	memcpy(&pending, &rev.pending, sizeof(rev.pending));
      	rev.diffopt.no_free = 1;
     -	for (i = 0; i < count && !ret; i++) {
     -		struct object *o = objects[i].item;
     -		const char *name = objects[i].name;
    -+	for (i = 0; i < pending.nr && !ret; i++) {
    -+		struct object *o = pending.objects[i].item;
    -+		const char *name = pending.objects[i].name;
    ++	for (i = 0; i < rev.pending.nr && !ret; i++) {
    ++		struct object *o = rev.pending.objects[i].item;
    ++		const char *name = rev.pending.objects[i].name;
      		switch (o->type) {
      		case OBJ_BLOB:
      			ret = show_blob_object(&o->oid, &rev, name);
    @@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
      				ret = error(_("could not read object %s"),
      					    oid_to_hex(oid));
     -			objects[i].item = o;
    -+			pending.objects[i].item = o;
    ++			rev.pending.objects[i].item = o;
      			i--;
      			break;
      		}
     @@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
    - 			rev.shown_one = 1;
    - 			break;
      		case OBJ_COMMIT:
    + 		{
    + 			struct object_array old;
    ++			struct object_array blank = OBJECT_ARRAY_INIT;
    + 
    + 			memcpy(&old, &rev.pending, sizeof(old));
     -			rev.pending.nr = rev.pending.alloc = 0;
     -			rev.pending.objects = NULL;
     +			memcpy(&rev.pending, &blank, sizeof(rev.pending));
    ++
      			add_object_array(o, name, &rev.pending);
      			ret = cmd_log_walk_no_free(&rev);
    ++
    ++			/*
    ++			 * No need for
    ++			 * object_array_clear(&pending). It was
    ++			 * cleared already in prepare_revision_walk()
    ++			 */
    + 			memcpy(&rev.pending, &old, sizeof(rev.pending));
      			break;
    + 		}
5:  2af2bce2d17 = 5:  dae5872e722 bisect.c: partially fix bisect_rev_setup() memory leak
6:  3c57e126554 ! 6:  5e637f55ff1 revisions API: don't leak memory on argv elements that need free()-ing
    @@ Commit message
         revisions API: don't leak memory on argv elements that need free()-ing
     
         Add a "free_removed_argv_elements" member to "struct
    -    setup_revision_opt", and use it to fix several memory leaks, e.g. the
    -    one with a "XXX" comment added in 8a534b61241 (bisect: use argv_array
    -    API, 2011-09-13).
    +    setup_revision_opt", and use it to fix several memory leaks.
     
         We have various memory leaks in APIs that take and munge "const
         char **argv", e.g. parse_options(). Sometimes these APIs are given the
-- 
2.37.1.1233.ge8b09efaedc


  parent reply	other threads:[~2022-08-02 15:33 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-13 13:10 [PATCH 0/6] revisions API: fix more memory leaks Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 1/6] bisect.c: add missing "goto" for release_revisions() Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 2/6] test-fast-rebase helper: use release_revisions() (again) Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 3/6] log: make the intent of cmd_show()'s "rev.pending" juggling clearer Ævar Arnfjörð Bjarmason
2022-07-18 14:51   ` Jeff King
2022-07-13 13:10 ` [PATCH 4/6] log: fix common "rev.pending" memory leak in "git show" Ævar Arnfjörð Bjarmason
2022-07-18 14:57   ` Jeff King
2022-07-18 15:08     ` Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 5/6] bisect.c: partially fix bisect_rev_setup() memory leak Ævar Arnfjörð Bjarmason
2022-07-18 15:05   ` Jeff King
2022-07-13 13:10 ` [PATCH 6/6] revisions API: don't leak memory on argv elements that need free()-ing Ævar Arnfjörð Bjarmason
2022-07-18 15:11   ` Jeff King
2022-07-18 15:13     ` Ævar Arnfjörð Bjarmason
2022-07-18 15:45       ` Jeff King
2022-07-29  7:07         ` Ævar Arnfjörð Bjarmason
2022-07-29 18:03           ` Jeff King
2022-07-29 18:37             ` Jeff King
2022-07-29 18:52               ` Junio C Hamano
2022-07-29 19:04                 ` Jeff King
2022-07-18 15:14 ` [PATCH 0/6] revisions API: fix more memory leaks Jeff King
2022-07-29  8:31 ` [PATCH v2 " Ævar Arnfjörð Bjarmason
2022-07-29  8:31   ` [PATCH v2 1/6] bisect.c: add missing "goto" for release_revisions() Ævar Arnfjörð Bjarmason
2022-07-29  8:31   ` [PATCH v2 2/6] test-fast-rebase helper: use release_revisions() (again) Ævar Arnfjörð Bjarmason
2022-07-30  5:22     ` Eric Sunshine
2022-07-29  8:31   ` [PATCH v2 3/6] log: make the intent of cmd_show()'s "rev.pending" juggling clearer Ævar Arnfjörð Bjarmason
2022-07-29 18:44     ` Junio C Hamano
2022-07-29  8:31   ` [PATCH v2 4/6] log: fix common "rev.pending" memory leak in "git show" Ævar Arnfjörð Bjarmason
2022-07-29 18:55     ` Jeff King
2022-07-29 19:07       ` Jeff King
2022-07-29 18:59     ` Jeff King
2022-07-29  8:31   ` [PATCH v2 5/6] bisect.c: partially fix bisect_rev_setup() memory leak Ævar Arnfjörð Bjarmason
2022-07-29  8:31   ` [PATCH v2 6/6] revisions API: don't leak memory on argv elements that need free()-ing Ævar Arnfjörð Bjarmason
2022-07-29 19:00     ` Jeff King
2022-07-29 19:02   ` [PATCH v2 0/6] revisions API: fix more memory leaks Jeff King
2022-08-02 15:33   ` Ævar Arnfjörð Bjarmason [this message]
2022-08-02 15:33     ` [PATCH v3 1/6] bisect.c: add missing "goto" for release_revisions() Ævar Arnfjörð Bjarmason
2022-08-03 17:13       ` Junio C Hamano
2022-08-02 15:33     ` [PATCH v3 2/6] test-fast-rebase helper: use release_revisions() (again) Ævar Arnfjörð Bjarmason
2022-08-02 15:33     ` [PATCH v3 3/6] log: fix a memory leak in "git show <revision>..." Ævar Arnfjörð Bjarmason
2022-08-03 17:27       ` Jeff King
2022-08-03 17:29         ` Jeff King
2022-08-03 17:54       ` Junio C Hamano
2022-08-02 15:33     ` [PATCH v3 4/6] log: refactor "rev.pending" code in cmd_show() Ævar Arnfjörð Bjarmason
2022-08-03 17:32       ` Jeff King
2022-08-03 17:59       ` Junio C Hamano
2022-08-04  7:51         ` Ævar Arnfjörð Bjarmason
2022-08-04 15:59           ` Junio C Hamano
2022-08-05  9:01             ` Ævar Arnfjörð Bjarmason
2022-08-02 15:33     ` [PATCH v3 5/6] bisect.c: partially fix bisect_rev_setup() memory leak Ævar Arnfjörð Bjarmason
2022-08-02 15:33     ` [PATCH v3 6/6] revisions API: don't leak memory on argv elements that need free()-ing Ævar Arnfjörð Bjarmason
2022-08-03 18:30       ` Junio C Hamano
2022-08-03 17:33     ` [PATCH v3 0/6] revisions API: fix more memory leaks Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover-v3-0.6-00000000000-20220802T152925Z-avarab@gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).