From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
"Eric Sunshine" <sunshine@sunshineco.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v3 0/6] revisions API: fix more memory leaks
Date: Tue, 2 Aug 2022 17:33:10 +0200 [thread overview]
Message-ID: <cover-v3-0.6-00000000000-20220802T152925Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-v2-0.6-00000000000-20220729T082919Z-avarab@gmail.com>
This series fixes various leaking users of the revisions API.
Changes since v2:
* I think the rewrite here of fixing the leak should address Jeff
King's concerns. I was somewhat mentally stuck on trying to change
the code to make the leak fix easier, and then having the leak fix.
But I think as the new 3/6 shows fixing the leak first is much
easier, and more straightforward to explain.
* A new 4/6 then follows-up and rewrites the variable clobbering that
was needed for the pre-image of 3/6.
* Don't reference the "XXX" comment in 6/6.
* A typo fix, pointed out by Eric Sunshine.
Thanks all for the review on the v2!
Ævar Arnfjörð Bjarmason (6):
bisect.c: add missing "goto" for release_revisions()
test-fast-rebase helper: use release_revisions() (again)
log: fix a memory leak in "git show <revision>..."
log: refactor "rev.pending" code in cmd_show()
bisect.c: partially fix bisect_rev_setup() memory leak
revisions API: don't leak memory on argv elements that need free()-ing
bisect.c | 28 ++++++++++-------
builtin/log.c | 31 +++++++++++++------
builtin/submodule--helper.c | 5 ++-
remote.c | 5 ++-
revision.c | 2 ++
revision.h | 3 +-
t/helper/test-fast-rebase.c | 2 --
t/t0203-gettext-setlocale-sanity.sh | 1 +
t/t1020-subdirectory.sh | 1 +
t/t2020-checkout-detach.sh | 1 +
t/t3307-notes-man.sh | 1 +
t/t3920-crlf-messages.sh | 2 ++
t/t4069-remerge-diff.sh | 1 +
t/t7007-show.sh | 1 +
...3-pre-commit-and-pre-merge-commit-hooks.sh | 1 +
t/t9122-git-svn-author.sh | 1 -
t/t9162-git-svn-dcommit-interactive.sh | 1 -
17 files changed, 59 insertions(+), 28 deletions(-)
Range-diff against v2:
1: 12a4a20c59f = 1: 12a4a20c59f bisect.c: add missing "goto" for release_revisions()
2: bbd3a7e5ecc ! 2: a04fade9d9d test-fast-rebase helper: use release_revisions() (again)
@@ Commit message
... when != \( I \| &I \)
That rule should arguably use only &I, not I (as we might be passed a
- pointer). He distinction would matter if anyone cared about the
+ pointer). The distinction would matter if anyone cared about the
side-effects of a memset() followed by release() of a pointer to a
variable passed into the function.
4: 54b632c1124 ! 3: 83fc1835fe5 log: fix common "rev.pending" memory leak in "git show"
@@ Metadata
Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## Commit message ##
- log: fix common "rev.pending" memory leak in "git show"
-
- Fix a very common memory leak introduced in 5d7eeee2ac6 (git-show: grok blobs,
- trees and tags, too, 2006-12-14).
-
- When "git show" displays commits it needs to temporarily clobbers the
- "rev.pending" array, but by doing so we'll fail to
- release_revisions(), as we have for most other uses of "struct
- rev_info" since 2da81d1efb0 (Merge branch 'ab/plug-leak-in-revisions',
- 2022-06-07).
-
- In the preceding commit this code was made to use a more extendable
- pattern, which we can now complete. Once we've clobbered our
- "rev.pending" and invoked "cmd_log_walk_no_free()" we need to
- "object_array_clear()" our old "rev.pending".
-
- So in addition to the now-populated &pending array, to avoid leaking
- the memory related to the one member array we've created. The
- &rev.pending was already being object_array_clear()'d by the
- release_revisions() added in f6bfea0ad01 (revisions API users: use
- release_revisions() in builtin/log.c, 2022-04-13), now we'll also
- correctly free the previous data (f6bfea0ad01 noted this memory leak
- as an outstanding TODO).
-
- This way of doing it is making assumptions about the internals of
- "struct rev_info", in particular that the "pending" member is
- stand-alone, and that no other state is referring to it. But we've
- been making those assumptions ever since 5d7eeee2ac6 (git-show: grok
- blobs, trees and tags, too, 2006-12-14).
+ log: fix a memory leak in "git show <revision>..."
+
+ Fix a memory leak in code added in 5d7eeee2ac6 (git-show: grok blobs,
+ trees and tags, too, 2006-12-14). As we iterate over a "<revision>..."
+ command-line and encounter ad OBJ_COMMIT we want to use our "struct
+ rev_info", but with a "pending" array of one element: the one commit
+ we're showing in the loop.
+
+ To do this 5d7eeee2ac6 saved away a pointer to rev.pending.objects and
+ rev.pending.nr for its iteration. We'd then clobber those (and alloc)
+ when we needed to show an OBJ_COMMIT.
+
+ We'd therefore leak the "rev.pending" we started out with, and only
+ free the new "rev.pending" in the "OBJ_COMMIT" case arm as
+ prepare_revision_walk() would draw it down.
+
+ Let's fix this memory leak. Now when we encounter an OBJ_COMMIT we
+ save away the "rev.pending" before clearing it. We then add a single
+ commit to it, which our indirect invocation of prepare_revision_walk()
+ will remove. After that we restore the "rev.pending".
+
+ Our "rev.pending" will then get free'd by the release_revisions()
+ added in f6bfea0ad01 (revisions API users: use release_revisions() in
+ builtin/log.c, 2022-04-13)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## builtin/log.c ##
-@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
- return cmd_log_deinit(cmd_log_walk(&rev), &rev);
-
- memcpy(&pending, &rev.pending, sizeof(rev.pending));
-+ memcpy(&rev.pending, &blank, sizeof(rev.pending));
- rev.diffopt.no_free = 1;
- for (i = 0; i < pending.nr && !ret; i++) {
- struct object *o = pending.objects[i].item;
@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
rev.shown_one = 1;
break;
case OBJ_COMMIT:
-- memcpy(&rev.pending, &blank, sizeof(rev.pending));
++ {
++ struct object_array old;
++
++ memcpy(&old, &rev.pending, sizeof(old));
+ rev.pending.nr = rev.pending.alloc = 0;
+ rev.pending.objects = NULL;
add_object_array(o, name, &rev.pending);
ret = cmd_log_walk_no_free(&rev);
++ memcpy(&rev.pending, &old, sizeof(rev.pending));
break;
-@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
++ }
+ default:
ret = error(_("unknown type: %d"), o->type);
}
- }
-+ object_array_clear(&pending);
-
- rev.diffopt.no_free = 0;
- diff_free(&rev.diffopt);
## t/t0203-gettext-setlocale-sanity.sh ##
@@
3: 1629299f883 ! 4: fd474666e7c log: make the intent of cmd_show()'s "rev.pending" juggling clearer
@@ Metadata
Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## Commit message ##
- log: make the intent of cmd_show()'s "rev.pending" juggling clearer
+ log: refactor "rev.pending" code in cmd_show()
- Adjust code added in 5d7eeee2ac6 (git-show: grok blobs, trees and
- tags, too, 2006-12-14) to use the "memcpy a &blank" idiom introduced
- in 5726a6b4012 (*.c *_init(): define in terms of corresponding *_INIT
- macro, 2021-07-01).
+ Refactor the juggling of "rev.pending" and our replacement for it
+ amended in the preceding commit so that:
- Now the types in play are guaranteed to correspond, i.e. we used "int"
- here for the "count" before, but the corresponding "nr" is an
- "unsigned int". By using a "blank" object we almost entirely bypass
- that, we'll only need to declare our own "unsigned int i".
+ * We use an "unsigned int" instead of an "int" for "i", this matches
+ the types of "struct rev_info" itself.
- There are no functional changes here aside from potential overflow
- guard rails, the structure only has these three members ("nr", "alloc"
- and "objects"), but now we're obviously future-proof against assuming
- that.
+ * We don't need the "count" and "objects" variables introduced in
+ 5d7eeee2ac6 (git-show: grok blobs, trees and tags, too, 2006-12-14).
+
+ They were originally added since we'd clobber rev.pending in the
+ loop without restoring it. Since the preceding commit we are
+ restoring it when we handle OBJ_COMMIT, so the main for-loop can
+ refer to "rev.pending" didrectly.
+
+ * We use the "memcpy a &blank" idiom introduced in
+ 5726a6b4012 (*.c *_init(): define in terms of corresponding *_INIT
+ macro, 2021-07-01).
+
+ This is more obvious than relying on us enumerating all of the
+ relevant members of the "struct object_array" that we need to
+ clear.
+
+ * We comment on why we don't need an object_array_clear() here, see
+ the analysis in [1].
+
+ 1. https://lore.kernel.org/git/YuQtJ2DxNKX%2Fy70N@coredump.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
+ Helped-by: Jeff King <peff@peff.net>
## builtin/log.c ##
@@ builtin/log.c: static void show_setup_revisions_tweak(struct rev_info *rev,
@@ builtin/log.c: static void show_setup_revisions_tweak(struct rev_info *rev,
{
struct rev_info rev;
- struct object_array_entry *objects;
-+ struct object_array blank = OBJECT_ARRAY_INIT;
-+ struct object_array pending = OBJECT_ARRAY_INIT;
+ unsigned int i;
struct setup_revision_opt opt;
struct pathspec match_all;
@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
- count = rev.pending.nr;
- objects = rev.pending.objects;
-+ memcpy(&pending, &rev.pending, sizeof(rev.pending));
rev.diffopt.no_free = 1;
- for (i = 0; i < count && !ret; i++) {
- struct object *o = objects[i].item;
- const char *name = objects[i].name;
-+ for (i = 0; i < pending.nr && !ret; i++) {
-+ struct object *o = pending.objects[i].item;
-+ const char *name = pending.objects[i].name;
++ for (i = 0; i < rev.pending.nr && !ret; i++) {
++ struct object *o = rev.pending.objects[i].item;
++ const char *name = rev.pending.objects[i].name;
switch (o->type) {
case OBJ_BLOB:
ret = show_blob_object(&o->oid, &rev, name);
@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
ret = error(_("could not read object %s"),
oid_to_hex(oid));
- objects[i].item = o;
-+ pending.objects[i].item = o;
++ rev.pending.objects[i].item = o;
i--;
break;
}
@@ builtin/log.c: int cmd_show(int argc, const char **argv, const char *prefix)
- rev.shown_one = 1;
- break;
case OBJ_COMMIT:
+ {
+ struct object_array old;
++ struct object_array blank = OBJECT_ARRAY_INIT;
+
+ memcpy(&old, &rev.pending, sizeof(old));
- rev.pending.nr = rev.pending.alloc = 0;
- rev.pending.objects = NULL;
+ memcpy(&rev.pending, &blank, sizeof(rev.pending));
++
add_object_array(o, name, &rev.pending);
ret = cmd_log_walk_no_free(&rev);
++
++ /*
++ * No need for
++ * object_array_clear(&pending). It was
++ * cleared already in prepare_revision_walk()
++ */
+ memcpy(&rev.pending, &old, sizeof(rev.pending));
break;
+ }
5: 2af2bce2d17 = 5: dae5872e722 bisect.c: partially fix bisect_rev_setup() memory leak
6: 3c57e126554 ! 6: 5e637f55ff1 revisions API: don't leak memory on argv elements that need free()-ing
@@ Commit message
revisions API: don't leak memory on argv elements that need free()-ing
Add a "free_removed_argv_elements" member to "struct
- setup_revision_opt", and use it to fix several memory leaks, e.g. the
- one with a "XXX" comment added in 8a534b61241 (bisect: use argv_array
- API, 2011-09-13).
+ setup_revision_opt", and use it to fix several memory leaks.
We have various memory leaks in APIs that take and munge "const
char **argv", e.g. parse_options(). Sometimes these APIs are given the
--
2.37.1.1233.ge8b09efaedc
next prev parent reply other threads:[~2022-08-02 15:33 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-13 13:10 [PATCH 0/6] revisions API: fix more memory leaks Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 1/6] bisect.c: add missing "goto" for release_revisions() Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 2/6] test-fast-rebase helper: use release_revisions() (again) Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 3/6] log: make the intent of cmd_show()'s "rev.pending" juggling clearer Ævar Arnfjörð Bjarmason
2022-07-18 14:51 ` Jeff King
2022-07-13 13:10 ` [PATCH 4/6] log: fix common "rev.pending" memory leak in "git show" Ævar Arnfjörð Bjarmason
2022-07-18 14:57 ` Jeff King
2022-07-18 15:08 ` Ævar Arnfjörð Bjarmason
2022-07-13 13:10 ` [PATCH 5/6] bisect.c: partially fix bisect_rev_setup() memory leak Ævar Arnfjörð Bjarmason
2022-07-18 15:05 ` Jeff King
2022-07-13 13:10 ` [PATCH 6/6] revisions API: don't leak memory on argv elements that need free()-ing Ævar Arnfjörð Bjarmason
2022-07-18 15:11 ` Jeff King
2022-07-18 15:13 ` Ævar Arnfjörð Bjarmason
2022-07-18 15:45 ` Jeff King
2022-07-29 7:07 ` Ævar Arnfjörð Bjarmason
2022-07-29 18:03 ` Jeff King
2022-07-29 18:37 ` Jeff King
2022-07-29 18:52 ` Junio C Hamano
2022-07-29 19:04 ` Jeff King
2022-07-18 15:14 ` [PATCH 0/6] revisions API: fix more memory leaks Jeff King
2022-07-29 8:31 ` [PATCH v2 " Ævar Arnfjörð Bjarmason
2022-07-29 8:31 ` [PATCH v2 1/6] bisect.c: add missing "goto" for release_revisions() Ævar Arnfjörð Bjarmason
2022-07-29 8:31 ` [PATCH v2 2/6] test-fast-rebase helper: use release_revisions() (again) Ævar Arnfjörð Bjarmason
2022-07-30 5:22 ` Eric Sunshine
2022-07-29 8:31 ` [PATCH v2 3/6] log: make the intent of cmd_show()'s "rev.pending" juggling clearer Ævar Arnfjörð Bjarmason
2022-07-29 18:44 ` Junio C Hamano
2022-07-29 8:31 ` [PATCH v2 4/6] log: fix common "rev.pending" memory leak in "git show" Ævar Arnfjörð Bjarmason
2022-07-29 18:55 ` Jeff King
2022-07-29 19:07 ` Jeff King
2022-07-29 18:59 ` Jeff King
2022-07-29 8:31 ` [PATCH v2 5/6] bisect.c: partially fix bisect_rev_setup() memory leak Ævar Arnfjörð Bjarmason
2022-07-29 8:31 ` [PATCH v2 6/6] revisions API: don't leak memory on argv elements that need free()-ing Ævar Arnfjörð Bjarmason
2022-07-29 19:00 ` Jeff King
2022-07-29 19:02 ` [PATCH v2 0/6] revisions API: fix more memory leaks Jeff King
2022-08-02 15:33 ` Ævar Arnfjörð Bjarmason [this message]
2022-08-02 15:33 ` [PATCH v3 1/6] bisect.c: add missing "goto" for release_revisions() Ævar Arnfjörð Bjarmason
2022-08-03 17:13 ` Junio C Hamano
2022-08-02 15:33 ` [PATCH v3 2/6] test-fast-rebase helper: use release_revisions() (again) Ævar Arnfjörð Bjarmason
2022-08-02 15:33 ` [PATCH v3 3/6] log: fix a memory leak in "git show <revision>..." Ævar Arnfjörð Bjarmason
2022-08-03 17:27 ` Jeff King
2022-08-03 17:29 ` Jeff King
2022-08-03 17:54 ` Junio C Hamano
2022-08-02 15:33 ` [PATCH v3 4/6] log: refactor "rev.pending" code in cmd_show() Ævar Arnfjörð Bjarmason
2022-08-03 17:32 ` Jeff King
2022-08-03 17:59 ` Junio C Hamano
2022-08-04 7:51 ` Ævar Arnfjörð Bjarmason
2022-08-04 15:59 ` Junio C Hamano
2022-08-05 9:01 ` Ævar Arnfjörð Bjarmason
2022-08-02 15:33 ` [PATCH v3 5/6] bisect.c: partially fix bisect_rev_setup() memory leak Ævar Arnfjörð Bjarmason
2022-08-02 15:33 ` [PATCH v3 6/6] revisions API: don't leak memory on argv elements that need free()-ing Ævar Arnfjörð Bjarmason
2022-08-03 18:30 ` Junio C Hamano
2022-08-03 17:33 ` [PATCH v3 0/6] revisions API: fix more memory leaks Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover-v3-0.6-00000000000-20220802T152925Z-avarab@gmail.com \
--to=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).