git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Eric Sunshine <sunshine@sunshineco.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 04/21] strbuf: fix leak when `appendwholeline()` fails with EOF
Date: Wed, 29 May 2024 13:25:02 +0200	[thread overview]
Message-ID: <ZlcQjvAS-27S-mjw@tanuki> (raw)
In-Reply-To: <20240529091633.GB1098944@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 3872 bytes --]

On Wed, May 29, 2024 at 05:16:33AM -0400, Jeff King wrote:
> On Mon, May 27, 2024 at 08:44:46AM +0200, Patrick Steinhardt wrote:
> > > diff --git a/strbuf.c b/strbuf.c
> > > diff --git a/strbuf.c b/strbuf.c
> > > index e1076c9891..aed699c6bf 100644
> > > --- a/strbuf.c
> > > +++ b/strbuf.c
> > > @@ -656,10 +656,8 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
> > >  	 * we can just re-init, but otherwise we should make sure that our
> > >  	 * length is empty, and that the result is NUL-terminated.
> > >  	 */
> > > -	if (!sb->buf)
> > > -		strbuf_init(sb, 0);
> > > -	else
> > > -		strbuf_reset(sb);
> > > +	FREE_AND_NULL(sb->buf);
> > > +	strbuf_init(sb, 0);
> > >  	return EOF;
> > >  }
> > >  #else
> > > 
> > > But I think either of those would solve your leak, _and_ would help with
> > > similar leaks of strbuf_getwholeline() and friends.
> > 
> > I'm not quite convinced that `strbuf_getwholeline()` should deallocate
> > the buffer for the caller, I think that makes for quite a confusing
> > calling convention. The caller may want to reuse the buffer for other
> > operations, and it feels hostile to release the buffer under their feet.
> > 
> > The only edge case where I think it would make sense to free allocated
> > data is when being passed a not-yet-allocated strbuf. But I wonder
> > whether the added complexity would be worth it.
> 
> I'm not sure what they'd reuse it for. We necessarily have to reset it
> before reading, so the contents are now garbage. The allocated buffer
> could be reused, but since everybody has to call strbuf_grow() before
> assuming they can write, it's not a correctness issue, but only an
> optimization. But that optimization is pretty unlikely to matter. Since
> we hit this code only on EOF or error, it's generally going to happen
> once in a program, and not in a tight loop.
> 
> If we really cared, though, I think you could check sb->alloc before the
> call to getdelim(), and then we'd know whether the original held an
> allocation or not (and we could restore its state). That's what other
> syscall-ish strbuf functions like strbuf_readlink() and strbuf_getcwd()
> do.

Ah, I didn't know that we did similar things in other strbuf functions.
With that precedence I think it's less ugly to do this dance.

> That said, I agree that leaks here are not going to be common. Most
> callers are going to call it in a loop and unconditionally release at
> the end, whether they get multiple lines or not. The "append" function
> is the odd man out by reading a single line into a new buffer[1].
> 
> Looking through the results of:
> 
>   git grep -P '(?<!while) \(!?strbuf_get(whole)?line'
> 
> I saw only one questionable case. builtin/difftool.c does:
> 
>   if (strbuf_getline_nul(&lpath, fp))
> 	break;
> 
> without freeing lpath. But then...it does not free it in the case that
> we got a value, either! So I think it is leaking either way, and the
> solution, to strbuf_release(&lpath) outside of the loop, would fix both
> cases.

Indeed. We also didn't free `rpath` and `info`. I do have a follow up to
this series already, so let me add those leak fixes to it.

> > I've been going through all callsites and couldn't spot any that doesn't
> > free the buffer on EOF. So I'd propose to leave this as-is and revisit
> > if we eventually see that this is causing more memory leaks.
> 
> OK. I don't feel too strongly about it, but mostly thought it seemed
> inconsistent with the philosophy of those other strbuf functions.

I get where you're coming from now with the additional info that other
syscall-ish functions do a similar dance. I'll refrain from rerolling
this series just to fix this in a different way, also because neither of
us did spot any additional leaks caused by this.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-05-29 11:25 UTC|newest]

Thread overview: 115+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-23 12:25 [PATCH 00/20] Various memory leak fixes Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 01/20] t: mark a bunch of tests as leak-free Patrick Steinhardt
2024-05-23 17:44   ` Junio C Hamano
2024-05-24  6:56     ` Patrick Steinhardt
2024-05-24 16:05       ` Junio C Hamano
2024-05-24 17:53         ` Junio C Hamano
2024-05-24 20:34   ` Karthik Nayak
2024-05-23 12:25 ` [PATCH 02/20] transport-helper: fix leaking helper name Patrick Steinhardt
2024-05-23 17:36   ` Junio C Hamano
2024-05-24 20:38   ` Karthik Nayak
2024-05-23 12:25 ` [PATCH 03/20] strbuf: fix leak when `appendwholeline()` fails with EOF Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 04/20] checkout: clarify memory ownership in `unique_tracking_name()` Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 05/20] http: refactor code to clarify memory ownership Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 06/20] config: clarify memory ownership in `git_config_pathname()` Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 07/20] diff: refactor code to clarify memory ownership of prefixes Patrick Steinhardt
2024-05-23 16:59   ` Eric Sunshine
2024-05-23 12:25 ` [PATCH 08/20] convert: refactor code to clarify ownership of check_roundtrip_encoding Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 09/20] builtin/log: stop using globals for log config Patrick Steinhardt
2024-05-23 12:25 ` [PATCH 10/20] builtin/log: stop using globals for format config Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 11/20] config: clarify memory ownership in `git_config_string()` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 12/20] config: plug various memory leaks Patrick Steinhardt
2024-05-23 17:13   ` Junio C Hamano
2024-05-24  6:58     ` Patrick Steinhardt
2024-05-24  8:55       ` Patrick Steinhardt
2024-05-24 16:12         ` Junio C Hamano
2024-05-24 16:11       ` Junio C Hamano
2024-05-23 12:26 ` [PATCH 13/20] builtin/credential: clear credential before exit Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 14/20] commit-reach: fix memory leak in `ahead_behind()` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 15/20] submodule: fix leaking memory for submodule entries Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 16/20] strvec: add functions to replace and remove strings Patrick Steinhardt
2024-05-23 17:09   ` Eric Sunshine
2024-05-24  6:56     ` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 17/20] builtin/mv: refactor `add_slash()` to always return allocated strings Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 18/20] builtin/mv duplicate string list memory Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 19/20] builtin/mv: refactor to use `struct strvec` Patrick Steinhardt
2024-05-23 12:26 ` [PATCH 20/20] builtin/mv: fix leaks for submodule gitfile paths Patrick Steinhardt
2024-05-23 16:45 ` [PATCH 00/20] Various memory leak fixes Junio C Hamano
2024-05-24  6:56   ` Patrick Steinhardt
2024-05-24 10:03 ` [PATCH v2 00/21] " Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 01/21] ci: add missing dependency for TTY prereq Patrick Steinhardt
2024-05-24 16:31     ` Junio C Hamano
2024-05-24 10:03   ` [PATCH v2 02/21] t: mark a bunch of tests as leak-free Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 03/21] transport-helper: fix leaking helper name Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 04/21] strbuf: fix leak when `appendwholeline()` fails with EOF Patrick Steinhardt
2024-05-25  4:46     ` Jeff King
2024-05-27  6:44       ` Patrick Steinhardt
2024-05-29  9:16         ` Jeff King
2024-05-29 11:25           ` Patrick Steinhardt [this message]
2024-05-30  7:16             ` Jeff King
2024-05-24 10:03   ` [PATCH v2 05/21] checkout: clarify memory ownership in `unique_tracking_name()` Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 06/21] http: refactor code to clarify memory ownership Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 07/21] config: clarify memory ownership in `git_config_pathname()` Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 08/21] diff: refactor code to clarify memory ownership of prefixes Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 09/21] convert: refactor code to clarify ownership of check_roundtrip_encoding Patrick Steinhardt
2024-05-24 10:03   ` [PATCH v2 10/21] builtin/log: stop using globals for log config Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 11/21] builtin/log: stop using globals for format config Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 12/21] config: clarify memory ownership in `git_config_string()` Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 13/21] config: plug various memory leaks Patrick Steinhardt
2024-05-24 10:13     ` Patrick Steinhardt
2024-05-25  4:33     ` Jeff King
2024-05-27  6:46       ` Patrick Steinhardt
2024-05-29  9:20         ` Jeff King
2024-05-24 10:04   ` [PATCH v2 14/21] builtin/credential: clear credential before exit Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 15/21] commit-reach: fix memory leak in `ahead_behind()` Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 16/21] submodule: fix leaking memory for submodule entries Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 17/21] strvec: add functions to replace and remove strings Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 18/21] builtin/mv: refactor `add_slash()` to always return allocated strings Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 19/21] builtin/mv duplicate string list memory Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 20/21] builtin/mv: refactor to use `struct strvec` Patrick Steinhardt
2024-05-24 10:04   ` [PATCH v2 21/21] builtin/mv: fix leaks for submodule gitfile paths Patrick Steinhardt
2024-05-25  2:10   ` [PATCH v2 00/21] Various memory leak fixes Junio C Hamano
2024-05-27  6:44     ` Patrick Steinhardt
2024-05-27 17:38       ` Junio C Hamano
2024-05-27 18:02         ` Junio C Hamano
2024-05-28  5:09         ` Patrick Steinhardt
2024-05-29  8:25       ` Karthik Nayak
2024-05-27 11:45 ` [PATCH v3 " Patrick Steinhardt
2024-05-27 11:45   ` [PATCH v3 01/21] ci: add missing dependency for TTY prereq Patrick Steinhardt
2024-05-27 11:45   ` [PATCH v3 02/21] t: mark a bunch of tests as leak-free Patrick Steinhardt
2024-05-27 11:45   ` [PATCH v3 03/21] transport-helper: fix leaking helper name Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 04/21] strbuf: fix leak when `appendwholeline()` fails with EOF Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 05/21] checkout: clarify memory ownership in `unique_tracking_name()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 06/21] http: refactor code to clarify memory ownership Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 07/21] config: clarify memory ownership in `git_config_pathname()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 08/21] diff: refactor code to clarify memory ownership of prefixes Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 09/21] convert: refactor code to clarify ownership of check_roundtrip_encoding Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 10/21] builtin/log: stop using globals for log config Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 11/21] builtin/log: stop using globals for format config Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 12/21] config: clarify memory ownership in `git_config_string()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 13/21] config: plug various memory leaks Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 14/21] builtin/credential: clear credential before exit Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 15/21] commit-reach: fix memory leak in `ahead_behind()` Patrick Steinhardt
2024-05-27 11:46   ` [PATCH v3 16/21] submodule: fix leaking memory for submodule entries Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 17/21] strvec: add functions to replace and remove strings Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 18/21] builtin/mv: refactor `add_slash()` to always return allocated strings Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 19/21] builtin/mv duplicate string list memory Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 20/21] builtin/mv: refactor to use `struct strvec` Patrick Steinhardt
2024-05-27 11:47   ` [PATCH v3 21/21] builtin/mv: fix leaks for submodule gitfile paths Patrick Steinhardt
2024-05-27 17:52   ` [PATCH v3 00/21] Various memory leak fixes Junio C Hamano
2024-05-30  6:38   ` [PATCH 0/5] add-ons for ps/leakfixes Jeff King
2024-05-30  6:39     ` [PATCH 1/5] t-strvec: use va_end() to match va_start() Jeff King
2024-05-30  6:39     ` [PATCH 2/5] t-strvec: mark variable-arg helper with LAST_ARG_MUST_BE_NULL Jeff King
2024-05-30  6:44     ` [PATCH 3/5] mv: move src_dir cleanup to end of cmd_mv() Jeff King
2024-05-30  7:04       ` Patrick Steinhardt
2024-05-30  7:21         ` Jeff King
2024-05-30  7:24           ` Patrick Steinhardt
2024-05-30  8:15             ` Jeff King
2024-05-30  8:19               ` Patrick Steinhardt
2024-05-30  8:28                 ` Jeff King
2024-05-30  6:45     ` [PATCH 4/5] mv: factor out empty src_dir removal Jeff King
2024-05-30  6:46     ` [PATCH 5/5] mv: replace src_dir with a strvec Jeff King
2024-05-30 15:36       ` Junio C Hamano
2024-05-31 11:12         ` Jeff King
2024-05-31 14:56           ` Junio C Hamano
2024-05-30  7:05     ` [PATCH 0/5] add-ons for ps/leakfixes Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZlcQjvAS-27S-mjw@tanuki \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).