git@vger.kernel.org mailing list mirror (one of many)
From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>, Calbabreaker <calbabreaker@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Memory leak with sparse-checkout
Date: Mon, 20 Sep 2021 12:29:36 -0400
Message-ID: <8a0ddd8e-b585-8f40-c4b1-0a51f11e6b84@gmail.com>
In-Reply-To: <YUiuWSXO1P3JwerH@nand.local>

On 9/20/2021 11:52 AM, Taylor Blau wrote:
> On Mon, Sep 20, 2021 at 09:45:14PM +0930, Calbabreaker wrote:
>> What did you do before the bug happened? (Steps to reproduce your issue)
>>
>> This was run:
>>
>> git clone https://github.com/Calbabreaker/piano --sparse
>> cd piano
>> git sparse-checkout add any_text
>> git checkout deploy-frontend
>> git sparse-checkout init --cone
>> git sparse-checkout add any_text

Thank you for pointing this out.

I'll point out that this was likely found because "--sparse" does not
initialize cone mode patterns, but you might have expected it to. This
will increase the priority of adding something like "--sparse=cone"
to the 'git clone' options.

> Thanks for the reproduction. An even simpler one may be (inside of any
> repository):
> 
>     git sparse-checkout init
>     git sparse-checkout add dir
>     git sparse-checkout init --cone
>     git sparse-checkout add dir
> 
> The problem occurs because we keep existing entries when adding to the
> sparse-checkout list, and cone-mode patterns do not mix with
> non cone-mode patterns.
> 
> So after the first init and "add dir", your sparse-checkout file looks
> like:
> 
>   /*
>   !/*/
>   dir
> 
> but then when we convert to cone-mode and try to add "dir" (which in
> cone-mode we'll convert to "/dir/"), we run into trouble when adding the
> existing "dir" entry. That's because add_patterns_cone_mode() calls
> insert_recursive_pattern() on every entry in the existing list,
> including "dir".
> 
> So when we call insert_recursive_pattern() with any pattern list and
> path containing "dir", we first insert "dir" into the list, and then:
> 
>   char *slash = strrchr(e->pattern, '/');
>   char *oldpattern = e->pattern;
> 
>   if (slash == e->pattern)
>     break;
>   // trim off a slash, repeat
> 
> except slash is NULL because "dir" doesn't contain a slash. And that
> explains the problem you're seeing, because (a) we'll stay in that while
> loop forever, and (b) each iteration allocates memory to accommodate the
> new pattern, so we'll eventually run out of memory.

Yikes! Thanks for digging into the details.
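
To make the failure mode concrete, here is a minimal standalone sketch of
a trimming loop that treats a NULL result from strrchr() as its own case
instead of looping on it. It is not git's code: count_parent_dirs() and
its fixed-size buffer are hypothetical, chosen only for illustration, and
it is not meant as the fix for insert_recursive_pattern().

```c
#include <string.h>

/*
 * Hypothetical helper (not a function in git): count how many parent
 * directories a rooted pattern such as "/a/b/c" has by repeatedly
 * trimming the last path component.
 *
 * Returns -1 when a '/' cannot be found (for example for "dir"), which
 * is the input that makes an unguarded trimming loop spin forever.
 */
int count_parent_dirs(const char *pattern)
{
	char buf[256];	/* illustration only: fixed-size scratch copy */
	int parents = 0;

	if (strlen(pattern) >= sizeof(buf))
		return -1;
	strcpy(buf, pattern);

	for (;;) {
		char *slash = strrchr(buf, '/');

		if (!slash)
			return -1;	/* no slash left: report it, do not loop */
		if (slash == buf)
			break;		/* reached the leading '/': done */

		*slash = '\0';		/* trim off the last component, repeat */
		parents++;
	}
	return parents;
}
```

With this sketch, "/a/b/c" yields 2 and "/dir" yields 0, while "dir"
yields -1 rather than hanging. Returning an error keeps the non-cone
input visible to the caller instead of silently accepting it.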

> The wrong thing to do would be to handle this case by changing the
> conditional to "if (!slash || slash == e->pattern)", because we can't
> blindly carry forward some patterns which look like cone-mode patterns,
> since together the list of sparse-checkout entries may not represent a
> cone.
> 
> (An example here is if we added /foo/bar/baz/* without the corresponding
> /foo/, !/foo/*, and so on).
> 
> So I think the problem really is that we need to drop existing patterns
> when re-initializing the sparse-checkout in cone mode. We could try to
> recognize that existing patterns may already constitute a cone (and/or
> create a cone that covers the existing patterns).
> 
> But I think the easiest thing (if a little unfriendly) would be to just
> drop them and start afresh when re-initializing the sparse-checkout in
> cone mode.

This isn't sufficient, as a user can modify their .git/info/sparse-checkout
file whenever they want, so we should fix this bug regardless. We could add
a "Your existing patterns are not in cone mode" error.

It might still be a good idea to let "git sparse-checkout init --cone"
overwrite the sparse-checkout file _if the file is not already in cone
mode_.
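
As a rough sketch of the kind of up-front check that could back such an
error, the helper below scans the contents of a sparse-checkout file and
rejects any line that is not rooted. This is an assumption-laden
simplification, not git's actual cone-mode detection:
looks_like_cone_patterns() is a hypothetical name, and "starts with '/'
or '!/'" is only a necessary condition for a cone-mode pattern, not a
sufficient one.

```c
#include <string.h>

/*
 * Hypothetical check (not git's logic): return 1 if every non-empty,
 * non-comment line in `contents` starts with "/" or "!/", and 0 as soon
 * as one line does not. A bare entry such as "dir" is rejected.
 */
int looks_like_cone_patterns(const char *contents)
{
	const char *line = contents;

	while (*line) {
		const char *end = strchr(line, '\n');
		size_t len = end ? (size_t)(end - line) : strlen(line);

		if (len > 0 && line[0] != '#') {
			const char *p = line;

			if (*p == '!')
				p++;		/* skip the negation marker */
			if (*p != '/')
				return 0;	/* unrooted pattern: not cone mode */
		}
		line = end ? end + 1 : line + len;
	}
	return 1;
}
```

For the file from the reproduction ("/*", "!/*/", "dir") this returns 0
because of the bare "dir" line; with "/dir/" in its place it returns 1.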

Thanks,
-Stolee
