git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] repository: fix free problem with repo_clear(the_repository)
@ 2018-05-09 17:04 Nguyễn Thái Ngọc Duy
  2018-05-09 17:17 ` Brandon Williams
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-05-09 17:04 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller, Junio C Hamano,
	Nguyễn Thái Ngọc Duy

the_repository is special. One of the special things about it is that
it does not allocate a new index_state object like submodules but
points to the global the_index variable instead. As a global variable,
the_index cannot be free()'d.

Add an exception for this in repo_clear(). In the future perhaps we
would be able to allocate the_repository's index on heap too. Then we
can remove revert this.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 I was trying to test the new parsed_object_pool_clear() and found this.

 repository.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index a4848c1bd0..f44733524a 100644
--- a/repository.c
+++ b/repository.c
@@ -238,7 +238,9 @@ void repo_clear(struct repository *repo)
 
 	if (repo->index) {
 		discard_index(repo->index);
-		FREE_AND_NULL(repo->index);
+		if (repo->index != &the_index)
+			free(repo->index);
+		repo->index = NULL;
 	}
 }
 
-- 
2.17.0.705.g3525833791


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 17:04 [PATCH] repository: fix free problem with repo_clear(the_repository) Nguyễn Thái Ngọc Duy
@ 2018-05-09 17:17 ` Brandon Williams
  2018-05-09 17:42 ` Elijah Newren
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Brandon Williams @ 2018-05-09 17:17 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Stefan Beller, Junio C Hamano

On 05/09, Nguyễn Thái Ngọc Duy wrote:
> the_repository is special. One of the special things about it is that
> it does not allocate a new index_state object like submodules but
> points to the global the_index variable instead. As a global variable,
> the_index cannot be free()'d.
> 
> Add an exception for this in repo_clear(). In the future perhaps we
> would be able to allocate the_repository's index on heap too. Then we
> can remove revert this.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  I was trying to test the new parsed_object_pool_clear() and found this.

This looks good and I do hope we can get to a state soon where we can
not have to special case the_repository.

> 
>  repository.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/repository.c b/repository.c
> index a4848c1bd0..f44733524a 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -238,7 +238,9 @@ void repo_clear(struct repository *repo)
>  
>  	if (repo->index) {
>  		discard_index(repo->index);
> -		FREE_AND_NULL(repo->index);
> +		if (repo->index != &the_index)
> +			free(repo->index);
> +		repo->index = NULL;
>  	}
>  }
>  
> -- 
> 2.17.0.705.g3525833791
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 17:04 [PATCH] repository: fix free problem with repo_clear(the_repository) Nguyễn Thái Ngọc Duy
  2018-05-09 17:17 ` Brandon Williams
@ 2018-05-09 17:42 ` Elijah Newren
  2018-05-09 17:54   ` Duy Nguyen
  2018-05-09 17:50 ` Stefan Beller
  2018-05-10  6:13 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  3 siblings, 1 reply; 11+ messages in thread
From: Elijah Newren @ 2018-05-09 17:42 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano

On Wed, May 9, 2018 at 10:04 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> the_repository is special. One of the special things about it is that
> it does not allocate a new index_state object like submodules but
> points to the global the_index variable instead. As a global variable,
> the_index cannot be free()'d.
>
> Add an exception for this in repo_clear(). In the future perhaps we
> would be able to allocate the_repository's index on heap too. Then we
> can remove revert this.

"remove revert"?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 17:04 [PATCH] repository: fix free problem with repo_clear(the_repository) Nguyễn Thái Ngọc Duy
  2018-05-09 17:17 ` Brandon Williams
  2018-05-09 17:42 ` Elijah Newren
@ 2018-05-09 17:50 ` Stefan Beller
  2018-05-09 18:00   ` Duy Nguyen
  2018-05-10  9:27   ` Junio C Hamano
  2018-05-10  6:13 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  3 siblings, 2 replies; 11+ messages in thread
From: Stefan Beller @ 2018-05-09 17:50 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano

On Wed, May 9, 2018 at 10:04 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> the_repository is special. One of the special things about it is that
> it does not allocate a new index_state object like submodules but
> points to the global the_index variable instead. As a global variable,
> the_index cannot be free()'d.

ok. That is the situation we're in.

>
> Add an exception for this in repo_clear(). In the future perhaps we
> would be able to allocate the_repository's index on heap too. Then we
> can remove revert this.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  I was trying to test the new parsed_object_pool_clear() and found this.

So this would go with the latest sb/object-store-alloc ?

My impression was that we never call repo_clear() on
the_repository, which would allow us to special case
the_repository further just as I did in v2 of that series[1] by
having static allocations for certain objects in case of \
the_repository.

[1] https://public-inbox.org/git/20180501213403.14643-14-sbeller@google.com/

We could just deal with all the exceptions, but that makes repo_clear
ugly IMHO.

I would rather special case the_repository even more instead
of having it allocate all its things on the heap. (However we rather
want to profile it and argue with data....)

>  repository.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/repository.c b/repository.c
> index a4848c1bd0..f44733524a 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -238,7 +238,9 @@ void repo_clear(struct repository *repo)
>
>         if (repo->index) {
>                 discard_index(repo->index);
> -               FREE_AND_NULL(repo->index);
> +               if (repo->index != &the_index)
> +                       free(repo->index);
> +               repo->index = NULL;

So after this we have a "dangling" the_index.
It is not really dangling, but it is not part of the_repository any more
and many places still use the_index, it might make up for interesting
bugs?

What is your use case of repo_clear(the_repository)?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 17:42 ` Elijah Newren
@ 2018-05-09 17:54   ` Duy Nguyen
  0 siblings, 0 replies; 11+ messages in thread
From: Duy Nguyen @ 2018-05-09 17:54 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Stefan Beller, Junio C Hamano

On Wed, May 9, 2018 at 7:42 PM, Elijah Newren <newren@gmail.com> wrote:
> On Wed, May 9, 2018 at 10:04 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>> the_repository is special. One of the special things about it is that
>> it does not allocate a new index_state object like submodules but
>> points to the global the_index variable instead. As a global variable,
>> the_index cannot be free()'d.
>>
>> Add an exception for this in repo_clear(). In the future perhaps we
>> would be able to allocate the_repository's index on heap too. Then we
>> can remove revert this.
>
> "remove revert"?

It's obvious that double negatives are below me. I'm going to the next
level with double positives! "remove" should be removed.
-- 
Duy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 17:50 ` Stefan Beller
@ 2018-05-09 18:00   ` Duy Nguyen
  2018-05-09 18:09     ` Stefan Beller
  2018-05-10  6:18     ` Duy Nguyen
  2018-05-10  9:27   ` Junio C Hamano
  1 sibling, 2 replies; 11+ messages in thread
From: Duy Nguyen @ 2018-05-09 18:00 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Junio C Hamano

On Wed, May 9, 2018 at 7:50 PM, Stefan Beller <sbeller@google.com> wrote:
>>  I was trying to test the new parsed_object_pool_clear() and found this.
>
> So this would go with the latest sb/object-store-alloc ?

No this should be separate because sb/object-store-alloc did not even
touch this code. I mistakenly thought you wrote this and sent to you.
I should have checked and sent to Brandon instead.

> My impression was that we never call repo_clear() on
> the_repository, which would allow us to special case
> the_repository further just as I did in v2 of that series[1] by
> having static allocations for certain objects in case of \
> the_repository.
>
> [1] https://public-inbox.org/git/20180501213403.14643-14-sbeller@google.com/
>
> We could just deal with all the exceptions, but that makes repo_clear
> ugly IMHO.
>
> I would rather special case the_repository even more instead
> of having it allocate all its things on the heap. (However we rather
> want to profile it and argue with data....)

I'm actually going the opposite direction and trying hard to make
the_repository normal like everybody else :)

>>  repository.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/repository.c b/repository.c
>> index a4848c1bd0..f44733524a 100644
>> --- a/repository.c
>> +++ b/repository.c
>> @@ -238,7 +238,9 @@ void repo_clear(struct repository *repo)
>>
>>         if (repo->index) {
>>                 discard_index(repo->index);
>> -               FREE_AND_NULL(repo->index);
>> +               if (repo->index != &the_index)
>> +                       free(repo->index);
>> +               repo->index = NULL;
>
> So after this we have a "dangling" the_index.
> It is not really dangling, but it is not part of the_repository any more
> and many places still use the_index, it might make up for interesting
> bugs?

Hmm.. yeah, as a "clearing" operation, I probaly should not clear
repo->index. This way the_repository will be back in initial state and
could be reused again. Something like this perhaps?

discard_index(repo->index);
if (repo->index != &the_index)
        FREE_AND_NULL(repo->index);

> What is your use case of repo_clear(the_repository)?

No actual use case right now. See [1] for the code that triggered
this. I do want to free some memory in pack-objects and repo_clear()
_might_ be the one. I'm not sure yet.

[1] https://gist.github.com/pclouds/86a2df6c28043f1b6fa3d4e72e7a1276
-- 
Duy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 18:00   ` Duy Nguyen
@ 2018-05-09 18:09     ` Stefan Beller
  2018-05-10  6:18     ` Duy Nguyen
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Beller @ 2018-05-09 18:09 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git, Junio C Hamano

On Wed, May 9, 2018 at 11:00 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Wed, May 9, 2018 at 7:50 PM, Stefan Beller <sbeller@google.com> wrote:
>>>  I was trying to test the new parsed_object_pool_clear() and found this.
>>
>> So this would go with the latest sb/object-store-alloc ?
>
> No this should be separate because sb/object-store-alloc did not even
> touch this code. I mistakenly thought you wrote this and sent to you.
> I should have checked and sent to Brandon instead.

Yes, they do not produce merge conflicts and do not semantically
rely on each other. Except as noted below
the object store series complicated things in a non-latest version
of it, such that we'd have to add more special casing. Now everything
is fine.


>> I would rather special case the_repository even more instead
>> of having it allocate all its things on the heap. (However we rather
>> want to profile it and argue with data....)
>
> I'm actually going the opposite direction and trying hard to make
> the_repository normal like everybody else :)

ok, both Brandon and you want to do this, so I guess we'll just
go this route for now.

>
> discard_index(repo->index);
> if (repo->index != &the_index)
>         FREE_AND_NULL(repo->index);
>
>> What is your use case of repo_clear(the_repository)?
>
> No actual use case right now. See [1] for the code that triggered
> this. I do want to free some memory in pack-objects and repo_clear()
> _might_ be the one. I'm not sure yet.

This diff seems good to me. as any repo is cleared and wil have the minimal
amount of memory. the_repository clears its the_index which is also fine
as it has the minimal amount of memory then, too

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 17:04 [PATCH] repository: fix free problem with repo_clear(the_repository) Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2018-05-09 17:50 ` Stefan Beller
@ 2018-05-10  6:13 ` Nguyễn Thái Ngọc Duy
  3 siblings, 0 replies; 11+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2018-05-10  6:13 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Stefan Beller, Brandon Williams,
	Nguyễn Thái Ngọc Duy

the_repository is special. One of the special things about it is that
it does not allocate a new index_state object like submodules but
points to the global the_index variable instead. As a global variable,
the_index cannot be free()'d.

Add an exception for this in repo_clear(). In the future perhaps we
would be able to allocate the_repository's index on heap too. Then we
can revert this.

the_repository->index remains pointed to a clean the_index even after
repo_clear() so that it could still be used next time (e.g. in a crazy
use case where a dev switches repo in the same process).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 v2 keeps the_repository->index pointed to the_index even after cleared

 repository.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index a4848c1bd0..4143073f89 100644
--- a/repository.c
+++ b/repository.c
@@ -238,7 +238,8 @@ void repo_clear(struct repository *repo)
 
 	if (repo->index) {
 		discard_index(repo->index);
-		FREE_AND_NULL(repo->index);
+		if (repo->index != &the_index)
+			FREE_AND_NULL(repo->index);
 	}
 }
 
-- 
2.17.0.705.g3525833791


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 18:00   ` Duy Nguyen
  2018-05-09 18:09     ` Stefan Beller
@ 2018-05-10  6:18     ` Duy Nguyen
  1 sibling, 0 replies; 11+ messages in thread
From: Duy Nguyen @ 2018-05-10  6:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Junio C Hamano

On Wed, May 9, 2018 at 8:00 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> discard_index(repo->index);
> if (repo->index != &the_index)
>         FREE_AND_NULL(repo->index);
>
>> What is your use case of repo_clear(the_repository)?
>
> No actual use case right now. See [1] for the code that triggered
> this. I do want to free some memory in pack-objects and repo_clear()
> _might_ be the one. I'm not sure yet.

Another use case for repo_clear(the_repository) is "git worktree
move". Part of the reason I did not support moving the main repository
with that command is because I wanted to shut down every access  to
$MAIN_WORK_TREE/.git before moving $MAIN_WORK_TREE. It's probably not
a problem for linux/mac platforms to move files (on the same file
system [1]) with file descriptors still open, but I'm pretty sure it
won't work on Windows. If repo_clear() does its job well, I should be
able to safely move $GIT_WORK_TREE after that.

[1] if we move across file systems then another set of problems arise:
if file descriptors remain open, writing to those will not affect the
new copies in the target. We do not support moving across filesystems
yet, but we should not shut that door now.
-- 
Duy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-09 17:50 ` Stefan Beller
  2018-05-09 18:00   ` Duy Nguyen
@ 2018-05-10  9:27   ` Junio C Hamano
  2018-05-10 16:34     ` Stefan Beller
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2018-05-10  9:27 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Nguyễn Thái Ngọc Duy, git

Stefan Beller <sbeller@google.com> writes:

> So this would go with the latest sb/object-store-alloc ?
>
> My impression was that we never call repo_clear() on
> the_repository, which would allow us to special case
> the_repository further just as I did in v2 of that series[1] by
> having static allocations for certain objects in case of \
> the_repository.
>
> [1] https://public-inbox.org/git/20180501213403.14643-14-sbeller@google.com/
>
> We could just deal with all the exceptions, but that makes repo_clear
> ugly IMHO.

So perhaps

         void repo_clear(...)
         {
        +       if (repo == the_repository)
        +               BUG("repo_clear() called on the repository");
                ...

or something?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] repository: fix free problem with repo_clear(the_repository)
  2018-05-10  9:27   ` Junio C Hamano
@ 2018-05-10 16:34     ` Stefan Beller
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Beller @ 2018-05-10 16:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git

Hi Junio,

On Thu, May 10, 2018 at 2:27 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> So this would go with the latest sb/object-store-alloc ?
>>
>> My impression was that we never call repo_clear() on
>> the_repository, which would allow us to special case
>> the_repository further just as I did in v2 of that series[1] by
>> having static allocations for certain objects in case of \
>> the_repository.
>>
>> [1] https://public-inbox.org/git/20180501213403.14643-14-sbeller@google.com/
>>
>> We could just deal with all the exceptions, but that makes repo_clear
>> ugly IMHO.
>
> So perhaps
>
>          void repo_clear(...)
>          {
>         +       if (repo == the_repository)
>         +               BUG("repo_clear() called on the repository");
>                 ...
>
> or something?

This would work, but Duy convinced me to have repo_clear working
on the_repository is a good idea by giving a minimal test helper[1],
which helped me to find leaks[2][3], so I'd rather go with the solution
by Duy in [4] from a developers perspective.

Stefan

[1] https://public-inbox.org/git/CACsJy8C7N2W821H8YR8VaKdCSOSCDtQi_YT7z8hHNDO-VxJmEA@mail.gmail.com/
    https://gist.github.com/pclouds/86a2df6c28043f1b6fa3d4e72e7a1276
[2] https://public-inbox.org/git/20180510001211.163692-1-sbeller@google.com/
[3] https://public-inbox.org/git/20180509234059.52156-1-sbeller@google.com/
[4] https://public-inbox.org/git/CACsJy8AdJcQpiGrR3p6xfuqU0=qoP3=StgbWRNCkdfka6di+5w@mail.gmail.com/

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-05-10 16:34 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-09 17:04 [PATCH] repository: fix free problem with repo_clear(the_repository) Nguyễn Thái Ngọc Duy
2018-05-09 17:17 ` Brandon Williams
2018-05-09 17:42 ` Elijah Newren
2018-05-09 17:54   ` Duy Nguyen
2018-05-09 17:50 ` Stefan Beller
2018-05-09 18:00   ` Duy Nguyen
2018-05-09 18:09     ` Stefan Beller
2018-05-10  6:18     ` Duy Nguyen
2018-05-10  9:27   ` Junio C Hamano
2018-05-10 16:34     ` Stefan Beller
2018-05-10  6:13 ` [PATCH v2] " Nguyễn Thái Ngọc Duy

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).