[PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
@ 2017-08-27  7:37 Martin Ågren
  2017-08-27 15:41 ` Jeff King
  2017-08-27 18:21 ` Lars Schneider
  0 siblings, 2 replies; 43+ messages in thread
From: Martin Ågren @ 2017-08-27  7:37 UTC (permalink / raw)
  To: git; +Cc: Lars Schneider, Jeff King

The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add
packet_write_fmt_gently()", 2016-10-16). As a result, for each call to
packet_write_fmt_1, we allocate and leak a buffer.

We could keep the strbuf non-static and instead make sure we always
release it before returning (but not before we die, so that we don't
touch errno). That would also prepare us for threaded use. But until
that needs to happen, let's just restore the static-ness so that we get
back to a situation where we (eventually) do not continuosly keep
allocating memory.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
I waffled between "fixing the memory leak" by releasing the buffer and
"fixing the static-ness" as in this patch. I can see arguments both ways
and don't have any strong opinion.

 pkt-line.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/pkt-line.c b/pkt-line.c
index 7db911957..f364944b9 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -136,9 +136,10 @@ static void format_packet(struct strbuf *out, const char *fmt, va_list args)
 static int packet_write_fmt_1(int fd, int gently,
 			      const char *fmt, va_list args)
 {
-	struct strbuf buf = STRBUF_INIT;
+	static struct strbuf buf = STRBUF_INIT;
 	ssize_t count;

+	strbuf_reset(&buf);
 	format_packet(&buf, fmt, args);
 	count = write_in_full(fd, buf.buf, buf.len);
 	if (count == buf.len)
-- 
2.14.1.151.g45c1275a3.dirty

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27  7:37 [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1() Martin Ågren
@ 2017-08-27 15:41 ` Jeff King
  2017-08-27 18:25   ` Jeff King
  2017-08-27 18:21 ` Lars Schneider
  1 sibling, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-27 15:41 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Lars Schneider

On Sun, Aug 27, 2017 at 09:37:32AM +0200, Martin Ågren wrote:

> The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add
> packet_write_fmt_gently()", 2016-10-16). As a result, for each call to
> packet_write_fmt_1, we allocate and leak a buffer.
> 
> We could keep the strbuf non-static and instead make sure we always
> release it before returning (but not before we die, so that we don't
> touch errno). That would also prepare us for threaded use. But until
> that needs to happen, let's just restore the static-ness so that we get
> back to a situation where we (eventually) do not continuosly keep
> allocating memory.

Ouch. So this means that git since v2.11 is basically leaking every
non-byte pack sent by upload-pack (so all of the ref advertisement and
want/have negotiation).

I'm surprised nobody noticed the extra memory use, but I guess those
things aren't usually _too_ big.

> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> I waffled between "fixing the memory leak" by releasing the buffer and
> "fixing the static-ness" as in this patch. I can see arguments both ways
> and don't have any strong opinion.

I think this is a good fix for now as it takes as back to the pre-bug
state.  The only downside with the static buffer is that it's not
reentrant.  Since the function is just inherently writing out the result
and then forgetting it, in a single thread there's no opportunity for a
sub-function to try writing another packet. And I don't think we have
any code paths that write packets from multiple threads. That may be
something we do eventually, but we can deal with it then (and until
then, it's nice to avoid the malloc/free overhead).

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27 15:41 ` Jeff King
@ 2017-08-27 18:25   ` Jeff King
  0 siblings, 0 replies; 43+ messages in thread
From: Jeff King @ 2017-08-27 18:25 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Lars Schneider

On Sun, Aug 27, 2017 at 11:41:52AM -0400, Jeff King wrote:

> Ouch. So this means that git since v2.11 is basically leaking every
> non-byte pack sent by upload-pack (so all of the ref advertisement and
> want/have negotiation).

Determined experimentally that the answer is: yes.

The increase in ref advertisement is roughly linear with the size of
your packed-refs, so for most people this wasn't a big deal (and even if
you have insanely large packed-refs, it's "only" 2-3x the RAM;
potentially a problem, but only if you're close to the tipping point
already).

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27  7:37 [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1() Martin Ågren
  2017-08-27 15:41 ` Jeff King
@ 2017-08-27 18:21 ` Lars Schneider
  2017-08-27 19:09   ` Martin Ågren
  1 sibling, 1 reply; 43+ messages in thread
From: Lars Schneider @ 2017-08-27 18:21 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git Users, Jeff King


> On 27 Aug 2017, at 09:37, Martin Ågren <martin.agren@gmail.com> wrote:
> 
> The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add
> packet_write_fmt_gently()", 2016-10-16). As a result, for each call to
> packet_write_fmt_1, we allocate and leak a buffer.

Oh :-( Thanks for detecting and fixing the leak.

How did you actually find it? Reading the code or did you use some
tool?


> We could keep the strbuf non-static and instead make sure we always
> release it before returning (but not before we die, so that we don't
> touch errno). That would also prepare us for threaded use. But until
> that needs to happen, let's just restore the static-ness so that we get
> back to a situation where we (eventually) do not continuosly keep
> allocating memory.
> 
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> I waffled between "fixing the memory leak" by releasing the buffer and
> "fixing the static-ness" as in this patch. I can see arguments both ways
> and don't have any strong opinion.
> 
> pkt-line.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/pkt-line.c b/pkt-line.c
> index 7db911957..f364944b9 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -136,9 +136,10 @@ static void format_packet(struct strbuf *out, const char *fmt, va_list args)
> static int packet_write_fmt_1(int fd, int gently,
> 			      const char *fmt, va_list args)
> {
> -	struct strbuf buf = STRBUF_INIT;
> +	static struct strbuf buf = STRBUF_INIT;
> 	ssize_t count;
> 
> +	strbuf_reset(&buf);
> 	format_packet(&buf, fmt, args);
> 	count = write_in_full(fd, buf.buf, buf.len);
> 	if (count == buf.len)
> -- 
> 2.14.1.151.g45c1275a3.dirty

Looks good to me!

- Lars

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27 18:21 ` Lars Schneider
@ 2017-08-27 19:09   ` Martin Ågren
  2017-08-27 19:15     ` Jeff King
  2017-08-27 20:04     ` Lars Schneider
  0 siblings, 2 replies; 43+ messages in thread
From: Martin Ågren @ 2017-08-27 19:09 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git Users, Jeff King

On 27 August 2017 at 20:21, Lars Schneider <larsxschneider@gmail.com> wrote:
>
>> On 27 Aug 2017, at 09:37, Martin Ågren <martin.agren@gmail.com> wrote:
>>
>> The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add
>> packet_write_fmt_gently()", 2016-10-16). As a result, for each call to
>> packet_write_fmt_1, we allocate and leak a buffer.
>
> Oh :-( Thanks for detecting and fixing the leak.
>
> How did you actually find it? Reading the code or did you use some
> tool?

Valgrind found it for me. Most leaks it reports are "ok" since we're
about to exit, it just takes more or less effort to realize that...

Martin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27 19:09   ` Martin Ågren
@ 2017-08-27 19:15     ` Jeff King
  2017-08-27 20:04     ` Lars Schneider
  1 sibling, 0 replies; 43+ messages in thread
From: Jeff King @ 2017-08-27 19:15 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Lars Schneider, Git Users

On Sun, Aug 27, 2017 at 09:09:15PM +0200, Martin Ågren wrote:

> On 27 August 2017 at 20:21, Lars Schneider <larsxschneider@gmail.com> wrote:
> >
> >> On 27 Aug 2017, at 09:37, Martin Ågren <martin.agren@gmail.com> wrote:
> >>
> >> The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add
> >> packet_write_fmt_gently()", 2016-10-16). As a result, for each call to
> >> packet_write_fmt_1, we allocate and leak a buffer.
> >
> > Oh :-( Thanks for detecting and fixing the leak.
> >
> > How did you actually find it? Reading the code or did you use some
> > tool?
> 
> Valgrind found it for me. Most leaks it reports are "ok" since we're
> about to exit, it just takes more or less effort to realize that...

This is one more thing it would be nice to have as part of our
perf-testing framework. It will run two versions of Git across a battery
of tests and report on the runtime differences for each. It would be
great if it did the same for peak heap. The tricky thing is that you
sometimes need repositories that are exaggerated in size in one
dimension to notice the differences as significant. So I don't know if
we would need new tests for this, or if existing "this repo has a lot of
refs" tests would have caught this.

Another approach, of course, is to get valgrind (or asan) to a
zero-leaks-reported steady state, and then even small leak reports would
be worth investigating. I'm not sure how easy it is to get there, and
whether it's less work to actually plug free-on-exit leaks or somehow
suppress the reports. I think most free-on-exit false positives would
show up as "leaked but still reachable", whereas leaks like this that
can grow without bound mean would not be reachable (we throw away the
pointer to the memory at the end of the function).

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27 19:09   ` Martin Ågren
  2017-08-27 19:15     ` Jeff King
@ 2017-08-27 20:04     ` Lars Schneider
  2017-08-27 23:23       ` Jeff King
  1 sibling, 1 reply; 43+ messages in thread
From: Lars Schneider @ 2017-08-27 20:04 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git Users, Jeff King


> On 27 Aug 2017, at 21:09, Martin Ågren <martin.agren@gmail.com> wrote:
> 
> On 27 August 2017 at 20:21, Lars Schneider <larsxschneider@gmail.com> wrote:
>> 
>>> On 27 Aug 2017, at 09:37, Martin Ågren <martin.agren@gmail.com> wrote:
>>> 
>>> The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add
>>> packet_write_fmt_gently()", 2016-10-16). As a result, for each call to
>>> packet_write_fmt_1, we allocate and leak a buffer.
>> 
>> Oh :-( Thanks for detecting and fixing the leak.
>> 
>> How did you actually find it? Reading the code or did you use some
>> tool?
> 
> Valgrind found it for me. Most leaks it reports are "ok" since we're
> about to exit, it just takes more or less effort to realize that...

I am sorry if I am bugging you here, but I would really like to 
understand how you found it.

I did run all tests under valgrind using "make valgrind" and I found
the following files with potential issues:

cat valgrind.out | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | uniq -c
7102
   2 clone.c
  33 common-main.c
   6 connect.c
  64 git.c
   4 ls-remote.c
 126 run-command.c
  12 transport.c
   7 worktree.c


No mention of "pkt-line.c". Did you run Git with valgrind on one of 
your repositories to find it?

Thanks,
Lars

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27 20:04     ` Lars Schneider
@ 2017-08-27 23:23       ` Jeff King
  2017-08-28  4:11         ` Martin Ågren
  0 siblings, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-27 23:23 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Martin Ågren, Git Users

On Sun, Aug 27, 2017 at 10:04:55PM +0200, Lars Schneider wrote:

> I did run all tests under valgrind using "make valgrind" and I found
> the following files with potential issues:
> 
> cat valgrind.out | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | uniq -c
> 7102
>    2 clone.c
>   33 common-main.c
>    6 connect.c
>   64 git.c
>    4 ls-remote.c
>  126 run-command.c
>   12 transport.c
>    7 worktree.c

I'm not sure where valgrind.out comes from. The individual
test-results/*.out files may have valgrind output, but I don't think
they usually contain leak output.

Doing "valgrind ./git-upload-pack . </dev/null >/dev/null" mentions
leaked memory but not the locations. Adding --leak-check=full shows that
most of it comes from format_packet().

And applying Martin's patch drops the "definitely lost" category down to
0 bytes (there's still 550k in "still reachable", but those are in the
"exit will free them for us" category).

> No mention of "pkt-line.c". Did you run Git with valgrind on one of 
> your repositories to find it?

I'm curious, too. I don't think the valgrind setup in our test suite is
great for finding leaks right now.

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-27 23:23       ` Jeff King
@ 2017-08-28  4:11         ` Martin Ågren
  2017-08-28 17:52           ` Stefan Beller
  2017-08-29 17:51           ` Lars Schneider
  0 siblings, 2 replies; 43+ messages in thread
From: Martin Ågren @ 2017-08-28  4:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Lars Schneider, Git Users

On 28 August 2017 at 01:23, Jeff King <peff@peff.net> wrote:
> On Sun, Aug 27, 2017 at 10:04:55PM +0200, Lars Schneider wrote:
>
>> I did run all tests under valgrind using "make valgrind" and I found
>> the following files with potential issues:
>>
>> cat valgrind.out | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | uniq -c
>> 7102
>>    2 clone.c
>>   33 common-main.c
>>    6 connect.c
>>   64 git.c
>>    4 ls-remote.c
>>  126 run-command.c
>>   12 transport.c
>>    7 worktree.c
>
> I'm not sure where valgrind.out comes from. The individual
> test-results/*.out files may have valgrind output, but I don't think
> they usually contain leak output.
>
> Doing "valgrind ./git-upload-pack . </dev/null >/dev/null" mentions
> leaked memory but not the locations. Adding --leak-check=full shows that
> most of it comes from format_packet().
>
> And applying Martin's patch drops the "definitely lost" category down to
> 0 bytes (there's still 550k in "still reachable", but those are in the
> "exit will free them for us" category).
>
>> No mention of "pkt-line.c". Did you run Git with valgrind on one of
>> your repositories to find it?
>
> I'm curious, too. I don't think the valgrind setup in our test suite is
> great for finding leaks right now.

Sorry for being brief. I've patched t/valgrind/valgrind.sh to say
"--leak-check=yes". Then I run "./t0000 --valgrind", simply because
running the complete suite gives more reports than I could possibly
process.

Then I check the first few leaks, verify that they're "ok" and add
them to a suppressions-list. Lather, rinse, repeat. A couple of very
targeted and well-motivated suppressions in git.git could perhaps be
motivated, but there are many many reported leaks. My suppressions-list
is getting gross.

I started with t0000 and t0001 because I figure, once I have those basic
suppressions in place (or leaks fixed), I can run some other more
interesting tests. Of course, the concept of "this leak is ok" is a bit
subjective. For example, we might do "return !!create_object(...);"
(function name invented on the fly), which is a leak, and unreachable.
But if this is only done once in builtin/foo.c and the object created is
small, then this could be deemed "ok", since in practice this leak will
never bring anyone over the cliff. If clean-ups in such code would not
just be code churn, then I can of course adjust my definition of "ok"
accordingly.

This is not an attempt to find and fix a huge number of leaks, it's more
to have a good reason to go through call-stacks, convince myself I know
what the code wants to do and how it does it.

Looking at only "unreachable" leaks seems like it should be an
improvement for finding the interesting cases. I'll have less time for
Git this week, but can try it out as time permits.

Thanks for your feedback, both of you.

Martin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-28  4:11         ` Martin Ågren
@ 2017-08-28 17:52           ` Stefan Beller
  2017-08-28 17:58             ` Jeff King
  2017-08-29 17:51           ` Lars Schneider
  1 sibling, 1 reply; 43+ messages in thread
From: Stefan Beller @ 2017-08-28 17:52 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Jeff King, Lars Schneider, Git Users

>>> No mention of "pkt-line.c". Did you run Git with valgrind on one of
>>> your repositories to find it?
>>
>> I'm curious, too. I don't think the valgrind setup in our test suite is
>> great for finding leaks right now.
>

This discussion comes up every once in a while,
the last time I was involved in this discussion I proposed
to have an "optional_free(void *)", which only frees memory
in e.g. the developer build/debug build.

That way we can have a strict "no leaks in developer build"
policy (as it is easy to detect!), but it doesn't slow down the
widely distributed version of Git.

IIRC, then the discussion focused on "what if we misuse
optional_free and the real free", such that the widely distributed
version just leaks large amounts of memory and the developers
do not notice, but quite the opposite(!) the developers would feel
safe to ignore any memory complaints, because the developer build
is clean.

Stefan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-28 17:52           ` Stefan Beller
@ 2017-08-28 17:58             ` Jeff King
  2017-09-05 10:03               ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-28 17:58 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Martin Ågren, Lars Schneider, Git Users

On Mon, Aug 28, 2017 at 10:52:51AM -0700, Stefan Beller wrote:

> >> I'm curious, too. I don't think the valgrind setup in our test suite is
> >> great for finding leaks right now.
> 
> This discussion comes up every once in a while,
> the last time I was involved in this discussion I proposed
> to have an "optional_free(void *)", which only frees memory
> in e.g. the developer build/debug build.
> 
> That way we can have a strict "no leaks in developer build"
> policy (as it is easy to detect!), but it doesn't slow down the
> widely distributed version of Git.

Personally I am not that worried about slowing down program-exit with
some free() calls (though I would reserve judgement to see how slow it
actually is).

It is that I do not want to deal with the complexity and maintenance
headache of dropping values which are program-lifetime caches. If we
introduce a free-all-object-structs function, now everybody needs to
remember to call it (even if they didn't realize they were allocating
object structs in the first place, as it may have happened under the
hood of a sub-function). And we have to wonder what might happen when
somebody calls that function _not_ at program exit.

If we can declare "reachable things are not leaks" (and certainly
valgrind is aware of that distinction), then all of that goes away. And
you're just left with local variables in main() (or our cmd_*
equivalents) that appear as leaks. And then we can solve that either by
freeing them, or just calling them program-lifetime and telling the
compiler that it's so by declaring them "static".

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-28 17:58             ` Jeff King
@ 2017-09-05 10:03               ` Junio C Hamano
  0 siblings, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2017-09-05 10:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, Martin Ågren, Lars Schneider, Git Users

Jeff King <peff@peff.net> writes:

> On Mon, Aug 28, 2017 at 10:52:51AM -0700, Stefan Beller wrote:
>
>> >> I'm curious, too. I don't think the valgrind setup in our test suite is
>> >> great for finding leaks right now.
>> 
>> This discussion comes up every once in a while,
>> the last time I was involved in this discussion I proposed
>> to have an "optional_free(void *)", which only frees memory
>> in e.g. the developer build/debug build.
>> 
>> That way we can have a strict "no leaks in developer build"
>> policy (as it is easy to detect!), but it doesn't slow down the
>> widely distributed version of Git.
>
> Personally I am not that worried about slowing down program-exit with
> some free() calls (though I would reserve judgement to see how slow it
> actually is).
>
> It is that I do not want to deal with the complexity and maintenance
> headache of dropping values which are program-lifetime caches. If we
> introduce a free-all-object-structs function, now everybody needs to
> remember to call it (even if they didn't realize they were allocating
> object structs in the first place, as it may have happened under the
> hood of a sub-function). And we have to wonder what might happen when
> somebody calls that function _not_ at program exit.

In addition, the code earlier may have a variable point at a
compiled in literal string or an allocated string depending on the
situation and it would have been perfectly fine to rely on exit() to
clean it up.  "We free all ourselves before exit()" would mean these
codepaths now need to be updated to keep track of what needs to be
and what must not be freed, or just duplicate everything to make the
life of that "free everything" part easier, which somehow feels like
a tail wagging a dog.

> If we can declare "reachable things are not leaks" (and certainly
> valgrind is aware of that distinction), then all of that goes away. And
> you're just left with local variables in main() (or our cmd_*
> equivalents) that appear as leaks. And then we can solve that either by
> freeing them, or just calling them program-lifetime and telling the
> compiler that it's so by declaring them "static".

Yup.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-28  4:11         ` Martin Ågren
  2017-08-28 17:52           ` Stefan Beller
@ 2017-08-29 17:51           ` Lars Schneider
  2017-08-29 18:53             ` Jeff King
  1 sibling, 1 reply; 43+ messages in thread
From: Lars Schneider @ 2017-08-29 17:51 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Jeff King, Git Users


> On 28 Aug 2017, at 06:11, Martin Ågren <martin.agren@gmail.com> wrote:
> 
> On 28 August 2017 at 01:23, Jeff King <peff@peff.net> wrote:
>> On Sun, Aug 27, 2017 at 10:04:55PM +0200, Lars Schneider wrote:
>> 
>>> I did run all tests under valgrind using "make valgrind" and I found
>>> the following files with potential issues:
>>> 
>>> cat valgrind.out | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | uniq -c
>>> 7102
>>>  2 clone.c
>>> 33 common-main.c
>>>  6 connect.c
>>> 64 git.c
>>>  4 ls-remote.c
>>> 126 run-command.c
>>> 12 transport.c
>>>  7 worktree.c
>> 
>> I'm not sure where valgrind.out comes from. The individual
>> test-results/*.out files may have valgrind output, but I don't think
>> they usually contain leak output.
>> 
>> Doing "valgrind ./git-upload-pack . </dev/null >/dev/null" mentions
>> leaked memory but not the locations. Adding --leak-check=full shows that
>> most of it comes from format_packet().
>> 
>> And applying Martin's patch drops the "definitely lost" category down to
>> 0 bytes (there's still 550k in "still reachable", but those are in the
>> "exit will free them for us" category).
>> 
>>> No mention of "pkt-line.c". Did you run Git with valgrind on one of
>>> your repositories to find it?
>> 
>> I'm curious, too. I don't think the valgrind setup in our test suite is
>> great for finding leaks right now.
> 
> Sorry for being brief. I've patched t/valgrind/valgrind.sh to say
> "--leak-check=yes". Then I run "./t0000 --valgrind", simply because
> running the complete suite gives more reports than I could possibly
> process.

I set $TOOL_OPTIONS in valgrind.sh: to 
'--leak-check=full --errors-for-leak-kinds=definite'

... but I also had to adjust t/test-lib-functions.sh:test_create_repo
as I ran into the error "cannot run git init -- have you built things yet?".

With these changes I was able to see the leak running valgrind with t5110.

---

What if we run a few selected tests with valgrind and count all files that
valgrind mentions (a single leak has multiple file mentions because of
the stack trace and other leak indicators). We record these counts and let 
TravisCI scream if one of the numbers increases.

I wonder how stable/fragile such a metric would be as a simple refactoring 
could easily change these numbers. Below I ran valgrind on t5510 before and
after Martin's patch. The diff below clearly shows the pkt-line leak.

Would it make sense to pursue something like this in TravisCI to avoid 
"pkt-line" kind of leaks in the future? 


## Valgrind run with leak

$ ./t5510-fetch.sh --valgrind | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | uniq -c
15529
  39 abspath.c
  30 add.c
  34 branch.c
  10 bundle.c
 268 clone.c
  13 commit.c
 471 common-main.c
  52 config.c
  50 connect.c
   4 connected.c
   1 diff-lib.c
 102 dir.c
 120 environment.c
   4 fetch-pack.c
  47 fetch.c
  14 files-backend.c
   1 git-compat-util.h
 871 git.c
  96 init-db.c
  26 iterator.c
   1 list-objects.c
   4 log-tree.c
   2 object.c
   1 pack-objects.c
   6 parse-options.c
  10 pathspec.c
  83 pkt-line.c
   6 precompose_utf8.c
   2 push.c
  67 refs.c
  96 remote.c
  90 repository.c
   4 rev-list.c
  10 revision.c
 165 run-command.c
  42 setup.c
 288 strbuf.c
  51 strbuf.h
   2 tag.c
 107 transport.c
  10 tree-diff.c
  85 upload-pack.c
   1 usage.c
 455 wrapper.c


## Valgrind run with Matrin's patch to fix the leak
$ ./t5510-fetch.sh --valgrind | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | uniq -c
14931
  39 abspath.c
  30 add.c
  34 branch.c
  10 bundle.c
 268 clone.c
  13 commit.c
 433 common-main.c
  52 config.c
  50 connect.c
   6 connected.c
   1 diff-lib.c
 102 dir.c
 120 environment.c
   4 fetch-pack.c
  53 fetch.c
  14 files-backend.c
   1 git-compat-util.h
 879 git.c
  96 init-db.c
   1 iterator.c
   1 list-objects.c
   4 log-tree.c
   2 object.c
   1 pack-objects.c
   6 parse-options.c
  10 pathspec.c
   6 precompose_utf8.c
   2 push.c
  26 refs.c
  96 remote.c
  90 repository.c
   6 rev-list.c
  14 revision.c
 171 run-command.c
  42 setup.c
 246 strbuf.c
  50 strbuf.h
   2 tag.c
 107 transport.c
  10 tree-diff.c
   2 upload-pack.c
   1 usage.c
 415 wrapper.c


## Diff

15529                    | 14931 
  39 abspath.c               39 abspath.c
  30 add.c                   30 add.c
  34 branch.c                34 branch.c
  10 bundle.c                10 bundle.c
 268 clone.c                268 clone.c
  13 commit.c                13 commit.c
 471 common-main.c       |  433 common-main.c
  52 config.c                52 config.c
  50 connect.c               50 connect.c
   4 connected.c         |    6 connected.c
   1 diff-lib.c               1 diff-lib.c
 102 dir.c                  102 dir.c
 120 environment.c          120 environment.c
   4 fetch-pack.c             4 fetch-pack.c
  47 fetch.c             |   53 fetch.c
  14 files-backend.c         14 files-backend.c
   1 git-compat-util.h        1 git-compat-util.h
 871 git.c               |  879 git.c
  96 init-db.c               96 init-db.c
  26 iterator.c          |    1 iterator.c
   1 list-objects.c           1 list-objects.c
   4 log-tree.c               4 log-tree.c
   2 object.c                 2 object.c
   1 pack-objects.c           1 pack-objects.c
   6 parse-options.c          6 parse-options.c
  10 pathspec.c              10 pathspec.c
  83 pkt-line.c          <
   6 precompose_utf8.c        6 precompose_utf8.c
   2 push.c                   2 push.c
  67 refs.c              |   26 refs.c
  96 remote.c                96 remote.c
  90 repository.c            90 repository.c
   4 rev-list.c          |    6 rev-list.c
  10 revision.c          |   14 revision.c
 165 run-command.c       |  171 run-command.c
  42 setup.c                 42 setup.c
 288 strbuf.c            |  246 strbuf.c
  51 strbuf.h            |   50 strbuf.h
   2 tag.c                    2 tag.c
 107 transport.c            107 transport.c
  10 tree-diff.c             10 tree-diff.c
  85 upload-pack.c       |    2 upload-pack.c
   1 usage.c                  1 usage.c
 455 wrapper.c           |  415 wrapp 


Thanks,
Lars

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-29 17:51           ` Lars Schneider
@ 2017-08-29 18:53             ` Jeff King
  2017-08-29 18:58               ` [PATCH] config: use a static lock_file struct Jeff King
  2017-08-29 19:22               ` [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1() Martin Ågren
  0 siblings, 2 replies; 43+ messages in thread
From: Jeff King @ 2017-08-29 18:53 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Martin Ågren, Git Users

On Tue, Aug 29, 2017 at 06:51:52PM +0100, Lars Schneider wrote:

> I set $TOOL_OPTIONS in valgrind.sh: to 
> '--leak-check=full --errors-for-leak-kinds=definite'
> 
> ... but I also had to adjust t/test-lib-functions.sh:test_create_repo
> as I ran into the error "cannot run git init -- have you built things yet?".

Yeah, this is a general problem with run-time analyzers. If we're not
mostly error free than the setup code often breaks before you even get
to the interesting part of the test.

It looks like the config code has a minor-ish leak. Patch to follow.

> What if we run a few selected tests with valgrind and count all files that
> valgrind mentions (a single leak has multiple file mentions because of
> the stack trace and other leak indicators). We record these counts and let 
> TravisCI scream if one of the numbers increases.
> 
> I wonder how stable/fragile such a metric would be as a simple refactoring 
> could easily change these numbers. Below I ran valgrind on t5510 before and
> after Martin's patch. The diff below clearly shows the pkt-line leak.
> 
> Would it make sense to pursue something like this in TravisCI to avoid 
> "pkt-line" kind of leaks in the future?

I don't think that would work, because simply adding new tests would
bump the leak count, without the code actually growing worse.

But think about your strategy for a moment: what you're really trying to
do is say "these existing leaks are OK because they are too many for us
to count, but we want to make sure we don't add _new_ ones". We already
have two techniques for distinguishing old ones from new ones:

  1. Mark existing ones with valgrind suppressions so they do not warn
     at all.

  2. Fix them, so that the "existing" count drops to zero.

Option (2), of course, has the added side effect that it's actually
fixing potential problems. :)

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] config: use a static lock_file struct
  2017-08-29 18:53             ` Jeff King
@ 2017-08-29 18:58               ` Jeff King
  2017-08-29 19:09                 ` Brandon Williams
  2017-09-06  3:59                 ` Junio C Hamano
  2017-08-29 19:22               ` [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1() Martin Ågren
  1 sibling, 2 replies; 43+ messages in thread
From: Jeff King @ 2017-08-29 18:58 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Martin Ågren, Git Users

On Tue, Aug 29, 2017 at 02:53:41PM -0400, Jeff King wrote:

> It looks like the config code has a minor-ish leak. Patch to follow.

Here it is.

-- >8 --
Subject: [PATCH] config: use a static lock_file struct

When modifying git config, we xcalloc() a struct lock_file
but never free it. This is necessary because the tempfile
code (upon which the locking code is built) requires that
the resulting struct remain valid through the life of the
program. However, it also confuses leak-checkers like
valgrind because only the inner "struct tempfile" is still
reachable; no pointer to the outer lock_file is kept.

Other code paths solve this by using a single static lock
struct. We can do the same here, because we know that we'll
only lock and modify one config file at a time (and
assertions within the lockfile code will ensure that this
remains the case).

That removes a real leak (when we fail to free the struct
after locking fails) as well as removes the valgrind false
positive. It also means that doing N sequential
config-writes will use a constant amount of memory, rather
than leaving stale lock_files for each.

Note that since "lock" is no longer a pointer, it can't be
NULL anymore. But that's OK. We used that feature only to
avoid calling rollback_lock_file() on an already-committed
lock. Since the lockfile code keeps its own "active" flag,
it's a noop to rollback an inactive lock, and we don't have
to worry about this ourselves.

Signed-off-by: Jeff King <peff@peff.net>
---
In the long run we may want to drop the "tempfiles must remain forever"
rule. This is certainly not the first time it has caused confusion or
leaks. And I don't think it's a fundamental issue, just the way the code
is written. But in the interim, this fix is probably worth doing.

 config.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/config.c b/config.c
index d0d8ce823a..1603f96e40 100644
--- a/config.c
+++ b/config.c
@@ -2450,7 +2450,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 {
 	int fd = -1, in_fd = -1;
 	int ret;
-	struct lock_file *lock = NULL;
+	static struct lock_file lock;
 	char *filename_buf = NULL;
 	char *contents = NULL;
 	size_t contents_sz;
@@ -2469,8 +2469,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	 * The lock serves a purpose in addition to locking: the new
 	 * contents of .git/config will be written into it.
 	 */
-	lock = xcalloc(1, sizeof(struct lock_file));
-	fd = hold_lock_file_for_update(lock, config_filename, 0);
+	fd = hold_lock_file_for_update(&lock, config_filename, 0);
 	if (fd < 0) {
 		error_errno("could not lock config file %s", config_filename);
 		free(store.key);
@@ -2583,8 +2582,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 		close(in_fd);
 		in_fd = -1;
 
-		if (chmod(get_lock_file_path(lock), st.st_mode & 07777) < 0) {
-			error_errno("chmod on %s failed", get_lock_file_path(lock));
+		if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
+			error_errno("chmod on %s failed", get_lock_file_path(&lock));
 			ret = CONFIG_NO_WRITE;
 			goto out_free;
 		}
@@ -2639,28 +2638,19 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 		contents = NULL;
 	}
 
-	if (commit_lock_file(lock) < 0) {
+	if (commit_lock_file(&lock) < 0) {
 		error_errno("could not write config file %s", config_filename);
 		ret = CONFIG_NO_WRITE;
-		lock = NULL;
 		goto out_free;
 	}
 
-	/*
-	 * lock is committed, so don't try to roll it back below.
-	 * NOTE: Since lockfile.c keeps a linked list of all created
-	 * lock_file structures, it isn't safe to free(lock).  It's
-	 * better to just leave it hanging around.
-	 */
-	lock = NULL;
 	ret = 0;
 
 	/* Invalidate the config cache */
 	git_config_clear();
 
 out_free:
-	if (lock)
-		rollback_lock_file(lock);
+	rollback_lock_file(&lock);
 	free(filename_buf);
 	if (contents)
 		munmap(contents, contents_sz);
@@ -2669,7 +2659,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	return ret;
 
 write_err_out:
-	ret = write_error(get_lock_file_path(lock));
+	ret = write_error(get_lock_file_path(&lock));
 	goto out_free;
 
 }
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-29 18:58               ` [PATCH] config: use a static lock_file struct Jeff King
@ 2017-08-29 19:09                 ` Brandon Williams
  2017-08-29 19:10                   ` Brandon Williams
  2017-08-29 19:12                   ` Jeff King
  2017-09-06  3:59                 ` Junio C Hamano
  1 sibling, 2 replies; 43+ messages in thread
From: Brandon Williams @ 2017-08-29 19:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Lars Schneider, Martin Ågren, Git Users

On 08/29, Jeff King wrote:
> On Tue, Aug 29, 2017 at 02:53:41PM -0400, Jeff King wrote:
> 
> > It looks like the config code has a minor-ish leak. Patch to follow.
> 
> Here it is.
> 
> -- >8 --
> Subject: [PATCH] config: use a static lock_file struct
> 
> When modifying git config, we xcalloc() a struct lock_file
> but never free it. This is necessary because the tempfile
> code (upon which the locking code is built) requires that
> the resulting struct remain valid through the life of the
> program. However, it also confuses leak-checkers like
> valgrind because only the inner "struct tempfile" is still
> reachable; no pointer to the outer lock_file is kept.

Is this just due to a limitation in the tempfile code?  Would it be
possible to improve the tempfile code such that we don't need to require
that a tempfile, once created, is required to exist for the remaining
life of the program?

> 
> Other code paths solve this by using a single static lock
> struct. We can do the same here, because we know that we'll
> only lock and modify one config file at a time (and
> assertions within the lockfile code will ensure that this
> remains the case).
> 
> That removes a real leak (when we fail to free the struct
> after locking fails) as well as removes the valgrind false
> positive. It also means that doing N sequential
> config-writes will use a constant amount of memory, rather
> than leaving stale lock_files for each.
> 
> Note that since "lock" is no longer a pointer, it can't be
> NULL anymore. But that's OK. We used that feature only to
> avoid calling rollback_lock_file() on an already-committed
> lock. Since the lockfile code keeps its own "active" flag,
> it's a noop to rollback an inactive lock, and we don't have
> to worry about this ourselves.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> In the long run we may want to drop the "tempfiles must remain forever"
> rule. This is certainly not the first time it has caused confusion or
> leaks. And I don't think it's a fundamental issue, just the way the code
> is written. But in the interim, this fix is probably worth doing.
> 
>  config.c | 24 +++++++-----------------
>  1 file changed, 7 insertions(+), 17 deletions(-)
> 
> diff --git a/config.c b/config.c
> index d0d8ce823a..1603f96e40 100644
> --- a/config.c
> +++ b/config.c
> @@ -2450,7 +2450,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
>  {
>  	int fd = -1, in_fd = -1;
>  	int ret;
> -	struct lock_file *lock = NULL;
> +	static struct lock_file lock;
>  	char *filename_buf = NULL;
>  	char *contents = NULL;
>  	size_t contents_sz;
> @@ -2469,8 +2469,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
>  	 * The lock serves a purpose in addition to locking: the new
>  	 * contents of .git/config will be written into it.
>  	 */
> -	lock = xcalloc(1, sizeof(struct lock_file));
> -	fd = hold_lock_file_for_update(lock, config_filename, 0);
> +	fd = hold_lock_file_for_update(&lock, config_filename, 0);
>  	if (fd < 0) {
>  		error_errno("could not lock config file %s", config_filename);
>  		free(store.key);
> @@ -2583,8 +2582,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
>  		close(in_fd);
>  		in_fd = -1;
>  
> -		if (chmod(get_lock_file_path(lock), st.st_mode & 07777) < 0) {
> -			error_errno("chmod on %s failed", get_lock_file_path(lock));
> +		if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
> +			error_errno("chmod on %s failed", get_lock_file_path(&lock));
>  			ret = CONFIG_NO_WRITE;
>  			goto out_free;
>  		}
> @@ -2639,28 +2638,19 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
>  		contents = NULL;
>  	}
>  
> -	if (commit_lock_file(lock) < 0) {
> +	if (commit_lock_file(&lock) < 0) {
>  		error_errno("could not write config file %s", config_filename);
>  		ret = CONFIG_NO_WRITE;
> -		lock = NULL;
>  		goto out_free;
>  	}
>  
> -	/*
> -	 * lock is committed, so don't try to roll it back below.
> -	 * NOTE: Since lockfile.c keeps a linked list of all created
> -	 * lock_file structures, it isn't safe to free(lock).  It's
> -	 * better to just leave it hanging around.
> -	 */
> -	lock = NULL;
>  	ret = 0;
>  
>  	/* Invalidate the config cache */
>  	git_config_clear();
>  
>  out_free:
> -	if (lock)
> -		rollback_lock_file(lock);
> +	rollback_lock_file(&lock);
>  	free(filename_buf);
>  	if (contents)
>  		munmap(contents, contents_sz);
> @@ -2669,7 +2659,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
>  	return ret;
>  
>  write_err_out:
> -	ret = write_error(get_lock_file_path(lock));
> +	ret = write_error(get_lock_file_path(&lock));
>  	goto out_free;
>  
>  }
> -- 
> 2.14.1.721.gc5bc1565f1
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-29 19:09                 ` Brandon Williams
@ 2017-08-29 19:10                   ` Brandon Williams
  2017-08-29 19:12                   ` Jeff King
  1 sibling, 0 replies; 43+ messages in thread
From: Brandon Williams @ 2017-08-29 19:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Lars Schneider, Martin Ågren, Git Users

On 08/29, Brandon Williams wrote:
> On 08/29, Jeff King wrote:
> > On Tue, Aug 29, 2017 at 02:53:41PM -0400, Jeff King wrote:
> > 
> > > It looks like the config code has a minor-ish leak. Patch to follow.
> > 
> > Here it is.
> > 
> > -- >8 --
> > Subject: [PATCH] config: use a static lock_file struct
> > 
> > When modifying git config, we xcalloc() a struct lock_file
> > but never free it. This is necessary because the tempfile
> > code (upon which the locking code is built) requires that
> > the resulting struct remain valid through the life of the
> > program. However, it also confuses leak-checkers like
> > valgrind because only the inner "struct tempfile" is still
> > reachable; no pointer to the outer lock_file is kept.
> 
> Is this just due to a limitation in the tempfile code?  Would it be
> possible to improve the tempfile code such that we don't need to require
> that a tempfile, once created, is required to exist for the remaining
> life of the program?
> 
> > 
> > Other code paths solve this by using a single static lock
> > struct. We can do the same here, because we know that we'll
> > only lock and modify one config file at a time (and
> > assertions within the lockfile code will ensure that this
> > remains the case).
> > 
> > That removes a real leak (when we fail to free the struct
> > after locking fails) as well as removes the valgrind false
> > positive. It also means that doing N sequential
> > config-writes will use a constant amount of memory, rather
> > than leaving stale lock_files for each.
> > 
> > Note that since "lock" is no longer a pointer, it can't be
> > NULL anymore. But that's OK. We used that feature only to
> > avoid calling rollback_lock_file() on an already-committed
> > lock. Since the lockfile code keeps its own "active" flag,
> > it's a noop to rollback an inactive lock, and we don't have
> > to worry about this ourselves.
> > 
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > In the long run we may want to drop the "tempfiles must remain forever"
> > rule. This is certainly not the first time it has caused confusion or
> > leaks. And I don't think it's a fundamental issue, just the way the code
> > is written. But in the interim, this fix is probably worth doing.

I didn't read far enough apparently :)  I took a look at this once and
found that the in order to do this we would just need to be careful in
modifying a list of tempfiles.

> > 
> >  config.c | 24 +++++++-----------------
> >  1 file changed, 7 insertions(+), 17 deletions(-)
> > 
> > diff --git a/config.c b/config.c
> > index d0d8ce823a..1603f96e40 100644
> > --- a/config.c
> > +++ b/config.c
> > @@ -2450,7 +2450,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
> >  {
> >  	int fd = -1, in_fd = -1;
> >  	int ret;
> > -	struct lock_file *lock = NULL;
> > +	static struct lock_file lock;
> >  	char *filename_buf = NULL;
> >  	char *contents = NULL;
> >  	size_t contents_sz;
> > @@ -2469,8 +2469,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
> >  	 * The lock serves a purpose in addition to locking: the new
> >  	 * contents of .git/config will be written into it.
> >  	 */
> > -	lock = xcalloc(1, sizeof(struct lock_file));
> > -	fd = hold_lock_file_for_update(lock, config_filename, 0);
> > +	fd = hold_lock_file_for_update(&lock, config_filename, 0);
> >  	if (fd < 0) {
> >  		error_errno("could not lock config file %s", config_filename);
> >  		free(store.key);
> > @@ -2583,8 +2582,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
> >  		close(in_fd);
> >  		in_fd = -1;
> >  
> > -		if (chmod(get_lock_file_path(lock), st.st_mode & 07777) < 0) {
> > -			error_errno("chmod on %s failed", get_lock_file_path(lock));
> > +		if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
> > +			error_errno("chmod on %s failed", get_lock_file_path(&lock));
> >  			ret = CONFIG_NO_WRITE;
> >  			goto out_free;
> >  		}
> > @@ -2639,28 +2638,19 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
> >  		contents = NULL;
> >  	}
> >  
> > -	if (commit_lock_file(lock) < 0) {
> > +	if (commit_lock_file(&lock) < 0) {
> >  		error_errno("could not write config file %s", config_filename);
> >  		ret = CONFIG_NO_WRITE;
> > -		lock = NULL;
> >  		goto out_free;
> >  	}
> >  
> > -	/*
> > -	 * lock is committed, so don't try to roll it back below.
> > -	 * NOTE: Since lockfile.c keeps a linked list of all created
> > -	 * lock_file structures, it isn't safe to free(lock).  It's
> > -	 * better to just leave it hanging around.
> > -	 */
> > -	lock = NULL;
> >  	ret = 0;
> >  
> >  	/* Invalidate the config cache */
> >  	git_config_clear();
> >  
> >  out_free:
> > -	if (lock)
> > -		rollback_lock_file(lock);
> > +	rollback_lock_file(&lock);
> >  	free(filename_buf);
> >  	if (contents)
> >  		munmap(contents, contents_sz);
> > @@ -2669,7 +2659,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
> >  	return ret;
> >  
> >  write_err_out:
> > -	ret = write_error(get_lock_file_path(lock));
> > +	ret = write_error(get_lock_file_path(&lock));
> >  	goto out_free;
> >  
> >  }
> > -- 
> > 2.14.1.721.gc5bc1565f1
> > 
> 
> -- 
> Brandon Williams

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-29 19:09                 ` Brandon Williams
  2017-08-29 19:10                   ` Brandon Williams
@ 2017-08-29 19:12                   ` Jeff King
  2017-08-30  3:25                     ` Michael Haggerty
  1 sibling, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-29 19:12 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Michael Haggerty, Lars Schneider, Martin Ågren, Git Users

On Tue, Aug 29, 2017 at 12:09:28PM -0700, Brandon Williams wrote:

> > -- >8 --
> > Subject: [PATCH] config: use a static lock_file struct
> > 
> > When modifying git config, we xcalloc() a struct lock_file
> > but never free it. This is necessary because the tempfile
> > code (upon which the locking code is built) requires that
> > the resulting struct remain valid through the life of the
> > program. However, it also confuses leak-checkers like
> > valgrind because only the inner "struct tempfile" is still
> > reachable; no pointer to the outer lock_file is kept.
> 
> Is this just due to a limitation in the tempfile code?  Would it be
> possible to improve the tempfile code such that we don't need to require
> that a tempfile, once created, is required to exist for the remaining
> life of the program?

Yes. Like I wrote below:

> > ---
> > In the long run we may want to drop the "tempfiles must remain forever"
> > rule. This is certainly not the first time it has caused confusion or
> > leaks. And I don't think it's a fundamental issue, just the way the code
> > is written. But in the interim, this fix is probably worth doing.

The main issue is that the caller needs to make sure they're removed
from the list (via commit or rollback) before being freed.

As far as I know anyway. This restriction dates back to the very early
days of the lockfile code and has been carried through the various
tempfile-cleanup refactorings over the years (mostly because each was
afraid to make functional changes).

+cc Michael, who did most comprehensive cleanup of that code.

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-29 19:12                   ` Jeff King
@ 2017-08-30  3:25                     ` Michael Haggerty
  2017-08-30  4:31                       ` Jeff King
  0 siblings, 1 reply; 43+ messages in thread
From: Michael Haggerty @ 2017-08-30  3:25 UTC (permalink / raw)
  To: Jeff King, Brandon Williams; +Cc: Lars Schneider, Martin Ågren, Git Users

On 08/29/2017 09:12 PM, Jeff King wrote:
> On Tue, Aug 29, 2017 at 12:09:28PM -0700, Brandon Williams wrote:
> 
>>> -- >8 --
>>> Subject: [PATCH] config: use a static lock_file struct
>>>
>>> When modifying git config, we xcalloc() a struct lock_file
>>> but never free it. This is necessary because the tempfile
>>> code (upon which the locking code is built) requires that
>>> the resulting struct remain valid through the life of the
>>> program. However, it also confuses leak-checkers like
>>> valgrind because only the inner "struct tempfile" is still
>>> reachable; no pointer to the outer lock_file is kept.
>>
>> Is this just due to a limitation in the tempfile code?  Would it be
>> possible to improve the tempfile code such that we don't need to require
>> that a tempfile, once created, is required to exist for the remaining
>> life of the program?
> 
> Yes. Like I wrote below:
> 
>>> ---
>>> In the long run we may want to drop the "tempfiles must remain forever"
>>> rule. This is certainly not the first time it has caused confusion or
>>> leaks. And I don't think it's a fundamental issue, just the way the code
>>> is written. But in the interim, this fix is probably worth doing.
> 
> The main issue is that the caller needs to make sure they're removed
> from the list (via commit or rollback) before being freed.
> 
> As far as I know anyway. This restriction dates back to the very early
> days of the lockfile code and has been carried through the various
> tempfile-cleanup refactorings over the years (mostly because each was
> afraid to make functional changes).
> 
> +cc Michael, who did most comprehensive cleanup of that code.

It was surprisingly hard trying to get that code to do the right thing,
non-racily, in every error path. Since `remove_tempfiles()` can be
called any time (even from a signal handler), the linked list of
`tempfile` objects has to be kept valid at all times but can't use
mutexes. I didn't have the energy to keep going and make the lock
objects freeable.

I suppose the task could be made a bit easier by using `sigprocmask(2)`
or `pthread_sigmask(3)` to make its running environment a little bit
less hostile.

Michael

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  3:25                     ` Michael Haggerty
@ 2017-08-30  4:31                       ` Jeff King
  2017-08-30  4:55                         ` Michael Haggerty
  2017-08-30  4:55                         ` Jeff King
  0 siblings, 2 replies; 43+ messages in thread
From: Jeff King @ 2017-08-30  4:31 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 05:25:18AM +0200, Michael Haggerty wrote:

> >>> In the long run we may want to drop the "tempfiles must remain forever"
> >>> rule. This is certainly not the first time it has caused confusion or
> >>> leaks. And I don't think it's a fundamental issue, just the way the code
> >>> is written. But in the interim, this fix is probably worth doing.
> > 
> > The main issue is that the caller needs to make sure they're removed
> > from the list (via commit or rollback) before being freed.
> > 
> > As far as I know anyway. This restriction dates back to the very early
> > days of the lockfile code and has been carried through the various
> > tempfile-cleanup refactorings over the years (mostly because each was
> > afraid to make functional changes).
> > 
> > +cc Michael, who did most comprehensive cleanup of that code.
> 
> It was surprisingly hard trying to get that code to do the right thing,
> non-racily, in every error path. Since `remove_tempfiles()` can be
> called any time (even from a signal handler), the linked list of
> `tempfile` objects has to be kept valid at all times but can't use
> mutexes. I didn't have the energy to keep going and make the lock
> objects freeable.
> 
> I suppose the task could be made a bit easier by using `sigprocmask(2)`
> or `pthread_sigmask(3)` to make its running environment a little bit
> less hostile.

I think there are really two levels of carefulness here:

  1. Avoiding complicated things during a signal handler that may rely
     on having a sane state from the rest of the program (e.g.,
     half-formed entries, stdio locks, etc).

  2. Being truly race-free in the face of a signal arriving while we're
     running arbitrary code that might have a tempfile struct in a funny
     state.

I feel like right now we meet (1) and not (2). But I think if we keep to
that lower bar of (1), it might not be that bad. We're assuming now that
there's no race on the tempfile->active flag, for instance. We could
probably make a similar assumption about putting items onto or taking
them off of a linked list (it's not really atomic, but a single pointer
assignment is probably "atomic enough" for our purposes).

Or I dunno. There's a lot of "volatile" modifiers sprinkled around.
Maybe those are enough to give us (2) as well (though in that case, I
think we'd still be as OK with the list manipulation as we are with the
active flag manipulation).

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  4:31                       ` Jeff King
@ 2017-08-30  4:55                         ` Michael Haggerty
  2017-08-30  4:55                         ` Jeff King
  1 sibling, 0 replies; 43+ messages in thread
From: Michael Haggerty @ 2017-08-30  4:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 6:31 AM, Jeff King <peff@peff.net> wrote:
> On Wed, Aug 30, 2017 at 05:25:18AM +0200, Michael Haggerty wrote:
>> It was surprisingly hard trying to get that code to do the right thing,
>> non-racily, in every error path. Since `remove_tempfiles()` can be
>> called any time (even from a signal handler), the linked list of
>> `tempfile` objects has to be kept valid at all times but can't use
>> mutexes. I didn't have the energy to keep going and make the lock
>> objects freeable.
>>
>> I suppose the task could be made a bit easier by using `sigprocmask(2)`
>> or `pthread_sigmask(3)` to make its running environment a little bit
>> less hostile.
>
> I think there are really two levels of carefulness here:
>
>   1. Avoiding complicated things during a signal handler that may rely
>      on having a sane state from the rest of the program (e.g.,
>      half-formed entries, stdio locks, etc).
>
>   2. Being truly race-free in the face of a signal arriving while we're
>      running arbitrary code that might have a tempfile struct in a funny
>      state.
>
> I feel like right now we meet (1) and not (2). But I think if we keep to
> that lower bar of (1), it might not be that bad. We're assuming now that
> there's no race on the tempfile->active flag, for instance. We could
> probably make a similar assumption about putting items onto or taking
> them off of a linked list (it's not really atomic, but a single pointer
> assignment is probably "atomic enough" for our purposes).
>
> Or I dunno. There's a lot of "volatile" modifiers sprinkled around.
> Maybe those are enough to give us (2) as well (though in that case, I
> think we'd still be as OK with the list manipulation as we are with the
> active flag manipulation).

I did in fact strive for both (1) and (2), though I know that there
are a couple of short races remaining in the code (plus who knows how
many bugs).

Actually, I think you're right that a single "probably-atomic" pointer
assignment would be enough to remove an entry from the list safely. I
was thinking about what it would take to make the list thread-safe
(which it currently is not, but probably doesn't need to be(?)).
Modifying the linked list is not that difficult if you assume that
there is only a single thread modifying the list at any given time.

Michael

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  4:31                       ` Jeff King
  2017-08-30  4:55                         ` Michael Haggerty
@ 2017-08-30  4:55                         ` Jeff King
  2017-08-30  5:55                           ` Jeff King
  2017-08-30  6:55                           ` Michael Haggerty
  1 sibling, 2 replies; 43+ messages in thread
From: Jeff King @ 2017-08-30  4:55 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 12:31:47AM -0400, Jeff King wrote:

> > It was surprisingly hard trying to get that code to do the right thing,
> > non-racily, in every error path. Since `remove_tempfiles()` can be
> > called any time (even from a signal handler), the linked list of
> > `tempfile` objects has to be kept valid at all times but can't use
> > mutexes. I didn't have the energy to keep going and make the lock
> > objects freeable.
> > 
> > I suppose the task could be made a bit easier by using `sigprocmask(2)`
> > or `pthread_sigmask(3)` to make its running environment a little bit
> > less hostile.
> 
> I think there are really two levels of carefulness here:
> 
>   1. Avoiding complicated things during a signal handler that may rely
>      on having a sane state from the rest of the program (e.g.,
>      half-formed entries, stdio locks, etc).
> 
>   2. Being truly race-free in the face of a signal arriving while we're
>      running arbitrary code that might have a tempfile struct in a funny
>      state.
> 
> I feel like right now we meet (1) and not (2). But I think if we keep to
> that lower bar of (1), it might not be that bad. We're assuming now that
> there's no race on the tempfile->active flag, for instance. We could
> probably make a similar assumption about putting items onto or taking
> them off of a linked list (it's not really atomic, but a single pointer
> assignment is probably "atomic enough" for our purposes).

Something like this, which AFAICT is about as safe as the existing code
in its list manipulation. It retains the "active" flag as an additional
check which I think isn't strictly necessary, but potentially catches
some logic errors.

The strbuf_reset() calls become strbuf_release(), since we're promising
the caller that they could now free the struct (or let it go out of
scope) if they chose. Probably during a signal handler we should skip
that (we know the struct is off the list and non-active at that point,
but we could possibly hit libc's free() mutex).

diff --git a/tempfile.c b/tempfile.c
index 6843710670..a7d964ebf8 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -69,7 +69,6 @@ static void remove_tempfiles(int skip_fclose)
 				tempfile_list->fp = NULL;
 			delete_tempfile(tempfile_list);
 		}
-		tempfile_list = tempfile_list->next;
 	}
 }
 
@@ -85,39 +84,54 @@ static void remove_tempfiles_on_signal(int signo)
 	raise(signo);
 }
 
-/*
- * Initialize *tempfile if necessary and add it to tempfile_list.
- */
-static void prepare_tempfile_object(struct tempfile *tempfile)
+static void init_tempfile(struct tempfile *tempfile)
+{
+	tempfile->fd = -1;
+	tempfile->fp = NULL;
+	tempfile->active = 0;
+	tempfile->owner = 0;
+	tempfile->next = NULL;
+	strbuf_init(&tempfile->filename, 0);
+}
+
+static void activate_tempfile(struct tempfile *tempfile)
 {
-	if (!tempfile_list) {
-		/* One-time initialization */
+	static volatile int initialized;
+
+	if (!initialized) {
 		sigchain_push_common(remove_tempfiles_on_signal);
 		atexit(remove_tempfiles_on_exit);
+		initialized = 1;
 	}
 
 	if (tempfile->active)
-		die("BUG: prepare_tempfile_object called for active object");
-	if (!tempfile->on_list) {
-		/* Initialize *tempfile and add it to tempfile_list: */
-		tempfile->fd = -1;
-		tempfile->fp = NULL;
-		tempfile->active = 0;
-		tempfile->owner = 0;
-		strbuf_init(&tempfile->filename, 0);
-		tempfile->next = tempfile_list;
-		tempfile_list = tempfile;
-		tempfile->on_list = 1;
-	} else if (tempfile->filename.len) {
-		/* This shouldn't happen, but better safe than sorry. */
-		die("BUG: prepare_tempfile_object called for improperly-reset object");
+		die("BUG: activate_tempfile called for active object");
+
+	tempfile->next = tempfile_list;
+	tempfile_list = tempfile;
+	tempfile->active = 1;
+}
+
+static void deactivate_tempfile(struct tempfile *tempfile)
+{
+	struct tempfile *volatile *p;
+
+	if (!tempfile->active)
+		return;
+
+	tempfile->active = 0;
+	for (p = &tempfile_list; *p; p = &(*p)->next) {
+		if (*p == tempfile) {
+			*p = tempfile->next;
+			break;
+		}
 	}
 }
 
 /* Make sure errno contains a meaningful value on error */
 int create_tempfile(struct tempfile *tempfile, const char *path)
 {
-	prepare_tempfile_object(tempfile);
+	init_tempfile(tempfile);
 
 	strbuf_add_absolute_path(&tempfile->filename, path);
 	tempfile->fd = open(tempfile->filename.buf,
@@ -127,11 +141,11 @@ int create_tempfile(struct tempfile *tempfile, const char *path)
 		tempfile->fd = open(tempfile->filename.buf,
 				    O_RDWR | O_CREAT | O_EXCL, 0666);
 	if (tempfile->fd < 0) {
-		strbuf_reset(&tempfile->filename);
+		strbuf_release(&tempfile->filename);
 		return -1;
 	}
 	tempfile->owner = getpid();
-	tempfile->active = 1;
+	activate_tempfile(tempfile);
 	if (adjust_shared_perm(tempfile->filename.buf)) {
 		int save_errno = errno;
 		error("cannot fix permission bits on %s", tempfile->filename.buf);
@@ -144,25 +158,25 @@ int create_tempfile(struct tempfile *tempfile, const char *path)
 
 void register_tempfile(struct tempfile *tempfile, const char *path)
 {
-	prepare_tempfile_object(tempfile);
+	init_tempfile(tempfile);
 	strbuf_add_absolute_path(&tempfile->filename, path);
 	tempfile->owner = getpid();
-	tempfile->active = 1;
+	activate_tempfile(tempfile);
 }
 
 int mks_tempfile_sm(struct tempfile *tempfile,
 		    const char *template, int suffixlen, int mode)
 {
-	prepare_tempfile_object(tempfile);
+	init_tempfile(tempfile);
 
 	strbuf_add_absolute_path(&tempfile->filename, template);
 	tempfile->fd = git_mkstemps_mode(tempfile->filename.buf, suffixlen, mode);
 	if (tempfile->fd < 0) {
-		strbuf_reset(&tempfile->filename);
+		strbuf_release(&tempfile->filename);
 		return -1;
 	}
 	tempfile->owner = getpid();
-	tempfile->active = 1;
+	activate_tempfile(tempfile);
 	return tempfile->fd;
 }
 
@@ -171,7 +185,7 @@ int mks_tempfile_tsm(struct tempfile *tempfile,
 {
 	const char *tmpdir;
 
-	prepare_tempfile_object(tempfile);
+	init_tempfile(tempfile);
 
 	tmpdir = getenv("TMPDIR");
 	if (!tmpdir)
@@ -180,11 +194,11 @@ int mks_tempfile_tsm(struct tempfile *tempfile,
 	strbuf_addf(&tempfile->filename, "%s/%s", tmpdir, template);
 	tempfile->fd = git_mkstemps_mode(tempfile->filename.buf, suffixlen, mode);
 	if (tempfile->fd < 0) {
-		strbuf_reset(&tempfile->filename);
+		strbuf_release(&tempfile->filename);
 		return -1;
 	}
 	tempfile->owner = getpid();
-	tempfile->active = 1;
+	activate_tempfile(tempfile);
 	return tempfile->fd;
 }
 
@@ -293,8 +307,8 @@ int rename_tempfile(struct tempfile *tempfile, const char *path)
 		return -1;
 	}
 
-	tempfile->active = 0;
-	strbuf_reset(&tempfile->filename);
+	deactivate_tempfile(tempfile);
+	strbuf_release(&tempfile->filename);
 	return 0;
 }
 
@@ -305,7 +319,7 @@ void delete_tempfile(struct tempfile *tempfile)
 
 	if (!close_tempfile(tempfile)) {
 		unlink_or_warn(tempfile->filename.buf);
-		tempfile->active = 0;
-		strbuf_reset(&tempfile->filename);
+		deactivate_tempfile(tempfile);
+		strbuf_release(&tempfile->filename);
 	}
 }
diff --git a/tempfile.h b/tempfile.h
index 2f0038decd..df96f82e84 100644
--- a/tempfile.h
+++ b/tempfile.h
@@ -85,7 +85,6 @@ struct tempfile {
 	volatile int fd;
 	FILE *volatile fp;
 	volatile pid_t owner;
-	char on_list;
 	struct strbuf filename;
 };
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  4:55                         ` Jeff King
@ 2017-08-30  5:55                           ` Jeff King
  2017-08-30  7:07                             ` Michael Haggerty
  2017-08-30  6:55                           ` Michael Haggerty
  1 sibling, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-30  5:55 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 12:55:55AM -0400, Jeff King wrote:

> > I feel like right now we meet (1) and not (2). But I think if we keep to
> > that lower bar of (1), it might not be that bad. We're assuming now that
> > there's no race on the tempfile->active flag, for instance. We could
> > probably make a similar assumption about putting items onto or taking
> > them off of a linked list (it's not really atomic, but a single pointer
> > assignment is probably "atomic enough" for our purposes).
> 
> Something like this, which AFAICT is about as safe as the existing code
> in its list manipulation. It retains the "active" flag as an additional
> check which I think isn't strictly necessary, but potentially catches
> some logic errors.

The patch below demonstrates how this could be used to turn some
"xcalloc" lock_files into stack variables that are allowed to go out of
scope after we commit or rollback. This solves the three lock-related
leaks reported by valgrind when running t0000.

_But_ it also demonstrates an interesting downside of this approach.
Some functions are lazy in their error paths. For instance, look at
write_index_as_tree(). We take the lock early in the function, but may
return before a commit or rollback if we hit an error.  With the current
code this "leaks" the tempfile, which is wrong. But nobody notices
because in practice the program exits soon after and we clean up the
tempfile then.

But with a stack variable lock_file, that minor problem becomes a major
one: our stack variable got added to a global linked list and the
atexit() handler will try to read it. But now of course it will contain
garbage, since the variable went out of scope.

So it's probably safer to just let tempfile.c handle the whole lifetime,
and have it put all live tempfiles on the heap, and free them when
they're deactivated. That means our creation signature becomes more
like:

  struct tempfile *create_tempfile(const char *path);

and delete_tempfile() actually calls free() on it (though we'd probably
want to skip the free() from a signal handler for the usual reasons).

I don't have a patch for that, but just food for thought while exploring
the implications of my earlier suggestion.

---
diff --git a/builtin/update-index.c b/builtin/update-index.c
index d955cd56b3..bf7420b808 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -917,7 +917,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	struct refresh_params refresh_args = {0, &has_errors};
 	int lock_error = 0;
 	int split_index = -1;
-	struct lock_file *lock_file;
+	struct lock_file lock_file = LOCK_INIT;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
 	int parseopt_state = PARSE_OPT_UNKNOWN;
@@ -1016,11 +1016,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 
-	/* We can't free this memory, it becomes part of a linked list parsed atexit() */
-	lock_file = xcalloc(1, sizeof(struct lock_file));
-
 	/* we will diagnose later if it turns out that we need to update it */
-	newfd = hold_locked_index(lock_file, 0);
+	newfd = hold_locked_index(&lock_file, 0);
 	if (newfd < 0)
 		lock_error = errno;
 
@@ -1155,11 +1152,11 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 				exit(128);
 			unable_to_lock_die(get_index_file(), lock_error);
 		}
-		if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
+		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 			die("Unable to write new index file");
 	}
 
-	rollback_lock_file(lock_file);
+	rollback_lock_file(&lock_file);
 
 	return has_errors ? 1 : 0;
 }
diff --git a/cache-tree.c b/cache-tree.c
index 2440d1dc89..440f92aeca 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -603,19 +603,15 @@ static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *pat
 int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, const char *index_path, int flags, const char *prefix)
 {
 	int entries, was_valid, newfd;
-	struct lock_file *lock_file;
+	struct lock_file lock_file = LOCK_INIT;
 
-	/*
-	 * We can't free this memory, it becomes part of a linked list
-	 * parsed atexit()
-	 */
-	lock_file = xcalloc(1, sizeof(struct lock_file));
-
-	newfd = hold_lock_file_for_update(lock_file, index_path, LOCK_DIE_ON_ERROR);
+	newfd = hold_lock_file_for_update(&lock_file, index_path, LOCK_DIE_ON_ERROR);
 
 	entries = read_index_from(index_state, index_path);
-	if (entries < 0)
+	if (entries < 0) {
+		rollback_lock_file(&lock_file);
 		return WRITE_TREE_UNREADABLE_INDEX;
+	}
 	if (flags & WRITE_TREE_IGNORE_CACHE_TREE)
 		cache_tree_free(&index_state->cache_tree);
 
@@ -624,10 +620,12 @@ int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, co
 
 	was_valid = cache_tree_fully_valid(index_state->cache_tree);
 	if (!was_valid) {
-		if (cache_tree_update(index_state, flags) < 0)
+		if (cache_tree_update(index_state, flags) < 0) {
+			rollback_lock_file(&lock_file);
 			return WRITE_TREE_UNMERGED_INDEX;
+		}
 		if (0 <= newfd) {
-			if (!write_locked_index(index_state, lock_file, COMMIT_LOCK))
+			if (!write_locked_index(index_state, &lock_file, COMMIT_LOCK))
 				newfd = -1;
 		}
 		/* Not being able to write is fine -- we are only interested
@@ -641,15 +639,17 @@ int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, co
 	if (prefix) {
 		struct cache_tree *subtree;
 		subtree = cache_tree_find(index_state->cache_tree, prefix);
-		if (!subtree)
+		if (!subtree) {
+			rollback_lock_file(&lock_file);
 			return WRITE_TREE_PREFIX_ERROR;
+		}
 		hashcpy(sha1, subtree->oid.hash);
 	}
 	else
 		hashcpy(sha1, index_state->cache_tree->oid.hash);
 
 	if (0 <= newfd)
-		rollback_lock_file(lock_file);
+		rollback_lock_file(&lock_file);
 
 	return 0;
 }
diff --git a/config.c b/config.c
index d0d8ce823a..7931182a54 100644
--- a/config.c
+++ b/config.c
@@ -2450,7 +2450,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 {
 	int fd = -1, in_fd = -1;
 	int ret;
-	struct lock_file *lock = NULL;
+	struct lock_file lock = LOCK_INIT;
 	char *filename_buf = NULL;
 	char *contents = NULL;
 	size_t contents_sz;
@@ -2469,8 +2469,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	 * The lock serves a purpose in addition to locking: the new
 	 * contents of .git/config will be written into it.
 	 */
-	lock = xcalloc(1, sizeof(struct lock_file));
-	fd = hold_lock_file_for_update(lock, config_filename, 0);
+	fd = hold_lock_file_for_update(&lock, config_filename, 0);
 	if (fd < 0) {
 		error_errno("could not lock config file %s", config_filename);
 		free(store.key);
@@ -2583,8 +2582,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 		close(in_fd);
 		in_fd = -1;
 
-		if (chmod(get_lock_file_path(lock), st.st_mode & 07777) < 0) {
-			error_errno("chmod on %s failed", get_lock_file_path(lock));
+		if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
+			error_errno("chmod on %s failed", get_lock_file_path(&lock));
 			ret = CONFIG_NO_WRITE;
 			goto out_free;
 		}
@@ -2639,28 +2638,19 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 		contents = NULL;
 	}
 
-	if (commit_lock_file(lock) < 0) {
+	if (commit_lock_file(&lock) < 0) {
 		error_errno("could not write config file %s", config_filename);
 		ret = CONFIG_NO_WRITE;
-		lock = NULL;
 		goto out_free;
 	}
 
-	/*
-	 * lock is committed, so don't try to roll it back below.
-	 * NOTE: Since lockfile.c keeps a linked list of all created
-	 * lock_file structures, it isn't safe to free(lock).  It's
-	 * better to just leave it hanging around.
-	 */
-	lock = NULL;
 	ret = 0;
 
 	/* Invalidate the config cache */
 	git_config_clear();
 
 out_free:
-	if (lock)
-		rollback_lock_file(lock);
+	rollback_lock_file(&lock);
 	free(filename_buf);
 	if (contents)
 		munmap(contents, contents_sz);
@@ -2669,7 +2659,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	return ret;
 
 write_err_out:
-	ret = write_error(get_lock_file_path(lock));
+	ret = write_error(get_lock_file_path(&lock));
 	goto out_free;
 
 }
diff --git a/lockfile.h b/lockfile.h
index 572064939c..a7ba42bdf2 100644
--- a/lockfile.h
+++ b/lockfile.h
@@ -114,6 +114,8 @@ struct lock_file {
 	struct tempfile tempfile;
 };
 
+#define LOCK_INIT { { NULL } }
+
 /* String appended to a filename to derive the lockfile name: */
 #define LOCK_SUFFIX ".lock"
 #define LOCK_SUFFIX_LEN 5

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  5:55                           ` Jeff King
@ 2017-08-30  7:07                             ` Michael Haggerty
  2017-09-02  8:44                               ` Jeff King
  0 siblings, 1 reply; 43+ messages in thread
From: Michael Haggerty @ 2017-08-30  7:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On 08/30/2017 07:55 AM, Jeff King wrote:
> On Wed, Aug 30, 2017 at 12:55:55AM -0400, Jeff King wrote:
> [...]
> The patch below demonstrates how this could be used to turn some
> "xcalloc" lock_files into stack variables that are allowed to go out of
> scope after we commit or rollback. This solves the three lock-related
> leaks reported by valgrind when running t0000.
> 
> _But_ it also demonstrates an interesting downside of this approach.
> Some functions are lazy in their error paths. For instance, look at
> write_index_as_tree(). We take the lock early in the function, but may
> return before a commit or rollback if we hit an error.  With the current
> code this "leaks" the tempfile, which is wrong. But nobody notices
> because in practice the program exits soon after and we clean up the
> tempfile then.
> 
> But with a stack variable lock_file, that minor problem becomes a major
> one: our stack variable got added to a global linked list and the
> atexit() handler will try to read it. But now of course it will contain
> garbage, since the variable went out of scope.
> 
> So it's probably safer to just let tempfile.c handle the whole lifetime,
> and have it put all live tempfiles on the heap, and free them when
> they're deactivated. That means our creation signature becomes more
> like:
> 
>   struct tempfile *create_tempfile(const char *path);
> 
> and delete_tempfile() actually calls free() on it (though we'd probably
> want to skip the free() from a signal handler for the usual reasons).

I agree that the latter would be a nice, and relatively safe, design. It
would involve some fairly intrusive changes to client code, though.

I think it would be possible to implement the new API while leaving the
old one intact, to avoid having to rewrite all clients at once, and
potentially to allow clients to avoid a malloc if they already have a
convenient place to embed a `struct tempfile` (except that now they'd be
able to free it when done). For example, `create_tempfile(tempfile,
path)` and its friends could accept NULL as the first argument, in which
case it would malloc a `struct tempfile` itself, and mark it as being
owned by the tempfile module. Such objects would be freed when
deactivated. But if the caller passes in a non-NULL `tempfile` argument,
the old behavior would be retained.

Michael

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  7:07                             ` Michael Haggerty
@ 2017-09-02  8:44                               ` Jeff King
  2017-09-02 13:50                                 ` Jeff King
  0 siblings, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-09-02  8:44 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 09:07:53AM +0200, Michael Haggerty wrote:

> > So it's probably safer to just let tempfile.c handle the whole lifetime,
> > and have it put all live tempfiles on the heap, and free them when
> > they're deactivated. That means our creation signature becomes more
> > like:
> > 
> >   struct tempfile *create_tempfile(const char *path);
> > 
> > and delete_tempfile() actually calls free() on it (though we'd probably
> > want to skip the free() from a signal handler for the usual reasons).
> 
> I agree that the latter would be a nice, and relatively safe, design. It
> would involve some fairly intrusive changes to client code, though.
>
> I think it would be possible to implement the new API while leaving the
> old one intact, to avoid having to rewrite all clients at once, and
> potentially to allow clients to avoid a malloc if they already have a
> convenient place to embed a `struct tempfile` (except that now they'd be
> able to free it when done). For example, `create_tempfile(tempfile,
> path)` and its friends could accept NULL as the first argument, in which
> case it would malloc a `struct tempfile` itself, and mark it as being
> owned by the tempfile module. Such objects would be freed when
> deactivated. But if the caller passes in a non-NULL `tempfile` argument,
> the old behavior would be retained.

There are actually very few callers of create_tempfile (I count two).
Everybody just uses lock_file(). So I think we could get away with just
modifying the internals of the lock struct to hold a pointer, and then
teaching commit_lock_file() et al to reset it to NULL after performing
an operation that frees the tempfile. We could even modify
rename_tempfile() and friends to take a pointer-to-pointer and NULL it
itself. That would make it harder to get wrong.

In the long term I don't think there's a huge value to being able to
place a "struct tempfile" inside another struct, as opposed to holding a
pointer. It saves a malloc, but at the cost of opening up confusion
about lifetimes.

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-09-02  8:44                               ` Jeff King
@ 2017-09-02 13:50                                 ` Jeff King
  0 siblings, 0 replies; 43+ messages in thread
From: Jeff King @ 2017-09-02 13:50 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On Sat, Sep 02, 2017 at 04:44:51AM -0400, Jeff King wrote:

> There are actually very few callers of create_tempfile (I count two).
> Everybody just uses lock_file(). So I think we could get away with just
> modifying the internals of the lock struct to hold a pointer, and then
> teaching commit_lock_file() et al to reset it to NULL after performing
> an operation that frees the tempfile. We could even modify
> rename_tempfile() and friends to take a pointer-to-pointer and NULL it
> itself. That would make it harder to get wrong.

I take it back.  It's true that there are very few create_tempfile()
callers, but we'd want to have a similar change for mks_tempfile_*, and
there are a lot more of those.

One solution would be to add an extra level of indirection, like:

  struct tempfile {
	struct the_part_that_we_put_on_our_global_list *data;
  };

That lets us keep normal value semantics for "struct tempfile", which is
nice. And existing callers wouldn't have to be modified at all for the
creation/deletion bits. But I suspect it would also result in a lot of
replacements like s/temp->/temp.data->/ for any callers that look at the
underlying fields that have moved into the sub-struct.

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  4:55                         ` Jeff King
  2017-08-30  5:55                           ` Jeff King
@ 2017-08-30  6:55                           ` Michael Haggerty
  2017-08-30 19:53                             ` Jeff King
  1 sibling, 1 reply; 43+ messages in thread
From: Michael Haggerty @ 2017-08-30  6:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On 08/30/2017 06:55 AM, Jeff King wrote:
> [...]
> Something like this, which AFAICT is about as safe as the existing code
> in its list manipulation. It retains the "active" flag as an additional
> check which I think isn't strictly necessary, but potentially catches
> some logic errors.
> 
> The strbuf_reset() calls become strbuf_release(), since we're promising
> the caller that they could now free the struct (or let it go out of
> scope) if they chose. Probably during a signal handler we should skip
> that (we know the struct is off the list and non-active at that point,
> but we could possibly hit libc's free() mutex).

This looks OK to me aside from the one caveat below.

> diff --git a/tempfile.c b/tempfile.c
> index 6843710670..a7d964ebf8 100644
> --- a/tempfile.c
> +++ b/tempfile.c
> [...]
> +static void deactivate_tempfile(struct tempfile *tempfile)
> +{
> +	struct tempfile *volatile *p;
> +
> +	if (!tempfile->active)
> +		return;
> +
> +	tempfile->active = 0;
> +	for (p = &tempfile_list; *p; p = &(*p)->next) {
> +		if (*p == tempfile) {
> +			*p = tempfile->next;
> +			break;
> +		}
>  	}
>  }

`deactivate_tempfile()` is O(N) in the number of active tempfiles. This
could get noticeable for, say, updating 30k references, which involves
obtaining 30k reference locks. I think that code adds the locks in
alphabetical order and also removes them in alphabetical order, so the
overall effort would go like O(N²). I'm guessing that this would be
measurable but not fatal for realistic numbers of references, but it
should at least be benchmarked.

There are three obvious ways to make this O(1) again:

* Make the list doubly-linked.
* Make sure that all significant callers remove items in the *reverse*
order that they were added, in which case the item to be removed would
always be found near the head of the list.
* Change this class to keep track of a tail pointer and add new items at
the end of the list, and ensure that all significant callers remove
locks in the *same* order that they were added.

The second and third options might be easy or difficult (depend on how
much current callers need to be changed) and certainly would be easy to
break again as new callers are added in the future.

> [...]

Michael

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30  6:55                           ` Michael Haggerty
@ 2017-08-30 19:53                             ` Jeff King
  2017-08-30 19:57                               ` Brandon Williams
  0 siblings, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-30 19:53 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Brandon Williams, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 08:55:01AM +0200, Michael Haggerty wrote:

> > +	tempfile->active = 0;
> > +	for (p = &tempfile_list; *p; p = &(*p)->next) {
> > +		if (*p == tempfile) {
> > +			*p = tempfile->next;
> > +			break;
> > +		}
> >  	}
> >  }
> 
> `deactivate_tempfile()` is O(N) in the number of active tempfiles. This
> could get noticeable for, say, updating 30k references, which involves
> obtaining 30k reference locks. I think that code adds the locks in
> alphabetical order and also removes them in alphabetical order, so the
> overall effort would go like O(N²). I'm guessing that this would be
> measurable but not fatal for realistic numbers of references, but it
> should at least be benchmarked.
> 
> There are three obvious ways to make this O(1) again:
> 
> * Make the list doubly-linked.

Yeah, I noticed this when writing it, and the doubly-linked list was my
first thought. It's much more straight-forward than making guesses about
traversal order, and we already have a nice implementation in list.h.

What made me hesitate for this demonstration was that it turns the
removal into _two_ pointer updates. That made me more nervous about the
racy case where we get a single handler while already deleting
something. Specifically when we have "a <-> b <-> c" and we've updated
"a->next" to point to "c" but "c->prev" still points to "b".

But even with the singly-linked list we're not fully race-proof
(somebody could set *p to NULL between the time we look at it and
dereference it). What we really care about is not two versions of the
function running simultaneously, but one version getting interrupted by
another and having the second one see a sane state (because we'll never
return to the first signal handler; the second one will raise() and
die).

And I think we're fine there even with a doubly-linked list. It's still
the single update of the "next" pointer that controls that second
traversal.

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30 19:53                             ` Jeff King
@ 2017-08-30 19:57                               ` Brandon Williams
  2017-08-30 20:11                                 ` Jeff King
  0 siblings, 1 reply; 43+ messages in thread
From: Brandon Williams @ 2017-08-30 19:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, Lars Schneider, Martin Ågren, Git Users

On 08/30, Jeff King wrote:
> On Wed, Aug 30, 2017 at 08:55:01AM +0200, Michael Haggerty wrote:
> 
> > > +	tempfile->active = 0;
> > > +	for (p = &tempfile_list; *p; p = &(*p)->next) {
> > > +		if (*p == tempfile) {
> > > +			*p = tempfile->next;
> > > +			break;
> > > +		}
> > >  	}
> > >  }
> > 
> > `deactivate_tempfile()` is O(N) in the number of active tempfiles. This
> > could get noticeable for, say, updating 30k references, which involves
> > obtaining 30k reference locks. I think that code adds the locks in
> > alphabetical order and also removes them in alphabetical order, so the
> > overall effort would go like O(N²). I'm guessing that this would be
> > measurable but not fatal for realistic numbers of references, but it
> > should at least be benchmarked.
> > 
> > There are three obvious ways to make this O(1) again:
> > 
> > * Make the list doubly-linked.
> 
> Yeah, I noticed this when writing it, and the doubly-linked list was my
> first thought. It's much more straight-forward than making guesses about
> traversal order, and we already have a nice implementation in list.h.

I agree that a doubly-linked list is probably the best way to go in
order to solve the performance issue.

> 
> What made me hesitate for this demonstration was that it turns the
> removal into _two_ pointer updates. That made me more nervous about the
> racy case where we get a single handler while already deleting
> something. Specifically when we have "a <-> b <-> c" and we've updated
> "a->next" to point to "c" but "c->prev" still points to "b".
> 
> But even with the singly-linked list we're not fully race-proof
> (somebody could set *p to NULL between the time we look at it and
> dereference it). What we really care about is not two versions of the
> function running simultaneously, but one version getting interrupted by
> another and having the second one see a sane state (because we'll never
> return to the first signal handler; the second one will raise() and
> die).
> 
> And I think we're fine there even with a doubly-linked list. It's still
> the single update of the "next" pointer that controls that second
> traversal.
> 
> -Peff

I know it was mentioned earlier but if this is a critical section, and
it would be bad if it was interrupted, then couldn't we turn off
interrupts before attempting to remove an item from the list?

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30 19:57                               ` Brandon Williams
@ 2017-08-30 20:11                                 ` Jeff King
  2017-08-30 21:06                                   ` Brandon Williams
  0 siblings, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-30 20:11 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Michael Haggerty, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 12:57:31PM -0700, Brandon Williams wrote:

> > And I think we're fine there even with a doubly-linked list. It's still
> > the single update of the "next" pointer that controls that second
> > traversal.
> 
> I know it was mentioned earlier but if this is a critical section, and
> it would be bad if it was interrupted, then couldn't we turn off
> interrupts before attempting to remove an item from the list?

If we have to, that's an option. I mostly didn't want to get into the
portability headaches of signal-handling if we can avoid it.

Right now sigchain uses plain old signal(). We do use sigaction() in a
few places, but in a vanilla way. So the portability questions are:

  - can we rely on using more interesting bits of sigaction like
    sa_mask?  Probably.

  - which flags should we be setting in sa_flags to get the same
    behavior we've been getting with signal()? Or alternatively, which
    behavior do we want?

On Linux and BSD systems, at least, signal() defers repeat instances of
the handled signal. So getting two quick SIGTERMs will not interrupt the
handler to deliver the second. But getting SIGTERM followed by SIGINT would.

We could extend that protection by having sigchain_push_common() set
sa_mask to cover all of the related signals. On Linux and BSD the
current code using signal() also implies SA_RESTART. We could add that
to our flags, though I suspect in practice it doesn't matter. Whenever
we establish a handler like this our intent is to never return from it.

That just protects us from calling the _same_ handler from itself. But
that's probably enough in practice, and means handlers don't have to
worry about "critical sections". The other alternative is sigprocmask()
to block signals entirely during a section. I'm not sure if there are
portability questions there (it looks like we have a mingw wrapper
there, but it's a complete noop).

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30 20:11                                 ` Jeff King
@ 2017-08-30 21:06                                   ` Brandon Williams
  2017-08-31  4:09                                     ` Jeff King
  0 siblings, 1 reply; 43+ messages in thread
From: Brandon Williams @ 2017-08-30 21:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, Lars Schneider, Martin Ågren, Git Users

On 08/30, Jeff King wrote:
> On Wed, Aug 30, 2017 at 12:57:31PM -0700, Brandon Williams wrote:
> 
> > > And I think we're fine there even with a doubly-linked list. It's still
> > > the single update of the "next" pointer that controls that second
> > > traversal.
> > 
> > I know it was mentioned earlier but if this is a critical section, and
> > it would be bad if it was interrupted, then couldn't we turn off
> > interrupts before attempting to remove an item from the list?
> 
> If we have to, that's an option. I mostly didn't want to get into the
> portability headaches of signal-handling if we can avoid it.
> 
> Right now sigchain uses plain old signal(). We do use sigaction() in a
> few places, but in a vanilla way. So the portability questions are:
> 
>   - can we rely on using more interesting bits of sigaction like
>     sa_mask?  Probably.
> 
>   - which flags should we be setting in sa_flags to get the same
>     behavior we've been getting with signal()? Or alternatively, which
>     behavior do we want?
> 
> On Linux and BSD systems, at least, signal() defers repeat instances of
> the handled signal. So getting two quick SIGTERMs will not interrupt the
> handler to deliver the second. But getting SIGTERM followed by SIGINT would.
> 
> We could extend that protection by having sigchain_push_common() set
> sa_mask to cover all of the related signals. On Linux and BSD the
> current code using signal() also implies SA_RESTART. We could add that
> to our flags, though I suspect in practice it doesn't matter. Whenever
> we establish a handler like this our intent is to never return from it.
> 
> That just protects us from calling the _same_ handler from itself. But
> that's probably enough in practice, and means handlers don't have to
> worry about "critical sections". The other alternative is sigprocmask()
> to block signals entirely during a section. I'm not sure if there are
> portability questions there (it looks like we have a mingw wrapper
> there, but it's a complete noop).

Yeah there's a lot about signals that I'm not very clear on.  I do know
that Eric helped me out on the fork-exec series I worked on earlier in
the year and I believe it was to turn on/off signals during process
launches in 45afb1ca9 (run-command: block signals between fork and
execve, 2017-04-19).  Though that bit of code is strictly for unix so I
wouldn't know how that would work on windows machines.  Portability does
seem to always be a challenging problem.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-30 21:06                                   ` Brandon Williams
@ 2017-08-31  4:09                                     ` Jeff King
  0 siblings, 0 replies; 43+ messages in thread
From: Jeff King @ 2017-08-31  4:09 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Michael Haggerty, Lars Schneider, Martin Ågren, Git Users

On Wed, Aug 30, 2017 at 02:06:02PM -0700, Brandon Williams wrote:

> > We could extend that protection by having sigchain_push_common() set
> > sa_mask to cover all of the related signals. On Linux and BSD the
> > current code using signal() also implies SA_RESTART. We could add that
> > to our flags, though I suspect in practice it doesn't matter. Whenever
> > we establish a handler like this our intent is to never return from it.
> > 
> > That just protects us from calling the _same_ handler from itself. But
> > that's probably enough in practice, and means handlers don't have to
> > worry about "critical sections". The other alternative is sigprocmask()
> > to block signals entirely during a section. I'm not sure if there are
> > portability questions there (it looks like we have a mingw wrapper
> > there, but it's a complete noop).
> 
> Yeah there's a lot about signals that I'm not very clear on.  I do know
> that Eric helped me out on the fork-exec series I worked on earlier in
> the year and I believe it was to turn on/off signals during process
> launches in 45afb1ca9 (run-command: block signals between fork and
> execve, 2017-04-19).  Though that bit of code is strictly for unix so I
> wouldn't know how that would work on windows machines.  Portability does
> seem to always be a challenging problem.

Based on the sketch I wrote above, I figured it would be pretty easy to
convert sigchain to sigaction. But after taking a look at
compat/mingw.c, I don't think Windows would be on board. sigaction()
there is is a stub implementation that _only_ handles SIGALRM and
nothing else.

So I think the best we could do is put big #ifdefs around it to use
sigaction on other platforms, and fall back to signal() on Windows.
That's do-able, but my enthusiasm is waning as the complexity increases.
Getting two SIGINTs in a row seems plausible, but we already handle that
well. Getting SIGINT and then a _different_ signal while we're in the
handler seems less likely in practice. The only combination I can think
that would be common is TERM+KILL, but of course we're not catching KILL
in the first place.

So this seems like adding complexity for very little benefit.

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-08-29 18:58               ` [PATCH] config: use a static lock_file struct Jeff King
  2017-08-29 19:09                 ` Brandon Williams
@ 2017-09-06  3:59                 ` Junio C Hamano
  2017-09-06 12:41                   ` Jeff King
  1 sibling, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2017-09-06  3:59 UTC (permalink / raw)
  To: Jeff King; +Cc: Lars Schneider, Martin Ågren, Git Users

Jeff King <peff@peff.net> writes:

> In the long run we may want to drop the "tempfiles must remain forever"
> rule. This is certainly not the first time it has caused confusion or
> leaks. And I don't think it's a fundamental issue, just the way the code
> is written. But in the interim, this fix is probably worth doing.

I notice you also have a series to rework the "must remain forever"
on the list, but nevertheless let's take this one first.

Will queue.

Thanks.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] config: use a static lock_file struct
  2017-09-06  3:59                 ` Junio C Hamano
@ 2017-09-06 12:41                   ` Jeff King
  0 siblings, 0 replies; 43+ messages in thread
From: Jeff King @ 2017-09-06 12:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Schneider, Martin Ågren, Git Users

On Wed, Sep 06, 2017 at 12:59:46PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > In the long run we may want to drop the "tempfiles must remain forever"
> > rule. This is certainly not the first time it has caused confusion or
> > leaks. And I don't think it's a fundamental issue, just the way the code
> > is written. But in the interim, this fix is probably worth doing.
> 
> I notice you also have a series to rework the "must remain forever"
> on the list, but nevertheless let's take this one first.
> 
> Will queue.

Yes, this will conflict with that series (but the resolution should just
be to throw this one away). I'm fine with taking this in the meantime,
though it's certainly not urgent (my main goal was getting the pre-test
bits of test-lib.sh to run with leak detection on; I don't think this
was actually affecting anybody in the real world).

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-29 18:53             ` Jeff King
  2017-08-29 18:58               ` [PATCH] config: use a static lock_file struct Jeff King
@ 2017-08-29 19:22               ` Martin Ågren
  2017-08-29 21:48                 ` Jeff King
  1 sibling, 1 reply; 43+ messages in thread
From: Martin Ågren @ 2017-08-29 19:22 UTC (permalink / raw)
  To: Jeff King; +Cc: Lars Schneider, Git Users

On 29 August 2017 at 20:53, Jeff King <peff@peff.net> wrote:
> On Tue, Aug 29, 2017 at 06:51:52PM +0100, Lars Schneider wrote:
>> What if we run a few selected tests with valgrind and count all files that
>> valgrind mentions (a single leak has multiple file mentions because of
>> the stack trace and other leak indicators). We record these counts and let
>> TravisCI scream if one of the numbers increases.
>>
>> I wonder how stable/fragile such a metric would be as a simple refactoring
>> could easily change these numbers. Below I ran valgrind on t5510 before and
>> after Martin's patch. The diff below clearly shows the pkt-line leak.
>>
>> Would it make sense to pursue something like this in TravisCI to avoid
>> "pkt-line" kind of leaks in the future?
>
> I don't think that would work, because simply adding new tests would
> bump the leak count, without the code actually growing worse.
>
> But think about your strategy for a moment: what you're really trying to
> do is say "these existing leaks are OK because they are too many for us
> to count, but we want to make sure we don't add _new_ ones". We already
> have two techniques for distinguishing old ones from new ones:
>
>   1. Mark existing ones with valgrind suppressions so they do not warn
>      at all.
>
>   2. Fix them, so that the "existing" count drops to zero.
>
> Option (2), of course, has the added side effect that it's actually
> fixing potential problems. :)

One idea would be to report that "t1234 used to complete without leaks,
but now there is a leak". Of course, the leak could be triggered by some
"irrelevant" setup step that was just added and not by the functionality
that is actually tested. But maybe then the reaction might be "oh, let's
fix this one leak so that this test is leak-free again". If that then
also reduces the leakiness of some other tests, good. That might help
against the "there are too many leaks, where should I start?"-effect.

Or similarly "This stack-trace was reported. I haven't seen it before."
It could be a refactoring, it could be an old leak that hasn't been
triggered in that way before, or it could be a new leak.

But to be honest, I think that only works -- in a psychological sense --
once the number of leaks is small enough, however much that is.

One take-away for me from this thread is that memory-leak-plugging seems
to be appreciated, i.e., it's not just an academic problem. I think I'll
look into it some more, i.e., slowly pursue option 2. :-) That might help
give some insights on how to identify interesting leaks.

Martin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-29 19:22               ` [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1() Martin Ågren
@ 2017-08-29 21:48                 ` Jeff King
  2017-08-30  5:31                   ` Jeff King
  0 siblings, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-29 21:48 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Lars Schneider, Git Users

On Tue, Aug 29, 2017 at 09:22:08PM +0200, Martin Ågren wrote:

> One take-away for me from this thread is that memory-leak-plugging seems
> to be appreciated, i.e., it's not just an academic problem. I think I'll
> look into it some more, i.e., slowly pursue option 2. :-) That might help
> give some insights on how to identify interesting leaks.

Yes, I think if it's tractable that's the best path forward. It's not
clear to me how many of the things valgrind reports are false positives
(and in what way). Certainly the lock_file patch I just posted seems
like something worth addressing, even if we _tend_ to only leak a small
number of such structs per program invocation. It's not a problem now,
but it's a sign of a bad pattern.

-Peff

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-29 21:48                 ` Jeff King
@ 2017-08-30  5:31                   ` Jeff King
  2017-09-05 10:03                     ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Jeff King @ 2017-08-30  5:31 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Lars Schneider, Git Users

On Tue, Aug 29, 2017 at 05:48:09PM -0400, Jeff King wrote:

> On Tue, Aug 29, 2017 at 09:22:08PM +0200, Martin Ågren wrote:
> 
> > One take-away for me from this thread is that memory-leak-plugging seems
> > to be appreciated, i.e., it's not just an academic problem. I think I'll
> > look into it some more, i.e., slowly pursue option 2. :-) That might help
> > give some insights on how to identify interesting leaks.
> 
> Yes, I think if it's tractable that's the best path forward. It's not
> clear to me how many of the things valgrind reports are false positives
> (and in what way). Certainly the lock_file patch I just posted seems
> like something worth addressing, even if we _tend_ to only leak a small
> number of such structs per program invocation. It's not a problem now,
> but it's a sign of a bad pattern.

I wanted to see what it would take to get t0000 to pass with "definite"
leak-checking on, as in:

diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
index 669ebaf68b..5f01c06280 100755
--- a/t/valgrind/valgrind.sh
+++ b/t/valgrind/valgrind.sh
@@ -10,7 +10,7 @@ test-*)
 	;;
 esac

-TOOL_OPTIONS='--leak-check=no'
+TOOL_OPTIONS='--leak-check=full --errors-for-leak-kinds=definite'

 test -z "$GIT_VALGRIND_ENABLED" &&
 exec "$program" "$@"

The results gave an interesting overview of the problems I'd expect to
face for the rest of the tests. I started with the config lock_file fix
I sent earlier, but we needed a few more.

This one in cmd_update_index() can become static in the same way:

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d955cd56b3..3bff7de720 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -908,6 +908,7 @@ static int reupdate_callback(struct parse_opt_ctx_t *ctx,

 int cmd_update_index(int argc, const char **argv, const char *prefix)
 {
+	static struct lock_file lock_file;
 	int newfd, entries, has_errors = 0, nul_term_line = 0;
 	enum uc_mode untracked_cache = UC_UNSPECIFIED;
 	int read_from_stdin = 0;
@@ -917,7 +918,6 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	struct refresh_params refresh_args = {0, &has_errors};
 	int lock_error = 0;
 	int split_index = -1;
-	struct lock_file *lock_file;
 	struct parse_opt_ctx_t ctx;
 	strbuf_getline_fn getline_fn;
 	int parseopt_state = PARSE_OPT_UNKNOWN;
@@ -1016,11 +1016,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)

 	git_config(git_default_config, NULL);

-	/* We can't free this memory, it becomes part of a linked list parsed atexit() */
-	lock_file = xcalloc(1, sizeof(struct lock_file));
-
 	/* we will diagnose later if it turns out that we need to update it */
-	newfd = hold_locked_index(lock_file, 0);
+	newfd = hold_locked_index(&lock_file, 0);
 	if (newfd < 0)
 		lock_error = errno;

@@ -1155,11 +1152,11 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 				exit(128);
 			unable_to_lock_die(get_index_file(), lock_error);
 		}
-		if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
+		if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 			die("Unable to write new index file");
 	}

-	rollback_lock_file(lock_file);
+	rollback_lock_file(&lock_file);

 	return has_errors ? 1 : 0;
 }

But there's another one in write_index_as_tree(), which is less
obviously a candidate for turning into a single static variable. We
might plausibly have multiple indices. I hacked around it with:

diff --git a/cache-tree.c b/cache-tree.c
index 2440d1dc89..f2a5aa138f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -603,7 +603,7 @@ static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *pat
 int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, const char *index_path, int flags, const char *prefix)
 {
 	int entries, was_valid, newfd;
-	struct lock_file *lock_file;
+	static struct lock_file *lock_file;

 	/*
 	 * We can't free this memory, it becomes part of a linked list

which means that if we call this function once per program we'll keep a
single static pointer, and valgrind will consider it "reachable". But if
we call multiple, they really will leak as before. So it's definitely a
hack, though it was enough for t0000. I think I'd favor loosening the
"you can never free locks" requirement instead, which makes this problem
just go away (we could even declare it as a stack variable then).

Next up is the fact that setup_git_env() happily overwrites its own
pointers without freeing them if it's called twice. I expect this will
be a reasonably common problem for things like config or option parsing
where we might see multiple values. In the simple cases, we just need to
make sure we do:

  free(foo);
  foo = xstrdup(...);

instead of just calling the latter.

But we can't always do that, because we sometimes point to string
literals like:

  static const char *foo = "default_value";
  ...
  foo = xstrfmt("computed_value: %s", value_from_user);

If we assign to "foo" again, we need to free only when the previous
value was from xstrfmt, not when it points to the string literal. But we
can't duplicate the string literal, because that default assignment is
outside any function.

We could solve that by using a struct instead, with a few helpers. Or
possibly we could just use strbufs, if we taught it to handle "unowned"
strings (i.e., where "alloc" is zero but "len" isn't necessarily, and we
don't just point to the slopbuf). And then you'd have something like:

  #define STRBUF_INIT_STR(x) { 0, strlen(x), (char *)(x) }
  ...
  static struct strbuf foo = STRBUF_INIT_STR("default_value");
  ...
  strbuf_addf(&foo, "computed_value...");

And then strbuf_grow() would have to learn to copy the existing bytes
when it moves from the "alloc==0" case.

Anyway, that actually ended up _not_ being how I solved this one. I was
puzzled why we were calling setup_git_env() twice in the first place.
And the answer is that setup_work_tree() frequently does a noop
set_git_dir(). And we can just get rid of that:

diff --git a/setup.c b/setup.c
index 23950173fc..9104740e2d 100644
--- a/setup.c
+++ b/setup.c
@@ -399,12 +399,9 @@ void setup_work_tree(void)
 	if (getenv(GIT_WORK_TREE_ENVIRONMENT))
 		setenv(GIT_WORK_TREE_ENVIRONMENT, ".", 1);

-	/*
-	 * NEEDSWORK: this call can essentially be set_git_dir(get_git_dir())
-	 * which can cause some problems when trying to free the old value of
-	 * gitdir.
-	 */
-	set_git_dir(remove_leading_path(git_dir, work_tree));
+	git_dir = remove_leading_path(git_dir, work_tree);
+	if (strcmp(get_git_dir(), git_dir))
+		set_git_dir(git_dir);
 	initialized = 1;
 }

I think this is really papering over underlying problems, though in this
case they are fairly simple "just call free()" cases that don't need the
strbuf magic.

Next up is update-index's add_one_path. This one is a real leak:

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d562f2ec69..d955cd56b3 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -287,8 +287,10 @@ static int add_one_path(const struct cache_entry *old, const char *path, int len
 	}
 	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
 	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
-	if (add_cache_entry(ce, option))
+	if (add_cache_entry(ce, option)) {
+		free(ce);
 		return error("%s: cannot add to the index - missing --add option?", path);
+	}
 	return 0;
 }

It's not a big deal in practice since we exit soon, but since it's in a
sub-function I think it's worth addressing (and the earlier error code
path already does the same free, so I think this really is just a bug).

And the final case I hit is in ls-files. Deep inside dir.c we allocate
some structures into the dir_struct. But there's no cleanup function, so
of course we just let it go out of scope at the end of cmd_ls_files()
and let the OS deal with it.

Obviously one solution is to add a cleanup function and call it at the
end of cmd_ls_files(). But here's another trick:

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1339e6d17..5f2396392e 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -504,7 +504,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 {
 	int require_work_tree = 0, show_tag = 0, i;
 	const char *max_prefix;
-	struct dir_struct dir;
+	static struct dir_struct dir;
 	struct exclude_list *el;
 	struct string_list exclude_list = STRING_LIST_INIT_NODUP;
 	struct option builtin_ls_files_options[] = {

By marking it as static, it remains reachable even after the function
exits, and valgrind doesn't complain about the leak. In a sense, this is
accomplishing exactly what we want to say: there should be one of these
per program run, and its lifetime is the same as that of the whole
program.

But it does make me feel just a little dirty. If for some reason we
called cmd_ls_files() multiple times in a single process, the current
behavior would be to have a small leak for each call. But the behavior
with the hunk above would be that the second run sees whatever cruft was
left in "dir" from the last run.

Overall I suspect that running our cmd_*() functions multiple times in a
single process is already going to cause chaos, because they often are
touching static globals, etc. So it's probably not that big a deal in
practice to add one more variable to the list, and declare that calling
cmd_ls_files() anywhere except as the main function of ls-files is
forbidden.

-Peff

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
  2017-08-30  5:31                   ` Jeff King
@ 2017-09-05 10:03                     ` Junio C Hamano
  2017-10-10  4:06                       ` [PATCH 0/2] Do not call cmd_*() as a subroutine Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2017-09-05 10:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Martin Ågren, Lars Schneider, Git Users

Jeff King <peff@peff.net> writes:

> Overall I suspect that running our cmd_*() functions multiple times in a
> single process is already going to cause chaos, because they often are
> touching static globals, etc. So it's probably not that big a deal in
> practice to add one more variable to the list, and declare that calling
> cmd_ls_files() anywhere except as the main function of ls-files is
> forbidden.

IIRC there were a few existing violators of this rule around "merge"
implementation.  It may be a reasonably low hanging fruit to find
and fix them.

#leftoverbits

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 0/2] Do not call cmd_*() as a subroutine
  2017-09-05 10:03                     ` Junio C Hamano
@ 2017-10-10  4:06                       ` Junio C Hamano
  2017-10-10  4:06                         ` [PATCH 1/2] describe: do not use " Junio C Hamano
  2017-10-10  4:06                         ` [PATCH 2/2] merge-ours: " Junio C Hamano
  0 siblings, 2 replies; 43+ messages in thread
From: Junio C Hamano @ 2017-10-10  4:06 UTC (permalink / raw)
  To: git

These two patches are for a recent #leftoverbits topic.

https://public-inbox.org/git/xmqqr2vlbgyk.fsf@gitster.mtv.corp.google.com/

The cmd_foo() function is a moral equivalent of 'main' for a Git
subcommand 'git foo', and as such, it is allowed to do many things
that make it unsuitable to be called as a subroutine, including

 - call exit(3) to terminate the process;

 - allocate resource held and used throughout its lifetime, without
   releasing it upon return/exit;

 - rely on global variables being initialized at program startup,
   and update them as needed, making another clean invocation of the
   function impossible.

Correcting two callers by using helper API calls is not so hard.

Junio C Hamano (2):
  describe: do not use cmd_*() as a subroutine
  merge-ours: do not use cmd_*() as a subroutine

 builtin/describe.c   | 15 +++++++++++----
 builtin/merge-ours.c | 16 +++++++---------
 2 files changed, 18 insertions(+), 13 deletions(-)

-- 
2.15.0-rc0-203-g4c8d0e28b1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/2] describe: do not use cmd_*() as a subroutine
  2017-10-10  4:06                       ` [PATCH 0/2] Do not call cmd_*() as a subroutine Junio C Hamano
@ 2017-10-10  4:06                         ` Junio C Hamano
  2017-10-10 13:43                           ` SZEDER Gábor
  2017-10-10  4:06                         ` [PATCH 2/2] merge-ours: " Junio C Hamano
  1 sibling, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2017-10-10  4:06 UTC (permalink / raw)
  To: git

The cmd_foo() function is a moral equivalent of 'main' for a Git
subcommand 'git foo', and as such, it is allowed to do many things
that make it unsuitable to be called as a subroutine, including

 - call exit(3) to terminate the process;

 - allocate resource held and used throughout its lifetime, without
   releasing it upon return/exit;

 - rely on global variables being initialized at program startup,
   and update them as needed, making another clean invocation of the
   function impossible.

The call to cmd_diff_index() "git describe" makes has been working
by accident that the function did not call exit(3); it sets a bad
precedent for people to cut and paste.

We could invoke it via the run_command() interface, but the diff
family of commands have helper functions in diff-lib.c that are
meant to be usable as subroutines, and using the latter does not
make the resulting code all that longer.  Use it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/describe.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 89ea1cdd60..50263759cb 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -7,12 +7,12 @@
 #include "builtin.h"
 #include "exec_cmd.h"
 #include "parse-options.h"
+#include "revision.h"
 #include "diff.h"
 #include "hashmap.h"
 #include "argv-array.h"
 #include "run-command.h"
 
-#define SEEN		(1u << 0)
 #define MAX_TAGS	(FLAG_BITS - 1)
 
 static const char * const describe_usage[] = {
@@ -532,7 +532,9 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 			}
 		} else if (dirty) {
 			static struct lock_file index_lock;
-			int fd;
+			struct rev_info revs;
+			struct argv_array args = ARGV_ARRAY_INIT;
+			int fd, result;
 
 			read_cache_preload(NULL);
 			refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
@@ -541,8 +543,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 			if (0 <= fd)
 				update_index_if_able(&the_index, &index_lock);
 
-			if (!cmd_diff_index(ARRAY_SIZE(diff_index_args) - 1,
-					    diff_index_args, prefix))
+			init_revisions(&revs, prefix);
+			argv_array_pushv(&args, diff_index_args);
+			if (setup_revisions(args.argc, args.argv, &revs, NULL) != 1)
+				BUG("malformed internal diff-index command line");
+			result = run_diff_index(&revs, 0);
+
+			if (!diff_result_code(&revs.diffopt, result))
 				suffix = NULL;
 			else
 				suffix = dirty;
-- 
2.15.0-rc0-203-g4c8d0e28b1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/2] describe: do not use cmd_*() as a subroutine
  2017-10-10  4:06                         ` [PATCH 1/2] describe: do not use " Junio C Hamano
@ 2017-10-10 13:43                           ` SZEDER Gábor
  2017-10-11  6:00                             ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: SZEDER Gábor @ 2017-10-10 13:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git


> The cmd_foo() function is a moral equivalent of 'main' for a Git
> subcommand 'git foo', and as such, it is allowed to do many things
> that make it unsuitable to be called as a subroutine, including
> 
>  - call exit(3) to terminate the process;
> 
>  - allocate resource held and used throughout its lifetime, without
>    releasing it upon return/exit;
> 
>  - rely on global variables being initialized at program startup,
>    and update them as needed, making another clean invocation of the
>    function impossible.
> 
> The call to cmd_diff_index() "git describe" makes has been working
> by accident that the function did not call exit(3); it sets a bad
> precedent for people to cut and paste.

The subject implies to me that this patch eliminates all cmd_*() calls
in builtin/describe.c, but a call to cmd_name_rev() still remains.

However, that call is special in the sense that cmd_describe() returns
immediately thereafter, so none of the above three points are an issue
there.  Someone might argue that it still sets a bad precedent, but I
won't :)  To avoid the direct cmd_name_rev() call we would have to use
run_command(), because there are no libified helper functions
available to do the job, adding the overhead of a fork()+exec(),
though only once or nonce, depending on cmdline options.

Maybe you already considered all this WRT that cmd_name_rev() call, I
don't know.  In any case, I think at least the subject line should
spell out cmd_diff_index().


Gábor


> We could invoke it via the run_command() interface, but the diff
> family of commands have helper functions in diff-lib.c that are
> meant to be usable as subroutines, and using the latter does not
> make the resulting code all that longer.  Use it.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/describe.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 89ea1cdd60..50263759cb 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -7,12 +7,12 @@
>  #include "builtin.h"
>  #include "exec_cmd.h"
>  #include "parse-options.h"
> +#include "revision.h"
>  #include "diff.h"
>  #include "hashmap.h"
>  #include "argv-array.h"
>  #include "run-command.h"
>  
> -#define SEEN		(1u << 0)
>  #define MAX_TAGS	(FLAG_BITS - 1)
>  
>  static const char * const describe_usage[] = {
> @@ -532,7 +532,9 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>  			}
>  		} else if (dirty) {
>  			static struct lock_file index_lock;
> -			int fd;
> +			struct rev_info revs;
> +			struct argv_array args = ARGV_ARRAY_INIT;
> +			int fd, result;
>  
>  			read_cache_preload(NULL);
>  			refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
> @@ -541,8 +543,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>  			if (0 <= fd)
>  				update_index_if_able(&the_index, &index_lock);
>  
> -			if (!cmd_diff_index(ARRAY_SIZE(diff_index_args) - 1,
> -					    diff_index_args, prefix))
> +			init_revisions(&revs, prefix);
> +			argv_array_pushv(&args, diff_index_args);
> +			if (setup_revisions(args.argc, args.argv, &revs, NULL) != 1)
> +				BUG("malformed internal diff-index command line");
> +			result = run_diff_index(&revs, 0);
> +
> +			if (!diff_result_code(&revs.diffopt, result))
>  				suffix = NULL;
>  			else
>  				suffix = dirty;
> -- 
> 2.15.0-rc0-203-g4c8d0e28b1

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/2] describe: do not use cmd_*() as a subroutine
  2017-10-10 13:43                           ` SZEDER Gábor
@ 2017-10-11  6:00                             ` Junio C Hamano
  0 siblings, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2017-10-11  6:00 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

SZEDER Gábor <szeder.dev@gmail.com> writes:

> Maybe you already considered all this WRT that cmd_name_rev() call, I
> don't know.  In any case, I think at least the subject line should
> spell out cmd_diff_index().

Perhaps.  The intent was to eradicate cmd_*() used as a subroutine,
so I'd rather opt to explain why name_rev() that does not return is
OK in this case in the log message without changing anything else.

Thanks for looking it over.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 2/2] merge-ours: do not use cmd_*() as a subroutine
  2017-10-10  4:06                       ` [PATCH 0/2] Do not call cmd_*() as a subroutine Junio C Hamano
  2017-10-10  4:06                         ` [PATCH 1/2] describe: do not use " Junio C Hamano
@ 2017-10-10  4:06                         ` Junio C Hamano
  1 sibling, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2017-10-10  4:06 UTC (permalink / raw)
  To: git

The call to cmd_diff_index() "git merge-ours" makes has been working
by accident that the function did not call exit(3), and the caller
exited almost immediately after making a call, but it sets a bad
precedent for people to cut and paste.

For finding out if the index exactly matches the HEAD (or a given
tree-ish), there is index_differs_from() which is exactly written
for that purpose.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/merge-ours.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/builtin/merge-ours.c b/builtin/merge-ours.c
index 684411694f..beb0623d56 100644
--- a/builtin/merge-ours.c
+++ b/builtin/merge-ours.c
@@ -9,26 +9,24 @@
  */
 #include "git-compat-util.h"
 #include "builtin.h"
+#include "diff.h"
 
 static const char builtin_merge_ours_usage[] =
 	"git merge-ours <base>... -- HEAD <remote>...";
 
-static const char *diff_index_args[] = {
-	"diff-index", "--quiet", "--cached", "HEAD", "--", NULL
-};
-#define NARGS (ARRAY_SIZE(diff_index_args) - 1)
-
 int cmd_merge_ours(int argc, const char **argv, const char *prefix)
 {
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage(builtin_merge_ours_usage);
 
 	/*
-	 * We need to exit with 2 if the index does not match our HEAD tree,
-	 * because the current index is what we will be committing as the
-	 * merge result.
+	 * The contents of the current index becomes the tree we
+	 * commit.  The index must match HEAD, or this merge cannot go
+	 * through.
 	 */
-	if (cmd_diff_index(NARGS, diff_index_args, prefix))
+	if (read_cache() < 0)
+		die_errno("read_cache failed");
+	if (index_differs_from("HEAD", 0, 0))
 		exit(2);
 	exit(0);
 }
-- 
2.15.0-rc0-203-g4c8d0e28b1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2017-10-11  6:00 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-27  7:37 [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1() Martin Ågren
2017-08-27 15:41 ` Jeff King
2017-08-27 18:25   ` Jeff King
2017-08-27 18:21 ` Lars Schneider
2017-08-27 19:09   ` Martin Ågren
2017-08-27 19:15     ` Jeff King
2017-08-27 20:04     ` Lars Schneider
2017-08-27 23:23       ` Jeff King
2017-08-28  4:11         ` Martin Ågren
2017-08-28 17:52           ` Stefan Beller
2017-08-28 17:58             ` Jeff King
2017-09-05 10:03               ` Junio C Hamano
2017-08-29 17:51           ` Lars Schneider
2017-08-29 18:53             ` Jeff King
2017-08-29 18:58               ` [PATCH] config: use a static lock_file struct Jeff King
2017-08-29 19:09                 ` Brandon Williams
2017-08-29 19:10                   ` Brandon Williams
2017-08-29 19:12                   ` Jeff King
2017-08-30  3:25                     ` Michael Haggerty
2017-08-30  4:31                       ` Jeff King
2017-08-30  4:55                         ` Michael Haggerty
2017-08-30  4:55                         ` Jeff King
2017-08-30  5:55                           ` Jeff King
2017-08-30  7:07                             ` Michael Haggerty
2017-09-02  8:44                               ` Jeff King
2017-09-02 13:50                                 ` Jeff King
2017-08-30  6:55                           ` Michael Haggerty
2017-08-30 19:53                             ` Jeff King
2017-08-30 19:57                               ` Brandon Williams
2017-08-30 20:11                                 ` Jeff King
2017-08-30 21:06                                   ` Brandon Williams
2017-08-31  4:09                                     ` Jeff King
2017-09-06  3:59                 ` Junio C Hamano
2017-09-06 12:41                   ` Jeff King
2017-08-29 19:22               ` [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1() Martin Ågren
2017-08-29 21:48                 ` Jeff King
2017-08-30  5:31                   ` Jeff King
2017-09-05 10:03                     ` Junio C Hamano
2017-10-10  4:06                       ` [PATCH 0/2] Do not call cmd_*() as a subroutine Junio C Hamano
2017-10-10  4:06                         ` [PATCH 1/2] describe: do not use " Junio C Hamano
2017-10-10 13:43                           ` SZEDER Gábor
2017-10-11  6:00                             ` Junio C Hamano
2017-10-10  4:06                         ` [PATCH 2/2] merge-ours: " Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).