git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] reduce_heads: fix memory leaks
@ 2017-11-01  9:03 Martin Ågren
  2017-11-02  3:11 ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Ågren @ 2017-11-01  9:03 UTC (permalink / raw)
  To: git

We currently have seven callers of `reduce_heads(foo)`. Six of them do
not use the original list `foo` again, and actually, all six of those
end up leaking it.

Introduce and use `reduce_heads_replace(&foo)` as a leak-free version of
`foo = reduce_heads(foo)` to fix several of these. Fix the remaining
leaks using `free_commit_list()`.

While we're here, document `reduce_heads()` and mark it as `extern`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 builtin/commit.c        |  2 +-
 builtin/fmt-merge-msg.c |  2 +-
 builtin/merge-base.c    |  6 +++++-
 builtin/merge.c         |  1 +
 builtin/pull.c          |  4 +++-
 commit.c                |  7 +++++++
 commit.h                | 18 +++++++++++++++++-
 7 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index d75b3805e..11c474018 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1728,7 +1728,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 				allow_fast_forward = 0;
 		}
 		if (allow_fast_forward)
-			parents = reduce_heads(parents);
+			reduce_heads_replace(&parents);
 	} else {
 		if (!reflog_msg)
 			reflog_msg = (whence == FROM_CHERRY_PICK)
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index e99b5ddbf..27a2361e9 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -571,7 +571,7 @@ static void find_merge_parents(struct merge_parents *result,
 	head_commit = lookup_commit(head);
 	if (head_commit)
 		commit_list_insert(head_commit, &parents);
-	parents = reduce_heads(parents);
+	reduce_heads_replace(&parents);
 
 	while (parents) {
 		struct commit *cmit = pop_commit(&parents);
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 6dbd167d3..b1b7590c4 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -59,6 +59,8 @@ static int handle_independent(int count, const char **args)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
 	result = reduce_heads(revs);
+	free_commit_list(revs);
+
 	if (!result)
 		return 1;
 
@@ -78,7 +80,9 @@ static int handle_octopus(int count, const char **args, int show_all)
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = reduce_heads(get_octopus_merge_bases(revs));
+	result = get_octopus_merge_bases(revs);
+	free_commit_list(revs);
+	reduce_heads_replace(&result);
 
 	if (!result)
 		return 1;
diff --git a/builtin/merge.c b/builtin/merge.c
index ab5ffe85e..fbbf2a9e5 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -999,6 +999,7 @@ static struct commit_list *reduce_parents(struct commit *head_commit,
 
 	/* Find what parents to record by checking independent ones. */
 	parents = reduce_heads(remoteheads);
+	free_commit_list(remoteheads);
 
 	remoteheads = NULL;
 	remotes = &remoteheads;
diff --git a/builtin/pull.c b/builtin/pull.c
index 6f772e8a2..5eeaa8c68 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -745,8 +745,10 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 	if (!is_null_oid(fork_point))
 		commit_list_insert(lookup_commit_reference(fork_point), &revs);
 
-	result = reduce_heads(get_octopus_merge_bases(revs));
+	result = get_octopus_merge_bases(revs);
 	free_commit_list(revs);
+	reduce_heads_replace(&result);
+
 	if (!result)
 		return 1;
 
diff --git a/commit.c b/commit.c
index 1e0e63379..cab8d4455 100644
--- a/commit.c
+++ b/commit.c
@@ -1090,6 +1090,13 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 	return result;
 }
 
+void reduce_heads_replace(struct commit_list **heads)
+{
+	struct commit_list *result = reduce_heads(*heads);
+	free_commit_list(*heads);
+	*heads = result;
+}
+
 static const char gpg_sig_header[] = "gpgsig";
 static const int gpg_sig_header_len = sizeof(gpg_sig_header) - 1;
 
diff --git a/commit.h b/commit.h
index 6d769590f..99a3fea68 100644
--- a/commit.h
+++ b/commit.h
@@ -313,7 +313,23 @@ extern int interactive_add(int argc, const char **argv, const char *prefix, int
 extern int run_add_interactive(const char *revision, const char *patch_mode,
 			       const struct pathspec *pathspec);
 
-struct commit_list *reduce_heads(struct commit_list *heads);
+/*
+ * Takes a list of commits and returns a new list where those
+ * have been removed that can be reached from other commits in
+ * the list. It is useful for, e.g., reducing the commits
+ * randomly thrown at the git-merge command and removing
+ * redundant commits that the user shouldn't have given to it.
+ *
+ * This function destroys the STALE bit of the commit objects'
+ * flags.
+ */
+extern struct commit_list *reduce_heads(struct commit_list *heads);
+
+/*
+ * Like `reduce_heads()`, except it replaces the list. Use this
+ * instead of `foo = reduce_heads(foo);` to avoid memory leaks.
+ */
+extern void reduce_heads_replace(struct commit_list **heads);
 
 struct commit_extra_header {
 	struct commit_extra_header *next;
-- 
2.15.0.415.gac1375d7e


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] reduce_heads: fix memory leaks
  2017-11-01  9:03 [PATCH] reduce_heads: fix memory leaks Martin Ågren
@ 2017-11-02  3:11 ` Junio C Hamano
  2017-11-02 10:45   ` Martin Ågren
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2017-11-02  3:11 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git

Martin Ågren <martin.agren@gmail.com> writes:

> diff --git a/builtin/merge-base.c b/builtin/merge-base.c
> index 6dbd167d3..b1b7590c4 100644
> --- a/builtin/merge-base.c
> +++ b/builtin/merge-base.c
> @@ -59,6 +59,8 @@ static int handle_independent(int count, const char **args)
>  		commit_list_insert(get_commit_reference(args[i]), &revs);
>  
>  	result = reduce_heads(revs);
> +	free_commit_list(revs);
> +
>  	if (!result)
>  		return 1;

The post-context of this hunk continues like so:

	while (result) {
		printf("%s\n", oid_to_hex(&result->item->object.oid));
		result = result->next;
	}
	return 0;
}

and we end up leaking "result".  This function is directly called from
cmd_merge_base() and its value is returned to main(), so leaking it
is not that a grave offence, but that excuse applies equally well to
revs.  

I can see you are shooting for minimum change in this patch, but if
we were writing this code in a codebase where reduce_heads_replace()
is already available, I would imagine that we wouldn't use two separate
variables, perhaps?

> diff --git a/commit.c b/commit.c
> index 1e0e63379..cab8d4455 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1090,6 +1090,13 @@ struct commit_list *reduce_heads(struct commit_list *heads)
>  	return result;
>  }
>  
> +void reduce_heads_replace(struct commit_list **heads)
> +{
> +	struct commit_list *result = reduce_heads(*heads);
> +	free_commit_list(*heads);
> +	*heads = result;
> +}
> +

Looks good.

> diff --git a/commit.h b/commit.h
> index 6d769590f..99a3fea68 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -313,7 +313,23 @@ extern int interactive_add(int argc, const char **argv, const char *prefix, int
>  extern int run_add_interactive(const char *revision, const char *patch_mode,
>  			       const struct pathspec *pathspec);
>  
> -struct commit_list *reduce_heads(struct commit_list *heads);
> +/*
> + * Takes a list of commits and returns a new list where those
> + * have been removed that can be reached from other commits in
> + * the list. It is useful for, e.g., reducing the commits
> + * randomly thrown at the git-merge command and removing
> + * redundant commits that the user shouldn't have given to it.
> + *
> + * This function destroys the STALE bit of the commit objects'
> + * flags.
> + */
> +extern struct commit_list *reduce_heads(struct commit_list *heads);
> +
> +/*
> + * Like `reduce_heads()`, except it replaces the list. Use this
> + * instead of `foo = reduce_heads(foo);` to avoid memory leaks.
> + */
> +extern void reduce_heads_replace(struct commit_list **heads);

Looks excellent.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] reduce_heads: fix memory leaks
  2017-11-02  3:11 ` Junio C Hamano
@ 2017-11-02 10:45   ` Martin Ågren
  2017-11-05 20:26     ` [PATCH v2 0/2] " Martin Ågren
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Ågren @ 2017-11-02 10:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On 2 November 2017 at 04:11, Junio C Hamano <gitster@pobox.com> wrote:
> Martin Ågren <martin.agren@gmail.com> writes:
>
>> diff --git a/builtin/merge-base.c b/builtin/merge-base.c
>> index 6dbd167d3..b1b7590c4 100644
>> --- a/builtin/merge-base.c
>> +++ b/builtin/merge-base.c
>> @@ -59,6 +59,8 @@ static int handle_independent(int count, const char **args)
>>               commit_list_insert(get_commit_reference(args[i]), &revs);
>>
>>       result = reduce_heads(revs);
>> +     free_commit_list(revs);
>> +
>>       if (!result)
>>               return 1;
>
> The post-context of this hunk continues like so:
>
>         while (result) {
>                 printf("%s\n", oid_to_hex(&result->item->object.oid));
>                 result = result->next;
>         }
>         return 0;
> }
>
> and we end up leaking "result".  This function is directly called from
> cmd_merge_base() and its value is returned to main(), so leaking it
> is not that a grave offence, but that excuse applies equally well to
> revs.

Good catch. I even have a patch to address the leak of `result`, except
I seem to have sorted it into another pile. For this series I just
grepped for "reduce_heads" and didn't stop to think about using UNLEAK,
nor about the leaking of `result`.

> I can see you are shooting for minimum change in this patch, but if
> we were writing this code in a codebase where reduce_heads_replace()
> is already available, I would imagine that we wouldn't use two separate
> variables, perhaps?

The way my other patch addresses the leaking of `result` is that it
rewrites the loop to avoid losing the original value of `result`, so
that it can be UNLEAK-ed at the very end. (That makes it obvious where
the leak happens, compared to adding an UNLEAK a few lines up.) If I do
`reduce_heads_replace(&revs)`, I'll need to touch the loop anyway, and
then I could probably just as well UNLEAK a little while at it.

I'll get to this within the next couple of days, then I'll see what it
looks like.

Thanks for your feedback.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 0/2] Re: reduce_heads: fix memory leaks
  2017-11-02 10:45   ` Martin Ågren
@ 2017-11-05 20:26     ` Martin Ågren
  2017-11-05 20:26       ` [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists Martin Ågren
                         ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Martin Ågren @ 2017-11-05 20:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

Since v1 [1], I've added a preparatory patch to UNLEAK some variables.
That sets the stage slightly better for patch 2.

Junio, you placed v1 on maint. Because UNLEAK is not in maint, this is
based on master and maint misses out on this v2. If you have any advice
for how I should (not) do series with UNLEAK in them, I'm all ears.

Martin

[1] https://public-inbox.org/git/20171101090326.8091-1-martin.agren@gmail.com/

Martin Ågren (2):
  builtin/merge-base: UNLEAK commit lists
  reduce_heads: fix memory leaks

 commit.h                | 18 +++++++++++++++++-
 builtin/commit.c        |  2 +-
 builtin/fmt-merge-msg.c |  2 +-
 builtin/merge-base.c    | 40 +++++++++++++++++++++-------------------
 builtin/merge.c         |  1 +
 builtin/pull.c          |  5 ++++-
 commit.c                |  7 +++++++
 7 files changed, 52 insertions(+), 23 deletions(-)

-- 
2.15.0.415.gac1375d7e


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists
  2017-11-05 20:26     ` [PATCH v2 0/2] " Martin Ågren
@ 2017-11-05 20:26       ` Martin Ågren
  2017-11-06  1:03         ` Junio C Hamano
  2017-11-06 11:05         ` Jeff King
  2017-11-05 20:26       ` [PATCH v2 " Martin Ågren
  2017-11-06  1:12       ` [PATCH v2 0/2] " Junio C Hamano
  2 siblings, 2 replies; 12+ messages in thread
From: Martin Ågren @ 2017-11-05 20:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

In several functions, we iterate through a commit list by assigning
`result = result->next`. As a consequence, we lose the original pointer
and eventually leak the list.

These are immediate helpers to `cmd_merge_base()` which is just about to
return, so we can use UNLEAK. For example, we could `UNLEAK(result)`
before we start iterating. That would be a one-liner change per
function. Instead, leave the lists alone and iterate using a dedicated
pointer. Then UNLEAK immediately before returning.

After this change, it is clearer that the leaks happen as we return, and
not as we process the list. That is, we could just as well have used
`free_commit_list()`. Also, leaving a "result" unchanged as we display
it feels (marginally) better.

In `handle_independent()` we can drop `result` while we're here and
reuse the `revs`-variable instead. That matches several other users of
`reduce_heads()`. The memory-leak that this hides will be addressed in
the next commit.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 builtin/merge-base.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 6dbd167d3..fd0eba14b 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -9,20 +9,20 @@
 
 static int show_merge_base(struct commit **rev, int rev_nr, int show_all)
 {
-	struct commit_list *result;
+	struct commit_list *result, *r;
 
 	result = get_merge_bases_many_dirty(rev[0], rev_nr - 1, rev + 1);
 
 	if (!result)
 		return 1;
 
-	while (result) {
-		printf("%s\n", oid_to_hex(&result->item->object.oid));
+	for (r = result; r; r = r->next) {
+		printf("%s\n", oid_to_hex(&r->item->object.oid));
 		if (!show_all)
-			return 0;
-		result = result->next;
+			break;
 	}
 
+	UNLEAK(result);
 	return 0;
 }
 
@@ -51,28 +51,28 @@ static struct commit *get_commit_reference(const char *arg)
 
 static int handle_independent(int count, const char **args)
 {
-	struct commit_list *revs = NULL;
-	struct commit_list *result;
+	struct commit_list *revs = NULL, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = reduce_heads(revs);
-	if (!result)
+	revs = reduce_heads(revs);
+
+	if (!revs)
 		return 1;
 
-	while (result) {
-		printf("%s\n", oid_to_hex(&result->item->object.oid));
-		result = result->next;
-	}
+	for (rev = revs; rev; rev = rev->next)
+		printf("%s\n", oid_to_hex(&rev->item->object.oid));
+
+	UNLEAK(revs);
 	return 0;
 }
 
 static int handle_octopus(int count, const char **args, int show_all)
 {
 	struct commit_list *revs = NULL;
-	struct commit_list *result;
+	struct commit_list *result, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
@@ -83,13 +83,13 @@ static int handle_octopus(int count, const char **args, int show_all)
 	if (!result)
 		return 1;
 
-	while (result) {
-		printf("%s\n", oid_to_hex(&result->item->object.oid));
+	for (rev = result; rev; rev = rev->next) {
+		printf("%s\n", oid_to_hex(&rev->item->object.oid));
 		if (!show_all)
-			return 0;
-		result = result->next;
+			break;
 	}
 
+	UNLEAK(result);
 	return 0;
 }
 
-- 
2.15.0.415.gac1375d7e


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/2] reduce_heads: fix memory leaks
  2017-11-05 20:26     ` [PATCH v2 0/2] " Martin Ågren
  2017-11-05 20:26       ` [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists Martin Ågren
@ 2017-11-05 20:26       ` Martin Ågren
  2017-11-06  1:12       ` [PATCH v2 0/2] " Junio C Hamano
  2 siblings, 0 replies; 12+ messages in thread
From: Martin Ågren @ 2017-11-05 20:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

We currently have seven callers of `reduce_heads(foo)`. Six of them do
not use the original list `foo` again, and actually, all six of those
end up leaking it.

Introduce and use `reduce_heads_replace(&foo)` as a leak-free version of
`foo = reduce_heads(foo)` to fix several of these. Fix the remaining
leaks using `free_commit_list()`.

While we're here, document `reduce_heads()` and mark it as `extern`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 commit.h                | 18 +++++++++++++++++-
 builtin/commit.c        |  2 +-
 builtin/fmt-merge-msg.c |  2 +-
 builtin/merge-base.c    |  6 ++++--
 builtin/merge.c         |  1 +
 builtin/pull.c          |  5 ++++-
 commit.c                |  7 +++++++
 7 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/commit.h b/commit.h
index 6d769590f..99a3fea68 100644
--- a/commit.h
+++ b/commit.h
@@ -313,7 +313,23 @@ extern int interactive_add(int argc, const char **argv, const char *prefix, int
 extern int run_add_interactive(const char *revision, const char *patch_mode,
 			       const struct pathspec *pathspec);
 
-struct commit_list *reduce_heads(struct commit_list *heads);
+/*
+ * Takes a list of commits and returns a new list where those
+ * have been removed that can be reached from other commits in
+ * the list. It is useful for, e.g., reducing the commits
+ * randomly thrown at the git-merge command and removing
+ * redundant commits that the user shouldn't have given to it.
+ *
+ * This function destroys the STALE bit of the commit objects'
+ * flags.
+ */
+extern struct commit_list *reduce_heads(struct commit_list *heads);
+
+/*
+ * Like `reduce_heads()`, except it replaces the list. Use this
+ * instead of `foo = reduce_heads(foo);` to avoid memory leaks.
+ */
+extern void reduce_heads_replace(struct commit_list **heads);
 
 struct commit_extra_header {
 	struct commit_extra_header *next;
diff --git a/builtin/commit.c b/builtin/commit.c
index d75b3805e..11c474018 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1728,7 +1728,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 				allow_fast_forward = 0;
 		}
 		if (allow_fast_forward)
-			parents = reduce_heads(parents);
+			reduce_heads_replace(&parents);
 	} else {
 		if (!reflog_msg)
 			reflog_msg = (whence == FROM_CHERRY_PICK)
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index e99b5ddbf..27a2361e9 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -571,7 +571,7 @@ static void find_merge_parents(struct merge_parents *result,
 	head_commit = lookup_commit(head);
 	if (head_commit)
 		commit_list_insert(head_commit, &parents);
-	parents = reduce_heads(parents);
+	reduce_heads_replace(&parents);
 
 	while (parents) {
 		struct commit *cmit = pop_commit(&parents);
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index fd0eba14b..0178ca772 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -57,7 +57,7 @@ static int handle_independent(int count, const char **args)
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	revs = reduce_heads(revs);
+	reduce_heads_replace(&revs);
 
 	if (!revs)
 		return 1;
@@ -78,7 +78,9 @@ static int handle_octopus(int count, const char **args, int show_all)
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = reduce_heads(get_octopus_merge_bases(revs));
+	result = get_octopus_merge_bases(revs);
+	free_commit_list(revs);
+	reduce_heads_replace(&result);
 
 	if (!result)
 		return 1;
diff --git a/builtin/merge.c b/builtin/merge.c
index ab5ffe85e..fbbf2a9e5 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -999,6 +999,7 @@ static struct commit_list *reduce_parents(struct commit *head_commit,
 
 	/* Find what parents to record by checking independent ones. */
 	parents = reduce_heads(remoteheads);
+	free_commit_list(remoteheads);
 
 	remoteheads = NULL;
 	remotes = &remoteheads;
diff --git a/builtin/pull.c b/builtin/pull.c
index 6f772e8a2..4edab228e 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -745,12 +745,15 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 	if (!is_null_oid(fork_point))
 		commit_list_insert(lookup_commit_reference(fork_point), &revs);
 
-	result = reduce_heads(get_octopus_merge_bases(revs));
+	result = get_octopus_merge_bases(revs);
 	free_commit_list(revs);
+	reduce_heads_replace(&result);
+
 	if (!result)
 		return 1;
 
 	oidcpy(merge_base, &result->item->object.oid);
+	free_commit_list(result);
 	return 0;
 }
 
diff --git a/commit.c b/commit.c
index 1e0e63379..cab8d4455 100644
--- a/commit.c
+++ b/commit.c
@@ -1090,6 +1090,13 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 	return result;
 }
 
+void reduce_heads_replace(struct commit_list **heads)
+{
+	struct commit_list *result = reduce_heads(*heads);
+	free_commit_list(*heads);
+	*heads = result;
+}
+
 static const char gpg_sig_header[] = "gpgsig";
 static const int gpg_sig_header_len = sizeof(gpg_sig_header) - 1;
 
-- 
2.15.0.415.gac1375d7e


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists
  2017-11-05 20:26       ` [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists Martin Ågren
@ 2017-11-06  1:03         ` Junio C Hamano
  2017-11-06 11:05         ` Jeff King
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2017-11-06  1:03 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git

Martin Ågren <martin.agren@gmail.com> writes:

> In several functions, we iterate through a commit list by assigning
> `result = result->next`. As a consequence, we lose the original pointer
> and eventually leak the list.
>
> These are immediate helpers to `cmd_merge_base()` which is just about to
> return, so we can use UNLEAK. For example, we could `UNLEAK(result)`
> before we start iterating. That would be a one-liner change per
> function. Instead, leave the lists alone and iterate using a dedicated
> pointer. Then UNLEAK immediately before returning.

Hmm, I cannot shake this feeling that this goes somewhat opposite to
the spirit of UNLEAK(), which I view as "It is too cumbersome and
makes the resulting code ugly if we try to make everything properly
freed, so mark what we know we will leak upfront".  The result of
this patch feels more like "Even though we took pains to restructure
the code so that we could call free_commit_list() to properly free
things, we use UNLEAK() and do not actually bother to free."

Havin said that, I do not think the resulting code has become uglier
or the conversion process (both writing and reviewing) were too
cumbersome for this particular case, and marking where we could call
free_commit_list() with UNLEAK() like this patch does may make
sense.  If somebody someday wants to call some of these helpers in
other contexts repeatedly, they may have an easier time.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/2] Re: reduce_heads: fix memory leaks
  2017-11-05 20:26     ` [PATCH v2 0/2] " Martin Ågren
  2017-11-05 20:26       ` [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists Martin Ågren
  2017-11-05 20:26       ` [PATCH v2 " Martin Ågren
@ 2017-11-06  1:12       ` Junio C Hamano
  2 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2017-11-06  1:12 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git

Martin Ågren <martin.agren@gmail.com> writes:

> Since v1 [1], I've added a preparatory patch to UNLEAK some variables.
> That sets the stage slightly better for patch 2.
>
> Junio, you placed v1 on maint. Because UNLEAK is not in maint, this is
> based on master and maint misses out on this v2. If you have any advice
> for how I should (not) do series with UNLEAK in them, I'm all ears.

As far as we know, nobody reported that these leaks made Git run out
of memory while running merge-base and prevented them from getting
desired result, so it is not worth the effort to make (part of) them
mergeable to 'maint'.  I forked the branch from 'maint' only because
it was a fix and it was not harder than forking from 'master'.

If 2/2 (which was 1/1 in the v1) were fixes to a very grave error,
then I might have suggested to do the 2/2 on maint first and call
that topic ${some_grave_error}_fix-maint; then fork another topic
${some_grave_error}_fix at master, merge the _fix-maint topic in,
and then do the 1/2 on top.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists
  2017-11-05 20:26       ` [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists Martin Ågren
  2017-11-06  1:03         ` Junio C Hamano
@ 2017-11-06 11:05         ` Jeff King
  2017-11-07 20:39           ` [PATCH v3 1/2] builtin/merge-base: free " Martin Ågren
  2017-11-07 20:39           ` [PATCH v3 2/2] reduce_heads: fix memory leaks Martin Ågren
  1 sibling, 2 replies; 12+ messages in thread
From: Jeff King @ 2017-11-06 11:05 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Junio C Hamano

On Sun, Nov 05, 2017 at 09:26:30PM +0100, Martin Ågren wrote:

> In several functions, we iterate through a commit list by assigning
> `result = result->next`. As a consequence, we lose the original pointer
> and eventually leak the list.
> 
> These are immediate helpers to `cmd_merge_base()` which is just about to
> return, so we can use UNLEAK. For example, we could `UNLEAK(result)`
> before we start iterating. That would be a one-liner change per
> function. Instead, leave the lists alone and iterate using a dedicated
> pointer. Then UNLEAK immediately before returning.
> 
> After this change, it is clearer that the leaks happen as we return, and
> not as we process the list. That is, we could just as well have used
> `free_commit_list()`. Also, leaving a "result" unchanged as we display
> it feels (marginally) better.

I think it would be OK to show that we are consuming the list as we go,
like:

  while ((commit = pop_commit(&result))
	...do the thing ...

but like you I think I prefer the read-only iteration followed by a
separate deallocation/leak phase.

Like Junio, though, I kind of wonder if just calling free_commit_list()
would be the most readable thing.

The "other" benefit of UNLEAK() is that it has zero runtime cost. I'm
not sure if that matters much here, so I'd tend to choose the thing that
is most readable / understandable. These aren't cmd_* functions, so it's
not _immediately_ obvious that it is OK for them to leak. But it's also
pretty easy to see that they are called as the final element of
cmd_merge_base(). So I'm on the fence as far as this being a good use of
UNLEAK() or not.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3 1/2] builtin/merge-base: free commit lists
  2017-11-06 11:05         ` Jeff King
@ 2017-11-07 20:39           ` Martin Ågren
  2017-11-08  2:40             ` Junio C Hamano
  2017-11-07 20:39           ` [PATCH v3 2/2] reduce_heads: fix memory leaks Martin Ågren
  1 sibling, 1 reply; 12+ messages in thread
From: Martin Ågren @ 2017-11-07 20:39 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King

In several functions, we iterate through a commit list by assigning
`result = result->next`. As a consequence, we lose the original pointer
and eventually leak the list.

Rewrite the loops so that we keep the original pointers, then call
`free_commit_list()`. Various alternatives were considered:

1) Use `UNLEAK(result)` before the loop. Simple change, but not very
pretty. These would definitely be new lows among our usages of UNLEAK.
2) Use `pop_commit()` when looping. Slightly less simple change, but it
feels slightly preferable to first display the list, then free it.
3) As in this patch, but with `UNLEAK()` instead of freeing. We'd still
go through all the trouble of refactoring the loop, and because it's not
super-obvious that we're about to exit, let's just free the lists -- it
probably doesn't affect the runtime much.

In `handle_independent()` we can drop `result` while we're here and
reuse the `revs`-variable instead. That matches several other users of
`reduce_heads()`. The memory-leak that this hides will be addressed in
the next commit.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
v3: Like v2 but s/UNLEAK/free_commit_list/ and rebased onto maint.

> Like Junio, though, I kind of wonder if just calling free_commit_list()
> would be the most readable thing.

Thanks Junio and Peff for insightful considerations around UNLEAK vs
free. I'd rather miss one or two opportunities too UNLEAK than use it
too often. I've got a couple of patches in the pipeline that will be
better thanks to your comments. Thanks.

 builtin/merge-base.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 6dbd167d3b..e17835fabb 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -9,20 +9,20 @@
 
 static int show_merge_base(struct commit **rev, int rev_nr, int show_all)
 {
-	struct commit_list *result;
+	struct commit_list *result, *r;
 
 	result = get_merge_bases_many_dirty(rev[0], rev_nr - 1, rev + 1);
 
 	if (!result)
 		return 1;
 
-	while (result) {
-		printf("%s\n", oid_to_hex(&result->item->object.oid));
+	for (r = result; r; r = r->next) {
+		printf("%s\n", oid_to_hex(&r->item->object.oid));
 		if (!show_all)
-			return 0;
-		result = result->next;
+			break;
 	}
 
+	free_commit_list(result);
 	return 0;
 }
 
@@ -51,28 +51,28 @@ static struct commit *get_commit_reference(const char *arg)
 
 static int handle_independent(int count, const char **args)
 {
-	struct commit_list *revs = NULL;
-	struct commit_list *result;
+	struct commit_list *revs = NULL, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = reduce_heads(revs);
-	if (!result)
+	revs = reduce_heads(revs);
+
+	if (!revs)
 		return 1;
 
-	while (result) {
-		printf("%s\n", oid_to_hex(&result->item->object.oid));
-		result = result->next;
-	}
+	for (rev = revs; rev; rev = rev->next)
+		printf("%s\n", oid_to_hex(&rev->item->object.oid));
+
+	free_commit_list(revs);
 	return 0;
 }
 
 static int handle_octopus(int count, const char **args, int show_all)
 {
 	struct commit_list *revs = NULL;
-	struct commit_list *result;
+	struct commit_list *result, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
@@ -83,13 +83,13 @@ static int handle_octopus(int count, const char **args, int show_all)
 	if (!result)
 		return 1;
 
-	while (result) {
-		printf("%s\n", oid_to_hex(&result->item->object.oid));
+	for (rev = result; rev; rev = rev->next) {
+		printf("%s\n", oid_to_hex(&rev->item->object.oid));
 		if (!show_all)
-			return 0;
-		result = result->next;
+			break;
 	}
 
+	free_commit_list(result);
 	return 0;
 }
 
-- 
2.15.0.415.gac1375d7e


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 2/2] reduce_heads: fix memory leaks
  2017-11-06 11:05         ` Jeff King
  2017-11-07 20:39           ` [PATCH v3 1/2] builtin/merge-base: free " Martin Ågren
@ 2017-11-07 20:39           ` Martin Ågren
  1 sibling, 0 replies; 12+ messages in thread
From: Martin Ågren @ 2017-11-07 20:39 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King

We currently have seven callers of `reduce_heads(foo)`. Six of them do
not use the original list `foo` again, and actually, all six of those
end up leaking it.

Introduce and use `reduce_heads_replace(&foo)` as a leak-free version of
`foo = reduce_heads(foo)` to fix several of these. Fix the remaining
leaks using `free_commit_list()`.

While we're here, document `reduce_heads()` and mark it as `extern`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
v3: Like v2 but rebased onto maint.

 commit.h                | 18 +++++++++++++++++-
 builtin/commit.c        |  2 +-
 builtin/fmt-merge-msg.c |  2 +-
 builtin/merge-base.c    |  6 ++++--
 builtin/merge.c         |  1 +
 builtin/pull.c          |  5 ++++-
 commit.c                |  7 +++++++
 7 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/commit.h b/commit.h
index f3805fec6b..cfa2636e33 100644
--- a/commit.h
+++ b/commit.h
@@ -313,7 +313,23 @@ extern int interactive_add(int argc, const char **argv, const char *prefix, int
 extern int run_add_interactive(const char *revision, const char *patch_mode,
 			       const struct pathspec *pathspec);
 
-struct commit_list *reduce_heads(struct commit_list *heads);
+/*
+ * Takes a list of commits and returns a new list where those
+ * have been removed that can be reached from other commits in
+ * the list. It is useful for, e.g., reducing the commits
+ * randomly thrown at the git-merge command and removing
+ * redundant commits that the user shouldn't have given to it.
+ *
+ * This function destroys the STALE bit of the commit objects'
+ * flags.
+ */
+extern struct commit_list *reduce_heads(struct commit_list *heads);
+
+/*
+ * Like `reduce_heads()`, except it replaces the list. Use this
+ * instead of `foo = reduce_heads(foo);` to avoid memory leaks.
+ */
+extern void reduce_heads_replace(struct commit_list **heads);
 
 struct commit_extra_header {
 	struct commit_extra_header *next;
diff --git a/builtin/commit.c b/builtin/commit.c
index 1a0da71a43..73c4ec2b08 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1719,7 +1719,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 				allow_fast_forward = 0;
 		}
 		if (allow_fast_forward)
-			parents = reduce_heads(parents);
+			reduce_heads_replace(&parents);
 	} else {
 		if (!reflog_msg)
 			reflog_msg = (whence == FROM_CHERRY_PICK)
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index e99b5ddbf9..27a2361e91 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -571,7 +571,7 @@ static void find_merge_parents(struct merge_parents *result,
 	head_commit = lookup_commit(head);
 	if (head_commit)
 		commit_list_insert(head_commit, &parents);
-	parents = reduce_heads(parents);
+	reduce_heads_replace(&parents);
 
 	while (parents) {
 		struct commit *cmit = pop_commit(&parents);
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index e17835fabb..24f6c71935 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -57,7 +57,7 @@ static int handle_independent(int count, const char **args)
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	revs = reduce_heads(revs);
+	reduce_heads_replace(&revs);
 
 	if (!revs)
 		return 1;
@@ -78,7 +78,9 @@ static int handle_octopus(int count, const char **args, int show_all)
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = reduce_heads(get_octopus_merge_bases(revs));
+	result = get_octopus_merge_bases(revs);
+	free_commit_list(revs);
+	reduce_heads_replace(&result);
 
 	if (!result)
 		return 1;
diff --git a/builtin/merge.c b/builtin/merge.c
index 23c53a3082..5cccf21c30 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -981,6 +981,7 @@ static struct commit_list *reduce_parents(struct commit *head_commit,
 
 	/* Find what parents to record by checking independent ones. */
 	parents = reduce_heads(remoteheads);
+	free_commit_list(remoteheads);
 
 	remoteheads = NULL;
 	remotes = &remoteheads;
diff --git a/builtin/pull.c b/builtin/pull.c
index 9b86e519b1..a5dd6426d5 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -741,12 +741,15 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 	if (!is_null_oid(fork_point))
 		commit_list_insert(lookup_commit_reference(fork_point), &revs);
 
-	result = reduce_heads(get_octopus_merge_bases(revs));
+	result = get_octopus_merge_bases(revs);
 	free_commit_list(revs);
+	reduce_heads_replace(&result);
+
 	if (!result)
 		return 1;
 
 	oidcpy(merge_base, &result->item->object.oid);
+	free_commit_list(result);
 	return 0;
 }
 
diff --git a/commit.c b/commit.c
index d3150d6270..7ff102c0df 100644
--- a/commit.c
+++ b/commit.c
@@ -1083,6 +1083,13 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 	return result;
 }
 
+void reduce_heads_replace(struct commit_list **heads)
+{
+	struct commit_list *result = reduce_heads(*heads);
+	free_commit_list(*heads);
+	*heads = result;
+}
+
 static const char gpg_sig_header[] = "gpgsig";
 static const int gpg_sig_header_len = sizeof(gpg_sig_header) - 1;
 
-- 
2.15.0.415.gac1375d7e


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 1/2] builtin/merge-base: free commit lists
  2017-11-07 20:39           ` [PATCH v3 1/2] builtin/merge-base: free " Martin Ågren
@ 2017-11-08  2:40             ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2017-11-08  2:40 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Jeff King

Martin Ågren <martin.agren@gmail.com> writes:

> In several functions, we iterate through a commit list by assigning
> `result = result->next`. As a consequence, we lose the original pointer
> and eventually leak the list.
>
> Rewrite the loops so that we keep the original pointers, then call
> `free_commit_list()`. Various alternatives were considered:

So, the reader expects to see a list of alternatives that we
considered but did not use in this solution to follow.

>
> 1) Use `UNLEAK(result)` before the loop. Simple change, but not very
> pretty. These would definitely be new lows among our usages of UNLEAK.

I am not sure if I agree with the judgment, but we did reject it, so
describing it as a candidate is good.

> 2) Use `pop_commit()` when looping. Slightly less simple change, but it
> feels slightly preferable to first display the list, then free it.

OK.

> 3) As in this patch, but with `UNLEAK()` instead of freeing. We'd still
> go through all the trouble of refactoring the loop, and because it's not
> super-obvious that we're about to exit, let's just free the lists -- it
> probably doesn't affect the runtime much.

That does include what we did, too.  A rejected alternative is only
the first 3/4 of what this says.

    3) As in this patch, but with `UNLEAK()` instead of freeing. We'd
       still go through all the trouble of refactoring the loop, and the
       use of UNLEAK() is left questionable because it's not very obvious
       that we're about to exit.

and then you can begin a separate paragraph, after the above lists,
e.g.

    Let's just free the lists -- it probably doesn't affect the
    runtime much.

> In `handle_independent()` we can drop `result` while we're here and
> reuse the `revs`-variable instead. That matches several other users of
> `reduce_heads()`. The memory-leak that this hides will be addressed in
> the next commit.

The patch text looks agreeable.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-11-08  2:40 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-01  9:03 [PATCH] reduce_heads: fix memory leaks Martin Ågren
2017-11-02  3:11 ` Junio C Hamano
2017-11-02 10:45   ` Martin Ågren
2017-11-05 20:26     ` [PATCH v2 0/2] " Martin Ågren
2017-11-05 20:26       ` [PATCH v2 1/2] builtin/merge-base: UNLEAK commit lists Martin Ågren
2017-11-06  1:03         ` Junio C Hamano
2017-11-06 11:05         ` Jeff King
2017-11-07 20:39           ` [PATCH v3 1/2] builtin/merge-base: free " Martin Ågren
2017-11-08  2:40             ` Junio C Hamano
2017-11-07 20:39           ` [PATCH v3 2/2] reduce_heads: fix memory leaks Martin Ågren
2017-11-05 20:26       ` [PATCH v2 " Martin Ågren
2017-11-06  1:12       ` [PATCH v2 0/2] " Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).