git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] config: free resources of `struct config_store_data`
@ 2018-05-13  8:23 Martin Ågren
  2018-05-13  8:59 ` Eric Sunshine
  2018-05-13 18:40 ` [PATCH] " Martin Ågren
  0 siblings, 2 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-13  8:23 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

Commit fee8572c6d (config: avoid using the global variable `store`,
2018-04-09) dropped the staticness of a certain struct, instead letting
the users create an instance on the stack and pass around a pointer.

We do not free the memory that the struct tracks. When the struct was
static, the memory would always be reachable. Now that we keep the
struct on the stack, though, as soon as we return, it goes out of scope
and we leak the memory it points to.

Introduce and use a small helper function `config_store_data_clear()` to
plug these leaks. This should be safe. The memory tracked here is config
parser events. Once the users (`git_config_set_multivar_in_file_gently()`
and `git_config_copy_or_rename_section_in_file()` at the moment) are
done, no-one should be holding on to a pointer into this memory.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 config.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/config.c b/config.c
index 6f8f1d8c11..83d7d0851a 100644
--- a/config.c
+++ b/config.c
@@ -2333,6 +2333,13 @@ struct config_store_data {
 	unsigned int key_seen:1, section_seen:1, is_keys_section:1;
 };
 
+void config_store_data_clear(struct config_store_data *store)
+{
+	free(store->parsed);
+	free(store->seen);
+	memset(store, 0, sizeof(*store));
+}
+
 static int matches(const char *key, const char *value,
 		   const struct config_store_data *store)
 {
@@ -2887,6 +2894,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 		munmap(contents, contents_sz);
 	if (in_fd >= 0)
 		close(in_fd);
+	config_store_data_clear(&store);
 	return ret;
 
 write_err_out:
@@ -3127,6 +3135,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename
 	rollback_lock_file(&lock);
 out_no_rollback:
 	free(filename_buf);
+	config_store_data_clear(&store);
 	return ret;
 }
 
-- 
2.17.0.583.g9a75a153ac


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] config: free resources of `struct config_store_data`
  2018-05-13  8:23 [PATCH] config: free resources of `struct config_store_data` Martin Ågren
@ 2018-05-13  8:59 ` Eric Sunshine
  2018-05-13  9:58   ` Martin Ågren
  2018-05-13 18:40 ` [PATCH] " Martin Ågren
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Sunshine @ 2018-05-13  8:59 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git List, Johannes Schindelin

On Sun, May 13, 2018 at 4:23 AM Martin Ågren <martin.agren@gmail.com> wrote:
> Commit fee8572c6d (config: avoid using the global variable `store`,
> 2018-04-09) dropped the staticness of a certain struct, instead letting
> the users create an instance on the stack and pass around a pointer.

> We do not free the memory that the struct tracks. When the struct was
> static, the memory would always be reachable. Now that we keep the
> struct on the stack, though, as soon as we return, it goes out of scope
> and we leak the memory it points to.

> Introduce and use a small helper function `config_store_data_clear()` to
> plug these leaks. This should be safe. The memory tracked here is config
> parser events. Once the users (`git_config_set_multivar_in_file_gently()`
> and `git_config_copy_or_rename_section_in_file()` at the moment) are
> done, no-one should be holding on to a pointer into this memory.

> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> diff --git a/config.c b/config.c
> @@ -2333,6 +2333,13 @@ struct config_store_data {
> +void config_store_data_clear(struct config_store_data *store)
> +{
> +       free(store->parsed);
> +       free(store->seen);
> +       memset(store, 0, sizeof(*store));
> +}

A newcomer to this code (such as myself) might legitimately wonder why
store->key and store->value_regex are not also being cleaned up by this
function. An examination of the relevant code reveals that those structure
members are manually (and perhaps tediously) freed already by
git_config_set_multivar_in_file_gently(), however, if
config_store_data_clear() was responsible for freeing them, then
git_config_set_multivar_in_file_gently() could be made a bit less noisy.

On the other hand, if there's actually a good reason for
  config_store_data_clear() not freeing those members, then perhaps it
deserves an in-code comment explaining why they are exempt.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] config: free resources of `struct config_store_data`
  2018-05-13  8:59 ` Eric Sunshine
@ 2018-05-13  9:58   ` Martin Ågren
  2018-05-13  9:58     ` [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-13  9:58 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

On 13 May 2018 at 10:59, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sun, May 13, 2018 at 4:23 AM Martin Ågren <martin.agren@gmail.com> wrote:
>
>> Introduce and use a small helper function `config_store_data_clear()` to
>> plug these leaks. This should be safe. The memory tracked here is config
>> parser events. Once the users (`git_config_set_multivar_in_file_gently()`
>> and `git_config_copy_or_rename_section_in_file()` at the moment) are
>> done, no-one should be holding on to a pointer into this memory.
>
> A newcomer to this code (such as myself) might legitimately wonder why
> store->key and store->value_regex are not also being cleaned up by this

Good point. I was only concerned by the members that no-one took
responsibility for.

> function. An examination of the relevant code reveals that those structure
> members are manually (and perhaps tediously) freed already by
> git_config_set_multivar_in_file_gently(), however, if
> config_store_data_clear() was responsible for freeing them, then
> git_config_set_multivar_in_file_gently() could be made a bit less noisy.

Yes, that's true.

> On the other hand, if there's actually a good reason for
>   config_store_data_clear() not freeing those members, then perhaps it
> deserves an in-code comment explaining why they are exempt.

Not any good reason that I can think of, other than "history". But to be
clear, a day ago I was as much of a newcomer in this part of the code as
you are. Johannes is the one who might have the most up-to-date
understanding of this.

How about the following two patches as patches 2/3 and 3/3? I would also
need to mention in the commit message of this patch (1/3) that the
function will soon learn to clean up more members.

I could of course squash the three patches into one, but there is some
minor trickery involved, like we can't reuse a pointer in one place, but
need to xstrdup it.

Thank you for your comments. I'd be very interested in your thoughts on
this.

Martin

Martin Ågren (2):
  config: let `config_store_data_clear()` handle `value_regex`
  config: let `config_store_data_clear()` handle `key`

 config.c | 26 ++++++++------------------
 1 file changed, 8 insertions(+), 18 deletions(-)

-- 
2.17.0.583.g9a75a153ac


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex`
  2018-05-13  9:58   ` Martin Ågren
@ 2018-05-13  9:58     ` Martin Ågren
  2018-05-14  3:05       ` Eric Sunshine
  2018-05-13  9:58     ` [PATCH 3/1] config: let `config_store_data_clear()` handle `key` Martin Ågren
  2018-05-14  3:03     ` [PATCH] config: free resources of `struct config_store_data` Eric Sunshine
  2 siblings, 1 reply; 18+ messages in thread
From: Martin Ågren @ 2018-05-13  9:58 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

Instead of carefully clearing up `value_regex` in each code path, let
`config_store_data_clear()` handle that.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
I *think* that it should be ok to `regfree()` after `regcomp()` failed,
but I'll need to look into that some more (and say something about it in
the commit message).

 config.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index 83d7d0851a..2e3c6c94e9 100644
--- a/config.c
+++ b/config.c
@@ -2335,6 +2335,11 @@ struct config_store_data {
 
 void config_store_data_clear(struct config_store_data *store)
 {
+	if (store->value_regex != NULL &&
+	    store->value_regex != CONFIG_REGEX_NONE) {
+		regfree(store->value_regex);
+		free(store->value_regex);
+	}
 	free(store->parsed);
 	free(store->seen);
 	memset(store, 0, sizeof(*store));
@@ -2722,7 +2727,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 			if (regcomp(store.value_regex, value_regex,
 					REG_EXTENDED)) {
 				error("invalid pattern: %s", value_regex);
-				free(store.value_regex);
 				ret = CONFIG_INVALID_PATTERN;
 				goto out_free;
 			}
@@ -2748,21 +2752,11 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 						      &store, &opts)) {
 			error("invalid config file %s", config_filename);
 			free(store.key);
-			if (store.value_regex != NULL &&
-			    store.value_regex != CONFIG_REGEX_NONE) {
-				regfree(store.value_regex);
-				free(store.value_regex);
-			}
 			ret = CONFIG_INVALID_FILE;
 			goto out_free;
 		}
 
 		free(store.key);
-		if (store.value_regex != NULL &&
-		    store.value_regex != CONFIG_REGEX_NONE) {
-			regfree(store.value_regex);
-			free(store.value_regex);
-		}
 
 		/* if nothing to unset, or too many matches, error out */
 		if ((store.seen_nr == 0 && value == NULL) ||
-- 
2.17.0.583.g9a75a153ac


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/1] config: let `config_store_data_clear()` handle `key`
  2018-05-13  9:58   ` Martin Ågren
  2018-05-13  9:58     ` [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
@ 2018-05-13  9:58     ` Martin Ågren
  2018-05-14  3:03     ` [PATCH] config: free resources of `struct config_store_data` Eric Sunshine
  2 siblings, 0 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-13  9:58 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

Instead of carefully clearing up `key` in each code path, let
`config_store_data_clear()` handle that.

We still need to free it before replacing it, though. Move that freeing
closer to the replacing to be safe. Note that in that same part of the
code, we can no longer set `key` to the original pointer, but need to
`xstrdup()` it.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 config.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/config.c b/config.c
index 2e3c6c94e9..963edacf10 100644
--- a/config.c
+++ b/config.c
@@ -2335,6 +2335,7 @@ struct config_store_data {
 
 void config_store_data_clear(struct config_store_data *store)
 {
+	free(store->key);
 	if (store->value_regex != NULL &&
 	    store->value_regex != CONFIG_REGEX_NONE) {
 		regfree(store->value_regex);
@@ -2679,7 +2680,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	fd = hold_lock_file_for_update(&lock, config_filename, 0);
 	if (fd < 0) {
 		error_errno("could not lock config file %s", config_filename);
-		free(store.key);
 		ret = CONFIG_NO_LOCK;
 		goto out_free;
 	}
@@ -2689,8 +2689,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	 */
 	in_fd = open(config_filename, O_RDONLY);
 	if ( in_fd < 0 ) {
-		free(store.key);
-
 		if ( ENOENT != errno ) {
 			error_errno("opening %s", config_filename);
 			ret = CONFIG_INVALID_FILE; /* same as "invalid config file" */
@@ -2702,7 +2700,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 			goto out_free;
 		}
 
-		store.key = (char *)key;
+		free(store.key);
+		store.key = xstrdup(key);
 		if (write_section(fd, key, &store) < 0 ||
 		    write_pair(fd, key, value, &store) < 0)
 			goto write_err_out;
@@ -2751,13 +2750,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 						      config_filename,
 						      &store, &opts)) {
 			error("invalid config file %s", config_filename);
-			free(store.key);
 			ret = CONFIG_INVALID_FILE;
 			goto out_free;
 		}
 
-		free(store.key);
-
 		/* if nothing to unset, or too many matches, error out */
 		if ((store.seen_nr == 0 && value == NULL) ||
 		    (store.seen_nr > 1 && multi_replace == 0)) {
-- 
2.17.0.583.g9a75a153ac


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] config: free resources of `struct config_store_data`
  2018-05-13  8:23 [PATCH] config: free resources of `struct config_store_data` Martin Ågren
  2018-05-13  8:59 ` Eric Sunshine
@ 2018-05-13 18:40 ` Martin Ågren
  1 sibling, 0 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-13 18:40 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Johannes Schindelin

On 13 May 2018 at 10:23, Martin Ågren <martin.agren@gmail.com> wrote:

> +void config_store_data_clear(struct config_store_data *store)

I will do s/void/static void/ here...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] config: free resources of `struct config_store_data`
  2018-05-13  9:58   ` Martin Ågren
  2018-05-13  9:58     ` [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
  2018-05-13  9:58     ` [PATCH 3/1] config: let `config_store_data_clear()` handle `key` Martin Ågren
@ 2018-05-14  3:03     ` Eric Sunshine
  2018-05-20 10:42       ` [PATCH v2 0/3] " Martin Ågren
  2 siblings, 1 reply; 18+ messages in thread
From: Eric Sunshine @ 2018-05-14  3:03 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git List, Johannes Schindelin

On Sun, May 13, 2018 at 5:58 AM, Martin Ågren <martin.agren@gmail.com> wrote:
> On 13 May 2018 at 10:59, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> On Sun, May 13, 2018 at 4:23 AM Martin Ågren <martin.agren@gmail.com> wrote:
>>> Introduce and use a small helper function `config_store_data_clear()` to
>>> plug these leaks. This should be safe. The memory tracked here is config
>>> parser events. Once the users (`git_config_set_multivar_in_file_gently()`
>>> and `git_config_copy_or_rename_section_in_file()` at the moment) are
>>> done, no-one should be holding on to a pointer into this memory.
>>
>> A newcomer to this code (such as myself) might legitimately wonder why
>> store->key and store->value_regex are not also being cleaned up by this
>> function. An examination of the relevant code reveals that those structure
>> members are manually (and perhaps tediously) freed already by
>> git_config_set_multivar_in_file_gently(), however, if
>> config_store_data_clear() was responsible for freeing them, then
>> git_config_set_multivar_in_file_gently() could be made a bit less noisy.
>
> How about the following two patches as patches 2/3 and 3/3? I would also
> need to mention in the commit message of this patch (1/3) that the
> function will soon learn to clean up more members.
>
> I could of course squash the three patches into one, but there is some
> minor trickery involved, like we can't reuse a pointer in one place, but
> need to xstrdup it.
>
> Thank you for your comments. I'd be very interested in your thoughts on
> this.

Yep, making this a multi-part patch series and updating the commit
message of the patch which introduces config_store_data_clear(), as
you suggest, makes sense. The patch series could be organized
differently -- such as first moving freeing of 'value_regex' into new
config_store_data_clear(), then freeing additional fields in later
patches -- but I don't think it matters much in practice, so the
current organization is likely good enough.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex`
  2018-05-13  9:58     ` [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
@ 2018-05-14  3:05       ` Eric Sunshine
  2018-05-20 10:50         ` [PATCH] regex: do not call `regfree()` if compilation fails Martin Ågren
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sunshine @ 2018-05-14  3:05 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git List, Johannes Schindelin

On Sun, May 13, 2018 at 5:58 AM, Martin Ågren <martin.agren@gmail.com> wrote:
> Instead of carefully clearing up `value_regex` in each code path, let
> `config_store_data_clear()` handle that.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> I *think* that it should be ok to `regfree()` after `regcomp()` failed,
> but I'll need to look into that some more (and say something about it in
> the commit message).

My research (for instance [1,2]) seems to indicate that it would be
better to avoid regfree() upon failed regcomp(), even though such a
situation is handled sanely in some implementations.

[1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html
[2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 0/3] config: free resources of `struct config_store_data`
  2018-05-14  3:03     ` [PATCH] config: free resources of `struct config_store_data` Eric Sunshine
@ 2018-05-20 10:42       ` Martin Ågren
  2018-05-20 10:42         ` [PATCH v2 1/3] " Martin Ågren
                           ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-20 10:42 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

On 14 May 2018 at 05:03, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sun, May 13, 2018 at 5:58 AM, Martin Ågren <martin.agren@gmail.com> wrote:
>>
>> How about the following two patches as patches 2/3 and 3/3? I would also
>> need to mention in the commit message of this patch (1/3) that the
>> function will soon learn to clean up more members.
>>
>
> Yep, making this a multi-part patch series and updating the commit
> message of the patch which introduces config_store_data_clear(), as
> you suggest, makes sense. The patch series could be organized
> differently -- such as first moving freeing of 'value_regex' into new
> config_store_data_clear(), then freeing additional fields in later
> patches -- but I don't think it matters much in practice, so the
> current organization is likely good enough.

I tried such a re-ordering but wasn't entirely happy about the result
(maybe I didn't try hard enough), so here are these patches again, as a
proper series and with improved commit messages.

Martin

Martin Ågren (3):
  config: free resources of `struct config_store_data`
  config: let `config_store_data_clear()` handle `value_regex`
  config: let `config_store_data_clear()` handle `key`

 config.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

-- 
2.17.0.840.g5d83f92caf


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/3] config: free resources of `struct config_store_data`
  2018-05-20 10:42       ` [PATCH v2 0/3] " Martin Ågren
@ 2018-05-20 10:42         ` Martin Ågren
  2018-05-20 10:42         ` [PATCH v2 2/3] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-20 10:42 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

Commit fee8572c6d (config: avoid using the global variable `store`,
2018-04-09) dropped the staticness of a certain struct, instead letting
the users create an instance on the stack and pass around a pointer.

We do not free all the memory that the struct tracks. When the struct
was static, the memory would always be reachable. Now that we keep the
struct on the stack, though, as soon as we return, it goes out of scope
and we leak the memory it points to. In particular, we leak the memory
pointed to by the `parsed` and `seen` fields.

Introduce and use a helper function `config_store_data_clear()` to plug
these leaks. The memory tracked here is config parser events. Once the
users (`git_config_set_multivar_in_file_gently()` and
`git_config_copy_or_rename_section_in_file()` at the moment) are done,
no-one should be holding on to a pointer into this memory.

There are two more members of the struct that are candidates for freeing
in this new function (`key` and `value_regex`). Those are actually
already being taken care of. The next couple of patches will move their
freeing into the function we are adding here.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 config.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/config.c b/config.c
index 6f8f1d8c11..b3282f7193 100644
--- a/config.c
+++ b/config.c
@@ -2333,6 +2333,13 @@ struct config_store_data {
 	unsigned int key_seen:1, section_seen:1, is_keys_section:1;
 };
 
+static void config_store_data_clear(struct config_store_data *store)
+{
+	free(store->parsed);
+	free(store->seen);
+	memset(store, 0, sizeof(*store));
+}
+
 static int matches(const char *key, const char *value,
 		   const struct config_store_data *store)
 {
@@ -2887,6 +2894,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 		munmap(contents, contents_sz);
 	if (in_fd >= 0)
 		close(in_fd);
+	config_store_data_clear(&store);
 	return ret;
 
 write_err_out:
@@ -3127,6 +3135,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename
 	rollback_lock_file(&lock);
 out_no_rollback:
 	free(filename_buf);
+	config_store_data_clear(&store);
 	return ret;
 }
 
-- 
2.17.0.840.g5d83f92caf


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/3] config: let `config_store_data_clear()` handle `value_regex`
  2018-05-20 10:42       ` [PATCH v2 0/3] " Martin Ågren
  2018-05-20 10:42         ` [PATCH v2 1/3] " Martin Ågren
@ 2018-05-20 10:42         ` Martin Ågren
  2018-05-20 10:42         ` [PATCH v2 3/3] config: let `config_store_data_clear()` handle `key` Martin Ågren
  2018-05-23  7:01         ` [PATCH v2 0/3] config: free resources of `struct config_store_data` Eric Sunshine
  3 siblings, 0 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-20 10:42 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

Instead of duplicating the logic for clearing up `value_regex`, let
`config_store_data_clear()` handle that.

When `regcomp()` fails, the current code does not call `regfree()`. Make
sure we do the same by immediately invalidating `value_regex`. Some
implementations are able to handle such an extra `regfree()`-call [1],
but from the example in [2], we should not do so. (The language itself
in [2] is not super-clear on this.)

[1] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html

[2] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html

Researched-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 config.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index b3282f7193..ac71f3f2e1 100644
--- a/config.c
+++ b/config.c
@@ -2335,6 +2335,11 @@ struct config_store_data {
 
 static void config_store_data_clear(struct config_store_data *store)
 {
+	if (store->value_regex != NULL &&
+	    store->value_regex != CONFIG_REGEX_NONE) {
+		regfree(store->value_regex);
+		free(store->value_regex);
+	}
 	free(store->parsed);
 	free(store->seen);
 	memset(store, 0, sizeof(*store));
@@ -2722,7 +2727,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 			if (regcomp(store.value_regex, value_regex,
 					REG_EXTENDED)) {
 				error("invalid pattern: %s", value_regex);
-				free(store.value_regex);
+				FREE_AND_NULL(store.value_regex);
 				ret = CONFIG_INVALID_PATTERN;
 				goto out_free;
 			}
@@ -2748,21 +2753,11 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 						      &store, &opts)) {
 			error("invalid config file %s", config_filename);
 			free(store.key);
-			if (store.value_regex != NULL &&
-			    store.value_regex != CONFIG_REGEX_NONE) {
-				regfree(store.value_regex);
-				free(store.value_regex);
-			}
 			ret = CONFIG_INVALID_FILE;
 			goto out_free;
 		}
 
 		free(store.key);
-		if (store.value_regex != NULL &&
-		    store.value_regex != CONFIG_REGEX_NONE) {
-			regfree(store.value_regex);
-			free(store.value_regex);
-		}
 
 		/* if nothing to unset, or too many matches, error out */
 		if ((store.seen_nr == 0 && value == NULL) ||
-- 
2.17.0.840.g5d83f92caf


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/3] config: let `config_store_data_clear()` handle `key`
  2018-05-20 10:42       ` [PATCH v2 0/3] " Martin Ågren
  2018-05-20 10:42         ` [PATCH v2 1/3] " Martin Ågren
  2018-05-20 10:42         ` [PATCH v2 2/3] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
@ 2018-05-20 10:42         ` Martin Ågren
  2018-05-23  7:03           ` Eric Sunshine
  2018-05-23  7:01         ` [PATCH v2 0/3] config: free resources of `struct config_store_data` Eric Sunshine
  3 siblings, 1 reply; 18+ messages in thread
From: Martin Ågren @ 2018-05-20 10:42 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

Instead of remembering to free `key` in each code path, let
`config_store_data_clear()` handle that.

We still need to free it before replacing it, though. Move that freeing
closer to the replacing to be safe. Note that in that same part of the
code, we can no longer set `key` to the original pointer, but need to
`xstrdup()` it.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 config.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/config.c b/config.c
index ac71f3f2e1..339a92235d 100644
--- a/config.c
+++ b/config.c
@@ -2335,6 +2335,7 @@ struct config_store_data {
 
 static void config_store_data_clear(struct config_store_data *store)
 {
+	free(store->key);
 	if (store->value_regex != NULL &&
 	    store->value_regex != CONFIG_REGEX_NONE) {
 		regfree(store->value_regex);
@@ -2679,7 +2680,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	fd = hold_lock_file_for_update(&lock, config_filename, 0);
 	if (fd < 0) {
 		error_errno("could not lock config file %s", config_filename);
-		free(store.key);
 		ret = CONFIG_NO_LOCK;
 		goto out_free;
 	}
@@ -2689,8 +2689,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 	 */
 	in_fd = open(config_filename, O_RDONLY);
 	if ( in_fd < 0 ) {
-		free(store.key);
-
 		if ( ENOENT != errno ) {
 			error_errno("opening %s", config_filename);
 			ret = CONFIG_INVALID_FILE; /* same as "invalid config file" */
@@ -2702,7 +2700,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 			goto out_free;
 		}
 
-		store.key = (char *)key;
+		free(store.key);
+		store.key = xstrdup(key);
 		if (write_section(fd, key, &store) < 0 ||
 		    write_pair(fd, key, value, &store) < 0)
 			goto write_err_out;
@@ -2752,13 +2751,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 						      config_filename,
 						      &store, &opts)) {
 			error("invalid config file %s", config_filename);
-			free(store.key);
 			ret = CONFIG_INVALID_FILE;
 			goto out_free;
 		}
 
-		free(store.key);
-
 		/* if nothing to unset, or too many matches, error out */
 		if ((store.seen_nr == 0 && value == NULL) ||
 		    (store.seen_nr > 1 && multi_replace == 0)) {
-- 
2.17.0.840.g5d83f92caf


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH] regex: do not call `regfree()` if compilation fails
  2018-05-14  3:05       ` Eric Sunshine
@ 2018-05-20 10:50         ` Martin Ågren
  2018-05-21 18:43           ` Stefan Beller
  0 siblings, 1 reply; 18+ messages in thread
From: Martin Ågren @ 2018-05-20 10:50 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Johannes Schindelin

It is apparently undefined behavior to call `regfree()` on a regex where
`regcomp()` failed. The language in [1] is a bit muddy, at least to me,
but the clearest hint is this (`preg` is the `regex_t *`):

    Upon successful completion, the regcomp() function shall return 0.
    Otherwise, it shall return an integer value indicating an error as
    described in <regex.h>, and the content of preg is undefined.

Funnily enough, there is also the `regerror()` function which should be
given a pointer to such a "failed" `regex_t` -- the content of which
would supposedly be undefined -- and which may investigate it to come up
with a detailed error message.

In any case, the example in that document shows how `regfree()` is not
called after `regcomp()` fails.

We have quite a few users of this API and most get this right. These
three users do not.

Several implementations can handle this just fine [2] and these code paths
supposedly have not wreaked havoc or we'd have heard about it. (These
are all in code paths where git got bad input and is just about to die
anyway.) But let's just avoid the issue altogether.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html

[2] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html

Researched-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-byi Martin Ågren <martin.agren@gmail.com>
---

On 14 May 2018 at 05:05, Eric Sunshine <sunshine@sunshineco.com> wrote:
> My research (for instance [1,2]) seems to indicate that it would be
> better to avoid regfree() upon failed regcomp(), even though such a
> situation is handled sanely in some implementations.
>
> [1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html
> [2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html

Thank you for researching this. I think it would make sense to get rid
of the few places we have where we get this wrong (if our understanding
of this being undefined is right).

 diffcore-pickaxe.c | 1 -
 grep.c             | 2 --
 2 files changed, 3 deletions(-)

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 239ce5122b..800a899c86 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char *needle, int cflags)
 		/* The POSIX.2 people are surely sick */
 		char errbuf[1024];
 		regerror(err, regex, errbuf, 1024);
-		regfree(regex);
 		die("invalid regex: %s", errbuf);
 	}
 }
diff --git a/grep.c b/grep.c
index 65b90c10a3..5e4f3f9a9d 100644
--- a/grep.c
+++ b/grep.c
@@ -636,7 +636,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
 	if (err) {
 		char errbuf[1024];
 		regerror(err, &p->regexp, errbuf, sizeof(errbuf));
-		regfree(&p->regexp);
 		compile_regexp_failed(p, errbuf);
 	}
 }
@@ -701,7 +700,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	if (err) {
 		char errbuf[1024];
 		regerror(err, &p->regexp, errbuf, 1024);
-		regfree(&p->regexp);
 		compile_regexp_failed(p, errbuf);
 	}
 }
-- 
2.17.0.840.g5d83f92caf


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] regex: do not call `regfree()` if compilation fails
  2018-05-20 10:50         ` [PATCH] regex: do not call `regfree()` if compilation fails Martin Ågren
@ 2018-05-21 18:43           ` Stefan Beller
  2018-05-22  2:20             ` Eric Sunshine
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2018-05-21 18:43 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Eric Sunshine, git, Johannes Schindelin

On Sun, May 20, 2018 at 3:50 AM, Martin Ågren <martin.agren@gmail.com> wrote:
> It is apparently undefined behavior to call `regfree()` on a regex where
> `regcomp()` failed. The language in [1] is a bit muddy, at least to me,
> but the clearest hint is this (`preg` is the `regex_t *`):
>
>     Upon successful completion, the regcomp() function shall return 0.
>     Otherwise, it shall return an integer value indicating an error as
>     described in <regex.h>, and the content of preg is undefined.
>
> Funnily enough, there is also the `regerror()` function which should be
> given a pointer to such a "failed" `regex_t` -- the content of which
> would supposedly be undefined -- and which may investigate it to come up
> with a detailed error message.
>
> In any case, the example in that document shows how `regfree()` is not
> called after `regcomp()` fails.
>
> We have quite a few users of this API and most get this right. These
> three users do not.
>
> Several implementations can handle this just fine [2] and these code paths
> supposedly have not wreaked havoc or we'd have heard about it. (These
> are all in code paths where git got bad input and is just about to die
> anyway.) But let's just avoid the issue altogether.
>
> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html
>
> [2] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html
>
> Researched-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-byi Martin Ågren <martin.agren@gmail.com>
> ---
>
> On 14 May 2018 at 05:05, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> My research (for instance [1,2]) seems to indicate that it would be
>> better to avoid regfree() upon failed regcomp(), even though such a
>> situation is handled sanely in some implementations.
>>
>> [1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html
>> [2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html
>
> Thank you for researching this. I think it would make sense to get rid
> of the few places we have where we get this wrong (if our understanding
> of this being undefined is right).
>
>  diffcore-pickaxe.c | 1 -
>  grep.c             | 2 --
>  2 files changed, 3 deletions(-)
>
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 239ce5122b..800a899c86 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char *needle, int cflags)
>                 /* The POSIX.2 people are surely sick */
>                 char errbuf[1024];
>                 regerror(err, regex, errbuf, 1024);
> -               regfree(regex);

While the commit message is very clear why we supposedly introduce a leak here,
it is hard to be found from the source code (as we only delete code
there, so digging
for history is not obvious), so maybe

     /* regfree(regex) is invalid here */

instead?

>                 die("invalid regex: %s", errbuf);
>         }
>  }
> diff --git a/grep.c b/grep.c
> index 65b90c10a3..5e4f3f9a9d 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -636,7 +636,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>         if (err) {
>                 char errbuf[1024];
>                 regerror(err, &p->regexp, errbuf, sizeof(errbuf));
> -               regfree(&p->regexp);
>                 compile_regexp_failed(p, errbuf);
>         }
>  }
> @@ -701,7 +700,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>         if (err) {
>                 char errbuf[1024];
>                 regerror(err, &p->regexp, errbuf, 1024);
> -               regfree(&p->regexp);
>                 compile_regexp_failed(p, errbuf);
>         }
>  }
> --
> 2.17.0.840.g5d83f92caf
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] regex: do not call `regfree()` if compilation fails
  2018-05-21 18:43           ` Stefan Beller
@ 2018-05-22  2:20             ` Eric Sunshine
  2018-05-22 11:00               ` Martin Ågren
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Sunshine @ 2018-05-22  2:20 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Martin Ågren, git, Johannes Schindelin

On Mon, May 21, 2018 at 2:43 PM, Stefan Beller <sbeller@google.com> wrote:
> On Sun, May 20, 2018 at 3:50 AM, Martin Ågren <martin.agren@gmail.com> wrote:
>> It is apparently undefined behavior to call `regfree()` on a regex where
>> `regcomp()` failed. [...]
>>
>> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
>> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char *needle, int cflags)
>>                 /* The POSIX.2 people are surely sick */
>>                 char errbuf[1024];
>>                 regerror(err, regex, errbuf, 1024);
>> -               regfree(regex);
>>                 die("invalid regex: %s", errbuf);
>
> While the commit message is very clear why we supposedly introduce a leak here,
> it is hard to be found from the source code (as we only delete code
> there, so digging
> for history is not obvious), so maybe
>
>      /* regfree(regex) is invalid here */
>
> instead?

The commit message doesn't say that we are supposedly introducing a
leak (and, indeed, no resources should have been allocated to the
'regex' upon failed compile). It's saying that removing this call
potentially avoids a crash under some implementations.

Given that the very next line is die(), and that the function name has
"_or_die" in it, I'm not sure that an in-code comment about regfree()
being invalid upon failed compile would be all that helpful; indeed,
it could be confusing, causing the reader to wonder why that is
significant if we're just dying anyhow. I find that the patch, as is,
clarifies rather than muddles the situation.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] regex: do not call `regfree()` if compilation fails
  2018-05-22  2:20             ` Eric Sunshine
@ 2018-05-22 11:00               ` Martin Ågren
  0 siblings, 0 replies; 18+ messages in thread
From: Martin Ågren @ 2018-05-22 11:00 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Stefan Beller, git, Johannes Schindelin

On 22 May 2018 at 04:20, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Mon, May 21, 2018 at 2:43 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Sun, May 20, 2018 at 3:50 AM, Martin Ågren <martin.agren@gmail.com> wrote:
>>> It is apparently undefined behavior to call `regfree()` on a regex where
>>> `regcomp()` failed. [...]
>>>
>>> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
>>> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char *needle, int cflags)
>>>                 /* The POSIX.2 people are surely sick */
>>>                 char errbuf[1024];
>>>                 regerror(err, regex, errbuf, 1024);
>>> -               regfree(regex);
>>>                 die("invalid regex: %s", errbuf);
>>
>> While the commit message is very clear why we supposedly introduce a leak here,
>> it is hard to be found from the source code (as we only delete code
>> there, so digging
>> for history is not obvious), so maybe
>>
>>      /* regfree(regex) is invalid here */
>>
>> instead?
>
> The commit message doesn't say that we are supposedly introducing a
> leak (and, indeed, no resources should have been allocated to the
> 'regex' upon failed compile). It's saying that removing this call
> potentially avoids a crash under some implementations.
>
> Given that the very next line is die(), and that the function name has
> "_or_die" in it, I'm not sure that an in-code comment about regfree()
> being invalid upon failed compile would be all that helpful; indeed,
> it could be confusing, causing the reader to wonder why that is
> significant if we're just dying anyhow. I find that the patch, as is,
> clarifies rather than muddles the situation.

Like Eric, I feel that the possible leak here is somewhat uninteresting,
since the next line will die. That said, I seem to recall from my
grepping around earlier that we have other users where we return
with a failure instead of dying.

Any clarifying comments in such code would be a separate patch to me. I
also do not immediately see the need for adding such a comment in those
places. We can do that once we verify that we actually do leak (I would
expect that to happen only in some implementations, and I think there is
a fair chance that we will never encounter such an implementation.)

Martin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/3] config: free resources of `struct config_store_data`
  2018-05-20 10:42       ` [PATCH v2 0/3] " Martin Ågren
                           ` (2 preceding siblings ...)
  2018-05-20 10:42         ` [PATCH v2 3/3] config: let `config_store_data_clear()` handle `key` Martin Ågren
@ 2018-05-23  7:01         ` Eric Sunshine
  3 siblings, 0 replies; 18+ messages in thread
From: Eric Sunshine @ 2018-05-23  7:01 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git List, Johannes Schindelin

On Sun, May 20, 2018 at 6:42 AM, Martin Ågren <martin.agren@gmail.com> wrote:
> On 14 May 2018 at 05:03, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> On Sun, May 13, 2018 at 5:58 AM, Martin Ågren <martin.agren@gmail.com> wrote:
>>> How about the following two patches as patches 2/3 and 3/3? I would also
>>> need to mention in the commit message of this patch (1/3) that the
>>> function will soon learn to clean up more members.
>>
>> Yep, making this a multi-part patch series and updating the commit
>> message of the patch which introduces config_store_data_clear(), as
>> you suggest, makes sense. The patch series could be organized
>> differently -- such as first moving freeing of 'value_regex' into new
>> config_store_data_clear(), then freeing additional fields in later
>> patches -- but I don't think it matters much in practice, so the
>> current organization is likely good enough.
>
> I tried such a re-ordering but wasn't entirely happy about the result
> (maybe I didn't try hard enough), so here are these patches again, as a
> proper series and with improved commit messages.

The re-roll looks good; it address my concern about v1. Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/3] config: let `config_store_data_clear()` handle `key`
  2018-05-20 10:42         ` [PATCH v2 3/3] config: let `config_store_data_clear()` handle `key` Martin Ågren
@ 2018-05-23  7:03           ` Eric Sunshine
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Sunshine @ 2018-05-23  7:03 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git List, Johannes Schindelin

On Sun, May 20, 2018 at 6:42 AM, Martin Ågren <martin.agren@gmail.com> wrote:
> Instead of remembering to free `key` in each code path, let
> `config_store_data_clear()` handle that.
>
> We still need to free it before replacing it, though. Move that freeing
> closer to the replacing to be safe. Note that in that same part of the
> code, we can no longer set `key` to the original pointer, but need to
> `xstrdup()` it.

That casting away of 'const' was an oddball case anyhow, so it's nice
to see it go away (even at the expense of an extra xstrdup()).

Overall, this change makes it quite a bit easier to reason about the
cleanup of 'store.key'.

Thanks.

> Signed-off-by: Martin Ågren <martin.agren@gmail.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-05-23  7:03 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-13  8:23 [PATCH] config: free resources of `struct config_store_data` Martin Ågren
2018-05-13  8:59 ` Eric Sunshine
2018-05-13  9:58   ` Martin Ågren
2018-05-13  9:58     ` [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
2018-05-14  3:05       ` Eric Sunshine
2018-05-20 10:50         ` [PATCH] regex: do not call `regfree()` if compilation fails Martin Ågren
2018-05-21 18:43           ` Stefan Beller
2018-05-22  2:20             ` Eric Sunshine
2018-05-22 11:00               ` Martin Ågren
2018-05-13  9:58     ` [PATCH 3/1] config: let `config_store_data_clear()` handle `key` Martin Ågren
2018-05-14  3:03     ` [PATCH] config: free resources of `struct config_store_data` Eric Sunshine
2018-05-20 10:42       ` [PATCH v2 0/3] " Martin Ågren
2018-05-20 10:42         ` [PATCH v2 1/3] " Martin Ågren
2018-05-20 10:42         ` [PATCH v2 2/3] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
2018-05-20 10:42         ` [PATCH v2 3/3] config: let `config_store_data_clear()` handle `key` Martin Ågren
2018-05-23  7:03           ` Eric Sunshine
2018-05-23  7:01         ` [PATCH v2 0/3] config: free resources of `struct config_store_data` Eric Sunshine
2018-05-13 18:40 ` [PATCH] " Martin Ågren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).