git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] unleak paths allocated in "diff --no-index"
@ 2022-09-02 21:27 Junio C Hamano
  2022-09-02 23:49 ` [PATCH v2] diff --no-index: unleak paths[] elements Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2022-09-02 21:27 UTC (permalink / raw)
  To: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

"git diff --no-index" codepath starts with the two elements in
argv[] and munges them into two paths to be compared, stored in a
separate path[] arrays.  The munging is implemented in a rather
haphazard way, sometimes overwriting old version with a new copy,
and sometimes a constant string assigned to path[], making it
impossible to release the resources properly:

 * A single dash "-" from the command line is a special signal that
   the standard input is used for the side to be compared, and is
   internally replaced with a copy of string "-" at a known address.

 * When run in a subdirectory, full paths to the two paths are
   allocated and placed in path[].

 * After the above happens, when comparing a file with a directory,
   the directory side is replaced with the path to a file in the
   directory with the same name as the file.

This was perfectly fine for just two strings that are pathnames used
during the lifetime of the program and cleaned up upon program exit,
but it gets in the way when leak sanitizer is in effect.  The third
step can be losing the full path that was allocated in the second
step, but it is not easy to tell if its input is an allocated piece
of memory to begin with.

Loosen the earlier two steps a bit so that elements of the path[]
array that come to the directory/file comparison code are either the
singleton "-" or an allocated piece of memory.  Use that knowledge
in the third step to release an allocated piece of memory when it
replaces the path to a directory with the path to a file in that
directory, and also at the end to release the two elements of the
path[] array as needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * Because we never pass NULL to the new helper, !path check is not
   strictly necessary, but it is conventional for free()-like
   functions to take NULL and safely become a no-op.

 diff-no-index.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git c/diff-no-index.c w/diff-no-index.c
index 9a8b09346b..2770b7d15a 100644
--- c/diff-no-index.c
+++ w/diff-no-index.c
@@ -208,6 +208,14 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
 	strbuf_addstr(path, tail ? tail + 1 : file);
 }
 
+static void free_allocated_path(const char *path)
+{
+	if (!path ||
+	    (path == file_from_standard_input))
+		return;
+	free((char *)path);
+}
+
 /*
  * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
  * Note that we append the basename of F to D/, so "diff a/b/file D"
@@ -226,9 +234,11 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
 		return;
 	if (isdir0) {
 		append_basename(replacement, path[0], path[1]);
+		free_allocated_path(path[0]);
 		path[0] = replacement->buf;
 	} else {
 		append_basename(replacement, path[1], path[0]);
+		free_allocated_path(path[1]);
 		path[1] = replacement->buf;
 	}
 }
@@ -274,6 +284,8 @@ int diff_no_index(struct rev_info *revs,
 			p = file_from_standard_input;
 		else if (prefix)
 			p = prefix_filename(prefix, p);
+		else
+			p = xstrdup(p);
 		paths[i] = p;
 	}
 
@@ -294,13 +306,18 @@ int diff_no_index(struct rev_info *revs,
 	setup_diff_pager(&revs->diffopt);
 	revs->diffopt.flags.exit_with_status = 1;
 
-	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
+	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
+		free_allocated_path(paths[0]);
+		free_allocated_path(paths[1]);
 		return 1;
+	}
 	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);
 
 	strbuf_release(&replacement);
+	free_allocated_path(paths[0]);
+	free_allocated_path(paths[1]);
 
 	/*
 	 * The return code for --no-index imitates diff(1):

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2] diff --no-index: unleak paths[] elements
  2022-09-02 21:27 [PATCH] unleak paths allocated in "diff --no-index" Junio C Hamano
@ 2022-09-02 23:49 ` Junio C Hamano
  2022-09-03  6:00   ` René Scharfe
  2022-09-05 10:03   ` [PATCH v2] diff --no-index: unleak paths[] elements Johannes Schindelin
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2022-09-02 23:49 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin

"git diff --no-index" codepath starts with the two elements in
argv[] and munges them into two paths to be compared, stored in a
separate path[] arrays.  The munging is implemented in a rather
haphazard way, sometimes overwriting old version with a new copy,
and sometimes a constant string assigned to path[], making it
impossible to release the resources properly:

 * A single dash "-" from the command line is a special signal that
   the standard input is used for the side to be compared, and is
   internally replaced with a copy of string "-" at a known address.

 * When run in a subdirectory, full paths to the two paths are
   allocated and placed in path[].

 * After the above happens, when comparing a file with a directory,
   the directory side is replaced with the path to a file in the
   directory with the same name as the file.

This was perfectly fine for just two strings that are pathnames used
during the lifetime of the program and cleaned up upon program exit,
but it gets in the way when leak sanitizer is in effect.  The third
step can be losing the full path that was allocated in the second
step, but it is not easy to tell if its input is an allocated piece
of memory to begin with.

Loosen the earlier two steps a bit so that elements of the path[]
array that come to the directory/file comparison code are either the
singleton "-" or an allocated piece of memory.  Use that knowledge
in the third step to release an allocated piece of memory when it
replaces the path to a directory with the path to a file in that
directory, and also at the end to release the two elements of the
path[] array as needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * The previous one allowed strbuf_release() to free replacement.buf
   which may be used in path[0] or path[1] potentially leading to
   double freeing.  The kosher way may be to use strbuf_detach() in
   fixup_paths(), but this is a simpler fix, it is getting late in
   the day, and I am getting sick of fighting the leak-checker, so...

 diff-no-index.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)


diff --git a/diff-no-index.c b/diff-no-index.c
index 9a8b09346b..77a126469b 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -208,6 +208,14 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
 	strbuf_addstr(path, tail ? tail + 1 : file);
 }
 
+static void free_allocated_path(const char *path)
+{
+	if (!path ||
+	    (path == file_from_standard_input))
+		return;
+	free((char *)path);
+}
+
 /*
  * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
  * Note that we append the basename of F to D/, so "diff a/b/file D"
@@ -226,9 +234,11 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
 		return;
 	if (isdir0) {
 		append_basename(replacement, path[0], path[1]);
+		free_allocated_path(path[0]);
 		path[0] = replacement->buf;
 	} else {
 		append_basename(replacement, path[1], path[0]);
+		free_allocated_path(path[1]);
 		path[1] = replacement->buf;
 	}
 }
@@ -274,6 +284,8 @@ int diff_no_index(struct rev_info *revs,
 			p = file_from_standard_input;
 		else if (prefix)
 			p = prefix_filename(prefix, p);
+		else
+			p = xstrdup(p);
 		paths[i] = p;
 	}
 
@@ -294,13 +306,21 @@ int diff_no_index(struct rev_info *revs,
 	setup_diff_pager(&revs->diffopt);
 	revs->diffopt.flags.exit_with_status = 1;
 
-	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
+	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
+		free_allocated_path(paths[0]);
+		free_allocated_path(paths[1]);
 		return 1;
+	}
 	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);
 
-	strbuf_release(&replacement);
+	/*
+	 * do not strbuf_release(&replacement), as it is in paths[]
+	 * when replacement was actually used.
+	 */
+	free_allocated_path(paths[0]);
+	free_allocated_path(paths[1]);
 
 	/*
 	 * The return code for --no-index imitates diff(1):
-- 
2.37.3-661-g73a641a77a


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] diff --no-index: unleak paths[] elements
  2022-09-02 23:49 ` [PATCH v2] diff --no-index: unleak paths[] elements Junio C Hamano
@ 2022-09-03  6:00   ` René Scharfe
  2022-09-05 20:26     ` Junio C Hamano
  2022-09-05 10:03   ` [PATCH v2] diff --no-index: unleak paths[] elements Johannes Schindelin
  1 sibling, 1 reply; 17+ messages in thread
From: René Scharfe @ 2022-09-03  6:00 UTC (permalink / raw)
  To: Junio C Hamano, git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin

Am 03.09.22 um 01:49 schrieb Junio C Hamano:
> "git diff --no-index" codepath starts with the two elements in
> argv[] and munges them into two paths to be compared, stored in a
> separate path[] arrays.  The munging is implemented in a rather
> haphazard way, sometimes overwriting old version with a new copy,
> and sometimes a constant string assigned to path[], making it
> impossible to release the resources properly:
>
>  * A single dash "-" from the command line is a special signal that
>    the standard input is used for the side to be compared, and is
>    internally replaced with a copy of string "-" at a known address.
>
>  * When run in a subdirectory, full paths to the two paths are
>    allocated and placed in path[].
>
>  * After the above happens, when comparing a file with a directory,
>    the directory side is replaced with the path to a file in the
>    directory with the same name as the file.
>
> This was perfectly fine for just two strings that are pathnames used
> during the lifetime of the program and cleaned up upon program exit,
> but it gets in the way when leak sanitizer is in effect.  The third
> step can be losing the full path that was allocated in the second
> step, but it is not easy to tell if its input is an allocated piece
> of memory to begin with.
>
> Loosen the earlier two steps a bit so that elements of the path[]
> array that come to the directory/file comparison code are either the
> singleton "-" or an allocated piece of memory.  Use that knowledge
> in the third step to release an allocated piece of memory when it
> replaces the path to a directory with the path to a file in that
> directory, and also at the end to release the two elements of the
> path[] array as needed.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>
>  * The previous one allowed strbuf_release() to free replacement.buf
>    which may be used in path[0] or path[1] potentially leading to
>    double freeing.  The kosher way may be to use strbuf_detach() in
>    fixup_paths(), but this is a simpler fix, it is getting late in
>    the day, and I am getting sick of fighting the leak-checker, so...
>
>  diff-no-index.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
>
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 9a8b09346b..77a126469b 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -208,6 +208,14 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
>  	strbuf_addstr(path, tail ? tail + 1 : file);
>  }
>
> +static void free_allocated_path(const char *path)
> +{
> +	if (!path ||

How can path be NULL?  And if it was, why shield free(3) from it?

> +	    (path == file_from_standard_input))
> +		return;
> +	free((char *)path);
> +}
> +
>  /*
>   * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
>   * Note that we append the basename of F to D/, so "diff a/b/file D"
> @@ -226,9 +234,11 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
>  		return;
>  	if (isdir0) {
>  		append_basename(replacement, path[0], path[1]);
> +		free_allocated_path(path[0]);
>  		path[0] = replacement->buf;
>  	} else {
>  		append_basename(replacement, path[1], path[0]);
> +		free_allocated_path(path[1]);
>  		path[1] = replacement->buf;
>  	}
>  }
> @@ -274,6 +284,8 @@ int diff_no_index(struct rev_info *revs,
>  			p = file_from_standard_input;
>  		else if (prefix)
>  			p = prefix_filename(prefix, p);
> +		else
> +			p = xstrdup(p);

prefix_filename(NULL, p) is basically the same as xstrdup(p), so those
two conditional branches could be joined.

>  		paths[i] = p;
>  	}
>
> @@ -294,13 +306,21 @@ int diff_no_index(struct rev_info *revs,
>  	setup_diff_pager(&revs->diffopt);
>  	revs->diffopt.flags.exit_with_status = 1;
>
> -	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
> +	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
> +		free_allocated_path(paths[0]);
> +		free_allocated_path(paths[1]);
>  		return 1;
> +	}
>  	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
>  	diffcore_std(&revs->diffopt);
>  	diff_flush(&revs->diffopt);
>
> -	strbuf_release(&replacement);
> +	/*
> +	 * do not strbuf_release(&replacement), as it is in paths[]
> +	 * when replacement was actually used.
> +	 */
> +	free_allocated_path(paths[0]);
> +	free_allocated_path(paths[1]);
>
>  	/*
>  	 * The return code for --no-index imitates diff(1):

Perhaps avoid the need for that comment by moving that strbuf to where
it's used and have it spend its full lifecycle there?  Something like:

---
 diff-no-index.c | 50 ++++++++++++++++++++++---------------------------
 1 file changed, 22 insertions(+), 28 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 77a126469b..9f8b78f173 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -196,18 +196,6 @@ static int queue_diff(struct diff_options *o,
 	}
 }

-/* append basename of F to D */
-static void append_basename(struct strbuf *path, const char *dir, const char *file)
-{
-	const char *tail = strrchr(file, '/');
-
-	strbuf_addstr(path, dir);
-	while (path->len && path->buf[path->len - 1] == '/')
-		path->len--;
-	strbuf_addch(path, '/');
-	strbuf_addstr(path, tail ? tail + 1 : file);
-}
-
 static void free_allocated_path(const char *path)
 {
 	if (!path ||
@@ -216,12 +204,28 @@ static void free_allocated_path(const char *path)
 	free((char *)path);
 }

+/* append basename of F to D */
+static void append_basename(const char **dir, const char *file)
+{
+	const char *tail = strrchr(file, '/');
+	struct strbuf path = STRBUF_INIT;
+
+	strbuf_addstr(&path, *dir);
+	while (path.len && path.buf[path.len - 1] == '/')
+		path.len--;
+	strbuf_addch(&path, '/');
+	strbuf_addstr(&path, tail ? tail + 1 : file);
+
+	free_allocated_path(*dir);
+	*dir = strbuf_detach(&path, NULL);
+}
+
 /*
  * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
  * Note that we append the basename of F to D/, so "diff a/b/file D"
  * becomes "diff a/b/file D/file", not "diff a/b/file D/a/b/file".
  */
-static void fixup_paths(const char **path, struct strbuf *replacement)
+static void fixup_paths(const char **path)
 {
 	unsigned int isdir0, isdir1;

@@ -232,15 +236,10 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
 	isdir1 = is_directory(path[1]);
 	if (isdir0 == isdir1)
 		return;
-	if (isdir0) {
-		append_basename(replacement, path[0], path[1]);
-		free_allocated_path(path[0]);
-		path[0] = replacement->buf;
-	} else {
-		append_basename(replacement, path[1], path[0]);
-		free_allocated_path(path[1]);
-		path[1] = replacement->buf;
-	}
+	if (isdir0)
+		append_basename(&path[0], path[1]);
+	else
+		append_basename(&path[1], path[0]);
 }

 static const char * const diff_no_index_usage[] = {
@@ -254,7 +253,6 @@ int diff_no_index(struct rev_info *revs,
 {
 	int i, no_index;
 	const char *paths[2];
-	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
 	struct option no_index_options[] = {
 		OPT_BOOL_F(0, "no-index", &no_index, "",
@@ -289,7 +287,7 @@ int diff_no_index(struct rev_info *revs,
 		paths[i] = p;
 	}

-	fixup_paths(paths, &replacement);
+	fixup_paths(paths);

 	revs->diffopt.skip_stat_unmatch = 1;
 	if (!revs->diffopt.output_format)
@@ -315,10 +313,6 @@ int diff_no_index(struct rev_info *revs,
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);

-	/*
-	 * do not strbuf_release(&replacement), as it is in paths[]
-	 * when replacement was actually used.
-	 */
 	free_allocated_path(paths[0]);
 	free_allocated_path(paths[1]);

--
2.37.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] diff --no-index: unleak paths[] elements
  2022-09-02 23:49 ` [PATCH v2] diff --no-index: unleak paths[] elements Junio C Hamano
  2022-09-03  6:00   ` René Scharfe
@ 2022-09-05 10:03   ` Johannes Schindelin
  2022-09-05 11:01     ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 17+ messages in thread
From: Johannes Schindelin @ 2022-09-05 10:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Ævar Arnfjörð Bjarmason

Hi Junio,

On Fri, 2 Sep 2022, Junio C Hamano wrote:

> "git diff --no-index" codepath starts with the two elements in
> argv[] and munges them into two paths to be compared, stored in a
> separate path[] arrays.  The munging is implemented in a rather
> haphazard way, sometimes overwriting old version with a new copy,
> and sometimes a constant string assigned to path[], making it
> impossible to release the resources properly:
>
>  * A single dash "-" from the command line is a special signal that
>    the standard input is used for the side to be compared, and is
>    internally replaced with a copy of string "-" at a known address.
>
>  * When run in a subdirectory, full paths to the two paths are
>    allocated and placed in path[].
>
>  * After the above happens, when comparing a file with a directory,
>    the directory side is replaced with the path to a file in the
>    directory with the same name as the file.
>
> This was perfectly fine for just two strings that are pathnames used
> during the lifetime of the program and cleaned up upon program exit,
> but it gets in the way when leak sanitizer is in effect.  The third
> step can be losing the full path that was allocated in the second
> step, but it is not easy to tell if its input is an allocated piece
> of memory to begin with.
>
> Loosen the earlier two steps a bit so that elements of the path[]
> array that come to the directory/file comparison code are either the
> singleton "-" or an allocated piece of memory.  Use that knowledge
> in the third step to release an allocated piece of memory when it
> replaces the path to a directory with the path to a file in that
> directory, and also at the end to release the two elements of the
> path[] array as needed.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>
>  * The previous one allowed strbuf_release() to free replacement.buf
>    which may be used in path[0] or path[1] potentially leading to
>    double freeing.  The kosher way may be to use strbuf_detach() in
>    fixup_paths(), but this is a simpler fix, it is getting late in
>    the day, and I am getting sick of fighting the leak-checker, so...

I wonder whether a much better way would be to first fix the code to
always release `replacement`, like so:

-- snip --
diff --git a/diff-no-index.c b/diff-no-index.c
index 9a8b09346bd..87047605385 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -242,7 +242,7 @@ int diff_no_index(struct rev_info *revs,
 		  int implicit_no_index,
 		  int argc, const char **argv)
 {
-	int i, no_index;
+	int i, no_index, ret;
 	const char *paths[2];
 	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
@@ -294,17 +294,23 @@ int diff_no_index(struct rev_info *revs,
 	setup_diff_pager(&revs->diffopt);
 	revs->diffopt.flags.exit_with_status = 1;

-	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
-		return 1;
+	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
+		ret = 1;
+		goto out;
+	}
+
 	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);

-	strbuf_release(&replacement);
-
 	/*
 	 * The return code for --no-index imitates diff(1):
 	 * 0 = no changes, 1 = changes, else error
 	 */
-	return diff_result_code(&revs->diffopt, 0);
+	ret = diff_result_code(&revs->diffopt, 0);
+
+out:
+	strbuf_release(&replacement);
+
+	return ret;
 }
-- snap --

After that, the proposed diff could be replaced by this diff:

-- snip --
diff --git a/diff-no-index.c b/diff-no-index.c
index 87047605385..d350e4381bc 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -244,6 +244,7 @@ int diff_no_index(struct rev_info *revs,
 {
 	int i, no_index, ret;
 	const char *paths[2];
+	struct string_list to_release = STRING_LIST_INIT_DUP;
 	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
 	struct option no_index_options[] = {
@@ -274,6 +275,12 @@ int diff_no_index(struct rev_info *revs,
 			p = file_from_standard_input;
 		else if (prefix)
 			p = prefix_filename(prefix, p);
+		else {
+			char *dup = xstrdup(p);
+
+			p = dup;
+			string_list_append_nodup(&to_release, dup);
+		}
 		paths[i] = p;
 	}

@@ -310,6 +317,7 @@ int diff_no_index(struct rev_info *revs,
 	ret = diff_result_code(&revs->diffopt, 0);

 out:
+	string_list_clear(&to_release, 1);
 	strbuf_release(&replacement);

 	return ret;

-- snap --

That approach has the distinct advantage of making it very easy to reason
about the code.

What do you think?

Ciao,
Dscho

>
>  diff-no-index.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
>
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 9a8b09346b..77a126469b 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -208,6 +208,14 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
>  	strbuf_addstr(path, tail ? tail + 1 : file);
>  }
>
> +static void free_allocated_path(const char *path)
> +{
> +	if (!path ||
> +	    (path == file_from_standard_input))
> +		return;
> +	free((char *)path);
> +}
> +
>  /*
>   * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
>   * Note that we append the basename of F to D/, so "diff a/b/file D"
> @@ -226,9 +234,11 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
>  		return;
>  	if (isdir0) {
>  		append_basename(replacement, path[0], path[1]);
> +		free_allocated_path(path[0]);
>  		path[0] = replacement->buf;
>  	} else {
>  		append_basename(replacement, path[1], path[0]);
> +		free_allocated_path(path[1]);
>  		path[1] = replacement->buf;
>  	}
>  }
> @@ -274,6 +284,8 @@ int diff_no_index(struct rev_info *revs,
>  			p = file_from_standard_input;
>  		else if (prefix)
>  			p = prefix_filename(prefix, p);
> +		else
> +			p = xstrdup(p);
>  		paths[i] = p;
>  	}
>
> @@ -294,13 +306,21 @@ int diff_no_index(struct rev_info *revs,
>  	setup_diff_pager(&revs->diffopt);
>  	revs->diffopt.flags.exit_with_status = 1;
>
> -	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
> +	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
> +		free_allocated_path(paths[0]);
> +		free_allocated_path(paths[1]);
>  		return 1;
> +	}
>  	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
>  	diffcore_std(&revs->diffopt);
>  	diff_flush(&revs->diffopt);
>
> -	strbuf_release(&replacement);
> +	/*
> +	 * do not strbuf_release(&replacement), as it is in paths[]
> +	 * when replacement was actually used.
> +	 */
> +	free_allocated_path(paths[0]);
> +	free_allocated_path(paths[1]);
>
>  	/*
>  	 * The return code for --no-index imitates diff(1):
> --
> 2.37.3-661-g73a641a77a
>
>

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] diff --no-index: unleak paths[] elements
  2022-09-05 10:03   ` [PATCH v2] diff --no-index: unleak paths[] elements Johannes Schindelin
@ 2022-09-05 11:01     ` Ævar Arnfjörð Bjarmason
  2022-09-07 10:06       ` Johannes Schindelin
  0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-05 11:01 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git


On Mon, Sep 05 2022, Johannes Schindelin wrote:

> Hi Junio,
>
> On Fri, 2 Sep 2022, Junio C Hamano wrote:
>
>> "git diff --no-index" codepath starts with the two elements in
>> argv[] and munges them into two paths to be compared, stored in a
>> separate path[] arrays.  The munging is implemented in a rather
>> haphazard way, sometimes overwriting old version with a new copy,
>> and sometimes a constant string assigned to path[], making it
>> impossible to release the resources properly:
>>
>>  * A single dash "-" from the command line is a special signal that
>>    the standard input is used for the side to be compared, and is
>>    internally replaced with a copy of string "-" at a known address.
>>
>>  * When run in a subdirectory, full paths to the two paths are
>>    allocated and placed in path[].
>>
>>  * After the above happens, when comparing a file with a directory,
>>    the directory side is replaced with the path to a file in the
>>    directory with the same name as the file.
>>
>> This was perfectly fine for just two strings that are pathnames used
>> during the lifetime of the program and cleaned up upon program exit,
>> but it gets in the way when leak sanitizer is in effect.  The third
>> step can be losing the full path that was allocated in the second
>> step, but it is not easy to tell if its input is an allocated piece
>> of memory to begin with.
>>
>> Loosen the earlier two steps a bit so that elements of the path[]
>> array that come to the directory/file comparison code are either the
>> singleton "-" or an allocated piece of memory.  Use that knowledge
>> in the third step to release an allocated piece of memory when it
>> replaces the path to a directory with the path to a file in that
>> directory, and also at the end to release the two elements of the
>> path[] array as needed.
>>
>> Signed-off-by: Junio C Hamano <gitster@pobox.com>
>> ---
>>
>>  * The previous one allowed strbuf_release() to free replacement.buf
>>    which may be used in path[0] or path[1] potentially leading to
>>    double freeing.  The kosher way may be to use strbuf_detach() in
>>    fixup_paths(), but this is a simpler fix, it is getting late in
>>    the day, and I am getting sick of fighting the leak-checker, so...
>
> I wonder whether a much better way would be to first fix the code to
> always release `replacement`, like so:
>
> -- snip --
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 9a8b09346bd..87047605385 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -242,7 +242,7 @@ int diff_no_index(struct rev_info *revs,
>  		  int implicit_no_index,
>  		  int argc, const char **argv)
>  {
> -	int i, no_index;
> +	int i, no_index, ret;
>  	const char *paths[2];
>  	struct strbuf replacement = STRBUF_INIT;
>  	const char *prefix = revs->prefix;
> @@ -294,17 +294,23 @@ int diff_no_index(struct rev_info *revs,
>  	setup_diff_pager(&revs->diffopt);
>  	revs->diffopt.flags.exit_with_status = 1;
>
> -	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
> -		return 1;
> +	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
> +		ret = 1;
> +		goto out;
> +	}
> +
>  	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
>  	diffcore_std(&revs->diffopt);
>  	diff_flush(&revs->diffopt);
>
> -	strbuf_release(&replacement);
> -
>  	/*
>  	 * The return code for --no-index imitates diff(1):
>  	 * 0 = no changes, 1 = changes, else error
>  	 */
> -	return diff_result_code(&revs->diffopt, 0);
> +	ret = diff_result_code(&revs->diffopt, 0);
> +
> +out:
> +	strbuf_release(&replacement);
> +
> +	return ret;
>  }
> -- snap --
>
> After that, the proposed diff could be replaced by this diff:
>
> -- snip --
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 87047605385..d350e4381bc 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -244,6 +244,7 @@ int diff_no_index(struct rev_info *revs,
>  {
>  	int i, no_index, ret;
>  	const char *paths[2];
> +	struct string_list to_release = STRING_LIST_INIT_DUP;
>  	struct strbuf replacement = STRBUF_INIT;
>  	const char *prefix = revs->prefix;
>  	struct option no_index_options[] = {
> @@ -274,6 +275,12 @@ int diff_no_index(struct rev_info *revs,
>  			p = file_from_standard_input;
>  		else if (prefix)
>  			p = prefix_filename(prefix, p);
> +		else {
> +			char *dup = xstrdup(p);
> +
> +			p = dup;
> +			string_list_append_nodup(&to_release, dup);
> +		}
>  		paths[i] = p;
>  	}
>
> @@ -310,6 +317,7 @@ int diff_no_index(struct rev_info *revs,
>  	ret = diff_result_code(&revs->diffopt, 0);
>
>  out:
> +	string_list_clear(&to_release, 1);
>  	strbuf_release(&replacement);
>
>  	return ret;
>
> -- snap --
>
> That approach has the distinct advantage of making it very easy to reason
> about the code.
>
> What do you think?
>
> Ciao,
> Dscho
>
>>
>>  diff-no-index.c | 24 ++++++++++++++++++++++--
>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>
>>
>> diff --git a/diff-no-index.c b/diff-no-index.c
>> index 9a8b09346b..77a126469b 100644
>> --- a/diff-no-index.c
>> +++ b/diff-no-index.c
>> @@ -208,6 +208,14 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
>>  	strbuf_addstr(path, tail ? tail + 1 : file);
>>  }
>>
>> +static void free_allocated_path(const char *path)
>> +{
>> +	if (!path ||
>> +	    (path == file_from_standard_input))
>> +		return;
>> +	free((char *)path);
>> +}
>> +
>>  /*
>>   * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
>>   * Note that we append the basename of F to D/, so "diff a/b/file D"
>> @@ -226,9 +234,11 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
>>  		return;
>>  	if (isdir0) {
>>  		append_basename(replacement, path[0], path[1]);
>> +		free_allocated_path(path[0]);
>>  		path[0] = replacement->buf;
>>  	} else {
>>  		append_basename(replacement, path[1], path[0]);
>> +		free_allocated_path(path[1]);
>>  		path[1] = replacement->buf;
>>  	}
>>  }
>> @@ -274,6 +284,8 @@ int diff_no_index(struct rev_info *revs,
>>  			p = file_from_standard_input;
>>  		else if (prefix)
>>  			p = prefix_filename(prefix, p);
>> +		else
>> +			p = xstrdup(p);
>>  		paths[i] = p;
>>  	}
>>
>> @@ -294,13 +306,21 @@ int diff_no_index(struct rev_info *revs,
>>  	setup_diff_pager(&revs->diffopt);
>>  	revs->diffopt.flags.exit_with_status = 1;
>>
>> -	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
>> +	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
>> +		free_allocated_path(paths[0]);
>> +		free_allocated_path(paths[1]);
>>  		return 1;
>> +	}
>>  	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
>>  	diffcore_std(&revs->diffopt);
>>  	diff_flush(&revs->diffopt);
>>
>> -	strbuf_release(&replacement);
>> +	/*
>> +	 * do not strbuf_release(&replacement), as it is in paths[]
>> +	 * when replacement was actually used.
>> +	 */
>> +	free_allocated_path(paths[0]);
>> +	free_allocated_path(paths[1]);
>>
>>  	/*
>>  	 * The return code for --no-index imitates diff(1):
>> --
>> 2.37.3-661-g73a641a77a
>>
>>

That looks like a much better approach, I'd think you'd want this on
top, because:

 * We entirely avoid playing cames with the string_list "dup" and then
   "nodup" append. In some cases we have to, but in this case we can
   just get the pointer to the member we just created, and avoid the
   explicit xstrdup() in the caller.
 * The free_util=1 in your code isn't needed/is a bug, we make no use of
   "util" here, so it should be free_util=0
 * It avoids the "add braces to all if/else arms" part of
   CodingGuidelines.

diff --git a/diff-no-index.c b/diff-no-index.c
index d350e4381bc..2861319c0e5 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -275,12 +275,8 @@ int diff_no_index(struct rev_info *revs,
 			p = file_from_standard_input;
 		else if (prefix)
 			p = prefix_filename(prefix, p);
-		else {
-			char *dup = xstrdup(p);
-
-			p = dup;
-			string_list_append_nodup(&to_release, dup);
-		}
+		else
+			p = string_list_append(&to_release, p)->string;
 		paths[i] = p;
 	}
 
@@ -317,7 +313,7 @@ int diff_no_index(struct rev_info *revs,
 	ret = diff_result_code(&revs->diffopt, 0);
 
 out:
-	string_list_clear(&to_release, 1);
+	string_list_clear(&to_release, 0);
 	strbuf_release(&replacement);
 
 	return ret;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] diff --no-index: unleak paths[] elements
  2022-09-03  6:00   ` René Scharfe
@ 2022-09-05 20:26     ` Junio C Hamano
  2022-09-06 12:31       ` [PATCH 1/2] diff-no-index: release strbuf on queue error René Scharfe
                         ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Junio C Hamano @ 2022-09-05 20:26 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

René Scharfe <l.s.r@web.de> writes:

> Am 03.09.22 um 01:49 schrieb Junio C Hamano:
>> "git diff --no-index" codepath starts with the two elements in
>> argv[] and munges them into two paths to be compared, stored in a
>> separate path[] arrays.  The munging is implemented in a rather
>> haphazard way, sometimes overwriting old version with a new copy,
>> and sometimes a constant string assigned to path[], making it
>> impossible to release the resources properly:
>>
>>  * A single dash "-" from the command line is a special signal that
>>    the standard input is used for the side to be compared, and is
>>    internally replaced with a copy of string "-" at a known address.
>>
>>  * When run in a subdirectory, full paths to the two paths are
>>    allocated and placed in path[].
>>
>>  * After the above happens, when comparing a file with a directory,
>>    the directory side is replaced with the path to a file in the
>>    directory with the same name as the file.
>>
>> This was perfectly fine for just two strings that are pathnames used
>> during the lifetime of the program and cleaned up upon program exit,
>> but it gets in the way when leak sanitizer is in effect.  The third
>> step can be losing the full path that was allocated in the second
>> step, but it is not easy to tell if its input is an allocated piece
>> of memory to begin with.
>>
>> Loosen the earlier two steps a bit so that elements of the path[]
>> array that come to the directory/file comparison code are either the
>> singleton "-" or an allocated piece of memory.  Use that knowledge
>> in the third step to release an allocated piece of memory when it
>> replaces the path to a directory with the path to a file in that
>> directory, and also at the end to release the two elements of the
>> path[] array as needed.
>>
>> Signed-off-by: Junio C Hamano <gitster@pobox.com>
>> ---
>>
>>  * The previous one allowed strbuf_release() to free replacement.buf
>>    which may be used in path[0] or path[1] potentially leading to
>>    double freeing.  The kosher way may be to use strbuf_detach() in
>>    fixup_paths(), but this is a simpler fix, it is getting late in
>>    the day, and I am getting sick of fighting the leak-checker, so...
>>
>>  diff-no-index.c | 24 ++++++++++++++++++++++--
>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>
>>
>> diff --git a/diff-no-index.c b/diff-no-index.c
>> index 9a8b09346b..77a126469b 100644
>> --- a/diff-no-index.c
>> +++ b/diff-no-index.c
>> @@ -208,6 +208,14 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
>>  	strbuf_addstr(path, tail ? tail + 1 : file);
>>  }
>>
>> +static void free_allocated_path(const char *path)
>> +{
>> +	if (!path ||
>
> How can path be NULL?  And if it was, why shield free(3) from it?

See the comment under three-dashes of the first iteration.

>> +	/*
>> +	 * do not strbuf_release(&replacement), as it is in paths[]
>> +	 * when replacement was actually used.
>> +	 */
>> +	free_allocated_path(paths[0]);
>> +	free_allocated_path(paths[1]);
>>
>>  	/*
>>  	 * The return code for --no-index imitates diff(1):
>
> Perhaps avoid the need for that comment by moving that strbuf to where
> it's used and have it spend its full lifecycle there?  Something like:

Yup, that is what I said in the comment under three-dashes (with the
reason why I didn't bother).

Quite honestly I am sick of fighting the overzealous leak-checker so
I'd very much appreciate if somebody else pick this up and run with
it.

Thanks.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/2] diff-no-index: release strbuf on queue error
  2022-09-05 20:26     ` Junio C Hamano
@ 2022-09-06 12:31       ` René Scharfe
  2022-09-07  6:01         ` René Scharfe
  2022-09-06 12:31       ` [PATCH 2/2] diff-no-index: release prefixed filenames René Scharfe
                         ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: René Scharfe @ 2022-09-06 12:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

The strbuf is small and we are about to exit, so we could leave its
cleanup to the OS.  If we release it explicitly at all, however, then we
should do it on early exit as well.  Move it to a new cleanup section at
the end and make sure all execution paths go through it.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 diff-no-index.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 9a8b09346b..a3683d8a04 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -243,6 +243,7 @@ int diff_no_index(struct rev_info *revs,
 		  int argc, const char **argv)
 {
 	int i, no_index;
+	int ret = 1;
 	const char *paths[2];
 	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
@@ -295,16 +296,18 @@ int diff_no_index(struct rev_info *revs,
 	revs->diffopt.flags.exit_with_status = 1;

 	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
-		return 1;
+		goto out;
 	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);

-	strbuf_release(&replacement);
-
 	/*
 	 * The return code for --no-index imitates diff(1):
 	 * 0 = no changes, 1 = changes, else error
 	 */
-	return diff_result_code(&revs->diffopt, 0);
+	ret = diff_result_code(&revs->diffopt, 0);
+
+out:
+	strbuf_release(&replacement);
+	return ret;
 }
--
2.37.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/2] diff-no-index: release prefixed filenames
  2022-09-05 20:26     ` Junio C Hamano
  2022-09-06 12:31       ` [PATCH 1/2] diff-no-index: release strbuf on queue error René Scharfe
@ 2022-09-06 12:31       ` René Scharfe
  2022-09-07 10:03         ` Johannes Schindelin
  2022-09-07 11:36       ` [PATCH v2 1/2] diff-no-index: release strbuf on queue error René Scharfe
                         ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: René Scharfe @ 2022-09-06 12:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

Callers of prefix_filename() are responsible for freeing its result.
Remember them and release them to appease leak checkers.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 diff-no-index.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index a3683d8a04..35809f26d7 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -245,6 +245,7 @@ int diff_no_index(struct rev_info *revs,
 	int i, no_index;
 	int ret = 1;
 	const char *paths[2];
+	char *to_free[2] = { 0 };
 	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
 	struct option no_index_options[] = {
@@ -274,7 +275,7 @@ int diff_no_index(struct rev_info *revs,
 			 */
 			p = file_from_standard_input;
 		else if (prefix)
-			p = prefix_filename(prefix, p);
+			p = to_free[i] = prefix_filename(prefix, p);
 		paths[i] = p;
 	}

@@ -308,6 +309,8 @@ int diff_no_index(struct rev_info *revs,
 	ret = diff_result_code(&revs->diffopt, 0);

 out:
+	for (i = 0; i < 2; i++)
+		free(to_free[i]);
 	strbuf_release(&replacement);
 	return ret;
 }
--
2.37.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] diff-no-index: release strbuf on queue error
  2022-09-06 12:31       ` [PATCH 1/2] diff-no-index: release strbuf on queue error René Scharfe
@ 2022-09-07  6:01         ` René Scharfe
  0 siblings, 0 replies; 17+ messages in thread
From: René Scharfe @ 2022-09-07  6:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

Am 06.09.22 um 14:31 schrieb René Scharfe:
> The strbuf is small and we are about to exit, so we could leave its
> cleanup to the OS.  If we release it explicitly at all, however, then we
> should do it on early exit as well.  Move it to a new cleanup section at
> the end and make sure all execution paths go through it.
>

Oh, forgot:

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>

> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  diff-no-index.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 9a8b09346b..a3683d8a04 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -243,6 +243,7 @@ int diff_no_index(struct rev_info *revs,
>  		  int argc, const char **argv)
>  {
>  	int i, no_index;
> +	int ret = 1;
>  	const char *paths[2];
>  	struct strbuf replacement = STRBUF_INIT;
>  	const char *prefix = revs->prefix;
> @@ -295,16 +296,18 @@ int diff_no_index(struct rev_info *revs,
>  	revs->diffopt.flags.exit_with_status = 1;
>
>  	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
> -		return 1;
> +		goto out;
>  	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
>  	diffcore_std(&revs->diffopt);
>  	diff_flush(&revs->diffopt);
>
> -	strbuf_release(&replacement);
> -
>  	/*
>  	 * The return code for --no-index imitates diff(1):
>  	 * 0 = no changes, 1 = changes, else error
>  	 */
> -	return diff_result_code(&revs->diffopt, 0);
> +	ret = diff_result_code(&revs->diffopt, 0);
> +
> +out:
> +	strbuf_release(&replacement);
> +	return ret;
>  }
> --
> 2.37.2

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] diff-no-index: release prefixed filenames
  2022-09-06 12:31       ` [PATCH 2/2] diff-no-index: release prefixed filenames René Scharfe
@ 2022-09-07 10:03         ` Johannes Schindelin
  2022-09-07 11:19           ` René Scharfe
  0 siblings, 1 reply; 17+ messages in thread
From: Johannes Schindelin @ 2022-09-07 10:03 UTC (permalink / raw)
  To: René Scharfe
  Cc: Junio C Hamano, git, Ævar Arnfjörð Bjarmason

[-- Attachment #1: Type: text/plain, Size: 1227 bytes --]

Hi René,

On Tue, 6 Sep 2022, René Scharfe wrote:

> diff --git a/diff-no-index.c b/diff-no-index.c
> index a3683d8a04..35809f26d7 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -245,6 +245,7 @@ int diff_no_index(struct rev_info *revs,
>  	int i, no_index;
>  	int ret = 1;
>  	const char *paths[2];
> +	char *to_free[2] = { 0 };
>  	struct strbuf replacement = STRBUF_INIT;
>  	const char *prefix = revs->prefix;
>  	struct option no_index_options[] = {
> @@ -274,7 +275,7 @@ int diff_no_index(struct rev_info *revs,
>  			 */
>  			p = file_from_standard_input;
>  		else if (prefix)
> -			p = prefix_filename(prefix, p);
> +			p = to_free[i] = prefix_filename(prefix, p);
>  		paths[i] = p;
>  	}
>
> @@ -308,6 +309,8 @@ int diff_no_index(struct rev_info *revs,
>  	ret = diff_result_code(&revs->diffopt, 0);
>
>  out:
> +	for (i = 0; i < 2; i++)
> +		free(to_free[i]);

Heh. That's long-hand for

	free(to_free[0]);
	free(to_free[1]);

If you do want to have that loop, please replace the hard-coded 2 by
`ARRAY_SIZE(to_free)`.

Otherwise, both patches look fine to me.

Thanks!
Dscho

>  	strbuf_release(&replacement);
>  	return ret;
>  }
> --
> 2.37.2
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] diff --no-index: unleak paths[] elements
  2022-09-05 11:01     ` Ævar Arnfjörð Bjarmason
@ 2022-09-07 10:06       ` Johannes Schindelin
  2022-09-07 12:31         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 17+ messages in thread
From: Johannes Schindelin @ 2022-09-07 10:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 566 bytes --]

Hi Ævar,

On Mon, 5 Sep 2022, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Sep 05 2022, Johannes Schindelin wrote:
>
> [...]
> > +	string_list_clear(&to_release, 1);
> [...]
>
>  * The free_util=1 in your code isn't needed/is a bug, we make no use of
>    "util" here, so it should be free_util=0

Calling it a bug is a bit strong, and misses the reason why I did it:
future-proofing.

In any case, René took up Junio's ask and provided a new iteration that
side-steps this concern altogether, therefore the point is now moot.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] diff-no-index: release prefixed filenames
  2022-09-07 10:03         ` Johannes Schindelin
@ 2022-09-07 11:19           ` René Scharfe
  0 siblings, 0 replies; 17+ messages in thread
From: René Scharfe @ 2022-09-07 11:19 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, git, Ævar Arnfjörð Bjarmason

Am 07.09.22 um 12:03 schrieb Johannes Schindelin:
> Hi René,
>
> On Tue, 6 Sep 2022, René Scharfe wrote:
>
>> diff --git a/diff-no-index.c b/diff-no-index.c
>> index a3683d8a04..35809f26d7 100644
>> --- a/diff-no-index.c
>> +++ b/diff-no-index.c
>> @@ -245,6 +245,7 @@ int diff_no_index(struct rev_info *revs,
>>  	int i, no_index;
>>  	int ret = 1;
>>  	const char *paths[2];
>> +	char *to_free[2] = { 0 };
>>  	struct strbuf replacement = STRBUF_INIT;
>>  	const char *prefix = revs->prefix;
>>  	struct option no_index_options[] = {
>> @@ -274,7 +275,7 @@ int diff_no_index(struct rev_info *revs,
>>  			 */
>>  			p = file_from_standard_input;
>>  		else if (prefix)
>> -			p = prefix_filename(prefix, p);
>> +			p = to_free[i] = prefix_filename(prefix, p);
>>  		paths[i] = p;
>>  	}
>>
>> @@ -308,6 +309,8 @@ int diff_no_index(struct rev_info *revs,
>>  	ret = diff_result_code(&revs->diffopt, 0);
>>
>>  out:
>> +	for (i = 0; i < 2; i++)
>> +		free(to_free[i]);
>
> Heh. That's long-hand for
>
> 	free(to_free[0]);
> 	free(to_free[1]);

Had that before, but it's repetitive and more importantly this loop
matches the first one.

> If you do want to have that loop, please replace the hard-coded 2 by
> `ARRAY_SIZE(to_free)`.

The two is hard-coded in other places explicitly as well and implied in
fixup_paths().  The root cause is not any array size but the design
decision to require exactly two things to compare.  A reader would need
to know that.  We could sure use ARRAY_SIZE(paths) in the declaration
of to_free and ARRAY_SIZE(to_free) in the loop to at least not add more
instances of that magic number and make the code understandable without
seeing the bigger picture.

> Otherwise, both patches look fine to me.
>
> Thanks!
> Dscho
>
>>  	strbuf_release(&replacement);
>>  	return ret;
>>  }
>> --
>> 2.37.2
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/2] diff-no-index: release strbuf on queue error
  2022-09-05 20:26     ` Junio C Hamano
  2022-09-06 12:31       ` [PATCH 1/2] diff-no-index: release strbuf on queue error René Scharfe
  2022-09-06 12:31       ` [PATCH 2/2] diff-no-index: release prefixed filenames René Scharfe
@ 2022-09-07 11:36       ` René Scharfe
  2022-09-07 11:37       ` [PATCH v2 2/2] diff-no-index: release prefixed filenames René Scharfe
  2022-09-07 11:45       ` [PATCH v2 3/2] diff-no-index: simplify argv index calculation René Scharfe
  4 siblings, 0 replies; 17+ messages in thread
From: René Scharfe @ 2022-09-07 11:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

The strbuf is small and we are about to exit, so we could leave its
cleanup to the OS.  If we release it explicitly at all, however, then we
should do it on early exit as well.  Move the strbuf_release call to a
new cleanup section at the end and make sure all execution paths go
through it.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Changes:
* Add Suggested-by.
* Clarify message (s/Move it/Move the strbuf_release call/).

 diff-no-index.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 9a8b09346b..a3683d8a04 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -243,6 +243,7 @@ int diff_no_index(struct rev_info *revs,
 		  int argc, const char **argv)
 {
 	int i, no_index;
+	int ret = 1;
 	const char *paths[2];
 	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
@@ -295,16 +296,18 @@ int diff_no_index(struct rev_info *revs,
 	revs->diffopt.flags.exit_with_status = 1;

 	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
-		return 1;
+		goto out;
 	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);

-	strbuf_release(&replacement);
-
 	/*
 	 * The return code for --no-index imitates diff(1):
 	 * 0 = no changes, 1 = changes, else error
 	 */
-	return diff_result_code(&revs->diffopt, 0);
+	ret = diff_result_code(&revs->diffopt, 0);
+
+out:
+	strbuf_release(&replacement);
+	return ret;
 }
--
2.37.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/2] diff-no-index: release prefixed filenames
  2022-09-05 20:26     ` Junio C Hamano
                         ` (2 preceding siblings ...)
  2022-09-07 11:36       ` [PATCH v2 1/2] diff-no-index: release strbuf on queue error René Scharfe
@ 2022-09-07 11:37       ` René Scharfe
  2022-09-07 11:45       ` [PATCH v2 3/2] diff-no-index: simplify argv index calculation René Scharfe
  4 siblings, 0 replies; 17+ messages in thread
From: René Scharfe @ 2022-09-07 11:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

Callers of prefix_filename() are responsible for freeing its result.
Remember the returned strings and release them to appease leak checkers.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
Changes:
* Use ARRAY_SIZE twice instead of hard-code 2.
* Clarify message (s/Remember them/Remember the returned strings/).

 diff-no-index.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index a3683d8a04..a18f6c3c63 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -245,6 +245,7 @@ int diff_no_index(struct rev_info *revs,
 	int i, no_index;
 	int ret = 1;
 	const char *paths[2];
+	char *to_free[ARRAY_SIZE(paths)] = { 0 };
 	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
 	struct option no_index_options[] = {
@@ -274,7 +275,7 @@ int diff_no_index(struct rev_info *revs,
 			 */
 			p = file_from_standard_input;
 		else if (prefix)
-			p = prefix_filename(prefix, p);
+			p = to_free[i] = prefix_filename(prefix, p);
 		paths[i] = p;
 	}

@@ -308,6 +309,8 @@ int diff_no_index(struct rev_info *revs,
 	ret = diff_result_code(&revs->diffopt, 0);

 out:
+	for (i = 0; i < ARRAY_SIZE(to_free); i++)
+		free(to_free[i]);
 	strbuf_release(&replacement);
 	return ret;
 }
--
2.37.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/2] diff-no-index: simplify argv index calculation
  2022-09-05 20:26     ` Junio C Hamano
                         ` (3 preceding siblings ...)
  2022-09-07 11:37       ` [PATCH v2 2/2] diff-no-index: release prefixed filenames René Scharfe
@ 2022-09-07 11:45       ` René Scharfe
  2022-09-07 19:37         ` Junio C Hamano
  4 siblings, 1 reply; 17+ messages in thread
From: René Scharfe @ 2022-09-07 11:45 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

Since 16bb3d714d (diff --no-index: use parse_options() instead of
diff_opt_parse(), 2019-03-24) argc must be 2 if we reach the loop, i.e.
argc - 2 == 0.  Remove that inconsequential term.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
Bonus patch "while at it", would have saved me from going "huh?".
Generated using -U8 for easier review.

 diff-no-index.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index a18f6c3c63..18edbdf4b5 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -262,17 +262,17 @@ int diff_no_index(struct rev_info *revs,
 	if (argc != 2) {
 		if (implicit_no_index)
 			warning(_("Not a git repository. Use --no-index to "
 				  "compare two paths outside a working tree"));
 		usage_with_options(diff_no_index_usage, options);
 	}
 	FREE_AND_NULL(options);
 	for (i = 0; i < 2; i++) {
-		const char *p = argv[argc - 2 + i];
+		const char *p = argv[i];
 		if (!strcmp(p, "-"))
 			/*
 			 * stdin should be spelled as "-"; if you have
 			 * path that is "-", spell it as "./-".
 			 */
 			p = file_from_standard_input;
 		else if (prefix)
 			p = to_free[i] = prefix_filename(prefix, p);
--
2.37.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] diff --no-index: unleak paths[] elements
  2022-09-07 10:06       ` Johannes Schindelin
@ 2022-09-07 12:31         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-07 12:31 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git


On Wed, Sep 07 2022, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Mon, 5 Sep 2022, Ævar Arnfjörð Bjarmason wrote:
>
>> On Mon, Sep 05 2022, Johannes Schindelin wrote:
>>
>> [...]
>> > +	string_list_clear(&to_release, 1);
>> [...]
>>
>>  * The free_util=1 in your code isn't needed/is a bug, we make no use of
>>    "util" here, so it should be free_util=0
>
> Calling it a bug is a bit strong[...]

Yes, FWIW I meant that in the sense of "the author probably didn't mean
this" or "it was copy/pasted", but it has no effect currently, as we'll
always have NULL "util" members.

> , and misses the reason why I did it: future-proofing.

I didn't think it was intentional, but obviously you know better about
the intent.

"Future proofing" seems like a bad reason for that API use however. If
you look at the various "util" users some of them want to free() it, and
some don't. If you forget to free() the worst you'll have is a memory
leak, but if you have some boilerplate free() there you'll probably get
a segfault.

Without knowing what the future code looks like we don't know whether it
would make any sense to free() that util use.

> In any case, René took up Junio's ask and provided a new iteration that
> side-steps this concern altogether, therefore the point is now moot.

Indeed, that patch LGTM.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/2] diff-no-index: simplify argv index calculation
  2022-09-07 11:45       ` [PATCH v2 3/2] diff-no-index: simplify argv index calculation René Scharfe
@ 2022-09-07 19:37         ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2022-09-07 19:37 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin

René Scharfe <l.s.r@web.de> writes:

> Since 16bb3d714d (diff --no-index: use parse_options() instead of
> diff_opt_parse(), 2019-03-24) argc must be 2 if we reach the loop, i.e.
> argc - 2 == 0.  Remove that inconsequential term.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Bonus patch "while at it", would have saved me from going "huh?".
> Generated using -U8 for easier review.

All three patches made sense to me.  Thanks.

>
>  diff-no-index.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/diff-no-index.c b/diff-no-index.c
> index a18f6c3c63..18edbdf4b5 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -262,17 +262,17 @@ int diff_no_index(struct rev_info *revs,
>  	if (argc != 2) {
>  		if (implicit_no_index)
>  			warning(_("Not a git repository. Use --no-index to "
>  				  "compare two paths outside a working tree"));
>  		usage_with_options(diff_no_index_usage, options);
>  	}
>  	FREE_AND_NULL(options);
>  	for (i = 0; i < 2; i++) {
> -		const char *p = argv[argc - 2 + i];
> +		const char *p = argv[i];
>  		if (!strcmp(p, "-"))
>  			/*
>  			 * stdin should be spelled as "-"; if you have
>  			 * path that is "-", spell it as "./-".
>  			 */
>  			p = file_from_standard_input;
>  		else if (prefix)
>  			p = to_free[i] = prefix_filename(prefix, p);
> --
> 2.37.2

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-09-07 19:37 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-02 21:27 [PATCH] unleak paths allocated in "diff --no-index" Junio C Hamano
2022-09-02 23:49 ` [PATCH v2] diff --no-index: unleak paths[] elements Junio C Hamano
2022-09-03  6:00   ` René Scharfe
2022-09-05 20:26     ` Junio C Hamano
2022-09-06 12:31       ` [PATCH 1/2] diff-no-index: release strbuf on queue error René Scharfe
2022-09-07  6:01         ` René Scharfe
2022-09-06 12:31       ` [PATCH 2/2] diff-no-index: release prefixed filenames René Scharfe
2022-09-07 10:03         ` Johannes Schindelin
2022-09-07 11:19           ` René Scharfe
2022-09-07 11:36       ` [PATCH v2 1/2] diff-no-index: release strbuf on queue error René Scharfe
2022-09-07 11:37       ` [PATCH v2 2/2] diff-no-index: release prefixed filenames René Scharfe
2022-09-07 11:45       ` [PATCH v2 3/2] diff-no-index: simplify argv index calculation René Scharfe
2022-09-07 19:37         ` Junio C Hamano
2022-09-05 10:03   ` [PATCH v2] diff --no-index: unleak paths[] elements Johannes Schindelin
2022-09-05 11:01     ` Ævar Arnfjörð Bjarmason
2022-09-07 10:06       ` Johannes Schindelin
2022-09-07 12:31         ` Ævar Arnfjörð Bjarmason

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).