git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Johannes Schindelin" <johannes.schindelin@gmx.de>
Subject: Re: [PATCH v2] diff --no-index: unleak paths[] elements
Date: Sat, 3 Sep 2022 08:00:23 +0200	[thread overview]
Message-ID: <181c029b-8b36-4b04-30f9-97a3f252bfbc@web.de> (raw)
In-Reply-To: <xmqqilm51gn6.fsf@gitster.g>

Am 03.09.22 um 01:49 schrieb Junio C Hamano:
> "git diff --no-index" codepath starts with the two elements in
> argv[] and munges them into two paths to be compared, stored in a
> separate path[] arrays.  The munging is implemented in a rather
> haphazard way, sometimes overwriting old version with a new copy,
> and sometimes a constant string assigned to path[], making it
> impossible to release the resources properly:
>
>  * A single dash "-" from the command line is a special signal that
>    the standard input is used for the side to be compared, and is
>    internally replaced with a copy of string "-" at a known address.
>
>  * When run in a subdirectory, full paths to the two paths are
>    allocated and placed in path[].
>
>  * After the above happens, when comparing a file with a directory,
>    the directory side is replaced with the path to a file in the
>    directory with the same name as the file.
>
> This was perfectly fine for just two strings that are pathnames used
> during the lifetime of the program and cleaned up upon program exit,
> but it gets in the way when leak sanitizer is in effect.  The third
> step can be losing the full path that was allocated in the second
> step, but it is not easy to tell if its input is an allocated piece
> of memory to begin with.
>
> Loosen the earlier two steps a bit so that elements of the path[]
> array that come to the directory/file comparison code are either the
> singleton "-" or an allocated piece of memory.  Use that knowledge
> in the third step to release an allocated piece of memory when it
> replaces the path to a directory with the path to a file in that
> directory, and also at the end to release the two elements of the
> path[] array as needed.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>
>  * The previous one allowed strbuf_release() to free replacement.buf
>    which may be used in path[0] or path[1] potentially leading to
>    double freeing.  The kosher way may be to use strbuf_detach() in
>    fixup_paths(), but this is a simpler fix, it is getting late in
>    the day, and I am getting sick of fighting the leak-checker, so...
>
>  diff-no-index.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
>
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 9a8b09346b..77a126469b 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -208,6 +208,14 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
>  	strbuf_addstr(path, tail ? tail + 1 : file);
>  }
>
> +static void free_allocated_path(const char *path)
> +{
> +	if (!path ||

How can path be NULL?  And if it was, why shield free(3) from it?

> +	    (path == file_from_standard_input))
> +		return;
> +	free((char *)path);
> +}
> +
>  /*
>   * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
>   * Note that we append the basename of F to D/, so "diff a/b/file D"
> @@ -226,9 +234,11 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
>  		return;
>  	if (isdir0) {
>  		append_basename(replacement, path[0], path[1]);
> +		free_allocated_path(path[0]);
>  		path[0] = replacement->buf;
>  	} else {
>  		append_basename(replacement, path[1], path[0]);
> +		free_allocated_path(path[1]);
>  		path[1] = replacement->buf;
>  	}
>  }
> @@ -274,6 +284,8 @@ int diff_no_index(struct rev_info *revs,
>  			p = file_from_standard_input;
>  		else if (prefix)
>  			p = prefix_filename(prefix, p);
> +		else
> +			p = xstrdup(p);

prefix_filename(NULL, p) is basically the same as xstrdup(p), so those
two conditional branches could be joined.

>  		paths[i] = p;
>  	}
>
> @@ -294,13 +306,21 @@ int diff_no_index(struct rev_info *revs,
>  	setup_diff_pager(&revs->diffopt);
>  	revs->diffopt.flags.exit_with_status = 1;
>
> -	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
> +	if (queue_diff(&revs->diffopt, paths[0], paths[1])) {
> +		free_allocated_path(paths[0]);
> +		free_allocated_path(paths[1]);
>  		return 1;
> +	}
>  	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
>  	diffcore_std(&revs->diffopt);
>  	diff_flush(&revs->diffopt);
>
> -	strbuf_release(&replacement);
> +	/*
> +	 * do not strbuf_release(&replacement), as it is in paths[]
> +	 * when replacement was actually used.
> +	 */
> +	free_allocated_path(paths[0]);
> +	free_allocated_path(paths[1]);
>
>  	/*
>  	 * The return code for --no-index imitates diff(1):

Perhaps avoid the need for that comment by moving that strbuf to where
it's used and have it spend its full lifecycle there?  Something like:

---
 diff-no-index.c | 50 ++++++++++++++++++++++---------------------------
 1 file changed, 22 insertions(+), 28 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 77a126469b..9f8b78f173 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -196,18 +196,6 @@ static int queue_diff(struct diff_options *o,
 	}
 }

-/* append basename of F to D */
-static void append_basename(struct strbuf *path, const char *dir, const char *file)
-{
-	const char *tail = strrchr(file, '/');
-
-	strbuf_addstr(path, dir);
-	while (path->len && path->buf[path->len - 1] == '/')
-		path->len--;
-	strbuf_addch(path, '/');
-	strbuf_addstr(path, tail ? tail + 1 : file);
-}
-
 static void free_allocated_path(const char *path)
 {
 	if (!path ||
@@ -216,12 +204,28 @@ static void free_allocated_path(const char *path)
 	free((char *)path);
 }

+/* append basename of F to D */
+static void append_basename(const char **dir, const char *file)
+{
+	const char *tail = strrchr(file, '/');
+	struct strbuf path = STRBUF_INIT;
+
+	strbuf_addstr(&path, *dir);
+	while (path.len && path.buf[path.len - 1] == '/')
+		path.len--;
+	strbuf_addch(&path, '/');
+	strbuf_addstr(&path, tail ? tail + 1 : file);
+
+	free_allocated_path(*dir);
+	*dir = strbuf_detach(&path, NULL);
+}
+
 /*
  * DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
  * Note that we append the basename of F to D/, so "diff a/b/file D"
  * becomes "diff a/b/file D/file", not "diff a/b/file D/a/b/file".
  */
-static void fixup_paths(const char **path, struct strbuf *replacement)
+static void fixup_paths(const char **path)
 {
 	unsigned int isdir0, isdir1;

@@ -232,15 +236,10 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
 	isdir1 = is_directory(path[1]);
 	if (isdir0 == isdir1)
 		return;
-	if (isdir0) {
-		append_basename(replacement, path[0], path[1]);
-		free_allocated_path(path[0]);
-		path[0] = replacement->buf;
-	} else {
-		append_basename(replacement, path[1], path[0]);
-		free_allocated_path(path[1]);
-		path[1] = replacement->buf;
-	}
+	if (isdir0)
+		append_basename(&path[0], path[1]);
+	else
+		append_basename(&path[1], path[0]);
 }

 static const char * const diff_no_index_usage[] = {
@@ -254,7 +253,6 @@ int diff_no_index(struct rev_info *revs,
 {
 	int i, no_index;
 	const char *paths[2];
-	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
 	struct option no_index_options[] = {
 		OPT_BOOL_F(0, "no-index", &no_index, "",
@@ -289,7 +287,7 @@ int diff_no_index(struct rev_info *revs,
 		paths[i] = p;
 	}

-	fixup_paths(paths, &replacement);
+	fixup_paths(paths);

 	revs->diffopt.skip_stat_unmatch = 1;
 	if (!revs->diffopt.output_format)
@@ -315,10 +313,6 @@ int diff_no_index(struct rev_info *revs,
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);

-	/*
-	 * do not strbuf_release(&replacement), as it is in paths[]
-	 * when replacement was actually used.
-	 */
 	free_allocated_path(paths[0]);
 	free_allocated_path(paths[1]);

--
2.37.2


  reply	other threads:[~2022-09-03  6:00 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-02 21:27 [PATCH] unleak paths allocated in "diff --no-index" Junio C Hamano
2022-09-02 23:49 ` [PATCH v2] diff --no-index: unleak paths[] elements Junio C Hamano
2022-09-03  6:00   ` René Scharfe [this message]
2022-09-05 20:26     ` Junio C Hamano
2022-09-06 12:31       ` [PATCH 1/2] diff-no-index: release strbuf on queue error René Scharfe
2022-09-07  6:01         ` René Scharfe
2022-09-06 12:31       ` [PATCH 2/2] diff-no-index: release prefixed filenames René Scharfe
2022-09-07 10:03         ` Johannes Schindelin
2022-09-07 11:19           ` René Scharfe
2022-09-07 11:36       ` [PATCH v2 1/2] diff-no-index: release strbuf on queue error René Scharfe
2022-09-07 11:37       ` [PATCH v2 2/2] diff-no-index: release prefixed filenames René Scharfe
2022-09-07 11:45       ` [PATCH v2 3/2] diff-no-index: simplify argv index calculation René Scharfe
2022-09-07 19:37         ` Junio C Hamano
2022-09-05 10:03   ` [PATCH v2] diff --no-index: unleak paths[] elements Johannes Schindelin
2022-09-05 11:01     ` Ævar Arnfjörð Bjarmason
2022-09-07 10:06       ` Johannes Schindelin
2022-09-07 12:31         ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=181c029b-8b36-4b04-30f9-97a3f252bfbc@web.de \
    --to=l.s.r@web.de \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).