git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] difftool: avoid strcpy
@ 2017-03-30 10:35 Jeff King
  2017-03-30 13:19 ` Johannes Schindelin
  0 siblings, 1 reply; 2+ messages in thread
From: Jeff King @ 2017-03-30 10:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

In order to checkout files, difftool reads "diff --raw"
output and feeds the names to checkout_entry(). That
function requires us to have a "struct cache_entry". And
because that struct uses a FLEX_ARRAY for the name field, we
have to actually copy in our new name.

The current code allocates a single re-usable cache_entry
that can hold a name up to PATH_MAX, and then copies
filenames into it using strcpy(). But there's no guarantee
that incoming names are smaller than PATH_MAX. They've come
from "diff --raw" output which might be diffing between two
trees (and hence we'd be subject to the PATH_MAX of some
other system, or even none at all if they were created
directly via "update-index").

We can fix this by using make_cache_entry() to create a
correctly-sized cache_entry for each name. This incurs an
extra allocation per file, but this is negligible compared
to actually writing out the file contents.

To make this simpler, we can push this procedure into a new
helper function. Note that we can also get rid of the "len"
variables for src_path and dst_path (and in fact we must, as
the compiler complains that they are unused).

Signed-off-by: Jeff King <peff@peff.net>
---
I tested this with:

  git init
  sha1=$(echo whatever | git hash-object -w --stdin)
  git update-index --add --cacheinfo \
    100644 $sha1 $(perl -e 'print join("/", 1..2048)')
  git difftool -d HEAD

It fails anyway, of course, because we can't check out a filename of
that length, but not until after it has overflowed the buffer.

I'm not sure if we'd want that in the test suite or not, since the
outcome is unpredictable.

 builtin/difftool.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/builtin/difftool.c b/builtin/difftool.c
index 25e54ad3e..b350b3d39 100644
--- a/builtin/difftool.c
+++ b/builtin/difftool.c
@@ -297,6 +297,19 @@ static char *get_symlink(const struct object_id *oid, const char *path)
 	return data;
 }
 
+static int checkout_path(unsigned mode, struct object_id *oid,
+			 const char *path, const struct checkout *state)
+{
+	struct cache_entry *ce;
+	int ret;
+
+	ce = make_cache_entry(mode, oid->hash, path, 0, 0);
+	ret = checkout_entry(ce, state, NULL);
+
+	free(ce);
+	return ret;
+}
+
 static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
 			int argc, const char **argv)
 {
@@ -306,7 +319,6 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
 	struct strbuf ldir = STRBUF_INIT, rdir = STRBUF_INIT;
 	struct strbuf wtdir = STRBUF_INIT;
 	size_t ldir_len, rdir_len, wtdir_len;
-	struct cache_entry *ce = xcalloc(1, sizeof(ce) + PATH_MAX + 1);
 	const char *workdir, *tmp;
 	int ret = 0, i;
 	FILE *fp;
@@ -377,7 +389,6 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
 		struct object_id loid, roid;
 		char status;
 		const char *src_path, *dst_path;
-		size_t src_path_len, dst_path_len;
 
 		if (starts_with(info.buf, "::"))
 			die(N_("combined diff formats('-c' and '--cc') are "
@@ -390,17 +401,14 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
 		if (strbuf_getline_nul(&lpath, fp))
 			break;
 		src_path = lpath.buf;
-		src_path_len = lpath.len;
 
 		i++;
 		if (status != 'C' && status != 'R') {
 			dst_path = src_path;
-			dst_path_len = src_path_len;
 		} else {
 			if (strbuf_getline_nul(&rpath, fp))
 				break;
 			dst_path = rpath.buf;
-			dst_path_len = rpath.len;
 		}
 
 		if (S_ISGITLINK(lmode) || S_ISGITLINK(rmode)) {
@@ -430,11 +438,7 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
 		}
 
 		if (lmode && status != 'C') {
-			ce->ce_mode = lmode;
-			oidcpy(&ce->oid, &loid);
-			strcpy(ce->name, src_path);
-			ce->ce_namelen = src_path_len;
-			if (checkout_entry(ce, &lstate, NULL))
+			if (checkout_path(lmode, &loid, src_path, &lstate))
 				return error("could not write '%s'", src_path);
 		}
 
@@ -451,11 +455,7 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
 			hashmap_add(&working_tree_dups, entry);
 
 			if (!use_wt_file(workdir, dst_path, &roid)) {
-				ce->ce_mode = rmode;
-				oidcpy(&ce->oid, &roid);
-				strcpy(ce->name, dst_path);
-				ce->ce_namelen = dst_path_len;
-				if (checkout_entry(ce, &rstate, NULL))
+				if (checkout_path(rmode, &roid, dst_path, &rstate))
 					return error("could not write '%s'",
 						     dst_path);
 			} else if (!is_null_oid(&roid)) {
@@ -625,7 +625,6 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
 		exit_cleanup(tmpdir, rc);
 
 finish:
-	free(ce);
 	strbuf_release(&ldir);
 	strbuf_release(&rdir);
 	strbuf_release(&wtdir);
-- 
2.12.2.920.g2106709b0

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] difftool: avoid strcpy
  2017-03-30 10:35 [PATCH] difftool: avoid strcpy Jeff King
@ 2017-03-30 13:19 ` Johannes Schindelin
  0 siblings, 0 replies; 2+ messages in thread
From: Johannes Schindelin @ 2017-03-30 13:19 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Hi Peff,

On Thu, 30 Mar 2017, Jeff King wrote:

> In order to checkout files, difftool reads "diff --raw"
> output and feeds the names to checkout_entry(). That
> function requires us to have a "struct cache_entry". And
> because that struct uses a FLEX_ARRAY for the name field, we
> have to actually copy in our new name.
> 
> The current code allocates a single re-usable cache_entry
> that can hold a name up to PATH_MAX, and then copies
> filenames into it using strcpy(). But there's no guarantee
> that incoming names are smaller than PATH_MAX. They've come
> from "diff --raw" output which might be diffing between two
> trees (and hence we'd be subject to the PATH_MAX of some
> other system, or even none at all if they were created
> directly via "update-index").
> 
> We can fix this by using make_cache_entry() to create a
> correctly-sized cache_entry for each name. This incurs an
> extra allocation per file, but this is negligible compared
> to actually writing out the file contents.
> 
> To make this simpler, we can push this procedure into a new
> helper function. Note that we can also get rid of the "len"
> variables for src_path and dst_path (and in fact we must, as
> the compiler complains that they are unused).

Oh woops. Thanks for noticing and for patching it right away!

> I tested this with:
> 
>   git init
>   sha1=$(echo whatever | git hash-object -w --stdin)
>   git update-index --add --cacheinfo \
>     100644 $sha1 $(perl -e 'print join("/", 1..2048)')
>   git difftool -d HEAD
> 
> It fails anyway, of course, because we can't check out a filename of
> that length, but not until after it has overflowed the buffer.
> 
> I'm not sure if we'd want that in the test suite or not, since the
> outcome is unpredictable.

I'd leave it out. There is really no reliable way to test this.

>  builtin/difftool.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)

Nice! I would have thought that it adds to the total number of lines.
Instead, it removes many fiddly `*_len = *` assignments and makes the code
more robust.

ACK!
Dscho

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2017-03-30 13:20 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-30 10:35 [PATCH] difftool: avoid strcpy Jeff King
2017-03-30 13:19 ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).