git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>,
	git@vger.kernel.org,
	"Johannes Schindelin" <johannes.schindelin@gmx.de>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 1/3] untracked-cache: be defensive about missing NULs in index
Date: Thu, 18 Apr 2019 17:17:02 -0400
Message-ID: <20190418211701.GA18520@sigill.intra.peff.net>
In-Reply-To: <20190418211408.GA18011@sigill.intra.peff.net>

The on-disk format for the untracked-cache extension contains
NUL-terminated filenames. We parse these from the mmap'd file using
string functions like strlen(). This works fine in the normal case, but
if we see a malformed or corrupted index, we might read off the end of
our mmap.

Instead, let's use memchr() to find the trailing NUL within the bytes we
know are available, and return an error if it's missing.
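
As a rough illustration of the bounded search described above, here is a
minimal standalone sketch. The helper name find_nul() is hypothetical and
not part of git, and the sketch assumes "end" points one past the last
readable byte:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return a pointer to the NUL
 * that terminates the string starting at "data", looking only at the
 * bytes in the half-open range [data, end). Returns NULL if no NUL is
 * found there, so the caller never reads past "end".
 */
static const unsigned char *find_nul(const unsigned char *data,
				     const unsigned char *end)
{
	if (data >= end)
		return NULL; /* nothing left to search */
	return memchr(data, '\0', (size_t)(end - data));
}
```

Unlike strlen(), this never touches a byte at or beyond "end", so a
truncated buffer yields NULL instead of an out-of-bounds read.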

Note that we can further simplify by folding another range check into
our conditional. After we find the end of the string, we set "next" to
the byte after the string and treat it as an error if there are no such
bytes left. That saves us from having to do a range check at the
beginning of each subsequent string (and works because there is always
data after each string). We can do both range checks together by
checking "!eos" (we didn't find a NUL) and "eos == end" (it was on the
last available byte, meaning there's nothing after). This replaces the
existing "next > end" checks.
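
The combined check can be sketched outside of git roughly as follows.
skip_strings() is a hypothetical helper, not code from dir.c, and it
assumes "end" points one past the last readable byte and that the caller
passes *datap <= end. Under that half-open convention, memchr() over
"end - data" bytes can only return a pointer strictly below "end", so the
sketch spells the "nothing after the string" case as "eos + 1 == end"
rather than the "eos == end" form quoted above:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git's code): consume "nr" NUL-terminated
 * strings from the half-open range [*datap, end), requiring at least
 * one byte to remain after each string. On success, advance *datap
 * past the strings and return 0. On a missing NUL or missing trailing
 * data, leave *datap untouched and return -1.
 */
static int skip_strings(const unsigned char **datap,
			const unsigned char *end, size_t nr)
{
	const unsigned char *data = *datap;
	size_t i;

	for (i = 0; i < nr; i++) {
		const unsigned char *eos =
			memchr(data, '\0', (size_t)(end - data));

		/* One conditional covers "no NUL" and "nothing after it". */
		if (!eos || eos + 1 == end)
			return -1;
		data = eos + 1;
	}
	*datap = data;
	return 0;
}
```

Because each successful iteration leaves "data" strictly below "end", no
separate range check is needed before the next memchr() call.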

Note also that the decode_varint() calls have a similar problem (we
don't even pass them "end"; they just keep parsing). These are probably
OK in practice since varints have a finite length (we stop parsing when
we'd overflow a uintmax_t), so the worst case is that we'd overflow into
reading the trailing bytes of the index.
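
For comparison, a bounds-aware decoder might look something like the
sketch below. This is purely illustrative: decode_varint_bounded() is a
hypothetical name, its signature is an assumption, and the encoding shown
(7 value bits per byte, high bit as a continuation flag, plus one before
each shift) is assumed here rather than taken from this patch:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bounded decoder (an illustration, not git's
 * decode_varint()). Reads from the half-open range [*bufp, end).
 * Returns 0 and stores the value in *out on success; returns -1 if the
 * input is truncated or the value would overflow uintmax_t.
 */
static int decode_varint_bounded(const unsigned char **bufp,
				 const unsigned char *end, uintmax_t *out)
{
	const unsigned char *buf = *bufp;
	unsigned char c;
	uintmax_t val;

	if (buf >= end)
		return -1;
	c = *buf++;
	val = c & 127;
	while (c & 128) {
		val += 1;
		/* Stop if the next 7-bit shift would overflow. */
		if (!val || (val >> (sizeof(val) * CHAR_BIT - 7)))
			return -1;
		if (buf >= end)
			return -1; /* continuation bit set, no bytes left */
		c = *buf++;
		val = (val << 7) + (c & 127);
	}
	*bufp = buf;
	*out = val;
	return 0;
}
```

The overflow test is what keeps the length finite even without "end"; the
extra "buf >= end" checks are what a bounded variant would add on top.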

Signed-off-by: Jeff King <peff@peff.net>
---
 dir.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index f5293a6536..7b0513c476 100644
--- a/dir.c
+++ b/dir.c
@@ -2733,6 +2733,7 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 {
 	struct untracked_cache_dir ud, *untracked;
 	const unsigned char *next, *data = rd->data, *end = rd->end;
+	const unsigned char *eos;
 	unsigned int value;
 	int i, len;
 
@@ -2756,21 +2757,24 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	ALLOC_ARRAY(ud.dirs, ud.dirs_nr);
 	data = next;
 
-	len = strlen((const char *)data);
-	next = data + len + 1;
-	if (next > rd->end)
+	eos = memchr(data, '\0', end - data);
+	if (!eos || eos == end)
 		return -1;
+	len = eos - data;
+	next = eos + 1;
+
 	*untracked_ = untracked = xmalloc(st_add3(sizeof(*untracked), len, 1));
 	memcpy(untracked, &ud, sizeof(ud));
 	memcpy(untracked->name, data, len + 1);
 	data = next;
 
 	for (i = 0; i < untracked->untracked_nr; i++) {
-		len = strlen((const char *)data);
-		next = data + len + 1;
-		if (next > rd->end)
+		eos = memchr(data, '\0', end - data);
+		if (!eos || eos == end)
 			return -1;
-		untracked->untracked[i] = xstrdup((const char*)data);
+		len = eos - data;
+		next = eos + 1;
+		untracked->untracked[i] = xmemdupz(data, len);
 		data = next;
 	}
 
-- 
2.21.0.1092.g8b0302e9c4



Thread overview: 13+ messages
2019-04-10 12:56 [PATCH 0/1] Fix an off-by-one bug in the untracked cache code Johannes Schindelin via GitGitGadget
2019-04-10 12:56 ` [PATCH 1/1] untracked cache: fix off-by-one Johannes Schindelin via GitGitGadget
2019-04-10 16:20   ` Jeff King
2019-04-12  1:48     ` Junio C Hamano
2019-04-18 21:14       ` [PATCH 0/3] untracked cache parsing fixups Jeff King
2019-04-18 21:17         ` Jeff King [this message]
2019-04-19  5:29           ` [PATCH 1/3] untracked-cache: be defensive about missing NULs in index Junio C Hamano
2019-04-18 21:17         ` [PATCH 2/3] untracked-cache: simplify parsing by dropping "next" Jeff King
2019-04-19  5:33           ` Junio C Hamano
2019-04-18 21:18         ` [PATCH 3/3] untracked-cache: simplify parsing by dropping "len" Jeff King
2019-04-18 21:24         ` [PATCH 4/3] untracked-cache: use FLEX_ALLOC to create internal structs Jeff King
2019-04-19  9:18           ` Duy Nguyen
2019-04-19 19:43             ` Jeff King
