git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: [PATCH] index: be careful when handling long names
Date: Fri, 18 Jan 2008 23:42:00 -0800	[thread overview]
Message-ID: <7vk5m69s8n.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <alpine.LFD.1.00.0801181911560.2957@woody.linux-foundation.org> (Linus Torvalds's message of "Fri, 18 Jan 2008 19:25:58 -0800 (PST)")

We currently use lower 12-bit (masked with CE_NAMEMASK) in the
ce_flags field to store the length of the name in cache_entry,
without checking the length parameter given to
create_ce_flags().  This can make us store incorrect length.

Currently we are mostly protected by the fact that many
codepaths first copy the path in a variable of size PATH_MAX,
which typically is 4096 that happens to match the limit, but
that feels like a bug waiting to happen.  Besides, that would
not allow us to shorten the width of CE_NAMEMASK to use the bits
for new flags.

This redefines the meaning of the name length stored in the
cache_entry.  A name that does not fit is represented by storing
CE_NAMEMASK in the field, and the actual length needs to be
computed by actually counting the bytes in the name[] field.
This way, only the unusually long paths need to suffer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This is rebased on your "Updated in-memory index" patch.

   When we read on-disk index, I fear that we trust the name
   length a bit too much.  With or without this patch, you could
   craft a malicious index file that records a namelen that is
   longer than the real string name length for the last entry to
   cause us to go beyond the mmaped area.  Maybe we would want
   to make sure that (1) the name lengths are sane; (2) names
   have NUL at the place we expect them to be; and (3) names are
   sorted in the proper cache order, while we iterate over the
   ondisk structure and convert it to incore structure.

 cache.h          |   17 +++++++++++++++--
 read-cache.c     |   12 +++++++++++-
 t/t0000-basic.sh |   20 ++++++++++++++++++++
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index 4a054c5..9eaffde 100644
--- a/cache.h
+++ b/cache.h
@@ -131,8 +131,21 @@ struct cache_entry {
 #define CE_UPDATE    (0x10000)
 #define CE_REMOVE    (0x20000)
 
-#define create_ce_flags(len, stage) ((len) | ((stage) << CE_STAGESHIFT))
-#define ce_namelen(ce) (CE_NAMEMASK & (ce)->ce_flags)
+static inline unsigned create_ce_flags(size_t len, unsigned stage)
+{
+	if (len >= CE_NAMEMASK)
+		len = CE_NAMEMASK;
+	return (len | (stage << CE_STAGESHIFT));
+}
+
+static inline size_t ce_namelen(const struct cache_entry *ce)
+{
+	size_t len = ce->ce_flags & CE_NAMEMASK;
+	if (len < CE_NAMEMASK)
+		return len;
+	return strlen(ce->name + CE_NAMEMASK) + CE_NAMEMASK;
+}
+
 #define ce_size(ce) cache_entry_size(ce_namelen(ce))
 #define ondisk_ce_size(ce) ondisk_cache_entry_size(ce_namelen(ce))
 #define ce_stage(ce) ((CE_STAGEMASK & (ce)->ce_flags) >> CE_STAGESHIFT)
diff --git a/read-cache.c b/read-cache.c
index f5f9c3d..f98f57c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -927,6 +927,8 @@ int read_index(struct index_state *istate)
 
 static void convert_from_disk(struct ondisk_cache_entry *ondisk, struct cache_entry *ce)
 {
+	size_t len;
+
 	ce->ce_ctime = ntohl(ondisk->ctime.sec);
 	ce->ce_mtime = ntohl(ondisk->mtime.sec);
 	ce->ce_dev   = ntohl(ondisk->dev);
@@ -938,7 +940,15 @@ static void convert_from_disk(struct ondisk_cache_entry *ondisk, struct cache_en
 	/* On-disk flags are just 16 bits */
 	ce->ce_flags = ntohs(ondisk->flags);
 	hashcpy(ce->sha1, ondisk->sha1);
-	memcpy(ce->name, ondisk->name, ce_namelen(ce)+1);
+
+	len = ce->ce_flags & CE_NAMEMASK;
+	if (len == CE_NAMEMASK)
+		len = strlen(ondisk->name);
+	/*
+	 * NEEDSWORK: If the original index is crafted, this copy could
+	 * go unchecked.
+	 */
+	memcpy(ce->name, ondisk->name, len + 1);
 }
 
 /* remember to discard_cache() before reading a different cache! */
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 4e49d59..9f84b8d 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -297,4 +297,24 @@ test_expect_success 'absolute path works as expected' '
 	test "$sym" = "$(test-absolute-path $dir2/syml)"
 '
 
+test_expect_success 'very long name in the index handled sanely' '
+
+	a=a && # 1
+	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 16
+	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 256
+	a=$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a$a && # 4096
+	a=${a}q &&
+
+	>path4 &&
+	git update-index --add path4 &&
+	(
+		git ls-files -s path4 |
+		sed -e "s/	.*/	/" |
+		tr -d "\012"
+		echo "$a"
+	) | git update-index --index-info &&
+	len=$(git ls-files "a*" | wc -c) &&
+	test $len = 4098
+'
+
 test_done
-- 
1.5.4.rc3.37.gfdcf3

  reply	other threads:[~2008-01-19  7:42 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-19  3:25 Updated in-memory index cleanup Linus Torvalds
2008-01-19  7:42 ` Junio C Hamano [this message]
2008-01-19  7:45 ` [PATCH] Avoid running lstat(2) on the same cache entry Junio C Hamano
2008-01-19 19:47   ` Linus Torvalds
2008-01-19 19:58     ` Linus Torvalds
2008-01-20  1:22     ` Linus Torvalds
2008-01-20  1:34       ` Linus Torvalds
2008-01-20  1:48       ` Johannes Schindelin
2008-01-20  2:02         ` Linus Torvalds
2008-01-20 10:33         ` Steffen Prohaska
2008-01-20 14:15           ` Johannes Schindelin
2008-01-21  0:18           ` Junio C Hamano
2008-01-20 15:10         ` Steffen Prohaska
2008-01-20 15:18           ` Johannes Schindelin
2008-01-20 15:19             ` [PATCH] Also use unpack_trees() in do_diff_cache() Johannes Schindelin
2008-01-20 20:32               ` Linus Torvalds
2008-01-20 21:53                 ` Linus Torvalds
2008-01-20 23:34                   ` Johannes Schindelin
2008-01-20 23:58                     ` Linus Torvalds
2008-01-21  0:19                       ` Johannes Schindelin
2008-01-20 15:21             ` [PATCH] Try to resurrect the handling for 'diff-index -m' Johannes Schindelin
2008-01-20 18:20           ` [PATCH] Avoid running lstat(2) on the same cache entry Linus Torvalds
2008-01-20 20:03             ` Brian Downing
2008-01-20 21:40             ` Steffen Prohaska
2008-01-20 22:09               ` Linus Torvalds
2008-01-20  2:42       ` Johannes Schindelin
  -- strict thread matches above, loose matches on Subject: below --
2008-07-20  0:51 [PATCH] index: be careful when handling long names Junio C Hamano
2008-01-12 22:46 performance problem: "git commit filename" Linus Torvalds
2008-01-13  1:46 ` Linus Torvalds
2008-01-13  4:04   ` Linus Torvalds
2008-01-13  8:12     ` Junio C Hamano
2008-01-13 10:33       ` Junio C Hamano
2008-01-13 17:24         ` Linus Torvalds
2008-01-13 19:39           ` Junio C Hamano
2008-01-13 22:36             ` [PATCH] index: be careful when handling long names Junio C Hamano
2008-01-13 22:53               ` Alex Riesen
2008-01-13 23:08                 ` Junio C Hamano
2008-01-13 23:33                   ` Alex Riesen
2008-01-14 21:03                     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vk5m69s8n.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).