git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/4] a few minor buffer cleanups in fast-import
@ 2017-03-24 17:22 Jeff King
  2017-03-24 17:25 ` [PATCH 1/4] fast-import: use xsnprintf for writing sha1s Jeff King
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Jeff King @ 2017-03-24 17:22 UTC (permalink / raw)
  To: git

I don't think any of these is a triggerable bug. They're just cleanups
to make it more obvious that the code is doing the right thing (and
making it harder to do the wrong thing).

  [1/4]: fast-import: use xsnprintf for writing sha1s
  [2/4]: fast-import: use xsnprintf for formatting headers
  [3/4]: encode_in_pack_object_header: respect output buffer length
  [4/4]: pack.h: define largest possible encoded object size

 builtin/pack-objects.c | 12 ++++++++----
 bulk-checkin.c         |  2 +-
 fast-import.c          | 16 +++++++---------
 pack-write.c           |  5 ++++-
 pack.h                 |  9 ++++++++-
 5 files changed, 28 insertions(+), 16 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/4] fast-import: use xsnprintf for writing sha1s
  2017-03-24 17:22 [PATCH 0/4] a few minor buffer cleanups in fast-import Jeff King
@ 2017-03-24 17:25 ` Jeff King
  2017-03-24 17:26 ` [PATCH 2/4] fast-import: use xsnprintf for formatting headers Jeff King
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2017-03-24 17:25 UTC (permalink / raw)
  To: git

When we have to write a sha1 with a newline, we do so by
copying both into a single buffer, so that we can issue a
single write() call.

We use snprintf but don't bother to check the output, since
we know it will fit. However, we should use xsnprintf() in
such a case so that we're notified if our assumption turns
out to be wrong (and to make it easier to audit for
unchecked snprintf calls).

Signed-off-by: Jeff King <peff@peff.net>
---
This is ready for conversion to oid_to_hex, too, but I avoided that
here. The buffer would need to be allocated with the new GIT_MAX_HEXSZ,
which is not yet available. So I figured it was better to leave it than
half-convert it and leave brian wondering whether it's really supposed
to be GIT_MAX_HEXSZ or GIT_SHA1_HEXSZ.

 fast-import.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fast-import.c b/fast-import.c
index 41a539f97..4e0f3f5dd 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3003,7 +3003,7 @@ static void parse_get_mark(const char *p)
 	if (!oe)
 		die("Unknown mark: %s", command_buf.buf);
 
-	snprintf(output, sizeof(output), "%s\n", sha1_to_hex(oe->idx.sha1));
+	xsnprintf(output, sizeof(output), "%s\n", sha1_to_hex(oe->idx.sha1));
 	cat_blob_write(output, 41);
 }
 
-- 
2.12.1.843.g1937c56c2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/4] fast-import: use xsnprintf for formatting headers
  2017-03-24 17:22 [PATCH 0/4] a few minor buffer cleanups in fast-import Jeff King
  2017-03-24 17:25 ` [PATCH 1/4] fast-import: use xsnprintf for writing sha1s Jeff King
@ 2017-03-24 17:26 ` Jeff King
  2017-03-24 17:26 ` [PATCH 3/4] encode_in_pack_object_header: respect output buffer length Jeff King
  2017-03-24 17:26 ` [PATCH 4/4] pack.h: define largest possible encoded object size Jeff King
  3 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2017-03-24 17:26 UTC (permalink / raw)
  To: git

The stream_blob() function checks the return value of
snprintf and dies. This is more simply done with
xsnprintf (and matches the similar call in store_object).

The message the user would get is less specific, but since
the point is that this _shouldn't_ ever happen, that's OK.

Signed-off-by: Jeff King <peff@peff.net>
---
 fast-import.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 4e0f3f5dd..4a057e81f 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1237,9 +1237,7 @@ static void stream_blob(uintmax_t len, unsigned char *sha1out, uintmax_t mark)
 	sha1file_checkpoint(pack_file, &checkpoint);
 	offset = checkpoint.offset;
 
-	hdrlen = snprintf((char *)out_buf, out_sz, "blob %" PRIuMAX, len) + 1;
-	if (out_sz <= hdrlen)
-		die("impossibly large object header");
+	hdrlen = xsnprintf((char *)out_buf, out_sz, "blob %" PRIuMAX, len) + 1;
 
 	git_SHA1_Init(&c);
 	git_SHA1_Update(&c, out_buf, hdrlen);
-- 
2.12.1.843.g1937c56c2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/4] encode_in_pack_object_header: respect output buffer length
  2017-03-24 17:22 [PATCH 0/4] a few minor buffer cleanups in fast-import Jeff King
  2017-03-24 17:25 ` [PATCH 1/4] fast-import: use xsnprintf for writing sha1s Jeff King
  2017-03-24 17:26 ` [PATCH 2/4] fast-import: use xsnprintf for formatting headers Jeff King
@ 2017-03-24 17:26 ` Jeff King
  2017-03-24 17:26 ` [PATCH 4/4] pack.h: define largest possible encoded object size Jeff King
  3 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2017-03-24 17:26 UTC (permalink / raw)
  To: git

The encode_in_pack_object_header() writes a variable-length
header to an output buffer, but it doesn't actually know
long the buffer is. At first glance, this looks like it
might be possible to overflow.

In practice, this is probably impossible. The smallest
buffer we use is 10 bytes, which would hold the header for
an object up to 2^67 bytes. Obviously we're not likely to
see such an object, but we might worry that an object could
lie about its size (causing us to overflow before we realize
it does not actually have that many bytes). But the argument
is passed as a uintmax_t. Even on systems that have __int128
available, uintmax_t is typically restricted to 64-bit by
the ABI.

So it's unlikely that a system exists where this could be
exploited. Still, it's easy enough to use a normal out/len
pair and make sure we don't write too far. That protects the
hypothetical 128-bit system, makes it harder for callers to
accidentally specify a too-small buffer, and makes the
resulting code easier to audit.

Note that the one caller in fast-import tried to catch such
a case, but did so _after_ the call (at which point we'd
have already overflowed!). This check can now go away.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/pack-objects.c |  6 ++++--
 bulk-checkin.c         |  2 +-
 fast-import.c          | 10 +++++-----
 pack-write.c           |  5 ++++-
 pack.h                 |  3 ++-
 5 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 16517f263..cf3fc50a2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -286,7 +286,8 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
 	 * The object header is a byte of 'type' followed by zero or
 	 * more bytes of length.
 	 */
-	hdrlen = encode_in_pack_object_header(type, size, header);
+	hdrlen = encode_in_pack_object_header(header, sizeof(header),
+					      type, size);
 
 	if (type == OBJ_OFS_DELTA) {
 		/*
@@ -358,7 +359,8 @@ static off_t write_reuse_object(struct sha1file *f, struct object_entry *entry,
 	if (entry->delta)
 		type = (allow_ofs_delta && entry->delta->idx.offset) ?
 			OBJ_OFS_DELTA : OBJ_REF_DELTA;
-	hdrlen = encode_in_pack_object_header(type, entry->size, header);
+	hdrlen = encode_in_pack_object_header(header, sizeof(header),
+					      type, entry->size);
 
 	offset = entry->in_pack_offset;
 	revidx = find_pack_revindex(p, offset);
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 991b4a13e..ddb6070c4 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -105,7 +105,7 @@ static int stream_to_pack(struct bulk_checkin_state *state,
 
 	git_deflate_init(&s, pack_compression_level);
 
-	hdrlen = encode_in_pack_object_header(type, size, obuf);
+	hdrlen = encode_in_pack_object_header(obuf, sizeof(obuf), type, size);
 	s.next_out = obuf + hdrlen;
 	s.avail_out = sizeof(obuf) - hdrlen;
 
diff --git a/fast-import.c b/fast-import.c
index 4a057e81f..f6f416f20 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1173,7 +1173,8 @@ static int store_object(
 		delta_count_by_type[type]++;
 		e->depth = last->depth + 1;
 
-		hdrlen = encode_in_pack_object_header(OBJ_OFS_DELTA, deltalen, hdr);
+		hdrlen = encode_in_pack_object_header(hdr, sizeof(hdr),
+						      OBJ_OFS_DELTA, deltalen);
 		sha1write(pack_file, hdr, hdrlen);
 		pack_size += hdrlen;
 
@@ -1184,7 +1185,8 @@ static int store_object(
 		pack_size += sizeof(hdr) - pos;
 	} else {
 		e->depth = 0;
-		hdrlen = encode_in_pack_object_header(type, dat->len, hdr);
+		hdrlen = encode_in_pack_object_header(hdr, sizeof(hdr),
+						      type, dat->len);
 		sha1write(pack_file, hdr, hdrlen);
 		pack_size += hdrlen;
 	}
@@ -1246,9 +1248,7 @@ static void stream_blob(uintmax_t len, unsigned char *sha1out, uintmax_t mark)
 
 	git_deflate_init(&s, pack_compression_level);
 
-	hdrlen = encode_in_pack_object_header(OBJ_BLOB, len, out_buf);
-	if (out_sz <= hdrlen)
-		die("impossibly large object header");
+	hdrlen = encode_in_pack_object_header(out_buf, out_sz, OBJ_BLOB, len);
 
 	s.next_out = out_buf + hdrlen;
 	s.avail_out = out_sz - hdrlen;
diff --git a/pack-write.c b/pack-write.c
index 88bc7f9f7..c057513f1 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -304,7 +304,8 @@ char *index_pack_lockfile(int ip_out)
  *  - each byte afterwards: low seven bits are size continuation,
  *    with the high bit being "size continues"
  */
-int encode_in_pack_object_header(enum object_type type, uintmax_t size, unsigned char *hdr)
+int encode_in_pack_object_header(unsigned char *hdr, int hdr_len,
+				 enum object_type type, uintmax_t size)
 {
 	int n = 1;
 	unsigned char c;
@@ -315,6 +316,8 @@ int encode_in_pack_object_header(enum object_type type, uintmax_t size, unsigned
 	c = (type << 4) | (size & 15);
 	size >>= 4;
 	while (size) {
+		if (n == hdr_len)
+			die("object size is too enormous to format");
 		*hdr++ = c | 0x80;
 		c = size & 0x7f;
 		size >>= 7;
diff --git a/pack.h b/pack.h
index 0e77429df..87e456d5e 100644
--- a/pack.h
+++ b/pack.h
@@ -84,7 +84,8 @@ extern int verify_pack(struct packed_git *, verify_fn fn, struct progress *, uin
 extern off_t write_pack_header(struct sha1file *f, uint32_t);
 extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 extern char *index_pack_lockfile(int fd);
-extern int encode_in_pack_object_header(enum object_type, uintmax_t, unsigned char *);
+extern int encode_in_pack_object_header(unsigned char *hdr, int hdr_len,
+					enum object_type, uintmax_t);
 
 #define PH_ERROR_EOF		(-1)
 #define PH_ERROR_PACK_SIGNATURE	(-2)
-- 
2.12.1.843.g1937c56c2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 4/4] pack.h: define largest possible encoded object size
  2017-03-24 17:22 [PATCH 0/4] a few minor buffer cleanups in fast-import Jeff King
                   ` (2 preceding siblings ...)
  2017-03-24 17:26 ` [PATCH 3/4] encode_in_pack_object_header: respect output buffer length Jeff King
@ 2017-03-24 17:26 ` Jeff King
  3 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2017-03-24 17:26 UTC (permalink / raw)
  To: git

Several callers use fixed buffers for storing the pack
object header, and they've picked 10 as a magic number. This
is reasonable, since it handles objects up to 2^67. But
let's give them a constant so it's clear that the number
isn't pulled out of thin air.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/pack-objects.c | 6 ++++--
 pack.h                 | 6 ++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index cf3fc50a2..84af7c232 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -239,7 +239,8 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
 					   unsigned long limit, int usable_delta)
 {
 	unsigned long size, datalen;
-	unsigned char header[10], dheader[10];
+	unsigned char header[MAX_PACK_OBJECT_HEADER],
+		      dheader[MAX_PACK_OBJECT_HEADER];
 	unsigned hdrlen;
 	enum object_type type;
 	void *buf;
@@ -353,7 +354,8 @@ static off_t write_reuse_object(struct sha1file *f, struct object_entry *entry,
 	off_t offset;
 	enum object_type type = entry->type;
 	off_t datalen;
-	unsigned char header[10], dheader[10];
+	unsigned char header[MAX_PACK_OBJECT_HEADER],
+		      dheader[MAX_PACK_OBJECT_HEADER];
 	unsigned hdrlen;
 
 	if (entry->delta)
diff --git a/pack.h b/pack.h
index 87e456d5e..5c2158746 100644
--- a/pack.h
+++ b/pack.h
@@ -84,6 +84,12 @@ extern int verify_pack(struct packed_git *, verify_fn fn, struct progress *, uin
 extern off_t write_pack_header(struct sha1file *f, uint32_t);
 extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 extern char *index_pack_lockfile(int fd);
+
+/*
+ * The "hdr" output buffer should be at least this big, which will handle sizes
+ * up to 2^67.
+ */
+#define MAX_PACK_OBJECT_HEADER 10
 extern int encode_in_pack_object_header(unsigned char *hdr, int hdr_len,
 					enum object_type, uintmax_t);
 
-- 
2.12.1.843.g1937c56c2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-03-24 17:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-24 17:22 [PATCH 0/4] a few minor buffer cleanups in fast-import Jeff King
2017-03-24 17:25 ` [PATCH 1/4] fast-import: use xsnprintf for writing sha1s Jeff King
2017-03-24 17:26 ` [PATCH 2/4] fast-import: use xsnprintf for formatting headers Jeff King
2017-03-24 17:26 ` [PATCH 3/4] encode_in_pack_object_header: respect output buffer length Jeff King
2017-03-24 17:26 ` [PATCH 4/4] pack.h: define largest possible encoded object size Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).