* [PATCH 1/5] test-delta: read input into a heap buffer
2018-08-30 7:05 ` [PATCH 0/5] handle corruption in patch-delta Jeff King
@ 2018-08-30 7:07 ` Jeff King
2018-08-30 7:09 ` [PATCH 2/5] t5303: test some corrupt deltas Jeff King
` (5 subsequent siblings)
6 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-08-30 7:07 UTC (permalink / raw)
To: Jann Horn
Cc: git, gitster, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
We currently read the input to test-delta by mmap()-ing it.
However, memory-checking tools like valgrind and ASan are
less able to detect reads/writes past the end of an mmap'd
buffer, because the OS is likely to give us extra bytes to
pad out the final page size. So instead, let's read into a
heap buffer.
As a bonus, this also makes it possible to write tests with
empty bases, as mmap() will complain about a zero-length
map.
This is based on a patch by Jann Horn <jannh@google.com>
which actually aligned the data at the end of a page, and
followed it with another page marked with mprotect(). That
would detect problems even without a tool like ASan, but it
was significantly more complex and may have introduced
portability problems. By comparison, this approach pushes
the complexity onto existing memory-checking tools.
Note that this could be done even more simply by using
strbuf_read_file(), but that would defeat the purpose:
strbufs generally overallocate (and at the very least
include a trailing NUL which we do not care about), which
would defeat most memory checkers.
Signed-off-by: Jeff King <peff@peff.net>
---
t/helper/test-delta.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/t/helper/test-delta.c b/t/helper/test-delta.c
index 34c7259248..e749a49c88 100644
--- a/t/helper/test-delta.c
+++ b/t/helper/test-delta.c
@@ -34,8 +34,8 @@ int cmd__delta(int argc, const char **argv)
return 1;
}
from_size = st.st_size;
- from_buf = mmap(NULL, from_size, PROT_READ, MAP_PRIVATE, fd, 0);
- if (from_buf == MAP_FAILED) {
+ from_buf = xmalloc(from_size);
+ if (read_in_full(fd, from_buf, from_size) < 0) {
perror(argv[2]);
close(fd);
return 1;
@@ -48,8 +48,8 @@ int cmd__delta(int argc, const char **argv)
return 1;
}
data_size = st.st_size;
- data_buf = mmap(NULL, data_size, PROT_READ, MAP_PRIVATE, fd, 0);
- if (data_buf == MAP_FAILED) {
+ data_buf = xmalloc(data_size);
+ if (read_in_full(fd, data_buf, data_size) < 0) {
perror(argv[3]);
close(fd);
return 1;
--
2.19.0.rc1.539.g3876d0831e
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 2/5] t5303: test some corrupt deltas
2018-08-30 7:05 ` [PATCH 0/5] handle corruption in patch-delta Jeff King
2018-08-30 7:07 ` [PATCH 1/5] test-delta: read input into a heap buffer Jeff King
@ 2018-08-30 7:09 ` Jeff King
2018-08-30 17:38 ` Junio C Hamano
2018-08-30 7:09 ` [PATCH 3/5] patch-delta: fix oob read Jeff King
` (4 subsequent siblings)
6 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2018-08-30 7:09 UTC (permalink / raw)
To: Jann Horn
Cc: git, gitster, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
We don't have any tests that specifically check boundary
cases in patch_delta(). It obviously gets exercised by tests
which read from packfiles, but it's hard to create packfiles
with bogus deltas.
So let's cover some obvious boundary cases:
1. commands that overflow the result buffer
a. literal content from the delta
b. copies from a base
2. commands where the source isn't large enough
a. literal content from a truncated delta
b. copies that need more bytes than the base has
3. copy commands who parameters are truncated
And indeed, we have problems with both 2a and 3. I've marked
these both as expect_failure, though note that because they
involve reading past the end of a buffer, they will
typically only be caught when run under valgrind or ASan.
There's one more test here, too, which just applies a basic
delta. Since all of the other tests expect failure and we
don't otherwise use "test-tool delta" in the test suite,
this gives a sanity check that the tool works at all.
These are based on an earlier patch by Jann Horn
<jannh@google.com>.
Signed-off-by: Jeff King <peff@peff.net>
---
t/t5303-pack-corruption-resilience.sh | 59 +++++++++++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 3634e258f8..912e659acf 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -311,4 +311,63 @@ test_expect_success \
test_must_fail git cat-file blob $blob_2 > /dev/null &&
test_must_fail git cat-file blob $blob_3 > /dev/null'
+# \0 - empty base
+# \1 - one byte in result
+# \1 - one literal byte (X)
+test_expect_success \
+ 'apply good minimal delta' \
+ 'printf "\0\1\1X" > minimal_delta &&
+ test-tool delta -p /dev/null minimal_delta /dev/null'
+
+# \0 - empty base
+# \1 - 1 byte in result
+# \2 - two literal bytes (one too many)
+test_expect_success \
+ 'apply delta with too many literal bytes' \
+ 'printf "\0\1\2XX" > too_big_literal &&
+ test_must_fail test-tool delta -p /dev/null too_big_literal /dev/null'
+
+# \5 - five bytes in base
+# \1 - one byte in result
+# \221 - copy, one byte offset, one byte size
+# \0 - copy from offset 0
+# \2 - copy two bytes (one too many)
+test_expect_success \
+ 'apply delta with too many copied bytes' \
+ 'printf "\5\1\221\0\2" > too_big_copy &&
+ echo base >base &&
+ test_must_fail test-tool delta -p base too_big_copy /dev/null'
+
+# \0 - empty base
+# \2 - two bytes in result
+# \2 - two literal bytes (we are short one)
+test_expect_failure \
+ 'apply delta with too few literal bytes' \
+ 'printf "\0\2\2X" > truncated_delta &&
+ test_must_fail test-tool delta -p /dev/null truncated_delta /dev/null'
+
+# \0 - empty base
+# \1 - one byte in result
+# \221 - copy, one byte offset, one byte size
+# \0 - copy from offset 0
+# \1 - copy one byte (we are short one)
+test_expect_success \
+ 'apply delta with too few bytes in base' \
+ 'printf "\0\1\221\0\1" > truncated_base &&
+ test_must_fail test-tool delta -p /dev/null truncated_base /dev/null'
+
+# \5 - five bytes in base
+# \5 - five bytes in result
+# \1 - one literal byte (X)
+# \221 - copy, one byte offset, one byte size
+# (offset/size missing)
+#
+# Note that the literal byte is necessary to get past the uninteresting minimum
+# delta size check.
+test_expect_failure \
+ 'apply delta with truncated copy parameters' \
+ 'printf "\5\5\1X\221" > truncated_copy_delta &&
+ echo base >base &&
+ test_must_fail test-tool delta -p base truncated_copy_delta /dev/null'
+
test_done
--
2.19.0.rc1.539.g3876d0831e
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-30 7:09 ` [PATCH 2/5] t5303: test some corrupt deltas Jeff King
@ 2018-08-30 17:38 ` Junio C Hamano
2018-08-30 18:42 ` Jeff King
2018-08-31 9:58 ` Johannes Schindelin
0 siblings, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2018-08-30 17:38 UTC (permalink / raw)
To: Jeff King
Cc: Jann Horn, git, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
Jeff King <peff@peff.net> writes:
> +test_expect_success \
> + 'apply delta with too many copied bytes' \
> + 'printf "\5\1\221\0\2" > too_big_copy &&
> + echo base >base &&
> + test_must_fail test-tool delta -p base too_big_copy /dev/null'
Would "echo base >base" give us 5-byte long base even on Windows?
Or the test does not care if it is either "base\n" or "base\r\n"?
Just double-checking.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-30 17:38 ` Junio C Hamano
@ 2018-08-30 18:42 ` Jeff King
2018-08-30 18:44 ` Jeff King
2018-08-31 9:58 ` Johannes Schindelin
1 sibling, 1 reply; 31+ messages in thread
From: Jeff King @ 2018-08-30 18:42 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jann Horn, git, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
On Thu, Aug 30, 2018 at 10:38:21AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > +test_expect_success \
> > + 'apply delta with too many copied bytes' \
> > + 'printf "\5\1\221\0\2" > too_big_copy &&
> > + echo base >base &&
> > + test_must_fail test-tool delta -p base too_big_copy /dev/null'
>
> Would "echo base >base" give us 5-byte long base even on Windows?
> Or the test does not care if it is either "base\n" or "base\r\n"?
>
> Just double-checking.
Good question. On the first one, I don't know. On the second one, yes,
it does matter. We'd feed "6" to patch_delta(), and it would complain
about the mismatch before actually hitting the code we're trying to
exercise. The test would still pass (the error result is the same either
way), but would quietly not test what we wanted.
Maybe something like this to be on the safe side?
(note that we can leave the \5 in the result size of the "truncated copy
parameters" test; it really just needs to be larger than 1).
---
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 912e659acf..e80934a18e 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -327,15 +327,15 @@ test_expect_success \
'printf "\0\1\2XX" > too_big_literal &&
test_must_fail test-tool delta -p /dev/null too_big_literal /dev/null'
-# \5 - five bytes in base
+# \4 - four bytes in base
# \1 - one byte in result
# \221 - copy, one byte offset, one byte size
# \0 - copy from offset 0
# \2 - copy two bytes (one too many)
test_expect_success \
'apply delta with too many copied bytes' \
- 'printf "\5\1\221\0\2" > too_big_copy &&
- echo base >base &&
+ 'printf "\4\1\221\0\2" > too_big_copy &&
+ printf base >base &&
test_must_fail test-tool delta -p base too_big_copy /dev/null'
# \0 - empty base
@@ -356,7 +356,7 @@ test_expect_success \
'printf "\0\1\221\0\1" > truncated_base &&
test_must_fail test-tool delta -p /dev/null truncated_base /dev/null'
-# \5 - five bytes in base
+# \4 - four bytes in base
# \5 - five bytes in result
# \1 - one literal byte (X)
# \221 - copy, one byte offset, one byte size
@@ -366,8 +366,8 @@ test_expect_success \
# delta size check.
test_expect_failure \
'apply delta with truncated copy parameters' \
- 'printf "\5\5\1X\221" > truncated_copy_delta &&
- echo base >base &&
+ 'printf "\4\5\1X\221" > truncated_copy_delta &&
+ printf base >base &&
test_must_fail test-tool delta -p base truncated_copy_delta /dev/null'
test_done
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-30 18:42 ` Jeff King
@ 2018-08-30 18:44 ` Jeff King
2018-08-30 18:50 ` Junio C Hamano
0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2018-08-30 18:44 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jann Horn, git, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
On Thu, Aug 30, 2018 at 02:42:01PM -0400, Jeff King wrote:
> > Would "echo base >base" give us 5-byte long base even on Windows?
> > Or the test does not care if it is either "base\n" or "base\r\n"?
> >
> > Just double-checking.
>
> Good question. On the first one, I don't know. On the second one, yes,
> it does matter. We'd feed "6" to patch_delta(), and it would complain
> about the mismatch before actually hitting the code we're trying to
> exercise. The test would still pass (the error result is the same either
> way), but would quietly not test what we wanted.
>
> Maybe something like this to be on the safe side?
That could be squashed into patch 2. Patch 4 would need this one
additional case:
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 0c537958e7..e91d6f5770 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -379,7 +379,7 @@ test_expect_success \
'printf "\0\1\1X\1" > tail_garbage_literal &&
test_must_fail test-tool delta -p /dev/null tail_garbage_literal /dev/null'
-# \5 - five bytes in base
+# \4 - four bytes in base
# \1 - one byte in result
# \1 - one literal byte (X)
# \221 - copy, one byte offset, one byte size
@@ -387,8 +387,8 @@ test_expect_success \
# \1 - copy 1 byte
test_expect_success \
'apply delta with trailing garbage copy' \
- 'printf "\5\1\1X\221\0\1" > tail_garbage_copy &&
- echo base >base &&
+ 'printf "\4\1\1X\221\0\1" > tail_garbage_copy &&
+ printf base >base &&
test_must_fail test-tool delta -p /dev/null tail_garbage_copy /dev/null'
# \0 - empty base
I can re-roll, or even prepare a patch on top (it's sufficiently subtle
that it may merit calling out explicitly in a commit).
-Peff
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-30 18:44 ` Jeff King
@ 2018-08-30 18:50 ` Junio C Hamano
2018-08-30 19:13 ` Jeff King
0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2018-08-30 18:50 UTC (permalink / raw)
To: Jeff King
Cc: Jann Horn, git, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
Jeff King <peff@peff.net> writes:
> I can re-roll, or even prepare a patch on top (it's sufficiently subtle
> that it may merit calling out explicitly in a commit).
Yeah, I tend to agree with your reasoning to do it on top as a
separate patch.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-30 18:50 ` Junio C Hamano
@ 2018-08-30 19:13 ` Jeff King
0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-08-30 19:13 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jann Horn, git, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
On Thu, Aug 30, 2018 at 11:50:56AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > I can re-roll, or even prepare a patch on top (it's sufficiently subtle
> > that it may merit calling out explicitly in a commit).
>
> Yeah, I tend to agree with your reasoning to do it on top as a
> separate patch.
Here it is, then.
-- >8 --
Subject: [PATCH 6/5] t5303: use printf to generate delta bases
The exact byte count of the delta base file is important.
The test-delta helper will feed it to patch_delta(), which
will barf if it doesn't match the size byte given in the
delta. Using "echo" may end up with unexpected line endings
on some platforms (e.g,. "\r\n" instead of just "\n").
This actually wouldn't cause the test to fail (since we
already expect test-delta to complain about these bogus
deltas), but would mean that we're not exercising the code
we think we are.
Let's use printf instead (which we already trust to give us
byte-perfect output when we generate the deltas).
While we're here, let's tighten the 5-byte result size used
in the "truncated copy parameters" test. This just needs to
have enough room to attempt to parse the bogus copy command,
meaning 2 is sufficient. Using 5 was arbitrary and just
copied from the base size; since those no longer match, it's
simply confusing. Let's use a more meaningful number.
Signed-off-by: Jeff King <peff@peff.net>
---
t/t5303-pack-corruption-resilience.sh | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index b68bbeedcc..41e6dc4dcf 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -327,15 +327,15 @@ test_expect_success \
'printf "\0\1\2XX" > too_big_literal &&
test_must_fail test-tool delta -p /dev/null too_big_literal /dev/null'
-# \5 - five bytes in base
+# \4 - four bytes in base
# \1 - one byte in result
# \221 - copy, one byte offset, one byte size
# \0 - copy from offset 0
# \2 - copy two bytes (one too many)
test_expect_success \
'apply delta with too many copied bytes' \
- 'printf "\5\1\221\0\2" > too_big_copy &&
- echo base >base &&
+ 'printf "\4\1\221\0\2" > too_big_copy &&
+ printf base >base &&
test_must_fail test-tool delta -p base too_big_copy /dev/null'
# \0 - empty base
@@ -356,8 +356,8 @@ test_expect_success \
'printf "\0\1\221\0\1" > truncated_base &&
test_must_fail test-tool delta -p /dev/null truncated_base /dev/null'
-# \5 - five bytes in base
-# \5 - five bytes in result
+# \4 - four bytes in base
+# \2 - two bytes in result
# \1 - one literal byte (X)
# \221 - copy, one byte offset, one byte size
# (offset/size missing)
@@ -366,8 +366,8 @@ test_expect_success \
# delta size check.
test_expect_success \
'apply delta with truncated copy parameters' \
- 'printf "\5\5\1X\221" > truncated_copy_delta &&
- echo base >base &&
+ 'printf "\4\2\1X\221" > truncated_copy_delta &&
+ printf base >base &&
test_must_fail test-tool delta -p base truncated_copy_delta /dev/null'
# \0 - empty base
@@ -379,7 +379,7 @@ test_expect_success \
'printf "\0\1\1X\1" > tail_garbage_literal &&
test_must_fail test-tool delta -p /dev/null tail_garbage_literal /dev/null'
-# \5 - five bytes in base
+# \4 - four bytes in base
# \1 - one byte in result
# \1 - one literal byte (X)
# \221 - copy, one byte offset, one byte size
@@ -387,8 +387,8 @@ test_expect_success \
# \1 - copy 1 byte
test_expect_success \
'apply delta with trailing garbage copy' \
- 'printf "\5\1\1X\221\0\1" > tail_garbage_copy &&
- echo base >base &&
+ 'printf "\4\1\1X\221\0\1" > tail_garbage_copy &&
+ printf base >base &&
test_must_fail test-tool delta -p /dev/null tail_garbage_copy /dev/null'
# \0 - empty base
--
2.19.0.rc1.546.g3fcb3c0d7c
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-30 17:38 ` Junio C Hamano
2018-08-30 18:42 ` Jeff King
@ 2018-08-31 9:58 ` Johannes Schindelin
2018-08-31 15:33 ` Junio C Hamano
1 sibling, 1 reply; 31+ messages in thread
From: Johannes Schindelin @ 2018-08-31 9:58 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jeff King, Jann Horn, git, Nguyễn Thái Ngọc Duy,
Nicolas Pitre
Hi Junio,
On Thu, 30 Aug 2018, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > +test_expect_success \
> > + 'apply delta with too many copied bytes' \
> > + 'printf "\5\1\221\0\2" > too_big_copy &&
> > + echo base >base &&
> > + test_must_fail test-tool delta -p base too_big_copy /dev/null'
>
> Would "echo base >base" give us 5-byte long base even on Windows?
Please note that Unix shell scripting is a foreign thing on Windows. As
such, there is not really any "native" shell we can use [*1*], and
therefore we use MSYS2's Bash which outputs Unix line endings.
Ciao,
Dscho
Footnote *1*: I keep trying (and failing) to find the time to work on the
pure-Win32 port of BusyBox, which would give us "sort of a native Unix
shell". That probably *would* output CR/LF in this case. But as there are
still parts of the test suite that require Perl (which is not included in
BusyBox), I think we are still a loooong way from running the test suite
in a pure Win32 fashion. With all the time tax that incurs for us Windows
users.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-31 9:58 ` Johannes Schindelin
@ 2018-08-31 15:33 ` Junio C Hamano
2018-08-31 19:47 ` Jeff King
2018-08-31 21:14 ` Johannes Schindelin
0 siblings, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2018-08-31 15:33 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Jeff King, Jann Horn, git, Nguyễn Thái Ngọc Duy,
Nicolas Pitre
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> Would "echo base >base" give us 5-byte long base even on Windows?
>
> Please note that Unix shell scripting is a foreign thing on Windows. As
> such, there is not really any "native" shell we can use [*1*], and
Yeah, I know that; otherwise I wouldn't have asked. Because ...
> therefore we use MSYS2's Bash which outputs Unix line endings.
... I didn't know what MSYS folks chose, and/or if you have chosen
to tweak their choice, and/or if you switched to somebody else's shell
(e.g. busybox) and/or you chose to tweak what they do out of the box,
it was worth asking and getting yes/no question. You do not have to
tell me why I should be asking.
So instead of typing 3 lines, you can just say "yes we use echo that
emulates Unix".
Thanks.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-31 15:33 ` Junio C Hamano
@ 2018-08-31 19:47 ` Jeff King
2018-08-31 21:39 ` Junio C Hamano
2018-08-31 21:14 ` Johannes Schindelin
1 sibling, 1 reply; 31+ messages in thread
From: Jeff King @ 2018-08-31 19:47 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin, Jann Horn, git,
Nguyễn Thái Ngọc Duy, Nicolas Pitre
On Fri, Aug 31, 2018 at 08:33:26AM -0700, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> Would "echo base >base" give us 5-byte long base even on Windows?
> >
> > Please note that Unix shell scripting is a foreign thing on Windows. As
> > such, there is not really any "native" shell we can use [*1*], and
>
> Yeah, I know that; otherwise I wouldn't have asked. Because ...
>
> > therefore we use MSYS2's Bash which outputs Unix line endings.
>
> ... I didn't know what MSYS folks chose, and/or if you have chosen
> to tweak their choice, and/or if you switched to somebody else's shell
> (e.g. busybox) and/or you chose to tweak what they do out of the box,
> it was worth asking and getting yes/no question. You do not have to
> tell me why I should be asking.
>
> So instead of typing 3 lines, you can just say "yes we use echo that
> emulates Unix".
I actually found Dscho's response much more informative than a simple
yes/no.
At any rate, it sounds like we are probably OK with echo, but I think it
is still worth doing the defensive patch-on-top that I posted.
-Peff
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-31 19:47 ` Jeff King
@ 2018-08-31 21:39 ` Junio C Hamano
0 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2018-08-31 21:39 UTC (permalink / raw)
To: Jeff King
Cc: Johannes Schindelin, Jann Horn, git,
Nguyễn Thái Ngọc Duy, Nicolas Pitre
Jeff King <peff@peff.net> writes:
>> So instead of typing 3 lines, you can just say "yes we use echo that
>> emulates Unix".
>
> I actually found Dscho's response much more informative than a simple
> yes/no.
>
> At any rate, it sounds like we are probably OK with echo, but I think it
> is still worth doing the defensive patch-on-top that I posted.
Actually, Dscho's footnote was much more informative than the "echo
is from UNIX, dammit" that didn't convey anything useful---yes, we
all know that echo is from UNIX, but what does he do to it? Does he
want it to be like UNIX, or want to tweak it to suit Windows taste?
Reading the footnote again, from there, you can read that 'echo' may
currently behave like UNIX, but he guarantees that it can change any
time without notice; in his mind, 'echo' in the ideal world in the
fiture on Windows is to use CRLF.
So we must protect the test from his whim by using printf.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-31 15:33 ` Junio C Hamano
2018-08-31 19:47 ` Jeff King
@ 2018-08-31 21:14 ` Johannes Schindelin
2018-08-31 21:41 ` Jeff King
1 sibling, 1 reply; 31+ messages in thread
From: Johannes Schindelin @ 2018-08-31 21:14 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jeff King, Jann Horn, git, Nguyễn Thái Ngọc Duy,
Nicolas Pitre
Hi Junio,
On Fri, 31 Aug 2018, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> Would "echo base >base" give us 5-byte long base even on Windows?
> >
> > Please note that Unix shell scripting is a foreign thing on Windows. As
> > such, there is not really any "native" shell we can use [*1*], and
>
> Yeah, I know that; otherwise I wouldn't have asked. Because ...
>
> > therefore we use MSYS2's Bash which outputs Unix line endings.
>
> ... I didn't know what MSYS folks chose, and/or if you have chosen
> to tweak their choice, and/or if you switched to somebody else's shell
> (e.g. busybox) and/or you chose to tweak what they do out of the box,
> it was worth asking and getting yes/no question. You do not have to
> tell me why I should be asking.
>
> So instead of typing 3 lines, you can just say "yes we use echo that
> emulates Unix".
>
> Thanks.
You could just express your gratitude that I do more than just answer a
question, and impart more knowledge that will help you be a better
maintainer.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-31 21:14 ` Johannes Schindelin
@ 2018-08-31 21:41 ` Jeff King
2018-08-31 21:55 ` Junio C Hamano
0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2018-08-31 21:41 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Junio C Hamano, Jann Horn, git,
Nguyễn Thái Ngọc Duy, Nicolas Pitre
On Fri, Aug 31, 2018 at 11:14:20PM +0200, Johannes Schindelin wrote:
> On Fri, 31 Aug 2018, Junio C Hamano wrote:
> [...]
> > So instead of typing 3 lines, you can just say "yes we use echo that
> > emulates Unix".
>
> You could just express your gratitude that I do more than just answer a
> question, and impart more knowledge that will help you be a better
> maintainer.
It is my observation as a third party that there has been a lot of
bickering between you two lately (here and in other threads). Can you
both please try to make sure your comments are truly constructive?
I don't want to get into details of who I think has been unreasonable
when, or lay any kind of blame. I'm just concerned that exchanges like
this one worsen the general tone of the mailing list, and I'd like to:
- make it clear to other observers that there's at least one person
who hopes this is not the norm for this list
- make a general reminder that collaborating on the Internet is hard,
but it's worth spending the extra effort to consider how comments
will be taken by the intended receiver, as well as others on the
list
You're free to disagree, of course. And I'm sorry for being somewhat
vague. I really want to avoid opening up a specific flamewar, and just
make a general call for people to remember to be friendly.
If vger allowed HTML mail, I would link to a picture of a unicorn
running across a rainbow.
-Peff
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/5] t5303: test some corrupt deltas
2018-08-31 21:41 ` Jeff King
@ 2018-08-31 21:55 ` Junio C Hamano
0 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2018-08-31 21:55 UTC (permalink / raw)
To: Jeff King
Cc: Johannes Schindelin, Jann Horn, git,
Nguyễn Thái Ngọc Duy, Nicolas Pitre
Jeff King <peff@peff.net> writes:
> - make it clear to other observers that there's at least one person
> who hopes this is not the norm for this list
Make it at least two by counting me ;-)
> - make a general reminder that collaborating on the Internet is hard,
> but it's worth spending the extra effort to consider how comments
> will be taken by the intended receiver, as well as others on the
> list
Well, in this particular case both of us know exactly how the sender
wants the message to be taken by the other side. At least I do.
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 3/5] patch-delta: fix oob read
2018-08-30 7:05 ` [PATCH 0/5] handle corruption in patch-delta Jeff King
2018-08-30 7:07 ` [PATCH 1/5] test-delta: read input into a heap buffer Jeff King
2018-08-30 7:09 ` [PATCH 2/5] t5303: test some corrupt deltas Jeff King
@ 2018-08-30 7:09 ` Jeff King
2018-08-30 7:10 ` [PATCH 4/5] patch-delta: consistently report corruption Jeff King
` (3 subsequent siblings)
6 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-08-30 7:09 UTC (permalink / raw)
To: Jann Horn
Cc: git, gitster, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
From: Jann Horn <jannh@google.com>
If `cmd` is in the range [0x01,0x7f] and `cmd > top-data`, the
`memcpy(out, data, cmd)` can copy out-of-bounds data from after `delta_buf`
into `dst_buf`.
This is not an exploitable bug because triggering the bug increments the
`data` pointer beyond `top`, causing the `data != top` sanity check after
the loop to trigger and discard the destination buffer - which means that
the result of the out-of-bounds read is never used for anything.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
---
patch-delta.c | 2 +-
t/t5303-pack-corruption-resilience.sh | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/patch-delta.c b/patch-delta.c
index 56e0a5ede2..b937afd2c9 100644
--- a/patch-delta.c
+++ b/patch-delta.c
@@ -56,7 +56,7 @@ void *patch_delta(const void *src_buf, unsigned long src_size,
out += cp_size;
size -= cp_size;
} else if (cmd) {
- if (cmd > size)
+ if (cmd > size || cmd > top - data)
break;
memcpy(out, data, cmd);
out += cmd;
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 912e659acf..7114c31ade 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -341,7 +341,7 @@ test_expect_success \
# \0 - empty base
# \2 - two bytes in result
# \2 - two literal bytes (we are short one)
-test_expect_failure \
+test_expect_success \
'apply delta with too few literal bytes' \
'printf "\0\2\2X" > truncated_delta &&
test_must_fail test-tool delta -p /dev/null truncated_delta /dev/null'
--
2.19.0.rc1.539.g3876d0831e
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 4/5] patch-delta: consistently report corruption
2018-08-30 7:05 ` [PATCH 0/5] handle corruption in patch-delta Jeff King
` (2 preceding siblings ...)
2018-08-30 7:09 ` [PATCH 3/5] patch-delta: fix oob read Jeff King
@ 2018-08-30 7:10 ` Jeff King
2018-08-30 7:12 ` [PATCH 5/5] patch-delta: handle truncated copy parameters Jeff King
` (2 subsequent siblings)
6 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-08-30 7:10 UTC (permalink / raw)
To: Jann Horn
Cc: git, gitster, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
From: Jann Horn <jannh@google.com>
When applying a delta, if we see an opcode that cannot be
fulfilled (e.g., asking to write more bytes than the
destination has left), we break out of our parsing loop but
don't signal an explicit error. We rely on the sanity check
after the loop to see if we have leftover delta bytes or
didn't fill our result buffer.
This can silently ignore corruption when the delta buffer
ends with a bogus command and the destination buffer is
already full. Instead, let's jump into the error handler
directly when we see this case.
Note that the tests also cover the "bad opcode" case, which
already handles this correctly.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
---
patch-delta.c | 5 +++--
t/t5303-pack-corruption-resilience.sh | 30 +++++++++++++++++++++++++++
2 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/patch-delta.c b/patch-delta.c
index b937afd2c9..283fb4b759 100644
--- a/patch-delta.c
+++ b/patch-delta.c
@@ -51,13 +51,13 @@ void *patch_delta(const void *src_buf, unsigned long src_size,
if (unsigned_add_overflows(cp_off, cp_size) ||
cp_off + cp_size > src_size ||
cp_size > size)
- break;
+ goto bad_length;
memcpy(out, (char *) src_buf + cp_off, cp_size);
out += cp_size;
size -= cp_size;
} else if (cmd) {
if (cmd > size || cmd > top - data)
- break;
+ goto bad_length;
memcpy(out, data, cmd);
out += cmd;
data += cmd;
@@ -75,6 +75,7 @@ void *patch_delta(const void *src_buf, unsigned long src_size,
/* sanity check */
if (data != top || size != 0) {
+ bad_length:
error("delta replay has gone wild");
bad:
free(dst_buf);
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 7114c31ade..41dc947d3f 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -370,4 +370,34 @@ test_expect_failure \
echo base >base &&
test_must_fail test-tool delta -p base truncated_copy_delta /dev/null'
+# \0 - empty base
+# \1 - one byte in result
+# \1 - one literal byte (X)
+# \1 - trailing garbage command
+test_expect_success \
+ 'apply delta with trailing garbage literal' \
+ 'printf "\0\1\1X\1" > tail_garbage_literal &&
+ test_must_fail test-tool delta -p /dev/null tail_garbage_literal /dev/null'
+
+# \5 - five bytes in base
+# \1 - one byte in result
+# \1 - one literal byte (X)
+# \221 - copy, one byte offset, one byte size
+# \0 - copy from offset 0
+# \1 - copy 1 byte
+test_expect_success \
+ 'apply delta with trailing garbage copy' \
+ 'printf "\5\1\1X\221\0\1" > tail_garbage_copy &&
+ echo base >base &&
+ test_must_fail test-tool delta -p /dev/null tail_garbage_copy /dev/null'
+
+# \0 - empty base
+# \1 - one byte in result
+# \1 - one literal byte (X)
+# \0 - bogus opcode
+test_expect_success \
+ 'apply delta with trailing garbage opcode' \
+ 'printf "\0\1\1X\0" > tail_garbage_opcode &&
+ test_must_fail test-tool delta -p /dev/null tail_garbage_opcode /dev/null'
+
test_done
--
2.19.0.rc1.539.g3876d0831e
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 5/5] patch-delta: handle truncated copy parameters
2018-08-30 7:05 ` [PATCH 0/5] handle corruption in patch-delta Jeff King
` (3 preceding siblings ...)
2018-08-30 7:10 ` [PATCH 4/5] patch-delta: consistently report corruption Jeff King
@ 2018-08-30 7:12 ` Jeff King
2018-08-30 13:25 ` [PATCH 0/5] handle corruption in patch-delta Jann Horn
2018-08-30 15:23 ` Nicolas Pitre
6 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-08-30 7:12 UTC (permalink / raw)
To: Jann Horn
Cc: git, gitster, Nguyễn Thái Ngọc Duy,
Johannes Schindelin, Nicolas Pitre
When we see a delta command instructing us to copy bytes
from the base, we have to read the offset and size from the
delta stream. We do this without checking whether we're at
the end of the stream, meaning we may read past the end of
the buffer.
In practice this isn't exploitable in any interesting way
because:
1. Deltas are always in packfiles, so we have at least a
20-byte trailer that we'll end up reading.
2. The worst case is that we try to perform a nonsense
copy from the base object into the result, based on
whatever was in the pack stream next. In most cases
this will simply fail due to our bounds-checks against
the base or the result.
But even if you carefully constructed a pack stream for
which it succeeds, it wouldn't perform any delta
operation that you couldn't have simply included in a
non-broken form.
But obviously it's poor form to read past the end of the
buffer we've been given. Unfortunately there's no easy way
to do a single length check, since the number of bytes we
need depends on the number of bits set in the initial
command byte. So we'll just check each byte as we parse. We
can hide the complexity in a macro; it's ugly, but not as
ugly as writing out each individual conditional.
Signed-off-by: Jeff King <peff@peff.net>
---
patch-delta.c | 21 ++++++++++++++-------
t/t5303-pack-corruption-resilience.sh | 2 +-
2 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/patch-delta.c b/patch-delta.c
index 283fb4b759..b5c8594db6 100644
--- a/patch-delta.c
+++ b/patch-delta.c
@@ -40,13 +40,20 @@ void *patch_delta(const void *src_buf, unsigned long src_size,
cmd = *data++;
if (cmd & 0x80) {
unsigned long cp_off = 0, cp_size = 0;
- if (cmd & 0x01) cp_off = *data++;
- if (cmd & 0x02) cp_off |= (*data++ << 8);
- if (cmd & 0x04) cp_off |= (*data++ << 16);
- if (cmd & 0x08) cp_off |= ((unsigned) *data++ << 24);
- if (cmd & 0x10) cp_size = *data++;
- if (cmd & 0x20) cp_size |= (*data++ << 8);
- if (cmd & 0x40) cp_size |= (*data++ << 16);
+#define PARSE_CP_PARAM(bit, var, shift) do { \
+ if (cmd & (bit)) { \
+ if (data >= top) \
+ goto bad_length; \
+ var |= ((unsigned) *data++ << (shift)); \
+ } } while (0)
+ PARSE_CP_PARAM(0x01, cp_off, 0);
+ PARSE_CP_PARAM(0x02, cp_off, 8);
+ PARSE_CP_PARAM(0x04, cp_off, 16);
+ PARSE_CP_PARAM(0x08, cp_off, 24);
+ PARSE_CP_PARAM(0x10, cp_size, 0);
+ PARSE_CP_PARAM(0x20, cp_size, 8);
+ PARSE_CP_PARAM(0x40, cp_size, 16);
+#undef PARSE_CP_PARAM
if (cp_size == 0) cp_size = 0x10000;
if (unsigned_add_overflows(cp_off, cp_size) ||
cp_off + cp_size > src_size ||
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 41dc947d3f..b68bbeedcc 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -364,7 +364,7 @@ test_expect_success \
#
# Note that the literal byte is necessary to get past the uninteresting minimum
# delta size check.
-test_expect_failure \
+test_expect_success \
'apply delta with truncated copy parameters' \
'printf "\5\5\1X\221" > truncated_copy_delta &&
echo base >base &&
--
2.19.0.rc1.539.g3876d0831e
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 0/5] handle corruption in patch-delta
2018-08-30 7:05 ` [PATCH 0/5] handle corruption in patch-delta Jeff King
` (4 preceding siblings ...)
2018-08-30 7:12 ` [PATCH 5/5] patch-delta: handle truncated copy parameters Jeff King
@ 2018-08-30 13:25 ` Jann Horn
2018-08-30 15:23 ` Nicolas Pitre
6 siblings, 0 replies; 31+ messages in thread
From: Jann Horn @ 2018-08-30 13:25 UTC (permalink / raw)
To: Jeff King; +Cc: git, Junio C Hamano, pclouds, johannes.schindelin, nico
On Thu, Aug 30, 2018 at 9:05 AM Jeff King <peff@peff.net> wrote:
>
> On Wed, Aug 29, 2018 at 10:58:55PM +0200, Jann Horn wrote:
>
> > If `cmd` is in the range [0x01,0x7f] and `cmd > top-data`, the
> > `memcpy(out, data, cmd)` can copy out-of-bounds data from after `delta_buf`
> > into `dst_buf`.
> >
> > This is not an exploitable bug because triggering the bug increments the
> > `data` pointer beyond `top`, causing the `data != top` sanity check after
> > the loop to trigger and discard the destination buffer - which means that
> > the result of the out-of-bounds read is never used for anything.
> >
> > Also, directly jump into the error handler instead of just breaking out of
> > the loop - otherwise, data corruption would be silently ignored if the
> > delta buffer ends with a command and the destination buffer is already
> > full.
>
> Based on my earlier observations, here's a replacement patch series I
> came up with. It has:
[...]
> I hope you don't mind me hacking up your patches a bit.
Not at all. I'm happy that I don't have to write a v2 series.
> Thanks again for your original report and patch.
Thanks for turning my patch into something decent so quickly!
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 0/5] handle corruption in patch-delta
2018-08-30 7:05 ` [PATCH 0/5] handle corruption in patch-delta Jeff King
` (5 preceding siblings ...)
2018-08-30 13:25 ` [PATCH 0/5] handle corruption in patch-delta Jann Horn
@ 2018-08-30 15:23 ` Nicolas Pitre
6 siblings, 0 replies; 31+ messages in thread
From: Nicolas Pitre @ 2018-08-30 15:23 UTC (permalink / raw)
To: Jeff King
Cc: Jann Horn, git, gitster, Nguyễn Thái Ngọc Duy,
Johannes Schindelin
On Thu, 30 Aug 2018, Jeff King wrote:
> On Wed, Aug 29, 2018 at 10:58:55PM +0200, Jann Horn wrote:
>
> > If `cmd` is in the range [0x01,0x7f] and `cmd > top-data`, the
> > `memcpy(out, data, cmd)` can copy out-of-bounds data from after `delta_buf`
> > into `dst_buf`.
> >
> > This is not an exploitable bug because triggering the bug increments the
> > `data` pointer beyond `top`, causing the `data != top` sanity check after
> > the loop to trigger and discard the destination buffer - which means that
> > the result of the out-of-bounds read is never used for anything.
> >
> > Also, directly jump into the error handler instead of just breaking out of
> > the loop - otherwise, data corruption would be silently ignored if the
> > delta buffer ends with a command and the destination buffer is already
> > full.
>
> Based on my earlier observations, here's a replacement patch series I
> came up with. It has:
>
> [1/5]: test-delta: read input into a heap buffer
>
> A simpler replacement for your patch 2 which avoids portability
> issues.
>
> [2/5]: t5303: test some corrupt deltas
>
> A more complete set of boundary tests based on the 4 cases I laid
> out, plus the cp_size problem I found.
>
> [3/5]: patch-delta: fix oob read
>
> Your actual fix.
>
> [4/5]: patch-delta: consistently report corruption
>
> Your related trailing-garbage fix. I split this into two in order to
> better demonstrate the cases this part covers.
>
> [5/5]: patch-delta: handle truncated copy parameters
>
> My fix for the cp_size read.
>
> I hope you don't mind me hacking up your patches a bit. Thanks again for
> your original report and patch.
Looks good to me (feels like traveling back in time).
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
>
> -Peff
>
^ permalink raw reply [flat|nested] 31+ messages in thread