git@vger.kernel.org mailing list mirror (one of many)
* Invalid memory access in `git apply`
@ 2017-11-08 16:58 mqudsi
  2017-11-11 14:10 ` René Scharfe
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: mqudsi @ 2017-11-08 16:58 UTC (permalink / raw)
  To: git

**Resending as it seems that the attachments caused the last email to wind up
in a black hole**

There seems to be a bug in `git apply` that leads to out-of-bounds memory
access when --ignore-space-change is combined with --inaccurate-eof while
applying a patch.

On occasion, this can lead to error output like the following:

	 mqudsi@ZBook ~> git apply --ignore-space-change --ignore-whitespace
	 --allow-overlap --inaccurate-eof without_whitespace.diff
	 *** Error in `git': malloc(): memory corruption: 0x0000000002543530 ***
	 ======= Backtrace: =========
	 /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fdda79c77e5]
	 /lib/x86_64-linux-gnu/libc.so.6(+0x8213e)[0x7fdda79d213e]
	 /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7fdda79d4184]
	 /lib/x86_64-linux-gnu/libc.so.6(_IO_file_doallocate+0x55)[0x7fdda79bd1d5]
	 /lib/x86_64-linux-gnu/libc.so.6(_IO_doallocbuf+0x34)[0x7fdda79cb594]
	 /lib/x86_64-linux-gnu/libc.so.6(_IO_file_overflow+0x1c8)[0x7fdda79ca8f8]
	 /lib/x86_64-linux-gnu/libc.so.6(_IO_file_xsputn+0xad)[0x7fdda79c928d]
	 /lib/x86_64-linux-gnu/libc.so.6(fputs+0x98)[0x7fdda79be0c8]
	 git[0x5386cd]
	 git[0x538714]
	 git[0x538940]
	 git[0x40e220]
	 git[0x410a10]
	 git[0x41256e]
	 git[0x412df7]
	 git[0x415935]
	 git[0x406436]
	 git[0x40555c]

The original file being patched (clipboard.vim) and the patch file that I had
attempted to apply (without_whitespace.diff) are attached, along with the
full, unabridged output of the memory map as a result of the out-of-bounds
access (memory_map.txt).

The memory map output was generated under git 2.7.4; repeated attempts to
reproduce the memory map dump with both 2.7.4 and 2.15 produce the following
output:

	 mqudsi@ZBook ~/.c/nvim> git apply --ignore-space-change  --inaccurate-eof
	 --whitespace=fix without_whitespace.diff
	 fatal: BUG: caller miscounted postlen: asked 248, orig = 251, used = 249

Mahmoud Al-Qudsi
NeoSmart Technologies

--Attachments--

* clipboard.vim: http://termbin.com/u25t
* without_whitespace.diff: http://termbin.com/bu9y
* memory_map.txt: http://termbin.com/cboz




* Re: Invalid memory access in `git apply`
  2017-11-08 16:58 Invalid memory access in `git apply` mqudsi
@ 2017-11-11 14:10 ` René Scharfe
  2017-11-11 14:10 ` [PATCH] apply: avoid out-of-bounds access in fuzzy_matchlines() René Scharfe
  2017-11-16 18:50 ` [PATCH] apply: update line lengths for --inaccurate-eof René Scharfe
  2 siblings, 0 replies; 5+ messages in thread
From: René Scharfe @ 2017-11-11 14:10 UTC (permalink / raw)
  To: mqudsi, git; +Cc: Giuseppe Bilotta, Johannes Schindelin

On 08.11.2017 at 17:58, mqudsi@neosmart.net wrote:
> **Resending as it seems that the attachments caused the last email to wind up
> in a black hole**
> 
> There seems to be a bug in `git apply` that leads to out-of-bounds memory
> access when --ignore-space-change is combined with --inaccurate-eof while
> applying a patch.
> 
> On occasion, this can lead to error output like the following:
> 
> 	 mqudsi@ZBook ~> git apply --ignore-space-change --ignore-whitespace
> 	 --allow-overlap --inaccurate-eof without_whitespace.diff
> 	 *** Error in `git': malloc(): memory corruption: 0x0000000002543530 ***
> 	 ======= Backtrace: =========
> 	 /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fdda79c77e5]
> 	 /lib/x86_64-linux-gnu/libc.so.6(+0x8213e)[0x7fdda79d213e]
> 	 /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7fdda79d4184]
> 	 /lib/x86_64-linux-gnu/libc.so.6(_IO_file_doallocate+0x55)[0x7fdda79bd1d5]
> 	 /lib/x86_64-linux-gnu/libc.so.6(_IO_doallocbuf+0x34)[0x7fdda79cb594]
> 	 /lib/x86_64-linux-gnu/libc.so.6(_IO_file_overflow+0x1c8)[0x7fdda79ca8f8]
> 	 /lib/x86_64-linux-gnu/libc.so.6(_IO_file_xsputn+0xad)[0x7fdda79c928d]
> 	 /lib/x86_64-linux-gnu/libc.so.6(fputs+0x98)[0x7fdda79be0c8]
> 	 git[0x5386cd]
> 	 git[0x538714]
> 	 git[0x538940]
> 	 git[0x40e220]
> 	 git[0x410a10]
> 	 git[0x41256e]
> 	 git[0x412df7]
> 	 git[0x415935]
> 	 git[0x406436]
> 	 git[0x40555c]
> 
> The original file being patched (clipboard.vim) and the patch file that I had
> attempted to apply (without_whitespace.diff) are attached, along with the
> full, unabridged output of the memory map as a result of the out-of-bounds
> access (memory_map.txt).
> 
> The memory map output was generated under git 2.7.4; repeated attempts to
> reproduce the memory map dump with both 2.7.4 and 2.15 produce the following
> output:
> 
> 	 mqudsi@ZBook ~/.c/nvim> git apply --ignore-space-change  --inaccurate-eof
> 	 --whitespace=fix without_whitespace.diff
> 	 fatal: BUG: caller miscounted postlen: asked 248, orig = 251, used = 249
> 
> Mahmoud Al-Qudsi
> NeoSmart Technologies
> 
> --Attachments--
> 
> * clipboard.vim: http://termbin.com/u25t
> * without_whitespace.diff: http://termbin.com/bu9y
> * memory_map.txt: http://termbin.com/cboz

Thank you for reporting the issue!

There seem to be at least two bugs in git apply and two problems on your
end.  You don't seem to need the option --inaccurate-eof, and it's causing
trouble for you; I suggest leaving it out.

And the second hunk of your diff doesn't apply because the "<TAB>endif"
context line doesn't match the "endif" line in clipboard.vim which has
no leading whitespace.  --ignore-space-change ignores changes in the
number of whitespace characters, but that number cannot be 0 on only one
side.

If you adjust the diff by removing the tab from that context line, or add
one or more spaces in clipboard.vim before the last "endif", then it will
apply without --inaccurate-eof.
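
To illustrate the rule that a run of whitespace may match a run of a
different length but never an empty one, here is a minimal sketch in C.  It
assumes NUL-terminated lines for brevity, and the function name is made up
for this example; it is modelled on the matching logic discussed in this
thread and is not git's actual API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): returns 1 if the two lines are
 * equal when any non-empty run of whitespace is allowed to match any
 * other non-empty run of whitespace, and 0 otherwise.
 */
static int lines_match_ignoring_space_change(const char *a, const char *b)
{
	const char *end_a, *end_b;

	assert(a != NULL && b != NULL);
	end_a = a + strlen(a);	/* one past the last character */
	end_b = b + strlen(b);

	while (a < end_a && b < end_b) {
		if (isspace((unsigned char)*a)) {
			/* Whitespace on one side needs whitespace on the other. */
			if (!isspace((unsigned char)*b))
				return 0;
			while (a < end_a && isspace((unsigned char)*a))
				a++;
			while (b < end_b && isspace((unsigned char)*b))
				b++;
		} else if (*a++ != *b++) {
			return 0;
		}
	}

	/* Both lines must be used up completely. */
	return a == end_a && b == end_b;
}
```

With this sketch, "<TAB>endif" and "    endif" compare as equal, while
"<TAB>endif" and "endif" do not, which mirrors why the second hunk is
rejected.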


One of the bugs is that fuzzy_matchlines() does out-of-bounds reads in
some cases.  You should only notice it with a tool like Valgrind, ASan
or perhaps a hardened malloc(3).  I'll send a separate patch for that.

The second bug is that --inaccurate-eof triggers a sanity check when
used together with --ignore-space-change.  Here's a simpler reproduction
recipe:

  git init repo
  cd repo

  echo 1 >a
  git add a
  git commit -m initial

  echo 2 >a
  git diff >a.diff
	
  git reset --hard
  git apply --ignore-space-change --inaccurate-eof a.diff

Which yields this error message:

  fatal: BUG: caller miscounted postlen: asked 1, orig = 1, used = 2

Perhaps the first thing we'd need would be a couple of tests showing
the expected behavior of git apply --inaccurate-eof with and without
trailing newlines.

René


* [PATCH] apply: avoid out-of-bounds access in fuzzy_matchlines()
  2017-11-08 16:58 Invalid memory access in `git apply` mqudsi
  2017-11-11 14:10 ` René Scharfe
@ 2017-11-11 14:10 ` René Scharfe
  2017-11-12  4:45   ` Junio C Hamano
  2017-11-16 18:50 ` [PATCH] apply: update line lengths for --inaccurate-eof René Scharfe
  2 siblings, 1 reply; 5+ messages in thread
From: René Scharfe @ 2017-11-11 14:10 UTC (permalink / raw)
  To: mqudsi, git; +Cc: Junio C Hamano, Giuseppe Bilotta

fuzzy_matchlines() uses pointers to the first and last characters of
two lines to keep track while matching them.  This makes it impossible
to deal with empty strings.  It accesses characters before the start of
empty lines.  It can also access characters after the end when checking
for trailing whitespace in the main loop.

Avoid that by using pointers to the first character and the one *after*
the last one.  This is well-defined as long as the latter is not
dereferenced.  Basically rewrite the function based on that premise; it
becomes much simpler as a result.  There is no need to check for
leading whitespace outside of the main loop anymore.
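
As a small illustration of the "first character and one after the last"
idiom, the sketch below trims trailing line endings from a buffer of known
length.  The helper name is hypothetical and exists only for this example.
The end pointer is never dereferenced itself, and end[-1] is read only
while at least one byte remains, so an empty line is handled without
touching memory before the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of git): return the length of the line
 * starting at s, which is n bytes long, with trailing CR and LF
 * characters left out.
 */
static size_t len_without_eol(const char *s, size_t n)
{
	const char *end;

	assert(s != NULL);
	end = s + n;	/* one past the last byte; never dereferenced */

	/* The s < end check guarantees that end[-1] is inside the buffer. */
	while (s < end && (end[-1] == '\r' || end[-1] == '\n'))
		end--;

	return (size_t)(end - s);
}
```

A pointer to the last character, by contrast, would have to be computed as
s + n - 1, which already points before the buffer when n is 0.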

Reported-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 apply.c | 59 ++++++++++++++++++++---------------------------------------
 1 file changed, 20 insertions(+), 39 deletions(-)

diff --git a/apply.c b/apply.c
index d676debd59..b8087bd29c 100644
--- a/apply.c
+++ b/apply.c
@@ -300,52 +300,33 @@ static uint32_t hash_line(const char *cp, size_t len)
 static int fuzzy_matchlines(const char *s1, size_t n1,
 			    const char *s2, size_t n2)
 {
-	const char *last1 = s1 + n1 - 1;
-	const char *last2 = s2 + n2 - 1;
-	int result = 0;
+	const char *end1 = s1 + n1;
+	const char *end2 = s2 + n2;
 
 	/* ignore line endings */
-	while ((*last1 == '\r') || (*last1 == '\n'))
-		last1--;
-	while ((*last2 == '\r') || (*last2 == '\n'))
-		last2--;
-
-	/* skip leading whitespaces, if both begin with whitespace */
-	if (s1 <= last1 && s2 <= last2 && isspace(*s1) && isspace(*s2)) {
-		while (isspace(*s1) && (s1 <= last1))
-			s1++;
-		while (isspace(*s2) && (s2 <= last2))
-			s2++;
-	}
-	/* early return if both lines are empty */
-	if ((s1 > last1) && (s2 > last2))
-		return 1;
-	while (!result) {
-		result = *s1++ - *s2++;
-		/*
-		 * Skip whitespace inside. We check for whitespace on
-		 * both buffers because we don't want "a b" to match
-		 * "ab"
-		 */
-		if (isspace(*s1) && isspace(*s2)) {
-			while (isspace(*s1) && s1 <= last1)
+	while (s1 < end1 && (end1[-1] == '\r' || end1[-1] == '\n'))
+		end1--;
+	while (s2 < end2 && (end2[-1] == '\r' || end2[-1] == '\n'))
+		end2--;
+
+	while (s1 < end1 && s2 < end2) {
+		if (isspace(*s1)) {
+			/*
+			 * Skip whitespace. We check on both buffers
+			 * because we don't want "a b" to match "ab".
+			 */
+			if (!isspace(*s2))
+				return 0;
+			while (s1 < end1 && isspace(*s1))
 				s1++;
-			while (isspace(*s2) && s2 <= last2)
+			while (s2 < end2 && isspace(*s2))
 				s2++;
-		}
-		/*
-		 * If we reached the end on one side only,
-		 * lines don't match
-		 */
-		if (
-		    ((s2 > last2) && (s1 <= last1)) ||
-		    ((s1 > last1) && (s2 <= last2)))
+		} else if (*s1++ != *s2++)
 			return 0;
-		if ((s1 > last1) && (s2 > last2))
-			break;
 	}
 
-	return !result;
+	/* If we reached the end on one side only, lines don't match. */
+	return s1 == end1 && s2 == end2;
 }
 
 static void add_line_info(struct image *img, const char *bol, size_t len, unsigned flag)
-- 
2.15.0


* Re: [PATCH] apply: avoid out-of-bounds access in fuzzy_matchlines()
  2017-11-11 14:10 ` [PATCH] apply: avoid out-of-bounds access in fuzzy_matchlines() René Scharfe
@ 2017-11-12  4:45   ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-11-12  4:45 UTC (permalink / raw)
  To: René Scharfe; +Cc: mqudsi, git, Giuseppe Bilotta

René Scharfe <l.s.r@web.de> writes:

> fuzzy_matchlines() uses pointers to the first and last characters of
> two lines to keep track while matching them.  This makes it impossible
> to deal with empty strings.  It accesses characters before the start of
> empty lines.  It can also access characters after the end when checking
> for trailing whitespace in the main loop.
>
> Avoid that by using pointers to the first character and the one *after*
> the last one.  This is well-defined as long as the latter is not
> dereferenced.  Basically rewrite the function based on that premise; it
> becomes much simpler as a result.  There is no need to check for
> leading whitespace outside of the main loop anymore.

I recall vaguely that we were bitten by a bug or two due to another
instance of <begin,end> that deviates from the usual "closed on the
left end, open on the right end" convention somewhere in the system
recently?

I think the fix of the function is correct, but at the same time, we
would want to clean it up after this fix lands by replacing the
function with the line comparison function we already have in the
xdiff/ layer, so that we can (1) reduce the code duplication and (2)
more importantly, stop being constrained by the (mistakenly
narrow) policy decision we currently seem to have to support only
"ignore-whitespace-change" and nothing else.  Of course, that should
not be done as part of this fix.  It is strictly a #leftoverbits item.

Thanks.

> Reported-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>
> ---
>  apply.c | 59 ++++++++++++++++++++---------------------------------------
>  1 file changed, 20 insertions(+), 39 deletions(-)
>
> diff --git a/apply.c b/apply.c
> index d676debd59..b8087bd29c 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -300,52 +300,33 @@ static uint32_t hash_line(const char *cp, size_t len)
>  static int fuzzy_matchlines(const char *s1, size_t n1,
>  			    const char *s2, size_t n2)
>  {
> -	const char *last1 = s1 + n1 - 1;
> -	const char *last2 = s2 + n2 - 1;
> -	int result = 0;
> +	const char *end1 = s1 + n1;
> +	const char *end2 = s2 + n2;
>  
>  	/* ignore line endings */
> -	while ((*last1 == '\r') || (*last1 == '\n'))
> -		last1--;
> -	while ((*last2 == '\r') || (*last2 == '\n'))
> -		last2--;
> -
> -	/* skip leading whitespaces, if both begin with whitespace */
> -	if (s1 <= last1 && s2 <= last2 && isspace(*s1) && isspace(*s2)) {
> -		while (isspace(*s1) && (s1 <= last1))
> -			s1++;
> -		while (isspace(*s2) && (s2 <= last2))
> -			s2++;
> -	}
> -	/* early return if both lines are empty */
> -	if ((s1 > last1) && (s2 > last2))
> -		return 1;
> -	while (!result) {
> -		result = *s1++ - *s2++;
> -		/*
> -		 * Skip whitespace inside. We check for whitespace on
> -		 * both buffers because we don't want "a b" to match
> -		 * "ab"
> -		 */
> -		if (isspace(*s1) && isspace(*s2)) {
> -			while (isspace(*s1) && s1 <= last1)
> +	while (s1 < end1 && (end1[-1] == '\r' || end1[-1] == '\n'))
> +		end1--;
> +	while (s2 < end2 && (end2[-1] == '\r' || end2[-1] == '\n'))
> +		end2--;
> +
> +	while (s1 < end1 && s2 < end2) {
> +		if (isspace(*s1)) {
> +			/*
> +			 * Skip whitespace. We check on both buffers
> +			 * because we don't want "a b" to match "ab".
> +			 */
> +			if (!isspace(*s2))
> +				return 0;
> +			while (s1 < end1 && isspace(*s1))
>  				s1++;
> -			while (isspace(*s2) && s2 <= last2)
> +			while (s2 < end2 && isspace(*s2))
>  				s2++;
> -		}
> -		/*
> -		 * If we reached the end on one side only,
> -		 * lines don't match
> -		 */
> -		if (
> -		    ((s2 > last2) && (s1 <= last1)) ||
> -		    ((s1 > last1) && (s2 <= last2)))
> +		} else if (*s1++ != *s2++)
>  			return 0;
> -		if ((s1 > last1) && (s2 > last2))
> -			break;
>  	}
>  
> -	return !result;
> +	/* If we reached the end on one side only, lines don't match. */
> +	return s1 == end1 && s2 == end2;
>  }
>  
>  static void add_line_info(struct image *img, const char *bol, size_t len, unsigned flag)


* [PATCH] apply: update line lengths for --inaccurate-eof
  2017-11-08 16:58 Invalid memory access in `git apply` mqudsi
  2017-11-11 14:10 ` René Scharfe
  2017-11-11 14:10 ` [PATCH] apply: avoid out-of-bounds access in fuzzy_matchlines() René Scharfe
@ 2017-11-16 18:50 ` René Scharfe
  2 siblings, 0 replies; 5+ messages in thread
From: René Scharfe @ 2017-11-16 18:50 UTC (permalink / raw)
  To: mqudsi, git; +Cc: Giuseppe Bilotta, Johannes Schindelin, Junio C Hamano

Some diff implementations don't report missing newlines at the end of
files.  Applying such a patch can cause a newline character to be
added inadvertently.  The option --inaccurate-eof of git apply can be
used to remove trailing newlines if needed.

apply_one_fragment() cuts that newline off the buffers for preimage and
postimage.  Before it does, it builds an array with the lengths of each
line for both.  Make sure to update the length of the last line in
these line info structures as well to keep them consistent with their
respective buffer.
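
The invariant at stake can be sketched with a toy model in C.  The struct
and function names below are invented for this illustration and are much
simpler than git's real image bookkeeping; the only point is that the
recorded line lengths have to keep adding up to the buffer length after
the trailing newline is dropped.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LINES 8

/* Simplified stand-in for an image; not git's struct image. */
struct toy_image {
	char buf[64];
	size_t len;				/* bytes used in buf */
	size_t line_len[TOY_MAX_LINES];		/* per-line length, newline included */
	size_t nr;				/* number of lines */
};

/* Sum of the recorded line lengths; consistent images have this == len. */
static size_t toy_total_line_len(const struct toy_image *img)
{
	size_t i, total = 0;

	for (i = 0; i < img->nr; i++)
		total += img->line_len[i];
	return total;
}

/* Drop a trailing newline from the buffer and from the last line's length. */
static void toy_strip_final_newline(struct toy_image *img)
{
	assert(img != NULL);
	if (img->len == 0 || img->nr == 0 || img->buf[img->len - 1] != '\n')
		return;
	img->len--;
	/* Without this step the line lengths would claim one byte too many. */
	img->line_len[img->nr - 1]--;
}
```

In this model, shortening only the buffer would leave the line lengths one
byte too long, which is the kind of mismatch the sanity check reports.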

Without this fix the added test fails; git apply dies and reports:

   fatal: BUG: caller miscounted postlen: asked 1, orig = 1, used = 2

That sanity check is only called if whitespace changes are ignored.

Reported-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 apply.c                            |  2 ++
 t/t4107-apply-ignore-whitespace.sh | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/apply.c b/apply.c
index b8087bd29c..321a9fa68d 100644
--- a/apply.c
+++ b/apply.c
@@ -2953,6 +2953,8 @@ static int apply_one_fragment(struct apply_state *state,
 	    newlines.len > 0 && newlines.buf[newlines.len - 1] == '\n') {
 		old--;
 		strbuf_setlen(&newlines, newlines.len - 1);
+		preimage.line_allocated[preimage.nr - 1].len--;
+		postimage.line_allocated[postimage.nr - 1].len--;
 	}
 
 	leading = frag->leading;
diff --git a/t/t4107-apply-ignore-whitespace.sh b/t/t4107-apply-ignore-whitespace.sh
index 9e29b5262d..ac72eeaf27 100755
--- a/t/t4107-apply-ignore-whitespace.sh
+++ b/t/t4107-apply-ignore-whitespace.sh
@@ -178,4 +178,18 @@ test_expect_success 'patch5 fails (--no-ignore-whitespace)' '
 	test_must_fail git apply --no-ignore-whitespace patch5.patch
 '
 
+test_expect_success 'apply --ignore-space-change --inaccurate-eof' '
+	echo 1 >file &&
+	git apply --ignore-space-change --inaccurate-eof <<-\EOF &&
+	diff --git a/file b/file
+	--- a/file
+	+++ b/file
+	@@ -1 +1 @@
+	-1
+	+2
+	EOF
+	printf 2 >expect &&
+	test_cmp expect file
+'
+
 test_done
-- 
2.15.0

