unofficial mirror of libc-alpha@sourceware.org
* [PATCH] regex: fix buffer read overrun in search [BZ#28470]
@ 2021-10-18 22:15 Paul Eggert
  2021-10-19  7:17 ` Andreas Schwab
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2021-10-18 22:15 UTC (permalink / raw)
  To: libc-alpha

Problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* posix/regexec.c (re_search_internal): Use better bounds check.
---
 posix/regexec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/posix/regexec.c b/posix/regexec.c
index 83e9aaf8ca..a955aa2182 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -760,7 +760,7 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
 		}
 	      /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
 		 Note that MATCH_FIRST must not be smaller than 0.  */
-	      ch = (match_first >= length
+	      ch = (mctx.input.valid_len <= offset
 		    ? 0 : re_string_byte_at (&mctx.input, offset));
 	      if (fastmap[ch])
 		break;
-- 
2.31.1



* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-10-18 22:15 [PATCH] regex: fix buffer read overrun in search [BZ#28470] Paul Eggert
@ 2021-10-19  7:17 ` Andreas Schwab
  2021-10-19  8:13   ` Paul Eggert
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2021-10-19  7:17 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha

On Oct 18 2021, Paul Eggert wrote:

>  	      /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
>  		 Note that MATCH_FIRST must not be smaller than 0.  */
> -	      ch = (match_first >= length
> +	      ch = (mctx.input.valid_len <= offset

The comment needs to be updated too.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-10-19  7:17 ` Andreas Schwab
@ 2021-10-19  8:13   ` Paul Eggert
  2021-10-19  8:25     ` Andreas Schwab
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2021-10-19  8:13 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

On 10/19/21 00:17, Andreas Schwab wrote:
> The comment needs to be updated too.

Thanks, revised patch attached.

[-- Attachment #2: 0001-regex-fix-buffer-read-overrun-in-search-BZ-28470.patch --]
[-- Type: text/x-patch, Size: 1114 bytes --]

From be84b14058bad546eba87742064e347f651e5852 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 18 Oct 2021 15:00:21 -0700
Subject: [PATCH] regex: fix buffer read overrun in search [BZ#28470]

Problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* posix/regexec.c (re_search_internal): Use better bounds check.
---
 posix/regexec.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/posix/regexec.c b/posix/regexec.c
index 83e9aaf8ca..106f9d7ff1 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -758,9 +758,8 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
 
 		  offset = match_first - mctx.input.raw_mbs_idx;
 		}
-	      /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
-		 Note that MATCH_FIRST must not be smaller than 0.  */
-	      ch = (match_first >= length
+	      /* If OFFSET is out of the buffer, leave CH as '\0'.  */
+	      ch = (mctx.input.valid_len <= offset
 		    ? 0 : re_string_byte_at (&mctx.input, offset));
 	      if (fastmap[ch])
 		break;
-- 
2.31.1



* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-10-19  8:13   ` Paul Eggert
@ 2021-10-19  8:25     ` Andreas Schwab
  2021-10-19  8:57       ` Paul Eggert
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2021-10-19  8:25 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha

On Oct 19 2021, Paul Eggert wrote:

> +	      ch = (mctx.input.valid_len <= offset

This is backwards.

Andreas.



* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-10-19  8:25     ` Andreas Schwab
@ 2021-10-19  8:57       ` Paul Eggert
  2021-10-19 15:09         ` Andreas Schwab
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2021-10-19  8:57 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

On 10/19/21 01:25, Andreas Schwab wrote:
> On Oct 19 2021, Paul Eggert wrote:
> 
>> +	      ch = (mctx.input.valid_len <= offset
> 
> This is backwards.

It's correct as-is, so that comment is merely about style. I revamped 
the patch to turn the comparison around; see attached. Let's not have 
our longstanding style disagreement distract us from the fix.

[-- Attachment #2: 0001-regex-fix-buffer-read-overrun-in-search-BZ-28470.patch --]
[-- Type: text/x-patch, Size: 1206 bytes --]

From 7be5e6881cfd18006cac116d27e398ae342ba536 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 18 Oct 2021 15:00:21 -0700
Subject: [PATCH] regex: fix buffer read overrun in search [BZ#28470]

Problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* posix/regexec.c (re_search_internal): Use better bounds check.
---
 posix/regexec.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/posix/regexec.c b/posix/regexec.c
index 83e9aaf8ca..6aeba3c0b4 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -758,10 +758,9 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
 
 		  offset = match_first - mctx.input.raw_mbs_idx;
 		}
-	      /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
-		 Note that MATCH_FIRST must not be smaller than 0.  */
-	      ch = (match_first >= length
-		    ? 0 : re_string_byte_at (&mctx.input, offset));
+	      /* Use buffer byte if OFFSET is in buffer, otherwise '\0'.  */
+	      ch = (offset < mctx.input.valid_len
+		    ? re_string_byte_at (&mctx.input, offset) : 0);
 	      if (fastmap[ch])
 		break;
 	      match_first += incr;
-- 
2.31.1



* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-10-19  8:57       ` Paul Eggert
@ 2021-10-19 15:09         ` Andreas Schwab
  2021-10-19 18:14           ` Paul Eggert
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2021-10-19 15:09 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha

On Oct 19 2021, Paul Eggert wrote:

> diff --git a/posix/regexec.c b/posix/regexec.c
> index 83e9aaf8ca..6aeba3c0b4 100644
> --- a/posix/regexec.c
> +++ b/posix/regexec.c
> @@ -758,10 +758,9 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
>  
>  		  offset = match_first - mctx.input.raw_mbs_idx;
>  		}
> -	      /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
> -		 Note that MATCH_FIRST must not be smaller than 0.  */
> -	      ch = (match_first >= length
> -		    ? 0 : re_string_byte_at (&mctx.input, offset));
> +	      /* Use buffer byte if OFFSET is in buffer, otherwise '\0'.  */
> +	      ch = (offset < mctx.input.valid_len
> +		    ? re_string_byte_at (&mctx.input, offset) : 0);

Why is the bug not in re_string_reconstruct?  Since string[match_first]
exists, so should re_string_byte_at (&mctx.input, offset).

Andreas.



* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-10-19 15:09         ` Andreas Schwab
@ 2021-10-19 18:14           ` Paul Eggert
  2021-11-24 22:27             ` Paul Eggert
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2021-10-19 18:14 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

On 10/19/21 08:09, Andreas Schwab wrote:
> Why is the bug not in re_string_reconstruct?  Since string[match_first]
> exists, so should re_string_byte_at (&mctx.input, offset).

I don't know, as I lacked the time to investigate re_string_reconstruct. 
Although the patch I proposed fixes the test case that prompted it, 
possibly it is only a partial fix for a more-general problem.


* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-10-19 18:14           ` Paul Eggert
@ 2021-11-24 22:27             ` Paul Eggert
  2021-11-24 22:45               ` Andreas Schwab
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2021-11-24 22:27 UTC (permalink / raw)
  To: libc-alpha

There was no further comment. The patch is safe and has been used in
Gnulib for some time, even if it does not necessarily fix the whole
underlying problem, so I installed it. Tests pass on x86-64.


* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-11-24 22:27             ` Paul Eggert
@ 2021-11-24 22:45               ` Andreas Schwab
  2021-11-24 23:50                 ` Paul Eggert
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2021-11-24 22:45 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha

On Nov 24 2021, Paul Eggert wrote:

> the patch is safe

Is it?  Why?

Andreas.



* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-11-24 22:45               ` Andreas Schwab
@ 2021-11-24 23:50                 ` Paul Eggert
  2021-11-25  9:01                   ` Andreas Schwab
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2021-11-24 23:50 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

On 11/24/21 14:45, Andreas Schwab wrote:
> Is it?  Why?

Partly because it refuses to read past the bounds of an array, where the 
old code would. And partly because it's been run through several tests 
- not just glibc tests, but also grep and coreutils and probably some 
others by now.

Of course this is not a 100% guarantee of safety, but it's close enough.
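
To make the first point concrete, the guard amounts to something like
the following. This is only a standalone sketch: struct input_buf and
byte_or_nul are made-up names for illustration, not the real
re_string_t or re_string_byte_at from glibc, and the lower-bound check
is an extra that the patch itself does not add.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the regex input buffer; this is NOT
   glibc's re_string_t.  */
struct input_buf
{
  const unsigned char *bytes;	/* Buffered input bytes.  */
  ptrdiff_t valid_len;		/* Number of valid bytes in BYTES.  */
};

/* Return the byte at OFFSET if OFFSET lies inside the valid part of
   the buffer, otherwise '\0'.  The 0 <= OFFSET test is an assumption
   added for this sketch; the patch only compares against valid_len.  */
static unsigned char
byte_or_nul (const struct input_buf *in, ptrdiff_t offset)
{
  assert (in != NULL && in->valid_len >= 0);
  return (0 <= offset && offset < in->valid_len
	  ? in->bytes[offset] : 0);
}
```

The point is that the comparison is against the number of bytes
actually present in the buffer, not against the length of the
caller's string, so an out-of-range offset yields '\0' instead of a
read past the end.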


* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-11-24 23:50                 ` Paul Eggert
@ 2021-11-25  9:01                   ` Andreas Schwab
  2021-11-26 18:35                     ` Paul Eggert
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2021-11-25  9:01 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha

On Nov 24 2021, Paul Eggert wrote:

> On 11/24/21 14:45, Andreas Schwab wrote:
>> Is it?  Why?
>
> Partly because it refuses to read past the bounds of an array, where the
> old code would.

That's just papering over a bug, not fixing it.

> And partly because it's been run through several tests - not just
> glibc tests, but also grep and coreutils and probably some others by
> now.

How much coverage do they provide?

Also, you failed to add a test.

Andreas.



* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-11-25  9:01                   ` Andreas Schwab
@ 2021-11-26 18:35                     ` Paul Eggert
  2021-11-26 18:39                       ` Andreas Schwab
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Eggert @ 2021-11-26 18:35 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

On 11/25/21 01:01, Andreas Schwab wrote:

>> Partly because it refuses to read past the bounds of an array, where the
>> old code would.
> 
> That's just papering over a bug, not fixing it.

That's not clear to me. Perhaps you're right, but perhaps it really does 
fix the bug.

>> And partly because it's been run through several tests - not just
>> glibc tests, but also grep and coreutils and probably some others by
>> now.
> 
> How much coverage do they provide?

Someone who has more time could presumably determine this by looking at 
the respective test suites. I forgot to mention, Gnulib also has its own 
regex tests (which also pass).

> Also, you failed to add a test.

Yes, that's correct. It would be nice if someone could do that. However, 
it'd be some work and like you I'm pressed for time.
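
For whoever picks this up, one possible shape for such a test is
sketched below. It is not the reproducer from BZ#28470: the helper
name search_exact_buffer is hypothetical, and the pattern and input
are left to the caller. The idea is only to run the search over a
heap block that is exactly as long as the input, with no terminating
NUL, so that valgrind or a sanitizer has a chance to flag any read
past the end. It relies on REG_STARTEND, which is an extension to
POSIX that glibc provides.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Search the first LEN bytes of TEXT for the extended regex PATTERN.
   Return the start offset of the match, or -1 if there is no match
   or an error occurs.  Hypothetical helper, for illustration only.  */
static long
search_exact_buffer (const char *pattern, const char *text, size_t len)
{
  assert (pattern != NULL && text != NULL);

  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;

  /* Copy the input into a block of exactly LEN bytes (at least one,
     so that malloc does not return NULL for an empty input).  There
     is deliberately no trailing NUL.  */
  char *buf = malloc (len ? len : 1);
  if (buf == NULL)
    {
      regfree (&re);
      return -1;
    }
  memcpy (buf, text, len);

  /* With REG_STARTEND, pmatch[0] delimits the part of the buffer to
     search, so the buffer need not be NUL-terminated.  */
  regmatch_t m;
  m.rm_so = 0;
  m.rm_eo = (regoff_t) len;
  int rc = regexec (&re, buf, 1, &m, REG_STARTEND);
  long result = rc == 0 ? (long) m.rm_so : -1;

  free (buf);
  regfree (&re);
  return result;
}
```

A real regression test would call this with the pattern and input
from the original report and be run under valgrind or with
-fsanitize=address; a plain run only checks the returned offset.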


* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
  2021-11-26 18:35                     ` Paul Eggert
@ 2021-11-26 18:39                       ` Andreas Schwab
  0 siblings, 0 replies; 13+ messages in thread
From: Andreas Schwab @ 2021-11-26 18:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha

On Nov 26 2021, Paul Eggert wrote:

> On 11/25/21 01:01, Andreas Schwab wrote:
>
>>> Partly because it refuses to read past the bounds of an array, where the
>>> old code would.
>> That's just papering over a bug, not fixing it.
>
> That's not clear to me. Perhaps you're right, but perhaps it really does
> fix the bug.

That's why we need a proper test case.  Not voodoo programming.

Andreas.


