* [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Paul Eggert @ 2021-10-18 22:15 UTC
To: libc-alpha
Problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* posix/regexec.c (re_search_internal): Use better bounds check.
---
posix/regexec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/posix/regexec.c b/posix/regexec.c
index 83e9aaf8ca..a955aa2182 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -760,7 +760,7 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
}
/* If MATCH_FIRST is out of the buffer, leave it as '\0'.
Note that MATCH_FIRST must not be smaller than 0. */
- ch = (match_first >= length
+ ch = (mctx.input.valid_len <= offset
? 0 : re_string_byte_at (&mctx.input, offset));
if (fastmap[ch])
break;
--
2.31.1
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Andreas Schwab @ 2021-10-19 7:17 UTC
To: Paul Eggert; +Cc: libc-alpha
On Oct 18 2021, Paul Eggert wrote:
> /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
> Note that MATCH_FIRST must not be smaller than 0. */
> - ch = (match_first >= length
> + ch = (mctx.input.valid_len <= offset
That needs to update the comment.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Paul Eggert @ 2021-10-19 8:13 UTC
To: Andreas Schwab; +Cc: libc-alpha
On 10/19/21 00:17, Andreas Schwab wrote:
> That needs to update the comment.
Thanks, revised patch attached.
[-- Attachment #2: 0001-regex-fix-buffer-read-overrun-in-search-BZ-28470.patch --]
[-- Type: text/x-patch, Size: 1114 bytes --]
From be84b14058bad546eba87742064e347f651e5852 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 18 Oct 2021 15:00:21 -0700
Subject: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
Problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* posix/regexec.c (re_search_internal): Use better bounds check.
---
posix/regexec.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/posix/regexec.c b/posix/regexec.c
index 83e9aaf8ca..106f9d7ff1 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -758,9 +758,8 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
offset = match_first - mctx.input.raw_mbs_idx;
}
- /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
- Note that MATCH_FIRST must not be smaller than 0. */
- ch = (match_first >= length
+ /* If OFFSET is out of the buffer, leave CH as '\0'. */
+ ch = (mctx.input.valid_len <= offset
? 0 : re_string_byte_at (&mctx.input, offset));
if (fastmap[ch])
break;
--
2.31.1
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Andreas Schwab @ 2021-10-19 8:25 UTC
To: Paul Eggert; +Cc: libc-alpha
On Oct 19 2021, Paul Eggert wrote:
> + ch = (mctx.input.valid_len <= offset
This is backwards.
Andreas.
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Paul Eggert @ 2021-10-19 8:57 UTC
To: Andreas Schwab; +Cc: libc-alpha
On 10/19/21 01:25, Andreas Schwab wrote:
> On Oct 19 2021, Paul Eggert wrote:
>
>> + ch = (mctx.input.valid_len <= offset
>
> This is backwards.
It's correct as-is, so that comment is merely about style. I revamped
the patch to turn the comparison around; see attached. Let's not have
our longstanding style disagreement distract us from the fix.
[-- Attachment #2: 0001-regex-fix-buffer-read-overrun-in-search-BZ-28470.patch --]
[-- Type: text/x-patch, Size: 1206 bytes --]
From 7be5e6881cfd18006cac116d27e398ae342ba536 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 18 Oct 2021 15:00:21 -0700
Subject: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
Problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* posix/regexec.c (re_search_internal): Use better bounds check.
---
posix/regexec.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/posix/regexec.c b/posix/regexec.c
index 83e9aaf8ca..6aeba3c0b4 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -758,10 +758,9 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
offset = match_first - mctx.input.raw_mbs_idx;
}
- /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
- Note that MATCH_FIRST must not be smaller than 0. */
- ch = (match_first >= length
- ? 0 : re_string_byte_at (&mctx.input, offset));
+ /* Use buffer byte if OFFSET is in buffer, otherwise '\0'. */
+ ch = (offset < mctx.input.valid_len
+ ? re_string_byte_at (&mctx.input, offset) : 0);
if (fastmap[ch])
break;
match_first += incr;
--
2.31.1
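
For readers following the thread, the revised condition can be looked at in isolation. The following is only a minimal sketch, assuming a simplified input window: `struct toy_input` and `toy_byte_or_nul` are hypothetical names that do not exist in glibc, and the real `re_string_byte_at` operates on glibc's own string context rather than on this toy structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the matcher's buffered input window.
   This is NOT glibc's string context, only an illustration.  */
struct toy_input
{
  const unsigned char *mbs;  /* start of the buffered window */
  ptrdiff_t valid_len;       /* number of bytes in MBS holding valid data */
};

/* Same shape as the revised check: use the buffered byte if OFFSET
   is inside the valid part of the window, otherwise '\0'.  */
static unsigned char
toy_byte_or_nul (const struct toy_input *in, ptrdiff_t offset)
{
  assert (in != NULL && offset >= 0);
  return offset < in->valid_len ? in->mbs[offset] : 0;
}
```

With a window of three valid bytes, offsets 0 through 2 return the stored bytes and any larger offset returns 0 without touching memory. The comparison is made against the length of the window that is indexed, not against the length of the caller's original string.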
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Andreas Schwab @ 2021-10-19 15:09 UTC
To: Paul Eggert; +Cc: libc-alpha
On Oct 19 2021, Paul Eggert wrote:
> diff --git a/posix/regexec.c b/posix/regexec.c
> index 83e9aaf8ca..6aeba3c0b4 100644
> --- a/posix/regexec.c
> +++ b/posix/regexec.c
> @@ -758,10 +758,9 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
>
> offset = match_first - mctx.input.raw_mbs_idx;
> }
> - /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
> - Note that MATCH_FIRST must not be smaller than 0. */
> - ch = (match_first >= length
> - ? 0 : re_string_byte_at (&mctx.input, offset));
> + /* Use buffer byte if OFFSET is in buffer, otherwise '\0'. */
> + ch = (offset < mctx.input.valid_len
> + ? re_string_byte_at (&mctx.input, offset) : 0);
Why is the bug not in re_string_reconstruct? Since string[match_first]
exists, so should re_string_byte_at (&mctx.input, offset).
Andreas.
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Paul Eggert @ 2021-10-19 18:14 UTC
To: Andreas Schwab; +Cc: libc-alpha
On 10/19/21 08:09, Andreas Schwab wrote:
> Why is the bug not in re_string_reconstruct? Since string[match_first]
> exists, so should re_string_byte_at (&mctx.input, offset).
I don't know, as I lacked the time to investigate re_string_reconstruct.
Although the patch I proposed fixes the test case that prompted it,
possibly it is only a partial fix for a more-general problem.
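
The disagreement above turns on whether the old and new conditions can ever differ. A rough sketch of the two conditions side by side, under the assumption that the buffered window may hold fewer valid bytes than remain in the caller's string, might look like this. The structure and function names are hypothetical; the field names only echo those in the patch, and whether such a state can legitimately arise in glibc is exactly the question left open in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the search state; not glibc's data structure.  */
struct toy_state
{
  ptrdiff_t length;       /* total length of the caller's string */
  ptrdiff_t raw_mbs_idx;  /* where the buffered window starts in the string */
  ptrdiff_t valid_len;    /* valid bytes currently in the buffered window */
};

/* Old condition: the window is read whenever MATCH_FIRST < LENGTH.  */
static bool
toy_old_check_reads (const struct toy_state *s, ptrdiff_t match_first)
{
  return match_first < s->length;
}

/* New condition: the window is read only when OFFSET < VALID_LEN.  */
static bool
toy_new_check_reads (const struct toy_state *s, ptrdiff_t match_first)
{
  assert (match_first >= s->raw_mbs_idx);
  ptrdiff_t offset = match_first - s->raw_mbs_idx;
  return offset < s->valid_len;
}
```

In this toy model, a state with `length` 10, `raw_mbs_idx` 4 and `valid_len` 3 makes the old condition read at `match_first` 8 (window offset 4), while the new condition declines. If the window were always filled up to the end of the string, the two conditions would agree.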
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Paul Eggert @ 2021-11-24 22:27 UTC
To: libc-alpha
No further comment, and the patch is safe and has been used in Gnulib
for some time even if it doesn't necessarily fix all the underlying
problem, so I installed it. Tests pass on x86-64.
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Andreas Schwab @ 2021-11-24 22:45 UTC
To: Paul Eggert; +Cc: libc-alpha
On Nov 24 2021, Paul Eggert wrote:
> the patch is safe
Is it? Why?
Andreas.
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Paul Eggert @ 2021-11-24 23:50 UTC
To: Andreas Schwab; +Cc: libc-alpha
On 11/24/21 14:45, Andreas Schwab wrote:
> Is it? Why?
Partly because it refuses to read past the bounds of an array, where the
old code would. And partly because it's been run through several tests
- not just glibc tests, but also grep and coreutils and probably some
others by now.
Of course this is not a 100% guarantee of safety, but it's close enough.
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Andreas Schwab @ 2021-11-25 9:01 UTC
To: Paul Eggert; +Cc: libc-alpha
On Nov 24 2021, Paul Eggert wrote:
> On 11/24/21 14:45, Andreas Schwab wrote:
>> Is it? Why?
>
> Partly because it refuses to read past the bounds of an array, where the
> old code would.
That's just papering over a bug, not fixing it.
> And partly because it's been run through several tests - not just
> glibc tests, but also grep and coreutils and probably some others by
> now.
How much coverage do they provide?
Also, you failed to add a test.
Andreas.
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Paul Eggert @ 2021-11-26 18:35 UTC
To: Andreas Schwab; +Cc: libc-alpha
On 11/25/21 01:01, Andreas Schwab wrote:
>> Partly because it refuses to read past the bounds of an array, where the
>> old code would.
>
> That's just papering over a bug, not fixing it.
That's not clear to me. Perhaps you're right, but perhaps it really does
fix the bug.
>> And partly because it's been run through several tests - not just
>> glibc tests, but also grep and coreutils and probably some others by
>> now.
>
> How much coverage do they provide?
Someone who has more time could presumably determine this by looking at
the respective test suites. I forgot to mention, Gnulib also has its own
regex tests (which also pass).
> Also, you failed to add a test.
Yes, that's correct. It would be nice if someone could do that. However,
it'd be some work and like you I'm pressed for time.
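
As a starting point for whoever picks this up, a regression test for an overread could take roughly the following shape. This is only a sketch of the harness, not the reproducer from the bug report: the helper name `toy_match_exact_buffer` is hypothetical, and the pattern and input passed to it would have to be replaced with the ones from the report. It uses only the POSIX `regcomp`, `regexec` and `regfree` calls, and copies the input into a heap buffer of exactly the needed size so that a tool such as valgrind or AddressSanitizer can flag any read past the end.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN, match it against a heap copy
   of TEXT that is exactly strlen (TEXT) + 1 bytes long, and return
   the regexec result, or -1 if setup fails.  */
static int
toy_match_exact_buffer (const char *pattern, const char *text)
{
  regex_t re;
  size_t size;
  char *copy;
  int rc;

  assert (pattern != NULL && text != NULL);
  size = strlen (text) + 1;
  copy = malloc (size);
  if (copy == NULL)
    return -1;
  memcpy (copy, text, size);

  if (regcomp (&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
    {
      free (copy);
      return -1;
    }
  /* Any read beyond COPY + SIZE here is what a memory checker would flag.  */
  rc = regexec (&re, copy, 0, NULL, 0);
  regfree (&re);
  free (copy);
  return rc;
}
```

The return value alone says nothing about overreads; the helper is meant to be run under a memory checker. Exercising `re_search_internal` with an explicit length and search range would need the GNU `re_search` interface instead of `regexec`.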
* Re: [PATCH] regex: fix buffer read overrun in search [BZ#28470]
From: Andreas Schwab @ 2021-11-26 18:39 UTC
To: Paul Eggert; +Cc: libc-alpha
On Nov 26 2021, Paul Eggert wrote:
> On 11/25/21 01:01, Andreas Schwab wrote:
>
>>> Partly because it refuses to read past the bounds of an array, where the
>>> old code would.
>> That's just papering over a bug, not fixing it.
>
> That's not clear to me. Perhaps you're right, but perhaps it really does
> fix the bug.
That's why we need a proper test case. Not voodoo programming.
Andreas.