From: Johannes Schindelin <johannes.schindelin@gmx.de>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Subject: [PATCH v3 2/3] Introduce a function to run regexec() on non-NUL-terminated buffers
Date: Thu, 8 Sep 2016 09:58:35 +0200 (CEST)
Message-ID: <94ee698b2736929d37640012a1b1735b134dd3d6.1473321437.git.johannes.schindelin@gmx.de>
In-Reply-To: <cover.1473321437.git.johannes.schindelin@gmx.de>

We just introduced a test that demonstrates that our sloppy use of
regexec() on a mmap()ed area can produce incorrect results or even
hard crashes.

What we need to fix this is a function that calls regexec() on a
length-delimited, rather than a NUL-terminated, string.

Happily, there is an extension to regexec() introduced by the NetBSD
project and present in all major regex implementations, including
Linux's, MacOSX's and the one Git includes in compat/regex/: by using
the (non-POSIX) REG_STARTEND flag, it is possible to tell the
regexec() function that it should only look at the bytes between the
offsets pmatch[0].rm_so and pmatch[0].rm_eo.

That is exactly what we need.

Since support for REG_STARTEND is so widespread by now, let's just
introduce a helper function that uses it, and fall back to allocating
and constructing a NUL-terminated copy when REG_STARTEND is not
available.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
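
Not part of the patch itself: a minimal standalone sketch of the idea
described above, for readers who want to try REG_STARTEND outside of
Git. The helper name match_in_buf() is hypothetical, and it uses plain
malloc() instead of Git's xmalloc() so that it compiles on its own. It
assumes a POSIX <regex.h>; the REG_STARTEND branch is only compiled
when the platform defines that flag.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns 1 if `pattern`
 * matches within the first `size` bytes of `buf`, 0 if it does not,
 * and -1 if the pattern fails to compile or memory runs out.
 * `buf` does not need to be NUL-terminated.
 */
static int match_in_buf(const char *pattern, const char *buf, size_t size)
{
	regex_t re;
	regmatch_t m[1];
	int ret;

	assert(pattern && buf);
	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
#ifdef REG_STARTEND
	/* Tell regexec() to look only at the bytes in [rm_so, rm_eo). */
	m[0].rm_so = 0;
	m[0].rm_eo = (regoff_t)size;
	ret = regexec(&re, buf, 1, m, REG_STARTEND);
#else
	{
		/* Fallback: match against a NUL-terminated copy. */
		char *copy = malloc(size + 1);

		if (!copy) {
			regfree(&re);
			return -1;
		}
		memcpy(copy, buf, size);
		copy[size] = '\0';
		ret = regexec(&re, copy, 1, m, 0);
		free(copy);
	}
#endif
	regfree(&re);
	return ret == 0;
}
```

With this sketch, match_in_buf("bar", "foobar", 5) only looks at the
bytes "fooba" and reports no match, while passing a size of 6 finds
"bar". Note that the fallback branch still stops at an embedded NUL
byte, which the REG_STARTEND branch does not.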
git-compat-util.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/git-compat-util.h b/git-compat-util.h
index db89ba7..19128b3 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -965,6 +965,27 @@ void git_qsort(void *base, size_t nmemb, size_t size,
 #define qsort git_qsort
 #endif
 
+static inline int regexec_buf(const regex_t *preg, const char *buf, size_t size,
+			      size_t nmatch, regmatch_t pmatch[], int eflags)
+{
+#ifdef REG_STARTEND
+	assert(nmatch > 0 && pmatch);
+	pmatch[0].rm_so = 0;
+	pmatch[0].rm_eo = size;
+	return regexec(preg, buf, nmatch, pmatch, eflags | REG_STARTEND);
+#else
+	char *buf2 = xmalloc(size + 1);
+	int ret;
+
+	memcpy(buf2, buf, size);
+	buf2[size] = '\0';
+	ret = regexec(preg, buf2, nmatch, pmatch, eflags);
+	free(buf2);
+
+	return ret;
+#endif
+}
+
 #ifndef DIR_HAS_BSD_GROUP_SEMANTICS
 # define FORCE_DIR_SET_GID S_ISGID
 #else
--
2.10.0.windows.1.10.g803177d