From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Rich Felker <dalias@libc.org>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org, musl@lists.openwall.com
Subject: Re: [musl] Re: Regression: git no longer works with musl libc's regex impl
Date: Wed, 5 Oct 2016 13:17:49 +0200 (CEST)
Message-ID: <alpine.DEB.2.20.1610051250080.35196@virtualbox>
In-Reply-To: <20161004173926.GA19318@brightrain.aerifal.cx>

Hi Rich,

On Tue, 4 Oct 2016, Rich Felker wrote:

> On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
>
> > And lastly, the best alternative would be to teach musl about
> > REG_STARTEND, as it is rather useful a feature.
>
> Maybe, but it seems fundamentally costly to support -- it's extra
> state in the inner loops that imposes costly spill/reload on archs
> with too few registers (x86).

It is true that it could cause that.

I had a brief look at the source code (you use backtracking... hopefully
nobody uses musl to parse regular expressions from untrusted, or
inexperienced, sources [*1*]), and it seems that the regex code might
spill unnecessarily already (I see, for example, that the reg_notbol,
reg_noteol and reg_newline flags all use up complete int registers, not
merely bits of a single one).
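
To illustrate the point about flags, here is a minimal sketch of folding the three booleans into bits of one word. The names CTX_NOTBOL, CTX_NOTEOL, CTX_NEWLINE, pack_flags() and flag_set() are hypothetical and do not appear in musl; whether this actually reduces register pressure depends on the compiler and the target, so it would need measuring.

```c
#include <assert.h>
#include <regex.h>

/* Hypothetical bit names; per the message above, the existing code
 * keeps each of these in its own int instead. */
enum {
    CTX_NOTBOL  = 1 << 0,
    CTX_NOTEOL  = 1 << 1,
    CTX_NEWLINE = 1 << 2
};

/* Fold the regexec() eflags and the regcomp() cflags into one word. */
static unsigned pack_flags(int eflags, int cflags)
{
    unsigned flags = 0;

    if (eflags & REG_NOTBOL)
        flags |= CTX_NOTBOL;
    if (eflags & REG_NOTEOL)
        flags |= CTX_NOTEOL;
    if (cflags & REG_NEWLINE)
        flags |= CTX_NEWLINE;
    return flags;
}

/* Test a single bit; `bit` must be exactly one of the CTX_* values. */
static int flag_set(unsigned flags, unsigned bit)
{
    assert(bit != 0 && (bit & (bit - 1)) == 0);
    return (flags & bit) != 0;
}
```

An inner loop would then carry one `unsigned` and test it with flag_set(flags, CTX_NOTBOL) and so on, rather than keeping three separate ints live.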
It seems, specifically, that the *match_end_ofs parameter of the two
regexec backends is always set to point to eo, which is so far not
initialized. You could initialize it to -1 and set it to pmatch[0].rm_eo
if the REG_STARTEND flag is set. The GET_NEXT_WCHAR() macro would then
need to test something like
    if (str_byte >= string + *match_end_ofs) {
        ret = REG_NOMATCH;
        goto error_exit;
    }
This does not handle non-zero pmatch[0].rm_so, though. I would probably
try to pass another input parameter for that, but I have not verified yet
that a "^" would be handled properly (if pmatch[0].rm_so > 0 and
REG_STARTEND is set, "^" should *not* match).
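
For comparison, the behaviour described here can be approximated from the caller's side with plain POSIX regexec(), without touching the matcher. This is only a sketch under stated assumptions: regexec_range() is a hypothetical helper (not part of musl or git), the range must not contain NUL bytes, and it treats "^" at a non-zero start offset as non-matching by adding REG_NOTBOL, which is the expectation stated above and has not been verified against other implementations. It also copies the range, which is exactly the cost REG_STARTEND is meant to avoid.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: match `re` against bytes [so, eo) of `buf`.
 * Assumes the range holds no NUL bytes. If `re` was compiled with
 * REG_NOSUB, pass nmatch == 0, since pmatch is not filled in then.
 */
static int regexec_range(const regex_t *re, const char *buf,
                         regoff_t so, regoff_t eo,
                         size_t nmatch, regmatch_t pmatch[], int eflags)
{
    assert(so >= 0 && so <= eo);

    size_t len = (size_t)(eo - so);
    char *copy = malloc(len + 1);
    if (!copy)
        return REG_ESPACE;
    memcpy(copy, buf + so, len);
    copy[len] = '\0';   /* the terminator plays the role of rm_eo */

    /* Assumption: a non-zero start offset is not a line beginning. */
    if (so > 0)
        eflags |= REG_NOTBOL;

    int ret = regexec(re, copy, nmatch, pmatch, eflags);
    free(copy);

    if (ret == 0) {
        /* Translate offsets back so they are relative to `buf`. */
        for (size_t i = 0; i < nmatch; i++) {
            if (pmatch[i].rm_so != -1) {
                pmatch[i].rm_so += so;
                pmatch[i].rm_eo += so;
            }
        }
    }
    return ret;
}
```

A caller would pass the same offsets it would otherwise store in pmatch[0].rm_so and pmatch[0].rm_eo before a REG_STARTEND call.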
> I'll look at doing this when we overhaul/replace the regex
> implementation, and I'm happy to do some performance-regression tests
> for adding it now if someone has a simple patch (as was mentioned on the
> musl list).
I'd be interested to be kept in the loop, if you do not mind Cc:ing me.
Ciao,
Johannes
Footnote *1*:
http://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016