From: Paolo Bonzini <paolo.bonzini@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>, John Bito <jwbito@gmail.com>,
git <git@vger.kernel.org>
Subject: Re: git diff looping?
Date: Wed, 17 Jun 2009 10:46:21 +0200
Message-ID: <4A38AD5D.6010404@gmail.com>
In-Reply-To: <20090616171531.GA17538@coredump.intra.peff.net>
> Really, that performance is so bad that I'm beginning to wonder if I am
> somehow measuring something wrong. How could they ship something so
> crappy through so many versions?
Because without some care in the matcher, matching the regex can take
exponential time. This happens because the matcher can backtrack
arbitrarily from [A-Za-z_0-9]* into [A-Za-z_]. Ironically, it also causes
the regex not to work as intended: for example, "catch(" can match the
complex part of the regex (e.g. the first repetition can be "c" and the
second can be "atch").
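To make the blow-up concrete, here is a small Python sketch (not part of the original message) that enumerates every way of cutting a word into two or more pieces that each still look like an identifier. These are the alternatives a backtracking matcher may end up trying for the repeated subexpression. The helper name identifier_splits is made up for illustration.

```python
import re
from itertools import combinations

# One identifier, as written inside the repeated subexpression.
IDENT = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def identifier_splits(word):
    """Yield every way to cut `word` into two or more valid identifiers."""
    n = len(word)
    for k in range(1, n):  # k = number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            pieces = [word[a:b] for a, b in zip(bounds, bounds[1:])]
            if all(IDENT.fullmatch(p) for p in pieces):
                yield pieces


splits = list(identifier_splits("catch"))
print(len(splits))                              # 15
print(["c", "atch"] in splits)                  # True
print(len(list(identifier_splits("a" * 10))))   # 511
```

For a word of n letters the count is 2^(n-1) - 1, so each extra character roughly doubles the number of candidate splits.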
We can make it faster and more correct at the expense of additional
complication.
Starting from:
^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\([^;]*)$
we have to:
1) move the [ \t]* to the end of the repeated subexpression, which
removes the need for the separate [ \t]* that followed the repetition:
^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]*){2,}\([^;]*)$
2) make sure that at least one space/tab is eaten on all but the last
occurrence of the repeated subexpression. To this end the LHS of {2,}
is duplicated, once with [ \t]+ and once with [ \t]*. The repetition
itself becomes a + since the last occurrence is now separately handled:
^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\([^;]*)$
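The three expressions above can be compared with a short Python sketch. Python's re module is only a stand-in here (git compiles these patterns with the platform's regcomp/regexec, whose speed may differ), and the sample lines and the names ORIGINAL, STEP_1, STEP_2 and matches are made up for illustration.

```python
import re

# The three patterns from the message, in order.
ORIGINAL = r"^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\([^;]*)$"
STEP_1 = r"^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]*){2,}\([^;]*)$"
STEP_2 = (
    r"^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*"
    r"[ \t]*\([^;]*)$"
)

# Hypothetical sample lines, kept short so the original pattern stays fast.
SAMPLES = [
    "public void run() {",   # a method definition
    "    catch(",            # a keyword glued to "(", not a definition
    "foo();",                # a call statement
]


def matches(pattern, line):
    """Return True if `pattern` matches the whole `line`."""
    return re.search(pattern, line) is not None


for line in SAMPLES:
    results = [matches(p, line) for p in (ORIGINAL, STEP_1, STEP_2)]
    print(repr(line), results)
```

Under these assumptions the method definition matches all three patterns, "catch(" matches the first two but not the last, and the call statement matches none of them.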
Paolo