From: Adhemerval Zanella
To: Paul Eggert, bug-gnulib@gnu.org
Subject: Re: [PATCH 07/10] regex: fix longstanding backref match bug
Date: Tue, 9 Feb 2021 16:42:21 -0300
Message-ID: <041bd09f-447e-40ca-70a6-1eab3993fe19@linaro.org>
In-Reply-To: <20210206012602.2257711-7-eggert@cs.ucla.edu>
References: <20210206012602.2257711-1-eggert@cs.ucla.edu> <20210206012602.2257711-7-eggert@cs.ucla.edu>
List-Id: Gnulib discussion list

Hi Paul,

While trying to sync gnulib with the glibc code, I found that this patch
triggers some regressions in the glibc test cases:

FAIL: posix/tst-boost
FAIL: posix/tst-pcre
FAIL: posix/tst-rxspencer
FAIL: posix/tst-rxspencer-no-utf8

$ grep -n "FAIL rm" posix/tst-boost.out
445:FAIL rm[1] 3..-1 != expected 2..3
448:FAIL rm[1] 3..-1 != expected 2..3
451:FAIL rm[1] 3..-1 != expected 2..3
454:FAIL rm[1] 3..-1 != expected 2..3

$ cat posix/tst-pcre.out
1168: /([a]*)*/ on a unexpectedly failed to match register 1 1..-1
1171: /([a]*)*/ on aaaaa unexpectedly failed to match register 1 5..-1
1176: /([ab]*)*/ on a unexpectedly failed to match register 1 1..-1
1179: /([ab]*)*/ on b unexpectedly failed to match register 1 1..-1
1182: /([ab]*)*/ on ababab unexpectedly failed to match register 1 6..-1
1185: /([ab]*)*/ on aaaabcde unexpectedly failed to match register 1 5..-1
1188: /([ab]*)*/ on bbbb unexpectedly failed to match register 1 4..-1
1193: /([^a]*)*/ on b unexpectedly failed to match register 1 1..-1
1196: /([^a]*)*/ on bbbb unexpectedly failed to match register 1 4..-1
1203: /([^ab]*)*/ on cccc unexpectedly failed to match register 1 4..-1

$ cat posix/tst-rxspencer.out | grep FAIL
FAIL rm[1] unexpectedly did not match
FAIL rm[1] unexpectedly did not match

$ cat posix/tst-rxspencer-no-utf8.out | grep FAIL
FAIL rm[1] unexpectedly did not match
FAIL rm[1] unexpectedly did not match

On 05/02/2021 22:25, Paul Eggert wrote:
> This fixes a longstanding glibc bug concerning backreferences
> (2009-12-04).
> * lib/regexec.c (proceed_next_node, push_fail_stack)
> (pop_fail_stack): Push and pop the previous registers
> as well as the current ones.  All callers changed.
> (set_regs): Also pop if CUR_NODE has already been checked,
> so that it does not get added as a duplicate set entry.
> (update_regs): Fix comment location.
> * tests/test-regex.c (tests): New constant.
> (bug_regex11): New test function.
> (main): Bump alarm value.  Call new test function.
> ---
>  ChangeLog          |  13 ++++++
>  lib/regexec.c      |  26 +++++++----
>  tests/test-regex.c | 113 ++++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 141 insertions(+), 11 deletions(-)
>
> diff --git a/ChangeLog b/ChangeLog
> index 74304474b..bd7d1fa16 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,5 +1,18 @@
>  2021-02-05  Paul Eggert
>
> +        regex: fix longstanding backref match bug
> +        This fixes a longstanding glibc bug concerning backreferences
> +        (2009-12-04).
> +        * lib/regexec.c (proceed_next_node, push_fail_stack)
> +        (pop_fail_stack): Push and pop the previous registers
> +        as well as the current ones.  All callers changed.
> +        (set_regs): Also pop if CUR_NODE has already been checked,
> +        so that it does not get added as a duplicate set entry.
> +        (update_regs): Fix comment location.
> +        * tests/test-regex.c (tests): New constant.
> +        (bug_regex11): New test function.
> +        (main): Bump alarm value.  Call new test function.
> +
>          regex: avoid duplicate in espilon closure
>          * lib/regcomp.c (calc_eclosure_iter): Insert NODE into epsilon
>          closure first rather than last.  Otherwise, the epsilon closure
> diff --git a/lib/regexec.c b/lib/regexec.c
> index fdd2e373e..424bc8d15 100644
> --- a/lib/regexec.c
> +++ b/lib/regexec.c
> @@ -59,7 +59,7 @@ static void update_regs (const re_dfa_t *dfa, regmatch_t *pmatch,
>                           Idx cur_idx, Idx nmatch);
>  static reg_errcode_t push_fail_stack (struct re_fail_stack_t *fs,
>                                        Idx str_idx, Idx dest_node, Idx nregs,
> -                                      regmatch_t *regs,
> +                                      regmatch_t *regs, regmatch_t *prevregs,
>                                        re_node_set *eps_via_nodes);
>  static reg_errcode_t set_regs (const regex_t *preg,
>                                 const re_match_context_t *mctx,
> @@ -1211,6 +1211,7 @@ check_halt_state_context (const re_match_context_t *mctx,
>
>  static Idx
>  proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
> +                   regmatch_t *prevregs,
>                     Idx *pidx, Idx node, re_node_set *eps_via_nodes,
>                     struct re_fail_stack_t *fs)
>  {
> @@ -1243,7 +1244,7 @@ proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
>            /* Otherwise, push the second epsilon-transition on the fail stack.  */
>            else if (fs != NULL
>                     && push_fail_stack (fs, *pidx, candidate, nregs, regs,
> -                                        eps_via_nodes))
> +                                        prevregs, eps_via_nodes))
>              return -2;
>
>            /* We know we are going to exit.  */
> @@ -1316,7 +1317,8 @@ proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
>  static reg_errcode_t
>  __attribute_warn_unused_result__
>  push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, Idx dest_node,
> -                 Idx nregs, regmatch_t *regs, re_node_set *eps_via_nodes)
> +                 Idx nregs, regmatch_t *regs, regmatch_t *prevregs,
> +                 re_node_set *eps_via_nodes)
>  {
>    reg_errcode_t err;
>    Idx num = fs->num++;
> @@ -1332,23 +1334,26 @@ push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, Idx dest_node,
>      }
>    fs->stack[num].idx = str_idx;
>    fs->stack[num].node = dest_node;
> -  fs->stack[num].regs = re_malloc (regmatch_t, nregs);
> +  fs->stack[num].regs = re_malloc (regmatch_t, 2 * nregs);
>    if (fs->stack[num].regs == NULL)
>      return REG_ESPACE;
>    memcpy (fs->stack[num].regs, regs, sizeof (regmatch_t) * nregs);
> +  memcpy (fs->stack[num].regs + nregs, prevregs, sizeof (regmatch_t) * nregs);
>    err = re_node_set_init_copy (&fs->stack[num].eps_via_nodes, eps_via_nodes);
>    return err;
>  }
>
>  static Idx
>  pop_fail_stack (struct re_fail_stack_t *fs, Idx *pidx, Idx nregs,
> -                regmatch_t *regs, re_node_set *eps_via_nodes)
> +                regmatch_t *regs, regmatch_t *prevregs,
> +                re_node_set *eps_via_nodes)
>  {
>    if (fs == NULL || fs->num == 0)
>      return -1;
>    Idx num = --fs->num;
>    *pidx = fs->stack[num].idx;
>    memcpy (regs, fs->stack[num].regs, sizeof (regmatch_t) * nregs);
> +  memcpy (prevregs, fs->stack[num].regs + nregs, sizeof (regmatch_t) * nregs);
>    re_node_set_free (eps_via_nodes);
>    re_free (fs->stack[num].regs);
>    *eps_via_nodes = fs->stack[num].eps_via_nodes;
> @@ -1408,7 +1413,8 @@ set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
>      {
>        update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch);
>
> -      if (idx == pmatch[0].rm_eo && cur_node == mctx->last_node)
> +      if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node)
> +          || re_node_set_contains (&eps_via_nodes, cur_node))
>          {
>            Idx reg_idx;
>            cur_node = -1;
> @@ -1418,7 +1424,7 @@ set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
>                  if (pmatch[reg_idx].rm_so > -1 && pmatch[reg_idx].rm_eo == -1)
>                    {
>                      cur_node = pop_fail_stack (fs, &idx, nmatch, pmatch,
> -                                               &eps_via_nodes);
> +                                               prev_idx_match, &eps_via_nodes);
>                      break;
>                    }
>              }
> @@ -1431,7 +1437,8 @@ set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
>          }
>
>        /* Proceed to next node.  */
> -      cur_node = proceed_next_node (mctx, nmatch, pmatch, &idx, cur_node,
> +      cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match,
> +                                    &idx, cur_node,
>                                      &eps_via_nodes, fs);
>
>        if (__glibc_unlikely (cur_node < 0))
> @@ -1443,7 +1450,8 @@ set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
>                free_fail_stack_return (fs);
>                return REG_ESPACE;
>              }
> -          cur_node = pop_fail_stack (fs, &idx, nmatch, pmatch, &eps_via_nodes);
> +          cur_node = pop_fail_stack (fs, &idx, nmatch, pmatch,
> +                                     prev_idx_match, &eps_via_nodes);
>            if (cur_node < 0)
>              {
>                re_node_set_free (&eps_via_nodes);
> diff --git a/tests/test-regex.c b/tests/test-regex.c
> index 58f8200f2..a14619805 100644
> --- a/tests/test-regex.c
> +++ b/tests/test-regex.c
> @@ -55,6 +55,111 @@ really_utf8 (void)
>    return strcmp (locale_charset (), "UTF-8") == 0;
>  }
>
> +/* Tests supposed to match; copied from glibc posix/bug-regex11.c.  */
> +static struct
> +{
> +  const char *pattern;
> +  const char *string;
> +  int flags, nmatch;
> +  regmatch_t rm[5];
> +} const tests[] = {
> +  /* Test for newline handling in regex.  */
> +  { "[^~]*~", "\nx~y", 0, 2, { { 0, 3 }, { -1, -1 } } },
> +  /* Other tests.  */
> +  { "a(.*)b", "a b", REG_EXTENDED, 2, { { 0, 3 }, { 1, 2 } } },
> +  { ".*|\\([KIO]\\)\\([^|]*\\).*|?[KIO]", "10~.~|P|K0|I10|O16|?KSb", 0, 3,
> +    { { 0, 21 }, { 15, 16 }, { 16, 18 } } },
> +  { ".*|\\([KIO]\\)\\([^|]*\\).*|?\\1", "10~.~|P|K0|I10|O16|?KSb", 0, 3,
> +    { { 0, 21 }, { 8, 9 }, { 9, 10 } } },
> +  { "^\\(a*\\)\\1\\{9\\}\\(a\\{0,9\\}\\)\\([0-9]*;.*[^a]\\2\\([0-9]\\)\\)",
> +    "a1;;0a1aa2aaa3aaaa4aaaaa5aaaaaa6aaaaaaa7aaaaaaaa8aaaaaaaaa9aa2aa1a0", 0,
> +    5, { { 0, 67 }, { 0, 0 }, { 0, 1 }, { 1, 67 }, { 66, 67 } } },
> +  /* Test for BRE expression anchoring.  POSIX says just that this may match;
> +     in glibc regex it always matched, so avoid changing it.  */
> +  { "\\(^\\|foo\\)bar", "bar", 0, 2, { { 0, 3 }, { -1, -1 } } },
> +  { "\\(foo\\|^\\)bar", "bar", 0, 2, { { 0, 3 }, { -1, -1 } } },
> +  /* In ERE this must be treated as an anchor.  */
> +  { "(^|foo)bar", "bar", REG_EXTENDED, 2, { { 0, 3 }, { -1, -1 } } },
> +  { "(foo|^)bar", "bar", REG_EXTENDED, 2, { { 0, 3 }, { -1, -1 } } },
> +  /* Here ^ cannot be treated as an anchor according to POSIX.  */
> +  { "(^|foo)bar", "(^|foo)bar", 0, 2, { { 0, 10 }, { -1, -1 } } },
> +  { "(foo|^)bar", "(foo|^)bar", 0, 2, { { 0, 10 }, { -1, -1 } } },
> +  /* More tests on backreferences.  */
> +  { "()\\1", "x", REG_EXTENDED, 2, { { 0, 0 }, { 0, 0 } } },
> +  { "()x\\1", "x", REG_EXTENDED, 2, { { 0, 1 }, { 0, 0 } } },
> +  { "()\\1*\\1*", "", REG_EXTENDED, 2, { { 0, 0 }, { 0, 0 } } },
> +  { "([0-9]).*\\1(a*)", "7;7a6", REG_EXTENDED, 3, { { 0, 4 }, { 0, 1 }, { 3, 4 } } },
> +  { "([0-9]).*\\1(a*)", "7;7a", REG_EXTENDED, 3, { { 0, 4 }, { 0, 1 }, { 3, 4 } } },
> +  { "(b)()c\\1", "bcb", REG_EXTENDED, 3, { { 0, 3 }, { 0, 1 }, { 1, 1 } } },
> +  { "()(b)c\\2", "bcb", REG_EXTENDED, 3, { { 0, 3 }, { 0, 0 }, { 0, 1 } } },
> +  { "a(b)()c\\1", "abcb", REG_EXTENDED, 3, { { 0, 4 }, { 1, 2 }, { 2, 2 } } },
> +  { "a()(b)c\\2", "abcb", REG_EXTENDED, 3, { { 0, 4 }, { 1, 1 }, { 1, 2 } } },
> +  { "()(b)\\1c\\2", "bcb", REG_EXTENDED, 3, { { 0, 3 }, { 0, 0 }, { 0, 1 } } },
> +  { "(b())\\2\\1", "bbbb", REG_EXTENDED, 3, { { 0, 2 }, { 0, 1 }, { 1, 1 } } },
> +  { "a()(b)\\1c\\2", "abcb", REG_EXTENDED, 3, { { 0, 4 }, { 1, 1 }, { 1, 2 } } },
> +  { "a()d(b)\\1c\\2", "adbcb", REG_EXTENDED, 3, { { 0, 5 }, { 1, 1 }, { 2, 3 } } },
> +  { "a(b())\\2\\1", "abbbb", REG_EXTENDED, 3, { { 0, 3 }, { 1, 2 }, { 2, 2 } } },
> +  { "(bb())\\2\\1", "bbbb", REG_EXTENDED, 3, { { 0, 4 }, { 0, 2 }, { 2, 2 } } },
> +  { "^([^,]*),\\1,\\1$", "a,a,a", REG_EXTENDED, 2, { { 0, 5 }, { 0, 1 } } },
> +  { "^([^,]*),\\1,\\1$", "ab,ab,ab", REG_EXTENDED, 2, { { 0, 8 }, { 0, 2 } } },
> +  { "^([^,]*),\\1,\\1,\\1$", "abc,abc,abc,abc", REG_EXTENDED, 2,
> +    { { 0, 15 }, { 0, 3 } } },
> +  { "^(.?)(.?)(.?)(.?)(.?).?\\5\\4\\3\\2\\1$",
> +    "level", REG_NOSUB | REG_EXTENDED, 0, { { -1, -1 } } },
> +  { "^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.).?\\9\\8\\7\\6\\5\\4\\3\\2\\1$|^.?$",
> +    "level", REG_NOSUB | REG_EXTENDED, 0, { { -1, -1 } } },
> +  { "^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.).?\\9\\8\\7\\6\\5\\4\\3\\2\\1$|^.?$",
> +    "abcdedcba", REG_EXTENDED, 1, { { 0, 9 } } },
> +  /* XXX Not used since they fail so far.  */
> +  { "^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.).?\\9\\8\\7\\6\\5\\4\\3\\2\\1$|^.?$",
> +    "ababababa", REG_EXTENDED, 1, { { 0, 9 } } },
> +  { "^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\\9\\8\\7\\6\\5\\4\\3\\2\\1$",
> +    "level", REG_NOSUB | REG_EXTENDED, 0, { { -1, -1 } } },
> +  { "^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\\9\\8\\7\\6\\5\\4\\3\\2\\1$",
> +    "ababababa", REG_EXTENDED, 1, { { 0, 9 } } },
> +};
> +
> +static void
> +bug_regex11 (void)
> +{
> +  regex_t re;
> +  regmatch_t rm[5];
> +  size_t i;
> +  int n;
> +
> +  for (i = 0; i < sizeof (tests) / sizeof (tests[0]); ++i)
> +    {
> +      n = regcomp (&re, tests[i].pattern, tests[i].flags);
> +      if (n != 0)
> +        {
> +          char buf[500];
> +          regerror (n, &re, buf, sizeof (buf));
> +          report_error ("%s: regcomp %zd failed: %s", tests[i].pattern, i, buf);
> +          continue;
> +        }
> +
> +      if (regexec (&re, tests[i].string, tests[i].nmatch, rm, 0))
> +        {
> +          report_error ("%s: regexec %zd failed", tests[i].pattern, i);
> +          regfree (&re);
> +          continue;
> +        }
> +
> +      for (n = 0; n < tests[i].nmatch; ++n)
> +        if (rm[n].rm_so != tests[i].rm[n].rm_so
> +            || rm[n].rm_eo != tests[i].rm[n].rm_eo)
> +          {
> +            if (tests[i].rm[n].rm_so == -1 && tests[i].rm[n].rm_eo == -1)
> +              break;
> +            report_error ("%s: regexec %zd match failure rm[%d] %d..%d",
> +                          tests[i].pattern, i, n, rm[n].rm_so, rm[n].rm_eo);
> +            break;
> +          }
> +
> +      regfree (&re);
> +    }
> +}
> +
>  int
>  main (void)
>  {
> @@ -65,11 +170,15 @@ main (void)
>    struct re_registers regs;
>
>  #if HAVE_DECL_ALARM
> -  /* Some builds of glibc go into an infinite loop on this test.  */
> -  int alarm_value = 2;
> +  /* In case a bug causes glibc to go into an infinite loop.
> +     The tests should take less than 10 s on a reasonably modern CPU.  */
> +  int alarm_value = 1000;
>    signal (SIGALRM, SIG_DFL);
>    alarm (alarm_value);
>  #endif
> +
> +  bug_regex11 ();
> +
>    if (setlocale (LC_ALL, "en_US.UTF-8"))
>      {
>        {
>
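
For anyone who wants to check the tst-pcre symptom outside the glibc test
suite, here is a minimal sketch using only the POSIX <regex.h> interface.
The helper names match_registers and count_half_open are hypothetical (they
are not part of glibc or gnulib), and compiling with REG_EXTENDED is my
assumption about how the failing patterns are built. A register counts as
"half-open" when rm_so > -1 but rm_eo == -1, which is the shape of the
"1..-1" values reported above.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: compile PATTERN as a POSIX ERE (an assumption,
   see the note above), run it on STRING, and store up to NMATCH
   registers in RM.  Returns 0 on a match, otherwise the nonzero
   regcomp or regexec error code.  */
static int
match_registers (const char *pattern, const char *string,
                 size_t nmatch, regmatch_t *rm)
{
  regex_t re;
  int err = regcomp (&re, pattern, REG_EXTENDED);
  if (err != 0)
    return err;
  err = regexec (&re, string, nmatch, rm, 0);
  regfree (&re);
  return err;
}

/* Hypothetical helper: print every register as "rm[i] so..eo" and
   return how many have a start offset but no end offset.  */
static int
count_half_open (const regmatch_t *rm, size_t nmatch)
{
  assert (rm != NULL);
  int half_open = 0;
  for (size_t i = 0; i < nmatch; i++)
    {
      printf ("rm[%zu] %d..%d\n", i, (int) rm[i].rm_so, (int) rm[i].rm_eo);
      if (rm[i].rm_so > -1 && rm[i].rm_eo == -1)
        half_open++;
    }
  return half_open;
}
```

Calling match_registers ("([a]*)*", "a", 2, rm) followed by
count_half_open (rm, 2) would be expected to return a nonzero count on a
library that shows the regression reported above; what an unaffected
library prints for register 1 is not something I have verified here.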