From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher
To: Norihiro Tanaka , 34951@debbugs.gnu.org
References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp>
From: Paul Eggert
Organization: UCLA Computer Science Department
Message-ID: <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu>
Date: Wed, 11 Dec 2019 15:25:48 -0800
MIME-Version: 1.0
In-Reply-To: <20190323114902.E6F6.27F6AC2D@kcn.ne.jp>
Content-Type: multipart/mixed; boundary="------------D01426E8E1E2BC03E164985F"
Content-Language: en-US
List-Id: Gnulib discussion list
Cc: Aharon Robbins , bug-gnulib@gnu.org

This is a multi-part message in MIME format.
--------------D01426E8E1E2BC03E164985F
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

On 3/22/19 7:49 PM, Norihiro Tanaka wrote:
> Missing a patch for dfa. Re-send correct patch file.

Thanks, I installed the DFA-relevant parts of your proposed fix into
Gnulib. (The grep parts still need doing.) I also installed the attached
commentary followup.

While I was at it I installed a patch to fix an unlikely integer overflow
that I noticed while reviewing your fix.
I also installed some internal changes to prefer signed to unsigned
integers for indexes, as this should make future integer overflows easier
to catch. See:

https://lists.gnu.org/r/bug-gnulib/2019-12/msg00058.html
https://lists.gnu.org/r/bug-gnulib/2019-12/msg00059.html

I'd also like to change dfa.h's API to prefer ptrdiff_t to size_t, for
the same integer-overflow reason. This would be a (minor) API change so I
thought I'd ask first. Any objections?

PS. Arnold, the above discusses all the changes I know about for dfa.c
and dfa.h. The proposed API change (size_t->ptrdiff_t) could be installed
either before or after the next Gawk release.

--------------D01426E8E1E2BC03E164985F
Content-Type: text/x-patch; charset=UTF-8;
 name="0001-dfa-update-commentary-for-previous-change.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-dfa-update-commentary-for-previous-change.patch"

>From 360cbd3b17a314807e808626e100ef47dcf4d162 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Wed, 11 Dec 2019 13:40:01 -0800
Subject: [PATCH] dfa: update commentary for previous change

* NEWS: Mention the change.
* lib/dfa.c, lib/dfa.h (dfaparse, dfamust, dfacomp): Update comments.
---
 ChangeLog |  6 ++++++
 NEWS      |  4 ++++
 lib/dfa.c |  9 +++++----
 lib/dfa.h | 14 ++++++++------
 4 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f80f33b38..bc912c771 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2019-12-11  Paul Eggert
+
+	dfa: update commentary for previous change
+	* NEWS: Mention the change.
+	* lib/dfa.c, lib/dfa.h (dfaparse, dfamust, dfacomp): Update comments.
+
 2019-12-11  Norihiro Tanaka
 
 	dfa: separate parse and compile phase
diff --git a/NEWS b/NEWS
index 8085c353e..b73c9088a 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,10 @@ User visible incompatible changes
 
 Date        Modules         Changes
 
+2019-12-11  dfa             To call dfamust, one must now call dfaparse
+                            without yet calling dfacomp.  This fixes a bug
+                            introduced on 2018-10-22 that broke dfamust.
+
 2019-12-07  xstrtol         This module no longer defines the function
             xstrtoll        'xstrtol_fatal'.  Program that need this function
             xstrtoimax      should add the module 'xstrtol-error' to the list
diff --git a/lib/dfa.c b/lib/dfa.c
index 1e125b4d2..2347a91c1 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1966,9 +1966,8 @@ regexp (struct dfa *dfa)
     }
 }
 
-/* Main entry point for the parser.  S is a string to be parsed, len is the
-   length of the string, so s can include NUL characters.  D is a pointer to
-   the struct dfa to parse into.  */
+/* Parse a string S of length LEN into D.  S can include NUL characters.
+   This is the main entry point for the parser.  */
 void
 dfaparse (char const *s, size_t len, struct dfa *d)
 {
@@ -3741,7 +3740,9 @@ dfassbuild (struct dfa *d)
     }
 }
 
-/* Parse and analyze a single string of the given length.  */
+/* Parse a string S of length LEN into D (but skip this step if S is null).
+   Then analyze D and build a matcher for it.
+   SEARCHFLAG says whether to build a searching or an exact matcher.  */
 void
 dfacomp (char const *s, size_t len, struct dfa *d, bool searchflag)
 {
diff --git a/lib/dfa.h b/lib/dfa.h
index 221f7d172..bf87703e8 100644
--- a/lib/dfa.h
+++ b/lib/dfa.h
@@ -65,18 +65,20 @@ enum
 extern void dfasyntax (struct dfa *, struct localeinfo const *,
                        reg_syntax_t, int);
 
-/* Build and return the struct dfamust from the given struct dfa.  */
+/* Parse the given string of given length into the given struct dfa.  */
+extern void dfaparse (char const *, size_t, struct dfa *);
+
+/* Allocate and return a struct dfamust from a struct dfa that was
+   initialized by dfaparse and not yet given to dfacomp.  */
 extern struct dfamust *dfamust (struct dfa const *);
 
 /* Free the storage held by the components of a struct dfamust.  */
 extern void dfamustfree (struct dfamust *);
 
-/* Parse the given string of given length into the given struct dfa.  */
-extern void dfaparse (char const *, size_t, struct dfa *);
-
 /* Compile the given string of the given length into the given struct dfa.
-   Final argument is a flag specifying whether to build a searching or an
-   exact matcher.  */
+   The last argument says whether to build a searching or an exact matcher.
+   A null first argument means the struct dfa has already been
+   initialized by dfaparse; the second argument is ignored.  */
 extern void dfacomp (char const *, size_t, struct dfa *, bool);
 
 /* Search through a buffer looking for a match to the given struct dfa.
-- 
2.23.0

--------------D01426E8E1E2BC03E164985F--
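
The message above says that preferring signed to unsigned index types
should make integer overflows easier to catch. A minimal C sketch of that
reasoning follows. The helper names (prev_index_unsigned,
prev_index_signed, index_in_range, check_examples) are hypothetical and
are not part of dfa.c or dfa.h; this only illustrates the general idea,
not the code that was installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: step back one element using an unsigned index.
   When I is 0 the subtraction wraps around to SIZE_MAX.  This is
   well-defined behavior, so nothing flags it as a mistake.  */
size_t
prev_index_unsigned (size_t i)
{
  return i - 1;
}

/* Hypothetical helper: the same step using a signed index.
   When I is 0 the result is -1, an obviously invalid index.  */
ptrdiff_t
prev_index_signed (ptrdiff_t i)
{
  return i - 1;
}

/* Return true if I is a valid index into an array of N elements.
   With a signed type, a single comparison against 0 catches underflow.  */
bool
index_in_range (ptrdiff_t i, ptrdiff_t n)
{
  return 0 <= i && i < n;
}

/* Exercise the helpers above.  */
void
check_examples (void)
{
  /* The unsigned result looks like a huge but legal value.  */
  assert (prev_index_unsigned (0) == SIZE_MAX);

  /* The signed result is negative and is rejected by the range check.  */
  assert (prev_index_signed (0) == -1);
  assert (!index_in_range (prev_index_signed (0), 10));

  /* Ordinary in-range steps behave the same with either type.  */
  assert (index_in_range (prev_index_signed (5), 10));
}
```

Signed arithmetic overflow is undefined behavior in C, so a run-time
checker such as GCC's or Clang's -fsanitize=signed-integer-overflow can
report it, whereas unsigned wraparound is well defined and is not
reported by that check.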