bug-gnulib@gnu.org mirror (unofficial)
From: Paul Eggert <eggert@cs.ucla.edu>
To: Norihiro Tanaka <noritnk@kcn.ne.jp>, 34951@debbugs.gnu.org
Cc: Aharon Robbins <arnold@skeeve.com>, bug-gnulib@gnu.org
Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher
Date: Wed, 11 Dec 2019 15:25:48 -0800
Message-ID: <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu>
In-Reply-To: <20190323114902.E6F6.27F6AC2D@kcn.ne.jp>

[-- Attachment #1: Type: text/plain, Size: 1037 bytes --]

On 3/22/19 7:49 PM, Norihiro Tanaka wrote:
> Missing a patch for dfa.  Re-send correct patch file.

Thanks, I installed the DFA-relevant parts of your proposed fix into 
Gnulib. (The grep parts still need doing.) I also installed the attached 
commentary followup.

While I was at it I installed a patch to fix an unlikely integer 
overflow that I noticed while reviewing your fix. I also installed some 
internal changes to prefer signed to unsigned integers for indexes, as 
this should make future integer overflows easier to catch. See:

https://lists.gnu.org/r/bug-gnulib/2019-12/msg00058.html
https://lists.gnu.org/r/bug-gnulib/2019-12/msg00059.html

I'd also like to change dfa.h's API to prefer ptrdiff_t to size_t, for 
the same integer-overflow reason. This would be a (minor) API change so 
I thought I'd ask first. Any objections?
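
To illustrate the integer-overflow reasoning above, here is a minimal sketch. It is not code from dfa.c; the helper names prev_index_unsigned, prev_index_signed and index_is_valid are hypothetical and exist only for this example. It shows that stepping back from index 0 wraps silently to SIZE_MAX with size_t, whereas with ptrdiff_t it yields -1, which a simple range check rejects.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: step an unsigned index back by one.
   Unsigned arithmetic wraps, so an input of 0 yields SIZE_MAX,
   which still looks like a (huge) valid index.  */
static size_t
prev_index_unsigned (size_t i)
{
  return i - 1;
}

/* Hypothetical helper: the same step with a signed index.
   An input of 0 yields -1, which is visibly out of range.
   The caller must not pass PTRDIFF_MIN, since signed overflow
   is undefined behavior in C.  */
static ptrdiff_t
prev_index_signed (ptrdiff_t i)
{
  return i - 1;
}

/* Return true if I is a valid index into an array of N elements.  */
static bool
index_is_valid (ptrdiff_t i, ptrdiff_t n)
{
  return 0 <= i && i < n;
}
```

With the signed variant, index_is_valid (prev_index_signed (0), n) is false for any n, so the mistake is caught at the check. Signed overflow can also be trapped at run time by compiler options such as GCC's -fsanitize=undefined, whereas unsigned wraparound is well defined in C and is not covered by that option's default checks.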

PS. Arnold, the above discusses all the changes I know about for dfa.c 
and dfa.h. The proposed API change (size_t->ptrdiff_t) could be 
installed either before or after the next Gawk release.

[-- Attachment #2: 0001-dfa-update-commentary-for-previous-change.patch --]
[-- Type: text/x-patch, Size: 3855 bytes --]

From 360cbd3b17a314807e808626e100ef47dcf4d162 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 11 Dec 2019 13:40:01 -0800
Subject: [PATCH] dfa: update commentary for previous change

* NEWS: Mention the change.
* lib/dfa.c, lib/dfa.h (dfaparse, dfamust, dfacomp): Update comments.
---
 ChangeLog |  6 ++++++
 NEWS      |  4 ++++
 lib/dfa.c |  9 +++++----
 lib/dfa.h | 14 ++++++++------
 4 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f80f33b38..bc912c771 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2019-12-11  Paul Eggert  <eggert@cs.ucla.edu>
+
+	dfa: update commentary for previous change
+	* NEWS: Mention the change.
+	* lib/dfa.c, lib/dfa.h (dfaparse, dfamust, dfacomp): Update comments.
+
 2019-12-11  Norihiro Tanaka  <noritnk@kcn.ne.jp>
 
 	dfa: separate parse and compile phase
diff --git a/NEWS b/NEWS
index 8085c353e..b73c9088a 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,10 @@ User visible incompatible changes
 
 Date        Modules         Changes
 
+2019-12-11  dfa             To call dfamust, one must now call dfaparse
+                            without yet calling dfacomp.  This fixes a bug
+                            introduced on 2018-10-22 that broke dfamust.
+
 2019-12-07  xstrtol         This module no longer defines the function
             xstrtoll        'xstrtol_fatal'.  Program that need this function
             xstrtoimax      should add the module 'xstrtol-error' to the list
diff --git a/lib/dfa.c b/lib/dfa.c
index 1e125b4d2..2347a91c1 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1966,9 +1966,8 @@ regexp (struct dfa *dfa)
     }
 }
 
-/* Main entry point for the parser.  S is a string to be parsed, len is the
-   length of the string, so s can include NUL characters.  D is a pointer to
-   the struct dfa to parse into.  */
+/* Parse a string S of length LEN into D.  S can include NUL characters.
+   This is the main entry point for the parser.  */
 void
 dfaparse (char const *s, size_t len, struct dfa *d)
 {
@@ -3741,7 +3740,9 @@ dfassbuild (struct dfa *d)
     }
 }
 
-/* Parse and analyze a single string of the given length.  */
+/* Parse a string S of length LEN into D (but skip this step if S is null).
+   Then analyze D and build a matcher for it.
+   SEARCHFLAG says whether to build a searching or an exact matcher.  */
 void
 dfacomp (char const *s, size_t len, struct dfa *d, bool searchflag)
 {
diff --git a/lib/dfa.h b/lib/dfa.h
index 221f7d172..bf87703e8 100644
--- a/lib/dfa.h
+++ b/lib/dfa.h
@@ -65,18 +65,20 @@ enum
 extern void dfasyntax (struct dfa *, struct localeinfo const *,
                        reg_syntax_t, int);
 
-/* Build and return the struct dfamust from the given struct dfa. */
+/* Parse the given string of given length into the given struct dfa.  */
+extern void dfaparse (char const *, size_t, struct dfa *);
+
+/* Allocate and return a struct dfamust from a struct dfa that was
+   initialized by dfaparse and not yet given to dfacomp.  */
 extern struct dfamust *dfamust (struct dfa const *);
 
 /* Free the storage held by the components of a struct dfamust. */
 extern void dfamustfree (struct dfamust *);
 
-/* Parse the given string of given length into the given struct dfa.  */
-extern void dfaparse (char const *, size_t, struct dfa *);
-
 /* Compile the given string of the given length into the given struct dfa.
-   Final argument is a flag specifying whether to build a searching or an
-   exact matcher. */
+   The last argument says whether to build a searching or an exact matcher.
+   A null first argument means the struct dfa has already been
+   initialized by dfaparse; the second argument is ignored.  */
 extern void dfacomp (char const *, size_t, struct dfa *, bool);
 
 /* Search through a buffer looking for a match to the given struct dfa.
-- 
2.23.0

