bug-gnulib@gnu.org mirror (unofficial)
* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
       [not found] ` <20200417093536.875E.27F6AC2D@kcn.ne.jp>
@ 2020-04-17  1:24   ` Norihiro Tanaka
  2020-04-17 15:22     ` Norihiro Tanaka
  0 siblings, 1 reply; 11+ messages in thread
From: Norihiro Tanaka @ 2020-04-17  1:24 UTC (permalink / raw)
  To: 40634; +Cc: fryasu, Paul Eggert, bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 677 bytes --]


On Fri, 17 Apr 2020 09:35:36 +0900
Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:

> 
> On Thu, 16 Apr 2020 16:00:29 -0700
> Paul Eggert <eggert@cs.ucla.edu> wrote:
> 
> > On 4/16/20 3:53 PM, Norihiro Tanaka wrote:
> > 
> > > I have had no idea how to solve the problem yet.  If we revert it, bug#33357
> > > will come back.
> > 
> > Yes, I'd rather not revert if we can help it.
> > 
> > My own thought was to not analyze the regular expression if we discover that the input is empty. :-)
> 
> Now I have an idea: we build indexes of the epsilon nodes contained
> in the follow sets before removing the epsilon nodes.


I wrote a fix for the bug, but it is still slower than it was in grep 2.27.
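
The core of the patch below is an overwrap() helper that reports whether a
sorted position set and a sorted index array share any element.  As a rough
illustration only (not part of the patch), the same two-pointer test over
plain sorted int arrays, here called sorted_intersect(), looks like this:

  #include <stdbool.h>
  #include <stddef.h>

  /* Return true if the sorted arrays A (length NA) and B (length NB)
     have at least one element in common.  Both cursors only move
     forward, so the test is O(NA + NB).  */
  static bool
  sorted_intersect (int const *a, size_t na, int const *b, size_t nb)
  {
    size_t i = 0, j = 0;
    while (i < na && j < nb)
      {
        if (a[i] < b[j])
          i++;
        else if (b[j] < a[i])
          j++;
        else
          return true;  /* common element found */
      }
    return false;
  }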

[-- Attachment #2: 0001-dfa-build-auxiliary-indexes-before-remove-epsilon-cl.patch --]
[-- Type: text/plain, Size: 3184 bytes --]

From e5680a4c43b185df6b8c1b80423d663de6f6e37e Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <noritnk@kcn.ne.jp>
Date: Fri, 17 Apr 2020 10:12:01 +0900
Subject: [PATCH] dfa: build auxiliary indexes before remove epsilon closure

When removing an epsilon closure, the code so far searched for nodes that
include the epsilon closure by scanning all nodes sequentially, which is
slow in some cases.  Now build auxiliary indexes before searching.

Problem reported in: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40634

* lib/dfa.c (overwrap): New function.
(epsclosure): Build auxiliary indexes before removing the epsilon closure.
---
 lib/dfa.c |   52 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 9939d22..958069e 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2201,6 +2201,22 @@ replace (position_set *dst, idx_t del, position_set *add,
     }
 }
 
+static bool
+overwrap (position_set const *s, idx_t const *elems, idx_t nelem)
+{
+  idx_t i = 0, j = 0;
+
+  while (i < s->nelem && j < nelem)
+    if (s->elems[i].index < elems[j])
+      i++;
+    else if (s->elems[i].index > elems[j])
+      j++;
+    else
+      return true;
+
+  return false;
+}
+
 /* Find the index of the state corresponding to the given position set with
    the given preceding context, or create a new state if there is no such
    state.  Context tells whether we got here on a newline or letter.  */
@@ -2298,14 +2314,33 @@ static void
 epsclosure (struct dfa const *d)
 {
   position_set tmp;
+  idx_t *currs, *nexts;
+  idx_t ncurr = 0;
+  idx_t nnext = 0;
+
   alloc_position_set (&tmp, d->nleaves);
-  for (idx_t i = 0; i < d->tindex; i++)
-    if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
+  currs = xnmalloc (d->tindex, sizeof *currs);
+  nexts = xnmalloc (d->tindex, sizeof *nexts);
+
+  for (idx_t i = 0; i < d->tindex; i++) {
+     if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
         && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
         && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
+      currs[ncurr++] = i;
+  }
+
+  for (idx_t i = 0, j = 0; i < d->tindex; i++)
+    {
+      while (j < ncurr && currs[j] < i)
+        j++;
+      if (overwrap (&d->follows[i], currs, ncurr))
+        nexts[nnext++] = i;
+    }
+
+  for (idx_t i = 0; i < ncurr; i++)
       {
         unsigned int constraint;
-        switch (d->tokens[i])
+        switch (d->tokens[currs[i]])
           {
           case BEGLINE:
             constraint = BEGLINE_CONSTRAINT;
@@ -2330,13 +2365,16 @@ epsclosure (struct dfa const *d)
             break;
           }
 
-        delete (i, &d->follows[i]);
+        delete (i, &d->follows[currs[i]]);
 
-        for (idx_t j = 0; j < d->tindex; j++)
-          if (i != j && d->follows[j].nelem > 0)
-            replace (&d->follows[j], i, &d->follows[i], constraint, &tmp);
+        for (idx_t j = 0; j < nnext; j++)
+          replace (&d->follows[nexts[j]], currs[i], &d->follows[currs[i]],
+                  constraint, &tmp);
       }
+
   free (tmp.elems);
+  free (currs);
+  free (nexts);
 }
 
 /* Returns the set of contexts for which there is at least one
-- 
1.7.1



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-04-17  1:24   ` bug#40634: Massive pattern list handling with -E format seems very slow since 2.28 Norihiro Tanaka
@ 2020-04-17 15:22     ` Norihiro Tanaka
  2020-04-18 22:41       ` Norihiro Tanaka
  0 siblings, 1 reply; 11+ messages in thread
From: Norihiro Tanaka @ 2020-04-17 15:22 UTC (permalink / raw)
  To: 40634; +Cc: fryasu, Paul Eggert, bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 834 bytes --]


On Fri, 17 Apr 2020 10:24:42 +0900
Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:

> 
> On Fri, 17 Apr 2020 09:35:36 +0900
> Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> 
> > 
> > On Thu, 16 Apr 2020 16:00:29 -0700
> > Paul Eggert <eggert@cs.ucla.edu> wrote:
> > 
> > > On 4/16/20 3:53 PM, Norihiro Tanaka wrote:
> > > 
> > > > I have had no idea how to solve the problem yet.  If we revert it, bug#33357
> > > > will come back.
> > > 
> > > Yes, I'd rather not revert if we can help it.
> > > 
> > > My own thought was to not analyze the regular expression if we discover that the input is empty. :-)
> > 
> > Now I have an idea: we build indexes of the epsilon nodes contained
> > in the follow sets before removing the epsilon nodes.
> 
> 
> I wrote a fix for the bug, but it is still slower than it was in grep 2.27.

This improves on the previous patch.

[-- Attachment #2: 0001-dfa-build-auxiliary-indexes-before-remove-epsilon-cl_old.patch --]
[-- Type: text/plain, Size: 3184 bytes --]

From e5680a4c43b185df6b8c1b80423d663de6f6e37e Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <noritnk@kcn.ne.jp>
Date: Fri, 17 Apr 2020 10:12:01 +0900
Subject: [PATCH] dfa: build auxiliary indexes before remove epsilon closure

When removing an epsilon closure, the code so far searched for nodes that
include the epsilon closure by scanning all nodes sequentially, which is
slow in some cases.  Now build auxiliary indexes before searching.

Problem reported in: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40634

* lib/dfa.c (overwrap): New function.
(epsclosure): Build auxiliary indexes before removing the epsilon closure.
---
 lib/dfa.c |   52 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 9939d22..958069e 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2201,6 +2201,22 @@ replace (position_set *dst, idx_t del, position_set *add,
     }
 }
 
+static bool
+overwrap (position_set const *s, idx_t const *elems, idx_t nelem)
+{
+  idx_t i = 0, j = 0;
+
+  while (i < s->nelem && j < nelem)
+    if (s->elems[i].index < elems[j])
+      i++;
+    else if (s->elems[i].index > elems[j])
+      j++;
+    else
+      return true;
+
+  return false;
+}
+
 /* Find the index of the state corresponding to the given position set with
    the given preceding context, or create a new state if there is no such
    state.  Context tells whether we got here on a newline or letter.  */
@@ -2298,14 +2314,33 @@ static void
 epsclosure (struct dfa const *d)
 {
   position_set tmp;
+  idx_t *currs, *nexts;
+  idx_t ncurr = 0;
+  idx_t nnext = 0;
+
   alloc_position_set (&tmp, d->nleaves);
-  for (idx_t i = 0; i < d->tindex; i++)
-    if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
+  currs = xnmalloc (d->tindex, sizeof *currs);
+  nexts = xnmalloc (d->tindex, sizeof *nexts);
+
+  for (idx_t i = 0; i < d->tindex; i++) {
+     if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
         && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
         && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
+      currs[ncurr++] = i;
+  }
+
+  for (idx_t i = 0, j = 0; i < d->tindex; i++)
+    {
+      while (j < ncurr && currs[j] < i)
+        j++;
+      if (overwrap (&d->follows[i], currs, ncurr))
+        nexts[nnext++] = i;
+    }
+
+  for (idx_t i = 0; i < ncurr; i++)
       {
         unsigned int constraint;
-        switch (d->tokens[i])
+        switch (d->tokens[currs[i]])
           {
           case BEGLINE:
             constraint = BEGLINE_CONSTRAINT;
@@ -2330,13 +2365,16 @@ epsclosure (struct dfa const *d)
             break;
           }
 
-        delete (i, &d->follows[i]);
+        delete (i, &d->follows[currs[i]]);
 
-        for (idx_t j = 0; j < d->tindex; j++)
-          if (i != j && d->follows[j].nelem > 0)
-            replace (&d->follows[j], i, &d->follows[i], constraint, &tmp);
+        for (idx_t j = 0; j < nnext; j++)
+          replace (&d->follows[nexts[j]], currs[i], &d->follows[currs[i]],
+                  constraint, &tmp);
       }
+
   free (tmp.elems);
+  free (currs);
+  free (nexts);
 }
 
 /* Returns the set of contexts for which there is at least one
-- 
1.7.1



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-04-17 15:22     ` Norihiro Tanaka
@ 2020-04-18 22:41       ` Norihiro Tanaka
  2020-04-19  2:10         ` Norihiro Tanaka
  0 siblings, 1 reply; 11+ messages in thread
From: Norihiro Tanaka @ 2020-04-18 22:41 UTC (permalink / raw)
  To: 40634, fryasu; +Cc: Paul Eggert, bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1001 bytes --]


On Sat, 18 Apr 2020 00:22:26 +0900
Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:

> 
> On Fri, 17 Apr 2020 10:24:42 +0900
> Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> 
> > 
> > On Fri, 17 Apr 2020 09:35:36 +0900
> > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > 
> > > 
> > > On Thu, 16 Apr 2020 16:00:29 -0700
> > > Paul Eggert <eggert@cs.ucla.edu> wrote:
> > > 
> > > > On 4/16/20 3:53 PM, Norihiro Tanaka wrote:
> > > > 
> > > > > I have had no idea how to solve the problem yet.  If we revert it, bug#33357
> > > > > will come back.
> > > > 
> > > > Yes, I'd rather not revert if we can help it.
> > > > 
> > > > My own thought was to not analyze the regular expression if we discover that the input is empty. :-)
> > > 
> > > Now I have an idea: we build indexes of the epsilon nodes contained
> > > in the follow sets before removing the epsilon nodes.
> > 
> > 
> > I wrote a fix for the bug, but it is still slower than it was in grep 2.27.
> 
> This improves on the previous patch.

Sorry, the correct patch is here.

[-- Attachment #2: 0001-dfa-build-auxiliary-indexes-before-remove-epsilon-cl.patch --]
[-- Type: text/plain, Size: 4945 bytes --]

From 8357fa551c8a5a4f14540b250bc2485c4812a234 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <noritnk@kcn.ne.jp>
Date: Fri, 17 Apr 2020 10:12:01 +0900
Subject: [PATCH] dfa: build auxiliary indexes before remove epsilon closure

When removing an epsilon closure, the code so far searched for nodes that
include the epsilon closure by scanning all nodes sequentially, which is
slow in some cases.  Now build auxiliary indexes before searching.

Problem reported in: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40634

* lib/dfa.c (epsclosure): Build auxiliary indexes before removing the
epsilon closure.
---
 lib/dfa.c |  129 +++++++++++++++++++++++++++++++++++++++++++++----------------
 1 files changed, 95 insertions(+), 34 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 9939d22..5f90a92 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2298,45 +2298,106 @@ static void
 epsclosure (struct dfa const *d)
 {
   position_set tmp;
+  idx_t *currs, *backs;
+  idx_t ncurr = 0;
+  position_set *prevs;
+
   alloc_position_set (&tmp, d->nleaves);
+  currs = xnmalloc (d->nleaves, sizeof *currs);
+  backs = xnmalloc (d->tindex, sizeof *backs);
+
   for (idx_t i = 0; i < d->tindex; i++)
-    if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
-        && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
-        && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
-      {
-        unsigned int constraint;
-        switch (d->tokens[i])
-          {
-          case BEGLINE:
-            constraint = BEGLINE_CONSTRAINT;
-            break;
-          case ENDLINE:
-            constraint = ENDLINE_CONSTRAINT;
-            break;
-          case BEGWORD:
-            constraint = BEGWORD_CONSTRAINT;
-            break;
-          case ENDWORD:
-            constraint = ENDWORD_CONSTRAINT;
-            break;
-          case LIMWORD:
-            constraint = LIMWORD_CONSTRAINT;
-            break;
-          case NOTLIMWORD:
-            constraint = NOTLIMWORD_CONSTRAINT;
-            break;
-          default:
-            constraint = NO_CONSTRAINT;
-            break;
-          }
+    {
+      if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
+          && d->tokens[i] != ANYCHAR && d->tokens[i] != BEG
+          && d->tokens[i] != BACKREF && d->tokens[i] != MBCSET
+          && d->tokens[i] < CSET)
+        {
+          currs[ncurr] = i;
+          backs[i] = ncurr++;
+        }
+      else
+        backs[i] = -1;
+    }
 
-        delete (i, &d->follows[i]);
+  prevs = xnmalloc (ncurr, sizeof *prevs);
 
-        for (idx_t j = 0; j < d->tindex; j++)
-          if (i != j && d->follows[j].nelem > 0)
-            replace (&d->follows[j], i, &d->follows[i], constraint, &tmp);
-      }
+  for (idx_t i = 0; i < ncurr; i++)
+    {
+      prevs[i].elems = NULL;
+      prevs[i].nelem = 0;
+    }
+
+  for (idx_t i = 0; i < d->tindex; i++)
+    {
+      for (idx_t j = 0, k = 0; j < d->follows[i].nelem && k < ncurr;)
+        {
+          if (d->follows[i].elems[j].index < currs[k])
+            j++;
+          else if (currs[k] < d->follows[i].elems[j].index)
+            k++;
+          else
+            {
+              if (currs[k] != i)
+                {
+                  position p;
+                  p.index = i;
+                  p.constraint = NO_CONSTRAINT;
+                  if (prevs[k].nelem == 0)
+                    alloc_position_set (&prevs[k], 1);
+                  insert (p, &prevs[k]);
+                }
+              j++;
+              k++;
+            }
+        }
+    }
+
+  for (idx_t i = 0; i < ncurr; i++)
+    {
+      unsigned int constraint;
+      switch (d->tokens[currs[i]])
+        {
+        case BEGLINE:
+          constraint = BEGLINE_CONSTRAINT;
+          break;
+        case ENDLINE:
+          constraint = ENDLINE_CONSTRAINT;
+          break;
+        case BEGWORD:
+          constraint = BEGWORD_CONSTRAINT;
+          break;
+        case ENDWORD:
+          constraint = ENDWORD_CONSTRAINT;
+          break;
+        case LIMWORD:
+          constraint = LIMWORD_CONSTRAINT;
+          break;
+        case NOTLIMWORD:
+          constraint = NOTLIMWORD_CONSTRAINT;
+          break;
+        default:
+          constraint = NO_CONSTRAINT;
+          break;
+        }
+
+      delete (currs[i], &d->follows[currs[i]]);
+
+      for (idx_t j = 0; j < prevs[i].nelem; j++)
+        replace (&d->follows[prevs[i].elems[j].index], currs[i],
+                 &d->follows[currs[i]], constraint, &tmp);
+      for (idx_t j = 0; j < d->follows[currs[i]].nelem; j++)
+        if (backs[d->follows[currs[i]].elems[j].index] >= 0)
+          replace (&prevs[backs[d->follows[currs[i]].elems[j].index]],
+                   currs[i], &prevs[i], NO_CONSTRAINT, &tmp);
+    }
+
+  for (idx_t i = 0; i < ncurr; i++)
+    free (prevs[i].elems);
+  free (prevs);
   free (tmp.elems);
+  free (currs);
+  free (backs);
 }
 
 /* Returns the set of contexts for which there is at least one
-- 
1.7.1



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-04-18 22:41       ` Norihiro Tanaka
@ 2020-04-19  2:10         ` Norihiro Tanaka
  2020-09-11 12:47           ` Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Norihiro Tanaka @ 2020-04-19  2:10 UTC (permalink / raw)
  To: 40634; +Cc: fryasu, Paul Eggert, bug-gnulib

[-- Attachment #1: Type: text/plain, Size: 1413 bytes --]


On Sun, 19 Apr 2020 07:41:49 +0900
Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:

> 
> On Sat, 18 Apr 2020 00:22:26 +0900
> Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> 
> > 
> > On Fri, 17 Apr 2020 10:24:42 +0900
> > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > 
> > > 
> > > On Fri, 17 Apr 2020 09:35:36 +0900
> > > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > > 
> > > > 
> > > > On Thu, 16 Apr 2020 16:00:29 -0700
> > > > Paul Eggert <eggert@cs.ucla.edu> wrote:
> > > > 
> > > > > On 4/16/20 3:53 PM, Norihiro Tanaka wrote:
> > > > > 
> > > > > > I have had no idea how to solve the problem yet.  If we revert it, bug#33357
> > > > > > will come back.
> > > > > 
> > > > > Yes, I'd rather not revert if we can help it.
> > > > > 
> > > > > My own thought was to not analyze the regular expression if we discover that the input is empty. :-)
> > > > 
> > > > Now I have an idea: we build indexes of the epsilon nodes contained
> > > > in the follow sets before removing the epsilon nodes.
> > > 
> > > 
> > > I wrote a fix for the bug, but it is still slower than it was in grep 2.27.
> > 
> > This improves on the previous patch.
> 
> Sorry, the correct patch is here.

I made the previous patch even simpler.

before:

$ env LC_ALL=C time -p src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 7.24
user 7.14
sys 0.09

after:

$ env LC_ALL=C time -p src/grep -E -v -m1 -f grep-patterns.txt /dev/null
real 0.62
user 0.52
sys 0.10
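
The idea behind the patch below is to maintain, for every position i, a
"backword" set: the positions whose follow set contains i.  Splicing out an
epsilon position then only has to touch the follow sets of its recorded
predecessors instead of rescanning every position.  A rough sketch of that
bookkeeping, using a made-up idxlist type and ignoring constraints and
duplicate suppression (the real code uses dfa.c's position_set machinery):

  #include <stdlib.h>

  /* Tiny growable list of position indices; lists start zeroed.
     Error checking is omitted; this is a sketch only.  */
  typedef struct { int *elems; int nelem; } idxlist;

  static void
  idxlist_add (idxlist *s, int i)
  {
    s->elems = realloc (s->elems, (s->nelem + 1) * sizeof *s->elems);
    s->elems[s->nelem++] = i;
  }

  /* Record that J is in follow(I), keeping the backward map in sync:
     backward[J] lists every position whose follow set contains J.  */
  static void
  link_follow (idxlist *follow, idxlist *backward, int i, int j)
  {
    idxlist_add (&follow[i], j);
    idxlist_add (&backward[j], i);
  }

  /* Splice epsilon position E out of the graph: every predecessor P of E
     inherits follow(E).  Only E's actual predecessors are visited, not
     all positions.  (Removing E itself from follow(P) is omitted.)  */
  static void
  remove_epsilon (idxlist *follow, idxlist *backward, int e)
  {
    for (int k = 0; k < backward[e].nelem; k++)
      {
        int p = backward[e].elems[k];
        for (int j = 0; j < follow[e].nelem; j++)
          link_follow (follow, backward, p, follow[e].elems[j]);
      }
  }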

[-- Attachment #2: 0001-dfa-use-backword-set-in-removal-of-epsilon-closure.patch --]
[-- Type: text/plain, Size: 5628 bytes --]

From fffe326a7cd7d452f86ea51cae6fe08168a1391e Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <noritnk@kcn.ne.jp>
Date: Sun, 19 Apr 2020 11:01:18 +0900
Subject: [PATCH] dfa: use backword set in removal of epsilon closure

When removing an epsilon closure, the code so far searched for nodes that
include the epsilon closure by scanning all nodes sequentially, which is
slow in some cases.  Now build a backword set before searching, and check
only the previous positions recorded in that set.
Problem reported in: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40634

* lib/dfa.c (struct dfa): New member 'epsilon'.
(addtok_mb): Check whether a pattern has an epsilon node or not.
(epsclosure): When removing an epsilon node and reconnecting, check only
previous positions.
(dfaanalyze): Build backword set.
---
 lib/dfa.c |   65 +++++++++++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 9939d22..811c41e 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -486,6 +486,7 @@ struct dfa
   bool fast;			/* The DFA is fast.  */
   token utf8_anychar_classes[9]; /* To lower ANYCHAR in UTF-8 locales.  */
   mbstate_t mbs;		/* Multibyte conversion state.  */
+  bool epsilon;
 
   /* The following are valid only if MB_CUR_MAX > 1.  */
 
@@ -1613,13 +1614,21 @@ addtok_mb (struct dfa *dfa, token t, char mbprop)
       dfa->parse.depth--;
       break;
 
-    case BACKREF:
-      dfa->fast = false;
+    case BEGLINE:
+    case ENDLINE:
+    case BEGWORD:
+    case ENDWORD:
+    case LIMWORD:
+    case NOTLIMWORD:
+    case EMPTY:
+      dfa->epsilon = true;
       FALLTHROUGH;
+
     default:
-      dfa->nleaves++;
-      FALLTHROUGH;
-    case EMPTY:
+      if (t == BACKREF)
+        dfa->fast = false;
+      if (t != EMPTY)
+        dfa->nleaves++;
       dfa->parse.depth++;
       break;
     }
@@ -2295,14 +2304,15 @@ state_index (struct dfa *d, position_set const *s, int context)
    constraint.  Repeat exhaustively until no funny positions are left.
    S->elems must be large enough to hold the result.  */
 static void
-epsclosure (struct dfa const *d)
+epsclosure (struct dfa const *d, position_set *backword)
 {
   position_set tmp;
   alloc_position_set (&tmp, d->nleaves);
   for (idx_t i = 0; i < d->tindex; i++)
     if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
         && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
-        && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
+        && d->tokens[i] != MBCSET && d->tokens[i] < CSET
+        && d->tokens[i] != BEG)
       {
         unsigned int constraint;
         switch (d->tokens[i])
@@ -2325,16 +2335,21 @@ epsclosure (struct dfa const *d)
           case NOTLIMWORD:
             constraint = NOTLIMWORD_CONSTRAINT;
             break;
-          default:
+          case EMPTY:
             constraint = NO_CONSTRAINT;
             break;
+          default:
+            abort ();
           }
 
         delete (i, &d->follows[i]);
 
-        for (idx_t j = 0; j < d->tindex; j++)
-          if (i != j && d->follows[j].nelem > 0)
-            replace (&d->follows[j], i, &d->follows[i], constraint, &tmp);
+        for (idx_t j = 0; j < backword[i].nelem; j++)
+          replace (&d->follows[backword[i].elems[j].index], i, &d->follows[i],
+                   constraint, &tmp);
+        for (idx_t j = 0; j < d->follows[i].nelem; j++)
+          replace (&backword[d->follows[i].elems[j].index], i, &backword[i],
+                   NO_CONSTRAINT, &tmp);
       }
   free (tmp.elems);
 }
@@ -2641,6 +2656,7 @@ dfaanalyze (struct dfa *d, bool searchflag)
   /* Firstpos and lastpos elements.  */
   position *firstpos = posalloc;
   position *lastpos = firstpos + d->nleaves;
+  position_set *backword = NULL;
   position pos;
   position_set tmp;
 
@@ -2673,6 +2689,9 @@ dfaanalyze (struct dfa *d, bool searchflag)
   alloc_position_set (&merged, d->nleaves);
   d->follows = xcalloc (d->tindex, sizeof *d->follows);
 
+  if (d->epsilon)
+    backword = xcalloc (d->tindex, sizeof *backword);
+
   for (idx_t i = 0; i < d->tindex; i++)
     {
       switch (d->tokens[i])
@@ -2710,6 +2729,17 @@ dfaanalyze (struct dfa *d, bool searchflag)
         case CAT:
           /* Every element in the firstpos of the second argument is in the
              follow of every element in the lastpos of the first argument.  */
+          if (d->epsilon)
+            {
+              tmp.nelem = stk[-2].nlastpos;
+              tmp.elems = lastpos - stk[-1].nlastpos - stk[-2].nlastpos;
+              position *p = firstpos - stk[-1].nfirstpos;
+              for (idx_t j = 0; j < stk[-1].nfirstpos; j++)
+                {
+                  merge (&tmp, &backword[p[j].index], &merged);
+                  copy (&merged, &backword[p[j].index]);
+                }
+            }
           {
             tmp.nelem = stk[-1].nfirstpos;
             tmp.elems = firstpos - stk[-1].nfirstpos;
@@ -2799,9 +2829,15 @@ dfaanalyze (struct dfa *d, bool searchflag)
 #endif
     }
 
-  /* For each follow set that is the follow set of a real position, replace
-     it with its epsilon closure.  */
-  epsclosure (d);
+  if (d->epsilon)
+    {
+      /* For each follow set that is the follow set of a real position,
+         replace it with its epsilon closure.  */
+      epsclosure (d, backword);
+
+      for (idx_t i = 0; i < d->tindex; i++)
+        free (backword[i].elems);
+    }
 
   dfaoptimize (d);
 
@@ -2863,6 +2899,7 @@ dfaanalyze (struct dfa *d, bool searchflag)
   d->min_trcount++;
   d->trcount = 0;
 
+  free (backword);
   free (posalloc);
   free (stkalloc);
   free (merged.elems);
-- 
1.7.1



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-04-19  2:10         ` Norihiro Tanaka
@ 2020-09-11 12:47           ` Jim Meyering
  2020-09-11 13:17             ` Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2020-09-11 12:47 UTC (permalink / raw)
  To: Norihiro Tanaka, GNU grep developers
  Cc: fryasu, bug-gnulib@gnu.org List, Paul Eggert, 40634

On Sun, Apr 19, 2020 at 4:10 AM Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> On Sun, 19 Apr 2020 07:41:49 +0900
> Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > On Sat, 18 Apr 2020 00:22:26 +0900
> > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> >
> > >
> > > On Fri, 17 Apr 2020 10:24:42 +0900
> > > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > >
> > > >
> > > > On Fri, 17 Apr 2020 09:35:36 +0900
> > > > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > > >
> > > > >
> > > > > On Thu, 16 Apr 2020 16:00:29 -0700
> > > > > Paul Eggert <eggert@cs.ucla.edu> wrote:
> > > > >
> > > > > > On 4/16/20 3:53 PM, Norihiro Tanaka wrote:
> > > > > >
> > > > > > > I have had no idea how to solve the problem yet.  If we revert it, bug#33357
> > > > > > > will come back.
> > > > > >
> > > > > > Yes, I'd rather not revert if we can help it.
> > > > > >
> > > > > > My own thought was to not analyze the regular expression if we discover that the input is empty. :-)
> > > > >
> > > > > Now I have an idea: we build indexes of the epsilon nodes contained
> > > > > in the follow sets before removing the epsilon nodes.
> > > >
> > > >
> > > > I wrote a fix for the bug, but it is still slower than it was in grep 2.27.
> > >
> > > This improves on the previous patch.
> >
> > Sorry, the correct patch is here.
>
> I made the previous patch even simpler.
>
> before:
>
> $ env LC_ALL=C time -p src/grep -E -v -m1 -f grep-patterns.txt /dev/null
> real 7.24
> user 7.14
> sys 0.09
>
> after:
>
> $ env LC_ALL=C time -p src/grep -E -v -m1 -f grep-patterns.txt /dev/null
> real 0.62
> user 0.52
> sys 0.10

Thank you for this patch. I have rebased and made minor syntactic changes.
I'll push it to gnulib soon, if not today, then by Monday.

I am considering creating a test case in grep, but it feels too tight
to be feasible: I would use a relative perf test, requiring that a
passing test incur a perf cost of less than say 100x. Here's the
beginnings of my attempt (note: this is just an outline -- obviously
would not rely on having "time" in path or as a shell builtin):

gen()
{
  local n=$1
  local i=1
  while : ; do
    local pat=$(printf $i | sha1sum | cut -d' ' -f1)
    printf '%s\n' "$pat$pat(\$|$pat)"
    i=$(expr $i + 1)
    test $i = $n && break
  done
}

gen 4000 > pats-4000
head -400 pats-4000 > pats-400

# With fixed code, that a 10x input size increase (n=400 to 4000)
# induces a 40x runtime increase: .05 -> 2.0s
# Just prior to this change, it's 150x: 0.2 -> 30s

env LC_ALL=C time -p src/grep -E -v -m1 -f pats-400 /dev/null
env LC_ALL=C time -p src/grep -E -v -m1 -f pats-4000 /dev/null



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-09-11 12:47           ` Jim Meyering
@ 2020-09-11 13:17             ` Jim Meyering
  2020-09-11 23:01               ` Paul Eggert
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2020-09-11 13:17 UTC (permalink / raw)
  To: Norihiro Tanaka, GNU grep developers
  Cc: fryasu, bug-gnulib@gnu.org List, Paul Eggert, 40634

[-- Attachment #1: Type: text/plain, Size: 2839 bytes --]

On Fri, Sep 11, 2020 at 2:47 PM Jim Meyering <jim@meyering.net> wrote:
> On Sun, Apr 19, 2020 at 4:10 AM Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > On Sun, 19 Apr 2020 07:41:49 +0900
> > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > > On Sat, 18 Apr 2020 00:22:26 +0900
> > > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > >
> > > >
> > > > On Fri, 17 Apr 2020 10:24:42 +0900
> > > > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > > >
> > > > >
> > > > > On Fri, 17 Apr 2020 09:35:36 +0900
> > > > > Norihiro Tanaka <noritnk@kcn.ne.jp> wrote:
> > > > >
> > > > > >
> > > > > > On Thu, 16 Apr 2020 16:00:29 -0700
> > > > > > Paul Eggert <eggert@cs.ucla.edu> wrote:
> > > > > >
> > > > > > > On 4/16/20 3:53 PM, Norihiro Tanaka wrote:
> > > > > > >
> > > > > > > > I have had no idea how to solve the problem yet.  If we revert it, bug#33357
> > > > > > > > will come back.
> > > > > > >
> > > > > > > Yes, I'd rather not revert if we can help it.
> > > > > > >
> > > > > > > My own thought was to not analyze the regular expression if we discover that the input is empty. :-)
> > > > > >
> > > > > > Now I have an idea: we build indexes of the epsilon nodes contained
> > > > > > in the follow sets before removing the epsilon nodes.
> > > > >
> > > > >
> > > > > I wrote a fix for the bug, but it is still slower than it was in grep 2.27.
> > > >
> > > > This improves on the previous patch.
> > >
> > > Sorry, the correct patch is here.
> >
> > I made the previous patch even simpler.
> >
> > before:
> >
> > $ env LC_ALL=C time -p src/grep -E -v -m1 -f grep-patterns.txt /dev/null
> > real 7.24
> > user 7.14
> > sys 0.09
> >
> > after:
> >
> > $ env LC_ALL=C time -p src/grep -E -v -m1 -f grep-patterns.txt /dev/null
> > real 0.62
> > user 0.52
> > sys 0.10
>
> Thank you for this patch. I have rebased and made minor syntactic changes.
> I'll push it to gnulib soon, if not today, then by Monday.
>
> I am considering creating a test case in grep, but it feels too tight
> to be feasible: I would use a relative perf test, requiring that a
> passing test incur a perf cost of less than say 100x. Here's the
> beginnings of my attempt (note: this is just an outline -- obviously
> would not rely on having "time" in path or as a shell builtin):
>
> gen()
> {
>   local n=$1
>   local i=1
>   while : ; do
>     local pat=$(printf $i | sha1sum | cut -d' ' -f1)
>     printf '%s\n' "$pat$pat(\$|$pat)"
>     i=$(expr $i + 1)
>     test $i = $n && break
>   done
> }
>
> gen 4000 > pats-4000
> head -400 pats-4000 > pats-400
>
> # With fixed code, a 10x input size increase (n=400 to 4000)
> # induces a 40x runtime increase: 0.05s -> 2.0s
> # Just prior to this change, it's 150x: 0.2 -> 30s
>
> env LC_ALL=C time -p src/grep -E -v -m1 -f pats-400 /dev/null
> env LC_ALL=C time -p src/grep -E -v -m1 -f pats-4000 /dev/null

And here is the adjusted patch:

[-- Attachment #2: dfa.c-epsilon-node-removal-speedup.patch --]
[-- Type: application/octet-stream, Size: 4498 bytes --]

From bcd7930dba2f078da992660f6a14cde9e42b94c6 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering@fb.com>
Date: Fri, 11 Sep 2020 01:25:10 -0700
Subject: [PATCH] dfa: speed up epsilon-node removal

Build auxiliary indices before removing epsilon closure.
Before, when removing an epsilon closure, we would search for nodes
including that epsilon closure in all nodes sequentially, but that
could be very slow.  Now, build auxiliary indices before searching.
Reported in: https://bugs.gnu.org/40634

* lib/dfa.c (overwrap): New function.
(epsclosure): Build auxiliary indices before removing any
epsilon closure; use them to speed up that process.
---
 lib/dfa.c | 107 +++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 73 insertions(+), 34 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 1f0587a7a..c4300dfb5 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2203,6 +2203,22 @@ replace (position_set *dst, idx_t del, position_set *add,
     }
 }

+static bool _GL_ATTRIBUTE_PURE
+overwrap (position_set const *s, idx_t const *elems, idx_t nelem)
+{
+  idx_t i = 0, j = 0;
+
+  while (i < s->nelem && j < nelem)
+    if (s->elems[i].index < elems[j])
+      i++;
+    else if (s->elems[i].index > elems[j])
+      j++;
+    else
+      return true;
+
+  return false;
+}
+
 /* Find the index of the state corresponding to the given position set with
    the given preceding context, or create a new state if there is no such
    state.  Context tells whether we got here on a newline or letter.  */
@@ -2300,45 +2316,68 @@ static void
 epsclosure (struct dfa const *d)
 {
   position_set tmp;
+  idx_t *currs, *nexts;
+  idx_t ncurr = 0;
+  idx_t nnext = 0;
+
   alloc_position_set (&tmp, d->nleaves);
+  currs = xnmalloc (d->tindex, sizeof *currs);
+  nexts = xnmalloc (d->tindex, sizeof *nexts);
+
   for (idx_t i = 0; i < d->tindex; i++)
-    if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
-        && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
-        && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
-      {
-        unsigned int constraint;
-        switch (d->tokens[i])
-          {
-          case BEGLINE:
-            constraint = BEGLINE_CONSTRAINT;
-            break;
-          case ENDLINE:
-            constraint = ENDLINE_CONSTRAINT;
-            break;
-          case BEGWORD:
-            constraint = BEGWORD_CONSTRAINT;
-            break;
-          case ENDWORD:
-            constraint = ENDWORD_CONSTRAINT;
-            break;
-          case LIMWORD:
-            constraint = LIMWORD_CONSTRAINT;
-            break;
-          case NOTLIMWORD:
-            constraint = NOTLIMWORD_CONSTRAINT;
-            break;
-          default:
-            constraint = NO_CONSTRAINT;
-            break;
-          }
+    {
+      if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
+          && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
+          && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
+        currs[ncurr++] = i;
+    }

-        delete (i, &d->follows[i]);
+  for (idx_t i = 0, j = 0; i < d->tindex; i++)
+    {
+      while (j < ncurr && currs[j] < i)
+        j++;
+      if (overwrap (&d->follows[i], currs, ncurr))
+        nexts[nnext++] = i;
+    }
+
+  for (idx_t i = 0; i < ncurr; i++)
+    {
+      unsigned int constraint;
+      switch (d->tokens[currs[i]])
+        {
+        case BEGLINE:
+          constraint = BEGLINE_CONSTRAINT;
+          break;
+        case ENDLINE:
+          constraint = ENDLINE_CONSTRAINT;
+          break;
+        case BEGWORD:
+          constraint = BEGWORD_CONSTRAINT;
+          break;
+        case ENDWORD:
+          constraint = ENDWORD_CONSTRAINT;
+          break;
+        case LIMWORD:
+          constraint = LIMWORD_CONSTRAINT;
+          break;
+        case NOTLIMWORD:
+          constraint = NOTLIMWORD_CONSTRAINT;
+          break;
+        default:
+          constraint = NO_CONSTRAINT;
+          break;
+        }
+
+      delete (i, &d->follows[currs[i]]);
+
+      for (idx_t j = 0; j < nnext; j++)
+        replace (&d->follows[nexts[j]], currs[i], &d->follows[currs[i]],
+                 constraint, &tmp);
+    }

-        for (idx_t j = 0; j < d->tindex; j++)
-          if (i != j && d->follows[j].nelem > 0)
-            replace (&d->follows[j], i, &d->follows[i], constraint, &tmp);
-      }
   free (tmp.elems);
+  free (currs);
+  free (nexts);
 }

 /* Returns the set of contexts for which there is at least one
-- 
2.28.0.rc0



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-09-11 13:17             ` Jim Meyering
@ 2020-09-11 23:01               ` Paul Eggert
  2020-09-12  6:41                 ` Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Eggert @ 2020-09-11 23:01 UTC (permalink / raw)
  To: Jim Meyering, Norihiro Tanaka, GNU grep developers
  Cc: fryasu, Gnulib bugs, 40634

[-- Attachment #1: Type: text/plain, Size: 319 bytes --]

> And here is the adjusted patch:

Hold on, that looks like a cleanup of the April 18 patch posted here:

https://bugs.gnu.org/40634#26

But there's a later patch dated April 19, which Norihiro Tanaka said should be 
more correct and simpler:

https://bugs.gnu.org/40634#32

I'll try to take a look at the later patch.

[-- Attachment #2: 0001-dfa-minor-improvements-of-previous-change.patch --]
[-- Type: text/x-patch, Size: 3136 bytes --]

From a185ed4e6341b5c6aab3e3d950aafeae9eafe4cc Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 11 Sep 2020 15:46:14 -0700
Subject: [PATCH] dfa: minor improvements of previous change

* lib/dfa.c (epsclosure): Use one xnmalloc call to allocate
CURRS and NEXTS, to lessen pressure on the heap allocator.
Assign unconditionally in a couple of places, to help branch
prediction.  Redo comparison order to help the compiler.
Omit unnecessary index variable J.
---
 ChangeLog |  9 +++++++++
 lib/dfa.c | 28 ++++++++++++++--------------
 2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bf39cc512..da75e4757 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2020-09-11  Paul Eggert  <eggert@cs.ucla.edu>
+
+	dfa: minor improvements of previous change
+	* lib/dfa.c (epsclosure): Use one xnmalloc call to allocate
+	CURRS and NEXTS, to lessen pressure on the heap allocator.
+	Assign unconditionally in a couple of places, to help branch
+	prediction.  Redo comparison order to help the compiler.
+	Omit unnecessary index variable J.
+
 2020-09-10  Paul Eggert  <eggert@cs.ucla.edu>
 
 	canonicalize: fix pointer indexing bugs
diff --git a/lib/dfa.c b/lib/dfa.c
index c4300dfb5..df6399e45 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2203,6 +2203,8 @@ replace (position_set *dst, idx_t del, position_set *add,
     }
 }
 
+/* Return true if any position in S has an index equal to any element
+   of ELEMS, which has NELEM elements.  */
 static bool _GL_ATTRIBUTE_PURE
 overwrap (position_set const *s, idx_t const *elems, idx_t nelem)
 {
@@ -2316,28 +2318,27 @@ static void
 epsclosure (struct dfa const *d)
 {
   position_set tmp;
-  idx_t *currs, *nexts;
   idx_t ncurr = 0;
   idx_t nnext = 0;
+  idx_t tindex = d->tindex;
 
   alloc_position_set (&tmp, d->nleaves);
-  currs = xnmalloc (d->tindex, sizeof *currs);
-  nexts = xnmalloc (d->tindex, sizeof *nexts);
 
-  for (idx_t i = 0; i < d->tindex; i++)
+  idx_t *currs = xnmalloc (tindex, 2 * sizeof *currs);
+  for (idx_t i = 0; i < tindex; i++)
     {
-      if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
-          && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
-          && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
-        currs[ncurr++] = i;
+      currs[ncurr] = i;
+      ncurr += (d->follows[i].nelem > 0
+                && NOTCHAR <= d->tokens[i] && d->tokens[i] < CSET
+                && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
+                && d->tokens[i] != MBCSET);
     }
 
-  for (idx_t i = 0, j = 0; i < d->tindex; i++)
+  idx_t *nexts = currs + ncurr;
+  for (idx_t i = 0; i < tindex; i++)
     {
-      while (j < ncurr && currs[j] < i)
-        j++;
-      if (overwrap (&d->follows[i], currs, ncurr))
-        nexts[nnext++] = i;
+      nexts[nnext] = i;
+      nnext += overwrap (&d->follows[i], currs, ncurr);
     }
 
   for (idx_t i = 0; i < ncurr; i++)
@@ -2377,7 +2378,6 @@ epsclosure (struct dfa const *d)
 
   free (tmp.elems);
   free (currs);
-  free (nexts);
 }
 
 /* Returns the set of contexts for which there is at least one
-- 
2.17.1
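
One of the tweaks described above, "assign unconditionally ... to help branch
prediction", replaces a store guarded by an if with an unconditional store
plus a boolean-valued increment of the count.  A generic sketch of that
idiom (not code from the patch; filter() and keep() are invented for the
example):

  #include <stdbool.h>
  #include <stddef.h>

  /* Copy into OUT the elements of IN (length LEN) that satisfy KEEP,
     returning how many were kept.  OUT must have room for LEN items.
     The slot is written on every iteration; the count advances only
     when KEEP returns true, so there is no branch around the store.  */
  static size_t
  filter (int const *in, size_t len, int *out, bool (*keep) (int))
  {
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
      {
        out[n] = in[i];
        n += keep (in[i]);
      }
    return n;
  }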



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-09-11 23:01               ` Paul Eggert
@ 2020-09-12  6:41                 ` Jim Meyering
  2020-09-14  2:03                   ` Paul Eggert
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2020-09-12  6:41 UTC (permalink / raw)
  To: Paul Eggert
  Cc: fryasu, Gnulib bugs, 40634, Norihiro Tanaka, GNU grep developers

On Sat, Sep 12, 2020 at 1:01 AM Paul Eggert <eggert@cs.ucla.edu> wrote:
> > And here is the adjusted patch:
>
> Hold on, that looks like a cleanup of the April 18 patch posted here:
>
> https://bugs.gnu.org/40634#26
>
> But there's a later patch dated April 19, which Norihiro Tanaka said should be
> more correct and simpler:
>
> https://bugs.gnu.org/40634#32
>
> I'll try to take a look at the later patch.

Oh! Glad you spotted that.



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-09-12  6:41                 ` Jim Meyering
@ 2020-09-14  2:03                   ` Paul Eggert
  2020-09-14 14:14                     ` Jim Meyering
  2020-09-21 19:22                     ` Paul Eggert
  0 siblings, 2 replies; 11+ messages in thread
From: Paul Eggert @ 2020-09-14  2:03 UTC (permalink / raw)
  To: Jim Meyering
  Cc: fryasu, Gnulib bugs, 40634, Norihiro Tanaka, GNU grep developers

[-- Attachment #1: Type: text/plain, Size: 1380 bytes --]

On 9/11/20 11:41 PM, Jim Meyering wrote:

>> https://bugs.gnu.org/40634#32
>>
>> I'll try to take a look at the later patch.
> 
> Oh! Glad you spotted that.

I took a look and the basic idea sounds good though I admit I did not check 
every detail. While looking into it I found some opportunities for improvements, 
plus I found what appear to be some longstanding bugs in the area, one of which 
causes a grep test failure on Solaris (and I suspect the bug is also on 
GNU/Linux but the grep tests don't catch it). I installed the attached patches 
into Gnulib, updated grep to point to the new Gnulib version, and added a note 
in grep's NEWS file about this.

Patch 1 is what Norihiro Tanaka proposed in Bug#40634#32, except I edited the 
commit message. Patch 2 consists of minor cleanups and performance tweaks for 
Patch 1. (Patches 3 and 4 are omitted as they were installed by others into 
Gnulib at about the same time I was installing these.) Patch 5 fixes a 
dfa-heap-overrun failure on Solaris that appears to be a longstanding bug 
exposed by Patch 1 when running on Solaris. Patch 6 merely cleans up code near 
Patch 5. Patch 7 fixes the use of an uninitialized constraint, which I 
discovered while debugging Patch 5 under Valgrind; this also appears to be a 
longstandiung bug.

Coming up with test cases for all these bugs would be pretty tricky, unfortunately.

[-- Attachment #2: 0001-dfa-use-backward-set-in-removal-of-epsilon-closure.patch --]
[-- Type: text/x-patch, Size: 5655 bytes --]

From da0e8454a8e68035ef4b87dbb9097f85df6ece27 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <noritnk@kcn.ne.jp>
Date: Sat, 12 Sep 2020 18:51:55 -0700
Subject: [PATCH 1/7] dfa: use backward set in removal of epsilon closure

When removing in epsilon closure, the code searched all nodes
sequentially, and this was slow for some cases.  Build a backward
set before search, and only check previous position with the set.
Problem reported in <https://bugs.gnu.org/40634>.
* lib/dfa.c (struct dfa): New member 'epsilon'.
(addtok_mb): Check whether a pattern has epsilon node or not.
(epsclosure): New arg BACKWORD; caller changed.  When removing
epsilon node and reconnecting, check only previous positions.
Treat BEG as if it were character.
(dfaanalyze): Build backward set.
---
 lib/dfa.c | 65 +++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 51 insertions(+), 14 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 1f0587a7a..7c8eb6685 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -488,6 +488,7 @@ struct dfa
   bool fast;			/* The DFA is fast.  */
   token utf8_anychar_classes[9]; /* To lower ANYCHAR in UTF-8 locales.  */
   mbstate_t mbs;		/* Multibyte conversion state.  */
+  bool epsilon;
 
   /* The following are valid only if MB_CUR_MAX > 1.  */
 
@@ -1615,13 +1616,21 @@ addtok_mb (struct dfa *dfa, token t, char mbprop)
       dfa->parse.depth--;
       break;
 
-    case BACKREF:
-      dfa->fast = false;
+    case BEGLINE:
+    case ENDLINE:
+    case BEGWORD:
+    case ENDWORD:
+    case LIMWORD:
+    case NOTLIMWORD:
+    case EMPTY:
+      dfa->epsilon = true;
       FALLTHROUGH;
+
     default:
-      dfa->nleaves++;
-      FALLTHROUGH;
-    case EMPTY:
+      if (t == BACKREF)
+        dfa->fast = false;
+      if (t != EMPTY)
+        dfa->nleaves++;
       dfa->parse.depth++;
       break;
     }
@@ -2297,14 +2306,15 @@ state_index (struct dfa *d, position_set const *s, int context)
    constraint.  Repeat exhaustively until no funny positions are left.
    S->elems must be large enough to hold the result.  */
 static void
-epsclosure (struct dfa const *d)
+epsclosure (struct dfa const *d, position_set *backword)
 {
   position_set tmp;
   alloc_position_set (&tmp, d->nleaves);
   for (idx_t i = 0; i < d->tindex; i++)
     if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
         && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
-        && d->tokens[i] != MBCSET && d->tokens[i] < CSET)
+        && d->tokens[i] != MBCSET && d->tokens[i] < CSET
+        && d->tokens[i] != BEG)
       {
         unsigned int constraint;
         switch (d->tokens[i])
@@ -2327,16 +2337,21 @@ epsclosure (struct dfa const *d)
           case NOTLIMWORD:
             constraint = NOTLIMWORD_CONSTRAINT;
             break;
-          default:
+          case EMPTY:
             constraint = NO_CONSTRAINT;
             break;
+          default:
+            abort ();
           }
 
         delete (i, &d->follows[i]);
 
-        for (idx_t j = 0; j < d->tindex; j++)
-          if (i != j && d->follows[j].nelem > 0)
-            replace (&d->follows[j], i, &d->follows[i], constraint, &tmp);
+        for (idx_t j = 0; j < backword[i].nelem; j++)
+          replace (&d->follows[backword[i].elems[j].index], i, &d->follows[i],
+                   constraint, &tmp);
+        for (idx_t j = 0; j < d->follows[i].nelem; j++)
+          replace (&backword[d->follows[i].elems[j].index], i, &backword[i],
+                   NO_CONSTRAINT, &tmp);
       }
   free (tmp.elems);
 }
@@ -2643,6 +2658,7 @@ dfaanalyze (struct dfa *d, bool searchflag)
   /* Firstpos and lastpos elements.  */
   position *firstpos = posalloc;
   position *lastpos = firstpos + d->nleaves;
+  position_set *backword = NULL;
   position pos;
   position_set tmp;
 
@@ -2675,6 +2691,9 @@ dfaanalyze (struct dfa *d, bool searchflag)
   alloc_position_set (&merged, d->nleaves);
   d->follows = xcalloc (d->tindex, sizeof *d->follows);
 
+  if (d->epsilon)
+    backword = xcalloc (d->tindex, sizeof *backword);
+
   for (idx_t i = 0; i < d->tindex; i++)
     {
       switch (d->tokens[i])
@@ -2712,6 +2731,17 @@ dfaanalyze (struct dfa *d, bool searchflag)
         case CAT:
           /* Every element in the firstpos of the second argument is in the
              follow of every element in the lastpos of the first argument.  */
+          if (d->epsilon)
+            {
+              tmp.nelem = stk[-2].nlastpos;
+              tmp.elems = lastpos - stk[-1].nlastpos - stk[-2].nlastpos;
+              position *p = firstpos - stk[-1].nfirstpos;
+              for (idx_t j = 0; j < stk[-1].nfirstpos; j++)
+                {
+                  merge (&tmp, &backword[p[j].index], &merged);
+                  copy (&merged, &backword[p[j].index]);
+                }
+            }
           {
             tmp.nelem = stk[-1].nfirstpos;
             tmp.elems = firstpos - stk[-1].nfirstpos;
@@ -2801,9 +2831,15 @@ dfaanalyze (struct dfa *d, bool searchflag)
 #endif
     }
 
-  /* For each follow set that is the follow set of a real position, replace
-     it with its epsilon closure.  */
-  epsclosure (d);
+  if (d->epsilon)
+    {
+      /* For each follow set that is the follow set of a real position,
+         replace it with its epsilon closure.  */
+      epsclosure (d, backword);
+
+      for (idx_t i = 0; i < d->tindex; i++)
+        free (backword[i].elems);
+    }
 
   dfaoptimize (d);
 
@@ -2865,6 +2901,7 @@ dfaanalyze (struct dfa *d, bool searchflag)
   d->min_trcount++;
   d->trcount = 0;
 
+  free (backword);
   free (posalloc);
   free (stkalloc);
   free (merged.elems);
-- 
2.17.1


[-- Attachment #3: 0002-dfa-epsilon-closure-tweaks-Bug-40634.patch --]
[-- Type: text/x-patch, Size: 12491 bytes --]

From 4e14bef83b3c8096efebe343069baf13277337bc Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 12 Sep 2020 18:51:55 -0700
Subject: [PATCH 2/7] dfa: epsilon-closure tweaks (Bug#40634)

Rename BACKWORD to BACKWARD consistently.
* lib/dfa.c (struct dfa): Reorder members to reduce fragmentation.
(addtok_mb): Redo slightly to make it act more like a state machine.
Check depth only when it increases.
(epsclosure): Let the switch test the tokens.
(dfaanalyze): Cache tindex.  Simplify position loops.
Prefer xcalloc to xnmalloc + explicit zeroing.  Free BACKWARD
only if it is not null, since we're testing that anyway.
(dfaanalyze, build_state): Use merge2 instead of doing it by hand.
---
 ChangeLog |  27 ++++++++++++
 lib/dfa.c | 124 ++++++++++++++++++++++++++----------------------------
 2 files changed, 87 insertions(+), 64 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bf39cc512..66aec61cb 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,30 @@
+2020-09-12  Paul Eggert  <eggert@cs.ucla.edu>
+
+	dfa: epsilon-closure tweaks (Bug#40634)
+	Rename BACKWORD to BACKWARD consistently.
+	* lib/dfa.c (struct dfa): Reorder members to reduce fragmentation.
+	(addtok_mb): Redo slightly to make it act more like a state machine.
+	Check depth only when it increases.
+	(epsclosure): Let the switch test the tokens.
+	(dfaanalyze): Cache tindex.  Simplify position loops.
+	Prefer xcalloc to xnmalloc + explicit zeroing.  Free BACKWARD
+	only if it is not null, since we're testing that anyway.
+	(dfaanalyze, build_state): Use merge2 instead of doing it by hand.
+
+2020-09-12  Norihiro Tanaka  <noritnk@kcn.ne.jp>
+
+	dfa: use backward set in removal of epsilon closure
+	When removing in epsilon closure, the code searched all nodes
+	sequentially, and this was slow for some cases.  Build a backward
+	set before search, and only check previous position with the set.
+	Problem reported in <https://bugs.gnu.org/40634>.
+	* lib/dfa.c (struct dfa): New member 'epsilon'.
+	(addtok_mb): Check whether a pattern has epsilon node or not.
+	(epsclosure): New arg BACKWORD; caller changed.  When removing
+	epsilon node and reconnecting, check only previous positions.
+	Treat BEG as if it were character.
+	(dfaanalyze): Build backward set.
+
 2020-09-10  Paul Eggert  <eggert@cs.ucla.edu>
 
 	canonicalize: fix pointer indexing bugs
diff --git a/lib/dfa.c b/lib/dfa.c
index 7c8eb6685..0f0764887 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -482,13 +482,14 @@ struct dfa
   idx_t depth;			/* Depth required of an evaluation stack
                                    used for depth-first traversal of the
                                    parse tree.  */
-  idx_t nleaves;		/* Number of leaves on the parse tree.  */
+  idx_t nleaves;		/* Number of non-EMPTY leaves
+                                   in the parse tree.  */
   idx_t nregexps;		/* Count of parallel regexps being built
                                    with dfaparse.  */
   bool fast;			/* The DFA is fast.  */
+  bool epsilon;			/* Does a token match only the empty string?  */
   token utf8_anychar_classes[9]; /* To lower ANYCHAR in UTF-8 locales.  */
   mbstate_t mbs;		/* Multibyte conversion state.  */
-  bool epsilon;
 
   /* The following are valid only if MB_CUR_MAX > 1.  */
 
@@ -1616,26 +1617,31 @@ addtok_mb (struct dfa *dfa, token t, char mbprop)
       dfa->parse.depth--;
       break;
 
+    case EMPTY:
+      dfa->epsilon = true;
+      goto increment_depth;
+
+    case BACKREF:
+      dfa->fast = false;
+      goto increment_nleaves;
+
     case BEGLINE:
     case ENDLINE:
     case BEGWORD:
     case ENDWORD:
     case LIMWORD:
     case NOTLIMWORD:
-    case EMPTY:
       dfa->epsilon = true;
       FALLTHROUGH;
-
     default:
-      if (t == BACKREF)
-        dfa->fast = false;
-      if (t != EMPTY)
-        dfa->nleaves++;
+    increment_nleaves:
+      dfa->nleaves++;
+    increment_depth:
       dfa->parse.depth++;
+      if (dfa->depth < dfa->parse.depth)
+        dfa->depth = dfa->parse.depth;
       break;
     }
-  if (dfa->parse.depth > dfa->depth)
-    dfa->depth = dfa->parse.depth;
 }
 
 static void addtok_wc (struct dfa *dfa, wint_t wc);
@@ -2156,6 +2162,8 @@ merge (position_set const *s1, position_set const *s2, position_set *m)
   merge_constrained (s1, s2, -1, m);
 }
 
+/* Merge into DST all the elements of SRC, possibly destroying
+   the contents of the temporary M.  */
 static void
 merge2 (position_set *dst, position_set const *src, position_set *m)
 {
@@ -2300,25 +2308,26 @@ state_index (struct dfa *d, position_set const *s, int context)
   return i;
 }
 
-/* Find the epsilon closure of a set of positions.  If any position of the set
+/* Find the epsilon closure of D's set of positions.  If any position of the set
    contains a symbol that matches the empty string in some context, replace
    that position with the elements of its follow labeled with an appropriate
    constraint.  Repeat exhaustively until no funny positions are left.
-   S->elems must be large enough to hold the result.  */
+   S->elems must be large enough to hold the result.  BACKWARD is D's
+   backward set; use and update it too.  */
 static void
-epsclosure (struct dfa const *d, position_set *backword)
+epsclosure (struct dfa const *d, position_set *backward)
 {
   position_set tmp;
   alloc_position_set (&tmp, d->nleaves);
   for (idx_t i = 0; i < d->tindex; i++)
-    if (d->follows[i].nelem > 0 && d->tokens[i] >= NOTCHAR
-        && d->tokens[i] != BACKREF && d->tokens[i] != ANYCHAR
-        && d->tokens[i] != MBCSET && d->tokens[i] < CSET
-        && d->tokens[i] != BEG)
+    if (0 < d->follows[i].nelem)
       {
         unsigned int constraint;
         switch (d->tokens[i])
           {
+          default:
+            continue;
+
           case BEGLINE:
             constraint = BEGLINE_CONSTRAINT;
             break;
@@ -2340,17 +2349,15 @@ epsclosure (struct dfa const *d, position_set *backword)
           case EMPTY:
             constraint = NO_CONSTRAINT;
             break;
-          default:
-            abort ();
           }
 
         delete (i, &d->follows[i]);
 
-        for (idx_t j = 0; j < backword[i].nelem; j++)
-          replace (&d->follows[backword[i].elems[j].index], i, &d->follows[i],
+        for (idx_t j = 0; j < backward[i].nelem; j++)
+          replace (&d->follows[backward[i].elems[j].index], i, &d->follows[i],
                    constraint, &tmp);
         for (idx_t j = 0; j < d->follows[i].nelem; j++)
-          replace (&backword[d->follows[i].elems[j].index], i, &backword[i],
+          replace (&backward[d->follows[i].elems[j].index], i, &backward[i],
                    NO_CONSTRAINT, &tmp);
       }
   free (tmp.elems);
@@ -2658,7 +2665,6 @@ dfaanalyze (struct dfa *d, bool searchflag)
   /* Firstpos and lastpos elements.  */
   position *firstpos = posalloc;
   position *lastpos = firstpos + d->nleaves;
-  position_set *backword = NULL;
   position pos;
   position_set tmp;
 
@@ -2676,10 +2682,11 @@ dfaanalyze (struct dfa *d, bool searchflag)
   position_set merged;          /* Result of merging sets.  */
 
   addtok (d, CAT);
+  idx_t tindex = d->tindex;
 
 #ifdef DEBUG
   fprintf (stderr, "dfaanalyze:\n");
-  for (idx_t i = 0; i < d->tindex; i++)
+  for (idx_t i = 0; i < tindex; i++)
     {
       fprintf (stderr, " %td:", i);
       prtok (d->tokens[i]);
@@ -2689,12 +2696,11 @@ dfaanalyze (struct dfa *d, bool searchflag)
 
   d->searchflag = searchflag;
   alloc_position_set (&merged, d->nleaves);
-  d->follows = xcalloc (d->tindex, sizeof *d->follows);
-
-  if (d->epsilon)
-    backword = xcalloc (d->tindex, sizeof *backword);
+  d->follows = xcalloc (tindex, sizeof *d->follows);
+  position_set *backward
+    = d->epsilon ? xcalloc (tindex, sizeof *backward) : NULL;
 
-  for (idx_t i = 0; i < d->tindex; i++)
+  for (idx_t i = 0; i < tindex; i++)
     {
       switch (d->tokens[i])
         {
@@ -2714,12 +2720,8 @@ dfaanalyze (struct dfa *d, bool searchflag)
           {
             tmp.elems = firstpos - stk[-1].nfirstpos;
             tmp.nelem = stk[-1].nfirstpos;
-            position *p = lastpos - stk[-1].nlastpos;
-            for (idx_t j = 0; j < stk[-1].nlastpos; j++)
-              {
-                merge (&tmp, &d->follows[p[j].index], &merged);
-                copy (&merged, &d->follows[p[j].index]);
-              }
+            for (position *p = lastpos - stk[-1].nlastpos; p < lastpos; p++)
+              merge2 (&d->follows[p->index], &tmp, &merged);
           }
           FALLTHROUGH;
         case QMARK:
@@ -2729,28 +2731,27 @@ dfaanalyze (struct dfa *d, bool searchflag)
           break;
 
         case CAT:
-          /* Every element in the firstpos of the second argument is in the
-             follow of every element in the lastpos of the first argument.  */
-          if (d->epsilon)
+          /* Every element in the lastpos of the first argument is in
+             the backward set of every element in the firstpos of the
+             second argument.  */
+          if (backward)
             {
               tmp.nelem = stk[-2].nlastpos;
               tmp.elems = lastpos - stk[-1].nlastpos - stk[-2].nlastpos;
-              position *p = firstpos - stk[-1].nfirstpos;
-              for (idx_t j = 0; j < stk[-1].nfirstpos; j++)
-                {
-                  merge (&tmp, &backword[p[j].index], &merged);
-                  copy (&merged, &backword[p[j].index]);
-                }
+              for (position *p = firstpos - stk[-1].nfirstpos;
+                   p < firstpos; p++)
+                merge2 (&backward[p->index], &tmp, &merged);
             }
+
+          /* Every element in the firstpos of the second argument is in the
+             follow of every element in the lastpos of the first argument.  */
           {
             tmp.nelem = stk[-1].nfirstpos;
             tmp.elems = firstpos - stk[-1].nfirstpos;
-            position *p = lastpos - stk[-1].nlastpos - stk[-2].nlastpos;
-            for (idx_t j = 0; j < stk[-2].nlastpos; j++)
-              {
-                merge (&tmp, &d->follows[p[j].index], &merged);
-                copy (&merged, &d->follows[p[j].index]);
-              }
+            for (position *plim = lastpos - stk[-1].nlastpos,
+                   *p = plim - stk[-2].nlastpos;
+                 p < plim; p++)
+              merge2 (&d->follows[p->index], &tmp, &merged);
           }
 
           /* The firstpos of a CAT node is the firstpos of the first argument,
@@ -2831,20 +2832,21 @@ dfaanalyze (struct dfa *d, bool searchflag)
 #endif
     }
 
-  if (d->epsilon)
+  if (backward)
     {
       /* For each follow set that is the follow set of a real position,
          replace it with its epsilon closure.  */
-      epsclosure (d, backword);
+      epsclosure (d, backward);
 
-      for (idx_t i = 0; i < d->tindex; i++)
-        free (backword[i].elems);
+      for (idx_t i = 0; i < tindex; i++)
+        free (backward[i].elems);
+      free (backward);
     }
 
   dfaoptimize (d);
 
 #ifdef DEBUG
-  for (idx_t i = 0; i < d->tindex; i++)
+  for (idx_t i = 0; i < tindex; i++)
     if (d->tokens[i] == BEG || d->tokens[i] < NOTCHAR
         || d->tokens[i] == BACKREF || d->tokens[i] == ANYCHAR
         || d->tokens[i] == MBCSET || d->tokens[i] >= CSET)
@@ -2868,12 +2870,10 @@ dfaanalyze (struct dfa *d, bool searchflag)
 
   append (pos, &tmp);
 
-  d->separates = xnmalloc (d->tindex, sizeof *d->separates);
+  d->separates = xcalloc (tindex, sizeof *d->separates);
 
-  for (idx_t i = 0; i < d->tindex; i++)
+  for (idx_t i = 0; i < tindex; i++)
     {
-      d->separates[i] = 0;
-
       if (prev_newline_dependent (d->constraints[i]))
         d->separates[i] |= CTX_NEWLINE;
       if (prev_letter_dependent (d->constraints[i]))
@@ -2901,7 +2901,6 @@ dfaanalyze (struct dfa *d, bool searchflag)
   d->min_trcount++;
   d->trcount = 0;
 
-  free (backword);
   free (posalloc);
   free (stkalloc);
   free (merged.elems);
@@ -3172,10 +3171,7 @@ build_state (state_num s, struct dfa *d, unsigned char uc)
                 mergeit &= d->multibyte_prop[group.elems[j].index];
             }
           if (mergeit)
-            {
-              merge (&d->states[0].elems, &group, &tmp);
-              copy (&tmp, &group);
-            }
+            merge2 (&group, &d->states[0].elems, &tmp);
         }
 
       /* Find out if the new state will want any context information,
-- 
2.17.1
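
A note on the recurring transformation in the hunks above: the old code built the union into the scratch set merged and then copied it back into the follow set (a merge followed by a copy); the new code folds both steps into one merge2 call that leaves the union directly in the destination. The following standalone sketch is not the gnulib code; merge2_idx and the fixed-size buffers are invented for illustration, and it only shows the shape of that operation on plain sorted index arrays.

#include <stdio.h>
#include <string.h>

/* Union of DST (length *NDST) and SRC (length NSRC) back into DST.
   Both arrays are sorted and duplicate-free.  SCRATCH must have room
   for *NDST + NSRC elements.  */
static void
merge2_idx (int *dst, size_t *ndst, int const *src, size_t nsrc,
            int *scratch)
{
  size_t i = 0, j = 0, k = 0;

  while (i < *ndst && j < nsrc)
    {
      if (dst[i] < src[j])
        scratch[k++] = dst[i++];
      else if (src[j] < dst[i])
        scratch[k++] = src[j++];
      else
        {
          scratch[k++] = dst[i++];
          j++;
        }
    }
  while (i < *ndst)
    scratch[k++] = dst[i++];
  while (j < nsrc)
    scratch[k++] = src[j++];

  memcpy (dst, scratch, k * sizeof *dst);
  *ndst = k;
}

int
main (void)
{
  int follows[8] = { 1, 4, 7 };        /* stand-in for a follow set */
  size_t nfollows = 3;
  int const firstpos[3] = { 2, 4, 9 }; /* stand-in for a firstpos set */
  int scratch[8];

  merge2_idx (follows, &nfollows, firstpos, 3, scratch);

  for (size_t i = 0; i < nfollows; i++)
    printf ("%d ", follows[i]);        /* prints: 1 2 4 7 9 */
  putchar ('\n');
  return 0;
}

The real position_set code also carries a constraint with each position and grows its buffers dynamically, but collapsing the merge-then-copy-back pair into a single call is the same idea.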


[-- Attachment #4: 0005-dfa-fix-dfa-heap-overrun-failure.patch --]
[-- Type: text/x-patch, Size: 3060 bytes --]

From cafb61533f5bfb989698e3924f97471498b2422b Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 13 Sep 2020 18:20:01 -0700
Subject: [PATCH 5/7] dfa: fix dfa-heap-overrun failure

* lib/dfa.c (reorder_tokens): When setting
map[d->follows[i].elems[j].index], instead of incorrectly assuming
that (i < d->follows[i].elems[j].index), use two loops, one to set
the map array and the other to use it.  The incorrect assumption
caused some elements to be missed, and this in turn caused grep's
dfa-heap-overrun test to fail on Solaris 10 sparc when compiled
with GCC.  I found this bug while investigating
https://buildfarm.opencsw.org/buildbot/builders/ggrep-solaris10-sparc/builds/183
and I think the bug also occurs on GNU/Linux but with more-subtle
symptoms.  The bug predates the recent dfa.c changes; perhaps the
recent changes make the bug more likely.
---
 ChangeLog | 15 +++++++++++++++
 lib/dfa.c | 12 ++++++------
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index a5c14c45e..1b5720761 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2020-09-13  Paul Eggert  <eggert@cs.ucla.edu>
+
+	dfa: fix dfa-heap-overrun failure
+	* lib/dfa.c (reorder_tokens): When setting
+	map[d->follows[i].elems[j].index], instead of incorrectly assuming
+	that (i < d->follows[i].elems[j].index), use two loops, one to set
+	the map array and the other to use it.  The incorrect assumption
+	caused some elements to be missed, and this in turn caused grep's
+	dfa-heap-overrun test to fail on Solaris 10 sparc when compiled
+	with GCC.  I found this bug while investigating
+	https://buildfarm.opencsw.org/buildbot/builders/ggrep-solaris10-sparc/builds/183
+	and I think the bug also occurs on GNU/Linux but with more-subtle
+	symptoms.  The bug predates the recent dfa.c changes; perhaps the
+	recent changes make the bug more likely.
+
 2020-09-13  Bruno Haible  <bruno@clisp.org>
 
 	parse-datetime: Make the build rule work with parallel 'make'.
diff --git a/lib/dfa.c b/lib/dfa.c
index 0f0764887..4992bcaf2 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2519,6 +2519,11 @@ reorder_tokens (struct dfa *d)
   else
     multibyte_prop = NULL;
 
+  for (idx_t i = 0; i < d->tindex; i++)
+    for (idx_t j = 0; j < d->follows[i].nelem; j++)
+      if (map[d->follows[i].elems[j].index] == -1)
+        map[d->follows[i].elems[j].index] = nleaves++;
+
   for (idx_t i = 0; i < d->tindex; i++)
     {
       if (map[i] == -1)
@@ -2537,12 +2542,7 @@ reorder_tokens (struct dfa *d)
         multibyte_prop[map[i]] = d->multibyte_prop[i];
 
       for (idx_t j = 0; j < d->follows[i].nelem; j++)
-        {
-          if (map[d->follows[i].elems[j].index] == -1)
-            map[d->follows[i].elems[j].index] = nleaves++;
-
-          d->follows[i].elems[j].index = map[d->follows[i].elems[j].index];
-        }
+        d->follows[i].elems[j].index = map[d->follows[i].elems[j].index];
 
       qsort (d->follows[i].elems, d->follows[i].nelem,
              sizeof *d->follows[i].elems, compare);
-- 
2.17.1
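
The core of the fix above is splitting one interleaved loop into two passes over the follow sets: the first pass only assigns new indices in the map array, and only the second pass drops unreferenced nodes and rewrites the sets. The standalone sketch below is not the gnulib code; the five-node graph and all names are invented, but it shows why the single interleaved pass goes wrong as soon as an edge points to an earlier node.

#include <stdio.h>

enum { N = 5 };

int
main (void)
{
  /* adj[i] lists the successors of node i; -1 terminates each row.  */
  static int adj[N][N] = {
    { 3, -1 },   /* 0 -> 3                          */
    { -1 },      /* 1: referenced by nobody         */
    { -1 },      /* 2: referenced only by node 3    */
    { 2, -1 },   /* 3 -> 2, a backward edge (2 < 3) */
    { -1 },      /* 4: referenced by nobody         */
  };
  int map[N];
  int nkept = 0;

  map[0] = nkept++;
  for (int i = 1; i < N; i++)
    map[i] = -1;

  /* Pass 1: give every referenced node a new index.  */
  for (int i = 0; i < N; i++)
    for (int j = 0; adj[i][j] >= 0; j++)
      if (map[adj[i][j]] < 0)
        map[adj[i][j]] = nkept++;

  /* Pass 2: drop unreferenced nodes and rewrite the edges of the rest.
     A single interleaved pass would already have discarded node 2 at
     i == 2, before the backward edge 3 -> 2 could assign it an index.  */
  for (int i = 0; i < N; i++)
    {
      if (map[i] < 0)
        continue;
      for (int j = 0; adj[i][j] >= 0; j++)
        adj[i][j] = map[adj[i][j]];
    }

  for (int i = 0; i < N; i++)
    printf ("map[%d] = %d\n", i, map[i]);
  return 0;
}

That missed renumbering of a node referenced only by a backward edge is the kind of dropped element the commit message describes.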


[-- Attachment #5: 0006-dfa-assume-C99-in-reorder_tokens.patch --]
[-- Type: text/x-patch, Size: 2474 bytes --]

From 5332a21b1188224ecddbbe8b234b618d0b84437a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 13 Sep 2020 18:27:07 -0700
Subject: [PATCH 6/7] dfa: assume C99 in reorder_tokens

* lib/dfa.c (reorder_tokens): Assume C99 and simplify.
---
 ChangeLog |  3 +++
 lib/dfa.c | 32 ++++++++++----------------------
 2 files changed, 13 insertions(+), 22 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1b5720761..5f7a148e3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,8 @@
 2020-09-13  Paul Eggert  <eggert@cs.ucla.edu>
 
+	dfa: assume C99 in reorder_tokens
+	* lib/dfa.c (reorder_tokens): Assume C99 and simplify.
+
 	dfa: fix dfa-heap-overrun failure
 	* lib/dfa.c (reorder_tokens): When setting
 	map[d->follows[i].elems[j].index], instead of incorrectly assuming
diff --git a/lib/dfa.c b/lib/dfa.c
index 4992bcaf2..0fa9958fd 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2494,39 +2494,27 @@ compare (const void *a, const void *b)
 static void
 reorder_tokens (struct dfa *d)
 {
-  idx_t nleaves;
-  ptrdiff_t *map;
-  token *tokens;
-  position_set *follows;
-  int *constraints;
-  char *multibyte_prop;
-
-  nleaves = 0;
-
-  map = xnmalloc (d->tindex, sizeof *map);
-
+  idx_t nleaves = 0;
+  ptrdiff_t *map = xnmalloc (d->tindex, sizeof *map);
   map[0] = nleaves++;
-
   for (idx_t i = 1; i < d->tindex; i++)
     map[i] = -1;
 
-  tokens = xnmalloc (d->nleaves, sizeof *tokens);
-  follows = xnmalloc (d->nleaves, sizeof *follows);
-  constraints = xnmalloc (d->nleaves, sizeof *constraints);
-
-  if (d->localeinfo.multibyte)
-    multibyte_prop = xnmalloc (d->nleaves, sizeof *multibyte_prop);
-  else
-    multibyte_prop = NULL;
+  token *tokens = xnmalloc (d->nleaves, sizeof *tokens);
+  position_set *follows = xnmalloc (d->nleaves, sizeof *follows);
+  int *constraints = xnmalloc (d->nleaves, sizeof *constraints);
+  char *multibyte_prop = (d->localeinfo.multibyte
+                          ? xnmalloc (d->nleaves, sizeof *multibyte_prop)
+                          : NULL);
 
   for (idx_t i = 0; i < d->tindex; i++)
     for (idx_t j = 0; j < d->follows[i].nelem; j++)
-      if (map[d->follows[i].elems[j].index] == -1)
+      if (map[d->follows[i].elems[j].index] < 0)
         map[d->follows[i].elems[j].index] = nleaves++;
 
   for (idx_t i = 0; i < d->tindex; i++)
     {
-      if (map[i] == -1)
+      if (map[i] < 0)
         {
           free (d->follows[i].elems);
           d->follows[i].elems = NULL;
-- 
2.17.1


[-- Attachment #6: 0007-dfa-avoid-use-of-uninitialized-constraint.patch --]
[-- Type: text/x-patch, Size: 2187 bytes --]

From 46bdd627ff522193134d31bdfd3ac4e4fddb5975 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 13 Sep 2020 18:40:08 -0700
Subject: [PATCH 7/7] dfa: avoid use of uninitialized constraint

* lib/dfa.c (merge_nfa_state): Do not initialize the constraint
to zero here.
(dfaoptimize): Do it here instead, via xcalloc.  This prevents the
use of an uninitialized constraint by later code when ! (flags[i]
& OPT_QUEUED) means merge_nfa_state was not called to initialize
the constraint.  Problem found by running 'valgrind src/grep -E
'(^| )*(a|b)*(c|d)*( |$)' < /dev/null' on Ubuntu 18.04.5 x86-64.
---
 ChangeLog | 9 +++++++++
 lib/dfa.c | 4 +---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 5f7a148e3..395ac6baf 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,14 @@
 2020-09-13  Paul Eggert  <eggert@cs.ucla.edu>
 
+	dfa: avoid use of uninitialized constraint
+	* lib/dfa.c (merge_nfa_state): Do not initialize the constraint
+	to zero here.
+	(dfaoptimize): Do it here instead, via xcalloc.  This prevents the
+	use of an uninitialized constraint by later code when ! (flags[i]
+	& OPT_QUEUED) means merge_nfa_state was not called to initialize
+	the constraint.  Problem found by running 'valgrind src/grep -E
+	'(^| )*(a|b)*(c|d)*( |$)' < /dev/null' on Ubuntu 18.04.5 x86-64.
+
 	dfa: assume C99 in reorder_tokens
 	* lib/dfa.c (reorder_tokens): Assume C99 and simplify.
 
diff --git a/lib/dfa.c b/lib/dfa.c
index 0fa9958fd..746c7b568 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2428,8 +2428,6 @@ merge_nfa_state (struct dfa *d, idx_t tindex, char *flags,
   position_set *follows = d->follows;
   idx_t nelem = 0;
 
-  d->constraints[tindex] = 0;
-
   for (idx_t i = 0; i < follows[tindex].nelem; i++)
     {
       idx_t sindex = follows[tindex].elems[i].index;
@@ -2581,7 +2579,7 @@ dfaoptimize (struct dfa *d)
   position_set *merged = &merged0;
   alloc_position_set (merged, d->nleaves);
 
-  d->constraints = xnmalloc (d->tindex, sizeof *d->constraints);
+  d->constraints = xcalloc (d->tindex, sizeof *d->constraints);
 
   for (idx_t i = 0; i < d->tindex; i++)
     if (flags[i] & OPT_QUEUED)
-- 
2.17.1
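
The change above comes down to where the zero initialization happens: merge_nfa_state is only called for tokens whose OPT_QUEUED flag is set, so clearing the constraint there leaves the other entries of d->constraints indeterminate, whereas allocating the whole array with xcalloc makes every entry defined up front. A tiny standalone sketch of that difference; it is not the gnulib code, and N, queued and the value 42 are invented.

#include <stdio.h>
#include <stdlib.h>

int
main (void)
{
  enum { N = 8 };
  int queued[N] = { 1, 0, 1, 0, 0, 1, 0, 0 };  /* stand-in for OPT_QUEUED */

  /* Zero-filled at allocation, like the xcalloc in the patch.  */
  int *constraints = calloc (N, sizeof *constraints);
  if (!constraints)
    return 1;

  for (int i = 0; i < N; i++)
    if (queued[i])
      constraints[i] = 42;   /* stand-in for a computed constraint */

  /* Entries the loop never touched read as 0, not garbage.  */
  for (int i = 0; i < N; i++)
    printf ("constraints[%d] = %d\n", i, constraints[i]);

  free (constraints);
  return 0;
}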



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-09-14  2:03                   ` Paul Eggert
@ 2020-09-14 14:14                     ` Jim Meyering
  2020-09-21 19:22                     ` Paul Eggert
  1 sibling, 0 replies; 11+ messages in thread
From: Jim Meyering @ 2020-09-14 14:14 UTC (permalink / raw)
  To: Paul Eggert
  Cc: fryasu, Gnulib bugs, 40634, Norihiro Tanaka, GNU grep developers

On Sun, Sep 13, 2020 at 7:03 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 9/11/20 11:41 PM, Jim Meyering wrote:
> >> https://bugs.gnu.org/40634#32
> >>
> >> I'll try to take a look at the later patch.
> >
> > Oh! Glad you spotted that.
>
> I took a look and the basic idea sounds good though I admit I did not check
> every detail. While looking into it I found some opportunities for improvements,
> plus I found what appear to be some longstanding bugs in the area, one of which
> causes a grep test failure on Solaris (and I suspect the bug is also on
> GNU/Linux but the grep tests don't catch it). I installed the attached patches
> into Gnulib, updated grep to point to the new Gnulib version, and added a note
> in grep's NEWS file about this.
>
> Patch 1 is what Norihiro Tanaka proposed in Bug#40634#32, except I edited the
> commit message. Patch 2 consists of minor cleanups and performance tweaks for
> Patch 1. (Patches 3 and 4 are omitted as they were installed by others into
> Gnulib at about the same time I was installing these.) Patch 5 fixes a
> dfa-heap-overrun failure on Solaris that appears to be a longstanding bug
> exposed by Patch 1 when running on Solaris. Patch 6 merely cleans up code near
> Patch 5. Patch 7 fixes the use of an uninitialized constraint, which I
> discovered while debugging Patch 5 under Valgrind; this also appears to be a
> longstanding bug.
>
> Coming up with test cases for all these bugs would be pretty tricky, unfortunately.

Wow! Thank you!



* Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
  2020-09-14  2:03                   ` Paul Eggert
  2020-09-14 14:14                     ` Jim Meyering
@ 2020-09-21 19:22                     ` Paul Eggert
  1 sibling, 0 replies; 11+ messages in thread
From: Paul Eggert @ 2020-09-21 19:22 UTC (permalink / raw)
  To: Jim Meyering
  Cc: fryasu, Gnulib bugs, 40634-done, Norihiro Tanaka,
	GNU grep developers

The dust seems to have settled on this, so I'm closing the grep bug report to 
tidy things up.




Thread overview: 11+ messages
     [not found] <6503eb8e-e6fd-b4dd-aab7-76bb6955d87b@cs.ucla.edu>
     [not found] ` <20200417093536.875E.27F6AC2D@kcn.ne.jp>
2020-04-17  1:24   ` bug#40634: Massive pattern list handling with -E format seems very slow since 2.28 Norihiro Tanaka
2020-04-17 15:22     ` Norihiro Tanaka
2020-04-18 22:41       ` Norihiro Tanaka
2020-04-19  2:10         ` Norihiro Tanaka
2020-09-11 12:47           ` Jim Meyering
2020-09-11 13:17             ` Jim Meyering
2020-09-11 23:01               ` Paul Eggert
2020-09-12  6:41                 ` Jim Meyering
2020-09-14  2:03                   ` Paul Eggert
2020-09-14 14:14                     ` Jim Meyering
2020-09-21 19:22                     ` Paul Eggert
