bug-gnulib@gnu.org mirror (unofficial)
From: Bruno Haible <bruno@clisp.org>
To: arnold@skeeve.com
Cc: eggert@cs.ucla.edu, bug-gnulib@gnu.org
Subject: Re: regex unit tests
Date: Sun, 18 Jul 2021 23:45:18 +0200
Message-ID: <7719867.Yo8exX7jZS@omega>
In-Reply-To: <202107181859.16IIxOCA007113@freefriends.org>

Hi Arnold,

> > > (And how I've documented things in the manual, also since forever.)
> >
> > If you want the behaviour of the GNU regex to be stable over time, you
> > should contribute unit tests to tests/test-regex.c.
> 
> This is a separate issue. It almost sounds like you're saying "it's your
> fault there's a bug here, you didn't contribute unit tests".

I'm not talking about past incidents and "fault", because that is generally
useless. I'm talking about the future and what we can do so that packages
that depend on the 'regex' module do not see regressions.

If a Gnulib module does not have decent test coverage in Gnulib, then its
bugs and regressions become apparent only after a while and only through
the packages that use it. A good example of this sequence of events was
<https://lists.gnu.org/archive/html/bug-gnulib/2020-07/msg00036.html>,
but I'm sure you can find many others of the same kind in the mailing
list archive. If, on the other hand, there is a unit test and it runs on
glibc platforms, a regression is likely to be visible in the weekly continuous
integration build <https://gitlab.com/gnulib/gnulib-ci/-/pipelines>.

For the regex module, with 20 KB of tests for 300 KB of code full of
complex algorithms, the test coverage is very thin, and it is *to be expected*
that regressions are only visible once the code is integrated into gawk,
grep, sed, etc. Similarly for the 'dfa' module with 5 KB of tests for 140 KB
of code.

The regex and dfa modules are being maintained here (by Paul, with
contributions from various people), and we have seen that it is not
obvious whether a patch is good or not: sometimes Paul has rejected
patches, sometimes he had to revert patches.

I think it would be good if these two modules had more test coverage,
and I'm inviting everyone who can to contribute to these unit tests.

> I hope that's not your intent; if it is then sorry, I don't buy it.

The module doesn't have tests for the

  RE_SYNTAX_AWK
  RE_SYNTAX_GNU_AWK
  RE_SYNTAX_POSIX_AWK

syntaxes. It's gawk which depends on the correct functioning of these
syntaxes, not glibc, not grep, not sed, not emacs. Therefore IMO if
the gawk developers don't contribute some test cases for these syntaxes,
no one will. (I certainly won't, because I find writing tests a bit
boring, and I don't see why I should have the "boring" part whereas
others have the "fun" part :-) )
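
To make the request concrete, here is a minimal sketch of what test cases
for these syntaxes could look like. It assumes the GNU API of <regex.h>
(re_set_syntax, re_compile_pattern, re_match) as provided by glibc or the
Gnulib 'regex' module. The helper names match_length and
run_awk_syntax_checks are hypothetical and are not taken from
tests/test-regex.c, and the handful of patterns is only illustrative.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1  /* glibc declares re_* and RE_SYNTAX_* only with this */
#endif
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN under SYNTAX and return the length
   of the match anchored at the start of TEXT, -1 if there is no match,
   or -2 if compilation or matching failed.  */
int
match_length (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  const char *err;
  int len;

  memset (&buf, 0, sizeof buf);   /* no preallocated buffer, fastmap, translate */
  re_set_syntax (syntax);         /* applies to the next re_compile_pattern call */
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  if (err != NULL)
    return -2;
  len = (int) re_match (&buf, text, strlen (text), 0, NULL);
  regfree (&buf);
  return len;
}

/* One illustrative expectation per row; a real test would need many more.  */
struct awk_case
{
  reg_syntax_t syntax;
  const char *pattern;
  const char *text;
  int expected;
};

/* Run all cases and return the number of failures.  */
int
run_awk_syntax_checks (void)
{
  static const struct awk_case cases[] = {
    /* Unescaped ( ) and | are operators in all three awk syntaxes.  */
    { RE_SYNTAX_AWK,       "a(b|c)d", "acd",     3 },
    { RE_SYNTAX_GNU_AWK,   "a(b|c)d", "abd",     3 },
    { RE_SYNTAX_POSIX_AWK, "a(b|c)d", "axd",    -1 },
    /* RE_SYNTAX_GNU_AWK keeps the GNU operators such as \w.  */
    { RE_SYNTAX_GNU_AWK,   "\\w+",    "foo bar", 3 },
    /* RE_SYNTAX_POSIX_AWK enables unescaped interval expressions.  */
    { RE_SYNTAX_POSIX_AWK, "a{2}",    "aaa",     2 },
  };
  int failures = 0;
  size_t i;

  for (i = 0; i < sizeof cases / sizeof cases[0]; i++)
    {
      int got = match_length (cases[i].syntax, cases[i].pattern, cases[i].text);
      if (got != cases[i].expected)
        {
          printf ("case %d: pattern '%s' on '%s': got %d, expected %d\n",
                  (int) i, cases[i].pattern, cases[i].text,
                  got, cases[i].expected);
          failures++;
        }
    }
  return failures;
}
```

A test program would call run_awk_syntax_checks () from main and exit with
a nonzero status if it returns anything other than 0. Keeping the
expectations in a table makes it cheap to add a row for each behaviour
that gawk documents in its manual.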

Bruno


