From: Jim Meyering
Date: Mon, 14 Sep 2020 07:14:14 -0700
Subject: Re: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
To: Paul Eggert
Cc: fryasu@yahoo.co.jp, Gnulib bugs, 40634@debbugs.gnu.org, Norihiro Tanaka, GNU grep developers

On Sun, Sep 13, 2020 at 7:03 PM Paul Eggert wrote:
> On 9/11/20 11:41 PM, Jim Meyering wrote:
> >> https://bugs.gnu.org/40634#32
> >>
> >> I'll try to take a look at the later
> >> patch.
> >
> > Oh! Glad you spotted that.
>
> I took a look and the basic idea sounds good though I admit I did not check
> every detail. While looking into it I found some opportunities for improvements,
> plus I found what appear to be some longstanding bugs in the area, one of which
> causes a grep test failure on Solaris (and I suspect the bug is also on
> GNU/Linux but the grep tests don't catch it). I installed the attached patches
> into Gnulib, updated grep to point to the new Gnulib version, and added a note
> in grep's NEWS file about this.
>
> Patch 1 is what Norihiro Tanaka proposed in Bug#40634#32, except I edited the
> commit message. Patch 2 consists of minor cleanups and performance tweaks for
> Patch 1. (Patches 3 and 4 are omitted as they were installed by others into
> Gnulib at about the same time I was installing these.) Patch 5 fixes a
> dfa-heap-overrun failure on Solaris that appears to be a longstanding bug
> exposed by Patch 1 when running on Solaris. Patch 6 merely cleans up code near
> Patch 5. Patch 7 fixes the use of an uninitialized constraint, which I
> discovered while debugging Patch 5 under Valgrind; this also appears to be a
> longstanding bug.
>
> Coming up with test cases for all these bugs would be pretty tricky, unfortunately.

Wow! Thank you!