Re: [PATCH v2] Improve performance of memmem

unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed

From: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
To: Wilco Dijkstra <Wilco.Dijkstra@arm.com>,
	'GNU C Library' <libc-alpha@sourceware.org>
Cc: nd <nd@arm.com>
Subject: Re: [PATCH v2] Improve performance of memmem
Date: Tue, 11 Jun 2019 09:30:29 +0000	[thread overview]
Message-ID: <326fa6af-990d-79f9-1d8c-4cb9b4fd1719@arm.com> (raw)
In-Reply-To: <VI1PR0801MB21278FF8424901F1FEC9AAFB83130@VI1PR0801MB2127.eurprd08.prod.outlook.com>

On 10/06/2019 19:31, Wilco Dijkstra wrote:
> v2: Update comments after review.
> 
> This patch significantly improves performance of memmem using a novel
> modified Horspool algorithm.  Needles up to size 256 use a bad-character
> table indexed by hashed pairs of characters to quickly skip past mismatches.
> Long needles use a self-adapting filtering step to avoid comparing the whole
> needle repeatedly.
> 
> By limiting the needle length to 256, the shift table only requires 8 bits
> per entry, lowering preprocessing overhead and minimizing cache effects.
> This limit also implies worst-case performance is linear.
> 
> Small needles up to size 2 use a dedicated linear search.  Very long needles
> use the Two-Way algorithm (to avoid increasing stack size or slowing down
> the common case, inlining is disabled).
> 
> The performance gain is 6.6 times on English text on AArch64 using random
> needles with average size 8 (this is even faster than the recently improved strstr
> algorithm, so I'll update that in the near future).

the comment about strstr is no longer relevant.

> 
> Tested against GLIBC testsuite and randomized tests. OK for commit?
> 
> ChangeLog:
> 2019-06-10  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* string/memmem.c (__memmem): Rewrite to improve performance.
> 

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

i only had one comment below if that's addressed
then i think it's ready to commit.

(but i think you should wait a day in case there
are further comments on this latest version.)

> +  for ( ; hs <= end; )
>      {
> -      haystack = memchr (haystack, *needle, haystack_len);
> -      if (!haystack || __builtin_expect (needle_len == 1, 0))
> -	return (void *) haystack;
> -      haystack_len -= haystack - (const unsigned char *) haystack_start;
> -      if (haystack_len < needle_len)
> -	return NULL;
> -      /* Check whether we have a match.  This improves performance since we
> -	 avoid the initialization overhead of the two-way algorithm.  */
> -      if (memcmp (haystack, needle, needle_len) == 0)
> -	return (void *) haystack;
> -      return two_way_short_needle (haystack, haystack_len, needle, needle_len);
> +      /* Skip past character pairs not in the needle.  */
> +      do
> +	{
> +	  hs += m1;
> +	  tmp = shift[hash2 (hs)];
> +	}
> +      while (tmp == 0 && hs <= end);

i noticed that the check here is in different
order than in strstr, i wonder if that's deliberate.

if either way is fine i'd prefer to have the same
logic in strstr and memmem.

> +
> +      /* If the match is not at the end of the needle, shift to the end
> +	 and continue until we match the hash of the needle end.  */
> +      hs -= tmp;
> +      if (tmp < m1)
> +	continue;
> +
> +      /* Hash of the last 2 characters matches.  If the needle is long,
> +	 try to quickly filter out mismatches.  */
> +      if (m1 < 15 || memcmp (hs + offset, ne + offset, 8) == 0)
> +	{
> +	  if (memcmp (hs, ne, m1) == 0)
> +	    return (void *) hs;
> +
> +	  /* Adjust filter offset when it doesn't find the mismatch.  */
> +	  offset = (offset >= 8 ? offset : m1) - 8;
> +	}
> +
> +      /* Skip based on matching the hash of the needle end.  */
> +      hs += shift1;
>      }

     prev parent reply	other threads:[~2019-06-11  9:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-21 16:04 [PATCH] Improve performance of memmem Wilco Dijkstra
2019-02-01 15:39 ` Wilco Dijkstra
2019-03-06 16:55   ` Wilco Dijkstra
2019-03-18 17:17     ` Wilco Dijkstra
2019-04-12 15:49       ` Wilco Dijkstra
2019-05-24 13:59         ` Wilco Dijkstra
2019-06-10 11:48           ` Wilco Dijkstra
2019-06-10 18:31             ` [PATCH v2] " Wilco Dijkstra
2019-06-11  9:30               ` Szabolcs Nagy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/libc/involved.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=326fa6af-990d-79f9-1d8c-4cb9b4fd1719@arm.com \
    --to=szabolcs.nagy@arm.com \
    --cc=Wilco.Dijkstra@arm.com \
    --cc=libc-alpha@sourceware.org \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).